Pinned
Ritvik Rastogi: Thanks for the appreciation. It's surreal for me to be acknowledged by the author. (Feb 1)
Articles by Ritvik Rastogi:

Papers Explained 212: DataGemma
This work presents an approach for enhancing the accuracy of LLMs by integrating them with Data Commons, a vast, open-source repository of… (10h ago)

Papers Explained 211: o1
OpenAI o1 is a large language model trained with reinforcement learning to perform complex reasoning. o1 thinks before it answers — it can… (1d ago)

Papers Explained 210: MaxViT
MaxViT introduces an efficient and scalable attention model called multi-axis attention, consisting of two aspects: blocked local and… (4d ago)

Papers Explained 209: Minitron Approach in Practice
This work presents a comprehensive report on compressing the Llama 3.1 8B and Mistral NeMo 12B models to 4B and 8B parameters… (5d ago)

Papers Explained 208: Minitron
The study investigates whether pruning an existing Large Language Model (LLM) and re-training it with a fraction of the original training… (6d ago)

Papers Explained 207: Nemotron-4 340B
A family of 340B models, including a base model, an instruct model, and a reward model, intended to support various research studies and… (Sep 10)

Papers Explained 206: Nemotron-4 15B
Nemotron-4 15B is a large multilingual language model trained on 8T text tokens by Nvidia. It exhibits high downstream accuracies across a… (Sep 9)

Papers Explained 205: LeViT
LeViT is a hybrid neural network for fast-inference image classification. LeViT significantly outperforms existing convnets and vision… (Sep 8)

Papers Explained 204: Matryoshka Adaptor
Matryoshka-Adaptor is a framework designed to customize LLM embeddings for improved computational efficiency and cost-effectiveness. The… (Sep 6)