Pinned: Ritvik Rastogi (1 min read, Feb 1, 2024)
Thanks for the appreciation; it's surreal for me to be acknowledged by the author himself.
Papers Explained 138: LLMLingua-2 (6 min read, 2 hours ago)
LLMLingua-2 focuses on task-agnostic prompt compression for better generalizability and efficiency in LLMs. It proposes a data distillation…
Papers Explained 137: LongLLMLingua (6 min read, 2 days ago)
LongLLMLingua is a framework designed for prompt compression in long context scenarios. It addresses three main challenges associated with…
Papers Explained 136: LLMLingua (5 min read, 4 days ago)
LLMLingua is a coarse-to-fine prompt compression method that involves a budget controller to maintain semantic integrity under high…
Papers Explained 135: DSPy (7 min read, May 10, 2024)
DSPy is a programming model designed to improve how language models (LMs) are used in complex tasks. Traditionally, LMs are controlled…
Papers Explained 134: OpenELM (4 min read, May 8, 2024)
OpenELM is an open language model by Apple with not only open-source model weights and inference code but the complete framework for…
Papers Explained 133: Rho-1 (5 min read, May 6, 2024)
The study analyzes token-level training dynamics of language models, revealing distinct loss patterns for different tokens. RHO-1 leverages…
Papers Explained 132: RecurrentGemma (3 min read, May 3, 2024)
RecurrentGemma-2B is an open model based on the Griffin architecture. It uses a combination of linear recurrences and local attention…
Papers Explained 131: Hawk, Griffin (6 min read, May 1, 2024)
This work presents the Real-Gated Linear Recurrent Unit (RG-LRU) layer, a novel gated linear recurrent layer, around which a new recurrent…
Papers Explained 130: Phi-3 (3 min read, Apr 29, 2024)
phi-3-mini is a 3.8B language model trained on 3.3T tokens of data, a scaled-up version of the dataset used for phi-2, composed of heavily…