Pinned: Ritvik Rastogi (1 min read, Feb 1, 2024)
Thanks for the appreciation; it's surreal for me to be acknowledged by the author himself.
Papers Explained 138: LLMLingua-2 (6 min read, 2 hours ago)
LLMLingua-2 focuses on task-agnostic prompt compression for better generalizability and efficiency in LLMs. It proposes a data distillation…
Papers Explained 137: LongLLMLingua (6 min read, 2 days ago)
LongLLMLingua is a framework designed for prompt compression in long context scenarios. It addresses three main challenges associated with…
Papers Explained 136: LLMLingua (5 min read, 4 days ago)
LLMLingua is a coarse-to-fine prompt compression method that involves a budget controller to maintain semantic integrity under high…
Papers Explained 135: DSPy (7 min read, May 10, 2024)
DSPy is a programming model designed to improve how language models (LMs) are used in complex tasks. Traditionally, LMs are controlled…
Papers Explained 134: OpenELM (4 min read, May 8, 2024)
OpenELM is an open language model by Apple with not only open-source model weights and inference code but the complete framework for…
Papers Explained 133: Rho-1 (5 min read, May 6, 2024)
The study analyzes token-level training dynamics of language models, revealing distinct loss patterns for different tokens. RHO-1 leverages…
Papers Explained 132: RecurrentGemma (3 min read, May 3, 2024)
RecurrentGemma-2B is an open model based on the Griffin architecture. It uses a combination of linear recurrences and local attention…
Papers Explained 131: Hawk, Griffin (6 min read, May 1, 2024)
This work presents the Real-Gated Linear Recurrent Unit (RG-LRU) layer, a novel gated linear recurrent layer, around which a new recurrent…
Papers Explained 130: Phi-3 (3 min read, Apr 29, 2024)
phi-3-mini is a 3.8B language model trained on 3.3T tokens of data, a scaled-up version of the dataset used for phi-2, composed of heavily…