Ritvik Rastogi

May 31, 2024


A family of code models, ranging from 3B to 34B parameters, trained on 3.5-4.5T tokens of code spanning 116 programming languages.
A diffusion code generation model that iteratively refines entire programs conditioned on an encoded natural language specification, overcoming a key limitation of auto-regressive models in code generation: tokens generated earlier cannot be revised.
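A minimal sketch of the iterative refinement idea, assuming a toy `denoise_step` function and a fixed number of refinement steps; this is not the paper's actual architecture, only an illustration of how a whole-program representation can be repeatedly refined from noise, conditioned on the encoded prompt.

```python
import numpy as np

def encode_nl(prompt: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic encoding of the natural language prompt (assumption)."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.standard_normal(dim)

def denoise_step(x: np.ndarray, cond: np.ndarray, t: int, total: int) -> np.ndarray:
    """Toy denoiser: nudges the noisy program representation toward the
    conditioning vector. A real model would use a learned network here."""
    alpha = (t + 1) / total
    return (1 - alpha) * x + alpha * cond

def generate(prompt: str, steps: int = 10, dim: int = 64) -> np.ndarray:
    """Iteratively refine a whole-program representation starting from pure
    noise. Unlike auto-regressive decoding, every position can change at
    every step, so earlier decisions can be reconsidered."""
    cond = encode_nl(prompt, dim)
    x = np.random.default_rng(0).standard_normal(dim)  # start from noise
    for t in range(steps):
        x = denoise_step(x, cond, t, steps)
    return x  # a decoder would map this representation back to code tokens

program_repr = generate("write a function that reverses a string")
print(program_repr.shape)
```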
Proposes an approach to make the training of LLMs for program synthesis more efficient by unifying key components: model architectures, learning methods, infill sampling, and data distributions.
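Infill sampling is commonly implemented as a fill-in-the-middle style data transformation; the sketch below shows one such formulation with placeholder sentinel strings, which are assumptions and not necessarily the exact special tokens this model uses.

```python
import random

# Placeholder sentinels; the actual special tokens are model-specific.
PREFIX_TOK, SUFFIX_TOK, MIDDLE_TOK = "<prefix>", "<suffix>", "<middle>"

def make_infill_example(code: str, rng: random.Random) -> str:
    """Split a document into prefix / middle / suffix and rearrange it so the
    model learns to generate the middle given its surrounding context."""
    i, j = sorted(rng.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    # Prefix-Suffix-Middle ordering: the target (middle) comes last, so
    # standard left-to-right training still applies.
    return f"{PREFIX_TOK}{prefix}{SUFFIX_TOK}{suffix}{MIDDLE_TOK}{middle}"

rng = random.Random(0)
print(make_infill_example("def add(a, b):\n    return a + b\n", rng))
```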
An LLM trained for program synthesis using input-output examples and natural language descriptions.
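One common way to serialize such conditioning is a prompt that concatenates the natural language description with input-output pairs; the format below is an illustrative assumption, not the paper's exact prompt template.

```python
def build_prompt(description: str, examples: list[tuple[str, str]]) -> str:
    """Serialize a natural language description plus input-output examples
    into a single prompt string for a program-synthesis LLM (hypothetical format)."""
    lines = [f"# Task: {description}"]
    for inp, out in examples:
        lines.append(f"# Input: {inp} -> Output: {out}")
    lines.append("def solution(x):")
    return "\n".join(lines)

prompt = build_prompt(
    "return the square of a number",
    [("2", "4"), ("5", "25")],
)
print(prompt)
```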