Ritvik Rastogi

May 31, 2024


A family of code models, ranging from 3B to 34B parameters, trained on 3.5-4.5T tokens of code spanning 116 programming languages.
A diffusion code generation model that iteratively refines entire programs conditioned on an encoded natural language specification, overcoming a key limitation of auto-regressive models in code generation: tokens generated earlier cannot be revised.
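A minimal sketch of the iterative refinement idea, assuming a toy `denoise_step` function and a fixed number of refinement steps; this is not the paper's actual architecture, only an illustration of how a whole-program representation can be repeatedly refined from noise, conditioned on the encoded prompt.

```python
import numpy as np

def encode_nl(prompt: str, dim: int = 64) -> np.ndarray:
    """Toy deterministic encoding of the natural language prompt (assumption)."""
    rng = np.random.default_rng(abs(hash(prompt)) % (2**32))
    return rng.standard_normal(dim)

def denoise_step(x: np.ndarray, cond: np.ndarray, t: int, total: int) -> np.ndarray:
    """Toy denoiser: nudges the noisy program representation toward the
    conditioning vector. A real model would use a learned network here."""
    alpha = (t + 1) / total
    return (1 - alpha) * x + alpha * cond

def generate(prompt: str, steps: int = 10, dim: int = 64) -> np.ndarray:
    """Iteratively refine a whole-program representation starting from pure
    noise. Unlike auto-regressive decoding, every position can change at
    every step, so earlier decisions can be reconsidered."""
    cond = encode_nl(prompt, dim)
    x = np.random.default_rng(0).standard_normal(dim)  # start from noise
    for t in range(steps):
        x = denoise_step(x, cond, t, steps)
    return x  # a decoder would map this representation back to code tokens

program_repr = generate("write a function that reverses a string")
print(program_repr.shape)
```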
Proposes an approach to make the training of LLMs for program synthesis more efficient by unifying key components: model architectures, learning methods, infill sampling, and data distributions.
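Infill sampling is commonly implemented as a fill-in-the-middle style data transformation; the sketch below shows one such formulation with placeholder sentinel strings, which are assumptions and not necessarily the exact special tokens this model uses.

```python
import random

# Placeholder sentinels; the actual special tokens are model-specific.
PREFIX_TOK, SUFFIX_TOK, MIDDLE_TOK = "<prefix>", "<suffix>", "<middle>"

def make_infill_example(code: str, rng: random.Random) -> str:
    """Split a document into prefix / middle / suffix and rearrange it so the
    model learns to generate the middle given its surrounding context."""
    i, j = sorted(rng.sample(range(len(code)), 2))
    prefix, middle, suffix = code[:i], code[i:j], code[j:]
    # Prefix-Suffix-Middle ordering: the target (middle) comes last, so
    # standard left-to-right training still applies.
    return f"{PREFIX_TOK}{prefix}{SUFFIX_TOK}{suffix}{MIDDLE_TOK}{middle}"

rng = random.Random(0)
print(make_infill_example("def add(a, b):\n    return a + b\n", rng))
```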
An LLM trained for program synthesis using input-output examples and natural language descriptions.
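One common way to serialize such conditioning is a prompt that concatenates the natural language description with input-output pairs; the format below is an illustrative assumption, not the paper's exact prompt template.

```python
def build_prompt(description: str, examples: list[tuple[str, str]]) -> str:
    """Serialize a natural language description plus input-output examples
    into a single prompt string for a program-synthesis LLM (hypothetical format)."""
    lines = [f"# Task: {description}"]
    for inp, out in examples:
        lines.append(f"# Input: {inp} -> Output: {out}")
    lines.append("def solution(x):")
    return "\n".join(lines)

prompt = build_prompt(
    "return the square of a number",
    [("2", "4"), ("5", "25")],
)
print(prompt)
```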