Apr 26, 2024
3 stories
Proposes the Reinforcement Learning from Evol-Instruct Feedback (RLEIF) method and applies it to Llama-2 to enhance the model's mathematical reasoning abilities.
Enhances the performance of the open-source Code LLM StarCoder by applying Code Evol-Instruct.
Introduces Evol-Instruct, a method that uses an LLM instead of humans to generate large amounts of instruction data with varying levels of complexity, which is then used to fine-tune a Llama model.
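The idea behind Evol-Instruct, having an LLM repeatedly rewrite seed instructions into new variants, can be sketched roughly as below. This is a minimal illustration, not the authors' implementation: the prompt templates, the `evolve_instructions` function, the simple filter, and the `stub_llm` stand-in are all hypothetical names and choices made for this example. In practice `llm` would wrap a real model call.

```python
import random
from typing import Callable, List, Optional

# Illustrative prompt templates (assumed wording, not taken from the paper).
EVOLVE_TEMPLATES = [
    "Rewrite the instruction below so it needs one more reasoning step:\n{instruction}",
    "Rewrite the instruction below by adding one extra constraint:\n{instruction}",
    "Write a new instruction on a different topic at similar difficulty:\n{instruction}",
]


def evolve_instructions(
    seeds: List[str],
    llm: Callable[[str], str],
    rounds: int = 2,
    rng: Optional[random.Random] = None,
) -> List[str]:
    """Grow an instruction pool by asking `llm` to rewrite each instruction.

    Each round rewrites the instructions produced in the previous round,
    so later rounds tend to contain more heavily rewritten instructions.
    """
    rng = rng or random.Random(0)
    pool = list(seeds)      # everything collected so far
    current = list(seeds)   # instructions to rewrite in this round
    for _ in range(rounds):
        evolved = []
        for instruction in current:
            template = rng.choice(EVOLVE_TEMPLATES)
            candidate = llm(template.format(instruction=instruction)).strip()
            # Simple filter (an assumption): drop empty or unchanged outputs.
            if candidate and candidate != instruction:
                evolved.append(candidate)
        pool.extend(evolved)
        current = evolved
    return pool


def stub_llm(prompt: str) -> str:
    """Hypothetical stand-in for a model: echoes the instruction with a suffix."""
    original = prompt.split("\n", 1)[1]
    return original + " Explain each step."


pool = evolve_instructions(["Add 2 and 3."], stub_llm, rounds=2)
for item in pool:
    print(item)
```

With the stub, one seed and two rounds yield a pool of three instructions: the seed plus one rewrite per round. The resulting pool would then serve as fine-tuning data.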