Ritvik Rastogi

Jul 31, 2024

2 stories

LLM Evaluation

7B & 8x7B evaluation LLMs that correlate highly with human evaluators and proprietary LM-based judges on both direct assessment and pairwise ranking, obtained by merging Mistral models trained on Feedback Collection and Preference Collection (curated in this work).
A 13B fully open-source evaluation LLM trained on Feedback Collection, curated using GPT-4 (in this work).
Ritvik Rastogi

Data Scientist, 2x Kaggle Expert