- nuprl/MultiPL-E
  Viewer • Updated • 12.7k • 184k • 49
- openai/openai_humaneval
  Viewer • Updated • 164 • 92.4k • 289
- Big Code Models Leaderboard • 1.22k
  📈 Submit code models for evaluation on benchmarks
- Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming
  Paper • 2402.14261 • Published • 11
Shaun
drgitt
Collections
2