eval-leaderboard / refactor_eval_results.py

Commit History

Change model names to reflect version
954d8ee

xeon27 committed on

Add agentharm and swe-bench tasks
1289818

xeon27 committed on

Add results for GAIA and GDM tasks
2718fde

xeon27 committed on

Add model name links and change single-turn to base
9c55d6d

xeon27 committed on

Change nomenclature to single-turn
eb538cb

xeon27 committed on

Replace missing values by None
18638a9

xeon27 committed on

Add relevant model links
5438c77

xeon27 committed on

Add tmp code
e004342

xeon27 committed on

Add script for refactoring results from log files
8b91831

xeon27 committed on