open-r1-eval-leaderboard / eval_results

Commit History

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.0/mmlu/results_2024-03-30T21-25-19.158029.json with huggingface_hub
5336fd3
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.1/hellaswag/results_2024-03-30T21-18-09.229165.json with huggingface_hub
27a3899
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.0/hellaswag/results_2024-03-30T21-17-32.693848.json with huggingface_hub
220325a
verified

edbeeching HF staff committed on

Upload eval_results/orpo-explorers/Mixtral-8x7B-capybara-dpo-7k-v0.1/main/agieval/results_2024-03-30T21-15-29.859510.json with huggingface_hub
26b594c
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.1/truthfulqa/results_2024-03-30T21-12-33.921157.json with huggingface_hub
0a1de97
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.1/winogrande/results_2024-03-30T21-12-30.374377.json with huggingface_hub
37cf702
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.1/arc/results_2024-03-30T21-11-01.462809.json with huggingface_hub
5202461
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.0/truthfulqa/results_2024-03-30T21-10-33.934171.json with huggingface_hub
62256b8
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.0/winogrande/results_2024-03-30T21-10-04.767645.json with huggingface_hub
eacd956
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.0/arc/results_2024-03-30T21-01-44.015685.json with huggingface_hub
810997f
verified

edbeeching HF staff committed on

Upload eval_results/orpo-explorers/Mixtral-8x7B-capybara-dpo-7k-v0.1/main/bbh/results_2024-03-30T20-49-06.416818.json with huggingface_hub
649c798
verified

lewtun HF staff committed on

Upload eval_results/databricks/dbrx-base/main/bbh/results_2024-03-30T20-19-23.953419.json with huggingface_hub
1303040
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Mixtral-8x7B-capybara-dpo-7k-v0.1/main/ifeval/results_2024-03-30T19-44-39.606723.json with huggingface_hub
d703db3
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.2/gsm8k/results_2024-03-30T19-18-25.065327.json with huggingface_hub
cb7803e
verified

edbeeching HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-72B-capybara-dpo-7k-v0.1/main/ifeval/results_2024-03-30T19-13-35.596880.json with huggingface_hub
8b99ec7
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.2/hellaswag/results_2024-03-30T19-07-18.596541.json with huggingface_hub
1fc95ac
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.2/arc/results_2024-03-30T19-00-21.848737.json with huggingface_hub
f7186ab
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.2/truthfulqa/results_2024-03-30T19-00-04.065908.json with huggingface_hub
aae3958
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.2/winogrande/results_2024-03-30T18-59-39.855768.json with huggingface_hub
111369a
verified

edbeeching HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-72B-capybara-dpo-7k-v0.1/main/agieval/results_2024-03-30T17-44-39.606833.json with huggingface_hub
a257f64
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Mixtral-8x7B-capybara-dpo-7k-v0.1/main/agieval/results_2024-03-30T17-38-23.897235.json with huggingface_hub
3987b92
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-72B-capybara-dpo-7k-v0.1/main/bbh/results_2024-03-30T17-18-28.109071.json with huggingface_hub
ef2d058
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-7B-ORPO-Capybara-7k/main/ifeval/results_2024-03-30T17-12-34.161956.json with huggingface_hub
f31bb42
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Mixtral-8x7B-capybara-dpo-7k-v0.1/main/bbh/results_2024-03-30T17-11-17.169052.json with huggingface_hub
801897f
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-7B-ORPO-Capybara-7k/main/agieval/results_2024-03-30T17-03-38.642914.json with huggingface_hub
407daaf
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-7B-ORPO-Capybara-7k/main/bbh/results_2024-03-30T17-02-33.902374.json with huggingface_hub
f80c87c
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.1/gsm8k/results_2024-03-30T16-50-31.306857.json with huggingface_hub
eb0b160
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.1/mmlu/results_2024-03-30T16-46-03.042433.json with huggingface_hub
8b2ab66
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.1/hellaswag/results_2024-03-30T16-38-49.972281.json with huggingface_hub
eec1aa9
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.1/arc/results_2024-03-30T16-31-49.002038.json with huggingface_hub
f51fcbe
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.1/truthfulqa/results_2024-03-30T16-31-39.181910.json with huggingface_hub
de44cd8
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.1/winogrande/results_2024-03-30T16-31-10.075271.json with huggingface_hub
7a354e2
verified

edbeeching HF staff committed on

Upload eval_results/databricks/dbrx-instruct/main/agieval/results_2024-03-30T15-19-26.415183.json with huggingface_hub
e5995a8
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.0/gsm8k/results_2024-03-30T14-41-06.028143.json with huggingface_hub
047e95d
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.0/hellaswag/results_2024-03-30T14-28-47.896745.json with huggingface_hub
0f119cc
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.0/truthfulqa/results_2024-03-30T14-22-48.980806.json with huggingface_hub
ad01eeb
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.0/winogrande/results_2024-03-30T14-22-21.423288.json with huggingface_hub
409c12a
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v2.0/arc/results_2024-03-30T14-19-06.620508.json with huggingface_hub
bc29876
verified

edbeeching HF staff committed on

Upload eval_results/databricks/dbrx-instruct/main/bbh/results_2024-03-30T10-32-52.452069.json with huggingface_hub
f0ade0d
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-0.5B-capybara-dpo-7k-v0.1/main/ifeval/results_2024-03-30T07-38-59.579211.json with huggingface_hub
7bf370b
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-0.5B-capybara-dpo-7k-v0.1/main/agieval/results_2024-03-30T07-32-01.059566.json with huggingface_hub
0caee7e
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-0.5B-capybara-dpo-7k-v0.1/main/bbh/results_2024-03-30T07-02-19.906312.json with huggingface_hub
0c017b8
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-14B-ORPO-Capybara-7k/main/ifeval/results_2024-03-29T19-41-58.931501.json with huggingface_hub
5d903f3
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-14B-ORPO-Capybara-7k/main/agieval/results_2024-03-29T19-30-44.012482.json with huggingface_hub
8b104ea
verified

lewtun HF staff committed on

Upload eval_results/orpo-explorers/Qwen1.5-14B-ORPO-Capybara-7k/main/bbh/results_2024-03-29T19-29-20.279393.json with huggingface_hub
cac1e66
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v1.21/gsm8k/results_2024-03-29T16-09-14.294488.json with huggingface_hub
6e7bb59
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v1.21/mmlu/results_2024-03-29T16-07-43.386533.json with huggingface_hub
4502550
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v1.21/hellaswag/results_2024-03-29T15-58-22.732793.json with huggingface_hub
4b656ab
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v1.21/arc/results_2024-03-29T15-50-31.752781.json with huggingface_hub
e2e74ba
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-odpo/v1.21/truthfulqa/results_2024-03-29T15-50-21.556810.json with huggingface_hub
5f8050f
verified

edbeeching HF staff committed on