open-r1-eval-leaderboard / eval_results

Commit History

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.5/winogrande/results_2024-05-02T14-25-34.289106.json with huggingface_hub
6f5ad1f
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.5/arc/results_2024-05-02T14-23-52.012531.json with huggingface_hub
ed2b845
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.7/hellaswag/results_2024-05-02T14-22-39.150905.json with huggingface_hub
8a039cf
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.6/gsm8k/results_2024-05-02T14-21-03.694627.json with huggingface_hub
ee273e1
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.7/winogrande/results_2024-05-02T14-20-06.692899.json with huggingface_hub
0a55bef
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.7/truthfulqa/results_2024-05-02T14-18-15.152264.json with huggingface_hub
523f9e4
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.4/hellaswag/results_2024-05-02T14-14-59.255705.json with huggingface_hub
a1e42cb
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.7/arc/results_2024-05-02T14-13-20.361687.json with huggingface_hub
be03480
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.4/winogrande/results_2024-05-02T14-09-46.315715.json with huggingface_hub
5c5e84e
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.4/truthfulqa/results_2024-05-02T14-08-08.368056.json with huggingface_hub
fbb68b3
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.4/arc/results_2024-05-02T14-05-21.248837.json with huggingface_hub
2a1855e
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.6/winogrande/results_2024-05-02T14-02-56.030449.json with huggingface_hub
40127fb
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.6/truthfulqa/results_2024-05-02T13-06-51.700229.json with huggingface_hub
99ab0d5
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.6/hellaswag/results_2024-05-02T12-56-29.977154.json with huggingface_hub
0cc5f93
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.6/arc/results_2024-05-02T12-46-56.658196.json with huggingface_hub
45908c0
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.3/gsm8k/results_2024-05-02T12-22-02.757339.json with huggingface_hub
df5bd7b
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.3/hellaswag/results_2024-05-02T12-12-03.809335.json with huggingface_hub
23a791f
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.3/arc/results_2024-05-02T12-05-04.647169.json with huggingface_hub
741bfc5
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.3/truthfulqa/results_2024-05-02T12-04-50.022576.json with huggingface_hub
1de3572
verified

edbeeching HF staff committed on

Upload eval_results/HuggingFaceH4/mistral-7b-dpo/v52.3/winogrande/results_2024-05-02T12-04-24.137150.json with huggingface_hub
31920a4
verified

edbeeching HF staff committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.02/math_v2/results_2024-05-02T09-51-31.784464.json with huggingface_hub
6dfd591
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.01/math_v2/results_2024-05-02T09-25-56.167251.json with huggingface_hub
6fa7eea
verified

lewtun HF staff committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/ifeval/results_2024-05-02T08-38-34.856515.json with huggingface_hub
5a59ef1
verified

abhishek committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.02/mini_math_v2/results_2024-05-02T08-07-18.677114.json with huggingface_hub
88a0fe9
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.02/aimo_kaggle/results_2024-05-02T07-57-01.442736.json with huggingface_hub
4da3738
verified

lewtun HF staff committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/gsm8k/results_2024-05-02T07-27-42.901910.json with huggingface_hub
505f4de
verified

abhishek committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.01/mini_math_v2/results_2024-05-02T07-21-47.028397.json with huggingface_hub
6e7a0cd
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.01/aimo_kaggle/results_2024-05-02T07-11-15.842384.json with huggingface_hub
c654b16
verified

lewtun HF staff committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/hellaswag/results_2024-05-02T06-41-19.551430.json with huggingface_hub
de87573
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/arc/results_2024-05-02T06-14-38.752912.json with huggingface_hub
46f64cc
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/bbh/results_2024-05-02T06-13-44.080022.json with huggingface_hub
bcc7a4e
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/truthfulqa/results_2024-05-02T06-13-37.785615.json with huggingface_hub
51afb49
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/truthfulqa/results_2024-05-02T06-13-34.734291.json with huggingface_hub
fa76b27
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v2/main/winogrande/results_2024-05-02T06-11-19.440174.json with huggingface_hub
312225a
verified

abhishek committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.00/math_v2/results_2024-05-02T02-34-55.213377.json with huggingface_hub
38564c3
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.00/mini_math_v2/results_2024-05-02T00-06-43.336916.json with huggingface_hub
51609fe
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-20b-sft/aimo_v00.00/aimo_kaggle/results_2024-05-01T23-52-48.250595.json with huggingface_hub
c9cae79
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.01/math_v2/results_2024-05-01T22-28-40.931116.json with huggingface_hub
b47db93
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.02/math_v2/results_2024-05-01T22-18-48.673223.json with huggingface_hub
d6766b2
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.02/mini_math_v2/results_2024-05-01T20-56-55.525944.json with huggingface_hub
e4a803f
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.02/aimo_kaggle/results_2024-05-01T20-48-07.028513.json with huggingface_hub
5fb1d51
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.01/mini_math_v2/results_2024-05-01T20-42-44.533414.json with huggingface_hub
6725339
verified

lewtun HF staff committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.01/aimo_kaggle/results_2024-05-01T20-35-30.808328.json with huggingface_hub
b20e556
verified

lewtun HF staff committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v1/main/ifeval/results_2024-05-01T20-19-35.471494.json with huggingface_hub
fb67266
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v1/main/mmlu/results_2024-05-01T18-50-41.016030.json with huggingface_hub
242f4aa
verified

abhishek committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v1/main/gsm8k/results_2024-05-01T18-51-23.575547.json with huggingface_hub
d0ffe1b
verified

abhishek committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.00/mini_math_v2/results_2024-05-01T18-15-44.926897.json with huggingface_hub
835021a
verified

lewtun HF staff committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v1/main/hellaswag/results_2024-05-01T18-10-18.674527.json with huggingface_hub
46ed9bc
verified

abhishek committed on

Upload eval_results/AI-MO/internlm-math-7b-sft/aimo_v00.00/aimo_kaggle/results_2024-05-01T18-07-17.755219.json with huggingface_hub
3432998
verified

lewtun HF staff committed on

Upload eval_results/abhishek/autotrain-mixtral-8x7b-orpo-v1/main/agieval/results_2024-05-01T17-53-21.084520.json with huggingface_hub
6abc6d6
verified

abhishek committed on
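
The uploaded file paths in this history appear to share one layout: eval_results/<org>/<model>/<revision>/<task>/results_<timestamp>.json. The following is a minimal sketch, assuming that layout holds, of how such a path could be split into its parts. The segment names (org, model, revision, task) and the helper names ResultPath and parse_result_path are hypothetical labels chosen for illustration; they are not taken from the repository.

```python
import re
from datetime import datetime
from typing import NamedTuple, Optional


class ResultPath(NamedTuple):
    """Parts of one results file path (field names are assumptions)."""
    org: str
    model: str
    revision: str
    task: str
    timestamp: datetime


# Assumed layout: eval_results/<org>/<model>/<revision>/<task>/results_<timestamp>.json
_PATTERN = re.compile(
    r"^eval_results/(?P<org>[^/]+)/(?P<model>[^/]+)/(?P<revision>[^/]+)/"
    r"(?P<task>[^/]+)/results_(?P<ts>[0-9T.\-]+)\.json$"
)


def parse_result_path(path: str) -> Optional[ResultPath]:
    """Return the parsed parts, or None if the path does not fit the layout."""
    match = _PATTERN.match(path)
    if match is None:
        return None
    # Timestamps in the file names use dashes in place of colons.
    timestamp = datetime.strptime(match["ts"], "%Y-%m-%dT%H-%M-%S.%f")
    return ResultPath(
        match["org"], match["model"], match["revision"], match["task"], timestamp
    )


if __name__ == "__main__":
    example = (
        "eval_results/HuggingFaceH4/mistral-7b-dpo/v52.5/winogrande/"
        "results_2024-05-02T14-25-34.289106.json"
    )
    print(parse_result_path(example))
```

Parsed timestamps can be compared directly, which may help when one task has more than one results file for the same model and revision, as with the two truthfulqa uploads for autotrain-mixtral-8x7b-orpo-v2.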