open-r1-eval-leaderboard / eval_results

Commit History

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/mmlu/results_2024-03-04T21-59-48.523094.json with huggingface_hub
1c8c8a7
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-13b-chat-hf/main/ifeval/results_2024-03-04T22-00-20.679239.json with huggingface_hub
ecaad5a
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/gsm8k/results_2024-03-04T21-59-11.311474.json with huggingface_hub
62babbf
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/ifeval/results_2024-03-04T21-58-02.772717.json with huggingface_hub
91f57a4
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-13b-chat-hf/main/hellaswag/results_2024-03-04T21-56-14.603220.json with huggingface_hub
a1ba49a
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-70b-chat-hf/main/winogrande/results_2024-03-04T21-54-28.593159.json with huggingface_hub
905df3c
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/hellaswag/results_2024-03-04T21-52-09.604877.json with huggingface_hub
c6d7c79
verified

lewtun HF staff committed on

Upload eval_results/NousResearch/Nous-Hermes-2-Yi-34B/main/hellaswag/results_2024-03-04T21-52-04.254438.json with huggingface_hub
ecdcb36
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-13b-chat-hf/main/arc/results_2024-03-04T21-46-41.694295.json with huggingface_hub
e8e7b45
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-13b-chat-hf/main/truthfulqa/results_2024-03-04T21-46-13.700082.json with huggingface_hub
ef306ba
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-13b-chat-hf/main/winogrande/results_2024-03-04T21-45-31.161645.json with huggingface_hub
1f85815
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/arc/results_2024-03-04T21-45-23.211298.json with huggingface_hub
53e5c1c
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/truthfulqa/results_2024-03-04T21-45-14.056589.json with huggingface_hub
8df6e7c
verified

lewtun HF staff committed on

Upload eval_results/meta-llama/Llama-2-7b-chat-hf/main/winogrande/results_2024-03-04T21-44-37.061369.json with huggingface_hub
9bae3f1
verified

lewtun HF staff committed on

Upload eval_results/NousResearch/Nous-Hermes-2-Yi-34B/main/arc/results_2024-03-04T20-43-53.422268.json with huggingface_hub
133c4ea
verified

lewtun HF staff committed on

Upload eval_results/NousResearch/Nous-Hermes-2-Yi-34B/main/truthfulqa/results_2024-03-04T20-41-39.837328.json with huggingface_hub
c253b64
verified

lewtun HF staff committed on

Upload eval_results/NousResearch/Nous-Hermes-2-Yi-34B/main/winogrande/results_2024-03-04T20-34-28.350705.json with huggingface_hub
c055f33
verified

lewtun HF staff committed on

Upload eval_results/codellama/CodeLlama-70b-Instruct-hf/main/gsm8k/results_2024-03-04T19-13-08.293491.json with huggingface_hub
1bac590
verified

lewtun HF staff committed on

Upload eval_results/deepseek-ai/deepseek-coder-33b-instruct/main/arc/results_2024-03-04T14-08-18.696682.json with huggingface_hub
793151e
verified

lewtun HF staff committed on

Upload eval_results/deepseek-ai/deepseek-coder-33b-instruct/main/winogrande/results_2024-03-04T13-40-11.995573.json with huggingface_hub
93fc866
verified

lewtun HF staff committed on

Upload eval_results/deepseek-ai/deepseek-coder-6.7b-instruct/main/gsm8k/results_2024-03-04T12-58-31.861266.json with huggingface_hub
481952d
verified

lewtun HF staff committed on

Upload eval_results/deepseek-ai/deepseek-coder-6.7b-instruct/main/ifeval/results_2024-03-04T12-43-54.459792.json with huggingface_hub
b0b356c
verified

lewtun HF staff committed on

Upload eval_results/codellama/CodeLlama-13b-Instruct-hf/main/gsm8k/results_2024-03-04T12-34-03.656232.json with huggingface_hub
177e290
verified

lewtun HF staff committed on

Upload eval_results/codellama/CodeLlama-13b-Instruct-hf/main/ifeval/results_2024-03-04T12-32-58.942126.json with huggingface_hub
1ca91f1
verified

lewtun HF staff committed on

Upload eval_results/codellama/CodeLlama-7b-Instruct-hf/main/gsm8k/results_2024-03-04T12-29-35.662851.json with huggingface_hub
0024fb0
verified

lewtun HF staff committed on

Upload eval_results/codellama/CodeLlama-7b-Instruct-hf/main/ifeval/results_2024-03-04T12-28-02.938982.json with huggingface_hub
e134ebe
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-alpha/main/mmlu/results_2024-03-04T11-26-53.744909.json with huggingface_hub
0b8ece3
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-alpha/main/mmlu/results_2024-03-02T15-40-03.942721.json with huggingface_hub
ab64b0a
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/starcoder2-15b-dpo/v0.0/gsm8k/results_2024-03-04T10-05-19.752449.json with huggingface_hub
b16fbb9
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/starcoder2-15b-dpo/v0.0/ifeval/results_2024-03-04T10-01-01.237692.json with huggingface_hub
da8fd96
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/starcoder2-15b-ift/v0.0/gsm8k/results_2024-03-04T08-03-24.186135.json with huggingface_hub
597753b
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/starcoder2-15b-ift/v0.0/ifeval/results_2024-03-04T00-24-57.481715.json with huggingface_hub
d058bf7
verified

lewtun HF staff committed on

Clean
b6155d5

lewtun HF staff committed on

Upload eval_results/152334H/miqu-1-70b-sf/main/truthfulqa/results_2024-03-03T00-56-46.153531.json with huggingface_hub
5396149
verified

lewtun HF staff committed on

Upload eval_results/152334H/miqu-1-70b-sf/main/arc/results_2024-03-02T21-57-55.211943.json with huggingface_hub
75eb81e
verified

lewtun HF staff committed on

Upload eval_results/Qwen/Qwen1.5-72B-Chat/main/mmlu/results_2024-03-02T21-44-42.749463.json with huggingface_hub
91c4f9c
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/mmlu/results_2024-03-02T20-36-57.523104.json with huggingface_hub
13a0e8d
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/gsm8k/results_2024-03-02T20-37-36.252173.json with huggingface_hub
7ba2b4e
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/ifeval/results_2024-03-02T20-36-30.855806.json with huggingface_hub
acdd9c4
verified

lewtun HF staff committed on

Upload eval_results/152334H/miqu-1-70b-sf/main/winogrande/results_2024-03-02T20-34-53.564701.json with huggingface_hub
e574e6c
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/hellaswag/results_2024-03-02T20-31-31.320058.json with huggingface_hub
16b5623
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/arc/results_2024-03-02T20-24-36.289330.json with huggingface_hub
6d4edf4
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/truthfulqa/results_2024-03-02T20-24-30.364492.json with huggingface_hub
76cfd85
verified

lewtun HF staff committed on

Upload eval_results/HuggingFaceH4/zephyr-7b-beta-dpo/v0.2.2/winogrande/results_2024-03-02T20-24-00.412782.json with huggingface_hub
080efa7
verified

lewtun HF staff committed on

Upload eval_results/Qwen/Qwen1.5-72B-Chat/main/gsm8k/results_2024-03-02T20-22-15.616600.json with huggingface_hub
0a51dd7
verified

lewtun HF staff committed on

Upload eval_results/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO/main/gsm8k/results_2024-03-02T19-57-38.971728.json with huggingface_hub
8de861a
verified

lewtun HF staff committed on

Upload eval_results/NousResearch/Nous-Hermes-2-Mixtral-8x7B-DPO/main/ifeval/results_2024-03-02T19-51-36.280411.json with huggingface_hub
3813b87
verified

lewtun HF staff committed on

Upload eval_results/Qwen/Qwen1.5-72B-Chat/main/ifeval/results_2024-03-02T19-50-17.449357.json with huggingface_hub
f8ad229
verified

lewtun HF staff committed on

Upload eval_results/mistralai/Mixtral-8x7B-Instruct-v0.1/main/gsm8k/results_2024-03-02T19-44-12.500885.json with huggingface_hub
11d5f71
verified

lewtun HF staff committed on

Upload eval_results/mistralai/Mixtral-8x7B-Instruct-v0.1/main/ifeval/results_2024-03-02T19-39-49.974051.json with huggingface_hub
0e851a1
verified

lewtun HF staff committed on