Commit 265c39e · Ludwig Stumpp
Parent(s): a011af1
Add koala results on HellaSwag and WinoGrande zero-shot
README.md
CHANGED
@@ -35,7 +35,7 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
 | [gpt-4](https://arxiv.org/abs/2303.08774v3) | OpenAI | no | | [0.953](https://arxiv.org/abs/2303.08774v3) | | | [0.670](https://arxiv.org/abs/2303.08774v3) | | | | [0.864](https://arxiv.org/abs/2303.08774v3) | | | | | [0.875](https://arxiv.org/abs/2303.08774v3) |
 | [gpt-neox-20b](https://huggingface.co/EleutherAI/gpt-neox-20b) | EleutherAI | yes | | [0.718](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.719](https://www.mosaicml.com/blog/mpt-7b) | | | [0.719](https://www.mosaicml.com/blog/mpt-7b) | | [0.269](https://www.mosaicml.com/blog/mpt-7b) | [0.276](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.347](https://www.mosaicml.com/blog/mpt-7b) | | | | |
 | [gpt-j-6b](https://huggingface.co/EleutherAI/gpt-j-6b) | EleutherAI | yes | | [0.663](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.683](https://www.mosaicml.com/blog/mpt-7b) | | | [0.683](https://www.mosaicml.com/blog/mpt-7b) | | [0.261](https://www.mosaicml.com/blog/mpt-7b) | [0.249](https://crfm.stanford.edu/helm/latest/?group=core_scenarios) | [0.234](https://www.mosaicml.com/blog/mpt-7b) | | | | |
-| [koala-13b](https://bair.berkeley.edu/blog/2023/04/03/koala/) | Berkeley BAIR | no | [1082](https://lmsys.org/blog/2023-05-03-arena/) | |
+| [koala-13b](https://bair.berkeley.edu/blog/2023/04/03/koala/) | Berkeley BAIR | no | [1082](https://lmsys.org/blog/2023-05-03-arena/) | | [0.726](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | | | | | | | | [0.688](https://gpt4all.io/reports/GPT4All_Technical_Report_3.pdf) | | |
 | [llama-7b](https://arxiv.org/abs/2302.13971) | Meta AI | no | | | [0.738](https://www.mosaicml.com/blog/mpt-7b) | | [0.105](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | [0.738](https://www.mosaicml.com/blog/mpt-7b) | | [0.302](https://www.mosaicml.com/blog/mpt-7b) | | [0.443](https://www.mosaicml.com/blog/mpt-7b) | | [0.701](https://arxiv.org/abs/2302.13971v1) | | |
 | [llama-13b](https://arxiv.org/abs/2302.13971) | Meta AI | no | [932](https://lmsys.org/blog/2023-05-03-arena/) | | [0.792](https://arxiv.org/abs/2302.13971) | | [0.158](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | [0.730](https://arxiv.org/abs/2302.13971v1) | | |
 | [llama-33b](https://arxiv.org/abs/2302.13971) | Meta AI | no | | | [0.828](https://arxiv.org/abs/2302.13971) | | [0.217](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | [0.760](https://arxiv.org/abs/2302.13971v1) | | |