Ludwig Stumpp committed
Commit 72edf21 · 1 parent: b7e4ee9
Remove CodeT results for code-davinci-002, as they are not comparable to other HumanEval results due to additional explicit testing of outputs
README.md
CHANGED
@@ -18,7 +18,6 @@ https://huggingface.co/spaces/ludwigstumpp/llm-leaderboard
 | [chatglm-6b](https://chatglm.cn/blog) | ChatGLM | yes | [985](https://lmsys.org/blog/2023-05-03-arena/) | | | | | | | | | | | | | |
 | [chinchilla-70b](https://arxiv.org/abs/2203.15556v1) | DeepMind | no | | | [0.808](https://arxiv.org/abs/2203.15556v1) | | | [0.774](https://arxiv.org/abs/2203.15556v1) | | | [0.675](https://arxiv.org/abs/2203.15556v1) | | | [0.749](https://arxiv.org/abs/2203.15556v1) | | |
 | [codex-12b / code-cushman-001](https://arxiv.org/abs/2107.03374) | OpenAI | no | | | | | [0.317](https://crfm.stanford.edu/helm/latest/?group=targeted_evaluations) | | | | | | | | | |
-| [code-davinci-002](https://arxiv.org/abs/2207.10397v2) | OpenAI | no | | | | | [0.658](https://arxiv.org/abs/2207.10397v2) | | | | | | | | | |
 | [codegen-16B-mono](https://huggingface.co/Salesforce/codegen-16B-mono) | Salesforce | yes | | | | | [0.293](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
 | [codegen-16B-multi](https://huggingface.co/Salesforce/codegen-16B-multi) | Salesforce | yes | | | | | [0.183](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
 | [codegx-13b](http://keg.cs.tsinghua.edu.cn/codegeex/) | Tsinghua University | no | | | | | [0.229](https://drive.google.com/file/d/1cN-b9GnWtHzQRoE7M7gAEyivY0kl4BYs/view) | | | | | | | | | |
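The change in this commit is the removal of one table row, identified by the model name in its first cell. As a rough illustration only, the sketch below shows one way such a removal could be scripted. The function name `drop_model_rows` and the shortened sample rows are hypothetical and not part of the repository; the commit itself may well have been made by hand.

```python
def drop_model_rows(markdown: str, model_name: str) -> str:
    """Return the markdown text without table rows whose first cell links to model_name.

    Assumption: the first cell of each model row has the form "[name](url)",
    as in the README rows shown in this diff.
    """
    kept = []
    for line in markdown.splitlines():
        # Split the row into cells, ignoring the outer pipes.
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        first_cell = cells[0] if cells else ""
        # Compare only the link text, so "codex-12b / code-cushman-001" is not matched.
        if first_cell.startswith(f"[{model_name}]("):
            continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    # Shortened, illustrative rows; the real README rows have many more columns.
    sample = "\n".join([
        "| [codex-12b / code-cushman-001](https://arxiv.org/abs/2107.03374) | OpenAI | no |",
        "| [code-davinci-002](https://arxiv.org/abs/2207.10397v2) | OpenAI | no |",
        "| [codegen-16B-mono](https://huggingface.co/Salesforce/codegen-16B-mono) | Salesforce | yes |",
    ])
    print(drop_model_rows(sample, "code-davinci-002"))
```

Matching on the link text rather than on a substring of the whole row avoids accidentally dropping rows that merely cite the same paper URL in another column.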