Update README.md
README.md (CHANGED)
@@ -66,7 +66,9 @@ To simplify the comparison, we chose the Pass@1 metric for the Python language,
 | opencsg-CodeLlama-13b-v0.1 | **51.2%** |
 | CodeLlama-34b-hf | 48.2% |
 | opencsg-CodeLlama-34b-v0.1 | **56.1%** |
-| opencsg-CodeLlama-34b-v0.
+| opencsg-CodeLlama-34b-v0.2 | **64.0%** |
+| CodeLlama-70b-hf | 53.0% |
+| CodeLlama-70b-Instruct-hf | **67.8%** |

 **TODO**
 - We will provide more benchmark scores on fine-tuned models in the future.

@@ -193,7 +195,9 @@ HumanEval is the most common benchmark for evaluating model performance on code generation, especially
 | opencsg-CodeLlama-13b-v0.1 | **51.2%** |
 | CodeLlama-34b-hf | 48.2% |
 | opencsg-CodeLlama-34b-v0.1 | **56.1%** |
-| opencsg-CodeLlama-34b-v0.
+| opencsg-CodeLlama-34b-v0.2 | **64.0%** |
+| CodeLlama-70b-hf | 53.0% |
+| CodeLlama-70b-Instruct-hf | **67.8%** |

 **TODO**
 - In the future, we will provide more fine-tuned model scores on various benchmarks.
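As context for the hunk headers above: the Pass@1 scores in these tables are conventionally computed with the unbiased pass@k estimator from the original HumanEval paper (Chen et al., 2021). This diff does not include the evaluation code, so the sketch below is an illustration of that standard estimator, not the repository's own implementation:

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval paper.

    n: total completions sampled per problem
    c: number of those completions that pass the unit tests
    k: evaluation budget (k=1 for Pass@1)

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability
    that a random size-k subset contains no passing completion.
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k subset
        # must contain at least one passing completion.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Per-benchmark score is the mean of pass_at_k over all problems.
```

With n = 1, this reduces to the fraction of problems whose single sampled completion passes, which is why Pass@1 is often reported from greedy decoding.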