Update README.md
README.md CHANGED
@@ -52,7 +52,8 @@ It is impractical for us to manually set a specific configuration for each fine-tuned model
 
 Thus, OpenCSG racked our brains to provide a relatively fair method to compare the fine-tuned models on the HumanEval benchmark.
 To simplify the comparison, we chose the Pass@1 metric on the Python language, although our fine-tuning dataset includes samples in multiple languages.
-
+
+**For fairness, we evaluated the fine-tuned and the original CodeLlama models with only the original cases' prompts, without any additional instructions.**
 
 | Model | HumanEval python pass@1 |
 | --- |----------------------------------------------------------------------------- |
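Pass@1 here refers to the standard HumanEval functional-correctness metric: for each problem we generate completions from the bare prompt, run the problem's unit tests, and average the unbiased pass@k estimator (with k = 1) over all problems. Below is a minimal sketch of that computation; the `results` counts are made-up illustrations, not our actual evaluation data.

```python
import numpy as np

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator from the HumanEval paper:
    1 - C(n - c, k) / C(n, k), computed as a numerically stable product."""
    if n - c < k:
        return 1.0
    return 1.0 - np.prod(1.0 - k / np.arange(n - c + 1, n + 1))

# Hypothetical per-problem results: (n samples generated, c samples passing the tests).
results = [(10, 3), (10, 0), (10, 10)]  # illustrative counts only

score = float(np.mean([pass_at_k(n, c, k=1) for n, c in results]))
print(f"pass@1 = {score:.3f}")  # 0.433 for these made-up counts
```

With greedy decoding (a single sample per problem), pass@1 reduces to the plain fraction of problems whose completion passes its unit tests.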