Update README.md
README.md (CHANGED)
@@ -54,7 +54,8 @@ Thus, OpenCSG strained our brains to provide a relatively fair method to compare
 To simplify the comparison, we chose the Pass@1 metric on Python, although our fine-tuning dataset includes samples in multiple languages.
 
 **For fairness, we evaluated the fine-tuned and original CodeLlama models with the original cases' prompts only, without appending any other instructions.**
-
+
+**In addition, we used greedy decoding for each model during the evaluation.**
 
 | Model | HumanEval Python pass@1 |
 | --- | --- |
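
For reference, the setup above (raw HumanEval prompts with no extra instructions, greedy decoding, one completion per task) can be sketched with Hugging Face `transformers` and OpenAI's `human-eval` harness. This is a minimal illustration, not the harness actually used here; the checkpoint id and generation budget are assumptions.

```python
# Minimal sketch of the evaluation protocol described above; not the
# actual harness. The checkpoint id and max_new_tokens are assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from human_eval.data import read_problems, write_jsonl  # OpenAI human-eval

model_id = "codellama/CodeLlama-7b-hf"  # assumption: any checkpoint under test
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

def complete(prompt: str) -> str:
    """Greedily complete a raw HumanEval prompt, with no added instructions."""
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    out = model.generate(
        **inputs,
        max_new_tokens=384,               # assumption: generation budget
        do_sample=False,                  # greedy decoding, as stated above
        pad_token_id=tokenizer.eos_token_id,
    )
    # Keep only the generated continuation, not the echoed prompt.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True)

# One greedy completion per task. Pass@1 is then simply the fraction of
# tasks whose completion passes all of that task's unit tests.
problems = read_problems()
write_jsonl("samples.jsonl",
            [dict(task_id=tid, completion=complete(p["prompt"]))
             for tid, p in problems.items()])
# Score with the harness CLI: evaluate_functional_correctness samples.jsonl
```

Because decoding is greedy, each task yields one deterministic completion, so Pass@1 can be read off directly without the unbiased pass@k estimator.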