update accuracy
README.md
CHANGED
@@ -103,25 +103,24 @@ pip3 install lm-eval==0.4.7
 We found lm-eval is very unstable for this model. Please set `add_bos_token=True` to align with the original model. Please use the AutoGPTQ format.
 
 ```bash
-lm-eval --model hf --model_args pretrained=OPEA/Llama-3.3-70B-Instruct-
+lm-eval --model hf --model_args pretrained=OPEA/Llama-3.3-70B-Instruct-int3-sym-inc,add_bos_token=True --tasks leaderboard_mmlu_pro,leaderboard_ifeval,lambada_openai,hellaswag,piqa,winogrande,truthfulqa_mc1,openbookqa,boolq,arc_easy,arc_challenge,mmlu,gsm8k --batch_size 16
 ```
-
-
-
-
-
-
-
-
-
-
-
-
-
-
-| arc_challenge | 0.6109 |
-
-| gsm8k(5shot) strict match | 0.9083 | | |
+| Metric                     |   BF16(lm-eval==0.4.5)   | W2G32 With BOS            | BF16(lm-eval==0.4.7 with BOS) |      WO BOS       |
+| :------------------------: | :----------------------: | ------------------------- | ----------------------------- | :---------------: |
+| avg                        | 0.7023                   | 0.6606                    |                               |                   |
+| leaderboard_mmlu_pro 5shot | 0.5484                   | 0.4461                    |                               | 0.4384            |
+| mmlu                       | 0.8195                   | 0.7606                    | 0.8229                        | 0.7142            |
+| lambada_openai             | 0.7528                   | 0.7413                    |                               | 0.7013            |
+| hellaswag                  | 0.6575                   | 0.6056                    |                               | 0.5576            |
+| winogrande                 | 0.7869                   | 0.7727                    |                               | 0.7080            |
+| piqa                       | 0.8303                   | 0.8069                    |                               | 0.7797            |
+| truthfulqa_mc1             | 0.4284                   | 0.3647                    |                               | 0.3586            |
+| openbookqa                 | 0.3720                   | 0.3540                    |                               | 0.3000            |
+| boolq                      | 0.8865                   | 0.8716                    |                               | 0.8339            |
+| arc_easy                   | 0.8624                   | 0.8367                    |                               | 0.8064            |
+| leaderboard_ifeval         | 0.6661=(0.7110+0.6211)/2 | 0.61235=(0.6739+0.5508)/2 |                               | (0.5959+0.4603)/2 |
+| arc_challenge              | 0.6109                   | 0.5580                    |                               | 0.5188            |
+| gsm8k(5shot) strict match  | 0.9083                   | 0.8575                    |                               |                   |
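The `avg` row and the `leaderboard_ifeval` entries in the table above are plain arithmetic means, so they can be rechecked with a few lines of Python. The following is a minimal sketch, not part of the README: the names `scores`, `mean`, and `retention` are hypothetical, and it assumes `avg` is the unweighted mean of the 13 task rows, which matches the reported values.

```python
# Scores copied from the table above as (BF16 lm-eval==0.4.5, W2G32 with BOS).
# Assumption: "avg" is the unweighted mean over these 13 task rows.
scores = {
    "leaderboard_mmlu_pro": (0.5484, 0.4461),
    "mmlu": (0.8195, 0.7606),
    "lambada_openai": (0.7528, 0.7413),
    "hellaswag": (0.6575, 0.6056),
    "winogrande": (0.7869, 0.7727),
    "piqa": (0.8303, 0.8069),
    "truthfulqa_mc1": (0.4284, 0.3647),
    "openbookqa": (0.3720, 0.3540),
    "boolq": (0.8865, 0.8716),
    "arc_easy": (0.8624, 0.8367),
    "leaderboard_ifeval": (0.6661, 0.61235),
    "arc_challenge": (0.6109, 0.5580),
    "gsm8k": (0.9083, 0.8575),
}


def mean(values):
    """Unweighted arithmetic mean of an iterable of numbers."""
    values = list(values)
    return sum(values) / len(values)


bf16_avg = mean(bf16 for bf16, _ in scores.values())
quant_avg = mean(quant for _, quant in scores.values())

# Fraction of the BF16 average that the quantized model keeps.
retention = quant_avg / bf16_avg

# leaderboard_ifeval is reported as the mean of two sub-scores; the
# "WO BOS" cell lists only the two sub-scores, so compute their mean here.
ifeval_wo_bos = (0.5959 + 0.4603) / 2

print(f"BF16 avg:          {bf16_avg:.4f}")
print(f"W2G32 avg:         {quant_avg:.4f}")
print(f"avg retention:     {retention:.2%}")
print(f"ifeval (WO BOS):   {ifeval_wo_bos:.4f}")
```

Under that assumption the script reproduces the reported `avg` values (0.7023 and 0.6606) and gives 0.5281 for the `WO BOS` `leaderboard_ifeval` cell.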
 
 
 ## Generate the model