Update README.md
README.md
CHANGED
@@ -90,7 +90,7 @@ The plot below highlights the alignment comparison of the model trained with Con
 ### Benchmark Results Table
 The table below summarizes the evaluation results across mathematical tasks and original capabilities for various models and training approaches.
 
-| **Model** | **
+| **Model** | **MathH** | **Math** | **GSM8K** | **Math Avg.** | **ARC** | **GPQA** | **MMLU** | **MMLUP** | **Orig. Avg.** | **Overall** |
 |--------------------------|--------------|----------|-----------|---------------|---------|----------|----------|-----------|----------------|-------------|
 | Llama3.1-8B-Instruct | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
 | OpenMath2-Llama3.1 | 38.4 | 64.1 | 90.3 | 64.3 | 45.8 | 1.3 | 4.5 | 19.5 | 12.9 | 38.6 |