hawei committed · Commit 0d94b52 · verified · 1 Parent(s): ba08dcc

Update README.md

Files changed (1)
  1. README.md +0 -2
README.md CHANGED
@@ -91,9 +91,7 @@ The plot below highlights the alignment comparison of the model trained with Con
  The table below summarizes the evaluation results across mathematical tasks and original capabilities for various models and training approaches.

  | **Model** | **Math Tasks** | | | | **Original Capabilities** | | | | **Overall Avg.** |
- |--------------------------|----------------------------|----------|-----------|----------|-----------------------------|---------|---------|-----------|------------------|
  | | **MathHard** | **Math** | **GSM8K** | **Avg.** | **ARC** | **GPQA**| **MMLU**| **MMLUP** | |
- |--------------------------|----------------------------|----------|-----------|----------|-----------------------------|---------|---------|-----------|------------------|
  | Llama3.1-8B-Instruct | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 56.3 |
  | OpenMath2-Llama3.1 | 38.4 | 64.1 | 90.3 | 64.3 | 45.8 | 1.3 | 4.5 | 19.5 | 38.6 |
  | **Full Param Tune** | **38.5** | **63.7** | 90.2 | **63.9** | 58.2 | 1.1 | 7.3 | 23.5 | 40.1 |