hawei_LinkedIn committed on
Commit 0f70593 · 1 Parent(s): a5c538b

update readme with benchmark result

Files changed (1)
  1. README.md +10 -9
README.md CHANGED
@@ -77,6 +77,16 @@ This is a fine-tuned model of Llama-3.1-8B-Instruct for mathematical tasks on Op
  ## Evaluation Results
  Here is an overview of the evaluation results and findings:
 
+ ### Benchmark Results Table
+ The table below summarizes evaluation results across mathematical tasks and original capabilities.
+
+ | **Model** | **MH** | **M** | **G8K** | **M-Avg** | **ARC** | **GPQA** | **MLU** | **MLUP** | **O-Avg** | **Overall** |
+ |-------------------|--------|--------|---------|-----------|---------|----------|---------|----------|-----------|-------------|
+ | Llama3.1-8B-Inst | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
+ | **Control LLM*** | 36.0 | 61.7 | **89.7**| 62.5 | 82.5 | 30.8 | **71.6**| 45.4 | **57.6** | **60.0** |
+
+ ---
+
  ### Catastrophic Forgetting on OpenMath
  The following plot illustrates and compares catastrophic forgetting mitigation during training
 
@@ -87,15 +97,6 @@ The plot below highlights the alignment result of the model trained with Control
 
  ![Alignment](plots/alignment_best.png)
 
- ### Benchmark Results Table
- The table below summarizes evaluation results across mathematical tasks and original capabilities.
-
- | **Model** | **MH** | **M** | **G8K** | **M-Avg** | **ARC** | **GPQA** | **MLU** | **MLUP** | **O-Avg** | **Overall** |
- |-------------------|--------|--------|---------|-----------|---------|----------|---------|----------|-----------|-------------|
- | Llama3.1-8B-Inst | 23.7 | 50.9 | 85.6 | 52.1 | 83.4 | 29.9 | 72.4 | 46.7 | 60.5 | 56.3 |
- | **Control LLM*** | 36.0 | 61.7 | **89.7**| 62.5 | 82.5 | 30.8 | **71.6**| 45.4 | **57.6** | **60.0** |
-
- ---
 
  ### Explanation:
  - **MH**: MathHard