ControlLLM/Control-LLM-Llama3.1-8B-Math16-Instruct

hawei_LinkedIn committed on Jan 10

Commit

0f70593

1 Parent(s): a5c538b

update readme with benchmark result

Files changed (1)

README.md

@@ -77,6 +77,16 @@ This is a fine-tuned model of Llama-3.1-8B-Instruct for mathematical tasks on Op
 ## Evaluation Results
 Here is an overview of the evaluation results and findings:
+### Benchmark Results Table
+The table below summarizes evaluation results across mathematical tasks and original capabilities.
+| **Model**         | **MH** | **M**  | **G8K** | **M-Avg** | **ARC** | **GPQA** | **MLU** | **MLUP** | **O-Avg** | **Overall** |
+|-------------------|--------|--------|---------|-----------|---------|----------|---------|----------|-----------|-------------|
+| Llama3.1-8B-Inst  | 23.7   | 50.9   | 85.6    | 52.1      | 83.4    | 29.9     | 72.4    | 46.7     | 60.5      | 56.3        |
+| **Control LLM***   | 36.0   | 61.7   | **89.7**| 62.5      | 82.5    | 30.8     | **71.6**| 45.4     | **57.6**  | **60.0**    |
+---
 ### Catastrophic Forgetting on OpenMath
 The following plot illustrates and compares catastrophic forgetting mitigation during training
@@ -87,15 +97,6 @@ The plot below highlights the alignment result of the model trained with Control
 ![Alignment](plots/alignment_best.png)
-### Benchmark Results Table
-The table below summarizes evaluation results across mathematical tasks and original capabilities.
-| **Model**         | **MH** | **M**  | **G8K** | **M-Avg** | **ARC** | **GPQA** | **MLU** | **MLUP** | **O-Avg** | **Overall** |
-|-------------------|--------|--------|---------|-----------|---------|----------|---------|----------|-----------|-------------|
-| Llama3.1-8B-Inst  | 23.7   | 50.9   | 85.6    | 52.1      | 83.4    | 29.9     | 72.4    | 46.7     | 60.5      | 56.3        |
-| **Control LLM***   | 36.0   | 61.7   | **89.7**| 62.5      | 82.5    | 30.8     | **71.6**| 45.4     | **57.6**  | **60.0**    |
----
 ### Explanation:
 - **MH**: MathHard