Update README.md
README.md CHANGED
@@ -1,11 +1,11 @@
----
-license: apache-2.0
-language:
-- en
-- zh
-tags:
-- moe
----
+---
+license: apache-2.0
+language:
+- en
+- zh
+tags:
+- moe
+---
 # AquilaMoE: Efficient Training for MoE Models with Scale-Up and Scale-Out Strategies
 <p align="center">
 <br>
@@ -142,6 +142,10 @@ The performance of the AquilaMoE model series improves significantly across mult
 | mmlu-ppl | 59.93 |
 | winograd-ppl | 57.5 |
 
+| Model | GPT 3.5 Turbo (11/06) | GPT 3.5 Turbo (03/01) | AquilaMoE-SFT |
+|------------------|-----------------------|-----------------------|---------------|
+| AlpacaEval 2.0 | 19.3 | 18.1 | 21.1 |
+
 *Table: Performance of AquilaMoE-SFT (16\*8B) on various benchmarks.*
 
 ## License Agreement