huangzixian committed on
Commit 29d7d78 · 1 Parent(s): f896060

update readme

Files changed (1)
  1. README.md +13 -16
README.md CHANGED
@@ -1,16 +1,16 @@
 
  ### Model Sources
- **Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
-
- Link: https://arxiv.org/pdf/2407
+ - **Paper**: LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages
+ - **Link**:
+ - **Repository**: https://github.com/CONE-MT/LLaMAX/
 
  ### Model Description
 
- 🔥 LLaMAX-7B-MetaMath is fully fine-tuned on the MetaMathQA dataset based on the powerful multilingual model LLaMAX-7B.
+ 🔥 LLaMAX2-7B-MetaMath is fully fine-tuned on the MetaMathQA dataset based on the powerful multilingual model LLaMAX2-7B.
 
- 🔥 Compared with the [MetaMath-7B](https://huggingface.co/meta-math/MetaMath-7B-V1.0), LLaMAX-7B-MetaMath performs significantly better in mathematical reasoning in low-resource languages, improving the average accuracy of low-resource languages on MGSM dataset by up to 18.8%.
+ 🔥 Compared with [MetaMath-7B](https://huggingface.co/meta-math/MetaMath-7B-V1.0), LLaMAX2-7B-MetaMath performs significantly better at mathematical reasoning in low-resource languages, improving their average accuracy on the MGSM dataset by up to 18.8%.
 
- 🔥 LLaMAX-7B-MetaMath demonstrates good multilingual math reasoning capability in all languages, improving the average accuracy by 6.2% across all languages in MGSM dataset.
+ 🔥 LLaMAX2-7B-MetaMath demonstrates good multilingual math reasoning capability in all languages, improving the average accuracy across all languages on the MGSM dataset by 6.2%.
 
  ### Model Usage
 
@@ -46,20 +46,17 @@ the total number of words (1050) by the number of days in two weeks (14). So, th
  1050/14 = 75 words in each daily crossword puzzle on average. #### The answer is: 75“
  ```
  ### Experiments
- We evaluated LLaMAX-7B-MetaMath on the MGSM dataset. Compared with MetaMath-7B, LLaMAX-7B-MetaMath achieves a leading on both high-resource languages (Hrl.) and low-resource languages (Lrl.).
+ We evaluated LLaMAX2-7B-MetaMath on the MGSM dataset. Compared with MetaMath-7B, LLaMAX2-7B-MetaMath leads on both high-resource languages (Hrl.) and low-resource languages (Lrl.).
 
- | MGSM                     | Bn   | Th   | Sw   | Ja   | Zh   | De   | Fr   | Ru   | Es   | En   | Lrl. | Hrl. | Avg.  |
- |--------------------------|------|------|------|------|------|------|------|------|------|------|------|------|-------|
- | MetaMath-7B (official)   | 6.8  | 7.2  | 6.8  | 36.4 | 38.4 | 55.2 | 54.4 | 52.0 | 57.2 | 68.8 | 6.9  | 51.8 | 38.32 |
- | MetaMath-7B (Reproduced) | 6.0  | 10.0 | 4.4  | 36.4 | 42.8 | 52.8 | 56.0 | 48.8 | 58.8 | 64.8 | 6.8  | 51.5 | 38.08 |
- | LLaMAX-7B-MetaMath       | 26.8 | 24.0 | 26.0 | 35.6 | 42.4 | 56.8 | 55.2 | 53.6 | 56.8 | 65.6 | 25.6 | 52.3 | 44.28 |
+ | MGSM                     | Avg.  | Lrl. | Hrl. | Bn   | Th   | Sw   | Ja   | Zh   | De   | Fr   | Ru   | Es   | En   |
+ |--------------------------|-------|------|------|------|------|------|------|------|------|------|------|------|------|
+ | MetaMath-7B (official)   | 38.32 | 6.9  | 51.8 | 6.8  | 7.2  | 6.8  | 36.4 | 38.4 | 55.2 | 54.4 | 52.0 | 57.2 | 68.8 |
+ | MetaMath-7B (Reproduced) | 38.08 | 6.8  | 51.5 | 6.0  | 10.0 | 4.4  | 36.4 | 42.8 | 52.8 | 56.0 | 48.8 | 58.8 | 64.8 |
+ | LLaMAX2-7B-MetaMath      | 44.28 | 25.6 | 52.3 | 26.8 | 24.0 | 26.0 | 35.6 | 42.4 | 56.8 | 55.2 | 53.6 | 56.8 | 65.6 |
 
  ### Citation
  If our model helps your work, please cite this paper:
 
  ```
- @inproceedings{Huang2024MindMergerEB,
-   title={XLLaMA2: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages},
-   year={2024},
- }
+
  ```
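
The README's own usage snippet sits in the unchanged lines 17-45 and therefore does not appear in this diff; only its example output (the 1050/14 = 75 answer) is visible above. For orientation, here is a minimal sketch of how a MetaMath-style model such as this one is typically queried with 🤗 Transformers. The repo id, the Alpaca/MetaMath-style prompt, and the example question (paraphrased from the answer shown above) are assumptions, not content taken from the model card.

```python
# Minimal sketch, not the model card's own snippet (that sits in the unchanged
# lines 17-45 of README.md). The repo id and the Alpaca/MetaMath-style prompt
# below are assumptions; adjust them to match the actual card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LLaMAX/LLaMAX2-7B-MetaMath"  # assumed Hub repo id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Question paraphrased from the example answer visible in the diff above.
query = (
    "Bert uses up a pencil every two weeks filling out the daily crossword "
    "puzzle, which takes about 1050 words in total. On average, how many "
    "words are in each crossword puzzle?"
)
prompt = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request.\n\n"
    f"### Instruction:\n{query}\n\n### Response:"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:],
                       skip_special_tokens=True))
```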
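
A note on the table: the reported Lrl., Hrl., and Avg. columns are consistent with plain unweighted means over the three low-resource languages (Bn, Th, Sw), the remaining seven high-resource languages, and all ten languages, respectively. A quick check against the LLaMAX2-7B-MetaMath row, under that assumption:

```python
# Sanity check of the aggregate columns, assuming Lrl./Hrl./Avg. are
# unweighted means over the per-language MGSM accuracies.
scores = {  # LLaMAX2-7B-MetaMath row (%)
    "Bn": 26.8, "Th": 24.0, "Sw": 26.0,               # low-resource (Lrl.)
    "Ja": 35.6, "Zh": 42.4, "De": 56.8, "Fr": 55.2,   # high-resource (Hrl.)
    "Ru": 53.6, "Es": 56.8, "En": 65.6,
}
lrl = [scores[k] for k in ("Bn", "Th", "Sw")]
hrl = [v for k, v in scores.items() if k not in ("Bn", "Th", "Sw")]

print(round(sum(lrl) / len(lrl), 1))                 # 25.6  -> Lrl.
print(round(sum(hrl) / len(hrl), 1))                 # 52.3  -> Hrl.
print(round(sum(scores.values()) / len(scores), 2))  # 44.28 -> Avg.
```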