license: apache-2.0
datasets:
  - Mathoctopus/GSM8KInstruct_Parallel
language:
  - en
  - es
  - zh
  - de
  - ru
  - th
  - sw
  - ja
  - fr
  - bn

Introduction

We introduce 🐙 MathOctopus, a series of open-source large language models (LLMs) tailored for multilingual math problem-solving. The MathOctopus models are trained on the 🤗 MGSM8KInstruct dataset, which covers ten languages. MathOctopus outperforms conventional open-source LLMs and surpasses ChatGPT in few-shot scenarios.

Datasets

MGSM8KInstruct

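| Training Dataset | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MGSM8KInstruct | 7473 | 7472 | 7466 | 6539 | 7466 | 7470 | 7469 | 7471 | 7361 | 7473 | 73.6K |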

MSVAMP

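| Test Dataset | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MSVAMP | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 1000 | 10K |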

Usage

Our datasets and models are all available on Hugging Face.

🤗 MGSM8KInstruct_Parallel Dataset

🤗 MSVAMP Dataset
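As a minimal sketch of pulling these resources from the Hub with the `datasets` and `transformers` libraries: the repository ids below are the ones named in this model card, while the function names, the generation length, and the use of the bare question as the prompt are assumptions for illustration. This card does not specify a prompt template, so treat the prompting step as a placeholder.

```python
# Repository ids as they appear in this model card.
MODEL_ID = "Mathoctopus/Parallel_xRFT_7B"
DATASET_ID = "Mathoctopus/GSM8KInstruct_Parallel"


def load_training_data(dataset_id: str = DATASET_ID):
    """Download the instruction dataset from the Hub (requires `pip install datasets`)."""
    from datasets import load_dataset  # lazy import: only needed when this is called

    return load_dataset(dataset_id)


def generate_answer(question: str, model_id: str = MODEL_ID, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a completion for a single question.

    Requires `pip install transformers torch` and enough memory for a 7B model.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    # Assumption: the raw question is passed as the prompt. The model card does
    # not document a prompt template, so results may differ from reported scores.
    inputs = tokenizer(question, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_answer("Janet has 3 apples and buys 4 more. How many apples does she have?")` downloads the weights on first use, so nothing heavy happens at import time.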

Or you can directly download them from

Models

| Base Model: LLaMA | Parallel-Training | Cross-Training |
|:---|:---|:---|
| 7B-LLaMA 2 | 🐙 MathOctopus-Parallel-7B | 🐙 MathOctopus-Cross-7B |
| | 🐙 MathOctopus-Parallel-xRFT-7B | 🐙 MathOctopus-Cross-xRFT-7B |
| 13B-LLaMA 2 | 🐙 MathOctopus-Parallel-13B | 🐙 MathOctopus-Cross-13B |
| | 🐙 MathOctopus-Parallel-xRFT-13B | 🐙 MathOctopus-Cross-xRFT-13B |
| 33B-LLaMA 1 | 🐙 MathOctopus-Parallel-33B | 🐙 MathOctopus-Cross-33B |
| 70B-LLaMA 2 | Coming soon! | Coming soon! |

*-Parallel refers to a model trained with the parallel-training strategy. *-Cross refers to a model trained with the cross-training strategy.

*-xRFT denotes a model trained with multilingual rejection sampling.
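To make the idea of rejection sampling concrete, here is a generic sketch of how sampled solutions can be filtered into extra fine-tuning data: keep only candidates whose final answer matches the reference, and drop duplicates. This is not the authors' pipeline. The answer-extraction rule (last number in the text), the deduplication key, and the names `final_number` and `rejection_sample` are all assumptions made for illustration.

```python
import re
from typing import Iterable, List, Optional

# Matches integers and decimals, optionally negative.
_NUMBER = re.compile(r"-?\d+(?:\.\d+)?")


def final_number(solution: str) -> Optional[float]:
    """Return the last number mentioned in a solution, or None if there is none."""
    matches = _NUMBER.findall(solution.replace(",", ""))  # "1,000" -> "1000"
    return float(matches[-1]) if matches else None


def rejection_sample(
    candidates: Iterable[str], gold_answer: float, tol: float = 1e-6
) -> List[str]:
    """Keep distinct candidate solutions whose final number equals the gold answer."""
    kept: List[str] = []
    seen = set()
    for text in candidates:
        answer = final_number(text)
        if answer is None or abs(answer - gold_answer) > tol:
            continue  # reject: wrong or missing final answer
        key = " ".join(text.split())  # whitespace-normalised text for deduplication
        if key not in seen:
            seen.add(key)
            kept.append(text)
    return kept


# Hypothetical sampled solutions in several languages for one question (gold = 7).
samples = [
    "3 + 4 = 7. The answer is 7",
    "3 * 4 = 12. The answer is 12",
    "3 + 4 = 7.  The answer is 7",
    "3 + 4 = 7. La respuesta es 7",
]
accepted = rejection_sample(samples, gold_answer=7)
print(len(accepted), "of", len(samples), "samples kept")
```

Because the filter only inspects the final number, it works the same way on solutions written in any of the ten languages, which is what makes it usable in a multilingual setting.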

Overall Results on MGSM

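| 7B Model | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MathOctopus^C | 52.0 | 23.6 | 31.6 | 18.8 | 38.0 | 39.2 | 36.4 | 27.2 | 33.6 | 21.6 | 32.2 |
| **xRFT**-MathOctopus^C | 51.2 | 24.0 | 33.2 | 18.8 | 36.0 | 41.2 | 37.6 | 29.6 | 36.4 | 25.2 | 33.3 |
| MathOctopus^P-LoRA | 30.4 | 15.2 | 23.6 | 10.4 | 22.8 | 24.8 | 26.4 | 18.0 | 22.0 | 14.8 | 20.8 |
| MathOctopus^P | 52.4 | 39.2 | 38.4 | 28.8 | 44.8 | 42.4 | 43.6 | 36.0 | 39.6 | 34.4 | 40.0 |
| **xRFT**-MathOctopus^P | 54.8 | 38.4 | 45.2 | 33.2 | 43.6 | 45.2 | 38.0 | 35.6 | 48.4 | 36.4 | 41.9 |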
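The Overall column in the 7B MGSM table above appears to be the unweighted mean of the ten per-language accuracies. The short check below recomputes it for two rows; the averaging rule is an inference from the numbers rather than something the card states, and the variable names are illustrative.

```python
from statistics import mean

LANGS = ["En", "Sw", "Zh", "Bn", "De", "Es", "Fr", "Ja", "Ru", "Th"]

# Per-language MGSM accuracies copied from the 7B table.
MGSM_7B = {
    "MathOctopus^C": [52.0, 23.6, 31.6, 18.8, 38.0, 39.2, 36.4, 27.2, 33.6, 21.6],
    "MathOctopus^P": [52.4, 39.2, 38.4, 28.8, 44.8, 42.4, 43.6, 36.0, 39.6, 34.4],
}

# Overall values as reported in the same table.
REPORTED = {"MathOctopus^C": 32.2, "MathOctopus^P": 40.0}


def overall(scores):
    """Unweighted mean over languages (assumed definition of 'Overall')."""
    assert len(scores) == len(LANGS)
    return mean(scores)


for name, scores in MGSM_7B.items():
    print(f"{name}: computed {overall(scores):.1f}, reported {REPORTED[name]}")
```

The same helper can be applied to any other row to sanity-check a reported average.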

| 13B Model | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MathOctopus^C | 56.4 | 27.2 | 39.2 | 24.0 | 47.6 | 49.6 | 47.6 | 40.4 | 42.0 | 24.8 | 39.9 |
| **xRFT**-MathOctopus^C | 53.6 | 28.0 | 45.2 | 21.2 | 48.0 | 46.4 | 46.0 | 35.2 | 45.6 | 28.8 | 39.8 |
| MathOctopus^P | 53.2 | 42.8 | 48.8 | 35.2 | 44.4 | 48.0 | 48.4 | 43.2 | 47.6 | 46.8 | 45.8 |
| **xRFT**-MathOctopus^P | 51.6 | 46.0 | 51.2 | 42.0 | 49.2 | 53.2 | 49.6 | 39.6 | 47.6 | 46.0 | 47.6 |

| 30-34B Model | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MathOctopus^C | 55.6 | 24.4 | 36.0 | 19.2 | 40.4 | 51.2 | 44.4 | 27.2 | 37.2 | 21.6 | 35.7 |
| **xRFT**-MathOctopus^C | 53.6 | 27.6 | 34.4 | 19.2 | 47.2 | 47.6 | 44.8 | 30.8 | 38.8 | 22.8 | 36.7 |
| MathOctopus^P | 56.4 | 46.8 | 52.0 | 35.2 | 47.2 | 53.2 | 48.0 | 39.2 | 45.6 | 41.2 | 46.5 |
| **xRFT**-MathOctopus^P | 51.6 | 47.2 | 52.4 | 37.6 | 51.2 | 52.8 | 44.4 | 41.6 | 50.0 | 47.6 | 47.6 |

### **Overall Results on MSVAMP**

| 7B Model | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MathOctopus^C | 49.2 | 36.6 | 43.6 | 30.2 | 48.6 | 46.8 | 46.4 | 42.5 | 46.7 | 34.0 | 42.5 |
| **xRFT**-MathOctopus^C | 49.9 | 37.7 | 43.3 | 32.9 | 46.5 | 47.6 | 47.3 | 42.7 | 46.6 | 36.2 | 43.1 |
| MathOctopus^P-LoRA | 30.4 | 15.2 | 23.6 | 10.4 | 22.8 | 24.8 | 26.4 | 18.0 | 22.0 | 14.8 | 20.8 |
| MathOctopus^P | 46.5 | 40.1 | 42.5 | 29.1 | 43.5 | 45.4 | 46.0 | 42.5 | 45.4 | 35.7 | 41.7 |
| **xRFT**-MathOctopus^P | 46.8 | 42.3 | 43.2 | 32.8 | 43.1 | 44.5 | 45.3 | 43.2 | 42.1 | 40.5 | 42.4 |

| 13B Model | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MathOctopus^C | 56.6 | 40.4 | 49.0 | 30.3 | 50.9 | 54.2 | 54.7 | 46.3 | 52.4 | 35.7 | 47.1 |
| **xRFT**-MathOctopus^C | 52.9 | 41.9 | 49.2 | 34.1 | 50.5 | 52.8 | 51.5 | 45.8 | 50.2 | 35.7 | 46.5 |
| MathOctopus^P | 50.7 | 43.4 | 42.6 | 31.8 | 48.4 | 49.4 | 50.6 | 41.1 | 46.9 | 39.3 | 44.4 |
| **xRFT**-MathOctopus^P | 44.6 | 43.4 | 46.4 | 34.2 | 47.7 | 48.2 | 49.9 | 43.1 | 48.2 | 39.5 | 44.5 |

| 30-34B Model | En | Sw | Zh | Bn | De | Es | Fr | Ja | Ru | Th | Overall |
|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|:---|
| MathOctopus^C | 51.5 | 42.1 | 46.2 | 23.2 | 50.5 | 52.1 | 52.9 | 42.2 | 50.5 | 33.4 | 44.5 |
| **xRFT**-MathOctopus^C | 48.1 | 42.8 | 43.6 | 23.3 | 48.7 | 50.0 | 48.9 | 43.4 | 44.6 | 35.5 | 42.9 |
| MathOctopus^P | 56.4 | 46.8 | 52.0 | 35.2 | 47.2 | 53.2 | 48.0 | 39.2 | 45.6 | 41.2 | 46.5 |
| **xRFT**-MathOctopus^P | 48.0 | 42.3 | 46.1 | 36.2 | 47.5 | 48.5 | 48.3 | 45.8 | 47.2 | 41.2 | 45.1 |

### **MathOctopus in English**

| Models | GSM8K | SVAMP |
|:---|:---|:---|
| LLaMA 2-7B | 42.4 | 38.3 |
| MathOctopus^P-7B | 49.3 | 46.8 |
| MathOctopus^C-7B | 50.8 | 49.3 |
| LLaMA 2-13B | 51.0 | 50.9 |
| MathOctopus^P-13B | 55.5 | 52.1 |
| MathOctopus^C-13B | 56.6 | 56.6 |
| LLaMA 1-33B | 50.0 | 49.0 |
| MathOctopus^P-33B | 56.0 | 52.5 |
| MathOctopus^C-33B | 53.7 | 51.5 |

## Intended Uses

These models are trained for research purposes. They are designed to solve multilingual math problems. They can be used in educational software, tutoring systems, or any application where a solution to a math problem is needed.