phmartins committed · verified
Commit 3d4b446 · Parent(s): f872814

Update README.md

Files changed (1): README.md (+7 -7)
README.md CHANGED
@@ -130,17 +130,17 @@ The results show that EuroLLM-1.7B is substantially better than Gemma-2B in Mach
 ### General Benchmarks
 We also compare EuroLLM-1.7B with [TinyLlama-v1.1](https://huggingface.co/TinyLlama/TinyLlama_v1.1) and [Gemma-2B](https://huggingface.co/google/gemma-2b) on two general benchmarks: Arc Challenge and Hellaswag.
 For the non-English languages we use the [Okapi](https://aclanthology.org/2023.emnlp-demo.28.pdf) datasets.
-Results show that EuroLLM-1.7B is superior to TinyLlama-1.1-3T and similar to Gemma-2B on Hellaswag but worse on Arc Challenge. This can be due to the lower number of parameters of EuroLLM-1.7B (1.133B non-embedding parameters against 1.981B).
+Results show that EuroLLM-1.7B is superior to TinyLlama-v1.1 and similar to Gemma-2B on Hellaswag but worse on Arc Challenge. This can be due to the lower number of parameters of EuroLLM-1.7B (1.133B non-embedding parameters against 1.981B).
 
 #### Arc Challenge
 | Model | Average | English | German | Spanish | French | Italian | Portuguese | Chinese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
 |--------------------|---------|---------|--------|---------|--------|---------|------------|---------|---------|-------|--------|---------|--------|-----------|----------|-----------|--------|---------|
-| EuroLLM-1.7B-Instruct | 0.3268 | 0.3218 | 0.4070 | 0.3293 | 0.3521 | 0.3370 | 0.3422 | 0.3496 | 0.3060 | 0.3122 | 0.3174 | 0.2866 | 0.3373 | 0.2817 | 0.3031 | 0.3179 | 0.3199 | 0.3248 | 0.3310 |
-| TinyLlama-v1.1 | 0.2650 | 0.2583 | 0.3712 | 0.2524 | 0.2795 | 0.2883 | 0.2652 | 0.2906 | 0.2410 | 0.2669 | 0.2404 | 0.2310 | 0.2687 | 0.2354 | 0.2449 | 0.2476 | 0.2524 | 0.2494 | 0.2796 |
-| Gemma-2B | 0.3617 | 0.3540 | 0.4846 | 0.3755 | 0.3940 | 0.4080 | 0.3687 | 0.3872 | 0.3726 | 0.3456 | 0.3328 | 0.3122 | 0.3519 | 0.2851 | 0.3039 | 0.3590 | 0.3601 | 0.3565 | 0.3516 |
+| EuroLLM-1.7B | 0.3268 | 0.4070 | 0.3293 | 0.3521 | 0.3370 | 0.3422 | 0.3496 | 0.3060 | 0.3122 | 0.3174 | 0.2866 | 0.3373 | 0.2817 | 0.3031 | 0.3179 | 0.3199 | 0.3248 | 0.3310 |
+| TinyLlama-v1.1 | 0.2650 | 0.3712 | 0.2524 | 0.2795 | 0.2883 | 0.2652 | 0.2906 | 0.2410 | 0.2669 | 0.2404 | 0.2310 | 0.2687 | 0.2354 | 0.2449 | 0.2476 | 0.2524 | 0.2494 | 0.2796 |
+| Gemma-2B | 0.3617 | 0.4846 | 0.3755 | 0.3940 | 0.4080 | 0.3687 | 0.3872 | 0.3726 | 0.3456 | 0.3328 | 0.3122 | 0.3519 | 0.2851 | 0.3039 | 0.3590 | 0.3601 | 0.3565 | 0.3516 |
 #### Hellaswag
 | Model | Average | English | German | Spanish | French | Italian | Portuguese | Russian | Dutch | Arabic | Swedish | Hindi | Hungarian | Romanian | Ukrainian | Danish | Catalan |
 |--------------------|---------|---------|--------|---------|--------|---------|------------|---------|--------|--------|---------|--------|-----------|----------|-----------|--------|---------|
-| EuroLLM-1.7B-Instruct | 0.4744 | 0.4654 | 0.6084 | 0.4772 | 0.5310 | 0.5260 | 0.5067 | 0.5206 | 0.4674 | 0.4893 | 0.4075 | 0.4813 | 0.3605 | 0.4067 | 0.4598 | 0.4368 | 0.4700 | 0.4405 |
-| TinyLlama-v1.1 | 0.3674 | 0.3503 | 0.6248 | 0.3650 | 0.4137 | 0.4010 | 0.3780 | 0.3892 | 0.3494 | 0.3588 | 0.2880 | 0.3561 | 0.2841 | 0.3073 | 0.3267 | 0.3349 | 0.3408 | 0.3613 |
-| Gemma-2B | 0.4666 | 0.4499 | 0.7165 | 0.4756 | 0.5414 | 0.5180 | 0.4841 | 0.5081 | 0.4664 | 0.4655 | 0.3868 | 0.4383 | 0.3413 | 0.3710 | 0.4316 | 0.4291 | 0.4471 | 0.4448 |
+| EuroLLM-1.7B | 0.4744 | 0.6084 | 0.4772 | 0.5310 | 0.5260 | 0.5067 | 0.5206 | 0.4674 | 0.4893 | 0.4075 | 0.4813 | 0.3605 | 0.4067 | 0.4598 | 0.4368 | 0.4700 | 0.4405 |
+| TinyLlama-v1.1 | 0.3674 | 0.6248 | 0.3650 | 0.4137 | 0.4010 | 0.3780 | 0.3892 | 0.3494 | 0.3588 | 0.2880 | 0.3561 | 0.2841 | 0.3073 | 0.3267 | 0.3349 | 0.3408 | 0.3613 |
+| Gemma-2B | 0.4666 | 0.7165 | 0.4756 | 0.5414 | 0.5180 | 0.4841 | 0.5081 | 0.4664 | 0.4655 | 0.3868 | 0.4383 | 0.3413 | 0.3710 | 0.4316 | 0.4291 | 0.4471 | 0.4448 |
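One way to check that the updated rows are the right fix: in both tables the reported Average matches the plain arithmetic mean of the retained per-language scores (whereas the old 18-value rows do not average to the reported figures). A minimal sanity check, with the Arc Challenge rows transcribed from the updated table; note that reading "Average" as a plain mean is an assumption, not something the README states:

```python
# Sanity check for the corrected Arc Challenge rows: the reported Average
# should equal the arithmetic mean of the 17 per-language scores.
# (Assumption: "Average" is an unweighted mean; the README does not say.)

arc_challenge = {
    # model: (reported average, [English ... Catalan] scores from the updated table)
    "EuroLLM-1.7B": (0.3268, [0.4070, 0.3293, 0.3521, 0.3370, 0.3422, 0.3496,
                              0.3060, 0.3122, 0.3174, 0.2866, 0.3373, 0.2817,
                              0.3031, 0.3179, 0.3199, 0.3248, 0.3310]),
    "TinyLlama-v1.1": (0.2650, [0.3712, 0.2524, 0.2795, 0.2883, 0.2652, 0.2906,
                                0.2410, 0.2669, 0.2404, 0.2310, 0.2687, 0.2354,
                                0.2449, 0.2476, 0.2524, 0.2494, 0.2796]),
    "Gemma-2B": (0.3617, [0.4846, 0.3755, 0.3940, 0.4080, 0.3687, 0.3872,
                          0.3726, 0.3456, 0.3328, 0.3122, 0.3519, 0.2851,
                          0.3039, 0.3590, 0.3601, 0.3565, 0.3516]),
}

for model, (reported, scores) in arc_challenge.items():
    mean = sum(scores) / len(scores)
    ok = abs(mean - reported) < 5e-4  # tolerate rounding to 4 decimals
    print(f"{model}: reported {reported:.4f}, recomputed {mean:.4f}, {'OK' if ok else 'MISMATCH'}")
```

The Hellaswag rows pass the same check (16 language scores each; EuroLLM-1.7B's, for example, average to ≈0.4744), which suggests the pre-commit rows were mis-aligned rather than mis-averaged: each old row carried one extra value after Average, shifting every language score one column to the right.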
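The README does not say how these scores were produced. For readers who want to reproduce the English numbers, below is a hedged sketch using EleutherAI's lm-evaluation-harness Python API; the harness itself, the model id `utter-project/EuroLLM-1.7B`, the dtype, and the few-shot setting are all assumptions rather than documented setup, and the multilingual Okapi variants require their own task configurations:

```python
# Hedged reproduction sketch (assumptions: lm-evaluation-harness >= 0.4,
# model id utter-project/EuroLLM-1.7B, bfloat16 weights, default few-shot
# setting -- the README does not state the actual evaluation setup).
import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",  # Hugging Face transformers backend
    model_args="pretrained=utter-project/EuroLLM-1.7B,dtype=bfloat16",
    tasks=["arc_challenge", "hellaswag"],  # English tasks only here
)
print(results["results"])  # per-task accuracy / normalized accuracy
```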