Update README.md
README.md
CHANGED
@@ -133,16 +133,16 @@ And for the 1B model:

| task          | random | [StableLM 2 1.6b](https://huggingface.co/stabilityai/stablelm-2-1_6b)\* | [Pythia 1B](https://huggingface.co/EleutherAI/pythia-1b) | [TinyLlama 1.1B](https://huggingface.co/TinyLlama/TinyLlama-1.1B-intermediate-step-1195k-token-2.5T) | OLMo 1B | **OLMo 1.7-1B** (ours) |
| ------------- | ------ | ----------------- | --------- | -------------------------------------- | ------- | ---- |
| arc_challenge | 25     | 43.8              | 33.1      | 34.8                                    | 34.5    | 36.5 |
| arc_easy      | 25     | 63.7              | 50.2      | 53.2                                    | 58.1    | 55.3 |
| boolq         | 50     | 76.6              | 61.8      | 64.6                                    | 60.7    | 67.5 |
| copa          | 50     | 84.0              | 72.0      | 78.0                                    | 79.0    | 83.0 |
| hellaswag     | 25     | 68.2              | 44.7      | 58.7                                    | 62.5    | 66.9 |
| openbookqa    | 25     | 45.8              | 37.8      | 43.6                                    | 46.4    | 46.4 |
| piqa          | 50     | 74.0              | 69.1      | 71.1                                    | 73.7    | 74.9 |
| sciq          | 25     | 94.7              | 86.0      | 90.5                                    | 88.1    | 93.4 |
| winogrande    | 50     | 64.9              | 53.3      | 58.9                                    | 58.9    | 61.4 |
| Average       | 36.1   | 68.4              | 56.4      | 61.5                                    | 62.4    | 65.0 |

\*Unlike OLMo, Pythia, and TinyLlama, StabilityAI has not yet disclosed the data StableLM was trained on, making comparisons with other efforts challenging.
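
The `Average` row appears to be the unweighted mean of the nine per-task scores (the `random` column checks out the same way: 325 / 9 ≈ 36.1). A minimal sketch, with the OLMo 1.7-1B values copied from the table above:

```python
# Sketch: reproduce the "Average" entry for OLMo 1.7-1B, assuming it is the
# plain (unweighted) mean of the nine per-task scores in the table.
scores = {
    "arc_challenge": 36.5,
    "arc_easy": 55.3,
    "boolq": 67.5,
    "copa": 83.0,
    "hellaswag": 66.9,
    "openbookqa": 46.4,
    "piqa": 74.9,
    "sciq": 93.4,
    "winogrande": 61.4,
}
print(round(sum(scores.values()) / len(scores), 1))  # 65.0, matching the Average row
```

The exact evaluation harness behind these numbers is not part of this diff. Purely as an illustration, the baselines linked in the table can be loaded from the Hugging Face Hub with the standard `transformers` API (the model id below is taken from the Pythia link above; the prompt is a made-up example, not one of the benchmark items):

```python
# Illustrative only: load one of the 1B-scale baselines referenced in the table
# and run a quick generation. This is not the pipeline that produced the scores above.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "EleutherAI/pythia-1b"  # from the Pythia link in the table
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "Question: Which gas do plants absorb from the air?\nAnswer:"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```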