Update README.md
---
license: mit
datasets:
- sail/regmix-data
language:
- en
---

# Models Trained with Random Mixture

## How to Load a Model

Each model is stored on its own branch. Load one with the Hugging Face Transformers library by passing the branch name as `revision`:

```python
from transformers import AutoModel, AutoTokenizer

# `revision` selects the branch; "model-index-1" is the branch for Model 1.
model = AutoModel.from_pretrained("sail/data-mixture-random-1b", revision="model-index-1")
tokenizer = AutoTokenizer.from_pretrained("sail/data-mixture-random-1b", revision="model-index-1")
```

## Data Mixture

The specific data mixture used for training each 1B model can be found in the file `train_config.yaml` in each corresponding model branch.
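
One way to inspect a mixture is to download `train_config.yaml` from a model's branch. The following is a minimal sketch, assuming the `huggingface_hub` and `PyYAML` packages are installed. The helper names `config_request` and `load_train_config` are hypothetical, and the layout of the YAML file is not described here, so the sketch returns the parsed content without interpreting it.

```python
REPO_ID = "sail/data-mixture-random-1b"


def config_request(model_index: int) -> dict:
    """Build the keyword arguments that locate train_config.yaml on one model branch."""
    return {
        "repo_id": REPO_ID,
        "filename": "train_config.yaml",
        "revision": f"model-index-{model_index}",
    }


def load_train_config(model_index: int):
    """Download train_config.yaml from the given branch and parse it (needs network access)."""
    # Imported lazily so config_request works without these packages installed.
    from huggingface_hub import hf_hub_download
    import yaml

    local_path = hf_hub_download(**config_request(model_index))
    with open(local_path, "r", encoding="utf-8") as handle:
        return yaml.safe_load(handle)
```

Calling `load_train_config(1)` would fetch the file from the `model-index-1` branch and return whatever structure the YAML contains.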

## Model Variants

To access a different model variant, change the `revision` argument of `from_pretrained` to the desired model index (e.g., `"model-index-2"` or `"model-index-3"`). The maximum index is 64.
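
The branch naming can be wrapped in a small helper. This is a minimal sketch, assuming indices start at 1 (as in the loading example and the performance tables) and end at 64; the function name `revision_for` is hypothetical.

```python
MAX_MODEL_INDEX = 64  # highest model index stated in this README


def revision_for(model_index: int) -> str:
    """Return the branch name for a model index, rejecting out-of-range values."""
    if not 1 <= model_index <= MAX_MODEL_INDEX:
        raise ValueError(f"model_index must be in 1..{MAX_MODEL_INDEX}, got {model_index}")
    return f"model-index-{model_index}"


# Branch names for every variant, useful when looping over all models.
all_revisions = [revision_for(i) for i in range(1, MAX_MODEL_INDEX + 1)]
```

The result can be passed directly as the `revision` argument, for example `AutoModel.from_pretrained("sail/data-mixture-random-1b", revision=revision_for(2))`.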

## Usage Notes

- These models are primarily intended for research purposes.
- Performance may vary depending on the specific task and domain.

## Citation

If you use these models in your research, please cite the RegMix paper:

```
@misc{liu2024regmix,
      title={RegMix: Data Mixture as Regression for Language Model Pre-training},
      author={Qian Liu and Xiaosen Zheng and Niklas Muennighoff and Guangtao Zeng and Longxu Dou and Tianyu Pang and Jing Jiang and Min Lin},
      year={2024},
      eprint={2407.01492},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2407.01492},
}
```

For more information about the RegMix methodology and its applications, please refer to the [original paper](https://huggingface.co/papers/2407.01492).

## Performance

We evaluated each model using [lm-evaluation-harness](https://github.com/EleutherAI/lm-evaluation-harness). The score reported for each task is the average over the 0-shot to 5-shot settings of `acc_norm` (normalized accuracy) where available, or `acc` (accuracy) otherwise.
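
The aggregation described above can be sketched as follows. This is an illustration under stated assumptions, not the evaluation script used for these tables: the per-shot results are assumed to be plain dictionaries keyed by `acc_norm` and `acc` (the exact key names emitted by lm-evaluation-harness can differ between versions), scores are assumed to be fractions scaled to percentages, and the numbers in `example` are placeholders rather than real results.

```python
from statistics import mean


def pick_score(metrics: dict) -> float:
    """Prefer normalized accuracy when the task reports it, else fall back to plain accuracy."""
    return metrics["acc_norm"] if "acc_norm" in metrics else metrics["acc"]


def task_score(per_shot: dict) -> float:
    """Average the chosen metric over the 0-shot to 5-shot runs and express it as a percentage."""
    return 100 * mean(pick_score(per_shot[k]) for k in range(6))


# Placeholder values for illustration only; not real evaluation output.
example = {k: {"acc": 0.50, "acc_norm": 0.40 + 0.02 * k} for k in range(6)}
```

Here `task_score(example)` averages the six `acc_norm` values and ignores `acc`, because the normalized metric is available for every shot count.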

### Table 1: Model Index 1-8

| Task          | Model 1 | Model 2 | Model 3 | Model 4 | Model 5 | Model 6 | Model 7 | Model 8 |
|---------------|---------|---------|---------|---------|---------|---------|---------|---------|
| Social IQA    | 33.27   | 33.33   | 33.62   | 33.53   | 33.49   | 33.56   | 33.62   | 33.55   |
| HellaSwag     | 40.58   | 36.86   | 40.58   | 36.06   | 40.07   | 37.85   | 37.93   | 39.59   |
| PiQA          | 67.29   | 65.14   | 67.97   | 64.66   | 67.03   | 65.36   | 66.00   | 66.55   |
| OpenBookQA    | 28.63   | 27.87   | 29.33   | 29.10   | 29.23   | 28.33   | 29.13   | 28.73   |
| Lambada       | 29.17   | 26.86   | 31.55   | 27.11   | 29.16   | 28.92   | 31.53   | 30.92   |
| SciQ          | 80.68   | 79.98   | 81.05   | 80.80   | 82.40   | 79.88   | 78.67   | 79.70   |
| COPA          | 70.50   | 63.83   | 69.17   | 65.00   | 67.50   | 66.00   | 66.67   | 68.67   |
| RACE          | 29.47   | 30.00   | 32.11   | 28.82   | 31.13   | 30.06   | 29.90   | 30.75   |
| ARC Easy      | 50.03   | 48.72   | 50.01   | 46.64   | 51.06   | 47.46   | 46.75   | 48.39   |
| LogiQA        | 23.76   | 24.17   | 25.29   | 25.29   | 24.55   | 25.96   | 25.45   | 26.32   |
| QQP           | 55.71   | 55.90   | 54.84   | 56.52   | 54.01   | 56.34   | 52.35   | 54.20   |
| WinoGrande    | 51.54   | 51.59   | 51.39   | 50.91   | 53.13   | 52.26   | 51.26   | 51.45   |
| MultiRC       | 52.65   | 53.39   | 51.89   | 50.92   | 49.03   | 53.09   | 53.64   | 50.23   |
| **Average**   | **47.18** | **45.97** | **47.60** | **45.80** | **47.06** | **46.54** | **46.38** | **46.85** |
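
The **Average** row appears to be the unweighted mean of the 13 task scores in each column. The README does not state this explicitly, but the short check below reproduces the reported averages for Model 1 and Model 3 from the Table 1 values. The function name `unweighted_average` is chosen for illustration.

```python
# Scores copied from Table 1, rows Social IQA through MultiRC.
model_1 = [33.27, 40.58, 67.29, 28.63, 29.17, 80.68, 70.50,
           29.47, 50.03, 23.76, 55.71, 51.54, 52.65]
model_3 = [33.62, 40.58, 67.97, 29.33, 31.55, 81.05, 69.17,
           32.11, 50.01, 25.29, 54.84, 51.39, 51.89]


def unweighted_average(scores):
    """Mean of the task scores, rounded to two decimals as in the tables."""
    return round(sum(scores) / len(scores), 2)


print(unweighted_average(model_1))  # 47.18, matching the Average row
print(unweighted_average(model_3))  # 47.6, matching the Average row (47.60)
```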

### Table 2: Model Index 9-16

| Task          | Model 9 | Model 10 | Model 11 | Model 12 | Model 13 | Model 14 | Model 15 | Model 16 |
|---------------|---------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.43   | 33.21    | 33.31    | 33.17    | 33.28    | 32.43    | 33.57    | 33.70    |
| HellaSwag     | 40.05   | 35.89    | 39.55    | 39.89    | 38.63    | 36.18    | 39.52    | 35.94    |
| PiQA          | 66.60   | 64.74    | 66.29    | 66.27    | 66.90    | 64.05    | 66.70    | 64.51    |
| OpenBookQA    | 28.87   | 26.60    | 29.33    | 28.73    | 29.40    | 27.87    | 29.67    | 27.83    |
| Lambada       | 31.39   | 27.37    | 30.32    | 30.31    | 31.38    | 26.25    | 29.86    | 26.95    |
| SciQ          | 81.10   | 79.12    | 79.97    | 82.85    | 79.42    | 81.40    | 81.38    | 81.23    |
| COPA          | 67.00   | 64.50    | 66.83    | 69.50    | 67.33    | 65.83    | 69.50    | 66.33    |
| RACE          | 30.57   | 29.63    | 30.49    | 30.85    | 30.35    | 28.66    | 31.21    | 29.57    |
| ARC Easy      | 50.66   | 47.74    | 47.47    | 50.18    | 49.92    | 49.52    | 50.73    | 48.65    |
| LogiQA        | 23.60   | 25.65    | 26.37    | 23.81    | 25.58    | 26.29    | 25.86    | 25.12    |
| QQP           | 54.89   | 54.79    | 54.20    | 55.23    | 53.69    | 57.09    | 53.95    | 54.24    |
| WinoGrande    | 50.83   | 51.84    | 51.05    | 51.83    | 52.12    | 52.00    | 51.01    | 51.82    |
| MultiRC       | 54.18   | 54.48    | 50.17    | 52.12    | 51.42    | 52.69    | 51.87    | 53.48    |
| **Average**   | **47.17** | **45.81** | **46.57** | **47.29** | **46.88** | **46.17** | **47.30** | **46.11** |

### Table 3: Model Index 17-24

| Task          | Model 17 | Model 18 | Model 19 | Model 20 | Model 21 | Model 22 | Model 23 | Model 24 |
|---------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.89    | 33.31    | 33.53    | 33.38    | 33.75    | 33.24    | 33.56    | 33.71    |
| HellaSwag     | 38.68    | 39.90    | 34.67    | 37.12    | 37.44    | 36.07    | 42.15    | 34.67    |
| PiQA          | 66.83    | 67.39    | 63.33    | 64.83    | 65.00    | 63.68    | 67.80    | 62.99    |
| OpenBookQA    | 28.13    | 30.67    | 28.03    | 29.40    | 27.67    | 27.77    | 29.37    | 25.83    |
| Lambada       | 28.78    | 28.56    | 24.13    | 29.41    | 27.67    | 28.03    | 33.47    | 24.04    |
| SciQ          | 79.60    | 78.83    | 77.42    | 78.98    | 78.95    | 78.72    | 81.83    | 79.12    |
| COPA          | 65.17    | 68.17    | 65.33    | 67.33    | 67.67    | 62.67    | 69.83    | 65.83    |
| RACE          | 28.74    | 30.03    | 29.76    | 29.49    | 30.77    | 29.76    | 31.21    | 27.91    |
| ARC Easy      | 48.86    | 49.42    | 47.90    | 48.30    | 47.88    | 46.68    | 50.92    | 45.24    |
| LogiQA        | 25.91    | 26.34    | 26.24    | 25.76    | 26.11    | 26.24    | 24.17    | 25.91    |
| QQP           | 53.35    | 53.18    | 50.61    | 51.49    | 54.27    | 54.99    | 52.77    | 55.19    |
| WinoGrande    | 52.54    | 51.17    | 52.01    | 51.09    | 52.13    | 52.03    | 52.50    | 50.28    |
| MultiRC       | 51.49    | 52.45    | 55.40    | 54.87    | 51.73    | 49.49    | 50.61    | 50.29    |
| **Average**   | **46.30** | **46.88** | **45.26** | **46.27** | **46.23** | **45.34** | **47.71** | **44.69** |

### Table 4: Model Index 25-32

| Task          | Model 25 | Model 26 | Model 27 | Model 28 | Model 29 | Model 30 | Model 31 | Model 32 |
|---------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.51    | 33.40    | 33.59    | 33.52    | 33.53    | 33.49    | 33.16    | 33.56    |
| HellaSwag     | 36.75    | 36.97    | 40.81    | 38.25    | 40.28    | 35.71    | 37.37    | 37.39    |
| PiQA          | 64.09    | 64.74    | 67.97    | 66.15    | 66.88    | 63.84    | 64.47    | 65.05    |
| OpenBookQA    | 29.47    | 28.70    | 29.57    | 29.77    | 29.50    | 29.13    | 29.47    | 28.00    |
| Lambada       | 26.69    | 33.00    | 31.60    | 33.08    | 31.49    | 27.69    | 26.99    | 29.54    |
| SciQ          | 80.03    | 79.17    | 80.12    | 80.22    | 81.92    | 78.23    | 77.42    | 80.87    |
| COPA          | 67.67    | 65.50    | 69.00    | 65.67    | 68.33    | 63.33    | 64.67    | 67.17    |
| RACE          | 30.05    | 30.19    | 30.96    | 30.37    | 30.08    | 29.62    | 30.13    | 29.92    |
| ARC Easy      | 47.50    | 46.90    | 50.26    | 48.57    | 50.55    | 46.96    | 48.77    | 48.79    |
| LogiQA        | 27.24    | 25.55    | 25.86    | 24.37    | 25.32    | 25.12    | 26.40    | 24.30    |
| QQP           | 49.68    | 55.43    | 50.94    | 50.91    | 51.99    | 53.53    | 49.53    | 51.36    |
| WinoGrande    | 51.68    | 52.12    | 51.93    | 51.50    | 52.32    | 51.67    | 52.13    | 52.63    |
| MultiRC       | 51.24    | 51.91    | 50.33    | 52.42    | 52.52    | 54.04    | 52.05    | 53.04    |
| **Average**   | **45.82** | **46.43** | **47.15** | **46.52** | **47.29** | **45.57** | **45.58** | **46.28** |

### Table 5: Model Index 33-40

| Task          | Model 33 | Model 34 | Model 35 | Model 36 | Model 37 | Model 38 | Model 39 | Model 40 |
|---------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.48    | 33.28    | 33.35    | 33.29    | 33.63    | 33.61    | 33.21    | 33.61    |
| HellaSwag     | 38.00    | 40.18    | 43.37    | 37.69    | 32.96    | 32.98    | 37.31    | 37.79    |
| PiQA          | 65.30    | 66.68    | 69.04    | 66.46    | 62.25    | 60.17    | 65.24    | 65.32    |
| OpenBookQA    | 29.43    | 30.37    | 30.43    | 27.63    | 26.43    | 26.83    | 27.97    | 28.70    |
| Lambada       | 26.59    | 31.46    | 31.71    | 30.21    | 18.92    | 20.29    | 28.10    | 28.58    |
| SciQ          | 79.82    | 80.58    | 82.13    | 80.83    | 76.73    | 77.90    | 79.12    | 79.60    |
| COPA          | 64.33    | 69.33    | 67.00    | 67.83    | 61.50    | 62.67    | 64.67    | 66.00    |
| RACE          | 30.03    | 30.16    | 32.47    | 30.49    | 29.27    | 28.12    | 30.11    | 30.21    |
| ARC Easy      | 48.86    | 49.88    | 52.22    | 48.32    | 44.86    | 45.54    | 48.15    | 48.86    |
| LogiQA        | 25.91    | 24.30    | 23.35    | 24.96    | 26.19    | 27.68    | 25.47    | 25.37    |
| QQP           | 56.06    | 56.56    | 52.57    | 56.70    | 52.54    | 48.04    | 49.81    | 57.12    |
| WinoGrande    | 50.92    | 50.97    | 52.39    | 52.70    | 52.30    | 51.68    | 51.42    | 52.80    |
| MultiRC       | 53.09    | 49.97    | 52.18    | 49.05    | 53.78    | 52.27    | 51.45    | 55.68    |
| **Average**   | **46.29** | **47.21** | **47.86** | **46.63** | **43.95** | **43.67** | **45.54** | **46.90** |

### Table 6: Model Index 41-48

| Task          | Model 41 | Model 42 | Model 43 | Model 44 | Model 45 | Model 46 | Model 47 | Model 48 |
|---------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.49    | 33.43    | 33.07    | 33.28    | 33.44    | 33.08    | 33.78    | 33.17    |
| HellaSwag     | 34.51    | 37.59    | 42.69    | 37.37    | 38.31    | 38.30    | 39.67    | 41.07    |
| PiQA          | 62.24    | 65.58    | 68.05    | 66.62    | 66.54    | 65.52    | 66.98    | 67.21    |
| OpenBookQA    | 27.10    | 28.77    | 28.90    | 28.07    | 28.07    | 27.60    | 31.17    | 29.73    |
| Lambada       | 22.78    | 26.99    | 31.34    | 29.51    | 27.87    | 29.47    | 30.34    | 32.71    |
| SciQ          | 77.78    | 80.25    | 79.47    | 80.25    | 80.70    | 79.72    | 81.35    | 81.77    |
| COPA          | 64.00    | 66.33    | 67.00    | 67.00    | 67.33    | 68.33    | 67.17    | 67.67    |
| RACE          | 28.33    | 28.82    | 30.78    | 30.80    | 30.08    | 30.24    | 30.24    | 30.67    |
| ARC Easy      | 45.48    | 48.64    | 51.49    | 46.99    | 48.79    | 48.05    | 49.58    | 49.49    |
| LogiQA        | 24.83    | 24.96    | 24.76    | 23.25    | 26.06    | 25.55    | 24.32    | 24.68    |
| QQP           | 50.27    | 54.73    | 53.96    | 57.00    | 53.73    | 51.19    | 57.52    | 56.91    |
| WinoGrande    | 51.79    | 51.63    | 51.32    | 50.76    | 53.18    | 52.45    | 50.72    | 52.24    |
| MultiRC       | 54.03    | 53.96    | 48.91    | 50.74    | 53.01    | 50.89    | 47.63    | 53.84    |
| **Average**   | **44.35** | **46.28** | **47.06** | **46.28** | **46.70** | **46.18** | **46.96** | **47.78** |

### Table 7: Model Index 49-56

| Task          | Model 49 | Model 50 | Model 51 | Model 52 | Model 53 | Model 54 | Model 55 | Model 56 |
|---------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.53    | 33.74    | 33.37    | 33.41    | 32.96    | 33.88    | 33.75    | 33.79    |
| HellaSwag     | 39.09    | 35.65    | 38.68    | 36.07    | 37.68    | 38.53    | 35.40    | 40.50    |
| PiQA          | 66.81    | 64.58    | 65.68    | 63.99    | 65.85    | 65.76    | 64.51    | 66.89    |
| OpenBookQA    | 29.13    | 27.57    | 28.27    | 29.10    | 29.43    | 28.73    | 28.30    | 29.87    |
| Lambada       | 30.23    | 26.19    | 30.29    | 30.84    | 29.76    | 29.03    | 28.63    | 30.74    |
| SciQ          | 79.90    | 80.83    | 78.40    | 80.03    | 81.38    | 80.92    | 77.75    | 82.07    |
| COPA          | 68.17    | 61.83    | 67.00    | 66.00    | 66.17    | 63.17    | 66.33    | 64.00    |
| RACE          | 31.42    | 29.35    | 30.41    | 31.08    | 30.77    | 29.73    | 30.80    | 31.42    |
| ARC Easy      | 49.54    | 47.71    | 49.02    | 47.64    | 48.38    | 49.36    | 46.96    | 51.22    |
| LogiQA        | 24.99    | 24.58    | 25.32    | 24.91    | 25.17    | 26.22    | 24.63    | 24.91    |
| QQP           | 54.06    | 56.48    | 50.96    | 56.62    | 56.45    | 53.86    | 53.85    | 53.26    |
| WinoGrande    | 50.51    | 50.26    | 51.83    | 51.33    | 52.18    | 51.89    | 51.59    | 50.50    |
| MultiRC       | 50.25    | 54.37    | 50.94    | 52.38    | 51.21    | 55.34    | 54.52    | 50.50    |
| **Average**   | **46.74** | **45.63** | **46.17** | **46.42** | **46.72** | **46.65** | **45.92** | **46.90** |

### Table 8: Model Index 57-64

| Task          | Model 57 | Model 58 | Model 59 | Model 60 | Model 61 | Model 62 | Model 63 | Model 64 |
|---------------|----------|----------|----------|----------|----------|----------|----------|----------|
| Social IQA    | 33.24    | 33.30    | 33.56    | 33.54    | 33.42    | 33.84    | 33.32    | 33.55    |
| HellaSwag     | 41.74    | 39.63    | 35.36    | 38.83    | 38.53    | 36.46    | 38.80    | 36.43    |
| PiQA          | 68.07    | 67.31    | 64.44    | 66.38    | 66.50    | 64.74    | 66.54    | 64.87    |
| OpenBookQA    | 29.20    | 29.50    | 28.10    | 27.97    | 27.83    | 27.37    | 28.83    | 27.87    |
| Lambada       | 31.79    | 31.11    | 27.32    | 30.17    | 28.75    | 26.22    | 30.38    | 26.25    |
| SciQ          | 80.42    | 79.83    | 80.85    | 79.60    | 78.93    | 80.05    | 79.50    | 78.65    |
| COPA          | 66.17    | 69.00    | 64.00    | 64.83    | 67.00    | 64.00    | 66.00    | 66.83    |
| RACE          | 31.39    | 29.82    | 29.67    | 30.08    | 29.98    | 29.46    | 30.37    | 29.19    |
| ARC Easy      | 51.14    | 49.24    | 47.13    | 47.88    | 48.20    | 47.09    | 49.09    | 46.90    |
| LogiQA        | 25.19    | 25.93    | 23.68    | 25.17    | 25.70    | 25.52    | 26.50    | 26.65    |
| QQP           | 55.37    | 54.46    | 52.73    | 53.17    | 59.65    | 58.15    | 57.50    | 55.31    |
| WinoGrande    | 53.21    | 51.46    | 50.83    | 52.16    | 52.37    | 51.41    | 51.63    | 51.85    |
| MultiRC       | 53.58    | 52.31    | 52.22    | 53.03    | 50.41    | 52.17    | 52.27    | 51.50    |
| **Average**   | **47.73** | **47.15** | **45.38** | **46.37** | **46.71** | **45.88** | **46.98** | **45.84** |