Update README.md
README.md
CHANGED
@@ -20,7 +20,7 @@ These models are trained on the Dolma dataset. We have released all code, checkp
 |------|--------|---------|-------------|-----------------|----------------|
 | [OLMo 2-7B](https://huggingface.co/allenai/OLMo-2-1124-7B) | 4 Trillion | 32 | 4096 | 32 | 4096 |
 | [OLMo 2-13B](https://huggingface.co/allenai/OLMo-2-1124-13B) | 5 Trillion | 40 | 5120 | 40 | 4096 |
-| [OLMo 2-32B](https://huggingface.co/allenai/OLMo-2-0325-32B) | 6 Trillion |
+| [OLMo 2-32B](https://huggingface.co/allenai/OLMo-2-0325-32B) | 6 Trillion | 64 | 5120 | 40 | 4096 |

 The core models released in this batch include the following:

@@ -86,13 +86,13 @@ branches = [b.name for b in out.branches]
 Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
 1. Fine-tune with the OLMo repository:
 ```bash
-
-    --data.paths=[{path_to_data}/input_ids.npy] \
-    --data.label_mask_paths=[{path_to_data}/label_mask.npy] \
-    --load_path={path_to_checkpoint} \
-    --reset_trainer_state
+python src/scripts/train/OLMo2-32B.py train_single {training_name} --trainer.load_path="{/path/to/checkpoint}" --trainer.load_strategy=if_available
 ```
-
+Example:
+```bash
+python src/scripts/train/OLMo2-32B.py train_single 32b_run_from_step_10000 --trainer.load_path="step10000" --trainer.load_strategy=if_available
+```
+For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo-core).

 2. Further fine-tuning support is being developed in AI2's Open Instruct repository. Details are [here](https://github.com/allenai/open-instruct).

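Editor's note on the updated recipe: fine-tuning now goes through the OLMo-core training script, and the checkpoint to resume from is passed via `--trainer.load_path`. To start from one of the intermediate checkpoints mentioned in the card, the revision can first be inspected and pulled from this repository. Below is a minimal sketch using `huggingface_hub` and `transformers`, mirroring the card's own `branches = [b.name for b in out.branches]` snippet; the `step10000` revision name is the placeholder from the example command above, not a verified branch name.

```python
# Sketch: list checkpoint revisions of the 32B model and load one of them.
# The revision name "step10000" is taken from the example command above and is
# a placeholder; check the printed branch list for the real revision names.
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "allenai/OLMo-2-0325-32B"
out = list_repo_refs(repo)
branches = [b.name for b in out.branches]
print(branches)  # available revisions, e.g. "main" plus intermediate checkpoints

model = AutoModelForCausalLM.from_pretrained(repo, revision="step10000")
tokenizer = AutoTokenizer.from_pretrained(repo)
```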
@@ -110,7 +110,7 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?

 - **Project Page:** https://allenai.org/olmo
 - **Repositories:**
-  - Core repo (training, inference, fine-tuning etc.): https://github.com/allenai/OLMo
+  - Core repo (training, inference, fine-tuning etc.): https://github.com/allenai/OLMo-core
   - Evaluation code: https://github.com/allenai/OLMo-Eval
   - Further fine-tuning code: https://github.com/allenai/open-instruct
 - **Paper:** https://arxiv.org/abs/2501.00656
@@ -119,7 +119,7 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?


 ## Evaluation
-Core model results for OLMo 2 7B and 13B models are found below.
+Core model results for OLMo 2 32B are found below.

 | Model | Train FLOPs | Average | ARC/C | HSwag | WinoG | MMLU | DROP | NQ | AGIEval | GSM8k | MMLUPro | TriviaQA |
 |-------------------|------------|---------|--------|--------|--------|-------|-------|-----|----------|--------|-----------|-----------|
@@ -146,29 +146,27 @@ Core model results for OLMo 2 7B and 13B models are found below.
 ## Model Details

 ### Pretraining
-| | **OLMo 2
-
-| Pretraining Stage 1
-| Pretraining Stage 2
-| Post-training<br>([
+| | **OLMo 2 32B** | **OLMo 2 13B** | **OLMo 2 7B** |
+|-------------------|------------|------------|------------|
+| Pretraining Stage 1 | 6 trillion tokens<br>(1 epoch) | 5 trillion tokens<br>(1.2 epochs) | 4 trillion tokens<br>(1 epoch) |
+| Pretraining Stage 2 | 100B tokens (2 runs)<br>300B tokens (1 run)<br>*merged* | 100B tokens (3 runs)<br>300B tokens (1 run)<br>*merged* | 50B tokens (3 runs)<br>*merged* |
+| Post-training | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-32b-pref-mix-v1)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-13b-preference-mix)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-7b-preference-mix)) |

 #### Stage 1: Initial Pretraining
-- Dataset: [OLMo-Mix-
+- Dataset: [OLMo-Mix-0325](https://huggingface.co/datasets/allenai/olmo-2-32b-pref-mix-v1) (3.9T tokens)
 - Coverage: 90%+ of total pretraining budget
-
-- 13B Model: 1.2 epochs (5T tokens)
+- 32B Model: ~1 epoch

 #### Stage 2: Fine-tuning
-- Dataset: [Dolmino-Mix-
+- Dataset: [Dolmino-Mix-0325](https://huggingface.co/datasets/allenai/dolmino-mix-1124) (843B tokens)
 - Three training mixes:
-
+  - 100B tokens
   - 100B tokens
   - 300B tokens
 - Mix composition: 50% high-quality data + academic/Q&A/instruction/math content

 #### Model Merging
-
-- 13B Model: 3 versions on 100B mix + 1 version on 300B mix, merged for final checkpoint
+- 32B Model: 2 versions on 100B mix + 1 version on 300B mix, merged for final checkpoint


 ## Bias, Risks, and Limitations
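Editor's note on the *merged* entries and the Model Merging bullet: the final base checkpoints are built by combining independently trained runs. Below is a minimal sketch of plain parameter averaging ("model souping") over hypothetical checkpoint files; the paths are placeholders, and the actual OLMo merging procedure may weight or select runs differently.

```python
# Sketch: merge several trained checkpoints by element-wise parameter averaging
# ("model souping"). File names are placeholders; checkpoints are assumed to be
# plain PyTorch state dicts with identical keys and shapes.
import torch

checkpoint_paths = ["run1_100B.pt", "run2_100B.pt", "run3_300B.pt"]  # placeholders
state_dicts = [torch.load(path, map_location="cpu") for path in checkpoint_paths]

merged = {}
for name, reference in state_dicts[0].items():
    # Average the same tensor across all runs, then restore the original dtype.
    stacked = torch.stack([sd[name].float() for sd in state_dicts])
    merged[name] = stacked.mean(dim=0).to(reference.dtype)

torch.save(merged, "merged_checkpoint.pt")
```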