amanrangapur committed
Commit 4f4baab · verified · 1 Parent(s): 03ce88b

Update README.md

Files changed (1): README.md (+19 -21)
README.md CHANGED
@@ -20,7 +20,7 @@ These models are trained on the Dolma dataset. We have released all code, checkp
 |------|--------|---------|-------------|-----------------|----------------|
 | [OLMo 2-7B](https://huggingface.co/allenai/OLMo-2-1124-7B) | 4 Trillion | 32 | 4096 | 32 | 4096 |
 | [OLMo 2-13B](https://huggingface.co/allenai/OLMo-2-1124-13B) | 5 Trillion | 40 | 5120 | 40 | 4096 |
- | [OLMo 2-32B](https://huggingface.co/allenai/OLMo-2-0325-32B) | 6 Trillion | 40 | 5120 | 40 | 4096 |
+ | [OLMo 2-32B](https://huggingface.co/allenai/OLMo-2-0325-32B) | 6 Trillion | 64 | 5120 | 40 | 4096 |
 
 The core models released in this batch include the following:
 
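For reference, the corrected 32B row above (64 layers, 5120 hidden size, 40 attention heads, 4096 context length) can be sanity-checked against the published config. A minimal sketch, not part of the commit, assuming a `transformers` release recent enough to include OLMo 2 support:

```python
# Read the published config for the 32B model and print the architecture fields
# that appear in the table above (values in comments are the expected ones).
from transformers import AutoConfig

config = AutoConfig.from_pretrained("allenai/OLMo-2-0325-32B")
print(config.num_hidden_layers)        # layers: 64
print(config.hidden_size)              # hidden size: 5120
print(config.num_attention_heads)      # attention heads: 40
print(config.max_position_embeddings)  # context length: 4096
```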
@@ -86,13 +86,13 @@ branches = [b.name for b in out.branches]
 Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
 1. Fine-tune with the OLMo repository:
 ```bash
- torchrun --nproc_per_node=8 scripts/train.py {path_to_train_config} \
- --data.paths=[{path_to_data}/input_ids.npy] \
- --data.label_mask_paths=[{path_to_data}/label_mask.npy] \
- --load_path={path_to_checkpoint} \
- --reset_trainer_state
+ python src/scripts/train/OLMo2-32B.py train_single {training_name} --trainer.load_path="{/path/to/checkpoint}" --trainer.load_strategy=if_available
 ```
- For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?tab=readme-ov-file#fine-tuning).
+ Example:
+ ```bash
+ python src/scripts/train/OLMo2-32B.py train_single 32b_run_from_step_10000 --trainer.load_path="step10000" --trainer.load_strategy=if_available
+ ```
+ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo-core).
 
 2. Further fine-tuning support is being developed in AI2's Open Instruct repository. Details are [here](https://github.com/allenai/open-instruct).
 
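The hunk context above (`branches = [b.name for b in out.branches]`) is the README's recipe for listing those intermediate checkpoints as repository revisions. A minimal sketch, not part of the commit, assuming the 32B base repo id; any branch name it returns can be passed as `revision=` when downloading or loading a checkpoint:

```python
# List the revision branches of the model repo; besides "main", these correspond
# to the intermediate training checkpoints mentioned above.
from huggingface_hub import list_repo_refs

out = list_repo_refs("allenai/OLMo-2-0325-32B")
branches = [b.name for b in out.branches]
print(len(branches), branches[:5])
```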
@@ -110,7 +110,7 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?
 
 - **Project Page:** https://allenai.org/olmo
 - **Repositories:**
-  - Core repo (training, inference, fine-tuning etc.): https://github.com/allenai/OLMo
+  - Core repo (training, inference, fine-tuning etc.): https://github.com/allenai/OLMo-core
   - Evaluation code: https://github.com/allenai/OLMo-Eval
   - Further fine-tuning code: https://github.com/allenai/open-instruct
 - **Paper:** https://arxiv.org/abs/2501.00656
@@ -119,7 +119,7 @@ For more documentation, see the [GitHub readme](https://github.com/allenai/OLMo?
 
 
 ## Evaluation
- Core model results for OLMo 2 7B and 13B models are found below.
+ Core model results for OLMo 2 32B are found below.
 
 | Model | Train FLOPs | Average | ARC/C | HSwag | WinoG | MMLU | DROP | NQ | AGIEval | GSM8k | MMLUPro | TriviaQA |
 |-------------------|------------|---------|--------|--------|--------|-------|-------|-----|----------|--------|-----------|-----------|
@@ -146,29 +146,27 @@ Core model results for OLMo 2 7B and 13B models are found below.
 ## Model Details
 
 ### Pretraining
- | | **OLMo 2 7B** | **OLMo 2 13B** |
- |-------------------|------------|------------|
- | Pretraining Stage 1<br>([OLMo-Mix-1124](https://huggingface.co/datasets/allenai/olmo-mix-1124)) | 4 trillion tokens<br>(1 epoch) | 5 trillion tokens<br>(1.2 epochs) |
- | Pretraining Stage 2<br>([Dolmino-Mix-1124](https://huggingface.co/datasets/allenai/dolmino-mix-1124)) | 50B tokens (3 runs)<br>*merged* | 100B tokens (3 runs)<br>300B tokens (1 run)<br>*merged* |
- | Post-training<br>([Tulu 3 SFT OLMo mix](https://huggingface.co/datasets/allenai/tulu-3-sft-olmo-mixture)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-7b-preference-mix)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-13b-preference-mix)) |
+ | | **OLMo 2 32B** | **OLMo 2 13B** | **OLMo 2 7B** |
+ |-------------------|------------|------------|------------|
+ | Pretraining Stage 1 | 6 trillion tokens<br>(1 epoch) | 5 trillion tokens<br>(1.2 epochs) | 4 trillion tokens<br>(1 epoch) |
+ | Pretraining Stage 2 | 100B tokens (2 runs)<br>300B tokens (1 run)<br>*merged* | 100B tokens (3 runs)<br>300B tokens (1 run)<br>*merged* | 50B tokens (3 runs)<br>*merged* |
+ | Post-training | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-32b-pref-mix-v1)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-13b-preference-mix)) | SFT + DPO + PPO<br>([preference mix](https://huggingface.co/datasets/allenai/olmo-2-1124-7b-preference-mix)) |
 
 #### Stage 1: Initial Pretraining
- - Dataset: [OLMo-Mix-1124](https://huggingface.co/datasets/allenai/olmo-mix-1124) (3.9T tokens)
+ - Dataset: [OLMo-Mix-0325](https://huggingface.co/datasets/allenai/olmo-2-32b-pref-mix-v1) (3.9T tokens)
 - Coverage: 90%+ of total pretraining budget
- - 7B Model: ~1 epoch
- - 13B Model: 1.2 epochs (5T tokens)
+ - 32B Model: ~1 epoch
 
 #### Stage 2: Fine-tuning
- - Dataset: [Dolmino-Mix-1124](https://huggingface.co/datasets/allenai/dolmino-mix-1124) (843B tokens)
+ - Dataset: [Dolmino-Mix-0325](https://huggingface.co/datasets/allenai/dolmino-mix-1124) (843B tokens)
 - Three training mixes:
- - 50B tokens
+ - 100B tokens
   - 100B tokens
   - 300B tokens
 - Mix composition: 50% high-quality data + academic/Q&A/instruction/math content
 
 #### Model Merging
- - 7B Model: 3 versions trained on 50B mix, merged via model souping
- - 13B Model: 3 versions on 100B mix + 1 version on 300B mix, merged for final checkpoint
+ - 32B Model: 2 versions on 100B mix + 1 version on 300B mix, merged for final checkpoint
 
 
 ## Bias, Risks, and Limitations
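The Model Merging subsection in the hunk above describes souping several fine-tuned runs into one final checkpoint. A minimal sketch of that kind of merge, not part of the commit, with hypothetical file names and uniform weight averaging assumed:

```python
# Uniform "model soup": element-wise average of matching parameters from
# several checkpoints saved as PyTorch state dicts (file names are hypothetical).
import torch

checkpoint_paths = ["ingredient_1.pt", "ingredient_2.pt", "ingredient_3.pt"]
states = [torch.load(path, map_location="cpu") for path in checkpoint_paths]

# Assumes all checkpoints share the same parameter names and shapes.
souped = {
    name: torch.stack([state[name].float() for state in states]).mean(dim=0)
    for name in states[0]
}
torch.save(souped, "souped_model.pt")
```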
 