Update README.md
README.md
CHANGED
@@ -26,11 +26,11 @@ The core models released in this batch include the following:
 
 | **Stage** | **OLMo 2 32B** | **OLMo 2 13B** | **OLMo 2 7B** |
 |----------------------|----------------|----------------|---------------|
-| **Base Model** | [allenai/OLMo-2-0325-32B](https://huggingface.co/allenai/OLMo-2-…
-| **SFT** | [allenai/OLMo-2-0325-32B-SFT](https://huggingface.co/allenai/OLMo-2-…
-| **DPO** | [allenai/OLMo-2-0325-32B-DPO](https://huggingface.co/allenai/OLMo-2-…
-| **Final Models (RLVR)** | [allenai/OLMo-2-0325-32B-Instruct](https://huggingface.co/allenai/OLMo-2-…
-| **Reward Model (RM)** |…
+| **Base Model** | [allenai/OLMo-2-0325-32B](https://huggingface.co/allenai/OLMo-2-0325-32B) | [allenai/OLMo-2-1124-13B](https://huggingface.co/allenai/OLMo-2-1124-13B) | [allenai/OLMo-2-1124-7B](https://huggingface.co/allenai/OLMo-2-1124-7B) |
+| **SFT** | [allenai/OLMo-2-0325-32B-SFT](https://huggingface.co/allenai/OLMo-2-0325-32B-SFT) | [allenai/OLMo-2-1124-13B-SFT](https://huggingface.co/allenai/OLMo-2-1124-13B-SFT) | [allenai/OLMo-2-1124-7B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT) |
+| **DPO** | [allenai/OLMo-2-0325-32B-DPO](https://huggingface.co/allenai/OLMo-2-0325-32B-DPO) | [allenai/OLMo-2-1124-13B-DPO](https://huggingface.co/allenai/OLMo-2-1124-13B-DPO) | [allenai/OLMo-2-1124-7B-DPO](https://huggingface.co/allenai/OLMo-2-1124-7B-DPO) |
+| **Final Models (RLVR)** | [allenai/OLMo-2-0325-32B-Instruct](https://huggingface.co/allenai/OLMo-2-0325-32B-Instruct) | [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) | [allenai/OLMo-2-1124-7B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct) |
+| **Reward Model (RM)** | | (Same as 7B) | [allenai/OLMo-2-1124-7B-RM](https://huggingface.co/allenai/OLMo-2-1124-7B-RM) |
 
 ## Installation
 
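As a quick companion to the table above, a minimal sketch of loading one of the linked checkpoints with Hugging Face `transformers` (assuming a release with OLMo 2 support, roughly 4.47+, and `accelerate` installed for `device_map="auto"`; the prompt is just a placeholder):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Any repo name from the table above works here; the 32B Instruct model
# needs substantial GPU memory, so pick a size that fits your hardware.
model_name = "allenai/OLMo-2-0325-32B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto")

inputs = tokenizer("Language modeling is ", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```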
@@ -81,6 +81,7 @@ from huggingface_hub import list_repo_refs
 
 out = list_repo_refs("allenai/OLMo-2-0325-32B")
 branches = [b.name for b in out.branches]
 ```
+Note: vLLM for OLMo 2 32B does not correctly handle attention when the number of query heads differs from the number of KV heads (i.e., when using Grouped-Query Attention (GQA) or Multi-Query Attention (MQA) instead of Multi-Head Attention (MHA)). Specifically, it splits QKV into equal chunks rather than by the actual sizes of Q, K, and V. vLLM has not yet released a version with the fix ([PR](https://github.com/vllm-project/vllm/pull/13687)).
 
 ### Fine-tuning
 Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or from many intermediate checkpoints. Two recipes for tuning are available.
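Since tuning can start from an intermediate checkpoint, a natural follow-on to the branch-listing snippet above is passing one of the returned branch names as the `revision` argument; a minimal sketch, assuming a `transformers` version with OLMo 2 support (the choice of `branches[0]` is arbitrary):

```python
from huggingface_hub import list_repo_refs
from transformers import AutoModelForCausalLM

# List the revision branches of the repo; each branch is a checkpoint.
out = list_repo_refs("allenai/OLMo-2-0325-32B")
branches = [b.name for b in out.branches]
print(branches)

# "main" is the final checkpoint; other branches hold intermediate ones.
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-0325-32B",
    revision=branches[0],  # arbitrary pick; substitute the checkpoint you want
)
```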
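To make the QKV-splitting point in the note above concrete, here is a small illustrative PyTorch sketch (not vLLM's actual code; the head counts are a hypothetical GQA configuration): under GQA the fused QKV output must be split by the true widths of Q, K, and V, and an equal three-way chunk is only valid for MHA, where the head counts match.

```python
import torch

# Hypothetical GQA configuration: K and V have fewer heads than Q.
num_heads, num_kv_heads, head_dim = 40, 8, 128
q_size = num_heads * head_dim        # 5120
kv_size = num_kv_heads * head_dim    # 1024

# Output of a fused QKV projection for one token.
qkv = torch.randn(1, q_size + 2 * kv_size)

# Correct: split by the actual sizes of Q, K, and V.
q, k, v = qkv.split([q_size, kv_size, kv_size], dim=-1)

# Buggy behavior described in the note: equal chunks, which only line up
# with the real Q/K/V boundaries when num_heads == num_kv_heads (MHA).
q_bad, k_bad, v_bad = qkv.chunk(3, dim=-1)
assert q.shape != q_bad.shape  # the two strategies disagree under GQA
```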