amanrangapur committed
Commit 39ac4d3 · verified · 1 Parent(s): 8d2d36b

Update README.md

Files changed (1)
  1. README.md +6 -5
README.md CHANGED
@@ -26,11 +26,11 @@ The core models released in this batch include the following:
 
 | **Stage** | **OLMo 2 32B** | **OLMo 2 13B** | **OLMo 2 7B**
 |----------------------|----------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------|----------------------------------------------------------------------------------------------------------|
-| **Base Model** | [allenai/OLMo-2-0325-32B](https://huggingface.co/allenai/OLMo-2-1124-7B) | [allenai/OLMo-2-1124-13B](https://huggingface.co/allenai/OLMo-2-1124-13B) | [allenai/OLMo-2-1124-7B](https://huggingface.co/allenai/OLMo-2-1124-7B) |
-| **SFT** | [allenai/OLMo-2-0325-32B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT) | [allenai/OLMo-2-1124-13B-SFT](https://huggingface.co/allenai/OLMo-2-1124-13B-SFT) | [allenai/OLMo-2-1124-7B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT) |
-| **DPO** | [allenai/OLMo-2-0325-32B-DPO](https://huggingface.co/allenai/OLMo-2-1124-7B-DPO) | [allenai/OLMo-2-1124-13B-DPO](https://huggingface.co/allenai/OLMo-2-1124-13B-DPO) | [allenai/OLMo-2-1124-7B-DPO](https://huggingface.co/allenai/OLMo-2-1124-7B-DPO) |
-| **Final Models (RLVR)** | [allenai/OLMo-2-0325-32B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct) | [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) | [allenai/OLMo-2-1124-7B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct) |
-| **Reward Model (RM)**| [allenai/OLMo-2-0325-32B-RM](https://huggingface.co/allenai/OLMo-2-1124-7B-RM) | (Same as 7B) | [allenai/OLMo-2-1124-7B-RM](https://huggingface.co/allenai/OLMo-2-1124-7B-RM) |
+| **Base Model** | [allenai/OLMo-2-0325-32B](https://huggingface.co/allenai/OLMo-2-0325-32B) | [allenai/OLMo-2-1124-13B](https://huggingface.co/allenai/OLMo-2-1124-13B) | [allenai/OLMo-2-1124-7B](https://huggingface.co/allenai/OLMo-2-1124-7B) |
+| **SFT** | [allenai/OLMo-2-0325-32B-SFT](https://huggingface.co/allenai/OLMo-2-0325-32B-SFT) | [allenai/OLMo-2-1124-13B-SFT](https://huggingface.co/allenai/OLMo-2-1124-13B-SFT) | [allenai/OLMo-2-1124-7B-SFT](https://huggingface.co/allenai/OLMo-2-1124-7B-SFT) |
+| **DPO** | [allenai/OLMo-2-0325-32B-DPO](https://huggingface.co/allenai/OLMo-2-0325-32B-DPO) | [allenai/OLMo-2-1124-13B-DPO](https://huggingface.co/allenai/OLMo-2-1124-13B-DPO) | [allenai/OLMo-2-1124-7B-DPO](https://huggingface.co/allenai/OLMo-2-1124-7B-DPO) |
+| **Final Models (RLVR)** | [allenai/OLMo-2-0325-32B-Instruct](https://huggingface.co/allenai/OLMo-2-0325-32B-Instruct) | [allenai/OLMo-2-1124-13B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-13B-Instruct) | [allenai/OLMo-2-1124-7B-Instruct](https://huggingface.co/allenai/OLMo-2-1124-7B-Instruct) |
+| **Reward Model (RM)**| | (Same as 7B) | [allenai/OLMo-2-1124-7B-RM](https://huggingface.co/allenai/OLMo-2-1124-7B-RM) |
 
 ## Installation
 
@@ -81,6 +81,7 @@ from huggingface_hub import list_repo_refs
 out = list_repo_refs("allenai/OLMo-2-0325-32B")
 branches = [b.name for b in out.branches]
 ```
+Note: vLLM for OLMo2 32B does not correctly handle attention when the number of heads differs from the number of KV heads (i.e., when using Grouped-Query Attention (GQA) or Multi-Query Attention (MQA) instead of Multi-Head Attention (MHA)). Specifically, it incorrectly splits QKV into equal chunks rather than based on the actual sizes of Q, K, and V. vLLM has not yet released a version with the fix ([PR](https://github.com/vllm-project/vllm/pull/13687)).
 
 ### Fine-tuning
 Model fine-tuning can be done from the final checkpoint (the `main` revision of this model) or many intermediate checkpoints. Two recipes for tuning are available.
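
For context on the `list_repo_refs` snippet in the second hunk: once the branch names are listed, any of them can be passed as the `revision` argument to load that intermediate checkpoint. A minimal sketch, assuming the Transformers library; the branch name `stage1-step10000-tokens84B` is illustrative, not a verified branch of this repo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load a specific intermediate checkpoint by branch name.
# The branch name below is hypothetical; pick one from `branches` above.
model = AutoModelForCausalLM.from_pretrained(
    "allenai/OLMo-2-0325-32B",
    revision="stage1-step10000-tokens84B",  # hypothetical branch name
)
tokenizer = AutoTokenizer.from_pretrained("allenai/OLMo-2-0325-32B")
```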
 
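To make the QKV-splitting bug in the added note concrete, here is a minimal sketch in PyTorch. The head counts are illustrative placeholders, not OLMo 2 32B's published configuration; the point is only that with GQA the fused QKV output must be split by the actual widths of Q, K, and V, not into equal chunks:

```python
import torch

# Illustrative GQA shapes (placeholders, not OLMo 2 32B's actual config).
num_heads, num_kv_heads, head_dim = 40, 8, 128
q_size = num_heads * head_dim      # 5120
kv_size = num_kv_heads * head_dim  # 1024

# Output of a fused QKV projection for a single token.
qkv = torch.randn(q_size + 2 * kv_size)  # width 5120 + 2*1024 = 7168

# Correct: split by the actual sizes of Q, K, and V.
q, k, v = qkv.split([q_size, kv_size, kv_size])
print(q.shape, k.shape, v.shape)  # (5120,), (1024,), (1024,)

# Buggy behavior described in the note: near-equal chunks,
# which puts the Q/K/V boundaries in the wrong places.
q_bad, k_bad, v_bad = qkv.chunk(3)
print(q_bad.shape, k_bad.shape, v_bad.shape)  # (2390,), (2390,), (2388,)
```

With MHA (`num_heads == num_kv_heads`), the three widths happen to be equal, which is why the equal-chunk split only breaks GQA/MQA models.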