Commit 2f95140 (verified) by JingzeShi
1 Parent(s): ea6fe65

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -71,7 +71,7 @@ We build the Doge-Instruct by first SFT on [SmolTalk](https://huggingface.co/dat
  | Model | Training Data | Epochs | Content Length | LR | Batch Size | Precision |
  |---|---|---|---|---|---|---|
  | [Doge-20M-Instruct-SFT](https://huggingface.co/SmallDoge/Doge-20M-Instruct-SFT) | [HuggingFaceTB/smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smoltalk) | 2 | 2048 | 8e-4 | 0.25M | bfloat16 |
- | [Doge-60M-Instruct](https://huggingface.co/SmallDoge/Doge-60M-Instruct) | [HuggingFaceTB/smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smoltalk) | 2 | 2048 | 6e-4 | 0.25M | bfloat16 |
+ | [Doge-60M-Instruct-SFT](https://huggingface.co/SmallDoge/Doge-60M-Instruct-SFT) | [HuggingFaceTB/smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smoltalk) | 2 | 2048 | 6e-4 | 0.25M | bfloat16 |
 
  **DPO**:
  | Model | Training Data | Epochs | Content Length | LR | Batch Size | Precision |