JingzeShi committed on
Commit
533655e
·
verified ·
1 Parent(s): 0fce2c3

Update README.md

Files changed (1)
  1. README.md +3 -3
README.md CHANGED
@@ -21,9 +21,9 @@ Doge uses `wsd_scheduler` as the training scheduler, which divides the learning
 
  Here are the initial learning rates required to continue training at each checkpoint:
 
- - **[Doge-20M](https://huggingface.co/JingzeShi/Doge-20M-checkpoint)**: 8e-3
- - **[Doge-60M](https://huggingface.co/JingzeShi/Doge-60M-checkpoint)**: 6e-3
- - **[Doge-160M](https://huggingface.co/JingzeShi/Doge-160M-checkpoint)**: 4e-3
+ - **[Doge-20M](https://huggingface.co/SmallDoge/Doge-20M-checkpoint)**: 8e-3
+ - **[Doge-60M](https://huggingface.co/SmallDoge/Doge-60M-checkpoint)**: 6e-3
+ - **[Doge-160M](https://huggingface.co/SmallDoge/Doge-160M-checkpoint)**: 4e-3
  - **Doge-320M**: 2e-3
 
  | Model | Learning Rate | Schedule | Warmup Steps | Stable Steps |