weqweasdas committed
Commit 12658a6 · verified · 1 parent: af68c94

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -37,7 +37,7 @@ The total number of the comparison pairs is 250K, where we perform the following
  ### Training

- We train the model for one epoch with a learning rate of 5e-6, batch size 256, cosine learning rate decay with a warmup ratio 0.03. You can see my training script here: https://github.com/WeiXiongUST/RAFT-Reward-Ranked-Finetuning/blob/main/reward_modeling.py
+ We train the model for one epoch with a learning rate of 5e-6, batch size 256, cosine learning rate decay with a warmup ratio 0.03. You can see my training script here: https://github.com/WeiXiongUST/RAFT-Reward-Ranked-Finetuning/blob/main/reward_modeling.py , which is modified from the TRL package.
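
To make the hyperparameters in the changed line concrete, here is a minimal sketch of what a cosine schedule with a 0.03 warmup ratio could look like over one epoch. It is not taken from the linked training script. It assumes linear warmup from zero, cosine decay to zero, a warmup step count rounded up, and that 256 is the effective global batch size over the roughly 250K comparison pairs mentioned in the hunk header. The helper name `lr_at_step` is hypothetical.

```python
import math


def lr_at_step(step, total_steps, peak_lr=5e-6, warmup_ratio=0.03):
    """Learning rate at a 0-indexed optimizer step.

    Assumed shape: linear warmup from 0 to peak_lr, then cosine decay to 0.
    """
    warmup_steps = math.ceil(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Linear warmup phase.
        return peak_lr * step / max(1, warmup_steps)
    # Cosine decay phase: progress runs from 0 to 1 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


# Assumptions: ~250K comparison pairs, one epoch, effective batch size 256.
num_pairs = 250_000
batch_size = 256
total_steps = math.ceil(num_pairs / batch_size)
warmup_steps = math.ceil(total_steps * 0.03)

if __name__ == "__main__":
    print("optimizer steps in one epoch:", total_steps)
    print("warmup steps:", warmup_steps)
    for s in (0, warmup_steps // 2, warmup_steps, total_steps // 2, total_steps):
        print(s, lr_at_step(s, total_steps))
```

Under these assumptions, one epoch is just under a thousand optimizer steps, so the warmup covers only the first few dozen of them and the rest of training follows the cosine decay.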