hanyinwang/layer-project-reward-model

Generated from Trainer

hanyinwang committed on May 3, 2024

Commit d5a3dcf · verified · 1 Parent(s): c79a1aa

Update README.md

Files changed (1)

README.md +3 -4

@@ -29,17 +29,16 @@ It achieves the following results on the evaluation set:
 ## Model description
-More information needed
+The model is fine-tuned as a reward function for RLHF finetuning.
 ## Intended uses & limitations
-More information needed
+The model is trained on very limited data.
 ## Training and evaluation data
-More information needed
+hanyinwang/layer-project-reward-training
-## Training procedure
 ### Training hyperparameters