zhezi12138/llama-3b-iter-1


zhezi12138 committed on Feb 2

Commit d72fcbb · verified · 1 Parent(s): 0c024c7

Update README.md

Files changed (1)

README.md +1 -1


@@ -7,4 +7,4 @@ language:
 base_model:
 - openlm-research/open_llama_3b_v2
 ---
-This model is for the reproduction of results on Iterative-Prompt dataset of paper "The crucial role of samplers in online direct preference optimization". You can download it and save it as "models/rlhflow_iter1", and then start the training pipeline. Since we've retrained the models, the results may slightly differ from that reported in the paper, and we will update it later.

+This model is for the reproduction of results on Iterative-Prompt dataset of paper "The crucial role of samplers in online direct preference optimization". You can download it and save it as "models/rlhflow_iter1", and then start the training pipeline.
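
The download step described in the README could be sketched as follows. This is a minimal sketch, not the authors' script: it assumes the `huggingface_hub` package is installed, that the repository id is `zhezi12138/llama-3b-iter-1` (taken from the page header), and that the target directory is the `models/rlhflow_iter1` path named above. The helper names `download_kwargs` and `download_checkpoint` are hypothetical.

```python
from pathlib import Path

# Repository id as shown in the page header (assumed to be the full Hub id).
REPO_ID = "zhezi12138/llama-3b-iter-1"
# Target directory named in the README.
LOCAL_DIR = "models/rlhflow_iter1"


def download_kwargs(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Build the keyword arguments passed to snapshot_download (hypothetical helper)."""
    return {"repo_id": repo_id, "local_dir": Path(local_dir).as_posix()}


def download_checkpoint(repo_id=REPO_ID, local_dir=LOCAL_DIR):
    """Download the full model snapshot into local_dir and return its path."""
    # Imported lazily so this module loads even without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    Path(local_dir).mkdir(parents=True, exist_ok=True)
    return snapshot_download(**download_kwargs(repo_id, local_dir))
```

Calling `download_checkpoint()` from the root of the training repository should leave the checkpoint under `models/rlhflow_iter1`, after which the training pipeline can be started as the README describes.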