RLHFlow/ArmoRM-Llama3-8B-v0.1
Text Classification • Updated • 50.5k • 167
Reward models trained with the RLHFlow codebase (https://github.com/RLHFlow/RLHF-Reward-Modeling/)
Note: Bradley-Terry reward model trained with the RLHFlow codebase
Note: Tech report covering the pairwise preference model
Note: Tech report for ArmoRM