[Request] Release of Reward Model
#7 opened by pchiang
Would the team consider releasing the reward model in addition to the trained model? The reward model would be very useful for evaluating generation quality, and it would also make it easier for others to reproduce the RLHF training.