May I ask how this differs from the 10ep version?

#2
by bdsqlsz - opened

Thank you for your great work.
SPO works very well, but I just discovered this model. Is this the version trained for more epochs?
If possible, please add a LoRA version.

I rechecked GitHub, and it turns out that this is the preference model used for training.

bdsqlsz changed discussion status to closed