Reward Model GPT2

This is GPT2 fine-tuned into a reward model.

The model is designed to score human-like responses to questions from Stack Exchange domains such as programming, mathematics, and physics.

For the training code, see the GitHub example.
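Reward models of this kind are typically trained on (chosen, rejected) answer pairs with a pairwise ranking loss. A minimal sketch, assuming the common Bradley-Terry formulation (the exact objective is not stated in this card); note that an untrained model scoring both answers equally yields a loss of log 2 ≈ 0.693, slightly above the train_loss reported below:

```python
import math

def pairwise_reward_loss(r_chosen, r_rejected):
    # Bradley-Terry pairwise loss: -log(sigmoid(r_chosen - r_rejected)).
    # r_chosen and r_rejected are scalar reward scores for the preferred
    # and rejected answers to the same question.
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# At r_chosen == r_rejected the loss is log(2) ~ 0.693,
# the chance-level baseline for an untrained scorer.
print(pairwise_reward_loss(0.0, 0.0))
```

The loss decreases as the model assigns the chosen answer a higher score than the rejected one, so values below log 2 indicate the model has learned a useful ranking.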

Training info:

  • epoch: 1.0
  • train_loss: 0.641692199903866
  • eval_loss: 0.6299035549163818
  • eval_accuracy: 0.729
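The eval_accuracy above is most naturally read as pairwise ranking accuracy: the fraction of evaluation pairs where the chosen answer receives a higher reward score than the rejected one. A small illustrative sketch (the scores below are hypothetical, not from this model):

```python
def pairwise_accuracy(score_pairs):
    # Fraction of (chosen, rejected) reward-score pairs where the
    # chosen answer outranks the rejected one.
    correct = sum(1 for chosen, rejected in score_pairs if chosen > rejected)
    return correct / len(score_pairs)

# Hypothetical (chosen, rejected) scores for four evaluation pairs.
pairs = [(1.2, 0.3), (0.1, 0.9), (2.0, -0.5), (0.4, 0.2)]
print(pairwise_accuracy(pairs))  # → 0.75
```

Under this reading, an eval_accuracy of 0.729 means the model ranks the preferred answer higher in roughly 73% of held-out pairs.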