This is the official checkpoint of the feedback model trained with COFFEE-GYM using the PPO strategy.

Given a piece of erroneous code, this model generates natural language feedback on it.
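As a rough illustration of how the checkpoint might be queried, here is a minimal sketch using the `transformers` library. The repository ID is the one this card belongs to. The prompt layout in `build_prompt`, the helper names, and the generation settings are assumptions made for illustration only; the format used during training may differ, so check the paper or project page before relying on it.

```python
MODEL_ID = "Team-Coffee-Gym/DS-Coder-7B-PPO-CoffeeEval"


def build_prompt(problem: str, wrong_code: str) -> str:
    """Assemble a plain-text prompt.

    NOTE: this layout is a hypothetical placeholder, not the documented
    prompt format of the model.
    """
    return (
        "Problem description:\n"
        f"{problem.strip()}\n\n"
        "Incorrect code:\n"
        f"{wrong_code.strip()}\n\n"
        "Feedback:"
    )


def generate_feedback(problem: str, wrong_code: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and return generated feedback text.

    Heavy imports are done lazily so that build_prompt can be used
    without torch or transformers installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    inputs = tokenizer(build_prompt(problem, wrong_code), return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,  # assumed limit, tune as needed
            do_sample=False,                # greedy decoding for repeatability
        )

    # Keep only the tokens produced after the prompt.
    prompt_len = inputs["input_ids"].shape[1]
    return tokenizer.decode(output_ids[0][prompt_len:], skip_special_tokens=True).strip()


if __name__ == "__main__":
    # Downloads the full 6.74B-parameter checkpoint on first run.
    print(generate_feedback("Print the sum of two integers a and b.", "print(a - b)"))
```

The weights are stored in F32, so loading them this way needs roughly 27 GB of memory for the parameters alone; a reduced-precision or quantized load is a common alternative when that is too much.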

For further details, please see our paper.

https://huggingface.co/spaces/Coffee-Gym/Project-Coffee-Gym

Model size: 6.74B params (Safetensors, tensor type F32)