Deepseek-Qwen2.5-7B-Redistil-GRPO-cp-800 / model-00004-of-00004.safetensors

Commit History

Trained with Unsloth
b657404
verified

jan-hq committed on