coolcui/DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Text Generation
Transformers
Safetensors
open-r1/OpenR1-Math-220k
qwen2
Generated from Trainer
open-r1
trl
grpo
conversational
text-generation-inference
Inference Endpoints
arxiv:2402.03300
main
DeepSeek-R1-Distill-Qwen-1.5B-GRPO
Commit History
End of training
bcd9dc8
verified
coolcui
committed on
11 days ago
Model save
fe2cc2c
verified
coolcui
committed on
11 days ago
Training in progress, epoch 0
aa21fb2
verified
coolcui
committed on
11 days ago
End of training
e8feb86
verified
coolcui
committed on
14 days ago
Model save
55873ef
verified
coolcui
committed on
14 days ago
Training in progress, epoch 0
b06e789
verified
coolcui
committed on
14 days ago
End of training
716272e
verified
coolcui
committed on
20 days ago
Model save
f8c79e5
verified
coolcui
committed on
20 days ago
Training in progress, epoch 0
d773cfa
verified
coolcui
committed on
20 days ago
initial commit
4debf27
verified
coolcui
committed on
24 days ago