sudocoder/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
AI-MO/NuminaMath-TIR
Generated from Trainer
trl
grpo
arxiv:2402.03300
Commit History
Training in progress, step 70 (b47ffaf, verified), sudocoder committed on Mar 21
Training in progress, step 60 (467d6a7, verified), sudocoder committed on Mar 21
Training in progress, step 50 (a5238fc, verified), sudocoder committed on Mar 21
Training in progress, step 40 (ece3091, verified), sudocoder committed on Mar 21
Training in progress, step 30 (1f4164b, verified), sudocoder committed on Mar 21
Training in progress, step 20 (67f224c, verified), sudocoder committed on Mar 21
Training in progress, step 10 (0d4a4d8, verified), sudocoder committed on Mar 21
initial commit (018f4ac, verified), sudocoder committed on Mar 21