alexchen4ai/Qwen2-0.5B-GRPO-test
Transformers
TensorBoard
Safetensors
AI-MO/NuminaMath-TIR
Generated from Trainer
trl
grpo
Inference Endpoints
arxiv:2402.03300
Commit History (branch: main)
End of training (7627479, verified), alexchen4ai committed 10 days ago
Model save (92788ac, verified), alexchen4ai committed 10 days ago
Training in progress, step 113 (081a69f, verified), alexchen4ai committed 10 days ago
Training in progress, step 110 (48ff20f, verified), alexchen4ai committed 10 days ago
Training in progress, step 100 (1260d4b, verified), alexchen4ai committed 10 days ago
Training in progress, step 90 (86ca063, verified), alexchen4ai committed 10 days ago
Training in progress, step 80 (86638ae, verified), alexchen4ai committed 10 days ago
Training in progress, step 70 (df70aa4, verified), alexchen4ai committed 10 days ago
Training in progress, step 60 (256bc40, verified), alexchen4ai committed 10 days ago
Training in progress, step 50 (11113ef, verified), alexchen4ai committed 10 days ago
Training in progress, step 40 (78bfd91, verified), alexchen4ai committed 10 days ago
Training in progress, step 30 (17b4b86, verified), alexchen4ai committed 10 days ago
Training in progress, step 20 (1d6223a, verified), alexchen4ai committed 10 days ago
Training in progress, step 10 (6eca66a, verified), alexchen4ai committed 10 days ago
initial commit (8afb8fa, verified), alexchen4ai committed 10 days ago