Models: 720

Active filters: arxiv: 2402.03300

hyunw3/qwen-2.5-0.5b-r1-countdown

Text Generation • Updated Feb 1 • 39

hyunw3/qwen-2.5-0.5b-r1-countdown_lr1.0e-6

Text Generation • Updated Feb 1 • 8

mgaimm/qwen-2.5-3b-r1-countdown

Text Generation • Updated Feb 1 • 25

tuyentx/qwen-2.5-3b-r1-countdown

Text Generation • Updated Feb 2 • 6

pablo-chocobar/qwen-2.5-3b-r1-countdown

Text Generation • Updated Feb 3 • 9

Julian-Sheeper/Qwen2.5-1.5B-Open-R1-GRPO

Text Generation • Updated Feb 2 • 6

pullpull/qwen-2.5-3b-r1-countdown

Text Generation • Updated Feb 2 • 6

justinj92/Qwen2.5-1.5B-Thinking

Text Generation • Updated Feb 4 • 127 • 4

spinech/qwen2.5-3b-r1-arc-train

Text Generation • Updated Feb 3 • 52

howardzhou/Qwen2.5-3B-Open-R1-GRPO

Text Generation • Updated Feb 5 • 21

jainamit/qwen-2.5-3b-r1-countdown

Text Generation • Updated Feb 6 • 60

GitBag/Qwen2.5-1.5B-Open-R1-GRPO

Text Generation • Updated Feb 4 • 9

Dongwei/Qwen-2.5-7B

Text Generation • Updated Feb 3 • 9

spinech/qwen2.5-3b-r1-arc-train-synthetic

Text Generation • Updated Feb 4 • 19

laolaorkk/Qwen2.5-1.5B-R1-GRPO-debug

Text Generation • Updated Feb 6 • 33

Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math

Text Generation • Updated Feb 4 • 24

Dongwei/Qwen-2.5-7B_Math

Text Generation • Updated Feb 4 • 12

Dongwei/Qwen2.5-1.5B-Open-R1-GRPO_Math

Text Generation • Updated Feb 3 • 132

Dongwei/DeepSeek-R1-Distill-Qwen-1.5B-GRPO_Math

Text Generation • Updated Feb 3 • 82

skzxjus/Qwen2.5-7B-Open-R1-GRPO

Text Generation • Updated Feb 8 • 58

AndreasX1206/Qwen2-0.5B-countdown

Text Generation • Updated Feb 4 • 11

alicogniai/Qwen2.5-1.5B-Open-R1-GRPO

Text Generation • Updated 28 days ago • 5

ununtrium/Qwen2.5-1.5B-Open-R1-GRPO

Text Generation • Updated Feb 11 • 10

yuta0x89/llmjp13b-numinacot-epoch2-GRPO

Text Generation • Updated Feb 11 • 83

yeshsurya/Qwen2.5-7B-Math-with_50stepGRPO

Text Generation • Updated Feb 12 • 8

hyunw3/qwen-2.5-0.5b-r1-countdown_lr5e-6

Text Generation • Updated Feb 5 • 233

AlistairPullen/Llama-3.1-8b-Instruct-GRPO-fine-tuned-lora

Dongwei/DeepSeek-R1-Distill-Qwen-7B-GRPO_Math_lowlr

Text Generation • Updated Feb 4 • 16

Dongwei/Qwen-2.5-7B_Math_smalllr

Text Generation • Updated Feb 4 • 15

Dongwei/Qwen2.5-1.5B-Open-R1-GRPO_Math_smalllr

Text Generation • Updated Feb 4 • 84