ajagota71/pythia-160m-fb-detox-checkpoint-epoch-80 • Reinforcement Learning • 0.2B • Updated May 16 • 11
mradermacher/VeriReason-Qwen2.5-7b-SFT-Reasoning-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 72 • 1
mradermacher/VeriReason-Qwen2.5-1.5B-grpo-small-GGUF • Reinforcement Learning • 2B • Updated Jul 11 • 46 • 1
mradermacher/VeriReason-Qwen2.5-3B-Verilog-RTL-GRPO-reasoning-tb-GGUF • Reinforcement Learning • 3B • Updated Jul 11 • 178
mradermacher/VeriReason-Qwen2.5-7b-SFT-Reasoning-i1-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 209 • 1
mradermacher/VeriReason-Qwen2.5-1.5b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF • Reinforcement Learning • 2B • Updated Jul 11 • 169
mradermacher/VeriReason-Qwen2.5-3b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF • Reinforcement Learning • 3B • Updated Jul 11 • 86
mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 114 • 1
mradermacher/VeriReason-Qwen2.5-7b-RTLCoder-Verilog-GRPO-reasoning-tb-i1-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 270 • 3
mradermacher/CscSQL-Grpo-XiYanSQL-QwenCoder-7B-2502-GGUF • Reinforcement Learning • 8B • Updated Jul 11 • 162 • 1