ValueFX9507/Tifa-Deepsex-14b-CoT Reinforcement Learning • Updated about 12 hours ago • 11k • 104
Prem Collection Finetunes and Quantizations of the Prem LLMs • 13 items • Updated 5 days ago • 2
Post 1802 GRPO reasoning embedded in a custom Prem-1B model ucalyptus/prem-663ff8769efa4d3700ba14e5 ucalyptus/prem-1B-grpo