GRPO reasoning embedded in a custom Prem-1B model
ucalyptus/prem-663ff8769efa4d3700ba14e5
ucalyptus/prem-1B-grpo
I realized that naively quantizing Prem-1B caused it to output gibberish on the WebGPU demo, lmao. Stay tuned for better models.
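A minimal sketch of why naive low-bit quantization can produce gibberish: round-to-nearest Q4 with a single per-tensor absmax scale lets one outlier weight inflate the scale, so typical weights all round to zero. This is an illustrative toy, not the actual Prem-1B quantization pipeline.

```python
import numpy as np

def quantize_q4_naive(w):
    # One absmax scale for the whole tensor; signed 4-bit range is -8..7.
    # A single outlier stretches the scale for every other weight.
    scale = np.abs(w).max() / 7.0
    q = np.clip(np.round(w / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.02, size=1024).astype(np.float32)  # typical small weights
w[0] = 4.0  # one outlier, common in LLM weight matrices

q, scale = quantize_q4_naive(w)
w_hat = dequantize(q, scale)

# The outlier forces scale ~= 0.57, so nearly all +-0.02 weights quantize to 0
# and the dequantized tensor loses almost all of its information.
print("zero fraction:", (q == 0).mean())
print("mean abs error:", np.abs(w - w_hat).mean())
```

Per-group scales (or keeping outlier channels in higher precision) is the usual fix, which is roughly what the better Q4 schemes do.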
Can you DM me on X?
Outstanding issues:
- Fix the Q4 demo: https://huggingface.co/spaces/ucalyptus/prem-1B-chat-webgpu/discussions/1#664b621d8742922b9e4f3de8
- Work on fp16 (see what onnxruntime-web has to say about this)
How do you obtain the .wasm file? I didn't find it here: https://cdn.jsdelivr.net/npm/@xenova/[email protected]/dist/
cc: @Xenova
ORPO-tuned Prem-1B chat model
Prem-2B-chat created using frankenmerge