Thank you!
QwQ is better with SkyT1-Flash.
Thanks for merging, sharing, and updating.
Hopefully that's the case. I've not had a chance to test it much yet, but it seems to work better now with the updated tokenizer. I might do a remerge that specifically excludes Qwen2.5-32B from the final model, as I suspect it's currently being included.
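In case it's useful, here's a minimal sketch of what such a remerge config might look like, assuming mergekit's YAML format and a simple linear (weighted-average) merge; the model names, method, and weights are illustrative assumptions, not the actual recipe. Excluding Qwen2.5-32B would just mean leaving it out of the `models` list entirely:

```yaml
# Illustrative mergekit config (assumed, not the real recipe):
# a plain weighted average of only the desired models, so
# Qwen2.5-32B contributes nothing to the final weights.
merge_method: linear
models:
  - model: Qwen/QwQ-32B-Preview
    parameters:
      weight: 0.5
  - model: NovaSky-AI/Sky-T1-32B-Flash
    parameters:
      weight: 0.5
dtype: bfloat16
```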
Your previous version with Flash worked well in Roo Code for architecting small projects, better than a couple of other QwQ merges I tried. Flash really helps.
Downloading and testing this version now.
Just noticed that your update fixed the Jinja bug. Thanks for doing that.
You might want to try this version, as it's working well. Annoyingly, I think QwQ-SkyT1-Flash would be a better model, but I can't get the Jinja template to work the way I want (see the template sketch after the link).
https://huggingface.co/sm54/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Q4_K_M-GGUF
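For anyone hitting the same template issue, here's a minimal ChatML-style Jinja chat template of the kind these Qwen-based merges typically expect. It's a generic sketch, not the template actually shipped with either model:

```jinja
{#- Minimal ChatML-style chat template (illustrative, assumed format) -#}
{%- for message in messages -%}
{{ '<|im_start|>' + message['role'] + '\n' + message['content'] + '<|im_end|>\n' }}
{%- endfor -%}
{%- if add_generation_prompt -%}
{{ '<|im_start|>assistant\n' }}
{%- endif -%}
```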
Thanks for the tip. I really like the 128k context in your R1 merges for reasoning models.
Downloading and testing now:
https://huggingface.co/sm54/FuseO1-DeepSeekR1-QwQ-SkyT1-Flash-32B-Q4_K_M-GGUF