What's the difference between this and https://huggingface.co/bartowski/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview-GGUF

by ksze - opened 2 days ago

ksze

2 days ago

Could you enlighten me as to why there are two versions?

Both are derived from the original https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview.
This one was made with a newer version of llama.cpp (b4546), while the other one used an older version (b4514). So what are the practical implications?

bartowski

Owner 1 day ago

https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-Qwen2.5-Coder-32B-Preview/discussions/1#67947fb94e725a560dbe7594

They updated the weights in place. To preserve the original, I uploaded this one as v0.1.

I could probably be a bit clearer on the model card!

ksze

1 day ago

@bartowski So it means the one without the v0.1 in the name has updated weights, which fixes the problem where it "struggles with long-chain reasoning and tends to provide immediate answers directly", correct?

bartowski

Owner about 13 hours ago

no sorry, v0.1 is the updated one, I meant to update the model card 🤦 i'll do it right now haha
