Bartowski! 0.0!!!! You are on double-secret probation for this jinja error!
Just joking, and thanks dude for all you do for the community, opening AI and this bizarro world up to non-rocket scientists. But I am getting a Jinja error in LM Studio with your Q5_K_L quant. Here is the error: "Failed to parse Jinja template: Parser Error: Expected closing statement token. OpenSquareBracket !== CloseStatement." Can you see what's up with this? I'm also gonna try IQ4_NL with my 4080 right now, since I still don't know wtf to pick or which is better, but thanks!!
Same issue here. I tried a few GGUFs, including the official one, but they all fail with that error in LM Studio (I had a similar issue with arcee-blitz).
You can either use the official lmstudio-community ones here: https://huggingface.co/lmstudio-community/QwQ-32B-GGUF
or you can apply the fix from here: https://github.com/lmstudio-ai/lmstudio-bug-tracker/issues/479#issuecomment-2701947624
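For the curious: I haven't traced the exact diff, but based on the error text the trip-up appears to be Python-style slice syntax (e.g. `messages[::-1]`) in the chat template, which reference Jinja2 accepts but the stricter parser LM Studio used at the time did not. A minimal sketch to check this outside LM Studio (the template fragment is illustrative, not the actual QwQ template):

```python
# Minimal sketch: Python-style slicing in a Jinja template, which the
# reference Jinja2 library accepts but a stricter parser can reject with
# "OpenSquareBracket !== CloseStatement". Fragment is illustrative only.
from jinja2 import Environment

fragment = "{%- for message in messages[::-1] %}{{ message.role }} {% endfor %}"

tmpl = Environment().from_string(fragment)
print(tmpl.render(messages=[{"role": "user"}, {"role": "assistant"}]))
# Prints "assistant user " -- valid Jinja2, so the failure is in the parser
# LM Studio used, not in the template being malformed Jinja.
```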
Thanks! Since you use imatrix and the LM Studio quants don't, I stick to yours, but my gratitude to the devs of that program is due as well. I hope you got the Animal House reference, and idk why your name always makes me think of it. It appears to be working now, except it made random tool calls in LM Studio. Not saying that's your fault, but for others who read this: just be careful. I will test it further and follow up, but my sincere gratitude for contributing the way you do! Also, did you ever figure out what was going on with your 3-bit quants?
since you use imatrix and the LM Studio quants don't, I stick to yours
Same here - and just curious: since you (Bartowski) are the official curator for their models, is there any reason they don't?
imatrix always seems to be a straight-up upgrade.
I agree, but there was circumstantial evidence (that I don't agree with) showing degradation in performance when imatrix was applied
The lack of imatrix also makes the results more "pure" and impossible to accuse of dataset biasing
Again, I still hold that imatrix is universally better, but there was enough to convince them to switch to static. Plus, static quants are way faster to make, so the lmstudio ones can be first out the door while I crunch the numbers for my own behind the scenes.
Are there many reasoning examples in the imatrix dataset? I'm wondering if the issue has to do with using non-reasoning datasets that might not work so well with a reasoning model. According to R1:
To create an imatrix dataset (importance matrix dataset) for quantizing a reasoning-focused LLM, focus on capturing diverse, high-quality examples of reasoning tasks. The goal is to help the quantization process identify and preserve critical weights/activations that enable logical, mathematical, and analytical reasoning. Here's a step-by-step guide:
- Key Requirements for the Dataset
  - Diverse Reasoning Tasks: Cover multiple types (logical, mathematical, causal, analytical, commonsense).
  - Complexity Levels: Include easy, medium, and hard examples.
  - Balanced Distribution: Ensure no single task type dominates.
  - High-Quality Explanations: Include step-by-step reasoning (chain-of-thought).
  - Domain Alignment: Match the model's intended use case (e.g., math, science, coding).
- Example Data Structure
  Each entry should include:
  - Input: A question/problem requiring reasoning.
  - Output: A detailed, structured answer (e.g., chain-of-thought).
  - Metadata: Task type, complexity, and domain.
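Concretely, here's a sketch of what I imagine R1 means. Everything here is hypothetical (the example entries, field names, and file name are made up), and since llama.cpp's llama-imatrix tool just reads a flat text file, the structure would mostly guide how you balance the mix:

```python
# Hypothetical sketch of flattening R1's proposed entry structure into the
# plain-text calibration file that llama.cpp's llama-imatrix tool consumes.
# Example entries, field names, and file name are made up for illustration.

entries = [
    {
        "input": "If all bloops are razzies and all razzies are lazzies, "
                 "are all bloops lazzies?",
        "output": "Step 1: All bloops are razzies. Step 2: All razzies are "
                  "lazzies. Step 3: By transitivity, all bloops are lazzies. "
                  "Answer: yes.",
        "metadata": {"task": "logical", "complexity": "easy", "domain": "general"},
    },
    # ... more entries covering math, causal, analytical, commonsense ...
]

with open("reasoning_calibration.txt", "w", encoding="utf-8") as f:
    for e in entries:
        # imatrix only ever sees raw tokens, so metadata can't be attached;
        # at best it guides how you balance the mix of entries.
        f.write(e["input"] + "\n" + e["output"] + "\n\n")

# The file would then be fed to llama.cpp along the lines of:
#   llama-imatrix -m model-f16.gguf -f reasoning_calibration.txt -o imatrix.dat
```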
This doesn't really apply to imatrix
The dataset is better off being diverse and random than targeted and specific.
Check this discussion posted today:
https://www.reddit.com/r/LocalLLaMA/comments/1j9ih6e/english_k_quantization_of_llms_does_not/
If the language of the imatrix file doesn't affect the output, then I think it's logical to conclude that structure is also irrelevant.
That said, the one thing that MAY be important is chunk size. Since the default imatrix chunk is 512 tokens, it's possible this adversely affects long context, and reasoning models obviously tend to have extremely long contexts. There's an argument to be made that a stronger beginning to the reasoning is more important than having the same quality throughout, but that's very debatable in both directions and difficult to quantify.
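To make the chunking concern concrete, here's a toy sketch (all numbers made up): the calibration text is evaluated in fixed-size windows, so no token ever contributes activation statistics at a context depth beyond the chunk size.

```python
# Toy illustration of the chunking concern (all numbers made up).
# imatrix evaluates the calibration text in fixed-size chunks, so a long
# reasoning trace is split into independent windows.

chunk_size = 512                     # llama.cpp's default imatrix chunk
trace_len = 16_000                   # a long chain-of-thought, in tokens
tokens = list(range(trace_len))      # stand-in for a tokenized trace

chunks = [tokens[i:i + chunk_size] for i in range(0, trace_len, chunk_size)]
print(len(chunks), "chunks")         # 32 chunks

# Deepest context position any token occupies during calibration:
print(max(len(c) for c in chunks) - 1)  # 511, no matter how long the trace
# Attention behavior at positions 512..15999 is never exercised, which is
# exactly the worry for long-context reasoning models.
```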
I'm hoping to do more experiments with chunk sizes, and especially with combining chunk sizes, to see if we can get better overall results, but that's for the future, ideally when multiple chunk size support is improved.