---
base_model:
- Pinkstack/Superthoughts-lite-v1
tags:
- text-generation-inference
- transformers
- unsloth
- llama
- gguf
- cot
- superthoughts
- reasoning
- grpo
license: apache-2.0
language:
- en
datasets:
- openai/gsm8k
- Pinkstack/intructions-sft-sharegpt
---

Demo: https://huggingface.co/spaces/Pinkstack/Chat-with-superthoughts-lite

![superthoughts lite](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/K5kYIHYj2aX2kB6MlcM9O.png)

# Information

Advanced, high-quality and **lite** reasoning in a tiny model that you can run on your phone.

At original quality, it runs at ~400 tokens/second on a single Nvidia H100 GPU from Friendli.

Trained similarly to DeepSeek R1: we used SmolLM2 as the base model, then SFT fine-tuned it on reasoning using our own private superthoughts instruct dataset, which includes a mix of code, website generation, day-to-day chats, math and counting problems. We then modified the tokenizer slightly, and after the SFT fine-tuning we used GRPO to further amplify its mathematics and problem-solving abilities.

# Which quant is right for you?

***F16***: Fewest hallucinations and high-quality reasoning, but heavy to run.

***Q8_0***: Few hallucinations and high-quality reasoning. Recommended.

***Q6_k***: Hallucinates more; good reasoning but may fail at counting etc. Only use it if you cannot run Q8_0.

***Q4_k_m***: Not recommended. Hallucinates and doesn't always think properly, though it is easier to run.

# Format

```
<|im_start|>user
How many R's in strawberry<|im_end|>
<|im_start|>assistant
Alright, the user has asked how many R's in the word strawberry, that's easy! I just need to count each instance of the letter 'R' in the word 's-t-r-a-w-b-e-r-r-y' and then find out how many R's there are, lets count! S - Not an R, T - Not an R, R - First instance of the letter R! (1), A - Not an R, W - Not an R, B - Not an R, E - Not an R, R - Great! Second instance of the letter R. (2), R - Third instance of the letter R. (3), Y - Not an R. So, i've counted all the letters correctly, meaning that I am sure that there are 3 R's in the word Strawberry. I should probably let the user know.
3
<|im_end|>
```

We recommend using a low temperature, as higher values may cause the model to skip its thinking step.

# System prompt (important to ensure it would always think, output)

```
respond in the following format:
...
...
```

# Examples:

All responses below were generated with our system prompt and a temperature of 0.7, inside the Android application ChatterUI, via the GGUF Q8 quant, using the model's prompt format.

1)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/5veZJmkjuv_7W7pKhvsu0.png)

2)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/pAwPdVkEZ7rnFf-TZ5tMU.png)

3)
![image/png](https://cdn-uploads.huggingface.co/production/uploads/6710ba6af1279fe0dfe33afe/FDaWAAqgv2kvoZvjl8gjl.png)

# Uploaded model

- **Developed by:** Pinkstack
- **License:** apache-2.0
- **Finetuned from model :** HuggingFaceTB/SmolLM2-1.7B-Instruct
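
If you are assembling prompts by hand rather than through a chat application, the turn layout from the Format section can be sketched in a few lines of Python. This is only a minimal illustration: `build_prompt` is a hypothetical helper name, and wrapping the system prompt in a `system` turn with the same `<|im_start|>` / `<|im_end|>` markers is an assumption, since the card only shows the `user` and `assistant` turns.

```python
from typing import Optional


def build_prompt(user_message: str, system_prompt: Optional[str] = None) -> str:
    """Assemble a prompt string using the <|im_start|>/<|im_end|> turn markers.

    Hypothetical helper, not part of any library. The `system` turn is an
    assumption; the model card itself only shows user and assistant turns.
    """
    parts = []
    if system_prompt:
        # Assumed layout for the system turn, mirroring the user turn.
        parts.append(f"<|im_start|>system\n{system_prompt}<|im_end|>\n")
    # User turn, exactly as laid out in the Format section.
    parts.append(f"<|im_start|>user\n{user_message}<|im_end|>\n")
    # Leave the assistant turn open so the model writes the response.
    parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_prompt("How many R's in strawberry"))
```

The returned string ends with an open assistant turn, so generation should be stopped when the model emits `<|im_end|>`.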