|
--- |
|
base_model: |
|
- Pinkstack/Superthoughts-lite-v1 |
|
tags: |
|
- text-generation-inference |
|
- transformers |
|
- unsloth |
|
- llama |
|
- gguf |
|
- cot |
|
- superthoughts |
|
- reasoning |
|
- grpo |
|
license: apache-2.0 |
|
language: |
|
- en |
|
datasets: |
|
- openai/gsm8k |
|
- Pinkstack/intructions-sft-sharegpt |
|
--- |
|
Demo: https://huggingface.co/spaces/Pinkstack/Chat-with-superthoughts-lite |
|
[Image: superthoughts lite]
|
|
|
# Information |
|
Advanced, high-quality, **lite** reasoning in a tiny model that you can run on your phone.
|
|
|
At original quality, it runs at ~400 tokens/second on a single NVIDIA H100 GPU from Friendli.
|
|
|
Trained similarly to DeepSeek R1: we used SmolLM2 as the base model, then SFT fine-tuned it on reasoning using our own private Superthoughts instruct dataset, which includes a mix of code, website generation, day-to-day chats, math, and counting problems. We then modified the tokenizer slightly. After the SFT fine-tuning, we used GRPO to further amplify its mathematics and problem-solving abilities.
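For readers unfamiliar with GRPO, its core idea is to sample a group of responses to the same prompt, score each one, and normalise every reward against the group's mean and standard deviation. The sketch below illustrates only that normalisation step. It is not the training code used for this model; the function name and the reward values in the usage note are hypothetical.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def group_relative_advantages(rewards: Sequence[float], eps: float = 1e-6) -> List[float]:
    """Normalise rewards within one group of sampled responses.

    Each advantage is (reward - group mean) / (group std + eps).
    The population standard deviation is an assumption made for this sketch.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]
```

For example, with made-up rewards `[1.0, 0.0, 0.0, 1.0]` for four sampled answers, the two correct answers receive a positive advantage and the two incorrect ones a negative advantage of the same magnitude.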
|
|
|
# Which quant is right for you? |
|
|
|
***F16***: Fewest hallucinations and high-quality reasoning, but heavy to run.

***Q8_0***: Few hallucinations and high-quality reasoning. Recommended.

***Q6_k***: Hallucinates more. Good reasoning, but may fail at tasks such as counting. Only use it if you cannot run Q8_0.

***Q4_k_m***: Not recommended. Hallucinates and doesn't always think properly, though it is easier to run.
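To relate the quants above to available memory, here is a rough sketch that estimates the weight size of each one and picks the highest-quality quant that fits a budget. The parameter count is taken from the base model name (SmolLM2-1.7B). The bits-per-weight figures are approximate, assumed averages for these quantization types and were not measured on this model's files. The estimate covers weights only and ignores the KV cache and runtime overhead. Both function names are hypothetical.

```python
from typing import Optional

PARAMS = 1.7e9  # approximate, from the base model name SmolLM2-1.7B

# Assumed average bits per weight, listed from highest to lowest quality.
# These are rough figures, not measurements of the published GGUF files.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 8.5,
    "Q6_k": 6.56,
    "Q4_k_m": 4.85,
}


def estimated_size_gb(quant: str, params: float = PARAMS) -> float:
    """Estimate the weight size in GB (10^9 bytes) for a given quant."""
    return params * BITS_PER_WEIGHT[quant] / 8 / 1e9


def largest_quant_that_fits(budget_gb: float) -> Optional[str]:
    """Return the highest-quality quant whose estimated weights fit the budget."""
    for quant in BITS_PER_WEIGHT:  # dicts keep insertion order
        if estimated_size_gb(quant) <= budget_gb:
            return quant
    return None
```

Calling `largest_quant_that_fits(2.0)` returns `"Q8_0"` under these assumptions; leave extra headroom for context memory on real devices.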
|
|
|
# Format |
|
``` |
|
<|im_start|>user |
|
How many R's in strawberry<|im_end|> |
|
<|im_start|>assistant |
|
<think> |
|
Alright, the user has asked how many R's in the word strawberry, that's easy! I just need to count each instance of the letter 'R' in the word 's-t-r-a-w-b-e-r-r-y' and then find out how many R's there are, lets count! |
|
S - Not an R, |
|
T - Not an R, |
|
R - First instance of the letter R! (1), |
|
A - Not an R, |
|
W - Not an R, |
|
B - Not an R, |
|
E - Not an R, |
|
R - Great! Second instance of the letter R. (2), |
|
R - Third instance of the letter R. (3), |
|
Y - Not an R. |
|
|
|
So, i've counted all the letters correctly, meaning that I am sure that there are 3 R's in the word Strawberry. I should probably let the user know. |
|
</think> |
|
<output>3 |
|
</output><|im_end|> |
|
``` |
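Because the reasoning and the final answer arrive in one string, client code usually needs to separate them. Below is a minimal sketch, assuming the response follows the format above; `split_response` is a hypothetical helper, not part of any library.

```python
import re
from typing import Optional, Tuple

# Tag names are taken from the format shown above.
_THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
_OUTPUT_RE = re.compile(r"<output>(.*?)</output>", re.DOTALL)


def split_response(text: str) -> Tuple[Optional[str], Optional[str]]:
    """Return (reasoning, answer); either is None if its block is missing."""
    text = text.replace("<|im_end|>", "")  # drop the end-of-turn token if present
    think = _THINK_RE.search(text)
    output = _OUTPUT_RE.search(text)
    reasoning = think.group(1).strip() if think else None
    answer = output.group(1).strip() if output else None
    return reasoning, answer
```

Returning `None` for a missing block lets the caller detect responses where the model skipped its thinking step and retry or fall back to the raw text.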
|
We recommend using a low temperature, as higher values may cause the model to skip thinking.
|
# System prompt

(Important: this ensures the model always produces both the `<think>` and `<output>` blocks.)
|
``` |
|
respond in the following format: |
|
<think> |
|
... |
|
</think> |
|
<output> |
|
... |
|
</output> |
|
``` |
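If you assemble the prompt string yourself, the system prompt goes into its own turn before the user message. The sketch below assumes the system turn uses the same `<|im_start|>` / `<|im_end|>` markers as the user and assistant turns in the Format section; the card only shows those two roles, so treat the `system` role line as an assumption. `build_prompt` is a hypothetical helper name.

```python
# System prompt text copied from the block above.
SYSTEM_PROMPT = (
    "respond in the following format:\n"
    "<think>\n...\n</think>\n"
    "<output>\n...\n</output>"
)


def build_prompt(user_message: str, system_prompt: str = SYSTEM_PROMPT) -> str:
    """Build a single-turn prompt that ends where the assistant should start."""
    return (
        f"<|im_start|>system\n{system_prompt}<|im_end|>\n"  # assumed system turn
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
```

For example, `build_prompt("How many R's in strawberry")` yields a string that can be passed to a raw completion endpoint. Chat frontends that apply the model's template themselves only need the system prompt text.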
|
# Examples

All responses below were generated with our system prompt at a temperature of 0.7, inside the Android application ChatterUI, using the Q8 GGUF and the model's prompt format.
|
1) [Image: example response screenshot]

2) [Image: example response screenshot]

3) [Image: example response screenshot]
|
|
|
# Uploaded model |
|
|
|
- **Developed by:** Pinkstack |
|
- **License:** apache-2.0 |
|
- **Finetuned from model:** HuggingFaceTB/SmolLM2-1.7B-Instruct