Uploaded model
- Developed by: alphaaico
- License: apache-2.0
- Finetuned from model: llama-3.2-3b-instruct-bnb-4bit
This Llama model was trained 2x faster with Unsloth and Hugging Face's TRL library.
Deep-Reason-SMALL-V0
Overview
Deep-Reason-SMALL-V0 is a fine-tuned version of llama-3.2-3b-instruct, designed for advanced reasoning and thinking capabilities. It was trained with GRPO-based reasoning techniques on a custom dataset curated to enhance logical inference, decision-making, and structured reasoning.
Built with Unsloth and Hugging Face's TRL, this model is optimized for faster inference and stronger logical performance.
The model is available in GGUF and 16-bit formats and has been quantized to several levels to support various hardware configurations.
Model Details
- Base Model: Llama 3.2 3B Instruct
- Fine-tuned By: Alpha AI
- Training Framework: Unsloth
Quantization Levels Available
- q4_k_m
- q5_k_m
- q8_0
- 16-bit (this repository)
GGUF Models - https://huggingface.co/alpha-ai/Deep-Reason-SMALL-V0-GGUF
Key Features
- Enhanced Reasoning: Fine-tuned using GRPO to improve problem-solving and structured thought processes.
- Optimized for Thinking Tasks: Excels in logical, multi-step, and causal reasoning.
- Structured XML Responses: Outputs are formatted using structured <reasoning>...</reasoning> and <answer>...</answer> sections for easy parsing.
- Efficient Deployment: Available in GGUF format for local AI deployments on consumer hardware.
Response Format & Parsing Instructions
Deep-Reason-SMALL-V0 follows a structured response format with designated XML-like tags: responses include <reasoning>...</reasoning> and <answer>...</answer> sections. When using the model programmatically, extract the content of these tags. This keeps the reasoning traceable and the final answer easy to isolate.
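As a minimal sketch, the two sections can be pulled out with a regular expression (the tag names come from the format above; the sample response string is illustrative):

```python
import re

def parse_response(text: str) -> dict:
    """Extract the <reasoning> and <answer> sections from a model response."""
    sections = {}
    for tag in ("reasoning", "answer"):
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections

sample = "<reasoning>Two pairs of two is four.</reasoning><answer>4</answer>"
parsed = parse_response(sample)
```

Sections that are missing from the response come back as None, so downstream code can detect a malformed generation rather than crashing.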
Ideal Configuration for using the GGUF Models
- temperature = 0.8
- top_p = 0.95
- max_tokens = 1024
- SYSTEM_PROMPT = """ Respond in the following format: <reasoning> ... </reasoning> <answer> ... </answer> """
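As a hedged sketch of wiring these settings together, the payload below targets an OpenAI-compatible chat endpoint (such as the one served by llama.cpp); the model name is a placeholder for whatever your server registers:

```python
SYSTEM_PROMPT = """Respond in the following format:
<reasoning>
...
</reasoning>
<answer>
...
</answer>"""

def build_request(user_prompt: str) -> dict:
    """Assemble a chat-completion payload with the recommended sampling settings."""
    return {
        "model": "Deep-Reason-SMALL-V0",  # placeholder: use your server's model name
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": user_prompt},
        ],
        "temperature": 0.8,
        "top_p": 0.95,
        "max_tokens": 1024,
    }

payload = build_request("Which weighs more: a kilogram of steel or a kilogram of feathers?")
```

The same three sampling values apply equally when loading the GGUF files locally; only the transport changes.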
Use Cases Deep-Reason-SMALL-V0 is best suited for:
- Conversational AI – Improving chatbot and AI assistant reasoning.
- AI Research – Studying logical thought modeling in AI.
- Automated Decision Making – Powering AI-driven business intelligence systems.
- Education & Tutoring – Helping students and professionals with structured learning.
- Legal & Financial Analysis – Generating step-by-step arguments for case studies.
Limitations & Considerations
- May require further fine-tuning for domain-specific logic.
- Not a factual knowledge base – Focused on reasoning, not general knowledge retrieval.
- Potential biases – Results depend on the training data.
- Computational trade-off – Reasoning performance comes at the cost of slightly longer inference times.
License
This model is released under the Apache-2.0 license, a permissive open-source license.
Acknowledgments
Special thanks to the Unsloth team for providing an optimized training pipeline for LLaMA models.
Disclaimer
This model has been saved in the .bin format because it was trained using Unsloth. The .bin format is PyTorch's default serialization method and functions as expected. However, .bin files use Python's pickle module, which can execute arbitrary code during loading.
If security is a concern, we strongly recommend loading the model in a sandboxed environment such as staging servers, Kaggle, or Google Colab before deploying in production. You can also convert the model to .safetensors, a more secure and optimized format, using the following approach:
```python
from transformers import AutoModel
from safetensors.torch import save_file

# Load the pickled .bin checkpoint
model = AutoModel.from_pretrained("path/to/model")
state_dict = model.state_dict()

# Write the weights out in the safer safetensors format
save_file(state_dict, "model.safetensors")
print("Model converted to safetensors successfully.")
```
Alternatively, you can use our GGUF models, which are optimized for inference with llama.cpp, exllama, and other efficient runtimes. GGUF provides better performance on CPU/GPU and is a more portable option for deployment.
Choose the format that best suits your security, performance, and deployment needs.