---
library_name: transformers
license: mit
---

# QwQ-Buddy-32B-Alpha

## Model Summary

QwQ-Buddy-32B-Alpha is a merged 32B model created by fusing two high-performing models:

- huihui-ai/QwQ-32B-Coder-Fusion-9010 (strong in coding and logical reasoning)
- OpenBuddy/openbuddy-qwq-32b-v24.2-200k (strong in general knowledge and reasoning)

The merge was performed using Spherical Linear Interpolation (SLERP) to ensure a smooth, balanced integration of capabilities from both source models. The result is a versatile 32B model that performs strongly on both coding and reasoning tasks, making it a strong candidate for open leaderboard evaluations.
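
For readers unfamiliar with the method, the sketch below applies SLERP to a single pair of weight tensors. It is a minimal illustration written for this card, not the MergeKit implementation used for the actual merge; the function name, the epsilon handling, and the linear fallback are assumptions made for clarity.

```python
import torch

def slerp(t: float, a: torch.Tensor, b: torch.Tensor, eps: float = 1e-8) -> torch.Tensor:
    """Spherical linear interpolation between two weight tensors.

    t = 0 returns `a`, t = 1 returns `b`; intermediate values move along the
    great circle between the flattened parameter vectors rather than a
    straight line, which better preserves the norm structure of the weights.
    """
    a_flat, b_flat = a.flatten().float(), b.flatten().float()
    a_unit = a_flat / (a_flat.norm() + eps)
    b_unit = b_flat / (b_flat.norm() + eps)
    # Angle between the two parameter vectors.
    omega = torch.arccos(torch.clamp(torch.dot(a_unit, b_unit), -1.0, 1.0))
    sin_omega = torch.sin(omega)
    if sin_omega.abs() < eps:
        # Nearly parallel tensors: fall back to plain linear interpolation.
        merged = (1.0 - t) * a_flat + t * b_flat
    else:
        merged = (torch.sin((1.0 - t) * omega) / sin_omega) * a_flat \
                 + (torch.sin(t * omega) / sin_omega) * b_flat
    return merged.reshape(a.shape).to(a.dtype)

# A balanced blend of two (toy) weight matrices.
merged_tensor = slerp(0.5, torch.randn(4, 4), torch.randn(4, 4))
```

MergeKit applies this kind of interpolation tensor by tensor across the two checkpoints, with the interpolation factor controlling how much each source model contributes at each layer.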

## Model Details

- Model Type: Merged LLM (Qwen2.5-based 32B architecture)
- Precision: bfloat16
- Merge Method: SLERP (Spherical Linear Interpolation)
- Weight Type: Original (fully merged model, not delta-based)
- Context Length: 200K tokens (inherited from OpenBuddy-QwQ)
- Base Models:
  - huihui-ai/QwQ-32B-Coder-Fusion-9010
  - OpenBuddy/openbuddy-qwq-32b-v24.2-200k
- Merged Layers:
  - Layers 0-32: contributions distributed equally between both models
  - Layers 24-64: weighted toward knowledge reasoning and logical computation
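
Most of the details above can be read directly from the published configuration. The quick check below uses `AutoConfig` and assumes the repository exposes the standard Qwen2 config fields; it is a convenience sketch, not part of the official card.

```python
from transformers import AutoConfig

# Inspect the merged model's architecture without downloading the full weights.
config = AutoConfig.from_pretrained("FINGU-AI/QwQ-Buddy-32B-Alpha")
print(config.num_hidden_layers)        # number of transformer blocks
print(config.max_position_embeddings)  # maximum context length in tokens
print(config.torch_dtype)              # precision stored in the checkpoint
```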

## Performance Improvements

- ✅ Stronger coding capabilities (inherited from QwQ-32B-Coder-Fusion-9010)
- ✅ Enhanced general knowledge and reasoning (boosted by OpenBuddy-QwQ)
- ✅ Balanced self-attention and MLP layers for smoother response generation
- ✅ More robust multilingual support (contributed by OpenBuddy-QwQ)
- ✅ Tuned SLERP weighting for benchmark accuracy

## Expected Leaderboard Performance

Based on internal testing and model comparisons, QwQ-Buddy-32B-Alpha is expected to achieve top 20 rankings in:

- HumanEval (coding)
- MMLU (multi-task language understanding)
- HellaSwag (commonsense reasoning)
- BBH (BIG-Bench Hard, complex problem solving)
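
As a rough outline of how such numbers could be reproduced, the snippet below evaluates the model on one of these benchmarks with the lm-evaluation-harness Python API. The task name, few-shot count, and batch size are placeholder choices for illustration, not the official leaderboard configuration.

```python
import lm_eval

# Run a single benchmark locally; adjust batch size to available GPU memory.
results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=FINGU-AI/QwQ-Buddy-32B-Alpha,dtype=bfloat16",
    tasks=["hellaswag"],
    num_fewshot=10,
    batch_size=4,
)
print(results["results"]["hellaswag"])
```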

## Limitations & Considerations

- 🚧 Not fine-tuned post-merge (the raw merge may show minor instabilities in evaluation)
- 🚧 No explicit safety alignment applied (inherits behavior from the base models)
- 🚧 Performance on unseen edge cases requires further evaluation

## How to Use

To load the model for inference:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_name = "FINGU-AI/QwQ-Buddy-32B-Alpha"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # spread the 32B weights across available GPUs (requires accelerate)
)

prompt = "Write a Python function to compute Fibonacci numbers:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
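
Because both source models are chat-tuned QwQ variants, prompts usually behave better when wrapped in the model's chat template. Continuing from the snippet above, the sketch below uses the tokenizer's built-in `apply_chat_template`; it assumes the merged tokenizer ships a chat template, which has not been verified separately here.

```python
# Minimal chat-style generation, assuming the tokenizer provides a chat template.
messages = [
    {"role": "user", "content": "Write a Python function to compute Fibonacci numbers."},
]
chat_inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,  # append the assistant-turn marker
    return_tensors="pt",
).to(model.device)

chat_outputs = model.generate(chat_inputs, max_new_tokens=200)
# Decode only the newly generated tokens.
print(tokenizer.decode(chat_outputs[0][chat_inputs.shape[-1]:], skip_special_tokens=True))
```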

## Acknowledgments

This model was built using:

- MergeKit for SLERP-based weight interpolation
- Hugging Face Transformers for model loading and testing
- Open leaderboard evaluation benchmarks for performance comparisons

## Contact & Feedback

For any inquiries, issues, or feedback regarding QwQ-Buddy-32B-Alpha, please reach out via GitHub or Hugging Face discussions.