---
license: mit
language:
- en
base_model:
- codellama/CodeLlama-7b-hf
- codellama/CodeLlama-7b-Python-hf
library_name: transformers
tags:
- mergekit
- merged-model
- codellama
- programming
- language-model
---

# CodeLlama-Hybrid-7B: Optimized for Code Generation

## Overview

**CodeLlama-Hybrid-7B** is an **experimental hybrid language model** that merges the capabilities of two CodeLlama variants. Built using **MergeKit**, this model is optimized for programming-related tasks, balancing efficiency and performance in code generation and understanding.

**Created by**: Matteo Khan

**Affiliation**: Apprentice at TW3 Partners (Generative AI Research)

**License**: MIT

[Connect with me on LinkedIn](https://www.linkedin.com/in/matteo-khan-a10309263/)

[Model on Hugging Face](https://huggingface.co/MatteoKhan/CodeLlama-Hybrid-7B)

## Model Details

- **Model Type**: Hybrid Language Model (Merged for Code Generation)
- **Parent Models**:
  - [CodeLlama-7B](https://huggingface.co/codellama/CodeLlama-7b-hf)
  - [CodeLlama-7B-Python](https://huggingface.co/codellama/CodeLlama-7b-Python-hf)
- **Merging Technique**: Linear Merge (MergeKit); see the sketch below
- **Tokenizer Source**: `codellama/CodeLlama-7b-hf`
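
For intuition, a linear merge is just a weighted average of matching parameter tensors from the parent models. The toy sketch below shows only the arithmetic; MergeKit itself handles shard loading, dtype casting, and edge cases. The function name and the use of plain PyTorch state dicts are illustrative assumptions, not MergeKit internals.

```python
# Toy illustration of a linear merge (not MergeKit's actual code):
# each parameter is the weighted average of the two parents' parameters.
import torch

def linear_merge(state_a, state_b, w_a=0.5, w_b=0.5, normalize=True):
    if normalize:  # mirrors `normalize: true` in the config below
        total = w_a + w_b
        w_a, w_b = w_a / total, w_b / total
    return {
        name: w_a * state_a[name] + w_b * state_b[name]
        for name in state_a
    }
```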

## Intended Use

This model is designed for **code-related tasks** and experimentation in hybrid model optimization. Possible applications include:

- Code Generation
- Code Completion & Assistance
- Code Understanding & Refactoring
- Exploration of Model Merging Effects on Programming Tasks

## Limitations & Considerations

While **CodeLlama-Hybrid-7B** provides enhanced code generation capabilities, it inherits some limitations from its parent models:

- May produce **incorrect or insecure** code
- Can generate **biased, offensive, or inappropriate** content
- Merging may introduce **unpredictable behaviors**
- Performance may **vary depending on the programming language and context**

## Merging Process & Configuration

This is **not a newly trained model**, but rather a merge of existing models using the following configuration:

```yaml
merge_method: linear
dtype: float16
allow_crimes: true
models:
  - model: "codellama/CodeLlama-7b-hf"
    parameters:
      t: 1.0
      weight: 0.5
  - model: "codellama/CodeLlama-7b-Python-hf"
    parameters:
      t: 1.0
      weight: 0.5
parameters:
  normalize: true
  int8_mask: false
  ignore_mismatched_sizes: true
layers:
  - pattern: "model.*"
tokenizer_source: "codellama/CodeLlama-7b-hf"
```
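
To reproduce the merge, save the configuration above to a file and run it through MergeKit. Below is a minimal sketch assuming MergeKit's documented Python entry points (`MergeConfiguration`, `run_merge`); the config and output paths are placeholders.

```python
# Hypothetical reproduction script; assumes `pip install mergekit`
# and the MergeConfiguration / run_merge API from MergeKit's README.
import yaml

from mergekit.config import MergeConfiguration
from mergekit.merge import MergeOptions, run_merge

with open("merge_config.yaml", encoding="utf-8") as f:  # the YAML above
    config = MergeConfiguration.model_validate(yaml.safe_load(f))

run_merge(
    config,
    out_path="./CodeLlama-Hybrid-7B",  # placeholder output directory
    options=MergeOptions(
        copy_tokenizer=True,  # copy the tokenizer from tokenizer_source
        lazy_unpickle=True,   # reduce peak memory while loading shards
    ),
)
```

Equivalently, MergeKit's `mergekit-yaml` command-line tool can run the same configuration file.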

**No formal evaluation** has been conducted yet. Users are encouraged to **benchmark and share feedback**!
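
As a starting point, a quick smoke test like the hypothetical snippet below (not a formal benchmark) lets you eyeball outputs before investing in a full suite such as HumanEval:

```python
# Hypothetical smoke test: run a few prompts and inspect the completions.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "MatteoKhan/CodeLlama-Hybrid-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

prompts = [
    "def is_palindrome(s: str) -> bool:",
    "# Reverse a singly linked list in Python\n",
]
for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    out = model.generate(**inputs, max_new_tokens=128, do_sample=False)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
    print("-" * 40)
```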

## Environmental Impact

By using **model merging** instead of training from scratch, **CodeLlama-Hybrid-7B** avoids the computational and environmental cost of a full training run.

## How to Use

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "MatteoKhan/CodeLlama-Hybrid-7B"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # the merge was produced in float16
    device_map="auto",          # requires the `accelerate` package
)

# Example usage
prompt = "Write a Python function to calculate Fibonacci numbers."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=200)
response = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(response)
```

## Citation & References

If you use **CodeLlama-Hybrid-7B** in your research, please cite the parent models:

**Code Llama**

> Rozière, B., et al. (2023). *Code Llama: Open Foundation Models for Code.* arXiv:2308.12950.

**Feedback & Contact**: Reach out via [Hugging Face](https://huggingface.co/MatteoKhan).

**Happy Coding!**