---
license: llama3.1
language:
- en
- code
library_name: transformers
tags:
- llama-3.1
- python
- code-generation
- instruction-following
- fine-tune
- alpaca
- unsloth
base_model: meta-llama/Meta-Llama-3.1-8B-Instruct
datasets:
- iamtarun/python_code_instructions_18k_alpaca
---

# Llama-3.1-8B-Instruct-Python-Alpaca-Unsloth

This is a fine-tuned version of Meta's **`Llama-3.1-8B-Instruct`** model, specialized for Python code generation. It was trained on the high-quality **`iamtarun/python_code_instructions_18k_alpaca`** dataset using the **Unsloth** library for significantly faster training and reduced memory usage.

The result is a powerful and responsive coding assistant, designed to follow instructions and generate accurate, high-quality Python code.

---
## Model Details 🛠️

* **Base Model:** `meta-llama/Meta-Llama-3.1-8B-Instruct`
* **Dataset:** `iamtarun/python_code_instructions_18k_alpaca` (18,000 instruction-following examples for Python)
* **Fine-tuning Technique:** QLoRA (4-bit quantization with LoRA adapters; see the sketch below)
* **Framework:** Unsloth (for up to 2x faster training and optimized memory)
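
The exact training configuration is not reproduced here, but a typical Unsloth QLoRA run on this dataset looks roughly like the sketch below. All hyperparameters (LoRA rank, batch size, learning rate, epochs) are illustrative placeholders, the `prompt` text column is assumed from the dataset card rather than taken from this model's training script, and newer `trl` versions may expect the trainer arguments inside an `SFTConfig` instead.

```python
# Illustrative QLoRA fine-tuning sketch (Unsloth + TRL); not the exact recipe used for this model.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

# Load the base model in 4-bit and attach LoRA adapters (placeholder hyperparameters).
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "meta-llama/Meta-Llama-3.1-8B-Instruct",
    max_seq_length = 4096,
    load_in_4bit = True,
)
model = FastLanguageModel.get_peft_model(
    model,
    r = 16,                       # LoRA rank (placeholder)
    lora_alpha = 16,
    lora_dropout = 0,
    target_modules = ["q_proj", "k_proj", "v_proj", "o_proj",
                      "gate_proj", "up_proj", "down_proj"],
    use_gradient_checkpointing = "unsloth",
)

# Alpaca-formatted dataset; the "prompt" column is assumed to hold the full
# instruction/input/output text (check the dataset card for the exact fields).
dataset = load_dataset("iamtarun/python_code_instructions_18k_alpaca", split = "train")

trainer = SFTTrainer(
    model = model,
    tokenizer = tokenizer,
    train_dataset = dataset,
    dataset_text_field = "prompt",
    max_seq_length = 4096,
    args = TrainingArguments(
        per_device_train_batch_size = 2,   # placeholder values throughout
        gradient_accumulation_steps = 4,
        num_train_epochs = 1,
        learning_rate = 2e-4,
        logging_steps = 10,
        output_dir = "outputs",
    ),
)
trainer.train()
```

Because only the LoRA adapter weights are updated while the 4-bit base stays frozen, a run like this fits on a single consumer GPU; the trained adapters can later be merged into the base model or pushed separately.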

---
## How to Use 👨‍💻

This model is designed to be used with the Unsloth library for maximum performance, but it can also be used with the standard Hugging Face `transformers` library. For the best results, always use the Llama 3 chat template.

### Using with Unsloth (Recommended)

```python
from unsloth import FastLanguageModel
import torch

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name = "YOUR_USERNAME/YOUR_MODEL_NAME", # REMEMBER TO REPLACE THIS
    max_seq_length = 4096,
    dtype = None,
    load_in_4bit = True,
)

# Prepare the model for faster inference
FastLanguageModel.for_inference(model)

messages = [
    {
        "role": "system",
        "content": "You are a helpful Python coding assistant. Please provide a clear, concise, and correct Python code response to the user's request."
    },
    {
        "role": "user",
        "content": "Create a Python function that finds the nth Fibonacci number using recursion."
    },
]

# Apply the Llama 3 chat template and append the assistant header so the model replies next
input_ids = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_tensors="pt"
).to(model.device)

outputs = model.generate(
    input_ids,
    max_new_tokens=200,
    do_sample=True,
    temperature=0.6,
    top_p=0.9,
    eos_token_id=tokenizer.eos_token_id
)

# Keep only the newly generated tokens (drop the echoed prompt) before decoding
response = outputs[0][input_ids.shape[-1]:]
print(tokenizer.decode(response, skip_special_tokens=True))