---
license: apache-2.0
datasets:
- Orion-zhen/dpo-mathinstuct-emoji
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- dpo
- rl
- axolotl
---
# EmojiLlama-3.1-8B
This model is a fine-tuned version of Llama-3.1-8B-Instruct, trained with DPO (Direct Preference Optimization) on the Orion-zhen/dpo-mathinstuct-emoji dataset to make its answers friendlier and more expressive, with emojis and jokes.
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl)
<details><summary>See axolotl config</summary>
```yaml
base_model: meta-llama/Llama-3.1-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer
load_in_8bit: false
load_in_4bit: true
strict: false
chat_template: llama3
rl: dpo
datasets:
  - path: Orion-zhen/dpo-mathinstuct-emoji
    type: llama3.prompt_pairs
    chat_template: llama3
dataset_prepared_path:
val_set_size: 0.05
output_dir: ./llama-results
sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true
adapter: lora
lora_model_dir:
lora_r: 8
lora_alpha: 4
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:
bf16: true
fp16: false
special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"
wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:
gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002
train_on_inputs: false
group_by_length: false
tf32: false
gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:
warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:
save_safetensors: true
```
</details><br>
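Two values in the config are easier to read once derived: the LoRA scaling factor (`lora_alpha / lora_r`) and the learning rate at a given step under `warmup_steps: 10` with `lr_scheduler: cosine`. The sketch below assumes the standard LoRA scaling and the common linear-warmup-then-cosine-decay shape; the trainer's exact schedule may differ in detail, and the total step count used here is hypothetical because the card does not state it.

```python
import math

# Values copied from the axolotl config above.
LORA_R = 8
LORA_ALPHA = 4
LEARNING_RATE = 2e-4
WARMUP_STEPS = 10


def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


def cosine_lr(step: int, total_steps: int,
              base_lr: float = LEARNING_RATE,
              warmup_steps: int = WARMUP_STEPS) -> float:
    """Linear warmup from 0 to base_lr, then cosine decay back to 0."""
    if step < warmup_steps:
        return base_lr * step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    total_steps = 100  # hypothetical: the real step count is not given in the card
    print("LoRA scaling:", lora_scaling(LORA_ALPHA, LORA_R))
    for step in (0, 5, 10, 55, 100):
        print(step, cosine_lr(step, total_steps))
```

With `lora_alpha: 4` and `lora_r: 8` the adapter update is scaled by 0.5, so the LoRA weights contribute at half strength relative to an `alpha == r` setup.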
# Prompt Template
Use the Llama 3 prompt template with this model:
### Llama3
```
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>
<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>
<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
```
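If you need to build the prompt string by hand, the template above can be filled in with a small helper. This is a minimal sketch that follows the layout shown in this card; `format_llama3_prompt` is a hypothetical name, and the tokenizer's built-in chat template (via `apply_chat_template`) remains the source of truth, since it may add `<|begin_of_text|>` and differ in whitespace.

```python
from typing import Optional


def format_llama3_prompt(user: str, system: Optional[str] = None) -> str:
    """Fill the card's Llama 3 template, ending with an open assistant turn."""
    parts = []
    if system is not None:
        parts.append(f"<|start_header_id|>system<|end_header_id|>\n{system}<|eot_id|>")
    parts.append(f"<|start_header_id|>user<|end_header_id|>\n{user}<|eot_id|>")
    # Leave the assistant header open so the model writes the reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n")
    return "\n".join(parts)


if __name__ == "__main__":
    print(format_llama3_prompt("What is 42 / 2?", system="Answer with emojis."))
```

The prompt stops after the assistant header, so generation starts at the `{assistant}` slot and ends when the model emits `<|eot_id|>`.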
## Example usage:
```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo id assumed from this card's title; the original snippet pointed at a
# different model. Adjust it if the repository name differs.
model_id = "suayptalha/EmojiLlama-3.1-8B"

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bf16, as in the training config
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Lana had 8 blank pages left in her binder, but she knew she would need more for her next class. Duane took half of the 42 pages in his binder out and gave them to her. How many pages does Lana have in her binder after adding Duane’s?"},
]

# Apply the Llama 3 chat template and move the ids to the model's device
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

# do_sample=True is needed for temperature to take effect
output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=False)
print(decoded_output)
```
## Output:
```
💡 Remember, we're doubling Lana's pages, thanks to Duane's kindness! 💕
Duane gave Lana 42 / 2 = 21 pages 👍
After adding Duane's, Lana has 21 + 8 = 29 pages in her binder 📚
The answer is 29 🎉
```
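The arithmetic in this sample can be checked mechanically: half of 42 is 21, and 21 + 8 = 29. The sketch below pulls the final number out of the response and compares it with that value; `extract_answer` is a hypothetical helper, and it assumes the response ends with a "The answer is N" line, as the sample does.

```python
import re
from typing import Optional


def extract_answer(text: str) -> Optional[int]:
    """Return the integer following 'The answer is', or None if absent."""
    match = re.search(r"The answer is\s+(-?\d+)", text)
    return int(match.group(1)) if match else None


sample_output = """💡 Remember, we're doubling Lana's pages, thanks to Duane's kindness! 💕
Duane gave Lana 42 / 2 = 21 pages 👍
After adding Duane's, Lana has 21 + 8 = 29 pages in her binder 📚
The answer is 29 🎉"""

expected = 42 // 2 + 8  # half of Duane's 42 pages plus Lana's 8

if __name__ == "__main__":
    print(extract_answer(sample_output) == expected)
```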
# Parameters
- lr: 2e-4
- epochs: 1
- batch_size: 16
- optimizer: adamw_bnb_8bit
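
The `batch_size: 16` above is consistent with the axolotl config, where `micro_batch_size: 2` and `gradient_accumulation_steps: 8` multiply to 16. A minimal sketch of that arithmetic follows; the single-device default is an assumption, since the card does not state how many GPUs were used.

```python
def effective_batch_size(micro_batch_size: int,
                         gradient_accumulation_steps: int,
                         num_devices: int = 1) -> int:
    """Samples contributing to one optimizer step.

    num_devices=1 is an assumption; the card does not state the GPU count.
    """
    return micro_batch_size * gradient_accumulation_steps * num_devices


if __name__ == "__main__":
    print(effective_batch_size(micro_batch_size=2, gradient_accumulation_steps=8))
```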
# Support
<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a>