---
license: apache-2.0
datasets:
- Orion-zhen/dpo-mathinstuct-emoji
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- dpo
- rl
- axolotl
---
|
|
|
|
|
# EmojiLlama-3.1-8B |
|
|
|
This model is a fine-tuned version of Llama-3.1-8B-Instruct, trained with DPO (Direct Preference Optimization), a reinforcement-learning-style preference-tuning technique, to make its responses friendlier and more expressive with emojis and jokes.
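DPO trains directly on preference pairs: for each prompt, the policy is nudged to prefer the chosen (emoji-styled) answer over the rejected one, relative to a frozen reference model, with no separate reward model. A minimal sketch of the objective (the β value here is illustrative; Axolotl computes this loss internally):

```py
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss over a batch of preference pairs. Each argument is a 1-D
    tensor of summed log-probs of the chosen/rejected completion under
    the policy or the frozen reference model."""
    # Implicit rewards: how much more likely the policy makes each
    # completion compared to the reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Push the chosen reward above the rejected reward.
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```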
|
|
|
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl) |
|
|
|
<details><summary>See axolotl config</summary> |
|
|
|
```yaml
base_model: meta-llama/Llama-3.1-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: true
strict: false

chat_template: llama3
rl: dpo
datasets:
  - path: Orion-zhen/dpo-mathinstuct-emoji
    type: llama3.prompt_pairs
    chat_template: llama3

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./llama-results

sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 8
lora_alpha: 4
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

bf16: true
fp16: false

special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:

save_safetensors: true
```
|
</details><br> |
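Since the config trains a 4-bit QLoRA adapter (`adapter: lora` with `load_in_4bit: true`), a run launched with `accelerate launch -m axolotl.cli.train <config>.yml` saves adapter weights to `./llama-results` rather than a full checkpoint. A minimal merge sketch with PEFT, assuming that output directory (the merged path name is arbitrary):

```py
from peft import AutoPeftModelForCausalLM

# Load the base model with the trained LoRA adapter applied on top
model = AutoPeftModelForCausalLM.from_pretrained("./llama-results")

# Fold the adapter into the base weights and save a standalone model
merged = model.merge_and_unload()
merged.save_pretrained("./EmojiLlama-3.1-8B-merged")
```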
|
|
|
# Prompt Template |
|
|
|
You can use the Llama 3 prompt template with this model:
|
|
|
### Llama3 |
|
|
|
```
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>

<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
```
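You normally don't need to build this string by hand: the tokenizer's chat template renders it for you. A quick way to inspect the exact prompt it produces (the system message here is just an example):

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("suayptalha/EmojiLlama-3.1-8B")

messages = [
    {"role": "system", "content": "You are a cheerful assistant."},
    {"role": "user", "content": "Tell me a joke."},
]
# tokenize=False returns the rendered prompt string instead of token ids
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```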
|
|
|
## Example usage: |
|
|
|
```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/EmojiLlama-3.1-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/EmojiLlama-3.1-8B")

messages = [
    {"role": "user", "content": "Lana had 8 blank pages left in her binder, but she knew she would need more for her next class. Duane took half of the 42 pages in his binder out and gave them to her. How many pages does Lana have in her binder after adding Duane's?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
|
|
|
## Output: |
|
```
💡 Remember, we're doubling Lana's pages, thanks to Duane's kindness! 💕
Duane gave Lana 42 / 2 = 21 pages 👍
After adding Duane's, Lana has 21 + 8 = 29 pages in her binder 📚
The answer is 29 🎉
```
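If you'd rather see tokens as they are generated instead of waiting for the full completion, a `TextStreamer` can be passed to `generate` (reusing `model`, `tokenizer`, and `inputs` from the example above):

```py
from transformers import TextStreamer

# Prints decoded tokens to stdout as they arrive, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    streamer=streamer,
)
```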
|
|
|
# Parameters |
|
- lr: 2e-4
- epochs: 1
- effective batch size: 16 (micro_batch_size 2 × gradient_accumulation_steps 8)
- optimizer: adamw_bnb_8bit
|
|
|
# Support |
|
|
|
<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a> |