---
license: apache-2.0
datasets:
- Orion-zhen/dpo-mathinstuct-emoji
language:
- en
base_model:
- meta-llama/Llama-3.1-8B-Instruct
pipeline_tag: text-generation
library_name: transformers
tags:
- dpo
- rl
- axolotl
---
|
|
|
|
|
# EmojiLlama-3.1-8B |
|
|
|
This model is a fine-tuned version of Llama-3.1-8B-Instruct, trained with DPO (Direct Preference Optimization), a reinforcement-learning-style preference-tuning technique, to make its responses friendlier and more expressive with emojis and jokes.
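DPO trains directly on preference pairs: for each prompt, the policy is nudged to prefer the chosen (emoji-styled) answer over the rejected one, relative to a frozen reference model, with no separate reward model. A minimal sketch of the objective (the β value here is illustrative; Axolotl computes this loss internally):

```py
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """DPO loss over a batch of preference pairs. Each argument is a 1-D
    tensor of summed log-probs of the chosen/rejected completion under
    the policy or the frozen reference model."""
    # Implicit rewards: how much more likely the policy makes each
    # completion compared to the reference model.
    chosen_rewards = policy_chosen_logps - ref_chosen_logps
    rejected_rewards = policy_rejected_logps - ref_rejected_logps
    # Push the chosen reward above the rejected reward.
    return -F.logsigmoid(beta * (chosen_rewards - rejected_rewards)).mean()
```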
|
|
|
[<img src="https://raw.githubusercontent.com/OpenAccess-AI-Collective/axolotl/main/image/axolotl-badge-web.png" alt="Built with Axolotl" width="200" height="32"/>](https://github.com/OpenAccess-AI-Collective/axolotl) |
|
|
|
<details><summary>See axolotl config</summary> |
|
|
|
```yaml
base_model: meta-llama/Llama-3.1-8B-Instruct
model_type: LlamaForCausalLM
tokenizer_type: AutoTokenizer

load_in_8bit: false
load_in_4bit: true
strict: false

chat_template: llama3
rl: dpo
datasets:
  - path: Orion-zhen/dpo-mathinstuct-emoji
    type: llama3.prompt_pairs
    chat_template: llama3

dataset_prepared_path:
val_set_size: 0.05
output_dir: ./llama-results

sequence_len: 8192
sample_packing: false
pad_to_sequence_len: true

adapter: lora
lora_model_dir:
lora_r: 8
lora_alpha: 4
lora_dropout: 0.05
lora_target_linear: true
lora_fan_in_fan_out:

bf16: true
fp16: false

special_tokens:
  bos_token: "<|begin_of_text|>"
  eos_token: "<|eot_id|>"
  pad_token: "<|eot_id|>"
  additional_special_tokens:
    - "<|begin_of_text|>"
    - "<|eot_id|>"

wandb_project:
wandb_entity:
wandb_watch:
wandb_name:
wandb_log_model:

gradient_accumulation_steps: 8
micro_batch_size: 2
num_epochs: 1
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002

train_on_inputs: false
group_by_length: false
tf32: false

gradient_checkpointing: true
early_stopping_patience:
resume_from_checkpoint:
local_rank:
logging_steps: 1
xformers_attention:
flash_attention: true
s2_attention:

warmup_steps: 10
evals_per_epoch: 2
eval_table_size:
eval_max_new_tokens: 128
saves_per_epoch: 1
debug:
deepspeed:
weight_decay: 0.0
fsdp:
fsdp_config:

save_safetensors: true
```
|
</details><br> |
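Since the config trains a 4-bit QLoRA adapter (`adapter: lora` with `load_in_4bit: true`), a run launched with `accelerate launch -m axolotl.cli.train <config>.yml` saves adapter weights to `./llama-results` rather than a full checkpoint. A minimal merge sketch with PEFT, assuming that output directory (the merged path name is arbitrary):

```py
from peft import AutoPeftModelForCausalLM

# Load the base model with the trained LoRA adapter applied on top
model = AutoPeftModelForCausalLM.from_pretrained("./llama-results")

# Fold the adapter into the base weights and save a standalone model
merged = model.merge_and_unload()
merged.save_pretrained("./EmojiLlama-3.1-8B-merged")
```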
|
|
|
# Prompt Template |
|
|
|
You can use the Llama 3 prompt template with this model:
|
|
|
### Llama3 |
|
|
|
```
<|start_header_id|>system<|end_header_id|>
{system}<|eot_id|>

<|start_header_id|>user<|end_header_id|>
{user}<|eot_id|>

<|start_header_id|>assistant<|end_header_id|>
{assistant}<|eot_id|>
```
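You normally don't need to build this string by hand: the tokenizer's chat template renders it for you. A quick way to inspect the exact prompt it produces (the system message here is just an example):

```py
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("suayptalha/EmojiLlama-3.1-8B")

messages = [
    {"role": "system", "content": "You are a cheerful assistant."},
    {"role": "user", "content": "Tell me a joke."},
]
# tokenize=False returns the rendered prompt string instead of token ids
print(tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True))
```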
|
|
|
## Example usage: |
|
|
|
```py
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "suayptalha/EmojiLlama-3.1-8B",
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("suayptalha/EmojiLlama-3.1-8B")

messages = [
    {"role": "user", "content": "Lana had 8 blank pages left in her binder, but she knew she would need more for her next class. Duane took half of the 42 pages in his binder out and gave them to her. How many pages does Lana have in her binder after adding Duane's?"},
]
inputs = tokenizer.apply_chat_template(
    messages,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    use_cache=True,
    do_sample=True,  # sampling must be enabled for temperature to take effect
    temperature=0.7,
)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```
|
|
|
## Output: |
|
```
💡 Remember, we're doubling Lana's pages, thanks to Duane's kindness! 💕
Duane gave Lana 42 / 2 = 21 pages 👍
After adding Duane's, Lana has 21 + 8 = 29 pages in her binder 📚
The answer is 29 🎉
```
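If you'd rather see tokens as they are generated instead of waiting for the full completion, a `TextStreamer` can be passed to `generate` (reusing `model`, `tokenizer`, and `inputs` from the example above):

```py
from transformers import TextStreamer

# Prints decoded tokens to stdout as they arrive, skipping the echoed prompt
streamer = TextStreamer(tokenizer, skip_prompt=True)
_ = model.generate(
    input_ids=inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.7,
    streamer=streamer,
)
```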
|
|
|
# Parameters |
|
- lr: 2e-4
- epochs: 1
- effective batch size: 16 (micro_batch_size 2 × gradient_accumulation_steps 8)
- optimizer: adamw_bnb_8bit
|
|
|
# Support |
|
|
|
<a href="https://www.buymeacoffee.com/suayptalha" target="_blank"><img src="https://cdn.buymeacoffee.com/buttons/v2/default-yellow.png" alt="Buy Me A Coffee" style="height: 60px !important;width: 217px !important;" ></a> |