---
library_name: transformers
license: mit
datasets:
- cardiffnlp/tweet_eval
language:
- en
base_model:
- TinyLlama/TinyLlama-1.1B-Chat-v1.0
---

# Model Card: TinyLlama-1.1B-Chat for Tweet Sentiment Classification

## Model Details

We fine-tuned a 4-bit quantized version of the base model [TinyLlama/TinyLlama-1.1B-Chat-v1.0](https://huggingface.co/TinyLlama/TinyLlama-1.1B-Chat-v1.0) on the tweet sentiment dataset [cardiffnlp/tweet_eval](https://huggingface.co/datasets/cardiffnlp/tweet_eval) to classify a tweet's tone into one of three classes: positive, neutral, or negative.
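
As a minimal sketch, the 4-bit base model and the dataset can be loaded as follows; the exact quantization settings (NF4, bfloat16 compute) are assumptions rather than values recorded for this run:

```python
import torch
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

base_id = "TinyLlama/TinyLlama-1.1B-Chat-v1.0"

# 4-bit NF4 quantization via bitsandbytes (the usual QLoRA setup);
# these particular settings are assumptions, not the documented run config.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_id)
model = AutoModelForCausalLM.from_pretrained(
    base_id,
    quantization_config=bnb_config,
    device_map="auto",
)

# The "sentiment" config of tweet_eval uses labels 0=negative, 1=neutral, 2=positive.
dataset = load_dataset("cardiffnlp/tweet_eval", "sentiment")
```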

### Model Description

We used a system prompt to instruct the model:

**SYSTEM PROMPT:**

You are a tweet sentiment classifier. For each tweet input, analyze its sentiment and output exactly one word: "negative", "neutral", or "positive". Do not include any extra text.

The base model, however, is not trained to return only the sentiment label, which is what the fine-tuning addresses.
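
For illustration, a prompt can be assembled with the tokenizer's chat template and the label decoded greedily. This is a hedged sketch that reuses the `model` and `tokenizer` from the loading snippet above; the `predict_label` helper and the decoding settings are our assumptions:

```python
SYSTEM_PROMPT = (
    "You are a tweet sentiment classifier. For each tweet input, analyze its "
    'sentiment and output exactly one word: "negative", "neutral", or '
    '"positive". Do not include any extra text.'
)

def predict_label(tweet: str) -> str:
    """Hypothetical helper: classify a single tweet with the fine-tuned model."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": tweet},
    ]
    input_ids = tokenizer.apply_chat_template(
        messages, add_generation_prompt=True, return_tensors="pt"
    ).to(model.device)
    # Greedy decoding; a handful of new tokens suffices for a one-word label.
    output = model.generate(input_ids, max_new_tokens=8, do_sample=False)
    return tokenizer.decode(
        output[0][input_ids.shape[-1]:], skip_special_tokens=True
    ).strip()
```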

The PEFT method applied was QLoRA. The optimizer state can be offloaded to the CPU when GPU memory is limited, at the cost of slower training.

## Training Details

The PEFT framework was applied to the model: the k_proj and v_proj attention projections were adapted with LoRA linear layers (rank=8, alpha=16).
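
A minimal sketch of that adapter setup with peft; everything beyond the stated target modules, rank, and alpha (the preparation step, task type, library defaults) is an assumption:

```python
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Standard preparation step when training on a k-bit quantized model.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=8,                                  # LoRA rank, as stated above
    lora_alpha=16,                        # LoRA scaling factor
    target_modules=["k_proj", "v_proj"],  # only the K/V attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()
```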

The model was trained for one epoch with trl's SFTTrainer, using batch_size=12, gradient_accumulation_steps=128, learning_rate=5e-4, and lr_scheduler_type="linear".
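
A sketch of that training run with trl's SFTConfig/SFTTrainer, using the hyperparameters above; the output directory, the paged optimizer (matching the CPU-offload note), and the assumption that the train split has already been rendered into chat-formatted text are ours:

```python
from trl import SFTConfig, SFTTrainer

training_args = SFTConfig(
    output_dir="tinyllama-tweet-sentiment",  # hypothetical path
    per_device_train_batch_size=12,
    gradient_accumulation_steps=128,
    learning_rate=5e-4,
    lr_scheduler_type="linear",
    num_train_epochs=1,
    optim="paged_adamw_32bit",  # paged optimizer states ease GPU memory pressure
)

trainer = SFTTrainer(
    model=model,
    args=training_args,
    # Assumes the train split has been formatted into prompt/response text.
    train_dataset=dataset["train"],
)
trainer.train()
```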

The fine-tuned model achieved a macro F1-score of 0.25 on the test split, compared with 0.03 for the base model.
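
A minimal sketch of how that score can be computed, reusing the hypothetical `predict_label` helper sketched earlier:

```python
from sklearn.metrics import f1_score

id2label = {0: "negative", 1: "neutral", 2: "positive"}

test_split = dataset["test"]
y_true = [id2label[i] for i in test_split["label"]]
y_pred = [predict_label(text) for text in test_split["text"]]

print(f1_score(y_true, y_pred, average="macro"))
```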

### Comparison

==========

User Prompt: "Ben Smith / Smith (concussion) remains out of the lineup Thursday, Curtis #NHL #SJ"

Label: neutral

After fine-tuning: neutral

==========

User Prompt: "@user Alciato: Bee will invest 150 million in January, another 200 in the Summer and plans to bring Messi by 2017"

Label: positive

After fine-tuning: neutral

#### Summary

QLoRA fine-tuning let us fine-tune the quantized model efficiently and makes it feasible to fine-tune larger models within the same memory budget.