Spaces:

lhoestq
/

test-smollm

Runtime error

App Files Files Community

test-smollm / README.md

lhoestq HF Staff

Update README.md

0d732cb verified 4 months ago

preview code

raw

history blame contribute delete

2.46 kB

	---
	title: Test Smollm
	emoji: 🏆
	colorFrom: blue
	colorTo: gray
	sdk: docker
	pinned: false
	---

	# Fine-tuning

	## SmolLM2 Instruct

	We build the SmolLM2 Instruct family by finetuning the base 1.7B on [SmolTalk](https://huggingface.co/datasets/HuggingFaceTB/smoltalk) and the base 360M and 135M models on [Smol-smoltalk](https://huggingface.co/datasets/HuggingFaceTB/smol-smoltalk) using `TRL` and the alignement handbook and then doing DPO on [UltraFeedBack](https://huggingface.co/datasets/openbmb/UltraFeedback). You can find the scipts and instructions for dohere: https://github.com/huggingface/alignment-handbook/tree/main/recipes/smollm2#instructions-to-train-smollm2-17b-instruct

	## Custom script
	Here, we provide a simple script for finetuning SmolLM2. In this case, we fine-tune the base 1.7B on python data.

	### Setup

	Install `pytorch` [see documentation](https://pytorch.org/), and then install the requirements
	```bash
	pip install -r requirements.txt
	```

	Before you run any of the scripts make sure you are logged in `wandb` and HuggingFace Hub to push the checkpoints, and you have `accelerate` configured:
	```bash
	wandb login
	huggingface-cli login
	accelerate config
	```
	Now that everything is done, you can clone the repository and get into the corresponding directory.

	```bash
	git clone https://github.com/huggingface/smollm
	cd smollm/finetune
	```

	### Training
	To fine-tune efficiently with a low cost, we use [PEFT](https://github.com/huggingface/peft) library for Low-Rank Adaptation (LoRA) training. We also use the `SFTTrainer` from [TRL](https://github.com/huggingface/trl).

	For this example, we will fine-tune SmolLM1-1.7B on the `Python` subset of [the-stack-smol](https://huggingface.co/datasets/bigcode/the-stack-smol). This is just for illustration purposes.

	To launch the training:
	```bash
	accelerate launch train.py \
	--model_id "HuggingFaceTB/SmolLM2-1.7B" \
	--dataset_name "bigcode/the-stack-smol" \
	--subset "data/python" \
	--dataset_text_field "content" \
	--split "train" \
	--max_seq_length 2048 \
	--max_steps 5000 \
	--micro_batch_size 1 \
	--gradient_accumulation_steps 8 \
	--learning_rate 3e-4 \
	--warmup_steps 100 \
	--num_proc "$(nproc)"
	```

	If you want to fine-tune on other text datasets, you need to change `dataset_text_field` argument to the name of the column containing the code/text you want to train on.