damndeepesh committed · Commit e926fee · verified · 1 Parent(s): df0c3e3

Uploaded Files from System
Files changed (3):
  1. README.md         +64 -17
  2. app.py            +255 -0
  3. requirements.txt  +8 -3
README.md CHANGED
@@ -1,20 +1,67 @@
- ---
- title: LoraFineTuningForApple
- emoji: 🚀
- colorFrom: red
- colorTo: red
- sdk: docker
- app_port: 8501
- tags:
- - streamlit
- pinned: false
- short_description: Fine Tuning Lora on apple M series Devices
- license: mit
- ---
-
- # Welcome to Streamlit!
-
- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:
-
- If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
- forums](https://discuss.streamlit.io).
+ # LoRA Fine-Tuning and CoreML Conversion with Streamlit
+
+ This project demonstrates how to fine-tune a large language model (LLM) using Low-Rank Adaptation (LoRA) and convert it to the CoreML format for on-device deployment. The entire process is wrapped in a user-friendly Streamlit web application.
+
+ ## Features
+
+ - **Fine-Tune LLMs with LoRA:** Easily fine-tune the `distilbert/distilgpt2` model on the `roneneldan/TinyStories` dataset using Parameter-Efficient Fine-Tuning (PEFT) with LoRA.
+ - **Text Generation:** Generate creative stories from a text prompt using the fine-tuned model.
+ - **Adjustable Generation Parameters:** Control text generation with parameters like temperature, max length, and repetition penalty.
+ - **CoreML Conversion:** Convert the fine-tuned model to an `.mlpackage` file, ready for integration into Apple-ecosystem applications.
+ - **Interactive Web UI:** A simple and interactive user interface built with Streamlit.
+
+ ## How it Works
+
+ The application follows a simple workflow:
+
+ 1. **Load Base Model:** It starts by loading the pre-trained `distilbert/distilgpt2` model and its tokenizer from the Hugging Face Hub.
+ 2. **Fine-Tuning:** The user can initiate the fine-tuning process. The application uses the `peft` library to apply LoRA to the base model and trains it on the `TinyStories` dataset. The resulting LoRA adapter is saved locally (a minimal sketch of steps 1 to 3 follows this list).
+ 3. **Text Generation:** Once the model is fine-tuned (or a pre-existing adapter is loaded), the user can provide a prompt to generate stories.
+ 4. **CoreML Conversion:** The application can merge the LoRA adapter with the base model and then convert the merged model into the CoreML format, which can be downloaded as a `.zip` file (a conversion sketch follows the Technologies Used section below).
+
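+ As a rough illustration of steps 1 to 3, the core of the fine-tuning path looks something like the sketch below. It is a trimmed approximation of what `app.py` does, not a drop-in replacement; in particular, the `train[:1%]` slice is an assumption made here to keep the example quick, where the app instead trains on the full split and caps training with `max_steps`.
+
+ ```python
+ from datasets import load_dataset
+ from peft import LoraConfig, get_peft_model
+ from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
+
+ MODEL_NAME = "distilbert/distilgpt2"
+ tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+ tokenizer.pad_token = tokenizer.eos_token  # GPT-2 tokenizers ship without a pad token
+ model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
+
+ # Step 2: wrap the base model with low-rank adapters; only the adapters train
+ model = get_peft_model(model, LoraConfig(r=4, lora_alpha=16, task_type="CAUSAL_LM"))
+
+ def tokenize(batch):
+     out = tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)
+     out["labels"] = out["input_ids"].copy()  # causal LM: labels mirror the inputs
+     return out
+
+ # Assumption: a 1% slice keeps this sketch fast; app.py uses the full split
+ train = load_dataset("roneneldan/TinyStories", split="train[:1%]").map(
+     tokenize, batched=True, remove_columns=["text"]
+ )
+
+ Trainer(
+     model=model,
+     args=TrainingArguments(output_dir="./results", max_steps=100, per_device_train_batch_size=1),
+     train_dataset=train,
+ ).train()
+ model.save_pretrained("distilgpt2-lora-tinystories")  # writes only the small adapter
+
+ # Step 3 in miniature: generate from the adapted model
+ prompt = tokenizer("Once upon a time", return_tensors="pt")
+ print(tokenizer.decode(model.generate(**prompt, max_length=60, do_sample=True)[0]))
+ ```
+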
+ ## Technologies Used
+
+ - **Model:** `distilbert/distilgpt2` from Hugging Face
+ - **Dataset:** `roneneldan/TinyStories` from Hugging Face
+ - **Fine-Tuning:** `peft` (Parameter-Efficient Fine-Tuning) library with LoRA
+ - **Framework:** PyTorch
+ - **Web App:** Streamlit
+ - **Model Conversion:** CoreMLTools
+ - **Core Libraries:** `transformers`, `datasets`, `accelerate`, `sentencepiece`
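+
+ On the conversion side (step 4 of the workflow above), the app merges the adapter back into the base weights, traces the merged model, and hands the trace to CoreMLTools. Below is a hedged sketch of that path under the same fixed-shape assumption `app.py` makes: the `(1, 512)` input means callers must pad token ids out to length 512, and `LogitsOnly` is this sketch's stand-in for the wrapper class in `app.py`.
+
+ ```python
+ import coremltools as ct
+ import numpy as np
+ import torch
+ from peft import PeftModel
+ from transformers import AutoModelForCausalLM
+
+ base = AutoModelForCausalLM.from_pretrained("distilbert/distilgpt2")
+ peft_model = PeftModel.from_pretrained(base, "distilgpt2-lora-tinystories")
+ merged = peft_model.merge_and_unload().cpu().eval()  # fold the adapter into the base weights
+
+ class LogitsOnly(torch.nn.Module):
+     """torch.jit.trace needs tensor outputs, not a Hugging Face ModelOutput."""
+     def __init__(self, m):
+         super().__init__()
+         self.m = m
+
+     def forward(self, input_ids):
+         return self.m(input_ids).logits
+
+ example = torch.zeros((1, 512), dtype=torch.long)  # trace at the fixed window size
+ with torch.no_grad():
+     traced = torch.jit.trace(LogitsOnly(merged), example)
+
+ mlmodel = ct.convert(
+     traced,
+     convert_to="mlprogram",  # produces an .mlpackage directory
+     inputs=[ct.TensorType(name="input_ids", shape=(1, 512), dtype=np.int32)],
+ )
+ mlmodel.save("distilgpt2-lora-tinystories.mlpackage")
+ ```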
+
+ ## Setup and Usage
+
+ 1. **Clone the repository:**
+    ```bash
+    git clone <repository-url>
+    cd <repository-directory>
+    ```
+
+ 2. **Create a virtual environment and install dependencies:**
+    ```bash
+    python3 -m venv venv
+    source venv/bin/activate
+    pip install -r requirements.txt
+    ```
+
+ 3. **Run the Streamlit application:**
+    ```bash
+    streamlit run app.py
+    ```
+
+ 4. **Open the application in your browser:**
+    Navigate to the URL provided by Streamlit (usually `http://localhost:8501`).
+
+ ## File Structure
+
+ - `app.py`: The main Python script containing the Streamlit application logic.
+ - `requirements.txt`: A list of the Python packages required to run the project.
+ - `README.md`: This file, providing information about the project.
+ - `distilgpt2-lora-tinystories/`: (Generated) Stores the LoRA adapter after fine-tuning.
+ - `results/`: (Generated) Used by the `transformers.Trainer` to save training outputs.
+ - `distilgpt2-lora-tinystories.mlpackage/`: (Generated) Created by the CoreML conversion.
+ - `distilgpt2-lora-tinystories.zip`: (Generated) The zipped CoreML model, ready for download (see the smoke-test sketch below).
+
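+ Once unzipped, the `.mlpackage` can be smoke-tested from Python before it is wired into an Xcode project. A minimal check might look like the sketch below (an assumption-laden example: prediction only runs on macOS, and the output key name is whatever CoreMLTools auto-generated during conversion, hence the loop):
+
+ ```python
+ import coremltools as ct
+ import numpy as np
+ from transformers import AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("distilbert/distilgpt2")
+ tokenizer.pad_token = tokenizer.eos_token
+
+ mlmodel = ct.models.MLModel("distilgpt2-lora-tinystories.mlpackage")
+
+ # The converted model expects a fixed (1, 512) int32 window, so pad the prompt out
+ ids = tokenizer(
+     "Once upon a time", return_tensors="np", padding="max_length", max_length=512
+ )["input_ids"].astype(np.int32)
+
+ prediction = mlmodel.predict({"input_ids": ids})
+ for name, logits in prediction.items():  # output name was auto-generated at trace time
+     print(name, np.asarray(logits).shape)  # expect (1, 512, 50257) logits
+ ```
+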
+ ---
+
+ Happy fine-tuning and story generating!
 
app.py ADDED
@@ -0,0 +1,255 @@
+ import streamlit as st
+ import torch
+ import numpy as np
+ from datasets import load_dataset
+ from peft import LoraConfig, get_peft_model, PeftModel
+ from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
+ import coremltools as ct
+ import os
+ import zipfile
+
+ MODEL_NAME = "distilbert/distilgpt2"
+ DATASET_NAME = "roneneldan/TinyStories"
+ ADAPTER_PATH = "distilgpt2-lora-tinystories"
+
+ @st.cache_resource
+ def load_base_model_and_tokenizer():
+     """Loads the base model and tokenizer."""
+     tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
+     # GPT-2 tokenizers have no pad token by default; reuse EOS for padding
+     if tokenizer.pad_token is None:
+         tokenizer.pad_token = tokenizer.eos_token
+     model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)
+     return model, tokenizer
+
+ def load_and_prepare_dataset(tokenizer, split="train"):
+     """Loads and tokenizes the dataset."""
+     dataset = load_dataset(DATASET_NAME, split=split)
+
+     def tokenize_function(examples):
+         tokenized = tokenizer(
+             examples["text"],
+             truncation=True,
+             padding="max_length",
+             max_length=256,
+         )
+         # For causal language modeling, labels are the same as input_ids
+         tokenized["labels"] = tokenized["input_ids"].copy()
+         return tokenized
+
+     # Some dataset types (e.g. streaming) may not expose column_names
+     remove_cols = getattr(dataset, "column_names", None)
+
+     tokenized_dataset = dataset.map(
+         tokenize_function,
+         batched=True,
+         remove_columns=remove_cols,
+     )
+     return tokenized_dataset
+
+ def fine_tune_model(model, tokenizer, tokenized_dataset):
+     """Fine-tunes the model using LoRA."""
+     lora_config = LoraConfig(
+         r=4,
+         lora_alpha=16,
+         lora_dropout=0.1,
+         bias="none",
+         task_type="CAUSAL_LM",
+     )
+
+     peft_model = get_peft_model(model, lora_config)
+     peft_model.print_trainable_parameters()
+
+     training_args = TrainingArguments(
+         output_dir="./results",
+         num_train_epochs=0.5,
+         per_device_train_batch_size=1,
+         per_device_eval_batch_size=1,
+         gradient_accumulation_steps=4,
+         logging_steps=10,
+         save_steps=100,
+         eval_steps=50,
+         warmup_steps=10,
+         fp16=torch.cuda.is_available(),
+         dataloader_pin_memory=False,
+         remove_unused_columns=False,
+         # max_steps takes precedence over num_train_epochs, so this demo run
+         # stops after 100 optimizer steps
+         max_steps=100,
+     )
+
+     trainer = Trainer(
+         model=peft_model,
+         args=training_args,
+         train_dataset=tokenized_dataset,
+     )
+
+     trainer.train()
+     peft_model.save_pretrained(ADAPTER_PATH)
+     return peft_model
+
+ def convert_to_coreml(model, tokenizer):
+     """Converts the model to CoreML format."""
+     st.info("Merging LoRA adapter...")
+     merged_model = model.merge_and_unload()
+     st.success("Adapter merged.")
+
+     st.info("Moving model to CPU for CoreML conversion...")
+     merged_model = merged_model.cpu()
+     merged_model.eval()
+     st.success("Model moved to CPU.")
+
+     # Wrapper that returns only logits: torch.jit.trace cannot handle the
+     # dict-like ModelOutput a Hugging Face model normally returns
+     class SimpleModel(torch.nn.Module):
+         def __init__(self, model):
+             super().__init__()
+             self.model = model
+
+         def forward(self, input_ids):
+             outputs = self.model(input_ids)
+             return outputs.logits
+
+     simple_model = SimpleModel(merged_model)
+     st.info("Created simple model wrapper.")
+
+     st.info("Tracing the model...")
+     example_input = tokenizer("Once upon a time", return_tensors="pt")
+     input_ids = example_input.input_ids.cpu()  # ensure the trace input is on CPU
+
+     with torch.no_grad():
+         traced_model = torch.jit.trace(simple_model, input_ids)
+     st.success("Model traced.")
+
+     st.info("Converting to CoreML ML Program...")
+     # The converted model takes a fixed (1, 512) token window, so downstream
+     # callers must pad or truncate token ids to length 512. The trace above
+     # used a short prompt and relies on the graph generalizing to this shape.
+     coreml_model = ct.convert(
+         traced_model,
+         convert_to="mlprogram",
+         inputs=[ct.TensorType(name="input_ids", shape=(1, 512), dtype=np.int32)],
+         compute_units=ct.ComputeUnit.CPU_ONLY,
+     )
+     st.success("Conversion to CoreML complete.")
+
+     output_path = f"{ADAPTER_PATH}.mlpackage"
+     # ct.convert returns an MLModel, which saves directly to an .mlpackage
+     coreml_model.save(output_path)
+     return output_path
+
+
+ def main():
+     st.title("LoRA Fine-Tuning of distilgpt2 for TinyStories")
+     st.write("This app fine-tunes the `distilbert/distilgpt2` model on the `TinyStories` dataset using LoRA and PEFT.")
+
+     # --- Load Model and Tokenizer ---
+     with st.spinner("Loading base model and tokenizer..."):
+         base_model, tokenizer = load_base_model_and_tokenizer()
+         st.session_state.base_model = base_model
+         st.session_state.tokenizer = tokenizer
+     st.success("Base model and tokenizer loaded.")
+     st.markdown(f"**Model:** `{MODEL_NAME}`")
+
+     # --- Fine-Tuning ---
+     st.header("1. LoRA Fine-Tuning")
+     if st.button("Start Fine-Tuning"):
+         with st.spinner("Loading dataset and fine-tuning... This might take a few minutes."):
+             tokenized_dataset = load_and_prepare_dataset(tokenizer)
+             st.session_state.tokenized_dataset = tokenized_dataset
+
+             # Streaming datasets have no len(), so report the size defensively
+             try:
+                 st.info(f"Dataset loaded with {len(tokenized_dataset)} examples.")
+             except (TypeError, AttributeError):
+                 st.info("Dataset loaded (length unknown).")
+
+             peft_model = fine_tune_model(base_model, tokenizer, tokenized_dataset)
+             st.session_state.peft_model = peft_model
+         st.success("Fine-tuning complete! LoRA adapter saved.")
+         st.balloons()
+
+     # Offer to load a previously saved adapter if one exists on disk
+     if os.path.exists(ADAPTER_PATH) and "peft_model" not in st.session_state:
+         if st.button("Load Fine-Tuned LoRA Adapter"):
+             with st.spinner("Loading fine-tuned model..."):
+                 peft_model = PeftModel.from_pretrained(base_model, ADAPTER_PATH)
+                 st.session_state.peft_model = peft_model
+             st.success("Fine-tuned LoRA model loaded.")
+
+     # --- Text Generation ---
+     if "peft_model" in st.session_state:
+         st.header("2. Generate Story")
+         prompt = st.text_input("Enter a prompt to start a story:", "Once upon a time, in a land full of sunshine,")
+
+         # Generation parameters
+         col1, col2, col3 = st.columns(3)
+         with col1:
+             temperature = st.slider("Temperature", 0.1, 2.0, 0.8, 0.1)
+         with col2:
+             max_length = st.slider("Max Length", 50, 200, 100, 10)
+         with col3:
+             repetition_penalty = st.slider("Repetition Penalty", 1.0, 2.0, 1.2, 0.1)
+
+         if st.button("Generate"):
+             with st.spinner("Generating text..."):
+                 model = st.session_state.peft_model
+                 inputs = tokenizer(prompt, return_tensors="pt")
+
+                 # Move the inputs to whichever device the model lives on
+                 device = next(model.parameters()).device
+                 inputs = {k: v.to(device) for k, v in inputs.items()}
+
+                 outputs = model.generate(
+                     **inputs,
+                     max_length=max_length,
+                     num_return_sequences=1,
+                     temperature=temperature,
+                     do_sample=True,
+                     top_k=50,
+                     top_p=0.9,
+                     repetition_penalty=repetition_penalty,
+                     pad_token_id=tokenizer.eos_token_id,
+                     eos_token_id=tokenizer.eos_token_id,
+                     no_repeat_ngram_size=3,
+                 )
+                 generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
+
+             st.write("### Generated Story:")
+             st.write(generated_text)
+
+         # --- CoreML Conversion ---
+         st.header("3. Convert to CoreML")
+         if st.button("Convert Model to CoreML"):
+             with st.spinner("Converting model to CoreML format..."):
+                 coreml_model_path = convert_to_coreml(st.session_state.peft_model, st.session_state.tokenizer)
+             st.success(f"Model successfully converted and saved to `{coreml_model_path}`")
+
+             # An .mlpackage is a directory, so zip it up for the download button
+             zip_path = f"{ADAPTER_PATH}.zip"
+             with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
+                 for root, dirs, files in os.walk(coreml_model_path):
+                     for file in files:
+                         file_path = os.path.join(root, file)
+                         arcname = os.path.relpath(file_path, coreml_model_path)
+                         zipf.write(file_path, arcname)
+
+             with open(zip_path, "rb") as f:
+                 st.download_button(
+                     label="Download CoreML Model",
+                     data=f,
+                     file_name=os.path.basename(zip_path),
+                     mime="application/zip",
+                 )
+
+ if __name__ == "__main__":
+     main()
requirements.txt CHANGED
@@ -1,3 +1,8 @@
- altair
- pandas
- streamlit
+ transformers
+ datasets
+ peft
+ streamlit
+ coremltools
+ torch
+ accelerate
+ sentencepiece