Commit · 76512ea
Parent(s): 76474f3
Update README.md to include new streaming functionality, enhance the usage examples, and improve the generation parameters. Added a section for simple non-streaming usage and clarified limitations and future improvements for FinSight AI.
README.md CHANGED
@@ -10,7 +10,10 @@ tags:
 - qlora
 - financial-advice
 - text-generation
-
+- peft
+- lora
+- adapter
+inference: false
 model-index:
 - name: FinSight AI
   results:
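Note on the metadata change: the added `inference: false` flag turns off the Hub's hosted inference widget for this repository, which fits a PEFT adapter that must be loaded on top of its base model (see the usage hunks below).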
@@ -52,7 +55,7 @@ base_model: HuggingFaceTB/SmolLM2-1.7B-Instruct
 
 
 <div align="center">
-<h3><a href="https://github.com/zahemen9900/Datasets-for-Finsight/blob/97d7cacfff62e7b6099ef3bb0af9cf3d044a5b35/metrics/model_paper.md"
+<h3><a href="https://github.com/zahemen9900/Datasets-for-Finsight/blob/97d7cacfff62e7b6099ef3bb0af9cf3d044a5b35/metrics/model_paper.md">Read Model Paper</a></h3>
 </div>
 
 ## Model Details
@@ -72,10 +75,13 @@ Our evaluation demonstrates significant performance improvements across all stan
 
 ## Usage
 
+### Streaming function
+
 ```python
-from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig, TextIteratorStreamer
 import torch
 from peft import PeftModel
+import threading
 
 # For 4-bit quantized inference (recommended)
 bnb_config = BitsAndBytesConfig(
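The 4-bit path in this hunk assumes a CUDA GPU with `bitsandbytes` installed. For environments without it, a plain full-precision load should also work; a minimal sketch, not part of this commit, reusing the same checkpoints:

```python
# Fallback sketch (assumption, not part of this commit): skip the
# quantization_config and load the base model in bf16 or fp32 instead.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base_model = AutoModelForCausalLM.from_pretrained(
    "HuggingFaceTB/SmolLM2-1.7B-Instruct",
    torch_dtype=torch.bfloat16,  # use torch.float32 on CPUs without bf16 support
    device_map="auto",
)
model = PeftModel.from_pretrained(base_model, "zahemen9900/finsight-ai")
tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")
```

This trades the memory savings of NF4 quantization for fewer dependencies; a 1.7B base model fits comfortably in bf16 on most modern GPUs.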
@@ -96,44 +102,45 @@ base_model = AutoModelForCausalLM.from_pretrained(
 model = PeftModel.from_pretrained(base_model, "zahemen9900/finsight-ai")
 tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")
 
-outputs = model.generate(
-    inputs.input_ids,
-    max_new_tokens=512,
-    temperature=0.7,
-    top_p=0.95,
-    do_sample=True
-)
-response = tokenizer.decode(outputs[0], skip_special_tokens=True)
-print(response)
-```
-
-### For streaming options:
-
+device = 'cuda' if torch.cuda.is_available() else 'cpu'
+system_prompt = "You are Finsight, a finance bot trained to assist users with financial insights"
+prompt = "What's your name, and what're you good at?"
+
+messages = [
+    {"role": "system", "content": system_prompt},
+    {"role": "user", "content": prompt}
+]
+
+formatted_prompt = tokenizer.apply_chat_template(
+    messages, tokenize=False, add_generation_prompt=True
+)
+
-inputs = tokenizer(
+# Tokenize the formatted prompt
+inputs = tokenizer(formatted_prompt, return_tensors="pt")
+inputs = {k: v.to(device) for k, v in inputs.items()}  # Move all tensors to device
 
 # Create a streamer
-streamer = TextIteratorStreamer(tokenizer, timeout=
+streamer = TextIteratorStreamer(tokenizer, timeout=20.0, skip_prompt=True, skip_special_tokens=True)
 
+# Adjust generation parameters for more controlled responses
+generation_config = {
+    "max_new_tokens": 256,
-    "temperature": 0.7,
+    "temperature": 0.6,
     "top_p": 0.95,
     "do_sample": True,
+    "pad_token_id": tokenizer.eos_token_id,
+    "eos_token_id": tokenizer.eos_token_id,
+    "repetition_penalty": 1.2,
+    "no_repeat_ngram_size": 4,
+    "num_beams": 1,
+    "early_stopping": False,
+    "length_penalty": 1.0,
 }
 
+# Combine inputs and generation config for the generate function
+generation_kwargs = {**generation_config, "input_ids": inputs["input_ids"], "streamer": streamer}
+
+# Start generation in a separate thread
 thread = threading.Thread(target=model.generate, kwargs=generation_kwargs)
 thread.start()
@@ -141,7 +148,63 @@ thread.start()
 print("Response: ", end="")
 for text in streamer:
     print(text, end="", flush=True)
+```
+
+### Simple Non-Streaming Usage
+
+If you prefer a simpler approach without streaming:
 
+```python
+from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
+import torch
+from peft import PeftModel
+
+# For 4-bit quantized inference
+bnb_config = BitsAndBytesConfig(
+    load_in_4bit=True,
+    bnb_4bit_use_double_quant=True,
+    bnb_4bit_quant_type="nf4",
+    bnb_4bit_compute_dtype=torch.bfloat16
+)
+
+# Load base model with quantization
+base_model = AutoModelForCausalLM.from_pretrained(
+    "HuggingFaceTB/SmolLM2-1.7B-Instruct",
+    quantization_config=bnb_config,
+    device_map="auto"
+)
+
+# Load adapter weights (LoRA)
+model = PeftModel.from_pretrained(base_model, "zahemen9900/finsight-ai")
+tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/SmolLM2-1.7B-Instruct")
+
+# Prepare input
+system_prompt = "You are Finsight, a finance bot trained to assist users with financial insights"
+user_prompt = "What's a good strategy for long-term investing?"
+
+messages = [
+    {"role": "system", "content": system_prompt},
+    {"role": "user", "content": user_prompt}
+]
+
+formatted_prompt = tokenizer.apply_chat_template(
+    messages, tokenize=False, add_generation_prompt=True
+)
+
+inputs = tokenizer(formatted_prompt, return_tensors="pt").to(model.device)
+
+# Generate response
+outputs = model.generate(
+    inputs.input_ids,
+    max_new_tokens=256,
+    temperature=0.7,
+    top_p=0.95,
+    do_sample=True,
+    repetition_penalty=1.2
+)
+
+response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+print("Response:\n", response.strip())
 ```
 
 ## Training Details
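Two practical notes on the examples this commit adds. In the non-streaming snippet, `tokenizer.decode(outputs[0], skip_special_tokens=True)` returns the chat-template text (system and user turns) together with the answer, so callers may want to slice off the prompt tokens first. And the streaming pattern packages naturally into a reusable function; a minimal sketch under the setup above, where `ask_finsight` is a hypothetical helper name rather than part of the commit:

```python
import threading

from transformers import TextIteratorStreamer

def ask_finsight(model, tokenizer, prompt,
                 system_prompt="You are Finsight, a finance bot trained to assist users with financial insights"):
    """Hypothetical helper: stream a reply to stdout and return it as a string."""
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": prompt},
    ]
    formatted = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
    inputs = tokenizer(formatted, return_tensors="pt").to(model.device)

    # skip_prompt=True keeps the chat-template text out of the stream, which
    # also avoids the decode-includes-prompt issue of the non-streaming path.
    streamer = TextIteratorStreamer(tokenizer, timeout=20.0, skip_prompt=True, skip_special_tokens=True)
    generation_kwargs = dict(
        input_ids=inputs["input_ids"],
        attention_mask=inputs["attention_mask"],  # silences the missing-mask warning
        streamer=streamer,
        max_new_tokens=256,
        do_sample=True,
        temperature=0.6,
        top_p=0.95,
        repetition_penalty=1.2,
        pad_token_id=tokenizer.eos_token_id,
    )
    thread = threading.Thread(target=model.generate, kwargs=generation_kwargs)
    thread.start()

    chunks = []
    for text in streamer:
        print(text, end="", flush=True)
        chunks.append(text)
    thread.join()
    return "".join(chunks)
```

Called as `answer = ask_finsight(model, tokenizer, "How should I start budgeting?")`, it prints tokens as they arrive and returns the assembled reply.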
@@ -174,6 +237,7 @@ The model was trained using the following configuration:
 - **Language Limitations**: While optimized for English financial terminology, the model may have reduced performance with non-English financial terms or concepts specific to regional markets.
 - **Regulatory Compliance**: The model is not updated with the latest financial regulations across different jurisdictions and cannot ensure compliance with local financial laws.
 - **Complexity Handling**: May struggle with highly complex or niche financial scenarios that were underrepresented in the training data.
+- **Size of Dataset**: The size of the dataset appears to be a significant bottleneck in the fine-tuning process, as we observed the model's inability to generate very useful content for niche or extremely specific topics.
 
 ## Future Improvements
 
@@ -181,6 +245,7 @@
 - **Domain-Specific Fine-tuning**: Additional training on specialized financial domains like cryptocurrency, derivatives trading, and international tax regulations.
 - **Multilingual Support**: Expanding capabilities to handle financial terminology and concepts across multiple languages and markets.
 - **Personalization Framework**: Developing mechanisms to better contextualize responses based on stated user preferences while maintaining privacy.
+- **A larger, higher-quality dataset**: The model already shows promising results on the relatively small dataset it was trained on (16.5M tokens), which suggests that a larger high-quality dataset would yield further gains in future fine-tuning runs. Steps will be taken to address this in a future version of the model.
 
 ## Citation
 