Update README.md
README.md
---
library_name: transformers
tags:
- finance
- greek
license: llama3.1
language:
- el
base_model:
- ilsp/Llama-Krikri-8B-Instruct
---

# Model Card for plutus-8B-instruct

This model card describes plutus-8B-instruct, a model developed by The Fin AI and fine-tuned to serve Greek-centric financial language tasks. The model leverages parameter-efficient fine-tuning (PEFT) via LoRA and is designed to generate and understand financial texts in Greek. Training was carried out on a local command-line backend, with logging managed via TensorBoard.

- **Developed by:** NaCTeM, The Fin AI
- **Funded by:** NaCTeM, Archimedes RC
- **Model type:** Instruction-tuned large language model, specialized for low-resource domains such as Greek finance
- **Language(s) (NLP):** Greek
- **License:** Llama 3.1 Community License
- **Finetuned from model:** ilsp/Llama-Krikri-8B-Instruct

### Model Sources

- **Repository:** https://huggingface.co/TheFinAI/plutus-8B-instruct
- **Paper:** https://arxiv.org/abs/2502.18772
- **Demo:** https://huggingface.co/spaces/TheFinAI/plutus-8B-instruct

## Uses

### Direct Use

Plutus-8B-instruct can be directly applied in Greek finance applications such as answering user queries, summarizing financial reports, or generating context-aware financial planning text. Users should ensure that input texts align with the financial domain to achieve optimal performance.
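
For direct use, the model can be loaded with the Transformers library. Below is a minimal sketch; the Greek prompt is invented for illustration and the generation settings are examples, not official recommendations.

```python
# Minimal inference sketch; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheFinAI/plutus-8B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the model was trained in bf16
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Εξήγησε με απλά λόγια τι είναι το μέρισμα."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```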

### Downstream Use

The model can be integrated into larger systems such as chatbots, recommendation systems, or data analysis pipelines focusing on Greek financial markets. Fine-tuning on domain-specific datasets might further improve performance for specialized tasks.

## Training Details

### Training Procedure

The finetuning process used parameter-efficient fine-tuning (PEFT) with LoRA. The base model was ilsp/Llama-Krikri-8B-Instruct, and the training was executed on a local CLI backend with logs monitored via TensorBoard.
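
As a rough illustration of this setup, a LoRA adapter can be attached to the base model with the `peft` library as sketched below; the rank, alpha, dropout, and target modules are placeholders, not the values actually used for plutus-8B-instruct.

```python
# Illustrative PEFT/LoRA setup; all hyperparameter values below are placeholders,
# not the configuration used to train plutus-8B-instruct.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")

lora_config = LoraConfig(
    r=16,                                                     # placeholder rank
    lora_alpha=32,                                            # placeholder scaling
    lora_dropout=0.05,                                        # placeholder dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```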

#### Preprocessing

Inputs were tokenized with a specialized tokenizer configured for the Greek language, and a chat template was applied to structure the conversational data accordingly. Padding was applied to the right as per the configuration.
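
A sketch of this preprocessing step is shown below; the short conversation is invented for illustration and the maximum length is a placeholder.

```python
# Sketch of the preprocessing described above: chat template plus right padding.
# The conversation and max_length are illustrative placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
tokenizer.padding_side = "right"          # padding applied to the right
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

conversation = [
    {"role": "user", "content": "Τι είναι ο πληθωρισμός;"},
    {"role": "assistant", "content": "Ο πληθωρισμός είναι η γενική αύξηση των τιμών."},
]

# Apply the chat template first, then tokenize with padding and truncation.
text = tokenizer.apply_chat_template(conversation, tokenize=False)
batch = tokenizer(
    [text], padding="max_length", max_length=512, truncation=True, return_tensors="pt"
)
print(batch["input_ids"].shape)
```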

#### Training Hyperparameters

- **Gradient accumulation:** 4
- **Mixed precision:** bf16
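
Expressed as Transformers `TrainingArguments`, the two settings listed above map onto the sketch below; every other value is a placeholder rather than the configuration actually used.

```python
# Only gradient_accumulation_steps and bf16 come from this card;
# all other values are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="plutus-8b-instruct-lora",  # placeholder
    per_device_train_batch_size=2,         # placeholder
    gradient_accumulation_steps=4,         # from this card
    bf16=True,                             # mixed precision, from this card
    num_train_epochs=1,                    # placeholder
    logging_dir="runs",                    # TensorBoard logging, as described above
    report_to="tensorboard",
)
```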

#### Speeds, Sizes, Times

Training was performed on locally available hardware. Specific details on throughput, training runtime, and checkpoint sizes will be provided in subsequent documentation.

## Evaluation

### Results

The model was evaluated across multiple Greek financial NLP tasks. **Bold** values denote the highest score in each column, while _italicized_ values indicate the second-highest.

| Model | GRFinNUM (Entity F1) | GRFinNER (Entity F1) | GRFinQA (Acc) | GRFNS-2023 (Rouge-1) | GRMultiFin (Acc) | Mean |
|---------------|----------------------|----------------------|---------------|----------------------|------------------|-------|
| **Open-source Small Models** | | | | | | |
| LLaMA-3.2-1B | 0.00 | 0.00 | 0.29 | 0.14 | 0.39 | 0.16 |
| LLaMA-3-8b | 0.00 | 0.13 | 0.33 | 0.07 | 0.70 | 0.25 |
| LLaMA-3.1-8b | 0.10 | 0.21 | 0.40 | 0.20 | 0.54 | 0.29 |
| Qwen2.5-1.5B | 0.00 | 0.00 | 0.36 | 0.02 | 0.31 | 0.14 |
| Qwen2.5-7B | 0.00 | 0.13 | 0.43 | 0.07 | 0.54 | 0.23 |
| Gemma-2-2B | 0.00 | 0.16 | 0.22 | 0.03 | 0.41 | 0.16 |
| Gemma-2-9B | 0.02 | 0.05 | 0.31 | 0.06 | 0.61 | 0.21 |
| Mistral-7B | 0.00 | 0.00 | 0.30 | 0.14 | 0.39 | 0.17 |
| **Open-source Large Models** | | | | | | |
| Deepseek-V3 | 0.07 | 0.00 | 0.50 | **0.38** | 0.61 | 0.31 |
| LLaMA-3-70B | 0.05 | 0.45 | 0.60 | 0.08 | 0.61 | 0.36 |
| Qwen2.5-32B | 0.37 | 0.55 | 0.60 | 0.10 | _0.70_ | 0.47 |
| Qwen2.5-72B | _0.32_ | 0.39 | _0.74_ | 0.04 | **0.72** | 0.44 |
| Gemma-2-27B | 0.18 | 0.18 | 0.25 | 0.09 | 0.61 | 0.26 |
| **Proprietary Models** | | | | | | |
| GPT-3.5-Turbo | 0.14 | 0.30 | 0.51 | 0.31 | 0.50 | 0.35 |
| GPT-4o-Mini | 0.25 | 0.30 | 0.12 | _0.36_ | 0.59 | 0.32 |
| GPT-4o | 0.09 | 0.31 | **0.78** | 0.26 | 0.59 | 0.41 |
| GPT-4 | _0.28_ | _0.60_ | 0.71 | **0.38** | 0.63 | **0.52** |
| **English Financial Models** | | | | | | |
| Finma-7B | 0.00 | 0.00 | 0.25 | 0.11 | 0.35 | 0.14 |
| FinLLaMA-8B | 0.00 | 0.00 | 0.28 | 0.03 | 0.38 | 0.14 |
| **Greek General Models** | | | | | | |
| Meltemi-7B | 0.12 | 0.50 | 0.48 | 0.19 | 0.43 | 0.34 |
| Llama-Kríkri-8B | 0.19 | 0.45 | 0.57 | 0.22 | 0.39 | 0.36 |
| **Greek Financial Models** | | | | | | |
| Plutus-8B | **0.70** | **0.57** | _0.64_ | _0.34_ | **0.72** | **0.60** |

#### Summary
In summary, plutus-8B-instruct is a domain-specific, instruction-tuned large language model optimized for Greek finance applications. Although it performs well in targeted scenarios, users are advised to validate its outputs carefully.

## Citation

**BibTeX:**

**APA:**

Peng, X., Papadopoulos, T., Soufleri, E., Giannouris, P., Xiang, R., Wang, Y., Qian, L., Huang, J., Xie, Q., & Ananiadou, S. (2025). Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance. arXiv preprint arXiv:2502.18772. https://arxiv.org/abs/2502.18772

## Glossary

- PEFT: Parameter-Efficient Fine-Tuning.
- LoRA: Low-Rank Adaptation, a technique to reduce the number of trainable parameters.
- BF16: bfloat16, a mixed precision format used to optimize training speed.
- Int4 Quantization: A lower precision format aimed at reducing model size and inference latency.
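
As an illustration of the Int4 and BF16 entries above, the model could be loaded with 4-bit weights and bf16 compute via `bitsandbytes`, as sketched below. This is an optional inference-time choice, not part of the published training setup.

```python
# Illustrative 4-bit (int4) loading with bf16 compute; requires the
# bitsandbytes package and is an optional inference-time choice.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TheFinAI/plutus-8B-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```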

## More Information

For more details regarding training logs, dataset preparations, and further technical insights, please refer to the associated GitHub repositories and documentation provided by The Fin AI.

## Model Card Authors

The model card was prepared by The Fin AI with inputs from the Hugging Face community.
## Model Card Contact
For additional questions or feedback, please contact The Fin AI.