jiminHuang committed · Commit 88a625d · verified · 1 Parent(s): e103c70

Update README.md

Files changed (1):
  1. README.md +60 -52

README.md CHANGED
@@ -1,6 +1,13 @@
  ---
  library_name: transformers
- tags: []
  ---

  # Model Card for plutus-8B-instruct
@@ -13,19 +20,18 @@ This model is an instruction-tuned large language model specialized for Greek fi

  This model card describes plutus-8B-instruct, a model developed by The Fin AI and finetuned to serve Greek-centric financial language tasks. The model leverages parameter-efficient fine-tuning (PEFT) via LoRA and is designed to generate or understand financial texts in Greek. The training was carried out with a local command-line backend with logging managed via TensorBoard.

- - **Developed by:** The Fin AI
- - **Funded by [optional]:** [More Information Needed]
- - **Shared by [optional]:** [More Information Needed]
  - **Model type:** Instruction-tuned large language model, specialized for low-resource domains such as Greek finance
  - **Language(s) (NLP):** Greek
- - **License:** Apache License 2.0
  - **Finetuned from model [optional]:** ilsp/Llama-Krikri-8B-Instruct

- ### Model Sources [optional]

  - **Repository:** https://huggingface.co/TheFinAI/plutus-8B-instruct
- - **Paper [optional]:** Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance
- - **Demo [optional]:** https://huggingface.co/spaces/TheFinAI/plutus-8B-instruct

  ## Uses

@@ -33,7 +39,7 @@ This model card describes plutus-8B-instruct, a model developed by The Fin AI an

  Plutus-8B-instruct can be directly applied in Greek finance applications such as answering user queries, summarizing financial reports, or generating context-aware financial planning text. Users should ensure that input texts align with the financial domain to achieve optimal performance.

- ### Downstream Use [optional]

  The model can be integrated into larger systems such as chatbots, recommendation systems, or data analysis pipelines focusing on Greek financial markets. Fine-tuning on domain-specific datasets might further improve performance for specialized tasks.

@@ -78,7 +84,7 @@ The model was fine-tuned using the dataset available at https://huggingface.co/c

  The finetuning process used parameter-efficient fine-tuning (PEFT) with LoRA. The base model was ilsp/Llama-Krikri-8B-Instruct, and the training was executed on a local CLI backend with logs monitored via TensorBoard.

- #### Preprocessing [optional]

  Inputs were tokenized with a specialized tokenizer configured for the Greek language, and a chat template was applied to structure the conversational data accordingly. Padding was applied to the right as per the configuration.

@@ -100,7 +106,7 @@ Inputs were tokenized with a specialized tokenizer configured for the Greek lang
  - **Gradient accumulation:** 4
  - **Mixed precision:** bf16

- #### Speeds, Sizes, Times [optional]

  Training was performed on locally available hardware. Specific details on throughput, training runtime, and checkpoint sizes will be provided in subsequent documentation.

@@ -125,47 +131,49 @@ Evaluation metrics included standard language generation measures as well as dom

  ### Results

- Preliminary evaluations indicate that the model generates coherent and context-aware responses in Greek, with particular aptitude in understanding and generating financial-related texts. Detailed metric numbers and error analyses are under review and will be added when available.

  #### Summary

  In summary, plutus-8B-instruct is a domain-specific, instruction-tuned large language model optimized for Greek finance applications. Although it performs well in targeted scenarios, users are advised to validate its outputs carefully.

- ## Model Examination [optional]
-
- Future updates may include interpretability analyses, such as attention visualizations and performance breakdowns by subpopulation or domain.
-
- ## Environmental Impact
-
- Carbon emissions can be estimated using the Machine Learning Impact Calculator (https://mlco2.github.io/impact#compute) based on the following details:
-
- - **Hardware Type:** [More Information Needed]
- - **Hours used:** [More Information Needed]
- - **Cloud Provider:** [More Information Needed]
- - **Compute Region:** [More Information Needed]
- - **Carbon Emitted:** [More Information Needed]
-
- ## Technical Specifications [optional]
-
- ### Model Architecture and Objective
-
- The model architecture is derived from ilsp/Llama-Krikri-8B-Instruct and fine-tuned using LoRA adaptations. Its objective is to generate and comprehend Greek language texts with an emphasis on financial contexts in an instruction-based setting.
-
- ### Compute Infrastructure
-
- The model training and fine-tuning were executed using a local CLI environment, with logs monitored via TensorBoard.
-
- #### Hardware
-
- Specific hardware details are under documentation review. Future updates will include the configuration of GPUs/CPUs used.
-
- #### Software
-
- - **Framework:** Hugging Face Transformers
- - **Backend:** Local CLI
- - **Mixed Precision:** bf16
-
- ## Citation [optional]

  **BibTeX:**

@@ -185,21 +193,21 @@ Specific hardware details are under documentation review. Future updates will in

  Peng, X., Papadopoulos, T., Soufleri, E., Giannouris, P., Xiang, R., Wang, Y., Qian, L., Huang, J., Xie, Q., & Ananiadou, S. (2025). Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance. arXiv preprint arXiv:2502.18772. https://arxiv.org/abs/2502.18772

- ## Glossary [optional]

  - PEFT: Parameter-Efficient Fine-Tuning.
  - LoRA: Low-Rank Adaptation, a technique to reduce the number of trainable parameters.
  - BF16: bfloat16, a mixed precision format used to optimize training speed.
  - Int4 Quantization: A lower precision format aimed at reducing model size and inference latency.

- ## More Information [optional]

  For more details regarding training logs, dataset preparations, and further technical insights, please refer to the associated GitHub repositories and documentation provided by The Fin AI.

- ## Model Card Authors [optional]

- The model card was prepared by The Fin AI team with inputs from the Hugging Face community.

  ## Model Card Contact

- For additional questions or feedback, please contact The Fin AI team.

  ---
  library_name: transformers
+ tags:
+ - finance
+ - greek
+ license: llama3.1
+ language:
+ - el
+ base_model:
+ - ilsp/Llama-Krikri-8B-Instruct
  ---

  # Model Card for plutus-8B-instruct


  This model card describes plutus-8B-instruct, a model developed by The Fin AI and finetuned to serve Greek-centric financial language tasks. The model leverages parameter-efficient fine-tuning (PEFT) via LoRA and is designed to generate or understand financial texts in Greek. The training was carried out with a local command-line backend with logging managed via TensorBoard.

+ - **Developed by:** NaCTeM, The Fin AI
+ - **Funded by:** NaCTeM, Archimedes RC
  - **Model type:** Instruction-tuned large language model, specialized for low-resource domains such as Greek finance
  - **Language(s) (NLP):** Greek
+ - **License:** Llama 3.1 Community License
  - **Finetuned from model [optional]:** ilsp/Llama-Krikri-8B-Instruct

+ ### Model Sources

  - **Repository:** https://huggingface.co/TheFinAI/plutus-8B-instruct
+ - **Paper:** https://arxiv.org/abs/2502.18772
+ - **Demo:** https://huggingface.co/spaces/TheFinAI/plutus-8B-instruct

  ## Uses


  Plutus-8B-instruct can be directly applied in Greek finance applications such as answering user queries, summarizing financial reports, or generating context-aware financial planning text. Users should ensure that input texts align with the financial domain to achieve optimal performance.
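
A minimal inference sketch, assuming the published checkpoint ID and the standard Transformers chat-template API; the Greek prompt, bfloat16 dtype, and generation settings are illustrative assumptions, not documented defaults:

```python
# Hedged sketch: query plutus-8B-instruct through the chat template.
# The prompt, dtype, and max_new_tokens are assumed example values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheFinAI/plutus-8B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "user", "content": "Συνόψισε τους βασικούς κινδύνους μιας ετήσιας οικονομικής έκθεσης."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```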

+ ### Downstream Use

  The model can be integrated into larger systems such as chatbots, recommendation systems, or data analysis pipelines focusing on Greek financial markets. Fine-tuning on domain-specific datasets might further improve performance for specialized tasks.


  The finetuning process used parameter-efficient fine-tuning (PEFT) with LoRA. The base model was ilsp/Llama-Krikri-8B-Instruct, and the training was executed on a local CLI backend with logs monitored via TensorBoard.
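
As a rough illustration of this setup (hedged: the rank, alpha, dropout, and target modules below are assumptions, not the card's documented settings):

```python
# Sketch of a PEFT/LoRA configuration on the base model named above.
# r, lora_alpha, lora_dropout, and target_modules are assumed example values.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
lora_config = LoraConfig(
    r=16,                                  # assumed LoRA rank
    lora_alpha=32,                         # assumed scaling factor
    lora_dropout=0.05,                     # assumed dropout
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights train
```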

+ #### Preprocessing

  Inputs were tokenized with a specialized tokenizer configured for the Greek language, and a chat template was applied to structure the conversational data accordingly. Padding was applied to the right as per the configuration.
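
A hedged sketch of that preprocessing step, assuming the base model's tokenizer; the example conversation and max_length are invented for illustration:

```python
# Sketch: chat-template formatting with right padding, per the description above.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
tokenizer.padding_side = "right"            # padding applied to the right
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

conversation = [
    {"role": "user", "content": "Τι είναι το εταιρικό ομόλογο;"},
    {"role": "assistant", "content": "Το εταιρικό ομόλογο είναι τίτλος χρέους που εκδίδει μια επιχείρηση."},
]
text = tokenizer.apply_chat_template(conversation, tokenize=False)
batch = tokenizer(
    text, padding="max_length", max_length=512, truncation=True, return_tensors="pt"
)
```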

  - **Gradient accumulation:** 4
  - **Mixed precision:** bf16
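
The two documented values above map onto Hugging Face `TrainingArguments` roughly as sketched below; the output directory, batch size, learning rate, and epoch count are assumptions, not settings from the card:

```python
# Sketch: TrainingArguments reflecting the documented gradient accumulation,
# bf16 mixed precision, and TensorBoard logging; other values are assumed.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="plutus-8b-instruct-lora",   # assumed
    per_device_train_batch_size=2,          # assumed
    gradient_accumulation_steps=4,          # documented above
    bf16=True,                              # documented mixed precision
    learning_rate=2e-4,                     # assumed
    num_train_epochs=3,                     # assumed
    report_to="tensorboard",                # logging managed via TensorBoard
)
```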

+ #### Speeds, Sizes, Times

  Training was performed on locally available hardware. Specific details on throughput, training runtime, and checkpoint sizes will be provided in subsequent documentation.


  ### Results

+ The model was evaluated across multiple Greek financial NLP tasks. **Bold values** denote the highest scores, while _underlined_ values indicate the second-highest scores in each column.
+
+ | Model | GRFinNUM (Entity F1) | GRFinNER (Entity F1) | GRFinQA (Acc) | GRFNS-2023 (Rouge-1) | GRMultiFin (Acc) | Mean |
+ |--------------|----------------------|----------------------|--------------|----------------------|----------------|-------|
+ | **Open-source Small Models** |
+ | LLaMA-3.2-1B | 0.00 | 0.00 | 0.29 | 0.14 | 0.39 | 0.16 |
+ | LLaMA-3-8b | 0.00 | 0.13 | 0.33 | 0.07 | 0.70 | 0.25 |
+ | LLaMA-3.1-8b | 0.10 | 0.21 | 0.40 | 0.20 | 0.54 | 0.29 |
+ | Qwen2.5-1.5B | 0.00 | 0.00 | 0.36 | 0.02 | 0.31 | 0.14 |
+ | Qwen2.5-7B | 0.00 | 0.13 | 0.43 | 0.07 | 0.54 | 0.23 |
+ | Gemma-2-2B | 0.00 | 0.16 | 0.22 | 0.03 | 0.41 | 0.16 |
+ | Gemma-2-9B | 0.02 | 0.05 | 0.31 | 0.06 | 0.61 | 0.21 |
+ | Mistral-7B | 0.00 | 0.00 | 0.30 | 0.14 | 0.39 | 0.17 |
+ | **Open-source Large Models** |
+ | Deepseek-V3 | 0.07 | 0.00 | 0.50 | **0.38** | 0.61 | 0.31 |
+ | LLaMA-3-70B | 0.05 | 0.45 | 0.60 | 0.08 | 0.61 | 0.36 |
+ | Qwen2.5-32B | 0.37 | 0.55 | 0.60 | 0.10 | _0.70_ | 0.47 |
+ | Qwen2.5-72B | _0.32_ | 0.39 | _0.74_ | 0.04 | **0.72** | 0.44 |
+ | Gemma-2-27B | 0.18 | 0.18 | 0.25 | 0.09 | 0.61 | 0.26 |
+ | **Proprietary Models** |
+ | GPT-3.5-Turbo | 0.14 | 0.30 | 0.51 | 0.31 | 0.50 | 0.35 |
+ | GPT-4o-Mini | 0.25 | 0.30 | 0.12 | _0.36_ | 0.59 | 0.32 |
+ | GPT-4o | 0.09 | 0.31 | **0.78** | 0.26 | 0.59 | 0.41 |
+ | GPT-4 | _0.28_ | _0.60_ | 0.71 | **0.38** | 0.63 | **0.52** |
+ | **English Financial Models** |
+ | Finma-7B | 0.00 | 0.00 | 0.25 | 0.11 | 0.35 | 0.14 |
+ | FinLLaMA-8B | 0.00 | 0.00 | 0.28 | 0.03 | 0.38 | 0.14 |
+ | **Greek General Models** |
+ | Meltemi-7B | 0.12 | 0.50 | 0.48 | 0.19 | 0.43 | 0.34 |
+ | Llama-Kríkri-8B | 0.19 | 0.45 | 0.57 | 0.22 | 0.39 | 0.36 |
+ | **Greek Financial Models** |
+ | Plutus-8B | **0.70** | **0.57** | _0.64_ | _0.34_ | **0.72** | **0.60** |
 
172
  #### Summary
173
 
174
  In summary, plutus-8B-instruct is a domain-specific, instruction-tuned large language model optimized for Greek finance applications. Although it performs well in targeted scenarios, users are advised to validate its outputs carefully.
175
 
176
+ ## Citation
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
177
 
178
  **BibTeX:**

  Peng, X., Papadopoulos, T., Soufleri, E., Giannouris, P., Xiang, R., Wang, Y., Qian, L., Huang, J., Xie, Q., & Ananiadou, S. (2025). Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance. arXiv preprint arXiv:2502.18772. https://arxiv.org/abs/2502.18772

+ ## Glossary

  - PEFT: Parameter-Efficient Fine-Tuning.
  - LoRA: Low-Rank Adaptation, a technique to reduce the number of trainable parameters.
  - BF16: bfloat16, a mixed precision format used to optimize training speed.
  - Int4 Quantization: A lower precision format aimed at reducing model size and inference latency.

+ ## More Information

  For more details regarding training logs, dataset preparations, and further technical insights, please refer to the associated GitHub repositories and documentation provided by The Fin AI.

+ ## Model Card Authors

+ The model card was prepared by The Fin AI with inputs from the Hugging Face community.

  ## Model Card Contact

+ For additional questions or feedback, please contact The Fin AI.