Update README.md
README.md
---
library_name: transformers
tags:
- finance
- greek
license: llama3.1
language:
- el
base_model:
- ilsp/Llama-Krikri-8B-Instruct
---

# Model Card for plutus-8B-instruct

This model card describes plutus-8B-instruct, a model developed by The Fin AI and fine-tuned to serve Greek-centric financial language tasks. The model leverages parameter-efficient fine-tuning (PEFT) via LoRA and is designed to generate and understand financial texts in Greek. Training was carried out on a local command-line backend, with logging managed via TensorBoard.

- **Developed by:** NaCTeM, The Fin AI
- **Funded by:** NaCTeM, Archimedes RC
- **Model type:** Instruction-tuned large language model, specialized for low-resource domains such as Greek finance
- **Language(s) (NLP):** Greek
- **License:** Llama 3.1 Community License
- **Finetuned from model:** ilsp/Llama-Krikri-8B-Instruct

### Model Sources

- **Repository:** https://huggingface.co/TheFinAI/plutus-8B-instruct
- **Paper:** https://arxiv.org/abs/2502.18772
- **Demo:** https://huggingface.co/spaces/TheFinAI/plutus-8B-instruct

## Uses

### Direct Use

Plutus-8B-instruct can be directly applied in Greek finance applications such as answering user queries, summarizing financial reports, or generating context-aware financial planning text. Users should ensure that input texts align with the financial domain to achieve optimal performance.
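
For direct use, the model can be loaded with the Transformers library. Below is a minimal sketch; the Greek prompt is invented for illustration and the generation settings are examples, not official recommendations.

```python
# Minimal inference sketch; prompt and generation settings are illustrative.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheFinAI/plutus-8B-instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # the model was trained in bf16
    device_map="auto",
)

messages = [
    {"role": "user", "content": "Εξήγησε με απλά λόγια τι είναι το μέρισμα."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```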

### Downstream Use

The model can be integrated into larger systems such as chatbots, recommendation systems, or data analysis pipelines focusing on Greek financial markets. Fine-tuning on domain-specific datasets might further improve performance for specialized tasks.

## Training Details

### Training Procedure

The finetuning process used parameter-efficient fine-tuning (PEFT) with LoRA. The base model was ilsp/Llama-Krikri-8B-Instruct, and the training was executed on a local CLI backend with logs monitored via TensorBoard.
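
As a rough illustration of this setup, a LoRA adapter can be attached to the base model with the `peft` library as sketched below; the rank, alpha, dropout, and target modules are placeholders, not the values actually used for plutus-8B-instruct.

```python
# Illustrative PEFT/LoRA setup; all hyperparameter values below are placeholders,
# not the configuration used to train plutus-8B-instruct.
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

base_model = AutoModelForCausalLM.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")

lora_config = LoraConfig(
    r=16,                                                     # placeholder rank
    lora_alpha=32,                                            # placeholder scaling
    lora_dropout=0.05,                                        # placeholder dropout
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed attention projections
    task_type="CAUSAL_LM",
)

model = get_peft_model(base_model, lora_config)
model.print_trainable_parameters()  # only the low-rank adapter weights are trainable
```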

#### Preprocessing

Inputs were tokenized with a specialized tokenizer configured for the Greek language, and a chat template was applied to structure the conversational data accordingly. Padding was applied to the right as per the configuration.
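
A sketch of this preprocessing step is shown below; the short conversation is invented for illustration and the maximum length is a placeholder.

```python
# Sketch of the preprocessing described above: chat template plus right padding.
# The conversation and max_length are illustrative placeholders.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("ilsp/Llama-Krikri-8B-Instruct")
tokenizer.padding_side = "right"          # padding applied to the right
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

conversation = [
    {"role": "user", "content": "Τι είναι ο πληθωρισμός;"},
    {"role": "assistant", "content": "Ο πληθωρισμός είναι η γενική αύξηση των τιμών."},
]

# Apply the chat template first, then tokenize with padding and truncation.
text = tokenizer.apply_chat_template(conversation, tokenize=False)
batch = tokenizer(
    [text], padding="max_length", max_length=512, truncation=True, return_tensors="pt"
)
print(batch["input_ids"].shape)
```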

#### Training Hyperparameters

- **Gradient accumulation:** 4
- **Mixed precision:** bf16
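
Expressed as Transformers `TrainingArguments`, the two settings listed above map onto the sketch below; every other value is a placeholder rather than the configuration actually used.

```python
# Only gradient_accumulation_steps and bf16 come from this card;
# all other values are placeholders.
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="plutus-8b-instruct-lora",  # placeholder
    per_device_train_batch_size=2,         # placeholder
    gradient_accumulation_steps=4,         # from this card
    bf16=True,                             # mixed precision, from this card
    num_train_epochs=1,                    # placeholder
    logging_dir="runs",                    # TensorBoard logging, as described above
    report_to="tensorboard",
)
```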

#### Speeds, Sizes, Times

Training was performed on locally available hardware. Specific details on throughput, training runtime, and checkpoint sizes will be provided in subsequent documentation.

## Evaluation

### Results

The model was evaluated across multiple Greek financial NLP tasks. **Bold** values denote the highest score in each column, while _italicized_ values indicate the second-highest.

| Model | GRFinNUM (Entity F1) | GRFinNER (Entity F1) | GRFinQA (Acc) | GRFNS-2023 (Rouge-1) | GRMultiFin (Acc) | Mean |
|---------------|----------------------|----------------------|---------------|----------------------|------------------|-------|
| **Open-source Small Models** | | | | | | |
| LLaMA-3.2-1B | 0.00 | 0.00 | 0.29 | 0.14 | 0.39 | 0.16 |
| LLaMA-3-8b | 0.00 | 0.13 | 0.33 | 0.07 | 0.70 | 0.25 |
| LLaMA-3.1-8b | 0.10 | 0.21 | 0.40 | 0.20 | 0.54 | 0.29 |
| Qwen2.5-1.5B | 0.00 | 0.00 | 0.36 | 0.02 | 0.31 | 0.14 |
| Qwen2.5-7B | 0.00 | 0.13 | 0.43 | 0.07 | 0.54 | 0.23 |
| Gemma-2-2B | 0.00 | 0.16 | 0.22 | 0.03 | 0.41 | 0.16 |
| Gemma-2-9B | 0.02 | 0.05 | 0.31 | 0.06 | 0.61 | 0.21 |
| Mistral-7B | 0.00 | 0.00 | 0.30 | 0.14 | 0.39 | 0.17 |
| **Open-source Large Models** | | | | | | |
| Deepseek-V3 | 0.07 | 0.00 | 0.50 | **0.38** | 0.61 | 0.31 |
| LLaMA-3-70B | 0.05 | 0.45 | 0.60 | 0.08 | 0.61 | 0.36 |
| Qwen2.5-32B | 0.37 | 0.55 | 0.60 | 0.10 | _0.70_ | 0.47 |
| Qwen2.5-72B | _0.32_ | 0.39 | _0.74_ | 0.04 | **0.72** | 0.44 |
| Gemma-2-27B | 0.18 | 0.18 | 0.25 | 0.09 | 0.61 | 0.26 |
| **Proprietary Models** | | | | | | |
| GPT-3.5-Turbo | 0.14 | 0.30 | 0.51 | 0.31 | 0.50 | 0.35 |
| GPT-4o-Mini | 0.25 | 0.30 | 0.12 | _0.36_ | 0.59 | 0.32 |
| GPT-4o | 0.09 | 0.31 | **0.78** | 0.26 | 0.59 | 0.41 |
| GPT-4 | _0.28_ | _0.60_ | 0.71 | **0.38** | 0.63 | **0.52** |
| **English Financial Models** | | | | | | |
| Finma-7B | 0.00 | 0.00 | 0.25 | 0.11 | 0.35 | 0.14 |
| FinLLaMA-8B | 0.00 | 0.00 | 0.28 | 0.03 | 0.38 | 0.14 |
| **Greek General Models** | | | | | | |
| Meltemi-7B | 0.12 | 0.50 | 0.48 | 0.19 | 0.43 | 0.34 |
| Llama-Kríkri-8B | 0.19 | 0.45 | 0.57 | 0.22 | 0.39 | 0.36 |
| **Greek Financial Models** | | | | | | |
| Plutus-8B | **0.70** | **0.57** | _0.64_ | _0.34_ | **0.72** | **0.60** |

#### Summary
In summary, plutus-8B-instruct is a domain-specific, instruction-tuned large language model optimized for Greek finance applications. Although it performs well in targeted scenarios, users are advised to validate its outputs carefully.

## Citation

**BibTeX:**

**APA:**

Peng, X., Papadopoulos, T., Soufleri, E., Giannouris, P., Xiang, R., Wang, Y., Qian, L., Huang, J., Xie, Q., & Ananiadou, S. (2025). Plutus: Benchmarking Large Language Models in Low-Resource Greek Finance. arXiv preprint arXiv:2502.18772. https://arxiv.org/abs/2502.18772

## Glossary

- PEFT: Parameter-Efficient Fine-Tuning.
- LoRA: Low-Rank Adaptation, a technique to reduce the number of trainable parameters.
- BF16: bfloat16, a mixed precision format used to optimize training speed.
- Int4 Quantization: A lower precision format aimed at reducing model size and inference latency.
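
As an illustration of the Int4 and BF16 entries above, the model could be loaded with 4-bit weights and bf16 compute via `bitsandbytes`, as sketched below. This is an optional inference-time choice, not part of the published training setup.

```python
# Illustrative 4-bit (int4) loading with bf16 compute; requires the
# bitsandbytes package and is an optional inference-time choice.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "TheFinAI/plutus-8B-instruct",
    quantization_config=bnb_config,
    device_map="auto",
)
```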

## More Information

For more details regarding training logs, dataset preparations, and further technical insights, please refer to the associated GitHub repositories and documentation provided by The Fin AI.

## Model Card Authors

The model card was prepared by The Fin AI with inputs from the Hugging Face community.
## Model Card Contact
For additional questions or feedback, please contact The Fin AI.