# financial-summarization-vit5-lora

This repository contains the `financial-summarization-vit5-lora` model, a fine-tuned version of the `VietAI/vit5-base` model using LoRA (Low-Rank Adaptation) for Vietnamese financial text summarization. The model achieves a ROUGE-L score of 34.43 on the validation set, demonstrating its effectiveness in summarizing financial news articles.
## Model Details

### Model Description

The `financial-summarization-vit5-lora` model is a parameter-efficient fine-tuned version of the `VietAI/vit5-base` model, optimized for summarizing Vietnamese financial news. It uses the LoRA technique to adapt the pre-trained model with minimal additional parameters, making it suitable for training on limited hardware like Kaggle T4 GPUs.
- Developed by: mrstarkng
- Model type: Seq2Seq (T5-based) with LoRA
- Language(s): Vietnamese
- License: MIT
- Finetuned from model: VietAI/vit5-base
- Framework: PyTorch, PEFT 0.12.0
### Model Sources
- Repository: https://github.com/mrstarkng/financial-summarization-vit5-lora
- Hugging Face Hub: https://huggingface.co/mrstarkng/financial-summarization-vit5-lora
## Uses

### Direct Use

This model can be used directly for summarizing Vietnamese financial texts. Example use cases include:
- Generating concise summaries of financial news articles.
- Assisting in financial report analysis for non-expert users.
### Downstream Use
The model can be fine-tuned further for specific financial domains or integrated into larger NLP pipelines for text generation tasks.
### Out-of-Scope Use
- Summarization of non-financial or non-Vietnamese texts (performance may degrade).
- Malicious use, such as generating misleading financial summaries.
## Bias, Risks, and Limitations
The model was trained on a specific dataset of Vietnamese financial news, which may introduce biases related to the data source (e.g., specific news outlets or time periods). Limitations include:
- Potential overfitting to the training data.
- Limited performance on out-of-domain texts.
### Recommendations

Users should validate the model's outputs against ground-truth data, especially before relying on them for financial decisions, and should keep the biases and limitations described above in mind.
## How to Get Started with the Model

Use the following code to get started with the model:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "mrstarkng/financial-summarization-vit5-lora"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Loading an adapter repo this way requires the peft package to be
# installed so transformers can resolve the base model automatically.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# The model expects the same task prefix used during training.
input_text = "summarize: [Your Vietnamese financial text here]"
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(**inputs, max_length=128)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
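If your installed transformers version does not load PEFT adapters through `AutoModelForSeq2SeqLM` directly, the adapter can be attached to the base model explicitly. A minimal sketch, assuming the Hub repository contains only the LoRA adapter weights (consistent with the ~13 MB adapter size noted below):

```python
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the frozen base model, then attach the LoRA adapter on top of it.
base = AutoModelForSeq2SeqLM.from_pretrained("VietAI/vit5-base")
model = PeftModel.from_pretrained(base, "mrstarkng/financial-summarization-vit5-lora")
tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")

# Optionally merge the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```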
## Training Details

### Training Data

The model was trained on a dataset of Vietnamese financial news articles sourced from vnexpress.net, cafef.vn, and thanhnien.vn. The dataset consists of `train.csv` (9,217 samples) and `val.csv` (1,153 samples), preprocessed with the task prefix "summarize: ".
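A minimal sketch of loading these files with the datasets library; the local file paths are assumptions, as the card does not document where the CSVs are stored:

```python
from datasets import load_dataset

# Assumed local paths; adjust to wherever train.csv / val.csv actually live.
data_files = {"train": "train.csv", "validation": "val.csv"}
dataset = load_dataset("csv", data_files=data_files)
print(dataset)  # expect ~9,217 train rows and ~1,153 validation rows
```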
### Training Procedure

#### Preprocessing

- Texts were tokenized using the `VietAI/vit5-base` tokenizer.
- Input sequences were truncated/padded to `max_length=512`, and target sequences to `max_length=128`.
- Padding tokens in the labels were replaced with `-100` so they are ignored in the loss calculation (sketched below).
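The steps above correspond roughly to the following sketch; the column names `text` and `summary` are assumptions, since the actual CSV schema is not documented:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")

def preprocess(batch):
    # Prepend the task prefix used during training.
    model_inputs = tokenizer(
        ["summarize: " + t for t in batch["text"]],  # "text" column is an assumption
        max_length=512, truncation=True, padding="max_length",
    )
    labels = tokenizer(
        text_target=batch["summary"],  # "summary" column is an assumption
        max_length=128, truncation=True, padding="max_length",
    )
    # Replace padding token ids in the labels with -100 so the
    # cross-entropy loss ignores padded positions.
    model_inputs["labels"] = [
        [tok if tok != tokenizer.pad_token_id else -100 for tok in seq]
        for seq in labels["input_ids"]
    ]
    return model_inputs
```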
#### Training Hyperparameters

- Training regime: bf16 mixed precision
- Epochs: 4
- Batch size: 4 (effective batch size 16 with gradient accumulation steps of 4)
- Learning rate: 2e-4 (with warmup ratio 0.05)
- LoRA parameters: `r=8`, `lora_alpha=16`, `lora_dropout=0.05`, `target_modules=["q", "k", "v", "o", "wi", "wo"]` (see the configuration sketch below)
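These hyperparameters map onto PEFT and transformers configuration objects roughly as follows; the output directory name is a placeholder, not the path used in training:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

base = AutoModelForSeq2SeqLM.from_pretrained("VietAI/vit5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q", "k", "v", "o", "wi", "wo"],
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

training_args = Seq2SeqTrainingArguments(
    output_dir="vit5-lora-finsum",  # placeholder path
    num_train_epochs=4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size 4 * 4 = 16
    learning_rate=2e-4,
    warmup_ratio=0.05,
    bf16=True,                      # mixed precision, as stated above
)
```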
#### Speeds, Sizes, Times

- Training time: ~1 hour 40 minutes (6031.292 seconds)
- Throughput: 6.113 samples/second, 0.191 steps/second
- Model size: ~13 MB (adapter weights) + tokenizer files
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Evaluated on the validation split of the Vietnamese financial news dataset (1,153 samples).

#### Factors

- Performance was evaluated over the entire validation set as a whole.

#### Metrics

- ROUGE-L: used to measure summarization quality, reported as the F-measure scaled by 100 (see the sketch below).
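A minimal sketch of how such a score can be computed with the rouge_score package; whether this exact package and these settings were used is an assumption:

```python
from rouge_score import rouge_scorer

# use_stemmer=False: stemming is English-specific and not meaningful for Vietnamese.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)

reference = "Tóm tắt tham chiếu ..."        # gold summary (placeholder)
prediction = "Tóm tắt do mô hình sinh ..."  # model output (placeholder)

score = scorer.score(reference, prediction)["rougeL"]
rouge_l = score.fmeasure * 100  # F-measure scaled by 100, as reported below
print(f"ROUGE-L: {rouge_l:.2f}")
```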
### Results

- ROUGE-L: 34.43
- Eval loss: 1.746
- Eval runtime: 316.337 seconds

#### Summary

The model achieves a ROUGE-L score of 34.43, indicating good summarization performance for Vietnamese financial texts under the given hardware and parameter-efficiency constraints.
## Environmental Impact

- Hardware Type: NVIDIA T4 GPU (Kaggle)
- Hours used: ~1.67 hours
- Cloud Provider: Kaggle
- Compute Region: Unknown (Kaggle default)
- Carbon Emitted: estimated ~0.1-0.2 kg CO2eq (using the Machine Learning Impact calculator, assuming typical T4 power draw)
## Technical Specifications

### Model Architecture and Objective

- Based on the T5 encoder-decoder architecture (VietAI/vit5-base) with LoRA adapters for parameter-efficient fine-tuning.
- Objective: minimize the cross-entropy loss for sequence-to-sequence summarization, as formalized below.
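Concretely, the objective is the standard autoregressive cross-entropy over the target summary tokens, with padded label positions masked out via `-100` as described in Preprocessing:

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_{<t},\, x\right)
$$

where $x$ is the tokenized source article and $y_1, \dots, y_T$ is the reference summary.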
### Compute Infrastructure

#### Hardware

- NVIDIA T4 GPU (15 GB VRAM)

#### Software

- Python 3.11
- Transformers library
- PEFT 0.12.0
- PyTorch
## Citation

BibTeX:

```bibtex
@misc{financial-summarization-vit5-lora,
  author = {mrstarkng},
  title  = {Financial Summarization with ViT5 and LoRA},
  year   = {2025},
  url    = {https://huggingface.co/mrstarkng/financial-summarization-vit5-lora}
}
```

APA: mrstarkng. (2025). *Financial Summarization with ViT5 and LoRA*. Hugging Face. https://huggingface.co/mrstarkng/financial-summarization-vit5-lora
## Model Card Authors

- mrstarkng

## Model Card Contact
For questions or feedback, please contact mrstarkng via GitHub or the Hugging Face platform.