# financial-summarization-vit5-lora

This repository contains the `financial-summarization-vit5-lora` model, a fine-tuned version of the `VietAI/vit5-base` model using LoRA (Low-Rank Adaptation) for Vietnamese financial text summarization. The model achieves a ROUGE-L score of 34.43 on the validation set, demonstrating its effectiveness in summarizing financial news articles.
## Model Details

### Model Description

The `financial-summarization-vit5-lora` model is a parameter-efficient fine-tuned version of the `VietAI/vit5-base` model, optimized for summarizing Vietnamese financial news. It uses the LoRA technique to adapt the pre-trained model with minimal additional parameters, making it suitable for training on limited hardware like Kaggle T4 GPUs.
- Developed by: mrstarkng
- Model type: Seq2Seq (T5-based) with LoRA
- Language(s): Vietnamese
- License: MIT
- Finetuned from model: VietAI/vit5-base
- Framework: PyTorch, PEFT 0.12.0
### Model Sources
- Repository: https://github.com/mrstarkng/financial-summarization-vit5-lora
- Hugging Face Hub: https://huggingface.co/mrstarkng/financial-summarization-vit5-lora
## Uses

### Direct Use

This model can be used directly for summarizing Vietnamese financial texts. Example use cases include:
- Generating concise summaries of financial news articles.
- Assisting in financial report analysis for non-expert users.
### Downstream Use
The model can be fine-tuned further for specific financial domains or integrated into larger NLP pipelines for text generation tasks.
### Out-of-Scope Use
- Summarization of non-financial or non-Vietnamese texts (performance may degrade).
- Malicious use, such as generating misleading financial summaries.
## Bias, Risks, and Limitations
The model was trained on a specific dataset of Vietnamese financial news, which may introduce biases related to the data source (e.g., specific news outlets or time periods). Limitations include:
- Potential overfitting to the training data.
- Limited performance on out-of-domain texts.
### Recommendations

Users should validate the model's outputs against ground-truth data, especially before relying on them for financial decisions, and should keep the biases and limitations described above in mind.
## How to Get Started with the Model

Use the following code to get started with the model:

```python
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "mrstarkng/financial-summarization-vit5-lora"
tokenizer = AutoTokenizer.from_pretrained(model_name)
# Loading an adapter repo this way requires the peft package to be
# installed so transformers can resolve the base model automatically.
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# The model expects the same task prefix used during training.
input_text = "summarize: [Your Vietnamese financial text here]"
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(**inputs, max_length=128)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
```
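If your installed transformers version does not load PEFT adapters through `AutoModelForSeq2SeqLM` directly, the adapter can be attached to the base model explicitly. A minimal sketch, assuming the Hub repository contains only the LoRA adapter weights (consistent with the ~13 MB adapter size noted below):

```python
from peft import PeftModel
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

# Load the frozen base model, then attach the LoRA adapter on top of it.
base = AutoModelForSeq2SeqLM.from_pretrained("VietAI/vit5-base")
model = PeftModel.from_pretrained(base, "mrstarkng/financial-summarization-vit5-lora")
tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")

# Optionally merge the adapter into the base weights for faster inference.
model = model.merge_and_unload()
```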
## Training Details

### Training Data

The model was trained on a dataset of Vietnamese financial news articles sourced from vnexpress.net, cafef.vn, and thanhnien.vn. The dataset consists of `train.csv` (9,217 samples) and `val.csv` (1,153 samples), preprocessed with the task prefix "summarize: ".
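A minimal sketch of loading these files with the datasets library; the local file paths are assumptions, as the card does not document where the CSVs are stored:

```python
from datasets import load_dataset

# Assumed local paths; adjust to wherever train.csv / val.csv actually live.
data_files = {"train": "train.csv", "validation": "val.csv"}
dataset = load_dataset("csv", data_files=data_files)
print(dataset)  # expect ~9,217 train rows and ~1,153 validation rows
```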
### Training Procedure

#### Preprocessing

- Texts were tokenized using the `VietAI/vit5-base` tokenizer.
- Input sequences were truncated/padded to `max_length=512`, and target sequences to `max_length=128`.
- Padding tokens in the labels were replaced with `-100` so they are ignored in the loss calculation (sketched below).
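The steps above correspond roughly to the following sketch; the column names `text` and `summary` are assumptions, since the actual CSV schema is not documented:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")

def preprocess(batch):
    # Prepend the task prefix used during training.
    model_inputs = tokenizer(
        ["summarize: " + t for t in batch["text"]],  # "text" column is an assumption
        max_length=512, truncation=True, padding="max_length",
    )
    labels = tokenizer(
        text_target=batch["summary"],  # "summary" column is an assumption
        max_length=128, truncation=True, padding="max_length",
    )
    # Replace padding token ids in the labels with -100 so the
    # cross-entropy loss ignores padded positions.
    model_inputs["labels"] = [
        [tok if tok != tokenizer.pad_token_id else -100 for tok in seq]
        for seq in labels["input_ids"]
    ]
    return model_inputs
```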
#### Training Hyperparameters

- Training regime: bf16 mixed precision
- Epochs: 4
- Batch size: 4 (effective batch size 16 with gradient accumulation steps of 4)
- Learning rate: 2e-4 (with warmup ratio 0.05)
- LoRA parameters: `r=8`, `lora_alpha=16`, `lora_dropout=0.05`, `target_modules=["q", "k", "v", "o", "wi", "wo"]` (see the configuration sketch below)
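These hyperparameters map onto PEFT and transformers configuration objects roughly as follows; the output directory name is a placeholder, not the path used in training:

```python
from peft import LoraConfig, TaskType, get_peft_model
from transformers import AutoModelForSeq2SeqLM, Seq2SeqTrainingArguments

base = AutoModelForSeq2SeqLM.from_pretrained("VietAI/vit5-base")

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q", "k", "v", "o", "wi", "wo"],
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the adapter weights are trainable

training_args = Seq2SeqTrainingArguments(
    output_dir="vit5-lora-finsum",  # placeholder path
    num_train_epochs=4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,  # effective batch size 4 * 4 = 16
    learning_rate=2e-4,
    warmup_ratio=0.05,
    bf16=True,                      # mixed precision, as stated above
)
```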
#### Speeds, Sizes, Times

- Training time: ~1 hour 40 minutes (6031.292 seconds)
- Throughput: 6.113 samples/second, 0.191 steps/second
- Model size: ~13 MB (adapter weights) + tokenizer files
## Evaluation

### Testing Data, Factors & Metrics

#### Testing Data

- Evaluated on the validation split of the Vietnamese financial news dataset (1,153 samples).

#### Factors

- Performance was evaluated over the entire validation set as a whole.

#### Metrics

- ROUGE-L: used to measure summarization quality, reported as the F-measure scaled by 100 (see the sketch below).
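A minimal sketch of how such a score can be computed with the rouge_score package; whether this exact package and these settings were used is an assumption:

```python
from rouge_score import rouge_scorer

# use_stemmer=False: stemming is English-specific and not meaningful for Vietnamese.
scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)

reference = "Tóm tắt tham chiếu ..."        # gold summary (placeholder)
prediction = "Tóm tắt do mô hình sinh ..."  # model output (placeholder)

score = scorer.score(reference, prediction)["rougeL"]
rouge_l = score.fmeasure * 100  # F-measure scaled by 100, as reported below
print(f"ROUGE-L: {rouge_l:.2f}")
```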
### Results

- ROUGE-L: 34.43
- Eval loss: 1.746
- Eval runtime: 316.337 seconds

#### Summary

The model achieves a ROUGE-L score of 34.43, indicating good summarization performance for Vietnamese financial texts under the given hardware and parameter-efficiency constraints.
## Environmental Impact

- Hardware Type: NVIDIA T4 GPU (Kaggle)
- Hours used: ~1.67 hours
- Cloud Provider: Kaggle
- Compute Region: Unknown (Kaggle default)
- Carbon Emitted: estimated ~0.1-0.2 kg CO2eq (using the Machine Learning Impact calculator, assuming typical T4 power draw)
## Technical Specifications

### Model Architecture and Objective

- Based on the T5 encoder-decoder architecture (VietAI/vit5-base) with LoRA adapters for parameter-efficient fine-tuning.
- Objective: minimize the cross-entropy loss for sequence-to-sequence summarization, as formalized below.
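Concretely, the objective is the standard autoregressive cross-entropy over the target summary tokens, with padded label positions masked out via `-100` as described in Preprocessing:

$$
\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_{<t},\, x\right)
$$

where $x$ is the tokenized source article and $y_1, \dots, y_T$ is the reference summary.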
### Compute Infrastructure

#### Hardware

- NVIDIA T4 GPU (15 GB VRAM)

#### Software

- Python 3.11
- Transformers library
- PEFT 0.12.0
- PyTorch
## Citation

BibTeX:

```bibtex
@misc{financial-summarization-vit5-lora,
  author = {mrstarkng},
  title  = {Financial Summarization with ViT5 and LoRA},
  year   = {2025},
  url    = {https://huggingface.co/mrstarkng/financial-summarization-vit5-lora}
}
```

APA: mrstarkng. (2025). *Financial Summarization with ViT5 and LoRA*. Hugging Face. https://huggingface.co/mrstarkng/financial-summarization-vit5-lora
## Model Card Authors

- mrstarkng

## Model Card Contact
For questions or feedback, please contact mrstarkng via GitHub or the Hugging Face platform.