financial-summarization-vit5-lora

financial-summarization-vit5-lora is a fine-tuned version of the VietAI/vit5-base model, adapted with LoRA (Low-Rank Adaptation) for Vietnamese financial text summarization. The model achieves a ROUGE-L score of 34.43 on the held-out validation set, demonstrating its effectiveness in summarizing financial news articles.

Model Details

Model Description

The financial-summarization-vit5-lora model is a parameter-efficient fine-tuned version of the VietAI/vit5-base model, optimized for summarizing Vietnamese financial news. It uses the LoRA technique to adapt the pre-trained model with minimal additional parameters, making it suitable for training on limited hardware like Kaggle T4 GPUs.

  • Developed by: mrstarkng
  • Model type: Seq2Seq (T5-based) with LoRA
  • Language(s): Vietnamese
  • License: MIT
  • Finetuned from model: VietAI/vit5-base
  • Framework: PyTorch, PEFT 0.12.0

Uses

Direct Use

This model can be used directly for summarizing Vietnamese financial texts. Example use cases include:

  • Generating concise summaries of financial news articles.
  • Assisting in financial report analysis for non-expert users.

Downstream Use

The model can be fine-tuned further for specific financial domains or integrated into larger NLP pipelines for text generation tasks.

Out-of-Scope Use

  • Summarization of non-financial or non-Vietnamese texts (performance may degrade).
  • Malicious use, such as generating misleading financial summaries.

Bias, Risks, and Limitations

The model was trained on a specific dataset of Vietnamese financial news, which may introduce biases related to the data source (e.g., specific news outlets or time periods). Limitations include:

  • Potential overfitting to the training data.
  • Limited performance on out-of-domain texts.

Recommendations

Users should validate the model's outputs against ground truth data, especially for critical financial decisions. Awareness of potential biases and limitations is recommended.

How to Get Started with the Model

Use the following code to get started with the model:

from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
from peft import PeftModel

base_model_name = "VietAI/vit5-base"
adapter_name = "mrstarkng/financial-summarization-vit5-lora"

# Load the tokenizer (shipped with the adapter) and the base model,
# then attach the LoRA adapter weights on top.
tokenizer = AutoTokenizer.from_pretrained(adapter_name)
base_model = AutoModelForSeq2SeqLM.from_pretrained(base_model_name)
model = PeftModel.from_pretrained(base_model, adapter_name)
model.eval()

# Inputs must use the same "summarize: " task prefix as in training.
input_text = "summarize: [Your Vietnamese financial text here]"
inputs = tokenizer(input_text, return_tensors="pt", max_length=512, truncation=True)
outputs = model.generate(**inputs, max_length=128)
summary = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(summary)
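
If you prefer a standalone checkpoint that does not require PEFT at inference time, the LoRA weights can be merged into the base model. This is a minimal sketch, assuming the model was loaded with PeftModel as shown above; the output directory name is illustrative.

# Merge the LoRA weights into the base model and drop the adapter wrappers.
merged_model = model.merge_and_unload()
merged_model.save_pretrained("vit5-financial-summarization-merged")  # hypothetical output directory
tokenizer.save_pretrained("vit5-financial-summarization-merged")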

Training Details

Training Data

The model was trained on a dataset of Vietnamese financial news articles sourced from vnexpress.net, cafef.vn, and thanhnien.vn. The dataset includes train.csv (9217 samples) and val.csv (1153 samples), preprocessed with the task prefix "summarize: ".

Training Procedure

Preprocessing

  • Texts were tokenized using the VietAI/vit5-base tokenizer.
  • Input sequences were truncated/padded to max_length=512, and target sequences to max_length=128.
  • Padding tokens were replaced with -100 to be ignored in loss calculation.
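
A minimal sketch of this preprocessing, assuming the standard Hugging Face tokenization workflow (the column names article and summary are illustrative):

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("VietAI/vit5-base")

def preprocess(example):
    # Prepend the task prefix and tokenize the source article.
    model_inputs = tokenizer(
        "summarize: " + example["article"],
        max_length=512,
        truncation=True,
        padding="max_length",
    )
    # Tokenize the target summary.
    labels = tokenizer(
        example["summary"],
        max_length=128,
        truncation=True,
        padding="max_length",
    )
    # Replace padding token ids with -100 so they are ignored in the loss.
    model_inputs["labels"] = [
        (tok if tok != tokenizer.pad_token_id else -100)
        for tok in labels["input_ids"]
    ]
    return model_inputs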

Training Hyperparameters

  • Training regime: bf16 mixed precision
  • Epochs: 4
  • Batch size: 4 (effective batch size 16 with gradient accumulation 4)
  • Learning rate: 2e-4 (with warmup ratio 0.05)
  • LoRA parameters: r=8, lora_alpha=16, lora_dropout=0.05, target_modules=["q", "k", "v", "o", "wi", "wo"]
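
These hyperparameters correspond roughly to the following PEFT and Trainer configuration. This is a sketch for reference, assuming Seq2SeqTrainingArguments from transformers; values not listed above (e.g. output_dir) are illustrative.

from peft import LoraConfig, TaskType
from transformers import Seq2SeqTrainingArguments

lora_config = LoraConfig(
    task_type=TaskType.SEQ_2_SEQ_LM,
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q", "k", "v", "o", "wi", "wo"],
)

training_args = Seq2SeqTrainingArguments(
    output_dir="vit5-lora-financial",   # illustrative
    num_train_epochs=4,
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,      # effective batch size 16
    learning_rate=2e-4,
    warmup_ratio=0.05,
    bf16=True,
    predict_with_generate=True,
)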

Speeds, Sizes, Times

  • Training time: ~1 hour 40 minutes (6031.292 seconds)
  • Throughput: 6.113 samples/second, 0.191 steps/second
  • Model size: ~13MB (adapter weights) + tokenizer files

Evaluation

Testing Data, Factors & Metrics

Testing Data

  • Evaluated on the validation split of the Vietnamese financial news dataset (1153 samples).

Factors

  • Performance evaluated across the entire validation set.

Metrics

  • ROUGE-L: Used to measure summarization quality, with fmeasure * 100 as the final score.
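
A minimal sketch of how this score can be computed with the rouge_score package (the example strings are placeholders):

from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rougeL"], use_stemmer=False)
score = scorer.score("reference summary", "generated summary")
rouge_l = score["rougeL"].fmeasure * 100  # reported as the final score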

Results

  • ROUGE-L: 34.43
  • Eval Loss: 1.746
  • Eval Runtime: 316.337 seconds

Summary

The model achieves a ROUGE-L score of 34.43, indicating good summarization performance for Vietnamese financial texts under the given constraints.

Environmental Impact

  • Hardware Type: NVIDIA T4 GPU (Kaggle)
  • Hours used: ~1.67 hours
  • Cloud Provider: Kaggle
  • Compute Region: Unknown (Kaggle default)
  • Carbon Emitted: Estimated ~0.1-0.2 kg CO2eq (using the Machine Learning Impact calculator, assuming typical T4 power draw).

Technical Specifications

Model Architecture and Objective

  • Based on T5 architecture (VietAI/vit5-base) with LoRA for parameter-efficient fine-tuning.
  • Objective: Minimize cross-entropy loss for sequence-to-sequence summarization.
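
For reference, the objective can be written as the standard token-level cross-entropy over the reference summary (positions masked with -100 are excluded from the sum):

$$\mathcal{L}(\theta) = -\frac{1}{T}\sum_{t=1}^{T} \log p_\theta\left(y_t \mid y_{<t}, x\right)$$

where x is the prefixed input article and y_1, ..., y_T is the reference summary.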

Compute Infrastructure

Hardware

  • NVIDIA T4 GPU (15GB VRAM)

Software

  • Python 3.11
  • Transformers library
  • PEFT 0.12.0
  • PyTorch

Citation

BibTeX:

@misc{financial-summarization-vit5-lora,
  author = {mrstarkng},
  title = {Financial Summarization with ViT5 and LoRA},
  year = {2025},
  url = {https://huggingface.co/mrstarkng/financial-summarization-vit5-lora}
}

APA: mrstarkng. (2025). Financial Summarization with ViT5 and LoRA. https://huggingface.co/mrstarkng/financial-summarization-vit5-lora

Model Card Authors

  • mrstarkng

Model Card Contact

For questions or feedback, please contact mrstarkng via GitHub or the Hugging Face platform.
