license: apache-2.0
datasets:
  - kritsadaK/EDGAR-CORPUS-Financial-Summarization
language:
  - en
metrics:
  - rouge
base_model:
  - facebook/bart-large-cnn

BART Financial Summarization Model

Model Name: kritsadaK/bart-financial-summarization
Base Model: facebook/bart-large-cnn
Task: Financial Text Summarization
Dataset: kritsadaK/EDGAR-CORPUS-Financial-Summarization

Techniques:

  • Fine-tuned using the Hugging Face Trainer API
  • Tokenized with AutoTokenizer (maximum length of 1024 tokens for the input, 256 tokens for the summary)
  • Optimized with AdamW, learning rate 2e-5, batch size 2, fp16 enabled
  • Evaluated using ROUGE scores
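
To make the ROUGE bullet concrete, here is a minimal pure-Python sketch of ROUGE-1 F1 (unigram overlap). It is a simplified illustration, not the evaluation code used for this model: the function name `rouge1_f1` and the example sentences are hypothetical, and standard ROUGE implementations typically add their own tokenization and optional stemming.

```python
from collections import Counter


def rouge1_f1(reference: str, candidate: str) -> float:
    """Simplified ROUGE-1 F1: unigram overlap on lowercased, whitespace-split text."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped overlap: a word counts at most as often as it appears in both texts
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)


# Hypothetical reference and generated summaries, for illustration only
reference = "revenue increased 12 percent due to higher product sales"
candidate = "revenue increased 12 percent on higher sales"
score = rouge1_f1(reference, candidate)
print(f"ROUGE-1 F1: {score:.3f}")  # 6 shared words: P = 6/7, R = 6/9, F1 = 0.750
```

Higher values mean the generated summary shares more words with the reference; ROUGE-2 and ROUGE-L apply the same idea to bigrams and longest common subsequences.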

Evaluation Results:

  • Loss: 1.18
  • Runtime: 18.9 seconds
  • Samples per second: 56.1
  • Steps per second: 28.1
  • Epochs: 3
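
The throughput figures above can be cross-checked with a little arithmetic. The sketch below uses only the reported numbers; the derived sample and step counts are approximations and are not stated in the original results.

```python
# Figures reported in the evaluation results
runtime_s = 18.9
samples_per_s = 56.1
steps_per_s = 28.1

# Derived, approximate quantities (not reported directly)
eval_samples = runtime_s * samples_per_s        # roughly 1,060 evaluation samples
eval_steps = runtime_s * steps_per_s            # roughly 531 evaluation steps
samples_per_step = samples_per_s / steps_per_s  # roughly 2 samples per step

print(round(eval_samples), round(eval_steps), round(samples_per_step, 2))
```

About two samples per step is consistent with the batch size of 2 listed under Techniques, assuming the same batch size was used for evaluation on a single device.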

Usage Example (Python):

from transformers import pipeline

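# Load the fine-tuned model from the Hugging Face Hub
summarizer = pipeline("summarization", model="kritsadaK/bart-financial-summarization")
text = "Your financial document text here..."
# truncation=True clips inputs that exceed the model's 1024-token input limit
summary = summarizer(text, max_length=256, min_length=50, do_sample=False, truncation=True)
# The pipeline returns a list with one dict per input; the text is under "summary_text"
print(summary[0]["summary_text"])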
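
Financial filings are often much longer than the 1024-token input limit. One possible workaround, sketched below, is to split the document into chunks and summarize each chunk separately. The helper `chunk_words` and the 600-word default are assumptions for illustration: word counts only approximate token counts, and this is not part of the model's documented workflow.

```python
from typing import List


def chunk_words(text: str, max_words: int = 600) -> List[str]:
    """Split text into consecutive chunks of at most max_words whitespace-separated words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


# Small demonstration with a tiny chunk size
chunks = chunk_words("net income rose while operating costs fell", max_words=3)
print(chunks)  # ['net income rose', 'while operating costs', 'fell']

# Each chunk could then be passed to the summarizer from the usage example, e.g.:
# partial_summaries = [summarizer(c, truncation=True)[0]["summary_text"] for c in chunks]
```

Splitting on fixed word counts can cut sentences in half, so splitting on section or paragraph boundaries may give cleaner summaries.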

The Financial Statements Summary 10K Dataset was developed as part of the CSX4210: Natural Language Processing project at Assumption University.