---
license: mit
datasets:
- EdinburghNLP/xsum
metrics:
- bleu
- rouge
base_model:
- facebook/bart-large
pipeline_tag: summarization
---

### Model Description

Model card for `fewshot-xsum-bart`.

* Model Name: fewshot-xsum-bart
* Base Model: facebook/bart-large
* Task: Summarization (Few-Shot Learning)

### Dataset: XSUM (Extreme Summarization Dataset)

- Few-shot setup: trained on 100 samples from the XSUM training set and validated on 50 samples from the XSUM validation set.
- This model is a few-shot learning variant of BART-large, fine-tuned on a very small subset of the XSUM dataset.
- Its purpose is to demonstrate the effectiveness of few-shot learning in summarization tasks where only a limited amount of labeled data is available.

### Purpose
The goal of this model is to explore how well a large pre-trained language model like BART can perform on abstractive summarization when fine-tuned with very limited data (few-shot learning). By training on only 100 samples and validating on 50, it serves as a proof of concept for few-shot summarization tasks.

- Training set: 100 samples (randomly selected from the XSUM training set).
- Validation set: 50 samples (randomly selected from the XSUM validation set).

The small dataset size is intentional, as the focus is on few-shot learning rather than large-scale training.
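
For reproducibility, here is a minimal sketch of this sampling setup using the Hugging Face `datasets` library. The seed and the shuffle-then-select approach are illustrative assumptions; the exact selection procedure used for this model is not recorded.

```python
from datasets import load_dataset

# XSUM pairs BBC articles ("document") with one-sentence summaries ("summary")
dataset = load_dataset("EdinburghNLP/xsum")

# Few-shot subsets: 100 training and 50 validation examples.
# NOTE: seed=42 is an assumption; the original sampling seed is not documented.
train_subset = dataset["train"].shuffle(seed=42).select(range(100))
val_subset = dataset["validation"].shuffle(seed=42).select(range(50))

print(train_subset)  # Dataset({features: ['document', 'summary', 'id'], num_rows: 100})
```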

### Training Details

- Base model: facebook/bart-large (pre-trained on large corpora).
- Fine-tuning:
  - Epochs: 3
  - Batch size: 8
  - Learning rate: 5e-5
  - Max input length: 512 tokens
  - Max output length: 64 tokens
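
A minimal fine-tuning sketch consistent with these hyperparameters, using the Transformers `Seq2SeqTrainer`, is shown below. The preprocessing and trainer wiring are illustrative assumptions, not the exact original training script:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Truncate articles to 512 input tokens and summaries to 64 label tokens,
    # matching the lengths listed above.
    model_inputs = tokenizer(batch["document"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# train_subset and val_subset come from the sampling sketch above
train_data = train_subset.map(preprocess, batched=True, remove_columns=train_subset.column_names)
val_data = val_subset.map(preprocess, batched=True, remove_columns=val_subset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="fewshot-xsum-bart",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    eval_dataset=val_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
print(trainer.evaluate())  # reports validation loss on the 50 held-out samples
```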

### Performance
Due to the few-shot nature of this model, its performance is not directly comparable to models trained on the full XSUM dataset. However, it demonstrates the potential of few-shot learning for summarization tasks. Key metrics on the validation set (50 samples):

#### Few-shot learning model
- ROUGE scores:
  - ROUGE-1: 0.3498
  - ROUGE-2: 0.1308
  - ROUGE-L: 0.2745
- BLEU score: 6.18

#### Zero-shot/Baseline model
- ROUGE scores:
  - ROUGE-1: 0.1560
  - ROUGE-2: 0.0174
  - ROUGE-L: 0.1204
- BLEU score: 0.62
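
These metrics can be reproduced approximately with the Hugging Face `evaluate` library. The sketch below makes two assumptions: generation settings mirror the Usage example, and the BLEU scores above are on a 0-100 scale (the `evaluate` BLEU metric reports 0-1):

```python
# requires: pip install evaluate rouge_score
import evaluate
from transformers import pipeline

summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# val_subset comes from the sampling sketch above
predictions = [
    out["summary_text"]
    for out in summarizer(list(val_subset["document"]), max_length=64, truncation=True)
]
references = list(val_subset["summary"])

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))

bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
```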

### Usage
Use this model for few-shot abstractive summarization tasks. Below is an example of how to load and use the model:

```python
from transformers import pipeline

# Load the few-shot model
summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# Provide input text
input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

# Generate summary
summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```
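
With `do_sample=False` the pipeline decodes deterministically, and `max_length=64` matches the maximum output length used during fine-tuning; adjust `min_length` and `max_length` if you need shorter or longer summaries.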

### Limitations
- The model is trained on a very small dataset, so its performance may not generalize well to all types of text.
- The purpose of building this model is to compare its performance with zero-shot and full-shot learning models.
- It is best suited for tasks where only limited labeled data is available.
- The model is fine-tuned on BBC articles from the XSUM dataset; its performance may vary on text from other domains.
- The model may overfit to the training data due to the small dataset size.

##### Full-shot learning model
For a more general-purpose summarization model, check out the full model trained on the entire XSUM dataset: [WIP].

### Citation
If you use this model in your research, please cite it as follows:

```bibtex
@misc{fewshot-xsum-bart,
  author = {Bhargavi Sriram},
  title = {Few-Shot Abstractive Summarization with BART-Large},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/bhargavis/fewshot-xsum-bart}},
}
```