Update README.md (#1)
Commit: e374e861e889fed598826848f0052a001ea3de66

README.md

---
license: mit
datasets:
- EdinburghNLP/xsum
metrics:
- bleu
- rouge
base_model:
- facebook/bart-large
pipeline_tag: summarization
---

### Model Description

- Model Card for `fewshot-xsum-bart`
- Model Name: fewshot-xsum-bart
- Base Model: facebook/bart-large
- Task: Summarization (Few-Shot Learning)

### Dataset: XSUM (Extreme Summarization Dataset)

- Few-shot setup: trained on 100 samples from the XSUM training set and validated on 50 samples from the XSUM validation set.
- This model is a few-shot learning variant of BART-large, fine-tuned on a very small subset of the XSUM dataset.
- Its purpose is to demonstrate the effectiveness of few-shot learning for summarization tasks where only a limited amount of labeled data is available.

### Purpose

The goal of this model is to explore how well a large pre-trained language model such as BART performs on abstractive summarization when fine-tuned with very limited data (few-shot learning). By training on only 100 samples and validating on 50, it serves as a proof of concept for few-shot summarization.

- Training set: 100 samples (randomly selected from the XSUM training set).
- Validation set: 50 samples (randomly selected from the XSUM validation set).

The small dataset size is intentional, as the focus is on few-shot learning rather than large-scale training.

### Training Details

- Base Model: facebook/bart-large (pre-trained on large corpora).
- Fine-Tuning:
  - Epochs: 3
  - Batch Size: 8
  - Learning Rate: 5e-5
  - Max Input Length: 512 tokens
  - Max Output Length: 64 tokens
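The training script itself is not published with this card, but the hyperparameters above map directly onto a standard `transformers` fine-tuning loop. Below is a minimal sketch, assuming the XSUM `document`/`summary` column names and a `Seq2SeqTrainer` setup; the shuffle seed, output path, and preprocessing details are illustrative assumptions, not the card's actual configuration:

```python
# Minimal fine-tuning sketch (assumed setup, not the card's actual script).
from datasets import load_dataset
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

# Few-shot subsets: 100 training and 50 validation samples (seed is an assumption).
xsum = load_dataset("EdinburghNLP/xsum")
train_ds = xsum["train"].shuffle(seed=42).select(range(100))
val_ds = xsum["validation"].shuffle(seed=42).select(range(50))

def preprocess(batch):
    # XSUM stores articles in "document" and one-sentence summaries in "summary".
    model_inputs = tokenizer(batch["document"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_ds = train_ds.map(preprocess, batched=True, remove_columns=train_ds.column_names)
val_ds = val_ds.map(preprocess, batched=True, remove_columns=val_ds.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="fewshot-xsum-bart",  # hypothetical output path
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_ds,
    eval_dataset=val_ds,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
```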
### Performance

Due to the few-shot nature of this model, its performance is not directly comparable to models trained on the full XSUM dataset; it is meant to demonstrate the potential of few-shot learning for summarization. Key metrics on the validation set (50 samples):

#### Few-shot learning model

- ROUGE Scores:
  - ROUGE-1: 0.3498
  - ROUGE-2: 0.1308
  - ROUGE-L: 0.2745
- BLEU Score: 6.18

#### Zero-shot/baseline model

- ROUGE Scores:
  - ROUGE-1: 0.1560
  - ROUGE-2: 0.0174
  - ROUGE-L: 0.1204
- BLEU Score: 0.62
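The evaluation script is not included with the card, so the following is a plausible reconstruction using the `evaluate` library. The metric variants (ROUGE with default settings, SacreBLEU on its 0-100 scale) and generation parameters are assumptions and may not match the exact configuration behind the numbers above:

```python
# Hypothetical evaluation sketch; not the script that produced the card's numbers.
import evaluate
from datasets import load_dataset
from transformers import pipeline

summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")
val_ds = load_dataset("EdinburghNLP/xsum", split="validation").select(range(50))

# Generate summaries for the 50 validation articles.
outputs = summarizer(val_ds["document"], max_length=64, min_length=30, truncation=True)
predictions = [out["summary_text"] for out in outputs]
references = val_ds["summary"]

rouge = evaluate.load("rouge")
bleu = evaluate.load("sacrebleu")

print(rouge.compute(predictions=predictions, references=references))
print(bleu.compute(predictions=predictions, references=[[ref] for ref in references]))
```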
### Usage

Use this model for few-shot abstractive summarization. Below is an example of how to load and use it:

```python
from transformers import pipeline

# Load the few-shot model
summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# Provide input text
input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

# Generate summary
summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```
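Note that `do_sample=False` disables sampling, so decoding is deterministic and repeated calls produce the same summary, while `max_length` and `min_length` bound the generated summary length in tokens.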
### Limitations

- The model is trained on a very small dataset, so its performance may not generalize well to all types of text.
- It was built to compare few-shot performance against zero-shot and full-shot learning models.
- It is best suited for tasks where only limited labeled data is available.
- The model is fine-tuned on BBC articles from the XSUM dataset; its performance may vary on text from other domains.
- The model may overfit to the training data due to the small dataset size.

##### Full-shot learning model

For a more general-purpose summarization model, check out the full model trained on the entire XSUM dataset: [WIP].

### Citation

If you use this model in your research, please cite it as follows:

```bibtex
@misc{fewshot-xsum-bart,
  author       = {Bhargavi Sriram},
  title        = {Few-Shot Abstractive Summarization with BART-Large},
  year         = {2023},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/bhargavis/fewshot-xsum-bart}},
}
```