---
license: mit
datasets:
- EdinburghNLP/xsum
metrics:
- bleu
- rouge
base_model:
- facebook/bart-large
pipeline_tag: summarization
---

### Model Description

Model card for `fewshot-xsum-bart`.

* Model Name: fewshot-xsum-bart
* Base Model: facebook/bart-large
* Task: Summarization (Few-Shot Learning)

### Dataset: XSUM (Extreme Summarization Dataset)

- Few-shot setup: trained on 100 samples from the XSUM training set and validated on 50 samples from the XSUM validation set.
- This model is a few-shot learning variant of BART-large, fine-tuned on a very small subset of the XSUM dataset.
- Its purpose is to demonstrate the effectiveness of few-shot learning in summarization tasks where only a limited amount of labeled data is available.

### Purpose
The goal of this model is to explore how well a large pre-trained language model like BART can perform on abstractive summarization when fine-tuned with very limited data (few-shot learning). By training on only 100 samples and validating on 50, it serves as a proof of concept for few-shot summarization tasks.

- Training set: 100 samples (randomly selected from the XSUM training set).
- Validation set: 50 samples (randomly selected from the XSUM validation set).

The small dataset size is intentional, as the focus is on few-shot learning rather than large-scale training.
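
For reproducibility, here is a minimal sketch of this sampling setup using the Hugging Face `datasets` library. The seed and the shuffle-then-select approach are illustrative assumptions; the exact selection procedure used for this model is not recorded.

```python
from datasets import load_dataset

# XSUM pairs BBC articles ("document") with one-sentence summaries ("summary")
dataset = load_dataset("EdinburghNLP/xsum")

# Few-shot subsets: 100 training and 50 validation examples.
# NOTE: seed=42 is an assumption; the original sampling seed is not documented.
train_subset = dataset["train"].shuffle(seed=42).select(range(100))
val_subset = dataset["validation"].shuffle(seed=42).select(range(50))

print(train_subset)  # Dataset({features: ['document', 'summary', 'id'], num_rows: 100})
```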

### Training Details

- Base model: facebook/bart-large (pre-trained on large corpora).
- Fine-tuning:
  - Epochs: 3
  - Batch size: 8
  - Learning rate: 5e-5
  - Max input length: 512 tokens
  - Max output length: 64 tokens
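
A minimal fine-tuning sketch consistent with these hyperparameters, using the Transformers `Seq2SeqTrainer`, is shown below. The preprocessing and trainer wiring are illustrative assumptions, not the exact original training script:

```python
from transformers import (
    AutoModelForSeq2SeqLM,
    AutoTokenizer,
    DataCollatorForSeq2Seq,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

def preprocess(batch):
    # Truncate articles to 512 input tokens and summaries to 64 label tokens,
    # matching the lengths listed above.
    model_inputs = tokenizer(batch["document"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=64, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

# train_subset and val_subset come from the sampling sketch above
train_data = train_subset.map(preprocess, batched=True, remove_columns=train_subset.column_names)
val_data = val_subset.map(preprocess, batched=True, remove_columns=val_subset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="fewshot-xsum-bart",
    num_train_epochs=3,
    per_device_train_batch_size=8,
    learning_rate=5e-5,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=train_data,
    eval_dataset=val_data,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),
)
trainer.train()
print(trainer.evaluate())  # reports validation loss on the 50 held-out samples
```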

### Performance
Due to the few-shot nature of this model, its performance is not directly comparable to models trained on the full XSUM dataset. However, it demonstrates the potential of few-shot learning for summarization tasks. Key metrics on the validation set (50 samples):

#### Few-shot learning model
- ROUGE scores:
  - ROUGE-1: 0.3498
  - ROUGE-2: 0.1308
  - ROUGE-L: 0.2745
- BLEU score: 6.18

#### Zero-shot/Baseline model
- ROUGE scores:
  - ROUGE-1: 0.1560
  - ROUGE-2: 0.0174
  - ROUGE-L: 0.1204
- BLEU score: 0.62
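
These metrics can be reproduced approximately with the Hugging Face `evaluate` library. The sketch below makes two assumptions: generation settings mirror the Usage example, and the BLEU scores above are on a 0-100 scale (the `evaluate` BLEU metric reports 0-1):

```python
# requires: pip install evaluate rouge_score
import evaluate
from transformers import pipeline

summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# val_subset comes from the sampling sketch above
predictions = [
    out["summary_text"]
    for out in summarizer(list(val_subset["document"]), max_length=64, truncation=True)
]
references = list(val_subset["summary"])

rouge = evaluate.load("rouge")
print(rouge.compute(predictions=predictions, references=references))

bleu = evaluate.load("bleu")
print(bleu.compute(predictions=predictions, references=[[r] for r in references])["bleu"])
```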

### Usage
Use this model for few-shot abstractive summarization tasks. Below is an example of how to load and use the model:

```python
from transformers import pipeline

# Load the few-shot model
summarizer = pipeline("summarization", model="bhargavis/fewshot-xsum-bart")

# Provide input text
input_text = """
Authorities have issued a warning after multiple sightings of a large brown bear in the woods. The bear is known to become aggressive if disturbed, and residents are urged to exercise caution. Last week, a group of hikers reported a close encounter with the animal. While no injuries were sustained, the bear displayed defensive behavior when approached. Wildlife officials advise keeping a safe distance and avoiding the area if possible. Those encountering the bear should remain calm, back away slowly, and refrain from making sudden movements. Officials continue to monitor the situation.
"""

# Generate summary
summary = summarizer(input_text, max_length=64, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```
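
With `do_sample=False` the pipeline decodes deterministically, and `max_length=64` matches the maximum output length used during fine-tuning; adjust `min_length` and `max_length` if you need shorter or longer summaries.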

### Limitations
- The model is trained on a very small dataset, so its performance may not generalize well to all types of text.
- The purpose of building this model is to compare its performance with zero-shot and full-shot learning models.
- It is best suited for tasks where only limited labeled data is available.
- The model is fine-tuned on BBC articles from the XSUM dataset; its performance may vary on text from other domains.
- The model may overfit to the training data due to the small dataset size.

##### Full-shot learning model
For a more general-purpose summarization model, check out the full model trained on the entire XSUM dataset: [WIP].

### Citation
If you use this model in your research, please cite it as follows:

```bibtex
@misc{fewshot-xsum-bart,
  author = {Bhargavi Sriram},
  title = {Few-Shot Abstractive Summarization with BART-Large},
  year = {2023},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/bhargavis/fewshot-xsum-bart}},
}
```