|
---
datasets:
- sentence-transformers/sentence-compression
language:
- en
metrics:
- sari
- rouge
base_model:
- facebook/bart-large
pipeline_tag: text2text-generation
tags:
- sentence-compression
- sentence-simplification
---
|
## Fine-Tuned BART-Large for Sentence Compression |
|
|
|
### Model Overview |
|
|
|
This model is a fine-tuned version of `facebook/bart-large` trained on the `sentence-transformers/sentence-compression` dataset. It generates compressed versions of input sentences while preserving fluency and meaning.
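
For instance, sentence compression turns an input such as "The company announced on Monday that it will be expanding its operations into three new markets next year." into a shorter paraphrase like "The company will expand into three new markets next year." (an illustrative pair, not an actual model output).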
|
|
|
--- |
|
|
|
### Training Details |
|
|
|
- **Base Model:** `facebook/bart-large`
- **Dataset:** `sentence-transformers/sentence-compression`
- **Batch Size:** 8
- **Epochs:** 5
- **Learning Rate:** 2e-5
- **Weight Decay:** 0.01
- **Evaluation Metric for Best Model:** SARI Penalized
- **Precision:** FP16 (mixed-precision training for efficiency)
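
The training script itself is not included in this repository. As a rough sketch, the hyperparameters above could be expressed with Hugging Face `Seq2SeqTrainingArguments` as follows; the output directory, evaluation schedule, and the `sari_penalized` metric key are assumptions for illustration, not the exact configuration used.

```python
from transformers import Seq2SeqTrainingArguments

# Illustrative reconstruction of the reported hyperparameters.
training_args = Seq2SeqTrainingArguments(
    output_dir="bart-large-sentence-compression",  # assumed output path
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,                  # assumed to match training
    num_train_epochs=5,
    learning_rate=2e-5,
    weight_decay=0.01,
    fp16=True,                                     # mixed-precision training
    evaluation_strategy="epoch",                   # assumed schedule
    save_strategy="epoch",
    load_best_model_at_end=True,
    metric_for_best_model="sari_penalized",        # assumed key for SARI Penalized
    predict_with_generate=True,                    # generate during evaluation
)
```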
|
|
|
--- |
|
### Evaluation Results |
|
|
|
#### Validation Set Performance
|
|
|
| Metric         | Score |
|----------------|-------|
| SARI           | 89.68 |
| SARI Penalized | 88.42 |
| ROUGE-1        | 93.05 |
| ROUGE-2        | 88.47 |
| ROUGE-L        | 92.98 |
|
|
|
#### Test Set Performance
|
|
|
| Metric         | Score |
|----------------|-------|
| SARI           | 89.76 |
| SARI Penalized | 88.32 |
| ROUGE-1        | 93.14 |
| ROUGE-2        | 88.65 |
| ROUGE-L        | 93.07 |
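
Standard SARI and ROUGE scores can be reproduced with the Hugging Face `evaluate` library, as sketched below with illustrative data; the penalized SARI variant reported above is a length-penalized extension that is not part of the standard package, so it is omitted here.

```python
import evaluate

sari = evaluate.load("sari")
rouge = evaluate.load("rouge")

# Illustrative inputs; substitute the dataset's source sentences,
# model outputs, and gold compressions.
sources = ["The committee, after much deliberation, finally reached a decision on Tuesday."]
predictions = ["The committee reached a decision on Tuesday."]
references = [["The committee reached a decision on Tuesday."]]

# SARI scores the prediction against both the source and the references.
print(sari.compute(sources=sources, predictions=predictions, references=references))

# ROUGE scores the prediction against the references.
print(rouge.compute(predictions=predictions, references=references))
```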
|
|
|
--- |
|
### Training Loss Curve |
|
|
|
Training and evaluation loss over training steps are plotted below; the source figure is also available in this repository as `bart-large-sentence-compression_loss.eps`.
|
|
|
<img src="Training_and_Evaluation_Loss_Plot.png" alt="Training and evaluation loss curves" width="200" height="200">
|
|
|
--- |
|
### Usage
|
|
|
#### Load the Model
|
|
|
```python
from transformers import BartForConditionalGeneration, BartTokenizer

model_name = "shahin-as/bart-large-sentence-compression"

model = BartForConditionalGeneration.from_pretrained(model_name)
tokenizer = BartTokenizer.from_pretrained(model_name)

def compress_sentence(sentence: str) -> str:
    """Generate a compressed version of the input sentence."""
    inputs = tokenizer(sentence, return_tensors="pt", max_length=1024, truncation=True)
    # Beam search decoding, capped at 50 output tokens.
    summary_ids = model.generate(
        **inputs,
        max_length=50,
        num_beams=5,
        length_penalty=2.0,
        early_stopping=True,
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)

# Example usage
sentence = "Insert the sentence to be compressed here."
compressed_sentence = compress_sentence(sentence)
print("Original:", sentence)
print("Compressed:", compressed_sentence)
```
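
The same checkpoint can also be loaded through the high-level `pipeline` API. A minimal sketch, assuming the checkpoint behaves as a standard BART summarization model:

```python
from transformers import pipeline

# The summarization pipeline wraps tokenization, generation, and decoding.
compressor = pipeline("summarization", model="shahin-as/bart-large-sentence-compression")

result = compressor("Insert the sentence to be compressed here.", max_length=50, num_beams=5)
print(result[0]["summary_text"])
```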