pszemraj/led-base-book-summary


pszemraj committed on Mar 3, 2022

Commit 9b33f3e · 1 Parent(s): 5e6d18e

add details on use cases

Files changed (1)

README.md +3 -2

@@ -69,7 +69,8 @@ inference:
 # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
-- an 'upgraded' version of [`pszemraj/led-base-16384-finetuned-booksum`](https://huggingface.co/pszemraj/led-base-16384-finetuned-booksum), it was trained for an additional epoch with a max summary length of 1024 tokens (original was trained with 512) as a small portion of the summaries are between 512-1024 tokens long.
   - all the parameters for generation on the API are the same for easy comparison between versions.
 - works well on lots of text, can handle 16384 tokens/batch.
@@ -83,7 +84,7 @@ inference:
 # Usage - Basics
 - it is recommended to use `encoder_no_repeat_ngram_size=3` when calling the pipeline object to improve summary quality.
-  - this param forces the model to use new vocabulary and create an abstractive summary, otherwise it may compile the best _extractive_ summary from the input provided.
 - create the pipeline object:
 ```

 # Longformer Encoder-Decoder (LED) fine-tuned on Booksum
+- **Use cases:** long narrative summarization (think stories, as the dataset intended), article/paper/textbook/other summarization, and technical-to-simple summarization. Models trained on this dataset also tend to _explain_ what they are summarizing, which IMO is awesome.
+- This is an 'upgraded' version of [`pszemraj/led-base-16384-finetuned-booksum`](https://huggingface.co/pszemraj/led-base-16384-finetuned-booksum). It was trained for an additional epoch with a max summary length of 1024 tokens (the original was trained with 512), as a small portion of the summaries are between 512 and 1024 tokens long.
   - all the parameters for generation on the API are the same for easy comparison between versions.
 - works well on lots of text, can handle 16384 tokens/batch.
 # Usage - Basics
 - it is recommended to use `encoder_no_repeat_ngram_size=3` when calling the pipeline object to improve summary quality.
+  - this param forces the model to use new vocabulary and create an abstractive summary; otherwise it may compile the best _extractive_ summary from the input provided.
 - create the pipeline object:
 ```