pszemraj commited on
Commit
a6b49df
·
1 Parent(s): 9443f08
Files changed (1) hide show
  1. README.md +2 -0
README.md CHANGED
@@ -50,6 +50,8 @@ Let's think about a sensible choice of key tokens that a queried token actually
50
  >>> key_tokens = [] # => currently 'available' token doesn't have anything to attend
51
  Nearby tokens should be important because, in a sentence (sequence of words), the current word is highly dependent on neighboring past & future tokens. This intuition is the idea behind the concept of sliding attention."
52
  example_title: "bigbird blog intro"
 
 
53
  - text: " language en tags summarization led summary longformer license apache20 datasets kmfoda/booksum metrics rouge widget inference parameters max_length 64 min_length 4 no_repeat_ngram_size 2 early_stopping true repetition_penalty 24 length_penalty 05 encoder_no_repeat_ngram_size 3 num_beams 4 # longformer encoderdecoder (led) finetuned on booksum an upgraded version of [`pszemraj/ledbase16384finetunedbooksum`](https //huggingface co/pszemraj/ledbase16384finetunedbooksum) it was trained for an additional epoch with a max summary length of 1024 tokens (original was trained with 512) as a small portion of the summaries are between 5121024 tokens long all the parameters for generation on the api are the same for easy comparison between versions works well on lots of text can hand 16384 tokens/batch ## other checkpoints on booksum a oneepoch version of [ledlarge is available here](https //huggingface co/pszemraj/ledlargebooksummary1e) a more polished version still wip # usage basics it is recommended to use `encoder_no_repeat_ngram_size=3` when calling the pipeline object to improve summary quality this param forces the model to use new vocabulary and create an abstractive summary otherwise it may l compile the best _extractive_ summary from the input provided create the pipeline object ``` from transformers import automodelforseq2seqlm autotokenizer from transformers import pipeline hf_name = pszemraj/ledbasebooksummary _model = automodelforseq2seqlm from_pretrained( hf_name low_cpu_mem_usage=true ) _tokenizer = autotokenizer from_pretrained( hf_name ) summarizer = pipeline( summarization model=_model tokenizer=_tokenizer ) ``` put words into the pipeline object ``` wall_of_text = your words here result = summarizer( wall_of_text min_length=16 max_length=256 no_repeat_ngram_size=3 encoder_no_repeat_ngram_size =3 clean_up_tokenization_spaces=true repetition_penalty=37 num_beams=4 early_stopping=true ) ``` # results **no results for this version yet** "
54
  example_title: "the readme"
55
 
 
50
  >>> key_tokens = [] # => currently 'available' token doesn't have anything to attend
51
  Nearby tokens should be important because, in a sentence (sequence of words), the current word is highly dependent on neighboring past & future tokens. This intuition is the idea behind the concept of sliding attention."
52
  example_title: "bigbird blog intro"
53
+ - text: "The tower is 324 metres (1,063 ft) tall, about the same height as an 81-storey building, and the tallest structure in Paris. Its base is square, measuring 125 metres (410 ft) on each side. During its construction, the Eiffel Tower surpassed the Washington Monument to become the tallest man-made structure in the world, a title it held for 41 years until the Chrysler Building in New York City was finished in 1930. It was the first structure to reach a height of 300 metres. Due to the addition of a broadcasting aerial at the top of the tower in 1957, it is now taller than the Chrysler Building by 5.2 metres (17 ft). Excluding transmitters, the Eiffel Tower is the second tallest free-standing structure in France after the Millau Viaduct."
54
+ example_title: "das classic"
55
  - text: " language en tags summarization led summary longformer license apache20 datasets kmfoda/booksum metrics rouge widget inference parameters max_length 64 min_length 4 no_repeat_ngram_size 2 early_stopping true repetition_penalty 24 length_penalty 05 encoder_no_repeat_ngram_size 3 num_beams 4 # longformer encoderdecoder (led) finetuned on booksum an upgraded version of [`pszemraj/ledbase16384finetunedbooksum`](https //huggingface co/pszemraj/ledbase16384finetunedbooksum) it was trained for an additional epoch with a max summary length of 1024 tokens (original was trained with 512) as a small portion of the summaries are between 5121024 tokens long all the parameters for generation on the api are the same for easy comparison between versions works well on lots of text can hand 16384 tokens/batch ## other checkpoints on booksum a oneepoch version of [ledlarge is available here](https //huggingface co/pszemraj/ledlargebooksummary1e) a more polished version still wip # usage basics it is recommended to use `encoder_no_repeat_ngram_size=3` when calling the pipeline object to improve summary quality this param forces the model to use new vocabulary and create an abstractive summary otherwise it may l compile the best _extractive_ summary from the input provided create the pipeline object ``` from transformers import automodelforseq2seqlm autotokenizer from transformers import pipeline hf_name = pszemraj/ledbasebooksummary _model = automodelforseq2seqlm from_pretrained( hf_name low_cpu_mem_usage=true ) _tokenizer = autotokenizer from_pretrained( hf_name ) summarizer = pipeline( summarization model=_model tokenizer=_tokenizer ) ``` put words into the pipeline object ``` wall_of_text = your words here result = summarizer( wall_of_text min_length=16 max_length=256 no_repeat_ngram_size=3 encoder_no_repeat_ngram_size =3 clean_up_tokenization_spaces=true repetition_penalty=37 num_beams=4 early_stopping=true ) ``` # results **no results for this version yet** "
56
  example_title: "the readme"
57