liamcripwell/ctrl44-simp

text2text-generation


liamcripwell committed on Apr 21, 2022

Commit 533efd4 · 1 Parent(s): 079aa3f

Create README.md

Files changed (1)

README.md +22 -0


	@@ -0,0 +1,22 @@

+# CTRL44 Simplification model
+This is a pretrained version of the controllable simplification model presented in the NAACL 2022 paper "Controllable Sentence Simplification via Operation Classification". It was trained on the IRSD simplification dataset.
+A control token is expected at the start of each input sequence to specify which simplification operation should be performed. The token can be chosen manually or predicted by an operation classifier like [this one](https://huggingface.co/liamcripwell/ctrl44-clf).
+Possible control tokens are: "\<ident\>", "\<para\>", "\<ssplit\>", and "\<dsplit\>".
+## How to use
+Here is how to use this model in PyTorch:
+```python
+from transformers import BartForConditionalGeneration, AutoTokenizer
+model = BartForConditionalGeneration.from_pretrained("liamcripwell/ctrl44-simp")
+tokenizer = AutoTokenizer.from_pretrained("liamcripwell/ctrl44-simp")
+text = "<para> Barack Hussein Obama II is an American politician who served as the 44th president of the United States from 2009 to 2017."
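+inputs = tokenizer(text, return_tensors="pt")
+outputs = model.generate(**inputs, num_beams=10, max_length=128)
+# Decode the generated token IDs back into a string
+simplified = tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]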
+```
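
The control-token convention described in the README can be wrapped in a small helper. The following is a minimal sketch, not part of the model repository: `add_control_token` and `simplify` are hypothetical names, and the single space between the token and the sentence is an assumption taken from the README's own `<para>` example. The `simplify` function expects a `model` and `tokenizer` loaded as in the README snippet.

```python
# Control tokens listed in the README.
CONTROL_TOKENS = ("<ident>", "<para>", "<ssplit>", "<dsplit>")


def add_control_token(sentence: str, token: str) -> str:
    """Prepend a control token to a sentence (hypothetical helper)."""
    if token not in CONTROL_TOKENS:
        raise ValueError(
            f"unknown control token {token!r}; expected one of {CONTROL_TOKENS}"
        )
    # Assumption: token and sentence are separated by one space,
    # as in the README example input.
    return f"{token} {sentence.strip()}"


def simplify(sentence, token, model, tokenizer, num_beams=10, max_length=128):
    """Run the model on one sentence and return the decoded output string.

    `model` and `tokenizer` are assumed to be the objects loaded with
    `from_pretrained("liamcripwell/ctrl44-simp")` in the README snippet.
    """
    inputs = tokenizer(add_control_token(sentence, token), return_tensors="pt")
    outputs = model.generate(**inputs, num_beams=num_beams, max_length=max_length)
    # generate() returns token IDs; decode the first (best) sequence to text.
    return tokenizer.batch_decode(outputs, skip_special_tokens=True)[0]
```

Validating the token before generation surfaces a mistyped operation as an explicit error rather than letting it pass through as ordinary input text.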