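from transformers import AutoModelForSeq2SeqLM, AutoTokenizer
# Load the tokenizer (with the source language set) and the translation model.
tokenizer = AutoTokenizer.from_pretrained("facebook/mbart-large-50-many-to-many-mmt", src_lang="fi_FI")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/mbart-large-50-many-to-many-mmt")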

Tokenize the text:

# The language of the input text should match the tokenizer's src_lang ("fi_FI" above).
encoded_en = tokenizer(en_text, return_tensors="pt")

To translate into a given language, MBart requires the target language ID to be forced as the first generated token.
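
The idea of forcing the first generated token can be illustrated with a small, library-free sketch. It is a toy model of the mechanism, not the transformers implementation: the vocabulary, the scores, and the helper names `force_first_token` and `greedy_decode` are all hypothetical, and a real decoder also begins from a start token, which is omitted here. In transformers, the same effect is usually requested by passing `forced_bos_token_id` to `model.generate`.

```python
import math

def force_first_token(scores, step, forced_id):
    """On the first decoding step only, mask every token except forced_id to -inf.

    Later steps are returned unchanged (as a copy).
    """
    if step != 0:
        return list(scores)
    return [0.0 if i == forced_id else -math.inf for i in range(len(scores))]

def greedy_decode(step_scores, forced_id):
    """Greedy decoding over precomputed per-step scores.

    step_scores is a hypothetical stand-in for the scores a real model would emit.
    """
    output = []
    for step, scores in enumerate(step_scores):
        masked = force_first_token(scores, step, forced_id)
        # Pick the index of the highest remaining score.
        output.append(max(range(len(masked)), key=masked.__getitem__))
    return output

# Toy vocabulary: "en_XX" plays the role of a target language code.
VOCAB = ["<s>", "hello", "moi", "en_XX"]
EN_ID = VOCAB.index("en_XX")

# Hypothetical raw scores; without forcing, step 0 would pick "moi" (index 2).
step_scores = [
    [0.1, 0.2, 0.9, 0.3],
    [0.0, 0.8, 0.1, 0.1],
]

tokens = greedy_decode(step_scores, EN_ID)
print([VOCAB[i] for i in tokens])  # ['en_XX', 'hello']
```

Because only the first step is masked, the language code steers the rest of the output without otherwise constraining what the model generates.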