xgiga committed
Commit 83d9e11 (verified)
Parent: 480c367

Update README.md

Files changed (1)
1. README.md +3 -1
README.md CHANGED
@@ -24,7 +24,9 @@ It achieves the following results on the evaluation set:
 
 ## Model description
 
- More information needed
+ mBART-50 is a multilingual sequence-to-sequence model. It was introduced to show that multilingual translation models can be created through multilingual fine-tuning: instead of fine-tuning on a single direction, a pre-trained model is fine-tuned on many directions simultaneously. mBART-50 was created by extending the original mBART model with 25 additional languages, supporting multilingual machine translation across 50 languages. The pre-training objective is explained below.
+
+ Multilingual Denoising Pretraining: the model incorporates N languages by concatenating data: D = {D1, ..., DN}, where each Di is a collection of monolingual documents in language i. The source documents are noised using two schemes: first, randomly shuffling the order of the original sentences, and second, a novel in-filling scheme in which spans of text are replaced with a single mask token. The model is then tasked with reconstructing the original text. 35% of each instance's words are masked by randomly sampling span lengths from a Poisson distribution (λ = 3.5). The decoder input is the original text offset by one position. A language-id symbol LID is used as the initial token to predict the sentence.
 
 ## Intended uses & limitations
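
For context on the description added above, here is a minimal sketch of running an mBART-50 checkpoint for translation with the 🤗 Transformers API. The base `facebook/mbart-large-50-many-to-many-mmt` checkpoint, the language codes, and the example sentence are illustrative assumptions, not part of this commit; a fine-tuned checkpoint would be loaded the same way.

```python
from transformers import MBartForConditionalGeneration, MBart50TokenizerFast

# Illustrative base checkpoint; substitute your own fine-tuned model id.
model = MBartForConditionalGeneration.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt"
)
tokenizer = MBart50TokenizerFast.from_pretrained(
    "facebook/mbart-large-50-many-to-many-mmt"
)

tokenizer.src_lang = "fr_XX"  # source language code
inputs = tokenizer("Le chat dort sur le canapé.", return_tensors="pt")

# Force the decoder to start with the target language-id (LID) token,
# matching the pretraining convention described in the README.
generated = model.generate(
    **inputs,
    forced_bos_token_id=tokenizer.lang_code_to_id["en_XX"],
)
print(tokenizer.batch_decode(generated, skip_special_tokens=True))
```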
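
The denoising objective in the second added paragraph can also be sketched in a few lines. This is a simplified, word-level illustration under stated assumptions (the `noise` helper and the `<mask>` literal are hypothetical), not the actual implementation, which operates on subword tokens; during pretraining the decoder then reconstructs the original text from its one-position-offset copy, starting from the LID token.

```python
import random
import numpy as np

MASK = "<mask>"

def noise(sentences, mask_ratio=0.35, lam=3.5):
    """Toy version of the two noising schemes: sentence shuffling + span in-filling."""
    sentences = list(sentences)
    random.shuffle(sentences)               # scheme 1: permute sentence order
    words = " ".join(sentences).split()
    budget = int(mask_ratio * len(words))   # mask roughly 35% of the words in total
    out, i = [], 0
    while i < len(words):
        span = np.random.poisson(lam)       # span length ~ Poisson(3.5)
        if budget > 0 and span > 0 and random.random() < mask_ratio:
            out.append(MASK)                # scheme 2: whole span -> one <mask>
            i += span
            budget -= span
        else:
            out.append(words[i])
            i += 1
    return " ".join(out)

print(noise(["The cat sleeps.", "It is raining.", "Birds sing."]))
```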