Text2Text Generation
Transformers
Safetensors
English
German
mt5
wardenga committed · verified
Commit df70dea · 1 Parent(s): 25ff330

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +22 -2
README.md CHANGED
@@ -3,6 +3,7 @@
  datasets:
  - bigscience/xP3
  - mc4
+ - Leipzig_corpora_collection
  license: apache-2.0
  language:
  - en
@@ -24,10 +25,29 @@ The vocabulary has been adjusted to contain the top 2000 tokens overall as well

  The new vocabulary has been computed using
  - deu_news_2023_1M for de
- - eng_news_2023_1M for en
+ - eng_news_2023_1M for en
+
+ from the [Leipzig language corpora collection](https://wortschatz.uni-leipzig.de/en/download/).
+
+ According to the mentioned article, the performance degradation can be expected to be small (but no evaluation was done).
+
+
+
+ # Language Adaptation of mT0
+ This model is an adaptation of the mT0 variant (see below) for the languages en and de, based on the methodology in [Load What You Need: Smaller Versions of Multilingual BERT](https://arxiv.org/ftp/arxiv/papers/2010/2010.05609.pdf).
+
+ The vocabulary has been adjusted to contain the top 2000 tokens overall as well as
+ - the top 15000 tokens from en
+ - the top 30000 tokens from de
+ and 100 special tokens.
+
+ The new vocabulary has been computed using
+ - deu_news_2023_1M for de
+ - eng_news_2023_1M for en
+
  from the [Leipzig language corpora collection](https://wortschatz.uni-leipzig.de/en/download/).

- According to the mentioned article, the performance degradation can be expected to be small (but no evaluation was done.
+ According to the mentioned article, the performance degradation can be expected to be small (but no evaluation was done).

  ![xmtf](https://github.com/bigscience-workshop/xmtf/blob/master/xmtf_banner.png?raw=true)

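For illustration, the token-selection step described in the added README section could look roughly like the sketch below. This is a minimal sketch under stated assumptions: the Leipzig corpora are assumed to be available as local plain-text sentence files, the base checkpoint name, file names, and helper functions are placeholders (the exact mT0 variant is named elsewhere in the model card), and this is not the script actually used to build the model.

```python
from collections import Counter

from transformers import AutoTokenizer

# Placeholder checkpoint: the exact mT0 variant is specified elsewhere in the model card.
tokenizer = AutoTokenizer.from_pretrained("bigscience/mt0-base")


def token_counts(path):
    """Tokenize a Leipzig sentence file line by line and count token ids."""
    counts = Counter()
    with open(path, encoding="utf-8") as f:
        for line in f:
            # Leipzig sentence files are typically "id<TAB>sentence"; keep only the sentence.
            text = line.split("\t", 1)[-1].strip()
            counts.update(tokenizer(text, add_special_tokens=False)["input_ids"])
    return counts


# Assumed local file names for the corpora named in the README.
de_counts = token_counts("deu_news_2023_1M-sentences.txt")
en_counts = token_counts("eng_news_2023_1M-sentences.txt")
overall_counts = de_counts + en_counts


def top_ids(counts, k):
    """Return the ids of the k most frequent tokens."""
    return {tok_id for tok_id, _ in counts.most_common(k)}


# Keep the top 2000 tokens overall, the top 15000 from en, the top 30000 from de,
# and the (roughly 100) special tokens, as described in the README.
keep = (
    top_ids(overall_counts, 2000)
    | top_ids(en_counts, 15000)
    | top_ids(de_counts, 30000)
    | set(tokenizer.all_special_ids)
)
print(f"Reduced vocabulary: {len(keep)} of {len(tokenizer)} tokens kept")
```

As in the referenced paper, the rows of the embedding matrix (and of the tied output layer) corresponding to the kept ids would then be extracted and the tokenizer's vocabulary rebuilt around them to produce the smaller, en/de-only model.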