---
license: cc-by-nc-4.0
datasets:
- mozilla-foundation/common_voice_11_0
language:
- fr
- es
- pt
- da
- de
- nl
- fy
- zh
- ja
- ar
- sw
- gn
library_name: fairseq
---

HUTTER-12: H(uBERT) UTTER model covering 12 languages.

* Total training hours: 1,622, from:
  * Romance = {fr: 300; es: 300; pt: 102.3}
  * West-Germanic = {da: 3.5; de: 300; nl: 72.1; fy: 41.2}
  * Unrelated = {zh-CN: 104.6; ja: 37; ar: 61; sw: 300; gn: 0.4}
* Number of updates: 400k
* Number of iterations: 3
* Clustering approach: mini-batch K-means (100% of the data)
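
The per-language hours listed above can be tallied with a short script to check the reported total and to see how the data is split across language groups. This is only an illustrative sketch: the `HOURS` dictionary and the `group_totals` helper are names made up for this example and are not part of the model's training code.

```python
# Training hours per language, copied from the list above.
# The dictionary layout and helper name are illustrative assumptions.
HOURS = {
    "Romance": {"fr": 300, "es": 300, "pt": 102.3},
    "West-Germanic": {"da": 3.5, "de": 300, "nl": 72.1, "fy": 41.2},
    "Unrelated": {"zh-CN": 104.6, "ja": 37, "ar": 61, "sw": 300, "gn": 0.4},
}


def group_totals(hours):
    """Sum the hours of each language group, rounded to one decimal."""
    return {group: round(sum(langs.values()), 1) for group, langs in hours.items()}


totals = group_totals(HOURS)
grand_total = round(sum(totals.values()), 1)

# Print each group's hours and its share of the full training set.
for group, total in totals.items():
    print(f"{group}: {total} h ({100 * total / grand_total:.1f}% of total)")
print(f"Total: {grand_total} h")
```

The groups sum to 702.3, 416.8, and 503.0 hours, giving 1,622.1 hours overall, which matches the rounded figure of 1,622 reported above.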