Languages identification - a hoan Collection

updated 13 days ago

A variety of pre-trained language identification models.

alexneakameni/language_detection

Text Classification • Updated 16 days ago • 235

Note: BERT-based language detection model trained on hac541309/open-lid-dataset, which includes 121 million sentences across 200 languages. 24.5M parameters.

papluca/xlm-roberta-base-language-detection

Text Classification • Updated Dec 28, 2023 • 4.93M • 319

Note: Fine-tuned version of xlm-roberta-base on the Language Identification dataset. 20 languages, 278M parameters.

facebook/fasttext-language-identification

Text Classification • Updated Jun 9, 2023 • 693k • 221

Note: fastText model, 217 languages.

julien-c/fasttext-language-id

Updated Sep 23, 2021 • 3.54k • 3

Note: fastText model, 172 languages.

NeuML/language-id

Text Classification • Updated Jan 26 • 296 • 1

Note: staticvectors model, extracted from the fastText model. 32.6M parameters.

NeuML/language-id-quantized

Text Classification • Updated Jan 26 • 2.24k • 1

Note: staticvectors model, extracted from the fastText model and quantized. 4.09M parameters.

cis-lmu/glotlid

Text Classification • Updated Oct 26, 2024 • 39.3k • 60

Note: fastText model, 2000 languages.

laurievb/OpenLID-v2

Text Classification • Updated Nov 28, 2024 • 609k • 1

Note: fastText model, 189 languages.
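
Several entries in this collection are fastText models, which return predictions as labels such as `__label__eng_Latn` or `__label__en`. The following is a minimal sketch of loading one of them and turning its top label into a language code. It assumes the `fasttext` and `huggingface_hub` packages are installed and that the repository stores its weights as `model.bin`, as the facebook/fasttext-language-identification model card describes; other repositories in the list may use a different filename. The helper names `LangLabel`, `parse_fasttext_label`, and `detect` are hypothetical and used only for illustration.

```python
from typing import NamedTuple, Optional


class LangLabel(NamedTuple):
    language: str            # e.g. "eng" or "en"
    script: Optional[str]    # e.g. "Latn", or None when the label has no script part


def parse_fasttext_label(label: str) -> LangLabel:
    """Split a fastText label like '__label__eng_Latn' into language and script."""
    prefix = "__label__"
    code = label[len(prefix):] if label.startswith(prefix) else label
    language, _, script = code.partition("_")
    return LangLabel(language, script or None)


def detect(text: str,
           repo_id: str = "facebook/fasttext-language-identification",
           filename: str = "model.bin") -> LangLabel:
    """Download a fastText model from the Hub and return its top prediction.

    The filename default is an assumption that holds for the default repo_id;
    check each model card before pointing this at another repository.
    """
    # Imported here so parse_fasttext_label works without these dependencies.
    import fasttext
    from huggingface_hub import hf_hub_download

    model_path = hf_hub_download(repo_id=repo_id, filename=filename)
    model = fasttext.load_model(model_path)
    # fastText predicts one line at a time, so newlines are replaced first.
    labels, _probabilities = model.predict(text.replace("\n", " "), k=1)
    return parse_fasttext_label(labels[0])
```

Calling `detect("Bonjour tout le monde")` downloads the model on first use and caches it locally. For repeated calls it would be better to load the model once and reuse it, since this sketch reloads it on every call.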