metadata

license: apache-2.0
metrics:
  - accuracy
  - f1
  - precision
  - recall
pipeline_tag: text-classification
tags:
  - language detection
  - German
  - English
  - French
  - Spanish
  - GEFS
  - Language detector
datasets:
  - papluca/language-identification
language:
  - de
  - en
  - fr
  - es

German, English, French and Spanish Language Detector

ImranzamanML/GEFS-language-detector is a fine-tuned version of the xlm-roberta-base model, trained on the papluca/language-identification dataset.
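The card does not describe how the training data was prepared. As a minimal sketch only, assuming the dataset records expose a `text` field and a `labels` field holding the language code, a four-language subset with integer labels could be built like this. The helper names `build_label_maps` and `select_examples` are hypothetical and are not taken from the author's training code.

```python
from typing import Dict, Iterable, List, Tuple

# Language codes listed as supported in this model card.
TARGET_LANGUAGES = ("de", "en", "es", "fr")


def build_label_maps(languages: Iterable[str]) -> Tuple[Dict[str, int], Dict[int, str]]:
    """Return (label2id, id2label), assigning ids in sorted code order."""
    ordered = sorted(set(languages))
    label2id = {code: idx for idx, code in enumerate(ordered)}
    id2label = {idx: code for code, idx in label2id.items()}
    return label2id, id2label


def select_examples(
    records: Iterable[Dict[str, str]],
    label2id: Dict[str, int],
    text_key: str = "text",      # assumed column name
    label_key: str = "labels",   # assumed column name
) -> List[Dict[str, object]]:
    """Keep records whose language is in label2id and attach an integer label."""
    selected = []
    for record in records:
        code = record[label_key]
        if code in label2id:
            selected.append({"text": record[text_key], "label": label2id[code]})
    return selected
```

The resulting `label2id` and `id2label` mappings are the kind of dictionaries usually passed to a sequence-classification model configuration, so that predictions come back as language codes rather than numeric ids.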

Supported languages

Currently, this model supports 4 languages for Theum AG.

The following languages are supported by the model:

German (de)
English (en)
Spanish (es)
French (fr)
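A minimal usage sketch, assuming the model loads through the `transformers` text-classification pipeline (consistent with the `pipeline_tag` in the metadata) and that it returns the language code as its label. The card does not document the label names, so unrecognised labels are reported as "unknown". The helpers `load_classifier` and `detect_language` are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict, List

# Display names for the four codes listed above.
LANGUAGE_NAMES = {"de": "German", "en": "English", "es": "Spanish", "fr": "French"}

# Anything that maps a string to [{"label": ..., "score": ...}].
Classifier = Callable[[str], List[Dict[str, Any]]]


def load_classifier() -> Classifier:
    """Load the model as a text-classification pipeline.

    transformers is imported lazily so the rest of the sketch
    can be used (and tested) without it installed.
    """
    from transformers import pipeline

    return pipeline(
        "text-classification",
        model="ImranzamanML/GEFS-language-detector",
    )


def detect_language(text: str, classifier: Classifier) -> Dict[str, Any]:
    """Return the top predicted label, a readable name, and its score."""
    top = classifier(text)[0]
    label = str(top["label"])
    # Assumption: labels are language codes such as "de"; otherwise "unknown".
    name = LANGUAGE_NAMES.get(label.lower(), "unknown")
    return {"label": label, "name": name, "score": float(top["score"])}
```

With `transformers` installed, calling `detect_language("Guten Morgen", load_classifier())` would download the model and run it on the given text. Passing the classifier in as an argument keeps the model loaded once across many calls.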

Training Results

Epoch   Training Loss   Validation Loss
1       0.002600        0.000148
2       0.001000        0.000015
3       0.000000        0.000011
4       0.001800        0.000009
5       0.002700        0.000016
6       0.001600        0.000012
7       0.001300        0.000009
8       0.001200        0.000008
9       0.000900        0.000007
10      0.000900        0.000007

Testing Results

Language   Precision   Recall   F1       Accuracy
de         0.9997      0.9998   0.9998   0.9999
en         1.0000      1.0000   1.0000   1.0000
fr         0.9995      0.9996   0.9996   0.9996
es         0.9994      0.9996   0.9995   0.9996
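The card does not state how the per-language figures were computed. One common reading, shown here purely as an assumption, treats each language as a one-vs-rest problem: precision, recall and F1 come from that language's true/false positives and negatives, and accuracy is the share of all test examples classified correctly with respect to that language. The function name `per_language_metrics` is hypothetical.

```python
from typing import Dict, Sequence


def per_language_metrics(
    y_true: Sequence[str],
    y_pred: Sequence[str],
    languages: Sequence[str],
) -> Dict[str, Dict[str, float]]:
    """Compute one-vs-rest precision, recall, F1 and accuracy per language."""
    total = len(y_true)
    report = {}
    for lang in languages:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == lang and p == lang)
        fp = sum(1 for t, p in pairs if t != lang and p == lang)
        fn = sum(1 for t, p in pairs if t == lang and p != lang)
        tn = total - tp - fp - fn
        # Guard against empty denominators when a language never appears.
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if (precision + recall)
            else 0.0
        )
        accuracy = (tp + tn) / total if total else 0.0
        report[lang] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "accuracy": accuracy,
        }
    return report
```

Under this reading, a language's accuracy can exceed its precision and recall, because correctly rejected examples from the other three languages count as true negatives.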