metadata
license: mit
language:
- en
- de
- es
- fr
- pt
metrics:
- accuracy
base_model:
- microsoft/mdeberta-v3-base
pipeline_tag: text-classification
tags:
- formal or informal classification
widget:
- text: Bitte geh einkaufen.
- text: Können Sie mir spontan dabei helfen?
- text: Als nächstes kommen 4g Champignons und 500g Mehl dazu.
formality-classifier-mdeberta-v3-base
This model classifies texts by their formality. It assigns each input to one of three classes: ["formal", "informal", "neutral"]. The neutral class covers texts that have no clear formality, such as passive statements.
In selecting and generating training data, the focus was on languages that have a formal form of address: French, German, Italian, Portuguese and Spanish. Some samples from osyvokon/pavlick-formality-scores were also used in an attempt to teach the model to classify English inputs.
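As a rough illustration of how the three-class output might be consumed, here is a minimal sketch built on the `transformers` text-classification pipeline. The `MODEL_ID` value is a placeholder (this card gives only the model name, not its Hub namespace), `load_classifier` and `classify` are hypothetical helper names, and the label strings are assumed to match the class names listed above.

```python
from typing import Callable, List, Tuple

# Placeholder: replace with the full Hub id ("<namespace>/<model name>").
MODEL_ID = "formality-classifier-mdeberta-v3-base"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (needs `transformers` and network access)."""
    # Imported lazily so that `classify` can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def classify(texts: List[str], classifier: Callable) -> List[Tuple[str, str]]:
    """Pair each input text with its top predicted label."""
    # For a list of strings, the pipeline returns one
    # {"label": ..., "score": ...} dict per input.
    predictions = classifier(texts)
    return [(text, pred["label"]) for text, pred in zip(texts, predictions)]
```

With the real model id filled in, `classify(["Bitte geh einkaufen.", "Können Sie mir spontan dabei helfen?"], load_classifier())` would return each sentence paired with its predicted class. Passing the classifier in as an argument keeps the helper easy to test with a stub.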
Results
Accuracy on the test set:
| Language | Accuracy |
|---|---|
| all | 88.93% |
| English | 79.20% |
| French | 100% |
| German | 97.73% |
| Italian | 97.83% |
| Portuguese | 100% |
| Spanish | 98.53% |
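A per-language breakdown like the one above can be computed with a few lines of Python. This is only a sketch: the card does not say how the "all" row was obtained, so pooling every sample into one overall figure is an assumption, and `accuracy_by_language` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_language(
    records: Iterable[Tuple[str, str, str]]
) -> Dict[str, float]:
    """Compute accuracy per language plus a pooled "all" entry.

    Each record is (language, gold_label, predicted_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in records:
        # Count every sample once for its language and once for the pooled total.
        for key in (language, "all"):
            total[key] += 1
            correct[key] += int(gold == predicted)
    return {key: correct[key] / total[key] for key in total}
```

Because the pooled figure weights each language by its number of samples, a language with many test samples pulls "all" toward its own accuracy.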
Confusion Matrix:
By Language: