---
license: mit
language:
- en
- de
- es
- fr
- it
- pt
metrics:
- accuracy
base_model:
- microsoft/mdeberta-v3-base
pipeline_tag: text-classification
tags:
- formal or informal classification
widget:
- text: Bitte geh einkaufen.
- text: Können Sie mir spontan dabei helfen?
- text: Als nächstes kommen 4g Champignons und 500g Mehl dazu.
---
# formality-classifier-mdeberta-v3-base
This model classifies texts by their formality. Each input is assigned one of three classes, `["formal", "informal", "neutral"]`, where `neutral` covers texts with no clear formality, such as passive statements.
The training data was selected and generated with a focus on languages that have a distinct formal form of address, including French, German, Italian, Portuguese and Spanish.
Some samples from [osyvokon/pavlick-formality-scores](https://huggingface.co/datasets/osyvokon/pavlick-formality-scores) were also used in an attempt to teach the model to classify English inputs.
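
A minimal usage sketch with the `transformers` text-classification pipeline follows. The card does not state the Hub namespace, so `MODEL_ID` is a placeholder to fill in, and the helper names `load_classifier`, `top_label` and `classify` are illustrative only. It also assumes the model config exposes the label names `formal`, `informal` and `neutral`.

```python
from typing import Dict, List

# Placeholder: the card does not state the Hub namespace, so fill it in yourself.
MODEL_ID = "<namespace>/formality-classifier-mdeberta-v3-base"


def load_classifier(model_id: str = MODEL_ID):
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below work without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=model_id)


def top_label(scores: List[Dict]) -> str:
    """Return the label with the highest score from one pipeline result."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify(texts: List[str], classifier) -> List[str]:
    """Classify each text; top_k=None asks the pipeline for all class scores."""
    results = classifier(texts, top_k=None)
    return [top_label(scores) for scores in results]
```

With the placeholder replaced, `classify(["Bitte geh einkaufen.", "Können Sie mir spontan dabei helfen?"], load_classifier())` should return one label per input text.
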
## Results
Accuracy on the test set:
| Language | Accuracy |
| --- | --- |
| all | 88.93% |
| English | 79.20% |
| French | 100% |
| German | 97.73% |
| Italian | 97.83% |
| Portuguese | 100% |
| Spanish | 98.53% |
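
The sketch below shows one way such a per-language table could be computed from `(language, gold_label, predicted_label)` triples. The function name and the toy rows are hypothetical and are not the model's test data. It assumes the `all` row pools every sample, so languages with more test samples weigh more heavily in it.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_language(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute accuracy per language and over all rows pooled together."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in rows:
        # Count every sample once for its language and once for "all".
        for key in (language, "all"):
            total[key] += 1
            correct[key] += int(gold == predicted)
    return {key: correct[key] / total[key] for key in total}


# Toy rows for illustration only; these are not the model's test data.
toy_rows = [
    ("German", "formal", "formal"),
    ("German", "informal", "informal"),
    ("English", "neutral", "formal"),
    ("English", "informal", "informal"),
]
toy_scores = accuracy_by_language(toy_rows)
```
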
Confusion Matrix: *(figure)*

By Language: *(figures)*
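
For readers who want to rebuild a confusion matrix from their own predictions, here is a small sketch in plain Python. The orientation (rows are gold labels, columns are predicted labels) is an assumption rather than something the card specifies, and the toy pairs are illustrative only.

```python
from typing import Iterable, List, Tuple

LABELS = ["formal", "informal", "neutral"]


def confusion_matrix(
    pairs: Iterable[Tuple[str, str]], labels: List[str] = LABELS
) -> List[List[int]]:
    """Count (gold, predicted) pairs; rows are gold, columns are predicted."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for gold, predicted in pairs:
        matrix[index[gold]][index[predicted]] += 1
    return matrix


# Toy pairs for illustration only; these are not the model's test data.
toy_pairs = [
    ("formal", "formal"),
    ("formal", "neutral"),
    ("informal", "informal"),
    ("neutral", "neutral"),
]
toy_matrix = confusion_matrix(toy_pairs)
```

Filtering the pairs by language before calling `confusion_matrix` gives a per-language matrix.
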