---
license: mit
language:
- en
- de
- es
- fr
- it
- pt
metrics:
- accuracy
base_model:
- microsoft/mdeberta-v3-base
pipeline_tag: text-classification
tags:
- formal or informal classification
widget:
  - text: Bitte geh einkaufen.
  - text: Können Sie mir spontan dabei helfen?
  - text: Als nächstes kommen 4g Champignons und 500g Mehl dazu.
---


# formality-classifier-mdeberta-v3-base

This model classifies text by formality. Each input is assigned one of three classes, `["formal", "informal", "neutral"]`, where `neutral` covers texts with no clear formality, such as passive statements.


The training data was selected and generated with a focus on languages that have a distinct formal form of address, including French, German, Italian, Portuguese and Spanish.
Some samples from [osyvokon/pavlick-formality-scores](https://huggingface.co/datasets/osyvokon/pavlick-formality-scores) were also used to help the model classify English inputs.
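
A minimal inference sketch, assuming the model is loaded through the `transformers` text-classification pipeline. `MODEL_ID` is a placeholder (the Hub namespace is not stated in this card and has to be filled in), and `top_label` and `classify` are hypothetical helper names. The label strings are assumed to come from the model's config and to match the three classes listed above.

```python
from typing import List

# Placeholder: replace with the full Hub id, "<namespace>/formality-classifier-mdeberta-v3-base".
MODEL_ID = "formality-classifier-mdeberta-v3-base"


def top_label(scores: List[dict]) -> str:
    """Return the label with the highest score from one pipeline result."""
    return max(scores, key=lambda item: item["score"])["label"]


def classify(texts: List[str], model_id: str = MODEL_ID) -> List[str]:
    """Return the most likely formality class for each input text."""
    # Imported lazily so top_label() can be used without transformers installed.
    from transformers import pipeline

    # top_k=None asks the pipeline for the scores of all classes, not just the best one.
    classifier = pipeline("text-classification", model=model_id, top_k=None)
    # For a list of inputs, each result is a list of {"label": ..., "score": ...} dicts.
    return [top_label(scores) for scores in classifier(texts)]


# Example call (downloads the model on first use):
# classify(["Bitte geh einkaufen.", "Können Sie mir spontan dabei helfen?"])
```

Requesting all class scores rather than only the top one makes it possible to inspect how close an input is to `neutral`, which can be useful for borderline texts.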




## Results

Accuracy on the test set:

| Language | Accuracy |
| --- | --- |
| All | 88.93% |
| English | 79.20% |
| French | 100% |
| German | 97.73% |
| Italian | 97.83% |
| Portuguese | 100% |
| Spanish | 98.53% |
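
The sketch below shows one way such per-language figures can be computed from predictions. It is not the evaluation script used for this model: `accuracy_by_language` is a hypothetical helper, and it assumes the overall figure is pooled over all test samples rather than averaged over languages.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_language(rows: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute accuracy per language and over all samples.

    Each row is a (language, gold_label, predicted_label) triple.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in rows:
        # Count every sample once for its language and once for the pooled "all" bucket.
        for key in (language, "all"):
            total[key] += 1
            correct[key] += int(gold == predicted)
    return {key: correct[key] / total[key] for key in total}


# Toy rows for illustration only, not taken from the real test set.
toy_rows = [
    ("de", "formal", "formal"),
    ("de", "informal", "formal"),
    ("en", "neutral", "neutral"),
]
toy_result = accuracy_by_language(toy_rows)
```

Under that pooling assumption, languages with more test samples weigh more heavily in the overall number. This would be consistent with the overall accuracy in the table sitting well below the unweighted mean of the six per-language values.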

Confusion Matrix:

![Confusion matrix for all languages](confusion_matrix.svg)

By Language:

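![Confusion matrix for English](confusion_matrix_en.svg)

![Confusion matrix for French](confusion_matrix_fr.svg)

![Confusion matrix for German](confusion_matrix_de.svg)

![Confusion matrix for Italian](confusion_matrix_it.svg)

![Confusion matrix for Portuguese](confusion_matrix_pt.svg)

![Confusion matrix for Spanish](confusion_matrix_es.svg)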
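
For readers who want to reproduce matrices like the ones above from their own predictions, here is a small pure-Python sketch. `confusion_matrix` is a hypothetical helper, and the orientation (rows are gold labels, columns are predicted labels) is a convention chosen for this sketch that may differ from the plots.

```python
from typing import Iterable, List, Tuple

LABELS = ["formal", "informal", "neutral"]


def confusion_matrix(
    pairs: Iterable[Tuple[str, str]], labels: List[str] = LABELS
) -> List[List[int]]:
    """Count (gold, predicted) pairs; rows are gold labels, columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for gold, predicted in pairs:
        matrix[index[gold]][index[predicted]] += 1
    return matrix


# Toy pairs for illustration only, not taken from the real test set.
toy_pairs = [
    ("formal", "formal"),
    ("formal", "neutral"),
    ("informal", "informal"),
]
toy_matrix = confusion_matrix(toy_pairs)
```

Entries on the diagonal are correct predictions, and off-diagonal entries show which classes get confused with each other.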