license: mit
language:
  - en
  - de
  - es
  - fr
  - pt
metrics:
  - accuracy
base_model:
  - microsoft/mdeberta-v3-base
pipeline_tag: text-classification
tags:
  - formal or informal classification
widget:
  - text: Bitte geh einkaufen.
  - text: Können Sie mir spontan dabei helfen?
  - text: Als nächstes kommen 4g Champignons und 500g Mehl dazu.

formality-classifier-mdeberta-v3-base

This model classifies text by formality. Each input is assigned one of three classes, ["formal", "informal", "neutral"], where "neutral" covers text with no clear formality, such as passive statements.
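
A minimal usage sketch with the `transformers` text-classification pipeline follows. The Hub id in `MODEL_ID` is an assumption pieced together from the uploader name and the model name, and the label strings the model actually returns depend on its config, so verify both before relying on this. `top_label`, `classify` and `load_classifier` are hypothetical helper names used only for illustration.

```python
from typing import Callable, Dict, List, Sequence

# Assumed Hub id (uploader name + model name); verify before use.
MODEL_ID = "LenDigLearn/formality-classifier-mdeberta-v3-base"


def top_label(scores: List[Dict[str, float]]) -> str:
    """Return the label with the highest score from a list of {label, score} dicts."""
    return max(scores, key=lambda s: s["score"])["label"]


def classify(texts: Sequence[str], classifier: Callable) -> Dict[str, str]:
    """Map each text to its most likely class.

    `classifier` is any callable that returns, for a list of texts, one list
    of {"label": ..., "score": ...} dicts per text.
    """
    return {text: top_label(scores) for text, scores in zip(texts, classifier(list(texts)))}


def load_classifier():
    """Build a pipeline that returns scores for all classes (downloads the model)."""
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID, top_k=None)
```

Calling `classify(["Bitte geh einkaufen.", "Können Sie mir spontan dabei helfen?"], load_classifier())` should then return one class per sentence. Passing `top_k=None` asks the pipeline for all class scores rather than only the best one, which makes it easy to inspect borderline cases.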

Training data was selected and generated with a focus on languages that have an explicit formal form of address: French, German, Italian, Portuguese and Spanish. Some samples from osyvokon/pavlick-formality-scores were also included to help the model classify English inputs.

Results

Accuracy on the test set:

| Language | Accuracy |
|---|---|
| all | 88.93% |
| English | 79.20% |
| French | 100% |
| German | 97.73% |
| Italian | 97.83% |
| Portuguese | 100% |
| Spanish | 98.53% |
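
For readers who want to compute comparable numbers on their own data, here is a small sketch of a per-language accuracy calculation. It is not the evaluation script used for this model; the function name and the record layout are hypothetical. It assumes the "all" row is a micro-average over every sample, in which case languages with more test samples weigh more heavily in the overall figure.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_language(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute accuracy per language plus a micro-averaged "all" entry.

    `records` holds (language, gold_label, predicted_label) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in records:
        # Count every sample once for its language and once for the overall row.
        for key in (language, "all"):
            total[key] += 1
            correct[key] += int(gold == predicted)
    return {key: correct[key] / total[key] for key in total}
```

Feeding it the gold labels of a held-out set together with the model's predictions gives one accuracy value per language and one for the whole set.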

Confusion Matrix: (figure not included)

By Language: (figure not included)