	---
	license: mit
	language:
	- en
	- de
	- es
	- fr
	- it
	- pt
	metrics:
	- accuracy
	base_model:
	- microsoft/mdeberta-v3-base
	pipeline_tag: text-classification
	tags:
	- formal or informal classification
	widget:
	- text: Bitte geh einkaufen.
	- text: Können Sie mir spontan dabei helfen?
	- text: Als nächstes kommen 4g Champignons und 500g Mehl dazu.
	---


	# formality-classifier-mdeberta-v3-base

	This model classifies texts by formality. Each input is assigned one of three classes, `["formal", "informal", "neutral"]`, where `neutral` covers texts without a clear formality, such as passive statements.


	Training data was selected and generated with a focus on languages that have a distinct formal form of address: French, German, Italian, Portuguese and Spanish.
	Some samples from [osyvokon/pavlick-formality-scores](https://huggingface.co/datasets/osyvokon/pavlick-formality-scores) were also included in an attempt to teach the model to classify English inputs as well.
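A minimal usage sketch with the `transformers` text-classification pipeline. It assumes the model is published under the repository ID `LenDigLearn/formality-classifier-mdeberta-v3-base` and that the labels returned by the pipeline match the three class names above; the helper names `load_classifier` and `classify` are illustrative, not part of the model.

```python
from typing import Callable, List, Tuple

# Assumed repository ID on the Hugging Face Hub.
MODEL_ID = "LenDigLearn/formality-classifier-mdeberta-v3-base"


def load_classifier():
    """Build a text-classification pipeline (downloads the model on first use)."""
    # Imported lazily so the helpers below can be used without transformers installed.
    from transformers import pipeline

    return pipeline("text-classification", model=MODEL_ID)


def classify(texts: List[str], classifier: Callable) -> List[Tuple[str, str, float]]:
    """Return (text, label, score) for each input text.

    `classifier` is any callable that maps a list of strings to a list of
    dicts with "label" and "score" keys, which is what the pipeline returns.
    """
    results = classifier(texts)
    return [(text, r["label"], r["score"]) for text, r in zip(texts, results)]
```

With `transformers` and a backend such as PyTorch installed, `classify(["Bitte geh einkaufen.", "Können Sie mir spontan dabei helfen?"], load_classifier())` should return one label and confidence score per sentence.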




	## Results

	Accuracy on the test set:

	| Language | Accuracy |
	| --- | --- |
	| all | 88.93% |
	| English | 79.20% |
	| French | 100% |
	| German | 97.73% |
	| Italian | 97.83% |
	| Portuguese | 100% |
	| Spanish | 98.53% |
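For readers who want to run a similar breakdown on their own data, here is a sketch of how per-language accuracy could be computed from labelled predictions. The records are hypothetical toy data, not the model's test set, and the `all` row is computed here as accuracy over all samples pooled together, which is an assumption about how the figure above was obtained.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def accuracy_by_language(records: Iterable[Tuple[str, str, str]]) -> Dict[str, float]:
    """Compute accuracy per language plus a pooled "all" entry.

    Each record is (language, gold_label, predicted_label).
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for language, gold, predicted in records:
        # Count every sample once for its language and once for the pooled row.
        for key in (language, "all"):
            total[key] += 1
            correct[key] += int(gold == predicted)
    return {key: correct[key] / total[key] for key in total}


# Hypothetical toy records, for illustration only.
toy_records = [
    ("German", "formal", "formal"),
    ("German", "informal", "informal"),
    ("English", "neutral", "formal"),
    ("English", "informal", "informal"),
]

scores = accuracy_by_language(toy_records)
for language, accuracy in scores.items():
    print(f"{language}: {accuracy:.2%}")
```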

	Confusion Matrix:

	![](confusion_matrix.svg)

	By Language:

	![](confusion_matrix_en.svg)

	![](confusion_matrix_fr.svg)

	![](confusion_matrix_de.svg)

	![](confusion_matrix_it.svg)

	![](confusion_matrix_pt.svg)

	![](confusion_matrix_es.svg)
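A confusion matrix like the ones above can be built by counting gold/predicted label pairs. The sketch below uses plain Python and hypothetical toy labels; placing gold labels on rows and predictions on columns is an assumption and may not match the orientation of the plots.

```python
from typing import List, Sequence

LABELS = ["formal", "informal", "neutral"]


def confusion_matrix(
    gold: Sequence[str], predicted: Sequence[str], labels: Sequence[str] = LABELS
) -> List[List[int]]:
    """Return a matrix where entry [i][j] counts gold label i predicted as label j."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for g, p in zip(gold, predicted):
        matrix[index[g]][index[p]] += 1
    return matrix


# Hypothetical toy labels, for illustration only.
toy_gold = ["formal", "formal", "informal", "neutral", "neutral"]
toy_pred = ["formal", "neutral", "informal", "neutral", "formal"]

cm = confusion_matrix(toy_gold, toy_pred)
for label, row in zip(LABELS, cm):
    print(f"{label:>8}: {row}")
```

Filtering the gold and predicted lists by language before calling the function gives the per-language matrices.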