---
license: mit
language:
- multilingual
- en
- it
- sl
metrics:
- f1
- accuracy
base_model: FacebookAI/xlm-roberta-large
pipeline_tag: text-classification
tags:
- hate-speech
- xlm-roberta
- Youtube
- Twitter
---

# Multilingual Hate Speech Classifier for Social Media with Disagreement-Aware Training

A multilingual hate speech classification model based on [XLM-R (100 languages)](https://huggingface.co/FacebookAI/xlm-roberta-large), fine-tuned on English, Italian, and Slovenian with inter-annotator disagreement-aware training.

The model and the disagreement-aware training are described in detail in our [paper](https://www.researchgate.net/publication/384628421_Multilingual_Hate_Speech_Modeling_by_Leveraging_Inter-Annotator_Disagreement):

    @inproceedings{grigor2024multilingual,
      title={Multilingual Hate Speech Modeling by Leveraging Inter-Annotator Disagreement},
      author={Grigor, Patricia-Carla and Evkoski, Bojan and Kralj Novak, Petra},
      url={http://dx.doi.org/10.70314/is.2024.sikdd.7},
      DOI={10.70314/is.2024.sikdd.7},
      booktitle={Proceedings of Data Mining and Data Warehouses – Sikdd 2024},
      publisher={Jožef Stefan Institute},
      year={2024}
    }

Authors: Patricia-Carla Grigor, Bojan Evkoski, Petra Kralj Novak

Data available here: [English](https://www.clarin.si/repository/xmlui/handle/11356/1454); [Italian](https://www.clarin.si/repository/xmlui/handle/11356/1450); [Slovenian](https://www.clarin.si/repository/xmlui/handle/11356/1398)

**Model output**

The model assigns each input to exactly one of four classes:

* 0 - appropriate
* 1 - inappropriate
* 2 - offensive
* 3 - violent
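
As a rough illustration of how the four class indices map back to names, the sketch below turns a vector of four raw scores (logits) into a predicted class and its softmax probability. The `ID2LABEL` dictionary and the `softmax` and `decode` helpers are illustrative names, not part of the released model, and the example logits are made up.

```python
import math

# Class indices as listed above; the dictionary name is illustrative.
ID2LABEL = {0: "appropriate", 1: "inappropriate", 2: "offensive", 3: "violent"}


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode(logits):
    """Return (class name, probability) for the highest-scoring class."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return ID2LABEL[best], probs[best]


# Made-up logits for a single comment, ordered by class index 0..3.
label, prob = decode([0.2, 0.1, 2.3, -1.0])
print(label, round(prob, 3))
```

Softmax is only needed when probabilities are wanted; taking the index of the largest logit gives the same predicted class.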

**Training data**\*

* 51k English YouTube comments
* 60k Italian YouTube comments
* 50k Slovenian Twitter comments

**Evaluation data**\*

* 10k English YouTube comments
* 10k Italian YouTube comments
* 10k Slovenian Twitter comments

\* Each comment was manually labeled by two different annotators.

**Fine-tuning hyperparameters**

    num_train_epochs=3,
    train_batch_size=8,
    learning_rate=6e-6
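
To give a sense of scale, the following back-of-the-envelope sketch estimates how many optimizer updates these settings imply. It assumes the rounded training-set sizes listed above (51k + 60k + 50k comments), a single device, and no gradient accumulation. None of these is stated in this card, so the result is indicative only.

```python
import math

# Hyperparameters as listed above.
num_train_epochs = 3
train_batch_size = 8
learning_rate = 6e-6  # shown for completeness; it does not affect the step count

# Rounded training-set sizes (assumption: the exact counts may differ).
train_sizes = {"English": 51_000, "Italian": 60_000, "Slovenian": 50_000}
total_examples = sum(train_sizes.values())

# Assumption: one optimizer update per batch (no gradient accumulation, one device).
steps_per_epoch = math.ceil(total_examples / train_batch_size)
total_steps = steps_per_epoch * num_train_epochs

print(total_examples, steps_per_epoch, total_steps)  # 161000 20125 60375
```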

**Evaluation Results**

Model-annotator agreement (accuracy) vs. inter-annotator agreement, in percent (0 = no agreement; 100 = perfect agreement):

| Language  | Model-annotator agreement | Inter-annotator agreement |
|-----------|---------------------------|---------------------------|
| English   | 79.97                     | 82.91                     |
| Italian   | 82.00                     | 81.79                     |
| Slovenian | 78.84                     | 79.43                     |
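
Both columns are reported as accuracy on a 0-100 scale. A minimal sketch of such a computation follows, assuming plain label-by-label matching between two sequences. The function name and the toy labels are illustrative, and the paper may combine the two annotators' labels differently when scoring the model.

```python
def percent_agreement(labels_a, labels_b):
    """Percentage of positions on which two label sequences agree (0-100)."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        raise ValueError("label sequences must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return 100.0 * matches / len(labels_a)


# Toy labels (class indices 0..3) for five comments; not real data.
annotator_1 = [0, 0, 2, 1, 3]
annotator_2 = [0, 2, 2, 1, 0]
model_preds = [0, 0, 2, 2, 0]

print(percent_agreement(annotator_1, annotator_2))  # 60.0
print(percent_agreement(model_preds, annotator_1))  # 60.0
```

Comparing the model against the annotators with the same measure used between annotators makes the inter-annotator score a natural reference point for the model's score.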

Class-specific model F1-scores (in percent):

| Language  | Appropriate | Inappropriate | Offensive | Violent |
|-----------|-------------|---------------|-----------|---------|
| English   | 86.10       | 39.16         | 68.24     | 27.82   |
| Italian   | 89.77       | 58.45         | 60.42     | 44.97   |
| Slovenian | 84.30       | 45.22         | 69.69     | 24.79   |
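
For reference, a per-class F1-score is the harmonic mean of precision and recall for that class, which is why it can be low for a class even when overall accuracy is high. The sketch below computes it from scratch on toy labels. The function name and the data are illustrative, and whether the reported numbers were computed exactly this way (for example, against which annotator) is an assumption.

```python
def per_class_f1(y_true, y_pred, num_classes=4):
    """Return a list with the F1-score (0-100) of each class index."""
    scores = []
    for c in range(num_classes):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        # Convention used here: a class with no true or predicted examples scores 0.
        scores.append(100.0 * 2 * tp / denom if denom else 0.0)
    return scores


# Toy gold labels and predictions (class indices 0..3); not real data.
gold = [0, 0, 0, 1, 2, 2, 3, 3]
pred = [0, 0, 2, 1, 2, 0, 3, 0]
f1_scores = per_class_f1(gold, pred)
print([round(s, 2) for s in f1_scores])
```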

**Usage**

    from transformers import AutoModelForSequenceClassification, TextClassificationPipeline, AutoTokenizer, AutoConfig

    MODEL = "IMSyPP/hate_speech_multilingual"
    tokenizer = AutoTokenizer.from_pretrained(MODEL)
    config = AutoConfig.from_pretrained(MODEL)
    model = AutoModelForSequenceClassification.from_pretrained(MODEL)

    # device=0 selects the first GPU; use device=-1 to run on CPU.
    # function_to_apply="none" returns raw scores (logits) instead of probabilities.
    pipe = TextClassificationPipeline(model=model, tokenizer=tokenizer, return_all_scores=True,
                                      task='sentiment_analysis', device=0, function_to_apply="none")

    # One input per language; each list item must be separated by a comma.
    pipe([
        "Thank you for using our model",
        "Grazie per aver utilizzato il nostro modello",
        "Hvala za uporabo našega modela"
    ])