|
# Model Detail Information |
|
|
|
### 1. Overview |
|
|
|
This model is trained to detect the presence of harmful expressions in Korean sentences.<br> |
|
It performs binary classification to determine whether a given sentence contains hateful expressions or is a general, non-hateful sentence.<br> |
|
This model is designed for the AI task of 'text classification' and was fine-tuned on the 'TTA-DQA/hate_sentence' dataset.<br>
|
|
|
The classification labels are (see the snippet after this list):
|
- "0": "no_hate" |
|
- "1": "hate" |
|
|
|
### 2. Training Information |
|
|
|
- Base Model: KcELECTRA (a pre-trained Korean language model based on ELECTRA)

- Source: [beomi/KcELECTRA-base-v2022](https://huggingface.co/beomi/KcELECTRA-base-v2022)
|
- Model Type: ELECTRA-based sequence classification model (text classification)
|
- Pre-training (Korean): Approximately 17 GB (over 180 million sentences)

- Fine-tuning (hate dataset): Approximately 22.3 MB (TTA-DQA/hate_sentence); see the fine-tuning sketch after this list
|
- Learning Rate: 5e-6 |
|
- Weight Decay: 0.01 |
|
- Epochs: 20 |
|
- Batch Size: 16 |
|
- Data Loader Workers: 2 |
|
- Tokenizer: BertWordPieceTokenizer |
|
- Model Size: Approximately 512 MB
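
A minimal fine-tuning sketch using the hyperparameters above, assuming the Hugging Face Trainer API; the dataset split and column names ("train", "sentence", "label") are assumptions and may need to be adjusted to the actual TTA-DQA/hate_sentence schema:

```python
# Hedged fine-tuning sketch: hyperparameters come from the list above;
# dataset split/column names ("train", "sentence", "label") are assumptions.
from datasets import load_dataset
from transformers import (
    AutoTokenizer,
    AutoModelForSequenceClassification,
    Trainer,
    TrainingArguments,
)

base = "beomi/KcELECTRA-base-v2022"
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForSequenceClassification.from_pretrained(base, num_labels=2)

dataset = load_dataset("TTA-DQA/hate_sentence")

def tokenize(batch):
    # Assumed text column name: "sentence".
    return tokenizer(batch["sentence"], truncation=True, padding="max_length")

dataset = dataset.map(tokenize, batched=True)

args = TrainingArguments(
    output_dir="hate-detection-kcelectra",
    learning_rate=5e-6,
    weight_decay=0.01,
    num_train_epochs=20,
    per_device_train_batch_size=16,
    dataloader_num_workers=2,
)

trainer = Trainer(model=model, args=args, train_dataset=dataset["train"])
trainer.train()
```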
|
|
|
### 3. Requirements |
|
|
|
To use this model, ensure the following dependencies are installed (a quick version check follows the list):
|
- pytorch ~= 1.8.0 |
|
- transformers ~= 4.11.3 |
|
- emoji ~= 0.6.0 |
|
- soynlp ~= 0.0.493 |
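
A quick sanity check of the installed versions (a minimal sketch; note that the `pytorch` requirement is distributed and imported as the `torch` package):

```python
# Print the installed versions of the pinned dependencies.
# Assumption: all four packages are installed; `pytorch` is queried as `torch`.
from importlib.metadata import version

for package in ("torch", "transformers", "emoji", "soynlp"):
    print(package, version(package))
```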
|
|
|
### 4. Quick Start |
|
|
|
Load the tokenizer and fine-tuned model in Python:

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Load the tokenizer and the fine-tuned binary classifier (labels: 0 = no_hate, 1 = hate).
tokenizer = AutoTokenizer.from_pretrained("TTA-DQA/HateDetection-KcElectra-FineTuning")
model = AutoModelForSequenceClassification.from_pretrained("TTA-DQA/HateDetection-KcElectra-FineTuning")
```
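
A minimal end-to-end inference sketch; the example sentence and printed output format are illustrative, and the label names follow the mapping in the overview:

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model_id = "TTA-DQA/HateDetection-KcElectra-FineTuning"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
model.eval()

id2label = {0: "no_hate", 1: "hate"}  # label mapping from the overview

text = "좋은 하루 보내세요!"  # illustrative input sentence
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits

probs = torch.softmax(logits, dim=-1).squeeze()
pred = torch.argmax(probs).item()
print(f"{id2label[pred]} (confidence: {probs[pred].item():.4f})")
```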
|
|
|
### 5. Citation |
|
|
|
- This model was developed as part of the Quality Validation Project for Super-Giant AI Training Data (305-2100-2131, 2024 Quality Validation for Super-Giant AI Training). |
|
|
|
### 6. Bias, Risks, and Limitations |
|
|
|
- The determination of harmful expressions may vary depending on language, culture, application context, and personal perspectives. |
|
- Results may reflect biases or lead to controversy due to the subjective nature of evaluating harmful content. |
|
- This model's outputs should not be treated as a definitive standard for identifying harmful expressions.
|
|
|
### 7. Results

- Type: binary classification (text classification)

- F1-score: 0.9928

- Accuracy: 0.9928