---
|
library_name: transformers |
|
license: mit |
|
language: |
|
- en |
|
tags: |
|
- text-classification |
|
- nlp |
|
- transformers |
|
- bert |
|
pipeline_tag: text-classification |
|
datasets: |
|
- fancyzhx/ag_news |
|
metrics: |
|
- accuracy |
|
- recall |
|
- precision |
|
- f1 |
|
--- |
|
|
|
# AG-News BERT Classification |
|
|
|
## Model Details |
|
|
|
**Model Name:** AG-News BERT Classification |
|
**Model Type:** Text Classification |
|
**Developer:** Mansoor Hamidzadeh |
|
**Repository:** [mansoorhamidzadeh/ag-news-bert-classification](https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification) |
|
**Language(s):** English |
|
**License:** MIT |
|
|
|
## Model Description |
|
|
|
### Overview |
|
|
|
AG-News BERT Classification is a BERT (Bidirectional Encoder Representations from Transformers) model fine-tuned to classify news articles into four categories: World, Sports, Business, and Sci/Tech. It builds on the pre-trained BERT architecture and was fine-tuned on the AG-News dataset to optimize performance for this task.
|
|
|
### Intended Use |
|
|
|
#### Primary Use Case |
|
|
|
The primary use case for this model is to automatically classify news articles into one of the four predefined categories: |
|
- World |
|
- Sports |
|
- Business |
|
- Sci/Tech |
|
|
|
This can be useful for news aggregation services, content recommendation systems, and any application that requires automated content categorization. |
|
|
|
#### Applications |
|
|
|
- News aggregators and curators |
|
- Content recommendation engines |
|
- Media monitoring tools |
|
- Topic-based trend detection in news coverage
|
|
|
## Training Data |
|
|
|
### Dataset |
|
|
|
- **Name:** AG-News Dataset |
|
- **Source:** [AG News Corpus](https://huggingface.co/datasets/fancyzhx/ag_news)
|
- **Description:** The AG-News dataset is a widely used benchmark dataset for text classification. It contains 120,000 training samples and 7,600 test samples of news articles categorized into four classes: World, Sports, Business, and Sci/Tech. |
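
For reference, the dataset can be loaded directly from the Hugging Face Hub. A minimal sketch using the `datasets` library:

```python
from datasets import load_dataset

# Load the AG-News dataset from the Hugging Face Hub.
dataset = load_dataset("fancyzhx/ag_news")

print(dataset["train"].num_rows)  # 120000
print(dataset["test"].num_rows)   # 7600
print(dataset["train"][0])        # {'text': ..., 'label': ...}
```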
|
|
|
### Data Preprocessing |
|
|
|
The text data was preprocessed with the BERT tokenizer: each article was tokenized, the tokens were converted to their corresponding input IDs, and attention masks were created.
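
A minimal sketch of that step (the base checkpoint and maximum sequence length are assumptions; neither is documented in this card):

```python
from transformers import BertTokenizer

# Assumed base checkpoint; the card does not state which BERT variant was used.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

encoded = tokenizer(
    "Wall St. slides on rate worries.",
    padding="max_length",  # pad every example to the same length
    truncation=True,
    max_length=128,        # assumed maximum sequence length
    return_tensors="pt",
)
print(encoded["input_ids"])       # token IDs
print(encoded["attention_mask"])  # 1 for real tokens, 0 for padding
```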
|
|
|
## Training Procedure |
|
|
|
**Training Configuration:** |
|
- **Number of Epochs:** 4 |
|
- **Batch Size:** 8 |
|
- **Learning Rate:** 1e-5 |
|
- **Optimizer:** AdamW |
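
A minimal fine-tuning sketch matching this configuration (the base checkpoint, device handling, and `train_loader` are assumptions for illustration, not the author's exact training script):

```python
import torch
from transformers import BertForSequenceClassification

# Assumed base checkpoint; the card does not state which BERT variant was used.
model = BertForSequenceClassification.from_pretrained("bert-base-uncased", num_labels=4)
device = "cuda" if torch.cuda.is_available() else "cpu"
model.to(device)

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

for epoch in range(4):  # 4 epochs, as listed above
    model.train()
    total_loss = 0.0
    for batch in train_loader:  # hypothetical DataLoader over the tokenized data, batch size 8
        batch = {k: v.to(device) for k, v in batch.items()}
        outputs = model(**batch)  # expects input_ids, attention_mask, labels
        outputs.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
        total_loss += outputs.loss.item()
    print(f"Epoch {epoch + 1}: average training loss {total_loss / len(train_loader):.4f}")
```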
|
|
|
**Training and Validation Losses:** |
|
- **Epoch 1:** |
|
- Average training loss: 0.1330 |
|
- Average test loss: 0.1762 |
|
- **Epoch 2:** |
|
- Average training loss: 0.0918 |
|
- Average test loss: 0.1733 |
|
- **Epoch 3:** |
|
- Average training loss: 0.0622 |
|
- Average test loss: 0.1922 |
|
- **Epoch 4:** |
|
- Average training loss: 0.0416 |
|
- Average test loss: 0.2305 |
|
|
|
**Hardware:** |
|
- **Training Environment:** NVIDIA P100 GPU |
|
- **Training Time:** Approximately 3 hours |
|
|
|
## Performance |
|
|
|
### Evaluation Metrics |
|
|
|
The model was evaluated using standard text classification metrics: |
|
- Accuracy |
|
- Precision |
|
- Recall |
|
- F1 Score |
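
These can be computed with scikit-learn, for example (`y_true` and `y_pred` are hypothetical lists of gold and predicted label IDs; the averaging strategy is an assumption):

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

# y_true, y_pred: integer label IDs for the test set (hypothetical names).
accuracy = accuracy_score(y_true, y_pred)
precision, recall, f1, _ = precision_recall_fscore_support(
    y_true, y_pred, average="weighted"  # assumed averaging strategy
)
print(f"Accuracy: {accuracy:.3f}  Precision: {precision:.3f}  "
      f"Recall: {recall:.3f}  F1: {f1:.3f}")
```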
|
|
|
### Results |
|
|
|
On the AG-News test set, the model achieved the following performance: |
|
- **Accuracy:** 93.8% |
|
- **Precision:** 93.8% |
|
- **Recall:** 93.8% |
|
- **F1 Score:** 93.8% |
|
|
|
## Limitations and Biases |
|
|
|
### Limitations |
|
|
|
- The model may not generalize well to other text types or news sources outside the AG-News dataset. |
|
- The model is primarily designed for English text and may not perform well on text in other languages.
|
|
|
### Biases |
|
|
|
- The training data may contain biases inherited from news reporting, which the model can reproduce.

- The distribution of articles across categories in the dataset may introduce category-specific biases.
|
|
|
## Ethical Considerations |
|
|
|
- Ensure the model is used in compliance with user privacy and data security standards. |
|
- Be aware of potential biases and take steps to mitigate negative impacts, especially in sensitive applications. |
|
|
|
## How to Use |
|
|
|
### Inference |
|
|
|
To use the model for inference, load it using the Hugging Face Transformers library: |
|
|
|
```python
from transformers import (
    BertForSequenceClassification,
    BertTokenizer,
    TextClassificationPipeline,
)

# Load the fine-tuned model and its tokenizer from the Hugging Face Hub.
tokenizer = BertTokenizer.from_pretrained("mansoorhamidzadeh/ag-news-bert-classification")
model = BertForSequenceClassification.from_pretrained("mansoorhamidzadeh/ag-news-bert-classification")

# Wrap model and tokenizer in a pipeline for end-to-end classification.
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)

text = "Sample news article text here."
prediction = pipeline(text)
print(prediction)  # e.g. [{'label': ..., 'score': ...}]
```
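
If the pipeline returns generic labels such as `LABEL_0`, they can be mapped back to category names. A minimal sketch, assuming the standard AG-News label order (this mapping is an assumption; check the model's `config.json` for the actual `id2label` entries):

```python
# Assumed label order, following the standard AG-News convention.
id2label = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}

label_id = int(prediction[0]["label"].split("_")[-1])  # e.g. "LABEL_2" -> 2
print(id2label[label_id])
```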
|
## Citation

```bibtex
@misc{mansoorhamidzadeh,
  author       = {Mansoor Hamidzadeh},
  title        = {AG-News BERT Classification},
  year         = {2024},
  publisher    = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification}},
}
```