---
library_name: transformers
license: mit
language:
- en
tags:
- text-classification
- nlp
- transformers
- bert
pipeline_tag: text-classification
datasets:
- fancyzhx/ag_news
metrics:
- accuracy
- recall
- precision
- f1
---

# AG-News BERT Classification

## Model Details

- **Model Name:** AG-News BERT Classification
- **Model Type:** Text Classification
- **Developer:** Mansoor Hamidzadeh
- **Repository:** [mansoorhamidzadeh/ag-news-bert-classification](https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification)
- **Language(s):** English
- **License:** MIT

## Model Description

### Overview

The AG-News BERT Classification model is a BERT (Bidirectional Encoder Representations from Transformers) model fine-tuned on the AG-News dataset to classify news articles into four categories: World, Sports, Business, and Sci/Tech.

### Intended Use

#### Primary Use Case

The model automatically assigns a news article to one of four predefined categories:

- World
- Sports
- Business
- Sci/Tech

This is useful for news aggregation services, content recommendation systems, and any application that requires automated content categorization.

#### Applications

- News aggregators and curators
- Content recommendation engines
- Media monitoring tools
- Sentiment analysis and trend detection in news

## Training Data

### Dataset

- **Name:** AG-News Dataset
- **Source:** [AG News Corpus](https://huggingface.co/datasets/fancyzhx/ag_news)
- **Description:** AG-News is a widely used benchmark for text classification. It contains 120,000 training samples and 7,600 test samples of news articles, each labeled with one of four classes: World, Sports, Business, or Sci/Tech.
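As a rough illustration of the dataset described above, the sketch below shows how it could be loaded and how its integer labels map to category names. The id order is the one used by the `fancyzhx/ag_news` dataset on the Hub; whether this model's own configuration uses the same order is an assumption worth checking. The helper names `load_ag_news` and `describe` are hypothetical and not part of this repository.

```python
# Label ids as defined by the fancyzhx/ag_news dataset on the Hugging Face Hub.
# Assumption: the fine-tuned model uses the same id order.
ID2LABEL = {0: "World", 1: "Sports", 2: "Business", 3: "Sci/Tech"}
LABEL2ID = {name: idx for idx, name in ID2LABEL.items()}


def load_ag_news():
    """Download AG News from the Hub (needs the `datasets` package and network access)."""
    # Imported lazily so the label mapping above can be used offline.
    from datasets import load_dataset

    return load_dataset("fancyzhx/ag_news")


def describe(example):
    """Format one raw example ({'text': str, 'label': int}) as a readable string."""
    return f"[{ID2LABEL[example['label']]}] {example['text'][:80]}"


# Made-up example with the same fields as a dataset row, so nothing is downloaded here.
sample = {"text": "Stocks rally", "label": 2}
print(describe(sample))
```

Calling `load_ag_news()` returns the `train` and `test` splits, and `describe(ds["train"][0])` can then be used to inspect individual rows.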

### Data Preprocessing

The text was tokenized with the BERT tokenizer, which converts tokens to their corresponding IDs and creates attention masks.

## Training Procedure

**Training Configuration:**

- **Number of Epochs:** 4
- **Batch Size:** 8
- **Learning Rate:** 1e-5
- **Optimizer:** AdamW

**Training and Test Losses:**

- **Epoch 1:**
  - Average training loss: 0.1330
  - Average test loss: 0.1762
- **Epoch 2:**
  - Average training loss: 0.0918
  - Average test loss: 0.1733
- **Epoch 3:**
  - Average training loss: 0.0622
  - Average test loss: 0.1922
- **Epoch 4:**
  - Average training loss: 0.0416
  - Average test loss: 0.2305

**Hardware:**

- **Training Environment:** NVIDIA P100 GPU
- **Training Time:** Approximately 3 hours

## Performance

### Evaluation Metrics

The model was evaluated using standard text classification metrics:

- Accuracy
- Precision
- Recall
- F1 Score

### Results

On the AG-News test set, the model achieved the following performance:

- **Accuracy:** 93.8%
- **Precision:** 93.8%
- **Recall:** 93.8%
- **F1 Score:** 93.8%

## Limitations and Biases

### Limitations

- The model may not generalize well to other text types or to news sources outside the AG-News dataset.
- It is designed for English text and may not perform well on text in other languages.

### Biases

- The training data may carry biases that reflect biases in news reporting.
- Category-specific biases may arise from the distribution of articles in the dataset.

## Ethical Considerations

- Ensure the model is used in compliance with user privacy and data security standards.
- Be aware of potential biases and take steps to mitigate negative impacts, especially in sensitive applications.
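
One possible way to look for the category-specific biases mentioned above is to compare precision, recall, and F1 per class rather than relying only on aggregate scores. The following is a minimal pure-Python sketch under that assumption; the function `per_class_report` is hypothetical, and the label lists are made-up toy data, not output from this model.

```python
from collections import Counter

LABELS = ["World", "Sports", "Business", "Sci/Tech"]


def per_class_report(y_true, y_pred, labels=LABELS):
    """Return {label: (precision, recall, f1)} computed from two label lists."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative

    report = {}
    for label in labels:
        p_den = tp[label] + fp[label]
        r_den = tp[label] + fn[label]
        precision = tp[label] / p_den if p_den else 0.0
        recall = tp[label] / r_den if r_den else 0.0
        f_den = precision + recall
        f1 = 2 * precision * recall / f_den if f_den else 0.0
        report[label] = (precision, recall, f1)
    return report


# Toy labels purely for illustration (not model output).
y_true = ["World", "Sports", "Business", "Sci/Tech", "Business", "World"]
y_pred = ["World", "Sports", "Sci/Tech", "Sci/Tech", "Business", "World"]

report = per_class_report(y_true, y_pred)
for label, (p, r, f1) in report.items():
    print(f"{label:<9} precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

A large gap between classes in such a report would indicate that errors concentrate in particular categories, which the aggregate figures alone would hide.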

## How to Use

### Inference

To use the model for inference, load it with the Hugging Face Transformers library:

```python
from transformers import BertTokenizer, BertForSequenceClassification
from transformers import TextClassificationPipeline

# Load the fine-tuned tokenizer and model from the Hugging Face Hub.
tokenizer = BertTokenizer.from_pretrained("mansoorhamidzadeh/ag-news-bert-classification")
model = BertForSequenceClassification.from_pretrained("mansoorhamidzadeh/ag-news-bert-classification")

# Wrap both in a pipeline that handles tokenization and post-processing.
pipeline = TextClassificationPipeline(model=model, tokenizer=tokenizer)

text = "Sample news article text here."
prediction = pipeline(text)  # list with one dict holding the predicted label and score
print(prediction)
```

```bibtex
@misc{mansoorhamidzadeh,
  author = {Mansoor Hamidzadeh},
  title = {AG-News BERT Classification},
  year = {2024},
  publisher = {Hugging Face},
  howpublished = {\url{https://huggingface.co/mansoorhamidzadeh/ag-news-bert-classification}},
}
```