---
license: apache-2.0
tags:
  - generated_from_trainer
  - token-classification
  - ner
  - nlp
datasets:
  - conll2003
language:
  - en
pipeline_tag: token-classification
library_name: transformers
base_model: bert-base-uncased
model-index:
  - name: token-classification-ai-fine-tune
    results:
      - task:
          type: token-classification
          name: Named Entity Recognition (NER)
        dataset:
          name: CoNLL-2003
          type: conll2003
        metrics:
          - name: Validation Loss
            type: loss
            value: 0.0474
widget:
  - text: "Apple is buying a U.K. startup for $1 billion"
---

# token-classification-ai-fine-tune

[![Hugging Face](https://img.shields.io/badge/%F0%9F%A4%97%20Hugging%20Face-Model-blue)](https://huggingface.co/bniladridas/token-classification-ai-fine-tune)

This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset. It achieves a validation loss of **0.0474** on the evaluation set.

## Model Description

This is a token classification model fine-tuned for **Named Entity Recognition (NER)** on top of the `bert-base-uncased` architecture. It identifies entities such as people, organizations, and locations in text, and this release is tuned for CPU accessibility. Uploaded by [bniladridas](https://huggingface.co/bniladridas), it reaches a validation loss of 0.0474 on the CoNLL-2003 benchmark. For a GPU-accelerated version with CUDA support, see the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune).
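
A minimal usage sketch with the `transformers` pipeline API (the `aggregation_strategy` value here is one reasonable choice, not something this card prescribes):

```python
from transformers import pipeline

# Load the fine-tuned NER model from the Hub; device=-1 keeps inference on CPU.
ner = pipeline(
    "token-classification",
    model="bniladridas/token-classification-ai-fine-tune",
    aggregation_strategy="simple",  # merge sub-word pieces into whole entities
    device=-1,
)

print(ner("Apple is buying a U.K. startup for $1 billion"))
```

Each returned dict carries the entity group, a confidence score, the matched text span, and its character offsets.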

## Intended Uses & Limitations

### Intended Uses
- Extracting named entities from unstructured text (e.g., news articles, reports)
- Powering NLP pipelines on CPU-based systems
- Research or lightweight production use

### Limitations
- Trained on English text from CoNLL-2003, so it may not generalize well to other languages or domains
- Uses `bert-base-uncased` tokenization (lowercase-only), so cues that depend on capitalization are lost (see the sketch after this list)
- Optimized for NER; additional tuning needed for other token-classification tasks
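
To make the case limitation concrete, the uncased tokenizer maps differently-cased inputs to identical token IDs (a small sketch using the base tokenizer this card names):

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

# bert-base-uncased lowercases its input, so these two sentences
# encode to the same ID sequence:
print(tokenizer("Apple released a phone.")["input_ids"])
print(tokenizer("apple released a phone.")["input_ids"])
# Identical output: the model cannot use capitalization to tell
# the company "Apple" apart from the fruit "apple".
```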

## Training and Evaluation Data

The model was trained and evaluated on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset, a standard NER benchmark of annotated English news articles with person, organization, location, and miscellaneous entities, split into training, validation, and test sets. The loss reported in this card is computed on the validation split.
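
The dataset can be inspected with the `datasets` library (version pinned under Framework Versions below); the split and feature names follow the standard `conll2003` Hub layout:

```python
from datasets import load_dataset

conll = load_dataset("conll2003")

print(conll)                          # DatasetDict with train/validation/test splits
print(conll["train"][0]["tokens"])    # word-level tokens for one sentence
print(conll["train"][0]["ner_tags"])  # integer NER labels aligned to those tokens
```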

## Training Procedure

### Training Hyperparameters

The following hyperparameters were used during training (see the `TrainingArguments` sketch after this list):
- **learning_rate**: 2e-05
- **train_batch_size**: 8
- **eval_batch_size**: 8
- **seed**: 42
- **optimizer**: Adam with betas=(0.9,0.999) and epsilon=1e-08
- **lr_scheduler_type**: linear
- **lr_scheduler_warmup_steps**: 500
- **num_epochs**: 3
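
These settings map onto `transformers.TrainingArguments` roughly as follows; this is a hedged reconstruction, and anything not listed above (e.g., `output_dir`) is a placeholder, not taken from the repository:

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",  # assumed name
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    seed=42,
    adam_beta1=0.9,        # Adam betas/epsilon as listed above
    adam_beta2=0.999,      # (also the library defaults)
    adam_epsilon=1e-8,
    lr_scheduler_type="linear",
    warmup_steps=500,
    num_train_epochs=3,
)
```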

### Training Results

| Training Loss | Epoch | Step | Validation Loss |
|:-------------:|:-----:|:----:|:---------------:|
| 0.048         | 1.0   | 1756 | 0.0531          |
| 0.0251        | 2.0   | 3512 | 0.0473          |
| 0.016         | 3.0   | 5268 | 0.0474          |

### Framework Versions

- **Transformers**: 4.28.1
- **PyTorch**: 2.0.1
- **Datasets**: 1.18.3
- **Tokenizers**: 0.13.3

### Additional Notes
This version is optimized for CPU use with these intentional adjustments (see the sketch after this list):
1. **Full-precision training**: Swapped out fp16 for broader compatibility
2. **Streamlined batch sizes**: Set to 8 for efficient CPU processing
3. **Simplified workflow**: Skipped gradient accumulation for smoother CPU runs
4. **Full feature set**: Retained all monitoring (e.g., TensorBoard) and saving capabilities
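
Expressed as `TrainingArguments` overrides, the four adjustments would look roughly like this under the pinned Transformers 4.28 API; flag choices beyond the list above are assumptions:

```python
from transformers import TrainingArguments

cpu_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",  # assumed name
    fp16=False,                     # 1. full-precision training instead of fp16
    per_device_train_batch_size=8,  # 2. streamlined batch size
    per_device_eval_batch_size=8,   # 2. (evaluation side)
    gradient_accumulation_steps=1,  # 3. no gradient accumulation
    report_to=["tensorboard"],      # 4. monitoring retained
    save_strategy="epoch",          # 4. saving retained (cadence assumed)
    no_cuda=True,                   # force CPU training in Transformers 4.28
)
```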

For the GPU version with CUDA, mixed precision, and gradient accumulation, check out the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune). To clone it, run:

```bash
git clone https://github.com/bniladridas/token-classification-ai-fine-tune.git
```

This model was pushed to the Hugging Face Hub for easy CPU-based deployment.