Metadata Enhancements:
- Added `pipeline_tag: token-classification`: ensures the model is recognized for NER pipelines.
- Added `language: en`: specifies English, improving search and filtering.
- Added `library_name: transformers`: clarifies the framework.
- Added `base_model: bert-base-uncased`: links the pre-trained backbone.
- Enhanced `model-index`: includes the validation loss as a metric, structured for Hugging Face's UI display.
- Added `widget`: provides an interactive example (optional; remove it if you don't want it).
- Added an `nlp` tag: broadens visibility in NLP-related searches.

UI impact: on Hugging Face, the loss will appear under "Metrics," the task will show as NER, and the model will surface better in searches for English NER models. The widget (if kept) lets users test the model directly on the page with your example sentence.
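For reference, the widget behavior can be reproduced locally with the `transformers` pipeline. This is a minimal sketch, assuming the checkpoint is public on the Hub under `bniladridas/token-classification-ai-fine-tune` and ships the fine-tuned label mappings; `aggregation_strategy` is an optional extra that merges word pieces into whole entities.

```python
from transformers import pipeline

# Load the fine-tuned checkpoint from the Hub (assumes the repo id below is
# correct and publicly accessible).
ner = pipeline(
    "token-classification",
    model="bniladridas/token-classification-ai-fine-tune",
    aggregation_strategy="simple",  # merge sub-word tokens into whole entity spans
)

# The same sentence used in the card's widget.
print(ner("Apple is buying a U.K. startup for $1 billion"))
# Each prediction is a dict with keys such as entity_group, score, word, start, end.
```

The full diff of the updated model card follows.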
@@ -1,50 +1,75 @@
 ---
 license: apache-2.0
 tags:
-- generated_from_trainer
+- generated_from_trainer
+- token-classification
+- ner
+- nlp
 datasets:
-- conll2003
+- conll2003
+language:
+- en
+pipeline_tag: token-classification
+library_name: transformers
+base_model: bert-base-uncased
 model-index:
-- name: token-classification-ai-fine-tune
+- name: token-classification-ai-fine-tune
+  results:
+  - task:
+      type: token-classification
+      name: Named Entity Recognition (NER)
+    dataset:
+      name: CoNLL-2003
+      type: conll2003
+    metrics:
+    - name: Validation Loss
+      type: loss
+      value: 0.0474
+widget:
+- text: "Apple is buying a U.K. startup for $1 billion"
 ---

-<!-- This model card has been generated automatically according to the information the Trainer had access to. You
-should probably proofread and complete it, then remove this comment. -->

 # token-classification-ai-fine-tune

+[token-classification-ai-fine-tune on the Hugging Face Hub](https://huggingface.co/bniladridas/token-classification-ai-fine-tune)
+
+This model is a fine-tuned version of [bert-base-uncased](https://huggingface.co/bert-base-uncased) on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset. It achieves a validation loss of **0.0474** on the evaluation set.
+
+## Model Description
+
+This is a token classification model fine-tuned for **Named Entity Recognition (NER)**, built on the `bert-base-uncased` architecture. It's crafted to identify entities (like people, organizations, and locations) in text, optimized here for CPU accessibility. Uploaded by [bniladridas](https://huggingface.co/bniladridas), it delivers strong NER performance on the CoNLL-2003 benchmark. For a GPU-accelerated version with CUDA support, see the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune).
+
+## Intended Uses & Limitations
+
+### Intended Uses
+- Extracting named entities from unstructured text (e.g., news articles, reports)
+- Powering NLP pipelines on CPU-based systems
+- Research or lightweight production use
+
+### Limitations
+- Trained on English text from CoNLL-2003, so it may not generalize well to other languages or domains
+- Uses `bert-base-uncased` tokenization (lowercase-only), potentially missing case-sensitive nuances
+- Optimized for NER; additional tuning needed for other token-classification tasks

-## Training and evaluation data
+## Training and Evaluation Data
+
+The model was trained and evaluated on the [CoNLL-2003](https://huggingface.co/datasets/conll2003) dataset, a standard NER benchmark. It features annotated English news articles with entities like persons, organizations, and locations, split into training, validation, and test sets. Metrics here reflect the evaluation subset.

-## Training procedure
+## Training Procedure

-### Training hyperparameters
+### Training Hyperparameters

 The following hyperparameters were used during training:
-- learning_rate
-- train_batch_size
-- eval_batch_size
-- seed
-- optimizer
-- lr_scheduler_type
-- lr_scheduler_warmup_steps
-- num_epochs
+- **learning_rate**: 2e-05
+- **train_batch_size**: 8
+- **eval_batch_size**: 8
+- **seed**: 42
+- **optimizer**: Adam with betas=(0.9, 0.999) and epsilon=1e-08
+- **lr_scheduler_type**: linear
+- **lr_scheduler_warmup_steps**: 500
+- **num_epochs**: 3

-### Training results
+### Training Results

 | Training Loss | Epoch | Step | Validation Loss |
 |:-------------:|:-----:|:----:|:---------------:|
@@ -52,10 +77,24 @@ The following hyperparameters were used during training:
 | 0.0251 | 2.0 | 3512 | 0.0473 |
 | 0.016 | 3.0 | 5268 | 0.0474 |

-- Pytorch 2.0.1
-- Datasets 1.18.3
-- Tokenizers 0.13.3
+### Framework Versions
+
+- **Transformers**: 4.28.1
+- **PyTorch**: 2.0.1
+- **Datasets**: 1.18.3
+- **Tokenizers**: 0.13.3
+
+### Additional Notes
+This version is optimized for CPU use with these intentional adjustments:
+1. **Full-precision training**: Swapped out fp16 for broader compatibility
+2. **Streamlined batch sizes**: Set to 8 for efficient CPU processing
+3. **Simplified workflow**: Skipped gradient accumulation for smoother CPU runs
+4. **Full feature set**: Retained all monitoring (e.g., TensorBoard) and saving capabilities
+
+For the GPU version with CUDA, mixed precision, and gradient accumulation, check out the [GitHub repository](https://github.com/bniladridas/token-classification-ai-fine-tune). To clone it, run:
+
+```bash
+git clone https://github.com/bniladridas/token-classification-ai-fine-tune.git
+```
+
+This model was pushed to the Hugging Face Hub for easy CPU-based deployment.
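For reproducibility, the hyperparameters and CPU-oriented adjustments listed in the card translate roughly into the `TrainingArguments` below. This is a sketch rather than the author's actual training script: `output_dir`, the evaluation strategy, and the logging settings are assumptions, while the learning rate, batch sizes, seed, scheduler, warmup steps, and epoch count come straight from the card.

```python
from transformers import TrainingArguments

# Sketch of TrainingArguments mirroring the card's hyperparameters and CPU notes.
# The Adam betas/epsilon listed in the card match the Trainer defaults, so they
# are not set explicitly here.
training_args = TrainingArguments(
    output_dir="token-classification-ai-fine-tune",  # placeholder output path
    learning_rate=2e-5,
    per_device_train_batch_size=8,
    per_device_eval_batch_size=8,
    num_train_epochs=3,
    lr_scheduler_type="linear",
    warmup_steps=500,
    seed=42,
    fp16=False,                     # full-precision training for CPU compatibility
    gradient_accumulation_steps=1,  # no gradient accumulation, per the CPU notes
    evaluation_strategy="epoch",    # assumed; the results table shows one eval per epoch
    logging_dir="logs",             # TensorBoard monitoring, as mentioned in the notes
    report_to="tensorboard",
)
```

Paired with a `Trainer`, the tokenized CoNLL-2003 splits (e.g., from `datasets.load_dataset("conll2003")`), and a token-classification data collator, this should approximate the schedule reported in the results table.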
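After the push, the metadata changes described at the top can be verified from the Hub API. A small sketch using `huggingface_hub`, again assuming the same repo id:

```python
from huggingface_hub import model_info

# Fetch the Hub's view of the uploaded model and inspect the card metadata.
info = model_info("bniladridas/token-classification-ai-fine-tune")

print(info.pipeline_tag)  # expected to read "token-classification" once the new card is live
print(info.tags)          # should surface the card's tags, e.g. "ner" and "nlp"
```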