---
language: bn
tags:
- collaborative
- bengali
- NER
license: apache-2.0
datasets: xtreme
metrics:
- Loss
- Accuracy
- Precision
- Recall
---
# sahajBERT Named Entity Recognition
## Model description
[sahajBERT](https://huggingface.co/neuropark/sahajBERT) fine-tuned for NER using the Bengali split of [WikiANN](https://huggingface.co/datasets/wikiann).
Named Entities predicted by the model:
| Label id | Label |
|:--------:|:----:|
|0 |O|
|1 |B-PER|
|2 |I-PER|
|3 |B-ORG|
|4 |I-ORG|
|5 |B-LOC|
|6 |I-LOC|
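
For convenience, the same label scheme as Python dicts (a minimal sketch; the fine-tuned model's `config.id2label` should already carry this mapping, so this is only for reference):

```python
# IOB2 label scheme used by the model, as shown in the table above.
id2label = {
    0: "O",
    1: "B-PER",
    2: "I-PER",
    3: "B-ORG",
    4: "I-ORG",
    5: "B-LOC",
    6: "I-LOC",
}
label2id = {label: i for i, label in id2label.items()}
```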
## Intended uses & limitations
#### How to use
You can use this model directly with a pipeline for token classification:
```python
from transformers import AlbertForTokenClassification, TokenClassificationPipeline, PreTrainedTokenizerFast
# Initialize tokenizer
tokenizer = PreTrainedTokenizerFast.from_pretrained("neuropark/sahajBERT-NER")

# Initialize model
model = AlbertForTokenClassification.from_pretrained("neuropark/sahajBERT-NER")

# Initialize pipeline
pipeline = TokenClassificationPipeline(tokenizer=tokenizer, model=model)

# Bengali for "There are 3 mouzas and 10 villages in this union." -- change me
raw_text = "এই ইউনিয়নে ৩ টি মৌজা ও ১০ টি গ্রাম আছে ।"
output = pipeline(raw_text)
```
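
The pipeline returns one dict per predicted token; a minimal sketch of inspecting the result (the field names below follow the standard `transformers` token-classification output, so this is illustrative rather than output verified against this exact model):

```python
# Each item is a dict with keys such as "word", "entity", "score", "start", "end".
for token in output:
    print(f'{token["word"]}\t{token["entity"]}\t{token["score"]:.3f}')
```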
#### Limitations and bias
Work in progress.
## Training data
The model was initialized with the pre-trained weights of [sahajBERT](https://huggingface.co/neuropark/sahajBERT) at step 19519 and trained on the Bengali split of [WikiANN](https://huggingface.co/datasets/wikiann).
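
As a sketch, the corresponding split can be loaded with the `datasets` library (assuming the `wikiann` dataset with the `bn` config, which matches the link above):

```python
from datasets import load_dataset

# WikiANN's Bengali split; provides train/validation/test with "tokens" and "ner_tags".
wikiann_bn = load_dataset("wikiann", "bn")
print(wikiann_bn["train"][0]["tokens"], wikiann_bn["train"][0]["ner_tags"])
```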
## Training procedure
Coming soon!
## Eval results
- accuracy: 0.9756540697674418
- f1: 0.9570102589154861
- loss: 0.13705264031887054
- precision: 0.9518950437317785
- recall: 0.962180746561886
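
These look like standard entity-level NER metrics; a hedged sketch of how such numbers are typically computed with `seqeval` (the exact evaluation script for this model isn't published, so this is only illustrative):

```python
from seqeval.metrics import accuracy_score, f1_score, precision_score, recall_score

# Toy example: gold vs. predicted label sequences in the model's IOB2 scheme.
y_true = [["B-LOC", "I-LOC", "O", "B-PER"]]
y_pred = [["B-LOC", "I-LOC", "O", "B-PER"]]

print("accuracy:", accuracy_score(y_true, y_pred))
print("f1:", f1_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred))
print("recall:", recall_score(y_true, y_pred))
```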
### BibTeX entry and citation info
Coming soon!
<!-- ```bibtex
@inproceedings{...,
year={2020}
}
``` -->