---
license: cc-by-4.0
datasets:
- vector-institute/NMB-Plus-Named-Entities
base_model:
- distilbert/distilbert-base-uncased
pipeline_tag: token-classification
tags:
- ner
- bias_detection
model-index:
- name: nmb-plus-bias-ner-bert
  results:
  - task:
      type: named-entity-recognition
      name: Named Entity Recognition (NER)
    dataset:
      type: vector-institute/NMB-Plus-Named-Entities
      name: Biased Named Entities
    metrics:
    - type: precision
      value: 0.6405
    - type: recall
      value: 0.5589
    - type: f1
      value: 0.5922
language:
- en
---

# Model Overview

A fine-tuned DistilBERT model for Named Entity Recognition (NER) in bias detection.

## Model Details

We fine-tuned `distilbert-base-uncased` on the `vector-institute/NMB-Plus-Named-Entities` dataset.

## How to Get Started with the Model

```python
from transformers import AutoModelForTokenClassification, AutoTokenizer, pipeline

model_name = "vector-institute/nmb-plus-bias-ner-bert"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# BIO label scheme used during fine-tuning
label_list = ["O", "B-BIAS", "I-BIAS"]
id2label = {i: label for i, label in enumerate(label_list)}
label2id = {label: i for i, label in enumerate(label_list)}

model = AutoModelForTokenClassification.from_pretrained(
    model_name,
    id2label=id2label,
    label2id=label2id,
)

ner_pipeline = pipeline("ner", model=model, tokenizer=tokenizer)

text = "Fox News reported that Joe Biden met with CNN executives."
predictions = ner_pipeline(text)
print(predictions)
```

## Training Hyperparameters

- **Training regime:** Here are the training arguments we used (a sketch of how they plug into a `Trainer` appears at the end of this card):

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    learning_rate=2e-5,
    per_device_train_batch_size=64,
    per_device_eval_batch_size=32,
    num_train_epochs=10,
    weight_decay=0.01,
    eval_strategy="epoch",
    save_strategy="epoch",
    load_best_model_at_end=True,
    output_dir="./results",
    logging_dir="./logs",
    logging_steps=50,
    group_by_length=True,
)
```

## Evaluation

We split the data into train (80%), validation (10%), and test (10%) sets.

### Results

We report common classification metrics:

- precision
- recall
- f1-score

A sketch of how these token-level scores can be reproduced is also included at the end of this card.

#### Overall Results

| Metric           | Precision | Recall | F1-Score | Support |
|------------------|-----------|--------|----------|---------|
| **Macro Avg**    | 0.6405    | 0.5589 | 0.5922   | 48710   |
| **Weighted Avg** | 0.9330    | 0.9418 | 0.9366   | 48710   |

#### Per-class Results

| Label      | Precision | Recall | F1-Score | Support |
|------------|-----------|--------|----------|---------|
| **O**      | 0.9615    | 0.9792 | 0.9703   | 45921   |
| **B-BIAS** | 0.5314    | 0.4183 | 0.4681   | 930     |
| **I-BIAS** | 0.4286    | 0.2792 | 0.3381   | 1859    |

## Environmental Impact

Total energy consumption for fine-tuning was 0.032804 kWh.

**Local CO₂ Emission:** Approximately 3.12 grams of CO₂ equivalent.

## License

CC BY 4.0 (Creative Commons Attribution 4.0): allows sharing and adaptation with proper credit.
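
## Training Sketch (illustrative)

The card lists only the `TrainingArguments`. The sketch below shows one way those arguments could be wired into a `Trainer` for this token-classification task. The column names (`tokens`, `ner_tags`), the split names, and the `tokenize_and_align_labels` helper are assumptions for illustration, not the authors' exact training code.

```python
from datasets import load_dataset
from transformers import (
    AutoModelForTokenClassification,
    AutoTokenizer,
    DataCollatorForTokenClassification,
    Trainer,
)

# Assumes the dataset exposes "train" / "validation" splits with
# word-level "tokens" and integer "ner_tags" columns.
dataset = load_dataset("vector-institute/NMB-Plus-Named-Entities")

label_list = ["O", "B-BIAS", "I-BIAS"]
id2label = {i: label for i, label in enumerate(label_list)}
label2id = {label: i for i, label in enumerate(label_list)}

tokenizer = AutoTokenizer.from_pretrained("distilbert/distilbert-base-uncased")
model = AutoModelForTokenClassification.from_pretrained(
    "distilbert/distilbert-base-uncased",
    num_labels=len(label_list),
    id2label=id2label,
    label2id=label2id,
)

def tokenize_and_align_labels(examples):
    # Re-align word-level BIO tags to sub-word tokens; positions set to -100
    # (special tokens and non-initial sub-words) are ignored by the loss.
    tokenized = tokenizer(examples["tokens"], truncation=True, is_split_into_words=True)
    all_labels = []
    for i, word_labels in enumerate(examples["ner_tags"]):
        word_ids = tokenized.word_ids(batch_index=i)
        labels, previous_word_id = [], None
        for word_id in word_ids:
            if word_id is None:
                labels.append(-100)                   # special tokens
            elif word_id != previous_word_id:
                labels.append(word_labels[word_id])   # first sub-word keeps the tag
            else:
                labels.append(-100)                   # remaining sub-words are ignored
            previous_word_id = word_id
        all_labels.append(labels)
    tokenized["labels"] = all_labels
    return tokenized

tokenized_dataset = dataset.map(tokenize_and_align_labels, batched=True)

trainer = Trainer(
    model=model,
    args=training_args,  # the TrainingArguments shown in the section above
    train_dataset=tokenized_dataset["train"],
    eval_dataset=tokenized_dataset["validation"],
    data_collator=DataCollatorForTokenClassification(tokenizer),
    tokenizer=tokenizer,
)
trainer.train()
```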
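
## Evaluation Sketch (illustrative)

The overall and per-class numbers above are consistent with a token-level classification report. Continuing from the training sketch, this is one way such scores could be computed on a held-out "test" split; it is an illustrative reconstruction, not the authors' exact evaluation script.

```python
import numpy as np
from sklearn.metrics import classification_report

label_list = ["O", "B-BIAS", "I-BIAS"]

# Predict on the test split with the trained Trainer from the sketch above.
predictions, labels, _ = trainer.predict(tokenized_dataset["test"])
pred_ids = np.argmax(predictions, axis=-1)

# Keep only positions with a real label (-100 marks special tokens / ignored sub-words).
y_true, y_pred = [], []
for pred_row, label_row in zip(pred_ids, labels):
    for p, l in zip(pred_row, label_row):
        if l != -100:
            y_true.append(label_list[l])
            y_pred.append(label_list[p])

# Token-level precision / recall / F1 per class, plus macro and weighted averages.
print(classification_report(y_true, y_pred, digits=4))
```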