Model Card for the AIShield Medical Safety Classification Model

Model Details

Model Description

This model is designed for medical safety classification, distinguishing medically safe from medically unsafe queries. It has been rigorously evaluated on multiple datasets to assess its reliability in safety-critical applications.

  • Developed by: AIShield
  • Model type: Transformer-based classification model
  • Language(s) (NLP): English
  • License: Non-permissive, private, not for commercialization
  • Finetuned from model: distilbert-base-uncased

Model Sources

  • Repository: [More Information Needed]
  • Paper: [More Information Needed]
  • Demo: [More Information Needed]

Uses

Direct Use

This model is intended for medical content moderation, ensuring that unsafe queries are flagged appropriately while minimizing false positives for safe content.

Downstream Use

  • Can be fine-tuned further for broader safety classification, including generic unsafe content.
  • May be integrated into health-related AI assistants to prevent the spread of misinformation.

Out-of-Scope Use

  • Not intended for legal or regulatory decision-making.
  • Not a substitute for medical expertise.
  • Might not generalize well to non-medical domains without further training.

Bias, Risks, and Limitations

Risks and Limitations

  • Potential Over-Filtering: Some safe medical queries may be incorrectly flagged as unsafe (~0.059% false positive rate).
  • Domain-Specific Performance: While effective for medical safety classification, performance varies somewhat on generic unsafe content.
  • False Negatives on Generic Unsafe Data: In one test, 5.26% of generic unsafe queries were misclassified as safe.

Recommendations

  • Fine-tune with diverse safety datasets to improve generalization.
  • Adjust classification thresholds to balance false positives and false negatives based on application needs (see the sketch below).
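
To make the threshold recommendation concrete, here is a minimal sketch that scores a query against an explicit cutoff instead of taking the pipeline's top label. The unsafe label name ("LABEL_1") and the 0.7 cutoff are illustrative assumptions, not published values; verify the label mapping via model.config.id2label and tune the threshold on held-out data.

from transformers import pipeline

# Sketch only: the label name and threshold below are assumptions, not published values.
classifier = pipeline(
    "text-classification",
    model="parmarm/medical_unsafe_detection_bert_final_v1",
    top_k=None,  # return scores for every label, not just the argmax
)

UNSAFE_LABEL = "LABEL_1"  # assumed unsafe class; check model.config.id2label
THRESHOLD = 0.7           # lower to favor recall, raise to reduce false positives

def is_unsafe(query: str) -> bool:
    scores = classifier(query)[0]  # one list of {"label", "score"} dicts per input
    unsafe_score = next(s["score"] for s in scores if s["label"] == UNSAFE_LABEL)
    return unsafe_score >= THRESHOLD

print(is_unsafe("Is it safe to take ibuprofen with aspirin?"))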

How to Get Started with the Model

Use the code below to get started with the model.

from transformers import pipeline

# Load the classifier from the Hugging Face Hub (requires accepting the access conditions)
classifier = pipeline("text-classification", model="parmarm/medical_unsafe_detection_bert_final_v1")

# Classify a sample medical query
result = classifier("Is it safe to take ibuprofen with aspirin?")
print(result)
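
The pipeline returns one list of dictionaries per input, of the form [{'label': ..., 'score': ...}]; the exact label strings depend on the model's id2label configuration, so inspect them before wiring the output into downstream logic.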

Training Details

Training Data

  • Left-Out Dataset
  • Generic Safety Dataset

Training Procedure

  • Output Directory: ./bert_medical_classifier_train
  • Evaluation Strategy: Epoch-based
  • Save Strategy: Epoch-based
  • Learning Rate: 1e-5
  • Batch Size (Train & Eval): 32
  • Gradient Accumulation Steps: 4
  • Epochs: 2
  • Weight Decay: 0.1
  • Warmup Ratio: 0.06
  • Logging Steps: 100
  • Save Total Limit: 2
  • Load Best Model at End: True
  • Best Model Metric: eval_loss
  • Dataloader Workers: 16
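
For reference, the hyperparameters above map onto a Hugging Face TrainingArguments object roughly as follows. This is a reconstruction from the list, not the authors' published script (note the card lists a 1e-5 learning rate here but 2e-5 under Optimization Details; the sketch follows this list), and older transformers releases spell eval_strategy as evaluation_strategy.

from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./bert_medical_classifier_train",
    eval_strategy="epoch",             # "evaluation_strategy" in older releases
    save_strategy="epoch",
    learning_rate=1e-5,
    per_device_train_batch_size=32,
    per_device_eval_batch_size=32,
    gradient_accumulation_steps=4,
    num_train_epochs=2,
    weight_decay=0.1,
    warmup_ratio=0.06,
    logging_steps=100,
    save_total_limit=2,
    load_best_model_at_end=True,
    metric_for_best_model="eval_loss",
    greater_is_better=False,           # lower eval_loss is better
    dataloader_num_workers=16,
)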

Optimization Details

  • Optimizer: AdamW (lr=2e-5, weight_decay=0.1, fused=True)
  • Loss Function: Class-weighted CrossEntropyLoss
  • Custom Trainer: Implements weighted loss computation
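
The custom trainer itself is not published; the sketch below shows one common way to implement class-weighted CrossEntropyLoss inside a Trainer subclass. The class name WeightedLossTrainer and the example weights are assumptions.

import torch
from torch import nn
from transformers import Trainer

class WeightedLossTrainer(Trainer):
    # Sketch of a Trainer that applies class-weighted cross-entropy;
    # the actual class weights used for this model are not published.
    def __init__(self, *args, class_weights=None, **kwargs):
        super().__init__(*args, **kwargs)
        # e.g. torch.tensor([1.0, 2.0]) to upweight the unsafe class
        self.class_weights = class_weights

    def compute_loss(self, model, inputs, return_outputs=False, **kwargs):
        labels = inputs.pop("labels")
        outputs = model(**inputs)
        logits = outputs.logits
        loss_fct = nn.CrossEntropyLoss(weight=self.class_weights.to(logits.device))
        loss = loss_fct(logits.view(-1, logits.size(-1)), labels.view(-1))
        return (loss, outputs) if return_outputs else loss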

Post-Training Performance Metrics

Training Metrics

  • Global Steps: 296
  • Training Loss: 0.0663
  • Training Runtime: 141.55s
  • Train Samples per Second: 268.75
  • Train Steps per Second: 2.091

Evaluation Metrics

  • Eval Loss: 0.0120
  • Eval Accuracy: 99.68%
  • Eval Precision: 99.41%
  • Eval Recall: 99.94%
  • Eval F1 Score: 99.68%
  • Eval ROC-AUC: 99.998%
  • Evaluation Runtime: 34.81s
  • Eval Samples per Second: 1639.08
  • Eval Steps per Second: 51.24

Evaluation

Testing Data, Factors & Metrics

Datasets Used for Evaluation

| Dataset | Size | Category | Purpose |
|---|---|---|---|
| Balanced Medical Dataset | 50,742 | Medical Safe & Unsafe | Primary performance evaluation |
| Left-Out Medical Unsafe | 49,003 | Medical Unsafe | Evaluating recall for unsafe cases |
| Left-Out Medical Safe | 10,178 | Medical Safe | Evaluating false positives |
| Generic Unsafe #1 | 456 | Generic Unsafe | Checking generalization capability |
| Generic Unsafe #2 | 520 | Generic Unsafe | Further verification of generalization |

Evaluation Metrics

  • Accuracy: Measures overall correctness.
  • Precision (for Unsafe Queries): How many predicted unsafe cases were actually unsafe.
  • Recall (for Unsafe Queries): How many actual unsafe cases were correctly identified.
  • F1 Score: The harmonic mean of precision and recall.
  • False Positive Rate (FPR): Percentage of safe queries misclassified as unsafe.
  • False Negative Rate (FNR): Percentage of unsafe queries misclassified as safe.
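
For reproducibility, all six metrics can be computed from binary predictions with scikit-learn as in the sketch below; treating label 1 as the unsafe class is an assumption about the encoding.

import numpy as np
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score)

def safety_metrics(y_true, y_pred, unsafe_label=1):
    # Assumes a binary encoding in which `unsafe_label` marks unsafe queries.
    safe_label = 1 - unsafe_label
    tn, fp, fn, tp = confusion_matrix(
        y_true, y_pred, labels=[safe_label, unsafe_label]
    ).ravel()
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision_unsafe": precision_score(y_true, y_pred, pos_label=unsafe_label),
        "recall_unsafe": recall_score(y_true, y_pred, pos_label=unsafe_label),
        "f1_unsafe": f1_score(y_true, y_pred, pos_label=unsafe_label),
        "fpr": fp / (fp + tn),  # safe queries flagged as unsafe
        "fnr": fn / (fn + tp),  # unsafe queries missed
    }

print(safety_metrics(np.array([0, 1, 1, 0]), np.array([0, 1, 1, 1])))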

Results Summary

1. Balanced Medical Dataset (50,742 samples)

  • Accuracy: 99.74%
  • Precision (Unsafe): 99.49%
  • Recall (Unsafe): 99.97%
  • F1 Score: 99.73%
  • False Positive Rate: 0.51%
  • False Negative Rate: 0.03%

2. Left-Out Medical Unsafe Dataset (49,003 samples)

  • Recall (Unsafe): 99.98%
  • False Negative Rate: 0.0163%

3. Left-Out Medical Safe Dataset (10,178 samples)

  • Accuracy/Specificity: 99.94%
  • False Positive Rate: 0.059%

4. Generic Unsafe Dataset #1 (456 samples)

  • Recall: 94.74%
  • False Negative Rate: 5.26%

5. Generic Unsafe Dataset #2 (520 samples)

  • Recall: 100%

Model Card Contact

For inquiries, contact AIShield.
