dcarpintero
/

pangolin-guard-large

@@ -3,7 +3,9 @@ library_name: transformers
 license: apache-2.0
 base_model: answerdotai/ModernBERT-large
 tags:
-- generated_from_trainer
 metrics:
 - f1
 - accuracy
@@ -16,15 +18,15 @@ model-index:
 LLM applications face critical security challenges in form of prompt injections and jailbreaks. This can result in models leaking sensitive data or deviating from their intended behavior. Existing safeguard models are not fully open and have limited context windows (e.g., only 512 tokens in LlamaGuard).
-PangolinGuard is a ModernBERT (Large), lightweight model that discriminates malicious prompts (i.e. prompt injection attacks).
 🤗 [Tech-Blog](https://huggingface.co/blog/dcarpintero/pangolin-fine-tuning-modern-bert) | [GitHub Repo](https://github.com/dcarpintero/pangolin-guard)
-## Intended uses
-- Adding a self-hosted, inexpensive safety checks (against prompt injection attacks) to AI agents and conversational interfaces.
-## Evaluation data
 Evaluated on unseen data from a subset of specialized benchmarks targeting prompt safety and malicious input detection, while testing over-defense behavior:
@@ -35,9 +37,9 @@ Evaluated on unseen data from a subset of specialized benchmarks targeting promp
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a13b68b14ab77f9e3eb061/ygIo-Yo3NN7mDhZlLFvZb.png)
-## Training procedure
-### Training hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
@@ -49,7 +51,7 @@ The following hyperparameters were used during training:
 - bf16: True
 - num_epochs: 2
-### Training results
 | Training Loss | Epoch  | Step | Validation Loss | F1     | Accuracy |
 |:-------------:|:------:|:----:|:---------------:|:------:|:--------:|
@@ -74,7 +76,7 @@ The following hyperparameters were used during training:
 | 0.0052        | 1.9792 | 1900 | 0.0225          | 0.9904 | 0.9937   |
-### Framework versions
 - Transformers 4.48.3
 - Pytorch 2.5.1+cu124

 license: apache-2.0
 base_model: answerdotai/ModernBERT-large
 tags:
+- ai-safety
+- safeguards
+- guardrails
 metrics:
 - f1
 - accuracy
 LLM applications face critical security challenges in form of prompt injections and jailbreaks. This can result in models leaking sensitive data or deviating from their intended behavior. Existing safeguard models are not fully open and have limited context windows (e.g., only 512 tokens in LlamaGuard).
+**Pangolin Guard** is a ModernBERT (Large), lightweight model that discriminates malicious prompts (i.e. prompt injection attacks).
 🤗 [Tech-Blog](https://huggingface.co/blog/dcarpintero/pangolin-fine-tuning-modern-bert) | [GitHub Repo](https://github.com/dcarpintero/pangolin-guard)
+## Intended Use Cases
+- Adding a self-hosted, inexpensive defense mechanism against prompt injection attacks to AI agents and conversational interfaces.
+## Evaluation Data
 Evaluated on unseen data from a subset of specialized benchmarks targeting prompt safety and malicious input detection, while testing over-defense behavior:
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/64a13b68b14ab77f9e3eb061/ygIo-Yo3NN7mDhZlLFvZb.png)
+## Training Procedure
+### Training Hyperparameters
 The following hyperparameters were used during training:
 - learning_rate: 5e-05
 - bf16: True
 - num_epochs: 2
+### Training Results
 | Training Loss | Epoch  | Step | Validation Loss | F1     | Accuracy |
 |:-------------:|:------:|:----:|:---------------:|:------:|:--------:|
 | 0.0052        | 1.9792 | 1900 | 0.0225          | 0.9904 | 0.9937   |
+### Framework Versions
 - Transformers 4.48.3
 - Pytorch 2.5.1+cu124