ash0ts committed
Commit 06b5c11 · 1 Parent(s): 084dab1

Create ReadMe.md

Files changed (1)
  1. guardrails_genie/guardrails/ReadMe.md +136 -0
guardrails_genie/guardrails/ReadMe.md ADDED
@@ -0,0 +1,136 @@
# Entity Recognition Guardrails

A collection of guardrails for detecting and anonymizing various types of entities in text, including PII (Personally Identifiable Information), restricted terms, and custom entities.

## Available Guardrails

### 1. Regex Entity Recognition
Simple pattern-based entity detection using regular expressions.

```python
from guardrails_genie.guardrails.entity_recognition import RegexEntityRecognitionGuardrail

# Initialize with default PII patterns
guardrail = RegexEntityRecognitionGuardrail(should_anonymize=True)

# Or with custom patterns
custom_patterns = {
    "employee_id": r"EMP\d{6}",
    "project_code": r"PRJ-[A-Z]{2}-\d{4}"
}
guardrail = RegexEntityRecognitionGuardrail(patterns=custom_patterns, should_anonymize=True)
```

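To make the pattern-based approach concrete, here is a minimal standalone sketch of how named regular expressions can be applied to text. It uses only Python's `re` module. The helper `detect_with_patterns` is hypothetical and is not the implementation of `RegexEntityRecognitionGuardrail`, which may work differently.

```python
import re
from typing import Dict, List


# Hypothetical helper for illustration only; not part of guardrails_genie.
def detect_with_patterns(text: str, patterns: Dict[str, str]) -> Dict[str, List[str]]:
    """Return every match of each named pattern, keyed by entity type."""
    detected: Dict[str, List[str]] = {}
    for entity_type, pattern in patterns.items():
        matches = [m.group(0) for m in re.finditer(pattern, text)]
        if matches:  # only report entity types that actually occur
            detected[entity_type] = matches
    return detected


custom_patterns = {
    "employee_id": r"EMP\d{6}",
    "project_code": r"PRJ-[A-Z]{2}-\d{4}",
}

found = detect_with_patterns("EMP123456 is assigned to PRJ-AB-2024.", custom_patterns)
print(found)
```

Keying the result by entity type mirrors the `detected_entities` mapping shown under Response Format.
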
### 2. Presidio Entity Recognition
Advanced entity detection using Microsoft's Presidio analyzer.

```python
from guardrails_genie.guardrails.entity_recognition import PresidioEntityRecognitionGuardrail

# Initialize with default entities
guardrail = PresidioEntityRecognitionGuardrail(should_anonymize=True)

# Or with specific entities
selected_entities = ["CREDIT_CARD", "US_SSN", "EMAIL_ADDRESS"]
guardrail = PresidioEntityRecognitionGuardrail(
    selected_entities=selected_entities,
    should_anonymize=True
)
```

### 3. Transformers Entity Recognition
Entity detection using transformer-based models.

```python
from guardrails_genie.guardrails.entity_recognition import TransformersEntityRecognitionGuardrail

# Initialize with default model
guardrail = TransformersEntityRecognitionGuardrail(should_anonymize=True)

# Or with specific model and entities
guardrail = TransformersEntityRecognitionGuardrail(
    model_name="iiiorg/piiranha-v1-detect-personal-information",
    selected_entities=["GIVENNAME", "SURNAME", "EMAIL"],
    should_anonymize=True
)
```

### 4. LLM Judge for Restricted Terms
LLM-based detection of restricted terms and competitor mentions, for use cases such as brand protection.

```python
from guardrails_genie.guardrails.entity_recognition import RestrictedTermsJudge

# Initialize with OpenAI model
guardrail = RestrictedTermsJudge(should_anonymize=True)

# Check for specific terms
result = guardrail.guard(
    text="Let's implement features like Salesforce",
    custom_terms=["Salesforce", "Oracle", "AWS"]
)
```

## Usage

All guardrails follow a consistent interface:

```python
from guardrails_genie.guardrails.entity_recognition import RegexEntityRecognitionGuardrail

# Initialize a guardrail
guardrail = RegexEntityRecognitionGuardrail(should_anonymize=True)

# Check text for entities (the address is a placeholder)
result = guardrail.guard("Hello, my email is user@example.com")

# Access results
print(f"Contains entities: {result.contains_entities}")
print(f"Detected entities: {result.detected_entities}")
print(f"Explanation: {result.explanation}")
print(f"Anonymized text: {result.anonymized_text}")
```

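As an illustration of what anonymization can look like once entities have been detected, the sketch below replaces each detected value with a placeholder. The helper `anonymize` and the `[ENTITY_TYPE]` placeholder format are assumptions made for this example; the guardrails in this module may produce a different format.

```python
from typing import Dict, List


# Hypothetical helper; the "[ENTITY_TYPE]" placeholder format is an assumption.
def anonymize(text: str, detected_entities: Dict[str, List[str]]) -> str:
    """Replace every detected value with a placeholder naming its entity type."""
    pairs = [
        (value, entity_type)
        for entity_type, values in detected_entities.items()
        for value in values
    ]
    # Replace longer values first so a short value that is a substring
    # of a longer one does not split it.
    for value, entity_type in sorted(pairs, key=lambda p: len(p[0]), reverse=True):
        text = text.replace(value, f"[{entity_type.upper()}]")
    return text


detected = {"email": ["user@example.com"]}
masked = anonymize("Hello, my email is user@example.com", detected)
print(masked)
```
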
## Evaluation Tools

The module includes evaluation tools and test cases:

- `pii_examples/`: Test cases for PII detection
- `banned_terms_examples/`: Test cases for restricted terms
- Benchmark scripts for evaluating model performance

### Running Evaluations

```python
# PII detection benchmark
from guardrails_genie.guardrails.entity_recognition.pii_examples.pii_benchmark import main as run_pii_benchmark
run_pii_benchmark()

# TODO: restricted terms benchmark
from guardrails_genie.guardrails.entity_recognition.banned_terms_examples.banned_term_benchmark import main as run_banned_terms_benchmark
run_banned_terms_benchmark()
```

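For readers who want a feel for how detection quality can be scored, here is a small sketch that compares detected entities against expected ones using precision, recall, and F1. The function names and the exact-match scoring rule are assumptions for illustration; the benchmark scripts in this module may compute their metrics differently.

```python
from typing import Dict, List, Set, Tuple


def to_pairs(entities: Dict[str, List[str]]) -> Set[Tuple[str, str]]:
    """Flatten {entity_type: [values]} into a set of (entity_type, value) pairs."""
    return {(etype, value) for etype, values in entities.items() for value in values}


# Hypothetical scoring function: an entity counts as correct only when
# both its type and its exact value match an expected entity.
def score(expected: Dict[str, List[str]], detected: Dict[str, List[str]]) -> Dict[str, float]:
    exp, det = to_pairs(expected), to_pairs(detected)
    true_positives = len(exp & det)
    precision = true_positives / len(det) if det else 0.0
    recall = true_positives / len(exp) if exp else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


expected = {"email": ["user@example.com"], "phone": ["555-0100"]}
detected = {"email": ["user@example.com"], "name": ["Hello"]}
metrics = score(expected, detected)
print(metrics)
```
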
## Features

- Entity detection and anonymization
- Support for multiple detection methods (regex, Presidio, transformers, LLMs)
- Customizable entity types and patterns
- Detailed explanations of detected entities
- Comprehensive evaluation framework
- Support for custom terms and patterns
- Batch processing capabilities
- Performance metrics and benchmarking

## Response Format

All guardrails return responses with the following structure:

```python
{
    "contains_entities": bool,
    "detected_entities": {
        "entity_type": ["detected_value_1", "detected_value_2"]
    },
    "explanation": str,
    "anonymized_text": Optional[str]
}
```
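
The structure above can be modeled as a typed container. The following is a minimal sketch using a standard-library dataclass; the class name `EntityRecognitionResponse` and the default values are assumptions for this example, and the module's actual response class may be defined differently.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class EntityRecognitionResponse:  # hypothetical name, for illustration only
    contains_entities: bool
    # Maps each entity type to the list of values detected for it.
    detected_entities: Dict[str, List[str]] = field(default_factory=dict)
    explanation: str = ""
    # None when anonymization was not requested.
    anonymized_text: Optional[str] = None


response = EntityRecognitionResponse(
    contains_entities=True,
    detected_entities={"email": ["user@example.com"]},
    explanation="Found 1 email address.",
    anonymized_text="Hello, my email is [EMAIL]",
)
print(response.contains_entities)
print(response.detected_entities)
```

Attribute access on such an object matches the `result.contains_entities` style used in the Usage section.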