Spaces: seanpedrickcase/document_redaction
document_redaction/tools
3 contributors
History: 95 commits
Latest commit: ff290e1 by seanpedrickcase, 3 months ago
Integrated AWS Comprehend and fuzzy matching functions with tabular data redaction.
| File | Scan status | Size | Last commit message | Last updated |
| --- | --- | --- | --- | --- |
| __init__.py | Safe | 0 Bytes | Initial commit | about 1 year ago |
| auth.py | Safe | 2.92 kB | Ensured the text ocr outputs have no line breaks at end. Multi-line custom text searches now possible. Files for review sent from redact button. Fixed image redaction (not review yet). Can get user pool details from headers. Gradio update. | 4 months ago |
| aws_functions.py | Safe | 8.03 kB | Allowed for Textract and Comprehend API calls through AWS keys. File preparation function incorporated into main redaction function to avoid needing user to 'check in' during redaction process | 3 months ago |
| aws_textract.py | Safe | 12.1 kB | Laid groundwork for passing in AWS API keys. Duplicate pages option should now work for pages with no text. | 3 months ago |
| cli_redact.py | Safe | 4.73 kB | Allowed for overwriting of default output folder in choose_and_run_redactor function. | 6 months ago |
| custom_csvlogger.py | Safe | 6.65 kB | Created custom csvlogger to try to overcome AWS Lambda's incompatibility with multithread locks | 6 months ago |
| custom_image_analyser_engine.py | | 49.2 kB | Integrated AWS Comprehend and fuzzy matching functions with tabular data redaction. | 3 months ago |
| data_anonymise.py | | 34.8 kB | Integrated AWS Comprehend and fuzzy matching functions with tabular data redaction. | 3 months ago |
| file_conversion.py | Safe | 40.3 kB | Allowed for Textract and Comprehend API calls through AWS keys. File preparation function incorporated into main redaction function to avoid needing user to 'check in' during redaction process | 3 months ago |
| file_redaction.py | | 94.9 kB | Integrated AWS Comprehend and fuzzy matching functions with tabular data redaction. | 3 months ago |
| find_duplicate_pages.py | Safe | 9.68 kB | Laid groundwork for passing in AWS API keys. Duplicate pages option should now work for pages with no text. | 3 months ago |
| helper_functions.py | | 14.6 kB | Allowed for output files to be saved into user-specific folders. Added deny list capability to xlsx/csv file redaction | 3 months ago |
| load_spacy_model_custom_recognisers.py | Safe | 13.7 kB | Fixed issues with gradio version 5.16. Fixed fuzzy search error with pages with no data. | 3 months ago |
| presidio_analyzer_custom.py | Safe | 4.94 kB | Added support for AWS Comprehend for PII identification. OCR and detection results now written to main output | 7 months ago |
| redaction_review.py | | 28.6 kB | Allowed for output files to be saved into user-specific folders. Added deny list capability to xlsx/csv file redaction | 3 months ago |