Spaces: seanpedrickcase/document_redaction
928b1e9
document_redaction/tools
3 contributors | History: 62 commits
Latest commit: seanpedrickcase | Can now toggle colour change for boxes. Large boxes now remove text correctly | 928b1e9 | 3 months ago
File | Scan | Size | Last commit message | Last updated
__init__.py | Safe | 0 Bytes | Initial commit | 11 months ago
auth.py | Safe | 1.64 kB | Included entrypoint.sh | 4 months ago
aws_functions.py | Safe | 7.37 kB | Fixed issue where redactions were sometimes not removing text underneath boxes. You can now redact in different colours from review page | 3 months ago
aws_textract.py | Safe | 10.8 kB | Updated packages. Reinstituted multithreading with page load, now with order protected. Smaller spacy model used for speed. Textract calls should now be faster | 3 months ago
cli_redact.py | Safe | 4.73 kB | Allowed for overwriting of default output folder in choose_and_run_redactor function. | 4 months ago
custom_csvlogger.py | Safe | 6.65 kB | Created custom csvlogger to try to overcome AWS Lambda's incompatibility with multithread locks | 4 months ago
custom_image_analyser_engine.py | Safe | 39 kB | Started adding in support for custom deny list. Fixed textract call issue. Removed multithreading for now as it mixes up pages | 3 months ago
data_anonymise.py | Safe | 20.9 kB | Added support for AWS Comprehend for PII identification. OCR and detection results now written to main output | 5 months ago
file_conversion.py | Safe | 19.8 kB | Updated packages. Reinstituted multithreading with page load, now with order protected. Smaller spacy model used for speed. Textract calls should now be faster | 3 months ago
file_redaction.py | Safe | 95.3 kB | Can now toggle colour change for boxes. Large boxes now remove text correctly | 3 months ago
helper_functions.py | Safe | 11.7 kB | Updated packages. Reinstituted multithreading with page load, now with order protected. Smaller spacy model used for speed. Textract calls should now be faster | 3 months ago
load_spacy_model_custom_recognisers.py | Safe | 6.51 kB | Updated packages. Reinstituted multithreading with page load, now with order protected. Smaller spacy model used for speed. Textract calls should now be faster | 3 months ago
presidio_analyzer_custom.py | Safe | 4.94 kB | Added support for AWS Comprehend for PII identification. OCR and detection results now written to main output | 5 months ago
redaction_review.py | Safe | 7.98 kB | Fixed issue where redactions were sometimes not removing text underneath boxes. You can now redact in different colours from review page | 3 months ago