document_redaction / tools

3 contributors

History: 126 commits

seanpedrickcase

Expanded checks for out of range page cropboxes

5fcccbe 3 months ago

__init__.py
0 Bytes
Initial commit (over 1 year ago)

auth.py
2.46 kB
Added compatibility with gradio_image_annotation for passing through id and text properties to annotator. Corrected csv location for Textract api calls. Other minor changes (4 months ago)

aws_functions.py
9.47 kB
Improved logging format a little. Now possible to save logs to DynamoDB (4 months ago)

aws_textract.py
27.3 kB
Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation (4 months ago)

cli_redact.py
4.74 kB
More config options. Fixed some bugs with removing elements from review page and Adobe export. Some UI rearrangements (5 months ago)

config.py
14.6 kB
Expanded checks for out of range page cropboxes (3 months ago)

custom_csvlogger.py
12.8 kB
Updated logging format for timestamps to be compatible with AWS. Added load_dynamo_logs.py example file. (4 months ago)

custom_image_analyser_engine.py
53.9 kB
Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation (4 months ago)

data_anonymise.py
35.9 kB
Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation (4 months ago)

file_conversion.py
100 kB
Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs (3 months ago)

file_redaction.py
121 kB
Expanded checks for out of range page cropboxes (3 months ago)

find_duplicate_pages.py
9.87 kB
Corrected a couple of bugs. Now Textract whole document API call outputs will load also the input PDF into the app (3 months ago)

helper_functions.py
26.3 kB
Now local OCR outputs can be saved to file and reloaded to save preparation time. Bug fixing in logs and tabular data redaction. Update to documentation (4 months ago)

load_spacy_model_custom_recognisers.py
13.7 kB
Major update. General code revision. Improved config variables. Dataframe based review frame now includes text, items can be searched and excluded. Costs now estimated. Option for adding cost codes added. Option to extract text only. (4 months ago)

presidio_analyzer_custom.py
4.92 kB
More config options. Fixed some bugs with removing elements from review page and Adobe export. Some UI rearrangements (5 months ago)

redaction_review.py
81 kB
Added config options for compressing output pdfs, returning output redacted pdfs at all, and for changing the length of time for showing previous Textract jobs (3 months ago)

textract_batch_call.py
28 kB
Expanded checks for out of range page cropboxes (3 months ago)