document_redaction / tools / textract_batch_call.py

Commit History

Expanded checks for out-of-range page cropboxes
5fcccbe

seanpedrickcase committed on

Updated gradio version. Minor changes to redactor function sequence. Minor formatting and wording changes.
5a21738

seanpedrickcase committed on

Added config options for compressing output PDFs, for choosing whether to return redacted output PDFs at all, and for changing how long previous Textract jobs are shown
3bbf593

seanpedrickcase committed on

Corrected a couple of bugs. Textract whole-document API call outputs now also load the input PDF into the app
10f46e9

seanpedrickcase committed on

Improved logging format a little. Now possible to save logs to DynamoDB
0042e78

seanpedrickcase committed on

Added button to convert Textract API outputs to ocr_output files easily. Corrected Textract job file location
46bf91e

seanpedrickcase committed on

Added compatibility with gradio_image_annotation for passing id and text properties through to the annotator. Corrected CSV location for Textract API calls. Other minor changes
52c1a90

seanpedrickcase committed on

Minor function documentation changes. Requirements update for the new Gradio version and for a version of the Gradio annotator that allows saving the preferred redaction format and including the box id
f6e6d80

seanpedrickcase committed on

Implemented Textract document API calls and associated output tracking/download. Fixes to config and cost code implementation. General minor bug fixes.
ed5f8c7

seanpedrickcase committed on

Major update. General code revision. Improved config variables. Dataframe-based review frame now includes text; items can be searched and excluded. Costs are now estimated. Added option for adding cost codes. Added option to extract text only.
0ea8b9e

seanpedrickcase committed on