document_redaction / lambda_entrypoint.py

Commit History

Updated user guide and app settings. Updated some additional lambda_entrypoint arguments. Ensured that examples are correctly displayed on GUI.
c543ba0

seanpedrickcase committed on

head attribute added to Gradio blocks context to enable enforcement of direct vs relative file paths. Updates to direct mode/lambda entrypoint to ensure as many options as possible can be user defined
febacad

seanpedrickcase committed on

Correction to PaddleOCR config variable. Minor print statement changes
6c62394

seanpedrickcase committed on

Revised environment variables for consistency.
5f824f4

seanpedrickcase committed on

Custom env variables should now overwrite defaults for lambda function. Usage logs should now be correctly created with lambda function
6806363

seanpedrickcase committed on

Updated some config variable defaults for lambda_entrypoint (e.g. page max, min) to ensure that they are correctly parsed
8da3518

seanpedrickcase committed on

Corrected environment variable file references for log files and spacy/paddle folders for lambda_entrypoint
e347a56

seanpedrickcase committed on

Added logging folders to cli_redact to ensure correct saves with read-only file systems (e.g. lambda). Updated list-based parsing of arguments in lambda_entrypoint.py
40c65f7

seanpedrickcase committed on

Updated lambda_entrypoint dict references. Redaction functions should now return files even if MAX_TIME_VALUE value exceeded. load_all_output_files should now return subfolder files
260af8f

seanpedrickcase committed on

Updated file processing for more efficient redaction for specific page ranges. Updated lambda_entrypoint to allow for environment variables from .env files, and limits to compatible file types
59caba2

seanpedrickcase committed on

Updated some config variables for lambda functions to enable successful run
20046b2

seanpedrickcase committed on

Updated cdk_stack for build commands compatible with new dockerfile. Minor changes to lambda function to specify text extraction method correctly.
e7e4e50

seanpedrickcase committed on

Fixed deprecated GitHub workflow functions. Applied linter and formatter to code throughout. Added tests for GUI load.
bafcf39

seanpedrickcase committed on

Added form, table, and layout extraction options to AWS Textract calls. Added options to config to bound document length, maximum table rows, etc.
d3e6a24

seanpedrickcase committed on

Added example data files. Greatly revised the CLI for redaction, deduplication, and AWS Textract batch calls. Various minor fixes and package updates.
d60759d

seanpedrickcase committed on

Fix to tabular redaction, added tabular deduplication. Updated CLI call capability for both.
aa5c211

seanpedrickcase committed on

Allowed for overwriting of default output folder in choose_and_run_redactor function.
68a91f4

seanpedrickcase committed on

Updated output file creation variables for Lambda direct redaction runs
e85b74e

seanpedrickcase committed on

Removed need to write result.stdout in lambda entrypoint
5d649ba

seanpedrickcase committed on

Added a little more debugging code to lambda_entrypoint
653bd2d

seanpedrickcase committed on

Moved gradio run code to outside of lambda_handler function in lambda_entrypoint.py
1cfa6e8

seanpedrickcase committed on

Switched the start .py file run through the Dockerfile to lambda_entrypoint. Added Gradio links from this .py file.
6622361

seanpedrickcase committed on

Some more debugging. Added aws-lambda-adapter just in case that's useful in AWS Lambda
a3ba5e2

seanpedrickcase committed on

Added some debugging statements for entrypoint_router and lambda_entrypoint.py
18fb7ec

seanpedrickcase committed on

Added lambda_entrypoint.py to main folder
9337aae

seanpedrickcase committed on

Added option for running redact function through CLI (i.e. not going through Gradio UI or API). Test functions for running this through AWS Lambda.
e5dfae7

seanpedrickcase committed on