GitHub Actions Integration Guide

This guide explains how to use your test suite with GitHub Actions for automated CI/CD.

🚀 Quick Start

1. Choose Your Workflow

Three workflow options are available:

Option A: Simple Test Run (Recommended for beginners)

```yaml
# File: .github/workflows/simple-test.yml
# - Basic test execution
# - Ubuntu Latest
# - Python 3.11
# - Minimal setup
```

Option B: Comprehensive CI/CD (Recommended for production)

```yaml
# File: .github/workflows/ci.yml
# - Full pipeline with linting, security, coverage
# - Multiple Python versions
# - Integration tests
# - Package building
```

Option C: Multi-OS Testing (For cross-platform compatibility)

```yaml
# File: .github/workflows/multi-os-test.yml
# - Tests on Ubuntu, Windows, macOS
# - Multiple Python versions
# - Cross-platform compatibility
```

2. Enable GitHub Actions

  1. Push your code to GitHub
  2. Go to your repository → Actions tab
  3. Select a workflow and click "Run workflow"
  4. Watch the tests run automatically!

📋 What Each Workflow Does

Simple Test Run (.github/workflows/simple-test.yml)

✅ Installs system dependencies (tesseract, poppler, OpenGL)
✅ Installs Python dependencies from requirements.txt
✅ Downloads spaCy model
✅ Creates dummy test data automatically
✅ Runs your CLI tests
✅ Runs pytest with coverage

Comprehensive CI/CD (.github/workflows/ci.yml)

✅ Linting (Ruff, Black)
✅ Unit tests (Python 3.10, 3.11, 3.12)
✅ Integration tests
✅ Security scanning (Safety, Bandit)
✅ Coverage reporting
✅ Package building (on main branch)
✅ Artifact uploads

Multi-OS Testing (.github/workflows/multi-os-test.yml)

✅ Tests on Ubuntu, Windows, macOS
✅ Python 3.10, 3.11, 3.12
✅ Cross-platform compatibility
✅ OS-specific dependency handling

🔧 How It Works

Automatic Test Data Creation

The workflows automatically create dummy test files when your example data is missing:

```
# .github/scripts/setup_test_data.py creates:
- example_data/example_of_emails_sent_to_a_professor_before_applying.pdf
- example_data/combined_case_notes.csv
- example_data/Bold minimalist professional cover letter.docx
- example_data/example_complaint_letter.jpg
- example_data/test_allow_list_*.csv
- example_data/partnership_toolkit_redact_*.csv
- example_data/example_outputs/doubled_output_joined.pdf_ocr_output.csv
```
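The create-if-missing behaviour can be sketched roughly as follows. This is only an illustration, not the contents of `.github/scripts/setup_test_data.py`: the function name `ensure_dummy_csv`, the column names, and the row contents are hypothetical, and the PDF, DOCX, and image files listed above are left out.

```python
import csv
import tempfile
from pathlib import Path


def ensure_dummy_csv(path: Path) -> bool:
    """Create a tiny placeholder CSV at `path` unless a file already exists.

    Returns True if a file was written, False if an existing file was kept.
    The columns and rows are illustrative, not the project's real schema.
    """
    if path.exists():
        return False  # never overwrite real example data
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(["id", "note"])
        writer.writerow([1, "Dummy case note for CI runs."])
    return True


if __name__ == "__main__":
    # Demonstrate in a temporary directory so the repository is not touched.
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "example_data" / "combined_case_notes.csv"
        print(ensure_dummy_csv(target))  # first call writes the file
        print(ensure_dummy_csv(target))  # second call leaves it alone
```

Checking for the file first means real example data, when present, is never replaced by a placeholder.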

System Dependencies

Each OS gets the right dependencies:

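Ubuntu:

```bash
# Ubuntu 24.04 and later no longer provide libgl1-mesa-glx; install libgl1 there instead.
sudo apt-get install tesseract-ocr poppler-utils libgl1-mesa-glx
```

macOS:

```bash
brew install tesseract poppler
```

Windows:

```bash
# Handled by Python packages
```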
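As a hedged illustration of OS-specific dependency handling, the sketch below maps the value of `platform.system()` to the commands listed above. The names `SYSTEM_INSTALL_COMMANDS` and `install_command` are hypothetical, and treating every `"Linux"` host as Ubuntu is an assumption; the actual workflows may select commands in YAML rather than in Python.

```python
import platform

# Commands copied from the guide; Windows needs no system packages here.
# Assumption: any "Linux" host is an Ubuntu/Debian runner with apt-get.
SYSTEM_INSTALL_COMMANDS = {
    "Linux": [
        "sudo", "apt-get", "install",
        "tesseract-ocr", "poppler-utils", "libgl1-mesa-glx",
    ],
    "Darwin": ["brew", "install", "tesseract", "poppler"],
    "Windows": [],
}


def install_command(system: str) -> list[str]:
    """Return the install command for `system` (a platform.system() value)."""
    try:
        return SYSTEM_INSTALL_COMMANDS[system]
    except KeyError:
        raise ValueError(f"No install command known for {system!r}") from None


if __name__ == "__main__":
    command = install_command(platform.system())
    print(" ".join(command) if command else "nothing to install")
```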

Python Dependencies

```bash
pip install -r requirements.txt
pip install pytest pytest-cov reportlab pillow
```

🎯 Triggers

When Tests Run:

  • ✅ Push to main/dev branches
  • ✅ Pull requests to main/dev
  • ✅ Daily at 2 AM UTC (scheduled)
  • ✅ Manual trigger from GitHub UI

What Happens:

  1. Checkout code
  2. Install dependencies
  3. Create test data
  4. Run tests
  5. Generate reports
  6. Upload artifacts

📊 Test Results

Success Criteria:

  • ✅ All tests pass
  • ✅ No linting errors
  • ✅ Security checks pass
  • ✅ Coverage reports generated

Failure Handling:

  • ✅ Tests skip gracefully if files are missing
  • ✅ AWS tests are expected to fail without credentials
  • ✅ System dependency failures are handled with fallbacks
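One way to get this behaviour is sketched below with the standard-library `unittest` module, whose test cases pytest can also collect. The class name `ExampleSkipTests` and the helper `run_suite` are hypothetical, and checking `AWS_ACCESS_KEY_ID` is an assumption about how credentials would be detected. The guide says the AWS tests are expected to fail without credentials; this sketch skips them instead, which is an alternative and not necessarily what the real suite does.

```python
import os
import unittest
from pathlib import Path

# Path taken from the test data list in this guide.
EXAMPLE_PDF = Path(
    "example_data/example_of_emails_sent_to_a_professor_before_applying.pdf"
)
# Assumption: credentials are supplied through the usual environment variable.
HAS_AWS_CREDENTIALS = bool(os.environ.get("AWS_ACCESS_KEY_ID"))


class ExampleSkipTests(unittest.TestCase):
    def test_needs_example_file(self):
        # Skip at runtime instead of failing when the input is absent.
        if not EXAMPLE_PDF.exists():
            self.skipTest(f"Example file not found: {EXAMPLE_PDF}")
        self.assertGreater(EXAMPLE_PDF.stat().st_size, 0)

    @unittest.skipUnless(HAS_AWS_CREDENTIALS, "AWS credentials not configured")
    def test_needs_aws(self):
        # A real test would exercise an AWS-backed code path here.
        self.assertTrue(HAS_AWS_CREDENTIALS)


def run_suite() -> unittest.TestResult:
    """Run the example tests and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleSkipTests)
    return unittest.TextTestRunner(verbosity=2).run(suite)


if __name__ == "__main__":
    run_suite()
```

Skipped tests are reported separately from passes and failures, so a missing file stays visible in the logs without turning the run red.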

🔍 Monitoring

GitHub Actions Tab:

  • View workflow runs
  • See test results
  • Download artifacts
  • View logs

Artifacts Generated:

  • test-results.xml - JUnit test results
  • coverage.xml - Coverage data
  • htmlcov/ - HTML coverage report
  • bandit-report.json - Security scan results
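After downloading `test-results.xml`, a short script can summarise it. The sketch below assumes the file follows the JUnit layout that pytest writes with its `--junitxml` option (a `testsuites` root containing `testsuite` elements with `tests`, `failures`, `errors`, and `skipped` attributes). The function name `summarize_junit` and the `SAMPLE` document are hypothetical and do not come from a real run.

```python
import xml.etree.ElementTree as ET


def summarize_junit(xml_text: str) -> dict[str, int]:
    """Sum the test counters across all <testsuite> elements."""
    root = ET.fromstring(xml_text)
    # Accept either a <testsuites> wrapper or a bare <testsuite> root.
    suites = [root] if root.tag == "testsuite" else root.findall("testsuite")
    totals = {"tests": 0, "failures": 0, "errors": 0, "skipped": 0}
    for suite in suites:
        for key in totals:
            totals[key] += int(suite.get(key, 0))
    return totals


# Made-up sample document, for illustration only.
SAMPLE = """<testsuites>
  <testsuite name="pytest" tests="3" failures="1" errors="0" skipped="1"/>
</testsuites>"""

if __name__ == "__main__":
    print(summarize_junit(SAMPLE))
```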

Coverage Reports:

  • Uploaded to Codecov automatically
  • Available in GitHub Actions artifacts
  • HTML reports for detailed analysis

🛠️ Customization

Adding New Tests:

  1. Add test methods to test/test.py
  2. Update setup_test_data.py if needed
  3. Tests run automatically in all workflows

Modifying Workflows:

  1. Edit the .yml file
  2. Test locally first
  3. Push to trigger workflow

Environment Variables:

```yaml
env:
  PYTHON_VERSION: "3.11"
  # Add your custom variables here
```

🚨 Troubleshooting

Common Issues:

  1. "Example file not found"

    • ✅ Solution: Test data is created automatically
    • ✅ Check: setup_test_data.py runs in workflow
  2. "AWS credentials not configured"

    • ✅ Expected: AWS tests fail without credentials
    • ✅ Solution: Tests are designed to handle this
  3. "System dependency error"

    • ✅ Check: OS-specific installation commands
    • ✅ Solution: Dependencies are installed automatically
  4. "Test timeout"

    • ✅ Default: 10-minute timeout per test
    • ✅ Solution: Tests are designed to be fast
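To make the timeout behaviour concrete, the sketch below runs a command with a time limit using `subprocess.run`. How the real test suite enforces its 10-minute limit is not stated in this guide, so this is an assumption; `run_with_timeout` and `TIMEOUT_SECONDS` are hypothetical names.

```python
import subprocess
import sys

TIMEOUT_SECONDS = 600  # the 10-minute limit mentioned above


def run_with_timeout(
    command: list[str], timeout: float = TIMEOUT_SECONDS
) -> tuple[bool, str]:
    """Run `command` and return (ok, stdout).

    `ok` is False if the command timed out or exited with a non-zero code.
    """
    try:
        completed = subprocess.run(
            command, capture_output=True, text=True, timeout=timeout
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False, ""
    return completed.returncode == 0, completed.stdout


if __name__ == "__main__":
    ok, out = run_with_timeout([sys.executable, "-c", "print('done')"])
    print(ok, out.strip())
```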

Debug Mode:

Add -v (short for --verbose) to pytest commands for detailed output; --tb=short keeps tracebacks brief:

```bash
pytest test/test.py -v --tb=short
```

📈 Performance

Optimizations:

  • ✅ Parallel execution where possible
  • ✅ Dependency caching for faster builds
  • ✅ Minimal system packages installed
  • ✅ Efficient test data creation

Build Times:

  • Simple Test: ~5-10 minutes
  • Comprehensive CI: ~15-20 minutes
  • Multi-OS: ~20-30 minutes

🔒 Security

Security Features:

  • ✅ Dependency scanning with Safety
  • ✅ Code scanning with Bandit
  • ✅ No secrets exposed in logs
  • ✅ Temporary test data cleaned up

Secrets Management:

  • Use GitHub Secrets for sensitive data
  • Never hardcode credentials in workflows
  • Test data is dummy data only

🎉 Success!

Once set up, your GitHub Actions will:

  1. Automatically test every push and PR
  2. Generate reports and coverage data
  3. Catch issues before they reach production
  4. Ensure compatibility across platforms
  5. Provide confidence in your code quality

📚 Next Steps

  1. Choose a workflow that fits your needs
  2. Push to GitHub to trigger the first run
  3. Monitor the Actions tab for results
  4. Customize as needed for your project
  5. Enjoy automated testing! 🎉

Need help? Check the workflow logs in the GitHub Actions tab for detailed error messages and troubleshooting information.