python-pdf-form-extractor

Trigger.dev + Python PDF form extractor demo

This demo showcases how to use Trigger.dev with Python to extract structured form data from a PDF file available at a URL.

Features

Getting Started

  1. After cloning the repo, run npm install to install the Node.js dependencies.
  2. Create a Python virtual environment: python -m venv venv
  3. Activate the virtual environment. On Mac/Linux: source venv/bin/activate. On Windows: venv\Scripts\activate
  4. Install the Python dependencies: pip install -r requirements.txt
  5. Copy the project ref from your Trigger.dev dashboard and add it to the trigger.config.ts file.
  6. Run the Trigger.dev CLI dev command, typically npx trigger.dev@latest dev (it may ask you to authorize the CLI if you haven't already).
  7. Test the task in the dashboard by providing a valid PDF URL.
  8. Deploy the task to production using the Trigger.dev CLI deploy command, typically npx trigger.dev@latest deploy.

Relevant code

  • pythonPdfTask.ts triggers the Python script and returns the structured form data as JSON
  • trigger.config.ts uses the Trigger.dev Python extension to install the dependencies and run the script
  • extract-pdf-form.py is the main Python script that takes a URL and returns the form data from the PDF in JSON format
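
To make the role of the Python script more concrete, here is a minimal sketch of a script that takes a URL and prints a PDF's form data as JSON. It is not the repo's actual extract-pdf-form.py: the choice of the pypdf library, the function names (download_pdf, read_form_fields, to_json) and the output keys (form_fields, field_count) are all assumptions made for illustration, and the real script may use a different PDF library and output shape.

```python
import io
import json
import sys
import urllib.request


def download_pdf(url: str, timeout: float = 30.0) -> bytes:
    """Fetch the PDF at `url` and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def read_form_fields(pdf_bytes: bytes) -> dict:
    """Return {field name: value} for the PDF's interactive form fields.

    Assumption: the third-party pypdf package is installed. It is imported
    lazily so the rest of this sketch can be used without it.
    """
    from pypdf import PdfReader

    reader = PdfReader(io.BytesIO(pdf_bytes))
    # get_fields() returns None when the PDF has no interactive form.
    fields = reader.get_fields() or {}
    # "/V" is the PDF dictionary key that holds a field's current value.
    return {name: field.get("/V") for name, field in fields.items()}


def to_json(fields: dict) -> str:
    """Serialize the extracted fields (hypothetical output shape)."""
    payload = {"form_fields": fields, "field_count": len(fields)}
    # default=str covers any value objects that are not plain JSON types.
    return json.dumps(payload, default=str, sort_keys=True)


if __name__ == "__main__" and len(sys.argv) == 2:
    # Print the JSON to stdout so the calling process can read and parse it.
    print(to_json(read_form_fields(download_pdf(sys.argv[1]))))
```

Run as python extract-pdf-form.py <pdf-url>, a script along these lines writes a single JSON document to stdout, which a calling task could then parse and return.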