feat(library): add Polygraf PII detection and masking integration #1657

Closed
DataMonarch wants to merge 4 commits into NVIDIA-NeMo:develop from polygraf-ai:feat/polygraf-ai-pii-detection

@DataMonarch

Summary

  • Add Polygraf as a supported PII detection/masking provider, following the
    same patterns as the existing PrivateAI and GLiNER integrations.
  • Includes configuration schema, library actions, Colang v1/v2 flows,
    unit tests (7 passing), user guide documentation, and example configs.

Testing

  • All 7 new Polygraf tests pass (pytest tests/test_polygraf.py)

DataMonarch and others added 4 commits February 20, 2026 19:50
Add PolygrafDetectionOptions and PolygrafDetection pydantic models
to support configuring Polygraf as a PII detection provider, following
the same pattern as PrivateAI and GLiNER integrations.

The config block supports server_endpoint and per-stage (input, output,
retrieval) entity lists for selective PII detection.

Co-authored-by: Cursor <cursoragent@cursor.com>
Implement the Polygraf library module with:

- request.py: async HTTP client for the Polygraf PII text-detect API,
  using aiohttp with Bearer token auth via POLYGRAF_API_KEY env var.
- actions.py: polygraf_detect_pii and polygraf_mask_pii actions that
  read config, call the API, and filter/mask entities by type. Masking
  replaces detected spans with <ENTITY_TYPE> placeholders.
- flows.v1.co: Colang v1.0 subflows for detect/mask on input, output,
  and retrieval stages.
- flows.co: Colang v2.x flows for the same stages.

Follows the established patterns from the GLiNER and PrivateAI
integrations for action signatures, error handling, and flow structure.

Co-authored-by: Cursor <cursoragent@cursor.com>
Add 7 tests covering:
- No-op when no detection/masking flows are configured
- Input/output/retrieval PII detection (blocking)
- Input/output/retrieval PII masking (entity replacement)

Tests use mock actions registered directly with the app (matching the
GLiNER test pattern) to avoid depending on a live Polygraf server.

Co-authored-by: Cursor <cursoragent@cursor.com>
- Add docs/user-guides/community/polygraf.md with setup, configuration,
  and entity type reference for the Polygraf PII integration.
- Update guardrail-catalog.md with a Polygraf PII Detection section
  alongside the existing PrivateAI and GLiNER entries.
- Update overview.md to list Polygraf under PII detection providers.
- Add example configs for pii_detection and pii_masking use cases.

Co-authored-by: Cursor <cursoragent@cursor.com>
@github-actions

Contributor

Documentation preview

https://nvidia-nemo.github.io/Guardrails/review/pr-1657

@greptile-apps

greptile-apps Bot commented Feb 20, 2026

Contributor

Greptile Summary

Added Polygraf as a new PII detection and masking provider following the established patterns from PrivateAI and GLiNER integrations.

Major Changes:

  • Created nemoguardrails/library/polygraf/ module with actions for detection and masking
  • Added configuration schema (PolygrafDetection, PolygrafDetectionOptions) to support entity filtering per source (input/output/retrieval)
  • Implemented Colang v1 and v2 flows for both detection and masking operations
  • Added comprehensive test suite with 7 passing unit tests covering all flow types
  • Included user documentation and example configurations
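
A rough sketch of the configuration shape described above, for readers who have not opened the diff. The class and field names `PolygrafDetection`, `PolygrafDetectionOptions`, `server_endpoint`, `input`, `output`, and `retrieval` come from the PR description; everything else is an assumption. The PR uses pydantic models; this sketch swaps in standard-library dataclasses to stay dependency-free, and the `entities` field name, the `http://` scheme, the `entities_for` helper, and the "empty list means all entity types" convention are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PolygrafDetectionOptions:
    """Entity types to check for one stage (field name `entities` is assumed)."""

    entities: List[str] = field(default_factory=list)


@dataclass
class PolygrafDetection:
    """One server endpoint plus per-stage entity lists."""

    # The summary only says "defaults to localhost:8000"; the scheme is assumed.
    server_endpoint: str = "http://localhost:8000"
    input: PolygrafDetectionOptions = field(default_factory=PolygrafDetectionOptions)
    output: PolygrafDetectionOptions = field(default_factory=PolygrafDetectionOptions)
    retrieval: PolygrafDetectionOptions = field(default_factory=PolygrafDetectionOptions)


def entities_for(config: PolygrafDetection, source: str) -> List[str]:
    """Hypothetical helper: return the entity list for one stage."""
    if source not in ("input", "output", "retrieval"):
        raise ValueError(f"unknown source: {source!r}")
    return getattr(config, source).entities


# Entity names such as "EMAIL" are placeholders, not Polygraf's documented types.
cfg = PolygrafDetection(input=PolygrafDetectionOptions(entities=["EMAIL"]))
```

With this shape, each rail stage can be restricted independently: `entities_for(cfg, "input")` yields the configured list, while the unconfigured `output` and `retrieval` stages fall back to an empty list.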

Implementation Details:

  • Detection returns boolean indicating PII presence, with optional entity filtering
  • Masking replaces detected entities with <ENTITY_TYPE> placeholders
  • API key retrieved from POLYGRAF_API_KEY environment variable
  • Server endpoint configurable via server_endpoint field (defaults to localhost:8000)
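
The detection and masking behaviour described in these bullets can be illustrated with a minimal sketch. It is not the code from `actions.py`: the entity shape (`type`, `start`, `end` keys), the function names, and the entity labels are assumptions made for illustration, and overlapping spans are not handled.

```python
from typing import Any, Dict, List, Optional

# Hypothetical entity shape: {"type": str, "start": int, "end": int}
Entity = Dict[str, Any]


def filter_entities(entities: List[Entity], enabled: Optional[List[str]] = None) -> List[Entity]:
    """Keep only enabled entity types; no filter (None or empty) keeps everything."""
    if not enabled:
        return list(entities)
    allowed = set(enabled)
    return [e for e in entities if e["type"] in allowed]


def has_pii(entities: List[Entity], enabled: Optional[List[str]] = None) -> bool:
    """Detection: True if any entity survives the filter."""
    return bool(filter_entities(entities, enabled))


def mask_pii(text: str, entities: List[Entity], enabled: Optional[List[str]] = None) -> str:
    """Masking: replace each detected span with an <ENTITY_TYPE> placeholder."""
    masked = text
    # Work from the last span to the first so earlier offsets stay valid.
    for e in sorted(filter_entities(entities, enabled), key=lambda e: e["start"], reverse=True):
        masked = masked[: e["start"]] + f"<{e['type']}>" + masked[e["end"]:]
    return masked


text = "Email jane@example.com or call 555-0100."
email_start = text.index("jane@example.com")
phone_start = text.index("555-0100")
entities = [
    {"type": "EMAIL", "start": email_start, "end": email_start + len("jane@example.com")},
    {"type": "PHONE", "start": phone_start, "end": phone_start + len("555-0100")},
]
print(mask_pii(text, entities))             # Email <EMAIL> or call <PHONE>.
print(mask_pii(text, entities, ["EMAIL"]))  # Email <EMAIL> or call 555-0100.
```

Replacing spans from right to left avoids recomputing offsets after each substitution, since a placeholder is rarely the same length as the text it replaces.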

Architecture:
The integration follows existing patterns: configuration schema in config.py, HTTP request handling in request.py, action implementations in actions.py, and reusable flows in both Colang versions.

Confidence Score: 4/5

  • Safe to merge with minor verification needed on API header format
  • Code follows established patterns, has comprehensive tests, and good documentation. Only concern is verifying the API_Key header format matches Polygraf's actual API specification.
  • Verify nemoguardrails/library/polygraf/request.py API header format against Polygraf documentation

Important Files Changed

| Filename | Overview |
|---|---|
| nemoguardrails/library/polygraf/actions.py | Implements core PII detection and masking actions with proper error handling and entity filtering |
| nemoguardrails/library/polygraf/request.py | Handles HTTP requests to Polygraf API; check API_Key header format (uses 'Bearer' prefix) |
| nemoguardrails/rails/llm/config.py | Adds Polygraf configuration schema matching existing PII provider patterns |
| tests/test_polygraf.py | Comprehensive test suite with 7 tests covering detection, masking, and all flow types |

Sequence Diagram

sequenceDiagram
    participant User
    participant NemoGuardrails
    participant PolygrafAction
    participant PolygrafAPI
    
    User->>NemoGuardrails: Send message with PII
    NemoGuardrails->>PolygrafAction: polygraf_detect_pii(source, text, config)
    PolygrafAction->>PolygrafAction: Get config (server_endpoint, entities)
    PolygrafAction->>PolygrafAction: Get POLYGRAF_API_KEY from env
    PolygrafAction->>PolygrafAPI: POST /v1/pii/text-detect<br/>{text, headers with API_Key}
    PolygrafAPI-->>PolygrafAction: Return detected entities
    PolygrafAction->>PolygrafAction: Filter by enabled entities
    PolygrafAction-->>NemoGuardrails: Return has_pii boolean
    alt PII Detected
        NemoGuardrails->>User: "I can't answer that"
    else No PII
        NemoGuardrails->>User: Continue normal flow
    end
    
    Note over User,PolygrafAPI: Masking Flow
    User->>NemoGuardrails: Send message with PII
    NemoGuardrails->>PolygrafAction: polygraf_mask_pii(source, text, config)
    PolygrafAction->>PolygrafAPI: POST /v1/pii/text-detect
    PolygrafAPI-->>PolygrafAction: Return detected entities
    PolygrafAction->>PolygrafAction: Replace entities with <ENTITY_TYPE>
    PolygrafAction-->>NemoGuardrails: Return masked text
    NemoGuardrails->>User: Process with masked content

Last reviewed commit: 20f2588

@greptile-apps greptile-apps Bot left a comment

12 files reviewed, 1 comment

headers: Dict[str, str] = {"Content-Type": "application/json"}

if api_key:
    headers["API_Key"] = f"Bearer {api_key}"

Verify that the `API_Key` header name and `Bearer` prefix match what the Polygraf API expects; some APIs use `Authorization` or `X-API-Key` without a `Bearer` prefix.
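
One way to keep this easy to adjust once the expected format is confirmed is to make the header name and scheme parameters. The sketch below is hypothetical and does not claim to know which variant Polygraf accepts; `build_headers` and its parameters are illustrative names, not part of the PR.

```python
from typing import Dict, Optional


def build_headers(
    api_key: Optional[str],
    header_name: str = "Authorization",
    scheme: Optional[str] = "Bearer",
) -> Dict[str, str]:
    """Build request headers; omit the auth header when no key is set."""
    headers: Dict[str, str] = {"Content-Type": "application/json"}
    if api_key:
        # With a scheme: "Bearer <key>"; without one: the bare key.
        headers[header_name] = f"{scheme} {api_key}" if scheme else api_key
    return headers


# The three variants mentioned in this thread (the key value is a placeholder):
pr_style = build_headers("secret", header_name="API_Key")                 # as in the PR
bearer_style = build_headers("secret")                                    # Authorization: Bearer ...
bare_key_style = build_headers("secret", header_name="X-API-Key", scheme=None)
```

In the PR the key is read from the `POLYGRAF_API_KEY` environment variable, so a call site would pass `os.environ.get("POLYGRAF_API_KEY")` as `api_key`.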

@codecov

codecov Bot commented Feb 20, 2026

Codecov Report

❌ Patch coverage is 34.72222% with 47 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| nemoguardrails/library/polygraf/actions.py | 25.53% | 35 Missing ⚠️ |
| nemoguardrails/library/polygraf/request.py | 29.41% | 12 Missing ⚠️ |

@Pouyanpi force-pushed the develop branch 4 times, most recently from a6be550 to c69efe5 on May 6, 2026 16:01
@Pouyanpi

Collaborator

closing in favor of #1693

@Pouyanpi closed this on Jun 30, 2026