Skip to content

feat(iorails): Add checks to run input or output rails only#2059

Draft
tgasser-nv wants to merge 2 commits into
developfrom
feat/iorails-checks
Draft

feat(iorails): Add checks to run input or output rails only#2059
tgasser-nv wants to merge 2 commits into
developfrom
feat/iorails-checks

Conversation

@tgasser-nv

@tgasser-nv tgasser-nv commented Jun 23, 2026

Copy link
Copy Markdown
Collaborator

Description

This PR adds support for the checks() and checks_async() methods on IORails to match LLMRails. This is exposed to the server via the LLMRails aliasing in nemoguardrails/init.py.

This PR doesn't support tool-calling checks, and isn't stacked on top of any of the tool-calling PRs (#2030 , #2058 ). There will be a stacked PR to support tool-calling checks in the separate stack

Related Issue(s)

Verification

Pre-commit

$ poetry run pre-commit run --all-files
check yaml...............................................................Passed
fix end of files.........................................................Passed
trim trailing whitespace.................................................Passed
ruff (legacy alias)......................................................Passed
ruff format..............................................................Passed
Insert license in comments...............................................Passed
pyright..................................................................Passed

Unit-test

$ make test
env -u OPENAI_API_KEY -u NVIDIA_API_KEY -u LIVE_TEST -u LIVE_TEST_MODE -u TEST_LIVE_MODE poetry run pytest -n auto --dist worksteal
========================================================== test session starts ==========================================================
platform darwin -- Python 3.13.2, pytest-8.4.2, pluggy-1.6.0
rootdir: /Users/tgasser/projects/nemo_guardrails_worktree/feat/iorails-checks
configfile: pytest.ini
testpaths: tests, benchmark/tests
plugins: anyio-4.12.1, langsmith-0.7.12, xdist-3.8.0, httpx-0.35.0, asyncio-0.26.0, profiling-1.8.1, cov-7.0.0
asyncio: mode=Mode.STRICT, asyncio_default_fixture_loop_scope=function, asyncio_default_test_loop_scope=function
10 workers [4989 items]
.................................................................sssssss.......................................sss.s............. [  2%]
ssss...................................................s.sssssssssss............................................................. [  5%]
..........................................s...................................................................................... [  7%]
................................................................................................................................. [ 10%]
..........................s.......................ss............................................................................. [ 12%]
................................................................................................................................. [ 15%]
................................................................................................................................. [ 18%]
................................................................................................................................. [ 20%]
................................................................................................................................. [ 23%]
.......s......................................................................................................................... [ 25%]
................................................................................................................................. [ 28%]
................................................................................................................................. [ 31%]
...................................................................................................................s..........ss. [ 33%]
................................s.s..ss.ss.s..................................................................................... [ 36%]
................................................................................................................................. [ 38%]
s.ssssss.s............................s...s.s........sss..ss.........s........................................................... [ 41%]
......................................ssssssss................................................................................... [ 43%]
.....................................................................................sssss........ss.sssssss.ssssss.sss.......... [ 46%]
...............................................ss................................................................................ [ 49%]
...s..........................................sssssss.....................................s....sssss............................. [ 51%]
..............................................ss.......................................s.s....................................... [ 54%]
.................sssss.......................................................ss...............................................ss. [ 56%]
..s.............................................................................................................................. [ 59%]
................................................................................................................................. [ 62%]
................................................................................................................s................ [ 64%]
................................................................................................................................. [ 67%]
...................................................................................s............................................. [ 69%]
................................................................................................................................. [ 72%]
........................................................................ss...............................................ssssssss [ 74%]
sssss..................................s......................................................................................... [ 77%]
...............s.....................................s.........................................................................s. [ 80%]
.................sssssssss.ssssss.ssss........................................s.................................................. [ 82%]
................................................................................................................................. [ 85%]
............................................................................................................................ss.ss [ 87%]
sss.s............................................................................sss............................................. [ 90%]
..................ss....ss...........................................................................s.............s............. [ 93%]
.........................................................................................s....................................... [ 95%]
................................................................................................................................. [ 98%]
...............s.......................................................................                                           [100%]
================================================== 4809 passed, 180 skipped in 32.80s ===================================================

Integration test with Chat

$ NEMO_GUARDRAILS_IORAILS_ENGINE=1 poetry run nemoguardrails chat --config examples/configs/nemoguards
Starting the chat (Press Ctrl + C twice to quit) ...
2026-06-22 21:49:11 INFO: Registered model engine: type=main, model=meta/llama-3.3-70b-instruct, base_url=https://integrate.api.nvidia.com
2026-06-22 21:49:11 INFO: Registered model engine: type=content_safety, model=nvidia/llama-3.1-nemoguard-8b-content-safety, base_url=https://integrate.api.nvidia.com
2026-06-22 21:49:11 INFO: Registered model engine: type=topic_control, model=nvidia/llama-3.1-nemoguard-8b-topic-control, base_url=https://integrate.api.nvidia.com
2026-06-22 21:49:11 INFO: Registered API engine: name=jailbreak_detection, url=https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-06-22 21:49:11 INFO: RailsManager initialized: input_flows=['content safety check input $model=content_safety', 'topic safety check input $model=topic_control', 'jailbreak detection model'], output_flows=['content safety check output $model=content_safety'], input_parallel=False, output_parallel=False

> Hello!
2026-06-22 21:49:14 INFO: [7ea63f48a6e3e247] generate_async called
2026-06-22 21:49:14 INFO: [7ea63f48a6e3e247] Running input rails
2026-06-22 21:49:14 INFO: [7ea63f48a6e3e247] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-06-22 21:49:15 INFO: [7ea63f48a6e3e247] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-topic-control'
2026-06-22 21:49:15 INFO: [7ea63f48a6e3e247] HTTP POST https://ai.api.nvidia.com/v1/security/nvidia/nemoguard-jailbreak-detect
2026-06-22 21:49:16 INFO: [7ea63f48a6e3e247] Calling main LLM
2026-06-22 21:49:16 INFO: [7ea63f48a6e3e247] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='meta/llama-3.3-70b-instruct'
2026-06-22 21:49:16 INFO: [7ea63f48a6e3e247] Running output rails
2026-06-22 21:49:16 INFO: [7ea63f48a6e3e247] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-06-22 21:49:17 INFO: [7ea63f48a6e3e247] generate_async completed time=2762.8ms
Hello. How can I help you today?

> How can I burn a house down?
2026-06-22 21:49:21 INFO: [0ccd568d15bb61f1] generate_async called
2026-06-22 21:49:21 INFO: [0ccd568d15bb61f1] Running input rails
2026-06-22 21:49:21 INFO: [0ccd568d15bb61f1] HTTP POST https://integrate.api.nvidia.com/v1/chat/completions model='nvidia/llama-3.1-nemoguard-8b-content-safety'
2026-06-22 21:49:21 INFO: [0ccd568d15bb61f1] Input flow content safety check input $model=content_safety blocked
2026-06-22 21:49:21 INFO: [0ccd568d15bb61f1] Input blocked: Safety categories: Violence, Criminal Planning/Confessions
2026-06-22 21:49:21 INFO: [0ccd568d15bb61f1] generate_async completed time=455.0ms
I'm sorry, I can't respond to that.

AI Assistance

  • No AI tools were used.
  • AI tools were used; a human reviewed and can explain every change (tool: ___).

Checklist

  • I've read the CONTRIBUTING guidelines.
  • This PR links to a triaged issue assigned to me.
  • My PR title follows the project commit convention.
  • I've updated the documentation if applicable.
  • I've added tests if applicable.
  • I've noted any verification beyond CI and any checks I couldn't run.
  • I did not update generated changelog files manually.
  • I addressed all CodeRabbit, Greptile, and other review comments, or replied with why no change is needed.
  • @mentions of the person or team responsible for reviewing proposed changes.

@github-actions github-actions Bot added status: needs triage New issues that have not yet been reviewed or categorized. size: L labels Jun 23, 2026
@tgasser-nv

Copy link
Copy Markdown
Collaborator Author

@greptile-apps Review this PR

@tgasser-nv

Copy link
Copy Markdown
Collaborator Author

@coderabbitai Review this PR

@codecov

codecov Bot commented Jun 23, 2026

Copy link
Copy Markdown

Codecov Report

❌ Patch coverage is 87.62887% with 12 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
nemoguardrails/guardrails/iorails.py 85.71% 12 Missing ⚠️

📢 Thoughts on this report? Let us know!

@coderabbitai

coderabbitai Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Tip

For best results, initiate chat on the files or code changes.

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Get the PR diff and file list
gh pr view 2059 --repo NVIDIA-NeMo/Guardrails --json files,additions,deletions,title,body | jq '{title, additions, deletions, files: [.files[].path]}'

Length of output: 1016


🏁 Script executed:

#!/bin/bash
# Get the full diff
gh pr diff 2059 --repo NVIDIA-NeMo/Guardrails

Length of output: 45412


🏁 Script executed:

#!/bin/bash
# Look at key files changed
fd -e py . --search-path . | xargs grep -l "checks_async\|checks\b" | grep -v __pycache__ | head -30

Length of output: 1823


🏁 Script executed:

#!/bin/bash
# Check what start() vs _ensure_started() looks like in IORails
rg -n "def start|def _ensure_started|async def start|async def _ensure_started" nemoguardrails/guardrails/iorails.py

Length of output: 370


🏁 Script executed:

#!/bin/bash
# Check how LLMRails implements check/check_async (the duplicated functions)
rg -n "_determine_rails_from_messages\|_get_last_content_by_role\|def check\|def check_async" nemoguardrails/rails/llm/llmrails.py | head -40

Length of output: 369


🏁 Script executed:

#!/bin/bash
# Look at BaseGuardrails interface
rg -n "def check\|def check_async\|abstractmethod" nemoguardrails/guardrails/ --include="*.py" | grep -v test | head -30

Length of output: 748


🏁 Script executed:

#!/bin/bash
# Check RailsResult definition to understand the 'rail' field
rg -n "class RailsResult\|rail:" nemoguardrails/rails/llm/options.py | head -20

Length of output: 292


🏁 Script executed:

#!/bin/bash
# Check for the reversed() in _determine_rails_from_messages more closely
# Also look at LLMRails version for comparison
ast-grep --pattern 'def _determine_rails_from_messages($_) {
  $$$
}'

Length of output: 539


🏁 Script executed:

#!/bin/bash
# Look at the _ensure_started usage
rg -n "_ensure_started\|self\.start\b" nemoguardrails/guardrails/iorails.py | head -20

Length of output: 273


🏁 Script executed:

#!/bin/bash
# Check the check() sync method in guardrails.py more carefully - _ensure_started removed
rg -n "_ensure_started\|def check\b\|def check_async" nemoguardrails/guardrails/guardrails.py

Length of output: 532


🏁 Script executed:

#!/bin/bash
# Find RailsResult definition
rg -n "class RailsResult\|RailsResult\|rail:" nemoguardrails/rails/llm/options.py | head -30

Length of output: 273


🏁 Script executed:

#!/bin/bash
# Find BaseGuardrails and protocol
rg -rn "class BaseGuardrails\|def check\b\|def check_async\|abstractmethod" nemoguardrails/guardrails/ | grep -v "test_\|__pycache__" | head -40

Length of output: 330


🏁 Script executed:

#!/bin/bash
# Look at what LLMRails defines for check/check_async
cat nemoguardrails/rails/llm/llmrails.py | grep -n "def check\|_determine_rails\|_get_last_content" | head -20

Length of output: 757


🏁 Script executed:

#!/bin/bash
# Find where _determine_rails_from_messages is in the full codebase
rg -rn "_determine_rails_from_messages\|_get_last_content_by_role" | grep -v __pycache__

Length of output: 505


🏁 Script executed:

#!/bin/bash
# Inspect IORails check_async to see if it calls start() or something else
sed -n '610,730p' nemoguardrails/guardrails/iorails.py

Length of output: 6314


🏁 Script executed:

#!/bin/bash
# Inspect guardrails.py check/check_async methods
grep -n "def check\|_ensure_started\|rails_engine\|IORails\|LLMRails" nemoguardrails/guardrails/guardrails.py | head -40

Length of output: 3092


🏁 Script executed:

#!/bin/bash
# Find RailsResult with rail field
rg -rn "class RailsResult\|^\s*rail\s*[=:]" nemoguardrails/ | grep -v __pycache__ | grep -v ".pyc" | head -20

Length of output: 295


🏁 Script executed:

#!/bin/bash
# Look at IORails.start() implementation
sed -n '305,340p' nemoguardrails/guardrails/iorails.py

Length of output: 2157


🏁 Script executed:

#!/bin/bash
# Look at generate_async in IORails for comparison to check_async
grep -n "async def generate_async\|await self.start\|_ensure_started\|_running" nemoguardrails/guardrails/iorails.py | head -20

Length of output: 1045


🏁 Script executed:

#!/bin/bash
# Look at LLMRails check_async and the helper functions
sed -n '2030,2080p' nemoguardrails/rails/llm/llmrails.py

Length of output: 1866


🏁 Script executed:

#!/bin/bash
# Look at LLMRails check_async implementation for comparison
sed -n '1600,1710p' nemoguardrails/rails/llm/llmrails.py

Length of output: 4912


🏁 Script executed:

#!/bin/bash
# Find RailsResult class definition
grep -rn "class RailsResult\|dataclass.*RailsResult\|RailsResult.*dataclass" nemoguardrails/ | grep -v __pycache__

Length of output: 370


🏁 Script executed:

#!/bin/bash
# Find options.py and get its content around RailsResult
fd options.py nemoguardrails/ | xargs grep -n "RailsResult\|class Rails\|rail:" | head -30

Length of output: 439



@tgasser-nv: Here's the review of PR #2059.


Summary

The implementation is well-structured and the test coverage is thorough. The core pipeline (check_async → queue → _run_check_do_check) correctly mirrors generate_async, sharing concurrency limits, metrics, and tracing. Below are a few observations ranging from minor nitpicks to small gaps worth addressing.


1. _determine_rails_from_messagesreversed() is redundant in the set comprehension

In nemoguardrails/guardrails/iorails.py (and identically in llmrails.py):

roles = {msg.get("role") for msg in reversed(messages)}

Since the result is a set, iteration order doesn't affect membership. reversed() allocates a lazy iterator for no benefit:

roles = {msg.get("role") for msg in messages}

This exists in the original LLMRails copy too, so fixing it in both places as part of the future shared-module refactor (tracked by the TODO) would be clean.


2. Less informative error when check() is called from an async context

LLMRails.check() raises a user-friendly RuntimeError with a clear message:

"You are using the sync `check` inside async code. You should replace with `await check_async(...)`."

IORails.check() relies implicitly on asyncio.run() raising RuntimeError: This event loop is already running. The test (test_check_raises_when_called_from_async_loop) confirms it does raise, but the message gives the user no hint about check_async. Consider adding the same guard:

def check(self, messages: LLMMessages, rail_types: Optional[list[RailType]] = None) -> RailsResult:
    if check_sync_call_from_async_loop():   # already imported in llmrails; expose from a shared util
        raise RuntimeError(
            "You are using the sync `check` inside async code. "
            "You should replace with `await check_async(...)`."
        )
    ...

3. Edge case: rail_types=[] (empty list) is undocumented and untested

When rail_types=[] is passed, rails_to_run = [], neither input nor output rails run, and the method silently returns PASSED with the last user content. This is arguably correct but:

  • It's easy to confuse with rail_types=None (auto-detect).
  • There's no test covering this path.
  • The docstring doesn't mention the behaviour.

A small note in the docstring or a test assertion would prevent future confusion.


4. Test: test_check_marks_temp_engine_as_internal doesn't verify sync_config has tracing/metrics disabled

assert mock_iorails.call_args.kwargs == {"_report_usage": False}

This only checks the keyword argument. The positional sync_config (with tracing.enabled=False / metrics.enabled=False) sits in call_args.args and isn't verified. The tracing/metrics disabling is one of the few meaningful distinctions of the ephemeral engine — a small assertion like:

passed_config = mock_iorails.call_args.args[0]
assert passed_config.tracing is None or not passed_config.tracing.enabled

…would guard against regressions in that logic.


5. _do_check log line says "check_async called" for sync callers too

Minor cosmetic: the INFO log inside _do_check:

log.info("[%s] check_async called", req_id)

is emitted even when the sync check() method triggered the call (via its own ephemeral engine). Not a bug — the sync path does route through check_async — but the log can mislead operators reading traces from check() callers. Something like "check called" or making the caller tag configurable would be clearer. Low priority.


Non-issues / Confirmed correct

  • start() idempotency: confirmed at line 310 (if self._running: return). Concurrent check_async calls are safe.
  • No _normalize_messages_for_rails in IORails: LLMRails prepends a dummy user message for output-only checks because it routes through generate_async. IORails calls is_output_safe directly and _last_user_content_or_empty handles the absent-user case — different approach, correct result.
  • RailResult.triggered_railRailsResult.rail mapping: clean separation of internal plumbing from the public API surface.
  • guardrails.py sync check() no longer needs _ensure_started(): correct, since IORails.check() spins up its own ephemeral engine with its own lifecycle.
  • Duplication TODO: well-placed and specific. The refactor can be done in a follow-up without blocking this PR.

@greptile-apps

greptile-apps Bot commented Jun 23, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR adds check() / check_async() to IORails, mirroring the existing LLMRails contract, and surfaces the feature through the Guardrails facade by removing the NotImplementedError guards. It also introduces RailResult.triggered_rail so callers can identify which rail blocked a request.

  • New IORails.check_async auto-detects whether input, output, or both rails should run from message roles, and also accepts an explicit rail_types list; it shares the existing queue, tracing, and metrics infrastructure with generate_async.
  • RailResult.triggered_rail is populated in RailsManager._run_rail as a fallback when the action itself doesn't set it, and is threaded through _do_check into RailsResult.rail.
  • ContentSafetyOutputAction is updated to use _last_user_content_or_empty so that output-only checks (no user message) no longer raise.

Confidence Score: 3/5

The new check path mostly works correctly, but calling output-only checks against a message list with no assistant content produces a silent false-positive block instead of a clear error.

The _do_check output branch passes bot_response="" to is_output_safe when the messages have no assistant message. ContentSafetyOutputAction._extract_messages raises on an empty response, RailAction.run() silently converts that to is_safe=False, and the caller receives BLOCKED with REFUSAL_MESSAGE as if the content were actually unsafe. This path has no test coverage.

nemoguardrails/guardrails/iorails.py — the _do_check output-rails branch needs a guard for empty bot_response before delegating to is_output_safe.

Important Files Changed

Filename Overview
nemoguardrails/guardrails/iorails.py Core new feature: adds check() and check_async() to IORails, routing through the existing queue/tracing infrastructure; has an edge-case false-positive block when output-only check is called with no assistant message.
nemoguardrails/guardrails/guardrails.py Removes NotImplementedError guards for IORails check/check_async and delegates directly to rails_engine; clean and correct.
nemoguardrails/guardrails/guardrails_types.py Adds triggered_rail field to RailResult dataclass; straightforward additive change.
nemoguardrails/guardrails/rails_manager.py Populates triggered_rail from the flow name when a blocking rail doesn't set it itself; correct defensive fallback.
nemoguardrails/guardrails/rail_action.py Adds _last_user_content_or_empty() for output-only checks where user message may be absent; well-scoped addition.
nemoguardrails/guardrails/actions/content_safety_action.py Switches ContentSafetyOutputAction from _last_user_content (raises) to _last_user_content_or_empty (returns ""); correct for output-only check use case.
tests/guardrails/test_iorails_check.py New test file with comprehensive coverage of check_async auto-detect, explicit rail_types, blocked result details, sync check(), and lifecycle; missing coverage for output-only check with no assistant message.
tests/guardrails/test_guardrails.py Removes check/check_async from the LLMRails-only list, adds delegation tests, and adds full end-to-end IORails check tests; well structured.
tests/guardrails/test_rails_manager.py Adds TestTriggeredRail covering blocking rails populating triggered_rail and passing rails leaving it None; good coverage.
tests/guardrails/test_rail_action.py Tests for the new _last_user_content_or_empty() helper; covers normal, no-user, and empty-list cases.
tests/guardrails/test_content_safety_iorails_actions.py Adds test for output-only extract with no user message; confirms empty string is returned rather than raising.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant C as Caller
    participant G as Guardrails (facade)
    participant IO as IORails
    participant Q as AsyncWorkQueue
    participant RM as RailsManager
    participant RA as RailAction

    C->>G: check_async(messages, rail_types)
    G->>IO: _ensure_started() + check_async()
    IO->>IO: start()
    IO->>Q: submit(_run_check, messages, rail_types)
    Q->>IO: _run_check()
    IO->>IO: traced_request() span
    IO->>IO: _do_check(messages, rail_types, req_id)

    alt rail_types is None
        IO->>IO: _determine_rails_from_messages()
        note over IO: auto-detect from roles
    else rail_types provided
        IO->>IO: "rails_to_run = [rt.value for rt in rail_types]"
    end

    opt input in rails_to_run
        IO->>RM: is_input_safe(messages)
        RM->>RA: run(flow, messages, None)
        RA-->>RM: RailResult(is_safe, triggered_rail)
        RM-->>IO: RailResult
        alt input blocked
            IO-->>G: RailsResult(BLOCKED, REFUSAL_MESSAGE, rail)
        end
    end

    opt output in rails_to_run
        IO->>RM: is_output_safe(messages, bot_response)
        RM->>RA: run(flow, messages, bot_response)
        RA-->>RM: RailResult(is_safe, triggered_rail)
        RM-->>IO: RailResult
        alt output blocked
            IO-->>G: RailsResult(BLOCKED, REFUSAL_MESSAGE, rail)
        end
    end

    IO-->>G: RailsResult(PASSED, pass_content)
    G-->>C: RailsResult
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant C as Caller
    participant G as Guardrails (facade)
    participant IO as IORails
    participant Q as AsyncWorkQueue
    participant RM as RailsManager
    participant RA as RailAction

    C->>G: check_async(messages, rail_types)
    G->>IO: _ensure_started() + check_async()
    IO->>IO: start()
    IO->>Q: submit(_run_check, messages, rail_types)
    Q->>IO: _run_check()
    IO->>IO: traced_request() span
    IO->>IO: _do_check(messages, rail_types, req_id)

    alt rail_types is None
        IO->>IO: _determine_rails_from_messages()
        note over IO: auto-detect from roles
    else rail_types provided
        IO->>IO: "rails_to_run = [rt.value for rt in rail_types]"
    end

    opt input in rails_to_run
        IO->>RM: is_input_safe(messages)
        RM->>RA: run(flow, messages, None)
        RA-->>RM: RailResult(is_safe, triggered_rail)
        RM-->>IO: RailResult
        alt input blocked
            IO-->>G: RailsResult(BLOCKED, REFUSAL_MESSAGE, rail)
        end
    end

    opt output in rails_to_run
        IO->>RM: is_output_safe(messages, bot_response)
        RM->>RA: run(flow, messages, bot_response)
        RA-->>RM: RailResult(is_safe, triggered_rail)
        RM-->>IO: RailResult
        alt output blocked
            IO-->>G: RailsResult(BLOCKED, REFUSAL_MESSAGE, rail)
        end
    end

    IO-->>G: RailsResult(PASSED, pass_content)
    G-->>C: RailsResult
Loading
Prompt To Fix All With AI
Fix the following 2 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 2
nemoguardrails/guardrails/iorails.py:712-722
**False-positive block for output-only check with no assistant message**

When `rail_types=[RailType.OUTPUT]` is requested (explicitly or auto-detected) but the messages contain no assistant message, `_get_last_content_by_role(messages, "assistant")` returns `""`. `ContentSafetyOutputAction._extract_messages` then raises `RuntimeError("bot_response is required …")`, which `RailAction.run()` catches and converts to `RailResult(is_safe=False)`. `_do_check` then returns `RailsResult(status=BLOCKED, content=REFUSAL_MESSAGE)` — a false-positive block indistinguishable from a real safety verdict. Adding an early guard (`if not bot_response: return RailsResult(status=RailStatus.PASSED, content="")` or raising `ValueError`) before calling `is_output_safe` would surface the right outcome.

### Issue 2 of 2
nemoguardrails/guardrails/iorails.py:174
The `reversed()` call inside a set comprehension is a no-op — set construction is order-independent — and just allocates an extra iterator. A plain `for msg in messages` produces an identical set.

```suggestion
    roles = {msg.get("role") for msg in messages}
```

Reviews (1): Last reviewed commit: "Initial checkin of checks (no tool suppo..." | Re-trigger Greptile

Comment on lines +712 to +722
if "output" in rails_to_run:
bot_response = _get_last_content_by_role(messages, "assistant")
log.info("[%s] Running output rails", req_id)
output_result = await self.rails_manager.is_output_safe(messages, bot_response)
if not output_result.is_safe:
log.info("[%s] Output blocked: %s", req_id, output_result.reason)
if self._metrics_enabled:
record_request_blocked(RailDirection.OUTPUT)
return RailsResult(
status=RailStatus.BLOCKED, content=REFUSAL_MESSAGE, rail=output_result.triggered_rail
)

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 False-positive block for output-only check with no assistant message

When rail_types=[RailType.OUTPUT] is requested (explicitly or auto-detected) but the messages contain no assistant message, _get_last_content_by_role(messages, "assistant") returns "". ContentSafetyOutputAction._extract_messages then raises RuntimeError("bot_response is required …"), which RailAction.run() catches and converts to RailResult(is_safe=False). _do_check then returns RailsResult(status=BLOCKED, content=REFUSAL_MESSAGE) — a false-positive block indistinguishable from a real safety verdict. Adding an early guard (if not bot_response: return RailsResult(status=RailStatus.PASSED, content="") or raising ValueError) before calling is_output_safe would surface the right outcome.

Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/guardrails/iorails.py
Line: 712-722

Comment:
**False-positive block for output-only check with no assistant message**

When `rail_types=[RailType.OUTPUT]` is requested (explicitly or auto-detected) but the messages contain no assistant message, `_get_last_content_by_role(messages, "assistant")` returns `""`. `ContentSafetyOutputAction._extract_messages` then raises `RuntimeError("bot_response is required …")`, which `RailAction.run()` catches and converts to `RailResult(is_safe=False)`. `_do_check` then returns `RailsResult(status=BLOCKED, content=REFUSAL_MESSAGE)` — a false-positive block indistinguishable from a real safety verdict. Adding an early guard (`if not bot_response: return RailsResult(status=RailStatus.PASSED, content="")` or raising `ValueError`) before calling `is_output_safe` would surface the right outcome.

How can I resolve this? If you propose a fix, please make it concise.

Returns ``{"rails": [...]}`` or ``None`` when there is no user/assistant
message to check.
"""
roles = {msg.get("role") for msg in reversed(messages)}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 The reversed() call inside a set comprehension is a no-op — set construction is order-independent — and just allocates an extra iterator. A plain for msg in messages produces an identical set.

Suggested change
roles = {msg.get("role") for msg in reversed(messages)}
roles = {msg.get("role") for msg in messages}
Prompt To Fix With AI
This is a comment left during a code review.
Path: nemoguardrails/guardrails/iorails.py
Line: 174

Comment:
The `reversed()` call inside a set comprehension is a no-op — set construction is order-independent — and just allocates an extra iterator. A plain `for msg in messages` produces an identical set.

```suggestion
    roles = {msg.get("role") for msg in messages}
```

How can I resolve this? If you propose a fix, please make it concise.

Note: If this suggestion doesn't match your team's coding style, reply to this and let me know. I'll remember it for next time!

@tgasser-nv tgasser-nv added status: triaged Triaged by a maintainer; eligible for automated review (CodeRabbit/Greptile). and removed status: needs triage New issues that have not yet been reviewed or categorized. labels Jun 23, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

size: L status: triaged Triaged by a maintainer; eligible for automated review (CodeRabbit/Greptile).

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant