Skip to content

test(recorded): add rails library coverage (5/5)#1978

Open
Pouyanpi wants to merge 4 commits into
fix/streaming-output-rails-user-contentfrom
stack/recorded-tests-05-library-rails
Open

test(recorded): add rails library coverage (5/5)#1978
Pouyanpi wants to merge 4 commits into
fix/streaming-output-rails-user-contentfrom
stack/recorded-tests-05-library-rails

Conversation

@Pouyanpi

@Pouyanpi Pouyanpi commented Jun 3, 2026

Copy link
Copy Markdown
Collaborator

Summary

Adds recorded coverage for library rails and their composition order.

Why

Built-in rails need replayable end-to-end coverage so future rail changes can be reviewed against stable behavior.

What Changed

  • Adds library rail configs, cassettes, and tests for regex, injection, self-check, content safety, jailbreak, topic control, and composition.
  • Adds library README guidance requiring recorded coverage for new rails.

Review Notes

This is the final stack PR and includes no additional runtime changes.

Stack Position

Part 5 of 5.

Stack Context

This stack decomposes recorded end-to-end replay coverage into reviewable slices. The PRs should be reviewed against their parent branch in the stack.

Please review each PR against its parent branch, not directly against the root base branch, except for part 1.

Order PR Branch Base
1 #1974 stack/recorded-tests-01-harness develop
2 #1975 stack/recorded-tests-02-deterministic-library-load stack/recorded-tests-01-harness
3 #1976 stack/recorded-tests-03-clients stack/recorded-tests-02-deterministic-library-load
4 #1977 stack/recorded-tests-04-public-api stack/recorded-tests-03-clients
5 #1978 stack/recorded-tests-05-library-rails stack/recorded-tests-04-public-api

Validation

poetry check --lock
poetry lock --no-update
poetry install --with dev
poetry run pytest tests/recorded --block-network -q
pre-commit hooks passed during commit creation

@codecov

codecov Bot commented Jun 3, 2026

Copy link
Copy Markdown

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from e9ed8b3 to e3e8722 Compare June 9, 2026 16:22
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 05feb29 to 4fbda3a Compare June 9, 2026 16:22
@Pouyanpi Pouyanpi marked this pull request as ready for review June 11, 2026 10:33
@greptile-apps

greptile-apps Bot commented Jun 11, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This is the final PR in a 5-part stack adding recorded end-to-end test coverage for NeMo Guardrails library rails. It also ships a runtime fix to llmrails.py that corrects the streaming path's user-input extraction.

  • Adds seven test modules (regex, injection, self-check, content safety, jailbreak, topic control, composition) with VCR cassettes and supporting configs that pin the public contract of every built-in rail.
  • Fixes _get_latest_user_message in llmrails.py to return the message's .content string instead of the full dict, so the streaming output-rail path renders user_input correctly in prompt templates (the cassette for test_content_safety_output_blocks_fake_main_stream now shows user: hello instead of user: {'role': 'user', 'content': 'hello'}).
  • Adds contribution guidance to nemoguardrails/library/README.md requiring recorded coverage for new rails.

Confidence Score: 5/5

Safe to merge — adds test infrastructure only and a well-evidenced single-function fix in the streaming path.

The runtime change is narrow: one inner function inside the streaming output-rail handler now returns a content string instead of a full message dict. The cassettes before and after both verify the correct prompt is sent to the provider. All 45 changed files are either new test fixtures, configs, or VCR cassettes; no existing logic outside the streaming path is affected.

No files require special attention. The cassettes, configs, and test helpers are internally consistent.

Important Files Changed

Filename Overview
nemoguardrails/rails/llm/llmrails.py Fixes _get_latest_user_message to return the message content string instead of the full dict; cassettes confirm the streaming prompt now renders correctly.
tests/recorded/rails/library/helpers.py Provides check_rails, generate_with_fake_main, and stream_with_fake_main helpers; streaming helper intentionally passes no FakeLLMModel and relies on options={"rails": ["output"]} to prevent main-model calls.
tests/recorded/rails/library/test_composition.py Composition-order tests covering regex→self-check→jailbreak→content safety→topic control ordering; cassettes confirm each rail short-circuits correctly before the next rail runs.
tests/recorded/rails/library/configs/full_stack/config.yml Full-stack config now has consistent jailbreak-before-content-safety ordering matching full_stack_no_topic (previous ordering discrepancy resolved).
tests/recorded/rails/library/test_content_safety.py Covers safe-pass, block, streaming block, generation block, and provider-error scenarios for NIM content safety; streaming cassette confirms the user_input fix.
tests/recorded/rails/library/test_injection.py Tests injection detection (block and omit modes) plus exception propagation when enable_rails_exceptions=True; all regex-based, no cassettes needed.
tests/recorded/rails/library/configs.py Centralises RailsConfigSource constants and the shared JAILBREAK_PROMPT test fixture.

Sequence Diagram

%%{init: {'theme': 'neutral'}}%%
sequenceDiagram
    participant T as Test
    participant R as LLMRails
    participant Regex as regex check
    participant SC as self check (OpenAI)
    participant JB as jailbreak detect (NIM)
    participant CS as content safety (NIM)
    participant TC as topic safety (NIM)

    T->>R: "check_async(messages, rail_types=[INPUT])"
    R->>Regex: detect_regex_pattern
    Regex-->>R: passed
    R->>SC: self_check_input prompt
    SC-->>R: safe → passed
    R->>JB: nemoguard-jailbreak-detect
    JB-->>R: "jailbreak=true → BLOCKED (stops here)"
    note over CS,TC: not reached when jailbreak blocks

    T->>R: "stream_async(messages, generator, rails=[output])"
    note over R: generator provides main content
    R->>CS: content_safety_check_output (NIM)
    CS-->>R: unsafe → stream error chunk emitted
Loading
%%{init: {'theme': 'base', 'themeVariables': {"darkMode": true, "background": "#0d1117", "primaryColor": "#21262d", "primaryTextColor": "#e6edf3", "primaryBorderColor": "#8b949e", "lineColor": "#8b949e", "textColor": "#e6edf3", "edgeLabelBackground": "#161b22", "actorBkg": "#21262d", "actorBorder": "#8b949e", "actorTextColor": "#e6edf3", "actorLineColor": "#8b949e", "signalColor": "#8b949e", "signalTextColor": "#e6edf3", "noteBkgColor": "#373320", "noteBorderColor": "#d4a72c", "noteTextColor": "#f0e6c0", "labelBoxBkgColor": "#21262d", "labelBoxBorderColor": "#8b949e", "labelTextColor": "#e6edf3", "loopTextColor": "#e6edf3", "activationBkgColor": "#30363d", "activationBorderColor": "#8b949e"}}}%%
sequenceDiagram
    participant T as Test
    participant R as LLMRails
    participant Regex as regex check
    participant SC as self check (OpenAI)
    participant JB as jailbreak detect (NIM)
    participant CS as content safety (NIM)
    participant TC as topic safety (NIM)

    T->>R: "check_async(messages, rail_types=[INPUT])"
    R->>Regex: detect_regex_pattern
    Regex-->>R: passed
    R->>SC: self_check_input prompt
    SC-->>R: safe → passed
    R->>JB: nemoguard-jailbreak-detect
    JB-->>R: "jailbreak=true → BLOCKED (stops here)"
    note over CS,TC: not reached when jailbreak blocks

    T->>R: "stream_async(messages, generator, rails=[output])"
    note over R: generator provides main content
    R->>CS: content_safety_check_output (NIM)
    CS-->>R: unsafe → stream error chunk emitted
Loading

Reviews (10): Last reviewed commit: "test(recorded): fix self-check-facts fal..." | Re-trigger Greptile

Comment thread tests/recorded/rails/library/configs/full_stack/config.yml
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from e3e8722 to 3b279b4 Compare June 11, 2026 10:52
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 4fbda3a to 100ca5d Compare June 11, 2026 10:52
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from 3b279b4 to 27732c7 Compare June 11, 2026 12:44
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch 2 times, most recently from 039b4e1 to 9e5bfa4 Compare June 11, 2026 13:18
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from 27732c7 to a759e09 Compare June 11, 2026 13:18
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 9e5bfa4 to e7713a2 Compare June 11, 2026 13:36
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from a759e09 to bbc9455 Compare June 11, 2026 13:36
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch 2 times, most recently from 0092cf2 to 86ca895 Compare June 11, 2026 13:56
@Pouyanpi Pouyanpi self-assigned this Jun 11, 2026
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from efee56d to d7a1262 Compare June 15, 2026 08:57
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch 2 times, most recently from 0c5207e to e25239a Compare June 15, 2026 12:00
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from d7a1262 to 19ec9e4 Compare June 15, 2026 12:00
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from e25239a to dac2f3c Compare June 15, 2026 14:18
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch 2 times, most recently from 0f531ea to e6370fc Compare June 16, 2026 08:36
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from dac2f3c to 16ec0ef Compare June 16, 2026 08:36
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 16ec0ef to 80962e0 Compare June 17, 2026 12:17
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch 2 times, most recently from 399c914 to 5d395f8 Compare June 22, 2026 14:09
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 80962e0 to 3f40196 Compare June 22, 2026 14:09
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch from 5d395f8 to 54bea6d Compare June 23, 2026 10:16
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 3f40196 to c211bc1 Compare June 23, 2026 10:16

@tgasser-nv tgasser-nv left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only question is on the changes to llmrails.py which seem to be out-of-scope for an integration-testing PR. The description says there were no runtime changes

Comment thread nemoguardrails/rails/llm/llmrails.py Outdated
def _get_latest_user_message(
messages: Optional[List[dict]] = None,
) -> dict:
) -> Any:

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this change to the core runtime in the PR? It seems like a bugfix and out-of-scope for adding rails library coverage. It contradicts the PR description of "no runtime changes"

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the test surfaced a pre-existing bug. I've extracted it to #2081

this PR now targets #2081

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment thread tests/recorded/rails/library/configs/full_stack/config.yml
Comment thread tests/recorded/rails/library/configs/nim_topic_control/config.yml
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-04-public-api branch 2 times, most recently from de9ab3c to 4222f94 Compare June 26, 2026 08:30
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from c211bc1 to cf71284 Compare June 26, 2026 08:45
@Pouyanpi Pouyanpi changed the base branch from stack/recorded-tests-04-public-api to develop June 26, 2026 08:46
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from cf71284 to 8b07eb3 Compare June 26, 2026 09:01
@Pouyanpi Pouyanpi changed the base branch from develop to fix/streaming-output-rails-user-content June 26, 2026 09:20
Pouyanpi added 4 commits June 26, 2026 11:46
Fold library/helpers.py onto the shared build_rails()/async_chunks() instead of its own
LLMRails(load_config(...)) + local _chunks (D11/F), and assert the content-safety output
block via assert_blocked_generation (refusal + rail stop) rather than the weak
assert_generation_response non-empty check (D6).
The test passed a list for relevant_chunks, which crashes
retrieve_relevant_chunks (list + str) after the self_check_facts LLM call;
the swallowed error produced the generic internal-error refusal, and the
snapshot pinned that string as the expected fact-check block. A refactor
that fixed the crash would have flipped this test red.

Pass relevant_chunks as its canonical string form so the rail completes and
the snapshot pins the real fact-check refusal. The underlying
retrieve_relevant_chunks list+str bug is pre-existing on develop and tracked
separately.
@Pouyanpi Pouyanpi force-pushed the fix/streaming-output-rails-user-content branch from 69745f6 to 81ef48f Compare June 26, 2026 09:56
@Pouyanpi Pouyanpi force-pushed the stack/recorded-tests-05-library-rails branch from 8b07eb3 to 6b8a76e Compare June 26, 2026 09:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants