fix(llmrails): load library files deterministically (2/5) by Pouyanpi · Pull Request #1975 · NVIDIA-NeMo/Guardrails

Pouyanpi · 2026-06-03T15:00:15Z

Summary

Sorts library traversal during LLMRails initialization so recorded rail tests do not depend on filesystem walk order.

Why

Recorded tests expose that library bot message insertion order can vary across platforms when files are loaded in filesystem order.

What Changed

Sorts directories during library os.walk traversal.
Sorts .co files before loading them.
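
The two changes above can be sketched as follows. This is a minimal illustration, not the code in llmrails.py: `iter_library_files` and its parameters are hypothetical names, and the real initializer parses each file rather than yielding its path.

```python
import os


def iter_library_files(library_path, suffix=".co"):
    """Yield matching file paths in a stable order (hypothetical helper)."""
    for root, dirs, files in os.walk(library_path):
        # Sorting dirs in place fixes the order in which os.walk descends
        # into subdirectories (this relies on the default topdown=True).
        dirs.sort()
        # sorted() fixes the order of files within each directory.
        for name in sorted(files):
            if name.endswith(suffix):
                yield os.path.join(root, name)
```

Files in a directory are yielded before the walk descends into its subdirectories, so the overall order is determined by names alone and not by what the filesystem happens to return.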

Review Notes

This is the only runtime change in the stack.

Stack Position

Part 2 of 5.

Previous: test(recorded): add replay harness (1/5) #1974
Next: test(recorded): add client cassette coverage (3/5) #1976

Stack Context

This stack decomposes recorded end-to-end replay coverage into reviewable slices. Please review each PR against its parent branch in the stack, not directly against the root base branch. Part 1 is the exception, since its base is `develop`.

Order	PR	Branch	Base
1	#1974	`stack/recorded-tests-01-harness`	`develop`
2	#1975	`stack/recorded-tests-02-deterministic-library-load`	`stack/recorded-tests-01-harness`
3	#1976	`stack/recorded-tests-03-clients`	`stack/recorded-tests-02-deterministic-library-load`
4	#1977	`stack/recorded-tests-04-public-api`	`stack/recorded-tests-03-clients`
5	#1978	`stack/recorded-tests-05-library-rails`	`stack/recorded-tests-04-public-api`

Validation

poetry check --lock
poetry lock --no-update
poetry install --with dev
poetry run pytest tests/recorded --block-network -q
pre-commit hooks passed during commit creation

codecov · 2026-06-03T15:08:36Z

Codecov Report

✅ All modified and coverable lines are covered by tests.

greptile-apps · 2026-06-11T10:38:07Z

Greptile Summary

This PR makes library and prompt file loading order deterministic by sorting directory and file lists during os.walk traversal in both llmrails.py and prompts.py. A new test in test_prompt_override.py verifies that _load_prompts() returns prompts in sorted file order even when the filesystem yields them out of order.

llmrails.py: adds dirs.sort() before recursion and wraps the files loop with sorted(files), ensuring .co library files are always loaded in the same order regardless of platform filesystem walk order.
prompts.py: adds dirs.sort() and files.sort() before the filename loop, mirroring the same fix for YAML prompt files.
tests/test_prompt_override.py: adds test_load_prompts_sorts_files_for_deterministic_overrides, which monkeypatches os.walk to return files out of order and asserts the resulting prompt list is sorted.
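
The test approach described above can be sketched roughly as below. `collect_yaml_names` is a hypothetical stand-in for the loader (it records file names instead of parsing YAML), and `unittest.mock.patch` is used here in place of the pytest monkeypatch fixture mentioned in the summary.

```python
import os
from unittest import mock


def collect_yaml_names(prompts_dir):
    """Hypothetical stand-in for a prompt loader: returns YAML names in load order."""
    loaded = []
    for _root, dirs, files in os.walk(prompts_dir):
        dirs.sort()
        files.sort()
        for filename in files:
            if filename.endswith((".yml", ".yaml")):
                loaded.append(filename)
    return loaded


def fake_walk(_path):
    # Simulate a filesystem that yields entries out of alphabetical order.
    yield ("prompts", [], ["b.yml", "readme.md", "a.yaml"])


# Replace os.walk only for the duration of the call under test.
with mock.patch("os.walk", fake_walk):
    names = collect_yaml_names("prompts")
```

With the stub in place, `names` comes back sorted even though the stub lists `b.yml` first. Without the `files.sort()` call the result would mirror the stub's order.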

Confidence Score: 5/5

Safe to merge — both changes are narrow, additive sorts with no logic alterations beyond load order.

The only runtime changes are two pairs of sort calls that make directory and file traversal order stable. No data is dropped, no control flow is altered, and the fix is strictly additive. The new test directly validates the file-sort behavior in _load_prompts.

No files require special attention.

Important Files Changed

Filename	Overview
nemoguardrails/llm/prompts.py	Adds `dirs.sort()` and `files.sort()` inside the `os.walk` loop so prompt YAML files are loaded in a stable, platform-independent order.
nemoguardrails/rails/llm/llmrails.py	Adds `dirs.sort()` and switches to `for file in sorted(files):` during library `.co` file traversal, making bot_message insertion order deterministic across platforms.
tests/test_prompt_override.py	Adds a new test that verifies `_load_prompts()` returns files in sorted order by passing an out-of-order list through a monkeypatched `os.walk` stub.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[LLMRails.__init__] --> B[os.walk library_path]
    B --> C[dirs.sort in-place for os.walk recursion order]
    C --> D[for file in sorted files]
    D --> E{file.endswith .co?}
    E -- Yes --> F[parse_colang_file]
    F --> G[extend config.flows]
    F --> H[insert bot_messages if not already set]
    E -- No --> D

    I[_load_prompts] --> J[os.walk prompts_dir]
    J --> K[dirs.sort + files.sort]
    K --> L[for filename in files]
    L --> M{.yml or .yaml?}
    M -- Yes --> N[yaml.safe_load]
    N --> O[extend prompts list]
    M -- No --> L

Reviews (10): Last reviewed commit: "test(prompts): isolate prompt loading or..."

tgasser-nv

Looks good. Unit tests for the library traversal, as well as for prompts, need to be added before merging.

Foundation for converging the recorded suite's cross-surface drift, consumed by the public_api and library layers above:

- rails/helpers.py: shared build_rails() construction helper + async_chunks() (replaces the LLMRails(load_config(...)) boilerplate inlined per test, D11/F).
- assertions.py: assert_blocked_generation() asserts refusal + rail stop semantics, not just non-empty text (D6).

Replay under --block-network must not depend on ambient proxy env: a SOCKS proxy makes httpx raise ImportError (missing socksio) on a cassette hit, turning a deterministic replay into a shell-dependent error. Add an autouse fixture that strips proxy vars during replay (record_mode == none) while leaving them intact for recording. Also fix the README 'Adding a test' snippet to include the imports it relies on (LLMRails, load_config, suite-local snapshot, OPENAI_BASELINE_CONFIG) so a new contributor can copy-paste it and land on the intended snapshot re-export.
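
The proxy-stripping idea in the commit message above could look roughly like this. It is a sketch under assumptions: the actual change is an autouse pytest fixture, whereas this is a plain context manager, and the helper name, the list of variables, and the `"none"` string for replay mode are illustrative choices.

```python
import os
from contextlib import contextmanager

# Assumed set of proxy variables; the real fixture may cover a different list.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


@contextmanager
def without_proxy_env(record_mode, environ=os.environ):
    """Temporarily drop proxy variables, but only when replaying."""
    if record_mode != "none":
        # Recording talks to the network, so the proxy settings stay intact.
        yield
        return
    names = [v for name in PROXY_VARS for v in (name, name.lower())]
    saved = {n: environ.pop(n) for n in names if n in environ}
    try:
        yield
    finally:
        # Restore whatever was removed once the replay is done.
        environ.update(saved)
```

Passing a plain dict as `environ` keeps the helper easy to exercise without touching the real process environment.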

github-actions · 2026-06-26T07:27:17Z

Staged Fern docs preview: https://nvidia-preview-pr-1975.docs.buildwithfern.com/nemo/guardrails

Adds the library-traversal sibling of test_load_prompts_sorts_files_for_deterministic_overrides, addressing the stack-2 review ask. Mocks os.walk to yield two library .co files defining the same bot message in non-sorted order and asserts the alphabetically-first file wins the collision, pinning the dirs.sort()/sorted(files) fix in LLMRails.__init__ so library load order stays filesystem-independent.
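
The collision rule that this test pins can be illustrated with a small sketch. `merge_bot_messages` and the shape of its input are hypothetical; the real initializer works on parsed Colang content, but the flowchart's "insert bot_messages if not already set" step behaves like a first-definition-wins merge.

```python
def merge_bot_messages(parsed_files):
    """Merge bot messages from several files; the first definition wins.

    parsed_files maps a file name to {message name: utterances}
    (a hypothetical shape chosen for illustration).
    """
    merged = {}
    # Iterating in sorted order makes the winner independent of the order
    # in which the files were discovered.
    for file_name in sorted(parsed_files):
        for message, utterances in parsed_files[file_name].items():
            # setdefault keeps an existing entry and ignores later duplicates.
            merged.setdefault(message, utterances)
    return merged
```

With two files that both define the same message, the entry from the alphabetically first file is kept regardless of which file the filesystem listed first.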

This was referenced Jun 3, 2026

test(recorded): add client cassette coverage (3/5) #1976

Merged

test(recorded): add rails public API coverage (4/5) #1977

Open

test(recorded): add rails library coverage (5/5) #1978

Open

test(recorded): add replay harness (1/5) #1974

Merged

Pouyanpi mentioned this pull request Jun 3, 2026

WIP: add recorded E2E tests #1938

Closed

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 6fec9aa to 9cad57c Compare June 9, 2026 16:22

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from 59c19ea to c0ff0c7 Compare June 9, 2026 16:22

Pouyanpi marked this pull request as ready for review June 11, 2026 10:33

greptile-apps Bot reviewed Jun 11, 2026

Comment thread tests/test_prompt_override.py

Comment thread tests/test_prompt_override.py

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from c0ff0c7 to b3c164e Compare June 11, 2026 10:52

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 2 times, most recently from 7310fa7 to 6b783f2 Compare June 11, 2026 12:44

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from b3c164e to f2aa414 Compare June 11, 2026 12:44

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 2 times, most recently from 3d885f4 to 7000fcc Compare June 15, 2026 08:57

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from d44488d to 68db0d2 Compare June 15, 2026 12:00

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 4 times, most recently from 05def3f to f912ae5 Compare June 17, 2026 12:17

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from ab362c6 to 4aa1590 Compare June 17, 2026 12:17

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from f912ae5 to e5a52c6 Compare June 22, 2026 14:09

github-actions Bot added the size: S label Jun 22, 2026

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from 6bf95dc to b6d7a5b Compare June 23, 2026 10:16

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from e5a52c6 to 14ddbc7 Compare June 23, 2026 10:16

tgasser-nv approved these changes Jun 25, 2026

Comment thread nemoguardrails/rails/llm/llmrails.py

Pouyanpi added 2 commits June 26, 2026 09:12

test(recorded): add replay harness

59b1588

test(recorded): harden cassette refresh workflow

5852f84

Pouyanpi added 8 commits June 26, 2026 09:12

test(recorded): cover cassette edge cases

2a1a609

test(recorded): document harness contracts

aff0b9f

test(recorded): handle refresh edge cases

9b3dbdf

test(recorded): improve cassette refresh workflow

371daf6

apply review suggestions by tgasser-nv

f7decd8

minor improvement to README

e522751

Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from e06a4fe to e522751 Compare June 26, 2026 07:18

Pouyanpi added 2 commits June 26, 2026 09:20

fix(llmrails): load library files deterministically

f301e54

test(prompts): isolate prompt loading order test

82429ff

Base automatically changed from stack/recorded-tests-01-harness to develop June 26, 2026 07:24

github-actions Bot added the needs: rebase label Jun 26, 2026

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 14ddbc7 to 82429ff Compare June 26, 2026 07:25

github-actions Bot added the size: XL label and removed the size: S and needs: rebase labels Jun 26, 2026

Pouyanpi merged commit 86feba9 into develop Jun 26, 2026
14 checks passed

Pouyanpi deleted the stack/recorded-tests-02-deterministic-library-load branch June 26, 2026 07:41

Pouyanpi merged 13 commits into develop from stack/recorded-tests-02-deterministic-library-load
