fix(llmrails): load library files deterministically (2/5) #1975

Merged
Pouyanpi merged 13 commits into develop from stack/recorded-tests-02-deterministic-library-load on Jun 26, 2026

@Pouyanpi Pouyanpi commented Jun 3, 2026

Collaborator

Summary

Sorts library traversal during LLMRails initialization so recorded rail tests do not depend on filesystem walk order.

Why

Recorded tests expose that library bot message insertion order can vary across platforms when files are loaded in filesystem order.

What Changed

  • Sorts directories during library os.walk traversal.
  • Sorts .co files before loading them.
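
The effect of these two sorts can be sketched as follows. This is a minimal illustration, not the code in llmrails.py: the helper name `iter_colang_files`, the `demo` function, and the directory layout are hypothetical. It relies on the documented `os.walk` behavior that, with the default `topdown=True`, modifying `dirs` in place controls the order in which subdirectories are visited.

```python
import os
import tempfile


def iter_colang_files(library_path):
    """Yield .co file paths under library_path in a stable, sorted order.

    Hypothetical helper for illustration; not the actual LLMRails code.
    """
    for root, dirs, files in os.walk(library_path):
        # Sorting in place fixes the order in which os.walk descends into
        # subdirectories (effective only with topdown=True, the default).
        dirs.sort()
        # sorted() fixes the order of files within one directory.
        for file in sorted(files):
            if file.endswith(".co"):
                yield os.path.join(root, file)


def demo():
    """Build a small throwaway tree and return the relative load order."""
    with tempfile.TemporaryDirectory() as tmp:
        for rel in ("b/z.co", "a/y.co", "a/x.co", "a/notes.txt", "top.co"):
            path = os.path.join(tmp, *rel.split("/"))
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "w") as fh:
                fh.write("")
        return [
            os.path.relpath(p, tmp).replace(os.sep, "/")
            for p in iter_colang_files(tmp)
        ]
```

Calling `demo()` returns the same list on every platform, because neither the directory order nor the file order is left to the filesystem.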

Review Notes

This is the only runtime change in the stack.

Stack Position

Part 2 of 5.

Stack Context

This stack decomposes recorded end-to-end replay coverage into reviewable slices. Please review each PR against its parent branch in the stack, not directly against the root base branch. Part 1 is the exception, since its base is develop.

| Order | PR | Branch | Base |
| --- | --- | --- | --- |
| 1 | #1974 | stack/recorded-tests-01-harness | develop |
| 2 | #1975 | stack/recorded-tests-02-deterministic-library-load | stack/recorded-tests-01-harness |
| 3 | #1976 | stack/recorded-tests-03-clients | stack/recorded-tests-02-deterministic-library-load |
| 4 | #1977 | stack/recorded-tests-04-public-api | stack/recorded-tests-03-clients |
| 5 | #1978 | stack/recorded-tests-05-library-rails | stack/recorded-tests-04-public-api |

Validation

poetry check --lock
poetry lock --no-update
poetry install --with dev
poetry run pytest tests/recorded --block-network -q
pre-commit hooks passed during commit creation

codecov Bot commented Jun 3, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 6fec9aa to 9cad57c on June 9, 2026 16:22
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from 59c19ea to c0ff0c7 on June 9, 2026 16:22
Pouyanpi marked this pull request as ready for review on June 11, 2026 10:33

greptile-apps Bot commented Jun 11, 2026

Contributor

Greptile Summary

This PR makes library and prompt file loading order deterministic by sorting directory and file lists during os.walk traversal in both llmrails.py and prompts.py. A new test in test_prompt_override.py verifies that _load_prompts() returns prompts in sorted file order even when the filesystem yields them out of order.

  • llmrails.py: adds dirs.sort() before recursion and wraps the files loop with sorted(files), ensuring .co library files are always loaded in the same order regardless of platform filesystem walk order.
  • prompts.py: adds dirs.sort() and files.sort() before the filename loop, mirroring the same fix for YAML prompt files.
  • tests/test_prompt_override.py: adds test_load_prompts_sorts_files_for_deterministic_overrides, which monkeypatches os.walk to return files out of order and asserts the resulting prompt list is sorted.
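
A test of this shape can be sketched as below. This is an illustrative stand-in under stated assumptions, not the test in the PR: `load_prompt_names`, `fake_walk`, and `check_sorted_load_order` are hypothetical names, the loader only collects file names instead of parsing YAML, and `unittest.mock.patch` is used in place of pytest's `monkeypatch` to keep the example standard-library only.

```python
import os
from unittest import mock


def load_prompt_names(prompts_dir):
    """Hypothetical stand-in for a prompt loader.

    Returns the YAML file names in the order they would be loaded.
    """
    loaded = []
    for root, dirs, files in os.walk(prompts_dir):
        dirs.sort()
        files.sort()
        for filename in files:
            if filename.endswith((".yml", ".yaml")):
                loaded.append(filename)
    return loaded


def fake_walk(_path):
    # Deliberately unsorted, mimicking a filesystem that returns
    # directory entries in arbitrary order.
    yield ("/prompts", ["b", "a"], ["z.yml", "a.yml", "README.md", "m.yaml"])


def check_sorted_load_order():
    # Replace os.walk for the duration of the call so the loader sees
    # the out-of-order listing above instead of the real filesystem.
    with mock.patch("os.walk", fake_walk):
        return load_prompt_names("/prompts")
```

With the two sort calls in place, the loader returns the YAML names in sorted order even though the stub yields them shuffled. Removing `files.sort()` from the loader would make the same check fail, which is the regression such a test is meant to catch.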

Confidence Score: 5/5

Safe to merge — both changes are narrow, additive sorts with no logic alterations beyond load order.

The only runtime changes are two pairs of sort calls that make directory and file traversal order stable. No data is dropped, no control flow is altered, and the fix is strictly additive. The new test directly validates the file-sort behavior in _load_prompts.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| nemoguardrails/llm/prompts.py | Adds dirs.sort() and files.sort() inside the os.walk loop so prompt YAML files are loaded in a stable, platform-independent order. |
| nemoguardrails/rails/llm/llmrails.py | Adds dirs.sort() and switches to for file in sorted(files): during library .co file traversal, making bot_message insertion order deterministic across platforms. |
| tests/test_prompt_override.py | Adds a new test that verifies _load_prompts() returns files in sorted order by passing an out-of-order list through a monkeypatched os.walk stub. |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[LLMRails.__init__] --> B[os.walk library_path]
    B --> C[dirs.sort in-place for os.walk recursion order]
    C --> D[for file in sorted files]
    D --> E{file.endswith .co?}
    E -- Yes --> F[parse_colang_file]
    F --> G[extend config.flows]
    F --> H[insert bot_messages if not already set]
    E -- No --> D

    I[_load_prompts] --> J[os.walk prompts_dir]
    J --> K[dirs.sort + files.sort]
    K --> L[for filename in files]
    L --> M{.yml or .yaml?}
    M -- Yes --> N[yaml.safe_load]
    N --> O[extend prompts list]
    M -- No --> L

Reviews (10): Last reviewed commit: "test(prompts): isolate prompt loading or..."

Comment thread tests/test_prompt_override.py
Comment thread tests/test_prompt_override.py
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from c0ff0c7 to b3c164e on June 11, 2026 10:52
Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 2 times, most recently from 7310fa7 to 6b783f2 on June 11, 2026 12:44
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from b3c164e to f2aa414 on June 11, 2026 12:44
Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 2 times, most recently from 3d885f4 to 7000fcc on June 15, 2026 08:57
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from d44488d to 68db0d2 on June 15, 2026 12:00
Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch 4 times, most recently from 05def3f to f912ae5 on June 17, 2026 12:17
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from ab362c6 to 4aa1590 on June 17, 2026 12:17
Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from f912ae5 to e5a52c6 on June 22, 2026 14:09
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from 6bf95dc to b6d7a5b on June 23, 2026 10:16
Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from e5a52c6 to 14ddbc7 on June 23, 2026 10:16

@tgasser-nv tgasser-nv left a comment

Collaborator

Looks good. We need to add unit tests for the library traversal, as well as for prompts, before merging.

Comment thread nemoguardrails/rails/llm/llmrails.py
Pouyanpi added 8 commits June 26, 2026 09:12
Foundation for converging the recorded suite's cross-surface drift, consumed by the
public_api and library layers above:

- rails/helpers.py: shared build_rails() construction helper + async_chunks()
  (replaces the LLMRails(load_config(...)) boilerplate inlined per test, D11/F).
- assertions.py: assert_blocked_generation() asserts refusal + rail stop semantics,
  not just non-empty text (D6).

Replay under --block-network must not depend on ambient proxy env: a SOCKS
proxy makes httpx raise ImportError (missing socksio) on a cassette hit,
turning a deterministic replay into a shell-dependent error. Add an autouse
fixture that strips proxy vars during replay (record_mode == none) while
leaving them intact for recording.

Also fix the README 'Adding a test' snippet to include the imports it relies
on (LLMRails, load_config, suite-local snapshot, OPENAI_BASELINE_CONFIG) so a
new contributor can copy-paste it and land on the intended snapshot re-export.
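
The proxy-stripping behavior described above can be sketched with a plain context manager. This is an assumption-laden illustration rather than the fixture in the PR: the actual change is a pytest autouse fixture, the name `stripped_proxy_env` and the `PROXY_VARS` list are hypothetical, and the record mode is assumed to be the string "none" during replay.

```python
import contextlib
import os

# Assumed set of proxy variables; the real fixture may cover a different list.
PROXY_VARS = ("HTTP_PROXY", "HTTPS_PROXY", "ALL_PROXY")


@contextlib.contextmanager
def stripped_proxy_env(record_mode):
    """Temporarily remove proxy variables when replaying cassettes.

    When record_mode is anything other than "none" (that is, when
    recording), the environment is left untouched.
    """
    if record_mode != "none":
        yield
        return
    # Cover both upper- and lower-case spellings of each variable.
    names = [v for name in PROXY_VARS for v in (name, name.lower())]
    saved = {n: os.environ.pop(n) for n in names if n in os.environ}
    try:
        yield
    finally:
        # Restore whatever was removed, even if the body raised.
        os.environ.update(saved)
```

Wrapping a replayed test body in `with stripped_proxy_env("none"):` keeps an ambient SOCKS proxy setting from reaching the HTTP client, while `stripped_proxy_env("all")` or any other recording mode leaves the variables in place.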
Pouyanpi force-pushed the stack/recorded-tests-01-harness branch from e06a4fe to e522751 on June 26, 2026 07:18
Base automatically changed from stack/recorded-tests-01-harness to develop on June 26, 2026 07:24
Pouyanpi force-pushed the stack/recorded-tests-02-deterministic-library-load branch from 14ddbc7 to 82429ff on June 26, 2026 07:25
@github-actions

Contributor

Adds the library-traversal sibling of
test_load_prompts_sorts_files_for_deterministic_overrides, addressing the
stack-2 review ask. Mocks os.walk to yield two library .co files defining the
same bot message in non-sorted order and asserts the alphabetically-first
file wins the collision, pinning the dirs.sort()/sorted(files) fix in
LLMRails.__init__ so library load order stays filesystem-independent.
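
The collision rule that this test pins down (the first definition of a bot message wins, so sorted load order decides the winner) can be sketched as follows. The function `merge_bot_messages`, the file names, and the message texts are hypothetical and chosen only for illustration; the real code parses .co files rather than receiving ready-made dictionaries.

```python
def merge_bot_messages(parsed_files):
    """Merge bot messages from (filename, messages) pairs.

    Files are processed in sorted filename order and a message is only
    inserted if it is not already set, so the alphabetically first
    file wins any collision.
    """
    merged = {}
    for _filename, messages in sorted(parsed_files, key=lambda item: item[0]):
        for name, utterances in messages.items():
            # Insert only when the message has not been defined yet.
            merged.setdefault(name, utterances)
    return merged


# Hypothetical input, listed in a non-sorted "filesystem" order.
walk_order = [
    ("b_rail.co", {"refuse to respond": ["Sorry, I cannot help with that."]}),
    ("a_rail.co", {"refuse to respond": ["I'm not able to respond to that."]}),
]

result = merge_bot_messages(walk_order)
```

Here `result` holds the definition from a_rail.co regardless of the order in which the two files were listed. Without the sort, the winner would depend on which file the filesystem happened to return first.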
Pouyanpi merged commit 86feba9 into develop on Jun 26, 2026
14 checks passed
Pouyanpi deleted the stack/recorded-tests-02-deterministic-library-load branch on June 26, 2026 07:41