NVIDIA-NeMo · eeee2345 · May 9, 2026 · May 10, 2026 · May 17, 2026 · May 17, 2026
diff --git a/examples/configs/atr_threat_detection/README.md b/examples/configs/atr_threat_detection/README.md
@@ -0,0 +1,84 @@
+# ATR-inspired threat detection example
+
+This example shows how to use the built-in `regex_detection` input rail
+with a small set of patterns inspired by Agent Threat Rules, an open
+detection standard for AI agent threats published under the MIT license:
-detection standard for AI agent threats published under the MIT license:
+detection standard for AI agent threats published under the Apache-2.0 license:
+
+https://github.com/Agent-Threat-Rule/agent-threat-rules
+
+## What it covers
+
+The patterns in `config/config.yml` map to common attack categories that
+ATR ships rules for:
+
+- ATR-PI-001 instruction override ("ignore previous instructions")
+- ATR-PI-002 system prompt exfiltration ("reveal your system prompt")
+- ATR-PI-003 role-play jailbreak ("act as DAN")
+- ATR-PI-004 base64-wrapped payload hint
+- ATR-MCP-001 MCP tool override markers
+- ATR-SSRF-001 `file://` scheme reference
+
+Each entry is illustrative. The full ruleset and YAML schema live in the
+ATR repository; this example exists so a NeMo Guardrails user can see the
+shape of an agent-specific input rail without needing an external service.
+
+## Running the example
+
+From the project root:
+
+```bash
+nemoguardrails chat --config=examples/configs/atr_threat_detection/config
+```
+
+A user message such as "Ignore all previous instructions" will trigger the
+`regex check input` flow and the bot will respond with the library default
+refusal message defined in `nemoguardrails/library/regex/flows.v1.co`
+(`"I'm sorry, I can't respond to that."`). Benign messages are forwarded
+to the configured main model.
+
+The `config.yml` lists `openai`/`gpt-4o-mini` as the main model so that
+chat runs end-to-end. Replace it with your preferred provider; the input
+rail rejects any message that matches a pattern before the model is
+invoked, so the model only sees messages that no pattern matched.
+
+## Extending
+
+To run against the live ATR YAML ruleset, parse the rule files at startup
+and append the `detection.regex_patterns` field of each rule to the
+`patterns` list under `regex_detection.input`.
+
+To surface a custom signal (rather than only refusing), add a custom
+flow that calls `detect_regex_pattern` directly. Follow the library's
+established `if $config.enable_rails_exceptions` pattern (see
+`examples/configs/guardrails_only/input/config.co`) so the flow emits
+**either** the exception event **or** the bot utterance, not both. In
+Colang 1.0 the rails event loop short-circuits on the exception and
+drops the bot utterance from the response if both fire in the same
+flow.
+
+```colang
+define bot refuse atr_threat
+  "I'm sorry, that request was blocked by an ATR input safety rule."
+
+define flow atr report match
+  $result = execute detect_regex_pattern(source="input", text=$user_message)
+  if $result["is_match"]
+    if $config.enable_rails_exceptions
+      create event AtrRuleMatchedRailException(message="ATR input rail blocked")
+    else
+      bot refuse atr_threat
+    stop
+```
+
+Then wire `atr report match` instead of `regex check input` under
+`rails.input.flows`. The custom flow uses a non-conflicting bot utterance
+(`bot refuse atr_threat`) so it does not collide with the library
+default, and emits an `AtrRuleMatchedRailException` event when
+`enable_rails_exceptions` is set so downstream observers (audit logging,
+metrics) can subscribe to it.
+
+If you also want to capture the matched rule list for audit, assign
+`$matched_rules = $result["detections"]` before the if/else and pass it
+to your own action or include it in the event message. Keep each of the
+exception and utterance branches to a single action so the event-loop
+behavior described above still holds.
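
The first step under "Extending" in the README above (appending each rule's `detection.regex_patterns` to the `patterns` list under `regex_detection.input`) can be sketched in Python. This is a minimal sketch and not part of the PR: `merge_atr_patterns` is a hypothetical helper, it assumes the rule files have already been parsed into dictionaries (for example with a YAML library), and it assumes `detection.regex_patterns` holds a list of pattern strings.

```python
from typing import Any


def merge_atr_patterns(
    config: dict[str, Any], rules: list[dict[str, Any]]
) -> dict[str, Any]:
    """Append rule patterns to rails.config.regex_detection.input.patterns.

    `config` is a parsed config.yml; `rules` are parsed ATR rule files.
    Both shapes are assumptions based on the README text, not a confirmed schema.
    """
    # Walk (and create, if missing) the nested path used in config.yml.
    patterns = (
        config.setdefault("rails", {})
        .setdefault("config", {})
        .setdefault("regex_detection", {})
        .setdefault("input", {})
        .setdefault("patterns", [])
    )
    for rule in rules:
        detection = rule.get("detection") or {}
        for pattern in detection.get("regex_patterns") or []:
            # Skip patterns that are already present to avoid duplicates.
            if pattern not in patterns:
                patterns.append(pattern)
    return config
```

Rules without a `detection.regex_patterns` field are skipped rather than treated as errors, which keeps startup tolerant of rule files that use other detection mechanisms.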
diff --git a/examples/configs/atr_threat_detection/config/config.yml b/examples/configs/atr_threat_detection/config/config.yml
@@ -0,0 +1,35 @@
+# This example wires the built-in regex_detection rail to a small set of
+# ATR-inspired threat patterns covering common AI agent attack categories.
+# The full open detection set lives in Agent Threat Rules (MIT-licensed):
+# https://github.com/Agent-Threat-Rule/agent-threat-rules
+#
+# A main model is configured so `nemoguardrails chat` runs end-to-end against
+# this example. Replace the engine/model with your preferred provider; the
+# input rail rejects matching messages before the model is invoked, so the
+# model is only called for messages that no pattern matched.
+models:
+  - type: main
+    engine: openai
+    model: gpt-4o-mini
+
+rails:
+  config:
+    regex_detection:
+      input:
+        case_insensitive: true
+        patterns:
+          # ATR-PI-001 instruction override
+          - "\\b(ignore|disregard|forget)\\s+(all\\s+)?(previous|prior|above)\\s+(instructions?|prompts?|rules?)"
+          # ATR-PI-002 system prompt exfiltration
+          - "(reveal|print|repeat|show)\\s+(your\\s+)?(system\\s+prompt|initial\\s+instructions)"
+          # ATR-PI-003 role-play jailbreak
+          - "\\b(you\\s+are\\s+now|act\\s+as|pretend\\s+to\\s+be)\\s+(DAN|developer\\s+mode|jailbroken|an?\\s+unrestricted)"
+          # ATR-PI-004 base64-wrapped payload hint
+          - "(decode|run|execute)\\s+(this\\s+)?base64[:\\s]+[A-Za-z0-9+/=]{40,}"
+          # ATR-MCP-001 mcp tool override
+          - "<\\s*(tool_override|mcp_override|new_tool_definition)\\s*>"
+          # ATR-SSRF-001 file:// scheme reference
+          - "file://[^\\s\"'<>]+"
+  input:
+    flows:
+      - regex check input
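
To sanity-check patterns like the ones in `config.yml` above before wiring them into a rail, they can be exercised directly with Python's `re` module. This is a minimal sketch, assuming the rail applies each pattern as an unanchored, case-insensitive search (which `case_insensitive: true` suggests but which is not confirmed here). `matched_rules` and the use of rule IDs as dictionary keys are illustrative only.

```python
import re

# Three of the patterns from config.yml, written as Python raw strings.
# In the double-quoted YAML strings "\\b" and "\\s" unescape to \b and \s.
PATTERNS = {
    "ATR-PI-001": r"\b(ignore|disregard|forget)\s+(all\s+)?(previous|prior|above)\s+(instructions?|prompts?|rules?)",
    "ATR-PI-002": r"(reveal|print|repeat|show)\s+(your\s+)?(system\s+prompt|initial\s+instructions)",
    "ATR-MCP-001": r"<\s*(tool_override|mcp_override|new_tool_definition)\s*>",
}


def matched_rules(text: str) -> list[str]:
    """Return the IDs of all patterns that match anywhere in `text`."""
    return [
        rule_id
        for rule_id, pattern in PATTERNS.items()
        # Assumption: unanchored search with case-insensitive matching.
        if re.search(pattern, text, re.IGNORECASE)
    ]
```

For example, `matched_rules("Ignore all previous instructions")` returns `["ATR-PI-001"]`, while `"Ignore the previous instructions"` matches nothing, because the pattern only allows an optional `all` between the verb and `previous`. Gaps like this are worth checking when adapting the illustrative patterns to real traffic.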