NVIDIA-NeMo · tgasser-nv · Jun 3, 2026 · Jun 3, 2026 · Jun 3, 2026 · Jun 5, 2026
diff --git a/docs/configure-rails/configuration-reference.md b/docs/configure-rails/configuration-reference.md
@@ -761,9 +761,11 @@ tracing:
   adapters:
     - name: FileSystem
   span_format: opentelemetry
-  enable_content_capture: false
+  enable_content_capture: false  # records prompts/responses on spans; may include PII
 ```
 
+For the environment-variable overrides, output formats, and privacy guidance that apply to `enable_content_capture`, see [](../observability/tracing/content-capture.md).
+
 ### Streaming
 
 ```{deprecated} v0.20.0

diff --git a/docs/configure-rails/yaml-schema/tracing-configuration.md b/docs/configure-rails/yaml-schema/tracing-configuration.md
@@ -48,6 +48,8 @@ tracing:
 |--------|-------------|---------|
 | `enabled` | Enable or disable tracing | `false` |
 | `adapters` | List of tracing adapters | `[]` |
+| `span_format` | Span structure used by the LLMRails adapter: `opentelemetry` or `legacy` | `opentelemetry` |
+| `enable_content_capture` | Record prompts and responses on spans. Captures potentially sensitive data (PII); see [](../../observability/tracing/content-capture.md) for full behavior and engine-specific differences | `false` |
 
 ## Tracing Adapters
 
@@ -121,6 +123,9 @@ Traces capture the following information:
 | **Errors** | Error conditions and debugging information |
 | **Timing** | Duration of each operation |
 
+By default, traces capture only metadata such as timing and token counts.
+Prompt and response content is recorded only when content capture is enabled; see [](../../observability/tracing/content-capture.md).
+
 ## Example Configurations
 
 ### Development Configuration

diff --git a/docs/observability/tracing/content-capture.md b/docs/observability/tracing/content-capture.md
@@ -0,0 +1,161 @@
+---
+title:
+  page: Capturing Message Content
+  nav: Content Capture
+description: Capture prompts, responses, and rail inputs on guardrails spans for debugging, with privacy controls and OpenTelemetry GenAI formats.
+topics:
+- Observability
+- AI Safety
+tags:
+- Tracing
+- OpenTelemetry
+- Content Capture
+- Privacy
+- GenAI
+content:
+  type: how_to
+  difficulty: technical_intermediate
+  audience:
+  - engineer
+  - DevOps Engineer
+  - AI Engineer
+---
+
+(content-capture)=
+
+# Capturing Message Content
+
+By default, guardrails spans carry only metadata: durations, token counts, finish reasons, and which rails activated.
+The prompts and responses themselves are not recorded.
+Content capture is an opt-in feature that adds the actual user inputs, model outputs, and rail inputs to your spans, so you can debug blocked prompts, investigate false positives in safety rails, and see exactly what a model received and returned.
+
+:::{admonition} Experimental Feature
+The inline content-capture behavior described on this page is emitted by the opt-in IORails engine.
+To enable IORails, set `NEMO_GUARDRAILS_IORAILS_ENGINE=1`.
+IORails is an early-release feature, and the captured attribute and event names follow the OpenTelemetry GenAI semantic conventions, which are still under active development and can change.
+:::
+
+:::{warning}
+Enabling content capture writes user inputs and model outputs to your telemetry backend.
+This may include personally identifiable information (PII) and other sensitive data.
+Only enable it when necessary, restrict access to the backend that receives the spans, and ensure compliance with your data-protection obligations.
+:::
+
+## Enabling Content Capture
+
+Content capture is controlled by the `enable_content_capture` field in the `tracing` section of `config.yml`:
+
+```yaml
+tracing:
+  enabled: true
+  enable_content_capture: true  # default: false
+  adapters:
+    - name: OpenTelemetry
+```
+
+Content is only captured when tracing is also enabled — there is no point recording content onto spans that are never exported.
+
+### Environment Variable Override
+
+*Applies to the IORails engine only — see [Engine Support](#engine-support) below.*
+
+The `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` environment variable overrides the config field in **both** directions.
+This gives operators a single OpenTelemetry-standard switch to flip capture across all services, regardless of what each deployed `config.yml` says.
+
+| `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` | Result |
+|------------------------------------------------------|--------|
+| `true` or `1` | Capture is forced **on**, even if `enable_content_capture: false`. |
+| `false` or `0` | Capture is forced **off**, even if `enable_content_capture: true`. |
+| Unset, empty, or any other value | Falls through to the `enable_content_capture` config field. |
+
+Values are case-insensitive, and surrounding whitespace is ignored.
+
+## Output Format
+
+*Applies to the IORails engine only — see [Engine Support](#engine-support) below.*
+
+The `gen_ai.*` content captured on the LLM calls is emitted in one of two forms, selected by the `OTEL_SEMCONV_STABILITY_OPT_IN` environment variable.
+This variable holds a comma-separated list of opt-in tokens.
+The format selector applies only to this `gen_ai.*` content; the `guardrails.request.*` and `guardrails.rail.*` attributes described under [Where Content Is Captured](#where-content-is-captured) are always emitted as plain span attributes regardless of its value.
+
+| `OTEL_SEMCONV_STABILITY_OPT_IN` | Format | What is emitted |
+|---------------------------------|--------|-----------------|
+| Contains `gen_ai_latest_experimental` | **JSON span attributes** | Structured, JSON-encoded span attributes following the latest experimental OpenTelemetry GenAI conventions. |
+| Unset, or does not contain the token (default) | **Legacy span events** | One span event per message, following the earlier GenAI event conventions. |
+
+### JSON Span Attributes
+
+When `OTEL_SEMCONV_STABILITY_OPT_IN` contains `gen_ai_latest_experimental`, content is recorded as JSON-encoded span attributes:
+
+| Attribute | Contents |
+|-----------|----------|
+| `gen_ai.input.messages` | The non-system input messages, each as `{"role": ..., "parts": [{"type": "text", "content": ...}]}`. |
+| `gen_ai.output.messages` | The assistant output, as a single role-wrapped message. |
+| `gen_ai.system_instructions` | The system messages, as a flat list of `{"type": "text", "content": ...}` parts (no role wrapper, per the specification). |
+
+Each attribute is set only when it has content, so a backend can distinguish "no system instructions" from an empty string.
+
+### Legacy Span Events
+
+By default — when `OTEL_SEMCONV_STABILITY_OPT_IN` is unset or does not include the token — content is recorded as span events instead:
+
+| Event | Emitted for |
+|-------|-------------|
+| `gen_ai.system.message` | Each system message. |
+| `gen_ai.user.message` | Each user message. |
+| `gen_ai.assistant.message` | Each assistant message in the input. |
+| `gen_ai.tool.message` | Each tool message in the input. |
+| `gen_ai.choice` | The assistant output (the response). |
+
+Roles outside this set (for example, the legacy `function` role) are skipped; function-call events are not yet captured.
+
+```{note}
+This format selection is independent of the `tracing.span_format` config field.
+`span_format` (`opentelemetry` or `legacy`) selects the span structure used by the LLMRails post-hoc tracing adapter, whereas `OTEL_SEMCONV_STABILITY_OPT_IN` selects how IORails encodes captured content on its inline spans.
+```
+
+## Where Content Is Captured
+
+When capture is active, content lands on the following spans.
+
+| Span | Kind | Captured content |
+|------|------|------------------|
+| `guardrails.request` | SERVER | `guardrails.request.input` — the JSON-encoded caller input messages — and `guardrails.request.output` — the plain-text response actually delivered to the caller (the refusal message when a rail blocks). These are always plain span attributes, independent of the `OTEL_SEMCONV_STABILITY_OPT_IN` format selector. |
+| `gen_ai.*` | CLIENT | The input messages and output of every LLM call — both the main LLM and the per-rail-action LLM calls (for example, content-safety models) — using the `gen_ai.*` attribute or event names above. |
+| `guardrails.rail` | INTERNAL | `guardrails.rail.input` — the JSON-encoded rail input (`{"messages": [...], "bot_response": ...}`). On a rail that blocks, `guardrails.rail.reason` also carries the human-readable block reason. |
+
+The request span deliberately uses its own `guardrails.request.*` attributes rather than the `gen_ai.*` names.
+On a block path the two diverge: the LLM CLIENT span records the raw model response, while the SERVER span records what the caller actually received — the refusal message.
+Reusing `gen_ai.output.messages` on both would put different values under the same name and confuse a backend correlating them.
+
+Because the request span and the LLM spans belong to the same trace, a backend correlates the outer guardrails request with the inner model calls through trace and span context (the `trace_id` and parent-child `span_id` relationships).
+The shared `gen_ai.*` names across the LLM CLIENT spans do not establish that link — names repeat across requests — but they make the captured content easier to interpret once the spans are correlated.
+
+## Streaming
+
+For streamed responses, output chunks are accumulated and the captured output is written once, at the end of the stream.
+The recorded output is exactly what reached the consumer.
+If an output rail blocks mid-stream, the captured output reflects the truncated stream plus any injected error response — not text the caller never received.
+When nothing is delivered, no output is recorded.
+
+## Engine Support
+
+| Engine | Content capture |
+|--------|-----------------|
+| **IORails** | Preview support. The full behavior on this page — environment-variable resolution, JSON-attribute or legacy-event format selection, and capture on the request, LLM, and rail spans — is emitted by the opt-in `IORails` engine. |
+| **LLMRails** | The `enable_content_capture` field is honored by the LLMRails post-hoc tracing adapter, but content is emitted through that adapter's own span extractors and attribute names. The `OTEL_INSTRUMENTATION_GENAI_CAPTURE_MESSAGE_CONTENT` and `OTEL_SEMCONV_STABILITY_OPT_IN` controls described here apply to IORails. |
+
+## Important Considerations
+
+- **Privacy first.**
+  Captured spans contain raw prompts and responses.
+  Treat the receiving backend as holding sensitive data, and prefer enabling capture in development or in scoped investigations rather than broadly in production.
+- **No truncation.**
+  Content is captured in full; there is no size limit or truncation knob.
+  Size your exporters and backend accordingly, especially for large inputs or long streamed responses.
+- **Evolving GenAI standards.**
+  The [OpenTelemetry GenAI semantic conventions](https://opentelemetry.io/docs/specs/semconv/gen-ai/) are still under active development.
+  Attribute names, event names, and structures can change as the specification matures.
+- **Performance.**
+  Extensive telemetry collection can affect performance, especially with large inputs and outputs.
+  The hot-path cost is dominated by SDK-level batching and export, which your application controls.
diff --git a/docs/observability/tracing/index.md b/docs/observability/tracing/index.md
@@ -84,7 +84,7 @@ The following are the key differences between the supported span formats.
 **Development Status**: The [OpenTelemetry semantic conventions for GenAI](https://opentelemetry.io/docs/specs/semconv/gen-ai/) are currently in development and may undergo changes. Consider the following risks:
 
 - **Evolving Standards**: Conventions may change as they mature, potentially affecting existing implementations
-- **Data Privacy**: The `enable_content_capture` option captures user inputs and model outputs, which may include sensitive information (PII). Only enable when necessary and ensure compliance with data protection regulations. See [GenAI Events documentation](https://opentelemetry.io/docs/specs/semconv/gen-ai/gen-ai-events/) for details
+- **Data Privacy**: The `enable_content_capture` option records user inputs and model outputs onto spans, which may include sensitive information (PII). Only enable when necessary and ensure compliance with data protection regulations. See [](content-capture.md) for the full behavior, environment-variable controls, and privacy guidance
 - **Performance Impact**: Extensive telemetry collection may impact system performance, especially with large inputs/outputs
 
 ### Migration Path
@@ -100,6 +100,7 @@ Existing configurations will continue to work. However, it is strongly recommend
 
 - [](quick-start.md) - Minimal setup to enable tracing using the OpenTelemetry SDK
 - [](adapter-configurations.md) - Detailed configuration for FileSystem, OpenTelemetry, and Custom adapters
+- [](content-capture.md) - Capture prompts, responses, and rail inputs on spans, with privacy controls and output formats
 - [](opentelemetry-integration.md) - Production-ready OpenTelemetry setup and ecosystem compatibility
 - [](opentelemetry-logs.md) - Forward guardrails Python logs to OpenTelemetry with trace correlation
 - [](troubleshooting.md) - Common issues and solutions
@@ -114,6 +115,7 @@ Existing configurations will continue to work. However, it is strongly recommend
 
 Quick Start <quick-start>
 Adapters <adapter-configurations>
+Content Capture <content-capture>
 OpenTelemetry <opentelemetry-integration>
 OpenTelemetry Logs <opentelemetry-logs>
 Troubleshooting <troubleshooting>