Skip to content

feature: improve request/response transparency in /v1/chat/completions #2060

Description

@m-misiura

Did you check the docs?

  • I have read all the NeMo-Guardrails docs

Is your feature request related to a problem? Please describe.

Teams using NeMo Guardrails as an inference proxy via /v1/chat/completions might encounter issues where the server silently drops or rejects valid OpenAI-compatible parameters and response fields:

  • request filtering: strict Pydantic schemas reject unknown fields, so newer OpenAI parameters (e.g. tool_choice, response_format, reasoning) are rejected rather than forwarded to the upstream LLM.
  • response filtering: extension fields from upstream LLM responses (e.g. provider-specific metadata) may be dropped.
  • header forwarding: auth headers and request-correlation headers (e.g. X-Request-Id) are not propagated to/from the upstream LLM.
  • streaming fidelity: SSE chunk structure may not be preserved end-to-end.

Describe the solution you'd like

To be conformant, we need to:

  • forward unrecognized request parameters to the upstream LLM rather than rejecting them (pass-through mode)
  • preserve upstream response fields that the guardrails layer does not need to modify
  • propagate relevant headers (auth, correlation IDs) between client and upstream LLM
  • ensure streaming SSE chunks maintain their structure through the guardrails pipeline

Describe alternatives you've considered

  • dedicated proxy mode that bypasses Pydantic validation for unknown fields — more transparent but harder to validate.
  • document that /v1/chat/completions is not a transparent proxy and should not be used as one — accurate today, but limits adoption.

Additional context

No response

Metadata

Metadata

Assignees

No one assigned

    Labels

    enhancementNew feature or requeststatus: needs triageNew issues that have not yet been reviewed or categorized.

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions