Did you check the docs?
Is your feature request related to a problem? Please describe.
Teams using NeMo Guardrails as an inference proxy via /v1/chat/completions might encounter issues where the server silently drops or rejects valid OpenAI-compatible parameters and response fields:
- request filtering: strict Pydantic schemas reject unknown fields, so newer OpenAI parameters (e.g.
tool_choice, response_format, reasoning) are rejected rather than forwarded to the upstream LLM.
- response filtering: extension fields from upstream LLM responses (e.g. provider-specific metadata) may be dropped.
- header forwarding: auth headers and request-correlation headers (e.g.
X-Request-Id) are not propagated to/from the upstream LLM.
- streaming fidelity: SSE chunk structure may not be preserved end-to-end.
Describe the solution you'd like
To be conformant, we need to:
- forward unrecognized request parameters to the upstream LLM rather than rejecting them (pass-through mode)
- preserve upstream response fields that the guardrails layer does not need to modify
- propagate relevant headers (auth, correlation IDs) between client and upstream LLM
- ensure streaming SSE chunks maintain their structure through the guardrails pipeline
Describe alternatives you've considered
- dedicated proxy mode that bypasses Pydantic validation for unknown fields — more transparent but harder to validate.
- document that
/v1/chat/completions is not a transparent proxy and should not be used as one — accurate today, but limits adoption.
Additional context
No response
Did you check the docs?
Is your feature request related to a problem? Please describe.
Teams using NeMo Guardrails as an inference proxy via
/v1/chat/completionsmight encounter issues where the server silently drops or rejects valid OpenAI-compatible parameters and response fields:tool_choice,response_format,reasoning) are rejected rather than forwarded to the upstream LLM.X-Request-Id) are not propagated to/from the upstream LLM.Describe the solution you'd like
To be conformant, we need to:
Describe alternatives you've considered
/v1/chat/completionsis not a transparent proxy and should not be used as one — accurate today, but limits adoption.Additional context
No response