feat: reconcile per-user MCP credentials on VK and MCP client changes#3705
Conversation
|
Warning Review limit reached
More reviews will be available in 15 minutes. Learn how PR review limits work. Your organization has run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After more reviews become available, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans include higher PR review limits than trial, open-source, and free plans. In all cases, reviews become available again over time. During sustained high-volume PR review activity, CodeRabbit may temporarily slow when the next review becomes available. Please see our Fair Usage Limits Policy for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (10)
📝 WalkthroughWalkthroughThis PR adds per-user credential reconciliation and header-key canonicalization. It introduces helper utilities to normalize header keys (lowercase+trim), canonicalizes keys at request and storage boundaries, defines a four-method ConfigStore reconciliation interface, implements RDB transactional reconciliation that orphans and reactivates per-user OAuth tokens and MCP header credentials based on VK-MCP grant changes, wires reconciliation into MCP and governance handlers with error logging, and improves error handling in session operations. ChangesPer-user Credential Reconciliation and Header Normalization
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
|
79fe495 to
fb2ba5b
Compare
b45e15d to
8737c78
Compare
Security Policy Alert: Secret Policy ViolationThis workflow run has been blocked by StepSecurity's secrets policy because it accesses secrets and the workflow file differs from the default branch. To approve this workflow, please add the Note: The label must be added by someone other than the PR author (Pratham-Mishra04) or automation bots to ensure proper security review. After the label is added, you can re-run the blocked workflow to proceed. This workflow will be automatically approved once merged into the default branch. For more information, see StepSecurity's Secret Exfiltration Policy documentation. |
Confidence Score: 5/5The reconciliation logic is correct and idempotent; the two remaining findings are non-blocking quality concerns that do not affect security or credential grant enforcement. The core correctness goals are implemented correctly. The pending-only flow delete and the pre-update snapshots address the most critical issues from the prior review. The two open findings are non-fatal and do not produce wrong grant state. framework/configstore/rdb.go (readVKsHoldingOauthCredsForMCP status filter) and transports/bifrost-http/handlers/mcp_per_user_headers.go (mergeExistingPerUserHeaders error handling) Important Files Changed
Reviews (14): Last reviewed commit: "fix: mcp auth session reconciliation fix..." | Re-trigger Greptile |
fb2ba5b to
5919a07
Compare
8737c78 to
70c0449
Compare
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
transports/bifrost-http/handlers/mcp.go (1)
1115-1125:⚠️ Potential issue | 🟠 Major | ⚡ Quick winReconcile after partial-success updates too.
If
allow_on_all_virtual_keyschanges, the MCP config update has already committed before thevk_configstransaction runs. When that later transaction fails and the handler returns at Line 1124, the block at Lines 1193-1203 never executes, so revoked per-user OAuth/header credentials stay in their old state until another update.💡 Minimal fix
}); err != nil { + if req.AllowOnAllVirtualKeys != existingAllowOnAllVirtualKeys { + if recErr := h.store.ConfigStore.ReconcileOauthAfterMCPChange(ctx, id); recErr != nil { + logger.Error(fmt.Sprintf("reconcile OAuth credentials after MCP %s update failed: %v", id, recErr)) + } + if recErr := h.store.ConfigStore.ReconcileMCPHeadersAfterMCPChange(ctx, id); recErr != nil { + logger.Error(fmt.Sprintf("reconcile per-user-headers credentials after MCP %s update failed: %v", id, recErr)) + } + } // NOTE: Partial success — the MCP client config was already updated in DB and memory above.Also applies to: 1183-1203
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@framework/configstore/rdb.go`:
- Around line 5707-5717: The vkIDs slice returned by
readVKsHoldingOauthCredsForMCP is unordered and can cause deadlocks when two
MCP-triggered reconciliations lock VKs in different orders; to fix, sort the
vkIDs slice (e.g., using Go's sort.Slice or sort.Strings/int conversion)
immediately after vkIDs, before iterating and calling
reconcileVKDirectTokensDB(tx, vkID) inside the transaction, and apply the same
change to the other similar transaction block (the one mirrored at lines
~5727-5737) so both reconciliation loops iterate VKs in a deterministic,
consistent order.
- Around line 5570-5599: The orphaning/reactivation and flows delete currently
only filter on virtual_key_id (orphanQ, flowsQ), so they can touch rows from
other modes; update the queries on TableOauthUserToken (orphanQ and the
reactivation query) to also filter by auth_mode = "virtual_key" and update the
TableOauthUserSession delete (flowsQ) to also filter by flow_mode =
"virtual_key" (or the exact sentinel your codebase uses for VK-mode) and
preserve the existing allowedClientIDs conditions; this ensures only VK-mode
rows (vkID-bound tokens and sessions) are orphaned/reactivated or deleted.
In `@framework/configstore/store.go`:
- Around line 434-437: Update the comment for
ListAllPendingMCPPerUserHeaderFlows to reflect that it does not return rows
regardless of caller identity but is subject to query-scope narrowing via
ScopedDB(ctx); mention that the implementation in framework/configstore/rdb.go
uses ScopedDB(ctx) so results are limited to rows visible to the caller
(row-visibility scoping) rather than global pending rows.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 4997c318-9da5-41cd-a2a2-20056cd15f2b
📒 Files selected for processing (5)
framework/configstore/rdb.goframework/configstore/store.gotransports/bifrost-http/handlers/governance.gotransports/bifrost-http/handlers/mcp.gotransports/bifrost-http/lib/config_test.go
70c0449 to
cf07e4e
Compare
5919a07 to
24adf78
Compare
24adf78 to
dbf3e80
Compare
cf07e4e to
522cf0a
Compare
dbf3e80 to
44bcebd
Compare
522cf0a to
1c9aa14
Compare
44bcebd to
be4c732
Compare
b36721f to
c00573b
Compare
5dde11f to
ddbf8a7
Compare
2d84d56 to
a396bc2
Compare
ddbf8a7 to
3e4f2f5
Compare
a396bc2 to
311e88c
Compare
e435e91 to
6a51858
Compare
311e88c to
2a4a450
Compare
2a4a450 to
dfa9a71
Compare
6a51858 to
78a9988
Compare
Merge activity
|
The base branch was changed.
78a9988 to
dd1652a
Compare
…#3705) ## Summary When a Virtual Key's MCP allowlist changes (via dashboard edit, AP propagation, or SCIM auto-assign) or when an MCP client's access configuration changes (VK configs diff or `AllowOnAllVirtualKeys` toggle), per-user credentials that are no longer valid were previously left stale. This PR introduces a reconciliation layer that orphans vk-keyed OAuth tokens and header credentials whose MCP grant was revoked, reactivates them if the grant returns, and hard-deletes any in-flight pending flow rows that can no longer complete. ## Changes - Added `DeleteVirtualKey` now also hard-deletes `TableMCPPerUserHeaderFlow` rows tied to the deleted VK, consistent with how other credential rows are cleaned up on VK deletion. - Added `vkEffectiveMCPClientIDs` helper that computes the union of a VK's explicit per-VK MCP allowlist and MCPs with `AllowOnAllVirtualKeys=true`, mirroring the runtime grant check. - Added `reconcileVKDirectTokensDB` and `reconcileVKDirectHeaderRowsDB` — transactional DB helpers that orphan active credentials outside the effective allowlist, reactivate orphaned credentials that regained access, and hard-delete pending flow rows for lost grants. - Added `readVKsHoldingOauthCredsForMCP` and `readVKsHoldingHeaderCredsForMCP` to identify which VKs need re-evaluation when a change originates on the MCP side. - Exposed four new `ConfigStore` interface methods: `ReconcileOauthAfterVKChange`, `ReconcileMCPHeadersAfterVKChange`, `ReconcileOauthAfterMCPChange`, and `ReconcileMCPHeadersAfterMCPChange`. - Called the VK-side reconcile methods in `updateVirtualKey` when `MCPConfigs` is present in the request. - Called the MCP-side reconcile methods in `updateMCPClient` when `VKConfigs` changed or `AllowOnAllVirtualKeys` was toggled. - Added no-op stubs for all four new interface methods to `MockConfigStore` in tests. - Reconciliation errors are logged but non-fatal so the primary update response is not blocked. ## Type of change - [ ] Bug fix - [x] Feature - [ ] Refactor - [ ] Documentation - [ ] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [ ] Providers/Integrations - [ ] Plugins - [ ] UI (React) - [ ] Docs ## How to test 1. Create a VK with an explicit MCP allowlist and establish an active OAuth token and a header credential for one of the allowed MCPs. 2. Edit the VK to remove that MCP from its allowlist — verify the token and credential rows transition to `status='orphaned'` and any pending flow rows for that MCP are deleted. 3. Re-add the MCP to the VK's allowlist — verify the orphaned rows return to `status='active'`. 4. Toggle `AllowOnAllVirtualKeys` on an MCP client — verify all VKs holding credentials for that MCP are re-evaluated accordingly. 5. Delete a VK — verify `mcp_per_user_header_flows` rows for that VK are removed alongside other credential rows. ```sh go test ./framework/configstore/... go test ./transports/bifrost-http/... ``` ## Breaking changes - [ ] Yes - [x] No ## Related issues ## Security considerations Orphaned credentials are invisible to runtime grant checks (`status='active'` filter), so a user whose VK loses MCP access can no longer use stale credentials to reach that MCP. Pending in-flight flows for revoked grants are hard-deleted immediately, preventing completion of an auth flow that would produce an unusable credential. ## Checklist - [ ] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [ ] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [ ] I verified the CI pipeline passes locally if applicable
## Summary This PR releases **core v1.5.14**, **framework v1.3.14**, **transports v1.5.6**, and bumps all dependent plugins to their respective `.14` patch versions. It delivers a broad set of new capabilities across MCP authentication, key rotation, OTel metrics, Bedrock/Anthropic compatibility, and UI improvements, alongside a number of targeted bug fixes and refactors. ## Changes - **Direct API Key Header** — Providers can now receive an API key passed directly via a request header (#3817) - **MCP Per-User Auth** — Introduced `MCPCredentialStore` abstraction, per-user MCP credential reconciliation, and a new per-user header auth type with lazy-auth submission flow (#3656, #3702, #3703, #3704, #3705) - **MCP TLS Configuration** — Added configurable TLS (`insecureSkipVerify`, `caCertPem`) for HTTP/SSE MCP client connections (#3779, #3783) - **MCP Sessions Management** — Filter, search, and pagination on the MCP sessions list API and table, plus a `can_reauth` identity gate (#3823, #3824, #3825) - **Key Rotation** — Keys now rotate on 401/402/403 responses; returns `502 upstream_credentials_exhausted` when all keys are permanently exhausted. Added `triggered_rotation` to `KeyAttemptRecord` and tightened `bifrost_key_rotation_events_total` semantics (#3430, #3491) - **OTel Metrics** — Added OTel spec-compatible metrics (backward compatible) with provider cache and semantic cache attributes in metrics export (#3865, #3816) - **Opus 4.8 Support** — System message handling and general compatibility for Opus 4.8 (#3868, #3878) - **Dimension Rankings** — New `GetDimensionRankings` API and dashboard tabs for team, customer, BU, and user rankings (#3766) - **Model Pricing Attributes** — `additional_attributes` field on model pricing rows with management API and UI editor (#3829) - **Prompt Cache Retention** — Added prompt cache retention parameter on responses requests (#3810) - **Tool Call Execution UI** — Inline tool-call execution, stop streaming, bulk execute/submit, and a redesigned tool-call UI (#3837, #3843) - **Sheet Navigation** — Prev/next keyboard navigation and URL state across virtual key, MCP client, and routing rule sheets (#3739, #3740, #3744, #3745) - **Bedrock Tool Name Truncation** — Truncate Bedrock function/tool names to the provider length limit - **Bedrock Guardrails** — Set guardrail config in Bedrock requests built from responses (#3862) - **Anthropic Tool Use** — Default `tool_use` input to `{}` when arguments are absent (#3880) - **Responses Streaming** — Fixed responses stream events (#3838) - **Compat Flow** — Fixed missing parameter parsing on the compat flow (#3881) - **Passthrough API Version** — Set a default API version in passthrough requests as a fallback (#3853) - **Virtual Key Updates** — Avoid overriding optional fields during virtual key update (#3855) - **User-Mode Flows** — Gate user-mode flows on caller `user_id`, skip temp token mint, and unify flow/credential kind filtering for pending flows (#3841, #3859) - **Partial Tool Calls** — Handle partial tool call execution failures and return successful results (#3849) - **URL Query Escaping** — Support escaped characters in URL query parameters (#3826) - **MCP Auth Errors** — Inline banner and retry support for MCP auth-required errors (#3856) - **Renamed Resolvers** — `staticHeadersResolver`/`serverOAuthResolver` renamed to `sharedHeadersResolver`/`sharedOAuthResolver` (#3840) - **Starlark Nested Tool Calls** — Exposed `RunWithPluginPipeline` on `ClientManager` and routed Starlark nested tool calls through the canonical plugin gate (#3794) - **Deferred-Fill OAuth Removed** — Removed deferred-fill user-mode OAuth flow support (#3839) - **Go 1.26.3** — Upgraded toolchain to Go 1.26.3 (#3782) ## Type of change - [x] Bug fix - [x] Feature - [x] Refactor - [ ] Documentation - [x] Chore/CI ## Affected areas - [x] Core (Go) - [x] Transports (HTTP) - [x] Providers/Integrations - [x] Plugins - [x] UI (React) - [ ] Docs ## How to test ```sh # Core/Transports go version # should report go1.26.3 go test ./... # UI cd ui pnpm i || npm i pnpm test || npm test pnpm build || npm run build ``` - Validate MCP per-user auth by configuring a per-user header auth type and confirming credentials are stored and reconciled on virtual key and MCP client changes. - Validate key rotation by triggering a 401/402/403 from an upstream provider and confirming rotation occurs; exhaust all keys and confirm a `502 upstream_credentials_exhausted` is returned. - Validate OTel metrics output includes `provider_cache` and `semantic_cache` attributes. - Validate Bedrock requests with tool names exceeding the provider limit are truncated correctly. - Validate Opus 4.8 system message handling by sending a request with a system message to an Opus 4.8 endpoint. ## Breaking changes - [x] Yes - [ ] No The deferred-fill user-mode OAuth flow has been removed (#3839). Any integrations relying on that flow must migrate to the new per-user credential store approach. The `staticHeadersResolver` and `serverOAuthResolver` identifiers have been renamed to `sharedHeadersResolver` and `sharedOAuthResolver` respectively (#3840); any direct references must be updated. ## Related issues #3817, #3656, #3702, #3703, #3704, #3705, #3779, #3783, #3823, #3824, #3825, #3430, #3491, #3865, #3816, #3868, #3878, #3766, #3829, #3810, #3837, #3843, #3739, #3740, #3744, #3745, #3862, #3880, #3838, #3881, #3853, #3855, #3841, #3859, #3849, #3826, #3856, #3840, #3794, #3839, #3782, #3724, #3814, #3836, #3869, #3886 ## Security considerations - MCP per-user credentials are stored via the new `MCPCredentialStore` abstraction; ensure the backing store is appropriately access-controlled and that credential values are encrypted at rest. - The direct API key header feature passes provider secrets via HTTP headers; ensure TLS is enforced on all ingress paths and that headers are not logged in plaintext. - User-mode flows are now gated on `caller user_id` and temp token minting is skipped where appropriate, reducing the surface for privilege escalation. - TLS configuration for MCP HTTP/SSE connections supports `insecureSkipVerify`; this should only be enabled in controlled environments. ## Checklist - [x] I read `docs/contributing/README.md` and followed the guidelines - [x] I added/updated tests where appropriate - [x] I updated documentation where needed - [x] I verified builds succeed (Go and UI) - [x] I verified the CI pipeline passes locally if applicable
## ✨ Features - **Direct API Key Header** - Pass a provider API key directly via request header (#3817) - **MCP Per-User Authentication** - New per-user header auth type with credential storage and lazy-auth submission flow (#3703, #3704, #3705) - **MCP TLS Configuration** - Configurable TLS (insecureSkipVerify, caCertPem) for HTTP/SSE MCP client connections (#3779, #3783) - **MCP Sessions Management** - Filter, search, and pagination on the MCP sessions list API and table, plus a can_reauth identity gate (#3823, #3824, #3825) - **Tool Call Execution UI** - Inline tool-call execution, stop streaming, bulk execute/submit, and a redesigned tool-call UI (#3837, #3843) - **Dimension Rankings Dashboard** - New dashboard tabs for team, customer, BU, and user rankings, backed by a GetDimensionRankings API (#3766) - **Model Pricing Attributes** - additional_attributes on model pricing rows with management API and UI editor (#3829) - **Prompt Cache Retention** - Prompt cache retention parameter on responses requests (#3810) - **Opus 4.8 Support** - System message handling and compatibility for Opus 4.8 (#3878, #3868) - **Key Rotation** - Rotate keys on 401/402/403 and return 502 upstream_credentials_exhausted when all keys are permanently dead (#3491) - **OTel Metrics** - OTel spec compatible metrics plus provider and semantic cache attributes in metrics export (#3865, #3816) - **Sheet Navigation** - Prev/next keyboard navigation and URL state across virtual key, MCP client, and routing rule sheets (#3739, #3740, #3744, #3745) - **Go 1.26.3** - Upgraded toolchain to Go 1.26.3 (#3782) ## 🐞 Fixed - **Bedrock Tool Names** - Truncate Bedrock function/tool names to the provider length limit - **Bedrock Guardrails** - Set guardrail config in Bedrock request built from responses (#3862) - **Anthropic Tool Use** - Default Anthropic tool_use input to {} when arguments are absent (#3880) - **Responses Streaming** - Fixed responses stream events (#3838) - **Compat Flow** - Fixed missing parameter parsing on the compat flow (#3881) - **Passthrough API Version** - Set a default API version in passthrough requests as a fallback (#3853) - **Virtual Key Updates** - Avoid overriding optional fields during virtual key update (#3855) - **User-Mode Flows** - Gate user-mode flows on caller user_id, skip temp token mint, and unify flow/credential kind filtering for pending flows (#3841, #3859) - **Partial Tool Calls** - Handle partial tool call execution failures and return successful results (#3849) - **URL Query Escaping** - Support escaped characters in URL query parameters (#3826) - **MCP Auth Errors** - Inline banner and retry support for MCP auth-required errors (#3856) - **JSON Editor Height** - Cap JSON editor max height at 400px in message views (#3842)

Summary
When a Virtual Key's MCP allowlist changes (via dashboard edit, AP propagation, or SCIM auto-assign) or when an MCP client's access configuration changes (VK configs diff or
AllowOnAllVirtualKeystoggle), per-user credentials that are no longer valid were previously left stale. This PR introduces a reconciliation layer that orphans vk-keyed OAuth tokens and header credentials whose MCP grant was revoked, reactivates them if the grant returns, and hard-deletes any in-flight pending flow rows that can no longer complete.Changes
DeleteVirtualKeynow also hard-deletesTableMCPPerUserHeaderFlowrows tied to the deleted VK, consistent with how other credential rows are cleaned up on VK deletion.vkEffectiveMCPClientIDshelper that computes the union of a VK's explicit per-VK MCP allowlist and MCPs withAllowOnAllVirtualKeys=true, mirroring the runtime grant check.reconcileVKDirectTokensDBandreconcileVKDirectHeaderRowsDB— transactional DB helpers that orphan active credentials outside the effective allowlist, reactivate orphaned credentials that regained access, and hard-delete pending flow rows for lost grants.readVKsHoldingOauthCredsForMCPandreadVKsHoldingHeaderCredsForMCPto identify which VKs need re-evaluation when a change originates on the MCP side.ConfigStoreinterface methods:ReconcileOauthAfterVKChange,ReconcileMCPHeadersAfterVKChange,ReconcileOauthAfterMCPChange, andReconcileMCPHeadersAfterMCPChange.updateVirtualKeywhenMCPConfigsis present in the request.updateMCPClientwhenVKConfigschanged orAllowOnAllVirtualKeyswas toggled.MockConfigStorein tests.Type of change
Affected areas
How to test
status='orphaned'and any pending flow rows for that MCP are deleted.status='active'.AllowOnAllVirtualKeyson an MCP client — verify all VKs holding credentials for that MCP are re-evaluated accordingly.mcp_per_user_header_flowsrows for that VK are removed alongside other credential rows.Breaking changes
Related issues
Security considerations
Orphaned credentials are invisible to runtime grant checks (
status='active'filter), so a user whose VK loses MCP access can no longer use stale credentials to reach that MCP. Pending in-flight flows for revoked grants are hard-deleted immediately, preventing completion of an auth flow that would produce an unusable credential.Checklist
docs/contributing/README.mdand followed the guidelines