feat(mcp): add HTTP auth middleware and caller-provided backend tokens #1253

Open

sriaradhyula wants to merge 55 commits into 0.5.0 from
Conversation
* fix(ci): bump Go 1.25→1.26 and langgraph 1.0.10→1.1.6

  The github MCP Docker build fails because go.mod requires go>=1.26.2 but Dockerfile.mcp pins golang:1.25-alpine (running go 1.25.9).

  Three sub-packages (agent_ontology, multi_agents, splunk/agent_splunk) still pinned langgraph==1.0.10, which is incompatible with the transitively resolved langgraph-prebuilt==1.0.9 that imports ExecutionInfo from langgraph.runtime — a symbol only present in langgraph>=1.1.0. Align these to 1.1.6, matching the rest of the repo.

  Also bumps langchain 1.1.3→1.2.15 in agent_ontology (langchain 1.1.3 caps langgraph<1.1.0) and removes a duplicate langgraph line in multi_agents/pyproject.toml.

  Assisted-by: Claude:claude-opus-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(ci): move Grype container scan and integration test to post-release

  Remove tag-push triggers from security-scan.yml and tests-quick-sanity-integration-on-latest-tag.yml — images are not yet published when the tag event fires, causing MANIFEST_UNKNOWN failures. Instead, the release-finalize workflow now dispatches both workflows via workflow_dispatch after all CI builds pass and the release is published, ensuring container images are available.

  Also bumps release-finalize permissions from actions:read to actions:write so it can dispatch workflows.

  Assisted-by: Claude:claude-opus-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Covers:

- MongoDB schema changes (agent_configs→agent_skills rename, feedback extraction, new collections)
- Helm value changes (Slack MCP image swap, Langfuse removal, checkpoint persistence)
- Environment variable changes (DISTRIBUTED_AGENTS)
- RAG entity rename, dependency upgrades, rollback scripts, and pre/post-upgrade checklists

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Bumps the npm_and_yarn group with 1 update in the /docs directory: [dompurify](https://github.com/cure53/DOMPurify).

Updates `dompurify` from 3.3.3 to 3.4.0
- [Release notes](https://github.com/cure53/DOMPurify/releases)
- [Commits](cure53/DOMPurify@3.3.3...3.4.0)

---
updated-dependencies:
- dependency-name: dompurify
  dependency-version: 3.4.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
* SDPL-1601 fix(middleware): allow LLM to synthesize after RAG cap exhaustion [GAI]

  When the RAG search cap is exhausted and the agent keeps calling search, DeterministicTaskMiddleware injects a ToolMessage telling the agent to synthesize from what was already retrieved — but then immediately set jump_to="end", skipping the LLM's chance to actually produce a response. This caused "I've completed your request." empty responses.

  Remove jump_to="end" from the RAG loop termination path so the agent gets one more LLM turn to generate a real answer from retrieved context.

  Signed-off-by: Erik Lutz <elutz@splunk.com>

* fix(middleware): improve RAG loop synthesis logging message

  Update the log message from "terminating" to "allowing LLM one more turn to synthesize response" to accurately reflect the behavior — we inject a synthesis prompt and return to the model; we don't terminate the graph. The misleading log message was confusing during debugging.

  Assisted-by: Claude:claude-haiku-4-5-20251001-v1
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(middleware): add debug logging and inline comments for RAG loop detection

  - Add debug log showing which RAG tools are capped (helps diagnose RAG loop scenarios)
  - Add comprehensive inline comments explaining the RAG loop detection logic
  - Change exception logging from debug to warning level with full traceback (exc_info=True)
  - Clarify that we return to the model WITHOUT jump_to:"end" so the LLM gets a synthesis turn

  This improves debuggability when RAG caps are exhausted.

  Assisted-by: Claude:claude-haiku-4-5-20251001-v1
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

---------

Signed-off-by: Erik Lutz <elutz@splunk.com>
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: Erik Lutz <elutz@splunk.com>
Co-authored-by: Sri Aradhyula <sraradhy@cisco.com>
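The fix above can be sketched as a minimal after-tools hook. This is illustrative, not the project's actual middleware: the `rag_cap_exhausted` helper and the state shape are assumptions for the example.

```python
def rag_cap_exhausted(state: dict) -> bool:
    # Illustrative cap check: compare search-call count against the cap.
    return state.get("search_calls", 0) >= state.get("search_cap", 3)

def after_tools(state: dict) -> dict:
    """Handle the RAG-cap-exhausted case without terminating the graph."""
    if rag_cap_exhausted(state):
        synthesis_note = {
            "role": "tool",
            "content": (
                "Search cap reached. Synthesize an answer from the "
                "documents already retrieved; do not call search again."
            ),
        }
        # Before the fix, this return value also carried jump_to="end",
        # which skipped the model's final turn and produced empty responses.
        return {"messages": [synthesis_note]}
    return {}
```

The key design point is that the hook injects guidance and returns control to the model rather than short-circuiting the graph.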
- Quote associative array keys in declare statements to prevent treating them as unbound variables under set -u
- Use parameter expansion default `[@]:-` syntax for array expansions that may be unset, matching bash best practices for nounset-safe iteration
- Fixes errors: "argocd: unbound variable" at line 1484 and "PF_PIDS[@]: unbound variable" at line 97

Assisted-by: Claude:claude-haiku-4-5-20251001-v1:0
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
…nt writes (#1234)

Problem
-------
The caipe-dev supervisor pod crashes repeatedly (12+ restarts / 21h) with:

    Primary stream failed: update command document too large

Root cause: the LangGraph MongoDBSaver persists the FULL graph state on every step. In single-node mode the state includes two large channels that are always re-populated from the in-process skill cache at the start of every graph invocation and therefore have no business being in MongoDB:

- files["/skills/*"]: raw SKILL.md content injected by skills_middleware.backend_sync.build_skills_files(). 56 skills across 3 hub sources produce 422 file entries (~1-3 MB).
- skills_metadata: processed skill catalogue built by deepagents.middleware.skills.SkillsMiddleware.before_model() from the files channel on every graph step (~1-2 MB).

As conversation state grows (messages + todos + tasks + files + skills_metadata), the BSON update document exceeds MongoDB's hard 16 MB limit and the supervisor crashes. Nginx then returns 503 until the pod recovers (~2 min).

Fix
---
Add two new helper functions alongside the existing _truncate_large_messages:

- _strip_skills_from_checkpoint(checkpoint), called in _LazyAsyncMongoDBSaver.aput():
  - Removes all files entries whose key starts with "/skills/" while preserving user-created files (write_file / read_file output) so cross-turn file-passing continues to work.
  - Removes the skills_metadata channel entirely.
- _strip_skills_from_writes(writes), called in _LazyAsyncMongoDBSaver.aput_writes():
  - Drops any ("skills_metadata", ...) write entry.
  - Strips "/skills/*" keys from any ("files", {...}) write entry.

Both functions only modify the copy that is persisted to MongoDB. The live in-memory LangGraph state is unaffected, so SkillsMiddleware continues to see the full skill catalogue for the duration of the current turn.

Why this is safe
----------------
agent.py:AIPlatformEngineerA2ABinding.stream() (and the equivalent deep_agent.py:serve()/serve_stream() paths) already inject state_dict["files"] = dict(self._mas_instance._skills_files) on EVERY call before graph.astream(). LangGraph reads the checkpoint ONCE at the start of an invocation and then merges the inputs dict into the recovered state via file_reducer ({**checkpoint_files, **input_files}). Skills are therefore always present in the in-memory state for the entire turn even though they are absent from the persisted checkpoint. Between turns (pod restarts, HITL resumes) the next stream() call re-injects them.

Observed impact
---------------
- files channel: -422 entries (~1-3 MB) stripped per checkpoint write
- skills_metadata channel: ~1-2 MB stripped per checkpoint write
- Total checkpoint size reduction: 2-5 MB per write
- With a 16 MB BSON limit, this gives ~3x more headroom for conversation state (messages, todos, tasks) before the limit is reached

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
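The stripping logic described above can be sketched roughly as follows. This is a hedged illustration: the function name matches the commit message, but the checkpoint layout (`channel_values` dict) is an assumption for the example, and the real helper lives on the Mongo saver class.

```python
from copy import deepcopy

def strip_skills_from_checkpoint(checkpoint: dict) -> dict:
    """Return a persistence-safe copy: /skills/* files and skills_metadata removed."""
    cp = deepcopy(checkpoint)  # never mutate the live in-memory state
    channels = cp.get("channel_values", {})
    files = channels.get("files")
    if isinstance(files, dict):
        # Keep user-created files so cross-turn file-passing still works;
        # drop only the skill files that are re-injected on every turn.
        channels["files"] = {
            path: content
            for path, content in files.items()
            if not path.startswith("/skills/")
        }
    # skills_metadata is rebuilt from the files channel each step, so it is
    # safe to omit from the persisted copy entirely.
    channels.pop("skills_metadata", None)
    return cp
```

The copy-then-strip shape is what makes the change safe: the in-memory state that SkillsMiddleware reads is never touched.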
…1237)

PlatformEngineerResponse.is_task_complete, require_user_input, and Metadata.user_input had no defaults. A partial LLM response (common for read-only tasks like doc retrieval) omitting these fields failed Pydantic validation, triggering ModelRetryMiddleware to retry 5× before bailing with on_failure="continue" and emitting no response.

Add sensible defaults (is_task_complete=True, require_user_input=False, user_input=False) so partial ResponseFormat calls succeed without retries.

Update test_response_format.py to reflect the new behavior and add regression tests in test_supervisor_fetch_document_cap_e2e.py that validate the exact partial-payload shapes that were previously failing.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
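The defaulting fix boils down to making the response-format fields optional so a partial payload still constructs. A minimal sketch, using plain dataclasses as a stand-in for the project's Pydantic models (field names come from the commit message; the overall shape is assumed):

```python
from dataclasses import dataclass, field

@dataclass
class Metadata:
    user_input: bool = False  # previously required -> failed on partial payloads

@dataclass
class PlatformEngineerResponse:
    content: str = ""
    is_task_complete: bool = True     # default favors the common read-only case
    require_user_input: bool = False  # previously required -> ValidationError
    metadata: Metadata = field(default_factory=Metadata)

# A read-only task that only returns content no longer errors:
resp = PlatformEngineerResponse(content="Here are the docs.")
```

With required fields, constructing from a partial payload raises; with defaults, it succeeds, which is exactly what stops the retry loop.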
* fix(ui): don't persist session-expired errors in chat history

  When the A2A client receives a 401, it throws a 'Session expired:' error. Previously this was appended to the assistant message and persisted in MongoDB, so recovering a chat would show the error inline. The TokenExpiryGuard already handles the UX (modal + redirect), so skip appendToMessage for session expiry errors across all 4 catch blocks in ChatPanel and DynamicAgentChatPanel.

  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(ui): add tests for session-expired error suppression in chat

  Verify that appendToMessage is NOT called when the A2A client throws 'Session expired: …', and that other errors (network, backend) are still appended inline as expected. Also adds @testing-library/dom as an explicit dev dependency.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
The devDependency used ^10.4.1 (semver range), which fails the pinned-deps CI check that requires all Node deps to be exact versions. Assisted-by: Claude:claude-sonnet-4-6 Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
#1235)

* fix(a2a): isolate execution plan state per request to prevent cross-session leakage

  The AIPlatformEngineerA2ABinding is a singleton shared across all concurrent chat sessions. Four mutable instance attributes tracking execution plan state (_execution_plan_sent, _previous_todos, _task_plan_entries, _in_self_service_workflow) caused plan data from one user's session to leak into another's when requests overlapped.

  Introduces a PlanState dataclass instantiated as a local variable in stream(), ensuring each request gets its own isolated plan tracking state. Helper methods _build_todo_plan_text() and _build_task_plan_text() now accept PlanState as a parameter instead of reading instance state.

  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): correct plan_step_id assertion on post-plan streaming chunks

  The test asserted post-plan streaming_result chunks should NOT carry plan_step_id, but agent_executor.py:830-835 intentionally stamps it so the UI nests content under plan steps. Updated the assertion to match actual behavior.

  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unused import

  Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(a2a): add concurrent session isolation tests for PlanState

  Simulate two overlapping stream() calls on the same singleton binding using asyncio barriers and verify plan entries, the execution_plan_sent flag, and sequential state are all per-session isolated.

  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unused PlanState import in concurrent test

  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
  Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
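The singleton-to-local refactor above can be sketched compactly. Field names mirror the commit message; the binding and stream logic are simplified stand-ins, not the project's actual code:

```python
import asyncio
from dataclasses import dataclass, field

@dataclass
class PlanState:
    """Per-request plan tracking; one instance per stream() call."""
    execution_plan_sent: bool = False
    previous_todos: list = field(default_factory=list)
    task_plan_entries: list = field(default_factory=list)
    in_self_service_workflow: bool = False

class Binding:
    """Stand-in for the singleton A2A binding; holds no per-request plan state."""
    async def stream(self, session_id: str) -> PlanState:
        plan = PlanState()  # local variable: never shared across requests
        plan.task_plan_entries.append(f"plan-for-{session_id}")
        plan.execution_plan_sent = True
        await asyncio.sleep(0)  # yield so concurrent calls interleave
        return plan

async def demo() -> list:
    binding = Binding()  # one shared singleton, two overlapping requests
    return await asyncio.gather(binding.stream("a"), binding.stream("b"))
```

Because each call constructs its own `PlanState`, overlapping requests on the same singleton can no longer observe each other's plan entries.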
…nce (#1229)

* feat(auth): add dual auth middleware for shared key + OAuth2 coexistence

  When both A2A_AUTH_SHARED_KEY and A2A_AUTH_OAUTH2=true are set, the new DualAuthMiddleware allows both auth methods simultaneously:

  1. Shared key → immediate access (A2A machine-to-machine clients)
  2. OAuth2 JWT → validated via JWKS (UI OIDC flow)

  Previously, setting A2A_AUTH_SHARED_KEY would override OAuth2 auth entirely (if/elif), breaking the UI's OIDC-based backend calls. The middleware priority chain in main.py is now:

  - Both set → DualAuthMiddleware (shared key OR JWT)
  - Only shared key → SharedKeyMiddleware
  - Only OAuth2 → OAuth2Middleware
  - Neither → no auth

  verify_token is imported lazily inside dispatch() to avoid pulling in oauth2_middleware module-level env validation at import time. Includes unit tests for all auth paths.

  Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>

* fix(auth): address review findings in dual auth middleware

  - Use hmac.compare_digest() for shared key comparison to prevent timing attacks
  - Fix broken test mocking: the verify_token mock was being restored before requests ran, so OAuth2 fallback tests were hitting the real (uninitialized) verify_token. Moved mocking to test-level context managers.

  Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>

* fix: address PR review feedback

  Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>

---------

Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>
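The priority chain described above is easy to express as a small selector. Env var names come from the commit message; the function shape is illustrative, not the actual main.py code:

```python
def select_auth_middleware(env: dict) -> str:
    """Return which auth middleware the priority chain would install."""
    shared_key = bool(env.get("A2A_AUTH_SHARED_KEY"))
    oauth2 = env.get("A2A_AUTH_OAUTH2", "").lower() == "true"
    if shared_key and oauth2:
        return "DualAuthMiddleware"   # shared key OR JWT accepted
    if shared_key:
        return "SharedKeyMiddleware"
    if oauth2:
        return "OAuth2Middleware"
    return "none"                     # no auth configured
```

The point of the fix is the first branch: before it, the shared-key setting short-circuited the chain and OAuth2 was never reachable.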
Signed-off-by: Ubuntu <ubuntu@ip-10-175-48-77.us-east-2.compute.internal>
…langgraph-redis slack-bot
Signed-off-by: Ubuntu <ubuntu@ip-10-175-48-77.us-east-2.compute.internal>
…-prefix feat(helm): allow custom ingress path for supervisor ingress
…gent output Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
…repeated failures Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
…ming behavior Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
[GAI] Removes the diagnostic logger.info block that logged every A2A event (type, artifact, text_len, preview, step statuses). This was added for debugging but clutters production logs and was flagged in PR review. Assisted-by: Claude:claude-opus-4-7 Signed-off-by: Erik Lutz <elutz@splunk.com>
…e-io/ai-platform-engineering into fix/slack-streaming-restore-v0241-ux # Conflicts: # ai_platform_engineering/integrations/slack_bot/utils/ai.py
…pts (SDPL-1601) [GAI]

The system prompt instructed the LLM to call search(keyword_search=True/False) but this parameter never existed in the tool schema. Pydantic rejected it on every call, causing 7 traced production failures today where users received "I ran into an issue" error messages instead of answers.

Fixes both locations where the bad instruction appeared:
- ai_platform_engineering/integrations/slack_bot/utils/config_models.py
- charts/ai-platform-engineering/data/prompt_config.rag.yaml

Replaces keyword_search references with equivalent phrasing guidance that doesn't reference nonexistent tool parameters.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Erik Lutz <elutz@splunk.com>
…ompt fix(slack-bot): remove nonexistent keyword_search param from RAG search prompts
…41-ux fix(slack): restore v0.2.41 streaming UX and suppress duplicate sub-agent output
Establishes a consistent `appuser` (UID/GID 1001) for running applications within containers. Signed-off-by: Louise Champ <myauie@gmail.com>
…1242)

* fix(supervisor): add curl tool to supervisor utility tools

  The curl tool was implemented and exported but never added to the supervisor utility_tools list in _build_graph_async. Without it, the supervisor only had fetch_url (GET-only) for HTTP requests. PUT/POST requests would fail with 405 or be silently mishandled, causing the agent to hallucinate success on write API calls.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(supervisor): suppress curl/fetch_url raw content from stream; restore response_format defaults

  curl and fetch_url return large HTTP response bodies that were streaming directly to clients (Slack appendStream failures, raw HTML in chat). Add a _SUPPRESS_CONTENT_TOOLS constant in agent.py (curl, fetch_url, fetch_markdown, wget) and suppress their ToolMessage content the same way RAG tools are suppressed — the tool_call completion notification is sufficient for the client.

  Also restore Metadata.user_input, PlatformEngineerResponse.is_task_complete, and require_user_input defaults that were accidentally dropped in this branch. Partial LLM responses omitting these fields were triggering Pydantic ValidationError → ModelRetryMiddleware retry loop (5×) → bailout after 1 step. Update test_missing_user_input_raises_validation_error → defaults_to_false to reflect the lenient-parsing design.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* refactor(tools): replace fetch_url with curl; add strip_html param

  fetch_url was a GET-only requests.get() wrapper — a strict subset of what curl already handles. Remove it from utility_tools and add an optional strip_html=True parameter to the curl tool that runs BeautifulSoup stripping on the response, preserving the one unique capability fetch_url had. Also remove fetch_url from _SUPPRESS_CONTENT_TOOLS since it is no longer a supervisor tool.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): add guardrails — https-only, block file-write flags

  - Reject any URL with a scheme other than https:// to prevent http://, file://, ftp:// etc.
  - Block -o/--output and --config/-K flags that write to disk or read curl config files from LLM-generated commands.
  - Validation runs on parsed args before subprocess.run so blocked commands never reach the shell.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): replace terse errors with detailed user-facing messages

  Guardrail rejections now explain what was blocked, why, and what is supported — so the LLM can relay the information clearly instead of surfacing a bare "ERROR:" string.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): unblock -o/--output, only block --config/-K

  -o is needed for legitimate use cases (e.g. saving API responses to pass to jq). Only --config/-K (reads curl config from disk) remains blocked.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): simplify guardrails to https-only URL check

  Remove the flag blocklist entirely — only enforce https:// URLs.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(curl): add unit and e2e tests for curl tool guardrails and supervisor wiring

  Unit tests (test_curl_tool.py):
  - _validate_curl_args: https accepted, http/file/ftp rejected, error messages contain the scheme and list supported alternatives, query strings stripped from error output
  - Argument handling: curl prefix auto-prepended, not doubled, PUT/body args passed through, invalid shlex rejected
  - Subprocess results: success, empty output, nonzero exit, stderr appended, timeout, curl not found
  - strip_html: tags stripped, raw returned when False, default is False

  E2E tests (TestCurlToolInSupervisor):
  - curl present in tools after _build_graph()
  - fetch_url absent from tools after _build_graph()
  - strip_html and timeout params exposed on the tool
  - http:// and file:// URLs return informative user-facing strings

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* feat(rag): add tool arbitration guidance — KB-first, curl for execution only

  Add a tool_arbitration_prompt section to prompt_config.rag.yaml that explicitly tells the LLM to use the knowledge base (search/fetch_document) first for discovery and documentation, and reserve curl for state-changing API calls and live data not available in the KB. Wire _TOOL_ARBITRATION_PROMPT into both _RAG_ONLY_INSTRUCTIONS and _RAG_WITH_GRAPH_INSTRUCTIONS so the rule applies regardless of whether graph RAG is enabled.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* feat(rag): add explicit-curl and live-data exceptions to arbitration rule
* feat(rag): rephrase curl arbitration exception
* feat(rag): finalize curl arbitration wording

* refactor(agent): remove _SUPPRESS_CONTENT_TOOLS, inline curl check

  The frozenset only covered curl in practice (fetch_markdown/wget are not supervisor tools). Inline tool_name == "curl" directly.

  Assisted-by: Claude:claude-sonnet-4-6
  Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* docs(curl): expand docstring with wget/fetch_markdown equivalent examples
* docs(curl): remove internal reference to wget/fetch_markdown
* test(curl): add file download test; fix stdout mock for -o flag
* chore: bump chart versions for ai-platform-engineering

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* feat: add vertical pod autoscaler support

  Add Vertical Pod Autoscaler capabilities for agent and supervisor-agent deployments, as well as for MCP containers when running in single-node mode.

  Signed-off-by: Louise Champ <myauie@gmail.com>

* feat(helm): add global.vpa to enable VPA across all agent sub-charts

  Introduces global.vpa in the parent chart values so operators can enable VPA on the supervisor and all agent sub-charts with a single flag instead of repeating vpa.enabled=true under each agent section.

  Signed-off-by: Louise Champ <myauie@gmail.com>

* chore: bump chart versions for agent supervisor-agent
* chore: bump chart versions for ai-platform-engineering

* feat(helm): enable VPA in more sub-charts

  - Standardises VPA resource policies by setting `controlledValues` to `RequestsAndLimits` for all VPA resources.
  - Extends VPA support by adding VPA templates and default configurations to: `caipe-ui`, `caipe-ui-mongodb`, `dynamic-agents`, `langgraph-redis`, `slack-bot`, `agent-ontology`, `rag-ingestors` (creates one VPA per ingestor), and `rag-server`.
  - Separates VPA for the main agent container and the standalone MCP HTTP server in the `agent` chart, with a dedicated VPA resource for MCP when running as a deployment.
  - Adjusts VPA rendering logic to handle single-node deployment mode and remote agents for better compatibility.

  Signed-off-by: Louise Champ <myauie@gmail.com>

* chore: bump chart versions for agent caipe-ui-mongodb dynamic-agents langgraph-redis slack-bot supervisor-agent agent-ontology rag-ingestors rag-server
* chore: bump chart versions for ai-platform-engineering rag-stack

---------

Signed-off-by: Louise Champ <myauie@gmail.com>
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sri Aradhyula <sraradhy@cisco.com>
…tters to fix CVEs (#1248)

Addresses Dependabot security alerts:
- CVE: langchain-openai SSRF DNS rebinding (fix: >=1.1.14)
- CVE: langchain-text-splitters HTMLHeaderTextSplitter SSRF (fix: >=1.1.2)

Changes:
- langchain-openai: 1.0.3/1.1.0/1.1.1 → 1.1.14 across all agents and RAG packages
- langchain-core: 1.2.28 → 1.2.31 (required by langchain-openai 1.1.14)
- langchain-text-splitters: 1.0.0 → 1.1.2 in rag/ingestors and rag/server
- openai: 2.19.0 → 2.32.0 in rag/server (required by langchain-openai 1.1.14)
- Regenerated all 26 affected uv.lock files

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Addresses remaining Dependabot security alerts (151 open):
- pypdf 6.10.0 → 6.10.2 (medium CVE)
- python-multipart, authlib (1.6.11), langsmith (0.7.31): transitive bumps via uv lock upgrade
- langchain-text-splitters: final remaining lock file updated

Changes:
- pypdf==6.10.2 in constraint-dependencies across 19 pyproject.toml files
- Regenerated 38 uv.lock files (incl. mcp/ subworkspaces) upgrading: pypdf, python-multipart, authlib, langsmith, langchain-text-splitters
- scrapy: no fix available upstream, left as-is

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
…smith-vulnerabilities fix(deps): bump pypdf, python-multipart, authlib, langsmith to fix CVEs
All 10 first-party MCP servers previously had no authentication in HTTP mode. This change adds three selectable auth modes via a new shared package mcp-agent-auth, and allows the caller's Authorization Bearer token to serve as the backend service API key, eliminating the need for per-server credential env vars in HTTP deployments.

New package ai_platform_engineering/agents/common/mcp-auth:
- MCPAuthMiddleware: Starlette BaseHTTPMiddleware supporting none, shared_key (hmac.compare_digest), and oauth2 (JWT via JWKS) modes
- get_request_token(): resolves the token from the HTTP request header at call time, falling back to an env var for STDIO backward compatibility

All 10 MCP servers:
- mcp-agent-auth added as a uv path dependency
- MCPAuthMiddleware injected when MCP_MODE=http
- api/client.py updated to use get_request_token()
- Module-level ValueError for token env vars changed to logger.warning
- Webex: per-request _get_token() helper; --auth-token made optional
- VictorOps: OrgCredentials.api_key made Optional
- 21 unit tests covering all auth modes and get_request_token behavior

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Contributor

✅ No proprietary content detected. This PR is clear for review!
Add ContextVar-based token propagation so the bearer token that authenticates each A2A request is automatically forwarded as the Authorization header on every outbound MCP HTTP call made by LangGraph — isolated per async Task with no bleed across requests.

Changes:
- utils/auth/token_context.py: new ContextVar current_bearer_token
- SharedKeyMiddleware, OAuth2Middleware, DualAuthMiddleware: set the ContextVar after successful auth
- mcp_agent_auth/token_context.py: separate ContextVar for the MCP-side token (supports MCP-to-MCP chaining)
- MCPAuthMiddleware: set the mcp ContextVar after auth passes (all modes except none)
- base_langgraph_agent._build_httpx_client_factory: always returns a callable; reads the ContextVar at call time to inject the Authorization header; resolves TBD_USER_JWT stubs in _load_mcp_tools and _setup_mcp_and_graph
- tests/test_token_context.py: 23 unit tests covering ContextVar defaults, all middleware token injection, factory behavior, SSL verify, concurrent task isolation
- tests/test_auth_token_forwarding_e2e.py: 18 e2e tests covering the full A2A->LangGraph->MCP token forwarding chain

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
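The core mechanism is worth a minimal sketch: the ContextVar name matches the change list, but the middleware and client-factory wiring here are simplified stand-ins. contextvars are copied per asyncio Task, so concurrent requests cannot observe each other's tokens:

```python
import asyncio
from contextvars import ContextVar

current_bearer_token: ContextVar = ContextVar("current_bearer_token", default=None)

def outbound_headers() -> dict:
    """Read the ContextVar at call time, as the httpx client factory would."""
    token = current_bearer_token.get()
    return {"Authorization": f"Bearer {token}"} if token else {}

async def handle_request(token: str) -> dict:
    current_bearer_token.set(token)  # what the auth middleware does on success
    await asyncio.sleep(0)           # interleave with other concurrent tasks
    return outbound_headers()        # each task still sees only its own token

async def demo() -> list:
    # Two overlapping requests; each gather task gets its own context copy.
    return await asyncio.gather(handle_request("tok-a"), handle_request("tok-b"))
```

Reading the variable inside the factory at call time, rather than capturing it when the client is built, is what keeps the header correct per request.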
Excerpts from the changed test files under review:

```python
@pytest.mark.anyio
async def test_factory_token_does_not_leak_after_request(self):
    """After a request completes the ContextVar is restored to its prior value."""
    factory = MagicMock()
```

```python
import asyncio
import importlib
import os
from typing import Any
```

```python
import importlib
import os
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch
```

```python
import asyncio
import importlib
import os
from unittest.mock import patch, MagicMock
```

```python
assert len(captured_client) == 1
auth = captured_client[0].headers.get("authorization")
assert auth == f"Bearer {SHARED_KEY}"
import asyncio; asyncio.run(captured_client[0].aclose())
```

```python
# ContextVar is None (no middleware set it)
client = factory()
assert "authorization" not in {k.lower() for k in client.headers}
import asyncio; asyncio.run(client.aclose())
```
Summary

- New shared package `ai_platform_engineering/agents/common/mcp-auth/` (`mcp-agent-auth`) providing `MCPAuthMiddleware` and `get_request_token()` for all MCP servers
- `MCP_AUTH_MODE` env var: `none` (default, backward-compat), `shared_key` (constant-time HMAC compare vs `MCP_SHARED_KEY`), `oauth2` (JWT validation via JWKS)
- The `Authorization: Bearer <token>` header that authenticates the MCP connection is also used as the backend service API key — no per-server credential env vars required in HTTP deployments
- `get_request_token` resolution order

How it works

Incoming auth (HTTP transport only):

Backend token resolution (all transports):

In `shared_key` mode, the same bearer token serves as both the MCP auth credential and the backend API key — a single credential for both hops.

Test plan

- `PYTHONPATH=. uv run pytest tests/test_mcp_auth_middleware.py -v` — 21 tests pass
- `make lint` — clean
- `MCP_AUTH_MODE=none` + backend token in env → tools work (backward compat)
- `none`: no Authorization header required → tools work with env var token
- `shared_key`: correct `MCP_SHARED_KEY` → 200; wrong/missing → 401
- `shared_key`, no env token: `Authorization: Bearer <api-token>` → used for backend calls
- `oauth2`: valid JWT → 200; invalid/expired → 401
- `/healthz` public path bypasses auth in all modes

Notes

- `MCP_AUTH_MODE` is a new env var; existing deployments continue to work unchanged (defaults to `none`)
- Module-level `ValueError` checks for token env vars at startup now emit `logger.warning` — servers can start without pre-set credentials when using caller-provided tokens
- `--auth-token` CLI option changed from `required=True` to optional; a `_get_token()` helper resolves per-request in HTTP mode and falls back to the startup token in STDIO mode

🤖 Generated with Claude Code
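The `shared_key` check named in the summary relies on a constant-time comparison. A one-function sketch (the function name and signature are assumptions; only `hmac.compare_digest` is confirmed by the PR text):

```python
import hmac

def shared_key_ok(presented: str, expected: str) -> bool:
    """Compare the presented bearer token against MCP_SHARED_KEY.

    hmac.compare_digest takes time independent of where the strings first
    differ, which prevents an attacker from recovering the key byte-by-byte
    via response-timing measurements.
    """
    return hmac.compare_digest(presented.encode(), expected.encode())
```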