feat(mcp): add HTTP auth middleware and caller-provided backend tokens #1253

Open
sriaradhyula wants to merge 55 commits into 0.5.0 from 101-mcp-auth-caller-key
Conversation

@sriaradhyula
Member

Summary

  • New shared package ai_platform_engineering/agents/common/mcp-auth/ (mcp-agent-auth) providing MCPAuthMiddleware and get_request_token() for all MCP servers
  • Three auth modes selectable via MCP_AUTH_MODE env var: none (default, backward-compat), shared_key (constant-time HMAC compare vs MCP_SHARED_KEY), oauth2 (JWT validation via JWKS)
  • Caller-provided backend tokens: the Authorization: Bearer <token> header that authenticates the MCP connection is also used as the backend service API key — no per-server credential env vars required in HTTP deployments
  • All 10 MCP servers updated: argocd, backstage, confluence, jira, komodor, netutils, pagerduty, splunk, victorops, webex
  • 21 unit tests covering all auth modes, public path bypass, SSE error format, and get_request_token resolution order
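
The constant-time compare described for shared_key mode can be sketched as follows (a minimal illustration, not the actual mcp-agent-auth code; the helper name and header parsing are assumptions):

```python
import hmac
import os
from typing import Optional

def check_shared_key(authorization_header: Optional[str]) -> bool:
    """Hypothetical sketch of shared_key mode: compare the Bearer token
    against MCP_SHARED_KEY with hmac.compare_digest, which avoids leaking
    key length/prefix information through timing differences."""
    shared_key = os.environ.get("MCP_SHARED_KEY", "")
    if not shared_key or not authorization_header:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer" or not token:
        return False
    return hmac.compare_digest(token.encode(), shared_key.encode())
```

A wrong key, a missing header, or a non-Bearer scheme all fail closed.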

How it works

Incoming auth (HTTP transport only):

MCP_AUTH_MODE=none           # pass-through (default)
MCP_AUTH_MODE=shared_key     # validates Bearer token == MCP_SHARED_KEY
MCP_AUTH_MODE=oauth2         # validates JWT via JWKS_URI / AUDIENCE / ISSUER

Backend token resolution (all transports):

# In HTTP mode: reads Authorization: Bearer header from live request
# In STDIO mode: falls back to env var (e.g. ARGOCD_API_TOKEN)
token = get_request_token("ARGOCD_API_TOKEN")

In shared_key mode, the same bearer token serves as both the MCP auth credential and the backend API key — a single credential for both hops.

Test plan

  • PYTHONPATH=. uv run pytest tests/test_mcp_auth_middleware.py -v — 21 tests pass
  • make lint — clean
  • STDIO mode: MCP_AUTH_MODE=none + backend token in env → tools work (backward compat)
  • HTTP mode, none: no Authorization header required → tools work with env var token
  • HTTP mode, shared_key: correct MCP_SHARED_KEY → 200; wrong/missing → 401
  • HTTP mode, shared_key, no env token: Authorization: Bearer <api-token> → used for backend calls
  • HTTP mode, oauth2: valid JWT → 200; invalid/expired → 401
  • /healthz public path bypasses auth in all modes

Notes

  • MCP_AUTH_MODE is a new env var; existing deployments continue to work unchanged (defaults to none)
  • Backend token env vars that previously raised ValueError at startup now emit logger.warning — servers can start without pre-set credentials when using caller-provided tokens
  • Webex special case: --auth-token CLI option changed from required=True to optional; _get_token() helper resolves per-request in HTTP mode, falls back to startup token in STDIO mode

🤖 Generated with Claude Code

sriaradhyula and others added 30 commits April 15, 2026 08:50
* fix(ci): bump Go 1.25→1.26 and langgraph 1.0.10→1.1.6

The github MCP Docker build fails because go.mod requires go>=1.26.2
but Dockerfile.mcp pins golang:1.25-alpine (running go 1.25.9).

Three sub-packages (agent_ontology, multi_agents, splunk/agent_splunk)
still pinned langgraph==1.0.10, which is incompatible with the
transitively resolved langgraph-prebuilt==1.0.9 that imports
ExecutionInfo from langgraph.runtime — a symbol only present in
langgraph>=1.1.0. Align these to 1.1.6 matching the rest of the repo.

Also bumps langchain 1.1.3→1.2.15 in agent_ontology (langchain 1.1.3
caps langgraph<1.1.0) and removes a duplicate langgraph line in
multi_agents/pyproject.toml.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(ci): move Grype container scan and integration test to post-release

Remove tag-push triggers from security-scan.yml and
tests-quick-sanity-integration-on-latest-tag.yml — images are not yet
published when the tag event fires, causing MANIFEST_UNKNOWN failures.

Instead, the release-finalize workflow now dispatches both workflows
via workflow_dispatch after all CI builds pass and the release is
published, ensuring container images are available.

Also bumps release-finalize permissions from actions:read to
actions:write so it can dispatch workflows.

Assisted-by: Claude:claude-opus-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Covers MongoDB schema changes (agent_configs→agent_skills rename,
feedback extraction, new collections), Helm value changes (Slack MCP
image swap, Langfuse removal, checkpoint persistence), environment
variable changes (DISTRIBUTED_AGENTS), RAG entity rename, dependency
upgrades, rollback scripts, and pre/post-upgrade checklists.

Assisted-by: Claude:claude-opus-4-6

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Bumps the npm_and_yarn group with 1 update in the /docs directory: [dompurify](https://github.com/cure53/DOMPurify).


Updates `dompurify` from 3.3.3 to 3.4.0
- [Release notes](https://github.com/cure53/DOMPurify/releases)
- [Commits](cure53/DOMPurify@3.3.3...3.4.0)

---
updated-dependencies:
- dependency-name: dompurify
  dependency-version: 3.4.0
  dependency-type: indirect
  dependency-group: npm_and_yarn
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
)

* SDPL-1601 fix(middleware): allow LLM to synthesize after RAG cap exhaustion

[GAI]

When the RAG search cap is exhausted and the agent keeps calling search,
DeterministicTaskMiddleware injects a ToolMessage telling the agent to
synthesize from what was already retrieved — but then immediately set
jump_to="end", skipping the LLM's chance to actually produce a response.
This caused "I've completed your request." empty responses.

Remove jump_to="end" from the RAG loop termination path so the agent
gets one more LLM turn to generate a real answer from retrieved context.

Signed-off-by: Erik Lutz <elutz@splunk.com>

* fix(middleware): improve RAG loop synthesis logging message

Update the log message from "terminating" to "allowing LLM one more turn to synthesize response"
to accurately reflect the behavior — we inject a synthesis prompt and return to the model,
we don't terminate the graph. The misleading log message was confusing during debugging.

Assisted-by: Claude:claude-haiku-4-5-20251001-v1
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(middleware): add debug logging and inline comments for RAG loop detection

- Add debug log showing which RAG tools are capped (helps diagnose RAG loop scenarios)
- Add comprehensive inline comments explaining the RAG loop detection logic
- Change exception logging from debug to warning level with full traceback (exc_info=True)
- Clarify that we return to the model WITHOUT jump_to:"end" so LLM gets a synthesis turn

This improves debuggability when RAG caps are exhausted.

Assisted-by: Claude:claude-haiku-4-5-20251001-v1
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

---------

Signed-off-by: Erik Lutz <elutz@splunk.com>
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: Erik Lutz <elutz@splunk.com>
Co-authored-by: Sri Aradhyula <sraradhy@cisco.com>
- Quote associative array keys in declare statements to prevent treating them as unbound variables under set -u
- Use parameter expansion default `[@]:-` syntax for array expansions that may be unset, matching bash best practices for nounset-safe iteration
- Fixes errors: "argocd: unbound variable" at line 1484 and "PF_PIDS[@]: unbound variable" at line 97
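
The nounset-safe patterns described above look like this in a generic sketch (not the script's actual lines):

```shell
set -u  # nounset: expanding an unset variable is a fatal error

# Quoted key in the declare statement; unquoted keys can be parsed as
# variable references and trip "unbound variable" under set -u.
declare -A agents=( ["argocd"]="enabled" )

PF_PIDS=()  # may legitimately be empty at cleanup time

# "${arr[@]:-}" substitutes an empty default instead of erroring when the
# array is unset or empty; the -n guard skips the resulting empty word.
count=0
for pid in "${PF_PIDS[@]:-}"; do
    [ -n "$pid" ] && count=$((count + 1))
done
```

With an empty array the loop body never counts anything, instead of aborting the whole script.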

Assisted-by: Claude:claude-haiku-4-5-20251001-v1:0

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
…nt writes (#1234)

Problem
-------
The caipe-dev supervisor pod crashes repeatedly (12+ restarts / 21h) with:

  Primary stream failed: update command document too large

Root cause: the LangGraph MongoDBSaver persists the FULL graph state on
every step. In single-node mode the state includes two large channels that
are always re-populated from the in-process skill cache at the start of
every graph invocation and therefore have no business being in MongoDB:

  files["/skills/*"]
    Raw SKILL.md content injected by
    skills_middleware.backend_sync.build_skills_files().
    56 skills across 3 hub sources produce 422 file entries (~1-3 MB).

  skills_metadata
    Processed skill catalogue built by
    deepagents.middleware.skills.SkillsMiddleware.before_model() from
    the files channel on every graph step (~1-2 MB).

As conversation state grows (messages + todos + tasks + files + skills_metadata),
the BSON update document exceeds MongoDB's hard 16 MB limit and the
supervisor crashes. Nginx then returns 503 until the pod recovers (~2 min).

Fix
---
Add two new helper functions alongside the existing _truncate_large_messages:

  _strip_skills_from_checkpoint(checkpoint)
    Called in _LazyAsyncMongoDBSaver.aput().
    - Removes all files entries whose key starts with "/skills/" while
      preserving user-created files (write_file / read_file output) so
      cross-turn file-passing continues to work.
    - Removes the skills_metadata channel entirely.

  _strip_skills_from_writes(writes)
    Called in _LazyAsyncMongoDBSaver.aput_writes().
    - Drops any ("skills_metadata", ...) write entry.
    - Strips "/skills/*" keys from any ("files", {...}) write entry.

Both functions only modify the copy that is persisted to MongoDB. The
live in-memory LangGraph state is unaffected, so SkillsMiddleware
continues to see the full skill catalogue for the duration of the
current turn.
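
A minimal sketch of the checkpoint-stripping logic (illustrative; the key names follow the description above, and the checkpoint layout is an assumption):

```python
from typing import Any, Dict

def strip_skills_from_checkpoint(checkpoint: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy safe to persist: drop /skills/* file entries and the
    skills_metadata channel, preserving user-created files so cross-turn
    file passing keeps working. The input dict is left untouched."""
    persisted = dict(checkpoint)
    channels = dict(persisted.get("channel_values", {}))
    files = channels.get("files")
    if isinstance(files, dict):
        channels["files"] = {
            k: v for k, v in files.items() if not k.startswith("/skills/")
        }
    channels.pop("skills_metadata", None)
    persisted["channel_values"] = channels
    return persisted
```

Only the copy handed to MongoDB is trimmed; the live in-memory state keeps the full catalogue for the current turn.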

Why this is safe
----------------
agent.py:AIPlatformEngineerA2ABinding.stream() (and the equivalent
deep_agent.py:serve()/serve_stream() paths) already inject

  state_dict["files"] = dict(self._mas_instance._skills_files)

on EVERY call before graph.astream(). LangGraph reads the checkpoint
ONCE at the start of an invocation and then merges the inputs dict into
the recovered state via file_reducer ({**checkpoint_files, **input_files}).
Skills are therefore always present in the in-memory state for the entire
turn even though they are absent from the persisted checkpoint. Between
turns (pod restarts, HITL resumes) the next stream() call re-injects them.
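
The merge semantics relied on here are a plain dict overlay (illustration of the described file_reducer behavior, not the library's actual code):

```python
def file_reducer(checkpoint_files: dict, input_files: dict) -> dict:
    """Later dict wins on key collisions: inputs overlay the recovered checkpoint."""
    return {**checkpoint_files, **input_files}

# Checkpoint was persisted without skills; stream() re-injects them per call:
checkpoint_files = {"/notes.txt": "user data"}            # skills stripped at aput()
injected = {"/skills/argocd/SKILL.md": "skill body"}      # from the in-process cache
merged = file_reducer(checkpoint_files, injected)
```

So the persisted checkpoint can stay skill-free while every invocation still sees the full file set.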

Observed impact
---------------
- files channel: -422 entries (~1-3 MB) stripped per checkpoint write
- skills_metadata channel: ~1-2 MB stripped per checkpoint write
- Total checkpoint size reduction: 2-5 MB per write
- With a 16 MB BSON limit, this gives ~3x more headroom for conversation
  state (messages, todos, tasks) before the limit is reached



Assisted-by: Claude:claude-sonnet-4-6

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
…1237)

PlatformEngineerResponse.is_task_complete, require_user_input, and
Metadata.user_input had no defaults. A partial LLM response (common
for read-only tasks like doc retrieval) omitting these fields failed
Pydantic validation, triggering ModelRetryMiddleware to retry 5× before
bailing with on_failure="continue" and emitting no response.

Add sensible defaults (is_task_complete=True, require_user_input=False,
user_input=False) so partial ResponseFormat calls succeed without retries.
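
The shape of the fix, as a hypothetical reconstruction (field names come from the message above; the real models carry more fields):

```python
from pydantic import BaseModel, Field

class Metadata(BaseModel):
    # Previously required; partial LLM output omitted it and failed validation.
    user_input: bool = False

class PlatformEngineerResponse(BaseModel):
    is_task_complete: bool = True       # default: treat a partial response as done
    require_user_input: bool = False
    metadata: Metadata = Field(default_factory=Metadata)

# A partial payload (e.g. a doc-retrieval answer) now validates cleanly:
resp = PlatformEngineerResponse()
```

With defaults in place, ModelRetryMiddleware no longer burns five retries on a structurally valid but partial response.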

Update test_response_format.py to reflect the new behavior and add
regression tests in test_supervisor_fetch_document_cap_e2e.py that
validate the exact partial-payload shapes that were previously failing.

Assisted-by: Claude:claude-sonnet-4-6

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
* fix(ui): don't persist session-expired errors in chat history

When the A2A client receives a 401, it throws a 'Session expired:' error.
Previously this was appended to the assistant message and persisted in
MongoDB, so recovering a chat would show the error inline. The
TokenExpiryGuard already handles the UX (modal + redirect), so skip
appendToMessage for session expiry errors across all 4 catch blocks in
ChatPanel and DynamicAgentChatPanel.

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(ui): add tests for session-expired error suppression in chat

Verify that appendToMessage is NOT called when the A2A client throws
'Session expired: …', and that other errors (network, backend) are
still appended inline as expected.

Also adds @testing-library/dom as an explicit dev dependency.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
The devDependency used ^10.4.1 (semver range), which fails the
pinned-deps CI check that requires all Node deps to be exact versions.

Assisted-by: Claude:claude-sonnet-4-6

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
#1235)

* fix(a2a): isolate execution plan state per request to prevent cross-session leakage

The AIPlatformEngineerA2ABinding is a singleton shared across all
concurrent chat sessions. Four mutable instance attributes tracking
execution plan state (_execution_plan_sent, _previous_todos,
_task_plan_entries, _in_self_service_workflow) caused plan data from
one user's session to leak into another's when requests overlapped.

Introduces a PlanState dataclass instantiated as a local variable in
stream(), ensuring each request gets its own isolated plan tracking
state. Helper methods _build_todo_plan_text() and _build_task_plan_text()
now accept PlanState as a parameter instead of reading instance state.

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(test): correct plan_step_id assertion on post-plan streaming chunks

The test asserted post-plan streaming_result chunks should NOT carry
plan_step_id, but agent_executor.py:830-835 intentionally stamps it so
the UI nests content under plan steps. Updated assertion to match actual
behavior.

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unused import

Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(a2a): add concurrent session isolation tests for PlanState

Simulate two overlapping stream() calls on the same singleton binding
using asyncio barriers and verify plan entries, execution_plan_sent
flag, and sequential state are all per-session isolated.

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: remove unused PlanState import in concurrent test

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
…nce (#1229)

* feat(auth): add dual auth middleware for shared key + OAuth2 coexistence

When both A2A_AUTH_SHARED_KEY and A2A_AUTH_OAUTH2=true are set,
the new DualAuthMiddleware allows both auth methods simultaneously:

1. Shared key → immediate access (A2A machine-to-machine clients)
2. OAuth2 JWT → validated via JWKS (UI OIDC flow)

Previously, setting A2A_AUTH_SHARED_KEY would override OAuth2 auth
entirely (if/elif), breaking the UI's OIDC-based backend calls.

The middleware priority chain in main.py is now:
- Both set → DualAuthMiddleware (shared key OR JWT)
- Only shared key → SharedKeyMiddleware
- Only OAuth2 → OAuth2Middleware
- Neither → no auth
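
The priority chain can be sketched as a small selector (illustrative; the class names come from the commit, the construction details are assumed):

```python
import os
from typing import Optional

def select_auth_middleware(env: Optional[dict] = None) -> str:
    """Mirror the documented priority chain in main.py (names only)."""
    env = env if env is not None else os.environ
    shared_key = env.get("A2A_AUTH_SHARED_KEY")
    oauth2 = env.get("A2A_AUTH_OAUTH2", "").lower() == "true"
    if shared_key and oauth2:
        return "DualAuthMiddleware"   # shared key OR JWT accepted
    if shared_key:
        return "SharedKeyMiddleware"
    if oauth2:
        return "OAuth2Middleware"
    return "NoAuth"
```

Checking the combined case first is what replaces the old if/elif that let the shared key silently disable OAuth2.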

verify_token is imported lazily inside dispatch() to avoid pulling
in oauth2_middleware module-level env validation at import time.

Includes unit tests for all auth paths.

Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>

* fix(auth): address review findings in dual auth middleware

- Use hmac.compare_digest() for shared key comparison to prevent timing attacks
- Fix broken test mocking: verify_token mock was being restored before
  requests ran, so OAuth2 fallback tests were hitting real (uninitialized)
  verify_token. Moved mocking to test-level context managers.

Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>

* fix: address PR review feedback

Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>

---------

Signed-off-by: Arthur Drozdov <adrozdov@cisco.com>
Signed-off-by: Ubuntu <ubuntu@ip-10-175-48-77.us-east-2.compute.internal>
…-prefix

feat(helm): allow custom ingress path for supervisor ingress
…gent output

Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
Kevin Kantesaria and others added 20 commits April 17, 2026 14:33
…repeated failures

Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
…ming behavior

Signed-off-by: Kevin Kantesaria <kkantesaria@splunk.com>
[GAI]

Removes the diagnostic logger.info block that logged every A2A event
(type, artifact, text_len, preview, step statuses). This was added for
debugging but clutters production logs and was flagged in PR review.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Erik Lutz <elutz@splunk.com>
…e-io/ai-platform-engineering into fix/slack-streaming-restore-v0241-ux

# Conflicts:
#	ai_platform_engineering/integrations/slack_bot/utils/ai.py
…pts (SDPL-1601)

[GAI]

The system prompt instructed the LLM to call search(keyword_search=True/False)
but this parameter never existed in the tool schema. Pydantic rejected it on
every call, causing 7 traced production failures today where users received
"I ran into an issue" error messages instead of answers.

Fixes both locations where the bad instruction appeared:
- ai_platform_engineering/integrations/slack_bot/utils/config_models.py
- charts/ai-platform-engineering/data/prompt_config.rag.yaml

Replaces keyword_search references with equivalent phrasing guidance that
doesn't reference nonexistent tool parameters.

Assisted-by: Claude:claude-opus-4-7
Signed-off-by: Erik Lutz <elutz@splunk.com>
…ompt

fix(slack-bot): remove nonexistent keyword_search param from RAG search prompts
…41-ux

fix(slack): restore v0.2.41 streaming UX and suppress duplicate sub-agent output
Establishes a consistent `appuser` (UID/GID 1001) for running applications within containers.

Signed-off-by: Louise Champ <myauie@gmail.com>
…1242)

* fix(supervisor): add curl tool to supervisor utility tools

The curl tool was implemented and exported but never added to the
supervisor utility_tools list in _build_graph_async. Without it,
the supervisor only had fetch_url (GET-only) for HTTP requests.
PUT/POST requests would fail with 405 or be silently mishandled,
causing the agent to hallucinate success on write API calls.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(supervisor): suppress curl/fetch_url raw content from stream; restore response_format defaults

curl and fetch_url return large HTTP response bodies that were streaming
directly to clients (Slack appendStream failures, raw HTML in chat).
Add _SUPPRESS_CONTENT_TOOLS constant in agent.py (curl, fetch_url,
fetch_markdown, wget) and suppress their ToolMessage content the same
way RAG tools are suppressed — the tool_call completion notification
is sufficient for the client.

Also restore Metadata.user_input, PlatformEngineerResponse.is_task_complete,
and require_user_input defaults that were accidentally dropped in this
branch. Partial LLM responses omitting these fields were triggering Pydantic
ValidationError → ModelRetryMiddleware retry loop (5×) → bailout after 1 step.

Update test_missing_user_input_raises_validation_error → defaults_to_false
to reflect the lenient-parsing design.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* refactor(tools): replace fetch_url with curl; add strip_html param

fetch_url was a GET-only requests.get() wrapper — a strict subset of
what curl already handles. Remove it from utility_tools and add an
optional strip_html=True parameter to the curl tool that runs
BeautifulSoup stripping on the response, preserving the one unique
capability fetch_url had.

Also remove fetch_url from _SUPPRESS_CONTENT_TOOLS since it is no
longer a supervisor tool.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): add guardrails — https-only, block file-write flags

- Reject any URL with a scheme other than https:// to prevent
  http://, file://, ftp:// etc.
- Block -o/--output and --config/-K flags that write to disk or
  read curl config files from LLM-generated commands.
- Validation runs on parsed args before subprocess.run so blocked
  commands never reach the shell.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): replace terse errors with detailed user-facing messages

Guardrail rejections now explain what was blocked, why, and what
is supported — so the LLM can relay the information clearly instead
of surfacing a bare "ERROR:" string.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): unblock -o/--output, only block --config/-K

-o is needed for legitimate use cases (e.g. saving API responses
to pass to jq). Only --config/-K (reads curl config from disk)
remains blocked.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* fix(curl): simplify guardrails to https-only URL check

Remove flag blocklist entirely — only enforce https:// URLs.
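
After this simplification the guardrail reduces to a small pre-subprocess check, sketched here under the stated behavior (the function name and error wording are hypothetical):

```python
import shlex
from typing import Optional

def validate_curl_args(command: str) -> Optional[str]:
    """Return a user-facing error string, or None if the command may run.

    Runs on parsed args before subprocess.run, so rejected commands
    never reach the shell."""
    try:
        args = shlex.split(command)
    except ValueError as exc:
        return f"Could not parse command: {exc}"
    for arg in args:
        if "://" in arg:
            scheme = arg.split("://", 1)[0].lower()
            if scheme != "https":
                return (
                    f"Blocked: only https:// URLs are supported, got {scheme}://. "
                    "Use the service's https endpoint instead."
                )
    return None
```

Returning a descriptive string rather than a bare "ERROR:" lets the LLM relay why the call was refused.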

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(curl): add unit and e2e tests for curl tool guardrails and supervisor wiring

Unit tests (test_curl_tool.py):
- _validate_curl_args: https accepted, http/file/ftp rejected,
  error messages contain scheme and list supported alternatives,
  query strings stripped from error output
- Argument handling: curl prefix auto-prepended, not doubled,
  PUT/body args passed through, invalid shlex rejected
- Subprocess results: success, empty output, nonzero exit,
  stderr appended, timeout, curl not found
- strip_html: tags stripped, raw returned when False, default is False

E2E tests (TestCurlToolInSupervisor):
- curl present in tools after _build_graph()
- fetch_url absent from tools after _build_graph()
- strip_html and timeout params exposed on the tool
- http:// and file:// URLs return informative user-facing strings

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* feat(rag): add tool arbitration guidance — KB-first, curl for execution only

Add tool_arbitration_prompt section to prompt_config.rag.yaml that
explicitly tells the LLM to use the knowledge base (search/fetch_document)
first for discovery and documentation, and reserve curl for state-changing
API calls and live data not available in the KB.

Wire _TOOL_ARBITRATION_PROMPT into both _RAG_ONLY_INSTRUCTIONS and
_RAG_WITH_GRAPH_INSTRUCTIONS so the rule applies regardless of whether
graph RAG is enabled.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* feat(rag): add explicit-curl and live-data exceptions to arbitration rule

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* feat(rag): rephrase curl arbitration exception

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* feat(rag): finalize curl arbitration wording

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* refactor(agent): remove _SUPPRESS_CONTENT_TOOLS, inline curl check

The frozenset only covered curl in practice (fetch_markdown/wget are
not supervisor tools). Inline tool_name == "curl" directly.

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* docs(curl): expand docstring with wget/fetch_markdown equivalent examples

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* docs(curl): remove internal reference to wget/fetch_markdown

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* test(curl): add file download test; fix stdout mock for -o flag

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>

* chore: bump chart versions for ai-platform-engineering

---------

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
* feat: add vertical pod autoscaler support

- Add Vertical Pod Autoscaler capabilities for agent and supervisor-agent deployments, as well as for MCP containers when running in single-node mode

Signed-off-by: Louise Champ <myauie@gmail.com>

* feat(helm): add global.vpa to enable VPA across all agent sub-charts

Introduces global.vpa in the parent chart values so operators can enable
VPA on the supervisor and all agent sub-charts with a single flag instead
of repeating vpa.enabled=true under each agent section.

Signed-off-by: Louise Champ <myauie@gmail.com>

* chore: bump chart versions for agent supervisor-agent

* chore: bump chart versions for ai-platform-engineering

* feat(helm): enable VPA in more sub-charts

- Standardises VPA resource policies by setting `controlledValues` to `RequestsAndLimits` for all VPA resources.
- Extends VPA support by adding VPA templates and default configurations to:
  - `caipe-ui`
  - `caipe-ui-mongodb`
  - `dynamic-agents`
  - `langgraph-redis`
  - `slack-bot`
  - `agent-ontology`
  - `rag-ingestors` (creates one VPA per ingestor)
  - `rag-server`
- Separates VPA for the main agent container and the standalone MCP HTTP server in the `agent` chart, with a dedicated VPA resource for MCP when running as a deployment.
- Adjusts VPA rendering logic to handle single-node deployment mode and remote agents for better compatibility.

Signed-off-by: Louise Champ <myauie@gmail.com>

* chore: bump chart versions for agent caipe-ui-mongodb dynamic-agents langgraph-redis slack-bot supervisor-agent agent-ontology rag-ingestors rag-server

* chore: bump chart versions for ai-platform-engineering rag-stack

---------

Signed-off-by: Louise Champ <myauie@gmail.com>
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sri Aradhyula <sraradhy@cisco.com>
…tters to fix CVEs (#1248)

Addresses Dependabot security alerts:
- CVE: langchain-openai SSRF DNS rebinding (fix: >=1.1.14)
- CVE: langchain-text-splitters HTMLHeaderTextSplitter SSRF (fix: >=1.1.2)

Changes:
- langchain-openai: 1.0.3/1.1.0/1.1.1 → 1.1.14 across all agents and RAG packages
- langchain-core: 1.2.28 → 1.2.31 (required by langchain-openai 1.1.14)
- langchain-text-splitters: 1.0.0 → 1.1.2 in rag/ingestors and rag/server
- openai: 2.19.0 → 2.32.0 in rag/server (required by langchain-openai 1.1.14)
- Regenerated all 26 affected uv.lock files

Assisted-by: Claude:claude-sonnet-4-6

Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
Addresses remaining Dependabot security alerts (151 open):
- pypdf 6.10.0 → 6.10.2 (medium CVE)
- python-multipart, authlib (1.6.11), langsmith (0.7.31): transitive bumps via uv lock upgrade
- langchain-text-splitters: final remaining lock file updated

Changes:
- pypdf==6.10.2 in constraint-dependencies across 19 pyproject.toml files
- Regenerated 38 uv.lock files (incl. mcp/ subworkspaces) upgrading:
  pypdf, python-multipart, authlib, langsmith, langchain-text-splitters
- scrapy: no fix available upstream, left as-is

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
…smith-vulnerabilities

fix(deps): bump pypdf, python-multipart, authlib, langsmith to fix CVEs
All 10 first-party MCP servers previously had no authentication in HTTP
mode. This change adds three selectable auth modes via a new shared
package mcp-agent-auth, and allows the caller Authorization Bearer token
to serve as the backend service API key, eliminating the need for
per-server credential env vars in HTTP deployments.

New package ai_platform_engineering/agents/common/mcp-auth:
- MCPAuthMiddleware: Starlette BaseHTTPMiddleware supporting none,
  shared_key (hmac.compare_digest), and oauth2 (JWT via JWKS) modes
- get_request_token(): resolves token from HTTP request header at call
  time, falling back to env var for STDIO backward compatibility

All 10 MCP servers:
- mcp-agent-auth added as uv path dependency
- MCPAuthMiddleware injected when MCP_MODE=http
- api/client.py updated to use get_request_token()
- Module-level ValueError for token env vars changed to logger.warning
- Webex: per-request _get_token() helper; --auth-token made optional
- VictorOps: OrgCredentials.api_key made Optional
- 21 unit tests covering all auth modes and get_request_token behavior

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
@github-actions
Contributor

✅ No proprietary content detected. This PR is clear for review!

Add ContextVar-based token propagation so the bearer token that
authenticates each A2A request is automatically forwarded as the
Authorization header on every outbound MCP HTTP call made by
LangGraph — isolated per async Task with no bleed across requests.
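
The per-Task isolation this relies on can be shown in a minimal sketch (illustrative; the real code lives in token_context.py and the middlewares):

```python
import asyncio
from contextvars import ContextVar
from typing import Dict, Optional

current_bearer_token: ContextVar[Optional[str]] = ContextVar(
    "current_bearer_token", default=None
)

def outbound_headers() -> Dict[str, str]:
    """Read the ContextVar at call time, as the httpx client factory does."""
    token = current_bearer_token.get()
    return {"Authorization": f"Bearer {token}"} if token else {}

async def handle_request(token: str) -> Dict[str, str]:
    # Each asyncio Task runs in its own copy of the context, so concurrent
    # requests never observe each other's tokens.
    current_bearer_token.set(token)
    await asyncio.sleep(0)  # yield so the two tasks interleave
    return outbound_headers()

async def _demo():
    return await asyncio.gather(handle_request("alice"), handle_request("bob"))

results = asyncio.run(_demo())
```

The set() inside each task never escapes to the parent context, which is exactly the no-bleed property claimed above.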

Changes:
- utils/auth/token_context.py: new ContextVar current_bearer_token
- SharedKeyMiddleware, OAuth2Middleware, DualAuthMiddleware: set
  ContextVar after successful auth
- mcp_agent_auth/token_context.py: separate ContextVar for MCP-side
  token (supports MCP-to-MCP chaining)
- MCPAuthMiddleware: set mcp ContextVar after auth passes (all modes
  except none)
- base_langgraph_agent._build_httpx_client_factory: always returns a
  callable; reads ContextVar at call time to inject Authorization
  header; resolves TBD_USER_JWT stubs in _load_mcp_tools and
  _setup_mcp_and_graph
- tests/test_token_context.py: 23 unit tests covering ContextVar
  defaults, all middleware token injection, factory behavior, SSL
  verify, concurrent task isolation
- tests/test_auth_token_forwarding_e2e.py: 18 e2e tests covering the
  full A2A->LangGraph->MCP token forwarding chain

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Sri Aradhyula <sraradhy@cisco.com>
@github-actions
Contributor

✅ No proprietary content detected. This PR is clear for review!

import asyncio
import importlib
import os
from typing import Any
from unittest.mock import AsyncMock, MagicMock, patch

@pytest.mark.anyio
async def test_factory_token_does_not_leak_after_request(self):
    """After a request completes the ContextVar is restored to its prior value."""
    factory = MagicMock()

assert len(captured_client) == 1
auth = captured_client[0].headers.get("authorization")
assert auth == f"Bearer {SHARED_KEY}"
asyncio.run(captured_client[0].aclose())

# ContextVar is None (no middleware set it)
client = factory()
assert "authorization" not in {k.lower() for k in client.headers}
asyncio.run(client.aclose())
@sriaradhyula sriaradhyula marked this pull request as ready for review April 18, 2026 10:43
@sriaradhyula sriaradhyula changed the base branch from main to 0.5.0 April 20, 2026 03:36