This document describes KubeLens AI runtime components, boundaries, and data flow. Documentation review refresh: 2026-05-07 (no architecture changes required).

```mermaid
flowchart TD
    Browser["Browser UI (React + Vite)"] -->|HTTP + WS/SSE| API

    subgraph API_RUNTIME ["Go API Runtime"]
        Router["HTTP Router + Middleware"] --> Auth["Auth + RBAC + Write Gate"]
        Router --> Audit["Audit Logger"]
        Router --> Cluster["Cluster Service"]
        Router --> Intel["Deterministic Intelligence"]
        Router --> Incident["Incident/Remediation Services"]
        Router --> Memory["Runbook/Fix Memory Store"]
        Router --> Assistant["Assistant Orchestrator"]
        Router --> Alerts["Alert Dispatch + Lifecycle"]
        Router --> Stream["Event Bus + Stream Endpoints"]
        Router --> Obs["Request Metrics + OTEL"]
    end

    Cluster --> State["Cluster State Cache"]
    State --> K8S["Kubernetes API"]
    State --> MetricsServer["metrics.k8s.io"]
    Intel --> Plugins["Diagnostic Plugins"]
    Assistant --> RAG["RAG Retriever"]
    Assistant --> LLM["OpenAI-compatible Provider (optional)"]
    Router --> PredictorClient["Predictor Client (optional)"] --> Predictor["FastAPI Predictor"]
    Alerts --> Webhooks["Alertmanager / Slack / PagerDuty"]
```


| Layer | Responsibility |
|---|---|
| `src/` | UI shell, view routing, feature views, typed API usage |
| `internal/httpapi` | Transport, middleware, route handlers, streaming endpoints |
| `internal/auth` | Principal extraction, role checks, write-gate policy |
| `internal/cluster` | Kubernetes read/write integration and model mapping |
| `internal/state` | Snapshot cache and watcher-fed cluster state |
| `internal/intelligence` + `plugins/*` | Deterministic diagnostics |
| `internal/rag` | Documentation retrieval/ranking and telemetry |
| `internal/incident` | Incident construction and runbook lifecycle |
| `internal/remediation` | Proposal generation + controlled execution |
| `internal/memory` | Persistent runbook and fix pattern storage |
| `internal/postmortem` | Postmortem generation and storage |
| `internal/alerts` + `internal/chatops` | Outbound notifications and alert channels |
| `predictor/app` | External deterministic risk scoring service |

- UI calls `/api/*` inventory endpoints.
- Handlers query the `cluster` service.
- Cluster service uses cache snapshots and/or live client-go queries.
- Results are returned as typed `internal/model` contracts.
- Current snapshot is collected from state/cluster service.
- Deterministic analyzers produce diagnostics.
- Prediction endpoint calls predictor service when configured.
- If predictor is unavailable, backend falls back to deterministic local scoring.
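
The predictor fallback described above can be sketched as follows. This is a minimal illustration, not the project's actual client code: the `RiskScorer` interface, `scoreWithFallback`, the stub types, and the fixed `0.5` score are all hypothetical names and values introduced here.

```go
package main

import (
	"errors"
	"fmt"
)

// RiskScorer is a hypothetical interface; the real predictor client API
// is not shown in this document.
type RiskScorer interface {
	Score(workload string) (float64, error)
}

// unavailablePredictor simulates a predictor service that cannot be reached.
type unavailablePredictor struct{}

func (unavailablePredictor) Score(string) (float64, error) {
	return 0, errors.New("predictor unavailable")
}

// localScorer stands in for deterministic local scoring.
// The fixed value is a placeholder so the result is reproducible.
type localScorer struct{}

func (localScorer) Score(string) (float64, error) {
	return 0.5, nil
}

// scoreWithFallback uses the remote predictor when one is configured
// (remote != nil) and it answers without error; otherwise it falls back
// to local scoring. The second return value names the path that was used.
func scoreWithFallback(remote, local RiskScorer, workload string) (float64, string, error) {
	if remote != nil {
		if score, err := remote.Score(workload); err == nil {
			return score, "predictor", nil
		}
	}
	score, err := local.Score(workload)
	if err != nil {
		return 0, "", fmt.Errorf("local scoring failed: %w", err)
	}
	return score, "local", nil
}

func main() {
	score, source, err := scoreWithFallback(unavailablePredictor{}, localScorer{}, "payments-api")
	fmt.Println(score, source, err) // prints: 0.5 local <nil>
}
```

Returning the source alongside the score lets a caller surface which path produced the result, which may be useful when the predictor is optional.
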
- Assistant request enters `/api/assistant`.
- Backend assembles deterministic context: diagnostics, cluster state, incidents, memory.
- Optional RAG references are retrieved and ranked.
- Optional LLM provider enriches response; deterministic fallback remains available.
- Incident is created from current diagnostics/snapshot.
- Remediation proposals are generated from diagnostics and linked to incidents.
- Approved proposals can be executed via guarded write routes.
- Executed outcomes feed memory fixes and postmortem content.
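
One way to picture the proposal lifecycle above is a small state machine in which execution is only reachable from the approved state. The sketch below assumes three statuses and the type and method names shown; none of them are taken from `internal/remediation`, and the real service may track more states.

```go
package main

import (
	"errors"
	"fmt"
)

// Status values are assumptions made for this sketch.
type Status string

const (
	StatusProposed Status = "proposed"
	StatusApproved Status = "approved"
	StatusExecuted Status = "executed"
)

// ErrInvalidTransition is returned when a step is attempted out of order.
var ErrInvalidTransition = errors.New("invalid status transition")

// Proposal is a hypothetical remediation proposal linked to an incident.
type Proposal struct {
	ID         string
	IncidentID string
	Status     Status
	ApprovedBy string
}

// Approve moves a proposed remediation to approved and records the approver.
func (p *Proposal) Approve(by string) error {
	if p.Status != StatusProposed {
		return ErrInvalidTransition
	}
	p.Status = StatusApproved
	p.ApprovedBy = by
	return nil
}

// Execute runs the action only for approved proposals and marks the
// proposal executed when the action succeeds.
func (p *Proposal) Execute(run func() error) error {
	if p.Status != StatusApproved {
		return ErrInvalidTransition
	}
	if err := run(); err != nil {
		return fmt.Errorf("execute proposal %s: %w", p.ID, err)
	}
	p.Status = StatusExecuted
	return nil
}

func main() {
	p := &Proposal{ID: "p-1", IncidentID: "inc-1", Status: StatusProposed}
	fmt.Println(p.Execute(func() error { return nil })) // rejected: not yet approved
	fmt.Println(p.Approve("alice"))
	fmt.Println(p.Execute(func() error { return nil }))
	fmt.Println(p.Status)
}
```
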
- Runtime publishes events to in-process bus.
- Clients subscribe over `/api/stream` (SSE) or `/api/stream/ws` (WebSocket).
- Request-level and action-level audit entries are persisted in bounded audit storage.
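
As an illustration of the in-process bus, the following sketch fans events out to subscriber channels, which an SSE or WebSocket handler could then drain. The `Bus` and `Event` types and the drop-when-full policy are assumptions made for this example; the actual bus may buffer or apply backpressure differently.

```go
package main

import (
	"fmt"
	"sync"
)

// Event is a hypothetical event shape for this sketch.
type Event struct {
	Type    string
	Payload string
}

// Bus fans events out to subscribers inside a single process.
type Bus struct {
	mu   sync.RWMutex
	subs map[int]chan Event
	next int
}

func NewBus() *Bus {
	return &Bus{subs: make(map[int]chan Event)}
}

// Subscribe registers a buffered channel and returns it together with
// an unsubscribe function that removes and closes the channel.
func (b *Bus) Subscribe(buffer int) (<-chan Event, func()) {
	b.mu.Lock()
	defer b.mu.Unlock()
	id := b.next
	b.next++
	ch := make(chan Event, buffer)
	b.subs[id] = ch
	return ch, func() {
		b.mu.Lock()
		defer b.mu.Unlock()
		if _, ok := b.subs[id]; ok {
			delete(b.subs, id)
			close(ch)
		}
	}
}

// Publish delivers the event to every subscriber without blocking.
// A subscriber whose buffer is full misses the event (assumed policy).
// It returns the number of subscribers that received the event.
func (b *Bus) Publish(e Event) int {
	b.mu.RLock()
	defer b.mu.RUnlock()
	delivered := 0
	for _, ch := range b.subs {
		select {
		case ch <- e:
			delivered++
		default:
		}
	}
	return delivered
}

func main() {
	bus := NewBus()
	events, cancel := bus.Subscribe(8)
	defer cancel()

	bus.Publish(Event{Type: "incident.created", Payload: "inc-1"})
	e := <-events
	fmt.Println(e.Type, e.Payload)
}
```

Non-blocking publish keeps a slow stream client from stalling the runtime, at the cost of dropped events for that client.
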
- Route-level role requirements are enforced in auth middleware.
- Mutating cluster operations require both sufficient role and `WRITE_ACTIONS_ENABLED=true`.
- Cookie-authenticated mutating requests enforce same-origin CSRF checks.
- In `prod` mode, remediation execution enforces four-eyes separation between approver and executor.
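
The write gate and four-eyes rule can be expressed as two small checks. This is a sketch under assumptions: the role names, their ordering, the error values, and the function names are hypothetical, and only `WRITE_ACTIONS_ENABLED` and the `prod` mode come from the rules above.

```go
package main

import (
	"errors"
	"fmt"
	"os"
)

// Role ordering is an assumption for this sketch; the real role model
// lives in internal/auth.
type Role int

const (
	RoleViewer Role = iota
	RoleOperator
	RoleAdmin
)

var (
	ErrForbidden      = errors.New("insufficient role")
	ErrWritesDisabled = errors.New("write actions are disabled")
	ErrFourEyes       = errors.New("approver and executor must be different principals")
)

// authorizeWrite requires both a sufficient role and the write gate.
func authorizeWrite(role, required Role, writesEnabled bool) error {
	if role < required {
		return ErrForbidden
	}
	if !writesEnabled {
		return ErrWritesDisabled
	}
	return nil
}

// checkFourEyes rejects self-approved execution in prod mode only.
func checkFourEyes(mode, approver, executor string) error {
	if mode == "prod" && approver == executor {
		return ErrFourEyes
	}
	return nil
}

func main() {
	// The gate is read from the environment, as in the rule above.
	writesEnabled := os.Getenv("WRITE_ACTIONS_ENABLED") == "true"

	fmt.Println(authorizeWrite(RoleOperator, RoleOperator, writesEnabled))
	fmt.Println(checkFourEyes("prod", "alice", "alice"))
	fmt.Println(checkFourEyes("prod", "alice", "bob"))
}
```

Keeping the two checks separate means the gate can be evaluated in middleware for every mutating route, while the four-eyes rule applies only where an approver is recorded.
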
- `GET /api/healthz` - liveness
- `GET /api/readyz` - readiness/dependency checks
- `GET /api/runtime` - runtime security posture summary
- `GET /api/metrics` - JSON API telemetry
- `GET /api/metrics/prometheus` - Prometheus exposition
- `GET /api/openapi.yaml` - API contract