🤖 AgentOS

The Operating System for AI Agents

Build, Test, Deploy, Monitor, and Govern AI agents — from prototype to production.

For teams who need to deploy AI agents with testing, governance, and monitoring built in — not bolted on.

3 Differentiators

🧪 Test: Run scenario-based simulation before deploy, with quality and cost scoring.
🛡️ Govern: Enforce budgets, permissions, and kill-switch policies with auditability.
📊 Monitor: Observe live agent runs, tool usage, latency, and spend in one dashboard.

Quick Start

pip install agentos-platform

Minimal example:

from agentos.governed_agent import GovernedAgent
from agentos.core.tool import tool

@tool(description="Add two numbers")
def add(a: float, b: float) -> float:
    return a + b

agent = GovernedAgent(name="demo", model="gpt-4o-mini", tools=[add])
print(agent.run("What is 12.5 + 7.5?"))

Demo mode:

AGENTOS_DEMO_MODE=true python examples/run_web_builder.py

Features

MCP server with stdio/SSE transport (Claude Desktop + Cursor)

Install the MCP extra:

pip install 'agentos-platform[mcp]'

1) Start the MCP server

Expose built-in AgentOS tools (stdio transport is the safest choice for MCP clients like Claude Desktop and Cursor):

agentos mcp serve --transport stdio

Expose tools from a specific agent module (for example, ./my_agent/agent.py):

agentos mcp serve --transport stdio --agent ./my_agent

Optional: run the HTTP SSE transport for clients that support it:

agentos mcp serve --transport sse --host 127.0.0.1 --port 8080

2) Configure Claude Desktop

Add the following snippet to your claude_desktop_config.json (restart Claude Desktop after editing):

{
  "mcpServers": {
    "agentos": {
      "command": "agentos",
      "args": ["mcp", "serve", "--transport", "stdio"]
    }
  }
}

If you want a specific agent module:

{
  "mcpServers": {
    "agentos": {
      "command": "agentos",
      "args": ["mcp", "serve", "--transport", "stdio", "--agent", "/absolute/path/to/agent.py"]
    }
  }
}

3) Configure Cursor

Add the following snippet to Cursor's .cursor/mcp.json:

{
  "mcpServers": {
    "agentos": {
      "command": "agentos",
      "args": ["mcp", "serve", "--transport", "stdio"]
    }
  }
}
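Both client configs above share the same mcpServers shape, so the entry can be merged into an existing file rather than pasted by hand. The following is a minimal sketch, not part of AgentOS: the helper names (agentos_server_entry, merge_mcp_server, update_config_file) are hypothetical, and it assumes the config file, if present, contains valid JSON.

```python
import json
from pathlib import Path


def agentos_server_entry(agent_path=None):
    """Build the `agentos` server entry shown in the snippets above."""
    args = ["mcp", "serve", "--transport", "stdio"]
    if agent_path is not None:
        args += ["--agent", str(agent_path)]
    return {"command": "agentos", "args": args}


def merge_mcp_server(config, name, entry):
    """Return a copy of `config` with `entry` stored under mcpServers[name].

    Other servers already listed in the config are preserved.
    """
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = entry
    merged["mcpServers"] = servers
    return merged


def update_config_file(path, agent_path=None):
    """Read a JSON config if it exists, merge the entry, and write it back."""
    path = Path(path)
    config = json.loads(path.read_text()) if path.exists() else {}
    merged = merge_mcp_server(config, "agentos", agentos_server_entry(agent_path))
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2) + "\n")
    return merged
```

For example, update_config_file(".cursor/mcp.json") would add the entry for Cursor while leaving any other configured servers untouched. Restart the client afterwards so it picks up the change.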

Agent delegation (delegate tool + SharedContext + chaining)

AgentOS includes a structured delegation system that lets a “parent” agent offload subtasks to “child” agents while propagating rich context through a shared, in-memory key/value store.

Key pieces:

delegate_subtask tool: LLM-facing tool that accepts structured fields like task, context_json, constraints_json, expected_output_schema_json, and timeout.
SharedContext: a key/value store child agents can read/write during the delegation chain (avoids lossy prompt compression).
Delegation chaining: if a child agent delegates again, the same shared context key is reused automatically.
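To make the structured fields concrete, here is a sketch of how arguments for delegate_subtask might be assembled. The field names come from the list above; everything else is an assumption for illustration: the DelegationRequest class is hypothetical, the *_json fields are assumed to be JSON-encoded strings, and timeout is assumed to be in seconds.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DelegationRequest:
    """Hypothetical container mirroring the delegate_subtask fields."""

    task: str
    context: dict = field(default_factory=dict)
    constraints: dict = field(default_factory=dict)
    expected_output_schema: dict = field(default_factory=dict)
    timeout: float = 60.0  # assumed unit: seconds

    def to_tool_arguments(self):
        """Serialize nested values to JSON strings for the *_json fields."""
        return {
            "task": self.task,
            "context_json": json.dumps(self.context, sort_keys=True),
            "constraints_json": json.dumps(self.constraints, sort_keys=True),
            "expected_output_schema_json": json.dumps(
                self.expected_output_schema, sort_keys=True
            ),
            "timeout": self.timeout,
        }


request = DelegationRequest(
    task="Summarize the open support tickets",
    context={"ticket_ids": [101, 102]},
    constraints={"max_words": 150},
    expected_output_schema={"type": "object", "required": ["summary"]},
    timeout=30.0,
)
arguments = request.to_tool_arguments()
print(arguments["context_json"])
```

Keeping context and constraints as structured data until the last moment means the child agent receives exact values rather than a prose paraphrase.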

Minimal wiring example:

from agentos.core.agent import Agent
from agentos.core.delegation import DelegationManager

# Define your child agents however you like.
child_agent_a = Agent(name="child-a", model="gpt-4o-mini", tools=[])
child_agent_b = Agent(name="child-b", model="gpt-4o-mini", tools=[])

manager = DelegationManager()
manager.register_agent("child-a", child_agent_a)
manager.register_agent("child-b", child_agent_b)

# Create your parent agent and attach the delegate tool.
parent = Agent(name="parent", model="gpt-4o-mini", tools=[])
manager.attach_delegate_tool(parent)  # adds `delegate_subtask` to the toolset

# Now the parent agent can call `delegate_subtask`.
parent.run("Delegate a subtask and use shared context for details.")

SharedContext tools available to delegated agents:

shared_context_key()
shared_context_get(key)
shared_context_set(key, value_json)
shared_context_dump()
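The four tools above suggest a small key/value interface scoped to one delegation chain. The sketch below shows one way such a store could behave. It is not the AgentOS implementation: SharedContextStore and its method names are hypothetical, and value_json is assumed to be a JSON-encoded string.

```python
import json
import uuid


class SharedContextStore:
    """Hypothetical in-memory store: one key/value namespace per delegation chain."""

    def __init__(self):
        self._contexts = {}

    def create(self):
        """Start a new delegation chain and return its shared context key."""
        context_key = uuid.uuid4().hex
        self._contexts[context_key] = {}
        return context_key

    def set(self, context_key, key, value_json):
        """Store a JSON-encoded value; invalid JSON raises ValueError."""
        self._contexts[context_key][key] = json.loads(value_json)

    def get(self, context_key, key, default=None):
        """Return the decoded value for `key`, or `default` if it is missing."""
        return self._contexts[context_key].get(key, default)

    def dump(self, context_key):
        """Return a JSON snapshot of everything in the chain's context."""
        return json.dumps(self._contexts[context_key], sort_keys=True)


store = SharedContextStore()
chain_key = store.create()                      # the parent starts the chain
store.set(chain_key, "customer", '{"id": 42}')  # child A writes a value
# A grandchild reuses the same chain_key, so it sees child A's data.
grandchild_view = store.get(chain_key, "customer")
print(grandchild_view, store.dump(chain_key))
```

Because every agent in the chain addresses the same key, details written early in the chain remain available verbatim to later agents.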

Core Modules

Module	What it does
Agent SDK	Define agents and tools with provider-agnostic model routing
Simulation Sandbox	Test scenarios with LLM-as-judge quality and pass/fail scoring
Governance Engine	Budget controls, permissions, kill switch, and audit logging
Live Dashboard	Real-time traces for prompts, tool calls, latency, and spend
RAG Pipeline	Ingest, chunk, embed, and retrieve knowledge sources
Workflow Engine	Compose repeatable multi-step agent workflows
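The Governance Engine row above combines budget controls, permissions, a kill switch, and audit logging. As a rough illustration of how those checks can compose, here is a standalone sketch. It does not use the AgentOS API: GovernancePolicy, BudgetExceeded, and AgentKilled are hypothetical names, and costs are assumed to be estimated in US dollars before each tool call.

```python
from dataclasses import dataclass, field


class BudgetExceeded(RuntimeError):
    """Raised when a call would push spend past the budget."""


class AgentKilled(RuntimeError):
    """Raised when the kill switch has been triggered."""


@dataclass
class GovernancePolicy:
    """Hypothetical policy object; not the AgentOS implementation."""

    budget_usd: float
    allowed_tools: frozenset
    spent_usd: float = 0.0
    killed: bool = False
    audit_log: list = field(default_factory=list)

    def kill(self, reason):
        """Trip the kill switch; every later call is denied."""
        self.killed = True
        self.audit_log.append(("kill", reason))

    def authorize(self, tool_name, estimated_cost_usd):
        """Check kill switch, permission, then budget; record the decision."""
        if self.killed:
            self.audit_log.append(("denied", tool_name, "killed"))
            raise AgentKilled(tool_name)
        if tool_name not in self.allowed_tools:
            self.audit_log.append(("denied", tool_name, "permission"))
            raise PermissionError(tool_name)
        if self.spent_usd + estimated_cost_usd > self.budget_usd:
            self.audit_log.append(("denied", tool_name, "budget"))
            raise BudgetExceeded(tool_name)
        self.spent_usd += estimated_cost_usd
        self.audit_log.append(("allowed", tool_name, estimated_cost_usd))


policy = GovernancePolicy(budget_usd=1.00, allowed_tools=frozenset({"search"}))
policy.authorize("search", 0.60)        # allowed: within budget
try:
    policy.authorize("search", 0.60)    # denied: would exceed the budget
except BudgetExceeded:
    denied_for_budget = True
policy.kill("operator request")
try:
    policy.authorize("search", 0.10)    # denied: kill switch is on
except AgentKilled:
    denied_after_kill = True
```

Every decision, allowed or denied, lands in audit_log, which is the property that makes the policy auditable after the fact.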

📋 Full 15-module list

Module	Description
Agent SDK	Core governed agent runtime and tool-calling loop
WebSocket Streaming	Token streaming and low-latency interactive sessions
RAG Pipeline	Ingestion, chunking, embeddings, retrieval, and reranking
Simulation Sandbox	Scenario simulation, scoring, and comparison reports
Live Dashboard	Event stream, usage analytics, and operational visibility
Governance Engine	Guardrails, budget caps, permission checks, and audits
Agent Scheduler	Interval and cron scheduling with execution history
Event Bus	Trigger-driven orchestration via internal and external events
Plugin System	Runtime-extensible tools, providers, and adapters
Authentication	API key auth, org and user usage tracking, and middleware
A/B Testing	Side-by-side evaluation for variants and prompt changes
Workflow Engine	DAG-based execution with retries and branching
Multimodal	Vision and document flows for image and file-aware agents
Marketplace	Template registry for reusable agents and workflows
Embed SDK	Embeddable widget and integration surface for web apps
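The Workflow Engine row describes DAG-based execution with retries. The sketch below illustrates that idea with the standard library only. It is an assumption-laden illustration rather than the AgentOS engine: run_workflow and its parameters are hypothetical, branching is not shown, and graphlib requires Python 3.9 or later.

```python
from graphlib import TopologicalSorter


def run_workflow(steps, dependencies, max_retries=2):
    """Run step callables in dependency order.

    `dependencies` maps a step name to the set of steps it depends on.
    Each step receives the results so far; a failing step is retried
    up to `max_retries` times before the error propagates.
    """
    results = {}
    for name in TopologicalSorter(dependencies).static_order():
        for attempt in range(max_retries + 1):
            try:
                results[name] = steps[name](results)
                break
            except Exception:
                if attempt == max_retries:
                    raise
    return results


attempts = {"count": 0}


def flaky_fetch(results):
    """Fail once with a transient error, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 2:
        raise ConnectionError("transient failure")
    return [3, 1, 2]


steps = {
    "fetch": flaky_fetch,
    "sort": lambda results: sorted(results["fetch"]),
    "report": lambda results: f"max={results['sort'][-1]}",
}
dependencies = {"sort": {"fetch"}, "report": {"sort"}}

results = run_workflow(steps, dependencies)
print(results["report"])
```

The topological order guarantees that each step runs only after the steps it depends on have produced results.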

Honest Comparison

Capability	AgentOS	LangChain	CrewAI	AutoGen
Built-in testing sandbox	✅ Native	❌ External setup	❌ External setup	❌ External setup
Governance (budget/kill switch)	✅ Native	⚠️ Custom code	⚠️ Custom code	⚠️ Custom code
Real-time ops dashboard	✅ Native	⚠️ LangSmith add-on	❌	❌
Batteries-included platform	✅ Yes	⚠️ Framework-first	⚠️ Orchestration-first	⚠️ Research-first
Ecosystem maturity	🌱 Growing	✅ Very mature	✅ Mature	✅ Mature

Benchmarks

See full benchmark results. Key findings:

Our weighted evaluation ensemble correlates 0.91 with human judgment
Local embeddings achieve 95% of OpenAI quality at zero cost
Governance adds <5ms overhead to any query

Architecture

See the architecture diagram above and docs/ for component-level details and ADRs.

Project Structure

agentos/
├── src/agentos/      # Core platform modules
├── frontend/         # React frontend
├── dashboard/        # Web dashboard UI
├── deploy/helm/      # Helm charts
├── examples/         # Runnable examples
├── tests/            # Unit and integration tests
└── docs/             # Docs and ADRs

Contributing

Contributions are welcome; see CONTRIBUTING.md.

Roadmap

Roadmap and upcoming work are tracked in GitHub Issues.

Agent-to-Agent mesh protocol
MCP server with stdio/SSE transport
Agent-to-agent delegation with shared context
