dattgoswami/ferrum-gateway

Ferrum Gateway

Rust control plane for HTTP, model, tool, and agent traffic.

Ferrum Gateway is a standalone service gateway with a generic HTTP core and optional agent-native modules. A normal Web2 service can use it for routing, auth, rate limits, body limits, timeouts, retries, circuit breakers, streaming proxying, JSON logs, tracing, and Prometheus metrics. Agent workloads can add LLM provider routing, token budgets, usage metering, MCP/tool governance, and Ferrum adapters on top.

The core gateway does not depend on LLM, MCP, or Ferrum crates. That boundary is intentional: Ferrum integrations are adapters, not prerequisites.

Why Not Envoy, Kong, Or Traefik

Those are stronger general-purpose gateways today. Ferrum Gateway is narrower: it is a portfolio-grade Rust gateway that treats agent traffic as a first-class operations problem. The generic core keeps it credible as infrastructure, and the agent layer adds controls that are awkward to bolt onto a classic gateway:

  • policy-driven LLM provider fallback with streaming preserved;
  • per-key, per-session, and per-run token budgets;
  • OpenAI-compatible usage extraction and tokengate-shaped events;
  • MCP / JSON-RPC tool policy with allow, deny, approval-required, and path guards;
  • one correlated run trail across HTTP, model, and tool calls.
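
As a rough illustration of the per-key, per-session, and per-run token budgets in the list above, the sketch below charges a request against several scopes at once and rejects it if any scope would exceed its limit. The names `BudgetScope` and `TokenBudgets`, and the all-or-nothing charging rule, are assumptions made for this example; they are not taken from the gateway's crates.

```rust
use std::collections::HashMap;

/// Hypothetical budget scopes; illustrative only, not the gateway's API.
#[derive(Clone, Debug, PartialEq, Eq, Hash)]
pub enum BudgetScope {
    ApiKey(String),
    Session(String),
    Run(String),
}

/// Tracks tokens used against an optional limit for each scope.
#[derive(Default)]
pub struct TokenBudgets {
    limits: HashMap<BudgetScope, u64>,
    used: HashMap<BudgetScope, u64>,
}

impl TokenBudgets {
    pub fn set_limit(&mut self, scope: BudgetScope, limit: u64) {
        self.limits.insert(scope, limit);
    }

    /// Charges `tokens` to every scope, or to none if any scope would
    /// exceed its limit. On rejection, returns the first offending scope.
    /// Scopes without a configured limit are treated as unbounded.
    pub fn try_charge(
        &mut self,
        scopes: &[BudgetScope],
        tokens: u64,
    ) -> Result<(), BudgetScope> {
        // First pass: check every scope before mutating anything.
        for scope in scopes {
            if let Some(limit) = self.limits.get(scope) {
                let used = self.used.get(scope).copied().unwrap_or(0);
                if used.saturating_add(tokens) > *limit {
                    return Err(scope.clone());
                }
            }
        }
        // Second pass: all scopes passed, so record the usage.
        for scope in scopes {
            let entry = self.used.entry(scope.clone()).or_insert(0);
            *entry = entry.saturating_add(tokens);
        }
        Ok(())
    }

    pub fn used(&self, scope: &BudgetScope) -> u64 {
        self.used.get(scope).copied().unwrap_or(0)
    }
}
```

Checking all scopes before recording usage means a request rejected by the run budget does not silently consume the API-key budget. A real gateway might instead reserve an estimate up front and reconcile once usage is extracted from the provider response.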

Current Status

Phase 10 is a presentable public slice:

  • Pingora-backed data plane plus axum admin plane.
  • Static TOML config with strict validation.
  • Local mocks for HTTP, LLM, MCP, and tokengate-compatible metering.
  • End-to-end tests for auth, rate limits, body limits, streaming, timeouts, retries, circuit breakers, LLM fallback, budgets, metering, and MCP policy.
  • Dockerfile, Docker Compose demo, CI workflow, and a local load-test script.

Normal verification is local and deterministic. No external providers, network services, or secrets are required.

Quickstart

Prerequisites:

  • Rust stable.
  • cmake for Pingora's zlib-ng build path (brew install cmake on macOS).

Run the generic HTTP gateway in two shells:

# shell 1
cargo run -p mock-upstream

# shell 2
cargo run -p ferrum-gateway-server -- --config config/gateway.example.toml

Exercise it:

curl -i http://127.0.0.1:8081/healthz
curl -i http://127.0.0.1:8080/api/echo
curl -N "http://127.0.0.1:8080/sse?count=5&interval_ms=80"
curl -s http://127.0.0.1:8081/metrics | grep gateway_

Run the full mock-first agent demo:

bash examples/ferrum-demo/run.sh

That script starts mock LLM primary and fallback providers, mock tokengate, mock MCP, and the gateway. It sends one correlated run through model routing, usage metering, and MCP allow, approval, and protected-path decisions, then prints a compact run trail.
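
To make the idea of a correlated run trail concrete, here is a minimal sketch that filters a mixed event stream by run id and renders one line per step. The `GatewayEvent` and `EventKind` types and the output format are hypothetical; the demo script's actual event shape and trail format may differ.

```rust
/// Hypothetical event kinds for the three traffic types the gateway handles.
#[derive(Debug, Clone, PartialEq)]
pub enum EventKind {
    Http,
    Model,
    Tool,
}

/// Hypothetical event record; field names are illustrative only.
pub struct GatewayEvent {
    pub run_id: String,
    pub kind: EventKind,
    pub summary: String,
}

/// Returns one numbered line per event that belongs to `run_id`,
/// preserving the order in which the events were recorded.
pub fn run_trail(events: &[GatewayEvent], run_id: &str) -> Vec<String> {
    events
        .iter()
        .filter(|e| e.run_id == run_id)
        .enumerate()
        .map(|(i, e)| format!("{} {:?} {}", i + 1, e.kind, e.summary))
        .collect()
}
```

In the curl examples elsewhere in this README, the run id travels in the `x-ferrum-run-id` header, so a model call and a tool call sent with the same header value would land in the same trail.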

Docker Compose Demo

Build and run the gateway plus all local mocks:

docker compose up --build

Then, in another shell:

curl -i http://127.0.0.1:8081/healthz
curl -i http://127.0.0.1:8080/api/echo

curl -sS -X POST http://127.0.0.1:8080/v1/chat/completions \
  -H 'content-type: application/json' \
  -H 'x-ferrum-run-id: compose-run-1' \
  -d '{"model":"axon-sim","messages":[{"role":"user","content":"hello"}]}'

curl -sS -X POST http://127.0.0.1:8080/mcp \
  -H 'content-type: application/json' \
  -H 'x-ferrum-run-id: compose-run-1' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"coding/read_file","arguments":{"path":"src/lib.rs"}}}'

The Compose config is generated inside the gateway container so checked-in local configs can keep their 127.0.0.1 development defaults.

Load Test

The bundled smoke load test needs only bash, cargo, and curl:

bash load-tests/http-smoke.sh

Tune it with environment variables:

FGW_LOAD_REQUESTS=500 \
FGW_LOAD_CONCURRENCY=25 \
bash load-tests/http-smoke.sh

The script starts mock-upstream and the gateway, waits for health, sends concurrent requests to /api/echo, prints status counts and duration, then cleans up its child processes.

Features

Generic gateway core:

  • path, method, host, and header route matching;
  • upstream pools with round-robin selection;
  • API-key auth with sensitive header redaction;
  • token-bucket rate limits;
  • request body size limits;
  • streaming-safe SSE and chunked proxying;
  • per-route timeouts;
  • bounded retries for idempotent, non-streaming requests;
  • passive circuit breakers per (route, upstream);
  • Prometheus metrics and structured logs.
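
For readers unfamiliar with token-bucket rate limiting, the following is a minimal sketch of the technique. It is not the gateway's implementation: the `TokenBucket` type, its fields, and the choice to pass time in as seconds are assumptions made so the example stays deterministic.

```rust
/// Minimal token bucket. Time is passed in explicitly (seconds as f64)
/// so the behavior is deterministic and easy to test.
pub struct TokenBucket {
    capacity: f64,
    refill_per_sec: f64,
    tokens: f64,
    last_refill: f64,
}

impl TokenBucket {
    /// Creates a bucket that starts full at time `now`.
    pub fn new(capacity: f64, refill_per_sec: f64, now: f64) -> Self {
        Self {
            capacity,
            refill_per_sec,
            tokens: capacity,
            last_refill: now,
        }
    }

    /// Refills based on elapsed time, then tries to take one token.
    /// Returns true if the request is admitted.
    pub fn try_acquire(&mut self, now: f64) -> bool {
        // Ignore clock values that go backwards.
        let elapsed = (now - self.last_refill).max(0.0);
        self.tokens = (self.tokens + elapsed * self.refill_per_sec).min(self.capacity);
        self.last_refill = self.last_refill.max(now);

        if self.tokens >= 1.0 {
            self.tokens -= 1.0;
            true
        } else {
            false
        }
    }
}
```

The capacity bounds the burst size and the refill rate bounds the sustained request rate. A gateway would typically keep one bucket per API key or route and answer a rejected request with HTTP 429.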

Agent-native modules:

  • OpenAI-compatible chat completions proxy;
  • primary/fallback provider policies;
  • streaming and non-streaming usage extraction;
  • run/session/API-key token budgets;
  • tokengate-compatible usage sink;
  • MCP / JSON-RPC inspection and forwarding;
  • tool allow, deny, approval-required, and protected-path policy.
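
The tool policy in the last bullet can be pictured as a small decision function. The sketch below is an assumption about how such a check might be ordered (explicit deny, then protected path, then approval, then allow list, then default deny); the `ToolPolicy` fields and every tool name other than `coding/read_file` are hypothetical.

```rust
/// Outcome of evaluating a tool call against policy.
#[derive(Debug, PartialEq, Eq)]
pub enum Decision {
    Allow,
    Deny(&'static str),
    ApprovalRequired,
}

/// Hypothetical policy shape; field names are illustrative only.
pub struct ToolPolicy {
    pub allowed_tools: Vec<String>,
    pub denied_tools: Vec<String>,
    pub approval_tools: Vec<String>,
    pub protected_prefixes: Vec<String>,
}

impl ToolPolicy {
    /// Evaluates a tool name and an optional `path` argument.
    /// Order: explicit deny, protected path, approval, allow list, default deny.
    pub fn evaluate(&self, tool: &str, path: Option<&str>) -> Decision {
        if self.denied_tools.iter().any(|t| t == tool) {
            return Decision::Deny("tool is denied");
        }
        if let Some(p) = path {
            // Reject parent-directory traversal and protected prefixes.
            let traversal = p.split('/').any(|seg| seg == "..");
            let protected = self
                .protected_prefixes
                .iter()
                .any(|pre| p.starts_with(pre.as_str()));
            if traversal || protected {
                return Decision::Deny("path is protected");
            }
        }
        if self.approval_tools.iter().any(|t| t == tool) {
            return Decision::ApprovalRequired;
        }
        if self.allowed_tools.iter().any(|t| t == tool) {
            return Decision::Allow;
        }
        Decision::Deny("tool is not on the allow list")
    }
}
```

Defaulting to deny keeps an unlisted tool from slipping through when the allow list is incomplete. Checking the path before the allow list means an otherwise permitted tool still cannot read a protected location.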

Crate Boundary

Dependency direction is one-way:

ferrum-gateway-server -> ferrum-gateway-core
ferrum-gateway-llm    -> ferrum-gateway-core
ferrum-gateway-mcp    -> ferrum-gateway-core
ferrum-gateway-ferrum -> ferrum-gateway-core + optional adapter surfaces

ferrum-gateway-core declares zero dependencies on LLM, MCP, Ferrum, or any other ferrum-gateway-* crate.

Repository Map

crates/ferrum-gateway-core/    generic config, routing, limits, metrics, events
crates/ferrum-gateway-server/  Pingora data plane and axum admin plane
crates/ferrum-gateway-llm/     provider policies, budgets, usage extraction
crates/ferrum-gateway-mcp/     JSON-RPC inspection and tool governance
crates/ferrum-gateway-ferrum/  optional Ferrum-stack adapters
examples/                     local mocks and the Phase 9 demo script
docs/                         architecture, config, agent control, failures
load-tests/                   local smoke load tests

Useful Commands

cargo fmt --all
cargo test --workspace
cargo clippy --workspace --all-targets --no-deps

cargo run -p ferrum-gateway-server -- --config config/gateway.example.toml --validate-only
cargo run -p ferrum-gateway-server -- --config config/ferrum-demo.local.toml --validate-only

docker build -t ferrum-gateway:local .
docker compose up --build

Running clippy with -D warnings currently reports cleanup lints in the existing Rust source, so the Phase 10 CI runs clippy without turning warnings into build failures.

More Docs

The docs/ directory covers architecture, config, agent control, and failures.

Non-goals For This Slice

  • A full Envoy replacement.
  • A dynamic plugin runtime.
  • A UI dashboard.
  • A database-backed control plane.
  • A Kubernetes operator.
  • A complete MCP server.
  • A billing platform.

License

Apache-2.0.
