High-performance LLM API gateway in Rust, built by crabtalk.
Crabllm sits between your application and LLM providers. It exposes an OpenAI-compatible API and routes requests to the configured provider — OpenAI, Anthropic, Azure, Ollama, and any OpenAI-compatible service.
One API format. Many providers. Low overhead.
Inspired by LiteLLM. Built in Rust for minimal overhead. See the docs for providers, routing, extensions, and configuration.
```sh
cargo install crabllm
```

Create `crabllm.toml`:

```toml
listen = "0.0.0.0:8080"
[providers.openai]
kind = "openai"
api_key = "${OPENAI_API_KEY}"
models = ["gpt-4o"]
[providers.anthropic]
kind = "anthropic"
api_key = "${ANTHROPIC_API_KEY}"
models = ["claude-sonnet-4-20250514"]
```
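Ollama and other OpenAI-compatible services slot in the same way. A minimal sketch, assuming the config supports a `base_url` key for custom endpoints (the key name is an assumption; check the configuration docs for the exact schema):

```toml
# Hypothetical provider block: "base_url" is an assumed key name,
# not confirmed crabllm config schema.
[providers.ollama]
kind = "openai"                         # Ollama speaks the OpenAI wire format
base_url = "http://localhost:11434/v1"  # assumed key for a custom endpoint
models = ["llama3.1"]
```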
Run:

```sh
crabllm --config crabllm.toml
```

Send requests using the OpenAI format:
```sh
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```

Why Rust:

- Performance: Sub-millisecond proxy overhead. No GC pauses.
- Safety: Memory safety without runtime cost.
- Concurrency: Tokio async runtime handles thousands of concurrent streaming connections (see the streaming sketch below).
- Deployment: Single static binary. No interpreter, no virtualenv.
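Streaming uses the same endpoint. The sketch below assumes crabllm passes the standard OpenAI `stream` flag through to the upstream; per the `models` lists in the config above, this request would be routed to the Anthropic provider:

```sh
# -N disables curl buffering; tokens arrive as server-sent events
# ("data: ..." chunks), assuming standard OpenAI-style streaming.
curl -N http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-20250514",
    "messages": [{"role": "user", "content": "Write a haiku about crabs."}],
    "stream": true
  }'
```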
Gateway overhead measured against a mock LLM server with instant responses. Numbers show proxy cost only (gateway latency minus direct baseline). See full results for all scenarios and memory usage.
Streaming P50 overhead (ms) — the metric that matters for LLMs:
| RPS | crabllm | bifrost | litellm |
|---|---|---|---|
| 500 | +0.02 | +0.16 | +593.20 |
| 1000 | +0.09 | +0.18 | +596.90 |
| 5000 | +0.23 | +0.47 | +593.85 |
Chat completions P50 overhead (ms):
| RPS | crabllm | bifrost | litellm |
|---|---|---|---|
| 500 | +0.41 | +0.07 | +151.42 |
| 1000 | +0.30 | +0.10 | +160.12 |
| 5000 | +0.16 | +0.14 | +159.68 |
Peak memory: crabllm 37MB · bifrost 169MB · litellm 541MB
```sh
# requires linux + docker
make bench # full competitive benchmark
make bench-debug GW=crabllm # quick single-gateway debug run
make bench-chart # render terminal charts from results
make bench-json # dump summary JSON to stdout
make summary # generate docs/src/benchmarks.md
```

| Crate | Description |
|---|---|
| `crabllm` | Binary entry point (CLI + server startup) |
| `crabllm-core` | Shared types: config, OpenAI-format request/response, errors |
| `crabllm-provider` | Provider enum, registry, and upstream HTTP dispatch |
| `crabllm-proxy` | Axum HTTP server, route handlers, auth middleware |
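To see how the crates fit together, here is a self-contained sketch of the request path: `crabllm-proxy` receives an OpenAI-format request, `crabllm-provider` resolves the model to a provider and dispatches upstream, and `crabllm-core` supplies the shared types. Every name in the sketch is a hypothetical stand-in, not the actual crabllm API:

```rust
use std::collections::HashMap;

// crabllm-core (stand-in types): shared OpenAI-format request/response.
struct ChatRequest { model: String, user_message: String }
struct ChatResponse { content: String }

// crabllm-provider (stand-in): provider enum plus a model -> provider registry.
#[derive(Clone, Copy)]
enum Provider { OpenAi, Anthropic }

struct Registry { by_model: HashMap<String, Provider> }

impl Registry {
    fn provider_for(&self, model: &str) -> Option<Provider> {
        self.by_model.get(model).copied()
    }
}

impl Provider {
    // Stands in for the real upstream HTTP dispatch.
    fn dispatch(&self, req: &ChatRequest) -> ChatResponse {
        let upstream = match self {
            Provider::OpenAi => "api.openai.com",
            Provider::Anthropic => "api.anthropic.com",
        };
        ChatResponse { content: format!("[{upstream}] would answer: {}", req.user_message) }
    }
}

// crabllm-proxy (stand-in handler): look up the provider, forward the request.
fn handle(registry: &Registry, req: ChatRequest) -> Result<ChatResponse, String> {
    registry
        .provider_for(&req.model)
        .map(|p| p.dispatch(&req))
        .ok_or_else(|| format!("unknown model: {}", req.model))
}

fn main() {
    let registry = Registry {
        by_model: HashMap::from([
            ("gpt-4o".to_string(), Provider::OpenAi),
            ("claude-sonnet-4-20250514".to_string(), Provider::Anthropic),
        ]),
    };
    let req = ChatRequest { model: "gpt-4o".into(), user_message: "Hello!".into() };
    match handle(&registry, req) {
        Ok(resp) => println!("{}", resp.content),
        Err(e) => eprintln!("{e}"),
    }
}
```

In the real gateway the handler is an Axum route and dispatch is an async HTTP call, but the layering is the same.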
License: MIT OR Apache-2.0