KVCache.ai

Official website source for KVCache.ai — the home of open-source projects and research on KV cache management and LLM serving optimization.

About KVCache.ai

KVCache.ai advances the state of the art in Large Language Model (LLM) inference optimization. In decoder-only Transformer models, input from any modality is ultimately encoded as key-value (KV) cache entries, which makes the KV cache a central component of modern LLM serving systems. It has therefore become a key focus for improving inference efficiency through techniques such as caching, scheduling, compression, offloading, and disaggregated serving architectures.

Through open-source projects and academic research, KVCache.ai develops effective, practical, and high-performance solutions for KV cache management and LLM serving optimization. The goal is to make LLM deployment more accessible, efficient, and cost-effective for organizations of all sizes.

Featured Projects

Mooncake — A KV cache-centric disaggregated architecture for LLM serving.
KTransformers — A CPU/GPU heterogeneous LLM inference and fine-tuning framework for running and tuning 100B+ models on accessible workstation hardware.
TrEnv-X — An open-source runtime platform designed for AI Agent applications.

Tools

KV Cache Size Calculator — Estimate KV cache capacity for common production LLM families, including DeepSeek, GLM, Kimi, Qwen, MiniMax, MiMo, and others.
KV Cache Hit Rate Simulator — Calculate the KV cache hit rate for a preset trace or your own, under different memory budgets.
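
As a rough illustration of the kind of estimate these two tools produce, here is a minimal Python sketch. It is not the implementation used on the site. The function names, the example model shape, and the choice of an LRU eviction policy over fixed-size blocks are all assumptions made for illustration. The size formula assumes standard multi-head or grouped-query attention; models that store a compressed latent instead of full keys and values (for example DeepSeek's multi-head latent attention) need a different per-token figure.

```python
from collections import OrderedDict


def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_elem=2):
    """Estimate KV cache size in bytes for standard MHA/GQA attention.

    Hypothetical helper: the factor 2 counts one key vector and one value
    vector per token, per layer, per KV head. bytes_per_elem=2 assumes
    16-bit storage (fp16/bf16).
    """
    return 2 * num_layers * num_kv_heads * head_dim * bytes_per_elem * num_tokens


def lru_hit_rate(trace, capacity_blocks):
    """Replay a trace of block IDs through an LRU cache and return the hit rate.

    Hypothetical helper: `trace` is any sequence of hashable block IDs, and
    `capacity_blocks` is the memory budget expressed in cache blocks.
    """
    if capacity_blocks <= 0 or not trace:
        return 0.0
    cache = OrderedDict()  # insertion order doubles as recency order
    hits = 0
    for block_id in trace:
        if block_id in cache:
            hits += 1
            cache.move_to_end(block_id)  # mark as most recently used
        else:
            cache[block_id] = True
            if len(cache) > capacity_blocks:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / len(trace)


if __name__ == "__main__":
    # Assumed example shape: 32 layers, 8 KV heads, head dimension 128, 16-bit values.
    per_token = kv_cache_bytes(32, 8, 128, num_tokens=1)
    print(per_token)  # 131072 bytes, i.e. 128 KiB per token

    # Toy trace replayed under two memory budgets.
    toy_trace = ["a", "b", "a", "c", "b", "a"]
    print(lru_hit_rate(toy_trace, capacity_blocks=2))
    print(lru_hit_rate(toy_trace, capacity_blocks=3))  # 0.5
```

On the toy trace, raising the budget from 2 to 3 blocks lifts the hit rate from 1/6 to 1/2, which is the kind of budget-versus-hit-rate trade-off the simulator is meant to explore.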

Tech Stack

This site is built with:

Hugo (v0.126.3) — static site generator
Hugo Blox — theme and page builder (Tailwind CSS)
Pagefind — static search (used in production builds on Netlify)

Content is written in Markdown with YAML front matter. Custom layouts and shortcodes live under layouts/.

Links

Website: https://kvcache.ai
GitHub Organization: https://github.com/kvcache-ai
X (Twitter): @KVCache_AI

