kvcache-ai/kvcache-blog

KVCache.ai

Official website source for KVCache.ai — the home of open-source projects and research on KV cache management and LLM serving optimization.

About KVCache.ai

KVCache.ai advances the state of the art in Large Language Model (LLM) inference optimization. In decoder-only Transformer models, inputs from diverse modalities can ultimately be transformed into KV cache entries, which makes the KV cache a central component of modern LLM serving systems. As a result, the KV cache has become a key focus for improving inference efficiency through techniques such as caching, scheduling, compression, offloading, and disaggregated serving architectures.

Through open-source projects and academic research, KVCache.ai develops effective, practical, and high-performance solutions for KV cache management and LLM serving optimization. The goal is to make LLM deployment more accessible, efficient, and cost-effective for organizations of all sizes.

Featured Projects

  • Mooncake — A KV cache-centric disaggregated architecture for LLM serving.
  • KTransformers — A CPU/GPU heterogeneous LLM inference and fine-tuning framework for running and tuning 100B+ models on accessible workstation hardware.
  • TrEnv-X — An open-source runtime platform designed for AI Agent applications.

Tools

  • KV Cache Size Calculator — Estimate KV cache capacity for common production LLM families, including DeepSeek, GLM, Kimi, Qwen, MiniMax, MiMo, and others.
  • KV Cache Hit Rate Simulator — Calculate the KV cache hit rate for a preset trace or your own trace under different memory budgets.
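
For a rough sense of what a KV cache size estimate involves, here is a minimal Python sketch. It is not the calculator's actual implementation. It assumes a standard multi-head or grouped-query attention layout, where each layer stores one key and one value vector per KV head per token. Architectures that compress the cache (for example, latent-attention designs) need a different formula. The function names and the model shape in the example are hypothetical.

```python
def kv_cache_bytes(num_layers, num_kv_heads, head_dim, num_tokens, bytes_per_element=2):
    """Estimate KV cache size in bytes for a standard MHA/GQA layout.

    Assumption: every layer keeps one key and one value vector of size
    head_dim per KV head per token, hence the leading factor of 2.
    bytes_per_element=2 corresponds to fp16/bf16 storage.
    """
    return 2 * num_layers * num_kv_heads * head_dim * num_tokens * bytes_per_element


def max_cached_tokens(memory_budget_bytes, num_layers, num_kv_heads, head_dim, bytes_per_element=2):
    """Return how many tokens fit in a given memory budget (rounded down)."""
    per_token = kv_cache_bytes(num_layers, num_kv_heads, head_dim, 1, bytes_per_element)
    return memory_budget_bytes // per_token


if __name__ == "__main__":
    # Hypothetical model shape, chosen only for illustration.
    layers, kv_heads, dim = 32, 8, 128

    per_token = kv_cache_bytes(layers, kv_heads, dim, num_tokens=1)
    print(f"per token: {per_token / 1024:.0f} KiB")

    context = kv_cache_bytes(layers, kv_heads, dim, num_tokens=4096)
    print(f"4096-token context: {context / 1024**2:.0f} MiB")

    budget = 16 * 1024**3  # 16 GiB reserved for KV cache
    print(f"tokens in 16 GiB: {max_cached_tokens(budget, layers, kv_heads, dim)}")
```

Under these assumptions the per-token cost grows linearly with layer count, KV head count, and head dimension, which is why grouped-query attention (fewer KV heads) and lower-precision storage both reduce the footprint directly.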

Tech Stack

This site is built with:

  • Hugo (v0.126.3) — static site generator
  • Hugo Blox — theme and page builder (Tailwind CSS)
  • Pagefind — static search (used in production builds on Netlify)

Content is written in Markdown with YAML front matter. Custom layouts and shortcodes live under layouts/.

Links
