kapiljain1989/README.md

👋 Hi, I'm Kapil Jain

I am a Senior Backend & Distributed Systems Engineer with over 11 years of industry experience. I specialize in building highly scalable, cloud-native infrastructure, with a deep focus on Go, Kubernetes, and LLM inference optimization.

Lately, I've been diving deep into the intersection of MLOps and cloud infrastructure, specifically optimizing distributed LLM serving, prefix-cache-aware scheduling, and Kubernetes accelerator orchestration.


🛠️ Core Expertise

  • Languages: Go (Golang), Python, Shell scripting
  • Cloud & DevOps: Kubernetes (K8s), Helm, Docker, Microservices Architecture
  • Systems & ML Infra: LLM Inference Optimization (vLLM, HMA scheduling), Distributed Systems, Auth (JWT, OAuth)

🚀 Active Focus & Contributions

  • LLM Inference Scaling: Actively working on state-of-the-art inference performance using modern accelerators on Kubernetes (llm-d).
  • Kubernetes Ecosystem: Designing robust scheduling mechanisms (such as Hybrid Memory Attention awareness) and contributing features upstream.
  • Open Source: Deeply passionate about giving back to the cloud-native and Kubernetes communities.

📫 Connect with Me

Pinned

  1. llm-d/llm-d Public

    Achieve state of the art inference performance with modern accelerators on Kubernetes

    Shell 3.5k 552

  2. voiceagent Public

    Telecom-native AI call center platform — SIPREC observer + B2BUA gateway with real-time transcription, agent copilot, voice sentiment analysis, robocall detection, and PII masking. Go + FreeSWITCH…

    Go 3

  3. IronGate Public

    A comprehensive microservices-based authentication platform built with Go, featuring JWT authentication, Google OAuth, email verification, user management, multi-tenant company support, and Kuberne…

    Go

  4. credit_default_prediction Public

    This repository contains the code and report for the Loan Default Prediction project, which aims to build a machine learning model to predict whether a customer will default on a loan based on hist…

    Jupyter Notebook 1 1

  5. meshnet Public

    Self-hosted mesh VPN using Headscale, Caddy, and WireGuard. Single-command deploy on Ubuntu VPS with automatic TLS, embedded DERP relay, MagicDNS, and web admin UI.

    Shell 1

  6. vllm Public

    Forked from vllm-project/vllm

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python