InferencePool Status Not Being Updated #13456

@danehans

Description

kgateway version

main

Kubernetes Version

v1.35.0

Describe the bug

InferencePool status is not being managed:

$ kubectl get inferencepool vllm-llama3-8b-instruct -o yaml
apiVersion: inference.networking.k8s.io/v1
kind: InferencePool
metadata:
  annotations:
    meta.helm.sh/release-name: vllm-llama3-8b-instruct
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2026-02-02T16:29:44Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
    app.kubernetes.io/name: vllm-llama3-8b-instruct-epp
    app.kubernetes.io/version: v1.3.0
  name: vllm-llama3-8b-instruct
  namespace: default
  resourceVersion: "9938"
  uid: e3469d81-5ddf-415a-83a6-ec51f95a3853
spec:
  endpointPickerRef:
    failureMode: FailClose
    group: ""
    kind: Service
    name: vllm-llama3-8b-instruct-epp
    port:
      number: 9002
  selector:
    matchLabels:
      app: vllm-llama3-8b-instruct
  targetPorts:
  - number: 8000

However, kgateway does properly configure the agentgateway (agtw) data plane:

$ IP=$(kubectl get gateway/inference-gateway -o jsonpath='{.status.addresses[0].value}')
PORT=80

curl -i ${IP}:${PORT}/v1/completions -H 'Content-Type: application/json' -d '{
"model": "food-review-1",
"prompt": "Write as if you were a critic: San Francisco",
"max_tokens": 100,
"temperature": 0
}'
HTTP/1.1 200 OK
server: fasthttp
server: fasthttp
date: Mon, 02 Feb 2026 16:33:26 GMT
date: Mon, 02 Feb 2026 16:33:26 GMT
content-type: application/json
content-type: application/json
x-inference-port: 8000
x-inference-port: 8000
x-inference-pod: vllm-llama3-8b-instruct-549764dd95-8lwhh
x-inference-pod: vllm-llama3-8b-instruct-549764dd95-8lwhh
x-went-into-resp-headers: true
transfer-encoding: chunked

{"choices":[{"finish_reason":"length","index":0,"text":"The rest is silence.  Today is a nice sunny day. To be or not to be that is the question. Testing, testing 1,2,3. Today is a nice sunny day. Testing, testing 1,2,3. To be or not to be that is the question. I am your AI assistant, how can I help you today? Alas, poor Yorick! I knew him, Horatio: A fellow of infinite jest Testing, testing 1,2,3. I am your "}],"created":1770050006,"do_remote_decode":false,"do_remote_prefill":false,"id":"chatcmpl-2fb23bd3-fb86-4a59-be20-cc06b6982ad7","model":"food-review-1","object":"text_completion","remote_block_ids":null,"remote_engine_id":"","remote_host":"","remote_port":0,"usage":{"completion_tokens":100,"prompt_tokens":10,"total_tokens":110}}

HTTPRoute status is as expected:

$ kubectl get httproute vllm-llama3-8b-instruct -o yaml
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    meta.helm.sh/release-name: vllm-llama3-8b-instruct
    meta.helm.sh/release-namespace: default
  creationTimestamp: "2026-02-02T16:29:44Z"
  generation: 1
  labels:
    app.kubernetes.io/managed-by: Helm
  name: vllm-llama3-8b-instruct
  namespace: default
  resourceVersion: "9947"
  uid: 8ef8ff51-c848-4898-adee-4442b66c4b8a
spec:
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: inference-gateway
  rules:
  - backendRefs:
    - group: inference.networking.k8s.io
      kind: InferencePool
      name: vllm-llama3-8b-instruct
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /
    timeouts:
      request: 300s
status:
  parents:
  - conditions:
    - lastTransitionTime: "2026-02-02T16:29:44Z"
      message: ""
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
    - lastTransitionTime: "2026-02-02T16:29:44Z"
      message: ""
      observedGeneration: 1
      reason: ResolvedRefs
      status: "True"
      type: ResolvedRefs
    controllerName: agentgateway.dev/agentgateway
    parentRef:
      group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway
      namespace: default

Expected Behavior

The InferencePool status should be populated by the controller, e.g. with an Accepted condition for the referencing Gateway.
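For reference, a populated status would look roughly like the following. This is only a sketch modeled on the HTTPRoute status shown above; the exact field and condition names come from the Gateway API Inference Extension and may differ:

status:
  parents:
  - conditions:
    - lastTransitionTime: "2026-02-02T16:29:44Z"
      observedGeneration: 1
      reason: Accepted
      status: "True"
      type: Accepted
    parentRef:
      group: gateway.networking.k8s.io
      kind: Gateway
      name: inference-gateway
      namespace: default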

Steps to reproduce the bug

  1. Use the make tooling to create a cluster, run kgateway with the agentgateway data plane, etc.
  2. Follow the steps in the Inference Gateway getting started guide.
  3. Use the above examples to check resource status, verify connectivity, etc.
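To script the status check in step 3, a minimal helper is sketched below. It is hypothetical: it assumes `kubectl ... -o json` output loaded as a dict, and a `status.parents` layout mirroring the HTTPRoute status shown above, which may not match the actual InferencePool schema.

```python
import json
import subprocess


def inferencepool_status_populated(pool: dict) -> bool:
    """Return True if the pool carries any status conditions from a controller."""
    # The buggy pool in this report has no populated 'status' key at all.
    parents = pool.get("status", {}).get("parents", [])
    return any(p.get("conditions") for p in parents)


if __name__ == "__main__":
    # Hypothetical usage against a live cluster; prints False while the bug reproduces.
    out = subprocess.run(
        ["kubectl", "get", "inferencepool", "vllm-llama3-8b-instruct", "-o", "json"],
        capture_output=True, text=True, check=True,
    ).stdout
    print(inferencepool_status_populated(json.loads(out)))
```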

Additional Environment Detail

No response

Additional Context

InferencePool status is managed correctly in v2.1.2.

Metadata

Labels

  Area: Inference (Activities related to Gateway API Inference Extension support)
  Area: agentgateway
  Priority: High (Required in next 3 months to make progress, bugs that affect multiple users, or very bad UX)
