Description
Problem
When a Seldon inference graph has two model containers in the same pod, the executor's gRPC client reuses a grpc.ClientConn keyed only by host:port, while the client interceptor captures modelName once, at connection creation. When the connection is reused for the next graph node, it still carries the first model's name into the downstream call, so the downstream model receives the wrong ModelName and fails.
Observed symptom (logs)
model-1 container receives and handles the first call correctly:
model-1: "POST /v2/models/model-1/infer HTTP/1.1" 200 OK
model-2 container incorrectly receives the same path (note model-1 name) and fails:
model-2: "POST /v2/models/model-1/infer HTTP/1.1" 404 Not Found
Repro (minimal)
- Seldon Core: v1.17.1 (helm chart seldon-core-operator v1.17.1).
- Create this SeldonDeployment (use simple HTTP echo containers that log incoming request lines, or two MLServer containers):
```yaml
apiVersion: machinelearning.seldon.io/v1
kind: SeldonDeployment
metadata:
  name: graph-test
  namespace: test
spec:
  predictors:
    - componentSpecs:
        - spec:
            containers:
              - image: <image-1> # container that logs requests (name: model-1)
                imagePullPolicy: Always
                name: model-1
              - image: <image-2> # container that logs requests (name: model-2)
                imagePullPolicy: Always
                name: model-2
      graph:
        children:
          - name: model-2
        name: model-1
      name: default
      svcOrchSpec:
        env:
          - name: SELDON_LOG_LEVEL
            value: INFO
        resources:
          limits:
            cpu: 500m
            memory: 2Gi
          requests:
            cpu: 100m
            memory: 1Gi
  protocol: v2
```
- Send V2 requests that include model name in the path:
```shell
# doesn't work
curl -X POST http://<seldon-host>/v2/models/model-1/infer
> model-1: "POST /v2/models/model-1/infer HTTP/1.1" 200 OK
> model-2: "POST /v2/models/model-1/infer HTTP/1.1" 404 Not Found

# working - workaround
curl -X POST http://<seldon-host>/v2/models/infer
> model-1: "POST /v2/models/model-1/infer HTTP/1.1" 200 OK
> model-2: "POST /v2/models/model-2/infer HTTP/1.1" 200 OK
```
Expected:
```shell
curl -X POST http://<seldon-host>/v2/models/model-1/infer
> model-1: "POST /v2/models/model-1/infer HTTP/1.1" 200 OK
> model-2: "POST /v2/models/model-2/infer HTTP/1.1" 200 OK  # executor should call model-2 with its own name
```
I am using seldon-python-client, which does not provide a way to call the /v2/models/infer path, so this workaround forces us to bypass the Python client and call the /v2/models/infer API directly with aiohttp.
If this is a bug, I am open to contributing a fix under guidance; otherwise, please point me to the right way to do this!