distiluse-base-multilingual-cased-v2 fails at startup #600

Closed
@franklucky001

Description

System Info

Image: text-embeddings-inference:turing-1.6-grpc

Model ID: sentence-transformers/distiluse-base-multilingual-cased-v2

Information

  • Docker
  • The CLI directly

Tasks

  • An officially supported command
  • My own modifications

Reproduction

Error output when starting the service:

dense-embed  | 2025-04-23T03:14:03.297238Z  INFO text_embeddings_router: router/src/lib.rs:188: Maximum number of tokens per request: 128
dense-embed  | 2025-04-23T03:14:03.297495Z  INFO text_embeddings_core::tokenization: core/src/tokenization.rs:28: Starting 48 tokenization workers
dense-embed  | 2025-04-23T03:14:05.341111Z  INFO text_embeddings_router: router/src/lib.rs:230: Starting model backend
dense-embed  | 2025-04-23T03:14:05.918661Z  INFO text_embeddings_backend_candle: backends/candle/src/lib.rs:317: Starting DistilBertModel model on Cuda(CudaDevice(DeviceId(1)))
dense-embed  | 2025-04-23T03:14:07.818830Z ERROR text_embeddings_backend: backends/src/lib.rs:255: Could not start Candle backend: Could not start backend: cannot find tensor encoder.layer.0.attention.q_lin.weight
dense-embed  | Error: Could not create backend
dense-embed  | 
dense-embed  | Caused by:
dense-embed  |     Could not start backend: Could not start a suitable backend
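The backend stops because it cannot find a tensor named `encoder.layer.0.attention.q_lin.weight`. One way to narrow this down is to list the tensor names the checkpoint actually contains and see whether the same weight is stored under a different prefix. The following is a minimal diagnostic sketch, not part of text-embeddings-inference. It assumes the model weights are available locally as a `.safetensors` file and reads only the file header (an 8-byte little-endian length followed by a JSON table of tensors). The function names `list_tensor_names` and `find_by_suffix` are hypothetical helpers introduced for illustration.

```python
import json
import struct


def list_tensor_names(path):
    """Return the tensor names stored in a .safetensors file, reading only its header."""
    with open(path, "rb") as f:
        # The file starts with an 8-byte little-endian unsigned length,
        # followed by that many bytes of JSON describing every tensor.
        (header_len,) = struct.unpack("<Q", f.read(8))
        header = json.loads(f.read(header_len).decode("utf-8"))
    # "__metadata__" is an optional free-form entry, not a tensor.
    return sorted(name for name in header if name != "__metadata__")


def find_by_suffix(names, suffix):
    """Return the names that end with `suffix`, to reveal a differing prefix."""
    return [name for name in names if name.endswith(suffix)]
```

Calling `list_tensor_names("model.safetensors")` (the path is an assumption; use wherever the checkpoint was downloaded) and then `find_by_suffix(names, "layer.0.attention.q_lin.weight")` shows whether the weight exists under another name. If it does, the mismatch is in the prefix the backend expects rather than in the checkpoint contents.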

Expected behavior

The service starts successfully.
