[NPU] Support NPUW for text-embedding models #3244

mengweiguo · 2026-01-29T01:50:32Z

Description

This PR is a duplicate of PR #3088 from the master branch.

CVS-###

Fixes #(issue)

Checklist:

Tests have been updated or added to cover the new code.
This patch fully addresses the ticket.
I have made corresponding changes to the documentation.

Copilot

Pull request overview

This PR adds support for the Qwen3 embedding model release, enhancing text embedding functionality with improved configuration handling and NPU device support.

Changes:

Added pad_to_max_length configuration option for text embeddings
Fixed typo in function name from get_argprser to get_argparser
Refactored NPU compilation logic to support text embedding models with dynamic inputs
Added comprehensive test coverage for Qwen3 embedding model with various pooling types and configurations
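
To illustrate what a pad_to_max_length option typically controls, here is a minimal sketch of batch padding. It is not the code from this PR: the helper name pad_batch and its parameters are hypothetical, and the real tokenizer configuration in llm_bench may behave differently. The idea is that padding every sequence to max_length yields a fixed input shape, which is generally what a statically compiled device model expects, while padding only to the longest sequence keeps the shape dynamic.

```python
from typing import List, Tuple


def pad_batch(
    batch: List[List[int]],
    max_length: int,
    pad_id: int = 0,
    pad_to_max_length: bool = True,
    padding_side: str = "right",
) -> Tuple[List[List[int]], List[List[int]]]:
    """Hypothetical helper: pad token id sequences and build attention masks."""
    if padding_side not in ("left", "right"):
        raise ValueError("padding_side must be 'left' or 'right'")

    # Truncate first so that no sequence exceeds max_length.
    truncated = [seq[:max_length] for seq in batch]

    # Fixed shape when pad_to_max_length is set, otherwise longest-in-batch.
    if pad_to_max_length:
        target = max_length
    else:
        target = max((len(seq) for seq in truncated), default=0)

    input_ids: List[List[int]] = []
    attention_mask: List[List[int]] = []
    for seq in truncated:
        pad = [pad_id] * (target - len(seq))
        ones = [1] * len(seq)
        zeros = [0] * len(pad)
        if padding_side == "right":
            input_ids.append(seq + pad)
            attention_mask.append(ones + zeros)
        else:
            input_ids.append(pad + seq)
            attention_mask.append(zeros + ones)
    return input_ids, attention_mask
```

With pad_to_max_length=True every row has exactly max_length entries; with False the rows are only as long as the longest sequence in the batch.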

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

File	Description
tools/llm_bench/task/text_embeddings.py	Updated tokenizer configuration to support pad_to_max_length parameter
tools/llm_bench/llm_bench_utils/ov_utils.py	Reorganized configuration logic and fixed padding_side parameter name
tools/llm_bench/llm_bench_utils/model_utils.py	Added emb_pad_to_max_length to model arguments
tools/llm_bench/benchmark.py	Fixed typo in get_argparser function name and added embedding_pad_to_max_length argument
tests/python_tests/test_rag.py	Added device and properties parameters to run_text_embedding_genai, parameterized validation threshold, and added extensive NPU tests
src/cpp/src/utils.hpp	Added declaration for compile_decoder_for_npu_text_embedding function
src/cpp/src/utils.cpp	Refactored NPU compilation into reusable functions and added text embedding specific configuration
src/cpp/src/rag/text_embedding_utils.hpp	Created new utility header for text embedding operations
src/cpp/src/rag/text_embedding_utils.cpp	Implemented utility functions for model reshaping and post-processing
src/cpp/src/rag/text_embedding_pipeline.cpp	Refactored to use utility functions and added NPU support with separate post-processing
src/cpp/src/rag/npu/text_embedding_pipeline.hpp	Added NPU-specific text embedding pipeline declarations
src/cpp/src/rag/npu/text_embedding_pipeline.cpp	Implemented NPU-specific text embedding pipeline creation
src/cpp/include/openvino/genai/rag/text_embedding_pipeline.hpp	Added documentation for NPU dynamic prompt input properties
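
The summary above mentions post-processing and several pooling types. As a rough sketch of what embedding post-processing usually computes, the snippet below shows mean pooling, last-token pooling, and L2 normalization in plain Python. It does not reproduce text_embedding_utils.cpp: the function names are hypothetical, and the set of pooling types actually supported by the pipeline is not stated in this conversation.

```python
import math
from typing import List


def mean_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Average the hidden states of real (mask == 1) tokens only."""
    dim = len(hidden[0])
    count = sum(mask)
    if count == 0:
        return [0.0] * dim
    return [
        sum(h[d] * m for h, m in zip(hidden, mask)) / count
        for d in range(dim)
    ]


def last_token_pool(hidden: List[List[float]], mask: List[int]) -> List[float]:
    """Take the hidden state of the last real token (left or right padding)."""
    real_positions = [i for i, m in enumerate(mask) if m]
    if not real_positions:
        raise ValueError("attention mask contains no real tokens")
    return list(hidden[real_positions[-1]])


def l2_normalize(vec: List[float], eps: float = 1e-12) -> List[float]:
    """Scale the vector to unit length; eps guards against division by zero."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / max(norm, eps) for x in vec]
```

Both pooling functions use the attention mask so that padded positions, such as those added when padding to a fixed length, do not leak into the embedding.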

src/cpp/src/utils.cpp

mengweiguo · 2026-01-29T05:14:09Z

The Node.js bindings test failure on macOS is expected, because PR #3227 has not been merged into the release branch.

mengweiguo requested review from as-suvorov, sbalandi and sgonorov as code owners January 29, 2026 01:50

Copilot AI review requested due to automatic review settings January 29, 2026 01:50

mengweiguo requested a review from Wovchena as a code owner January 29, 2026 01:50

github-actions bot added labels category: llm_bench (Label for tool/llm_bench folder), category: CPP API (Changes in GenAI C++ public headers), no-match-files, category: GGUF (GGUF file reader), category: RAG (RAG pipeline components) Jan 29, 2026

mengweiguo added 18 commits January 29, 2026 09:51

Support NPUW for text-embedding models

e9d9704

Fix review remark

864c23d

Add option normalize support

fdaaeed

Add tests

1d0db49

tests fallback to CPU

b417b56

Handle the post-processing with a separate model

0593b5f

Add query tests

0a10db2

Fix corner case and add comments

d193a10

Change embedding_pad_to_max_length to `disable_embedding_pad_to_max_length`

52510bf

Fix AI review

25dcae1

Fix code style

b5f6660

Fix Lint and tests

0789e60

Refactor code

41ac8c9

Restore emb_pad_to_max_length

d7976c0

Set padding according to pad_to_max_length

c100e1c

Resolve review remarks

c49ffae

Move npu/text_embedding_pipeline.hpp

affc53e

Move property description to public header

2eb3cf1

Copilot AI reviewed Jan 29, 2026

mengweiguo changed the title ~~Qwen3 embedding release~~ [NPU] Support NPUW for text-embedding models Jan 29, 2026

mengweiguo requested a review from dmatveev January 29, 2026 05:14
