
Commit a7f45ba

[DOCS] 25.0 polishing mstr (#28809)
port: #28795
1 parent b603d00 commit a7f45ba

File tree

2 files changed

+39
-34
lines changed


docs/articles_en/about-openvino/release-notes-openvino.rst

Lines changed: 38 additions & 33 deletions
@@ -28,7 +28,8 @@ What's new

* More GenAI coverage and framework integrations to minimize code changes.

-  * New models supported: Qwen 2.5.
+  * New models supported: Qwen 2.5, DeepSeek-R1-Distill-Llama-8B, DeepSeek-R1-Distill-Qwen-7B,
+    DeepSeek-R1-Distill-Qwen-1.5B, FLUX.1 Schnell, and FLUX.1 Dev.
  * Whisper Model: Improved performance on CPUs, built-in GPUs, and discrete GPUs with GenAI API.
  * Preview: Introducing NPU support for torch.compile, giving developers the ability to use the
    OpenVINO backend to run the PyTorch API on NPUs. 300+ deep learning models enabled from the
@@ -38,30 +39,34 @@ What's new

  * Preview: Addition of Prompt Lookup to GenAI API improves 2nd token latency for LLMs by
    effectively utilizing predefined prompts that match the intended use case.
+  * Preview: The GenAI API now offers image-to-image inpainting functionality. This feature
+    enables models to generate realistic content by inpainting specified modifications and
+    seamlessly integrating them with the original image.
  * Asymmetric KV Cache compression is now enabled for INT8 on CPUs, resulting in lower
    memory consumption and improved 2nd token latency, especially when dealing with long prompts
    that require significant memory. The option should be explicitly specified by the user.

* More portability and performance to run AI at the edge, in the cloud, or locally.

-  * Support for the latest Intel® Core™ Ultra 200H series processors (formerly codenamed Arrow
-    Lake-H)
-  * Preview: The GenAI API now offers image-to-image inpainting functionality. This feature
-    enables models to generate realistic content by inpainting specified modifications and
-    seamlessly integrating them with the original image.
-  * Integration of the OpenVINO backend with the Triton Inference Server allows developers to
+  * Support for the latest Intel® Core™ Ultra 200H series processors (formerly codenamed
+    Arrow Lake-H).
+  * Integration of the OpenVINO™ backend with the Triton Inference Server allows developers to
    utilize the Triton server for enhanced model serving performance when deploying on Intel
    CPUs.
-  * Preview: A new OpenVINO backend integration allows developers to leverage OpenVINO
-    performance optimizations directly within Keras 3 workflows for faster AI inference on
-    Intel® CPUs, built-in GPUs, discrete GPUs, and NPUs. This feature is available with the
-    latest Keras 3.8 release.
+  * Preview: A new OpenVINO™ backend integration allows developers to leverage OpenVINO
+    performance optimizations directly within Keras 3 workflows for faster AI inference on CPUs,
+    built-in GPUs, discrete GPUs, and NPUs. This feature is available with the latest Keras 3.8
+    release.
+  * The OpenVINO Model Server now supports native Windows Server deployments, allowing
+    developers to leverage better performance by eliminating container overhead and simplifying
+    GPU deployment.
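The Keras 3 preview above selects the backend through the `KERAS_BACKEND` environment variable. A minimal sketch, assuming `keras>=3.8` and the `openvino` package are installed; the tiny `Dense` model is purely illustrative and not from the notes:

```python
# Select the OpenVINO backend BEFORE importing keras; this backend is
# inference-only, so the sketch only calls predict().
import os
os.environ["KERAS_BACKEND"] = "openvino"

import numpy as np
import keras

# Illustrative toy model (assumption, not from the release notes).
model = keras.Sequential([
    keras.layers.Input(shape=(8,)),
    keras.layers.Dense(4, activation="relu"),
    keras.layers.Dense(2),
])

# predict() now runs through OpenVINO under the hood.
preds = model.predict(np.random.rand(3, 8), verbose=0)
```

Because the backend must be chosen before `keras` is first imported, setting the variable at the top of the entry-point script (or in the shell) is the safest pattern.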

6166
Now Deprecated
6267
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
6368

64-
* Legacy prefixes (l_,w_,m_) have been removed from OpenVINO archive names.
69+
* Legacy prefixes `l_`, `w_`, and `m_` have been removed from OpenVINO archive names.
6570
* The `runtime` namespace for Python API has been marked as deprecated and designated to be
6671
removed for 2026.0. The new namespace structure has been delivered, and migration is possible
6772
immediately. Details will be communicated through warnings and via documentation.
@@ -91,9 +96,9 @@ CPU Device Plugin
-----------------------------------------------------------------------------------------------

* Intel® Core™ Ultra 200H processors (formerly code named Arrow Lake-H) are now fully supported.
-* Asymmetric 8bit key-value cache compression is now enabled on CPU by default, reducing memory
+* Asymmetric 8-bit KV cache compression is now enabled on CPU by default, reducing memory
  usage and memory bandwidth consumption for large language models and improving performance
-  for 2nd token generation. Asymmetric 4bit key-value cache compression on CPU is now supported
+  for 2nd token generation. Asymmetric 4-bit KV cache compression on CPU is now supported
  as an option to further reduce memory consumption.
* Performance of models running in FP16 on 6th generation of Intel® Xeon® processors with P-core
  has been enhanced by improving utilization of the underlying AMX FP16 capabilities.
@@ -112,18 +117,19 @@ GPU Device Plugin
  OpenVINO GenAI APIs with continuous batching and SDPA-based LLMs with long prompts (>4k).
* Stateful models are now enabled, significantly improving performance of Whisper models on all
  GPU platforms.
-* Stable Diffusion 3 and Flux.1 performance has been improved.
+* Stable Diffusion 3 and FLUX.1 performance has been improved.
* The issue of a black image output for image generation models, including SDXL, SD3, and
-  Flux.1, with FP16 precision has been solved.
+  FLUX.1, with FP16 precision has been solved.


120125
NPU Device Plugin
121126
-----------------------------------------------------------------------------------------------
122127

123-
* Performance has been improved for Channel-Wise symmetrically quantized LLMs, including
124-
Llama2-7B-chat, Llama3-8B-instruct, qwen-2-7B, Mistral-0.2-7B-instruct, phi-3-mini-4K-instruct,
125-
miniCPM-1B models. The best performance is achieved using fp16-in4 quantized models.
126-
* Preview: Introducing NPU support for torch.compile, giving developers the ability to use the
128+
* Performance has been improved for CW symmetrically quantized LLMs, including Llama2-7B-chat,
129+
Llama3-8B-instruct, Qwen-2-7B, Mistral-0.2-7B-Instruct, Phi-3-Mini-4K-Instruct, MiniCPM-1B
130+
models. The best performance is achieved using symmetrically-quantized 4-bit (INT4) quantized
131+
models.
132+
* Preview: Introducing NPU support for torch.compile, giving developers the ability to use the
127133
OpenVINO backend to run the PyTorch API on NPUs. 300+ deep learning models enabled from
128134
the TorchVision, Timm, and TorchBench repositories.
129135

@@ -187,9 +193,6 @@ ONNX Framework Support
-----------------------------------------------------------------------------------------------

* Runtime memory consumption for models with quantized weights has been reduced.
-* Models from the com.microsoft domain that use the following operations are now enabled:
-  SkipSimplifiedLayerNormalization, SimplifiedLayerNormalization, FusedMatMul, QLinearSigmoid,
-  QLinearLeakyRelu, QLinearAdd, QLinearMul, Range, DynamicQuantizeMatMul, MatMulIntegerToFloat.
* A workflow issue which affected reading of 2-byte data types has been fixed.


@@ -205,7 +208,7 @@ OpenVINO Model Server
* Generative endpoints are fully supported, including text generation and embeddings based on
  the OpenAI API, and reranking based on the Cohere API.
* Functional parity with the Linux version is available with minor differences.
-* The feature is targeted at client machines with Windows 11 and Data Center environment
+* The feature is targeted at client machines with Windows 11 and data center environments
  with Windows 2022 Server OS.
* Demos have been updated to work on both Linux and Windows. Check the
  `installation guide <https://docs.openvino.ai/2025/openvino-workflow/model-server/ovms_docs_deploying_server_baremetal.html>`__
@@ -284,7 +287,7 @@ The following has been added:
* Stateful decoder for WhisperPipeline. Whisper decoder models with past are deprecated.
* Export a model with new optimum-intel to obtain a stateful version.
* Performance metrics for WhisperPipeline.
-* initial_prompt and hotwords parameters for whisper pipeline allowing to guide generation.
+* initial_prompt and hotwords parameters for the Whisper pipeline, which can guide generation.

* LLMPipeline

@@ -297,10 +300,9 @@ The following has been added:
* rng_seed parameter to ImageGenerationConfig.
* Callback for image generation pipelines, allowing to track generation progress and obtain
  intermediate results.
-* EulerAncestralDiscreteScheduler - SDXL turbo.
-* PNDMScheduler - Stable Diffusion 1.x and 2.x.
-* Models: black-forest-labs/FLUX.1-schnell, Freepik/flux.1-lite-8B-alpha,
-  black-forest-labs/FLUX.1-dev, shuttleai/shuttle-3-diffusion.
+* EulerAncestralDiscreteScheduler for SDXL turbo.
+* PNDMScheduler for Stable Diffusion 1.x and 2.x.
+* Models: FLUX.1-Schnell, Flux.1-Lite-8B-Alpha, FLUX.1-Dev, and Shuttle-3-Diffusion.
* T5 encoder for SD3 Pipeline.

* VLMPipeline
@@ -351,18 +353,21 @@ Known Issues
| ID: 161336
| Description:
| Compilation of an openvino model performing weight quantization fails with Segmentation
-  Fault on LNL. The following workaround can be applied to make it work with existing OV
-  versions (including 25.0 RCs) before application run: export DNNL_MAX_CPU_ISA=AVX2_VNNI.
+  Fault on Intel® Core™ Ultra 200V processors. The following workaround can be applied to
+  make it work with existing OV versions (including 25.0 RCs) before application run:
+  export DNNL_MAX_CPU_ISA=AVX2_VNNI.
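Applied as-is, the workaround above is a one-line environment configuration set in the shell before launching the affected application (the variable name and value come straight from the notes; the application name is a placeholder):

```shell
# Workaround for issue 161336: cap oneDNN at AVX2_VNNI before starting the
# application that compiles the weight-quantized model on affected OV versions.
export DNNL_MAX_CPU_ISA=AVX2_VNNI
./your_app   # placeholder for the actual application binary
```

Since the variable only affects processes started from that shell, it must be re-exported (or placed in the launch script) for each run.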

| **Component: GPU Plugin**
| ID: 160802
| Description:
-| mllama model crashes on LNL. Please use OpenVINO 2024.6 or earlier to run the model.
+| mllama model crashes on Intel® Core™ Ultra 200V processors. Please use OpenVINO 2024.6 or
+  earlier to run the model.

| **Component: GPU Plugin**
| ID: 160948
| Description:
-| Several models have accuracy degradation on LNL, ACM, and BMG. Please use OpenVINO 2024.6
+| Several models have accuracy degradation on Intel® Core™ Ultra 200V processors,
+  Intel® Arc™ A-Series Graphics, and Intel® Arc™ B-Series Graphics. Please use OpenVINO 2024.6
  to run the models. Model list: Denoise, Sharpen-Sharpen, fastseg-small, hbonet-0.5,
  modnet_photographic_portrait_matting, modnet_webcam_portrait_matting,
  mobilenet-v3-small-1.0-224, nasnet-a-mobile-224, yolo_v4, yolo_v5m, yolo_v5s, yolo_v8n,

docs/articles_en/get-started/install-openvino/configurations.rst

Lines changed: 1 addition & 1 deletion
@@ -15,7 +15,7 @@ Additional Configurations
   For GPU <configurations/configurations-intel-gpu>
   For NPU <configurations/configurations-intel-npu>
   GenAI Dependencies <configurations/genai-dependencies>
-  Troubleshooting Guide for OpenVINO™ Installation & Configuration <troubleshooting-install-config.html>
+  Troubleshooting Guide for OpenVINO™ Installation & Configuration <configurations/troubleshooting-install-config>

For certain use cases, you may need to install additional software to benefit from the full
potential of OpenVINO™. Check the following list for components used in your workflow:
