
Commit 879e80c

Update ReleaseNotes.md (#3214) (#3238)
(cherry picked from commit 80bd756)
1 parent be61f69 commit 879e80c

1 file changed: +38 -0 lines changed


ReleaseNotes.md

+38
@@ -1,5 +1,43 @@
# Release Notes

## New in Release 2.15.0

Post-training Quantization:

- Features:
  - (TensorFlow) The `nncf.quantize()` method is now the recommended API for Quantization-Aware Training. Please refer to an [example](examples/quantization_aware_training/tensorflow/mobilenet_v2) for more details about how to use the new approach (see the TensorFlow sketch after this list).
  - (TensorFlow) Compression layer placement in the model can now be serialized and restored with the new API functions `nncf.tensorflow.get_config()` and `nncf.tensorflow.load_from_config()`. Please see the [documentation](docs/usage/training_time_compression/quantization_aware_training/Usage.md#saving-and-loading-compressed-models) on saving/loading of a quantized model for more details.
  - (OpenVINO) Added an [example](examples/llm_compression/openvino/smollm2_360m_fp8) of LLM quantization to FP8 precision.
  - (TorchFX, Experimental) Preview support for the new `quantize_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer` and `X86InductorQuantizer` quantizers. The `quantize_pt2e` API uses MinMax algorithm statistic collectors, as well as the SmoothQuant, BiasCorrection, and FastBiasCorrection Post-Training Quantization algorithms (see the TorchFX sketch after this list).
  - Added unification of scales for the ScaledDotProductAttention operation.
- Fixes:
  - (ONNX) Fixed sporadic accuracy issues with the BiasCorrection algorithm.
  - (ONNX) Fixed GroupConvolution operation weight quantization, which also improves performance for a number of models.
  - Fixed the AccuracyAwareQuantization algorithm to resolve issue [#3118](https://github.com/openvinotoolkit/nncf/issues/3118).
  - Fixed an issue with NNCF usage with potentially corrupted backend frameworks.
- Improvements:
  - (TorchFX, Experimental) Added YoloV11 support.
  - (OpenVINO) The performance of the FastBiasCorrection algorithm was improved.
  - Significantly faster data-free weight compression for OpenVINO models: INT4 compression is now up to 10x faster, while INT8 compression is up to 3x faster. The larger the model, the greater the time reduction.
  - AWQ weight compression is now up to 2x faster, improving overall runtime efficiency.
  - Peak memory usage during INT4 data-free weight compression in the OpenVINO backend is reduced by up to 50% for certain models.
- Deprecations/Removals:
  - (TensorFlow) The `nncf.tensorflow.create_compressed_model()` method is now marked as deprecated. Please use the `nncf.quantize()` method for quantization initialization.
- Tutorials:
  - [Post-Training Optimization of GLM-Edge-V Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/glm-edge-v/glm-edge-v.ipynb)
  - [Post-Training Optimization of OmniGen Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/omnigen/omnigen.ipynb)
  - [Post-Training Optimization of Sana Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sana-image-generation/sana-image-generation.ipynb)
  - [Post-Training Optimization of BGE Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb)
  - [Post-Training Optimization of Stable Diffusion Inpainting Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/inpainting-genai/inpainting-genai.ipynb)
  - [Post-Training Optimization of LTX Video Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/ltx-video/ltx-video.ipynb)
  - [Post-Training Optimization of DeepSeek-R1-Distill Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb)
  - [Post-Training Optimization of Janus DeepSeek-LLM-1.3b Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/janus-multimodal-generation/janus-multimodal-generation.ipynb)
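
A minimal sketch of the new TensorFlow Quantization-Aware Training flow referenced in the Features list above. The model, dataset, and `transform_func` here are illustrative placeholders, and the exact behavior of `nncf.tensorflow.get_config()`/`load_from_config()` should be checked against the linked documentation:

```python
import tensorflow as tf

import nncf
import nncf.tensorflow

# Illustrative placeholders: any Keras model and a small representative dataset.
model = tf.keras.applications.MobileNetV2(weights=None)
calibration_data = tf.data.Dataset.from_tensor_slices(
    tf.random.uniform([32, 224, 224, 3])
).batch(1)

# Wrap the data source; transform_func maps a dataset item to the model inputs.
calibration_dataset = nncf.Dataset(calibration_data, transform_func=lambda x: x)

# Insert quantization operations; this replaces the deprecated
# nncf.tensorflow.create_compressed_model() entry point.
quantized_model = nncf.quantize(model, calibration_dataset)

# ... fine-tune quantized_model with the regular Keras fit() loop ...

# Serialize the placement of compression layers and restore it onto a fresh model.
config = nncf.tensorflow.get_config(quantized_model)
restored_model = nncf.tensorflow.load_from_config(model, config)
```

The same `nncf.quantize()` entry point covers both post-training quantization and Quantization-Aware Training initialization, which is why `create_compressed_model()` now appears under Deprecations/Removals above.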
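
Similarly, a hedged sketch of the experimental TorchFX `quantize_pt2e` flow, assuming PyTorch >= 2.5 for `torch.export.export_for_training` and assuming the entry points are importable from `nncf.experimental.torch.fx` (verify the import paths and the `quantize_pt2e` signature against the NNCF documentation):

```python
import torch
import torchvision.models as models

import nncf
# Assumed import location for the experimental TorchFX API.
from nncf.experimental.torch.fx import OpenVINOQuantizer, quantize_pt2e

model = models.mobilenet_v2().eval()
example_input = torch.randn(1, 3, 224, 224)

# Capture the model as a torch.fx.GraphModule via the PT2 export flow.
fx_model = torch.export.export_for_training(model, (example_input,)).module()

# A few representative inputs are enough for the MinMax statistic collectors.
calibration_dataset = nncf.Dataset([example_input])

# Quantize the captured graph with the OpenVINO-oriented quantizer;
# X86InductorQuantizer can be passed here instead for the Inductor backend.
quantized_fx_model = quantize_pt2e(fx_model, OpenVINOQuantizer(), calibration_dataset)
```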
Requirements:

- Updated the minimal version for `numpy` (>=1.24.0).
- Removed `tqdm` dependency.
## New in Release 2.14.1

Post-training Quantization:
