|
1 | 1 | # Release Notes
|
2 | 2 |
|
| 3 | +## New in Release 2.15.0 |
| 4 | + |
| 5 | +Post-training Quantization: |
| 6 | + |
| 7 | +- Features: |
| 8 | + - (TensorFlow) The `nncf.quantize()` method is now the recommended API for Quantization-Aware Training. See the [example](examples/quantization_aware_training/tensorflow/mobilenet_v2) for details on how to use the new approach. |
| 9 | + - (TensorFlow) The placement of compression layers in the model can now be serialized and restored with the new API functions `nncf.tensorflow.get_config()` and `nncf.tensorflow.load_from_config()`. See the [documentation](docs/usage/training_time_compression/quantization_aware_training/Usage.md#saving-and-loading-compressed-models) for details on saving and loading a quantized model. |
| 10 | + - (OpenVINO) Added [example](examples/llm_compression/openvino/smollm2_360m_fp8) with LLM quantization to FP8 precision. |
| 11 | + - (TorchFX, Experimental) Preview support for the new `quantize_pt2e` API has been introduced, enabling quantization of `torch.fx.GraphModule` models with the `OpenVINOQuantizer` and the `X86InductorQuantizer` quantizers. The `quantize_pt2e` API uses the MinMax algorithm's statistic collectors, as well as the SmoothQuant, BiasCorrection and FastBiasCorrection Post-Training Quantization algorithms. |
| 12 | + - Added scale unification for the ScaledDotProductAttention operation. |
| 13 | +- Fixes: |
| 14 | + - (ONNX) Fixed sporadic accuracy issues with the BiasCorrection algorithm. |
| 15 | + - (ONNX) Fixed GroupConvolution operation weight quantization, which also improves performance for a number of models. |
| 16 | + - Fixed the AccuracyAwareQuantization algorithm to resolve issue [#3118](https://github.com/openvinotoolkit/nncf/issues/3118). |
| 17 | + - Fixed an issue with using NNCF when a backend framework installation is potentially corrupted. |
| 18 | +- Improvements: |
| 19 | + - (TorchFX, Experimental) Added YoloV11 support. |
| 20 | + - (OpenVINO) The performance of the FastBiasCorrection algorithm was improved. |
| 21 | + - Significantly faster data-free weight compression for OpenVINO models: INT4 compression is now up to 10x faster, while INT8 compression is up to 3x faster. The larger the model, the greater the time reduction. |
| 22 | + - AWQ weight compression is now up to 2x faster, improving overall runtime efficiency. |
| 23 | + - Peak memory usage during INT4 data-free weight compression in the OpenVINO backend is reduced by up to 50% for certain models. |
| 24 | +- Deprecations/Removals: |
| 25 | + - (TensorFlow) The `nncf.tensorflow.create_compressed_model()` method is now marked as deprecated. Please use the `nncf.quantize()` method to initialize quantization. |
| 26 | +- Tutorials: |
| 27 | + - [Post-Training Optimization of GLM-Edge-V Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/glm-edge-v/glm-edge-v.ipynb) |
| 28 | + - [Post-Training Optimization of OmniGen Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/omnigen/omnigen.ipynb) |
| 29 | + - [Post-Training Optimization of Sana Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/sana-image-generation/sana-image-generation.ipynb) |
| 30 | + - [Post-Training Optimization of BGE Models](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-rag-langchain/llm-rag-langchain-genai.ipynb) |
| 31 | + - [Post-Training Optimization of Stable Diffusion Inpainting Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/inpainting-genai/inpainting-genai.ipynb) |
| 32 | + - [Post-Training Optimization of LTX Video Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/ltx-video/ltx-video.ipynb) |
| 33 | + - [Post-Training Optimization of DeepSeek-R1-Distill Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/llm-chatbot/llm-chatbot-generate-api.ipynb) |
| 34 | + - [Post-Training Optimization of Janus DeepSeek-LLM-1.3b Model](https://github.com/openvinotoolkit/openvino_notebooks/blob/latest/notebooks/janus-multimodal-generation/janus-multimodal-generation.ipynb) |
| 35 | + |
| 36 | +Requirements: |
| 37 | + |
| 38 | +- Updated the minimum version for `numpy` (>=1.24.0). |
| 39 | +- Removed `tqdm` dependency. |
| 40 | + |
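The `numpy` requirement above can be checked in an existing environment before upgrading. The following is a minimal sketch using only the standard library; the helper names (`release_tuple`, `meets_minimum`, `check_numpy`) are hypothetical and not part of NNCF, and pre-release suffixes such as `rc1` are simply ignored, which is an assumption made to keep the example short.

```python
import re
from importlib.metadata import PackageNotFoundError, version

MIN_NUMPY = "1.24.0"  # minimum version stated in the release notes


def release_tuple(ver: str, width: int = 3) -> tuple:
    """Turn '1.24.0' or '2.0.0rc1' into a tuple of ints such as (1, 24, 0).

    Only the leading numeric part is used; suffixes like 'rc1' are dropped.
    """
    match = re.match(r"\d+(?:\.\d+)*", ver)
    if match is None:
        raise ValueError(f"unrecognised version string: {ver!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    parts += [0] * (width - len(parts))  # pad '1.24' to (1, 24, 0)
    return tuple(parts)


def meets_minimum(installed: str, minimum: str = MIN_NUMPY) -> bool:
    """Return True if the installed version is at least the minimum."""
    return release_tuple(installed) >= release_tuple(minimum)


def check_numpy() -> str:
    """Describe whether the installed numpy satisfies the requirement."""
    try:
        installed = version("numpy")
    except PackageNotFoundError:
        return "numpy is not installed"
    status = "OK" if meets_minimum(installed) else f"too old, need >={MIN_NUMPY}"
    return f"numpy {installed}: {status}"


if __name__ == "__main__":
    print(check_numpy())
```

For anything beyond a quick check, a full version parser that handles pre-releases is the safer choice; the comparison here only looks at the numeric release segment.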
3 | 41 | ## New in Release 2.14.1
|
4 | 42 |
|
5 | 43 | Post-training Quantization:
|
|