v2.12.0
Post-training Quantization:
Features:
- (OpenVINO, PyTorch, ONNX) Excluded comparison operators from the quantization scope for nncf.ModelType.TRANSFORMER.
- (OpenVINO, PyTorch) Changed the representation of symmetrically quantized weights from an unsigned integer with a fixed zero-point to a signed data type without a zero-point in the nncf.compress_weights() method.
- (OpenVINO) Extended pattern support of the AWQ algorithm as part of nncf.compress_weights(). This allows AWQ to be applied to a wider range of models.
- (OpenVINO) Introduced the nncf.CompressWeightsMode.E2M1 mode option of nncf.compress_weights() as the new MXFP4 precision (Experimental).
- (OpenVINO) Added support for models with BF16 precision in the nncf.quantize() method.
- (PyTorch) Added quantization support for torch.addmm.
- (PyTorch) Added quantization support for torch.nn.functional.scaled_dot_product_attention.
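The change in how symmetrically quantized weights are represented can be illustrated with a small, library-independent sketch. It is not NNCF code: the helper names, the 4-bit width, and the max-abs scale rule are assumptions chosen for illustration. It only shows that an unsigned value with a fixed zero-point and a signed value without a zero-point dequantize to the same number.

```python
def quantize_signed(weights, num_bits=4):
    """Hypothetical helper: symmetric quantization to a signed grid, no zero-point."""
    q_min = -(2 ** (num_bits - 1))
    q_max = 2 ** (num_bits - 1) - 1
    max_abs = max(abs(w) for w in weights)
    # Assumed scale rule: map the largest magnitude onto q_max.
    scale = max_abs / q_max if max_abs > 0 else 1.0
    quantized = [min(q_max, max(q_min, round(w / scale))) for w in weights]
    return quantized, scale


def to_unsigned(quantized, num_bits=4):
    """Old-style representation: shift by a fixed zero-point of 2**(num_bits - 1)."""
    zero_point = 2 ** (num_bits - 1)
    return [q + zero_point for q in quantized], zero_point


def dequantize_signed(quantized, scale):
    # Signed representation: no zero-point subtraction is needed.
    return [q * scale for q in quantized]


def dequantize_unsigned(quantized, scale, zero_point):
    # Unsigned representation: the fixed zero-point is subtracted first.
    return [(q - zero_point) * scale for q in quantized]


weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
q_signed, scale = quantize_signed(weights)
q_unsigned, zero_point = to_unsigned(q_signed)

# Both representations reconstruct identical values.
assert dequantize_signed(q_signed, scale) == dequantize_unsigned(q_unsigned, scale, zero_point)
```

In this sketch the signed form stores one tensor fewer (no zero-point) and skips one subtraction at dequantization time, while the reconstructed weights stay the same.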
Fixes:
- (OpenVINO, PyTorch, ONNX) Fixed the FastBiasCorrection and BiasCorrection algorithms to correctly support transposed MatMul layers.
- (OpenVINO) Fixed nncf.IgnoredScope() functionality for models with the If operation.
- (OpenVINO) Fixed patterns with PReLU operations.
- Fixed a runtime error when importing NNCF without the Matplotlib package.
Improvements:
- Reduced the amount of memory required for applying nncf.compress_weights() to OpenVINO models.
- Improved logging for a non-empty nncf.IgnoredScope().
Tutorials:
- Post-Training Optimization of Stable Audio Open Model
- Post-Training Optimization of Phi3-Vision Model
- Post-Training Optimization of MiniCPM-V2 Model
- Post-Training Optimization of Jina CLIP Model
- Post-Training Optimization of Stable Diffusion v3 Model
- Post-Training Optimization of HunyuanDIT Model
- Post-Training Optimization of DDColor Model
- Post-Training Optimization of DynamiCrafter Model
- Post-Training Optimization of DepthAnythingV2 Model
- Post-Training Optimization of Kosmos-2 Model
Compression-aware training:
Fixes:
- (PyTorch) Fixed an issue with wrapping operators that have no patched state.
Requirements:
- Updated TensorFlow to version 2.15. This version requires Python 3.9-3.11.
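For environments that use the TensorFlow backend, the Python requirement above can be checked before installation with a short guard. This is a minimal sketch: the function name and constants are hypothetical, and the bounds are simply copied from the note above.

```python
import sys

# Inclusive bounds taken from the requirement stated in these release notes.
MIN_PY = (3, 9)
MAX_PY = (3, 11)


def is_supported_python(version=None):
    """Hypothetical helper: True if (major, minor) lies within 3.9-3.11."""
    major_minor = tuple((version or sys.version_info)[:2])
    return MIN_PY <= major_minor <= MAX_PY


if not is_supported_python():
    print("Warning: this interpreter is outside the Python 3.9-3.11 range.")
```

Passing an explicit tuple, for example `is_supported_python((3, 10))`, makes the check usable in setup scripts that validate a target interpreter rather than the running one.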
Acknowledgements
Thanks for contributions from the OpenVINO developer community:
@Lars-Codes