Description
Phi-3-mini fails while creating a session because of this validation check: https://github.com/openvinotoolkit/openvino/blob/master/src/frontends/onnx/frontend/src/op/dequantize_linear.cpp#L228
To support block-wise dequantizeLinear, OV first reshapes the input to [x.shape[0]/block_size, block_size, x.shape[1]], then unsqueezes zeroPoint and scale with axes = {1}, and finally broadcasts scale and zeroPoint against the reshaped input. I suppose OV restricts axis to 0 to simplify this emulation.
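The reshape, unsqueeze, and broadcast steps can be sketched in NumPy roughly as follows. This only illustrates the emulation as I understand it and is not OpenVINO's actual code. The function name `blockwise_dequantize_axis0` and the toy shapes are made up for illustration, and the input is assumed to be 2-D with x.shape[0] divisible by block_size.

```python
import numpy as np

def blockwise_dequantize_axis0(x, scale, zero_point, block_size):
    """Illustrative block-wise dequantization along axis 0 of a 2-D tensor."""
    rows, cols = x.shape
    assert rows % block_size == 0, "axis 0 must be divisible by block_size"
    num_blocks = rows // block_size
    assert scale.shape == (num_blocks, cols)
    assert zero_point.shape == (num_blocks, cols)

    # 1. Reshape the input to [rows / block_size, block_size, cols].
    #    Cast to float first so subtracting the zero point cannot wrap around.
    x3 = x.astype(np.float32).reshape(num_blocks, block_size, cols)

    # 2. Unsqueeze scale and zero point at axis 1: [num_blocks, 1, cols].
    scale3 = np.expand_dims(scale.astype(np.float32), axis=1)
    zp3 = np.expand_dims(zero_point.astype(np.float32), axis=1)

    # 3. Broadcasting stretches the size-1 axis over block_size.
    y3 = (x3 - zp3) * scale3

    # Restore the original 2-D shape.
    return y3.reshape(rows, cols)

# Toy example: 4 rows, 2 columns, block_size 2, so 2 blocks along axis 0.
x = np.arange(8, dtype=np.uint8).reshape(4, 2)
scale = np.array([[0.5, 1.0], [2.0, 4.0]], dtype=np.float32)
zero_point = np.array([[0, 1], [2, 3]], dtype=np.uint8)
y = blockwise_dequantize_axis0(x, scale, zero_point, block_size=2)
```

Because the blocks are formed by splitting axis 0, this particular reshape only works when the blocked axis is 0, which would be consistent with the restriction in the validation code.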
Maybe we need to insert a transpose before and after dequantizeLinear when the blocked axis is not 0? @huningxin WDYT?
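To make the transpose idea concrete, here is a rough NumPy sketch for a 2-D input blocked along axis 1. The helper names `dequantize_axis0` and `dequantize_axis1_via_transpose` are hypothetical, and `dequantize_axis0` is just a stand-in for a backend that only accepts axis 0. Note that in this sketch scale and zeroPoint have to be transposed as well, not only the input.

```python
import numpy as np

def dequantize_axis0(x, scale, zero_point, block_size):
    # Stand-in for an axis-0-only backend: each row of scale / zero_point
    # applies to block_size consecutive rows of x.
    s = np.repeat(scale.astype(np.float32), block_size, axis=0)
    zp = np.repeat(zero_point.astype(np.float32), block_size, axis=0)
    return (x.astype(np.float32) - zp) * s

def dequantize_axis1_via_transpose(x, scale, zero_point, block_size):
    # Transpose so the blocked axis becomes axis 0, dequantize, transpose back.
    y_t = dequantize_axis0(x.T, scale.T, zero_point.T, block_size)
    return y_t.T

# Toy example: 2 rows, 4 columns, block_size 2 along axis 1.
x = np.arange(8, dtype=np.uint8).reshape(2, 4)
scale = np.array([[0.5, 2.0], [1.0, 4.0]], dtype=np.float32)  # [2, 4 / 2]
zero_point = np.array([[0, 2], [1, 3]], dtype=np.uint8)
y = dequantize_axis1_via_transpose(x, scale, zero_point, block_size=2)

# Direct axis-1 computation for comparison.
expected = (
    x.astype(np.float32) - np.repeat(zero_point.astype(np.float32), 2, axis=1)
) * np.repeat(scale, 2, axis=1)
```

The transposed path and the direct axis-1 computation agree on this toy input, at the cost of extra transposes for the input, scale, zeroPoint, and output.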