
[numeric] Numeric error for Conv operator with quantize/dequantize #19416

Closed

pdhirajkumarprasad opened this issue Dec 9, 2024 · 9 comments

Labels: bug 🐞 Something isn't working

Comments

pdhirajkumarprasad commented Dec 9, 2024

What happened?

For the given IR:

module {
  func.func @main_graph(%arg0: !torch.vtensor<[1,3,224,224],f32>, %arg1: !torch.vtensor<[1,24,112,112],f32>) -> !torch.vtensor<[1,24,112,112],f32> attributes {torch.onnx_meta.ir_version = 8 : si64, torch.onnx_meta.opset_version = 21 : si64, torch.onnx_meta.opset_versions = {ai.onnx.contrib = 1 : si64, ai.onnx.ml = 4 : si64, ai.onnx.preview.training = 1 : si64, ai.onnx.training = 1 : si64, com.microsoft = 1 : si64, com.microsoft.experimental = 1 : si64, com.microsoft.nchwc = 1 : si64, org.pytorch.aten = 1 : si64}, torch.onnx_meta.producer_name = "vai_q_onnx", torch.onnx_meta.producer_version = "1.17.0+43059a7"} {
    %12 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %13 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<5.000000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %14 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<1.000000e+00> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %15 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %16 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_onnx__Conv_1060_quantized> : tensor<24x1x3x3xsi8>} : () -> !torch.vtensor<[24,1,3,3],si8> 
    %17 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<2.500000e-01> : tensor<f32>} : () -> !torch.vtensor<[],f32> 
    %18 = torch.operator "onnx.Constant"() {torch.onnx.value = dense<0> : tensor<si8>} : () -> !torch.vtensor<[],si8> 
    %19 = torch.operator "onnx.Constant"() {torch.onnx.value = dense_resource<_onnx__Conv_1061_quantized> : tensor<24xsi8>} : () -> !torch.vtensor<[24],si8> 
    %24 = torch.operator "onnx.DequantizeLinear"(%16, %14, %15) : (!torch.vtensor<[24,1,3,3],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[24,1,3,3],f32> 
    %25 = torch.operator "onnx.DequantizeLinear"(%19, %17, %18) : (!torch.vtensor<[24],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[24],f32> 
    %35 = torch.operator "onnx.QuantizeLinear"(%arg1, %13, %12) : (!torch.vtensor<[1,24,112,112],f32>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,24,112,112],si8> 
    %36 = torch.operator "onnx.DequantizeLinear"(%35, %13, %12) : (!torch.vtensor<[1,24,112,112],si8>, !torch.vtensor<[],f32>, !torch.vtensor<[],si8>) -> !torch.vtensor<[1,24,112,112],f32> 
    %37 = torch.operator "onnx.Conv"(%36, %24, %25) {torch.onnx.auto_pad = "NOTSET", torch.onnx.dilations = [1 : si64, 1 : si64], torch.onnx.group = 24 : si64, torch.onnx.kernel_shape = [3 : si64, 3 : si64], torch.onnx.pads = [1 : si64, 1 : si64, 1 : si64, 1 : si64], torch.onnx.strides = [1 : si64, 1 : si64]} : (!torch.vtensor<[1,24,112,112],f32>, !torch.vtensor<[24,1,3,3],f32>, !torch.vtensor<[24],f32>) -> !torch.vtensor<[1,24,112,112],f32> 
    return %37 : !torch.vtensor<[1,24,112,112],f32>
  }
}

{-#
  dialect_resources: {
    builtin: {
      _onnx__Conv_1060_quantized: "0x0800000000000000FF00000000000000000000000000000000000000000000000000000000000000FBE208EAA4F91B7A0100000000FE0000000000010000010000FE0000320700CEF703FD0200000000FF0000020003F9FDF529FCFEFB0200000001FF0000000000000000000000000000000000020000000000000000000000010000000000010000000000000000000000000000000000000000000000000000FC0100000300000000010000000000000000030000000000000000FF00000000FF0001FE000200000000000000FF00000000000000000000000000",
      _onnx__Conv_1061_quantized: "0x0800000012044E020B59F50B030B0B0F020114FBFE0800FE040B1014"
    }
  }
#-}

Getting a numeric error:

EXEC @main_graph
[FAILED] result[0]: element at index 50176 (3) does not match the expected (2.75); expected that the view is equal to contents of a view of 1x24x112x112xf32
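Note that the failing value is suggestive: with the input scale of 0.5 from the IR, 2.75 is not representable on the int8 grid and round-trips to exactly 3.0 through a quantize/dequantize pair. The sketch below illustrates ONNX QuantizeLinear/DequantizeLinear semantics in NumPy (an illustration only, not IREE's implementation):

```python
import numpy as np

def quantize_linear(x, scale, zero_point=0):
    # ONNX QuantizeLinear for int8: divide by scale, round half to even,
    # add the zero point, then saturate to [-128, 127].
    q = np.rint(np.asarray(x, dtype=np.float32) / scale) + zero_point
    return np.clip(q, -128, 127).astype(np.int8)

def dequantize_linear(q, scale, zero_point=0):
    # ONNX DequantizeLinear: subtract the zero point and rescale.
    return (q.astype(np.int32) - zero_point).astype(np.float32) * scale

x = np.array([2.75, -0.3, 100.0], dtype=np.float32)
q = quantize_linear(x, 0.5)    # 2.75/0.5 = 5.5 rounds to 6; 100/0.5 = 200 saturates to 127
y = dequantize_linear(q, 0.5)  # [3.0, -0.5, 63.5] -- 2.75 comes back as 3.0
print(q.tolist(), y.tolist())
```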

Steps to reproduce your issue

Commands:

iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=host -o compiled_model.vmfb 

iree-run-module --module='compiled_model.vmfb' --device=local-task --function='main_graph' --input='[email protected]' --input='[email protected]'  --output=@'output.0.bin' --expected_output='1x24x112x112xf32=@golden_output.0.bin'

Version: IREE compiler version 3.1.0rc20241208 @ 39c56de

golden_output.0.bin.txt
input.0.bin.txt
input.1.bin.txt
model.torch_onnx.mlir.txt

model impacted: fbnetv3_d.ra2_in1k* and other, total 50+ models

What component(s) does this issue relate to?

Runtime

Version information

No response

Additional context

No response

@pdhirajkumarprasad pdhirajkumarprasad added the bug 🐞 Something isn't working label Dec 9, 2024
@pdhirajkumarprasad pdhirajkumarprasad changed the title [numeric] Numeric error for HardSigmoid with Conv operator [numeric] Numeric error for Conv operator with quantize/dequantize Dec 9, 2024

zjgarvey commented Dec 9, 2024

What model is this coming from? The fact that the bias quantization scale is not the same as the product of weight and input scales is concerning. I would not expect us to have comparable numerics for such an example without some additional work in the TorchFuseQuantizedOps pass.
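To make the concern concrete: with the scales in the IR above (input 0.5, weight 1.0, bias 0.25), a fused integer conv that assumes the bias shares the accumulator scale input_scale * weight_scale disagrees with the float reference. A toy sketch with hypothetical int8 values (not the actual kernel):

```python
# Scales from the IR in this issue: input 0.5, weight 1.0, bias 0.25.
s_in, s_w, s_b = 0.5, 1.0, 0.25

# A toy 1x1 "conv" with hypothetical int8 values.
q_in, q_w, q_b = 5, 3, 7

# Float reference: dequantize every operand, then accumulate.
ref = (q_in * s_in) * (q_w * s_w) + q_b * s_b  # 7.5 + 1.75 = 9.25

# A fused integer conv accumulates at scale s_in * s_w; if it assumes the
# bias was quantized at that same scale, the bias contribution is wrong:
fused = (q_in * q_w + q_b) * (s_in * s_w)      # (15 + 7) * 0.5 = 11.0

print(ref, fused)
```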


zjgarvey commented Dec 9, 2024

See https://github.com/zjgarvey/SHARK-TestSuite/tree/conv_numerics_repro for operator level tests in the test suite.

You can run both of the examples with:

python run.py -m cl-onnx-iree -v -t qconv_numerics

If all of our models are failing because of bias, weight, and input scale mismatches, we should add support for this behavior at the torch-mlir level.
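One shape such support could take (a hypothetical sketch, not the actual torch-mlir pass): requantize the bias into the accumulator scale input_scale * weight_scale before the integer add, after which the fused result matches the float reference to within rounding:

```python
# Hypothetical sketch of bias rescaling a fusion pass could perform when
# bias_scale != input_scale * weight_scale (scales from the IR: 0.25 vs 0.5).
s_in, s_w, s_b = 0.5, 1.0, 0.25
q_in, q_w, q_b = 5, 3, 7                       # toy int8 values

# Requantize the bias into the accumulator scale before the integer add.
# (Real kernels would widen to int32 here to avoid overflow.)
acc_scale = s_in * s_w
q_b_acc = round(q_b * s_b / acc_scale)         # round(3.5) -> 4 (half to even)

fused = (q_in * q_w + q_b_acc) * acc_scale     # (15 + 4) * 0.5 = 9.5
ref = (q_in * s_in) * (q_w * s_w) + q_b * s_b  # 9.25 -- now within one step
print(fused, ref)
```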


JibAxelera commented Feb 25, 2025

Hi @pdhirajkumarprasad, I've tried to compile your code but got the following:

error: failed to legalize operation 'torch.operator' that was explicitly marked illegal
%24 = torch.operator "onnx.DequantizeLinear"

Any idea how to make it compile?

@zjgarvey (Contributor)

> Hi @pdhirajkumarprasad , I've tried to compile your code but got the following : error: failed to legalize operation 'torch.operator' that was explicitly marked illegal
> %24 = torch.operator "onnx.DequantizeLinear"
>
> Any Idea on how to make it compile ?

What was your compile command?

@pdhirajkumarprasad (Author)

@zjgarvey As mentioned in the issue description, I am using the following commands and can still reproduce the issue I reported:

iree-compile model.torch_onnx.mlir --iree-hal-target-backends=llvm-cpu --iree-llvmcpu-target-cpu=host -o compiled_model.vmfb 

iree-run-module --module='compiled_model.vmfb' --device=local-task --function='main_graph' --input='[email protected]' --input='[email protected]'  --output=@'output.0.bin' --expected_output='1x24x112x112xf32=@golden_output.0.bin'

@zjgarvey (Contributor)

@pdhirajkumarprasad , I was asking @JibAxelera , since they had some compilation failure that doesn't make sense to me.

This compiles fine for me with your repro command on:

iree-base-compiler 3.3.0rc20250224
iree-base-runtime  3.3.0rc20250224

@zjgarvey (Contributor)

@pdhirajkumarprasad Do you mind if I move this issue to torch-mlir?

The reason for the error is the discrepancy between the input, weight, and bias quantization scales, which we aren't handling correctly in torch-mlir.

@pdhirajkumarprasad (Author)

@zjgarvey Yes, please

@zjgarvey (Contributor)

Closing this issue in favor of filing in torch-mlir: llvm/torch-mlir#4059
