
Commit 6061664

iupaikov-amd authored and pytorchmergebot committed
Enabled force_shape_pad for triton tests in test_kernel_benchmark (pytorch#147620)
During ROCm runs these tests naturally show that the padding path is slower on our architectures, so pad_mm opts out of padding and the tests fail. As I understand it, these tests are not meant to check WHETHER an operation should be padded in the first place, but HOW it is padded and whether that is done correctly. Moreover, the tests should not be hardware dependent or carry hardware-specific conditions.

Similar PR for reference: pytorch#141768

Pull Request resolved: pytorch#147620
Approved by: https://github.com/jeffdaily, https://github.com/chenyang78, https://github.com/shunting314
1 parent 651e6aa commit 6061664
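For context, here is a minimal sketch of what the patched decorator does: with force_shape_pad=True, Inductor's pad_mm always takes the padding path instead of first asking whether padding is profitable on the current hardware. The config knobs (config.patch, max_autotune, max_autotune_gemm_backends, force_shape_pad) are the ones touched by this commit; the run_padded_mm helper and the K/N sizes are hypothetical (only M = 2048 appears in the diff below), and this is not the benchmark test itself.

    # Sketch only, under the assumptions stated above: a CUDA/ROCm build of
    # PyTorch with Inductor and Triton available.
    import torch
    from torch._inductor import config


    @config.patch(
        max_autotune=True, max_autotune_gemm_backends="TRITON", force_shape_pad=True
    )
    def run_padded_mm():
        # Compile a plain matmul so Inductor's pad_mm pass can rewrite it;
        # force_shape_pad skips the "is padding worth it here?" heuristic.
        @torch.compile
        def f(a, b):
            return a @ b

        a = torch.randn(2048, 1000, device="cuda", dtype=torch.float16)  # hypothetical K
        b = torch.randn(1000, 5120, device="cuda", dtype=torch.float16)  # hypothetical N
        return f(a, b)


    if __name__ == "__main__":
        run_padded_mm()

Using config.patch as a decorator, as the tests below do, scopes the settings to a single callable, so other tests in the same process keep the default padding heuristic.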

File tree

1 file changed: +6 -2 lines changed


test/inductor/test_kernel_benchmark.py (+6 -2)
@@ -137,7 +137,9 @@ def f(x):
     # TODO: Currently the Triton mm template + relu fusion causes slowdown on XPU,
     # Need to refine the template and config for XPU.
     @expectedFailureXPU
-    @config.patch(max_autotune=True, max_autotune_gemm_backends="TRITON")
+    @config.patch(
+        max_autotune=True, max_autotune_gemm_backends="TRITON", force_shape_pad=True
+    )
     @fresh_inductor_cache()
     def test_matmul_triton_kernel_benchmark(self):
         M = 12544
@@ -153,7 +155,9 @@ def f(a, b):
         f(a, b)
         self.verify_compiled_kernels()

-    @config.patch(max_autotune=True, max_autotune_gemm_backends="TRITON")
+    @config.patch(
+        max_autotune=True, max_autotune_gemm_backends="TRITON", force_shape_pad=True
+    )
     @fresh_inductor_cache()
     def test_mm_triton_kernel_benchmark(self):
         M = 2048
