[Feature Request] Add dynamic kernel selection to torchao/experimental #1376

metascroy · 2024-12-04T01:22:37Z

Currently in torchao/experimental, we use a ukernel config to identify the function pointers to use in the linear operator.
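
For readers unfamiliar with the pattern, here is a minimal sketch of a config that carries function pointers and an operator that calls through it. The names `UKernelConfig`, `reference_kernel`, and `linear` are hypothetical and used only for illustration; they are not the actual torchao/experimental types or signatures.

```cpp
#include <cassert>

// Hypothetical config: a bundle of function pointers plus tiling metadata.
// Names and signatures are illustrative assumptions, not torchao's API.
struct UKernelConfig {
  using kernel_fn = void (*)(float* out, const float* activations,
                             const float* weights, int m, int n, int k);
  int mr;            // output rows the kernel handles per tile (metadata only here)
  int nr;            // output columns the kernel handles per tile (metadata only here)
  kernel_fn kernel;  // the compute routine the operator dispatches to
};

// Plain reference kernel: out[m x n] = activations[m x k] * weights[k x n].
inline void reference_kernel(float* out, const float* activations,
                             const float* weights, int m, int n, int k) {
  for (int i = 0; i < m; ++i) {
    for (int j = 0; j < n; ++j) {
      float acc = 0.0f;
      for (int p = 0; p < k; ++p) {
        acc += activations[i * k + p] * weights[p * n + j];
      }
      out[i * n + j] = acc;
    }
  }
}

// The operator never names a concrete kernel; it only calls through the config.
inline void linear(const UKernelConfig& config, float* out,
                   const float* activations, const float* weights,
                   int m, int n, int k) {
  assert(config.kernel != nullptr);
  config.kernel(out, activations, weights, m, n, k);
}
```

Because the operator depends only on the config, swapping in a different kernel means building a different `UKernelConfig`, which is what makes the selection step the natural extension point.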

At runtime, we select which ukernel config to use, but the current logic is very simplistic. This is partly because we currently have only one kind of kernel.

But if we wish to support more kernels in the future (e.g., GEMM kernels, kernels from KleidiAI, kernels based on i8mm), we need a better ukernel config selection mechanism.

We'd like to select an appropriate ukernel based on features like CPU uarch, activation size, and packing format. We can use CPU info to get CPU uarch. The feature request here is to design an efficient dynamic kernel selection infrastructure. XNNPACK has a similar feature implemented.
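
As a starting point for discussion, the selection described above could be sketched roughly as follows. Everything here is an assumption made for illustration: `PackingFormat`, `CpuFeatures`, `UKernelSelector`, and the kernel names are hypothetical, the CPU features are passed in by the caller rather than queried from a real library, and a string stands in for the function-pointer table. The choice is cached per (packing format, GEMV vs. GEMM) so repeated calls do not redo the decision.

```cpp
#include <cassert>
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>

// All names below are hypothetical; they do not mirror torchao's real types.
enum class PackingFormat : std::uint8_t { kUniversal = 0, kKleidiAI = 1 };

// In practice these flags would come from a CPU query library.
struct CpuFeatures {
  bool has_dotprod;
  bool has_i8mm;
};

struct UKernelConfig {
  std::string name;  // stands in for the real table of function pointers
};

class UKernelSelector {
 public:
  explicit UKernelSelector(CpuFeatures cpu) : cpu_(cpu) {}

  // m is the number of activation rows; m == 1 is the GEMV case.
  const UKernelConfig& select(PackingFormat format, int m) {
    assert(m > 0);
    const bool gemm = m > 1;
    const std::uint32_t key =
        (static_cast<std::uint32_t>(format) << 1) | (gemm ? 1u : 0u);
    auto it = cache_.find(key);
    if (it == cache_.end()) {
      // First request for this key: decide once and remember the result.
      it = cache_.emplace(key, UKernelConfig{pick(format, gemm)}).first;
    }
    return it->second;
  }

 private:
  // The packing format restricts which kernels can read the weights;
  // CPU features and activation shape choose among the compatible ones.
  std::string pick(PackingFormat format, bool gemm) const {
    if (format == PackingFormat::kUniversal) {
      return "universal_gemv";  // only one kind of kernel exists for this format
    }
    if (gemm && cpu_.has_i8mm) return "kleidiai_gemm_i8mm";
    if (cpu_.has_dotprod) return "kleidiai_gemv_dotprod";
    throw std::runtime_error("no kernel supports this packing format on this CPU");
  }

  CpuFeatures cpu_;
  std::unordered_map<std::uint32_t, UKernelConfig> cache_;
};
```

This sketch is not thread-safe and selects per process rather than per core; a real design would need to address both, and would also need to pick the packing format itself at weight-packing time.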

For now, we will select the ukernel config based on the CPU, but in the future we might want to extend the design to select a different ukernel config based on the CPU core.

cc @digantdesai @kimishpatel @supriyar @msaroufim

metascroy · 2024-12-04T01:28:25Z

@supriyar @msaroufim an external contributor reached out to me and expressed interest in this project (https://github.com/Darshcg). How can I tag him on the issue?

Darshcg · 2024-12-04T02:33:25Z

Thank you @metascroy for creating this issue and adding me.

Hi @supriyar @msaroufim, I am Darshan, an AI Inference and Compiler Engineer. Over the past few years, I have mostly been working on optimizing quantized (low-bit) and sparse inference for x86 and ARM targets, primarily with QNNPACK and XNNPACK. Although I have not contributed to open-source projects yet, you can explore some of my related work here: https://scholar.google.com/citations?user=GUaOqIIAAAAJ&hl=en

Last Saturday, I attended Scott's session on low-bit inference (via CUDA mode) and reached out to him regarding potential contributions and collaborations. That led me here. I look forward to working on this with you all, solving problems, and learning new things. Thanks!

msaroufim · 2024-12-04T03:12:31Z

Wonderful! I'm glad you two got to meet. At a high level your plan sounds reasonable. It will be much easier to give feedback once we have some sample PRs to look at!

metascroy · 2024-12-05T00:34:15Z

Awesome @Darshcg!

Maybe the first thing to do is study the current ukernel selection logic, look at what other libraries like XNNPACK do, and put together some sample prototype PRs or an RFC/design proposal? We can then chime in and give feedback?

Darshcg · 2024-12-05T04:01:37Z

For sure @metascroy! I will go through the existing ukernel selection implementation in torchao/experimental and compare it with XNNPACK's kernel selection mechanisms. I aim to identify the differences, possible improvements, and any best practices we could adopt. Based on this, I will come up with a draft design proposal or sample PR for review and feedback. Thanks!

metascroy · 2025-01-16T17:50:44Z

@Darshcg are you still working on this?

metascroy · 2025-02-04T19:59:59Z

Started draft PR here: draft ukernel selection logic #1652

metascroy · 2025-02-06T00:01:39Z

@Darshcg here is a task to track the bias work we discussed on the call today: #1675

Let's try to have it complete by 2/21.

Darshcg · 2025-02-06T00:07:44Z

Thanks a lot @metascroy for taking the time for the call to discuss the dynamic kernel selection tasks and the progress so far. I will make sure to finish it before next week.

msaroufim assigned Darshcg Dec 6, 2024

metascroy mentioned this issue Feb 3, 2025: Add ukernel selection logic + clean up KleidiAI integration #1652 (Merged)

digantdesai added the cpu label Feb 4, 2025
