
[Feature Request] Add dynamic kernel selection to torchao/experimental #1376

Open
metascroy opened this issue Dec 4, 2024 · 9 comments

@metascroy
Contributor

metascroy commented Dec 4, 2024

Currently in torchao/experimental, we use a ukernel config to identify the function pointers to use in the linear operator.

At runtime, we select which ukernel config to use, but the current logic is very simplistic, partly because we currently only have one kind of kernel.

But if we wish to support more kernels in the future (e.g., GEMM kernels, kernels from KleidiAI, kernels based on i8mm), we need a better ukernel config selection mechanism.

We'd like to select an appropriate ukernel based on features like CPU uarch, activation size, and packing format; the CPU uarch can be detected with cpuinfo. The feature request here is to design an efficient dynamic kernel selection infrastructure. XNNPACK implements a similar feature.

For now, we will select the ukernel config based on the CPU, but in the future we might want to extend the design to select a different ukernel config based on the CPU core.
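
To make the request a bit more concrete, below is a minimal sketch of one possible shape for such an infrastructure: a table of registrations, each pairing a ukernel config with a predicate over detected CPU features, where selection returns the first supported entry. This is purely illustrative; the `CPUFeatures`, `UKernelConfig`, `Registration`, and `select_ukernel_config` names are hypothetical and not the existing torchao/experimental API, and real feature detection would come from a library like cpuinfo rather than hard-coded flags.

```cpp
#include <cstdio>
#include <functional>
#include <vector>

// Hypothetical stand-in for detected CPU capabilities (in practice this
// would be populated from cpuinfo or similar).
struct CPUFeatures {
  bool has_dotprod = false;
  bool has_i8mm = false;
};

// Hypothetical stand-in for the bundle of function pointers the linear
// operator calls a ukernel config.
struct UKernelConfig {
  const char* name;
  void (*linear_kernel)();  // placeholder for the real kernel signature
};

// A registration ties a config to a predicate saying whether the current
// CPU supports it.
struct Registration {
  std::function<bool(const CPUFeatures&)> is_supported;
  UKernelConfig config;
};

// Return the first supported config; entries are ordered from most to
// least preferred.
const UKernelConfig* select_ukernel_config(
    const std::vector<Registration>& table, const CPUFeatures& cpu) {
  for (const auto& reg : table) {
    if (reg.is_supported(cpu)) {
      return &reg.config;
    }
  }
  return nullptr;  // caller falls back to a reference implementation
}

int main() {
  std::vector<Registration> table = {
      {[](const CPUFeatures& c) { return c.has_i8mm; },
       {"gemm_i8mm", nullptr}},
      {[](const CPUFeatures& c) { return c.has_dotprod; },
       {"gemv_dotprod", nullptr}},
      {[](const CPUFeatures&) { return true; },
       {"scalar_fallback", nullptr}},
  };

  // Pretend detection found dotprod but not i8mm.
  CPUFeatures cpu;
  cpu.has_dotprod = true;

  const UKernelConfig* cfg = select_ukernel_config(table, cpu);
  std::printf("selected ukernel config: %s\n", cfg ? cfg->name : "none");
  return 0;
}
```

An ordered table keeps the policy simple (most-preferred kernels first) while leaving room to later extend the predicates with activation size or packing format.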

cc @digantdesai @kimishpatel @supriyar @msaroufim

@metascroy
Contributor Author

@supriyar @msaroufim, an external contributor (https://github.com/Darshcg) reached out to me and expressed interest in this project. How can I tag him on the issue?

@Darshcg

Darshcg commented Dec 4, 2024

Thank you @metascroy for creating this issue and adding me.

Hi @supriyar @msaroufim, I am Darshan, an AI inference and compiler engineer. Over the past few years, I have focused on optimizing quantized (low-bit) and sparse inference for x86 and ARM targets, primarily with QNNPACK and XNNPACK. Although I have not contributed to open-source projects yet, you can explore some of my related work here: https://scholar.google.com/citations?user=GUaOqIIAAAAJ&hl=en

Last Saturday, I attended Scott's session on low-bit inference (via CUDA mode) and reached out to him regarding potential contributions and collaborations. That led me here. I look forward to working on this with you all, solving problems, and learning new things. Thanks!

@msaroufim
Member

Wonderful! I'm glad you two got to meet. At a high level your plan sounds reasonable; it will be much easier to give feedback once we have some sample PRs we can take a look at!

@metascroy
Contributor Author

Awesome @Darshcg!

Maybe the first thing to do is study the current ukernel selection logic, look at what other libraries like XNNPACK do, and put together some prototype PRs or an RFC/design proposal? We can then chime in and give feedback.

@Darshcg

Darshcg commented Dec 5, 2024

For sure @metascroy! I will go through the existing ukernel selection implementation in torchao/experimental and compare it with XNNPACK's kernel selection mechanisms, aiming to identify differences, possible improvements, and any best practices we could adopt. Based on that, I will put together a draft design proposal/sample PR for review and feedback. Thanks!

@metascroy
Contributor Author

@Darshcg are you still working on this?

@metascroy
Contributor Author

Started draft PR here: draft ukernel selection logic #1652

digantdesai added the cpu label on Feb 4, 2025
@metascroy
Contributor Author

@Darshcg here is a task to track the bias work we discussed on the call today: #1675

Let's try to have it completed by 2/21.

@Darshcg

Darshcg commented Feb 6, 2025

Thanks a lot, @metascroy, for taking the time to get on a call and discuss the dynamic kernel selection tasks and the progress. I will make sure to finish it before next week.
