ExecuTorch Vulkan Backend on Linux - Request for Documentation/Guidance #8288

Answered by SS-JIA
csn1800 asked this question in Q&A

The Vulkan Delegate was started last year, and the focus in the first year was building the core components of the platform and adding initial implementations of several operators. The focus for this year will be optimization, for both latency and memory consumption. In particular, my specific focus this year will be optimizing 4-bit weight-quantized matrix multiplication to improve performance on Transformer models.

> Integer computations, such as embedding layers for text/speech models

We are currently working on optimizing weight-quantized operators, but that may be different from what you mean here. In these quantized shaders, a quantized weight value is converted back into a floating…
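For readers unfamiliar with weight-only quantization, here is a minimal pure-Python sketch of the general idea: weights are stored as packed 4-bit integers, converted back to floating point, and then multiplied with floating-point activations. This is not the Vulkan delegate's shader code. The function names, the nibble packing order, the per-row scale, and the zero point of 8 are all assumptions chosen for illustration.

```python
# Illustrative sketch only: names, packing layout, and quantization
# parameters are hypothetical, not taken from the ExecuTorch Vulkan backend.

def pack_int4(values):
    """Pack unsigned 4-bit values (0..15) two per byte, low nibble first."""
    if len(values) % 2:
        raise ValueError("need an even number of 4-bit values")
    if any(not 0 <= v <= 15 for v in values):
        raise ValueError("values must fit in 4 bits")
    return bytes(values[i] | (values[i + 1] << 4)
                 for i in range(0, len(values), 2))


def dequantize_int4(packed, scale, zero_point=8):
    """Unpack each byte into two 4-bit values and map them back to floats."""
    weights = []
    for byte in packed:
        for q in (byte & 0x0F, byte >> 4):
            weights.append((q - zero_point) * scale)
    return weights


def linear_int4(x, packed_rows, scales, zero_point=8):
    """Compute y[n] = sum_k x[k] * W[n][k] with W stored as packed 4-bit rows.

    The arithmetic itself is done in floating point; only the stored
    weights are integers.
    """
    return [
        sum(xi * wi for xi, wi in zip(x, dequantize_int4(row, s, zero_point)))
        for row, s in zip(packed_rows, scales)
    ]


if __name__ == "__main__":
    row = pack_int4([9, 7, 8, 10])
    print(dequantize_int4(row, scale=0.5))
    print(linear_int4([1.0, 2.0, 3.0, 4.0], [row], [0.5]))
```

Note that in this scheme the multiply-accumulate still runs on floats, which is why weight-quantized operators may not match what the question means by "integer computations".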

Answer selected by byjlw
Category: Q&A
Labels:
- module: doc (Issues related to documentation, both in docs/ and inlined in code)
- module: vulkan (Issues related to the Vulkan delegate and code under backends/vulkan/)
- triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module)
2 participants
This discussion was converted from issue #8211 on February 06, 2025 21:08.