[SYCL][Doc] Add SPV_INTEL_function_variants extension #18365
Conversation
gmlueck left a comment
We'd like to put this out for review in order to get feedback from the wider SPIR-V community.
I don't fully understand what new capability the decoration provides, compared to using specialization constants with conditionals. E.g., assuming something like:

```c
void func()
{
    if (platformId == INTEL_IGPU)
    {
        // code for intel integrated
    }
    else if (platformId == RADEON)
    {
        // code for radeon
    }
    else if (platformId == NVIDIA)
    {
        // code for nvidia
    }
    else
    {
        // fallback code
    }
}
```

The above is what I've usually done to have functions with different behaviour on separate platforms.

Conditional extensions & capabilities are very welcome in the computer-graphics context as well. I've had to generate separate binaries for Nvidia in some cases, when I want to use e.g. the shader execution reordering extension. This would allow shipping only one.

What happens with

Also, I don't fully see the value of

For portions of the "targets registry" (at least for GPU vendors), you may be able to leverage or sync with existing Vulkan or OpenCL vendor/device IDs.

One capability that would be nice to have (although probably not in scope for this extension) would be target-specific structure types. There are sometimes entries that only make sense on one platform. This would likely cause lots of complications for validation, though.
The
You're right, we can only check it in the consumer during specialization. The "first pair" approach makes the rules simpler; sounds like a good idea.
They make the module standalone, i.e., all the required information is contained within the module in a standardized format. You can still choose to use external specialization and pass the side information separately, and that's fine if you control both the producer and the consumer (e.g., that's how SYCL intends to use it). But, one potential use case can be, for example, generating the multi-target SPIR-V from the OpenMP declare variant construct, then shipping it to a customer device where it will be consumed by an OpenCL implementation, like PoCL. Without these spec constants, OpenMP and PoCL would need to come up with some ad-hoc mechanism to pass this information along.
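The OpenMP `declare variant` construct mentioned above might look like the following sketch in C (the `saxpy` / `saxpy_gpu` names are hypothetical; on a compiler without OpenMP support the pragma is simply ignored and the base definition is always used):

```c
#include <stddef.h>

/* Hypothetical GPU variant; a real one would contain device-specific code. */
void saxpy_gpu(size_t n, float a, const float *x, float *y);

/* OpenMP 5.x "declare variant": when saxpy is called in a context
 * matching device={kind(gpu)}, the compiler substitutes saxpy_gpu.
 * A multi-target SPIR-V producer could lower each variant into a
 * function variant guarded by a spec constant. */
#pragma omp declare variant(saxpy_gpu) match(device = {kind(gpu)})
void saxpy(size_t n, float a, const float *x, float *y) {
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}

void saxpy_gpu(size_t n, float a, const float *x, float *y) {
    saxpy(n, a, x, y);  /* stand-in body for the sketch */
}
```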
You can annotate types with
Is there an existing public implementation of this extension? Do you have an implementation or a plan for how to expose this functionality in a high-level language like OpenCL C (or something other than OpenMP)? If some day in the future a Vulkan implementation happens to gain support for this extension, I may be interested in implementing this for Slang.
Not public yet, but the goal is to upstream a proof-of-concept producer to SPIRV-Tools, extending
@juliusikkala note that, thanks to the possibility to pick the variant in a preprocessing step, the driver doesn't need to support the extension directly.
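That preprocessing step is conceptually simple: evaluate each conditional instruction's Condition spec constant and drop the instructions whose condition is false. A minimal sketch of that filtering in C (the struct is a hypothetical stand-in for real SPIR-V instructions; a real tool would rewrite the binary):

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a conditional instruction: an instruction id
 * plus the already-evaluated value of its Condition spec constant. */
typedef struct {
    int  id;         /* stand-in for an instruction            */
    bool condition;  /* evaluated Condition spec-constant value */
} CondInst;

/* Keep only the instructions whose condition is true; returns how
 * many survived.  This is the whole "specialization" idea in miniature. */
size_t specialize(const CondInst *in, size_t n, int *kept) {
    size_t m = 0;
    for (size_t i = 0; i < n; ++i)
        if (in[i].condition)
            kept[m++] = in[i].id;
    return m;
}
```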
Nice, I didn't think of that. Implementing this in a Vulkan layer would be a great way to make this immediately available there as well!
Yep, this was one of the design ideas in this extension and one of the reasons we didn't want to require complicated compiler passes to specialize. It should ease adoption a lot.
IMHO anything that's not a full-blown metaprogramming system is but a bandaid. We deal with it by just compiling a SPIR-V with multiple entry points.

Ideally the way that SPIR-V modules declare capabilities and extensions should change, because right now if any entry point uses a capability you declare it for the whole module. This leads to obvious problems (try linking/combining two SPIR-V entry points meant for GPUs with vastly different capabilities).

As stated in my Vulkanised 2023 and 2024 presentations, what I'd REALLY appreciate is constant (as in SPIR-V constants and spec constants) function pointers, decorated with capabilities and extensions. Then you could fill a constant struct with constant function pointers and gather your extensions and capabilities by walking the static call graph of some entry point (which would also be a constant function pointer).
@devshgraphicsprogramming You mean that if an entry point requires a capability/extension it will be implicitly declared for the whole module? It is ultimately up to the producer algorithm, but each entry point has an associated boolean spec constant, so when implicitly declaring a cap/ext, declare it as a conditional cap/ext using the same spec constant.

In practice, this can be easily done by compiling a module for each architecture separately. In that case, each compiled module ends up with all the capabilities/extensions. These modules can then be linked, with each getting its own boolean spec constant to be used as the Condition operand in instructions from this extension as necessary.

What you're proposing sounds like being able to associate capabilities/extensions directly with the entry points, but I believe the same thing can also be achieved with this extension. Note that you can combine spec constants with OpSpecConstantOp, so if, let's say, 2 out of 3 entry points use the same capability, you combine their spec constants with the OR operation and use that to annotate the conditional capability. So instead of using a struct, the hierarchy is encoded with spec constants, and it is indeed with entry-point rather than module granularity.

Walking a static call tree can be used to track which capabilities/extensions are used for which entry point, but it would work only for code that doesn't use dynamic function pointers. (Constant function pointers are a part of the SPV_INTEL_function_pointers extension, btw.)
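The OR-combination described above is plain boolean algebra over the per-entry-point condition flags; in C terms (function and parameter names are hypothetical):

```c
#include <stdbool.h>

/* Sketch of the OpSpecConstantOp LogicalOr trick in plain C: each
 * entry point gets a boolean "is this variant enabled?" spec constant,
 * and a capability shared by several entry points is guarded by the
 * OR of their constants (OpSpecConstantOp ... LogicalOr in SPIR-V). */
bool cond_for_shared_capability(bool ep_a_enabled, bool ep_b_enabled) {
    return ep_a_enabled || ep_b_enabled;
}
```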
That just moves the problem into application space or tooling.

My personal, very biased take is that specialization constants are near useless; everything I do that requires that sort of flexibility I do through HLSL2021 (and soon C++ with vcc, I hope) metaprogramming. TL;DR: spec constants don't let me programmatically codegen, e.g. iterations of FFTs, shared memory size, etc. They also inhibit many optimizations/transformations; I have to trust the IHV driver/runtime actually does them after turning spec constants into constants. But again, I'm a Vulkan developer and not a SYCL one.

Also, it's a problem we've already solved: we take massive unity-build SPIR-Vs and run SPIR-V optimizers after specialization to remove all entry points which aren't used by a Vulkan pipeline, then do several rounds of aggressive DCE, such that any types, intrinsics, etc. requiring capabilities or extensions drop out. Then we run the capability-stripping pass to get what we want.
Yes, I'm aware, but dynamic function pointers should match some signature (like in C, specifically WASM); you can't just cast a

IMHO the caps should be part of the signature (kinda how storage class should be part of a type).
I know, but it's for the Kernel environment. I really wish it was more cross-vendor and supported in Vulkan.
Just a quick note: GLSL and Slang do allow specifying shared memory size with a specialization constant, so it has to already be supported in SPIR-V. That specific problem is then probably just an HLSL problem.
Nice to know. What's the GLSL syntax for making complex spec constant expressions? E.g. to compute the amount of groupshared memory needed for a prefix sum, you need to do the geometric series up to a point.
You can do basic arithmetic but AFAIK no loops or function calls unfortunately, so no complex codegen there :( In complex cases like that, I've just computed the necessary size on the host and passed it as a separate specialization constant (like

So basically you can do this:

```glsl
// default value required; overridable at pipeline creation via constant_id 0
layout (constant_id = 0) const uint BLOCK_SIZE = 64u;
shared float buf_a[(BLOCK_SIZE + 5) * 4];
```

but that's about it.
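For the prefix-sum sizing mentioned earlier, the host-side computation might look like this C sketch (the function name is hypothetical); the result would then be fed to the shader as that separate specialization constant:

```c
/* Host-side sketch: total shared-memory element count for a
 * hypothetical hierarchical prefix sum over block_size elements,
 * i.e. the geometric series block_size + block_size/2 + ... + 1. */
unsigned prefix_sum_scratch(unsigned block_size) {
    unsigned total = 0;
    for (unsigned level = block_size; level >= 1; level /= 2)
        total += level;
    return total;  /* == 2*block_size - 1 for powers of two */
}
```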