[SYCL] Optimize handler::StoreLambda implementation #17669

Alexandr-Konovalov · 2025-03-26T18:07:29Z

The current implementation creates descriptions of all kernel params, then processes them one by one. It's possible to process each param right away.
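As a rough illustration of the change described above, here is a minimal sketch that contrasts the two strategies. It is not the code from this PR: `ParamDesc`, `Arg`, `getParamDesc`, `storeTwoPass` and `storeOnePass` are hypothetical names invented for the example, and the descriptions are synthetic.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel parameter description.
struct ParamDesc {
  int Kind;
  int Size;
  int Offset;
};

// Hypothetical stand-in for a processed kernel argument.
struct Arg {
  int Kind;
  const char *Ptr;
  int Size;
};

// Hypothetical per-index getter; the values here are synthetic.
inline ParamDesc getParamDesc(int I) { return ParamDesc{0, 4, I * 4}; }

// Before: materialize every description, then walk the temporary vector.
inline std::vector<Arg> storeTwoPass(const char *Lambda, std::size_t N) {
  std::vector<ParamDesc> Descs;
  Descs.reserve(N);
  for (std::size_t I = 0; I < N; ++I)
    Descs.push_back(getParamDesc(static_cast<int>(I)));

  std::vector<Arg> Args;
  Args.reserve(N);
  for (const ParamDesc &D : Descs)
    Args.push_back(Arg{D.Kind, Lambda + D.Offset, D.Size});
  return Args;
}

// After: fetch each description and process it immediately,
// so no intermediate vector of descriptions is needed.
inline std::vector<Arg> storeOnePass(const char *Lambda, std::size_t N) {
  std::vector<Arg> Args;
  Args.reserve(N);
  for (std::size_t I = 0; I < N; ++I) {
    const ParamDesc D = getParamDesc(static_cast<int>(I));
    Args.push_back(Arg{D.Kind, Lambda + D.Offset, D.Size});
  }
  return Args;
}
```

Both functions produce the same argument list; the one-pass variant simply skips the temporary container of descriptions.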

Alexandr-Konovalov · 2025-03-26T18:08:45Z

sycl/include/sycl/handler.hpp

@@ -375,16 +375,6 @@ template <int Dims> bool range_size_fits_in_size_t(const range<Dims> &r) {
  return true;
 }

-template <typename KernelNameType>


I'm not sure whether we can drop it unconditionally.

I think we can, but I'm not sure either. IMO, let's drop it and fix things if something breaks somewhere.

+ @steffenlarsen , @AlexeySachkov , @sergey-semenov

From what I can find, you removed the only use of this function, so I agree with @aelovikov-intel - drop it! ⭐

aelovikov-intel · 2025-03-27T14:34:43Z

sycl/include/sycl/handler.hpp

+      reserveArgs(NumParams);
+
+      for (size_t I = 0, IndexShift = 0; I < NumParams; ++I) {
+        extractArgsAndReqsFromLambda(


extractArgs* doesn't look like the right name for something that handles a single argument.

Thank you for the advice, I moved the loop over args inside of extractArgsAndReqsFromLambda().

aelovikov-intel

LGTM, but I'd want @steffenlarsen to approve the usage of function pointer too, just in case.

steffenlarsen · 2025-03-28T06:18:32Z

sycl/source/handler.cpp

+    char *LambdaPtr, detail::kernel_param_desc_t (*ParamDescGetter)(int),
+    size_t NumKernelParams, bool IsESIMD) {
+  size_t IndexShift = 0;
+  impl->MArgs.reserve(MaxNumAdditionalArgs * NumKernelParams);


Aside from this line, why do we have to move this to the sources?

To me, there is no strict necessity, just a desire to move functionality out of public headers and out of templates.
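To make the idea of moving logic out of templates concrete, here is a minimal sketch of a header-side template handing a plain function pointer to a non-template function. It only loosely mirrors the `ParamDescGetter` parameter in the diff above; `ParamDesc`, `collectOffsets`, `KernelInfo`, `MyKernel` and `storeLambda` are hypothetical names, and the function bodies are assumptions made for illustration.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a kernel parameter description.
struct ParamDesc {
  int Kind;
  int Size;
  int Offset;
};

// "Library" side (hypothetical): a non-template function that receives a
// plain function pointer instead of being templated on the kernel name.
inline std::vector<int> collectOffsets(ParamDesc (*Getter)(int),
                                       std::size_t NumParams) {
  std::vector<int> Offsets;
  Offsets.reserve(NumParams); // reserve once, up front
  for (std::size_t I = 0; I < NumParams; ++I)
    Offsets.push_back(Getter(static_cast<int>(I)).Offset);
  return Offsets;
}

// "Header" side (hypothetical): per-kernel information stays in a template.
struct MyKernel {};

template <typename KernelName> struct KernelInfo;

template <> struct KernelInfo<MyKernel> {
  static constexpr std::size_t NumParams = 3;
  static ParamDesc getParamDesc(int I) { return ParamDesc{0, 8, I * 8}; }
};

// The template shrinks to a thin wrapper that forwards a function pointer.
template <typename KernelName> std::vector<int> storeLambda() {
  return collectOffsets(&KernelInfo<KernelName>::getParamDesc,
                        KernelInfo<KernelName>::NumParams);
}
```

With this shape, only the thin wrapper is instantiated per kernel. The trade-off raised in this thread still applies: the non-template function becomes a library symbol, and the call through a pointer may be harder for the compiler to inline.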

Removing the templates I don't mind, but the more symbols we have in the library, the less freedom we have due to the ABI it imposes. I could maybe also see some benefit in having this in the header w.r.t. inlining.
That is not to say I am against moving more stuff to the library, but when we do I prefer we move as much as we can to minimize the ABI surface.
Note that this is not a strong objection to moving this, but mainly for us to consider the pros/cons of moving small segments from the header to the library.

This patch isn't based on the latest origin/sycl. Top of trunk has this (note constexpr):

llvm/sycl/include/sycl/handler.hpp

Lines 769 to 774 in 0a406c9

    if constexpr (KernelHasName) {
      // TODO support ESIMD in no-integration-header case too.
      clearArgs();
      extractArgsAndReqsFromLambda(MHostKernel->getPtr(),
                                   detail::getKernelParamDescs<KernelName>(),
                                   detail::isKernelESIMD<KernelName>());

The only other thing we can fold into this library API is the clearArgs call (which might be a good thing to do, but doesn't really affect our freedom regarding ABI changes).

@steffenlarsen , I agree with you in general, but don't see how that is applicable in this particular case.

My point here is, it seems like all that is actually forcing us to move this to the library is an optimization for pre-emptive reservation. It means the function we now have in the ABI does a handful of different things. Keeping it in the header and adding another new function in the library that does as little as possible still means we have another symbol, but I would argue that the chance of its signature changing or part of the ABI causing problems decreases with that design.

Am I right that the current plan is to keep the extractArgsAndReqsFromLambda() logic internal and to stop exporting handler::processArg() at the next ABI break?

IIUC, yes. @steffenlarsen , can you confirm?

Yes, that plan sounds good to me!

@steffenlarsen , what do you think?

Yes! I like that solution, then we can move it when we promote preview during the next ABI break. ⭐
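The agreed plan can be pictured with a small sketch of a helper that stays publicly visible in regular builds and becomes internal once a preview switch is defined. This is only an assumption-laden illustration: `MYLIB_PREVIEW_BREAKING_CHANGES`, `Handler`, `extractArgs` and `numArgs` are hypothetical names. Real library code would use out-of-line definitions and export attributes, which are omitted here.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical switch standing in for a "preview breaking changes" macro.
// Define it to see the helper disappear from the public interface.
// #define MYLIB_PREVIEW_BREAKING_CHANGES

class Handler {
public:
  // The single entry point intended to remain in the public interface.
  void extractArgs(const int *Values, std::size_t Count) {
    Args.reserve(Count);
    for (std::size_t I = 0; I < Count; ++I)
      processArg(Values[I]);
  }

  std::size_t numArgs() const { return Args.size(); }

#ifndef MYLIB_PREVIEW_BREAKING_CHANGES
  // Regular builds: the helper stays public so existing users keep working.
  void processArg(int Value) { Args.push_back(Value); }
#else
private:
  // Preview builds: the helper is internal and can change freely.
  void processArg(int Value) { Args.push_back(Value); }
#endif

private:
  std::vector<int> Args;
};
```

Callers that only use `extractArgs` behave the same in both configurations, which is what allows the helper to be dropped from the exported surface at the next ABI break.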

vinser52 · 2025-04-07T21:43:31Z

@aelovikov-intel is this PR ready to be merged?

aelovikov-intel · 2025-04-07T21:46:53Z

I think the plan was to split it in two pieces in case bisect/revert would be necessary. However, I'm fine merging as-is.

Alexandr-Konovalov · 2025-04-08T08:42:47Z

@aelovikov-intel , it's probably a misunderstanding. I think it is "SYCL Do not lock unconditionally while access queue_iml::MInOrderExternalEvent" #17575 that should be split, and I definitely put the independent part into "[SYCL] Do not lock unconditionally while access queue_iml::MMissedCleanupRequests" #17883.

aelovikov-intel · 2025-04-08T15:09:41Z

My brain was melting from working on conflict resolution... You're definitely correct :)

I don't mind the other PR going as one piece though ;)

[SYCL] Optimize handler::StoreLambda implementation

717a398

The current implementation creates descriptions of all kernel params, then processes them in turn. It's possible to process each param right away.

Code formatting.

55c9d25

Cosmetic changes.

9fdf613

Alexandr-Konovalov marked this pull request as ready for review March 27, 2025 13:27

Alexandr-Konovalov requested a review from a team as a code owner March 27, 2025 13:27

Alexandr-Konovalov requested a review from aelovikov-intel March 27, 2025 13:27

aelovikov-intel reviewed Mar 27, 2025

Move loop over args inside handler::extractArgsAndReqsFromLambda()

84eaf62

aelovikov-intel approved these changes Mar 27, 2025

steffenlarsen reviewed Mar 28, 2025

Code formatting. Add newly exported Windows symbol.

7c6895c

Alexandr-Konovalov added 3 commits April 1, 2025 17:06

Merge branch 'sycl' into Alexandr-Konovalov/StoreLambda-opt

07be10c

Merge branch 'sycl' into Alexandr-Konovalov/StoreLambda-opt

3a194f0

Conditionally remove handler::processArg from ABI.

312f59c

Add newly exported Windows symbol.

3ce04a1

Merge branch 'sycl' into Alexandr-Konovalov/StoreLambda-opt

9276a8d

steffenlarsen approved these changes Apr 4, 2025

aelovikov-intel merged commit 313c4f0 into intel:sycl Apr 7, 2025
24 checks passed
