Skip to content

[SYCL] Deprecate fallback assertions #18310

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: sycl
Choose a base branch
from

Conversation

steffenlarsen
Copy link
Contributor

@steffenlarsen steffenlarsen commented May 5, 2025

This commit deprecates (and removes under preview flag) the fallback assertion implementation for SYCL kernels. The expected behavior after this is that backends that do not support native asserts, as reported through the ext_oneapi_native_assert aspect, will ignore assertions in kernel code.

This commit also makes the L0 adapter report support for native assert, as it was incorrectly reporting false before.

This commit deprecates (and removes under preview flag) the fallback
assertion implementation for SYCL kernels. The expected behavior after
this is that backends that do not support native asserts, as reported
through the ext_oneapi_native_assert aspect, will ignore assertions in
kernel code.

Signed-off-by: Larsen, Steffen <[email protected]>
@npmiller
Copy link
Contributor

npmiller commented May 5, 2025

I believe these two tests also rely on the fallback asserts:

Or at least they used to, I'm not sure the current checks represent that intent, and they never had the macros defined. But you can see the comment for the cuda/hip unsupported. And fallback is really the only case that would make sense to test for discard events with assert since it adds extra commands that might add/use events. I've been meaning to dig further into them to re-enable them for CUDA and HIP, but if we're dropping fallback we might be able to just delete these eventually.

@gmlueck
Copy link
Contributor

gmlueck commented May 5, 2025

What is our plan for the assert feature once the environment variable is removed? Will assert statements simply be silently ignored if they appear in a kernel that is submitted to a device without native support? If that is the case, I think we need an update to the sycl_ext_oneapi_assert specification.

Or, will assert be treated like other device-optional features, and the implementation will throw an exception if the device does not support it? In this case, we would also need an update to sycl_ext_oneapi_assert.

@jbrodman
Copy link
Contributor

jbrodman commented May 5, 2025

I believe these two tests also rely on the fallback asserts:

Or at least they used to, I'm not sure the current checks represent that intent, and they never had the macros defined. But you can see the comment for the cuda/hip unsupported. And fallback is really the only case that would make sense to test for discard events with assert since it adds extra commands that might add/use events. I've been meaning to dig further into them to re-enable them for CUDA and HIP, but if we're dropping fallback we might be able to just delete these eventually.

We'd ultimately like to get rid of DiscardEvents too...

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen steffenlarsen requested review from a team as code owners May 5, 2025 16:03
@steffenlarsen
Copy link
Contributor Author

What is our plan for the assert feature once the environment variable is removed? Will assert statements simply be silently ignored if they appear in a kernel that is submitted to a device without native support? If that is the case, I think we need an update to the sycl_ext_oneapi_assert specification.

This seems like the safest solution to me, as it means users can still run their kernels on devices where they would run before, even if we can't do the asserts in there for them. I've updated the extension.

Signed-off-by: Larsen, Steffen <[email protected]>
Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen
Copy link
Contributor Author

It appears that L0 assertions don't print and abort at synchronization points, but instead does so at the end of the application. This has been reported internally.

@gmlueck
Copy link
Contributor

gmlueck commented May 6, 2025

It appears that L0 assertions don't print and abort at synchronization points, but instead does so at the end of the application. This has been reported internally.

This seems like it could be a serious bug. What happens if the application crashes and terminates abnormally? Does Level Zero print the assertion even when the application terminates abnormally like this? What if the application hangs, and the user terminates it via CTRL-C? I would expect many failed assertion checks to result in either a crash or a hung application. If the assertion is not printed in either of these cases, it seems pretty limited.

@steffenlarsen
Copy link
Contributor Author

This seems like it could be a serious bug. What happens if the application crashes and terminates abnormally? Does Level Zero print the assertion even when the application terminates abnormally like this? What if the application hangs, and the user terminates it via CTRL-C? I would expect many failed assertion checks to result in either a crash or a hung application. If the assertion is not printed in either of these cases, it seems pretty limited.

I absolutely agree. Even if there is some guarantee that it is printed by the end of the program, if the user cannot rely on it failing at the point of synchronization, it can be hard to track failures and users may want to rely on the abort happening before they continue on with the following tasks.

@jinge90
Copy link
Contributor

jinge90 commented May 6, 2025

Hi, @steffenlarsen
I created a PR #18089 to remove unused fallback assert implementation in libdevice previously, I think it can be merged into your current PR if we are going to deprecate fallback assert. Do you have any objection and concern? If no, I can add it to your current PR.
Thanks very much.

@steffenlarsen
Copy link
Contributor Author

Hi, @steffenlarsen I created a PR #18089 to remove unused fallback assert implementation in libdevice previously, I think it can be merged into your current PR if we are going to deprecate fallback assert. Do you have any objection and concern? If no, I can add it to your current PR. Thanks very much.

Thank you, @jinge90 ! This patch should not change the behavior without the preview flag being set. Can we integrate your changes in the same way?

@jinge90
Copy link
Contributor

jinge90 commented May 7, 2025

Hi, @steffenlarsen I created a PR #18089 to remove unused fallback assert implementation in libdevice previously, I think it can be merged into your current PR if we are going to deprecate fallback assert. Do you have any objection and concern? If no, I can add it to your current PR. Thanks very much.

Thank you, @jinge90 ! This patch should not change the behavior without the preview flag being set. Can we integrate your changes in the same way?

Hi, @steffenlarsen
It should be doable, we can create a new libsycl-crt-previw.o which cleans up all fallback assert related code and links it with user's device code when -fpreview-breaking-changes is applied. If the option is not applied, we just link the original libsycl-crt.o, I will take a look at this later.
Thanks very much.

@steffenlarsen
Copy link
Contributor Author

Hi, @steffenlarsen It should be doable, we can create a new libsycl-crt-previw.o which cleans up all fallback assert related code and links it with user's device code when -fpreview-breaking-changes is applied. If the option is not applied, we just link the original libsycl-crt.o, I will take a look at this later. Thanks very much.

If it isn't too much work, I don't mind doing it as part of this, though alternatively we can do it in a separate patch and just let the patch sit there until the breaking window opens. That way we won't need to make some preview work-around for now.

@jinge90
Copy link
Contributor

jinge90 commented May 7, 2025

Hi, @steffenlarsen It should be doable, we can create a new libsycl-crt-previw.o which cleans up all fallback assert related code and links it with user's device code when -fpreview-breaking-changes is applied. If the option is not applied, we just link the original libsycl-crt.o, I will take a look at this later. Thanks very much.

If it isn't too much work, I don't mind doing it as part of this, though alternatively we can do it in a separate patch and just let the patch sit there until the breaking window opens. That way we won't need to make some preview work-around for now.

Hi, @steffenlarsen
So I can continue to do this in #18089 and wait for the breaking windows opens.
When you plan to land current PR, please ping me to also land the libdevice part after current PR is merged. Is it OK for you?
Thanks very much.

@jinge90
Copy link
Contributor

jinge90 commented May 9, 2025

Hi, @steffenlarsen
I created a new PR: #18376 including driver change.
A new version of libsycl-fallback-cassert-preview is created removing all legacy fallback assert impl, this lib will be linked with user's device code only when -fpreview-breaking-changes. Please ping me when you plan to land your PR.
Thanks very much.

@@ -371,9 +377,15 @@ class __SYCL_EXPORT queue : public detail::OwnerLessBase<queue> {
std::enable_if_t<std::is_invocable_r_v<void, T, handler &>, event> submit(
T CGF,
const detail::code_location &CodeLoc = detail::code_location::current()) {
#ifndef __INTEL_PREVIEW_BREAKING_CHANGES
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: For if/else cases it's more readable to use #ifdef <preview>/#else/#endif, IMO.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants