shader_debugprintf: support new VVL-DEBUG-PRINTF message and fix VVL version check for API selection #1187

Open
wants to merge 14 commits into
base: main

SRSaunders
Contributor

@SRSaunders SRSaunders commented Oct 9, 2024

Description

Fixes two issues that arose with Vulkan SDK 1.3.296:

  1. Supports new VVL-DEBUG-PRINTF callback message. Previous SDKs used WARNING-DEBUG-PRINTF or UNKNOWN-DEBUG-PRINTF. Without this fix the debug data is not available in the UI Overlay.
  2. Fixes my incorrect assumption that the Vulkan instance version matched the SDK version for all platforms - true on macOS but not true for Windows and Linux. This version is used to set the API level for the sample, which is important for performance and to avoid a previous defect in the Vulkan Validation layer. I have replaced the instance version check with a Validation Layer version check which is portable across all platforms: Win, Linux, macOS. Without this fix, performance is poor on Windows and Linux when using Vulkan SDK 1.3.296.
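As an illustration of the first fix, here is a minimal sketch of matching all three message IDs named above. The helper name `is_debug_printf_message` is hypothetical and not taken from this PR; it assumes the debug callback receives the message ID name as a C string that may be null.

```cpp
#include <cstring>

// Hypothetical helper (not from the PR): returns true when the message ID name
// passed to the debug callback is one of the debug printf IDs listed above.
inline bool is_debug_printf_message(const char *message_id_name)
{
	static const char *const known_ids[] = {
	    "VVL-DEBUG-PRINTF",            // Vulkan SDK 1.3.296
	    "WARNING-DEBUG-PRINTF",        // previous SDKs
	    "UNKNOWN-DEBUG-PRINTF",        // previous SDKs
	};

	// Assumption: the ID name can be null, so guard before comparing.
	if (message_id_name == nullptr)
	{
		return false;
	}

	for (const char *id : known_ids)
	{
		// strcmp returns 0 when the two strings are equal.
		if (std::strcmp(message_id_name, id) == 0)
		{
			return true;
		}
	}
	return false;
}
```

A callback could call this helper first and forward only matching messages to the UI overlay, so that the sample keeps working with both older and newer SDKs.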

Fixes #1184.

Tested on Windows 10, Manjaro Linux, and macOS Ventura using Vulkan SDKs 1.3.290 and 1.3.296.

I hope this is the last time I have to fix this. It seems that VVL changes can easily break this sample.

General Checklist:

Please ensure the following points are checked:

  • My code follows the coding style
  • I have reviewed file licenses
  • I have commented any added functions (in line with Doxygen)
  • I have commented any code that could be hard to understand
  • My changes do not add any new compiler warnings
  • My changes do not add any new validation layer errors or warnings
  • I have used existing framework/helper functions where possible
  • My changes do not add any regressions
  • I have tested every sample to ensure everything runs correctly
  • This PR describes the scope and expected impact of the changes I am making

Note: The Samples CI runs a number of checks including:

  • I have updated the header Copyright to reflect the current year (CI build will fail if Copyright is out of date)
  • My changes build on Windows, Linux, macOS and Android. Otherwise I have documented any exceptions

If this PR contains framework changes:

  • I did a full batch run using the batch command line argument to make sure all samples still work properly

Sample Checklist

If your PR contains a new or modified sample, these further checks must be carried out in addition to the General Checklist:

  • I have tested the sample on at least one compliant Vulkan implementation
  • If the sample is vendor-specific, I have tagged it appropriately
  • I have stated on what implementation the sample has been tested so that others can test on different implementations and platforms
  • Any dependent assets have been merged and published in downstream modules
  • For new samples, I have added a paragraph with a summary to the appropriate chapter in the readme of the folder that the sample belongs to e.g. api samples readme
  • For new samples, I have added a tutorial README.md file to guide users through what they need to know to implement code using this feature. For example, see conditional_rendering
  • For new samples, I have added a link to the Antora navigation so that the sample will be listed at the Vulkan documentation site

Collaborator

@SaschaWillems SaschaWillems left a comment

Thank you very much for this PR. I do have some remarks though, mostly related to comment and code structure. I think it's important that people can easily follow and understand the changes ;)

@SRSaunders
Contributor Author

SRSaunders commented Oct 18, 2024

Thanks @SaschaWillems for the feedback. I am away on vacation this week, but will make the requested changes when I am back.

UPDATE: Back now and changes submitted in 0dc4963.

asuessenbach
asuessenbach previously approved these changes Oct 22, 2024
@SaschaWillems
Collaborator

No idea why, but with this PR and the latest SDK (1.3.296) on Windows, this sample is now again running with less than 1 fps. Forcing it to use VK 1.2 is somehow even slower (0 or inf fps).

If I force VK 1.0 performance is fine, but I don't get any debug output.

Not sure what is happening here and why this sample is so problematic. The debug printf sample from my own samples repo works just fine no matter the api version :/

@SRSaunders
Contributor Author

No idea why, but with this PR and the latest SDK (1.3.296) on Windows, this sample is now again running with less than 1 fps. Forcing it to use VK 1.2 is somehow even slower (0 or inf fps).

Very strange. Can I ask you to recheck before and after this PR, but being careful with your SDK version selection and project gen/build? I did a lot of testing with old and new SDKs on Windows 10, Linux and macOS before submitting originally. I will go back and test again to see if I can somehow duplicate what you are seeing.

If I force VK 1.0 performance is fine, but I don't get any debug output.

Debug PrintF requires Vulkan 1.1 or later. So no surprise that you are not getting debug output with API 1.0.

The debug printf sample from my own samples repo works just fine no matter the api version

I suspect your repo's sample relies on the intrinsic Debug PrintF capability at the shader level on Windows. However, this is not cross-platform portable, whereas the Vulkan-Samples one uses the VVL version of the feature all the time. Perhaps that is why you are seeing a difference, at least on Windows. Again, I will go back and see if I can verify this.

@SaschaWillems
Collaborator

It also happens with the old code (before this PR). I only have SDK 1.3.296 installed.

So probably a regression in the validation layers?

@SRSaunders
Contributor Author

SRSaunders commented Oct 23, 2024

Ok, I have rechecked this PR on Windows 10, and even fast-forwarded my local branch to current main HEAD just to make sure. I am using Vulkan SDK 1.3.296.0 with my Radeon RX6600XT GPU. My Vulkan Configurator has been reset to default settings.

Before this PR I get:
[image: main only]

After this PR I get:
[image: shader_debugprintf FF]

Is it possible that your Vulkan Configurator has a custom setting that is interfering with the sample? Or possibly a difference between AMD and nVidia GPUs? Just grasping at straws since I cannot duplicate your issue and the 1.3.296 VVL seems to be working correctly using API 1.1 for debug printf.

Contributor

@asuessenbach asuessenbach left a comment

With this change, I have to distinguish two cases:

  1. VulkanConfigurator is running
    • VK_EXT_LAYER_SETTINGS_EXTENSION_NAME is available
    • instance creation is done by VulkanSample::create_instance (line 469)
    • render speed is high
    • debug_utils_message_callback is never called, thus no debugprintf output
  2. VulkanConfigurator is not running
    • VK_EXT_LAYER_SETTINGS_EXTENSION_NAME is not available
    • instance creation is done locally (line 523)
    • render speed is extremely low
    • debug_utils_message_callback is called, with higher rate than the frame rate

Note, in case 2, you're using VkValidationFeaturesEXT, which is part of VK_EXT_VALIDATION_FEATURES_EXTENSION_NAME. But you don't ask for it in the ShaderDebugPrintf constructor (or anywhere else). And in fact, that extension is not supported on my machine. Strange, that the VVL doesn't cry there.

@SaschaWillems
Collaborator

That would explain why it's so slow for me. I never ran that sample with the VulkanConfigurator running. That's case 2.

@SRSaunders
Contributor Author

SRSaunders commented Oct 24, 2024

Thanks @asuessenbach for pointing out the missing VK_EXT_validation_features extension. I have made a few changes that might make a difference as follows:

  1. Moved layer settings out of the constructor, and into ShaderDebugPrintf::create_instance(). Now it will run only if the VK_EXT_layer_settings extension is available. This part is for encapsulation only and will not change behaviour.
  2. Added and enabled the VK_EXT_validation_features extension when the VK_EXT_layer_settings extension is not available at runtime. This might change behaviour, but I am concerned about @asuessenbach's comment that the extension is not available on his machine. I'm not sure how that is possible.
  3. Fixed an incorrect string comparison operation for VK_EXT_layer_settings in [HPP]Instance::[HPP]Instance(). This was my mistake from an earlier PR. This could have prevented proper specification of the validation layer feature settings when VK_EXT_layer_settings is active. Again, this could change behaviour.

These changes may not be the final solution as I have observed the following when testing:

  1. Linux (Manjaro) using Vulkan 1.3.295 (from pkg mgr) and VVL 1.3.290 (from pkg mgr): this PR works properly (good frame rate, debug data available) when running with vkconfig and without. VK_EXT_layer_settings is only available when vkconfig is active. In this case the debug data is available both in the UI and in the stdout console. No performance issues are visible in either case.
  2. macOS (Ventura) using Vulkan SDK 1.3.296: this PR works properly (good frame rate, debug data available) when running with vkconfig and without. VK_EXT_layer_settings is available both when vkconfig is inactive and active - this is a difference vs Linux. In the latter case (vkconfig active) the debug data is available both in the UI and in the stdout console. No performance issues are present. Also tested with Vulkan SDK 1.3.290 and the results are the same - no performance problems. The only issue is that vkconfig does not appear to recognize the repeated message limit for the new VVL-* messages (vs. the previous INFO-* or WARNING-* messages, etc). A minor issue but likely a bug.
  3. Windows 10 using Vulkan SDK 1.3.296 with my AMD 6600XT GPU: this PR works properly (good frame rate, debug data available) when running without vkconfig only. When vkconfig is active, the sample will not start and complains about an unsupported extension during vkCreateInstance(). However, VK_EXT_layer_settings is available during enumeration when vkconfig is active. Something very strange is going on here - either a bug on the Windows side or something I do not understand. I am not sure how VK_EXT_layer_settings can be enumerated but not supported. See my console output in this case:

[image: nolayerext]

In summary:

  1. Linux: works properly using VVL 1.3.290 with and without vkconfig. Can't test VVL 1.3.296 since it is not yet available as a package for my Manjaro distro.
  2. macOS: works properly using VVL 1.3.290 and 1.3.296 with and without vkconfig.
  3. Windows 10 on AMD 6600XT GPU: works properly using VVL 1.3.296 without vkconfig only.

Lastly, I thought VK_EXT_layer_settings was meant to replace and deprecate VK_EXT_validation_features. I don't understand why VK_EXT_layer_settings is available all the time on macOS, but for Windows and Linux seems to be enabled only when vkconfig is running. This seems incorrect to me. Can you explain this?

@SRSaunders
Contributor Author

SRSaunders commented Oct 24, 2024

Ok, I think I have finally figured it out. It appears that you don't need to actually enable the VK_EXT_layer_settings extension in order to use it. I’m not sure if this is a feature or a bug. In any case, I have updated the sample and [HPP]Instance::[HPP]Instance() to check for availability of the extension vs. enablement. This approach works across all platforms and behaviours appear to be consistent now:

  1. Sample is tolerant of Vulkan SDK versions: tested against VVL 1.3.290 (Win, Linux, macOS) and 1.3.296 (Win, macOS)
  2. Sample is tolerant of vkconfig running or not running. The only thing to be careful of when running vkconfig is to make sure "Limit Duplicated Messages" is turned off - otherwise debug callback messages will be suppressed and the debug output UI will be blank.

@asuessenbach
Contributor

AFAIK, those two extensions (VK_EXT_layer_settings and VK_EXT_validation_features) are not supported by any NVIDIA GPU, but are provided by a layer injected by, for example, the VulkanConfigurator. That might explain why it's that slow.

Besides that, just to make sure it has been noted: as VK_EXT_validation_features is deprecated in favour of VK_EXT_layer_settings, using VK_EXT_validation_features would just be a fallback solution. I don't know if it's worth having that. And you should bail out in a friendly way if none of those extensions is available, maybe with a hint to the VulkanConfigurator.
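The preference order suggested here (layer settings first, validation features as a fallback, a friendly bail-out otherwise) could be sketched roughly as follows. The enum `DebugPrintfPath` and both function names are hypothetical and only illustrate the decision; how the list of available extension names is gathered is left out and assumed to happen elsewhere.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical names for illustration only; none of these exist in the framework.
enum class DebugPrintfPath
{
	LayerSettings,             // preferred: VK_EXT_layer_settings
	ValidationFeatures,        // deprecated fallback: VK_EXT_validation_features
	Unavailable                // neither found: report to the user and bail out
};

inline bool has_extension(const std::vector<std::string> &available, const std::string &name)
{
	return std::find(available.begin(), available.end(), name) != available.end();
}

// Picks the configuration path from a list of available extension names.
inline DebugPrintfPath select_debug_printf_path(const std::vector<std::string> &available)
{
	if (has_extension(available, "VK_EXT_layer_settings"))
	{
		return DebugPrintfPath::LayerSettings;
	}
	if (has_extension(available, "VK_EXT_validation_features"))
	{
		return DebugPrintfPath::ValidationFeatures;
	}
	return DebugPrintfPath::Unavailable;
}
```

Keeping the decision in one function means the "neither extension" case is handled explicitly in a single place instead of surfacing later as a failed instance creation.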

@SaschaWillems
Collaborator

Welp, still sub 1 fps for me with latest SDK and vkconfig NOT running.

Just let me know when it's in a state where I should test.

If we can't get this to work, we may simply go back to the initial version and maybe remove the debug output and tell people to attach a graphics debugger.

@SRSaunders
Contributor Author

Thanks @asuessenbach for the info re nVidia GPUs. I have an AMD card and I guess this is the difference here.

@SaschaWillems would you please test using this PR with vkconfig running and let me know the result? I presume you are using an nVidia GPU - please confirm.

If this works, and as @asuessenbach suggests, I will try to detect this condition and offer a message to nVidia users.

@SaschaWillems
Collaborator

If this works, and as @asuessenbach suggests, I will try to detect this condition and offer a message to nVidia users.

If we get to a point where we have to show a message under certain conditions to users of a certain vendor we're not heading where I'd like our samples to head. I'd rather remove the output debug stuff then.

@SRSaunders
Contributor Author

@SaschaWillems I understand. However I’d still like to track this down if possible and you testing on Nvidia with vkconfig active would give more information. I can’t do this test myself. Thx.

@SaschaWillems
Collaborator

SaschaWillems commented Oct 24, 2024

Windows 11 23H2, nvidia RTX 4070, latest Vulkan developer driver, SDK 1.3.296.

And I get <1 fps even with vkconfig up and running:

[image]

I'm pretty sure that the sample ran fine when I initially wrote it, but not sure why it no longer does.

Can't rule out a configuration issue on my side 100%, but not sure where to start looking.

@SRSaunders
Contributor Author

I just added a minor hygiene change to use vk::ExtensionProperties vs. VkExtensionProperties in HPPInstance(). Also updated some comments and decided to explicitly request required GPU features for debugPrintfEXT as per docs.

More importantly, I was able to find an nVidia GPU to test this. I have narrowed down what causes the slowdown and am now convinced it is a VVL debugPrintfEXT defect on that GPU platform. Simply by disabling the following debugPrintfEXT feature enablement lines I can restore FPS performance on nVidia machines for both vkconfig running and not running cases. Unfortunately this drops the debug info, but hopefully this is a temporary thing until this issue can be addressed.

...
	//add_layer_setting(layerSetting);
...
	instance_create_info.pNext = nullptr; //&validation_features;
...

I will respond on the other thread to @spencer-lunarg to see if he can help.

@spencer-lunarg
Contributor

@SRSaunders before, we had the slowdown for Vulkan 1.1, and 1.2/1.3 were good... is that still the case or is it now for all versions?

@SRSaunders
Contributor Author

SRSaunders commented Oct 25, 2024

When using an nVidia GPU with SDK 1.3.296, it slows down for all API versions. When using SDK 1.3.290 with the same setup (nVidia GPU), the sample works properly when using API 1.2 - as expected per previous discussion.

For AMD GPUs (and Apple Silicon on macOS) with SDK 1.3.296 everything works properly when using API 1.1

@spencer-lunarg
Contributor

ok, so the problem has been isolated down to an NVIDIA GPU (I was testing on Intel and found no issues)... Later tonight I will be back at my desk and can try again on my NVIDIA machine

…gPrintfEXT

(cherry picked from commit 3365c7d974ae1cb7222cf35fdbe82accfa3fd926)
@SRSaunders
Contributor Author

SRSaunders commented Oct 31, 2024

Following interaction with the VVL team, I think this is now ready for review. A couple of learnings:

  1. The VK_EXT_layer_settings extension is trickier than I first realized. On Windows and Linux it is primarily a layer instance extension and not a driver extension. On macOS things are a bit different where VK_EXT_layer_settings is made visible by the MoltenVK driver as well. However, to detect its availability in general you need to query the layer and not the driver during instance enumeration. Thanks to @spencer-lunarg + team and armed with this new (to me) information, I have modified the sample to now do proper detection. This results in all platforms using the layer settings path independent of whether vkconfig is running or not. The legacy VkValidationFeatureEnableEXT path is still present in the code, but will be used only with older SDKs that don't have VK_EXT_layer_settings within the VVL layer.
  2. The above findings also made me realize that the general framework for adding a layer setting (in [HPP]Instance()) was not using the correct criteria to chain in the VkLayerSettingsCreateInfoEXT struct during instance creation. Testing for the presence of VK_EXT_layer_settings in the driver will not give the right answer. And looking for which layer to check would only add a bunch of unnecessary complexity to the code. So I simplified things and now leave it up to the sample to determine if layer settings is supported, and if so, to push a layer setting using add_layer_setting(). In [HPP]Instance() I now only check for the presence of layer setting entries in the required_layer_settings vector. This puts the onus on the sample to make sure layer settings entries are supported. However, this is not very risky since vkCreateInstance() will throw away any layer settings that don't match and should not complain.
  3. Lastly, the observed performance slow-downs on various platforms and SDK versions have these mitigations:
    a) The slow-downs that @spencer-lunarg observed with older SDKs in #1187 (comment) have been solved. While I was picking the correct API version 1.2 for older SDKs, I had added code that explicitly requested support for timeline semaphores. It turns out that while the VVL debugPrintfEXT feature requires this and implicitly enables it under the covers, explicitly requesting support in the sample breaks performance. I am not sure why this is the case, but I will leave it to @spencer-lunarg to decide if this is an issue or not. Nonetheless, I have removed this from the sample and older SDKs now work without performance degradation.
    b) The slow-down that was visible for nVidia GPUs running SDK 1.3.296 is apparently solved by gpu: Skip present submission Vulkan-ValidationLayers#8766, which will be available with the next SDK. I have not been able to test this yet, but will report back once I can. UPDATED: fix verified as working on my nVidia GPU machine. However, the only workaround for now is to use SDK 1.3.290 for nVidia GPU users until the next SDK is released.
    c) By fixing the logic to test for the VVL version (and not the instance version) when selecting API 1.1 vs. 1.2, this PR does solve performance issues observed with SDK 1.3.296 for Windows AMD, Linux AMD, and macOS. I'm not sure about Linux nVidia as I cannot test that combination.
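The layer-versus-driver detection from item 1 above can be sketched as below. In real code the query would be `vkEnumerateInstanceExtensionProperties`, whose first parameter `pLayerName` restricts the result to extensions provided by that layer (for the validation layer, `VK_LAYER_KHRONOS_validation`), while a null `pLayerName` returns extensions from the implementation and implicitly enabled layers. To keep this sketch free of Vulkan headers, that call is replaced by a hypothetical `ExtensionEnumerator` callback; the helper name `layer_provides_extension` is likewise an assumption, not the PR's code.

```cpp
#include <functional>
#include <string>
#include <vector>

// Stand-in (assumption) for vkEnumerateInstanceExtensionProperties: given a layer
// name, or nullptr for the implementation and implicit layers, it returns the
// extension names reported for that source.
using ExtensionEnumerator = std::function<std::vector<std::string>(const char *layer_name)>;

// Returns true when the named layer itself reports the extension.
inline bool layer_provides_extension(const ExtensionEnumerator &enumerate,
                                     const char                *layer_name,
                                     const std::string         &extension_name)
{
	for (const std::string &name : enumerate(layer_name))
	{
		if (name == extension_name)
		{
			return true;
		}
	}
	return false;
}
```

Calling the helper with the validation layer's name, rather than with a null layer name, is what distinguishes "the layer supports layer settings" from "the driver happens to expose the extension".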

@spencer-lunarg
Contributor

It turns out that while the VVL debugPrintfEXT feature requires this and implicitly enables it under the covers, explicitly requesting support in the sample breaks performance.

Is this still true? I tried to add it explicitly in this PR and didn't see it slow down (the issue was the old pre-1.3.290 SDK and should be patched now)

@SRSaunders
Contributor Author

It turns out that while the VVL debugPrintfEXT feature requires this and implicitly enables it under the covers, explicitly requesting support in the sample breaks performance.

Is this still true? I tried to add it explicitly in this PR and didn't see it slow down (the issue was the old pre-1.3.290 SDK and should be patched now)

I observed this performance impact when using SDK 1.3.290 with API 1.2 and timeline semaphores explicitly enabled. With SDK 1.3.296 using API 1.1 with timeline semaphores enabled, performance was fine. So I removed the explicit enablement of timeline semaphores and now I get consistent performance using: a) API 1.2 with SDKs <= 1.3.290, and b) API 1.1 with SDKs >= 1.3.296 (aside from the nVidia issue mentioned above).
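The selection rule described in this comment can be sketched as follows. Everything here is illustrative: `make_version`, `ApiLevel` and `select_api_level` are hypothetical names, the version is assumed to be packed with the same bit layout as `VK_MAKE_API_VERSION` (variant 0), and the threshold of 1.3.296 is an assumption, since only 1.3.290 and 1.3.296 are reported as tested and the versions in between are not covered by the discussion.

```cpp
#include <cstdint>

// Packs a version using the same bit layout as VK_MAKE_API_VERSION with variant 0:
// major in bits 22..28, minor in bits 12..21, patch in bits 0..11.
constexpr std::uint32_t make_version(std::uint32_t major, std::uint32_t minor, std::uint32_t patch)
{
	return (major << 22u) | (minor << 12u) | patch;
}

// Hypothetical enum for the two API levels discussed in this thread.
enum class ApiLevel
{
	Vulkan_1_1,
	Vulkan_1_2
};

// Chooses the API level from the validation layer's reported version.
// Assumption: 1.3.296 is the first version for which API 1.1 is the right choice.
constexpr ApiLevel select_api_level(std::uint32_t vvl_version)
{
	return vvl_version >= make_version(1, 3, 296) ? ApiLevel::Vulkan_1_1 : ApiLevel::Vulkan_1_2;
}
```

Because the packed values compare in version order, a single integer comparison is enough and no per-field parsing is needed.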

@SRSaunders
Contributor Author

SRSaunders commented Nov 11, 2024

Given the long discussion and in case it wasn't clear, this PR is now ready to go.

Note for nVidia users (Windows) using SDK 1.3.296: A fix is also required from the VVL which will ship in the next SDK. In the interim, nVidia users should run using this PR combined with SDK 1.3.290 for the shader_debugprintf sample.

Note: This also removes the framework check in [hpp_]instance.cpp for VK_EXT_layer_settings when using add_layer_setting(). This makes using the layer settings feature possible when the extension is not present in the driver, but only present in the layer. It's now up to the sample itself to check for support in the layer if required.

@JoseEmilio-ARM JoseEmilio-ARM self-requested a review November 18, 2024 16:18
@SaschaWillems
Collaborator

Quick update: We discussed this in a recent call and will probably wait with the merge until the next SDK is available to make sure we can simply point people to update their SDK in case of any problems with this sample. Hope that's okay with you.

@SRSaunders
Contributor Author

@SaschaWillems that’s fine re the debugprintf sample. My only concern is the fix for the add_layer_setting() framework capability. Not sure if you want that separated out and brought forward into a new PR for merging now. Let me know.

@SaschaWillems
Collaborator

Is that fix important (sorry, I got kinda lost with all the discussion in this issue)? If so, splitting it into a separate PR is fine, if not it's fine if we merge all of this once the new SDK is out.

@SRSaunders
Contributor Author

I guess we can wait, given that the add_layer_setting() feature is not used extensively at this point. I will leave it up to @JoseEmilio-ARM to comment if an earlier fix would be preferable.

@SRSaunders
Contributor Author

SRSaunders commented Feb 4, 2025

@SaschaWillems - a new Vulkan SDK 1.4.304.0 has been released and this PR can be retested against Nvidia GPUs. I only have access to an AMD card at this point and have retested this PR on Windows+AMD and macOS+AMD using the new SDK. I will test against Linux+AMD as soon as the new SDK is available as an updated package for my distro (Manjaro Linux).

I would appreciate someone testing this PR with Windows+Nvidia against the new SDK to prove out the solution. If this checks out perhaps this long-delayed PR could be merged.

(As an aside, the VVL team may have broken things again with the new SDK but for macOS only this time. It seems they have added a requirement for VkPhysicalDeviceVulkanMemoryModelFeatures::vulkanMemoryModel and VkPhysicalDeviceVulkanMemoryModelFeatures::vulkanMemoryModelDeviceScope when the Validation Layer's debug printf feature is enabled. This was not the case for previous SDKs, and unfortunately these features are not available on MoltenVK. I plan to raise an issue with the VVL project asking why this change was made. However, this does not affect the current PR and shader_debugprintf sample which should be correct for all platforms. The sample + this PR continue to work fine for other platforms as well as previous SDKs on macOS. See issue KhronosGroup/Vulkan-ValidationLayers#9386)

@SaschaWillems SaschaWillems self-requested a review February 5, 2025 20:26
SaschaWillems
SaschaWillems previously approved these changes Feb 5, 2025
Collaborator

@SaschaWillems SaschaWillems left a comment

Thank you very much. Can confirm that this now finally runs at proper frame rates on nvidia + windows + latest SDK.

@SRSaunders
Contributor Author

SRSaunders commented Feb 5, 2025

Can confirm that this now finally runs at proper frame rates on nvidia + windows + latest SDK.

Thanks very much @SaschaWillems for testing and approving. Perhaps @JoseEmilio-ARM could review as well using the new SDK 1.4.304.0.

Contributor

@asuessenbach asuessenbach left a comment

Everything runs fine now!
Just one little issue.

@CLAassistant

CLAassistant commented Feb 7, 2025

CLA assistant check
All committers have signed the CLA.

Development

Successfully merging this pull request may close these issues.

shader_debugprintf problems with new VulkanSDK 1.3.296
5 participants