-
Notifications
You must be signed in to change notification settings - Fork 26
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add NVTX support via DrHook (Refactored) #28
Conversation
This commit moves NVTX functionality from an external implementation to within the DrHook core. This increases performance, aids maintainability, and makes NVTX available to all languages supported by DrHook. The following options have been introduced: - opt_nvtx (DR_HOOK_NVTX) - A default of 0 - opt_nvtx_SCC (DR_HOOK_NVTX_SPAM_CALL_COUNT) - NVTX spam call count - How many calls are required before it's marked as spam - Defaults to value of compile time definition nvtx_SCC_default (10) - opt_nvtx_SWT (DR_HOOK_NVTX_SPAM_WT) - NVTX spam wall time - How much cumulative wall time is needed before SCC is overriden - Measured in seconds - Defaults to value of compile time definition nvtx_SWT_default (0.0001) Fix options being under DR_HOOK_OPT
This commit standardises the approach to adding third party extensions (such as NVTX) to DrHook's CMakeLists.txt. The new approach is to exclude the extensions folder from the initial fiat target creation, and then individually add the extension's sources, include directories, and external libraries. This allows for an easy pattern to follow for future extensions, which is entirely contained within one `if (HAVE_DR_HOOK_{feature})` statement.
This option enforces DrHook regions to have the same name on both entry and exit calls. This is done by aborting immediately upon exit calls that have a non-matching region name. Previously DrHook would instead ignore these mismatches as it only tracks the "hoo_handle" variable.
This commit refactors tests for NVTX and adds a new test, `drhook_nvtx_mismatched_regions`. This new test checks that `DR_HOOK_STRICT_REGIONS` is enabled and working when NVTX is enabled.
Previously the silent option would only be local to the `process_options()` function. Now it has been promoted to be global to drhook.c. This allows it to be used without being passed through functions, such as where it has been used by NVTX code in the `putkey()` function.
These two new tests, `drhook_nvtx_skip_spam_regions` & `drhook_nvtx_no_skip_spam_regions`, demonstrate the following behaviours: - drhook_nvtx_skip_spam_regions - Regions are successfully skipped when called frequently & with low total runtime - drhook_nvtx_no_skip_spam_regions - Regions are successfully not skipped when called frequently, but with sufficient total runtime
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That looks very good! I have a few questions, see below.
Now that it is part of drhook.c could you please also add a C-language test?
Also it would be great if @pmarguinaud can test this in his software stack to make sure it is working properly.
@Andrew-Beggs-ECMWF testing this I had to apply a fixup to your last commit. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All looks like it's working now.
This PR is an extension of #23. It adds all the same functionality, as well as support for NVTX in C programs.
Overall, code was moved from separate source files, to be in the DrHook core. This increases both performance and maintainability. Additional tests were also added.
Two new options were also added:
Currently there is no support for NVTX3, the newest version of the library. This is because it was not in the original PR & it was not installed on any machine I have access to. Given this lack of demand, I have decided to postpone it until requested in an issue. The structure is also already in place to implement it when necessary.Functionality added as of9b82f8e