Skip to content

Disable chmod-hook by default #1221

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

ArangoGutierrez
Copy link
Collaborator

@ArangoGutierrez ArangoGutierrez commented Aug 4, 2025

Fixes #1218

The chmod createContainer hook was added as a workaround to a specific crun issue for device nodes that are found in subdirectories of /dev. Since this has since been addressed in crun, this PR disables the chmod hook by default to reduce the number of supported hooks.

See:

@ArangoGutierrez ArangoGutierrez added this to the v1.18.0 milestone Aug 4, 2025
@ArangoGutierrez ArangoGutierrez self-assigned this Aug 4, 2025
Copilot

This comment was marked as outdated.

@@ -82,6 +82,7 @@ type Option func(*cdiHookCreator)
type cdiHookCreator struct {
nvidiaCDIHookPath string
disabledHooks map[HookName]bool
enableChmodHook bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned in the addition of the flag. Does maintaining a map of "forced" hooks make more sense here than handling each hook separately?

hookCreator := discover.NewHookCreator(discover.WithNVIDIACDIHookPath(cfg.NVIDIACTKConfig.Path))
hookCreator := discover.NewHookCreator(
discover.WithNVIDIACDIHookPath(cfg.NVIDIACTKConfig.Path),
discover.WithEnableChmodHook(cfg.Features.EnableChmodHook.IsEnabled()),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this indicate that we may need something like the following in the runtime config?

[nvidia-container-runtime.hooks]

disabled = [...]
# Specify a list of explicitly enabled hooks. If a hook is specified as both enabled and disabled, it will be enabled.
enabled = [...]

@coveralls
Copy link

coveralls commented Aug 7, 2025

Pull Request Test Coverage Report for Build 16913104397

Details

  • 40 of 44 (90.91%) changed or added relevant lines in 5 files are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage increased (+0.3%) to 35.81%

Changes Missing Coverage Covered Lines Changed/Added Lines %
pkg/nvcdi/options.go 0 4 0.0%
Totals Coverage Status
Change from base Build 16907071183: 0.3%
Covered Lines: 4627
Relevant Lines: 12921

💛 - Coveralls

@ArangoGutierrez ArangoGutierrez force-pushed the i-1218 branch 2 times, most recently from f18a08a to b26ad2b Compare August 7, 2025 16:54
&cli.StringSliceFlag{
Name: "enable-hook",
Aliases: []string{"enable-hooks"},
Usage: "Explicitly enable a hook in the generated CDI specification. This can be specified multiple times.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
Usage: "Explicitly enable a hook in the generated CDI specification. This can be specified multiple times.",
Usage: "Explicitly enable a hook in the generated CDI specification. This overrides disabled hooks. This can be specified multiple times.",

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added

Comment on lines 304 to 315
if m.config != nil {
if len(opts.disabledHooks) == 0 {
if disabledHooks := m.config.Get("nvidia-container-runtime-hook.disabled"); disabledHooks != nil {
if hooks, ok := disabledHooks.([]interface{}); ok {
for _, hook := range hooks {
if h, ok := hook.(string); ok {
opts.disabledHooks = append(opts.disabledHooks, h)
}
}
}
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this required? I thought we added support to read settings from a config file using altsrc.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

delete

Comment on lines 26 to 33
// DisabledHooks specifies the hooks that are disabled for the NVIDIA
// Container Runtime hook.
DisabledHooks []string `toml:"disabled,omitempty"`
// EnabledHooks specifies the hooks that are enabled for the NVIDIA
// Container Runtime hook.
// Specify a list of explicitly enabled hooks. If a hook is specified as
// both enabled and disabled, it will be enabled.
EnabledHooks []string `toml:"enabled,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we should add these config options, and IF we decide to add them we should not add them in the nvidia-container-runtime-hook section. This section is specific for the nvidia-container-runtime-hook that is used to call out to the nvidia-container-cli.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed for now, maybe a discuss-todo item

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you forget to push?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes...

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, now pushed

// This can be specified multiple times.
func WithDisabledHooks(hooks ...HookName) Option {
return func(c *cdiHookCreator) {
for _, hook := range hooks {
c.disabledHooks[hook] = true
c.hookStates[hook] = false
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This implies ordering between the calls to WithDisabledHooks and WithEnabledHooks. I would rather have the order or precedence well-defined such that hooks that are specifically ENABLED take precedence over hooks that are specifically DISABLED.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe this can be simplified by keeping the cdiHookCreator struct as is, and introducing an options struct that we can use to set up the correct values of disabledHooks.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you are right, let me think on a proper solution for this

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
…the hook creator

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Comment on lines 26 to 28
// DisabledHooks specifies the hooks that are disabled for the NVIDIA
// Container Runtime hook.
DisabledHooks []string `toml:"disabled,omitempty"`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought we said we're going to remove this?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed

@@ -50,6 +50,14 @@ const (
defaultNvidiaCDIHookPath = "/usr/bin/nvidia-cdi-hook"
)

// defaultDisabledHooks defines hooks that are disabled by default.
// These hooks can be explicitly enabled using the WithEnabledHooks option.
var defaultDisabledHooks = map[HookName]bool{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This need not be a map. Also does it make sense to just set this inline when constructing the hook creator?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

turned into a list.
I do think it is better to have it here, so the list is easier to maintain. Over time, we might want to add other hooks to the list.

Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR disables the chmod createContainer hook by default, addressing issue #1218. The chmod hook was originally added as a workaround for a specific crun issue with device nodes in subdirectories of /dev, but this has since been fixed in crun version 1.7.

  • Adds defaultDisabledHooks slice containing the ChmodHook to disable it by default
  • Introduces WithEnabledHooks option to explicitly enable hooks that are disabled by default
  • Updates the hook creation logic to handle both disabled and enabled hooks with proper precedence

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated 1 comment.

Show a summary per file
File Description
internal/discover/hooks.go Adds default disabled hooks list and enabled hooks functionality
pkg/nvcdi/options.go Adds WithEnabledHook option function
pkg/nvcdi/lib.go Adds enabledHooks field and passes it to hook creator
cmd/nvidia-ctk/cdi/generate/generate.go Adds --enable-hook CLI flag and option handling
cmd/nvidia-ctk/cdi/generate/generate_test.go Adds test case for enabling chmod hook
internal/runtime/runtime_factory.go Refactors hook creator initialization to use options slice
cmd/nvidia-ctk-installer/toolkit/toolkit_test.go Updates expected test output

}
for _, opt := range opts {
opt(cdiHookCreator)
}

// Correct the disabledHooks map to ensure that explicitly enabled hooks
// are not disabled.
for hook := range enabledHooks {
Copy link
Preview

Copilot AI Aug 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The loop iterates over enabledHooks variable instead of cdiHookCreator.enabledHooks. This will cause enabled hooks to not be properly processed since the local enabledHooks variable is always empty at this point.

Suggested change
for hook := range enabledHooks {
for hook := range cdiHookCreator.enabledHooks {

Copilot uses AI. Check for mistakes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does this mean that we don't have a test for the case where enabled overrides disabled?

@@ -75,7 +75,11 @@ func newSpecModifier(logger logger.Interface, cfg *config.Config, ociSpec oci.Sp
return nil, err
}

hookCreator := discover.NewHookCreator(discover.WithNVIDIACDIHookPath(cfg.NVIDIACTKConfig.Path))
opts := []discover.Option{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's revert this change.

@@ -100,7 +109,7 @@ type HookCreator interface {
Create(HookName, ...string) *Hook
}

// WithDisabledHooks sets the set of hooks that are disabled for the CDI hook creator.
// WithDisabledHooks explicitly disables the specified hooks.
// This can be specified multiple times.
func WithDisabledHooks(hooks ...HookName) Option {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let's update these to also use generics so that we can pass ...string or ...HookName.

@@ -82,6 +90,7 @@ type Option func(*cdiHookCreator)
type cdiHookCreator struct {
nvidiaCDIHookPath string
disabledHooks map[HookName]bool
enabledHooks map[HookName]bool
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is only actually used in the construction of the cdiHookCreator and doesn't affect behaviour, does it make sense to split out separate options instead of updating the struct directly? Something like:

diff --git a/internal/discover/hooks.go b/internal/discover/hooks.go
index acfaef86..ada69fb8 100644
--- a/internal/discover/hooks.go
+++ b/internal/discover/hooks.go
@@ -85,12 +85,17 @@ func (h *Hook) Hooks() ([]Hook, error) {
 	return []Hook{*h}, nil
 }
 
-type Option func(*cdiHookCreator)
+type hookCreatorOptions struct {
+	nvidiaCDIHookPath string
+	disabledHooks     []HookName
+	enabledHooks      []HookName
+}
+
+type Option func(*hookCreatorOptions)
 
 type cdiHookCreator struct {
 	nvidiaCDIHookPath string
 	disabledHooks     map[HookName]bool
-	enabledHooks      map[HookName]bool
 
 	fixedArgs    []string
 	debugLogging bool
@@ -111,60 +116,61 @@ type HookCreator interface {
 
 // WithDisabledHooks explicitly disables the specified hooks.
 // This can be specified multiple times.
-func WithDisabledHooks(hooks ...HookName) Option {
-	return func(c *cdiHookCreator) {
-		for _, hook := range hooks {
-			c.disabledHooks[hook] = true
+func WithDisabledHooks[T string | HookName](hooks ...T) Option {
+	return func(c *hookCreatorOptions) {
+		for _, h := range hooks {
+			c.disabledHooks = append(c.disabledHooks, HookName(h))
 		}
 	}
 }
 
 // WithEnabledHooks explicitly enables the specified hooks.
 // This is useful for enabling hooks that are disabled by default.
-func WithEnabledHooks(hooks ...HookName) Option {
-	return func(c *cdiHookCreator) {
-		for _, hook := range hooks {
-			c.enabledHooks[hook] = true
+func WithEnabledHooks[T string | HookName](hooks ...T) Option {
+	return func(c *hookCreatorOptions) {
+		for _, h := range hooks {
+			c.enabledHooks = append(c.disabledHooks, HookName(h))
 		}
 	}
 }
 
 // WithNVIDIACDIHookPath sets the path to the nvidia-cdi-hook binary.
 func WithNVIDIACDIHookPath(nvidiaCDIHookPath string) Option {
-	return func(c *cdiHookCreator) {
+	return func(c *hookCreatorOptions) {
 		c.nvidiaCDIHookPath = nvidiaCDIHookPath
 	}
 }
 
 func NewHookCreator(opts ...Option) HookCreator {
-	disabledHooks := make(map[HookName]bool)
-	enabledHooks := make(map[HookName]bool)
-	for _, hook := range defaultDisabledHooks {
-		disabledHooks[hook] = true
-	}
-
-	cdiHookCreator := &cdiHookCreator{
+	o := &hookCreatorOptions{
 		nvidiaCDIHookPath: defaultNvidiaCDIHookPath,
-		disabledHooks:     disabledHooks,
-		enabledHooks:      enabledHooks,
 	}
 	for _, opt := range opts {
-		opt(cdiHookCreator)
+		opt(o)
 	}
 
-	// Correct the disabledHooks map to ensure that explicitly enabled hooks
-	// are not disabled.
-	for hook := range enabledHooks {
-		cdiHookCreator.disabledHooks[hook] = false
+	o.disabledHooks = append(o.disabledHooks, defaultDisabledHooks...)
+
+	disabledHooks := make(map[HookName]bool)
+	for _, h := range o.disabledHooks {
+		disabledHooks[h] = true
 	}
 
-	if cdiHookCreator.disabledHooks[AllHooks] {
+	if disabledHooks[AllHooks] && len(o.enabledHooks) == 0 {
 		return &allDisabledHookCreator{}
 	}
 
-	cdiHookCreator.fixedArgs = getFixedArgsForCDIHookCLI(cdiHookCreator.nvidiaCDIHookPath)
+	for _, h := range o.enabledHooks {
+		disabledHooks[h] = false
+	}
+
+	c := &cdiHookCreator{
+		nvidiaCDIHookPath: o.nvidiaCDIHookPath,
+		disabledHooks:     make(map[HookName]bool),
+		fixedArgs:         getFixedArgsForCDIHookCLI(o.nvidiaCDIHookPath),
+	}
 
-	return cdiHookCreator
+	return c
 }
 
 // Create creates a new hook with the given name and arguments.
@@ -183,7 +189,11 @@ func (c cdiHookCreator) Create(name HookName, args ...string) *Hook {
 }
 
 func (c cdiHookCreator) isDisabled(name HookName, args ...string) bool {
-	if c.disabledHooks[name] {
+	disabled, ok := c.disabledHooks[name]
+	if ok {
+		return disabled
+	}
+	if c.disabledHooks[AllHooks] {
 		return true
 	}
 

It would also be good to add a unit test to test cdiHookCreator.isDisabled behaviour.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unit test added

@@ -0,0 +1,193 @@
package discover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: license header

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 182 to 266
// Apply enabled hooks override
for _, enabledHook := range tt.enabledHooks {
creator.disabledHooks[enabledHook] = false
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should be calling New() to test the actual constructor. Here we are just testing that we disable the hooks in the test.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Comment on lines 187 to 190
result := creator.isDisabled(tt.hookName, tt.args...)
if result != tt.expectedResult {
t.Errorf("isDisabled() = %v, want %v", result, tt.expectedResult)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We use the require package for our tests in this project.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

@elezar
Copy link
Member

elezar commented Aug 12, 2025

One note. Users that REQUIRE this hook don't have the option to opt-in to it when using the runtime -- only CDI. Maybe that's sufficient though.

Signed-off-by: Carlos Eduardo Arango Gutierrez <[email protected]>
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copilot encountered an error and was unable to review this pull request. You can try again by re-requesting a review.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Disable chmod hook by default
3 participants