external-snapshotter does not retry removal of VolumeSnapshotContent finalizers #1301

Open
jsafrane opened this issue May 15, 2025 · 2 comments

Comments

@jsafrane
Contributor

jsafrane commented May 15, 2025

What happened:
During an API server hiccup, the external-snapshotter timed out while removing its finalizer from a VolumeSnapshotContent. Such hiccups happen all the time; however, the snapshotter did not retry with exponential backoff.

What you expected to happen:
The snapshotter retries with exponential backoff.

How to reproduce it:
I can't reproduce it on demand; it needs careful timing of the API server error. This small commit shows how an error from the removeContentFinalizer() call causes the snapshotter not to retry: jsafrane@3da05bb

With that commit, I see the following in the snapshotter logs just once:

I0515 16:52:17.267236       1 snapshot_controller.go:59] synchronizing VolumeSnapshotContent[snapcontent-2cee4522-4ae0-40d4-9718-4b51c1e261f8]
I0515 16:52:17.267260       1 snapshot_controller.go:631] Check if VolumeSnapshotContent[snapcontent-2cee4522-4ae0-40d4-9718-4b51c1e261f8] should be deleted.
I0515 16:52:17.267278       1 snapshot_controller.go:62] VolumeSnapshotContent[snapcontent-2cee4522-4ae0-40d4-9718-4b51c1e261f8]: the policy is Delete
E0515 16:52:17.267344       1 snapshot_controller_base.go:361] could not sync content "snapcontent-2cee4522-4ae0-40d4-9718-4b51c1e261f8": snapshot controller failed to update snapcontent-2cee4522-4ae0-40d4-9718-4b51c1e261f8 on API server: mock finalizer removal error

I want the snapshotter to retry periodically.
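
To illustrate the requested behavior, here is a minimal, self-contained sketch of per-key retry with exponential backoff. It is not the snapshotter's code: `backoff`, `syncWithRetry`, `noSleep`, and `errMock` are hypothetical names made up for this example. In a real controller the same pattern is usually obtained from client-go's rate-limited workqueue, by calling `AddRateLimited` on a sync error and `Forget` on success.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errMock stands in for the injected failure seen in the logs above.
var errMock = errors.New("mock finalizer removal error")

// backoff tracks per-key failure counts and derives a growing delay.
type backoff struct {
	base, max time.Duration
	failures  map[string]int
}

func newBackoff(base, max time.Duration) *backoff {
	return &backoff{base: base, max: max, failures: map[string]int{}}
}

// next records one more failure for key and returns the delay before the
// next attempt: base doubled once per earlier failure, capped at max.
func (b *backoff) next(key string) time.Duration {
	n := b.failures[key]
	b.failures[key] = n + 1
	d := b.base
	for i := 0; i < n; i++ {
		d *= 2
		if d >= b.max {
			return b.max
		}
	}
	if d > b.max {
		return b.max
	}
	return d
}

// forget resets the failure count for key after a successful sync.
func (b *backoff) forget(key string) { delete(b.failures, key) }

// noSleep is a stand-in for time.Sleep so the example finishes instantly.
func noSleep(time.Duration) {}

// syncWithRetry calls sync until it succeeds or maxAttempts is reached,
// waiting for the backoff delay between attempts. It returns the delays
// it waited for and the last error, if any.
func syncWithRetry(key string, b *backoff, maxAttempts int,
	sleep func(time.Duration), sync func(string) error) ([]time.Duration, error) {
	var delays []time.Duration
	for attempt := 1; ; attempt++ {
		err := sync(key)
		if err == nil {
			b.forget(key)
			return delays, nil
		}
		if attempt >= maxAttempts {
			return delays, err
		}
		d := b.next(key)
		delays = append(delays, d)
		sleep(d)
	}
}

func main() {
	calls := 0
	// Hypothetical sync function: fails twice, then succeeds.
	sync := func(key string) error {
		calls++
		if calls < 3 {
			return errMock
		}
		return nil
	}
	b := newBackoff(100*time.Millisecond, 5*time.Second)
	delays, err := syncWithRetry("snapcontent-example", b, 5, noSleep, sync)
	fmt.Println("attempts:", calls, "delays:", delays, "err:", err)
}
```

The key point is that a failed sync always schedules another attempt for the same key, and only a success clears the failure count.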

Environment:

  • Driver version: csi-driver-hostpath
  • csi-snapshotter: v8.2.0
@yati1998
Contributor

Hi @jsafrane, I see a similar issue already raised: #1282
The volumeSnapshotContent is not requeued

@jsafrane
Contributor Author

I think it's a different issue. In #1282, the snapshotter re-queues a snapshot / snapshot content after 5 minutes, and that may be too long in some cases. I don't see my snapshot content requeued ever.
