
MLE-28304 Volume Resizing Implementation #162

Merged
pengzhouml merged 28 commits into develop from feature/MLE-28304-volume-resize on May 26, 2026

@pengzhouml
Collaborator

No description provided.

Copilot AI review requested due to automatic review settings May 8, 2026 16:51
Contributor

Copilot AI left a comment


Pull request overview

Implements end-to-end PersistentVolumeClaim (PVC) expansion support for MarkLogic groups, including validation, multi-phase workflow/status tracking, StatefulSet template synchronization via delete/recreate, controlled pod restarts when filesystem expansion is offline, and accompanying RBAC/CRD/API updates.

Changes:

  • Adds a new multi-phase ReconcileVolumeResizeValidation() workflow with status/events, retry handling, sequential/parallel strategies, and StatefulSet/pod orchestration.
  • Extends the API/CRDs with spec.persistence.resizeStrategy and status.volumeResizeStatus (plus deepcopy updates) and adds unit/controller tests.
  • Updates controller RBAC/manifests/Helm chart to allow required PVC/StorageClass/PV/Event access, and adds functional spec documentation.
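The multi-phase workflow named in the first bullet can be pictured as a linear sequence of phases. The sketch below is illustrative only and is not the code in `pkg/k8sutil/volume_resize_validation.go`: the phase names `ResizingPVCs`, `SynchronizingStatefulSet`, `RestartingPods`, `WaitingForPodsReady`, `VerifyingResizeOutcome`, and `Completed` appear in this PR's commit messages, while `Validating`, `WaitingForPVCResize`, and the `nextPhase` helper are hypothetical names assumed here. It models the happy path only; retries, pausing, and the sequential strategy's loop back to `ResizingPVCs` are left out.

```go
package main

import "fmt"

// ResizePhase names one step of the resize workflow.
type ResizePhase string

const (
	// Hypothetical name: the PR text only says "validation".
	PhaseValidating ResizePhase = "Validating"
	// Named in the PR's commit messages.
	PhaseResizingPVCs ResizePhase = "ResizingPVCs"
	// Hypothetical name: the PR text only says "waiting".
	PhaseWaitingForPVCResize ResizePhase = "WaitingForPVCResize"
	// The remaining names are taken from the PR's commit messages.
	PhaseSynchronizingStatefulSet ResizePhase = "SynchronizingStatefulSet"
	PhaseRestartingPods           ResizePhase = "RestartingPods"
	PhaseWaitingForPodsReady      ResizePhase = "WaitingForPodsReady"
	PhaseVerifyingResizeOutcome   ResizePhase = "VerifyingResizeOutcome"
	PhaseCompleted                ResizePhase = "Completed"
)

// phaseOrder lists the happy-path order of phases.
var phaseOrder = []ResizePhase{
	PhaseValidating,
	PhaseResizingPVCs,
	PhaseWaitingForPVCResize,
	PhaseSynchronizingStatefulSet,
	PhaseRestartingPods,
	PhaseWaitingForPodsReady,
	PhaseVerifyingResizeOutcome,
	PhaseCompleted,
}

// nextPhase returns the phase that follows p on the happy path.
// For the terminal phase or an unknown phase it returns p and false.
func nextPhase(p ResizePhase) (ResizePhase, bool) {
	for i, cur := range phaseOrder {
		if cur == p && i+1 < len(phaseOrder) {
			return phaseOrder[i+1], true
		}
	}
	return p, false
}

func main() {
	// Walk the workflow from start to the terminal phase.
	p := PhaseValidating
	for {
		fmt.Println(p)
		next, ok := nextPhase(p)
		if !ok {
			break
		}
		p = next
	}
}
```

Keeping the order in a single slice makes it easy to persist the current phase in status and resume from it after a controller restart, which is presumably what the status tracking in this PR is for.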

Reviewed changes

Copilot reviewed 15 out of 16 changed files in this pull request and generated 5 comments.

Show a summary per file
| File | Description |
| --- | --- |
| pkg/k8sutil/volume_resize_validation.go | Core resize reconciliation state machine (validation → PVC patching → waiting → STS sync → pod restarts → verification). |
| pkg/k8sutil/volume_resize_validation_test.go | Unit tests covering validation, strategies, retry behavior, sync markers, restarts, and verification transitions. |
| pkg/k8sutil/handler.go | Inserts resize reconciliation into the MarklogicGroup handler flow (before StatefulSet reconcile). |
| internal/controller/marklogicgroup_controller.go | Adds RBAC annotations for PVC/PV/StorageClass/events needed by resizing. |
| internal/controller/marklogicgroup_controller_test.go | Adds envtest-style controller tests for resize validation behaviors. |
| docs/spec/volume resize.md | Functional spec for the resize feature (workflow, status contract, recovery, RBAC). |
| docs/operator-scope-configuration.md | Documents extra StorageClass ClusterRole needed in namespace-scoped mode. |
| config/rbac/role.yaml | Adds PVC/PV/StorageClass/events permissions to cluster-scoped role. |
| config/rbac/role_namespaced.yaml | Adds PVC/events Role permissions and a StorageClass reader ClusterRole/Binding for namespace-scoped mode. |
| config/crd/bases/marklogic.progress.com_marklogicgroups.yaml | Adds persistence.resizeStrategy and status.volumeResizeStatus schema. |
| config/crd/bases/marklogic.progress.com_marklogicclusters.yaml | Adds persistence.resizeStrategy schema at cluster and group override levels. |
| charts/marklogic-operator-kubernetes/templates/manager-rbac.yaml | Helm RBAC updates for PVC/PV/StorageClass/events + StorageClass reader ClusterRole/Binding in namespace scope. |
| api/v1/common_types.go | Introduces VolumeResizeStrategy and adds Persistence.ResizeStrategy with default/enum validation. |
| api/v1/marklogicgroup_types.go | Adds resize phase/reason/state enums and VolumeResizeStatus/PVC status types onto MarklogicGroupStatus. |
| api/v1/zz_generated.deepcopy.go | Deepcopy support for new status/types. |
| api/v1/marklogicgroup_types_test.go | Deepcopy regression test for the new status fields. |


Comment thread pkg/k8sutil/volume_resize_validation.go
Comment thread pkg/k8sutil/volume_resize_validation.go
Comment thread pkg/k8sutil/volume_resize_validation.go
Comment thread pkg/k8sutil/volume_resize_validation.go Outdated
Comment thread pkg/k8sutil/volume_resize_validation.go
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 15 out of 17 changed files in this pull request and generated 3 comments.

Files not reviewed (1)
  • api/v1/zz_generated.deepcopy.go: Language not supported

Comment thread pkg/k8sutil/volume_resize_validation.go
Comment thread pkg/k8sutil/volume_resize_validation.go
Comment thread pkg/k8sutil/volume_resize_validation.go
@pengzhouml pengzhouml force-pushed the feature/MLE-28304-volume-resize branch from 2b8a349 to 9125c5b Compare May 15, 2026 18:27
Peng Zhou and others added 18 commits May 18, 2026 17:10
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
Co-authored-by: Copilot <copilot@github.com>
add controller RBAC markers for PVC status, PV, and events
add matching cluster and namespaced RBAC manifest rules
align Helm manager RBAC template with generated RBAC
…wait loop, and retry/backoff status handling

Co-authored-by: Copilot <copilot@github.com>
New phases: SynchronizingStatefulSet, RestartingPods, WaitingForPodsReady
Recovery markers and resume behavior
OfflinePending-only restart candidate logic
Reverse ordinal restart ordering
Tests passed:
go test ./pkg/k8sutil -run TestResize -count=1
go test ./api/v1 -count=1
go test ./internal/controller -run TestDoesNotExist -count=1

Co-authored-by: Copilot <copilot@github.com>
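The "OfflinePending-only restart candidate logic" and "reverse ordinal restart ordering" items in the commit above can be sketched as a filter followed by a descending sort. This is a minimal illustration, not the PR's implementation: the `podResize` type, the `restartOrder` and `ordinal` helpers, and the representation of the state as the string `"OfflinePending"` are assumptions made here. The only Kubernetes fact relied on is that StatefulSet pods are named `<statefulset-name>-<ordinal>`.

```go
package main

import (
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// podResize pairs a pod name with the resize state of its volume.
// Hypothetical type for illustration only.
type podResize struct {
	Name  string
	State string // e.g. "OfflinePending" when the filesystem can only grow offline
}

// ordinal extracts the trailing StatefulSet ordinal from a pod name such as "dnode-2".
func ordinal(name string) (int, error) {
	i := strings.LastIndex(name, "-")
	if i < 0 {
		return 0, fmt.Errorf("pod name %q has no ordinal suffix", name)
	}
	return strconv.Atoi(name[i+1:])
}

// restartOrder keeps only pods in the OfflinePending state and returns
// their names sorted by ordinal, highest first.
func restartOrder(pods []podResize) ([]string, error) {
	type candidate struct {
		name string
		ord  int
	}
	var cands []candidate
	for _, p := range pods {
		if p.State != "OfflinePending" {
			continue // pods that do not need an offline expansion are left alone
		}
		ord, err := ordinal(p.Name)
		if err != nil {
			return nil, err
		}
		cands = append(cands, candidate{name: p.Name, ord: ord})
	}
	// Sort numerically so that "dnode-10" comes before "dnode-2".
	sort.Slice(cands, func(i, j int) bool { return cands[i].ord > cands[j].ord })
	names := make([]string, 0, len(cands))
	for _, c := range cands {
		names = append(names, c.name)
	}
	return names, nil
}

func main() {
	order, err := restartOrder([]podResize{
		{Name: "dnode-0", State: "OfflinePending"},
		{Name: "dnode-1", State: "Resized"},
		{Name: "dnode-2", State: "OfflinePending"},
	})
	fmt.Println(order, err)
}
```

Restarting from the highest ordinal down mirrors the order in which a StatefulSet rolling update replaces pods.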
…safety

add VerifyingResizeOutcome execution path and active-phase routing
implement final verification checks for PVCs, StatefulSet template, restart state, filesystem pending, and pod readiness
transition successful verification to Completed with coherent terminal status fields
add verification retry and failure handling with stalled resume back to verification and max-retry failure
add coarse verification lifecycle markers and events
fix CAS claim behavior to reliably start deferred target after terminal persistence
add PR5 tests for completion, retry resume, terminal failure, deferred handoff, and final field consistency

Co-authored-by: Copilot <copilot@github.com>
…cess blockers

Co-authored-by: Copilot <copilot@github.com>
…dation

add MarklogicGroup env component tests for growth initialization, shrink rejection, and non-OnDelete strategy rejection
add reusable helpers to create persistent MarklogicGroup and PVC fixtures
align PVC test fixture with current Kubernetes API using VolumeResourceRequirements
- switch resize retries to bounded exponential backoff (10s initial, 5m cap, 15 max)
- requeue missing/unbound PVC validation stalls so resize can self-recover automatically
- fix sequential strategy handoff by transitioning back to ResizingPVCs for next patch
- classify template-below-target verification as StatefulSetSyncFailed (not MarkLogicHealthCheckFailed)
- route stalled retries to the correct phase based on failure domain
- move internal crash-recovery markers out of warnings into a dedicated markers field
- add legacy marker normalization, CRD/deepcopy updates, and expanded unit test coverage
- update sample config to enable persistence in quick-start
* add-test-suites

* add csi-hostpath-driver addon

* fix Copilot comments

* improve E2E summary output

* fix SC config

* re arrange test sequence

* remove duplicate test
@pengzhouml pengzhouml force-pushed the feature/MLE-28304-volume-resize branch from 9125c5b to 2cbd1b4 Compare May 19, 2026 00:11
…ehavior

preserve current resize phase while paused and mark status with reason=Paused
clear pause reason/message on resume so normal phase progression can continue
prevent new resize operation creation after terminal status when spec generation/target is unchanged
add bounded jitter to exponential retry backoff (still capped at max delay)
expand tests for pause/resume, terminal restart fencing, and retry jitter bounds
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 30 out of 32 changed files in this pull request and generated 5 comments.

Files not reviewed (1)
  • api/v1/zz_generated.deepcopy.go: Language not supported

Comment thread docs/operator-scope-configuration.md Outdated
Comment thread internal/controller/marklogicgroup_controller.go Outdated
Comment thread config/rbac/role.yaml Outdated
Comment thread pkg/k8sutil/volume_resize_validation.go
Comment thread test/e2e/8_metrics_test.go Outdated
pengzhouml and others added 4 commits May 19, 2026 22:41
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@sumanthravipati
Collaborator

@pengzhouml please make sure to use the Jira ID in commit messages, as is standard practice.

Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 30 out of 32 changed files in this pull request and generated 4 comments.

Files not reviewed (1)
  • api/v1/zz_generated.deepcopy.go: Language not supported

Comment thread test/e2e/main_test.go
Comment thread Jenkinsfile Outdated
Comment thread config/samples/quick-start.yaml
Comment thread test/e2e-helm/9_volume_resize_test.go Outdated
pengzhouml and others added 2 commits May 19, 2026 23:57
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@pengzhouml pengzhouml merged commit ab08499 into develop May 26, 2026
3 of 4 checks passed


5 participants