K8SPSMDB-1363: Snapshot-based backups #2247
Conversation
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Pull request overview
This PR implements snapshot-based backups for Percona Server for MongoDB, adding support for Kubernetes VolumeSnapshots as an alternative backup mechanism. The implementation introduces a new backup executor interface to handle different backup types (managed vs snapshot-based), integrates with the Kubernetes CSI snapshot API, and updates the CRD to support the new external backup type with volume snapshot configuration.
Changes:
- Introduces a `backupExecutor` interface to support multiple backup implementations (managed and snapshot-based)
- Adds VolumeSnapshot support for external backups via a new `snapshot.go` controller
- Updates API types to include a `VolumeSnapshotClass` field and a `SnapshotInfo` status
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 10 comments.
| File | Description |
|---|---|
| pkg/apis/psmdb/v1/perconaservermongodbbackup_types.go | Adds external backup type, VolumeSnapshotClass field, and SnapshotInfo struct to track volume snapshots |
| pkg/controller/perconaservermongodbbackup/snapshot.go | New file implementing snapshot-based backup logic including VolumeSnapshot creation and reconciliation |
| pkg/controller/perconaservermongodbbackup/backup.go | Refactors existing backup logic into managedBackups type and introduces backupExecutor interface |
| pkg/controller/perconaservermongodbbackup/psmdb_backup_controller.go | Updates controller to select backup executor based on backup type and configuration |
| pkg/psmdb/backup/pbm.go | Adds GetBackupByName and FinishBackup methods, wraps credentials with MaskedString for security |
| pkg/naming/naming.go | Adds VolumeSnapshotName function to generate snapshot resource names |
| deploy/rbac.yaml, deploy/cw-rbac.yaml | Grants operator permissions to create and manage VolumeSnapshot resources |
| config/crd/bases/*.yaml, deploy/*.yaml | Updates CRD definitions to include new backup type and snapshot fields |
| cmd/manager/main.go | Registers VolumeSnapshot v1 API scheme |
| go.mod, go.sum | Updates Go version and PBM dependency version, adds kubernetes-csi/external-snapshotter client |
| deploy/bundle.yaml | Contains deployment configuration with modified operator image reference |
```go
	}
	cn, err := r.newPBMFunc(ctx, r.client, cluster)
	if err != nil {
		return nil, errors.Wrap(err, "reate pbm object")
```
There's a typo in the error message. It says "reate" instead of "create".
Suggested change:

```diff
-		return nil, errors.Wrap(err, "reate pbm object")
+		return nil, errors.Wrap(err, "create pbm object")
```
```go
	status.State = api.BackupStateRequested

	status = api.PerconaServerMongoDBBackupStatus{
		PBMname: name,
		LastTransition: &metav1.Time{
			Time: time.Unix(time.Now().Unix(), 0),
		},
		State: api.BackupStateRequested,
	}
```
The status is assigned twice unnecessarily. Line 68 sets status.State, then lines 70-76 create a new status struct with the same information. The first assignment on line 68 is redundant and should be removed.
```go
type SnapshotInfo struct {
	NodeName     string `json:"nodeName,omitempty"`
	SnapshotName string `json:"snapshotName,omitempty"`
}
```
The SnapshotInfo struct lacks documentation. Add a comment describing what this struct represents (e.g., "SnapshotInfo contains information about a volume snapshot created for a MongoDB node during an external backup").
```go
	podName := func(nodeName string) (string, error) {
		parts := strings.Split(nodeName, ".")
		if len(parts) < 1 {
```
The condition len(parts) < 1 on line 132 is always false because strings.Split always returns at least one element (an empty string if the input is empty). The check should be len(parts) == 0 (which would never be true) or more likely len(parts) < 2 or the first element should be checked for being non-empty. Consider revising this logic to properly validate the node name format.
Suggested change:

```diff
-		if len(parts) < 1 {
+		if parts[0] == "" {
```
```diff
 func (p *PerconaServerMongoDBBackup) CheckFields() error {
-	if len(p.Spec.StorageName) == 0 {
+	if len(p.Spec.StorageName) == 0 && p.Spec.Type != defs.ExternalBackup {
 		return fmt.Errorf("spec storageName field is empty")
 	}
```
The CheckFields method allows external backups without a VolumeSnapshotClass, but the controller only creates snapshot backups when both Type is ExternalBackup AND VolumeSnapshotClass is set. This could lead to a confusing scenario where a user creates an external backup without a VolumeSnapshotClass, and it falls through to the default managed backup path. Consider adding validation to ensure that if Type is ExternalBackup, VolumeSnapshotClass must be specified, or document this behavior clearly.
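A sketch of the missing check the comment asks for. Function and parameter names here are illustrative, not the operator's actual identifiers:

```go
package main

import (
	"errors"
	"fmt"
)

// validateExternalBackup is a hypothetical helper: an external backup
// must carry a VolumeSnapshotClass, otherwise it would silently fall
// through to the managed backup path.
func validateExternalBackup(backupType string, volumeSnapshotClass *string) error {
	if backupType == "external" && (volumeSnapshotClass == nil || *volumeSnapshotClass == "") {
		return errors.New("volumeSnapshotClass must be set when backup type is external")
	}
	return nil
}

func main() {
	class := "csi-snapclass"
	fmt.Println(validateExternalBackup("external", &class)) // ok: <nil>
	fmt.Println(validateExternalBackup("external", nil))    // rejected with an error
}
```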
pkg/controller/perconaservermongodbbackup/snapshot.go

```go
package perconaservermongodbbackup

import (
	"context"
	"fmt"
	"strings"
	"time"

	volumesnapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
	"github.com/percona/percona-backup-mongodb/pbm/ctrl"
	"github.com/percona/percona-backup-mongodb/pbm/defs"
	pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"
	"github.com/pkg/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/ptr"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	logf "sigs.k8s.io/controller-runtime/pkg/log"

	api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
	"github.com/percona/percona-server-mongodb-operator/pkg/naming"
	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"
)

type snapshotBackups struct {
	pbm  backup.PBM
	spec api.BackupSpec
}

func (r *ReconcilePerconaServerMongoDBBackup) newSnapshotBackups(ctx context.Context, cluster *api.PerconaServerMongoDB) (*snapshotBackups, error) {
	if cluster == nil {
		return &snapshotBackups{}, nil
	}
	cn, err := r.newPBMFunc(ctx, r.client, cluster)
	if err != nil {
		return nil, errors.Wrap(err, "reate pbm object")
	}

	return &snapshotBackups{pbm: cn, spec: cluster.Spec.Backup}, nil
}

func (b *snapshotBackups) PBM() backup.PBM {
	return b.pbm
}

func (b *snapshotBackups) Start(ctx context.Context, k8sclient client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
	log := logf.FromContext(ctx).WithValues("backup", cr.Name)

	log.Info("Starting snapshot backup")

	var status api.PerconaServerMongoDBBackupStatus

	name := time.Now().UTC().Format(time.RFC3339)
	cmd := ctrl.Cmd{
		Cmd: ctrl.CmdBackup,
		Backup: &ctrl.BackupCmd{
			Name: name,
			Type: defs.ExternalBackup,
		},
	}

	log.Info("Sending backup command", "backupCmd", cmd)

	if err := b.pbm.SendCmd(ctx, cmd); err != nil {
		return status, err
	}
	status.State = api.BackupStateRequested

	status = api.PerconaServerMongoDBBackupStatus{
		PBMname: name,
		LastTransition: &metav1.Time{
			Time: time.Unix(time.Now().Unix(), 0),
		},
		State: api.BackupStateRequested,
	}
	if cluster.Spec.Sharding.Enabled && cluster.Spec.Sharding.ConfigsvrReplSet != nil {
		status.ReplsetNames = append(status.ReplsetNames, cluster.Spec.Sharding.ConfigsvrReplSet.Name)
	}
	for _, rs := range cluster.Spec.Replsets {
		status.ReplsetNames = append(status.ReplsetNames, rs.Name)
	}

	return status, nil
}

func (b *snapshotBackups) reconcileSnapshot(
	ctx context.Context,
	cl client.Client,
	pvc string,
	bcp *api.PerconaServerMongoDBBackup,
) (*volumesnapshotv1.VolumeSnapshot, error) {
	volumeSnapshot := &volumesnapshotv1.VolumeSnapshot{
		ObjectMeta: metav1.ObjectMeta{
			Name:      naming.VolumeSnapshotName(bcp, pvc),
			Namespace: bcp.GetNamespace(),
		},
	}
	if err := cl.Get(ctx, client.ObjectKeyFromObject(volumeSnapshot), volumeSnapshot); err == nil {
		return volumeSnapshot, nil
	} else if client.IgnoreNotFound(err) != nil {
		return nil, errors.Wrap(err, "get volume snapshot")
	}

	volumeSnapshot.Spec = volumesnapshotv1.VolumeSnapshotSpec{
		VolumeSnapshotClassName: bcp.Spec.VolumeSnapshotClass,
		Source: volumesnapshotv1.VolumeSnapshotSource{
			PersistentVolumeClaimName: &pvc,
		},
	}
	if err := controllerutil.SetControllerReference(bcp, volumeSnapshot, cl.Scheme()); err != nil {
		return nil, errors.Wrap(err, "set controller reference")
	}

	if err := cl.Create(ctx, volumeSnapshot); err != nil {
		return nil, errors.Wrap(err, "create volume snapshot")
	}
	return volumeSnapshot, nil
}

func (b *snapshotBackups) reconcileSnapshots(
	ctx context.Context,
	cl client.Client,
	bcp *api.PerconaServerMongoDBBackup,
	meta *backup.BackupMeta,
) (bool, []api.SnapshotInfo, error) {
	done := true
	snapshots := make([]api.SnapshotInfo, 0)

	podName := func(nodeName string) (string, error) {
		parts := strings.Split(nodeName, ".")
		if len(parts) < 1 {
			return "", errors.Errorf("unexpected node name format: %s", nodeName)
		}
		return parts[0], nil
	}

	for _, rs := range meta.Replsets {
		// do not snapshot nodes that are not yet copy ready.
		if rs.Status != defs.StatusCopyReady {
			done = false
			continue
		}

		// parse pod name from node name.
		podName, err := podName(rs.Node)
		if err != nil {
			return false, nil, errors.Wrap(err, "get pod name")
		}

		// ensure snapshot is created.
		pvcName := config.MongodDataVolClaimName + "-" + podName
		snapshot, err := b.reconcileSnapshot(ctx, cl, pvcName, bcp)
		if err != nil {
			return false, nil, errors.Wrap(err, "reconcile snapshot")
		}

		if snapshot.Status == nil || !ptr.Deref(snapshot.Status.ReadyToUse, false) {
			done = false
		}

		// If there is an error, return error.
		// Note that some errors may be transient, but the controller will retry.
		if snapshot.Status != nil && snapshot.Status.Error != nil && ptr.Deref(snapshot.Status.Error.Message, "") != "" {
			return false, nil, errors.Errorf("snapshot error: %s", ptr.Deref(snapshot.Status.Error.Message, ""))
		}
		snapshots = append(snapshots, api.SnapshotInfo{
			NodeName:     rs.Node,
			SnapshotName: snapshot.GetName(),
		})
	}
	return done, snapshots, nil
}

func (b *snapshotBackups) Status(ctx context.Context, cl client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
	status := cr.Status
	log := logf.FromContext(ctx).WithName("backupStatus").WithValues("backup", cr.Name, "pbmName", status.PBMname)

	meta, err := b.pbm.GetBackupByName(ctx, cr.Status.PBMname)
	if err != nil && !errors.Is(err, pbmErrors.ErrNotFound) {
		return status, errors.Wrap(err, "get pbm backup meta")
	}

	if meta == nil || meta.Name == "" || errors.Is(err, pbmErrors.ErrNotFound) {
		logf.FromContext(ctx).Info("Waiting for backup metadata", "pbmName", cr.Status.PBMname, "backup", cr.Name)
		return status, nil
	}

	log.V(1).Info("Got backup meta", "meta", meta)

	if meta.StartTS > 0 {
		status.StartAt = &metav1.Time{
			Time: time.Unix(meta.StartTS, 0),
		}
	}

	switch meta.Status {
	case defs.StatusError:
		status.State = api.BackupStateError
		status.Error = fmt.Sprintf("%v", meta.Error())

	case defs.StatusStarting:
		passed := time.Now().UTC().Sub(time.Unix(meta.StartTS, 0))
		timeoutSeconds := defaultPBMStartingDeadline
		if s := cluster.Spec.Backup.StartingDeadlineSeconds; s != nil && *s > 0 {
			timeoutSeconds = *s
		}
		if passed >= time.Duration(timeoutSeconds)*time.Second {
			status.State = api.BackupStateError
			status.Error = pbmStartingDeadlineErrMsg
			break
		}

		status.State = api.BackupStateRequested

	case defs.StatusDone:
		status.State = api.BackupStateReady
		status.CompletedAt = &metav1.Time{
			Time: time.Unix(meta.LastTransitionTS, 0),
		}
		status.LastWriteAt = &metav1.Time{
			Time: time.Unix(int64(meta.LastWriteTS.T), 0),
		}

	case defs.StatusCopyReady:
		status.State = api.BackupStateRunning
		snapshotsReady, snapshots, err := b.reconcileSnapshots(ctx, cl, cr, meta)
		if err != nil {
			return status, errors.Wrap(err, "reconcile snapshots")
		}
		status.Snapshots = snapshots
		if snapshotsReady {
			if err := b.pbm.FinishBackup(ctx, cr.Status.PBMname); err != nil {
				return status, errors.Wrap(err, "finish backup")
			}
		}
	}

	return status, nil
}

func (b *snapshotBackups) Complete(ctx context.Context) error {
	return nil
}
```
The new snapshot backup functionality in snapshot.go lacks test coverage. There are tests for backup_test.go but no corresponding snapshot_test.go file. Given that this is a new feature, comprehensive tests should be added to verify the snapshot creation, reconciliation, status handling, and error scenarios.
deploy/bundle.yaml
Outdated
```diff
       containers:
         - name: percona-server-mongodb-operator
-          image: perconalab/percona-server-mongodb-operator:main
+          image: asia-south1-docker.pkg.dev/cloud-dev-112233/mayankshah/psmdb-operator:K8SPSMDB-1363
```
The bundle.yaml file contains a hardcoded image reference to a personal development repository (asia-south1-docker.pkg.dev/cloud-dev-112233/mayankshah/psmdb-operator:K8SPSMDB-1363). This should be reverted to use the standard image reference before merging. Development-specific image references should not be committed to the main branch.
```go
}

func (b *pbmC) GetBackupByName(ctx context.Context, bcpName string) (*backup.BackupMeta, error) {
	return backup.NewDBManager(b.Client).GetBackupByName(ctx, bcpName)
```
GetBackupMeta and GetBackupByName are duplicate implementations that both call backup.NewDBManager(b.Client).GetBackupByName(ctx, bcpName). This is redundant. Consider removing one of them or clarifying why both are needed.
Suggested change:

```diff
-	return backup.NewDBManager(b.Client).GetBackupByName(ctx, bcpName)
+	return b.GetBackupMeta(ctx, bcpName)
```
```go
	// VolumeSnapshotClass is the name of the VolumeSnapshotClass to use for snapshot based backups.
	// This may be specified only when type is `external`.
	// +kubebuilder:validation:Optional
	VolumeSnapshotClass *string `json:"volumeSnapshotClass,omitempty"`
```
The documentation comment on lines 22-23 states that VolumeSnapshotClass "may be specified only when type is external", but there's no validation enforcing this constraint. Either add validation in the CheckFields method to ensure this is only set when Type is ExternalBackup, or add a kubebuilder validation marker to enforce this at the CRD level.
pkg/naming/naming.go
Outdated
```go
func VolumeSnapshotName(bcp *psmdbv1.PerconaServerMongoDBBackup, pvc string) string {
	return fmt.Sprintf("%s-%s", bcp.Name, pvc)
```
The VolumeSnapshotName function concatenates the backup name and PVC name without checking for Kubernetes name length limits (253 characters for most resources). If the backup name is long, this could result in invalid snapshot names. Consider adding truncation or validation to ensure the generated name doesn't exceed Kubernetes limits.
Suggested change:

```go
const maxK8sNameLen = 253

func truncateNamePart(s string, max int) string {
	if max <= 0 {
		return ""
	}
	if len(s) <= max {
		return s
	}
	return s[:max]
}

func VolumeSnapshotName(bcp *psmdbv1.PerconaServerMongoDBBackup, pvc string) string {
	name := bcp.Name
	// Ensure the final "<backup-name>-<pvc>" does not exceed Kubernetes name length limits.
	// Reserve space for the hyphen separator.
	remaining := maxK8sNameLen - 1 - len(pvc)
	if remaining < 0 {
		// PVC name alone exceeds or equals the limit; truncate PVC first.
		pvc = truncateNamePart(pvc, maxK8sNameLen)
		remaining = maxK8sNameLen - 1 - len(pvc)
	}
	if remaining < 0 {
		// In the extreme case where even "<truncated-pvc>" plus hyphen would exceed the limit,
		// drop the backup name entirely and fall back to the truncated PVC name.
		return pvc
	}
	name = truncateNamePart(name, remaining)
	return fmt.Sprintf("%s-%s", name, pvc)
}
```
| "github.com/percona/percona-backup-mongodb/pbm/ctrl" | ||
| "github.com/percona/percona-backup-mongodb/pbm/defs" | ||
| pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors" |
[goimports-reviser] reported by reviewdog 🐶
| "github.com/percona/percona-backup-mongodb/pbm/ctrl" | |
| "github.com/percona/percona-backup-mongodb/pbm/defs" | |
| pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors" |
| "sigs.k8s.io/controller-runtime/pkg/controller/controllerutil" | ||
| logf "sigs.k8s.io/controller-runtime/pkg/log" | ||
|
|
||
| api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" |
[goimports-reviser] reported by reviewdog 🐶
| api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" | |
| "github.com/percona/percona-backup-mongodb/pbm/ctrl" | |
| "github.com/percona/percona-backup-mongodb/pbm/defs" | |
| pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors" | |
| api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" |
| "github.com/percona/percona-backup-mongodb/pbm/defs" | ||
| psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" | ||
| "github.com/percona/percona-server-mongodb-operator/pkg/naming" | ||
| "github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup" | ||
| "github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config" |
[goimports-reviser] reported by reviewdog 🐶
| "github.com/percona/percona-backup-mongodb/pbm/defs" | |
| psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1" | |
| "github.com/percona/percona-server-mongodb-operator/pkg/naming" | |
| "github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup" | |
| "github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config" |
[goimports-reviser] reported by reviewdog 🐶
percona-server-mongodb-operator/cmd/manager/main.go, lines 10 to 13 in 603f322
Force-pushed 603f322 to 7230fea (Compare)
Pull request overview
Copilot reviewed 27 out of 28 changed files in this pull request and generated 9 comments.
Comments suppressed due to low confidence (1)
pkg/apis/psmdb/v1/perconaservermongodbbackup_types.go:133
- The CheckFields validation does not verify that `VolumeSnapshotClass` is specified when `Type` is `external`. This could allow users to create external backup resources without specifying the required VolumeSnapshotClass, leading to failures later when the operator tries to create snapshots. Consider adding validation to require VolumeSnapshotClass when Type is ExternalBackup.
```go
func (p *PerconaServerMongoDBBackup) CheckFields() error {
	if len(p.Spec.StorageName) == 0 && p.Spec.Type != defs.ExternalBackup {
		return fmt.Errorf("spec storageName field is empty")
	}
	if len(p.Spec.GetClusterName()) == 0 {
		return fmt.Errorf("spec clusterName is empty")
	}
	if string(p.Spec.Type) == "" {
		p.Spec.Type = defs.LogicalBackup
	}
	if string(p.Spec.Compression) == "" {
		p.Spec.Compression = compress.CompressionTypeGZIP
	}
	return nil
}
```
```go
	if bcp.Spec.Type == defs.ExternalBackup {
		// TODO: should we check that snapshots exist?
		return nil
```
For external (snapshot-based) backups, the validation skips checking if snapshots exist (as noted in the TODO). This could allow restore operations to proceed when the required VolumeSnapshots are missing or not ready, which would cause the restore to fail later in the process. Consider implementing a validation check to verify that all required snapshots exist and are in a ready state before allowing the restore to proceed.
```go
	orig := sfs.DeepCopy()

	// Scale down the statefulset.
```
The comment says "Scale down the statefulset" but the function is named scaleUpStatefulSetsForSnapshotRestore and is scaling UP to the desired replicas. The comment should say "Scale up the statefulset" to accurately reflect what the code is doing.
Suggested change:

```diff
-	// Scale down the statefulset.
+	// Scale up the statefulset.
```
pkg/controller/perconaservermongodbbackup/snapshot.go (updated revision)

```go
package perconaservermongodbbackup

import (
	"context"
	"fmt"
	"strings"
	"time"

	volumesnapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
	"github.com/percona/percona-backup-mongodb/pbm/ctrl"
	"github.com/percona/percona-backup-mongodb/pbm/defs"
	pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"
	"github.com/pkg/errors"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/utils/ptr"
	"sigs.k8s.io/controller-runtime/pkg/client"
	"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
	logf "sigs.k8s.io/controller-runtime/pkg/log"

	api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
	"github.com/percona/percona-server-mongodb-operator/pkg/naming"
	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"
)

type snapshotBackups struct {
	pbm  backup.PBM
	spec api.BackupSpec
}

func (r *ReconcilePerconaServerMongoDBBackup) newSnapshotBackups(ctx context.Context, cluster *api.PerconaServerMongoDB) (*snapshotBackups, error) {
	if cluster == nil {
		return &snapshotBackups{}, nil
	}
	cn, err := r.newPBMFunc(ctx, r.client, cluster)
	if err != nil {
		return nil, errors.Wrap(err, "reate pbm object")
	}

	return &snapshotBackups{pbm: cn, spec: cluster.Spec.Backup}, nil
}

func (b *snapshotBackups) PBM() backup.PBM {
	return b.pbm
}

func (b *snapshotBackups) Start(ctx context.Context, k8sclient client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
	log := logf.FromContext(ctx).WithValues("backup", cr.Name)

	log.Info("Starting snapshot backup")

	var status api.PerconaServerMongoDBBackupStatus

	name := time.Now().UTC().Format(time.RFC3339)
	cmd := ctrl.Cmd{
		Cmd: ctrl.CmdBackup,
		Backup: &ctrl.BackupCmd{
			Name: name,
			Type: defs.ExternalBackup,
		},
	}

	log.Info("Sending backup command", "backupCmd", cmd)

	if err := b.pbm.SendCmd(ctx, cmd); err != nil {
		return status, err
	}
	status.State = api.BackupStateRequested

	status = api.PerconaServerMongoDBBackupStatus{
		PBMname: name,
		LastTransition: &metav1.Time{
			Time: time.Unix(time.Now().Unix(), 0),
		},
		State: api.BackupStateRequested,
	}
	if cluster.Spec.Sharding.Enabled && cluster.Spec.Sharding.ConfigsvrReplSet != nil {
		status.ReplsetNames = append(status.ReplsetNames, cluster.Spec.Sharding.ConfigsvrReplSet.Name)
	}
	for _, rs := range cluster.Spec.Replsets {
		status.ReplsetNames = append(status.ReplsetNames, rs.Name)
	}

	return status, nil
}

func (b *snapshotBackups) reconcileSnapshot(
	ctx context.Context,
	cl client.Client,
	rsName string,
	pvc string,
	bcp *api.PerconaServerMongoDBBackup,
) (*volumesnapshotv1.VolumeSnapshot, error) {
	volumeSnapshot := &volumesnapshotv1.VolumeSnapshot{
		ObjectMeta: metav1.ObjectMeta{
			Name:      naming.VolumeSnapshotName(bcp, rsName),
			Namespace: bcp.GetNamespace(),
		},
	}
	if err := cl.Get(ctx, client.ObjectKeyFromObject(volumeSnapshot), volumeSnapshot); err == nil {
		return volumeSnapshot, nil
	} else if client.IgnoreNotFound(err) != nil {
		return nil, errors.Wrap(err, "get volume snapshot")
	}

	volumeSnapshot.Spec = volumesnapshotv1.VolumeSnapshotSpec{
		VolumeSnapshotClassName: bcp.Spec.VolumeSnapshotClass,
		Source: volumesnapshotv1.VolumeSnapshotSource{
			PersistentVolumeClaimName: &pvc,
		},
	}
	if err := controllerutil.SetControllerReference(bcp, volumeSnapshot, cl.Scheme()); err != nil {
		return nil, errors.Wrap(err, "set controller reference")
	}

	if err := cl.Create(ctx, volumeSnapshot); err != nil {
		return nil, errors.Wrap(err, "create volume snapshot")
	}
	return volumeSnapshot, nil
}

func (b *snapshotBackups) reconcileSnapshots(
	ctx context.Context,
	cl client.Client,
	bcp *api.PerconaServerMongoDBBackup,
	meta *backup.BackupMeta,
) (bool, []api.SnapshotInfo, error) {
	done := true
	snapshots := make([]api.SnapshotInfo, 0)

	podName := func(nodeName string) (string, error) {
		parts := strings.Split(nodeName, ".")
		if len(parts) < 1 {
			return "", errors.Errorf("unexpected node name format: %s", nodeName)
		}
		return parts[0], nil
	}

	for _, rs := range meta.Replsets {
		// do not snapshot nodes that are not yet copy ready.
		if rs.Status != defs.StatusCopyReady {
			done = false
			continue
		}

		// parse pod name from node name.
		podName, err := podName(rs.Node)
		if err != nil {
			return false, nil, errors.Wrap(err, "get pod name")
		}

		// ensure snapshot is created.
		pvcName := config.MongodDataVolClaimName + "-" + podName
		snapshot, err := b.reconcileSnapshot(ctx, cl, rs.Name, pvcName, bcp)
		if err != nil {
			return false, nil, errors.Wrap(err, "reconcile snapshot")
		}

		if snapshot.Status == nil || !ptr.Deref(snapshot.Status.ReadyToUse, false) {
			done = false
		}

		// If there is an error, return error.
		// Note that some errors may be transient, but the controller will retry.
		if snapshot.Status != nil && snapshot.Status.Error != nil && ptr.Deref(snapshot.Status.Error.Message, "") != "" {
			return false, nil, errors.Errorf("snapshot error: %s", ptr.Deref(snapshot.Status.Error.Message, ""))
		}
		snapshots = append(snapshots, api.SnapshotInfo{
			ReplsetName:  rs.Name,
			SnapshotName: snapshot.GetName(),
		})
	}
	return done, snapshots, nil
}

func (b *snapshotBackups) Status(ctx context.Context, cl client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
	status := cr.Status
	log := logf.FromContext(ctx).WithName("backupStatus").WithValues("backup", cr.Name, "pbmName", status.PBMname)

	meta, err := b.pbm.GetBackupByName(ctx, cr.Status.PBMname)
	if err != nil && !errors.Is(err, pbmErrors.ErrNotFound) {
		return status, errors.Wrap(err, "get pbm backup meta")
	}

	if meta == nil || meta.Name == "" || errors.Is(err, pbmErrors.ErrNotFound) {
		logf.FromContext(ctx).Info("Waiting for backup metadata", "pbmName", cr.Status.PBMname, "backup", cr.Name)
		return status, nil
	}

	log.V(1).Info("Got backup meta", "meta", meta)

	if meta.StartTS > 0 {
		status.StartAt = &metav1.Time{
			Time: time.Unix(meta.StartTS, 0),
		}
	}

	switch meta.Status {
	case defs.StatusError:
		status.State = api.BackupStateError
		status.Error = fmt.Sprintf("%v", meta.Error())

	case defs.StatusStarting:
		passed := time.Now().UTC().Sub(time.Unix(meta.StartTS, 0))
		timeoutSeconds := defaultPBMStartingDeadline
		if s := cluster.Spec.Backup.StartingDeadlineSeconds; s != nil && *s > 0 {
			timeoutSeconds = *s
		}
		if passed >= time.Duration(timeoutSeconds)*time.Second {
			status.State = api.BackupStateError
			status.Error = pbmStartingDeadlineErrMsg
			break
		}

		status.State = api.BackupStateRequested

	case defs.StatusDone:
		status.State = api.BackupStateReady
		status.CompletedAt = &metav1.Time{
			Time: time.Unix(meta.LastTransitionTS, 0),
		}
		status.LastWriteAt = &metav1.Time{
			Time: time.Unix(int64(meta.LastWriteTS.T), 0),
		}

	case defs.StatusCopyReady:
		status.State = api.BackupStateRunning
		snapshotsReady, snapshots, err := b.reconcileSnapshots(ctx, cl, cr, meta)
		if err != nil {
			return status, errors.Wrap(err, "reconcile snapshots")
		}
		status.Snapshots = snapshots
		if snapshotsReady {
			if err := b.pbm.FinishBackup(ctx, cr.Status.PBMname); err != nil {
				return status, errors.Wrap(err, "finish backup")
			}
		}
	}

	status.LastTransition = &metav1.Time{
		Time: time.Unix(meta.LastTransitionTS, 0),
	}
	status.Type = cr.Spec.Type
	return status, nil
}

func (b *snapshotBackups) Complete(ctx context.Context) error {
	return nil
}
```
The new snapshot backup and restore functionality (approximately 800 lines of code across snapshot.go and snapshots.go) does not have any unit tests. This is a significant gap in test coverage for a critical feature. Consider adding unit tests to cover:
- Snapshot creation and reconciliation logic
- PVC recreation from snapshots
- StatefulSet scaling operations
- Error handling scenarios
- Edge cases like missing snapshots or failed operations
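One low-cost starting point is a table-driven test over the PBM-status-to-CR-state mapping implemented by the switch above. The sketch below uses simplified stand-in types and hypothetical names (`pbmStatus`, `backupState`, `mapState`), not the operator's real `defs.Status` / `api.BackupState` values, so it only illustrates the test shape:

```go
package main

import "fmt"

// Simplified stand-ins for the PBM/operator types (hypothetical names).
type pbmStatus string
type backupState string

const (
	statusError     pbmStatus = "error"
	statusStarting  pbmStatus = "starting"
	statusDone      pbmStatus = "done"
	statusCopyReady pbmStatus = "copyReady"

	stateError     backupState = "Error"
	stateRequested backupState = "Requested"
	stateReady     backupState = "Ready"
	stateRunning   backupState = "Running"
)

// mapState mirrors the shape of the switch in updateStatus:
// each PBM status maps to exactly one CR backup state.
func mapState(s pbmStatus) backupState {
	switch s {
	case statusError:
		return stateError
	case statusStarting:
		return stateRequested
	case statusDone:
		return stateReady
	case statusCopyReady:
		return stateRunning
	}
	return ""
}

func main() {
	// Table-driven check covering every transition the controller handles.
	cases := []struct {
		in   pbmStatus
		want backupState
	}{
		{statusError, stateError},
		{statusStarting, stateRequested},
		{statusDone, stateReady},
		{statusCopyReady, stateRunning},
	}
	for _, c := range cases {
		if got := mapState(c.in); got != c.want {
			panic(fmt.Sprintf("mapState(%s) = %s, want %s", c.in, got, c.want))
		}
	}
	fmt.Println("all transitions ok")
}
```

The same table can grow rows for the starting-deadline timeout and snapshot-reconcile branches once those are factored behind testable seams.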
```yaml
    - get
    - list
    - watch
    - create
```
The RBAC permissions for VolumeSnapshots are missing delete and update verbs. While the operator creates snapshots during backup, it may also need to clean them up when backups are deleted (as part of the finalizer logic), and potentially update snapshot metadata. Consider adding delete permission at minimum for proper cleanup. Review whether update and patch permissions are also needed for snapshot management.
```diff
   - create
+  - update
+  - patch
+  - delete
```
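For reference, a rule granting the full VolumeSnapshot lifecycle might look like the sketch below. The group and resource names follow the upstream `snapshot.storage.k8s.io` CSI snapshot API; whether `update`/`patch` are actually required depends on how the operator mutates snapshots, so treat the verb list as a superset to trim:

```yaml
- apiGroups:
    - snapshot.storage.k8s.io
  resources:
    - volumesnapshots
  verbs:
    - get
    - list
    - watch
    - create
    - update
    - patch
    - delete
```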
```go
	// TODO
	// Delete all statefulsets.
	// Resync PBM storage.

	return restore.Status, nil
```
The restore process never transitions to a final state (like RestoreStateReady or RestoreStateDone). After running runPBMRestoreFinish, the status remains in RestoreStateRunning state indefinitely. This means users won't know when the restore is complete, and the restore object will keep being reconciled. The function should update the status to a final state (e.g., RestoreStateReady) after successful completion.
```diff
-	// TODO
-	// Delete all statefulsets.
-	// Resync PBM storage.
-	return restore.Status, nil
+	// At this point, all steps of the snapshot restore have completed successfully.
+	// Transition the restore to a final state so it is no longer reconciled as running.
+	status.State = psmdbv1.RestoreStateReady
+	// TODO
+	// Delete all statefulsets.
+	// Resync PBM storage.
+	return status, nil
```
```go
		Name: snapshotName,
	}
	pvc.SetAnnotations(map[string]string{
		naming.AnnotationRestoreName: snapshotName,
```
The annotation value is set to the snapshot name instead of the restore name. This is inconsistent with the check on line 423, where it compares restoreName == restore.Name. This means that if the snapshot name doesn't match the restore name, the PVC will be deleted and recreated unnecessarily. Consider using restore.Name here to ensure consistency.
```go
	// TODO
	// Delete all statefulsets.
	// Resync PBM storage.
```
The TODO comment indicates incomplete functionality - deleting all statefulsets and resyncing PBM storage. Without this cleanup, the restore process may leave resources in an inconsistent state. This is critical for the snapshot restore feature to work correctly. Please complete this implementation or create a follow-up task to address it.
```diff
-	// TODO
-	// Delete all statefulsets.
-	// Resync PBM storage.
+	// FOLLOW-UP (snapshot restore cleanup):
+	// Deleting all statefulsets created for snapshot restore and resyncing PBM storage
+	// after a successful PBM restore is not yet implemented here.
+	// Tracking issue: PSMDB-XXXX (implement post-snapshot-restore cleanup and PBM resync).
```
```go
	}
	cn, err := r.newPBMFunc(ctx, r.client, cluster)
	if err != nil {
		return nil, errors.Wrap(err, "reate pbm object")
```
Typo in error message: "reate" should be "create".
```diff
-		return nil, errors.Wrap(err, "reate pbm object")
+		return nil, errors.Wrap(err, "create pbm object")
```
```go
	sfs.Spec.Template.Spec.Containers[0].Command = []string{"/opt/percona/pbm-agent"}
	sfs.Spec.Template.Spec.Containers[0].Args = []string{
		"restore-finish",
		restore.Status.PBMname,
		"-c", "/etc/pbm/pbm_config.yaml",
		"--rs", "$(MONGODB_REPLSET)",
		"--node", "$(POD_NAME).$(SERVICE_NAME)-$(MONGODB_REPLSET).$(NAMESPACE).svc.cluster.local",
		// "--db-config", "/etc/pbm/db-config.yaml", // TODO
	}
```
Potential index out of bounds error. The code assumes that sfs.Spec.Template.Spec.Containers[0] exists, but there's no check to verify that the Containers slice has at least one element. If the StatefulSet has no containers (which would be unusual but possible in error scenarios), this will panic. Consider adding a length check or finding the container by name instead of assuming index 0.
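Looking up the container by name instead of position makes the mutation safe and self-documenting. A minimal sketch with a stand-in `container` type (the real code would operate on `corev1.Container`; `findContainer` is a hypothetical helper name):

```go
package main

import "fmt"

// Stand-in for corev1.Container; only the fields mutated here.
type container struct {
	Name    string
	Command []string
	Args    []string
}

// findContainer returns a pointer into the slice for the named container,
// or nil if it is absent, so callers can mutate in place without assuming index 0.
func findContainer(containers []container, name string) *container {
	for i := range containers {
		if containers[i].Name == name {
			return &containers[i]
		}
	}
	return nil
}

func main() {
	spec := []container{{Name: "mongod"}, {Name: "backup-agent"}}
	c := findContainer(spec, "mongod")
	if c == nil {
		// Explicit, wrappable error path instead of an index-out-of-range panic.
		panic("mongod container not found in statefulset template")
	}
	c.Command = []string{"/opt/percona/pbm-agent"}
	fmt.Println(spec[0].Command[0]) // /opt/percona/pbm-agent
}
```

Because the helper returns a pointer into the backing array, the StatefulSet template itself is updated, matching what the index-based code intends.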
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
```go
	"k8s.io/utils/ptr"
	"sigs.k8s.io/controller-runtime/pkg/client"
	logf "sigs.k8s.io/controller-runtime/pkg/log"
)
```
[goimports-reviser] reported by reviewdog 🐶
```diff
-)
+	"github.com/percona/percona-backup-mongodb/pbm/defs"
+
+	psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
+	"github.com/percona/percona-server-mongodb-operator/pkg/naming"
+	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
+	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"
+)
```
Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Force-pushed d66f69d to 9713dbc
Pull request overview
Copilot reviewed 29 out of 30 changed files in this pull request and generated 10 comments.
```go
	err := r.clientcmd.Exec(ctx, &pod, "mongod", restoreFinishCmd, nil, stdoutBuf, stderrBuf, false)
	if err != nil {
		log.Error(nil, "Failed to finish restore", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
```
Same issue as above: log.Error is called with a nil error even though err is available. Pass err so the log entry includes the actual failure.
```diff
-		log.Error(nil, "Failed to finish restore", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
+		log.Error(err, "Failed to finish restore", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
```
```yaml
  AWS_ACCESS_KEY_ID: "minioadmin"
  AWS_SECRET_ACCESS_KEY: "minioadmin"
```
This manifest introduces a Secret with hardcoded credentials (minioadmin/minioadmin). Shipping real-looking default credentials in a deploy example is risky because it can be applied as-is in non-dev clusters. Consider removing this Secret from deploy/cr.yaml, commenting it out, or replacing values with obvious placeholders and documentation comments.
```diff
-  AWS_ACCESS_KEY_ID: "minioadmin"
-  AWS_SECRET_ACCESS_KEY: "minioadmin"
+  # WARNING: Replace these placeholder values with your own MinIO credentials before applying.
+  AWS_ACCESS_KEY_ID: "<YOUR-MINIO-ACCESS-KEY>"
+  AWS_SECRET_ACCESS_KEY: "<YOUR-MINIO-SECRET-KEY>"
```
```go
	podName := func(nodeName string) (string, error) {
		parts := strings.Split(nodeName, ".")
		if len(parts) < 1 {
			return "", errors.Errorf("unexpected node name format: %s", nodeName)
		}
		return parts[0], nil
	}

	for _, rs := range meta.Replsets {
		// do not snapshot nodes that are not yet copy ready.
		if rs.Status != defs.StatusCopyReady {
			done = false
			continue
		}

		// parse pod name from node name.
		podName, err := podName(rs.Node)
		if err != nil {
			return false, nil, errors.Wrap(err, "get pod name")
		}
```
podName is used both as a helper function and then as a local variable (podName, err := podName(...)), which is legal but hard to read. Renaming the helper (e.g. parsePodName) avoids shadowing and improves clarity.
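A renamed top-level helper also makes the edge case directly testable. One detail worth noting: `strings.Split` with a non-empty separator always returns at least one element, so the `len(parts) < 1` guard can never fire; checking that the first label is non-empty is the meaningful validation. A sketch (`parsePodName` is a suggested name, and the sample node name is illustrative):

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// parsePodName extracts the pod name (the first DNS label) from a PBM node
// name like "rs0-0.my-cluster-rs0.ns.svc.cluster.local".
// strings.Split always returns at least one element, so the useful check is
// that the first label is non-empty, not that the slice is non-empty.
func parsePodName(nodeName string) (string, error) {
	parts := strings.Split(nodeName, ".")
	if parts[0] == "" {
		return "", errors.New("unexpected node name format: " + nodeName)
	}
	return parts[0], nil
}

func main() {
	name, err := parsePodName("rs0-0.my-cluster-rs0.ns.svc.cluster.local")
	if err != nil {
		panic(err)
	}
	fmt.Println(name) // rs0-0

	if _, err := parsePodName(".svc.cluster.local"); err == nil {
		panic("expected error for empty pod name")
	}
}
```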
```go
		return rr, err
	}
	defer bcp.Close(ctx)
	defer bcp.PBM().Close(ctx)
```
defer bcp.PBM().Close(ctx) can panic when bcp.PBM() is nil (e.g., when the referenced cluster is not found and newManagedBackups/newSnapshotBackups return an executor with pbm=nil). The old Backup.Close() handled nil safely. Guard the defer with a nil check (or reintroduce a Close method on the executor interface) to avoid nil-interface method calls.
```diff
-	defer bcp.PBM().Close(ctx)
+	pbmCli := bcp.PBM()
+	if pbmCli != nil {
+		defer pbmCli.Close(ctx)
+	}
```
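The hazard is easy to reproduce in isolation: deferring a method call on a nil interface value panics at runtime. The sketch below uses simplified stand-ins (`closer`, `executor`, `pbmClient` are hypothetical, not the operator's real interfaces) to show the guarded pattern:

```go
package main

import "fmt"

// closer mimics the minimal PBM client surface the controller defers on.
type closer interface {
	Close()
}

type pbmClient struct{}

func (p *pbmClient) Close() { fmt.Println("closed") }

// executor mimics backupExecutor: PBM() may return a nil interface when the
// referenced cluster was not found and no client was ever created.
type executor struct{ pbm closer }

func (e *executor) PBM() closer { return e.pbm }

func main() {
	broken := &executor{pbm: nil}
	// Guarded pattern: fetch the value once, defer only when non-nil.
	// An unguarded `defer broken.PBM().Close()` would panic with a
	// nil-pointer dereference before the function even returns an error.
	if c := broken.PBM(); c != nil {
		defer c.Close()
	}

	ok := &executor{pbm: &pbmClient{}}
	if c := ok.PBM(); c != nil {
		defer c.Close() // runs at exit for the healthy executor
	}
	fmt.Println("reconcile finished safely")
}
```

Reintroducing a nil-safe `Close()` on the executor interface itself would centralize this check instead of repeating it at every call site.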
```diff
-	if len(p.Spec.StorageName) == 0 {
+	if len(p.Spec.StorageName) == 0 && p.Spec.Type != defs.ExternalBackup {
 		return fmt.Errorf("spec storageName field is empty")
 	}
```
CheckFields() now allows spec.storageName to be empty whenever spec.type == external, but the controller falls back to managed backups when volumeSnapshotClass isn’t set and managedBackups.Start() will then fail with “unable to get storage ''”. Consider tightening validation here (e.g., require volumeSnapshotClass for external backups, or require storageName when volumeSnapshotClass is empty) so invalid specs fail fast with a clear error.
```diff
 	}
+	if p.Spec.Type == defs.ExternalBackup {
+		if (p.Spec.VolumeSnapshotClass == nil || *p.Spec.VolumeSnapshotClass == "") && len(p.Spec.StorageName) == 0 {
+			return fmt.Errorf("spec volumeSnapshotClass or storageName must be set for external backups")
+		}
+	}
```
```go
		AccessKeyID:     storage.MaskedString(accessKey),
		SecretAccessKey: storage.MaskedString(secretAccessKey),
```
In MinIO credentials, accessKey/secretAccessKey are []byte from Secret data, but storage.MaskedString is used elsewhere with a string input. This likely won’t compile (or will produce an unintended value) unless you convert the bytes to string before wrapping.
```diff
-		AccessKeyID:     storage.MaskedString(accessKey),
-		SecretAccessKey: storage.MaskedString(secretAccessKey),
+		AccessKeyID:     storage.MaskedString(string(accessKey)),
+		SecretAccessKey: storage.MaskedString(string(secretAccessKey)),
```
|
|
```go
func (cr *PerconaServerMongoDB) GetAllReplsets() []*ReplsetSpec {
	replsets := cr.Spec.Replsets
	if cr.Spec.Sharding.Enabled {
```
GetAllReplsets() appends cr.Spec.Sharding.ConfigsvrReplSet whenever sharding is enabled, without a nil check. Several other call sites in the codebase guard ConfigsvrReplSet != nil, and this method can return a slice containing nil if invoked before defaults/validation, leading to panics when iterating. Consider checking ConfigsvrReplSet != nil (or returning an empty slice / error) to make this helper safe.
```diff
-	if cr.Spec.Sharding.Enabled {
+	if cr.Spec.Sharding.Enabled && cr.Spec.Sharding.ConfigsvrReplSet != nil {
```
|
|
```go
	err := r.clientcmd.Exec(ctx, &pod, "mongod", restoreCmd, nil, stdoutBuf, stderrBuf, false)
	if err != nil {
		log.Error(nil, "Restore failed to start", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
```
The logger call passes nil as the error value even though err is available. This drops the underlying error from structured logs and makes debugging harder; pass err instead of nil.
```diff
-		log.Error(nil, "Restore failed to start", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
+		log.Error(err, "Restore failed to start", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
```
|
|
```go
	orig := sfs.DeepCopy()

	// Scale down the statefulset.
```
Comment says “Scale down the statefulset” but this code scales replicas up to replicas. Update the comment to avoid confusion during future maintenance/debugging.
```diff
-	// Scale down the statefulset.
+	// Scale the statefulset to the desired number of replicas.
```
```diff
 	var bcp backupExecutor
 	if err = retry.OnError(defaultBackoff, func(err error) bool { return err != nil }, func() error {
 		var err error
-		bcp, err = r.newBackup(ctx, cluster)
-		if err != nil {
-			return errors.Wrap(err, "create backup object")
+		switch {
+		case cr.Spec.Type == defs.ExternalBackup &&
+			cr.Spec.VolumeSnapshotClass != nil && *cr.Spec.VolumeSnapshotClass != "":
+			bcp, err = r.newSnapshotBackups(ctx, cluster)
+			if err != nil {
+				return errors.Wrap(err, "create snapshot backup object")
+			}
+		default:
+			bcp, err = r.newManagedBackups(ctx, cluster)
+			if err != nil {
+				return errors.Wrap(err, "create backup object")
+			}
 		}
```
New control-flow selects a different backup executor for type=external + volumeSnapshotClass. There are existing unit tests in this package (e.g. backup_test.go) but none exercise this new snapshot-based path. Adding unit tests that cover executor selection and snapshot backup status transitions would help prevent regressions.
commit: 091270b
CHANGE DESCRIPTION

Adds support for backups using Kubernetes VolumeSnapshots.

CHECKLIST

Jira
- Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?

Tests
- Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability