
K8SPSMDB-1363: Snapshot-based backups #2247

Draft
mayankshah1607 wants to merge 15 commits into main from K8SPSMDB-1363

Conversation


@mayankshah1607 mayankshah1607 commented Feb 18, 2026

CHANGE DESCRIPTION

Adds support for backups using Kubernetes VolumeSnapshots.

CHECKLIST

Jira

  • Is the Jira ticket created and referenced properly?
  • Does the Jira ticket have the proper statuses for documentation (Needs Doc) and QA (Needs QA)?
  • Does the Jira ticket link to the proper milestone (Fix Version field)?

Tests

  • Is an E2E test/test case added for the new feature/change?
  • Are unit tests added where appropriate?
  • Are OpenShift compare files changed for E2E tests (compare/*-oc.yml)?

Config/Logging/Testability

  • Are all needed new/changed options added to default YAML files?
  • Are all needed new/changed options added to the Helm Chart?
  • Did we add proper logging messages for operator actions?
  • Did we ensure compatibility with the previous version or cluster upgrade process?
  • Does the change support oldest and newest supported MongoDB version?
  • Does the change support oldest and newest supported Kubernetes version?

Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Copilot AI review requested due to automatic review settings February 18, 2026 11:13
@pull-request-size pull-request-size bot added the size/XL 500-999 lines label Feb 18, 2026

Copilot AI left a comment


Pull request overview

This PR implements snapshot-based backups for Percona Server for MongoDB, adding support for Kubernetes VolumeSnapshots as an alternative backup mechanism. The implementation introduces a new backup executor interface to handle different backup types (managed vs snapshot-based), integrates with the Kubernetes CSI snapshot API, and updates the CRD to support the new external backup type with volume snapshot configuration.

Changes:

  • Introduces backupExecutor interface to support multiple backup implementations (managed and snapshot-based)
  • Adds VolumeSnapshot support for external backups via new snapshot.go controller
  • Updates API types to include VolumeSnapshotClass field and SnapshotInfo status

Reviewed changes

Copilot reviewed 20 out of 21 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
pkg/apis/psmdb/v1/perconaservermongodbbackup_types.go Adds external backup type, VolumeSnapshotClass field, and SnapshotInfo struct to track volume snapshots
pkg/controller/perconaservermongodbbackup/snapshot.go New file implementing snapshot-based backup logic including VolumeSnapshot creation and reconciliation
pkg/controller/perconaservermongodbbackup/backup.go Refactors existing backup logic into managedBackups type and introduces backupExecutor interface
pkg/controller/perconaservermongodbbackup/psmdb_backup_controller.go Updates controller to select backup executor based on backup type and configuration
pkg/psmdb/backup/pbm.go Adds GetBackupByName and FinishBackup methods, wraps credentials with MaskedString for security
pkg/naming/naming.go Adds VolumeSnapshotName function to generate snapshot resource names
deploy/rbac.yaml, deploy/cw-rbac.yaml Grants operator permissions to create and manage VolumeSnapshot resources
config/crd/bases/.yaml, deploy/.yaml Updates CRD definitions to include new backup type and snapshot fields
cmd/manager/main.go Registers VolumeSnapshot v1 API scheme
go.mod, go.sum Updates Go version and PBM dependency version, adds kubernetes-csi/external-snapshotter client
deploy/bundle.yaml Contains deployment configuration with modified operator image reference


}
cn, err := r.newPBMFunc(ctx, r.client, cluster)
if err != nil {
return nil, errors.Wrap(err, "reate pbm object")

Copilot AI Feb 18, 2026


There's a typo in the error message. It says "reate" instead of "create".

Suggested change
return nil, errors.Wrap(err, "reate pbm object")
return nil, errors.Wrap(err, "create pbm object")

Comment on lines +68 to +76
status.State = api.BackupStateRequested

status = api.PerconaServerMongoDBBackupStatus{
PBMname: name,
LastTransition: &metav1.Time{
Time: time.Unix(time.Now().Unix(), 0),
},
State: api.BackupStateRequested,
}

Copilot AI Feb 18, 2026


The status is assigned twice unnecessarily. Line 68 sets status.State, then lines 70-76 create a new status struct with the same information. The first assignment on line 68 is redundant and should be removed.

Comment on lines 72 to 75
type SnapshotInfo struct {
NodeName string `json:"nodeName,omitempty"`
SnapshotName string `json:"snapshotName,omitempty"`
}

Copilot AI Feb 18, 2026


The SnapshotInfo struct lacks documentation. Add a comment describing what this struct represents (e.g., "SnapshotInfo contains information about a volume snapshot created for a MongoDB node during an external backup").


podName := func(nodeName string) (string, error) {
parts := strings.Split(nodeName, ".")
if len(parts) < 1 {

Copilot AI Feb 18, 2026


The condition len(parts) < 1 on line 132 is always false because strings.Split always returns at least one element (an empty string when the input is empty). Replace the check with len(parts) < 2, or verify that the first element is non-empty, so the node name format is actually validated.

Suggested change
if len(parts) < 1 {
if parts[0] == "" {
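The strings.Split behavior the comment relies on is easy to verify; a minimal standalone sketch (not part of the PR, node names here are illustrative):

package main

import (
	"fmt"
	"strings"
)

// splitLen shows that strings.Split with a non-empty separator never returns
// an empty slice: even an empty input yields []string{""}.
func splitLen(s string) int {
	return len(strings.Split(s, "."))
}

func main() {
	fmt.Println(splitLen(""))                    // 1, not 0 — so len(parts) < 1 is never true
	fmt.Println(splitLen("rs0-0.cluster.svc"))   // 3
	fmt.Println(strings.Split("", ".")[0] == "") // true — hence the parts[0] == "" check
}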

Comment on lines 109 to 112
func (p *PerconaServerMongoDBBackup) CheckFields() error {
if len(p.Spec.StorageName) == 0 {
if len(p.Spec.StorageName) == 0 && p.Spec.Type != defs.ExternalBackup {
return fmt.Errorf("spec storageName field is empty")
}

Copilot AI Feb 18, 2026


The CheckFields method allows external backups without a VolumeSnapshotClass, but the controller only creates snapshot backups when both Type is ExternalBackup AND VolumeSnapshotClass is set. This could lead to a confusing scenario where a user creates an external backup without a VolumeSnapshotClass, and it falls through to the default managed backup path. Consider adding validation to ensure that if Type is ExternalBackup, VolumeSnapshotClass must be specified, or document this behavior clearly.
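The suggested validation could look like the sketch below. The types and names are hypothetical, trimmed-down stand-ins for the real CRD fields in pkg/apis/psmdb/v1; this only illustrates the rule "external backups require a VolumeSnapshotClass, everything else requires a storage name":

package main

import (
	"errors"
	"fmt"
)

// backupSpec is a hypothetical minimal mirror of the fields involved.
type backupSpec struct {
	Type                string
	StorageName         string
	VolumeSnapshotClass *string
}

const externalBackup = "external"

// checkFields sketches the proposed validation.
func checkFields(s backupSpec) error {
	if s.Type == externalBackup {
		if s.VolumeSnapshotClass == nil || *s.VolumeSnapshotClass == "" {
			return errors.New("volumeSnapshotClass must be set when type is external")
		}
		return nil
	}
	if s.StorageName == "" {
		return errors.New("spec storageName field is empty")
	}
	return nil
}

func main() {
	class := "csi-snapclass"
	fmt.Println(checkFields(backupSpec{Type: externalBackup}))                              // rejected
	fmt.Println(checkFields(backupSpec{Type: externalBackup, VolumeSnapshotClass: &class})) // <nil>
	fmt.Println(checkFields(backupSpec{Type: "logical", StorageName: "s3-us-west"}))        // <nil>
}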

Comment on lines 1 to 244
package perconaservermongodbbackup

import (
"context"
"fmt"
"strings"
"time"

volumesnapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
"github.com/percona/percona-backup-mongodb/pbm/ctrl"
"github.com/percona/percona-backup-mongodb/pbm/defs"
pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"
"github.com/pkg/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/utils/ptr"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
logf "sigs.k8s.io/controller-runtime/pkg/log"

api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
"github.com/percona/percona-server-mongodb-operator/pkg/naming"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"
)

type snapshotBackups struct {
pbm backup.PBM
spec api.BackupSpec
}

func (r *ReconcilePerconaServerMongoDBBackup) newSnapshotBackups(ctx context.Context, cluster *api.PerconaServerMongoDB) (*snapshotBackups, error) {
if cluster == nil {
return &snapshotBackups{}, nil
}
cn, err := r.newPBMFunc(ctx, r.client, cluster)
if err != nil {
return nil, errors.Wrap(err, "reate pbm object")
}

return &snapshotBackups{pbm: cn, spec: cluster.Spec.Backup}, nil
}

func (b *snapshotBackups) PBM() backup.PBM {
return b.pbm
}

func (b *snapshotBackups) Start(ctx context.Context, k8sclient client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
log := logf.FromContext(ctx).WithValues("backup", cr.Name)

log.Info("Starting snapshot backup")

var status api.PerconaServerMongoDBBackupStatus

name := time.Now().UTC().Format(time.RFC3339)
cmd := ctrl.Cmd{
Cmd: ctrl.CmdBackup,
Backup: &ctrl.BackupCmd{
Name: name,
Type: defs.ExternalBackup,
},
}

log.Info("Sending backup command", "backupCmd", cmd)

if err := b.pbm.SendCmd(ctx, cmd); err != nil {
return status, err
}
status.State = api.BackupStateRequested

status = api.PerconaServerMongoDBBackupStatus{
PBMname: name,
LastTransition: &metav1.Time{
Time: time.Unix(time.Now().Unix(), 0),
},
State: api.BackupStateRequested,
}
if cluster.Spec.Sharding.Enabled && cluster.Spec.Sharding.ConfigsvrReplSet != nil {
status.ReplsetNames = append(status.ReplsetNames, cluster.Spec.Sharding.ConfigsvrReplSet.Name)
}
for _, rs := range cluster.Spec.Replsets {
status.ReplsetNames = append(status.ReplsetNames, rs.Name)
}

return status, nil
}

func (b *snapshotBackups) reconcileSnapshot(
ctx context.Context,
cl client.Client,
pvc string,
bcp *api.PerconaServerMongoDBBackup,
) (*volumesnapshotv1.VolumeSnapshot, error) {
volumeSnapshot := &volumesnapshotv1.VolumeSnapshot{
ObjectMeta: metav1.ObjectMeta{
Name: naming.VolumeSnapshotName(bcp, pvc),
Namespace: bcp.GetNamespace(),
},
}
if err := cl.Get(ctx, client.ObjectKeyFromObject(volumeSnapshot), volumeSnapshot); err == nil {
return volumeSnapshot, nil
} else if client.IgnoreNotFound(err) != nil {
return nil, errors.Wrap(err, "get volume snapshot")
}

volumeSnapshot.Spec = volumesnapshotv1.VolumeSnapshotSpec{
VolumeSnapshotClassName: bcp.Spec.VolumeSnapshotClass,
Source: volumesnapshotv1.VolumeSnapshotSource{
PersistentVolumeClaimName: &pvc,
},
}
if err := controllerutil.SetControllerReference(bcp, volumeSnapshot, cl.Scheme()); err != nil {
return nil, errors.Wrap(err, "set controller reference")
}

if err := cl.Create(ctx, volumeSnapshot); err != nil {
return nil, errors.Wrap(err, "create volume snapshot")
}
return volumeSnapshot, nil
}

func (b *snapshotBackups) reconcileSnapshots(
ctx context.Context,
cl client.Client,
bcp *api.PerconaServerMongoDBBackup,
meta *backup.BackupMeta,
) (bool, []api.SnapshotInfo, error) {
done := true
snapshots := make([]api.SnapshotInfo, 0)

podName := func(nodeName string) (string, error) {
parts := strings.Split(nodeName, ".")
if len(parts) < 1 {
return "", errors.Errorf("unexpected node name format: %s", nodeName)
}
return parts[0], nil
}

for _, rs := range meta.Replsets {
// do not snapshot nodes that are not yet copy ready.
if rs.Status != defs.StatusCopyReady {
done = false
continue
}

// parse pod name from node name.
podName, err := podName(rs.Node)
if err != nil {
return false, nil, errors.Wrap(err, "get pod name")
}

// ensure snapshot is created.
pvcName := config.MongodDataVolClaimName + "-" + podName
snapshot, err := b.reconcileSnapshot(ctx, cl, pvcName, bcp)
if err != nil {
return false, nil, errors.Wrap(err, "reconcile snapshot")
}

if snapshot.Status == nil || !ptr.Deref(snapshot.Status.ReadyToUse, false) {
done = false
}

// If there is an error, return error.
// Note that some errors may be transient, but the controller will retry.
if snapshot.Status != nil && snapshot.Status.Error != nil && ptr.Deref(snapshot.Status.Error.Message, "") != "" {
return false, nil, errors.Errorf("snapshot error: %s", ptr.Deref(snapshot.Status.Error.Message, ""))
}
snapshots = append(snapshots, api.SnapshotInfo{
NodeName: rs.Node,
SnapshotName: snapshot.GetName(),
})
}
return done, snapshots, nil
}

func (b *snapshotBackups) Status(ctx context.Context, cl client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
status := cr.Status
log := logf.FromContext(ctx).WithName("backupStatus").WithValues("backup", cr.Name, "pbmName", status.PBMname)

meta, err := b.pbm.GetBackupByName(ctx, cr.Status.PBMname)
if err != nil && !errors.Is(err, pbmErrors.ErrNotFound) {
return status, errors.Wrap(err, "get pbm backup meta")
}

if meta == nil || meta.Name == "" || errors.Is(err, pbmErrors.ErrNotFound) {
logf.FromContext(ctx).Info("Waiting for backup metadata", "pbmName", cr.Status.PBMname, "backup", cr.Name)
return status, nil
}

log.V(1).Info("Got backup meta", "meta", meta)

if meta.StartTS > 0 {
status.StartAt = &metav1.Time{
Time: time.Unix(meta.StartTS, 0),
}
}

switch meta.Status {
case defs.StatusError:
status.State = api.BackupStateError
status.Error = fmt.Sprintf("%v", meta.Error())

case defs.StatusStarting:
passed := time.Now().UTC().Sub(time.Unix(meta.StartTS, 0))
timeoutSeconds := defaultPBMStartingDeadline
if s := cluster.Spec.Backup.StartingDeadlineSeconds; s != nil && *s > 0 {
timeoutSeconds = *s
}
if passed >= time.Duration(timeoutSeconds)*time.Second {
status.State = api.BackupStateError
status.Error = pbmStartingDeadlineErrMsg
break
}

status.State = api.BackupStateRequested

case defs.StatusDone:
status.State = api.BackupStateReady
status.CompletedAt = &metav1.Time{
Time: time.Unix(meta.LastTransitionTS, 0),
}
status.LastWriteAt = &metav1.Time{
Time: time.Unix(int64(meta.LastWriteTS.T), 0),
}

case defs.StatusCopyReady:
status.State = api.BackupStateRunning
snapshotsReady, snapshots, err := b.reconcileSnapshots(ctx, cl, cr, meta)
if err != nil {
return status, errors.Wrap(err, "reconcile snapshots")
}
status.Snapshots = snapshots
if snapshotsReady {
if err := b.pbm.FinishBackup(ctx, cr.Status.PBMname); err != nil {
return status, errors.Wrap(err, "finish backup")
}
}
}

return status, nil
}

func (b *snapshotBackups) Complete(ctx context.Context) error {
return nil
}

Copilot AI Feb 18, 2026


The new snapshot backup functionality in snapshot.go lacks test coverage. There are tests for backup_test.go but no corresponding snapshot_test.go file. Given that this is a new feature, comprehensive tests should be added to verify the snapshot creation, reconciliation, status handling, and error scenarios.

containers:
- name: percona-server-mongodb-operator
image: perconalab/percona-server-mongodb-operator:main
image: asia-south1-docker.pkg.dev/cloud-dev-112233/mayankshah/psmdb-operator:K8SPSMDB-1363

Copilot AI Feb 18, 2026


The bundle.yaml file contains a hardcoded image reference to a personal development repository (asia-south1-docker.pkg.dev/cloud-dev-112233/mayankshah/psmdb-operator:K8SPSMDB-1363). This should be reverted to use the standard image reference before merging. Development-specific image references should not be committed to the main branch.

}

func (b *pbmC) GetBackupByName(ctx context.Context, bcpName string) (*backup.BackupMeta, error) {
return backup.NewDBManager(b.Client).GetBackupByName(ctx, bcpName)

Copilot AI Feb 18, 2026


GetBackupMeta and GetBackupByName are duplicate implementations that both call backup.NewDBManager(b.Client).GetBackupByName(ctx, bcpName). This is redundant. Consider removing one of them or clarifying why both are needed.

Suggested change
return backup.NewDBManager(b.Client).GetBackupByName(ctx, bcpName)
return b.GetBackupMeta(ctx, bcpName)

Comment on lines +22 to +25
// VolumeSnapshotClass is the name of the VolumeSnapshotClass to use for snapshot based backups.
// This may be specified only when type is `external`.
// +kubebuilder:validation:Optional
VolumeSnapshotClass *string `json:"volumeSnapshotClass,omitempty"`

Copilot AI Feb 18, 2026


The documentation comment on lines 22-23 states that VolumeSnapshotClass "may be specified only when type is external", but there's no validation enforcing this constraint. Either add validation in the CheckFields method to ensure this is only set when Type is ExternalBackup, or add a kubebuilder validation marker to enforce this at the CRD level.

Comment on lines 88 to 89
func VolumeSnapshotName(bcp *psmdbv1.PerconaServerMongoDBBackup, pvc string) string {
return fmt.Sprintf("%s-%s", bcp.Name, pvc)

Copilot AI Feb 18, 2026


The VolumeSnapshotName function concatenates the backup name and PVC name without checking for Kubernetes name length limits (253 characters for most resources). If the backup name is long, this could result in invalid snapshot names. Consider adding truncation or validation to ensure the generated name doesn't exceed Kubernetes limits.

Suggested change
func VolumeSnapshotName(bcp *psmdbv1.PerconaServerMongoDBBackup, pvc string) string {
return fmt.Sprintf("%s-%s", bcp.Name, pvc)
const maxK8sNameLen = 253
func truncateNamePart(s string, max int) string {
if max <= 0 {
return ""
}
if len(s) <= max {
return s
}
return s[:max]
}
func VolumeSnapshotName(bcp *psmdbv1.PerconaServerMongoDBBackup, pvc string) string {
name := bcp.Name
// Ensure the final "<backup-name>-<pvc>" does not exceed Kubernetes name length limits.
// Reserve space for the hyphen separator.
remaining := maxK8sNameLen - 1 - len(pvc)
if remaining < 0 {
// PVC name alone exceeds or equals the limit; truncate PVC first.
pvc = truncateNamePart(pvc, maxK8sNameLen)
remaining = maxK8sNameLen - 1 - len(pvc)
}
if remaining < 0 {
// In the extreme case where even "<truncated-pvc>" plus hyphen would exceed the limit,
// drop the backup name entirely and fall back to the truncated PVC name.
return pvc
}
name = truncateNamePart(name, remaining)
return fmt.Sprintf("%s-%s", name, pvc)
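An alternative to plain truncation, sketched below under the assumption that a stable suffix is acceptable for snapshot names, is to truncate and append a short hash of the full name, so two long names that share a prefix do not collide after truncation. The 253-character limit matches the DNS subdomain limit that applies to most namespaced Kubernetes objects, including VolumeSnapshots; the helper name is illustrative, not the PR's API:

package main

import (
	"crypto/sha256"
	"fmt"
)

const maxNameLen = 253 // DNS-1123 subdomain limit for most namespaced objects

// safeName joins parts with "-" and, when the result exceeds maxNameLen,
// truncates it and appends an 8-hex-character hash of the full name so
// truncated names derived from different inputs remain distinct.
func safeName(parts ...string) string {
	name := ""
	for i, p := range parts {
		if i > 0 {
			name += "-"
		}
		name += p
	}
	if len(name) <= maxNameLen {
		return name
	}
	sum := fmt.Sprintf("%x", sha256.Sum256([]byte(name)))[:8]
	// Reserve 9 characters for "-" plus the hash suffix.
	return name[:maxNameLen-9] + "-" + sum
}

func main() {
	fmt.Println(safeName("my-backup", "mongod-data-cluster1-rs0-0"))
	long := ""
	for i := 0; i < 300; i++ {
		long += "a"
	}
	fmt.Println(len(safeName(long, "pvc"))) // 253
}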

@pull-request-size pull-request-size bot added size/XXL 1000+ lines and removed size/XL 500-999 lines labels Feb 18, 2026
@github-actions github-actions bot added tests dependencies Pull requests that update a dependency file labels Feb 18, 2026
Comment on lines +10 to +12
"github.com/percona/percona-backup-mongodb/pbm/ctrl"
"github.com/percona/percona-backup-mongodb/pbm/defs"
pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"


[goimports-reviser] reported by reviewdog 🐶

Suggested change
"github.com/percona/percona-backup-mongodb/pbm/ctrl"
"github.com/percona/percona-backup-mongodb/pbm/defs"
pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"

"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
logf "sigs.k8s.io/controller-runtime/pkg/log"

api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"


[goimports-reviser] reported by reviewdog 🐶

Suggested change
api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
"github.com/percona/percona-backup-mongodb/pbm/ctrl"
"github.com/percona/percona-backup-mongodb/pbm/defs"
pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"
api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"

Comment on lines +11 to +15
"github.com/percona/percona-backup-mongodb/pbm/defs"
psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
"github.com/percona/percona-server-mongodb-operator/pkg/naming"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"


[goimports-reviser] reported by reviewdog 🐶

Suggested change
"github.com/percona/percona-backup-mongodb/pbm/defs"
psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
"github.com/percona/percona-server-mongodb-operator/pkg/naming"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"



[goimports-reviser] reported by reviewdog 🐶

// Import all Kubernetes client auth plugins (e.g. Azure, GCP, OIDC, etc.)
// to ensure that exec-entrypoint and run can make use of them.
_ "k8s.io/client-go/plugin/pkg/client/auth"

Copilot AI review requested due to automatic review settings February 18, 2026 18:40

Copilot AI left a comment


Pull request overview

Copilot reviewed 27 out of 28 changed files in this pull request and generated 9 comments.

Comments suppressed due to low confidence (1)

pkg/apis/psmdb/v1/perconaservermongodbbackup_types.go:133

  • The CheckFields validation does not verify that VolumeSnapshotClass is specified when Type is external. This could allow users to create external backup resources without specifying the required VolumeSnapshotClass, leading to failures later when the operator tries to create snapshots. Consider adding validation to require VolumeSnapshotClass when Type is ExternalBackup.
func (p *PerconaServerMongoDBBackup) CheckFields() error {
	if len(p.Spec.StorageName) == 0 && p.Spec.Type != defs.ExternalBackup {
		return fmt.Errorf("spec storageName field is empty")
	}
	if len(p.Spec.GetClusterName()) == 0 {
		return fmt.Errorf("spec clusterName is empty")
	}
	if string(p.Spec.Type) == "" {
		p.Spec.Type = defs.LogicalBackup
	}
	if string(p.Spec.Compression) == "" {
		p.Spec.Compression = compress.CompressionTypeGZIP
	}
	return nil


Comment on lines +30 to +32
if bcp.Spec.Type == defs.ExternalBackup {
// TODO: should we check that snapshots exist?
return nil

Copilot AI Feb 18, 2026


For external (snapshot-based) backups, the validation skips checking if snapshots exist (as noted in the TODO). This could allow restore operations to proceed when the required VolumeSnapshots are missing or not ready, which would cause the restore to fail later in the process. Consider implementing a validation check to verify that all required snapshots exist and are in a ready state before allowing the restore to proceed.


orig := sfs.DeepCopy()

// Scale down the statefulset.

Copilot AI Feb 18, 2026


The comment says "Scale down the statefulset" but the function is named scaleUpStatefulSetsForSnapshotRestore and is scaling UP to the desired replicas. The comment should say "Scale up the statefulset" to accurately reflect what the code is doing.

Suggested change
// Scale down the statefulset.
// Scale up the statefulset.

Comment on lines 1 to 249
package perconaservermongodbbackup

import (
"context"
"fmt"
"strings"
"time"

volumesnapshotv1 "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
"github.com/percona/percona-backup-mongodb/pbm/ctrl"
"github.com/percona/percona-backup-mongodb/pbm/defs"
pbmErrors "github.com/percona/percona-backup-mongodb/pbm/errors"
"github.com/pkg/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/utils/ptr"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller/controllerutil"
logf "sigs.k8s.io/controller-runtime/pkg/log"

api "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
"github.com/percona/percona-server-mongodb-operator/pkg/naming"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"
)

type snapshotBackups struct {
pbm backup.PBM
spec api.BackupSpec
}

func (r *ReconcilePerconaServerMongoDBBackup) newSnapshotBackups(ctx context.Context, cluster *api.PerconaServerMongoDB) (*snapshotBackups, error) {
if cluster == nil {
return &snapshotBackups{}, nil
}
cn, err := r.newPBMFunc(ctx, r.client, cluster)
if err != nil {
return nil, errors.Wrap(err, "reate pbm object")
}

return &snapshotBackups{pbm: cn, spec: cluster.Spec.Backup}, nil
}

func (b *snapshotBackups) PBM() backup.PBM {
return b.pbm
}

func (b *snapshotBackups) Start(ctx context.Context, k8sclient client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
log := logf.FromContext(ctx).WithValues("backup", cr.Name)

log.Info("Starting snapshot backup")

var status api.PerconaServerMongoDBBackupStatus

name := time.Now().UTC().Format(time.RFC3339)
cmd := ctrl.Cmd{
Cmd: ctrl.CmdBackup,
Backup: &ctrl.BackupCmd{
Name: name,
Type: defs.ExternalBackup,
},
}

log.Info("Sending backup command", "backupCmd", cmd)

if err := b.pbm.SendCmd(ctx, cmd); err != nil {
return status, err
}
status.State = api.BackupStateRequested

status = api.PerconaServerMongoDBBackupStatus{
PBMname: name,
LastTransition: &metav1.Time{
Time: time.Unix(time.Now().Unix(), 0),
},
State: api.BackupStateRequested,
}
if cluster.Spec.Sharding.Enabled && cluster.Spec.Sharding.ConfigsvrReplSet != nil {
status.ReplsetNames = append(status.ReplsetNames, cluster.Spec.Sharding.ConfigsvrReplSet.Name)
}
for _, rs := range cluster.Spec.Replsets {
status.ReplsetNames = append(status.ReplsetNames, rs.Name)
}

return status, nil
}

func (b *snapshotBackups) reconcileSnapshot(
ctx context.Context,
cl client.Client,
rsName string,
pvc string,
bcp *api.PerconaServerMongoDBBackup,
) (*volumesnapshotv1.VolumeSnapshot, error) {
volumeSnapshot := &volumesnapshotv1.VolumeSnapshot{
ObjectMeta: metav1.ObjectMeta{
Name: naming.VolumeSnapshotName(bcp, rsName),
Namespace: bcp.GetNamespace(),
},
}
if err := cl.Get(ctx, client.ObjectKeyFromObject(volumeSnapshot), volumeSnapshot); err == nil {
return volumeSnapshot, nil
} else if client.IgnoreNotFound(err) != nil {
return nil, errors.Wrap(err, "get volume snapshot")
}

volumeSnapshot.Spec = volumesnapshotv1.VolumeSnapshotSpec{
VolumeSnapshotClassName: bcp.Spec.VolumeSnapshotClass,
Source: volumesnapshotv1.VolumeSnapshotSource{
PersistentVolumeClaimName: &pvc,
},
}
if err := controllerutil.SetControllerReference(bcp, volumeSnapshot, cl.Scheme()); err != nil {
return nil, errors.Wrap(err, "set controller reference")
}

if err := cl.Create(ctx, volumeSnapshot); err != nil {
return nil, errors.Wrap(err, "create volume snapshot")
}
return volumeSnapshot, nil
}

func (b *snapshotBackups) reconcileSnapshots(
ctx context.Context,
cl client.Client,
bcp *api.PerconaServerMongoDBBackup,
meta *backup.BackupMeta,
) (bool, []api.SnapshotInfo, error) {
done := true
snapshots := make([]api.SnapshotInfo, 0)

podName := func(nodeName string) (string, error) {
parts := strings.Split(nodeName, ".")
if len(parts) < 1 {
return "", errors.Errorf("unexpected node name format: %s", nodeName)
}
return parts[0], nil
}

for _, rs := range meta.Replsets {
// do not snapshot nodes that are not yet copy ready.
if rs.Status != defs.StatusCopyReady {
done = false
continue
}

// parse pod name from node name.
podName, err := podName(rs.Node)
if err != nil {
return false, nil, errors.Wrap(err, "get pod name")
}

// ensure snapshot is created.
pvcName := config.MongodDataVolClaimName + "-" + podName
snapshot, err := b.reconcileSnapshot(ctx, cl, rs.Name, pvcName, bcp)
if err != nil {
return false, nil, errors.Wrap(err, "reconcile snapshot")
}

if snapshot.Status == nil || !ptr.Deref(snapshot.Status.ReadyToUse, false) {
done = false
}

// If there is an error, return error.
		// Note that some errors may be transient, but the controller will retry.
		if snapshot.Status != nil && snapshot.Status.Error != nil && ptr.Deref(snapshot.Status.Error.Message, "") != "" {
			return false, nil, errors.Errorf("snapshot error: %s", ptr.Deref(snapshot.Status.Error.Message, ""))
		}
		snapshots = append(snapshots, api.SnapshotInfo{
			ReplsetName:  rs.Name,
			SnapshotName: snapshot.GetName(),
		})
	}
	return done, snapshots, nil
}

func (b *snapshotBackups) Status(ctx context.Context, cl client.Client, cluster *api.PerconaServerMongoDB, cr *api.PerconaServerMongoDBBackup) (api.PerconaServerMongoDBBackupStatus, error) {
	status := cr.Status
	log := logf.FromContext(ctx).WithName("backupStatus").WithValues("backup", cr.Name, "pbmName", status.PBMname)

	meta, err := b.pbm.GetBackupByName(ctx, cr.Status.PBMname)
	if err != nil && !errors.Is(err, pbmErrors.ErrNotFound) {
		return status, errors.Wrap(err, "get pbm backup meta")
	}

	if meta == nil || meta.Name == "" || errors.Is(err, pbmErrors.ErrNotFound) {
		logf.FromContext(ctx).Info("Waiting for backup metadata", "pbmName", cr.Status.PBMname, "backup", cr.Name)
		return status, nil
	}

	log.V(1).Info("Got backup meta", "meta", meta)

	if meta.StartTS > 0 {
		status.StartAt = &metav1.Time{
			Time: time.Unix(meta.StartTS, 0),
		}
	}

	switch meta.Status {
	case defs.StatusError:
		status.State = api.BackupStateError
		status.Error = fmt.Sprintf("%v", meta.Error())

	case defs.StatusStarting:
		passed := time.Now().UTC().Sub(time.Unix(meta.StartTS, 0))
		timeoutSeconds := defaultPBMStartingDeadline
		if s := cluster.Spec.Backup.StartingDeadlineSeconds; s != nil && *s > 0 {
			timeoutSeconds = *s
		}
		if passed >= time.Duration(timeoutSeconds)*time.Second {
			status.State = api.BackupStateError
			status.Error = pbmStartingDeadlineErrMsg
			break
		}

		status.State = api.BackupStateRequested

	case defs.StatusDone:
		status.State = api.BackupStateReady
		status.CompletedAt = &metav1.Time{
			Time: time.Unix(meta.LastTransitionTS, 0),
		}
		status.LastWriteAt = &metav1.Time{
			Time: time.Unix(int64(meta.LastWriteTS.T), 0),
		}

	case defs.StatusCopyReady:
		status.State = api.BackupStateRunning
		snapshotsReady, snapshots, err := b.reconcileSnapshots(ctx, cl, cr, meta)
		if err != nil {
			return status, errors.Wrap(err, "reconcile snapshots")
		}
		status.Snapshots = snapshots
		if snapshotsReady {
			if err := b.pbm.FinishBackup(ctx, cr.Status.PBMname); err != nil {
				return status, errors.Wrap(err, "finish backup")
			}
		}
	}

	status.LastTransition = &metav1.Time{
		Time: time.Unix(meta.LastTransitionTS, 0),
	}
	status.Type = cr.Spec.Type
	return status, nil
}

func (b *snapshotBackups) Complete(ctx context.Context) error {
	return nil
}
Copilot AI Feb 18, 2026
The new snapshot backup and restore functionality (approximately 800 lines of code across snapshot.go and snapshots.go) does not have any unit tests. This is a significant gap in test coverage for a critical feature. Consider adding unit tests to cover:

  • Snapshot creation and reconciliation logic
  • PVC recreation from snapshots
  • StatefulSet scaling operations
  • Error handling scenarios
  • Edge cases like missing snapshots or failed operations

- get
- list
- watch
- create
Copilot AI Feb 18, 2026
The RBAC permissions for VolumeSnapshots are missing delete and update verbs. While the operator creates snapshots during backup, it may also need to clean them up when backups are deleted (as part of the finalizer logic), and potentially update snapshot metadata. Consider adding delete permission at minimum for proper cleanup. Review whether update and patch permissions are also needed for snapshot management.

Suggested change
-  - create
+  - create
+  - update
+  - patch
+  - delete

Comment on lines 232 to 236
// TODO
// Delete all statefulsets.
// Resync PBM storage.

return restore.Status, nil
Copilot AI Feb 18, 2026
The restore process never transitions to a final state (like RestoreStateReady or RestoreStateDone). After running runPBMRestoreFinish, the status remains in RestoreStateRunning state indefinitely. This means users won't know when the restore is complete, and the restore object will keep being reconciled. The function should update the status to a final state (e.g., RestoreStateReady) after successful completion.

Suggested change
-	// TODO
-	// Delete all statefulsets.
-	// Resync PBM storage.
-	return restore.Status, nil
+	// At this point, all steps of the snapshot restore have completed successfully.
+	// Transition the restore to a final state so it is no longer reconciled as running.
+	status.State = psmdbv1.RestoreStateReady
+	// TODO
+	// Delete all statefulsets.
+	// Resync PBM storage.
+	return status, nil

Name: snapshotName,
}
pvc.SetAnnotations(map[string]string{
naming.AnnotationRestoreName: snapshotName,
Copilot AI Feb 18, 2026
The annotation value is set to the snapshot name instead of the restore name. This is inconsistent with the check on line 423, where it compares restoreName == restore.Name. This means that if the snapshot name doesn't match the restore name, the PVC will be deleted and recreated unnecessarily. Consider using restore.Name here to ensure consistency.

Comment on lines 232 to 234
// TODO
// Delete all statefulsets.
// Resync PBM storage.
Copilot AI Feb 18, 2026
The TODO comment indicates incomplete functionality - deleting all statefulsets and resyncing PBM storage. Without this cleanup, the restore process may leave resources in an inconsistent state. This is critical for the snapshot restore feature to work correctly. Please complete this implementation or create a follow-up task to address it.

Suggested change
-	// TODO
-	// Delete all statefulsets.
-	// Resync PBM storage.
+	// FOLLOW-UP (snapshot restore cleanup):
+	// Deleting all statefulsets created for snapshot restore and resyncing PBM storage
+	// after a successful PBM restore is not yet implemented here.
+	// Tracking issue: PSMDB-XXXX (implement post-snapshot-restore cleanup and PBM resync).

}
cn, err := r.newPBMFunc(ctx, r.client, cluster)
if err != nil {
return nil, errors.Wrap(err, "reate pbm object")
Copilot AI Feb 18, 2026
Typo in error message: "reate" should be "create".

Suggested change
-	return nil, errors.Wrap(err, "reate pbm object")
+	return nil, errors.Wrap(err, "create pbm object")

Comment on lines 280 to 288
sfs.Spec.Template.Spec.Containers[0].Command = []string{"/opt/percona/pbm-agent"}
sfs.Spec.Template.Spec.Containers[0].Args = []string{
"restore-finish",
restore.Status.PBMname,
"-c", "/etc/pbm/pbm_config.yaml",
"--rs", "$(MONGODB_REPLSET)",
"--node", "$(POD_NAME).$(SERVICE_NAME)-$(MONGODB_REPLSET).$(NAMESPACE).svc.cluster.local",
// "--db-config", "/etc/pbm/db-config.yaml", // TODO
}
Copilot AI Feb 18, 2026
Potential index out of bounds error. The code assumes that sfs.Spec.Template.Spec.Containers[0] exists, but there's no check to verify that the Containers slice has at least one element. If the StatefulSet has no containers (which would be unusual but possible in error scenarios), this will panic. Consider adding a length check or finding the container by name instead of assuming index 0.
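A defensive alternative is to look the container up by name and return an error when it is missing. The snippet below models the container with a local struct so it stays self-contained; in the operator the lookup would run over sfs.Spec.Template.Spec.Containers (corev1.Container), and the container names used here are illustrative:

```go
package main

import (
	"errors"
	"fmt"
)

// container stands in for corev1.Container; only the fields used here.
type container struct {
	Name    string
	Command []string
	Args    []string
}

// findContainer returns a pointer to the named container so callers can
// mutate it in place, instead of the panic-prone Containers[0] access.
func findContainer(containers []container, name string) (*container, error) {
	for i := range containers {
		if containers[i].Name == name {
			return &containers[i], nil
		}
	}
	return nil, errors.New("container not found: " + name)
}

func main() {
	cs := []container{{Name: "backup-agent"}, {Name: "mongod"}}
	c, err := findContainer(cs, "mongod")
	if err != nil {
		panic(err)
	}
	// The pointer aliases the slice element, so the StatefulSet spec is updated.
	c.Command = []string{"/opt/percona/pbm-agent"}
	fmt.Println(cs[1].Command[0]) // prints "/opt/percona/pbm-agent"
}
```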

Signed-off-by: Mayank Shah <mayank.shah@percona.com>
"k8s.io/utils/ptr"
"sigs.k8s.io/controller-runtime/pkg/client"
logf "sigs.k8s.io/controller-runtime/pkg/log"
)
Contributor
[goimports-reviser] reported by reviewdog 🐶

Suggested change
-)
+	"github.com/percona/percona-backup-mongodb/pbm/defs"
+	psmdbv1 "github.com/percona/percona-server-mongodb-operator/pkg/apis/psmdb/v1"
+	"github.com/percona/percona-server-mongodb-operator/pkg/naming"
+	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/backup"
+	"github.com/percona/percona-server-mongodb-operator/pkg/psmdb/config"
+)

Signed-off-by: Mayank Shah <mayank.shah@percona.com>
Copilot AI review requested due to automatic review settings February 19, 2026 11:24
Copilot AI left a comment
Pull request overview

Copilot reviewed 29 out of 30 changed files in this pull request and generated 10 comments.




err := r.clientcmd.Exec(ctx, &pod, "mongod", restoreFinishCmd, nil, stdoutBuf, stderrBuf, false)
if err != nil {
log.Error(nil, "Failed to finish restore", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
Copilot AI Feb 19, 2026
Same issue as above: log.Error is called with a nil error even though err is available. Pass err so the log entry includes the actual failure.

Suggested change
-	log.Error(nil, "Failed to finish restore", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
+	log.Error(err, "Failed to finish restore", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())

Comment on lines +6 to +7
AWS_ACCESS_KEY_ID: "minioadmin"
AWS_SECRET_ACCESS_KEY: "minioadmin"
Copilot AI Feb 19, 2026
This manifest introduces a Secret with hardcoded credentials (minioadmin/minioadmin). Shipping real-looking default credentials in a deploy example is risky because it can be applied as-is in non-dev clusters. Consider removing this Secret from deploy/cr.yaml, commenting it out, or replacing values with obvious placeholders and documentation comments.

Suggested change
-  AWS_ACCESS_KEY_ID: "minioadmin"
-  AWS_SECRET_ACCESS_KEY: "minioadmin"
+  # WARNING: Replace these placeholder values with your own MinIO credentials before applying.
+  AWS_ACCESS_KEY_ID: "<YOUR-MINIO-ACCESS-KEY>"
+  AWS_SECRET_ACCESS_KEY: "<YOUR-MINIO-SECRET-KEY>"

Comment on lines +131 to +150
podName := func(nodeName string) (string, error) {
parts := strings.Split(nodeName, ".")
if len(parts) < 1 {
return "", errors.Errorf("unexpected node name format: %s", nodeName)
}
return parts[0], nil
}

for _, rs := range meta.Replsets {
// do not snapshot nodes that are not yet copy ready.
if rs.Status != defs.StatusCopyReady {
done = false
continue
}

// parse pod name from node name.
podName, err := podName(rs.Node)
if err != nil {
return false, nil, errors.Wrap(err, "get pod name")
}
Copilot AI Feb 19, 2026
podName is used both as a helper function and then as a local variable (podName, err := podName(...)), which is legal but hard to read. Renaming the helper (e.g. parsePodName) avoids shadowing and improves clarity.
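A renamed helper might look like the sketch below. As a side note, the original guard len(parts) < 1 can never fire, because strings.Split always returns at least one element; strings.Cut gives a slightly stricter format check. The node-name format ("<pod>.<service>...") is assumed from the quoted code, and rejecting dot-less names is a deliberate tightening, not the original behavior:

```go
package main

import (
	"fmt"
	"strings"
)

// parsePodName extracts the pod name from a PBM node name of the form
// "<pod>.<service>.<namespace>.svc.cluster.local". Naming the helper
// avoids shadowing it with the local variable that holds its result.
func parsePodName(nodeName string) (string, error) {
	pod, _, found := strings.Cut(nodeName, ".")
	if !found || pod == "" {
		return "", fmt.Errorf("unexpected node name format: %q", nodeName)
	}
	return pod, nil
}

func main() {
	pod, err := parsePodName("my-cluster-rs0-0.my-cluster-rs0.psmdb.svc.cluster.local")
	fmt.Println(pod, err) // prints "my-cluster-rs0-0 <nil>"
}
```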

return rr, err
}
defer bcp.Close(ctx)
defer bcp.PBM().Close(ctx)
Copilot AI Feb 19, 2026
defer bcp.PBM().Close(ctx) can panic when bcp.PBM() is nil (e.g., when the referenced cluster is not found and newManagedBackups/newSnapshotBackups return an executor with pbm=nil). The old Backup.Close() handled nil safely. Guard the defer with a nil check (or reintroduce a Close method on the executor interface) to avoid nil-interface method calls.

Suggested change
-	defer bcp.PBM().Close(ctx)
+	pbmCli := bcp.PBM()
+	if pbmCli != nil {
+		defer pbmCli.Close(ctx)
+	}

-	if len(p.Spec.StorageName) == 0 {
+	if len(p.Spec.StorageName) == 0 && p.Spec.Type != defs.ExternalBackup {
 		return fmt.Errorf("spec storageName field is empty")
 	}
Copilot AI Feb 19, 2026
CheckFields() now allows spec.storageName to be empty whenever spec.type == external, but the controller falls back to managed backups when volumeSnapshotClass isn’t set and managedBackups.Start() will then fail with “unable to get storage ''”. Consider tightening validation here (e.g., require volumeSnapshotClass for external backups, or require storageName when volumeSnapshotClass is empty) so invalid specs fail fast with a clear error.

Suggested change
-	}
+	}
+	if p.Spec.Type == defs.ExternalBackup {
+		if (p.Spec.VolumeSnapshotClass == nil || *p.Spec.VolumeSnapshotClass == "") && len(p.Spec.StorageName) == 0 {
+			return fmt.Errorf("spec volumeSnapshotClass or storageName must be set for external backups")
+		}
+	}

Comment on lines +382 to +383
AccessKeyID: storage.MaskedString(accessKey),
SecretAccessKey: storage.MaskedString(secretAccessKey),
Copilot AI Feb 19, 2026
In MinIO credentials, accessKey/secretAccessKey are []byte from Secret data, but storage.MaskedString is used elsewhere with a string input. This likely won’t compile (or will produce an unintended value) unless you convert the bytes to string before wrapping.

Suggested change
-		AccessKeyID:     storage.MaskedString(accessKey),
-		SecretAccessKey: storage.MaskedString(secretAccessKey),
+		AccessKeyID:     storage.MaskedString(string(accessKey)),
+		SecretAccessKey: storage.MaskedString(string(secretAccessKey)),


func (cr *PerconaServerMongoDB) GetAllReplsets() []*ReplsetSpec {
replsets := cr.Spec.Replsets
if cr.Spec.Sharding.Enabled {
Copilot AI Feb 19, 2026
GetAllReplsets() appends cr.Spec.Sharding.ConfigsvrReplSet whenever sharding is enabled, without a nil check. Several other call sites in the codebase guard ConfigsvrReplSet != nil, and this method can return a slice containing nil if invoked before defaults/validation, leading to panics when iterating. Consider checking ConfigsvrReplSet != nil (or returning an empty slice / error) to make this helper safe.

Suggested change
-	if cr.Spec.Sharding.Enabled {
+	if cr.Spec.Sharding.Enabled && cr.Spec.Sharding.ConfigsvrReplSet != nil {


err := r.clientcmd.Exec(ctx, &pod, "mongod", restoreCmd, nil, stdoutBuf, stderrBuf, false)
if err != nil {
log.Error(nil, "Restore failed to start", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
Copilot AI Feb 19, 2026
The logger call passes nil as the error value even though err is available. This drops the underlying error from structured logs and makes debugging harder; pass err instead of nil.

Suggested change
-	log.Error(nil, "Restore failed to start", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())
+	log.Error(err, "Restore failed to start", "pod", pod.Name, "stderr", stderrBuf.String(), "stdout", stdoutBuf.String())


orig := sfs.DeepCopy()

// Scale down the statefulset.
Copilot AI Feb 19, 2026
Comment says “Scale down the statefulset” but this code scales replicas up to replicas. Update the comment to avoid confusion during future maintenance/debugging.

Suggested change
-	// Scale down the statefulset.
+	// Scale the statefulset to the desired number of replicas.

Comment on lines +200 to 215
 	var bcp backupExecutor
 	if err = retry.OnError(defaultBackoff, func(err error) bool { return err != nil }, func() error {
 		var err error
-		bcp, err = r.newBackup(ctx, cluster)
-		if err != nil {
-			return errors.Wrap(err, "create backup object")
-		}
+		switch {
+		case cr.Spec.Type == defs.ExternalBackup &&
+			cr.Spec.VolumeSnapshotClass != nil && *cr.Spec.VolumeSnapshotClass != "":
+			bcp, err = r.newSnapshotBackups(ctx, cluster)
+			if err != nil {
+				return errors.Wrap(err, "create snapshot backup object")
+			}
+		default:
+			bcp, err = r.newManagedBackups(ctx, cluster)
+			if err != nil {
+				return errors.Wrap(err, "create backup object")
+			}
+		}
Copilot AI Feb 19, 2026
New control-flow selects a different backup executor for type=external + volumeSnapshotClass. There are existing unit tests in this package (e.g. backup_test.go) but none exercise this new snapshot-based path. Adding unit tests that cover executor selection and snapshot backup status transitions would help prevent regressions.
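A table-driven test could pin this selection logic down. The sketch below models the decision as a pure function over a reduced spec — backupSpec, executorFor, and the "external" literal are hypothetical stand-ins for cr.Spec and defs.ExternalBackup, chosen only to keep the example self-contained:

```go
package main

import "fmt"

// backupSpec reduces the backup CR spec to the two fields the
// executor selection depends on.
type backupSpec struct {
	Type                string
	VolumeSnapshotClass *string
}

// executorFor mirrors the switch in the reconciler: only an external
// backup with a non-empty volumeSnapshotClass takes the snapshot path;
// everything else falls back to managed (PBM) backups.
func executorFor(spec backupSpec) string {
	if spec.Type == "external" && spec.VolumeSnapshotClass != nil && *spec.VolumeSnapshotClass != "" {
		return "snapshot"
	}
	return "managed"
}

func main() {
	class := "csi-hostpath-snapclass"
	empty := ""
	fmt.Println(executorFor(backupSpec{Type: "external", VolumeSnapshotClass: &class})) // prints "snapshot"
	fmt.Println(executorFor(backupSpec{Type: "logical"}))                               // prints "managed"
	fmt.Println(executorFor(backupSpec{Type: "external", VolumeSnapshotClass: &empty})) // prints "managed"
}
```

Factoring the condition into a helper like this in the operator would let the nil-pointer and empty-string cases be covered without constructing a full reconciler.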

@JNKPercona
Collaborator

Test Name Result Time
arbiter passed 00:10:59
balancer passed 00:19:04
cross-site-sharded passed 00:18:09
custom-replset-name passed 00:10:17
custom-tls passed 00:13:59
custom-users-roles passed 00:10:14
custom-users-roles-sharded passed 00:11:20
data-at-rest-encryption passed 00:12:11
data-sharded passed 00:23:09
demand-backup failure 01:30:13
demand-backup-eks-credentials-irsa passed 00:00:07
demand-backup-fs failure 00:38:17
demand-backup-if-unhealthy passed 00:11:01
demand-backup-incremental-aws failure 00:34:20
demand-backup-incremental-azure failure 00:34:00
demand-backup-incremental-gcp-native failure 00:33:40
demand-backup-incremental-gcp-s3 failure 00:24:52
demand-backup-incremental-minio failure 00:23:29
demand-backup-incremental-sharded-aws failure 00:20:30
demand-backup-incremental-sharded-azure failure 00:20:02
demand-backup-incremental-sharded-gcp-native failure 00:20:11
demand-backup-incremental-sharded-gcp-s3 failure 00:19:29
demand-backup-incremental-sharded-minio failure 00:17:06
demand-backup-physical-parallel passed 00:08:32
demand-backup-physical-aws failure 00:13:45
demand-backup-physical-azure failure 00:09:34
demand-backup-physical-gcp-s3 failure 00:06:50
demand-backup-physical-gcp-native failure 00:57:51
demand-backup-physical-minio failure 00:31:42
demand-backup-physical-minio-native failure 00:32:02
demand-backup-physical-minio-native-tls failure 00:24:48
demand-backup-physical-sharded-parallel passed 00:11:48
demand-backup-physical-sharded-aws failure 00:30:20
demand-backup-physical-sharded-azure failure 00:30:11
demand-backup-physical-sharded-gcp-native failure 00:30:27
demand-backup-physical-sharded-minio failure 00:30:15
demand-backup-physical-sharded-minio-native failure 00:30:12
demand-backup-sharded passed 00:26:32
disabled-auth passed 00:16:51
expose-sharded passed 00:33:36
finalizer passed 00:10:27
ignore-labels-annotations passed 00:07:48
init-deploy passed 00:12:41
ldap passed 00:08:53
ldap-tls passed 00:12:34
limits passed 00:05:59
liveness passed 00:08:45
mongod-major-upgrade passed 00:11:56
mongod-major-upgrade-sharded passed 00:21:06
monitoring-2-0 passed 00:24:26
monitoring-pmm3 passed 00:27:44
multi-cluster-service passed 00:12:59
multi-storage failure 00:22:46
non-voting-and-hidden failure 00:01:03
one-pod passed 00:08:12
operator-self-healing-chaos passed 00:12:46
pitr passed 00:32:16
pitr-physical failure 00:40:53
pitr-sharded passed 00:21:26
pitr-to-new-cluster failure 00:47:58
pitr-physical-backup-source failure 00:41:04
preinit-updates passed 00:05:18
pvc-auto-resize passed 00:13:22
pvc-resize passed 00:18:05
recover-no-primary passed 00:29:35
replset-overrides failure 00:18:47
replset-remapping failure 00:00:48
replset-remapping-sharded failure 00:00:47
rs-shard-migration passed 00:14:50
scaling passed 00:11:30
scheduled-backup passed 00:12:27
security-context passed 00:07:39
self-healing-chaos passed 00:15:06
service-per-pod passed 00:19:10
serviceless-external-nodes passed 00:07:31
smart-update passed 00:08:18
split-horizon passed 00:13:47
stable-resource-version passed 00:04:43
storage passed 00:07:35
tls-issue-cert-manager passed 00:30:16
unsafe-psa passed 00:07:40
upgrade passed 00:09:34
upgrade-consistency passed 00:07:42
upgrade-consistency-sharded-tls failure 00:00:46
upgrade-sharded passed 00:19:51
upgrade-partial-backup failure 00:00:44
users passed 00:17:16
users-vault passed 00:13:33
version-service failure 00:00:45
Summary Value
Tests Run 89/89
Job Duration 04:13:39
Total Test Time 27:33:46

commit: 091270b
image: perconalab/percona-server-mongodb-operator:PR-2247-091270b4


Labels

dependencies (Pull requests that update a dependency file), size/XXL (1000+ lines), tests


3 participants