Skip to content

Commit

Permalink
neonvm-controller: Don't retry update on conflict (#796)
Browse files Browse the repository at this point in the history
The previous logic would:

1. Try to update the status
2. If we got a conflict, overwrite the current VirtualMachine object
equal to the state on the API server
3. Then, retry with the udpated object (without changing anything!)

This is basically doing extra work for nothing!

We discussed potentially changing this to overwrite the status on
conflict, but this could result in issues when operating on stale data
(e.g., overwriting .status.podName back to "" after it was already
created).

ref https://www.notion.so/neondatabase/Autoscaling-Team-Internal-Sync-179c7d597dbb4fe5b565d9c482d4d166
ref https://neondb.slack.com/archives/C03TN5G758R/p1707414998235459
  • Loading branch information
sharnoff authored Feb 9, 2024
1 parent 52a883b commit f723811
Showing 1 changed file with 6 additions and 25 deletions.
31 changes: 6 additions & 25 deletions neonvm/controllers/virtualmachine_controller.go
Original file line number Diff line number Diff line change
Expand Up @@ -168,32 +168,13 @@ func (r *VirtualMachineReconciler) Reconcile(ctx context.Context, req ctrl.Reque
return ctrl.Result{}, err
}

// update status after reconcile loop (try 10 times)
statusNow := statusBefore.DeepCopy()
statusNew := virtualmachine.Status.DeepCopy()
try := 1
for try < 10 {
if !DeepEqual(statusNow, statusNew) {
// update VirtualMachine status
// log.Info("DEBUG", "StatusNow", statusNow, "StatusNew", statusNew, "attempt", try)
if err := r.Status().Update(ctx, &virtualmachine); err != nil {
if apierrors.IsConflict(err) {
try++
time.Sleep(time.Second)
// re-get statusNow from current state
statusNow = virtualmachine.Status.DeepCopy()
continue
}
log.Error(err, "Failed to update VirtualMachine status after reconcile loop",
"virtualmachine", virtualmachine.Name)
return ctrl.Result{}, err
}
// If the status changed, try to update the object
if !DeepEqual(statusBefore, virtualmachine.Status) {
if err := r.Status().Update(ctx, &virtualmachine); err != nil {
log.Error(err, "Failed to update VirtualMachine status after reconcile loop",
"virtualmachine", virtualmachine.Name)
return ctrl.Result{}, err
}
// status updated (before and now are equal)
break
}
if try >= 10 {
return ctrl.Result{}, fmt.Errorf("unable update .status for virtualmachine %s in %d attempts", virtualmachine.Name, try)
}

return ctrl.Result{RequeueAfter: time.Second}, nil
Expand Down

0 comments on commit f723811

Please sign in to comment.