Many of the follower resets that we see happen because append entries
(AEs) from previous terms can still be in flight during leadership
transfers. High or variable latency makes the problem worse.

When a follower receives an AE from an old term today, it resets,
nukes its WAL and runs a catch-up regardless of the integrity of its
log. This ignores the fact that the node may otherwise have been
functioning normally and that network latency may be worse on some
links than others. Silently dropping AEs from previous terms and
restricting the reset behaviour to the last log term and last log
index reduces the number of resets considerably whilst maintaining
log consistency.
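
A minimal sketch of the decision described above, under one reading of
it: an AE whose term is older than the follower's current term is
dropped, and a reset is considered only when the AE's previous term and
index do not match the follower's last log term and last log index.
The type, field and function names here (appendEntry, follower, decide)
are hypothetical, chosen for illustration, and are not taken from the
actual change, which has to handle more cases than this.

```go
package main

import "fmt"

// action is what the follower decides to do with an incoming append entry.
type action int

const (
	accept action = iota // entry lines up with the local log
	drop                 // stale entry from a previous term, ignored silently
	reset                // log diverged, truncate the WAL and catch up
)

// appendEntry carries only the fields this sketch needs (hypothetical names).
type appendEntry struct {
	term   uint64 // leader's term when the entry was sent
	pterm  uint64 // term of the entry preceding this one
	pindex uint64 // index of the entry preceding this one
}

// follower holds the local state the decision depends on (hypothetical names).
type follower struct {
	term      uint64 // current term known to this follower
	lastTerm  uint64 // term of the last entry in the local log
	lastIndex uint64 // index of the last entry in the local log
}

// decide classifies an incoming append entry.
func (f *follower) decide(ae appendEntry) action {
	// An AE from a previous term was in flight during a leadership
	// change. It says nothing about the health of the local log.
	if ae.term < f.term {
		return drop
	}
	// Only a mismatch against the last log term and last log index is
	// treated as a reason to reset in this sketch.
	if ae.pterm != f.lastTerm || ae.pindex != f.lastIndex {
		return reset
	}
	return accept
}

func main() {
	f := &follower{term: 5, lastTerm: 5, lastIndex: 10}
	stale := appendEntry{term: 4, pterm: 4, pindex: 9}
	fmt.Println("stale AE dropped:", f.decide(stale) == drop)
}
```

The term check comes first so that a delayed AE from an old leader can
never reach the comparison that triggers a reset.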
Co-authored-by: Reuben Ninan <[email protected]>
Signed-off-by: Neil Twigg <[email protected]>