@xsu-git xsu-git commented Aug 28, 2025

What kind of change does this PR introduce? (check at least one)

  • Bugfix
  • Feature
  • Code style update
  • Refactor
  • Build-related changes
  • Other, please describe:

The description of the PR:
There is a TOCTOU (time-of-check to time-of-use) window in the current implementation. If an executor sends a heartbeat that refreshes update_time after findDead has run but before removeDead executes, its ID is still in the list of dead IDs and the record is deleted anyway. A live node is therefore treated as dead and removed by mistake.

In practice, this shows up as executors being cleared even though their heartbeats are up to date.
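
The race described above can be illustrated with a minimal in-memory sketch. This is not the project's code: RegistrySketch, removeDeadById, and removeDeadIfStillStale are hypothetical names, and a map stands in for the registry table. It only shows one common way to close such a window, which is to re-check staleness atomically at delete time instead of trusting the earlier ID list. With a SQL-backed registry, the analogous change would presumably be to repeat the update_time condition in the DELETE statement's WHERE clause.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for the registry table:
// id -> last heartbeat time (update_time) in milliseconds.
final class RegistrySketch {
    private final Map<Integer, Long> updateTimeById = new ConcurrentHashMap<>();

    // Executor heartbeat: refreshes update_time for the given id.
    void heartbeat(int id, long nowMillis) {
        updateTimeById.put(id, nowMillis);
    }

    // Check step: collect ids whose last heartbeat is older than the timeout.
    List<Integer> findDead(long timeoutMillis, long nowMillis) {
        List<Integer> dead = new ArrayList<>();
        for (Map.Entry<Integer, Long> e : updateTimeById.entrySet()) {
            if (nowMillis - e.getValue() > timeoutMillis) {
                dead.add(e.getKey());
            }
        }
        return dead;
    }

    // Racy use step: deletes by id alone, so a heartbeat that arrived
    // after findDead does not save the record.
    int removeDeadById(List<Integer> ids) {
        int removed = 0;
        for (Integer id : ids) {
            if (updateTimeById.remove(id) != null) {
                removed++;
            }
        }
        return removed;
    }

    // Guarded use step: re-checks staleness atomically per id and
    // deletes only records that are still stale at delete time.
    int removeDeadIfStillStale(List<Integer> ids, long timeoutMillis, long nowMillis) {
        int removed = 0;
        for (Integer id : ids) {
            boolean[] deleted = {false};
            updateTimeById.computeIfPresent(id, (key, lastBeat) -> {
                if (nowMillis - lastBeat > timeoutMillis) {
                    deleted[0] = true;
                    return null; // returning null removes the mapping
                }
                return lastBeat; // refreshed in the meantime: keep it
            });
            if (deleted[0]) {
                removed++;
            }
        }
        return removed;
    }

    boolean contains(int id) {
        return updateTimeById.containsKey(id);
    }
}
```

If executor 1 heartbeats between findDead and the removal, removeDeadById still deletes it, while removeDeadIfStillStale keeps it and removes only the records that are stale at the moment of deletion.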

Other information:
