Decommissioned node not removed from Hosts when TOPOLOGY_CHANGE event is missed during ControlConnection reconnection #202

@dkropachev

Description

When a ControlConnection reconnects to a new node during or shortly after a node decommission, there is a race condition that can cause the decommissioned node to permanently remain in the Metadata.Hosts collection.

Root cause

During reconnection the ControlConnection:

  1. Opens a connection to a new node
  2. Queries system.peers to build the node list
  3. Registers for server events (TOPOLOGY_CHANGE, STATUS_CHANGE, SCHEMA_CHANGE)

If a node is being decommissioned concurrently:

  • Step 2 may return stale data (the decommissioned node is still in system.peers)
  • The TOPOLOGY_CHANGE REMOVED_NODE event may have already been broadcast by the server before step 3 completes

Since there is no periodic node list refresh, the driver has no further trigger to re-query system.peers, and the decommissioned node stays in the Hosts collection indefinitely. The HostConnectionPool keeps attempting to reconnect to the dead node every 5 seconds forever.
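The race can be illustrated with a small toy model. This is only a sketch under simplifying assumptions: `FakeServer`, `FakeControlConnection`, and the `between_steps` hook are hypothetical names invented for illustration and are not part of the driver. It shows that an event fired between the peers query (step 2) and the event registration (step 3) is lost, leaving a stale host behind.

```python
class FakeServer:
    """Toy stand-in for the cluster (hypothetical, not a real driver or server API)."""

    def __init__(self, peers):
        self.peers = set(peers)
        self.subscribers = []

    def query_peers(self):
        # Models the system.peers query (step 2).
        return set(self.peers)

    def register(self, callback):
        # Models event registration (step 3).
        self.subscribers.append(callback)

    def decommission(self, node):
        # Removes the node and notifies only the subscribers registered so far.
        self.peers.discard(node)
        for callback in self.subscribers:
            callback("REMOVED_NODE", node)


class FakeControlConnection:
    """Toy model of the reconnection sequence described above."""

    def __init__(self):
        self.hosts = set()

    def on_event(self, kind, node):
        if kind == "REMOVED_NODE":
            self.hosts.discard(node)

    def reconnect(self, server, between_steps=None):
        self.hosts = server.query_peers()   # step 2: build the node list
        if between_steps is not None:
            between_steps()                 # race window: event fires here
        server.register(self.on_event)      # step 3: too late for that event


server = FakeServer({"n1", "n2", "n3"})
cc = FakeControlConnection()
# The decommission lands between step 2 and step 3, so the event is missed.
cc.reconnect(server, between_steps=lambda: server.decommission("n3"))

stale_hosts = sorted(cc.hosts)
current_peers = sorted(server.query_peers())
print(stale_hosts, current_peers)
```

In this model the host set still contains `n3` while the server no longer lists it, and nothing later corrects the mismatch because no further event arrives.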

Impact

  • Metadata.Hosts.Count reports an incorrect host count after decommission
  • GetReplicas() may return stale replica sets
  • Token map is not rebuilt to reflect the reduced cluster
  • Unnecessary reconnection attempts to the decommissioned node

Reproduction

The issue is observed in TokenMap_Should_RebuildTokenMap_When_NodeIsDecommissioned when two independent Cluster objects are connected to the same CCM cluster and one node is decommissioned. One cluster detects the decommission (it received the event), the other does not.

Proposed fix

Schedule a delayed node list refresh via the event debouncer after every successful ControlConnection reconnection. This re-queries system.peers ~1 second later, catching any topology changes that were missed during the reconnection window.
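A minimal sketch of the idea, assuming a simple timer-based debouncer. `RefreshDebouncer`, `refresh_node_list`, and `on_reconnected` are hypothetical names for illustration and do not reflect the driver's actual event debouncer. The delay is shortened here so the example finishes quickly; the proposal above uses roughly 1 second.

```python
import threading


class RefreshDebouncer:
    """Runs `refresh` once, `delay` seconds after the most recent schedule() call."""

    def __init__(self, delay, refresh):
        self._delay = delay
        self._refresh = refresh
        self._lock = threading.Lock()
        self._timer = None

    def schedule(self):
        with self._lock:
            # Cancel any pending refresh so repeated requests are coalesced.
            if self._timer is not None:
                self._timer.cancel()
            self._timer = threading.Timer(self._delay, self._refresh)
            self._timer.daemon = True
            self._timer.start()

    def wait(self):
        # Blocks until the most recently scheduled timer has finished.
        with self._lock:
            timer = self._timer
        if timer is not None:
            timer.join()


# Toy state: what a system.peers query would return now vs. the stale host set.
peers = {"n1", "n2"}
hosts = {"n1", "n2", "n3"}
refresh_calls = []


def refresh_node_list():
    # Models re-querying system.peers and reconciling the host set with it.
    refresh_calls.append(1)
    hosts.intersection_update(peers)
    hosts.update(peers)


def on_reconnected(debouncer):
    # Called after every successful reconnection (assumed hook).
    debouncer.schedule()


debouncer = RefreshDebouncer(0.2, refresh_node_list)
on_reconnected(debouncer)
on_reconnected(debouncer)  # a second reconnection reuses the same pending slot
debouncer.wait()
print(sorted(hosts))
```

Routing the refresh through a debouncer means several reconnections in quick succession should result in a single extra query rather than one per reconnection.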
