[Bug] DDS reader leak: duplicate endpoints accumulate after publisher restart #700

@nypyp

Describe the bug

zenoh-bridge-ros2dds v1.9.0 accumulates duplicate CycloneDDS reader endpoints when a ROS 2 publisher is killed and restarted while subscribers disconnect and reconnect.
Symptoms:

  • Subscriber receives (N+1)× the nominal frequency after N kill/restart cycles (10 Hz → 20 Hz → 30 Hz...)
  • ros2 topic info -v /topic shows multiple zenoh_bridge_ros2dds subscription endpoints with different GIDs
  • Subscription count increases by 1 per cycle

To reproduce

Environment: Ubuntu 24.04 (Jazzy, hub) + Ubuntu 22.04 (Humble, client), TCP connection
Trigger sequence (no special config — reproduces with default zenoh-bridge-ros2dds):

# Hub side
ROS_DOMAIN_ID=10 zenoh-bridge-ros2dds
# Client side (other machine)
ROS_DOMAIN_ID=11 zenoh-bridge-ros2dds -e tcp/<hub-ip>:7447
# Publisher (hub side)
ROS_DOMAIN_ID=10 ros2 topic pub -r 10 /test std_msgs/msg/String "data: hello"
# Subscriber (client side)
ROS_DOMAIN_ID=11 ros2 topic hz /test  # → 10 Hz

Then reproduce the leak:

1. Kill the publisher and the subscriber (Ctrl+C each) so that both are down at the same time
2. Restart publisher, restart subscriber
3. ros2 topic info -v /test → Subscription count: 2 (was 1)
4. Repeat → 3, 4...

After one cycle, ros2 topic info -v /test shows:

Subscription count: 2
  zenoh_bridge_ros2dds  GID: ...0f.04
  zenoh_bridge_ros2dds  GID: ...10.04

The subscriber also shows ~20 Hz instead of 10 Hz.

Root Cause

RoutePublisher::create() in zenoh-plugin-ros2dds/src/route_publisher.rs registers a matching listener via .background():

publisher
    .matching_listener()
    .callback(|status| {
        if status.matching() {
            activate_dds_reader(&dds_reader, ...) // creates DDS reader
        }
    })
    .background()  // ← callback outlives RoutePublisher
    .await?;

.background() detaches the callback from the RoutePublisher lifecycle. When the route is destroyed and re-created, the old background callback remains registered with its own clone of Arc<AtomicDDSEntity>.

On subscriber reconnect:

  1. Old callback fires → activates reader on its dds_reader clone
  2. New callback fires → activates reader on new dds_reader
    Result: two DDS readers, two distinct GIDs → 2× frequency.
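
The mechanism described above can be modelled without zenoh or CycloneDDS. The following is only a sketch using the standard library: ListenerRegistry, Route, and readers_after_restart_cycles are hypothetical stand-ins invented for illustration (they are not types from zenoh-plugin-ros2dds), and the registry simply assumes, as this report argues, that a detached callback stays registered after its route is dropped.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

type Callback = Box<dyn Fn(bool) + Send>;

/// Hypothetical stand-in for wherever detached listeners live.
/// Assumption: a registered callback stays here after its route is gone.
#[derive(Default)]
struct ListenerRegistry {
    callbacks: Mutex<Vec<Callback>>,
}

impl ListenerRegistry {
    fn register_background(&self, cb: Callback) {
        self.callbacks.lock().unwrap().push(cb);
    }

    /// Simulates a matching-status change, e.g. a subscriber reconnecting.
    fn notify_matching(&self, matching: bool) {
        for cb in self.callbacks.lock().unwrap().iter() {
            cb(matching);
        }
    }
}

/// Hypothetical stand-in for RoutePublisher; it owns nothing that
/// could stop the callback it registered.
struct Route;

impl Route {
    fn create(registry: &ListenerRegistry, active_readers: Arc<AtomicUsize>) -> Route {
        registry.register_background(Box::new(move |matching| {
            if matching {
                // Stands in for activating a DDS reader.
                active_readers.fetch_add(1, Ordering::SeqCst);
            }
        }));
        Route
    }
}

/// Counts the readers activated by one subscriber reconnect after
/// `cycles` destroy/re-create cycles of the route.
fn readers_after_restart_cycles(cycles: usize) -> usize {
    let registry = ListenerRegistry::default();
    let active_readers = Arc::new(AtomicUsize::new(0));
    let mut route = Route::create(&registry, active_readers.clone());
    for _ in 0..cycles {
        drop(route); // route destroyed, but its callback is still registered
        route = Route::create(&registry, active_readers.clone());
    }
    registry.notify_matching(true);
    drop(route);
    active_readers.load(Ordering::SeqCst)
}
```

In this model, readers_after_restart_cycles(1) yields 2 and readers_after_restart_cycles(2) yields 3, which lines up with the subscription counts reported in the reproduction steps.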

Proposed Fix

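Add a cancelled: Arc<AtomicBool> field to RoutePublisher, check it in the matching-listener callback, and set it in the Drop implementation so that a stale callback becomes a no-op. That is three changes in route_publisher.rs. Happy to submit a PR if accepted.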
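
A minimal sketch of the cancellation-flag idea, using only the standard library. This is not the actual patch: ListenerRegistry, GuardedRoute, and guarded_readers_after_restart_cycles are hypothetical names used for illustration, and the registry again assumes that detached callbacks stay registered after their route is dropped.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::{Arc, Mutex};

type Callback = Box<dyn Fn(bool) + Send>;

/// Hypothetical stand-in for wherever detached listeners live.
#[derive(Default)]
struct ListenerRegistry {
    callbacks: Mutex<Vec<Callback>>,
}

impl ListenerRegistry {
    fn register_background(&self, cb: Callback) {
        self.callbacks.lock().unwrap().push(cb);
    }

    fn notify_matching(&self, matching: bool) {
        for cb in self.callbacks.lock().unwrap().iter() {
            cb(matching);
        }
    }
}

/// Hypothetical stand-in for RoutePublisher with the proposed flag.
struct GuardedRoute {
    cancelled: Arc<AtomicBool>,
}

impl GuardedRoute {
    fn create(registry: &ListenerRegistry, active_readers: Arc<AtomicUsize>) -> GuardedRoute {
        let cancelled = Arc::new(AtomicBool::new(false));
        let flag = cancelled.clone();
        registry.register_background(Box::new(move |matching| {
            // Change 2: a callback whose route was dropped does nothing.
            if flag.load(Ordering::SeqCst) {
                return;
            }
            if matching {
                // Stands in for activating a DDS reader.
                active_readers.fetch_add(1, Ordering::SeqCst);
            }
        }));
        // Change 1: the route keeps its own handle to the flag.
        GuardedRoute { cancelled }
    }
}

impl Drop for GuardedRoute {
    // Change 3: dropping the route cancels its detached callback.
    fn drop(&mut self) {
        self.cancelled.store(true, Ordering::SeqCst);
    }
}

/// Counts the readers activated by one subscriber reconnect after
/// `cycles` destroy/re-create cycles of the guarded route.
fn guarded_readers_after_restart_cycles(cycles: usize) -> usize {
    let registry = ListenerRegistry::default();
    let active_readers = Arc::new(AtomicUsize::new(0));
    let mut route = GuardedRoute::create(&registry, active_readers.clone());
    for _ in 0..cycles {
        drop(route); // sets the flag for the old callback
        route = GuardedRoute::create(&registry, active_readers.clone());
    }
    registry.notify_matching(true);
    drop(route);
    active_readers.load(Ordering::SeqCst)
}
```

In this sketch the stale closures stay registered; the flag only makes them inert, so exactly one reader is activated per reconnect regardless of how many times the route was re-created.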

System info

| | Hub | Client |
|---|---|---|
| Platform | x86_64 | x86_64 |
| OS | Ubuntu 24.04.4 LTS | Ubuntu 22.04 LTS |
| CPU | Intel Xeon W-2245 (16 cores) | - |
| ROS 2 | Jazzy | Humble |
| zenoh-bridge-ros2dds | v1.9.0 (49c5764) | v1.9.0 (49c5764) |
| RMW | rmw_cyclonedds_cpp | rmw_cyclonedds_cpp |
