
[Bug] ROS 2 services fail across zenoh-router + zenoh-bridge-ros2dds topology #642

@bigredfrog

Describe the bug

I cannot get even a simple test service to work across a zenoh-bridge-ros2dds <-> zenohd router <-> zenoh-bridge-ros2dds topology.

All topics seem solid, but any service call times out / hangs.

I can see the service call arriving at the host of the service, but never get a response back.

I am happy to help with any investigation. I sincerely hope I am just doing something silly, but I can't find it.

The following is a summary of the scenario as I understand it (obviously regurgitated by ChatGPT).

ROS 2 services fail across zenoh-router + zenoh-bridge-ros2dds topology:
service requests arrive at the server, but responses never reach the client.

Topics work reliably; only services/actions are affected.

This reproduces consistently with the demo_nodes_cpp/add_two_ints example.

Versions

zenoh: 1.7.2

zenoh-bridge-ros2dds: 1.7.2

ROS 2: Jazzy (same base install on both ends)

OS: Ubuntu 24.04 (IPC host: bare metal + Docker; client laptop: Ubuntu under WSL2 on a Windows host)

Topology
ROS2 client (laptop)
|
| ros2dds bridge (client mode)
| ./zenoh-bridge-ros2dds -d 0 client -e tcp/<IPC_HOST>:7447
|
Zenoh router (zenohd) on IPC host
|
| ros2dds bridge (client mode, Docker, network_mode=host)
| /opt/zenoh-bridge-ros2dds -d 0 client -e tcp/127.0.0.1:7447
|
ROS2 server (IPC host / Docker)

Router config is minimal:

{
  listen: {
    endpoints: ["tcp/0.0.0.0:7447"]
  }
}

No allow/deny lists, no namespaces, no filtering

IPC bridge container uses network_mode: host
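
As a sanity check on the listen endpoint above, here is a minimal sketch that only tests whether a TCP connection to the router port can be opened. The function name `endpoint_reachable` and the 2-second timeout are assumptions made for illustration; it does not speak the zenoh protocol, so a `True` result only shows that something accepts connections on that port.

```python
import socket


def endpoint_reachable(host: str, port: int = 7447, timeout_s: float = 2.0) -> bool:
    """Return True if a plain TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the host and applies the timeout to connect().
        with socket.create_connection((host, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers connection refused, unreachable host, DNS failure and timeout.
        return False
```

On the IPC host this would be called as `endpoint_reachable("127.0.0.1")`, and on the laptop with the IPC host address in place of `127.0.0.1`.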

Reproduction Steps
On IPC host (server side)
ros2 run demo_nodes_cpp add_two_ints_server

On remote client
ros2 run demo_nodes_cpp add_two_ints_client

Expected Behavior

Client receives a response:

result: 5

Observed Behavior

Server receives and logs every request:

[INFO] [add_two_ints_server]: Incoming request
a: 2 b: 3

Client blocks indefinitely (or until timeout)

No response is delivered back to the client
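
To put a number on "blocks indefinitely", a sketch like the following could wrap a single service call in a hard timeout and report how long it took. It assumes the demo service is reachable as `/add_two_ints` with type `example_interfaces/srv/AddTwoInts`; `timed_call` and `ADD_TWO_INTS_CALL` are hypothetical names, not part of ROS 2 or zenoh.

```python
import subprocess
import time

# Assumed service name and type for the demo_nodes_cpp add_two_ints server.
ADD_TWO_INTS_CALL = [
    "ros2", "service", "call",
    "/add_two_ints", "example_interfaces/srv/AddTwoInts", "{a: 2, b: 3}",
]


def timed_call(cmd, timeout_s=10.0):
    """Run cmd and return (status, elapsed_seconds, stdout).

    status is "ok", "failed" (non-zero exit) or "timeout".
    """
    start = time.monotonic()
    try:
        proc = subprocess.run(cmd, capture_output=True, text=True, timeout=timeout_s)
        status = "ok" if proc.returncode == 0 else "failed"
        output = proc.stdout
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child before raising, so nothing is left hanging.
        status, output = "timeout", ""
    return status, time.monotonic() - start, output
```

Running `timed_call(ADD_TWO_INTS_CALL)` in a ROS-sourced shell on the client and again on the IPC host gives a direct comparison between the bridged path and the local one.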

Zenoh / Bridge Logs
Client-side ros2dds bridge
Route Service Client (ROS:/add_two_ints -> Zenoh:add_two_ints):
received error as reply: ReplyError { payload: "Timeout" }

Didn't receive final reply for query ... Timeout(5s)
Route reply: Query not found!
Route final reply: Query not found!

IPC-side ros2dds bridge
Didn't receive final reply for query ... Timeout(5s)

zenohd router
Didn't receive final reply for query ... Timeout(5s)
Route reply: Query not found!
Route final reply: Query not found!

This pattern is consistent across all components:

query times out after exactly 5 seconds

late reply frames arrive after the query context is destroyed
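
To compare the three logs side by side, a small scanner along these lines could count the messages quoted above. The regular expressions are derived only from the excerpts in this report, so the exact wording in a full log may differ; `PATTERNS` and `summarize` are names made up for this sketch.

```python
import re
from collections import Counter

# Patterns taken from the log excerpts in this report (assumed to be representative).
PATTERNS = {
    "query_timeout": re.compile(r"Didn't receive final reply for query .*Timeout\((\d+)s\)"),
    "reply_error_timeout": re.compile(r'ReplyError \{ payload: "Timeout" \}'),
    "late_reply": re.compile(r"Route reply: Query not found!"),
    "late_final_reply": re.compile(r"Route final reply: Query not found!"),
}


def summarize(log_text: str):
    """Return (counts per pattern name, set of timeout values in seconds)."""
    counts = Counter()
    timeouts = set()
    for line in log_text.splitlines():
        for name, pattern in PATTERNS.items():
            match = pattern.search(line)
            if match:
                counts[name] += 1
                if name == "query_timeout":
                    timeouts.add(int(match.group(1)))
    return counts, timeouts
```

Feeding each component's log through `summarize` shows which of them report late replies and whether the timeout value is the same everywhere.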

Key Observations

Pub/sub topics are healthy and stable

Service request path works (server receives requests immediately)

Service response path is broken

This affects:

demo_nodes_cpp/add_two_ints

controller_manager/list_controllers

std_srvs/Empty services

What We Ruled Out

❌ ROS message / interface availability (identical installs)

❌ Docker networking issues (network_mode: host)

❌ Clock skew / NTP issues (topic timestamp deltas ~1 ms)

❌ Discovery problems (services are visible on both ends)

❌ Router ACLs / allowlists (none configured)

❌ Peer mode instability (router topology intentionally used)

Working Hypothesis

This appears to be a zenoh query/reply lifecycle issue affecting ros2dds services:

Query context times out after 5s

Service reply or final reply arrives after timeout

Zenoh core rejects it as Query not found

The failure occurs consistently across router + bridge logs, suggesting this is not ROS serialization, but rather query tracking / final reply handling in the zenoh ↔ ros2dds service path.
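
The hypothesised sequence can be illustrated with a toy model of a pending-query table. This is not zenoh's implementation; the class `PendingQueries`, its methods, and the explicit `now` timestamps are assumptions chosen only to show how a reply that arrives after the timeout ends up being reported as "Query not found!".

```python
class PendingQueries:
    """Toy table of in-flight queries keyed by id (illustrative only)."""

    def __init__(self, timeout_s=5.0):
        self.timeout_s = timeout_s
        self._deadlines = {}  # query id -> absolute deadline in seconds

    def open(self, qid, now):
        """Register a new query at time `now`."""
        self._deadlines[qid] = now + self.timeout_s

    def expire(self, now):
        """Drop every query whose deadline has passed and return their ids."""
        expired = [qid for qid, deadline in self._deadlines.items() if deadline <= now]
        for qid in expired:
            del self._deadlines[qid]
        return expired

    def on_reply(self, qid, now):
        """Handle a reply arriving at time `now` for query `qid`."""
        self.expire(now)
        if qid not in self._deadlines:
            # The query context is already gone, so the reply has nowhere to go.
            return "Query not found!"
        del self._deadlines[qid]
        return "delivered"
```

In this model a reply at 0.2 s is delivered, while the same reply at 5.3 s is rejected, which matches the order of messages seen in the logs. It does not explain why the reply would be late or lost in the first place.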

Why This Matters

This makes ROS 2 services and actions unusable across zenoh routing, even though:

Topics work correctly

Requests reach the server

The server executes immediately

This blocks real systems (MoveIt, controller_manager, lifecycle, actions).

To reproduce

Steps to Reproduce

  1. Start a Zenoh router

On the IPC host:

/opt/zenoh/zenohd

(or with an explicit config)

/opt/zenoh/zenohd -c /opt/zenoh_router.json5

{
  listen: {
    endpoints: ["tcp/0.0.0.0:7447"]
  }
}

  2. Start the ros2dds bridge on the IPC host (server side)

This bridge runs inside a Docker container with network_mode: host.

/opt/zenoh_ros2dds/zenoh-bridge-ros2dds -d 0 client -e tcp/127.0.0.1:7447

  3. Start the ros2dds bridge on the remote client

On the client machine:

./zenoh-bridge-ros2dds -d 0 client -e tcp/<IPC_HOSTNAME_OR_IP>:7447

  4. Start a ROS 2 service server on the IPC host

In a ROS-sourced shell on the IPC host:

ros2 run demo_nodes_cpp add_two_ints_server

  5. Call the service from the remote client

In a ROS-sourced shell on the remote client:

ros2 run demo_nodes_cpp add_two_ints_client

  6. Observe the failure

The server prints:

[INFO] [add_two_ints_server]: Incoming request
a: 2 b: 3

The client never receives a response and blocks until timeout.

Zenoh and ros2dds logs show:

Timeout(5s)

Didn't receive final reply for query

Route reply: Query not found!

  7. Control check (local works)

On the IPC host (no zenoh):

ros2 run demo_nodes_cpp add_two_ints_client

This works immediately, confirming the service itself is healthy.

Notes

The issue reproduces consistently.

Pub/sub topics work correctly in the same topology.

No allow/deny lists, namespaces, or filters are configured.

Both bridges and router are version 1.7.2.

System info

System Information
Software Versions

zenoh: 1.7.2

zenoh-bridge-ros2dds: 1.7.2

ROS 2 distro: Jazzy Jalisco

DDS RMW: default (CycloneDDS / rmw_cyclonedds_cpp)

demo used: demo_nodes_cpp/add_two_ints

Server Side (IPC Host)

OS: Ubuntu 24.04 LTS (bare metal)

ROS 2: Jazzy (standard install)

zenohd: running natively on host

ros2dds bridge:

runs inside Docker

Docker network_mode: host

launched as:

/opt/zenoh_ros2dds/zenoh-bridge-ros2dds -d 0 client -e tcp/127.0.0.1:7447

Service server:

ros2 run demo_nodes_cpp add_two_ints_server

Client Side (Remote Laptop)

OS: Ubuntu (WSL2 on Windows host)

ROS 2: Jazzy (same base install as server)

zenoh-bridge-ros2dds: 1.7.2 (local binary)

ros2dds bridge launch:

./zenoh-bridge-ros2dds -d 0 client -e tcp/<IPC_HOSTNAME_OR_IP>:7447

Service client:

ros2 run demo_nodes_cpp add_two_ints_client

Network / Topology Notes

Zenoh router listens on tcp/0.0.0.0:7447

Both ros2dds bridges connect to the same router

No allow/deny lists, namespaces, filters, or ACLs configured

Docker bridge uses host networking (no NAT)

Pub/sub topics are stable and low-latency

Issue reproduces consistently across runs
