- 
                Notifications
    
You must be signed in to change notification settings  - Fork 68
 
Description
Generated by Generative AI
No
Operating System:
Linux XPS-15-9520 6.8.0-78-generic #78-Ubuntu SMP PREEMPT_DYNAMIC Tue Aug 12 11:34:18 UTC 2025 x86_64 x86_64 x86_64 GNU/Linux
ROS version or commit hash:
Rolling binary http://packages.ros.org/ros2/ubuntu noble/main amd64 ros-rolling-rmw-zenoh-cpp amd64 0.9.0-1noble.20250822.215306
RMW implementation (if applicable):
zenoh
RMW Configuration (if applicable):
No response
Client library (if applicable):
No response
'ros2 doctor --report' output
ros2 doctor --report
<COPY OUTPUT HERE>Steps to reproduce issue
- run nav2 with
 
ros2 launch nav2_bringup tb3_simulation_launch.py headless:=0
- wait until initial transform timeout (around 1 minute) and see the error from zenoh
 
Expected behavior
Receive a response of false from the change_state service server, and nav2 abort bringup
[ERROR 1757400210.592444579] [global_costmap.global_costmap]: Failed to activate global_costmap because transform from base_link to map did not become available before timeout (on_activate() at /root/nav2_ws/src/navigation2/nav2_costmap_2d/src/costmap_2d_ros.cpp:304)
[component_container_isolated-7] [ERROR 1757400210.593177984] [lifecycle_manager_navigation]: Failed to change state for node: planner_server (changeStateForNode() at /root/nav2_ws/src/navigation2/nav2_lifecycle_manager/src/lifecycle_manager.cpp:268)
[component_container_isolated-7] [ERROR 1757400210.593291041] [lifecycle_manager_navigation]: Failed to bring up all requested nodes. Aborting bringup. (startup() at /root/nav2_ws/src/navigation2/nav2_lifecycle_manager/src/lifecycle_manager.cpp:335)
Actual behavior
The change_state service does not return false, instead I see a couple of errors from the zenoh side
[component_container_isolated-8] [ERROR 1757390757.072415008] [rmw_zenoh_cpp]: z_reply_is_ok returned False Reason: Timeout (add_new_query() at ./src/detail/rmw_service_data.cpp:252)
[component_container_isolated-8] 2025-09-09T04:05:58.074188Z  WARN ThreadId(12) zenoh::api::session: Received ReplyData for unknown Query: 77
[component_container_isolated-8] 2025-09-09T04:05:58.074224Z  WARN ThreadId(12) zenoh::api::session: Received ResponseFinal for unknown Request: 77
[component_container_isolated-8] [ERROR 1757390758.431290576] [rmw_zenoh_cpp]: Query queue depth of 10 reached, discarding oldest Query for service for 0/lifecycle_manager_navigation/is_active/std_srvs::srv::dds_::Trigger_/RIHS01_eeff2cd6fa5ad9d27cdf4dec64818317839b62f212a91e6b5304b634b2062c5f (add_new_query() at ./src/detail/rmw_service_data.cpp:252)
I believe it hangs over there and never receive the result from the service server side
Additional information
I was trying to setup rmw_zenoh for Nav2 in this PR ros-navigation/navigation2#5516, one of the system tests is failing here https://app.circleci.com/pipelines/github/ros-navigation/navigation2/16036/workflows/7e2c6499-4b74-4e47-9c88-cf5953de2fd1/jobs/46834. Note that this test passes on fastrtps / cyclonedds, and only fails on zenoh.
I will illustrate the working flow here:
- After we launch the tb3 simulation, we are changing the states for the lifecycle nodes, the planner server, when calling 
on_activate, https://github.com/ros-navigation/navigation2/blob/ab4aaf863aea6a7d1e67e0c2ad536ec1f0c8ccf7/nav2_planner/src/planner_server.cpp#L180 will activate global costmap. However, for global costmap to be activated we typically need a transform from map -> odom -> robot_base_frame, which is unavailable unless we provide an initial pose. - Suppose we never provide the initial pose, then costmap ros should fail to activate when it reaches initial transform timeout, https://github.com/ros-navigation/navigation2/blob/ab4aaf863aea6a7d1e67e0c2ad536ec1f0c8ccf7/nav2_costmap_2d/src/costmap_2d_ros.cpp#L303-L310, planner will also return failure here https://github.com/ros-navigation/navigation2/blob/ab4aaf863aea6a7d1e67e0c2ad536ec1f0c8ccf7/nav2_planner/src/planner_server.cpp#L181-L183 and fail to change state
 - The lifecycle manager, who was trying to change planner server's state earlier https://github.com/ros-navigation/navigation2/blob/ab4aaf863aea6a7d1e67e0c2ad536ec1f0c8ccf7/nav2_lifecycle_manager/src/lifecycle_manager.cpp#L264-L269, should say "failed to change node for planner server" after
 
!node_map_[node_name]->change_state(transition, std::chrono::milliseconds(-1),
      service_timeout_receiving the response(false) from service server. However, it seems that we are unable to get any response from the service server when running with rmw zenoh