Discovery Server becomes unresponsive with a large number of participants #5682
Closed
thomasmoore-torc started this conversation in Support
Replies: 0 comments
Overview
In our system, we are utilizing the Discovery Server and have a single participant that subscribes to all of the other topics in the system. We have noticed that some of the subscriptions in this participant will go unmatched despite the topic being successfully matched in other subscribers. While looking into this issue, we've discovered that it is possible to make the discovery server become unresponsive when there are a sufficient number of new participants. While the scenario below is likely more aggressive than most systems, it is effective in demonstrating how the discovery server can become unresponsive under load, which could be a contributing factor to our observed issue of subscribers not being matched.
Our testing of the below scenario was done on an Ubuntu 22.04 system with the current ROS2 Iron packages installed, which use Fast-DDS version 2.10.6. Similar results were observed with a compiled version of ROS2 Jazzy using Fast-DDS version 2.14.1. We have tried several things to attempt to improve the behavior, but have been unsuccessful:
- Setting `leaseAnnouncement` and `leaseDuration` to `DURATION_INFINITY` in the XML config
- Changing `clientAnnouncementPeriod` in the XML config
Scenario
The scenario below is executed on a single machine with 5 terminals.
In terminal 1, start an instance of `fast-discovery-server`.
In terminal 2, start an instance of `htop` to monitor the CPU utilization of `fast-discovery-server`.
In terminal 3, start 100 publishers, which will cause the CPU utilization of `fast-discovery-server` to increase significantly.
In terminal 4, attempt to run `ros2 topic list`, which will fail to return any of the `/chatter_{N}` topics while the CPU utilization of `fast-discovery-server` remains high.
In terminal 5, start 100 subscribers, which will cause the CPU utilization of `fast-discovery-server` to increase significantly. The subscribers never echo any topic data and never return.
At this point, attempting to run `ros2 topic list` in terminal 4 will fail to show any of the `/chatter_{N}` topics, as the discovery server is completely unresponsive.
As a data point, if `ROS_DISCOVERY_SERVER` is not set in terminals 3 and 5, so that simple discovery is used, the `ros2 topic echo` commands will eventually display a topic and exit. It does take some time, but it does eventually work.
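
As a rough sketch of the fan-out in terminals 3 and 5, the publishers and subscribers could be launched with a pair of small shell helpers like the ones below. The helper names (`run`, `start_publishers`, `start_subscribers`), the `std_msgs/msg/String` payload, and the use of `--once` on the echo side are assumptions for illustration, not the commands from the original post. `DRY_RUN` defaults to 1, so the helpers only print what they would launch.

```shell
#!/bin/sh
# Sketch only: helper names, message type, and flags are assumptions,
# not the commands used in the original report.

COUNT="${COUNT:-100}"      # number of /chatter_N topics
DRY_RUN="${DRY_RUN:-1}"    # 1 = print commands, 0 = launch them

# Print the command when DRY_RUN=1, otherwise launch it in the background.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@" &
  fi
}

# Terminal 3: one publisher per topic /chatter_1 .. /chatter_COUNT.
start_publishers() {
  i=1
  while [ "$i" -le "$COUNT" ]; do
    run ros2 topic pub "/chatter_$i" std_msgs/msg/String "{data: hello}"
    i=$((i + 1))
  done
}

# Terminal 5: one subscriber per topic; --once exits after the first message.
start_subscribers() {
  i=1
  while [ "$i" -le "$COUNT" ]; do
    run ros2 topic echo --once "/chatter_$i"
    i=$((i + 1))
  done
}
```

To exercise the discovery server with this sketch, one would export `ROS_DISCOVERY_SERVER` (for example `127.0.0.1:11811`, assuming the server from terminal 1 listens on the default port) and set `DRY_RUN=0` before calling `start_publishers` in terminal 3 and `start_subscribers` in terminal 5, followed by `wait`. Leaving `ROS_DISCOVERY_SERVER` unset gives the simple-discovery comparison described above.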