Cannot Switch to Auto Mode on Second Autoware Instance over LAN with Zenoh Bridge #150

Open
Justin-Xiang opened this issue Feb 5, 2025 · 10 comments
Labels
bug Something isn't working

Comments

@Justin-Xiang
Contributor

I am running two Autoware instances on two separate machines within the same LAN. The first machine (IP 192.168.1.1) can successfully switch its vehicle to Auto mode, while the second machine (IP 192.168.1.2) cannot — even though it can successfully plan routes. Additionally, there are error messages in the zenoh_bridge_ros2dds.log file that I have not encountered before.

Setup Details

Machine 1

  • IP: 192.168.1.1
  • Actions:
    • Starts Carla
    • Builds and runs the Zenoh bridge and Autoware using:
      • run-bridge-two-vehicles.sh
      • run-autoware.sh v1

Machine 2

  • IP: 192.168.1.2
  • Actions:
    • Builds Autoware
    • Changes in run-autoware.sh:
      export ZENOH_CARLA_IP_PORT="${2:-'192.168.1.1:7447'}" # previously 127.0.0.1
      export ZENOH_FMS_IP_PORT="${3:-'192.168.1.1:7887'}"  # previously 127.0.0.1
    • Runs run-autoware.sh v2
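
Since Machine 2 points both endpoints at Machine 1, one quick sanity check is to confirm that both TCP ports accept connections from Machine 2. The following is a minimal Python sketch, not part of the repository: `parse_endpoint` and `can_connect` are hypothetical helper names, and the sketch only tests plain TCP reachability, not the Zenoh protocol itself. `parse_endpoint` also strips surrounding quote characters, because in bash a single-quoted default inside a double-quoted `${2:-'...'}` expansion keeps its quotes literally.

```python
import socket
from typing import Tuple


def parse_endpoint(value: str) -> Tuple[str, int]:
    """Split 'host:port' into (host, port), dropping stray quote characters."""
    host, _, port = value.strip().strip("'\"").rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected 'host:port', got {value!r}")
    return host, int(port)


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, unreachable hosts, and timeouts.
        return False
```

For example, running `can_connect(*parse_endpoint("192.168.1.1:7447"))` on Machine 2 should return `True` if the port on Machine 1 is reachable; the same check applies to `192.168.1.1:7887`.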

Expected Behavior

  • Both Autoware instances should be able to plan routes and switch the vehicle to Auto mode in Carla.

Actual Behavior

  • Machine 1: Can switch to Auto mode and plan routes without issues.
  • Machine 2: Can plan routes but cannot switch to Auto mode.

Logs and Error Messages

Below are excerpts from zenoh_bridge_ros2dds.log on Machine 2:

2025-02-05T01:06:43.328667Z ERROR rx-1 ThreadId(26) zenoh_codec::common::extension: Unknown OAM ext: ZExtUnknown { Id: 4, Mandatory: true, Encoding: "Unit" }
2025-02-05T01:06:47.305967Z ERROR rx-0 ThreadId(25) zenoh_codec::common::extension: Unknown OAM ext: ZExtUnknown { Id: 3, Mandatory: true, Encoding: "Unit" }
2025-02-05T01:06:47.450742Z  INFO tokio-runtime-worker ThreadId(12) zenoh_plugin_ros2dds::discovered_entities: Discovered ROS Node /transform_listener_impl_7d0c340e4ec0
2025-02-05T01:06:47.450813Z  INFO tokio-runtime-worker ThreadId(12) zenoh_plugin_ros2dds: Node /transform_listener_impl_7d0c340e4ec0 declares Subscriber /clock: rosgraph_msgs/msg/Clock - Allowed
2025-02-05T01:06:47.618595Z  INFO tokio-runtime-worker ThreadId(18) zenoh_plugin_ros2dds: Node /transform_listener_impl_7d0c340e4ec0 undeclares Subscriber /clock: rosgraph_msgs/msg/Clock - Allowed
2025-02-05T01:06:47.652328Z  INFO tokio-runtime-worker ThreadId(18) zenoh_plugin_ros2dds::discovered_entities: Undiscovered ROS Node /transform_listener_impl_7d0c340e4ec0
2025-02-05T01:06:48.787616Z ERROR rx-1 ThreadId(26) zenoh_codec::common::extension: Unknown OAM ext: ZExtUnknown { Id: 3, Mandatory: true, Encoding: "Unit" }

...

2025-02-05T01:01:41.664775Z ERROR rx-1 ThreadId(26) zenoh_codec::common::extension: Unknown OAM ext: ZExtUnknown { Id: 5, Mandatory: true, Encoding: "ZBuf", Value: ZBuf { slices: [[83, c9, 3e

This looks like a problem in zenoh-bridge-ros2dds. Any insights or guidance would be highly appreciated!
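
To see how often each unknown extension Id appears, the log can be tallied with a short script. This is a minimal Python sketch, assuming the log file contains standard ANSI color sequences (ESC followed by `[...m`) and that the error lines follow the format shown above; `strip_ansi` and `count_unknown_oam` are hypothetical helper names.

```python
import re
from collections import Counter
from typing import Iterable

# Matches ANSI color/style sequences such as ESC[2m, ESC[31m, ESC[0m.
ANSI_RE = re.compile(r"\x1b\[[0-9;]*m")
# Captures the numeric Id from "Unknown OAM ext: ZExtUnknown { Id: N, ...".
OAM_RE = re.compile(r"Unknown OAM ext: ZExtUnknown \{ Id: (\d+)")


def strip_ansi(line: str) -> str:
    """Remove ANSI color sequences from a single log line."""
    return ANSI_RE.sub("", line)


def count_unknown_oam(lines: Iterable[str]) -> Counter:
    """Count 'Unknown OAM ext' errors per extension Id."""
    counts: Counter = Counter()
    for line in lines:
        match = OAM_RE.search(strip_ansi(line))
        if match:
            counts[int(match.group(1))] += 1
    return counts
```

A possible usage is `with open("zenoh_bridge_ros2dds.log") as f: print(count_unknown_oam(f))`, which prints how many errors were logged for each Id.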

@Justin-Xiang
Contributor Author

Or perhaps the issue is caused by using the same zenoh-bridge-ros2dds-conf.json5 file on both machines? I’m not entirely sure, but I’ll investigate further and try to resolve it.

@minseokim521

Hello @Justin-Xiang !

I'm a college student studying Autoware-based V2X. I ran into a problem similar to yours and opened my own issue just before this one.

I'm running Autoware on one machine. After I set the 2D goal pose, the Auto button is sometimes enabled and sometimes not; when it is not, closing the terminal and running everything again sometimes fixes it.

My problem occurs when running two Autoware instances: when I start the second instance after the first one is set up, the two instances seem to conflict with each other.

The only thing I set up differently is in run-autoware-docker.sh, where I added the --network host option so that the Docker container communicates in host mode. Could my problem be related to this? I'm still learning about Docker container networking, ROS 2 DDS, and the Zenoh bridge, so I don't fully understand the communication structure.

Also, can I put the IP address of host1 in run-autoware.sh?

My machine runs Ubuntu 22.04 with ROS Humble and an NVIDIA RTX 3080, and I cloned the main branch from GitHub. Could you also share your development environment and settings?

Lastly, I'm looking into the problem myself, but if you solve it, it would be very helpful if you could share the method with me. If possible, could you also share your container setup script, the scripts that run Autoware and the Zenoh bridge, and your ROS_DOMAIN_ID / ROS_LOCALHOST_ONLY settings for ROS communication? I'd also like to know whether you changed anything in the code from GitHub.

I will keep researching the problem as well.

Thank you so much.

@Justin-Xiang
Contributor Author

@minseokim521 Hi, I came across your issue but didn’t look into it deeply since you’re using V2X and running everything on a single machine. I haven’t worked with V2X before, but I do have experience running two Autoware instances on a single machine.

From my experience, I didn’t make any code modifications—just followed the documentation—and everything worked as expected. You might want to try running it without modifications to see if that makes a difference.

Since our development environments seem quite similar, I don’t think hardware resources should be a constraint. Let me know if you have any other questions!

@Justin-Xiang
Contributor Author

Justin-Xiang commented Feb 5, 2025

Found this from zenoh issues:
eclipse-zenoh/zenoh-plugin-ros2dds#305 (comment)

It states that zenoh-bridge-ros2dds doesn't support two hosts with different namespaces communicating with each other.

@evshary Could you confirm whether this means the setup I described in this issue is currently not achievable? Additionally, would it be possible to run two zenoh_carla_bridge instances on separate machines while sharing the same Carla simulator?

@evshary
Owner

evshary commented Feb 6, 2025

Hi @Justin-Xiang

I have tried your setup before, and it should work. However, the error message is unusual and worth investigating.
I have some questions:

  • Are you able to run everything on the same machine? That is, Carla, the bridges, and the two Autoware instances.
  • While the second instance is unable to switch to Auto, can it get the camera image from the Carla bridge? If yes, the communication should be okay, and I suspect something is wrong with a specific topic.

Anyway, I don't think your issue is related to eclipse-zenoh/zenoh-plugin-ros2dds#305 (comment), since the two Autoware instances don't need to talk to each other directly.
It should be something else.

@Justin-Xiang
Contributor Author

Hi @evshary Thanks for your response.

  1. Yes, I can run it on the same machine, but it’s not stable. I’m not sure if this is due to hardware constraints (I’m using an RTX 3090, 32GB RAM, and an i7-12700KF) or a connection issue. Out of every five attempts, only one successfully sets both vehicles to Auto and allows them to move. Sometimes one vehicle fails to switch to Auto, sometimes both do, and in the worst cases both fail to initialize. I haven’t made any modifications; I’m strictly following the documentation.
  2. Yes, the system receives images from the Carla bridge and successfully plans routes (I can see the green routes in RViz). The issue is that the vehicles can’t be set to Auto. I also tried swapping Machine 1 and Machine 2, changing which one runs what, but the problem persists. The machine running zenoh-bridge works fine, while the other one still can’t switch to Auto.

Let me know if you need more details; I’d be happy to provide additional information.

@evshary
Owner

evshary commented Feb 6, 2025

Thank you! It hasn't happened on my machine, but I have heard of someone else facing this issue before. I will take a look. Feel free to share any other information if you notice something odd.

@minseokim521

Hi @Justin-Xiang First of all, thank you for your response.

As you suggested, I cloned the repository as-is, built it without any modifications, and then ran the "Run Carla with multiple Autowares" scenario.

The setup was split into host1 and host2, just like yours. Host1 ran Carla, run-bridge-two-vehicles.sh, and run-autoware.sh v1. Host2 ran run-autoware.sh v2, and I also applied the change you mentioned: export ZENOH_CARLA_IP_PORT="${2:-'host1 ip'}" # previously 127.0.0.1.

As in your case, auto mode sometimes works, and sometimes one or both vehicles cannot enter auto mode, so the test cannot proceed properly. This seems to be the same behavior you described.

I'm still looking for additional information to resolve this.

If you find a fix or identify the cause, I would really appreciate it if you could share it.

@Justin-Xiang
Contributor Author

@minseokim521 Yes, I’ve encountered the same issue—it works sometimes but fails at other times. Unfortunately, I don’t have a concrete solution yet. I suspect it might be a performance-related issue with the autoware_carla_launch package since running ros2 launch autoware_launch planning_simulator.launch.xml works smoothly. However, this is just my assumption. I plan to investigate further when I have time.

@habby1012
Collaborator

Hi @Justin-Xiang @minseokim521

I have also encountered issues where the vehicle cannot switch to auto mode, especially when running two vehicles simultaneously. During execution, I repeatedly see log messages saying that /steering_status and /velocity_status have dropped to the warning level, and I believe this is why I cannot activate auto mode.
Image

I found that Autoware has certain settings that monitor safety-related topics. These are defined in the following file:
/autoware_launch/config/system/component_state_monitor/topics.yaml

Inside this file, the levels for /steering_status and /velocity_status are defined, as shown in the image below:
Image

If these topics are published too slowly, Autoware will determine that the system is unsafe and prevent auto mode from being activated. I suspect that running two vehicles increases the system load, leading to performance issues that cause these topics to be published at a lower frequency.

As a workaround, I have tried lowering the warning_rate slightly (to 1.0), which may help mitigate the issue.

This is my current understanding, and I’m open to further discussion. Thanks!
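
To make the reasoning above concrete, here is a minimal Python sketch of a rate check of the kind described. It is not Autoware's implementation: the function names, the `error_rate` parameter, and the threshold values used in the example are assumptions for illustration; only `warning_rate` is named in this thread.

```python
from typing import Sequence


def measured_rate(stamps: Sequence[float]) -> float:
    """Average message rate in Hz from arrival timestamps given in seconds."""
    if len(stamps) < 2:
        return 0.0
    span = stamps[-1] - stamps[0]
    # N timestamps bound N - 1 intervals.
    return (len(stamps) - 1) / span if span > 0 else 0.0


def classify(rate_hz: float, warning_rate: float, error_rate: float) -> str:
    """Map a measured rate to a status level (hypothetical threshold semantics)."""
    if rate_hz < error_rate:
        return "ERROR"
    if rate_hz < warning_rate:
        return "WARN"
    return "OK"
```

Under these assumptions, a topic arriving at 2 Hz is flagged as `WARN` with a hypothetical `warning_rate` of 5.0, while lowering `warning_rate` to 1.0 lets the same topic pass as `OK`, which is the effect the workaround above relies on.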

@evshary added the bug label on Feb 28, 2025