Optimizing ROS2 Applications with Streaming Executor: A Performance Analysis

The recent ROS Meetup at the Robert Bosch Center for Research and Advance Development in Renningen brought together robotics enthusiasts and ROS developers eager to explore the latest developments in the Robot Operating System 2 (ROS2). One of the more interesting presentations was an in-depth discussion of ROS2 executors, led by Pablo Ghiglino from Klepsydra Technologies. This article covers the key insights and findings from Klepsydra's presentation on the streaming executor and its advantages in various scenarios.

Understanding ROS2 Executors

Before diving into the streaming executor, it's essential to understand the role of executors in ROS2 applications. Executors manage the flow of data by coordinating and scheduling the callbacks of subscriptions, services, and timers across an application's nodes. Rather than maintaining their own queues of messages and callbacks, ROS2 executors consume messages directly from the middleware (DDS) queues. They then dispatch these messages to one or more threads for execution.
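
The dispatch idea described above can be illustrated with a small toy model. This is only a sketch in plain Python, not ROS2 code: `ToyMiddleware` and `ToySingleThreadedExecutor` are hypothetical names invented for this example, and the per-topic deque merely stands in for the DDS queues. The point it shows is that the executor holds no message queue of its own and takes messages straight from the middleware.

```python
from collections import deque


class ToyMiddleware:
    """Hypothetical stand-in for the DDS layer: one FIFO queue per topic."""

    def __init__(self):
        self.queues = {}

    def publish(self, topic, msg):
        self.queues.setdefault(topic, deque()).append(msg)

    def take(self, topic):
        """Return the oldest message on the topic, or None if it is empty."""
        q = self.queues.get(topic)
        return q.popleft() if q else None


class ToySingleThreadedExecutor:
    """Hypothetical executor: keeps callbacks only, never a message queue."""

    def __init__(self, middleware):
        self.middleware = middleware
        self.subscriptions = {}  # topic -> callback

    def add_subscription(self, topic, callback):
        self.subscriptions[topic] = callback

    def spin_some(self):
        """Run callbacks until no topic has data; return how many ran."""
        executed = 0
        progress = True
        while progress:
            progress = False
            # Visit subscriptions in registration order, one message each.
            for topic, callback in self.subscriptions.items():
                msg = self.middleware.take(topic)
                if msg is not None:
                    callback(msg)
                    executed += 1
                    progress = True
        return executed
```

With subscriptions on two topics, a call to `spin_some()` visits them in turn on the calling thread, so every callback runs sequentially. Real executors also handle timers, services, and wait sets, which this sketch leaves out.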

Klepsydra's presentation covered three well-known types of ROS2 executors:

  1. Single Threaded Executor: This executor operates with a single thread that queries the middleware and executes callbacks sequentially. It periodically scans the application's structure to update nodes, subscriptions, services, and more. While simple, it may not be the most suitable choice for resource-intensive workloads.
  2. Static Single Threaded Executor: The static single-threaded executor scans and defines the application structure only once, during construction. It assumes that all nodes, callback groups, timers, and subscriptions are created before the spin function is called. Despite its simplicity, this executor has proven effective for lightweight node work.
  3. Multi-Threaded Executor: The multi-threaded executor creates multiple threads to execute callbacks in parallel, optimizing performance for demanding workloads. However, managing threads can be complex and challenging.
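
The difference between sequential and parallel callback execution can be made concrete with a toy experiment. The sketch below uses only the Python standard library and is not the ROS2 API: `run_sequential`, `run_parallel`, and `make_rendezvous_callbacks` are hypothetical helpers written for this illustration. Each callback waits at a barrier and reports success only if all callbacks were running at the same time.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_sequential(callbacks):
    """Single-threaded style: one thread runs each callback in turn."""
    return [cb() for cb in callbacks]


def run_parallel(callbacks, num_threads):
    """Multi-threaded style: worker threads run callbacks concurrently."""
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(cb) for cb in callbacks]
        return [f.result() for f in futures]


def make_rendezvous_callbacks(n, timeout=0.5):
    """Build n callbacks that return True only if all n overlap in time."""
    barrier = threading.Barrier(n)

    def callback():
        try:
            # Blocks until n callbacks are waiting, or until the timeout.
            barrier.wait(timeout=timeout)
            return True
        except threading.BrokenBarrierError:
            # Nobody else arrived in time: the callbacks did not overlap.
            return False

    return [callback for _ in range(n)]
```

Run sequentially, two such callbacks can never meet at the barrier, so both return `False`. Run on a pool of two threads, they meet and both return `True`. The sketch ignores callback groups; in ROS2, callbacks in the same mutually exclusive callback group are not run in parallel even by the multi-threaded executor.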

The Highlight: The Streaming Executor

Klepsydra's presentation highlighted the unique streaming executor, which offers distinct advantages over its counterparts. Here's how it works:

  1. Publisher-Subscriber Pairing: In the streaming executor, a publisher-subscriber pair is created for each topic required by a ROS2 node. These pairs are internally identified by the node name and the topic name. This design ensures that even if two different nodes publish to the same topic, they are managed independently, enhancing efficiency.
  2. Event Loop Management: The streaming executor manages all publisher-subscriber pairs associated with topics belonging to the same node using a shared event loop. As a result, subscribers are efficiently handled by the thread associated with their respective event loop. This approach eliminates the need for complex multi-threading management and works seamlessly on both single-core and multi-core systems.
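
The two ideas above, pairs keyed by node name and topic name plus one event loop per node, can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not Klepsydra's implementation or API: `NodeEventLoop`, `ToyStreamingExecutor`, `deliver`, and `demo` are hypothetical names, and a `queue.Queue` with one worker thread stands in for the event loop.

```python
import queue
import threading


class NodeEventLoop:
    """Hypothetical event loop: one queue and one thread shared per node."""

    def __init__(self, node_name):
        self._events = queue.Queue()
        self._thread = threading.Thread(
            target=self._run, name=f"loop-{node_name}", daemon=True)
        self._thread.start()

    def post(self, callback, msg):
        self._events.put((callback, msg))

    def _run(self):
        while True:
            item = self._events.get()
            if item is None:  # sentinel posted by stop()
                break
            callback, msg = item
            callback(msg)

    def stop(self):
        """Process everything already queued, then end the loop thread."""
        self._events.put(None)
        self._thread.join()


class ToyStreamingExecutor:
    def __init__(self):
        self._loops = {}  # node name -> NodeEventLoop
        self._pairs = {}  # (node name, topic name) -> subscriber callback

    def add_subscription(self, node_name, topic_name, callback):
        if node_name not in self._loops:
            self._loops[node_name] = NodeEventLoop(node_name)
        self._pairs[(node_name, topic_name)] = callback

    def deliver(self, node_name, topic_name, msg):
        """Publisher side of a pair: hand the message to the node's loop."""
        callback = self._pairs[(node_name, topic_name)]
        self._loops[node_name].post(callback, msg)

    def shutdown(self):
        for loop in self._loops.values():
            loop.stop()


def demo():
    """Record (label, message, thread name) for every callback that runs."""
    log = []

    def recorder(label):
        return lambda msg: log.append(
            (label, msg, threading.current_thread().name))

    executor = ToyStreamingExecutor()
    executor.add_subscription("planner", "scan", recorder("planner/scan"))
    executor.add_subscription("planner", "odom", recorder("planner/odom"))
    executor.add_subscription("controller", "scan", recorder("controller/scan"))
    executor.deliver("planner", "scan", 1)
    executor.deliver("planner", "odom", 2)
    executor.deliver("controller", "scan", 3)
    executor.shutdown()
    return log
```

In `demo()`, both topics of the hypothetical `planner` node are handled in order by the single thread `loop-planner`, while the `controller` node's subscription to the same `scan` topic is handled independently on `loop-controller`. No locking is needed inside a node's callbacks because only one thread ever runs them.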

Real-World Results: Performance Comparison

Klepsydra's presentation compared the executors under different workloads. The streaming executor came out ahead in some scenarios, but not in all of them:

  1. Small Node Work: For lightweight workloads, the static single-threaded executor was the top performer, showing that simplicity can pay off.
  2. Scaling with Workload: As the workload increased, the streaming executor performed best, closely followed by the static single-threaded executor. The streaming executor's results also stayed consistent as the load grew, which points to the benefit of a stable application topology.

For readers interested in how these performance measurements were made, the benchmark setup is described in the following section.

Performance Benchmark Scenario

  • The benchmark was based on the Autoware reference system. It emulates a realistic driving application.
  • All measurements were taken on a Raspberry Pi 4B with 4 GB of RAM, running ROS2 Galactic on Ubuntu 20.04 at a constant CPU frequency of 1.50 GHz.
  • The reference system was run in a compatible setup, without CPU isolation.

Processors tested:

  • Raspberry Pi 4 (reference processor for the ROS 2 Real-Time Working Group, RTWG)
  • Unibap's iXIO (NASA and Blue Origin Testbed)
  • Teledyne e2v LS1046
