Replies: 4 comments
-
Added to |
Beta Was this translation helpful? Give feedback.
-
Roadmap to stable debian package: #3310 |
Beta Was this translation helpful? Give feedback.
This comment was marked as off-topic.
This comment was marked as off-topic.
-
@sykwer thanks a lot for working on this. I've merged #3306
Could you create the steps to set up, run and adjust the parameters in the how-to-guides folder? I think
Once the documentation is out, we would like to test it on Leo Drive's vehicles to evaluate its performance. For the unit tests, I think for this repo and the way it's integrated, integration tests (where multiple nodes run together in the test) could be suitable. But it's not too high priority just yet. Do you have any specific nodes you'd like to run unit tests with it? Launch file integrationDo you have plans to make it a part of autoware.universe launch files? |
Beta Was this translation helpful? Give feedback.
-
I've created a library named
heaphook
(https://github.com/tier4/heaphook), which replaces all the heap allocations in the specified ROS2 node. In this discussion thread, I'm going to explain the topics below and would like to obtain the community's consent to introduce this library.heaphook
?What is heaphook?
@sykwer developed a library that replaces all heap allocations (malloc, calloc, realloc, free, etc..) with an arbitrary memory allocator in the specified ROS2 node (more precisely, in the specified process).
You can apply this feature to any
Node
andComposableNodeContainer
by modifying a launch file as shown below.As you can see, it's extremely easy to introduce the library and replace all the heap allocation functionality. Just specify the library in
LD_PRELOAD
.For now, two types of allocators are provided.
libpreloaded_heaptrack.so
: uses a standard malloc runtime and generates a log file to visualize heap usage over time, and is assumed to be used to capture the initial memory pool size required by the TLSF allocator.libpreloaded_tlsf.so
: TLSF (Two Level Segregate Fit) allocator.See the README page for more details.
Background and Performance Improvement
In the standard malloc runtime implementation, a large memory allocation request (e.g. malloc function call) triggers
mmap(2)
syscall. The first time the user program touches the mapped memory area, soft page faults occur, resulting in a large overhead. Therefore, in real-time systems, all virtual address space resources used for heap allocation should be allocated bymmap(2)
at start-up and "first touch" in advance.Additionally, there are recommended heap allocators for real-time systems and the default heap allocator should not be used. So heap allocators should be freely interchangeable by the Autoware users.
Our performance analysis shows that the response time bottleneck for each node of the
PointCloud Preprocessor
is due to the soft page faults. For example, the response time incrop_box_filter_self
main callback can be improved as shown in below.Measurement condition:
What change is needed to Autoware repositories?
autoware.repos
file in https://github.com/autowarefoundation/autoware.LD_PRELOAD=libpreloaded_tlsf.so
,INITIAL_MEMPOOL_SIZE=...
andADDITIONAL_MEMPOOL_SIZE=...
.For the first target node, I'm going to specify the pointcloud preprocessor container (here: https://github.com/autowarefoundation/sample_sensor_kit_launch/blob/6275063720f6fc94b5e2b477337b4246889c4c2a/common_sensor_launch/launch/velodyne_node_container.launch.py#L167-L175).
According to my measurement by
libpreloaded_heaphook.so
,INITIAL_MEMPOOL_SIZE=10MB
is enough for the default rosbag simulation.User Story
This library works as transparently as possible for Autoware users. Even if you don't know about
heaphook
, Autoware works fine in terms of logical output.However, for high performance, the following steps need to be taken.
1. Grasp the maximum heap consumption
Set
libpreloaded_heaptrack.so
and run Autoware.After running Autoware, you can get a log file under the current working directory in the format
heaplog.{pid}.log
.You can visualize heap consumption transitions in PDF format based on the generated log file.
2. Set TLSF Allocator with appropriate settings
Based on the information obtained in the step1, configure the TLSF allocator appropreately.
We need documentation
The appropriate
INITIAL_MEMPOOL_SIZE
is highly dependent on the environment in which Autoware is running and is expected to be set by the Autoware operator themselves (The default configuration values should be those that work well with the rosbag in the Autoware Documentation).Therefore, we need to prepare appropriate documentation for Autoware users and encourage them to set appropriate values for the node with heavy heap consumption.
Discussion
Beta Was this translation helpful? Give feedback.
All reactions