Address performance issues related to large objects workflows #898

gilopez01 · 2024-12-11T11:53:56Z

I would like to propose a couple of enhancements that address multiple performance issues, one related to a performance issue in GVM and a couple related to deepcopy when using flow ports between states. I understand that the RAFCON team has previously expressed a preference to retain the use of deepcopy for safety reasons. However, I propose the introduction of optional configuration flags to provide users with greater flexibility in optimizing performance while preserving the current behavior by default.

Below are the specific instances where significant performance bottlenecks have been observed:

1. Global Variable Manager - gvm.set_variable delay

File: source/rafcon/core/global_variable_manager.py:112

Issue:
When setting a global variable, the following line serializes the value to generate a message string. For large objects, this serialization takes several seconds to complete:

logger.debug("Global variable '{}' was set to value '{}' with type '{}'".format(key, value, data_type.__name__))

Proposed Solution:
Serialize the message only if debug logging is enabled, as shown below:

if logger.isEnabledFor(10):  # DEBUG level is 10
    logger.debug("Global variable '{}' was set to value '{}' with type '{}'".format(key, value, data_type.__name__))

This change avoids unnecessary serialization when debug logging is disabled, leading to significant performance improvements.

2. Output Port Delay

File: source/rafcon/core/execution/execution_history_items.py:176

Issue:
When generating ScopedDataItem for use in the data flow, a deepcopy is performed, which takes around 500-800 ms per trajectory (using MoveIt2 trajectories as a large complex object):

self.child_state_input_output_data = copy.deepcopy(child_state_input_output_data)

Proposed Solution:
Introduce an option to skip deepcopy for users who do not require it.

if use_deepcopy:  # use_deepcopy could be a configurable flag in the execution context
    self.child_state_input_output_data = copy.deepcopy(child_state_input_output_data)
else:
    self.child_state_input_output_data = child_state_input_output_data

This change would maintain backward compatibility, as use_deepcopy would default to True, but allow users to enable somehow this flag for significant performance gains.

3. Input Port Delay

File: source/rafcon/core/states/container_state.py:1474

Issue:
When retrieving input data for state execution, a deepcopy is performed on ScopedDataItem data from the data flow, leading to an additional 500-800 ms delay in our case sending a moveit2 trajectory:

actual_value = deepcopy(self.scoped_data[key].value)

Proposed Solution:
Optionally disable deepcopy using a similar approach to the output port delay fix:

if use_deepcopy:  # use_deepcopy could be a configurable flag in the execution context
    actual_value = deepcopy(self.scoped_data[key].value)
else:
    actual_value = self.scoped_data[key].value

Again, this approach maintains current behavior by default but gives users the flexibility to avoid the deepcopy overhead.

These proposed changes offer a significant reduction in execution time, especially for workflows with large data objects or high-frequency state transitions.

Benefits:

Backwards Compatibility: Defaults to the current behavior (deepcopy enabled).
Performance Improvement: Reduces delays in input/output processing and global variable handling.
User Flexibility: Provides configurable flags for advanced users who can guarantee data immutability.

I would like to know if there is a feasible way to introduce this flag in the mentioned cases. Additionally, I propose applying the fix for point 1 directly, as it provides immediate performance benefits. Please let me know if this approach aligns with RAFCON's design principles, and I would be happy to assist in implementing or testing these changes.

The text was updated successfully, but these errors were encountered:

JohannesErnst · 2024-12-11T16:33:25Z

Thanks for posting the issue and giving detailed information about your suggestions! Here are our current thoughts:

1. Global Variable Manager - gvm.set_variable delay

Of course it is possible to enable this DEBUG level check for this single line of code, but maybe it is to specific to actually patch it like this. If we decide to insert the line as you suggested we would then need to do it for every line where variables are printed like this.
However, there might be an easier approach for you: There already is a config file named logging.conf. Here, you can find the line to specify the logging level for both the GUI and the core of RAFCON:

RAFCON/source/rafcon/logging.conf

Line 11 in 59987cd

"level": "VERBOSE",

If you set this to e.g. INFO in your RAFCON installation path, everything below INFO-level information will not be printed and I assume the line of code you linked will not consume time anymore. Does this seem a suitable option for your case or do you think it would still be a nice feature if this can also be set from the RAFCON config (which is usually saved in ~/.config/rafcon/config.yaml and can be modified inside the GUI preferences also)?

2. Output Port Delay and 3. Input Port Delay

We discussed to include the functionality of disabling the deepcopy lines in the code for passing the data in between states. However, we noticed that this might be a little dangerous (see image below): If the data output from ExecutionState1 is changed in ExecutionState2, the data input into ExecutionState3 will also receive the altered data. This is usually not wanted and misleading in the GUI representation also.

So if we implement such a flag to disable the deepcopies, it would be some kind of expert-level-flag, to only use when you are really sure that nowhere in your state machine it should be used in another way. Maybe then it is also a little to specific and dangerous if users accidentally disable the flag.
As a workaround you can of course always change the lines of code you mentioned in your local RAFCON installation (also for the 1. point of your issue) and go like this. What are your thoughts on this?

sillkjc · 2024-12-11T23:42:41Z

For 1, as you mentioned Johannes having all settings in one place (including the logging) is important for us. We want to ensure we can replicate performance across all development and deployment machines, including automated CI testing. This can only be done if we can enforce all settings and prevent user configuration. Being able to point to the settings files on launch is super handy. RAFCON has many settings in the YAML's that can drastically alter behavior or performance.

For 2 and 3 I concur having a "Perform deepcopy" checkbox in the data port table would be good, that way when there is a known large object we can choose to prevent the copy at a per port basis. If we are concerned about the above case (data being unknowingly mutated before use) it should be sufficient to put in some tooltip warnings or hide the feature under advanced mode.

If we want to show this visibly in the GUI, maybe changing the connection colour to red or showing a red flag next to the name might be good?

This is a real problem, as it prevents common state patterns performing at high rate. Example:
Take A Photo -> Detect objects in Photo -> Store results -> Loop to start.

Typical images we work with are quite large and can take seconds to copy, so the overall loop update time is crippled. As mentioned, we also have the problem passing MoveIt trajectories between states.

gilopez01 · 2024-12-12T06:55:50Z

Thanks for posting the issue and giving detailed information about your suggestions! Here are our current thoughts:

1. Global Variable Manager - gvm.set_variable delay

Of course it is possible to enable this DEBUG level check for this single line of code, but maybe it is to specific to actually patch it like this. If we decide to insert the line as you suggested we would then need to do it for every line where variables are printed like this. However, there might be an easier approach for you: There already is a config file named logging.conf. Here, you can find the line to specify the logging level for both the GUI and the core of RAFCON:

RAFCON/source/rafcon/logging.conf

Line 11 in 59987cd

"level": "VERBOSE",

If you set this to e.g. INFO in your RAFCON installation path, everything below INFO-level information will not be printed and I assume the line of code you linked will not consume time anymore. Does this seem a suitable option for your case or do you think it would still be a nice feature if this can also be set from the RAFCON config (which is usually saved in ~/.config/rafcon/config.yaml and can be modified inside the GUI preferences also)?

2. Output Port Delay and 3. Input Port Delay

We discussed to include the functionality of disabling the deepcopy lines in the code for passing the data in between states. However, we noticed that this might be a little dangerous (see image below): If the data output from ExecutionState1 is changed in ExecutionState2, the data input into ExecutionState3 will also receive the altered data. This is usually not wanted and misleading in the GUI representation also.

So if we implement such a flag to disable the deepcopies, it would be some kind of expert-level-flag, to only use when you are really sure that nowhere in your state machine it should be used in another way. Maybe then it is also a little to specific and dangerous if users accidentally disable the flag. As a workaround you can of course always change the lines of code you mentioned in your local RAFCON installation (also for the 1. point of your issue) and go like this. What are your thoughts on this?

Regarding the first point, I have already set the level to INFO. However, the issue is that since the input of logger.debug() is a string, the .format() method is executed before the logger function is called. As a result, even if the message is not printed, time is still consumed serializing the message before entering to logger.debug().

flolay · 2024-12-12T16:01:16Z

Hi together,
thanks for posting this issue.

I would like to address the deepcopy issue you raised:
Rafcon was from the beginning on not designed to do high-rate computations as there are probably better tools to do that (e.g. simulink for control-loops), but to design a behavior and then parameterize the "function calls" within a state.
The idea we try to follow is to keep the states themselves as stateless as possible.

The problem that @JohannesErnst raised with unknowingly mutating the state is in my opinion not to be underestimated, and therefore in my opinion using the dataflow for that is not ideal.

A possibility that you could take is using the global variable manager to store the images and access those in other states. This would be already supported (obviously requiring the adaptation you suggested for setting the variables).

Alternatively, how we do this on our robots is, that we have the perception loop running outside of rafcon. In a rafcon statemachine we trigger the perception actively given the parameterization we have in our behavior.

We use e.g. a ROS-node that provides a service "detect_object(object_id)" which implements the fetching of the camera image, running a perception method on the image and returning the object pose.
And in rafcon we have another ros-node running (stored in the gvm) that triggers the service with the object_id and utilize the returned object's pose in the ongoing behavior.

This has multiple advantages:

heavy computation is done outside the general behavior execution (less complexity)
we can have high-rate computations (not limited by python and the rafcon stack)
we decouple the robots "capabilities" from the behavior

So if you really would like to have this behavior within rafcon, either use the global variable space, fork the repository and integrate this behavior in a fork or you could also try to create a plugin that allows you to do this (https://rafcon.readthedocs.io/en/latest/plugins.html)

AlxStark · 2024-12-13T09:03:34Z

Also thank you from my side for proposing these adaptions.

Regarding the first point, I have already set the level to INFO. However, the issue is that since the input of logger.debug() is a string, the .format() method is executed before the logger function is called. As a result, even if the message is not printed, time is still consumed serializing the message before entering to logger.debug().

After quick consultation we think that's a valid point to overcome the performance neck introduced by the logging, especially for the lower levels. However it would then be useful to have a holistic solution which addresses all logger occurrences of the two lowest level (VERBOSE and DEBUG) in the code.

In that case it can be handy to implement the suggested changes for the log-levels 5 and 10:

Serialize the message only if debug logging is enabled, as shown below:
if logger.isEnabledFor(10):  # DEBUG level is 10
logger.debug("Global variable '{}' was set to value '{}' with type '{}'".format(key, value, >data_type.__name__))

If you are willing to come up with a request for this fix as proposed, we'd be happy to have a look over it and apply it to the code.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Address performance issues related to large objects workflows #898

Address performance issues related to large objects workflows #898

gilopez01 commented Dec 11, 2024 •

edited

Loading

JohannesErnst commented Dec 11, 2024

sillkjc commented Dec 11, 2024 •

edited

Loading

gilopez01 commented Dec 12, 2024

1. Global Variable Manager - gvm.set_variable delay

2. Output Port Delay and 3. Input Port Delay

flolay commented Dec 12, 2024 •

edited

Loading

AlxStark commented Dec 13, 2024

Address performance issues related to large objects workflows #898

Address performance issues related to large objects workflows #898

Comments

gilopez01 commented Dec 11, 2024 • edited Loading

1. Global Variable Manager - gvm.set_variable delay

2. Output Port Delay

3. Input Port Delay

JohannesErnst commented Dec 11, 2024

1. Global Variable Manager - gvm.set_variable delay

2. Output Port Delay and 3. Input Port Delay

sillkjc commented Dec 11, 2024 • edited Loading

gilopez01 commented Dec 12, 2024

1. Global Variable Manager - gvm.set_variable delay

2. Output Port Delay and 3. Input Port Delay

flolay commented Dec 12, 2024 • edited Loading

AlxStark commented Dec 13, 2024

gilopez01 commented Dec 11, 2024 •

edited

Loading

sillkjc commented Dec 11, 2024 •

edited

Loading

flolay commented Dec 12, 2024 •

edited

Loading