
Ports connected to ROS subscribers should not depend on spinning the global callback queue #117

Open
meyerj opened this issue Feb 13, 2019 · 2 comments

Comments

@meyerj
Member

meyerj commented Feb 13, 2019

Copied from #104 (comment):

I agree that the number of spinner threads has a significant effect on the overall behavior of a ROS application, including ROS-enabled Orocos processes. This is especially true if the process provides services, where a long-running callback actually blocks all subscriptions.

Increasing the default number of spinner threads would be one option to mitigate the problem, and unless someone uses ROS primitives with callbacks directly within the same process, or rtt_roscomm service server proxies with ClientThread operations, there should be no noticeable side effects.
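
For reference, spinning the global queue with multiple threads looks roughly like this in plain roscpp (untested sketch, node name made up):

```cpp
#include <ros/ros.h>

int main(int argc, char** argv)
{
  ros::init(argc, argv, "orocos_ros_process");
  ros::NodeHandle nh;

  // ... subscribers and service servers are created here ...

  // Four threads servicing the global callback queue instead of the
  // default single-threaded ros::spin(), so one long-running service
  // callback no longer starves all subscriptions.
  ros::AsyncSpinner spinner(4);
  spinner.start();
  ros::waitForShutdown();
  return 0;
}
```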

We ran into a similar problem in the past few weeks, and I have been thinking about the following alternative solutions, which should eventually scale better:

  • a per subscriber or service server CallbackQueue and an associated Orocos activity (see the roscpp sketch below this list)
  • a per component CallbackQueue and an associated Orocos activity, similar to the dispatcher threads of the CORBA or mqueue transports on the sending side. Similarly, the RosPublishActivity could be instantiated per component instead of as a process-wide singleton.
  • a customized implementation of ros::CallbackQueue that, instead of enqueuing a callback and waiting until a spinner thread polls it, calls the callback immediately. The callback would directly push the message into the data or buffer object associated with the ROS stream connection, bypassing ROS' callback queue and the spinner threads completely. Orocos RTT already provides all the primitives we need. On the other hand, deserialization would then happen in roscpp's network thread, and I am not sure whether that is a good idea.
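
To illustrate the first option in plain roscpp terms (untested sketch; topic name and callback are made up, and in rtt_roscomm the spinner thread would be an Orocos activity instead):

```cpp
#include <ros/ros.h>
#include <ros/callback_queue.h>
#include <std_msgs/String.h>

void handleMessage(const std_msgs::String::ConstPtr& msg)
{
  ROS_INFO("received: %s", msg->data.c_str());
}

int main(int argc, char** argv)
{
  ros::init(argc, argv, "per_subscriber_queue_example");
  ros::NodeHandle nh;

  // Private callback queue serving only this one subscriber.
  ros::CallbackQueue queue;
  ros::SubscribeOptions ops = ros::SubscribeOptions::create<std_msgs::String>(
      "chatter", 10, handleMessage, ros::VoidPtr(), &queue);
  ros::Subscriber sub = nh.subscribe(ops);

  // One dedicated thread polling only the private queue.
  ros::AsyncSpinner spinner(1, &queue);
  spinner.start();
  ros::waitForShutdown();
  return 0;
}
```
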
@ahoarau
Contributor

ahoarau commented Feb 20, 2019

I'm testing the "per subscriber/publisher/client/server CallbackQueue + async spinner (1 thread)" approach.
Unfortunately, it breaks ABI compatibility. Should I open a PR?

@meyerj
Member Author

meyerj commented Feb 21, 2019

I'm testing the "per subscriber/publisher/client/server CallbackQueue + async spinner (1 thread)" approach.
Unfortunately, it breaks ABI compatibility. Should I open a PR?

Yes, thanks, that would be valuable. We can check whether your patch indeed breaks the ABI. The RosPubChannelElement<T>, RosSubChannelElement<T> and RosMsgTransporter<T> are template classes used for building the ROS transport plugin, and typically no other library links to that plugin, so the risk of breaking the ABI should be very low. Any change there will only affect typekits and transport plugins that depend on the new version of rtt_roscomm. Already compiled plugins do not depend on that header file and would continue to work using the global callback queue. Of course, it depends on your exact patch.

For bigger applications, having a separate thread per publisher/subscriber might be too much, so I think we should not make this the default behavior. Is it possible to add a global property (to the GlobalsRepository) that decides whether new ROS streaming connections are created using the global callback queue or a custom one with a separate thread?
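
Something along these lines, assuming the GlobalsRepository can be used this way (untested, and the property name is only a suggestion):

```cpp
#include <rtt/Attribute.hpp>
#include <rtt/types/GlobalsRepository.hpp>

// Registered once at deployment time.
void registerRosCallbackQueuePolicy()
{
  RTT::types::GlobalsRepository::shared_ptr globals =
      RTT::types::GlobalsRepository::Instance();
  // true  = current behavior: use the global ROS callback queue
  // false = create a private queue and spinner thread per new connection
  globals->setValue(new RTT::Attribute<bool>("ros_use_global_callback_queue", true));
}

// Queried by the transport whenever a new ROS streaming connection is created.
bool useGlobalCallbackQueue()
{
  RTT::base::AttributeBase* ab = RTT::types::GlobalsRepository::Instance()
      ->getValue("ros_use_global_callback_queue");
  if (!ab) return true;  // property not set: keep the current behavior
  RTT::Attribute<bool> flag(ab);
  return flag.get();
}
```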

Unfortunately, I don't have time at the moment to propose a patch for the last approach, which would be quite simple for subscribers. Basically, you would need to define a child class of ros::CallbackQueueInterface that simply calls the callback directly from addCallback(), without any additional queue involved. The ROS network thread, which calls into addCallback() for each new message, would then directly push the message into the Orocos connection's data element or buffer. I still have to check in which cases a subscriber callback can return TryAgain, and of course I have not tested it yet, but I am quite sure that this approach would work if you pass a per-RosSubChannelElement<T> dummy callback queue instance to the respective subscriber via the SubscribeOptions.
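
Untested sketch of that child class (the TryAgain case is exactly the open question mentioned above):

```cpp
#include <ros/callback_queue_interface.h>

// Dummy queue that dispatches callbacks immediately from the roscpp
// network thread instead of enqueuing them for a spinner thread. For a
// subscription this pushes the freshly deserialized message straight
// into the Orocos connection's data object or buffer.
class DirectCallbackQueue : public ros::CallbackQueueInterface
{
public:
  void addCallback(const ros::CallbackInterfacePtr& callback, uint64_t owner_id = 0)
  {
    (void)owner_id;
    // Open question: when call() returns CallbackInterface::TryAgain,
    // is it safe to simply drop the callback or retry it here?
    if (callback->ready()) {
      callback->call();
    }
  }

  void removeByID(uint64_t owner_id)
  {
    (void)owner_id;  // nothing is ever queued, so nothing to remove
  }
};
```

Each RosSubChannelElement<T> would then own one instance and hand it to its subscriber via the callback_queue field of SubscribeOptions.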
