Skip to content
This repository has been archived by the owner on Aug 11, 2023. It is now read-only.

TCP_NODELAY connection header #292

Open
StefanGlaser opened this issue Mar 15, 2019 · 10 comments
Open

TCP_NODELAY connection header #292

StefanGlaser opened this issue Mar 15, 2019 · 10 comments

Comments

@StefanGlaser
Copy link
Contributor

Hi,
I'm using rosjava for controlling a robot in "real-time". In order to get a "real-time" communication, I must use the TCP_NODELAY option for the internal TCP connections, otherwise messages are buffered and not received in time.

There exists a corresponding option in the ConnectionHeaderFields. However, I don't see any way to use this option on a Publisher / Subscriber. So currently I hard-coded this option into the TopicDeclaration and everything works fine. But that's not the way it's supposed to be.

What is the correct way to set the TCP_NODELAY option for a Publisher / Subscriber? Is it even possible in the current implementation?

Thanks a lot in advance!

@jubeira
Copy link

jubeira commented Mar 15, 2019

Hi @StefanGlaser,

I see the definition but I don't see it used anywhere in the code.

So currently I hard-coded this option into the TopicDeclaration and everything works fine. But that's not the way it's supposed to be.

Does that mean that TCP_NODELAY works like that or not? If it does, we can find a way to do that properly without needing to hardcode it.

@StefanGlaser
Copy link
Contributor Author

Hi @jubeira,

thanks a lot for your quick response!
Same to me - couldn't find any usages of the defined constant. Thus my question.

I have a running setup where I use the option hardcoded in the TopicDeclaration class by adding the following line:

connectionHeader.addField(ConnectionHeaderFields.TCP_NODELAY, "1");

I agree that it should be an option that you can specify when creating a Subscriber. However, before I change / extend the implementation accordingly I wanted to ask if there already exists an solution to this problem. I really wonder why nobody ever mentioned this issue. It is one of the most fundamental settings when dealing with "real-time" communication via TCP.

On the other hand I know that ROS has some means of telling a Publisher to use TCP_NODELAY when subscribing to a topic. I'm not sure how rosjava handles connection requests from an external Subscriber that requests TCP_NODELAY. So maybe the implementation of that feature is more complex than it looks on the first glance (See https://wiki.ros.org/ROS/Connection%20Header).

@jubeira
Copy link

jubeira commented Mar 15, 2019

As you say, adding the the option to the header is a start, but the problem is how the connection requests are handled in the lower layers.

Right now this is not implemented AFAIK. If you need this feature and want to propose a patch for it you are more than welcome! I'll be happy to review it.

@StefanGlaser
Copy link
Contributor Author

The lower connection layers do use this option.

I run a test in which I exchange very small messages in a ping-pong scenario between two nodes in 100Hz (and a message buffer of 1). With my change I don't miss any message (running for several minutes). Without the change I do miss a couple of messages every second.

So, I guess I'll go ahead and try to properly integrate this option into the rosjava core framework.

Thank you very much for your help!

@jubeira
Copy link

jubeira commented Mar 15, 2019

Then that sounds great! Thanks to you!

My only advice is to try to make the usage somewhat similar to how it's done in roscpp or rospy if possible.

@StefanGlaser
Copy link
Contributor Author

I meanwhile did some in depth analysis of the issue and recognized that rosjava simply ignores the tcp_nodelay flag in the connection header.

I extended rosjava to set the tcp_nodelay header field in the connection header. Then I started a Talker node and a Listener node, which exchange simple, short messages containing a sequence number. Analyzing the exchanged TCP messages via Wireshark I was able to confirm that the tcp_nodelay flag is set to 1 when the subscriber connects to the publisher. However, some TCP messages still contained multiple ROS messages, indicating that tcp_nodelay was not active.

I then reimplemented the Talker node in Python and tried the same test again. The result was as expected - no TCP message contained multiple ROS messages. Thus, Python is respecting the tcp_nodelay flag accordingly.

So I guess I have to dig deeper... :(

@jubeira I tried to get a similar usage to roscpp. roscpp is using TransportHints for specifying tcp_nodelay. As far as I know rosjava doesn't support UDP communication. Thus the TransportHints class in rosjava currently only contains one boolean variable - the tcp_nodelay flag. I'm not sure if it would be a better solution to simply add a boolean parameter to the newSubscriber methods. Furthermore, connections are buffered. As a result, the first subscriber determines if the connection has the tcp_nodelayflag enabled or not.

@jubeira
Copy link

jubeira commented Mar 27, 2019

I extended rosjava to set the tcp_nodelay header field in the connection header. Then I started a Talker node and a Listener node, which exchange simple, short messages containing a sequence number. Analyzing the exchanged TCP messages via Wireshark I was able to confirm that the tcp_nodelay flag is set to 1 when the subscriber connects to the publisher. However, some TCP messages still contained multiple ROS messages, indicating that tcp_nodelay was not active.

I then reimplemented the Talker node in Python and tried the same test again. The result was as expected - no TCP message contained multiple ROS messages. Thus, Python is respecting the tcp_nodelay flag accordingly.

Alright; interesting conclusion. Thanks for all those details! I have to say I was surprised when you said everything had worked out of the box.

@jubeira I tried to get a similar usage to roscpp. roscpp is using TransportHints for specifying tcp_nodelay. As far as I know rosjava doesn't support UDP communication. Thus the TransportHints class in rosjava currently only contains one boolean variable - the tcp_nodelay flag. I'm not sure if it would be a better solution to simply add a boolean parameter to the newSubscriber methods. Furthermore, connections are buffered. As a result, the first subscriber determines if the connection has the tcp_nodelayflag enabled or not.

For now a boolean would work as you say. I'd be fine with adding a TransportHints class that only has a boolean considering that someday we might support UDP, but I don't see that happening in the short term unless someone really needs it. My only strong suggestion in either case is to keep the API as it is now (i.e. don't use tcp_nodelay by default).

@StefanGlaser
Copy link
Contributor Author

Well, my initial test involved a rosjava node and a c++ node and I unfortunately only measured on the rosjava side. Looking back this was a bit premature, as I didn't really check the transmitted messages on TCP level. But for the final implementation such tests are fundamental...

Anyhow, I'll take a second look at how connections are established and try to integrate this option if possible. Of course, I'll only add new overloaded functions and keep the default functionality as it was before to ensure backward compatibility.

Thanks again for your interest in this issue and your quick feedback!

@StefanGlaser
Copy link
Contributor Author

Hi again,
just a quick update: I'm happy to report that I managed to solve this issue in an elegant way.
I'll create a merge request and put together some test results, soon.

@jubeira
Copy link

jubeira commented Mar 29, 2019

Sounds great! Thanks @StefanGlaser ! 😃

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants