Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Using socket_posix to replace socket_starboard #3904

Merged
merged 1 commit into from
Sep 19, 2024

Conversation

maxz-lab
Copy link
Contributor

@maxz-lab maxz-lab commented Jul 27, 2024

Replace net/tcp&udp_starboard with net/tcp&udp_posix.

b/302741384

@datadog-cobalt-youtube
Copy link

datadog-cobalt-youtube bot commented Jul 29, 2024

Datadog Report

Branch report: maxz-sb16-sock-netposix
Commit report: abc343d
Test service: cobalt

✅ 0 Failed, 33856 Passed, 6 Skipped, 7m 17.7s Total Time

@maxz-lab maxz-lab force-pushed the maxz-sb16-sock-netposix branch 7 times, most recently from f14ab03 to aaf0f38 Compare August 1, 2024 01:07
@maxz-lab maxz-lab force-pushed the maxz-sb16-sock-netposix branch 3 times, most recently from 1210dd6 to e221129 Compare August 8, 2024 23:33
@maxz-lab maxz-lab force-pushed the maxz-sb16-sock-netposix branch 8 times, most recently from 9ffed3b to 11c4097 Compare August 16, 2024 17:48
@maxz-lab maxz-lab force-pushed the maxz-sb16-sock-netposix branch 7 times, most recently from 6f6b89f to a4422f3 Compare August 26, 2024 04:58
@maxz-lab maxz-lab marked this pull request as ready for review August 27, 2024 23:43
@maxz-lab maxz-lab requested a review from a team as a code owner August 27, 2024 23:43
@jellefoks
Copy link
Member

Can you first make sure all the tests pass?

@maxz-lab maxz-lab force-pushed the maxz-sb16-sock-netposix branch 4 times, most recently from 6c1f2c5 to 72d5274 Compare September 10, 2024 21:08
@maxz-lab maxz-lab enabled auto-merge (squash) September 11, 2024 00:59
@maxz-lab maxz-lab requested a review from y4vor September 11, 2024 00:59
@maxz-lab maxz-lab enabled auto-merge (squash) September 12, 2024 16:49
@maxz-lab maxz-lab enabled auto-merge (squash) September 12, 2024 16:51
DCHECK(!accept_socket_);

if ((!socket_)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which case do we call this, and why isn't it an issue in upstream?

Copy link
Contributor Author

@maxz-lab maxz-lab Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is needed when the socket_ isn't properly populated. This case happens when IPv6 is not supported by host (only evergreen_test fails on github container who doesn't support such). tck_socket_starboard handles this case the same way.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the upstream code to handle this, we should fix the place where Accept() is called with nullptr for tcp_socket.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of changing the upstream code to handle this, we should fix the place where Accept() is called with nullptr for tcp_socket.

};
int result;
int bytes_transferred = HANDLE_EINTR(recvmsg(socket_, &msg, 0));
int result = -1;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few reasons why changing this to be different than upstream is scary. We really should support recvmsg in Starboard.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recvmsg() behaves differently from recvfrom() only when the socket is in blocking mode. All of the sockets created here are set to non-blocking by default, these two functions essentially provide the same functionality in our case.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm pretty sure that the Chrome implementation is correct, so any changes made should not be here.

They are not the same. There are edge case differences (wrt truncated packages) and performance differences that matter. We should avoid changing this without good reason. Perhaps the issue is that we shouldn't be setting our UDP sockets to nonblocking.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The crux is fragility in relying on the return value to detect truncated packets (in addition to understandable sensitivity in Chrome to any changes here without detailed performance investigation).

Posix says Upon successful completion, recvfrom() shall return the length of the message in bytes while the linux man page says that plus These calls return the number of bytes received.

Those two statements contradict each other for truncated packets, since in that case the message was longer than the number of bytes received, since "the excess is discarded" as allowed in both documentation of both Posix and the linux man page.

The difference is in interpretation: Whether a truncated message is treated as a successful completion or not, and if treated as successful, whether a discarded byte is still counted as part of "the length of the message".

There is no such confusion when using recvmsg.

This code in upstream used to use recvfrom before detection of packet truncation was added in this CL: https://chromium-review.googlesource.com/c/chromium/src/+/657780

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, we're going to need this to be using recvmsg for the best behavior on the network because the ECN bits aren't signaled when using recvfrom. Support for this is actively being worked on in upstream.

https://crbug.com/40285166
https://en.wikipedia.org/wiki/Explicit_Congestion_Notification

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

recvmsg is restored.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR was merged, but this conversation was ignored?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have restored recvmsg and forgot to push the change, as I explained in the chat channel. The new patch has been submitted.

class IOBuffer;
struct SockaddrStorage;

// Socket class to provide asynchronous read/write operations on top of the
// posix socket api. It supports AF_INET, AF_INET6, and AF_UNIX addresses.
class NET_EXPORT_PRIVATE SocketPosix
: public base::MessagePumpForIO::FdWatcher {
: public base::MessagePumpForIO::Watcher {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of this, we should remove the use of message_pump_io_starboard.h in base/message_loop/message_pump_for_io.h and from there use the base/message_loop/message_pump_libevent.h under the BUILDFLAG(IS_POSIX), then we don't have to make these scary changes here.

If that is nontrivial and these changes are temporary, then MessagePumpIOStarboard should be made to have an interface compatible with MessagePumpLibevent so that the changes here are not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This CL is knid of temporary and we will decide later based on further performance analysis.
b/302741384 tracks separate effort to migrate /consolidate different message_pumps.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clearly upstream code. Any non-upstreamed changes made for starboardization should be clearly visible as such and these changes don't, but they do actually have an impact on both the behavior and performance of this code.

The top #3 native crash in the latest releases are in a call to ClearWatcherIfOperationsNotPending that currently only exists in TCPSocketStarboard, and this PR replaces calls to StopWatchingFileDescriptor(); calls with calls to a very similar ClearWatcherIfOperationsNotPending that it adds here to socket_posix. That's not a good sign.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The PR was merged, but this conversation was ignored?

Using base/starboard implementation to work with net/tcp&udp.
Combine read and write waiter into one to match starboard's
  setup.
Updated some win32 test cases treat "\r" and "\r\n"
Check evergreen bind with IPv6 err

b/302741384
Copy link
Contributor

@yell0wd0g yell0wd0g left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tests passing - RS LGTM!

@maxz-lab maxz-lab merged commit e3a485c into main Sep 19, 2024
257 of 259 checks passed
@maxz-lab maxz-lab deleted the maxz-sb16-sock-netposix branch September 19, 2024 18:14
@@ -237,7 +239,6 @@ int TCPSocketPosix::Bind(const IPEndPoint& address) {
SockaddrStorage storage;
if (!address.ToSockAddr(storage.addr, &storage.addr_len))
return ERR_ADDRESS_INVALID;

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did you delete this line in upstream code?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was this merged with whitespace changes to upstream code?

@@ -171,7 +171,7 @@ TEST_F(PacFileFetcherImplTest, HttpMimeType) {
TRAFFIC_ANNOTATION_FOR_TESTS);
EXPECT_THAT(result, IsError(ERR_IO_PENDING));
EXPECT_THAT(callback.WaitForResult(), IsOk());
EXPECT_EQ(u"-pac.txt-\n", text);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure that it's OK that when we run it, the newline is dropped?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Specifically, the file being fetched 'pac.txt' does have the newline, and the upstream test has always assumed that the newline is also read, so I don't think the test was wrong.

jellefoks pushed a commit to jellefoks/cobalt that referenced this pull request Oct 3, 2024
Replace net/tcp&udp_starboard with net/tcp&udp_posix.

b/302741384
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants