Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[inetstack] TCP ctrlblk checks SeqNumber of out-of-order FIN #924

Open
wants to merge 1 commit into
base: dev
Choose a base branch
from
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
5 changes: 3 additions & 2 deletions src/rust/inetstack/protocols/tcp/established/ctrlblk.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1122,8 +1122,9 @@ impl<const N: usize> ControlBlock<N> {
if added_out_of_order {
match self.out_of_order_fin.get() {
Some(fin) => {
debug_assert_eq!(fin, recv_next);
return true;
if fin == recv_next {
return true;
}
Comment on lines +1125 to +1127
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have an example that show cases this error? self.out_of_order_fin.get() is indeed supposed to match recv_next at this point. That is why we have the debug_assert!() there. With your change we are now failing silently and returning something that is not expected from this function.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I faced this issue while using our own codebase based on an old version of Demikernel. In an experiment using an open-loop TCP load generator, the behavior of the error case I observed was as follows:

  • Some TCP packets are dropped in the middle
  • While those dropped packets are recovered, subsequent packets are stored in the out-of-order buffer
  • Once the out-of-order buffer becomes full (size limit 16 in Demikernel), more packets are dropped
  • Eventually, the client reaches the last packet and FIN, which is stored at the end of the out-of-order buffer
  • Thus, the out-of-order buffer has some packets and FIN, and there are some dropped packets in between those out-of-order packets and FIN.
  • Without the above condition, the server processes the FIN message while it has not received all prior messages yet.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, but then it looks like the actual bug is in the way that we queue out of order packets, right? If we check in this, I would suggest at least to add a warning statement there. What are your thoughts @iyzhang .

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we probably queue them in the same order that they arrive and it sounds like we might have some dropped packets that arrive after the fin and we should process them first.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change is actually correct, the debug_assert shouldn't be there. This code predates me, but I didn't recognize the bug when I wrote the comments on line 1120 and 1121, which should now also be removed.

Basically, the bug here is an assumption that if a packet arrives that fills a hole in the sequence space (due to a previously dropped packet) and thus we can now take some subsequent data off of the out-of-order queue and put it on the receive queue, that that data is all the data on the out-of-order queue (and thus we should now process the FIN). But if there were multiple holes in the sequence space (due to multiple drops of non-contiguous packets), this assumption would be false. We could "added_out_of_order" without adding all out of order, and thus shouldn't receive the FIN yet.

This bug could be fixed in other ways, however, that might both be clearer and save a couple of "if" statements for better perf. If "added_out_of_order" was renamed "added_all_out_of_order", and a "added_all_out_of_order = false" was added to the else clause between lines 1099 and 1100, then the original code would go back to being correct, and we could keep the debug_assert to check correctness in debug builds.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @BrianZill, I think the way of using "added_all_out_of_order" doesn't work for this case.

Let's consider a TCP stream with sequence numbers 1, 2, 3, 4, 5 (where 5 is the FIN).
If the server receives them in out-of-order as 1 3 5 2 4, the behavior will be

  • 1 is received
  • 3 is stored into the out-of-order queue
  • 5 is stored as out-or-order-fin
  • 2 is received, and then 3 is taken and received from out-of-order queue (out-of-order queue becomes empty and "added_all_out_of_order" is true at this point)
  • 5 (FIN) is processed before receiving 4

so, "added_all_out_of_order" doesn't actually work as we intend here.

What do you think?

},
_ => (),
}
Expand Down