-
Notifications
You must be signed in to change notification settings - Fork 21
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
issue: 3586273 Use XLIO_DEFERRED_CLOSE by default #136
Open
iftahl
wants to merge
163
commits into
Mellanox:vNext
Choose a base branch
from
iftahl:deferred_close
base: vNext
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
At most a single element of this vector is always used. Once rfs constructor is complete there must be exactly one attach_flow_data element in case of ring_simple. For ring_tap this element remains null. Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Alex Briskin <[email protected]>
Set ETIMEDOUT errno and return -1 from recv in case a socket was timed out, instead of 0 return value and 0 errno. For instance, in case of TCP keep alive timeout. Signed-off-by: Alexander Grissik <[email protected]>
The idea is to scan all rpm/deb packages for personal emails we should not be releasing packages with such emails the scan is done on both the metadat info and the changelog of a specific package Issue: HPCINFRA-919 Signed-off-by: Daniel Pressler <[email protected]>
reclaim_recv_single_buffer() accumulates buffers in a list. In the performance oriented API we want to reuse hot buffers immediately, so reclaim_recv_buffers() implementation is more suitable. Signed-off-by: Dmytro Podgornyi <[email protected]>
The memory callback provides hugepage size of the underlying pages. Replace hardcoded 0 with real hugepage size. Keep the page size in xlio_allocator object. This field a relevant only the hugepage allocation method and 0 in all other cases. Signed-off-by: Dmytro Podgornyi <[email protected]>
XLIO Socket API must guarantee that the XLIO_SOCKET_EVENT_TERMINATED is not followed by any other events. Therefore, all the TX completion events must be completed by that moment. Do a polling iteration before calling socket destructor to increase the chance that all the relevant WQEs are completed. This mechanism needs to be improved in the future. Signed-off-by: Dmytro Podgornyi <[email protected]>
xlio_init_ex() changes some default parameters. However, a global object can trigger safe_mce_sys() constructor at the start. Therefore, we need to re-read the environment variables again to guarantee that the changed parameters take place. Signed-off-by: Dmytro Podgornyi <[email protected]>
Avoid using connect() with sock fd interface, because fd_collection doesn't keep xlio_socket_t objects. Signed-off-by: Dmytro Podgornyi <[email protected]>
xlio_socket_t objects aren't connected to the fd_collection anymore. Therefore, all the methods must be called from the sockinfo_tcp objects directly. Also, xlio_socket_fd() is not relevant anymore and can be removed. Signed-off-by: Dmytro Podgornyi <[email protected]>
Iterate over std::list of TCP sockets while erasing socket during iteration. Overcomed by increasing iterator before erase. Signed-off-by: Iftah Levi <[email protected]>
rdma-core limits number of UARs per context to 16 by default. After creating 16 QPs, XLIO receives duplicates of blueflame registers for each subsequent QP. As results, blueflame doorbell method can write WQEs concurrently without serialization and this leads to a data corruption. BlueFlame can make impact on throughput, since copy to the blueflame register is expensive. It can improve latency in some low latency scenarios, however, XLIO targets high traffic/PPS rates. Removing blueflame method also slightly improves performance in some scenarios. BlueFlame can be returned back in the future to improve low-latency scenarios, however, it will need some rework to avoid the data corruption. Signed-off-by: Dmytro Podgornyi <[email protected]>
The inline WQE branch is not likely in most throughput scenarios. Signed-off-by: Dmytro Podgornyi <[email protected]>
Avoid calling register_socket_timer_event when a socket is already registered (TIME-WAIT). Although there is no functionality issue with that, it produces too high rate of posting events for internal-thread. This leads to lock contantion inside internal-thread and degraded performance of HTTP CPS. Signed-off-by: Alexander Grissik <[email protected]>
Signed-off-by: Gal Noam <[email protected]>
UTLS uses tcp_tx_express() for non blocking sockets. However, this TX method doesn't support XLIO_RX_POLL_ON_TX_TCP. Additional RX polling improves scenarios such as WEB servers. Insert RX polling into UTLS TX path to resolve performance degradation. Signed-off-by: Dmytro Podgornyi <[email protected]>
In heavy CPS scenarios a socket may go to TIME-WAIT state and be reused before first TCP timer registration is performed by internal-thread. 1. Setting timer_registered=true while posting the event prevents the second attemp to try and post the event again. 2. Adding sanity check in add_new_timer that verifies that the socket is not already in the timer map. Signed-off-by: Alexander Grissik <[email protected]>
Added new env parameter - XLIO_MAX_TSO_SIZE. It allows the user to control maximum size of TSO, instead of taking the maximum cap by HW. The default size is 256KB (maximum by current HW). Values higher than HW capabilities won't be taken into account. Signed-off-by: Iftah Levi <[email protected]>
Signed-off-by: Gal Noam <[email protected]>
iftahl
force-pushed
the
deferred_close
branch
3 times, most recently
from
April 14, 2024 18:01
b70ab11
to
72c5bc9
Compare
For incoming sockets - no change. For outgoing sockets - since outgoing sockets occupy a local port, we should release it on the socket destructor to prevent race from another socket to use the same port. This race might cause XLIO to hold 2 sockets with the same RFS object at the same time, and this is fatal (particularly for TCP). Signed-off-by: Iftah Levi <[email protected]>
For tests related to bind() Signed-off-by: Iftah Levi <[email protected]>
bot:retest |
1 similar comment
bot:retest |
AlexanderGrissik
force-pushed
the
vNext
branch
from
August 18, 2024 08:58
cfce2e8
to
51c2340
Compare
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
For incoming sockets - no change.
For outgoing sockets - since outgoing sockets occupy a local port, we should release it on the socket destructor to prevent race from another socket to use the same port.
This race might cause XLIO to hold 2 sockets with the same RFS object at the same time, and this is fatal (particularly for TCP).
Change type
What kind of change does this PR introduce?
Check list