Implemented the clone fallback when clone3 returns ENOSYS #2203
For a number of reasons, platforms can choose to block clone3 and force it to return ENOSYS. We implement a clone fallback for the case where clone3 cannot be used.

Also, clone3 has no libc wrapper at this point, so the current implementation invokes the raw syscall directly. Creating a process while bypassing libc can potentially lead to undefined behavior. However, we have not observed any issues in our tests. This is likely because `youki` runs short-lived processes that call exec or exit at the end. Nonetheless, we should have a backup plan, and this change is our way out in case we discover that clone3 has issues as the default code path.

Remove the use of the clone3 crate. We use `clone3` in a very specific way to create a process. We don't have to support the many other flags and use cases of the `clone3` call, so it is simpler for us to use the libc crate directly for the syscall. This avoids an extra dependency and reduces our binary size.

Signed-off-by: yihuaf <[email protected]>
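The fallback decision described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `clone_with_fallback` is a hypothetical helper, the two closures stand in for the raw clone3 syscall and the clone call, and the `ENOSYS` value of 38 is an assumption that holds on most Linux architectures (real code would take it from the libc crate).

```rust
use std::io;

/// Assumed errno value for "function not implemented" on most Linux
/// architectures. Real code would use the constant from the libc crate.
const ENOSYS: i32 = 38;

/// Hypothetical helper: run `primary` (standing in for the raw clone3
/// syscall) and, only if it fails with ENOSYS, run `fallback` (standing in
/// for clone). Any other result, success or error, is returned unchanged.
fn clone_with_fallback<P, F>(primary: P, fallback: F) -> io::Result<i32>
where
    P: FnOnce() -> io::Result<i32>,
    F: FnOnce() -> io::Result<i32>,
{
    match primary() {
        // clone3 is blocked or unsupported on this platform: use clone.
        Err(err) if err.raw_os_error() == Some(ENOSYS) => fallback(),
        // Success or an unrelated error: do not mask it with the fallback.
        other => other,
    }
}
```

Only ENOSYS triggers the fallback in this sketch, so an unrelated failure such as EPERM still surfaces to the caller instead of being retried through clone.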
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2203      +/-   ##
==========================================
+ Coverage   64.93%   65.05%   +0.11%
==========================================
  Files         129      129
  Lines       14979    15139     +160
==========================================
+ Hits         9727     9848     +121
- Misses       5252     5291      +39
Hey @yihuaf, is it possible that we do this as a To be clear, I am saying that we should keep the current fallback mechanism even with the feature enabled, but if the feature is disabled, we will directly use clone, without trying clone3 first. wdyt?
I think this is a great idea, but I would want to implement it in a separate follow-up PR. This will also make testing this feature easier; otherwise I have to resort to seccomp to force clone3 to return ENOSYS.
Great challenge 👍 Is it possible to have it benchmarked?
So there is no visible performance difference between This is using
This is using
With the fallback path, we do need to make an extra
If there is no blocking issue, I would like to get this merged as is and address any review comments in a follow-up PR along with
We often use this to quickly get a sense of the performance impact of a PR change. Signed-off-by: yihuaf <[email protected]>
lgtm 👍 I think @utam0k didn't have any issues with this, but if there are any problems, we can fix them in follow up PRs. Merging this for now --- After the CI is done :)
#!/usr/bin/env bash
set -euo pipefail
Hey, given that this is a single command, I don't think we need to specify the shell here; we can just run it. Also, I think this is currently the best way we have to run benchmarks, and it is not in hacks, so we can just name it benchmark instead of hack benchmark.
@yihuaf, please take a look at this in your next PR with the clone-related changes. Merging this one since it is not a blocking issue.