Fix invalid handling of pids.limit=0 from runtime spec #4023
@haircommander PTAL
  },
  cli.IntFlag{
  Name: "pids-limit",
- Usage: "Maximum number of pids allowed in the container",
+ Usage: "Maximum number of tasks; use '-1' for unlimited",
Maybe some users will be confused with -r and runtime-spec?
runc update -r - containerid
{
"pids": { "limit": 0 }
}
- This is the way it has always worked.
- It seems that the behavior of runc update is not defined in the spec.
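The difference discussed in this thread comes down to whether the decoded resources distinguish a missing pids object from a zero limit. Below is a minimal sketch, assuming local stand-in types (pids, resources) that mirror the shape of the runtime-spec structs and a hypothetical helper named describe. It illustrates the runtime-spec reading quoted in this PR, not what runc update actually does:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pids and resources are local stand-ins (assumptions for illustration)
// that mirror the shape of the runtime-spec types: the "pids" object is
// optional, so it is held behind a pointer.
type pids struct {
	Limit int64 `json:"limit"`
}

type resources struct {
	Pids *pids `json:"pids,omitempty"`
}

// describe is a hypothetical helper that reports how a resources document
// expresses its pids limit under the runtime-spec reading:
// no "pids" object means unset, limit <= 0 means unlimited.
func describe(doc string) (string, error) {
	var r resources
	if err := json.Unmarshal([]byte(doc), &r); err != nil {
		return "", err
	}
	switch {
	case r.Pids == nil:
		return "unset", nil
	case r.Pids.Limit <= 0:
		return "unlimited", nil
	default:
		return fmt.Sprintf("limit=%d", r.Pids.Limit), nil
	}
}

func main() {
	docs := []string{`{}`, `{"pids": {"limit": 0}}`, `{"pids": {"limit": 100}}`}
	for _, doc := range docs {
		s, err := describe(doc)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%-26s => %s\n", doc, s)
	}
}
```

With a pointer, unset and zero stay distinguishable; once the value is copied into a plain integer field, one of the integer values has to stand in for "unset".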
  if r.Pids != nil {
- c.Resources.PidsLimit = r.Pids.Limit
+ if r.Pids.Limit > 0 {
I don't know whether upstream projects need to write code like this when they update a container's pids limit:
if r.Pids.Limit > 0 {
c.Resources.PidsLimit = r.Pids.Limit
} else {
c.Resources.PidsLimit = -1
}
@AkihiroSuda Do you know what the situation is in nerdctl?
Maybe the kubelet maintainers should know about this:
https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/cm/cgroup_manager_linux.go#L355-L381
So I still think we should fix it in pids.go, because others always use Manager.Set directly without using specconv.
But we have to solve two problems if we choose a solution like #4015:
- Don't always write pids.max if PidsLimit == 0;
- Don't introduce breaking changes to the libcontainer API.
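The first constraint can be sketched as a small decision function. This is a minimal illustration, assuming the libcontainer semantics described in this PR (0 means unset, -1 means unlimited); pidsMaxValue is a hypothetical name, not a function in runc, and treating every negative value as unlimited is an assumption of this sketch:

```go
package main

import (
	"fmt"
	"strconv"
)

// pidsMaxValue is a hypothetical helper (not part of runc). Given an
// internal pids limit, it returns the string to write to pids.max and
// whether anything should be written at all.
//
//	 0  -> unset: leave pids.max untouched
//	<0  -> unlimited: write "max" (assumption: any negative value)
//	>0  -> write the decimal value
func pidsMaxValue(pidsLimit int64) (value string, write bool) {
	switch {
	case pidsLimit == 0:
		return "", false
	case pidsLimit < 0:
		return "max", true
	default:
		return strconv.FormatInt(pidsLimit, 10), true
	}
}

func main() {
	for _, limit := range []int64{0, -1, 1024} {
		value, write := pidsMaxValue(limit)
		fmt.Printf("PidsLimit=%d write=%v value=%q\n", limit, write, value)
	}
}
```

Keeping 0 as "do not write" preserves the existing meaning of the field for callers that use the cgroup manager directly, which is the API compatibility concern in the second bullet.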
I took a look at git history, and it seems that libcontainer always treated 0 as unset for pids limit, since commit db3159c from January 2016 that added the initial pids limit support (alas, the very same commit incorrectly documents the field (db3159c#diff-b71b6973c045d22e41e802cf65cade82c573a844190bc0d06a6ade9f21cb2c5c), and this PR fixes that).
Thus, I am pretty sure libcontainer users are aware of what 0 means for pids limit in libcontainer; yet I am going to look into kubernetes and nerdctl to confirm.
Now, we have two problems to solve:
1. runc violates the runtime spec by treating pids.limit==0 as unset rather than unlimited (Undefined (and potentially incorrect) behavior when pids limit is set to 0 #4014);
2. libcontainer's handling of pids limit value differs from runtime spec.
I think that 2 is much less of a problem than 1, and can be solved by properly documenting the PidsLimit field (this is what this PR does).
FWIW: case 1 is the only case I care about fixing
LGTM, thanks!
If we agree that we only need to fix it in ‘runc create’ and ‘runc run’, then users of ‘runc update’ and ‘libcontainer’ should fix this issue in their own projects according to runc’s documentation. This one LGTM.
Yes, both |
LGTM. To fix this issue completely, projects that use ‘runc update’ or the libcontainer API (for example, Kubernetes, containerd, and cri-o) need PRs opened by their own contributors.
@opencontainers/runc-maintainers PTAL After this one is backported to 1.1, we can release
@AkihiroSuda PTAL
I still need to take a close look, but would like @cyphar to have a look; I think there's still some ambiguity in the description of the PR that updated the specs, opencontainers/runtime-spec#234 ("there is no limit", meaning "don't (set) a limit (and use the active max)", vs "explicitly set it to max/unlimited"). The mention of "Go's unset is zero value" in that description may indicate that
And it wouldn't be the first time that specs were updated to reflect the implementation and now, due to ambiguity in the wording, the implementation is being updated to reflect the spec (going full circle).
Rebased
This test case checks that unified resources override those set by conventional means, but it does not set conventional pids limit. Fix this. Signed-off-by: Kir Kolyshkin <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
It has been pointed out that runc incorrectly ignores pids.limit=0 set in the runtime spec. This happens because runtime-spec says "default is unlimited" and also allows for Pids to not be set at all, thus distinguishing unset (Resources.Pids == nil) from unlimited (Resources.Pids.Limit <= 0).
Internally, runc also distinguishes unset from unlimited, but since we do not use a pointer, we treat 0 as unset and -1 as unlimited.
Add a conversion code to libcontainer/specconv.
Add a test case to check that starting a container with pids.limit=0 results in no pids limit (the test fails before the fix when systemd cgroup manager is used, as systemd apparently uses parent's TasksMax).
NOTE that runc update still treats 0 as "unset".
Finally, fix/update the documentation here and there.
Should fix issue 4014.
Reported-by: Peter Hunt <[email protected]>
Signed-off-by: Kir Kolyshkin <[email protected]>
Reading through my (8-year-old 😰) comments, I think that there were three issues at play here:
Looking at it now, I think the ideal option would be to change the spec so that
Removing the
👍
In the meantime, CRI-O can just set -1 where it originally set 0: cri-o/cri-o#7503
What's the next step?
Let's close it and change the spec instead,
This is an alternative to #4015 (smaller set of code changes, no libct API breakage).
It has been pointed out in #4014 that runc incorrectly ignores pids.limit=0 set
in the runtime spec. This happens because runtime-spec says "default is
unlimited" and also allows for Pids to not be set at all, thus
distinguishing unset (Resources.Pids == nil) from unlimited
(Resources.Pids.Limit <= 0).
Internally, runc also distinguishes unset from unlimited, but since we
do not use a pointer, we treat 0 as unset and -1 as unlimited.
Add conversion code to libcontainer/specconv.
Add a test case to check that starting a container with pids.limit=0
results in no pids limit (the test fails before the fix when systemd
cgroup manager is used, as systemd apparently uses parent's TasksMax;
see #4022 for CI runs).
NOTE that runc update still treats 0 as "unset".
Finally, fix/update the documentation here and there.
Fixes: #4014.
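
The conversion described above can be sketched as follows. This is a minimal illustration under stated assumptions: specPids is a local stand-in for the runtime-spec pids object, and toInternalPidsLimit is a hypothetical name, not the actual specconv code.

```go
package main

import "fmt"

// specPids is a local stand-in (an assumption for illustration) for the
// runtime-spec pids object, which is optional and therefore a pointer.
type specPids struct {
	Limit int64
}

// toInternalPidsLimit is a hypothetical helper that maps the runtime-spec
// representation onto the internal one described in this PR:
//
//	nil pointer -> 0  (unset)
//	Limit <= 0  -> -1 (unlimited)
//	Limit > 0   -> Limit
func toInternalPidsLimit(p *specPids) int64 {
	if p == nil {
		return 0
	}
	if p.Limit > 0 {
		return p.Limit
	}
	return -1
}

func main() {
	fmt.Println(toInternalPidsLimit(nil))
	fmt.Println(toInternalPidsLimit(&specPids{Limit: 0}))
	fmt.Println(toInternalPidsLimit(&specPids{Limit: 512}))
}
```

Doing the mapping once at the spec boundary leaves the internal field's meaning unchanged, which is why this approach avoids libcontainer API breakage.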