
kubectl debug: profile "sysadmin" does not work as expected when uid != 0 is specified #1650

Open
Phil1602 opened this issue Sep 10, 2024 · 10 comments
Labels
kind/bug Categorizes issue or PR as related to a bug. priority/backlog Higher priority than priority/awaiting-more-evidence. triage/accepted Indicates an issue or PR is ready to be actively worked on.



Phil1602 commented Sep 10, 2024

What happened:
I wanted to create an ephemeral container with the sysadmin (or netadmin) profile so that I can capture traffic with tcpdump, using the following command:

kubectl debug test-pod -it --image nicolaka/netshoot --profile=sysadmin -- zsh

Defaulting debug container name to debugger-wj6qq.
If you don't see a command prompt, try pressing enter.

test-pod% whoami
whoami: unknown uid 1000

test-pod% tcpdump
tcpdump: eth0: You don't have permission to perform this capture on that device
(socket: Operation not permitted)

The ephemeral container is set to privileged: true as expected, but the Pod-level securityContext forces the ephemeral container to run as user 1000, which is in my opinion unwanted behavior for an ephemeral container with the sysadmin profile set.
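
The applied securityContext of the ephemeral container can be checked with a jsonpath query like the one below (illustrative; with the sysadmin profile I would expect it to contain only privileged: true, and the actual output may differ):

kubectl get pod test-pod -o jsonpath='{.spec.ephemeralContainers[*].securityContext}'

{"privileged":true}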

What you expected to happen:
I would expect my ephemeral container with the sysadmin profile to be able to capture traffic in any case.

In the container-level securityContext I would expect not only privileged: true but also runAsUser: 0, to avoid such user overrides coming from the Pod level. Alternatively, a parameter to override the user for the ephemeral container would help in that regard as well.
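
For illustration only, this is roughly the ephemeral container spec I would expect the sysadmin profile to produce (a sketch, not what kubectl currently generates; the debugger name is just the one from the example above):

ephemeralContainers:
- name: debugger-wj6qq
  image: nicolaka/netshoot
  securityContext:
    privileged: true
    runAsUser: 0 # hypothetical: would neutralize the Pod-level runAsUser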

How to reproduce it (as minimally and precisely as possible):

  1. Create a Pod with the following securityContext:
apiVersion: v1
kind: Pod
metadata:
  labels:
    run: test-pod
  name: test-pod
spec:
  securityContext:
    runAsUser: 1000 # Override user != 0
  containers:
  - image: kennethreitz/httpbin
    name: test-pod
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
  2. Attach an ephemeral container via the debug command with the sysadmin profile set:
kubectl debug test-pod -it --image nicolaka/netshoot --profile=sysadmin -- zsh

Defaulting debug container name to debugger-wj6qq.
If you don't see a command prompt, try pressing enter.

test-pod% whoami
whoami: unknown uid 1000

test-pod% tcpdump
tcpdump: eth0: You don't have permission to perform this capture on that device
(socket: Operation not permitted)
test-pod% 

Anything else we need to know?:

Environment:

  • Kubernetes client and server versions (use kubectl version):
Client Version: v1.30.3
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.30.3-eks-a18cd3a
  • Cloud provider or hardware configuration: AWS EKS (see above)
Phil1602 added the kind/bug label on Sep 10, 2024
k8s-ci-robot added the needs-triage label on Sep 10, 2024
@ardaguclu (Member)

@mochizuki875 what do you think about this?

@mochizuki875 (Member) commented Sep 10, 2024

@ardaguclu
I think the same situation is happening here with the ephemeral container.

When privileged: true is set for a container running as the root user, the following capability set is applied.
The key point is CapEff, which holds the capabilities that are actually used for permission checks.

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: privileged
  name: privileged
spec:
  containers:
  - image: busybox
    command: ["sh", "-c", "sleep infinity"]
    name: privileged
    securityContext:
      privileged: true
  terminationGracePeriodSeconds: 0
$ kubectl exec -it privileged -- /bin/sh
/ # whoami
root
/ # grep Cap /proc/1/status
CapInh:	0000000000000000
CapPrm:	000001ffffffffff
CapEff:	000001ffffffffff
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000
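
As a side note, the hex masks can be decoded into capability names with the capsh utility from libcap, assuming it is installed (it is not part of busybox, so run it on the node, for example). Output abbreviated:

$ capsh --decode=000001ffffffffff
0x000001ffffffffff=cap_chown,cap_dac_override,...,cap_net_admin,cap_net_raw,...,cap_checkpoint_restore

cap_net_raw and cap_net_admin are the capabilities tcpdump needs for packet capture.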

On the other hand, when a non-root user is specified via runAsUser, the container process does not get the appropriate effective capabilities (CapEff) even if privileged: true or specific capabilities are set, so it cannot obtain the required permissions.

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: runasuser-with-privileged
  name: runasuser-with-privileged
spec:
  securityContext:
    runAsUser: 1000
  containers:
  - image: busybox
    command: ["sh", "-c", "sleep infinity"]
    name: runasuser-with-privileged
    securityContext:
      privileged: true
  terminationGracePeriodSeconds: 0
$ kubectl exec -it runasuser-with-privileged -- /bin/sh
~ $ whoami
whoami: unknown uid 1000
~ $ grep Cap /proc/1/status
CapInh:	0000000000000000
CapPrm:	0000000000000000
CapEff:	0000000000000000
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000

I have not checked the details yet, but this issue has been reported in #56374, and KEP #2763 has been proposed.
However, it does not seem to be implemented yet.

So currently, I think the simplest workaround is to define runAsUser under the containers field.

apiVersion: v1
kind: Pod
metadata:
  labels:
    run: test-pod
  name: test-pod
spec:
  # securityContext:
  #   runAsUser: 1000 # Override user != 0
  containers:
  - image: kennethreitz/httpbin
    name: test-pod
    securityContext:
      runAsUser: 1000 # Override user != 0
    resources: {}
  dnsPolicy: ClusterFirst
  restartPolicy: Always
$ kubectl debug test-pod -it --image nicolaka/netshoot --profile=sysadmin -- zsh
Defaulting debug container name to debugger-dwt84.
If you don't see a command prompt, try pressing enter.
test-pod  ~  whoami
root
test-pod  ~  grep Cap /proc/$$/status
CapInh:	0000000000000000
CapPrm:	000001ffffffffff
CapEff:	000001ffffffffff
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000

or to use a custom profile:

profile-runas-root.yaml

securityContext:
  runAsUser: 0
  privileged: true
$ kubectl debug test-pod -it --image=busybox --custom=profile-runas-root.yaml -- /bin/sh
Defaulting debug container name to debugger-mjp6g.
If you don't see a command prompt, try pressing enter.
/ # grep Cap /proc/$$/status
CapInh:	0000000000000000
CapPrm:	000001ffffffffff
CapEff:	000001ffffffffff
CapBnd:	000001ffffffffff
CapAmb:	0000000000000000

/ # exit

$ kubectl get pod test-pod -o=jsonpath='{.spec.ephemeralContainers[0].securityContext}' | jq .
{
  "privileged": true,
  "runAsUser": 0
}

Another solution I can think of is to set runAsUser: 0 on the ephemeral container when --profile=sysadmin or --profile=netadmin is specified.
However, I don't know whether that is appropriate...

@ardaguclu (Member)

Thanks a lot for your extensive investigation @mochizuki875; it is really helpful.

I think we should wait until KEP kubernetes/enhancements#2763 is revived; until then, the suggested workaround to overcome this issue is to use custom profiling ^^.

/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Sep 11, 2024
@mochizuki875 (Member)

@ardaguclu
Thank you for the review.
I agree with that, and I think it's one of the cases where a custom profile works well 👍

frittentheke commented Sep 11, 2024

> So currently, I think the simplest workaround is to define runAsUser under the containers field.

There are workarounds, but surely a user (of kubectl) should not have to do all of that simply to get a privileged ephemeral container running as UID 0 (or any other UID). The running pods and their runAsUser might originate from some upstream Helm chart, and a workaround is only good if it can be found easily. We don't want to discourage and scare people away from unprivileged and UID != 0 containers just because it's one or two steps harder to debug them ;-)

> Thank you for the review.
> I agree with that, and I think it's one of the cases where a custom profile works well 👍

> Another solution I can think of is to set runAsUser: 0 on the ephemeral container when --profile=sysadmin or --profile=netadmin is specified.
> However, I don't know whether that is appropriate...

If the ephemeral container does not have anything else in its spec, that is totally reasonable. And since we are talking about kubectl patching a pod based on distinct CLI options to dynamically create a debug container anyway, it seems all the more reasonable to simply add this aspect to the spec patch created and controlled by kubectl?

@Phil1602 (Author)

The workaround using custom profiles is fine for me, but as @frittentheke already said:

I totally agree that the client-side profile sysadmin indicates it creates an ephemeral container that is capable of performing operations requiring capabilities. Even though the root cause might be that privileged does not work as expected, one would probably expect to be root when setting sysadmin.

Maybe we could at least implement a client-side warning if the Pod spec contains runAsUser, to inform the user that their sysadmin ephemeral container does not run as root?

@ardaguclu (Member)

Thank you @frittentheke and @Phil1602 for dropping your valuable comments.

> Maybe we could at least implement a client-side warning if the Pod spec contains runAsUser, to inform the user that their sysadmin ephemeral container does not run as root?

That's exactly what I was thinking about.

@mochizuki875 (Member)

> That's exactly what I was thinking about.

OK, I'll do that and create a PR.

/assign

@ardaguclu (Member)

@mochizuki875 I think we can recommend the possible custom profiling configuration we discussed ^^ in this warning message.
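
For example, the warning could read roughly like this (just a rough sketch of the wording, not a final message):

Warning: Pod test-pod sets a non-root runAsUser in its Pod-level securityContext; the sysadmin
profile keeps that user, so the debug container will not run as root and will have no effective
capabilities. Consider a custom profile via --custom, e.g. a file containing:
  securityContext:
    runAsUser: 0
    privileged: true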

@mochizuki875 (Member)

@ardaguclu
Yes, I was just thinking of the same idea!
I would appreciate your help with working out the message later.
