linkerd-control-plane pods fail with linkerd-cni and proxy.nativeSidecar enabled #12391

Closed
krzysztof-mitus opened this issue Apr 4, 2024 · 6 comments · Fixed by linkerd/linkerd2-proxy-init#362
@krzysztof-mitus

What is the issue?

Installing the linkerd-control-plane Helm chart fails when both cniEnabled=true and proxy.nativeSidecar=true are set.
Both linkerd-destination and linkerd-proxy-injector fail: their linkerd-network-validator container ends up in CrashLoopBackOff with the following log:

2024-04-04T08:59:17.257953Z  INFO linkerd_network_validator: Listening for connections on 0.0.0.0:4140
2024-04-04T08:59:17.257986Z DEBUG linkerd_network_validator: token="PqBwBvtO5lWbgVvHjaZ0MQQTqqZwjmkfQPtxoK4iysvQ5KG0zyDpEWQtpoZXfLU\n"
2024-04-04T08:59:17.257993Z  INFO linkerd_network_validator: Connecting to 1.1.1.1:20001
2024-04-04T08:59:27.259562Z ERROR linkerd_network_validator: Failed to validate networking configuration. Please ensure iptables rules are rewriting traffic as expected. timeout=10s

How can it be reproduced?

  1. On EKS 1.29, install the Helm charts:
  • linkerd-crds 2024.3.5
  • linkerd2-cni 2024.3.5
  • linkerd-control-plane 2024.3.5
  2. Set the following linkerd-control-plane Helm values:
  • cniEnabled: true
  • proxy.nativeSidecar: true

Logs, error output, etc

see above

output of linkerd check -o short

× control plane pods are ready

Environment

  • EKS 1.29
  • Bottlerocket OS 1.19.3 (aws-k8s-1.29)

Possible solution

The issue is probably caused by this code:
https://github.com/linkerd/linkerd2-proxy-init/blob/main/cni-plugin/main.go#L196

See slack

Additional context

Control plane pods run normally with proxy.nativeSidecar: false. Enabling the feature causes the pods to fail.

Would you like to work on fixing this bug?

None

@mateiidavid
Member

@krzysztof-mitus thanks for raising this! So, if I understand correctly, the problem is that when you use nativeSidecar you still get the init container when you shouldn't, since it's supposed to be disabled through cniEnabled?

@alpeb
Member

alpeb commented Apr 4, 2024

I think @krzysztof-mitus's pointer to the code is spot on. The issue is the cni plugin is looking for the proxy in the list of containers, but in this case it should look in the init containers. Sounds like an easy fix, will try to push something ASAP.
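
To illustrate the diagnosis above, here is a minimal Go sketch, not the actual cni-plugin code or the fix in linkerd/linkerd2-proxy-init#362. `Pod` and `Container` are simplified stand-ins for the Kubernetes pod spec types, the function names are hypothetical, and the container name `linkerd-proxy` is an assumption. With a native sidecar the proxy is declared as an init container, so a lookup that only scans the regular containers misses it.

```go
package main

// Container and Pod are simplified, hypothetical stand-ins for the
// Kubernetes pod spec types; only the fields needed here are modeled.
type Container struct {
	Name string
}

type Pod struct {
	InitContainers []Container
	Containers     []Container
}

// proxyContainerName is an assumed name for the injected proxy container.
const proxyContainerName = "linkerd-proxy"

// containsProxy reports whether any container in the list is the proxy.
func containsProxy(containers []Container) bool {
	for _, c := range containers {
		if c.Name == proxyContainerName {
			return true
		}
	}
	return false
}

// proxyInContainers mirrors the behavior described in the comment above:
// only the regular containers are inspected.
func proxyInContainers(pod Pod) bool {
	return containsProxy(pod.Containers)
}

// proxyInPod also inspects the init containers, which is where the proxy
// is declared when it runs as a native sidecar.
func proxyInPod(pod Pod) bool {
	return containsProxy(pod.Containers) || containsProxy(pod.InitContainers)
}
```

For a pod whose proxy sits in `InitContainers`, `proxyInContainers` returns false while `proxyInPod` returns true. That difference would explain why the iptables rules were never set up for the control plane pods.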

@alpeb
Member

alpeb commented Apr 4, 2024

I've pushed a fix to linkerd/linkerd2-proxy-init#362.
@krzysztof-mitus pending a linkerd-cni release, you can give this a try with the image I published, ghcr.io/alpeb/cni-plugin:v1.4.1, which contains that fix.

@krzysztof-mitus
Author

Thanks, Alejandro. I will test it.

@msuszko-vertex

I can confirm linkerd/linkerd2-proxy-init#362 does help. Linkerd 2024.4.1 with proxy.nativeSidecar: true and cniEnabled: true runs linkerd-network-validator initContainer successfully.

@krzysztof-mitus
Author

Thanks for the quick resolution and testing.

@adleong adleong closed this as completed Apr 10, 2024
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 11, 2024