direct routing and BPF datapath of kube-proxy replacement #386

Open
hown3d opened this issue Aug 11, 2024 · 1 comment · May be fixed by gardener/gardener#10575
Labels
area/networking Networking related kind/bug Bug

Comments

hown3d commented Aug 11, 2024

How to categorize this issue?

/area networking
/kind bug

What happened:
When Cilium runs as a kube-proxy replacement with the eBPF datapath (to be introduced with #350), the lo device is ignored when searching for host addresses: https://github.com/cilium/cilium/blob/9d631b91ad4d2c146d3decbfcfc39968764eb539/pkg/datapath/linux/devices.go#L32-L38
When running without a network overlay, requests from inside containers to https://kubernetes time out.

This is currently not reproducible when running without an overlay because bpf-masquerade gets disabled in that case:

{{- if ne .Values.global.tunnel "disabled" }}
enable-bpf-masquerade: {{ .Values.global.enableBPFMasquerade | quote }}
{{- end }}

Cilium then falls back to the legacy host routing implementation instead of using the eBPF datapath:

$ kubectl -n kube-system logs ds/cilium
time="2024-08-07T14:01:05Z" level=info msg="BPF host routing requires enable-bpf-masquerade. Falling back to legacy host routing (enable-host-legacy-routing=true)." subsys=daemon
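
The interaction between the chart condition and the fallback reported in the log can be sketched as follows. This is a simplified illustration in Python, not the extension's or Cilium's actual code: the function names `rendered_config` and `host_routing_mode` are hypothetical, and the sketch assumes that an omitted `enable-bpf-masquerade` key is treated as disabled and that no other precondition for BPF host routing is missing.

```python
def rendered_config(tunnel, enable_bpf_masquerade):
    """Mimic the chart snippet: the key is only emitted when tunnel != "disabled"."""
    config = {}
    if tunnel != "disabled":
        # The template pipes the value through `quote`, so it ends up as a string.
        config["enable-bpf-masquerade"] = "true" if enable_bpf_masquerade else "false"
    return config


def host_routing_mode(config):
    """Hypothetical, simplified decision: BPF host routing needs BPF masquerading."""
    # Assumption: a missing key behaves like "false".
    if config.get("enable-bpf-masquerade", "false") == "true":
        return "bpf"
    return "legacy"


if __name__ == "__main__":
    for tunnel in ("vxlan", "disabled"):
        cfg = rendered_config(tunnel, enable_bpf_masquerade=True)
        print(tunnel, cfg, host_routing_mode(cfg))
```

With `tunnel: disabled` the key is never rendered, so the sketch returns `legacy` even when the value is set to true, which matches the fallback message above.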
  • tcpdump on the Cilium-managed node (100.83.126.209 is the service IP of kube-apiserver)
shoot--ondemand--test-worker-tyo9o-z2-6f799-cdmnr / # tcpdump -i any | grep 100.83.126.209
tcpdump: data link type LINUX_SLL2
dropped privs to pcap
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on any, link-type LINUX_SLL2 (Linux cooked v2), snapshot length 262144 bytes
00:16:59.303015 lxc30002a5304eb In  IP 100.64.1.234.35474 > 100.83.126.209.https: Flags [S], seq 3195040413, win 65535, options [mss 8710,sackOK,TS val 3811945852 ecr 0,nop,wscale 9], length 0: Flags [P.], seq 363384
00:16:59.303071 eth0  Out IP 100.64.1.234.35474 > 100.83.126.209.https: Flags [S], seq 3195040413, win 65535, options [mss 8710,sackOK,TS val 3811945852 ecr 0,nop,wscale 9], length 0: Flags [.], ack 362392,
00:17:00.365963 lxc30002a5304eb In  IP 100.64.1.234.35474 > 100.83.126.209.https: Flags [S], seq 3195040413, win 65535, options [mss 8710,sackOK,TS val 3811946915 ecr 0,nop,wscale 9], length 092: Flags [.], ack 6785,
  • cilium-dbg output
$ kubectl -n kube-system exec cilium-nbtgj -- cilium-dbg statedb node-addresses
Defaulted container "cilium-agent" out of: cilium-agent, disable-rp-filter (init), config (init), mount-cgroup (init), apply-sysctl-overwrites (init), mount-bpf-fs (init), clean-cilium-state (init), install-cni-binaries (init)
Address                     NodePort   Primary   DeviceName
10.250.2.187                true       true      eth0
100.64.0.56                 false      true      cilium_host
fe80::b407:4cff:fe39:f6fa   false      true      cilium_host
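
To check such output quickly for a missing device, the table can be parsed with a few lines of Python. This is only a convenience sketch: `parse_node_addresses` is a hypothetical helper, and it assumes the whitespace-separated four-column layout shown above.

```python
def parse_node_addresses(table):
    """Parse the whitespace-separated node-addresses table into a list of dicts."""
    lines = [line for line in table.splitlines() if line.strip()]
    header = lines[0].split()
    return [dict(zip(header, line.split())) for line in lines[1:]]


SAMPLE = """\
Address                     NodePort   Primary   DeviceName
10.250.2.187                true       true      eth0
100.64.0.56                 false      true      cilium_host
fe80::b407:4cff:fe39:f6fa   false      true      cilium_host
"""

if __name__ == "__main__":
    devices = {row["DeviceName"] for row in parse_node_addresses(SAMPLE)}
    # No address on the lo device shows up in the table from this issue.
    print("lo present:", "lo" in devices)
```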

What you expected to happen:
Pods are able to access the kube-apiserver via service discovery

How to reproduce it (as minimally and precisely as possible):
Create a shoot without overlay and enable the kube-proxy replacement.
Either:

  1. Add enable-bpf-masquerade: true to the cilium-config configmap in kube-system

or

  1. Install the Cilium extension from the branch of PR #350 ("fix: enable-bpf-masquerade when snat values are not enabled")

Example shoot spec to reproduce:

spec:
  kubernetes:
    kubeProxy:
      enabled: false
  networking:
    type: cilium
    providerConfig:
      apiVersion: cilium.networking.extensions.gardener.cloud/v1alpha1
      kind: NetworkConfig
      hubble:
        enabled: true
      tunnel: disabled
      ipv4NativeRoutingCIDREnabled: true
      overlay:
        enabled: false
        createPodRoutes: true

Anything else we need to know?:

Environment:

hown3d commented Sep 27, 2024

Related issue and commit in the cilium repository.
Cilium has a hidden flag called --local-max-addr-scope which, after v1.13, defaults to scope link (253) - 1, i.e. 252.

IP addresses on a device with a scope numerically higher than this limit (e.g. scope host, like the address the apiserver-proxy creates) will be skipped.
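
The scope check described above can be illustrated with a small sketch. The numeric scope values are the Linux rtnetlink constants; everything else is an assumption for illustration: `is_skipped` is a hypothetical helper rather than Cilium's implementation, and the host-scoped address on lo is a placeholder, not the real apiserver-proxy address.

```python
# Linux rtnetlink address scopes (a larger number means a narrower scope).
RT_SCOPE_UNIVERSE = 0   # "global"
RT_SCOPE_SITE = 200
RT_SCOPE_LINK = 253
RT_SCOPE_HOST = 254

# Default described in this comment: one less than scope link.
DEFAULT_MAX_ADDR_SCOPE = RT_SCOPE_LINK - 1


def is_skipped(addr_scope, max_scope=DEFAULT_MAX_ADDR_SCOPE):
    """Return True if an address with this scope would be ignored."""
    return addr_scope > max_scope


if __name__ == "__main__":
    addresses = [
        ("10.250.2.187", "eth0", RT_SCOPE_UNIVERSE),
        ("192.0.2.10", "lo", RT_SCOPE_HOST),  # placeholder host-scoped address
    ]
    for ip, dev, scope in addresses:
        state = "skipped" if is_skipped(scope) else "kept"
        print(ip, dev, scope, state)
```

Under this reading, raising the maximum scope to 254 would keep host-scoped addresses; whether that is the right fix for the extension is left open here.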
