Connections initiated by Pods failing often due to ephemeral port mismatch #3909

gtie opened this issue Apr 10, 2023 · 5 comments

@gtie
gtie commented Apr 10, 2023

Summary

TCP and UDP connections initiated by Pods time out/fail often (~50% of the time). I tracked this network "instability" down to unexpected ephemeral source ports being used, matching neither the OS ephemeral port range nor the external firewall configuration.

IOW, despite the OS having:

# sysctl net.ipv4.ip_local_port_range
net.ipv4.ip_local_port_range = 32768	60999

...the containers were using source ports in a much wider range (>1024? >0?)

What I'm still unclear about is how the (S)NAT component can be configured to use a source port range matching the OS sysctl. I did not find a relevant setting in the general CNI settings or in the Calico portmap docs.
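For what it's worth, a possible knob (not verified on this cluster, and an assumption on my part): if the outgoing SNAT is performed by Calico's own dataplane rather than the portmap plugin, the Felix configuration reference appears to expose a natPortRange field that constrains the ports used for MASQUERADE. Field name and value format here are taken from my reading of those docs, so treat this as a sketch:

```shell
# Sketch: constrain Calico's SNAT source ports to the OS ephemeral range.
# Assumes the Calico version bundled with MicroK8s supports the
# natPortRange field on FelixConfiguration; verify against the Felix
# configuration reference before applying.
microk8s kubectl patch felixconfiguration default --type=merge \
  -p '{"spec":{"natPortRange":"32768:60999"}}'
```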

I verified there are no iptables rules doing NAT on the host.

What Should Happen Instead?

Connections should succeed all the time.

Reproduction Steps

With an external firewall only allowing syn-ack packets to ephemeral ports in the system range (> 32768):

From a pod, run curl 142.250.184.238 (an IP address of google.com). From the outside, run tcpdump to confirm that the source ports used by the connections sometimes fall outside of the ephemeral port range:

# tcpdump -i eno1 -n host 142.250.184.238 and "tcp[tcpflags] & (tcp-syn) != 0 and tcp[tcpflags] & (tcp-ack) =0"
tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
listening on eno1, link-type EN10MB (Ethernet), snapshot length 262144 bytes
16:37:13.987895 IP 144.76.196.2.13686 > 142.250.184.238.80: Flags [S], seq 2626848918, win 64240, options [mss 1460,sackOK,TS val 493799354 ecr 0,nop,wscale 7], length 0
16:37:15.828422 IP 144.76.196.2.13686 > 142.250.184.238.80: Flags [S], seq 699943171, win 64240, options [mss 1460,sackOK,TS val 493801194 ecr 0,nop,wscale 7], length 0
16:37:16.760399 IP 144.76.196.2.2545 > 142.250.184.238.80: Flags [S], seq 2944446947, win 64240, options [mss 1460,sackOK,TS val 493802126 ecr 0,nop,wscale 7], length 0
16:37:17.563713 IP 144.76.196.2.10702 > 142.250.184.238.80: Flags [S], seq 1157060071, win 64240, options [mss 1460,sackOK,TS val 493802929 ecr 0,nop,wscale 7], length 0
16:37:18.427493 IP 144.76.196.2.61052 > 142.250.184.238.80: Flags [S], seq 1670527954, win 64240, options [mss 1460,sackOK,TS val 493803793 ecr 0,nop,wscale 7], length 0

You can see above that the source ports are {13686, 2545, 10702, 61052}, i.e. not within the 32768-60999 range.
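The same check can be scripted: take the source ports from the capture above and compare them against the configured range (note that 61052 is also out of range, just above the upper bound):

```shell
# Compare the captured source ports against the sysctl ephemeral range;
# any port printed here would be dropped by a firewall that only allows
# return traffic to 32768-60999.
low=32768
high=60999
for port in 13686 2545 10702 61052; do
  if [ "$port" -lt "$low" ] || [ "$port" -gt "$high" ]; then
    echo "source port $port is outside ${low}-${high}"
  fi
done
```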

Introspection Report

(the tar.gz includes too much private data for me to be comfortable sharing it)

# snap list | grep microk8s
microk8s  v1.26.3   4959   1.26/stable    canonical**  classic
# microk8s status
microk8s is running
high-availability: no
  datastore master nodes: 127.0.0.1:19001
  datastore standby nodes: none
addons:
  enabled:
    cert-manager         # (core) Cloud native certificate management
    community            # (core) The community addons repository
    dashboard            # (core) The Kubernetes dashboard
    dns                  # (core) CoreDNS
    ha-cluster           # (core) Configure high availability on the current node
    helm                 # (core) Helm - the package manager for Kubernetes
    helm3                # (core) Helm 3 - the package manager for Kubernetes
    hostpath-storage     # (core) Storage class; allocates storage from host directory
    ingress              # (core) Ingress controller for external access
    metrics-server       # (core) K8s Metrics Server for API access to service metrics
    storage              # (core) Alias to hostpath-storage add-on, deprecated
  disabled:
    argocd               # (community) Argo CD is a declarative continuous deployment for Kubernetes.
    cilium               # (community) SDN, fast with full network policy
    dashboard-ingress    # (community) Ingress definition for Kubernetes dashboard
    fluentd              # (community) Elasticsearch-Fluentd-Kibana logging and monitoring
    gopaddle-lite        # (community) Cheapest, fastest and simplest way to modernize your applications
    inaccel              # (community) Simplifying FPGA management in Kubernetes
    istio                # (community) Core Istio service mesh services
    jaeger               # (community) Kubernetes Jaeger operator with its simple config
    kata                 # (community) Kata Containers is a secure runtime with lightweight VMS
    keda                 # (community) Kubernetes-based Event Driven Autoscaling
    knative              # (community) Knative Serverless and Event Driven Applications
    kwasm                # (community) WebAssembly support for WasmEdge (Docker Wasm) and Spin (Azure AKS WASI)
    linkerd              # (community) Linkerd is a service mesh for Kubernetes and other frameworks
    multus               # (community) Multus CNI enables attaching multiple network interfaces to pods
    nfs                  # (community) NFS Server Provisioner
    ondat                # (community) Ondat is a software-defined, cloud native storage platform for Kubernetes.
    openebs              # (community) OpenEBS is the open-source storage solution for Kubernetes
    openfaas             # (community) OpenFaaS serverless framework
    osm-edge             # (community) osm-edge is a lightweight SMI compatible service mesh for the edge-computing.
    portainer            # (community) Portainer UI for your Kubernetes cluster
    sosivio              # (community) Kubernetes Predictive Troubleshooting, Observability, and Resource Optimization
    traefik              # (community) traefik Ingress controller
    trivy                # (community) Kubernetes-native security scanner
    gpu                  # (core) Automatic enablement of Nvidia CUDA
    host-access          # (core) Allow Pods connecting to Host services smoothly
    kube-ovn             # (core) An advanced network fabric for Kubernetes
    mayastor             # (core) OpenEBS MayaStor
    metallb              # (core) Loadbalancer for your Kubernetes cluster
    minio                # (core) MinIO object storage
    observability        # (core) A lightweight observability stack for logs, traces and metrics
    prometheus           # (core) Prometheus operator for monitoring and logging
    rbac                 # (core) Role-Based Access Control for authorisation
    registry             # (core) Private image registry exposed on localhost:32000
# cat /var/snap/microk8s/current/args/cni-network/10-calico.conflist
{
  "name": "k8s-pod-network",
  "cniVersion": "0.3.1",
  "plugins": [
    {
      "type": "calico",
      "log_level": "info",
      "nodename_file_optional": true,
      "log_file_path": "/var/log/calico/cni/cni.log",
      "datastore_type": "kubernetes",
      "nodename": "rblmon23",
      "mtu": 0,
      "ipam": {
          "type": "calico-ipam"
      },
      "policy": {
          "type": "k8s"
      },
      "kubernetes": {
          "kubeconfig": "/var/snap/microk8s/current/args/cni-network/calico-kubeconfig"
      }
    },
    {
      "type": "portmap",
      "snat": true,
      "capabilities": {"portMappings": true}
    },
    {
      "type": "bandwidth",
      "capabilities": {"bandwidth": true}
    }
  ]
}

Can you suggest a fix?

Update (external) firewall rules to allow a wider range of source ports (>1024)
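A sketch of that workaround in iptables syntax, assuming the external firewall is Linux-based (chain and flag matching are placeholders to adapt to the actual setup):

```shell
# Workaround sketch: on the external firewall, accept SYN-ACK replies
# destined for any unprivileged source port (>1024) instead of only the
# kernel ephemeral range. The FORWARD chain is an assumption; adjust to
# wherever the return traffic is filtered.
iptables -A FORWARD -p tcp --tcp-flags SYN,ACK SYN,ACK \
  --dport 1025:65535 -j ACCEPT
```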

Are you interested in contributing with a fix?

Yes

@sanjeevpandey19

+1

We faced this issue in an AWS environment today, and it took us a really long time to figure out that it wasn't using the default ephemeral ports.

@b44rawat

+1

@neoaggelos
Contributor

Hi @sanjeevpandey19 @b44rawat, MicroK8s is not doing anything in particular for this issue, but I'm also a bit out of my depth about the specifics here. MicroK8s is not touching these configs in any way, so this is probably something to discuss/raise with Calico? You might be able to get more help there.

@sanjeevpandey19

Ok @neoaggelos, thanks for the reply. I will check if I can find something related to this which might be causing it.


stale bot commented Dec 25, 2024

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the inactive label Dec 25, 2024