Intermittent connection reset and delay running time #245
Comments
We will look into this and get back. Btw is this easily reproducible? |
@jayanthvn Yes, reproducible. But intermittent, it may take time for it to recur. |
Thanks. We will review the logs and get back to you. |
Our team found workarounds for two network policy issues through internal testing today. These may not be fundamental solutions, though.
TL;DR
Symptom 1: Intermittent connection reset by peer is resolved.
Symptom 2: Delayed readiness time is resolved.
Workarounds

1. Intermittent connection reset by peer

1-1. Workaround
As mentioned in the Amazon EKS official documentation, the intermittent connection reset does not occur when bpffs is mounted on the EC2 worker node.

1-2. Record for issue resolution
Check the kernel version and AMI version of the worker node. The workload pod (source) that was experiencing the intermittent connection reset was scheduled on this worker node.

$ kubectl get node -o wide ip-xx-xxx-xx-98.ap-northeast-2.compute.internal
NAME STATUS ROLES AGE VERSION INTERNAL-IP EXTERNAL-IP OS-IMAGE KERNEL-VERSION CONTAINER-RUNTIME
ip-xx-xxx-xx-98.ap-northeast-2.compute.internal Ready <none> 69d v1.26.12-eks-5e0fdde xx.xxx.xx.98 <none> Amazon Linux 2 5.10.205-195.804.amzn2.x86_64 containerd://1.7.2

Connect to the worker node and mount the bpf filesystem manually:

# Mount bpf filesystem in worker node
sudo mount -t bpf bpffs /sys/fs/bpf

$ mount -l | grep bpf
none on /sys/fs/bpf type bpf (rw,nosuid,nodev,noexec,relatime,mode=700)
none on /sys/fs/bpf type bpf (rw,relatime)
none on /sys/fs/bpf type bpf (rw,relatime)
none on /sys/fs/bpf type bpf (rw,relatime)

Question:
Reference: EKS User guide
Since mounting
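One possible way to avoid the manual mount on every node is to mount bpffs at instance boot, for example via EC2 user data or a bootstrap script. The following is only a sketch under that assumption, not something from the thread or the EKS docs:

```bash
#!/usr/bin/env bash
# Sketch: mount bpffs at node boot (e.g. appended to EC2 user data).
# Assumption: this runs as root during instance bootstrap; paths are standard.
set -euo pipefail

if ! mount | grep -q '/sys/fs/bpf type bpf'; then
  mount -t bpf bpffs /sys/fs/bpf
fi

# Optionally persist across reboots via fstab (idempotent append).
grep -q '^bpffs /sys/fs/bpf bpf' /etc/fstab || \
  echo 'bpffs /sys/fs/bpf bpf defaults 0 0' >> /etc/fstab
```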
2. Delayed readiness time

2-1. Workaround
Add an ingress netpol to the workload pod (source).

2-2. Record for issue resolution
Network policy enforcing mode is set to standard:

$ kubectl get ds -n kube-system aws-node -o yaml
containers:
- env:
  - name: NETWORK_POLICY_ENFORCING_MODE
    value: standard
  name: aws-node

Create a new ingress netpol that 'explicitly' allows ingress from the Kubernetes Service IP range.

$ kubectl get netpol -n <REDACTED> ingress-service -o yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
...
spec:
  ingress:
  - from:
    - ipBlock:
        cidr: 172.20.0.0/16
    ports:
    - endPort: 65535
      port: 1
      protocol: TCP
  podSelector:
    matchLabels:
      app.kubernetes.io/networkpolicy-ingress-service: apply
  policyTypes:
  - Ingress
status: {}

$ kubectl get pod -n <REDACTED> t<REDACTED> -o yaml
apiVersion: v1
kind: Pod
metadata:
  annotations:
    kubectl.kubernetes.io/restartedAt: "2024-04-12T12:30:49+09:00"
  creationTimestamp: "2024-04-12T06:14:02Z"
  generateName: t<REDACTED>-cc878cb69-
  labels:
    ...
    app.kubernetes.io/networkpolicy-ingress-service: apply
    app.kubernetes.io/networkpolicy-ingress-t<REDACTED>: apply
    pod-template-hash: cc878cb69

To explicitly allow the Kubernetes Service IP range, the app.kubernetes.io/networkpolicy-ingress-service: apply label was added to the workload pod so it matches the netpol's podSelector.
The delayed readiness time issue was resolved after explicitly attaching the ingress netpol to the workload pod (source), as shown in this comment.
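To attach that label without editing pods directly, one could add it to the Deployment's pod template so replacement pods keep matching the netpol's podSelector. This is only a sketch; the Deployment name and namespace are placeholders for the redacted values:

```bash
# Sketch: add the netpol selector label to the Deployment's pod template.
# <WORKLOAD_DEPLOYMENT> and <NAMESPACE> stand in for the redacted names.
kubectl patch deployment <WORKLOAD_DEPLOYMENT> -n <NAMESPACE> --type merge -p '
spec:
  template:
    metadata:
      labels:
        app.kubernetes.io/networkpolicy-ingress-service: apply
'
```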
Captured conntrack list between the workload pod (source, IP ending in .35.242) and the destination Service Cluster IP 172.20.67.165:

# Run on the worker node where the source pod is scheduled
$ conntrack -L --src ss.sss.35.242 --dst 172.20.67.165
tcp 6 118 TIME_WAIT src=ss.sss.35.242 dst=172.20.67.165 sport=55938 dport=80 src=ss.sss.19.208 dst=ss.sss.35.242 sport=8080 dport=55938 [ASSURED] mark=0 use=1
tcp 6 431998 ESTABLISHED src=ss.sss.35.242 dst=172.20.67.165 sport=55944 dport=80 src=ss.sss.21.1 dst=ss.sss.35.242 sport=8080 dport=55944 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 2 flow entries have been shown.

I observed the conntrack entries again with conntrack -L:

# Run on the worker node where the source pod is scheduled
$ conntrack -L --src ss.sss.35.242 --dst 172.20.67.165
tcp 6 98 TIME_WAIT src=ss.sss.35.242 dst=172.20.67.165 sport=55938 dport=80 src=ss.sss.19.208 dst=ss.sss.35.242 sport=8080 dport=55938 [ASSURED] mark=0 use=1
tcp 6 102 TIME_WAIT src=ss.sss.35.242 dst=172.20.67.165 sport=55944 dport=80 src=ss.sss.21.1 dst=ss.sss.35.242 sport=8080 dport=55944 [ASSURED] mark=0 use=1
conntrack v1.4.4 (conntrack-tools): 2 flow entries have been shown.
It was observed that the ready time of the pod was dramatically reduced from 92 seconds to 32 seconds.

$ kubectl get pod -n <REDACTED> -l app.kubernetes.io/name=t<REDACTED>
NAME READY STATUS RESTARTS AGE
t<REDACTED>-cc878cb69-8tmt9 1/1 Running 0 33s

At this point, the readiness time for the workload (source) pod is back to normal. The readiness time comparison is:
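For reference, a rough way to capture such readiness timings, as a sketch rather than the measurement used above, is to compare the pod's creation timestamp with its Ready condition transition time:

```bash
# Sketch: print creationTimestamp and the Ready condition transition time;
# the difference approximates how long the pod took to become Ready.
# <POD_NAME> and <NAMESPACE> are placeholders.
kubectl get pod <POD_NAME> -n <NAMESPACE> \
  -o jsonpath='{.metadata.creationTimestamp}{"\n"}{.status.conditions[?(@.type=="Ready")].lastTransitionTime}{"\n"}'
```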
|
@jayanthvn @achevuru I submitted a node-level support bundle to [email protected].

Node-level support-bundle collection

# Collect node level tech-support bundle in the affected worker node
$ /opt/cni/bin/aws-cni-support.sh
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 56 100 56 0 0 34825 0 --:--:-- --:--:-- --:--:-- 56000
This is version 0.7.6. New versions can be found at https://github.com/awslabs/amazon-eks-ami/blob/master/log-collector-script/
Trying to collect common operating system logs...
Trying to collect kernel logs...
Trying to collect modinfo...
Trying to collect mount points and volume information...
Trying to collect SELinux status...
Trying to collect iptables information...
Trying to collect iptables-legacy information...
Trying to collect installed packages...
Trying to collect active system services...
Trying to Collect Containerd daemon information...
Trying to Collect Containerd running information...
Trying to Collect Docker daemon information...
Warning: The Docker daemon is not running.
Trying to collect kubelet information...
Trying to collect L-IPAMD introspection information...
Trying to collect L-IPAMD prometheus metrics...
Trying to collect L-IPAMD checkpoint...
Trying to collect Multus logs if they exist...
Trying to collect sysctls information...
Trying to collect networking infomation... conntrack v1.4.4 (conntrack-tools): 5437 flow entries have been shown.
Trying to collect CNI configuration information...
Trying to collect CNI Configuration Variables from Docker...
Warning: The Docker daemon is not running.
Trying to collect CNI Configuration Variables from Containerd...
Trying to collect network policy ebpf loaded data...
Trying to collect Docker daemon logs...
Trying to Collect sandbox-image daemon information...
Trying to Collect CPU Throttled Process Information...
Trying to Collect IO Throttled Process Information...
Trying to archive gathered information...
Done... your bundled logs are located in /var/log/eks_i-01827ce93254660f5_2024-04-12_0944-UTC_0.7.6.tar.gz

$ ls -lh /var/log/eks*
-rw-r--r-- 1 root root 54M Apr 12 09:44 /var/log/eks_i-01827ce93254660f5_2024-04-12_0944-UTC_0.7.6.tar.gz

# Download tech-support bundle from node filesystem to local
$ kubectl cp default/nsenter-xxxxxx:/var/log/eks_i-01827ce93254660f5_2024-04-12_0944-UTC_0.7.6.tar.gz $HOME/eks_i-01827ce93254660f5_2024-04-12_0944-UTC_0.7.6.tar.gz
tar: Removing leading `/' from member names

Reference: troubleshooting |
@younsl Thanks for sharing your findings with us.
|
[updated] Yes, you're right. Even though the file system |
|
The fix is released with network policy agent v1.1.2, shipped in VPC CNI v1.18.2: https://github.com/aws/amazon-vpc-cni-k8s/releases/tag/v1.18.2. Please test and let us know if there are any issues. |
Hi, @jayanthvn. I'm still experiencing intermittent connection resets.

Environment
Mitigation measures
Tested the two changes below to resolve the issue.
Detailed changes
VPC CNI version info:

$ kubectl describe daemonset aws-node -n kube-system | grep Image | cut -d "/" -f 2-3
amazon-k8s-cni-init:v1.18.2-eksbuild.1
amazon-k8s-cni:v1.18.2-eksbuild.1
amazon/aws-network-policy-agent:v1.1.2-eksbuild.1

The NETWORK_POLICY_ENFORCING_MODE setting currently defaults to standard.
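For completeness, that env var can be changed on the aws-node DaemonSet; the sketch below uses the default value purely as an illustration, and the change may be reverted if the VPC CNI is managed as an EKS add-on:

```bash
# Sketch: set the enforcing mode env var on aws-node (value shown is illustrative).
kubectl set env daemonset/aws-node -n kube-system NETWORK_POLICY_ENFORCING_MODE=standard

# Verify the rollout and the resulting env value.
kubectl rollout status daemonset/aws-node -n kube-system
kubectl get ds aws-node -n kube-system \
  -o jsonpath='{.spec.template.spec.containers[?(@.name=="aws-node")].env[?(@.name=="NETWORK_POLICY_ENFORCING_MODE")].value}'
```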
Diff for conntrack-cache-cleanup-period argument:

- args:
- - --conntrack-cache-cleanup-period=300 # 5m (default)
+ - --conntrack-cache-cleanup-period=21600 # 6h

Problem
However, even after upgrading to VPC CNI v1.18.2, read ECONNRESET errors still occur in some pods. After checking internally with the developer, it was found that retry logic for connection failures is not included in the application container affected by the network issues.

Timeline of read ECONNRESET errors occurring in some pods (starting after upgrading to network-policy-agent v1.1.2):

Note
6/10 08:00 (6h)
6/10 02:00 (7h 39m)
6/9 18:21 (2m)
6/9 18:19 (6m)
6/9 18:13 (1h 36m)
6/9 16:37 (1h 25m)
6/9 15:12 (7h 57m)
6/9 07:15 (2h 38m)
6/9 04:37 (12h 26m)
6/8 16:11 (4h 23m)
6/8 11:48 (5h 34m)
6/8 06:14 |
@younsl - as shared internally via the service ticket, the timeouts are not in line with the conntrack cleanup, since you have the cleanup every 6 hours while the timeouts happen at varied times. At these times, are you noticing a spike in the network policy agent conntrack cache? One suspect is that the cache is getting full, leading to certain entries getting evicted and causing timeouts. |
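Not from the thread, but as a rough way to see whether connection-tracking pressure spikes around the error times, kernel conntrack usage on the affected node can be sampled. This is only a proxy; the network policy agent keeps its own eBPF conntrack cache, which needs the agent's own tooling and logs to inspect:

```bash
# Sketch: sample kernel conntrack usage on the affected worker node every 30s.
while true; do
  printf '%s count=%s max=%s\n' "$(date -Is)" \
    "$(cat /proc/sys/net/netfilter/nf_conntrack_count)" \
    "$(cat /proc/sys/net/netfilter/nf_conntrack_max)"
  sleep 30
done
```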
@jayanthvn I'm waiting to test PR #280 on my affected clusters. |
Wrap-up after meeting with EKS team
I enabled the network-policy-agent policy event logs on my dev EKS v1.28 cluster. VPC CNI yaml:

- args:
- --enable-ipv6=false
- --enable-network-policy=true
- --enable-cloudwatch-logs=false
- - --enable-policy-event-logs=false
+ - --enable-policy-event-logs=true

So I will get back with the conntrack cleanup logs for the network-policy-agent ASAP.
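Assuming the default agent log location on the node (per the AWS network policy agent docs; adjust if your setup differs), the agent and policy event logs can then be followed directly on the worker node:

```bash
# Sketch: follow the network policy agent log on the worker node
# after --enable-policy-event-logs=true is set (default log path assumed).
sudo tail -f /var/log/aws-routed-eni/network-policy-agent.log
```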
Testing environment

$ kubectl describe daemonset aws-node -n kube-system | grep Image | cut -d "/" -f 2-3
amazon-k8s-cni-init:v1.18.2-eksbuild.1
amazon-k8s-cni:v1.18.2-eksbuild.1
aws-network-policy-agent:v1.1.1-13-gda05900-dirty |
@jayanthvn @achevuru Hi, fellas. I've rolled back the network policy controller from the VPC CNI's network-policy-agent to Calico. After disabling the VPC CNI's netpol feature and switching to Calico v3.28.1 with tigera-operator, I experienced no intermittent packet drops for 24 hours. Chart configuration I installed:

# charts/tigera-operator/values.yaml
...
tigeraOperator:
  image: tigera/operator
  version: v1.34.3
  registry: quay.io

# calico version
calicoctl:
  image: docker.io/calico/ctl
  tag: v3.28.1
... |
What happened:
Background
After migrating the network policy provider from Calico v3.25.1 and Tigera Operator to VPC CNI v1.18.0-eksbuild.1, the following two network policy issues occurred on an EKS v1.26 cluster.

Cluster environment
{"enableNetworkPolicy":"true"} setting in advanced configuration
iptables mode.

Network policy issues
1. Intermittent connection reset by peer
Occurs during Pod-to-Pod or Pod-to-EC2 communication.
Similar issues: #204, #210, #236
2. Delayed Running time
Delay in the time it takes for the pod to run.
For pods to which a Network Policy is applied, the time it takes for the readiness probe to succeed is up to 3 times longer.
Similar issues: #189, #186
Attach logs
1. Intermittent connection reset by peer
[tcpdump] From workload pod to EC2 instance
Intermittently, the workload pod receives an RST (reset) packet response from EC2.
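A capture filter along these lines can narrow a dump to reset packets only; this is a sketch with placeholder addresses, not the exact command used for the attached capture:

```bash
# Sketch: capture only TCP RST packets to/from the EC2 instance,
# run inside the workload pod's network namespace.
# <EC2_PRIVATE_IP> is a placeholder for the peer instance address.
sudo tcpdump -i any -nn 'tcp[tcpflags] & tcp-rst != 0 and host <EC2_PRIVATE_IP>'
```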
[kubectl sniff] From workload pod to EC2 instance
Intermittently, the workload pod receives an RST (reset) packet response from EC2.
If the issue occurs in the workload pod, the Slack notification below is output.
2. Delayed Running time
Captured ebpf-sdk log on worker node immediately after pod restart
A Deny log occurs for the destination Service Cluster IP 172.20.67.165.
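Assuming the default node-local log location for the network policy agent (per the AWS docs; adjust if your AMI differs), the deny verdicts for that Cluster IP can be pulled out like this; a sketch, not the exact command used for the capture above:

```bash
# Sketch: search the node-local network policy agent log for DENY verdicts
# involving the destination Service Cluster IP (default log path assumed).
sudo grep -i 'deny' /var/log/aws-routed-eni/network-policy-agent.log | grep '172.20.67.165'
```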
What you expected to happen:
How to reproduce it (as minimally and precisely as possible):
Anything else we need to know?:
The same network issue occurred in all VPC CNI v1.16.0, v1.16.1, and v1.18.0 versions.
Environment:
Kubernetes version (kubectl version): v1.26.12-eks-5e0fdde
OS (cat /etc/os-release): Amazon Linux release 2 (Karoo)
Kernel (uname -a): 5.10.205-195.804.amzn2.x86_64