[Bug] kube-proxy image version 1.27 causing the kube-proxy to fail #6991

artemisia480 · 2023-08-21T14:35:04Z

What were you trying to accomplish?

I am trying to deploy a new cluster, version 1.27, using eksctl. I am running the command: eksctl create cluster...

What happened?

I get the following error and the nodes for the cluster never come up. Looking at the logs inside the node, I see this error:
ErrImagePull: rpc error: code = Unknown desc = failed to pull and unpack image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/kube-proxy:v1.27.1-minimal-eksbuild.1": failed to resolve reference "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/kube-proxy:v1.27.1-minimal-eksbuild.1": pulling from host 602401143452.dkr.ecr.us-east-1.amazonaws.com failed with status code [manifests v1.27.1-minimal-eksbuild.1]

How to reproduce it?

I am using a YAML file to deploy this, so I am not sure how you would reproduce it. But if you look at the AWS documentation here:
https://docs.aws.amazon.com/eks/latest/userguide/managing-kube-proxy.html
the image is meant to be eksbuild.2 and not eksbuild.1.
and if you look at the eksctl code here: https://github.com/eksctl-io/eksctl/blob/c27d2e80f50aceb78c35c60b713f8e9267611dde/pkg/addons/default/kube_proxy.go#L150C1-L151
it is only calling eksbuild.1 and not 2.
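
To compare the tag in the error with the one in the documentation, here is a minimal Python sketch that splits an image reference of this shape into its parts. The names EksImage and parse_eks_image are hypothetical, and the pattern is only assumed from the single reference quoted in this issue; it is not taken from eksctl or from any AWS tooling.

```python
import re
from typing import NamedTuple


class EksImage(NamedTuple):
    """Parts of an image reference shaped like the one in the error above."""
    account: str
    region: str
    repository: str
    version: str
    build: int


# Assumed shape: <account>.dkr.ecr.<region>.amazonaws.com/<repo>:<version>[-minimal]-eksbuild.<n>
_IMAGE_RE = re.compile(
    r"^(?P<account>\d{12})\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:]+):(?P<version>v[\d.]+)-(?:minimal-)?eksbuild\.(?P<build>\d+)$"
)


def parse_eks_image(ref: str) -> EksImage:
    """Split an image reference into registry account, region, repo, version and build."""
    m = _IMAGE_RE.match(ref)
    if m is None:
        raise ValueError(f"unrecognized image reference: {ref!r}")
    return EksImage(
        m["account"], m["region"], m["repository"], m["version"], int(m["build"])
    )


if __name__ == "__main__":
    failing = parse_eks_image(
        "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/kube-proxy:v1.27.1-minimal-eksbuild.1"
    )
    print(failing)
```

Parsing the failing reference and the documented one, then comparing their build fields, shows whether the two differ only in the eksbuild suffix.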

Logs
ErrImagePull: rpc error: code = Unknown desc = failed to pull and unpack image "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/kube-proxy:v1.27.1-minimal-eksbuild.1": failed to resolve reference "602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/kube-proxy:v1.27.1-minimal-eksbuild.1": pulling from host 602401143452.dkr.ecr.us-east-1.amazonaws.com failed with status code [manifests v1.27.1-minimal-eksbuild.1]

Anything else we need to know?

Versions
1.27

$ eksctl info

github-actions · 2023-08-21T14:36:42Z

Hello artemisia480 👋 Thank you for opening an issue in eksctl project. The team will review the issue and aim to respond within 1-5 business days. Meanwhile, please read about the Contribution and Code of Conduct guidelines here. You can find out more information about eksctl on our website

yoplait · 2023-08-21T17:19:53Z

Thanks @artemisia480 same problem here.

cPu1 · 2023-08-22T08:05:16Z

> I am running the command: eksctl create cluster...

@artemisia480 did you run any commands after eksctl create cluster, or did you try to update the image?

> and if you look at the eksctl code here: https://github.com/eksctl-io/eksctl/blob/c27d2e80f50aceb78c35c60b713f8e9267611dde/pkg/addons/default/kube_proxy.go#L150C1-L151
> it is only calling eksbuild.1 and not 2.

That codepath is not used in eksctl create cluster.

cPu1 · 2023-08-22T08:47:16Z

I'm unable to reproduce this. I got the same image tag (602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/kube-proxy:v1.27.1-minimal-eksbuild.1) on a new cluster and it was pulled successfully.

Can you share your config file?

artemisia480 · 2023-08-22T11:11:13Z (edited by a-hilaly)

@cPu1, the code doesn't use it? Are you sure? The AWS documentation says to use eksbuild.2, and this clearly pulls eksbuild.1.
Here is my YAML file:

apiVersion: eksctl.io/v1alpha5
kind: ClusterConfig

metadata:
  name: ami-testing-cluster2
  version: "1.27"
  region: us-east-1

vpc:
  clusterEndpoints:
    publicAccess: true
    privateAccess: false

managedNodeGroups:
  - name: ami-testing2
    ami: <custom ami>
    amiFamily: AmazonLinux2
    instanceType: m6i.large
    volumeSize: 20
    disableIMDSv1: false
    ssh:
      allow: true
      publicKeyPath: ~/.ssh/id_rsa.pub
    overrideBootstrapCommand: |
      #!/bin/bash
      eks_register.sh ami-testing-cluster2
    iam:
      withAddonPolicies:
        externalDNS: true
        ebs: true
        autoScaler: true
        cloudWatch: false
      attachPolicyARNs:
        - arn:aws:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly
        - arn:aws:iam::aws:policy/AmazonEKSWorkerNodePolicy
        - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy

a-hilaly · 2023-08-23T13:52:48Z

Could this be an issue in a specific region? @artemisia480 do you have any clusters in other regions to confirm this?

artemisia480 · 2023-08-24T10:34:08Z

@a-hilaly not sure why it would be region specific? But I can test a different region just to see.

a-hilaly · 2023-08-24T11:46:39Z

@artemisia480 not really sure, but if it's a pull issue, maybe the image is not available in every region. Or are we using ECR public here?
I'll try to reproduce the same bug locally and post an update here.

a-hilaly · 2023-08-24T19:52:18Z

@artemisia480 I haven't been able to reproduce your issue across 4/5 cluster creations in different regions... Maybe this is an issue with the custom AMI?

artemisia480 · 2023-08-25T10:51:12Z

@a-hilaly thanks for testing that! I am starting to think it is the custom AMI after all. I am not sure what, though. I had the following flags in the AMI for 1.26, which I have now removed for 1.27:
KUBELET_EKS_ARGS=--node-ip=192.168.22.222
--pod-infra-container-image=602401143452.dkr.ecr.us-east-1.amazonaws.com/eks/pause-amd64:3.1
--cloud-provider aws
--config /etc/kubernetes/kubelet.json
--kubeconfig /etc/kubernetes/kubeconfig
--container-runtime remote
--container-runtime-endpoint unix:///var/run/containerd/containerd.sock

I also added the flag:
--seccomp-default=unconfined.

But having no luck.

a-hilaly · 2023-09-05T16:17:28Z

Do you run any extra commands after creating the cluster? Any daemonset updates?

whereisaaron · 2023-09-09T14:12:09Z

@artemisia480 I got a similar error when I added a containerd node group to an EKS 1.23 cluster. The containerd nodes could not pull an ECR image and reported a pull failure, but the dockerd nodes in the same cluster could pull the exact same image. My test cluster was in a VPC that did not have an ECR endpoint, in case that is relevant.

There seems to be something extra that containerd nodes need. @a-hilaly any idea what that might be?

Pulling image "XXXXXX.dkr.ecr.ap-southeast-2.amazonaws.com/mycontainer:1.0.1"
Warning Failed 8s (x3 over 47s) kubelet Failed to pull image "XXXXXX.dkr.ecr.ap-southeast-2.amazonaws.com/mycontainer:1.0.1": rpc error: code = NotFound desc = failed to pull and unpack image "XXXXXXX.dkr.ecr.ap-southeast-2.amazonaws.com/mycontainer:1.0.1": failed to copy: httpReadSeeker: failed open: could not fetch content descriptor sha256:d713dedd5b37c3ffea46d23c7933cc173c7755c789eab3bc60ea374cb5af740f (application/vnd.docker.distribution.manifest.v1+json) from remote: not found
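
The two failures in this thread report different gRPC status codes (Unknown in the original report, NotFound in this one), so they may not share a cause. Below is a small Python sketch for pulling the status code and image reference out of kubelet messages like these so they can be compared side by side. The function name summarize_pull_error is hypothetical, and the pattern is assumed only from the two messages quoted in this issue.

```python
import re
from typing import Optional, Tuple

# Matches the gRPC status code and the first quoted image reference after it.
_PULL_ERROR_RE = re.compile(
    r'rpc error: code = (?P<code>\w+) desc = .*?image "(?P<image>[^"]+)"'
)


def summarize_pull_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (status_code, image_reference), or None if the message does not match."""
    m = _PULL_ERROR_RE.search(message)
    if m is None:
        return None
    return m["code"], m["image"]
```

Running it over each message gives one (code, image) pair per failure, which makes it easier to see that the two reports name different status codes and different registries.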

artemisia480 added the kind/bug label Aug 21, 2023

azpaulp assigned a-hilaly Aug 23, 2023

Himangini added the priority/important-longterm label Sep 12, 2023
