Restrict IAM policy to the EKS cluster #165

Open
maxbrunet opened this issue Dec 16, 2020 · 0 comments

Is this a BUG REPORT or FEATURE REQUEST?: Not sure

What happened:

We have been experimenting with the following IAM policy, which restricts write operations to the desired EKS workers:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "eksWorkerAll",
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeLaunchTemplates",
                "ec2:DescribeInstances",
                "autoscaling:DescribeAutoScalingGroups"
            ],
            "Resource": "*"
        },
        {
            "Sid": "eksWorkerOwn",
            "Effect": "Allow",
            "Action": [
                "ec2:CreateTags",
                "autoscaling:TerminateInstanceInAutoScalingGroup",
                "autoscaling:EnterStandby"
            ],
            "Resource": "*",
            "Condition": {
                "StringEquals": {
                    "aws:ResourceTag/kubernetes.io/cluster/<cluster_name>": "owned"
                }
            }
        }
    ]
}

But the upgrade-manager sometimes gets an HTTP 403 when performing TerminateInstanceInAutoScalingGroup:

time="2020-12-16T21:55:39Z" level=debug msg="cache hit => false, service => autoscaling.TerminateInstanceInAutoScalingGroup"
2020-12-16T21:55:39.873Z        ERROR   controllers.RollingUpgrade      AccessDenied    {"rollingupgrade": "REDACTED", "instanceID": "REDACTED", "error": "AccessDenied: User: arn:aws:sts::REDACTED:assumed-role/REDACTED is not authorized to perform: autoscaling:TerminateInstanceInAutoScalingGroup\n\tstatus code: 403, request id: REDACTED"}
github.com/go-logr/zapr.(*zapLogger).Error
        /go/pkg/mod/github.com/go-logr/[email protected]/zapr.go:128
github.com/keikoproj/upgrade-manager/controllers.(*RollingUpgradeReconciler).error
        /workspace/controllers/rollingupgrade_controller.go:1181
github.com/keikoproj/upgrade-manager/controllers.(*RollingUpgradeReconciler).TerminateNode
        /workspace/controllers/rollingupgrade_controller.go:398
github.com/keikoproj/upgrade-manager/controllers.(*RollingUpgradeReconciler).DrainTerminate
        /workspace/controllers/rollingupgrade_controller.go:994
github.com/keikoproj/upgrade-manager/controllers.(*RollingUpgradeReconciler).UpdateInstanceEager
        /workspace/controllers/rollingupgrade_controller.go:977
github.com/keikoproj/upgrade-manager/controllers.(*RollingUpgradeReconciler).UpdateInstance
        /workspace/controllers/rollingupgrade_controller.go:1051

It seems to happen when the instance has already been terminated: the Auto Scaling endpoint can no longer trace the ASG back from the instance ID, so no ASG tags can be matched against the IAM policy condition, resulting in a 403. upgrade-manager retries a second time, gets the same error, and sets the RollingUpgrade status to "error".
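
To illustrate the suspected mechanism, here is a toy model in Go of how a StringEquals condition on aws:ResourceTag behaves. It is not how IAM is implemented; it only mirrors the documented rule that a condition key missing from the request context does not match, so the Allow statement does not apply and the call is implicitly denied. The function name and the cluster name are made up for the example.

```go
package main

import "fmt"

// stringEqualsResourceTag is a toy model of an IAM StringEquals condition on
// aws:ResourceTag/<key>. resourceTags holds the tags of the resource that was
// resolved for the request; it is nil when no resource could be resolved.
func stringEqualsResourceTag(resourceTags map[string]string, key, want string) bool {
	got, present := resourceTags[key]
	// A missing key never matches, so the Allow statement would not apply.
	return present && got == want
}

func main() {
	// Hypothetical cluster name, standing in for <cluster_name> in the policy.
	const key = "kubernetes.io/cluster/example-cluster"

	// ASG still resolvable from the instance ID: its tags are available.
	liveASG := map[string]string{key: "owned"}
	fmt.Println(stringEqualsResourceTag(liveASG, key, "owned")) // true

	// Instance already terminated: no ASG, hence no tags to match.
	fmt.Println(stringEqualsResourceTag(nil, key, "owned")) // false
}
```

If this is what happens, the 403 is a side effect of the instance being gone rather than a real permission problem.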

What you expected to happen:

Maybe upgrade-manager should keep going on 403 from TerminateInstanceInAutoScalingGroup, assuming it was already terminated?
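
A rough sketch of what that could look like, in Go. This is not the upgrade-manager code: codedError, fakeAWSError, isAccessDenied and shouldContinueAfterTerminate are hypothetical names, and codedError only mirrors the Code() method that AWS SDK error types expose, so the example builds with the standard library alone. It also assumes the controller can confirm separately that the instance is gone (for example via ec2:DescribeInstances, which the policy already allows), rather than treating every 403 as success.

```go
package main

import (
	"errors"
	"fmt"
)

// codedError mirrors the Code() method exposed by AWS SDK error types.
// It is declared locally (hypothetical) to keep the sketch dependency-free.
type codedError interface {
	error
	Code() string
}

// fakeAWSError is a hypothetical stand-in for an SDK error, used only in the demo.
type fakeAWSError struct{ code, msg string }

func (e fakeAWSError) Error() string { return e.code + ": " + e.msg }
func (e fakeAWSError) Code() string  { return e.code }

// isAccessDenied reports whether err, or any error it wraps, carries the
// AccessDenied code.
func isAccessDenied(err error) bool {
	var ce codedError
	return errors.As(err, &ce) && ce.Code() == "AccessDenied"
}

// shouldContinueAfterTerminate decides whether the rollout may proceed after
// a terminate call returned err. instanceGone is the result of an independent
// check that the instance no longer exists (assumption of this sketch).
func shouldContinueAfterTerminate(err error, instanceGone bool) bool {
	if err == nil {
		return true
	}
	return isAccessDenied(err) && instanceGone
}

func main() {
	denied := fmt.Errorf("terminate failed: %w", fakeAWSError{"AccessDenied", "not authorized"})

	fmt.Println(shouldContinueAfterTerminate(denied, true))  // true: 403, but instance is gone
	fmt.Println(shouldContinueAfterTerminate(denied, false)) // false: 403 and instance still exists
	fmt.Println(shouldContinueAfterTerminate(nil, false))    // true: terminate call succeeded
}
```

Requiring the extra existence check keeps a genuinely misconfigured policy from being silently ignored, at the cost of one more describe call per failed termination.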

How to reproduce it (as minimally and precisely as possible):

You could try:

  • Use the IAM policy shared here
  • Start rolling upgrade
  • Terminate a worker before upgrade-manager gets its chance

Anything else we need to know?:

Environment:

  • rolling-upgrade-controller v0.17
  • Kubernetes version:
$ kubectl version -o yaml
clientVersion:
  buildDate: "2020-09-02T11:40:00Z"
  compiler: gc
  gitCommit: 2adc8d7091e89b6e3ca8d048140618ec89b39369
  gitTreeState: clean
  gitVersion: v1.16.15
  goVersion: go1.13.15
  major: "1"
  minor: "16"
  platform: darwin/amd64
serverVersion:
  buildDate: "2020-10-20T23:27:12Z"
  compiler: gc
  gitCommit: ad4801fd44fe0f125c8d13f1b1d4827e8884476d
  gitTreeState: clean
  gitVersion: v1.16.15-eks-ad4801
  goVersion: go1.13.15
  major: "1"
  minor: 16+
  platform: linux/amd64

Other debugging information (if applicable):

  • RollingUpgrade status:
$ kubectl describe rollingupgrade <rollingupgrade-name>
  • controller logs:
$ kubectl logs <rolling-upgrade-controller pod>