Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Typo in CAPV controller manager ClusterRole label selector #2283

Open
srm09 opened this issue Aug 27, 2023 · 14 comments
Open

Typo in CAPV controller manager ClusterRole label selector #2283

srm09 opened this issue Aug 27, 2023 · 14 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@srm09
Copy link
Contributor

srm09 commented Aug 27, 2023

/kind bug

What steps did you take and what happened:
I was looking at the aggregated role that CAPV uses for the controller manager. It specifically uses that to allow provider service account controller to be able to create roles/role bindings with extra RBAC permissions that the ones granted to the manager for normal operation.
It uses a ClusterRole with label selector capv.infrastucture.cluster.x-k8s.io/aggregate-to-manager: "true". As you can see there is a typo in infrastructure.

What did you expect to happen:
The label selector should not have the typo. I'd propose for it to be replaced with a newer label selector capv.infrastructure.cluster.x-k8s.io/aggregate-to-manager: "true" (without the typo) and the label selector with the typo to be deprecated and eventually phased out in favor of the newer one.

Anything else you would like to add:
For a few releases, CAPV should support both and release note the new label selector as well as the deprecation of the existing one with the date probable schedule of the deletion.

Environment:

  • Cluster-api-provider-vsphere version:
  • Kubernetes version: (use kubectl version):
  • OS (e.g. from /etc/os-release):
@k8s-ci-robot k8s-ci-robot added the kind/bug Categorizes issue or PR as related to a bug. label Aug 27, 2023
@sbueringer
Copy link
Member

Overall sounds good to me. Can we have both labels at the same time?

I was looking at the CAPV repo and didn't fine a role with permissions with the "capv. infrastucture.cluster.x-k8s.io/aggregate-to-manager" label. Can you provide some context around how this is used?

@srm09
Copy link
Contributor Author

srm09 commented Aug 28, 2023

The ProviderServiceAccount CR when reconciled, generates a service account in the workload cluster with permissions specified via a set of RBAC rules.
For the CAPV controller to be able to generate a Role/RoleBinding with those permissions, CAPV's controller manager role should have those RBAC permissions as well. The idea behind the aggregated ClusterRole is
An end user who needs a ProviderServiceAccount (with specific permissions) created, adds a ClusterRole with those permissions and the selector label set. CAPV's aggregated manager role picks up the permissions and is ultimately able to reconcile the ProviderServiceAccount CR.

@sbueringer
Copy link
Member

Sounds good to me!

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jan 27, 2024
@kumarankit999
Copy link

Hey @srm09 , Thanks for raising this Issue.
It would be great If you could refer to the link to that page where we have to do the Changes!

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough active contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle rotten
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle rotten

@k8s-ci-robot k8s-ci-robot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 3, 2024
@zhanggbj
Copy link
Contributor

zhanggbj commented Mar 4, 2024

/remove-lifecycle rotten

@k8s-ci-robot k8s-ci-robot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 4, 2024
@chrischdi
Copy link
Member

There are two occurrences for the wrong label capv.infrastucture.cluster.x-k8s.io/aggregate-to-manager:

  1. At the ClusterRole manager-role as label:

https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/config/base/manager_role_aggregation_patch.yaml#L6

  1. At the Clusterrole aggregated-manager-role at aggregationRule.clusterRoleSelectors[]matchLabels

https://github.com/kubernetes-sigs/cluster-api-provider-vsphere/blob/main/config/rbac/aggregate_role.yaml#L8

To fix the issue we should propably do a multi-step approach with multiple phases:

  1. Migration phase: make both labels work, the old and the new
  • Add a second label to the manager-role (here):

    capv.infrastructure.cluster.x-k8s.io/aggregate-to-manager: "true"
  • Add a second entry at aggregationRule.clusterRoleSelectors at the aggregated-manager-role:

    - matchLabels:
        capv.infrastructure.cluster.x-k8s.io/aggregate-to-manager: "true"
  1. Wait phase: wait for X minor releases having both labels in place

  2. Cleanup phase: cleanup the old labels in both places.

Alternative approach

The alternative approach would be to just fix the string in both places.

This would add risk to existing users, if they created an additional ClusterRole which references the manager-role via the old label (by an aggregationRule).

@chrischdi
Copy link
Member

Before proceeding here with an implementation, I think we need consensus on if the alternative approach is okay (because it would be easier) or if we have to go down the long road.

Note: the above needs a review if I missed something here.

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 2, 2024
@chrischdi
Copy link
Member

/remove-lifecycle stale

@sbueringer / @fabriziopandini : I propose to take the safe path and:

  1. for v1.11: add the additional label in both places (so v1.11 has both labels)
  2. for v1.12: do nothing (so v1.12 has both labels)
  3. for v1.13: remove the old label (TBD, if we want to remove it even later)

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 3, 2024
@sbueringer
Copy link
Member

sbueringer commented Jun 3, 2024

Sounds good to me

(cc @zhanggbj PTAL to figure out what our options are for where we use CAPV (not sure how the CAPV manifests are used there, maybe we can just directly use the new label there))

@k8s-triage-robot
Copy link

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 1, 2024
@chrischdi
Copy link
Member

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Sep 2, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind/bug Categorizes issue or PR as related to a bug.
Projects
None yet
Development

No branches or pull requests

7 participants