
Add Services (ServiceTemplates) to ManagedCluster to deploy on target cluster #362

Merged
4 commits merged into Mirantis:main from reconcile-servicetemplates on Oct 1, 2024

Conversation

Contributor

@wahabmk wahabmk commented Sep 20, 2024

Description

This PR adds the ability to specify Services (ServiceTemplates) on a ManagedCluster object, which are then deployed to the target cluster via a Sveltos ClusterProfile (see the linked issue "Reconcile ServiceTemplates in ManagedCluster controller").
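
For reference, a rough sketch of the kind of services section exercised by the testing below; the exact spec field names are defined by this PR's API changes, so everything under services here is illustrative rather than authoritative:

apiVersion: hmc.mirantis.com/v1alpha1
kind: ManagedCluster
metadata:
  name: wali-aws-dev
  namespace: hmc-system
spec:
  # . . . provider/template configuration omitted . . .
  services:                        # hypothetical field names, for illustration only
  - template: kyverno-3-2-6        # reference to a ServiceTemplate; name is illustrative
    name: kyverno
  - template: ingress-nginx-4-11-0
    name: ingress-nginx
    values: |
      fullnameOverride: ingress-nginx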

Testing

  • Ran make dev-apply && make dev-creds-apply and waited for everything to be running.
  • Ran make dev-mcluster-apply and waited for everything to be running.

Provisioning

On Management Cluster

We can see the ClusterProfile object was created with kyverno and ingress-nginx services:

~ kubectl get clusterprofiles                          
NAME                      AGE
hmc-system-wali-aws-dev   6m46s
~ kubectl get clusterprofile hmc-system-wali-aws-dev -o yaml
apiVersion: config.projectsveltos.io/v1beta1
kind: ClusterProfile
metadata:
  creationTimestamp: "2024-09-20T18:48:20Z"
  finalizers:
  - clusterprofilefinalizer.projectsveltos.io
  generation: 1
  labels:
    hmc.mirantis.com/managed: "true"
    projectsveltos.io/cluster-name: wali-aws-dev
    projectsveltos.io/cluster-profile-name: hmc-system-wali-aws-dev
    projectsveltos.io/cluster-type: Capi
  name: hmc-system-wali-aws-dev
  ownerReferences:
  - apiVersion: hmc.mirantis.com/v1alpha1
    kind: ManagedCluster
    name: wali-aws-dev
    uid: f5bc78c8-46ea-4efd-bf1c-e2c3244312ff
  resourceVersion: "3632"
  uid: ccfb11a4-3f3e-4863-90a1-202e5deb8788
spec:
  clusterSelector:
    matchLabels:
      helm.toolkit.fluxcd.io/name: wali-aws-dev
      helm.toolkit.fluxcd.io/namespace: hmc-system
  continueOnConflict: false
  helmCharts:
  - chartName: kyverno
    chartVersion: 3.2.6
    helmChartAction: Install
    . . .
    registryCredentialsConfig:
      plainHTTP: true
    releaseName: kyverno
    releaseNamespace: kyverno
    repositoryName: kyverno
    repositoryURL: oci://hmc-local-registry:5000/charts
  - chartName: ingress-nginx
    chartVersion: 4.11.0
    helmChartAction: Install
    . . .
    registryCredentialsConfig:
      plainHTTP: true
    releaseName: ingress-nginx
    releaseNamespace: ingress-nginx
    repositoryName: ingress-nginx
    repositoryURL: oci://hmc-local-registry:5000/charts
    values: |
      fullnameOverride: ingress-nginx
  reloader: false
  stopMatchingBehavior: WithdrawPolicies
  syncMode: Continuous
  tier: 100
status:
  matchingClusters:
  - apiVersion: cluster.x-k8s.io/v1beta1
    kind: Cluster
    name: wali-aws-dev
    namespace: hmc-system

We can see the associated ClusterSummary object was also created and reports that the services have been "Provisioned" onto the target cluster:

~ kubectl -n hmc-system get clustersummary hmc-system-wali-aws-dev-capi-wali-aws-dev -o yaml
apiVersion: config.projectsveltos.io/v1beta1
kind: ClusterSummary
metadata:
  . . .
  generation: 1
  . . .
  resourceVersion: "6030"
  uid: 500b3a9b-8bf7-479a-a1e9-7edaf314e9dc
spec:
  clusterName: wali-aws-dev
  clusterNamespace: hmc-system
  clusterProfileSpec:
    clusterSelector:
      matchLabels:
        helm.toolkit.fluxcd.io/name: wali-aws-dev
        helm.toolkit.fluxcd.io/namespace: hmc-system
    continueOnConflict: false
    helmCharts:
    - chartName: kyverno
      chartVersion: 3.2.6
      helmChartAction: Install
      . . .
      registryCredentialsConfig:
        plainHTTP: true
      releaseName: kyverno
      releaseNamespace: kyverno
      repositoryName: kyverno
      repositoryURL: oci://hmc-local-registry:5000/charts
    - chartName: ingress-nginx
      chartVersion: 4.11.0
      helmChartAction: Install
      . . .
      registryCredentialsConfig:
        plainHTTP: true
      releaseName: ingress-nginx
      releaseNamespace: ingress-nginx
      repositoryName: ingress-nginx
      repositoryURL: oci://hmc-local-registry:5000/charts
      values: |
        fullnameOverride: ingress-nginx
    reloader: false
    stopMatchingBehavior: WithdrawPolicies
    syncMode: Continuous
    tier: 100
  clusterType: Capi
status:
  dependencies: no dependencies
  featureSummaries:
  - featureID: Helm
    hash: 8ZDFC0FQZ2j1VHZPAeecZpAtyOUBotyUGBJGosO4tYA=
    lastAppliedTime: "2024-09-20T18:53:51Z"
    status: Provisioned
  helmReleaseSummaries:
  - releaseName: kyverno
    releaseNamespace: kyverno
    status: Managing
    valuesHash: Eq4yyx7ALQHto1gbEnwf7jsNxTVy7WuvI5choD2C4SY=
  - releaseName: ingress-nginx
    releaseNamespace: ingress-nginx
    status: Managing
    valuesHash: qYgUi/xTJIMlaXCLxb/XjCBv5xso8nVHHQ0copZdxl4=

On Target Cluster

We can see both kyverno and ingress-nginx running on the target cluster:

~ kubectl get pod -A | grep Running
ingress-nginx    ingress-nginx-controller-5bfc858768-m5xd4        1/1     Running   0          3m53s
kube-system      aws-cloud-controller-manager-fjfg2               1/1     Running   0          6m10s
kube-system      calico-kube-controllers-695f6448bd-fckbc         1/1     Running   0          7m7s
kube-system      calico-node-7tv5t                                1/1     Running   0          6m49s
kube-system      calico-node-wkxvg                                1/1     Running   0          4m53s
kube-system      coredns-6997b8f8bd-f966x                         1/1     Running   0          4m43s
kube-system      coredns-6997b8f8bd-ht4qs                         1/1     Running   0          4m43s
kube-system      ebs-csi-controller-5c9db44f4f-5cs6w              5/5     Running   0          7m4s
kube-system      ebs-csi-controller-5c9db44f4f-6twcq              5/5     Running   0          7m4s
kube-system      ebs-csi-node-ctcfp                               3/3     Running   0          6m49s
kube-system      ebs-csi-node-mh8w2                               3/3     Running   0          4m53s
kube-system      kube-proxy-gsw28                                 1/1     Running   0          6m49s
kube-system      kube-proxy-wkz7d                                 1/1     Running   0          4m53s
kube-system      metrics-server-7cc78958fc-n6jrp                  1/1     Running   0          7m7s
kyverno          kyverno-admission-controller-776987899-n9mt9     1/1     Running   0          6m50s
kyverno          kyverno-background-controller-86b9f95c96-bbnmk   1/1     Running   0          6m50s
kyverno          kyverno-cleanup-controller-7bbfc97569-5hjtn      1/1     Running   0          6m50s
kyverno          kyverno-reports-controller-665ccb5b65-cvb6d      1/1     Running   0          6m50s
projectsveltos   sveltos-agent-manager-67d6ffbd86-5vx9z           1/1     Running   0          6m57s

Setting install=false for ingress-nginx

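The flag was flipped directly on the ManagedCluster object, for example (the exact location of the install field in the spec is illustrative; the field name itself is the one used throughout this walkthrough):

~ kubectl -n hmc-system edit managedclusters.hmc.mirantis.com wali-aws-dev
  # in the services list, set install: false on the ingress-nginx entry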

By setting install=false for the ingress-nginx service on the ManagedCluster object, that service was removed from the ClusterProfile and, in turn, the ClusterSummary object:

~ kubectl -n hmc-system get clustersummary hmc-system-wali-aws-dev-capi-wali-aws-dev -o yaml
apiVersion: config.projectsveltos.io/v1beta1
kind: ClusterSummary
metadata:
  . . .
  generation: 2
  . . .
  resourceVersion: "8661"
  uid: 500b3a9b-8bf7-479a-a1e9-7edaf314e9dc
spec:
  clusterName: wali-aws-dev
  clusterNamespace: hmc-system
  clusterProfileSpec:
    clusterSelector:
      matchLabels:
        helm.toolkit.fluxcd.io/name: wali-aws-dev
        helm.toolkit.fluxcd.io/namespace: hmc-system
    continueOnConflict: false
    helmCharts:
    - chartName: kyverno
      chartVersion: 3.2.6
      helmChartAction: Install
      . . .
      registryCredentialsConfig:
        plainHTTP: true
      releaseName: kyverno
      releaseNamespace: kyverno
      repositoryName: kyverno
      repositoryURL: oci://hmc-local-registry:5000/charts
    reloader: false
    stopMatchingBehavior: WithdrawPolicies
    syncMode: Continuous
    tier: 100
  clusterType: Capi
status:
  dependencies: no dependencies
  featureSummaries:
  - featureID: Helm
    hash: 2BR25VJae9DRUoGqxh8+6vQ+pKRWoogtkqCCJajwtek=
    lastAppliedTime: "2024-09-20T19:00:12Z"
    status: Provisioned
  helmReleaseSummaries:
  - releaseName: kyverno
    releaseNamespace: kyverno
    status: Managing
    valuesHash: Eq4yyx7ALQHto1gbEnwf7jsNxTVy7WuvI5choD2C4SY=

We can see that ingress-nginx was uninstalled from the target cluster:

~ kubectl get pod -A | grep Running
kube-system      aws-cloud-controller-manager-fjfg2                         1/1     Running     0          9m45s
kube-system      calico-kube-controllers-695f6448bd-fckbc                   1/1     Running     0          10m
kube-system      calico-node-7tv5t                                          1/1     Running     0          10m
kube-system      calico-node-wkxvg                                          1/1     Running     0          8m28s
kube-system      coredns-6997b8f8bd-f966x                                   1/1     Running     0          8m18s
kube-system      coredns-6997b8f8bd-ht4qs                                   1/1     Running     0          8m18s
kube-system      ebs-csi-controller-5c9db44f4f-5cs6w                        5/5     Running     0          10m
kube-system      ebs-csi-controller-5c9db44f4f-6twcq                        5/5     Running     0          10m
kube-system      ebs-csi-node-ctcfp                                         3/3     Running     0          10m
kube-system      ebs-csi-node-mh8w2                                         3/3     Running     0          8m28s
kube-system      kube-proxy-gsw28                                           1/1     Running     0          10m
kube-system      kube-proxy-wkz7d                                           1/1     Running     0          8m28s
kube-system      metrics-server-7cc78958fc-n6jrp                            1/1     Running     0          10m
kyverno          kyverno-admission-controller-776987899-n9mt9               1/1     Running     0          10m
kyverno          kyverno-background-controller-86b9f95c96-bbnmk             1/1     Running     0          10m
kyverno          kyverno-cleanup-controller-7bbfc97569-5hjtn                1/1     Running     0          10m
kyverno          kyverno-reports-controller-665ccb5b65-cvb6d                1/1     Running     0          10m
projectsveltos   sveltos-agent-manager-67d6ffbd86-5vx9z                     1/1     Running     0          10m

Making the services list empty


We can see that the ClusterSummary object no longer shows a helmCharts list:

~ kubectl -n hmc-system get clustersummary hmc-system-wali-aws-dev-capi-wali-aws-dev -o yaml
apiVersion: config.projectsveltos.io/v1beta1
kind: ClusterSummary
metadata:
  creationTimestamp: "2024-09-20T18:48:20Z"
  finalizers:
  - clustersummaryfinalizer.projectsveltos.io
  generation: 3
  labels:
    hmc.mirantis.com/managed: "true"
    projectsveltos.io/cluster-name: wali-aws-dev
    projectsveltos.io/cluster-profile-name: hmc-system-wali-aws-dev
    projectsveltos.io/cluster-type: Capi
  name: hmc-system-wali-aws-dev-capi-wali-aws-dev
  namespace: hmc-system
  ownerReferences:
  - apiVersion: config.projectsveltos.io/v1beta1
    kind: ClusterProfile
    name: hmc-system-wali-aws-dev
    uid: ccfb11a4-3f3e-4863-90a1-202e5deb8788
  resourceVersion: "9610"
  uid: 500b3a9b-8bf7-479a-a1e9-7edaf314e9dc
spec:
  clusterName: wali-aws-dev
  clusterNamespace: hmc-system
  clusterProfileSpec:
    clusterSelector:
      matchLabels:
        helm.toolkit.fluxcd.io/name: wali-aws-dev
        helm.toolkit.fluxcd.io/namespace: hmc-system
    continueOnConflict: false
    reloader: false
    stopMatchingBehavior: WithdrawPolicies
    syncMode: Continuous
    tier: 100
  clusterType: Capi
status:
  dependencies: no dependencies
  featureSummaries:
  - featureID: Helm
    hash: 47DEQpj8HBSa+/TImW+5JCeuQeRkm5NMpJWZG3hSuFU=
    lastAppliedTime: "2024-09-20T19:02:31Z"
    status: Provisioning
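
A quick way to confirm the helmCharts list is gone is to query it directly; jsonpath prints nothing when the field is absent:

~ kubectl -n hmc-system get clustersummary hmc-system-wali-aws-dev-capi-wali-aws-dev -o jsonpath='{.spec.clusterProfileSpec.helmCharts}'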

As expected, we can see that both ingress-nginx and kyverno have been uninstalled from the target cluster:

~ kubectl get pod -A  
NAMESPACE        NAME                                       READY   STATUS    RESTARTS   AGE
kube-system      aws-cloud-controller-manager-fjfg2         1/1     Running   0          15m
kube-system      calico-kube-controllers-695f6448bd-fckbc   1/1     Running   0          16m
kube-system      calico-node-7tv5t                          1/1     Running   0          15m
kube-system      calico-node-wkxvg                          1/1     Running   0          13m
kube-system      coredns-6997b8f8bd-f966x                   1/1     Running   0          13m
kube-system      coredns-6997b8f8bd-ht4qs                   1/1     Running   0          13m
kube-system      ebs-csi-controller-5c9db44f4f-5cs6w        5/5     Running   0          15m
kube-system      ebs-csi-controller-5c9db44f4f-6twcq        5/5     Running   0          15m
kube-system      ebs-csi-node-ctcfp                         3/3     Running   0          15m
kube-system      ebs-csi-node-mh8w2                         3/3     Running   0          13m
kube-system      kube-proxy-gsw28                           1/1     Running   0          15m
kube-system      kube-proxy-wkz7d                           1/1     Running   0          13m
kube-system      metrics-server-7cc78958fc-n6jrp            1/1     Running   0          16m
projectsveltos   sveltos-agent-manager-67d6ffbd86-5vx9z     1/1     Running   0          15m

Re-enabling both services


We see that the ClusterSummary object again shows the list of helmCharts:

~ kubectl -n hmc-system get clustersummary hmc-system-wali-aws-dev-capi-wali-aws-dev -o yaml
apiVersion: config.projectsveltos.io/v1beta1
kind: ClusterSummary
metadata:
  generation: 4
  . . .
  resourceVersion: "12277"
  uid: 500b3a9b-8bf7-479a-a1e9-7edaf314e9dc
spec:
  clusterName: wali-aws-dev
  clusterNamespace: hmc-system
  clusterProfileSpec:
    clusterSelector:
      matchLabels:
        helm.toolkit.fluxcd.io/name: wali-aws-dev
        helm.toolkit.fluxcd.io/namespace: hmc-system
    continueOnConflict: false
    helmCharts:
    - chartName: kyverno
      chartVersion: 3.2.6
      helmChartAction: Install
      . . .
      registryCredentialsConfig:
        plainHTTP: true
      releaseName: kyverno
      releaseNamespace: kyverno
      repositoryName: kyverno
      repositoryURL: oci://hmc-local-registry:5000/charts
    - chartName: ingress-nginx
      chartVersion: 4.11.0
      helmChartAction: Install
      . . .
      registryCredentialsConfig:
        plainHTTP: true
      releaseName: ingress-nginx
      releaseNamespace: ingress-nginx
      repositoryName: ingress-nginx
      repositoryURL: oci://hmc-local-registry:5000/charts
      values: |
        fullnameOverride: ingress-nginx
    reloader: false
    stopMatchingBehavior: WithdrawPolicies
    syncMode: Continuous
    tier: 100
  clusterType: Capi
status:
  dependencies: no dependencies
  featureSummaries:
  - featureID: Helm
    hash: 8ZDFC0FQZ2j1VHZPAeecZpAtyOUBotyUGBJGosO4tYA=
    lastAppliedTime: "2024-09-20T19:08:52Z"
    status: Provisioned
  helmReleaseSummaries:
  - releaseName: kyverno
    releaseNamespace: kyverno
    status: Managing
    valuesHash: Eq4yyx7ALQHto1gbEnwf7jsNxTVy7WuvI5choD2C4SY=
  - releaseName: ingress-nginx
    releaseNamespace: ingress-nginx
    status: Managing
    valuesHash: qYgUi/xTJIMlaXCLxb/XjCBv5xso8nVHHQ0copZdxl4=

Both ingress-nginx and kyverno have again been installed on the target cluster:

~ kubectl get pod -A
NAMESPACE        NAME                                             READY   STATUS    RESTARTS   AGE
ingress-nginx    ingress-nginx-controller-5bfc858768-dmt84        1/1     Running   0          55s
kube-system      aws-cloud-controller-manager-fjfg2               1/1     Running   0          18m
kube-system      calico-kube-controllers-695f6448bd-fckbc         1/1     Running   0          19m
kube-system      calico-node-7tv5t                                1/1     Running   0          18m
kube-system      calico-node-wkxvg                                1/1     Running   0          17m
kube-system      coredns-6997b8f8bd-f966x                         1/1     Running   0          16m
kube-system      coredns-6997b8f8bd-ht4qs                         1/1     Running   0          16m
kube-system      ebs-csi-controller-5c9db44f4f-5cs6w              5/5     Running   0          19m
kube-system      ebs-csi-controller-5c9db44f4f-6twcq              5/5     Running   0          19m
kube-system      ebs-csi-node-ctcfp                               3/3     Running   0          18m
kube-system      ebs-csi-node-mh8w2                               3/3     Running   0          17m
kube-system      kube-proxy-gsw28                                 1/1     Running   0          18m
kube-system      kube-proxy-wkz7d                                 1/1     Running   0          17m
kube-system      metrics-server-7cc78958fc-n6jrp                  1/1     Running   0          19m
kyverno          kyverno-admission-controller-776987899-qw8g6     1/1     Running   0          67s
kyverno          kyverno-background-controller-86b9f95c96-8nmt5   1/1     Running   0          67s
kyverno          kyverno-cleanup-controller-7bbfc97569-zg86g      1/1     Running   0          67s
kyverno          kyverno-reports-controller-665ccb5b65-jg4xb      1/1     Running   0          67s
projectsveltos   sveltos-agent-manager-67d6ffbd86-5vx9z           1/1     Running   0          19m

Finally, deleting the ManagedCluster object

~ kubectl -n hmc-system delete managedclusters.hmc.mirantis.com wali-aws-dev
managedcluster.hmc.mirantis.com "wali-aws-dev" deleted

Wait a while for the deletion to finish . . .
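
Alternatively, the deletion can be waited on explicitly; kubectl wait works for custom resources as well (the timeout value here is arbitrary):

~ kubectl -n hmc-system wait --for=delete managedclusters.hmc.mirantis.com/wali-aws-dev --timeout=30m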

~ kubectl -n hmc-system get managedclusters.hmc.mirantis.com 
No resources found in hmc-system namespace.

We can see that the associated ClusterProfile and ClusterSummary objects have also been deleted from the management cluster:

~ kubectl get clusterprofiles.config.projectsveltos.io 
No resources found
~ kubectl -n hmc-system get clustersummaries.config.projectsveltos.io
No resources found in hmc-system namespace.

@wahabmk wahabmk self-assigned this Sep 20, 2024
@wahabmk wahabmk force-pushed the reconcile-servicetemplates branch 6 times, most recently from 4534079 to 01a018f Compare September 23, 2024 02:32
@wahabmk wahabmk marked this pull request as ready for review September 23, 2024 11:03
@wahabmk wahabmk added the enhancement Small feature, request or improvement suggestion label Sep 23, 2024
}

// DeleteClusterProfile issues delete on ClusterProfile object.
func DeleteClusterProfile(ctx context.Context, cl client.Client, namespace string, name string) error {
Contributor

Can we create some generic "remover"? Currently the only difference between DeleteHelmRelease and DeleteClusterProfile is the client.Object passed to the Delete method; the parameters and logic are otherwise the same.
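
A rough sketch of what such a shared helper could look like, assuming the controller-runtime client already used in this repo; the function name and package placement are illustrative, not what the PR ultimately implements:

package sveltos

import (
	"context"

	"sigs.k8s.io/controller-runtime/pkg/client"
)

// deleteObject deletes any client.Object identified by namespace/name and
// treats "already gone" as success, so deletion stays idempotent.
func deleteObject(ctx context.Context, cl client.Client, obj client.Object, namespace, name string) error {
	obj.SetNamespace(namespace)
	obj.SetName(name)
	return client.IgnoreNotFound(cl.Delete(ctx, obj))
}

Callers would then pass the concrete type, analogous to the two existing functions, e.g. deleteObject(ctx, cl, &sveltosv1beta1.ClusterProfile{}, namespace, name) or deleteObject(ctx, cl, &hcv2.HelmRelease{}, namespace, name) (import aliases illustrative).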

@DinaBelova DinaBelova linked an issue Sep 23, 2024 that may be closed by this pull request
Resolved (outdated) review threads on: api/v1alpha1/managedcluster_types.go, internal/sveltos/clusterprofile.go, templates/service/ingress-nginx/Chart.yaml, templates/service/kyverno/Chart.yaml, internal/controller/managedcluster_controller.go
@wahabmk wahabmk force-pushed the reconcile-servicetemplates branch 2 times, most recently from 07738dc to ae51ffc Compare September 25, 2024 10:44
Comment on lines 533 to 588
err = sveltos.DeleteProfile(ctx, r.Client, managedCluster.Namespace, managedCluster.Name)
if err != nil {
return ctrl.Result{}, err
}
Contributor Author

@wahabmk wahabmk Sep 25, 2024

@Kshatrix Concerning your suggestion to remove the Sveltos finalizer: I think it is better not to remove it, because otherwise the objects cleaned up in the reconcileDeleteCommon() func may be left hanging in the hmc-system namespace on the management cluster after the Profile object has been deleted.

Collaborator

ack. then we should wait for Profile removal before we remove finalizers from the managedcluster object (to not leave hanging resources behind unnoticed)
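
An illustrative fragment of what that could look like in the reconciler's delete path (the Sveltos type, the finalizer constant, and the requeue interval are placeholders, not this PR's actual code):

if err := sveltos.DeleteProfile(ctx, r.Client, managedCluster.Namespace, managedCluster.Name); err != nil {
	return ctrl.Result{}, err
}

// Only drop our own finalizer once the Sveltos Profile is actually gone,
// so its cleanup is never skipped silently.
profile := &sveltosv1beta1.Profile{}
err := r.Client.Get(ctx, client.ObjectKey{Namespace: managedCluster.Namespace, Name: managedCluster.Name}, profile)
switch {
case err == nil:
	// Profile still exists: requeue and keep the ManagedCluster finalizer.
	return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
case !apierrors.IsNotFound(err):
	return ctrl.Result{}, err
}

controllerutil.RemoveFinalizer(managedCluster, hmc.ManagedClusterFinalizer) // finalizer constant is hypothetical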

@wahabmk wahabmk requested review from Kshatrix and eromanova and removed request for Kshatrix and eromanova September 26, 2024 12:38
Resolved (outdated) review threads on: internal/controller/managedcluster_controller.go, internal/sveltos/profile.go
Contributor Author

wahabmk commented Sep 27, 2024

The CI failure is an auth-related one:

ERROR: failed to solve: golang:1.22: failed to resolve source metadata for docker.io/library/golang:1.22: failed to authorize: failed to fetch oauth token: unexpected status from GET request to https://auth.docker.io/token?scope=repository%3Alibrary%2Fgolang%3Apull&service=registry.docker.io: 401 Unauthorized

@wahabmk wahabmk force-pushed the reconcile-servicetemplates branch 2 times, most recently from b00ae13 to 4e0f8c0 Compare September 30, 2024 15:24
Contributor Author

wahabmk commented Oct 1, 2024

ack. then we should wait for Profile removal before we remove finalizers from the managedcluster object (to not leave hanging resources behind unnoticed)

done

@wahabmk wahabmk requested a review from Kshatrix October 1, 2024 03:18
@Kshatrix Kshatrix merged commit 0e7867b into Mirantis:main Oct 1, 2024
3 checks passed
@wahabmk wahabmk deleted the reconcile-servicetemplates branch October 4, 2024 15:31
Labels
enhancement Small feature, request or improvement suggestion
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Reconcile ServiceTemplates in ManagedCluster controller
4 participants