Implement NetworkFenceClass Controller #703

Merged: 10 commits, Nov 13, 2024

Madhu-1 (Member) commented Nov 12, 2024

This PR adds or updates the following functionality in the controller:

NetworkFenceClass Controller

  • Watch the CSIAddonsNode objects; on an event, list all NetworkFenceClasses (NFCs) matching the driver name and reconcile them
  • Add an annotation to every CSIAddonsNode object that matches the driver name and also advertises the GET_CLIENTS_TO_FENCE capability
  • Remove the annotation upon deletion
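
The annotation bookkeeping described above could be sketched roughly as follows. This is a minimal, self-contained illustration and not the controller code from this PR: the `node` type and the `eligible` and `setClass` helpers are hypothetical stand-ins, and only the annotation key and capability string are taken from the test output in this description.

```go
package main

import (
	"encoding/json"
	"fmt"
)

const (
	// Annotation key and capability string as they appear in the test output.
	nfcAnnotation = "csiaddons.openshift.io/networkfenceclass-names"
	getClientsCap = "network_fence.GET_CLIENTS_TO_FENCE"
)

// node is a hypothetical, trimmed-down stand-in for a CSIAddonsNode.
type node struct {
	DriverName   string
	Capabilities []string
	Annotations  map[string]string
}

// eligible reports whether the node belongs to the driver and advertises
// the GET_CLIENTS_TO_FENCE capability.
func eligible(n *node, driver string) bool {
	if n.DriverName != driver {
		return false
	}
	for _, c := range n.Capabilities {
		if c == getClientsCap {
			return true
		}
	}
	return false
}

// setClass adds (present=true) or removes (present=false) className in the
// JSON list stored under nfcAnnotation. Ineligible nodes are left untouched.
func setClass(n *node, driver, className string, present bool) error {
	if !eligible(n, driver) {
		return nil
	}
	var current []string
	if raw, ok := n.Annotations[nfcAnnotation]; ok {
		if err := json.Unmarshal([]byte(raw), &current); err != nil {
			return fmt.Errorf("parsing %s: %w", nfcAnnotation, err)
		}
	}
	// Rebuild the list without className, then re-add it if requested.
	names := make([]string, 0, len(current)+1)
	for _, s := range current {
		if s != className {
			names = append(names, s)
		}
	}
	if present {
		names = append(names, className)
	}
	if len(names) == 0 {
		delete(n.Annotations, nfcAnnotation)
		return nil
	}
	out, err := json.Marshal(names)
	if err != nil {
		return err
	}
	if n.Annotations == nil {
		n.Annotations = map[string]string{}
	}
	n.Annotations[nfcAnnotation] = string(out)
	return nil
}

func main() {
	n := &node{
		DriverName:   "rook-ceph.rbd.csi.ceph.com",
		Capabilities: []string{"service.NODE_SERVICE", getClientsCap},
	}
	if err := setClass(n, n.DriverName, "networkfenceclass-sample", true); err != nil {
		panic(err)
	}
	fmt.Println(n.Annotations[nfcAnnotation])
}
```

Keeping all class names in one JSON list under a single key means each class is recorded at most once and the key disappears once the last class is removed.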

CSIAddonsNode Controller

  • Trigger an event when an annotation is added
  • Get the class name from the annotation, fetch the client details from the driver, and update the details in the CR status
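
A rough sketch of the two steps above, assuming the annotation value is the JSON list of class names shown in the test output. The `annotationChanged` and `classNames` helpers are hypothetical names used for illustration only; the actual controller wiring is not reproduced here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Annotation key as it appears in the test output.
const nfcAnnotation = "csiaddons.openshift.io/networkfenceclass-names"

// annotationChanged reports whether the NetworkFenceClass annotation differs
// between the old and new object metadata. A controller could use a check
// like this to decide whether an update should trigger a reconcile.
func annotationChanged(oldAnn, newAnn map[string]string) bool {
	oldVal, oldOK := oldAnn[nfcAnnotation]
	newVal, newOK := newAnn[nfcAnnotation]
	return oldOK != newOK || oldVal != newVal
}

// classNames decodes the JSON list of class names stored in the annotation.
// It returns nil when the annotation is absent.
func classNames(ann map[string]string) ([]string, error) {
	raw, ok := ann[nfcAnnotation]
	if !ok {
		return nil, nil
	}
	var names []string
	if err := json.Unmarshal([]byte(raw), &names); err != nil {
		return nil, fmt.Errorf("parsing %s: %w", nfcAnnotation, err)
	}
	return names, nil
}

func main() {
	oldAnn := map[string]string{}
	newAnn := map[string]string{nfcAnnotation: `["networkfenceclass-sample"]`}

	if annotationChanged(oldAnn, newAnn) {
		names, err := classNames(newAnn)
		if err != nil {
			panic(err)
		}
		for _, name := range names {
			// A real controller would query the driver here and write the
			// result into the CR status; this sketch only prints the name.
			fmt.Println("would fetch fence clients for", name)
		}
	}
}
```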

sidecar

  • Don't copy the metadata during the update operation
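
To illustrate why this matters: if an update overwrites the whole object, including its metadata, annotations added by another controller are lost. The sketch below uses hypothetical stand-in types (`objectMeta`, `nodeSpec`, `csiAddonsNode`) and a hypothetical `applyDesired` helper; it is not the sidecar code itself.

```go
package main

import "fmt"

// Hypothetical, trimmed-down stand-ins for the Kubernetes object types.
type objectMeta struct {
	Name        string
	Annotations map[string]string
}

type nodeSpec struct {
	Endpoint string
}

type csiAddonsNode struct {
	Meta objectMeta
	Spec nodeSpec
}

// applyDesired copies only the spec from desired into existing. The existing
// metadata, including annotations set by other controllers, stays in place.
func applyDesired(existing *csiAddonsNode, desired csiAddonsNode) {
	existing.Spec = desired.Spec
}

func main() {
	existing := &csiAddonsNode{
		Meta: objectMeta{
			Name: "minikube-rook-ceph-daemonset-csi-rbdplugin",
			Annotations: map[string]string{
				"csiaddons.openshift.io/networkfenceclass-names": `["networkfenceclass-sample"]`,
			},
		},
		Spec: nodeSpec{Endpoint: "pod://csi-rbdplugin-nn92s.rook-ceph:9070"},
	}
	// The desired object built by the updater carries no annotations.
	desired := csiAddonsNode{
		Meta: objectMeta{Name: existing.Meta.Name},
		Spec: nodeSpec{Endpoint: "pod://csi-rbdplugin-rqtjb.rook-ceph:9070"},
	}

	applyDesired(existing, desired)
	fmt.Println(existing.Spec.Endpoint, len(existing.Meta.Annotations))
}
```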

API

  • Update the CSIAddonsNode status to include the NetworkFenceClass client details

Note:

The NetworkFenceClass-Controller is not the owner of the CSIAddonsNode CR. Therefore, it should not update the status of the CSIAddonsNode with the IP address(es) that were detected by the NetworkFenceClass trigger.
Instead, the NetworkFenceClass-Controller adds an annotation, so that the CSIAddonsNode-Controller is informed to call the GetFenceClients() CSI-Addons procedure of the CSI driver and update the CSIAddonsNode/status after that.

Test results

  • When a CSIAddonsNode CR is created
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get csiaddonsnode
No resources found in rook-ceph namespace.
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get csiaddonsnode -oyaml
apiVersion: v1
items:
- apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    annotations:
      csiaddons.openshift.io/networkfenceclass-names: '["networkfenceclass-sample"]'
    creationTimestamp: "2024-11-12T13:27:25Z"
    finalizers:
    - csiaddons.openshift.io/csiaddonsnode
    generation: 1
    name: minikube-rook-ceph-daemonset-csi-rbdplugin
    namespace: rook-ceph
    resourceVersion: "111004"
    uid: a9d54f91-b0ec-46e9-a8b8-c1b58f109f15
  spec:
    driver:
      endpoint: pod://csi-rbdplugin-szbq7.rook-ceph:9070
      name: rook-ceph.rbd.csi.ceph.com
      nodeID: minikube
  status:
    capabilities:
    - service.NODE_SERVICE
    - reclaim_space.ONLINE
    - encryption_key_rotation.ENCRYPTIONKEYROTATION
    - network_fence.GET_CLIENTS_TO_FENCE
    message: Successfully established connection with sidecar
    networkFenceClientStatus:
    - ClientDetails:
      - cidrs:
        - 10.244.0.1/32
        id: a815fe8e-eabd-4e87-a6e8-78cebfb67d08
      networkFenceClassName: networkfenceclass-sample
    state: Connected
- apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    creationTimestamp: "2024-11-12T13:27:24Z"
    finalizers:
    - csiaddons.openshift.io/csiaddonsnode
    generation: 1
    name: minikube-rook-ceph-deployment-csi-rbdplugin-provisioner
    namespace: rook-ceph
    resourceVersion: "110984"
    uid: ca13abdf-8803-4075-a0c4-9ae51e3b758d
  spec:
    driver:
      endpoint: pod://csi-rbdplugin-provisioner-5cf8cf5d74-6hwqb.rook-ceph:9070
      name: rook-ceph.rbd.csi.ceph.com
      nodeID: minikube
  status:
    capabilities:
    - service.CONTROLLER_SERVICE
    - reclaim_space.OFFLINE
    - network_fence.NETWORK_FENCE
    - volume_replication.VOLUME_REPLICATION
    - volume_group.VOLUME_GROUP
    - volume_group.DO_NOT_ALLOW_VG_TO_DELETE_VOLUMES
    - volume_group.LIMIT_VOLUME_TO_ONE_VOLUME_GROUP
    - volume_group.MODIFY_VOLUME_GROUP
    - volume_group.GET_VOLUME_GROUP
    message: Successfully established connection with sidecar
    state: Connected
kind: List
metadata:
  resourceVersion: ""
  • When a CSIAddonsNode CR is recreated
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get csiaddonsnode
NAME                                                      NAMESPACE   AGE   DRIVERNAME                   ENDPOINT                                                          NODEID
minikube-rook-ceph-daemonset-csi-rbdplugin                rook-ceph   79s   rook-ceph.rbd.csi.ceph.com   pod://csi-rbdplugin-szbq7.rook-ceph:9070                          minikube
minikube-rook-ceph-deployment-csi-rbdplugin-provisioner   rook-ceph   80s   rook-ceph.rbd.csi.ceph.com   pod://csi-rbdplugin-provisioner-5cf8cf5d74-6hwqb.rook-ceph:9070   minikube
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc delete csiaddonsnode minikube-rook-ceph-deployment-csi-rbdplugin-provisioner minikube-rook-ceph-daemonset-csi-rbdplugin
csiaddonsnode.csiaddons.openshift.io "minikube-rook-ceph-deployment-csi-rbdplugin-provisioner" deleted
csiaddonsnode.csiaddons.openshift.io "minikube-rook-ceph-daemonset-csi-rbdplugin" deleted
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get po
NAME                                           READY   STATUS      RESTARTS   AGE
csi-rbdplugin-provisioner-5cf8cf5d74-6hwqb     6/6     Running     0          95s
csi-rbdplugin-szbq7                            3/3     Running     0          94s
rook-ceph-exporter-minikube-7d6cfcd474-kxz67   1/1     Running     0          7d2h
rook-ceph-mds-myfs-a-7758684bd6-4sskv          1/1     Running     0          7d2h
rook-ceph-mds-myfs-b-75cbc8485f-g68r9          1/1     Running     0          7d2h
rook-ceph-mgr-a-548898c75c-mzdzd               1/1     Running     0          7d3h
rook-ceph-mon-a-fb4bf88f7-7tdpm                1/1     Running     0          7d2h
rook-ceph-operator-5f68d9fbc6-2sfb4            1/1     Running     0          7d2h
rook-ceph-osd-0-67d55bb855-6x9ls               1/1     Running     0          7d2h
rook-ceph-osd-prepare-minikube-4zrgf           0/1     Completed   0          7d2h
rook-ceph-tools-68bf47bc65-77qln               1/1     Running     0          7d3h
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get csiaddonsnode
No resources found in rook-ceph namespace.
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc delete po csi-rbdplugin-provisioner-5cf8cf5d74-6hwqb csi-rbdplugin-szbq7
pod "csi-rbdplugin-provisioner-5cf8cf5d74-6hwqb" deleted
pod "csi-rbdplugin-szbq7" deleted
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get csiaddonsnode -oyaml
apiVersion: v1
items:
- apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    annotations:
      csiaddons.openshift.io/networkfenceclass-names: '["networkfenceclass-sample"]'
    creationTimestamp: "2024-11-12T13:29:12Z"
    finalizers:
    - csiaddons.openshift.io/csiaddonsnode
    generation: 1
    name: minikube-rook-ceph-daemonset-csi-rbdplugin
    namespace: rook-ceph
    resourceVersion: "111133"
    uid: ab604173-dec7-43fa-a3d6-c7e6c5e3e349
  spec:
    driver:
      endpoint: pod://csi-rbdplugin-nn92s.rook-ceph:9070
      name: rook-ceph.rbd.csi.ceph.com
      nodeID: minikube
  status:
    capabilities:
    - service.NODE_SERVICE
    - reclaim_space.ONLINE
    - encryption_key_rotation.ENCRYPTIONKEYROTATION
    - network_fence.GET_CLIENTS_TO_FENCE
    message: Successfully established connection with sidecar
    networkFenceClientStatus:
    - ClientDetails:
      - cidrs:
        - 10.244.0.1/32
        id: a815fe8e-eabd-4e87-a6e8-78cebfb67d08
      networkFenceClassName: networkfenceclass-sample
    state: Connected
- apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    creationTimestamp: "2024-11-12T13:29:15Z"
    finalizers:
    - csiaddons.openshift.io/csiaddonsnode
    generation: 1
    name: minikube-rook-ceph-deployment-csi-rbdplugin-provisioner
    namespace: rook-ceph
    resourceVersion: "111164"
    uid: dbd9ed63-8882-4e29-b25a-925c06de858f
  spec:
    driver:
      endpoint: pod://csi-rbdplugin-provisioner-5cf8cf5d74-p24kn.rook-ceph:9070
      name: rook-ceph.rbd.csi.ceph.com
      nodeID: minikube
  status:
    capabilities:
    - service.CONTROLLER_SERVICE
    - reclaim_space.OFFLINE
    - network_fence.NETWORK_FENCE
    - volume_replication.VOLUME_REPLICATION
    - volume_group.VOLUME_GROUP
    - volume_group.DO_NOT_ALLOW_VG_TO_DELETE_VOLUMES
    - volume_group.LIMIT_VOLUME_TO_ONE_VOLUME_GROUP
    - volume_group.MODIFY_VOLUME_GROUP
    - volume_group.GET_VOLUME_GROUP
    message: Successfully established connection with sidecar
    state: Connected
kind: List
metadata:
  resourceVersion: ""
  • When a CSIAddonsNode CR is updated
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc delete po csi-rbdplugin-nn92s csi-rbdplugin-provisioner-5cf8cf5d74-p24kn 
pod "csi-rbdplugin-nn92s" deleted
pod "csi-rbdplugin-provisioner-5cf8cf5d74-p24kn" deleted
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get po
NAME                                           READY   STATUS      RESTARTS   AGE
csi-rbdplugin-provisioner-5cf8cf5d74-cbdts     6/6     Running     0          6s
csi-rbdplugin-rqtjb                            3/3     Running     0          5s
rook-ceph-exporter-minikube-7d6cfcd474-kxz67   1/1     Running     0          7d2h
rook-ceph-mds-myfs-a-7758684bd6-4sskv          1/1     Running     0          7d2h
rook-ceph-mds-myfs-b-75cbc8485f-g68r9          1/1     Running     0          7d2h
rook-ceph-mgr-a-548898c75c-mzdzd               1/1     Running     0          7d3h
rook-ceph-mon-a-fb4bf88f7-7tdpm                1/1     Running     0          7d2h
rook-ceph-operator-5f68d9fbc6-2sfb4            1/1     Running     0          7d2h
rook-ceph-osd-0-67d55bb855-6x9ls               1/1     Running     0          7d2h
rook-ceph-osd-prepare-minikube-4zrgf           0/1     Completed   0          7d2h
rook-ceph-tools-68bf47bc65-77qln               1/1     Running     0          7d3h
[🎩︎]mrajanna@li-2cfbef4c-22d9-11b2-a85c-a3e4a93c405f kubernetes-csi-addons $]kuberc get csiaddonsnode -oyaml
apiVersion: v1
items:
- apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    annotations:
      csiaddons.openshift.io/networkfenceclass-names: '["networkfenceclass-sample"]'
    creationTimestamp: "2024-11-12T13:29:12Z"
    finalizers:
    - csiaddons.openshift.io/csiaddonsnode
    generation: 2
    name: minikube-rook-ceph-daemonset-csi-rbdplugin
    namespace: rook-ceph
    resourceVersion: "111289"
    uid: ab604173-dec7-43fa-a3d6-c7e6c5e3e349
  spec:
    driver:
      endpoint: pod://csi-rbdplugin-rqtjb.rook-ceph:9070
      name: rook-ceph.rbd.csi.ceph.com
      nodeID: minikube
  status:
    capabilities:
    - service.NODE_SERVICE
    - reclaim_space.ONLINE
    - encryption_key_rotation.ENCRYPTIONKEYROTATION
    - network_fence.GET_CLIENTS_TO_FENCE
    message: Successfully established connection with sidecar
    networkFenceClientStatus:
    - ClientDetails:
      - cidrs:
        - 10.244.0.1/32
        id: a815fe8e-eabd-4e87-a6e8-78cebfb67d08
      networkFenceClassName: networkfenceclass-sample
    state: Connected
- apiVersion: csiaddons.openshift.io/v1alpha1
  kind: CSIAddonsNode
  metadata:
    creationTimestamp: "2024-11-12T13:29:15Z"
    finalizers:
    - csiaddons.openshift.io/csiaddonsnode
    generation: 2
    name: minikube-rook-ceph-deployment-csi-rbdplugin-provisioner
    namespace: rook-ceph
    resourceVersion: "111292"
    uid: dbd9ed63-8882-4e29-b25a-925c06de858f
  spec:
    driver:
      endpoint: pod://csi-rbdplugin-provisioner-5cf8cf5d74-cbdts.rook-ceph:9070
      name: rook-ceph.rbd.csi.ceph.com
      nodeID: minikube
  status:
    capabilities:
    - service.CONTROLLER_SERVICE
    - reclaim_space.OFFLINE
    - network_fence.NETWORK_FENCE
    - volume_replication.VOLUME_REPLICATION
    - volume_group.VOLUME_GROUP
    - volume_group.DO_NOT_ALLOW_VG_TO_DELETE_VOLUMES
    - volume_group.LIMIT_VOLUME_TO_ONE_VOLUME_GROUP
    - volume_group.MODIFY_VOLUME_GROUP
    - volume_group.GET_VOLUME_GROUP
    message: Successfully established connection with sidecar
    state: Connected
kind: List
metadata:
  resourceVersion: ""

@mergify mergify bot added api Change to the API, requires extra care vendor Pull requests that update vendored dependencies labels Nov 12, 2024
@Madhu-1 Madhu-1 force-pushed the implement-get-client-ip branch from d1f73b2 to 83ae666 Compare November 12, 2024 13:57
docs/networkfenceclass.md (outdated)
kind: CSIAddonsNode
metadata:
  annotations:
    csiaddons.openshift.io/networkfenceclass-0: network-fence-class
Collaborator

Why the -0 suffix?

Member Author

Each annotation is going to have a different suffix, as the keys need to be unique within the annotations.

Collaborator

Is there really a need for such an ugly annotation? Can't whatever is going to fence not check the CSIAddonsNode.spec.drivername and match it with whatever NetworkFenceClasses there are? The creator of a NetworkFence CR will still need to pick a NetworkFenceClass for it, right?

Looping through such annotation is not really less work than looping through the NetworkFenceClasses and finding the right driver. Making sure the annotations are always correct seems more work than useful, which can have additional bugs with severe results (unable to fence).

Member Author

@nixpanic annotations are used as a way to trigger a reconcile. I am trying to avoid a list operation, and there can also be cases where the NFCs are created, deleted, or recreated after the CSIAddonsNode registration is done. This is only one way of triggering the reconcile. The user will still need to look at the driver name and the NFC name in the status field, not the annotation, to get the right client IP CIDR to fence (this can be a documentation improvement I can do). Another option is something like the following, but that can hit length limitations in some worst-case scenarios.

metadata:
  annotations:
    csiaddons.openshift.io/networkfenceclassname: '[{"name":"nfcName1"},{"name":"nfcName2"}]'

Collaborator

I must be missing something, what is the reason a reconcile is needed? When a NetworkFenceClass has a matching drivername, it is expected that the CSIAddonsNode supports that class.

CSI Provisioners and NodePlugins also do not need to have annotations for the different StorageClasses, why is this different?

Member Author

The goal is to advertise the IPs on the CSIAddonsNode and use the NetworkFenceClass to get the cluster details from which to fetch the client IPs. I explained the workflow in the PR description. This is what we discussed earlier (let me know if that's not the case).

Collaborator

Okay, so what I was missing (or forgetting) is that the NetworkFenceClass-Controller is not the owner of the CSIAddonsNode CR. Therefore, it should not update the status of the CSIAddonsNode with the IP address(es) that were detected by the NetworkFenceClass trigger.

Instead, the NetworkFenceClass-Controller adds an annotation, so that the CSIAddonsNode-Controller is informed to call the GetFenceClients() CSI-Addons procedure of the CSI-driver and update the CSIAddonsNode/status after that.

@Madhu-1 Madhu-1 force-pushed the implement-get-client-ip branch 3 times, most recently from 33f75cc to c80be56 Compare November 12, 2024 18:50
@Madhu-1 Madhu-1 requested a review from nixpanic November 12, 2024 18:53
Comment on lines 157 to 170
	nfcClientDetails := make([]csiaddonsv1alpha1.ClientDetail, 0)
	for _, client := range clients.Clients {
		logger.Info("Adding client to status", "client", client.Id, "cidrs", client.Cidrs)
		nfcClientDetails = append(nfcClientDetails, csiaddonsv1alpha1.ClientDetail{
			Id:    client.Id,
			Cidrs: client.Cidrs,
		})
	}
	nfsc = append(nfsc,
		csiaddonsv1alpha1.NetworkFenceClientStatus{
			NetworkFenceClassName: nfc.Name,
			ClientDetails:         nfcClientDetails,
		},
	)
Member

Let's use a helper function here?

Suggested change
-	nfcClientDetails := make([]csiaddonsv1alpha1.ClientDetail, 0)
-	for _, client := range clients.Clients {
-		logger.Info("Adding client to status", "client", client.Id, "cidrs", client.Cidrs)
-		nfcClientDetails = append(nfcClientDetails, csiaddonsv1alpha1.ClientDetail{
-			Id:    client.Id,
-			Cidrs: client.Cidrs,
-		})
-	}
-	nfsc = append(nfsc,
-		csiaddonsv1alpha1.NetworkFenceClientStatus{
-			NetworkFenceClassName: nfc.Name,
-			ClientDetails:         nfcClientDetails,
-		},
-	)
+	nfsc = append(nfsc,
+		extractNFClientStatus(&logger, nfc.Name, clients.Clients),
+	)

Member Author

IMO there is no gain in doing it, because it's not used in multiple places, nor is it a complex check; it's just a for loop.

Member

This part has a loop and is itself inside an if condition, which in turn is inside a loop.
It looks messy.

Collaborator

I agree with Rakshith. The function is already very large, and with this change it spans > 120 lines. Please move it out into a helper to make it a little more modular and easier to understand/maintain in the future.

Collaborator

getNetworkFenceClientStatus() or something would be nice.

Member Author

Done
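
For illustration, a helper along the lines requested in this thread might look roughly like the sketch below. The types are local stand-ins that mirror only the field names visible in the review snippet (`ClientDetail`, `NetworkFenceClientStatus`), the `client` type is a hypothetical stand-in for the driver response entries, and the merged helper may well differ in name, signature, and logging.

```go
package main

import "fmt"

// ClientDetail and NetworkFenceClientStatus are local stand-ins for the
// csiaddonsv1alpha1 types named in the review snippet.
type ClientDetail struct {
	Id    string
	Cidrs []string
}

type NetworkFenceClientStatus struct {
	NetworkFenceClassName string
	ClientDetails         []ClientDetail
}

// client is a hypothetical stand-in for one entry of the driver response.
type client struct {
	Id    string
	Cidrs []string
}

// getNetworkFenceClientStatus converts the clients returned for one
// NetworkFenceClass into the status entry for that class.
func getNetworkFenceClientStatus(className string, clients []client) NetworkFenceClientStatus {
	details := make([]ClientDetail, 0, len(clients))
	for _, c := range clients {
		details = append(details, ClientDetail{Id: c.Id, Cidrs: c.Cidrs})
	}
	return NetworkFenceClientStatus{
		NetworkFenceClassName: className,
		ClientDetails:         details,
	}
}

func main() {
	status := getNetworkFenceClientStatus("networkfenceclass-sample", []client{
		{Id: "a815fe8e-eabd-4e87-a6e8-78cebfb67d08", Cidrs: []string{"10.244.0.1/32"}},
	})
	fmt.Printf("%+v\n", status)
}
```

With the conversion pulled out, the calling loop only needs a single append per NetworkFenceClass.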

updating the csiaddons spec dependency
to the latest main.

Signed-off-by: Madhu Rajanna <[email protected]>
Added `NetworkFenceClassReconciler` to manage
the reconciliation of `NetworkFenceClass` resources.
Fetches `NetworkFenceClass` and lists
associated `CSIAddonsNode` objects based on
provisioner. Adds or removes labels on csiaddonsnodes
with the `NetworkFenceClass` name based on node
capabilities and deletion state.
Introduced helper functions for label key
retrieval and label count management.
Set up field indexer for `CSIAddonsNode` to
efficiently watch nodes by provisioner/driver name.

Signed-off-by: Madhu Rajanna <[email protected]>
adding a sample yaml for the
networkfenceclass CR.

Signed-off-by: Madhu Rajanna <[email protected]>
adding new fields to the csiaddonsnode
status to represent the networkfenceclass
and its client details.

Signed-off-by: Madhu Rajanna <[email protected]>
generated internal proto for
GetFenceClients RPC.

Signed-off-by: Madhu Rajanna <[email protected]>
added GetFenceClients RPC to the sidecar
service to make RPC call to the csi driver

Signed-off-by: Madhu Rajanna <[email protected]>
@Madhu-1 Madhu-1 force-pushed the implement-get-client-ip branch from c80be56 to c3369d0 Compare November 13, 2024 11:06
@Madhu-1 Madhu-1 requested a review from Rakshith-R November 13, 2024 11:09
when a csiaddonsnode is registered, list the NFC CRs
matching the provisioner name, send a request
to get the client address from the csi driver, and
update the status with the client details.

Signed-off-by: Madhu Rajanna <[email protected]>
run tests with verbose flag to
get more detailed output.

Signed-off-by: Madhu Rajanna <[email protected]>
adding documentation for the network
fence class.

Signed-off-by: Madhu Rajanna <[email protected]>
if we deepcopy the metadata during
the update operations, all the annotations
get removed.

Signed-off-by: Madhu Rajanna <[email protected]>
@Madhu-1 Madhu-1 force-pushed the implement-get-client-ip branch from c3369d0 to f300c30 Compare November 13, 2024 11:11
Madhu-1 (Member Author) commented Nov 13, 2024

@nixpanic PTAL

@mergify mergify bot merged commit 4d025d3 into csi-addons:main Nov 13, 2024
15 checks passed
Labels
api Change to the API, requires extra care vendor Pull requests that update vendored dependencies

3 participants