NFS export policy is missing one k8s node's IP #965

Open
ptrkmkslv opened this issue Jan 21, 2025 · 8 comments

@ptrkmkslv

We hit a bug where one of the cluster nodes is unable to mount an NFS PVC and fails with this error:



71s Warning FailedMount pod/grafana-5b7b4f4dc7-9r4zf MountVolume.SetUp failed for volume "pvc-4c0caf5e-ac71-4dc2-a9d1-329913b244a6" : rpc error: code = Internal desc = error mounting NFS volume x.x.x.x/trident_pvc_4c0caf5e_ac71_4dc2_a9d1_329913b244a6 on mountpoint /opt/rke/var/lib/kubelet/pods/cd1b1ad1-f380-4fb2-9e9b-eff4806121b4/volumes/kubernetes.io~csi/pvc-4c0caf5e-ac71-4dc2-a9d1-329913b244a6/mount: exit status 32

After investigating, we discovered that the NFS export policy on the SVM was missing this node's IP (the policy had 16 entries, while the cluster consists of 17 nodes).

Neither trident-node nor trident-controller produced any useful error messages about publishing the volume to the node.
The SVM did not report any problem either.

The issue was resolved manually by the storage team: the missing node was added to the export policy by hand, after which the pod was immediately able to mount the PVC.

Environment
The tridentbackendconfigs.trident.netapp.io resource for the NFS share sets both autoExportPolicy: true and autoExportCIDRs, the latter with the /24 subnet that contains the k8s storage interfaces.

  • Trident version: v24.10
  • Kubernetes version: v1.30.6
  • Container runtime: docker://26.1.0
  • Kubernetes orchestrator: Rancher (custom cluster)
  • OS: Flatcar Container Linux by Kinvolk 4081.2.0
  • NetApp backend types: ONTAP AFF (ONTAP 9.12.1P12)
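
As an illustration of the autoExportCIDRs setting described above: the setting is documented as a list of CIDRs against which node IPs are filtered when autoExportPolicy is enabled. The sketch below shows that kind of filter in isolation. It is not Trident's code; the function name and the addresses are made up for the example.

```python
import ipaddress


def filter_node_ips(node_ips, export_cidrs):
    """Keep only the node IPs that fall inside at least one of the CIDRs.

    Hypothetical helper for illustration; not part of Trident.
    """
    networks = [ipaddress.ip_network(cidr, strict=False) for cidr in export_cidrs]
    kept = []
    for ip in node_ips:
        addr = ipaddress.ip_address(ip)
        # An IPv4 address is never "in" an IPv6 network (and vice versa),
        # so mixed lists are handled without extra checks.
        if any(addr in net for net in networks):
            kept.append(ip)
    return kept


# Example with made-up addresses: only the storage-subnet IP survives.
print(filter_node_ips(["10.0.1.5", "10.0.2.7", "192.168.0.9"], ["10.0.1.0/24"]))
```

A node whose storage interface address falls outside the configured CIDRs would be filtered out by design, so checking that first can rule out a configuration cause before suspecting a bug.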

Expected behavior

The export policy contains a complete list of all k8s worker nodes.

Any advice? What should we do if the problem occurs again? Are there any troubleshooting commands we can use?
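
One way to spot the condition described above (16 policy entries for 17 nodes) is to diff the node IPs against the export-policy client matches. The sketch below is a minimal version of that comparison. It assumes the policy holds one plain IP per rule, and that you have already collected both lists yourself, for example from `kubectl get nodes -o wide` and from the ONTAP `vserver export-policy rule show` output. Note that a dedicated storage interface may not appear in the kubectl node listing. The function name and sample data are hypothetical.

```python
def nodes_missing_from_policy(node_ips, policy_clients):
    """Return {node: [ips]} for every node with no IP in the export policy.

    node_ips:       dict mapping node name -> list of that node's IPs
    policy_clients: iterable of client-match strings from the export policy
                    (assumed here to be plain IP addresses, one per rule)
    """
    clients = set(policy_clients)
    missing = {}
    for node, ips in node_ips.items():
        if not clients.intersection(ips):
            missing[node] = list(ips)
    return missing


# Made-up sample data: worker-2 has no matching rule.
nodes = {
    "worker-1": ["10.0.1.5"],
    "worker-2": ["10.0.1.6"],
    "worker-3": ["10.0.1.7"],
}
policy = ["10.0.1.5", "10.0.1.7"]
print(nodes_missing_from_policy(nodes, policy))  # {'worker-2': ['10.0.1.6']}
```

Running a check like this periodically would surface a missing node before a pod is scheduled there and fails to mount.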

@ptrkmkslv ptrkmkslv added the bug label Jan 21, 2025
@enneitex

Hi, same issue here:

Trident version: v24.10
Kubernetes version: v1.29.10
Container runtime: containerd
Kubernetes orchestrator: Kubeadm
OS: RHEL9
NetApp backend types: ONTAP NAS

A few IPs are missing from the export policy while using autoExportPolicy: true and the default autoExportCIDRs.
Even after deleting a node and adding it back to the Kubernetes cluster, its IP is still missing.

@ptrkmkslv
Author

We are almost certain that the problem is related to 24.10: after analysis, the first NFS-related problem occurred the day after the upgrade (from version 24.02).

@enneitex

Same here: we were using 24.06.1 before and never hit this issue while removing and adding a lot of nodes in our clusters.

@ptrkmkslv
Author

In our environment, clusters (mostly) have a fixed number of nodes, so it is even stranger that the export policy suddenly does not include all of them. This means it is not a problem of adding or removing nodes in dynamic clusters.

@wonderland

Just to see if it can be broken down more specifically: Are you using driver name ontap-nas or ontap-nas-economy?

@ptrkmkslv
Author

storageDriverName: ontap-nas

@enneitex

Same, ontap-nas

@torirevilla
Contributor

How long are you waiting after the node is added before checking whether the new node IP is included in the export policy?
In trying to reproduce this issue, I have found that the node IP is added to the export policy during the reconcileNodeAccess loop, which completes soon after a node is added but not immediately.

I have a few more questions to help reproduce the issue:
When are you adding the node: before or after upgrading to 24.10?
How are you performing the upgrade: with the operator or the tridentctl CLI?
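
Since the node IP reportedly lands in the export policy shortly after, not at, node registration, a check that polls for a while gives a fairer answer than a single lookup. Below is a minimal polling sketch. The `fetch_policy_clients` callable is a placeholder you would supply yourself (it should return the current client-match list), and the timeout and interval defaults are arbitrary assumptions, not values taken from Trident's reconcile loop.

```python
import time


def wait_for_export_rule(node_ip, fetch_policy_clients,
                         timeout_s=300.0, interval_s=10.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll until node_ip shows up in the export policy or the timeout passes.

    fetch_policy_clients: hypothetical callable returning the current list of
                          client-match strings for the export policy.
    Returns True if the IP appeared, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if node_ip in fetch_policy_clients():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)


# Example with a fake fetcher: the IP appears on the third poll.
calls = {"n": 0}

def fake_fetch():
    calls["n"] += 1
    return ["10.0.1.5"] if calls["n"] < 3 else ["10.0.1.5", "10.0.1.6"]

print(wait_for_export_rule("10.0.1.6", fake_fetch, timeout_s=5,
                           interval_s=0, sleep=lambda s: None))
```

If the function returns False after a generous timeout, the delay explanation is unlikely and the missing rule is worth reporting along with the trident-controller logs from that window.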


4 participants