NFS export policy is missing one k8s node's IP #965
Comments
Hi, same issue here. Trident version: v24.10. A few IPs are missing from the export policy while using |
We are almost certain that the problem is related to 24.10: our analysis shows the first NFS problem occurred the day after the upgrade from 24.02. |
Same, we were using 24.06.1 before and never hit this issue while removing and adding a lot of nodes in our clusters. |
In our environment the clusters (mostly) have a fixed number of nodes, so it is even stranger that the export policy suddenly does not include all of them. So this is not a problem of adding/removing nodes in dynamic clusters. |
Just to see if it can be broken down more specifically: Are you using driver name |
|
Same, ontap-nas |
How long are you waiting after the node is added to check if the new node IP is included in the export policy? I have a few more questions to better reproduce the issue: |
We hit this bug when one of the cluster nodes was unable to mount an NFS PVC, failing with this error:
71s Warning FailedMount pod/grafana-5b7b4f4dc7-9r4zf MountVolume.SetUp failed for volume "pvc-4c0caf5e-ac71-4dc2-a9d1-329913b244a6" : rpc error: code = Internal desc = error mounting NFS volume x.x.x.x/trident_pvc_4c0caf5e_ac71_4dc2_a9d1_329913b244a6 on mountpoint /opt/rke/var/lib/kubelet/pods/cd1b1ad1-f380-4fb2-9e9b-eff4806121b4/volumes/kubernetes.io~csi/pvc-4c0caf5e-ac71-4dc2-a9d1-329913b244a6/mount: exit status 32
After investigation we discovered that the NFS export policy on the SVM was missing this node's IP (the policy had 16 entries, while the cluster consists of 17 nodes).
Neither trident-node nor trident-controller produced any useful error messages about publishing the volume to the node.
The SVM did not report any problem either.
The storage team resolved the issue manually by adding the missing node to the export policy. After that, the pod was immediately able to mount the PVC.
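For anyone who wants to check for the same gap, here is a minimal sketch of the comparison we did by hand (16 policy entries vs. 17 nodes). It assumes you have already collected the node IPs (e.g. from `kubectl get nodes -o wide`) and the client-match values of the export policy rules (e.g. from `vserver export-policy rule show` on the ONTAP side) into plain lists. The function name and the sample addresses are made up for illustration, and the sketch assumes every client match is a single IP or a CIDR.

```python
import ipaddress


def missing_from_export_policy(node_ips, client_matches):
    """Return the node IPs that no export policy client match covers.

    Assumption: every client match is a single IP or a CIDR block.
    """
    # A bare IP parses as a /32 (or /128) network, so one code path handles both.
    networks = [ipaddress.ip_network(m, strict=False) for m in client_matches]
    missing = []
    for ip in node_ips:
        addr = ipaddress.ip_address(ip)
        if not any(addr in net for net in networks):
            missing.append(ip)
    return missing


if __name__ == "__main__":
    # Hypothetical data: three nodes, but only two rules in the policy.
    nodes = ["10.0.0.11", "10.0.0.12", "10.0.0.13"]
    rules = ["10.0.0.11", "10.0.0.13"]
    print(missing_from_export_policy(nodes, rules))
```

An empty result means every node is covered; any address returned is a candidate for the mount failure shown above.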
Environment
The TridentBackendConfig (kind: tridentbackendconfigs.trident.netapp.io) for the NFS share sets both parameters: autoExportPolicy: true and autoExportCIDRs, the latter set to the /24 subnet where the k8s storage interfaces are.
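One thing that may be worth ruling out with this setup is a node that reports no address inside the configured CIDR, since such a node would have nothing eligible for the export policy. The sketch below is only a sanity check under that assumption, not a statement about how Trident selects addresses internally. The function name, node names, and addresses are hypothetical.

```python
import ipaddress


def nodes_outside_cidrs(node_addresses, cidrs):
    """Return names of nodes that have no address inside any of the CIDRs.

    node_addresses: dict mapping node name -> list of IP strings.
    cidrs: list of CIDR strings, e.g. the autoExportCIDRs values.
    """
    networks = [ipaddress.ip_network(c, strict=False) for c in cidrs]
    outside = []
    for node, ips in node_addresses.items():
        addrs = [ipaddress.ip_address(ip) for ip in ips]
        if not any(a in n for a in addrs for n in networks):
            outside.append(node)
    return outside


if __name__ == "__main__":
    # Hypothetical nodes: worker-2 has no address in the storage subnet.
    nodes = {
        "worker-1": ["10.1.1.5", "192.168.10.5"],
        "worker-2": ["10.1.1.6"],
    }
    print(nodes_outside_cidrs(nodes, ["192.168.10.0/24"]))
```

If this returns an empty list, every node has at least one address in the storage subnet and the CIDR filter is unlikely to explain the missing entry.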
Expected behavior
The export policy should contain all k8s worker nodes.
Any advice? What should we do if the problem occurs again? Are there any troubleshooting commands we can use?