CSI driver debug tips

case#1: disk create/delete/attach/detach/snapshot/restore failed

  • locate csi driver pod
kubectl get po -o wide -n kube-system | grep csi-azuredisk-controller
NAME                                           READY   STATUS    RESTARTS   AGE     IP             NODE
csi-azuredisk-controller-56bfddd689-dh5tk      5/5     Running   0          35s     10.240.0.19    k8s-agentpool-22533604-0
csi-azuredisk-controller-56bfddd689-sl4ll      5/5     Running   0          35s     10.240.0.23    k8s-agentpool-22533604-1
  • get csi driver logs
kubectl describe pod csi-azuredisk-controller-56bfddd689-dh5tk -n kube-system > csi-azuredisk-controller-description.log
kubectl logs csi-azuredisk-controller-56bfddd689-dh5tk -c azuredisk -n kube-system > csi-azuredisk-controller.log

Note: there can be multiple controller pods. If the logs from one pod are not helpful, get the logs from the other controller pods.

  • get csi driver logs using a script (only works when the driver controller has a single replica)
diskControllerName=$(kubectl get po -n kube-system | grep csi-azuredisk-controller | cut -d ' ' -f1)
kubectl describe pod $diskControllerName -n kube-system > $diskControllerName-description.log
kubectl logs $diskControllerName -n kube-system -c azuredisk > $diskControllerName.log
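
Since the script above only handles a single controller replica, a loop over all matching pods may be more convenient. The following is a minimal sketch, not part of the driver tooling: `collect_controller_logs` is a hypothetical helper name, and it assumes every controller pod name contains `csi-azuredisk-controller`, as in the listing above.

```shell
# Hypothetical helper: save the description and azuredisk container logs
# of every controller pod in the given namespace (default: kube-system).
collect_controller_logs() {
  ns="${1:-kube-system}"
  # "-o name" prints one "pod/<name>" entry per line
  kubectl get pods -n "$ns" -o name | grep csi-azuredisk-controller |
  while read -r pod; do
    name="${pod#pod/}"   # strip the "pod/" prefix for the file names
    kubectl describe "$pod" -n "$ns" > "$name-description.log"
    kubectl logs "$pod" -n "$ns" -c azuredisk > "$name.log"
    echo "collected $name"
  done
}
```

Running `collect_controller_logs` in an empty directory should leave two files per controller pod, which makes it easier to compare the logs of all replicas.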

case#2: volume mount/unmount failed

  • locate csi driver pod that does the actual volume mount/unmount
kubectl get po -o wide -n kube-system | grep csi-azuredisk-node
NAME                                           READY   STATUS    RESTARTS   AGE     IP             NODE
csi-azuredisk-node-cvgbs                       3/3     Running   0          7m4s    10.240.0.35    k8s-agentpool-22533604-1
csi-azuredisk-node-dr4s4                       3/3     Running   0          7m4s    10.240.0.4     k8s-agentpool-22533604-0
  • get csi driver logs
kubectl logs csi-azuredisk-node-cvgbs -c azuredisk -n kube-system > csi-azuredisk-node.log
  • check disk mount inside driver
kubectl exec -it csi-azuredisk-node-xxxxx -n kube-system -c azuredisk -- mount | grep sd
/dev/sdc on /var/lib/kubelet/plugins/kubernetes.io/csi/pv/pvc-e4c14592-2a79-423e-846f-4b25fe393d6c/globalmount type ext4 (rw,relatime)
/dev/sdc on /var/lib/kubelet/pods/75351f5a-b2ce-4fab-bb90-250aaa010298/volumes/kubernetes.io~csi/pvc-e4c14592-2a79-423e-846f-4b25fe393d6c/mount type ext4 (rw,relatime)
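  • check for domain name resolution issues inside the driver container (run in a shell inside the container, replacing apiserver-fqdn with the API server FQDN)
apt update && apt install curl -y
curl https://apiserver-fqdn -k -v 2>&1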
  • get cloud config file (azure.json) on Linux node
kubectl exec -it csi-azuredisk-node-dx94w -n kube-system -c azuredisk -- cat /etc/kubernetes/azure.json
  • get cloud config file (azure.json) on Windows node
kubectl exec -it csi-azuredisk-node-win-xxxxx -n kube-system -c azuredisk -- cmd
type c:\k\azure.json
  • get Windows csi-proxy logs inside driver
kubectl exec -it csi-azuredisk-node-win-xxxxx -n kube-system -c azuredisk -- cmd
type c:\k\csi-proxy.err.log
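
The steps above start from the node driver pod that runs on the same node as the failing workload. A minimal sketch of that lookup follows. The helper names `find_node_driver_pod` and `collect_node_driver_logs` are hypothetical, and the sketch assumes the node driver pods contain `csi-azuredisk-node` in their names, as in the listing above. It relies on the `spec.nodeName` field selector that `kubectl get pods` supports.

```shell
# Hypothetical helper: print the csi-azuredisk-node pod ("pod/<name>")
# scheduled on the given node.
find_node_driver_pod() {
  node="$1"
  ns="${2:-kube-system}"
  kubectl get pods -n "$ns" -o name --field-selector "spec.nodeName=$node" |
    grep csi-azuredisk-node | head -n 1
}

# Hypothetical helper: save the azuredisk container logs of that pod
# to <pod-name>.log and print the pod name.
collect_node_driver_logs() {
  pod="$(find_node_driver_pod "$1" "${2:-kube-system}")"
  if [ -z "$pod" ]; then
    echo "no csi-azuredisk-node pod found on node $1" >&2
    return 1
  fi
  kubectl logs "$pod" -n "${2:-kube-system}" -c azuredisk > "${pod#pod/}.log"
  echo "${pod#pod/}"
}
```

For example, `collect_node_driver_logs k8s-agentpool-22533604-1` would write the logs of the driver pod on that node, where the node name comes from `kubectl get po -o wide` for the affected application pod.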

Update driver version quickly by editing driver deployment directly

  • update controller deployment
kubectl edit deployment csi-azuredisk-controller -n kube-system
  • update node daemonset
kubectl edit ds csi-azuredisk-node -n kube-system

Change the image settings in the deployment or daemonset config, e.g.

        image: mcr.microsoft.com/k8s/csi/azuredisk-csi:v1.8.0
        imagePullPolicy: Always
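
As a non-interactive alternative to `kubectl edit`, the image can probably also be changed with `kubectl set image`. The sketch below is an assumption-laden example rather than the documented procedure: `update_driver_image` is a hypothetical helper, and it assumes the driver container is named `azuredisk` in both workloads, as the `-c azuredisk` flags above suggest. It does not change `imagePullPolicy`, which would still need to be edited by hand.

```shell
# Hypothetical helper: point the azuredisk container of the controller
# deployment and the node daemonset at a new image, then wait for rollout.
update_driver_image() {
  image="$1"
  ns="${2:-kube-system}"
  kubectl set image deployment/csi-azuredisk-controller "azuredisk=$image" -n "$ns" &&
  kubectl set image daemonset/csi-azuredisk-node "azuredisk=$image" -n "$ns" &&
  kubectl rollout status deployment/csi-azuredisk-controller -n "$ns" &&
  kubectl rollout status daemonset/csi-azuredisk-node -n "$ns"
}
```

For example: `update_driver_image mcr.microsoft.com/k8s/csi/azuredisk-csi:v1.8.0`.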

Links