Cluster down kine.sock connection refused #4523

Open
JasonPulse opened this issue May 16, 2024 · 3 comments

Comments

@JasonPulse

Summary

Came home to find my 2-node cluster down, with both nodes reporting the error below for the kubelite service:
May 16 06:11:30 ubuntu microk8s.daemon-kubelite[6065]: W0516 06:11:30.766110 6065 logging.go:59] [core] [Channel #7 SubChannel #9] grpc: addrConn.createTransport failed to connect to {Addr: "unix:///var/snap/microk8s/6787/var/kubernetes/backend/kine.sock:12379", ServerName: "kine.sock:12379", }. Err: connection error: desc = "transport: Error while dialing: dial unix /var/snap/microk8s/6787/var/kubernetes/backend/kine.sock:12379: connect: connection refused"

What Should Happen Instead?

The cluster should be running.

Reproduction Steps

Unknown at this time; suspected to be related to a snap refresh.

Introspection Report

The system is unstable (20-30 seconds to echo back typing) while microk8s is trying to start; the attached tar was taken with microk8s stopped. If requested, I can provide files from /var/snap/microk8s/current/var/kubernetes/backend, as there is no sensitive data running on the cluster.

I would like to know why this happened. We have several clusters running the same workloads, and they are already failing with a separate nf_conntrack issue, so having yet another issue to go back and deal with will not be fun.

inspection-report-20240516_151937.tar.gz

@JasonPulse
Author

After about 7 hours I was able to get a tar from a running microk8s:
inspection-report-20240516_223539.tar.gz

@mikhatanu

Just encountered this error too; need help.

@JasonPulse
Author

Yeah, no one ever answered or helped here, and I've had this issue many times since. In this case it was just faster to blow away the cluster and rebuild than to recover it. I have noticed this is usually related to how fast the host system can write to its disk; in this case the main node was a Pi with a memory card as its main drive, and that was just not fast enough. Another time I hit this was when the node ran out of disk space, or came very close to it.
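
If you want to rule out those two things on a node, a quick check along these lines should do it (the paths and the 50 MB test size are just an example, not anything microk8s itself requires):

# Free space on the partitions backing the microk8s data directories
df -h /var/snap/microk8s/common /var/snap/microk8s/current

# Rough write-speed check against the same disk (writes and removes ~50 MB);
# a slow SD card on a Pi will show very low throughput here
sudo dd if=/dev/zero of=/var/snap/microk8s/common/dd-test.bin bs=1M count=50 oflag=direct
sudo rm /var/snap/microk8s/common/dd-test.bin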

You can edit /var/snap/microk8s/current/args/kubelet and add the image GC threshold flags to change the image pruning to a more reasonable tolerance than the default 85%:

--image-gc-high-threshold=60
--image-gc-low-threshold=30
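
A rough sketch of applying that on a standard snap install (the values above are just a starting point; the restart step is one way to make kubelite pick up the new kubelet args, a node reboot would also do it):

# Append the image GC thresholds to the kubelet arguments file
echo '--image-gc-high-threshold=60' | sudo tee -a /var/snap/microk8s/current/args/kubelet
echo '--image-gc-low-threshold=30' | sudo tee -a /var/snap/microk8s/current/args/kubelet

# Restart microk8s so kubelite restarts kubelet with the new flags
microk8s stop
microk8s start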
