Expected Behavior
Expired tokens are refreshed within an acceptable timeframe, e.g. a few minutes.
Current Behavior
Currently, the token refresher sleeps for (exp - now)/4, i.e. up to 6 hours. If the token expires while the refresher is asleep, it can only be refreshed by restarting all pods; otherwise cluster networking is down until the sleep completes. These values are hardcoded, see
calico/node/pkg/cni/token_watch.go
Line 24 in 3bca322
Technically, other lifetimes could be supplied via the serviceaccount/token file, but that file is only used if the token request API is unavailable, see
calico/node/pkg/cni/token_watch.go
Line 104 in 3bca322
logrus.WithError(err).Debug("Unable to create token for CNI kubeconfig as token request api is not supported, falling back to local service account token")
Possible Solution
This could be fixed by either 1. making the hardcoded values configurable, with a separate configuration option or by (optionally) preferring the configuration from the serviceaccount/token file, or 2. refreshing immediately after a failed request.
Steps to Reproduce (for bugs)
1. Get a cluster with calico installed
2. Stop the NTP service
3. Shift time forward by more than 24hrs
4. Observe that no pods come up or down and the calico plugin connection is unauthorized
From the code we can deduce that calico is sleeping and will refresh in roughly 6 hours.
Context
This issue was noticed in testing, where it causes the tests to take 6 hours to complete. The same behavior can happen in production if the token for some reason expires earlier than expected.
Your Environment
For the reproduction I set up a fresh VM with Ubuntu 24, installed chrony, and followed https://docs.tigera.io/calico/latest/getting-started/kubernetes/kind to create a Kind cluster with calico.