
Small window where nodes are untainted? #24

Open
jim-barber-he opened this issue Jun 30, 2020 · 8 comments

Comments

@jim-barber-he

Thank you very much for Nidhogg!

I've just implemented it for the first time, but I'm seeing some behaviour that I'm not sure is correct, so I'd like to run it by you and see if there is something I can do to fix it up.

I have the following YAML configuration:

    daemonsets:
    - name: fluentd-fluentd-elasticsearch
      namespace: fluentd
    - name: kiam-agent
      namespace: kiam
    - name: node-local-dns
      namespace: kube-system
    - name: weave-net
      namespace: kube-system
    nodeSelector:
      node-role.kubernetes.io/node: ""

With the above, I'm waiting on 4 critical daemonsets.

I have been looking at the Nidhogg logs as nodes are added to the cluster by the cluster-autoscaler.
What I'm finding is that initially taints are added by Nidhogg, but not for all of the 4 daemon sets.
Often it is 3 of them; those then clear and firstTimeReady gets set, and then a few seconds later the missing 4th taint is added along with the other 3, and they all proceed to be removed again as things become ready.
This appears to give a 2 or 3 second window where pods may be able to schedule onto the node even though it isn't quite ready yet.

An example of logs showing this follows:

{"level":"info","ts":1593498895.9255075,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kiam.kiam-agent","nidhogg.uswitch.com/kube-system.node-local-dns","nidhogg.uswitch.com/kube-system.weave-net"],"taints removed":[],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593498895.9452868,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.weave-net"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593498896.1045887,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.node-local-dns"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593498896.1525385,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kiam.kiam-agent"],"taintLess":true,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498899.199731,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch"],"taints removed":[],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498899.5885453,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kube-system.weave-net"],"taints removed":[],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498900.7854457,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kube-system.node-local-dns"],"taints removed":[],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498901.1978955,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kiam.kiam-agent"],"taints removed":[],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498916.6390643,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.node-local-dns"],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498927.831887,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kiam.kiam-agent"],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498935.2295878,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.weave-net"],"taintLess":false,"firstTimeReady":"2020-06-30T06:34:56Z"}
{"level":"info","ts":1593498949.044196,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-108-84.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch"],"taintLess":true,"firstTimeReady":"2020-06-30T06:34:56Z"}

The first line has added the taints for kiam, node-local-dns, and weave-net, but there is no taint added for fluentd yet.
The next 3 lines show those 3 taints being removed one by one, with the final one marking the node as taintLess and setting firstTimeReady.
Then the next line (roughly 3 seconds later) adds the fluentd taint that was previously missing.
It is these 3 seconds that I'm concerned about.
The next 3 lines re-add the taints that were previously removed, and then all the taints are removed again as the node becomes ready.

It's not always fluentd that is left until later; sometimes it is node-local-dns instead.
And it's not always 3 taints that are initially added either; I've also seen just 2 of the taints added in the first line.
I haven't had it installed for long, but if it is useful I can collect more details and pass them on.

@jim-barber-he
Author

Here's one that has essentially done the right thing and added all 4 taints at once, but it flip-flopped around a bit, removing and re-adding taints as it went, which is a little strange too.
I'm showing this in case the flip-flopping is indicative of the sort of problem I'm seeing.

{"level":"info","ts":1593496663.4155443,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch","nidhogg.uswitch.com/kiam.kiam-agent","nidhogg.uswitch.com/kube-system.node-local-dns","nidhogg.uswitch.com/kube-system.weave-net"],"taints removed":[],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496663.5498528,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.node-local-dns"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496663.6244056,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.weave-net"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496665.053052,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kube-system.node-local-dns"],"taints removed":[],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496665.452104,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kube-system.weave-net"],"taints removed":[],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496672.177263,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kiam.kiam-agent"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496672.2539434,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/kiam.kiam-agent"],"taints removed":["nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496674.8512542,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":["nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch"],"taints removed":[],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496692.9860325,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.node-local-dns"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496695.1362882,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kiam.kiam-agent"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496696.2781427,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/kube-system.weave-net"],"taintLess":false,"firstTimeReady":""}
{"level":"info","ts":1593496710.2563887,"logger":"nidhogg","msg":"Updating Node taints","instance":"ip-10-8-57-33.ap-southeast-2.compute.internal","taints added":[],"taints removed":["nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch"],"taintLess":true,"firstTimeReady":"2020-06-30T05:58:30Z"}

@Joseph-Irving
Contributor

What is the status of your pods when this is happening? If your pods are restarting or alternating between passing and failing readiness, then this is expected behaviour.

@jim-barber-he
Author

Awesome. Thanks for your fast response.

Ah yes; pod flapping would very likely be the cause of some of the taints being added and removed.
I just looked at another node: weave-net restarted once and kiam restarted twice while the node came up.

What about the first set of logs, where the fluentd taint wasn't added until after the others had been added and removed, and the node became untainted for about 3 seconds?
Do you have a theory on the cause of that one?

@Joseph-Irving
Contributor

Could it be possible that the fluentd pod was already ready by the time that the taints were added? There is a delay between your node coming up and nidhogg applying the taint, so theoretically if the fluentd pod started up very quickly and passed readiness then it wouldn't have its taint added.

@jim-barber-he
Author

It's just strange because later the taint is added for the fluentd pod. I'll have to do more investigation to see if its readiness check was flapping.

It seems like Nidhogg could benefit from a configuration setting saying that a daemonset needs to pass X successful readiness checks in a row before its taint is removed, to deal with services that may flap as nodes are coming up.
That could eliminate the small few-second windows I have seen where all taints are removed before some are added again.
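
Something like this, say (the readinessThreshold key is purely hypothetical; nidhogg has no such setting today, it's only to illustrate the idea):

    daemonsets:
    - name: fluentd-fluentd-elasticsearch
      namespace: fluentd
    nodeSelector:
      node-role.kubernetes.io/node: ""
    # hypothetical setting, not implemented in nidhogg:
    # only remove a taint after this many consecutive "ready" observations of the pod
    readinessThreshold: 3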

@Joseph-Irving
Contributor

To somewhat play devil's advocate, it's probably worth making sure the readiness check is working as intended, as having a very flaky check in Kubernetes isn't ideal; a number of decisions may be made based on that info.

Nidhogg checks the taints every time there's an update, e.g. a change of state in one of the pods, so if we were to implement the multiple-checks option, it's worth noting that it could pass the threshold of checks very rapidly. E.g.:

  • 1st pass: fluentd is ready, but its taint is added/kept as it's below the threshold
  • 2nd: the kiam pod starts, fluentd is still ready
  • 3rd: node-local-dns becomes ready; fluentd has now passed the threshold of 3, so its taint is removed
  • 4th: fluentd fails readiness, so the fluentd taint is added again

So you could still have some flakiness if your readiness check is flaky; you could make the threshold very high to mitigate that, I guess.
You'd also have to make sure that nidhogg knows to keep syncing your node until everything passes the threshold; it normally only recalculates when one of the node's pods has an update.
Let's say you've got 5 pods and they're all now in a ready state, but they haven't passed the threshold. The taints will remain in place, but nothing is happening on the node, so there's nothing to trigger a new sync. We'd need something to ensure that it periodically syncs the status of the node until all the taints have been removed.

@gordonbondon

gordonbondon commented Aug 5, 2020

I think this can be somewhat mitigated by starting new nodes with the taints already in place via the kubelet's --register-with-taints flag. So if you run the kubelet with --register-with-taints=nidhogg.uswitch.com/kube-system.kiam=:NoSchedule, the node should join with the taint already set, and nidhogg should be able to remove it later. This way there should be no gap where the node is untainted before nidhogg kicks in.
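
For the four daemonsets in the config above, that would look something like the sketch below. If the kubelet is driven by a KubeletConfiguration file instead of flags, I believe the equivalent is the registerWithTaints field (this assumes nidhogg's default NoSchedule effect; double-check the taint keys against what nidhogg actually applies on your nodes):

    # kubelet config (kubelet.config.k8s.io/v1beta1) equivalent of --register-with-taints
    apiVersion: kubelet.config.k8s.io/v1beta1
    kind: KubeletConfiguration
    registerWithTaints:
    - key: nidhogg.uswitch.com/fluentd.fluentd-fluentd-elasticsearch
      effect: NoSchedule
    - key: nidhogg.uswitch.com/kiam.kiam-agent
      effect: NoSchedule
    - key: nidhogg.uswitch.com/kube-system.node-local-dns
      effect: NoSchedule
    - key: nidhogg.uswitch.com/kube-system.weave-net
      effect: NoSchedule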

@isugimpy

We're experiencing this as well, or something very similar, but are solving it via a mutating webhook which taints the nodes at create time. Seemed cleaner and easier to update than having to alter kubelet args. In our case, it doesn't appear that the firstTimeReady gets set early, but rather that it's a race condition where we have some pods that get scheduled and try to start before Nidhogg can apply even the first taint.
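
The registration side is roughly the sketch below; the names and service here are made up for illustration, and the webhook server itself (which returns a JSONPatch adding the nidhogg taints to the Node object) isn't shown:

    apiVersion: admissionregistration.k8s.io/v1
    kind: MutatingWebhookConfiguration
    metadata:
      name: node-taint-webhook            # hypothetical name
    webhooks:
    - name: node-taint.example.com        # hypothetical
      admissionReviewVersions: ["v1"]
      sideEffects: None
      # Fail guarantees the taint but blocks node registration if the webhook is down;
      # Ignore is safer for the cluster but can leave the same small untainted window.
      failurePolicy: Fail
      rules:
      - apiGroups: [""]
        apiVersions: ["v1"]
        operations: ["CREATE"]
        resources: ["nodes"]
        scope: "Cluster"
      clientConfig:
        service:
          namespace: kube-system          # hypothetical
          name: node-taint-webhook
          path: /mutate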
