Add taint to user nodes #2605
base: main
Conversation
@@ -41,10 +41,33 @@ class ExistingInputVars(schema.Base):
    kube_context: str


class DigitalOceanNodeGroup(schema.Base):
Duplicate class
This method works as intended when tested on GCP. However, one issue is that certain daemonsets won't run on the tainted nodes. I saw the issue with the rook-ceph csi-cephfsplugin daemonset in my rook PR, but I expect it would also be an issue for the monitoring daemonset pods. So we'd likely need to add the appropriate toleration to those daemonsets.
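For the monitoring daemonset, the fix would likely be a toleration in its Helm values along these lines (a sketch only; the `dedicated`/`user` taint key and value are placeholders, not the actual taint this PR applies):

```hcl
# Hypothetical Helm values fragment for the monitoring daemonset;
# "dedicated" / "user" are placeholder taint key/value, not the
# taint actually used by this PR.
tolerations = [
  {
    key      = "dedicated"
    operator = "Equal"
    value    = "user"
    effect   = "NoSchedule"
  }
]
```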
@@ -45,6 +45,13 @@ resource "helm_release" "rook-ceph" {
  },
  csi = {
    enableRbdDriver     = false, # necessary to provision block storage, but saves some cpu and memory if not needed
    provisionerReplicas = 1,     # default is 2 on different nodes
    pluginTolerations = [
Runs the csi driver on all nodes, even those with NoSchedule taints; it doesn't run on nodes with NoExecute taints. This is what the nebari-prometheus-node-exporter daemonset does, so I copied it here.
        effect   = "NoSchedule"
      },
      {
        operator = "Exists"
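Expanded, the wildcard entry being described here is just an operator/effect pair with no `key` attribute, which matches every taint key (a sketch of the same shape the node-exporter chart uses):

```hcl
# Omitting "key" with operator "Exists" tolerates any NoSchedule
# taint regardless of key; NoExecute taints are still not tolerated.
pluginTolerations = [
  {
    operator = "Exists"
    effect   = "NoSchedule"
  }
]
```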
Runs promtail on all nodes, even those with NoSchedule taints; it doesn't run on nodes with NoExecute taints. This is what the nebari-prometheus-node-exporter daemonset does, so I copied it here. Promtail is what exports logs from the node, so we still want it to run on the user and worker nodes.
      {
        key      = "node-role.kubernetes.io/master"
        operator = "Exists"
        effect   = "NoSchedule"
      },
      {
        key      = "node-role.kubernetes.io/control-plane"
        operator = "Exists"
        effect   = "NoSchedule"
      },
These top two entries are the default values for this Helm chart.
Okay, so things are working for the user node group. I tried adding a taint to the worker node group, but the dask scheduler won't run on the tainted worker node group. See this commit for what I tried in a quick test. I do see the new scheduler_pod_extra_config value in
so I think possibly the merge isn't going as expected, but I need to verify. The docs say that "This dict will be deep merged with the scheduler pod spec (a V1PodSpec object) before submission. Keys should match those in the kubernetes spec, and should be camelCase."
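Given that quoted deep-merge behavior, the dict would need camelCase V1PodSpec keys, e.g. a tolerations list like the sketch below (the taint key/value are placeholders, and exactly where this dict gets set in Nebari's dask-gateway config is an assumption):

```python
# Sketch of a scheduler_pod_extra_config value using camelCase keys
# matching the V1PodSpec "tolerations" field, as the docs describe.
# "dedicated" / "worker" are placeholder taint key/value.
scheduler_pod_extra_config = {
    "tolerations": [
        {
            "key": "dedicated",
            "operator": "Equal",
            "value": "worker",
            "effect": "NoSchedule",
        }
    ]
}
```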
I managed to get the taints applied to the scheduler pod in this commit. I would have expected the
I still need to apply the toleration to the dask workers.
Reference Issues or PRs
Fixes #2507
What does this implement/fix?

WIP

Put an x in the boxes that apply

Testing
Any other comments?