This repository has been archived by the owner on May 6, 2020. It is now read-only.

Fix node local storage #72

Merged
merged 2 commits into from Sep 27, 2019
26 changes: 15 additions & 11 deletions packet/flatcar-linux/kubernetes/workers/cl/worker.yaml.tmpl
@@ -7,12 +7,14 @@ systemd:
[Unit]
Description=Persist data RAID if exists
ConditionPathExists=!/etc/mdadm.conf
Before=kubelet.service
[Service]
Type=oneshot
RemainAfterExit=true
ExecStart=/opt/persist-data-raid
[Install]
WantedBy=multi-user.target
RequiredBy=kubelet.service
- name: docker.service
enable: true
- name: locksmithd.service
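
The two directives added above couple the RAID-persistence unit to the kubelet: Before= orders it ahead of kubelet.service, and RequiredBy= (applied once the unit is enabled) makes kubelet.service require it, so kubelet should not start if persisting the RAID fails. A minimal sketch for verifying this on a node, assuming the unit is named persist-data-raid.service (the unit name is not visible in this hunk, so it is an assumption):

    # Hypothetical unit name; only the Before=/RequiredBy= directives are in the diff.
    systemctl show -p Before persist-data-raid.service   # should list kubelet.service
    systemctl show -p Requires kubelet.service           # should list the RAID unit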
@@ -193,10 +195,9 @@ storage:

if [ "$${setup_fs_on_raid}" = true ]; then
mkfs.ext4 "$${device_path}"
mkdir "/mnt/$${array_name}"
mount "$${device_path}" "/mnt/$${array_name}"
mount "$${device_path}" "/mnt/"
# Make mount persistent across reboots
echo "$${device_path} /mnt/$${array_name} ext4 defaults,nofail,discard 0 0" | tee -a /etc/fstab
echo "$${device_path} /mnt/ ext4 defaults,nofail,discard 0 0" | tee -a /etc/fstab
fi
}
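
To make the persistence step concrete, this is the fstab line the tee above would append once $${device_path} is substituted, using /dev/md/node-local-storage as an illustrative value (one of the device paths passed further down), together with a quick way to check it on the node:

    # Illustrative rendering of the appended fstab entry:
    #   /dev/md/node-local-storage /mnt/ ext4 defaults,nofail,discard 0 0
    findmnt /mnt                 # confirm the array is mounted
    grep ' /mnt/ ' /etc/fstab    # confirm the entry is persisted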

@@ -209,17 +210,20 @@ storage:
# https://www.kernel.org/doc/Documentation/admin-guide/devices.txt
major_numbers="8,259"

# XXX: These options are exclusive, as only one fs can be mounted
# to /mnt/
# This is, partly, because dirs created inside /mnt to mount
# several paths (like /mnt/node-local-storage) are not visible
# to the pods. See this issue for more info:
# https://github.com/kinvolk/lokomotive-kubernetes/issues/73
#
# Variables replaced by Terraform
if [ ${setup_raid} = true ]; then
create_data_raid "$${major_numbers}" -1 /dev/md/node-local-storage true
else
# Both can be set independently
if [ ${setup_raid_hdd} = true ]; then
create_data_raid "$${major_numbers}" 1 /dev/md/node-local-hdd-storage true
fi
if [ ${setup_raid_ssd} = true ]; then
create_data_raid "$${major_numbers}" 0 /dev/md/node-local-ssd-storage ${setup_raid_ssd_fs}
fi
elif [ ${setup_raid_hdd} = true ]; then
Contributor
I am not really sure why this change was introduced.

With this code we can now only set up a RAID of type HDD or SSD. This change has added another regression!

Contributor Author
@rata Sep 30, 2019
Because mounting both the HDD and the SSD RAID on /mnt is not possible, and the original patch was doing exactly that (if you configure both to have an fs). This is explained in the updated PR description :-)

And those changes (the new setup_raid_* flags) never worked: they never wrote to those devices, as things mounted under /mnt are not visible to pods. I've created #74 to double-check whether we really need this, and #73 to tackle the root cause; I even proposed a fix there (in case better fixes can't be found), see the comment that links to ba80265

However, as far as I understand, those flags never worked, and my reasoning was this: since the variables never worked, were introduced only a few commits ago as a backwards-incompatible change, and I'm not really sure we need them, let's just revert to the previous working situation. If we need to introduce them later and make a backwards-incompatible change, let's just do it then. But it would be great if we can avoid these, as the code is getting out of control (I also created #75 to suggest a path to keep this under control) and is introducing more bugs than solutions (the flags created bugs but never really worked :-D).

If we later need to make the backwards-incompatible change, let's do it (as we tried when we first introduced these vars and moved the mount point from /mnt to /mnt/node-local-storage). But let's do it making sure everything works and, if possible, keeping the change as simple as possible so it doesn't generate much tech debt :)

create_data_raid "$${major_numbers}" 1 /dev/md/node-local-hdd-storage true
elif [ ${setup_raid_ssd} = true ]; then
create_data_raid "$${major_numbers}" 0 /dev/md/node-local-ssd-storage ${setup_raid_ssd_fs}
fi
- path: /etc/kubernetes/kubeconfig
filesystem: root
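To make the exclusivity of the branches above concrete: after Terraform substitutes the setup_raid* variables, only the first branch that tests true runs, so at most one array is created and mounted on /mnt/. A sketch of the rendered script under hypothetical values setup_raid=false, setup_raid_hdd=true, setup_raid_ssd=true (setup_raid_ssd_fs assumed false):

    # Even with two flags set to true, the elif chain runs only the HDD branch.
    if [ false = true ]; then
      create_data_raid "${major_numbers}" -1 /dev/md/node-local-storage true
    elif [ true = true ]; then
      create_data_raid "${major_numbers}" 1 /dev/md/node-local-hdd-storage true
    elif [ true = true ]; then   # never reached: the previous branch already matched
      create_data_raid "${major_numbers}" 0 /dev/md/node-local-ssd-storage false
    fi
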
6 changes: 3 additions & 3 deletions packet/flatcar-linux/kubernetes/workers/variables.tf
@@ -80,19 +80,19 @@ EOD
}

variable "setup_raid" {
description = "Attempt to create a RAID 0 from extra disks to be used for persistent container storage. Valid values: \"true\", \"false\""
description = "Attempt to create a RAID 0 from extra disks to be used for persistent container storage. Can't be used with setup_raid_hdd nor setup_raid_sdd. Valid values: \"true\", \"false\""
type = "string"
default = "false"
}

variable "setup_raid_hdd" {
description = "Attempt to create a RAID 0 from extra Hard Disk drives only, to be used for persistent container storage. Valid values: \"true\", \"false\""
description = "Attempt to create a RAID 0 from extra Hard Disk drives only, to be used for persistent container storage. Can't be used with setup_raid nor setup_raid_sdd. Valid values: \"true\", \"false\""
type = "string"
default = "false"
}

variable "setup_raid_ssd" {
description = "Attempt to create a RAID 0 from extra Solid State Drives only, to be used for persistent container storage. Valid values: \"true\", \"false\""
description = "Attempt to create a RAID 0 from extra Solid State Drives only, to be used for persistent container storage. Can't be used with setup_raid nor setup_raid_hdd. Valid values: \"true\", \"false\""
type = "string"
default = "false"
}
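Taken together, the three descriptions now document that the flags are mutually exclusive. A hypothetical invocation enabling only the SSD RAID could look like the sketch below, assuming these variables are exposed at the root of your Terraform configuration (in a module setup you would set them on the module block instead); setup_raid_ssd_fs is referenced by the template but its declaration is not shown in this diff, and other required variables are omitted:

    # Illustrative only; setup_raid and setup_raid_hdd keep their "false" defaults.
    terraform apply \
      -var 'setup_raid_ssd=true' \
      -var 'setup_raid_ssd_fs=true'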