
Add aws-s3-csi-controller component #299

Merged: 15 commits, Dec 16, 2024
Conversation

@unexge (Contributor) commented on Nov 22, 2024:

This is part of #279.

This new component, aws-s3-csi-controller, will be the entry point for our controller component. It uses controller-runtime; specifically, it implements the Reconciler interface to reconcile Pods in the cluster. It schedules Mountpoint Pods in response to cluster events, such as a new workload Pod that uses a PV backed by the S3 CSI Driver being scheduled into the cluster. It then schedules a Mountpoint Pod on the same node as that workload Pod to provide the volume for it.
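For readers unfamiliar with controller-runtime, here is a minimal sketch of the shape such a Reconciler takes. Names and structure are illustrative only, not the PR's exact code:

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// Reconciler reconciles workload Pods and schedules Mountpoint Pods for them.
type Reconciler struct {
	client.Client
}

// Reconcile is called by controller-runtime for every Pod event.
func (r *Reconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	pod := &corev1.Pod{}
	if err := r.Get(ctx, req.NamespacedName, pod); err != nil {
		// The Pod might have been deleted in the meantime; nothing to do.
		return ctrl.Result{}, client.IgnoreNotFound(err)
	}

	// Inspect the Pod's volumes, find PVs backed by the S3 CSI Driver, and
	// spawn a Mountpoint Pod on the same node if one is needed (omitted here).
	return ctrl.Result{}, nil
}

// SetupWithManager registers the reconciler to watch Pods.
func (r *Reconciler) SetupWithManager(mgr ctrl.Manager) error {
	return ctrl.NewControllerManagedBy(mgr).
		For(&corev1.Pod{}).
		Complete(r)
}
```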

#279 is still WIP, so this component contains some TODOs and is not used anywhere except in tests at the moment.


By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Makefile: outdated review comment, resolved.
log.V(debugLevel).Info("Pod pending to be scheduled")
case corev1.PodRunning:
log.V(debugLevel).Info("Pod is running")
case corev1.PodSucceeded:
Contributor:

We need to verify that Mountpoint exits with an error exit code on all crashes. I'm not sure it does right now.

Contributor Author (@unexge):

So, depending on Mountpoint's exit code (non-zero for failures, zero for success), the Mountpoint Pod will either restart (due to RestartPolicy: corev1.RestartPolicyOnFailure) or transition into the PodSucceeded state, respectively. This part deletes PodSucceeded Mountpoint Pods.
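A rough sketch of the behavior described above, assuming a hypothetical helper name; the restart policy comment and the phase check are the relevant parts, not the exact controller code:

```go
package sketch

import (
	"context"

	corev1 "k8s.io/api/core/v1"
	"sigs.k8s.io/controller-runtime/pkg/client"
)

// cleanUpSucceededMountpointPod deletes a Mountpoint Pod once it has exited
// cleanly. With RestartPolicy: corev1.RestartPolicyOnFailure on the Pod spec,
// a non-zero Mountpoint exit restarts the container, while a zero exit moves
// the Pod to PodSucceeded, which is the state this helper cleans up.
func cleanUpSucceededMountpointPod(ctx context.Context, c client.Client, pod *corev1.Pod) error {
	if pod.Status.Phase != corev1.PodSucceeded {
		return nil
	}
	return client.IgnoreNotFound(c.Delete(ctx, pod))
}
```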

Contributor:

I know, we just need to ensure that Mountpoint exits with non-zero exit codes on failure.

Contributor Author (@unexge):

I've manually verified, while testing Mountpoint with incorrect credentials, that this is the case. I think we also have some end-to-end tests covering failed mounts, but I'm not sure whether we also want a specific test case for the exit code.
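If we did want an explicit exit-code assertion, it could look roughly like the helper below; the function name and the way the terminated container is looked up are assumptions, not existing test code:

```go
package sketch

import (
	"fmt"

	corev1 "k8s.io/api/core/v1"
)

// mountpointExitCode returns the exit code of the terminated Mountpoint
// container, or an error if no container has terminated yet. A test could
// run Mountpoint with bad credentials and assert the result is non-zero.
func mountpointExitCode(pod *corev1.Pod) (int32, error) {
	for _, status := range pod.Status.ContainerStatuses {
		if status.State.Terminated != nil {
			return status.State.Terminated.ExitCode, nil
		}
	}
	return 0, fmt.Errorf("no terminated container found in pod %s", pod.Name)
}
```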

)

func TestGeneratingMountpointPodName(t *testing.T) {
t.Run("Consistency", func(t *testing.T) {
Contributor:

These tests seem like they'd be better as property-based tests.

pkg/podmounter/mppod/path.go: review comment, resolved.

expectNoMountpointPodFor(pod, vol)

vol.bind()
Contributor:

Semantically, what does this mean in terms of the Kubernetes runtime? Is this the case where the binding isn't explicit in the PV, but happens at Pod schedule time?

Contributor Author (@unexge):

This is actually the default behavior if you don't pre-bind your PVC and PV. You specify a PVC in your Pod with some capacity, and once your Pod is scheduled, Kubernetes tries to bind a PV to your PVC that fulfills your capacity request. In the case of static provisioning, the cluster admin must pre-provision PVs for this to work; in the case of dynamic provisioning, another controller such as external-provisioner sees the claim and dynamically creates a PV and binds it to the PVC. See https://kubernetes.io/docs/concepts/storage/persistent-volumes/#binding for more detail.
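As a small illustration of what "bound" means at the API level (the helper name is hypothetical, not the controller's code):

```go
package sketch

import corev1 "k8s.io/api/core/v1"

// isBound reports whether a PVC has been bound to a PV. With static
// provisioning and no pre-binding, this becomes true only after the Pod is
// scheduled and Kubernetes picks a pre-provisioned PV that satisfies the
// claim; with dynamic provisioning, an external provisioner creates the PV
// first and then the binding happens.
func isBound(pvc *corev1.PersistentVolumeClaim) bool {
	return pvc.Status.Phase == corev1.ClaimBound && pvc.Spec.VolumeName != ""
}
```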

tests/controller/controller_test.go: review comment, resolved.

var _ = Describe("Mountpoint Controller", func() {
Context("Static Provisioning", func() {
Context("Scheduled Pod with pre-bound PV and PVC", func() {
Contributor:

Are we effectively testing that if there's a volume using the S3 CSI Driver and it is bound to a Pod that is scheduled, we waitAndVerifyMountpointPodFor, and otherwise we expectNoMountpointPodFor?

Contributor:

If so, then a property-based test using a state machine might be useful here to ensure all combinations are reached.

Contributor Author (@unexge):

Yes, those are the expectations we assert during the tests. We can look into using property tests, but I'd also like to keep these known/expected cases here.

@unexge force-pushed the unexge/podmounter-controller branch from 6fed7e3 to 0d2c5cb on December 10, 2024.
@unexge requested a review from muddyfish on December 10, 2024.

log.V(debugLevel).Info("Found bound PV for PVC", "pvc", pvc.Name, "volumeName", pv.Name)

err = r.spawnOrDeleteMountpointPodIfNeeded(ctx, pod, pvc, pv, csiSpec)
Contributor:

This can be called multiple times for the same volume. I don't think that's a problem, but is it worth moving this outside the for loop and only calling it if we have no requeue and no errors? That way we could call it in parallel for each PVC, but that might just be premature optimisation.

Contributor Author (@unexge):

I'd like this to be called even if some other volume for this Pod requires requeueing, so one poisonous volume won't block other volumes from progressing. And yeah, I think calling this in parallel would be a premature optimization and would also be problematic if we want to share Mountpoint instances between multiple volumes: they would race to spawn Mountpoint Pods, which might cause duplicate Pods.
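A sketch of the loop behavior being described, with the per-volume work abstracted behind a callback (standing in for spawnOrDeleteMountpointPodIfNeeded); this is illustrative, not the controller's actual code:

```go
package sketch

import (
	"context"
	"errors"

	corev1 "k8s.io/api/core/v1"
	ctrl "sigs.k8s.io/controller-runtime"
)

// reconcileEachVolume gives every volume of the Pod a chance to be reconciled
// even if an earlier one fails, so a single problematic volume cannot block
// the others from making progress.
func reconcileEachVolume(
	ctx context.Context,
	pod *corev1.Pod,
	pvcs []*corev1.PersistentVolumeClaim,
	reconcileOne func(ctx context.Context, pod *corev1.Pod, pvc *corev1.PersistentVolumeClaim) error,
) (ctrl.Result, error) {
	var errs []error
	for _, pvc := range pvcs {
		if err := reconcileOne(ctx, pod, pvc); err != nil {
			// Record the failure and move on to the next volume.
			errs = append(errs, err)
		}
	}
	// Returning a non-nil error makes controller-runtime requeue the Pod, so
	// the failed volumes are retried without having blocked the others.
	return ctrl.Result{}, errors.Join(errs...)
}
```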

@unexge force-pushed the unexge/podmounter-controller branch from 0d2c5cb to 77c6842 on December 11, 2024.
Signed-off-by: Burak Varlı <[email protected]>
@unexge force-pushed the unexge/podmounter-controller branch from 77c6842 to 5303a67 on December 16, 2024.
@muddyfish (Contributor) left a comment:

Taken notes for the non-blocking things we still want to do; good to merge.

@unexge merged commit 0f2689b into main on Dec 16, 2024. 21 checks passed.
@unexge deleted the unexge/podmounter-controller branch on December 16, 2024.