Support routing to Kubernetes clusters by request path #50567

Open · wants to merge 10 commits into master

Conversation

timothyb89 (Contributor) commented Dec 24, 2024

This implements path-based routing for Kubernetes clusters as described by RFD0185 (#47436). A new prefixed path handler is added that accepts base64-encoded Teleport and Kubernetes cluster names. The request is routed to the destination Teleport cluster using these parameters instead of those embedded in the session TLS identity, and then the preexisting handlers check authorization and complete the request as usual.

This removes the need for certificates to be issued per Kubernetes cluster: so long as the incoming identity is granted access to the cluster via its roles, access can succeed, and no KubernetesCluster attribute or cert usage restrictions are needed.

Fixes #40405
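
As a quick illustration of the routing scheme described above (a minimal sketch, not the PR's actual handler; the helper name and the exact base64 variant are assumptions), a client would encode both cluster names into the path prefix like this:

// Sketch only: builds the path prefix described above, assuming the cluster
// names are base64url-encoded without padding (the exact variant used by the
// handler may differ).
package main

import (
	"encoding/base64"
	"fmt"
)

func routePath(teleportCluster, kubeCluster string) string {
	enc := base64.RawURLEncoding
	return fmt.Sprintf("/v1/teleport/%s/%s",
		enc.EncodeToString([]byte(teleportCluster)),
		enc.EncodeToString([]byte(kubeCluster)))
}

func main() {
	// Example names only.
	fmt.Println(routePath("example.teleport.sh", "default"))
}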


Testing steps

  1. Ensure a Teleport proxy is running this branch with one or more k8s clusters attached. The Kubernetes node does not need to be running this branch, only the proxy.

  2. Build tbot against #50898 (Machine ID: Support path-based Kubernetes routing).

  3. Create a new bot with appropriate permissions to access your kube node(s), e.g. tctl bots add foo --roles=kube-access

  4. Start tbot with the new kubernetes/v2 service:

    $ tbot start kubernetes/v2 --destination=./tbot-user --storage=./tbot-data --token=$token --join-method=token --kubernetes-cluster-name default --proxy-server=example.teleport.sh:443
    

    The --kubernetes-cluster-name flag can be repeated. Alternatively (or additionally), try a label matcher:

    $ tbot start kubernetes/v2 --destination=./tbot-user --storage=./tbot-data --token=$token --join-method=token --kubernetes-cluster-labels foo=bar --proxy-server=example.teleport.sh:443
    
  5. Examine ./tbot-user/kubeconfig.yaml. You should see one cluster: and one context: entry per matched cluster. The server: field should include a new URL suffix, /v1/teleport/<teleportCluster>/<kubeCluster>

  6. Try KUBECONFIG=./tbot-user/kubeconfig.yaml kubectl get pods. Also consider testing kubectl exec, port forwarding, etc. A programmatic variant of this check is sketched below.
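
As a programmatic variant of step 6 (a hedged sketch; it assumes the tbot-generated kubeconfig from step 5 and standard client-go behavior), the generated kubeconfig can be loaded directly and used to list pods through the path-routed proxy endpoint:

// Sketch: load the tbot-generated kubeconfig and list pods through the
// path-routed proxy endpoint. Paths and namespaces are the examples used in
// the steps above.
package main

import (
	"context"
	"fmt"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/tools/clientcmd"
)

func main() {
	// Equivalent to KUBECONFIG=./tbot-user/kubeconfig.yaml.
	cfg, err := clientcmd.BuildConfigFromFlags("", "./tbot-user/kubeconfig.yaml")
	if err != nil {
		panic(err)
	}
	// cfg.Host should end with the /v1/teleport/<teleportCluster>/<kubeCluster>
	// suffix described in step 5.
	client, err := kubernetes.NewForConfig(cfg)
	if err != nil {
		panic(err)
	}
	pods, err := client.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
	if err != nil {
		panic(err)
	}
	fmt.Printf("found %d pods in the default namespace\n", len(pods.Items))
}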

timothyb89 (Contributor Author) commented Dec 24, 2024

Outstanding TODOs:

  • Add manual testing instructions to PR description
  • Unit tests
  • More rigorously validate security impact. Particularly want to ensure per-session MFA guarantees remain intact.
  • PR for tbot changes to test this more easily. Should be a low-impact change, and my current hacked tsh build is not really an acceptable validation environment. PR is available here: #50898 (Machine ID: Support path-based Kubernetes routing)

lib/kube/proxy/forwarder.go (two outdated review threads, resolved)
@@ -520,7 +525,7 @@ type handlerWithAuthFuncStd func(ctx *authContext, w http.ResponseWriter, r *htt
const accessDeniedMsg = "[00] access denied"

// authenticate function authenticates request
func (f *Forwarder) authenticate(req *http.Request) (*authContext, error) {
func (f *Forwarder) authenticate(req *http.Request, params httprouter.Params, routeSourcer routeSourcer) (*authContext, error) {
Contributor:

I don't fully understand why you need to pass the params/route sourcer here.

Given that singleCertHandler already sets:

o.Identity.RouteToCluster = teleportCluster
o.Identity.KubernetesCluster = kubeCluster

Do we need a special case for the route sourcer? We can always retrieve it from the identity.
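
For illustration, a minimal sketch of what this suggestion amounts to (types and names here are simplified stand-ins, not the forwarder's real ones): once the path handler has written the route into the identity, downstream code can read it back from the identity alone, with no extra params/routeSourcer argument.

// Simplified sketch: with the route copied into the identity by the path
// handler, authenticate-style code needs nothing beyond the identity itself.
type identity struct {
	RouteToCluster    string // destination Teleport cluster
	KubernetesCluster string // destination Kubernetes cluster
}

func routeFromIdentity(id identity) (teleportCluster, kubeCluster string) {
	return id.RouteToCluster, id.KubernetesCluster
}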

Contributor Author:

That's a good call; it was left over from before I had it modifying (and relying on) the identity directly. I've removed the routeSource mechanism outright and reverted most of my changes to the forwarder itself. I ended up diving deeper into how the identity is passed and trusted between clusters and wrote some notes in a comment.

return "", "", trace.Wrap(err)
}

// TODO: do we care to otherwise validate these results before casting
Contributor:

Yeah, I think it would be better anyway.

Contributor Author:

I'm not totally sure what other validation steps we might want here; somehow I'd never looked into it before, but we seem to impose very few restrictions on resource names. I've added a check to make sure the parameters decode to valid UTF-8, at least. Downstream it looks to be a map lookup from cached cluster details, so I don't expect significant risk.
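
A rough sketch of that validation (illustrative only; the PR's actual decode path, helper names, and base64 variant may differ):

// Sketch: decode a path parameter and reject anything that is not valid UTF-8
// before it is used for the cluster lookup.
package main

import (
	"encoding/base64"
	"fmt"
	"unicode/utf8"
)

func decodeClusterParam(param string) (string, error) {
	raw, err := base64.RawURLEncoding.DecodeString(param)
	if err != nil {
		return "", fmt.Errorf("decoding cluster name: %w", err)
	}
	if !utf8.Valid(raw) {
		return "", fmt.Errorf("cluster name is not valid UTF-8")
	}
	return string(raw), nil
}

func main() {
	param := base64.RawURLEncoding.EncodeToString([]byte("default"))
	fmt.Println(decodeClusterParam(param))
}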

assert: func(t *testing.T, restConfig *rest.Config) {
client := pathRoutedKubeClient(t, restConfig, clusterName, "a")
_, err = client.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
require.ErrorContains(t, err, "cannot list resource")
timothyb89 (Contributor Author) commented Jan 15, 2025:

I've noticed a mild behavior change here, I think because this just wasn't possible before. Attempting to, e.g., list resources on a cluster denied by roles now yields (kubectl get pods -v8):

I0114 20:55:58.643198   22226 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kubernetes cluster \"default\" not found","reason":"Forbidden","code":403}
I0114 20:55:58.643592   22226 helpers.go:246] server response object: [{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "kubernetes cluster \"default\" not found",
  "reason": "Forbidden",
  "code": 403
}]
Error from server (Forbidden): kubernetes cluster "default" not found

...however, attempting to list a nonexistent cluster returns:

I0114 20:51:57.008342   10127 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Kubernetes cluster \"asdf\" not found","reason":"NotFound","code":404}
E0114 20:51:57.008352   10127 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
I0114 20:51:57.008357   10127 cached_discovery.go:120] skipped caching discovery info due to the server could not find the requested resource
I0114 20:51:57.008477   10127 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "the server could not find the requested resource",
  "reason": "NotFound",
  "details": {
    "causes": [
      {
        "reason": "UnexpectedServerResponse",
        "message": "unknown"
      }
    ]
  },
  "code": 404
}]
Error from server (NotFound): the server could not find the requested resource

(and Kubernetes cluster "$name" not found is logged on the server)

Previously, you could only ever get certs for a cluster if the name was valid and the user was authorized to see it; otherwise, a consistent error was returned before the cert was ever issued. Would we consider this a resource name oracle?

It seems like the two responses probably ought to be identical, but changing them might be a mild breaking change to existing behavior (403 vs 404).

Also, don't allow path parameters to override the identity if the identity already contains nonempty routing params.
Comment on lines +88 to +99
name: "successful path routing to multiple clusters",
roleSpec: defaultRoleSpec,
assert: func(t *testing.T, restConfig *rest.Config) {
clientB := pathRoutedKubeClient(t, restConfig, clusterName, "b")
_, err = clientB.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
require.NoError(t, err)

clientA := pathRoutedKubeClient(t, restConfig, clusterName, "a")
_, err := clientA.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
require.NoError(t, err)

},
Contributor Author:

This test triggers ensureRouteNotOverwritten() errors in a way that implies some state (Identity.KubernetesCluster, at least) is shared between requests. I haven't been able to reproduce this behavior on an actual cluster, so either:

  1. Something about the test environment/clients shares the identity object in a way not representative of reality
  2. The identity is sometimes cached between requests (just not in a way I can trigger), in which case I think this method for passing path parameters around by overwriting identity fields may not be viable.

Will need to spend more time debugging this. 😕

tigrato (Contributor) commented Jan 16, 2025:

I found two issues with the tests:

  • GenTestKubeClientTLSCert already fills in all of the Kubernetes cluster and route-to-cluster information.
  • restConfig was edited by appending the route to the existing endpoint, so the second request (to cluster A) ended up with /v1/teleport/<teleportCluster>/<kubeClusterB>//v1/teleport/<teleportCluster>/<kubeClusterA>/ as the host. That's why it fails.

This means we call the route handler multiple times with different clusters, causing the identity to mismatch.

Check the fixes here: af3a02c
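
For reference, a hedged sketch of the kind of fix described above (this is not the code in af3a02c; pathRoutedKubeClient is the helper named in this thread, but the body, package name, and the base64 encoding of the path segments are assumptions). The key point is copying the config per cluster so route prefixes never stack:

// Sketch: build a per-cluster client from a copy of the base rest.Config so
// that appending the route suffix never mutates the caller's config.
package proxy_test

import (
	"encoding/base64"
	"fmt"
	"testing"

	"github.com/stretchr/testify/require"
	"k8s.io/client-go/kubernetes"
	"k8s.io/client-go/rest"
)

func pathRoutedKubeClient(t *testing.T, base *rest.Config, teleportCluster, kubeCluster string) *kubernetes.Clientset {
	t.Helper()
	cfg := rest.CopyConfig(base) // never modify the shared config in place
	// Encoding assumed to match the base64url scheme from the PR description.
	cfg.Host = fmt.Sprintf("%s/v1/teleport/%s/%s", base.Host,
		base64.RawURLEncoding.EncodeToString([]byte(teleportCluster)),
		base64.RawURLEncoding.EncodeToString([]byte(kubeCluster)))
	client, err := kubernetes.NewForConfig(cfg)
	require.NoError(t, err)
	return client
}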

timothyb89 (Contributor Author) commented Jan 16, 2025:

Oh, that was definitely it, thanks for finding the issue! I was losing my mind a bit over that and it was just the client config getting reused 🤦

GenTestKubeClientTLSCert already accepts "" for the KubernetesCluster value, and I believe most identities have RouteToCluster configured, so technically I think that was working, if a bit ugly. In any case, I'll add some test cases with an explicitly empty RouteToCluster.

Successfully merging this pull request may close these issues.

Support connecting to multiple Kubernetes clusters using a single X509 cert