Support routing to Kubernetes clusters by request path #50567
base: master
Conversation
This implements path-based routing for Kubernetes clusters as described by [RFD0185]. A new prefixed path handler is added that accepts base64-encoded Teleport and Kubernetes cluster names. The request is routed to the destination Teleport cluster using these parameters instead of those embedded in the session TLS identity, and then the preexisting handlers check authorization and complete the request as usual.

This removes the need for certificates to be issued per Kubernetes cluster: so long as the incoming identity is granted access to the cluster via its roles, access can succeed, and no `KubernetesCluster` attribute or cert usage restrictions are needed.

[RFD0185]: #47436
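For illustration, a minimal sketch of how a client might construct the routed path. This is not code from the PR; the helper name and the choice of `base64.RawURLEncoding` are assumptions, since the PR only specifies that the cluster names are base64-encoded:

```go
package main

import (
	"encoding/base64"
	"fmt"
)

// routedPrefix builds the per-cluster path prefix described above.
// Encoding variant and helper name are assumptions, not the PR's code.
func routedPrefix(teleportCluster, kubeCluster string) string {
	enc := base64.RawURLEncoding
	return fmt.Sprintf("/v1/teleport/%s/%s",
		enc.EncodeToString([]byte(teleportCluster)),
		enc.EncodeToString([]byte(kubeCluster)))
}

func main() {
	// Every request sent under this prefix is routed to the named clusters.
	fmt.Println(routedPrefix("example.teleport.sh", "kube-a"))
}
```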
Outstanding TODOs:
lib/kube/proxy/forwarder.go (Outdated)
```diff
@@ -520,7 +525,7 @@ type handlerWithAuthFuncStd func(ctx *authContext, w http.ResponseWriter, r *htt
 const accessDeniedMsg = "[00] access denied"

 // authenticate function authenticates request
-func (f *Forwarder) authenticate(req *http.Request) (*authContext, error) {
+func (f *Forwarder) authenticate(req *http.Request, params httprouter.Params, routeSourcer routeSourcer) (*authContext, error) {
```
I don't fully understand why you need to pass the params/route sourcer here. Given that `singleCertHandler` already defines

```go
o.Identity.RouteToCluster = teleportCluster
o.Identity.KubernetesCluster = kubeCluster
```

do we need to have a special case for the route sourcer? We can always retrieve it from the identity.
That's a good call; it was left over from before I had it modifying (and relying on) the identity directly. I've removed the `routeSource` mechanism outright and reverted most of my changes to the forwarder itself. I ended up diving deeper into how the identity is passed / trusted between clusters and wrote some notes in a comment.
```go
	return "", "", trace.Wrap(err)
}

// TODO: do we care to otherwise validate these results before casting
```
Yeah, I think it would be better anyway.
I'm not totally sure what other validation steps we might want here. Somehow I'd never looked into it before, but we seem to impose very few restrictions on resource names. I've added a check to make sure the parameters decode to valid UTF-8, at least. Downstream it looks to be a map lookup against cached cluster details, so I don't expect significant risk.

Also: fixes various other review comments and adds a comment explaining the auth strategy.
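For illustration, a minimal sketch of the decode-and-validate step described here, assuming `base64.RawURLEncoding`; the helper name and package placement are hypothetical, not the PR's actual code:

```go
package kubeproxy // hypothetical placement; illustrative only

import (
	"encoding/base64"
	"unicode/utf8"

	"github.com/gravitational/trace"
)

// decodeClusterParam decodes one base64-encoded path parameter and rejects
// anything that does not decode to valid UTF-8 before it is used as a
// cluster name.
func decodeClusterParam(param string) (string, error) {
	raw, err := base64.RawURLEncoding.DecodeString(param)
	if err != nil {
		return "", trace.Wrap(err)
	}
	if !utf8.Valid(raw) {
		return "", trace.BadParameter("cluster name is not valid UTF-8")
	}
	return string(raw), nil
}
```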
```go
assert: func(t *testing.T, restConfig *rest.Config) {
	client := pathRoutedKubeClient(t, restConfig, clusterName, "a")
	_, err = client.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
	require.ErrorContains(t, err, "cannot list resource")
```
I've noticed a mild behavior change here, I think because this just wasn't possible before. Attempting to e.g. list resources on a cluster denied by roles now yields (`kubectl get pods -v8`):

```
I0114 20:55:58.643198 22226 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"kubernetes cluster \"default\" not found","reason":"Forbidden","code":403}
I0114 20:55:58.643592 22226 helpers.go:246] server response object: [{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "kubernetes cluster \"default\" not found",
  "reason": "Forbidden",
  "code": 403
}]
Error from server (Forbidden): kubernetes cluster "default" not found
```
...however, attempting to list a nonexistent cluster returns:
```
I0114 20:51:57.008342 10127 request.go:1212] Response Body: {"kind":"Status","apiVersion":"v1","metadata":{},"status":"Failure","message":"Kubernetes cluster \"asdf\" not found","reason":"NotFound","code":404}
E0114 20:51:57.008352 10127 memcache.go:265] couldn't get current server API group list: the server could not find the requested resource
I0114 20:51:57.008357 10127 cached_discovery.go:120] skipped caching discovery info due to the server could not find the requested resource
I0114 20:51:57.008477 10127 helpers.go:246] server response object: [{
  "metadata": {},
  "status": "Failure",
  "message": "the server could not find the requested resource",
  "reason": "NotFound",
  "details": {
    "causes": [
      {
        "reason": "UnexpectedServerResponse",
        "message": "unknown"
      }
    ]
  },
  "code": 404
}]
Error from server (NotFound): the server could not find the requested resource
```
(and a `Kubernetes cluster "$name" not found` is logged on the server)

Previously, you could only ever get certs for a cluster if the name was valid and the user was authorized to see it; otherwise a consistent error was returned before the cert was ever issued. Would we consider this a resource name oracle?

It seems like the two responses probably ought to be identical, but changing them might be a mild breaking change to existing behavior (403 vs 404).
Also: don't allow path parameters to override the identity if it contains nonempty routing params.
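A sketch of what such a guard could look like. Only the `tlsca.Identity` field names are taken from Teleport; the function body is illustrative rather than the PR's implementation:

```go
package kubeproxy // hypothetical placement; illustrative only

import (
	"github.com/gravitational/teleport/lib/tlsca"
	"github.com/gravitational/trace"
)

// ensureRouteNotOverwritten refuses path parameters that conflict with
// routing information already present in the identity, so a certificate
// scoped to one route cannot be redirected elsewhere via the URL.
func ensureRouteNotOverwritten(ident *tlsca.Identity, teleportCluster, kubeCluster string) error {
	if ident.RouteToCluster != "" && ident.RouteToCluster != teleportCluster {
		return trace.AccessDenied("identity is already routed to Teleport cluster %q", ident.RouteToCluster)
	}
	if ident.KubernetesCluster != "" && ident.KubernetesCluster != kubeCluster {
		return trace.AccessDenied("identity is already routed to Kubernetes cluster %q", ident.KubernetesCluster)
	}
	return nil
}
```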
```go
name: "successful path routing to multiple clusters",
roleSpec: defaultRoleSpec,
assert: func(t *testing.T, restConfig *rest.Config) {
	clientB := pathRoutedKubeClient(t, restConfig, clusterName, "b")
	_, err = clientB.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
	require.NoError(t, err)

	clientA := pathRoutedKubeClient(t, restConfig, clusterName, "a")
	_, err := clientA.CoreV1().Pods(metav1.NamespaceDefault).List(context.Background(), metav1.ListOptions{})
	require.NoError(t, err)
},
```
This test triggers `ensureRouteNotOverwritten()` errors in a way that implies some state (`Identity.KubernetesCluster`, at least) is shared between requests. I haven't been able to reproduce this behavior on an actual cluster, so either:
- Something about the test environment/clients shares the identity object in a way not representative of reality, or
- The identity is sometimes cached between requests (just not in a way I can trigger), in which case I think this method of passing path parameters around by overwriting identity fields may not be viable.

Will need to spend more time debugging this. 😕
I found two issues with the tests:

- `GenTestKubeClientTLSCert` already fills in all of the Kubernetes cluster + route-to-cluster information.
- `restConfig` was edited by appending the route to the existing endpoint. This means the second request, to cluster A, will have `/v1/teleport/<teleportCluster>/<kubeClusterB>/v1/teleport/<teleportCluster>/<kubeClusterA>/` as the host; that's why it fails. We end up calling the route handler multiple times with different clusters, causing the identity to mismatch.

Check the fixes here: af3a02c
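For reference, a sketch of the shape of that fix: build each path-routed client from a fresh copy of the base `rest.Config`, so the prefix is set exactly once instead of being appended to an already-routed host. Names are hypothetical, and `routedPrefix` refers to the base64 encoding sketch earlier in this thread:

```go
package kubetest // hypothetical placement; illustrative only

import (
	"strings"

	"k8s.io/client-go/rest"
)

// pathRoutedRestConfig returns a copy of the base client config whose Host
// carries exactly one routing prefix, so repeated calls can never produce
// the doubled /v1/teleport/... path described above.
func pathRoutedRestConfig(base *rest.Config, teleportCluster, kubeCluster string) *rest.Config {
	cfg := rest.CopyConfig(base)
	cfg.Host = strings.TrimSuffix(base.Host, "/") + routedPrefix(teleportCluster, kubeCluster)
	return cfg
}
```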
Oh, that was definitely it, thanks for finding the issue! I was losing my mind a bit over that and it was just the client config getting reused 🤦

`GenTestKubeClientTLSCert` already accepts `""` for the `KubernetesCluster` value, and I believe most identities have `RouteToCluster` configured, so technically I think that was working, if a bit ugly. In any case, I'll add some test cases with an explicitly empty `RouteToCluster`.
Fixes #40405
Testing steps
1. Ensure a Teleport proxy is running this branch with one or more k8s clusters attached. The Kubernetes node does not need to be running this branch, only the proxy.
2. Build `tbot` against Machine ID: Support path-based Kubernetes routing #50898.
3. Create a new bot with appropriate permissions to access your kube node(s), e.g. `tctl bots add foo --roles=kube-access`.
4. Start `tbot` with the new `kubernetes/v2` service. The `--kubernetes-cluster-name` flag can be repeated. Alternatively (or additionally), try a label matcher. (A rough configuration sketch follows this list.)
5. Examine `./tbot-user/kubeconfig.yaml`. You should see one `cluster:` and one `context:` entry per matched cluster. The `server:` field should include a new URL suffix, `/v1/teleport/<teleportCluster>/<kubeCluster>`.
6. Try `KUBECONFIG=./tbot-user/kubeconfig.yaml kubectl get pods`. Also consider testing `kubectl exec`, port forward, etc.
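The configuration examples for step 4 were not captured above. As a rough, assumed sketch only (the real schema is defined by the linked tbot PR #50898), a `kubernetes/v2` service block might look like:

```yaml
# Illustrative only: an assumed tbot configuration shape for the new
# kubernetes/v2 service; consult PR #50898 for the actual schema.
version: v2
proxy_server: example.teleport.sh:443
onboarding:
  join_method: token
  token: <bot-token>
storage:
  type: directory
  path: ./tbot-data
services:
  - type: kubernetes/v2
    destination:
      type: directory
      path: ./tbot-user
    selectors:
      # select clusters by name...
      - name: kube-a
      - name: kube-b
      # ...or by label matcher
      - labels:
          env: dev
```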