This is Istio's recommended approach for External Authorization [1]. It is not limited to the use of `oauth2-proxy` [2] alone. This method is an industry standard, meeting all of Kubeflow's current and foreseeable authentication needs.
The Kubeflow Pipelines component relies on built-in Kubernetes functionality to authenticate and authorize user requests, specifically TokenReview [3] and SubjectAccessReview [4].
The best way to describe how it works is with an example. Let's analyze the flow when a client calls the API to list the KF Pipeline runs:
- api-server starts endpoints in:
  https://github.com/kubeflow/pipelines/blob/2.0.5/backend/src/apiserver/main.go#L95
  Focusing on the pipelines run service:
  - Register Run Service:
    - proto RPC definition of ListRunsV1
    - code definition of ListRunsV1
  - ListRunsV1 calls internal method `listRuns`.
  - `listRuns` calls internal method `canAccessRun`, which itself calls `s.resourceManager.IsAuthorized`.
  - `ResourceManager.IsAuthorized` first tries to authenticate over every available authenticator, which are the `TokenReviewAuthenticator` and `HTTPHeaderAuthenticator`.
    - Here the user identity is either the user email provided directly in the `kubeflow-userid` header or the user identity obtained from the provided token.
    - https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/resource/resource_manager.go#L1667
  - `TokenReviewAuthenticator.GetUserIdentity` gets the token from the `Authorization` header and calls the K8s Auth `authv1.TokenReview` with the given token, which in return provides `userInfo := review.Status.User`. `GetUserIdentity` returns `userInfo.Username`, which at this point is `system:serviceaccount:default:default`.
  - Next, in `ResourceManager.IsAuthorized`, a SubjectAccessReview is created with `r.subjectAccessReviewClient.Create`, with arguments specifying the RBAC verbs provided in the code definition of `RunServer.listRuns`. If the user (sa) is not authorized, an error is thrown.
    - https://github.com/kubeflow/pipelines/blob/master/backend/src/apiserver/resource/resource_manager.go#L1703
    - If the identity was obtained from a token (service account), `rolebinding.rbac.authorization.k8s.io/default-editor` provides the RBAC permission.
    - If the identity was obtained from the header (user), `rolebinding.rbac.authorization.k8s.io/user-example-com` or similar provides the RBAC permission.
- The user calls the API to list pipeline runs as an unauthorized service account.
  - This can be done by running a Pod with curl in the `default` namespace:

        $ kubectl -n default run -ti --rm curl --image curlimages/curl --command -- sh

        # v1beta1
        ~ $ curl "istio-ingressgateway.istio-system/pipeline/apis/v1beta1/runs?resource_reference_key.type=NAMESPACE&resource_reference_key.id=kubeflow-user-example-com" -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)"
        {"error":"Failed to list v1beta1 runs: Failed to list runs due to authorization error. Check if you have permission to access namespace kubeflow-user-example-com: Failed to access run . Check if you have access to namespace kubeflow-user-example-com: PermissionDenied: User 'system:serviceaccount:default:default' is not authorized with reason: (request: \u0026ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,}): Unauthorized access","code":7,"message":"Failed to list v1beta1 runs: Failed to list runs due to authorization error. Check if you have permission to access namespace kubeflow-user-example-com: Failed to access run . Check if you have access to namespace kubeflow-user-example-com: PermissionDenied: User 'system:serviceaccount:default:default' is not authorized with reason: (request: \u0026ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,}): Unauthorized access","details":[{"@type":"type.googleapis.com/google.rpc.Status","code":7,"message":"User 'system:serviceaccount:default:default' is not authorized with reason: (request: \u0026ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,})"}]}

        # v2beta1
        ~ $ curl istio-ingressgateway.istio-system/pipeline/apis/v2beta1/runs?namespace=kubeflow-user-example-com -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)"
        {"error":"Failed to list runs: Failed to list runs due to authorization error. Check if you have permission to access namespace kubeflow-user-example-com: Failed to access run . Check if you have access to namespace kubeflow-user-example-com: PermissionDenied: User 'system:serviceaccount:default:default' is not authorized with reason: (request: \u0026ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,}): Unauthorized access","code":7,"message":"Failed to list runs: Failed to list runs due to authorization error. Check if you have permission to access namespace kubeflow-user-example-com: Failed to access run . Check if you have access to namespace kubeflow-user-example-com: PermissionDenied: User 'system:serviceaccount:default:default' is not authorized with reason: (request: \u0026ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,}): Unauthorized access","details":[{"@type":"type.googleapis.com/google.rpc.Status","code":7,"message":"User 'system:serviceaccount:default:default' is not authorized with reason: (request: \u0026ResourceAttributes{Namespace:kubeflow-user-example-com,Verb:list,Group:pipelines.kubeflow.org,Version:v1beta1,Resource:runs,Subresource:,Name:,})"}]}
- The user calls the API to list pipeline runs as an authorized service account.
  - This can be done by running a Pod with curl in the `kubeflow-user-example-com` namespace, specifying the correct service account:

        $ kubectl -n kubeflow-user-example-com run -ti --rm curl --image curlimages/curl --command --overrides='{"spec": {"serviceAccountName": "default-editor"}}' -- sh

        # v1beta1
        ~ $ curl "istio-ingressgateway.istio-system/pipeline/apis/v1beta1/runs?resource_reference_key.type=NAMESPACE&resource_reference_key.id=kubeflow-user-example-com" -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)"
        {} # empty response which is fine because no pipeline runs exist

        # v2beta1
        ~ $ curl istio-ingressgateway.istio-system/pipeline/apis/v2beta1/runs?namespace=kubeflow-user-example-com -H "Authorization: Bearer $(cat /run/secrets/kubernetes.io/serviceaccount/token)"
        {} # empty response which is fine because no pipeline runs exist
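
The authenticate-then-authorize flow walked through above can be condensed into a short Python sketch. It is only an illustration of the control flow, not the Go implementation in the api-server: the function names are hypothetical, the TokenReview and SubjectAccessReview calls are injected as plain callables so the example runs without a cluster, the order in which the authenticators are tried is an assumption, and the header lookup uses a plain case-sensitive mapping.

```python
from typing import Callable, Mapping, Optional

USER_HEADER = "kubeflow-userid"  # header named in the walk-through above


def identity_from_header(headers: Mapping[str, str]) -> Optional[str]:
    """Header-based authenticator: use the identity passed in the user-id header."""
    return headers.get(USER_HEADER) or None


def identity_from_token(
    headers: Mapping[str, str],
    review_token: Callable[[str], Optional[str]],
) -> Optional[str]:
    """Token-based authenticator: hand the bearer token to a TokenReview callable."""
    scheme, _, token = headers.get("Authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        return None
    # For a service account this resolves to e.g. "system:serviceaccount:default:default".
    return review_token(token)


def subject_access_review(user: str, namespace: str, verb: str) -> dict:
    """Build a SubjectAccessReview body with the attributes seen in the error output."""
    return {
        "apiVersion": "authorization.k8s.io/v1",
        "kind": "SubjectAccessReview",
        "spec": {
            "user": user,
            "resourceAttributes": {
                "namespace": namespace,
                "verb": verb,
                "group": "pipelines.kubeflow.org",
                "version": "v1beta1",
                "resource": "runs",
            },
        },
    }


def is_authorized(
    headers: Mapping[str, str],
    namespace: str,
    verb: str,
    review_token: Callable[[str], Optional[str]],
    review_access: Callable[[dict], bool],
) -> str:
    """Try each authenticator in turn, then authorize the identity via RBAC."""
    user = identity_from_header(headers) or identity_from_token(headers, review_token)
    if user is None:
        raise PermissionError("unauthenticated request")
    if not review_access(subject_access_review(user, namespace, verb)):
        raise PermissionError(f"User '{user}' is not authorized")
    return user
```

With a `review_access` callable that only accepts the `default-editor` service account, a token resolving to `system:serviceaccount:default:default` raises `PermissionError`, which mirrors the two curl sessions above.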
Authentication in Kubeflow has evolved over time: in Kubeflow 1.9 we dropped EnvoyFilters and oidc-authservice in favor of RequestAuthentication and oauth2-proxy.
You can adjust OAuth2 Proxy to connect directly to your own IdP (Identity Provider), such as GCP, AWS, Azure, etc.:
- Create an application on your IdP (purple line)
- Change your OAuth2 Proxy issuer to your IdP. Never edit the manifests directly for this; use kustomize overlays and components.
- The istio-system namespace contains a RequestAuthentication resource. Change its issuer to your own IdP or, even better, create an additional one.
- You can now directly issue a token from your IdP and use this token to access your Kubeflow platform.
This feature is useful when you need to integrate Kubeflow with your current CI/CD platform (GitHub Actions, Jenkins) via machine-to-machine authentication.
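
As a rough illustration of the "create an additional one" step, the sketch below builds an extra RequestAuthentication manifest as a Python dictionary and prints it as JSON, which `kubectl apply -f -` accepts. The resource name, issuer, and JWKS URL are placeholders, and the `security.istio.io/v1beta1` API version is an assumption about the installed Istio. A real deployment would likely also need a workload selector and claim-to-header mappings matching the existing resource, and would be applied through a kustomize overlay as described above.

```python
import json


def request_authentication(name: str, issuer: str, jwks_uri: str) -> dict:
    """Build an Istio RequestAuthentication that accepts JWTs from one issuer."""
    return {
        "apiVersion": "security.istio.io/v1beta1",  # assumed API version
        "kind": "RequestAuthentication",
        "metadata": {"name": name, "namespace": "istio-system"},
        "spec": {
            "jwtRules": [
                {
                    "issuer": issuer,      # must equal the `iss` claim of the tokens
                    "jwksUri": jwks_uri,   # where Istio fetches the signing keys
                    "forwardOriginalToken": True,
                }
            ]
        },
    }


if __name__ == "__main__":
    manifest = request_authentication(
        name="my-idp-jwt",  # hypothetical resource name
        issuer="https://your-idp.com/",  # placeholder issuer
        jwks_uri="https://your-idp.com/.well-known/jwks.json",  # placeholder URL
    )
    print(json.dumps(manifest, indent=2))
```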
Example for obtaining and using a JWT from your IdP:

    import requests
    import kfp

    token_url = "https://your-idp.com/oauth/token"
    client_id = "YOUR_CLIENT_ID"
    client_secret = "YOUR_CLIENT_SECRET"
    username = "YOUR_USERNAME"
    password = "YOUR_PASSWORD"

    # request header
    headers = {
        "Content-Type": "application/x-www-form-urlencoded"
    }
    data = {
        "grant_type": "password",
        "client_id": client_id,
        "client_secret": client_secret,
        "username": username,
        "password": password,
        "scope": "openid profile email",  # change your scope
    }

    # request a token from the IdP and fail early on an HTTP error
    response = requests.post(token_url, headers=headers, data=data)
    response.raise_for_status()
    TOKEN = response.json()["access_token"]

    # use the token to authenticate the KFP client
    kubeflow_host = "https://your_host"
    pipeline_host = kubeflow_host + "/pipeline"
    client = kfp.Client(host=pipeline_host, existing_token=TOKEN)
    print(client.list_runs(namespace="your-profile-name"))
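
When a token obtained this way is rejected, a common cause is an `iss` claim that does not match the issuer configured in the RequestAuthentication, or an expired token. The sketch below decodes the token payload locally to inspect those claims. It assumes the IdP issues JWT-formatted access tokens (not every IdP does), the helper names are hypothetical, and it does not verify the signature, so it is suitable for debugging only.

```python
import base64
import json
import time
from typing import Optional


def jwt_claims(token: str) -> dict:
    """Decode the payload segment of a JWT WITHOUT verifying its signature."""
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("expected a JWT with three dot-separated segments")
    payload = parts[1]
    padded = payload + "=" * (-len(payload) % 4)  # restore stripped base64 padding
    return json.loads(base64.urlsafe_b64decode(padded))


def check_token(token: str, expected_issuer: str, now: Optional[float] = None) -> dict:
    """Return the claims if issuer and expiry look usable, else raise ValueError."""
    claims = jwt_claims(token)
    if claims.get("iss") != expected_issuer:
        raise ValueError(f"issuer mismatch: {claims.get('iss')!r} != {expected_issuer!r}")
    current = time.time() if now is None else now
    if claims.get("exp", 0) <= current:
        raise ValueError("token has expired")
    return claims
```

For example, `check_token(TOKEN, "https://your-idp.com/")` could be called before constructing the `kfp.Client`, with the issuer string copied from the RequestAuthentication resource.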
The underlying mechanism for Kubeflow Notebooks is the same as in Kubeflow Pipelines.
Similarly, to explain how it works, let's analyze the code step by step, starting from the API route definition for listing notebooks:
- list notebooks API route definition
  - https://github.com/kubeflow/kubeflow/blob/v1.8.0/components/crud-web-apps/jupyter/backend/apps/common/routes/get.py#L53
  - this calls `crud_backend/api/notebook.py::list_notebooks`
- `crud_backend/api/notebook.py::list_notebooks` calls `authz.ensure_authorized`
- `crud_backend/authz.py::ensure_authorized` calls `crud_backend/authn.py::get_username`
  - https://github.com/kubeflow/kubeflow/blob/v1.8.0/components/crud-web-apps/common/backend/kubeflow/kubeflow/crud_backend/authz.py#L101
  - https://github.com/kubeflow/kubeflow/blob/v1.8.0/components/crud-web-apps/common/backend/kubeflow/kubeflow/crud_backend/authn.py#L12
- `crud_backend/authn.py::get_username` gets the user id from the userid header (email or sa in the format `system:serviceaccount:kubeflowusernamespace:default-editor`)
- `crud_backend/authz.py::ensure_authorized` calls `crud_backend/authz.py::is_authorized`
  - https://github.com/kubeflow/kubeflow/blob/v1.8.0/components/crud-web-apps/common/backend/kubeflow/kubeflow/crud_backend/authz.py#L46
  - this calls `create_subject_access_review`, which uses the same mechanism as pipelines with `r.subjectAccessReviewClient.Create`
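
The call chain above differs from the Pipelines flow in one respect: the identity comes only from the user-id header, with no TokenReview step. A minimal sketch of that shape follows. It is not the code from `crud_backend`; the header name, the optional prefix, the exception classes, and the `kubeflow.org/v1` `notebooks` resource attributes are assumptions made for illustration, and the SubjectAccessReview call is injected as a callable so the example runs without a cluster.

```python
from typing import Callable, Mapping

USER_HEADER = "kubeflow-userid"  # assumption: header name used by the deployment
USER_PREFIX = ""                 # assumption: optional prefix stripped from the value


class Unauthorized(Exception):
    """No user identity could be read from the request."""


class Forbidden(Exception):
    """The identified user lacks the required RBAC permission."""


def get_username(headers: Mapping[str, str]) -> str:
    """Read the user id (email or service account name) from the user-id header."""
    value = headers.get(USER_HEADER)
    if not value:
        raise Unauthorized(f"missing {USER_HEADER} header")
    return value[len(USER_PREFIX):] if value.startswith(USER_PREFIX) else value


def ensure_authorized(
    headers: Mapping[str, str],
    verb: str,
    namespace: str,
    create_subject_access_review: Callable[..., bool],
) -> None:
    """Raise unless a SubjectAccessReview allows the user to act on notebooks."""
    user = get_username(headers)
    allowed = create_subject_access_review(
        user=user,
        verb=verb,
        namespace=namespace,
        group="kubeflow.org",   # assumed API group of the Notebook resource
        version="v1",           # assumed API version
        resource="notebooks",
    )
    if not allowed:
        raise Forbidden(f"User '{user}' is not authorized to {verb} notebooks in {namespace}")
```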
The analysis of KServe auth capabilities suggests that while it's possible to limit access to only authenticated agents, there might be some improvements required to enable access only to authorized agents.
This is based on the following:
- KServe Controller Manager patch integrating kube-rbac-proxy [5].
  This suggests that KServe might use the same mechanism based on `SubjectAccessReviews`. However, a look at kubeflow/manifests shows that it is not enabled.
- Search through the docs and code:
  - https://github.com/kserve/kserve/tree/v0.12.0/docs/samples/istio-dex
  - https://github.com/kserve/kserve/tree/v0.12.0/docs/samples/gcp-iap

  The docs above mention that while it's possible to enable authentication, authorization is more complicated, and we probably need to add an `AuthorizationPolicy`:
  "...create an Istio AuthorizationPolicy to grant access to the pods or disable it"

Most probably, some work is needed to enable authorized access to KServe models.
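
To make the missing piece more concrete, here is a hedged sketch of what such an Istio AuthorizationPolicy could look like, built as a Python dictionary and printed as JSON. It is not a tested KServe configuration: the policy name, the `serving.kserve.io/inferenceservice` pod label used in the selector, the `security.istio.io/v1beta1` API version, and the example principal are all assumptions, and a serverless (Knative) deployment may need extra rules for intermediate components.

```python
import json
from typing import Iterable


def authorization_policy(
    name: str,
    namespace: str,
    inference_service: str,
    allowed_principals: Iterable[str],
) -> dict:
    """Build an ALLOW policy restricting a model's pods to given JWT principals."""
    return {
        "apiVersion": "security.istio.io/v1beta1",  # assumed API version
        "kind": "AuthorizationPolicy",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # Assumed label that selects the pods of one InferenceService.
            "selector": {
                "matchLabels": {"serving.kserve.io/inferenceservice": inference_service}
            },
            "action": "ALLOW",
            "rules": [
                {
                    # requestPrincipals are "<issuer>/<subject>" of a validated JWT.
                    "from": [{"source": {"requestPrincipals": list(allowed_principals)}}]
                }
            ],
        },
    }


if __name__ == "__main__":
    policy = authorization_policy(
        name="allow-ci-to-model",  # hypothetical policy name
        namespace="kubeflow-user-example-com",
        inference_service="my-model",  # hypothetical InferenceService name
        allowed_principals=["https://your-idp.com//ci-client"],  # placeholder principal
    )
    print(json.dumps(policy, indent=2))
```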