[frontend] The experiment names on the Compare Run page cannot be retrieved #11419

Open
LehmBook opened this issue Nov 29, 2024 · 5 comments

LehmBook commented Nov 29, 2024

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    local deployment
    Kubeflow manifest 1.9.1
    Kubernetes version: 1.30.3
    Deployed the multi-user version of Kubeflow with the kustomization file in the example folder.
    All images were replaced with images from local storage.
    GKE disabled

  • KFP version:
    Kubeflow Pipelines 2.3.0

Steps to reproduce

  1. Checked the kustomization file to see which version of Kubeflow Pipelines I deployed:
       # Kubeflow Pipelines
       - ../apps/pipeline/upstream/env/cert-manager/platform-agnostic-multi-user
  2. In the Kubeflow UI, open the pipeline experiments page, click an experiment, and click Compare. It says "failed to get associated experiment".
     [screenshot 1732847226836]
  3. With the browser developer tools (F12) open on the pipeline experiments page, clicking any experiment or trying Compare gives the following result:
     [screenshot 1732845184641]
     [screenshot 1732845708771]

The response says "List experiments failed: Failed to authorize with API: Invalid input error: An experiment cannot have an empty namespace in multi-user mode",
which refers to the code in the file backend/src/apiserver/server/experiment_server.go:
https://github.com/kubeflow/pipelines/blob/58b2d8d721378fca4f35079f2aaaed92ad6bd36f/backend/src/apiserver/server/experiment_server.go#L342C1-L343C1
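
To tell whether the backend or the frontend is at fault, it can help to replay the list request outside the UI. The sketch below only builds the two URLs to compare: one without a namespace, which is expected to hit the "empty namespace" check in multi-user mode, and one with a namespace. Everything here is an assumption rather than something confirmed in this thread: the HOST value, the /pipeline proxy prefix, the v2beta1 path, the "namespace" query parameter, and the hypothetical SESSION_COOKIE variable in the commented curl call.

```shell
#!/bin/sh
# Sketch only. Assumed: HOST points at the Kubeflow gateway, the KFP API is
# proxied under /pipeline, and v2beta1 ListExperiments accepts a "namespace"
# query parameter.
HOST="${HOST:-http://localhost:8080}"

experiments_url() {
  # $1: namespace; pass "" to mimic a request that carries no namespace
  if [ -n "$1" ]; then
    printf '%s/pipeline/apis/v2beta1/experiments?namespace=%s\n' "$HOST" "$1"
  else
    printf '%s/pipeline/apis/v2beta1/experiments\n' "$HOST"
  fi
}

# Print both URLs so they can be inspected before sending anything.
experiments_url ""
experiments_url "kubeflow-user-example-com"

# To actually send a request, supply your own session cookie (hypothetical
# variable; the cookie name and value depend on the auth setup):
#   curl -s -H "Cookie: $SESSION_COOKIE" "$(experiments_url kubeflow-user-example-com)"
```

If the request with a namespace succeeds while the UI still fails, that would point at the frontend dropping the namespace rather than at missing permissions.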

  4. Check whether the experiment has a namespace:
     [screenshot 1732862362126]

All the Kubeflow pods are running fine.
Also, I would like to know whether anyone else has encountered the same problem, or whether it is only me.

kubeflow-user-example-com   ctest-0                                                  2/2     Running            0             14d
kubeflow-user-example-com   jupyter-0                                                2/2     Running            0             42h
kubeflow-user-example-com   ml-pipeline-ui-artifact-d6bb58c54-grmxj                  2/2     Running            2 (16h ago)   20d
kubeflow-user-example-com   ml-pipeline-visualizationserver-745dbcb7fb-rfqpz         2/2     Running            0             20d
kubeflow-user-example-com   mnist-pipeline-tsdhj-system-container-impl-1440812143    1/2     ImagePullBackOff   0             7d23h
kubeflow-user-example-com   pvcviewer-codeserver-test-69b55fb6f7-4khh4               2/2     Running            0             14d
kubeflow-user-example-com   pvcviewer-test00-workspace-74cb74f89f-d4s8t              2/2     Running            0             17d
kubeflow-user-example-com   pvcviewer-workspace-8485f49f4f-wkmm7                     2/2     Running            0             2d23h
kubeflow-user-example-com   tensortst-8dcd5dcdb-h77qw                                2/2     Running            0             17d
kubeflow-user-example-com   test10-0                                                 2/2     Running            2 (3d ago)    15d
kubeflow                    admission-webhook-deployment-5c5959bd95-rnmr9            1/1     Running            0             20d
kubeflow                    cache-server-784dff7cfd-768b7                            2/2     Running            0             20d
kubeflow                    centraldashboard-56ff7d88c5-6c8dn                        2/2     Running            0             9d
kubeflow                    jupyter-web-app-deployment-648886f787-4zxx5              2/2     Running            0             13d
kubeflow                    katib-controller-f8df57f77-cqnxp                         1/1     Running            0             20d
kubeflow                    katib-db-manager-5dfb479d86-c9z22                        1/1     Running            0             20d
kubeflow                    katib-mysql-5c54f57f6b-4m2hj                             1/1     Running            0             20d
kubeflow                    katib-ui-6c587c9466-fnzhj                                2/2     Running            0             20d
kubeflow                    kserve-controller-manager-854b47d55d-wmvg4               2/2     Running            0             20d
kubeflow                    kserve-models-web-app-d7c86d76d-wqnfm                    2/2     Running            0             20d
kubeflow                    kubeflow-pipelines-profile-controller-7f78669c58-4r5pj   1/1     Running            0             20d
kubeflow                    metacontroller-0                                         1/1     Running            0             20d
kubeflow                    metadata-envoy-deployment-d4b98db88-qjk25                1/1     Running            0             20d
kubeflow                    metadata-grpc-deployment-6dd4c5fcb6-gvllw                2/2     Running            2 (20d ago)   20d
kubeflow                    metadata-writer-c98794db9-kh52q                          2/2     Running            0             20d
kubeflow                    minio-586bbdcc4-fcszh                                    2/2     Running            0             8d
kubeflow                    ml-pipeline-647947d6d4-5l7tv                             2/2     Running            0             20d
kubeflow                    ml-pipeline-persistenceagent-f4466586-q8wz4              2/2     Running            0             20d
kubeflow                    ml-pipeline-scheduledworkflow-55965d9f58-l4xxh           2/2     Running            0             20d
kubeflow                    ml-pipeline-ui-558979f6d4-2fj4j                          2/2     Running            0             18d
kubeflow                    ml-pipeline-viewer-crd-7db4586d47-z27fv                  2/2     Running            1 (20d ago)   20d
kubeflow                    ml-pipeline-visualizationserver-b687cd666-m6zdf          2/2     Running            0             20d
kubeflow                    mysql-675c9859d8-jklq4                                   2/2     Running            0             20d
kubeflow                    notebook-controller-deployment-546c7d4cb6-8qp5b          2/2     Running            2 (20d ago)   20d
kubeflow                    profiles-deployment-5cb7c65899-k77f5                     3/3     Running            1 (20d ago)   20d
kubeflow                    pvcviewer-controller-manager-d64c9564f-9vnv4             3/3     Running            0             20d
kubeflow                    tensorboard-controller-deployment-5c9c98bcdf-58gfk       3/3     Running            2 (17d ago)   17d
kubeflow                    tensorboards-web-app-deployment-6bbd97c7d4-22cjr         2/2     Running            0             20d
kubeflow                    training-operator-54f754bfcd-sq7k4                       1/1     Running            0             20d
kubeflow                    volumes-web-app-deployment-64cdf449dd-mfhk8              2/2     Running            0             17d
kubeflow                    workflow-controller-986648d9b-v4mzf                      2/2     Running            1 (13d ago)   13d
lsq-pace                    ml-pipeline-ui-artifact-d6bb58c54-bsggh                  2/2     Running            0             17d
lsq-pace                    ml-pipeline-visualizationserver-745dbcb7fb-6ssxg         2/2     Running            0             17d
oauth2-proxy                oauth2-proxy-57d8b58449-6vfvz                            1/1     Running            0             20d
oauth2-proxy                oauth2-proxy-57d8b58449-wjbk4                            1/1     Running            0             20d
tigera-operator             tigera-operator-76ff79f7fd-4bd7f                         1/1     Running            0             25d

Expected result

  1. The experiment of the runs is shown correctly on the Compare Run page.
  2. The experiment information is fetched correctly on the Compare Run page and on the pipeline experiment page for a given experiment ('my experiment').

Materials and Reference


Impacted by this bug? Give it a 👍.

LehmBook commented Dec 2, 2024

Temporary fix: I replaced the frontend:2.3.0 image with the frontend:2.2.0 image, and now the experiment in Compare Run is displayed correctly.

LehmBook changed the title from "[frontend] Kubeflow pipeline experiment compare run cannot retrieve experiment namespace" to "[frontend] The experiment names on the Compare Run page cannot be retrieved" on Dec 2, 2024

rnuzzo commented Dec 12, 2024

I'm having a similar issue when trying to access /#/runs or /#/experiments; it fails with this error:

{"error":"Failed to list runs: Failed to list runs due to authorization error. Check if you have permission to access namespace : Invalid input error: A run cannot have an empty namespace in multi-user mode","code":3,"message":"Failed to list runs: Failed to list runs due to authorization error. Check if you have permission to access namespace : Invalid input error: A run cannot have an empty namespace in multi-user mode","details":[{"@type":"type.googleapis.com/google.rpc.Status","code":3,"message":"A run cannot have an empty namespace in multi-user mode"}]}

Kubeflow was installed on EKS 1.30 using the manifests from /apps/pipeline/upstream/env/cert-manager/platform-agnostic-multi-user.

LehmBook commented

@rnuzzo Hi rnuzzo, if you are using Kubeflow 1.9.1, you could try downgrading the frontend image from 2.3.0 to 2.2.0; it seems the code for fetching runs and experiments changed in 2.3.0.
For a quick test, you can start by editing the ml-pipeline-ui deployment (IIRC) and swapping the frontend image to version 2.2.0.
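
The image swap described above could be scripted roughly as follows. This is a minimal sketch under stated assumptions, not a verified fix: it assumes the deployment and its container are both named ml-pipeline-ui in the kubeflow namespace, and that the image is published as gcr.io/ml-pipeline/frontend (replace IMAGE_REPO if you mirror images to a local registry, as in the original report). By default it only prints the commands; set APPLY=1 to run them.

```shell
#!/bin/sh
# Sketch: pin the KFP UI deployment to the 2.2.0 frontend image.
# Assumed names: deployment "ml-pipeline-ui", container "ml-pipeline-ui",
# image repository "gcr.io/ml-pipeline/frontend".
NAMESPACE="${NAMESPACE:-kubeflow}"
IMAGE_REPO="${IMAGE_REPO:-gcr.io/ml-pipeline/frontend}"
TARGET_TAG="${TARGET_TAG:-2.2.0}"

# Dry-run helper: print the command unless APPLY=1 is set.
run() {
  if [ "${APPLY:-0}" = "1" ]; then
    "$@"
  else
    printf '%s\n' "$*"
  fi
}

# Swap the container image, then wait for the rollout to finish.
run kubectl -n "$NAMESPACE" set image deployment/ml-pipeline-ui \
  "ml-pipeline-ui=${IMAGE_REPO}:${TARGET_TAG}"
run kubectl -n "$NAMESPACE" rollout status deployment/ml-pipeline-ui
```

Printing first makes it easy to confirm the namespace and image reference before anything touches the cluster. A later re-apply of the manifests would restore whatever tag they pin.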

rnuzzo commented Dec 12, 2024

Hi @LehmBook, I already tried that without success. Another thing I should mention is that I cannot even create experiments from the UI. I suspect that some Role/RoleBinding is missing in this config.

EDIT: Running latest version of kubeflow, so 1.9.1

LehmBook commented

@rnuzzo I deployed the complete Kubeflow from the manifest file ./example/kustomization.yaml
with
while ! kustomize build example | kubectl apply --server-side --force-conflicts -f -; do echo "Retrying to apply resources"; sleep 20; done

so my setup might not be helpful to you.
The Roles in my kubeflow namespace are as follows:

kubeflow          argo-role                                        2024-11-08T06:28:18Z
kubeflow          centraldashboard                                 2024-11-08T06:28:18Z
kubeflow          jupyter-web-app-jupyter-notebook-role            2024-11-08T06:28:18Z
kubeflow          kserve-leader-election-role                      2024-11-08T06:28:18Z
kubeflow          kubeflow-pipelines-cache-role                    2024-11-08T06:28:18Z
kubeflow          kubeflow-pipelines-metadata-writer-role          2024-11-08T06:28:18Z
kubeflow          ml-pipeline                                      2024-11-08T06:28:18Z
kubeflow          ml-pipeline-persistenceagent-role                2024-11-08T06:28:18Z
kubeflow          ml-pipeline-scheduledworkflow-role               2024-11-08T06:28:18Z
kubeflow          ml-pipeline-ui                                   2024-11-08T06:28:18Z
kubeflow          ml-pipeline-viewer-controller-role               2024-11-08T06:28:18Z
kubeflow          notebook-controller-leader-election-role         2024-11-08T06:28:18Z
kubeflow          pipeline-runner                                  2024-11-08T06:28:18Z
kubeflow          profiles-leader-election-role                    2024-11-08T06:28:18Z
kubeflow          pvcviewer-leader-election-role                   2024-11-08T06:28:18Z
kubeflow          tensorboard-controller-leader-election-role      2024-11-08T06:28:18Z

If you need me to check anything else, just let me know.
By the way, there are a few more places that use the frontend image as well:

./apps/kfp-tekton/upstream/v1/base/installs/multi-user/pipelines-profile-controller/sync.py
./apps/kfp-tekton/upstream/base/installs/multi-user/pipelines-profile-controller/sync.py
./apps/pipeline/upstream/base/installs/multi-user/pipelines-profile-controller/sync.py

and

./apps/kfp-tekton/upstream/base/installs/multi-user/pipelines-profile-controller/test_sync.py
./apps/kfp-tekton/upstream/v1/base/installs/multi-user/pipelines-profile-controller/test_sync.py
./apps/pipeline/upstream/base/installs/multi-user/pipelines-profile-controller/test_sync.py

I'm not sure if they are relevant.
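
A list like the one above can be reproduced from a manifests checkout with a small search. The sketch below is illustrative: the function name find_frontend_refs is made up for this example, and the search term "frontend" is only a guess at how the image is referenced in those files.

```shell
#!/bin/sh
# Sketch: list sync.py / test_sync.py files under a directory that mention
# the word "frontend". The function name is hypothetical.
find_frontend_refs() {
  # $1: root directory to search, e.g. ./apps
  find "$1" -type f \( -name 'sync.py' -o -name 'test_sync.py' \) \
    -exec grep -l 'frontend' {} + | sort
}

# Run against ./apps only when it exists, so the script is safe to source
# from anywhere.
if [ -d ./apps ]; then
  find_frontend_refs ./apps
fi
```

The files in the list above all sit under pipelines-profile-controller, so a downgrade of the main UI deployment alone may leave them pointing at a different frontend tag.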
