[backend] Panic while connecting to default cache endpoint ml-pipeline.kubeflow:8887 #9702
Comments
/assign @Linchin
Hi @andre-lx, thank you for bringing up this issue. I tried the same pipeline on a newly deployed 2.0.0 cluster, and the run finished without issue. Looking at the log you provided, we have:
The metadata client seems to come from version 2.0.0-rc.2 instead of version 2.0.0. Could you double-check that you applied the manifest of version 2.0.0? Try applying the manifest again (here) and see if the issue persists.
Also, could you let me know how you deployed KFP, standalone or via Kubeflow?
Hi @Linchin, I just checked and we are using the image referenced in pipelines/manifests/kustomize/base/metadata/base/kustomization.yaml (lines 10 to 12 at commit e03e312).
The deployment was done using the following file: https://github.com/kubeflow/pipelines/blob/2.0.0/manifests/kustomize/env/platform-agnostic-multi-user/kustomization.yaml
Thanks
I have the same error. Here are the details.
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
I also have this issue in my Kubeflow 1.8 environment, which I deployed from the Kubeflow 1.8 manifests. Can someone fix this issue?
The same issue on Kubeflow 1.8.
I have faced a similar issue. I have a full Kubeflow 1.8 environment installed, and the pipeline backend metadata envoy is version 2.0.3. Is this issue resolved?
I've faced a similar issue, and it was due to a proxy setting on the pod/step. After removing the proxy setting, the issue was gone.
@umka1332 This solved the problem for me as well. But do you know a way I can still set proxy env vars to connect to the internet?
I just tested successfully that setting NO_PROXY to '*.kubeflow,*.local' works together with http(s)_proxy.
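For anyone who needs to keep the proxy for internet access, here is a minimal sketch of one way to attach these variables to a pipeline step with the KFP v2 SDK. It assumes a kfp release whose `PipelineTask` provides `set_env_variable`, and the proxy address is a placeholder:

```python
from kfp import dsl


@dsl.component
def fetch_data(url: str) -> str:
    # Placeholder component body; imagine it downloads something from the internet.
    return url


@dsl.pipeline(name="proxy-example")
def proxy_pipeline():
    task = fetch_data(url="https://example.com")
    # Corporate proxy for outbound internet traffic (placeholder address).
    task.set_env_variable(name="HTTP_PROXY", value="http://proxy.example.com:3128")
    task.set_env_variable(name="HTTPS_PROXY", value="http://proxy.example.com:3128")
    # Keep in-cluster traffic (e.g. ml-pipeline.kubeflow:8887) off the proxy.
    task.set_env_variable(name="NO_PROXY", value="*.kubeflow,*.local")
```

The key point from the comments above is the NO_PROXY entry covering *.kubeflow, so the step can still reach the in-cluster KFP endpoints without going through the proxy.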
If anyone following this can reliably reproduce this issue...
I also need to see the log from the second pod (the driver) that is started. Thanks.
How did you solve this? I tried to set the no_proxy environment variables but it did not work for me. @umka1332
The important thing is to set
Kubeflow 1.8.1 has the same problem.
I solved this problem by deleting the proxy settings; you have to remove the proxy. If you need extra packages, build an image that already includes them.
I tried running this but it did not work for me. Is there something I am missing here? @pschoen-itsc @umka1332
@suanshs It seems like you are having a different problem. If you don't have any proxies set to begin with, then you should not need the NO_PROXY settings either. Can you provide the logs of all the containers of the failing pod?
@pschoen-itsc
Following are the logs from the wait container:
Following are the logs from
@suanshs Do you also have logs from the Istio sidecar, or do you not have Istio deployed?
Thanks! This helped me a lot!
Hi, I'm facing the same issue when the istio-proxy sidecar is injected, and setting the NO_PROXY environment variable does not fix it. :(
Hi folks, I was able to resolve the issue. The root cause was related to the Istio sidecar injection I was using. I found a related issue here: istio/istio#23802. As suggested there, adding the following label to the container resolved the issue:
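The exact label is not quoted above, so as a purely hypothetical illustration, this is how a pod label can be attached to a task with the kfp-kubernetes extension (assuming a release that provides `add_pod_label`); `sidecar.istio.io/inject: "false"` is just a commonly used example value, not necessarily the label the commenter applied:

```python
from kfp import dsl
from kfp import kubernetes  # requires the kfp-kubernetes extension package


@dsl.component
def do_work() -> None:
    # Placeholder component body.
    pass


@dsl.pipeline(name="istio-label-example")
def labeled_pipeline():
    task = do_work()
    # Hypothetical label value: disable Istio sidecar injection for this pod.
    kubernetes.add_pod_label(
        task,
        label_key="sidecar.istio.io/inject",
        label_value="false",
    )
```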
Environment
How did you deploy KFP: Manifests
KFP version: 2.0.0
Steps to reproduce
Hello, we are trying to migrate from Pipelines 1.8.5 to 2.0.0, but after applying the manifests we are having some issues.
Running the "hello world" example from JupyterLab:
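For context, this is roughly the kind of hello-world pipeline being referred to; a minimal sketch with the KFP v2 SDK, where the component, pipeline, and parameter names are illustrative rather than the reporter's exact code:

```python
from kfp import compiler, dsl


@dsl.component
def say_hello(name: str) -> str:
    # Trivial component: print and return a greeting.
    message = f"Hello, {name}!"
    print(message)
    return message


@dsl.pipeline(name="hello-world")
def hello_pipeline(recipient: str = "World") -> str:
    hello_task = say_hello(name=recipient)
    return hello_task.output


if __name__ == "__main__":
    # Produces the pipeline.yaml mentioned below, which can also be uploaded through the UI.
    compiler.Compiler().compile(hello_pipeline, package_path="pipeline.yaml")
```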
Or running the generated `pipeline.yaml` from the result directly through the UI, we always get the following error on the third pod that is started. The service `ml-pipeline.kubeflow:8887` does exist. Everything works great on version 1.8.5.
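Submitting from JupyterLab typically goes through the KFP Python client rather than the UI; a minimal sketch of that route, where the host value and the pipeline argument are assumptions about this particular setup:

```python
import kfp

# Hypothetical in-cluster endpoint; in a multi-user Kubeflow deployment the
# client is usually created from inside a notebook pod with auth configured.
client = kfp.Client(host="http://ml-pipeline-ui.kubeflow")
run = client.create_run_from_pipeline_package(
    "pipeline.yaml",
    arguments={"recipient": "World"},
)
```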
If you need the logs from the other two pods, please let me know. I also checked the logs of all the Kubeflow services and I can't find any issue.
Impacted by this bug? Give it a 👍.