
[backend] Panic while connection to default cache endpoint ml-pipeline.kubeflow:8887 #9702

Open
andre-lx opened this issue Jul 5, 2023 · 27 comments


@andre-lx

andre-lx commented Jul 5, 2023

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    Manifests
  • KFP version:
    2.0.0
  • KFP SDK version:
kfp                   2.0.1
kfp-pipeline-spec     0.2.2
kfp-server-api        2.0.0

Steps to reproduce

Hello, we are trying to migrate from Pipelines 1.8.5 to 2.0.0, but after applying the manifests we are having some issues.

Running the "hello world" example from JupyterLab:

import kfp
from kfp import dsl

@dsl.component
def say_hello(name: str) -> str:
    hello_text = f'Hello, {name}!'
    print(hello_text)
    return hello_text

@dsl.pipeline
def hello_pipeline(recipient: str) -> str:
    hello_task = say_hello(name=recipient)
    return hello_task.output

from kfp import compiler

compiler.Compiler().compile(hello_pipeline, 'pipeline.yaml')

from kfp.client import Client

client = Client()
run = client.create_run_from_pipeline_package(
    'pipeline.yaml',
    arguments={
        'recipient': 'World',
    },
)

Or, running the generated pipeline.yaml directly through the UI, we always get the following error on the third pod that is started:

time="2023-07-05T14:19:23.912Z" level=info msg="capturing logs" argo=true
time="2023-07-05T14:19:23.945Z" level=info msg="capturing logs" argo=true
I0705 14:19:23.966873      51 launcher_v2.go:90] input ComponentSpec:{
  "inputDefinitions": {
    "parameters": {
      "name": {
        "parameterType": "STRING"
      }
    }
  },
  "outputDefinitions": {
    "parameters": {
      "Output": {
        "parameterType": "STRING"
      }
    }
  },
  "executorLabel": "exec-say-hello"
}
I0705 14:19:23.967498      51 cache.go:139] Cannot detect ml-pipeline in the same namespace, default to ml-pipeline.kubeflow:8887 as KFP endpoint.
I0705 14:19:23.967512      51 cache.go:116] Connecting to cache endpoint ml-pipeline.kubeflow:8887
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x941c29]

goroutine 1 [running]:
github.com/kubeflow/pipelines/backend/src/v2/metadata.(*Client).PublishExecution(0xc000b29920, {0x20a4878, 0xc000058040}, 0x0, 0x0, {0x0, 0x0, 0xc000b60000?}, 0x4)
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/metadata/client.go:388 +0x69
github.com/kubeflow/pipelines/backend/src/v2/component.(*LauncherV2).publish(0x1d3c167?, {0x20a4878?, 0xc000058040?}, 0x1?, 0x1?, {0x0?, 0x1a51660?, 0xc0006a63a0?}, 0xc73bb0?)
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/component/launcher_v2.go:266 +0x9b
github.com/kubeflow/pipelines/backend/src/v2/component.(*LauncherV2).Execute.func2()
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/component/launcher_v2.go:144 +0x65
github.com/kubeflow/pipelines/backend/src/v2/component.(*LauncherV2).Execute(0xc00028e540, {0x20a4878, 0xc000058040})
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/component/launcher_v2.go:156 +0x91e
main.run()
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/cmd/launcher-v2/main.go:98 +0x3ed
main.main()
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/cmd/launcher-v2/main.go:47 +0x19
time="2023-07-05T14:19:24.950Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 2
time="2023-07-05T14:19:25.918Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 2

The service ml-pipeline.kubeflow:8887 exists.

Everything works great on version 1.8.5.

If you need the logs from the other two pods, please let me know. I also checked the logs of all the Kubeflow services and I can't find any issue.


@zijianjoy
Collaborator

/assign @Linchin

@Linchin
Contributor

Linchin commented Jul 13, 2023

Hi @andre-lx, thank you for bringing up this issue. I tried the same pipeline on a newly deployed 2.0.0 cluster, and the run finished without issue. Looking at the log you provided, we have:

github.com/kubeflow/pipelines/backend/src/v2/metadata.(*Client).PublishExecution(0xc000b29920, {0x20a4878, 0xc000058040}, 0x0, 0x0, {0x0, 0x0, 0xc000b60000?}, 0x4)
/go/src/github.com/kubeflow/pipelines/backend/src/v2/metadata/client.go:388 +0x69

The metadata client seems to come from version 2.0.0-rc.2 instead of version 2.0.0. Could you double-check that you applied the manifest for version 2.0.0? Try applying the manifest again (here) and see if the issue persists.

@Linchin
Contributor

Linchin commented Jul 14, 2023

Also, could you let me know which way you used to deploy KFP, standalone or via kubeflow?

@andre-lx
Author

Hi @Linchin, I just checked and we are using the following image:

images:
- name: gcr.io/ml-pipeline/metadata-envoy
  newTag: 2.0.0

The deployment was done using the following file: https://github.com/kubeflow/pipelines/blob/2.0.0/manifests/kustomize/env/platform-agnostic-multi-user/kustomization.yaml

Thanks

@nithin8702

Hi @andre-lx @Linchin
We are also facing the same issue. Did you get a chance to fix it?

@andre-lx
Author

Hi @andre-lx @Linchin We are also facing the same issue. Did you get a chance to fix it?

I had to revert it to 1.8.5 for now.


@halilagin
Copy link

I have the same error. Here are the details.

  1. Running in standalone mode
  2. Running in virtual cluster (everything is working but cannot run pipelines)
  3. All pods are working
  4. I can upload and run pipelines on UI, but the pod is failing
  5. Using the pipelines version 2.0.0
  6. Generating the pipeline with the command below
    kfp dsl compile --py v2/hello_world.py --output hello_world.pipeline.json

@chensun chensun self-assigned this Aug 8, 2023
@chensun chensun moved this to P1 in KFP v2 Aug 8, 2023
@Linchin Linchin removed their assignment Aug 28, 2023

github-actions bot commented Nov 7, 2023

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale label Nov 7, 2023
@pffijt

pffijt commented Dec 7, 2023

I also have this issue in my Kubeflow 1.8 environment.
Kubeflow 1.8 uses the pipelines backend 2.0.3.

I deployed my environment with the Kubeflow 1.8 manifests.

Can someone fix this issue?

@stale stale bot removed the lifecycle/stale label Dec 7, 2023
@taiynlee

The same issue on Kubeflow 1.8.

@svn123

svn123 commented May 14, 2024

I have faced a similar issue. I have a full Kubeflow 1.8 environment installed and the pipeline backend metadata-envoy is version 2.0.3. Is this issue resolved?

@umka1332

umka1332 commented Jun 1, 2024

I've faced a similar issue, and it was due to a proxy setting on the pod/step. After removing the proxy setting, the issue was gone.

@pschoen-itsc

@umka1332 This solved the problem for me also. But do you know a way I can still set proxy env vars to connect to the internet?

@pschoen-itsc

pschoen-itsc commented Jun 24, 2024

Just tested successfully that setting NO_PROXY to '*.kubeflow,*.local' works together with http(s)_proxy.
It makes sense that the connection to ml-pipeline fails without NO_PROXY, because then all traffic is routed through the given proxy. It is just strange that it seemed to work before updating Kubeflow.
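For anyone who needs both, a minimal sketch with the KFP v2 SDK of what that combination could look like; the proxy address http://proxy.example.com:3128 is only a placeholder for your own environment, not something from this thread:

from kfp import dsl

@dsl.component
def say_hello():
    print('Hello!')

@dsl.pipeline
def hello_pipeline():
    task = say_hello()
    # Outbound internet traffic still goes through the corporate proxy (placeholder address).
    task.set_env_variable(name='HTTP_PROXY', value='http://proxy.example.com:3128')
    task.set_env_variable(name='HTTPS_PROXY', value='http://proxy.example.com:3128')
    # In-cluster traffic (ml-pipeline, metadata-grpc-service, ...) bypasses the proxy.
    task.set_env_variable(name='NO_PROXY', value='*.kubeflow,*.local')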

@gregsheremeta
Contributor

If anyone following this can reliably reproduce this issue...

we always get the following error on the third pod that is started

I also need to see the log on the second pod (driver) that is started. Thanks.

@suanshs

suanshs commented Aug 28, 2024

@umka1332 This solved the problem for me also. But do you know a way I can still set proxy env vars to connect to the internet?

Just tested successfully that setting NO_PROXY to '*.kubeflow,*.local' works together with http(s)_proxy. It makes sense that the connection to ml-pipeline fails without NO_PROXY, because then all traffic is routed through the given proxy. It is just strange that it seemed to work before updating Kubeflow.

How did you solve this? I tried to set the no_proxy environment variables but it did not work for me. @umka1332

@pschoen-itsc

@umka1332 This solved the problem for me also. But do you know a way I can still set proxy env vars to connect to the internet?

Just tested successfully that setting NO_PROXY to '*.kubeflow,*.local' works together with http(s)_proxy. It makes sense that the connection to ml-pipeline fails without NO_PROXY, because then all traffic is routed through the given proxy. It is just strange that it seemed to work before updating Kubeflow.

How did you solve this? I tried to set the no_proxy environment variables but it did not work for me. @umka1332

The important thing is to set NO_PROXY (all uppercase). Also, I had to add the kube API server IP to NO_PROXY.
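Continuing the earlier sketch, a hedged variant of that NO_PROXY value; 10.96.0.1 is only a placeholder for the ClusterIP of the kubernetes service in your cluster (kubectl get svc kubernetes -n default):

# Placeholder ClusterIP of the kube API server service; replace with your cluster's value.
KUBE_API_SERVER_IP = '10.96.0.1'

# Uppercase NO_PROXY, with the API server IP and in-cluster suffixes excluded from the proxy.
task.set_env_variable(
    name='NO_PROXY',
    value=f'{KUBE_API_SERVER_IP},*.kubeflow,*.local,localhost,127.0.0.1',
)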

@stevenkitter

Kubeflow 1.8.1 has the same problem.

@stevenkitter

I solved this problem by deleting the proxy settings. You have to remove the proxy; if you still need packages, build an image that already contains them.

@suanshs

suanshs commented Aug 28, 2024

from kfp import dsl
from kfp import compiler

@dsl.component()
def say_hello():
    import time
    time.sleep(1900)
    hello_text = 'Hello!'
    print(hello_text)

@dsl.pipeline
def hello_pipeline():
    hello_task = say_hello()
    hello_task.set_env_variable(name='NO_PROXY', value='*.kubeflow,*.local')
    hello_task.set_env_variable(name='no_proxy', value='*.kubeflow,*.local')
    hello_task.set_caching_options(False)

compiler.Compiler().compile(hello_pipeline, package_path='pipeline.yaml')

I tried running this but it did not work for me. Is there something I am missing here? @pschoen-itsc @umka1332

@pschoen-itsc

@suanshs Seems like you are having a different problem. If you don't have any proxies set to begin with, then you also should not need the NO_PROXY settings. Can you provide logs of all the containers of the failing pod?

@suanshs

suanshs commented Aug 28, 2024

@pschoen-itsc
Following are the logs from the main container of the failing pod:

time="2024-08-28T14:19:16.866Z" level=info msg="capturing logs" argo=true
time="2024-08-28T14:19:16.900Z" level=info msg="capturing logs" argo=true
I0828 14:19:16.922099      53 launcher_v2.go:90] input ComponentSpec:{
  "executorLabel": "exec-say-hello"
}
I0828 14:19:16.922671      53 cache.go:116] Connecting to cache endpoint ml-pipeline.kubeflow:8887
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x941c29]

goroutine 1 [running]:
github.com/kubeflow/pipelines/backend/src/v2/metadata.(*Client).PublishExecution(0xc000afc720, {0x20a4878, 0xc000196000}, 0x0, 0x0, {0x0, 0x0, 0xc0004dc000?}, 0x4)
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/metadata/client.go:388 +0x69
github.com/kubeflow/pipelines/backend/src/v2/component.(*LauncherV2).publish(0x467387?, {0x20a4878?, 0xc000196000?}, 0x1?, 0x1?, {0x0?, 0x1a51660?, 0xc0004c6060?}, 0xbbfbb0?)
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/component/launcher_v2.go:266 +0x9b
github.com/kubeflow/pipelines/backend/src/v2/component.(*LauncherV2).Execute.func2()
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/component/launcher_v2.go:144 +0x65
github.com/kubeflow/pipelines/backend/src/v2/component.(*LauncherV2).Execute(0xc000306460, {0x20a4878, 0xc000196000})
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/component/launcher_v2.go:156 +0x91e
main.run()
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/cmd/launcher-v2/main.go:98 +0x3ed
main.main()
	/go/src/github.com/kubeflow/pipelines/backend/src/v2/cmd/launcher-v2/main.go:47 +0x19
time="2024-08-28T14:19:17.903Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 2
time="2024-08-28T14:19:18.871Z" level=info msg="sub-process exited" argo=true error="<nil>"
Error: exit status 2

Following are the logs from the wait container:

time="2024-08-28T14:19:16.138Z" level=info msg="Starting Workflow Executor" executorType=emissary version=v3.3.10
time="2024-08-28T14:19:16.141Z" level=info msg="Creating a emissary executor"
time="2024-08-28T14:19:16.141Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2024-08-28T14:19:16.141Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=kubeflow podName=hello-pipeline-2clrb-1334336905 template="{\"name\":\"system-container-impl\",\"inputs\":{\"parameters\":[{\"name\":\"pod-spec-patch\",\"value\":\"{\\\"containers\\\":[{\\\"name\\\":\\\"main\\\",\\\"image\\\":\\\"docker-dev-artifactory.workday.com/ml/kubeflow/python-3.7:latest\\\",\\\"command\\\":[\\\"/var/run/argo/argoexec\\\",\\\"emissary\\\",\\\"--\\\",\\\"/kfp-launcher/launch\\\",\\\"--pipeline_name\\\",\\\"hello-pipeline\\\",\\\"--run_id\\\",\\\"5610709d-50b9-4833-8e2d-7e72a19a97ec\\\",\\\"--execution_id\\\",\\\"91\\\",\\\"--executor_input\\\",\\\"{\\\\\\\"inputs\\\\\\\":{},\\\\\\\"outputs\\\\\\\":{\\\\\\\"outputFile\\\\\\\":\\\\\\\"/tmp/kfp_outputs/output_metadata.json\\\\\\\"}}\\\",\\\"--component_spec\\\",\\\"{\\\\\\\"executorLabel\\\\\\\":\\\\\\\"exec-say-hello\\\\\\\"}\\\",\\\"--pod_name\\\",\\\"$(KFP_POD_NAME)\\\",\\\"--pod_uid\\\",\\\"$(KFP_POD_UID)\\\",\\\"--mlmd_server_address\\\",\\\"$(METADATA_GRPC_SERVICE_HOST)\\\",\\\"--mlmd_server_port\\\",\\\"tcp://10.100.242.77:8080\\\",\\\"--\\\"],\\\"args\\\":[\\\"sh\\\",\\\"-c\\\",\\\"\\\\nif ! [ -x \\\\\\\"$(command -v pip)\\\\\\\" ]; then\\\\n    python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\\\\nfi\\\\n\\\\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet     --no-warn-script-location 'kfp==2.0.1' \\\\u0026\\\\u0026 \\\\\\\"$0\\\\\\\" \\\\\\\"$@\\\\\\\"\\\\n\\\",\\\"sh\\\",\\\"-ec\\\",\\\"program_path=$(mktemp -d)\\\\nprintf \\\\\\\"%s\\\\\\\" \\\\\\\"$0\\\\\\\" \\\\u003e \\\\\\\"$program_path/ephemeral_component.py\\\\\\\"\\\\npython3 -m kfp.components.executor_main                         --component_module_path                         \\\\\\\"$program_path/ephemeral_component.py\\\\\\\"                         \\\\\\\"$@\\\\\\\"\\\\n\\\",\\\"\\\\nimport kfp\\\\nfrom kfp import dsl\\\\nfrom kfp.dsl import *\\\\nfrom typing import *\\\\n\\\\ndef say_hello() :\\\\n    import time\\\\n    time.sleep(1900)\\\\n    hello_text = f'Hello, Suansh!'\\\\n    
print(hello_text)\\\\n\\\\n\\\",\\\"--executor_input\\\",\\\"{{$}}\\\",\\\"--function_to_execute\\\",\\\"say_hello\\\"],\\\"env\\\":[{\\\"name\\\":\\\"NO_PROXY\\\",\\\"value\\\":\\\"172.17.68.189,.kubeflow,.local\\\"},{\\\"name\\\":\\\"no_proxy\\\",\\\"value\\\":\\\"172.17.68.189,.kubeflow,.local\\\"}],\\\"resources\\\":{}}]}\"}]},\"outputs\":{},\"metadata\":{\"annotations\":{\"sidecar.istio.io/inject\":\"false\"}},\"container\":{\"name\":\"\",\"image\":\"gcr.io/ml-pipeline/should-be-overridden-during-runtime\",\"command\":[\"should-be-overridden-during-runtime\"],\"envFrom\":[{\"configMapRef\":{\"name\":\"metadata-grpc-configmap\",\"optional\":true}}],\"env\":[{\"name\":\"KFP_POD_NAME\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"metadata.name\"}}},{\"name\":\"KFP_POD_UID\",\"valueFrom\":{\"fieldRef\":{\"fieldPath\":\"metadata.uid\"}}}],\"resources\":{},\"volumeMounts\":[{\"name\":\"kfp-launcher\",\"mountPath\":\"/kfp-launcher\"}]},\"volumes\":[{\"name\":\"kfp-launcher\",\"emptyDir\":{}}],\"initContainers\":[{\"name\":\"kfp-launcher\",\"image\":\"gcr.io/ml-pipeline/kfp-launcher@sha256:80cf120abd125db84fa547640fd6386c4b2a26936e0c2b04a7d3634991a850a4\",\"command\":[\"launcher-v2\",\"--copy\",\"/kfp-launcher/launch\"],\"resources\":{\"limits\":{\"cpu\":\"500m\",\"memory\":\"128Mi\"},\"requests\":{\"cpu\":\"100m\"}},\"volumeMounts\":[{\"name\":\"kfp-launcher\",\"mountPath\":\"/kfp-launcher\"}]}],\"archiveLocation\":{\"archiveLogs\":true,\"s3\":{\"endpoint\":\"minio.kubeflow:9000\",\"bucket\":\"mlpipeline\",\"insecure\":true,\"accessKeySecret\":{\"name\":\"mlpipeline-minio-artifact\",\"key\":\"accesskey\"},\"secretKeySecret\":{\"name\":\"mlpipeline-minio-artifact\",\"key\":\"secretkey\"},\"key\":\"artifacts/kubeflow/hello-pipeline-2clrb/2024-08-28/hello-pipeline-2clrb-1334336905\"}},\"podSpecPatch\":\"{\\\"containers\\\":[{\\\"name\\\":\\\"main\\\",\\\"image\\\":\\\"docker-dev-artifactory.workday.com/ml/kubeflow/python-3.7:latest\\\",\\\"command\\\":[\\\"/var/run/argo/argoexec\\\",\\\"emissary\\\",\\\"--\\\",\\\"/kfp-launcher/launch\\\",\\\"--pipeline_name\\\",\\\"hello-pipeline\\\",\\\"--run_id\\\",\\\"5610709d-50b9-4833-8e2d-7e72a19a97ec\\\",\\\"--execution_id\\\",\\\"91\\\",\\\"--executor_input\\\",\\\"{\\\\\\\"inputs\\\\\\\":{},\\\\\\\"outputs\\\\\\\":{\\\\\\\"outputFile\\\\\\\":\\\\\\\"/tmp/kfp_outputs/output_metadata.json\\\\\\\"}}\\\",\\\"--component_spec\\\",\\\"{\\\\\\\"executorLabel\\\\\\\":\\\\\\\"exec-say-hello\\\\\\\"}\\\",\\\"--pod_name\\\",\\\"$(KFP_POD_NAME)\\\",\\\"--pod_uid\\\",\\\"$(KFP_POD_UID)\\\",\\\"--mlmd_server_address\\\",\\\"$(METADATA_GRPC_SERVICE_HOST)\\\",\\\"--mlmd_server_port\\\",\\\"tcp://10.100.242.77:8080\\\",\\\"--\\\"],\\\"args\\\":[\\\"sh\\\",\\\"-c\\\",\\\"\\\\nif ! 
[ -x \\\\\\\"$(command -v pip)\\\\\\\" ]; then\\\\n    python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\\\\nfi\\\\n\\\\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet     --no-warn-script-location 'kfp==2.0.1' \\\\u0026\\\\u0026 \\\\\\\"$0\\\\\\\" \\\\\\\"$@\\\\\\\"\\\\n\\\",\\\"sh\\\",\\\"-ec\\\",\\\"program_path=$(mktemp -d)\\\\nprintf \\\\\\\"%s\\\\\\\" \\\\\\\"$0\\\\\\\" \\\\u003e \\\\\\\"$program_path/ephemeral_component.py\\\\\\\"\\\\npython3 -m kfp.components.executor_main                         --component_module_path                         \\\\\\\"$program_path/ephemeral_component.py\\\\\\\"                         \\\\\\\"$@\\\\\\\"\\\\n\\\",\\\"\\\\nimport kfp\\\\nfrom kfp import dsl\\\\nfrom kfp.dsl import *\\\\nfrom typing import *\\\\n\\\\ndef say_hello() :\\\\n    import time\\\\n    time.sleep(1900)\\\\n    hello_text = f'Hello, Suansh!'\\\\n    print(hello_text)\\\\n\\\\n\\\",\\\"--executor_input\\\",\\\"{{$}}\\\",\\\"--function_to_execute\\\",\\\"say_hello\\\"],\\\"env\\\":[{\\\"name\\\":\\\"NO_PROXY\\\",\\\"value\\\":\\\"172.17.68.189,.kubeflow,.local\\\"},{\\\"name\\\":\\\"no_proxy\\\",\\\"value\\\":\\\"172.17.68.189,.kubeflow,.local\\\"}],\\\"resources\\\":{}}]}\"}" version="&Version{Version:v3.3.10,BuildDate:2022-11-29T18:18:30Z,GitCommit:b19870d737a14b21d86f6267642a63dd14e5acd5,GitTag:v3.3.10,GitTreeState:clean,GoVersion:go1.17.13,Compiler:gc,Platform:linux/amd64,}"
time="2024-08-28T14:19:16.141Z" level=info msg="Starting deadline monitor"
time="2024-08-28T14:19:18.142Z" level=info msg="Main container completed"
time="2024-08-28T14:19:18.142Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2024-08-28T14:19:18.142Z" level=info msg="Saving logs"
time="2024-08-28T14:19:18.142Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: artifacts/kubeflow/hello-pipeline-2clrb/2024-08-28/hello-pipeline-2clrb-1334336905/main.log"
time="2024-08-28T14:19:18.142Z" level=info msg="Creating minio client using static credentials" endpoint="minio.kubeflow:9000"
time="2024-08-28T14:19:18.142Z" level=info msg="Saving file to s3" bucket=mlpipeline endpoint="minio.kubeflow:9000" key=artifacts/kubeflow/hello-pipeline-2clrb/2024-08-28/hello-pipeline-2clrb-1334336905/main.log path=/tmp/argo/outputs/logs/main.log
time="2024-08-28T14:19:18.151Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2024-08-28T14:19:18.151Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2024-08-28T14:19:18.151Z" level=info msg="No output parameters"
time="2024-08-28T14:19:18.151Z" level=info msg="No output artifacts"
time="2024-08-28T14:19:18.168Z" level=info msg="Create workflowtaskresults 201"
time="2024-08-28T14:19:18.169Z" level=info msg="Killing sidecars []"
time="2024-08-28T14:19:18.169Z" level=info msg="Alloc=6749 TotalAlloc=12722 Sys=24786 NumGC=4 Goroutines=9"

Following are the logs from

@pschoen-itsc

@suanshs Do you also have logs of the Istio sidecar, or do you not have Istio deployed?

@mmazurekgda

Just tested successfully that setting NO_PROXY to '*.kubeflow,*.local' works together with http(s)_proxy. It makes sense that the connection to ml-pipeline fails without NO_PROXY, because then all traffic is routed through the given proxy. It is just strange that it seemed to work before updating Kubeflow.

Thanks! This helped me a lot!

@cybernagle
Contributor

@suanshs Do you also have logs of the Istio sidecar, or do you not have Istio deployed?

Hi, I'm facing the same issue with the istio-proxy sidecar injected, and setting the NO_PROXY environment variable did not fix it. :(

@cybernagle
Contributor

cybernagle commented Nov 14, 2024

Hi Folks,

I was able to resolve the issue. The root cause was that I was using Istio sidecar injection for the workflow pods. However, during the init container stage, the kfp-launcher attempts to connect to the endpoint metadata-grpc-service.kubeflow:8080 before the istio-proxy is ready.

I found a related issue here: istio/istio#23802. As suggested, adding the following annotation to the pod resolved the issue:

traffic.sidecar.istio.io/excludeOutboundPorts: "8080"
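If you would rather apply that from the pipeline definition instead of patching pod templates, a rough sketch using the kfp-kubernetes extension is below; it assumes a recent kfp-kubernetes release that provides add_pod_annotation, so treat it as an illustration rather than a verified fix:

from kfp import dsl
from kfp import kubernetes  # kfp-kubernetes extension package

@dsl.component
def say_hello():
    print('Hello!')

@dsl.pipeline
def hello_pipeline():
    task = say_hello()
    # Keep outbound traffic to port 8080 out of the Istio sidecar's capture, so the
    # launcher can reach metadata-grpc-service.kubeflow:8080 before istio-proxy is ready.
    kubernetes.add_pod_annotation(
        task,
        annotation_key='traffic.sidecar.istio.io/excludeOutboundPorts',
        annotation_value='8080',
    )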
