[backend] Unimplemented desc = unknown method ListTasksV1 for service api.TaskService #8446

Open
tanvithakur94 opened this issue Nov 13, 2022 · 18 comments

Comments

@tanvithakur94

tanvithakur94 commented Nov 13, 2022

### Environment
kfp v2 SDK
kfp version 2.0.0b6
kfp-pipeline-spec version 0.1.16
kfp-server-api version 2.0.0a6

### Steps to reproduce

We are trying to run a simple example from the docs to test the kfp v2 SDK in the kubeflow namespace, as shown below:

import kfp
import kfp.dsl as dsl
#from kfp.v2.dsl import component
from kfp.dsl import component

@component
def add(a: float, b: float) -> float:
  '''Calculates sum of two arguments'''
  return a + b

@dsl.pipeline(
  name='addition-pipeline',
  description='An example pipeline that performs addition calculations.',
  # pipeline_root='gs://my-pipeline-root/example-pipeline'
)
def add_pipeline(a: float = 1, b: float = 7):
  add_task = add(a=a, b=b)

if __name__ == '__main__':
    # Compiling the pipeline
    kfp.compiler.Compiler().compile(pipeline_func=add_pipeline, package_path='pipeline1.yaml')

When we submit this pipeline, the system-dag-driver pod runs successfully, but the system-container-driver pod terminates with an error:

I1113 20:31:38.515160 27 main.go:213] output ExecutorInput:{ "inputs": { "parameterValues": { "parallelism": "2" } }, "outputs": { "parameters": { "Output": { "outputFile": "/tmp/kfp/outputs/Output" } }, "outputFile": "/tmp/kfp_outputs/output_metadata.json" } }
F1113 20:31:38.515178 27 main.go:74] KFP driver: driver.Container(pipelineName=pipeline/stress-test-v2, runID=caf03c46-c9db-4398-afb9-cea0821a3c51, task="get-loop-args", component="comp-get-loop-args", dagExecutionID=1085086, componentSpec) failed: failure while getting executionCache: failed to list tasks: rpc error: code = Unimplemented desc = unknown method ListTasksV1 for service api.TaskService
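
For anyone triaging similar driver logs, here is a minimal sketch that pulls the gRPC status code and description out of a failure line. The helper name `parse_grpc_status` is hypothetical, and the regex assumes the `rpc error: code = ... desc = ...` layout seen in the log above; it is not part of kfp.

```python
import re
from typing import Optional, Tuple

# Assumed layout, copied from the driver log above:
#   "rpc error: code = <Code> desc = <description>"
_GRPC_ERROR = re.compile(r"rpc error: code = (?P<code>\w+) desc = (?P<desc>.+)$")


def parse_grpc_status(log_line: str) -> Optional[Tuple[str, str]]:
    """Return (code, description) if the line carries a gRPC error, else None."""
    match = _GRPC_ERROR.search(log_line.strip())
    if match is None:
        return None
    return match.group("code"), match.group("desc")


line = (
    "failed to list tasks: rpc error: code = Unimplemented "
    "desc = unknown method ListTasksV1 for service api.TaskService"
)
status = parse_grpc_status(line)
print(status)
```

`Unimplemented` is the status a gRPC server returns when it does not recognize the requested method, so the client (here the driver) is calling a method the server it reached does not expose.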

Another issue we are facing: when we submit the pipeline from the Kubeflow UI, we get the error Cannot get MLMD objects from Metadata store. Cannot find context with {"typeName":"system.PipelineRun": Unknown Content-type received.

Impacted by this bug? Give it a 👍.

@chensun changed the title from "[sdk] Unimplemented desc = unknown method ListTasksV1 for service api.TaskService" to "[backend] Unimplemented desc = unknown method ListTasksV1 for service api.TaskService" on Nov 14, 2022
@chensun
Member

chensun commented Nov 14, 2022

TL;DR: This should be resolved now.

Last week, we updated these two images, which should never have been referenced via the latest tag in the first place:

driverImage: "gcr.io/ml-pipeline-test/dev/kfp-driver:latest",
launcherImage: "gcr.io/ml-pipeline-test/dev/kfp-launcher-v2:latest",

The change caused a mismatch between the API client and the API server. We reverted the images by manually adding the latest label to the last working/matching images. In our next release, we will fix the problematic latest reference in the above code.
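
To illustrate why a `latest` reference is fragile, here is a minimal sketch that flags image references resolving through a mutable `latest` tag. The helper name `uses_floating_tag` is hypothetical and the parsing is a simplification of the image reference format; it is not how kfp itself resolves images.

```python
def uses_floating_tag(image_ref: str) -> bool:
    """Return True if image_ref resolves through the mutable 'latest' tag."""
    # A digest reference (name@sha256:...) always points at the same image.
    if "@" in image_ref:
        return False
    # Look for a tag only in the last path segment, so a registry port
    # such as localhost:5000/app is not mistaken for a tag.
    last_segment = image_ref.rsplit("/", 1)[-1]
    _, separator, tag = last_segment.partition(":")
    # A reference without an explicit tag defaults to 'latest'.
    return (not separator) or tag == "latest"


images = [
    "gcr.io/ml-pipeline-test/dev/kfp-driver:latest",
    "gcr.io/ml-pipeline-test/dev/kfp-launcher-v2:latest",
]
floating = [ref for ref in images if uses_floating_tag(ref)]
print(floating)
```

Both references above are flagged, which means whatever image the tag points to at pull time is what runs, regardless of which API server version is deployed.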

@stobias123

I'm still seeing this on a fresh install

@chensun
Member

chensun commented Nov 17, 2022

> I'm still seeing this on a fresh install

@stobias123 Sorry, I found that the "latest" label we added to the old images was accidentally moved. I changed it back, so this should work now. Please retry running any pipeline.

@deepk2u
Contributor

deepk2u commented Jan 4, 2023

@chensun I am seeing a different error now. It looks like the Argo executor is not part of the kfp-launcher image:

 containerStatuses:
  - containerID: containerd://4585e4aa7d7f41937c2f6a70321555d2c65070666a507a0a75b4c6857e171bd6
    image: sha256:374e2eca14d7602115e4be44f60b0587d8192a45298c353df4cfac3bfc7854f9
    imageID: gcr.io/ml-pipeline-test/dev/kfp-launcher-v2@sha256:4513cf5c10c252d94f383ce51a890514799c200795e3de5e90f91b98b2e2f959
    lastState: {}
    name: main
    ready: false
    restartCount: 0
    started: false
    state:
      terminated:
        containerID: containerd://4585e4aa7d7f41937c2f6a70321555d2c65070666a507a0a75b4c6857e171bd6
        exitCode: 128
        finishedAt: "2023-01-04T20:57:59Z"
        message: 'failed to create containerd task: failed to create shim task: OCI
          runtime create failed: runc create failed: unable to start container process:
          exec: "/var/run/argo/argoexec": stat /var/run/argo/argoexec: no such file
          or directory: unknown'
        reason: StartError
        startedAt: "1970-01-01T00:00:00Z"
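
A minimal sketch for spotting this kind of failure programmatically, assuming the pod status has already been loaded into a plain dict with the same shape as the YAML above. The helper name `find_start_errors` is hypothetical.

```python
from typing import Any, Dict, List


def find_start_errors(pod_status: Dict[str, Any]) -> List[str]:
    """List containers that never started (terminated with reason StartError)."""
    failures = []
    for container in pod_status.get("containerStatuses", []):
        terminated = container.get("state", {}).get("terminated")
        if terminated and terminated.get("reason") == "StartError":
            failures.append(f'{container["name"]}: exit {terminated["exitCode"]}')
    return failures


# Trimmed copy of the status shown above.
pod_status = {
    "containerStatuses": [
        {
            "name": "main",
            "state": {
                "terminated": {
                    "exitCode": 128,
                    "reason": "StartError",
                    "message": 'exec: "/var/run/argo/argoexec": stat '
                    "/var/run/argo/argoexec: no such file or directory: unknown",
                }
            },
        }
    ]
}
print(find_start_errors(pod_status))
```

A `StartError` means the container process was never executed, so the failure happens before any launcher code runs.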
       

@sergeyshevch

@chensun I still get the same error on apiserver v2.0.0.alpha6. Is there a way to fix it?

@v-raja

v-raja commented Mar 8, 2023

I'm getting the same error too. @sergeyshevch Have you figured out any temporary workaround?

@sergeyshevch

> I'm getting the same error too. @sergeyshevch Have you figured out any temporary workaround?

I went back to using the Pipelines SDK v1.
It works fine.

@v-raja

v-raja commented Mar 11, 2023

I would prefer not to use sdk v1 since I'd like to use dsl.importer. @chensun Any thoughts on why this still might be happening? Happy to try to fix it.

@chensun
Member

chensun commented Mar 28, 2023

@v-raja Sorry for the slow response. I noticed the v2 image label got moved again. I just moved it back. Can you retry and see if you still hit the issue?

@rzanetti-cpqd

I'm having the same issue on kf 1.7 (kfp 2.0.0-alpha.7).
Environments installed this month, as well as environments from 2023, are now unable to run any pipeline. They fail with the same error listed here as soon as the second pod wraps up:

KFP driver: driver.Container(pipelineName=hello-pipeline, runID=19fdf8c4-f622-4da3-9760-e3c66919032c, task="say-hello", component="comp-say-hello", dagExecutionID=824, componentSpec) failed: failure while getting executionCache: failed to list tasks: rpc error: code = Unimplemented desc = unknown method ListTasksV1 for service api.TaskService

Even the say-hello pipeline breaks on every instance of kf 1.7 we have here.

In case this is a similar issue, how do you resolve the installs after the images get properly moved back? Would simply rollout-restarting the services be enough?

Thanks in advance

@pbasov

pbasov commented Feb 10, 2024

@chensun I'm having the same issue as @rzanetti-cpqd on 2.0.0a7 (provided with the kubeflow distro I'm using).
Are these images exposed in a config somehow? I'd like to rehost them so some upstream change doesn't break our setup.
In a7 they're hardcoded.

@tfontana1

Having this issue also with KFP 2.0.0-beta.12. @chensun can you check to see if this v2 image label got moved again?

@chensun
Member

chensun commented Feb 12, 2024

> Having this issue also with KFP 2.0.0-beta.12. @chensun can you check to see if this v2 image label got moved again?

@tfontana1 Yes, somehow it was moved 5 days ago. It's unclear to me what the trigger was. I just moved it back.
Please do prioritize upgrading to a stable version to avoid such breaks in the future.

@shivanibhargove

/reopen
We are facing the same issue with kfp v2.


@shivanibhargove: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

> /reopen
> We are facing the same issue with kfp v2.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@deepk2u
Contributor

deepk2u commented Jan 8, 2025

/reopen
We are facing the same issue with kfp v2.

@google-oss-prow google-oss-prow bot reopened this Jan 8, 2025

@deepk2u: Reopened this issue.

In response to this:

> /reopen
> We are facing the same issue with kfp v2.


@shivanibhargove

shivanibhargove commented Jan 8, 2025

Hi Team,
I have installed kubeflow 1.8 with
kfp run time: 2.3.0
kfp sdk: 2.11.0
Argo: v3.4.18
I am getting the error below:
"executor error: open /var/run/argo/outputs/parameters/tmp/outputs/execution-id: no such file or directory"
Steps:

  1. Run a simple addition pipeline:
    import kfp
    import kfp.dsl as dsl
    #from kfp.v2.dsl import component
    from kfp.dsl import component

    @component
    def add(a: float, b: float) -> float:
        '''Calculates sum of two arguments'''
        return a + b

    @dsl.pipeline(
        name='addition-pipeline',
        description='An example pipeline that performs addition calculations.',
        # pipeline_root='gs://my-pipeline-root/example-pipeline'
    )
    def add_pipeline(a: float = 1, b: float = 7):
        add_task = add(a=a, b=b)

    if __name__ == '__main__':
        # Compiling the pipeline
        kfp.compiler.Compiler().compile(pipeline_func=add_pipeline, package_path='pipeline1.yaml')

  2. Check the dag driver pod logs.

Is kfp v2 not compatible with the above argo version?
