Using set_display_name results in ‘Cannot Find Producer Task’ error in Kubeflow Pipelines #11173

Open
milosjava opened this issue Sep 5, 2024 · 7 comments

Comments

@milosjava
Member

milosjava commented Sep 5, 2024

Environment

Kubeflow version: 1.9
KFP SDK version: 2.8.0

Backend: Argo

Steps to reproduce

  1. Define two components (addition and divide) and create a pipeline (hello_pipeline) that chains them together.
  2. Use the set_display_name method to set a display name for the first task (addition).
  3. Compile the pipeline and run it. The pipeline fails because the output of the first component (addition) is not available to the second component (divide). See the error log at the end of this issue.

Here is the Python code that reproduces the issue:

from kfp import compiler, dsl


@dsl.component
def addition(a: float, b: float) -> float:
    print("hi")
    return a + b


@dsl.component
def divide(a: float, b: float) -> float:
    print("hi")
    return a / b


@dsl.pipeline
def hello_pipeline(a: float = 1, b: float = 2, c: float = 3) -> None:
    # Setting a display name on the producer task is what triggers the failure.
    total = addition(a=a, b=b).set_display_name('total')
    # divide consumes the output of addition; this is the task that fails at runtime.
    fraction = divide(a=total.output, b=c)


# Compile to the pipeline spec that is then uploaded and run.
compiler.Compiler().compile(hello_pipeline, 'pipeline.yaml')

Expected result

The pipeline should run successfully with the set_display_name method used.

Actual result

The pipeline fails in the divide component with the following error message:

I0905 21:49:34.529181      21 main.go:108] input ComponentSpec:{
  "executorLabel": "exec-divide",
  "inputDefinitions": {
    "parameters": {
      "a": {
        "parameterType": "NUMBER_DOUBLE"
      },
      "b": {
        "parameterType": "NUMBER_DOUBLE"
      }
    }
  },
  "outputDefinitions": {
    "parameters": {
      "Output": {
        "parameterType": "NUMBER_DOUBLE"
      }
    }
  }
}
I0905 21:49:34.530368      21 main.go:115] input TaskSpec:{
  "cachingOptions": {
    "enableCache": true
  },
  "componentRef": {
    "name": "comp-divide"
  },
  "dependentTasks": [
    "addition"
  ],
  "inputs": {
    "parameters": {
      "a": {
        "taskOutputParameter": {
          "outputParameterKey": "Output",
          "producerTask": "addition"
        }
      },
      "b": {
        "componentInputParameter": "c"
      }
    }
  },
  "taskInfo": {
    "name": "divide"
  }
}
I0905 21:49:34.531025      21 main.go:121] input ContainerSpec:{
  "args": [
    "--executor_input",
    "{{$}}",
    "--function_to_execute",
    "divide"
  ],
  "command": [
    "sh",
    "-c",
    "\nif ! [ -x \"$(command -v pip)\" ]; then\n    python3 -m ensurepip || python3 -m ensurepip --user || apt-get install python3-pip\nfi\n\nPIP_DISABLE_PIP_VERSION_CHECK=1 python3 -m pip install --quiet --no-warn-script-location 'kfp==2.8.0' '--no-deps' 'typing-extensions\u003e=3.7.4,\u003c5; python_version\u003c\"3.9\"' \u0026\u0026 \"$0\" \"$@\"\n",
    "sh",
    "-ec",
    "program_path=$(mktemp -d)\n\nprintf \"%s\" \"$0\" \u003e \"$program_path/ephemeral_component.py\"\n_KFP_RUNTIME=true python3 -m kfp.dsl.executor_main                         --component_module_path                         \"$program_path/ephemeral_component.py\"                         \"$@\"\n",
    "\nimport kfp\nfrom kfp import dsl\nfrom kfp.dsl import *\nfrom typing import *\n\ndef divide(a: float, b: float) -\u003e float:\n    print(\"hi\")\n    return a / b\n\n"
  ],
  "image": "python:3.8"
}
I0905 21:49:34.531735      21 cache.go:139] Cannot detect ml-pipeline in the same namespace, default to ml-pipeline.kubeflow:8887 as KFP endpoint.
I0905 21:49:34.531764      21 cache.go:116] Connecting to cache endpoint ml-pipeline.kubeflow:8887
I0905 21:49:34.607730      21 client.go:302] Pipeline Context: id:17447  name:"hello-pipeline"  type_id:26  type:"system.Pipeline"  create_time_since_epoch:1725572300413  last_update_time_since_epoch:1725572300413
I0905 21:49:34.667931      21 client.go:311] Pipeline Run Context: id:17449  name:"ce7a10c3-b5e4-4222-ae34-919cb179134c"  type_id:27  type:"system.PipelineRun"  custom_properties:{key:"namespace"  value:{string_value:"milos-grubjesic"}}  custom_properties:{key:"pipeline_root"  value:{string_value:"minio://kubeflow-content/v2/artifacts/hello-pipeline/ce7a10c3-b5e4-4222-ae34-919cb179134c"}}  custom_properties:{key:"resource_name"  value:{string_value:"run-resource"}}  custom_properties:{key:"store_session_info"  value:{string_value:"{\"Provider\":\"minio\",\"Params\":{\"accessKeyKey\":\"accesskey\",\"disableSSL\":\"true\",\"endpoint\":\"minio-service.kubeflow:9000\",\"fromEnv\":\"false\",\"region\":\"minio\",\"secretKeyKey\":\"secretkey\",\"secretName\":\"mlpipeline-minio-artifact\"}}"}}  create_time_since_epoch:1725572932246  last_update_time_since_epoch:1725572932246
I0905 21:49:35.060625      21 driver.go:252] parent DAG: id:461494  name:"run/ce7a10c3-b5e4-4222-ae34-919cb179134c"  type_id:239  type:"system.DAGExecution"  last_known_state:RUNNING  custom_properties:{key:"display_name"  value:{string_value:""}}  custom_properties:{key:"inputs"  value:{struct_value:{fields:{key:"a"  value:{number_value:1}}  fields:{key:"b"  value:{number_value:2}}  fields:{key:"c"  value:{number_value:3}}}}}  custom_properties:{key:"task_name"  value:{string_value:""}}  create_time_since_epoch:1725572932561  last_update_time_since_epoch:1725572932561
I0905 21:49:35.176022      21 driver.go:926] parent DAG input parameters: map[a:number_value:1 b:number_value:2 c:number_value:3], artifacts: map[]
F0905 21:49:35.233819      21 main.go:79] KFP driver: driver.Container(pipelineName=hello-pipeline, runID=ce7a10c3-b5e4-4222-ae34-919cb179134c, task="divide", component="comp-divide", dagExecutionID=461494, componentSpec) failed: failed to resolve inputs: resolving input parameter a with spec task_output_parameter:{producer_task:"addition"  output_parameter_key:"Output"}: cannot find producer task "addition"
time="2024-09-05T21:49:35.435Z" level=info msg="sub-process exited" argo=true error="<nil>"
time="2024-09-05T21:49:35.435Z" level=error msg="cannot save parameter /tmp/outputs/pod-spec-patch" argo=true error="open /tmp/outputs/pod-spec-patch: no such file or directory"
time="2024-09-05T21:49:35.435Z" level=error msg="cannot save parameter /tmp/outputs/cached-decision" argo=true error="open /tmp/outputs/cached-decision: no such file or directory"
time="2024-09-05T21:49:35.435Z" level=error msg="cannot save parameter /tmp/outputs/condition" argo=true error="open /tmp/outputs/condition: no such file or directory"
Error: exit status 1
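
In the log above, the TaskSpec for divide refers to its producer by the task key "addition", while the reproduction code gave that task the display name "total". The sketch below is one way to spot such tasks in a compiled pipeline spec before running it. It works on the spec loaded as a plain dict (for example, from pipeline.yaml) and only looks at the top-level DAG. The helper names find_renamed_tasks and find_at_risk_consumers are hypothetical, the "addition" entry in the example is an assumption about what the compiler emits (only the "divide" entry is taken from the log), and the assumption that set_display_name ends up in taskInfo.name has not been confirmed in this thread.

```python
from typing import Any, Dict, List, Tuple


def _root_tasks(pipeline_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return the tasks of the top-level DAG, or an empty dict if absent."""
    return pipeline_spec.get("root", {}).get("dag", {}).get("tasks", {})


def find_renamed_tasks(pipeline_spec: Dict[str, Any]) -> List[Tuple[str, str]]:
    """Return (task_key, display_name) pairs where taskInfo.name differs from the key."""
    renamed = []
    for key, task in _root_tasks(pipeline_spec).items():
        display = task.get("taskInfo", {}).get("name", key)
        if display != key:
            renamed.append((key, display))
    return renamed


def find_at_risk_consumers(pipeline_spec: Dict[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (consumer_key, input_name, producer_key) for inputs fed by a renamed task."""
    renamed_keys = {key for key, _ in find_renamed_tasks(pipeline_spec)}
    at_risk = []
    for key, task in _root_tasks(pipeline_spec).items():
        params = task.get("inputs", {}).get("parameters", {})
        for input_name, source in params.items():
            producer = source.get("taskOutputParameter", {}).get("producerTask")
            if producer in renamed_keys:
                at_risk.append((key, input_name, producer))
    return at_risk


# "divide" mirrors the TaskSpec from the log; "addition" is an assumed entry.
example_spec = {
    "root": {"dag": {"tasks": {
        "addition": {"taskInfo": {"name": "total"}},
        "divide": {
            "dependentTasks": ["addition"],
            "inputs": {"parameters": {
                "a": {"taskOutputParameter": {
                    "outputParameterKey": "Output",
                    "producerTask": "addition",
                }},
                "b": {"componentInputParameter": "c"},
            }},
            "taskInfo": {"name": "divide"},
        },
    }}}
}

print(find_renamed_tasks(example_spec))
print(find_at_risk_consumers(example_spec))
```

Any task reported by find_at_risk_consumers reads an output from a task whose display name no longer matches its key, which is the situation described in this issue.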

Impacted by this bug? Give it a 👍.

@sanchesoon

As a workaround, you can write a Python wrapper for the component. The real problem is in the api-server and Argo Workflows: there is a mismatch between the task names.
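
The comment does not spell out what the wrapper looks like. A different stopgap, sketched below purely as an untested assumption, is to post-process the compiled spec so that every top-level task's taskInfo.name equals its task key again, which should make the spec match what the compiler produces when set_display_name is not called. The function name reset_task_display_names is hypothetical, the display names are lost in the UI, and nothing in this thread confirms that the backend accepts the patched spec.

```python
import copy
from typing import Any, Dict


def reset_task_display_names(pipeline_spec: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the spec with taskInfo.name reset to each task's key.

    Only the top-level DAG is handled; nested DAGs are left untouched.
    """
    patched = copy.deepcopy(pipeline_spec)
    tasks = patched.get("root", {}).get("dag", {}).get("tasks", {})
    for key, task in tasks.items():
        task.setdefault("taskInfo", {})["name"] = key
    return patched


# Assumed shape of the compiled spec for the pipeline in this issue.
spec = {
    "root": {"dag": {"tasks": {
        "addition": {"taskInfo": {"name": "total"}},
        "divide": {"taskInfo": {"name": "divide"}},
    }}}
}

patched_spec = reset_task_display_names(spec)
print(patched_spec["root"]["dag"]["tasks"]["addition"]["taskInfo"]["name"])
```

The original dict is left unchanged, so the patched copy can be written to a separate file and compared against the compiler output.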


This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@github-actions github-actions bot added the lifecycle/stale label Nov 10, 2024

github-actions bot commented Dec 1, 2024

This issue has been automatically closed because it has not had recent activity. Please comment "/reopen" to reopen it.

@github-actions github-actions bot closed this as completed Dec 1, 2024
@lewmatcin

/reopen
Still an issue.


@lewmatcin: You can't reopen an issue/PR unless you authored it or you are a collaborator.

In response to this:

/reopen
Still an issue.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@milosjava
Member Author

/reopen

@google-oss-prow google-oss-prow bot reopened this Dec 3, 2024

@milosjava: Reopened this issue.

In response to this:

/reopen

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@github-actions github-actions bot removed the lifecycle/stale label Dec 4, 2024