[backend] Argo Workflow v2 POD_NAMES env variable limits pipeline name length to 22 characters when CreatePVC involved #11235

Open
vvarala1 opened this issue Sep 23, 2024 · 0 comments

Environment

  • How did you deploy Kubeflow Pipelines (KFP)?
    Kubeflow manifests 1.9 release
  • KFP version:
    kfp                       2.9.0
    kfp-kubernetes            1.3.0
    kfp-pipeline-spec         0.4.0
    kfp-server-api            2.3.0

Related issue: #11019

Issue Description

In Kubeflow 1.8, Argo Workflows used pod naming format v1, in which pod names contain the node ID. With that format, the pipeline name is limited to 58 characters.

In Kubeflow 1.9, Argo Workflows was updated to version 3.4, which introduced a new pod naming format (v2). In this format, pod names also include the template name. By default, one of the strings -system-dag-driver, -system-container-driver, or -system-container-impl is added to each pod name.

The issue arises when a CreatePVC task is used, with POD_NAMES: v2, in a pipeline whose name exceeds 22 characters (the remaining characters are taken up by the AWF template names). Because of Kubernetes' 64-character limit on pod names, the full pod name cannot be retrieved, which leads to errors during pipeline execution.
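The 22-character figure can be reproduced with simple arithmetic. The sketch below is only an illustration: the 63-character effective limit is inferred from the truncated pod name in the error message further down, and the run-suffix and numeric-suffix lengths are taken from that same pod name. None of these constants come from the Argo or KFP source code.

```python
# Assumed values, inferred from the pod names quoted in this issue
# (not taken from Argo Workflows or KFP source code).
EFFECTIVE_NAME_LIMIT = 63             # length of the truncated name in the error message
RUN_SUFFIX_LEN = len("-mcd5f")        # generated workflow suffix
HASH_SUFFIX_LEN = len("-4099067909")  # trailing numeric suffix (10 digits here)

TEMPLATE_SUFFIXES = (
    "-system-dag-driver",
    "-system-container-driver",
    "-system-container-impl",
)


def max_pipeline_name_length(template_suffix: str) -> int:
    """Longest sanitized pipeline name that keeps the pod name within the limit."""
    return (
        EFFECTIVE_NAME_LIMIT
        - RUN_SUFFIX_LEN
        - len(template_suffix)
        - HASH_SUFFIX_LEN
    )


budgets = {suffix: max_pipeline_name_length(suffix) for suffix in TEMPLATE_SUFFIXES}
for suffix, budget in budgets.items():
    print(f"{suffix}: {budget}")
# -system-dag-driver: 28
# -system-container-driver: 22
# -system-container-impl: 24
```

Under these assumptions the longest suffix, -system-container-driver, is the one that yields the 22-character budget.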

Steps to reproduce

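from kfp import Client, dsl
from kfp import kubernetes

# Assumes the KFP API endpoint is reachable with default settings;
# otherwise pass host=... to Client()
client = Client()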

@dsl.component
def test_step():
    print("Hello world")

# Pipeline 1

# Pipeline name "create pvc pipeline name length check" is 37 characters long
@dsl.pipeline(name="create pvc pipeline name length check")
def test_kfp_task():
    test_step()

# Pipeline 2

# Pipeline name "create pvc pipeline name" is 24 characters long; only the
# first 22 characters ("create pvc pipeline na") fit.
# If the pipeline name is longer than 22 characters and CreatePVC is used,
# the pod fails because the full pod name is missing in the main container.
@dsl.pipeline(name="create pvc pipeline name")
def test_createPVC_kfp_task():
    create_pvc = kubernetes.CreatePVC(
        access_modes=["ReadWriteOnce"],
        size="10Mi",
        storage_class_name="default",
    )
    task = test_step()
    task.after(create_pvc)

# Execute Pipeline 1
client.create_run_from_pipeline_func(test_kfp_task, arguments={}, enable_caching=False)
# Pipeline 1 executes successfully with a pipeline name longer than 22 characters

# Execute Pipeline 2
client.create_run_from_pipeline_func(test_createPVC_kfp_task, arguments={}, enable_caching=True)
# Pipeline 2 fails with a pipeline name longer than 22 characters

The error message for the CreatePVC task in Pipeline 2 is:

[main.go:79] KFP driver: driver.Container(pipelineName=create-pvc-pipeline-name, runID=27679dce-25b9-4637-89d2-9f2265e608c6,
task="createpvc", component="comp-createpvc", dagExecutionID=346957, componentSpec) 
failed: failed to create PVC and publish execution createpvc: failed to publish driver execution: 
failed to publish driver execution createpvc: 
error retrieving info for pod create-pvc-pipeline-name-mcd5f-system-container-driver-40990679:
 pods "create-pvc-pipeline-name-mcd5f-system-container-driver-40990679" not found.

In this case, the actual pod name is create-pvc-pipeline-name-mcd5f-system-container-driver-4099067909, which exceeds the 64-character limit. The driver looks the pod up under a truncated name, so the CreatePVC task fails.
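The mismatch between the two names can be checked directly. This short snippet only compares the two strings quoted above; it makes no claim about where the truncation happens.

```python
# Name the driver tried to look up (from the error message)
looked_up = "create-pvc-pipeline-name-mcd5f-system-container-driver-40990679"
# Actual pod name reported in this issue
actual = "create-pvc-pipeline-name-mcd5f-system-container-driver-4099067909"

print(len(looked_up))                # 63
print(len(actual))                   # 65
print(actual.startswith(looked_up))  # True
print(actual[len(looked_up):])       # 09 (the characters that were cut off)
```

The looked-up name is the first 63 characters of the actual name, two characters short.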

Expected result

The pipeline should successfully execute as a regular pipeline task, regardless of the involvement of the CreatePVC task.

Since all our pipelines use the CreatePVC component, we face a dilemma: either keep pipeline names to 22 characters or fewer under the AWF v2 format, or revert to POD_NAMES: v1 so that AWF template names are not added to pod names.

I am uncertain about the implications of using POD_NAMES: v1 in Kubeflow v1.9. Any suggestions or insights would be greatly appreciated.
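Until this is resolved, the first option (short pipeline names) could be enforced with a check before submitting a run. The sketch below is hypothetical: `sanitize_name` and `check_pipeline_name` are illustrative helpers, not part of the KFP SDK, and the sanitization only approximates how "create pvc pipeline name" became create-pvc-pipeline-name in the driver log.

```python
import re

MAX_PIPELINE_NAME_LEN = 22  # limit observed in this issue for CreatePVC pipelines


def sanitize_name(name: str) -> str:
    """Approximate the name seen in the driver log (assumption, not the KFP rule):
    lowercase, with runs of other characters replaced by '-'."""
    return re.sub(r"[^a-z0-9]+", "-", name.lower()).strip("-")


def check_pipeline_name(name: str) -> str:
    """Return the sanitized name, or raise if it is longer than the observed limit."""
    sanitized = sanitize_name(name)
    if len(sanitized) > MAX_PIPELINE_NAME_LEN:
        raise ValueError(
            f"pipeline name {sanitized!r} is {len(sanitized)} characters; "
            f"keep it to {MAX_PIPELINE_NAME_LEN} or fewer when CreatePVC is used"
        )
    return sanitized


print(check_pipeline_name("create pvc pipeline na"))  # create-pvc-pipeline-na
```

Calling `check_pipeline_name("create pvc pipeline name")` raises a ValueError, since the sanitized name is 24 characters long.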

Impacted by this bug? Give it a 👍.
