Cannot specify key for input artifact (without full artifact location) #3307

hadim · 2020-06-25T18:05:04Z

Tested on 2.9.0-rc3

The official example works:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifactory-repository-ref-
spec:
  entrypoint: main
  artifactRepositoryRef:
    key: minio
  templates:
    - name: main
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cowsay hello world | tee /tmp/hello_world.txt"]
      outputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt

When switching to input it fails:

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifactory-repository-ref-
spec:
  entrypoint: main
  artifactRepositoryRef:
    key: minio
  templates:
    - name: main
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cowsay hello world | tee /tmp/hello_world.txt"]
      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt

with

Failed to submit workflow: templates.entrypoint.steps[0].main templates.main-template inputs.artifacts.hello_world was not supplied

Another important point, IMO: I don't see any way to specify the key (the location of your object within the S3 bucket) during workflow creation. My understanding is that artifactRepositoryRef can be used to set up default repositories, which workflows can then reuse by specifying only the location of the folder or file to use as inputs or outputs. Was it designed for that purpose?


alexec · 2020-06-25T18:06:05Z

Can I confirm if this used to work and does not work anymore? Or if it just never seemed to work?

hadim · 2020-06-25T18:06:59Z

It never worked for me. See also #2461 (comment)

hadim · 2020-06-25T18:08:01Z

I have tested it using both gcs and s3.

alexec · 2020-06-25T18:20:55Z

I think you must specify the key within the bucket, as well as the bucket, endpoint, etc. Not great, I agree, but that is how it is today.
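
For reference, a fully explicit S3 input artifact looks roughly like the sketch below. The endpoint, bucket, key, and Secret names are placeholders, not values from this thread; check the field names against the Argo Workflows version you run.

```yaml
# Sketch only: every value below is a placeholder.
inputs:
  artifacts:
    - name: hello_world
      path: /tmp/hello_world.txt
      s3:
        endpoint: s3.amazonaws.com            # placeholder endpoint
        bucket: my-bucket                     # placeholder bucket
        key: path/in/bucket/hello_world.txt   # object location inside the bucket
        accessKeySecret:
          name: my-s3-credentials             # hypothetical Kubernetes Secret
          key: accessKey
        secretKeySecret:
          name: my-s3-credentials
          key: secretKey
```

Every input artifact has to repeat this whole block, which is the duplication this issue is about.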

hadim · 2020-06-25T18:22:47Z

So what's the point of artifactRepositoryRef if you must replicate all the config?

alexec · 2020-07-16T04:04:29Z

@hadim I'm going to recategorize this as an "enhancement". We should do more work in this area, and I'd like to assess interest.

vitalyrychkov · 2020-07-31T07:58:07Z

Hi, I was looking for the same solution. I need many different artifacts as inputs and was hoping to define the S3 parameters only once.

alexec · 2020-07-31T16:11:32Z

Do 👍 to show interest.

dekovach · 2020-10-14T10:15:14Z

It was really surprising to me to find out that you cannot specify input artifacts from the default artifact repository. It should be supported by default. I would expect to be able to specify only the key in the default bucket, and then it should work out of the box.

alexec · 2020-10-14T15:28:24Z

This issue isn't really about artifactRepositoryRef; specifying only a key for an input artifact is not supported at all. I'm going to rename this issue to reflect this.

Bobgy · 2020-10-21T04:27:09Z

Kubeflow Pipelines (KFP) tries to separate cluster config (like the artifact repository) from workflow config. If Argo supported this feature, we could let each user manage their artifact repository in their own namespace while keeping the workflow definitions shareable.

This is one of the areas where KFP hasn't been able to achieve multi-tenancy separation: kubeflow/pipelines#1223 (comment).

Some recent discussion: https://kubeflow.slack.com/archives/CE10KS9M4/p1602516358147900

Ark-kun · 2020-10-22T07:44:31Z

> It was really surprising to me to find out that you cannot specify input artifacts from the default artifact repository.

Please help me understand: how is it supposed to choose the input artifact from the repository (out of the thousands of artifacts already there) based on this information alone?

      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt

This does not seem to contain any information that can be used to get an artifact.

Ark-kun · 2020-10-22T07:51:57Z

One way to address this feature request is to add support for a generic uri field in the artifact. The rest of the artifact repository information can be selected based on the artifact URI scheme.

      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt
            uri: s3://my-bucket/my_key

This would also be a step towards making it possible to pass the artifact URIs using placeholders like {{tasks.some-task.outputs.hello_world.uri}}.

alexec · 2020-10-22T13:56:35Z

> generic uri

How would you support secrets for username + password?

fvdnabee · 2021-01-12T14:04:06Z

> > It was really surprising to me to find out that you cannot specify input artifacts from the default artifact repository.
>
> Please help me understand: how is it supposed to choose the input artifact from the repository (out of the thousands of artifacts already there) based on this information alone?
>
>       inputs:
>         artifacts:
>           - name: hello_world
>             path: /tmp/hello_world.txt
>
> This does not seem to contain any information that can be used to get an artifact.

I agree. However, if the workflow definition specifies the artifact type and the key (in the case of s3), shouldn't the workflow controller be able to figure out that it could try fetching the artifact from the default artifact repository (if that is s3) or from a referenced artifact location? This more closely resembles the scenario posted by the OP in issue #2461.

> I think you must specify the key within the bucket, as well as the bucket, endpoint, etc. Not great, I agree, but that is how it is today.

This is also my understanding today.

For output artifacts, it is sufficient to only specify the name and the path in the wf spec. The wf controller will upload the output artifacts to the default artifact repository (if available). This enables you to decouple the artifact repository configuration from the workflow definition (e.g. this configuration might depend on a local, staging or production environment).

For input artifacts, however, you need to specify the endpoint, bucket and key in the artifact definition (in the case of s3). I have been unable to decouple the artifact configuration from the workflow definition, which in my case means the workflow spec depends on the environment, because input arguments are fetched from different places in the development, staging and production Argo Workflows installations.

For both input and output artifacts, it would be desirable to:
a) decouple artifact configuration from workflow specification
b) enable this decoupling on a per-artifact basis, as input artifacts [A, B] might be fetched from repositories [X, Y] whereas output artifact C might be uploaded to repository Z.
Item b) is of interest in workflows where the input artifacts come from an artifact repository that differs from the artifact repo where Argo stores its workflow output (which is typically more ephemeral in nature).

#4618 addresses some of these issues, but it is not clear to me whether the decoupling is provided on a per-artifact basis. It appears to be configured for an entire workflow.

alexec · 2021-01-12T17:08:04Z

#4618 de-couples artifact configuration from the workflow by allowing you to store it in a config map (or secret, I forget). You can only specify this configuration at the workflow level, so all artifacts must be stored in the same place UNLESS you are completely explicit.

I.e. it does not do (b). However, it does lay a lot of groundwork that would make (b) straightforward to do.
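
As a rough sketch of what this looks like after #4618 (assuming Argo Workflows v3.0 or later): the repository settings live in a ConfigMap, and the input artifact carries only its key. The ConfigMap name, bucket, endpoint, object key, and Secret names below are placeholders, not values from this thread.

```yaml
# Sketch only: names and values are placeholders.
apiVersion: v1
kind: ConfigMap
metadata:
  name: artifact-repositories
data:
  minio: |
    s3:
      bucket: my-bucket
      endpoint: minio:9000
      insecure: true
      accessKeySecret:
        name: my-minio-cred       # hypothetical Secret
        key: accesskey
      secretKeySecret:
        name: my-minio-cred
        key: secretkey
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: artifactory-repository-ref-
spec:
  entrypoint: main
  artifactRepositoryRef:
    configMap: artifact-repositories
    key: minio                    # entry in the ConfigMap above
  templates:
    - name: main
      inputs:
        artifacts:
          - name: hello_world
            path: /tmp/hello_world.txt
            s3:
              key: path/in/bucket/hello_world.txt   # only the key; the rest comes from the repository
      container:
        image: docker/whalesay:latest
        command: [sh, -c]
        args: ["cat /tmp/hello_world.txt"]
```

Because the reference is set once at the workflow level, every key-only artifact in this workflow resolves against the same repository, which is the limitation described in (b).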

Do you think (b) is a common use case? Give me a 👍 or 👎

hadim added the type/bug label Jun 25, 2020

hadim changed the title ~~artifactRepositoryRef does not work for inputs + no way to specify the localtion within the artifact (key)~~ artifactRepositoryRef does not work for inputs + no way to specify the location within the artifact (key) Jun 25, 2020

alexec added the artifacts label Jun 25, 2020

alexec added the type/feature and solution/workaround labels and removed the type/bug label Jul 16, 2020

alexec changed the title ~~artifactRepositoryRef does not work for inputs + no way to specify the location within the artifact (key)~~ Cannot specify key for input artifact (without full artifact location) Oct 14, 2020

Bobgy mentioned this issue Oct 21, 2020

[Multi User] Support separate artifact repository for each namespace kubeflow/pipelines#4649

Open

alexec self-assigned this Oct 21, 2020

alexec mentioned this issue Oct 21, 2020

Define artifactRepositoryRef only once in spec #3184

Closed

alexec added this to the v3.0 milestone Dec 20, 2020

alexec linked a pull request Jan 20, 2021 that will close this issue

feat(controller)!: Key-only artifacts. Fixes #3184 #4618

Merged

1 task

alexec closed this as completed in #4618 Jan 20, 2021
