Cannot specify key for input artifact (without full artifact location) #3307
Can I confirm if this used to work and does not work anymore? Or if it just never worked?
It never worked for me. See also #2461 (comment).
I have tested it using both `key` variants.
I think you must specify the `key` within the bucket, as well as the bucket, endpoint, etc. Not great, I agree, but that is how it is today.
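For reference, a minimal sketch of that fully-specified form (the endpoint, bucket, key and secret names below are illustrative, not from this thread):

```yaml
spec:
  templates:
    - name: consume
      inputs:
        artifacts:
          - name: my-input
            path: /tmp/my-input.txt
            # Every location detail must be spelled out per input artifact:
            s3:
              endpoint: s3.amazonaws.com
              bucket: my-bucket
              key: path/to/my-input.txt
              accessKeySecret:
                name: my-s3-credentials
                key: accessKey
              secretKeySecret:
                name: my-s3-credentials
                key: secretKey
      container:
        image: alpine:3.12
        command: [cat, /tmp/my-input.txt]
```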
So what's the point?
@hadim I'm going to recategorize this as an "enhancement". We should do more work in this area, and I'd like to assess interest.
Hi, I was looking for the same solution: I need many different artifacts as inputs and was hoping to have the S3 parameters defined only once.
Do 👍 to show interest.
It was really surprising to me to find out that you cannot specify input artifacts from the default artifact repository. It should be supported by default. I would expect to be able to specify only the key in the default bucket, and then it should work out of the box.
This issue isn't really to do with `key`.
Kubeflow Pipelines (KFP) tries to separate cluster config (like the artifact repository) from workflow config. If this feature is supported by Argo, we could let each user manage their artifact repository in their own namespace while keeping the workflow definitions shareable. This is one of the areas where KFP hasn't been able to achieve multi-tenancy separation: kubeflow/pipelines#1223 (comment). Some recent discussion: https://kubeflow.slack.com/archives/CE10KS9M4/p1602516358147900
Please help me understand: how is it supposed to choose the input artifact from the repository (out of thousands of artifacts already there) based on this information alone?

```yaml
inputs:
  artifacts:
    - name: hello_world
      path: /tmp/hello_world.txt
```

This does not seem to contain any information that can be used to get an artifact.
One solution that could solve this feature request is to add support for a generic artifact URI.
This would also be a step towards making it possible to pass artifact URIs using placeholders.
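A hypothetical sketch of what such placeholder support could look like (this syntax is invented for illustration and is not an implemented Argo feature):

```yaml
templates:
  - name: consume
    inputs:
      artifacts:
        - name: input1
          path: /tmp/input1.txt
    container:
      image: alpine:3.12
      command: [sh, -c]
      # Hypothetical placeholder resolving to the artifact's storage URI,
      # e.g. s3://my-bucket/path/to/input1.txt
      args: ["echo {{inputs.artifacts.input1.uri}}"]
```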
How would you support secrets for username + password?
I agree; however, if the workflow definition defines the artifact type and the key (in the case of S3), then the workflow controller should be able to figure out that it could try fetching the artifact from the default artifact repository (if that is S3) or from a referenced artifact location. This more closely resembles the scenario posted by the OP in #2461.
This is also my understanding today. For output artifacts, it is sufficient to specify only the name and the path in the workflow spec. The workflow controller will upload the output artifacts to the default artifact repository (if available). This lets you decouple the artifact repository configuration from the workflow definition (e.g. this configuration might differ between local, staging and production environments). For input artifacts, however, you need to specify the endpoint, bucket and key in the artifact definition (in the case of S3). I've been unsuccessful in decoupling the artifact configuration from the workflow definition, which, in my case, means the workflow spec depends on the environment, as input arguments are fetched from different places in the development, staging and production Argo installations. For both input and output artifacts it would be desirable to be able to (a) configure the artifact repository once per workflow, outside the workflow definition, and (b) override the repository on a per-artifact basis. #4618 addresses some of these issues, but it is not clear to me if the decoupling is provided on a per-artifact basis; it appears to be configured for an entire workflow.
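For contrast, a minimal sketch of the output side described above, which already works against the default repository:

```yaml
templates:
  - name: produce
    container:
      image: alpine:3.12
      command: [sh, -c, "echo hello > /tmp/result.txt"]
    outputs:
      artifacts:
        # name + path is enough; the controller uploads this
        # to the default artifact repository
        - name: result
          path: /tmp/result.txt
```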
#4618 de-couples artifact configuration from the workflow by allowing you to store it in a config map (or secret, I forget). You can only specify this configuration at the workflow level, so all artifacts must be stored in the same place UNLESS you are completely explicit, i.e. it does not do (b). However, it does lay a lot of the groundwork that would make (b) straightforward to do. Do you think (b) is a common use case? Give me a 👍 or 👎
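A sketch of that workflow-level decoupling, assuming a ConfigMap named `artifact-repositories` in the workflow's namespace (all names here are illustrative):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: artifact-repositories
data:
  # Each key holds one artifact repository definition.
  my-s3-repository: |
    s3:
      endpoint: s3.amazonaws.com
      bucket: my-bucket
      accessKeySecret:
        name: my-s3-credentials
        key: accessKey
      secretKeySecret:
        name: my-s3-credentials
        key: secretKey
---
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: repo-ref-example-
spec:
  entrypoint: produce
  # Workflow-level reference: applies to all artifacts in this workflow.
  artifactRepositoryRef:
    configMap: artifact-repositories
    key: my-s3-repository
  templates:
    - name: produce
      container:
        image: alpine:3.12
        command: [sh, -c, "echo hello > /tmp/result.txt"]
      outputs:
        artifacts:
          - name: result
            path: /tmp/result.txt
```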
Tested on 2.9.0-rc3.
The official example works. When switching to an input artifact, it fails.
Another important point IMO: I don't see any way to specify the `key` during workflow creation (the location of your object within the S3 bucket). My understanding is that `artifactRepositoryRef` can be used to set up default repositories and then be reused within workflows by specifying the location of the folder or file we want to use as input or output. Was it designed for that purpose?
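The original snippets did not survive on this page; a hedged reconstruction of the pattern being described (the key value is illustrative):

```yaml
# Works: an output artifact that specifies only the key; bucket,
# endpoint and credentials come from the default artifact repository.
outputs:
  artifacts:
    - name: message
      path: /tmp/message.txt
      s3:
        key: workflows/message.txt

# Fails (at the time of this issue): the same key-only form as an input.
inputs:
  artifacts:
    - name: message
      path: /tmp/message.txt
      s3:
        key: workflows/message.txt
```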