feat(metadata): ability to get artifacts location for Argo-Workflows v3.0+ #5829
bucket=s3_artifact.get('bucket', ''),
key=s3_artifact.get('key', ''),
)
if (s3_artifact.keys() >= {'endpoint', 'bucket'}):
This comparison looks fragile. Can we do this check in a more robust way?
Hm, and why does it look fragile to you?
To me it looks like a clean, short way to check for multiple keys in a dict, compared with an explicit subset check or unnecessary loops.
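For reference, a minimal sketch of the two spellings under discussion. Dict key views are set-like, so `keys() >= {...}` is a superset test, and `issubset` expresses the same check from the other direction. The `s3_artifact` values below are made up for illustration and are not taken from the PR.

```python
# Hypothetical artifact dict, for illustration only.
s3_artifact = {'endpoint': 'minio-service:9000', 'bucket': 'mlpipeline', 'key': 'a/b.tgz'}

required = {'endpoint', 'bucket'}

# Key views are set-like, so >= is a superset test.
has_required_operator = s3_artifact.keys() >= required

# A more explicit spelling of the same check.
has_required_method = required.issubset(s3_artifact)

# A dict missing 'bucket' fails the check.
partial = {'endpoint': 'minio-service:9000'}
partial_ok = partial.keys() >= required
```

Both forms behave the same here; the choice is mostly about which one reads more clearly to reviewers.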
@@ -325,8 +333,14 @@ def is_kfp_v2_pod(pod) -> bool:

output_artifacts = []
for name, art in argo_output_artifacts.items():
artifact_uri = argo_artifact_to_uri(art)
if not artifact_uri:
artifact_uri_check = argo_artifact_to_uri(art)
Can we just call this `artifact_uri`?
Yes, we can. It's just a habit of mine not to overwrite variables while making checks.
if not artifact_uri:
artifact_uri_check = argo_artifact_to_uri(art)
if artifact_uri_check:
if re.search('(\W+)', artifact_uri_check).group(1) != '://':
Should this be something like `if re.match(r'^\W+://', artifact_uri):`?
Agreed, it could be `if not re.match(r'\w+://', artifact_uri):`. I just don't like to use `match` when I'm searching for something in the middle of the string.
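To make the difference concrete, here is a small sketch comparing the pattern from the diff with the anchored `re.match` form. The helper name `has_scheme` and the sample URIs are hypothetical and chosen only for illustration; the `s3://` scheme is an assumption about what `argo_artifact_to_uri` returns.

```python
import re

def has_scheme(uri: str) -> bool:
    """Return True if uri starts with something like 's3://' (hypothetical helper)."""
    # re.match anchors at the start of the string, so no '^' is needed.
    return re.match(r'\w+://', uri) is not None

full_uri = 's3://mlpipeline/artifacts/run/hello-art.tgz'
key_only = 'artifacts/run/hello-art.tgz'

# The pattern from the diff captures the first run of non-word characters.
first_nonword_full = re.search(r'(\W+)', full_uri).group(1)
first_nonword_key = re.search(r'(\W+)', key_only).group(1)

# With no non-word characters at all, re.search returns None, and
# calling .group(1) on that result would raise AttributeError.
no_match = re.search(r'(\W+)', 'helloart')
```

The `re.search` version depends on where the first non-word run happens to fall and fails outright on strings without one, while the `re.match` version only asks whether the string begins with a scheme.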
artifact_uri_check = argo_artifact_to_uri(art)
if artifact_uri_check:
if re.search('(\W+)', artifact_uri_check).group(1) != '://':
artifact_uri_wo_key = argo_artifact_to_uri(argo_template.get('archiveLocation', {}), True)
Maybe we can always pass `argo_template.get('archiveLocation', {})` to `argo_artifact_to_uri` and encapsulate this logic there?
The thing is that, despite the property name, `archiveLocation` stores the folder that holds the artifact, not the actual archive (file) location. For example, in `outputs` you will have a file location like `"key":"artifacts/artifact-passing-lr6fg/artifact-passing-lr6fg-2908156709/hello-art.tgz"`, while in `archiveLocation` you will have something like `"key":"artifacts/artifact-passing-lr6fg/artifact-passing-lr6fg-2908156709"`.
Thank you for this contribution.
/ok-to-test
I'm facing the same problem, using KF version 1.3. Is there any workaround for now?
Starting from this change, Argo-Workflows doesn't provide full data (with `provider` and `bucket` properties) in the `outputs` annotation by default.

With this change, `metadata_writer` combines the artifact file location from the `outputs` annotation with the `provider` and `bucket` data from the `archiveLocation` property of the `template` annotation when that data is missing from the `outputs` annotation.
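The fallback described above might be sketched roughly as follows. This is not the PR's code: the helper name `resolve_s3_location`, the dict shapes, and the endpoint and bucket values are assumptions for illustration, and the sketch uses the `endpoint` and `bucket` keys that appear in the diff.

```python
def resolve_s3_location(output_artifact: dict, archive_location: dict) -> dict:
    """Fill in fields missing from the output artifact using archiveLocation.

    Hypothetical helper; the dict shapes are assumptions, not the PR's code.
    """
    out_s3 = output_artifact.get('s3', {})
    archive_s3 = archive_location.get('s3', {})
    return {
        # Fall back to archiveLocation only when outputs lacks the field.
        'endpoint': out_s3.get('endpoint') or archive_s3.get('endpoint', ''),
        'bucket': out_s3.get('bucket') or archive_s3.get('bucket', ''),
        # The file key always comes from outputs; archiveLocation holds only the folder.
        'key': out_s3.get('key', ''),
    }

# Keys taken from the examples in the conversation; endpoint and bucket are made up.
output_artifact = {
    'name': 'hello-art',
    's3': {'key': 'artifacts/artifact-passing-lr6fg/artifact-passing-lr6fg-2908156709/hello-art.tgz'},
}
archive_location = {
    's3': {
        'endpoint': 'minio-service.kubeflow:9000',
        'bucket': 'mlpipeline',
        'key': 'artifacts/artifact-passing-lr6fg/artifact-passing-lr6fg-2908156709',
    },
}

location = resolve_s3_location(output_artifact, archive_location)
```

When `outputs` already carries the full data (as in older Argo versions), the fallback values are never used, so the same helper would cover both cases.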