Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: all written artifacts should be saved and garbage collected #13678

Merged
merged 8 commits into from
Oct 1, 2024

Conversation

juliev0
Copy link
Contributor

@juliev0 juliev0 commented Sep 29, 2024

Fixes #13583

Motivation

Two issues:

  1. Optional artifacts that were not written were still being passed back through the WorkflowTaskResult, which was causing Artifact GC to try to delete them (and fail to do so)
  2. If any non-optional artifact did fail to get saved because it wasn't present, other artifacts would not get saved

Modifications

To address # 1 above, only write to WorkflowTaskResult the artifacts which have been written, whether or not they're marked "Optional". This is what Artifact GC will attempt to delete.

To address # 2 above, don't prematurely return if an artifact fails to be saved.

Verification

e2e test for Artifact GC now includes tests for these cases.
To test # 1 above, see artgc-optional-artifact-not-written.yaml, which includes an Optional artifact that isn't present, which we confirm does not fail ArtifactGC. We also confirm that the Workflow succeeds in this case.

To test # 2 above, see artgc-non-optional-artifact-not-written.yaml , which includes a non-Optional artifact that isn't present, plus another artifact which is present and we confirm does get saved. We also confirm that the Workflow succeeds in this case, since the Artifact that wasn't present was non-optional.

@juliev0
Copy link
Contributor Author

juliev0 commented Sep 29, 2024

Hey @agilgur5 - I know you mentioned saving artifacts in parallel would be a good improvement. My thought was that with what I did it's clearly just a "fix" (which could theoretically be patched), and that saving in parallel would be more of a feature. I was also paranoid there could be some new risk of doing it in parallel, that some artifact type may not be able to handle the concurrent saves, although maybe that's not likely. If you feel this is silly, I can revise.

@juliev0 juliev0 marked this pull request as ready for review September 29, 2024 20:30
@agilgur5 agilgur5 changed the title fix: all Artifacts that were written should be saved and garbage collected fix: all written artifacts should be saved and garbage collected Sep 29, 2024
@agilgur5 agilgur5 added area/artifacts S3/GCP/OSS/Git/HDFS etc area/gc Garbage collection, such as TTLs, retentionPolicy, delays, and more labels Sep 29, 2024
Copy link
Member

@isubasinghe isubasinghe left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

workflow/executor/executor.go Outdated Show resolved Hide resolved
@juliev0 juliev0 merged commit ca6c414 into argoproj:main Oct 1, 2024
27 checks passed
@juliev0 juliev0 deleted the task-result-fix branch October 1, 2024 16:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/artifacts S3/GCP/OSS/Git/HDFS etc area/gc Garbage collection, such as TTLs, retentionPolicy, delays, and more
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ArtifactGC fails if output artifact with non-existent file is configured with optional: true
3 participants