Each attempt is a separate Batch object, so inspecting all three failures is more difficult.
When a job is retried, it goes to the back of the queue. This is particularly painful during giant processing campaigns in custom deployments.
If a HyP3 deployment occurs between retry attempts, then all subsequent retries fail because the Step Function does not have permission to submit the outdated Batch job definition.
If the AMI has changed with a deployment, then all jobs fail their current attempt, which means that under the new retry strategy, all jobs fail permanently when the AMI changes.
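To illustrate the queue-position point above, here is a minimal toy model of a FIFO queue in which a failed attempt is resubmitted at the back. It is only a sketch: `completion_order` and its parameters are hypothetical names, and it does not model AWS Batch scheduling or any HyP3 code.

```python
from collections import deque


def completion_order(jobs, fails_once):
    """Return the order in which jobs finish in a toy FIFO queue.

    Assumption: any job named in `fails_once` fails its first attempt and is
    resubmitted at the back of the queue; every other attempt succeeds.
    """
    queue = deque((job, 1) for job in jobs)
    finished = []
    while queue:
        job, attempt = queue.popleft()
        if job in fails_once and attempt == 1:
            # The retry is a new submission, so it waits behind everything
            # that is already queued.
            queue.append((job, attempt + 1))
        else:
            finished.append(job)
    return finished


print(completion_order(["a", "b", "c", "d"], fails_once={"a"}))
# prints ['b', 'c', 'd', 'a']
```

In this model the job that was first in line finishes last after a single failure, and the delay grows with the number of jobs already queued.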
Jira: https://asfdaac.atlassian.net/browse/TOOL-2366
Note: The above link is accessible only to members of ASF.
The new retry strategy was implemented in #1871
Under the new retry strategy: