Add deleteTasksOnCompletion to Azure Batch configuration #4114
Deleting Azure Tasks was checking the configuration object deleteJobsOnCompletion which was incorrect since a task belongs to a job. This adds the equivalent configuration for tasks which is checked before deleting the tasks. Signed-off-by: Adam Talbot <[email protected]>
Ok, though the cleanup logic is becoming a bit convoluted.
We could just...remove this bit? Deleting Azure Tasks isn't really necessary because it can be performed at the Job (queue) level. I don't believe deleting tasks achieves much other than clearing them out of the Azure Batch interface.
plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchTaskHandler.groovy
Tagging @bentsherman for reviewing
Signed-off-by: Ben Sherman <[email protected]>
docs/config.md
Outdated
`azure.batch.deleteJobsOnCompletion`
: Enable the automatic deletion of jobs created by the pipeline execution (default: `true`).

`azure.batch.deletePoolsOnCompletion`
: Enable the automatic deletion of compute node pools upon pipeline completion (default: `false`).

`azure.batch.deleteTasksOnCompletion`
: Delete tasks after successful completion. This deletes them on the Azure Batch service but files and resources may persist on the compute nodes (default: `true`).
Had a quick thought - does this default to true? I think it actually defaults to false.
I think you are right, all of these delete flags are false by default.
Actually `deleteJobsOnCompletion` is true by default because of this line:
nextflow/plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchService.groovy
Line 860 in 9fc1d3b
    if( config.batch().deleteJobsOnCompletion!=Boolean.FALSE ) {
But it defaults to false in shouldDelete()
Actually it defaults to false there too:
nextflow/plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchTaskHandler.groovy
Lines 137 to 138 in 9fc1d3b
    if( !taskKey || shouldDelete()==Boolean.FALSE )
        return
Probably this logic should happen in AzBatchOpts
Adam, do you think it would be better to remove the task deletion logic? If they are already deleted when the job is deleted (for which there is already a config option), then it seems like that would be better by reducing the number of API calls to Azure Batch. On the other hand, I suppose the tasks would be deleted only when the entire job queue is done, rather than immediately. I'm not sure whether that matters.
I was thinking the same, I'm not sure what deleting the task adds. The only possibility is if it clears out data from the worker nodes but I don't think it actually does, I think it just removes them from the batch interface.
From the documentation:
That makes me think there is an advantage to deleting tasks as soon as possible. Unless someone wants to be able to control task deletion and job deletion, I am fine with the current behavior, which controls both via
On the other hand, Dave on StackOverflow says:
I think I will go with the Microsoft docs 😆
Yep, stable Nextflow currently leaves jobs in an active state, which fills the quota. My recent PR switches them to terminate if Nextflow finishes gracefully (it still leaves them hanging around if it blows up). With this behaviour we are adding some task delete calls before running a job delete call, so shall we just remove it? It doesn't really add anything as long as the job is deleted cleanly. And we know we frequently hit 429 errors on Azure.
But my point is, if the tasks aren't deleted until the entire job is done, will that cause the task files on the VM to accumulate and run out of storage? Or are the task files already deleted some other way?
Yes, this happens frequently and is a common complaint on Azure. I was wrong, I thought it only ran I will definitely clean up the logic.
Okay, then let's add your config option making it true by default, and if a user has rate limit issues then they have that option as a lever. Let me push a few minor changes first.
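For illustration, a minimal sketch of what that lever might look like in a user's `nextflow.config`. It assumes the option lands under the `azure.batch` scope with the name proposed in this PR and is enabled by default; treat it as a sketch rather than the final documented form.

```groovy
// Sketch only: option name and scope as proposed in this PR (assumed default: true).
azure {
    batch {
        // Skip the per-task delete calls to reduce Azure Batch API traffic,
        // e.g. when the pipeline keeps hitting 429 (rate limit) errors.
        deleteTasksOnCompletion = false
    }
}
```

With task deletion switched off like this, tasks would presumably only go away when the job itself is deleted.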
Another quick point, according to the code it retains successful runs but not failed ones for debugging reasons? I think we should just remove that block and make it a blanket true or false to simplify the logic.
The way it works currently is that If you think that isn't useful, I would be fine with removing it, thereby deleting all tasks by default.
OK that makes sense. Although in general the worker machine is autoscaled down after a failure and you lose the logs anyway.
That makes sense. The reason to remove it might be to make it simpler, i.e. a simple true or false.
One small change - let's default deleteJobsOnCompletion to
Okay I updated the tests and docs to reflect that
@pditommaso summary of changes:
- change `terminateJobsOnCompletion` to true by default
- change `deleteJobsOnCompletion` to false by default
- add `deleteTasksOnCompletion` (true by default), which controls deletion of tasks as soon as they are complete
So now the default behavior is to terminate jobs but not delete them, and to delete tasks as soon as they complete. Failed tasks are still preserved for debugging purposes, unless the option is explicitly enabled.
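As a rough illustration, the defaults summarized above would correspond to a configuration like the following if written out explicitly. The `azure.batch` scope and option names are taken from this PR; the explicit values simply restate the summary and have not been checked against the merged code.

```groovy
// Sketch only: restates the defaults described in the summary above.
azure {
    batch {
        terminateJobsOnCompletion = true   // terminate jobs when the pipeline completes
        deleteJobsOnCompletion    = false  // keep (terminated) jobs rather than deleting them
        deleteTasksOnCompletion   = true   // delete tasks as soon as they complete
    }
}
```

A user who wants the previous cleanup behavior could presumably set `deleteJobsOnCompletion = true` in the same block.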
Thanks both.