Rollback Support
Consider a resource allocator that needs to allocate N resources in batches of size B. If each batch can be executed independently, it makes sense to run the batches in parallel as Flux tasks.
The same flow can be modelled using Flux primitives as follows:
num_batches := ceil(N / B)
for batch_num in 1..num_batches
    new Task() {
        execute() {
            allocate(batch_num)
        }
    }
end
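The batching arithmetic and the parallel fan-out above can be sketched in plain Python. This is only an illustration, not the Flux API: `allocate`, `allocate_in_batches`, and the use of a thread pool in place of Flux worker nodes are all assumptions made for the example. It also assumes that a final, partially filled batch should still be allocated when N is not a multiple of B.

```python
from concurrent.futures import ThreadPoolExecutor


def allocate(batch_num, batch_size, total):
    """Hypothetical allocator: returns the resource ids covered by one batch."""
    start = batch_num * batch_size
    return range(start, min(start + batch_size, total))


def allocate_in_batches(total, batch_size):
    # Integer ceiling division, so a trailing partial batch is not dropped.
    num_batches = (total + batch_size - 1) // batch_size
    with ThreadPoolExecutor() as pool:
        # Batches are independent, so all of them can be submitted at once.
        futures = [
            pool.submit(allocate, batch_num, batch_size, total)
            for batch_num in range(num_batches)
        ]
        # Collect results in batch order; result() re-raises any task error.
        return [f.result() for f in futures]
```

For example, `allocate_in_batches(10, 4)` produces three batches, the last of which covers only two resources.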
In Flux, each task can be executed independently on any available worker node. Each task can therefore fail independently, either because of a semantic error (failure to allocate resources) or because of a runtime failure (the task timed out). If some tasks succeed and others fail, it can be difficult to bring the system back to a stable state.
As a convenience, users can define a rollback() method for each task to handle any cleanup activities. Flux automatically triggers rollbacks when an unhandled or unexpected error occurs.
Additionally, users can use the Flux context to save any local state that may be needed for a rollback:
num_batches := ceil(N / B)
for batch_num in 1..num_batches
    new Task() {
        execute() {
            save batch_num to flux context
            allocate(batch_num)
        }
        rollback() {
            retrieve batch_num from flux context
            de_allocate(batch_num)
        }
    }
end
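The execute/rollback pairing can be simulated in plain Python to show how the saved context is used. This is a sketch under stated assumptions, not the Flux API: `AllocateBatchTask`, `run_with_rollback`, the `fail_on` parameter, and the per-task `context` dictionary are hypothetical stand-ins. In particular, the sketch assumes that a failure in any task triggers rollback() on every task; the exact set of tasks that Flux rolls back is not specified here.

```python
from concurrent.futures import ThreadPoolExecutor


class AllocateBatchTask:
    """Hypothetical task with an execute/rollback pair (not the Flux API)."""

    def __init__(self, batch_num, allocated, fail_on=()):
        self.batch_num = batch_num
        self.allocated = allocated  # shared set standing in for real resources
        self.fail_on = fail_on      # batches that simulate an allocation failure
        self.context = {}           # stand-in for the Flux context

    def execute(self):
        # Save what rollback() will need before doing any work.
        self.context["batch_num"] = self.batch_num
        if self.batch_num in self.fail_on:
            raise RuntimeError(f"allocation failed for batch {self.batch_num}")
        self.allocated.add(self.batch_num)

    def rollback(self):
        # A task whose execute() never started has nothing saved to undo.
        batch_num = self.context.get("batch_num")
        if batch_num is not None:
            # discard() is a no-op if the batch was never allocated.
            self.allocated.discard(batch_num)


def run_with_rollback(tasks):
    """Run all tasks in parallel; if any task fails, roll back every task."""
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(task.execute) for task in tasks]
        # exception() waits for completion and returns None on success.
        outcomes = [f.exception() for f in futures]
    errors = [e for e in outcomes if e is not None]
    if errors:
        for task in tasks:
            task.rollback()
    return errors
```

Keeping rollback() idempotent, as `discard` does here, means it is safe to call on tasks that failed before allocating anything as well as on tasks that succeeded.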