Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

CP-50475: parallelize device ops during VM lifecycle ops #6057

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

psafont
Copy link
Member

@psafont psafont commented Oct 16, 2024

Operations on different devices should be independent and therefore can be
parallelized. This both means parallelizing operations on different device
types and on devices for the same type.

An atom to serialize action has been introduced because the operation regarding
a single device must be kept serialized.

Also removes some unneeded parallel atoms, which cause some overhead as well as polluting the traces

This makes VM_starts to run about ~1 second faster on some tests where VMs have 5 VIFs. I'm testing it further, but it doesn't look like it's breaking any tests. I want to push it through some bootstorms and measure improvements

Please do suggest some further parallelization if you can find it!

BVT+BST are all green

psafont and others added 2 commits October 16, 2024 13:54
Parallel atoms do quite a bit of unnecessary actions even when they are empty.
They are also not needed when running a single task.

They also show as spans in the traces. Removing them makes the traces shorter
and easier to read.

Co-authored-by: Edwin Török <[email protected]>
Signed-off-by: Pau Ruiz Safont <[email protected]>
Operations on different devices should be independent and therefore can be
parallelized. This both means parallelizing operations on different device
types and on devices for the same type.

An atom to serialize action has been introduced because the operation regarding
a single device must be kept serialized.

Signed-off-by: Pau Ruiz Safont <[email protected]>
@psafont psafont requested a review from robhoes October 16, 2024 12:58
@robhoes
Copy link
Member

robhoes commented Oct 16, 2024

I have to look at the details of the PR, but the first thing I wondered is why we need to introduce a new Serial atomic operation, as clearly we are already able to serialise. The list returned by atomics_of_operation is implicitly serial and executed as such by perform_atomics. So now we have two ways of expression serial (atomic) operations.

I can see that nesting is easier with an explicit Serial atomic. Should we then as a next step make all serialisation explicit by turning atomics_of_operation into atomic_of_operation and wrapping all lists with Serial? Then we can merge perform_atomics and perform_atomics as well.

@psafont
Copy link
Member Author

psafont commented Oct 16, 2024

I introduced the serial atomic because it was convenient. At some point I also added a nop one, which removes the need to concatenate lists quite a bit, but I'd rather do the latter than the former, as it could end up in the traces.

The list returned by atomics_of_operation is implicitly serial and executed as such by perform_atomics. So now we have two ways of expression serial (atomic) operations.

I can see that nesting is easier with an explicit Serial atomic.

An option to avoid the serial atom would be to make the parallel one to receive a list of lists of atomics ops, instead of a list of atomic ops, although I'm not fond of the implicitness of it.

Should we then as a next step make all serialisation explicit by turning atomics_of_operation into atomic_of_operation and wrapping all lists with Serial? Then we can merge perform_atomics and perform_atomics as well.

I can try :)

@psafont
Copy link
Member Author

psafont commented Oct 17, 2024

Trying is a good exercise of yak-shaving: there's code to calculate progress of tasks. It's not clear to me that it works well currently, especially with regard to the parallel atomic op; and merging the functions needs to track progress into the Serial atomic op. The code needs unit-testing to be able to make it's not made even worse and the results are predictable.

The benefits of the current patch can be seen in the traces
image
image

Now the critical path doesn't contain the VIFs, and now it's dominated even more by the SM and the need to plug RW VBDs before RO ones

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants