
fs/ufs: change default locking protocol -v5.0.x #12759

Merged

Conversation

edgargabriel
Member

The fs/ufs component by default disabled all file locking before read/write operations (except for NFS file systems). This was based on the assumption that the OS itself performs the required locking and hence we do not have to add our own.

This assumption is incorrect when using data sieving. With data sieving, the code 'ignores' small gaps when writing to a file and instead performs a read-modify-write sequence itself, for performance reasons. The problem, however, is that even within a collective operation not all aggregators might want to use data sieving. Hence, enabling locking just in the data-sieving routines is insufficient; all processes have to perform the locking. Therefore, our two options are: a) disable write data-sieving by default, or b) enable range-locking by default.
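To make the hazard concrete, here is a minimal, self-contained sketch of a read-modify-write step protected by a POSIX byte-range lock. This is purely illustrative and not the actual ompio fs/ufs code; the function names (`lock_range`, `sieved_write`) and the merge step are hypothetical, and real data sieving operates on file views rather than a single contiguous buffer.

```c
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Acquire or release a POSIX byte-range lock on [offset, offset+len). */
static int lock_range(int fd, off_t offset, off_t len, short type)
{
    struct flock fl;
    fl.l_type   = type;       /* F_WRLCK to acquire, F_UNLCK to release */
    fl.l_whence = SEEK_SET;
    fl.l_start  = offset;
    fl.l_len    = len;
    return fcntl(fd, F_SETLKW, &fl);   /* blocking lock request */
}

/* Read-modify-write over [offset, offset+len) under a range lock.
 * Without the lock, another process writing into the "gap" bytes between
 * our read and our write-back would have its update silently overwritten. */
static int sieved_write(int fd, off_t offset, size_t len,
                        const char *newdata, char *scratch)
{
    if (lock_range(fd, offset, (off_t)len, F_WRLCK) < 0)
        return -1;

    if (pread(fd, scratch, len, offset) < 0) {      /* read the whole extent, gaps included */
        lock_range(fd, offset, (off_t)len, F_UNLCK);
        return -1;
    }

    /* ... merge the caller's (possibly non-contiguous) pieces of newdata into
     *     scratch, leaving the gap bytes exactly as they were read ... */
    (void)newdata;

    ssize_t rc = pwrite(fd, scratch, len, offset);  /* write the merged extent back */
    lock_range(fd, offset, (off_t)len, F_UNLCK);
    return (rc < 0) ? -1 : 0;
}
```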

After some testing, I think enabling range-locking by default is the safer and better approach. It does not seem to show any significant performance impact on my test systems.

Note that on Lustre file systems we can keep the default at no-locking as far as I can see, since the collective algorithm used for Lustre is unlikely to produce this pattern. I did, however, add an MCA parameter that allows us to control the locking algorithm used by the Lustre component as well, in case we need to change it for a particular use case or platform.
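As a usage note (parameter names quoted from memory, so please double-check them): the locking choice is exposed as an MCA parameter and can be changed at run time without rebuilding, e.g. `mpirun --mca fs_ufs_lock_algorithm <value> ...` for the ufs component, and presumably an analogous `fs_lustre_*` parameter for the new Lustre knob added here; `ompi_info --param fs lustre --level 9` should list the exact name and accepted values.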

Fixes Issue #12718

Signed-off-by: Edgar Gabriel [email protected]
(cherry picked from commit c697f28)

@github-actions github-actions bot added this to the v5.0.6 milestone Aug 14, 2024
@edgargabriel edgargabriel changed the title fs/ufs: change default locking protocol fs/ufs: change default locking protocol -v5.0.x Aug 14, 2024
@wenduwan
Contributor

@edgargabriel I would like to run some tests on this PR. Do you recommend a specific setup?

@edgargabriel
Member Author

@wenduwan any MPI I/O test on a regular local (or non-Lustre) file system should work

@wenduwan
Contributor

@edgargabriel Thanks, unfortunately we don't have an MPI I/O test suite 😞 But let me run our normal tests.

@wenduwan wenduwan merged commit 43a3c3b into open-mpi:v5.0.x Aug 19, 2024
18 checks passed