-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
OFI MR flags selection based on the provider #1077
base: main
Are you sure you want to change the base?
Conversation
What is MR mode?
From: Md Rahman ***@***.***>
Date: Thursday, December 1, 2022 at 6:06 PM
To: Sandia-OpenSHMEM/SOS ***@***.***>
Cc: Subscribed ***@***.***>
Subject: [Sandia-OpenSHMEM/SOS] OFI MR flags selection based on the provider (PR #1077)
For most of the OFI providers, SOS uses a specific MR mode selection. However, user still has the option to select a different mode that may not be supported. This is an attempt to select the MR mode based on the provider by default. User may still choose a different mode during configuration.
…________________________________
You can view, comment on, or merge this pull request online at:
#1077
Commit Summary
* e0be06f<e0be06f> Removed deprecated flag; added trial MR hint based on provider
* f3a0540<f3a0540> Initial changes to adopt provider based MR flag selection
File Changes
(4 files<https://github.com/Sandia-OpenSHMEM/SOS/pull/1077/files>)
* M configure.ac<https://github.com/Sandia-OpenSHMEM/SOS/pull/1077/files#diff-49473dca262eeab3b4a43002adb08b4db31020d190caaad1594b47f1d5daa810> (8)
* M src/shmem_env_defs.h<https://github.com/Sandia-OpenSHMEM/SOS/pull/1077/files#diff-f68772abccc3858c31d45e1eb92a77ae0f449462ae2864ff76a9e37a29e89726> (2)
* M src/transport_ofi.c<https://github.com/Sandia-OpenSHMEM/SOS/pull/1077/files#diff-3cac951d73790a8a9ce2e3647758bc66cd9fc5d4c3243173fee89846df3d2ce3> (222)
* M src/transport_ofi.h<https://github.com/Sandia-OpenSHMEM/SOS/pull/1077/files#diff-d085d3b6407317efe43e401b2c469a5edf6c802fb513363ddd5043f10781817d> (96)
Patch Links:
* https://github.com/Sandia-OpenSHMEM/SOS/pull/1077.patch
* https://github.com/Sandia-OpenSHMEM/SOS/pull/1077.diff
—
Reply to this email directly, view it on GitHub<#1077>, or unsubscribe<https://github.com/notifications/unsubscribe-auth/AZXQXSN756OQSFW5I5BL2QTWLEVNXANCNFSM6AAAAAASRJ23EM>.
You are receiving this because you are subscribed to this thread.Message ID: ***@***.***>
|
@stewartl318 - Great question. In OFI memory registration, there are certain flags that define the behavior of the operations. For example, OFI internally can choose memory region attributes v/s. the user can choose by themselves. These settings are controlled by different MR mode flags. Traditionally, there were two modes: basic and scalable (https://ofiwg.github.io/libfabric/v1.1.1/man/fi_mr.3.html). But, now, there are additional flags that you can use and each provider supports / requires a subset of them (https://github.com/ofiwg/libfabric/wiki/Provider-Feature-Matrix-Main). |
@@ -194,13 +194,15 @@ AM_CONDITIONAL([HAVE_LONG_FORTRAN_HEADER], [test "$enable_long_fortran_header" = | |||
|
|||
AC_ARG_ENABLE([ofi-mr], | |||
[AC_HELP_STRING([--enable-ofi-mr=MODE], | |||
[OFI memory registration mode: basic, scalable, or rma-event (default: scalable)])]) | |||
[OFI memory registration mode: none, basic, scalable, or rma-event (default: none)])]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should "none" be called "auto" since it selects MR flags based on the provider?
[basic], | ||
[], | ||
[AC_DEFINE([ENABLE_MR_BASIC], [1], [If defined, the OFI transport will use FI_MR_BASIC])], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is ENABLE_MR_BASIC
used anywhere? Is it needed?
ret = fi_mr_reg(shmem_transport_ofi_domainfd, 0, UINT64_MAX, | ||
FI_REMOTE_READ | FI_REMOTE_WRITE, 0, 0ULL, flags, | ||
&shmem_transport_ofi_target_heap_mrfd, NULL); | ||
OFI_CHECK_RETURN_STR(ret, "target memory (all) registration failed"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm confused why virtual addressing now registers the heap now. Does it not need the data segment?
ret = fi_close(&shmem_transport_ofi_target_heap_mrfd->fid); | ||
OFI_CHECK_ERROR_MSG(ret, "Target heap MR close failed (%s)\n", fi_strerror(errno)); | ||
|
||
#if defined(ENABLE_REMOTE_VIRTUAL_ADDRESSING) | ||
if (shmem_transport_ofi_mr_mode != 0) { | ||
ret = fi_close(&shmem_transport_ofi_target_data_mrfd->fid); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a corresponding mr_reg
for shmem_transport_ofi_target_data_mrfd
if only ENABLE_REMOTE_VIRTUAL_ADDRESSING
is enabled...
@@ -1157,20 +1158,10 @@ int query_for_fabric(struct fabric_info *info) | |||
hints.addr_format = FI_FORMAT_UNSPEC; | |||
domain_attr.data_progress = FI_PROGRESS_AUTO; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a note that #1059 possibly motivates a guard warning about and/or preventing the usage of FI_MR_ENDPOINT
(and thus FI_PROGRESS_MANUAL
) except for providers where it makes sense. But come to think of it, maybe this PR already accomplishes that?
For most of the OFI providers, SOS uses a specific MR mode selection. However, user still has the option to select a different mode that may not be supported. This is an attempt to select the MR mode based on the provider by default. User may still choose a different mode during configuration.