[pstl-offload] Initial support for PSTL offload under Windows #1359

Merged
merged 6 commits into from
Jul 4, 2024

Conversation

Contributor

@Alexandr-Konovalov Alexandr-Konovalov commented Jan 19, 2024

There are 4 groups of changes:

  1. The Microsoft dynamic runtime is instrumented to intercept global memory release. Microsoft Detours is used for that: it is downloaded and built from sources during the pstloffload build step, then statically linked into pstloffload.dll.
  2. Allocation function re-use. Under Windows, memory allocated by the native aligned allocation functions must not be released by free/realloc, but by the dedicated _aligned_free etc. counterparts. Alignment 0 is used as a sign that the regular malloc etc. should be called.
  3. Test changes. Linking with the release runtime is not supported for debug PSTL offload, so debug testing skips the iterator tests that require it.
  4. CI changes. A pstloffload_smoke_tests group of tests is created to run in CI checks.
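
Item 2 can be illustrated with a minimal sketch. The helper names (`native_allocate`, `native_release`, `ignore_alignment` as spelled here) are hypothetical and not taken from the PR; only the idea of using alignment 0 to mean "plain malloc" comes from the description. The point is that on Windows the release function has to match the allocation function, so the marker must travel with the block.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#if defined(_WIN32)
#include <malloc.h> // _aligned_malloc, _aligned_free
#endif

// Hypothetical marker: 0 is never a valid alignment, so it can mean "no explicit alignment".
inline constexpr std::size_t ignore_alignment = 0;

// Allocate with the native runtime; ignore_alignment selects plain malloc.
inline void* native_allocate(std::size_t size, std::size_t alignment)
{
    if (alignment == ignore_alignment)
        return std::malloc(size);
#if defined(_WIN32)
    return _aligned_malloc(size, alignment);
#else
    // std::aligned_alloc expects size to be a multiple of alignment, so round up.
    const std::size_t rounded = (size + alignment - 1) / alignment * alignment;
    return std::aligned_alloc(alignment, rounded);
#endif
}

// Release with the function that matches how the block was allocated.
inline void native_release(void* ptr, std::size_t alignment)
{
#if defined(_WIN32)
    if (alignment != ignore_alignment)
    {
        _aligned_free(ptr); // blocks from _aligned_malloc must not go to free()
        return;
    }
#else
    (void)alignment; // outside Windows, free() releases both kinds of block
#endif
    std::free(ptr);
}
```

A caller would pass the same alignment value to `native_release` that it passed to `native_allocate`, which is why the value has to be remembered alongside the block.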

@Alexandr-Konovalov Alexandr-Konovalov force-pushed the dev/akonoval/public_pstl-offload-Win branch 2 times, most recently from 7f64ad7 to a1c0c20 Compare January 30, 2024 09:35
@akukanov akukanov added this to the 2022.6.0 milestone Feb 5, 2024
@Alexandr-Konovalov Alexandr-Konovalov force-pushed the dev/akonoval/public_pstl-offload-Win branch 2 times, most recently from 8f9d9a6 to 31f9e14 Compare February 6, 2024 17:43
@rarutyun rarutyun added the follow through PRs/issues that should be completed/resolved label Feb 12, 2024
Contributor

@dmitriy-sobolev dmitriy-sobolev left a comment

I've checked all CMakeLists.txt and ci.yml files. Everything looks good apart from some minor comments (probably a matter of personal taste).

test/CMakeLists.txt (outdated, resolved)
test/CMakeLists.txt (resolved)
test/CMakeLists.txt (outdated, resolved)
src/CMakeLists.txt (outdated, resolved)
src/CMakeLists.txt (resolved)
Contributor

@rarutyun rarutyun left a comment

I basically went through the patch and I have some questions. I suggest we talk once again. I think Thursday should be OK.

include/pstl_offload/internal/usm_memory_replacement.h (outdated, resolved)

if (__same_memory_page(__user_ptr, __header) && __header->_M_uniq_const == __uniq_type_const)
{
if (__header->_M_requested_number_of_bytes == __new_size && (uintptr_t)__user_ptr % __alignment == 0)
Contributor

Suggested change
- if (__header->_M_requested_number_of_bytes == __new_size && (uintptr_t)__user_ptr % __alignment == 0)
+ if (__header->_M_requested_number_of_bytes == __new_size && (std::uintptr_t)__user_ptr % __alignment == 0)

Contributor Author

Done.
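
For readers following the thread, here is a minimal sketch of the kind of check under discussion: a realloc request can keep the existing block when the requested size is unchanged and the pointer already satisfies the alignment. The `block_header` struct and both function names are hypothetical simplifications, not the PR's code. The `std::` qualification matters because `<cstdint>` guarantees only the `std::uintptr_t` spelling; the unqualified name may or may not be visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for the bookkeeping header stored in front of user memory.
struct block_header
{
    std::size_t requested_bytes;
};

// True when ptr is a multiple of alignment (alignment must be non-zero).
inline bool is_aligned(const void* ptr, std::size_t alignment)
{
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// True when the existing block already satisfies the request:
// same size, and the current pointer meets the requested alignment.
inline bool can_reuse_block(const block_header& header, const void* user_ptr,
                            std::size_t new_size, std::size_t alignment)
{
    return header.requested_bytes == new_size && is_aligned(user_ptr, alignment);
}
```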

src/pstl_offload.cpp (outdated, resolved)
test/pstl_offload/memory/allocation_utils.h (outdated, resolved)
test/pstl_offload/memory/usm_memory_alignment.pass.cpp (outdated, resolved)
test/pstl_offload/memory/usm_memory_replacement.pass.cpp (outdated, resolved)
@Alexandr-Konovalov Alexandr-Konovalov force-pushed the dev/akonoval/public_pstl-offload-Win branch from 017b9ab to 7955c50 Compare March 18, 2024 13:43
kboyarinov previously approved these changes Mar 21, 2024
Contributor

@kboyarinov kboyarinov left a comment

LGTM

Contributor

@danhoeflinger danhoeflinger left a comment

Just getting my feet under me in reviewing this. Here are a few minor things to start with, but will review more in the coming days.

test/CMakeLists.txt (resolved)
include/pstl_offload/internal/usm_memory_replacement.h (outdated, resolved)
include/pstl_offload/internal/usm_memory_replacement.h (outdated, resolved)
}
return ::__pstl_offload::__internal_aligned_realloc(__ptr, __size, __alignment);
}

Contributor

Do we care about supporting the following?
_aligned_recalloc
_aligned_offset_realloc
_aligned_offset_recalloc

Contributor Author

The logic here is as follows: for TBBmalloc, no one has asked about those functions in 10+ years, so we may want to ignore them as well.

On the other hand, _recalloc/_aligned_recalloc is low-hanging fruit.

Contributor

I have no strong feelings about what specific functions we support. I suppose the thing to do is to decide on them and document what is supported so it is not a guessing game for the user. Perhaps this becomes an issue to be resolved in a later doc specific PR. This is a fine resolution from my perspective without code changes.

Contributor Author

Good point. @rarutyun has a table with such functions for Linux; it could be complemented with the Windows ones and released as part of the product's documentation.
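
As a rough illustration of why a recalloc-style function was called low-hanging fruit in this thread, the sketch below layers it on top of realloc: resize the block, then zero whatever was newly added. This is not code from the PR. `recalloc_sketch` is a hypothetical name, and the caller is assumed to know the previously requested size (for example from a bookkeeping header), which is passed in as `old_bytes`.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>
#include <limits>

// Hypothetical helper: resize an array to num elements of size bytes each
// and zero the bytes that were not part of the old block.
// old_bytes is the size previously requested (0 when ptr is null).
inline void* recalloc_sketch(void* ptr, std::size_t old_bytes, std::size_t num, std::size_t size)
{
    // Reject num * size overflow before multiplying.
    if (size != 0 && num > std::numeric_limits<std::size_t>::max() / size)
        return nullptr;

    const std::size_t new_bytes = num * size;
    void* res = std::realloc(ptr, new_bytes);

    // Existing contents are preserved by realloc; only the new tail needs zeroing.
    if (res != nullptr && new_bytes > old_bytes)
        std::memset(static_cast<char*>(res) + old_bytes, 0, new_bytes - old_bytes);
    return res;
}
```

The behavior for a zero-byte request follows whatever the underlying realloc does, so a production version would need to decide that case explicitly.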

@Alexandr-Konovalov Alexandr-Konovalov force-pushed the dev/akonoval/public_pstl-offload-Win branch from f7f7bbb to 5de65c9 Compare April 10, 2024 10:12
include/pstl_offload/internal/usm_memory_replacement.h (outdated, resolved)
}


std::size_t
Contributor

Should this function be static as well?

Contributor Author

No, it is exported and used in the public headers.

src/pstl_offload.cpp (outdated, resolved)
src/pstl_offload.cpp (outdated, resolved)
test/pstl_offload/memory/usm_memory_alignment.pass.cpp (outdated, resolved)
test/pstl_offload/memory/usm_memory_replacement.pass.cpp (outdated, resolved)
Comment on lines +67 to +79
set(DETOURS_PATH ${PROJECT_BINARY_DIR}/src/project_detours-prefix/src/project_detours)
ExternalProject_Add(project_detours
    GIT_REPOSITORY https://github.com/microsoft/Detours.git
    GIT_TAG 4b8c659f549b0ab21cf649377c7a84eb708f5e68
    INSTALL_COMMAND ""
    CONFIGURE_COMMAND ""
    # use CL to add additional flags to the Detours build
    BUILD_COMMAND cd src && SET CL=/sdl
    COMMAND ${NMAKE_EXE}
    BUILD_IN_SOURCE on
    STEP_TARGETS build
    BUILD_BYPRODUCTS ${DETOURS_PATH}/lib.X64/detours.lib
)
Contributor

Do we need to mention the use of detours in our documentation (cmake readme and/or pstl section of the guide)?

Are there any license implications? (I assume not since it's being used as an external project, but I don't know.)

Contributor Author

There are license implications, I believe. @ValentinaKats promised to add them in a separate commit.

Contributor

I added Detours as a third-party program to the third-party-programs.txt file directly in this PR (b09be05).
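
For context on what the Detours dependency is for, the following is a portable model of the interception pattern only. It does not use the Detours API: a plain function pointer stands in for the runtime's free() entry point, whereas Detours patches the target function itself. All names are hypothetical. The replacement releases blocks that belong to the offload allocator and forwards every other pointer to the original function.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <unordered_set>

using free_fn = void (*)(void*);

// Registry of blocks handed out by the hypothetical offload allocator.
inline std::unordered_set<void*>& owned_blocks()
{
    static std::unordered_set<void*> blocks;
    return blocks;
}

// Counts how many releases the hook handled itself.
inline int& intercepted_count()
{
    static int count = 0;
    return count;
}

// Stand-in for the runtime's original free().
inline void original_free(void* ptr) { std::free(ptr); }

// Stand-in for the entry point that every caller goes through.
inline free_fn& free_entry_point()
{
    static free_fn entry = original_free;
    return entry;
}

inline void* owned_allocate(std::size_t size)
{
    void* ptr = std::malloc(size);
    if (ptr != nullptr)
        owned_blocks().insert(ptr);
    return ptr;
}

// Replacement: release our own blocks ourselves, forward everything else.
inline void hooked_free(void* ptr)
{
    if (owned_blocks().erase(ptr) != 0)
    {
        ++intercepted_count();
        std::free(ptr); // a real implementation would release device-shared memory here
        return;
    }
    original_free(ptr);
}

inline void install_hook() { free_entry_point() = hooked_free; }
```

After `install_hook()`, a call through `free_entry_point()` reaches `hooked_free` regardless of which module made the call, which is the property the instrumentation needs.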

@rarutyun
Contributor

Other than my comments, looks good to me. I would also ask somebody to review the infrastructure part.

@danhoeflinger
Contributor

> Other than my comments, looks good to me. I would also ask somebody to review the infrastructure part.

I've reviewed the infrastructure part, and it LGTM. I've reviewed the rest as well, and it looks good other than the documentation / licensing conversations still open (which can be addressed separately). It would be good to get final approval from @rarutyun after his comments.

rarutyun previously approved these changes Apr 22, 2024
Contributor

@rarutyun rarutyun left a comment

Can somebody approve from the infrastructure perspective? @danhoeflinger or @dmitriy-sobolev, could it be one of you? Or should it be someone else?

danhoeflinger previously approved these changes Apr 23, 2024
Contributor

@danhoeflinger danhoeflinger left a comment

LGTM for the infrastructure part (and the normal part as well, as far as I can see).
The decision to merge should of course wait for the offline discussion.

@akukanov akukanov modified the milestones: 2022.6.0, 2022.7.0 Apr 23, 2024
@Alexandr-Konovalov Alexandr-Konovalov dismissed stale reviews from danhoeflinger and rarutyun via 067b695 May 14, 2024 14:52
@Alexandr-Konovalov Alexandr-Konovalov force-pushed the dev/akonoval/public_pstl-offload-Win branch from b09be05 to 067b695 Compare May 14, 2024 14:52
Alexandr-Konovalov and others added 5 commits July 2, 2024 10:13
Add Detours to third-party-programs.txt
@Alexandr-Konovalov Alexandr-Konovalov force-pushed the dev/akonoval/public_pstl-offload-Win branch from 8618b85 to 622273b Compare July 2, 2024 08:14
// Under Windows, we must not use functions with explicit alignment for malloc replacement, as
// an allocated memory would be released by free() replacement, that has no alignment argument.
// Mark such allocations with special alignment. Use 0, as this is not valid alignment.
static constexpr std::size_t __ignore_alignment = 0;
Contributor

I would make it inline. Like:

Suggested change
- static constexpr std::size_t __ignore_alignment = 0;
+ inline constexpr std::size_t __ignore_alignment = 0;

Contributor Author

Done.
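
A short sketch of the difference behind this suggestion, with hypothetical names in a `sketch` namespace. A namespace-scope `static constexpr` variable has internal linkage, so every translation unit that includes the header gets its own object. A C++17 `inline constexpr` variable has external linkage and is a single object shared by all translation units. For a plain integer constant the observable difference is small, but `inline` avoids per-translation-unit copies when the constant is used from inline functions in a header.

```cpp
#include <cassert>
#include <cstddef>

namespace sketch
{
// Internal linkage: each translation unit including this gets its own copy.
static constexpr std::size_t ignore_alignment_static = 0;

// C++17 inline variable: one object shared by all translation units.
inline constexpr std::size_t ignore_alignment = 0;

// Helper that uses the constant in a constant expression.
constexpr bool has_explicit_alignment(std::size_t alignment)
{
    return alignment != ignore_alignment;
}

static_assert(!has_explicit_alignment(ignore_alignment), "0 is the 'no explicit alignment' marker");
static_assert(has_explicit_alignment(16), "real alignments are non-zero");
} // namespace sketch
```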

@@ -189,13 +200,24 @@ __internal_aligned_alloc(std::size_t __size, std::size_t __alignment)
if (__sycl_device_shared_ptr __dev = __offload_policy_holder_type::__get_device_ptr(__offload_policy_holder))
{
void* __res = __allocate_shared_for_device(std::move(__dev), __size, __alignment);
assert((std::uintptr_t(__res) & (__alignment - 1)) == 0);
if (__res && __alignment)
Contributor

For everything but bool, I would recommend not relying on implicit conversions to bool. Readability-wise, an explicit comparison tells you what kind of variable you are checking right in the if-statement. The only place where conversion to bool is worth it (in my opinion) is generic code.

Suggested change
- if (__res && __alignment)
+ if (__res != nullptr && __alignment != 0)

You can swap the operands if you like, i.e. nullptr != __res && 0 != __alignment.

Contributor Author

Done.

Can it be part of the coding guidelines, then?

void* __res =
__original_aligned_alloc((__ignore_alignment == __alignment) ? alignof(std::max_align_t) : __alignment, __size);
#endif
if (__res && __alignment)
Contributor

The same comment as above

Contributor Author

Done.
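
The two snippets in this thread can be summarized with a small sketch, using hypothetical names rather than the PR's identifiers. It shows the fallback to `alignof(std::max_align_t)` when the marker value is passed, and the explicit comparisons suggested in the review. What the guarded branch does in the PR is not visible in the excerpt, so an alignment check is used here as a stand-in. The guard matters for such a check because with alignment 0 the mask `alignment - 1` wraps around to all ones, and the test would fail for every non-null pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical marker: 0 means "no explicit alignment was requested".
inline constexpr std::size_t ignore_alignment = 0;

// Alignment actually requested from the underlying allocator.
constexpr std::size_t effective_alignment(std::size_t alignment)
{
    return (ignore_alignment == alignment) ? alignof(std::max_align_t) : alignment;
}

// Post-allocation check with explicit comparisons instead of implicit bool conversions.
inline bool satisfies_request(const void* res, std::size_t alignment)
{
    if (res != nullptr && alignment != 0)
        return (reinterpret_cast<std::uintptr_t>(res) & (alignment - 1)) == 0;
    return true; // nothing to verify for a failed allocation or the marker value
}
```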

@Alexandr-Konovalov Alexandr-Konovalov merged commit 7146089 into main Jul 4, 2024
21 checks passed
@Alexandr-Konovalov Alexandr-Konovalov deleted the dev/akonoval/public_pstl-offload-Win branch July 4, 2024 08:48
7 participants