-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JIT: Mark some intrinsics as must-expand for NAOT due to ILScanner dependencies #109609
Merged
jakobbotsch
merged 6 commits into
dotnet:main
from
jakobbotsch:mark-intrinsics-mustexpand-ilscanner
Nov 11, 2024
Merged
Changes from 3 commits
Commits
Show all changes
6 commits
Select commit
Hold shift + click to select a range
125778b
Mark some intrinsics as must-expand for NAOT due to ILScanner depende…
jakobbotsch 6db6378
Force MinOpts for NAOT
jakobbotsch 6cd7f25
Fix comment
jakobbotsch 68f0670
Only mustExpand generic version of GetArrayDataReference
jakobbotsch 1a6f2a9
Move a few other intrinsics under the mustExpand path
jakobbotsch 87854bb
Revert "Force MinOpts for NAOT"
jakobbotsch File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not totally clear to me whether we want these to be
mustExpand
or not, given the "If the intrinsic cannot possibly be expanded, it's fine" in the comment above... usually that would mean that from the JIT's perspective, we would not want to do the expansion in MinOpts.(In the end there is no practical difference in behavior... today we expand
betterToExpand
intrinsics even in MinOpts due to a check below, but that sort of goes against our general MinOpts philosophy, I think.)There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RuntimeHelpers_CreateSpan and RuntimeHelpers_InitializeArray are pattern matched. If they're not part of the specific IL patter, RyuJIT cannot possibly expand them - that's why it's not mandatory, just "whenever it's in the right shape". We don't want e.g.
RuntimeHelpers.InitializeArray(null, default);
to fail to compile. But if these intrinsics are not expanded by RyuJIT their method bodies just throw PNSE, we don't have an implementation, so we want to expand everything humanely possible.The rest are basically mustExpand, we don't have meaningful bodies for them. Could achieve the same thing by making them recursive? It has been a long time since I was looking at this and at that time we didn't have mustExpand/betterToExpand distinction or the recursive special handling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. GetArrayDataReference is defined recursively
runtime/src/coreclr/nativeaot/System.Private.CoreLib/src/System/Runtime/InteropServices/MemoryMarshal.NativeAot.cs
Lines 24 to 25 in bad7318
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok, this only handles the actual recursive call inside the method. Ignore my comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mustExpand
exists to flag cases that are required to be handled for correctness reasons. This should largely only apply to recursive calls (to avoid the stack overflow). In general it should never be incorrect to leave a call in because things like reflection, indirect invocation (delegates or function pointers), debugger evaluation, or other cases need to work.betterToExpand
then exists to flag cases where it is profitable to do the expansion inMinOpts
. This is primarily relevant to things that expand to trivial constant and which are likely to allow dead-code elimination or other simple transforms we do for similar reasons. Such cases most typically improve throughput of the JIT and are highly unlikely to cause bugs.We notably also treat some other cases, like hardware intrinsics, as functionally being
betterToExpand
(but we handle them explicitly early on, not via this local). We do it for these because they are known to be typically perf critical, are likely in methods that will be hot and be tiered, compile down to (most typically) singular instructions, generally improve JIT throughput/reduce JIT memory overhead as the hwintrinsic node is much cheaper than a call node (smaller, needs less handling in typical cases, doesn't require ABI fixups, doesn't need additional spills or copies to be introduced, etc), and because the reduce the overall overhead of the app; that is we don't free code for methods that were compiled to T0 and later rejitted; the overhead offunction alignment + vzeroupper + instruction + ret
for dozens to hundreds of intrinsics is non-trivial, as is the overhead required to meet ABI requirements per call (typically involving spilling the inputs so they can be passed by ref, allocating a return buffer, handling register save/restore requirements for V256/V512 registers, etc), and there often being many tightly coupled such calls in any method using them.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is required for correctness in this specific case. IL scanner assumes that these specific methods are always going to be expanded to save binary size related to generic dictionaries. These methods are very popular, so the size is disproportionate compared to what these methods do. If we were to remove the hardcoded assumption from IL scanner, we would take a binary size hit for native AOT.
This will be cleaner if/once IL Scanner is replaced by actual JIT importer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For me there is a distinction between tier0 and MinOpts for these things. Tier0 implies MinOpts, and for Tier0 we want to do the cheap and profitable optimizations when possible, but for MinOpts without Tier0 my philosophy is that we truly do not want to do any transformations if we can avoid it (generally for diagnostic/debugging/testing reasons).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I moved the rest of them to the
mustExpand
section as well. As pointed out above the recursiveness only matters when compiling the intrinsic method itself, not for callers of it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I generally agree, which is also why I had the annotation above:
In this case it sounds like its special and we have to expand it for NAOT for other reasons as well, so I don't think there's much we can do.
I don't think we'd have a problem with ignoring
betterToExpand
for trueMinOpts
(where we're falling back from some compilation failure or similar and just trying to emit functioning code). But, there is an incredibly high cost to this for methods using platform specific hardware intrinsics. Things like the ability to step into a method in the debugger similarly don't matter for them, as they are purely recursive calls, so it might be worth considering such cases to be functionally equivalent to IL opcodes and expanded anyways (and if IL were designed today, its very possible that SIMD would have a more direct representation).