-
Notifications
You must be signed in to change notification settings - Fork 192
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add streaming-compatible SVE variant to VFABI mangling #292
base: main
Are you sure you want to change the base?
Conversation
From the point of view of vector libraries, it is convenient to treat SVE and streaming-compatible SVE as separate vector variants. This is because existing optimised SVE routines may not be compatible with streaming mode, for instance where they use SVE instructions which are illegal in streaming mode. This patch adds the ISA marker 'c', for streaming-compatible SVE. Existing mapping from scalar to SVE symbols should all still make sense with streaming-compatibility enabled, with the exception that if the region being vectorised may have streaming enabled then the 'c' variant should be used rather than 's'. At present, for library purposes we are only interested in reaching a consensus about what to name the routines, rather than extending OpenMP and the VFABI to actually facilitate autovectorisation, however please let me know if there is anything that I have left ambiguous or need to add.
I'm not an expert in this area but I'm happy to see this change! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll caveat the review with my lack of expertise in the SVE/SME/Neon area so I could easily have missed something.
The addition of the name mangling seems reasonable to me at the specification level, on the assumption that we want first class support for vector functions in streaming and streaming-compatible functions.
I'm much less confident about the generation of functions, I don't fully understand how the pragma would know the streaming context. Perhaps this part could be split out into a separate pull-request, as I assume a library writer can always manually provide the necessary functions?
To progress this issue I think we'd need a commitment from the compiler teams that they would implement this before it gets merged.
Streaming-compatible SVE Examples | ||
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ | ||
|
||
The use of ``#pragma omp declare simd`` with ``f``, ``g`` and ``foo`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would the compiler know that we are in a streaming or streaming-compatible region?
From what I can find in the ACLE the user adds __arm_streaming
and __arm_streaming_compatible
to functions. I can see how this would work in generating calls to streaming compatible vector functions, but I'm struggling with how this would work for generating functions with #pragma omp declare simd
as these appear to be placed in front of function declarations, which typically won't be within the scope of of a function body.
I'm not at all familiar with OMP though so there could be a way of declaring a streaming compatible region, I've checked https://clang.llvm.org/docs/AttributeReference.html#pragma-omp-declare-simd
Would be good to give an example if you can?
If this part is somewhat speculative it could be worth separating it from the name mangling part, as I assume someone could manually write the functions rather than have the compiler auto-generate them.
@@ -942,6 +942,13 @@ undefined. | |||
Zn.b [msb] ... 0x??????03 0x??????02 0x??????01 0x??????00 [lsb] | |||
Zn.s [msb] ... 0x00000003 0x00000002 0x00000001 0x00000000 [lsb] | |||
|
|||
Streaming compatibility |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The alternative would be that streaming/streaming-compatible regions call the scalar functions? Or make all vector functions streaming-compatible. Which would avoid the name mangling, but lose performance.
Ideally a streaming-compatible target would be universally compatible so that we wouldn't need a non-streaming and streaming-compatible implementation, although I don't think we can represent that with name-mangling. I guess a synonym or wrapper could be used in that case?
From the point of view of vector libraries, it is convenient to treat SVE and streaming-compatible SVE as separate vector variants. This is because existing optimised SVE routines may not be compatible with streaming mode, for instance where they use SVE instructions which are illegal in streaming mode.
This patch adds the ISA marker 'c', for streaming-compatible SVE. Existing mapping from scalar to SVE symbols should all still make sense with streaming-compatibility enabled, with the exception that if the region being vectorised may have streaming enabled then the 'c' variant should be used rather than 's'.
At present, for library purposes we are only interested in reaching a consensus about what to name the routines, rather than extending OpenMP and the VFABI to actually facilitate autovectorisation, however please let me know if there is anything that I have left ambiguous or need to add.