Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add streaming-compatible SVE variant to VFABI mangling #292

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

joeramsay
Copy link

From the point of view of vector libraries, it is convenient to treat SVE and streaming-compatible SVE as separate vector variants. This is because existing optimised SVE routines may not be compatible with streaming mode, for instance where they use SVE instructions which are illegal in streaming mode.

This patch adds the ISA marker 'c', for streaming-compatible SVE. Existing mapping from scalar to SVE symbols should all still make sense with streaming-compatibility enabled, with the exception that if the region being vectorised may have streaming enabled then the 'c' variant should be used rather than 's'.

At present, for library purposes we are only interested in reaching a consensus about what to name the routines, rather than extending OpenMP and the VFABI to actually facilitate autovectorisation, however please let me know if there is anything that I have left ambiguous or need to add.

From the point of view of vector libraries, it is convenient to treat
SVE and streaming-compatible SVE as separate vector variants. This is
because existing optimised SVE routines may not be compatible with
streaming mode, for instance where they use SVE instructions which are
illegal in streaming mode.

This patch adds the ISA marker 'c', for streaming-compatible
SVE. Existing mapping from scalar to SVE symbols should all still make
sense with streaming-compatibility enabled, with the exception that if
the region being vectorised may have streaming enabled then the 'c'
variant should be used rather than 's'.

At present, for library purposes we are only interested in reaching a
consensus about what to name the routines, rather than extending
OpenMP and the VFABI to actually facilitate autovectorisation, however
please let me know if there is anything that I have left ambiguous or
need to add.
@mplatings
Copy link

I'm not an expert in this area but I'm happy to see this change!
In code I've written I've had to resort to awkward workarounds because different instantiations of template functions get the same mangled name. It looks like this change will enable giving the different instantiations different mangled names.

Copy link
Contributor

@smithp35 smithp35 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll caveat the review with my lack of expertise in the SVE/SME/Neon area so I could easily have missed something.

The addition of the name mangling seems reasonable to me at the specification level, on the assumption that we want first class support for vector functions in streaming and streaming-compatible functions.

I'm much less confident about the generation of functions, I don't fully understand how the pragma would know the streaming context. Perhaps this part could be split out into a separate pull-request, as I assume a library writer can always manually provide the necessary functions?

To progress this issue I think we'd need a commitment from the compiler teams that they would implement this before it gets merged.

Streaming-compatible SVE Examples
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The use of ``#pragma omp declare simd`` with ``f``, ``g`` and ``foo``
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How would the compiler know that we are in a streaming or streaming-compatible region?

From what I can find in the ACLE the user adds __arm_streaming and __arm_streaming_compatible to functions. I can see how this would work in generating calls to streaming compatible vector functions, but I'm struggling with how this would work for generating functions with #pragma omp declare simd as these appear to be placed in front of function declarations, which typically won't be within the scope of of a function body.

I'm not at all familiar with OMP though so there could be a way of declaring a streaming compatible region, I've checked https://clang.llvm.org/docs/AttributeReference.html#pragma-omp-declare-simd

Would be good to give an example if you can?

If this part is somewhat speculative it could be worth separating it from the name mangling part, as I assume someone could manually write the functions rather than have the compiler auto-generate them.

@@ -942,6 +942,13 @@ undefined.
Zn.b [msb] ... 0x??????03 0x??????02 0x??????01 0x??????00 [lsb]
Zn.s [msb] ... 0x00000003 0x00000002 0x00000001 0x00000000 [lsb]

Streaming compatibility
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The alternative would be that streaming/streaming-compatible regions call the scalar functions? Or make all vector functions streaming-compatible. Which would avoid the name mangling, but lose performance.

Ideally a streaming-compatible target would be universally compatible so that we wouldn't need a non-streaming and streaming-compatible implementation, although I don't think we can represent that with name-mangling. I guess a synonym or wrapper could be used in that case?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants