-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Loongarch64 small matrix #4710
Closed
Closed
Loongarch64 small matrix #4710
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
SAXPY built with OpenXL regresses when compared to SAXPY built with gcc. OpenXL compiler doesn't know that the SAXPY inner kernel assembly is a 64 element loop and to it the remainder loop is the main loop. It vectorizes and interleaves the remainder to be a 48 elements per iteration loop. With a max of 63 iterations, a 48 element loop is mostly not going to get executed, so the 1 element scalar loop that is the remainder after that is probably mostly what gets executed. This can be fixed by adding a pragma, loop interleave_count(2) which will result in 8 element loop. Signed-off-by: Amrita H S <[email protected]>
* fix passing of results on windows
This reverts commit 9b24b31. It turns out that PRs 4515 and 4520 break the tests under lapack-netlib/TESTING which require SECOND and DSECND. IBM has decided this is a bigger biger problem than the conflict between lapack second_ and the xlf run time.
This reverts commit bdaa670. It turns out that PRs 4515 and 4520 break the tests under lapack-netlib/TESTING which require SECOND and DSECND. IBM has decided this is a bigger biger problem than the conflict between lapack second_ and the xlf run time.
…thLib#4695) * Fix syntax in parsing of IFNEQ
…b#4698) * Use CMAKE_C_COMPILER_VERSION throughout
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Add small matrix multiplication optimization