Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use out-of-place trmm when available #978

Merged
merged 3 commits into from
Dec 7, 2023

Conversation

msimberg
Copy link
Collaborator

@msimberg msimberg commented Sep 8, 2023

I'll try to summarize the various stages of trmm evolution since it's a bit complicated to follow in the code:

version rocblas_Xtrmm rocblas_Xtrmm_outofplace
v < 5.0.0 only exists as two-parameter version does not exist
5.0.0 <= v < 5.6.0 only exists as two-parameter version only exists as three-parameter version which supports both in- and out-of-place (same as cuBLAS)
5.6.0 <= v < develop Same as above with supposedly a ROCBLAS_V3 definition that should enable the three-parameter version, but which only seems to have an effect in documentation... Same as above, but deprecated
develop <= v only exists as three-parameter version which supports both in- and out-of-place (same as cuBLAS) does not exist

Given that

  1. 5.6.0 is barely out (and not yet available in spack: Rocm 5.6.0 & 5.6.1 release updates spack/spack#39673).
  2. trmm_outofplace was deprecated in 5.6.X
  3. We don't know yet exactly which version is going to actually remove trmm_outofplace

I would probably for the moment actually not put an upper bound on when to switch back from trmm_outofplace to trmm. It's a guessing game I suppose until they actually release 5.7.0 or 6.0.0 but it seems safer to assume that trmm_outofplace is going to actually be removed in the next major release rather than one minor release after it was deprecated.

I have not tested this with rocblas 4.X (I'm not even sure if it works without these changes, if I remember correctly CMake support was badly broken before 5.something) but in theory it should work. I don't how much we care about that?

I also have not run large-scale runs with this change, so I don't know if it affects performance (smaller tests say no).

I've also put a version cap on when the workaround for certain Ops not being supported by rocblas is applied.

@msimberg msimberg force-pushed the rocblas-trmm3 branch 4 times, most recently from 0695032 to 1e46a36 Compare September 8, 2023 16:03
include/dlaf/blas/tile.h Outdated Show resolved Hide resolved
@rasolca
Copy link
Collaborator

rasolca commented Sep 8, 2023

I have not tested this with rocblas 4.X (I'm not even sure if it works without these changes, if I remember correctly CMake support was badly broken before 5.something) but in theory it should work. I don't how much we care about that?

Neglect 4.X and require >= 5.

@rasolca
Copy link
Collaborator

rasolca commented Sep 8, 2023

Please run the tests with the CI image to be sure that everything works correctly.

@rasolca
Copy link
Collaborator

rasolca commented Sep 8, 2023

cscs-ci run

1 similar comment
@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

cscs-ci run

@msimberg
Copy link
Collaborator Author

The tests pass on hohgant. I've yet to verify that performance is unchanged.

@msimberg msimberg added Type:Bug Something isn't working TODO:Task Category:CI not planned Feature currently outside of the roadmap that might be considered in the future Priority:Low Type:Cleanup and removed Type:Bug Something isn't working TODO:Task Category:CI not planned Feature currently outside of the roadmap that might be considered in the future labels Sep 27, 2023
@msimberg msimberg added this to the v0.3.0 milestone Oct 2, 2023
@rasolca rasolca modified the milestones: v0.3.0, V0.4.0 Nov 13, 2023
@msimberg msimberg marked this pull request as ready for review November 29, 2023 09:27
@msimberg
Copy link
Collaborator Author

There seems to be no measurable impact from this change.

@msimberg
Copy link
Collaborator Author

cscs-ci run

@rasolca
Copy link
Collaborator

rasolca commented Nov 30, 2023

cscs-ci run

@rasolca
Copy link
Collaborator

rasolca commented Nov 30, 2023

@msimberg This PR is based on a very old master which makes the CI fails.
Please update it.

@msimberg
Copy link
Collaborator Author

msimberg commented Dec 4, 2023

cscs-ci run

@rasolca rasolca merged commit 56bec25 into eth-cscs:master Dec 7, 2023
4 checks passed
github-actions bot pushed a commit that referenced this pull request Dec 7, 2023
@msimberg msimberg deleted the rocblas-trmm3 branch February 2, 2024 10:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

2 participants