Allow turning off DBCSR ACC with env variable #801
Can one of the admins verify this patch?
With the environment variable
We still need it since we still move data to the GPU and run the transpose kernel. Thanks @abussy, I will review the PR next week.
Gentle ping to review this PR :) It would be nice if it makes it into CP2K's next release.
Thanks for the PR.
OK, the first "logical" problem is In terms of output, I will keep has_acc when we output the configuration. The last problem is that it must be forbidden to change this value multiple times within a DBCSR run, otherwise we will get configuration problems (we are replacing a compile-time macro with something set at runtime). This functionality is not available in the library, so I need to think about it (in any case this is not relevant to your PR).
Sure, using a positive boolean like Concerning the danger of changing this value multiple times during a run, I think it should be safe as it is. The
src/core/dbcsr_config.F
Outdated
   CALL dbcsr_cfg%mm_stack_size%set(mm_stack_size)
END IF

IF (.NOT. PRESENT(mm_stack_size) .AND. dbcsr_cfg%turn_off_acc%val) THEN
It should be mm_dense here.
Same consideration about resetting the value...
dbcsr_set_config is a public function, so people can call it after the init. Actually, this is what CP2K does. That means we can call it once with GPU_RUN disabled, then run a multiplication, then call set_config and set GPU_RUN to true, and at this point I'm not sure what will happen... We need to protect against this case; as said, this is not part of this PR (I will open an issue as a reminder).
I implemented the vast majority of the suggested changes:
This PR allows turning off DBCSR GPU acceleration at initialization, even if the library is compiled with the -D__DBCSR_ACC flag. This can be done by setting the environment variable DBCSR_TURN_OFF_ACC=1, or by passing a logical to the dbcsr_set_config subroutine (this will be useful for a future CP2K keyword).

Most changes take place in the dbcsr_config.F file. There, a new function called use_acc() is defined. It returns a logical, and replaces the has_acc variable used in the rest of the code.

The PR was validated by successfully running the CP2K regtests on a GPU machine, with and without the DBCSR_TURN_OFF_ACC=1 environment variable.

Generally, the ability to turn off GPU acceleration at runtime should help with debugging, and with edge cases where DBCSR ACC under-performs (see #795).
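
For readers who want a feel for the mechanism, here is a minimal, self-contained sketch of how a runtime switch can sit on top of a compile-time flag. It is not the code from this PR: the module name acc_toggle_sketch and the helpers init_from_env and set_turn_off_acc are hypothetical, and has_acc is hard-coded to .TRUE. as a stand-in for a build with -D__DBCSR_ACC. Only the names use_acc, has_acc and DBCSR_TURN_OFF_ACC come from the description above.

```fortran
MODULE acc_toggle_sketch
   ! Illustrative sketch only; not the actual contents of dbcsr_config.F.
   IMPLICIT NONE
   PRIVATE
   PUBLIC :: use_acc, set_turn_off_acc, init_from_env

   ! Assumption: stands in for a build compiled with -D__DBCSR_ACC
   LOGICAL, PARAMETER :: has_acc = .TRUE.
   ! Runtime switch; acceleration stays on unless explicitly turned off
   LOGICAL, SAVE :: turn_off_acc = .FALSE.

CONTAINS

   SUBROUTINE init_from_env()
      ! Hypothetical helper: read DBCSR_TURN_OFF_ACC once at initialization
      CHARACTER(LEN=16) :: val
      INTEGER :: stat

      CALL GET_ENVIRONMENT_VARIABLE("DBCSR_TURN_OFF_ACC", val, STATUS=stat)
      ! STATUS is 0 only when the variable exists and fits into val;
      ! otherwise the current setting is left untouched
      IF (stat == 0) turn_off_acc = (TRIM(ADJUSTL(val)) == "1")
   END SUBROUTINE init_from_env

   SUBROUTINE set_turn_off_acc(flag)
      ! Hypothetical helper: programmatic equivalent of the environment variable
      LOGICAL, INTENT(IN) :: flag

      turn_off_acc = flag
   END SUBROUTINE set_turn_off_acc

   FUNCTION use_acc() RESULT(res)
      ! Acceleration is used only if compiled in AND not turned off at runtime
      LOGICAL :: res

      res = has_acc .AND. .NOT. turn_off_acc
   END FUNCTION use_acc

END MODULE acc_toggle_sketch
```

In this sketch, call sites that previously tested has_acc directly would call use_acc() instead, so a single runtime flag can route the whole library onto the CPU path.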