TridiagSolver: fix missing sort in the deflation #960
cscs-ci run |
dc7d176 to f79d0c5
cscs-ci run |
Codecov Report
@@            Coverage Diff             @@
##           master     #960      +/-   ##
==========================================
+ Coverage   93.35%   94.83%   +1.47%
==========================================
  Files         143      129      -14
  Lines        8605     7795     -810
  Branches     1103     1049      -54
==========================================
- Hits         8033     7392     -641
+ Misses        388      238     -150
+ Partials      184      165      -19
it was probably unused since #819
c781736 to 1012688
1012688 to fcff6bf
Co-authored-by: Raffaele Solcà <[email protected]>
cscs-ci run |
TL;DR
Build CP2K
Test convergence H2O-128
Input file
ialberto@daint103:~/workspace/cp2k> git --no-pager diff H2O-128.inp
diff --git a/H2O-128.inp b/H2O-128.inp
index 53bf03706..6b2bb0761 100644
--- a/H2O-128.inp
+++ b/H2O-128.inp
@@ -8,23 +8,21 @@
REL_CUTOFF 30
&END MGRID
&QS
- EPS_DEFAULT 1.0E-12
+ # EPS_DEFAULT 1.0E-12
WF_INTERPOLATION PS
EXTRAPOLATION_ORDER 3
&END QS
&SCF
SCF_GUESS ATOMIC
- &OT ON
- MINIMIZER DIIS
- &END OT
+ &DIAGONALIZATION ON
+ ALGORITHM STANDARD
+ &END DIAGONALIZATION
# SCF_GUESS RESTART
# EPS_SCF 1.0E-7
-
&PRINT
&RESTART OFF
&END
&END
-
&END SCF
&XC
&XC_FUNCTIONAL Pade
@@ -434,14 +432,12 @@
&END FORCE_EVAL
&GLOBAL
PROJECT H2O-128
- RUN_TYPE MD
+ RUN_TYPE ENERGY
PRINT_LEVEL LOW
+ # PREFERRED_DIAG_LIBRARY scalapack
+ PREFERRED_DIAG_LIBRARY dlaf
+ &FM
+ NCOL_BLOCKS 512
+ NROW_BLOCKS 512
+ &END FM
&END GLOBAL
-&MOTION
- &MD
- ENSEMBLE NVE
- STEPS 10
- TIMESTEP 0.5
- TEMPERATURE 300.0
- &END MD
-&END MOTION
Scalapack vs DLAF @ PizDaint-MC
All runs converged and all steps reported the same total energy (up to ~1e-13).
Scalapack-192
DLAF256 (RPN=9)
DLAF1024 (RPN=9)
DLAF512 (RPN=2)
(probable fix for #953)
The deflation process can change the order of the deflated eigenvalues, and the step that keeps them sorted was not implemented in our version. As a result, deflated eigenvalues were not always sorted correctly, which led to wrong results or NaN values (see dlaed2 for more details).
Thanks to @RMeli and @rasolca for the investigation and the support in fixing this.
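As a rough illustration of the missing step described above, here is a minimal sketch of an insertion step that keeps the list of deflated indices ordered by eigenvalue. The name insertDeflatedSorted, the use of std::vector, and the ascending order are assumptions made for this example; they do not reflect the actual DLA-Future code or the exact convention used in dlaed2.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical helper, for illustration only (not the DLA-Future API).
// `d` holds the current eigenvalue estimates; `deflated` holds indices into `d`
// that are already ordered by ascending d[index]. The new index `j` is inserted
// so that this ordering is preserved.
inline void insertDeflatedSorted(const std::vector<double>& d,
                                 std::vector<std::size_t>& deflated,
                                 std::size_t j) {
  assert(j < d.size());
  deflated.push_back(j);
  // Insertion-sort step: move the new index left while the eigenvalue in front
  // of it is larger.
  for (std::size_t pos = deflated.size() - 1;
       pos > 0 && d[deflated[pos - 1]] > d[deflated[pos]]; --pos) {
    std::swap(deflated[pos - 1], deflated[pos]);
  }
}
```

Without such a step, indices would simply be appended in the order in which they are deflated, which is not guaranteed to match the order of the eigenvalues once a rotation has modified them.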
In addition to the bug fix, this PR also introduces std::hypot to avoid possible numerical errors in the deflation step (the C++ std one has been preferred over LAPACK's dlapy2, without any strong reason).
Another change is the removal of unused GPU kernels related to this step (namely stablePartitionIndexOnDevice and related).
TODO:
stablePartitionIndexForDeflation
miniapp_tridiag_solver
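To illustrate why std::hypot helps in the deflation step, the sketch below compares it with a naive square root of a sum of squares when normalising a pair of values into rotation coefficients. The names naiveNorm, Rotation and makeRotation are hypothetical and the sign convention is an assumption; this is not the code changed in this PR.

```cpp
#include <cassert>
#include <cmath>

// Naive 2-norm: the intermediate squares can overflow (or underflow) even when
// the final result would be representable.
inline double naiveNorm(double a, double b) {
  return std::sqrt(a * a + b * b);
}

// Hypothetical container for the cosine/sine pair of a plane rotation.
struct Rotation {
  double c;
  double s;
};

// Hypothetical helper: normalise (a, b) with std::hypot, which computes
// sqrt(a^2 + b^2) without undue overflow or underflow in the intermediates.
inline Rotation makeRotation(double a, double b) {
  assert(std::isfinite(a) && std::isfinite(b));
  const double r = std::hypot(a, b);
  if (r == 0.0)
    return {1.0, 0.0};  // nothing to rotate
  return {a / r, b / r};
}
```

With inputs such as 1e200, naiveNorm overflows to infinity, while makeRotation still yields finite coefficients close to 1/sqrt(2).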