QUDA internally refines the multi-shift inversions using the single CG.
The MILC code then calls the single CG (either CPU or GPU) again.
This second call generates a lot of overhead and essentially always does zero iterations, so it just wastes time.

There is an option `NO_REFINE` which skips the refinement step if the Naik epsilon of the higher shifts is identical to the one for the zeroth shift. It would be beneficial to turn off any refinement call from the MILC code, i.e. make `NO_REFINE` the default option. Any objections, @detar, @stevengottlieb?

In a short test on a 32^4 lattice, that reduced the runtime of the RHMC (single precision) by a factor of 2. Admittedly, I basically just modified an existing test case, so the iteration count is low and the overhead more pronounced.
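
To make the proposal concrete, here is a rough sketch of the skip condition as I understand it from the description above. The function names and the array layout are hypothetical and are not taken from the MILC or QUDA sources; the actual check in the code may well look different.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper, not a MILC or QUDA function: returns true when every
   higher shift uses the same Naik epsilon as the zeroth shift, i.e. when the
   NO_REFINE shortcut described above would apply. The exact floating-point
   comparison is an assumption made for this sketch. */
static bool can_skip_refinement(const double *naik_eps, size_t n_shifts)
{
    assert(naik_eps != NULL && n_shifts > 0);
    for (size_t i = 1; i < n_shifts; i++) {
        if (naik_eps[i] != naik_eps[0])
            return false;
    }
    return true;
}

/* Hypothetical driver: number of extra single-CG refinement calls the caller
   would still issue after the multi-shift solve. */
static size_t refinement_calls(const double *naik_eps, size_t n_shifts,
                               bool no_refine)
{
    if (no_refine && can_skip_refinement(naik_eps, n_shifts))
        return 0;        /* QUDA has already refined each shift internally */
    return n_shifts;     /* otherwise one refinement call per shift */
}
```

With `no_refine` on by default, the extra calls would disappear whenever all shifts share one Naik epsilon, and the behaviour for mixed epsilons would stay as it is today.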