Improvements to HQ residual solves in CG #1387

Merged Jul 3, 2023 (16 commits)

@weinbe2 weinbe2 commented Jun 8, 2023

This PR provides an algorithmic solution to the convergence issues with HQ residual solves noted in #1376. In brief, instead of having one unified code path for "regular" L2-norm solves and HQ residual solves, this PR creates a separate code path for solves requesting a HQ residual.

This new code path, contained in the function hqsolve, includes improved logic for tracking reliable updates based on the L2 residual as well as the HQ residual. In the interest of simplicity, this separate code path does not feature all of the optimizations in the traditional CG code. Since HQ solves typically take O(100) iterations, as opposed to thousands to tens of thousands of iterations for light quark solves, this seemed like an acceptable trade-off. Of note, this new code path still supports mixed precision.
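To make the distinction between the two residuals concrete, here is a minimal sketch. It assumes the HQ residual is the square root of the site-averaged ratio |r_i|^2 / |x_i|^2, which is my understanding of the usual definition and is not spelled out in this PR. The `Field` alias and both function names are hypothetical, and each "site" is simplified to a single complex number.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a lattice field: one complex value per site.
using Field = std::vector<std::complex<double>>;

// Relative L2 residual |r| / |b|: a single global ratio of norms.
// Assumes b is not identically zero.
double l2_relative_residual(const Field &r, const Field &b)
{
  double r2 = 0.0, b2 = 0.0;
  for (const auto &v : r) r2 += std::norm(v); // std::norm returns |v|^2
  for (const auto &v : b) b2 += std::norm(v);
  return std::sqrt(r2 / b2);
}

// Assumed HQ residual: sqrt of the site average of |r_i|^2 / |x_i|^2.
// Sites where the solution is tiny are weighted as heavily as large ones.
double hq_residual(const Field &x, const Field &r)
{
  double sum = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    const double x2 = std::norm(x[i]);
    // Convention assumed here: a site with zero solution contributes 1.
    sum += (x2 > 0.0) ? std::norm(r[i]) / x2 : 1.0;
  }
  return std::sqrt(sum / static_cast<double>(x.size()));
}
```

With a solution that spans many orders of magnitude across sites, the L2 measure can be tiny while the HQ measure is still large, which is why a solve can need separate update logic for each.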

The two key improvements in hqsolve are as follows:

  • The trigger for a HQ reliable update is now based on a drop in the HQ residual relative to the previous update, as opposed to drops in the L2 residual.
  • Instead of always waiting for "L2 breakdown" before switching to tracking HQ reliable updates, the hqsolve logic now only tracks HQ reliable update conditions if there is no L2 convergence requirement (i.e., tol == 0).
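The two conditions above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the struct names, the `delta` drop factor, and both helper functions are hypothetical; only `tol == 0` as the "no L2 requirement" signal comes from the description.

```cpp
// Hypothetical solver parameters for illustration.
struct HqSolveParams {
  double tol;   // L2 tolerance; 0 means no L2 convergence requirement
  double delta; // assumed factor by which the HQ residual must drop
};

// Hypothetical bookkeeping carried between reliable updates.
struct HqUpdateState {
  double hq_res_at_last_update; // HQ residual recorded at the previous update
};

// Second improvement: HQ update conditions are tracked only when
// there is no L2 convergence requirement, with no wait for L2 breakdown.
bool track_hq_updates(const HqSolveParams &p) { return p.tol == 0.0; }

// First improvement: the trigger compares the current HQ residual with its
// value at the previous update, not with the L2 residual.
bool hq_reliable_update_due(const HqSolveParams &p, const HqUpdateState &s, double hq_res)
{
  if (!track_hq_updates(p)) return false;
  return hq_res < p.delta * s.hq_res_at_last_update;
}
```

For example, with `tol == 0`, `delta == 0.1`, and a previous HQ residual of `1e-2`, a current HQ residual of `5e-4` would trigger an update in this sketch, while `5e-3` would not.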

In the case of my own reproducers, the first improvement addresses the issues @detar noted in #1376. The second improvement greatly reduces the number of iterations required for a HQ solve (much more in line with what is seen in MILC's CPU implementation) while retaining mixed precision, because HQ reliable updates no longer have to wait for L2 breakdown.

Outstanding tasks:

  • Verification by @detar
  • Improved documentation
  • Possibly removing the HQ logic from the regular CG solve?
  • clang-format

weinbe2 commented Jun 12, 2023

Update: @detar is doing a set of thorough correctness checks, thank you!!

@weinbe2 weinbe2 marked this pull request as ready for review June 27, 2023 20:14
@weinbe2 weinbe2 requested a review from a team as a code owner June 27, 2023 20:14

weinbe2 commented Jun 27, 2023

@detar confirmed offline that this PR addresses the issues he reported in #1376, so I've removed the "draft" status of this PR and done some last bits of tidy-up.

@maddyscientist and/or @mathiaswagner can you give this PR a review/test run and let me know what you think? Barring any showstoppers, I'd like to get this in as-is sooner rather than later.

weinbe2 commented Jul 3, 2023

Follow-up work is being tracked here: #1389


// whether to select alternative reliable updates
// if we're computing the heavy quark residual, force "traditional" reliable updates

This comment should be in the heavy quark path?

Ah, no, I just forgot to clean that up. I'll take care of that...

weinbe2 commented Jul 3, 2023

This passed a ctest -V, so once the latest round of Jenkins passes I'll hit merge.
