Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Simple implementation of Stochastic Reconfiguration #5017

Merged
merged 38 commits into from
Sep 3, 2024

Conversation

camelto2
Copy link
Contributor

Please review the developer documentation
on the wiki of this project that contains help and requirements.

Proposed changes

Wanted to out this up to get some eyes on it.

This implements a very simple version of stochastic reconfiguration. The benefit is that we avoid the <Psi_i | H | Psi_j > matrix elements which get costly for a large number of parameters. This stochastic reconfiguration approach basically solves -tau * h = S * dp for dp, where tau is a small projection time, h is the vector <Psi_i | H | Psi_0>, and S is the overlap matrix <Psi_i | Psi_j>. The simple approach would be to build S, invert, and solve for dp. A better method, employed here, is to use a conjugate gradient method to solve the linear equation above. This avoids having to explicitly build the entire S matrix, and only builds the S*z matrix-vector product at each iteration of the CG algorithm...essentially this is the "Accelerated" SR approach described in https://doi.org/10.1103/PhysRevB.85.045103, described in the text surrounding Eqn. 7.

Preliminary tests using this have allowed us to scale up 100,000+ variational parameters, which was simply not possible with the current implementation of one_shift_only and has a huge speedup over one_shift for large parameter counts.

This is still a WIP because there are a number of things to improve. A few things off the top of my head are

  1. Need to add meaningful tests. This was the result of some exploratory coding to see if it could work. Now that our preliminary testing seems to suggest the method works pretty well and is fast, I need to add proper tests.
  2. Would like to abstract away the CG solver
  3. Probably need a better implementation of the CG solver...I literally implemented whatever was on wikipedia. I have some hard coded CG convergence criteria which may not be sufficient
  4. Right now, it either uses a fixed tau to get the parameter update or it can do a line search with correlated sampling. In some cases, the line search is far superior and converges significantly faster. However, if the correlated sampling weight gets small and stays small throughout the optimization then the energy can blow up and tank the optimization. In those cases, you just need to use the tau as something small, but this can result in many optimization iterations. Each iteration is a lot faster, but you may end up needing O(100) or more to get convergence. Also if the tau is too large, then the optimization can go haywire as well. I think an optimal approach would be to use the line search unless the weights get too small, and just take steps with small tau otherwise. Or some sort of adaptive method which scales the timestep automatically to help accelerate convergence.
  5. probably many other things I'm not thinking of.

I'm going to be on leave for a few weeks, but I wanted to get this up to get some comments/suggestions. This seems to be the approach other QMC codes are using for large parameter count optimizations, so it would be nice to have something like this in the code. But I think it needs a lot of improvements.

What type(s) of changes does this code introduce?

Delete the items that do not apply

  • New feature

Does this introduce a breaking change?

  • No

What systems has this change been tested on?

Checklist

Update the following with a yes where the items apply. If you're unsure about any of them, don't hesitate to ask. This is
simply a reminder of what we are going to look for before merging your code.

  • Yes. This PR is up to date with current the current state of 'develop'
  • Yes. Code added or changed in the PR has been clang-formatted
  • No. This PR adds tests to cover any new code, or to catch a bug that is being fixed
  • No. Documentation has been added (if appropriate)

@prckent
Copy link
Contributor

prckent commented May 31, 2024

Test this please

@prckent prckent added this to the v4.0.0 Release milestone Aug 19, 2024
@prckent
Copy link
Contributor

prckent commented Aug 21, 2024

Cody: Can we revisit the WIP label of this PR and get it merged? This addition is a great achievement -- several people have been able to successfully use it in real calculations. After fixing the conflict and making any other updates you wish, we can review and make a list of what needs addressing in future PRs. e.g. Some small documentation and a basic test to exercise the code paths.

@camelto2
Copy link
Contributor Author

Cody: Can we revisit the WIP label of this PR and get it merged? This addition is a great achievement -- several people have been able to successfully use it in real calculations. After fixing the conflict and making any other updates you wish, we can review and make a list of what needs addressing in future PRs. e.g. Some small documentation and a basic test to exercise the code paths.

Glad to hear that people have tried it out successfully. I will fix the conflict. I might try to refactor part of it and make a smaller PR for the krylov solver which will be easier to write a test for. Then we can incorporate those changes into this optimizer and get it merged if that sounds good.

@prckent
Copy link
Contributor

prckent commented Aug 23, 2024

Since people are using this, best to get it in mainline. So I prefer: fix, merge, & then to iterate testing+refactoring+documenting.

@camelto2 camelto2 changed the title [WIP] Simple implementation of Stochastic Reconfiguration Simple implementation of Stochastic Reconfiguration Aug 23, 2024
@prckent
Copy link
Contributor

prckent commented Aug 23, 2024

Test this please

@ye-luo
Copy link
Contributor

ye-luo commented Aug 23, 2024

Since it is a specific flavor of SR, could you make the enum and option name more specific?

@prckent
Copy link
Contributor

prckent commented Aug 26, 2024

@camelto2 I think Ye has a point, but I'll leave it up to you whether to name now or in a later PR. Please leave a comment either way. This capability is obviously evolving and as you point out (thanks) e.g. the CG is worth both refining and abstracting.

@ye-luo
Copy link
Contributor

ye-luo commented Sep 3, 2024

Test this please

@ye-luo ye-luo merged commit a126083 into QMCPACK:develop Sep 3, 2024
37 of 39 checks passed
{"OneShiftOnly", OptimizerType::ONESHIFTONLY}, {"adaptive", OptimizerType::ADAPTIVE},
{"descent", OptimizerType::DESCENT}, {"hybrid", OptimizerType::HYBRID},
{"gradient_test", OptimizerType::GRADIENT_TEST},
{"stochastic_reconfiguration", OptimizerType::STOCHASTIC_RECONFIGURATION_CG}};
Copy link
Contributor

@ye-luo ye-luo Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@camelto2 can we also rename the input to "sr_cg"?

@camelto2 camelto2 deleted the stochastic_reconfig_fullkrylov branch September 16, 2024 19:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants