Remove loop over blocks of splines? #147
Comments
This is the advantage of miniQMC over QMCPACK. There are many uses of this feature for exploration.
So the nested evaluation of splines is general enough to keep in the base miniapp?
I think this is general. In fact, thinking about splitting the walkers, this feature provides another layer of parallelism. The QMCPACK CUDA code has a similar feature, although only the computation, not the memory, is chunked.
Regarding the loop over blocks, one nice thing about it is that it helps with cache pressure, as described in the IPDPS paper (or maybe SC). It turns out to be a good feature even if you are not letting different processing elements handle different blocks. The argument for the simple loop is that it will be easier for others to understand. The benefit of the loop over blocks is that it is well studied and generally turns out to be the best implementation on many platforms. I might suggest that the _ref version use the simple loop that Mark suggests and the standard implementation retain the loop over blocks. This does imply some coupling of the data types as well, but that could also be handled easily.
The evaluation of splines is nested - one loop over blocks of splines, and an inner loop over the splines in that block. I think this was done to experiment with distributing the spline evaluation across (shared memory) processors.
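To make the two loop shapes concrete, here is a minimal sketch, not the actual miniQMC code. `eval_one_spline`, `eval_blocked`, and `eval_flat` are hypothetical names, and the per-spline work is a placeholder multiply rather than a real B-spline evaluation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for evaluating spline i at position x.
// Assumption: the real kernels do far more work per spline than this.
inline double eval_one_spline(const std::vector<double>& coefs, std::size_t i, double x)
{
  return coefs[i] * x;
}

// Nested form: outer loop over blocks, inner loop over the splines in a block.
std::vector<double> eval_blocked(const std::vector<double>& coefs, double x, std::size_t block_size)
{
  assert(block_size > 0);
  std::vector<double> out(coefs.size());
  for (std::size_t first = 0; first < coefs.size(); first += block_size)
  {
    // The last block may be shorter than block_size.
    const std::size_t last = std::min(first + block_size, coefs.size());
    for (std::size_t i = first; i < last; ++i)
      out[i] = eval_one_spline(coefs, i, x);
  }
  return out;
}

// Flat form: a single loop over all splines.
std::vector<double> eval_flat(const std::vector<double>& coefs, double x)
{
  std::vector<double> out(coefs.size());
  for (std::size_t i = 0; i < coefs.size(); ++i)
    out[i] = eval_one_spline(coefs, i, x);
  return out;
}
```

Both functions visit every spline exactly once and return the same values. In this sketch they differ only in iteration structure.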
This feels like an implementation detail that should not be in the base miniapp. For the base miniapp, there should be a single loop over splines.
A better way to do this is to create a spline interface, and then put the breaking of splines into blocks as one implementation behind that interface.
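One way such an interface might look is sketched below. `SplineSetSketch`, `SingleLoopSplines`, and `BlockedSplines` are hypothetical names that do not exist in miniQMC, and the per-spline evaluation is again a placeholder multiply.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical interface: callers see one set of splines, never blocks.
class SplineSetSketch
{
public:
  virtual ~SplineSetSketch() = default;
  virtual std::size_t size() const = 0;
  // Evaluate every spline at x and write the results into values.
  virtual void evaluate(double x, std::vector<double>& values) const = 0;
};

// Base implementation: one contiguous array, a single loop over splines.
class SingleLoopSplines : public SplineSetSketch
{
public:
  explicit SingleLoopSplines(std::vector<double> coefs) : coefs_(std::move(coefs)) {}
  std::size_t size() const override { return coefs_.size(); }
  void evaluate(double x, std::vector<double>& values) const override
  {
    values.resize(coefs_.size());
    for (std::size_t i = 0; i < coefs_.size(); ++i)
      values[i] = coefs_[i] * x; // placeholder for a real spline evaluation
  }

private:
  std::vector<double> coefs_;
};

// Alternative implementation: storage and evaluation are both split into blocks.
class BlockedSplines : public SplineSetSketch
{
public:
  BlockedSplines(const std::vector<double>& coefs, std::size_t block_size) : total_(coefs.size())
  {
    assert(block_size > 0);
    for (std::size_t first = 0; first < coefs.size(); first += block_size)
    {
      const std::size_t last = std::min(first + block_size, coefs.size());
      blocks_.emplace_back(coefs.begin() + static_cast<std::ptrdiff_t>(first),
                           coefs.begin() + static_cast<std::ptrdiff_t>(last));
    }
  }
  std::size_t size() const override { return total_; }
  void evaluate(double x, std::vector<double>& values) const override
  {
    values.resize(total_);
    std::size_t offset = 0;
    for (const auto& block : blocks_) // outer loop over blocks
    {
      for (std::size_t j = 0; j < block.size(); ++j) // inner loop within a block
        values[offset + j] = block[j] * x;
      offset += block.size();
    }
  }

private:
  std::vector<std::vector<double>> blocks_; // each block owns its own coefficients
  std::size_t total_;
};
```

Code written against `SplineSetSketch` does not change when the blocked implementation is swapped in. That is the decoupling the proposal asks for.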