FFT plan in both paganin filter methods sometimes has a negligible size #456
Comments
Thanks @yousefmoazzam, interesting stuff. I wonder what happens with the Paganin filter of TomoPy? The same behaviour for memory estimation? I sense that Savu's method is not the one that the majority of people will be using; the implementation of the method raises lots of questions. We haven't discussed this in detail, but I think that all memory hook tests will be moved to …
Yep, as the title suggests, the tomopy paganin filter has the same behaviour. I showed the savu paganin filter output in the issue; I can provide the tomopy output here too.
Here is truncated output for running outside a container (non-negligible FFT plan size):
and then for running inside a container (negligible FFT plan size):
This was discovered during memory allocation exploration for #454, and the changes made in b32a2c8 allowed the paganin memory hook tests to pass in the IRIS CI test jobs.
I don't know if this belongs in the httomo repo (because it's related to a method's memory estimator), or the httomolibgpu repo (because it's related to FFTs being performed in a specific method), or somewhere else (because I don't know the root cause of the issue). Nevertheless, I've put it here for now, just to have it documented somewhere.
Original observation
When running the memory hook tests both inside and outside a container on my local workstation, I saw a difference in the size of the FFT plan allocated for the 2D FFT performed in the paganin filter methods:
To be clear, I don't know if this is a container-related issue, or if it simply happened that, in both cases where the FFT plan size was negligible, I was running inside a container. For example, a difference in the version of the cupy python package, or of the cufft CUDA package, could cause this.
I did check the cupy version in the conda env inside the container and outside the container: both were v12.3.0. However, inside the container the cupy package came from a conda channel, whereas outside the container it came from PyPI.
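As an aside, one quick way to see how large a plan's work area actually is, is to diff the default memory pool's used bytes around plan creation. A minimal sketch, assuming a made-up array shape (not the shape from the actual tests):

```python
import cupy as cp
from cupyx.scipy.fftpack import get_fft_plan

# Hypothetical stand-in for a block of projection data; the real test
# data shape and dtype are not reproduced here.
data = cp.zeros((8, 1024, 1024), dtype=cp.complex64)

pool = cp.get_default_memory_pool()
before = pool.used_bytes()
# Batched 2D FFT plan over the last two axes
plan = get_fft_plan(data, axes=(1, 2), value_type='C2C')
after = pool.used_bytes()

# A "negligible" plan would show (close to) zero bytes here; a regular
# plan's work area tends to be on the order of the input size.
print(f'FFT plan work area allocation: {after - before} bytes')
```

cupy routes the cufft work area through its own allocator, so the plan's allocation should show up in the pool's used-bytes delta.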
The way I was running the two paganin methods to see this behaviour
I chose one specific parametrisation of the memory hook tests for both methods. In the examples in the section below, I was running one particular memory hook test parametrisation for the savu paganin filter (a hedged sketch of what such a parametrised test can look like is given below):
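A minimal, hypothetical sketch of such a parametrised memory-hook test; the test name, data shape, slice counts, and dtype are all assumptions here, not values taken from the httomo test suite:

```python
import numpy as np
import pytest

# Hypothetical parametrisation: slice counts and detector dimensions
# are illustrative only.
@pytest.mark.parametrize("slices", [64, 128])
@pytest.mark.parametrize("dim_x, dim_y", [(2560, 2160)])
def test_paganin_filter_savu_memoryhook(slices, dim_x, dim_y):
    # Generate synthetic projection data of the parametrised shape
    data = np.random.random_sample((slices, dim_x, dim_y)).astype(np.float32)
    # ...run the method under a memory hook and compare the peak
    # against the method's memory estimator...
```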
Investigation details (i.e., how I found that the FFT plan size was negligible)
Using the `LineProfileHook` in cupy, I was able to see the size of all allocations made by the methods, and in particular, the FFT plan generated for the 2D FFT.

Outside a container on my workstation, the size of the FFT plan allocated was non-negligible for both methods. Here's truncated output for the savu paganin filter running:
Inside a container on my workstation, the size of the FFT plan allocated was tiny/negligible for both methods. Here's truncated output for the savu paganin filter running inside the container:
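For reference, here's a minimal sketch of how a `LineProfileHook` report like the ones above can be gathered; a plain 2D FFT stands in for the actual paganin filter call from httomolibgpu:

```python
import cupy as cp
from cupy.cuda.memory_hooks import LineProfileHook

# Hypothetical data shape; a plain batched 2D FFT stands in for the
# paganin filter method being profiled.
data = cp.zeros((8, 1024, 1024), dtype=cp.complex64)

hook = LineProfileHook()
with hook:  # record every allocation made while the hook is active
    result = cp.fft.fft2(data, axes=(1, 2))
hook.print_report()  # prints allocation sizes attributed to code lines
```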
Ok, but how is this relevant to the memory hook tests failing in IRIS CI?
Good question: my answer is that I'm not 100% sure this is definitely what happens when the paganin memory hook tests execute in the IRIS workflow. What I can say is that the savu paganin memory hook tests were failing without correcting for a potential FFT plan with negligible size, but when I added the changes to correct for a potential negligible FFT plan (in b32a2c8), the tests passed.
I have not attempted to print out the `LineProfileHook` results when running the tests via the IRIS CI workflow, mainly due to not wanting to fiddle with the workflow file to get `print()` output displayed while the tests run. If we feel this should be investigated further, then this would probably be one of the steps to take.

What exactly is this "correction" for a potential FFT plan with negligible size?
Drawing graphs by hand that track the main memory allocations and deallocations was how I consolidated this conclusion. What it boils down to is that:

- The peak GPU memory usage is what the `MaxMemoryHook` in the httomo memory hook tests relies on for checking the max memory used, so if the FFT plan size being negligible or not affects the peak, it can affect the result of the memory hook tests.
- The if/else branching added in b32a2c8 encodes the logic for handling at which point in the method the peak GPU memory usage occurs. A hedged sketch of this kind of branching is given below.
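As a purely hypothetical sketch of what such branching can look like (the buffer bookkeeping and the threshold are illustrative assumptions, not the actual logic from b32a2c8):

```python
def estimate_peak_bytes(input_bytes: int, plan_bytes: int) -> int:
    """Illustrative memory estimator with if/else branching on plan size.

    Whether the FFT plan's work area is a real allocation or effectively
    zero changes *when* the peak occurs, and therefore which buffers are
    alive at the peak. The buffer combinations below are assumptions.
    """
    NEGLIGIBLE_PLAN_BYTES = 1024  # illustrative threshold
    complex_bytes = 2 * input_bytes  # complex64 intermediate of float32 input

    if plan_bytes > NEGLIGIBLE_PLAN_BYTES:
        # Peak while the plan's work area coexists with the input and
        # the complex FFT output.
        return input_bytes + complex_bytes + plan_bytes
    # With no sizeable plan allocation, the peak lands later, e.g. when
    # the filtered complex data and the inverse-FFT result coexist.
    return input_bytes + 2 * complex_bytes
```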