Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise grain finding #104

Merged
merged 13 commits into from
Nov 7, 2023
Merged

Conversation

jni
Copy link

@jni jni commented Jul 5, 2023

This PR is still a work in progress but it implements two improvements:

  • using regionprops to find grains rather than using image == label (which can be expensive when done thousands of times
  • using numba for the inner loop of the flood fill

I'm getting weirdness with the grain IDs now so it's not ready to merge... But it's much faster! 😅

Using a NumPy array as a buffer increases performance, but leaves you
open to bugs due to reusing the same buffer and overwriting it, which is
what was happening. 😅

There's a better-performing fix still: use a buffer as big as the image,
but advance the start index in the buffer as you finish each flood fill.
This guarantees optimal space and time requirements (no need for copies,
or a huge buffer that is mostly unused).
@jni
Copy link
Author

jni commented Oct 31, 2023

Ok, with thanks to @mikesmic for helping me debug this last week, this is mostly fixed.

The issue was that I was being sooooo clever by using a NumPy array as a buffer to store the coordinates — but then when I returned the coordinates, I was returning a view into the buffer, rather than a copy — and of course the buffer was getting overwritten with each grain! 🤦

There's a better fix, which I'll try to implement next: we actually know the max number of coordinates we need: it's the size of the image. We just need to make sure we don't reuse the same buffer space. We can ensure this by copying the buffer after each fill, but we can also ensure it by making a buffer of the same size as the image, then incrementing our position along that buffer. Then the buffer will contain the coordinates of each grain, sorted by grain id, and each grain will contain a view into a section of that buffer.

Then there is one more bug to address, here's cell 39 in the notebook on develop:

Screenshot 2023-10-31 at 12 53 53 pm

And this is what it looks like on this branch:

Screenshot 2023-10-31 at 12 54 27 pm

😅

I think the issue is probably to do with the grain centroid or extent or somesuch. I'll dig into the plotting code to check.

After that, I'll try to make some proper benchmarks.

@jni
Copy link
Author

jni commented Oct 31, 2023

Whoop whoop, it's fixed! 😊 I was just neglecting to add the seed point in one of my two accelerated flood fills, so every grain had (0, 0) in its coordinates. 😅 The notebook output is now identical in this PR and in the develop branch. 🚀

@rhysgt
Copy link
Contributor

rhysgt commented Oct 31, 2023

Nice! Thanks!! How much faster is it out of interest?

@jni
Copy link
Author

jni commented Oct 31, 2023

By "feel" it's a lot. But as I mentioned I need to benchmark properly — it looks like the timings from the progress bars include a bit of time at the end (?) — the progress runs way faster but in some cases (e.g. the 6 seconds at the bottom of the timings below on my branch) it sits there at 99% for a bit. I'd say the flood fill itself is 10x-100x faster. Here's the progressbar output for reading and processing a biggish file on develop:

/Users/jni/conda/envs/all/bin/python /Users/jni/projects/defdap/scripts/read.py
Loaded DaVis 8.4.0 data (dimensions: 2137 x 2137 pixels, sub-window size: 12 x 12 pixels)
Loaded EBSD data (dimensions: 1509 x 1639 pixels, step size: 0.2 um)
Finished building quaternion array (0:00:04)
Finished finding grain boundaries (0:00:08)
Finished finding grains (0:00:15)
Finished finding grains (0:00:06)
Finished calculating grain average Schmid factors (0:00:19)
Finished finding grains (0:00:40)
Finished finding grains (0:00:16)

and on this branch:

/Users/jni/conda/envs/all/bin/python /Users/jni/projects/defdap/scripts/read.py 
Loaded DaVis 8.4.0 data (dimensions: 2137 x 2137 pixels, sub-window size: 12 x 12 pixels)
Loaded EBSD data (dimensions: 1509 x 1639 pixels, step size: 0.2 um)
Finished building quaternion array (0:00:04) 
Finished finding grain boundaries (0:00:08) 
Finished finding grains (0:00:10) 
Finished finding grains (0:00:01) 
Finished calculating grain average Schmid factors (0:00:14) 
Finished finding grains (0:00:00) 
Finished finding grains (0:00:06) 

You should give it a try! 😊

Actually I'll mark this as ready to review — I do want to keep chipping away at this, as I think we can get that whole pipeline down to 1-2s (though I'm not 100% about the quaternion stuff). But I think further progress should probably be done in subsequent PRs.

@jni jni marked this pull request as ready for review October 31, 2023 13:34
@mikesmic
Copy link
Collaborator

mikesmic commented Nov 6, 2023

I've added some tests of the grain finding in ebsd and hrdic for the warp algorithm, all passing :).

I want to remove the flood fill methods from the Map classes, they aren't necessary now and are confusing.

Comment on lines +4 to +8
@njit
def find_first(arr):
for i in range(len(arr)):
if arr[i]:
return i
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's this for?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LOL good Q! 🤣 I must have used it in a different setting at some point, then forgotten to delete it. 😬

Anyway it's a handy little function 🤣

defdap/ebsd.py Outdated
Comment on lines 994 to 996
grains[y, x] = index
points_left[y, x] = False
edge = [seed]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not needed

@mikesmic mikesmic merged commit a54de27 into MechMicroMan:develop Nov 7, 2023
4 checks passed
@jni jni deleted the optimise-grain-finding-3 branch November 8, 2023 07:38
@jni
Copy link
Author

jni commented Nov 8, 2023

Whoo! 🥳

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants