
[Feature] Parallelize kwavearray functions #523

Open
faberno opened this issue Dec 1, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@faberno (Contributor) commented Dec 1, 2024

Is your feature request related to a problem? Please describe.
I was wondering if it's possible to parallelize get_array_binary_mask and combine_sensor_data in the kWaveArray class.
For simulations with many array elements these functions are a major bottleneck.

Describe the solution you'd like
Currently these functions contain a loop that iterates over every element. The iterations are independent of each other, so in theory they could be distributed across multiple workers. This would probably require some refactoring to avoid copying the kWaveArray and kGrid objects for every worker.
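The per-element loop described above is embarrassingly parallel: each element's mask can be computed independently and the partial masks OR-reduced at the end. A minimal sketch, using stand-in names (compute_element_mask is hypothetical, not the real kWaveArray API) and a thread pool for simplicity; a real speedup would need process-based parallelism or GIL-releasing kernels:

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
import numpy as np

def compute_element_mask(element_index, grid_shape):
    """Stand-in for per-element off-grid weight computation.

    Returns a boolean mask of the grid points touched by one element
    (here: a few deterministic pseudo-random diagonal points).
    """
    rng = np.random.default_rng(element_index)  # deterministic per element
    mask = np.zeros(grid_shape, dtype=bool)
    idx = rng.integers(0, grid_shape[0], size=5)
    mask[idx, idx] = True
    return mask

def array_binary_mask_parallel(n_elements, grid_shape, max_workers=4):
    # Compute every element's mask concurrently, then OR-reduce the results.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        masks = pool.map(lambda i: compute_element_mask(i, grid_shape),
                         range(n_elements))
        return reduce(np.bitwise_or, masks, np.zeros(grid_shape, dtype=bool))

mask = array_binary_mask_parallel(8, (64, 64))
```

Because the reduction (bitwise OR) is associative and commutative, the order in which workers finish does not affect the final mask.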

@faberno faberno added the enhancement New feature or request label Dec 1, 2024
@waltsims (Owner) commented Dec 1, 2024 via email

@djps (Collaborator) commented Dec 3, 2024

What would be the best approach? There are a few things that could accelerate the code: refactoring with list comprehensions to remove loops; joblib; JIT compilation with Numba; CuPy.
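Of the options above, plain vectorization is often the cheapest win. A generic illustration (not k-Wave code) of replacing a per-element Python loop with a single NumPy expression:

```python
import numpy as np

def weights_loop(points, sigma=1.0):
    # Slow: one Python-level iteration per point.
    out = np.empty(len(points))
    for i, p in enumerate(points):
        out[i] = np.exp(-p**2 / (2 * sigma**2))
    return out

def weights_vectorized(points, sigma=1.0):
    # Fast: one array expression over all points at once.
    points = np.asarray(points)
    return np.exp(-points**2 / (2 * sigma**2))

pts = np.linspace(-3.0, 3.0, 1000)
assert np.allclose(weights_loop(pts), weights_vectorized(pts))
```

Numba, joblib, or CuPy could then be layered on top if the vectorized form is still too slow, but they add dependencies that vectorization alone avoids.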

@faberno (Contributor, Author) commented Dec 3, 2024

I just looked into get_array_binary_mask, and with only a bit of refactoring I could reduce the runtime for 10,000 elements from 190 s to 8 s.

This is the original loop:

# OR each element's off-grid weights into the accumulated binary mask
for ind in range(self.number_elements):
    grid_weights = self.get_off_grid_points(kgrid, ind, True)
    mask = np.bitwise_or(np.squeeze(mask), grid_weights)

In self.get_off_grid_points we first calculate the integration points (fast) and then the grid weights (slow).
So I first computed the integration points for all elements (~1 second), stacked them into one array, and passed them to off_grid_points(...) all at once. This saves many unnecessary calls to off_grid_points(...), and we no longer need to OR the masks together.
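The batching idea described above can be sketched as follows. This is a hedged, self-contained illustration: integration_points and off_grid_weights are stand-ins for the real k-Wave functions, chosen so that one call over all stacked points gives the same mask as per-element calls ORed together:

```python
import numpy as np

def integration_points(element_index, n_points=50):
    # Stand-in: cheap per-element geometry (points along a line segment).
    t = np.linspace(0.0, 1.0, n_points)
    return np.column_stack([t + element_index, t])

def off_grid_weights(points, grid_shape):
    # Stand-in for the expensive weight computation, done once for ALL points.
    mask = np.zeros(grid_shape, dtype=bool)
    ij = np.clip(points.astype(int), 0, np.array(grid_shape) - 1)
    mask[ij[:, 0], ij[:, 1]] = True
    return mask

def array_binary_mask_batched(n_elements, grid_shape):
    # 1) Cheap loop: collect every element's integration points.
    all_points = np.vstack([integration_points(i) for i in range(n_elements)])
    # 2) One expensive call instead of n_elements calls, and no per-element OR.
    return off_grid_weights(all_points, grid_shape)

mask = array_binary_mask_batched(10, (32, 32))
```

The win comes from moving the fixed per-call overhead of the expensive step outside the loop; the result is identical to the original per-element version.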

@faberno (Contributor, Author) commented Dec 5, 2024

I could also reduce combine_sensor_data from ~600 s to 37 s (for 10,000 elements), without any major changes or parallelization.
Should I open a draft PR with these changes, where we can discuss them and think about further optimizations?

@waltsims (Owner) commented Dec 6, 2024

That would be great. Thanks @faberno!

@waltsims (Owner) commented:

@faberno should we try to get these updates into v0.4.1 in the new year?

@faberno (Contributor, Author) commented Dec 24, 2024

That would be great. I'll open the promised PR tomorrow.
