-
Notifications
You must be signed in to change notification settings - Fork 280
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
BUG: fix load_uniform_grid with cell_widths and multiple fields #5052
BUG: fix load_uniform_grid with cell_widths and multiple fields #5052
Conversation
For reference, here's the test failure when running the updated test on main:
|
grid_dimensions = np.array(list(shapes), dtype="int32") | ||
temp[key] = [data[key][slice] for slice in slices] | ||
cell_widths = grid_cell_widths |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think I understand what the problem is or why this fixes it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops, sorry, should have included more details. the fix was obvious to me, but only after like 45 mins of debugging lol.
The following loop when nprocs > 1
: (1) takes the continuous data arrays and splits them into separate grids and (2) records the grid edges, grid dimensions, and cell widths (if applicable) for each grid:
if nprocs > 1:
temp = {}
new_data = {} # type: ignore [var-annotated]
for key in data.keys():
psize = get_psize(np.array(data[key].shape), nprocs)
(
grid_left_edges,
grid_right_edges,
shapes,
slices,
grid_cell_widths,
) = decompose_array(
data[key].shape,
psize,
bbox,
cell_widths=cell_widths,
)
grid_dimensions = np.array(list(shapes), dtype="int32")
temp[key] = [data[key][slice] for slice in slices]
# setting cell_widths = grid_cell_widths would overwrite for next loop = bad
cell_widths = grid_cell_widths
Only one copy of the grid related variables (grid_dimensions, grid_left_edges, grid_right_edges, cell_widths) is needed for the subsequent call to load_amr_grids
below. When there are multiple fields to split, those grid related variables just get re-created every loop and only the last one is kept. The final grid_cell_widths
needs to be renamed to cell_widths
to match the behavior down in the rest of the function, but that rename has to happen outside the loop over fields or the call to decompose_array
would use a cell_widths that was already split by grid.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see it now. I somehow missed that cell_widths
would be fed back to decompose_array
. Thanks a lot !
In addition to the extra details in the comment, here's a small reproducer modified from the updated tests: import yt
import numpy as np
def data_cell_widths_N16():
np.random.seed(0x4D3D3D3)
N = 16
data = {
"density": np.random.random((N, N, N)),
"temperature": np.random.random((N, N, N)), # adding a second field fails on main
}
cell_widths = []
for _ in range(3):
cw = np.random.random(N)
cw /= cw.sum()
cell_widths.append(cw)
return (data, cell_widths)
data, cell_widths = data_cell_widths_N16()
cell_widths = [cw.astype(np.float32) for cw in cell_widths]
ds = yt.load_uniform_grid(
data,
data["density"].shape,
bbox=np.array([[0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]),
cell_widths=cell_widths,
) |
…ths and multiple fields
Turns out #4732 did not test out what happens when the data dict supplied to
load_uniform_grid
contains multiple fields... and it turns out that it errors. But it's a simple fix to indentation level.