Improve speed by using minimal subcubes #64

Open
keflavich opened this issue Jun 15, 2015 · 9 comments

Comments

@keflavich
Contributor

Given a path, we should be able to extract a minimal sub-cube and avoid loading a whole cube into memory. This would significantly speed up the extraction process.
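As a rough illustration of what "minimal sub-cube" could mean here, the sketch below computes the pixel bounding box of a path and slices only that region out of a cube. The helper name `minimal_subcube_slices`, the `pad` margin, and the (spectral, y, x) axis order are assumptions made for this example, not part of the existing code.

```python
import numpy as np


def minimal_subcube_slices(x, y, shape, pad=2):
    """Return (yslice, xslice) bounding the path, with `pad` pixels of margin.

    x, y  : pixel coordinates of the sampled path points
    shape : (ny, nx) spatial shape of the cube
    pad   : extra pixels kept on each side (hypothetical default)
    """
    ny, nx = shape
    # Clip the padded bounding box to the spatial extent of the cube.
    x0 = max(int(np.floor(np.min(x))) - pad, 0)
    x1 = min(int(np.ceil(np.max(x))) + pad + 1, nx)
    y0 = max(int(np.floor(np.min(y))) - pad, 0)
    y1 = min(int(np.ceil(np.max(y))) + pad + 1, ny)
    return slice(y0, y1), slice(x0, x1)


# Toy cube with assumed axis order (spectral, y, x).
cube = np.arange(4 * 50 * 60, dtype=float).reshape(4, 50, 60)
x = np.array([10.2, 12.7, 15.1])
y = np.array([20.5, 21.0, 22.9])

ys, xs = minimal_subcube_slices(x, y, cube.shape[1:])
subcube = cube[:, ys, xs]  # only this region needs to be read
print(subcube.shape)
```

Path coordinates then have to be shifted by `xs.start` and `ys.start` before they are used against `subcube`.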

@astrofrog
Member

I thought I had been careful to make it so that it used memory mapping and never accessed the whole data, which should minimize memory usage, but it's possible that we accidentally changed this later. I made it so it would only access one 1D spectrum at a time originally.
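For reference, a minimal sketch of the "one 1D spectrum at a time" access pattern on a memory-mapped array. The toy `.npy` file, the `spectrum_at` helper, and the (spectral, y, x) axis order are assumptions for illustration and are not taken from the package.

```python
import os
import tempfile

import numpy as np

# Write a small toy cube to disk so that it can be memory-mapped.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "toy_cube.npy")
np.save(path, np.arange(5 * 30 * 40, dtype=np.float32).reshape(5, 30, 40))

# mmap_mode="r" returns a np.memmap; data are read only when indexed.
cube = np.load(path, mmap_mode="r")


def spectrum_at(cube, ix, iy):
    """Copy a single spectrum (all channels at one pixel) into memory."""
    return np.array(cube[:, iy, ix])


# Hypothetical integer pixel positions along a path, as (x, y) pairs.
pixels = [(3, 4), (10, 12), (39, 29)]
pv = np.stack([spectrum_at(cube, ix, iy) for ix, iy in pixels], axis=1)
print(pv.shape)  # (5, 3): channels by path positions
```

Each call touches only the bytes belonging to one spectrum, so the in-memory footprint scales with the path length rather than with the cube size.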

@keflavich
Contributor Author

I think memmapping is still in place, but it has no effect if an operation grabs all the data. The interpolation-based approach using map_coordinates does just that, so it's slow when taking small slices from large cubes.

@astrofrog
Member

@keflavich - ah right, and I think it's for the finite width method that I implemented a way to get around grabbing all the data. But yes, for the interpolation this is a problem.

@astrofrog
Member

@keflavich - could you try with a finite but small path width? (like 1 pixel wide or less). What does the memory usage look like then?

@keflavich
Contributor Author

I can probably do it that way, but it requires rewriting the ds9 script; I'll do that when I get back to making PV diagrams.

@keflavich
Contributor Author

Though this has been lying fallow for a long time, I think it is still a concern: the map_coordinates approach (interpolation) may still load the whole cube into memory, which could cause issues.

The appropriate workaround may be to loop through small subcubes and do interpolation section-by-section, but this will come at a high performance cost. Probably best that we just punt this until dask solves it, or solve it in dask directly:
dask/dask-image#235
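A minimal sketch of the section-by-section workaround, assuming linear interpolation (`order=1`) and a (spectral, y, x) axis order. The function names, chunk size, and padding are hypothetical. With `order=1` each sample depends only on its immediate neighbours, so a one-pixel margin should reproduce the full-cube result; higher spline orders use a prefilter that is not local, so they would likely need larger padding or would differ slightly.

```python
import numpy as np
from scipy.ndimage import map_coordinates


def sample_path(data, x, y, order=1):
    """Interpolate `data` (spectral, y, x) at every channel for each (x, y)."""
    nchan = data.shape[0]
    v = np.repeat(np.arange(nchan, dtype=float), len(x))
    coords = np.vstack([v, np.tile(y, nchan), np.tile(x, nchan)])
    return map_coordinates(data, coords, order=order).reshape(nchan, len(x))


def sample_path_chunked(cube, x, y, chunk=4, pad=1, order=1):
    """Same as sample_path, but only reads a small subcube per path section."""
    nchan, ny, nx = cube.shape
    pieces = []
    for start in range(0, len(x), chunk):
        xc = x[start:start + chunk]
        yc = y[start:start + chunk]
        # Padded bounding box of this section, clipped to the cube.
        x0 = max(int(np.floor(xc.min())) - pad, 0)
        x1 = min(int(np.ceil(xc.max())) + pad + 1, nx)
        y0 = max(int(np.floor(yc.min())) - pad, 0)
        y1 = min(int(np.ceil(yc.max())) + pad + 1, ny)
        # Only this slice is pulled into memory (matters for memmapped cubes).
        sub = np.asarray(cube[:, y0:y1, x0:x1])
        # Shift path coordinates into the subcube's pixel frame.
        pieces.append(sample_path(sub, xc - x0, yc - y0, order=order))
    return np.concatenate(pieces, axis=1)


rng = np.random.default_rng(0)
cube = rng.normal(size=(6, 80, 90))
x = np.linspace(30.3, 61.8, 11)
y = np.linspace(50.1, 20.6, 11)

full = sample_path(cube, x, y)
chunked = sample_path_chunked(cube, x, y)
print(np.allclose(full, chunked))
```

The loop issues one `map_coordinates` call per section, which is where the performance cost mentioned above comes from; larger chunks mean fewer calls but bigger subcubes.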

@GenevieveBuckley

Probably best that we just punt this until dask solves it, or solve it in dask directly:
dask/dask-image#235

Curious to know if you'd be able to test drive @m-albert's proposed PR dask/dask-image#237 while it's still in the stage of soliciting feedback. I'd find it helpful to hear any insights related to your particular application (no pressure, just happy to open the conversation)

@keflavich
Contributor Author

Thanks @GenevieveBuckley. I don't think we'll get to this soon, but we definitely will eventually. The funding that was driving active development has run out, so we're back to volunteer effort until the next grant.

@GenevieveBuckley

Understandable!

3 participants