More large-catalog speedups #397

Merged · 7 commits · Sep 25, 2024

Conversation

@jvansanten (Contributor) commented Sep 9, 2024

This PR introduces a bundle of changes that, together with #396, reduce the time to initialize an injector and run a single trial with 10k sources and the 10-year NT dataset (gamma_precision reduced to 0.25, so only 8 gamma points) from 6.2 hours to 3 minutes. Of that, a solid 2 minutes are spent in load_data()*. The speedups arise as follows:

  • Add a new TableInjector that keeps its MC data in an astropy.table.Table, sorted by declination, rather than in a gigantic structured array. This helps in two ways. First, operations on individual columns of the table make better use of caches, since the values are stored contiguously in memory rather than 176 bytes apart as in the structured array. Second, sorting the table by declination means that the expensive band-masking step can be replaced by a cheap binary search (see the sketch after this list). This is also more cache-efficient than indexing with a boolean array, since applying a boolean mask requires reading every single value in the column (or, worse, the structured array). This injector is in all ways better than LowMemoryInjector (and perhaps the base MCInjector).
  • Optimize StdMatrixKDEEnabledLLH's coincident event selection in much the same way, using a Table to store the input data sorted by declination, and processing sources in declination order.
  • Build up SoB_only_matrix row-by-row in canonical CSR format, and skip constructing an explicit coincidence matrix entirely. Updating a scipy.sparse matrix turns out to be surprisingly expensive, and largely unnecessary.

*Most of the remaining fat is in flarestack.icecube_utils.dataset_loader.load_dataset() and its use of append_records(). This could be obviated by using astropy Table everywhere, but that would make the most sense if the data were stored that way in the first place, which is probably too large a change at this point.
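For illustration, here is a minimal sketch of the declination-sorted selection idea described above. The column names and the band_slice helper are hypothetical, not the actual TableInjector API:

```python
import numpy as np
from astropy.table import Table

# Hypothetical toy table of MC events, pre-sorted by declination, as the
# new TableInjector keeps them. Column names are illustrative.
rng = np.random.default_rng(0)
mc = Table({
    "dec": np.sort(rng.uniform(-np.pi / 2, np.pi / 2, 100_000)),
    "logE": rng.uniform(2.0, 7.0, 100_000),
})

def band_slice(table, dec_min, dec_max):
    """Select events in a declination band via binary search.

    Because the table is sorted by declination, np.searchsorted finds the
    band edges in O(log N), and the result is a contiguous slice rather
    than a boolean mask that has to read every row.
    """
    lo = np.searchsorted(table["dec"], dec_min, side="left")
    hi = np.searchsorted(table["dec"], dec_max, side="right")
    return table[lo:hi]

band = band_slice(mc, -0.1, 0.1)

# The boolean-mask equivalent touches every value in the column:
# band = mc[(mc["dec"] >= -0.1) & (mc["dec"] <= 0.1)]
```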

Slicing with a boolean array already returns a copy
Band masks are completely unnecessary when data are ordered in declination;
binary search is cheap. Speeds up injector initialization by a factor of
~60 with 1000 sources.
csr_matrix.getrow() is surprisingly slow (lots of layers of indirection
and argument checking), but also completely unnecessary, since
coincidence_matrix and SoB_only_matrix have the same sparsity structure
by construction.
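For illustration, a minimal sketch of what assembling a sparse matrix directly in canonical CSR form looks like, rather than updating a scipy.sparse matrix row by row. The build_csr helper and the toy input are hypothetical, not the flarestack code:

```python
import numpy as np
from scipy import sparse

def build_csr(rows, n_cols):
    """Assemble a matrix from per-row (column_indices, values) pairs,
    directly in canonical CSR form.

    Collecting data, indices, and indptr in plain lists and constructing the
    matrix once at the end avoids repeatedly updating a scipy.sparse matrix
    or pulling rows back out with getrow().
    """
    data, indices, indptr = [], [], [0]
    for cols, vals in rows:
        order = np.argsort(cols)  # canonical CSR: sorted column indices per row
        indices.extend(np.asarray(cols)[order])
        data.extend(np.asarray(vals)[order])
        indptr.append(len(indices))
    return sparse.csr_matrix(
        (data, indices, indptr), shape=(len(indptr) - 1, n_cols)
    )

# Toy example: two sources, five events; each row lists the events
# coincident with that source and their signal-over-background values.
m = build_csr([([3, 0], [1.2, 0.5]), ([1, 4, 2], [0.1, 0.3, 0.7])], n_cols=5)
```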
@sathanas31 (Contributor) commented:
Finally tested! For the same (single) scale & seed and 1 trial with 1 src, the scrambled dataset from table_injector is the same as the one from low_memory_injector. Using this table_injector dataset, the llh kwargs were compared and found to be the same as what you get from main (merged #396). Unless you want me to check low-level params (e.g. coincident data), we're good to go :D

@JannisNe (Collaborator) commented:

If the datasets are the same, then the coincident data will be the same as well.

@jvansanten merged commit 50ca20e into icecube:master on Sep 25, 2024
9 of 13 checks passed
@jvansanten deleted the oneshot-sob branch on September 25, 2024 at 07:37