Speedup 02: 3x speed up for prep_data_for_correlation with custom copy and trace-selection #525

flixha
Collaborator

@flixha flixha commented Dec 12, 2022

What does this PR do?

  • 3x speed up for function preprocessing._prep_data_for_correlation
  • provides new functions _quick_copy_trace and _quick_stream_copy that are ~3x quicker for copying simple traces because they avoid calling copy.deepcopy where it is not necessary. This helps most for traces that mainly contain data; the speedup is smaller for traces with an attached response or a longer history.
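
The idea behind the quick-copy functions can be sketched as follows. This is a minimal illustration, not the code from this PR: `SimpleTrace`, `sketch_quick_copy_trace` and `sketch_quick_copy_stream` are hypothetical names, and the sketch assumes a trace whose header is a flat dict of immutable values, so that a shallow dict copy plus a copy of the sample buffer is already an independent copy.

```python
from array import array


class SimpleTrace:
    """Hypothetical stand-in for a seismic trace: flat header plus samples."""

    def __init__(self, data, stats):
        self.data = data      # array of float samples
        self.stats = stats    # flat dict of immutable header values


def sketch_quick_copy_trace(trace):
    """Copy a trace without calling copy.deepcopy.

    Assumption: every value in trace.stats is immutable (str, int, float),
    so a shallow dict copy cannot share mutable state with the original.
    """
    return SimpleTrace(data=trace.data[:],        # slicing copies the samples
                       stats=dict(trace.stats))   # shallow copy of the header


def sketch_quick_copy_stream(stream):
    """Copy a list of traces one by one with the quick trace copy."""
    return [sketch_quick_copy_trace(tr) for tr in stream]


original = SimpleTrace(
    array("d", [0.0, 1.5, -2.0]),
    {"station": "ABC", "channel": "HHZ", "sampling_rate": 100.0},
)
duplicate = sketch_quick_copy_trace(original)

# Changing the copy must leave the original untouched.
duplicate.data[0] = 99.0
duplicate.stats["station"] = "XYZ"
```

A header that holds nested mutable objects (for example an attached response) would still need a deeper copy, which is consistent with the smaller speedup reported for such traces.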

Why was it initiated? Any relevant Issues?

When a user has a big set of heterogeneous templates (i.e., many templates with different station setups), filling the templates with NaN-channels takes a long time because it involves many copy operations in serial. This PR speeds that process up by a factor of 3 in my example, going from about 150 s to about 50 s for 1500 templates with up to 500 channels.
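
To illustrate what filling templates with NaN-channels means, here is a small sketch under simplifying assumptions. It is not the implementation in `_prep_data_for_correlation`: templates are modelled as plain dicts mapping a channel id to a list of samples, the function name `fill_missing_channels` and the channel ids are hypothetical, and every template is assumed to have at least one channel with all of its channels the same length.

```python
import math


def fill_missing_channels(templates):
    """Give every template the same channel set, padding with NaN channels."""
    # Union of all channel ids that occur in any template.
    all_ids = sorted({cid for tpl in templates.values() for cid in tpl})
    filled = {}
    for name, tpl in templates.items():
        # Assumption: all channels within one template share this length.
        n_samples = len(next(iter(tpl.values())))
        filled[name] = {
            cid: list(tpl[cid]) if cid in tpl else [math.nan] * n_samples
            for cid in all_ids
        }
    return filled


templates = {
    "template_a": {"XX.ABC..HHZ": [0.1, 0.2, 0.3]},
    "template_b": {"XX.ABC..HHZ": [0.4, 0.5, 0.6],
                   "XX.DEF..HHZ": [0.7, 0.8, 0.9]},
}
filled = fill_missing_channels(templates)
```

With many templates and many channels, each real and NaN channel has to be copied into place, which is why the cost of a single copy dominates the runtime in the example above.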

This PR contributes to the summary issue in #522

PR Checklist

  • develop base branch selected?
  • This PR is not directly related to an existing issue (which has no PR yet).
  • All tests still pass.
    - [ ] Any new features or fixed regressions are covered via new tests.
    - [ ] Any new or changed features are fully documented.
  • Significant changes have been added to CHANGES.md.
    - [ ] First-time contributors have added their name to CONTRIBUTORS.md.

@flixha flixha changed the title 3x speed up for prep_data_for_correlation with custom copy and trace-selection Speedup 02: 3x speed up for prep_data_for_correlation with custom copy and trace-selection Dec 12, 2022
@calum-chamberlain
Member

calum-chamberlain commented Dec 22, 2022

For some reason I'm getting segfaults in this branch - develop doesn't seem to have this issue. @flixha are you also getting segfaults for this branch? Mine are coming from the NCEDC test cases.

I expect that this is due to some unforeseen change in how data are now being passed to the C-funcs. Can you have a look please?

@calum-chamberlain calum-chamberlain self-requested a review December 22, 2022 21:23
@flixha
Collaborator Author

flixha commented Jan 3, 2023

> For some reason I'm getting segfaults in this branch - develop doesn't seem to have this issue. @flixha are you also getting segfaults for this branch? Mine are coming from the NCEDC test cases.
>
> I expect that this is due to some unforeseen change in how data are now being passed to the C-funcs. Can you have a look please?

I just found that one commit got lost along the way; it had fixed an issue that has now turned up here again. It was a mistake on my side: I did not properly set the entries in the stats-dict for the NaN-channel. Let's see how the tests do here now; it works for me locally, as it did at some point before.

@calum-chamberlain calum-chamberlain merged commit 0fc5ec7 into eqcorrscan:develop Jan 3, 2023