Merge pull request #562 from eqcorrscan/process-timeouts
Process timeouts
calum-chamberlain authored Dec 11, 2024
2 parents c5811b2 + f3c2bde commit 8e40eac
Showing 3 changed files with 81 additions and 3 deletions.
4 changes: 2 additions & 2 deletions CHANGES.md
@@ -21,8 +21,8 @@
 - Templates with nan channels will be considered equal to other templates with shared
   nan channels.
 - New grouping strategy to minimise nan-channels - templates are grouped by
-  similar seed-ids. This should speed up both correlations and
-  prep_data_for_correlation. See PR #457.
+  similar seed-ids. This should speed up both correlations and
+  prep_data_for_correlation. See PR #457.
 * utils.pre_processing
 - `_prep_data_for_correlation`: 3x speedup for filling NaN-traces in templates
 - New function ``quick_trace_select` for a very efficient selection of trace
14 changes: 13 additions & 1 deletion eqcorrscan/core/match_filter/helpers/processes.py
@@ -70,21 +70,33 @@ def __str__(self):
         return self.__repr__()
 
 
-def _get_and_check(input_queue: Queue, poison_queue: Queue, step: float = 0.5):
+def _get_and_check(
+    input_queue: Queue,
+    poison_queue: Queue,
+    step: float = 0.5,
+    timeout: float = None,
+):
     """
     Get from a queue and check for poison - returns Poisoned if poisoned.
 
     :param input_queue: Queue to get something from
     :param poison_queue: Queue to check for poison
     :param step: Time in seconds to wait until re-checking the queues
+    :param timeout:
+        Maximum time in seconds to wait for data in the input queue.
 
     :return: Item from queue or Poison.
     """
+    waited = 0.0
     while True:
         poison = _check_for_poison(poison_queue)
         if poison:
             return poison
         if input_queue.empty():
             time.sleep(step)
+            waited += step
+            if timeout and waited >= timeout:
+                return Poison(Empty(
+                    f"{input_queue} is empty after {timeout} seconds"))
         else:
             return input_queue.get_nowait()
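The new `timeout` argument turns a potentially unbounded poll into a bounded one. A minimal, self-contained sketch of the same poll-with-timeout pattern (the `Poison` class here is a stand-in for EQcorrscan's internal wrapper, which is not shown on this page; names are illustrative):

```python
import time
from queue import Queue, Empty


class Poison:
    """Stand-in for eqcorrscan's Poison wrapper around an exception."""
    def __init__(self, value):
        self.value = value


def get_with_timeout(input_queue, step=0.05, timeout=None):
    """Poll input_queue every `step` seconds; give up after `timeout`."""
    waited = 0.0
    while True:
        if input_queue.empty():
            time.sleep(step)
            waited += step
            # Same check as the diff above: timeout=None disables the limit.
            if timeout and waited >= timeout:
                return Poison(Empty(
                    f"{input_queue} is empty after {timeout} seconds"))
        else:
            return input_queue.get_nowait()


q = Queue()
result = get_with_timeout(q, step=0.05, timeout=0.2)
print(isinstance(result, Poison))  # True: queue stayed empty past the timeout
q.put("work-item")
print(get_with_timeout(q, timeout=0.2))  # work-item
```

Returning a `Poison`-wrapped `Empty` rather than raising lets the consumer process treat a stalled upstream queue the same way it treats an explicit poison pill.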

66 changes: 66 additions & 0 deletions eqcorrscan/doc/updates.rst
@@ -1,6 +1,72 @@
What's new
==========

Version 0.5.0
-------------

* core.match_filter.tribe

* Significant re-write of detect logic to take advantage of parallel steps (see #544)
* Significant re-structure of hidden functions.

* core.match_filter.matched_filter

* 5x speed up for MAD threshold calculation with parallel (threaded) MAD
calculation (#531).
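The threaded MAD speedup is possible because NumPy's sorting and median kernels largely release the GIL on large native-dtype arrays, so threads can overlap. A sketch of the idea only, not EQcorrscan's actual implementation (function names and the thread count are illustrative):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def mad(ccc_row):
    """Median absolute deviation of one row of cross-correlation sums."""
    return float(np.median(np.abs(ccc_row - np.median(ccc_row))))


def threaded_mad(cccsums, max_workers=4):
    """Compute per-template MAD values across a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(mad, cccsums))


rng = np.random.default_rng(seed=42)
cccsums = rng.normal(size=(8, 86400))  # 8 templates, one day at 1 Hz
mads = threaded_mad(cccsums)
print(len(mads))  # 8
```

MAD-based thresholds (e.g. threshold = N * MAD of the correlation sum) are recomputed per template per chunk, so even a constant-factor speedup here shows up directly in matched-filter runtime.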

* core.match_filter.detect

* 1000x speedup for retrieving unique detections for all templates.
* 30x speedup in handling detections (50x speedup in selecting detections,
4x speedup in adding prepick time)

* core.match_filter.template

* New quick_group_templates function for 50x quicker template grouping.
* Templates with nan channels will be considered equal to other templates with shared
nan channels.
* New grouping strategy to minimise nan-channels - templates are grouped by
similar seed-ids. This should speed up both correlations and
prep_data_for_correlation. See PR #457.

* utils.pre_processing

* `_prep_data_for_correlation`: 3x speedup for filling NaN-traces in templates
* New function ``quick_trace_select`` for a very efficient selection of traces
  by seed ID without wildcards (4x speedup).
* `process`, `dayproc` and `shortproc` replaced by `multi_process`. Deprecation
warning added.
* `multi_process` implements multithreaded GIL-releasing parallelism of the slow
  sections (detrending, resampling and filtering) of the processing workflow.
  Multiprocessing is no longer supported or needed for processing. See PR #540
  for benchmarks. The new approach is slightly faster overall and significantly
  more memory efficient (uses c. 6x less memory than the old multiprocessing
  approach on a 12-core machine).
* utils.correlate

  * 25 % speedup for `_get_array_dicts` with quicker access to properties.

* utils.catalog_to_dd

  * _prepare_stream

    * Now more consistently slices templates to length = extract_len * samp_rate
      so that users receive fewer warnings about insufficient data.

  * write_correlations

    * New option `use_shared_memory` to speed up correlation of many events by
      ca. 20 % by moving trace data into shared memory.
    * Add ability to weight correlations by raw correlation rather than just
      correlation squared.

* utils.cluster.decluster_distance_time

  * Bug-fix: fix segmentation fault when declustering more than 46340 detections
    with hypocentral_separation.

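The 46340 threshold is suggestive: a pairwise distance matrix of n detections has n * n entries, and 46341 is the smallest n whose square overflows a signed 32-bit integer. A quick check of that arithmetic (the inference about the bug's cause is ours, not stated in the changelog):

```python
# 46340 is the largest n with n*n representable in a signed 32-bit int,
# which is consistent with a 32-bit index into an n-by-n pairwise matrix
# being the source of the segmentation fault (an inference, not confirmed).
n_ok, n_bad = 46340, 46341
int32_max = 2**31 - 1
print(n_ok**2 <= int32_max)   # True
print(n_bad**2 > int32_max)   # True
```
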
Version 0.4.4
-------------
* core.match_filter