Merge pull request #562 from eqcorrscan/process-timeouts
Process timeouts
calum-chamberlain authored Dec 11, 2024
2 parents c5811b2 + f3c2bde commit 8e40eac
Showing 3 changed files with 81 additions and 3 deletions.
4 changes: 2 additions & 2 deletions CHANGES.md
@@ -21,8 +21,8 @@
 - Templates with nan channels will be considered equal to other templates with shared
   nan channels.
 - New grouping strategy to minimise nan-channels - templates are grouped by
-  similar seed-ids. This should speed up both correlations and
-  prep_data_for_correlation. See PR #457.
+  similar seed-ids. This should speed up both correlations and
+  prep_data_for_correlation. See PR #457.
 * utils.pre_processing
 - `_prep_data_for_correlation`: 3x speedup for filling NaN-traces in templates
 - New function ``quick_trace_select` for a very efficient selection of trace
14 changes: 13 additions & 1 deletion eqcorrscan/core/match_filter/helpers/processes.py
@@ -70,21 +70,33 @@ def __str__(self):
         return self.__repr__()
 
 
-def _get_and_check(input_queue: Queue, poison_queue: Queue, step: float = 0.5):
+def _get_and_check(
+    input_queue: Queue,
+    poison_queue: Queue,
+    step: float = 0.5,
+    timeout: float = None,
+):
     """
     Get from a queue and check for poison - returns Poisoned if poisoned.
 
     :param input_queue: Queue to get something from
     :param poison_queue: Queue to check for poison
     :param step: Time in seconds to wait until re-checking the queues
+    :param timeout:
+        Maximum time in seconds to wait for data in the input queue.
 
     :return: Item from queue or Poison.
     """
+    waited = 0.0
     while True:
         poison = _check_for_poison(poison_queue)
         if poison:
             return poison
         if input_queue.empty():
             time.sleep(step)
+            waited += step
+            if timeout and waited >= timeout:
+                return Poison(Empty(
+                    f"{input_queue} is empty after {timeout} seconds"))
         else:
             return input_queue.get_nowait()
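The new `timeout` argument turns a potentially unbounded poll into a bounded one. A minimal, self-contained sketch of the same poll-with-timeout pattern (the `Poison` class here is a stand-in for EQcorrscan's internal wrapper, which is not shown on this page; names are illustrative):

```python
import time
from queue import Queue, Empty


class Poison:
    """Stand-in for eqcorrscan's Poison wrapper around an exception."""
    def __init__(self, value):
        self.value = value


def get_with_timeout(input_queue, step=0.05, timeout=None):
    """Poll input_queue every `step` seconds; give up after `timeout`."""
    waited = 0.0
    while True:
        if input_queue.empty():
            time.sleep(step)
            waited += step
            # Same check as the diff above: timeout=None disables the limit.
            if timeout and waited >= timeout:
                return Poison(Empty(
                    f"{input_queue} is empty after {timeout} seconds"))
        else:
            return input_queue.get_nowait()


q = Queue()
result = get_with_timeout(q, step=0.05, timeout=0.2)
print(isinstance(result, Poison))  # True: queue stayed empty past the timeout
q.put("work-item")
print(get_with_timeout(q, timeout=0.2))  # work-item
```

Returning a `Poison`-wrapped `Empty` rather than raising lets the consumer process treat a stalled upstream queue the same way it treats an explicit poison pill.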

66 changes: 66 additions & 0 deletions eqcorrscan/doc/updates.rst
@@ -1,6 +1,72 @@
What's new
==========

Version 0.5.0
-------------

* core.match_filter.tribe

* Significant re-write of detect logic to take advantage of parallel steps (see #544)
* Significant re-structure of hidden functions.

* core.match_filter.matched_filter

* 5x speed up for MAD threshold calculation with parallel (threaded) MAD
calculation (#531).
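The threaded MAD speedup is possible because NumPy's sorting and median kernels largely release the GIL on large native-dtype arrays, so threads can overlap. A sketch of the idea only, not EQcorrscan's actual implementation (function names and the thread count are illustrative):

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor


def mad(ccc_row):
    """Median absolute deviation of one row of cross-correlation sums."""
    return float(np.median(np.abs(ccc_row - np.median(ccc_row))))


def threaded_mad(cccsums, max_workers=4):
    """Compute per-template MAD values across a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as executor:
        return list(executor.map(mad, cccsums))


rng = np.random.default_rng(seed=42)
cccsums = rng.normal(size=(8, 86400))  # 8 templates, one day at 1 Hz
mads = threaded_mad(cccsums)
print(len(mads))  # 8
```

MAD-based thresholds (e.g. threshold = N * MAD of the correlation sum) are recomputed per template per chunk, so even a constant-factor speedup here shows up directly in matched-filter runtime.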

* core.match_filter.detect

* 1000x speedup for retrieving unique detections for all templates.
* 30x speedup in handling detections (50x speedup in selecting detections,
4x speedup in adding prepick time)

* core.match_filter.template

* New quick_group_templates function for 50x quicker template grouping.
* Templates with nan channels will be considered equal to other templates with shared
nan channels.
* New grouping strategy to minimise nan-channels - templates are grouped by
similar seed-ids. This should speed up both correlations and
prep_data_for_correlation. See PR #457.

* utils.pre_processing

* `_prep_data_for_correlation`: 3x speedup for filling NaN-traces in templates
* New function ``quick_trace_select`` for a very efficient selection of traces
  by seed ID without wildcards (4x speedup).
* `process`, `dayproc` and `shortproc` replaced by `multi_process`. Deprecation
warning added.
* `multi_process` implements multithreaded GIL-releasing parallelism of the slow
  sections (detrending, resampling and filtering) of the processing workflow.
  Multiprocessing is no longer supported or needed for processing. See PR #540
  for benchmarks. The new approach is slightly faster overall and significantly
  more memory efficient (uses c. 6x less memory than the old multiprocessing
  approach on a 12-core machine).
* utils.correlate

  * 25 % speedup for `_get_array_dicts` with quicker access to properties.

* utils.catalog_to_dd

  * _prepare_stream

    * Now more consistently slices templates to length = extract_len * samp_rate
      so that users receive fewer warnings about insufficient data.

  * write_correlations

    * New option `use_shared_memory` to speed up correlation of many events by
      ca. 20 % by moving trace data into shared memory.
    * Add ability to weight correlations by raw correlation rather than just
      correlation squared.

* utils.cluster.decluster_distance_time

  * Bug-fix: fix segmentation fault when declustering more than 46340 detections
    with hypocentral_separation.

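The 46340 threshold is suggestive: a pairwise distance matrix of n detections has n * n entries, and 46341 is the smallest n whose square overflows a signed 32-bit integer. A quick check of that arithmetic (the inference about the bug's cause is ours, not stated in the changelog):

```python
# 46340 is the largest n with n*n representable in a signed 32-bit int,
# which is consistent with a 32-bit index into an n-by-n pairwise matrix
# being the source of the segmentation fault (an inference, not confirmed).
n_ok, n_bad = 46340, 46341
int32_max = 2**31 - 1
print(n_ok**2 <= int32_max)   # True
print(n_bad**2 > int32_max)   # True
```
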
Version 0.4.4
-------------
* core.match_filter