Skip to content

Commit

Permalink
Rolling statistics (#1643)
Browse files Browse the repository at this point in the history
<!--Please ensure the PR fulfills the following requirements! -->
<!-- If this is your first PR, make sure to add your details to the
AUTHORS.rst! -->
### Pull Request Checklist:
- [x] This PR addresses an already opened issue (for bug fixes /
features)
    - This PR fixes #1480
- [x] Tests for the changes have been added (for bug fixes / features)
- [x] (If applicable) Documentation has been added / updated (for bug
fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
- [x] Link to issue (:issue:`number`) and pull request (:pull:`number`)
has been added

### What kind of change does this PR introduce?

* Adds `xclim.indices.select_rolling_resample_op` that acts in a way
similar to `select_resample_op`, but with a rolling window beforehand.

### Does this PR introduce a breaking change?


### Other information:
  • Loading branch information
RondeauG committed Feb 19, 2024
2 parents 7de95fa + 318ae06 commit 3acb019
Show file tree
Hide file tree
Showing 3 changed files with 89 additions and 2 deletions.
4 changes: 2 additions & 2 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ Changelog

v0.48 (unreleased)
------------------
Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`), Éric Dupuis (:user:`coxipi`), Dante Castro (:user:`profesorpaiche`).
Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`), Éric Dupuis (:user:`coxipi`), Dante Castro (:user:`profesorpaiche`), Gabriel Rondeau-Genesse (:user:`RondeauG`).

Announcements
^^^^^^^^^^^^^
Expand All @@ -22,10 +22,10 @@ New features and enhancements
* New ``xclim.core.calendar.stack_periods`` and ``unstack_periods`` for performing ``rolling(time=...).construct(..., stride=...)`` but with non-uniform temporal periods like years or months. They replace ``xclim.sdba.processing.construct_moving_yearly_window`` and ``unpack_moving_yearly_window`` which are deprecated and will be removed in a future release.
* New ``as_dataset`` options for ``xclim.set_options``. When True, indicators will output Datasets instead of DataArrays. (:issue:`1257`, :pull:`1625`).
* Added new option for UTCI calculation to cap low wind velocities to a minimum of 0.5 m/s following Bröde (2012) guidelines. (:issue:`1634`, :pull:`1635`).

* Added option ``never_reached`` to ``degree_days_exceedance_date`` to assign a custom value when the sum threshold is never reached. (:issue:`1459`, :pull:`1647`).
* Added option ``min_members`` to ensemble statistics to mask elements when the number of valid members is under a threshold. (:issue:`1459`, :pull:`1647`).
* Distribution instances can now be passed to the ``dist`` argument of most statistical indices. (:pull:`1644`).
* Added a new `xclim.indices.generic.select_rolling_resample_op` function to allow for computing rolling statistics. (:issue:`1480`, :pull:`1643`).

Breaking changes
^^^^^^^^^^^^^^^^
Expand Down
41 changes: 41 additions & 0 deletions tests/test_generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,47 @@ def test_season(self, q_series):
assert o[0] == 31 + 29


class TestSelectRollingResampleOp:
def test_rollingmax(self, q_series):
q = q_series(np.arange(1, 366 + 365 + 365 + 1)) # 1st year is leap
o = generic.select_rolling_resample_op(
q, "max", window=14, window_center=False, window_op="mean"
)
np.testing.assert_array_equal(
[
np.mean(np.arange(353, 366 + 1)),
np.mean(np.arange(353 + 365, 366 + 365 + 1)),
np.mean(np.arange(353 + 365 * 2, 366 + 365 * 2 + 1)),
],
o.values,
)
assert o.attrs["units"] == "m3 s-1"

def test_rollingmaxindexer(self, q_series):
q = q_series(np.arange(1, 366 + 365 + 365 + 1)) # 1st year is leap
o = generic.select_rolling_resample_op(
q, "min", window=14, window_center=False, window_op="max", season="DJF"
)
np.testing.assert_array_equal(
[14, 367, 367 + 365], o.values
) # 14th day for 1st year, then Jan 1st for the next two
assert o.attrs["units"] == "m3 s-1"

def test_freq(self, q_series):
q = q_series(np.arange(1, 366 + 365 + 365 + 1)) # 1st year is leap
o = generic.select_rolling_resample_op(
q, "max", window=3, window_center=True, window_op="integral", freq="MS"
)
np.testing.assert_array_equal(
[
np.sum([30, 31, 32]) * 86400,
np.sum([30 + 29, 31 + 29, 32 + 29]) * 86400,
], # m3 s-1 being summed by the frequency of the data
o.isel(time=slice(0, 2)).values,
)
assert o.attrs["units"] == "m3"


class TestThresholdCount:
def test_simple(self, tas_series):
ts = tas_series(np.arange(365))
Expand Down
46 changes: 46 additions & 0 deletions xclim/indices/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,52 @@ def select_resample_op(
return to_agg_units(out, da, op)


def select_rolling_resample_op(
da: xr.DataArray,
op: str,
window: int,
window_center: bool = True,
window_op: str = "mean",
freq: str = "YS",
out_units=None,
**indexer,
) -> xr.DataArray:
"""Apply operation over each period that is part of the index selection, using a rolling window before the operation.
Parameters
----------
da : xr.DataArray
Input data.
op : str {'min', 'max', 'mean', 'std', 'var', 'count', 'sum', 'integral', 'argmax', 'argmin'} or func
Reduce operation. Can either be a DataArray method or a function that can be applied to a DataArray.
window : int
Size of the rolling window (centered).
window_center : bool
If True, the window is centered on the date. If False, the window is right-aligned.
window_op : str {'min', 'max', 'mean', 'std', 'var', 'count', 'sum', 'integral'}
Operation to apply to the rolling window. Default: 'mean'.
freq : str
Resampling frequency defining the periods as defined in :ref:`timeseries.resampling`. Applied after the rolling window.
out_units : str, optional
Output units to assign. Only necessary if `op` is function not supported by :py:func:`xclim.core.units.to_agg_units`.
indexer : {dim: indexer, }, optional
Time attribute and values over which to subset the array. For example, use season='DJF' to select winter values,
month=1 to select January, or month=[6,7,8] to select summer months. If not indexer is given, all values are
considered.
Returns
-------
xr.DataArray
The array for which the operation has been applied over each period.
"""
rolled = getattr(
da.rolling(time=window, center=window_center),
window_op.replace("integral", "sum"),
)()
rolled = to_agg_units(rolled, da, window_op)
return select_resample_op(rolled, op=op, freq=freq, out_units=out_units, **indexer)


def doymax(da: xr.DataArray) -> xr.DataArray:
"""Return the day of year of the maximum value."""
i = da.argmax(dim="time")
Expand Down

0 comments on commit 3acb019

Please sign in to comment.