Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rolling statistics #1643

Merged
merged 10 commits into from
Feb 19, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
4 changes: 2 additions & 2 deletions CHANGES.rst
Original file line number Diff line number Diff line change
Expand Up @@ -4,7 +4,7 @@ Changelog

v0.48 (unreleased)
------------------
Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`), Éric Dupuis (:user:`coxipi`), Dante Castro (:user:`profesorpaiche`).
Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`), Éric Dupuis (:user:`coxipi`), Dante Castro (:user:`profesorpaiche`), Gabriel Rondeau-Genesse (:user:`RondeauG`).

Announcements
^^^^^^^^^^^^^
Expand All @@ -22,10 +22,10 @@ New features and enhancements
* New ``xclim.core.calendar.stack_periods`` and ``unstack_periods`` for performing ``rolling(time=...).construct(..., stride=...)`` but with non-uniform temporal periods like years or months. They replace ``xclim.sdba.processing.construct_moving_yearly_window`` and ``unpack_moving_yearly_window`` which are deprecated and will be removed in a future release.
* New ``as_dataset`` options for ``xclim.set_options``. When True, indicators will output Datasets instead of DataArrays. (:issue:`1257`, :pull:`1625`).
* Added new option for UTCI calculation to cap low wind velocities to a minimum of 0.5 m/s following Bröde (2012) guidelines. (:issue:`1634`, :pull:`1635`).

* Added option ``never_reached`` to ``degree_days_exceedance_date`` to assign a custom value when the sum threshold is never reached. (:issue:`1459`, :pull:`1647`).
* Added option ``min_members`` to ensemble statistics to mask elements when the number of valid members is under a threshold. (:issue:`1459`, :pull:`1647`).
* Distribution instances can now be passed to the ``dist`` argument of most statistical indices. (:pull:`1644`).
* Added a new `xclim.indices.generic.select_rolling_resample_op` function to allow for computing rolling statistics. (:issue:`1480`, :pull:`1643`).

Breaking changes
^^^^^^^^^^^^^^^^
Expand Down
41 changes: 41 additions & 0 deletions tests/test_generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -31,6 +31,47 @@ def test_season(self, q_series):
assert o[0] == 31 + 29


class TestSelectRollingResampleOp:
def test_rollingmax(self, q_series):
q = q_series(np.arange(1, 366 + 365 + 365 + 1)) # 1st year is leap
o = generic.select_rolling_resample_op(
q, "max", window=14, window_center=False, window_op="mean"
)
np.testing.assert_array_equal(
[
np.mean(np.arange(353, 366 + 1)),
np.mean(np.arange(353 + 365, 366 + 365 + 1)),
np.mean(np.arange(353 + 365 * 2, 366 + 365 * 2 + 1)),
],
o.values,
)
assert o.attrs["units"] == "m3 s-1"

def test_rollingmaxindexer(self, q_series):
q = q_series(np.arange(1, 366 + 365 + 365 + 1)) # 1st year is leap
o = generic.select_rolling_resample_op(
q, "min", window=14, window_center=False, window_op="max", season="DJF"
)
np.testing.assert_array_equal(
[14, 367, 367 + 365], o.values
) # 14th day for 1st year, then Jan 1st for the next two
assert o.attrs["units"] == "m3 s-1"

def test_freq(self, q_series):
q = q_series(np.arange(1, 366 + 365 + 365 + 1)) # 1st year is leap
o = generic.select_rolling_resample_op(
q, "max", window=3, window_center=True, window_op="integral", freq="MS"
)
np.testing.assert_array_equal(
[
np.sum([30, 31, 32]) * 86400,
np.sum([30 + 29, 31 + 29, 32 + 29]) * 86400,
], # m3 s-1 being summed by the frequency of the data
o.isel(time=slice(0, 2)).values,
)
assert o.attrs["units"] == "m3"


class TestThresholdCount:
def test_simple(self, tas_series):
ts = tas_series(np.arange(365))
Expand Down
46 changes: 46 additions & 0 deletions xclim/indices/generic.py
Original file line number Diff line number Diff line change
Expand Up @@ -104,6 +104,52 @@ def select_resample_op(
return to_agg_units(out, da, op)


def select_rolling_resample_op(
da: xr.DataArray,
op: str,
window: int,
window_center: bool = True,
window_op: str = "mean",
freq: str = "YS",
out_units=None,
**indexer,
) -> xr.DataArray:
"""Apply operation over each period that is part of the index selection, using a rolling window before the operation.

Parameters
----------
da : xr.DataArray
Input data.
op : str {'min', 'max', 'mean', 'std', 'var', 'count', 'sum', 'integral', 'argmax', 'argmin'} or func
Reduce operation. Can either be a DataArray method or a function that can be applied to a DataArray.
window : int
Size of the rolling window (centered).
window_center : bool
If True, the window is centered on the date. If False, the window is right-aligned.
window_op : str {'min', 'max', 'mean', 'std', 'var', 'count', 'sum', 'integral'}
Operation to apply to the rolling window. Default: 'mean'.
freq : str
Resampling frequency defining the periods as defined in :ref:`timeseries.resampling`. Applied after the rolling window.
out_units : str, optional
Output units to assign. Only necessary if `op` is function not supported by :py:func:`xclim.core.units.to_agg_units`.
indexer : {dim: indexer, }, optional
Time attribute and values over which to subset the array. For example, use season='DJF' to select winter values,
month=1 to select January, or month=[6,7,8] to select summer months. If not indexer is given, all values are
considered.

Returns
-------
xr.DataArray
The array for which the operation has been applied over each period.
"""
rolled = getattr(
da.rolling(time=window, center=window_center),
window_op.replace("integral", "sum"),
)()
rolled = to_agg_units(rolled, da, window_op)
return select_resample_op(rolled, op=op, freq=freq, out_units=out_units, **indexer)


def doymax(da: xr.DataArray) -> xr.DataArray:
"""Return the day of year of the maximum value."""
i = da.argmax(dim="time")
Expand Down