Add Lafferty and Sriver partition (#1529)

### Pull Request Checklist:

- [x] This PR addresses an already opened issue (for bug fixes / features)
  - This PR fixes #1497
- [x] Tests for the changes have been added (for bug fixes / features)
- [x] (If applicable) Documentation has been added / updated (for bug fixes / features)
- [x] CHANGES.rst has been updated (with summary of main changes)
  - [x] Link to issue (:issue:`number`) and pull request (:pull:`number`) has been added

### What kind of change does this PR introduce?

* Add the partition algorithm from Lafferty and Sriver (2023)

### Does this PR introduce a breaking change?

Depends (see options below).

### Other information:

TODO list in this PR:

- [x] Add weights
- [x] Add tests
- [x] Maybe add `num` to `hawkins_sutton`?
- [x] Clean up based on changes outside of this PR (e.g. remove functions that are now elsewhere)

TODO list outside of this PR:

- [ ] Add a more general `graph_fraction_of_total_variance` to figanos (in progress by Juliette, Ouranosinc/figanos#134)
- [ ] Add a function in xscen to prepare the data from a catalog (Ouranosinc/xscen#289). In this function, I could rename the dimensions to fit the partition vocabulary (`model`, `scenario`, `downscaling`), OR we could change the partition vocabulary to the catalog/xscen vocabulary (`source`, `experiment`, `bias_adjust_project`). Option 2 is my personal preference, but it would be a breaking change for `hawkins_sutton`.
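
For readers unfamiliar with this kind of partition, here is a minimal sketch of the underlying idea at a single time step. It is not the code added in `xclim/ensembles/_partitioning.py`. It assumes that each component is the variance along one dimension averaged over the other two, with equal weights and no internal variability, which is a simplification. The helper `partition_single_step` and the variable names are hypothetical. The offsets mirror the sizes used in the synthetic test of this PR (4 scenarios, 13 models, 5 downscaling methods).

```python
from itertools import product
from statistics import fmean, pvariance

# Synthetic offsets, same sizes as in the synthetic test of this PR.
scenario_means = [10.0, 20.0, 30.0, 40.0]  # 4 scenarios
model_means = [float(i) for i in range(-6, 7)]  # 13 models
downscaling_means = [float(i) for i in range(-2, 3)]  # 5 downscaling methods

# values[(s, m, d)] is the noise-free value at one time step.
values = {
    (s, m, d): scenario_means[s] + model_means[m] + downscaling_means[d]
    for s, m, d in product(range(4), range(13), range(5))
}


def partition_single_step(values, n_scen, n_model, n_down):
    """Hypothetical helper: equal-weight variance partition at one time step."""
    # Variance across scenarios, averaged over every (model, downscaling) pair.
    scenario_u = fmean(
        [
            pvariance([values[s, m, d] for s in range(n_scen)])
            for m, d in product(range(n_model), range(n_down))
        ]
    )
    # Variance across models, averaged over every (scenario, downscaling) pair.
    model_u = fmean(
        [
            pvariance([values[s, m, d] for m in range(n_model)])
            for s, d in product(range(n_scen), range(n_down))
        ]
    )
    # Variance across downscaling methods, averaged over every (scenario, model) pair.
    downscaling_u = fmean(
        [
            pvariance([values[s, m, d] for d in range(n_down)])
            for s, m in product(range(n_scen), range(n_model))
        ]
    )
    total = scenario_u + model_u + downscaling_u
    return {
        "scenario": scenario_u,
        "model": model_u,
        "downscaling": downscaling_u,
        "total": total,
    }


result = partition_single_step(values, 4, 13, 5)
# Express each component as a percentage of the total variance.
fractions = {k: 100 * v / result["total"] for k, v in result.items() if k != "total"}
```

With these offsets, the scenario, model and downscaling components come out as 125, 14 and 2, so the scenario term dominates. The function in this PR additionally handles the time dimension, internal variability and weights, none of which are covered by this sketch.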
Ouranosinc · Dec 14, 2023 · 6529649
2 parents fdde515 + 3eb8dd1
commit 6529649
Showing 11 changed files with 306 additions and 21 deletions.
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -20,7 +20,7 @@ on:
       - submitted
 
 env:
-  XCLIM_TESTDATA_BRANCH: v2023.9.12
+  XCLIM_TESTDATA_BRANCH: v2023.12.14
 
 concurrency:
   # For a given workflow, if we push to the same branch, cancel all previous builds on that branch except on master.

diff --git a/CHANGES.rst b/CHANGES.rst
@@ -2,6 +2,16 @@
 Changelog
 =========
 
+
+v0.48 (unreleased)
+------------------
+Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`), Éric Dupuis (:user:`coxipi`).
+
+New features and enhancements
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+* Added uncertainty partitioning method `lafferty_sriver` from Lafferty and Sriver (2023), which can partition uncertainty related to the downscaling method. (:issue:`1497`, :pull:`1529`).
+
+
 v0.47.0 (2023-12-01)
 --------------------
 Contributors to this version: Juliette Lavoie (:user:`juliettelavoie`), Pascal Bourgault (:user:`aulemahal`), Trevor James Smith (:user:`Zeitsperre`), David Huard (:user:`huard`), Éric Dupuis (:user:`coxipi`).

diff --git a/docs/api.rst b/docs/api.rst
@@ -65,6 +65,9 @@ Ensembles Module
 .. autofunction:: xclim.ensembles.hawkins_sutton
     :noindex:
 
+.. autofunction:: xclim.ensembles.lafferty_sriver
+    :noindex:
+
 Units Handling Submodule
 ========================
 

diff --git a/docs/notebooks/partitioning.ipynb b/docs/notebooks/partitioning.ipynb
@@ -79,7 +79,11 @@
    "source": [
     "## Create an ensemble \n",
     "\n",
-    "Here we combine the different models and scenarios into a single DataArray with dimensions `model` and `scenario`. Note that the names of those dimensions are important for the uncertainty partitioning algorithm to work. "
+    "Here we combine the different models and scenarios into a single DataArray with dimensions `model` and `scenario`. Note that the names of those dimensions are important for the uncertainty partitioning algorithm to work. \n",
+    "\n",
+    "<div class=\"alert alert-info\">\n",
+    "Note that the [xscen library](https://xscen.readthedocs.io/en/latest/index.html) provides a helper function `xscen.ensembles.get_partition_input` to build partition ensembles.\n",
+    "</div>"
    ]
   },
   {
@@ -137,7 +141,11 @@
    "id": "41af418d-9e92-433c-800c-6ba28ff7684c",
    "metadata": {},
    "source": [
-    "From there, it's relatively straightforward to compute the relative strength of uncertainties, and create graphics similar to those found in scientific papers. "
+    "From there, it's relatively straightforward to compute the relative strength of uncertainties, and create graphics similar to those found in scientific papers. \n",
+    "\n",
+    "<div class=\"alert alert-info\">\n",
+    "Note that the [figanos library](https://figanos.readthedocs.io/en/latest/) provides a function `fg.partition` to plot the graph below.\n",
+    "</div>"
    ]
   },
   {
@@ -238,7 +246,7 @@
    "name": "python",
    "nbconvert_exporter": "python",
    "pygments_lexer": "ipython3",
-   "version": "3.10.8"
+   "version": "3.9.13"
   }
  },
  "nbformat": 4,

diff --git a/docs/references.bib b/docs/references.bib
@@ -2086,3 +2086,20 @@ @inbook{
 year={2023},
 pages={1927–2058}
 }
+
+@article{Lafferty2023,
+   abstract = {Efforts to diagnose the risks of a changing climate often rely on downscaled and bias-corrected climate information, making it important to understand the uncertainties and potential biases of this approach. Here, we perform a variance decomposition to partition uncertainty in global climate projections and quantify the relative importance of downscaling and bias-correction. We analyze simple climate metrics such as annual temperature and precipitation averages, as well as several indices of climate extremes. We find that downscaling and bias-correction often contribute substantial uncertainty to local decision-relevant climate outcomes, though our results are strongly heterogeneous across space, time, and climate metrics. Our results can provide guidance to impact modelers and decision-makers regarding the uncertainties associated with downscaling and bias-correction when performing local-scale analyses, as neglecting to account for these uncertainties may risk overconfidence relative to the full range of possible climate futures.},
+   author = {David C. Lafferty and Ryan L. Sriver},
+   doi = {10.1038/s41612-023-00486-0},
+   issn = {2397-3722},
+   issue = {1},
+   journal = {npj Climate and Atmospheric Science 2023 6:1},
+   keywords = {Atmospheric science,Climate,Climate and Earth system modelling,Projection and prediction,change impacts},
+   month = {9},
+   pages = {1-13},
+   publisher = {Nature Publishing Group},
+   title = {Downscaling and bias-correction contribute considerable uncertainty to local climate projections in CMIP6},
+   volume = {6},
+   url = {https://www.nature.com/articles/s41612-023-00486-0},
+   year = {2023},
+}
diff --git a/tests/conftest.py b/tests/conftest.py
@@ -24,6 +24,7 @@
 from xclim.testing import helpers
 from xclim.testing.helpers import test_timeseries
 from xclim.testing.utils import _default_cache_dir  # noqa
+from xclim.testing.utils import get_file
 from xclim.testing.utils import open_dataset as _open_dataset
 
 if not __xclim_version__.endswith("-beta") and helpers.TESTDATA_BRANCH == "main":
@@ -429,6 +430,30 @@ def ensemble_dataset_objects() -> dict:
     return edo
 
 
+@pytest.fixture(scope="session")
+def lafferty_sriver_ds() -> xr.Dataset:
+    """Get data from Lafferty & Sriver unit test.
+
+    Notes
+    -----
+    https://github.com/david0811/lafferty-sriver_2023_npjCliAtm/tree/main/unit_test
+    """
+    fn = get_file(
+        "uncertainty_partitioning/seattle_avg_tas.csv",
+        cache_dir=_default_cache_dir,
+        branch=helpers.TESTDATA_BRANCH,
+    )
+
+    df = pd.read_csv(fn, parse_dates=["time"]).rename(
+        columns={"ssp": "scenario", "ensemble": "downscaling"}
+    )
+
+    # Make xarray dataset
+    return xr.Dataset.from_dataframe(
+        df.set_index(["scenario", "model", "downscaling", "time"])
+    )
+
+
 @pytest.fixture(scope="session", autouse=True)
 def gather_session_data(threadsafe_data_dir, worker_id, xdoctest_namespace):
     """Gather testing data on pytest run.

diff --git a/tests/test_partitioning.py b/tests/test_partitioning.py
@@ -3,7 +3,7 @@
 import numpy as np
 import xarray as xr
 
-from xclim.ensembles import hawkins_sutton
+from xclim.ensembles import fractional_uncertainty, hawkins_sutton, lafferty_sriver
 from xclim.ensembles._filters import _concat_hist, _model_in_all_scens, _single_member
 
 
@@ -67,3 +67,92 @@ def test_hawkins_sutton_synthetic(random):
         su.sel(time=slice("2020", None)).mean()
         > su.sel(time=slice("2000", "2010")).mean()
     )
+
+
+def test_lafferty_sriver_synthetic(random):
+    """Test logic of Lafferty & Sriver's implementation using synthetic data."""
+    # Time, scenario, model, downscaling
+    # Here the scenarios don't change over time, so there should be no model variability (since it's relative to the
+    # reference period).
+    sm = np.arange(10, 41, 10)  # Scenario mean (4)
+    mm = np.arange(-6, 7, 1)  # Model mean (13)
+    dm = np.arange(-2, 3, 1)  # Downscaling mean (5)
+    mean = (
+        dm[np.newaxis, np.newaxis, :]
+        + mm[np.newaxis, :, np.newaxis]
+        + sm[:, np.newaxis, np.newaxis]
+    )
+
+    # Natural variability
+    r = random.standard_normal((4, 13, 5, 60))
+
+    x = r + mean[:, :, :, np.newaxis]
+    time = xr.date_range("1970-01-01", periods=60, freq="Y")
+    da = xr.DataArray(
+        x, dims=("scenario", "model", "downscaling", "time"), coords={"time": time}
+    )
+    m, v = lafferty_sriver(da)
+    # Mean uncertainty over time
+    vm = v.mean(dim="time")
+
+    # Check that the time-averaged mean equals the average of the scenario means (25)
+    np.testing.assert_array_almost_equal(m.mean(dim="time"), 25, decimal=1)
+
+    # Check that model uncertainty > variability
+    assert vm.sel(uncertainty="model") > vm.sel(uncertainty="variability")
+
+    # Smoke test with polynomial of order 2
+    fit = da.polyfit(dim="time", deg=2, skipna=True)
+    sm = xr.polyval(coord=da.time, coeffs=fit.polyfit_coefficients).where(da.notnull())
+    lafferty_sriver(da, sm=sm)
+
+
+def test_lafferty_sriver(lafferty_sriver_ds):
+    g, u = lafferty_sriver(lafferty_sriver_ds.tas)
+
+    fu = fractional_uncertainty(u)
+
+    # Assertions based on expected results from
+    # https://github.com/david0811/lafferty-sriver_2023_npjCliAtm/blob/main/unit_test/unit_test_check.ipynb
+    assert fu.sel(time="2020", uncertainty="downscaling") > fu.sel(
+        time="2020", uncertainty="model"
+    )
+    assert fu.sel(time="2020", uncertainty="variability") > fu.sel(
+        time="2020", uncertainty="scenario"
+    )
+    assert (
+        fu.sel(time="2090", uncertainty="scenario").data
+        > fu.sel(time="2020", uncertainty="scenario").data
+    )
+    assert (
+        fu.sel(time="2090", uncertainty="downscaling").data
+        < fu.sel(time="2020", uncertainty="downscaling").data
+    )
+
+    def graph():
+        """Return graphic like in https://github.com/david0811/lafferty-sriver_2023_npjCliAtm/blob/main/unit_test/unit_test_check.ipynb"""
+        from matplotlib import pyplot as plt
+
+        udict = {
+            "Scenario": fu.sel(uncertainty="scenario").to_numpy().flatten(),
+            "Model": fu.sel(uncertainty="model").to_numpy().flatten(),
+            "Downscaling": fu.sel(uncertainty="downscaling").to_numpy().flatten(),
+            "Variability": fu.sel(uncertainty="variability").to_numpy().flatten(),
+        }
+
+        fig, ax = plt.subplots()
+        ax.stackplot(
+            np.arange(2015, 2101),
+            udict.values(),
+            labels=udict.keys(),
+            alpha=1,
+            colors=["#00CC89", "#6869B3", "#CC883C", "#FFFF99"],
+            edgecolor="white",
+            lw=1.5,
+        )
+        ax.set_xlim([2020, 2095])
+        ax.set_ylim([0, 100])
+        ax.legend(loc="upper left")
+        plt.show()
+
+    # graph()
diff --git a/xclim/ensembles/__init__.py b/xclim/ensembles/__init__.py
@@ -10,7 +10,7 @@
 from __future__ import annotations
 
 from ._base import create_ensemble, ensemble_mean_std_max_min, ensemble_percentiles
-from ._partitioning import hawkins_sutton
+from ._partitioning import fractional_uncertainty, hawkins_sutton, lafferty_sriver
 from ._reduce import (
     kkz_reduce_ensemble,
     kmeans_reduce_ensemble,