nci · tennlee · Nov 15, 2024 · Sep 13, 2024
diff --git a/.github/pull_request_template.md b/.github/pull_request_template.md
@@ -2,10 +2,16 @@ Please work through the following checklists. Delete anything that isn't relevan
 ## Development for new xarray-based metrics
 - [ ] Works with n-dimensional data and includes `reduce_dims`, `preserve_dims`, and `weights` args.
 - [ ] Typehints added
-- [ ] Docstrings complete and followed Napoleon (google) style
-- [ ] Reference to paper/webpage is in docstring
 - [ ] Add error handling
 - [ ] Imported into the API
+- [ ] Works with both `xr.DataArrays` and `xr.Datasets` if possible
+
+## Docstrings
+- [ ] Docstrings complete and follow Napoleon (google) style
+- [ ] Maths equation added
+- [ ] Reference to paper/webpage is in docstring. The preferred referencing style for journal articles is [APA (7th edition)](https://apastyle.apa.org/style-grammar-guidelines/references/examples/journal-article-references)
+- [ ] Code example added
+
 
 ## Testing of new xarray-based metrics
 - [ ] 100% unit test coverage
@@ -14,7 +20,7 @@ Please work through the following checklists. Delete anything that isn't relevan
 - [ ] Test that broadcasting with xarray works
 - [ ] Test both reduce and preserve dims arguments work
 - [ ] Test that errors are raised as expected
-- [ ] Test that it works with both `xr.Dataarrays` and `xr.Datasets`
+- [ ] Test that it works with both `xr.DataArrays` and `xr.Datasets`
 
 ## Tutorial notebook 
 - [ ] Short introduction to why you would use that metric and what it tells you

diff --git a/.github/workflows/python-app.yml b/.github/workflows/python-app.yml
@@ -15,7 +15,7 @@ jobs:
     runs-on: ubuntu-latest
     strategy:
       matrix:
-        python-version: ["3.9", "3.10", "3.11", "3.12"]
+        python-version: ["3.10", "3.11", "3.12", "3.13"]
 
     steps:
     - uses: actions/checkout@v4

diff --git a/.github/workflows/run-pre-commit.yml b/.github/workflows/run-pre-commit.yml
@@ -15,9 +15,9 @@ jobs:
     runs-on: ubuntu-latest
 
     steps:
-    - uses: actions/checkout@v3
+    - uses: actions/checkout@v4
     - name: Set up Python
-      uses: actions/setup-python@v3
+      uses: actions/setup-python@v5
       with:
           python-version: '3.x'
     - name: Install dependencies

diff --git a/.gitignore b/.gitignore
@@ -39,6 +39,7 @@ share/python-wheels/
 .installed.cfg
 *.egg
 MANIFEST
+*.DS_Store
 
 # Installer logs
 pip-log.txt

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -35,6 +35,6 @@ repos:
         name: pytest
         entry: pytest tests/ --cov=src/scores --cov-report term-missing
         language: system
-        stages: [push]
+        stages: [pre-push]
         always_run: true
         pass_filenames: false
diff --git a/README.md b/README.md
@@ -18,15 +18,15 @@ Below is a **curated selection** of the metrics, tools and statistical tests inc
 
 |                       	| **Description** 	| **Selection of Included Functions** 	|
 |-----------------------	|-----------------	|--------------	|
-| **[Continuous](https://scores.readthedocs.io/en/stable/included.html#continuous)**        	|Scores for evaluating single-valued continuous forecasts.                  	|MAE, MSE, RMSE, Additive Bias, Multiplicative Bias, Pearson's Correlation Coefficient, Flip-Flop Index, Quantile Loss, Murphy Score, and threshold weighted scores for expectiles, quantiles and Huber Loss.             	|
+| **[Continuous](https://scores.readthedocs.io/en/stable/included.html#continuous)**        	|Scores for evaluating single-valued continuous forecasts.                  	|MAE, MSE, RMSE, Additive Bias, Multiplicative Bias, Percent Bias, Pearson's Correlation Coefficient, Kling-Gupta Efficiency, Flip-Flop Index, Quantile Loss, Quantile Interval Score, Interval Score, Murphy Score, and threshold weighted scores for expectiles, quantiles and Huber Loss.             	|
 | **[Probability](https://scores.readthedocs.io/en/stable/included.html#probability)**        |Scores for evaluating forecasts that are expressed as predictive distributions, ensembles, and probabilities of binary events.                   |Brier Score, Continuous Ranked Probability Score (CRPS) for Cumulative Density Functions (CDF) and ensembles (including threshold weighted versions), Receiver Operating Characteristic (ROC), Isotonic Regression (reliability diagrams).               |
 | **[Categorical](https://scores.readthedocs.io/en/stable/included.html#categorical)**       	|Scores for evaluating forecasts of categories.                	|17 binary contingency table (confusion matrix) metrics and the FIxed Risk Multicategorical (FIRM) Score.               	|
 | **[Spatial](https://scores.readthedocs.io/en/stable/included.html#spatial)** 	|Scores that take into account spatial structure.                 	|Fractions Skill Score.              	|
 | **[Statistical Tests](https://scores.readthedocs.io/en/stable/included.html#statistical-tests)** 	|Tools to conduct statistical tests and generate confidence intervals.                 	|Diebold Mariano.              	|
 | **[Processing Tools](https://scores.readthedocs.io/en/stable/included.html#processing-tools-for-preparing-data)**        	|Tools to pre-process data.                 	|Data matching, Discretisation, Cumulative Density Function Manipulation.              	|
 
 
-`scores` not only includes common scores (e.g., MAE, RMSE), it includes novel scores not commonly found elsewhere (e.g., FIRM, Flip-Flop Index), complex scores (e.g., threshold weighted CRPS), and statistical tests (e.g., the Diebold Mariano test). Additionally, it provides pre-processing tools for preparing data for scores in a variety of formats including cumulative distribution functions (CDF). `scores` provides its own implementations where relevant to avoid extensive dependencies.
+`scores` not only includes common scores (e.g., MAE, RMSE), it also includes novel scores not commonly found elsewhere (e.g., FIRM, Flip-Flop Index), complex scores (e.g., threshold weighted CRPS), and statistical tests (e.g., the Diebold Mariano test). Additionally, it provides pre-processing tools for preparing data for scores in a variety of formats including cumulative distribution functions (CDF). `scores` provides its own implementations where relevant to avoid extensive dependencies.
 
 `scores` primarily supports xarray datatypes for Earth system data allowing it to work with NetCDF4, HDF5, Zarr and GRIB data formats among others. `scores` uses Dask for scaling and performance. Some metrics work with pandas and we aim to expand this capability. 
 

diff --git a/docs/api.md b/docs/api.md
@@ -23,6 +23,7 @@
 .. autofunction:: scores.continuous.correlation.pearsonr
 .. autofunction:: scores.continuous.multiplicative_bias
 .. autofunction:: scores.continuous.pbias
+.. autofunction:: scores.continuous.kge
 .. autofunction:: scores.continuous.isotonic_fit
 .. autofunction:: scores.continuous.consistent_expectile_score
 .. autofunction:: scores.continuous.consistent_quantile_score
@@ -32,6 +33,8 @@
 .. autofunction:: scores.continuous.tw_squared_error
 .. autofunction:: scores.continuous.tw_huber_loss
 .. autofunction:: scores.continuous.tw_expectile_score
+.. autofunction:: scores.continuous.quantile_interval_score
+.. autofunction:: scores.continuous.interval_score
 ```
 
 ## scores.probability
@@ -43,6 +46,7 @@
 .. autofunction:: scores.probability.crps_for_ensemble
 .. autofunction:: scores.probability.tw_crps_for_ensemble
 .. autofunction:: scores.probability.tail_tw_crps_for_ensemble
+.. autofunction:: scores.probability.interval_tw_crps_for_ensemble
 .. autofunction:: scores.probability.murphy_score
 .. autofunction:: scores.probability.murphy_thetas
 .. autofunction:: scores.probability.roc_curve_data
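
Reviewer note on the new `scores.continuous.kge` entry: the snippet below is a minimal pure-Python sketch of the Kling-Gupta Efficiency as defined in Gupta et al. (2009), KGE = 1 - sqrt((r - 1)^2 + (alpha - 1)^2 + (beta - 1)^2). The function name `kge_sketch` and its argument order are hypothetical and chosen for illustration only; this is not the `scores` implementation, which operates on xarray objects and may expose different arguments.

```python
import math


def kge_sketch(obs, sim):
    """Kling-Gupta Efficiency for two equal-length sequences of floats.

    Illustrative only: hypothetical helper, not the `scores` API.
    """
    n = len(obs)
    if n != len(sim) or n < 2:
        raise ValueError("obs and sim must have the same length (at least 2)")

    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n

    # Population standard deviations; alpha is a ratio, so the choice of
    # normalisation (n versus n - 1) cancels out.
    sd_obs = math.sqrt(sum((o - mean_obs) ** 2 for o in obs) / n)
    sd_sim = math.sqrt(sum((s - mean_sim) ** 2 for s in sim) / n)
    if sd_obs == 0 or sd_sim == 0 or mean_obs == 0:
        raise ValueError("KGE is undefined for constant series or zero-mean obs")

    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim)) / n
    r = cov / (sd_obs * sd_sim)   # linear (Pearson) correlation
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mean_sim / mean_obs    # bias ratio

    return 1 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
```

A perfect simulation gives a KGE of 1; doubling every observation keeps r at 1 but sets alpha and beta to 2, giving 1 - sqrt(2).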

diff --git a/docs/coding_practices.md b/docs/coding_practices.md
@@ -1,3 +1,7 @@
+```{eval-rst}
+:orphan:
+```
+
 # Coding Practices
 
 The [Contributing Guide](contributing.md) provides (among other things) guidance on the workflows and general expectations associated with contributing a code change to `scores`. This document eschews some of the context to focus on specifying technical information needed when developing code for `scores`.

diff --git a/docs/included.md b/docs/included.md
@@ -45,14 +45,22 @@
     [Tutorial](project:./tutorials/Flip_Flop_Index.md)
   - 
     [Griffiths et al. (2019)](https://doi.org/10.1002/met.1732); [Griffiths et al. (2021)](https://doi.org/10.1071/ES21010)
+* - Interval Score
+  - [API](api.md#scores.continuous.interval_score)
+  - [Tutorial](project:./tutorials/Quantile_Interval_And_Interval_Scores.md)
+  - [Gneiting and Raftery (2007) - Section 6.2](https://doi.org/10.1198/016214506000001437)
 * - Isotonic Fit, *see Isotonic Regression*
   - &mdash;
   - &mdash;
   - &mdash;
 * - Isotonic Regression (Isotonic Fit, Reliability Diagram)
   - [API](api.md#scores.continuous.isotonic_fit)
   - [Tutorial](project:./tutorials/Isotonic_Regression_And_Reliability_Diagrams.md)
-  - [de Leeuw et al. (2009)](https://doi.org/10.18637/jss.v032.i05); [Dimitriadis et al. (2020)](https://doi.org/10.1073/pnas.2016191118); [Jordan et al. (2020), version 2](https://doi.org/10.48550/arXiv.1904.04761)   
+  - [de Leeuw et al. (2009)](https://doi.org/10.18637/jss.v032.i05); [Dimitriadis et al. (2020)](https://doi.org/10.1073/pnas.2016191118); [Jordan et al. (2020), version 2](https://doi.org/10.48550/arXiv.1904.04761) 
+* - Kling–Gupta Efficiency (KGE)
+  - [API](api.md#scores.continuous.kge)
+  - [Tutorial](project:./tutorials/Kling_Gupta_Efficiency.md)
+  - [Gupta et al. (2009)](https://doi.org/10.1016/j.jhydrol.2009.08.003); [Knoben et al. (2019)](https://doi.org/10.5194/hess-23-4323-2019)    
 * - Mean Absolute Error (MAE)
   - [API](api.md#scores.continuous.mae)
   - [Tutorial](project:./tutorials/Mean_Absolute_Error.md)
@@ -105,6 +113,10 @@
   - &mdash;
   - &mdash;
   - &mdash;
+* - Quantile Interval Score
+  - [API](api.md#scores.continuous.quantile_interval_score)
+  - [Tutorial](project:./tutorials/Quantile_Interval_And_Interval_Scores.md)
+  - [Winkler (1972)](https://doi.org/10.2307/2284720)
 * - Quantile Loss (Quantile Score, Pinball Loss)
   - [API](api.md#scores.continuous.quantile_score)
   - [Tutorial](project:./tutorials/Quantile_Loss.md)
@@ -181,9 +193,29 @@
   - &mdash;
   - &mdash;
 * - Continuous Ranked Probability Score (CRPS) for Ensembles
+  -    
+  - 
+  -  
+* - 
+    - CRPS for Ensembles
   - [API](api.md#scores.probability.crps_for_ensemble)   
   - [Tutorial](project:./tutorials/CRPS_for_Ensembles.md)
   - [Ferro (2014)](https://doi.org/10.1002/qj.2270); [Gneiting And Raftery (2007)](https://doi.org/10.1198/016214506000001437); [Zamo and Naveau (2018)](https://doi.org/10.1007/s11004-017-9709-7)
+* - 
+    - Threshold-Weighted CRPS (twCRPS) for Ensembles
+  - [API](api.md#scores.probability.tw_crps_for_ensemble)   
+  - [Tutorial](project:./tutorials/Threshold_Weighted_CRPS_for_Ensembles.md)
+  - [Allen et al. (2023)](https://doi.org/10.1137/22M1532184); [Allen (2024)](https://doi.org/10.18637/jss.v110.i08)    
+* - 
+    - Interval-Threshold-Weighted CRPS (twCRPS) for Ensembles
+  - [API](api.md#scores.probability.interval_tw_crps_for_ensemble)   
+  - [Tutorial](project:./tutorials/Threshold_Weighted_CRPS_for_Ensembles.md)
+  - [Allen et al. (2023)](https://doi.org/10.1137/22M1532184); [Allen (2024)](https://doi.org/10.18637/jss.v110.i08) 
+* - 
+    - Tail-Threshold-Weighted CRPS (twCRPS) for Ensembles
+  - [API](api.md#scores.probability.tail_tw_crps_for_ensemble)   
+  - [Tutorial](project:./tutorials/Threshold_Weighted_CRPS_for_Ensembles.md)
+  - [Allen et al. (2023)](https://doi.org/10.1137/22M1532184); [Allen (2024)](https://doi.org/10.18637/jss.v110.i08)
 * - Isotonic Fit, *see Isotonic Regression*
   - &mdash;
   - &mdash;
@@ -224,14 +256,6 @@
   - &mdash;
   - &mdash;
   - &mdash;
-* - Tail Threshold Weighted Continuous Ranked Probability Score (twCRPS) for Ensembles
-  - [API](api.md#scores.probability.tail_tw_crps_for_ensemble)   
-  - &mdash;
-  - [Allen et al. (2023)](https://doi.org/10.1137/22M1532184) 
-* - Threshold Weighted Continuous Ranked Probability Score (twCRPS) for Ensembles
-  - [API](api.md#scores.probability.tw_crps_for_ensemble)   
-  - &mdash;
-  - [Allen et al. (2023)](https://doi.org/10.1137/22M1532184) 
 ```
 
 ## Categorical
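
Reviewer note on the new Interval Score entry: for readers unfamiliar with the reference, the sketch below writes out the interval score from Gneiting and Raftery (2007, Section 6.2) for a central (1 - alpha) prediction interval [l, u] and observation x, namely (u - l) + (2/alpha)(l - x)1{x < l} + (2/alpha)(x - u)1{x > u}. The name `interval_score_sketch` and its scalar arguments are hypothetical; the `scores.continuous.interval_score` function works on xarray data and its signature may differ.

```python
def interval_score_sketch(lower, upper, obs, alpha):
    """Interval score for one central (1 - alpha) prediction interval.

    Illustrative only: hypothetical helper, not the `scores` API.
    Lower scores are better.
    """
    if not 0 < alpha < 1:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if lower > upper:
        raise ValueError("lower must not exceed upper")

    width = upper - lower
    # Penalty when the observation falls below the interval.
    below_penalty = (2 / alpha) * max(lower - obs, 0.0)
    # Penalty when the observation falls above the interval.
    above_penalty = (2 / alpha) * max(obs - upper, 0.0)
    return width + below_penalty + above_penalty
```

When the observation lies inside the interval the score is just the interval width, so narrow intervals are rewarded only if they still capture the observation.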

diff --git a/docs/maintainer.md b/docs/maintainer.md
@@ -47,6 +47,7 @@ Information relevant for package maintenance
 
 ### Confirm Zenodo correctness
 
+Link to Zenodo archive: [https://doi.org/10.5281/zenodo.12697241](https://doi.org/10.5281/zenodo.12697241)
 1. Confirm license
 2. Confirm authors
 3. Scan everything else
@@ -93,7 +94,7 @@ If so, please open a new pull request. In that pull request please add your deta
 In .zenodo.json, please add your details at the bottom of the “creators” section. The fields you will need to complete are:
 
 1. “orcid”. This is an optional field. If you don’t have an ORCID, but would like one, you can obtain one here: https://info.orcid.org/researchers/ .
-2. “affiliation”. Options include: the name of the institution you are affiliated with, “Independent Researcher” or “Independent Contributor”.
+2. “affiliation”. Options include: the institution you are affiliated with, “Independent Researcher” or “Independent Contributor”.
 3. “name”. Format: surname, given name(s).
 ```
 

diff --git a/docs/release_notes.md b/docs/release_notes.md
@@ -1,5 +1,44 @@
 # Release Notes (What's New)
 
+## Version 1.3.0 (November 15, 2024)
+
+For a list of all changes in this release, see the [full changelog](https://github.com/nci/scores/compare/1.2.0...1.3.0). Below are the changes we think users may wish to be aware of.
+
+### Introduced Support for Python 3.13 and Dropped Support for Python 3.9
+
+- In line with other scientific Python packages, `scores` has dropped support for Python 3.9 in this release. 
+  `scores` has added support for Python 3.13. See [PR #710](https://github.com/nci/scores/pull/710).
+
+### Features
+
+- Added four new metrics:
+	- Quantile Interval Score: `scores.continuous.quantile_interval_score`. See [PR #704](https://github.com/nci/scores/pull/704), [PR #733](https://github.com/nci/scores/pull/733) and [PR #738](https://github.com/nci/scores/pull/738).
+	- Interval Score: `scores.continuous.interval_score`. See [PR #704](https://github.com/nci/scores/pull/704), [PR #733](https://github.com/nci/scores/pull/733) and [PR #738](https://github.com/nci/scores/pull/738).
+	- Kling-Gupta Efficiency (KGE): `scores.continuous.kge`. See [PR #679](https://github.com/nci/scores/pull/679), [PR #700](https://github.com/nci/scores/pull/700) and [PR #734](https://github.com/nci/scores/pull/734). 
+	- Interval threshold weighted continuous ranked probability score (twCRPS) for ensembles: `scores.probability.interval_tw_crps_for_ensemble`. See [PR #682](https://github.com/nci/scores/pull/682) and [PR #734](https://github.com/nci/scores/pull/734).
+- Added an optional `include_components` argument to several continuous ranked probability score (CRPS) functions for ensembles. When `include_components` is enabled, the function returns the underforecast penalty, the overforecast penalty and the forecast spread term in addition to the overall CRPS value. This applies to the following CRPS functions:
+	- continuous ranked probability score (CRPS) for ensembles: `scores.probability.crps_for_ensemble`
+	- threshold weighted continuous ranked probability score (twCRPS) for ensembles: `scores.probability.tw_crps_for_ensemble`
+	- tail threshold weighted continuous ranked probability score (twCRPS) for ensembles: `scores.probability.tail_tw_crps_for_ensemble`
+	- interval threshold weighted continuous ranked probability score (twCRPS) for ensembles: `scores.probability.interval_tw_crps_for_ensemble`  
+	See [PR #708](https://github.com/nci/scores/pull/708) and [PR #734](https://github.com/nci/scores/pull/734).
+
+### Documentation
+
+- Added "Kling–Gupta Efficiency (KGE)" tutorial. See [PR #679](https://github.com/nci/scores/pull/679), [PR #700](https://github.com/nci/scores/pull/700) and [PR #734](https://github.com/nci/scores/pull/734).
+- Added "Quantile Interval Score and Interval Score" tutorial. See [PR #704](https://github.com/nci/scores/pull/704), [PR #736](https://github.com/nci/scores/pull/736) and [PR #738](https://github.com/nci/scores/pull/738).
+- Added "Threshold Weighted Continuous Ranked Probability Score (twCRPS) for ensembles" tutorial. See [PR #706](https://github.com/nci/scores/pull/706) and [PR #722](https://github.com/nci/scores/pull/722).
+- Updated the title in the "Binary Categorical Scores and Binary Contingency Tables (Confusion Matrices)" tutorial and the description for the corresponding thumbnail in the tutorial gallery. See [PR #741](https://github.com/nci/scores/pull/741) and [PR #743](https://github.com/nci/scores/pull/743).
+- Updated the pull request template. See [PR #719](https://github.com/nci/scores/pull/719).
+
+### Internal Changes
+
+- Sped up (improved the computational efficiency of) the continuous ranked probability score (CRPS) for ensembles. This also addresses memory issues when a large number of ensemble members are present. See [PR #694](https://github.com/nci/scores/pull/694).
+
+### Contributors to this Release
+
+Mohammadreza Khanarmuei ([@reza-armuei](https://github.com/reza-armuei)), Nicholas Loveday ([@nicholasloveday](https://github.com/nicholasloveday)), Durga Shrestha ([@durgals](https://github.com/durgals)), Tennessee Leeuwenburg ([@tennlee](https://github.com/tennlee)), Stephanie Chong ([@Steph-Chong](https://github.com/Steph-Chong)) and Robert J. Taggart ([@rob-taggart](https://github.com/rob-taggart)).
+
 ## Version 1.2.0 (September 13, 2024) 
 
 For a list of all changes in this release, see the [full changelog](https://github.com/nci/scores/compare/1.1.0...1.2.0). Below are the changes we think users may wish to be aware of.
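
Reviewer note on the `include_components` entry in the 1.3.0 notes: the sketch below shows one way the CRPS for a single ensemble forecast can be split into the three terms named there, using the empirical-CDF form CRPS = mean|x_i - y| - (1 / (2 M^2)) * sum_i sum_j |x_i - x_j|. The function name and the dictionary keys are hypothetical, and the exact weighting and component definitions used by `scores` may differ.

```python
def crps_ensemble_components_sketch(members, obs):
    """CRPS of one ensemble forecast, split into three components.

    Illustrative only: hypothetical helper, not the `scores` API.
    """
    m = len(members)
    if m == 0:
        raise ValueError("members must not be empty")

    # Members below the observation contribute to the underforecast penalty.
    under = sum(max(obs - x, 0.0) for x in members) / m
    # Members above the observation contribute to the overforecast penalty.
    over = sum(max(x - obs, 0.0) for x in members) / m
    # Spread term: half the mean absolute difference between all member pairs.
    spread = sum(abs(a - b) for a in members for b in members) / (2 * m * m)

    return {
        "total": under + over - spread,
        "underforecast_penalty": under,
        "overforecast_penalty": over,
        "spread": spread,
    }
```

Because |x_i - y| splits exactly into the two one-sided penalties, the total equals underforecast penalty plus overforecast penalty minus spread; for a single-member ensemble the spread is zero and the total reduces to the absolute error.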

diff --git a/pyproject.toml b/pyproject.toml
@@ -13,18 +13,18 @@ Scores is a package containing mathematical functions \
 for the verification, evaluation and optimisation of forecasts, predictions or models.
 """
 readme = "README.md"
-requires-python = ">=3.9"
+requires-python = ">=3.10"
 classifiers = [
     "Programming Language :: Python :: 3",
     "License :: OSI Approved :: Apache Software License",
     "Operating System :: OS Independent",
 ]
 dependencies = [
-    "xarray ~= 2024.1",
-    "pandas ~= 2.0",
-    "scipy ~= 1.1",
-    "bottleneck ~= 1.3",
-    "scikit-learn ~= 1.4",
+    "xarray",
+    "pandas",
+    "scipy",
+    "bottleneck",
+    "scikit-learn",
 ]
 
 [project.optional-dependencies]
@@ -49,7 +49,9 @@ tutorial = [
     "rasterio",
     "rioxarray",
     "plotly",
-    "dask"
+    "dask",
+    "gcsfs",
+    "zarr"
 ]
 maintainer = ["build",
               "hatch",