clean up docstring formatting and type hints #16

Merged (6 commits) on Aug 17, 2023
6 changes: 3 additions & 3 deletions docs/contributing.md
@@ -30,14 +30,14 @@ A new score or metric should be developed on a separate feature branch, rebased
- The implementation of the new metric or score in xarray, ideally with support for pandas and dask
- 100% unit test coverage
- A tutorial notebook showcasing the use of that metric or score, ideally based on the standard sample data
- API documentation (docstrings) which clearly explain the use of the metrics
- API documentation (docstrings) using [Napoleon (google)](https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html) style, making sure to clearly explain the use of the metrics
- A reference to the paper which describes the metric, added to the API documentation
- For metrics which do not have a paper reference, an online source or reference should be provided
- Metrics which are still under development, or which have not yet had an academic publication, will be placed in a holding area within the API (i.e. `scores.emerging`) until the method has been properly published and peer reviewed. The 'emerging' area of the API is subject to rapid change, but holds scores and metrics of sufficient community interest to include, similar to a 'preprint' of a score or metric.
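For reference, a minimal sketch of a Napoleon (Google-style) docstring of the kind requested above. The function and its parameters are purely illustrative, not part of the `scores` API:

```python
def threshold_exceedance_rate(fcst, threshold=0.0):
    """Calculates the fraction of forecast values exceeding a threshold.

    Args:
        fcst (Iterable[float]): Forecast values.
        threshold (float): Values strictly above this count as exceedances.

    Returns:
        float: Proportion of values exceeding the threshold.
    """
    values = list(fcst)
    return sum(v > threshold for v in values) / len(values)
```

Napoleon renders the `Args:` and `Returns:` sections into structured API documentation without requiring reStructuredText field lists.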

All merge requests should comply with the coding standards outlined in this document. Merge requests will undergo both a code review and a science review. The code review will focus on coding style, performance and test coverage. The science review will focus on the mathematical correctness of the implementation and the suitability of the method for inclusion within 'scores'.

A GitHub ticket should be created explaining the metric which is being implemented and why it is useful.

### Development Process for a Correction or Improvement

107 changes: 58 additions & 49 deletions src/scores/continuous.py
@@ -6,31 +6,38 @@


def mse(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None):
    """Calculates the mean squared error from forecast and observed data.

    Dimensional reduction is not supported for pandas and the user should
    convert their data to xarray to formulate the call to the metric. At
    most one of reduce_dims and preserve_dims may be specified.
    Specifying both will result in an exception.

    Args:
        fcst (Union[xr.Dataset, xr.DataArray, pd.DataFrame, pd.Series]):
            Forecast or predicted variables in xarray or pandas.
        obs (Union[xr.Dataset, xr.DataArray, pd.DataFrame, pd.Series]):
            Observed variables in xarray or pandas.
        reduce_dims (Union[str, Iterable[str]]): Optionally specify which
            dimensions to reduce when calculating MSE. All other dimensions
            will be preserved.
        preserve_dims (Union[str, Iterable[str]]): Optionally specify which
            dimensions to preserve when calculating MSE. All other dimensions
            will be reduced. As a special case, 'all' will allow all dimensions
            to be preserved. In this case, the result will be in the same
            shape/dimensionality as the forecast, the errors will be the
            squared error at each point (i.e. single-value comparison
            against observed), and the forecast and observed dimensions
            must match precisely.
        weights: Not yet implemented. Allow weighted averaging (e.g. by
            area, by latitude, by population, custom).

    Returns:
        Union[xr.Dataset, xr.DataArray, pd.DataFrame, pd.Series]: By default,
            an object containing a single floating point number representing
            the mean squared error for the supplied data; all dimensions are
            reduced. Otherwise, an object representing the mean squared error,
            reduced along the relevant dimensions and weighted appropriately.
    """

error = fcst - obs
@@ -53,38 +60,40 @@ def mse(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None):
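The reduce/preserve semantics documented in the `mse` docstring above can be illustrated with a small numpy sketch. This is an illustration only, not the library's implementation; `mse_sketch` and its axis-based interface are invented here:

```python
import numpy as np

def mse_sketch(fcst, obs, reduce_axes=None):
    # Mean squared error, reducing only the requested axes;
    # with reduce_axes=None all axes are reduced to a single scalar.
    err2 = (np.asarray(fcst) - np.asarray(obs)) ** 2
    return err2.mean(axis=reduce_axes)

fcst = np.array([[1.0, 2.0], [3.0, 4.0]])
obs = np.array([[1.0, 1.0], [1.0, 1.0]])
mse_sketch(fcst, obs)        # all axes reduced: mean of [0, 1, 4, 9] = 3.5
mse_sketch(fcst, obs, (1,))  # reduce axis 1, preserve axis 0: [0.5, 6.5]
```

In the real API the dimensions are named (xarray `dims`) rather than positional axes, and exactly one of `reduce_dims`/`preserve_dims` may be given.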


def mae(fcst, obs, reduce_dims=None, preserve_dims=None, weights=None):
    """Calculates the mean absolute error from forecast and observed data.

    A detailed explanation is on [Wikipedia](https://en.wikipedia.org/wiki/Mean_absolute_error)

    Dimensional reduction is not supported for pandas and the user should
    convert their data to xarray to formulate the call to the metric.
    At most one of reduce_dims and preserve_dims may be specified.
    Specifying both will result in an exception.

    Args:
        fcst (Union[xr.Dataset, xr.DataArray, pd.DataFrame, pd.Series]): Forecast
            or predicted variables in xarray or pandas.
        obs (Union[xr.Dataset, xr.DataArray, pd.DataFrame, pd.Series]): Observed
            variables in xarray or pandas.
        reduce_dims (Union[str, Iterable[str]]): Optionally specify which dimensions
            to reduce when calculating MAE. All other dimensions will be preserved.
        preserve_dims (Union[str, Iterable[str]]): Optionally specify which
            dimensions to preserve when calculating MAE. All other dimensions
            will be reduced. As a special case, 'all' will allow all dimensions
            to be preserved. In this case, the result will be in the same
            shape/dimensionality as the forecast, the errors will be
            the absolute error at each point (i.e. single-value comparison
            against observed), and the forecast and observed dimensions
            must match precisely.
        weights: Not yet implemented. Allow weighted averaging (e.g. by
            area, by latitude, by population, custom).

    Returns:
        Union[xr.Dataset, xr.DataArray, pd.DataFrame, pd.Series]: By default an
            xarray DataArray containing a single floating point number representing
            the mean absolute error for the supplied data. All dimensions will
            be reduced.

            Alternatively, an xarray structure with dimensions preserved as
            appropriate containing the score along reduced dimensions.
    """

error = fcst - obs
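The `preserve_dims='all'` special case described in the `mae` docstring (pointwise errors rather than an average) can be sketched the same way. Again, `mae_sketch` is invented for illustration and is not the library's implementation:

```python
import numpy as np

def mae_sketch(fcst, obs, preserve_all=False):
    # Mean absolute error; preserve_all=True mimics preserve_dims='all',
    # returning the absolute error at each point instead of an average.
    abs_err = np.abs(np.asarray(fcst) - np.asarray(obs))
    return abs_err if preserve_all else abs_err.mean()

fcst = np.array([2.0, 4.0, 6.0])
obs = np.array([1.0, 1.0, 1.0])
mae_sketch(fcst, obs)                     # (1 + 3 + 5) / 3 = 3.0
mae_sketch(fcst, obs, preserve_all=True)  # [1., 3., 5.]
```

Note that the pointwise form requires `fcst` and `obs` to have exactly matching shapes, just as the docstring requires matching dimensions.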
6 changes: 5 additions & 1 deletion src/scores/probability/__init__.py
@@ -2,4 +2,8 @@
Import the functions from the implementations into the public API
"""

from scores.probability.crps_impl import (
    adjust_fcst_for_crps,
    crps_cdf,
    crps_cdf_brier_decomposition,
)
24 changes: 15 additions & 9 deletions src/scores/probability/checks.py
@@ -1,5 +1,5 @@
"""
This module contains methods which make assertions at runtime about the state of various data
structures and values
"""

@@ -8,24 +8,30 @@


def coords_increasing(da: xr.DataArray, dim: str):
    """Checks if coordinates in a given DataArray are increasing.

    Note: No in-built raise if `dim` is not a dimension of `da`.

    Args:
        da (xr.DataArray): Input data
        dim (str): Dimension to check if increasing

    Returns:
        (bool): True if coordinates along the `dim` dimension of
            `da` are increasing, False otherwise.
    """
result = (da[dim].diff(dim) > 0).all()
return result
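The check above reduces to "successive coordinate values strictly increase". A dependency-free sketch of the same logic (the name `coords_increasing_sketch` is invented; the real function operates on a named dimension of an `xr.DataArray`):

```python
import numpy as np

def coords_increasing_sketch(coords):
    # True when successive coordinate values strictly increase,
    # mirroring (da[dim].diff(dim) > 0).all() from the diff above.
    return bool(np.all(np.diff(coords) > 0))

coords_increasing_sketch([0.0, 0.5, 1.0])  # True
coords_increasing_sketch([0.0, 1.0, 1.0])  # False: not *strictly* increasing
```

Because the comparison is strict (`> 0`), repeated coordinate values fail the check, which is usually what CDF-style code needs.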


def cdf_values_within_bounds(cdf: xr.DataArray) -> bool:
    """Checks that 0 <= cdf <= 1. Ignores NaNs.

    Args:
        cdf (xr.DataArray): array of CDF values

    Returns:
        (bool): `True` if `cdf` values are all between 0 and 1 whenever values are not NaN,
            or if all values are NaN; and `False` otherwise.
    """
return cdf.count() == 0 or ((cdf.min() >= 0) & (cdf.max() <= 1))
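The same NaN-ignoring bounds check can be sketched in plain numpy (`cdf_within_bounds_sketch` is invented here; the real function relies on xarray's `count`, which tallies non-NaN values, so an all-NaN array short-circuits to `True`):

```python
import numpy as np

def cdf_within_bounds_sketch(cdf):
    # Mirrors the logic above: an all-NaN input counts as within bounds,
    # otherwise NaNs are ignored and min/max must fall in [0, 1].
    arr = np.asarray(cdf, dtype=float)
    if np.all(np.isnan(arr)):
        return True
    return bool((np.nanmin(arr) >= 0) and (np.nanmax(arr) <= 1))

cdf_within_bounds_sketch([0.0, 0.25, np.nan, 1.0])  # True: NaN is ignored
cdf_within_bounds_sketch([-0.1, 0.5])               # False: -0.1 < 0
```

Treating all-NaN input as valid is a deliberate choice: it lets downstream CRPS code distinguish "no data" from "bad data".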
