Merge branch 'master' into master

Lightning-AI · Oct 8, 2024 · 36de1a0
2 parents 45d36cb + 6226a53
commit 36de1a0

Showing 25 changed files with 115 additions and 44 deletions.
diff --git a/.github/CONTRIBUTING.md b/.github/CONTRIBUTING.md
@@ -16,13 +16,13 @@ We are always looking for help implementing new features or fixing bugs.
    - Add details on how to reproduce the issue - a minimal test case is always best, colab is also great.
      Note, that the sample code shall be minimal and if needed with publicly available data.
 
-1. Try to fix it or recommend a solution. We highly recommend to use test-driven approach:
+2. Try to fix it or recommend a solution. We highly recommend to use test-driven approach:
 
    - Convert your minimal code example to a unit/integration test with assert on expected results.
    - Start by debugging the issue... You can run just this particular test in your IDE and draft a fix.
    - Verify that your test case fails on the master branch and only passes with the fix applied.
 
-1. Submit a PR!
+3. Submit a PR!
 
 _**Note**, even if you do not find the solution, sending a PR with a test covering the issue is a valid contribution and we can
 help you or finish it with you :\]_
@@ -31,14 +31,14 @@ help you or finish it with you :\]_
 
 1. Submit a github issue - describe what is the motivation of such feature (adding the use case or an example is helpful).
 
-1. Let's discuss to determine the feature scope.
+2. Let's discuss to determine the feature scope.
 
-1. Submit a PR! We recommend test driven approach to adding new features as well:
+3. Submit a PR! We recommend test driven approach to adding new features as well:
 
    - Write a test for the functionality you want to add.
    - Write the functional code until the test passes.
 
-1. Add/update the relevant tests!
+4. Add/update the relevant tests!
 
 - [This PR](https://github.com/Lightning-AI/torchmetrics/pull/98) is a good example for adding a new metric
 
@@ -71,7 +71,7 @@ In case you adding new dependencies, make sure that they are compatible with the
 ### Coding Style
 
 1. Use f-strings for output formation (except logging when we stay with lazy `logging.info("Hello %s!", name)`.
-1. You can use `pre-commit` to make sure your code style is correct.
+2. You can use `pre-commit` to make sure your code style is correct.
 
 ### Documentation
 

diff --git a/.github/ISSUE_TEMPLATE/config.yml b/.github/ISSUE_TEMPLATE/config.yml
@@ -3,6 +3,6 @@ contact_links:
   - name: Ask a Question
     url: https://github.com/Lightning-AI/torchmetrics/discussions/new
     about: Ask and answer TorchMetrics related questions
-  - name: 💬 Slack
-    url: https://app.slack.com/client/TR9DVT48M/CQXV8BRH9/thread/CQXV8BRH9-1591382895.254600
-    about: Chat with our community
+  - name: 💬 Chat with us
+    url: https://discord.gg/VptPCZkGNa
+    about: Live chat with experts, engineers, and users in our Discord community.
diff --git a/.github/workflows/publish-pkg.yml b/.github/workflows/publish-pkg.yml
@@ -67,7 +67,7 @@ jobs:
       - run: ls -lh dist/
       # We do this, since failures on test.pypi aren't that bad
       - name: Publish to Test PyPI
-        uses: pypa/[email protected].0
+        uses: pypa/[email protected].2
         with:
           user: __token__
           password: ${{ secrets.test_pypi_password }}
@@ -94,7 +94,7 @@ jobs:
           path: dist
       - run: ls -lh dist/
       - name: Publish distribution 📦 to PyPI
-        uses: pypa/[email protected].0
+        uses: pypa/[email protected].2
         with:
           user: __token__
           password: ${{ secrets.pypi_password }}

diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -69,6 +69,7 @@ repos:
     rev: 0.7.17
     hooks:
       - id: mdformat
+        args: ["--number"]
         additional_dependencies:
           - mdformat-gfm
           - mdformat-black

diff --git a/CHANGELOG.md b/CHANGELOG.md
@@ -44,7 +44,7 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 
 ### Fixed
 
--
+- Fixed for Pearson changes inputs ([#2765](https://github.com/Lightning-AI/torchmetrics/pull/2765))
 
 
 ## [1.4.2] - 2022-09-12
@@ -63,6 +63,9 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
 - Fixed flakiness in tests related to `torch.unique` with `dim=None` ([#2650](https://github.com/Lightning-AI/torchmetrics/pull/2650))
 
 
+- Fixed corner case in `MatthewsCorrCoef` ([#2743](https://github.com/Lightning-AI/torchmetrics/pull/2743))
+
+
 ## [1.4.1] - 2024-08-02
 
 ### Changed

diff --git a/Makefile b/Makefile
@@ -1,5 +1,6 @@
-.PHONY: clean test get-sphinx-template docs env data
+.PHONY: clean test get-sphinx-template docs live-docs env data
 
+export TOKENIZERS_PARALLELISM=false
 export FREEZE_REQUIREMENTS=1
 # assume you have installed need packages
 export SPHINX_MOCK_REQUIREMENTS=1
@@ -39,6 +40,10 @@ docs: clean get-sphinx-template
 	# apt-get install -y texlive-latex-extra dvipng texlive-pictures texlive-fonts-recommended cm-super
 	cd docs && make html --debug --jobs $(nproc) SPHINXOPTS="-W --keep-going"
 
+live-docs: get-sphinx-template
+	pip install -e . --quiet -r requirements/_docs.txt
+	cd docs && make livehtml --jobs $(nproc)
+
 env:
 	pip install -e . -U -r requirements/_devel.txt
 

diff --git a/docs/Makefile b/docs/Makefile
@@ -17,3 +17,6 @@ help:
 # "make mode" option.  $(O) is meant as a shortcut for $(SPHINXOPTS).
 %: Makefile
 	@$(SPHINXBUILD) -M $@ "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
+
+livehtml:
+	sphinx-autobuild "$(SOURCEDIR)" "$(BUILDDIR)" $(SPHINXOPTS) $(O)
diff --git a/docs/source/_static/runllm.js b/docs/source/_static/runllm.js
@@ -0,0 +1,15 @@
+document.addEventListener("DOMContentLoaded", function () {
+    var script = document.createElement("script");
+    script.type = "module";
+    script.id = "runllm-widget-script"
+
+    script.src = "https://widget.runllm.com";
+
+    script.setAttribute("runllm-keyboard-shortcut", "Mod+j"); // cmd-j or ctrl-j to open the widget.
+    script.setAttribute("runllm-name", "TorchMetrics");
+    script.setAttribute("runllm-position", "BOTTOM_RIGHT");
+    script.setAttribute("runllm-assistant-id", "244");
+
+    script.async = true;
+    document.head.appendChild(script);
+});
diff --git a/docs/source/conf.py b/docs/source/conf.py
@@ -220,6 +220,7 @@ def _set_root_image_path(page_path: str) -> None:
 # so a file named "default.css" will overwrite the builtin "default.css".
 html_static_path = ["_static"]
 html_css_files = ["css/custom.css"]
+html_js_files = ["runllm.js"]
 
 # -- Options for HTMLHelp output ---------------------------------------------
 

diff --git a/requirements/_docs.txt b/requirements/_docs.txt
@@ -9,6 +9,7 @@ sphinx-autodoc-typehints ==1.23.0
 sphinx-paramlinks ==0.6.0
 sphinx-togglebutton ==0.3.2
 sphinx-copybutton ==0.5.2
+sphinx-autobuild ==2024.10.3
 sphinx-gallery ==0.17.1
 
 lightning >=1.8.0, <2.5.0

diff --git a/requirements/_tests.txt b/requirements/_tests.txt
@@ -13,7 +13,7 @@ phmdoctest ==1.4.0
 
 psutil ==6.*
 pyGithub >2.0.0, <2.5.0
-fire ==0.6.*
+fire ==0.7.*
 
 cloudpickle >1.3, <=3.0.0
 scikit-learn ==1.2.*; python_version < "3.9"

diff --git a/requirements/classification_test.txt b/requirements/classification_test.txt
@@ -1,7 +1,7 @@
 # NOTE: the upper bound for the package version is only set for CI stability, and it is dropped while installing this package
 #  in case you want to preserve/enforce restrictions on the latest compatible version, add "strict" as an in-line comment
 
-pandas >1.4.0, <=2.2.2
+pandas >1.4.0, <=2.2.3
 netcal >1.0.0, <1.4.0 # calibration_error
 numpy <2.2.0
 fairlearn # group_fairness
diff --git a/requirements/multimodal.txt b/requirements/multimodal.txt
@@ -1,5 +1,5 @@
 # NOTE: the upper bound for the package version is only set for CI stability, and it is dropped while installing this package
 #  in case you want to preserve/enforce restrictions on the latest compatible version, add "strict" as an in-line comment
 
-transformers >=4.42.3, <4.45.0
+transformers >=4.42.3, <4.46.0
 piq <=0.8.0
diff --git a/requirements/nominal_test.txt b/requirements/nominal_test.txt
@@ -1,7 +1,8 @@
 # NOTE: the upper bound for the package version is only set for CI stability, and it is dropped while installing this package
 #  in case you want to preserve/enforce restrictions on the latest compatible version, add "strict" as an in-line comment
 
-pandas >1.4.0, <=2.2.2 # cannot pin version due to numpy version incompatibility
-dython ~=0.7.6
+pandas >1.4.0, <=2.2.3 # cannot pin version due to numpy version incompatibility
+dython ==0.7.6 ; python_version <"3.9"
+dython ~=0.7.8 ; python_version > "3.8"  # we do not use `> =`
 scipy >1.0.0, <1.15.0 # cannot pin version due to some version conflicts with `oldest` CI configuration
 statsmodels >0.13.5, <0.15.0
diff --git a/requirements/text.txt b/requirements/text.txt
@@ -3,8 +3,8 @@
 
 nltk >3.8.1, <=3.9.1
 tqdm <4.67.0
-regex >=2021.9.24, <=2024.7.24
-transformers >4.4.0, <4.45.0
+regex >=2021.9.24, <=2024.9.11
+transformers >4.4.0, <4.46.0
 mecab-python3 >=1.0.6, <1.1.0
 ipadic >=1.0.0, <1.1.0
 sentencepiece >=0.2.0, <0.3.0
diff --git a/requirements/text_test.txt b/requirements/text_test.txt
@@ -4,7 +4,7 @@
 jiwer >=2.3.0, <3.1.0
 rouge-score >0.1.0, <=0.1.2
 bert_score ==0.3.13
-huggingface-hub <0.25
+huggingface-hub <0.26
 sacrebleu >=2.3.0, <2.5.0
 
 mecab-ko >=1.0.0, <1.1.0

diff --git a/src/torchmetrics/detection/mean_ap.py b/src/torchmetrics/detection/mean_ap.py
@@ -124,19 +124,23 @@ class MeanAveragePrecision(Metric):
 
     - ``map_dict``: A dictionary containing the following key-values:
 
-        - map: (:class:`~torch.Tensor`), global mean average precision
-        - map_small: (:class:`~torch.Tensor`), mean average precision for small objects
-        - map_medium:(:class:`~torch.Tensor`), mean average precision for medium objects
-        - map_large: (:class:`~torch.Tensor`), mean average precision for large objects
+        - map: (:class:`~torch.Tensor`), global mean average precision which by default is defined as mAP50-95 e.g. the
+          mean average precision for IoU thresholds 0.50, 0.55, 0.60, ..., 0.95 averaged over all classes and areas. If
+          the IoU thresholds are changed this value will be calculated with the new thresholds.
+        - map_small: (:class:`~torch.Tensor`), mean average precision for small objects (area < 32^2 pixels)
+        - map_medium:(:class:`~torch.Tensor`), mean average precision for medium objects (32^2  pixels < area < 96^2
+          pixels)
+        - map_large: (:class:`~torch.Tensor`), mean average precision for large objects (area > 96^2 pixels)
         - mar_{mdt[0]}: (:class:`~torch.Tensor`), mean average recall for `max_detection_thresholds[0]` (default 1)
           detection per image
         - mar_{mdt[1]}: (:class:`~torch.Tensor`), mean average recall for `max_detection_thresholds[1]` (default 10)
           detection per image
         - mar_{mdt[1]}: (:class:`~torch.Tensor`), mean average recall for `max_detection_thresholds[2]` (default 100)
           detection per image
-        - mar_small: (:class:`~torch.Tensor`), mean average recall for small objects
-        - mar_medium: (:class:`~torch.Tensor`), mean average recall for medium objects
-        - mar_large: (:class:`~torch.Tensor`), mean average recall for large objects
+        - mar_small: (:class:`~torch.Tensor`), mean average recall for small objects (area < 32^2  pixels)
+        - mar_medium: (:class:`~torch.Tensor`), mean average recall for medium objects (32^2 pixels < area < 96^2
+          pixels)
+        - mar_large: (:class:`~torch.Tensor`), mean average recall for large objects (area > 96^2  pixels)
         - map_50: (:class:`~torch.Tensor`) (-1 if 0.5 not in the list of iou thresholds), mean average precision at
           IoU=0.50
         - map_75: (:class:`~torch.Tensor`) (-1 if 0.75 not in the list of iou thresholds), mean average precision at
@@ -150,8 +154,11 @@ class MeanAveragePrecision(Metric):
     For an example on how to use this metric check the `torchmetrics mAP example`_.
 
     .. note::
-        ``map`` score is calculated with @[ IoU=self.iou_thresholds | area=all | max_dets=max_detection_thresholds ].
-        Caution: If the initialization parameters are changed, dictionary keys for mAR can change as well.
+        ``map`` score is calculated with @[ IoU=self.iou_thresholds | area=all | max_dets=max_detection_thresholds ]
+        e.g. the mean average precision for IoU thresholds 0.50, 0.55, 0.60, ..., 0.95 averaged over all classes and
+        all areas and all max detections per image. If the IoU thresholds are changed this value will be calculated with
+        the new thresholds. Caution: If the initialization parameters are changed, dictionary keys for mAR can change as
+        well.
 
     .. note::
         This metric supports, at the moment, two different backends for the evaluation. The default backend is
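
Note on the docstring above: the thresholds and size buckets it describes can be written out in a few lines. This is only an illustrative sketch, not code from `mean_ap.py`; the names `IOU_THRESHOLDS`, `SMALL_MAX`, `MEDIUM_MAX` and `size_bucket` are hypothetical, and the handling of areas that fall exactly on a boundary is an assumption, since the docstring only gives strict inequalities.

```python
# IoU thresholds behind the default "mAP50-95": 0.50, 0.55, ..., 0.95.
IOU_THRESHOLDS = [round(0.50 + 0.05 * i, 2) for i in range(10)]

SMALL_MAX = 32**2  # 1024 pixels
MEDIUM_MAX = 96**2  # 9216 pixels


def size_bucket(area: float) -> str:
    """Return the size bucket for a box area in pixels (hypothetical helper)."""
    # Assumption: an area exactly on a boundary is placed in the larger bucket.
    if area < SMALL_MAX:
        return "small"
    if area < MEDIUM_MAX:
        return "medium"
    return "large"
```

For example, a 20x20 box (400 pixels) counts as small and a 100x100 box (10000 pixels) as large.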

diff --git a/src/torchmetrics/functional/classification/matthews_corrcoef.py b/src/torchmetrics/functional/classification/matthews_corrcoef.py
@@ -64,12 +64,14 @@ def _matthews_corrcoef_reduce(confmat: Tensor) -> Tensor:
     denom = cov_ypyp * cov_ytyt
 
     if denom == 0 and confmat.numel() == 4:
-        if tp == 0 or tn == 0:
-            a = tp + tn
-
-        if fp == 0 or fn == 0:
-            b = fp + fn
-
+        if fn == 0 and tn == 0:
+            a, b = tp, fp
+        elif fp == 0 and tn == 0:
+            a, b = tp, fn
+        elif tp == 0 and fn == 0:
+            a, b = tn, fp
+        elif tp == 0 and fp == 0:
+            a, b = tn, fn
         eps = torch.tensor(torch.finfo(torch.float32).eps, dtype=torch.float32, device=confmat.device)
         numerator = torch.sqrt(eps) * (a - b)
         denom = (tp + fp + eps) * (tp + fn + eps) * (tn + fp + eps) * (tn + fn + eps)
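
Note on the hunk above: the new branch picks `a` and `b` according to which row or column of the binary confusion matrix is empty. The sketch below restates that logic in plain Python so it can be run without torch. The function name `degenerate_binary_mcc` is hypothetical, and the final `numerator / sqrt(denom)` step is an assumption, because the hunk ends before the return statement.

```python
import math

# Machine epsilon of float32, i.e. the value of torch.finfo(torch.float32).eps.
EPS = 2.0**-23


def degenerate_binary_mcc(tp: float, fp: float, fn: float, tn: float) -> float:
    """Regularized MCC for a binary confusion matrix with an empty row or column."""
    if fn == 0 and tn == 0:  # nothing was predicted negative
        a, b = tp, fp
    elif fp == 0 and tn == 0:  # no negative targets
        a, b = tp, fn
    elif tp == 0 and fn == 0:  # no positive targets
        a, b = tn, fp
    elif tp == 0 and fp == 0:  # nothing was predicted positive
        a, b = tn, fn
    else:
        raise ValueError("confusion matrix has no empty row or column")
    numerator = math.sqrt(EPS) * (a - b)
    denom = (tp + fp + EPS) * (tp + fn + EPS) * (tn + fp + EPS) * (tn + fn + EPS)
    # Assumption: the result is numerator / sqrt(denom); the hunk does not show this line.
    return numerator / math.sqrt(denom)
```

With `tp=0, fp=5, fn=0, tn=5`, which matches the test case added in this commit (all targets negative, half of the predictions positive), `a` and `b` are equal and the result is 0.0.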

diff --git a/src/torchmetrics/functional/image/rmse_sw.py b/src/torchmetrics/functional/image/rmse_sw.py
@@ -104,7 +104,8 @@ def _rmse_sw_compute(
     """
     rmse = rmse_val_sum / total_images if rmse_val_sum is not None else None
     if rmse_map is not None:
-        rmse_map /= total_images
+        # prevent overwrite the inputs
+        rmse_map = rmse_map / total_images
     return rmse, rmse_map
 
 

diff --git a/src/torchmetrics/functional/regression/concordance.py b/src/torchmetrics/functional/regression/concordance.py
@@ -27,6 +27,8 @@ def _concordance_corrcoef_compute(
 ) -> Tensor:
     """Compute the final concordance correlation coefficient based on accumulated statistics."""
     pearson = _pearson_corrcoef_compute(var_x, var_y, corr_xy, nb)
+    var_x = var_x / (nb - 1)
+    var_y = var_y / (nb - 1)
     return 2.0 * pearson * var_x.sqrt() * var_y.sqrt() / (var_x + var_y + (mean_x - mean_y) ** 2)
 
 

diff --git a/src/torchmetrics/functional/regression/pearson.py b/src/torchmetrics/functional/regression/pearson.py
@@ -92,9 +92,10 @@ def _pearson_corrcoef_compute(
         nb: number of observations
 
     """
-    var_x /= nb - 1
-    var_y /= nb - 1
-    corr_xy /= nb - 1
+    # prevent overwrite the inputs
+    var_x = var_x / (nb - 1)
+    var_y = var_y / (nb - 1)
+    corr_xy = corr_xy / (nb - 1)
     # if var_x, var_y is float16 and on cpu, make it bfloat16 as sqrt is not supported for float16
     # on cpu, remove this after https://github.com/pytorch/pytorch/issues/54774 is fixed
     if var_x.dtype == torch.float16 and var_x.device == torch.device("cpu"):
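
Note on the hunk above: the difference between `var_x /= nb - 1` and `var_x = var_x / (nb - 1)` is that the first mutates the object the caller passed in, while the second only rebinds the local name. The sketch below shows the effect with a small mutable stand-in class. `Buf` and both helper functions are hypothetical and exist only for illustration; `torch.Tensor` behaves the same way with respect to in-place division.

```python
class Buf:
    """Minimal mutable stand-in for a tensor (hypothetical, for illustration only)."""

    def __init__(self, value: float) -> None:
        self.value = value

    def __truediv__(self, other: float) -> "Buf":
        return Buf(self.value / other)  # new object, original untouched

    def __itruediv__(self, other: float) -> "Buf":
        self.value /= other  # mutates the existing object
        return self


def compute_in_place(var_x: Buf, nb: int) -> float:
    var_x /= nb - 1  # also rescales the caller's object
    return var_x.value


def compute_out_of_place(var_x: Buf, nb: int) -> float:
    var_x = var_x / (nb - 1)  # rebinds the local name only
    return var_x.value


state = Buf(90.0)
first = compute_in_place(state, 10)
after_in_place = state.value  # the accumulated state was rescaled

state = Buf(90.0)
second = compute_out_of_place(state, 10)
after_out_of_place = state.value  # the accumulated state is unchanged
```

Both calls return the same value, but only the out-of-place version leaves the accumulated state intact for a later `update` followed by `compute`.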

diff --git a/src/torchmetrics/regression/r2.py b/src/torchmetrics/regression/r2.py
@@ -38,8 +38,8 @@ class R2Score(Metric):
 
     where the parameter :math:`k` (the number of independent regressors) should be provided as the `adjusted` argument.
     The score is only proper defined when :math:`SS_{tot}\neq 0`, which can happen for near constant targets. In this
-    case a score of 0 is returned. By definition the score is bounded between 0 and 1, where 1 corresponds to the
-    predictions exactly matching the targets.
+    case a score of 0 is returned. By definition the score is bounded between :math:`-inf` and 1.0, with 1.0 indicating
+    perfect prediction, 0 indicating constant prediction and negative values indicating worse than constant prediction.
 
     As input to ``forward`` and ``update`` the metric accepts the following input:
 
@@ -99,7 +99,6 @@ class R2Score(Metric):
     is_differentiable: bool = True
     higher_is_better: bool = True
     full_state_update: bool = False
-    plot_lower_bound: float = 0.0
     plot_upper_bound: float = 1.0
 
     sum_squared_error: Tensor
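
Note on the docstring change above: a short plain-Python sketch shows why the score has no lower bound. It assumes the standard definition R2 = 1 - SS_res / SS_tot without the `adjusted` correction; the function name `r2_score` is hypothetical and this is not the torchmetrics implementation.

```python
def r2_score(preds: list, target: list) -> float:
    """Plain R2 = 1 - SS_res / SS_tot (illustrative sketch, no adjustment)."""
    mean_t = sum(target) / len(target)
    ss_res = sum((t - p) ** 2 for p, t in zip(preds, target))
    ss_tot = sum((t - mean_t) ** 2 for t in target)
    if ss_tot == 0:
        # Constant targets: the docstring states that 0 is returned in this case.
        return 0.0
    return 1.0 - ss_res / ss_tot
```

For targets `[1, 2, 3]`, predicting the targets exactly gives 1.0, predicting their mean `[2, 2, 2]` gives 0.0, and the reversed predictions `[3, 2, 1]` give -3.0.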

diff --git a/src/torchmetrics/regression/symmetric_mape.py b/src/torchmetrics/regression/symmetric_mape.py
@@ -41,7 +41,7 @@ class SymmetricMeanAbsolutePercentageError(Metric):
 
     As output of ``forward`` and ``compute`` the metric returns the following output:
 
-    - ``smape`` (:class:`~torch.Tensor`): A tensor with non-negative floating point smape value between 0 and 1
+    - ``smape`` (:class:`~torch.Tensor`): A tensor with non-negative floating point smape value between 0 and 2
 
     Args:
         kwargs: Additional keyword arguments, see :ref:`Metric kwargs` for more info.
@@ -60,6 +60,7 @@ class SymmetricMeanAbsolutePercentageError(Metric):
     higher_is_better: bool = False
     full_state_update: bool = False
     plot_lower_bound: float = 0.0
+    plot_upper_bound: float = 2.0
 
     sum_abs_per_error: Tensor
     total: Tensor
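
Note on the bound change above: the upper limit of 2 follows from the definition of the metric. The sketch below assumes the usual form, the mean of 2 * |y - y_hat| / max(|y| + |y_hat|, eps). The function name `smape` and the default `eps` value are assumptions for illustration, not taken from this diff.

```python
def smape(preds: list, target: list, eps: float = 1.17e-06) -> float:
    """Symmetric MAPE over paired values (illustrative sketch)."""
    total = sum(
        abs(p - t) / max(abs(p) + abs(t), eps)  # eps guards against division by zero
        for p, t in zip(preds, target)
    )
    return 2.0 * total / len(preds)
```

A perfect prediction gives 0.0, and a prediction with the opposite sign of the target, such as 1.0 against -1.0, reaches the maximum of 2.0.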

diff --git a/tests/unittests/classification/test_matthews_corrcoef.py b/tests/unittests/classification/test_matthews_corrcoef.py
@@ -331,6 +331,12 @@ def test_zero_case_in_multiclass():
             torch.tensor([0, 0, 0, 0, 0, 1, 1, 1, 1, 1]),
             0.0,
         ),
+        (
+            binary_matthews_corrcoef,
+            torch.tensor([1, 1, 1, 1, 1, 0, 0, 0, 0, 0]),
+            torch.tensor([0, 0, 0, 0, 0, 0, 0, 0, 0, 0]),
+            0.0,
+        ),
         (binary_matthews_corrcoef, torch.zeros(10), torch.ones(10), -1.0),
         (binary_matthews_corrcoef, torch.ones(10), torch.zeros(10), -1.0),
         (

diff --git a/tests/unittests/regression/test_pearson.py b/tests/unittests/regression/test_pearson.py
@@ -164,3 +164,25 @@ def test_single_sample_update():
     metric(torch.tensor([7.0]), torch.tensor([8.0]))
     res2 = metric.compute()
     assert torch.allclose(res1, res2)
+
+
+def test_overwrite_reference_inputs():
+    """Test that the normalizations does not overwrite inputs.
+
+    Variables var_x, var_y, corr_xy are references to the object variables and get incorrectly scaled down such that
+    when you update again and compute you get very wrong values.
+
+    """
+    y = torch.randn(100)
+    y_pred = y + torch.randn(y.shape) / 5
+    # Initialize Pearson correlation coefficient metric
+    pearson = PearsonCorrCoef()
+    # Compute the Pearson correlation coefficient
+    correlation = pearson(y, y_pred)
+
+    pearson = PearsonCorrCoef()
+    for lower, upper in [(0, 33), (33, 66), (66, 99), (99, 100)]:
+        pearson.update(torch.tensor(y[lower:upper]), torch.tensor(y_pred[lower:upper]))
+        pearson.compute()
+
+    assert torch.isclose(pearson.compute(), correlation)