Reference Metric in multiclass precision recall unittests provides wrong answer when ignore_index is specified with average = 'macro' #2828

Open
rittik9 opened this issue Nov 5, 2024 · 1 comment
Labels
bug / fix Something isn't working help wanted Extra attention is needed v1.5.x

Comments

@rittik9
Contributor

rittik9 commented Nov 5, 2024

🐛 Bug

In the unittests, sklearn's recall_score and precision_score are used as the reference. Even though the _reference_sklearn_precision_recall_multiclass() function calls remove_ignore_index to drop the predictions whose targets belong to the ignore_index class before passing them to recall_score, this does not matter: whenever average='macro', sklearn's recall_score and precision_score return the mean over the total number of classes, because all classes are passed in the labels argument of recall_score() and precision_score(). A minimal sketch of this behaviour follows.
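A minimal sketch (hypothetical values, not taken from the actual test suite) of why removing the ignored samples alone does not change the reference result: sklearn still averages over every class listed in labels.

import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([0, 0, 1, 1])
y_pred = np.array([0, 0, 0, 1])

# Drop the samples whose target equals the ignore_index (0),
# mimicking what remove_ignore_index does in the reference function.
keep = y_true != 0
y_true_kept, y_pred_kept = y_true[keep], y_pred[keep]

# Because all classes are still passed via `labels`, macro averaging
# divides by 2, so the "ignored" class still contributes a precision of 0.
print(precision_score(y_true_kept, y_pred_kept, average="macro", labels=[0, 1]))  # 0.5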

To Reproduce

Issue #2441 already describes the wrong behaviour of MulticlassRecall with macro averaging when ignore_index is specified. Although ignore_index is covered by the tests, the test case passes because the reference itself is implemented incorrectly.

The same error occurs for multiclass precision.

Code Example for Multiclass Precision

import torch
from torchmetrics.classification import MulticlassPrecision

metric = MulticlassPrecision(num_classes=2, ignore_index=0, average="none")

y_true = torch.tensor([0, 0, 1, 1])

# Predicted probabilities
y_pred = torch.tensor([
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Incorrectly predicted as class 0 (should be class 1)
    [0.1, 0.9],  # Correctly predicted as class 1
])

metric.update(y_pred, y_true)
precision_result = metric.compute()
print(precision_result)  # tensor([0., 1.])

# Same data, but now with average="macro":
import torch
from torchmetrics.classification import MulticlassPrecision

metric = MulticlassPrecision(num_classes=2, ignore_index=0, average="macro")

y_true = torch.tensor([0, 0, 1, 1])

# Predicted probabilities
y_pred = torch.tensor([
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Incorrectly predicted as class 0 (should be class 1)
    [0.1, 0.9],  # Correctly predicted as class 1
])

metric.update(y_pred, y_true)
precision_result = metric.compute()
print(precision_result)  # tensor(0.5000) , expected: tensor(1.0)

Expected behavior

import numpy as np
from sklearn.metrics import precision_score

y_true = np.array([0, 0, 1, 1])

# Predicted probabilities
y_pred_probs = np.array([
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Correctly predicted as class 0
    [0.9, 0.1],  # Incorrectly predicted as class 0 (should be class 1)
    [0.1, 0.9],  # Correctly predicted as class 1
])

# Convert predicted probabilities to predicted classes
y_pred = np.argmax(y_pred_probs, axis=1)

precision = precision_score(y_true, y_pred, average='macro', labels=[1])  # only considering label 1, i.e. ignoring label 0
print(f"Multiclass Precision: {precision:.2f}") #1.00

Environment

  • TorchMetrics version: 1.5.1
  • Python version: 3.10.12
  • OS: Ubuntu
@rittik9 rittik9 added bug / fix Something isn't working help wanted Extra attention is needed labels Nov 5, 2024
github-actions bot commented Nov 5, 2024

Hi! thanks for your contribution!, great first issue!

@rittik9 rittik9 changed the title Reference Metric in multiclass precision recall unittests provides wrong answer when using Ignore Index with average = 'macro' Reference Metric in multiclass precision recall unittests provides wrong answer when ignore_index is specified with average = 'macro' Nov 5, 2024
@Borda Borda added the v1.5.x label Nov 13, 2024