
GMSD produced NaN gradient #308

Open
markdjwilliams opened this issue Mar 16, 2022 · 3 comments
markdjwilliams commented Mar 16, 2022

Describe the bug

Using piq 0.6.0, GMSD produces a NaN value in its gradient, which is flagged by PyTorch's autograd anomaly checker.

To Reproduce

import torch
from piq.gmsd import GMSDLoss

torch.autograd.set_detect_anomaly(True)

y = torch.rand((8, 3, 256, 256))
y_hat = torch.zeros((8, 3, 256, 256), requires_grad=True)
criterion = GMSDLoss()
optim = torch.optim.SGD([y_hat], lr=0.1)

for i in range(100):
    optim.zero_grad()
    loss = criterion(y, torch.sigmoid(y_hat))
    loss.backward()
    optim.step()
    print(loss)

... raises an exception:

...../torch/autograd/__init__.py:132: UserWarning: Error detected in PowBackward0. Traceback of forward call that caused the error:
  File "sandbox.py", line 12, in <module>
    loss = criterion(y, torch.sigmoid(y_hat))
  File "...../torch/nn/modules/module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "..../piq/gmsd.py", line 142, in forward
    return gmsd(x=x, y=y, reduction=self.reduction, data_range=self.data_range, t=self.t)
  File "..../piq/gmsd.py", line 61, in gmsd
    score = _gmsd(x=x, y=y, t=t)
  File "..../piq/gmsd.py", line 86, in _gmsd
    y_grad = gradient_map(y, kernels)
  File "..../piq/functional/base.py", line 58, in gradient_map
    return torch.sqrt(torch.sum(grads ** 2, dim=-3, keepdim=True))
 (Triggered internally at  ...../torch/csrc/autograd/python_anomaly_mode.cpp:104.)
  allow_unreachable=True)  # allow_unreachable flag
Traceback (most recent call last):
  File "sandbox.py", line 13, in <module>
    loss.backward()
  File "...../torch/tensor.py", line 221, in backward
    torch.autograd.backward(self, gradient, retain_graph, create_graph)
  File "...../torch/autograd/__init__.py", line 132, in backward
    allow_unreachable=True)  # allow_unreachable flag
RuntimeError: Function 'PowBackward0' returned nan values in its 0th output.

Expected behavior
The expectation would be that the program completes without raising an exception, with the displayed loss values ideally decreasing throughout.

Additional context
Replacing the line:

y_hat = torch.zeros((8, 3, 256, 256), requires_grad=True)

... with:

y_hat = torch.rand((8, 3, 256, 256), requires_grad=True)

... does allow the program to complete without error.
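
For context, here is a minimal sketch of what I suspect is happening (an assumption based on the traceback, not a confirmed diagnosis): with y_hat initialized to zeros, torch.sigmoid(y_hat) is a constant image, so its gradient map is all zeros, and the backward of torch.sqrt is 1 / (2 * sqrt(x)), which is infinite at zero:

import torch

# Hypothetical reduction of the flagged line: sqrt of a sum of squares
# that is exactly zero. The backward of sqrt is inf at 0, and inf times
# the zero gradient of x ** 2 yields NaN (matching PowBackward0 above).
x = torch.zeros(3, requires_grad=True)
out = torch.sqrt(torch.sum(x ** 2))
out.backward()
print(x.grad)  # tensor([nan, nan, nan])

The rand initialization avoids this because the gradient map is then nonzero almost everywhere.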

markdjwilliams added the bug label Mar 16, 2022
markdjwilliams (Author) commented

Substituting SRSIMLoss for GMSDLoss also raises the exact same exception, with autograd again flagging the line return torch.sqrt(torch.sum(grads ** 2, dim=-3, keepdim=True)) as the source.
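
If it is useful, a common stabilization for this pattern (a sketch only, not piq's actual API; gradient_magnitude and eps are hypothetical names) is to add a small epsilon inside the square root so the backward stays finite when all gradients are zero:

import torch

def gradient_magnitude(grads: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    # Hypothetical stabilized variant of the line flagged by autograd in
    # piq/functional/base.py: the added eps keeps sqrt's backward finite
    # when grads is exactly zero everywhere.
    return torch.sqrt(torch.sum(grads ** 2, dim=-3, keepdim=True) + eps)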

zakajd (Collaborator) commented Mar 17, 2022

Hi @markdjwilliams, thanks for raising an issue!
We will investigate this further.
Meanwhile, feel free to open a PR if you find the source of this bug 🐛

zakajd self-assigned this Mar 17, 2022
markdjwilliams (Author) commented

Thank you; please let me know if this cannot be reproduced on your side.
