
Function '_RasterizeToPixelsBackward' returned nan values in its 0th output. #455

insomniaaac opened this issue Oct 16, 2024 · 0 comments
trainer.py:203: UserWarning: Anomaly Detection has been enabled. This mode will increase the runtime and should only be enabled for debugging.
  with autograd.detect_anomaly():
/.../python3.9/site-packages/torch/autograd/graph.py:769: UserWarning: Error detected in _RasterizeToPixelsBackward. Traceback of forward call that caused the error:
  File "trainer.py", line 633, in <module>
    cli(main, cfg, verbose=True)
  File "/.../python3.9/site-packages/gsplat/distributed.py", line 360, in cli
    return _distributed_worker(0, 1, fn=fn, args=args)
  File "/.../python3.9/site-packages/gsplat/distributed.py", line 295, in _distributed_worker
    fn(local_rank, world_rank, world_size, args)
  File "trainer.py", line 591, in main
    runner.train()
  File "trainer.py", line 219, in train
    renders, alphas, info = self.rasterize_splats(
  File "trainer.py", line 143, in rasterize_splats
    render_colors, render_alphas, info = rasterization(
  File "rendering.py", line 561, in rasterization
    render_colors_, render_alphas_ = rasterize_to_pixels(
  File "/.../python3.9/site-packages/gsplat/cuda/_wrapper.py", line 551, in rasterize_to_pixels
    render_colors, render_alphas = _RasterizeToPixels.apply(
  File "/.../python3.9/site-packages/torch/autograd/function.py", line 574, in apply
    return super().apply(*args, **kwargs)  # type: ignore[misc]
 (Triggered internally at ../torch/csrc/autograd/python_anomaly_mode.cpp:111.)
  return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
  0%|                                                                                                                                                               | 0/30000 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "trainer.py", line 633, in <module>
    cli(main, cfg, verbose=True)
  File "/.../python3.9/site-packages/gsplat/distributed.py", line 360, in cli
    return _distributed_worker(0, 1, fn=fn, args=args)
  File "/.../python3.9/site-packages/gsplat/distributed.py", line 295, in _distributed_worker
    fn(local_rank, world_rank, world_size, args)
  File "trainer.py", line 591, in main
    runner.train()
  File "trainer.py", line 279, in train
    loss.backward()
  File "/.../python3.9/site-packages/torch/_tensor.py", line 521, in backward
    torch.autograd.backward(
  File "/.../python3.9/site-packages/torch/autograd/__init__.py", line 289, in backward
    _engine_run_backward(
  File "/.../python3.9/site-packages/torch/autograd/graph.py", line 769, in _engine_run_backward
    return Variable._execution_engine.run_backward(  # Calls into the C++ engine to run the backward pass
RuntimeError: Function '_RasterizeToPixelsBackward' returned nan values in its 0th output.

I found that the scales and opacities contain some NaN values, so I enabled the torch.autograd.detect_anomaly() context.
It reported that a backward error occurs in _RasterizeToPixelsBackward, as shown above.
What should I do to avoid NaN values in the scales and opacities?
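For reference, below is a minimal sketch of how one might catch the non-finite parameters before they reach the backward pass. It is not the gsplat API itself; the `splats` dict, the parameter names, and the surrounding training-loop shape are assumptions modeled loosely on the simple_trainer example, and `rasterize_splats` / `compute_loss` are hypothetical stand-ins for your own code.

import torch

# Sketch: report any Gaussian parameter that already holds NaN/Inf values,
# so the failing iteration can be inspected (or skipped) instead of
# backpropagating through it.
def check_finite(splats: dict) -> bool:
    ok = True
    for name, param in splats.items():  # e.g. "means", "scales", "quats", "opacities" (assumed names)
        if not torch.isfinite(param).all():
            print(f"[warn] non-finite values found in '{name}'")
            ok = False
    return ok

# Hypothetical usage inside the training loop:
# with torch.autograd.detect_anomaly():
#     renders, alphas, info = rasterize_splats(...)   # your rasterization call
#     loss = compute_loss(renders, target)            # your loss
#     if not check_finite(splats):
#         optimizer.zero_grad(set_to_none=True)
#         continue  # skip this step rather than backpropagating NaNs
#     loss.backward()

Checking the parameters right before `loss.backward()` usually narrows down whether the NaNs originate in the parameters themselves (e.g. from an earlier optimizer step with too large a learning rate) or are first produced inside the rasterization backward pass.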
