[BUG] RuntimeError, an illegal memory access was encountered #460

Open
giacomopas opened this issue Sep 15, 2024 · 4 comments
Labels
bug Something isn't working priority:low Low-priority issues

Comments

@giacomopas

Hello,
I just switched from an Intel 14900K to an AMD Ryzen 9 9950X.
I tried running a batch of profiles with NAM (as I usually did until yesterday on the 14900K machine), and it stopped at the first profile.

This never happened on the other machine with the Intel processor.
Could it be related? How?
Any solution?

Attaching Screenshots
WindowsTerminal_aFLFiHiP6U
WindowsTerminal_l8w2MRqJDp
WindowsTerminal_pLupGex7dQ
WindowsTerminal_Qp8GuiAHXd
WindowsTerminal_tjR7RZFqTt
WindowsTerminal_twjwe3gVuz
WindowsTerminal_wEBB68iSGC
WindowsTerminal_yq1lkj9C9I

Desktop (please complete the following information):

  • OS / Windows 11 23H2
  • Local
@giacomopas added the bug, priority:low, and unread labels Sep 15, 2024
@sdatkinson
Owner

Hi!

Did you re-install after making the switch? If not, can you give that a try and report back?

Also, can you post the errors as copy-pasted text instead of as screencaps? That makes it easier to navigate.

@sdatkinson removed the unread label Sep 16, 2024
@giacomopas
Author

Here it is attached.

I reinstalled PyTorch, then reinstalled NAM.
Nothing changed.

Thanks for your time!

unitlted1.txt

@sdatkinson
Owner

Hmm. This isn't a bug with NAM. It looks like PyTorch's fault.

I'd recommend opening up the bug there, because there's not much I can do for you to fix that.
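For a report like that, it usually helps to paste what the installed PyTorch build says about itself. PyTorch ships `python -m torch.utils.collect_env` for this; the snippet below is only a minimal sketch of the same idea, and `environment_report` is a hypothetical helper name, not part of NAM or PyTorch.

```python
import platform
import sys


def environment_report():
    """Collect basic facts worth pasting into a bug report (hypothetical helper)."""
    info = {
        "python": sys.version.split()[0],
        "os": platform.platform(),
        "cpu": platform.processor() or "unknown",
    }
    try:
        import torch
    except ImportError:
        # Still return the system facts if PyTorch is missing from this environment.
        info["torch"] = "not installed"
        return info
    info["torch"] = torch.__version__
    # CUDA version the wheel was built against (None for CPU-only builds).
    info["torch_cuda_build"] = str(torch.version.cuda)
    info["cuda_available"] = str(torch.cuda.is_available())
    # Whether this PyTorch build was compiled with MKL support.
    info["mkl_available"] = str(torch.backends.mkl.is_available())
    if torch.cuda.is_available():
        info["gpu"] = torch.cuda.get_device_name(0)
    return info


for key, value in environment_report().items():
    print(f"{key}: {value}")
```

Running it inside the same environment that NAM uses shows whether the build sees the GPU at all and which CUDA version it was built against.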

In the meantime, I'd recommend a more thorough reinstall, e.g.

  • A fresh environment
  • Clear Anaconda's caches
  • Reinstall Anaconda (I'm not sure whether it caches now-incorrect information about the CPU in your computer.)

As the message says, you could also use CUDA_LAUNCH_BLOCKING=1 or compile PyTorch with TORCH_USE_CUDA_DSA--but as this implies, it's a bit low-level (hence why I'm pretty confident it's not an issue with NAM's Python code).
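As a rough sketch of the first option: `CUDA_LAUNCH_BLOCKING` is read when CUDA is initialized, so it has to be set before PyTorch touches the GPU (safest is before importing torch). The example below sets it and then runs one small GPU operation outside of NAM; `cuda_smoke_test` is a hypothetical helper name, and the matrix size is arbitrary.

```python
import os

# Must be set before CUDA is initialized; doing it before importing torch is safest.
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"


def cuda_smoke_test():
    """Run one small GPU op outside of NAM (hypothetical helper)."""
    try:
        import torch
    except ImportError:
        return "torch not installed"
    if not torch.cuda.is_available():
        return "CUDA not available"
    x = torch.randn(1024, 1024, device="cuda")
    y = x @ x  # a single matrix multiply on the GPU
    # Wait for the kernel to finish so any error surfaces here.
    torch.cuda.synchronize()
    return f"ok on {torch.cuda.get_device_name(0)}, result shape {tuple(y.shape)}"


print(cuda_smoke_test())
```

If this small test also fails with an illegal memory access, that points further away from NAM's Python code and toward the PyTorch, driver, or hardware layer. With blocking launches enabled, the traceback should also point closer to the operation that actually failed.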

E.g. it doesn't seem right that there's MKL things being installed--if I'm not mistaken, those aren't supported by non-Intel CPUs.

@giacomopas
Author

Ok, I’ll open a bug report there. In the meantime, I tried completely uninstalling Anaconda and reinstalling everything from scratch, but now I'm getting a different error.

unitlted2.txt
