Hello,
Have you also seen the model output tensors that are entirely NaN during training? It happens to me occasionally, and I'd like to know whether it is a problem with the model or with my setup.
I'm currently training a tiny version of the model so that it fits in RAM, so I had to drop some layers of the final stage and, in general, reduce the number of heads, the dimensions, etc.
EDIT:
I forgot to mention that I'm training with mixed precision to reduce memory usage.
Do you have any idea why this might happen?