How can I resolve the error "CUDA error: CUBLAS_STATUS_EXECUTION_FAILED when calling cublasSgemm( handle, opa, opb, m, n, k, &alpha, a, lda, b, ldb, &beta, c, ldc)" reported when using RWKV?
#253
Open
songjie1121 opened this issue
Aug 29, 2024
· 2 comments
I'm planning to use RWKV in my ASR model, but once I use the RWKV module the error above is raised, and only after training has been running for some time. Is there a known cause or a suggested fix? The RWKV code is Ali's previous source code, used without modification; it is simply called in place of the attention layer.
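A possible way to narrow this down, offered as a sketch rather than a confirmed fix: CUDA errors are reported asynchronously, so the `cublasSgemm` call in the message may not be the operation that actually failed, and a crash that only appears after some training time is often preceded by NaN/Inf activations. The snippet below assumes a PyTorch model; `add_finite_checks` is a hypothetical helper name, not part of RWKV.

```python
import os

# Must be set before the first CUDA call so kernels run synchronously and the
# Python traceback points at the op that really failed.
os.environ.setdefault("CUDA_LAUNCH_BLOCKING", "1")

import torch
import torch.nn as nn


def add_finite_checks(model: nn.Module) -> list:
    """Attach forward hooks that raise as soon as a module emits NaN/Inf.

    Hypothetical debugging helper; returns the hook handles so the checks
    can be removed again with handle.remove().
    """
    handles = []

    def make_hook(name):
        def hook(module, inputs, output):
            outputs = output if isinstance(output, (tuple, list)) else (output,)
            for i, t in enumerate(outputs):
                if (
                    torch.is_tensor(t)
                    and t.is_floating_point()
                    and not torch.isfinite(t).all()
                ):
                    raise RuntimeError(
                        f"non-finite values in output {i} of module '{name}'"
                    )
        return hook

    for name, module in model.named_modules():
        handles.append(module.register_forward_hook(make_hook(name)))
    return handles
```

Calling `add_finite_checks(model)` once before the training loop makes the run stop at the first layer whose output is no longer finite, which helps tell numerical divergence apart from a shape or indexing problem. The hooks slow training down, so they are meant for a debugging run only.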
Thank you very much for your reply. I have replaced the attention I was using with RWKV_Tmix_x060_state from the official RWKV6 code. Strangely, however, the validation loss suddenly increases during training. This is the loss curve and the configuration of RWKV_Tmix_x060_state that I used. In addition, I found that if training is interrupted and then resumed from a checkpoint, memory usage doubles. Are there any possible solutions to these problems? Looking forward to your answer. Thank you!
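On the memory doubling after resuming, one common cause in PyTorch (an assumption here, since the resume code is not shown) is that `torch.load` restores the checkpoint tensors directly onto the GPU, so a second copy of the weights and optimizer state stays alive next to the model. A minimal sketch of loading through CPU memory instead; the `"model"` and `"optimizer"` keys and the name `resume_from_checkpoint` are hypothetical and need to be adapted to the actual checkpoint layout.

```python
import torch
import torch.nn as nn


def resume_from_checkpoint(path, model, optimizer=None):
    """Load a checkpoint via CPU memory so no extra GPU copy is kept alive.

    Assumes a dict with a "model" key and an optional "optimizer" key
    (hypothetical layout, not taken from the issue).
    """
    # map_location="cpu" keeps the loaded tensors off the GPU.
    checkpoint = torch.load(path, map_location="cpu")
    # load_state_dict copies values into the existing parameters in place.
    model.load_state_dict(checkpoint["model"])
    if optimizer is not None and "optimizer" in checkpoint:
        optimizer.load_state_dict(checkpoint["optimizer"])
    # Drop the CPU copy and release cached GPU blocks, if any.
    del checkpoint
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
    return model
```

If memory still doubles with this pattern, the extra usage likely comes from somewhere else, for example a second model instance created during resume.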