By default, CUBLAS uses the "host" pointer mode for scalar reference arguments: https://docs.nvidia.com/cuda/cublas/#scalar-parameters

This allows us to use simple `Ref` boxes for scalar inputs, but results in calls like `cublasDot` blocking in `libcublas` until the GPU is synchronized. We should instead set the pointer mode to device, and synchronize in Julia, so that other tasks get the opportunity to execute while waiting for the GPU to finish.

Going beyond this, we could add `async` flags to these APIs to optionally return a lazy scalar, or a 0d array.

cc @Jutho
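
To make the optional `async` idea above more concrete, here is a rough, language-neutral sketch in Python. It is not CUDA.jl or CUBLAS code: a single worker thread stands in for a GPU stream, and every name in it (`LazyScalar`, `dot`, `async_`, `_dot_kernel`) is hypothetical and chosen only for illustration. It shows the caller-visible behavior being proposed: by default the call synchronizes and returns a plain number, and when asked it returns a handle that blocks only when the value is read.

```python
from concurrent.futures import Future, ThreadPoolExecutor

# Assumption for illustration: one worker thread plays the role of a GPU stream.
_stream = ThreadPoolExecutor(max_workers=1)


class LazyScalar:
    """Hypothetical handle to a scalar result that may still be in flight."""

    def __init__(self, future: Future):
        self._future = future

    def ready(self) -> bool:
        # True once the queued work has finished.
        return self._future.done()

    def value(self) -> float:
        # Blocks only here, at the point where the result is actually needed.
        return self._future.result()


def _dot_kernel(x, y):
    # Stand-in for the device-side reduction.
    return float(sum(a * b for a, b in zip(x, y)))


def dot(x, y, async_=False):
    """Hypothetical dot product with an opt-in asynchronous result."""
    pending = LazyScalar(_stream.submit(_dot_kernel, x, y))
    # Default path: synchronize right away and return a plain number.
    return pending if async_ else pending.value()


eager = dot([1, 2, 3], [4, 5, 6])              # plain float
lazy = dot([1, 2, 3], [4, 5, 6], async_=True)  # LazyScalar, not yet read
other_work = sum(range(10))                    # caller keeps running meanwhile
print(eager, lazy.value(), other_work)
```

The design point is that the wait moves out of the library call and into the caller's runtime, which is what lets other tasks be scheduled while the result is pending.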