Add support for BLAS Syrk #1358
Comments
This has already been discussed in this issue. You probably already know but
I am aware of general_mat_mul.
Afaik, nobody did. And I don't think anyone will do it in the near future. ndarray hasn't evolved much in the last 4-5 years because of a lack of maintainers.
I'd be interested in improving the BLAS functionality. Having spent a lot of time messing around with multidimensional array libraries myself (sole developer of LibRapid), I've got a decent understanding of how best to implement things. As an experiment, I wrote a BLAS wrapper for Rust, which wraps a system BLAS library in a C API, which is further wrapped in a Rust API. It supports any BLAS library that has a CBLAS interface. Implementing something like this (or using this directly??) could be interesting. Is that something worth looking into? On an unrelated note, I'd also be interested in getting OpenCL and CUDA support added to this library -- an obvious first step would be BLAS with CLBlast and cuBLAS, but JIT-compiled kernels could be interesting, too.
This is something that @bluss might be interested in.
Hi devs,
Hi.
Thank you guys for this awesome library.
In my application, I need to compute a Gram matrix (basically, `x.t().dot(x)`). Using `gemm` is wasteful, as the result is symmetric and the lower half is redundant, so using `syrk` in this case is twice as fast as `gemm`.

On a general note, it would be really useful if the library could also easily support calling any general BLAS / LAPACK function using appropriate idioms (like getting memory layout, strides, etc.)
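
To illustrate where the saving comes from, here is a minimal pure-Rust sketch of what a SYRK-style routine computes for a Gram matrix. It is not ndarray's API and not a BLAS binding: the names `gram_upper`, `mirror_upper_to_lower` and `gram_full` are hypothetical, and the row-major `&[f64]` layout is an assumption made for the example. For an n×k input, the symmetric version evaluates k(k+1)/2 dot products instead of k², which is roughly half the work for large k.

```rust
/// Reference sketch of a SYRK-style Gram matrix: C = A^T * A, where `a` is an
/// n x k matrix stored row-major (assumed layout for this example).
/// Only the upper triangle (j >= i) is computed; the rest stays zero.
fn gram_upper(a: &[f64], n: usize, k: usize) -> Vec<f64> {
    assert_eq!(a.len(), n * k, "slice length must match n * k");
    let mut c = vec![0.0; k * k];
    for i in 0..k {
        for j in i..k {
            // Dot product of column i and column j of A.
            let mut sum = 0.0;
            for r in 0..n {
                sum += a[r * k + i] * a[r * k + j];
            }
            c[i * k + j] = sum;
        }
    }
    c
}

/// Copy the upper triangle into the lower triangle, using C[i][j] == C[j][i].
fn mirror_upper_to_lower(c: &mut [f64], k: usize) {
    for i in 0..k {
        for j in 0..i {
            c[i * k + j] = c[j * k + i];
        }
    }
}

/// GEMM-style reference that computes all k * k entries, for comparison.
fn gram_full(a: &[f64], n: usize, k: usize) -> Vec<f64> {
    assert_eq!(a.len(), n * k, "slice length must match n * k");
    let mut c = vec![0.0; k * k];
    for i in 0..k {
        for j in 0..k {
            let mut sum = 0.0;
            for r in 0..n {
                sum += a[r * k + i] * a[r * k + j];
            }
            c[i * k + j] = sum;
        }
    }
    c
}

fn main() {
    // 3 x 2 matrix: rows [1, 2], [3, 4], [5, 6].
    let a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0];
    let (n, k) = (3, 2);

    let mut c = gram_upper(&a, n, k);
    mirror_upper_to_lower(&mut c, k);

    // Both approaches agree; the symmetric one did fewer dot products.
    assert_eq!(c, gram_full(&a, n, k));
    println!("{:?}", c);
}
```

A real implementation would call an optimized `syrk` from the linked BLAS rather than these loops; the sketch only shows which entries need to be computed.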