This would certainly be helpful to include in the interface, but I'm not sure that execution would end up being any faster than cheb2leg(cheb2leg(X')') apart from using plans and a temp array for the transpose.
The current plans are already OpenMP-parallelized, which explains why multi-column execution may be faster than expected (this is slightly less than half of the 2D transform):
julia> n = 10_000; x = randn(n); p = plan_leg2cheb(Float64, n); lmul!(p, ldiv!(p, x)); @time lmul!(p, x);
0.007975 seconds
julia> X = randn(n, n);
julia> lmul!(p, ldiv!(p, X)); @time lmul!(p, X);
5.521245 seconds
julia> n*0.007975/5.52 # close to 18 -- my core count
14.447463768115943
FFTW allows specifying "regions" for multidimensional FFTs:
It turns out I need this feature for Legendre transforms. At the moment it only does 1D transforms:
I can add it to FastTransforms.jl, e.g., if I need a 2D Legendre I can do
but I'm curious if this could be SIMD-optimised (or multithreaded) in C to make it faster?
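For reference, here is a minimal sketch of how a dims-aware 2D transform can be built from a column-wise 1D transform using transposes. It assumes the 1D transform is a linear map applied to each column; `T`, `transform1d`, and `transform_dims` are hypothetical stand-ins for illustration and are not part of FastTransforms.jl.

```julia
using LinearAlgebra

# Hypothetical stand-in for a column-wise 1D transform such as cheb2leg.
# T is an arbitrary upper-triangular matrix, NOT the real connection matrix.
n = 4
T = triu(rand(n, n)) + I
transform1d(X) = T * X

# Hypothetical helper: apply the 1D transform along each listed dimension.
# Dimension 2 is handled by transposing, transforming columns, and transposing back.
function transform_dims(X::AbstractMatrix, dims)
    Y = copy(X)
    for d in dims
        Y = d == 1 ? transform1d(Y) : permutedims(transform1d(permutedims(Y)))
    end
    return Y
end

X = randn(n, n)
Y2d = transform_dims(X, (1, 2))           # transform along both dimensions
Ynested = transform1d(transform1d(X')')   # the nested-transpose form
```

Under these assumptions both forms compute `T * X * T'`, so a native `dims` argument would mainly save the temporary arrays created by the transposes rather than change the result.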