I think I would be happier if all our dense matrix types always stored rows in order, with a slot for the row stride (e.g. in bytes) instead of having an array of pointers to rows.
Pros:
Less space, fewer allocations.
Column vector matrices are as efficient as row vector matrices.
Can apply vec methods to the whole matrix when the rows are contiguous.
Creating window matrices is O(1) and requires no allocation.
Fewer memory accesses, so many basic operations should be faster.
Compatibility with BLAS type interfaces.
Cons:
Slower row swaps. Granted, we need these in things like Gaussian elimination and LLL, but are there any places where swapping contents of rows would actually have a measurable performance impact? Typically this will be dwarfed by arithmetic.
Can't create row-permuting window matrices, but do we actually use this anywhere?
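To make the trade-off concrete, here is a minimal sketch of what a strided layout could look like. The names (`smat_t`, `smat_window`, and so on) are hypothetical and are not the existing FLINT API; the stride is counted in entries rather than bytes for simplicity. It illustrates the O(1), allocation-free window and the O(c) row swap listed above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical strided dense matrix; NOT the actual FLINT struct. */
typedef struct
{
    unsigned long *entries; /* pointer to entry (0, 0) */
    size_t r;               /* number of rows */
    size_t c;               /* number of columns */
    size_t stride;          /* entries between starts of consecutive rows */
    int owns;               /* nonzero if entries must be freed by us */
} smat_t;

/* Allocate an r x c zero matrix in one block. Returns 0 on success. */
int smat_init(smat_t *A, size_t r, size_t c)
{
    size_t n = r * c;
    A->entries = calloc(n ? n : 1, sizeof(unsigned long));
    if (A->entries == NULL)
        return -1;
    A->r = r;
    A->c = c;
    A->stride = c;
    A->owns = 1;
    return 0;
}

/* Row access is one multiply-add instead of a load from a pointer array. */
unsigned long *smat_row(const smat_t *A, size_t i)
{
    return A->entries + i * A->stride;
}

/* Window on rows [r1, r2) and columns [c1, c2): O(1), no allocation. */
void smat_window(smat_t *W, const smat_t *A,
                 size_t r1, size_t c1, size_t r2, size_t c2)
{
    assert(r1 <= r2 && r2 <= A->r && c1 <= c2 && c2 <= A->c);
    W->entries = A->entries + r1 * A->stride + c1;
    W->r = r2 - r1;
    W->c = c2 - c1;
    W->stride = A->stride; /* the window shares the parent's stride */
    W->owns = 0;
}

/* Row swap now moves contents: O(c) instead of swapping two pointers. */
void smat_swap_rows(smat_t *A, size_t i, size_t j)
{
    unsigned long *ri = smat_row(A, i), *rj = smat_row(A, j);
    size_t k;
    if (i == j)
        return;
    for (k = 0; k < A->c; k++)
    {
        unsigned long t = ri[k];
        ri[k] = rj[k];
        rj[k] = t;
    }
}

void smat_clear(smat_t *A)
{
    if (A->owns)
        free(A->entries);
    A->entries = NULL;
}
```

When `stride == c` the whole matrix is one contiguous block of `r * c` entries, which is the case where vector routines could be applied to the matrix as a whole.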
That is a lot of pros and not many cons. Are there cases where we would expect substantial speedups by moving from the current design to the suggested one? This is unclear to me, but in any case any speedup or simplification is worth having.
Just to be sure I follow, for nmod_mat this would simply mean dropping the mat->rows array, and adapting the rest so that everything still works, right?
Concerning the row swaps, I suppose that if some algorithm needs so many of them that they impact performance, it could instead create its own pointers to rows, or simply an array representing the current "virtual row permutation", and manipulate the matrix through this (performing the permutation only when necessary, typically at the end or before some recursive call), similarly to how it is done currently?
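A rough sketch of that "virtual row permutation" idea, assuming a flat row-major array with a stride counted in entries. All function names here are hypothetical, and the final step uses a scratch copy for simplicity where a real implementation might follow permutation cycles in place.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Convention (an assumption of this sketch):
   logical row i is stored in physical row perm[i]. */
void perm_init(size_t *perm, size_t r)
{
    size_t i;
    for (i = 0; i < r; i++)
        perm[i] = i;
}

/* Swapping two logical rows is O(1): only the indices move. */
void perm_swap(size_t *perm, size_t i, size_t j)
{
    size_t t = perm[i];
    perm[i] = perm[j];
    perm[j] = t;
}

/* Access logical row i through the permutation. */
unsigned long *virt_row(unsigned long *entries, size_t stride,
                        const size_t *perm, size_t i)
{
    return entries + perm[i] * stride;
}

/* Materialize the permutation so that logical row i ends up in
   physical row i, then reset perm to the identity.
   Returns 0 on success, -1 if the scratch allocation fails. */
int perm_apply(unsigned long *entries, size_t r, size_t c,
               size_t stride, size_t *perm)
{
    size_t i, n = r * c;
    unsigned long *tmp = malloc((n ? n : 1) * sizeof *tmp);
    if (tmp == NULL)
        return -1;
    /* Gather rows in logical order into the scratch buffer. */
    for (i = 0; i < r; i++)
        memcpy(tmp + i * c, entries + perm[i] * stride, c * sizeof *tmp);
    /* Write them back in physical order. */
    for (i = 0; i < r; i++)
    {
        memcpy(entries + i * stride, tmp + i * c, c * sizeof *tmp);
        perm[i] = i;
    }
    free(tmp);
    return 0;
}
```

With this approach the elimination loop would call `perm_swap` for each pivot and `perm_apply` once, for instance at the end or just before a recursive call that expects rows in physical order.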