If I understand correctly, the sparsity of your data is around 10%.
Our library is routinely used to decompose matrices of size 100M x 100M, but typically much sparser ones.
I do not see any reason why the library should not work, though you would need to tweak the parameters a bit.
Please adjust the block size and the number of blocks per partition so that each partition of the matrix and of the dense embeddings is smaller than 2 GB (for the definitions of block size and blocks per partition, see our article on Medium), and start with a small embedding size (100, for instance).
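As a rough illustration of that sizing exercise, here is a minimal back-of-the-envelope sketch (standalone Scala, not the library's API) that estimates partition sizes from candidate values of block size, blocks per partition, embedding dimension, and sparsity. The parameter names, byte counts, and partition layout are assumptions made for the estimate; adjust them to your actual storage format.

```scala
// Back-of-the-envelope partition sizing. Assumes a partition holds
// `blocksPerPartition` square blocks of side `blockSize`, that a sparse entry
// costs ~16 bytes (value + indices) and a dense value 8 bytes.
// These names are illustrative, not the library's configuration API.
object PartitionSizing {
  def main(args: Array[String]): Unit = {
    val blockSize           = 10000L // rows/columns per block (candidate value)
    val blocksPerPartition  = 10L    // blocks grouped into one partition
    val embeddingDim        = 100L   // start with a small embedding size
    val sparsity            = 0.10   // ~10% of entries are non-zero
    val bytesPerSparseEntry = 16L    // 8-byte value + row/column indices
    val bytesPerDouble      = 8L

    // Sparse matrix partition: blocksPerPartition blocks of blockSize^2 cells,
    // of which only the non-zero fraction is actually stored.
    val matrixEntries = (blocksPerPartition * blockSize * blockSize * sparsity).toLong
    val matrixBytes   = matrixEntries * bytesPerSparseEntry

    // Dense embedding partition: one embedding row per matrix row in the partition.
    val embeddingRows  = blockSize * blocksPerPartition
    val embeddingBytes = embeddingRows * embeddingDim * bytesPerDouble

    val limit = 2L * 1024 * 1024 * 1024 // 2 GB target per partition
    println(f"matrix partition:    ${matrixBytes / 1e9}%.2f GB (under 2 GB: ${matrixBytes < limit})")
    println(f"embedding partition: ${embeddingBytes / 1e9}%.2f GB (under 2 GB: ${embeddingBytes < limit})")
  }
}
```

With these example values the matrix partition comes out around 1.6 GB and the embedding partition around 0.08 GB, so both stay under the 2 GB target; if your sparsity pushes the matrix partition over, shrink the block size or the number of blocks per partition.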
Would this library scale to the problem described in https://stats.stackexchange.com/questions/355260/distributed-pca-or-an-equivalent?
Thank you.