big data set and K #8

Open
yuGithuuub opened this issue May 19, 2020 · 2 comments

Comments

@yuGithuuub

Hey ALRA team,
I would like to ask about ALRA's performance on very large datasets (~600k cells).
I am using the Scanpy pipeline and I have 2 questions:

  1. I noticed in your article that an excessively large value of k seems to have little effect on the results. Is it appropriate to use the default parameter of k = 50?
  2. After subsetting the data, ALRA seemed to perform better. Is this related to the k value?

By the way, ALRA provides the best experience in certain aspects! ^_^
Looking forward to your reply.
@JunZhao1990
Member

JunZhao1990 commented Jan 12, 2021

Thanks for your interest in ALRA! And sorry for the very late response.
To better understand your question, could you provide the k values that ALRA estimates for the whole dataset and for the subset? You can run the choose_k() function in the ALRA code to find the estimated k.
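
For readers comparing k between the full data and a subset, here is a minimal Python sketch of a singular-value-gap heuristic of the kind a rank-selection function could use. It is not the ALRA code: the name `choose_k_sketch`, the `noise_start` and `thresh` parameters, their defaults, and the synthetic spectrum are all assumptions made for illustration.

```python
import numpy as np


def choose_k_sketch(singular_values, noise_start=80, thresh=6.0):
    """Pick a rank from the gaps between consecutive singular values.

    Hypothetical helper: gaps from position `noise_start` onward are
    assumed to be pure noise, and k is the last position whose gap
    stands out from that noise by more than `thresh` standard deviations.
    """
    s = np.asarray(singular_values, dtype=float)
    gaps = s[:-1] - s[1:]                 # gaps[i] = s[i] - s[i + 1]
    noise = gaps[noise_start:]            # assumed noise-only region
    mu = noise.mean()
    sigma = noise.std(ddof=1)             # sample standard deviation
    z = (gaps - mu) / sigma               # how unusual each gap is
    significant = np.flatnonzero(z > thresh)
    if significant.size == 0:
        raise ValueError("no gap exceeds the threshold; try a lower thresh")
    # A significant gap after s[i] means i + 1 components carry signal.
    return int(significant[-1]) + 1


def demo_singular_values():
    """Synthetic spectrum: 10 well-separated values, then a slow noisy tail."""
    head = np.arange(100.0, 0.0, -10.0)              # 100, 90, ..., 10
    tail_gaps = np.tile([0.01, 0.02], 45)            # 90 small alternating gaps
    tail = 5.0 - np.cumsum(tail_gaps)
    return np.concatenate([head, tail])


k = choose_k_sketch(demo_singular_values())
print("estimated k:", k)
```

Running the same estimate on the full matrix and on the subset, and reporting both numbers, would show whether the two analyses are actually using different ranks.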

@ghost

ghost commented Mar 5, 2024

@yuGithuuub
Hello, ANA111. I, too, work with large datasets in my analyses. I've encountered an issue related to sparseMatrix. Have you faced a similar challenge by any chance?

[Error occurred]
Error in .m2sparse(from, paste0(kind, "g", repr), NULL, NULL):
attempt to construct sparseMatrix with more than 2^31-1 nonzero entries
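
The limit quoted in the error message, 2^31-1 nonzero entries, can be checked with simple arithmetic before attempting a conversion. The sketch below is a back-of-the-envelope estimate only: the helper names, the gene count of 20,000, and the density values are hypothetical, and only the ~600k cell count and the 2^31-1 limit come from this thread.

```python
# Largest nonzero count allowed, as stated in the error message above.
INT32_MAX = 2**31 - 1


def estimated_nonzeros(n_cells, n_genes, density):
    """Approximate nonzero count for a matrix with the given fill fraction."""
    return round(n_cells * n_genes * density)


def fits_in_32bit_sparse(n_nonzero):
    """True if the nonzero count stays within the 2^31-1 limit."""
    return n_nonzero <= INT32_MAX


n_cells, n_genes = 600_000, 20_000   # gene count is an assumed example value
for density in (0.05, 0.20, 1.00):   # assumed fill fractions
    nnz = estimated_nonzeros(n_cells, n_genes, density)
    print(f"density {density:.2f}: {nnz:,} nonzeros, "
          f"fits: {fits_in_32bit_sparse(nnz)}")
```

Under these assumed numbers, a matrix of this shape stays under the limit only while it is quite sparse, so a step that makes the matrix denser could cross it even when the raw counts do not.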
