Sparse input data #102
First, for sparse matrices, I recommend our improved implementation, flashier. However, although in principle flashr and flashier can do EBMF with non-negative priors, I would not really recommend them (whether the data are sparse or not). In particular, we know that convergence can be an issue with non-negative priors.

My recommendation, if this is a sparse count matrix and you want a non-negative factorization, would be to use Poisson non-negative matrix factorization, as in https://github.com/stephenslab/fastTopics, where we have worked much harder on good convergence, and the count nature of the data is modeled directly.

We are also working on semi-nonnegative approaches in flashier (where the loadings are non-negative but the factors are unconstrained).
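For reference, a minimal sketch of the Poisson NMF route suggested above, using fastTopics' `fit_poisson_nmf()` (the matrix `counts` and the choice of `k = 10` here are placeholders, not from the thread; see the fastTopics documentation for the full interface):

```r
# Sketch: Poisson non-negative matrix factorization with fastTopics.
# 'counts' is assumed to be a sparse, non-negative count matrix
# (a base matrix or a Matrix-package sparse matrix).
library(fastTopics)

# Fit a rank-10 Poisson NMF; fastTopics works directly with sparse
# count data, which is the setting this model is designed for.
fit <- fit_poisson_nmf(counts, k = 10)

# fit$L and fit$F hold the non-negative loadings and factors.
```

Note that this models the counts directly, so it is only appropriate if the input really is count data.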
Thanks so much for your reply. I will look into flashier. In terms of what you mentioned:

I wonder whether such difficulty in convergence could lead to fewer factors than expected? I ran flashr with non-negative priors multiple times and the results are pretty consistent (few factors, but very reproducible), so I suspect that in my case convergence was not an issue. However, I do have a very complex dataset and I expect many factors. I didn't consider fastTopics because my input is not a count readout, but rather normalized statistics. Thank you!
Yes, the bottom line is that convergence difficulties could lead to underfitting of the right number of factors.

Maybe try, for comparison, another package for non-negative matrix factorization, like NNLM?
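For the comparison suggested above, a minimal sketch using the NNLM package's `nnmf()` function (`X` and `k = 10` are placeholders standing in for your data and rank choice; they are not from the thread):

```r
# Sketch: non-negative matrix factorization with NNLM for comparison.
# 'X' is assumed to be a dense, non-negative data matrix.
library(NNLM)

# Decompose X (n x p) into W (n x k) %*% H (k x p), both non-negative.
res <- nnmf(X, k = 10)

# Compare the recovered factors in res$W and res$H with the flashr fit;
# if NNLM finds many more factors, underfitting in flashr is plausible.
```

Agreement between the two methods on the number of active factors would be some evidence against a convergence problem.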
Hi,
I have a sparse input matrix and I tried flashr (with non-negative constraints on both the F and L matrices); however, I only get very few factors. I wonder if this has something to do with the sparsity of my input data (53% zeros), and what would be the best practice?
Thanks so much in advance!
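For context, a hypothetical sketch of the kind of non-negative fit described in this question, written against flashier's `flash()` interface (the `ebnm_fn` and `greedy_Kmax` arguments and the simulated matrix are assumptions for illustration; they are not from this thread, so check the flashier documentation before relying on them):

```r
# Hedged sketch: EBMF with non-negative priors on loadings and factors.
library(flashier)
library(Matrix)

# Simulate a sparse non-negative matrix with roughly 53% zeros,
# mimicking the sparsity level described in the question.
set.seed(1)
X <- abs(rsparsematrix(200, 100, density = 0.47))

# Fit with a non-negative (point-exponential) prior family on both sides;
# greedy_Kmax caps the number of factors the greedy fit will try to add.
fit <- flash(X,
             greedy_Kmax = 20,
             ebnm_fn = ebnm::ebnm_point_exponential)
```

As the maintainer notes above, convergence with non-negative priors can be problematic, so few recovered factors may reflect underfitting rather than the true rank.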