Possible bug when using Phenograph with merge method = FIXED and number of events lower than fixedNum #12
Comments
Hi Lucia, thanks for identifying this! It seems to be an issue with the "Finding neighbours" stage: the treetype argument of the function called there is "bd", which, according to this issue page, gives the error you reported when there are duplicates in the data (such as those created when sampling with replacement). I'll look into whether it's feasible to use "kd" rather than "bd", or to add an internal test for duplicates to decide which is best.
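To make the duplicate problem concrete, here is a minimal base-R sketch using the event counts from the original report (5004 events, fixedNum = 10000). It is not cytofkit's actual merge code; the variable names are made up for illustration, and it only assumes that the FIXED merge draws row indices with replacement when a file has fewer events than fixedNum.

```r
# Illustrative only: these names are hypothetical, not cytofkit internals.
n_events  <- 5004    # events in the single FCS file from the report
fixed_num <- 10000   # requested fixedNum

set.seed(1)
# Sampling with replacement is only needed when the file is too small.
idx <- sample(n_events, fixed_num, replace = n_events < fixed_num)

length(idx)                      # 10000 rows are drawn
n_repeats <- sum(duplicated(idx))
n_repeats >= fixed_num - n_events  # TRUE: at least 4996 rows are exact repeats
```

Since only 5004 distinct rows exist, at least 4996 of the 10000 sampled rows must be exact copies of another row, which is the situation the "bd" tree type reportedly cannot handle.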
If "kd" fails, adding some noise may do the job: I think rnorm(..., sd = 1e-7) should avoid data duplication.
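The noise idea can be sketched in base R as follows. This is an assumption about how one might apply it, not something cytofkit does: jitter_duplicates is a hypothetical helper, and the toy matrix stands in for real expression data.

```r
# Hypothetical helper (not part of cytofkit): add tiny Gaussian noise to
# every value so that resampled rows are no longer exact duplicates.
jitter_duplicates <- function(x, sd = 1e-7) {
  x <- as.matrix(x)
  noise <- matrix(rnorm(length(x), mean = 0, sd = sd),
                  nrow = nrow(x), ncol = ncol(x))
  x + noise
}

set.seed(42)
events    <- matrix(rnorm(20), nrow = 5)   # 5 toy events, 4 markers
resampled <- events[sample(nrow(events), 12, replace = TRUE), ]

anyDuplicated(resampled) > 0   # TRUE: 12 draws from 5 rows must repeat
jittered <- jitter_duplicates(resampled)
anyDuplicated(jittered)        # index of first duplicated row, 0 if none
```

With sd = 1e-7 the perturbation is far below typical measurement noise, so it should break ties without visibly changing the neighbour graph, though that would need checking on real data.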
I realised there may be downstream problems when sampling with replacement. I've come up with two possible solutions if cytofkit detects: Solution 1: Force Solution 2: Force Currently, I've decided to go with solution 2, as I'm of the opinion that it would provide more balanced analysis results if the same amount of data was extracted from each FCS file. However, if you feel that solution 1 would be more appropriate for analysis, do let me know!
Sounds good to me. Thanks so much for looking into this!
So is there no way to sample with replacement, as the documentation suggests? I'd like to set the number to the maximum, not the minimum, number of events across FCS files and upsample.
Hi,
I have been using cytofkit for a while and I have noticed that when I analyze data with merge method = FIXED and one or more of the FCS files has fewer events than the specified fixedNum (i.e., sampling with replacement), Phenograph gets stuck for a long time in the "Finding nearest neighbors..." step until it crashes.
For example, a run on a single FCS file with 5004 events, merge = FIXED and fixedNum = 10000 ends like this:
Whereas if I run the same dataset with fixedNum = 5000, it runs without problems:
It looks like the merge with replacement is done correctly ("Input data of 10000|5000 rows..."), but for some reason Phenograph does not like it.
If I use other merge methods (ceil, all), it runs fine; it only crashes when merge = fixed and the number of events is lower than fixedNum.
Also, I saw that there was an unrelated bug in version 1.8.3 (issue #11). In case this bug was also specific to version 1.8.3, I tried versions 1.6.5 and 1.9.4, and both show the same issue, i.e. Phenograph crashing when merging with replacement.
Happy to send example files if you want to try to replicate the problem.
Thanks in advance for your time and for developing such a great package,
Best regards,
Lucia