Experimenting with CRISPR calculations #77
Conversation
Following an older version of the code I did:
And now negative controls are 0 and positive controls are -1, as expected. I will interrogate this more later, but I think we're more on track. I also have a function to do the plotting and will add it as part of unit testing. With the new calculations we are getting closer: the result doesn't look like the paper, but at least our normalization now lands in the right range.
I can't really comment on the style or organization as I'm quite bad with R, but this all seems reasonable! I know you had some manual tests as well that were looking in line with published results. The one concern I had was that there didn't seem to be an automated test case that covers this change. I wonder if it would be possible to have an integration test with one or more small cases, preferably small enough to be checked by hand. Or is this harder than I am imagining due to the parameters involved?
Thanks for looking this over. There are automated tests that cover this, but a minor bug was preventing the run from reaching those tests. So fixing that now!
Description
From a Basecamp conversation, we realized normalization might not be happening the way we think.
@ahberger thought we were calculating CRISPRs using:
But the original code has this as the calculation:
https://github.com/FredHutch/GI_mapping/blob/e117710977fd4c92b62ff3f552254a6a3076a6d4/workflow/scripts/03-filter_and_calculate_LFC.Rmd#L450
And then one more median subtraction later.
This is what we've been basing CRISPR calculations on, and we have gotten results very similar to those in the results folder on the cluster:
grp/bergerlab_shared/Projects/paralog_pgRNA/pgPEN_library/GI_mapping/results
By all indicators (https://github.com/FredHutch/GI_mapping/blob/e117710977fd4c92b62ff3f552254a6a3076a6d4/workflow/scripts/03-filter_and_calculate_LFC.Rmd#L8), the results found there come from the code we have.
But when I plot these data, they don't adhere to the expected negative controls = 0 and positive controls = -1:
The code on this branch therefore attempts to better meet these expectations by calculating CRISPR using the following:
Instead of the original code. This results
Note, however, that this version of the code does not result in a perfect -1 for positive controls:
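For reference, the target range described above (negative controls at 0, positive controls at -1) can be sketched with a small, hand-checkable example. This is a minimal Python illustration under assumed inputs, not the R calculation on this branch: the function name `normalize_to_controls`, its arguments, and the toy log fold change values are all hypothetical, and it assumes the scaling uses the medians of the two control groups.

```python
from statistics import median


def normalize_to_controls(lfc, neg_ctrl_lfc, pos_ctrl_lfc):
    """Rescale log fold changes so that the negative-control median maps
    to 0 and the positive-control median maps to -1.

    Hypothetical helper for illustration only; the names and the choice
    of medians are assumptions, not the code in this PR.
    """
    neg_med = median(neg_ctrl_lfc)
    pos_med = median(pos_ctrl_lfc)
    scale = neg_med - pos_med
    if scale == 0:
        raise ValueError("control medians are equal; cannot rescale")
    # Shift so the negative-control median is 0, then divide by the gap
    # between the control medians so the positive-control median is -1.
    return [(x - neg_med) / scale for x in lfc]


# Toy values chosen so the arithmetic can be checked by hand.
neg = [0.4, 0.5, 0.6]      # median 0.5
pos = [-1.6, -1.5, -1.4]   # median -1.5, so scale = 0.5 - (-1.5) = 2.0

neg_norm = normalize_to_controls(neg, neg, pos)
pos_norm = normalize_to_controls(pos, neg, pos)
```

Under this scheme only the control medians land exactly on 0 and -1. Individual control constructs still scatter around those values, so a plot of per-construct scores would show distributions centered on the targets rather than single points.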