Experimenting with CRISPR calculations #77

cansavvy · 2024-12-20T21:17:49Z

Description

From a Basecamp conversation, we realized normalization might not be happening the way we think.

@ahberger thought we were calculating CRISPR scores using:

logFC adjusted = (log2FC - log2FC_negctls) / |log2FC_posctls|

But the original code has this as the calculation:

https://github.com/FredHutch/GI_mapping/blob/e117710977fd4c92b62ff3f552254a6a3076a6d4/workflow/scripts/03-filter_and_calculate_LFC.Rmd#L450

d.lfc_annot_adj <- d.lfc_annot %>%
  group_by(rep) %>%
  mutate(lfc_adj1 = lfc_plasmid_vs_late - median(lfc_plasmid_vs_late[norm_ctrl_flag == "negative_control"]),
         lfc_adj2 = lfc_adj1 / (median(lfc_adj1[norm_ctrl_flag == "negative_control"]) -
                                  median(lfc_adj1[norm_ctrl_flag == "positive_control"]))) 
...

Then there is one more median subtraction later:

...
  group_by(rep) %>%
  mutate(lfc_adj3 = lfc_adj2 - median(lfc_adj2[unexpressed_ctrl_flag == TRUE]))

This original calculation is what we've been basing our CRISPR calculations on, and we have gotten results very similar to those in the results folder on the cluster: grp/bergerlab_shared/Projects/paralog_pgRNA/pgPEN_library/GI_mapping/results
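
To make the three steps concrete, here is a minimal base R sketch on made-up numbers for a single replicate. The original code uses dplyr with group_by(rep); the toy values and the control_medians name are assumptions for illustration only. It points at one thing worth checking: after lfc_adj2 the control medians are 0 and -1 by construction, but the lfc_adj3 subtraction shifts both by the median of the unexpressed controls, so they stay at 0 and -1 only if that median is 0.

```r
# Toy data for ONE replicate; all values are invented for illustration.
d <- data.frame(
  lfc_plasmid_vs_late = c(-0.4, -0.6, -2.4, -2.6, -1.0, 0.1, 0.3),
  norm_ctrl_flag = c("negative_control", "negative_control",
                     "positive_control", "positive_control",
                     "single_targeting", "single_targeting",
                     "single_targeting"),
  unexpressed_ctrl_flag = c(FALSE, FALSE, FALSE, FALSE, FALSE, TRUE, TRUE)
)

neg <- d$norm_ctrl_flag == "negative_control"
pos <- d$norm_ctrl_flag == "positive_control"

# Step 1: center on the negative-control median.
lfc_adj1 <- d$lfc_plasmid_vs_late - median(d$lfc_plasmid_vs_late[neg])

# Step 2: scale by the negative-minus-positive control gap.
lfc_adj2 <- lfc_adj1 / (median(lfc_adj1[neg]) - median(lfc_adj1[pos]))

# Step 3: subtract the median of the unexpressed controls.
lfc_adj3 <- lfc_adj2 - median(lfc_adj2[d$unexpressed_ctrl_flag])

# Compare control medians before and after step 3.
control_medians <- data.frame(
  step = c("lfc_adj2", "lfc_adj3"),
  negative_control = c(median(lfc_adj2[neg]), median(lfc_adj3[neg])),
  positive_control = c(median(lfc_adj2[pos]), median(lfc_adj3[pos]))
)
print(control_medians)
```

In this toy data the unexpressed-control median after step 2 is 0.35, so both control medians move down by 0.35 in lfc_adj3. Whether this accounts for what the real data show has not been established here.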

But the results found there, which by all indicators (https://github.com/FredHutch/GI_mapping/blob/e117710977fd4c92b62ff3f552254a6a3076a6d4/workflow/scripts/03-filter_and_calculate_LFC.Rmd#L8) come from the code we have, don't behave as expected.

When I plot these data, they don't adhere to negative controls = 0 and positive controls = -1:

The code on this branch tries to better meet these expectations by replacing the original calculation with the following:

logFC adjusted = (log2FC - log2FC_negctls) / |log2FC_posctls|

This results

Note, however, that this version of the code does not result in a perfect -1 for positive controls:

  rep              norm_ctrl_flag   median_crispr
   <chr>            <fct>                    <dbl>
 1 Day05_RepA_early negative_control         0    
 2 Day05_RepA_early positive_control         3.25 
 3 Day05_RepA_early single_targeting         2.86 
 4 Day05_RepA_early double_targeting         4.47 
 5 Day22_RepA_late  negative_control         0    
 6 Day22_RepA_late  positive_control        -2.18 
 7 Day22_RepA_late  single_targeting        -0.826
 8 Day22_RepA_late  double_targeting        -1.86 
 9 Day22_RepB_late  negative_control         0    
10 Day22_RepB_late  positive_control        -2.07 
11 Day22_RepB_late  single_targeting        -0.793
12 Day22_RepB_late  double_targeting        -1.66 
13 Day22_RepC_late  negative_control         0    
14 Day22_RepC_late  positive_control        -2.13 
15 Day22_RepC_late  single_targeting        -0.785
16 Day22_RepC_late  double_targeting        -1.75
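
One possible reading of why the positive controls miss -1 with this formula, sketched in base R. This assumes log2FC_negctls and log2FC_posctls are the medians of the raw log2FC for each control group within a replicate, which this thread does not confirm; score_abs_pos and the toy values are hypothetical. Under that assumption the positive-control median becomes (pos - neg) / |pos|, which for a negative positive-control median equals -1 only when the negative-control median is already 0.

```r
# Hypothetical helper: divide by |median of RAW positive controls|.
# This is an assumed interpretation of the formula, not the branch's code.
score_abs_pos <- function(lfc, flag) {
  neg_med <- median(lfc[flag == "negative_control"])
  pos_med <- median(lfc[flag == "positive_control"])
  (lfc - neg_med) / abs(pos_med)
}

# Invented values: negative-control median is -0.5, positive is -2.5.
flag <- c("negative_control", "negative_control",
          "positive_control", "positive_control")
lfc <- c(-0.4, -0.6, -2.4, -2.6)

score <- score_abs_pos(lfc, flag)
neg_score_median <- median(score[flag == "negative_control"])
pos_score_median <- median(score[flag == "positive_control"])

# Negative controls land on 0, but positive controls land on
# (-2.5 - (-0.5)) / 2.5 = -0.8 rather than -1.
print(c(negative = neg_score_median, positive = pos_score_median))
```

This does not reproduce the values in the table above; it only shows that the denominator |log2FC_posctls| does not guarantee -1 unless the negative controls are centered before the positive-control median is taken.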

cansavvy · 2024-12-20T21:18:11Z

Overall readability score: 44.82 (🟢 +0.12)

File	Readability
README.md	60.48 (🟢 +0.47)

View detailed metrics

🟢 - Shows an increase in readability
🔴 - Shows a decrease in readability

File	Readability	FRE	GF	ARI	CLI	DCRS
README.md	60.48	50.57	10.65	13.3	11.66	6.39
	🟢 +0.47	🟢 +0.31	🟢 +0.12	🟢 +0.1	🟢 +0	🟢 +0.02

Averages:

	Readability	FRE	GF	ARI	CLI	DCRS
Average	44.82	34.48	11.87	14.18	14.21	8.27
	🟢 +0.12	🟢 +0.08	🟢 +0.03	🟢 +0.02	🟢 +0	🟢 +0

View metric targets

Metric	Range	Ideal score
Flesch Reading Ease	100 (very easy read) to 0 (extremely difficult read)	60
Gunning Fog	6 (very easy read) to 17 (extremely difficult read)	8 or less
Auto. Read. Index	6 (very easy read) to 14 (extremely difficult read)	8 or less
Coleman Liau Index	6 (very easy read) to 17 (extremely difficult read)	8 or less
Dale-Chall Readability	4.9 (very easy read) to 9.9 (extremely difficult read)	6.9 or less

cansavvy · 2024-12-21T00:29:58Z

Following an older version of the code I did:

crispr_score = (lfc - negative_control) / ( negative_control - positive_control)

And now negative controls are 0 and positive controls are -1 as expected. Will interrogate this more later but I think we're more on track. Also have a function to do the plotting and will add this as a part of unit testing.

With the new calculations we are getting closer. It doesn't look like the paper but at least our normalization is actually to the right range now.
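
A minimal base R sketch of the formula above, for reference. The function name crispr_score matches the formula, but its signature, the flag labels reused from norm_ctrl_flag, and the toy values are assumptions, not the package's actual implementation. Plugging the control medians into the formula gives (neg - neg) / (neg - pos) = 0 and (pos - neg) / (neg - pos) = -1, so the expected range holds whenever the two control medians differ.

```r
# Sketch only: normalize so negative controls sit at 0 and positive at -1.
crispr_score <- function(lfc, flag) {
  neg_med <- median(lfc[flag == "negative_control"])
  pos_med <- median(lfc[flag == "positive_control"])
  if (neg_med == pos_med) {
    stop("Control medians are equal; the score is undefined.")
  }
  (lfc - neg_med) / (neg_med - pos_med)
}

# Invented values; the negative-control median is deliberately not 0.
flag <- c("negative_control", "negative_control",
          "positive_control", "positive_control",
          "single_targeting")
lfc <- c(0.9, 1.1, -1.4, -1.6, 0.2)

score <- crispr_score(lfc, flag)

# Median score per group, similar in spirit to the summary table earlier.
print(tapply(score, flag, median))
```

Because the denominator is the gap between the two control medians rather than |pos| alone, the result does not depend on where the raw negative controls happen to sit.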

marissafujimoto · 2025-01-15T19:43:48Z

I can't really comment on the style or organization as I'm quite bad with R, but this all seems reasonable! I know you had some manual tests as well that were looking in line with published results. The one concern I had was that there didn't seem to be an automated test case which covers this change. I wonder if it would be possible to have an integration test with one or more small tests. Preferably small enough to be checked by hand. Or is this harder than I am imagining due to the parameters involved?

cansavvy · 2025-01-16T14:59:01Z

> I can't really comment on the style or organization as I'm quite bad with R, but this all seems reasonable! I know you had some manual tests as well that were looking in line with published results. The one concern I had was that there didn't seem to be an automated test case which covers this change. I wonder if it would be possible to have an integration test with one or more small tests. Preferably small enough to be checked by hand. Or is this harder than I am imagining due to the parameters involved?

Thanks for looking this over.

There are automated tests that cover this, but a minor bug was preventing the run from reaching those tests. So fixing that now!
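
For the "small enough to be checked by hand" idea, a test could look roughly like the base R sketch below. It is not the package's existing test; normalize_by_rep, the column names lfc and crispr_score, and the six toy rows are all hypothetical. It applies crispr_score = (lfc - negative_control) / (negative_control - positive_control) within each replicate and compares against values worked out by hand.

```r
# Hypothetical helper: apply the normalization separately per replicate.
normalize_by_rep <- function(d) {
  pieces <- lapply(split(d, d$rep), function(g) {
    neg_med <- median(g$lfc[g$norm_ctrl_flag == "negative_control"])
    pos_med <- median(g$lfc[g$norm_ctrl_flag == "positive_control"])
    g$crispr_score <- (g$lfc - neg_med) / (neg_med - pos_med)
    g
  })
  do.call(rbind, pieces)
}

# Two replicates with different control levels; one row per group so the
# medians are just the values themselves.
toy <- data.frame(
  rep = c("RepA", "RepA", "RepA", "RepB", "RepB", "RepB"),
  norm_ctrl_flag = rep(c("negative_control", "positive_control",
                         "single_targeting"), times = 2),
  lfc = c(0, -2, -1, 1, -3, -1)
)

out <- normalize_by_rep(toy)

# By hand: RepA -> (0-0)/2, (-2-0)/2, (-1-0)/2
#          RepB -> (1-1)/4, (-3-1)/4, (-1-1)/4
expected <- c(0, -1, -0.5, 0, -1, -0.5)
stopifnot(isTRUE(all.equal(out$crispr_score, expected)))
```

Both replicates give the same scores even though their raw control levels differ, which is the property the normalization is meant to provide.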

logFC adjusted = (log2FC - log2FC_negctls) / |log2FC_posctls|

5979627

cansavvy added 2 commits December 20, 2024 16:23

Add dummy set_knitr_image_path

730cace

This works

18edab2

cansavvy added 6 commits December 20, 2024 19:32

Update docs

62cb31c

Update README

32ff9b8

Update vignette

c7214d2

Merge remote-tracking branch 'origin/main' into cansavvy/crispr-calc

65ea925

Update check

2d9849c

No build vignettes

55d448b

Fix a bug update notebooks

08842e7

cansavvy added 5 commits January 16, 2025 10:12

Update tests

02c0354

Update dpendency install

78b4fcb

streamline

91d04e9

streamlining

cb30b7e

utils

cc75019
