Include summary step for guide + guide abundances (cell count per perturbation) #56

Open
gwaybio opened this issue Oct 23, 2020 · 12 comments
Labels
enhancement New feature or request

Comments

@gwaybio
Member

gwaybio commented Oct 23, 2020

Right now, perturbation abundances are output only as one file per site. We should make per-plate perturbation abundances easier to retrieve by summarizing perturbation counts in an additional script.

@jbauman214 - unfortunately, the info you requested is not readily available. We do calculate it at the per-site level, so it is possible to retrieve. The file name you are looking for is:

EXPERIMENT_LABEL/data/0.site-qc/PLATE_NAME/spots/SITE_NAME/cell_perturbation_category_summary_counts.tsv

These files are available on GitHub in a private repository for the EXPERIMENT_LABEL, per PLATE_NAME. I am intentionally obscuring experimental details because this issue is in a public repo.
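As a rough sketch of how the per-site files for one plate could be collected, assuming the directory layout in the path above holds for every site: the function name `find_site_summary_files` and its arguments are hypothetical, not part of the existing pipeline.

```python
from pathlib import Path

# File name taken from the path quoted above.
SUMMARY_FILENAME = "cell_perturbation_category_summary_counts.tsv"


def find_site_summary_files(experiment_dir, plate_name):
    """Return the per-site summary files for one plate, sorted by path.

    Assumes the layout EXPERIMENT_LABEL/data/0.site-qc/PLATE_NAME/spots/SITE_NAME/.
    """
    spots_dir = Path(experiment_dir) / "data" / "0.site-qc" / plate_name / "spots"
    # Each site is expected to be a subdirectory holding one summary file.
    return sorted(spots_dir.glob(f"*/{SUMMARY_FILENAME}"))
```

Sites without a summary file are simply skipped, so the caller may want to compare the number of files found against the expected number of sites.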

gwaybio added the enhancement label Oct 23, 2020
gwaybio added this to the Version 0.2 Release milestone Oct 23, 2020
@gwaybio
Member Author

gwaybio commented Oct 23, 2020

@hillsbury - this sounds like a great first issue to me :) Want to take a crack at it?

@jbauman214

Any update on this? If no one else has time to aggregate the files, I can try to do it.

@hillsbury
Contributor

I was going to start looking through this today! How high a priority is this?

@jbauman214

jbauman214 commented Oct 27, 2020 via email

@hillsbury
Contributor

I see. I don't want to make any promises right now, so feel free to take a stab at it in the meantime as well!

@jbauman214

jbauman214 commented Nov 3, 2020 via email

@gwaybio
Member Author

gwaybio commented Nov 4, 2020

@jbauman214 - Glad we were able to get some sense of this!

Unless we need these results immediately to inform a critical upcoming experiment (I don't think we do, but please let me know if I'm wrong), we will perform this analysis in a sustainable way. Concretely, this means integrating the appropriate Python code into the correct recipe file and then updating the CP151 data weld.

@hillsbury and I are going to walk through this (and more!) tomorrow.

@jt-neal

jt-neal commented Nov 5, 2020

@gwaygenomics - re: urgency, this is one of the (several) key troubleshooting experiments/analyses that we'll want to have for the JSC, but not to inform any imminent experiments, if that helps with prioritization.

@gwaybio
Member Author

gwaybio commented Nov 5, 2020

It does, thanks.

Can you say more about exactly what you're after as well? In my mind, all we need is a single per-plate .csv file with three columns: 1) Gene, 2) sgRNA, and 3) cell count. We have figures visualizing counts already, but not a step to generate this summary file.
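A minimal sketch of what such a summary step might look like, using only the standard library. The input column names (`gene`, `sgRNA`, `cell_count`) are assumptions, since the layout of the per-site .tsv files is not shown in this thread, and `summarize_plate` is a hypothetical name rather than an existing recipe function.

```python
import csv
from collections import Counter


def summarize_plate(site_files, output_csv,
                    gene_col="gene", guide_col="sgRNA", count_col="cell_count"):
    """Sum per-site cell counts into one per-plate .csv file.

    The default column names are assumptions about the per-site files.
    """
    totals = Counter()
    for site_file in site_files:
        with open(site_file, newline="") as handle:
            # Per-site files are tab-separated (.tsv).
            for row in csv.DictReader(handle, delimiter="\t"):
                totals[(row[gene_col], row[guide_col])] += int(row[count_col])

    with open(output_csv, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["Gene", "sgRNA", "cell_count"])
        # Sort so the output is stable across runs.
        for (gene, guide), count in sorted(totals.items()):
            writer.writerow([gene, guide, count])
    return totals
```

Keeping the column names as parameters means the same function can be pointed at the real per-site files once their headers are confirmed.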

@jt-neal

jt-neal commented Nov 5, 2020

@jbauman214 should probably chime in here.

@jbauman214

jbauman214 commented Nov 5, 2020 via email

@gwaybio
Member Author

gwaybio commented Dec 1, 2020

@hillsbury - let's add cell quality to the spec outlined in #56 (comment)

So, the single file should include four columns:

| Guide | Gene | Cell Count | Cell Quality |
| --- | --- | --- | --- |
| AACGTCG | GENE X | 53 | Perfect |
| AACGTCG | GENE X | 21 | Great |
| ATCAACG | GENE X | 67 | Perfect |

and so on...

We'll be able to extract what we need from this file.
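To illustrate the kind of extraction this layout would allow, here is a small sketch that collapses the four-column rows back into per-guide totals, optionally restricted to certain quality categories. The rows are the example values from the table above; the function name `count_by_guide` and the dictionary keys are assumptions made for illustration.

```python
from collections import defaultdict

# Example rows copied from the four-column spec above.
rows = [
    {"Guide": "AACGTCG", "Gene": "GENE X", "Cell Count": 53, "Cell Quality": "Perfect"},
    {"Guide": "AACGTCG", "Gene": "GENE X", "Cell Count": 21, "Cell Quality": "Great"},
    {"Guide": "ATCAACG", "Gene": "GENE X", "Cell Count": 67, "Cell Quality": "Perfect"},
]


def count_by_guide(rows, qualities=None):
    """Sum cell counts per (gene, guide), keeping only the given qualities.

    Passing qualities=None keeps every quality category.
    """
    totals = defaultdict(int)
    for row in rows:
        if qualities is None or row["Cell Quality"] in qualities:
            totals[(row["Gene"], row["Guide"])] += int(row["Cell Count"])
    return dict(totals)


all_cells = count_by_guide(rows)
perfect_only = count_by_guide(rows, qualities={"Perfect"})
```

With the example rows, `all_cells` gives 74 cells for AACGTCG and 67 for ATCAACG, while `perfect_only` gives 53 and 67.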
