Feature/chunked writer precomputed burdens testing combo #135

Merged
bfclarke merged 141 commits into main from feature/chunked-writer-precomputed-burdens-testing-combo on Sep 24, 2024

bfclarke
Contributor

What

This is a fairly large PR that makes several changes:

  1. Splits the computation of burdens from the formatting of phenotypes and covariates (this makes a cleaner pipeline, but is especially important when using precomputed burdens to do association testing, for which a pipeline has been added).
  2. Fixes a bug involving multiple concurrent writers to a Zarr array, where burdens were sometimes not written.
  3. Adds a deterministic flag to the config - useful when testing results of pipeline runs.
  4. Reduces the number of phenotypes used for training and association testing in the example config.
  5. Adds an example input config for the REGENIE pipelines.
  6. Adds a pipeline for conditional association testing with REGENIE.
  7. Moves the logic for when to (re)generate configuration files to config.py.
  8. Adds a script for comparing pipeline results to reference results.
  9. Adds some additional tests.
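
As an illustration of item 7, here is a minimal sketch of one common way to decide when a generated config needs rebuilding: compare file modification times. This is an assumption for illustration only, not necessarily how config.py makes the decision; the function name `needs_regeneration` is hypothetical.

```python
from pathlib import Path


def needs_regeneration(input_config: Path, output_config: Path) -> bool:
    """Return True if output_config is missing or older than input_config.

    Hypothetical helper: the actual logic in config.py may differ.
    """
    if not output_config.exists():
        # Nothing has been generated yet, so a first run is required.
        return True
    # Regenerate only when the input was modified after the output was written.
    return input_config.stat().st_mtime > output_config.stat().st_mtime
```

A caller would pass the paths to deeprvat_input_config.yaml and deeprvat_config.yaml and regenerate the latter whenever the function returns True. Comparing content hashes instead of timestamps would be more robust to files being touched without being changed.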

Testing

  • The CV training/association pipeline has been run on example data and the results checked against those from the current main branch
  • A real-data check has been run on UKB WES
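
For readers who want to repeat this kind of check, the following is a rough sketch of comparing a set of results against reference results with a numeric tolerance. It is not the comparison script added in this PR; the function name, the `(phenotype, gene)` key layout, and the tolerances are all assumptions made for illustration.

```python
import math


def compare_results(results, reference, rel_tol=1e-6, abs_tol=1e-12):
    """Return a list of human-readable mismatches between two result mappings.

    Both arguments are assumed to map a (phenotype, gene) tuple to a p-value.
    """
    mismatches = []
    # Walk the union of keys so entries missing on either side are reported.
    for key in sorted(set(results) | set(reference)):
        if key not in results:
            mismatches.append(f"{key}: missing from results")
        elif key not in reference:
            mismatches.append(f"{key}: missing from reference")
        elif not math.isclose(
            results[key], reference[key], rel_tol=rel_tol, abs_tol=abs_tol
        ):
            mismatches.append(f"{key}: {results[key]} != {reference[key]}")
    return mismatches
```

An empty list means the two runs agree within tolerance. With the deterministic flag enabled, much tighter tolerances should be usable, since run-to-run variation is reduced.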

Collaborator

@HolEv HolEv left a comment

Spotted only some minor things we could still fix for consistency.
The main thing is that we have two log folders, burdens log and logs, which we should unify. Otherwise really nice!

@endast endast self-requested a review September 24, 2024 08:55
@bfclarke bfclarke merged commit f0a2e25 into main Sep 24, 2024
1 check passed
@bfclarke bfclarke deleted the feature/chunked-writer-precomputed-burdens-testing-combo branch September 24, 2024 09:01
Development

Successfully merging this pull request may close these issues.

Automatically update deeprvat_config.yaml when deeprvat_input_config.yaml changes
4 participants