[Sync] Reorganize output directories + deduplicate snakefiles + bugfixes #86
Merged
* initial reorg and rewrite of snakefile (WIP)
* don't set shell=True (args are evidently not passed to blastp)
* use output references in expand expression
* use all caps for all global variables
* do not use rule references in rule because it does not work
* simplify some logic in the snakefile
* update snakefile_ff to reflect changes made to the main snakefile
* add test for the pipeline in cluster mode
* fix typos/bugs in snakefile_ff
* reorganize test artifacts and fix search-mode test
* make the features file required in cluster mode
* add input PDB files and features file for cluster-mode test
* more renaming and remove params that are just global variables (which can be used directly in shell commands)
* use indexing on all unnamed outputs for clarity
* update readme and rename cluster_similarity.py to plot_cluster_similarity.py
* use named outputs for all internal rules (even if they have only one output)
* move rule all to the end of the snakefile to allow defining its inputs symbolically
* add a comment explaining why rule all is at the end of the snakefile
* improve wildcard constraint and use lowercase filename
* fix mistake in plot_interactive and eliminate need for input function for aggregate_features

---------

Signed-off-by: Keith Cheveralls <[email protected]>
* add conditional step to run tests without mocks to existing test workflow
* move conftest.py to repo root and add a --no-mocks pytest CLI option, rename env variable for clarity
* run the tests when the PR is labeled
* update comment about env variable
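For readers unfamiliar with how a custom pytest flag such as `--no-mocks` is usually wired up, here is a minimal sketch of a root-level `conftest.py`. Only the `pytest_addoption` hook and the `addoption`/`getoption` calls are pytest's own API. The help text and the `mocks_enabled` helper are assumptions made for illustration and are not taken from this repo.

```python
# conftest.py (sketch). The help text and mocks_enabled() are hypothetical.


def pytest_addoption(parser):
    """Register a --no-mocks flag on the pytest command line."""
    parser.addoption(
        "--no-mocks",
        action="store_true",
        default=False,
        help="Run the tests without mocks.",
    )


def mocks_enabled(config):
    """Return True unless --no-mocks was passed on the command line."""
    return not config.getoption("--no-mocks")
```

With a file like this at the repo root, `pytest --no-mocks` would disable mocking in any fixture that consults `mocks_enabled(request.config)`.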
* first draft of actions to check for and sync updates from the public repo
* rename workflow, simplify, add more comments
* make the open-sync-pr workflow consistent with the prior SOP
* fix indentation and use range syntax in git log commands
* bump actions/checkout to v4
* check repo owner and name in the sync action
* better error message when verify-no-new-commits fails
* create sync branch in a way that won't fail if there are merge conflicts
* start merging snakefile_ff into the main snakefile (WIP) and drop params that are wildcards
* first attempt at merging cluster mode into the main snakefile (WIP)
* fix mistakes in the snakefile and add a demo for cluster mode
* update tests after merging search and cluster modes
* define key_protids in the test config in cluster mode
* reorganize validation logic in snakefile (WIP) and merge configs by adding mode-specific config sections
* move BLAST_OUTFMT to constants.py and use an enum for mode
* move config-related logic into its own module, fix bugs
* don't use config subsections for mode-specific settings, use long form of all CLI args in snakefile, fix too-long lines
* rename configuration.py to config_utils.py
* avoid copying pdb files from input to output in cluster mode
* delete cluster-mode-specific files and update readme
* some variable renaming and changes to the logic in config_utils.py for clarity
* update the rulegraphs and add makefile rules to generate them
* add back missing shell=True
* adjust formatting in the snakefile
* remove unneeded config params from the test configs
* capitalize comments in config.yml
* improve docstring and minor edits for clarity in snakefile
* rename override_file to features_override_file for clarity
* update path to demo config in tests action and the readme
* fix too-long lines in makefile
* clarify that features_file is a TSV file, improve some comments
* enums should be singular and used consistently
* drop redundant len
* make file globbing case insensitive

---------

Signed-off-by: Keith Cheveralls <[email protected]>
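Two of the changes above, using an enum for the pipeline mode and making file globbing case insensitive, can be illustrated with a short standard-library sketch. The names `Mode` and `find_files`, and the enum values, are assumptions chosen for this example. The actual definitions in `constants.py` and `config_utils.py` are not shown on this page and may differ.

```python
from enum import Enum
from pathlib import Path


class Mode(Enum):
    # Hypothetical member names and values; the repo's enum may differ.
    SEARCH = "search"
    CLUSTER = "cluster"


def find_files(directory, suffix=".pdb"):
    """Return files in `directory` whose extension matches `suffix`, ignoring case."""
    wanted = suffix.lower()
    return sorted(
        path
        for path in Path(directory).iterdir()
        if path.is_file() and path.suffix.lower() == wanted
    )
```

Comparing lowercased suffixes avoids relying on platform-specific glob behavior, so `protein.pdb` and `PROTEIN.PDB` would both be picked up. A singular enum name also lets a config string be validated with `Mode("cluster")`, which raises `ValueError` on an unknown mode.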
Sync with the public repo
Sync with the public repo
* initial incomplete draft
* move material from notion to contributing docs
* address review comments
* add testing section
* use markdown for list numbering

Co-authored-by: Dennis August Sun <[email protected]>
Signed-off-by: Keith Cheveralls <[email protected]>

* fix repo url and rewrite update-mocks section to use a list w example commands
* ask external contributors to use forks

---------

Signed-off-by: Keith Cheveralls <[email protected]>
Co-authored-by: Dennis August Sun <[email protected]>
… PDBs (#136)

* using foldseek to re-calculate TMscores of each input PDB against all PDBs
* addressing some of Dennis' comments
* addressing most of Keith's comments
* addressing comment on Snakefile rule dependencies
* solving error caused by no key_protid in config file in cluster-mode
* drop duplicated methods by importing them from foldseek_clustering and write an empty tsv when no PDBs are in the query directory
* fix variable naming and delete some redundant comments
* update readme
* fix inputs to the aggregate_features rule and more readme updates
* add a comment explaining why the key_protid PDBs are not copied at the snakemake level
* update the DAG visualizations
* fix formatting in readme

---------

Co-authored-by: Keith Cheveralls <[email protected]>
* fix paths in snakefile so make_pdb is called when there is no input pdb file
* don't overwrite the PDB if it exists in esmfold_apiquery
* revert the last commit and use to prevent make_pdb from overwriting the input PDBs
* fix formatting
…dated capitalization in plots (#143)
mezarque
approved these changes
May 10, 2024
lgtm! Thanks for shipping this!
braebigge
approved these changes
May 10, 2024
Looks good to me too, thanks again Keith!
This PR introduces the following PRs from the private repo: