parse README #6
Changes from 1 commit
ba19913
4cc91ba
4af224f
4e1e4a4
dde8e71
a221581
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -3,94 +3,6 @@ | |
This repository was derived from a [template repository](https://github.blog/2019-06-06-generate-new-repositories-with-repository-templates/) located at https://github.com/broadinstitute/pooled-cell-painting-profiling-template. | ||
The purpose of the repository is to weld together a versioned data processing pipeline with versioned processed output data for a single Pooled Cell Painting experiment. | ||
|
||
## Setup computational environment | ||
|
||
First, install [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/). | ||
We use conda as an environment manager. | ||
|
||
```bash | ||
# Install computational environment | ||
conda env create --force --file environment.yml | ||
|
||
# Initialize the environment | ||
conda activate pooled-cell-painting | ||
``` | ||
|
||
## Perform the weld | ||
|
||
The welding procedure is a three-step process. | ||
|
||
1. Activate conda environment (see above) | ||
2. Manually update the configuration yaml documents for your specific experiment | ||
* Yaml documents with reasonable default values are available in the [config/](config/) folder. | ||
* Do not change the location of these files. | ||
* Additional documentation for each of the parameters is available in the [config/docs/](config/docs/) folder. | ||
3. Execute `weld.sh` (see below) | ||
|
||
```bash | ||
# After performing steps 1 and 2 above, perform step 3: | ||
./weld.sh | ||
``` | ||
|
||
## **AFTER GENERATING A NEW REPO, CHANGE OR DELETE ALL NONSPECIFIC DETAILS** | ||
|
||
<p align="center"> | ||
<img src="https://raw.githubusercontent.com/broadinstitute/pooled-cp-profiling-template/a57cb7f9e36b89ff56acf094f18ca06b1a53b719/media/pipeline_weld.png" width="500"> | ||
</p> | ||
|
||
## Setup | ||
|
||
To correctly initialize the repository, we need to perform several manual steps. | ||
|
||
### Step 0: Create a New Repository **using this Repository as a Template** | ||
|
||
By spinning up a new repo using this repo as a template, you will retain all code, configuration files, computational environments, and directory structure that a standard Pooled Cell Painting workflow expects and produces. | ||
|
||
### Step 1: Fork the Pooled Cell Painting Recipe | ||
|
||
We first want to [fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the official pooled cell profiling recipe located at https://github.com/broadinstitute/pooled-cp-profiling-recipe. | ||
|
||
* **Result:** The fork creates a copy of a recipe repository. | ||
* **Goals:** 1) Remove the connection to official recipe updates to avoid unintended weld versioning reversal; 2) Enable independent updates to fork code that does not impact official recipe. | ||
* **Execution:** See [forking instructions](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) and the image below. | ||
|
||
![Step 1: Fork](media/step1_forkrecipe.png) | ||
|
||
### Step 2: Create a Submodule inside this Repository of the Forked Recipe | ||
|
||
Next, we will create a [submodule](https://gist.github.com/gitaarik/8735255) in this repo. | ||
|
||
* **Result:** Adding a submodule initiates the weld. | ||
* **Goals:** 1) Link the processing code (recipe) with the data (current repo); 2) Require a manual step to update the recipe to enable asynchronous development. | ||
* **Execution:** See below | ||
|
||
```bash | ||
# In your terminal, clone the repository you just created (THIS REPO) | ||
USER="INSERT-USERNAME-HERE" | ||
REPO="INSERT-NAME-HERE" | ||
git clone git@github.com:$USER/$REPO.git | ||
|
||
# Navigate to this directory | ||
cd $REPO | ||
|
||
# Add the Recipe Submodule | ||
git submodule add https://github.com/$USER/pooled-cp-profiling-recipe.git pooled-cp-profiling-recipe | ||
``` | ||
|
||
Refer to ["Adding a submodule"](https://gist.github.com/gitaarik/8735255#adding-a-submodule) for more details. | ||
|
||
### Step 3: Commit the Submodule | ||
|
||
Lastly, we will [commit](https://help.github.com/en/desktop/contributing-to-projects/committing-and-reviewing-changes-to-your-project#about-commits) the submodule to github. | ||
|
||
* **Result:** Committing this change finalizes the weld. | ||
* **Goals:** 1) Track the submodule (recipe) version with the current repository. | ||
* **Execution:** See below | ||
|
||
```bash | ||
# Add, commit, and push the submodule contents | ||
git add pooled-cp-profiling-recipe | ||
git add .gitmodules | ||
git commit -m 'finalizing the recipe weld' | ||
git push | ||
``` |
Original file line number | Diff line number | Diff line change | ||||
---|---|---|---|---|---|---|
@@ -0,0 +1,31 @@ | ||||||
The following are the two setup steps that need to be performed once at the start of a project. | ||||||
Suggested change
Review comment: I think of each batch as possibly being a separate experiment, and these steps don't need to happen with each batch. So if you don't like "project" and I don't like "experiment", can we come up with another word?
Review comment: And/or we should put into the documentation how we define project/experiment/batch?
Review comment: Want to say "start of each batch of data collection" or "start of each project batch"? Let's keep only once though. I agree the data pipeline welding "unit" will be experimental batch. An experiment may contain multiple batches, and a project may contain multiple experiments (in my view). Although we may eventually want to make the recipe focused at the "experiment" level (as defined above) since we are likely to want to develop batch effect correction methods. These tools should be part of the recipe IMO - this is a bit down the road though
Review comment: I just added a commit that defines our terms and so it should be consistent now. I agree we want to eventually bring it to the experiment level (related issue).
||||||
|
||||||
For a general overview of the pipeline welding process, see the [repo README](README.md). | ||||||
For the welding process steps to perform with each dataset, see the [weld process README](weld_process_README.md). | ||||||
|
||||||
## Set Up the Computational Environment | ||||||
|
||||||
Install [conda](https://docs.conda.io/projects/conda/en/latest/user-guide/install/). | ||||||
We use conda as an environment manager. | ||||||
|
||||||
```bash | ||||||
# Install computational environment | ||||||
conda env create --force --file environment.yml | ||||||
|
||||||
# Initialize the environment | ||||||
conda activate pooled-cell-painting | ||||||
``` | ||||||
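Before continuing, it can help to confirm that the environment is actually active. The sketch below assumes conda's usual behavior of exporting `CONDA_DEFAULT_ENV` when an environment is activated; the function name `check_conda_env` is hypothetical and not part of this repository.

```bash
# Hypothetical helper: succeed only if the named conda environment is active.
# Relies on CONDA_DEFAULT_ENV, which `conda activate` sets to the environment name.
check_conda_env() {
  local expected="$1"
  if [ "${CONDA_DEFAULT_ENV:-}" = "$expected" ]; then
    echo "conda environment '$expected' is active"
  else
    echo "expected '$expected' but found '${CONDA_DEFAULT_ENV:-none}'" >&2
    return 1
  fi
}

# Example: check_conda_env pooled-cell-painting || conda activate pooled-cell-painting
```

A non-zero return status makes the check easy to chain in front of later commands.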
|
||||||
## Fork the Pooled Cell Painting Recipe | ||||||
|
||||||
We first want to [fork](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) the official pooled cell profiling recipe located at https://github.com/broadinstitute/pooled-cp-profiling-recipe. | ||||||
|
||||||
* **Result:** | ||||||
The fork creates a copy of a recipe repository. | ||||||
* **Goals:** | ||||||
1) Remove the connection to official recipe updates to avoid unintended weld versioning reversal. | ||||||
2) Enable independent updates to the fork's code that do not affect the official recipe. | ||||||
* **Execution:** | ||||||
See [forking instructions](https://help.github.com/en/github/getting-started-with-github/fork-a-repo) and the image below. | ||||||
ErinWeisbart marked this conversation as resolved.
|
||||||
|
||||||
![Step 1: Fork](media/step1_forkrecipe.png) |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,104 @@ | ||
The following are the weld process steps to perform with each dataset you analyze. | ||
|
||
|
||
For a general overview of the pipeline welding process, see the [repo README](README.md). | ||
Review comment: Love these pointers
||
For the setup steps that need to be performed once at the start of a project, see the [setup README](setup_README.md). | ||
|
||
### Step 0: Update Your Forked Recipe (Optional) | ||
|
||
* **Result:** | ||
Updates (or reverts) your recipe to include any desired changes. | ||
* **Goal:** | ||
1) Allow you to make changes to your recipe from dataset to dataset (or batch to batch). | ||
* **Execution:** | ||
If you would like your recipe to include any updates to the official recipe: | ||
|
||
``` | ||
git fetch upstream | ||
git checkout master | ||
git merge upstream/master | ||
git push | ||
``` | ||
If you would like your recipe to include any updates that you have made: | ||
|
||
``` | ||
git checkout UPDATED-BRANCH | ||
``` | ||
or | ||
|
||
``` | ||
git checkout <commit_hash> | ||
``` | ||
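The `git fetch upstream` commands in this step assume that your clone of the forked recipe has a remote named `upstream` pointing at the official recipe. A fresh clone of a fork normally has only `origin`, so one possible way to set this up is sketched below. The helper name `ensure_upstream_remote` is hypothetical.

```bash
# Hypothetical helper: add an "upstream" remote unless one already exists.
# Run from inside your clone of the forked recipe.
ensure_upstream_remote() {
  local url="$1"
  if git remote get-url upstream >/dev/null 2>&1; then
    echo "upstream already set to $(git remote get-url upstream)"
  else
    git remote add upstream "$url"
    echo "upstream added: $url"
  fi
}

# Example, using the official recipe URL given in the setup instructions:
# ensure_upstream_remote https://github.com/broadinstitute/pooled-cp-profiling-recipe.git
```

Adding the remote does not contact the server; the first network access happens at `git fetch upstream`.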
|
||
### Step 1: Create a New Repository **Using This Repository as a Template** | ||
|
||
|
||
* **Result:** | ||
A repository for each dataset/batch. | ||
* **Goal:** | ||
1) Retain all code, configuration files, computational environments, and directory structure that a standard Pooled Cell Painting workflow expects and produces. | ||
* **Execution:** | ||
Click "Use this template". | ||
![Use_this_template](media/use_this_template.png) | ||
Enter a name for your new repository that includes your batch name and click "Create repository from template". | ||
![New_Repo](media/new_repo_from_template.png) | ||
|
||
|
||
### Step 2: Create a Submodule of the Forked Recipe Inside the New Repository | ||
|
||
|
||
Next, we create a [submodule](https://gist.github.com/gitaarik/8735255) in the repository we just created. | ||
|
||
* **Result:** | ||
Adding a submodule initiates the weld. | ||
* **Goals:** | ||
1) Link the processing code (recipe) with the data (current repo). | ||
2) Require a manual step to update the recipe to enable asynchronous development. | ||
* **Execution:** See below | ||
|
||
|
||
```bash | ||
# In your terminal, clone the repository you just created (THIS REPO) | ||
USER="INSERT-USERNAME-HERE" | ||
REPO="INSERT-NAME-HERE" | ||
git clone git@github.com:$USER/$REPO.git | ||
|
||
# Navigate to this directory | ||
cd $REPO | ||
|
||
# Add the Recipe Submodule | ||
git submodule add https://github.com/$USER/pooled-cp-profiling-recipe.git pooled-cp-profiling-recipe | ||
``` | ||
|
||
Refer to ["Adding a submodule"](https://gist.github.com/gitaarik/8735255#adding-a-submodule) for more details. | ||
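After `git submodule add`, git records the submodule in a `.gitmodules` file at the repository root. As a quick sanity check, the sketch below reads the recorded URL back. The function name `recipe_submodule_url` is hypothetical, and the submodule name is assumed to match the path used in the command above.

```bash
# Hypothetical helper: print the URL recorded for the recipe submodule.
# Assumes the submodule was added at the path "pooled-cp-profiling-recipe".
recipe_submodule_url() {
  local name="${1:-pooled-cp-profiling-recipe}"
  if [ ! -f .gitmodules ]; then
    echo "no .gitmodules file found; has the submodule been added?" >&2
    return 1
  fi
  git config --file .gitmodules --get "submodule.${name}.url"
}

# Example (run from the root of the new repository): recipe_submodule_url
```

The printed URL should point at your fork rather than at the official recipe.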
|
||
### Step 3: Commit the Submodule | ||
|
||
|
||
Lastly, we [commit](https://help.github.com/en/desktop/contributing-to-projects/committing-and-reviewing-changes-to-your-project#about-commits) the submodule and push it to GitHub. | ||
|
||
* **Result:** | ||
Committing this change finalizes the weld. | ||
* **Goal:** | ||
1) Track the submodule (recipe) version with the current repository. | ||
* **Execution:** | ||
See below | ||
|
||
|
||
```bash | ||
# Add, commit, and push the submodule contents | ||
git add pooled-cp-profiling-recipe | ||
git add .gitmodules | ||
git commit -m 'finalizing the recipe weld' | ||
# Review comment: whatever word we decide above should be substituted in this commit message. It was probably me that used this terminology before 😅
||
git push | ||
``` | ||
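Because the weld depends on which recipe commit the repository records, it may be worth checking that value after committing. The following is a minimal sketch, assuming the submodule lives at `pooled-cp-profiling-recipe` (the path used in Step 2). The function name `recorded_recipe_commit` is hypothetical.

```bash
# Hypothetical helper: print the recipe commit pinned by the latest commit of
# the current repository. `git ls-tree` prints "<mode> <type> <hash><TAB><path>",
# and mode 160000 marks a submodule entry.
recorded_recipe_commit() {
  local path="${1:-pooled-cp-profiling-recipe}"
  git ls-tree HEAD -- "$path" |
    awk '$1 == "160000" { print $3; found = 1 } END { exit !found }'
}

# Example (run from the root of the new repository): recorded_recipe_commit
```

The function returns a non-zero status when the path is absent from the latest commit or is not a submodule, which usually means Step 3 has not been completed.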
|
||
### Step 4: Perform the Weld | ||
* **Result:** | ||
Data is processed and figures and data are output. | ||
* **Goal:** | ||
1) Track the submodule (recipe) version with the current repository. | ||
* **Execution:** | ||
1) Activate conda environment. | ||
``` | ||
conda activate pooled-cell-painting | ||
``` | ||
2) Manually update the configuration YAML documents for your specific experiment. | ||
YAML documents with reasonable default values are available in the [config/](config/) folder. | ||
Do NOT change the location of the .yaml files. | ||
Additional documentation for each of the parameters is available in the [config/docs/](config/docs/) folder. | ||
3) Execute `weld.sh` (see below) | ||
|
||
|
||
```bash | ||
./weld.sh | ||
``` |
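The instructions above ask you not to move the YAML files out of `config/`, presumably because the pipeline expects to find them there. A small pre-flight check can catch a missing or relocated folder before a long run starts. The sketch below is an assumption about how such a check might look: the function name `check_config_files` is hypothetical, and it only confirms that at least one `.yaml` file is present, not that the values are valid.

```bash
# Hypothetical pre-flight check: list the YAML files in a config folder and
# fail if the folder is missing or contains none.
check_config_files() {
  local dir="${1:-config}"
  local count=0
  local f
  if [ ! -d "$dir" ]; then
    echo "config folder '$dir' not found" >&2
    return 1
  fi
  for f in "$dir"/*.yaml; do
    [ -e "$f" ] || continue   # skip the unexpanded pattern when nothing matches
    echo "found: $f"
    count=$((count + 1))
  done
  if [ "$count" -eq 0 ]; then
    echo "no .yaml files in '$dir'" >&2
    return 1
  fi
}

# Example: check_config_files config && ./weld.sh
```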
Review comment: We should add a figure legend. Do you want to give it a first crack?