PIV-based analysis for chromatin from A- or B-compartments.
Date started: 2023-02-02
TODO: Update workflow diagram to be data-centric instead of rule-centric
```mermaid
graph TD
    subgraph initialize
        A99((all_init))
    end
    subgraph interactive rules
        A98((all_<br>interactive))
    end
    subgraph segmentation
        A2
        B1
        B2
    end
    subgraph piv analysis
        A6
        A8
        A9
    end
    subgraph image processing
        A1
        A4
        A5
    end

    A1((all_roi))
    C1((all_<br>register))
    A2((all_<br>segmentation_<br>nucleus))
    A4((all_<br>measure))
    A5((all_<br>normalize))
    A6((all_piv))
    A8((all_msnd))
    A9((all_msnd_post))
    B1((all_<br>segmentation_<br>nucleoli))
    B2((all_<br>segmentation_<br>hc))

    A01[config_template]
    A99 --> A01
    A02[download_<br>ilastik_<br>models]
    A98 --> A02 & A12
    A11[parse_metadata] --> A01
    A12[draw_roi] --> A11 & A01
    A13{crop_roi} --> A12 & A01
    A14[split_channels] --> C11 & A01
    A1 --> A12 & C11
    C11[register_nucleus] --> A13 & A01
    C1 --> C11
    A31[segment_nuclei_<br>in_time] --> A14 & A01
    A2 --> A31
    A41[measure] --> C11 & A31 & A01
    A42[combine_<br>measurements] --> A41 & A01
    A4 --> A42
    A51[normalize_<br>singlechannel] --> A14 & A31 & A01
    A52[normalize_<br>multichannel] --> A51 & A01
    A5 --> A51 & A52
    A61[gen_piv_<br>config_json] --> A01
    A62[piv] --> A61 & A14 & A01
    A6 --> A62
    A81[msnd] --> A62 & A52 & A31 & B15 & B21 & A01
    A8 --> A81
    A91[fit_msnd_line] --> A81
    A92[instantaneous_alphas] --> A81
    A9 --> A91 & A92
    B11{sn_crop_roi} --> A12
    B12[sn_mask_tiff] --> B11 & A31
    B13[sn_predict_<br>nucleoli] --> B12 & A02
    B14[sn_convert_<br>to_ometif] --> B13
    B15[sn_segment_<br>nucleoplasm] --> B14 & A31
    B1 --> B14 & B15
    B21[segment_hc] --> A52 & A01
    B2 --> B21
```
This repo uses git submodules to manage some dependencies. To clone this repo, use the following command:

```
git clone --recurse-submodules https://github.com/yichechang/alu-mobility.git
```
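If the repo was already cloned without `--recurse-submodules`, the submodules can still be fetched afterwards with a standard git command:

```
git submodule update --init --recursive
```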
Dependencies are listed in the `workflow/envs/abcdcs.yaml` file and can be installed as a conda environment using either conda or mamba.

Currently, with mamba (which is faster at solving), we need to first create an empty environment before we can install the dependencies specified in a YAML file. See the issue and solution.

```
mamba create -n abcdcs
mamba env update -n abcdcs -f workflow/envs/abcdcs.yaml
```
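With conda instead, a single step should work (slower at solving, but without the mamba workaround described above):

```
conda env create -n abcdcs -f workflow/envs/abcdcs.yaml
```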
Install ilastik manually before the first run:

```
mamba create -n ilastik
mamba activate ilastik
mamba install -c ilastik-forge ilastik
```

TODO: add a conda environment file so that snakemake can create this environment on the first run.
You'll need to have `matlab` on your PATH. This can be done either by manually creating a symbolic link to the matlab executable, or by using the environment modules on a cluster (e.g., `module load matlab/R2019b`).

Note: Do not rely on an alias. It is fragile and likely won't work when Snakemake executes a shell directive.
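For the symbolic-link route, a minimal sketch (the MATLAB install path varies per machine; the paths below are only examples):

```
# Link the MATLAB launcher into a directory that is already on PATH.
# Adjust the source path to match the local MATLAB installation.
sudo ln -s /usr/local/MATLAB/R2019b/bin/matlab /usr/local/bin/matlab
```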
1. Activate the `abcdcs` conda environment by issuing `conda activate abcdcs`.
2. `cd` to the analysis folder.
3. Issue the following command to initialize the workflow:

    ```
    snakemake \
        -s {path/to/this/repo}/workflow/Snakefile \
        -c1 \
        init
    ```

    This will copy the `config/config.yaml` to the analysis folder.
4. Edit the config file according to the experiment.
5. Run snakemake locally, optionally specifying a target rule. (See `Snakefile` for possible all-type rules.)

    ```
    snakemake \
        -s {path/to/this/repo}/workflow/Snakefile \
        --configfile config.yaml \
        --use-conda \
        -c{n}
    ```

Note: If testing the repo with the test data, treat the repo folder as the analysis folder mentioned in step 2. There is also no need to specify the main snakemake file and the config file via `-s` and `--configfile`, respectively.
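Before a full run, a standard snakemake dry run (`-n`) with the same arguments can be used to preview the jobs that would be executed:

```
snakemake \
    -s {path/to/this/repo}/workflow/Snakefile \
    --configfile config.yaml \
    --use-conda \
    -n
```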
1. `cd` to the analysis folder.
2. Activate the `abcd` conda environment by issuing `conda activate abcd`.
3. Issue the following command to initialize the workflow:

    ```
    snakemake \
        -s {path/to/this/repo}/workflow/Snakefile \
        -c1 \
        init
    ```

    This will copy the `config/config.yaml` to the analysis folder.
4. Edit the config file according to the experiment.
5. Launch a `della-vis1` desktop via VNC on mydella.
6. Launch a terminal and issue `module load anaconda3/2022.10 && conda activate abcd`.
7. `cd` to the analysis folder, and run

    ```
    snakemake \
        -s {path/to/this/repo}/workflow/Snakefile \
        --configfile config.yaml \
        --use-conda \
        --use-envmodules \
        -c1 \
        all_interactive
    ```
1. `cd` to the analysis folder.
2. Issue `module load anaconda3/2022.10 && conda activate abcd`.
3. Request an interactive allocation (a filled-in example follows this list):

    ```
    salloc --nodes=1 --ntasks=<n> --mem-per-cpu=<m>G --time=<t>
    ```

    where `<n>` is the number of cores to use, `<m>` is the amount of memory per core (in GB), and `<t>` is the time limit.
4. Issue the following command:

    ```
    snakemake \
        -s {path/to/this/repo}/workflow/Snakefile \
        --configfile config.yaml \
        --use-conda \
        --use-envmodules \
        -c{n} \
        all && \
    scancel $SLURM_JOB_ID
    ```

    where `{n}` is the number of cores requested in the `salloc` command.
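For example, an allocation with 8 cores, 4 GB per core, and a 2-hour limit (values here are placeholders, not recommendations) would be requested as:

```
salloc --nodes=1 --ntasks=8 --mem-per-cpu=4G --time=02:00:00
```

The matching snakemake invocation would then use `-c8`.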
This repository currently contains both
- a snakemake workflow with its config files, scripts, etc.; and
- a python package `abcdcs` that is required for the workflow, but also includes modules that can be used on their own for upstream preprocessing as well as downstream analyses.

In the future, it might make sense to track them separately, but currently their development is closely related. Thus, we now use a single tagging system for version tracking.

The format is `yyyy.MM.dd.[a-z]`, where `[a-z]` is used to differentiate versions tagged on the same date.
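Tagging a release under this scheme is plain git tagging (the tag name below is hypothetical):

```
git tag 2023.04.06.a
git push origin 2023.04.06.a
```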
- `2023.04.05.a`: Consider this `v0.0.9`!
  - include ilastik for nucleoli segmentation
  - workflow is more modular with a clearer main `Snakefile`
  - directly determine raw input files to be used for rules
  - no pepfile is used anymore; just a single config file
- `2023.03.31.a`: remove unused function in msnd
- `2023.03.30.a`: normalize intensity for y459 and y491 on della
- `2023.03.28.a`: improve raw data compatibility with tiff files without metadata
- `2023.03.26.c`: matpiv_v2 on della (used for y459)
- `2023.03.26.b`: snakemake on local and della up to PIV
  - Workflow runs on both local (everything up to piv) and della (from cropping to piv).
  - No job grouping should be used.
  - On della, to avoid submitting many small jobs (currently some of the corresponding rules have their time set to 61 minutes, even though they take only a few minutes, to avoid piling up in the short-job queue), running via `salloc` without a cluster profile is useful.