
Pipelines

avialee edited this page Mar 2, 2021 · 18 revisions

Note

Demultiplexing

1. Prepare the SampleSheet.csv file

  • Bulk (BK):

    • One Primer or Two Primer Only:
      • /projects/ps-epigen/nextSeq/210203_NB501692_0083_AH2WYVBGXH/Data/Fastqs/SampleSheet.csv
    • Mixed two primers with one primer:
      1. /projects/ps-epigen/nextSeq/201002_NB501692_0044_AHTJF7BGXG/Data/Fastqs/OnePrimer/SampleSheet.csv
      2. /projects/ps-epigen/nextSeq/201002_NB501692_0044_AHTJF7BGXG/Data/Fastqs/TwoPrimers/SampleSheet.csv
  • 10xRNA (TR), 10xATAC (TA), 10x_mix_RNA_ATAC (TM):

    /projects/ps-epigen/nextSeq/200928_NB501692_0043_AHFGWCAFX2/Data/Fastqs/SampleSheet_I1.csv

  • snATAC_v2 (S2):

    1. /projects/ps-epigen/nextSeq/201116_NB501692_0062_AH32MNBGXH/Data/Fastqs/SampleSheet_I1.csv (-p7_plates)
    2. /projects/ps-epigen/nextSeq/201116_NB501692_0062_AH32MNBGXH/Data/Fastqs/SampleSheet_I2.csv (-i5_plates)

2. Prepare the extra parameter file for 10xRNA (TR), 10xATAC (TA), 10x_mix_RNA_ATAC (TM):

  • /projects/ps-epigen/nextSeq/200918_NB501692_0039_AHFHYHAFX2/Data/Fastqs/extraPars.txt
  • Or leave it blank

3. Run the demultiplexing script (this also updates the corresponding fields in the nextseq_app_runinfo table)

runBcl2fastq.sh runinfo.Flowcell_ID /projects/ps-epigen/nextSeq/xxxFlowcell_ID request.user.email
  1. convertTASampleSheet.py converts SampleSheet.csv to SampleSheet_expand.csv. The content differs only when the sheet contains an index from chromium-shared-sample-indexes-plate, e.g. /projects/ps-epigen/nextSeq/190404_NB501692_0122_AHK3FHAFXY/Data/Fastqs/SampleSheet_expand.csv
  2. runBcl2fastq.pbs (/projects/ps-epigen/software/bin/runBcl2fastq.pbs): including updateRunStatus, bcl2fastq command, copy fastq to final folder (/home/zhc268/data/seqdata), updateReadsNumberPerRun and updateRunReads
runDemux10xRNA.sh runinfo.Flowcell_ID /projects/ps-epigen/nextSeq/xxxFlowcell_ID request.user.email
  1. runDemux10xRNA.pbs (/projects/ps-epigen/software/bin/runDemux10xRNA.pbs): including updateRunStatus, mkfastq command (read extraPars.txt), symlink fastq (~/data/seqdata/), updateReadsNumberPerRun and updateRunReads
runDemux10xATAC.sh runinfo.Flowcell_ID /projects/ps-epigen/nextSeq/xxxFlowcell_ID request.user.email
  1. runDemux10xATAC.pbs (/projects/ps-epigen/software/bin/runDemux10xATAC.pbs): including updateRunStatus, mkfastq command (read extraPars.txt), symlink fastq (~/data/seqdata/), updateReadsNumberPerRun and updateRunReads
runDemux10xATAC_RNA.sh runinfo.Flowcell_ID /projects/ps-epigen/nextSeq/xxxFlowcell_ID request.user.email
  1. convertTASampleSheet.py and convertTASampleSheet_TT.py convert SampleSheet.csv to SampleSheet_rna_expand.csv and SampleSheet_rna_expand_TT; the index sequences come from chromium-shared-sample-indexes-plate and Dual_Index_Kit_TT_Set_A
  2. runDemux10xATAC_RNA.pbs (/projects/ps-epigen/software/bin/runDemux10xATAC_RNA.pbs): including updateRunStatus, split RNA and ATAC libraries ("SI-GA" for single-index RNA, "SI-TT" for dual-index RNA, "SI-NA" for ATAC), mkfastq command (read extraPars.txt), bcl2fastq command (for RNA, no extraPars), symlink fastq to final folder (~/data/seqdata/), updateReadsNumberPerRun and updateRunReads
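
The library split by index prefix described in step 2 can be sketched roughly as follows. This is only an illustration: the function name, the group labels, and the input dictionary (sample name to index name) are hypothetical, and the real script reads this information from the sample sheet, whose layout is not shown on this page.

```python
# Prefix-to-group mapping taken from the description above; the group labels
# themselves are made up for this sketch.
INDEX_PREFIXES = {
    "SI-GA": "rna_single_index",
    "SI-TT": "rna_dual_index",
    "SI-NA": "atac",
}


def split_by_index_prefix(sample_indexes):
    """Group sample names by the prefix of their index name.

    sample_indexes: dict mapping sample name -> index name (e.g. "SI-GA-A1").
    Returns a dict mapping group label -> list of sample names.
    """
    groups = {label: [] for label in INDEX_PREFIXES.values()}
    for sample, index in sample_indexes.items():
        for prefix, label in INDEX_PREFIXES.items():
            if index.startswith(prefix):
                groups[label].append(sample)
                break
        else:
            # No known prefix matched: fail loudly rather than guess.
            raise ValueError(f"unrecognized index {index!r} for sample {sample!r}")
    return groups
```

Raising on an unknown prefix is a design choice of the sketch; the actual .pbs script may handle such rows differently.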
runDemuxSnATAC.sh runinfo.Flowcell_ID /projects/ps-epigen/nextSeq/xxxFlowcell_ID request.user.email
  1. runDemuxSnATAC.pbs (/projects/ps-epigen/software/bin/runDemuxSnATAC.pbs): including updateRunStatus, bcl2fastq (/projects/ps-epigen/software/bcl2fastq/bin/bcl2fastq) to generate fastq without samplesheet, ATACdemultiplex (/projects/ps-epigen/software/bin/ATACdemultiplex) command, symlink fastq to final folder (/projects/ps-epigen/seqdata/), updateReadsNumberPerRun and updateRunReads
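
All five demultiplexing scripts above take the same three arguments (flowcell ID, run directory, user email), so choosing one can be sketched as a simple lookup. The pairing of experiment codes to scripts is inferred from the names on this page, and the function name is hypothetical; treat both as assumptions rather than the app's actual logic.

```python
# Experiment code -> demultiplexing script. Inferred from the codes in step 1
# and the script names in step 3; not copied from the app's source.
DEMUX_SCRIPTS = {
    "BK": "runBcl2fastq.sh",
    "TR": "runDemux10xRNA.sh",
    "TA": "runDemux10xATAC.sh",
    "TM": "runDemux10xATAC_RNA.sh",
    "S2": "runDemuxSnATAC.sh",
}


def build_demux_command(experiment_code, flowcell_id, run_dir, email):
    """Return the argument list for the demultiplexing script of one run."""
    try:
        script = DEMUX_SCRIPTS[experiment_code]
    except KeyError:
        raise ValueError(f"unknown experiment code: {experiment_code!r}")
    return [script, flowcell_id, run_dir, email]
```

The returned list can be passed to `subprocess.run` once the scripts are on the PATH; keeping the arguments as a list avoids shell quoting issues with the email or path.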

Run setQC

1. Prepare the setStatusFile (.Set_xxx.txt)

  • ATAC-seq: e.g. /projects/ps-epigen/outputs/setQCs/.Set_875.txt
  1. 'Processed Or Not' indicates whether the library has been run by the pipeline before, as determined by the .finished.txt file in libdir, e.g. /projects/ps-epigen/outputs/libQCs/ZC_87/.finished.txt
  2. '10xProcessed' is 'Yes' only when the experiment is '10xATAC' and the library has been 10xProcessed (determined by web_summary.html in TENX_DIR, e.g. /projects/ps-epigen/outputs/10xATAC/MM_492/outs/web_summary.html)
  3. If the set contains unprocessed 10xATAC libs, there should be a samplesheet file listing their names and genome info, e.g. /projects/ps-epigen/outputs/setQCs/.Set_608_samplesheet.tsv
  • ChIP-seq: e.g. /projects/ps-epigen/outputs/setQCs/.Set_645.txt (contains 'true' in column 'is input') and /projects/ps-epigen/outputs/setQCs/.Set_733.txt (all values are 'false' in column 'is input')
  1. Column 'Processed Or Not' indicates whether the library has been run by the pipeline before (either the ATAC or the ChIP pipeline), as determined by the .finished.txt file in libdir.
  2. Whether any 'true' appears in column 'is input' determines which pipeline to run (see the run setQC script below)
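
The two status columns above come down to file-existence checks. A minimal sketch, assuming the directory layout shown in the example paths (libdir/.finished.txt and TENX_DIR/<library>/outs/web_summary.html); the function names are hypothetical and the real app may check more than this.

```python
from pathlib import Path


def is_processed(lib_dir):
    """'Processed Or Not': True if the library folder has a .finished.txt tag."""
    return (Path(lib_dir) / ".finished.txt").is_file()


def is_10x_processed(experiment, tenx_dir, library):
    """'10xProcessed': True only for 10xATAC libraries with a web_summary.html."""
    if experiment != "10xATAC":
        return False
    return (Path(tenx_dir) / library / "outs" / "web_summary.html").is_file()
```

For example, `is_processed("/projects/ps-epigen/outputs/libQCs/ZC_87")` would report whether ZC_87 has already been run.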

2. Run setQC script

runSetQC.sh setinfo.set_id request.user.email re.sub(r"[\)\(]", ".", setinfo.set_name)
  1. Run 10xPipeline (/projects/ps-epigen/software/bin/run10xPipeline.pbs, including the 'cellranger-atac count' command and transferring data from workdir /oasis/tscc/scratch/$(whoami)/outputs_TA/ to final folder ~/data/outputs/10xATAC/) for the unprocessed 10xATAC libs that are listed in the samplesheet file mentioned above
  2. Run the runBulkATAC_fastq pipeline (/projects/ps-epigen/software/atac_dnase_pipelines/utils/runBulkATAC_fastq.pbs, including merging fastq files for multiple repeats like '_1_2', running /projects/ps-epigen/software/atac_dnase_pipelines/atac.bds, transferring outputs and adding the .finished.txt tag). The transfer script is /projects/ps-epigen/software/bin/results_transfer.sh, which copies from /oasis/tscc/scratch/$(whoami)/outputs/${sample} to /projects/ps-epigen/outputs/
  3. Run runSetQC.pbs (/projects/ps-epigen/software/bin/runSetQC.pbs), including updateLibrariesSetQC (update status) and setQC_wrapper.sh (~/software/setQC/setQC_wrapper.sh)
  4. setQC_wrapper.sh: -n: set_id; -t: experiment type (default: atac); -c: whether SNAP is used (default: false)
    setQC_wrapper.sh -n Set_96
    setQC_wrapper.sh -n Set_113 -t atac_chip -c true
    setQC_wrapper.sh -n Set_113 -t chip -c true
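
The three example invocations above follow directly from the flag defaults. A small sketch of assembling the argument list, assuming only the -n, -t and -c flags described here (the helper name is hypothetical, and the experiment types seen on this page are atac, atac_chip and chip):

```python
def build_setqc_wrapper_args(set_id, exp_type="atac", snap=False):
    """Return the setQC_wrapper.sh argument list, omitting flags left at default."""
    args = ["setQC_wrapper.sh", "-n", set_id]
    if exp_type != "atac":
        # -t is only needed when the type differs from the default (atac).
        args += ["-t", exp_type]
    if snap:
        # -c defaults to false, so it is passed only when SNAP is requested.
        args += ["-c", "true"]
    return args


print(" ".join(build_setqc_wrapper_args("Set_96")))
print(" ".join(build_setqc_wrapper_args("Set_113", "atac_chip", True)))
```

The two printed lines reproduce the first two example commands listed above.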
runSetQC_chipseq.sh setinfo.set_id request.user.email re.sub(r"[\)\(]", ".", setinfo.set_name)
  1. If 'true' appears in column 'is input', the set goes through the runBulkATAC_fastq.pbs pipeline (with 'snap-chip' on), otherwise through runBulkCHIP_fastq.pbs (/projects/ps-epigen/software/bin/runBulkCHIP_fastq.pbs; it is unclear whether this still runs smoothly)
  2. If the set went through the runBulkATAC_fastq.pbs pipeline, the report file name ends with 'setQC_report_atac_chip.html', e.g. http://epigenomics.sdsc.edu:8088/Set_733/157737/setQC_report_atac_chip.html, otherwise with 'setQC_report_chip.html', e.g. http://epigenomics.sdsc.edu:8088/Set_645/7f4342/setQC_report_chip.html
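
Both runSetQC.sh and runSetQC_chipseq.sh receive the set name after parentheses have been replaced with dots by the re.sub call shown in their argument lists. A minimal sketch of that substitution; the wrapper function and the sample set name are hypothetical.

```python
import re


def sanitize_set_name(set_name):
    """Replace every '(' or ')' in the set name with '.', as in the calls above."""
    return re.sub(r"[\)\(]", ".", set_name)


# Hypothetical set name, used only to show the effect of the substitution.
print(sanitize_set_name("Islets (batch 2)"))  # prints: Islets .batch 2.
```

Only parentheses are replaced; spaces and other characters pass through unchanged, so the name should still be passed to the script as a single argument.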

Single Cell App submit_tenX

1. Prepare the SampleSheet.csv file

  • 10xATAC:

    /projects/ps-epigen/outputs/10xATAC/JB_389/.JB_389.tsv

  • others:

    /projects/ps-epigen/outputs/scRNA/JB_298_1_2/.JB_298_1_2.tsv

2. Run 10X script

run10xOnly.sh seq /projects/ps-epigen/outputs/10xATAC/ email

1. run10xPipeline.pbs (/projects/ps-epigen/software/bin/run10xPipeline.pbs), including updateSingleCellStatus (from InQueue to InProcess), cellranger-atac count command, transfer outputs to final folder (~/data/outputs/10xATAC/) and updateSingleCellStatus

runCellRanger.sh seq refgenome.path /projects/ps-epigen/outputs/scRNA/ email

1. run10xCellRanger.pbs (/projects/ps-epigen/software/bin/run10xCellRanger.pbs), including updateSingleCellStatus (from InQueue to InProcess), cellranger count command, transfer outputs to final folder (~/data/outputs/scRNA/) and updateSingleCellStatus (to Yes)
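
The two submit_tenX scripts differ only in their argument lists. A rough sketch of building either one, assuming run10xOnly.sh is used for 10xATAC and runCellRanger.sh for everything else (inferred from the output folders above, not confirmed by the app's source); the function name is hypothetical.

```python
def build_tenx_command(seq, email, exp_type, refgenome_path=None):
    """Return the argument list for the single-cell script matching exp_type."""
    if exp_type == "10xATAC":
        return ["run10xOnly.sh", seq, "/projects/ps-epigen/outputs/10xATAC/", email]
    if refgenome_path is None:
        # runCellRanger.sh takes the reference genome path as its second argument.
        raise ValueError("refgenome_path is required for non-10xATAC experiments")
    return ["runCellRanger.sh", seq, refgenome_path,
            "/projects/ps-epigen/outputs/scRNA/", email]
```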

Single Cell App submit_cooladmin

Run 10X script

coolAdmin.sh {email} {seqString} {exp_type} {paramString}
  1. {paramString}: SNAP_PARAM_DICT[key]
  2. 10xATAC: /projects/ps-epigen/software/snATACCoolAdmin_LIMS/10x_model.bash, including clustering_pipeline command (/home/opoirion/code/snATAC/snATAC_pipeline/clustering_pipeline.py) and updateCooladminStatus
  3. others: /projects/ps-epigen/software/snATACCoolAdmin_LIMS/from_fastq_process.bash, including fastq_pipeline command (~/code/snATAC/snATAC_pipeline/fastq_pipeline.py) and updateCooladminStatus
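
The choice between the two CoolAdmin bash scripts in items 2 and 3 can be sketched as below. The helper names are hypothetical, and in practice the selection presumably happens inside coolAdmin.sh itself; this only restates the rule from the list above.

```python
COOLADMIN_DIR = "/projects/ps-epigen/software/snATACCoolAdmin_LIMS"


def cooladmin_script(exp_type):
    """Return the bash script used for an experiment type (10xATAC vs others)."""
    if exp_type == "10xATAC":
        return f"{COOLADMIN_DIR}/10x_model.bash"
    return f"{COOLADMIN_DIR}/from_fastq_process.bash"


def build_cooladmin_command(email, seq_string, exp_type, param_string):
    """Return the coolAdmin.sh argument list in the order shown above."""
    return ["coolAdmin.sh", email, seq_string, exp_type, param_string]
```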