Box Folder Organization


ATAC-seq

Annotations

Contains files for SS vs P DA regions annotated using the databases and tools described here, bar charts showing the relative frequency of each annotation among regions with increased accessibility and among those with decreased accessibility, and hg38 reference sequences for the DA regions.

DA-DE Correlation

Contains scatter plots and inner joins from a naive first attempt to detect a correlation between differential expression and differential accessibility for SS vs P.
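As a rough illustration of the kind of join behind those scatter plots, the sketch below inner-joins two tables on gene name and computes a Pearson correlation. It assumes each DA region has already been assigned to a gene and that Pearson is the measure of interest; neither is stated above, and the names `inner_join` and `pearson` and all values are hypothetical.

```python
import math

def inner_join(de, da):
    """Keep only genes present in both tables: (gene, DE log2FC, DA log2FC)."""
    shared = sorted(set(de) & set(da))
    return [(gene, de[gene], da[gene]) for gene in shared]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (sd_x * sd_y)

# Toy values (made up): gene -> log2 fold change.
de = {"G1": 2.0, "G2": -1.0, "G3": 0.5, "G5": 3.0}    # differential expression
da = {"G1": 1.0, "G2": -0.5, "G3": 0.25, "G4": 0.9}   # differential accessibility

joined = inner_join(de, da)
r = pearson([row[1] for row in joined], [row[2] for row in joined])
print(len(joined), round(r, 3))
```

Genes found in only one table (`G4`, `G5` here) drop out of an inner join, so the correlation only describes genes that are both DE and linked to a DA region.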

ATAC-seq Data

Contains DiffBind and raw ATAC-seq output for SS vs P.

RNA-seq

Predicted Regulators

Contains LISA and ChEA3 output for the top 500 SS vs P DE genes (ranked by ascending adjusted p-value).
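A minimal sketch of how such a top-N gene list could be selected, assuming the DE results have been loaded as rows with a gene name and an adjusted p-value. The function name `top_de_genes`, the row layout, and the toy values are hypothetical; the only detail taken from DESeq2 is that `padj` can be missing (NA) for some genes.

```python
import math

def top_de_genes(rows, n=500):
    """Return the names of the n genes with the smallest adjusted p-values."""
    # Drop genes with a missing padj before ranking.
    scored = [
        r for r in rows
        if r["padj"] is not None and not math.isnan(r["padj"])
    ]
    scored.sort(key=lambda r: r["padj"])  # ascending: most significant first
    return [r["gene"] for r in scored[:n]]

# Toy rows standing in for a DE results table (values are made up).
rows = [
    {"gene": "A", "padj": 0.04},
    {"gene": "B", "padj": None},
    {"gene": "C", "padj": 1e-8},
    {"gene": "D", "padj": 0.001},
]
print(top_de_genes(rows, n=2))  # ['C', 'D']
```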

Compare Conditions

Contains all output of Comparing DE Between Quiescent Conditions, including CSV files of SQL-style outer joins between DE gene lists and a matrix of counts for overlaps between lists.
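The two kinds of output mentioned above can be sketched in plain Python as follows. This is an illustration under assumptions, not the code that produced the files: each DE gene list is modeled as a dict from gene name to log2 fold change, and the function names and toy values are hypothetical.

```python
from itertools import product

def outer_join(left, right):
    """SQL-style full outer join on gene name; missing values become None."""
    genes = sorted(set(left) | set(right))
    return [(gene, left.get(gene), right.get(gene)) for gene in genes]

def overlap_matrix(gene_lists):
    """Count shared genes for every ordered pair of DE gene lists."""
    names = list(gene_lists)
    return {
        (a, b): len(set(gene_lists[a]) & set(gene_lists[b]))
        for a, b in product(names, repeat=2)
    }

# Toy DE lists (made up): gene -> log2 fold change.
ss_vs_p = {"G1": 1.5, "G2": -2.0, "G3": 0.8}
ci_vs_p = {"G2": -1.1, "G4": 2.3}

for row in outer_join(ss_vs_p, ci_vs_p):
    print(row)

counts = overlap_matrix({"SS_vs_P": ss_vs_p, "CI_vs_P": ci_vs_p})
print(counts[("SS_vs_P", "CI_vs_P")])  # 1
```

The diagonal of the matrix is simply the size of each list, which gives a quick sanity check on the off-diagonal counts.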

RNA-seq Data

Contains DESeq2 output for SS vs P, CI vs P, SS vs SSR, and CI vs CIR.

ChIP-seq

Transcription Factor

Co-association

Contains co-binding maps created using various sets of SS vs P DE TFs and DE genes (e.g. upregulated, downregulated, all). Contains interaction heatmaps corresponding to each of these co-binding maps.

Key: UU = upregulated TFs and genes, DD = downregulated TFs and genes, _down = all DE TFs and downregulated genes, _up = all DE TFs and upregulated genes, down = downregulated TFs and all DE genes, up = upregulated TFs and all DE genes, all = all DE TFs and DE genes

Additionally, for interaction heatmaps: self = models trained without validation test sets (standardized to 100 boosting rounds), test = models trained with test sets and early stopping (50 rounds) to prevent overfitting. The test models often stopped very early, which significantly lowered model accuracy on the primary set and suggests that the differences between ChIP-seq data from experiment to experiment are too drastic to ignore.

BETA

Contains BETA output (basic and minus mode) for SS vs P DE TF ChIP-seq data from the 2018 Cistrome batch download. Includes two lists of merged (deduplicated) predicted targets for each DE TF: one for all targets and one restricted to DE targets. Also contains heatmaps showing log2 fold changes of DE targets for DE TFs at various regulatory potential cutoffs.

GREAT

Contains GREAT output for all SS vs P DE TF ChIP-seq files in the 2018 Cistrome batch download.

Histone Modification

Co-association

Contains co-binding maps created using various sets of SS vs P DE TFs and either DE genes or DA regions.

Contains UROPA-annotated output for merged H4K20me3 ChIP-seq data in the 2018 Cistrome batch download. Merging was performed by reading all the ChIP-seq peak files into memory, concatenating them into a single master peak list, and combining all peaks within a user-definable distance of one another (in this case, 0 and 1000 bp) using bedtools merge. Annotation was performed using GENCODE V29.
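The merging step can be illustrated with a small pure-Python sketch of distance-based interval merging, the idea behind the `-d` option of `bedtools merge`. It is not the pipeline's actual code: the function name `merge_peaks`, the tuple layout, and the toy coordinates are hypothetical, and edge-case behavior may differ from bedtools itself.

```python
from collections import defaultdict

def merge_peaks(peaks, max_gap=0):
    """Merge peaks on the same chromosome whose gap is at most max_gap bp.

    peaks: iterable of (chrom, start, end) tuples pooled from all peak files.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end in peaks:
        by_chrom[chrom].append((start, end))

    merged = []
    for chrom in sorted(by_chrom):
        intervals = sorted(by_chrom[chrom])  # sort by start, as merging requires
        cur_start, cur_end = intervals[0]
        for start, end in intervals[1:]:
            if start - cur_end <= max_gap:
                # Close enough (or overlapping): extend the current peak.
                cur_end = max(cur_end, end)
            else:
                merged.append((chrom, cur_start, cur_end))
                cur_start, cur_end = start, end
        merged.append((chrom, cur_start, cur_end))
    return merged

# Toy master peak list (coordinates are made up).
peaks = [
    ("chr1", 100, 200),
    ("chr1", 150, 300),
    ("chr1", 900, 1000),
    ("chr2", 50, 60),
]
print(merge_peaks(peaks, max_gap=0))
# [('chr1', 100, 300), ('chr1', 900, 1000), ('chr2', 50, 60)]
print(merge_peaks(peaks, max_gap=1000))
# [('chr1', 100, 1000), ('chr2', 50, 60)]
```

A gap of 0 only joins overlapping or touching peaks, while a gap of 1000 bp also absorbs nearby peaks into broader domains, which is why the two settings can give noticeably different annotated gene sets.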

Also contains overlap between merged H4K20me3 annotated genes and merged SS vs P DE TF BETA-predicted target genes. Lastly, contains overlap between merged H4K20me3 annotated genes and SS vs P DE genes.

CRC

Contains the aligned reads, ROSE output, and CRCmapper output for H3K27ac ChIP-seq data from a cell cycle study and an ENCODE experiment on a line of adult human dermal fibroblasts. The raw sequencing reads for G0/G1, M, and S phase H3K27ac ChIP-seq were downloaded using SRA Explorer and the search term SRP098814. Peaks were subsequently called using MACS3. By contrast, prealigned reads (bowtie2, hg38) from isogenic replicate 2 (ENCFF754VWN) were downloaded for the ENCODE cell line. Similarly, pre-called replicated peaks (ENCFF398SWW) were downloaded for this experiment.

RSGDREAM 2020

Contains the abstract, poster, and recorded five-minute talk submitted to the virtual 2020 ISCB Regulatory and Systems Genomics with DREAM Challenges Conference.

Paper Notes

Contains notes on relevant papers for future reference.

Progress Reports

Contains lab meeting presentations and other progress reports.

To-Do

Contains the most up-to-date lists of future plans and extensions.