Understanding chromatin biology using high throughput sequencing workshop

Learning Objectives

Understand the necessity for, and use of, the command line interface (bash) and HPC for analyzing high-throughput sequencing data.
Understand best practices for designing a ChIP-seq / CUT&RUN / ATAC-seq experiment.
Perform the steps involved in going from raw FASTQ files to peak calls for an individual sample.
Review qualitative ways to assess peak calls and whether they support the hypothesis.

Installations

All:

FileZilla Client (make sure you get 'FileZilla Client')

Mac users:

Plain text editor like Sublime Text or similar

Windows users:

GitBash
Plain text editor like Notepad++ or similar

Notes

These materials focus on the use of local computational resources at Harvard, which are only accessible to Harvard affiliates.
Non-Harvard folks can download the data and set up to work on their local clusters (with the help of local system administrators).

Instructions for Harvard researchers with access to HMS-RC's O2 cluster

To run through the code in the lessons below, you will need to be logged into O2 and working on a compute node (i.e. your command prompt should have the word compute in it).

1. Log in using `ssh ecommonsID@o2.hms.harvard.edu` and enter your password.
2. Once you are on the login node, use `srun --pty -p interactive -t 0-2:30 --mem 1G /bin/bash` to get on a compute node, or use the command specified in the lesson.
3. Proceed only once your command prompt has the word compute in it.
4. If you log out between lessons (using the `exit` command twice), please follow steps 1 and 2 above to log back in and get on a compute node when you resume self-learning.
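
The check on the command prompt can also be scripted. The sketch below is only an illustration: it assumes that the node's hostname contains the word compute in the same way the prompt does, and `is_compute_node` is a hypothetical helper name, not something provided by O2 or SLURM.

```shell
#!/usr/bin/env bash
# Sketch: confirm the session is on a compute node before running lesson code.
# Assumption: the hostname contains "compute", mirroring the command prompt.

# is_compute_node is a hypothetical helper; it succeeds when its argument
# contains the substring "compute".
is_compute_node() {
  case "$1" in
    *compute*) return 0 ;;
    *) return 1 ;;
  esac
}

if is_compute_node "$(hostname)"; then
  echo "On a compute node: OK to proceed."
else
  echo "Not on a compute node. Start an interactive session first, e.g.:"
  echo "  srun --pty -p interactive -t 0-2:30 --mem 1G /bin/bash"
fi
```

If the hostname convention on your cluster differs, adjust the pattern in the `case` statement accordingly.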

Lessons

Part I

Shell basics review
Working in an HPC environment - Review
Best Practices in Research Data Management (RDM)
Dataset overview and Project organization

Part II

A review of high-throughput sequencing methods for understanding chromatin biology
Experimental design considerations for HTS of chromatin
Quality Control of Sequence Data: Running FASTQC and evaluating results
Alignment using Bowtie2

Part III

Filtering BAM files
Peak calling
Handling peak files using bedtools

Part IV

File formats for peak visualization
Qualitative assessment of peak enrichment using deepTools
Troubleshooting your ChIP-seq analysis

Part V

Automating the ChIP-seq workflow

Answer Keys

Data Management and project organization
QC and Alignment questions
Handling peak calls
Automation shell script
Parallelization script

Building on this workshop

Integration of ChIP-seq and RNA-seq
Advanced bash commands (aliases, copying files, and symlinks)
Introduction to R workshop materials

Resources

ENCODE Data Standards and Processing Pipeline Information for Histone and Transcription Factors
ENCODE guidelines and practices for ChIP-seq. An older paper, but a good outline of general best practices.
Experimental design considerations:
- Thermofisher Step-by-step guide to a successful ChIP experiment
- "Chromatin Immunoprecipitation (ChIP) Principles and How to Obtain Quality Results", BenchSci Blog
- O’Geen et al (2011), Methods Mol Biol - A focus on performing ChIP assays to characterize histone modifications
Jung et al (2014). NAR. - Impact of sequencing depth in ChIP-seq experiments

These materials have been developed by members of the teaching team at the Harvard Chan Bioinformatics Core (HBC). These are open access materials distributed under the terms of the Creative Commons Attribution license (CC BY 4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
