May 17-19, 2017
Norman Wickett, Conservation Scientist, Chicago Botanic Garden
Matt Johnson, Post-Doctoral Research Associate, Chicago Botanic Garden
Elliot Gardner, Ph.D. Candidate, Northwestern University
Lisa Pokorny, Phylogenomic Research Fellow, Royal Botanic Gardens, Kew
Steven Dodsworth, Senior Researcher (PAFTOL), Royal Botanic Gardens, Kew
Hyb-Seq (Weitemier et al., 2014) combines targeted sequencing of thousands of low-copy nuclear exons and their flanking regions, which evolve at a range of rates, with genome skimming of high-copy repeats and organellar genomes to efficiently produce genome-scale data sets for phylogenomics. A new angiosperm-wide Hyb-Seq enrichment probe kit is being developed at Kew (in collaboration with MYcroarray) in the context of PAFTOL, based on transcriptome data (generated by the oneKP project) and genome data (Phytozome). This workshop will explain how to successfully implement Hyb-Seq all the way from kit design, through lab dos and don’ts and data post-processing, to phylogenomic inference.
Following the workshop, participants should be able to:
- Describe the entire HybSeq process, from bait design and lab work through phylogenetic analysis.
- Design a HybSeq study and accurately estimate the cost to complete it.
- Select appropriate markers for a HybSeq probe set.
- Choose appropriate DNA extraction and library preparation strategies.
- Decide on an appropriate pooling and sequencing strategy, taking into account both the cost savings and risks of increased multiplexing.
- Assemble raw sequences into a usable data set.
- Choose and carry out appropriate phylogenetic and/or population genetic analyses of assembled data.
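The trade-off between cost savings and the risks of increased multiplexing, mentioned in the objectives above, can be explored with some simple arithmetic. The following is a minimal sketch, not part of the workshop materials: the function names are hypothetical, every number in the example is an illustrative placeholder, and it assumes equimolar pooling with a single on-target fraction shared by all samples.

```python
def expected_depth(total_reads, read_length, n_samples, on_target_fraction, target_bp):
    """Mean per-base depth on target for one sample in an equimolar pool."""
    if n_samples <= 0 or target_bp <= 0:
        raise ValueError("n_samples and target_bp must be positive")
    reads_per_sample = total_reads / n_samples
    on_target_bases = reads_per_sample * read_length * on_target_fraction
    return on_target_bases / target_bp


def cost_per_sample(run_cost, n_samples, library_cost):
    """Sequencing run cost shared across the pool, plus the per-sample library cost."""
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    return run_cost / n_samples + library_cost


if __name__ == "__main__":
    # All values below are assumed placeholders, not figures from the workshop.
    for n in (24, 48, 96):
        depth = expected_depth(
            total_reads=20_000_000,   # reads produced by one sequencing run
            read_length=150,          # bases per read
            n_samples=n,              # libraries pooled on the run
            on_target_fraction=0.5,   # share of reads mapping to targets
            target_bp=500_000,        # total length of targeted regions
        )
        cost = cost_per_sample(run_cost=2000.0, n_samples=n, library_cost=30.0)
        print(f"{n} samples: ~{depth:.0f}x mean depth, ~{cost:.2f} per sample")
```

Doubling the number of pooled samples halves the expected depth while lowering only the shared sequencing portion of the per-sample cost. In practice, uneven pooling and variable enrichment efficiency mean that some samples will fall well below the mean.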
All tutorials and slide decks are freely available with an MIT License at: https://github.com/mossmatters/KewHybSeqWorkshop
The workshop relies on CyVerse Atmosphere, a cloud computing project sponsored by the National Science Foundation. Accounts for educational purposes are free. Sign up and request Atmosphere access at: https://user.cyverse.org
We created a CyVerse Atmosphere image, HybSeq_Kew_Workshop, that contains all of the software needed to run the following tutorials. All software installed on the Atmosphere image is open source and free to use for educational purposes.
Click here for a workshop-specific Atmosphere tutorial.
An introduction to enriching high-throughput sequencing libraries for targeted exons and flanking introns via hybridization (HybSeq).
Wednesday, 9h - 10h
Slides (Presenter: Norm Wickett)
Gaining access to computational and data resources for Hands On Tutorials.
Wednesday, 10h - 12h
Best practices for selecting loci for HybSeq using existing genomic and transcriptomic resources.
Wednesday, 13h - 15h
Slides (Presenter: Matt Johnson)
Software Covered: MarkerMiner, BLAST, MAFFT
Strategies for HybSeq projects: using existing resources, utilizing herbarium material, and planning a budget.
Wednesday, 15:30h - 16:30h
Slides (Presenter: Elliot Gardner)
Wednesday, 16:30h - 17:30h
Slides (Presenter: Elliot Gardner)
Getting the most out of reagents: library preparation, cleaning with SPRI beads, and simultaneous hybridization.
Thursday, 9h - 10:45h (Lecture); Thursday, 11h - 12h (Exercises)
Slides (Presenter: Elliot Gardner)
So you've got 60 billion base pairs of sequence: now what? Filtering and trimming HybSeq sequences before analysis.
Thursday, 12h - 13h
Slides (Presenter: Matt Johnson)
Software Covered: BaseSpace, FastQC, Trimmomatic
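To make the idea of quality trimming concrete, here is a minimal sketch of a sliding-window trim on a single read. It is a simplified illustration of the general technique, not Trimmomatic's implementation; the function names, the window size of 4, and the quality threshold of 20 are assumptions chosen for the example, and Phred+33 quality encoding is assumed.

```python
def phred_scores(quality_string, offset=33):
    """Convert a FASTQ quality string to integer Phred scores (Phred+33 assumed)."""
    return [ord(ch) - offset for ch in quality_string]


def sliding_window_trim(sequence, quality_string, window=4, min_mean_quality=20):
    """Cut the read at the start of the first window whose mean quality is too low.

    Reads shorter than the window are returned unchanged.
    """
    scores = phred_scores(quality_string)
    if len(sequence) != len(scores):
        raise ValueError("sequence and quality string must have the same length")
    for start in range(max(len(scores) - window + 1, 0)):
        window_scores = scores[start:start + window]
        if sum(window_scores) / window < min_mean_quality:
            return sequence[:start], quality_string[:start]
    return sequence, quality_string


if __name__ == "__main__":
    # Hypothetical read: six high-quality bases followed by four low-quality ones.
    seq, qual = sliding_window_trim("ACGTACGTAC", "IIIIII####")
    print(seq, qual)
```

A real trimming tool also handles adapters, paired reads, and minimum-length filtering, so this sketch is only meant to show why the low-quality tail of a read is removed before assembly.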
Exploring the three stages of HybPiper (read sorting, contig assembly, and intron extraction) along with strategies for recovering introns and identifying paralogs.
Thursday, 14h - 15:20h (Hands On)
Slides (Presenter: Matt Johnson)
Software Covered: HybPiper (BWA, SPAdes, Exonerate), RStudio
How to avoid staring at every alignment: high-throughput methods for alignment and quality control.
Thursday, 15:40h - 17h
Slides (Presenter: Matt Johnson)
Script for branch length outliers
Software Covered: MAFFT, PAL2NAL, trimAl, AliView, FastTree, FigTree
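One high-throughput quality check is to flag tips with unusually long terminal branches in each gene tree, since these often point to misaligned or contaminant sequences. The sketch below is an assumption-laden illustration and is not the workshop's branch length outlier script: the function names and the cutoff of ten times the median terminal branch length are hypothetical, and the regular expression handles only simple Newick strings with unquoted tip labels.

```python
import re
from statistics import median

# A tip label follows "(" or "," in Newick; internal branch lengths follow ")".
LEAF_PATTERN = re.compile(r"[(,]\s*([^():,;\s]+):([0-9.eE+-]+)")


def terminal_branch_lengths(newick):
    """Return {tip_name: terminal_branch_length} for a simple Newick string."""
    return {name: float(length) for name, length in LEAF_PATTERN.findall(newick)}


def long_branch_tips(newick, fold=10.0):
    """List tips whose terminal branch exceeds `fold` times the median tip length."""
    lengths = terminal_branch_lengths(newick)
    if not lengths:
        return []
    cutoff = fold * median(lengths.values())
    return sorted(name for name, length in lengths.items() if length > cutoff)


if __name__ == "__main__":
    # Hypothetical gene tree in which tip D sits on a suspiciously long branch.
    tree = "((A:0.01,B:0.02):0.03,(C:0.015,D:0.9):0.02,E:0.02);"
    print(long_branch_tips(tree))
```

Running a check like this over every gene tree narrows the set of alignments that need visual inspection. The cutoff should be tuned to the data set, and a dedicated tree library is a better choice for trees with quoted labels or comments.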
From gene trees to species trees: advantages of summary coalescent methods, and new ways of assessing phylogenetic support.
Friday, 9h - 10h (Lecture); Friday, 10h - 10:45h and 11h - 11:30h (Hands On)
Slides (Presenter: Norm Wickett)
More information about PhyParts Pie Charts
Software Covered: RAxML, ASTRAL-II, PhyParts
Strategies for HybSeq in species complexes: marker selection, identifying heterozygosity, and population genomics.
Friday, 11:30h - 13h (Lecture); Friday, 14h - 15:20h (Hands On)
Slides (Presenter: Elliot Gardner)
Example variant workflow script
Software Covered: GATK, IGV, PLINK, STRUCTURE, SplitsTree, SVDquartets
The organizers thank the Royal Botanic Gardens, Kew for sponsoring the workshop. This workshop was funded under the Plant and Fungal Trees of Life Project (PAFTOL, William Baker, lead PI). PAFTOL is supported by the Calleva Foundation, the Sackler Foundation, and the Garfield Weston Foundation. Learn more about PAFTOL here.
Active research presented in this workshop on HybSeq and Phylogenomics, including all of the tutorial data sets, was funded by the National Science Foundation:
DEB-1239980 (A. Jonathan Shaw), DEB-12400045 (Bernard Goffinet), and DEB-1239992 (Norman Wickett)
DEB-0919119 (Nyree J.C. Zerega), DEB-1501373 (Elliot Gardner)
DEB-1146295 (Bernard Goffinet)
DEB-1353131 (Andrew Alverson), DEB-1353152 (Norman Wickett)
DEB-1342873 (Krissa Skogen, Norman Wickett, Jeremie Fant)