This repo contains multiple directories, each analyzing the diversity of specific herpesvirus glycoproteins. Each herpesvirus directory contains its own Snakemake pipeline, which downloads sequences based on a list of accessions, processes the sequences, and constructs a phylogenetic Nextstrain tree that can be viewed using Auspice.
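As a rough illustration of the download step, the sketch below builds an NCBI E-utilities `efetch` request for a list of accessions. It is not taken from the pipelines in this repo, which may fetch sequences differently. The function names `build_efetch_url` and `fetch_fasta` are hypothetical, and the accessions used are the reference strains listed in this README.

```python
from typing import Iterable
from urllib.parse import urlencode
from urllib.request import urlopen

# Public NCBI E-utilities endpoint for fetching records by identifier.
EFETCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi"


def build_efetch_url(accessions: Iterable[str]) -> str:
    """Build an efetch URL that requests nucleotide records as FASTA text."""
    query = urlencode(
        {
            "db": "nuccore",               # nucleotide database
            "id": ",".join(accessions),    # comma-separated accession list
            "rettype": "fasta",
            "retmode": "text",
        }
    )
    return f"{EFETCH}?{query}"


def fetch_fasta(accessions: Iterable[str]) -> str:
    """Download the FASTA records (requires network access)."""
    with urlopen(build_efetch_url(accessions)) as response:
        return response.read().decode("utf-8")


if __name__ == "__main__":
    # Print the request URL only; call fetch_fasta(...) to download.
    print(build_efetch_url(["NC_001806.2", "NC_007605.1", "NC_001798.2"]))
```

For long accession lists, requests would normally be split into batches and rate limited according to NCBI's usage guidelines.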
The numbering scheme of each protein is relative to the NCBI Virus reference strains (HSV-1: NC_001806.2, EBV: NC_007605.1, HSV-2: NC_001798.2).
Analysis performed by Caleb Carr.
The trees can be colored by several features (e.g., genotype, date, country) by selecting the corresponding option in the Color By dropdown. For example, the HSV-1 gB tree can be colored by amino acid identity at a position by selecting Genotype in the dropdown menu and then selecting HSV1_gB. Entering a position will then color the tree by the amino acid identity at that position. Note that the protein numbering is relative to the NCBI Virus reference strains (HSV-1: NC_001806.2, EBV: NC_007605.1, HSV-2: NC_001798.2). Other features can be viewed by mousing over or clicking on the nodes and branches of the tree.
HSV-1:
EBV:
HSV-2:
Alignments are constructed relative to the NCBI Virus reference strains (HSV-1: NC_001806.2, EBV: NC_007605.1, HSV-2: NC_001798.2).
HSV-1:
EBV:
HSV-2:
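To make reference-relative numbering concrete, here is a minimal sketch of how a protein position in the reference strain maps to a column in an alignment, and how the residue at that position can be read from every sequence. It assumes a gapped protein alignment with `-` as the gap character. The function names and the toy sequences are hypothetical and are not taken from the pipelines in this repo.

```python
from typing import Dict


def reference_position_map(aligned_ref: str, gap: str = "-") -> Dict[int, int]:
    """Map 1-based reference positions to 0-based alignment columns.

    Columns where the reference has a gap (insertions in other
    sequences) receive no reference position.
    """
    mapping: Dict[int, int] = {}
    ref_pos = 0
    for col, residue in enumerate(aligned_ref):
        if residue != gap:
            ref_pos += 1
            mapping[ref_pos] = col
    return mapping


def residues_at(
    aligned_ref: str, aligned_seqs: Dict[str, str], position: int
) -> Dict[str, str]:
    """Return each sequence's residue at a 1-based reference position."""
    col = reference_position_map(aligned_ref)[position]
    return {name: seq[col] for name, seq in aligned_seqs.items()}


if __name__ == "__main__":
    # Toy alignment: strainA carries an insertion relative to the reference.
    ref = "MK-TL"
    seqs = {"strainA": "MKATL", "strainB": "MR-TI"}
    print(residues_at(ref, seqs, 2))  # {'strainA': 'K', 'strainB': 'R'}
```

Because the insertion in `strainA` falls in a reference gap, it does not shift the numbering of downstream positions.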
HSV1
: Contains the analysis workflow for HSV-1
EBV
: Contains the analysis workflow for EBV
HSV2
: Contains the analysis workflow for HSV-2
auspice
: Contains the final Nextstrain tree files for each herpesvirus, which can then be viewed through Nextstrain Community sharing via GitHub. Note that these final files are manually copied from each individual herpesvirus directory.
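As a sketch of how a file in the auspice directory relates to its viewing address, the helper below applies the Nextstrain Community naming convention as commonly documented: the dataset JSON lives in `auspice/`, its filename starts with the repository name, and underscores in the rest of the name become path separators in the URL. The owner, repository, and file names used here are hypothetical placeholders, not the real names for this repo.

```python
from pathlib import PurePosixPath


def community_url(owner: str, repo: str, dataset_file: str) -> str:
    """Build a Nextstrain Community URL for a main dataset JSON file.

    Assumes the filename is "<repo>.json" or "<repo>_<part>_<part>.json".
    """
    stem = PurePosixPath(dataset_file).stem
    if stem != repo and not stem.startswith(repo + "_"):
        raise ValueError(f"{dataset_file!r} does not start with repo name {repo!r}")
    # Everything after the repo name is split on underscores into path parts.
    suffix_parts = [part for part in stem[len(repo):].split("_") if part]
    return "https://nextstrain.org/community/" + "/".join([owner, repo] + suffix_parts)


if __name__ == "__main__":
    # Hypothetical owner, repo, and dataset file names.
    print(community_url("example-user", "herpes-trees", "auspice/herpes-trees_HSV1_gB.json"))
```

A filename that does not begin with the repository name raises `ValueError`, which can catch a misnamed file before it is copied into the auspice directory.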