Add supplemental tanglegram for HA/NA analysis
Adds a supplemental figure for the HA/NA analysis showing a tanglegram
between the HA and NA trees with tips colored by MCCs from TreeKnit with
at least 10 samples. Updates the README link for this same tanglegram to
match the figure in the paper.

Closes #108
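
The selection rule described in the commit message (keep only MCCs with at least 10 samples, then filter the tanglegram view to them) can be sketched as follows. This is a minimal illustration, not the code used in the repository: the function name `build_mcc_filter`, the `mcc_by_sample` mapping, and the toy data are hypothetical, and the `f_MCC=` query format is taken only from the URL that appears in this commit.

```python
from collections import Counter


def build_mcc_filter(mcc_by_sample, min_samples=10):
    """Return the MCC names with at least `min_samples` samples and the
    matching `f_MCC=...` query value (hypothetical helper for illustration)."""
    # Count how many samples were assigned to each MCC.
    counts = Counter(mcc_by_sample.values())
    # Keep MCCs that meet the threshold; sort names lexicographically,
    # which matches the ordering seen in the commit's URL (MCC_1, MCC_10, ...).
    kept = sorted(name for name, n in counts.items() if n >= min_samples)
    return kept, "f_MCC=" + ",".join(kept)


# Toy data (made up): two MCCs meet the threshold, one does not.
toy = {f"s{i}": "MCC_0" for i in range(12)}
toy.update({f"t{i}": "MCC_10" for i in range(10)})
toy.update({f"u{i}": "MCC_2" for i in range(3)})

kept, query = build_mcc_filter(toy)
print(query)  # f_MCC=MCC_0,MCC_10

# The percentages quoted in the manuscript change follow from the counts:
print(round(100 * 15 / 208))     # 7  (share of MCCs with >= 10 samples)
print(round(100 * 1049 / 1607))  # 65 (share of samples in those MCCs)
```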
huddlej committed Aug 20, 2024
1 parent 1d0d1b3 commit 717deb1
Showing 4 changed files with 11 additions and 2 deletions.
2 changes: 1 addition & 1 deletion README.md
@@ -46,7 +46,7 @@ Explore the phylogenetic trees and embeddings on Nextstrain.
- Influenza H3N2 HA and NA (2016-2018)
- [HA phylogeny](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort) colored by maximum compatibility clades (MCCs) representing HA/NA reassortment groups
- [NA phylogeny](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort) colored by MCCs
- - [HA/NA tangletree](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort:groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort) colored by MCCs
+ - [HA/NA tangletree](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort:groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort?f_MCC=MCC_0,MCC_1,MCC_10,MCC_11,MCC_12,MCC_13,MCC_14,MCC_2,MCC_3,MCC_4,MCC_5,MCC_6,MCC_7,MCC_8,MCC_9) colored by MCCs
- [PCA embedding](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort?l=scatter&scatterX=pca1&scatterY=pca2)
- [MDS embedding (1 and 2)](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort?l=scatter&scatterX=mds1&scatterY=mds2)
- [MDS embedding (2 and 3)](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort?l=scatter&scatterX=mds2&scatterY=mds3)
2 changes: 1 addition & 1 deletion manuscript/cartography.tex
@@ -318,7 +318,7 @@ \subsection{Joint embeddings of hemagglutinin and neuraminidase genomes identify
Evolution of HA and NA surface proteins contributes to the ability of influenza viruses to escape existing immunity \citep{Petrova2018} and HA and NA genes frequently reassort \citep{Nelson2008,Marshall2013,Potter2019}.
Therefore, we focused our reassortment analysis on HA and NA sequences, sampling 1,607 viruses collected between January 2016 and January 2018 with sequences for both genes.
We inferred HA and NA phylogenies from these sequences and applied TreeKnit to both trees to identify maximally compatible clades (MCCs) that represent reassortment events \citep{Barrat-Charlaix2022}.
- Of the 208 reassortment events identified by TreeKnit, 15 (7\%) contained at least 10 samples representing 1,049 samples (65\%).
+ Of the 208 reassortment events identified by TreeKnit, 15 (7\%) contained at least 10 samples representing 1,049 samples (65\%, Supplementary Fig.~\ref{S_Fig_ha_na_tangletree}).

We created PCA, MDS, t-SNE, and UMAP embeddings from the HA alignments and from merged HA and NA alignments.
We identified clusters in both HA-only and HA/NA embeddings and calculated the VI distance between these clusters and the MCCs identified by TreeKnit.
9 changes: 9 additions & 0 deletions manuscript/cartography_supplement.tex
@@ -127,6 +127,15 @@ \section*{Supplementary data}
The random sampling scheme uniformly sampled from the original dataset, reflecting the geographic and genetic bias in those data.}\label{S_Fig_late_flu_replication_of_cluster_accuracy}
\end{figure}

\begin{figure}[!h]
\includegraphics[width=\columnwidth]{figures/flu-2016-2018-ha-na-tangletree-by-mcc.png}
\caption{{\bf Tanglegram view of phylogenetic trees for influenza H3N2 HA (left) and NA (right) with circles representing HA or NA samples, lines connecting the same samples in each tree, and colors showing Maximally Compatible Clades (MCCs) that represent reassortment events identified by TreeKnit.}
Samples from MCCs with fewer than 10 sequences appear in gray without circles in the tanglegram.
Branch labels in the HA tree show Nextstrain clades to help contextualize placement of each MCC.
\href{https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort:groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort?f_MCC=MCC_0,MCC_1,MCC_10,MCC_11,MCC_12,MCC_13,MCC_14,MCC_2,MCC_3,MCC_4,MCC_5,MCC_6,MCC_7,MCC_8,MCC_9}{View an interactive version of this figure on nextstrain.org}.
}\label{S_Fig_ha_na_tangletree}
\end{figure}

\begin{figure}[!h]
\includegraphics[width=0.75\columnwidth]{figures/flu-2016-2018-ha-na-all-embeddings-by-mcc.png}
\caption{{\bf Embeddings influenza H3N2 HA-only (left) and combined HA/NA (right) showing the effects of additional NA genetic information on the placement of reassortment events detected by TreeKnit (MCCs).}