diff --git a/README.md b/README.md
index 92b7b380..7ae5514e 100644
--- a/README.md
+++ b/README.md
@@ -46,7 +46,7 @@ Explore the phylogenetic trees and embeddings on Nextstrain.
 - Influenza H3N2 HA and NA (2016-2018)
   - [HA phylogeny](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort) colored by maximum compatibility clades (MCCs) representing HA/NA reassortment groups
   - [NA phylogeny](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort) colored by MCCs
-  - [HA/NA tangletree](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort:groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort) colored by MCCs
+  - [HA/NA tangletree](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort:groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort?f_MCC=MCC_0,MCC_1,MCC_10,MCC_11,MCC_12,MCC_13,MCC_14,MCC_2,MCC_3,MCC_4,MCC_5,MCC_6,MCC_7,MCC_8,MCC_9) colored by MCCs
   - [PCA embedding](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort?l=scatter&scatterX=pca1&scatterY=pca2)
   - [MDS embedding (1 and 2)](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort?l=scatter&scatterX=mds1&scatterY=mds2)
   - [MDS embedding (2 and 3)](https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort?l=scatter&scatterX=mds2&scatterY=mds3)
diff --git a/manuscript/cartography.tex b/manuscript/cartography.tex
index 726f0f90..a07a325e 100644
--- a/manuscript/cartography.tex
+++ b/manuscript/cartography.tex
@@ -318,7 +318,7 @@ \subsection{Joint embeddings of hemagglutinin and neuraminidase genomes identify
 Evolution of HA and NA surface proteins contributes to the ability of influenza viruses to escape existing immunity \citep{Petrova2018} and HA and NA genes frequently reassort \citep{Nelson2008,Marshall2013,Potter2019}.
 Therefore, we focused our reassortment analysis on HA and NA sequences, sampling 1,607 viruses collected between January 2016 and January 2018 with sequences for both genes.
 We inferred HA and NA phylogenies from these sequences and applied TreeKnit to both trees to identify maximally compatible clades (MCCs) that represent reassortment events \citep{Barrat-Charlaix2022}.
-Of the 208 reassortment events identified by TreeKnit, 15 (7\%) contained at least 10 samples representing 1,049 samples (65\%).
+Of the 208 reassortment events identified by TreeKnit, 15 (7\%) contained at least 10 samples representing 1,049 samples (65\%, Supplementary Fig.~\ref{S_Fig_ha_na_tangletree}).
 We created PCA, MDS, t-SNE, and UMAP embeddings from the HA alignments and from merged HA and NA alignments.
 We identified clusters in both HA-only and HA/NA embeddings and calculated the VI distance between these clusters and the MCCs identified by TreeKnit.
 
diff --git a/manuscript/cartography_supplement.tex b/manuscript/cartography_supplement.tex
index 167f4255..52b0667c 100644
--- a/manuscript/cartography_supplement.tex
+++ b/manuscript/cartography_supplement.tex
@@ -127,6 +127,15 @@ \section*{Supplementary data}
  The random sampling scheme uniformly sampled from the original dataset, reflecting the geographic and genetic bias in those data.}\label{S_Fig_late_flu_replication_of_cluster_accuracy}
 \end{figure}
 
+\begin{figure}[!h]
+\includegraphics[width=\columnwidth]{figures/flu-2016-2018-ha-na-tangletree-by-mcc.png}
+\caption{{\bf Tanglegram view of phylogenetic trees for influenza H3N2 HA (left) and NA (right) with circles representing HA or NA samples, lines connecting the same samples in each tree, and colors showing Maximally Compatible Clades (MCCs) that represent reassortment events identified by TreeKnit.}
+ Samples from MCCs with fewer than 10 sequences appear in gray without circles in the tanglegram.
+ Branch labels in the HA tree show Nextstrain clades to help contextualize placement of each MCC.
+ \href{https://nextstrain.org/groups/blab/cartography/flu-seasonal-h3n2-ha-2016-2018-reassort:groups/blab/cartography/flu-seasonal-h3n2-na-2016-2018-reassort?f_MCC=MCC_0,MCC_1,MCC_10,MCC_11,MCC_12,MCC_13,MCC_14,MCC_2,MCC_3,MCC_4,MCC_5,MCC_6,MCC_7,MCC_8,MCC_9}{View an interactive version of this figure on nextstrain.org}.
+}\label{S_Fig_ha_na_tangletree}
+\end{figure}
+
 \begin{figure}[!h]
 \includegraphics[width=0.75\columnwidth]{figures/flu-2016-2018-ha-na-all-embeddings-by-mcc.png}
 \caption{{\bf Embeddings influenza H3N2 HA-only (left) and combined HA/NA (right) showing the effects of additional NA genetic information on the placement of reassortment events detected by TreeKnit (MCCs).}
diff --git a/manuscript/figures/flu-2016-2018-ha-na-tangletree-by-mcc.png b/manuscript/figures/flu-2016-2018-ha-na-tangletree-by-mcc.png
new file mode 100644
index 00000000..3cef8907
Binary files /dev/null and b/manuscript/figures/flu-2016-2018-ha-na-tangletree-by-mcc.png differ