Commit

Small text edit
huddlej committed Aug 14, 2024
1 parent a327478 commit adbab82
Showing 1 changed file with 1 addition and 1 deletion.
2 changes: 1 addition & 1 deletion manuscript/cartography.tex
@@ -437,7 +437,7 @@ \subsection{SARS-CoV-2 clusters recapitulate broad genetic groups corresponding
Unlike the Pango lineages in the early SARS-CoV-2 data, the lineages from the later data exhibited fewer pairwise genetic distances between samples in each lineage than samples in Nextstrain clades or any embedding cluster (Supplementary Fig.~S\ref{S_Fig_sarscov2_within_between_group_distances}).

To understand whether t-SNE clusters could capture Pango-resolution genetic groups within a single Nextstrain clade, we evenly sampled approximately 2,000 sequences from a dominant Nextstrain clade with many Pango lineages, 21J (Delta), and identified clusters from a t-SNE embedding of those data.
-Within the 1,992 sequences from 21J (Delta), we found 38 Pango lineages after collapsing lineages with fewer than 10 sequences into their parent lineages.
+Within the 1,992 sequences sampled from 21J (Delta), we found 38 Pango lineages after collapsing lineages with fewer than 10 sequences into their parent lineages.
We found 28 t-SNE clusters representing 1,806 sequences (91\%) with 186 sequences (9\%) assigned to the unclustered ``-1'' label (Supplementary Fig.~\ref{S_Fig_sarscov2_single_clade_embeddings_tsne_counts}).
The VI distance between Pango lineages and all clusters (including the unclustered group) was 0.17 (Supplementary Table~\ref{S_Table_optimal_cluster_parameters}).
This distance was consistent with the distance of 0.14 between Pango lineages and t-SNE clusters from both the full early and late SARS-CoV-2 datasets.