Use Nextclade to assign clade labels in main phylogenetic workflow #131

huddlej · 2023-11-28T17:31:59Z

Context

We currently assign clade labels to trees in our main phylogenetic workflow using the augur clades command and the influenza clade nomenclature TSVs. However, clade assignments for some samples differ between these public/private trees and the Nextclade trees. Assignments can also differ between runs of the public/private trees from the same time period, because our random subsampling logic gives each tree a different sample composition. These mismatches can confuse users who look at both the Nextclade outputs for their own data and the public/private Nextstrain trees.
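To make the mismatch concrete, here is a minimal sketch of how one might list the strains that receive different labels from the two sources. The function name, strain names, and clade labels are hypothetical; it assumes each source has already been reduced to a strain-to-clade mapping.

```python
def find_clade_mismatches(tree_clades, nextclade_clades):
    """Return {strain: (tree_clade, nextclade_clade)} for strains labeled differently.

    Only strains present in both mappings are compared.
    """
    shared = tree_clades.keys() & nextclade_clades.keys()
    return {
        strain: (tree_clades[strain], nextclade_clades[strain])
        for strain in sorted(shared)
        if tree_clades[strain] != nextclade_clades[strain]
    }


# Hypothetical strain names and clade labels, for illustration only.
tree_clades = {"strain_A": "clade_1", "strain_B": "clade_2", "strain_C": "clade_2"}
nextclade_clades = {"strain_A": "clade_1", "strain_B": "clade_1", "strain_D": "clade_3"}

mismatches = find_clade_mismatches(tree_clades, nextclade_clades)
print(mismatches)
```

Strains that appear in only one source (strain_C and strain_D here) are skipped, since subsampling means most strains will not be in every tree.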

Description

Since Nextclade provides a standard clade label interface already, we should use Nextclade to annotate clades in our main phylogenetic workflow instead of augur clades. This change will ensure that the samples are assigned to the same clade regardless of the sample composition of a given public/private tree.

Possible solutions

In the short term, we could replace our nextalign alignment step with nextclade, using the Nextclade dataset that corresponds to the reference for each subtype. We would also need to replace the current augur clades command with functionality like @corneliusroemer proposed in nextstrain/augur#1329, which would let us assign clades to internal nodes and branches and so keep the clade display in Auspice fully backward compatible. Alternatively, instead of inferring clades for internal nodes as a discrete trait, we could consider running Nextclade on the inferred ancestral sequences of those nodes.
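As a rough illustration of propagating tip labels to internal nodes, the sketch below labels an internal node only when all of its children carry the same clade. This is a deliberately simplified rule chosen for illustration, not the method proposed in nextstrain/augur#1329 and not a discrete trait inference; the nested-dict tree layout, function name, and labels are all assumptions.

```python
def assign_internal_clades(node, tip_clades):
    """Post-order pass that labels a node only if all of its children agree.

    `node` is a nested dict with a "name" and an optional "children" list
    (a hypothetical layout). Nodes whose children disagree, or whose tips
    have no label, get None.
    """
    children = node.get("children", [])
    if not children:
        # Tip: take the label assigned by Nextclade, if any.
        node["clade"] = tip_clades.get(node["name"])
        return node["clade"]

    child_clades = {assign_internal_clades(child, tip_clades) for child in children}
    node["clade"] = child_clades.pop() if len(child_clades) == 1 else None
    return node["clade"]


# Hypothetical tree and tip labels, for illustration only.
tree = {
    "name": "root",
    "children": [
        {"name": "node_1", "children": [{"name": "tip_A"}, {"name": "tip_B"}]},
        {"name": "tip_C"},
    ],
}
tip_clades = {"tip_A": "clade_X", "tip_B": "clade_X", "tip_C": "clade_Y"}

assign_internal_clades(tree, tip_clades)
print(tree["children"][0]["clade"], tree["clade"])
```

A real implementation would need to handle nested clades, where a parent clade contains a child clade, which this unanimous-agreement rule leaves unlabeled.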

In the long (medium?) term, we could run Nextclade during our "data upload to S3" workflow, upload the alignments and Nextclade annotations joined with metadata, and then start our workflows with those files. This approach would allow us to skip the alignment and clades steps of the current workflow and it would provide useful Nextclade data on S3 that we need for other analyses like flu frequencies, etc.
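The "Nextclade annotations joined with metadata" step could look roughly like the sketch below. It assumes the metadata TSV is keyed by a strain column and the Nextclade TSV by seqName with a clade column; the function name and sample rows are hypothetical, and a real workflow would likely use existing tooling rather than this hand-rolled join.

```python
import csv
import io


def join_metadata_with_nextclade(metadata_tsv, nextclade_tsv, columns=("clade",)):
    """Left-join selected Nextclade columns onto metadata records.

    Both arguments are file-like objects containing TSV text. Metadata rows
    with no matching Nextclade row get empty strings for the added columns.
    """
    nextclade_rows = {
        row["seqName"]: row
        for row in csv.DictReader(nextclade_tsv, delimiter="\t")
    }
    joined = []
    for record in csv.DictReader(metadata_tsv, delimiter="\t"):
        annotation = nextclade_rows.get(record["strain"], {})
        for column in columns:
            record[column] = annotation.get(column, "")
        joined.append(record)
    return joined


# Hypothetical inputs, for illustration only.
metadata = io.StringIO("strain\tdate\nstrain_A\t2023-01-01\nstrain_B\t2023-02-01\n")
nextclade = io.StringIO("seqName\tclade\nstrain_A\tclade_1\n")

records = join_metadata_with_nextclade(metadata, nextclade)
print(records)
```

Because the join is keyed on the strain name alone, every downstream tree would read the same clade label for a given sample, regardless of which other samples were subsampled alongside it.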

huddlej added the enhancement (New feature or request) label Nov 28, 2023

huddlej mentioned this issue Dec 1, 2023

WIP: New clades subcommand that works like traits, using labeled tips rather than clades.tsv nextstrain/augur#1329

Draft

huddlej mentioned this issue Feb 23, 2024

Download clade definitions from GitHub #154

Merged

1 task
