8-6-20
The trees in this release were generated with the following command line:
bash global_tree_gisaid.sh -i gisaid_hcov-19_2020_06_07_23.fasta -o global.fa -t 35 -k 100
The raw sequence file contains all SARS-CoV-2 genomes available in GISAID on the 8th of June 2020, at 9AM Canberra (Australia) time.
The ZIP file contains the code necessary to reproduce the trees, and the README in the ZIP file describes the methods in detail. I also include the trees themselves here so that they can be easily downloaded without downloading the entire repo.
Filtering statistics
- 41449 sequences downloaded
- Initial alignment retains 41274 sequences and 29750 sites
- After filtering gappy sites, alignment retains 29748 sites
- After filtering sequences on length and ambiguity, alignment retains 36286 sequences
- After removing sequences on long branches with TreeShrink, final tree retains 36257 sequences
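As a quick sanity check on the numbers above, the following sketch works out how many sequences each filtering step removed and what fraction of the downloaded sequences ends up in the final tree. It is not part of the release scripts; the step labels and variable names are my own shorthand, and only the counts are taken from the list above.

```python
# Sequence counts copied from the filtering statistics above.
# The step labels are informal shorthand, not names used by the scripts.
steps = [
    ("downloaded", 41449),
    ("initial alignment", 41274),
    ("length and ambiguity filter", 36286),
    ("TreeShrink long-branch removal", 36257),
]

# Sequences removed at each step, relative to the step before it.
removed = {
    name: prev_count - count
    for (_, prev_count), (name, count) in zip(steps, steps[1:])
}

for name, n in removed.items():
    print(f"{name}: {n} sequences removed")

# Overall loss from download to final tree.
total_removed = steps[0][1] - steps[-1][1]
retained_fraction = steps[-1][1] / steps[0][1]
print(f"total removed: {total_removed}")
print(f"retained in final tree: {retained_fraction:.1%}")
```

Most of the loss comes from the length and ambiguity filter; the initial alignment and TreeShrink steps each remove comparatively few sequences.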
Notable changes to the scripts in this release
None
Notable aspects of the trees
None