Data and scripts for "Long-read metabarcoding of the eukaryotic rDNA operon to phylogenetically and taxonomically resolve environmental diversity" (Jamy et al. 2019)
This repository holds the data and part of the analysis stack of the above-mentioned paper. It is structured as follows:
- data holds the MSAs for various combinations of query sequences (either full or constrained to V4) and reference sequences, as well as taxonomic information on the reference taxa and outgroup information
- trees contains the results of tree searches based on the aforementioned alignments
- jplace contains the results of phylogenetic placement; the first subdirectory level corresponds to the reference tree used, and the second to the query sequence MSA used
- assign contains taxonomic assignment results based either on the tree inferred from mixing references with queries (comprehensive_*) or on phylogenetic placement (the rest)
- leave1out contains the results of the leave-one-out tests
- visualization contains code and results for visualizations
- src contains source files used in the analysis
- preliminaries contains data from a preliminary study of the data, included solely for transparency
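
The two-level layout described for jplace (reference tree first, query MSA second) can be traversed programmatically. The following is only a minimal sketch: the function name index_jplace is hypothetical, and it assumes that placement files carry a .jplace extension, which this README does not state.

```python
from pathlib import Path
from typing import Dict, List, Tuple


def index_jplace(root: Path, suffix: str = ".jplace") -> Dict[Tuple[str, str], List[Path]]:
    """Group placement files by (reference tree, query MSA) directory names.

    Assumes the layout root/<reference tree>/<query MSA>/... and that
    placement files end with `suffix` (an assumption, not a documented fact).
    """
    index: Dict[Tuple[str, str], List[Path]] = {}
    if not root.is_dir():
        return index
    # First directory level: the reference tree that was used.
    for ref_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        # Second directory level: the query sequence MSA that was placed.
        for query_dir in sorted(p for p in ref_dir.iterdir() if p.is_dir()):
            files = sorted(query_dir.rglob("*" + suffix))
            if files:
                index[(ref_dir.name, query_dir.name)] = files
    return index


if __name__ == "__main__":
    for (ref, query), files in index_jplace(Path("jplace")).items():
        print(f"reference={ref} query={query} files={len(files)}")
```

Run from the base directory of the repository, this prints one line per combination of reference tree and query MSA for which at least one matching file was found.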
Additionally, the scripts to perform analysis and produce the results are located in the base directory.
V4 in this context means that the query alignments were masked to include only the V4 hypervariable region.
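
To illustrate what masking an alignment to a region means, here is a minimal sketch. It is not the procedure used in the paper: the function mask_alignment, the toy sequences, and the column coordinates are all hypothetical, and the sketch assumes that masking replaces every column outside the region with a gap character so that the alignment width is preserved.

```python
from typing import Dict


def mask_alignment(msa: Dict[str, str], start: int, end: int, gap: str = "-") -> Dict[str, str]:
    """Keep columns [start, end) and replace every other column with `gap`.

    `msa` maps sequence names to aligned sequences of equal length.
    The 0-based, half-open column range is a hypothetical stand-in for
    the coordinates of the V4 region in a real alignment.
    """
    masked = {}
    for name, seq in msa.items():
        if not 0 <= start <= end <= len(seq):
            raise ValueError(f"column range out of bounds for {name}")
        # Gaps before the region, the region itself, gaps after the region.
        masked[name] = gap * start + seq[start:end] + gap * (len(seq) - end)
    return masked


if __name__ == "__main__":
    toy = {"q1": "ACGTACGTAC", "q2": "AC-TACGTTC"}  # toy data, not from the repository
    for name, seq in mask_alignment(toy, 3, 7).items():
        print(name, seq)
```

Keeping the masked sequences at full alignment width means they stay column-compatible with the reference alignment, which is why the sketch pads with gaps rather than cropping.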
Jamy, M, Foster, R, Barbera, P, et al. Long‐read metabarcoding of the eukaryotic rDNA operon to phylogenetically and taxonomically resolve environmental diversity. Mol Ecol Resour. 2020; 20: 429–443. https://doi.org/10.1111/1755-0998.13117