This document describes the function of each file in the output directory of urban-weed-genomics. This folder contains all output files for the urban-weed-genomics project. File descriptions are found below.
This file contains output from Step 3 - Remove PCR Clones. It includes the sublibrary name (i.e., file), the number of input reads, the number of output reads that remain after PCR replicates were removed, and the number of PCR replicates that were identified.
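As a quick sanity check on this file, the share of reads removed as PCR replicates can be derived from the input and output read counts. The sketch below is illustrative only: the function name and the example counts are hypothetical and are not taken from the project's data.

```python
def clone_fraction(input_reads: int, output_reads: int) -> float:
    """Return the fraction of input reads that were removed as PCR replicates."""
    if input_reads <= 0:
        raise ValueError("input_reads must be positive")
    if not 0 <= output_reads <= input_reads:
        raise ValueError("output_reads must be between 0 and input_reads")
    return (input_reads - output_reads) / input_reads


# Made-up counts for a single sublibrary, for illustration only.
print(f"{clone_fraction(1_000_000, 800_000):.1%}")  # prints 20.0%
```

The number of replicates reported in the file should equal the difference between the input and output read counts, so a mismatch would be worth investigating.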
This file contains the list of samples that were used to create the Metapopulation catalog for each species. A subset of samples was chosen randomly from each population (i.e., city), drawing on samples that had a similar percent coverage and number of retained reads.
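The random draw per city can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not the project's actual selection script: the function name, the sample IDs, the city codes, and the fixed seed are all hypothetical, and the screening for similar coverage and retained reads is not modeled here.

```python
import random
from collections import defaultdict


def pick_per_city(samples, k, seed=0):
    """Randomly pick up to k sample IDs from each city.

    samples: iterable of (sample_id, city) pairs (hypothetical layout).
    Returns a dict mapping each city to a sorted list of chosen sample IDs.
    """
    rng = random.Random(seed)  # fixed seed so the draw is reproducible
    by_city = defaultdict(list)
    for sample_id, city in samples:
        by_city[city].append(sample_id)
    return {
        city: sorted(rng.sample(ids, min(k, len(ids))))
        for city, ids in sorted(by_city.items())
    }


# Made-up sample IDs and city codes, for illustration only.
example = [("s1", "BA"), ("s2", "BA"), ("s3", "BA"), ("s4", "LA"), ("s5", "LA")]
chosen = pick_per_city(example, k=2)
print({city: len(ids) for city, ids in chosen.items()})  # prints {'BA': 2, 'LA': 2}
```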
This file contains the results of Step 5a - Run denovo_map.sh, where iterations were run on a subset of samples to decide an optimal set of parameters to use throughout the stacks pipeline. Results from these iterations include:
no.SNPs = the number of SNPs identified using each iteration, and
no.r60.loci = the number of loci identified and shared across 60% of the samples.
The change in R60 loci, as depicted in this figure, was used to decide the optimal set of parameters for each species.
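To make the R60 definition concrete, the sketch below counts loci present in at least 60% of samples and then reports how that count changes from one iteration to the next. It only illustrates the definition and is not how Stacks computes these values. The function names, the locus and sample labels, the iteration labels, and the counts are hypothetical, and reading "shared across 60% of the samples" as "at least 60%" is an assumption.

```python
def count_r60_loci(locus_samples, n_samples, threshold=0.6):
    """Count loci found in at least `threshold` of the samples.

    locus_samples: dict mapping a locus ID to the set of samples carrying it.
    """
    return sum(
        1 for samples in locus_samples.values()
        if len(samples) / n_samples >= threshold
    )


def r60_changes(r60_by_iteration):
    """Return (label, change in R60 loci versus the previous iteration)."""
    return [
        (label, count - prev_count)
        for (_, prev_count), (label, count)
        in zip(r60_by_iteration, r60_by_iteration[1:])
    ]


# Made-up data: three loci scored across five samples.
loci = {
    "locus1": {"A", "B", "C", "D", "E"},  # 5/5 samples
    "locus2": {"A", "B", "C"},            # 3/5 samples
    "locus3": {"A", "B"},                 # 2/5 samples
}
print(count_r60_loci(loci, n_samples=5))  # prints 2

# Made-up R60 counts for three successive iterations.
print(r60_changes([("iter1", 100), ("iter2", 130), ("iter3", 135)]))
# prints [('iter2', 30), ('iter3', 5)]
```

A shrinking gain between successive iterations, as in the made-up numbers above, is the kind of pattern one would look for when judging where further parameter changes stop paying off.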
This file contains the optimal parameters used for each species to create catalogs and stacks.
The subset of samples for each species that was used to identify optimal parameters is listed here.
This file summarizes the number of samples, variant sites, etc. at the end of the populations module in Stacks.
Samples identified and discarded as part of Step 4d - Identify low-coverage and low-quality samples from are listed here.
Samples kept for downstream analysis after Step 4d - Identify low-coverage and low-quality samples from are listed here.
This file contains the library-wide statistics and output from Step 4c - Assess the raw, processed, and cleaned data.
This file contains the per-sample statistics and output from Step 4c - Assess the raw, processed, and cleaned data.
This file contains a list of samples that were discarded at the tsv2bam stage. Samples listed were not included in the species population map because they contained fewer than 300 sample loci matched to catalog loci.
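The 300-locus cutoff described above amounts to a simple threshold filter. The following is a minimal sketch of that rule, assuming the matched-locus counts are already available per sample. The function name, the dictionary layout, and the sample IDs and counts are hypothetical; only the threshold of 300 comes from the description.

```python
MIN_MATCHED_LOCI = 300  # cutoff stated in the file description


def split_by_matched_loci(matched_loci, minimum=MIN_MATCHED_LOCI):
    """Split samples into (kept, discarded) by number of matched loci.

    matched_loci: dict mapping a sample ID to its count of sample loci
    matched to catalog loci (hypothetical layout).
    """
    kept = sorted(s for s, n in matched_loci.items() if n >= minimum)
    discarded = sorted(s for s, n in matched_loci.items() if n < minimum)
    return kept, discarded


# Made-up sample IDs and counts, for illustration only.
kept, discarded = split_by_matched_loci({"s1": 5120, "s2": 299, "s3": 300})
print(kept)       # prints ['s1', 's3']
print(discarded)  # prints ['s2']
```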
Samples identified and discarded as part of Step 5b - Run ustacks are listed here.
Samples kept for downstream analysis after Step 5b - Run ustacks are listed here.