reference: https://docs.qiime2.org/2019.10/tutorials/
mkdir fastq_files
ref: https://github.com/qiime2/docs/blob/master/source/tutorials/importing.rst#fastq-manifest-formats
sample-id forward-absolute-filepath reverse-absolute-filepath
Con5090Ileum $PWD/fqgz/1183-1_S1_L001_R1_001.fastq.gz $PWD/fqgz/1183-1_S1_L001_R2_001.fastq.gz
Con8124Ileum $PWD/fqgz/1183-19_S19_L001_R1_001.fastq.gz $PWD/fqgz/1183-19_S19_L001_R2_001.fastq.gz
...
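Before importing, it can help to confirm that every file listed in the manifest actually exists. The following is a minimal sketch, not part of the QIIME 2 tooling: the function name check_manifest is hypothetical, and it assumes a tab-separated manifest with the three columns shown above, where paths may start with a literal $PWD.

```shell
# Sketch: count manifest entries whose FASTQ files do not exist.
# Assumes a tab-separated manifest with the three columns shown above.
check_manifest() (
  manifest=$1
  missing=0
  tab=$(printf '\t')
  while IFS="$tab" read -r sample fwd rev || [ -n "$sample" ]; do
    # Skip the header row, blank lines and comment lines.
    case $sample in
      '' | sample-id | '#'*) continue ;;
    esac
    for path in "$fwd" "$rev"; do
      # Expand a leading literal $PWD to the current directory.
      case $path in
        '$PWD'/*) path=$PWD/${path#'$PWD'/} ;;
      esac
      if [ ! -f "$path" ]; then
        echo "missing: $sample -> $path" >&2
        missing=$((missing + 1))
      fi
    done
  done < "$manifest"
  # Print the number of missing files on stdout.
  echo "$missing"
)

# Usage (run from the directory that contains fqgz/):
#   check_manifest job_manifest.tsv
```

A result of 0 means every forward and reverse path resolved to an existing file; missing paths are listed on stderr.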
sample-id GroupID treatment-group E.coliChallenge Sex Euth PigID Sourceofsample Datetaken NGS-SampleNo
#q2:types categorical categorical categorical categorical categorical categorical categorical categorical categorical
Con5090Ileum C Control NO M 1 5090 Ileum 1/7/2019 1
Con8124Ileum C Control NO F 2 8124 Ileum 5/7/2019 19
Con8141Ileum C Control NO M 2 8141 Ileum 5/7/2019 20
...
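Sample IDs in the metadata file should match those in the manifest. As a rough sketch, the awk snippet below lists manifest IDs that have no metadata row. The function name ids_missing_from_metadata and the metadata file name in the usage line are hypothetical, and both files are assumed to be tab-separated with sample-id in the first column.

```shell
# Sketch: print sample IDs that are in the manifest but not in the metadata.
# Assumes both files are tab-separated with the sample ID in column 1.
ids_missing_from_metadata() (
  manifest=$1
  metadata=$2
  awk -F'\t' '
    # Skip blank lines, the header row and directive/comment rows.
    $1 == "" || $1 == "sample-id" || $1 ~ /^#/ { next }
    # First file (metadata): remember every sample ID.
    NR == FNR { seen[$1] = 1; next }
    # Second file (manifest): report IDs not seen in the metadata.
    !($1 in seen) { print $1 }
  ' "$metadata" "$manifest"
)

# Usage (sample-metadata.tsv is a placeholder file name):
#   ids_missing_from_metadata job_manifest.tsv sample-metadata.tsv
```

No output means every manifest sample has a metadata row.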
sample-id GroupID
#q2:types categorical
C C
CR CR
CR-EC CR-EC
EC EC
We use the SILVA 132 naive Bayes classifier (99% OTUs, 515F/806R region): silva_132_99_V4/silva-132-99-515-806-nb-classifier.qza
qiime tools import \
--type 'SampleData[PairedEndSequencesWithQuality]' \
--input-path job_manifest.tsv \
--output-path paired-end-demux.qza \
--input-format PairedEndFastqManifestPhred33V2
The above command produces the paired-end-demux.qza file.
Trim the primers with cutadapt in QIIME 2:
qiime cutadapt trim-paired \
--i-demultiplexed-sequences paired-end-demux.qza \
--p-cores 20 \
--p-front-f GTGCCAGCMGCCGCGGTAA \
--p-front-r GGACTACHVGGGTWTCTAAT \
--o-trimmed-sequences pe_reads_cutadapt_trimmed.qza \
--verbose \
&> primer_trim.log
--p-cores sets the number of CPU cores to use. The above command produces the pe_reads_cutadapt_trimmed.qza file and writes the cutadapt output to primer_trim.log.
- Check the quality plots and sequence lengths
- Code in QIIME 2:
qiime demux summarize \
--i-data pe_reads_cutadapt_trimmed.qza \
--o-visualization pe_reads_cutadapt_trimmed.qzv
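Based on the plots in the .qzv file, decide which values to use for --p-trunc-len and --p-trim-left in DADA2 denoising (for paired-end reads these are --p-trunc-len-f/--p-trunc-len-r and --p-trim-left-f/--p-trim-left-r).
ref: https://docs.qiime2.org/2019.10/tutorials/atacama-soils/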
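When choosing truncation lengths for paired-end reads, the truncated forward and reverse reads must still overlap enough to be merged. The arithmetic can be sketched as below. Every number is a placeholder or an assumption rather than a recommendation: the truncation lengths are hypothetical, 253 nt is an assumed typical length for the 515F/806R V4 amplicon after primer removal, and 12 nt is assumed as the minimum overlap DADA2 needs for merging. The helper name read_overlap is also hypothetical.

```shell
# Sketch: check that truncated forward and reverse reads can still merge.
# All values below are placeholders / assumptions, not recommendations.
read_overlap() {
  # $1 = forward truncation length, $2 = reverse truncation length,
  # $3 = expected amplicon length after primer removal
  echo $(( $1 + $2 - $3 ))
}

TRUNC_F=200        # hypothetical value for --p-trunc-len-f
TRUNC_R=150        # hypothetical value for --p-trunc-len-r
AMPLICON_LEN=253   # assumed 515F/806R V4 amplicon length without primers
MIN_OVERLAP=12     # assumed minimum overlap required for merging

overlap=$(read_overlap "$TRUNC_F" "$TRUNC_R" "$AMPLICON_LEN")
if [ "$overlap" -ge "$MIN_OVERLAP" ]; then
  echo "overlap of $overlap nt: reads should be able to merge"
else
  echo "overlap of only $overlap nt: increase the truncation lengths"
fi
```

Amplicon length varies between taxa, so it is safer to leave a margin above the minimum than to truncate right down to it.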