Cleans metagenomic reads by removing adapters, low-quality bases, and host (e.g. human) contamination.
For human decontamination, download the pre-indexed hg38 genome here.
Step 1 (if indexing your own host genome):
bwa index host_genome.fa
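Indexing a large host genome can take a while, so it may be worth confirming that the index is complete before moving on. The sketch below relies on the fact that `bwa index` writes five files next to the FASTA (`.amb`, `.ann`, `.bwt`, `.pac`, `.sa`). The helper name `index_complete` is hypothetical and not part of metagen-fastqc.sh, and whether the script itself performs a similar check is not stated here.

```shell
# Hypothetical helper (not part of metagen-fastqc.sh): succeeds only if all
# five files written by `bwa index` exist and are non-empty next to the FASTA.
index_complete() {
    for ext in amb ann bwt pac sa; do
        if [ ! -s "$1.$ext" ]; then
            echo "missing index file: $1.$ext" >&2
            return 1
        fi
    done
    return 0
}

# Build the index only when bwa is installed, the FASTA is present,
# and the index is not already complete.
if command -v bwa >/dev/null 2>&1 && [ -f host_genome.fa ]; then
    index_complete host_genome.fa 2>/dev/null || bwa index host_genome.fa
fi
```

With this guard, re-running the setup does not repeat the indexing step when the index files are already in place.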
Step 2 (or Step 1 if using the above pre-indexed human genome):
metagen-fastqc.sh -t 8 -f input_1.fastq.gz -r input_2.fastq.gz -c host_genome.fa
Notes:
- -t controls the number of threads. Going above 8 does not significantly improve performance.
- Cleaned files are written to the same directory as the original FASTQ files and are suffixed with "_clean.fastq.gz".
- The -f argument takes either the forward-read file or a single-end FASTQ file (in the latter case, omit -r). For paired-end data, make sure the forward and reverse files end in _1.fastq.gz and _2.fastq.gz, respectively.
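The naming rule and the single-end case from the notes above can be sketched as a small pre-flight check. The functions `check_pair_names` and `build_cmd` are hypothetical helpers, not part of metagen-fastqc.sh. They only print the command line that would be run, using the -t, -f, -r and -c options described above.

```shell
# Hypothetical helper: returns 0 when the forward and reverse file names
# end in _1.fastq.gz and _2.fastq.gz, respectively.
check_pair_names() {
    case "$1" in *_1.fastq.gz) ;; *) return 1 ;; esac
    case "$2" in *_2.fastq.gz) ;; *) return 1 ;; esac
    return 0
}

# Hypothetical helper: prints the metagen-fastqc.sh command line.
# $1 = threads, $2 = host genome FASTA,
# $3 = forward or single-end FASTQ, $4 = reverse FASTQ (optional)
build_cmd() {
    if [ -z "${4:-}" ]; then
        # Single-end input: -r is omitted.
        printf 'metagen-fastqc.sh -t %s -f %s -c %s\n' "$1" "$3" "$2"
    elif check_pair_names "$3" "$4"; then
        printf 'metagen-fastqc.sh -t %s -f %s -r %s -c %s\n' "$1" "$3" "$4" "$2"
    else
        echo "error: paired files must end in _1.fastq.gz and _2.fastq.gz" >&2
        return 1
    fi
}

# Paired-end example, then single-end example.
build_cmd 8 host_genome.fa input_1.fastq.gz input_2.fastq.gz
build_cmd 8 host_genome.fa input.fastq.gz
```

Printing the command instead of executing it makes it easy to inspect before launching a long run. Once it looks right, the printed line can be run as is.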