Filtering host reads #5

Open
chrissy005 opened this issue Aug 8, 2024 · 1 comment

Comments

@chrissy005

Hello, I was trying the following commands, as you described, to filter out host sequences:

"host sequences:
mkdir host not_host
samtools fastq -F 3588 -f 65 output.bam | gzip -c > host/output_S_R1.fastq.gz
echo "R2 matching host genome:"
samtools fastq -F 3588 -f 129 output.bam | gzip -c > host/output_S_R2.fastq.gz

sequences that are not host:
samtools fastq -F 3584 -f 77 output.bam | gzip -c > not_host/output_S_R1.fastq.gz
samtools fastq -F 3584 -f 141 output.bam | gzip -c > not_host/output_S_R2.fastq.gz
samtools fastq -f 4 -F 1 output.bam | gzip -c > not_host/output_S_Singletons.fastq.gz"

I am new to samtools and do not understand the -F and -f options, or the integers that follow them. Do these determine which sequences are host and which are non-host?

@bartns

bartns commented Sep 30, 2024

From samtools fastq help:

  -f, --require-flags INT
               only include reads with all  of the FLAGs in INT present [0]
  -F, --excl[ude]-flags INT
               only include reads with none of the FLAGs in INT present [0x900]
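To make the help text concrete, here is a minimal sketch of the rule it describes: a read is kept when every bit in the -f value is set in its FLAG and no bit in the -F value is set. The helper name passes_filter and the example FLAG values 67 and 69 are made up for illustration; they are not part of samtools.

```shell
# passes_filter FLAG REQUIRE EXCLUDE
# Hypothetical helper: succeeds when a read with the given FLAG would be kept
# by "-f REQUIRE -F EXCLUDE", i.e. all required bits set and no excluded bit set.
passes_filter() {
    flag=$1 require=$2 exclude=$3
    [ $(( flag & require )) -eq "$require" ] && [ $(( flag & exclude )) -eq 0 ]
}

# A mapped, first-in-pair read (1 + 2 + 64 = 67) against "-f 65 -F 3588".
if passes_filter 67 65 3588; then echo "67: kept"; else echo "67: dropped"; fi

# The same read with the "unmapped" bit set (1 + 4 + 64 = 69): 3588 contains 4.
if passes_filter 69 65 3588; then echo "69: kept"; else echo "69: dropped"; fi
```

So yes, the integers do the sorting: assuming output.bam holds alignments against the host genome, reads that mapped end up in host/ and read pairs where neither mate mapped end up in not_host/.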

And this might help you out regarding the SAM flag values:

https://www.samformat.info/sam-format-flag
And I like to use this one:
https://broadinstitute.github.io/picard/explain-flags.html
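If you would rather decode the integers locally, the sketch below splits a FLAG value into its individual bits with plain shell arithmetic. The bit names follow the ones samtools uses for the SAM FLAG field; decode_flags is a hypothetical helper written for this thread, not a samtools command.

```shell
# decode_flags INT
# Hypothetical helper: print the names of the SAM FLAG bits set in INT.
# The names are listed in bit order: 1, 2, 4, 8, ..., 2048.
decode_flags() {
    value=$1
    bit=1
    out=""
    for name in PAIRED PROPER_PAIR UNMAP MUNMAP REVERSE MREVERSE \
                READ1 READ2 SECONDARY QCFAIL DUP SUPPLEMENTARY; do
        if [ $(( value & bit )) -ne 0 ]; then
            out="${out:+$out,}$name"
        fi
        bit=$(( bit * 2 ))
    done
    echo "$out"
}

# Decode every integer used in the commands from the question.
for flag in 65 129 77 141 3584 3588; do
    printf '%5d = %s\n' "$flag" "$(decode_flags "$flag")"
done
```

For example, 77 decodes to PAIRED,UNMAP,MUNMAP,READ1 (a first-in-pair read where neither mate mapped), and 3588 decodes to UNMAP,QCFAIL,DUP,SUPPLEMENTARY, which is what the host commands exclude.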

For the manual's sake, it might be nicer to use the full option names (--require-flags, --exclude-flags).
