I am currently using geNomad to analyze a dataset of 39,910 sequences. However, I’ve noticed discrepancies in the output files that I need clarification on:
The summary file contains only 38,449 rows.
The taxonomy file generated by the annotation module contains 39,888 rows.
Could you please help me understand why there is a difference in the number of rows between the input sequences and these output files? Specifically, I would like to know where and why the sequences might have been removed or filtered out.
Thank you for your assistance!
Best regards,
Fang
The summary files should only include sequences classified as viruses (<prefix>_virus_summary.tsv) or plasmids (<prefix>_plasmid_summary.tsv). Sequences not present in the summary were either not classified as viruses or plasmids, or they were classified but didn't pass the post-classification filters. These filters can be disabled by using the --relaxed flag.
The taxonomy file contains only sequences that were assigned to a taxon. Sequences missing from this file did not match any taxonomically informative markers. If you expected all sequences to match a marker, you can try increasing the search sensitivity (e.g., -s 7), but this will increase execution time and memory usage.
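To see exactly which sequences are absent from a given output file, one option is to compare the input FASTA identifiers against the first column of the TSV. The following is only a rough sketch, not part of geNomad: the function names are hypothetical, and it assumes that each TSV has a header row and holds the sequence name in its first column. It also assumes that provirus entries, if any, carry a suffix after a `|` separator; that stripping is optional and off by default, since your own identifiers may contain `|`.

```python
import csv


def fasta_ids(path):
    """Collect sequence IDs (first word of each '>' header) from a FASTA file."""
    ids = set()
    with open(path) as handle:
        for line in handle:
            if line.startswith(">"):
                header = line[1:].strip()
                if header:
                    ids.add(header.split()[0])
    return ids


def tsv_ids(path, provirus_sep=None):
    """Collect IDs from the first column of a TSV file that has a header row.

    Assumption: the first column is the sequence name. If provirus_sep is
    given (e.g. "|"), everything from that separator onward is dropped so
    that a provirus row maps back to its parent sequence.
    """
    ids = set()
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader, None)  # skip the header row
        for row in reader:
            if not row:
                continue
            name = row[0]
            if provirus_sep is not None:
                name = name.split(provirus_sep, 1)[0]
            ids.add(name)
    return ids


def missing_from(input_ids, output_ids):
    """Return the sorted input IDs that do not appear in the output file."""
    return sorted(input_ids - output_ids)
```

For example, `missing_from(fasta_ids("input.fna"), tsv_ids("<prefix>_virus_summary.tsv"))` would list the input sequences with no row in the virus summary (the file names here are placeholders for your own paths). Running the same comparison against the plasmid summary and the taxonomy file shows which sequences were left out of each one.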