Combining databases to predict is much less than separately #111

Open
1835033964 opened this issue Feb 15, 2023 · 0 comments

@1835033964

Thank you for hgtector; it has helped me a lot. However, I ran into a problem while using it and would like to ask about it.

These are the steps I followed:
First, I downloaded the microbe database and ran the hgtector search and hgtector analysis steps. 1140 HGT-derived genes were predicted (the output file is "analysis_dir/hgts/result.txt").
Then, I downloaded the plant database and ran the same hgtector search and hgtector analysis steps. 776 HGT-derived genes were predicted.
Finally, I combined the microbe and plant databases (I concatenated the two databases in FASTA format and built a new database with diamond) and ran the same steps again. This time, only 6 HGT-derived genes were predicted.
I ran hgtector with all default parameters. To build the combined database, I used the taxdump file downloaded by the "hgtector database" command (taxdump.tar.gz, 57.43 MB) and, as the taxonmap file, the "prot.accession2taxid" file provided by nr.
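
For reference, the merge step can be sketched roughly as below. This is only a minimal sketch under stated assumptions, not the exact commands that were run: the function name `merge_fasta` and the file names in the usage note are hypothetical, and it assumes plain-text protein FASTA files in which the sequence ID is the first whitespace-delimited token of each header line. Unlike a plain concatenation, it keeps only the first record for each sequence ID, so the merged file contains no duplicate IDs.

```python
def merge_fasta(inputs, output):
    """Concatenate FASTA files, keeping the first record seen for each sequence ID.

    Returns a tuple (records_written, duplicates_skipped).
    """
    seen = set()          # sequence IDs already written to the output
    written = skipped = 0
    with open(output, "w") as out:
        for path in inputs:
            keep = False  # ignore any stray lines before the first header of a file
            with open(path) as fh:
                for line in fh:
                    if line.startswith(">"):
                        tokens = line[1:].split()
                        seq_id = tokens[0] if tokens else ""
                        keep = seq_id not in seen
                        if keep:
                            seen.add(seq_id)
                            written += 1
                        else:
                            skipped += 1
                    if keep:
                        # Guarantee a trailing newline so records from
                        # different files never run together.
                        out.write(line if line.endswith("\n") else line + "\n")
    return written, skipped
```

With hypothetical file names, a call such as `merge_fasta(["microbe.faa", "plant.faa"], "merged.faa")` would produce the merged FASTA file, which is then passed to diamond together with the taxdump and taxonmap files mentioned above. A non-zero duplicate count would indicate that the two source databases share some sequence IDs.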

Is it expected that the combined database predicts far fewer HGT-derived genes than either database used separately, and what might be the cause? Thank you.
