formatdb and mgblast dependencies #36

Open
abretaud opened this issue Mar 17, 2017 · 4 comments

Comments

@abretaud

Hi,
@cmonjeau and I are trying to make a bioconda recipe for Transposome, but we are stuck on a strange dependency on formatdb and mgblast.
It looks like at some point these were replaced by makeblastdb and megablast (which we would prefer), but these changes somehow got reverted later.
Is Transposome meant to depend on formatdb+mgblast or makeblastdb+megablast?
In the first case, which version should be installed, and from where? (We saw a commit where mgblast was in the bin dir, but it got deleted later.)

@sestaton
Owner

I previously tried switching to megablast, but unfortunately the changes in the algorithms rendered the program of little use for this application. First, all the hits are now stored in memory, which results in very high memory usage (about 10X that of mgblast) for this type of application, where you expect a large number of similar hits. This is documented in the BLAST+ user manual. More importantly, the results are not the same. It is not trivial to find a set of parameters that makes the results comparable, and since this completely changes the interpretation, it was not something I pursued further.

These programs (mgblast and formatdb) will be installed when you type make install, so I would advise against breaking apart the build process. Running make test will tell you whether everything is set up correctly. The install instructions list the core dependencies, assuming you are on a new Linux image.
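
For anyone packaging this, here is a minimal sketch of a check that the two binaries named above are reachable after make install. It is not part of Transposome: the helper name `find_missing_tools` is hypothetical, and it assumes the binaries end up on the PATH, which this thread does not confirm.

```python
import shutil

# Tool names are taken from this thread; the helper itself is hypothetical
# and assumes the binaries are expected to be on the PATH.
REQUIRED_TOOLS = ("formatdb", "mgblast")


def find_missing_tools(tools=REQUIRED_TOOLS, path=None):
    """Return the names in `tools` that cannot be found on the search path."""
    return [name for name in tools if shutil.which(name, path=path) is None]


if __name__ == "__main__":
    missing = find_missing_tools()
    if missing:
        print("Missing from PATH:", ", ".join(missing))
    else:
        print("All required tools found on PATH.")
```

This only confirms that the executables can be located; make test remains the way to confirm that the setup actually works.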

I plan to release a Docker image for Transposome as soon as I get a chance but that's been delayed with work. What is the advantage for bioconda?

@abretaud
Author

OK, I understand better now, thanks for the answer!
If make install installs the binaries, we should be able to use it in the conda recipe (ping @cmonjeau).
As for the Docker image: once the conda recipe is pushed to bioconda, a Docker image is automatically built and made available on https://quay.io/organization/biocontainers (and https://biocontainers.pro).

@sestaton
Owner

I think make install should be all that's needed other than those few core packages (build-essential, lib32z1, and ncbi-blast+).

Let me know if I can help further. I would like to add this information about bioconda/docker to the documentation once it is available. Thanks for the interest.

@sestaton
Owner

Hi @abretaud and @cmonjeau,

I just want to note that I haven't forgotten about this topic. There is now a Docker image for Transposome, and the Dockerfile is in the root directory of the GitHub repository.

I'll look into creating a conda recipe for this tool next week, but any advice would be helpful since I have not done this before.

Thanks,
Evan
