Sketch from STDIN #65

Open
mihkelvaher opened this issue Jan 7, 2021 · 4 comments

Comments

@mihkelvaher

Is it possible to create a sketch using fastas streamed through a pipe to dashing?

I'm manipulating both assembled genomes and k-mers and would like to compare them multiple times at the end. I could write them to disk as an additional step, but given the high volume, that is really cumbersome.

Thanks.

@dnbaker
Owner

dnbaker commented Jan 7, 2021

Hi!

Sure, that's something that can be done. I'm in the process of adding an option -o to sketch that can take '/dev/stdout' or '-', which I'll probably finish later today. (The option is there, but it is currently broken.)

Thanks,

Daniel

@mihkelvaher
Author

That's great!
Also, can smaller sketches be compared with larger ones? For example, if I have some bacteria and human in the same database, do I need to scale up the bacterial sketches if I don't want to lose too much human info?

@dnbaker
Owner

dnbaker commented Jan 9, 2021

Hi Mihkel --

They can't be compared directly, but you can compress larger sketches by folding them in half repeatedly, if you will. I've added a new subcommand dashing fold which should compress a larger HLL into a smaller sketch.
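
To illustrate the folding idea, here is a minimal Python sketch; it is not dashing's actual code. It assumes a toy HyperLogLog layout in which the register index is the low p bits of a 64-bit hash and the register value is the leading-zero count of the full hash plus one. Under that assumption, halving the sketch is an element-wise max of its two halves. The names hash64, build_registers, and fold are hypothetical, and dashing's real hash function and register layout may differ.

```python
import hashlib


def hash64(item: str) -> int:
    """Deterministic 64-bit hash of a string (illustrative choice, not dashing's hash)."""
    digest = hashlib.blake2b(item.encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def build_registers(items, p):
    """Build a toy HLL register array with 2**p registers.

    Assumed layout: index = low p bits of the hash,
    value = leading zeros of the full 64-bit hash + 1.
    """
    m = 1 << p
    regs = [0] * m
    for item in items:
        h = hash64(item)
        idx = h & (m - 1)
        rank = 64 - h.bit_length() + 1
        regs[idx] = max(regs[idx], rank)
    return regs


def fold(regs):
    """Halve a register array: registers i and i + m/2 share the same low p-1 bits."""
    half = len(regs) // 2
    return [max(regs[i], regs[i + half]) for i in range(half)]


# Folding a 64-register sketch once gives the sketch that 32 registers
# would have produced directly from the same input.
kmers = [f"kmer{i}" for i in range(1000)]
large = build_registers(kmers, 6)
small = build_registers(kmers, 5)
print(fold(large) == small)
```

Once both sketches have the same number of registers, they can be compared register by register; the cost is that the larger sketch gives up the extra precision it had.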

The above issue about directing sketch output to a stream has been fixed, and once the new binaries finish compiling they should be ready to use. I'll let you know when they're available.

Thanks,

Daniel

@dnbaker dnbaker mentioned this issue Jan 10, 2021
@mihkelvaher
Author

Hi Daniel,

The pull request suggested that the option is now available on main.

Unfortunately, I don't know how to use it and -o isn't listed in dashing sketch -h.
What I tried (and similar):
cat test.fasta | dashing sketch - -o test.sketch

Sketching the old way, I also gave fold a try, but got:
[src/main.cpp:int main(int, char**)56] Invalid subcommand fold provided.

I'm guessing I've got the wrong version (v0.5-9-ge6ae), but the pull request was also merged to main/master...

Thanks.
