Should we support the OPAL format as output? #113
Comments
Might be reproducing taxonkit profile2cami? https://bioinf.shenwei.me/taxonkit/usage/#profile2cami
Some of those commands look really nice, I didn't know of that tool, thanks @mattheatley! Agreed, maybe not necessary if it's supported elsewhere? We should consider whether our output is compatible as input to
So you're pretty much already there with the standard taxpasta output, but instead of the two columns (taxid & count) you'd need to provide taxid & abundance (i.e. percentage), and that's the input required by taxonkit. Tbh it's probably more useful to also have the counts in general, so maybe just provide the abundances as an extra output.
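A minimal sketch of the counts-to-percentage step described above, using only the Python standard library. The function names and the example counts are hypothetical, this is not taxpasta code, and the exact layout taxonkit expects (header or not, column order, separator) is an assumption that should be checked against the taxonkit documentation.

```python
import csv
import io


def counts_to_percentages(rows):
    """Convert (taxid, count) pairs to (taxid, percentage) pairs."""
    total = sum(count for _, count in rows)
    if total == 0:
        raise ValueError("total count is zero; cannot compute abundances")
    return [(taxid, 100.0 * count / total) for taxid, count in rows]


def write_tsv(rows, handle):
    """Write two tab-separated columns (taxid, abundance), no header (assumed)."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    for taxid, abundance in rows:
        writer.writerow([taxid, f"{abundance:.6f}"])


# Made-up example counts keyed by taxonomy ID.
example = [(562, 750), (1280, 200), (1423, 50)]
buffer = io.StringIO()
write_tsv(counts_to_percentages(example), buffer)
print(buffer.getvalue(), end="")
```

Keeping the counts table next to the derived abundance table preserves the information that is lost once only percentages are reported.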
Raw counts are 'sequence' abundance anyway, so maybe we are already there then?
But it could be good to test if the two tools are compatible; then we could update the docs to point people to taxonkit :)
I think maybe they are talking about proportion vs percentage and not counts, but not totally sure.
I have been considering an option to report fractions instead of counts from taxpasta for quite some time. So it seems that small change would already make the output compatible with taxonkit's profile2cami.
Maybe don’t do away with counts altogether though? I actually find it more useful to have them instead because you can’t convert backwards to counts from abundances. An additional output would be great though. At the moment I convert taxpasta outputs to abundances and then to cami so this would cut out a stage. But there can be rounding issues so maybe calculate them using decimals?
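To illustrate the rounding concern, here is a small sketch that computes percentages with Python's `decimal` module and shows how rounding each value early makes the column stop summing to 100. The helper name and the counts are made up for illustration; nothing here reflects how taxpasta or taxonkit handle precision internally.

```python
from decimal import Decimal, getcontext

# 28 significant digits is the module default; set here to make it explicit.
getcontext().prec = 28


def counts_to_decimal_percentages(counts):
    """Map taxid -> percentage as Decimal, computed from integer counts."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("total count is zero; cannot compute abundances")
    return {
        taxid: Decimal(count) * 100 / Decimal(total)
        for taxid, count in counts.items()
    }


# Three equally abundant taxa: each is one third of the profile.
counts = {562: 1, 1280: 1, 1423: 1}
percentages = counts_to_decimal_percentages(counts)

# Rounding every value to two places loses 0.01 percentage points overall.
rounded = {taxid: round(value, 2) for taxid, value in percentages.items()}
print(sum(rounded.values()))
```

Carrying full precision until the final write step, and keeping the original counts alongside, avoids compounding this kind of error.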
By the way, @mattheatley, I don't know if this is clear enough from the documentation: Some of the original profiler output is actually given as fractions, which we multiply with a big number in order to obtain integers. So in those cases, it would be more faithful to the original result to only report fractions.
One major issue atm is that only leaf counts are supported by taxonkit: shenwei356/taxonkit#99 (comment)
In order to use the OPAL tool for analysis and visualization, it might be useful to convert the output of any supported profiler to that format.