Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

hifiasm2 allow excluding --primary #6484

Open
wants to merge 7 commits into
base: main
Choose a base branch
from

Conversation

alpapan
Copy link

@alpapan alpapan commented Oct 23, 2024

the galaxy tool forced --primary even though it isn't on by the software's default setting. It also prevented from removing it. This patch allows users to use the hifiasm2 default behaviour of excluding --primary. Patch sets as true to maintain the galaxy default.

FOR CONTRIBUTOR:

  • I have read the CONTRIBUTING.md document and this tool is appropriate for the tools-iuc repo.
  • License permits unrestricted use (educational + commercial)
  • This PR adds a new tool or tool collection
  • This PR updates an existing tool or tool collection
  • This PR does something else (explain below)

the galaxy tool forced --primary even though it isn't on by the software's default setting. It also prevented from removing it. This patch allows users to use the hifiasm2 default behaviour of excluding --primary. Patch sets as true to maintain the galaxy default.
Copy link
Member

@bgruening bgruening left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please bump the version. Thanks a lot!

@alpapan
Copy link
Author

alpapan commented Oct 23, 2024

i honestly don't know how!

(i have two patch pull requests, i kept them separate so they can be considered independently)

i think that's correct? never done it before
Copy link
Contributor

@bernt-matthias bernt-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also suggest to have only one PR, i.e. include #6483 here. Alteratively do not increase the version suffix in one of them and skip CI?

@@ -216,6 +218,7 @@
</param>
<when value="blank"/>
<when value="set">
<param name="primary" type="boolean" truevalue="--primary" falsevalue="" checked="true" label="Create primary and alternate assemblies" help="Hifiasm outputs a primary assembly and two balanced haplotypes by default. This will output a primary assembly and alternate assembly only (--primary)."/>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
<param name="primary" type="boolean" truevalue="--primary" falsevalue="" checked="true" label="Create primary and alternate assemblies" help="Hifiasm outputs a primary assembly and two balanced haplotypes by default. This will output a primary assembly and alternate assembly only (--primary)."/>
<param argument="--primary" type="boolean" truevalue="--primary" falsevalue="" checked="true" label="Create primary and alternate assemblies" help="Hifiasm outputs a primary assembly and two balanced haplotypes by default. This will output a primary assembly and alternate assembly only"/>

@alpapan alpapan mentioned this pull request Oct 24, 2024
4 tasks
integrates patch-1 of galaxyproject#6483; adds --primary and correctly handles output files; left comment on whether argument= or name= ought to be used. can write tests if code review is +ve
@alpapan
Copy link
Author

alpapan commented Oct 24, 2024

  1. integrated fix from fix hifiasm2 -u option #6483

  2. adds --primary and correctly handles output files. tested with all three use cases
    -- Assembly options blank
    -- Assembly options set and primary set
    -- Assembly options set and primary not set

  3. left comment on whether argument= or name= ought to be used. please advise

  4. added documentation links - not sure if it is the correct way, please advise!

  5. can write tests if this passes code review
    (it would be my first time doing it so rather wait until code is a-ok!)

fixed lint errors. can't find why it complains about <when>
removed remnant from prev commit
@alpapan
Copy link
Author

alpapan commented Oct 24, 2024

no idea why lint fails for when (i don't think i added any code there)

@alpapan
Copy link
Author

alpapan commented Oct 24, 2024

no idea why lint fails for when (i don't think i added any code there)

ok it was my comment. fixed and provide it here:
#AP temporary: comment tested to work with name, not sure about argument because OUTPUT files are different too

Copy link
Contributor

@bernt-matthias bernt-matthias left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

3. left comment on whether argument= or name= ought to be used. please advise

Argument would be preferred: https://galaxy-iuc-standards.readthedocs.io/en/latest/best_practices/tool_xml.html#parameter-name-argument-and-help. It will create a variable with the same name as before (i.e. dashes are stripped automatically) and add the argument in parentheses to the end of the help (which is then nicely consistent between tools).

@@ -320,23 +327,37 @@
</inputs>
<outputs>
<!--Standard mode-->
<data name="raw_unitigs" format="gfa1" from_work_dir="output.r_utg.gfa" label="${tool.name} on ${on_string}: haplotype-resolved raw unitig graph for pseudohaplotype assembly">
<filter>mode['mode_selector'] == 'standard' and hic_partition['hic_partition_selector'] == 'blank'</filter>
<data name="raw_unitigs" format="gfa1" from_work_dir="*.r_utg.gfa" label="${tool.name} on ${on_string}: haplotype-resolved raw unitig graph for pseudohaplotype assembly">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wondering if the *.r_utg.gfa is correct here. Seems to conflict with:

<data name="raw_unitigs_trio" format="gfa1" from_work_dir="output.dip.r_utg.gfa"...>

The prefix should be fix according to the docs. So I guess this should be reverted to output.r_utg.gfa

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i can fix that

<data name="raw_unitigs" format="gfa1" from_work_dir="output.r_utg.gfa" label="${tool.name} on ${on_string}: haplotype-resolved raw unitig graph for pseudohaplotype assembly">
<filter>mode['mode_selector'] == 'standard' and hic_partition['hic_partition_selector'] == 'blank'</filter>
<data name="raw_unitigs" format="gfa1" from_work_dir="*.r_utg.gfa" label="${tool.name} on ${on_string}: haplotype-resolved raw unitig graph for pseudohaplotype assembly">
<filter>mode['mode_selector'] == 'standard' and hic_partition['hic_partition_selector'] == 'blank'</filter>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just an idea: If you have a filter <filter>A and B</filter>. Then you could rewrite this as:

<filter>A</filter>
<filter>B</filter>

Might make all this more readable.

<data name="primary_contig_graph" format="gfa1" from_work_dir="output.p_ctg.gfa" label="${tool.name} on ${on_string}: primary assembly contig graph for pseudohaplotype assembly">
<filter>mode['mode_selector'] == 'standard' and hic_partition['hic_partition_selector'] == 'blank'</filter>
<filter>mode['mode_selector'] == 'standard' and hic_partition['hic_partition_selector'] == 'blank' and (assembly_options['assembly_selector'] == 'blank' or (assembly_options['assembly_selector'] == 'set' and assembly_options['primary']))</filter>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So what you are doing here is to state that output.p_ctg.gfa is produced only if --primary is set (which is the default). According to the docs it seems to be even slightly more complicated (but I'm not sure if it really needs to be done here :) ).

--primary

    Output a primary assembly and an alternate assembly. Enable this option or -l0 outputs a primary assembly and an alternate assembly.

It appears to me that analogous changes should be done for all outputs. But feel free not to do so in this PR.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would you read this doc to mean that -l0 is the same as --primary? or does it mean that you should use --primary or ("otherwise") -l0 will output a primary and alt assembly?

i think the former (-l0 or --primary produce primary and alt assemblies)?

</data>

<!--Primary off mode-->
<data name="main_contig_graph" format="gfa1" from_work_dir="output.bp.p_ctg.gfa" label="${tool.name} on ${on_string}: main assembly contig graph for pseudohaplotype assembly">
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The three new outputs should be covered by tests.
Do you know what the statement from the docs means: "Hifiasm generates the following assembly graphs only with HiFi reads in default:" Can we check this in the filters?

Overall I would like to see all outputs covered by at least one test (but again, feel free not to do so here).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'm not sure but every time i use it, that output file is identical to the haplo1 output. the tests will help to figure it out

i can do the tests, just need to find a bit of time to learn how as i 've never done it before. hopefully next week!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alpapan please let us know if you need any help

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants