Improve DAG rendering #4070

Merged
merged 14 commits into from
Sep 17, 2023

bentsherman
Member

@bentsherman bentsherman commented Jul 2, 2023

Closes #1056, #1543, #3203

  • Render subworkflows as subgraphs
  • Add subgraphs for workflow inputs and outputs
  • Add dag.verbose option to toggle operators (disabled by default)
  • Add dag.depth option to control the level of detail

Notes:

  • I used wrapper classes around DAG.Vertex so that I could store the inputs and outputs for each node. The DAG class has all edges in a list instead of a hash map, so not ideal for graph traversal. Maybe the DAG class could do this from the beginning, but I haven't looked into it yet.

  • I used the fully qualified process names to construct a "node tree" that encodes information about subworkflows. Since operators aren't associated to any subworkflow in the DAG, I try to infer it based on its neighboring processes. Maybe the DAG could save this information when the operator is added, but the heuristic works pretty well.

  • Workflow inputs and outputs are not annotated very well. I can infer some inputs from the outgoing channel name, but the output nodes are always empty. Might be able to grab the output name from the process/workflow emit for now.

  • Large pipelines like rnaseq and sarek tend to have lots of little outputs which makes the diagram super long. It would be nice to condense this somehow while still showing something for the workflow outputs.

  • Currently specific to Mermaid, but I should be able to make it work for all output formats by moving the pre-rendering logic to DagRenderer. Then each renderer will need to operate on the "node tree" instead of the DAG.
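
The first two notes can be sketched roughly as follows. This is a minimal Python illustration, not the code from the PR (Nextflow itself is written in Groovy). The `Node` wrapper, the `index_edges` and `infer_operator_path` helpers, the `:`-separated qualified names, and the "common prefix of the nearest processes" rule are all assumptions made for the example; the notes only say that the operator's subworkflow is inferred from neighboring processes.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical wrapper around a DAG vertex that also records its neighbours."""
    id: str
    label: str   # fully qualified name for processes, operator name otherwise
    kind: str    # "process" or "operator"
    inputs: set = field(default_factory=set)
    outputs: set = field(default_factory=set)


def index_edges(vertices, edges):
    """Turn a flat edge list into per-node input/output sets in a single pass."""
    nodes = {vid: Node(vid, label, kind) for vid, label, kind in vertices}
    for src, dst in edges:
        nodes[src].outputs.add(dst)
        nodes[dst].inputs.add(src)
    return nodes


def workflow_path(node):
    """Subworkflow path of a process: its qualified name minus the last part."""
    return tuple(node.label.split(":")[:-1])


def common_prefix(paths):
    """Longest shared leading run of components across all paths."""
    prefix = []
    for parts in zip(*paths):
        if len(set(parts)) != 1:
            break
        prefix.append(parts[0])
    return tuple(prefix)


def infer_operator_path(nodes, op_id):
    """Guess an operator's subworkflow from the nearest processes around it.

    Walks through adjacent operators in both directions and stops at the
    first process reached on each branch (assumed heuristic).
    """
    seen, frontier, paths = {op_id}, [op_id], []
    while frontier:
        current = nodes[frontier.pop()]
        for nid in current.inputs | current.outputs:
            if nid in seen:
                continue
            seen.add(nid)
            neighbour = nodes[nid]
            if neighbour.kind == "process":
                paths.append(workflow_path(neighbour))
            else:
                frontier.append(nid)
    return common_prefix(paths) if paths else ()
```

With the inputs and outputs stored on each node, the traversal never has to rescan the full edge list, which is the motivation given in the first note.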

Testing:

Run any pipeline with -preview, for example:

../launch.sh run nf-core/fetchngs -profile test --outdir results -preview -with-dag

Use dag.depth and dag.verbose to control the level of detail. For large pipelines like rnaseq or sarek, I recommend that you keep verbose=false and limit the depth to 2 or 3.
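
To make the effect of dag.depth concrete, here is a small Python sketch of how a depth limit could map a fully qualified process name onto enclosing subgraphs and a visible node. The `place_node` function and the `:`-separated names are assumptions for illustration; the rule is only inferred from the nesting of the fetchngs diagrams in this PR, not taken from the implementation.

```python
def place_node(qualified_name, depth=-1):
    """Return (subgraphs, node_label) for a process at a given depth limit.

    Assumed rule: the first `depth` name components become nested subgraphs
    and the next component is drawn as a single node. A negative depth, or a
    depth beyond the name's length, shows the process at its full depth.
    """
    parts = qualified_name.split(":")
    if depth < 0 or depth >= len(parts):
        return parts[:-1], parts[-1]
    return parts[:depth], parts[depth]


# Hypothetical qualified name, consistent with the subgraph nesting shown
# in the depth=3 diagram.
name = "NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:SRATOOLS_PREFETCH"

for depth in (0, 2, 3, -1):
    subgraphs, node = place_node(name, depth)
    print(depth, subgraphs, node)
```

Under this reading, depth=0 collapses everything into the top-level workflow node, and depth=2 draws the nested subworkflow as one node inside the SRA subgraph.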

@netlify

netlify bot commented Jul 2, 2023

Deploy Preview for nextflow-docs-staging ready!

Name Link
🔨 Latest commit 7708889
🔍 Latest deploy log https://app.netlify.com/sites/nextflow-docs-staging/deploys/65071e27fe8c00000741988b
😎 Deploy Preview https://deploy-preview-4070--nextflow-docs-staging.netlify.app

@bentsherman
Member Author

bentsherman commented Jul 2, 2023

Here are some examples with nf-core/fetchngs.

depth=0, verbose=false

flowchart TD
    subgraph " "
    v0[Channel.from]
    v5[fields]
    v27[certificate]
    v31[certificate]
    v39[pipeline]
    v40[strandedness]
    v41[mapping_fields]
    end
    v6([NFCORE_FETCHNGS])
    subgraph " "
    v20[ ]
    v46[ ]
    v49[ ]
    v54[ ]
    v55[ ]
    v56[ ]
    end
    v0 --> v6
    v5 --> v6
    v6 --> v20
    v27 --> v6
    v31 --> v6
    v39 --> v6
    v40 --> v6
    v41 --> v6
    v6 --> v46
    v6 --> v49
    v6 --> v54
    v6 --> v55
    v6 --> v56

depth=2, verbose=false

flowchart TD
    subgraph " "
    v0[Channel.from]
    v5[fields]
    v27[certificate]
    v31[certificate]
    v39[pipeline]
    v40[strandedness]
    v41[mapping_fields]
    end
    subgraph NFCORE_FETCHNGS
    subgraph SRA
    v6([SRA_IDS_TO_RUNINFO])
    v9([SRA_RUNINFO_TO_FTP])
    v19([SRA_FASTQ_FTP])
    v25([FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS])
    v42([SRA_TO_SAMPLESHEET])
    v45([SRA_MERGE_SAMPLESHEET])
    v48([MULTIQC_MAPPINGS_CONFIG])
    v53([CUSTOM_DUMPSOFTWAREVERSIONS])
    v1(( ))
    v7(( ))
    v12(( ))
    v37(( ))
    v43(( ))
    v44(( ))
    end
    end
    subgraph " "
    v20[ ]
    v46[ ]
    v49[ ]
    v54[ ]
    v55[ ]
    v56[ ]
    end
    v0 --> v1
    v5 --> v6
    v1 --> v6
    v6 --> v9
    v6 --> v7
    v9 --> v7
    v9 --> v12
    v12 --> v19
    v19 --> v20
    v19 --> v7
    v19 --> v37
    v27 --> v25
    v31 --> v25
    v39 --> v42
    v40 --> v42
    v41 --> v42
    v37 --> v42
    v42 --> v43
    v42 --> v44
    v43 --> v45
    v44 --> v45
    v45 --> v46
    v45 --> v48
    v45 --> v7
    v48 --> v49
    v48 --> v7
    v7 --> v53
    v53 --> v56
    v53 --> v55
    v53 --> v54
    v25 --> v7
    v12 --> v25
    v25 --> v37

depth=3, verbose=true

flowchart TD
    subgraph " "
    v0[Channel.from]
    v4[Channel.empty]
    v5[fields]
    v24[Channel.empty]
    v27[certificate]
    v31[certificate]
    v39[pipeline]
    v40[strandedness]
    v41[mapping_fields]
    end
    subgraph NFCORE_FETCHNGS
    subgraph SRA
    v1([splitCsv])
    v2([map])
    v3([unique])
    v6([SRA_IDS_TO_RUNINFO])
    v7([first])
    v8([mix])
    v9([SRA_RUNINFO_TO_FTP])
    v10([first])
    v11([mix])
    v12([splitCsv])
    v13([map])
    v14([unique])
    v15([first])
    v16([mix])
    v17([map])
    v18([branch])
    v19([SRA_FASTQ_FTP])
    v21([first])
    v22([mix])
    v23([map])
    subgraph FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS
    v25([CUSTOM_SRATOOLSNCBISETTINGS])
    v28([SRATOOLS_PREFETCH])
    v32([SRATOOLS_FASTERQDUMP])
    end
    v26([mix])
    v29([first])
    v30([mix])
    v33([first])
    v34([mix])
    v35([first])
    v36([mix])
    v37([mix])
    v38([map])
    v42([SRA_TO_SAMPLESHEET])
    v43([collect])
    v44([collect])
    v45([SRA_MERGE_SAMPLESHEET])
    v47([mix])
    v48([MULTIQC_MAPPINGS_CONFIG])
    v50([mix])
    v51([unique])
    v52([collectFile])
    v53([CUSTOM_DUMPSOFTWAREVERSIONS])
    end
    end
    subgraph " "
    v20[ ]
    v46[ ]
    v49[ ]
    v54[ ]
    v55[ ]
    v56[ ]
    end
    v0 --> v1
    v1 --> v2
    v2 --> v3
    v3 -->|ids| v6
    v4 -->|ch_versions| v8
    v5 -->|fields| v6
    v6 --> v9
    v6 --> v7
    v7 --> v8
    v8 -->|ch_versions| v11
    v9 --> v12
    v9 --> v10
    v9 --> v15
    v10 --> v11
    v11 -->|ch_versions| v16
    v12 --> v13
    v13 --> v14
    v14 -->|ch_sra_metadata| v17
    v15 --> v16
    v16 -->|ch_versions| v22
    v17 --> v18
    v18 --> v23
    v18 --> v19
    v19 --> v37
    v19 --> v20
    v19 --> v21
    v21 --> v22
    v22 -->|ch_versions| v36
    v23 -->|ch_sra_ids| v28
    v24 -->|ch_versions| v26
    v25 -->|ch_ncbi_settings| v28
    v25 --> v26
    v25 -->|ch_ncbi_settings| v32
    v26 -->|ch_versions| v30
    v27 -->|certificate| v28
    v28 --> v32
    v28 --> v29
    v29 --> v30
    v30 -->|ch_versions| v34
    v31 -->|certificate| v32
    v32 -->|reads| v37
    v32 --> v33
    v33 --> v34
    v34 -->|versions| v35
    v35 --> v36
    v36 -->|ch_versions| v47
    v37 --> v38
    v38 -->|ch_sra_metadata| v42
    v39 -->|pipeline| v42
    v40 -->|strandedness| v42
    v41 -->|mapping_fields| v42
    v42 --> v43
    v42 --> v44
    v43 --> v45
    v44 --> v45
    v45 --> v46
    v45 --> v48
    v45 --> v47
    v47 -->|ch_versions| v50
    v48 --> v49
    v48 --> v50
    v50 -->|ch_versions| v51
    v51 --> v52
    v52 --> v53
    v53 --> v56
    v53 --> v55
    v53 --> v54

@mribeirodantas
Member

Very nice to see the way this is heading 🤩

@pditommaso
Member

pditommaso commented Jul 4, 2023

Wow, this is cool. Is it going to be the default rendering?

@bentsherman
Member Author

Okay, I think we should merge it as is. There are still many things that can be done, but they would be better as separate efforts. In particular, improving the input and output labels is likely related to having better input/output definitions for the workflow, so it is probably best to wait and see.

Summary of future work:

  • Extend support to other diagram formats
  • Add hyperlinks to processes, operators, etc
  • Add interactive expand/collapse to workflow blocks
  • Show better labels for inputs and outputs

Also:

"Is it going to be the default rendering?"

The default settings are verbose = false and depth = -1, so it will hide channel names and operators and it will render the full depth of workflows.
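
Comparing the verbose and non-verbose fetchngs diagrams in this thread, hiding operators appears to merge each connected run of operators into a single unlabeled node (for example, splitCsv, map, and unique become one blank circle). The Python sketch below illustrates that reading with a small union-find. The `collapse_operators` function, the integer vertex ids, and the choice of the smallest id as representative are assumptions, and the sketch does not attempt to hide inputs such as Channel.empty, which the non-verbose diagrams also omit.

```python
def collapse_operators(kinds, edges):
    """Merge connected operators into one node and rewrite the edge list.

    kinds: dict mapping vertex id (int) to "operator" or any other kind.
    edges: list of (src, dst) pairs.
    Returns the rewritten edges, deduplicated, in first-seen order.
    """
    # Each operator starts as its own group.
    parent = {v: v for v, kind in kinds.items() if kind == "operator"}

    def find(v):
        # Find the group representative, halving the path as we go.
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    # Union operators that are directly connected to each other.
    for src, dst in edges:
        if src in parent and dst in parent:
            a, b = find(src), find(dst)
            if a != b:
                parent[max(a, b)] = min(a, b)  # smallest id represents the group

    def rep(v):
        return find(v) if v in parent else v

    collapsed = []
    for src, dst in edges:
        edge = (rep(src), rep(dst))
        # Drop edges that became self-loops and skip duplicates.
        if edge[0] != edge[1] and edge not in collapsed:
            collapsed.append(edge)
    return collapsed


# Excerpt of the verbose fetchngs diagram: v0 -> splitCsv -> map -> unique -> v6
kinds = {0: "input", 1: "operator", 2: "operator", 3: "operator",
         5: "input", 6: "process"}
edges = [(0, 1), (1, 2), (2, 3), (3, 6), (5, 6)]
print(collapse_operators(kinds, edges))
```

For this excerpt the rewritten edges are v0 to v1, v1 to v6, and v5 to v6, which matches the corresponding edges in the depth=2, verbose=false diagram.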

@pditommaso pditommaso merged commit 19587f4 into master Sep 17, 2023
19 checks passed
@pditommaso pditommaso deleted the 1056-improve-dag-rendering branch September 17, 2023 16:12
@pditommaso
Member

Excellent 👍

abhi18av pushed a commit to abhi18av/nextflow that referenced this pull request Oct 28, 2023

Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>