Improve DAG rendering #4070
Signed-off-by: Ben Sherman <[email protected]>
Here are some examples with depth=0, verbose=false:
flowchart TD
subgraph " "
v0[Channel.from]
v5[fields]
v27[certificate]
v31[certificate]
v39[pipeline]
v40[strandedness]
v41[mapping_fields]
end
v6([NFCORE_FETCHNGS])
subgraph " "
v20[ ]
v46[ ]
v49[ ]
v54[ ]
v55[ ]
v56[ ]
end
v0 --> v6
v5 --> v6
v6 --> v20
v27 --> v6
v31 --> v6
v39 --> v6
v40 --> v6
v41 --> v6
v6 --> v46
v6 --> v49
v6 --> v54
v6 --> v55
v6 --> v56
depth=2, verbose=false:
flowchart TD
subgraph " "
v0[Channel.from]
v5[fields]
v27[certificate]
v31[certificate]
v39[pipeline]
v40[strandedness]
v41[mapping_fields]
end
subgraph NFCORE_FETCHNGS
subgraph SRA
v6([SRA_IDS_TO_RUNINFO])
v9([SRA_RUNINFO_TO_FTP])
v19([SRA_FASTQ_FTP])
v25([FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS])
v42([SRA_TO_SAMPLESHEET])
v45([SRA_MERGE_SAMPLESHEET])
v48([MULTIQC_MAPPINGS_CONFIG])
v53([CUSTOM_DUMPSOFTWAREVERSIONS])
v1(( ))
v7(( ))
v12(( ))
v37(( ))
v43(( ))
v44(( ))
end
end
subgraph " "
v20[ ]
v46[ ]
v49[ ]
v54[ ]
v55[ ]
v56[ ]
end
v0 --> v1
v5 --> v6
v1 --> v6
v6 --> v9
v6 --> v7
v9 --> v7
v9 --> v12
v12 --> v19
v19 --> v20
v19 --> v7
v19 --> v37
v27 --> v25
v31 --> v25
v39 --> v42
v40 --> v42
v41 --> v42
v37 --> v42
v42 --> v43
v42 --> v44
v43 --> v45
v44 --> v45
v45 --> v46
v45 --> v48
v45 --> v7
v48 --> v49
v48 --> v7
v7 --> v53
v53 --> v56
v53 --> v55
v53 --> v54
v25 --> v7
v12 --> v25
v25 --> v37
depth=3, verbose=true:
flowchart TD
subgraph " "
v0[Channel.from]
v4[Channel.empty]
v5[fields]
v24[Channel.empty]
v27[certificate]
v31[certificate]
v39[pipeline]
v40[strandedness]
v41[mapping_fields]
end
subgraph NFCORE_FETCHNGS
subgraph SRA
v1([splitCsv])
v2([map])
v3([unique])
v6([SRA_IDS_TO_RUNINFO])
v7([first])
v8([mix])
v9([SRA_RUNINFO_TO_FTP])
v10([first])
v11([mix])
v12([splitCsv])
v13([map])
v14([unique])
v15([first])
v16([mix])
v17([map])
v18([branch])
v19([SRA_FASTQ_FTP])
v21([first])
v22([mix])
v23([map])
subgraph FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS
v25([CUSTOM_SRATOOLSNCBISETTINGS])
v28([SRATOOLS_PREFETCH])
v32([SRATOOLS_FASTERQDUMP])
end
v26([mix])
v29([first])
v30([mix])
v33([first])
v34([mix])
v35([first])
v36([mix])
v37([mix])
v38([map])
v42([SRA_TO_SAMPLESHEET])
v43([collect])
v44([collect])
v45([SRA_MERGE_SAMPLESHEET])
v47([mix])
v48([MULTIQC_MAPPINGS_CONFIG])
v50([mix])
v51([unique])
v52([collectFile])
v53([CUSTOM_DUMPSOFTWAREVERSIONS])
end
end
subgraph " "
v20[ ]
v46[ ]
v49[ ]
v54[ ]
v55[ ]
v56[ ]
end
v0 --> v1
v1 --> v2
v2 --> v3
v3 -->|ids| v6
v4 -->|ch_versions| v8
v5 -->|fields| v6
v6 --> v9
v6 --> v7
v7 --> v8
v8 -->|ch_versions| v11
v9 --> v12
v9 --> v10
v9 --> v15
v10 --> v11
v11 -->|ch_versions| v16
v12 --> v13
v13 --> v14
v14 -->|ch_sra_metadata| v17
v15 --> v16
v16 -->|ch_versions| v22
v17 --> v18
v18 --> v23
v18 --> v19
v19 --> v37
v19 --> v20
v19 --> v21
v21 --> v22
v22 -->|ch_versions| v36
v23 -->|ch_sra_ids| v28
v24 -->|ch_versions| v26
v25 -->|ch_ncbi_settings| v28
v25 --> v26
v25 -->|ch_ncbi_settings| v32
v26 -->|ch_versions| v30
v27 -->|certificate| v28
v28 --> v32
v28 --> v29
v29 --> v30
v30 -->|ch_versions| v34
v31 -->|certificate| v32
v32 -->|reads| v37
v32 --> v33
v33 --> v34
v34 -->|versions| v35
v35 --> v36
v36 -->|ch_versions| v47
v37 --> v38
v38 -->|ch_sra_metadata| v42
v39 -->|pipeline| v42
v40 -->|strandedness| v42
v41 -->|mapping_fields| v42
v42 --> v43
v42 --> v44
v43 --> v45
v44 --> v45
v45 --> v46
v45 --> v48
v45 --> v47
v47 -->|ch_versions| v50
v48 --> v49
v48 --> v50
v50 -->|ch_versions| v51
v51 --> v52
v52 --> v53
v53 --> v56
v53 --> v55
v53 --> v54
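The three diagrams above differ in how much of the subworkflow hierarchy is expanded. The following Python sketch shows one way such a depth cutoff could work. It is inferred from the diagrams and is not taken from the PR's implementation. It assumes processes are identified by colon-separated fully qualified names (reconstructed here from the subgraph nesting shown above), and the function name place_node is hypothetical.

```python
def place_node(qualified_name: str, depth: int) -> tuple[list[str], str]:
    """Return (enclosing subgraphs, node label) for a process at a given depth.

    Hypothetical helper: a process nested inside more than `depth`
    subworkflows is collapsed into the subworkflow at the cutoff level.
    """
    parts = qualified_name.split(":")
    scopes, name = parts[:-1], parts[-1]
    if len(scopes) <= depth:
        # Shallow enough: draw the process itself inside all of its subworkflows.
        return scopes, name
    # Too deep: draw the subworkflow at the cutoff level as a single node.
    return scopes[:depth], scopes[depth]


# Names assumed from the nesting in the diagrams above.
examples = [
    "NFCORE_FETCHNGS:SRA:SRA_TO_SAMPLESHEET",
    "NFCORE_FETCHNGS:SRA:FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS:SRATOOLS_PREFETCH",
]
for depth in (0, 2, 3):
    for name in examples:
        print(depth, place_node(name, depth))
```

With depth=0 both names collapse into a single NFCORE_FETCHNGS node. With depth=2 the second name collapses into FASTQ_DOWNLOAD_PREFETCH_FASTERQDUMP_SRATOOLS inside the SRA subgraph, and with depth=3 it is drawn as its own process, which matches the three diagrams.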
Very nice to see the way this is heading 🤩
Wow, this is cool. Is it going to be the default rendering?
Okay, I think we should merge it as is. There are still many things that can be done, but they would be better as separate efforts. In particular, improving the input and output labels is likely related to having better input/output definitions for the workflow, so it is probably best to wait and see. Summary of future work:
Also:
The default settings are |
Signed-off-by: Paolo Di Tommaso <[email protected]>
Excellent 👍
Signed-off-by: Ben Sherman <[email protected]>
Signed-off-by: Paolo Di Tommaso <[email protected]>
Co-authored-by: Paolo Di Tommaso <[email protected]>
Close #1056 #1543 #3203
dag.verbose option to toggle operators (disabled by default)
dag.depth option to control the level of detail
Notes:
I used wrapper classes around DAG.Vertex so that I could store the inputs and outputs for each node. The DAG class has all edges in a list instead of a hash map, so not ideal for graph traversal. Maybe the DAG class could do this from the beginning, but I haven't looked into it yet.
I used the fully qualified process names to construct a "node tree" that encodes information about subworkflows. Since operators aren't associated to any subworkflow in the DAG, I try to infer it based on its neighboring processes. Maybe the DAG could save this information when the operator is added, but the heuristic works pretty well.
Workflow inputs and outputs are not annotated very well. I can infer some inputs from the outgoing channel name, but the output nodes are always empty. Might be able to grab the output name from the process/workflow emit for now.
Large pipelines like rnaseq and sarek tend to have lots of little outputs which makes the diagram super long. It would be nice to condense this somehow while still showing something for the workflow outputs.
Currently specific to Mermaid, but I should be able to make it work for all output formats by moving the pre-rendering logic to DagRenderer. Then each renderer will need to operate on the "node tree" instead of the DAG.
Testing:
Run any pipeline with -preview, for example:
../launch.sh run nf-core/fetchngs -profile test --outdir results -preview -with-dag
Use dag.depth and dag.verbose to control the level of detail. For large pipelines like rnaseq or sarek, I recommend that you keep verbose=false and limit the depth to 2 or 3.
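The first note above describes wrapping each vertex so that its inputs and outputs are available without scanning a flat edge list. A minimal Python sketch of that idea follows, assuming edges are plain (source, target) pairs. The Node class and index_edges function are hypothetical names for illustration and are not part of Nextflow's API. The sample edges are a subset of those in the depth=0 diagram.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical wrapper that records a vertex's neighbours."""
    name: str
    inputs: list[str] = field(default_factory=list)
    outputs: list[str] = field(default_factory=list)


def index_edges(edges: list[tuple[str, str]]) -> dict[str, Node]:
    """Build a name -> Node map in a single pass over a flat edge list."""
    nodes: dict[str, Node] = {}
    for source, target in edges:
        src = nodes.setdefault(source, Node(source))
        dst = nodes.setdefault(target, Node(target))
        src.outputs.append(target)
        dst.inputs.append(source)
    return nodes


# A few edges copied from the depth=0 diagram.
edges = [("v0", "v6"), ("v5", "v6"), ("v6", "v20"), ("v27", "v6"), ("v6", "v46")]
nodes = index_edges(edges)
print(nodes["v6"].inputs)   # ['v0', 'v5', 'v27']
print(nodes["v6"].outputs)  # ['v20', 'v46']
```

Once the map is built, looking up a node's neighbours is a dictionary access rather than a scan of every edge, which is the property the note says the list-based DAG class lacks.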