added family grouping plus family and cycle collapsing #1810

Open
wants to merge 46 commits into
base: master

Conversation

markgrahamdawson
Contributor

@markgrahamdawson markgrahamdawson commented May 24, 2024

Partly addresses issue #1130
The grouping of nodes by cycle point is completed in PR #1763.

----Notes on work----

Some ideas for a unified approach to grouping/collapsing cycles/families. I'm suggesting unifying the handling of cycles and families (note, cycles represent the "root" family so they are essentially the same thing).

Grouping/Ungrouping - Drawing dashed boxes around a cycle/family.

Collapsing/Expanding - Reducing a family down to a single node.

Limitations of the Cylc 7 approach:

  • Once you expand a family it's gone: the tasks which belong to the expanded family are mixed in with other tasks in the graph, and you cannot tell which family they belong to. This is an issue if the user wants to examine a component within the workflow.
  • No visibility of the inheritance hierarchy (i.e. what can we expand/collapse).
  • No visibility of what you have expanded/collapsed (i.e. where are we in the hierarchy).

Note, for simplicity, this approach groups/collapses all instances of selected families rather than managing this at a per-cycle level. I think this is probably more aligned with expectations, but does represent a minor limitation, e.g. there's no ability to collapse all but one cycle. The ability to expand/collapse specific cycle instances would be a reasonable enhancement.

Design Sketch
image

Had a quick discussion on this (more to come):

  • Can't really think of a valid use case for collapsing all cycles (users would do this in the tree view if they wanted to), so perhaps treat cycles differently (i.e. collapse per-cycle rather than all cycles) and remove from the menus.
  • Better expand/collapse icon (obviously).
  • The Cylc 7 default of only expanding the cycle point (i.e. show top-level families only) is a reasonable protection for viewing large workflows. We might want to continue with this, or perhaps do something smart (e.g. only collapse families if there are lots of tasks on load).

Check List

  • I have read CONTRIBUTING.md and added my name as a Code Contributor.
  • Contains logically grouped changes (else tidy your branch by rebase).
  • Does not contain off-topic changes (use other PRs for other changes).
  • Applied any dependency changes to both setup.cfg (and conda-environment.yml if present).
  • Tests are included (or explain why tests are not needed).
  • CHANGES.md entry included if this is a change that can affect users
  • Cylc-Doc pull request opened if required at cylc/cylc-doc/pull/XXXX.
  • If this is a bug fix, PR should be raised against the relevant ?.?.x branch.

@markgrahamdawson markgrahamdawson self-assigned this May 24, 2024
@MetRonnie MetRonnie added this to the 2.6.0 milestone May 28, 2024
@markgrahamdawson
Contributor Author

markgrahamdawson commented Jul 5, 2024

Still to do (in order of priority).

  1. Work out how to implement nested families
  2. Get graph status showing on graph nodes
  3. When switching to another workflow and returning, the graph fails to load
  4. Expand/collapse icons on each node (and subgraph) on the graph
  5. If collapsed by family and grouped by family, don't draw graph around collapsed node (see cycle point functionality)
  6. The edges and nodes are now let variables (so they can be updated) - is this ok? Might want to change them to something else?

@markgrahamdawson
Contributor Author

markgrahamdawson commented Jul 29, 2024

I'm struggling with grouping by families because they can be nested ...

The context

Collapsing and Expanding nodes is easier as the .dot file just needs the node/s added/removed ....

"~mdawson/runtime-tutorial-families-two/run1//20240724T0000Z/get_observations_aldergrove" [
            label=<
              <TABLE HEIGHT="132.447509765625">
                <TR>
                  <TD PORT="in" WIDTH="100"></TD>
                </TR>
                <TR>
                  <TD PORT="task" WIDTH="100" HEIGHT="132.447509765625">icon</TD>
                  <TD WIDTH="628.4224853515625">~mdawson/runtime-tutorial-families-two/run1//20240724T0000Z/get_observations_aldergrove</TD>
                </TR>
                <TR>
                  <TD PORT="out" WIDTH="100"></TD>
                </TR>
              </TABLE>
            >
          ]

and the edge (relationship between the node and other nodes) defined...
"~mdawson/runtime-tutorial-families-two/run1//20240724T0300Z/get_observations_aldergrove":out -> "~mdawson/runtime-tutorial-families-two/run1//20240724T0300Z/consolidate_observations":in

This means I don't have to directly deal with a hierarchy (nested structure) as things are just being added/removed.
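To illustrate why adding and removing is enough for collapsing, here is a minimal sketch (Python, to match the pseudocode used later in this thread). It assumes a hypothetical `represented_by` mapping from each hidden task to the collapsed family node that stands in for it. `collapse_edges` and the shortened node IDs are illustrative names, not code from Graph.vue.

```python
def collapse_edges(edges, represented_by):
    """Re-point edges at collapsed family nodes.

    Edges that end up internal to a collapsed family, and duplicates,
    are dropped.
    """
    seen = set()
    collapsed_edges = []
    for source, target in edges:
        edge = (
            represented_by.get(source, source),
            represented_by.get(target, target),
        )
        if edge[0] == edge[1] or edge in seen:
            continue
        seen.add(edge)
        collapsed_edges.append(edge)
    return collapsed_edges


# hypothetical, shortened node IDs of the form <cycle>/<name>
edges = [
    ('20240724T0000Z/get_observations_aldergrove',
     '20240724T0000Z/consolidate_observations'),
    ('20240724T0000Z/get_observations_shetland',
     '20240724T0000Z/consolidate_observations'),
]
represented_by = {
    '20240724T0000Z/get_observations_aldergrove':
        '20240724T0000Z/GET_OBSERVATIONS_NORTH',
    '20240724T0000Z/get_observations_shetland':
        '20240724T0000Z/GET_OBSERVATIONS_NORTH',
}

for source, target in collapse_edges(edges, represented_by):
    print(f'"{source}":out -> "{target}":in')
```

Both task edges fold into a single edge from the family node, so no nesting has to be tracked at this stage.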

For grouping the syntax is a little different, using subgraphs ...

                  subgraph cluster_margin_family16
                  {
                    margin=100.0
                    label="margin"
                    subgraph cluster_family16 {"~mdawson/runtime-tutorial-families-two/run1//20240724T0000Z/get_observations_shetland","~mdawson/runtime-tutorial-families-two/run1//20240724T0000Z/get_observations_aldergrove";

                      label = "GET_OBSERVATIONS_NORTH20240724T0000Z";

                      fontsize = "70px"
                      style=dashed
                      margin=60.0
                  }
                }

For grouping by cycle point there are no nested cycle points (that wouldn't make sense), so it's just a case of making a subgraph for each cycle. The subgraphs do need to account for the fact that the nodes may have been expanded or collapsed, but that can be managed by calculating which nodes need to be included from the nodes variable, which is up to date with what has been expanded/collapsed. Understanding the hierarchical relationship is also easier because it's contained in the node ID: whether it has been expanded or collapsed, a node will always have a cycle associated with it.
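As a rough sketch of the flat, per-cycle case (Python, to match the pseudocode used later in this thread): the cycle is read straight out of each node ID, assuming IDs of the form `<workflow>//<cycle>/<name>` as in the examples above. The function names `cycle_of` and `cycle_subgraphs` are hypothetical and the emitted attributes are reduced to the bare minimum.

```python
def cycle_of(node_id):
    """Extract the cycle point from an ID like <workflow>//<cycle>/<name>."""
    relative = node_id.split('//', 1)[1]
    return relative.split('/', 1)[0]


def cycle_subgraphs(node_ids):
    """Emit one dashed cluster per cycle containing that cycle's nodes."""
    by_cycle = {}
    for node_id in node_ids:
        by_cycle.setdefault(cycle_of(node_id), []).append(node_id)

    dotcode = []
    for cycle, members in sorted(by_cycle.items()):
        dotcode.append(f'subgraph "cluster_{cycle}" {{')
        dotcode.append(f'  label = "{cycle}"')
        dotcode.append('  style = dashed')
        for member in members:
            dotcode.append(f'  "{member}"')
        dotcode.append('}')
    return dotcode


# the visible nodes, i.e. after any expanding/collapsing has been applied
node_ids = [
    '~mdawson/runtime-tutorial-families-two/run1//20240724T0000Z/get_observations_aldergrove',
    '~mdawson/runtime-tutorial-families-two/run1//20240724T0300Z/get_observations_aldergrove',
    '~mdawson/runtime-tutorial-families-two/run1//20240724T0000Z/consolidate_observations',
]
dotcode = cycle_subgraphs(node_ids)
print('\n'.join(dotcode))
```

Because every visible node carries its cycle in its ID, no lookup of the hierarchy is needed here, which is what makes this case simpler than nested families.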

The problem

The problem is with nested grouping which is relevant for

  1. family groups inside cycle groups
  2. family groups inside family groups.

The way this is represented in the .dot code is by having subgraphs within subgraphs...

subgraph FAMILY {
  "~mdawson/run-name/run1/cycle/SUBFAMILY", "~mdawson/run-name/run1/cycle/task1", "~mdawson/run-name/run1/cycle/task2", ;
  label = Family
  subgraph SUBFAMILY {
    "~mdawson/run-name/run1/cycle/task1", "~mdawson/run-name/run1/cycle/task2", ;
    label = SubFamily
  }
}

image

The above is a simple example of some graphviz code for a simple nested family situation. Below is an example for a more complicated one...
image

subgraph FAMILY {
  "~mdawson/run-name/run1/cycle/SUBFAMILY_A", "~mdawson/run-name/run1/cycle/SUBFAMILY_B", "~mdawson/run-name/run1/cycle/SUBFAMILY_A1", "~mdawson/run-name/run1/cycle/SUBFAMILY_A2", "~mdawson/run-name/run1/cycle/SUBFAMILY_B1", "~mdawson/run-name/run1/cycle/SUBFAMILY_B2", "~mdawson/run-name/run1/cycle/Task_y", "~mdawson/run-name/run1/cycle/Task_x", "~mdawson/run-name/run1/cycle/Task_m", "~mdawson/run-name/run1/cycle/Task_n", "~mdawson/run-name/run1/cycle/Task_g", "~mdawson/run-name/run1/cycle/Task_h", "~mdawson/run-name/run1/cycle/Task_i", "~mdawson/run-name/run1/cycle/Task_j" ;
  label = Family
  subgraph SUBFAMILY_A {
 "~mdawson/run-name/run1/cycle/SUBFAMILY_A1", "~mdawson/run-name/run1/cycle/SUBFAMILY_A2", "~mdawson/run-name/run1/cycle/Task_y", "~mdawson/run-name/run1/cycle/Task_x", "~mdawson/run-name/run1/cycle/Task_m", "~mdawson/run-name/run1/cycle/Task_n" ;
    label = SubFamily_A
        subgraph SUBFAMILY_A1 { "~mdawson/run-name/run1/cycle/Task_y", "~mdawson/run-name/run1/cycle/Task_x" ;
          label = SubFamily_A1
      }
      subgraph SUBFAMILY_A2 { "~mdawson/run-name/run1/cycle/Task_m", "~mdawson/run-name/run1/cycle/Task_n" ;
          label = SubFamily_A2
      }
  }
  subgraph SUBFAMILY_B {
 "~mdawson/run-name/run1/cycle/SUBFAMILY_B1", "~mdawson/run-name/run1/cycle/SUBFAMILY_B2", "~mdawson/run-name/run1/cycle/Task_g", "~mdawson/run-name/run1/cycle/Task_h", "~mdawson/run-name/run1/cycle/Task_i", "~mdawson/run-name/run1/cycle/Task_j" ;
    label = SubFamily_B
        subgraph SUBFAMILY_B1 { "~mdawson/run-name/run1/cycle/Task_g", "~mdawson/run-name/run1/cycle/Task_h" ;
          label = SubFamily_B1
      }
      subgraph SUBFAMILY_B2 { "~mdawson/run-name/run1/cycle/Task_i", "~mdawson/run-name/run1/cycle/Task_j" ;
          label = SubFamily_B2
      }
  }
}

The subgraphs can be n layers deep, so that needs to be handled programmatically (it can't be hard-coded).
At the moment the graphviz .dot code is being written as an array of strings that gets added to by pushing new values onto the end, and then the join method is used on the array to make one big string.

I have thought about giving each node a ranking in terms of how 'deep' it is in the hierarchy, then ranking from most deep to least deep, then looping through ... but this won't work because (as in the example above) you would miss out a lot of the graph.

@oliver-sanders
Member

I think this is a problem that warrants recursion as it's tricky to unroll as an iterative loop.

Here's an idea of what that could look like (Python syntax):

  • First, go through every task and create a dotcode entry for each (this is the bit that includes the <TABLE /> label).
  • Then go through every family in inheritance order and build the nested subgraphs for each, inserting the task entries we have just built into the relevant subgraph when we get to it.
  • Then do the '\n'.join(dotcode) bit.
from random import random

TASKS = {
    'foo': {
        'name': 'foo',
        'parent': 'FOO',
    },
    'FOO': {
        'name': 'FOO',
        'parent': 'root'
    },
    'bar': {
        'name': 'bar',
        'parent': 'BAR1',
    },
    'baz': {
        'name': 'baz',
        'parent': 'BAR2',
    },
    'BAR1': {
        'name': 'BAR1',
        'parent': 'BAR',
    },
    'BAR2': {
        'name': 'BAR2',
        'parent': 'BAR',
    },
    'root': {
        'name': 'root',
        'parent': None,
    },
}

TREE = {
    'root': {
        'FOO': None,
        'BAR': {
            'BAR1': None,
            'BAR2': None,
        },
    },
}

def add_subgraph(dotcode, pointer, graph_sections):
    for key, value in pointer.items():
        dotcode.append(
            f'subgraph cluster_{str(random())[2:]} {{'
            f'\nlabel = "{key}"'
        )

        if value:
            add_subgraph(dotcode, value, graph_sections)

        if key in graph_sections:
            dotcode.extend(graph_sections[key])

        dotcode.append('}')

    return dotcode

def get_dotcode(tasks):
    graph_sections = {}

    for task in tasks.values():
        parent = task['parent']
        if not parent:
            continue
        section = graph_sections.setdefault(parent, [])
        section.append(f'{task["name"]} [title="{task["name"]}"]')

    dotcode = ['digraph {']
    add_subgraph(dotcode, TREE['root'], graph_sections)
    dotcode.append('}')  # close the top-level digraph
    return dotcode


for item in get_dotcode(TASKS):
    print(item)

Output (indented for readability):

digraph {

  subgraph cluster_23300787190407446 {
    label = "FOO"

    foo [title="foo"]
  }

  subgraph cluster_5025488657295563 {
    label = "BAR"

    subgraph cluster_2135762450670372 {
      label = "BAR1"

      bar [title="bar"]
    }

    subgraph cluster_4413670667138756 {
      label = "BAR2"

      baz [title="baz"]
    }

    BAR1 [title="BAR1"]
    BAR2 [title="BAR2"]
  }
}

I haven't taken cycles into account in this solution, you'll need to add a for cycle in cycles loop at the top of this.
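A minimal sketch of what that outer loop might look like, under some simplifying assumptions: the family tree is the same for every cycle, tasks are pre-sorted into a hypothetical `TASKS_BY_FAMILY` lookup, and node IDs are shortened to `<cycle>/<task>`. Cluster names are built from the cycle and family rather than `random()`, so they stay unique and stable between redraws.

```python
TREE = {'FOO': None, 'BAR': {'BAR1': None, 'BAR2': None}}
TASKS_BY_FAMILY = {'FOO': ['foo'], 'BAR1': ['bar'], 'BAR2': ['baz']}


def add_family_subgraphs(dotcode, subtree, cycle):
    """Recursively add one cluster per family for a single cycle."""
    for family, children in subtree.items():
        dotcode.append(f'subgraph "cluster_{cycle}_{family}" {{')
        dotcode.append(f'label = "{family}"')
        if children:
            add_family_subgraphs(dotcode, children, cycle)
        for task in TASKS_BY_FAMILY.get(family, []):
            dotcode.append(f'"{cycle}/{task}" [label="{task}"]')
        dotcode.append('}')


def get_dotcode(cycles):
    """Wrap the family clusters of each cycle in a cluster for that cycle."""
    dotcode = ['digraph {']
    for cycle in cycles:
        dotcode.append(f'subgraph "cluster_{cycle}" {{')
        dotcode.append(f'label = "{cycle}"')
        add_family_subgraphs(dotcode, TREE, cycle)
        dotcode.append('}')
    dotcode.append('}')
    return dotcode


dotcode = get_dotcode(['1971', '1972'])
print('\n'.join(dotcode))
```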

This solution will also add entries for families which have no tasks, so, you'll need some fancy logic for removing empty families, and any families that contain only empty families.
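One possible shape for that pruning step, as a sketch only: walk the tree bottom-up and keep a family if it has tasks of its own or at least one surviving child. It reuses the `None`-for-leaf convention of `TREE` above; `prune` and `tasks_by_family` are hypothetical names.

```python
def prune(tree, tasks_by_family):
    """Return a copy of tree without families that hold no tasks at any depth."""
    pruned = {}
    for family, children in tree.items():
        # prune the children first so emptiness propagates upwards
        kept_children = prune(children, tasks_by_family) if children else {}
        if kept_children or tasks_by_family.get(family):
            pruned[family] = kept_children or None
    return pruned


tree = {
    'FOO': None,
    'BAR': {'BAR1': None, 'BAR2': {'BAR21': None}},
    'EMPTY': {'EMPTIER': None},
}
tasks_by_family = {'FOO': ['foo'], 'BAR1': ['bar']}

pruned_tree = prune(tree, tasks_by_family)
print(pruned_tree)
```

Running the prune before building the dotcode keeps the recursive subgraph function itself free of special cases.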

@wxtim
Member

wxtim commented Nov 13, 2024

Hi @markgrahamdawson

I'm still getting missing dependency arrows, though it was harder to reproduce:

[scheduler]
    allow implicit tasks = True
    cycle point format = %Y

[scheduling]
    initial cycle point = 1971
    [[graph]]
        P1Y = """
            X[-P1Y]:succeed-all => start_cycle
            start_cycle => X:succeed-all => _dummy_ => Y
        """

[runtime]
    [[root]]
        script = sleep 1000
    [[X, Y]]

    [[x]]
        inherit = X
    [[y]]
        inherit = Y

I had collapsed 1972 and X

image

@wxtim
Member

wxtim commented Nov 15, 2024

I can't break it. 🚀

@markgrahamdawson
Contributor Author

[tangential to , but exacerbated by this PR]

We were talking about the funky lines GraphViz sometimes comes up with.

I suspect that these are the result of GraphViz trying to thread edges between tasks and subgraphs.

We currently apply spacing to subgraphs by nesting them inside another subgraph. There is a GraphViz margin attribute which might work for us?

On this one, we do use that margin attribute - line 855 in src/views/Graph.vue

return nodeFirstFamily
} else if (ancestor) {
return this.allParentLookUp[nodeFirstFamily][0]
// this is almost certainly an over simplification -> better logic needed here
Member

TODO

I'm not 100% on what this function does, but I suspect it's missing a bit.

E.g, for this inheritance:

  • A
    • A1
      • A11
        • A111

And this state:

Fam     isCollapsed
A
A1      ✔️
A11     ✔️
A111

What family should be returned?
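One plausible answer, offered as an assumption rather than the intended behaviour: return the outermost collapsed ancestor, since collapsing A1 hides everything beneath it, including the already-collapsed A11. A minimal sketch in Python with hypothetical names (`collapsed_representative`, `parent_of`):

```python
def collapsed_representative(family, parent_of, collapsed):
    """Return the outermost collapsed family at or above `family`, or None."""
    representative = None
    current = family
    while current is not None:
        if current in collapsed:
            # keep overwriting as we climb, so the highest match wins
            representative = current
        current = parent_of.get(current)
    return representative


parent_of = {'A111': 'A11', 'A11': 'A1', 'A1': 'A', 'A': None}
collapsed = {'A1', 'A11'}

print(collapsed_representative('A111', parent_of, collapsed))
```

Under this rule A111, A11 and A1 all resolve to A1, while A resolves to nothing because none of its ancestors (or A itself) is collapsed.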

Comment on lines 865 to 873
const nodeFormattedArray = children.filter((a) => {
// if its not in the list of families (unless its been collapsed)
const isFamily = !this.familyArrayStore.includes(a.name) || this.collapseFamily.includes(a.name)
// its the node has been removed/collapsed
const isRemoved = !removedNodes.includes(a.name)
// is not numeric
const isNumeric = !parseFloat(a.name)
return isFamily && isRemoved && isNumeric
}).map(a => `"${a.id}"`)
Member

There is an optimisation we can make here which will help to speed the code up.

E.G:

function one () {
  console.log('one')
  return false
}

function two () {
  console.log('two')
  return true
}

function three () {
  console.log('three')
  return true
}

if (one() && two() && three()) {
  console.log('true')
} else {
  console.log('false')
}

This code contains an if statement that only evaluates true if all three of the tests one(), two() and three() return true.

However, take a look at what happens when you run the code:

$ node test.js
one
false

It only actually performs the first test (one()), this returns false, so it skips the others.

This is "short-circuit evaluation" (sometimes loosely called "lazy evaluation"); several scripting languages, including JS and Python, do it. It is a useful tool for optimisation as it helps us to avoid performing unnecessary comparisons.

To apply that to this example:

          const nodeFormattedArray = children.filter((a) => {
            return (
              // if its not in the list of families (unless its been collapsed)
              // (parenthesised because && binds more tightly than ||)
              (!this.familyArrayStore.includes(a.name) || this.collapseFamily.includes(a.name))
              // the node has not been removed/collapsed
              && !removedNodes.includes(a.name)
              // is not numeric
              && !parseFloat(a.name)
            )
          }).map(a => `"${a.id}"`)

If the first condition (!this.familyArrayStore.includes(a.name) || this.collapseFamily.includes(a.name)) evaluates false, then the following conditions are not evaluated at all, saving CPU.

We can game this by putting the conditions that are most likely to resolve as false at the top of the list.
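One thing to watch when inlining named conditions like this: `and` (`&&` in JS) binds more tightly than `or` (`||`), so a condition that mixes the two needs explicit parentheses to keep its original meaning. A small Python sketch of both points; the helper `check` and the function names are made up for illustration.

```python
calls = []


def check(name, result):
    """Record that a condition was evaluated, then return its result."""
    calls.append(name)
    return result


# short-circuiting: evaluation stops at the first false operand
outcome = check('one', False) and check('two', True) and check('three', True)
print(calls)  # only the first condition was evaluated


def keep_grouped(not_family, is_collapsed, not_removed):
    # equivalent to using a named intermediate for the first condition
    return (not_family or is_collapsed) and not_removed


def keep_ungrouped(not_family, is_collapsed, not_removed):
    # parsed as: not_family or (is_collapsed and not_removed)
    return not_family or is_collapsed and not_removed


# a non-family node that has been removed: the two forms disagree
print(keep_grouped(True, False, False), keep_ungrouped(True, False, False))
```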

return nodes.reduce((x, y) => {
(x[y.tokens.cycle] ||= []).push(y)
return x
}, {})
},
addSubgraph (dotcode, pointer, graphSections) {
pointer.children.forEach((key, i) => {
const value = key
Member

Why reassign key to value?

Member

I'm curious about this too - tried getting rid of this either way and both seem to work.

// its the node has been removed/collapsed
const isRemoved = !removedNodes.includes(a.name)
// is not numeric
const isNumeric = !parseFloat(a.name)
Member

What is the isNumeric check for?

Is this here to filter out jobs?

}`)
const graphSections = {}
Object.keys(cycles).forEach((cycle, iCycle) => {
const indexSearch = Object.values(this.cylcTree.$index).filter((node) => {
Member

This is searching over the full cylcTree.$index, this is the complete data store which also contains nodes for other workflows.

We only need to iterate over the cycles in this workflow, i.e. Object.keys(cycles).forEach((key, i) => {.

})
return store
},
allParentLookUp () {
Member

As someone who isn't super fluent in JS, could I request more docstring?

},
allChildrenLookUp () {
const lookup = {}
// Calculate some values for familes that we need for the toolbar
Member

Which values?

Comment on lines 845 to 910
pointer.children.forEach((key, i) => {
const value = key
const children = this.allChildrenLookUp[value.id]
if (!children) { return }
const removedNodes = new Set()
children.forEach((a) => {
if (this.collapseFamily.includes(a.name)) {
a.children.forEach((child) => {
removedNodes.add(child.name)
})
}
})
// filter parent
let openedBrackets = false
if (
children.length &&
this.groupFamily.includes(key.node.name) &&
!this.collapseFamily.includes(key.node.name) &&
!this.collapseFamily.includes(key.node.firstParent.name)) {
// filter child
const nodeFormattedArray = children.filter((a) => {
const isNumeric = !parseFloat(a.name)
let isAncestor = true
if (isNumeric) {
const nodeFirstParent = this.cylcTree.$index[a.id].node.firstParent.name
isAncestor = !this.isNodeCollapsedByFamily(nodeFirstParent)
}
return (
// the node is not a numeric value
isNumeric &&
// if its not in the list of families (unless its been collapsed)
(!this.familyArrayStore.includes(a.name) || this.collapseFamily.includes(a.name)) &&
// the node has been removed/collapsed
!removedNodes.has(a.name) &&
// the node doesnt have a collapsed ancestor
isAncestor
)
}).map(a => `"${a.id}"`)
if (nodeFormattedArray.length) {
openedBrackets = true
dotcode.push(`
subgraph cluster_margin_family_${key.name}${key.tokens.cycle}
{
margin=100.0
label="margin"
subgraph cluster_${key.name}${key.tokens.cycle}
{${nodeFormattedArray}${nodeFormattedArray.length ? ';' : ''}
label = "${key.name}"
fontsize = "70px"
style=dashed
margin=60.0
`)
}
}

if (value) {
this.addSubgraph(dotcode, value, graphSections)
}
if (Object.keys(graphSections).includes(key)) {
dotcode.push(graphSections[key.id])
}
if (openedBrackets) {
dotcode.push('}}')
}
})
},
Member

This seemed fine:

Suggested change
pointer.children.forEach((value, i) => {
const children = this.allChildrenLookUp[value.id]
if (!children) { return }
const removedNodes = new Set()
children.forEach((a) => {
if (this.collapseFamily.includes(a.name)) {
a.children.forEach((child) => {
removedNodes.add(child.name)
})
}
})
// filter parent
let openedBrackets = false
if (
children.length &&
this.groupFamily.includes(value.node.name) &&
!this.collapseFamily.includes(value.node.name) &&
!this.collapseFamily.includes(value.node.firstParent.name)) {
// filter child
const nodeFormattedArray = children.filter((a) => {
const isNumeric = !parseFloat(a.name)
let isAncestor = true
if (isNumeric) {
const nodeFirstParent = this.cylcTree.$index[a.id].node.firstParent.name
isAncestor = !this.isNodeCollapsedByFamily(nodeFirstParent)
}
return (
// the node is not a numeric value
isNumeric &&
// if its not in the list of families (unless its been collapsed)
(!this.familyArrayStore.includes(a.name) || this.collapseFamily.includes(a.name)) &&
// the node has been removed/collapsed
!removedNodes.has(a.name) &&
// the node doesnt have a collapsed ancestor
isAncestor
)
}).map(a => `"${a.id}"`)
if (nodeFormattedArray.length) {
openedBrackets = true
dotcode.push(`
subgraph cluster_margin_family_${value.name}${value.tokens.cycle}
{
margin=100.0
label="margin"
subgraph cluster_${value.name}${value.tokens.cycle}
{${nodeFormattedArray}${nodeFormattedArray.length ? ';' : ''}
label = "${value.name}"
fontsize = "70px"
style=dashed
margin=60.0
`)
}
}
if (value) {
this.addSubgraph(dotcode, value, graphSections)
}
if (Object.keys(graphSections).includes(value)) {
dotcode.push(graphSections[value.id])
}
if (openedBrackets) {
dotcode.push('}}')
}
})
},

@oliver-sanders oliver-sanders modified the milestones: 2.7.0, 2.8.0 Dec 5, 2024