Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

normalization doing things it shouldn't do #31

Closed
2 tasks done
hoechp opened this issue Mar 4, 2017 · 6 comments
Closed
2 tasks done

normalization doing things it shouldn't do #31

hoechp opened this issue Mar 4, 2017 · 6 comments
Assignees

Comments

@hoechp
Copy link
Contributor

hoechp commented Mar 4, 2017

found some bugs related to creating the NF graph:

  • it changes data within the original graph (that shouldn't be touched at all)
  • having way to many (duplicate) edges after normalization

example graph:

	    Graph b = new Graph();
	    Node b1 = new Node();
	    Node b2 = new Node();
	    Node b3 = new Node();
	    Node b4 = new Node();
	    b1.addEdge("a", b2);
	    b1.addEdge("b", b2);
	    b3.addEdge("a", b4);
	    b3.addEdge("b", b4);
	    b.addNode(b1, b2, b3, b4);

before:

"edges":[{"a":[3]},{"b":[3]}]

after:

"edges":[{"a":[3,3,3,3,3,3,3,3]},{"b":[3,3,3,3,3,3,3,3]}]
  • fix normalization changing data of the original graph at all
  • fix edges occuring as duplicates
@hoechp
Copy link
Contributor Author

hoechp commented Mar 4, 2017

normalization seems to have serious bugs and it's almost a wonder it was able to create a correct reachability graph.

  • totally rework the sorting

edit: the bug was a mapping from serializations of nodesorttrees to the nodesorttrees. having two identical subgraphs within a graph, their nodesorttree-serializations are identical, resulting in all sorts of bugs. this can, as far as I understand, only happen with multiple occurancies of identical subgraphs within the graph (that are not connected to each other).

@hoechp
Copy link
Contributor Author

hoechp commented Mar 4, 2017

Okay, I found a solution, I think. It's pretty brainf***, to think through this - but I identified two problems that are already shown by the two examples we used at the bachelor colloquium. Either having multiple identical sub-graphs that are not connected to each other, or otherwise within a connected (sub-)graph having multiple sets of nodes that are interchangable with each other - then (in both those cases) the node-sort-trees for multiple nodes are exactly the same, resulting in a basically random sort within those identically represented nodes. though this can lead into different serializations of the complete graph. (additionally this caused further bugs along the way)

The solution:

  • first split the graph into its 'sub-graphs'
  • secondly for each sub-graph build node-sort-trees for each node and sort them accordingly. this though will not sort 'identical' nodes between one another.
  • then for each sub-graph take the first node-sort-tree and sort all the sub-graph's nodes (and target-lists) based on the layout of this node-sort-tree
  • finally merge the sub-graphs together in order of the sub-graphs serializations

This will result in a completely different order, but its consistent even for the worst kind of graph. I'm pretty sure at least and I'll implement and test it shortly.

hoechp added a commit that referenced this issue Mar 5, 2017
@hoechp
Copy link
Contributor Author

hoechp commented Mar 5, 2017

Okay, this worked. Now the examples from friday's colloquium are running as expected - no known bugs are left. And the performance didn't noticably decrease, either. I will add more hellish tests for this feature shortly - hopefully covering most worst-case situations for this algorithm.

edit: While solving this, I had an idea on how to utilize those nodes, that have other nodes within the graph, that have the exact same 'node-sort-tree' (they are practically 'isomorph' to each other - well, it's a bit more compilcated, though). For example, this can be used in pattern matching - see #32.

@hoechp
Copy link
Contributor Author

hoechp commented Mar 5, 2017

This is a pretty wicked example for this sort of algorithm:
20170305_173632
To make it even harder, this can be included multiple times as sub-graphs that ain't connected to one another.

The implementation now works in principle, but currently it's possible in some situations, that the sorting can get stuck in an infinite loop. Which it does for this example (but not the smaller one from colloquium). I'm pretty sure, this can be fixed, though.

@hoechp
Copy link
Contributor Author

hoechp commented Mar 5, 2017

One solution would be to detect loop cycles and pick a certain sort within the loop cycle, for example the one where the whole sort itself is lexicographically, in respect of the list of serialized node-sort-trees, the lowest. The reason why this can't be done earlier, is that the IDs of the nodes within the node-sort-trees are changing while sorting - if a cycle is reached, though, this can be done. The biggest drawback here is, that even know these cycles are occuring only in very very rare situations, still for cycle detection the whole history of orders must be maintained, which is likely to noticably slow the whole algorithm down.

hoechp added a commit that referenced this issue Mar 5, 2017
@hoechp
Copy link
Contributor Author

hoechp commented Mar 5, 2017

Okay, it's now all fixed - and not noticably slower than before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant