Hash-cons Applys and Constants #1165

Draft
wants to merge 11 commits into main
Conversation

brandonwillard
Member

@brandonwillard brandonwillard commented Sep 2, 2022

This is an experimental PR that attempts to use hash-consing to map equivalent Applys and Constants to single representative instances, one per equivalence class. These changes should remove the need to perform "merging" as a distinct rewrite step, since merging would now be performed implicitly by the Apply and Constant "constructors".

In other words, Apply(op, [v0, v1], ...) is Apply(op, [v0, v1], ...) and Constant(t1, x) is Constant(t1, x).
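The intended identity semantics can be sketched with a toy intern table. This is purely illustrative; the Apply class and its fields here are stand-ins, not Aesara's actual implementation:

```python
# Minimal hash-consing sketch (hypothetical; not Aesara's actual API).
# A class-level intern table maps a node's structural key to a single
# representative instance, so structurally equal nodes are identical (`is`).
class Apply:
    _interned = {}

    def __new__(cls, op, inputs):
        key = (op, tuple(inputs))
        node = cls._interned.get(key)
        if node is None:
            node = super().__new__(cls)
            node.op = op
            node.inputs = tuple(inputs)  # immutable: never updated in place
            cls._interned[key] = node
        return node

a = Apply("add", ["v0", "v1"])
b = Apply("add", ["v0", "v1"])
assert a is b  # structural equality implies object identity
```

With this scheme, an explicit merge pass is unnecessary: duplicates can never be constructed in the first place.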

In order for this notion of equivalence to work, we need immutable Applys, which means that we need to stop updating Apply.inputs in place. To accommodate that requirement, FunctionGraph.change_node_input can create new Apply instances with the updated inputs. Afterward, the old Apply nodes need to be replaced by the new ones somehow.

These changes overlap with the work in #666; however, we would like to avoid the need to clone and replace every Apply node that uses a replaced node's outputs as inputs (i.e. we don't want to clone and replace all the dependent sub-graphs).

  • We should be able to avoid sub-graph cloning by moving the in-place updates to Variable.owner (i.e. re-assign the output variables to the new Applys). The problem with a change like this is that our Feature.on_change_input callback semantics will also need to be changed, because—for instance—the objects passed to those methods will no longer have the same expected relationships with each other. For example, if we're no longer changing Apply.inputs in-place, a call to Feature.on_change_input with a newly constructed Apply node as the node argument will be invalid for Features that track the node arguments explicitly. In general, Features will need to explicitly account for any new/different relationship between a new node and an old/replaced one.

  • One approach to updating Feature.on_change_input involves the use of old and new node arguments. In this case, subscribers to the Feature.on_change_input callbacks will be responsible for handling the node replacements, which—in many existing cases—is reasonable. Also, if we're updating Variable.owner in each of the output variables of a replaced Apply, listeners can start using "representative" output variables to track specific node locations in a graph. In other words, if one tracks out = node.outputs[0] instead of node, out.owner will always reflect the applicable Apply node after any subsequent changes to node.inputs.

  • We need to determine how distinct computations of an Apply node are represented in this scenario. Currently, z = x + y and w = x + y represent two distinct computations (e.g. expressed in implementation with z != w being True). Under these changes, z is w could be True, and we would need to represent distinct computations of the same expressions in some other way (e.g. via the use of distinct output Variables that point to the same Apply).

    I think the only reason we currently need distinct, duplicate computations is that it's a necessary part of our memory/data-flow model—itself needed for things like in-placing. Instead, we can work in reverse of how we currently do by making all equivalent expressions represent the same computation by default (i.e. the changes in this PR), and then use special constructors/helper-functions and rewrite passes that introduce distinct computations as required (e.g. for in-placing). In other words, we go from everything being a distinct computation by default and then merging the equivalent ones, to everything that actually is equivalent being equivalent by default and then introducing duplicates when needed.

    Again, distinct output Variables could be used to represent distinct computations, since multiple Variables can have their .owner values set to the same Apply node. The primary difference with our current approach is that the Apply nodes associated with those distinct Variables will not be distinct.

    Our in-placing rewrites, DestroyHandler, and Merge[Optimizer|Feature] work together—with the backtracking provided by the History Feature—to incrementally apply the equivalences between equivalent expressions. That approach is oriented toward other concerns and doesn't really suit the needs of equational reasoning, term rewriting, and other areas of work that make use of these equivalences.
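As a rough illustration of the ideas above (immutable Applys, rebuilding a node when an input changes, and tracking graph locations through a representative output Variable), here is a minimal sketch. The Variable, Apply, and replace_input names are stand-ins, not Aesara's API:

```python
# Hypothetical sketch: re-assigning Variable.owner instead of mutating
# Apply.inputs in place, so trackers holding an output Variable always
# see the current node via `out.owner`.
class Variable:
    def __init__(self):
        self.owner = None

class Apply:
    def __init__(self, op, inputs, n_outputs=1):
        self.op = op
        self.inputs = tuple(inputs)  # treated as immutable
        self.outputs = [Variable() for _ in range(n_outputs)]
        for out in self.outputs:
            out.owner = self

def replace_input(node, index, new_input):
    """Build a new Apply with one input changed and re-own the old outputs."""
    new_inputs = list(node.inputs)
    new_inputs[index] = new_input
    new_node = Apply(node.op, new_inputs)
    # Re-point the old output Variables at the new node, so no dependent
    # sub-graph cloning is needed.
    new_node.outputs = node.outputs
    for out in node.outputs:
        out.owner = new_node
    return new_node

node = Apply("add", ["x", "y"])
out = node.outputs[0]                  # track the output, not the node
new_node = replace_input(node, 1, "z")
assert out.owner is new_node           # tracker still sees the live node
assert out.owner.inputs == ("x", "z")
```

This is the sense in which out = node.outputs[0] remains a stable handle on a graph location even though the Apply node itself is replaced.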

@brandonwillard brandonwillard self-assigned this Sep 2, 2022
@brandonwillard brandonwillard marked this pull request as draft September 2, 2022 06:28
@brandonwillard brandonwillard added enhancement New feature or request question Further information is requested important graph rewriting refactor This issue involves refactoring request discussion graph objects performance concern labels Sep 2, 2022
@ricardoV94
Contributor

Do we ever change Apply inputs inplace behind the scenes?

@brandonwillard
Member Author

brandonwillard commented Sep 2, 2022

Do we ever change Apply inputs inplace behind the scenes?

Yes, in FunctionGraph.change_node_input primarily. A big part of the remaining work on this PR involves refactoring for hash-based collections that are invalidated after such changes. There's also a good chance that we'll need to remove the in-place Apply node updating altogether.

@brandonwillard brandonwillard force-pushed the use-hash-cons-Apply-and-Constant branch 10 times, most recently from f4a9e86 to 9843a4b on September 10, 2022 01:04
@ricardoV94
Contributor

ricardoV94 commented Sep 10, 2022

Any chance trying this could give us enough speedup without having to perform such fundamental changes?
Theano/Theano#3373

Edit: The suggested improvement in that issue was actually already implemented:

# If they're all `AtomicVariable`s, there's no need to call validate.
if all(isinstance(old, AtomicVariable) for old, _ in pairs):
    fgraph.replace_all(pairs, reason="MergeOptimizer")

@brandonwillard
Member Author

brandonwillard commented Sep 10, 2022

Any chance trying this could give us enough speedup without having to perform such fundamental changes? Theano/Theano#3373

We're not looking for just a speed-up here. We're trying to improve our graph object model, rewriting process, and graph equivalence checks in general so we can arrive at a place where #1082—and other "meta" rewriting work—is feasible in Aesara.

Regardless, incremental changes to MergeFeature/MergeOptimizer like that could be an improvement for the current approach to merging; however, a PR like this takes an entirely different approach that removes the need to perform merging as a rewrite step altogether. While this PR's approach isn't cost-free, it could come at a considerably lower total cost than the current approach.

For instance, with the changes in this PR, there's no need to run through the rewrite machinery to accomplish the same things as our current merging, and that machinery has a very real cost (e.g. cloning, numerous iterative object and type checks, etc.). As a result, there are considerably fewer limitations on when identity-based comparisons can be productively made between graph objects (e.g. within rewrites, graph-constructor/helper functions, etc.). Also, there are far fewer rewriting redundancies (e.g. no Feature callbacks induced by merging).

Currently, we have to wait for merge passes after every rewrite before we can reasonably ask when/if (sub)graphs are equivalent. This adds complexity and constraints to the construction and use of rewrite passes and the databases in which they reside.
This also means that we need to perform all the work in a rewrite until we get to a point where an equivalence check is necessary, then we need to let such checks fail and try again after another merge rewrite cleans things up. These equivalence-based situations are probably the most evident in shape-related rewrites and/or Op.infer_shape and relate directly to #1110.
Anyway, that's just one set of limitations that can't be easily addressed without changes like these.

These changes also make changes like #996 considerably more straightforward, and they directly address things like #724.

More importantly, if we ever want to cache and/or memoize rewrite steps, we'll need a more suitable object model—like the one here.

@brandonwillard brandonwillard changed the title Hash cons Applys and Constants Hash-cons Applys and Constants Sep 10, 2022
@brandonwillard brandonwillard force-pushed the use-hash-cons-Apply-and-Constant branch 2 times, most recently from 4b81904 to 8b69775 on September 11, 2022 06:31
@brandonwillard
Member Author

brandonwillard commented Feb 2, 2023

This was covered in some Gitter chats (e.g. some egraph conversations here) months back, but it should be here as well: with these changes in place, we can more efficiently—and easily—implement egraphs and Equality Saturation as a Feature or FunctionGraph extension.
