Hash-cons `Apply`s and `Constant`s #1165
Conversation
Do we ever change `Apply` inputs in-place behind the scenes?

Yes, in …
Any chance trying this could give us enough speedup without having to perform such fundamental changes?

Edit: The suggested improvement in that issue was actually already implemented: `aesara/aesara/graph/rewriting/basic.py`, lines 772 to 774 in 7550668.

We're not looking for just a speed-up here. We're trying to improve our graph object model, rewriting process, and graph equivalence checks in general so that we can arrive at a place where #1082 (and other "meta" rewriting work) is feasible in Aesara. Regardless, incremental changes to …

For instance, with the changes in this PR, there's no need to run through the rewrite machinery to accomplish the same things as our current merging, and that machinery has a very real cost (e.g. cloning, numerous iterative object and type checks, etc.). As a result, there are considerably fewer limitations on when identity-based comparisons can be productively made between graph objects (e.g. within rewrites, graph-constructor/helper functions, etc.). Also, there are far fewer rewriting redundancies (e.g. no …).

Currently, we have to wait for merge passes after every rewrite before we can reasonably ask when/if (sub)graphs are equivalent. This adds complexity and constraints to the construction and use of rewrite passes and the databases in which they reside.

These changes also make changes like #996 considerably more straightforward, and they directly address things like #724. More importantly, if we ever want to cache and/or memoize rewrite steps, we'll need a more suitable object model, like the one here.
This was covered in some Gitter chats (e.g. some egraph conversations here) months back, but it should be here as well: with these changes in place, we can more efficiently, and more easily, implement egraphs and Equality Saturation as a …
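As a rough illustration of the egraph connection: once structurally equal terms are interned to a single object, an egraph only needs to record the equivalences *introduced by rewrites*, e.g. with a simple union-find. This is an illustrative sketch with string stand-ins for terms, not Aesara's or any egraph library's actual API:

```python
# Union-find over interned terms; strings stand in for hash-consed nodes.
parent = {}

def find(x):
    # Path-compressing find: follow parents to the representative.
    while parent.setdefault(x, x) != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(a, b):
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[ra] = rb

# With hash-consing, structurally equal terms are already the same object,
# so only rewrite-derived equivalences (e.g. commutativity) need unions:
union("add(x, y)", "add(y, x)")
```
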
This is an experimental PR that attempts to use hash-consing to restrict equivalent `Apply`s and `Constant`s to distinct representative instances. These changes should remove the need to perform "merging" as a distinct rewrite step, since it would now be performed by the `Apply` and `Constant` "constructors". In other words, `Apply(op, [v0, v1], ...) is Apply(op, [v0, v1], ...)` and `Constant(t1, x) is Constant(t1, x)`.
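The identity guarantee above can be sketched with a small interning constructor. This is a hypothetical, minimal illustration (a plain `dict` cache and string stand-ins for ops and inputs), not Aesara's actual `Apply` class:

```python
class Apply:
    """Minimal hash-consed Apply sketch: structurally equal nodes are interned."""

    _interned = {}  # cache: (op, inputs) -> canonical instance

    def __new__(cls, op, inputs):
        key = (op, tuple(inputs))
        node = cls._interned.get(key)
        if node is None:
            node = super().__new__(cls)
            node.op = op
            node.inputs = tuple(inputs)  # kept immutable so the cache key stays valid
            cls._interned[key] = node
        return node

# Structurally equal applications are now the *same* object:
a = Apply("add", ("x", "y"))
b = Apply("add", ("x", "y"))
```

With this in place, `a is b` holds, which is exactly what makes merging a constructor-level concern rather than a rewrite pass. (A production version would also need to handle garbage collection of the cache, e.g. via weak references.)
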
In order for this notion of equivalence to work, we need immutable `Apply`s, which means that we need to stop updating `Apply.inputs` in-place. To accommodate that requirement, `FunctionGraph.change_node_input` can create new `Apply` instances with the updated inputs. Afterward, the old `Apply` nodes need to be replaced by the new ones somehow.

These changes overlap with the work in #666; however, we would like to avoid the need to clone and replace every `Apply` node that uses a replaced node's outputs as inputs (i.e. we don't want to clone and replace all the dependent sub-graphs).

We should be able to avoid sub-graph cloning by moving the in-place updates to `Variable.owner` (i.e. re-assigning the output variables to the new `Apply`s). The problem with a change like this is that our `Feature.on_change_input` callback semantics will also need to change, because, for instance, the objects passed to those methods will no longer have the same expected relationships with each other. For example, if we're no longer changing `Apply.inputs` in-place, a call to `Feature.on_change_input` with a newly constructed `Apply` node as the `node` argument will be invalid for `Feature`s that track the `node` arguments explicitly. In general, `Feature`s will need to explicitly account for any new/different relationship between a new node and an old/replaced one.

One approach to updating `Feature.on_change_input` involves the use of old and new node arguments. In this case, subscribers to the `Feature.on_change_input` callbacks will be responsible for handling the node replacements, which, in many existing cases, is reasonable. Also, if we're updating `Variable.owner` in each of the output variables of a replaced `Apply`, listeners can start using "representative" output variables to track specific node locations in a graph. In other words, if one tracks `out = node.outputs[0]` instead of `node`, then `out.owner` will always reflect the applicable `Apply` node after all/any subsequent changes to `node.inputs`.

We need to determine how distinct computations of an `Apply` node are represented in this scenario. Currently, `z = x + y` and `w = x + y` represent two distinct computations (e.g. expressed in the implementation with `z != w` being `True`). Under these changes, `z is w` could be `True`, and we would need to represent distinct computations of the same expression in some other way (e.g. via the use of distinct output `Variable`s that point to the same `Apply`).

I think the only reason we currently need distinct, duplicate computations is that it's a necessary part of our memory/data-flow model, which is itself needed for things like in-placing. Instead, we can work in reverse of how we currently do: make all equivalent expressions represent the same computation by default (i.e. the changes in this PR), then use special constructors/helper functions and rewrite passes that introduce distinct computations as required (e.g. for in-placing). In other words, we go from everything being a distinct computation by default and then merging the equivalent ones, to everything being equivalent (that actually is equivalent) by default and then introducing duplicates when needed.
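The "distinct output `Variable`s pointing to the same `Apply`" idea can be pictured with illustrative stand-in classes (these are simplified sketches, not Aesara's actual `Variable`/`Apply` implementations):

```python
class Apply:
    """Stand-in for a shared, hash-consed node."""

    def __init__(self, op, inputs):
        self.op = op
        self.inputs = tuple(inputs)

class Variable:
    """Stand-in for an output variable; `owner` is the Apply that computes it."""

    def __init__(self, name, owner=None):
        self.name = name
        self.owner = owner

# One shared node, two distinct computations of the same expression:
node = Apply("add", ("x", "y"))
z = Variable("z", owner=node)
w = Variable("w", owner=node)
```

Here `z is not w` (distinct computations), yet `z.owner is w.owner` (one shared `Apply`), which is the inversion described above: equivalence by default at the node level, with duplication expressed at the `Variable` level.
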
Again, distinct output `Variable`s could be used to represent distinct computations, since multiple `Variable`s can have their `.owner` values set to the same `Apply` node. The primary difference from our current approach is that the `Apply` nodes associated with those distinct `Variable`s will not be distinct.

Our in-placing rewrites, `DestroyHandler`, and `Merge[Optimizer|Feature]` work together, with the backtracking provided by the `History` `Feature`, to incrementally apply the equivalences between equivalent expressions. That approach is oriented toward other concerns and doesn't really suit the needs of equational reasoning, term rewriting, and other areas of work that make use of these equivalences.
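A rough sketch of the `change_node_input`-plus-`Variable.owner` mechanism described above, using stand-in classes and an assumed old/new-node callback signature (this is not Aesara's current API; the names and signature are illustrative):

```python
class Variable:
    def __init__(self, owner=None):
        self.owner = owner  # the Apply that computes this Variable, if any

class Apply:
    def __init__(self, op, inputs, n_outputs=1):
        self.op = op
        self.inputs = tuple(inputs)  # treated as immutable
        self.outputs = [Variable(owner=self) for _ in range(n_outputs)]

def change_node_input(node, index, new_input, listeners=()):
    """Replace one input by building a new Apply instead of mutating in place."""
    new_inputs = list(node.inputs)
    new_inputs[index] = new_input
    new_node = Apply(node.op, new_inputs)
    # Re-point the *existing* output Variables at the new node, so dependent
    # sub-graphs (which reference these Variables) need no cloning.
    new_node.outputs = node.outputs
    for out in node.outputs:
        out.owner = new_node
    for feature in listeners:
        # Assumed callback shape: pass both the old and the new node,
        # per the old/new-argument approach described above.
        feature.on_change_input(node, new_node, index, new_input)
    return new_node

x, y, y2 = Variable(), Variable(), Variable()
node = Apply("add", (x, y))
out = node.outputs[0]
new_node = change_node_input(node, 1, y2)
```

After the call, anything that tracked `out` still sees the current node through `out.owner`, even though `node` itself was never mutated.
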