Implement aesara.tensor.matmul #744

Merged
merged 1 commit into from
Jul 30, 2022

Conversation

zoj613
Member

@zoj613 zoj613 commented Jan 11, 2022

closes #488

This implements an aesara equivalent of np.matmul.

The behavior depends on the arguments in the following way.

If both arguments are 2-D they are multiplied like conventional matrices.

If either argument is N-D, N > 2, it is treated as a stack of matrices residing in the last two indexes and broadcast accordingly.

If the first argument is 1-D, it is promoted to a matrix by prepending a 1 to its dimensions. After matrix multiplication the prepended 1 is removed.

If the second argument is 1-D, it is promoted to a matrix by appending a 1 to its dimensions. After matrix multiplication the appended 1 is removed.

matmul differs from dot in two important ways:

Multiplication by scalars is not allowed.

Stacks of matrices are broadcast together as if the matrices were elements, respecting the signature (n,k),(k,m)->(n,m)
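The rules above can be checked directly against np.matmul, the reference this PR follows. The snippet below is a minimal sketch using NumPy only (not the Aesara Op added here); the array names are illustrative.

```python
import numpy as np

a2 = np.ones((3, 4))        # 2-D left operand
b2 = np.ones((4, 5))        # 2-D right operand
stack = np.ones((7, 3, 4))  # stack of seven (3, 4) matrices
vec_k = np.ones(4)          # 1-D operand of length k = 4

# 2-D @ 2-D: conventional matrix product.
shape_2d = np.matmul(a2, b2).shape             # (3, 5)

# N-D @ 2-D: the matrix lives in the last two axes; b2 is broadcast over the stack.
shape_stack = np.matmul(stack, b2).shape       # (7, 3, 5)

# 1-D first argument: (4,) -> (1, 4), multiply, then drop the prepended 1.
shape_vec_first = np.matmul(vec_k, b2).shape   # (5,)

# 1-D second argument: (4,) -> (4, 1), multiply, then drop the appended 1.
shape_vec_second = np.matmul(a2, vec_k).shape  # (3,)

# Unlike dot, multiplication by a scalar is rejected.
try:
    np.matmul(a2, 2.0)
    scalar_allowed = True
except (ValueError, TypeError):
    scalar_allowed = False
```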

References

https://numpy.org/doc/stable/reference/generated/numpy.matmul.html

@zoj613 zoj613 added the enhancement New feature or request label Jan 11, 2022
@zoj613 zoj613 requested a review from brandonwillard January 11, 2022 23:24
@codecov

codecov bot commented Jan 11, 2022

Codecov Report

Merging #744 (5abde4d) into main (8763981) will increase coverage by 0.01%.
The diff coverage is 100.00%.

❗ Current head 5abde4d differs from pull request most recent head b60030f. Consider uploading reports for the commit b60030f to get more accurate results

@@            Coverage Diff             @@
##             main     #744      +/-   ##
==========================================
+ Coverage   79.23%   79.25%   +0.01%     
==========================================
  Files         152      152              
  Lines       47943    48006      +63     
  Branches    10909    10933      +24     
==========================================
+ Hits        37990    38048      +58     
+ Misses       7453     7449       -4     
- Partials     2500     2509       +9     
| Impacted Files | Coverage Δ |
|---|---|
| aesara/tensor/nlinalg.py | 98.76% <100.00%> (+0.18%) ⬆️ |
| aesara/sparse/type.py | 72.11% <0.00%> (-2.66%) ⬇️ |
| aesara/link/numba/dispatch/tensor_basic.py | 97.95% <0.00%> (-2.05%) ⬇️ |
| aesara/link/basic.py | 85.00% <0.00%> (-1.50%) ⬇️ |
| aesara/tensor/shape.py | 90.93% <0.00%> (-1.29%) ⬇️ |
| aesara/tensor/subtensor_opt.py | 86.32% <0.00%> (-0.80%) ⬇️ |
| aesara/link/c/lazylinker_c.py | 65.95% <0.00%> (-0.71%) ⬇️ |
| aesara/link/c/cutils.py | 68.18% <0.00%> (-0.71%) ⬇️ |
| aesara/sparse/basic.py | 82.47% <0.00%> (-0.43%) ⬇️ |
| aesara/printing.py | 49.52% <0.00%> (-0.24%) ⬇️ |

... and 27 more

Member

@brandonwillard brandonwillard left a comment

Looks great!

The next two important methods that need to be implemented are Op.grad (or Op.L_op) and Op.infer_shape.

They're both optional, but they can really make or break the usefulness of an Op. At the very least, it's good to have "stubs" for those that explicitly say they aren't implemented.

You can use tests.unittest_tools.verify_grad to create numeric gradient tests (or even tests.tensor.utils.makeTester for a basic test suite), and tests.unittest_tools.InferShapeTester to automate the Op.infer_shape tests.
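For context, a numeric gradient test compares an analytic gradient against finite differences. The sketch below shows that idea with plain NumPy for the 2-D case; it is not the verify_grad utility itself, and numeric_grad is a hypothetical helper written for this illustration.

```python
import numpy as np

def numeric_grad(f, x, eps=1e-6):
    """Central-difference gradient of a scalar function f at x (hypothetical helper)."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(x.shape):
        step = np.zeros_like(x)
        step[idx] = eps
        grad[idx] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
a = rng.normal(size=(3, 4))
b = rng.normal(size=(4, 5))
w = rng.normal(size=(3, 5))  # fixed weights that turn the product into a scalar loss

def loss(a_):
    return np.sum(np.matmul(a_, b) * w)

analytic = w @ b.T               # d loss / d a for the 2-D case
numeric = numeric_grad(loss, a)  # finite-difference estimate of the same quantity
grad_ok = np.allclose(analytic, numeric, atol=1e-5)
```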

Member

@brandonwillard brandonwillard left a comment

Is it possible that np.matmul functionality can be implemented by a helper function that dispatches to the existing aesara.tensor.math.Dot and aesara.tensor.math.tensordot? If so, that would save considerable time and effort attempting to (re)implement the Op.grad and Op.infer_shape logic.

@zoj613 zoj613 changed the title WIP: Implement aesara.tensor.matmul WIP: Implement aesara.tensor.matmul Jan 12, 2022
@zoj613
Member Author

zoj613 commented Jan 12, 2022

> Is it possible that np.matmul functionality can be implemented by a helper function that dispatches to the existing aesara.tensor.math.Dot and aesara.tensor.math.tensordot? If so, that would save considerable time and effort attempting to (re)implement the Op.grad and Op.infer_shape logic.

Not sure this is best given that matmul behaviour is different from dot's depending on the shape of the inputs. See the PR description. I managed to implement the infer_shape method without much hassle.
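The difference is easiest to see with stacked inputs. The sketch below uses NumPy's dot and matmul (assumed here to mirror the semantics under discussion): dot pairs every matrix in the first stack with every matrix in the second, whereas matmul pairs them element-wise along the stack axis.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 4))
b = rng.normal(size=(2, 4, 5))

full = np.dot(a, b)        # all pairs of matrices: shape (2, 3, 2, 5)
paired = np.matmul(a, b)   # matching pairs only:   shape (2, 3, 5)

# The matmul result is the "diagonal" of the dot result over the two stack axes,
# so a dot-based helper would compute and then discard the off-diagonal blocks.
recovered = np.stack([full[i, :, i, :] for i in range(a.shape[0])])
same_on_diagonal = np.allclose(recovered, paired)
```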

@zoj613
Member Author

zoj613 commented Jan 12, 2022

> tests.unittest_tools.InferShapeTester to automate the Op.infer_shape tests.

Could you elaborate on this part? Do I necessarily have to implement a test_infer_shape method, or is this automated by inheriting from InferShapeTester? I see that some tests implement this method and some don't. I'm not sure what the best practice is here.

@brandonwillard
Member

> > tests.unittest_tools.InferShapeTester to automate the Op.infer_shape tests.

> Could you elaborate on this part? Do I necessarily have to implement a test_infer_shape method, or is this automated by inheriting from InferShapeTester? I see that some tests implement this method and some don't. I'm not sure what the best practice is here.

Yeah, you need to create a subclass and then call the method(s) it provides. It can be a useful tool, but, if you want to make custom tests, that's also fine.

@zoj613
Member Author

zoj613 commented Jan 12, 2022

> > > tests.unittest_tools.InferShapeTester to automate the Op.infer_shape tests.

> > Could you elaborate on this part? Do I necessarily have to implement a test_infer_shape method, or is this automated by inheriting from InferShapeTester? I see that some tests implement this method and some don't. I'm not sure what the best practice is here.

> Yeah, you need to create a subclass and then call the method(s) it provides. It can be a useful tool, but, if you want to make custom tests, that's also fine.

as in _compile_and_check?
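Whichever helper is used, a shape-inference test boils down to predicting the output shape from the input shapes alone and comparing it with the shape of the computed result. The following is a rough NumPy-only sketch of that comparison; matmul_out_shape is a hypothetical function written for this illustration and is not the infer_shape code from this PR.

```python
import numpy as np

def matmul_out_shape(a_shape, b_shape):
    """Predict np.matmul's output shape from the input shapes (hypothetical helper)."""
    if len(a_shape) == 0 or len(b_shape) == 0:
        raise ValueError("matmul does not accept scalars")
    a_vec, b_vec = len(a_shape) == 1, len(b_shape) == 1
    # Promote 1-D inputs to matrices, as described in the PR description.
    a_full = (1,) + tuple(a_shape) if a_vec else tuple(a_shape)
    b_full = tuple(b_shape) + (1,) if b_vec else tuple(b_shape)
    if a_full[-1] != b_full[-2]:
        raise ValueError("core dimensions do not match")
    # Leading (stack) dimensions broadcast; the last two follow (n,k),(k,m)->(n,m).
    batch = np.broadcast_shapes(a_full[:-2], b_full[:-2])
    core = (() if a_vec else (a_full[-2],)) + (() if b_vec else (b_full[-1],))
    return tuple(batch) + core

cases = [
    ((3, 4), (4, 5)),
    ((7, 3, 4), (4, 5)),
    ((4,), (4, 5)),
    ((3, 4), (4,)),
    ((4,), (4,)),
    ((2, 1, 3, 4), (5, 4, 6)),
]
# Compare the predicted shape with the shape NumPy actually produces.
shapes_agree = all(
    matmul_out_shape(sa, sb) == np.matmul(np.ones(sa), np.ones(sb)).shape
    for sa, sb in cases
)
```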

@zoj613 zoj613 force-pushed the matmul branch 3 times, most recently from 8ad2797 to fcb2d5c Compare January 15, 2022 09:52
Comment on lines 129 to 131
self.rng = np.random.default_rng(utt.fetch_seed())
self.op = matmul
self.op_class = MatMul
Member

This use of a shared RNG state induces test method order dependence; instead, if you create and seed the RNG objects within each independent test unit, they can be run in any order (and in parallel) and produce consistent results.

Member Author

Just to be clear: by "test unit" you mean each independent test_* method?

Member

> Just to be clear: by "test unit" you mean each independent test_* method?

Yes

Member

It looks like all the tests are sharing the same class-level RNG object. As I mentioned earlier, this will make the results order-dependent. We need to construct the RNG object within each individual test in order to avoid that.

N.B. A fixture could be used if you don't want to copy-paste the RNG construction code each time.
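A small sketch of the order-dependence issue, using NumPy only. SEED, make_rng, case_square, and case_stack are hypothetical names for this illustration (SEED stands in for utt.fetch_seed()); in a pytest suite, make_rng could presumably be wrapped in a fixture, as suggested above.

```python
import numpy as np

SEED = 1234  # hypothetical stand-in for utt.fetch_seed()

def make_rng():
    """Build a freshly seeded generator; each test case calls this itself."""
    return np.random.default_rng(SEED)

def case_square():
    rng = make_rng()
    return rng.normal(size=(2, 2))

def case_stack():
    rng = make_rng()
    return rng.normal(size=(3, 2, 2))

# Per-case RNGs: the draws do not depend on the order the cases run in.
square_first, stack_second = case_square(), case_stack()
stack_first, square_second = case_stack(), case_square()
order_independent = (
    np.array_equal(square_first, square_second)
    and np.array_equal(stack_first, stack_second)
)

# One shared RNG: the same "case" sees different data when the order changes.
shared = np.random.default_rng(SEED)
shared_square_a = shared.normal(size=(2, 2))
shared = np.random.default_rng(SEED)
_ = shared.normal(size=(3, 2, 2))          # another case consumed the stream first
shared_square_b = shared.normal(size=(2, 2))
order_dependent = not np.array_equal(shared_square_a, shared_square_b)
```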

@zoj613 zoj613 marked this pull request as ready for review January 16, 2022 08:16
@zoj613 zoj613 changed the title WIP: Implement aesara.tensor.matmul Implement aesara.tensor.matmul Jan 16, 2022
@zoj613 zoj613 force-pushed the matmul branch 2 times, most recently from 701a93e to a336429 Compare January 21, 2022 07:21
@Sayam753

@zoj613 is there anything left to be completed in this PR?

PR #808 is blocked because it needs the matmul operation implemented in this PR.

@brandonwillard
Member

It looks like this can't be rebased by maintainers, and we need to (squash and) rebase to make sure it passes with all the recent changes—especially the shape-inference-related ones. @zoj613, is "Allow edits and access to secrets by maintainers" not enabled/checked?

@Sayam753

Sayam753 commented Jul 7, 2022

@brandonwillard is this PR ready to be merged?

@brandonwillard brandonwillard added Op implementation Involves the implementation of an Op and removed new Op labels Jul 11, 2022
Member

@brandonwillard brandonwillard left a comment

Some minor changes are needed; otherwise, it looks good.

@purna135
Contributor

Thank you so much for this PR, @zoj613! I'm keeping an eye on it, and it appears that some minor changes are required before it can be merged. Are you able to get this PR over the finish line? I have a strong dependency on it.

@brandonwillard
Member

brandonwillard commented Jul 23, 2022

I've just noticed that numpy.matmul is essentially a ufunc of dot. With that in mind, the right way to do this is to first close #695 and use that functionality to implement this.
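To illustrate the observation on the NumPy side (this is not the Aesara functionality tracked in #695), np.vectorize can lift the 2-D dot to a generalized-ufunc-style function with the core signature (n,k),(k,m)->(n,m). The name dot_gufunc is hypothetical, and this sketch does not cover the 1-D promotion rules.

```python
import numpy as np

# Lift the 2-D dot over any leading (stack) dimensions.
dot_gufunc = np.vectorize(np.dot, signature="(n,k),(k,m)->(n,m)")

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 4))  # stack of two (3, 4) matrices
b = rng.normal(size=(4, 5))     # single matrix, broadcast over the stack

lifted = dot_gufunc(a, b)
same_as_matmul = np.allclose(lifted, np.matmul(a, b))

# In recent NumPy releases, matmul is itself exposed as a (generalized) ufunc.
matmul_is_ufunc = isinstance(np.matmul, np.ufunc)
```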

In the meantime, I've updated the tests and docstrings.

@brandonwillard brandonwillard added NumPy compatibility tensor algebra Relates to our use and representations of tensor algebra labels Jul 23, 2022
@brandonwillard brandonwillard force-pushed the matmul branch 2 times, most recently from b9e3b2c to 7e0f94c Compare July 23, 2022 21:03
@purna135
Contributor

Hello, @brandonwillard. Is this PR now ready to be merged?
I have some PRs that are waiting for this one to be merged, because my work depends on matrix_inverse in #808, which is in turn blocked by this PR :(

Member

@brandonwillard brandonwillard left a comment

We can merge in order to move #808 along, but we need to keep the associated issue open until at least a gradient implementation is provided.

Preferably, we should have both a gradient and non-Python implementation for the addition of an Op, and, in this case, #757 is where our efforts need to be focused to accomplish that.

@brandonwillard brandonwillard merged commit 2246504 into aesara-devs:main Jul 30, 2022
@zoj613
Member Author

zoj613 commented Jul 30, 2022

I am looking into adding a grad/L_op/R_op method for this. It appears that it won't be as straightforward as reusing the one for the Dot Op, since matmul has slight differences in behavior, especially for dimensions > 2.
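One place the extra work shows up is broadcasting: when an input is broadcast across a stack, its gradient has to be summed back over the broadcast axes. The NumPy sketch below illustrates this for one shape combination only; it is an assumption-laden illustration, not the gradient that was later implemented.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.normal(size=(2, 3, 4))  # stack of two (3, 4) matrices
b = rng.normal(size=(4, 5))     # single matrix, broadcast across the stack
g = rng.normal(size=(2, 3, 5))  # upstream gradient with respect to matmul(a, b)

# Gradient for a: the 2-D rule applied per matrix in the stack.
grad_a = np.matmul(g, b.T)                                 # shape (2, 3, 4)

# Gradient for b: the 2-D rule per matrix, then summed over the broadcast stack axis.
grad_b = np.matmul(np.swapaxes(a, -1, -2), g).sum(axis=0)  # shape (4, 5)

# Reference: apply the plain 2-D formulas one matrix at a time.
ref_a = np.stack([g[i] @ b.T for i in range(a.shape[0])])
ref_b = sum(a[i].T @ g[i] for i in range(a.shape[0]))
grads_match = np.allclose(grad_a, ref_a) and np.allclose(grad_b, ref_b)
```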

@brandonwillard
Member

> I am looking into adding a grad/L_op/R_op method for this. It appears that it won't be as straightforward as reusing the one for the Dot Op, since matmul has slight differences in behavior, especially for dimensions > 2.

Yeah, I realized that the logic for doing that is a slightly specialized form of the logic we need in #757, so it's better that we focus our efforts there.

@zoj613 zoj613 deleted the matmul branch July 30, 2022 21:58
Labels
enhancement New feature or request NumPy compatibility Op implementation Involves the implementation of an Op tensor algebra Relates to our use and representations of tensor algebra
Development

Successfully merging this pull request may close these issues.

Implement aesara.tensor.matmul
5 participants