
[Relax] Handle binary operations between Tensor and PrimValue #16827

Merged

Conversation

Lunderberg
Contributor

Prior to this commit, binary operations were only defined between two tensors. This commit allows binary operations to apply between a tensor and a relax::PrimValue.

When inferring the output StructInfo, binary operations with a PrimValue produce the same output as using a 0-d tensor. When legalizing operations containing a PrimValue, they are lowered to primitive TIR arguments.
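As a rough, illustrative sketch (not code taken from this PR; the function and argument names are made up), this allows a Relax function to combine a tensor operand with an R.Prim operand directly:

from tvm.script import relax as R

@R.function
def scale(x: R.Tensor((16,), "float32"), s: R.Prim("float32")) -> R.Tensor((16,), "float32"):
    # Before this change, `s` would first need to be wrapped in a 0-d tensor.
    # The inferred StructInfo of `y` is the same as if `s` were a 0-d tensor,
    # and legalization passes `s` through as a primitive TIR argument.
    y = x * s
    return y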

@Lunderberg Lunderberg force-pushed the relax_binary_operations_with_primvalue branch from 579125e to 687950e on April 1, 2024 at 12:43
Contributor

@slyubomirsky slyubomirsky left a comment


Thank you for pursuing these changes and also making a few refactors that improve readability. I have a couple of concerns listed below about how to handle Object types (I think arithmetic ops shouldn't accept them, though admittedly we presently don't have a way to express, "I would be fine with either a tensor or prim value").

Is there a particular use case for arithmetic with PrimValues and tensors? I guess it makes sense to be able to pass a PrimValue directly to one of these ops without requiring an explicit conversion. I would be a little hesitant to have arithmetic on PrimValues via shape expressions and then also pass them around to Relax arithmetic ops.

} else if (const auto* tensor = sinfo.as<TensorStructInfoNode>()) {
return tensor->dtype;
} else if (sinfo.as<ObjectStructInfoNode>()) {
return DataType::Void();
Contributor


Would this necessarily be expected behavior? An Object could be anything, including things for which a dtype makes no sense at all.

Contributor Author


Yeah, I went back and forth on it. There isn't currently a standard for whether FInferStructInfo should raise an error when the arguments are provably invalid, or if it should raise an error when the arguments are not provably valid. On the one hand, StructInfoLCA returns ObjectStructInfo as the common base class of TensorStructInfo and PrimStructInfo, so an ObjectStructInfo could contain a valid instance of either. On the other hand, the current struct inference requires that the input be validated as TensorStructInfo.

Overall, I'm not sure which is the better behavior. For now, I'm updating this PR to explicitly require either TensorStructInfo or PrimStructInfo, and to raise an exception for ObjectStructInfo, since allowing ObjectStructInfo would be an independent change.

Contributor


Personally, I'm in favor of asking for a MatchCast if we can't draw a conclusion. Down the line, inserting MatchCasts via normalization rules would be a good policy.
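For reference, a minimal sketch of the MatchCast approach (the names and shapes below are illustrative, not from this PR): the user asserts the expected StructInfo before handing the value to an arithmetic op.

from tvm.script import relax as R

@R.function
def add_one(x: R.Object) -> R.Tensor((16,), "float32"):
    # R.match_cast checks the annotation at runtime and refines the struct
    # info, so the following op sees a TensorStructInfo operand rather than
    # an ObjectStructInfo one.
    x_t = R.match_cast(x, R.Tensor((16,), "float32"))
    y = R.add(x_t, R.const(1.0, "float32"))
    return y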

Contributor Author


True. Currently, FInferStructInfo is called prior to FNormalize, so inference could be inspecting an expression that hasn't yet been normalized. This was useful for providing FNormalize for R.Prim (if PrimStructInfo contains a known value, in-line that value), but I'm wondering if we should re-visit that.

Comment on lines 50 to 52
} else if (lhs_sinfo.as<ObjectStructInfoNode>() && rhs_sinfo.as<ObjectStructInfoNode>()) {
return ObjectStructInfo();
}
Contributor


I'm not sure it's appropriate to accept Objects for an arithmetic operation. This relates to your idea about using normalization rules to turn type requirements into explicit checks with MatchCast, but I think this would be a case to ask a user to put in a MatchCast to assert the types work.

Comment on lines -44 to -46
ICHECK(n->dtype.is_int() && n->dtype.is_scalar()) << "TypeError: Relax only uses "
"scalar integer TIR variables, but gets: "
<< n;
Contributor


What was the reason for removing this requirement? Are we using handle-typed vars now?

Contributor Author


Good point. I needed to remove the n->dtype.is_int() check, as the value could be R.Prim("float32"), but the check for n->dtype.is_scalar() should be kept.

@Lunderberg
Contributor Author

Is there a particular use case for arithmetic with PrimValues and tensors?

Primarily for cases where a dynamic computation requires use of a dynamic shape (e.g. RMS_norm), or for simpler cases like computing the mean:

@R.function
def mean(A: R.Tensor(['m', 'n'], 'float32')) -> R.Tensor(['m'], 'float32'):
    n = T.int64()
    sum = R.sum(A, axis=1, keepdims=False)
    output = sum / n.astype('float32')  # Allow expressing this step.
    return output

@Lunderberg
Contributor Author

And the PR is now updated to require operands to have either TensorStructInfo or PrimStructInfo.

Contributor

@slyubomirsky slyubomirsky left a comment


Thanks for making the requested changes. I don't see much harm in adding support for such ops, though we should be mindful of the added complexity of having more ways to express equivalent computations. I don't think it's a problem for now.

tests/python/relax/test_op_binary.py (outdated review thread, resolved)
@Lunderberg
Contributor Author

@tqchen The requested change has been made, and CI is passing. Any other changes that should be made before merging?

} else if (const auto* tensor = sinfo.as<TensorStructInfoNode>()) {
return tensor->dtype;
} else {
LOG(FATAL) << "TypeError: "
Member


Originally our error message would ask for TensorStructInfo. In this particular case, would this error message be less informative than before? Given that this is a global change across all binary ops, it would be good to cross-check the usages here and make the error more informative.

Contributor Author


Good point, and the error message no longer tells the user which operation it was. Updated.

@tqchen
Member

tqchen commented Apr 5, 2024

Sorry, I didn't yet have time to do a full look-through; I will spend some time on it this weekend.

@Lunderberg
Contributor Author

No problem, and thank you. This isn't a high-priority PR to land, and can certainly wait until after the weekend.

@tqchen
Member

tqchen commented Apr 17, 2024

Thanks @Lunderberg, should be good to go after CI.

@Lunderberg
Contributor Author

@tqchen Thank you! I've resolved the unit test whose failure was specific to this PR.

There are a few other CI failures, which look like they're triggered by a bug in tvm.device('cuda').exist: if no GPUs are present, it raises an exception when it should return False. This is an independent failure mode, and I've submitted #16903, which resolves it.
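For reference, a minimal illustration of the expected behavior (this assumes a machine without a CUDA device):

import tvm

# With the fix, this should print False on a CPU-only machine rather than raise.
print(tvm.device("cuda").exist)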

@Lunderberg Lunderberg merged commit 622bd15 into apache:main Apr 18, 2024
18 checks passed
@Lunderberg Lunderberg deleted the relax_binary_operations_with_primvalue branch April 18, 2024 21:42