Consider removing or redesigning the `triangular` operator #768

a-sully · 2024-10-18T00:04:23Z

I was looking into implementing the triangular operator in Chromium's CoreML backend when I realized... all three of Chromium's WebNN backends have to emulate the triangular operator using a mask!

Backend	Strategy
DirectML	May use the `DML_DIAGONAL_MATRIX1_OPERATOR_DESC` operator in some circumstances, but otherwise must fall back to creating a mask
TFLite	Always creates a(n unoptimized, for now) mask
CoreML	May use the `band_part` operator in some circumstances, but otherwise must fall back to creating a mask

Notably, CoreML's band_part operator is similar to DirectML's DML_DIAGONAL_MATRIX1_OPERATOR_DESC operator, except it's not capable of handling cases where the main diagonal is excluded from the mask. See https://crbug.com/374127244 for details.

triangular was added as part of in #478. Why was this specific behavior chosen? Did we consider alternative designs, especially since the operator as specified doesn't naively map to operators provided by any of the backends we're prototyping with?

What are the use cases we care about? Would an alternative design that aligns more closely with band_part and DML_DIAGONAL_MATRIX1_OPERATOR_DESC be sufficient for those use cases? For example, we could consider replacing triangular with a diagonal (naming TBD) operator which starts from the main diagonal and offers to include additional diagonals in the upper and lower triangles. This would trivially map to the aforementioned CoreML and DirectML operators

The text was updated successfully, but these errors were encountered:

fdwr · 2024-10-25T01:37:38Z

So CoreML's band_part appears to miss the shift offset parameter, found in other API's...

numpy.tri(m, k=0) https://numpy.org/doc/stable/reference/generated/numpy.tri.html
numpy.triu(m, k=0) https://numpy.org/doc/stable/reference/generated/numpy.triu.html
numpy.tril(m, k=0) https://numpy.org/doc/stable/reference/generated/numpy.tril.html
torch.triu(input, diagonal=0, *, out=None) https://pytorch.org/docs/stable/generated/torch.triu.html
torch.tril(input, diagonal=0, *, out=None) https://pytorch.org/docs/stable/generated/torch.tril.html
Trilu k https://onnx.ai/onnx/operators/onnx__Trilu.html
tf.linalg.diag k parameter (similar) https://www.tensorflow.org/api_docs/python/tf/linalg/diag

...,and unfortunately band_part has a very special meaning (higher level policy) for -1 in the lower and upper offsets rather than just allowing a free form positive/negative range which would permit range shifting. 🤔 The limitation that the diagonal can't be excluded is the clincher.

Why was this specific behavior chosen?

Inspired by the existing API's above.

What are the use cases we care about?

It's used in Stable Diffusion's text encoder:

Though, the number of occurrences within a single model file (SD CLIP text encoder) is few - just one!

... since the operator as specified doesn't naively map to operators provided by any of the backends we're prototyping with? ...

DML_DIAGONAL_MATRIX1_OPERATOR_DESC supports it directly.

May use the DML_DIAGONAL_MATRIX1_OPERATOR_DESC operator in some circumstances, but otherwise must fall back to creating a mask...

The fallback is really for older DML versions (which shouldn't matter anymore after framework packages complete and the newest version is always available), but I see that 4D branch you're talking about (maybe I should extend DIAGONAL_MATRIX1 for the >4D case either way).

operator which starts from the main diagonal and offers to include additional diagonals in the upper and lower triangles

Unfortunately the primary motivating case (SD CLIP text encoder) excludes the diagonal line (k=1), and so that would not be sufficient.

      "input": [[0, 1, 2, 3],
                [4, 5, 6, 7],
                [8, 9,10,11]],
      "k": 1,
      "output": [[0, 1, 2, 3],
                 [0, 0, 6, 7],
                 [0, 0, 0,11]],

So, given the backend inconsistencies, rarity of occurrence in the model, and inability of CoreML to implement it directly, maybe this operator should be migrated to an aggregate operator (that functional concept Ningxin presented at TPAC2024), or decomposed when incompatible. I thought up this possible higher-level decomposition:

tallTensor = iota(tallShape, {start:0, step: 1}); // sizes = [10,1], broadcast later in add
wideTensor = iota(wideShape, {start:0, step:-1}); // sizes = [1,10], broadcast later in add
// To select the lower triangle including diagonal, set shiftDelta = 0
// To select the lower triangle excluding diagonal, set shiftDelta = 1
// To select the upper triangle including diagonal, use lesserOrEqual and shiftDelta = 0.
booleanMask = greaterOrEqual(add(tallTensor, wideTensor), shiftDelta);
output = /*select*/where(booleanMask, input, 0.0);

// iota is a simple helper function like std::iota, filling a sequence and returning a reshaped tensor.
// It could be implemented by uploading a constant sequence generated CPU-side, or generated
// on-the-fly via cumulativeSum.

Visualizing the broadcasted sum to generate a comparison mask:

Interested to hear @huningxin's thoughts too.

a-sully · 2024-10-28T21:24:58Z

Thanks for the investigation, @fdwr!

maybe this operator should be migrated to an aggregate operator

I'm sure you'll be shocked to hear that I'm receptive to removing explicit support for this operator in WebNN :)

I'd like to clarify the your use of the word "migrated", though. Are you suggesting that removal of this operator should be blocked on implementation of the "aggregate operator" proposal?

Given the lack of cross-platform support and the straightforward decomposition (which browser backends are doing some form of today anyways!) I personally don't think we should block on that 🤔

anssiko added the operator specific label Oct 24, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Consider removing or redesigning the `triangular` operator #768

Consider removing or redesigning the `triangular` operator #768

a-sully commented Oct 18, 2024

fdwr commented Oct 25, 2024 •

edited

Loading

a-sully commented Oct 28, 2024

Consider removing or redesigning the triangular operator #768

Consider removing or redesigning the triangular operator #768

Comments

a-sully commented Oct 18, 2024

fdwr commented Oct 25, 2024 • edited Loading

a-sully commented Oct 28, 2024

Consider removing or redesigning the `triangular` operator #768

Consider removing or redesigning the `triangular` operator #768

fdwr commented Oct 25, 2024 •

edited

Loading