-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Matrix reduce to Vector with FIRST or SECOND #28
Comments
The algorithms I'm using for reduce-to-vector would break if the operator is not associative and commutative. FIRST and SECOND wouldn't work at all. Slower methods could work, but then I would have to have two sets of reduce-to-vector methods: one where the operator is well-behaved, and a slower one where it isn't. That's a lot of extra code to support. |
Thanks. I understand this is the case for generic binary operators. If the internal data formats are sorted, I would expect there to be efficient specialized implementations for FIRST and SECOND. Would you be okay with creating special implementations for these, or do you think this would get too messy? If you're okay with a specialized approach, would this be a good "first issue" to implement for new contributors? |
fwiw, I can't think of a meaningful example when restricted to the current spec where a Monoid can't be used instead of a BinaryOp. So, I think it's fine that you don't fully support reduce with BinaryOp, and perfectly understandable if you don't want to add special code to support a couple edge cases (in this case, FIRST and SECOND). A better, longer term approach may be to address some of the deficiencies of using BinaryOps (which there are several). In particular, it would be great to have generic machinery in place that can perform a reduction with FIRST or SECOND in parallel. I believe this is doable, but we'd need to determine an API. For another example, suppose one created a user-define type that is a struct. With generic, parallel machinery in place, I would expect one could perform a reduction to e.g. add all the elements of one of the fields in the struct. |
I think the API would be different. There's a similar issue that arises in the CC problem, where an assignment C(I) max= A is done, using the max as the accumulator operator. But in this case, if the index vector has duplicates, they want to apply the accum operator to reduce down to a final result. This is not something the C API for GraphBLAS defines. Using FIRST and SECOND for the accum operator is not unlike the kind of reduction you want to do, if C is a vector and A a matrix. These two cases are different, but they are related in changing what it means to do a reduction. |
That's an interesting case with "assign" with duplicate indices. I see you discuss this (or something very similar) in section 9.9. If I understand correctly, I think what you want here is an explicit "scatter" operation (to a list of indices) that accepts a binaryop/monoid to perform the reduction when values are scattered to the same index. This reduction could happen before the accumulation into the output, so, per normal GraphBLAS operations, accum could be a different binop. Fwiw, I would love to have a functional scatter as well where a function is used to determine which index a value is scattered to. For interested readers (e.g., my future self): "build" is another place where reduction is kind of special. Could one expect to get better performance if a Monoid were accepted instead of a BinaryOp? SuiteSparse does accept FIRST and SECOND here and is well-defined where the spec is undefined if the binop is not associative or commutative. |
That's right ... "build" is special.
I have to accept FIRST and SECOND here, because I also use "build" to sum
up my "pending tuples".
If there are two assignments, in two calls to GrB_assign, the first gets
overwritten by the second.
So if the two calls to GrB_assign both leave pending tuples around, I have
an "implied SECOND" operator to
assemble the duplicates.
here the FIRST and SECOND are defined by which call to GrB_assign came
first, so it's very well defined.
If an accum operator is used in the two calls to GrB_assign, then this
becomes my "dup" operator for "build",
and it cannot be assumed to be associateve / commutative.
…On Fri, Feb 26, 2021 at 5:41 PM Erik Welch ***@***.***> wrote:
That's an interesting case with "assign" with duplicate indices. I see you
discuss this (or something very similar) in section 9.9. If I understand
correctly, I think what you want here is an explicit "scatter" operation
(to a list of indices) that accepts a binaryop/monoid to perform the
reduction when values are scattered to the same index. This reduction could
happen before the accumulation into the output, so, per normal GraphBLAS
operations, accum could be a different binop. Fwiw, I would *love* to
have a functional scatter as well where a function is used to determine
which index a value is scattered to.
For interested readers (e.g., my future self): "build" is another place
where reduction is kind of special. Could one expect to get better
performance if a Monoid were accepted instead of a BinaryOp? SuiteSparse
does accept FIRST and SECOND here and is well-defined where the spec is
undefined if the binop is not associative or commutative.
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#28 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AEYIIOLXDL73HUSWJ42RHP3TBAWQ7ANCNFSM4YI6NRFA>
.
|
I think what you really need is not a reduce-to-vector with FIRST and SECOND. That is ill-defined for any other binary op. What you really need is a kind of GxB_select that can depend on the entry count in a given row or column. "Give me just the first 4 entries in each row, starting from the lowest column indices". That step appears in CC, but GraphBLAS can't do it so the LAGraph CC exports the matrix and does it itself. |
It's good to bring select into the conversation, but I really would like a better reduction with binaryop. For example, performing a reduction on one field of a user-defined type that is a struct. To do such a generic reduction well (and able to be done in parallel, so I will be talking about doing reductions in groups), I think at least four things are needed:
For example:
A further enhancement (that I'm not pushing for, but think is important to be aware of) would be to allow the type If order is important (i.e., for reductions that don't commute), then there is a natural order to do things: along a column, along a row, or along a vector. For matrix reduce to scalar, we'd need to choose a convention to linearize the elements (for example, like it were C-contiguous or row-wise), and setting the transpose descriptor on the input would choose the other option for linearizing a matrix. FIRST and SECOND make perfect sense in view of this style of reduction. For these, the unaryop would simply be the identity, and the combine operator would be the same as the reducing binaryop (FIRST or SECOND). If any of this functionality were available, I would surely use it. I don't need it though, and I think having this conversation is more important atm. |
I like that your select variant has a clean API that is easy to understand. Instead of operating on the absolute index I or J of an element, it operates on, say, Regarding reduction on user-defined binaryops again, there are many more operations that could be done with my proposal above when combined with user-defined types:
It is common in modern data analytics system to have user-defined aggregation functions in a form similar to initialize-update-merge-evaluate. Variations exist, and I should point to literature or documentation... |
The GxB_select method isn't a good place to do any kind of reduction or
aggregation. An algorithm to do that would be a very different one than
what GxB_select currently does. The method is point-wise, in the sense
that the select op only depends on one value A(i,j) and its position. The
result for one A(i,j) can't depend on values nearby, currently.
The idea of a select op that can take as input an integer k where the entry
in question A(i,j) is the kth entry in its row, could be very useful but
I'm struggling with trying to define it clearly so it can be implemented
efficiently. If the select op takes in 2 integers, say k and s for an
entry A(i,j) that is the kth entry in its row and sth entry in its column,
then this would be very very hard to implement efficiently. I don't have
that information in the matrix. To compute it, I would have to transpose
the matrix, and then operate on both the matrix and its transpose at the
same time. That would be very very slow ... and if the select op is
user-defined, I might not know that the selectop doesn't need the value s
anyway, so all this extra work is useless. And it is a *lot* of extra work.
So for now, I don't see how I can add a selectop that depends on both k and
s.
Maybe just k ... but what if the matrix is held by column and k is the
ordinal position in the row? I'd need a costly transpose. That at least
is more feasible, however, than depending on both k and s.
Computing a mean or mode in a given row is really not well suited to
GxB_select. The mean can be done with GrB_mxv already, or reduce-to-vector
(which is the same thing actually). Even sum(x^2) could be done, to sum up
the squares of entries in each row. That could be done with a single call
to GrB_mxv with the right semiring. The monoid is plus, and the
multiplicative operator would be f(x,y) = x^2. I don't have that operator
as a built-in, but it could be added as a user-defined op. Currently,
user-defined ops and types are slow ... but my plan is to eventually allow
them to be all just as fast as built-in ops, by using a jit.
The "finalize" step is out of scope of GrB_mxv; it can be done quickly as a
GrB_eWiseMult with the div operator and a vector holding the degree.
Computing the mode or median of each row is very problematic. It's not
expressible as a semiring, nor as a reduce-to-vector operation.
…On Fri, Apr 9, 2021 at 12:50 PM Erik Welch ***@***.***> wrote:
I like that your select variant has a clean API that is easy to
understand. Instead of operating on the absolute index I or J of an
element, it operates on, say, ith and jth, which is the count of number
of elements previously seen along this row, column, or vector. It's perhaps
a little awkward if both ith and jth are needed. Otherwise, the API can
match your select or the new select. I don't have a feel for how common or
useful such an operation would be.
Regarding reduction on user-defined binaryops again, there are many more
operations that could be done with my proposal above when combined with
user-defined types:
- Get up to the first k elements.
- it may be difficult to stop the iteration once k elements are
seen.
- Get a random k elements.
- Determine if rows are monotonically increasing or decreasing.
- Calculate approximate percentiles.
- Calculate approximate count distinct.
- Compute a checksum.
- Compute the sum and count at the same time
- If there were a finalize step at the end, this could then compute
the mean.
- Similarly, if there were a finalize, one could first compute the
count, sum(x), and sum(x**2), and then calculate the standard deviation
from these all in a single pass.
- We can take this further and use it to compute a linear
regression!
- Lots more: power mean, geometric mean, central mean (two passes),
log-sum-exp, skewness (two passes), etc.
It is common in modern data analytics system to have user-defined
aggregation functions in a form similar to
initialize-update-merge-evaluate. Variations exist, and I should point to
literature or documentation...
—
You are receiving this because you commented.
Reply to this email directly, view it on GitHub
<#28 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AEYIIOPKAAQ3TZHFAOTLA7DTH446FANCNFSM4YI6NRFA>
.
|
Thanks for taking the time to reply. I think I can do a better job explaining what I consider most important in this issue, so please indulge me as I summarize below. This is a long comment, so please take as much time as you like. Summary
The ProposalThe goal:
Reduction methods include Matrix to Vector, Matrix to Scalar, Vector to Scalar, and building Matrices and Vectors with dup_op (this last one may require separate attention). Because order may matter during the reduction, I will call the group of elements that is being reduced over a list (if order didn't matter, I would call this a multiset). Since we require the reduction to be parallelizable, we allow the list of elements to be partitioned into sub-lists that can be operated on independently. Instead of a single BinaryOp, we require three operators:
If there are It's worth noting that there is no notion of identity of the missing elements or of the BinaryOp. Such an identity is strictly for Monoids and doesn't apply here. If the list being reduced is empty, then the result is empty. If it only has one element, then the result is the result of the UnaryOp. I realize this would be a lot of extra code to support. I'm not asking you to implement it (a JIT would be much more amazing!), but I do think it's worth having the conversation and keeping an eye open for use cases. Misc.Yeah, I understand the behavior and pitfalls of the variant of select you brought up. I'll let you know if I find any more use cases for it. I wasn't asking for mode or median. I was saying that if I were able to perform reductions with user-defined BinaryOps as per my proposal, then I could use it to efficiently compute approximate percentiles. And lots of other things in a single reduction operation. Going back a few comments, you described a situation that comes up in CC. If I understand correctly, I recognize this operation as a "scatter" with aggregation. I would love it if an astute reader of this issue would give this more thought! I'm sorry for repeating myself. There's been a lot of good/interesting information in this thread, and I wanted to refocus :) |
Per the spec, one can pass a BinaryOp to
GrB_Matrix_reduce_BinaryOp
to reduce the rows or columns of a Matrix to a Vector. SuiteSparse 4.0 changed so that only BinaryOps that correspond to known Monoids may be used. FIRST and SECOND do not have Monoids, therefore they cannot be used.Nevertheless, I expected FIRST and SECOND to work in SuiteSparse. The intent is clear, and I suspect efficient implementations exist.
(Note: per the spec, the outcome of a non-commutative or non-associative binary operator is undefined. FIRST and SECOND are non-commutative--they commute to each other--so their behavior would also be undefined with the current spec. SuiteSparse often implies an ordering when applying BinaryOps and chooses to be well-defined).
The text was updated successfully, but these errors were encountered: