
How to handle sparse matrices when f(0) is not zero? #2597

Open
SteveBronder opened this issue Oct 5, 2021 · 14 comments

@SteveBronder
Collaborator

Description

I'm working on some of the testing for sparse matrices and had a quick Q. For a sparse matrix, how should we handle functions where an input of 0 returns a nonzero value, so that applying them elementwise would produce a dense matrix? I.e., for a sparse matrix X, calling acos(X), where acos(0) ≈ 1.57. Should the return be dense, or should we treat the unfilled cells as empty and ignore them, iterating only over the cells that hold values before the operation?

Current Version:

v4.1.0
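To make the two options concrete, here is a minimal sketch (not Stan Math code; a hypothetical dict-of-keys sparse representation in Python) contrasting "iterate only over stored cells" with "treat missing cells as 0, which densifies under acos":

```python
import math

# Hypothetical dict-of-keys sparse matrix: only nonzero cells are stored.
shape = (3, 3)
entries = {(0, 1): 0.5, (2, 0): -0.5}

# Option A: apply acos only to the stored cells, ignoring implicit zeros.
sparse_result = {ij: math.acos(v) for ij, v in entries.items()}

# Option B: treat every missing cell as 0; since acos(0) ~= 1.5708,
# every cell of the result holds a value -- the matrix has become dense.
dense_result = [[math.acos(entries.get((i, j), 0.0)) for j in range(shape[1])]
                for i in range(shape[0])]
```

Option A keeps two stored values; Option B produces nine, which is the crux of the question.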

@bob-carpenter
Contributor

Yes, you need to treat the matrix as having 0 entries everywhere there isn't an explicit non-zero value. That means if we apply exp(), for example, the result is non-zero everywhere.

I think the bigger question is whether the result should be a sparse matrix or a dense matrix. It feels like a sparse matrix makes more sense, in that the vectorized return type is the same as the input type. But dense would be more efficient. My inclination is to have the result be sparse.

The right thing to do for efficiency here is compute acos(0) once and copy it rather than creating a gazillion new autodiff variables. That at least could be cooked into the definition.
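The compute-once-and-copy idea can be sketched as follows (a hypothetical helper, not the Stan Math implementation): `f(0)` is evaluated a single time and that scalar is copied into every structurally zero cell, so only the stored entries get individual function evaluations.

```python
import math

def apply_densifying(f, shape, entries):
    """Apply f elementwise to a dict-of-keys sparse matrix, treating
    missing cells as zero. f(0) is computed once and the scalar result
    copied into every empty cell, rather than re-evaluating f per cell
    (or, in autodiff terms, creating a gazillion new autodiff variables)."""
    fill = f(0.0)  # one evaluation covers every structural zero
    out = [[fill] * shape[1] for _ in range(shape[0])]
    for (i, j), v in entries.items():
        out[i][j] = f(v)
    return out

m = apply_densifying(math.acos, (2, 2), {(0, 0): 1.0})
# m[0][0] == acos(1) == 0; the other three cells share the one acos(0) value
```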

I don't think we should try to get fancy and allow a non-zero value as the default for unspecified values.

@SteveBronder
Collaborator Author

imo I'd like a dense matrix, not only because it will be more efficient, but also because it makes the API very explicit that zeroed values will become nonzero.

@nhuurre
Collaborator

nhuurre commented Oct 5, 2021 via email

@bob-carpenter
Contributor

@nhuurre: The problem is that when you apply exp() to a sparse matrix, the result is dense. The question is really whether we want to auto-convert to a dense result for efficiency. @SteveBronder is suggesting a signature like this:

matrix cos(sparse_matrix);

because cos(0) != 0. But because sin(0) = 0, presumably we'd have

sparse_matrix sin(sparse_matrix);

This inconsistency is what worries me about returning a dense matrix from density-inducing functions like exp or cos.
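The split between the two signature families comes down to a single predicate on f. A minimal sketch (hypothetical helper name) of the test that would partition the elementwise functions:

```python
import math

def preserves_zero(f, tol=1e-15):
    """Decide whether elementwise f keeps sparsity: if f(0) == 0,
    structural zeros map to zeros and the result can stay sparse;
    otherwise applying f would densify the matrix."""
    return abs(f(0.0)) <= tol

# sin(0) = 0 -> sparse_matrix sin(sparse_matrix) is consistent
zero_preserving = preserves_zero(math.sin)
# cos(0) = 1 and exp(0) = 1 -> these are the density-inducing cases
densifying = not preserves_zero(math.cos) and not preserves_zero(math.exp)
```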

@nhuurre
Collaborator

nhuurre commented Oct 5, 2021 via email

@bob-carpenter
Contributor

why would you want to apply exp() to a sparse matrix?

I have no idea where this would come up in practice, but if it did, it would be more efficient to have the argument be sparse even if the result is dense.

My main objection is that it's confusing to have varying return types for elementwise operations applied to sparse matrices.

@SteveBronder
Collaborator Author

why would you want to apply exp() to a sparse matrix?
I have no idea where this would come up in practice, but if it did, it would be more efficient to have the argument be sparse even if the result is dense.

My main objection is that it's confusing to have varying return types for elementwise operations applied to sparse matrices.

Maybe this would be a good thing to bring up in the Stan meeting. I agree it's weird, but I think it's weird only because the scheme itself is kind of weird. The alternative here is to return a sparse_matrix[N, M] with every cell holding a value. My concern with that is that users will think, "Oh, I'll use a sparse matrix, it will be very fast since I have many zeros," then apply one of these non-zeroing functions and suddenly they're using ~3x the memory and not getting any speedup. imo, returning a dense matrix is sort of a red flag for the user: "this function will blow up your sparsity structure."

@bob-carpenter
Contributor

I understand the reasons to return a dense matrix. Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.
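That "simply disallow it" alternative can be sketched as a runtime guard (a hypothetical helper; in Stan this would more likely be a compile-time restriction on which functions accept a sparse_matrix argument):

```python
import math

def sparse_elementwise(f, entries):
    """Apply f to the stored values of a dict-of-keys sparse matrix,
    but reject functions where f(0) != 0, since applying those
    elementwise would densify the result."""
    if f(0.0) != 0.0:
        raise ValueError("f(0) != 0: elementwise f would densify a sparse matrix")
    return {ij: f(v) for ij, v in entries.items()}

ok = sparse_elementwise(math.sin, {(0, 0): 0.5})   # sin(0) = 0, allowed
# sparse_elementwise(math.acos, ...) would raise ValueError
```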

@nhuurre
Collaborator

nhuurre commented Oct 6, 2021 via email

@SteveBronder
Collaborator Author

I actually posted this Q to the Eigen Discord server (a chat server where Eigen devs and users talk). They are also saying no to my dense matrix idea, as it sounds too weird, so I'm fine with tossing that out.

Another alternative we should consider is to simply not allow functions where f(0) != 0 to apply elementwise to sparse matrices.

Yes, come to think of it, that would make most sense.

I'm okay with this, though it feels a little weird to have, say, atan() but not acos(). Also, what do we want to do for sparse_matrix<lower=0>[...], which requires using log()/exp()?

@nhuurre
Collaborator

nhuurre commented Oct 7, 2021

what do we want to do for sparse_matrix<lower=0>[...]

A dense return type is definitely wrong here! The constraining transform must apply only to the nonzero elements.
lb_constrain() is not exp(); any relation between them is an implementation detail and can change.
Notice that the unconstrained type corresponding to a Cholesky factor matrix is a vector. I expect the sparse matrix type to also turn into a small vector when unconstrained, in which case the lower-bound constraint can be applied to its vector form.
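A sketch of that vector-form idea, assuming (as hedged above) that lb_constrain happens to be implemented via exp: the unconstrained representation is just the vector of stored values, so the transform never touches structural zeros and the sparsity pattern is preserved by construction.

```python
import math

def lb_constrain_sparse(unconstrained_values, lower=0.0):
    """Apply a lower-bound constraining transform to the unconstrained
    vector form of a sparse matrix, i.e. only its stored values.
    Using lower + exp(u) here is one possible implementation detail of
    lb_constrain, not a fixed definition."""
    return [lower + math.exp(u) for u in unconstrained_values]

# Three stored values; the sparsity pattern (where they live) is untouched.
constrained = lb_constrain_sparse([-1.0, 0.0, 2.5], lower=0.0)
```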

Another question: do we allow sparse_matrix<lower=1>[...], even though it seems to be saying that the "sparse" matrix cannot contain any zeros? Or maybe the bounds check ignores structural zeros?

@bob-carpenter
Contributor

Good points, @nhuurre.

I agree about exp() being an implementation detail for <lower = 0>. We were just talking about this today in the Stan meeting and it's come up in other issues.

All of our unconstrained types are dense vectors and they'll presumably stay that way. They eventually have to concatenate into the argument to the log density function.

I don't think sparse_matrix<lower = 1> makes sense, as the result isn't sparse. But it's a run-time issue to figure this out, since it could be sparse_matrix<lower = L> where L is declared as a data variable. Validating the constraint requires checking all non-zero elements and then making sure it also holds for 0 if there is any sparsity.
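That run-time validation rule can be sketched directly (a hypothetical checker, not Stan Math code): every stored value must satisfy the bound, and if the matrix has any structural zeros, 0 itself must satisfy it too.

```python
def validate_lower_bound(shape, entries, lower):
    """Check a lower-bound constraint on a dict-of-keys sparse matrix:
    all stored values must be >= lower, and if there are structural
    zeros (fewer stored entries than cells), 0 must also be >= lower."""
    if any(v < lower for v in entries.values()):
        return False
    has_structural_zeros = len(entries) < shape[0] * shape[1]
    if has_structural_zeros and 0.0 < lower:
        return False
    return True

# lower = 1 fails whenever there is any sparsity, since the implicit
# zeros violate it; lower = -1 is fine.
reject = validate_lower_bound((2, 2), {(0, 0): 2.0}, lower=1.0)
accept = validate_lower_bound((2, 2), {(0, 0): 2.0}, lower=-1.0)
```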

@SteveBronder
Collaborator Author

I think for now in #2599 I'm going to go with Bob's route and return a sparse matrix with the full N×M size.

I don't think sparse_matrix<lower = 1> makes sense as the result isn't sparse. But it's a run-time issue to figure this out as it could be sparse_matrix<lower = L> where L is declared as a data variable. Validating the constraint requires validating for all non-zero elements and then making sure it holds for 0 if there is sparsity.

Maybe just not having constraints? The only one that makes sense to me is sparse_matrix<offset=other_sparse_mat, multiplier=other_sparse_mat2>

@bob-carpenter
Contributor

With no constraints, there's no problem with accidentally removing sparsity. Other constraints are consistent with sparsity, like <lower = -1, upper = 1>.

We'll eventually want to have something like a sparse Cholesky factor or sparse covariance matrix and then implement log determinants, and multi-normal on the precision scale (where there's some hope to preserve sparsity).
