Work out good approach for automatic Hessian sparsity detection #13
Should be solved by some combination of SparseConnectivityTracer.jl and the solution to gdalle/DifferentiationInterface.jl#263.
Now that SparseConnectivityTracer supports linear algebra, can you see if this is enough for your use case?
That's great! The main functions that are both important to have, and difficult to handle until now, are log-determinants (of both dense and sparse matrices). I need them to calculate log-likelihoods of Gaussian Markov random fields (aka MV normals parameterized by a precision matrix instead of a covariance). If I can trace through dense and sparse versions of the MWE function in gdalle/DifferentiationInterface.jl#263 I will be pretty happy:

```julia
using LinearAlgebra, SparseArrays  # for diagm, logdet, spdiagm

const y = randn(10)

# Dense precision matrix
function f_dense(u)
    Q = diagm(exp.(u))
    return logdet(Q) - y' * Q * y
end

# Sparse precision matrix
function f_sparse(u)
    Q = spdiagm(exp.(u))
    return logdet(Q) - y' * Q * y
end
```

In most problems, the structure of `Q` does not change between evaluations, so local tracing should be acceptable. Currently, SCT can handle the dense one with local tracing, but gets a stack overflow error with the sparse one.
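For what it's worth, a minimal sketch of how this MWE can be probed with SparseConnectivityTracer's ADTypes-style detectors (assuming the `TracerLocalSparsityDetector` and `hessian_sparsity` names from recent SCT releases):

```julia
using SparseConnectivityTracer  # provides the detectors and the ADTypes entry points

# Local tracing evaluates the pattern at a concrete point u0; as noted above,
# this currently succeeds for f_dense but overflows the stack for f_sparse.
u0 = rand(10)
H_pattern = hessian_sparsity(f_dense, u0, TracerLocalSparsityDetector())
```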
I opened an issue to keep track. It's weird that it doesn't happen with […]
Can you use this code in the meantime? I still haven't decided whether it belongs in DI itself or just in the docs as an example, given how easy and short it is:

```julia
using ADTypes
using DifferentiationInterface
using SparseArrays

# Detect sparsity by computing one dense Jacobian/Hessian and thresholding it.
struct DenseSparsityDetector{B} <: ADTypes.AbstractSparsityDetector
    backend::B
    atol::Float64
end

function ADTypes.jacobian_sparsity(f, x, detector::DenseSparsityDetector)
    J = jacobian(f, detector.backend, x)
    return sparse(abs.(J) .> detector.atol)
end

function ADTypes.jacobian_sparsity(f!, y, x, detector::DenseSparsityDetector)
    J = jacobian(f!, y, detector.backend, x)
    return sparse(abs.(J) .> detector.atol)
end

function ADTypes.hessian_sparsity(f, x, detector::DenseSparsityDetector)
    H = hessian(f, detector.backend, x)
    return sparse(abs.(H) .> detector.atol)
end
```
Yes, that solution is probably good enough to close this issue, and I already have it as a patch on a development branch here. My $0.02: it would be great to have it in DI unless there's a compelling reason not to.
Working here, will close this issue once I merge that branch. Thanks for the quick fix!
Awesome! I have a few remarks on the PR, will put them here:

(MarginalLogDensities.jl/src/MarginalLogDensities.jl, lines 127 to 131 at e62bebc)
Why pick finite differences over forward mode as the default second-order method? Forward over reverse would be much faster for large problems.
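For reference, a sketch of what a forward-over-reverse configuration looks like with DI's `SecondOrder` (the backend choices and the toy function here are illustrative, not what the PR uses):

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff, AutoReverseDiff
using ForwardDiff, ReverseDiff

# Reverse mode computes the gradient at roughly constant cost in the input
# dimension; forward mode then differentiates that gradient.
backend = SecondOrder(AutoForwardDiff(), AutoReverseDiff())
H = hessian(x -> sum(abs2, x), backend, rand(100))
```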
(MarginalLogDensities.jl/src/MarginalLogDensities.jl, lines 132 to 138 at e62bebc)

Why separate the sparsity detector, coloring algorithm, and backend? With the ADTypes `AutoSparse` wrapper they can be bundled into one object. Also keep in mind that the […]
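A sketch of the bundled form (assuming the `AutoSparse` wrapper from ADTypes and the greedy coloring from SparseMatrixColorings.jl):

```julia
using ADTypes: AutoSparse, AutoForwardDiff
using SparseConnectivityTracer: TracerSparsityDetector
using SparseMatrixColorings: GreedyColoringAlgorithm
using ForwardDiff

# One object carries the dense backend plus the sparsity detector and
# coloring algorithm, instead of three separate arguments.
sparse_backend = AutoSparse(
    AutoForwardDiff();
    sparsity_detector = TracerSparsityDetector(),
    coloring_algorithm = GreedyColoringAlgorithm(),
)
```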
(MarginalLogDensities.jl/src/MarginalLogDensities.jl, lines 212 to 213 at e62bebc)

I'm not entirely sure that preparation is correct when you have different function objects, such as two anonymous functions. I think it's better to define […]
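A sketch of one way to avoid the pitfall, reusing a single named function for both preparation and execution (the function `g` is made up for illustration, and the argument order follows recent DI releases; older versions may differ):

```julia
using DifferentiationInterface
using ADTypes: AutoForwardDiff
using ForwardDiff, LinearAlgebra

# The *same* function object is passed to prepare_hessian and hessian,
# rather than preparing with one anonymous closure and calling with another.
g(u) = logdet(diagm(exp.(u))) - sum(abs2, u)

u0 = rand(10)
backend = AutoForwardDiff()
prep = prepare_hessian(g, backend, u0)
H = hessian(g, prep, backend, u0)
```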
Also, I don't think your test failures on nightly are related to DI. I think it's Reexport.jl interacting badly with the new […]
Most of these decisions were made to err on the side of reliability (at least for now). If I'm showing this package to someone who is currently using R/TMB, I'd rather they encounter something that runs slower than they'd like than something that breaks in a situation they are used to having work. In my anecdotal experience, it's not uncommon to try a model that should work but breaks because of some AD corner case, hence my conservatism. What I really need to do at this point is work up a more comprehensive set of realistic problems and test them systematically, both for performance comparisons and to find bugs.
I made FiniteDiff the default because it is guaranteed to work with any inner backend, and is often not actually much slower than ForwardDiff if the Hessian is sparse enough.
True, not sure why I made it this way now. Will probably change it.
Good to know.
Also good to know, and might explain some odd timings I've seen. Will try it out!
Closed via #25
The docstring for […]
SparsityDetection.jl is no longer maintained, and the sparsity detection functionality in Symbolics.jl still seems somewhat brittle. Currently this package just uses ForwardDiff, but that is a suboptimal solution (very slow in high dimensions).