symmetric positive definite matrix transforms (aka covariance matrices) #15
@sethaxen maybe I can start with correlation matrices and you can see covariance matrices
Is this for Cholesky factors? Or for the whole thing? We usually go with Cholesky-factor parameterizations, so to me, those are more important. But for symmetric positive definite, that's just an unconstraining transform from (0, inf) to (-inf, inf) for the diagonals. The alternative is to take a symmetric matrix and push it through matrix exp. I have no clue as to what the Jacobian determinant looks like there.
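Since the log-Jacobian check later in this thread uses finite differences, here is a minimal sketch of the Cholesky-factor route for comparison, assuming the usual construction where an unconstrained vector fills a lower-triangular factor, the diagonal is exponentiated, and Σ = L Lᵀ. The helper names (`vec_to_spd`, `vec_to_spd_logdetjac`, `vech`) are illustrative, not code from this issue:

```julia
using FiniteDifferences, LinearAlgebra

# Sketch (illustrative names): unconstrained y of length n(n+1)/2 fills a
# lower-triangular L column by column; the diagonal is exponentiated so it is
# positive; Σ = L*L' is then symmetric positive definite.
function vec_to_spd(y, n)
    L = zeros(eltype(y), n, n)
    k = 1
    for j in 1:n, i in j:n              # column-major walk over the lower triangle
        L[i, j] = i == j ? exp(y[k]) : y[k]
        k += 1
    end
    return L * L'
end

# log|det J| of y ↦ Σ: the map L ↦ L*L' contributes 2^n * ∏_j L[j,j]^(n-j+1),
# and exponentiating the diagonal contributes ∏_j L[j,j], so the total is
# n*log(2) + Σ_j (n - j + 2) * y_jj, a weighted sum of the diagonal parameters.
function vec_to_spd_logdetjac(y, n)
    ld = n * log(2)
    k = 1
    for j in 1:n
        ld += (n - j + 2) * y[k]        # y[k] is the unconstrained log-diagonal
        k += n - j + 1                  # step to the next column's diagonal entry
    end
    return ld
end

# finite-difference check, mirroring the verification later in the thread
n = 3
y = randn(div(n * (n + 1), 2))
vech(S) = [S[i, j] for j in 1:n for i in j:n]   # half-vectorization in the same order
J = jacobian(central_fdm(5, 1), y -> vech(vec_to_spd(y, n)), y)[1]
logabsdet(J)[1] ≈ vec_to_spd_logdetjac(y, n)    # true up to finite-difference error
```

The Jacobian correction here is just a weighted sum of the unconstrained diagonal parameters, which is what makes this route cheap compared with pushing a symmetric matrix through matrix exp.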
I think correlation matrices should be broken out into a separate issue. Adam Haber has looked at those.
Works for me!
Agreed, we could combine the "best" Cholesky transform with the transform for the positive diagonals. Another alternative is to use the "best" transform for orthonormal matrices for the eigenvectors and then a transform for the positive eigenvalues. I think I've seen other approaches in the literature before and will look into it.
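For concreteness, here is a rough sketch (mine, not from the thread) of that eigendecomposition idea: an orthonormal matrix for the eigenvectors and positive values for the eigenvalues. The orthonormal factor below comes from a QR factorization of an unconstrained square matrix purely for illustration; the "best" orthonormal-matrix transform mentioned above is a separate question:

```julia
using LinearAlgebra

# Illustrative construction: Σ = Q * Diagonal(exp.(y)) * Q' with orthonormal Q
# (eigenvectors) and strictly positive eigenvalues exp.(y).
function eigen_spd(W::AbstractMatrix, y::AbstractVector)
    F = qr(W)
    Q = Matrix(F.Q) * Diagonal(sign.(diag(F.R)))   # fix column signs so the map is well-defined
    return Symmetric(Q * Diagonal(exp.(y)) * Q')
end

n = 4
Σ = eigen_spd(randn(n, n), randn(n))
isposdef(Σ)   # true: orthonormal eigenvectors, positive eigenvalues
```

This uses n² parameters for Q plus n for the eigenvalues, more than the n(n+1)/2 degrees of freedom of an SPD matrix, so a real transform would need a minimal parameterization of the orthonormal factor.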
From http://eprints.maths.manchester.ac.uk/2754/1/catalog.pdf, the eigendecomposition is still the preferred way to compute the exponential of a symmetric matrix. In terms of the eigenvalues λ of A, the log absolute determinant of the Jacobian of A ↦ exp(A) is the sum of the eigenvalues plus, over each pair i < j, log((exp(λᵢ) − exp(λⱼ)) / (λᵢ − λⱼ)). I verified this using finite differences in Julia:

```julia
julia> using FiniteDifferences, LinearAlgebra, LogExpFunctions
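julia> # helper: flatten the structural nonzeros of U into a vector, plus a closure that rebuilds the matrix from such a vector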
julia> function sparse_to_vec(U)
inds = findall(!iszero, U)
v = U[inds]
function vec_to_sparse(v)
Unew = similar(U)
fill!(Unew, 0)
Unew[inds] .= v
return Unew
end
return v, vec_to_sparse
end
sparse_to_vec (generic function with 1 method)
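julia> # log|det J| of A ↦ exp(A) in terms of the eigenvalues: Σᵢ λᵢ plus log((exp(λᵢ) − exp(λⱼ))/(λᵢ − λⱼ)) over pairs i < j, computed stably with logsubexp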
julia> function symexp_logdetjac(A)
λ = eigvals(A)
T = float(eltype(λ))
n = size(A, 1)
s = zero(T)
for i in 1:n
λᵢ = λ[i]
s += λᵢ
for j in (i + 1):n
λₗ, λᵤ = minmax(λᵢ, λ[j])
s += logsubexp(λᵤ, λₗ) - log(λᵤ - λₗ)
end
end
return s
end
symexp_logdetjac (generic function with 1 method)
julia> n = 5;
julia> U = triu(randn(n, n))
5×5 Matrix{Float64}:
-0.500465 2.08902 0.782754 0.844179 -0.766196
0.0 0.688097 0.929935 -1.54167 0.114989
0.0 0.0 -0.24768 1.41416 0.830064
0.0 0.0 0.0 -1.26386 0.915003
0.0 0.0 0.0 0.0 -0.220671
julia> A = Symmetric(U)
5×5 Symmetric{Float64, Matrix{Float64}}:
-0.500465 2.08902 0.782754 0.844179 -0.766196
2.08902 0.688097 0.929935 -1.54167 0.114989
0.782754 0.929935 -0.24768 1.41416 0.830064
0.844179 -1.54167 1.41416 -1.26386 0.915003
-0.766196 0.114989 0.830064 0.915003 -0.220671
julia> λ = eigvals(A);
julia> v, vec_to_sparse = sparse_to_vec(U);
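julia> # Jacobian of the upper-triangle entries of exp(A) with respect to the upper-triangle entries of A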
julia> J = jacobian(central_fdm(5, 1), first ∘ sparse_to_vec ∘ triu ∘ exp ∘ Symmetric ∘ vec_to_sparse, v)[1]
15×15 Matrix{Float64}:
3.20214 5.12366 1.56993 … -0.531845 -0.253399 -0.0277042 0.0378789
2.56183 6.45721 3.58887 -0.658365 -0.197366 0.0777662 0.0283516
1.56993 7.17774 7.2442 -0.748569 -0.136485 0.130263 0.0115425
1.21801 2.00235 0.751506 0.378459 -0.0642349 -0.0627738 -0.0795216
0.740135 2.4536 1.75339 0.987444 -0.0265266 -0.262457 -0.050647
0.348811 0.707586 0.359101 … 0.489996 1.33695 0.403651 0.159073
0.100272 -0.734982 -0.636234 0.601058 0.14057 -0.191995 -0.0795725
0.139963 -0.442792 -1.29323 1.12791 0.176198 -0.420207 -0.0500219
0.0677437 -0.232091 -0.308252 0.0485569 0.880119 0.719764 0.158394
-0.00103785 -0.0805224 0.237911 -0.389536 0.402673 1.05731 0.157784
-0.338023 -0.545782 -0.145159 … 2.16254 0.939959 -0.0392903 -0.249754
-0.265922 -0.658365 -0.374284 3.7016 1.0117 -0.725274 -0.248751
-0.1267 -0.197366 -0.0682424 1.0117 2.09928 0.841366 0.569505
-0.0138521 0.0777662 0.0651313 -0.725274 0.841366 1.50725 0.580339
0.0378789 0.0567031 0.0115425 -0.497502 1.13901 1.16068 1.44321
julia> logabsdet(J)[1]
0.3457424809406372
julia> symexp_logdetjac(A)
0.3457424809559124
```

I'd expect ADing through this transform to be quite a bit more expensive than for the Cholesky transform, which only requires summing the unconstrained parameters behind the positive diagonals.
I think @adamhaber already has the correlation matrix and Cholesky factor of correlation matrix transforms.
@bob-carpenter I'm not certain if your comment is in reply to me, but I was only referring to symmetric positive definite matrices.
Super cool! That correction is
f780af0 adds derivations for this section |
The set of SPD matrices is not compact, so we need to add proper distributions. The Wisharts are conjugate for the covariance in a multivariate normal model, so testing with them is essentially like testing multivariate normal posteriors.
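For reference, a small sketch of what that conjugate setup could look like with Distributions.jl, assuming a zero-mean multivariate normal and the inverse form of the Wishart as the prior on the covariance (variable names are illustrative):

```julia
using Distributions, LinearAlgebra

d, N = 3, 200
Σ_true = [1.0 0.3 0.0; 0.3 1.0 0.2; 0.0 0.2 1.0]
X = rand(MvNormal(zeros(d), Σ_true), N)     # d × N data matrix, zero mean

ν0, Ψ0 = d + 2, Matrix(1.0I, d, d)          # prior: Σ ~ InverseWishart(ν0, Ψ0)
S = X * X'                                  # scatter matrix Σᵢ xᵢ xᵢ'
post = InverseWishart(ν0 + N, Ψ0 + S)       # conjugate posterior over Σ

mean(post)                                  # ≈ Σ_true for large N
```

Because the posterior is available in closed form, it gives a known target for checking an SPD transform.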