Skip to content

Commit

Permalink
Multinomial randomness (#702)
Browse files Browse the repository at this point in the history
* Introduce random functions.
* Fix 2 error I introduced when simplifying the doubly stochastic checks.
* generate a bit more test coverage
* Random MultinomialSPD point generation

---------

Co-authored-by: Mateusz Baran <[email protected]>
  • Loading branch information
kellertuer and mateuszbaran authored Jan 25, 2024
1 parent 95e90c8 commit e46a864
Show file tree
Hide file tree
Showing 17 changed files with 448 additions and 47 deletions.
6 changes: 6 additions & 0 deletions NEWS.md
Original file line number Diff line number Diff line change
Expand Up @@ -14,12 +14,18 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
* introduce `rand(:HamiltonianMatrices)`
* extend `rand` to also `rand!` for `HamiltonianMatrices`, `SymplecticMatrices` and `SymplecticStiefel`
* implement `riemannian_gradient` conversion for `SymplecticMatrices` and `SymplecticGrassmann`
* the new manifold of `MultinomialSymmetricPositiveDefinite` matrices
* `rand!` for `MultinomialDoublyStochastic` and `MultinomialSymmetric`

### Deprecated

* Rename `Symplectic` to `SimplecticMatrices` in order to have a `Symplectic` wrapper for such matrices as well in the future for the next breaking change.
* Rename `SymplecticMatrix` to `SymplecticElement` to clarify that it is the special matrix ``J_{2n}`` and not an arbitrary symplectic matrix.

### Fixed

* a bug that cause `project` for tangent vectors to return wrong results on `MultinomialDoublyStochastic`

## [0.9.12] – 2024-01-21

### Fixed
Expand Down
1 change: 1 addition & 0 deletions docs/make.jl
Original file line number Diff line number Diff line change
Expand Up @@ -119,6 +119,7 @@ makedocs(;
"Multinomial doubly stochastic matrices" => "manifolds/multinomialdoublystochastic.md",
"Multinomial matrices" => "manifolds/multinomial.md",
"Multinomial symmetric matrices" => "manifolds/multinomialsymmetric.md",
"Multinomial symmetric positive definite matrices" => "manifolds/multinomialsymmetricpositivedefinite.md",
"Oblique manifold" => "manifolds/oblique.md",
"Probability simplex" => "manifolds/probabilitysimplex.md",
"Positive numbers" => "manifolds/positivenumbers.md",
Expand Down
5 changes: 5 additions & 0 deletions docs/src/manifolds/multinomialdoublystochastic.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,8 @@ Order = [:type, :function]
```

## Literature

```@bibliography
Pages = ["multinomialdoublystochastic.md"]
Canonical=false
```
5 changes: 5 additions & 0 deletions docs/src/manifolds/multinomialsymmetric.md
Original file line number Diff line number Diff line change
Expand Up @@ -7,3 +7,8 @@ Order = [:type, :function]
```

## Literature

```@bibliography
Pages = ["multinomialsymmetric.md"]
Canonical=false
```
14 changes: 14 additions & 0 deletions docs/src/manifolds/multinomialsymmetricpositivedefinite.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,14 @@
# Multinomial symmetric positive definite matrices

```@autodocs
Modules = [Manifolds]
Pages = ["manifolds/MultinomialSymmetricPositiveDefinite.jl"]
Order = [:type, :function]
```

## Literature

```@bibliography
Pages = ["multinomialsymmetricpositivedefinite.md"]
Canonical=false
```
1 change: 1 addition & 0 deletions docs/src/references.bib
Original file line number Diff line number Diff line change
Expand Up @@ -275,6 +275,7 @@ @article{DouikHassibi:2019
TITLE = {Manifold Optimization Over the Set of Doubly Stochastic Matrices: A Second-Order Geometry},
JOURNAL = {IEEE Transactions on Signal Processing}
}

#
# E
# ----------------------------------------------------------------------------------------
Expand Down
5 changes: 4 additions & 1 deletion src/Manifolds.jl
Original file line number Diff line number Diff line change
Expand Up @@ -252,6 +252,7 @@ using ManifoldsBase:
IsEmbeddedSubmanifold,
IsExplicitDecorator,
LogarithmicInverseRetraction,
ManifoldDomainError,
ManifoldsBase,
NestedPowerRepresentation,
NestedReplacingPowerRepresentation,
Expand Down Expand Up @@ -426,6 +427,7 @@ include("manifolds/GeneralizedStiefel.jl")
include("manifolds/Hyperbolic.jl")
include("manifolds/MultinomialDoublyStochastic.jl")
include("manifolds/MultinomialSymmetric.jl")
include("manifolds/MultinomialSymmetricPositiveDefinite.jl")
include("manifolds/PositiveNumbers.jl")
include("manifolds/ProjectiveSpace.jl")
include("manifolds/SkewHermitian.jl")
Expand Down Expand Up @@ -646,6 +648,7 @@ export Euclidean,
MultinomialDoubleStochastic,
MultinomialMatrices,
MultinomialSymmetric,
MultinomialSymmetricPositiveDefinite,
Oblique,
OrthogonalMatrices,
PositiveArrays,
Expand Down Expand Up @@ -781,7 +784,7 @@ export CachedBasis,
InducedBasis,
ProjectedOrthonormalBasis
# Errors on Manifolds
export ComponentManifoldError, CompositeManifoldError
export ComponentManifoldError, CompositeManifoldError, ManifoldDomainError
# Functions on Manifolds
export ×,
^,
Expand Down
27 changes: 26 additions & 1 deletion src/manifolds/Multinomial.jl
Original file line number Diff line number Diff line change
Expand Up @@ -5,8 +5,11 @@ The multinomial manifold consists of `m` column vectors, where each column is of
`n` and unit norm, i.e.
````math
\mathcal{MN}(n,m) \coloneqq \bigl\{ p ∈ ℝ^{n×m}\ \big|\ p_{i,j} > 0 \text{ for all } i=1,…,n, j=1,…,m \text{ and } p^{\mathrm{T}}\mathbb{1}_m = \mathbb{1}_n\bigr\},
\mathcal{MN}(n,m) \coloneqq \bigl\{
p ∈ ℝ^{n×m}\ \big|\ p_{i,j} > 0 \text{ for all } i=1,…,n, j=1,…,m
\text{ and } p^{\mathrm{T}}\mathbb{1}_m = \mathbb{1}_n\bigr\},
````
where ``\mathbb{1}_k`` is the vector of length ``k`` containing ones.
This yields exactly the same metric as
Expand Down Expand Up @@ -85,6 +88,28 @@ function power_dimensions(M::MultinomialMatrices)
return (m,)
end

@doc raw"""
riemannian_gradient(M::MultinomialMatrices, p, Y; kwargs...)
Let ``Y`` denote the Euclidean gradient of a function ``\tilde f`` defined in the
embedding neighborhood of `M`, then the Riemannian gradient is given by
Equation 5 of [DouikHassibi:2019](@cite) as
```math
\operatorname{grad} f(p) = \proj_{T_p\mathcal M}(Y⊙p)
```
where ``⊙`` denotes the Hadamard or elementwise product.
"""
riemannian_gradient(M::MultinomialMatrices, p, Y; kwargs...)

function riemannian_gradient!(M::MultinomialMatrices, X, p, Y; kwargs...)
X .= p .* Y
project!(M, X, p, X)
return X
end

representation_size(M::MultinomialMatrices) = get_parameter(M.size)

function Base.show(io::IO, ::MultinomialMatrices{TypeParameter{Tuple{n,m}}}) where {n,m}
Expand Down
124 changes: 89 additions & 35 deletions src/manifolds/MultinomialDoublyStochastic.jl
Original file line number Diff line number Diff line change
Expand Up @@ -3,19 +3,57 @@
A common type for manifolds that are doubly stochastic, for example by direct constraint
[`MultinomialDoubleStochastic`](@ref) or by symmetry [`MultinomialSymmetric`](@ref),
or additionally by symmetric positive definiteness [`MultinomialSymmetricPositiveDefinite`](@ref)
as long as they are also modeled as [`IsIsometricEmbeddedManifold`](https://juliamanifolds.github.io/ManifoldsBase.jl/stable/decorator.html#ManifoldsBase.IsIsometricEmbeddedManifold).
That way they share the inner product (just by restriction), and even the Riemannian gradient
"""
abstract type AbstractMultinomialDoublyStochastic <: AbstractDecoratorManifold{ℝ} end

function active_traits(f, ::AbstractMultinomialDoublyStochastic, args...)
return merge_traits(IsIsometricEmbeddedManifold())
end

@doc raw"""
representation_size(M::AbstractMultinomialDoublyStochastic)
return the representation size of doubly stochastic matrices, whic are embedded
in the ``ℝ^{n×n}`` matrices and hence the answer here is ``
"""
function representation_size(M::AbstractMultinomialDoublyStochastic)
n = get_parameter(M.size)[1]
return (n, n)
end

@doc raw"""
riemannian_gradient(M::AbstractMultinomialDoublyStochastic, p, Y; kwargs...)
Let ``Y`` denote the Euclidean gradient of a function ``\tilde f`` defined in the
embedding neighborhood of `M`, then the Riemannian gradient is given by
Lemma 1 [DouikHassibi:2019](@cite) as
```math
\operatorname{grad} f(p) = \proj_{T_p\mathcal M}(Y⊙p)
```
where ``⊙`` denotes the Hadamard or elementwise product, and the projection
is the projection onto the tangent space of the corresponding manifold.
"""
riemannian_gradient(M::AbstractMultinomialDoublyStochastic, p, Y; kwargs...)

function riemannian_gradient!(M::AbstractMultinomialDoublyStochastic, X, p, Y; kwargs...)
X .= p .* Y
project!(M, X, p, X)
return X
end

@doc raw"""
MultinomialDoublyStochastic{T} <: AbstractMultinomialDoublyStochastic
The set of doubly stochastic multinomial matrices consists of all ``n×n`` matrices with
stochastic columns and rows, i.e.
````math
\begin{aligned}
\mathcal{DP}(n) \coloneqq \bigl\{p ∈ ℝ^{n×n}\ \big|\ &p_{i,j} > 0 \text{ for all } i=1,…,n, j=1,…,m,\\
Expand Down Expand Up @@ -60,44 +98,27 @@ end
Checks whether `p` is a valid point on the [`MultinomialDoubleStochastic`](@ref)`(n)` `M`,
i.e. is a matrix with positive entries whose rows and columns sum to one.
"""
function check_point(
M::MultinomialDoubleStochastic,
p::T;
atol::Real=sqrt(prod(representation_size(M))) * eps(real(float(number_eltype(T)))),
kwargs...,
) where {T}
function check_point(M::MultinomialDoubleStochastic, p; kwargs...)
n = get_parameter(M.size)[1]
r = sum(p, dims=2)
if !isapprox(norm(r - ones(n, 1)), 0.0; atol=atol, kwargs...)
return DomainError(
r,
"The point $(p) does not lie on $M, since its rows do not sum up to one.",
)
end
return nothing
s = check_point(MultinomialMatrices(n, n), p; kwargs...)
!isnothing(s) && return s
s2 = check_point(MultinomialMatrices(n, n), p'; kwargs...)
return s2
end

@doc raw"""
check_vector(M::MultinomialDoubleStochastic p, X; kwargs...)
Checks whether `X` is a valid tangent vector to `p` on the [`MultinomialDoubleStochastic`](@ref) `M`.
This means, that `p` is valid, that `X` is of correct dimension and sums to zero along any
column or row.
"""
function check_vector(
M::MultinomialDoubleStochastic,
p,
X::T;
atol::Real=sqrt(prod(representation_size(M))) * eps(real(float(number_eltype(T)))),
kwargs...,
) where {T}
r = sum(X, dims=2) # check for stochastic rows
if !isapprox(norm(r), 0.0; atol=atol, kwargs...)
return DomainError(
r,
"The matrix $(X) is not a tangent vector to $(p) on $(M), since its rows do not sum up to zero.",
)
end
return nothing
function check_vector(M::MultinomialDoubleStochastic, p, X; kwargs...)
n = get_parameter(M.size)[1]
s = check_vector(MultinomialMatrices(n, n), p, X; kwargs...)
!isnothing(s) && return s
s2 = check_vector(MultinomialMatrices(n, n), p, X'; kwargs...)
return s2
end

function get_embedding(::MultinomialDoubleStochastic{TypeParameter{Tuple{n}}}) where {n}
Expand Down Expand Up @@ -134,11 +155,14 @@ end
Project `Y` onto the tangent space at `p` on the [`MultinomialDoubleStochastic`](@ref) `M`, return the result in `X`.
The formula reads
````math
\operatorname{proj}_p(Y) = Y - (α\mathbf{1}_n^{\mathrm{T}} + \mathbf{1}_nβ^{\mathrm{T}}) ⊙ p,
````
where ``⊙`` denotes the Hadamard or elementwise product and ``\mathbb{1}_n`` is the vector of length ``n`` containing ones.
The two vectors ``α,β ∈ ℝ^{n×n}`` are computed as a solution (typically using the left pseudo inverse) of
````math
\begin{pmatrix} I_n & p\\p^{\mathrm{T}} & I_n \end{pmatrix}
\begin{pmatrix} α\\ β\end{pmatrix}
Expand All @@ -152,8 +176,8 @@ project(::MultinomialDoubleStochastic, ::Any, ::Any)

function project!(M::MultinomialDoubleStochastic, X, p, Y)
n = get_parameter(M.size)[1]
ζ = [I p; p I] \ [sum(Y, dims=2); sum(Y, dims=1)'] # Formula (25) from 1802.02628
return X .= Y .- (repeat(ζ[1:n], 1, 3) .+ repeat(ζ[(n + 1):end]', 3, 1)) .* p
ζ = [I p; p' I] \ [sum(Y, dims=2); sum(Y, dims=1)'] # Formula (25) from 1802.02628
return X .= Y .- (repeat(ζ[1:n], 1, n) .+ repeat(ζ[(n + 1):end]', n, 1)) .* p
end

@doc raw"""
Expand Down Expand Up @@ -201,15 +225,45 @@ function project!(
return q
end

function representation_size(M::MultinomialDoubleStochastic)
n = get_parameter(M.size)[1]
return (n, n)
@doc raw"""
rand(::MultinomialDoubleStochastic; vector_at=nothing, σ::Real=1.0, kwargs...)
Generate random points on the [`MultinomialDoubleStochastic`](@ref) manifold
or tangent vectors at the point `vector_at` if that is not `nothing`.
Let ``n×n`` denote the matrix dimension of the [`MultinomialDoubleStochastic`](@ref).
When `vector_at` is nothing, this is done by generating a random matrix`rand(n,n)`
with positive entries and projecting it onto the manifold. The `kwargs...` are
passed to this projection.
When `vector_at` is not `nothing`, a random matrix in the ambient space is generated
and projected onto the tangent space
"""
rand(::MultinomialDoubleStochastic; σ::Real=1.0)

function Random.rand!(
rng::AbstractRNG,
M::MultinomialDoubleStochastic,
pX;
vector_at=nothing,
σ::Real=one(real(eltype(pX))),
kwargs...,
)
rand!(rng, pX)
pX .*= σ
if vector_at === nothing
project!(M, pX, pX; kwargs...)
else
project!(M, pX, vector_at, pX)
end
return pX
end

@doc raw"""
retract(M::MultinomialDoubleStochastic, p, X, ::ProjectionRetraction)
compute a projection based retraction by projecting $p\odot\exp(X⨸p)$ back onto the manifold,
compute a projection based retraction by projecting ``p\odot\exp(X⨸p)`` back onto the manifold,
where ``⊙,⨸`` are elementwise multiplication and division, respectively. Similarly, ``\exp``
refers to the elementwise exponentiation.
"""
Expand Down
Loading

2 comments on commit e46a864

@kellertuer
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@JuliaRegistrator register

Release notes:

Added

  • added the real symplectic Grassmann manifold SymplecticGrassmann
  • Introduce the manifold of HamiltonianMatrices and a wrapper for Hamiltonian matrices
  • introduce rand(:HamiltonianMatrices)
  • extend rand to also rand! for HamiltonianMatrices, SymplecticMatrices and SymplecticStiefel
  • implement riemannian_gradient conversion for SymplecticMatrices and SymplecticGrassmann
  • the new manifold of MultinomialSymmetricPositiveDefinite matrices
  • rand! for MultinomialDoublyStochastic and MultinomialSymmetric

Deprecated

  • Rename Symplectic to SimplecticMatrices in order to have a Symplectic wrapper for such matrices as well in the future for the next breaking change.
  • Rename SymplecticMatrix to SymplecticElement to clarify that it is the special matrix J_{2n} and not an arbitrary symplectic matrix.

Fixed

  • a bug that cause project for tangent vectors to return wrong results on MultinomialDoublyStochastic

@JuliaRegistrator
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Registration pull request created: JuliaRegistries/General/99535

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.9.13 -m "<description of version>" e46a864447274646f75fb1e79c5492770fd05a8f
git push origin v0.9.13

Please sign in to comment.