Improve Linesearch and Quasi Newton allocations #335

Merged
23 commits, merged Dec 28, 2023
Changes from 9 commits
Commits
23 commits
2810598
Resolving #333 point 6, linesearch_backtrack! can now be used and is …
kellertuer Dec 16, 2023
d3812fb
Fix #333 point 1, 2, 5, and one further memory to use.
kellertuer Dec 16, 2023
55c5456
documentation formatting.
kellertuer Dec 16, 2023
5745c6d
In theory resolves #333 points 4 and 7, in practice there seems to be…
kellertuer Dec 16, 2023
1b30f2f
first back to copying eta – then tests from before work again.
kellertuer Dec 16, 2023
ca7b1ef
Start changelog entry.
kellertuer Dec 16, 2023
d0a6e39
partial fix for Circle
mateuszbaran Dec 16, 2023
f79d857
Merge branch 'kellertuer/fix-linesearch-allocations' of github.com:Ju…
kellertuer Dec 16, 2023
bc795ae
rand -> allocate_result
mateuszbaran Dec 16, 2023
5175979
Fix a positional argument bug.
kellertuer Dec 16, 2023
590303d
Merge branch 'kellertuer/fix-linesearch-allocations' of github.com:Ju…
kellertuer Dec 16, 2023
070b2b4
fix typo
mateuszbaran Dec 16, 2023
72118ae
Merge remote-tracking branch 'origin/kellertuer/fix-linesearch-alloca…
mateuszbaran Dec 16, 2023
802ef16
Merge branch 'master' into kellertuer/fix-linesearch-allocations
mateuszbaran Dec 16, 2023
36cb062
copy -> allocate
mateuszbaran Dec 16, 2023
a6015be
fix qN direction update
mateuszbaran Dec 16, 2023
4340a51
save one allocation
mateuszbaran Dec 16, 2023
417f956
Adapt Documentation.
kellertuer Dec 17, 2023
a92e8a7
Work on Test Coverage.
kellertuer Dec 18, 2023
42d22b4
Fix deps.
kellertuer Dec 25, 2023
34dcad8
Merge branch 'master' into kellertuer/fix-linesearch-allocations
kellertuer Dec 26, 2023
c683eba
Bump dependencies.
kellertuer Dec 27, 2023
525d4f7
bump deps.
kellertuer Dec 28, 2023
8 changes: 8 additions & 0 deletions Changelog.md
@@ -5,6 +5,14 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [0.4.45] unreleased

## Changed

* `WolfePowellLineSearch`, `ArmijoLineSearch` step sizes now allocate less
* `linesearch_backtrack!` is now available
* Quasi Newton updates can now be computed in-place of a direction vector as well.
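
Since the entries above only name the affected components, a minimal usage sketch follows. It assumes Manopt.jl's `quasi_Newton` solver together with `Manifolds.Sphere`; the cost, gradient, and start point are made up for illustration. The solver call itself is unchanged; the allocation savings happen inside the line searches and direction updates:

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[1]^2 + p[2]^2                          # illustrative cost, minimal at ±[0, 0, 1]
grad_f(M, p) = project(M, p, [2p[1], 2p[2], 0.0])  # Riemannian gradient via projection
p0 = 1 / sqrt(3) .* [1.0, 1.0, 1.0]

p_star = quasi_Newton(M, f, grad_f, p0)            # same call as in earlier versions
```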

## [0.4.44] December 12, 2023

Formally one could consider this version breaking, since a few functions
48 changes: 33 additions & 15 deletions src/plans/quasi_newton_plan.jl
@@ -362,43 +362,57 @@ function QuasiNewtonMatrixDirectionUpdate(
        basis, m, scale, update, vector_transport_method
    )
end
function (d::QuasiNewtonMatrixDirectionUpdate)(mp, st)
    r = zero_vector(get_manifold(mp), get_iterate(st))
    return d(r, mp, st)
end
function (d::QuasiNewtonMatrixDirectionUpdate{T})(
    mp, st
    r, mp, st
) where {T<:Union{InverseBFGS,InverseDFP,InverseSR1,InverseBroyden}}
    M = get_manifold(mp)
    p = get_iterate(st)
    X = get_gradient(st)
    return get_vector(M, p, -d.matrix * get_coordinates(M, p, X, d.basis), d.basis)
    get_vector!(M, r, p, -d.matrix * get_coordinates(M, p, X, d.basis), d.basis)
    return r
end
function (d::QuasiNewtonMatrixDirectionUpdate{T})(
    mp, st
    r, mp, st
) where {T<:Union{BFGS,DFP,SR1,Broyden}}
    M = get_manifold(mp)
    p = get_iterate(st)
    X = get_gradient(st)
    return get_vector(M, p, -d.matrix \ get_coordinates(M, p, X, d.basis), d.basis)
    get_vector!(M, r, p, -d.matrix \ get_coordinates(M, p, X, d.basis), d.basis)
    return r
end
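
The methods above switch from the allocating `get_vector` to the in-place `get_vector!` with a preallocated result `r`. Here is a standalone sketch of that pattern, assuming Manifolds.jl; the sphere, the basis choice, and the identity matrix standing in for `d.matrix` are illustration choices, not part of this PR:

```julia
using LinearAlgebra, Manifolds

M = Sphere(2)
p = [1.0, 0.0, 0.0]
X = [0.0, 0.5, -0.2]              # tangent vector at p, standing in for the gradient
B = DefaultOrthonormalBasis()
H = Matrix{Float64}(I, 2, 2)      # stand-in for the quasi-Newton matrix `d.matrix`

r = zero_vector(M, p)             # allocate the result once
c = get_coordinates(M, p, X, B)   # coordinates of X in the basis B
get_vector!(M, r, p, -(H \ c), B) # fill r in place instead of returning a new vector
```

Repeated calls can then reuse `r` instead of allocating a fresh tangent vector in every iteration.
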
@doc raw"""
QuasiNewtonLimitedMemoryDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate

This [`AbstractQuasiNewtonDirectionUpdate`](@ref) represents the limited-memory Riemannian BFGS update, where the approximating operator is represented by ``m`` stored pairs of tangent vectors ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}`` in the ``k``-th iteration.
For the calculation of the search direction ``η_k``, the generalisation of the two-loop recursion is used (see [Huang, Gallican, Absil, SIAM J. Optim., 2015](@cite HuangGallivanAbsil:2015)), since it only requires inner products and linear combinations of tangent vectors in ``T_{x_k} \mathcal{M}``. For that the stored pairs of tangent vectors ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}``, the gradient ``\operatorname{grad}f(x_k)`` of the objective function ``f`` in ``x_k`` and the positive definite self-adjoint operator
This [`AbstractQuasiNewtonDirectionUpdate`](@ref) represents the limited-memory Riemannian BFGS update,
where the approximating operator is represented by ``m`` stored pairs of tangent vectors ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}``
in the ``k``-th iteration.
For the calculation of the search direction ``η_k``, the generalisation of the two-loop recursion
is used (see [Huang, Gallivan, Absil, SIAM J. Optim., 2015](@cite HuangGallivanAbsil:2015)),
since it only requires inner products and linear combinations of tangent vectors in ``T_{x_k} \mathcal{M}``.
For that the stored pairs of tangent vectors ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}``,
the gradient ``\operatorname{grad}f(x_k)`` of the objective function ``f`` in ``x_k``
and the positive definite self-adjoint operator

```math
\mathcal{B}^{(0)}_k[⋅] = \frac{g_{x_k}(s_{k-1}, y_{k-1})}{g_{x_k}(y_{k-1}, y_{k-1})} \; \mathrm{id}_{T_{x_k} \mathcal{M}}[⋅]
```

are used. The two-loop recursion can be understood as that the [`InverseBFGS`](@ref) update is executed ``m`` times in a row on ``\mathcal{B}^{(0)}_k[⋅]`` using the tangent vectors ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}``, and in the same time the resulting operator ``\mathcal{B}^{LRBFGS}_k [⋅]`` is directly applied on ``\operatorname{grad}f(x_k)``.
are used. The two-loop recursion can be understood as the [`InverseBFGS`](@ref) update being
executed ``m`` times in a row on ``\mathcal{B}^{(0)}_k[⋅]`` using the tangent vectors ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}``, while at the same time the resulting operator ``\mathcal{B}^{LRBFGS}_k [⋅]`` is applied directly to ``\operatorname{grad}f(x_k)``.
When updating there are two cases: if there is still free memory, i.e. ``k < m``, the previously stored vector pairs ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}`` have to be transported into the upcoming tangent space ``T_{x_{k+1}} \mathcal{M}``; if there is no free memory, the oldest pair ``\{ \widetilde{s}_{k−m}, \widetilde{y}_{k−m}\}`` has to be discarded and then all the remaining vector pairs ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m+1}^{k-1}`` are transported into the tangent space ``T_{x_{k+1}} \mathcal{M}``. After that we calculate and store ``s_k = \widetilde{s}_k = T^{S}_{x_k, α_k η_k}(α_k η_k)`` and ``y_k = \widetilde{y}_k``. This process ensures that new information about the objective function is always included and the old, probably no longer relevant, information is discarded.
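
Since the recursion is described above only in prose, here is a small self-contained Euclidean sketch on plain vectors; the function name and the absence of vector transports are simplifications for illustration, and this is not Manopt's implementation:

```julia
using LinearAlgebra

# S, Y hold the stored pairs sᵢ, yᵢ (oldest first), grad is the current gradient;
# returns the search direction, i.e. minus the approximate inverse Hessian applied to grad.
function two_loop_direction(grad, S, Y)
    m = length(S)
    m == 0 && return -copy(grad)
    ρ = [1 / dot(Y[i], S[i]) for i in 1:m]
    ξ = zeros(m)
    r = copy(grad)
    for i in m:-1:1                              # first loop, newest pair first
        ξ[i] = ρ[i] * dot(S[i], r)
        r .-= ξ[i] .* Y[i]
    end
    r .*= dot(S[m], Y[m]) / dot(Y[m], Y[m])      # apply the initial operator, the scaled identity
    for i in 1:m                                 # second loop, oldest pair first
        β = ρ[i] * dot(Y[i], r)
        r .+= (ξ[i] - β) .* S[i]
    end
    return -r
end

S = [[1.0, 0.0], [0.0, 0.5]]                     # two stored pairs in ℝ²
Y = [[0.9, 0.1], [0.1, 0.6]]
two_loop_direction([0.3, -0.2], S, Y)
```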

# Fields
* `memory_s` the set of the stored (and transported) search directions times step size ``\{ \widetilde{s}_i\}_{i=k-m}^{k-1}``.
* `memory_y` set of the stored gradient differences ``\{ \widetilde{y}_i\}_{i=k-m}^{k-1}``.
* `ξ` a variable used in the two-loop recursion.
* `ρ` a variable used in the two-loop recursion.
* `scale`
* `vector_transport_method` a `AbstractVectorTransportMethod`
* `message` a string containing a potential warning that might have appeared
* `memory_s` the set of the stored (and transported) search directions times step size ``\{ \widetilde{s}_i\}_{i=k-m}^{k-1}``.
* `memory_y` set of the stored gradient differences ``\{ \widetilde{y}_i\}_{i=k-m}^{k-1}``.
* `ξ` a variable used in the two-loop recursion.
* `ρ` a variable used in the two-loop recursion.
* `scale` initial scaling of the Hessian
* `vector_transport_method` an `AbstractVectorTransportMethod`
* `message` a string containing a potential warning that might have appeared

# Constructor
QuasiNewtonLimitedMemoryDirectionUpdate(
@@ -468,10 +482,14 @@ function status_summary(d::QuasiNewtonLimitedMemoryDirectionUpdate{T}) where {T}
    return s
end
function (d::QuasiNewtonLimitedMemoryDirectionUpdate{InverseBFGS})(mp, st)
    r = zero_vector(get_manifold(mp), get_iterate(st))
    return d(r, mp, st)
end
function (d::QuasiNewtonLimitedMemoryDirectionUpdate{InverseBFGS})(r, mp, st)
    isempty(d.message) || (d.message = "") # reset message
    M = get_manifold(mp)
    p = get_iterate(st)
    r = copy(M, p, get_gradient(st))
    copyto!(M, r, p, get_gradient(st))
    m = length(d.memory_s)
    m == 0 && return -r
    for i in m:-1:1
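
The last visible change in this hunk replaces `r = copy(M, p, get_gradient(st))` with `copyto!(M, r, p, get_gradient(st))`, so the gradient is copied into the preallocated `r` rather than into a freshly allocated tangent vector on every call. A standalone sketch of the difference, again assuming Manifolds.jl:

```julia
using Manifolds

M = Sphere(2)
p = [1.0, 0.0, 0.0]
X = [0.0, 1.0, 0.0]       # tangent vector at p, standing in for the gradient

r_new = copy(M, p, X)     # allocates a fresh tangent vector on every call
r = zero_vector(M, p)     # allocate once, outside of any loop ...
copyto!(M, r, p, X)       # ... then reuse it with no further allocations
```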