diff --git a/.gitignore b/.gitignore
index c1ce6ebd72..120a653afe 100644
--- a/.gitignore
+++ b/.gitignore
@@ -22,3 +22,4 @@ docs/src/tutorials/Optimize!_files
 docs/src/tutorials/*.html
 docs/src/changelog.md
 docs/styles/Google
+tutorials/LocalPreferences.toml
diff --git a/Changelog.md b/Changelog.md
index 4934dc1e41..52ead36f46 100644
--- a/Changelog.md
+++ b/Changelog.md
@@ -5,17 +5,19 @@ All notable Changes to the Julia package `Manopt.jl` will be documented in this
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
-## [unreleased]
+## [0.4.52]
 
 ### Added
 
-* Allow the `message=` of the `DebugIfEntry` debug action to contain a format element to print the field in the message as well.
+* Introduce an environment-persistent way of setting global values with the `set_manopt_parameter!` function using [Preferences.jl](https://github.com/JuliaPackaging/Preferences.jl).
+* Introduce such a value named `:Mode` to enable a `"Tutorial"` mode that provides more warnings and information for people getting started with optimisation on manifolds.
 
 ## [0.4.51] January 30, 2024
 
 ### Added
 
 * A `StopWhenSubgradientNormLess` stopping criterion for subgradient-based optimization.
+* Allow the `message=` of the `DebugIfEntry` debug action to contain a format element to print the field in the message as well.
 
 ## [0.4.50] January 26, 2024
 
diff --git a/Project.toml b/Project.toml
index 71e6be3120..4868913f23 100644
--- a/Project.toml
+++ b/Project.toml
@@ -1,7 +1,7 @@
 name = "Manopt"
 uuid = "0fc0a36d-df90-57f3-8f93-d78a9fc72bb5"
 authors = ["Ronny Bergmann "]
-version = "0.4.51"
+version = "0.4.52"
 
 [deps]
 ColorSchemes = "35d6a980-a343-548e-a6ea-1d62b119f2f4"
@@ -14,6 +14,7 @@ ManifoldDiff = "af67fdf4-a580-4b9f-bbec-742ef357defd"
 ManifoldsBase = "3362f125-f0bb-47a3-aa74-596ffd7ef2fb"
 Markdown = "d6f4376e-aef5-505a-96c1-9c027394607a"
 PolynomialRoots = "3a141323-8675-5d76-9d11-e1df1406c778"
+Preferences = "21216c6a-2e73-6563-6e65-726566657250"
 Printf = "de0858da-6303-5e67-8744-51eddeeeb8d7"
 Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"
 Requires = "ae029012-a4dd-5104-9daa-d747884805df"
@@ -42,19 +43,20 @@ Colors = "0.11.2, 0.12"
 DataStructures = "0.17, 0.18"
 Dates = "1.6"
 JuMP = "1.15"
-LinearAlgebra = "1.6"
 LRUCache = "1.4"
+LinearAlgebra = "1.6"
 ManifoldDiff = "0.3.8"
 Manifolds = "0.9.11"
 ManifoldsBase = "0.15"
 ManoptExamples = "0.1.4"
 Markdown = "1.6"
 PolynomialRoots = "1"
+Preferences = "1.4"
 Printf = "1.6"
 Random = "1.6"
 Requires = "0.5, 1"
-Statistics = "1.6"
 SparseArrays = "1.6"
+Statistics = "1.6"
 Test = "1.6"
 julia = "1.6"
 
diff --git a/Readme.md b/Readme.md
index 472ed33285..35ed9d5e52 100644
--- a/Readme.md
+++ b/Readme.md
@@ -78,16 +78,17 @@ for the most recent version or a corresponding version specific DOI, see [the li
 
 If you are also using [`Manifolds.jl`](https://juliamanifolds.github.io/Manifolds.jl/stable/) please consider citing
 
-```
+```biblatex
 @article{AxenBaranBergmannRzecki:2023,
-    AUTHOR = {Seth D. Axen and Mateusz Baran and Ronny Bergmann and Krzysztof Rzecki},
-    DOI = {10.1145/3618296},
-    EPRINT = {2021.08777},
-    EPRINTTYPE = {arXiv},
-    JOURNAL = {AMS Transactions on Mathematical Software},
-    NOTE = {accepted for publication},
-    TITLE = {Manifolds.jl: An Extensible {J}ulia Framework for Data Analysis on Manifolds},
-    YEAR = {2023}
+    AUTHOR = {Axen, Seth D. and Baran, Mateusz and Bergmann, Ronny and Rzecki, Krzysztof},
+    ARTICLENO = {33},
+    DOI = {10.1145/3618296},
+    JOURNAL = {ACM Transactions on Mathematical Software},
+    MONTH = {dec},
+    NUMBER = {4},
+    TITLE = {Manifolds.Jl: An Extensible Julia Framework for Data Analysis on Manifolds},
+    VOLUME = {49},
+    YEAR = {2023}
 }
 ```
diff --git a/docs/src/index.md b/docs/src/index.md
index 167f3dd36f..335c9e5e68 100644
--- a/docs/src/index.md
+++ b/docs/src/index.md
@@ -48,16 +48,34 @@ To refer to a certain version or the source code in general cite for example
 
 ```biblatex
 @software{manoptjl-zenodo-mostrecent,
-    Author = {Ronny Bergmann},
+    Author = {Ronny Bergmann},
     Copyright = {MIT License},
-    Doi = {10.5281/zenodo.4290905},
+    Doi = {10.5281/zenodo.4290905},
     Publisher = {Zenodo},
-    Title = {Manopt.jl},
-    Year = {2022},
+    Title = {Manopt.jl},
+    Year = {2024},
 }
 ```
 
 for the most recent version or a corresponding version specific DOI, see [the list of all versions](https://zenodo.org/search?page=1&size=20&q=conceptrecid:%224290905%22&sort=-version&all_versions=True).
+
+If you are also using [`Manifolds.jl`](https://juliamanifolds.github.io/Manifolds.jl/stable/) please consider citing
+
+```biblatex
+@article{AxenBaranBergmannRzecki:2023,
+    AUTHOR = {Axen, Seth D. and Baran, Mateusz and Bergmann, Ronny and Rzecki, Krzysztof},
+    ARTICLENO = {33},
+    DOI = {10.1145/3618296},
+    JOURNAL = {ACM Transactions on Mathematical Software},
+    MONTH = {dec},
+    NUMBER = {4},
+    TITLE = {Manifolds.Jl: An Extensible Julia Framework for Data Analysis on Manifolds},
+    VOLUME = {49},
+    YEAR = {2023}
+}
+```
+
 Note that both citations are in [BibLaTeX](https://ctan.org/pkg/biblatex) format.
 
 ## Main features
diff --git a/docs/src/plans/index.md b/docs/src/plans/index.md
index 365c8e0861..de05884482 100644
--- a/docs/src/plans/index.md
+++ b/docs/src/plans/index.md
@@ -47,4 +47,10 @@ The column “generic” refers to a short hand that might be used for readabil
 
 Since the iterate is often stored in the state's field `s.p`, one _could_ also access the iterate
 with `:p` and similarly the gradient with `:X`.
 This is discouraged for both readability as well as to stay more generic, and it is recommended
-to use `:Iterate` and `:Gradient` instead in generic settings.
\ No newline at end of file
+to use `:Iterate` and `:Gradient` instead in generic settings.
+
+You can further activate a “Tutorial” mode by `set_manopt_parameter!(:Mode, "Tutorial")`. Internally, the following convenience function is available.
+
+```@docs
+Manopt.is_tutorial_mode
+```
\ No newline at end of file
diff --git a/docs/src/references.bib b/docs/src/references.bib
index 14af657d5a..6eeb0e6b64 100644
--- a/docs/src/references.bib
+++ b/docs/src/references.bib
@@ -8,10 +8,10 @@ @book{AbsilMahonySepulchre:2008,
    AUTHOR = {Absil, P.-A. and Mahony, R.
and Sepulchre, R.}, DOI = {10.1515/9781400830244}, + NOTE = {available online at \href{http://press.princeton.edu/chapters/absil/}{press.princeton.edu/chapters/absil/}}, PUBLISHER = {Princeton University Press}, - NOTE = {[open access](http://press.princeton.edu/chapters/absil/)}, TITLE = {Optimization Algorithms on Matrix Manifolds}, - YEAR = {2008}, + YEAR = {2008} } @article{AbsilBakerGallivan:2006, DOI = {10.1007/s10208-005-0179-9}, @@ -61,27 +61,29 @@ @article{AlmeidaNetoOliveiraSouza:2020 % % @article{Bacak:2014, - AUTHOR = {Bačák, M.}, - DOI = {10.1137/140953393}, - JOURNAL = {SIAM Journal on Optimization}, - NOTE = {arXiv: [1210.2145](https://arxiv.org/abs/1210.2145)}, - NUMBER = {3}, - PAGES = {1542--1566}, - TITLE = {Computing medians and means in Hadamard spaces}, - VOLUME = {24}, - YEAR = {2014} + AUTHOR = {Bačák, M.}, + DOI = {10.1137/140953393}, + JOURNAL = {SIAM Journal on Optimization}, + EPRINT = {1210.2145}, + EPRINTTYPE = {arXiv}, + NUMBER = {3}, + PAGES = {1542--1566}, + TITLE = {Computing medians and means in Hadamard spaces}, + VOLUME = {24}, + YEAR = {2014} } @article{BacakBergmannSteidlWeinmann:2016, AUTHOR = {Bačák, Miroslav and Bergmann, Ronny and Steidl, Gabriele and Weinmann, Andreas}, YEAR = {2016}, DOI = {10.1137/15M101988X}, + EPRINT = {1506.02409}, + EPRINTTYPE = {arXiv}, JOURNAL = {SIAM Journal on Scientific Computing}, NUMBER = {1}, PAGES = {A567--A597}, TITLE = {A second order non-smooth variational model for restoring manifold-valued images}, VOLUME = {38}, - NOTE = {arxiv: [1506.02409](https://arxiv.org/abs/1506.02409)} } @inproceedings{Beale:1972, @@ -128,18 +130,20 @@ @article{BergmannFerreiraSantosSouza:2023 @article{BergmannGousenbourger:2018, AUTHOR = {Bergmann, Ronny and Gousenbourger, Pierre-Yves}, YEAR = {2018}, + EPRINT = {1807.10090}, + EPRINTTYPE = {arXiv}, DOI = {10.3389/fams.2018.00059}, JOURNAL = {Frontiers in Applied Mathematics and Statistics}, TITLE = {A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve}, VOLUME = {4}, - NOTE = {arXiv: [1807.10090](https://arxiv.org/abs/1807.10090)} } @article{BergmannHerzog:2019, AUTHOR = {Bergmann, Ronny and Herzog, Roland}, DOI = {10.1137/18M1181602}, JOURNAL = {SIAM Journal on Optimization}, - NOTE = {arXiv: [1804.06214](https://arxiv.org/abs/1804.06214)}, + EPRINTTYPE = {arXiv}, + EPRINT = {1804.06214}, NUMBER = {4}, PAGES = {2423–2444}, TITLE = {Intrinsic formulation of KKT conditions and constraint qualifications on smooth manifolds}, @@ -149,6 +153,8 @@ @article{BergmannHerzog:2019 @article{BergmannHerzogSilvaLouzeiroTenbrinckVidalNunez:2021, AUTHOR = {Bergmann, Ronny and Herzog, Roland and Silva Louzeiro, Maurício and Tenbrinck, Daniel and Vidal-Núñez, José}, + EPRINT = {1908.02022}, + EPRINTTYPE = {arXiv}, PUBLISHER = {Springer Science and Business Media LLC}, YEAR = {2021}, MONTH = jan, @@ -158,7 +164,6 @@ @article{BergmannHerzogSilvaLouzeiroTenbrinckVidalNunez:2021 PAGES = {1465--1504}, TITLE = {Fenchel duality theory and a primal-dual algorithm on Riemannian manifolds}, VOLUME = {21}, - NOTE = {arXiv: [1908.02022](http://arxiv.org/abs/1908.02022)} } @article{BergmannLausSteidlWeinmann:2014:1, @@ -206,7 +211,7 @@ @book{Boumal:2023 EDITION = {First}, PUBLISHER = {Cambridge University Press}, DOI = {10.1017/9781009166164}, - NOTE = {Homepage to the book: [nicolasboumal.net/book/index.html](https://www.nicolasboumal.net/book/index.html).}, + URL = {https://www.nicolasboumal.net/book/index.html}, ABSTRACT = {Optimization on Riemannian manifolds-the 
result of smooth geometry and optimization merging into one elegant modern framework-spans many areas of science and engineering, including machine learning, computer vision, signal processing, dynamical systems and scientific computing. This text introduces the differential geometry and Riemannian geometry concepts that will help students and researchers in applied mathematics, computer science and engineering gain a firm mathematical grounding to use these tools confidently in their research. Its charts-last approach will prove more intuitive from an optimizer's viewpoint, and all definitions and theorems are motivated to build time-tested optimization algorithms. Starting from first principles, the text goes on to cover current research on topics including worst-case complexity and geodesic convexity. Readers will appreciate the tricks of the trade for conducting research and for numerical implementations sprinkled throughout the book.}, ISBN = {978-1-00-916616-4} } @@ -275,17 +280,18 @@ @article{DaiYuan:1999 YEAR = {1999} } @article{DiepeveenLellmann:2021, - AUTHOR = {Willem Diepeveen and Jan Lellmann}, - DOI = {10.1137/21m1398513}, - JOURNAL = {SIAM Journal on Imaging Sciences}, - MONTH = jan, - NOTE = {arXiv: [2102.10309](https://arxiv.org/abs/2102.10309)}, - NUMBER = {4}, - PAGES = {1565--1600}, - PUBLISHER = {Society for Industrial \& Applied Mathematics ({SIAM})}, - TITLE = {An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising}, - VOLUME = {14}, - YEAR = {2021}, + AUTHOR = {Willem Diepeveen and Jan Lellmann}, + DOI = {10.1137/21m1398513}, + JOURNAL = {SIAM Journal on Imaging Sciences}, + MONTH = jan, + EPRINTTYPE = {arXiv}, + EPRINT = {2102.10309}, + NUMBER = {4}, + PAGES = {1565--1600}, + PUBLISHER = {Society for Industrial \& Applied Mathematics ({SIAM})}, + TITLE = {An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising}, + VOLUME = {14}, + YEAR = {2021}, } @article{DuranMoelleSbertCremers:2016, AUTHOR = {Duran, J. and Moeller, M. and Sbert, C. and Cremers, D.}, @@ -296,7 +302,8 @@ @article{DuranMoelleSbertCremers:2016 PAGES = {116-151}, YEAR = {2016}, DOI = {10.1137/15M102873X}, - NOTE = {arxiv: [1508.01308](https://arxiv.org/abs/1508.01308)} + EPRINT = {1508.01308}, + EPRINTTYPE = {arXiv}, } % --- E @@ -343,7 +350,7 @@ @article{GrapigliaStella:2023 DOI = {10.1007/s10957-023-02227-y}, JOURNAL = {Journal of Optimization Theory and Applications}, MONTH = may, - NOTE = {preprint: [optimization-online.org/wp-content/uploads/2022/04/8864.pdf](https://optimization-online.org/wp-content/uploads/2022/04/8864.pdf)}, + URL = {https://optimization-online.org/wp-content/uploads/2022/04/8864.pdf}, NUMBER = {3}, PAGES = {1140--1160}, PUBLISHER = {Springer Science and Business Media {LLC}}, @@ -457,7 +464,7 @@ @article{LausNikolovaPerschSteidl:2017 AUTHOR = {Laus, F. and Nikolova, M. and Persch, J. 
and Steidl, G.},
    YEAR = {2017},
    DOI = {10.1137/16M1087114},
-   JOURNAL = {SIAM Journal on Imaging Sciences},
+   JOURNAL = {SIAM Journal on Imaging Sciences},
    NUMBER = {1},
    PAGES = {416--448},
    TITLE = {A nonlocal denoising algorithm for manifold-valued images using second order statistics},
@@ -472,7 +479,8 @@ @article{LiuBoumal:2019,
    DOI = {10.1007/s00245-019-09564-3},
    JOURNAL = {Applied Mathematics \& Optimization},
    TITLE = {Simple algorithms for optimization on Riemannian manifolds with constraints},
-   URL = {https://arxiv.org/abs/1901.10000}
+   EPRINT = {1901.10000},
+   EPRINTTYPE = {arXiv},
 }
 @article{Luenberger:1972,
    AUTHOR = {Luenberger, David G},
diff --git a/src/Manopt.jl b/src/Manopt.jl
index 3e7901d19c..f853e49366 100644
--- a/src/Manopt.jl
+++ b/src/Manopt.jl
@@ -122,6 +122,8 @@ using ManifoldsBase: ℂ, ℝ
 using Markdown
+using Preferences:
+    @load_preference, @set_preferences!, @has_preference, @delete_preferences!
 using Printf
 using Random: shuffle!, rand, randperm
 using Requires
@@ -470,7 +472,8 @@ export DebugDualResidual, DebugPrimalDualResidual, DebugPrimalResidual
 export DebugProximalParameter, DebugWarnIfCostIncreases
 export DebugGradient, DebugGradientNorm, DebugStepsize
 export DebugWhenActive, DebugWarnIfFieldNotFinite, DebugIfEntry
-export DebugWarnIfCostNotFinite, DebugWarnIfFieldNotFinite, DebugMessages
+export DebugWarnIfCostNotFinite, DebugWarnIfFieldNotFinite
+export DebugWarnIfGradientNormTooLarge, DebugMessages
 #
 # Records - and access functions
 export get_record, get_record_state, get_record_action, has_record
diff --git a/src/plans/debug.jl b/src/plans/debug.jl
index 8f00f4ff35..957969d80d 100644
--- a/src/plans/debug.jl
+++ b/src/plans/debug.jl
@@ -621,13 +621,15 @@ This debug can be used to `:print` them display them as `:info` or `:warnings` o
 depending on the message type.
 
 # Constructor
+
     DebugMessages(mode=:Info; io::IO=stdout)
 
 Initialize the messages debug to a certain `mode`. Available modes are
-* `:Error` – issue the messages as an error and hence stop at any issue occurring
-* `:Info` – issue the messages as an `@info`
-* `:Print` – print messages to the steam `io`.
-* `:Warning` – issue the messages as a warning
+
+* `:Error` issue the messages as an error and hence stop at any issue occurring
+* `:Info` issue the messages as an `@info`
+* `:Print` print messages to the stream `io`.
+* `:Warning` issue the messages as a warning
 """
 mutable struct DebugMessages <: DebugAction
     io::IO
@@ -946,7 +948,59 @@ function (d::DebugWarnIfFieldNotFinite)(
     return nothing
 end
 function show(io::IO, dw::DebugWarnIfFieldNotFinite)
-    return print(io, "DebugWarnIfFieldNotFinite(:$(dw.field))")
+    return print(io, "DebugWarnIfFieldNotFinite(:$(dw.field), :$(dw.status))")
+end
+
+@doc raw"""
+    DebugWarnIfGradientNormTooLarge{T} <: DebugAction
+
+A debug to warn when the norm of an evaluated gradient at the current iterate is larger than
+(a factor times) the maximal (recommended) stepsize at the current iterate.
+
+# Constructor
+    DebugWarnIfGradientNormTooLarge(factor::T=1.0, warn=:Once)
+
+Initialize the warning to warn `:Once`.
+
+This can be set to `:Once` to only warn the first time the gradient norm is too large.
+It can also be set to `:No` to deactivate the warning, but this makes this Action also useless.
+All other symbols are handled as if they were `:Always`.
+
+# Example
+    DebugWarnIfGradientNormTooLarge(2.0)
+
+Creates a [`DebugAction`] that warns whenever the norm of the current gradient exceeds twice the maximal (recommended) stepsize at the current iterate.
+"""
+mutable struct DebugWarnIfGradientNormTooLarge{T} <: DebugAction
+    status::Symbol
+    factor::T
+    function DebugWarnIfGradientNormTooLarge(factor::T=1.0, warn::Symbol=:Once) where {T}
+        return new{T}(warn, factor)
+    end
+end
+function (d::DebugWarnIfGradientNormTooLarge)(
+    mp::AbstractManoptProblem, st::AbstractManoptSolverState, i::Int
+)
+    if d.status !== :No
+        M = get_manifold(mp)
+        p = get_iterate(st)
+        X = get_gradient(st)
+        Xn = norm(M, p, X)
+        p_inj = d.factor * max_stepsize(M, p)
+        if Xn > p_inj
+            @warn """At iteration #$i
+            the gradient norm ($Xn) is larger than $(d.factor) times the maximal (recommended) stepsize, that is larger than $(p_inj), at the current iterate.
+            """
+            if d.status === :Once
+                @warn "Further warnings will be suppressed, use DebugWarnIfGradientNormTooLarge($(d.factor), :Always) to get all warnings."
+                d.status = :No
+            end
+        end
+    end
+    return nothing
+end
+function show(io::IO, d::DebugWarnIfGradientNormTooLarge)
+    return print(io, "DebugWarnIfGradientNormTooLarge($(d.factor), :$(d.status))")
 end
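For context beyond the diff: a minimal usage sketch of this new action, assuming a toy cost on the sphere (the names `f`, `grad_f`, and the start point are illustrative only, not part of this change):

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
f(M, p) = distance(M, p, q)^2
grad_f(M, p) = -2 * log(M, p, q)  # gradient of the squared distance to q

# warn whenever the gradient norm exceeds 1.5 times the maximal stepsize on M
gradient_descent(M, f, grad_f, [1.0, 0.0, 0.0];
    debug=[DebugWarnIfGradientNormTooLarge(1.5)],
)
```

Any solver that accepts the `debug=` keyword can carry the action this way.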
+""" +mutable struct DebugWarnIfGradientNormTooLarge{T} <: DebugAction + status::Symbol + factor::T + function DebugWarnIfGradientNormTooLarge(factor::T=1.0, warn::Symbol=:Once) where {T} + return new{T}(warn, factor) + end +end +function (d::DebugWarnIfGradientNormTooLarge)( + mp::AbstractManoptProblem, st::AbstractManoptSolverState, i::Int +) + if d.status !== :No + M = get_manifold(mp) + p = get_iterate(st) + X = get_gradient(st) + Xn = norm(M, p, X) + p_inj = d.factor * max_stepsize(M, p) + if Xn > p_inj + @warn """At iteration #$i + the gradient norm ($Xn) is larger that $(d.factor) times the injectivity radius $(p_inj) at the current iterate. + """ + if d.status === :Once + @warn "Further warnings will be suppressed, use DebugWarnIfGradientNormTooLarge($(d.factor), :Always) to get all warnings." + d.status = :No + end + end + end + return nothing +end +function show(io::IO, d::DebugWarnIfGradientNormTooLarge) + return print(io, "DebugWarnIfGradientNormTooLarge($(d.factor), :$(d.status))") end @doc raw""" @@ -1028,6 +1082,7 @@ Note that the Shortcut symbols should all start with a capital letter. * `:Cost` creates a [`DebugCost`](@ref) * `:Change` creates a [`DebugChange`](@ref) +* `:Gradient` creates a [`DebugGradient`](@ref) * `:GradientChange` creates a [`DebugGradientChange`](@ref) * `:GradientNorm` creates a [`DebugGradientNorm`](@ref) * `:Iterate` creates a [`DebugIterate`](@ref) @@ -1047,6 +1102,7 @@ any other symbol creates a `DebugEntry(s)` to print the entry (o.:s) from the op function DebugActionFactory(s::Symbol) (s == :Cost) && return DebugCost() (s == :Change) && return DebugChange() + (s == :Gradient) && return DebugGradient() (s == :GradientChange) && return DebugGradientChange() (s == :GradientNorm) && return DebugGradientNorm() (s == :Iterate) && return DebugIterate() @@ -1071,9 +1127,11 @@ Currently the following ones are done, where the string in `t[2]` is passed as t `format` the corresponding debug. Note that the Shortcut symbols `t[1]` should all start with a capital letter. 
-* `:Cost` creates a [`DebugCost`](@ref)
 * `:Change` creates a [`DebugChange`](@ref)
+* `:Cost` creates a [`DebugCost`](@ref)
+* `:Gradient` creates a [`DebugGradient`](@ref)
 * `:GradientChange` creates a [`DebugGradientChange`](@ref)
+* `:GradientNorm` creates a [`DebugGradientNorm`](@ref)
 * `:Iterate` creates a [`DebugIterate`](@ref)
 * `:Iteration` creates a [`DebugIteration`](@ref)
 * `:Stepsize` creates a [`DebugStepsize`](@ref)
@@ -1084,13 +1142,14 @@ any other symbol creates a `DebugEntry(s)` to print the entry (o.:s) from the op
 """
 function DebugActionFactory(t::Tuple{Symbol,String})
     (t[1] == :Change) && return DebugChange(; format=t[2])
+    (t[1] == :Cost) && return DebugCost(; format=t[2])
+    (t[1] == :Gradient) && return DebugGradient(; format=t[2])
     (t[1] == :GradientChange) && return DebugGradientChange(; format=t[2])
+    (t[1] == :GradientNorm) && return DebugGradientNorm(; format=t[2])
     (t[1] == :Iteration) && return DebugIteration(; format=t[2])
     (t[1] == :Iterate) && return DebugIterate(; format=t[2])
-    (t[1] == :GradientNorm) && return DebugGradientNorm(; format=t[2])
-    (t[1] == :Cost) && return DebugCost(; format=t[2])
+    (t[1] == :IterativeTime) && return DebugTime(; mode=:Iterative, format=t[2])
     (t[1] == :Stepsize) && return DebugStepsize(; format=t[2])
     (t[1] == :Time) && return DebugTime(; format=t[2])
-    (t[1] == :IterativeTime) && return DebugTime(; mode=:Iterative, format=t[2])
     return DebugEntry(t[1]; format=t[2])
 end
diff --git a/src/plans/plan.jl b/src/plans/plan.jl
index c1fa227e53..335d608151 100644
--- a/src/plans/plan.jl
+++ b/src/plans/plan.jl
@@ -25,8 +25,10 @@ end
 """
     get_manopt_parameter(f, element::Symbol, args...)
 
+Access arbitrary parameters from `f` addressed by a symbol `element`.
+
 For any `f` and a `Symbol` `e` we dispatch on its value, so by default, to
-get some element from `f` potentially further qulalified by `args...`.
+get some element from `f` potentially further qualified by `args...`.
 
 This function returns `nothing` if `f` does not have the property `element`.
 """
@@ -35,6 +37,64 @@ function get_manopt_parameter(f, e::Symbol, args...)
 end
 get_manopt_parameter(f, args...) = nothing
 
+"""
+    get_manopt_parameter(element::Symbol; default=nothing)
+
+Access global [`Manopt`](@ref) parameters addressed by a symbol `element`.
+We first dispatch on the value of `element`.
+
+If the value is not set, `default` is returned.
+
+The parameters are queried from the global settings using [`Preferences.jl`](https://github.com/JuliaPackaging/Preferences.jl),
+so they are persistent within your activated Environment.
+
+# Currently used settings
+
+`:Mode`
+the mode can be set to `"Tutorial"` to get several hints, especially in scenarios where
+the optimisation on manifolds is different from the usual “experience” in
+(classical, Euclidean) optimization.
+Any other value has the same effect as not setting it.
+"""
+function get_manopt_parameter(
+    e::Symbol, args...; default=get_manopt_parameter(Val(e), Val(:default))
+)
+    return @load_preference("$(e)", default)
+end
+# Handle empty defaults
+get_manopt_parameter(::Symbol, ::Val{:default}) = nothing
+get_manopt_parameter(::Val{:Mode}, v::Val{:default}) = nothing
+
+"""
+    set_manopt_parameter!(element::Symbol, value::Union{String,Bool,<:Number})
+
+Set global [`Manopt`](@ref) parameters addressed by a symbol `element`.
+We first dispatch on the value of `element`.
+
+The parameters are stored to the global settings using [`Preferences.jl`](https://github.com/JuliaPackaging/Preferences.jl).
+
+Passing a `value` of `""` deletes the corresponding entry from the preferences.
+Whenever the `LocalPreferences.toml` is modified, an `@info` message is issued.
+"""
+function set_manopt_parameter!(e::Symbol, value::Union{String,Bool,<:Number})
+    if length(value) == 0
+        @delete_preferences!(string(e))
+        v = get_manopt_parameter(e, Val(:default))
+        default = isnothing(v) ? "" : ((v isa String) ? " \"$v\"" : " ($v)")
+        @info("Resetting the `Manopt.jl` parameter :$(e) to default$(default).")
+    else
+        @set_preferences!("$(e)" => value)
+        @info("Setting the `Manopt.jl` parameter :$(e) to $value.")
+    end
+end
+
+"""
+    is_tutorial_mode()
+
+A small internal helper to indicate whether tutorial mode is active.
+"""
+is_tutorial_mode() = (get_manopt_parameter(:Mode) == "Tutorial")
+
 include("objective.jl")
 include("problem.jl")
 include("solver_state.jl")
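For context beyond the diff: a minimal sketch of the new preference-based parameters in use; the calls mirror the docstrings above, `:Mode` being the only setting used so far (the comments indicate expected behavior, not captured output):

```julia
using Manopt

set_manopt_parameter!(:Mode, "Tutorial")  # stored via Preferences.jl, persists per environment
Manopt.is_tutorial_mode()                 # now returns true
Manopt.get_manopt_parameter(:Mode)        # returns "Tutorial"
set_manopt_parameter!(:Mode, "")          # an empty value deletes the preference again
```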
diff --git a/src/plans/quasi_newton_plan.jl b/src/plans/quasi_newton_plan.jl
index d73b635577..65087921ff 100644
--- a/src/plans/quasi_newton_plan.jl
+++ b/src/plans/quasi_newton_plan.jl
@@ -586,11 +586,11 @@ as given in [`QuasiNewtonMatrixDirectionUpdate`](@ref) or [`QuasiNewtonLimitedMe
 but the update then is only executed if
 
 ```math
-\frac{g_{x_{k+1}}(y_k,s_k)}{\lVert s_k \rVert^{2}_{x_{k+1}}} \geq \theta(\lVert \operatorname{grad}f(x_k) \rVert_{x_k}),
+\frac{g_{x_{k+1}}(y_k,s_k)}{\lVert s_k \rVert^{2}_{x_{k+1}}} ≥ θ(\lVert \operatorname{grad}f(x_k) \rVert_{x_k}),
 ```
 
-is satisfied, where ``\theta`` is a monotone increasing function satisfying ``\theta(0) = 0``
-and ``\theta`` is strictly increasing at ``0``. If this is not the case, the corresponding
+is satisfied, where ``θ`` is a monotone increasing function satisfying ``θ(0) = 0``
+and ``θ`` is strictly increasing at ``0``. If this is not the case, the corresponding
 update will be skipped, which means that for [`QuasiNewtonMatrixDirectionUpdate`](@ref)
 the matrix ``H_k`` or ``B_k`` is not updated.
 The basis ``\{b_i\}^{n}_{i=1}`` is nevertheless transported into the upcoming tangent
@@ -600,8 +600,8 @@ discarded nor the newest vector pair ``\{ \widetilde{s}_{k}, \widetilde{y}_{k}\}
 into storage, but all stored vector pairs ``\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}``
 are transported into the tangent space ``T_{x_{k+1}} \mathcal{M}``.
 If [`InverseBFGS`](@ref) is chosen as update, then the resulting
-method follows the method of [Huang, Absil, Gallivan, SIAM J. Optim., 2018](@cite HuangAbsilGallivan:2018), taking into account that
-the corresponding step size is chosen.
+method follows the method of [Huang, Absil, Gallivan, SIAM J. Optim., 2018](@cite HuangAbsilGallivan:2018),
+taking into account that the corresponding step size is chosen.
 
 # Provided functors
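For context beyond the diff: a hedged sketch of invoking the cautious update described above on a made-up toy problem; `cautious_function` is the ``θ`` from the docstring, and all problem data here are illustrative:

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
f(M, p) = distance(M, p, q)^2
grad_f(M, p) = -2 * log(M, p, q)

p_res = quasi_Newton(M, f, grad_f, [1.0, 0.0, 0.0];
    cautious_update=true,            # skip updates that violate the bound above
    cautious_function=x -> 1e-4 * x, # a monotone increasing θ with θ(0) = 0
)
```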
-""" -mutable struct StopWhenSubgradientNormLess <: StoppingCriterion - threshold::Float64 - reason::String - StopWhenSubgradientNormLess(ε::Float64) = new(ε, "") -end -function (c::StopWhenSubgradientNormLess)( - mp::AbstractManoptProblem, s::AbstractManoptSolverState, i::Int -) - M = get_manifold(mp) - (i == 0) && (c.reason = "") # reset on init - if (norm(M, get_iterate(s), get_subgradient(s)) < c.threshold) && (i > 0) - c.reason = "The algorithm reached approximately critical point after $i iterations; the subgradient norm ($(norm(M,get_iterate(s),get_subgradient(s)))) is less than $(c.threshold).\n" - return true - end - return false -end -function status_summary(c::StopWhenSubgradientNormLess) - has_stopped = length(c.reason) > 0 - s = has_stopped ? "reached" : "not reached" - return "|subgrad f| < $(c.threshold): $s" -end -indicates_convergence(c::StopWhenSubgradientNormLess) = true -function show(io::IO, c::StopWhenSubgradientNormLess) - return print( - io, "StopWhenSubgradientNormLess($(c.threshold))\n $(status_summary(c))" - ) -end -""" - update_stopping_criterion!(c::StopWhenSubgradientNormLess, :MinSubgradNorm, v::Float64) - -Update the minimal subgradient norm when an algorithm shall stop -""" -function update_stopping_criterion!( - c::StopWhenSubgradientNormLess, ::Val{:MinSubgradNorm}, v::Float64 -) - c.threshold = v - return c -end - """ StopWhenChangeLess <: StoppingCriterion @@ -726,6 +675,62 @@ function show(io::IO, c::StopWhenSmallerOrEqual) ) end +""" + StopWhenSubgradientNormLess <: StoppingCriterion + +A stopping criterion based on the current subgradient norm. + +# Constructor + + StopWhenSubgradientNormLess(ε::Float64) + +Create a stopping criterion with threshold `ε` for the subgradient, that is, this criterion +indicates to stop when [`get_subgradient`](@ref) returns a subgradient vector of norm less than `ε`. +""" +mutable struct StopWhenSubgradientNormLess <: StoppingCriterion + at_iteration::Int + threshold::Float64 + reason::String + StopWhenSubgradientNormLess(ε::Float64) = new(0, ε, "") +end +function (c::StopWhenSubgradientNormLess)( + mp::AbstractManoptProblem, s::AbstractManoptSolverState, i::Int +) + M = get_manifold(mp) + if (i == 0) # reset on init + c.reason = "" + c.at_iteration = 0 + end + if (norm(M, get_iterate(s), get_subgradient(s)) < c.threshold) && (i > 0) + c.at_iteration = i + c.reason = "The algorithm reached approximately critical point after $i iterations; the subgradient norm ($(norm(M,get_iterate(s),get_subgradient(s)))) is less than $(c.threshold).\n" + return true + end + return false +end +function status_summary(c::StopWhenSubgradientNormLess) + has_stopped = length(c.reason) > 0 + s = has_stopped ? 
"reached" : "not reached" + return "|∂f| < $(c.threshold): $s" +end +indicates_convergence(c::StopWhenSubgradientNormLess) = true +function show(io::IO, c::StopWhenSubgradientNormLess) + return print( + io, "StopWhenSubgradientNormLess($(c.threshold))\n $(status_summary(c))" + ) +end +""" + update_stopping_criterion!(c::StopWhenSubgradientNormLess, :MinSubgradNorm, v::Float64) + +Update the minimal subgradient norm when an algorithm shall stop +""" +function update_stopping_criterion!( + c::StopWhenSubgradientNormLess, ::Val{:MinSubgradNorm}, v::Float64 +) + c.threshold = v + return c +end + # # Meta Criteria # diff --git a/src/solvers/gradient_descent.jl b/src/solvers/gradient_descent.jl index 5301b275e6..106f040e02 100644 --- a/src/solvers/gradient_descent.jl +++ b/src/solvers/gradient_descent.jl @@ -240,7 +240,15 @@ function gradient_descent!( ), stopping_criterion::StoppingCriterion=StopAfterIteration(200) | StopWhenGradientNormLess(1e-8), - debug=stepsize isa ConstantStepsize ? [DebugWarnIfCostIncreases()] : [], + debug=if is_tutorial_mode() + if (stepsize isa ConstantStepsize) + [DebugWarnIfCostIncreases(), DebugWarnIfGradientNormTooLarge()] + else + [DebugWarnIfGradientNormTooLarge()] + end + else + [] + end, direction=IdentityUpdateRule(), X=zero_vector(M, p), kwargs..., #collect rest diff --git a/src/solvers/quasi_Newton.jl b/src/solvers/quasi_Newton.jl index a723ad56c9..0577ce3e9a 100644 --- a/src/solvers/quasi_Newton.jl +++ b/src/solvers/quasi_Newton.jl @@ -268,7 +268,8 @@ function quasi_Newton!( mgo::O, p; cautious_update::Bool=false, - cautious_function::Function=x -> x * 10^(-4), + cautious_function::Function=x -> x * 1e-4, + debug=is_tutorial_mode() ? [DebugWarnIfGradientNormTooLarge()] : [], retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)), vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method( M, typeof(p) @@ -320,7 +321,7 @@ function quasi_Newton!( local_dir_upd; θ=cautious_function ) end - dmgo = decorate_objective!(M, mgo; kwargs...) + dmgo = decorate_objective!(M, mgo; debug=debug, kwargs...) mp = DefaultManoptProblem(M, dmgo) qns = QuasiNewtonState( M, @@ -332,14 +333,16 @@ function quasi_Newton!( retraction_method=retraction_method, vector_transport_method=vector_transport_method, ) - dqns = decorate_state!(qns; kwargs...) + dqns = decorate_state!(qns; debug=debug, kwargs...) 
solve!(mp, dqns) return get_solver_return(get_objective(mp), dqns) end -function initialize_solver!(p::AbstractManoptProblem, s::QuasiNewtonState) - s.X = get_gradient(p, s.p) - s.sk = deepcopy(s.X) - return s.yk = deepcopy(s.X) +function initialize_solver!(amp::AbstractManoptProblem, qns::QuasiNewtonState) + M = get_manifold(amp) + get_gradient!(amp, qns.X, qns.p) + copyto!(M, qns.sk, qns.p, qns.X) + copyto!(M, qns.yk, qns.p, qns.X) + return qns end function step_solver!(mp::AbstractManoptProblem, qns::QuasiNewtonState, iter) M = get_manifold(mp) diff --git a/test/helpers/test_linesearches.jl b/test/helpers/test_linesearches.jl index 24dd397024..6063937cca 100644 --- a/test/helpers/test_linesearches.jl +++ b/test/helpers/test_linesearches.jl @@ -32,6 +32,7 @@ using Test rosenbrock_grad!, x0; stepsize=ls_hz, + debug=[], evaluation=InplaceEvaluation(), stopping_criterion=StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6), return_state=true, diff --git a/test/plans/test_debug.jl b/test/plans/test_debug.jl index 96bd46be04..17bb6239b2 100644 --- a/test/plans/test_debug.jl +++ b/test/plans/test_debug.jl @@ -238,7 +238,7 @@ Manopt.get_message(::TestMessageState) = "DebugTest" st.X = grad_f(M, p) w3 = DebugWarnIfFieldNotFinite(:X) - @test repr(w3) == "DebugWarnIfFieldNotFinite(:X)" + @test repr(w3) == "DebugWarnIfFieldNotFinite(:X, :Once)" @test_logs (:warn,) ( :warn, "Further warnings will be suppressed, use DebugWaranIfFieldNotFinite(:X, :Always) to get all warnings.", @@ -254,12 +254,16 @@ Manopt.get_message(::TestMessageState) = "DebugTest" "The gradient is or contains values that are not finite.\nAt iteration #1 it evaluated to [Inf, Inf].", ) w5(mp, st, 1) + M2 = Sphere(2) + mp2 = DefaultManoptProblem(M2, ManifoldGradientObjective(f, grad_f)) + w6 = DebugWarnIfGradientNormTooLarge(1.0, :Once) + @test repr(w6) == "DebugWarnIfGradientNormTooLarge(1.0, :Once)" + st.X .= [4.0, 0.0] # > π in norm + @test_logs (:warn,) (:warn,) w6(mp2, st, 1) + st.p = Inf .* ones(2) - w6 = DebugWarnIfFieldNotFinite(:Iterate, :Always) - @test_logs ( - :warn, - "The iterate is or contains values that are not finite.\nAt iteration #1 it evaluated to [Inf, Inf].", - ) w6(mp, st, 1) + w7 = DebugWarnIfFieldNotFinite(:Iterate, :Always) + @test_logs (:warn,) w7(mp, st, 1) df1 = DebugFactory([:WarnCost]) @test isa(df1[:All].group[1], DebugWarnIfCostNotFinite) diff --git a/test/plans/test_higher_order_primal_dual_plan.jl b/test/plans/test_higher_order_primal_dual_plan.jl index fb15ae381a..44b87cc201 100644 --- a/test/plans/test_higher_order_primal_dual_plan.jl +++ b/test/plans/test_higher_order_primal_dual_plan.jl @@ -1,7 +1,9 @@ using Manopt, Manifolds, ManifoldsBase, Test using ManoptExamples: forward_logs, adjoint_differential_forward_logs using ManifoldDiff: - differential_shortest_geodesic_startpoint, differential_shortest_geodesic_startpoint! + differential_shortest_geodesic_startpoint, + differential_shortest_geodesic_startpoint!, + prox_distance! 
@testset "Test higher order primal dual plan" begin # Perform an really easy test, just compute a mid point # diff --git a/test/plans/test_parameters.jl b/test/plans/test_parameters.jl new file mode 100644 index 0000000000..99b9eeecf6 --- /dev/null +++ b/test/plans/test_parameters.jl @@ -0,0 +1,14 @@ +using Manifolds, Manopt, Test, ManifoldsBase + +@testset "Generic Parameters" begin + # test one internal fallback + Manopt.get_manopt_parameter(:None, Val(:default)) === nothing + @test_logs (:info, "Setting the `Manopt.jl` parameter :TestValue to Å.") Manopt.set_manopt_parameter!( + :TestValue, "Å" + ) + @test Manopt.get_manopt_parameter(:TestValue) == "Å" + @test_logs (:info, "Resetting the `Manopt.jl` parameter :TestValue to default.") Manopt.set_manopt_parameter!( + :TestValue, "" + ) # reset + @test Manopt.get_manopt_parameter(:TestValue) === nothing +end diff --git a/test/plans/test_stopping_criteria.jl b/test/plans/test_stopping_criteria.jl index 474313604e..ab0a878cfa 100644 --- a/test/plans/test_stopping_criteria.jl +++ b/test/plans/test_stopping_criteria.jl @@ -6,177 +6,207 @@ struct myStoppingCriteriaSet <: StoppingCriterionSet end struct DummyStoppingCriterion <: StoppingCriterion end @testset "StoppingCriteria" begin - @test_throws ErrorException get_stopping_criteria(myStoppingCriteriaSet()) + @testset "Generic Tests" begin + @test_throws ErrorException get_stopping_criteria(myStoppingCriteriaSet()) - s = StopWhenAll(StopAfterIteration(10), StopWhenChangeLess(0.1)) - @test Manopt.indicates_convergence(s) #due to all and change this is true - @test startswith(repr(s), "StopWhenAll with the") - s2 = StopWhenAll([StopAfterIteration(10), StopWhenChangeLess(0.1)]) - @test get_stopping_criteria(s)[1].maxIter == get_stopping_criteria(s2)[1].maxIter + s = StopWhenAll(StopAfterIteration(10), StopWhenChangeLess(0.1)) + @test Manopt.indicates_convergence(s) #due to all and change this is true + @test startswith(repr(s), "StopWhenAll with the") + s2 = StopWhenAll([StopAfterIteration(10), StopWhenChangeLess(0.1)]) + @test get_stopping_criteria(s)[1].maxIter == get_stopping_criteria(s2)[1].maxIter - s3 = StopWhenCostLess(0.1) - p = DefaultManoptProblem(Euclidean(), ManifoldGradientObjective((M, x) -> x^2, x -> 2x)) - s = GradientDescentState(Euclidean(), 1.0) - @test !s3(p, s, 1) - @test length(s3.reason) == 0 - s.p = 0.3 - @test s3(p, s, 2) - @test length(s3.reason) > 0 - # repack - sn = StopWhenAny(StopAfterIteration(10), s3) - @test !Manopt.indicates_convergence(sn) # since it might stop after 10 iters - @test startswith(repr(sn), "StopWhenAny with the") - sn2 = StopAfterIteration(10) | s3 - @test get_stopping_criteria(sn)[1].maxIter == get_stopping_criteria(sn2)[1].maxIter - @test get_stopping_criteria(sn)[2].threshold == get_stopping_criteria(sn2)[2].threshold - # then s3 is the only active one - @test get_active_stopping_criteria(sn) == [s3] - @test get_active_stopping_criteria(s3) == [s3] - @test get_active_stopping_criteria(StopAfterIteration(1)) == [] - sm = StopWhenAll(StopAfterIteration(10), s3) - s1 = "StopAfterIteration(10)\n Max Iteration 10:\tnot reached" - @test repr(StopAfterIteration(10)) == s1 - @test !sm(p, s, 9) - @test sm(p, s, 11) - an = sm.reason - m = match(r"^((.*)\n)+", an) - @test length(m.captures) == 2 # both have to be active - update_stopping_criterion!(s3, :MinCost, 1e-2) - @test s3.threshold == 1e-2 - # Dummy without iterations has a reasonable fallback - @test Manopt.get_count(DummyStoppingCriterion(), Val(:Iterations)) == 0 -end + s3 = 
StopWhenCostLess(0.1) + p = DefaultManoptProblem( + Euclidean(), ManifoldGradientObjective((M, x) -> x^2, x -> 2x) + ) + s = GradientDescentState(Euclidean(), 1.0) + @test !s3(p, s, 1) + @test length(s3.reason) == 0 + s.p = 0.3 + @test s3(p, s, 2) + @test length(s3.reason) > 0 + # repack + sn = StopWhenAny(StopAfterIteration(10), s3) + @test !Manopt.indicates_convergence(sn) # since it might stop after 10 iters + @test startswith(repr(sn), "StopWhenAny with the") + sn2 = StopAfterIteration(10) | s3 + @test get_stopping_criteria(sn)[1].maxIter == get_stopping_criteria(sn2)[1].maxIter + @test get_stopping_criteria(sn)[2].threshold == + get_stopping_criteria(sn2)[2].threshold + # then s3 is the only active one + @test get_active_stopping_criteria(sn) == [s3] + @test get_active_stopping_criteria(s3) == [s3] + @test get_active_stopping_criteria(StopAfterIteration(1)) == [] + sm = StopWhenAll(StopAfterIteration(10), s3) + s1 = "StopAfterIteration(10)\n Max Iteration 10:\tnot reached" + @test repr(StopAfterIteration(10)) == s1 + @test !sm(p, s, 9) + @test sm(p, s, 11) + an = sm.reason + m = match(r"^((.*)\n)+", an) + @test length(m.captures) == 2 # both have to be active + update_stopping_criterion!(s3, :MinCost, 1e-2) + @test s3.threshold == 1e-2 + # Dummy without iterations has a reasonable fallback + @test Manopt.get_count(DummyStoppingCriterion(), Val(:Iterations)) == 0 + end -@testset "Test StopAfter" begin - p = TestStopProblem() - o = TestStopState() - s = StopAfter(Millisecond(30)) - @test !Manopt.indicates_convergence(s) - @test Manopt.status_summary(s) == "stopped after $(s.threshold):\tnot reached" - @test repr(s) == "StopAfter(Millisecond(30))\n $(Manopt.status_summary(s))" - s(p, o, 0) # Start - @test s(p, o, 1) == false - sleep(0.05) - @test s(p, o, 2) == true - @test_throws ErrorException StopAfter(Second(-1)) - @test_throws ErrorException update_stopping_criterion!(s, :MaxTime, Second(-1)) - update_stopping_criterion!(s, :MaxTime, Second(2)) - @test s.threshold == Second(2) -end + @testset "Test StopAfter" begin + p = TestStopProblem() + o = TestStopState() + s = StopAfter(Millisecond(30)) + @test !Manopt.indicates_convergence(s) + @test Manopt.status_summary(s) == "stopped after $(s.threshold):\tnot reached" + @test repr(s) == "StopAfter(Millisecond(30))\n $(Manopt.status_summary(s))" + s(p, o, 0) # Start + @test s(p, o, 1) == false + sleep(0.05) + @test s(p, o, 2) == true + @test_throws ErrorException StopAfter(Second(-1)) + @test_throws ErrorException update_stopping_criterion!(s, :MaxTime, Second(-1)) + update_stopping_criterion!(s, :MaxTime, Second(2)) + @test s.threshold == Second(2) + end -@testset "Stopping Criterion &/| operators" begin - a = StopAfterIteration(200) - b = StopWhenChangeLess(1e-6) - sb = "StopWhenChangeLess(1.0e-6)\n $(Manopt.status_summary(b))" - @test repr(b) == sb - b2 = StopWhenChangeLess(Euclidean(), 1e-6) # second constructor - @test repr(b2) == sb - c = StopWhenGradientNormLess(1e-6) - sc = "StopWhenGradientNormLess(1.0e-6)\n $(Manopt.status_summary(c))" - @test repr(c) == sc - c2 = StopWhenSubgradientNormLess(1e-6) - sc2 = "StopWhenSubgradientNormLess(1.0e-6)\n $(Manopt.status_summary(c2))" - @test repr(c2) == sc2 - d = StopWhenAll(a, b, c) - @test typeof(d) === typeof(a & b & c) - @test typeof(d) === typeof(a & (b & c)) - @test typeof(d) === typeof((a & b) & c) - update_stopping_criterion!(d, :MinIterateChange, 1e-8) - @test d.criteria[2].threshold == 1e-8 - e = a | b | c - @test typeof(e) === typeof(a | b | c) - @test typeof(e) === typeof(a | (b 
| c)) - @test typeof(e) === typeof((a | b) | c) - update_stopping_criterion!(e, :MinGradNorm, 1e-9) - @test d.criteria[3].threshold == 1e-9 -end + @testset "Stopping Criterion &/| operators" begin + a = StopAfterIteration(200) + b = StopWhenChangeLess(1e-6) + sb = "StopWhenChangeLess(1.0e-6)\n $(Manopt.status_summary(b))" + @test repr(b) == sb + b2 = StopWhenChangeLess(Euclidean(), 1e-6) # second constructor + @test repr(b2) == sb + c = StopWhenGradientNormLess(1e-6) + sc = "StopWhenGradientNormLess(1.0e-6)\n $(Manopt.status_summary(c))" + @test repr(c) == sc + d = StopWhenAll(a, b, c) + @test typeof(d) === typeof(a & b & c) + @test typeof(d) === typeof(a & (b & c)) + @test typeof(d) === typeof((a & b) & c) + update_stopping_criterion!(d, :MinIterateChange, 1e-8) + @test d.criteria[2].threshold == 1e-8 + e = a | b | c + @test typeof(e) === typeof(a | b | c) + @test typeof(e) === typeof(a | (b | c)) + @test typeof(e) === typeof((a | b) | c) + update_stopping_criterion!(e, :MinGradNorm, 1e-9) + @test d.criteria[3].threshold == 1e-9 + end -@testset "Stopping Criterion print&summary" begin - f = StopWhenStepsizeLess(1e-6) - sf1 = "Stepsize s < 1.0e-6:\tnot reached" - @test Manopt.status_summary(f) == sf1 - sf2 = "StopWhenStepsizeLess(1.0e-6)\n $(sf1)" - @test repr(f) == sf2 - g = StopWhenCostLess(1e-4) - @test Manopt.status_summary(g) == "f(x) < $(1e-4):\tnot reached" - @test repr(g) == "StopWhenCostLess(0.0001)\n $(Manopt.status_summary(g))" - gf(M, p) = norm(p) - grad_gf(M, p) = p - gp = DefaultManoptProblem(Euclidean(2), ManifoldGradientObjective(gf, grad_gf)) - gs = GradientDescentState(Euclidean(2)) - Manopt.set_iterate!(gs, Euclidean(2), [0.0, 1e-2]) - g(gp, gs, 0) # reset - @test length(g.reason) == 0 - @test !g(gp, gs, 1) - Manopt.set_iterate!(gs, Euclidean(2), [0.0, 1e-8]) - @test g(gp, gs, 2) - @test length(g.reason) > 0 - h = StopWhenSmallerOrEqual(:p, 1e-4) - @test repr(h) == "StopWhenSmallerOrEqual(:p, $(1e-4))\n $(Manopt.status_summary(h))" - swgcl1 = StopWhenGradientChangeLess(Euclidean(2), 1e-8) - swgcl2 = StopWhenGradientChangeLess(1e-8) - for swgcl in [swgcl1, swgcl2] - repr(swgcl) == - "StopWhenGradientChangeLess($(1e-8); vector_transport_method=ParallelTransport())\n $(Manopt.status_summary(swgcl))" - swgcl(gp, gs, 0) # reset - @test swgcl(gp, gs, 1) # change 0 -> true - @test endswith(Manopt.status_summary(swgcl), "reached") + @testset "Stopping Criterion print&summary" begin + f = StopWhenStepsizeLess(1e-6) + sf1 = "Stepsize s < 1.0e-6:\tnot reached" + @test Manopt.status_summary(f) == sf1 + sf2 = "StopWhenStepsizeLess(1.0e-6)\n $(sf1)" + @test repr(f) == sf2 + g = StopWhenCostLess(1e-4) + @test Manopt.status_summary(g) == "f(x) < $(1e-4):\tnot reached" + @test repr(g) == "StopWhenCostLess(0.0001)\n $(Manopt.status_summary(g))" + gf(M, p) = norm(p) + grad_gf(M, p) = p + gp = DefaultManoptProblem(Euclidean(2), ManifoldGradientObjective(gf, grad_gf)) + gs = GradientDescentState(Euclidean(2)) + Manopt.set_iterate!(gs, Euclidean(2), [0.0, 1e-2]) + g(gp, gs, 0) # reset + @test length(g.reason) == 0 + @test !g(gp, gs, 1) + Manopt.set_iterate!(gs, Euclidean(2), [0.0, 1e-8]) + @test g(gp, gs, 2) + @test length(g.reason) > 0 + h = StopWhenSmallerOrEqual(:p, 1e-4) + @test repr(h) == + "StopWhenSmallerOrEqual(:p, $(1e-4))\n $(Manopt.status_summary(h))" + swgcl1 = StopWhenGradientChangeLess(Euclidean(2), 1e-8) + swgcl2 = StopWhenGradientChangeLess(1e-8) + for swgcl in [swgcl1, swgcl2] + repr(swgcl) == + "StopWhenGradientChangeLess($(1e-8); 
vector_transport_method=ParallelTransport())\n $(Manopt.status_summary(swgcl))" + swgcl(gp, gs, 0) # reset + @test swgcl(gp, gs, 1) # change 0 -> true + @test endswith(Manopt.status_summary(swgcl), "reached") + end + update_stopping_criterion!(swgcl1, :MinGradientChange, 1e-9) + @test swgcl1.threshold == 1e-9 end - update_stopping_criterion!(swgcl1, :MinGradientChange, 1e-9) - @test swgcl1.threshold == 1e-9 -end -@testset "TCG stopping criteria" begin - # create dummy criterion - ho = ManifoldHessianObjective(x -> x, (M, x) -> x, (M, x) -> x, x -> x) - hp = DefaultManoptProblem(Euclidean(), ho) - tcgs = TruncatedConjugateGradientState( - TangentSpace(Euclidean(), 1.0), 0.0; trust_region_radius=2.0, randomize=false - ) - tcgs.model_value = 1.0 - s = StopWhenModelIncreased() - @test !s(hp, tcgs, 0) - @test s.reason == "" - s.model_value = 0.5 # twweak - @test s(hp, tcgs, 1) - @test length(s.reason) > 0 - s2 = StopWhenCurvatureIsNegative() - tcgs.δHδ = -1.0 - @test !s2(hp, tcgs, 0) - @test s2.reason == "" - @test s2(hp, tcgs, 1) - @test length(s2.reason) > 0 - s3 = StopWhenResidualIsReducedByFactorOrPower() - update_stopping_criterion!(s3, :ResidualFactor, 0.5) - @test s3.κ == 0.5 - update_stopping_criterion!(s3, :ResidualPower, 0.5) - @test s3.θ == 0.5 -end + @testset "TCG stopping criteria" begin + # create dummy criterion + ho = ManifoldHessianObjective(x -> x, (M, x) -> x, (M, x) -> x, x -> x) + hp = DefaultManoptProblem(Euclidean(), ho) + tcgs = TruncatedConjugateGradientState( + TangentSpace(Euclidean(), 1.0), 0.0; trust_region_radius=2.0, randomize=false + ) + tcgs.model_value = 1.0 + s = StopWhenModelIncreased() + @test !s(hp, tcgs, 0) + @test s.reason == "" + s.model_value = 0.5 # twweak + @test s(hp, tcgs, 1) + @test length(s.reason) > 0 + s2 = StopWhenCurvatureIsNegative() + tcgs.δHδ = -1.0 + @test !s2(hp, tcgs, 0) + @test s2.reason == "" + @test s2(hp, tcgs, 1) + @test length(s2.reason) > 0 + s3 = StopWhenResidualIsReducedByFactorOrPower() + update_stopping_criterion!(s3, :ResidualFactor, 0.5) + @test s3.κ == 0.5 + update_stopping_criterion!(s3, :ResidualPower, 0.5) + @test s3.θ == 0.5 + end -@testset "Stop with step size" begin - mgo = ManifoldGradientObjective((M, x) -> x^2, x -> 2x) - dmp = DefaultManoptProblem(Euclidean(), mgo) - gds = GradientDescentState( - Euclidean(), - 1.0; - stopping_criterion=StopAfterIteration(100), - stepsize=ConstantStepsize(Euclidean()), - ) - s1 = StopWhenStepsizeLess(0.5) - @test !s1(dmp, gds, 1) - @test s1.reason == "" - gds.stepsize = ConstantStepsize(; stepsize=0.25) - @test s1(dmp, gds, 2) - @test length(s1.reason) > 0 - update_stopping_criterion!(gds, :MaxIteration, 200) - @test gds.stop.maxIter == 200 - update_stopping_criterion!(s1, :MinStepsize, 1e-1) - @test s1.threshold == 1e-1 -end + @testset "Stop with step size" begin + mgo = ManifoldGradientObjective((M, x) -> x^2, x -> 2x) + dmp = DefaultManoptProblem(Euclidean(), mgo) + gds = GradientDescentState( + Euclidean(), + 1.0; + stopping_criterion=StopAfterIteration(100), + stepsize=ConstantStepsize(Euclidean()), + ) + s1 = StopWhenStepsizeLess(0.5) + @test !s1(dmp, gds, 1) + @test s1.reason == "" + gds.stepsize = ConstantStepsize(; stepsize=0.25) + @test s1(dmp, gds, 2) + @test length(s1.reason) > 0 + update_stopping_criterion!(gds, :MaxIteration, 200) + @test gds.stop.maxIter == 200 + update_stopping_criterion!(s1, :MinStepsize, 1e-1) + @test s1.threshold == 1e-1 + end -@testset "Test further setters" begin - swecl = StopWhenEntryChangeLess(:dummy, (p, s, v, w) -> norm(w - v), 1e-5) - 
@test startswith(repr(swecl), "StopWhenEntryChangeLess\n") - update_stopping_criterion!(swecl, :Threshold, 1e-1) - @test swecl.threshold == 1e-1 + @testset "Test further setters" begin + swecl = StopWhenEntryChangeLess(:dummy, (p, s, v, w) -> norm(w - v), 1e-5) + @test startswith(repr(swecl), "StopWhenEntryChangeLess\n") + update_stopping_criterion!(swecl, :Threshold, 1e-1) + @test swecl.threshold == 1e-1 + end + @testset "Subgradient Norm Stopping Criterion" begin + M = Euclidean(2) + p = [1.0, 2.0] + f(M, q) = distance(M, q, p) + function ∂f(M, q) + if distance(M, p, q) == 0 + return zero_vector(M, q) + end + return -log(M, q, p) / max(10 * eps(Float64), distance(M, p, q)) + end + mso = ManifoldSubgradientObjective(f, ∂f) + mp = DefaultManoptProblem(M, mso) + c2 = StopWhenSubgradientNormLess(1e-6) + sc2 = "StopWhenSubgradientNormLess(1.0e-6)\n $(Manopt.status_summary(c2))" + @test repr(c2) == sc2 + st = SubGradientMethodState(M, p; stopping_criterion=c2) + st.X = ∂f(M, 2p) + @test !c2(mp, st, 1) + st.X = ∂f(M, p) + @test c2(mp, st, 2) + @test length(get_reason(c2)) > 0 + c2(mp, st, 0) # check reset + @test length(get_reason(c2)) == 0 + @test Manopt.indicates_convergence(c2) + update_stopping_criterion!(c2, :MinSubgradNorm, 1e-8) + @test c2.threshold == 1e-8 + end end diff --git a/test/runtests.jl b/test/runtests.jl index 603ff34f9d..f68404b643 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -18,6 +18,7 @@ include("utils/example_tasks.jl") include("plans/test_gradient_plan.jl") include("plans/test_constrained_plan.jl") include("plans/test_hessian_plan.jl") + include("plans/test_parameters.jl") include("plans/test_primal_dual_plan.jl") include("plans/test_proximal_plan.jl") include("plans/test_higher_order_primal_dual_plan.jl") diff --git a/test/solvers/test_ChambollePock.jl b/test/solvers/test_ChambollePock.jl index 81f5ffbea8..11ab2853ed 100644 --- a/test/solvers/test_ChambollePock.jl +++ b/test/solvers/test_ChambollePock.jl @@ -1,5 +1,6 @@ using Manopt, Manifolds, ManifoldsBase, Test using ManoptExamples: forward_logs, adjoint_differential_forward_logs +using ManifoldDiff: prox_distance, prox_distance! 
@testset "Chambolle-Pock" begin # diff --git a/test/solvers/test_gradient_descent.jl b/test/solvers/test_gradient_descent.jl index 3a0cc64750..c74bb0b620 100644 --- a/test/solvers/test_gradient_descent.jl +++ b/test/solvers/test_gradient_descent.jl @@ -158,20 +158,20 @@ using ManifoldDiff: grad_distance ) @test repr(n6) == "$(n6[2])\n\n$(n6[1])" end - @testset "Warning when cost increases" begin + @testset "Tutorial mode" begin M = Sphere(2) q = 1 / sqrt(2) .* [1.0, 1.0, 0.0] f(M, p) = distance(M, p, q) .^ 2 # choose a wrong gradient such that ConstantStepsize yields an increase grad_f(M, p) = -grad_distance(M, q, p) - # issues three warnings + @test_logs (:info,) set_manopt_parameter!(:Mode, "Tutorial") @test_logs (:warn,) (:warn,) (:warn,) gradient_descent( - M, - f, - grad_f, - 1 / sqrt(2) .* [1.0, -1.0, 0.0]; - stepsize=ConstantStepsize(1.0), - debug=[DebugWarnIfCostIncreases(:Once), :Messages], + M, f, grad_f, 1 / sqrt(2) .* [1.0, -1.0, 0.0]; stepsize=ConstantStepsize(1.0) + ) + grad_f2(M, p) = 20 * grad_distance(M, q, p) + @test_logs (:warn,) (:warn,) gradient_descent( + M, f, grad_f2, 1 / sqrt(2) .* [1.0, -1.0, 0.0]; ) + @test_logs (:info,) set_manopt_parameter!(:Mode, "") end end diff --git a/test/solvers/test_primal_dual_semismooth_Newton.jl b/test/solvers/test_primal_dual_semismooth_Newton.jl index d7d08fdbad..cade5b5187 100644 --- a/test/solvers/test_primal_dual_semismooth_Newton.jl +++ b/test/solvers/test_primal_dual_semismooth_Newton.jl @@ -1,5 +1,5 @@ using Manopt, Manifolds, ManifoldsBase, Test -using ManoptExamples: adjoint_differential_forward_logs +using ManoptExamples: adjoint_differential_forward_logs, differential_forward_logs using ManifoldDiff: differential_shortest_geodesic_startpoint @testset "PD-RSSN" begin diff --git a/test/solvers/test_quasi_Newton.jl b/test/solvers/test_quasi_Newton.jl index 97f0e96a22..401894c69f 100644 --- a/test/solvers/test_quasi_Newton.jl +++ b/test/solvers/test_quasi_Newton.jl @@ -326,13 +326,15 @@ using LinearAlgebra: I, eigvecs, tr, Diagonal @test isapprox(M, p_1, BFGS_allocating.grad_tmp, [0.0, 8.0, 0.0, 4.0]) end - @testset "A small complex example" begin + @testset "A small complex example in Tutorial Mode" begin M = Euclidean(2; field=ℂ) A = [2 im; -im 2] fc(::Euclidean, p) = real(p' * A * p) grad_fc(::Euclidean, p) = 2 * A * p p0 = [2.0, 1 + im] + @test_logs (:info,) set_manopt_parameter!(:Mode, "Tutorial") p4 = quasi_Newton(M, fc, grad_fc, p0; stopoing_criterion=StopAfterIteration(3)) + @test_logs (:info,) set_manopt_parameter!(:Mode, "") @test fc(M, p4) ≤ fc(M, p0) end diff --git a/tutorials/ImplementOwnManifold.qmd b/tutorials/ImplementOwnManifold.qmd index 60671d5728..bc8349a4ca 100644 --- a/tutorials/ImplementOwnManifold.qmd +++ b/tutorials/ImplementOwnManifold.qmd @@ -104,17 +104,21 @@ pts = [ r/norm(p) .* p for p in pts] Then, before starting with optimization, we need the distance on the manifold, to define the cost function, as well as the logarithmic map to defined the gradient. -For both, we here use the “lazy” approach of using the [Sphere](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html) as a fallback +For both, we here use the “lazy” approach of using the [Sphere](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html) as a fallback. 
+Finally, we have to provide information about how points and tangent vectors are stored on the
+manifold by implementing their [`representation_size`](https://juliamanifolds.github.io/Manifolds.jl/v0.1/interface.html#ManifoldsBase.representation_size-Tuple{Manifold}) function, which is often required when allocating memory.
 
 ```{julia}
 #| output : false
-import ManifoldsBase: distance, log
+import ManifoldsBase: distance, log, representation_size
 function distance(M::ScaledSphere, p, q)
     return M.radius * distance(Sphere(M.dimension), p ./ M.radius, q ./ M.radius)
 end
 function log(M::ScaledSphere, p, q)
     return M.radius * log(Sphere(M.dimension), p ./ M.radius, q ./ M.radius)
 end
+representation_size(M::ScaledSphere) = (M.dimension+1,)
 ```
 
 ## Define the cost and gradient
@@ -144,7 +148,7 @@ To be precise, we have to implement the [in-place variant](https://juliamanifold
 import ManifoldsBase: retract_project!
 function retract_project!(M::ScaledSphere, q, p, X, t::Number)
     q .= p .+ t .* X
-    q .*= M.radius/norm(q)
+    q .*= M.radius / norm(q)
     return q
 end
 ```
@@ -169,10 +173,10 @@ We will discuss these in the next steps.
 
 ```{julia}
 q1 = gradient_descent(M, f, grad_f, p0;
     retraction_method = ProjectionRetraction(), # state that we use the retraction from above
     stepsize = DecreasingStepsize(M; length=1.0), # A simple step size
-    stopping_criterion = StopAfterIteration(10), # A simple stopping crtierion
-    X = zeros(d+1), # how we define a tangent vector
+    stopping_criterion = StopAfterIteration(10),  # A simple stopping criterion
+    X = zeros(d+1),                               # how we define/represent tangent vectors
 )
 f(M,q1)
 ```
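For context beyond the diff: a small sketch of how the two additions interact, assuming the `ScaledSphere` type defined earlier in the tutorial (its constructor arguments here are an assumption for illustration):

```julia
M = ScaledSphere(2, 10.0)             # hypothetical constructor: dimension 2, radius 10
p = [10.0, 0.0, 0.0]                  # a point: a vector of length dimension + 1
X = zeros(representation_size(M)...)  # representation_size lets generic allocation work
X[2] = 1.0
q = similar(p)
retract_project!(M, q, p, X, 1.0)     # q lies on the radius-10 sphere again
```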
+In theory this is of course also the case on manifolds, but for the optimization there is a caveat. When we define the Riemannian mean without the scaling
+
+```{julia}
+f2(M, p) = sum(1 / 2 * distance.(Ref(M), Ref(p), data) .^ 2)
+grad_f2(M, p) = sum(grad_distance.(Ref(M), data, Ref(p)));
+```
+
+and then look at the norm of the gradient at the starting point
+
+```{julia}
+norm(M, data[1], grad_f2(M, data[1]))
+```
+
+On the sphere, when we follow a geodesic, we “return” to the start point after length $2π$.
+If we “land” just short of the starting point, due to a gradient of length just shy of $2π$,
+the line search would take the gradient direction (and not the negative gradient direction)
+as a start. The line search is still performed, but in this case returns a much too small,
+maybe even nearly zero, step size.
+
+In other words, we have to be careful that the optimisation stays “local”.
+
+This is also warned about in `"Tutorial"` mode. Calling
+
+```{julia}
+#| warning: true
+mX = gradient_descent(M, f2, grad_f2, data[1])
+```
+
+Just by chance it seems we still got nearly the same point as before, but
+if we look at when this run stops, we see that it takes more steps.
+
+```{julia}
+#| warning: true
+gradient_descent(M, f2, grad_f2, data[1], debug=[:Stop]);
+```
+
+This also illustrates one way to deactivate the hints: overwrite the `debug=` keyword, which in `"Tutorial"` mode contains additional warnings. The other option is to globally reset the `:Mode` back to
+
+```{julia}
+#| warning: true
+set_manopt_parameter!(:Mode, "")
+```
+
 ## Example 2: Computing the median of symmetric positive definite matrices.
 
 For the second example let's consider the manifold of [$3 × 3$ symmetric positive definite matrices](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/symmetricpositivedefinite.html) and again 100 random points
+
 ```{julia}
 N = SymmetricPositiveDefinite(3)
 m = 100
@@ -194,11 +250,12 @@ s = cyclic_proximal_point(N, g, proxes_g, data2[1];
 ```
 
 ```{=commonmark}
-!!! note \"Technical Detail\"
+!!! note "Technical Detail"
    The recording is realised by [`RecordActions`](https://manoptjl.org/stable/plans/record/#Manopt.RecordAction) that are (also) executed at every iteration. These can also be individually implemented and added to the `record=` array instead of symbols.
-```{=commonmark}
+```
 
 First, the computed median can be accessed as
+
 ```{julia}
 median = get_solver_result(s)
 ```
diff --git a/tutorials/Project.toml b/tutorials/Project.toml
index e1bb797b96..db4bdfaa9c 100644
--- a/tutorials/Project.toml
+++ b/tutorials/Project.toml
@@ -21,5 +21,5 @@ LRUCache = "1.4"
 ManifoldDiff = "0.3.9"
 Manifolds = "0.8.81, 0.9"
 ManifoldsBase = "0.14.12, 0.15"
-Manopt = "0.4.22"
+Manopt = "0.4.51"
 Plots = "1.38"