Manopt.jl inherited its name from Manopt, a Matlab toolbox for optimization on manifolds. This Julia package was started and is currently maintained by Ronny Bergmann.
add a --help argument to docs/make.jl to document all available command line arguments
add a --exclude-tutorials argument to docs/make.jl. This way, when Quarto is not available on a computer, the docs can still be built, with the tutorials not being added to the menu, so that Documenter does not expect them to exist.
a ManifoldEuclideanGradientObjective to allow the cost, gradient, Hessian, and other first- or second-derivative based elements to be Euclidean and converted when needed.
a keyword objective_type=:Euclidean for all solvers, which specifies that an objective of the above type shall be created
ConstantStepsize and DecreasingStepsize now have an additional field type::Symbol to indicate whether the step size should be constant relative to the gradient norm or absolutely constant.
A :Subsolver keyword in the debug= keyword argument that activates the new DebugWhenActive to de/activate subsolver debug from the main solver's DebugEvery.
Levenberg-Marquardt now also accepts its parameters initial_residual_values and initial_jacobian_f as keyword arguments, so that their default initialisations can be adapted if necessary
A ManifoldCacheObjective as a decorator for objectives to cache results of calls, using LRU caches as a weak dependency. For now this works with cost and gradient evaluations.
A ManifoldCountObjective as a decorator for objectives to enable counting of calls to, for example, the cost and the gradient.
adds a return_objective keyword that switches the return of a solver to a tuple (o, s), where o is the (possibly decorated) objective and s is the “classical” solver return (state or point). This way the counted values can be accessed and the cache can be reused.
change solvers on the mid level (of the form solver(M, objective, p)) to also accept decorated objectives
A new interface of the form alg(M, objective, p0) to allow reusing objectives without creating AbstractManoptSolverStates and calling solve!. This especially still allows for any decoration of the objective and/or the state using, e.g., debug= or record=.
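The counting decorator above follows a simple wrapper pattern. Here is a pure-Julia sketch of the idea; CountedObjective and this get_cost are illustrative stand-ins, not Manopt.jl's actual types:

```julia
# Pure-Julia sketch of the counting-decorator idea behind ManifoldCountObjective;
# CountedObjective and this get_cost are illustrative, not Manopt.jl's types.
struct CountedObjective{F}
    cost::F
    counts::Dict{Symbol,Int}
end
CountedObjective(f) = CountedObjective(f, Dict{Symbol,Int}())

function get_cost(co::CountedObjective, p)
    co.counts[:cost] = get(co.counts, :cost, 0) + 1  # record the call
    return co.cost(p)
end

co = CountedObjective(p -> sum(abs2, p))
get_cost(co, [1.0, 2.0])   # 5.0
get_cost(co, [0.0, 1.0])   # 1.0
co.counts[:cost]           # 2
```

The same wrapping idea underlies caching: the decorator intercepts the call, consults its internal state, and only then delegates to the wrapped objective.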
The developer can most easily be reached in the Julia Slack channel #manifolds. You can apply for the Julia Slack workspace here if you haven't joined yet. You can also ask your question on discourse.julialang.org.
There are still a lot of methods missing within the optimization framework of Manopt.jl, be it functions, gradients, differentials, proximal maps, step size rules, or stopping criteria. If you notice a missing method and can contribute an implementation, please do so! Even providing a single new method is a good contribution.
A main contribution you can provide is another algorithm that is not yet included in the package. An algorithm is always based on a concrete type of an AbstractManoptProblem storing the main information of the task, and a concrete type of an AbstractManoptSolverState storing all information that needs to be known to the solver in general. The actual algorithm is split into an initialization phase, see initialize_solver!, and the implementation of the i-th step of the solver itself, see step_solver!. For these two functions, it would be great if a new algorithm uses functions from the ManifoldsBase.jl interface as generically as possible. For example, if possible use retract!(M,q,p,X) in favor of exp!(M,q,p,X) to perform a step starting in p in direction X (in place of q), since the exponential map might be too expensive to evaluate or might not be available on a certain manifold. See Retractions and inverse retractions for more details. Further, if possible, prefer retract!(M,q,p,X) in favor of retract(M,p,X), since a computation in place of a suitable variable q reduces memory allocations.
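The in-place convention described above can be illustrated with a small self-contained toy; ToyEuclidean and these two retraction methods are purely illustrative stand-ins for the ManifoldsBase.jl interface, not actual Manopt.jl code:

```julia
# Toy illustration of the in-place convention retract!(M, q, p, X):
# the result is written into q to avoid allocations. ToyEuclidean is
# only a stand-in for a ManifoldsBase.jl manifold type.
struct ToyEuclidean end

function retract!(::ToyEuclidean, q, p, X)
    q .= p .+ X  # Euclidean retraction: move straight along X
    return q
end

# the allocating variant falls back to the in-place one
retract(M::ToyEuclidean, p, X) = retract!(M, similar(p), p, X)

p = [1.0, 2.0]; X = [0.5, -0.5]
q = similar(p)
retract!(ToyEuclidean(), q, p, X)  # q == [1.5, 1.5]
```

Defining the allocating variant in terms of the mutating one, as above, is the pattern that keeps allocations to a single `similar` call.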
Usually, the methods implemented in Manopt.jl also have a high-level interface, that is easier to call, creates the necessary problem and options structure and calls the solver.
The two technical functions initialize_solver! and step_solver! should be documented with technical details, while the high level interface should usually provide a general description and some literature references to the algorithm at hand.
We try to follow the documentation guidelines from the Julia documentation as well as Blue Style. We run JuliaFormatter.jl on the repo in the way set in the .JuliaFormatter.toml file, which enforces a number of conventions consistent with the Blue Style.
We also follow a few internal conventions:
It is preferred that the AbstractManoptProblem's struct contains information about the general structure of the problem.
Any implemented function should be accompanied by its mathematical formulae if a closed form exists.
AbstractManoptProblem and option structures are stored within the plan/ folder and sorted by properties of the problem and/or solver at hand.
Within the source code of one algorithm, the high level interface should be first, then the initialization, then the step.
Otherwise an alphabetical order is preferable.
The above implies that the mutating variant of a function follows the non-mutating variant.
There should be no dangling = signs.
Always add a newline between things of different types (struct/method/const).
Always add a newline between methods for different functions (including mutating/nonmutating variants).
Prefer to have no newline between methods for the same function; when reasonable, merge the docstrings.
All import/using/include should be in the main module file.
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
Manopt can be used with line search algorithms implemented in LineSearches.jl. This can be illustrated by the following example of optimizing Rosenbrock function constrained to the unit sphere.
using Manopt, Manifolds, LineSearches
# define objective function and its gradient
p = [1.0, 100.0]
Max Iteration 1000: not reached
|grad f| < 1.0e-6: reached
Overall: reached
This indicates convergence: Yes
Wrap linesearch (for example HagerZhang or MoreThuente). The initial step selection from Linesearches.jl is not yet supported and the value 1.0 is used. The retraction used for determining the line along which the search is performed can be provided as retraction_method. Gradient vectors are transported between points using vector_transport_method.
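For comparison, the same constrained Rosenbrock problem can be sketched in plain Julia with a projection retraction and a fixed step size. This is only an illustrative sketch, not Manopt.jl's solver (which would use a line search); the helper names project, retract, and minimize are made up for this example:

```julia
using LinearAlgebra

# Plain-Julia sketch (not Manopt.jl's solver) of Riemannian gradient descent
# for the Rosenbrock function restricted to the unit circle in ℝ²,
# with a = 1 and b = 100 as in p = [1.0, 100.0] above.
f(x; a=1.0, b=100.0) = (a - x[1])^2 + b * (x[2] - x[1]^2)^2
∇f(x; a=1.0, b=100.0) =
    [-2 * (a - x[1]) - 4b * x[1] * (x[2] - x[1]^2), 2b * (x[2] - x[1]^2)]

project(p, X) = X - dot(p, X) * p  # orthogonal projection onto T_p S¹
retract(p, X) = normalize(p + X)   # projection retraction back onto the sphere

# fixed-stepsize descent; a line search (e.g. via LineSearches.jl) would do better
function minimize(x0; steps=10_000, stepsize=1e-4)
    x = copy(x0)
    for _ in 1:steps
        x = retract(x, -stepsize * project(x, ∇f(x)))
    end
    return x
end

x0 = normalize([0.5, 0.5])
x = minimize(x0)  # stays on the sphere while decreasing f
```

Each step projects the Euclidean gradient onto the tangent space to obtain the Riemannian gradient, then retracts back onto the sphere, mirroring the retraction/vector-transport keywords described above.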
Compute the mid point between p and q. If there is more than one mid point of (not necessarily minimizing) geodesics (e.g. on the sphere), the one nearest to x is returned (in place of y).
Tangent bundle has injectivity radius of either infinity (for flat manifolds) or 0 (for non-flat manifolds). This makes a guess of what a reasonable maximum stepsize on a tangent bundle might be.
Return a reasonable guess of maximum step size on FixedRankMatrices following the choice of typical distance in Matlab Manopt, i.e. dimension of M. See this note
evaluate the adjoint of the differential of a composite Bézier curve on the manifold M with respect to its control points b, based on points T $=(t_i)_{i=1}^n$ that lie pointwise in $t_i∈[0,1]$ on the curve, and given corresponding tangent vectors $X = (η_i)_{i=1}^n$, $η_i∈T_{β(t_i)}\mathcal M$. This can be computed in place of Y.
evaluate the adjoint of the differential of a Bézier curve on the manifold M with respect to its control points b, based on points T $=(t_i)_{i=1}^n$ that lie pointwise in $t_i∈[0,1]$ on the curve, and given corresponding tangent vectors $X = (η_i)_{i=1}^n$, $η_i∈T_{β(t_i)}\mathcal M$. This can be computed in place of Y.
evaluate the adjoint of the differential of a Bézier curve on the manifold M with respect to its control points b, based on a point t $∈[0,1]$ on the curve and a tangent vector $η∈T_{β(t)}\mathcal M$. This can be computed in place of Y.
Y = adjoint_differential_forward_logs(M, p, X)
adjoint_differential_forward_logs!(M, Y, p, X)
Compute the adjoint differential of forward_logs $F$ on the power manifold array p, i.e. the adjoint of the differential of the function
$F_i(p) = \sum_{j ∈ \mathcal I_i} \log_{p_i} p_j$
where $i$ runs over all indices of the PowerManifold manifold M and $\mathcal I_i$ denotes the forward neighbors of $i$. Let $n$ be the number of dimensions of the PowerManifold manifold (i.e. length(size(x))). Then the input tangent vector lies on the manifold $\mathcal M' = \mathcal M^n$. The adjoint differential can be computed in place of Y.
Input
M – a PowerManifold manifold
p – an array of points on a manifold
X – a tangent vector to the n-fold power of p, where n is the ndims of p
Output
Y – resulting tangent vector in $T_p\mathcal M$ representing the adjoint differentials of the logs.
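On a Euclidean power manifold the logarithmic map is simply log_p q = q - p, so forward_logs reduces to forward differences. A pure-Julia sketch for 1D data (illustrative only; forward_logs_1d is not a Manopt.jl function):

```julia
# forward_logs for 1D Euclidean data: F_i = log_{p_i} p_{i+1} = p_{i+1} - p_i;
# the last entry has no forward neighbor, so it stays zero.
function forward_logs_1d(p::AbstractVector)
    Y = zero(p)
    for i in 1:(length(p) - 1)
        Y[i] = p[i + 1] - p[i]
    end
    return Y
end

forward_logs_1d([1.0, 3.0, 6.0])  # [2.0, 3.0, 0.0]
```

The adjoint differential then corresponds, in this flat 1D case, to the (negative) backward-difference operator, i.e. the transpose of the forward-difference matrix.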
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
A type to capture a Bezier segment. With $n$ points, a Bézier segment of degree $n-1$ is stored. On the Euclidean manifold, this yields a polynomial of degree $n-1$.
This type is mainly used to encapsulate the points within a composite Bezier curve, which consist of an AbstractVector of BezierSegments where each of the points might be a nested array on a PowerManifold already.
Note that this can also be used to represent tangent vectors on the control points of a segment.
de_casteljau(M::AbstractManifold, b::BezierSegment) -> Function
return the Bézier curve $β(⋅;b_0,…,b_n): [0,1] → \mathcal M$ defined by the control points $b_0,…,b_n∈\mathcal M$, $n∈\mathbb N$, as a BezierSegment. This function implements de Casteljau's algorithm (Casteljau, 1959; Casteljau, 1963), generalized to manifolds by Popiel, Noakes, J Approx Theo, 2007: Let $γ_{a,b}(t)$ denote the shortest geodesic connecting $a,b∈\mathcal M$. Then the curve is defined by the recursion
de_casteljau(M::AbstractManifold, B::AbstractVector{<:BezierSegment}) -> Function
Given a vector of Bézier segments, i.e. a vector of control points $B=\bigl( (b_{0,0},…,b_{n_0,0}),…,(b_{0,m},… b_{n_m,m}) \bigr)$, where the different segments might be of different degrees $n_0,…,n_m$, the resulting composite Bézier curve $c_B:[0,m] → \mathcal M$ consists of $m$ segments, each of which is a Bézier curve.
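On the Euclidean manifold, where the geodesic is $γ_{a,b}(t) = (1-t)a + tb$, the de Casteljau recursion can be sketched in a few lines of plain Julia (an illustrative stand-in, not Manopt.jl's de_casteljau):

```julia
# De Casteljau's algorithm on Euclidean space: repeatedly evaluate the
# geodesic (here: straight line) between consecutive control points
# until a single point remains.
geodesic(a, b, t) = (1 - t) .* a .+ t .* b

function de_casteljau(pts::Vector, t::Real)
    b = copy(pts)
    while length(b) > 1
        b = [geodesic(b[i], b[i + 1], t) for i in 1:(length(b) - 1)]
    end
    return b[1]
end

# quadratic segment with control points 0, 1, 0 evaluated at t = 0.5
de_casteljau([[0.0], [1.0], [0.0]], 0.5)  # [0.5]
```

On a general manifold, only `geodesic` changes: it becomes the shortest geodesic of the manifold, which is exactly the generalization cited above.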
returns the inner (i.e. all except start and end) points of the segments of the composite Bézier curve specified by the control points B. For a single segment b, its inner points are returned
returns the tangent vectors at the start and end points of the composite Bézier curve, pointing from a junction point to the first and last inner control points for each segment of the composite Bézier curve specified by the control points B, either a vector of segments or of control points.
returns the start and end point(s) of the segments of the composite Bézier curve specified by the control points B. For just one segment b, its start and end points are returned.
returns the control points of the segments of the composite Bézier curve specified by the control points B.
This method reduces the points depending on the optional reduce symbol:
:default – no reduction is performed
:continuous – for a continuous function, the junction points are doubled at $b_{0,i}=b_{n_{i-1},i-1}$, so only $b_{0,i}$ is in the vector.
:differentiable – for a differentiable function additionally $\log_{b_{0,i}}b_{1,i} = -\log_{b_{n_{i-1},i-1}}b_{n_{i-1}-1,i-1}$ holds, hence $b_{n_{i-1}-1,i-1}$ is omitted.
If only one segment is given, all points of b, i.e. b.pts, are returned.
returns the array of BezierSegments B of a composite Bézier curve reconstructed from an array c of points on the manifold M and an array of degrees d.
There are a few (reduced) representations that can get extended; see also get_bezier_points. For ease of the following, let $c=(c_1,…,c_k)$ and $d=(d_1,…,d_m)$, where $m$ denotes the number of components the composite Bézier curve consists of. Then
:default – $k = m + \sum_{i=1}^m d_i$ since each component requires one point more than its degree. The points are then ordered in tuples, i.e.
:continuous – $k = 1 + \sum_{i=1}^{m} d_i$, since for a continuous curve the start and end points of successive components coincide, so only the very first start point and the end points are stored.
:differentiable – in addition to the previous reduction, the second point of every segment except the first was not stored. Hence $k = 2 - m + \sum_{i=1}^{m} d_i$, and at a junction point $b_n$ with its given prior point $c_{n-1}$ (i.e. the last inner point of a segment), the first inner point of the next segment after the junction is computed as $b = \exp_{c_n}(-\log_{c_n} c_{n-1})$, such that the assumed differentiability holds
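The three point counts can be checked with a small sketch (the k_* helper names are made up for this illustration):

```julia
# Number of stored points k for segment degrees d = (d_1, …, d_m) in the
# three representations above.
k_default(d)        = length(d) + sum(d)      # k = m + Σ dᵢ
k_continuous(d)     = 1 + sum(d)              # k = 1 + Σ dᵢ
k_differentiable(d) = 2 - length(d) + sum(d)  # k = 2 - m + Σ dᵢ

d = [3, 3, 2]        # three segments of degrees 3, 3 and 2
k_default(d)         # 11
k_continuous(d)      # 9
k_differentiable(d)  # 7
```

Each reduction saves one point per junction (:continuous) and an additional point per junction (:differentiable), which matches the differences between the three formulas.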
Compute the intrinsic infimal convolution model, where the addition is replaced by a mid point approach and the two functions involved are costTV2 and costTV. The model reads
compute the $ℓ^2$-TV-TV2 functional on the PowerManifold manifold M for given (fixed) data f (on M), nonnegative weight α, β, and evaluated at x (on M), i.e.
Compute the $\operatorname{TV}^p$ functional for data x on the PowerManifold manifold M, i.e. $\mathcal M = \mathcal N^n$, where $n ∈ \mathbb N^k$ denotes the dimensions of the data x. Let $\mathcal I_i$ denote the forward neighbors, i.e. with $\mathcal G$ as all indices from $\mathbf{1} ∈ \mathbb N^k$ to $n$ we have $\mathcal I_i = \{i+e_j, j=1,…,k\}\cap \mathcal G$. The formula reads
compute the $\operatorname{TV}_2^p$ functional for data x on the PowerManifold manifold M, i.e. $\mathcal M = \mathcal N^n$, where $n ∈ \mathbb N^k$ denotes the dimensions of the data x. Let $\mathcal I_i^{\pm}$ denote the forward and backward neighbors, respectively, i.e. with $\mathcal G$ as all indices from $\mathbf{1} ∈ \mathbb N^k$ to $n$ we have $\mathcal I^\pm_i = \{i\pm e_j, j=1,…,k\}\cap \mathcal I$. The formula then reads
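For real-valued 1D data, where $d(x_i, x_j) = |x_j - x_i|$, the TV sum above reduces to a simple loop; a simplified pure-Julia sketch, not Manopt.jl's costTV:

```julia
# Simplified 1D version of the TV functional: Σᵢ |x_{i+1} - x_i|^p,
# since on ℝ the distance is d(x_i, x_j) = |x_j - x_i| and each index
# has at most one forward neighbor.
costTV_1d(x::AbstractVector, p::Real=1) =
    sum(abs(x[i + 1] - x[i])^p for i in 1:(length(x) - 1))

costTV_1d([0.0, 1.0, 1.0, 3.0])     # 3.0
costTV_1d([0.0, 1.0, 1.0, 3.0], 2)  # 5.0
```

On a general PowerManifold the absolute difference is replaced by the geodesic distance and the sum runs over all forward neighbors $\mathcal I_i$ in every array dimension.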
where for this formula the points along the curve are equispaced and denoted by $t_i$, and $d_2$ refers to the second order absolute difference costTV2 (squared); the junction points are denoted by $p_i$, and to each $p_i$ corresponds one data item in the manifold points given in d. For details on the acceleration approximation, see cost_acceleration_bezier. Note that the Bézier curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments; assuming a differentiable curve, the segments can internally be reconstructed.
where for this formula the points along the curve are equispaced and denoted by $t_i$, $i=1,…,N$, and $d_2$ refers to the second order absolute difference costTV2 (squared). Note that the Bézier curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments; assuming a differentiable curve, the segments can internally be reconstructed.
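On Euclidean data, the second order absolute difference $d_2$ at an inner point is $\lVert p_{i-1} - 2p_i + p_{i+1}\rVert$, so the discrete (squared) acceleration can be sketched as follows; acceleration_cost is an illustrative name, and any step-size scaling is omitted:

```julia
# Discrete squared-acceleration sketch on Euclidean data: on ℝⁿ the second
# order difference d_2 at an inner point is ‖p_{i-1} - 2 p_i + p_{i+1}‖
# (equispaced parameters assumed; step-size scaling omitted).
acceleration_cost(pts) = sum(
    sum(abs2, pts[i - 1] .- 2 .* pts[i] .+ pts[i + 1]) for i in 2:(length(pts) - 1)
)

acceleration_cost([[0.0], [1.0], [2.0], [3.0]])  # 0.0: collinear, equispaced points
acceleration_cost([[0.0], [1.0], [0.0]])         # 4.0: a bent path
```

On a manifold, the mid-point of $p_{i-1}$ and $p_{i+1}$ replaces the arithmetic mean, which is exactly what costTV2 generalizes.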
evaluate the differential of the composite Bézier curve with respect to its control points B and tangent vectors Ξ in the tangent spaces of the control points. The result is the “change” of the curve at the points in T, which are elementwise in $[0,N]$, each depending on the corresponding segment(s). Here, $N$ is the length of B. For the mutating variant the result is computed in Θ.
evaluate the differential of the composite Bézier curve with respect to its control points B and tangent vectors Ξ in the tangent spaces of the control points. The result is the “change” of the curve at t $∈[0,N]$, which depends only on the corresponding segment. Here, $N$ is the length of B. The computation can be done in place of Y.
evaluate the differential of the Bézier curve with respect to its control points b and tangent vectors X in the tangent spaces of the control points. The result is the “change” of the curve at the points T, elementwise in $t∈[0,1]$. The computation can be done in place of Y.
evaluate the differential of the Bézier curve with respect to its control points b and tangent vectors X given in the tangent spaces of the control points. The result is the “change” of the curve at t $∈[0,1]$. The computation can be done in place of Y.
Y = differential_forward_logs(M, p, X)
differential_forward_logs!(M, Y, p, X)
compute the differential of forward_logs $F$ on the PowerManifold manifold M at p in direction X, i.e., in the power manifold array, the differential of the function
$F_i(p) = \sum_{j ∈ \mathcal I_i} \log_{p_i} p_j, \quad i ∈ \mathcal G,$
where $\mathcal G$ is the set of indices of the PowerManifold manifold M and $\mathcal I_i$ denotes the forward neighbors of $i$.
Input
M – a PowerManifold manifold
p – a point.
X – a tangent vector.
Output
Y – resulting tangent vector in $T_x\mathcal N$ representing the differentials of the logs, where $\mathcal N$ is the power manifold with the number of dimensions added to size(x). The computation can also be done in place.
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
where $D_xf[ξ]$ denotes the differential of $f$ at $x$ with respect to the tangent direction (vector) $ξ$ or in other words the directional derivative.
where $\mathcal G$ is the set of indices of the PowerManifold manifold M and $\mathcal I_i$ denotes the forward neighbors of $i$. This can also be done in place of ξ.
Input
M – a PowerManifold manifold
x – a point.
Output
Y – resulting tangent vector in $T_x\mathcal N$ representing the logs, where $\mathcal N$ is the power manifold with the number of dimensions added to size(x). The computation can be done in place of Y.
grad_L2_acceleration_bezier(
M::AbstractManifold,
B::AbstractVector{P},
degrees::AbstractVector{<:Integer},
T::AbstractVector,
λ,
d::AbstractVector{P}
) where {P}
compute the gradient of the discretized acceleration of a composite Bézier curve on the Manifold M with respect to its control points B together with a data term that relates the junction points p_i to the data d with a weight $λ$ compared to the acceleration. The curve is evaluated at the points given in pts (elementwise in $[0,N]$), where $N$ is the number of segments of the Bézier curve. The summands are grad_distance for the data term and grad_acceleration_bezier for the acceleration with interpolation constraints. Here the get_bezier_junctions are included in the optimization, i.e. setting $λ=0$ yields the unconstrained acceleration minimization. Note that this is ill-posed, since any Bézier curve identical to a geodesic is a minimizer.
Note that the Bézier curve is given in reduced form as a point on a PowerManifold together with the degrees of the segments; assuming a differentiable curve, the segments can be reconstructed internally.
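The Euclidean counterpart of this objective is easy to state explicitly. A minimal sketch, assuming $M = ℝ$ and a curve sampled at equidistant points (the helper names are illustrative):

```julia
# discretized squared acceleration of a sampled curve c:
#   A(c) = Σ_i ‖c_{i-1} - 2c_i + c_{i+1}‖²   (second order differences)
acceleration(c) = sum(abs2, c[1:end-2] .- 2 .* c[2:end-1] .+ c[3:end])
# adding the data term with weight λ gives the L2-acceleration model
cost(c, d, λ) = acceleration(c) + λ / 2 * sum(abs2, c .- d)

# a straight line (the Euclidean "geodesic") has zero acceleration
cost([0.0, 1.0, 2.0, 3.0], [0.0, 1.0, 2.0, 3.0], 1.0)   # == 0.0
```

Setting $λ=0$ leaves only the acceleration term, mirroring the ill-posedness remark above: any straight line minimizes it.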
Y = grad_TV2(M, q[, p=1])
grad_TV2!(M, Y, q[, p=1])
computes the (sub) gradient of $\frac{1}{p}d_2^p(q_1, q_2, q_3)$ with respect to all three components of $q∈\mathcal M^3$, where $d_2$ denotes the second order absolute difference using the mid point model, i.e. let
\[\mathcal C = \bigl\{ c ∈ \mathcal M \ |\ c = g(\tfrac{1}{2};q_1,q_3) \text{ for some geodesic }g\bigr\}\]
denote the mid points between $q_1$ and $q_3$ on the manifold $\mathcal M$. Then the absolute second order difference is defined as
\[d_2(q_1,q_2,q_3) = \min_{c ∈ \mathcal C} d_{\mathcal M}(c, q_2).\]
computes the (sub) gradient of $\frac{1}{p}d_2^p(q_1,q_2,q_3)$ with respect to all $q_1,q_2,q_3$ occurring along any array dimension in the point q, where M is the corresponding PowerManifold.
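On $M = ℝ$ the mid point of $q_1,q_3$ is simply the average, so $d_2$ and its (sub)gradient for $p=1$ can be sketched directly (illustrative helpers, not the package functions):

```julia
# second order absolute difference with the mid point model on ℝ
d2(q1, q2, q3) = abs((q1 + q3) / 2 - q2)
# (sub)gradient with respect to (q1, q2, q3), away from the kink
function grad_d2(q1, q2, q3)
    s = sign((q1 + q3) / 2 - q2)
    return (s / 2, -s, s / 2)
end

grad_d2(0.0, 0.0, 2.0)   # (0.5, -1.0, 0.5)
```

At points where $d_2$ vanishes the function is not differentiable and a subgradient (here the zero element) has to be chosen, matching the convention used for grad_distance above.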
compute the gradient of the discretized acceleration of a (composite) Bézier curve $c_B(t)$ on the Manifold M with respect to its control points B given as a point on the PowerManifold assuming C1 conditions and known degrees. The curve is evaluated at the points given in T (elementwise in $[0,N]$, where $N$ is the number of segments of the Bézier curve). The get_bezier_junctions are fixed for this gradient (interpolation constraint). For the unconstrained gradient, see grad_L2_acceleration_bezier and set $λ=0$ therein. This gradient is computed using adjoint_Jacobi_fields. For details, see Bergmann, Gousenbourger, Front. Appl. Math. Stat., 2018. See de_casteljau for more details on the curve.
for $p\neq 1$ or $x\neq y$. Note that for the remaining case $p=1$, $x=y$ the function is not differentiable. In this case, the function returns the corresponding zero tangent vector, since this is an element of the subdifferential.
Optional
p – (2) the exponent of the distance, i.e. the default is the squared distance
grad_u, grad_v = grad_intrinsic_infimal_convolution_TV12(M, f, u, v, α, β)
compute (sub)gradient of the intrinsic infimal convolution model using the mid point model of second order differences, see costTV2, i.e. for some $f ∈ \mathcal M$ on a PowerManifold manifold $\mathcal M$ this function computes the (sub)gradient of
\[E(u,v) = \dots\]
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
There are several functions required within optimization, most prominently costFunctions and gradients. This package includes several cost functions and corresponding gradients, but also corresponding proximal maps for variational methods for manifold-valued data. Most of these functions require the evaluation of differentials or their adjoints.
This small section extends the functions available from ManifoldsBase.jl and Manifolds.jl, especially a few random generators, that are simpler than the available functions.
where $\operatorname{retr}$ and $\operatorname{retr}^{-1}$ denote a retraction and an inverse retraction, respectively. This can also be done in place of q.
Keyword arguments
retraction_method - (default_retraction_method(M, typeof(p))) the retraction to use in the reflection
inverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) the inverse retraction to use within the reflection
and for the reflect! additionally
X – (zero_vector(M,p)) a temporary memory to compute the inverse retraction in place. Otherwise this is the memory that would be allocated anyway.
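In the Euclidean case, where $\operatorname{retr}_p(X) = p + X$ and $\operatorname{retr}_p^{-1}(x) = x - p$, the reflection reduces to the point reflection $2p - x$; a minimal sketch (illustrative helper):

```julia
# reflect x at p: retr_p(-retr⁻¹_p(x)) = p - (x - p) = 2p - x
reflect_euclid(p, x) = 2 .* p .- x

reflect_euclid([1.0, 0.0], [3.0, 1.0])   # [-1.0, -1.0]
```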
where $d_{\mathcal M}: \mathcal M \times \mathcal M → ℝ$ denotes the geodesic distance on $\mathcal M$. While it might still be difficult to compute the minimizer, there are several proximal maps known (locally) in closed form. Furthermore if $x^{\star} ∈ \mathcal M$ is a minimizer of $\varphi$, then
where $\mathcal G$ is the set of indices for $x∈\mathcal M$ and $\mathcal I_i$ is the set of its forward neighbors. The computation can also be done in place of Θ.
compute the proximal maps $\operatorname{prox}_{λ\varphi}$ of all forward differences occurring in the power manifold array, i.e. $\varphi(x_i,x_j) = d_{\mathcal M}^p(x_i,x_j)$, where $x_i$ and $x_j$ are array elements of x and $j = i+e_k$, where $e_k$ is the $k$th unit vector. The parameter λ is the prox parameter.
Input
M – a manifold M
λ – a real value, parameter of the proximal map
x – a point.
Optional
(default is given in brackets)
p – (1) exponent of the distance of the TV term
Output
y – resulting point containing all mentioned proximal points evaluated (in a cyclic order). The computation can also be done in place.
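For $M = ℝ$ and $p=1$ the proximal map of a single forward difference has a well known closed form, which moves both points towards each other by at most $λ$; a sketch with illustrative names:

```julia
# prox of φ(x1, x2) = |x1 - x2| with parameter λ on ℝ:
# shift both points towards each other by t = min(λ, |x1 - x2| / 2)
function prox_TV_pair(λ, x1, x2)
    t = min(λ, abs(x1 - x2) / 2)
    s = sign(x2 - x1)
    return (x1 + s * t, x2 - s * t)
end

prox_TV_pair(0.25, 0.0, 1.0)   # (0.25, 0.75)
prox_TV_pair(2.0, 0.0, 1.0)    # (0.5, 0.5) – the points merge at the mid point
```

On a general manifold the same construction moves both points along the connecting geodesic, which is why the closed form carries over.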
(y1,y2,y3) = prox_TV2(M, λ, (x1,x2,x3)[, p=1], kwargs...)
prox_TV2!(M, y, λ, (x1,x2,x3)[, p=1], kwargs...)
Compute the proximal map $\operatorname{prox}_{λ\varphi}$ of $\varphi(x_1,x_2,x_3) = d_{\mathcal M}^p(c(x_1,x_3),x_2)$ with parameter λ>0, where $c(x,z)$ denotes the mid point of a shortest geodesic from x1 to x3 that is closest to x2. The result can be computed in place of y.
Input
M – a manifold
λ – a real value, parameter of the proximal map
(x1,x2,x3) – a tuple of three points
p – (1) exponent of the distance of the TV term
Optional
kwargs... – parameters for the internal subgradient_method (if M is neither Euclidean nor Circle, since for these a closed form is given)
Output
(y1,y2,y3) – resulting tuple of points of the proximal map. The computation can also be done in place.
y = prox_TV2(M, λ, x[, p=1])
prox_TV2!(M, y, λ, x[, p=1])
compute the proximal maps $\operatorname{prox}_{λ\varphi}$ of all centered second order differences occurring in the power manifold array, i.e. $\varphi(x_k,x_i,x_j) = d_2(x_k,x_i,x_j)$, where $k,j$ are backward and forward neighbors (along any dimension in the array of x). The parameter λ is the prox parameter.
Input
M – a manifold M
λ – a real value, parameter of the proximal map
x – a point.
Optional
(default is given in brackets)
p – (1) exponent of the distance of the TV term
Output
y – resulting point with all mentioned proximal points evaluated (in a cyclic order). The computation can also be done in place.
y = prox_distance(M,λ,f,x [, p=2])
prox_distance!(M, y, λ, f, x [, p=2])
compute the proximal map $\operatorname{prox}_{λ\varphi}$ with parameter λ of $φ(x) = \frac{1}{p}d_{\mathcal M}^p(f,x)$. For the mutating variant the computation is done in place of y.
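For $M = ℝ$ both default exponents admit closed forms; a sketch (illustrative names — on a general manifold the $p=2$ case evaluates the geodesic from x to f at $t = λ/(1+λ)$):

```julia
# prox of φ(x) = (1/2)(x - f)² with parameter λ: a convex combination of x and f
prox_distance2(λ, f, x) = (x + λ * f) / (1 + λ)
# prox of φ(x) = |x - f|: move x towards f by at most λ (soft shrinkage)
prox_distance1(λ, f, x) = x - sign(x - f) * min(λ, abs(x - f))

prox_distance2(1.0, 0.0, 2.0)   # 1.0
prox_distance1(0.5, 0.0, 2.0)   # 1.5
```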
y = prox_parallel_TV(M, λ, x [,p=1])
prox_parallel_TV!(M, y, λ, x [,p=1])
compute the proximal maps $\operatorname{prox}_{λφ}$ of all forward differences occurring in the power manifold array, i.e. $φ(x_i,x_j) = d_{\mathcal M}^p(x_i,x_j)$, where $x_i$ and $x_j$ are array elements of x and $j = i+e_k$, where $e_k$ is the $k$th unit vector. The parameter λ is the prox parameter.
Input
M – a PowerManifold manifold
λ – a real value, parameter of the proximal map
x – a point
Optional
(default is given in brackets)
p – (1) exponent of the distance of the TV term
Output
y – resulting Array of points with all mentioned proximal points evaluated (in parallel within the array's elements). The computation can also be done in place.
J. Duran, M. Moeller, C. Sbert and D. Cremers. Collaborative Total Variation: A General Framework for Vectorial TV Models. SIAM Journal on Imaging Sciences 9, 116-151 (2016), arxiv: [1508.01308](https://arxiv.org/abs/1508.01308).
or in other words, that the error between the function $f$ and its second order Taylor behaves in error $\mathcal O(t^3)$, which indicates that the Hessian is correct, cf. also Section 6.8, Boumal, Cambridge Press, 2023.
Note that if the errors are below the given tolerance and the method is exact, no plot will be generated.
mode - (:Default) specify the mode; by default we assume to have a second order retraction given by retraction_method=. You can also use this mode if you already have a critical point p. Set to :CriticalPoint to use gradient_descent to find a critical point. Note: this requires (and evaluates) new tangent vectors X and Y
atol, rtol – (same defaults as isapprox) tolerances that are passed down to all checks
a, b – two real values to check linearity of the Hessian (if check_linearity=true)
exactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact
io – (nothing) provide an IO to print the check result to
gradient - (grad_f(M, p)) instead of the gradient function you can also provide the gradient at p directly
Hessian - (Hess_f(M, p, X)) instead of the Hessian function you can provide the result of $\operatorname{Hess} f(p)[X]$ directly. Note that evaluations of the Hessian might still be necessary for checking linearity and symmetry and/or when using :CriticalPoint mode.
limits - ((1e-8,1)) specify the limits in the log_range
log_range - (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the Hessian line
N - (101) number of points to check within the log_range default range $[10^{-8},10^{0}]$
plot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. This is returned and can then also be saved.
retraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check
slope_tol – (0.1) tolerance for the slope (global) of the approximation
throw_error - (false) throw an error message if the Hessian is wrong
window – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.
The kwargs... are also passed down to the check_vector call, such that tolerances can easily be set.
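The slope test behind this check is easy to reproduce in the Euclidean case; a sketch, assuming $f(x) = x^4$ on ℝ with gradient $4x^3$ and Hessian $12x^2$:

```julia
f(x)   = x^4
df(x)  = 4x^3
d2f(x) = 12x^2
# error of the second order Taylor model at p in direction X = 1
err(p, t) = abs(f(p + t) - f(p) - t * df(p) - t^2 / 2 * d2f(p))
# the log-log slope of t ↦ err(p, t) is ≈ 3 when the Hessian is correct
slope = log(err(1.0, 1e-3) / err(1.0, 1e-4)) / log(1e-3 / 1e-4)
```

With a wrong Hessian the quadratic term does not cancel and the observed slope drops to 2, which is exactly what the check detects.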
or in other words, that the error between the function $f$ and its first order Taylor behaves in error $\mathcal O(t^2)$, which indicates that the gradient is correct, cf. also Section 4.8, Boumal, Cambridge Press, 2023.
Note that if the errors are below the given tolerance and the method is exact, no plot will be generated.
Check data X,Y for the largest contiguous interval (window) with a regression line fitting “best”. Among all intervals with a slope within slope_tol of slope, the longest one is taken. If no such interval exists, the one with the slope closest to slope is taken.
If the window is set to nothing (default), all window sizes 2,...,length(X) are checked. You can also specify a window size or an array of window sizes.
For each window size, all its translates in the data are checked. For all these (shifted) windows the regression line is computed (i.e. a, b in a + t*b) and the best fitting line is kept.
From the best line the following data is returned
a, b specifying the regression line a + t*b
i, j determining the window, i.e. the regression line stems from data X[i], ..., X[j]
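The regression line for a single window can be sketched with the standard least squares formulas (illustrative helper, not the package function):

```julia
# least squares fit of Y ≈ a + b*X over one window of data
function fit_line(X, Y)
    n = length(X)
    b = (n * sum(X .* Y) - sum(X) * sum(Y)) / (n * sum(abs2, X) - sum(X)^2)
    a = (sum(Y) - b * sum(X)) / n
    return a, b
end

fit_line([0.0, 1.0, 2.0], [1.0, 3.0, 5.0])   # (1.0, 2.0)
```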
For some manifolds there are artificial or real application data available that can be loaded using the following data functions. Note that these additionally require Manifolds.jl to be loaded.
generate a real-valued signal having piecewise constant, linear and quadratic intervals with jumps in between. If the manifold the data lives on is the Circle, the data is also wrapped to $[-\pi,\pi)$. This is data for an example from Bergmann et al., SIAM J Imag Sci, 2014.
Optional
pts – (500) number of points to sample the function
evaluate the example signal $f(x), x ∈ [0,1]$, of phase-valued data introduced in Sec. 5.1 of Bergmann et al., SIAM J Imag Sci, 2014. For values outside that interval, this signal is missing.
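The wrapping of phase-valued data to $[-\pi,\pi)$ mentioned above can be sketched as (illustrative helper):

```julia
# wrap a real value (or, elementwise, a signal) to the representative
# range [-π, π) of the Circle
wrap(x) = mod(x + π, 2π) - π

wrap(3π / 2)   # ≈ -π/2
```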
where each segment is a cubic Bézier curve, i.e. each point, except $p_3$, has a first point within the following segment $b_i^+$, $i=0,1,2$, and a last point within the previous segment, except for $p_0$, which are denoted by $b_i^-$, $i=1,2,3$. This curve is differentiable by the conditions $b_i^- = \gamma_{b_i^+,p_i}(2)$, $i=1,2$, where $\gamma_{a,b}$ is the shortest_geodesic connecting $a$ and $b$. The remaining points are defined as
Generate a point from the signal on the Sphere $\mathbb S^2$ by creating the Lemniscate of Bernoulli in the tangent space of p sampled at t and use exp to obtain a point on the [Sphere](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html).
Input
p – the tangent space the Lemniscate is created in
t – value to sample the Lemniscate at
Optional Values
a – (π/2) defines a half axis of the Lemniscate to cover a half sphere.
Generate a signal on the Sphere $\mathbb S^2$ by creating the Lemniscate of Bernoulli in the tangent space of p sampled at pts points and use exp to get a signal on the Sphere.
Input
p – the tangent space the Lemniscate is created in
pts – (128) number of points to sample the Lemniscate
a – (π/2) defines a half axis of the Lemniscate to cover a half sphere.
interval – ([0,2*π]) range to sample the lemniscate at, the default value refers to one closed curve
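The planar Lemniscate of Bernoulli used in the tangent space can be sketched as follows (illustrative; the actual function additionally maps the planar points onto the sphere via exp):

```julia
# Lemniscate of Bernoulli with half axis a, sampled at parameter t;
# returns coordinates in the (2D) tangent plane
lemniscate(t; a = π / 2) = (
    a * cos(t) / (sin(t)^2 + 1),
    a * sin(t) * cos(t) / (sin(t)^2 + 1),
)

lemniscate(0.0)   # (π/2, 0.0), the tip of the right loop
```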
M. Bačák, R. Bergmann, G. Steidl and A. Weinmann. A second order non-smooth variational model for restoring manifold-valued images. SIAM Journal on Scientific Computing 38, A567–A597 (2016), arxiv: [1506.02409](https://arxiv.org/abs/1506.02409).
Exports aim to provide a consistent generation of images of your results. For example if you record the trace your algorithm walks on the Sphere, you can easily export this trace to a rendered image using asymptote_export_S2_signals and render the result with Asymptote. Besides these, you can always record values during your iterations, and export these, for example to csv.
Export given points, curves, and tangent_vectors on the sphere $\mathbb S^2$ to Asymptote.
Input
filename – a file to store the Asymptote code in.
Optional Arguments (Data)
colors - dictionary of color arrays (indexed by symbols :points, :curves and :tvector) where each entry has to provide at least as many colors as the length of the corresponding sets.
curves – an Array of Arrays of points on the sphere, where each inner array is interpreted as a curve and is accompanied by an entry within colors
points – an Array of Arrays of points on the sphere where each inner array is interpreted as a set of points and is accompanied by an entry within colors
tangent_vectors – an Array of Arrays of tuples, where the first is a point, the second a tangent vector, and each set of vectors is accompanied by an entry from within colors
Optional Arguments (Asymptote)
arrow_head_size - (6.0) size of the arrowheads of the tangent vectors
arrow_head_sizes – overrides the previous value to specify a value per tVector set.
camera_position - ((1., 1., 0.)) position of the camera in the Asymptote scene
line_width – (1.0) size of the lines used to draw the curves.
line_widths – overrides the previous value to specify a value per curve and tVector set.
dot_size – (1.0) size of the dots used to draw the points.
dot_sizes – overrides the previous value to specify a value per point set.
size - (nothing) a tuple for the image size, otherwise a relative size 4cm is used.
sphere_color – (RGBA{Float64}(0.85, 0.85, 0.85, 0.6)) color of the sphere the data is drawn on
sphere_line_color – (RGBA{Float64}(0.75, 0.75, 0.75, 0.6)) color of the lines on the sphere
sphere_line_width – (0.5) line width of the lines on the sphere
target – ((0.,0.,0.)) position the camera points at
Export given data as a point on a Power(SymmetricPositiveDefinite(3)) manifold, i.e. one-, two- or three-dimensional data with points on the manifold of symmetric positive definite matrices.
Input
filename – a file to store the Asymptote code in.
Optional Arguments (Data)
data – a point representing the 1-,2-, or 3-D array of SPD matrices
color_scheme - A ColorScheme for Geometric Anisotropy Index
scale_axes - ((1/3,1/3,1/3)) move symmetric positive definite matrices closer to each other by a factor per direction compared to the distance estimated by the maximal eigenvalue of all involved SPD points
Optional Arguments (Asymptote)
camera_position - position of the camera in the scene (default: centered above the xy-plane).
target - position the camera points at (default: center of xy-plane within data).
Both values camera_position and target are scaled by scale_axes*EW, where EW is the maximal eigenvalue in the data.
For a function $f:\mathcal M → ℝ$ defined on a Riemannian manifold $\mathcal M$ we aim to solve
\[\operatorname*{argmin}_{p ∈ \mathcal M} f(p),\]
or in other words: find the point $p$ on the manifold, where $f$ reaches its minimal function value.
Manopt.jl provides a framework for optimization on manifolds as well as a library of optimization algorithms in Julia. It belongs to the “Manopt family”, which includes Manopt (Matlab) and pymanopt (Python).
If you want to delve right into Manopt.jl check out the Get started: Optimize! tutorial.
Manopt.jl makes it easy to use an algorithm for your favourite manifold as well as a manifold for your favourite algorithm. It already provides many manifolds and algorithms, which can easily be enhanced, for example to record certain data or debug output throughout iterations.
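As a minimal sketch of such a solver call (assuming Manifolds.jl is loaded for the Sphere; the cost and data here are illustrative), minimizing the squared Riemannian distance to a given point:

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [1.0, 0.0, 0.0]               # the point we want to recover
f(M, p) = distance(M, p, q)^2      # cost: squared Riemannian distance to q
grad_f(M, p) = -2 * log(M, p, q)   # its Riemannian gradient

p0 = [0.0, 0.0, 1.0]               # starting point
p_star = gradient_descent(M, f, grad_f, p0)
# p_star should be close to q
```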
If you use Manopt.jl in your work, please cite the following:
@article{Bergmann2022,
Author = {Ronny Bergmann},
Doi = {10.21105/joss.03866},
Journal = {Journal of Open Source Software},
}
A DebugAction is a small functor to print/issue debug output. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s, where i is the current iterate.
By convention i=0 is interpreted as "For Initialization only", i.e. only debug info that prints initialization reacts, i<0 triggers updates of variables internally but does not trigger any output. Finally typemin(Int) is used to indicate a call from stop_solver! that returns true afterwards.
Fields (assumed by subtypes to exist)
print method to perform the actual print. Can for example be set to a file export,
or to @info. The default is the print function on the default Base.stdout.
debug for the amount of change of the iterate (stored in get_iterate(o) of the AbstractManoptSolverState) during the last iteration. See DebugEntryChange for the general case
Keyword Parameters
storage – (StoreStateAction( [:Gradient] )) – (eventually shared) the storage of the previous action
prefix – ("Last Change:") prefix of the debug output (ignored if you set format)
io – (stdout) default stream to print the debug to.
format - ( "$prefix %f") format to print the output using an sprintf format.
inverse_retraction_method - (default_inverse_retraction_method(M)) the inverse retraction to be used for approximating distance.
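Putting these keywords together, a debug setup for a solver call might look like the following sketch (M, f, grad_f, p0 are placeholders for any gradient-based problem):

```julia
using Manopt

# assuming M, f, grad_f, p0 are defined as for any gradient-based solver
gradient_descent(M, f, grad_f, p0;
    debug = [
        :Iteration,                              # print the iteration number
        DebugChange(; prefix = "Last Change:"),  # change of the iterate
        :Cost,                                   # current cost value
        "\n",                                    # newline after every entry group
        10,                                      # only print every 10th iteration
        :Stop,                                   # report the stopping criterion
    ],
)
```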
evaluate and print debug only every $i$th iteration. Otherwise no print is performed. Whether internal variables are updated is determined by always_update.
This method does not perform any print itself but relies on its children's prints.
debug for the amount of change of the gradient (stored in get_gradient(o) of the AbstractManoptSolverState o) during the last iteration. See DebugEntryChange for the general case
Keyword Parameters
storage – (StoreStateAction( (:Gradient,) )) – (eventually shared) the storage of the previous action
prefix – ("Last Change:") prefix of the debug output (ignored if you set format)
io – (stdout) default stream to print the debug to.
format - ( "$prefix %f") format to print the output using an sprintf format.
group a set of DebugActions into one action, where the internal prints are removed by default and the resulting strings are concatenated
Constructor
DebugGroup(g)
construct a group consisting of an Array of DebugActions g, that are evaluated en bloque; the method does not perform any print itself, but relies on the internal prints. It still concatenates the result and returns the complete string
An AbstractManoptSolverState or one of its substeps like a Stepsize might generate warnings throughout their computations. This debug can be used to :print them, display them as :info or :warnings, or even raise them as an :error, depending on the message type.
Constructor
DebugMessages(mode=:Info; io::IO=stdout)
Initialize the messages debug to a certain mode. Available modes are
:Error – issue the messages as an error and hence stop at any issue occurring
The debug options append to any options a debug functionality, i.e. they act as a decorator pattern. Internally a Dictionary is kept that stores a DebugAction for several occasions using a Symbol as reference. The default occasion is :All and for example solvers join this field with :Start, :Step and :Stop at the beginning, every iteration or the end of the algorithm, respectively
The original options can still be accessed using the get_state function.
Fields (defaults in brackets)
options – the options that are extended by debug information
debugDictionary – a Dict{Symbol,DebugAction} to keep track of Debug for different actions
Constructors
DebugSolverState(o,dA)
construct debug decorated options, where dD can be
a DebugAction, then it is stored within the dictionary at :All
an Array of DebugActions, then it is stored as a debugDictionary within :All.
a Dict{Symbol,DebugAction}.
an Array of Symbols, String and an Int for the DebugFactory
Measure time and print the intervals. Using start=true you can start the timer on construction, for example to measure the overall runtime of an algorithm.
The measured time is rounded using the given time_accuracy and printed after canonicalization.
Keyword Parameters
prefix – ("Last Change:") prefix of the debug output (ignored if you set format)
io – (stdout) default stream to print the debug to.
format - ( "$prefix %s") format to print the output using an sprintf format, where %s is the canonicalized time.
mode – (:cumulative) whether to display the total time or reset on every call using :iterative.
start – (false) indicate whether to start the timer on creation or not. Otherwise it might only be started on the first call.
time_accuracy – (Millisecond(1)) round the time to this period before printing the canonicalized time
Note that this provides an additional warning for gradient descent with its default constant step size.
Constructor
DebugWarnIfCostIncreases(warn=:Once; tol=1e-13)
Initialize the warning to warning level (:Once) and introduce a tolerance for the test of 1e-13.
The warn level can be set to :Once to only warn the first time the cost increases, to :Always to report an increase every time it happens, and it can be set to :No to deactivate the warning, then this DebugAction is inactive. All other symbols are handled as if they were :Always:
This can be set to :Once to only warn the first time the cost is NaN. It can also be set to :No to deactivate the warning, but this makes this Action also useless. All other symbols are handled as if they were :Always:
This can be set to :Once to only warn the first time the field is NaN. It can also be set to :No to deactivate the warning, but this makes this Action also useless. All other symbols are handled as if they were :Always:
Example
DebugWarnIfFieldNotFinite(:Gradient)
Creates a DebugAction to track whether the gradient does not become NaN or Inf.
evaluate and print debug only if the active boolean is set. This can be set from outside and is for example triggered by DebugEvery on debugs on the subsolver.
This method does not perform any print itself but relies on its children's prints.
For now, the main interaction is with DebugEvery which might activate or deactivate this debug
Fields
always_update – whether or not to call the inner debugs with iteration -1 in inactive state
active – a boolean that can be (de-)activated from outside to enable/disable debug
Convert certain Symbols in the debug=[ ... ] vector to DebugActions. Currently the following ones are done. Note that the shortcut symbols should all start with a capital letter.
Convert certain Symbols in the debug=[ ... ] vector to DebugActions. Currently the following ones are done, where the string in t[2] is passed as the format to the corresponding debug. Note that the shortcut symbols t[1] should all start with a capital letter.
given an array of Symbols, Strings, DebugActions and Ints
The symbol :Stop creates an entry to display the stopping criterion at the end (:Stop => DebugStoppingCriterion()); for further symbols see DebugActionFactory
The symbol :Subsolver wraps all dictionary entries with DebugWhenActive that can be set from outside.
Tuples of a symbol and a string can be used to also specify a format, see DebugActionFactory
an Integer k introduces that debug is only printed every kth iteration
Return value
This function returns a dictionary with an entry :All containing one general DebugAction, possibly a DebugGroup of entries. It might contain an entry :Start, :Step, :Stop with an action (each) to specify what to do at the start, after a step or at the end of an Algorithm, respectively. On all three occasions the :All action is executed. Note that only the :Stop entry is actually filled when specifying the :Stop symbol.
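For example, a call to the factory with the symbol/string/number shorthand might look like the following sketch (the exact printed formats depend on the actions created):

```julia
using Manopt

# build the dictionary of debug actions from the shorthand vector
dd = DebugFactory([:Iteration, :Cost, "\n", 25, :Stop])
# dd[:All]  – a DebugGroup printing iteration number and cost every 25th iteration
# dd[:Stop] – a DebugStoppingCriterion printed once at the end
```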
The decorator to print debug during the iterations can be activated by decorating the state of a solver and implementing your own DebugActions. For example printing a gradient from the GradientDescentState is automatically available, as explained in the gradient_descent solver.
For any optimisation performed in Manopt.jl we need information about both the optimisation task or “problem” at hand as well as the solver and all its parameters. This together is called a plan in Manopt.jl and it consists of two data structures:
The Manopt Problem describes all static data of our task, most prominently the manifold and the objective.
The Solver State describes all varying data and parameters for the solver we aim to use. This also means that each solver has its own data structure for the state.
By splitting these two parts, we can use one problem and solve it using different solvers.
Still there might be the need to set certain parameters within any of these structures. For that there is
Describe the collection of the optimization function $f\colon \mathcal M → ℝ$ (or even a vectorial range) and its corresponding elements, which might for example be a gradient or (one or more) proximal maps.
All these elements should usually be implemented as functions (M, p) -> ..., or (M, X, p) -> ... that is
the first argument of these functions should be the manifold M they are defined on
the argument X is present, if the computation is performed inplace of X (see InplaceEvaluation)
the argument p is the place the function ($f$ or one of its elements) is evaluated at.
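The two calling conventions above correspond to the allocating and in-place evaluation modes. A sketch (the cost and gradient here are illustrative placeholders):

```julia
using Manopt

# an illustrative cost with two equivalent gradient implementations
f(M, p) = sum(p .^ 2)

grad_f(M, p) = 2 .* p             # allocating: returns a new tangent vector
grad_f!(M, X, p) = (X .= 2 .* p)  # in-place: writes the result into X

obj_alloc = ManifoldGradientObjective(f, grad_f)
obj_inplace = ManifoldGradientObjective(f, grad_f!; evaluation=InplaceEvaluation())
```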
A common supertype for all decorators of AbstractManifoldObjectives to simplify dispatch. The second parameter should refer to the undecorated objective (i.e. the most inner one).
There are two main possibilities for its containing functions concerning the evaluation mode – not necessarily the cost, but for example the gradient in an AbstractManifoldGradientObjective.
Indicate internally whether an AbstractManifoldObjective o is of decorating type, i.e. it stores (encapsulates) an object in itself, by default in the field o.objective.
Decorators indicate this by returning Val{true} for further dispatch.
The default is Val{false}, i.e. by default a state is not decorated.
optional arguments provide necessary details on the decorators. A specific one is used to activate certain decorators.
cache – (missing) specify a cache. Currently :Simple is supported and :LRU if you load LRUCache.jl. For this case a tuple specifying what to cache and how many can be provided, i.e. (:LRU, [:Cost, :Gradient], 10), where the number specifies the size of each cache, and 10 is the default if one omits the last tuple entry
count – (missing) specify calls to the objective to be counted, see ManifoldCountObjective for the full list
objective_type – (:Riemannian) specify that an objective is :Riemannian or :Euclidean. The :Euclidean symbol is equivalent to specifying it as :Embedded, since in the end, both refer to converting an objective from the embedding (whether it is Euclidean or not) to the Riemannian one.
EmbeddedManifoldObjective{P, T, E, O2, O1<:AbstractManifoldObjective{E}} <: AbstractDecoratedManifoldObjective{O2, O1}
Declare an objective to be defined in the embedding. This also declares the gradient to be defined in the embedding, and especially being the Riesz representer with respect to the metric in the embedding. The types can be used to still dispatch on also the undecorated objective type O2.
Fields
objective – the objective that is defined in the embedding
p - (nothing) a point in the embedding.
X - (nothing) a tangent vector in the embedding
When a point in the embedding p is provided, embed! is used in place of this point to reduce memory allocations. Similarly X is used when embedding tangent vectors
Since single function calls, e.g. to the cost or the gradient, might be expensive, a simple cache objective exists as a decorator, that caches one cost value or gradient.
It can be activated/used with the cache= keyword argument available for every solver.
:LRU generates a ManifoldCachedObjective where you should use the form (:LRU, [:Cost, :Gradient]) to specify what should be cached or (:LRU, [:Cost, :Gradient], 100) to specify the cache size. Here this variant defaults to (:LRU, [:Cost, :Gradient], 100), i.e. to cache up to 100 cost and gradient values.[1]
Generate a cached variant of the AbstractManifoldObjective o on the AbstractManifold M based on the symbol cache[1], where the second element cache[2] contains further arguments to the cache and the optional third is passed down as keyword arguments.
For all available caches see the simpler variant with symbols.
Provide a simple cache for an AbstractManifoldGradientObjective, that is, for a given point p this cache stores the point p and a gradient $\operatorname{grad} f(p)$ in X as well as a cost value $f(p)$ in c.
Both X and c are accompanied by booleans to keep track of their validity.
For the more advanced cache, you need to implement some type of cache yourself, that provides a get! and implement init_caches. This is for example provided if you load LRUCache.jl. Then you obtain
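Activating such a cache on a solver call is then a one-keyword change, following the tuple form described above (M, f, grad_f, p0 are placeholders for any gradient-based problem):

```julia
using Manopt, LRUCache  # loading LRUCache.jl makes the :LRU variant available

# assuming M, f, grad_f, p0 are defined as for any gradient-based solver
gradient_descent(M, f, grad_f, p0;
    cache = (:LRU, [:Cost, :Gradient], 25),  # cache up to 25 cost and gradient values
)
```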
specify an AbstractManifoldObjective that only has information about the cost function $f\colon \mathcal M → ℝ$, implemented as a function (M, p) -> c to compute the cost value c at p on the manifold M.
cost – a function $f: \mathcal M → ℝ$ to minimize
Constructors
ManifoldCostObjective(f)
Generate a problem. While this Problem does not have any allocating functions, the type T can be set for consistency reasons with other problems.
Evaluate the cost function of an objective defined in the embedding, i.e. embed p before calling the cost function stored in the EmbeddedManifoldObjective.
Describe the collection of the optimization function `f\colon \mathcal M → \bbR (or even a vectorial range) and its corresponding elements, which might for example be a gradient or (one or more) proximal maps.
All these elements should usually be implemented as functions (M, p) -> ..., or (M, X, p) -> ... that is
the first argument of these functions should be the manifold M they are defined on
the argument X is present, if the computation is performed inplace of X (see InplaceEvaluation)
the argument p is the place the function ($f$ or one of its elements) is evaluated at.
A common supertype for all decorators of AbstractManifoldObjectives to simplify dispatch. The second parameter should refer to the undecorated objective (i.e. the most inner one).
Which has two main different possibilities for its containing functions concerning the evaluation mode – not necessarily the cost, but for example gradient in an AbstractManifoldGradientObjective.
Indicate internally, whether an AbstractManifoldObjectiveo to be of decorating type, i.e. it stores (encapsulates) an object in itself, by default in the field o.objective.
Decorators indicate this by returning Val{true} for further dispatch.
The default is Val{false}, i.e. by default an state is not decorated.
optional arguments provide necessary details on the decorators. A specific one is used to activate certain decorators.
cache – (missing) specify a cache. Currently :Simple is supported and :LRU if you load LRUCache.jl. For this case a tuple specifying what to cache and how many can be provided, i.e. (:LRU, [:Cost, :Gradient], 10), where the number specifies the size of each cache. and 10 is the default if one omits the last tuple entry
count – (missing) specify calls to the objective to be called, see ManifoldCountObjective for the full list
objective_type – (:Riemannian) specify that an objective is :Riemannian or :Euclidean. The :Euclidean symbol is equivalent to specifying it as :Embedded, since in the end, both refer to converting an objective from the embedding (whether its Euclidean or not) to the Riemannian one.
EmbeddedManifoldObjective{P, T, E, O2, O1<:AbstractManifoldObjective{E}} <:
+ AbstractDecoratedManifoldObjective{O2, O1}
Declare an objective to be defined in the embedding. This also declares the gradient to be defined in the embedding, and especially being the Riesz representer with respect to the metric in the embedding. The types can be used to still dispatch on also the undecorated objective type O2.
Fields
objective – the objective that is defined in the embedding
p - (nothing) a point in the embedding.
X - (nothing) a tangent vector in the embedding
When a point in the embedding p is provided, embed! is used in place of this point to reduce memory allocations. Similarly X is used when embedding tangent vectors
Since single function calls, e.g. to the cost or the gradient, might be expensive, a simple cache objective exists as a decorator, that caches one cost value or gradient.
It can be activated/used with the cache= keyword argument available for every solver.
:LRU generates a ManifoldCachedObjective where you should use the form (:LRU, [:Cost, :Gradient]) to specify what should be cached or (:LRU, [:Cost, :Gradient], 100) to specify the cache size. Here this variant defaults to (:LRU, [:Cost, :Gradient], 100), i.e. to cache up to 100 cost and gradient values.[1]
Generate a cached variant of the AbstractManifoldObjectiveo on the AbstractManifold M based on the symbol cache[1], where the second element cache[2] are further arguments to the cache and the optional third is passed down as keyword arguments.
For all available caches see the simpler variant with symbols.
Provide a simple cache for an AbstractManifoldGradientObjective that is for a given point p this cache stores a point p and a gradient $\operatorname{grad} f(p)$ in X as well as a cost value $f(p)$ in c.
Both X and c are accompanied by booleans to keep track of their validity.
For the more advanced cache, you need to implement some type of cache yourself, that provides a get! and implement init_caches. This is for example provided if you load LRUCache.jl. Then you obtain
specify an AbstractManifoldObjective that only has information about the cost function $f\colon \mathbb M → ℝ$, implemented as a function (M, p) -> c to compute the cost value c at p on the manifold M.
cost – a function $f: \mathcal M → ℝ$ to minimize
Constructors
ManifoldCostObjective(f)
Generate a problem. While this Problem does not have any allocating functions, the type T can be set for consistency reasons with other problems.
Evaluate the cost function of an objective defined in the embedding, i.e. embed p before calling the cost function stored in the EmbeddedManifoldObjective.
an (optional) cost function $f(p) = \displaystyle\sum_{i=1}^n f_i(p)$
an array of gradients, $\operatorname{grad}f_i(p), i=1,\ldots,n$ which can be given in two forms
as one single function $(\mathcal M, p) ↦ (X_1,…,X_n) \in (T_p\mathcal M)^n$
as a vector of functions $\bigl( (\mathcal M, p) ↦ X_1, …, (\mathcal M, p) ↦ X_n\bigr)$.
Where both variants can also be provided as InplaceEvaluation functions, i.e. (M, X, p) -> X, where X is the vector of X1,...Xn and (M, X1, p) -> X1, ..., (M, Xn, p) -> Xn, respectively.
Constructors
ManifoldStochasticGradientObjective(
)
Create an alternating gradient problem with an optional cost and the gradient either as one function (returning an array) or a vector of functions.
an (optional) cost function $f(p) = \displaystyle\sum_{i=1}^n f_i(p)$
an array of gradients, $\operatorname{grad}f_i(p), i=1,\ldots,n$ which can be given in two forms
as one single function $(\mathcal M, p) ↦ (X_1,…,X_n) \in (T_p\mathcal M)^n$
as a vector of functions $\bigl( (\mathcal M, p) ↦ X_1, …, (\mathcal M, p) ↦ X_n\bigr)$.
Where both variants can also be provided as InplaceEvaluation functions, i.e. (M, X, p) -> X, where X is the vector of X1,...Xn and (M, X1, p) -> X1, ..., (M, Xn, p) -> Xn, respectively.
Create a Stochastic gradient problem with the gradient either as one function (returning an array of tangent vectors) or a vector of functions (each returning one tangent vector).
The optional cost can also be given as either a single function (returning a number) or a vector of functions, each returning a value.
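A minimal sketch of both construction variants, assuming Manifolds.jl is available (the three data points are illustrative):

```julia
using Manopt, Manifolds

M = Sphere(2)
pts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
f(M, p) = sum(distance(M, p, q)^2 for q in pts) / 2

# variant 1: one function returning all summand gradients at once
grad_f(M, p) = [-log(M, p, q) for q in pts]
sgo1 = ManifoldStochasticGradientObjective(grad_f; cost=f)

# variant 2: a vector of functions, one per summand
grad_fs = [((M, p) -> -log(M, p, q)) for q in pts]
sgo2 = ManifoldStochasticGradientObjective(grad_fs; cost=f)
```

Both objectives represent the same problem; the vector-of-functions form additionally allows evaluating a single summand gradient without computing the others.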
specify an objective containing one function to perform a combined computation of cost and its gradient
Fields
costgrad!! – a function that computes both the cost $f\colon\mathcal M → ℝ$ and its gradient $\operatorname{grad}f\colon\mathcal M → \mathcal T\mathcal M$
The evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. When the non-mutating variant is called with a T=InplaceEvaluation memory for the result is allocated.
Note that the order of parameters follows the philosophy of Manifolds.jl, namely that even for the mutating variant, the manifold is the first parameter and the (inplace) tangent vector X comes second.
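A sketch of such a combined cost-gradient objective, assuming Manifolds.jl (the concrete cost is illustrative); sharing the logarithmic map between cost and gradient is exactly the kind of saving this objective is for:

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
# compute cost and gradient together, reusing the logarithmic map
function fg(M, p)
    X = log(M, p, q)
    return (norm(M, p, X)^2 / 2, -X)
end
cgo = ManifoldCostGradientObjective(fg)
```

Calling get_cost or get_gradient on this objective evaluates fg and returns the respective component.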
get_gradient(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p, k)
get_gradient!(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, Y, p, k)
Evaluate one of the summand gradients $\operatorname{grad}f_k$, $k∈\{1,…,n\}$, at p (in place of Y).
If you use a single function for the stochastic gradient, that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient required for allocation) can not be determined.
Evaluate the complete gradient $\operatorname{grad} f = \displaystyle\sum_{i=1}^n \operatorname{grad} f_i(p)$ at p (in place of X).
If you use a single function for the stochastic gradient, that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient required for allocation) can not be determined.
Evaluate the gradient function of an objective defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.
The returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.
Evaluate all summand gradients $\{\operatorname{grad}f_i\}_{i=1}^n$ at p (in place of X).
If you use a single function for the stochastic gradient, that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient) can not be determined.
return the function to evaluate (just) the gradient $\operatorname{grad} f(p)$, where either the gradient function using the decorator or without the decorator is used.
By default recursive is set to false, since usually to just pass the gradient function somewhere, you still want e.g. the cached one or the one that still counts calls.
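With the vector-of-functions form, both a single summand gradient and the full sum can be queried, as this sketch shows (assuming Manifolds.jl; the points are illustrative):

```julia
using Manopt, Manifolds

M = Sphere(2)
pts = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
grad_fs = [((M, p) -> -log(M, p, q)) for q in pts]
sgo = ManifoldStochasticGradientObjective(grad_fs)

p = [0.0, 1.0, 0.0]
Xk = get_gradient(M, sgo, p, 3)   # gradient of the third summand only
X  = get_gradient(M, sgo, p)      # sum of all summand gradients
```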
A structure to store information about an objective for a subgradient based optimization problem
Fields
cost – the function $F$ to be minimized
subgradient – a function returning a subgradient $\partial F$ of $F$
Constructor
ManifoldSubgradientObjective(f, ∂f)
Generate the ManifoldSubgradientObjective for a subgradient objective, i.e. a (cost) function f(M, p) and a function ∂f(M, p) that returns a not necessarily deterministic element from the subdifferential at p on a manifold M.
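As a sketch, consider the (nonsmooth) Riemannian distance $f(p) = d(p, q)$: away from q one subgradient element is $-\log_p q / d(p,q)$, while at q itself any tangent vector of norm at most one is valid, so returning the zero vector is a legitimate choice (assuming Manifolds.jl):

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
f(M, p) = distance(M, p, q)
function ∂f(M, p)
    d = distance(M, p, q)
    # at p = q the subdifferential is the unit ball; return the zero vector
    return d == 0 ? zero_vector(M, p) : -log(M, p, q) / d
end
sgo = ManifoldSubgradientObjective(f, ∂f)
```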
number_of_proxes – (`ones(length(proxes))`) number of proximal maps per function, e.g. if one of the maps is a combined one, such that the proximal map functions return more than one entry per function, you have to adapt this value. If not specified, it is set to one prox per function.
gradient : the gradient $\operatorname{grad}F:\mathcal M → \mathcal T\mathcal M$ of the cost function $F$
hessian : the Hessian $\operatorname{Hess}F(x)[⋅]: \mathcal T_{x} \mathcal M → \mathcal T_{x} \mathcal M$ of the cost function $F$
preconditioner : the symmetric, positive definite preconditioner as an approximation of the inverse of the Hessian of $f$, i.e. as a map with the same input variables as the hessian.
Y = get_hessian(amp::AbstractManoptProblem{T}, p, X)
get_hessian!(amp::AbstractManoptProblem{T}, Y, p, X)
evaluate the Hessian of an AbstractManoptProblem amp at p applied to a tangent vector X, i.e. compute $\operatorname{Hess}f(p)[X]$, which can also happen in-place of Y.
get_hessian(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, X)
get_hessian!(M::AbstractManifold, Y, emo::EmbeddedManifoldObjective, p, X)
Evaluate the Hessian of an objective defined in the embedding, that is embed p and X before calling the Hessian function stored in the EmbeddedManifoldObjective.
The returned Hessian is then converted to a Riemannian Hessian calling riemannian_Hessian.
evaluate the symmetric, positive definite preconditioner (approximation of the inverse of the Hessian of the cost function f) of the AbstractManoptProblem amp's objective at the point p applied to a tangent vector X.
evaluate the symmetric, positive definite preconditioner (approximation of the inverse of the Hessian of the cost function F) of a ManifoldHessianObjective mho at the point p applied to a tangent vector X.
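As a sketch, the Rayleigh quotient on the sphere provides all three ingredients; the preconditioner below is just the identity, purely for illustration (assuming Manifolds.jl):

```julia
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = Diagonal([3.0, 2.0, 1.0])
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 * A * p)
# Hess f(p)[X] = 2 P_p(A X) - 2 (p' A p) X for tangent X
Hess_f(M, p, X) = project(M, p, 2 * A * X) - 2 * (p' * A * p) * X
precon(M, p, X) = X   # identity preconditioner (illustrative)

mho = ManifoldHessianObjective(f, grad_f, Hess_f, precon)
```

At an eigenvector of A, e.g. [0.0, 0.0, 1.0], the Riemannian gradient of this cost vanishes.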
The last optional argument can be used to provide the 4 or 5 functions as allocating or mutating (in place computation) ones. Note that the first argument is always the manifold under consideration, the mutated one is the second.
X = adjoint_linearized_operator(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, m, n, Y)
adjoint_linearized_operator(N::AbstractManifold, X, apdmo::AbstractPrimalDualManifoldObjective, m, n, Y)
Evaluate the adjoint of the linearized forward operator of $(DΛ(m))^*[Y]$ stored within the AbstractPrimalDualManifoldObjective (in place of X). Since $Y∈T_n\mathcal N$, both $m$ and $n=Λ(m)$ are necessary arguments, mainly because the forward operator $Λ$ might be missing in p.
η = get_differential_dual_prox(N::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, n, τ, X, ξ)
get_differential_dual_prox!(N::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, η, n, τ, X, ξ)
y = get_differential_primal_prox(M::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, σ, x)
get_differential_primal_prox!(p::TwoManifoldProblem, y, σ, x)
Y = get_dual_prox(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, n, τ, X)
get_dual_prox!(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, Y, n, τ, X)
Y = linearized_forward_operator(M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, m, X, n)
linearized_forward_operator!(M::AbstractManifold, N::AbstractManifold, Y, apdmo::AbstractPrimalDualManifoldObjective, m, X, n)
Evaluate the linearized operator (differential) $DΛ(m)[X]$ stored within the AbstractPrimalDualManifoldObjective (in place of Y), where n = Λ(m).
Besides the AbstractEvaluationType there is one further property to distinguish among constraint functions, especially the gradients of the constraints.
ConstrainedManifoldObjective{T<:AbstractEvaluationType, C<:ConstraintType} <: AbstractManifoldObjective{T}
Describes the constrained objective
\[\begin{aligned}
\operatorname*{arg\,min}_{p ∈\mathcal{M}} & f(p)\\
\text{subject to } &g_i(p)\leq0 \quad \text{ for all } i=1,…,m,\\
\quad &h_j(p)=0 \quad \text{ for all } j=1,…,n.
\end{aligned}\]
)
Where f, g, h describe the cost, inequality and equality constraints, respectively, as described above and grad_f, grad_g, grad_h are the corresponding gradient functions in one of the 4 formats. If the objective does not have inequality constraints, you can set G and gradG to nothing. If the problem does not have equality constraints, you can set H and gradH to nothing or leave them out.
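A minimal sketch with one inequality and no equality constraints, assuming Manifolds.jl (the functions are illustrative; note that g and grad_g return vectors, one entry per constraint, and the exact constructor arguments may differ between versions):

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[3]                                  # minimize the height
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])
g(M, p) = [-p[1]]                               # g₁(p) = -p₁ ≤ 0, i.e. p₁ ≥ 0
grad_g(M, p) = [project(M, p, [-1.0, 0.0, 0.0])]

# no equality constraints: pass nothing for H and gradH
cmo = ConstrainedManifoldObjective(f, grad_f, g, grad_g, nothing, nothing)
```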
Return the vector $(g_1(p),...g_m(p),h_1(p),...,h_n(p))$ defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.
evaluate the jth equality constraint $h_j(p)$ defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.
Evaluate all equality constraints $h(p)$ of $\bigl(h_1(p), h_2(p),\ldots,h_n(p)\bigr)$ defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.
Evaluate the ith inequality constraint $g_i(p)$ defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.
Evaluate all inequality constraints $g(p)$ of $\bigl(g_1(p), g_2(p),\ldots,g_m(p)\bigr)$ defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.
evaluate the gradient of the jth equality constraint $(\operatorname{grad} h(p))_j$ or $\operatorname{grad} h_j(x)$.
Note
For the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints. It also allocates a full tangent vector.
evaluate the gradient of the jth equality constraint $\operatorname{grad} h_j(p)$ defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.
The returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.
evaluate all gradients of the equality constraints $\operatorname{grad} h(x)$ or $\bigl(\operatorname{grad} h_1(x), \operatorname{grad} h_2(x),\ldots, \operatorname{grad}h_n(x)\bigr)$ of the ConstrainedManifoldObjective P at p.
X = get_grad_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)
get_grad_equality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)
evaluate the gradients of the equality constraints $\operatorname{grad} h(p)$ defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.
The returned gradients are then converted to a Riemannian gradient calling riemannian_gradient.
evaluate all gradients of the equality constraints $\operatorname{grad} h(p)$ or $\bigl(\operatorname{grad} h_1(p), \operatorname{grad} h_2(p),\ldots,\operatorname{grad} h_n(p)\bigr)$ of the ConstrainedManifoldObjective P at p in place of X, which is a vector of n tangent vectors.
Evaluate the gradient of the jth equality constraint $(\operatorname{grad} h(x))_j$ or $\operatorname{grad} h_j(x)$ in place of $X$
Note
For the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation of the FunctionConstraint of the problem, this function currently also calls get_inequality_constraints, since this is the only way to determine the number of constraints, and it allocates a full vector of tangent vectors.
evaluate the gradient of the ith inequality constraint $\operatorname{grad} g_i(p)$ defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.
The returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.
Evaluate the gradient of the ith inequality constraint $(\operatorname{grad} g(x))_i$ or $\operatorname{grad} g_i(x)$ of the ConstrainedManifoldObjective P in place of $X$.
evaluate all gradients of the inequality constraints $\operatorname{grad} g(x)$ or $\bigl(\operatorname{grad} g_1(x), \operatorname{grad} g_2(x),\ldots,\operatorname{grad} g_m(x)\bigr)$ of the ConstrainedManifoldObjective P at x in place of X, which is a vector of m tangent vectors.
evaluate all gradients of the inequality constraints $\operatorname{grad} g(p)$ or $\bigl(\operatorname{grad} g_1(p), \operatorname{grad} g_2(p),…,\operatorname{grad} g_m(p)\bigr)$ of the ConstrainedManifoldObjective P at p.
X = get_grad_inequality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)
get_grad_inequality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)
evaluate the gradients of the inequality constraints $\operatorname{grad} g(p)$ defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.
The returned gradients are then converted to a Riemannian gradient calling riemannian_gradient.
evaluate all gradients of the inequality constraints $\operatorname{grad} g(x)$ or $\bigl(\operatorname{grad} g_1(x), \operatorname{grad} g_2(x),\ldots,\operatorname{grad} g_m(x)\bigr)$ of the ConstrainedManifoldObjective P at p in place of X, which is a vector of $m$ tangent vectors.
return the (one step) undecorated AbstractManifoldObjective of the (possibly) decorated o. As long as your decorated objective stores the objective within o.objective and the dispatch_objective_decorator is set to Val{true}, the internal state is extracted automatically.
By default the objective that is stored within a decorated objective is assumed to be at o.objective. Overwrite _get_objective(o, ::Val{true}, recursive) to change this behaviour for your objective o for both the recursive and the nonrecursive case.
If recursive is set to false, only the outermost decorator is removed instead of all of them.
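A sketch of adding and peeling off a decorator, here a cache (assuming Manifolds.jl and LRUCache.jl; the cost and gradient are illustrative):

```julia
using Manopt, Manifolds, LRUCache

M = Sphere(2)
f(M, p) = p[3]
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])
obj = ManifoldGradientObjective(f, grad_f)

# decorate with an LRU cache for cost and gradient values
cobj = objective_cache_factory(M, obj, (:LRU, [:Cost, :Gradient]))
inner = get_objective(cobj)   # removes the cache decorator again
```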
Usually, such a problem is determined by the manifold or domain of the optimisation and the objective with all its properties used within an algorithm – see The Objective. For that we can just use
The exception to these are the primal dual-based solvers (Chambolle-Pock and the PD Semismooth Newton), which both need two manifolds as their domain(s), hence there also exists a
To record values during the iterations of a solver run, there are in general two possibilities. On the one hand, the high-level interfaces provide a record= keyword, that accepts several different inputs. For more details see How to record.
A RecordAction is a small functor to record values. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s that performs the record, where i is the current iteration.
By convention i<=0 is interpreted as "For Initialization only", i.e. only initialize internal values, but do not trigger any record; the same holds for i=typemin(Int), which is used to indicate stop, i.e. that the record is called from within stop_solver! which returns true afterwards.
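A sketch of the high-level `record=` keyword together with get_record, assuming Manifolds.jl (cost and start point are illustrative):

```julia
using Manopt, Manifolds

M = Sphere(2)
q = [0.0, 0.0, 1.0]
f(M, p) = distance(M, p, q)^2 / 2
grad_f(M, p) = -log(M, p, q)

s = gradient_descent(M, f, grad_f, [1.0, 0.0, 0.0];
    record=[:Iteration, :Cost], return_state=true,
)
rec = get_record(s)   # per-iteration tuples of (iteration, cost)
```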
debug for the amount of change of the iterate (stored in o.x of the AbstractManoptSolverState) during the last iteration.
Additional Fields
storage a StoreStateAction to store (at least) o.x to use this as the last value (to compute the change).
inverse_retraction_method - (default_inverse_retraction_method(manifold, p)) the inverse retraction to be used for approximating distance.
Constructor
RecordChange(M=DefaultManifold();)
with the above fields as keywords. For the DefaultManifold only the field storage is used. Providing the actual manifold moves the default storage to the efficient point storage.
group a set of RecordActions into one action, where the internal RecordActions act independently, but the results can be collected in a grouped fashion, i.e. as a tuple per call of this group. The entries can later be addressed either by index or by semantic Symbols.
Constructors
RecordGroup(g::Array{<:RecordAction, 1})
construct a group consisting of an Array of RecordActions g,
RecordGroup(g, symbols)
Examples
r = RecordGroup([RecordIteration(), RecordCost()])
A RecordGroup to record the current iteration and the cost. The cost can then be accessed using get_record(r,2) or r[2].
r = RecordGroup([RecordIteration(), RecordCost()], Dict(:Cost => 2))
A RecordGroup to record the current iteration and the cost, which can then be accessed using get_record(:Cost) or r[:Cost].
r = RecordGroup([RecordIteration(), :Cost => RecordCost()])
A RecordGroup identical to the previous constructor, just a little easier to use.
append to any AbstractManoptSolverState the decorator with record functionality. Internally, a dictionary is kept that stores a RecordAction for several concurrent modes using a Symbol as reference. The default mode is :Iteration, which is used to store information that is recorded during the iterations. RecordActions might be added to :Start or :Stop to record values at the beginning or at the stopping time point, respectively.
The original options can still be accessed using the get_state function.
Fields
options – the options that are extended by debug information
recordDictionary – a Dict{Symbol,RecordAction} to keep track of all different recorded values
To record values during the iterations of a solver run, there are in general two possibilities. On the one hand, the high-level interfaces provide a record= keyword, that accepts several different inputs. For more details see How to record.
A RecordAction is a small functor to record values. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s that performs the record, where i is the current iteration.
By convention i<=0 is interpreted as "For Initialization only", i.e. only initialize internal values, but not trigger any record, the same holds for i=typemin(Inf) which is used to indicate stop, i.e. that the record is called from within stop_solver! which returns true afterwards.
record the amount of change of the iterate (stored in o.x of the AbstractManoptSolverState) during the last iteration.
Additional Fields
storage a StoreStateAction to store (at least) o.x to use this as the last value (to compute the change)
inverse_retraction_method - (default_inverse_retraction_method(manifold, p)) the inverse retraction to be used for approximating distance.
Constructor
RecordChange(M=DefaultManifold();)
with the above fields as keywords. For the DefaultManifold only the field storage is used. Providing the actual manifold moves the default storage to the efficient point storage.
group a set of RecordActions into one action, where the internal RecordActions act independently, but the results can be collected in a grouped fashion, i.e. tuples per call of this group. The entries can later be addressed either by index or by semantic Symbols
Constructors
RecordGroup(g::Array{<:RecordAction, 1})
construct a group consisting of an Array of RecordActions g,
RecordGroup(g, symbols)
Examples
r = RecordGroup([RecordIteration(), RecordCost()])
A RecordGroup to record the current iteration and the cost. The cost can then be accessed using get_record(r,2) or r[2].
r = RecordGroup([RecordIteration(), RecordCost()], Dict(:Cost => 2))
A RecordGroup to record the current iteration and the cost, which can then be accessed using get_record(:Cost) or r[:Cost].
r = RecordGroup([RecordIteration(), :Cost => RecordCost()])
A RecordGroup identical to the previous constructor, just a little easier to use.
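The grouping and symbol-indexing mechanism can be sketched independently of Manopt.jl. The following minimal functor collection (all names are illustrative, not the actual package internals) mirrors how a RecordGroup keeps one record vector per action and maps semantic Symbols to positions:

```julia
# Minimal sketch of grouped record actions; not the actual Manopt.jl types.
struct MiniRecordGroup
    actions::Vector{Function}     # each action returns the value to record
    records::Vector{Vector{Any}}  # one record vector per action
    index::Dict{Symbol,Int}       # semantic symbol -> position
end

MiniRecordGroup(actions, index=Dict{Symbol,Int}()) =
    MiniRecordGroup(actions, [Any[] for _ in actions], index)

# Calling the group records one entry per action, like one tuple per call.
function (g::MiniRecordGroup)(state, i)
    for (k, a) in enumerate(g.actions)
        push!(g.records[k], a(state, i))
    end
end

Base.getindex(g::MiniRecordGroup, k::Int) = g.records[k]
Base.getindex(g::MiniRecordGroup, s::Symbol) = g.records[g.index[s]]

# record the "iteration" and a mock "cost" for three iterations
g = MiniRecordGroup([(s, i) -> i, (s, i) -> s[:cost] / i],
                    Dict(:Cost => 2))
for i in 1:3
    g(Dict(:cost => 6.0), i)
end
g[1]      # == [1, 2, 3]
g[:Cost]  # == [6.0, 3.0, 2.0]
```

Both access styles from the examples above, `r[2]` and `r[:Cost]`, correspond to the two `getindex` methods here.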
append to any AbstractManoptSolverState the decorator with record functionality. Internally, a dictionary is kept that stores a RecordAction for several concurrent modes using a Symbol as reference. The default mode is :Iteration, which is used to store information that is recorded during the iterations. RecordActions might be added to :Start or :Stop to record values at the beginning or at the stopping time point, respectively
The original options can still be accessed using the get_state function.
Fields
options – the options that are extended by record information
recordDictionary – a Dict{Symbol,RecordAction} to keep track of all different recorded values
return the recorded values from within the RecordSolverState that were recorded with respect to the Symbol symbol as an Array. The default refers to any recordings during an :Iteration.
either record (i>0 and not Inf) the value v within the RecordAction r or reset (i<0) the internal storage, where v has to match the internal value type of the corresponding RecordAction.
Given an AbstractManoptProblem, that is a certain optimisation task, the state specifies the solver to use. It contains the parameters of a solver and all fields necessary during the algorithm, e.g. the current iterate, a StoppingCriterion or a Stepsize.
return the (one step) undecorated AbstractManoptSolverState of the (possibly) decorated s. As long as your decorated state stores the state within s.state and the dispatch_state_decorator is set to Val{true}, the internal state is extracted automatically.
By default the state that is stored within a decorated state is assumed to be at s.state. Overwrite _get_state(s, ::Val{true}, recursive) to change this behaviour for your state, for both the recursive and the nonrecursive case.
If recursive is set to false, only the most outer decorator is taken away instead of all.
Since every subtype of an AbstractManoptSolverState directly relates to a solver, the concrete states are documented together with the corresponding solvers. This page documents the general functionality available for every state.
A first example is to access, i.e. obtain or set, the current iterate. This might be useful to continue investigation at the current iterate, or to set up a solver for a next experiment, respectively.
An internal function working on the state and elements within a state is used to pass messages from (sub) activities of a state to the corresponding DebugMessages
For an undecorated state, this is assumed to be in ams.stop. Overwrite _get_stopping_criterion(yms::YMS) to change this for your manopt solver (yms), assuming it has type YMS.
Indicate internally whether an AbstractManoptSolverState s is of decorating type, i.e. it stores (encapsulates) a state in itself, by default in the field s.state.
Decorators indicate this by returning Val{true} for further dispatch.
The default is Val{false}, i.e. by default a state is not decorated.
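The Val{true}/Val{false} dispatch described above can be sketched with a toy decorator hierarchy. The type names below are illustrative; only the dispatch pattern mirrors the documentation:

```julia
# Sketch of the decorator-unwrapping dispatch; types here are illustrative.
abstract type AbstractState end
struct InnerState <: AbstractState end
struct Decorator{S<:AbstractState} <: AbstractState
    state::S   # the encapsulated (possibly again decorated) state
end

# decorators indicate themselves by returning Val(true)
dispatch_state_decorator(::AbstractState) = Val(false)
dispatch_state_decorator(::Decorator) = Val(true)

get_state(s::AbstractState, recursive::Bool=true) =
    _get_state(s, dispatch_state_decorator(s), recursive)
_get_state(s, ::Val{false}, recursive) = s   # undecorated: nothing to unwrap
_get_state(s, ::Val{true}, recursive) =
    recursive ? get_state(s.state, true) : s.state

s = Decorator(Decorator(InnerState()))
get_state(s)         # fully unwrapped: InnerState()
get_state(s, false)  # only the outermost decorator removed
```

With recursive=false only the most outer decorator is taken away, matching the behaviour described above.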
optional arguments provide necessary details on the decorators. A specific one is used to activate certain decorators.
debug – (Array{Union{Symbol,DebugAction,String,Int},1}()) a set of symbols representing DebugActions, Strings used as dividers and a subsampling integer. These are passed as a DebugGroup within :All to the DebugSolverState decorator dictionary. Only exception is :Stop that is passed to :Stop.
record – (Array{Union{Symbol,RecordAction,Int},1}()) specify recordings by using Symbols or RecordActions directly. The integer can again be used for only recording every $i$th iteration.
return_state - (false) indicate whether to wrap the options in a ReturnSolverState, indicating that the solver should return options and not (only) the minimizer.
This internal type is used to indicate that the contained AbstractManoptSolverState state should be returned at the end of a solver instead of the usual minimizer.
Several state decorators or actions might store intermediate values like the (last) iterate to compute some change or the last gradient. In order to minimise the storage of these, there is a generic StoreStateAction that acts as generic common storage that can be shared among different actions.
This functor possesses the usual interface of functions called during an iteration, i.e. it acts on (p,o,i), where p is an AbstractManoptProblem, o is an AbstractManoptSolverState and i is the current iteration.
Fields
values – a dictionary to store interim values based on certain Symbols
keys – a Vector of Symbols to refer to fields of AbstractManoptSolverState
point_values – a NamedTuple of mutable values of points on a manifold to be stored in StoreStateAction. Manifold is later determined by AbstractManoptProblem passed to update_storage!.
point_init – a NamedTuple of boolean values indicating whether a point in point_values with matching key has been already initialized to a value. When it is false, it corresponds to a general value not being stored for the key present in the vector keys.
vector_values – a NamedTuple of mutable values of tangent vectors on a manifold to be stored in StoreStateAction. Manifold is later determined by AbstractManoptProblem passed to update_storage!. It is not specified at which point the vectors are tangent but for storage it should not matter.
vector_init – a NamedTuple of boolean values indicating whether a tangent vector in vector_values with matching key has been already initialized to a value. When it is false, it corresponds to a general value not being stored for the key present in the vector keys.
once – whether to update the internal values only once per iteration
lastStored – last iterate, where this AbstractStateAction was called (to determine once)
To handle the general storage, use get_storage and has_storage with keys as Symbols. For the point storage use PointStorageKey. For tangent vector storage use VectorStorageKey. Point and tangent storage have been optimized to be more efficient.
Constructors
StoreStateAction(s::Vector{Symbol})
This is equivalent to providing s to the keyword store_fields, just that here no manifold is necessary for the construction.
StoreStateAction(M)
Keyword arguments
store_fields (Symbol[])
store_points (Symbol[])
store_vectors (Symbol[])
as vectors of symbols each referring to fields of the state (lower case symbols) or semantic ones (upper case).
p_init (rand(M))
X_init (zero_vector(M, p_init))
are used to initialize the point and vector storages, change these if you use other types (than the default) for your points/vectors on M.
once (true) whether to update internal storage only once per iteration or on every update call
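The "once per iteration" guard together with the get_storage/has_storage access pattern can be sketched as follows; this is a simplified stand-in for StoreStateAction, not the optimized point/vector storage of the package:

```julia
# Sketch of the once-per-iteration storage guard; illustrative only.
mutable struct MiniStore
    values::Dict{Symbol,Any}  # stored interim values
    once::Bool                # update only once per iteration?
    last_stored::Int          # iteration of the last update
end
MiniStore(; once=true) = MiniStore(Dict{Symbol,Any}(), once, -1)

function update_storage!(a::MiniStore, i::Int, kv::Pair{Symbol}...)
    # skip the update if this iteration was already stored and once=true
    (a.once && a.last_stored == i) && return a
    for (k, v) in kv
        a.values[k] = v
    end
    a.last_stored = i
    return a
end
get_storage(a::MiniStore, k::Symbol) = a.values[k]
has_storage(a::MiniStore, k::Symbol) = haskey(a.values, k)

a = MiniStore()
update_storage!(a, 1, :Iterate => [1.0, 0.0])
update_storage!(a, 1, :Iterate => [9.0, 9.0])  # ignored: same iteration, once=true
get_storage(a, :Iterate)  # == [1.0, 0.0]
```

With once=false the second call would overwrite the stored value, corresponding to updating on every call instead.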
An AbstractManoptSolverState type to represent algorithms that employ the Hessian. These options are assumed to have a field (gradient) to store the current gradient $\operatorname{grad}f(x)$
Most iterative algorithms determine a direction along which the algorithm will proceed and determine a step size to find the next iterate. How advanced the step size computation can be implemented depends (among others) on the properties the corresponding problem provides.
Within Manopt.jl, the step size determination is implemented as a functor, which is a subtype of Stepsize.
An abstract type for the functors representing step sizes, i.e. they are callable structures. The naming scheme is TypeOfStepSize, e.g. ConstantStepsize.
Every Stepsize has to provide a constructor and its function has to have the interface (p,o,i), where an AbstractManoptProblem as well as an AbstractManoptSolverState and the current number of iterations are the arguments, and it returns a number, namely the stepsize to use.
Usually, a constructor should take the manifold M as its first argument, for consistency, to allow general step size functors to be set up based on default values that might depend on the manifold currently under consideration.
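The (p,o,i) -> s functor interface can be sketched with two toy step sizes; the type names are illustrative, only the calling convention matches the one described above:

```julia
# Sketch of the Stepsize functor interface (problem, state, i) -> step.
abstract type MiniStepsize end

struct MiniConstant <: MiniStepsize
    length::Float64
end
# a constant step size ignores problem, state, and iteration number
(cs::MiniConstant)(problem, state, i) = cs.length

struct MiniDecreasing <: MiniStepsize
    length::Float64
end
# a simple 1/i decay of the initial length
(ds::MiniDecreasing)(problem, state, i) = ds.length / i

s1 = MiniConstant(0.5)
s2 = MiniDecreasing(1.0)
s1(nothing, nothing, 3)  # -> 0.5
s2(nothing, nothing, 4)  # -> 0.25
```

Because both are plain callable structs, a solver can hold any MiniStepsize and invoke it uniformly, which is exactly the point of the functor design.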
Given a positive threshold $\hat c \in \mathbb N$, a minimal bound $b_{\mathrm{min}} > 0$, an initial $b_0 ≥ b_{\mathrm{min}}$, and a gradient reduction factor threshold $\alpha \in [0,1)$.
Set $c_0=0$ and use $\omega_0 = \lVert \operatorname{grad} f(p_0) \rVert_{p_0}$.
For the first iterate we use the initial step size $s_0 = \frac{1}{b_0}$
Then, given the last gradient $X_{k-1} = \operatorname{grad} f(x_{k-1})$, and a previous $\omega_{k-1}$, the values $(b_k, \omega_k, c_k)$ are computed using $X_k = \operatorname{grad} f(p_k)$ and the following cases
If $\lVert X_k \rVert_{p_k} \leq \alpha\omega_{k-1}$, then let $\hat b_{k-1} \in [b_\mathrm{min},b_{k-1}]$ and set
Note that for $α=0$ this is the Riemannian variant of WNGRad
Fields
count_threshold::Int (4) an Integer for $\hat c$
minimal_bound::Float64 (1e-4) for $b_{\mathrm{min}}$
alternate_bound::Function ((bk, hat_c) -> min(gradient_bound, max(gradient_bound, bk/(3*hat_c)))) how to determine $\hat b_k$ as a function of (bmin, bk, hat_c) -> hat_bk
gradient_reduction::Float64 (0.9)
gradient_bound (norm(M, p0, grad_f(M,p0))) the bound $b_k$.
as well as the internal fields
weight for $ω_k$ initialised to $ω_0 =$norm(M, p0, grad_f(M,p0)) if this is not zero, 1.0 otherwise.
A functor representing Armijo line search including the last runs state, i.e. a last step size.
Fields
initial_stepsize – (1.0) an initial step size
retraction_method – (default_retraction_method(M)) the retraction to use
contraction_factor – (0.95) exponent for line search reduction
sufficient_decrease – (0.1) gain within Armijo's rule
last_stepsize – (initial_stepsize) the last step size we start the search with
initial_guess - ((p,s,i,l) -> l) based on an AbstractManoptProblem p, AbstractManoptSolverState s, a current iterate i, and a last step size l, this returns an initial guess. The default uses the last obtained stepsize
Furthermore the following fields act as safeguards
stop_when_stepsize_less - (0.0) smallest stepsize when to stop (the last one before is taken)
stop_when_stepsize_exceeds - (max_stepsize(M, p)) – largest stepsize when to stop.
stop_increasing_at_step - (100) last step to increase the stepsize (phase 1),
stop_decreasing_at_step - (1000) last step size to decrease the stepsize (phase 2),
Pass :Messages to a debug= to see @infos when these happen.
Constructor
ArmijoLinesearch(M=DefaultManifold())
with the Fields above as keyword arguments and the retraction is set to the default retraction on M.
The constructors return the functor to perform Armijo line search, where two interfaces are available:
with (M, x, F, gradFx[,η=-gradFx]) -> s where M, a current point x, a function F that maps from the manifold to the reals, its gradient (a tangent vector) gradFx $=\operatorname{grad}F(x)$ at x, and an optional search direction tangent vector η=-gradFx are the arguments.
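To illustrate the backtracking idea behind the Armijo rule, here is a Euclidean sketch (retraction is plain addition, parameter names follow the fields above but the function itself is illustrative, not the package implementation):

```julia
# Euclidean backtracking sketch of the Armijo rule; illustrative only.
function armijo_step(f, grad_f, x; s0=1.0, contraction=0.95, decrease=0.1)
    s = s0
    d = -grad_f(x)          # search direction: negative gradient
    f0 = f(x)
    # shrink s until the sufficient-decrease condition holds
    while f(x + s * d) > f0 - decrease * s * sum(abs2, d) && s > 1e-12
        s *= contraction
    end
    return s
end

f(x) = sum(abs2, x)   # cost: squared norm
grad_f(x) = 2x        # its gradient
s = armijo_step(f, grad_f, [1.0, 1.0])
```

On a manifold, `x + s * d` is replaced by a retraction `retr_x(s * d)`, which is exactly why the functor above carries a retraction_method field.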
initialize the stepsize to a constant stepsize, which by default is half the injectivity radius; if the radius is infinite, the default step size is 1.
Alternatively one can also use the following keyword.
DecreasingStepsize(
)
initializes all fields above, where none of them is mandatory and the length is set to half the injectivity radius and to $1$ if the injectivity radius is infinite.
An abstract functor to represent line search type step size determinations, see Stepsize for details. One example is the ArmijoLinesearch functor.
Compared to simple step sizes, the linesearch functors provide an interface of the form (p,o,i,η) -> s with an additional (but optional) fourth parameter to provide a search direction; this should default to something reasonable, e.g. the negative gradient.
A functor representing a nonmonotone line search using the Barzilai-Borwein step size Iannazzo, Porcelli, IMA J. Numer. Anal., 2017. Together with a gradient descent algorithm this line search represents the Riemannian Barzilai-Borwein with nonmonotone line-search (RBBNMLS) algorithm. We shifted the order of the algorithm steps from the paper by Iannazzo and Porcelli so that in each iteration we first find
\[y_{k} = \operatorname{grad}f(x_{k}) - \operatorname{T}_{x_{k-1} → x_k}(\operatorname{grad}f(x_{k-1}))\]
and
\[s_{k} = - α_{k-1}\operatorname{T}_{x_{k-1} → x_k}(\operatorname{grad}f(x_{k-1}))\]
where $α_{k-1}$ is the step size computed in the last iteration and $\operatorname{T}$ is a vector transport. We then find the Barzilai–Borwein step size
\[α_k^{\text{BB}} = \begin{cases}
\min(α_{\text{max}}, \max(α_{\text{min}}, τ_{k})), & \text{if } ⟨s_{k}, y_{k}⟩_{x_k} > 0,\\
α_{\text{max}}, & \text{else,}
\end{cases}\]
where
\[τ_{k} = \frac{⟨s_{k}, s_{k}⟩_{x_k}}{⟨s_{k}, y_{k}⟩_{x_k}},\]
if the direct strategy is chosen, or
\[τ_{k} = \frac{⟨s_{k}, y_{k}⟩_{x_k}}{⟨y_{k}, y_{k}⟩_{x_k}},\]
in case of the inverse strategy and an alternation between the two in case of the alternating strategy. Then we find the smallest $h = 0, 1, 2, …$ such that
where $σ$ is a step length reduction factor $∈ (0,1)$, $m$ is the number of iterations after which the function value has to be lower than the current one and $γ$ is the sufficient decrease parameter $∈(0,1)$. We can then find the new stepsize by
\[α_k = σ^h α_k^{\text{BB}}.\]
Fields
initial_stepsize – (1.0) the step size we start the search with
memory_size – (10) number of iterations after which the cost value needs to be lower than the current one
bb_min_stepsize – (1e-3) lower bound for the Barzilai-Borwein step size greater than zero
bb_max_stepsize – (1e3) upper bound for the Barzilai-Borwein step size greater than min_stepsize
retraction_method – (ExponentialRetraction()) the retraction to use
strategy – (direct) defines if the new step size is computed using the direct, indirect or alternating strategy
stepsize_reduction – (0.5) step size reduction factor contained in the interval (0,1)
sufficient_decrease – (1e-4) sufficient decrease parameter contained in the interval (0,1)
vector_transport_method – (ParallelTransport()) the vector transport method to use
Furthermore the following fields act as safeguards
stop_when_stepsize_less - (0.0) smallest stepsize when to stop (the last one before is taken)
stop_when_stepsize_exceeds - (max_stepsize(M, p)) – largest stepsize when to stop.
stop_increasing_at_step - (100) last step to increase the stepsize (phase 1),
stop_decreasing_at_step - (1000) last step size to decrease the stepsize (phase 2),
Pass :Messages to a debug= to see @infos when these happen.
Constructor
NonmonotoneLinesearch()
with the Fields above in their order as optional arguments (deprecated).
NonmonotoneLinesearch(M)
with the Fields above in their order as keyword arguments and where the retraction and vector transport are set to the default ones on M, respectively.
The constructors return the functor to perform nonmonotone line search.
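The safeguarded Barzilai–Borwein step itself is easy to sketch in the Euclidean case, where the inner products are plain dot products (the function below is illustrative; defaults mirror bb_min_stepsize and bb_max_stepsize):

```julia
# Euclidean sketch of the (direct) Barzilai–Borwein step with clipping.
function bb_stepsize(s, y; α_min=1e-3, α_max=1e3)
    sy = sum(s .* y)
    sy > 0 || return α_max        # safeguard when ⟨s,y⟩ is not positive
    τ = sum(s .* s) / sy          # direct strategy τ = ⟨s,s⟩ / ⟨s,y⟩
    return min(α_max, max(α_min, τ))
end

s = [0.5, 0.5]    # transported step difference
y = [1.0, 0.25]   # gradient difference
bb_stepsize(s, y) # ≈ 0.8
```

The inverse strategy would use τ = ⟨s,y⟩ / ⟨y,y⟩ instead, and the alternating strategy switches between the two from iteration to iteration.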
where $x_+ = \operatorname{retr}_x(tη)$ is the current trial point, and $\text{V}$ is a vector transport, we perform the following algorithm, similar to Algorithm 7 from Huang, Thesis, 2014
set $α=0$, $β=∞$ and $t=1$.
While either $A(t)$ does not hold or $W(t)$ does not hold do steps 3-5.
If $A(t)$ fails, set $β=t$.
If $A(t)$ holds but $W(t)$ fails, set $α=t$.
If $β<∞$ set $t=\frac{α+β}{2}$, otherwise set $t=2α$.
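The bracketing loop in steps 2–5 can be sketched with abstract predicates A(t) (sufficient decrease) and W(t) (curvature condition) passed in as functions; this is an illustrative skeleton, not the package implementation:

```julia
# Sketch of the bisection bracketing: α=0, β=∞, t=1, then narrow the bracket.
function binary_bracket(A, W; maxiter=100)
    α, β, t = 0.0, Inf, 1.0
    for _ in 1:maxiter
        (A(t) && W(t)) && return t   # both conditions hold: accept t
        if !A(t)
            β = t                    # A fails: step too long, shrink from above
        else
            α = t                    # A holds but W fails: step too short
        end
        t = isinf(β) ? 2α : (α + β) / 2
    end
    return t
end

# toy predicates accepting any t in [0.3, 0.6]
A(t) = t ≤ 0.6
W(t) = t ≥ 0.3
binary_bracket(A, W)  # -> 0.5
```

As long as no upper bound β has been found the step is doubled (t = 2α); once a bracket exists, plain bisection takes over.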
Constructors
There exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and vector transport are used. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility.
Returns the default Stepsize functor used when running the solver specified by the AbstractManoptSolverState ams running with an objective on the AbstractManifold M.
return the stepsize stored within AbstractManoptSolverState ams when solving the AbstractManoptProblem amp. This method also works for decorated options and the Stepsize function within the options, by default stored in o.stepsize.
a retraction, which defaults to the default_retraction_method(M)
a search direction $η = -\operatorname{grad}F(x)$
an offset, $f_0 = F(x)$
And use the 4 keywords to limit the maximal increase and decrease steps as well as a maximal stepsize (especially on non-Hadamard manifolds) and a minimal one.
Return value
A stepsize s and a message msg (in case any of the 4 criteria hit)
G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023), preprint: [optimization-online.org/wp-content/uploads/2022/04/8864.pdf](https://optimization-online.org/wp-content/uploads/2022/04/8864.pdf).
An abstract type for the functors representing stopping criteria, i.e. they are callable structures. The naming scheme follows functions, see for example StopAfterIteration.
Every StoppingCriterion has to provide a constructor and its function has to have the interface (p,o,i), where an AbstractManoptProblem as well as an AbstractManoptSolverState and the current number of iterations are the arguments, and it returns a Bool whether to stop or not.
By default each StoppingCriterion should provide a field reason to provide details when a criterion is met (and that is empty otherwise).
An abstract type for a stopping criterion that itself consists of a set of stopping criteria. In total it acts as a stopping criterion itself. Examples are StopWhenAny and StopWhenAll that can be used to combine stopping criteria.
A stopping criterion s might have certain internal values to check against; this is done when calling it as a function s(amp::AbstractManoptProblem, ams::AbstractManoptSolverState), where the AbstractManoptProblem and the AbstractManoptSolverState together represent the current state of the solver. The functor returns false when the stopping criterion is not fulfilled and true otherwise. One field all criteria should have is s.reason, a string giving the reason to stop, see get_reason.
The following generic stopping criteria are available. Some require, for example, that the corresponding AbstractManoptSolverState has a field gradient when the criterion checks it.
Further stopping criteria might be available for individual solvers.
store a threshold for the complete runtime after which to stop. It uses time_ns() to measure the time; you provide a Period as a time limit, e.g. Minute(15)
Constructor
StopAfter(t)
initialize the stopping criterion to a Period t to stop after.
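As a short sketch (assuming Manopt.jl and the Dates standard library are loaded), a wall-clock limit can be set up as:

```julia
using Manopt, Dates

# stop once 15 minutes of runtime have elapsed
sc = StopAfter(Minute(15))

# Periods compose, so finer-grained limits work as well
sc_short = StopAfter(Minute(1) + Second(30))
```

Such a criterion is then passed to a solver like any other stopping criterion.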
store an array of StoppingCriterion elements and indicate to stop when all of them indicate to stop. The reason is given by the concatenation of all reasons.
Constructor
StopWhenAll(c::NTuple{N,StoppingCriterion} where N)
StopWhenAll(c::StoppingCriterion...)
store an array of StoppingCriterion elements and indicate to stop when any single one indicates to stop. The reason is given by the concatenation of all reasons (assuming that all non-indicating criteria return "").
Constructor
StopWhenAny(c::NTuple{N,StoppingCriterion} where N)
StopWhenAny(c::StoppingCriterion...)
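Manopt.jl also overloads `|` and `&` on stopping criteria as a shorthand for these combinations; a minimal sketch assuming Manopt.jl is loaded:

```julia
using Manopt

# stop when ANY criterion is met (shorthand for StopWhenAny)
sc_any = StopAfterIteration(200) | StopWhenGradientNormLess(1e-8)

# stop only when ALL criteria are met (shorthand for StopWhenAll)
sc_all = StopAfterIteration(20) & StopWhenCostLess(0.5)
```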
store a threshold and stop when the norm of the change of the optimization variable from within an AbstractManoptSolverState, i.e. get_iterate(o), falls below it. For the storage a StoreStateAction is used
initialize the stopping criterion to a threshold ε using the StoreStateAction a, which is initialized to store only :Iterate by default. You can also provide an inverse_retraction_method for the distance or a manifold to use its default inverse retraction.
store a threshold and stop when the cost function of the optimization problem from within an AbstractManoptProblem, i.e. get_cost(p, get_iterate(o)), falls below it.
Constructor
StopWhenCostLess(ε)
initialize the stopping criterion to a threshold ε.
Create a stopping criterion with threshold ε for the change of the gradient, that is, this criterion indicates to stop when the norm of the change of the gradient returned by get_gradient is less than ε, where vector_transport_method denotes the vector transport $\mathcal T$ used.
A stopping criterion based on the current gradient norm.
Constructor
StopWhenGradientNormLess(ε::Float64)
Create a stopping criterion with threshold ε for the gradient, that is, this criterion indicates to stop when get_gradient returns a gradient vector of norm less than ε.
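A typical use is to pass such a criterion to a solver via its stopping_criterion keyword. The following sketch (the cost and gradient mirror the Rayleigh quotient example from the tutorials; gradient_descent and the keyword follow the usual Manopt.jl solver interface) stops either after 500 iterations or once the gradient norm drops below 1e-8:

```julia
using Manopt, Manifolds, LinearAlgebra

n = 100
A = Symmetric(randn(n + 1, n + 1))
M = Sphere(n)

f(M, p) = p' * A * p                            # Rayleigh quotient on the sphere
grad_f(M, p) = 2 * (A * p - p * (p' * A * p))   # its Riemannian gradient

q = gradient_descent(
    M, f, grad_f, rand(M);
    stopping_criterion=StopAfterIteration(500) | StopWhenGradientNormLess(1e-8),
)
```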
There are a few functions to update, combine and modify stopping criteria, especially to update internal values even for stopping criteria already being used within an AbstractManoptSolverState structure.
returns all active stopping criteria, if any, that are within a StoppingCriterion c and indicated a stop, i.e. whose reason is nonempty. To be precise, for a simple stopping criterion this returns either an empty array if no stop is indicated, or the stopping criterion as the only element of an array. For a StoppingCriterionSet all internal (even nested) criteria that indicate to stop are returned.
Return whether (true) or not (false) a StoppingCriterion does always mean that, when it indicates to stop, the solver has converged to a minimizer or critical point.
Note that this is independent of the actual state of the stopping criterion, i.e. of whether some of them indicate to stop; it is a purely type-based, static decision.
Examples
With s1=StopAfterIteration(20) and s2=StopWhenGradientNormLess(1e-7) we have
indicates_convergence(s1) is false
indicates_convergence(s2) is true
indicates_convergence(s1 | s2) is false, since this might also stop after 20 iterations
indicates_convergence(s1 & s2) is true, since s2 is fulfilled if this stops.
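In code, the example above reads (a sketch assuming Manopt.jl is loaded; the expected values are the ones stated in the text):

```julia
using Manopt

s1 = StopAfterIteration(20)
s2 = StopWhenGradientNormLess(1e-7)

indicates_convergence(s1)      # false: an iteration cap says nothing about convergence
indicates_convergence(s2)      # true: a small gradient norm indicates a critical point
indicates_convergence(s1 | s2) # false: this might also stop after 20 iterations
indicates_convergence(s1 & s2) # true: s2 is fulfilled whenever this stops
```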
Update a value within a stopping criterion, specified by the symbol s, to v. If a criterion does not have a value assigned that corresponds to s, the update is ignored.
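For instance, to tighten a gradient-norm tolerance on a criterion that is already in use (a hedged sketch; the symbol :MinGradNorm is an assumption for the value carried by StopWhenGradientNormLess and may differ between Manopt.jl versions):

```julia
using Manopt

sc = StopWhenGradientNormLess(1e-6)
# tighten the threshold; criteria without a matching value ignore the update
update_stopping_criterion!(sc, :MinGradNorm, 1e-9)
```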
This is all literature mentioned / referenced in the Manopt.jl documentation. Usually you will find a small reference section at the end of every documentation page that contains references.
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
The default is a constant step size 1.\nstopping_criterion : (stopWhenAny( stopAtIteration(200), stopGradientNormLess(10.0^-8))) a function indicating when to stop.\nvector_transport_method – (default_vector_transport_method(M, typeof(p))) vector transport method to transport the old descent direction when computing the new descent direction.\n\nIf you provide the ManifoldGradientObjective directly, evaluation is ignored.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/conjugate_gradient_descent/#Manopt.conjugate_gradient_descent!","page":"Conjugate gradient descent","title":"Manopt.conjugate_gradient_descent!","text":"conjugate_gradient_descent!(M, F, gradF, x)\nconjugate_gradient_descent!(M, gradient_objective, p; kwargs...)\n\nperform a conjugate gradient based descent in place of x, i.e.\n\np_k+1 = operatornameretr_p_k bigl( s_kdelta_k bigr)\n\nwhere operatornameretr denotes a retraction on the Manifold M\n\nInput\n\nM : a manifold mathcal M\nf : a cost function Fmathcal Mℝ to minimize\ngrad_f: the gradient operatornamegradFmathcal M Tmathcal M of F\np : an initial value pmathcal M\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nfor more details and options, especially the DirectionUpdateRules, see conjugate_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/conjugate_gradient_descent/#State","page":"Conjugate gradient descent","title":"State","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"ConjugateGradientDescentState","category":"page"},{"location":"solvers/conjugate_gradient_descent/#Manopt.ConjugateGradientDescentState","page":"Conjugate gradient descent","title":"Manopt.ConjugateGradientDescentState","text":"ConjugateGradientState <: 
AbstractGradientSolverState\n\nspecify options for a conjugate gradient descent algorithm that solves a [DefaultManoptProblem].\n\nFields\n\np – the current iterate, a point on a manifold\nX – the current gradient, also denoted as ξ or X_k for the gradient in the kth step.\nδ – the current descent direction, i.e. also a tangent vector\nβ – the current update coefficient rule, see coefficient.\ncoefficient – (ConjugateDescentCoefficient()) a DirectionUpdateRule function to determine the new β\nstepsize – (default_stepsize(M, ConjugateGradientDescentState; retraction_method=retraction_method)) a Stepsize function\nstop – (StopAfterIteration(500) |StopWhenGradientNormLess(1e-8)) a StoppingCriterion\nretraction_method – (default_retraction_method(M, typeof(p))) a type of retraction\nvector_transport_method – (default_vector_transport_method(M, typeof(p))) a type of vector transport\n\nConstructor\n\nConjugateGradientState(M, p)\n\nwhere the last five fields above can be set by their names as keywords and the X can be set to a tangent vector type using the keyword initial_gradient which defaults to zero_vector(M,p), and δ is initialized to a copy of this vector.\n\nSee also\n\nconjugate_gradient_descent, DefaultManoptProblem, ArmijoLinesearch\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#cg-coeffs","page":"Conjugate gradient descent","title":"Available Coefficients","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"The update rules act as DirectionUpdateRule, which internally always first evaluate the gradient itself.","category":"page"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient 
descent","text":"ConjugateGradientBealeRestart\nConjugateDescentCoefficient\nDaiYuanCoefficient\nFletcherReevesCoefficient\nHagerZhangCoefficient\nHestenesStiefelCoefficient\nLiuStoreyCoefficient\nPolakRibiereCoefficient\nSteepestDirectionUpdateRule","category":"page"},{"location":"solvers/conjugate_gradient_descent/#Manopt.ConjugateGradientBealeRestart","page":"Conjugate gradient descent","title":"Manopt.ConjugateGradientBealeRestart","text":"ConjugateGradientBealeRestart <: DirectionUpdateRule\n\nAn update rule might require a restart, that is, using the pure gradient as descent direction, if the last two gradients are nearly orthogonal, cf. Hager, Zhang, Pacific J Optim, 2006, page 12 (in the pdf, 46 in Journal page numbers). This method is named after E. Beale from his proceedings paper in 1972 [Bea72]. This method acts as a decorator to any existing DirectionUpdateRule direction_update.\n\nFrom the ConjugateGradientDescentState cgs we obtain the last iterate and gradient p_kX_k and the current ones p_k+1X_k+1, respectively.\n\nThen a restart is performed, i.e. β_k = 0 is returned, if\n\n frac X_k+1 P_p_k+1gets p_kX_klVert X_k rVert_p_k ξ\n\nwhere P_agets b() denotes a vector transport from the tangent space at a to b, and ξ is the threshold. The default threshold is chosen as 0.2 as recommended in Powell, Math. 
Prog., 1977\n\nConstructor\n\nConjugateGradientBealeRestart(\n direction_update::D,\n threshold=0.2;\n manifold::AbstractManifold = DefaultManifold(),\n vector_transport_method::V=default_vector_transport_method(manifold),\n)\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.ConjugateDescentCoefficient","page":"Conjugate gradient descent","title":"Manopt.ConjugateDescentCoefficient","text":"ConjugateDescentCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Fletcher, 1987 adapted to manifolds:\n\nβ_k =\nfrac lVert X_k+1 rVert_p_k+1^2 \nlangle -delta_kX_k rangle_p_k\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nConjugateDescentCoefficient(a::StoreStateAction=())\n\nConstruct the conjugate descent coefficient update rule, a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.DaiYuanCoefficient","page":"Conjugate gradient descent","title":"Manopt.DaiYuanCoefficient","text":"DaiYuanCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Dai, Yuan, Siam J Optim, 1999 adapted to manifolds:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nThen the coefficient reads\n\nβ_k =\nfrac lVert X_k+1 rVert_p_k+1^2 \nlangle P_p_k+1gets p_kdelta_k nu_k rangle_p_k+1\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nfunction DaiYuanCoefficient(\n 
M::AbstractManifold=DefaultManifold(2);\n t::AbstractVectorTransportMethod=default_vector_transport_method(M)\n)\n\nConstruct the Dai Yuan coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.FletcherReevesCoefficient","page":"Conjugate gradient descent","title":"Manopt.FletcherReevesCoefficient","text":"FletcherReevesCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Fletcher, Reeves, Comput. J, 1964 adapted to manifolds:\n\nβ_k =\nfraclVert X_k+1rVert_p_k+1^2lVert X_krVert_p_k^2\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nFletcherReevesCoefficient(a::StoreStateAction=())\n\nConstruct the Fletcher Reeves coefficient update rule, a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.HagerZhangCoefficient","page":"Conjugate gradient descent","title":"Manopt.HagerZhangCoefficient","text":"HagerZhangCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Hager, Zhang, SIAM J Optim, 2005. 
adapted to manifolds: let nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nβ_k = Bigllanglenu_k -\nfrac 2lVert nu_krVert_p_k+1^2 langle P_p_k+1gets p_kdelta_k nu_k rangle_p_k+1 \nP_p_k+1gets p_kdelta_k\nfracX_k+1 langle P_p_k+1gets p_kdelta_k nu_k rangle_p_k+1 \nBigrrangle_p_k+1\n\nThis method includes a numerical stabilization proposed by those authors.\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nfunction HagerZhangCoefficient(t::AbstractVectorTransportMethod)\nfunction HagerZhangCoefficient(M::AbstractManifold = DefaultManifold(2))\n\nConstruct the Hager Zhang coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.HestenesStiefelCoefficient","page":"Conjugate gradient descent","title":"Manopt.HestenesStiefelCoefficient","text":"HestenesStiefelCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Hestenes, Stiefel, J. Research Nat. Bur. Standards, 1952 adapted to manifolds as follows:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k. 
Then the update reads\n\nβ_k = fraclangle X_k+1 nu_k rangle_p_k+1 \n langle P_p_k+1gets p_k delta_k nu_krangle_p_k+1 \n\nwhere P_agets b() denotes a vector transport from the tangent space at a to b.\n\nConstructor\n\nfunction HestenesStiefelCoefficient(transport_method::AbstractVectorTransportMethod)\nfunction HestenesStiefelCoefficient(M::AbstractManifold = DefaultManifold(2))\n\nConstruct the Hestenes Stiefel coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\nSee also conjugate_gradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.LiuStoreyCoefficient","page":"Conjugate gradient descent","title":"Manopt.LiuStoreyCoefficient","text":"LiuStoreyCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Liu, Storey, J. Optim. 
Theory Appl., 1991 adapted to manifolds:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nThen the coefficient reads\n\nβ_k = -\nfrac langle X_k+1nu_k rangle_p_k+1 \nlangle delta_kX_k rangle_p_k\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nfunction LiuStoreyCoefficient(t::AbstractVectorTransportMethod)\nfunction LiuStoreyCoefficient(M::AbstractManifold = DefaultManifold(2))\n\nConstruct the Liu Storey coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.PolakRibiereCoefficient","page":"Conjugate gradient descent","title":"Manopt.PolakRibiereCoefficient","text":"PolakRibiereCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentStatecgds include the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Polak, Ribiere, ESAIM Math. Modelling Num. Anal., 1969 and Polyak, USSR Comp. Math. Math. 
Phys., 1969 adapted to manifolds:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nThen the update reads\n\nβ_k =\nfrac langle X_k+1 nu_k rangle_p_k+1 \nlVert X_k rVert_p_k^2 \n\nConstructor\n\nfunction PolakRibiereCoefficient(\n M::AbstractManifold=DefaultManifold(2);\n t::AbstractVectorTransportMethod=default_vector_transport_method(M)\n)\n\nConstruct the PolakRibiere coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\nSee also conjugate_gradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.SteepestDirectionUpdateRule","page":"Conjugate gradient descent","title":"Manopt.SteepestDirectionUpdateRule","text":"SteepestDirectionUpdateRule <: DirectionUpdateRule\n\nThe simplest rule to update is to have no influence of the last direction and hence return an update β = 0 for all ConjugateGradientDescentStatecgds\n\nSee also conjugate_gradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Literature","page":"Conjugate gradient descent","title":"Literature","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"Pages = [\"solvers/conjugate_gradient_descent.md\"]\nCanonical=false","category":"page"},{"location":"helpers/errorMeasures/#ErrorMeasures","page":"Error Measures","title":"Error Measures","text":"","category":"section"},{"location":"helpers/errorMeasures/","page":"Error Measures","title":"Error Measures","text":"meanSquaredError\nmeanAverageError","category":"page"},{"location":"helpers/errorMeasures/#Manopt.meanSquaredError","page":"Error Measures","title":"Manopt.meanSquaredError","text":"meanSquaredError(M, p, q)\n\nCompute the (mean) squared error between the two points p and q on the (power) manifold 
M.\n\n\n\n\n\n","category":"function"},{"location":"helpers/errorMeasures/#Manopt.meanAverageError","page":"Error Measures","title":"Manopt.meanAverageError","text":"meanAverageError(M,x,y)\n\nCompute the (mean) average error between the two points x and y on the PowerManifold manifold M.\n\n\n\n\n\n","category":"function"},{"location":"functions/adjoint_differentials/#adjointDifferentialFunctions","page":"Adjoint Differentials","title":"Adjoint Differentials","text":"","category":"section"},{"location":"functions/adjoint_differentials/","page":"Adjoint Differentials","title":"Adjoint Differentials","text":"Modules = [Manopt]\nPages = [\"adjoint_differentials.jl\"]","category":"page"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, AbstractVector, AbstractVector}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(\n M::AbstractManifold,\n T::AbstractVector,\n X::AbstractVector,\n)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::AbstractVector{<:BezierSegment},\n T::AbstractVector,\n X::AbstractVector,\n)\n\nEvaluate the adjoint of the differential with respect to the control points at several times T. 
This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, Any, Any}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n t,\n X\n)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::AbstractVector{<:BezierSegment},\n B::AbstractVector{<:BezierSegment},\n t,\n X\n)\n\nevaluate the adjoint of the differential of a composite Bézier curve on the manifold M with respect to its control points b based on a points T=(t_i)_i=1^n that are pointwise in t_i01 on the curve and given corresponding tangential vectors X = (η_i)_i=1^n, η_iT_β(t_i)mathcal M This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, BezierSegment, AbstractVector, AbstractVector}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(\n M::AbstractManifold,\n b::BezierSegment,\n t::AbstractVector,\n X::AbstractVector,\n)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::BezierSegment,\n b::BezierSegment,\n t::AbstractVector,\n X::AbstractVector,\n)\n\nevaluate the adjoint of the differential of a Bézier curve on the manifold M with respect to its control points b based on a points T=(t_i)_i=1^n that are pointwise in t_i01 on the curve and given corresponding tangential vectors X = (η_i)_i=1^n, η_iT_β(t_i)mathcal M This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve and Bergmann, Gousenbourger, Front. Appl. Math. 
Stat., 2018\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, BezierSegment, Any, Any}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(M::AbstractManifold, b::BezierSegment, t, η)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::BezierSegment,\n b::BezierSegment,\n t,\n η,\n)\n\nevaluate the adjoint of the differential of a Bézier curve on the manifold M with respect to its control points b based on a point t01 on the curve and a tangent vector ηT_β(t)mathcal M. This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_forward_logs-Union{Tuple{TPR}, Tuple{TSize}, Tuple{TM}, Tuple{𝔽}, Tuple{PowerManifold{𝔽, TM, TSize, TPR}, Any, Any}} where {𝔽, TM, TSize, TPR}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_forward_logs","text":"Y = adjoint_differential_forward_logs(M, p, X)\nadjoint_differential_forward_logs!(M, Y, p, X)\n\nCompute the adjoint differential of forward_logs F, which occurs in the power manifold array p, i.e. of the function\n\nF_i(p) = sum_j mathcal I_i log_p_i p_j\n\nwhere i runs over all indices of the PowerManifold manifold M and mathcal I_i denotes the forward neighbors of i. Let n be the number of dimensions of the PowerManifold manifold (i.e. length(size(x))). Then the input tangent vector lies on the manifold mathcal M = mathcal M^n. 
The adjoint differential can be computed in place of Y.\n\nInput\n\nM – a PowerManifold manifold\np – an array of points on a manifold\nX – a tangent vector from the n-fold power of p, where n is the ndims of p\n\nOutput\n\nY – resulting tangent vector in T_pmathcal M representing the adjoint differentials of the logs.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Literature","page":"Adjoint Differentials","title":"Literature","text":"","category":"section"},{"location":"functions/adjoint_differentials/","page":"Adjoint Differentials","title":"Adjoint Differentials","text":"Pages = [\"functions/adjoint_differentials.md\"]\nCanonical=false","category":"page"},{"location":"solvers/gradient_descent/#GradientDescentSolver","page":"Gradient Descent","title":"Gradient Descent","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":" gradient_descent\n gradient_descent!","category":"page"},{"location":"solvers/gradient_descent/#Manopt.gradient_descent","page":"Gradient Descent","title":"Manopt.gradient_descent","text":"gradient_descent(M, f, grad_f, p=rand(M); kwargs...)\ngradient_descent(M, gradient_objective, p=rand(M); kwargs...)\n\nperform a gradient descent\n\np_k+1 = operatornameretr_p_kbigl( s_koperatornamegradf(p_k) bigr)\nqquad k=01\n\nwith different choices of the stepsize s_k available (see stepsize option below).\n\nInput\n\nM – a manifold mathcal M\nf – a cost function f mathcal Mℝ to find a minimizer p^* for\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of f\nas a function (M, p) -> X or a function (M, X, p) -> X\np – an initial value p = p_0 mathcal M\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective 
directly.\n\nOptional\n\ndirection – (IdentityUpdateRule) perform a processing of the direction, e.g.\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form grad_f(M, p) or InplaceEvaluation in place, i.e. is of the form grad_f!(M, X, p).\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use\nstepsize – (ConstantStepsize(1.)) specify a Stepsize functor.\nstopping_criterion – (StopWhenAny(StopAfterIteration(200),StopWhenGradientNormLess(10.0^-8))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nIf you provide the ManifoldGradientObjective directly, evaluation is ignored.\n\nAll other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective, respectively. If you provide the ManifoldGradientObjective directly, these decorations can still be specified\n\nOutput\n\nthe obtained (approximate) minimizer p^*. To obtain the whole final state of the solver, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/gradient_descent/#Manopt.gradient_descent!","page":"Gradient Descent","title":"Manopt.gradient_descent!","text":"gradient_descent!(M, f, grad_f, p; kwargs...)\ngradient_descent!(M, gradient_objective, p; kwargs...)\n\nperform a gradient_descent\n\np_k+1 = operatornameretr_p_kbigl( s_koperatornamegradf(p_k) bigr)\n\nin place of p with different choices of s_k available.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\ngrad_f – the gradient operatornamegradFmathcal M Tmathcal M of F\np – an initial value p mathcal M\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nFor more options, especially Stepsizes for s_k, see gradient_descent\n\n\n\n\n\n","category":"function"},{"location":"solvers/gradient_descent/#State","page":"Gradient 
Descent","title":"State","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"GradientDescentState","category":"page"},{"location":"solvers/gradient_descent/#Manopt.GradientDescentState","page":"Gradient Descent","title":"Manopt.GradientDescentState","text":"GradientDescentState{P,T} <: AbstractGradientSolverState\n\nDescribes a Gradient based descent algorithm, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\np – (rand(M)) the current iterate\nX – (zero_vector(M,p)) the current gradient operatornamegradf(p), initialised to the zero vector.\nstopping_criterion – (StopAfterIteration(100)) a StoppingCriterion\nstepsize – (default_stepsize(M, GradientDescentState)) a Stepsize\ndirection - (IdentityUpdateRule) a processor to compute the gradient\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use, defaults to the default set for your manifold.\n\nConstructor\n\nGradientDescentState(M, p=rand(M); X=zero_vector(M, p), kwargs...)\n\nGenerate gradient descent options, where X can be used to set the tangent vector to store the gradient in a certain type; it will be initialised accordingly at a later stage. 
All following fields are keyword arguments.\n\nSee also\n\ngradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Direction-Update-Rules","page":"Gradient Descent","title":"Direction Update Rules","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"A field of the options is the direction, a DirectionUpdateRule, which by default IdentityUpdateRule just evaluates the gradient but can be enhanced for example to","category":"page"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"DirectionUpdateRule\nIdentityUpdateRule\nMomentumGradient\nAverageGradient\nNesterov","category":"page"},{"location":"solvers/gradient_descent/#Manopt.DirectionUpdateRule","page":"Gradient Descent","title":"Manopt.DirectionUpdateRule","text":"DirectionUpdateRule\n\nA general functor, that handles direction update rules. It's field(s) is usually only a StoreStateAction by default initialized to the fields required for the specific coefficient, but can also be replaced by a (common, global) individual one that provides these values.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.IdentityUpdateRule","page":"Gradient Descent","title":"Manopt.IdentityUpdateRule","text":"IdentityUpdateRule <: DirectionUpdateRule\n\nThe default gradient direction update is the identity, i.e. 
it just evaluates the gradient.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.MomentumGradient","page":"Gradient Descent","title":"Manopt.MomentumGradient","text":"MomentumGradient <: DirectionUpdateRule\n\nAppend a momentum to a gradient processor, where the last direction and last iterate are stored and the new one is composed as η_i = m*η_i-1 - s d_i, where sd_i is the current (inner) direction and η_i-1 is the vector transported last direction multiplied by momentum m.\n\nFields\n\np_old - (rand(M)) remember the last iterate for parallel transporting the last direction\nmomentum – (0.2) factor for momentum\ndirection – internal DirectionUpdateRule to determine directions to add the momentum to.\nvector_transport_method – default_vector_transport_method(M, typeof(p)) vector transport method to use\nX_old – (zero_vector(M,x0)) the last gradient/direction update added as momentum\n\nConstructors\n\nAdd momentum to a gradient problem, where by default just a gradient evaluation is used\n\nMomentumGradient(\n M::AbstractManifold;\n p=rand(M),\n s::DirectionUpdateRule=IdentityUpdateRule();\n X=zero_vector(p.M, x0), momentum=0.2\n vector_transport_method=default_vector_transport_method(M, typeof(p)),\n)\n\nInitialize a momentum gradient rule to s. Note that the keyword arguments p and X will be overridden often, so their initialisation is meant to set them to certain types of points or tangent vectors, if you do not use the default types with respect to M.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.AverageGradient","page":"Gradient Descent","title":"Manopt.AverageGradient","text":"AverageGradient <: DirectionUpdateRule\n\nAdd an average of gradients to a gradient processor. 
A set of previous directions (from the inner processor) and the last iterate are stored, average is taken after vector transporting them to the current iterates tangent space.\n\nFields\n\ngradients – (fill(zero_vector(M,x0),n)) the last n gradient/direction updates\nlast_iterate – last iterate (needed to transport the gradients)\ndirection – internal DirectionUpdateRule to determine directions to apply the averaging to\nvector_transport_method - vector transport method to use\n\nConstructors\n\nAverageGradient(\n M::AbstractManifold,\n p::P=rand(M);\n n::Int=10\n s::DirectionUpdateRule=IdentityUpdateRule();\n gradients = fill(zero_vector(p.M, o.x),n),\n last_iterate = deepcopy(x0),\n vector_transport_method = default_vector_transport_method(M, typeof(p))\n)\n\nAdd average to a gradient problem, where\n\nn determines the size of averaging\ns is the internal DirectionUpdateRule to determine the gradients to store\ngradients can be prefilled with some history\nlast_iterate stores the last iterate\nvector_transport_method determines how to transport all gradients to the current iterates tangent space before averaging\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.Nesterov","page":"Gradient Descent","title":"Manopt.Nesterov","text":"Nesterov <: DirectionUpdateRule\n\nFields\n\nγ\nμ the strong convexity coefficient\nv (==v_k, v_0=x_0) an interims point to compute the next gradient evaluation point y_k\nshrinkage (= i -> 0.8) a function to compute the shrinkage β_k per iterate.\n\nLet's assume f is L-Lipschitz and μ-strongly convex. Given\n\na step size h_kfrac1L (from the GradientDescentState\na shrinkage parameter β_k\nand a current iterate x_k\nas well as the interims values γ_k and v_k from the previous iterate.\n\nThis compute a Nesterov type update using the following steps, see Zhang, Sra, Preprint, 2018\n\nCompute the positive root, i.e. 
α_k(01) of α^2 = h_kbigl((1-α_k)γ_k+α_k μbigr).\nSet bar γ_k+1 = (1-α_k)γ_k + α_kμ\ny_k = operatornameretr_x_kBigl(fracα_kγ_kγ_k + α_kμoperatornameretr^-1_x_kv_k Bigr)\nx_k+1 = operatornameretr_y_k(-h_k operatornamegradf(y_k))\nv_k+1 = operatornameretr_y_kBigl(frac(1-α_k)γ_kbarγ_koperatornameretr_y_k^-1(v_k) - fracα_kbar γ_k+1operatornamegradf(y_k) Bigr)\nγ_k+1 = frac11+β_kbar γ_k+1\n\nThen the direction from x_k to x_k+1, i.e. d = operatornameretr^-1_x_kx_k+1 is returned.\n\nConstructor\n\nNesterov(M::AbstractManifold, p::P; γ=0.001, μ=0.9, shrinkage = k -> 0.8;\n inverse_retraction_method=LogarithmicInverseRetraction())\n\nInitialize the Nesterov acceleration, where x0 initializes v.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Debug-Actions","page":"Gradient Descent","title":"Debug Actions","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"DebugGradient\nDebugGradientNorm\nDebugStepsize","category":"page"},{"location":"solvers/gradient_descent/#Manopt.DebugGradient","page":"Gradient Descent","title":"Manopt.DebugGradient","text":"DebugGradient <: DebugAction\n\ndebug for the gradient evaluated at the current iterate\n\nConstructors\n\nDebugGradient(; long=false, prefix= , format= \"$prefix%s\", io=stdout)\n\ndisplay the short (false) or long (true) default text for the gradient, or set the prefix manually. 
Alternatively the complete format can be set.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.DebugGradientNorm","page":"Gradient Descent","title":"Manopt.DebugGradientNorm","text":"DebugGradientNorm <: DebugAction\n\ndebug for the norm of the gradient evaluated at the current iterate.\n\nConstructors\n\nDebugGradientNorm([long=false,p=print])\n\ndisplay the short (false) or long (true) default text for the gradient norm.\n\nDebugGradientNorm(prefix[, p=print])\n\ndisplay a prefix in front of the gradient norm.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.DebugStepsize","page":"Gradient Descent","title":"Manopt.DebugStepsize","text":"DebugStepsize <: DebugAction\n\ndebug for the current step size.\n\nConstructors\n\nDebugStepsize(;long=false,prefix=\"step size:\", format=\"$prefix%s\", io=stdout)\n\ndisplay a prefix in front of the step size.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Record-Actions","page":"Gradient Descent","title":"Record Actions","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"RecordGradient\nRecordGradientNorm\nRecordStepsize","category":"page"},{"location":"solvers/gradient_descent/#Manopt.RecordGradient","page":"Gradient Descent","title":"Manopt.RecordGradient","text":"RecordGradient <: RecordAction\n\nrecord the gradient evaluated at the current iterate\n\nConstructors\n\nRecordGradient(ξ)\n\ninitialize the RecordAction to the corresponding type of the tangent vector.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.RecordGradientNorm","page":"Gradient Descent","title":"Manopt.RecordGradientNorm","text":"RecordGradientNorm <: RecordAction\n\nrecord the norm of the current gradient\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.RecordStepsize","page":"Gradient 
Descent","title":"Manopt.RecordStepsize","text":"RecordStepsize <: RecordAction\n\nrecord the step size\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Literature","page":"Gradient Descent","title":"Literature","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"Pages = [\"solvers/gradient_descent.md\"]\nCanonical=false\n\nLuenberger:1972","category":"page"},{"location":"solvers/#SolversSection","page":"Introduction","title":"Solvers","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Solvers can be applied to AbstractManoptProblems with solver specific AbstractManoptSolverState.","category":"page"},{"location":"solvers/#List-of-Algorithms","page":"Introduction","title":"List of Algorithms","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"The following algorithms are currently available","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Solver Function & State Objective\nAlternating Gradient Descent alternating_gradient_descent AlternatingGradientDescentState f=(f_1ldotsf_n), operatornamegrad f_i\nChambolle-Pock ChambollePock, ChambollePockState (using TwoManifoldProblem) f=F+G(Λcdot), operatornameprox_σ F, operatornameprox_τ G^*, Λ\nConjugate Gradient Descent conjugate_gradient_descent, ConjugateGradientDescentState f, operatornamegrad f\nCyclic Proximal Point cyclic_proximal_point, CyclicProximalPointState f=sum f_i, operatornameprox_lambda f_i\nDifference of Convex Algorithm difference_of_convex_algorithm, DifferenceOfConvexState f=g-h, h, and e.g. 
g, operatornamegrad g\nDifference of Convex Proximal Point difference_of_convex_proximal_point, DifferenceOfConvexProximalState f=g-h, h, and e.g. g, operatornamegrad g\nDouglas–Rachford DouglasRachford, DouglasRachfordState f=sum f_i, operatornameprox_lambda f_i\nExact Penalty Method exact_penalty_method, ExactPenaltyMethodState f, operatornamegrad f, g, operatornamegrad g_i, h, operatornamegrad h_j\nFrank-Wolfe algorithm Frank_Wolfe_method, FrankWolfeState sub-problem solver\nGradient Descent gradient_descent, GradientDescentState f, operatornamegrad f\nLevenberg-Marquardt LevenbergMarquardt, LevenbergMarquardtState f = sum_i f_i operatornamegrad f_i (Jacobian)\nNelder-Mead NelderMead, NelderMeadState f\nAugmented Lagrangian Method augmented_Lagrangian_method, AugmentedLagrangianMethodState f, operatornamegrad f, g, operatornamegrad g_i, h, operatornamegrad h_j\nParticle Swarm particle_swarm, ParticleSwarmState f\nPrimal-dual Riemannian semismooth Newton Algorithm primal_dual_semismooth_Newton, PrimalDualSemismoothNewtonState (using TwoManifoldProblem) f=F+G(Λcdot), operatornameprox_σ F & diff., operatornameprox_τ G^* & diff., Λ\nQuasi-Newton Method quasi_Newton, QuasiNewtonState f, operatornamegrad f\nSteihaug-Toint Truncated Conjugate-Gradient Method truncated_conjugate_gradient_descent, TruncatedConjugateGradientState f, operatornamegrad f, operatornameHess f\nSubgradient Method subgradient_method, SubGradientMethodState f, f\nStochastic Gradient Descent stochastic_gradient_descent, StochasticGradientDescentState f = sum_i f_i, operatornamegrad f_i\nThe Riemannian Trust-Regions Solver trust_regions, TrustRegionsState f, operatornamegrad f, operatornameHess f","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Note that the solvers (their AbstractManoptSolverState, to be precise) can also be decorated to enhance your algorithm by general additional properties, see debug output and recording values. 
This is done using the debug= and record= keywords in the function calls. Similarly, since 0.4 we provide a (simple) caching of the objective function using the cache= keyword in any of the function calls.","category":"page"},{"location":"solvers/#Technical-Details","page":"Introduction","title":"Technical Details","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"The main function a solver calls is","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"solve!(p::AbstractManoptProblem, s::AbstractManoptSolverState)","category":"page"},{"location":"solvers/#Manopt.solve!-Tuple{AbstractManoptProblem, AbstractManoptSolverState}","page":"Introduction","title":"Manopt.solve!","text":"solve!(p::AbstractManoptProblem, s::AbstractManoptSolverState)\n\nrun the solver implemented for the AbstractManoptProblem p and the AbstractManoptSolverState s employing initialize_solver!, step_solver!, as well as the stop_solver! of the solver.\n\n\n\n\n\n","category":"method"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"which is a framework that you in general should not change or redefine. 
It uses the following methods, which also need to be implemented for your own algorithm, if you want to provide one.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"initialize_solver!\nstep_solver!\nget_solver_result\nget_solver_return\nstop_solver!(p::AbstractManoptProblem, s::AbstractManoptSolverState, Any)","category":"page"},{"location":"solvers/#Manopt.initialize_solver!","page":"Introduction","title":"Manopt.initialize_solver!","text":"initialize_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState)\n\nInitialize the solver to the optimization AbstractManoptProblem amp by initializing the necessary values in the AbstractManoptSolverState ams.\n\n\n\n\n\ninitialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)\n\nExtend the initialization of the solver by a hook to run debug actions that were added to the :Start and :All entries of the debug lists.\n\n\n\n\n\ninitialize_solver!(amp::AbstractManoptProblem, rss::RecordSolverState)\n\nExtend the initialization of the solver by a hook to run record actions that were added to the :Start entry.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.step_solver!","page":"Introduction","title":"Manopt.step_solver!","text":"step_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i)\n\nDo one iteration step (the ith) for an AbstractManoptProblem amp by modifying the values in the AbstractManoptSolverState ams.\n\n\n\n\n\nstep_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\n\nExtend the ith step of the solver by a hook to run debug prints that were added to the :Step and :All entries of the debug lists.\n\n\n\n\n\nstep_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, i)\n\nExtend the ith step of the solver by a hook to run records that were added to the :Iteration 
entry.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.get_solver_result","page":"Introduction","title":"Manopt.get_solver_result","text":"get_solver_result(ams::AbstractManoptSolverState)\nget_solver_result(tos::Tuple{AbstractManifoldObjective,AbstractManoptSolverState})\nget_solver_result(o::AbstractManifoldObjective, s::AbstractManoptSolverState)\n\nReturn the final result after all iterations that is stored within the AbstractManoptSolverState ams, which was modified during the iterations.\n\nIn case the objective is passed as well, it is by default ignored and the solver result for the state is returned.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.get_solver_return","page":"Introduction","title":"Manopt.get_solver_return","text":"get_solver_return(s::AbstractManoptSolverState)\nget_solver_return(o::AbstractManifoldObjective, s::AbstractManoptSolverState)\n\ndetermine the result value of a call to a solver. By default this returns the same as get_solver_result, i.e. the last iterate or (approximate) minimizer.\n\nget_solver_return(s::ReturnSolverState)\nget_solver_return(o::AbstractManifoldObjective, s::ReturnSolverState)\n\nreturn the internally stored state of the ReturnSolverState instead of the minimizer. 
This means that when the state is decorated like this, the user still has to call get_solver_result on the internal state separately.\n\nget_solver_return(o::ReturnManifoldObjective, s::AbstractManoptSolverState)\n\nreturn both the objective and the state as a tuple.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.stop_solver!-Tuple{AbstractManoptProblem, AbstractManoptSolverState, Any}","page":"Introduction","title":"Manopt.stop_solver!","text":"stop_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i)\n\ndepending on the current AbstractManoptProblem amp, the current state of the solver stored in AbstractManoptSolverState ams and the current iteration i this function determines whether to stop the solver, which by default means to call the internal StoppingCriterion ams.stop.\n\n\n\n\n\n","category":"method"},{"location":"solvers/#API-for-solvers","page":"Introduction","title":"API for solvers","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"This is a short overview of the different types of high-level functions that are usually available for a solver. Let's assume the solver is called new_solver and requires a cost f and some first order information df as well as a starting point p on M. Together, f and df form the objective, called obj.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Then there are basically two different variants to call","category":"page"},{"location":"solvers/#The-easy-to-access-call","page":"Introduction","title":"The easy to access call","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"new_solver(M, f, df, p=rand(M); kwargs...)\nnew_solver!(M, f, df, p; kwargs...)","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Here, the start point should be optional. 
Keyword arguments include the type of evaluation, decorators like debug= or record=, as well as algorithm-specific ones. If you provide an immutable point p, or the rand(M) point is immutable (like on the Circle()), this method should turn the point into a mutable one as well.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"The second variant works in place of p, so it is mandatory.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"This first interface would set up the objective and pass all keywords on to the objective-based call.","category":"page"},{"location":"solvers/#The-objective-based-call","page":"Introduction","title":"The objective-based call","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"new_solver(M, obj, p=rand(M); kwargs...)\nnew_solver!(M, obj, p; kwargs...)","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Here the objective would be created beforehand, e.g. to compare different solvers on the same objective, and for the first variant the start point is optional. Keyword arguments include decorators like debug= or record= as well as algorithm-specific ones.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"This variant would generate the problem and the state and check the validity of all provided keyword arguments that affect the state. 
Then it would call the iteration process.","category":"page"},{"location":"solvers/#The-manual-call","page":"Introduction","title":"The manual call","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"If you generate the corresponding problem and state as the previous step does, you can also use the third (lowest level) variant and just call","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"solve!(problem, state)","category":"page"},{"location":"functions/gradients/#GradientFunctions","page":"Gradients","title":"Gradients","text":"","category":"section"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"For a function fmathcal Mℝ the Riemannian gradient operatornamegradf(x) at xmathcal M is given by the unique tangent vector fulfilling","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"langle operatornamegradf(x) ξrangle_x = D_xfξquad\nforall ξ T_xmathcal M","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"where D_xfξ denotes the differential of f at x with respect to the tangent direction (vector) ξ or in other words the directional derivative.","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"This page collects the available gradients.","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"Modules = [Manopt]\nPages = [\"gradients.jl\"]","category":"page"},{"location":"functions/gradients/#Manopt.forward_logs-Union{Tuple{TPR}, Tuple{TSize}, Tuple{TM}, Tuple{𝔽}, Tuple{PowerManifold{𝔽, TM, TSize, TPR}, Any}} where {𝔽, TM, TSize, TPR}","page":"Gradients","title":"Manopt.forward_logs","text":"Y = forward_logs(M,x)\nforward_logs!(M, Y, x)\n\ncompute the forward logs F (generalizing forward differences) occurring in the power manifold array, i.e. the 
function\n\nF_i(x) = sum_j mathcal I_i log_x_i x_jquad i mathcal G\n\nwhere mathcal G is the set of indices of the PowerManifold manifold M and mathcal I_i denotes the forward neighbors of i. This can also be done in place of Y.\n\nInput\n\nM – a PowerManifold manifold\nx – a point.\n\nOutput\n\nY – resulting tangent vector in T_xmathcal M representing the logs, where mathcal N is the power manifold with the number of dimensions added to size(x). The computation can be done in place of Y.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_L2_acceleration_bezier-Union{Tuple{P}, Tuple{AbstractManifold, AbstractVector{P}, AbstractVector{<:Integer}, AbstractVector, Any, AbstractVector{P}}} where P","page":"Gradients","title":"Manopt.grad_L2_acceleration_bezier","text":"grad_L2_acceleration_bezier(\n    M::AbstractManifold,\n    B::AbstractVector{P},\n    degrees::AbstractVector{<:Integer},\n    T::AbstractVector,\n    λ,\n    d::AbstractVector{P}\n) where {P}\n\ncompute the gradient of the discretized acceleration of a composite Bézier curve on the Manifold M with respect to its control points B together with a data term that relates the junction points p_i to the data d with a weight λ compared to the acceleration. The curve is evaluated at the points given in pts (elementwise in 0N), where N is the number of segments of the Bézier curve. The summands are grad_distance for the data term and grad_acceleration_bezier for the acceleration with interpolation constraints. Here the get_bezier_junctions are included in the optimization, i.e. setting λ=0 yields the unconstrained acceleration minimization. 
Note that this is ill-posed, since any Bézier curve identical to a geodesic is a minimizer.\n\nNote that the Bézier curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments; assuming a differentiable curve, the segments can internally be reconstructed.\n\nSee also\n\ngrad_acceleration_bezier, cost_L2_acceleration_bezier, cost_acceleration_bezier.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_TV","page":"Gradients","title":"Manopt.grad_TV","text":"X = grad_TV(M, λ, x[, p=1])\ngrad_TV!(M, X, λ, x[, p=1])\n\nCompute the (sub)gradient partial F of all forward differences occurring in the power manifold array, i.e. of the function\n\nF(x) = sum_isum_j mathcal I_i d^p(x_ix_j)\n\nwhere i runs over all indices of the PowerManifold manifold M and mathcal I_i denotes the forward neighbors of i.\n\nInput\n\nM – a PowerManifold manifold\nx – a point.\n\nOutput\n\nX – resulting tangent vector in T_xmathcal M. The computation can also be done in place.\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_TV-Union{Tuple{T}, Tuple{AbstractManifold, Tuple{T, T}}, Tuple{AbstractManifold, Tuple{T, T}, Any}} where T","page":"Gradients","title":"Manopt.grad_TV","text":"X = grad_TV(M, (x,y)[, p=1])\ngrad_TV!(M, X, (x,y)[, p=1])\n\ncompute the (sub) gradient of frac1pd^p_mathcal M(xy) with respect to both x and y (in place of X and Y).\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_TV2","page":"Gradients","title":"Manopt.grad_TV2","text":"Y = grad_TV2(M, q[, p=1])\ngrad_TV2!(M, Y, q[, p=1])\n\ncomputes the (sub) gradient of frac1pd_2^p(q_1 q_2 q_3) with respect to all three components of qmathcal M^3, where d_2 denotes the second order absolute difference using the mid point model, i.e. let\n\nmathcal C = bigl c mathcal M g(tfrac12q_1q_3) text for some geodesic gbigr\n\ndenote the mid points between q_1 and q_3 on the manifold mathcal M. 
Then the absolute second order difference is defined as\n\nd_2(q_1q_2q_3) = min_c mathcal C_q_1q_3 d(c q_2)\n\nWhile the (sub)gradient with respect to q_2 is easy, the other two require the evaluation of an adjoint_Jacobi_field.\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_TV2-2","page":"Gradients","title":"Manopt.grad_TV2","text":"grad_TV2(M::PowerManifold, q[, p=1])\n\ncomputes the (sub) gradient of frac1pd_2^p(q_1q_2q_3) with respect to all q_1q_2q_3 occurring along any array dimension in the point q, where M is the corresponding PowerManifold.\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_acceleration_bezier-Tuple{AbstractManifold, AbstractVector, AbstractVector{<:Integer}, AbstractVector}","page":"Gradients","title":"Manopt.grad_acceleration_bezier","text":"grad_acceleration_bezier(\n M::AbstractManifold,\n B::AbstractVector,\n degrees::AbstractVector{<:Integer}\n T::AbstractVector\n)\n\ncompute the gradient of the discretized acceleration of a (composite) Bézier curve c_B(t) on the Manifold M with respect to its control points B given as a point on the PowerManifold assuming C1 conditions and known degrees. The curve is evaluated at the points given in T (elementwise in 0N, where N is the number of segments of the Bézier curve). The get_bezier_junctions are fixed for this gradient (interpolation constraint). For the unconstrained gradient, see grad_L2_acceleration_bezier and set λ=0 therein. This gradient is computed using adjoint_Jacobi_fields. For details, see Bergmann, Gousenbourger, Front. Appl. Math. Stat., 2018. 
See de_casteljau for more details on the curve.\n\nSee also\n\ncost_acceleration_bezier, grad_L2_acceleration_bezier, cost_L2_acceleration_bezier.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_distance","page":"Gradients","title":"Manopt.grad_distance","text":"grad_distance(M,y,x[, p=2])\ngrad_distance!(M,X,y,x[, p=2])\n\ncompute the (sub)gradient of the distance (squared), in place of X.\n\nf(x) = frac1p d^p_mathcal M(xy)\n\nto a fixed point y on the manifold M and p is an integer. The gradient reads\n\n operatornamegradf(x) = -d_mathcal M^p-2(xy)log_xy\n\nfor pneq 1 or xneq y. Note that for the remaining case p=1, x=y the function is not differentiable. In this case, the function returns the corresponding zero tangent vector, since this is an element of the subdifferential.\n\nOptional\n\np – (2) the exponent of the distance, i.e. the default is the squared distance\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_intrinsic_infimal_convolution_TV12-Tuple{AbstractManifold, Vararg{Any, 5}}","page":"Gradients","title":"Manopt.grad_intrinsic_infimal_convolution_TV12","text":"grad_u, grad_v = grad_intrinsic_infimal_convolution_TV12(M, f, u, v, α, β)\n\ncompute (sub)gradient of the intrinsic infimal convolution model using the mid point model of second order differences, see costTV2, i.e. 
for some f mathcal M on a PowerManifold manifold mathcal M this function computes the (sub)gradient of\n\nE(uv) =\nfrac12sum_i mathcal G d_mathcal M(g(frac12v_iw_i)f_i)\n+ alpha\nbigl(\nβmathrmTV(v) + (1-β)mathrmTV_2(w)\nbigr)\n\nwhere both total variations refer to the intrinsic ones, grad_TV and grad_TV2, respectively.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Literature","page":"Gradients","title":"Literature","text":"","category":"section"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"Pages = [\"functions/gradients.md\"]\nCanonical=false","category":"page"},{"location":"extensions/#Extensions","page":"Extensions","title":"Extensions","text":"","category":"section"},{"location":"extensions/#LineSearches.jl","page":"Extensions","title":"LineSearches.jl","text":"","category":"section"},{"location":"extensions/","page":"Extensions","title":"Extensions","text":"Manopt can be used with line search algorithms implemented in LineSearches.jl. 
This can be illustrated by the following example of optimizing Rosenbrock function constrained to the unit sphere.","category":"page"},{"location":"extensions/","page":"Extensions","title":"Extensions","text":"using Manopt, Manifolds, LineSearches\n\n# define objective function and its gradient\np = [1.0, 100.0]\nfunction rosenbrock(::AbstractManifold, x)\n val = zero(eltype(x))\n for i in 1:(length(x) - 1)\n val += (p[1] - x[i])^2 + p[2] * (x[i + 1] - x[i]^2)^2\n end\n return val\nend\nfunction rosenbrock_grad!(M::AbstractManifold, storage, x)\n storage .= 0.0\n for i in 1:(length(x) - 1)\n storage[i] += -2.0 * (p[1] - x[i]) - 4.0 * p[2] * (x[i + 1] - x[i]^2) * x[i]\n storage[i + 1] += 2.0 * p[2] * (x[i + 1] - x[i]^2)\n end\n project!(M, storage, x, storage)\n return storage\nend\n# define constraint\nn_dims = 5\nM = Manifolds.Sphere(n_dims)\n# set initial point\nx0 = vcat(zeros(n_dims - 1), 1.0)\n# use LineSearches.jl HagerZhang method with Manopt.jl quasiNewton solver\nls_hz = Manopt.LineSearchesStepsize(M, LineSearches.HagerZhang())\nx_opt = quasi_Newton(\n M,\n rosenbrock,\n rosenbrock_grad!,\n x0;\n stepsize=ls_hz,\n evaluation=InplaceEvaluation(),\n stopping_criterion=StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6),\n return_state=true,\n)","category":"page"},{"location":"extensions/#Manifolds.jl","page":"Extensions","title":"Manifolds.jl","text":"","category":"section"},{"location":"extensions/","page":"Extensions","title":"Extensions","text":"Manopt.LineSearchesStepsize\nmid_point\nManopt.max_stepsize(::TangentBundle, ::Any)\nManopt.max_stepsize(::FixedRankMatrices, ::Any)","category":"page"},{"location":"extensions/#Manopt.LineSearchesStepsize","page":"Extensions","title":"Manopt.LineSearchesStepsize","text":"LineSearchesStepsize <: Stepsize\n\nWrapper for line searches available in the LineSearches.jl library.\n\nConstructors\n\nLineSearchesStepsize(\n M::AbstractManifold,\n linesearch;\n 
retraction_method::AbstractRetractionMethod=default_retraction_method(M),\n vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M),\n)\nLineSearchesStepsize(\n linesearch;\n retraction_method::AbstractRetractionMethod=ExponentialRetraction(),\n vector_transport_method::AbstractVectorTransportMethod=ParallelTransport(),\n)\n\nWrap linesearch (for example HagerZhang or MoreThuente). The initial step selection from Linesearches.jl is not yet supported and the value 1.0 is used. The retraction used for determining the line along which the search is performed can be provided as retraction_method. Gradient vectors are transported between points using vector_transport_method.\n\n\n\n\n\n","category":"type"},{"location":"extensions/#ManifoldsBase.mid_point","page":"Extensions","title":"ManifoldsBase.mid_point","text":"mid_point(M, p, q, x)\nmid_point!(M, y, p, q, x)\n\nCompute the mid point between p and q. If there is more than one mid point of (not necessarily minimizing) geodesics (e.g. on the sphere), the one nearest to x is returned (in place of y).\n\n\n\n\n\n","category":"function"},{"location":"extensions/#Manopt.max_stepsize-Tuple{TangentBundle{𝔽} where 𝔽, Any}","page":"Extensions","title":"Manopt.max_stepsize","text":"max_stepsize(M::TangentBundle, p)\n\nTangent bundle has injectivity radius of either infinity (for flat manifolds) or 0 (for non-flat manifolds). This makes a guess of what a reasonable maximum stepsize on a tangent bundle might be.\n\n\n\n\n\n","category":"method"},{"location":"extensions/#Manopt.max_stepsize-Tuple{FixedRankMatrices, Any}","page":"Extensions","title":"Manopt.max_stepsize","text":"max_stepsize(M::FixedRankMatrices, p)\n\nReturn a reasonable guess of maximum step size on FixedRankMatrices following the choice of typical distance in Matlab Manopt, i.e. dimension of M. 
See this note\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#BezierCurves","page":"Bézier curves","title":"Bézier curves","text":"","category":"section"},{"location":"functions/bezier/","page":"Bézier curves","title":"Bézier curves","text":"Modules = [Manopt]\nPages = [\"bezier_curves.jl\"]","category":"page"},{"location":"functions/bezier/#Manopt.BezierSegment","page":"Bézier curves","title":"Manopt.BezierSegment","text":"BezierSegment\n\nA type to capture a Bézier segment. With n points, a Bézier segment of degree n-1 is stored. On the Euclidean manifold, this yields a polynomial of degree n-1.\n\nThis type is mainly used to encapsulate the points within a composite Bézier curve, which consists of an AbstractVector of BezierSegments where each of the points might be a nested array on a PowerManifold already.\n\nNote that this can also be used to represent tangent vectors on the control points of a segment.\n\nSee also: de_casteljau.\n\nConstructor\n\nBezierSegment(pts::AbstractVector)\n\nGiven an abstract vector pts of points, generate the corresponding Bézier segment.\n\n\n\n\n\n","category":"type"},{"location":"functions/bezier/#Manopt.de_casteljau-Tuple{AbstractManifold, Vararg{Any}}","page":"Bézier curves","title":"Manopt.de_casteljau","text":"de_casteljau(M::AbstractManifold, b::BezierSegment NTuple{N,P}) -> Function\n\nreturn the Bézier curve β(b_0b_n) 01 mathcal M defined by the control points b_0b_nmathcal M, nmathbb N, as a BezierSegment. This function implements de Casteljau's algorithm Casteljau, 1959, Casteljau, 1963 generalized to manifolds by Popiel, Noakes, J Approx Theo, 2007: Let γ_ab(t) denote the shortest geodesic connecting abmathcal M. 
Then the curve is defined by the recursion\n\nbeginaligned\n β(tb_0b_1) = gamma_b_0b_1(t)\n β(tb_0b_n) = gamma_β(tb_0b_n-1) β(tb_1b_n)(t)\nendaligned\n\nand P is the type of a point on the Manifold M.\n\nde_casteljau(M::AbstractManifold, B::AbstractVector{<:BezierSegment}) -> Function\n\nGiven a vector of Bézier segments, i.e. a vector of control points B=bigl( (b_00b_n_00)(b_0m b_n_mm) bigr), where the different segments might be of different degree(s) n_0n_m. The resulting composite Bézier curve c_B0m mathcal M consists of m segments which are Bézier curves.\n\nc_B(t) =\n begincases\n β(t b_00b_n_00) text if t 01\n β(t-i b_0ib_n_ii) text if \n t(ii+1 quad i1m-1\n endcases\n\nde_casteljau(M::AbstractManifold, b::BezierSegment, t::Real)\nde_casteljau(M::AbstractManifold, B::AbstractVector{<:BezierSegment}, t::Real)\nde_casteljau(M::AbstractManifold, b::BezierSegment, T::AbstractVector) -> AbstractVector\nde_casteljau(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n T::AbstractVector\n) -> AbstractVector\n\nEvaluate the Bézier curve at time t or at times t in T.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_degree-Tuple{AbstractManifold, BezierSegment}","page":"Bézier curves","title":"Manopt.get_bezier_degree","text":"get_bezier_degree(M::AbstractManifold, b::BezierSegment)\n\nreturn the degree of the Bézier curve represented by the tuple b of control points on the manifold M, i.e. 
the number of points minus 1.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_degrees-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}}","page":"Bézier curves","title":"Manopt.get_bezier_degrees","text":"get_bezier_degrees(M::AbstractManifold, B::AbstractVector{<:BezierSegment})\n\nreturn the degrees of the components of a composite Bézier curve represented by tuples in B containing points on the manifold M.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_inner_points-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}}","page":"Bézier curves","title":"Manopt.get_bezier_inner_points","text":"get_bezier_inner_points(M::AbstractManifold, B::AbstractVector{<:BezierSegment})\nget_bezier_inner_points(M::AbstractManifold, b::BezierSegment)\n\nreturns the inner (i.e. all except start and end) points of the segments of the composite Bézier curve specified by the control points B. For a single segment b, its inner points are returned.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_junction_tangent_vectors-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}}","page":"Bézier curves","title":"Manopt.get_bezier_junction_tangent_vectors","text":"get_bezier_junction_tangent_vectors(M::AbstractManifold, B::AbstractVector{<:BezierSegment})\nget_bezier_junction_tangent_vectors(M::AbstractManifold, b::BezierSegment)\n\nreturns the tangent vectors at the start and end points of the composite Bézier curve, pointing from a junction point to the first and last inner control points for each segment of the composite Bézier curve specified by the control points B, given either as a vector of segments or a single segment of control points.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_junctions","page":"Bézier curves","title":"Manopt.get_bezier_junctions","text":"get_bezier_junctions(M::AbstractManifold, 
B::AbstractVector{<:BezierSegment})\nget_bezier_junctions(M::AbstractManifold, b::BezierSegment)\n\nreturns the start and end point(s) of the segments of the composite Bézier curve specified by the control points B. For just one segment b, its start and end points are returned.\n\n\n\n\n\n","category":"function"},{"location":"functions/bezier/#Manopt.get_bezier_points","page":"Bézier curves","title":"Manopt.get_bezier_points","text":"get_bezier_points(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n reduce::Symbol=:default\n)\nget_bezier_points(M::AbstractManifold, b::BezierSegment, reduce::Symbol=:default)\n\nreturns the control points of the segments of the composite Bézier curve specified by the control points B, given either as a vector of segments or a single segment of control points.\n\nThis method reduces the points depending on the optional reduce symbol\n\n:default – no reduction is performed\n:continuous – for a continuous function, the junction points are doubled at b_0i=b_n_i-1i-1, so only b_0i is in the vector.\n:differentiable – for a differentiable function additionally log_b_0ib_1i = -log_b_n_i-1i-1b_n_i-1-1i-1 holds, hence b_n_i-1-1i-1 is omitted.\n\nIf only one segment is given, all points of b – i.e. b.pts – are returned.\n\n\n\n\n\n","category":"function"},{"location":"functions/bezier/#Manopt.get_bezier_segments-Union{Tuple{P}, Tuple{AbstractManifold, Vector{P}, Any}, Tuple{AbstractManifold, Vector{P}, Any, Symbol}} where P","page":"Bézier curves","title":"Manopt.get_bezier_segments","text":"get_bezier_segments(M::AbstractManifold, c::AbstractArray{P}, d[, s::Symbol=:default])\n\nreturns the array of BezierSegments B of a composite Bézier curve reconstructed from an array c of points on the manifold M and an array of degrees d.\n\nThere are a few (reduced) representations that can get extended; see also get_bezier_points. 
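The de Casteljau recursion described above can be illustrated with a minimal, self-contained sketch for the Euclidean case, where the geodesic γ_{p,q}(t) is simply the line segment (1-t)p + tq. The function names here are illustrative and not part of the Manopt.jl API:

```julia
# Geodesic in ℝⁿ: the straight line from p to q, γ(t) = (1 - t)p + tq.
geodesic(p, q, t) = (1 - t) .* p .+ t .* q

# Evaluate one Bézier segment with control points b (a vector of points)
# at t ∈ [0, 1] via the recursion β(t; b₀, …, bₙ): repeatedly connect
# neighboring points by geodesics until one point remains.
function de_casteljau_euclidean(b::Vector{<:Vector{<:Real}}, t::Real)
    pts = copy(b)
    while length(pts) > 1
        pts = [geodesic(pts[i], pts[i + 1], t) for i in 1:(length(pts) - 1)]
    end
    return pts[1]
end

# A cubic segment in ℝ²; the curve interpolates the first and last control point.
b = [[0.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
de_casteljau_euclidean(b, 0.5)  # returns [0.5, 0.75]
```

On a general manifold, Manopt.jl replaces the line segments by shortest geodesics, which is the only change the recursion needs.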
For ease of the following, let c=(c_1c_k) and d=(d_1d_m), where m denotes the number of components the composite Bézier curve consists of. Then\n\n:default – k = m + sum_i=1^m d_i since each component requires one point more than its degree. The points are then ordered in tuples, i.e.\nB = bigl c_1c_d_1+1 (c_d_1+2c_d_1+d_2+2 c_k-m+1+d_mc_k bigr\n:continuous – k = 1+ sum_i=1m d_i, since for a continuous curve start and end point of successive components are the same, so the very first start point and the end points are stored.\nB = bigl c_1c_d_1+1 c_d_1+1c_d_1+d_2+1 c_k-1+d_mb_k) bigr\n:differentiable – for a differentiable function additionally to the last explanation, also the second point of any segment was not stored except for the first segment. Hence k = 2 - m + sum_i=1m d_i and at a junction point b_n with its given prior point c_n-1, i.e. this is the last inner point of a segment, the first inner point in the next segment the junction is computed as b = exp_c_n(-log_c_n c_n-1) such that the assumed differentiability holds\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Literature","page":"Bézier curves","title":"Literature","text":"","category":"section"},{"location":"functions/bezier/","page":"Bézier curves","title":"Bézier curves","text":"Pages = [\"functions/bezier.md\"]\nCanonical=false","category":"page"},{"location":"solvers/subgradient/#SubgradientSolver","page":"Subgradient method","title":"Subgradient Method","text":"","category":"section"},{"location":"solvers/subgradient/","page":"Subgradient method","title":"Subgradient method","text":"subgradient_method\nsubgradient_method!","category":"page"},{"location":"solvers/subgradient/#Manopt.subgradient_method","page":"Subgradient method","title":"Manopt.subgradient_method","text":"subgradient_method(M, f, ∂f, p; kwargs...)\nsubgradient_method(M; sgo, p; kwargs...)\n\nperform a subgradient method p_k+1 = mathrmretr(p_k s_kf(p_k)),\n\nwhere mathrmretr is a retraction, s_k is a step size, 
usually the ConstantStepsize but it can also be specified. Though the subgradient might be set-valued, the argument ∂f should always return one element from the subgradient, though not necessarily deterministically.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\n∂f – the (sub)gradient partial f mathcal M Tmathcal M of f restricted to always only returning one value/element from the subdifferential. This function can be passed as an allocation function (M, p) -> X or a mutating function (M, X, p) -> X, see evaluation.\np – an initial value p_0=p mathcal M\n\nalternatively to f and ∂f a ManifoldSubgradientObjective sgo can be provided.\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the subgradient works by allocation (default), in the form ∂f(M, y), or InplaceEvaluation in place, i.e. is of the form ∂f!(M, X, x).\nstepsize – (ConstantStepsize(M)) specify a Stepsize\nretraction – (default_retraction_method(M, typeof(p))) a retraction to use.\nstopping_criterion – (StopAfterIteration(5000)) a functor, see StoppingCriterion, indicating when to stop.\n\nand the ones that are passed to decorate_state! for decorators.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/subgradient/#Manopt.subgradient_method!","page":"Subgradient method","title":"Manopt.subgradient_method!","text":"subgradient_method!(M, f, ∂f, p)\nsubgradient_method!(M, sgo, p)\n\nperform a subgradient method p_k+1 = mathrmretr(p_k s_kf(p_k)),\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\n∂f – the (sub)gradient partial f mathcal M Tmathcal M of f restricted to always only returning one value/element from the subdifferential. This function can be passed as an allocation function (M, p) -> X or a mutating function (M, X, p) -> X, see evaluation.\np – an initial value p_0=p mathcal M\n\nalternatively to f and ∂f a ManifoldSubgradientObjective sgo can be provided.\n\nfor more details and all optional parameters, see subgradient_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/subgradient/#State","page":"Subgradient method","title":"State","text":"","category":"section"},{"location":"solvers/subgradient/","page":"Subgradient method","title":"Subgradient method","text":"SubGradientMethodState","category":"page"},{"location":"solvers/subgradient/#Manopt.SubGradientMethodState","page":"Subgradient method","title":"Manopt.SubGradientMethodState","text":"SubGradientMethodState <: AbstractManoptSolverState\n\nstores option values for a subgradient_method solver\n\nFields\n\nretraction_method – the retraction to use within\nstepsize – (ConstantStepsize(M)) a Stepsize\nstop – (StopAfterIteration(5000)) a StoppingCriterion\np – (initial or current) value the algorithm is at\np_star – optimal value (initialized to a copy of p)\nX – (zero_vector(M, p)) the current element from the possible subgradients at p that was last evaluated.\n\nConstructor\n\nSubGradientMethodState(M::AbstractManifold, p; kwargs...)\n\nwith keywords for all fields above besides p_star which obtains the same type as p. You can use e.g. 
X= to specify the type of tangent vector to use.\n\n\n\n\n\n","category":"type"},{"location":"solvers/subgradient/","page":"Subgradient method","title":"Subgradient method","text":"For DebugActions and RecordActions to record the (sub)gradient, its norm, and the step sizes, see the steepest descent actions.","category":"page"},{"location":"functions/#Functions","page":"Introduction","title":"Functions","text":"","category":"section"},{"location":"functions/","page":"Introduction","title":"Introduction","text":"There are several functions required within optimization, most prominently cost functions and gradients. This package includes several cost functions and corresponding gradients, but also corresponding proximal maps for variational methods for manifold-valued data. Most of these functions require the evaluation of Differentials or their adjoints.","category":"page"},{"location":"functions/differentials/#DifferentialFunctions","page":"Differentials","title":"Differentials","text":"","category":"section"},{"location":"functions/differentials/","page":"Differentials","title":"Differentials","text":"Modules = [Manopt]\nPages = [\"functions/differentials.jl\"]","category":"page"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, AbstractVector, AbstractVector{<:BezierSegment}}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n T::AbstractVector,\n Ξ::AbstractVector{<:BezierSegment}\n)\ndifferential_bezier_control!(\n M::AbstractManifold,\n Θ::AbstractVector{<:BezierSegment},\n B::AbstractVector{<:BezierSegment},\n T::AbstractVector,\n Ξ::AbstractVector{<:BezierSegment}\n)\n\nevaluate the differential of the composite Bézier curve with respect to its control points B and tangent vectors Ξ in the tangent spaces of the control points. 
The result is the “change” of the curve at the points in T, which are elementwise in 0N, each depending on the corresponding segment(s). Here, N is the length of B. For the mutating variant the result is computed in Θ.\n\nSee de_casteljau for more details on the curve and Bergmann, Gousenbourger, Front. Appl. Math. Stat., 2018.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, Any, AbstractVector{<:BezierSegment}}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n t,\n X::AbstractVector{<:BezierSegment}\n)\ndifferential_bezier_control!(\n M::AbstractManifold,\n Y::AbstractVector{<:BezierSegment},\n B::AbstractVector{<:BezierSegment},\n t,\n X::AbstractVector{<:BezierSegment}\n)\n\nevaluate the differential of the composite Bézier curve with respect to its control points B and tangent vectors X in the tangent spaces of the control points. The result is the “change” of the curve at t0N, which depends only on the corresponding segment. Here, N is the length of B. The computation can be done in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, BezierSegment, AbstractVector, BezierSegment}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(\n M::AbstractManifold,\n b::BezierSegment,\n T::AbstractVector,\n X::BezierSegment,\n)\ndifferential_bezier_control!(\n M::AbstractManifold,\n Y,\n b::BezierSegment,\n T::AbstractVector,\n X::BezierSegment,\n)\n\nevaluate the differential of the Bézier curve with respect to its control points b and tangent vectors X in the tangent spaces of the control points. 
The result is the “change” of the curve at the points T, elementwise in t01. The computation can be done in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, BezierSegment, Any, BezierSegment}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(M::AbstractManifold, b::BezierSegment, t::Float, X::BezierSegment)\ndifferential_bezier_control!(\n M::AbstractManifold,\n Y,\n b::BezierSegment,\n t,\n X::BezierSegment\n)\n\nevaluate the differential of the Bézier curve with respect to its control points b and tangent vectors X given in the tangent spaces of the control points. The result is the “change” of the curve at t01. The computation can be done in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_forward_logs-Tuple{PowerManifold, Any, Any}","page":"Differentials","title":"Manopt.differential_forward_logs","text":"Y = differential_forward_logs(M, p, X)\ndifferential_forward_logs!(M, Y, p, X)\n\ncompute the differential of forward_logs F on the PowerManifold manifold M at p and direction X , in the power manifold array, the differential of the function\n\nF_i(x) = sum_j mathcal I_i log_p_i p_j quad i mathcal G\n\nwhere mathcal G is the set of indices of the PowerManifold manifold M and mathcal I_i denotes the forward neighbors of i.\n\nInput\n\nM – a PowerManifold manifold\np – a point.\nX – a tangent vector.\n\nOutput\n\nY – resulting tangent vector in T_xmathcal N representing the differentials of the logs, where mathcal N is the power manifold with the number of dimensions added to size(x). 
The computation can also be done in place.\n\n\n\n\n\n","category":"method"},{"location":"solvers/augmented_Lagrangian_method/#AugmentedLagrangianSolver","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":" augmented_Lagrangian_method\n augmented_Lagrangian_method!","category":"page"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.augmented_Lagrangian_method","page":"Augmented Lagrangian Method","title":"Manopt.augmented_Lagrangian_method","text":"augmented_Lagrangian_method(M, f, grad_f, p=rand(M); kwargs...)\naugmented_Lagrangian_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)\n\nperform the augmented Lagrangian method (ALM) Liu, Boumal, 2019, Appl. Math. Optim. The aim of the ALM is to find the solution of the constrained optimisation task\n\nbeginaligned\nmin_p mathcalM f(p)\ntextsubject to g_i(p)leq 0 quad text for i= 1 m\nquad h_j(p)=0 quad text for j=1n\nendaligned\n\nwhere M is a Riemannian manifold, and f, g_i_i=1^m and h_j_j=1^p are twice continuously differentiable functions from M to ℝ. 
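The per-iteration parameter updates of the ALM described here (clipped Lagrange multiplier updates and the accuracy-tolerance update) can be sketched in plain Julia for Euclidean values. The function and variable names are illustrative, not the Manopt.jl API; the clipping follows the update rules of Liu and Boumal:

```julia
# clip(x, lo, hi) projects x onto the interval [lo, hi].
clip(x, lo, hi) = min(max(x, lo), hi)

# Multiplier updates: λⱼ for the equality constraints h (clipped to
# [λ_min, λ_max]), μᵢ for the inequality constraints g (clipped to [0, μ_max]).
# h_val and g_val are the constraint values at the current iterate p.
function update_multipliers(λ, μ, h_val, g_val, ρ;
                            λ_min=-20.0, λ_max=20.0, μ_max=20.0)
    λ_new = clip.(λ .+ ρ .* h_val, λ_min, λ_max)
    μ_new = clip.(μ .+ ρ .* g_val, 0.0, μ_max)
    return λ_new, μ_new
end

# Accuracy tolerance update: ϵ ← max(ϵ_min, θ_ϵ * ϵ) with θ_ϵ ∈ (0, 1).
update_tolerance(ϵ, ϵ_min, θ_ϵ) = max(ϵ_min, θ_ϵ * ϵ)

λ, μ = update_multipliers([0.0], [0.0], [0.5], [-0.2], 1.0)
# λ == [0.5]; μ stays at [0.0] since the inequality constraint is inactive.
```

The penalty parameter ρ is additionally increased whenever the constraint violation did not decrease sufficiently; this sketch omits that step.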
In every step k of the algorithm, the AugmentedLagrangianCost mathcalL_ρ^(k-1)(p μ^(k-1) λ^(k-1)) is minimized on mathcalM, where μ^(k-1) in mathbb R^n and λ^(k-1) in mathbb R^m are the current iterates of the Lagrange multipliers and ρ^(k-1) is the current penalty parameter.\n\nThe Lagrange multipliers are then updated by\n\nλ_j^(k) =operatornameclip_λ_minλ_max (λ_j^(k-1) + ρ^(k-1) h_j(p^(k))) textfor all j=1p\n\nand\n\nμ_i^(k) =operatornameclip_0μ_max (μ_i^(k-1) + ρ^(k-1) g_i(p^(k))) text for all i=1m\n\nwhere λ_min leq λ_max and μ_max are the multiplier boundaries.\n\nNext, we update the accuracy tolerance ϵ by setting\n\nϵ^(k)=maxϵ_min θ_ϵ ϵ^(k-1)\n\nwhere ϵ_min is the lowest value ϵ is allowed to become and θ_ϵ (01) is constant scaling factor.\n\nLast, we update the penalty parameter ρ. For this, we define\n\nσ^(k)=max_j=1p i=1m h_j(p^(k)) max_i=1mg_i(p^(k)) -fracμ_i^(k-1)ρ^(k-1) \n\nThen, we update ρ according to\n\nρ^(k) = begincases\nρ^(k-1)θ_ρ textif σ^(k)leq θ_ρ σ^(k-1) \nρ^(k-1) textelse\nendcases\n\nwhere θ_ρ in (01) is a constant scaling factor.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\ngrad_f – the gradient of the cost function\n\nOptional (if not called with the ConstrainedManifoldObjective cmo)\n\ng – (nothing) the inequality constraints\nh – (nothing) the equality constraints\ngrad_g – (nothing) the gradient of the inequality constraints\ngrad_h – (nothing) the gradient of the equality constraints\n\nNote that one of the pairs (g, grad_g) or (h, grad_h) has to be provided. Otherwise the problem is not constrained and you can also call e.g. 
quasi_Newton\n\nOptional\n\nϵ – (1e-3) the accuracy tolerance\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nϵ_exponent – (1/100) exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturally\nθ_ϵ – ((ϵ_min / ϵ)^(ϵ_exponent)) the scaling factor of the exactness\nμ – (ones(size(h(M,x),1))) the Lagrange multiplier with respect to the inequality constraints\nμ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the inequality constraints\nλ – (ones(size(h(M,x),1))) the Lagrange multiplier with respect to the equality constraints\nλ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the equality constraints\nλ_min – (- λ_max) a lower bound for the Lagrange multiplier belonging to the equality constraints\nτ – (0.8) factor for the improvement of the evaluation of the penalty parameter\nρ – (1.0) the penalty parameter\nθ_ρ – (0.3) the scaling factor of the penalty parameter\nsub_cost – (AugmentedLagrangianCost(problem, ρ, μ, λ)) use augmented Lagrangian, especially with the same numbers ρ,μ as in the options for the sub problem\nsub_grad – (AugmentedLagrangianGrad(problem, ρ, μ, λ)) use augmented Lagrangian gradient, especially with the same numbers ρ,μ as in the options for the sub problem\nsub_kwargs – keyword arguments to decorate the sub options, e.g. with debug.\nsub_stopping_criterion – (StopAfterIteration(200) |StopWhenGradientNormLess(ϵ) |StopWhenStepsizeLess(1e-8)) specify a stopping criterion for the subsolver.\nsub_problem – (DefaultManoptProblem(M,ConstrainedManifoldObjective(subcost, subgrad; evaluation=evaluation))) problem for the subsolver\nsub_state – (QuasiNewtonState) using QuasiNewtonLimitedMemoryDirectionUpdate with InverseBFGS and sub_stopping_criterion as a stopping criterion. 
See also sub_kwargs.\nstopping_criterion – (StopAfterIteration(300) | (StopWhenSmallerOrEqual(ϵ, ϵ_min) & StopWhenChangeLess(1e-10))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.augmented_Lagrangian_method!","page":"Augmented Lagrangian Method","title":"Manopt.augmented_Lagrangian_method!","text":"augmented_Lagrangian_method!(M, f, grad_f, p=rand(M); kwargs...)\n\nperform the augmented Lagrangian method (ALM) in-place of p.\n\nFor all options, see augmented_Lagrangian_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/augmented_Lagrangian_method/#State","page":"Augmented Lagrangian Method","title":"State","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"AugmentedLagrangianMethodState","category":"page"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.AugmentedLagrangianMethodState","page":"Augmented Lagrangian Method","title":"Manopt.AugmentedLagrangianMethodState","text":"AugmentedLagrangianMethodState{P,T} <: AbstractManoptSolverState\n\nDescribes the augmented Lagrangian method, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\np – a point on a manifold as starting point and current iterate\nsub_problem – an AbstractManoptProblem problem for the subsolver\nsub_state – an AbstractManoptSolverState for the subsolver\nϵ – (1e-3) the accuracy tolerance\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nλ – (ones(length(get_equality_constraints(p,x)))) the Lagrange multiplier with respect to the equality constraints\nλ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the equality constraints\nλ_min – (- λ_max) a lower bound for the Lagrange multiplier belonging to the equality constraints\nμ – (ones(length(get_inequality_constraints(p,x)))) the Lagrange multiplier with respect to the inequality constraints\nμ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the inequality constraints\nρ – (1.0) the penalty parameter\nτ – (0.8) factor for the improvement of the evaluation of the penalty parameter\nθ_ρ – (0.3) the scaling factor of the penalty parameter\nθ_ϵ – ((ϵ_min/ϵ)^(ϵ_exponent)) the scaling factor of the accuracy tolerance\npenalty – evaluation of the current penalty term, initialized to Inf.\nstopping_criterion – ((StopAfterIteration(300) | (StopWhenSmallerOrEqual(ϵ, ϵ_min) & StopWhenChangeLess(1e-10))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nConstructor\n\nAugmentedLagrangianMethodState(M::AbstractManifold, co::ConstrainedManifoldObjective, p; kwargs...)\n\nconstruct an augmented Lagrangian method state with the fields and defaults as above, where the manifold M and the ConstrainedManifoldObjective co are used for defaults in the keyword arguments.\n\nSee also\n\naugmented_Lagrangian_method\n\n\n\n\n\n","category":"type"},{"location":"solvers/augmented_Lagrangian_method/#Helping-Functions","page":"Augmented Lagrangian Method","title":"Helping Functions","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"AugmentedLagrangianCost\nAugmentedLagrangianGrad","category":"page"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.AugmentedLagrangianCost","page":"Augmented Lagrangian Method","title":"Manopt.AugmentedLagrangianCost","text":"AugmentedLagrangianCost{CO,R,T}\n\nStores the parameters ρ mathbb R, μ mathbb R^m, λ mathbb R^n of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.\n\nThis struct is also a functor (M,p) -> v that can be used as a cost function within a solver, based on the internal 
ConstrainedManifoldObjective we can compute\n\nmathcal L_rho(p μ λ)\n= f(x) + fracρ2 biggl(\n sum_j=1^n Bigl( h_j(p) + fracλ_jρ Bigr)^2\n +\n sum_i=1^m maxBigl 0 fracμ_iρ + g_i(p) Bigr^2\nBigr)\n\nFields\n\nco::CO, ρ::R, μ::T, λ::T as mentioned above\n\n\n\n\n\n","category":"type"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.AugmentedLagrangianGrad","page":"Augmented Lagrangian Method","title":"Manopt.AugmentedLagrangianGrad","text":"AugmentedLagrangianGrad{CO,R,T}\n\nStores the parameters ρ mathbb R, μ mathbb R^m, λ mathbb R^n of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.\n\nThis struct is also a functor in both formats\n\n(M, p) -> X to compute the gradient in allocating fashion.\n(M, X, p) to compute the gradient in in-place fashion.\n\nbased on the internal ConstrainedManifoldObjective and computes the gradient operatornamegrad mathcal L_ρ(p μ λ), see also AugmentedLagrangianCost.\n\n\n\n\n\n","category":"type"},{"location":"solvers/augmented_Lagrangian_method/#Literature","page":"Augmented Lagrangian Method","title":"Literature","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"Pages = [\"solvers/augmented_Lagrangian_method.md\"]\nCanonical=false","category":"page"},{"location":"plans/record/#RecordSection","page":"Recording values","title":"Record values","text":"","category":"section"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"To record values during the iterations of a solver run, there are in general two possibilities. On the one hand, the high-level interfaces provide a record= keyword, that accepts several different inputs. 
For more details see How to record.","category":"page"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"For example recording the gradient from the GradientDescentState is automatically available, as explained in the gradient_descent solver.","category":"page"},{"location":"plans/record/#RecordSolverState","page":"Recording values","title":"Record Solver States","text":"","category":"section"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"Modules = [Manopt]\nPages = [\"plans/record.jl\"]\nOrder = [:type, :function]\nPrivate = true","category":"page"},{"location":"plans/record/#Manopt.RecordAction","page":"Recording values","title":"Manopt.RecordAction","text":"RecordAction\n\nA RecordAction is a small functor to record values. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s that performs the record, where i is the current iteration.\n\nBy convention i<=0 is interpreted as \"For Initialization only\", i.e. only initialize internal values, but not trigger any record, the same holds for i=typemin(Inf) which is used to indicate stop, i.e. that the record is called from within stop_solver! 
which returns true afterwards.\n\nFields (assumed by subtypes to exist)\n\nrecorded_values an Array of the recorded values.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordChange","page":"Recording values","title":"Manopt.RecordChange","text":"RecordChange <: RecordAction\n\nrecord the amount of change of the iterate (stored in o.x of the AbstractManoptSolverState) during the last iteration.\n\nAdditional Fields\n\nstorage a StoreStateAction to store (at least) o.x to use this as the last value (to compute the change)\ninverse_retraction_method - (default_inverse_retraction_method(manifold, p)) the inverse retraction to be used for approximating the distance.\n\nConstructor\n\nRecordChange(M=DefaultManifold();)\n\nwith the above fields as keywords. For the DefaultManifold only the field storage is used. Providing the actual manifold moves the default storage to the efficient point storage.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordCost","page":"Recording values","title":"Manopt.RecordCost","text":"RecordCost <: RecordAction\n\nRecord the current cost function value, see get_cost.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordEntry","page":"Recording values","title":"Manopt.RecordEntry","text":"RecordEntry{T} <: RecordAction\n\nrecord a certain field's entry of type {T} during the iterates\n\nFields\n\nrecorded_values – the recorded iterates\nfield – Symbol the entry can be accessed with within AbstractManoptSolverState\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordEntryChange","page":"Recording values","title":"Manopt.RecordEntryChange","text":"RecordEntryChange{T} <: RecordAction\n\nrecord a certain entry's change during the iterates\n\nAdditional Fields\n\nrecorded_values – the recorded iterates\nfield – Symbol the field can be accessed with within AbstractManoptSolverState\ndistance – function (p,o,x1,x2) to compute the change/distance between two values of the 
entry\nstorage – a StoreStateAction to store (at least) getproperty(o, d.field)\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordEvery","page":"Recording values","title":"Manopt.RecordEvery","text":"RecordEvery <: RecordAction\n\nrecord only every ith iteration. Otherwise (optionally, but activated by default) just update internal tracking values.\n\nThis method does not perform any record itself but relies on its children's methods\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordGroup","page":"Recording values","title":"Manopt.RecordGroup","text":"RecordGroup <: RecordAction\n\ngroup a set of RecordActions into one action, where the internal RecordActions act independently, but the results can be collected in a grouped fashion, i.e. tuples per call of this group. The entries can later be addressed either by index or by semantic Symbols.\n\nConstructors\n\nRecordGroup(g::Array{<:RecordAction, 1})\n\nconstruct a group consisting of an Array of RecordActions g,\n\nRecordGroup(g, symbols)\n\nExamples\n\nr = RecordGroup([RecordIteration(), RecordCost()])\n\nA RecordGroup to record the current iteration and the cost. The cost can then be accessed using get_record(r, 2) or r[2].\n\nr = RecordGroup([RecordIteration(), RecordCost()], Dict(:Cost => 2))\n\nA RecordGroup to record the current iteration and the cost, which can then be accessed using get_record(r, :Cost) or r[:Cost].\n\nr = RecordGroup([RecordIteration(), :Cost => RecordCost()])\n\nA RecordGroup identical to the previous constructor, just a little easier to use.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordIterate","page":"Recording values","title":"Manopt.RecordIterate","text":"RecordIterate <: RecordAction\n\nrecord the iterate\n\nConstructors\n\nRecordIterate(x0)\n\ninitialize the iterate record array to the type of x0, e.g. 
your initial data.\n\nRecordIterate(P)\n\ninitialize the iterate record array to the data type T.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordIteration","page":"Recording values","title":"Manopt.RecordIteration","text":"RecordIteration <: RecordAction\n\nrecord the current iteration\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordSolverState","page":"Recording values","title":"Manopt.RecordSolverState","text":"RecordSolverState <: AbstractManoptSolverState\n\nappend to any AbstractManoptSolverState the decorator with record functionality, Internally a Dictionary is kept that stores a RecordAction for several concurrent modes using a Symbol as reference. The default mode is :Iteration, which is used to store information that is recorded during the iterations. RecordActions might be added to :Start or :Stop to record values at the beginning or for the stopping time point, respectively\n\nThe original options can still be accessed using the get_state function.\n\nFields\n\noptions – the options that are extended by debug information\nrecordDictionary – a Dict{Symbol,RecordAction} to keep track of all different recorded values\n\nConstructors\n\nRecordSolverState(o,dR)\n\nconstruct record decorated AbstractManoptSolverState, where dR can be\n\na RecordAction, then it is stored within the dictionary at :Iteration\nan Array of RecordActions, then it is stored as a recordDictionary(@ref) within the dictionary at :All.\na Dict{Symbol,RecordAction}.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordTime","page":"Recording values","title":"Manopt.RecordTime","text":"RecordTime <: RecordAction\n\nrecord the time elapsed during the current iteration.\n\nThe three possible modes are\n\n:cumulative record times without resetting the timer\n:iterative record times with resetting the timer\n:total record a time only at the end of an algorithm (see stop_solver!)\n\nThe default is :cumulative, and any 
non-listed symbol defaults to using this mode.\n\nConstructor\n\nRecordTime(; mode::Symbol=:cumulative)\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Base.getindex-Tuple{RecordGroup, Vararg{Any}}","page":"Recording values","title":"Base.getindex","text":"getindex(r::RecordGroup, s::Symbol)\nr[s]\ngetindex(r::RecordGroup, sT::NTuple{N,Symbol})\nr[sT]\ngetindex(r::RecordGroup, i)\nr[i]\n\nreturn an array of recorded values with respect to the Symbol s, the symbols from the tuple sT, or the index i. See get_record for details.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Base.getindex-Tuple{RecordSolverState, Symbol}","page":"Recording values","title":"Base.getindex","text":"get_index(rs::RecordSolverState, s::Symbol)\nro[s]\n\nGet the recorded values for recorded type s, see get_record for details.\n\nget_index(rs::RecordSolverState, s::Symbol, i...)\nro[s, i...]\n\nAccess the recording type of type s and call its RecordAction with [i...].\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.RecordActionFactory-Tuple{AbstractManoptSolverState, RecordAction}","page":"Recording values","title":"Manopt.RecordActionFactory","text":"RecordActionFactory(s)\n\ncreate a RecordAction where\n\na RecordAction is passed through\na Symbol creates a RecordEntry of that symbol, with the exceptions of\n:Change - to record the change of the iterates in o.x\n:Iterate - to record the iterate\n:Iteration - to record the current iteration number\n:Cost - to record the current cost function value\n:Time - to record the total time taken after every iteration\n:IterativeTime – to record the times taken for each iteration.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.RecordFactory-Tuple{AbstractManoptSolverState, Vector}","page":"Recording values","title":"Manopt.RecordFactory","text":"RecordFactory(s::AbstractManoptSolverState, a)\n\ngiven an array of Symbols, RecordActions and Ints\n\nThe symbol :Cost creates a RecordCost\nThe 
symbol :Iteration creates a RecordIteration\nThe symbol :Change creates a RecordChange\nany other symbol creates a RecordEntry of the corresponding field in AbstractManoptSolverState\nany RecordAction is directly included\na semantic pair :symbol => RecordAction is directly included\nan Integer k specifies that the record is only performed every kth iteration\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.get_record","page":"Recording values","title":"Manopt.get_record","text":"get_record(s::AbstractManoptSolverState, [,symbol=:Iteration])\nget_record(s::RecordSolverState, [,symbol=:Iteration])\n\nreturn the recorded values from within the RecordSolverState s that were recorded with respect to the Symbol symbol as an Array. The default refers to any recordings during an :Iteration.\n\nWhen called with arbitrary AbstractManoptSolverState, this method looks for the RecordSolverState decorator and calls get_record on the decorator.\n\n\n\n\n\n","category":"function"},{"location":"plans/record/#Manopt.get_record-Tuple{RecordAction, Any}","page":"Recording values","title":"Manopt.get_record","text":"get_record(r::RecordAction)\n\nreturn the recorded values stored within a RecordAction r.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.get_record-Tuple{RecordGroup}","page":"Recording values","title":"Manopt.get_record","text":"get_record(r::RecordGroup)\n\nreturn an array of tuples, where each tuple is a recorded set, e.g. per iteration / record call.\n\nget_record(r::RecordGroup, i::Int)\n\nreturn an array of values corresponding to the ith entry in this record group\n\nget_record(r::RecordGroup, s::Symbol)\n\nreturn an array of recorded values with respect to the symbol s, see RecordGroup.\n\nget_record(r::RecordGroup, s1::Symbol, s2::Symbol,...)\n\nreturn an array of tuples, where each tuple is a recorded set corresponding to the symbols s1, s2,... 
per iteration / record call.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.get_record_action","page":"Recording values","title":"Manopt.get_record_action","text":"get_record_action(s::AbstractManoptSolverState, s::Symbol)\n\nreturn the action contained in the (first) RecordSolverState decorator within the AbstractManoptSolverState s.\n\n\n\n\n\n","category":"function"},{"location":"plans/record/#Manopt.get_record_state-Tuple{AbstractManoptSolverState}","page":"Recording values","title":"Manopt.get_record_state","text":"get_record_state(s::AbstractManoptSolverState)\n\nreturn the RecordSolverState among the decorators from the AbstractManoptSolverState s\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.has_record-Tuple{RecordSolverState}","page":"Recording values","title":"Manopt.has_record","text":"has_record(s::AbstractManoptSolverState)\n\ncheck whether the AbstractManoptSolverStates are decorated with RecordSolverState\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.record_or_reset!-Tuple{RecordAction, Any, Int64}","page":"Recording values","title":"Manopt.record_or_reset!","text":"record_or_reset!(r,v,i)\n\neither record (i>0 and not Inf) the value v within the RecordAction r or reset (i<0) the internal storage, where v has to match the internal value type of the corresponding RecordAction.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"see recording values for details on the decorated solver.","category":"page"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"Further specific RecordActions can be found when specific types of AbstractManoptSolverState define them on their corresponding site.","category":"page"},{"location":"plans/record/#Technical-Details:-The-Record-Solver","page":"Recording values","title":"Technical Details: The Record 
Solver","text":"","category":"section"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"initialize_solver!(amp::AbstractManoptProblem, rss::RecordSolverState)\nstep_solver!(p::AbstractManoptProblem, s::RecordSolverState, i)\nstop_solver!(p::AbstractManoptProblem, s::RecordSolverState, i)","category":"page"},{"location":"plans/record/#Manopt.initialize_solver!-Tuple{AbstractManoptProblem, RecordSolverState}","page":"Recording values","title":"Manopt.initialize_solver!","text":"initialize_solver!(ams::AbstractManoptProblem, rss::RecordSolverState)\n\nExtend the initialization of the solver by a hook to run records that were added to the :Start entry.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.step_solver!-Tuple{AbstractManoptProblem, RecordSolverState, Any}","page":"Recording values","title":"Manopt.step_solver!","text":"step_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, i)\n\nExtend the ith step of the solver by a hook to run records, that were added to the :Iteration entry.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.stop_solver!-Tuple{AbstractManoptProblem, RecordSolverState, Any}","page":"Recording values","title":"Manopt.stop_solver!","text":"stop_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, i)\n\nExtend the check, whether to stop the solver by a hook to run records, that were added to the :Stop entry.\n\n\n\n\n\n","category":"method"},{"location":"solvers/adaptive-regularization-with-cubics/#ARSSection","page":"Adaptive Regularization with Cubics","title":"Adaptive regularization with Cubics","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive 
Regularization with Cubics","text":"adaptive_regularization_with_cubics\nadaptive_regularization_with_cubics!","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.adaptive_regularization_with_cubics","page":"Adaptive Regularization with Cubics","title":"Manopt.adaptive_regularization_with_cubics","text":"adaptive_regularization_with_cubics(M, f, grad_f, Hess_f, p=rand(M); kwargs...)\nadaptive_regularization_with_cubics(M, f, grad_f, p=rand(M); kwargs...)\nadaptive_regularization_with_cubics(M, mho, p=rand(M); kwargs...)\n\nSolve an optimization problem on the manifold M by iteratively minimizing\n\nm_k(X) = f(p_k) + X operatornamegrad f(p_k) + frac12X operatornameHess f(p_k)X + fracσ_k3lVert X rVert^3\n\non the tangent space at the current iterate p_k, i.e. X T_p_kmathcal M and where σ_k 0 is a regularization parameter.\n\nLet X_k denote the minimizer of the model m_k, then we use the model improvement\n\nρ_k = fracf(p_k) - f(operatornameretr_p_k(X_k))m_k(0) - m_k(s) + fracσ_k3lVert X_krVert^3\n\nWe use two thresholds η_2 η_1 0 and set p_k+1 = operatornameretr_p_k(X_k) if ρ η_1 and reject the candidate otherwise, i.e. set p_k+1 = p_k.\n\nWe further update the regularization parameter using factors 0 γ_1 1 γ_2\n\nσ_k+1 =\nbegincases\n maxσ_min γ_1σ_k text if ρ geq η_2 text (the model was very successful)\n σ_k text if ρ in η_1 η_2)text (the model was successful)\n γ_2σ_k text if ρ η_1text (the model was unsuccessful)\nendcases\n\nFor more details see Agarwal, Boumal, Bullins, Cartis, Math. 
Prog., 2020.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional) the hessian H( mathcal M x ξ) of F\np – an initial value p mathcal M\n\nFor the case that no hessian is provided, the Hessian is computed using a finite difference approximation, see ApproxHessianFiniteDifference.\n\nThe cost f and its gradient and Hessian might also be provided as a ManifoldHessianObjective.\n\nKeyword arguments\n\nthe default values are given in brackets\n\nσ - (100.0 / sqrt(manifold_dimension(M))) initial regularization parameter\nσmin - (1e-10) minimal regularization value σ_min\nη1 - (0.1) lower model success threshold\nη2 - (0.9) upper model success threshold\nγ1 - (0.1) regularization reduction factor (for the success case)\nγ2 - (2.0) regularization increment factor (for the non-success case)\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default), in the form grad_f(M, p), or InplaceEvaluation in place, i.e. 
is of the form grad_f!(M, X, p) and analogously for the hessian.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use\ninitial_tangent_vector - (zero_vector(M, p)) initialize any tangent vector data,\nmaxIterLanczos - (200) a shortcut to set the stopping criterion in the sub_solver,\nρ_regularization - (1e3) a regularization to avoid dividing by zero for small values of cost and model\nstopping_criterion - (StopAfterIteration(40) | StopWhenGradientNormLess(1e-9) | StopWhenAllLanczosVectorsUsed(maxIterLanczos))\nsub_state - LanczosState(M, copy(M, p); maxIterLanczos=maxIterLanczos, σ=σ) a state for the subproblem or an AbstractEvaluationType if the problem is a function.\nsub_objective - a shortcut to modify the objective of the subproblem used within the sub_problem\nsub_problem - DefaultManoptProblem(M, sub_objective) the problem (or a function) for the sub problem\n\nAll other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified.\n\nBy default the debug= keyword is set to DebugIfEntry(:ρ_denonimator, >(0); message=\"Denominator nonpositive\", type=:error) to avoid that, due to rounding errors, the denominator in the computation of ρ gets nonpositive.\n\n\n\n\n\n","category":"function"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.adaptive_regularization_with_cubics!","page":"Adaptive Regularization with Cubics","title":"Manopt.adaptive_regularization_with_cubics!","text":"adaptive_regularization_with_cubics!(M, f, grad_f, Hess_f, p; kwargs...)\nadaptive_regularization_with_cubics!(M, f, grad_f, p; kwargs...)\nadaptive_regularization_with_cubics!(M, mho, p; kwargs...)\n\nevaluate the Riemannian adaptive regularization with cubics solver in place of p.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional) the hessian H( mathcal M x ξ) of F\np – an initial value p mathcal M\n\nFor the case that no hessian is provided, the Hessian is computed using a finite difference approximation, see ApproxHessianFiniteDifference.\n\nThe cost f and its gradient and Hessian might also be provided as a ManifoldHessianObjective.\n\nFor more details and all options, see adaptive_regularization_with_cubics.\n\n\n\n\n\n","category":"function"},{"location":"solvers/adaptive-regularization-with-cubics/#State","page":"Adaptive Regularization with Cubics","title":"State","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"AdaptiveRegularizationState","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.AdaptiveRegularizationState","page":"Adaptive Regularization with 
Cubics","title":"Manopt.AdaptiveRegularizationState","text":"AdaptiveRegularizationState{P,T} <: AbstractHessianSolverState\n\nA state for the adaptive_regularization_with_cubics solver.\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\nη1, η2 – (0.1, 0.9) bounds for evaluating the regularization parameter\nγ1, γ2 – (0.1, 2.0) shrinking and expansion factors for regularization parameter σ\np – (rand(M)) the current iterate\nX – (zero_vector(M,p)) the current gradient operatornamegradf(p)\ns - (zero_vector(M,p)) the tangent vector step resulting from minimizing the model problem in the tangent space mathcal T_p mathcal M\nσ – the current cubic regularization parameter\nσmin – (1e-7) lower bound for the cubic regularization parameter\nρ_regularization – (1e3) regularization parameter for computing ρ. As we approach convergence, ρ may be difficult to compute with numerator and denominator approaching zero. Regularizing the ratio lets ρ go to 1 near convergence.\nevaluation - (AllocatingEvaluation()) if you provide a\nretraction_method – (default_retraction_method(M)) the retraction to use\nstopping_criterion – (StopAfterIteration(100)) a StoppingCriterion\nsub_problem - sub problem solved in each iteration\nsub_state - sub state for solving the sub problem – either a solver state if the problem is an AbstractManoptProblem or an AbstractEvaluationType if it is a function, where it defaults to AllocatingEvaluation.\n\nFurthermore, the following internal fields are defined\n\nq - (copy(M,p)) a point for the candidates to evaluate model and ρ\nH – (copy(M, p, X)) the current hessian, operatornameHessF(p)\nS – (copy(M, p, X)) the current solution from the subsolver\nρ – the current regularized ratio of actual improvement and model improvement.\nρ_denominator – (one(ρ)) a value to store the denominator from the computation of ρ to allow for a warning or error when this value is 
non-positive.\n\nConstructor\n\nAdaptiveRegularizationState(M, p=rand(M); X=zero_vector(M, p); kwargs...)\n\nConstruct the solver state with all fields stated above as keyword arguments.\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Sub-solvers","page":"Adaptive Regularization with Cubics","title":"Sub solvers","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"There are several ways to approach the subsolver. The default is the first one.","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Lanczos-Iteration","page":"Adaptive Regularization with Cubics","title":"Lanczos Iteration","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"Manopt.LanczosState","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.LanczosState","page":"Adaptive Regularization with Cubics","title":"Manopt.LanczosState","text":"LanczosState{P,T,SC,B,I,R,TM,V,Y} <: AbstractManoptSolverState\n\nSolve the adaptive regularized subproblem with a Lanczos iteration\n\nFields\n\np the current iterate\nstop – the stopping criterion\nσ – the current regularization parameter\nX the current gradient\nLanczos_vectors – the obtained Lanczos vectors\ntridig_matrix the tridiagonal coefficient matrix T\ncoefficients the coefficients y_1,...y_k` that determine the solution\nHp – a temporary vector containing the evaluation of the Hessian\nHp_residual – a temporary vector containing the residual to the Hessian\nS – the current obtained / approximated solution\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#(Conjugate)-Gradient-Descent","page":"Adaptive Regularization with Cubics","title":"(Conjugate) 
Gradient Descent","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"There are two generic functors that implement the subproblem.","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"AdaptiveRegularizationCubicCost\nAdaptiveRegularizationCubicGrad","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.AdaptiveRegularizationCubicCost","page":"Adaptive Regularization with Cubics","title":"Manopt.AdaptiveRegularizationCubicCost","text":"AdaptiveRegularizationCubicCost\n\nWe define the model m(X) in the tangent space of the current iterate p=p_k as\n\n m(X) = f(p) + X operatornamegradf(p)\n + frac12 X operatornameHess f(p)X + fracσ3 lVert X rVert^3\n\nFields\n\nmho – an AbstractManifoldObjective that should provide at least get_cost, get_gradient and get_hessian.\nσ – the current regularization parameter\nX – a storage for the gradient at p of the original cost\n\nConstructors\n\nAdaptiveRegularizationCubicCost(mho, σ, X)\nAdaptiveRegularizationCubicCost(M, mho, σ; p=rand(M), X=get_gradient(M, mho, p))\n\nInitialize the cubic cost to the objective mho, regularization parameter σ, and (temporary) gradient X.\n\nnote: Note\nFor this gradient function to work, we require the TangentSpaceAtPoint from Manifolds.jl\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.AdaptiveRegularizationCubicGrad","page":"Adaptive Regularization with Cubics","title":"Manopt.AdaptiveRegularizationCubicGrad","text":"AdaptiveRegularizationCubicGrad\n\nWe define the model m(X) in the tangent space of the current iterate p=p_k as\n\n m(X) = f(p) + X operatornamegradf(p)\n + frac12 X operatornameHess f(p)X + fracσ3 lVert X rVert^3\n\nThis struct represents its 
gradient, given by\n\n operatornamegrad m(X) = operatornamegradf(p) + operatornameHess f(p)X + σ lVert X rVert X\n\nFields\n\nmho – an AbstractManifoldObjective that should provide at least get_cost, get_gradient and get_hessian.\nσ – the current regularization parameter\nX – a storage for the gradient at p of the original cost\n\nConstructors\n\nAdaptiveRegularizationCubicGrad(mho, σ, X)\nAdaptiveRegularizationCubicGrad(M, mho, σ; p=rand(M), X=get_gradient(M, mho, p))\n\nInitialize the cubic cost to the original objective mho, regularization parameter σ, and (temporary) gradient X.\n\nnote: Note\nFor this gradient function to work, we require the TangentSpaceAtPoint from Manifolds.jl. The gradient functor provides both an allocating as well as an in-place variant.\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"Since the sub problem is given on the tangent space, you have to provide","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"g = AdaptiveRegularizationCubicCost(M, mho, σ)\ngrad_g = AdaptiveRegularizationCubicGrad(M, mho, σ)\nsub_problem = DefaultManoptProblem(TangentSpaceAtPoint(M, p), ManifoldGradientObjective(g, grad_g))","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"where mho is the Hessian objective of f to solve. 
Then use this for the sub_problem keyword and use your favourite gradient-based solver for the sub_state keyword, for example a ConjugateGradientDescentState","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Additional-Stopping-Criteria","page":"Adaptive Regularization with Cubics","title":"Additional Stopping Criteria","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"StopWhenAllLanczosVectorsUsed\nStopWhenFirstOrderProgress","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.StopWhenAllLanczosVectorsUsed","page":"Adaptive Regularization with Cubics","title":"Manopt.StopWhenAllLanczosVectorsUsed","text":"StopWhenAllLanczosVectorsUsed <: StoppingCriterion\n\nWhen an inner iteration has used up all Lanczos vectors, then this stopping criterion is a fallback / security stopping criterion in order to not access a non-existing field in the array allocated for vectors.\n\nNote that this stopping criterion (for now) is only implemented for the case of an AdaptiveRegularizationState using a LanczosState subsolver\n\nFields\n\nmaxLanczosVectors – maximal number of Lanczos vectors\nreason – a String indicating the reason if the criterion indicated to stop\n\nConstructor\n\nStopWhenAllLanczosVectorsUsed(maxLanczosVectors::Int)\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.StopWhenFirstOrderProgress","page":"Adaptive Regularization with Cubics","title":"Manopt.StopWhenFirstOrderProgress","text":"StopWhenFirstOrderProgress <: StoppingCriterion\n\nA stopping criterion related to the Riemannian adaptive regularization with cubics (ARC) solver indicating that the model function at the current (outer) iterate, i.e.\n\n m(X) = f(p) + X operatornamegradf(p)\n + frac12 X operatornameHess f(p)X + fracσ3 lVert X rVert^3\n\ndefined 
on the tangent space T_pmathcal M fulfills at the current iterate X_k that\n\nm(X_k) leq m(0)\nquadtext and quad\nlVert operatornamegrad m(X_k) rVert θ lVert X_k rVert^2\n\nFields\n\nθ – the factor θ in the second condition above\nreason – a String indicating the reason if the criterion indicated to stop\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Literature","page":"Adaptive Regularization with Cubics","title":"Literature","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"Pages = [\"solvers/adaptive-regularization-with-cubics.md\"]\nCanonical=false","category":"page"},{"location":"solvers/trust_regions/#trust_regions","page":"Trust-Regions Solver","title":"The Riemannian Trust-Regions Solver","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"The aim is to solve an optimization problem on a manifold","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"operatorname*min_x mathcalM F(x)","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"by using the Riemannian trust-regions solver. It is the number-one choice for smooth optimization. This trust-region method uses the Steihaug-Toint truncated conjugate-gradient method truncated_conjugate_gradient_descent to solve the inner minimization problem called the trust-regions subproblem. This inner solver can be preconditioned by providing a preconditioner (symmetric and positive definite, an approximation of the inverse of the Hessian of F). 
If no Hessian of the cost function F is provided, a standard approximation of the Hessian based on the gradient operatornamegradF with ApproxHessianFiniteDifference will be computed.","category":"page"},{"location":"solvers/trust_regions/#Initialization","page":"Trust-Regions Solver","title":"Initialization","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Initialize x_0 = x with an initial point x on the manifold. It can be given by the caller or set randomly. Set the initial trust-region radius Delta = frac18 barDelta, where barDelta is the maximum radius the trust-region can have. Usually one uses the root of the manifold's dimension operatornamedim(mathcalM). For accepting the next iterate and evaluating the new trust-region radius, one needs an accept/reject threshold rho 0frac14), which is rho = 0.1 by default. Set k=0.","category":"page"},{"location":"solvers/trust_regions/#Iteration","page":"Trust-Regions Solver","title":"Iteration","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Repeat until a convergence criterion is reached","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Set η as a random tangent vector if using the randomized approach. Else set η as the zero vector in the tangent space T_x_kmathcalM.\nSet η^* as the solution of the trust-region subproblem, computed by the tcg-method with η as initial vector.\nIf using the randomized approach, compare η^* with the Cauchy point η_c^* = -tau_c fracDeltalVert operatornameGradF (x_k) rVert_x_k operatornameGradF (x_k) by the model function m_x_k(). 
If the model decrease is larger by using the Cauchy point, set η^* = η_c^*.\nSet x^* = operatornameretr_x_k(η^*).\nSet rho = fracF(x_k)-F(x^*)m_x_k(η)-m_x_k(η^*), where m_x_k() describes the quadratic model function.\nUpdate the trust-region radius:Delta = begincasesfrac14 Delta text if rho frac14 textor m_x_k(η)-m_x_k(η^*) leq 0 textor rho = pm fty operatornamemin(2 Delta barDelta) text if rho frac34 textand the tcg-method stopped because of negative curvature or exceeding the trust-regionDelta textotherwiseendcases\nIf m_x_k(η)-m_x_k(η^*) geq 0 and rho rho set x_k = x^*.\nSet k = k+1.","category":"page"},{"location":"solvers/trust_regions/#Result","page":"Trust-Regions Solver","title":"Result","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"The result is given by the last computed x_k.","category":"page"},{"location":"solvers/trust_regions/#Remarks","page":"Trust-Regions Solver","title":"Remarks","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To the initialization: a random point on the manifold.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 1: using a randomized approach means using a random tangent vector as initial vector for the approximate solve of the trust-regions subproblem. If this is the case, keep in mind that the vector must be in the trust-region radius. This is achieved by multiplying η by sqrt(4,eps(Float64)) as long as its norm is greater than the current trust-region radius Delta. 
If the randomized approach is not used, one takes the zero tangent vector.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 2: obtain η^* by (approximately) solving the trust-regions subproblem","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"operatorname*argmin_η T_x_kmathcalM m_x_k(η) = F(x_k) +\nlangle operatornamegradF(x_k) η rangle_x_k + frac12 langle\noperatornameHessF(η)_ x_k η rangle_x_k","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"textst langle η η rangle_x_k leq Delta^2","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"with the Steihaug-Toint truncated conjugate-gradient (tcg) method. The problem as well as the solution method is described in the truncated_conjugate_gradient_descent. In this inner solver, the stopping criterion StopWhenResidualIsReducedByFactorOrPower is used, so that superlinear or at least linear convergence in the trust-region method can be achieved.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 3: if using a random tangent vector as an initial vector, compare the result of the tcg-method with the Cauchy point. Convergence proofs assume that one achieves at least (a fraction of) the reduction of the Cauchy point. The idea is to go in the direction of the gradient to an optimal point. This can be on the edge, but also before. 
The parameter tau_c for the optimal length is defined by","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"tau_c = begincases 1 langle operatornameGradF (x_k) \noperatornameHessF (η_k)_ x_krangle_x_k leq 0 \noperatornamemin(fracoperatornamenorm(operatornameGradF (x_k))^3\nDelta langle operatornameGradF (x_k) \noperatornameHessF (η_k)_ x_krangle_x_k 1) textotherwise\nendcases","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To check the model decrease one compares","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"m_x_k(η_c^*) = F(x_k) + langle η_c^*\noperatornameGradF (x_k)rangle_x_k + frac12langle η_c^*\noperatornameHessF (η_c^*)_ x_krangle_x_k","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"with","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"m_x_k(η^*) = F(x_k) + langle η^*\noperatornameGradF (x_k)rangle_x_k + frac12langle η^*\noperatornameHessF (η^*)_ x_krangle_x_k","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"If m_x_k(η_c^*) m_x_k(η^*) then m_x_k(η_c^*) is the better choice.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 4: operatornameretr_x_k() denotes the retraction, a mapping operatornameretr_x_kT_x_kmathcalM rightarrow mathcalM which approximates the exponential map. 
In some cases it is cheaper to use this instead of the exponential.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 6: one knows that the truncated_conjugate_gradient_descent algorithm stopped for these reasons when the stopping criteria StopWhenCurvatureIsNegative, StopWhenTrustRegionIsExceeded are activated.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 7: the last step is to decide if the new point x^* is accepted.","category":"page"},{"location":"solvers/trust_regions/#Interface","page":"Trust-Regions Solver","title":"Interface","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"trust_regions\ntrust_regions!","category":"page"},{"location":"solvers/trust_regions/#Manopt.trust_regions","page":"Trust-Regions Solver","title":"Manopt.trust_regions","text":"trust_regions(M, f, grad_f, hess_f, p)\ntrust_regions(M, f, grad_f, p)\n\nrun the Riemannian trust-regions solver for optimization on manifolds to minimize f cf. [Absil, Baker, Gallivan, FoCM, 2006; Conn, Gould, Toint, SIAM, 2000].\n\nFor the case that no hessian is provided, the Hessian is computed using finite difference, see ApproxHessianFiniteDifference. 
For solving the inner trust-region subproblem of finding an update-vector, see truncated_conjugate_gradient_descent.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional), the hessian operatornameHessF(x) T_xmathcal M T_xmathcal M, X operatornameHessF(x)X = _ξoperatornamegradf(x)\np – an initial value x mathcal M\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient and hessian work by allocation (default) or InplaceEvaluation in place\nmax_trust_region_radius – the maximum trust-region radius\npreconditioner – a preconditioner (a symmetric, positive definite operator that should approximate the inverse of the Hessian)\nrandomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\nproject! : (copyto!) specify a projection operation for tangent vectors within the TCG for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.\nretraction – (default_retraction_method(M, typeof(p))) approximation of the exponential map\nstopping_criterion – (StopWhenAny(StopAfterIteration(1000), StopWhenGradientNormLess(10^(-6)))) a functor inheriting from StoppingCriterion indicating when to stop.\ntrust_region_radius - the initial trust-region radius\nρ_prime – Accept/reject threshold: if ρ (the performance ratio for the iterate) is at least ρ', the outer iteration is accepted. Otherwise, it is rejected. In case it is rejected, the trust-region radius will have been decreased. To ensure this, ρ' must satisfy 0 <= ρ' < 1/4. If ρ_prime is negative, the algorithm is not guaranteed to produce monotonically decreasing cost values. 
It is strongly recommended to set ρ' > 0, to aid convergence.\nρ_regularization – Close to convergence, evaluating the performance ratio ρ is numerically challenging. Meanwhile, close to convergence, the quadratic model should be a good fit and the steps should be accepted. Regularization lets ρ go to 1 as the model decrease and the actual decrease go to zero. Set this option to zero to disable regularization (not recommended). When this is not zero, it may happen that the iterates produced are not monotonically improving the cost when very close to convergence. This is because the corrected cost improvement could change sign if it is negative but very small.\nθ – (1.0) 1+θ is the superlinear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, which computes an approximate solution for the trust-region subproblem. The tCG-method aborts if the residual is less than or equal to the initial residual to the power of 1+θ.\nκ – (0.1) the linear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, which computes an approximate solution for the trust-region subproblem. 
The method aborts if the residual is less than or equal to κ times the initial residual.\nreduction_threshold – (0.1) Trust-region reduction threshold: if ρ (the performance ratio for the iterate) is less than this bound, the trust-region radius, and thus the trust region, decreases.\naugmentation_threshold – (0.75) Trust-region augmentation threshold: if ρ (the performance ratio for the iterate) is greater than this and further conditions apply, the trust-region radius, and thus the trust region, increases.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\nsee also\n\ntruncated_conjugate_gradient_descent\n\n\n\n\n\n","category":"function"},{"location":"solvers/trust_regions/#Manopt.trust_regions!","page":"Trust-Regions Solver","title":"Manopt.trust_regions!","text":"trust_regions!(M, f, grad_f, Hess_f, p; kwargs...)\ntrust_regions!(M, f, grad_f, p; kwargs...)\n\nevaluate the Riemannian trust-regions solver in place of p.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional) the hessian H( mathcal M x ξ) of F\np – an initial value p mathcal M\n\nFor the case that no Hessian is provided, the Hessian is computed using finite differences, see ApproxHessianFiniteDifference.\n\nfor more details and all options, see trust_regions\n\n\n\n\n\n","category":"function"},{"location":"solvers/trust_regions/#State","page":"Trust-Regions Solver","title":"State","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"TrustRegionsState","category":"page"},{"location":"solvers/trust_regions/#Manopt.TrustRegionsState","page":"Trust-Regions Solver","title":"Manopt.TrustRegionsState","text":"TrustRegionsState <: AbstractHessianSolverState\n\ndescribe the trust-regions solver, with\n\nFields\n\nwhere all but p are keyword arguments in the 
constructor\n\np : the current iterate\nstop : (StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6))\nmax_trust_region_radius : (sqrt(manifold_dimension(M))) the maximum trust-region radius\nproject! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. Per default, no projection is performed; set it to project! to activate projection.\nrandomize : (false) indicates if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\nρ_prime : (0.1) a lower bound of the performance ratio for the iterate that decides if the iteration will be accepted or not. If not, the trust-region radius will have been decreased. To ensure this, ρ'>= 0 must be strictly smaller than 1/4. If ρ' is negative, the algorithm is not guaranteed to produce monotonically decreasing cost values. It is strongly recommended to set ρ' > 0, to aid convergence.\nρ_regularization : (10000.0) Close to convergence, evaluating the performance ratio ρ is numerically challenging. Meanwhile, close to convergence, the quadratic model should be a good fit and the steps should be accepted. Regularization lets ρ go to 1 as the model decrease and the actual decrease go to zero. Set this option to zero to disable regularization (not recommended). When this is not zero, it may happen that the iterates produced are not monotonically improving the cost when very close to convergence. 
This is because the corrected cost improvement could change sign if it is negative but very small.\ntrust_region_radius : the (initial) trust-region radius\n\nConstructor\n\nTrustRegionsState(M,\n p=rand(M),\n X=zero_vector(M,p),\n sub_state=TruncatedConjugateGradientState(M, p, X),\n\n)\n\nconstruct a trust-regions state with all other fields from above being keyword arguments\n\nSee also\n\ntrust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Approximation-of-the-Hessian","page":"Trust-Regions Solver","title":"Approximation of the Hessian","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"We currently provide a few different methods to approximate the Hessian.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"ApproxHessianFiniteDifference\nApproxHessianSymmetricRankOne\nApproxHessianBFGS","category":"page"},{"location":"solvers/trust_regions/#Manopt.ApproxHessianFiniteDifference","page":"Trust-Regions Solver","title":"Manopt.ApproxHessianFiniteDifference","text":"ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: AbstractApproxHessian\n\nA functor to approximate the Hessian by a finite difference of gradient evaluations.\n\nGiven a point p and a direction X and the gradient operatornamegradF mathcal M to Tmathcal M of a function F the Hessian is approximated as follows: Let c be a stepsize, X T_pmathcal M a tangent vector and q = operatornameretr_p(fracclVert X rVert_pX) be a step in direction X of length c following a retraction. Then we approximate the Hessian by the finite difference of the gradients, where mathcal T_cdotgetscdot is a vector transport.\n\noperatornameHessF(p)X\n \nfraclVert X rVert_pcBigl( mathcal T_pgets qbigl(operatornamegradF(q)bigr) - operatornamegradF(p)Bigr)\n\nFields\n\ngradient!! 
the gradient function (either allocating or mutating, see evaluation parameter)\nstep_length a step length for the finite difference\nretraction_method - a retraction to use\nvector_transport_method a vector transport to use\n\nInternal temporary fields\n\ngrad_tmp a temporary storage for the gradient at the current p\ngrad_dir_tmp a temporary storage for the gradient at the current p_dir\np_dir::P a temporary storage for the forward direction (i.e. q above)\n\nConstructor\n\nApproxHessianFiniteDifference(M, p, grad_f; kwargs...)\n\nKeyword arguments\n\nevaluation (AllocatingEvaluation) whether the gradient is given as an allocating function or an in-place (InplaceEvaluation).\nsteplength (2^-14) step length c to approximate the gradient evaluations\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use in the approximation.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Manopt.ApproxHessianSymmetricRankOne","page":"Trust-Regions Solver","title":"Manopt.ApproxHessianSymmetricRankOne","text":"ApproxHessianSymmetricRankOne{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian\n\nA functor to approximate the Hessian by the symmetric rank one update.\n\nFields\n\ngradient!! 
the gradient function (either allocating or mutating, see evaluation parameter).\nν a small real number to ensure that the denominator in the update does not become too small and thus the method does not break down.\nvector_transport_method a vector transport to use.\n\nInternal temporary fields\n\np_tmp a temporary storage for the current point p.\ngrad_tmp a temporary storage for the gradient at the current p.\nmatrix a temporary storage for the matrix representation of the approximating operator.\nbasis a temporary storage for an orthonormal basis at the current p.\n\nConstructor\n\nApproxHessianSymmetricRankOne(M, p, gradF; kwargs...)\n\nKeyword arguments\n\ninitial_operator (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix representation of the initial approximating operator.\nbasis (DefaultOrthonormalBasis()) an orthonormal basis in the tangent space of the initial iterate p.\nnu (-1)\nevaluation (AllocatingEvaluation) whether the gradient is given as an allocating function or an in-place (InplaceEvaluation).\nvector_transport_method (ParallelTransport()) vector transport mathcal T_cdotgetscdot to use.\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Manopt.ApproxHessianBFGS","page":"Trust-Regions Solver","title":"Manopt.ApproxHessianBFGS","text":"ApproxHessianBFGS{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian\n\nA functor to approximate the Hessian by the BFGS update.\n\nFields\n\ngradient!! 
the gradient function (either allocating or mutating, see evaluation parameter).\nscale\nvector_transport_method a vector transport to use.\n\nInternal temporary fields\n\np_tmp a temporary storage for the current point p.\ngrad_tmp a temporary storage for the gradient at the current p.\nmatrix a temporary storage for the matrix representation of the approximating operator.\nbasis a temporary storage for an orthonormal basis at the current p.\n\nConstructor\n\nApproxHessianBFGS(M, p, gradF; kwargs...)\n\nKeyword arguments\n\ninitial_operator (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix representation of the initial approximating operator.\nbasis (DefaultOrthonormalBasis()) an orthonormal basis in the tangent space of the initial iterate p.\nnu (-1)\nevaluation (AllocatingEvaluation) whether the gradient is given as an allocating function or an in-place (InplaceEvaluation).\nvector_transport_method (ParallelTransport()) vector transport mathcal T_cdotgetscdot to use.\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"as well as their (non-exported) common supertype","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Manopt.AbstractApproxHessian","category":"page"},{"location":"solvers/trust_regions/#Manopt.AbstractApproxHessian","page":"Trust-Regions Solver","title":"Manopt.AbstractApproxHessian","text":"AbstractApproxHessian <: Function\n\nAn abstract supertype for approximate Hessian functions; it declares them also to be functions.\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Literature","page":"Trust-Regions Solver","title":"Literature","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Pages = 
[\"solvers/trust_regions.md\"]\nCanonical=false","category":"page"},{"location":"plans/debug/#DebugSection","page":"Debug Output","title":"Debug Output","text":"","category":"section"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"Debug output can easily be added to any solver run. On the high level interfaces, like gradient_descent, you can just use the debug= keyword.","category":"page"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"Modules = [Manopt]\nPages = [\"plans/debug.jl\"]\nOrder = [:type, :function]\nPrivate = true","category":"page"},{"location":"plans/debug/#Manopt.DebugAction","page":"Debug Output","title":"Manopt.DebugAction","text":"DebugAction\n\nA DebugAction is a small functor to print/issue debug output. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s, where i is the current iterate.\n\nBy convention i=0 is interpreted as \"For Initialization only\", i.e. only debug info that prints initialization reacts, i<0 triggers updates of variables internally but does not trigger any output. Finally typemin(Int) is used to indicate a call from stop_solver! that returns true afterwards.\n\nFields (assumed by subtypes to exist)\n\nprint method to perform the actual print. Can for example be set to a file export,\n\nor to @info. The default is the print function on the default Base.stdout.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugChange","page":"Debug Output","title":"Manopt.DebugChange","text":"DebugChange(M=DefaultManifold())\n\ndebug for the amount of change of the iterate (stored in get_iterate(o) of the AbstractManoptSolverState) during the last iteration. 
See DebugEntryChange for the general case\n\nKeyword Parameters\n\nstorage – (StoreStateAction( [:Gradient] )) – (eventually shared) the storage of the previous action\nprefix – (\"Last Change:\") prefix of the debug output (ignored if you set format)\nio – (stdout) default stream to print the debug to.\nformat - ( \"$prefix %f\") format to print the output using an sprintf format.\ninverse_retraction_method - (default_inverse_retraction_method(M)) the inverse retraction to be used for approximating distance.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugCost","page":"Debug Output","title":"Manopt.DebugCost","text":"DebugCost <: DebugAction\n\nprint the current cost function value, see get_cost.\n\nConstructors\n\nDebugCost()\n\nParameters\n\nformat - (\"$prefix %f\") format to print the output using sprintf and a prefix (see long).\nio – (stdout) default stream to print the debug to.\nlong - (false) short form to set the format to f(x): (default) or current cost: and the cost\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugDivider","page":"Debug Output","title":"Manopt.DebugDivider","text":"DebugDivider <: DebugAction\n\nprint a small divider (default \" | \").\n\nConstructor\n\nDebugDivider(div,print)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugEntry","page":"Debug Output","title":"Manopt.DebugEntry","text":"DebugEntry <: DebugAction\n\nprint a certain field's entry of type {T} during the iterates, where a format can be specified for how to print the entry.\n\nAdditional Fields\n\nfield – Symbol the entry can be accessed with within AbstractManoptSolverState\n\nConstructor\n\nDebugEntry(f; prefix=\"$f:\", format = \"$prefix %s\", io=stdout)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugEntryChange","page":"Debug Output","title":"Manopt.DebugEntryChange","text":"DebugEntryChange{T} <: DebugAction\n\nprint a certain entry's change during the iterates\n\nAdditional 
Fields\n\nprint – (print) function to print the result\nprefix – (\"Change of :Iterate\") prefix to the print out\nformat – (\"$prefix %e\") format to print (uses the prefix by default and scientific notation)\nfield – Symbol the field can be accessed with within AbstractManoptSolverState\ndistance – function (p,o,x1,x2) to compute the change/distance between two values of the entry\nstorage – a StoreStateAction to store the previous value of :f\n\nConstructors\n\nDebugEntryChange(f,d)\n\nKeyword arguments\n\nio (stdout) an IOStream\nprefix (\"Change of $f\")\nstorage (StoreStateAction((f,))) a StoreStateAction\ninitial_value an initial value for the change of o.field.\nformat – (\"$prefix %e\") format to print the change\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugEvery","page":"Debug Output","title":"Manopt.DebugEvery","text":"DebugEvery <: DebugAction\n\nevaluate and print debug only every ith iteration. Otherwise no print is performed. Whether internal variables are updated is determined by always_update.\n\nThis method does not perform any print itself but relies on its children's prints.\n\nConstructor\n\nDebugEvery(d::DebugAction, every=1, always_update=true)\n\nInitialise the DebugEvery.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugGradientChange","page":"Debug Output","title":"Manopt.DebugGradientChange","text":"DebugGradientChange()\n\ndebug for the amount of change of the gradient (stored in get_gradient(o) of the AbstractManoptSolverState o) during the last iteration. 
See DebugEntryChange for the general case\n\nKeyword Parameters\n\nstorage – (StoreStateAction( (:Gradient,) )) – (eventually shared) the storage of the previous action\nprefix – (\"Last Change:\") prefix of the debug output (ignored if you set format)\nio – (stdout) default stream to print the debug to.\nformat - ( \"$prefix %f\") format to print the output using an sprintf format.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugGroup","page":"Debug Output","title":"Manopt.DebugGroup","text":"DebugGroup <: DebugAction\n\ngroup a set of DebugActions into one action, where the internal prints are removed by default and the resulting strings are concatenated\n\nConstructor\n\nDebugGroup(g)\n\nconstruct a group consisting of an Array of DebugActions g, that are evaluated en bloc; the method does not perform any print itself, but relies on the internal prints. It still concatenates the result and returns the complete string\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugIfEntry","page":"Debug Output","title":"Manopt.DebugIfEntry","text":"DebugIfEntry <: DebugAction\n\nIssue a warning, info or error if a certain field does not pass a check\n\nFields\n\nio – an io stream\ncheck – a function that takes the value of the field as input and returns a boolean\nfield – Symbol the entry can be accessed with within AbstractManoptSolverState\nmsg - if the check fails, this message is displayed\ntype – Symbol specifying the type of display, possible values :print, :warn, :info, :error, where :print prints to io.\n\nConstructor\n\nDebugIfEntry(field, check=(>(0)); type=:warn, message=\":$f is nonnegative\", io=stdout)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugIterate","page":"Debug Output","title":"Manopt.DebugIterate","text":"DebugIterate <: DebugAction\n\ndebug for the current iterate (stored in get_iterate(o)).\n\nConstructor\n\nDebugIterate()\n\nParameters\n\nio – (stdout) default stream to print the 
debug to.\nlong::Bool whether to print x: or current iterate\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugIteration","page":"Debug Output","title":"Manopt.DebugIteration","text":"DebugIteration <: DebugAction\n\nConstructor\n\nDebugIteration()\n\nKeyword parameters\n\nformat - (\"# %-6d\") format to print the output using an sprintf format.\nio – (stdout) default stream to print the debug to.\n\ndebug for the current iteration (prefixed with # by default)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugMessages","page":"Debug Output","title":"Manopt.DebugMessages","text":"DebugMessages <: DebugAction\n\nAn AbstractManoptSolverState or one of its substeps like a Stepsize might generate warnings throughout their computations. This debug can be used to :print them, display them as :info or :warning, or even as an :error, depending on the message type.\n\nConstructor\n\nDebugMessages(mode=:Info; io::IO=stdout)\n\nInitialize the messages debug to a certain mode. Available modes are\n\n:Error – issue the messages as an error and hence stop at any issue occurring\n:Info – issue the messages as an @info\n:Print – print messages to the stream io.\n:Warning – issue the messages as a warning\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugSolverState","page":"Debug Output","title":"Manopt.DebugSolverState","text":"DebugSolverState <: AbstractManoptSolverState\n\nThe debug options append to any options a debug functionality, i.e. they act as a decorator pattern. Internally a Dictionary is kept that stores a DebugAction for several occasions using a Symbol as reference. 
The default occasion is :All and for example solvers join this field with :Start, :Step and :Stop at the beginning, every iteration or the end of the algorithm, respectively.\n\nThe original options can still be accessed using the get_state function.\n\nFields (defaults in brackets)\n\noptions – the options that are extended by debug information\ndebugDictionary – a Dict{Symbol,DebugAction} to keep track of Debug for different actions\n\nConstructors\n\nDebugSolverState(o,dA)\n\nconstruct debug decorated options, where dA can be\n\na DebugAction, then it is stored within the dictionary at :All\nan Array of DebugActions, then it is stored as a debugDictionary within :All.\na Dict{Symbol,DebugAction}.\nan Array of Symbols, Strings and an Int for the DebugFactory\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugStoppingCriterion","page":"Debug Output","title":"Manopt.DebugStoppingCriterion","text":"DebugStoppingCriterion <: DebugAction\n\nprint the Reason provided by the stopping criterion. Usually this should be empty, unless the algorithm stops.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugTime","page":"Debug Output","title":"Manopt.DebugTime","text":"DebugTime()\n\nMeasure time and print the intervals. Using start=true you can start the timer on construction, for example to measure the runtime of an algorithm overall (adding)\n\nThe measured time is rounded using the given time_accuracy and printed after canonicalization.\n\nKeyword Parameters\n\nprefix – (\"Last Change:\") prefix of the debug output (ignored if you set format)\nio – (stdout) default stream to print the debug to.\nformat - ( \"$prefix %s\") format to print the output using an sprintf format, where %s is the canonicalized time.\nmode – (:cumulative) whether to display the total time or reset on every call using :iterative.\nstart – (false) indicate whether to start the timer on creation or not. 
Otherwise it might only be started on the first call.\ntime_accuracy – (Millisecond(1)) round the time to this period before printing the canonicalized time\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWarnIfCostIncreases","page":"Debug Output","title":"Manopt.DebugWarnIfCostIncreases","text":"DebugWarnIfCostIncreases <: DebugAction\n\nprint a warning if the cost increases.\n\nNote that this provides an additional warning for gradient descent with its default constant step size.\n\nConstructor\n\nDebugWarnIfCostIncreases(warn=:Once; tol=1e-13)\n\nInitialize the warning to warning level (:Once) and introduce a tolerance for the test of 1e-13.\n\nThe warn level can be set to :Once to only warn the first time the cost increases, to :Always to report an increase every time it happens, and it can be set to :No to deactivate the warning, then this DebugAction is inactive. All other symbols are handled as if they were :Always.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWarnIfCostNotFinite","page":"Debug Output","title":"Manopt.DebugWarnIfCostNotFinite","text":"DebugWarnIfCostNotFinite <: DebugAction\n\nA debug to see when a field (value or array) within the AbstractManoptSolverState is or contains values that are not finite, for example Inf or NaN.\n\nConstructor\n\nDebugWarnIfCostNotFinite(field::Symbol, warn=:Once)\n\nInitialize the warning to warn :Once.\n\nThis can be set to :Once to only warn the first time the cost is NaN. It can also be set to :No to deactivate the warning, but this makes this Action also useless. 
All other symbols are handled as if they were :Always.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWarnIfFieldNotFinite","page":"Debug Output","title":"Manopt.DebugWarnIfFieldNotFinite","text":"DebugWarnIfFieldNotFinite <: DebugAction\n\nA debug to see when a field from the options is not finite, for example Inf or NaN\n\nConstructor\n\nDebugWarnIfFieldNotFinite(field::Symbol, warn=:Once)\n\nInitialize the warning to warn :Once.\n\nThis can be set to :Once to only warn the first time the cost is NaN. It can also be set to :No to deactivate the warning, but this makes this Action also useless. All other symbols are handled as if they were :Always.\n\nExample\n\nDebugWarnIfFieldNotFinite(:Gradient)\n\nCreates a DebugAction to track whether the gradient contains NaN or Inf values.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWhenActive","page":"Debug Output","title":"Manopt.DebugWhenActive","text":"DebugWhenActive <: DebugAction\n\nevaluate and print debug only if the active boolean is set. 
This can be set from outside and is for example triggered by DebugEvery on debugs of the subsolver.\n\nThis method does not perform any print itself but relies on its children's prints.\n\nFor now, the main interaction is with DebugEvery which might activate or deactivate this debug\n\nFields\n\nalways_update – whether or not to call the other debugs with iteration -1 in an active state\nactive – a boolean that can be (de)activated from outside to enable/disable debug\n\nConstructor\n\nDebugWhenActive(d::DebugAction, active=true, always_update=true)\n\nInitialise the DebugWhenActive.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugActionFactory-Tuple{String}","page":"Debug Output","title":"Manopt.DebugActionFactory","text":"DebugActionFactory(s)\n\ncreate a DebugAction where\n\na String yields the corresponding divider\na DebugAction is passed through\na Symbol creates a DebugEntry of that symbol, with the exceptions of :Change, :Iterate, :Iteration, and :Cost.\na Tuple{Symbol,String} creates a DebugEntry of that symbol where the String specifies the format.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.DebugActionFactory-Tuple{Symbol}","page":"Debug Output","title":"Manopt.DebugActionFactory","text":"DebugActionFactory(s::Symbol)\n\nConvert certain Symbols in the debug=[ ... ] vector to DebugActions. Currently the following ones are done. 
Note that the Shortcut symbols should all start with a capital letter.\n\n:Cost creates a DebugCost\n:Change creates a DebugChange\n:GradientChange creates a DebugGradientChange\n:GradientNorm creates a DebugGradientNorm\n:Iterate creates a DebugIterate\n:Iteration creates a DebugIteration\n:IterativeTime creates a DebugTime(:Iterative)\n:Stepsize creates a DebugStepsize\n:WarnCost creates a DebugWarnIfCostNotFinite\n:WarnGradient creates a DebugWarnIfFieldNotFinite for the :Gradient.\n:Time creates a DebugTime\n:WarningMessages creates a DebugMessages(:Warning)\n:InfoMessages creates a DebugMessages(:Info)\n:ErrorMessages creates a DebugMessages(:Error)\n:Messages creates a DebugMessages() (i.e. the same as :InfoMessages)\n\nany other symbol creates a DebugEntry(s) to print the entry (o.:s) from the options.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.DebugActionFactory-Tuple{Tuple{Symbol, String}}","page":"Debug Output","title":"Manopt.DebugActionFactory","text":"DebugActionFactory(t::Tuple{Symbol,String})\n\nConvert certain Symbols in the debug=[ ... ] vector to DebugActions. Currently the following ones are done, where the string in t[2] is passed as the format to the corresponding debug. 
Note that the Shortcut symbols t[1] should all start with a capital letter.\n\n:Cost creates a DebugCost\n:Change creates a DebugChange\n:GradientChange creates a DebugGradientChange\n:Iterate creates a DebugIterate\n:Iteration creates a DebugIteration\n:Stepsize creates a DebugStepsize\n:Time creates a DebugTime\n:IterativeTime creates a DebugTime(:Iterative)\n\nany other symbol creates a DebugEntry(s) to print the entry (o.:s) from the options.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.DebugFactory-Tuple{Vector}","page":"Debug Output","title":"Manopt.DebugFactory","text":"DebugFactory(a)\n\ngiven an array of Symbols, Strings, DebugActions and Ints\n\nThe symbol :Stop creates an entry to display the stopping criterion at the end (:Stop => DebugStoppingCriterion()), for further symbols see DebugActionFactory\nThe symbol :Subsolver wraps all dictionary entries with DebugWhenActive that can be set from outside.\nTuples of a symbol and a string can be used to also specify a format, see DebugActionFactory\nany string creates a DebugDivider\nany DebugAction is directly included\nan Integer k introduces that debug is only printed every kth iteration\n\nReturn value\n\nThis function returns a dictionary with an entry :All containing one general DebugAction, possibly a DebugGroup of entries. It might contain an entry :Start, :Step, :Stop with an action (each) to specify what to do at the start, after a step or at the end of an algorithm, respectively. On all three occasions the :All action is executed. Note that only the :Stop entry is actually filled when specifying the :Stop symbol.\n\nExample\n\nThe array\n\n[:Iterate, \" | \", :Cost, :Stop, 10]\n\nAdds a group to :All of three actions (DebugIteration, DebugDivider with \" | \" to display, DebugCost) as a DebugGroup inside a DebugEvery to only be executed every 10th iteration. 
It also adds the DebugStoppingCriterion to the :Stop entry of the dictionary.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.reset!-Tuple{DebugTime}","page":"Debug Output","title":"Manopt.reset!","text":"reset!(d::DebugTime)\n\nreset the internal time of a DebugTime, that is, start from now again.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.stop!-Tuple{DebugTime}","page":"Debug Output","title":"Manopt.stop!","text":"stop!(d::DebugTime)\n\nstop and reset the internal time of a DebugTime, that is, set the time to 0 (undefined)\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Technical-Details:-The-Debug-Solver","page":"Debug Output","title":"Technical Details: The Debug Solver","text":"","category":"section"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"The decorator to print debug during the iterations can be activated by decorating the state of a solver and implementing your own DebugActions. For example, printing a gradient from the GradientDescentState is automatically available, as explained in the gradient_descent solver.","category":"page"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"initialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)\nstep_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\nstop_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i::Int)","category":"page"},{"location":"plans/debug/#Manopt.initialize_solver!-Tuple{AbstractManoptProblem, DebugSolverState}","page":"Debug Output","title":"Manopt.initialize_solver!","text":"initialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)\n\nExtend the initialization of the solver by a hook to run debug actions that were added to the :Start and :All entries of the debug lists.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.step_solver!-Tuple{AbstractManoptProblem, DebugSolverState, Any}","page":"Debug 
Output","title":"Manopt.step_solver!","text":"step_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\n\nExtend the ith step of the solver by a hook to run debug prints that were added to the :Step and :All entries of the debug lists.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.stop_solver!-Tuple{AbstractManoptProblem, DebugSolverState, Int64}","page":"Debug Output","title":"Manopt.stop_solver!","text":"stop_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\n\nExtend the check whether to stop the solver by a hook to run debug actions that were added to the :Stop and :All entries of the debug lists.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Stepsize","page":"Stepsize","title":"Stepsize and Linesearch","text":"","category":"section"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Most iterative algorithms determine a direction along which the algorithm will proceed and determine a step size to find the next iterate. How advanced the step size computation can be implemented depends (among others) on the properties the corresponding problem provides.","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Within Manopt.jl, the step size determination is implemented as a functor which is a subtype of Stepsize, based on","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Stepsize","category":"page"},{"location":"plans/stepsize/#Manopt.Stepsize","page":"Stepsize","title":"Manopt.Stepsize","text":"Stepsize\n\nAn abstract type for the functors representing step sizes, i.e. they are callable structures. The naming scheme is TypeOfStepSize, e.g. 
ConstantStepsize.\n\nEvery Stepsize has to provide a constructor and its function has to have the interface (p,o,i) where a AbstractManoptProblem as well as AbstractManoptSolverState and the current number of iterations are the arguments and returns a number, namely the stepsize to use.\n\nSee also\n\nLinesearch\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Usually, a constructor should take the manifold M as its first argument, for consistency, to allow general step size functors to be set up based on default values that might depend on the manifold currently under consideration.","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Currently, the following step sizes are available","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Modules = [Manopt]\nPages = [\"plans/stepsize.jl\"]\nOrder = [:type,:function]\nFilter = t -> t != Stepsize","category":"page"},{"location":"plans/stepsize/#Manopt.AdaptiveWNGradient","page":"Stepsize","title":"Manopt.AdaptiveWNGradient","text":"AdaptiveWNGradient <: DirectionUpdateRule\n\nRepresent an adaptive gradient method introduced by Grapiglia,Stella, J. Optim. 
Theory Appl., 2023.\n\nGiven a positive threshold hat c ∈ mathbb N, a minimal bound b_mathrmmin > 0, an initial b_0 ≥ b_mathrmmin, and a gradient reduction factor threshold α ∈ [0,1).\n\nSet c_0=0 and use omega_0 = lVert operatornamegrad f(p_0) rVert_p_0.\n\nFor the first iterate we use the initial step size s_0 = frac1b_0\n\nThen, given the last gradient X_k-1 = operatornamegrad f(x_k-1), and a previous omega_k-1, the values (b_k omega_k c_k) are computed using X_k = operatornamegrad f(p_k) and the following cases\n\nIf lVert X_k rVert_p_k leq alphaomega_k-1, then let hat b_k-1 in b_mathrmminb_k-1 and set\n\n(b_k omega_k c_k) = begincases\nbigl(hat b_k-1 lVert X_krVert_p_k 0 bigr) text if c_k-1+1 = hat c\nBigl(b_k-1 + fraclVert X_krVert_p_k^2b_k-1 omega_k-1 c_k-1+1 Bigr) text if c_k-1+1hat c\nendcases\n\nIf lVert X_k rVert_p_k > alphaomega_k-1, then set\n\n(b_k omega_k c_k) =\nBigl( b_k-1 + fraclVert X_krVert_p_k^2b_k-1 omega_k-1 0 Bigr)\n\nand return the step size s_k = frac1b_k.\n\nNote that for α=0 this is the Riemannian variant of WNGRad.\n\nFields\n\ncount_threshold::Int (4) an Integer for hat c\nminimal_bound::Float64 (1e-4) for b_mathrmmin\nalternate_bound::Function ((bk, hat_c) -> min(gradient_bound, max(minimal_bound, bk/(3*hat_c)))) how to determine hat b_k as a function of (bmin, bk, hat_c) -> hat_bk\ngradient_reduction::Float64 (0.9)\ngradient_bound norm(M, p0, grad_f(M,p0)) the bound b_k.\n\nas well as the internal fields\n\nweight for ω_k initialised to ω_0 = norm(M, p0, grad_f(M,p0)) if this is not zero, 1.0 otherwise.\ncount for the c_k, initialised to c_0 = 0.\n\nConstructor\n\nAdaptiveWNGrad(M=DefaultManifold, grad_f=(M,p) -> zero_vector(M,rand(M)), p=rand(M); kwargs...)\n\nwhere all of the above fields with defaults are keyword arguments. 
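The case distinction above can be illustrated by a small, self-contained Julia sketch of a single (b_k, ω_k, c_k) update (a simplification for illustration only, not the Manopt.jl implementation; the function name wn_update and the concrete choice of hat b_k-1 are assumptions):

```julia
# One step of the adaptive (b, w, c) update sketched above; `normX` stands in
# for the gradient norm lVert X_k rVert. The choice of hat b below is one
# admissible value in [b_min, b]; Manopt.jl makes it configurable.
function wn_update(b, w, c, normX; alpha=0.9, c_hat=4, b_min=1e-4)
    if normX <= alpha * w          # sufficient gradient reduction
        if c + 1 == c_hat          # counter reached the threshold: reset weight
            return (max(b_min, b / (3 * c_hat)), normX, 0)
        else                       # keep the weight, increase the counter
            return (b + normX^2 / b, w, c + 1)
        end
    else                           # no sufficient reduction: grow b, reset counter
        return (b + normX^2 / b, w, 0)
    end
end

b, w, c = wn_update(1.0, 1.0, 0, 2.0)  # gradient norm grew: (5.0, 1.0, 0)
s = 1 / b                              # step size for the next iterate
```

For alpha = 0 the first branch fires only for a vanishing gradient, recovering the plain WNGrad-type update.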
Additional keyword arguments are\n\nadaptive (true) switches the gradient_reduction α to 0 when set to false.\nevaluation (AllocatingEvaluation()) specifies whether the gradient (that is used for initialisation only) is mutating or allocating\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.ArmijoLinesearch","page":"Stepsize","title":"Manopt.ArmijoLinesearch","text":"ArmijoLinesearch <: Linesearch\n\nA functor representing Armijo line search including the last run's state, i.e. a last step size.\n\nFields\n\ninitial_stepsize – (1.0) an initial step size\nretraction_method – (default_retraction_method(M)) the retraction to use\ncontraction_factor – (0.95) exponent for line search reduction\nsufficient_decrease – (0.1) gain within Armijo's rule\nlast_stepsize – (initial_stepsize) the last step size we start the search with\ninitial_guess - ((p,s,i,l) -> l) based on an AbstractManoptProblem p, an AbstractManoptSolverState s, a current iterate i and a last step size l, this returns an initial guess. 
The default uses the last obtained stepsize.\n\nFurthermore the following fields act as safeguards\n\nstop_when_stepsize_less - (0.0) smallest stepsize when to stop (the last one before is taken)\nstop_when_stepsize_exceeds - (max_stepsize(M, p)) – largest stepsize when to stop.\nstop_increasing_at_step - (100) last step to increase the stepsize (phase 1),\nstop_decreasing_at_step - (1000) last step to decrease the stepsize (phase 2),\n\nPass :Messages to a debug= to see @infos when these happen.\n\nConstructor\n\nArmijoLinesearch(M=DefaultManifold())\n\nwith the fields above as keyword arguments, where the retraction is set to the default retraction on M.\n\nThe constructors return the functor to perform Armijo line search, where two interfaces are available:\n\nbased on a tuple (amp, ams, i) of an AbstractManoptProblem amp, an AbstractManoptSolverState ams and a current iterate i.\nwith (M, x, F, gradFx[,η=-gradFx]) -> s where M, a current point x, a function F that maps from the manifold to the reals, its gradient (a tangent vector) gradFx=operatornamegradF(x) at x and an optional search direction tangent vector η=-gradFx are the arguments.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.ConstantStepsize","page":"Stepsize","title":"Manopt.ConstantStepsize","text":"ConstantStepsize <: Stepsize\n\nA functor that always returns a fixed step size.\n\nFields\n\nlength – constant value for the step size\ntype - a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant.\n\nConstructors\n\nConstantStepsize(s::Real, t::Symbol=:relative)\n\ninitialize the stepsize to a constant s of type t.\n\nConstantStepsize(M::AbstractManifold=DefaultManifold(2);\n stepsize=injectivity_radius(M)/2, type::Symbol=:relative\n)\n\ninitialize the stepsize to a constant stepsize, which by default is half the injectivity radius, unless the radius is infinity, then the default step 
size is 1.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.DecreasingStepsize","page":"Stepsize","title":"Manopt.DecreasingStepsize","text":"DecreasingStepsize()\n\nA functor that represents several decreasing step sizes\n\nFields\n\nlength – (1) the initial step size l.\nfactor – (1) a value f to multiply the initial step size with every iteration\nsubtrahend – (0) a value a that is subtracted every iteration\nexponent – (1) a value e the current iteration numbers eth exponential is taken of\nshift – (0) shift the denominator iterator i by s`.\ntype - a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant.\n\nIn total the complete formulae reads for the ith iterate as\n\ns_i = frac(l - i a)f^i(i+s)^e\n\nand hence the default simplifies to just s_i = fracli\n\nConstructor\n\nDecreasingStepsize(l=1,f=1,a=0,e=1,s=0,type=:relative)\n\nAlternatively one can also use the following keyword.\n\nDecreasingStepsize(\n M::AbstractManifold=DefaultManifold(3);\n length=injectivity_radius(M)/2, multiplier=1.0, subtrahend=0.0,\n exponent=1.0, shift=0, type=:relative\n)\n\ninitializes all fields above, where none of them is mandatory and the length is set to half and to 1 if the injectivity radius is infinite.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.Linesearch","page":"Stepsize","title":"Manopt.Linesearch","text":"Linesearch <: Stepsize\n\nAn abstract functor to represent line search type step size determinations, see Stepsize for details. One example is the ArmijoLinesearch functor.\n\nCompared to simple step sizes, the linesearch functors provide an interface of the form (p,o,i,η) -> s with an additional (but optional) fourth parameter to provide a search direction; this should default to something reasonable, e.g. 
the negative gradient.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.NonmonotoneLinesearch","page":"Stepsize","title":"Manopt.NonmonotoneLinesearch","text":"NonmonotoneLinesearch <: Linesearch\n\nA functor representing a nonmonotone line search using the Barzilai-Borwein step size Iannazzo, Porcelli, IMA J. Numer. Anal., 2017. Together with a gradient descent algorithm this line search represents the Riemannian Barzilai-Borwein with nonmonotone line-search (RBBNMLS) algorithm. We shifted the order of the algorithm steps from the paper by Iannazzo and Porcelli so that in each iteration we first find\n\ny_k = operatornamegradF(x_k) - operatornameT_x_k-1 x_k(operatornamegradF(x_k-1))\n\nand\n\ns_k = - α_k-1 * operatornameT_x_k-1 x_k(operatornamegradF(x_k-1))\n\nwhere α_k-1 is the step size computed in the last iteration and operatornameT is a vector transport. We then find the Barzilai–Borwein step size\n\nα_k^textBB = begincases\nmin(α_textmax max(α_textmin τ_k)) textif s_k y_k_x_k 0\nα_textmax textelse\nendcases\n\nwhere\n\nτ_k = fracs_k s_k_x_ks_k y_k_x_k\n\nif the direct strategy is chosen,\n\nτ_k = fracs_k y_k_x_ky_k y_k_x_k\n\nin case of the inverse strategy and an alternation between the two in case of the alternating strategy. Then we find the smallest h = 0 1 2 such that\n\nF(operatornameretr_x_k(- σ^h α_k^textBB operatornamegradF(x_k)))\nleq\nmax_1 j min(k+1m) F(x_k+1-j) - γ σ^h α_k^textBB operatornamegradF(x_k) operatornamegradF(x_k)_x_k\n\nwhere σ is a step length reduction factor (01), m is the number of iterations after which the function value has to be lower than the current one and γ is the sufficient decrease parameter (01). 
We can then find the new stepsize by\n\nα_k = σ^h α_k^textBB\n\nFields\n\ninitial_stepsize – (1.0) the step size we start the search with\nmemory_size – (10) number of iterations after which the cost value needs to be lower than the current one\nbb_min_stepsize – (1e-3) lower bound for the Barzilai-Borwein step size greater than zero\nbb_max_stepsize – (1e3) upper bound for the Barzilai-Borwein step size greater than min_stepsize\nretraction_method – (ExponentialRetraction()) the retraction to use\nstrategy – (direct) defines if the new step size is computed using the direct, inverse or alternating strategy\nstorage – (for :Iterate and :Gradient) a StoreStateAction\nstepsize_reduction – (0.5) step size reduction factor contained in the interval (0,1)\nsufficient_decrease – (1e-4) sufficient decrease parameter contained in the interval (0,1)\nvector_transport_method – (ParallelTransport()) the vector transport method to use\n\nFurthermore the following fields act as safeguards\n\nstop_when_stepsize_less - (0.0) smallest stepsize when to stop (the last one before is taken)\nstop_when_stepsize_exceeds - (max_stepsize(M, p)) – largest stepsize when to stop.\nstop_increasing_at_step - (100) last step to increase the stepsize (phase 1),\nstop_decreasing_at_step - (1000) last step to decrease the stepsize (phase 2),\n\nPass :Messages to a debug= to see @infos when these happen.\n\nConstructor\n\nNonmonotoneLinesearch()\n\nwith the fields above in their order as optional arguments (deprecated).\n\nNonmonotoneLinesearch(M)\n\nwith the fields above in their order as keyword arguments and where the retraction and vector transport are set to the default ones on M, respectively.\n\nThe constructors return the functor to perform nonmonotone line search.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.WolfePowellBinaryLinesearch","page":"Stepsize","title":"Manopt.WolfePowellBinaryLinesearch","text":"WolfePowellBinaryLinesearch <: 
Linesearch\n\nA Linesearch method that determines a step size t fulfilling the Wolfe conditions\n\nbased on a binary chop. Let η be a search direction and c_1, c_2 > 0 be two constants. Then with\n\nA(t): f(x_+) ≤ f(x) + c_1 t ⟨operatornamegrad f(x), η⟩_x\nquadtextandquad\nW(t): ⟨operatornamegrad f(x_+), textV_x_+gets xη⟩_x_+ ≥ c_2 ⟨η, operatornamegrad f(x)⟩_x\n\nwhere x_+ = operatornameretr_x(tη) is the current trial point, and textV is a vector transport, we perform the following algorithm, similar to Algorithm 7 from Huang, Thesis, 2014\n\nset α=0, β=∞ and t=1.\nWhile either A(t) does not hold or W(t) does not hold do steps 3-5.\nIf A(t) fails, set β=t.\nIf A(t) holds but W(t) fails, set α=t.\nIf β < ∞ set t=fracα+β2, otherwise set t=2α.\n\nConstructors\n\nThere exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and vector transport are used. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility.\n\nWolfePowellLinesearch(\n M=DefaultManifold(),\n c1::Float64=10^(-4),\n c2::Float64=0.999;\n retraction_method = default_retraction_method(M),\n vector_transport_method = default_vector_transport(M),\n linesearch_stopsize = 0.0\n)\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.WolfePowellLinesearch","page":"Stepsize","title":"Manopt.WolfePowellLinesearch","text":"WolfePowellLinesearch <: Linesearch\n\nDo a backtracking linesearch to find a step size α that fulfils the Wolfe conditions along a search direction η starting from x, i.e.\n\nfbigl( operatornameretr_x(αη) bigr) ≤ f(x) + c_1 α ⟨operatornamegrad f(x), η⟩_x\nquadtextandquad\nfracmathrmdmathrmdt fbigl(operatornameretr_x(tη)bigr)\nBigvert_t=α\n ≥ c_2 fracmathrmdmathrmdt fbigl(operatornameretr_x(tη)bigr)Bigvert_t=0\n\nConstructors\n\nThere exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and 
vector transport are used. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility. Note that the linesearch_stopsize to stop for too small stepsizes is only available in the new signature including M.\n\nWolfePowellLinesearch(\n M,\n c1::Float64=10^(-4),\n c2::Float64=0.999;\n retraction_method = default_retraction_method(M),\n vector_transport_method = default_vector_transport(M),\n linesearch_stopsize = 0.0\n)\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.default_stepsize-Tuple{AbstractManifold, Type{<:AbstractManoptSolverState}}","page":"Stepsize","title":"Manopt.default_stepsize","text":"default_stepsize(M::AbstractManifold, ams::AbstractManoptSolverState)\n\nReturns the default Stepsize functor used when running the solver specified by the AbstractManoptSolverState ams with an objective on the AbstractManifold M.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Manopt.get_stepsize-Tuple{AbstractManoptProblem, AbstractManoptSolverState, Vararg{Any}}","page":"Stepsize","title":"Manopt.get_stepsize","text":"get_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)\n\nreturn the stepsize stored within AbstractManoptSolverState ams when solving the AbstractManoptProblem amp. 
This method also works for decorated options and the Stepsize function within the options, by default stored in o.stepsize.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Manopt.linesearch_backtrack-Union{Tuple{T}, Tuple{TF}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any, AbstractRetractionMethod}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any, AbstractRetractionMethod, T}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any, AbstractRetractionMethod, T, Any}} where {TF, T}","page":"Stepsize","title":"Manopt.linesearch_backtrack","text":"(s, msg) = linesearch_backtrack(\n M, F, x, gradFx, s, decrease, contract, retr, η = -gradFx, f0 = F(x);\n stop_when_stepsize_less=0.0,\n stop_when_stepsize_exceeds=max_stepsize(M, p),\n stop_increasing_at_step = 100,\n stop_decreasing_at_step = 1000,\n)\n\nperform a linesearch for\n\na manifold M\na cost function f,\nan iterate p\nthe gradient operatornamegradF(x)\nan initial stepsize s usually called γ\na sufficient decrease\na contraction factor σ\na retraction, which defaults to the default_retraction_method(M)\na search direction η = -operatornamegradF(x)\nan offset, f_0 = F(x)\n\nAnd use the 4 keywords to limit the maximal increase and decrease steps as well as a maximal stepsize (especially on non-Hadamard manifolds) and a minimal one.\n\nReturn value\n\nA stepsize s and a message msg (in case any of the 4 criteria hit)\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Manopt.max_stepsize-Tuple{AbstractManifold, Any}","page":"Stepsize","title":"Manopt.max_stepsize","text":"max_stepsize(M::AbstractManifold, p)\nmax_stepsize(M::AbstractManifold)\n\nGet the maximum stepsize (at point p) on manifold M. 
It should be used to limit the distance an algorithm is trying to move in a single step.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Literature","page":"Stepsize","title":"Literature","text":"","category":"section"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Pages = [\"plans/stepsize.md\"]\nCanonical=false","category":"page"},{"location":"tutorials/Optimize!/#Get-Started:-Optimize!","page":"Get started: Optimize!","title":"Get Started: Optimize!","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"In this tutorial, we will both introduce the basics of optimisation on manifolds as well as how to use Manopt.jl to perform optimisation on manifolds in Julia.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"For more theoretical background, see e.g. [Car92] for an introduction to Riemannian manifolds and [AMS08] or [Bou23] to read more about optimisation thereon.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Let mathcal M denote a Riemannian manifold and let fcolon mathcal M ℝ be a cost function. 
We aim to compute a point p^* where f is minimal or, in other words, p^* is a minimizer of f.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"We also write this as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatorname*argmin_pin mathcal M f(p)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"and would like to find p^* numerically. As an example we take the generalisation of the (arithmetic) mean. In the Euclidean case with din mathbb N, that is for nin mathbb N data points y_1ldotsy_n in mathbb R^d, the mean","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" frac1n sum_i=1^n y_i","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"cannot be directly generalised to data q_1ldotsq_n, since on a manifold we do not have an addition. But the mean can also be characterised as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatorname*argmin_xinmathbb R^d frac12nsum_i=1^n lVert x - y_irVert^2","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"and using the Riemannian distance d_mathcal M, this can be written on Riemannian manifolds. 
We obtain the Riemannian Center of Mass [Kar77]","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatorname*argmin_pinmathcal M\n frac12n sum_i=1^n d_mathcal M^2(p q_i)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Fortunately the gradient can be computed and is","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatornamegrad f(p) = frac1n sum_i=1^n -log_p q_i","category":"page"},{"location":"tutorials/Optimize!/#Loading-the-necessary-packages","page":"Get started: Optimize!","title":"Loading the necessary packages","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Let’s assume you have already installed both Manopt and Manifolds in Julia (using e.g. using Pkg; Pkg.add([\"Manopt\", \"Manifolds\"])). 
Then we can get started by loading both packages – and Random for persistency in this tutorial.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"using Manopt, Manifolds, Random, LinearAlgebra\nRandom.seed!(42);","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Now assume we are on the Sphere mathcal M = mathbb S^2 and we generate some random points “around” some initial point p","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"n = 100\nσ = π / 8\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Now we can define the cost function f and its (Riemannian) gradient operatornamegrad f for the Riemannian center of mass:","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"f(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)));","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"and just call gradient_descent. 
For a first start, we do not have to provide more than the manifold, the cost, the gradient, and a starting point, which we just set to the first data point","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m1 = gradient_descent(M, f, grad_f, data[1])","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"3-element Vector{Float64}:\n 0.6868392794790367\n 0.006531600680668244\n 0.7267799820834814","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"In order to get more details, we further add the debug= keyword argument, which acts as a decorator pattern.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"This way we can easily specify a certain debug to be printed. The goal is to get an output of the form","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"# i | Last Change: [...] | F(x): [...] |","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"but where we also want to fix the display format for the change and the cost numbers (the [...]) to have a certain format. Furthermore, the reason why the solver stopped should be printed at the end","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"These can easily be specified using either a Symbol – using the default format for numbers – or a tuple of a symbol and a format-string in the debug= keyword that is available for every solver. 
We can also – for illustration reasons – just look at the first 6 steps by setting a stopping_criterion=","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m2 = gradient_descent(M, f, grad_f, data[1];\n debug=[:Iteration,(:Change, \"|Δp|: %1.9f |\"),\n (:Cost, \" F(x): %1.11f | \"), \"\\n\", :Stop],\n stopping_criterion = StopAfterIteration(6)\n )","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial F(x): 0.32487988924 | \n# 1 |Δp|: 1.063609017 | F(x): 0.25232524046 | \n# 2 |Δp|: 0.809858671 | F(x): 0.20966960102 | \n# 3 |Δp|: 0.616665145 | F(x): 0.18546505598 | \n# 4 |Δp|: 0.470841764 | F(x): 0.17121604104 | \n# 5 |Δp|: 0.359345690 | F(x): 0.16300825911 | \n# 6 |Δp|: 0.274597420 | F(x): 0.15818548927 | \nThe algorithm reached its maximal number of iterations (6).\n\n3-element Vector{Float64}:\n 0.7533872481682505\n -0.060531070555836314\n 0.6547851890466334","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"See here for the list of available symbols.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"!!! info \\\"Technical Detail\\\" The debug= keyword is actually a list of DebugActions added to every iteration, even allowing you to write your own. Additionally, :Stop is an action added to the end of the solver to display the reason why the solver stopped.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"The default stopping criterion for gradient_descent is to either stop when the gradient is small (<1e-9) or a maximal number of iterations is reached (as a fallback). Combining stopping-criteria can be done by | or &. 
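For example, sc = StopWhenGradientNormLess(1e-9) | StopAfterIteration(200) stops as soon as either criterion is met. The logic behind | can be sketched in plain Julia (a self-contained illustration of the combination, not the Manopt.jl implementation; the predicate names are made up):

```julia
# Each criterion is a predicate of the iteration state; combining with | (or)
# stops when either fires, while & (and) would require both to fire.
stop_small_grad(i, gnorm) = gnorm < 1e-9   # like StopWhenGradientNormLess(1e-9)
stop_max_iter(i, gnorm) = i >= 200         # like StopAfterIteration(200)
stop_or(i, gnorm) = stop_small_grad(i, gnorm) || stop_max_iter(i, gnorm)

stop_or(201, 1.0)    # true: iteration cap reached
stop_or(5, 1e-12)    # true: gradient norm already small
stop_or(5, 1.0)      # false: keep iterating
```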
We further pass a number 25 to debug= to print an output only every 25th iteration:","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m3 = gradient_descent(M, f, grad_f, data[1];\n debug=[:Iteration,(:Change, \"|Δp|: %1.9f |\"),\n (:Cost, \" F(x): %1.11f | \"), \"\\n\", :Stop, 25],\n stopping_criterion = StopWhenGradientNormLess(1e-14) | StopAfterIteration(400),\n)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial F(x): 0.32487988924 | \n# 25 |Δp|: 0.459715605 | F(x): 0.15145076374 | \n# 50 |Δp|: 0.000551270 | F(x): 0.15145051509 | \nThe algorithm reached approximately critical point after 70 iterations; the gradient norm (9.399656483458736e-16) is less than 1.0e-14.\n\n3-element Vector{Float64}:\n 0.6868392794788667\n 0.006531600680779304\n 0.726779982083641","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"We can finally use another way to determine the stepsize, for example a little more expensive ArmijoLinesearch than the default stepsize rule used on the Sphere.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m4 = gradient_descent(M, f, grad_f, data[1];\n debug=[:Iteration,(:Change, \"|Δp|: %1.9f |\"),\n (:Cost, \" F(x): %1.11f | \"), \"\\n\", :Stop, 2],\n stepsize = ArmijoLinesearch(M; contraction_factor=0.999, sufficient_decrease=0.5),\n stopping_criterion = StopWhenGradientNormLess(1e-14) | StopAfterIteration(400),\n)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial F(x): 0.32487988924 | \n# 2 |Δp|: 0.001318138 | F(x): 0.15145051509 | \n# 4 |Δp|: 0.000000004 | F(x): 0.15145051509 | \n# 6 |Δp|: 0.000000000 | F(x): 0.15145051509 | \n# 8 |Δp|: 0.000000000 | F(x): 
0.15145051509 | \nThe algorithm reached approximately critical point after 8 iterations; the gradient norm (6.7838288590006e-15) is less than 1.0e-14.\n\n3-element Vector{Float64}:\n 0.6868392794788671\n 0.006531600680779187\n 0.726779982083641","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Then we reach approximately the same point as in the previous run, but in far less steps","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"[f(M, m3)-f(M,m4), distance(M, m3, m4)]","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"2-element Vector{Float64}:\n 2.7755575615628914e-16\n 4.592670164656332e-16","category":"page"},{"location":"tutorials/Optimize!/#Example-2:-Computing-the-median-of-symmetric-positive-definite-matrices.","page":"Get started: Optimize!","title":"Example 2: Computing the median of symmetric positive definite matrices.","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"For the second example let’s consider the manifold of 3 3 symmetric positive definite matrices and again 100 random points","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"N = SymmetricPositiveDefinite(3)\nm = 100\nσ = 0.005\nq = Matrix{Float64}(I, 3, 3)\ndata2 = [exp(N, q, σ * rand(N; vector_at=q)) for i in 1:m];","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Instead of the mean, let’s consider a non-smooth optimisation task: The median can be generalized to Manifolds as the minimiser of the sum of distances, see e.g. [Bac14]. 
We define","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"g(N, q) = sum(1 / (2 * m) * distance.(Ref(N), Ref(q), data2))","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"g (generic function with 1 method)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Since the function is non-smooth, we cannot use a gradient-based approach. But since for every summand the proximal map is available, we can use the cyclic proximal point algorithm (CPPA). We hence define the vector of proximal maps as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"proxes_g = Function[(N, λ, q) -> prox_distance(N, λ / m, di, q, 1) for di in data2];","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Besides looking at some debug prints, we can also easily record these values. Similarly to debug=, record= also accepts Symbols, see list here, to indicate things to record. 
We further set return_state to true to obtain not just the (approximate) minimizer.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"s = cyclic_proximal_point(N, g, proxes_g, data2[1];\n debug=[:Iteration,\" | \",:Change,\" | \",(:Cost, \"F(x): %1.12f\"),\"\\n\", 1000, :Stop,\n ],\n record=[:Iteration, :Change, :Cost, :Iterate],\n return_state=true,\n );","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial | | F(x): 0.005875512856\n# 1000 | Last Change: 0.003704 | F(x): 0.003239019699\n# 2000 | Last Change: 0.000015 | F(x): 0.003238996105\n# 3000 | Last Change: 0.000005 | F(x): 0.003238991748\n# 4000 | Last Change: 0.000002 | F(x): 0.003238990225\n# 5000 | Last Change: 0.000001 | F(x): 0.003238989520\nThe algorithm reached its maximal number of iterations (5000).","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"!!! note \\\"Technical Detail\\\" The recording is realised by RecordActions that are (also) executed at every iteration. 
These can also be individually implemented and added to the record= array instead of symbols.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"\nFirst, the computed median can be accessed as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"median = get_solver_result(s)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"3×3 Matrix{Float64}:\n 1.0 2.12236e-5 0.000398721\n 2.12236e-5 1.00044 0.000141798\n 0.000398721 0.000141798 1.00041","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"but we can also look at the recorded values. For simplicity (of output), let's just look at the recorded values at iteration 42","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"get_record(s)[42]","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"(42, 1.0569455861045147e-5, 0.0032525477393699743, [0.9998583866917474 0.00020988803126553712 0.0002895445818457687; 0.0002098880312654816 1.0000931572564826 0.00020843715016866105; 0.00028954458184579646 0.00020843715016866105 1.0000709207432568])","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"But we can also access whole series and see that the cost does not decrease that fast; actually, the CPPA might converge relatively slowly. 
For that we can for example access the :Cost that was recorded every :Iterate as well as the (maybe a little boring) :Iteration-number in a semilogplot.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"x = get_record(s, :Iteration, :Iteration)\ny = get_record(s, :Iteration, :Cost)\nusing Plots\nplot(x,y,xaxis=:log, label=\"CPPA Cost\")","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"(Image: )","category":"page"},{"location":"tutorials/Optimize!/#Literature","page":"Get started: Optimize!","title":"Literature","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Pages = [\"tutorials/Optimize!.md\"]\nCanonical=false","category":"page"},{"location":"#Welcome-to-Manopt.jl","page":"Home","title":"Welcome to Manopt.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = Manopt","category":"page"},{"location":"","page":"Home","title":"Home","text":"Manopt.Manopt","category":"page"},{"location":"#Manopt.Manopt","page":"Home","title":"Manopt.Manopt","text":"🏔️ Manopt.jl – Optimization on Manifolds in Julia.\n\n📚 Documentation: manoptjl.org\n📦 Repository: github.com/JuliaManifolds/Manopt.jl\n💬 Discussions: github.com/JuliaManifolds/Manopt.jl/discussions\n🎯 Issues: github.com/JuliaManifolds/Manopt.jl/issues\n\n\n\n\n\n","category":"module"},{"location":"","page":"Home","title":"Home","text":"For a function f: ℳ → ℝ defined on a Riemannian manifold ℳ we aim to solve","category":"page"},{"location":"","page":"Home","title":"Home","text":"argmin_{p ∈ ℳ} f(p)","category":"page"},{"location":"","page":"Home","title":"Home","text":"or in other words: find the point p on the manifold where f reaches its minimal function 
value.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Manopt.jl provides a framework for optimization on manifolds as well as a library of optimization algorithms in Julia. It belongs to the “Manopt family”, which includes Manopt (Matlab) and pymanopt.org (Python).","category":"page"},{"location":"","page":"Home","title":"Home","text":"If you want to delve right into Manopt.jl, check out the Get started: Optimize! tutorial.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Manopt.jl makes it easy to use an algorithm for your favourite manifold as well as a manifold for your favourite algorithm. It already provides many manifolds and algorithms, which can easily be enhanced, for example to record certain data or debug output throughout iterations.","category":"page"},{"location":"","page":"Home","title":"Home","text":"If you use Manopt.jl in your work, please cite the following","category":"page"},{"location":"","page":"Home","title":"Home","text":"@article{Bergmann2022,\n Author = {Ronny Bergmann},\n Doi = {10.21105/joss.03866},\n Journal = {Journal of Open Source Software},\n Number = {70},\n Pages = {3866},\n Publisher = {The Open Journal},\n Title = {Manopt.jl: Optimization on Manifolds in {J}ulia},\n Volume = {7},\n Year = {2022},\n}","category":"page"},{"location":"","page":"Home","title":"Home","text":"To refer to a certain version or the source code in general, we recommend citing, for example,","category":"page"},{"location":"","page":"Home","title":"Home","text":"@software{manoptjl-zenodo-mostrecent,\n Author = {Ronny Bergmann},\n Copyright = {MIT License},\n Doi = {10.5281/zenodo.4290905},\n Publisher = {Zenodo},\n Title = {Manopt.jl},\n Year = {2022},\n}","category":"page"},{"location":"","page":"Home","title":"Home","text":"for the most recent version or a corresponding version-specific DOI, see the list of all versions. 
Note that both citations are in BibLaTeX format.","category":"page"},{"location":"#Main-Features","page":"Home","title":"Main Features","text":"","category":"section"},{"location":"#Optimization-Algorithms-(Solvers)","page":"Home","title":"Optimization Algorithms (Solvers)","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"For every optimization algorithm, a solver is implemented based on an AbstractManoptProblem that describes the problem to solve and an AbstractManoptSolverState that sets up the solver and stores interim values. Together they form a plan.","category":"page"},{"location":"#Manifolds","page":"Home","title":"Manifolds","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This project is built upon ManifoldsBase.jl, a generic interface to implement manifolds. Certain functions are extended for specific manifolds from Manifolds.jl, but all other manifolds from that package can be used here, too.","category":"page"},{"location":"","page":"Home","title":"Home","text":"The notation in the documentation aims to follow the notation of these packages.","category":"page"},{"location":"#Functions-on-Manifolds","page":"Home","title":"Functions on Manifolds","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Several functions are available, implemented for arbitrary manifolds: cost functions, differentials and their adjoints, and gradients as well as proximal maps.","category":"page"},{"location":"#Visualization","page":"Home","title":"Visualization","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"To visualize and interpret results, Manopt.jl aims to provide both easy plot functions as well as exports. Furthermore, there is a system to get debug output during the iterations of an algorithm as well as record capabilities, i.e. to record a specified tuple of values per iteration, most prominently RecordCost and RecordIterate. 
Take a look at the Get Started: Optimize! tutorial on how to easily activate this.","category":"page"},{"location":"#Literature","page":"Home","title":"Literature","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"If you want to get started with manifolds, one book is [Car92], and if you want to directly dive into optimization on manifolds, good references are [AMS08] and [Bou23], which are both available online for free.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Pages = [\"index.md\"]\nCanonical=false","category":"page"},{"location":"references/#Literature","page":"References","title":"Literature","text":"","category":"section"},{"location":"references/","page":"References","title":"References","text":"This is all literature mentioned / referenced in the Manopt.jl documentation. Usually you will find a small reference section at the end of every documentation page that contains references.","category":"page"},{"location":"references/","page":"References","title":"References","text":"","category":"page"},{"location":"tutorials/StochasticGradientDescent/#How-to-Run-Stochastic-Gradient-Descent","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"","category":"section"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"This tutorial illustrates how to use the stochastic_gradient_descent solver and different DirectionUpdateRules in order to introduce the average or momentum variant, see Stochastic Gradient Descent.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic 
Gradient Descent","text":"Computationally, we look at a very simple but large-scale problem, the Riemannian Center of Mass or Fréchet mean: for given points p_i ∈ ℳ, i = 1, …, N, this optimization problem reads","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"argmin_{x ∈ ℳ} 1/2 ∑_{i=1}^N d²_ℳ(x, p_i)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"which of course can be (and is) solved by a gradient descent, see the introductory tutorial or Statistics in Manifolds.jl. If N is very large, evaluating the complete gradient might be quite expensive. A remedy is to evaluate only one of the terms at a time and choose a random order for these.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"We first initialize the packages","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"using Manifolds, Manopt, Random, BenchmarkTools\nRandom.seed!(42);","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"We next generate a (little) large(r) data set","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"n = 5000\nσ = π / 12\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 
1:n];","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Note that due to the construction of the points as zero-mean tangent vectors, the mean should be very close to our initial point p.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"In order to use the stochastic gradient, we now need a function that returns the vector of gradients. There are two ways to define it in Manopt.jl: either as a single function that returns a vector, or as a vector of functions.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"The first variant is of course easier to define, but the second is more efficient when only evaluating one of the gradients.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"For the mean, the gradient is","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"grad f(x) = ∑_{i=1}^N grad f_i(x), where grad f_i(x) = -log_x p_i","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"which we define in Manopt.jl in two different ways: either as one function returning all gradients as a vector (see gradF), or – maybe more fitting for a large-scale problem – as a vector of small gradient functions (see gradf)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to 
Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"F(M, p) = 1 / (2 * n) * sum(map(q -> distance(M, p, q)^2, data))\ngradF(M, p) = [grad_distance(M, p, q) for q in data]\ngradf = [(M, p) -> grad_distance(M, q, p) for q in data];\np0 = 1 / sqrt(3) * [1.0, 1.0, 1.0]","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.5773502691896258\n 0.5773502691896258\n 0.5773502691896258","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"The calls are only slightly different, but notice that accessing the 2nd gradient element requires evaluating all logs in the first function, while we only call one of the functions in the second array of functions. So while you can use both gradF and gradf in the following call, the second one is (much) faster:","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt1 = stochastic_gradient_descent(M, gradF, p)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n -0.034408323150541376\n 0.028979490714898942\n -0.2172726573502577","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"@benchmark stochastic_gradient_descent($M, $gradF, $p0)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 1 sample 
with 1 evaluation.\n Single result which took 8.795 s (5.82% GC) to evaluate,\n with a memory estimate of 7.83 GiB, over 100161804 allocations.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt2 = stochastic_gradient_descent(M, gradf, p0)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.37206187599994556\n -0.11462522239619985\n 0.9211031531907937","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"@benchmark stochastic_gradient_descent($M, $gradf, $p0)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 890 samples with 1 evaluation.\n Range (min … max): 5.189 ms … 10.524 ms ┊ GC (min … max): 0.00% … 36.83%\n Time (median): 5.267 ms ┊ GC (median): 0.00%\n Time (mean ± σ): 5.611 ms ± 1.070 ms ┊ GC (mean ± σ): 5.35% ± 11.22%\n\n █▄ \n ██▄▅▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▃▃▂ ▂\n 5.19 ms Histogram: frequency by time 9.33 ms <\n\n Memory estimate: 3.43 MiB, allocs estimate: 50030.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"This result is reasonably close. 
But we can improve it by using a DirectionUpdateRule, namely:","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"On the one hand, there is MomentumGradient, which requires both the manifold and the initial value, in order to keep track of the iterate and parallel transport the last direction to the current iterate. The necessary vector_transport_method keyword is set to a suitable default on every manifold, see default_vector_transport_method. We get","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt3 = stochastic_gradient_descent(\n M, gradf, p0; direction=MomentumGradient(M, p0; direction=StochasticGradient(M))\n)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n -0.6605946566435753\n 0.24633535998595033\n -0.7091781088235515","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"MG = MomentumGradient(M, p0; direction=StochasticGradient(M));\n@benchmark stochastic_gradient_descent($M, $gradf, $p0; direction=$MG)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 200 samples with 1 evaluation.\n Range (min … max): 23.306 ms … 36.966 ms ┊ GC (min … max): 0.00% … 13.75%\n Time (median): 23.815 ms ┊ GC (median): 0.00%\n Time (mean ± σ): 24.993 ms ± 2.260 ms ┊ GC (mean ± σ): 4.76% ± 7.15%\n\n ▃█▂▄ \n ▅▇████▆▆▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▅▄▇▄▄▂▄▃▁▁▁▂ ▃\n 23.3 ms Histogram: frequency by time 29.2 ms 
<\n\n Memory estimate: 11.36 MiB, allocs estimate: 249516.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"And on the other hand the AverageGradient computes an average of the last n gradients, i.e.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt4 = stochastic_gradient_descent(\n M, gradf, p0; direction=AverageGradient(M, p0; n=10, direction=StochasticGradient(M))\n)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.8041185045468516\n 0.08386875203799127\n 0.5885231202569053","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"AG = AverageGradient(M, p0; n=10, direction=StochasticGradient(M));\n@benchmark stochastic_gradient_descent($M, $gradf, $p0; direction=$AG)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 84 samples with 1 evaluation.\n Range (min … max): 55.228 ms … 65.851 ms ┊ GC (min … max): 0.00% … 7.58%\n Time (median): 60.566 ms ┊ GC (median): 8.15%\n Time (mean ± σ): 59.708 ms ± 2.240 ms ┊ GC (mean ± σ): 6.44% ± 3.44%\n\n ▅ █▃ \n ▃▅▇█▃▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆████▇▄▇▅▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▁\n 55.2 ms Histogram: frequency by time 63.7 ms <\n\n Memory estimate: 34.25 MiB, allocs estimate: 569516.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Note 
that the default StoppingCriterion is a fixed number of iterations, which helps the comparison here.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"For both update rules we have to internally specify that we are still in the stochastic setting, since both rules can also be used with the IdentityUpdateRule within gradient_descent.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"For this not-that-large-scale example we can of course also use a gradient descent with ArmijoLinesearch,","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"fullGradF(M, p) = sum(grad_distance(M, q, p) for q in data)\np_opt5 = gradient_descent(M, F, fullGradF, p0; stepsize=ArmijoLinesearch(M))","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.6595265191812062\n 0.1457504051994757\n 0.7374154798218656","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"but it will usually be a little slower","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"AL = ArmijoLinesearch(M);\n@benchmark gradient_descent($M, $F, $fullGradF, $p0; stepsize=$AL)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient 
Descent","text":"BenchmarkTools.Trial: 7 samples with 1 evaluation.\n Range (min … max): 783.478 ms … 805.992 ms ┊ GC (min … max): 7.44% … 7.38%\n Time (median): 786.469 ms ┊ GC (median): 7.51%\n Time (mean ± σ): 789.545 ms ± 7.991 ms ┊ GC (mean ± σ): 7.47% ± 0.06%\n\n ▁ ▁ ▁ █ ▁ ▁ \n █▁▁█▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁\n 783 ms Histogram: frequency by time 806 ms <\n\n Memory estimate: 703.16 MiB, allocs estimate: 9021018.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Note that all 5 runs are very close to each other; here we check the distance to the first","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"EditURL = \"https://github.com/JuliaManifolds/Manopt.jl/blob/master/CONTRIBUTING.md\"","category":"page"},{"location":"contributing/#Contributing-to-Manopt.jl","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"First, thanks for taking the time to contribute. 
Any contribution is appreciated and welcome.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"The following is a set of guidelines for contributing to Manopt.jl.","category":"page"},{"location":"contributing/#Table-of-Contents","page":"Contributing to Manopt.jl","title":"Table of Contents","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"Contributing to Manopt.jl - Table of Contents\nI just have a question\nHow can I file an issue?\nHow can I contribute?\nAdd a missing method\nProvide a new algorithm\nProvide a new example\nCode style","category":"page"},{"location":"contributing/#I-just-have-a-question","page":"Contributing to Manopt.jl","title":"I just have a question","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"The developer can most easily be reached in the Julia Slack channel #manifolds. You can apply for the Julia Slack workspace here if you haven't joined yet. 
You can also ask your question on discourse.julialang.org.","category":"page"},{"location":"contributing/#How-can-I-file-an-issue?","page":"Contributing to Manopt.jl","title":"How can I file an issue?","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"If you found a bug or want to propose a feature, we track our issues within the GitHub repository.","category":"page"},{"location":"contributing/#How-can-I-contribute?","page":"Contributing to Manopt.jl","title":"How can I contribute?","text":"","category":"section"},{"location":"contributing/#Add-a-missing-method","page":"Contributing to Manopt.jl","title":"Add a missing method","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"There are still a lot of methods missing within the optimization framework of Manopt.jl, be it functions, gradients, differentials, proximal maps, step size rules, or stopping criteria. If you notice a method missing and can contribute an implementation, please do so! Even providing a single new method is a good contribution.","category":"page"},{"location":"contributing/#Provide-a-new-algorithm","page":"Contributing to Manopt.jl","title":"Provide a new algorithm","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"A main contribution you can provide is another algorithm that is not yet included in the package. An algorithm is always based on a concrete type of an AbstractManoptProblem storing the main information of the task and a concrete type of an AbstractManoptSolverState storing all information that needs to be known to the solver in general. 
The actual algorithm is split into an initialization phase, see initialize_solver!, and the implementation of the i-th step of the solver itself, see step_solver!. For these two functions, it would be great if a new algorithm uses functions from the ManifoldsBase.jl interface as generically as possible. For example, if possible use retract!(M,q,p,X) in favor of exp!(M,q,p,X) to perform a step starting in p in direction X (in place of q), since the exponential map might be too expensive to evaluate or might not be available on a certain manifold. See Retractions and inverse retractions for more details. Further, if possible, prefer retract!(M,q,p,X) in favor of retract(M,p,X), since a computation in place of a suitable variable q reduces memory allocations.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"Usually, the methods implemented in Manopt.jl also have a high-level interface that is easier to call, creates the necessary problem and options structures, and calls the solver.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"The two technical functions initialize_solver! and step_solver! 
should be documented with technical details, while the high level interface should usually provide a general description and some literature references to the algorithm at hand.","category":"page"},{"location":"contributing/#Provide-a-new-example","page":"Contributing to Manopt.jl","title":"Provide a new example","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"Example problems are available at ManoptExamples.jl, where also their reproducible Quarto-Markdown files are stored.","category":"page"},{"location":"contributing/#Code-style","page":"Contributing to Manopt.jl","title":"Code style","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"We try to follow the documentation guidelines from the Julia documentation as well as Blue Style. We run JuliaFormatter.jl on the repo in the way set in the .JuliaFormatter.toml file, which enforces a number of conventions consistent with the Blue Style.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"We also follow a few internal conventions:","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"It is preferred that the AbstractManoptProblem's struct contains information about the general structure of the problem.\nAny implemented function should be accompanied by its mathematical formulae if a closed form exists.\nAbstractManoptProblem and option structures are stored within the plan/ folder and sorted by properties of the problem and/or solver at hand.\nWithin the source code of one algorithm, the high level interface should be first, then the initialization, then the step.\nOtherwise an alphabetical order is preferable.\nThe above implies that the mutating variant of a function follows the non-mutating 
variant.\nThere should be no dangling = signs.\nAlways add a newline between things of different types (struct/method/const).\nAlways add a newline between methods for different functions (including mutating/nonmutating variants).\nPrefer to have no newline between methods for the same function; when reasonable, merge the docstrings.\nAll import/using/include should be in the main module file.","category":"page"},{"location":"helpers/checks/#Checks","page":"Checks","title":"Checks","text":"","category":"section"},{"location":"helpers/checks/","page":"Checks","title":"Checks","text":"If you have computed a gradient or differential and you are not sure whether it is correct, the following checks can help to verify it.","category":"page"},{"location":"helpers/checks/","page":"Checks","title":"Checks","text":"Modules = [Manopt]\nPages = [\"checks.jl\"]","category":"page"},{"location":"helpers/checks/#Manopt.check_Hessian","page":"Checks","title":"Manopt.check_Hessian","text":"check_Hessian(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p), Y=rand(M, vector_at=p); kwargs...)\n\nCheck numerically whether the Hessian Hess f(M, p, X) of f(M,p) is correct.\n\nFor this we require either a second-order retraction or a critical point p of f.\n\nThat is, we check whether\n\nf(retr_p(tX)) = f(p) + t ⟨grad f(p), X⟩ + t²/2 ⟨Hess f(p)[X], X⟩ + 𝒪(t³)\n\nor in other words, that the error between the function f and its second-order Taylor expansion behaves like 𝒪(t³), which indicates that the Hessian is correct, cf. 
also Section 6.8, Boumal, Cambridge Press, 2023.\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\ncheck_grad – (true) check whether operatornamegrad f(p) in T_pmathcal M.\ncheck_linearity – (true) check whether the Hessian is linear, see is_Hessian_linear using a, b, X, and Y\ncheck_symmetry – (true) check whether the Hessian is symmetric, see is_Hessian_symmetric\ncheck_vector – (false) check whether operatornameHess f(p)X in T_pmathcal M using is_vector.\nmode - (:Default) specify the mode; by default we assume to have a second order retraction given by retraction_method=, but you can also use this mode if you already have a critical point p. Set to :CriticalPoint to use gradient_descent to find a critical point. Note: This requires (and evaluates) new tangent vectors X and Y\natol, rtol – (same defaults as isapprox) tolerances that are passed down to all checks\na, b – two real values to check linearity of the Hessian (if check_linearity=true)\nN - (101) number of points to check within the log_range default range 10^-810^0\nexactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\ngradient - (grad_f(M, p)) instead of the gradient function you can also provide the gradient at p directly\nHessian - (Hess_f(M, p, X)) instead of the Hessian function you can provide the result of operatornameHess f(p)X directly. Note that evaluations of the Hessian might still be necessary for checking linearity and symmetry and/or when using :CriticalPoint mode.\nlimits - ((1e-8,1)) specify the limits in the log_range\nlog_range - (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the Hessian line\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). 
The plot is in log-log-scale. This is returned and can then also be saved.\nretraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\nthrow_error - (false) throw an error message if the Hessian is wrong\nwindow – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.\n\nThe kwargs... are also passed down to the check_vector call, such that tolerances can easily be set.\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.check_differential","page":"Checks","title":"Manopt.check_differential","text":"check_differential(M, F, dF, p=rand(M), X=rand(M; vector_at=p); kwargs...)\n\nCheck numerically whether the differential dF(M,p,X) of F(M,p) is correct.\n\nThis implements the method described in Section 4.8, Boumal, Cambridge Press, 2023.\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\nexactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\nlimits ((1e-8,1)) specify the limits in the log_range\nlog_range (range(limits[1], limits[2]; length=N)) - specify the range of points (in log scale) to sample the differential line\nN (101) – number of points to check within the log_range default range 10^-810^0\nname (\"differential\") – name to display in the check (e.g. if checking differential)\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. 
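To illustrate the check functions documented here, a minimal sketch (assuming Manopt.jl and Manifolds.jl are available; the cost f, its gradient, and its Hessian below are invented toy examples, not taken from this page):

```julia
using Manopt, Manifolds

# Toy example on Euclidean space: f(p) = |p|^2 with known derivatives.
M = Euclidean(2)
f(M, p) = sum(p .^ 2)
grad_f(M, p) = 2 .* p       # exact gradient, so the slope test should pass
Hess_f(M, p, X) = 2 .* X    # exact (linear, symmetric) Hessian

p = [1.0, 2.0]
check_gradient(M, f, grad_f, p)          # error should decay like O(t^2)
check_Hessian(M, f, grad_f, Hess_f, p)   # error should decay like O(t^3)
```

Passing `plot=true` (with Plots.jl loaded) additionally visualizes the slope in log-log scale.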
This is returned and can then also be saved.\nretraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\nthrow_error - (false) throw an error message if the differential is wrong\nwindow – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.check_gradient","page":"Checks","title":"Manopt.check_gradient","text":"check_gradient(M, F, gradF, p=rand(M), X=rand(M; vector_at=p); kwargs...)\n\nCheck numerically whether the gradient gradF(M,p) of F(M,p) is correct, that is whether\n\nf(operatornameretr_p(tX)) = f(p) + toperatornamegrad f(p) X + mathcal O(t^2)\n\nor in other words, that the error between the function f and its first order Taylor expansion behaves like mathcal O(t^2), which indicates that the gradient is correct, cf. also Section 4.8, Boumal, Cambridge Press, 2023.\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\ncheck_vector – (true) check whether operatornamegrad f(p) in T_pmathcal M using is_vector.\nexactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\ngradient - (grad_f(M, p)) instead of the gradient function you can also provide the gradient at p directly\nlimits - ((1e-8,1)) specify the limits in the log_range\nlog_range - (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the gradient line\nN - (101) number of points to check within the log_range default range 10^-810^0\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. 
This is returned and can then also be saved.\nretraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\natol, rtol – (same defaults as isapprox) tolerances that are passed down to is_vector if check_vector is set to true\nthrow_error - (false) throw an error message if the gradient is wrong\nwindow – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.\n\nThe kwargs... are also passed down to the check_vector call, such that tolerances can easily be set.\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.find_best_slope_window","page":"Checks","title":"Manopt.find_best_slope_window","text":"(a,b,i,j) = find_best_slope_window(X,Y,window=nothing; slope=2.0, slope_tol=0.1)\n\nCheck data X,Y for the largest contiguous interval (window) with a regression line fitting “best”. Among all intervals with a slope within slope_tol to slope the longest one is taken. If no such interval exists, the one with the slope closest to slope is taken.\n\nIf the window is set to nothing (default), all window sizes 2,...,length(X) are checked. You can also specify a window size or an array of window sizes.\n\nFor each window size, all its translates in the data are checked. For all these (shifted) windows the regression line is computed (i.e. 
a,b in a + t*b) and the best line is computed.\n\nFrom the best line the following data is returned\n\na, b specifying the regression line a + t*b\ni, j determining the window, i.e. the regression line stems from data X[i], ..., X[j]\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.is_Hessian_linear","page":"Checks","title":"Manopt.is_Hessian_linear","text":"is_Hessian_linear(M, Hess_f, p,\n X=rand(M; vector_at=p), Y=rand(M; vector_at=p), a=randn(), b=randn();\n throw_error=false, io=nothing, kwargs...\n)\n\nCheck whether the Hessian function Hess_f fulfills linearity, i.e. that\n\noperatornameHess f(p)aX + bY = aoperatornameHess f(p)X\n + boperatornameHess f(p)Y\n\nwhich is checked using isapprox and the kwargs... are passed to this function.\n\nOptional Arguments\n\nthrow_error - (false) throw an error message if the Hessian is wrong\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.is_Hessian_symmetric","page":"Checks","title":"Manopt.is_Hessian_symmetric","text":"is_Hessian_symmetric(M, Hess_f, p=rand(M), X=rand(M; vector_at=p), Y=rand(M; vector_at=p);\nthrow_error=false, io=nothing, atol::Real=0, rtol::Real=atol>0 ? 0 : √eps\n\n)\n\nCheck whether the Hessian function Hess_f fulfills symmetry, i.e. that\n\noperatornameHess f(p)X Y = X operatornameHess f(p)Y\n\nwhich is checked using isapprox and the kwargs... are passed to this function.\n\nOptional Arguments\n\natol, rtol - with the same defaults as the usual isapprox\nthrow_error - (false) throw an error message if the Hessian is wrong\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.plot_slope-Tuple{Any, Any}","page":"Checks","title":"Manopt.plot_slope","text":"plot_slope(x, y; slope=2, line_base=0, a=0, b=2.0, i=1,j=length(x))\n\nPlot the result from the error check functions, e.g. 
check_gradient, check_differential, check_Hessian on data x,y with two comparison lines\n\nline_base + t*slope as the global slope the plot should have\na + b*t on the interval [x[i], x[j]] for some (best fitting) comparison slope\n\n\n\n\n\n","category":"method"},{"location":"helpers/checks/#Manopt.prepare_check_result-Tuple{Any, Any, Any}","page":"Checks","title":"Manopt.prepare_check_result","text":"prepare_check_result(log_range, errors, slope)\n\nGiven a range of values log_range, where we computed errors, check whether this yields a slope of slope in log-scale\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\nexactness_tol - (1e3*eps(eltype(errors))) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\nname (\"differential\") – name to display in the check (e.g. if checking gradient)\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. 
This is returned and can then also be saved.\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\nthrow_error - (false) throw an error message if the gradient or Hessian is wrong\n\n\n\n\n\n","category":"method"},{"location":"helpers/checks/#Literature","page":"Checks","title":"Literature","text":"","category":"section"},{"location":"helpers/checks/","page":"Checks","title":"Checks","text":"Pages = [\"helpers/checks.md\"]\nCanonical=false","category":"page"},{"location":"solvers/difference_of_convex/#DifferenceOfConvexSolvers","page":"Difference of Convex","title":"Difference of Convex","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/difference_of_convex/#DCASolver","page":"Difference of Convex","title":"Difference of Convex Algorithm","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"difference_of_convex_algorithm\ndifference_of_convex_algorithm!","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_algorithm","page":"Difference of Convex","title":"Manopt.difference_of_convex_algorithm","text":"difference_of_convex_algorithm(M, f, g, ∂h, p=rand(M); kwargs...)\ndifference_of_convex_algorithm(M, mdco, p; kwargs...)\n\nCompute the difference of convex algorithm Bergmann, Ferreira, Santos, Souza, preprint, 2023 to minimize\n\n operatorname*argmin_pmathcal M g(p) - h(p)\n\nwhere you need to provide f(p) = g(p) - h(p), g and the subdifferential ∂h of h.\n\nThis algorithm performs the following steps given a start point p=p^(0). 
Then repeat for k=0,1,…\n\nTake X^(k) from ∂h(p^(k))\nSet the next iterate to the solution of the subproblem\n\n p^(k+1) in operatorname*argmin_qin mathcal M g(q) - X^(k) log_p^(k)q\n\nuntil the stopping_criterion is fulfilled.\n\nOptional parameters\n\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default), in the form grad_f(M, p), or by InplaceEvaluation, in the form grad_f!(M, X, x)\ngradient – (nothing) specify operatornamegrad f, for debug / analysis or enhancing stopping_criterion=\ngrad_g – (nothing) specify the gradient of g. If specified, a subsolver is automatically set up.\ninitial_vector - (zero_vector(M, p)) initialise the inner tangent vector to store the subgradient result.\nstopping_criterion – (StopAfterIteration(200) |StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm – includes a StopWhenGradientNormLess(1e-8), when a gradient is provided.\n\nIf you specify the ManifoldDifferenceOfConvexObjective mdco, additionally\n\ng - (nothing) specify the function g. If specified, a subsolver is automatically set up.\n\nWhile there are several parameters for a sub solver, the easiest is to provide the function grad_g=, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like\n\ndifference_of_convex_algorithm(M, f, g, grad_h, p; grad_g=grad_g)\n\nOptional parameters for the sub problem\n\nsub_cost - (LinearizedDCCost(g, p, initial_vector)) a cost to be used within the default sub_problem. Use this if you have a more efficient version than the default that is built using g from above.\nsub_grad - (LinearizedDCGrad(grad_g, p, initial_vector; evaluation=evaluation)) gradient to be used within the default sub_problem. This is generated by default when grad_g is provided. 
You can specify your own by overwriting this keyword.\nsub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs\nsub_kwargs - ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.\nsub_objective - (a gradient or hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)\nsub_problem - (DefaultManoptProblem(M, sub_objective)) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.\nsub_state - (TrustRegionsState by default, requires sub_hessian to be provided; decorated with sub_kwargs). Choose the solver by specifying a solver state to solve the sub_problem; if the sub_problem is a function (i.e. a closed form solution), this is set to evaluation and can be changed to the evaluation type of the closed form solution accordingly.\nsub_stopping_criterion - (StopAfterIteration(300) |StopWhenStepsizeLess(1e-9) |StopWhenGradientNormLess(1e-9)) a stopping criterion used within the default sub_state=\nsub_stepsize - (ArmijoLinesearch(M)) specify a step size used within the sub_state\n\n...all others are passed on to decorate the inner DifferenceOfConvexState.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_algorithm!","page":"Difference of Convex","title":"Manopt.difference_of_convex_algorithm!","text":"difference_of_convex_algorithm!(M, f, g, ∂h, p; kwargs...)\ndifference_of_convex_algorithm!(M, mdco, p; kwargs...)\n\nRun the difference of convex algorithm and perform the steps in place of p. 
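A hedged end-to-end sketch of the call pattern described above (the decomposition into g and h is an invented toy example on Euclidean space, not one taken from this documentation):

```julia
using Manopt, Manifolds

M = Euclidean(2)
g(M, p) = 2 * sum(p .^ 2)      # "convex" part g
h(M, p) = sum(p .^ 2)          # part to subtract
f(M, p) = g(M, p) - h(M, p)    # = |p|^2, minimized at the origin
∂h(M, p) = 2 .* p              # h is smooth here, so ∂h is simply its gradient
grad_g(M, p) = 4 .* p          # enables the default subsolver setup

p0 = [2.0, 1.0]
q = difference_of_convex_algorithm(M, f, g, ∂h, p0; grad_g=grad_g)
```

Providing `grad_g=` lets the solver build the default linearized sub cost and gradient; all `sub_…` keywords above exist to replace these defaults with more efficient custom versions.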
See difference_of_convex_algorithm for more details.\n\nIf you specify the ManifoldDifferenceOfConvexObjective mdco, g is a keyword argument.\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#DCPPASolver","page":"Difference of Convex","title":"Difference of Convex Proximal Point","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"difference_of_convex_proximal_point\ndifference_of_convex_proximal_point!","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_proximal_point","page":"Difference of Convex","title":"Manopt.difference_of_convex_proximal_point","text":"difference_of_convex_proximal_point(M, grad_h, p=rand(M); kwargs...)\ndifference_of_convex_proximal_point(M, mdcpo, p=rand(M); kwargs...)\n\nCompute the difference of convex proximal point algorithm Souza, Oliveira, J. Glob. Optim., 2015 to minimize\n\n operatorname*argmin_pmathcal M g(p) - h(p)\n\nwhere you have to provide the (sub)gradient ∂h of h and either\n\nthe proximal map operatornameprox_lambda g of g as a function prox_g(M, λ, p) or prox_g(M, q, λ, p)\nthe functions g and grad_g to compute the proximal map using a sub solver\nyour own sub-solver, see optional keywords below\n\nThis algorithm performs the following steps given a start point p=p^(0). Then repeat for k=0,1,…\n\nX^(k) operatornamegrad h(p^(k))\nq^(k) = operatornameretr_p^(k)(λ_kX^(k))\nr^(k) = operatornameprox_λ_kg(q^(k))\nX^(k) = operatornameretr^-1_p^(k)(r^(k))\nCompute a stepsize s_k and\nset p^(k+1) = operatornameretr_p^(k)(s_kX^(k)).\n\nuntil the stopping_criterion is fulfilled. See Almeida, da Cruz Neto, Oliveira, Souza, Comput. Optim. 
Appl., 2020 for more details on the modified variant, where we slightly changed steps 4 to 6, since here we get the classical proximal point method for DC functions for s_k = 1 and we can employ linesearches similar to other solvers.\n\nOptional parameters\n\nλ – ( i -> 1/2 ) a function returning the sequence of prox parameters λ_i\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or by InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x).\ncost - (nothing) provide the cost f, e.g. for debug reasons.\ngradient – (nothing) specify operatornamegrad f, for debug / analysis or enhancing the stopping_criterion\nprox_g - (nothing) specify a proximal map for the sub problem or both of the following\ng – (nothing) specify the function g.\ngrad_g – (nothing) specify the gradient of g. If both g and grad_g are specified, a subsolver is automatically set up.\ninverse_retraction_method - (default_inverse_retraction_method(M)) an inverse retraction method to use (see step 4).\nretraction_method – (default_retraction_method(M)) a retraction to use (see step 2)\nstepsize – (ConstantStepsize(M)) specify a Stepsize functor to run the modified algorithm (experimental).\nstopping_criterion (StopAfterIteration(200) |StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm – includes a StopWhenGradientNormLess(1e-8), when a gradient is provided.\n\nWhile there are several parameters for a sub solver, the easiest is to provide the functions g and grad_g, such that, together with the mandatory grad_h, a default cost and gradient can be generated and passed to a default subsolver. 
Hence the easiest example call looks like\n\ndifference_of_convex_proximal_point(M, grad_h, p0; g=g, grad_g=grad_g)\n\nOptional parameters for the sub problem\n\nsub_cost – (ProximalDCCost(g, copy(M, p), λ(1))) cost to be used within the default sub_problem that is initialized as soon as g is provided.\nsub_grad – (ProximalDCGrad(grad_g, copy(M, p), λ(1); evaluation=evaluation)) gradient to be used within the default sub_problem, that is initialized as soon as grad_g is provided. You can specify your own by overwriting this keyword.\nsub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs\nsub_kwargs – ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.\nsub_objective – (a gradient or hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)\nsub_problem – (DefaultManoptProblem(M, sub_objective)) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. 
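Following the "easiest example call" shown above, a hedged sketch with an invented toy decomposition (g and h are illustrative, not taken from this documentation):

```julia
using Manopt, Manifolds

M = Euclidean(2)
g(M, p) = 2 * sum(p .^ 2)
grad_g(M, p) = 4 .* p
grad_h(M, p) = 2 .* p     # gradient of h(p) = |p|^2

p0 = [1.0, -1.0]
# Passing g= and grad_g= lets the solver build the default
# proximal-map subsolver instead of requiring an explicit prox_g.
q = difference_of_convex_proximal_point(M, grad_h, p0; g=g, grad_g=grad_g)
```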
Then evaluation= is taken into account for the form of this function.\nsub_state – (TrustRegionsState – requires the sub_hessian to be provided, decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem\nsub_stopping_criterion - (StopAfterIteration(300) |StopWhenStepsizeLess(1e-9) |StopWhenGradientNormLess(1e-9)) a stopping criterion used within the default sub_state=\n\n...all others are passed on to decorate the inner DifferenceOfConvexProximalState.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_proximal_point!","page":"Difference of Convex","title":"Manopt.difference_of_convex_proximal_point!","text":"difference_of_convex_proximal_point!(M, grad_h, p; cost=nothing, kwargs...)\ndifference_of_convex_proximal_point!(M, mdcpo, p; cost=nothing, kwargs...)\ndifference_of_convex_proximal_point!(M, mdcpo, prox_g, p; cost=nothing, kwargs...)\n\nCompute the difference of convex algorithm to minimize\n\n operatorname*argmin_pmathcal M g(p) - h(p)\n\nwhere you have to provide the proximal map of g and the gradient of h.\n\nThe computation is done inplace of p.\n\nFor all further details, especially the keyword arguments, see difference_of_convex_proximal_point.\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Manopt-Solver-States","page":"Difference of Convex","title":"Manopt Solver States","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"DifferenceOfConvexState\nDifferenceOfConvexProximalState","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.DifferenceOfConvexState","page":"Difference of Convex","title":"Manopt.DifferenceOfConvexState","text":"DifferenceOfConvexState{Pr,St,P,T,SC<:StoppingCriterion} <:\n AbstractManoptSolverState\n\nA struct 
to store the current state of the [difference_of_convex_algorithm](@ref). It comes in two forms, depending on the realisation of the subproblem.\n\nFields\n\np – the current iterate, i.e. a point on the manifold\nX – the current subgradient, i.e. a tangent vector to p.\nsub_problem – problem for the subsolver\nsub_state – state of the subproblem\nstop – a functor inheriting from StoppingCriterion indicating when to stop.\n\nFor the sub task, we need a method to solve\n\n operatorname*argmin_qmathcal M g(p) - X log_p q\n\nBesides a problem and options, one can also provide a function and an AbstractEvaluationType, respectively, to indicate a closed form solution for the sub task.\n\nConstructors\n\nDifferenceOfConvexState(M, p, sub_problem, sub_state; kwargs...)\nDifferenceOfConvexState(M, p, sub_solver; evaluation=InplaceEvaluation(), kwargs...)\n\nGenerate the state either using a solver from Manopt, given by an AbstractManoptProblem sub_problem and an AbstractManoptSolverState sub_state, or a closed form solution sub_solver for the sub-problem, where by default its AbstractEvaluationType evaluation is in-place, i.e. the function is of the form (M, p, X) -> q or (M, q, p, X) -> q, such that the current iterate p and the subgradient X of h can be passed to that function and the result is q.\n\nFurther keyword arguments\n\ninitial_vector – (zero_vector(M, p)) how to initialize the inner gradient tangent vector\nstopping_criterion – StopAfterIteration(200) a stopping criterion\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Manopt.DifferenceOfConvexProximalState","page":"Difference of Convex","title":"Manopt.DifferenceOfConvexProximalState","text":"DifferenceOfConvexProximalState{Type} <: Options\n\nA struct to store the current state of the algorithm as well as the form. 
It comes in two forms, depending on the realisation of the subproblem.\n\nFields\n\ninverse_retraction_method – (default_inverse_retraction_method(M)) an inverse retraction method to use within the algorithm.\nretraction_method – (default_retraction_method(M)) a type of retraction\np, q, r – the current iterate, the gradient step and the prox, respectively; their type is set by initializing p\nstepsize – (ConstantStepsize(1.0)) a Stepsize function to run the modified algorithm (experimental)\nstop – (StopWhenChangeLess(1e-8)) a StoppingCriterion\nX, Y – (zero_vector(M,p)) the current gradient and descent direction, respectively; their common type is set by the keyword X\n\nConstructor\n\nDifferenceOfConvexProximalState(M, p; kwargs...)\n\nKeyword arguments\n\nX, retraction_method, inverse_retraction_method, stepsize for the fields above\nstopping_criterion for the StoppingCriterion\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#The-difference-of-convex-objective","page":"Difference of Convex","title":"The difference of convex objective","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"ManifoldDifferenceOfConvexObjective","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.ManifoldDifferenceOfConvexObjective","page":"Difference of Convex","title":"Manopt.ManifoldDifferenceOfConvexObjective","text":"ManifoldDifferenceOfConvexObjective{E} <: AbstractManifoldCostObjective{E}\n\nSpecify an objective for a difference_of_convex_algorithm.\n\nThe objective f mathcal M to ℝ is given as\n\n f(p) = g(p) - h(p)\n\nwhere both g and h are convex, lsc. and proper. Furthermore we assume that the subdifferential ∂h of h is given.\n\nFields\n\ncost – an implementation of f(p) = g(p)-h(p) as a function f(M,p).\n∂h!! – a deterministic version of h mathcal M Tmathcal M, i.e. 
calling ∂h(M, p) returns a subgradient of h at p, and if there is more than one, it returns a deterministic choice.\n\nNote that the subdifferential might be given in two possible signatures\n\n∂h(M,p) which does an AllocatingEvaluation\n∂h!(M, X, p) which does an InplaceEvaluation in place of X.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"as well as for the corresponding sub problem","category":"page"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"LinearizedDCCost\nLinearizedDCGrad","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.LinearizedDCCost","page":"Difference of Convex","title":"Manopt.LinearizedDCCost","text":"LinearizedDCCost\n\nA functor (M,q) → ℝ to represent the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. a cost function of the form\n\n F_p_kX_k(p) = g(p) - X_k log_p_kp\n\nfor a point p_k and a tangent vector X_k at p_k (e.g. outer iterates) that are stored within this functor as well.\n\nFields\n\ng a function\npk a point on a manifold\nXk a tangent vector at pk\n\nBoth interim values can be set using set_manopt_parameter!(::LinearizedDCCost, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCCost, ::Val{:X}, X), respectively.\n\nConstructor\n\nLinearizedDCCost(g, p, X)\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Manopt.LinearizedDCGrad","page":"Difference of Convex","title":"Manopt.LinearizedDCGrad","text":"LinearizedDCGrad\n\nA functor (M,X,p) → ℝ to represent the gradient of the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. 
for a cost function of the form\n\n F_p_kX_k(p) = g(p) - X_k log_p_kp\n\nits gradient is given by using F=F_1(F_2(p)), where F_1(X) = X_kX and F_2(p) = log_p_kp and the chain rule as well as the adjoint differential of the logarithmic map with respect to its argument for D^*F_2(p)\n\n operatornamegrad F(q) = operatornamegrad f(q) - DF_2^*(q)X\n\nfor a point pk and a tangent vector Xk at pk (the outer iterates) that are stored within this functor as well.\n\nFields\n\ngrad_g!! the gradient of g (see also LinearizedDCCost)\npk a point on a manifold\nXk a tangent vector at pk\n\nBoth interim values can be set using set_manopt_parameter!(::LinearizedDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCGrad, ::Val{:X}, X), respectively.\n\nConstructor\n\nLinearizedDCGrad(grad_g, p, X; evaluation=AllocatingEvaluation())\n\nWhere you specify whether grad_g is AllocatingEvaluation or InplaceEvaluation, while this function still provides both signatures.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"ManifoldDifferenceOfConvexProximalObjective","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.ManifoldDifferenceOfConvexProximalObjective","page":"Difference of Convex","title":"Manopt.ManifoldDifferenceOfConvexProximalObjective","text":"ManifoldDifferenceOfConvexProximalObjective{E} <: Problem\n\nSpecify an objective for a difference_of_convex_proximal_point algorithm. The problem is of the form\n\n operatorname*argmin_pin mathcal M g(p) - h(p)\n\nwhere both g and h are convex, lsc. and proper.\n\nFields\n\ncost – (nothing) implementation of f(p) = g(p)-h(p) (optional)\ngradient - the gradient of the cost\ngrad_h!! 
– a function operatornamegradh mathcal M Tmathcal M.\n\nNote that both the gradients might be given in two possible signatures, as allocating or in-place.\n\nConstructor\n\nManifoldDifferenceOfConvexProximalObjective(gradh; cost=nothing, gradient=nothing)\n\nand note that neither cost nor gradient is required for the algorithm, just for eventual debug or stopping criteria.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"as well as for the corresponding sub problems","category":"page"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"ProximalDCCost\nProximalDCGrad","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.ProximalDCCost","page":"Difference of Convex","title":"Manopt.ProximalDCCost","text":"ProximalDCCost\n\nA functor (M, p) → ℝ to represent the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the cost function of the proximal map of g.\n\n F_p_k(p) = frac12λd_mathcal M(p_kp)^2 + g(p)\n\nfor a point pk and a proximal parameter λ.\n\nFields\n\ng - a function\npk - a point on a manifold\nλ - the prox parameter\n\nBoth interim values can be set using set_manopt_parameter!(::ProximalDCCost, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCCost, ::Val{:λ}, λ), respectively.\n\nConstructor\n\nProximalDCCost(g, p, λ)\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Manopt.ProximalDCGrad","page":"Difference of Convex","title":"Manopt.ProximalDCGrad","text":"ProximalDCGrad\n\nA functor (M,X,p) → ℝ to represent the gradient of the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the gradient function of the proximal map cost function of g, i.e. 
of\n\n F_p_k(p) = frac12λd_mathcal M(p_kp)^2 + g(p)\n\nwhich reads\n\n operatornamegrad F_p_k(p) = operatornamegrad g(p) - frac1λlog_p p_k\n\nfor a point pk and a proximal parameter λ.\n\nFields\n\ngrad_g - a gradient function\npk - a point on a manifold\nλ - the prox parameter\n\nBoth interim values can be set using set_manopt_parameter!(::ProximalDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCGrad, ::Val{:λ}, λ), respectively.\n\nConstructor\n\nProximalDCGrad(grad_g, pk, λ; evaluation=AllocatingEvaluation())\n\nWhere you specify whether grad_g is AllocatingEvaluation or InplaceEvaluation, while this function still always provides both signatures.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Further-helper-functions","page":"Difference of Convex","title":"Further helper functions","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"get_subtrahend_gradient","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.get_subtrahend_gradient","page":"Difference of Convex","title":"Manopt.get_subtrahend_gradient","text":"X = get_subtrahend_gradient(amp, q)\nget_subtrahend_gradient!(amp, X, q)\n\nEvaluate the (sub)gradient of the subtrahend h from within a ManifoldDifferenceOfConvexObjective amp at the point q (in place of X).\n\nThe evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. 
When the non-mutating variant is called with a T=InplaceEvaluation, memory for the result is allocated.\n\n\n\n\n\nX = get_subtrahend_gradient(M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)\nget_subtrahend_gradient!(M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)\n\nEvaluate the gradient of the subtrahend h from within a ManifoldDifferenceOfConvexProximalObjective dcpo at the point p (in place of X).\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Literature","page":"Difference of Convex","title":"Literature","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"Pages = [\"solvers/difference_of_convex.md\"]\nCanonical=false","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/#PDRSSNSolver","page":"Primal-dual Riemannian semismooth Newton","title":"The Primal-dual Riemannian semismooth Newton Algorithm","text":"","category":"section"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The Primal-dual Riemannian semismooth Newton Algorithm is a second-order method derived from the ChambollePock algorithm.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The aim is to solve an optimization problem on a manifold with a cost function of the form","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"F(p) + G(Λ(p))","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"where Fmathcal M overlineℝ, 
Gmathcal N overlineℝ, and Λmathcal M mathcal N. If the manifolds mathcal M or mathcal N are not Hadamard, it has to be considered locally, i.e. on geodesically convex sets mathcal C subset mathcal M and mathcal D subsetmathcal N such that Λ(mathcal C) subset mathcal D.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The algorithm comes down to applying the Riemannian semismooth Newton method to the rewritten primal-dual optimality conditions, i.e., we define the vector field X mathcalM times mathcalT_n^* mathcalN rightarrow mathcalT mathcalM times mathcalT_n^* mathcalN as","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Xleft(p xi_nright)=left(beginarrayc\n-log _p operatornameprox_sigma Fleft(exp _pleft(mathcalP_p leftarrow mleft(-sigmaleft(D_m Lambdaright)^*leftmathcalP_Lambda(m) leftarrow n xi_nrightright)^sharpright)right) \nxi_n-operatornameprox_tau G_n^*left(xi_n+tauleft(mathcalP_n leftarrow Lambda(m) D_m Lambdaleftlog _m prightright)^flatright)\nendarrayright)","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"and solve for X(pξ_n)=0.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Given base points mmathcal C, n=Λ(m)mathcal D, initial primal and dual values p^(0) mathcal C, ξ_n^(0) mathcal T_n^*mathcal N, and primal and dual step sizes sigma, tau.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The 
algorithm performs the following steps for k=1,2,… (until a StoppingCriterion is reached)","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Choose any element\nV^(k) _C X(p^(k)ξ_n^(k))\nof the Clarke generalized covariant derivative\nSolve\nV^(k) (d_p^(k) d_n^(k)) = - X(p^(k)ξ_n^(k))\nin the vector space mathcalT_p^(k) mathcalM times mathcalT_n^* mathcalN\nUpdate\np^(k+1) = exp_p^(k)(d_p^(k))\nand\nξ_n^(k+1) = ξ_n^(k) + d_n^(k)","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Furthermore, you can replace the exponential map, the logarithmic map, and the parallel transport with a retraction, an inverse retraction, and a vector transport, respectively.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Finally, you can also update the base points m and n during the iterations. This introduces a few additional vector transports. The same holds for the case that Λ(m^(k))neq n^(k) at some point. 
All these cases are covered in the algorithm.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"primal_dual_semismooth_Newton\nprimal_dual_semismooth_Newton!","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/#Manopt.primal_dual_semismooth_Newton","page":"Primal-dual Riemannian semismooth Newton","title":"Manopt.primal_dual_semismooth_Newton","text":"primal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)\n\nPerform the Primal-Dual Riemannian Semismooth Newton algorithm.\n\nGiven a cost function mathcal Ecolonmathcal M to overlineℝ of the form\n\nmathcal E(p) = F(p) + G( Λ(p) )\n\nwhere Fcolonmathcal M to overlineℝ, Gcolonmathcal N to overlineℝ, and Lambdacolonmathcal M to mathcal N. The remaining input parameters are\n\np, X primal and dual start points xinmathcal M and xiin T_nmathcal N\nm,n base points on mathcal M and mathcal N, respectively.\nlinearized_forward_operator the linearization DΛ() of the operator Λ().\nadjoint_linearized_operator the adjoint DΛ^* of the linearized operator DΛ(m)colon T_mmathcal M to T_Λ(m)mathcal N\nprox_F, prox_G_Dual the proximal maps of F and G^ast_n\ndiff_prox_F, diff_prox_dual_G the (Clarke Generalized) differentials of the proximal maps of F and G^ast_n\n\nFor more details on the algorithm, see Diepeveen, Lellmann, SIAM J. Imag. 
Sci., 2021.\n\nOptional Parameters\n\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\nΛ (missing) the exact operator, which is required if Λ(m)=n does not hold;\n\nmissing indicates that the forward operator is exact.\n\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nreg_param – (1e-5) regularisation parameter for the Newton matrix\n\nNote that this changes the arguments with which the forward_operator will be called.\n\nstopping_criterion – (StopAfterIteration(50)) a StoppingCriterion\nupdate_primal_base – (missing) function to update m (identity by default/missing)\nupdate_dual_base – (missing) function to update n (identity by default/missing)\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/primal_dual_semismooth_Newton/#Manopt.primal_dual_semismooth_Newton!","page":"Primal-dual Riemannian semismooth Newton","title":"Manopt.primal_dual_semismooth_Newton!","text":"primal_dual_semismooth_Newton!(M, N, cost, x0, ξ0, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_G_dual, linearized_forward_operator, adjoint_linearized_operator)\n\nPerform the Primal-dual Riemannian semismooth Newton algorithm in place of x, ξ, and potentially m, n if they are not fixed. 
See primal_dual_semismooth_Newton for details and optional parameters.\n\n\n\n\n\n","category":"function"},{"location":"solvers/primal_dual_semismooth_Newton/#State","page":"Primal-dual Riemannian semismooth Newton","title":"State","text":"","category":"section"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"PrimalDualSemismoothNewtonState","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/#Manopt.PrimalDualSemismoothNewtonState","page":"Primal-dual Riemannian semismooth Newton","title":"Manopt.PrimalDualSemismoothNewtonState","text":"PrimalDualSemismoothNewtonState <: AbstractPrimalDualSolverState\n\nm - base point on $ \\mathcal M $\nn - base point on $ \\mathcal N $\nx - an initial point x^(0) in mathcal M (and its previous iterate)\nξ - an initial tangent vector xi^(0)in T_n^*mathcal N (and its previous iterate)\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nreg_param – (1e-5) regularisation parameter for the Newton matrix\nstop - a StoppingCriterion\nupdate_primal_base ((amp, ams, i) -> o.m) function to update the primal base\nupdate_dual_base ((amp, ams, i) -> o.n) function to update the dual base\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\nwhere for the update functions an AbstractManoptProblem amp, AbstractManoptSolverState ams and the current iterate i are the arguments. 
If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing).\n\nConstructor\n\nPrimalDualSemismoothNewtonState(M::AbstractManifold,\n m::P, n::Q, x::P, ξ::T, primal_stepsize::Float64, dual_stepsize::Float64, reg_param::Float64;\n stopping_criterion::StoppingCriterion = StopAfterIteration(50),\n update_primal_base::Union{Function,Missing} = missing,\n update_dual_base::Union{Function,Missing} = missing,\n retraction_method = default_retraction_method(M, typeof(p)),\n inverse_retraction_method = default_inverse_retraction_method(M, typeof(p)),\n vector_transport_method = default_vector_transport_method(M, typeof(p)),\n)\n\n\n\n\n\n","category":"type"},{"location":"solvers/primal_dual_semismooth_Newton/#Literature","page":"Primal-dual Riemannian semismooth Newton","title":"Literature","text":"","category":"section"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Pages = [\"solvers/primal_dual_semismooth_Newton.md\"]\nCanonical=false","category":"page"},{"location":"solvers/DouglasRachford/#DRSolver","page":"Douglas–Rachford","title":"Douglas–Rachford Algorithm","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"The (Parallel) Douglas–Rachford ((P)DR) Algorithm was generalized to Hadamard manifolds in [BPS16].","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"The aim is to minimize the sum","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"F(p) = f(p) + g(p)","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"on a manifold, where the two summands have proximal maps operatornameprox_λ f 
and operatornameprox_λ g that are easy to evaluate (maybe in closed form, or not too costly to approximate). Further, define the reflection operator at the proximal map as","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"operatornamerefl_λ f(p) = operatornameretr_operatornameprox_λ f(p) bigl( -operatornameretr^-1_operatornameprox_λ f(p) p bigr)","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Let alpha_k 01 with sum_k mathbb N alpha_k(1-alpha_k) = infty and λ 0 (which might depend on iteration k as well) be given.","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Then the (P)DRA algorithm for initial data x_0 in mathcal H reads as follows.","category":"page"},{"location":"solvers/DouglasRachford/#Initialization","page":"Douglas–Rachford","title":"Initialization","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Initialize t_0 = x_0 and k=0","category":"page"},{"location":"solvers/DouglasRachford/#Iteration","page":"Douglas–Rachford","title":"Iteration","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Repeat until a convergence criterion is reached","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Compute s_k = operatornamerefl_λ foperatornamerefl_λ g(t_k)\nWithin that operation, store p_k+1 = operatornameprox_λ g(t_k) which is the prox the inner reflection reflects at.\nCompute t_k+1 = g(alpha_k t_k s_k), where g is a curve approximating the shortest geodesic, provided by a retraction and its inverse\nSet k = 
k+1","category":"page"},{"location":"solvers/DouglasRachford/#Result","page":"Douglas–Rachford","title":"Result","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"The result is given by the last computed p_K.","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"For the parallel version, the first proximal map is a vectorial version where in each component one prox is applied to the corresponding copy of t_k and the second proximal map corresponds to the indicator function of the set, where all copies are equal (in mathcal H^n, where n is the number of copies), leading to the second prox being the Riemannian mean.","category":"page"},{"location":"solvers/DouglasRachford/#Interface","page":"Douglas–Rachford","title":"Interface","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":" DouglasRachford\n DouglasRachford!","category":"page"},{"location":"solvers/DouglasRachford/#Manopt.DouglasRachford","page":"Douglas–Rachford","title":"Manopt.DouglasRachford","text":"DouglasRachford(M, f, proxes_f, p)\nDouglasRachford(M, mpo, p)\n\nCompute the Douglas-Rachford algorithm on the manifold mathcal M, initial data p and the (two) proximal maps proxMaps, see Bergmann, Persch, Steidl, SIAM J Imag Sci, 2016.\n\nFor k>2 proximal maps, the problem is reformulated using the parallel Douglas Rachford: A vectorial proximal map on the power manifold mathcal M^k is introduced as the first proximal map and the second proximal map is set to the mean (Riemannian Center of mass). This hence also boils down to two proximal maps, though each evaluates proximal maps in parallel, i.e. 
component-wise in a vector.\n\nIf you provide a ManifoldProximalMapObjective mpo instead, the proximal maps are kept unchanged.\n\nInput\n\nM – a Riemannian Manifold mathcal M\nF – a cost function consisting of a sum of cost functions\nproxes_f – functions of the form (M, λ, p)->... performing a proximal map, where λ denotes the proximal parameter, for each of the summands of F. These can also be given in the InplaceEvaluation variants (M, q, λ, p) -> ... computing in place of q.\np – initial data p mathcal M\n\nOptional values\n\nevaluation – (AllocatingEvaluation) specify whether the proximal maps work by allocation (default) form prox(M, λ, x) or InplaceEvaluation in place, i.e. is of the form prox!(M, y, λ, x).\nλ – ((iter) -> 1.0) function to provide the value for the proximal parameter during the calls\nα – ((iter) -> 0.9) relaxation of the step from old to new iterate, i.e. t_k+1 = g(α_k t_k s_k), where s_k is the result of the double reflection involved in the DR algorithm\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) the inverse retraction to use within\nthe reflection (ignored, if you set R directly)\nthe relaxation step\nR – method employed in the iteration to perform the reflection of x at the prox p. This uses by default reflect or reflect! depending on reflection_evaluation and the retraction and inverse retraction specified by retraction_method and inverse_retraction_method, respectively.\nreflection_evaluation – (AllocatingEvaluation()) whether R works in-place or allocating\nretraction_method - (default_retraction_method(M, typeof(p))) the retraction to use in\nthe reflection (ignored, if you set R directly)\nthe relaxation step\nstopping_criterion – (StopWhenAny(StopAfterIteration(200),StopWhenChangeLess(10.0^-5))) a StoppingCriterion.\nparallel – (false) indicate that we are doing a parallel DR, i.e. on a PowerManifold manifold with two proxes. 
This can be used to trigger parallel Douglas–Rachford if you enter with two proxes. Keep in mind that a parallel Douglas–Rachford implicitly works on a PowerManifold manifold and its first argument is the result then (assuming all are equal after the second prox).\n\nand the ones that are passed to decorate_state! for decorators.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/DouglasRachford/#Manopt.DouglasRachford!","page":"Douglas–Rachford","title":"Manopt.DouglasRachford!","text":" DouglasRachford!(M, f, proxes_f, p)\n DouglasRachford!(M, mpo, p)\n\nCompute the Douglas-Rachford algorithm on the manifold mathcal M, initial data p in mathcal M and the (two) proximal maps proxes_f in place of p.\n\nFor k>2 proximal maps, the problem is reformulated using the parallel Douglas Rachford: A vectorial proximal map on the power manifold mathcal M^k is introduced as the first proximal map and the second proximal map is set to the mean (Riemannian Center of mass). This hence also boils down to two proximal maps, though each evaluates proximal maps in parallel, i.e. 
component-wise in a vector.\n\nnote: Note\nWhile creating the new starting point p' on the power manifold, a copy of p is created, so that the (by k>2 implicitly generated) parallel Douglas Rachford does not work in-place for now.\n\nIf you provide a ManifoldProximalMapObjective mpo instead, the proximal maps are kept unchanged.\n\nInput\n\nM – a Riemannian Manifold mathcal M\nf – a cost function consisting of a sum of cost functions\nproxes_f – functions of the form (M, λ, p)->q or (M, q, λ, p)->q performing a proximal map, where λ denotes the proximal parameter, for each of the summands of f.\np – initial point p mathcal M\n\nFor more options, see DouglasRachford.\n\n\n\n\n\n","category":"function"},{"location":"solvers/DouglasRachford/#State","page":"Douglas–Rachford","title":"State","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"DouglasRachfordState","category":"page"},{"location":"solvers/DouglasRachford/#Manopt.DouglasRachfordState","page":"Douglas–Rachford","title":"Manopt.DouglasRachfordState","text":"DouglasRachfordState <: AbstractManoptSolverState\n\nStore all options required for the DouglasRachford algorithm,\n\nFields\n\np - the current iterate (result). For the parallel Douglas-Rachford, this is not a value from the PowerManifold manifold but the mean.\ns – the last result of the double reflection at the proxes relaxed by α.\nλ – function to provide the value for the proximal parameter during the calls\nα – relaxation of the step from old to new iterate, i.e. 
x^(k+1) = g(α(k) x^(k) t^(k)), where t^(k) is the result of the double reflection involved in the DR algorithm\ninverse_retraction_method – an inverse retraction method\nR – method employed in the iteration to perform the reflection of x at the prox p.\nreflection_evaluation – whether R works inplace or allocating\nretraction_method – a retraction method\nstop – a StoppingCriterion\nparallel – indicate whether we are running a parallel Douglas-Rachford or not.\n\nConstructor\n\nDouglasRachfordState(M, p; kwargs...)\n\nGenerate the options for a Manifold M and an initial point p, where the following keyword arguments can be used\n\nλ – ((iter)->1.0) function to provide the value for the proximal parameter during the calls\nα – ((iter)->0.9) relaxation of the step from old to new iterate, i.e. x^(k+1) = g(α(k) x^(k) t^(k)), where t^(k) is the result of the double reflection involved in the DR algorithm\nR – (reflect or reflect!) method employed in the iteration to perform the reflection of x at the prox p, which function is used depends on reflection_evaluation.\nreflection_evaluation – (AllocatingEvaluation()) specify whether the reflection works inplace or allocating (default)\nstopping_criterion – (StopAfterIteration(300)) a StoppingCriterion\nparallel – (false) indicate whether we are running a parallel Douglas-Rachford or not.\n\n\n\n\n\n","category":"type"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"For specific DebugActions and RecordActions see also Cyclic Proximal Point.","category":"page"},{"location":"solvers/DouglasRachford/#Literature","page":"Douglas–Rachford","title":"Literature","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Pages = [\"solvers/DouglasRachford.md\"]","category":"page"},{"location":"tutorials/CountAndCache/#How-to-Count-and-Cache-Function-Calls","page":"Count and use a Cache","title":"How to Count 
and Cache Function Calls","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"In this tutorial, we want to investigate the caching and counting (i.e. statistics) features of Manopt.jl. We will reuse the optimization tasks from the introductory tutorial Get Started: Optimize!.","category":"page"},{"location":"tutorials/CountAndCache/#Introduction","page":"Count and use a Cache","title":"Introduction","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"There are surely many ways to keep track of, for example, how often the cost function is called, for instance with a functor, as we used in an example in How to Record Data","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"mutable struct MyCost{I<:Integer}\n count::I\nend\nMyCost() = MyCost{Int64}(0)\nfunction (c::MyCost)(M, x)\n c.count += 1\n # [ .. Actual implementation of the cost here ]\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"This still leaves a bit of work to the user, especially for tracking more than just the number of cost function evaluations.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"When a function like the objective or gradient is expensive to compute, it may make sense to cache its results. Manopt.jl tries to minimize the number of repeated calls but sometimes they are necessary and harmless when the function is cheap to compute. Caching of expensive function calls can for example be added using Memoize.jl by the user. 
The approach in the solvers of Manopt.jl aims to simplify adding both these capabilities on the level of calling a solver.","category":"page"},{"location":"tutorials/CountAndCache/#Technical-Background","page":"Count and use a Cache","title":"Technical Background","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"The two ingredients for a solver in Manopt.jl are the AbstractManoptProblem and the AbstractManoptSolverState, where the former consists of the domain, that is the manifold and AbstractManifoldObjective.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Both recording and debug capabilities are implemented in a decorator pattern to the solver state. They can be easily added using the record= and debug= in any solver call. This pattern was recently extended, such that also the objective can be decorated. This is how both caching and counting are implemented, as decorators of the AbstractManifoldObjective and hence for example changing/extending the behaviour of a call to get_cost.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Let’s finish off the technical background by loading the necessary packages. 
Besides Manopt.jl and Manifolds.jl we also need LRUCache.jl, which is (since Julia 1.9) a weak dependency and provides the least recently used strategy for our caches.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"using Manopt, Manifolds, Random, LRUCache, LinearAlgebra","category":"page"},{"location":"tutorials/CountAndCache/#Counting","page":"Count and use a Cache","title":"Counting","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"We first define our task, the Riemannian Center of Mass from the Get Started: Optimize! tutorial.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"n = 100\nσ = π / 8\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\nRandom.seed!(42)\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];\nf(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)));","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"To now count how often the cost and the gradient are called, we use the count= keyword argument that works in any solver to specify the elements of the objective whose calls we want to count. A full list is available in the documentation of the AbstractManifoldObjective. To also see the result, we have to set return_objective=true. This returns (objective, p) instead of just the solver result p. 
We can further also set return_state=true to get even more information about the solver run.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"gradient_descent(M, f, grad_f, data[1]; count=[:Cost, :Gradient], return_objective=true, return_state=true)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 68 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Statistics on function calls\n * :Gradient : 205\n * :Cost : 285","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"And we see that statistics are shown in the end.","category":"page"},{"location":"tutorials/CountAndCache/#Caching","page":"Count and use a Cache","title":"Caching","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"To now also cache these calls, we can use the cache= keyword argument. Since now both the cache and the count “extend” the functionality of the objective, the order is important: On the high-level interface, the count is treated first, which means that only actual function calls and not cache look-ups are counted. With the proper initialisation, you can use any caches here that support the get!(function, cache, key) update. 
All parts of the objective that can currently be cached are listed at ManifoldCachedObjective. The solver call has a keyword cache that takes a tuple (c, vs, n) of three arguments, where c is a symbol for the type of cache, vs is a vector of symbols specifying which calls to cache, and n is the size of the cache. If the last element is not provided, a suitable default (currently n=10) is used.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Here we want to use c=:LRU caches for vs=[:Cost, :Gradient] with a size of n=25.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"r = gradient_descent(M, f, grad_f, data[1];\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true, return_state=true)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 68 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Cache\n * :Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 68\n * :Cost : 157","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Since the default setup with ArmijoLinesearch needs the gradient and the cost, and similarly the stopping criterion might (independently) evaluate the 
gradient, the caching is quite helpful here.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"And of course also for this advanced return value of the solver, we can still access the result as usual:","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"get_solver_result(r)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"3-element Vector{Float64}:\n 0.6868392794790367\n 0.006531600680668244\n 0.7267799820834814","category":"page"},{"location":"tutorials/CountAndCache/#Advanced-Caching-Examples","page":"Count and use a Cache","title":"Advanced Caching Examples","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"There are more options other than caching single calls to specific parts of the objective. For example you may want to cache intermediate results of computing the cost and share that with the gradient computation. We will present three solutions to this:","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"An easy approach from within Manopt.jl: The ManifoldCostGradientObjective\nA shared storage approach using a functor\nA shared (internal) cache approach also using a functor","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"For that we switch to another example: The Rayleigh quotient. 
We aim to maximize the Rayleigh quotient x^T A x / (x^T x) for some A ∈ R^{(m+1)×(m+1)} and x ∈ R^{m+1}, but since we consider this on the sphere and Manopt.jl (as many other optimization toolboxes) minimizes, we consider","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"g(p) = -p^T A p, p ∈ S^m","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"The Euclidean gradient (that is, in R^{m+1}) is actually just ∇g(p) = -2Ap; the Riemannian gradient is the projection of ∇g(p) onto the tangent space T_p S^m.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"m = 25\nRandom.seed!(42)\nA = randn(m + 1, m + 1)\nA = Symmetric(A)\np_star = eigvecs(A)[:, end] # minimizer (or similarly -p)\nf_star = -eigvals(A)[end] # cost (note that we get - the largest eigenvalue)\n\nN = Sphere(m);\n\ng(M, p) = -p' * A*p\n∇g(p) = -2 * A * p\ngrad_g(M,p) = project(M, p, ∇g(p))\ngrad_g!(M,X, p) = project!(M, X, p, ∇g(p))","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"grad_g! 
(generic function with 1 method)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"But since both the cost and the gradient require the computation of the matrix-vector product Ap, it might be beneficial to only compute this once.","category":"page"},{"location":"tutorials/CountAndCache/#The-[ManifoldCostGradientObjective](@ref)-approach","page":"Count and use a Cache","title":"The ManifoldCostGradientObjective approach","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"The ManifoldCostGradientObjective uses a combined function to compute both the gradient and the cost at the same time. We define the inplace variant as","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"function g_grad_g!(M::AbstractManifold, X, p)\n X .= -A*p\n c = p'*X\n X .*= 2\n project!(M, X, p, X)\n return (c, X)\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"g_grad_g! (generic function with 1 method)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"where we only compute the matrix-vector product once. The small disadvantage might be that we always compute both the gradient and the cost. 
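The saving from computing A*p only once can be illustrated without Manopt.jl at all. The following plain-Julia sketch (the CountedMatVec type is hypothetical, introduced only for illustration) counts matrix-vector products for separate versus combined evaluation of cost and gradient:

```julia
using LinearAlgebra

# Hypothetical illustration (not Manopt.jl API): count how many
# matrix-vector products A*p are needed when cost and gradient are
# computed separately vs. combined.
mutable struct CountedMatVec
    A::Matrix{Float64}
    count::Int
end
(c::CountedMatVec)(p) = (c.count += 1; c.A * p)

A = [2.0 1.0; 1.0 3.0]
p = [1.0, 0.0]

mv = CountedMatVec(A, 0)
# separate evaluation: two products
cost = -p' * mv(p)           # g(p) = -p'Ap
grad = -2 .* mv(p)           # Euclidean gradient -2Ap
separate_count = mv.count

mv.count = 0
# combined evaluation: one product, as in g_grad_g! above
Ap = mv(p)
cost2 = -p' * Ap
grad2 = -2 .* Ap
combined_count = mv.count
```

For one point this saves a single product; over hundreds of solver iterations the saving accumulates accordingly.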
Luckily, the cache we used before takes this into account and caches both results, such that we indeed end up computing A*p only once when asking for the cost and the gradient.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Let’s compare both methods","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p0 = [(1/5 .* ones(5))..., zeros(m-4)...];\n@time s1 = gradient_descent(N, g, grad_g!, p0;\n stopping_criterion = StopWhenGradientNormLess(1e-5),\n evaluation=InplaceEvaluation(),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true,\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 1.392875 seconds (1.55 M allocations: 124.750 MiB, 3.75% gc time, 99.36% compilation time)\n\n## Cache\n * :Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 602\n * :Cost : 1449\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"versus","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"obj = ManifoldCostGradientObjective(g_grad_g!; evaluation=InplaceEvaluation())\n@time s2 = gradient_descent(N, obj, p0;\n stopping_criterion=StopWhenGradientNormLess(1e-5),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true,\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 0.773684 seconds (773.96 k allocations: 60.275 MiB, 3.04% gc time, 97.88% compilation time)\n\n## Cache\n * 
:Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 1448\n * :Cost : 1448\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"First of all, both yield the same result","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p1 = get_solver_result(s1)\np2 = get_solver_result(s2)\n[distance(N, p1, p2), g(N, p1), g(N, p2), f_star]","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"4-element Vector{Float64}:\n 0.0\n -7.8032957637779035\n -7.8032957637779035\n -7.803295763793953","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"We can also see that the first run needs 2051 evaluations in total, while the second needs just 1448 evaluations of the combined cost/gradient function. Note that the additional gradient computations in the combined variant (compared to the 602 before) merely cost a multiplication by 2, since A*p is computed only once per point. On the other hand, the additional caching of the gradient might be less beneficial in these cases. It is beneficial when the gradient and the cost are very often required together.","category":"page"},{"location":"tutorials/CountAndCache/#A-shared-storage-approach-using-a-functor","page":"Count and use a Cache","title":"A shared storage approach using a functor","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"An alternative to the previous approach is the usage of a functor that introduces a “shared storage” of the result of computing A*p. 
We additionally have to store p though, since we have to check that we are still evaluating the cost and/or gradient at the same point at which the cached A*p was computed. We again consider the (more efficient) inplace variant. This can be done as follows","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"struct StorageG{T,M}\n A::M\n Ap::T\n p::T\nend\nfunction (g::StorageG)(::Val{:Cost}, M::AbstractManifold, p)\n if !(p==g.p) #We are at a new point -> Update\n g.Ap .= g.A*p\n g.p .= p\n end\n return -g.p'*g.Ap\nend\nfunction (g::StorageG)(::Val{:Gradient}, M::AbstractManifold, X, p)\n if !(p==g.p) #We are at a new point -> Update\n g.Ap .= g.A*p\n g.p .= p\n end\n X .= -2 .* g.Ap\n project!(M, X, p, X)\n return X\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Here we use the first parameter to distinguish both functions. 
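The Val-based dispatch used for StorageG can be tried in isolation. A minimal sketch with a hypothetical functor type F (not part of Manopt.jl), showing how one object provides two differently-dispatched methods:

```julia
# Minimal illustration of dispatching one functor on Val types,
# as done for StorageG above (hypothetical example, not Manopt.jl API).
struct F end
(::F)(::Val{:Cost}, x) = x^2          # plays the role of the cost
(::F)(::Val{:Gradient}, x) = 2x       # plays the role of the gradient

fun = F()
c = fun(Val(:Cost), 3.0)      # 9.0
g = fun(Val(:Gradient), 3.0)  # 6.0
```

Since Val(:Cost) and Val(:Gradient) are distinct types, Julia selects the method at dispatch time; the two wrapper functions defined below then fix the first argument.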
For the mutating case the signatures are different regardless of the additional argument but for the allocating case, the signatures of the cost and the gradient function are the same.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"#Define the new functor\nstorage_g = StorageG(A, zero(p0), zero(p0))\n# and cost and gradient that use this functor as\ng3(M,p) = storage_g(Val(:Cost), M, p)\ngrad_g3!(M, X, p) = storage_g(Val(:Gradient), M, X, p)\n@time s3 = gradient_descent(N, g3, grad_g3!, p0;\n stopping_criterion = StopWhenGradientNormLess(1e-5),\n evaluation=InplaceEvaluation(),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 2),\n return_objective=true#, return_state=true\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 0.487223 seconds (325.29 k allocations: 23.338 MiB, 98.24% compilation time)\n\n## Cache\n * :Cost : 2/2 entries of type Float64 used\n * :Gradient : 2/2 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 602\n * :Cost : 1449\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"This of course still yields the same result","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p3 = get_solver_result(s3)\ng(N, p3) - f_star","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"1.6049384043981263e-11","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"And while we again have a split off the cost and gradient evaluations, we can observe that the allocations are less 
than half of the previous approach.","category":"page"},{"location":"tutorials/CountAndCache/#A-local-cache-approach","page":"Count and use a Cache","title":"A local cache approach","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"This variant is very similar to the previous one, but uses a whole cache instead of just one place to store A*p. This makes the code a bit nicer, and it is possible to store values for more than just the last point p at which the cost or gradient was called.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"struct CacheG{C,M}\n A::M\n cache::C\nend\nfunction (g::CacheG)(::Val{:Cost}, M, p)\n Ap = get!(g.cache, copy(M,p)) do\n g.A*p\n end\n return -p'*Ap\nend\nfunction (g::CacheG)(::Val{:Gradient}, M, X, p)\n Ap = get!(g.cache, copy(M,p)) do\n g.A*p\n end\n X .= -2 .* Ap\n project!(M, X, p, X)\n return X\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"However, the resulting solver run is not always faster, since maintaining a whole cache is a bit more costly than storing just Ap and p. 
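The get! pattern used in CacheG relies on an LRU dictionary (the tutorial uses the LRU type from LRUCache.jl). Its behaviour can be mimicked with a small Dict-based sketch; the TinyLRU type below is hypothetical and only for illustration:

```julia
# A tiny LRU-like cache to illustrate the get! pattern used by CacheG
# (hypothetical sketch; the tutorial itself uses LRUCache.jl's LRU type).
struct TinyLRU{K,V}
    data::Dict{K,V}
    order::Vector{K}   # least recently used first
    maxsize::Int
end
TinyLRU{K,V}(; maxsize) where {K,V} = TinyLRU(Dict{K,V}(), K[], maxsize)

function Base.get!(f, c::TinyLRU{K,V}, k::K) where {K,V}
    if haskey(c.data, k)
        # cache hit: refresh recency, skip recomputation
        deleteat!(c.order, findfirst(==(k), c.order))
        push!(c.order, k)
        return c.data[k]
    end
    # cache miss: evict the least recently used entry if full
    length(c.data) >= c.maxsize && delete!(c.data, popfirst!(c.order))
    v = f()
    c.data[k] = v
    push!(c.order, k)
    return v
end

c = TinyLRU{Int,Int}(; maxsize=2)
get!(() -> 10, c, 1)         # computes and stores 10
get!(() -> 20, c, 2)         # computes and stores 20
hit = get!(() -> 99, c, 1)   # cache hit: returns the stored 10
get!(() -> 30, c, 3)         # evicts key 2 (least recently used)
```

The bookkeeping (recency list, eviction) is the extra cost alluded to above, compared with storing only the single pair (p, Ap).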
The tradeoff is whether this pays off.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"#Define the new functor\ncache_g = CacheG(A, LRU{typeof(p0),typeof(p0)}(; maxsize=25))\n# and cost and gradient that use this functor as\ng4(M,p) = cache_g(Val(:Cost), M, p)\ngrad_g4!(M, X, p) = cache_g(Val(:Gradient), M, X, p)\n@time s4 = gradient_descent(N, g4, grad_g4!, p0;\n stopping_criterion = StopWhenGradientNormLess(1e-5),\n evaluation=InplaceEvaluation(),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true,\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 0.474319 seconds (313.56 k allocations: 22.981 MiB, 97.87% compilation time)\n\n## Cache\n * :Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 602\n * :Cost : 1449\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"and for safety let’s check that we are reasonably close","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p4 = get_solver_result(s4)\ng(N, p4) - f_star","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"1.6049384043981263e-11","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"For this example, and maybe for gradient_descent in general, this additional (second, inner) cache does not improve the result further; it is about the same effort both time- and 
allocation-wise.","category":"page"},{"location":"tutorials/CountAndCache/#Summary","page":"Count and use a Cache","title":"Summary","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"While the approach using the ManifoldCostGradientObjective is very easy to implement, both the storage and the (local) cache approach are more efficient. All three are an improvement over the first implementation, which does not share intermediate results. The variants with storage or cache have the further advantage of being more flexible: the stored information could also be reused in a third function, for example when also computing the Hessian.","category":"page"},{"location":"tutorials/InplaceGradient/#Speedup-using-Inplace-Evaluation","page":"Speedup using Inplace computations","title":"Speedup using Inplace Evaluation","text":"","category":"section"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"When it comes to time-critical operations, a main ingredient in Julia is mutating functions, i.e. those that compute in place without additional memory allocations. In the following, we illustrate how to do this with Manopt.jl.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Let’s start with the same function as in Get Started: Optimize! 
and compute the mean of some points, only that here we use the sphere S^30 and n=800 points.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"From the aforementioned example.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We first load all necessary packages.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"using Manopt, Manifolds, Random, BenchmarkTools\nRandom.seed!(42);","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"And set up our data","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Random.seed!(42)\nm = 30\nM = Sphere(m)\nn = 800\nσ = π / 8\np = zeros(Float64, m + 1)\np[2] = 1.0\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];","category":"page"},{"location":"tutorials/InplaceGradient/#Classical-Definition","page":"Speedup using Inplace computations","title":"Classical Definition","text":"","category":"section"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"The variant from the previous tutorial defines a cost f(p) and its gradient grad f(p) as","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"f(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, 
Ref(p)))","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"grad_f (generic function with 1 method)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We further set the stopping criterion to be a little more strict. Then we obtain","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"sc = StopWhenGradientNormLess(3e-10)\np0 = zeros(Float64, m + 1); p0[1] = 1/sqrt(2); p0[2] = 1/sqrt(2)\nm1 = gradient_descent(M, f, grad_f, p0; stopping_criterion=sc);","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We can also benchmark this as","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"@benchmark gradient_descent($M, $f, $grad_f, $p0; stopping_criterion=$sc)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"BenchmarkTools.Trial: 100 samples with 1 evaluation.\n Range (min … max): 48.285 ms … 56.649 ms ┊ GC (min … max): 4.84% … 6.96%\n Time (median): 49.552 ms ┊ GC (median): 5.41%\n Time (mean ± σ): 50.151 ms ± 1.731 ms ┊ GC (mean ± σ): 5.56% ± 0.64%\n\n ▂▃ █▃▃▆ ▂ \n ▅████████▅█▇█▄▅▇▁▅█▅▇▄▇▅▁▅▄▄▄▁▄▁▁▁▄▄▁▁▁▁▁▁▄▁▁▁▁▁▁▄▁▄▁▁▁▁▁▁▄ ▄\n 48.3 ms Histogram: frequency by time 56.6 ms <\n\n Memory estimate: 194.10 MiB, allocs estimate: 655347.","category":"page"},{"location":"tutorials/InplaceGradient/#In-place-Computation-of-the-Gradient","page":"Speedup using Inplace computations","title":"In-place Computation of the 
Gradient","text":"","category":"section"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We can reduce the memory allocations by implementing the gradient to be evaluated in-place. We do this by using a functor. The motivation is twofold: on the one hand, we want to avoid variables from the global scope, for example the manifold M or the data, being used within the function; on the other hand, we want to avoid allocating new memory on every call. Doing the same for more complicated cost functions might also be worth pursuing.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Here, we store the data (as a reference) and introduce temporary memory in order to avoid reallocation of memory per grad_distance computation. We get","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"struct GradF!{TD,TTMP}\n data::TD\n tmp::TTMP\nend\nfunction (grad_f!::GradF!)(M, X, p)\n fill!(X, 0)\n for di in grad_f!.data\n grad_distance!(M, grad_f!.tmp, di, p)\n X .+= grad_f!.tmp\n end\n X ./= length(grad_f!.data)\n return X\nend","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"For the actual call to the solver, we first have to generate an instance of GradF! and tell the solver that the gradient is provided in an InplaceEvaluation. We can further also use gradient_descent! to work in place of the initial point we pass.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"grad_f2! 
= GradF!(data, similar(data[1]))\nm2 = deepcopy(p0)\ngradient_descent!(\n M, f, grad_f2!, m2; evaluation=InplaceEvaluation(), stopping_criterion=sc\n);","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We can again benchmark this","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"@benchmark gradient_descent!(\n $M, $f, $grad_f2!, m2; evaluation=$(InplaceEvaluation()), stopping_criterion=$sc\n) setup = (m2 = deepcopy($p0))","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"BenchmarkTools.Trial: 176 samples with 1 evaluation.\n Range (min … max): 27.419 ms … 34.154 ms ┊ GC (min … max): 0.00% … 0.00%\n Time (median): 28.001 ms ┊ GC (median): 0.00%\n Time (mean ± σ): 28.412 ms ± 1.079 ms ┊ GC (mean ± σ): 0.73% ± 2.24%\n\n ▁▅▇█▅▂▄ ▁ \n ▄▁███████▆█▇█▄▆▃▃▃▃▁▁▃▁▁▃▁▃▃▁▄▁▁▃▃▁▁▄▁▁▃▅▃▃▃▁▃▃▁▁▁▁▁▁▁▁▃▁▁▃ ▃\n 27.4 ms Histogram: frequency by time 31.9 ms <\n\n Memory estimate: 3.76 MiB, allocs estimate: 5949.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"which is faster by about a factor of 2 compared to the first solver-call. 
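The allocation difference can also be seen in isolation, without Manopt.jl, with a plain-Julia accumulation loop. A sketch (hypothetical helper names; tmp stands in for the grad_distance! buffer) comparing an allocating and an in-place mean of vectors:

```julia
# Sketch: allocating vs. in-place accumulation of many vectors,
# mirroring why GradF! reduces allocations (hypothetical example).
pts = [rand(31) for _ in 1:800]

# allocating: sum creates intermediate vectors, ./ allocates the result
grad_alloc(pts) = sum(pts) ./ length(pts)

# in-place: reuse one output buffer X and one temporary tmp
function grad_inplace!(X, tmp, pts)
    fill!(X, 0)
    for d in pts
        tmp .= d          # stands in for grad_distance!(M, tmp, d, p)
        X .+= tmp
    end
    X ./= length(pts)
    return X
end

X = similar(pts[1]); tmp = similar(pts[1])
grad_inplace!(X, tmp, pts)   # same values as grad_alloc(pts), no per-step allocation
```

Every gradient evaluation inside the solver repeats such a loop, so avoiding per-element allocations is what shrinks the memory estimate in the benchmark above.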
Note that the results m1 and m2 are of course the same.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"distance(M, m1, m2)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"2.0004809792350595e-10","category":"page"},{"location":"plans/state/#SolverStateSection","page":"Solver State","title":"The Solver State","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Given an AbstractManoptProblem, that is a certain optimisation task, the state specifies the solver to use. It contains the parameters of a solver and all fields necessary during the algorithm, e.g. the current iterate, a StoppingCriterion or a Stepsize.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"AbstractManoptSolverState\nget_state\nManopt.get_count","category":"page"},{"location":"plans/state/#Manopt.AbstractManoptSolverState","page":"Solver State","title":"Manopt.AbstractManoptSolverState","text":"AbstractManoptSolverState\n\nA general super type for all solver states.\n\nFields\n\nThe following fields are assumed to be default. If you use different ones, provide the access functions accordingly\n\np a point on a manifold with the current iterate\nstop a StoppingCriterion.\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.get_state","page":"Solver State","title":"Manopt.get_state","text":"get_state(s::AbstractManoptSolverState, recursive::Bool=true)\n\nreturn the (one step) undecorated AbstractManoptSolverState of the (possibly) decorated s. 
As long as your decorated state stores the state within s.state and the dispatch_objective_decorator is set to Val{true}, the internal state is extracted automatically.\n\nBy default the state that is stored within a decorated state is assumed to be at s.state. Overwrite _get_state(s, ::Val{true}, recursive) to change this behaviour for your states for both the recursive and the nonrecursive case.\n\nIf recursive is set to false, only the outermost decorator is taken away instead of all.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.get_count","page":"Solver State","title":"Manopt.get_count","text":"get_count(ams::AbstractManoptSolverState, ::Symbol)\n\nObtain the count for a certain countable size, e.g. the :Iterations. This function returns 0 if there was nothing to count.\n\nAvailable symbols from within the solver state\n\n:Iterations is passed on to the stop field to obtain the iteration at which the solver stopped.\n\n\n\n\n\nget_count(co::ManifoldCountObjective, s::Symbol, mode::Symbol=:None)\n\nGet the number of counts for a certain symbol s.\n\nDepending on the mode different results appear if the symbol does not exist in the dictionary\n\n:None – (default) silent mode, returns -1 for non-existing entries\n:warn – issues a warning if a field does not exist\n:error – issues an error if a field does not exist\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Since every subtype of an AbstractManoptSolverState directly relates to a solver, the concrete states are documented together with the corresponding solvers. This page documents the general functionality available for every state.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A first example is to access, i.e. obtain or set, the current iterate. 
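The recursive undecoration that get_state performs can be sketched with plain Julia types. A minimal sketch with hypothetical CoreState/Decorator types (not Manopt.jl's actual types), assuming decorators store the inner state in a state field:

```julia
# Generic sketch of recursive state undecoration, mimicking how
# get_state unwraps decorated states (hypothetical types, not Manopt.jl).
abstract type AbstractState end
struct CoreState <: AbstractState
    p::Vector{Float64}   # current iterate
end
struct Decorator <: AbstractState
    state::AbstractState # decorators store the inner state in `state`
end

is_decorated(::CoreState) = false
is_decorated(::Decorator) = true

# recursive=true unwraps all decorators, false only the outermost
function undecorate(s::AbstractState; recursive::Bool=true)
    is_decorated(s) || return s
    return recursive ? undecorate(s.state; recursive=true) : s.state
end

s = Decorator(Decorator(CoreState([1.0, 2.0])))
inner = undecorate(s)                       # CoreState
outer_once = undecorate(s; recursive=false) # still a Decorator
```

This mirrors the two behaviours described above: full unwrapping by default, and removing only the outermost decorator when recursive is false.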
This might be useful to continue investigation at the current iterate, or to set up a solver for a next experiment, respectively.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_iterate\nset_iterate!\nget_gradient(s::AbstractManoptSolverState)\nset_gradient!","category":"page"},{"location":"plans/state/#Manopt.get_iterate","page":"Solver State","title":"Manopt.get_iterate","text":"get_iterate(O::AbstractManoptSolverState)\n\nreturn the (last stored) iterate within an AbstractManoptSolverState. By default also undecorates the state beforehand.\n\n\n\n\n\nget_iterate(agst::AbstractGradientSolverState)\n\nreturn the iterate stored within gradient options. The default returns agst.p.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.set_iterate!","page":"Solver State","title":"Manopt.set_iterate!","text":"set_iterate!(s::AbstractManoptSolverState, M::AbstractManifold, p)\n\nset the iterate within an AbstractManoptSolverState to some (start) value p.\n\n\n\n\n\nset_iterate!(agst::AbstractGradientSolverState, M, p)\n\nset the (current) iterate stored within an AbstractGradientSolverState to p. The default function modifies s.p.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.get_gradient-Tuple{AbstractManoptSolverState}","page":"Solver State","title":"Manopt.get_gradient","text":"get_gradient(s::AbstractManoptSolverState)\n\nreturn the (last stored) gradient within an AbstractManoptSolverState. 
By default also undecorates the state beforehand\n\n\n\n\n\n","category":"method"},{"location":"plans/state/#Manopt.set_gradient!","page":"Solver State","title":"Manopt.set_gradient!","text":"set_gradient!(s::AbstractManoptSolverState, M::AbstractManifold, p, X)\n\nset the gradient within an (possibly decorated) AbstractManoptSolverState to some (start) value X in the tangent space at p.\n\n\n\n\n\nset_gradient!(agst::AbstractGradientSolverState, M, p, X)\n\nset the (current) gradient stored within an AbstractGradientSolverState to X. The default function modifies s.X.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"An internal function working on the state and elements within a state is used to pass messages from (sub) activities of a state to the corresponding DebugMessages","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_message","category":"page"},{"location":"plans/state/#Manopt.get_message","page":"Solver State","title":"Manopt.get_message","text":"get_message(du::AbstractManoptSolverState)\n\nget a message (String) from e.g. performing a step computation. This should return any message a sub-step might have issued\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Furthermore, to access the stopping criterion use","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_stopping_criterion","category":"page"},{"location":"plans/state/#Manopt.get_stopping_criterion","page":"Solver State","title":"Manopt.get_stopping_criterion","text":"get_stopping_criterion(ams::AbstractManoptSolverState)\n\nReturn the StoppingCriterion stored within the AbstractManoptSolverState ams.\n\nFor an undecorated state, this is assumed to be in ams.stop. 
Overwrite _get_stopping_criterion(yms::YMS) to change this for your manopt solver (yms) assuming it has type YMS.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Decorators-for-AbstractManoptSolverState","page":"Solver State","title":"Decorators for AbstractManoptSolverState","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A solver state can be decorated using the following trait and function to initialize","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"dispatch_state_decorator\nis_state_decorator\ndecorate_state!","category":"page"},{"location":"plans/state/#Manopt.dispatch_state_decorator","page":"Solver State","title":"Manopt.dispatch_state_decorator","text":"dispatch_state_decorator(s::AbstractManoptSolverState)\n\nIndicate internally whether an AbstractManoptSolverState s is of decorating type, i.e. whether it stores (encapsulates) a state in itself, by default in the field s.state.\n\nDecorators indicate this by returning Val{true} for further dispatch.\n\nThe default is Val{false}, i.e. by default a state is not decorated.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.is_state_decorator","page":"Solver State","title":"Manopt.is_state_decorator","text":"is_state_decorator(s::AbstractManoptSolverState)\n\nIndicate whether the AbstractManoptSolverState s is of decorator type.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.decorate_state!","page":"Solver State","title":"Manopt.decorate_state!","text":"decorate_state!(s::AbstractManoptSolverState)\n\ndecorate the AbstractManoptSolverState with specific decorators.\n\nOptional Arguments\n\noptional arguments provide necessary details on the decorators. 
A specific one is used to activate certain decorators.\n\ndebug – (Array{Union{Symbol,DebugAction,String,Int},1}()) a set of symbols representing DebugActions, Strings used as dividers and a subsampling integer. These are passed as a DebugGroup within :All to the DebugSolverState decorator dictionary. Only exception is :Stop that is passed to :Stop.\nrecord – (Array{Union{Symbol,RecordAction,Int},1}()) specify recordings by using Symbols or RecordActions directly. The integer can again be used for only recording every ith iteration.\nreturn_state - (false) indicate whether to wrap the options in a ReturnSolverState, indicating that the solver should return options and not (only) the minimizer.\n\nother keywords are ignored.\n\nSee also\n\nDebugSolverState, RecordSolverState, ReturnSolverState\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A simple example is the","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"ReturnSolverState","category":"page"},{"location":"plans/state/#Manopt.ReturnSolverState","page":"Solver State","title":"Manopt.ReturnSolverState","text":"ReturnSolverState{O<:AbstractManoptSolverState} <: AbstractManoptSolverState\n\nThis internal type is used to indicate that the contained AbstractManoptSolverState state should be returned at the end of a solver instead of the usual minimizer.\n\nSee also\n\nget_solver_result\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"as well as DebugSolverState and RecordSolverState.","category":"page"},{"location":"plans/state/#State-Actions","page":"Solver State","title":"State Actions","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A state action is a struct for callback functions that can be attached within for example the just mentioned debug decorator or the record 
decorator.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"AbstractStateAction","category":"page"},{"location":"plans/state/#Manopt.AbstractStateAction","page":"Solver State","title":"Manopt.AbstractStateAction","text":"AbstractStateAction\n\nA common type for AbstractStateActions that might be triggered in decorators, for example within the DebugSolverState or within the RecordSolverState.\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Several state decorators or actions might store intermediate values like the (last) iterate to compute some change or the last gradient. In order to minimise the storage of these, there is a generic StoreStateAction that acts as generic common storage that can be shared among different actions.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"StoreStateAction\nget_storage\nhas_storage\nupdate_storage!\nPointStorageKey\nVectorStorageKey","category":"page"},{"location":"plans/state/#Manopt.StoreStateAction","page":"Solver State","title":"Manopt.StoreStateAction","text":"StoreStateAction <: AbstractStateAction\n\nInternal storage for AbstractStateActions to store a tuple of fields from an AbstractManoptSolverState\n\nThis functor possesses the usual interface of functions called during an iteration, i.e. it acts on (p,o,i), where p is an AbstractManoptProblem, o is an AbstractManoptSolverState and i is the current iteration.\n\nFields\n\nvalues – a dictionary to store interim values based on certain Symbols\nkeys – a Vector of Symbols to refer to fields of AbstractManoptSolverState\npoint_values – a NamedTuple of mutable values of points on a manifold to be stored in StoreStateAction. 
Manifold is later determined by AbstractManoptProblem passed to update_storage!.\npoint_init – a NamedTuple of boolean values indicating whether a point in point_values with matching key has been already initialized to a value. When it is false, it corresponds to a general value not being stored for the key present in the vector keys.\nvector_values – a NamedTuple of mutable values of tangent vectors on a manifold to be stored in StoreStateAction. Manifold is later determined by AbstractManoptProblem passed to update_storage!. It is not specified at which point the vectors are tangent but for storage it should not matter.\nvector_init – a NamedTuple of boolean values indicating whether a tangent vector in vector_values with matching key has been already initialized to a value. When it is false, it corresponds to a general value not being stored for the key present in the vector keys.\nonce – whether to update the internal values only once per iteration\nlastStored – last iterate, where this AbstractStateAction was called (to determine once)\n\nTo handle the general storage, use get_storage and has_storage with keys as Symbols. For the point storage use PointStorageKey. For tangent vector storage use VectorStorageKey. 
Point and tangent storage have been optimized to be more efficient.\n\nConstructors\n\nStoreStateAction(s::Vector{Symbol})\n\nThis is equivalent to providing s via the keyword store_fields, just that here no manifold is necessary for the construction.\n\nStoreStateAction(M)\n\nKeyword arguments\n\nstore_fields (Symbol[])\nstore_points (Symbol[])\nstore_vectors (Symbol[])\n\nas vectors of symbols each referring to fields of the state (lower case symbols) or semantic ones (upper case).\n\np_init (rand(M))\nX_init (zero_vector(M, p_init))\n\nare used to initialize the point and vector storages; change these if you use other types (than the default) for your points/vectors on M.\n\nonce (true) whether to update internal storage only once per iteration or on every update call\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.get_storage","page":"Solver State","title":"Manopt.get_storage","text":"get_storage(a::AbstractStateAction, key::Symbol)\n\nReturn the internal value of the AbstractStateAction a at the Symbol key.\n\n\n\n\n\nget_storage(a::AbstractStateAction, ::PointStorageKey{key}) where {key}\n\nReturn the internal value of the AbstractStateAction a at the Symbol key that represents a point.\n\n\n\n\n\nget_storage(a::AbstractStateAction, ::VectorStorageKey{key}) where {key}\n\nReturn the internal value of the AbstractStateAction a at the Symbol key that represents a tangent vector.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.has_storage","page":"Solver State","title":"Manopt.has_storage","text":"has_storage(a::AbstractStateAction, key::Symbol)\n\nReturn whether the AbstractStateAction a has a value stored at the Symbol key.\n\n\n\n\n\nhas_storage(a::AbstractStateAction, ::PointStorageKey{key}) where {key}\n\nReturn whether the AbstractStateAction a has a point value stored at the Symbol key.\n\n\n\n\n\nhas_storage(a::AbstractStateAction, ::VectorStorageKey{key}) where {key}\n\nReturn whether the AbstractStateAction a has a tangent vector 
value stored at the Symbol key.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.update_storage!","page":"Solver State","title":"Manopt.update_storage!","text":"update_storage!(a::AbstractStateAction, amp::AbstractManoptProblem, s::AbstractManoptSolverState)\n\nUpdate the AbstractStateAction a internal values to the ones given on the AbstractManoptSolverState s. Optimized using the information from amp\n\n\n\n\n\nupdate_storage!(a::AbstractStateAction, d::Dict{Symbol,<:Any})\n\nUpdate the AbstractStateAction a internal values to the ones given in the dictionary d. The values are merged, where the values from d are preferred.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.PointStorageKey","page":"Solver State","title":"Manopt.PointStorageKey","text":"struct PointStorageKey{key} end\n\nRefer to point storage of StoreStateAction in get_storage and has_storage functions\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.VectorStorageKey","page":"Solver State","title":"Manopt.VectorStorageKey","text":"struct VectorStorageKey{key} end\n\nRefer to tangent storage of StoreStateAction in get_storage and has_storage functions\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"as well as two internal functions","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"_storage_copy_vector\n_storage_copy_point","category":"page"},{"location":"plans/state/#Manopt._storage_copy_vector","page":"Solver State","title":"Manopt._storage_copy_vector","text":"_storage_copy_vector(M::AbstractManifold, X)\n\nMake a copy of tangent vector X from manifold M for storage in StoreStateAction.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt._storage_copy_point","page":"Solver State","title":"Manopt._storage_copy_point","text":"_storage_copy_point(M::AbstractManifold, p)\n\nMake a copy of point p from manifold M for 
storage in StoreStateAction.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Abstract-States","page":"Solver State","title":"Abstract States","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"In a few cases it is useful to have a hierarchy of types. These are","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"AbstractSubProblemSolverState\nAbstractGradientSolverState\nAbstractHessianSolverState\nAbstractPrimalDualSolverState","category":"page"},{"location":"plans/state/#Manopt.AbstractSubProblemSolverState","page":"Solver State","title":"Manopt.AbstractSubProblemSolverState","text":"AbstractSubProblemSolverState <: AbstractManoptSolverState\n\nAn abstract type for problems that involve a subsolver\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.AbstractGradientSolverState","page":"Solver State","title":"Manopt.AbstractGradientSolverState","text":"AbstractGradientSolverState <: AbstractManoptSolverState\n\nA generic AbstractManoptSolverState type for gradient based options data.\n\nIt assumes that\n\nthe iterate is stored in the field p\nthe gradient at p is stored in X.\n\nsee also\n\nGradientDescentState, StochasticGradientDescentState, SubGradientMethodState, QuasiNewtonState.\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.AbstractHessianSolverState","page":"Solver State","title":"Manopt.AbstractHessianSolverState","text":"AbstractHessianSolverState <: AbstractGradientSolverState\n\nAn AbstractManoptSolverState type to represent algorithms that employ the Hessian. 
These options are assumed to have a field (gradient) to store the current gradient operatornamegradf(x)\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.AbstractPrimalDualSolverState","page":"Solver State","title":"Manopt.AbstractPrimalDualSolverState","text":"AbstractPrimalDualSolverState\n\nA general type for all primal dual based options to be used within primal dual based algorithms\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"For the sub problem state, there are two access functions","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_sub_problem\nget_sub_state","category":"page"},{"location":"plans/state/#Manopt.get_sub_problem","page":"Solver State","title":"Manopt.get_sub_problem","text":"get_sub_problem(ams::AbstractSubProblemSolverState)\n\nAccess the sub problem of a solver state that involves a sub optimisation task. By default this returns ams.sub_problem.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.get_sub_state","page":"Solver State","title":"Manopt.get_sub_state","text":"get_sub_state(ams::AbstractSubProblemSolverState)\n\nAccess the sub state of a solver state that involves a sub optimisation task. By default this returns ams.sub_state.\n\n\n\n\n\n","category":"function"},{"location":"about/#About","page":"About","title":"About","text":"","category":"section"},{"location":"about/","page":"About","title":"About","text":"Manopt.jl inherited its name from Manopt, a Matlab toolbox for optimization on manifolds. 
This Julia package was started and is currently maintained by Ronny Bergmann.","category":"page"},{"location":"about/","page":"About","title":"About","text":"The following people contributed:","category":"page"},{"location":"about/","page":"About","title":"About","text":"Constantin Ahlmann-Eltze implemented the gradient and differential check functions\nRenée Dornig implemented the particle swarm, the Riemannian Augmented Lagrangian Method, the Exact Penalty Method, as well as the NonmonotoneLinesearch\nWillem Diepeveen implemented the primal-dual Riemannian semismooth Newton solver.\nEven Stephansen Kjemsås contributed to the implementation of the Frank Wolfe Method solver\nMathias Ravn Munkvold contributed most of the implementation of the Adaptive Regularization with Cubics solver\nTom-Christian Riemer implemented the trust regions and quasi Newton solvers.\nManuel Weiss implemented most of the conjugate gradient update rules","category":"page"},{"location":"about/","page":"About","title":"About","text":"...as well as various contributors providing small extensions, finding small bugs and mistakes and fixing them by opening PRs.","category":"page"},{"location":"about/","page":"About","title":"About","text":"If you want to contribute a manifold or algorithm or have any questions, visit the GitHub repository to clone/fork the repository or open an issue.","category":"page"},{"location":"about/#Further-Packages-and-Links","page":"About","title":"Further Packages & Links","text":"","category":"section"},{"location":"about/","page":"About","title":"About","text":"Manopt.jl belongs to the Manopt family:","category":"page"},{"location":"about/","page":"About","title":"About","text":"manopt.org – The Matlab version of Manopt, see also their :octocat: GitHub repository\npymanopt.org – The Python version of Manopt – providing also several AD backends, see also their :octocat: GitHub 
repository","category":"page"},{"location":"about/","page":"About","title":"About","text":"but there are also more packages providing tools on manifolds:","category":"page"},{"location":"about/","page":"About","title":"About","text":"Jax Geometry (Python/Jax) for differential geometry and stochastic dynamics with deep learning\nGeomstats (Python with several backends) focusing on statistics and machine learning :octocat: GitHub repository\nGeoopt (Python & PyTorch) – Riemannian ADAM & SGD. :octocat: GitHub repository\nMcTorch (Python & PyTorch) – Riemannian SGD, Adagrad, ASA & CG.\nROPTLIB (C++) a Riemannian OPTimization LIBrary :octocat: GitHub repository\nTF Riemopt (Python & TensorFlow) Riemannian optimization using TensorFlow","category":"page"},{"location":"tutorials/GeodesicRegression/#How-to-perform-Geodesic-Regression","page":"Do Geodesic Regression","title":"How to perform Geodesic Regression","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Geodesic regression generalizes linear regression to Riemannian manifolds. Let’s first phrase it informally as follows:","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For given data points d_1ldotsd_n on a Riemannian manifold mathcal M, find the geodesic that “best explains” the data.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"The meaning of “best explain” still has to be clarified. 
We distinguish two cases: time labelled data and unlabelled data","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" using Manopt, ManifoldDiff, Manifolds, Random, Colors\n using LinearAlgebra: svd\n Random.seed!(42);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"We use the following data, where we want to highlight one of the points.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"n = 7\nσ = π / 8\nS = Sphere(2)\nbase = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndir = [-0.75, 0.5, 0.75]\ndata_orig = [exp(S, base, dir, t) for t in range(-0.5, 0.5; length=n)]\n# add noise to the points on the geodesic\ndata = map(p -> exp(S, p, rand(S; vector_at=p, σ=σ)), data_orig)\nhighlighted = 4;","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: The given data)","category":"page"},{"location":"tutorials/GeodesicRegression/#Time-Labeled-Data","page":"Do Geodesic Regression","title":"Time Labeled Data","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"If for each data item d_i we are also given a time point t_iinmathbb R, which are pairwise different, then we can use the least squares error to state the objective function as [Fle13]","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"F(pX) = frac12sum_i=1^n d_mathcal M^2(γ_pX(t_i) d_i)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"where d_mathcal M is the Riemannian distance and γ_pX is the geodesic with γ(0) = p and 
dotgamma(0) = X.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the real-valued case mathcal M = mathbb R^m the solution (p^* X^*) is given in closed form as follows: with d^* = frac1ndisplaystylesum_i=1^nd_i and t^* = frac1ndisplaystylesum_i=1^n t_i we get","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" X^* = fracsum_i=1^n (d_i-d^*)(t_i-t^*)sum_i=1^n (t_i-t^*)^2\nquadtext and quad\np^* = d^* - t^*X^*","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"and hence the linear regression result is the line γ_p^*X^*(t) = p^* + tX^*.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"On a Riemannian manifold we can phrase this as an optimization problem on the tangent bundle, i.e. 
the disjoint union of all tangent spaces, as","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"operatorname*argmin_(pX) in mathrmTmathcal M F(pX)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Due to linearity, the gradient of F(pX) is the sum of the single gradients of","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" frac12d_mathcal M^2bigl(γ_pX(t_i)d_ibigr)\n = frac12d_mathcal M^2bigl(exp_p(t_iX)d_ibigr)\n quad i1ldotsn","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"which can be computed using a chain rule of the squared distance and the exponential map, see for example [BG18] for details or Equations (7) and (8) of [Fle13]:","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"M = TangentBundle(S)\nstruct RegressionCost{T,S}\n data::T\n times::S\nend\nRegressionCost(data::T, times::S) where {T,S} = RegressionCost{T,S}(data, times)\nfunction (a::RegressionCost)(M, x)\n pts = [geodesic(M.manifold, x[M, :point], x[M, :vector], ti) for ti in a.times]\n return 1 / 2 * sum(distance.(Ref(M.manifold), pts, a.data) .^ 2)\nend\nstruct RegressionGradient!{T,S}\n data::T\n times::S\nend\nfunction RegressionGradient!(data::T, times::S) where {T,S}\n return RegressionGradient!{T,S}(data, times)\nend\nfunction (a::RegressionGradient!)(M, Y, x)\n pts = [geodesic(M.manifold, x[M, :point], x[M, :vector], ti) for ti in a.times]\n gradients = grad_distance.(Ref(M.manifold), a.data, pts)\n Y[M, :point] .= sum(\n ManifoldDiff.adjoint_differential_exp_basepoint.(\n Ref(M.manifold),\n Ref(x[M, :point]),\n [ti * x[M, :vector] for ti in a.times],\n 
gradients,\n ),\n )\n Y[M, :vector] .= sum(\n ManifoldDiff.adjoint_differential_exp_argument.(\n Ref(M.manifold),\n Ref(x[M, :point]),\n [ti * x[M, :vector] for ti in a.times],\n gradients,\n ),\n )\n return Y\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the Euclidean case, the result is given by the first principal component of a principal component analysis, see PCR, i.e. with p^* = frac1ndisplaystylesum_i=1^n d_i the direction X^* is obtained by defining the zero mean data matrix","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"D = bigl(d_1-p^* ldots d_n-p^*bigr) in mathbb R^mn","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"and taking X^* as an eigenvector to the largest eigenvalue of D^mathrmTD.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"We can do something similar, when considering the tangent space at the (Riemannian) mean of the data and then do a PCA on the coordinate coefficients with respect to a basis.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"m = mean(S, data)\nA = hcat(\n map(x -> get_coordinates(S, m, log(S, m, x), DefaultOrthonormalBasis()), data)...\n)\npca1 = get_vector(S, m, svd(A).U[:, 1], DefaultOrthonormalBasis())\nx0 = ArrayPartition(m, pca1)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"([0.6998621681746481, -0.013681674945026638, 0.7141468737791822], [0.5931302057517893, -0.5459465115717783, -0.5917254139611094])","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do 
Geodesic Regression","title":"Do Geodesic Regression","text":"The optimal “time labels” are then just the projections t_i = d_iX^*, i=1ldotsn.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"t = map(d -> inner(S, m, pca1, log(S, m, d)), data)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"7-element Vector{Float64}:\n 1.0763904949888323\n 0.4594060193318443\n -0.5030195874833682\n 0.02135686940521725\n -0.6158692507563633\n -0.24431652575028764\n -0.2259012492666664","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"And we can call the gradient descent. Note that since gradF! works in place of Y, we have to set the evaluation type accordingly.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"y = gradient_descent(\n M,\n RegressionCost(data, t),\n RegressionGradient!(data, t),\n x0;\n evaluation=InplaceEvaluation(),\n stepsize=ArmijoLinesearch(\n M;\n initial_stepsize=1.0,\n contraction_factor=0.990,\n sufficient_decrease=0.05,\n stop_when_stepsize_less=1e-9,\n ),\n stopping_criterion=StopAfterIteration(200) |\n StopWhenGradientNormLess(1e-8) |\n StopWhenStepsizeLess(1e-9),\n debug=[:Iteration, \" | \", :Cost, \"\\n\", :Stop, 50],\n)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Initial | F(x): 0.142862\n# 50 | F(x): 0.141113\n# 100 | F(x): 0.141113\n# 150 | F(x): 0.141113\n# 200 | F(x): 0.141113\nThe algorithm reached its maximal number of iterations (200).\n\n([0.7119768725361988, 0.009463059143003981, 0.7021391482357537], [0.590008151835008, -0.5543272518659472, 
-0.5908038715512287])","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the result, we can generate and plot all involved geodesics","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"dense_t = range(-0.5, 0.5; length=100)\ngeo = geodesic(S, y[M, :point], y[M, :vector], dense_t)\ninit_geo = geodesic(S, x0[M, :point], x0[M, :vector], dense_t)\ngeo_pts = geodesic(S, y[M, :point], y[M, :vector], t)\ngeo_conn_highlighted = shortest_geodesic(\n S, data[highlighted], geo_pts[highlighted], 0.5 .+ dense_t\n);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: Result of Geodesic Regression)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"In this image, together with the blue data points, you see the geodesic of the initialization in black (evaluated on -frac12frac12), the final point on the tangent bundle in orange, as well as the resulting regression geodesic in teal, (on the same interval as the start) as well as small teal points indicating the time points on the geodesic corresponding to the data. Additionally, a thin blue line indicates the geodesic between a data point and its corresponding data point on the geodesic. While this would be the closest point in Euclidean space and hence the two directions (along the geodesic vs. 
to the data point) would be orthogonal, here we have","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"inner(\n S,\n geo_pts[highlighted],\n log(S, geo_pts[highlighted], geo_pts[highlighted + 1]),\n log(S, geo_pts[highlighted], data[highlighted]),\n)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"0.002487393068917863","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"But we also started with one of the best scenarios, i.e. equally spaced points on a geodesic obstructed by noise.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"This gets worse if you start with less evenly distributed data","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"data2 = [exp(S, base, dir, t) for t in [-0.5, -0.49, -0.48, 0.1, 0.48, 0.49, 0.5]]\ndata2 = map(p -> exp(S, p, rand(S; vector_at=p, σ=σ / 2)), data2)\nm2 = mean(S, data2)\nA2 = hcat(\n map(x -> get_coordinates(S, m, log(S, m, x), DefaultOrthonormalBasis()), data2)...\n)\npca2 = get_vector(S, m, svd(A2).U[:, 1], DefaultOrthonormalBasis())\nx1 = ArrayPartition(m, pca2)\nt2 = map(d -> inner(S, m2, pca2, log(S, m2, d)), data2)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"7-element Vector{Float64}:\n 0.8226008307680276\n 0.470952643700004\n 0.7974195537403082\n 0.01533949241264346\n -0.6546705405852389\n -0.8913273825362389\n -0.5775954445730889","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"then we run 
again","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"y2 = gradient_descent(\n M,\n RegressionCost(data2, t2),\n RegressionGradient!(data2, t2),\n x1;\n evaluation=InplaceEvaluation(),\n stepsize=ArmijoLinesearch(\n M;\n initial_stepsize=1.0,\n contraction_factor=0.990,\n sufficient_decrease=0.05,\n stop_when_stepsize_less=1e-9,\n ),\n stopping_criterion=StopAfterIteration(200) |\n StopWhenGradientNormLess(1e-8) |\n StopWhenStepsizeLess(1e-9),\n debug=[:Iteration, \" | \", :Cost, \"\\n\", :Stop, 3],\n);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Initial | F(x): 0.089844\n# 3 | F(x): 0.085364\n# 6 | F(x): 0.085364\n# 9 | F(x): 0.085364\n# 12 | F(x): 0.085364\n# 15 | F(x): 0.085364\n# 18 | F(x): 0.085364\n# 21 | F(x): 0.085364\n# 24 | F(x): 0.085364\n# 27 | F(x): 0.085364\n# 30 | F(x): 0.085364\n# 33 | F(x): 0.085364\n# 36 | F(x): 0.085364\n# 39 | F(x): 0.085364\n# 42 | F(x): 0.085364\n# 45 | F(x): 0.085364\n# 48 | F(x): 0.085364\n# 51 | F(x): 0.085364\n# 54 | F(x): 0.085364\n# 57 | F(x): 0.085364\n# 60 | F(x): 0.085364\n# 63 | F(x): 0.085364\n# 66 | F(x): 0.085364\n# 69 | F(x): 0.085364\n# 72 | F(x): 0.085364\n# 75 | F(x): 0.085364\n# 78 | F(x): 0.085364\n# 81 | F(x): 0.085364\n# 84 | F(x): 0.085364\n# 87 | F(x): 0.085364\n# 90 | F(x): 0.085364\n# 93 | F(x): 0.085364\n# 96 | F(x): 0.085364\n# 99 | F(x): 0.085364\n# 102 | F(x): 0.085364\n# 105 | F(x): 0.085364\n# 108 | F(x): 0.085364\n# 111 | F(x): 0.085364\n# 114 | F(x): 0.085364\n# 117 | F(x): 0.085364\n# 120 | F(x): 0.085364\n# 123 | F(x): 0.085364\n# 126 | F(x): 0.085364\n# 129 | F(x): 0.085364\n# 132 | F(x): 0.085364\n# 135 | F(x): 0.085364\n# 138 | F(x): 0.085364\n# 141 | F(x): 0.085364\n# 144 | F(x): 0.085364\n# 147 | F(x): 0.085364\n# 150 | F(x): 0.085364\n# 153 | F(x): 0.085364\n# 156 | F(x): 0.085364\n# 159 | F(x): 0.085364\n# 
162 | F(x): 0.085364\n# 165 | F(x): 0.085364\n# 168 | F(x): 0.085364\n# 171 | F(x): 0.085364\n# 174 | F(x): 0.085364\n# 177 | F(x): 0.085364\n# 180 | F(x): 0.085364\n# 183 | F(x): 0.085364\n# 186 | F(x): 0.085364\n# 189 | F(x): 0.085364\n# 192 | F(x): 0.085364\n# 195 | F(x): 0.085364\n# 198 | F(x): 0.085364\nThe algorithm reached its maximal number of iterations (200).","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For plotting we again generate all data","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"geo2 = geodesic(S, y2[M, :point], y2[M, :vector], dense_t)\ninit_geo2 = geodesic(S, x1[M, :point], x1[M, :vector], dense_t)\ngeo_pts2 = geodesic(S, y2[M, :point], y2[M, :vector], t2)\ngeo_conn_highlighted2 = shortest_geodesic(\n S, data2[highlighted], geo_pts2[highlighted], 0.5 .+ dense_t\n);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: A second result with different time points)","category":"page"},{"location":"tutorials/GeodesicRegression/#Unlabeled-Data","page":"Do Geodesic Regression","title":"Unlabeled Data","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"If we are not given time points t_i, then the optimization problem extends – informally speaking – to also finding the “best fitting” (in the sense of smallest error). 
To formalize, the objective function here reads","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"F(p X t) = frac12sum_i=1^n d_mathcal M^2(γ_pX(t_i) d_i)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"where t = (t_1ldotst_n) in mathbb R^n is now an additional parameter of the objective function. We write F_1(p X) to refer to the function on the tangent bundle for fixed values of t (as the one in the last part) and F_2(t) for the function F(p X t) as a function in t with fixed values (p X).","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the Euclidean case, there is no necessity to optimize with respect to t, as we saw above for the initialization of the fixed time points.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"On a Riemannian manifold this can be stated as a problem on the product manifold mathcal N = mathrmTmathcal M times mathbb R^n, i.e.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"N = M × Euclidean(length(t2))","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"ProductManifold with 2 submanifolds:\n TangentBundle(Sphere(2, ℝ))\n Euclidean(7; field = ℝ)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" operatorname*argmin_bigl((pX)tbigr)inmathcal N F(p X t)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"In this tutorial we present an approach to solve 
this using an alternating gradient descent scheme. To be precise, we define the cost funcion now on the product manifold","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"struct RegressionCost2{T}\n data::T\nend\nRegressionCost2(data::T) where {T} = RegressionCost2{T}(data)\nfunction (a::RegressionCost2)(N, x)\n TM = N[1]\n pts = [\n geodesic(TM.manifold, x[N, 1][TM, :point], x[N, 1][TM, :vector], ti) for\n ti in x[N, 2]\n ]\n return 1 / 2 * sum(distance.(Ref(TM.manifold), pts, a.data) .^ 2)\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"The gradient in two parts, namely (a) the same gradient as before w.r.t. (pX) Tmathcal M, just now with a fixed t in mind for the second component of the product manifold mathcal N","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"struct RegressionGradient2a!{T}\n data::T\nend\nRegressionGradient2a!(data::T) where {T} = RegressionGradient2a!{T}(data)\nfunction (a::RegressionGradient2a!)(N, Y, x)\n TM = N[1]\n p = x[N, 1]\n pts = [geodesic(TM.manifold, p[TM, :point], p[TM, :vector], ti) for ti in x[N, 2]]\n gradients = Manopt.grad_distance.(Ref(TM.manifold), a.data, pts)\n Y[TM, :point] .= sum(\n ManifoldDiff.adjoint_differential_exp_basepoint.(\n Ref(TM.manifold),\n Ref(p[TM, :point]),\n [ti * p[TM, :vector] for ti in x[N, 2]],\n gradients,\n ),\n )\n Y[TM, :vector] .= sum(\n ManifoldDiff.adjoint_differential_exp_argument.(\n Ref(TM.manifold),\n Ref(p[TM, :point]),\n [ti * p[TM, :vector] for ti in x[N, 2]],\n gradients,\n ),\n )\n return Y\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Finally, we addionally look for a fixed point x=(pX) mathrmTmathcal M at the gradient with 
respect to tmathbb R^n, i.e. the second component, which is given by","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" (operatornamegradF_2(t))_i\n = - dot γ_pX(t_i) log_γ_pX(t_i)d_i_γ_pX(t_i) i = 1 ldots n","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"struct RegressionGradient2b!{T}\n data::T\nend\nRegressionGradient2b!(data::T) where {T} = RegressionGradient2b!{T}(data)\nfunction (a::RegressionGradient2b!)(N, Y, x)\n TM = N[1]\n p = x[N, 1]\n pts = [geodesic(TM.manifold, p[TM, :point], p[TM, :vector], ti) for ti in x[N, 2]]\n logs = log.(Ref(TM.manifold), pts, a.data)\n pt = map(\n d -> vector_transport_to(TM.manifold, p[TM, :point], p[TM, :vector], d), pts\n )\n Y .= -inner.(Ref(TM.manifold), pts, logs, pt)\n return Y\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"We can reuse the computed initial values from before, just that now we are on a product manifold","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"x2 = ArrayPartition(x1, t2)\nF3 = RegressionCost2(data2)\ngradF3_vector = [RegressionGradient2a!(data2), RegressionGradient2b!(data2)];","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"and we run the algorithm","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"y3 = alternating_gradient_descent(\n N,\n F3,\n gradF3_vector,\n x2;\n evaluation=InplaceEvaluation(),\n debug=[:Iteration, \" | \", :Cost, \"\\n\", :Stop, 50],\n stepsize=ArmijoLinesearch(\n M;\n contraction_factor=0.999,\n sufficient_decrease=0.066,\n 
stop_when_stepsize_less=1e-11,\n retraction_method=ProductRetraction(SasakiRetraction(2), ExponentialRetraction()),\n ),\n inner_iterations=1,\n)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Initial | F(x): 0.089844\n# 50 | F(x): 0.091097\n# 100 | F(x): 0.091097\nThe algorithm reached its maximal number of iterations (100).\n\n(ArrayPartition{Float64, Tuple{Vector{Float64}, Vector{Float64}}}(([0.750222090700214, 0.031464227399200885, 0.6604368380243274], [0.6636489079535082, -0.3497538263293046, -0.737208025444054])), [0.7965909273713889, 0.43402264218923514, 0.755822122896529, 0.001059348203453764, -0.6421135044471217, -0.8635572995105818, -0.5546338813212247])","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"which we can render into an image, creating the geodesics again","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"geo3 = geodesic(S, y3[N, 1][M, :point], y3[N, 1][M, :vector], dense_t)\ninit_geo3 = geodesic(S, x1[M, :point], x1[M, :vector], dense_t)\ngeo_pts3 = geodesic(S, y3[N, 1][M, :point], y3[N, 1][M, :vector], y3[N, 2])\nt3 = y3[N, 2]\ngeo_conns = shortest_geodesic.(Ref(S), data2, geo_pts3, Ref(0.5 .+ 4*dense_t));","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"which yields","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: The third result)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Note that the geodesics from the data to the regression geodesic meet at a nearly orthogonal 
angle.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Acknowledgement. Parts of this tutorial are based on the bachelor thesis of Jeremias Arf.","category":"page"},{"location":"tutorials/GeodesicRegression/#Literature","page":"Do Geodesic Regression","title":"Literature","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Pages = [\"tutorials/GeodesicRegression.md\"]\nCanonical=false","category":"page"},{"location":"solvers/FrankWolfe/#FrankWolfe","page":"Frank-Wolfe","title":"Frank Wolfe Method","text":"","category":"section"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"Frank_Wolfe_method\nFrank_Wolfe_method!","category":"page"},{"location":"solvers/FrankWolfe/#Manopt.Frank_Wolfe_method","page":"Frank-Wolfe","title":"Manopt.Frank_Wolfe_method","text":"Frank_Wolfe_method(M, f, grad_f, p)\nFrank_Wolfe_method(M, gradient_objective, p; kwargs...)\n\nPerform the Frank-Wolfe algorithm to compute for mathcal C subset mathcal M\n\n operatorname*argmin_pmathcal C f(p)\n\nwhere the main step within the algorithm is a constrained optimisation, that is the sub problem (Oracle)\n\n q_k = operatornameargmin_q in C operatornamegrad F(p_k) log_p_kq\n\nfor every iterate p_k together with a stepsize s_k1, by default s_k = frac2k+2. This algorithm is inspired by but slightly more general than Weber, Sra, Math. 
Prog., 2022.\n\nThe next iterate is then given by p_k+1 = γ_p_kq_k(s_k), where by default γ is the shortest geodesic between the two points but can also be changed to use a retraction and its inverse.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function f mathcal Mℝ to find a minimizer p^* for\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of f\nas a function (M, p) -> X or a function (M, X, p) -> X working in place of X.\np – an initial value p mathcal C, note that it really has to be a feasible point\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nKeyword Arguments\n\nevaluation - (AllocatingEvaluation) whether grad_f is an inplace or allocating (default) function\ninitial_vector – (zero_vector(M,p)) how to initialize the inner gradient tangent vector\nstopping_criterion – (StopAfterIteration(500) |StopWhenGradientNormLess(1.0e-6)) a stopping criterion\nretraction_method – (default_retraction_method(M, typeof(p))) a type of retraction\nstepsize - (DecreasingStepsize(; length=2.0, shift=2)) a Stepsize to use; note that it always has to be less than 1. The default is the one proposed by Frank & Wolfe: s_k = frac2k+2.\nsub_cost - (FrankWolfeCost(p, initial_vector)) – the cost of the Frank-Wolfe sub problem which by default uses the current iterate and (sub)gradient of the current iteration to define a default cost; this is used to define the default sub_objective. It is ignored if you set that or the sub_problem directly\nsub_grad - (FrankWolfeGradient(p, initial_vector)) – the gradient of the Frank-Wolfe sub problem which by default uses the current iterate and (sub)gradient of the current iteration to define a default gradient; this is used to define the default sub_objective. 
It is ignored, if you set that or the sub_problem directly\nsub_objective - (ManifoldGradientObjective(sub_cost, sub_gradient)) – the objective for the Frank-Wolfe sub problem this is used to define the default sub_problem. It is ignored, if you set the sub_problem manually\nsub_problem - (DefaultManoptProblem(M, sub_objective)) – the Frank-Wolfe sub problem to solve. This can be given in three forms\nas an AbstractManoptProblem, then the sub_state specifies the solver to use\nas a closed form solution, e.g. a function, evaluating with new allocations, that is a function (M, p, X) -> q that solves the sub problem on M given the current iterate p and (sub)gradient X.\nas a closed form solution, e.g. a function, evaluating in place, that is a function (M, q, p, X) -> q working in place of q, with the parameters as in the last point\nFor points 2 and 3 the sub_state has to be set to the corresponding AbstractEvaluationType, AllocatingEvaluation and InplaceEvaluation, respectively\nsub_state - (evaluation if sub_problem is a function, a decorated GradientDescentState otherwise) for a function, the evaluation is inherited from the Frank-Wolfe evaluation keyword.\nsub_kwargs - ([]) – keyword arguments to decorate the sub_state default state in case the sub_problem is not a function\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/FrankWolfe/#Manopt.Frank_Wolfe_method!","page":"Frank-Wolfe","title":"Manopt.Frank_Wolfe_method!","text":"Frank_Wolfe_method!(M, f, grad_f, p; kwargs...)\nFrank_Wolfe_method!(M, gradient_objective, p; kwargs...)\n\nPerform the Frank Wolfe method in place of p.\n\nFor all options and keyword arguments, see Frank_Wolfe_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/FrankWolfe/#State","page":"Frank-Wolfe","title":"State","text":"","category":"section"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"FrankWolfeState","category":"page"},{"location":"solvers/FrankWolfe/#Manopt.FrankWolfeState","page":"Frank-Wolfe","title":"Manopt.FrankWolfeState","text":"FrankWolfeState <: AbstractManoptSolverState\n\nA struct to store the current state of the Frank_Wolfe_method\n\nIt comes in two forms, depending on the realisation of the subproblem.\n\nFields\n\np – the current iterate, i.e. a point on the manifold\nX – the current gradient operatornamegrad F(p), i.e. 
a tangent vector to p.\ninverse_retraction_method – (default_inverse_retraction_method(M, typeof(p))) an inverse retraction method to use within Frank Wolfe.\nsub_problem – an AbstractManoptProblem or a function (M, p, X) -> q or (M, q, p, X) for a closed form solution of the sub problem\nsub_state – an AbstractManoptSolverState for the subsolver or an AbstractEvaluationType in case the sub problem is provided as a function\nstop – (StopAfterIteration(200) |StopWhenGradientNormLess(1.0e-6)) a StoppingCriterion\nstepsize - (DecreasingStepsize(; length=2.0, shift=2)) s_k which by default is set to s_k = frac2k+2.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use within Frank-Wolfe\n\nFor the subtask, we need a method to solve\n\n operatorname*argmin_qmathcal M X log_p qqquad text where X=operatornamegrad f(p)\n\nConstructor\n\nFrankWolfeState(M, p, X, sub_problem, sub_state)\n\nwhere the remaining fields from above are keyword arguments with their defaults already given in brackets.\n\n\n\n\n\n","category":"type"},{"location":"solvers/FrankWolfe/#Helpers","page":"Frank-Wolfe","title":"Helpers","text":"","category":"section"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"For the inner sub-problem you can easily create the corresponding cost and gradient using","category":"page"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"FrankWolfeCost\nFrankWolfeGradient","category":"page"},{"location":"solvers/FrankWolfe/#Manopt.FrankWolfeCost","page":"Frank-Wolfe","title":"Manopt.FrankWolfeCost","text":"FrankWolfeCost{P,T}\n\nA structure to represent the oracle sub problem in the Frank_Wolfe_method. 
The cost function reads\n\nF(q) = X log_p q\n\nThe values p and X are stored within this functor and should be references to the iterate and gradient from within FrankWolfeState.\n\n\n\n\n\n","category":"type"},{"location":"solvers/FrankWolfe/#Manopt.FrankWolfeGradient","page":"Frank-Wolfe","title":"Manopt.FrankWolfeGradient","text":"FrankWolfeGradient{P,T}\n\nA structure to represent the gradient of the oracle sub problem in the Frank_Wolfe_method, that is for a given point p and a tangent vector X we have\n\nF(q) = X log_p q\n\nIts gradient can be computed easily using adjoint_differential_log_argument.\n\nThe values p and X are stored within this functor and should be references to the iterate and gradient from within FrankWolfeState.\n\n\n\n\n\n","category":"type"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"Pages = [\"solvers/FrankWolfe.md\"]\nCanonical=false","category":"page"},{"location":"tutorials/ImplementASolver/#How-to-implementing-your-own-solver","page":"Implement a Solver","title":"How to implement your own solver","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"When you have used a few solvers from Manopt.jl, for example in the opening tutorial Get Started: Optimize!, you might come to the idea of implementing a solver yourself.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"After a short introduction of the algorithm we will implement, this tutorial first discusses the structural details, i.e. what a solver consists of and “works with”. Afterwards, we will show how to implement the algorithm. 
Finally, we will discuss how to make the algorithm both nice for the user as well as initialized in a way that it can benefit from features already available in Manopt.jl.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"note: Note\nIf you have implemented your own solver, we would be very happy to have that within Manopt.jl as well, so maybe consider opening a Pull Request","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"using Manopt, Manifolds, Random","category":"page"},{"location":"tutorials/ImplementASolver/#Our-Guiding-Example:-A-random-walk-Minimization","page":"Implement a Solver","title":"Our Guiding Example: A random walk Minimization","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Since most serious algorithms should be implemented in Manopt.jl itself directly, we will implement a solver that randomly walks on the manifold and keeps track of the lowest point visited. 
As for algorithms in Manopt.jl we aim to implement this generically for any manifold that is implemented using ManifoldsBase.jl.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The Random Walk Minimization","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Given:","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"a manifold mathcal M\na starting point p=p^(0)\na cost function f mathcal M tomathbb R.\na parameter sigma 0.\na retraction operatornameretr_p(X) that maps Xin T_pmathcal M to the manifold.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can run the following steps of the algorithm","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"set k=0\nset our best point q = p^(0)\nRepeat until a stopping criterion is fulfilled\nChoose a random tangent vector X^(k) in T_p^(k)mathcal M of length lVert X^(k) rVert = sigma\n“Walk” along this direction, i.e. 
p^(k+1) = operatornameretr_p^(k)(X^(k))\nIf f(p^(k+1)) f(q) set q = p^(k+1) as our new best visited point\nReturn q as the resulting best point we visited","category":"page"},{"location":"tutorials/ImplementASolver/#Preliminaries-–-Elements-a-Solver-works-on","page":"Implement a Solver","title":"Preliminaries – Elements a Solver works on","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"There are two main ingredients a solver needs: a problem to work on and the state of a solver, which “identifies” the solver and stores intermediate results.","category":"page"},{"location":"tutorials/ImplementASolver/#The-“Task”-–-An-AbstractManoptProblem","page":"Implement a Solver","title":"The “Task” – An AbstractManoptProblem","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"A problem in Manopt.jl usually consists of a manifold (an AbstractManifold) and an AbstractManifoldObjective describing the function we have and its features. In our case the objective is (just) a ManifoldCostObjective that stores the cost function f(M,p) = .... 
More generally, it might for example store a gradient function or the Hessian or any other information we have about our task.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"This is something independent of the solver itself, since it only identifies the problem we want to solve independent of how we want to solve it – or in other words, this type contains all information that is static and independent of the specific solver at hand.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Usually the problem’s variable is called mp.","category":"page"},{"location":"tutorials/ImplementASolver/#The-Solver-–-An-AbstractManoptSolverState","page":"Implement a Solver","title":"The Solver – An AbstractManoptSolverState","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Everything that is needed by a solver during the iterations, all its parameters, interim values that are needed beyond just one iteration, is stored in a subtype of the AbstractManoptSolverState. This identifies the solver uniquely.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"In our case we want to store five things","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"the current iterate p=p^(k)\nthe best visited point q\nthe variable sigma 0\nthe retraction operatornameretr to use (cf. retractions and inverse retractions)\na criterion, when to stop, i.e. 
a StoppingCriterion","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can define this as","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"mutable struct RandomWalkState{\n P,\n R<:AbstractRetractionMethod,\n S<:StoppingCriterion,\n} <: AbstractManoptSolverState\n p::P\n q::P\n σ::Float64\n retraction_method::R\n stop::S\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The stopping criterion is usually stored in the state’s stop field. If you have a reason to do otherwise, you have one more function to implement (see next section). For ease of use, we can provide a constructor that, for example, chooses a good default for the retraction based on a given manifold.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function RandomWalkState(M::AbstractManifold, p::P=rand(M);\n σ = 0.1,\n retraction_method::R=default_retraction_method(M),\n stopping_criterion::S=StopAfterIteration(200)\n) where {P, R<:AbstractRetractionMethod, S<:StoppingCriterion}\n return RandomWalkState{P,R,S}(p, copy(M, p), σ, retraction_method, stopping_criterion)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Parametrising the state avoids abstractly typed fields. 
The keyword arguments for the retraction and stopping criterion are the ones usually used in Manopt.jl and provide an easy way to construct this state now.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"States usually have a shortened name as their variable; we will use rws for our state here.","category":"page"},{"location":"tutorials/ImplementASolver/#Implementing-the-Your-solver","page":"Implement a Solver","title":"Implementing your solver","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"There are basically only two methods we need to implement for our solver","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"initialize_solver!(mp, rws) which initialises the solver before the first iteration\nstep_solver!(mp, rws, i) which implements the ith iteration, where i is given to you as the third parameter\nget_iterate(rws) which accesses the iterate from other places in the solver\nget_solver_result(rws) returning the solver’s final (best) point we reached. By default this would return the last iterate rws.p (or more precisely calls get_iterate), but since we randomly walk and remember our best point in q, this has to return rws.q.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The first two functions are in-place functions, that is they modify our solver state rws. 
You implement these by multiple dispatch on the types after importing said functions from Manopt:","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"import Manopt: initialize_solver!, step_solver!, get_iterate, get_solver_result","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The state above has two fields where we use the common names used in Manopt.jl, that is the StoppingCriterion is usually in stop and the iterate in p. If your choice is different, you need to reimplement","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"stop_solver!(mp, rws, i) to determine whether or not to stop after the ith iteration.\nget_iterate(rws) to access the current iterate","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We recommend following the general scheme with the stop field. If you have specific criteria when to stop, consider implementing your own stopping criterion instead.","category":"page"},{"location":"tutorials/ImplementASolver/#Initialization-and-Iterate-Access","page":"Implement a Solver","title":"Initialization & Iterate Access","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"For our solver, there is not so much to initialize; just to be safe we should copy over the initial value in p we start with, to q. We do not have to care about remembering the iterate, that is done by Manopt.jl. 
For the iterate access we just have to pass p.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function initialize_solver!(mp::AbstractManoptProblem, rws::RandomWalkState)\n M = get_manifold(mp)\n copyto!(M, rws.q, rws.p) # Set q = p^{(0)}\n return rws\nend\nget_iterate(rws::RandomWalkState) = rws.p\nget_solver_result(rws::RandomWalkState) = rws.q","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"and similarly we implement the step. Here we make use of the fact that the problem (and also the objective in fact) have access functions for their elements, the one we need is get_cost.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function step_solver!(mp::AbstractManoptProblem, rws::RandomWalkState, i)\n M = get_manifold(mp) # for ease of use get the manifold from the problem\n X = rand(M; vector_at=rws.p) # generate a direction\n X .*= rws.σ/norm(M, rws.p, X)\n # Walk\n retract!(M, rws.p, rws.p, X, rws.retraction_method)\n # is the new point better? Then store it\n if get_cost(mp, rws.p) < get_cost(mp, rws.q)\n copyto!(M, rws.q, rws.p)\n end\n return rws\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Performance-wise we could improve the number of allocations by making X also a field of our rws but let’s keep it simple here. We could also store the cost of q in the state, but we will see how to easily also enable this solver to allow for caching. In practice, however, it is preferable to cache intermediate values like cost of q in the state when it can be easily achieved. 
This way we do not have to deal with overheads of an external cache.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Now we can just run the solver already! We take the same example as for the other tutorials","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We first define our task, the Riemannian Center of Mass from the Get Started: Optimize! tutorial.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Random.seed!(23)\nn = 100\nσ = π / 8\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];\nf(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can now generate the problem with its objective and the state","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"mp = DefaultManoptProblem(M, ManifoldCostObjective(f))\ns = RandomWalkState(M; σ = 0.2)\n\nsolve!(mp, s)\nget_solver_result(s)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"3-element Vector{Float64}:\n -0.2412674850987521\n 0.8608618657176527\n -0.44800317943876844","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The function solve! 
works also in place of s, but the last line illustrates how to access the result in general; we could also just look at s.p, but the function get_iterate is also used in several other places.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We could for example easily set up a second solver to work from a specified starting point with a different σ like","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"s2 = RandomWalkState(M, [1.0, 0.0, 0.0]; σ = 0.1)\nsolve!(mp, s2)\nget_solver_result(s2)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"3-element Vector{Float64}:\n 1.0\n 0.0\n 0.0","category":"page"},{"location":"tutorials/ImplementASolver/#Ease-of-Use-I:-The-high-level-interface(s)","page":"Implement a Solver","title":"Ease of Use I: The high level interface(s)","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Manopt.jl offers a few additional features for solvers in their high level interfaces, for example debug= for debug, record= keywords for debug and recording within solver states or count= and cache keywords for the objective.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can introduce these here as well with just a few lines of code. There are usually two steps. 
We further need three internal function from Manopt.jl","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"using Manopt: get_solver_return, indicates_convergence, status_summary","category":"page"},{"location":"tutorials/ImplementASolver/#A-high-level-interface-using-the-objective","page":"Implement a Solver","title":"A high level interface using the objective","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"This could be considered as an interims step to the high-level interface: If we already have the objective – in our case a ManifoldCostObjective at hand, the high level interface consists of the steps","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"possibly decorate the objective\ngenerate the problem\ngenerate and possiblz generate the state\ncall the solver\ndetermine the return value","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We illustrate the step with an in-place variant here. A variant that keeps the given start point unchanged would just add a copy(M, p) upfront. 
Manopt.jl provides both variants.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function random_walk_algorithm!(\n M::AbstractManifold,\n mgo::ManifoldCostObjective,\n p;\n σ = 0.1,\n retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)),\n stopping_criterion::StoppingCriterion=StopAfterIteration(200),\n kwargs...,\n)\n dmgo = decorate_objective!(M, mgo; kwargs...)\n dmp = DefaultManoptProblem(M, dmgo)\n s = RandomWalkState(M, p;\n σ=σ,\n retraction_method=retraction_method, stopping_criterion=stopping_criterion,\n )\n ds = decorate_state!(s; kwargs...)\n solve!(dmp, ds)\n return get_solver_return(get_objective(dmp), ds)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"random_walk_algorithm! (generic function with 1 method)","category":"page"},{"location":"tutorials/ImplementASolver/#The-high-level-interface","page":"Implement a Solver","title":"The high level interface","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Starting from the last section, the usual call a user would prefer is just passing a manifold M, the cost f, and maybe a start point p.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function random_walk_algorithm!(M::AbstractManifold, f, p=rand(M); kwargs...)\n mgo = ManifoldCostObjective(f)\n return random_walk_algorithm!(M, mgo, p; kwargs...)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"random_walk_algorithm! 
(generic function with 3 methods)","category":"page"},{"location":"tutorials/ImplementASolver/#Ease-of-Use-II:-The-State-Summary","page":"Implement a Solver","title":"Ease of Use II: The State Summary","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"In case you set return_state=true, the solver should return a summary of the run. When a show method is provided, users can easily read such a summary in a terminal. It should reflect the main parameters, if they are not too verbose, and provide information about the reason it stopped and whether this indicates convergence.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Here it would for example look like","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"import Base: show\nfunction show(io::IO, rws::RandomWalkState)\n i = get_count(rws, :Iterations)\n Iter = (i > 0) ? \"After $i iterations\\n\" : \"\"\n Conv = indicates_convergence(rws.stop) ? \"Yes\" : \"No\"\n s = \"\"\"\n # Solver state for `Manopt.jl`s Tutorial Random Walk\n $Iter\n ## Parameters\n * retraction method: $(rws.retraction_method)\n * σ : $(rws.σ)\n\n ## Stopping Criterion\n $(status_summary(rws.stop))\n This indicates convergence: $Conv\"\"\"\n return print(io, s)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"show (generic function with 671 methods)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Now the algorithm can be easily called and provides, if desired, all features of a Manopt.jl algorithm. 
For example, to see the summary, we could now just call","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"q = random_walk_algorithm!(M, f; return_state=true)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"# Solver state for `Manopt.jl`s Tutorial Random Walk\nAfter 200 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n* σ : 0.1\n\n## Stopping Criterion\nMax Iteration 200: reached\nThis indicates convergence: No","category":"page"},{"location":"tutorials/ImplementASolver/#Conclusion-and-Beyond","page":"Implement a Solver","title":"Conclusion & Beyond","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We saw in this tutorial how to implement a simple cost-based algorithm, to illustrate how optimization algorithms are covered in Manopt.jl.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"One feature we did not cover is that most algorithms allow for in-place as well as allocating variants of their functions, as soon as they work on more than just the cost, e.g. gradients, proximal maps or Hessians. This is usually a keyword argument of the objective and hence also part of the high-level interfaces.","category":"page"},{"location":"tutorials/HowToDebug/#How-to-Print-Debug-Output","page":"Print Debug Output","title":"How to Print Debug Output","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"This tutorial aims to illustrate how to perform debug output. 
For that we consider an example that includes a subsolver, to also consider its debug capabilities.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"The problem itself is hence not the main focus.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"We consider a nonnegative PCA which we can write as a constrained problem on the Sphere","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Let’s first load the necessary packages.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"using Manopt, Manifolds, Random, LinearAlgebra\nRandom.seed!(42);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"d = 4\nM = Sphere(d - 1)\nv0 = project(M, [ones(2)..., zeros(d - 2)...])\nZ = v0 * v0'\n#Cost and gradient\nf(M, p) = -tr(transpose(p) * Z * p) / 2\ngrad_f(M, p) = project(M, p, -transpose.(Z) * p / 2 - Z * p / 2)\n# Constraints\ng(M, p) = -p # i.e. 
p ≥ 0\nmI = -Matrix{Float64}(I, d, d)\n# Vector of gradients of the constraint components\ngrad_g(M, p) = [project(M, p, mI[:, i]) for i in 1:d]","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Then we can take a starting point","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p0 = project(M, [ones(2)..., zeros(d - 3)..., 0.1])","category":"page"},{"location":"tutorials/HowToDebug/#Simple-debug-output","page":"Print Debug Output","title":"Simple debug output","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Any solver accepts the keyword debug=, which in the simplest case can be set to an array of strings, symbols and a number.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Strings are printed in every iteration as is (cf. DebugDivider) and should be used to finish the array with a line break.\nthe last number in the array is used with DebugEvery to print the debug only every ith iteration.\nAny Symbol is converted into certain debug prints","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Certain symbols starting with a capital letter are mapped to certain prints, e.g. :Cost is mapped to DebugCost() to print the current cost function value. A full list is provided in the DebugActionFactory. A special keyword is :Stop, which is only added to the final debug hook to print the stopping criterion.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Any symbol with a small letter is mapped to fields of the AbstractManoptSolverState which is used. 
This way you can easily print internal data, if you know their names.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Let’s look at an example first: we want to print the current iteration number, the current cost function value, as well as the value ϵ from the ExactPenaltyMethodState. To keep the amount of output at a reasonable level, we want to only print the debug every 25th iteration.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Then we can write","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p1 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [:Iteration, :Cost, \" | \", :ϵ, 25, \"\\n\", :Stop]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.001\n# 25 f(x): -0.499449 | ϵ: 0.0001778279410038921\n# 50 f(x): -0.499995 | ϵ: 3.1622776601683734e-5\n# 75 f(x): -0.500000 | ϵ: 5.623413251903474e-6\n# 100 f(x): -0.500000 | ϵ: 1.0e-6\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/#Advanced-Debug-output","page":"Print Debug Output","title":"Advanced Debug output","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"There are two more advanced variants that can be used. The first is a tuple of a symbol and a string, where the string is used as the format string that most DebugActions accept. 
The second is to directly provide a DebugAction.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"We can for example change the way the :ϵ is printed by adding a format string and use DebugCost() which is equivalent to using :Cost. Especially with the format change, the lines are more consistent in length.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p2 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [:Iteration, DebugCost(), (:ϵ,\" | ϵ: %.8f\"), 25, \"\\n\", :Stop]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.00100000\n# 25 f(x): -0.499449 | ϵ: 0.00017783\n# 50 f(x): -0.499995 | ϵ: 0.00003162\n# 75 f(x): -0.500000 | ϵ: 0.00000562\n# 100 f(x): -0.500000 | ϵ: 0.00000100\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"You can also write your own DebugAction functor, where the function to implement has the same signature as the step function, that is an AbstractManoptProblem, an AbstractManoptSolverState, as well as the current iterate. 
For example, the already mentioned DebugDivider(s) is given as","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"mutable struct DebugDivider{TIO<:IO} <: DebugAction\n io::TIO\n divider::String\n DebugDivider(divider=\" | \"; io::IO=stdout) = new{typeof(io)}(io, divider)\nend\nfunction (d::DebugDivider)(::AbstractManoptProblem, ::AbstractManoptSolverState, i::Int)\n (i >= 0) && (!isempty(d.divider)) && (print(d.io, d.divider))\n return nothing\nend","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"or you could implement that of course just for your specific problem or state.","category":"page"},{"location":"tutorials/HowToDebug/#Subsolver-Debug","page":"Print Debug Output","title":"Subsolver Debug","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Most subsolvers have a sub_kwargs keyword, such that you can pass keywords to the sub solver as well. This works well if you do not plan to change the subsolver. If you do, you can wrap your own solver_state= argument in a decorate_state! and pass a debug= keyword to this function call. Keywords within such a keyword have to be passed as pairs (:debug => [...]).","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"A main problem now is that this debug is issued every sub solver call or initialisation, as the following print of just a . 
per sub solver test/call illustrates","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p3 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [\"\\n\",:Iteration, DebugCost(), (:ϵ,\" | ϵ: %.8f\"), 25, \"\\n\", :Stop],\n sub_kwargs = [:debug => [\".\"]]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.00100000\n........................................................\n# 25 f(x): -0.499449 | ϵ: 0.00017783\n..................................................\n# 50 f(x): -0.499995 | ϵ: 0.00003162\n..................................................\n# 75 f(x): -0.500000 | ϵ: 0.00000562\n..................................................\n# 100 f(x): -0.500000 | ϵ: 0.00000100\n....The value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"The different lengths of the dotted lines come from the fact that —at least in the beginning— the subsolver performs a few steps.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"For this issue, there is the next symbol (similar to the :Stop) to indicate that a debug set is a subsolver set :Subsolver, which introduces a DebugWhenActive that is only activated when the outer debug is actually active, i.e. DebugEvery is active itself. 
Let’s","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p4 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [:Iteration, DebugCost(), (:ϵ,\" | ϵ: %.8f\"), 25, \"\\n\", :Stop],\n sub_kwargs = [\n :debug => [\" | \", :Iteration, :Cost, \"\\n\",:Stop, :Subsolver]\n ]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.00100000\n | Initial f(x): -0.499127\n | # 1 f(x): -0.499147\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (0.0002121889852717264) is less than 0.001.\n# 25 f(x): -0.499449 | ϵ: 0.00017783\n | Initial f(x): -0.499993\n | # 1 f(x): -0.499994\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (1.6025009584517956e-5) is less than 0.001.\n# 50 f(x): -0.499995 | ϵ: 0.00003162\n | Initial f(x): -0.500000\n | # 1 f(x): -0.500000\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (9.966301158124465e-7) is less than 0.001.\n# 75 f(x): -0.500000 | ϵ: 0.00000562\n | Initial f(x): -0.500000\n | # 1 f(x): -0.500000\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (5.4875346930698466e-8) is less than 0.001.\n# 100 f(x): -0.500000 | ϵ: 0.00000100\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"where we now see that the subsolver always only requires one step. 
Note that, since the debug of an iteration happens after a step, the sub solver runs before the debug output for that iteration number is printed.","category":"page"},{"location":"functions/manifold/#Specific-manifold-functions","page":"Specific Manifold Functions","title":"Specific manifold functions","text":"","category":"section"},{"location":"functions/manifold/","page":"Specific Manifold Functions","title":"Specific Manifold Functions","text":"This small section extends the functions available from ManifoldsBase.jl and Manifolds.jl, especially a few random generators, which are simpler than the available functions.","category":"page"},{"location":"functions/manifold/","page":"Specific Manifold Functions","title":"Specific Manifold Functions","text":"Modules = [Manopt]\nPages = [\"manifold_functions.jl\"]","category":"page"},{"location":"functions/manifold/#Manopt.reflect-Tuple{AbstractManifold, Any, Any}","page":"Specific Manifold Functions","title":"Manopt.reflect","text":"reflect(M, p, x, kwargs...)\nreflect!(M, q, p, x, kwargs...)\n\nReflect the point x from the manifold M at point p, i.e.\n\n operatornamerefl_p(x) = operatornameretr_p(-operatornameretr^-1_p x)\n\nwhere operatornameretr and operatornameretr^-1 denote a retraction and an inverse retraction, respectively. This can also be done in place of q.\n\nKeyword arguments\n\nretraction_method - (default_retraction_method(M, typeof(p))) the retraction to use in the reflection\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) the inverse retraction to use within the reflection\n\nand for the reflect! additionally\n\nX (zero_vector(M,p)) a temporary memory to compute the inverse retraction in place. 
Otherwise this is the memory that would be allocated anyway.\n\nPassing X to reflect will just have no effect.\n\n\n\n\n\n","category":"method"},{"location":"functions/manifold/#Manopt.reflect-Tuple{AbstractManifold, Function, Any}","page":"Specific Manifold Functions","title":"Manopt.reflect","text":"reflect(M, f, x; kwargs...)\nreflect!(M, q, f, x; kwargs...)\n\nReflect the point x from the manifold M at the point f(x) of the function f mathcal M  mathcal M, i.e.,\n\n operatornamerefl_f(x) = operatornamerefl_f(x)(x)\n\nCompute the result in q.\n\nSee also reflect(M, p, x), to which the keywords are also passed.\n\n\n\n\n\n","category":"method"},{"location":"solvers/particle_swarm/#ParticleSwarmSolver","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"","category":"section"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":" particle_swarm\n particle_swarm!","category":"page"},{"location":"solvers/particle_swarm/#Manopt.particle_swarm","page":"Particle Swarm Optimization","title":"Manopt.particle_swarm","text":"particle_swarm(M, f; kwargs...)\nparticle_swarm(M, f, swarm; kwargs...)\nparticle_swarm(M, mco::AbstractManifoldCostObjective; kwargs...)\nparticle_swarm(M, mco::AbstractManifoldCostObjective, swarm; kwargs...)\n\nperform the particle swarm optimization algorithm (PSO), starting with an initial swarm Borkmanns, Ishteva, Absil, 7th IC Swarm Intelligence, 2010. If no swarm is provided, swarm_size many random points are used. 
Note that since this method does not work in-place – these points are duplicated internally.\n\nThe aim of PSO is to find the particle position g on the Manifold M that solves\n\nmin_x mathcalM F(x)\n\nTo this end, a swarm of particles is moved around the Manifold M in the following manner. For every particle k we compute the new particle velocities v_k^(i) in every step i of the algorithm by\n\nv_k^(i) = ω operatornameT_x_k^(i)gets x_k^(i-1)v_k^(i-1) + c r_1 operatornameretr_x_k^(i)^-1(p_k^(i)) + s r_2 operatornameretr_x_k^(i)^-1(g)\n\nwhere x_k^(i) is the current particle position, ω denotes the inertia, c and s are a cognitive and a social weight, respectively, r_j, j=12 are random factors which are computed new for each particle and step, operatornameretr^-1 denotes an inverse retraction on the Manifold M, and operatornameT is a vector transport.\n\nThen the position of the particle is updated as\n\nx_k^(i+1) = operatornameretr_x_k^(i)(v_k^(i))\n\nwhere operatornameretr denotes a retraction on the Manifold M. At the end of each step for every particle, we set\n\np_k^(i+1) = begincases\nx_k^(i+1) textif F(x_k^(i+1))F(p_k^(i))\np_k^(i) textelse\nendcases\n\n\nand\n\ng_k^(i+1) =begincases\np_k^(i+1) textif F(p_k^(i+1))F(g_k^(i))\ng_k^(i) textelse\nendcases\n\ni.e. 
p_k^(i) is the best known position for the particle k and g^(i) is the global best known position ever visited up to step i.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\nswarm – ([rand(M) for _ in 1:swarm_size]) – an initial swarm of points.\n\nInstead of a cost function f you can also provide an AbstractManifoldCostObjective mco.\n\nOptional\n\ncognitive_weight – (1.4) a cognitive weight factor\ninertia – (0.65) the inertia of the particles\ninverse_retraction_method - (default_inverse_retraction_method(M, eltype(x))) an inverse_retraction(M,x,y) to use.\nswarm_size - (100) number of random initial positions of x0\nretraction_method – (default_retraction_method(M, eltype(x))) a retraction(M,x,ξ) to use.\nsocial_weight – (1.4) a social weight factor\nstopping_criterion – (StopWhenAny(StopAfterIteration(500), StopWhenChangeLess(10^{-4}))) a functor inheriting from StoppingCriterion indicating when to stop.\nvector_transport_method - (default_vector_transport_method(M, eltype(x))) a vector transport method to use.\nvelocity – a set of tangent vectors (of type AbstractVector{T}) representing the velocities of the particles, per default a random tangent vector per initial position\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified.\n\nOutput\n\nthe obtained (approximate) minimizer g, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/particle_swarm/#Manopt.particle_swarm!","page":"Particle Swarm Optimization","title":"Manopt.particle_swarm!","text":"particle_swarm!(M, f, swarm; kwargs...)\nparticle_swarm!(M, mco::AbstractManifoldCostObjective, swarm; kwargs...)\n\nperform the particle swarm optimization algorithm (PSO), starting with the initial swarm which is then modified in place.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\nswarm – ([rand(M) for _ in 1:swarm_size]) – an initial swarm of points.\n\nInstead of a cost function f you can also provide an AbstractManifoldCostObjective mco.\n\nFor more details and optional arguments, see particle_swarm.\n\n\n\n\n\n","category":"function"},{"location":"solvers/particle_swarm/#State","page":"Particle Swarm Optimization","title":"State","text":"","category":"section"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"ParticleSwarmState","category":"page"},{"location":"solvers/particle_swarm/#Manopt.ParticleSwarmState","page":"Particle Swarm Optimization","title":"Manopt.ParticleSwarmState","text":"ParticleSwarmState{P,T} <: AbstractManoptSolverState\n\nDescribes a particle swarm optimizing algorithm, with\n\nFields\n\nx – a set of points (of type AbstractVector{P}) on a manifold as initial particle positions\nvelocity – a set of tangent vectors (of type AbstractVector{T}) representing the velocities of the particles\ninertia – (0.65) the inertia of the particles\nsocial_weight – (1.4) a social weight factor\ncognitive_weight – (1.4) a cognitive weight factor\np_temp – temporary storage for a point to avoid allocations during a step of the algorithm\nsocial_vec - temporary storage for a tangent vector related 
to social_weight\ncognitive_vector - temporary storage for a tangent vector related to cognitive_weight\nstopping_criterion – (StopAfterIteration(500) | StopWhenChangeLess(1e-4)) a functor inheriting from StoppingCriterion indicating when to stop.\nretraction_method – (default_retraction_method(M, eltype(x))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, eltype(x))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, eltype(x))) a vector transport to use\n\nConstructor\n\nParticleSwarmState(M, x0, velocity; kwargs...)\n\nConstruct a particle swarm state for the manifold M starting at initial population x0 with velocities velocity, where the manifold is used within the defaults of the other fields mentioned above, which are keyword arguments here.\n\nSee also\n\nparticle_swarm\n\n\n\n\n\n","category":"type"},{"location":"solvers/particle_swarm/#Literature","page":"Particle Swarm Optimization","title":"Literature","text":"","category":"section"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"Pages = [\"solvers/particle_swarm.md\"]\nCanonical=false","category":"page"},{"location":"solvers/stochastic_gradient_descent/#StochasticGradientDescentSolver","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"","category":"section"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"stochastic_gradient_descent\nstochastic_gradient_descent!","category":"page"},{"location":"solvers/stochastic_gradient_descent/#Manopt.stochastic_gradient_descent","page":"Stochastic Gradient 
Descent","title":"Manopt.stochastic_gradient_descent","text":"stochastic_gradient_descent(M, grad_f, p; kwargs...)\nstochastic_gradient_descent(M, msgo, p; kwargs...)\n\nperform a stochastic gradient descent.\n\nInput\n\nM a manifold mathcal M\ngrad_f – a gradient function, that either returns a vector of the gradients or is a vector of gradients\np – an initial value p mathcal M\n\nAlternatively to the gradient you can provide a ManifoldStochasticGradientObjective msgo, then using the cost= keyword does not have any effect since if so, the cost is already within the objective.\n\nOptional\n\ncost – (missing) you can provide a cost function for example to track the function value\nevaluation – (AllocatingEvaluation) specify whether the gradient(s) works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x) (elementwise).\nevaluation_order – (:Random) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Linear) or the default :Random one.\nstopping_criterion (StopAfterIteration(1000)) – a StoppingCriterion\nstepsize (ConstantStepsize(1.0)) a Stepsize\norder_type (:RandomOrder) a type of ordering of gradient evaluations. 
values are :RandomOrder, a :FixedPermutation, :LinearOrder\norder - ([1:n]) the initial permutation, where n is the number of gradients in gradF.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/stochastic_gradient_descent/#Manopt.stochastic_gradient_descent!","page":"Stochastic Gradient Descent","title":"Manopt.stochastic_gradient_descent!","text":"stochastic_gradient_descent!(M, grad_f, p)\nstochastic_gradient_descent!(M, msgo, p)\n\nperform a stochastic gradient descent in place of p.\n\nInput\n\nM a manifold mathcal M\ngrad_f – a gradient function, that either returns a vector of the subgradients or is a vector of gradients\np – an initial value p mathcal M\n\nAlternatively to the gradient you can provide an ManifoldStochasticGradientObjective msgo, then using the cost= keyword does not have any effect since if so, the cost is already within the objective.\n\nfor all optional parameters, see stochastic_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/stochastic_gradient_descent/#State","page":"Stochastic Gradient Descent","title":"State","text":"","category":"section"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"StochasticGradientDescentState","category":"page"},{"location":"solvers/stochastic_gradient_descent/#Manopt.StochasticGradientDescentState","page":"Stochastic Gradient Descent","title":"Manopt.StochasticGradientDescentState","text":"StochasticGradientDescentState <: AbstractGradientDescentSolverState\n\nStore the following fields for a default stochastic gradient descent algorithm, see also ManifoldStochasticGradientObjective and stochastic_gradient_descent.\n\nFields\n\np the current iterate\ndirection (StochasticGradient) a direction update to 
use\nstopping_criterion (StopAfterIteration(1000)) – a StoppingCriterion\nstepsize (ConstantStepsize(1.0)) a Stepsize\nevaluation_order – (:Random) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Linear) or the default :Random one.\norder the current permutation\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.\n\nConstructor\n\nStochasticGradientDescentState(M, p)\n\nCreate a StochasticGradientDescentState with start point p. All other fields are optional keyword arguments, and the defaults are taken from M.\n\n\n\n\n\n","category":"type"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"Additionally, the options share a DirectionUpdateRule, so you can also apply MomentumGradient and AverageGradient here. The innermost one should always be the StochasticGradient.","category":"page"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"AbstractGradientGroupProcessor\nStochasticGradient","category":"page"},{"location":"solvers/stochastic_gradient_descent/#Manopt.AbstractGradientGroupProcessor","page":"Stochastic Gradient Descent","title":"Manopt.AbstractGradientGroupProcessor","text":"AbstractStochasticGradientDescentSolverState <: AbstractManoptSolverState\n\nA generic type for all options related to stochastic gradient descent methods\n\n\n\n\n\n","category":"type"},{"location":"solvers/stochastic_gradient_descent/#Manopt.StochasticGradient","page":"Stochastic Gradient Descent","title":"Manopt.StochasticGradient","text":"StochasticGradient <: AbstractGradientGroupProcessor\n\nThe default gradient processor, which just evaluates the (stochastic) gradient or a subset thereof.\n\nConstructor\n\nStochasticGradient(M::AbstractManifold; p=rand(M), X=zero_vector(M, p))\n\nInitialize the stochastic Gradient processor with X, i.e. 
both M and p are just help variables, though M is mandatory by convention.\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#CPPSolver","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"The Cyclic Proximal Point (CPP) algorithm aims to minimize","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"F(x) = sum_i=1^c f_i(x)","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"assuming that the proximal maps operatornameprox_λ f_i(x) are given in closed form or can be computed efficiently (at least approximately).","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"The algorithm then cycles through these proximal maps, where the type of cycle might differ and the proximal parameter λ_k changes after each cycle k.","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"For a convergence result on Hadamard manifolds see Bačák [Bac14].","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"cyclic_proximal_point\ncyclic_proximal_point!","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.cyclic_proximal_point","page":"Cyclic Proximal Point","title":"Manopt.cyclic_proximal_point","text":"cyclic_proximal_point(M, f, proxes_f, p)\ncyclic_proximal_point(M, mpo, p)\n\nperform a cyclic proximal point algorithm.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\nproxes_f – an Array of proximal maps (Functions) (M,λ,p) -> q or (M, q, λ, p) -> q 
for the summands of f (see evaluation)\np – an initial value p mathcal M\n\nwhere f and the proximal maps proxes_f can also be given directly as a ManifoldProximalMapObjective mpo\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the proximal maps work by allocation (default) form prox(M, λ, x) or InplaceEvaluation in place, i.e. is of the form prox!(M, y, λ, x).\nevaluation_order – (:Linear) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default linear one.\nλ – ( iter -> 1/iter ) a function returning the (square summable but not summable) sequence of λi\nstopping_criterion – (StopWhenAny(StopAfterIteration(5000),StopWhenChangeLess(10.0^-8))) a StoppingCriterion.\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. If you provide the ManifoldProximalMapObjective directly, these decorations can still be specified.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/cyclic_proximal_point/#Manopt.cyclic_proximal_point!","page":"Cyclic Proximal Point","title":"Manopt.cyclic_proximal_point!","text":"cyclic_proximal_point!(M, F, proxes, p)\ncyclic_proximal_point!(M, mpo, p)\n\nperform a cyclic proximal point algorithm in place of p.\n\nInput\n\nM – a manifold mathcal M\nF – a cost function Fmathcal Mℝ to minimize\nproxes – an Array of proximal maps (Functions) (M, λ, p) -> q or (M, q, λ, p) for the summands of F\np – an initial value p mathcal M\n\nwhere f and the proximal maps proxes_f can also be given directly as a ManifoldProximalMapObjective mpo\n\nfor all options, see cyclic_proximal_point.\n\n\n\n\n\n","category":"function"},{"location":"solvers/cyclic_proximal_point/#State","page":"Cyclic Proximal Point","title":"State","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal 
Point","title":"Cyclic Proximal Point","text":"CyclicProximalPointState","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.CyclicProximalPointState","page":"Cyclic Proximal Point","title":"Manopt.CyclicProximalPointState","text":"CyclicProximalPointState <: AbstractManoptSolverState\n\nstores options for the cyclic_proximal_point algorithm. These are the\n\nFields\n\np – the current iterate\nstopping_criterion – a StoppingCriterion\nλ – ( i -> 1/i ) a function for the values of λ_k per iteration (cycle k)\norder_type – (:LinearOrder) – whether to use a randomly permuted sequence (:FixedRandomOrder), a per cycle permuted sequence (:RandomOrder) or the default linear one.\n\nConstructor\n\nCyclicProximalPointState(M, p)\n\nGenerate the options with the following keyword arguments\n\nstopping_criterion (StopAfterIteration(2000)) – a StoppingCriterion.\nλ ( i -> 1.0 / i) – a function to compute the λ_k k mathbb N,\nevaluation_order – (:LinearOrder) – a Symbol indicating the order the proxes are applied.\n\nSee also\n\ncyclic_proximal_point\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#Debug-Functions","page":"Cyclic Proximal Point","title":"Debug Functions","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"DebugProximalParameter","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.DebugProximalParameter","page":"Cyclic Proximal Point","title":"Manopt.DebugProximalParameter","text":"DebugProximalParameter <: DebugAction\n\nprint the current iteration's proximal point algorithm parameter given by the AbstractManoptSolverState's o.λ.\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#Record-Functions","page":"Cyclic Proximal Point","title":"Record Functions","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal 
Point","text":"RecordProximalParameter","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.RecordProximalParameter","page":"Cyclic Proximal Point","title":"Manopt.RecordProximalParameter","text":"RecordProximalParameter <: RecordAction\n\nrecord the current iteration's proximal point algorithm parameter given by the AbstractManoptSolverState's o.λ.\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#Literature","page":"Cyclic Proximal Point","title":"Literature","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"Pages = [\"solvers/cyclic_proximal_point.md\"]\nCanonical=false","category":"page"},{"location":"functions/costs/#CostFunctions","page":"Cost functions","title":"Cost Functions","text":"","category":"section"},{"location":"functions/costs/","page":"Cost functions","title":"Cost functions","text":"The following cost functions are available","category":"page"},{"location":"functions/costs/","page":"Cost functions","title":"Cost functions","text":"Modules = [Manopt]\nPages = [\"costs.jl\"]","category":"page"},{"location":"functions/costs/#Manopt.costIntrICTV12-Tuple{AbstractManifold, Vararg{Any, 5}}","page":"Cost functions","title":"Manopt.costIntrICTV12","text":"costIntrICTV12(M, f, u, v, α, β)\n\nCompute the intrinsic infimal convolution model, where the addition is replaced by a mid point approach and the two functions involved are costTV2 and costTV. 
The model reads\n\nE(uv) =\n frac12sum_i mathcal G\n d_mathcal Mbigl(g(frac12v_iw_i)f_ibigr)\n +alphabigl( βmathrmTV(v) + (1-β)mathrmTV_2(w) bigr)\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costL2TV-NTuple{4, Any}","page":"Cost functions","title":"Manopt.costL2TV","text":"costL2TV(M, f, α, x)\n\ncompute the ℓ^2-TV functional on the PowerManifold manifold M for given (fixed) data f (on M), a nonnegative weight α, and evaluated at x (on M), i.e.\n\nE(x) = d_mathcal M^2(fx) + alpha operatornameTV(x)\n\nSee also\n\ncostTV\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costL2TV2-Tuple{PowerManifold, Any, Any, Any}","page":"Cost functions","title":"Manopt.costL2TV2","text":"costL2TV2(M, f, β, x)\n\ncompute the ℓ^2-TV2 functional on the PowerManifold manifold M for given data f, nonnegative parameter β, and evaluated at x, i.e.\n\nE(x) = d_mathcal M^2(fx) + βoperatornameTV_2(x)\n\nSee also\n\ncostTV2\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costL2TVTV2-Tuple{PowerManifold, Vararg{Any, 4}}","page":"Cost functions","title":"Manopt.costL2TVTV2","text":"costL2TVTV2(M, f, α, β, x)\n\ncompute the ℓ^2-TV-TV2 functional on the PowerManifold manifold M for given (fixed) data f (on M), nonnegative weights α, β, and evaluated at x (on M), i.e.\n\nE(x) = d_mathcal M^2(fx) + alphaoperatornameTV(x)\n + βoperatornameTV_2(x)\n\nSee also\n\ncostTV, costTV2\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costTV","page":"Cost functions","title":"Manopt.costTV","text":"costTV(M,x [,p=2,q=1])\n\nCompute the operatornameTV^p functional for data x on the PowerManifold manifold M, i.e. mathcal M = mathcal N^n, where n mathbb N^k denotes the dimensions of the data x. Let mathcal I_i denote the forward neighbors, i.e. with mathcal G as all indices from mathbf1 mathbb N^k to n we have mathcal I_i = i+e_j j=1kcap mathcal G. 
The formula reads\n\nE^q(x) = sum_i mathcal G\n bigl( sum_j mathcal I_i d^p_mathcal M(x_ix_j) bigr)^qp\n\nSee also\n\ngrad_TV, prox_TV\n\n\n\n\n\n","category":"function"},{"location":"functions/costs/#Manopt.costTV-Union{Tuple{T}, Tuple{AbstractManifold, Tuple{T, T}}, Tuple{AbstractManifold, Tuple{T, T}, Int64}} where T","page":"Cost functions","title":"Manopt.costTV","text":"costTV(M, x, p)\n\nCompute the operatornameTV^p functional for a tuple pT of points on a manifold M, i.e.\n\nE(x_1x_2) = d_mathcal M^p(x_1x_2) quad x_1x_2 mathcal M\n\nSee also\n\ngrad_TV, prox_TV\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costTV2","page":"Cost functions","title":"Manopt.costTV2","text":"costTV2(M,x [,p=1])\n\ncompute the operatornameTV_2^p functional for data x on the PowerManifold manifold M, i.e. mathcal M = mathcal N^n, where n mathbb N^k denotes the dimensions of the data x. Let mathcal I_i^pm denote the forward and backward neighbors, respectively, i.e. with mathcal G as all indices from mathbf1 mathbb N^k to n we have mathcal I^pm_i = ipm e_j j=1kcap mathcal I. The formula then reads\n\nE(x) = sum_i mathcal I j_1 mathcal I^+_i j_2 mathcal I^-_i\nd^p_mathcal M(c_i(x_j_1x_j_2) x_i)\n\nwhere c_i() denotes the mid point between its two arguments that is nearest to x_i.\n\nSee also\n\ngrad_TV2, prox_TV2\n\n\n\n\n\n","category":"function"},{"location":"functions/costs/#Manopt.costTV2-Union{Tuple{T}, Tuple{MT}, Tuple{MT, Tuple{T, T, T}}, Tuple{MT, Tuple{T, T, T}, Any}} where {MT<:AbstractManifold, T}","page":"Cost functions","title":"Manopt.costTV2","text":"costTV2(M,(x1,x2,x3) [,p=1])\n\nCompute the operatornameTV_2^p functional for the 3-tuple of points (x1,x2,x3) on the manifold M. Denote by\n\n mathcal C = bigl c mathcal M g(tfrac12x_1x_3) text for some geodesic gbigr\n\nthe set of mid points between x_1 and x_3. 
Then the function reads\n\nd_2^p(x_1x_2x_3) = min_c mathcal C d_mathcal M(cx_2)\n\nSee also\n\ngrad_TV2, prox_TV2\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.cost_L2_acceleration_bezier-Union{Tuple{P}, Tuple{AbstractManifold, AbstractVector{P}, AbstractVector{<:Integer}, AbstractVector{<:AbstractFloat}, AbstractFloat, AbstractVector{P}}} where P","page":"Cost functions","title":"Manopt.cost_L2_acceleration_bezier","text":"cost_L2_acceleration_bezier(M,B,pts,λ,d)\n\ncompute the value of the discrete Acceleration of the composite Bezier curve together with a data term, i.e.\n\nfracλ2sum_i=0^N d_mathcal M(d_i c_B(i))^2+\nsum_i=1^N-1fracd^2_2 B(t_i-1) B(t_i) B(t_i+1)Delta_t^3\n\nwhere for this formula the pts along the curve are equispaced and denoted by t_i and d_2 refers to the second order absolute difference costTV2 (squared), the junction points are denoted by p_i, and to each p_i corresponds one data item in the manifold points given in d. For details on the acceleration approximation, see cost_acceleration_bezier. 
Note that the Bézier-curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments, and assuming a differentiable curve, the segments can internally be reconstructed.\n\nSee also\n\ngrad_L2_acceleration_bezier, cost_acceleration_bezier, grad_acceleration_bezier\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.cost_acceleration_bezier-Union{Tuple{P}, Tuple{AbstractManifold, AbstractVector{P}, AbstractVector{<:Integer}, AbstractVector{<:AbstractFloat}}} where P","page":"Cost functions","title":"Manopt.cost_acceleration_bezier","text":"cost_acceleration_bezier(\n M::AbstractManifold,\n B::AbstractVector{P},\n degrees::AbstractVector{<:Integer},\n T::AbstractVector{<:AbstractFloat},\n) where {P}\n\ncompute the value of the discrete Acceleration of the composite Bezier curve\n\nsum_i=1^N-1fracd^2_2 B(t_i-1) B(t_i) B(t_i+1)Delta_t^3\n\nwhere for this formula the pts along the curve are equispaced and denoted by t_i, i=1N, and d_2 refers to the second order absolute difference costTV2 (squared). Note that the Bézier-curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments, and assuming a differentiable curve, the segments can internally be reconstructed.\n\nThis acceleration discretization was introduced in Bergmann, Gousenbourger, Front. Appl. Math. 
Stat., 2018.\n\nSee also\n\ngrad_acceleration_bezier, cost_L2_acceleration_bezier, grad_L2_acceleration_bezier\n\n\n\n\n\n","category":"method"},{"location":"plans/objective/#ObjectiveSection","page":"Objective","title":"A Manifold Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"The Objective describes that actual cost function and all its properties.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractManifoldObjective\nAbstractDecoratedManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractManifoldObjective","page":"Objective","title":"Manopt.AbstractManifoldObjective","text":"AbstractManifoldObjective{E<:AbstractEvaluationType}\n\nDescribe the collection of the optimization function `f\\colon \\mathcal M → \\bbR (or even a vectorial range) and its corresponding elements, which might for example be a gradient or (one or more) proximal maps.\n\nAll these elements should usually be implemented as functions (M, p) -> ..., or (M, X, p) -> ... that is\n\nthe first argument of these functions should be the manifold M they are defined on\nthe argument X is present, if the computation is performed inplace of X (see InplaceEvaluation)\nthe argument p is the place the function (f or one of its elements) is evaluated at.\n\nthe type T indicates the global AbstractEvaluationType.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.AbstractDecoratedManifoldObjective","page":"Objective","title":"Manopt.AbstractDecoratedManifoldObjective","text":"AbstractDecoratedManifoldObjective{E<:AbstractEvaluationType,O<:AbstractManifoldObjective}\n\nA common supertype for all decorators of AbstractManifoldObjectives to simplify dispatch. 
The second parameter should refer to the undecorated objective (i.e. the most inner one).\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Which has two main different possibilities for its containing functions concerning the evaluation mode – not necessarily the cost, but for example gradient in an AbstractManifoldGradientObjective.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractEvaluationType\nAllocatingEvaluation\nInplaceEvaluation\nevaluation_type","category":"page"},{"location":"plans/objective/#Manopt.AbstractEvaluationType","page":"Objective","title":"Manopt.AbstractEvaluationType","text":"AbstractEvaluationType\n\nAn abstract type to specify the kind of evaluation a AbstractManifoldObjective supports.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.AllocatingEvaluation","page":"Objective","title":"Manopt.AllocatingEvaluation","text":"AllocatingEvaluation <: AbstractEvaluationType\n\nA parameter for a AbstractManoptProblem indicating that the problem uses functions that allocate memory for their result, i.e. they work out of place.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.InplaceEvaluation","page":"Objective","title":"Manopt.InplaceEvaluation","text":"InplaceEvaluation <: AbstractEvaluationType\n\nA parameter for a AbstractManoptProblem indicating that the problem uses functions that do not allocate memory but work on their input, i.e. 
in place.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.evaluation_type","page":"Objective","title":"Manopt.evaluation_type","text":"evaluation_type(mp::AbstractManoptProblem)\n\nGet the AbstractEvaluationType of the objective in AbstractManoptProblem mp.\n\n\n\n\n\nevaluation_type(::AbstractManifoldObjective{Teval})\n\nGet the AbstractEvaluationType of the objective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Decorators-for-Objectives","page":"Objective","title":"Decorators for Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"An objective can be decorated using the following trait and function to initialize","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"dispatch_objective_decorator\nis_objective_decorator\ndecorate_objective!","category":"page"},{"location":"plans/objective/#Manopt.dispatch_objective_decorator","page":"Objective","title":"Manopt.dispatch_objective_decorator","text":"dispatch_objective_decorator(o::AbstractManoptSolverState)\n\nIndicate internally, whether an AbstractManifoldObjective o to be of decorating type, i.e. it stores (encapsulates) an object in itself, by default in the field o.objective.\n\nDecorators indicate this by returning Val{true} for further dispatch.\n\nThe default is Val{false}, i.e. 
by default a state is not decorated.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.is_objective_decorator","page":"Objective","title":"Manopt.is_objective_decorator","text":"is_objective_decorator(s::AbstractManifoldObjective)\n\nIndicate whether AbstractManifoldObjective s are of decorator type.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.decorate_objective!","page":"Objective","title":"Manopt.decorate_objective!","text":"decorate_objective!(M, o::AbstractManifoldObjective)\n\ndecorate the AbstractManifoldObjective o with specific decorators.\n\nOptional Arguments\n\noptional arguments provide necessary details on the decorators. A specific one is used to activate certain decorators.\n\ncache – (missing) specify a cache. Currently :Simple is supported and :LRU if you load LRUCache.jl. For this case a tuple specifying what to cache and how many can be provided, i.e. (:LRU, [:Cost, :Gradient], 10), where the number specifies the size of each cache; 10 is the default if one omits the last tuple entry\ncount – (missing) specify calls to the objective to be counted, see ManifoldCountObjective for the full list\nobjective_type – (:Riemannian) specify that an objective is :Riemannian or :Euclidean. 
The :Euclidean symbol is equivalent to specifying it as :Embedded, since in the end, both refer to converting an objective from the embedding (whether it is Euclidean or not) to the Riemannian one.\n\nSee also\n\nobjective_cache_factory\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#ManifoldEmbeddedObjective","page":"Objective","title":"Embedded Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"EmbeddedManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.EmbeddedManifoldObjective","page":"Objective","title":"Manopt.EmbeddedManifoldObjective","text":"EmbeddedManifoldObjective{P, T, E, O2, O1<:AbstractManifoldObjective{E}} <:\n AbstractDecoratedManifoldObjective{O2, O1}\n\nDeclare an objective to be defined in the embedding. This also declares the gradient to be defined in the embedding, in particular to be the Riesz representer with respect to the metric in the embedding. The types can be used to still dispatch on the undecorated objective type O2.\n\nFields\n\nobjective – the objective that is defined in the embedding\np - (nothing) a point in the embedding.\nX - (nothing) a tangent vector in the embedding\n\nWhen a point in the embedding p is provided, embed! is used in place of this point to reduce memory allocations. Similarly, X is used when embedding tangent vectors.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#CacheSection","page":"Objective","title":"Cache Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Since single function calls, e.g. 
to the cost or the gradient, might be expensive, a simple cache objective exists as a decorator, that caches one cost value or gradient.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"It can be activated/used with the cache= keyword argument available for every solver.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Manopt.reset_counters!\nManopt.objective_cache_factory","category":"page"},{"location":"plans/objective/#Manopt.reset_counters!","page":"Objective","title":"Manopt.reset_counters!","text":"reset_counters(co::ManifoldCountObjective, value::Integer=0)\n\nReset all values in the count objective to value.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.objective_cache_factory","page":"Objective","title":"Manopt.objective_cache_factory","text":"objective_cache_factory(M::AbstractManifold, o::AbstractManifoldObjective, cache::Symbol)\n\nGenerate a cached variant of the AbstractManifoldObjective o on the AbstractManifold M based on the symbol cache.\n\nThe following caches are available\n\n:Simple generates a SimpleManifoldCachedObjective\n:LRU generates a ManifoldCachedObjective where you should use the form (:LRU, [:Cost, :Gradient]) to specify what should be cached or (:LRU, [:Cost, :Gradient], 100) to specify the cache size. Here this variant defaults to (:LRU, [:Cost, :Gradient], 100), i.e. 
to cache up to 100 cost and gradient values.[1]\n\n[1]: This cache requires LRUCache.jl to be loaded as well.\n\n\n\n\n\nobjective_cache_factory(M::AbstractManifold, o::AbstractManifoldObjective, cache::Tuple{Symbol, Array, Array})\nobjective_cache_factory(M::AbstractManifold, o::AbstractManifoldObjective, cache::Tuple{Symbol, Array})\n\nGenerate a cached variant of the AbstractManifoldObjective o on the AbstractManifold M based on the symbol cache[1], where the second element cache[2] are further arguments to the cache and the optional third is passed down as keyword arguments.\n\nFor all available caches see the simpler variant with symbols.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#A-simple-cache","page":"Objective","title":"A simple cache","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"A first generic cache is always available, but it only caches one gradient and one cost function evaluation (for the same point).","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"SimpleManifoldCachedObjective","category":"page"},{"location":"plans/objective/#Manopt.SimpleManifoldCachedObjective","page":"Objective","title":"Manopt.SimpleManifoldCachedObjective","text":" SimpleManifoldCachedObjective{O<:AbstractManifoldGradientObjective{E,TC,TG}, P, T,C} <: AbstractManifoldGradientObjective{E,TC,TG}\n\nProvide a simple cache for an AbstractManifoldGradientObjective that is for a given point p this cache stores a point p and a gradient operatornamegrad f(p) in X as well as a cost value f(p) in c.\n\nBoth X and c are accompanied by booleans to keep track of their validity.\n\nConstructor\n\nSimpleManifoldCachedObjective(M::AbstractManifold, obj::AbstractManifoldGradientObjective; kwargs...)\n\nKeyword\n\np (rand(M)) – a point on the manifold to initialize the cache with\nX (get_gradient(M, obj, p) or zero_vector(M,p)) – a tangent vector to store the 
gradient in, see also initialize\nc (get_cost(M, obj, p) or 0.0) – a value to store the cost function in initialize\ninitialized (true) – whether to initialize the cached X and c or not.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#A-Generic-Cache","page":"Objective","title":"A Generic Cache","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"For the more advanced cache, you need to implement some type of cache yourself, that provides a get! and implement init_caches. This is for example provided if you load LRUCache.jl. Then you obtain","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldCachedObjective\ninit_caches","category":"page"},{"location":"plans/objective/#Manopt.ManifoldCachedObjective","page":"Objective","title":"Manopt.ManifoldCachedObjective","text":"ManifoldCachedObjective{E,P,O<:AbstractManifoldObjective{<:E},C<:NamedTuple{}} <: AbstractDecoratedManifoldObjective{E,P}\n\nCreate a cache for an objective, based on a NamedTuple that stores some kind of cache.\n\nConstructor\n\nManifoldCachedObjective(M, o::AbstractManifoldObjective, caches::Vector{Symbol}; kwargs...)\n\nCreate a cache for the AbstractManifoldObjective where the Symbols in caches indicate, which function evaluations to cache.\n\nSupported Symbols\n\nSymbol Caches calls to (incl. ! 
variants) Comment\n:Constraints get_constraints vector of numbers\n:Cost get_cost \n:EqualityConstraint get_equality_constraint numbers per (p,i)\n:EqualityConstraints get_equality_constraints vector of numbers\n:GradEqualityConstraint get_grad_equality_constraint tangent vector per (p,i)\n:GradEqualityConstraints get_grad_equality_constraints vector of tangent vectors\n:GradInequalityConstraint get_inequality_constraint tangent vector per (p,i)\n:GradInequalityConstraints get_inequality_constraints vector of tangent vectors\n:Gradient get_gradient(M,p) tangent vectors\n:Hessian get_hessian tangent vectors\n:InequalityConstraint get_inequality_constraint numbers per (p,j)\n:InequalityConstraints get_inequality_constraints vector of numbers\n:Preconditioner get_preconditioner tangent vectors\n:ProximalMap get_proximal_map point per (p,λ,i)\n:StochasticGradients get_gradients vector of tangent vectors\n:StochasticGradient get_gradient(M, p, i) tangent vector per (p,i)\n:SubGradient get_subgradient tangent vectors\n:SubtrahendGradient get_subtrahend_gradient tangent vectors\n\nKeyword Arguments\n\np - (rand(M)) the type of the keys to be used in the caches. Defaults to the default representation on M.\nvalue - (get_cost(M, objective, p)) the type of values for numeric values in the cache, e.g. 
the cost\nX - (zero_vector(M,p)) the type of values to be cached for gradient and Hessian calls.\ncache - ([:Cost]) a vector of symbols indicating which function calls should be cached.\ncache_size - (10) number of (least recently used) calls to cache\ncache_sizes – (Dict{Symbol,Int}()) a named tuple or dictionary specifying the sizes individually for each cache.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.init_caches","page":"Objective","title":"Manopt.init_caches","text":"init_caches(caches, T::Type{LRU}; kwargs...)\n\nGiven a vector of symbols caches, this function sets up the NamedTuple of caches, where T is the type of cache to use.\n\nKeyword arguments\n\np - (rand(M)) a point on a manifold, to both infer its type for keys and initialize caches\nvalue - (0.0) a value both typing and initialising number-caches, eg. for caching a cost.\nX - (zero_vector(M, p) a tangent vector at p to both type and initialize tangent vector caches\ncache_size - (10) a default cache size to use\ncache_sizes – (Dict{Symbol,Int}()) a dictionary of sizes for the caches to specify different (non-default) sizes\n\n\n\n\n\ninit_caches(M::AbstractManifold, caches, T; kwargs...)\n\nGiven a vector of symbols caches, this function sets up the NamedTuple of caches for points/vectors on M, where T is the type of cache to use.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#ManifoldCountObjective","page":"Objective","title":"Count Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldCountObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldCountObjective","page":"Objective","title":"Manopt.ManifoldCountObjective","text":"ManifoldCountObjective{E,P,O<:AbstractManifoldObjective,I<:Integer} <: AbstractDecoratedManifoldObjective{E,P}\n\nA wrapper for any AbstractManifoldObjective of type O to count different calls to parts of the 
objective.\n\nFields\n\ncounts a dictionary of symbols mapping to integers keeping the counted values\nobjective the wrapped objective\n\nSupported Symbols\n\nSymbol Counts calls to (incl. ! variants) Comment\n:Constraints get_constraints \n:Cost get_cost \n:EqualityConstraint get_equality_constraint requires vector of counters\n:EqualityConstraints get_equality_constraints does not count single access\n:GradEqualityConstraint get_grad_equality_constraint requires vector of counters\n:GradEqualityConstraints get_grad_equality_constraints does not count single access\n:GradInequalityConstraint get_inequality_constraint requires vector of counters\n:GradInequalityConstraints get_inequality_constraints does not count single access\n:Gradient get_gradient(M,p) \n:Hessian get_hessian \n:InequalityConstraint get_inequality_constraint requires vector of counters\n:InequalityConstraints get_inequality_constraints does not count single access\n:Preconditioner get_preconditioner \n:ProximalMap get_proximal_map \n:StochasticGradients get_gradients \n:StochasticGradient get_gradient(M, p, i) \n:SubGradient get_subgradient \n:SubtrahendGradient get_subtrahend_gradient \n\nConstructors\n\nManifoldCountObjective(objective::AbstractManifoldObjective, counts::Dict{Symbol, <:Integer})\n\nInitialise the ManifoldCountObjective to wrap objective initializing the set of counts\n\nManifoldCountObjective(M::AbtractManifold, objective::AbstractManifoldObjective, count::AbstractVecor{Symbol}, init=0)\n\nCount function calls on objective using the symbols in count initialising all entries to init.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Internal-Decorators","page":"Objective","title":"Internal 
Decorators","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ReturnManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.ReturnManifoldObjective","page":"Objective","title":"Manopt.ReturnManifoldObjective","text":"ReturnManifoldObjective{E,O2,O1<:AbstractManifoldObjective{E}} <:\n AbstractDecoratedManifoldObjective{E,O2}\n\nA wrapper to indicate that get_solver_result should return the inner objective.\n\nThe types are such that one can still dispatch on the undecorated type O2 of the original objective as well.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Specific-Objective-typed-and-their-access-functions","page":"Objective","title":"Specific Objective typed and their access functions","text":"","category":"section"},{"location":"plans/objective/#Cost-Objective","page":"Objective","title":"Cost Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractManifoldCostObjective\nManifoldCostObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractManifoldCostObjective","page":"Objective","title":"Manopt.AbstractManifoldCostObjective","text":"AbstractManifoldCostObjective{T<:AbstractEvaluationType} <: AbstractManifoldObjective{T}\n\nRepresenting objectives on manifolds with a cost function implemented.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldCostObjective","page":"Objective","title":"Manopt.ManifoldCostObjective","text":"ManifoldCostObjective{T, TC} <: AbstractManifoldCostObjective{T, TC}\n\nspecify an AbstractManifoldObjective that does only have information about the cost function fcolon mathbb M ℝ implemented as a function (M, p) -> c to compute the cost value c at p on the manifold M.\n\ncost – a function f mathcal M ℝ to minimize\n\nConstructors\n\nManifoldCostObjective(f)\n\nGenerate a problem. 
While this Problem does not have any allocating functions, the type T can be set for consistency reasons with other problems.\n\nUsed with\n\nNelderMead, particle_swarm\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_cost","category":"page"},{"location":"plans/objective/#Manopt.get_cost","page":"Objective","title":"Manopt.get_cost","text":"get_cost(amp::AbstractManoptProblem, p)\n\nevaluate the cost function f stored within the AbstractManifoldObjective of an AbstractManoptProblem amp at the point p.\n\n\n\n\n\nget_cost(M::AbstractManifold, obj::AbstractManifoldObjective, p)\n\nevaluate the cost function f defined on M stored within the AbstractManifoldObjective at the point p.\n\n\n\n\n\nget_cost(M::AbstractManifold, mco::AbstractManifoldCostObjective, p)\n\nEvaluate the cost function from within the AbstractManifoldCostObjective on M at p.\n\nBy default this implementation assumes that the cost is stored within mco.cost.\n\n\n\n\n\nget_cost(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p, i)\n\nEvaluate the ith summand of the cost.\n\nIf you use a single function for the stochastic cost, then only the index i=1 is available to evaluate the whole cost.\n\n\n\n\n\nget_cost(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\n\nEvaluate the cost function of an objective defined in the embedding, i.e. 
embed p before calling the cost function stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"and internally","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_cost_function","category":"page"},{"location":"plans/objective/#Manopt.get_cost_function","page":"Objective","title":"Manopt.get_cost_function","text":"get_cost_function(amco::AbstractManifoldCostObjective)\n\nreturn the function to evaluate (just) the cost f(p)=c as a function (M,p) -> c.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Gradient-Objectives","page":"Objective","title":"Gradient Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractManifoldGradientObjective\nManifoldGradientObjective\nManifoldAlternatingGradientObjective\nManifoldStochasticGradientObjective\nNonlinearLeastSquaresObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractManifoldGradientObjective","page":"Objective","title":"Manopt.AbstractManifoldGradientObjective","text":"AbstractManifoldGradientObjective{E<:AbstractEvaluationType, TC, TG} <: AbstractManifoldCostObjective{E, TC}\n\nAn abstract type for all functions that provide a (full) gradient, where T is a AbstractEvaluationType for the gradient function.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldGradientObjective","page":"Objective","title":"Manopt.ManifoldGradientObjective","text":"ManifoldGradientObjective{T<:AbstractEvaluationType} <: AbstractManifoldGradientObjective{T}\n\nspecify an objective containing a cost and its gradient\n\nFields\n\ncost – a function fcolonmathcal M ℝ\ngradient!! 
– the gradient operatornamegradfcolonmathcal M mathcal Tmathcal M of the cost function f.\n\nDepending on the AbstractEvaluationType T the gradient can have two forms\n\nas a function (M, p) -> X that allocates memory for X, i.e. an AllocatingEvaluation\nas a function (M, X, p) -> X that works in place of X, i.e. an InplaceEvaluation\n\nConstructors\n\nManifoldGradientObjective(cost, gradient; evaluation=AllocatingEvaluation())\n\nUsed with\n\ngradient_descent, conjugate_gradient_descent, quasi_Newton\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldAlternatingGradientObjective","page":"Objective","title":"Manopt.ManifoldAlternatingGradientObjective","text":"ManifoldAlternatingGradientObjective{E<:AbstractEvaluationType,TCost,TGradient} <: AbstractManifoldGradientObjective{E}\n\nAn alternating gradient objective consists of\n\na cost function F(x)\na gradient operatornamegradF that is either\ngiven as one function operatornamegradF returning a tangent vector X on M or\nan array of gradient functions operatornamegradF_i, i=1,…,n, each returning a component of the gradient\nwhich might be allocating or mutating variants, but not a mix of both.\n\nnote: Note\nThis Objective is usually defined using the ProductManifold from Manifolds.jl, so Manifolds.jl has to be loaded.\n\nConstructors\n\nManifoldAlternatingGradientObjective(F, gradF::Function;\n evaluation=AllocatingEvaluation()\n)\nManifoldAlternatingGradientObjective(F, gradF::AbstractVector{<:Function};\n evaluation=AllocatingEvaluation()\n)\n\nCreate an alternating gradient problem with an optional cost and the gradient either as one function (returning an array) or a vector of functions.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldStochasticGradientObjective","page":"Objective","title":"Manopt.ManifoldStochasticGradientObjective","text":"ManifoldStochasticGradientObjective{T<:AbstractEvaluationType} <: AbstractManifoldGradientObjective{T}\n\nA stochastic 
gradient objective consists of\n\na(n optional) cost function f(p) = displaystylesum_i=1^n f_i(p)\nan array of gradients, operatornamegradf_i(p) i=1ldotsn which can be given in two forms\nas one single function (mathcal M p) (X_1X_n) in (T_pmathcal M)^n\nas a vector of functions bigl( (mathcal M p) X_1 (mathcal M p) X_nbigr).\n\nWhere both variants can also be provided as InplaceEvaluation functions, i.e. (M, X, p) -> X, where X is the vector of X1,...Xn and (M, X1, p) -> X1, ..., (M, Xn, p) -> Xn, respectively.\n\nConstructors\n\nManifoldStochasticGradientObjective(\n grad_f::Function;\n cost=Missing(),\n evaluation=AllocatingEvaluation()\n)\nManifoldStochasticGradientObjective(\n grad_f::AbstractVector{<:Function};\n cost=Missing(), evaluation=AllocatingEvaluation()\n)\n\nCreate a stochastic gradient problem with the gradient either as one function (returning an array of tangent vectors) or a vector of functions (each returning one tangent vector).\n\nThe optional cost can also be given as either a single function (returning a number) or a vector of functions, each returning a value.\n\nUsed with\n\nstochastic_gradient_descent\n\nNote that this can also be used with a gradient_descent, since the (complete) gradient is just the sum of the single gradients.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.NonlinearLeastSquaresObjective","page":"Objective","title":"Manopt.NonlinearLeastSquaresObjective","text":"NonlinearLeastSquaresObjective{T<:AbstractEvaluationType} <: AbstractManifoldObjective{T}\n\nA type for nonlinear least squares problems. T is an AbstractEvaluationType for the F and Jacobian functions.\n\nSpecify a nonlinear least squares problem\n\nFields\n\nf – a function f mathcal M ℝ^d to minimize\njacobian!! 
– Jacobian of the function f\njacobian_tangent_basis – the basis of tangent space used for computing the Jacobian.\nnum_components – number of values returned by f (equal to d).\n\nDepending on the AbstractEvaluationType T the function F has to be provided:\n\nas a function (M::AbstractManifold, p) -> v that allocates memory for v itself for an AllocatingEvaluation,\nas a function (M::AbstractManifold, v, p) -> v that works in place of v for an InplaceEvaluation.\n\nAlso the Jacobian jacF is required:\n\nas a function (M::AbstractManifold, p; basis_domain::AbstractBasis) -> v that allocates memory for v itself for an AllocatingEvaluation,\nas a function (M::AbstractManifold, v, p; basis_domain::AbstractBasis) -> v that works in place of v for an InplaceEvaluation.\n\nConstructors\n\nNonlinearLeastSquaresObjective(M, F, jacF, num_components; evaluation=AllocatingEvaluation(), jacobian_tangent_basis=DefaultOrthonormalBasis())\n\nSee also\n\nLevenbergMarquardt, LevenbergMarquardtState\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"There is also a second variant, if just one function is responsible for computing the cost and the gradient","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldCostGradientObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldCostGradientObjective","page":"Objective","title":"Manopt.ManifoldCostGradientObjective","text":"ManifoldCostGradientObjective{T} <: AbstractManifoldObjective{T}\n\nspecify an objective containing one function to perform a combined computation of cost and its gradient\n\nFields\n\ncostgrad!! – a function that computes both the cost fcolonmathcal M ℝ and its gradient operatornamegradfcolonmathcal M mathcal Tmathcal M\n\nDepending on the AbstractEvaluationType T the gradient can have two forms\n\nas a function (M, p) -> (c, X) that allocates memory for the gradient X, i.e. 
an AllocatingEvaluation\nas a function (M, X, p) -> (c, X) that works in place of X, i.e. an InplaceEvaluation\n\nConstructors\n\nManifoldCostGradientObjective(costgrad; evaluation=AllocatingEvaluation())\n\nUsed with\n\ngradient_descent, conjugate_gradient_descent, quasi_Newton\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-2","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_gradient\nget_gradients","category":"page"},{"location":"plans/objective/#Manopt.get_gradient","page":"Objective","title":"Manopt.get_gradient","text":"X = get_gradient(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)\nget_gradient!(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, X, p)\n\nEvaluate all summand gradients at a point p on the ProductManifold M (in place of X)\n\n\n\n\n\nX = get_gradient(M::AbstractManifold, ago::ManifoldAlternatingGradientObjective, p, k)\nget_gradient!(M::AbstractManifold, ago::ManifoldAlternatingGradientObjective, X, p, k)\n\nEvaluate one of the component gradients operatornamegradf_k, k1n, at p (in place of X).\n\n\n\n\n\nget_gradient(s::AbstractManoptSolverState)\n\nreturn the (last stored) gradient within an AbstractManoptSolverState. By default also undecorates the state beforehand\n\n\n\n\n\nget_gradient(amp::AbstractManoptProblem, p)\nget_gradient!(amp::AbstractManoptProblem, X, p)\n\nevaluate the gradient of an AbstractManoptProblem amp at the point p.\n\nThe evaluation is done in place of X for the !-variant.\n\n\n\n\n\nget_gradient(M::AbstractManifold, mgo::AbstractManifoldGradientObjective{T}, p)\nget_gradient!(M::AbstractManifold, X, mgo::AbstractManifoldGradientObjective{T}, p)\n\nevaluate the gradient of an AbstractManifoldGradientObjective{T} mgo at p.\n\nThe evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. 
When the non-mutating variant is called with a T=InplaceEvaluation memory for the result is allocated.\n\nNote that the order of parameters follows the philosophy of Manifolds.jl, namely that even for the mutating variant, the manifold is the first parameter and the (inplace) tangent vector X comes second.\n\n\n\n\n\nget_gradient(agst::AbstractGradientSolverState)\n\nreturn the gradient stored within gradient options. The default returns agst.X.\n\n\n\n\n\nget_gradient(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p, k)\nget_gradient!(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, Y, p, k)\n\nEvaluate one of the summand gradients operatornamegradf_k, k1n, at p (in place of Y).\n\nIf you use a single function for the stochastic gradient, that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient required for allocation) can not be determined.\n\n\n\n\n\nget_gradient(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p)\nget_gradient!(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, X, p)\n\nEvaluate the complete gradient operatornamegrad f = displaystylesum_i=1^n operatornamegrad f_i(p) at p (in place of X).\n\nIf you use a single function for the stochastic gradient, that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient required for allocation) can not be determined.\n\n\n\n\n\nget_gradient(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\nget_gradient!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)\n\nEvaluate the gradient function of an objective defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradient is then converted to a Riemannian gradient calling 
riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_gradients","page":"Objective","title":"Manopt.get_gradients","text":"get_gradients(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p)\nget_gradients!(M::AbstractManifold, X, sgo::ManifoldStochasticGradientObjective, p)\n\nEvaluate all summand gradients operatornamegradf_i_i=1^n at p (in place of X).\n\nIf you use a single function for the stochastic gradient, that works inplace, then get_gradients is not available, since the length (or number of elements of the gradient) can not be determined.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"and internally","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_gradient_function","category":"page"},{"location":"plans/objective/#Manopt.get_gradient_function","page":"Objective","title":"Manopt.get_gradient_function","text":"get_gradient_function(amgo::AbstractManifoldGradientObjective, recursive=false)\n\nreturn the function to evaluate (just) the gradient operatornamegrad f(p), where either the gradient function using the decorator or without the decorator is used.\n\nBy default recursive is set to false, since usually to just pass the gradient function somewhere, you still want e.g. the cached one or the one that still counts calls.\n\nDepending on the AbstractEvaluationType E this is a function\n\n(M, p) -> X for the AllocatingEvaluation case\n(M, X, p) -> X for the InplaceEvaluation, i.e. 
working inplace of X.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Internal-Helpers","page":"Objective","title":"Internal Helpers","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_gradient_from_Jacobian!","category":"page"},{"location":"plans/objective/#Manopt.get_gradient_from_Jacobian!","page":"Objective","title":"Manopt.get_gradient_from_Jacobian!","text":"get_gradient_from_Jacobian!(\n M::AbstractManifold,\n X,\n nlso::NonlinearLeastSquaresObjective{InplaceEvaluation},\n p,\n Jval=zeros(nlso.num_components, manifold_dimension(M)),\n)\n\nCompute the gradient of the NonlinearLeastSquaresObjective nlso at point p in place of X, with the temporary Jacobian stored in the optional argument Jval.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Subgradient-Objective","page":"Objective","title":"Subgradient Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldSubgradientObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldSubgradientObjective","page":"Objective","title":"Manopt.ManifoldSubgradientObjective","text":"ManifoldSubgradientObjective{T<:AbstractEvaluationType,C,S} <:AbstractManifoldCostObjective{T, C}\n\nA structure to store information about an objective for a subgradient-based optimization problem\n\nFields\n\ncost – the function F to be minimized\nsubgradient – a function returning a subgradient partial F of F\n\nConstructor\n\nManifoldSubgradientObjective(f, ∂f)\n\nGenerate the ManifoldSubgradientObjective for a subgradient objective, i.e. 
a (cost) function f(M, p) and a function ∂f(M, p) that returns a not necessarily deterministic element from the subdifferential at p on a manifold M.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-Functions","page":"Objective","title":"Access Functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_subgradient","category":"page"},{"location":"plans/objective/#Manopt.get_subgradient","page":"Objective","title":"Manopt.get_subgradient","text":"get_subgradient(amp::AbstractManoptProblem, p)\nget_subgradient!(amp::AbstractManoptProblem, X, p)\n\nevaluate the subgradient of an AbstractManoptProblem amp at point p.\n\nThe evaluation is done in place of X for the !-variant. The result might not be deterministic, one element of the subdifferential is returned.\n\n\n\n\n\nX = get_subgradient(M::AbstractManifold, sgo::ManifoldSubgradientObjective, p)\nget_subgradient!(M::AbstractManifold, X, sgo::ManifoldSubgradientObjective, p)\n\nEvaluate the (sub)gradient of a ManifoldSubgradientObjective sgo at the point p.\n\nThe evaluation is done in place of X for the !-variant. 
The result might not be deterministic, one element of the subdifferential is returned.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Proximal-Map-Objective","page":"Objective","title":"Proximal Map Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldProximalMapObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldProximalMapObjective","page":"Objective","title":"Manopt.ManifoldProximalMapObjective","text":"ManifoldProximalMapObjective{E<:AbstractEvaluationType, TC, TP, V <: Vector{<:Integer}} <: AbstractManifoldCostObjective{E, TC}\n\nspecify a problem for solvers based on the evaluation of proximal map(s).\n\nFields\n\ncost - a function Fmathcal Mℝ to minimize\nproxes - proximal maps operatornameprox_λvarphimathcal Mmathcal M as functions (M, λ, p) -> q.\nnumber_of_proxes - (ones(length(proxes))) number of proximal maps per function, e.g. if one of the maps is a combined one such that the proximal map functions return more than one entry per function, you have to adapt this value. 
if not specified, it is set to one prox per function.\n\nSee also\n\ncyclic_proximal_point, get_cost, get_proximal_map\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-Functions-2","page":"Objective","title":"Access Functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_proximal_map","category":"page"},{"location":"plans/objective/#Manopt.get_proximal_map","page":"Objective","title":"Manopt.get_proximal_map","text":"q = get_proximal_map(M::AbstractManifold, mpo::ManifoldProximalMapObjective, λ, p)\nget_proximal_map!(M::AbstractManifold, q, mpo::ManifoldProximalMapObjective, λ, p)\nq = get_proximal_map(M::AbstractManifold, mpo::ManifoldProximalMapObjective, λ, p, i)\nget_proximal_map!(M::AbstractManifold, q, mpo::ManifoldProximalMapObjective, λ, p, i)\n\nevaluate the (ith) proximal map of the ManifoldProximalMapObjective mpo at the point p on M with parameter λ0.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Hessian-Objective","page":"Objective","title":"Hessian Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldHessianObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldHessianObjective","page":"Objective","title":"Manopt.ManifoldHessianObjective","text":"ManifoldHessianObjective{T<:AbstractEvaluationType,C,G,H,Pre} <: AbstractManifoldGradientObjective{T}\n\nspecify a problem for Hessian-based algorithms.\n\nFields\n\ncost : a function Fmathcal Mℝ to minimize\ngradient : the gradient operatornamegradFmathcal M mathcal Tmathcal M of the cost function F\nhessian : the hessian operatornameHessF(x) mathcal T_x mathcal M mathcal T_x mathcal M of the cost function F\npreconditioner : the symmetric, positive definite preconditioner as an approximation of the inverse of the Hessian of f, i.e. 
as a map with the same input variables as the hessian.\n\nDepending on the AbstractEvaluationType T the gradient and Hessian can have two forms\n\nas a function (M, p) -> X and (M, p, X) -> Y, resp. i.e. an AllocatingEvaluation\nas a function (M, X, p) -> X and (M, Y, p, X), resp., i.e. an InplaceEvaluation\n\nConstructor\n\nManifoldHessianObjective(f, grad_f, Hess_f, preconditioner = (M, p, X) -> X;\n evaluation=AllocatingEvaluation())\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-3","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_hessian\nget_preconditioner","category":"page"},{"location":"plans/objective/#Manopt.get_hessian","page":"Objective","title":"Manopt.get_hessian","text":"Y = get_hessian(amp::AbstractManoptProblem{T}, p, X)\nget_hessian!(amp::AbstractManoptProblem{T}, Y, p, X)\n\nevaluate the Hessian of an AbstractManoptProblem amp at p applied to a tangent vector X, i.e. 
compute operatornameHessf(q)X, which can also happen in-place of Y.\n\n\n\n\n\nget_hessian(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, X)\nget_hessian!(M::AbstractManifold, Y, emo::EmbeddedManifoldObjective, p, X)\n\nEvaluate the Hessian of an objective defined in the embedding, that is embed p and X before calling the Hessian function stored in the EmbeddedManifoldObjective.\n\nThe returned Hessian is then converted to a Riemannian Hessian calling riemannian_Hessian.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_preconditioner","page":"Objective","title":"Manopt.get_preconditioner","text":"get_preconditioner(amp::AbstractManoptProblem, p, X)\n\nevaluate the symmetric, positive definite preconditioner (approximation of the inverse of the Hessian of the cost function f) of an AbstractManoptProblem amp's objective at the point p applied to a tangent vector X.\n\n\n\n\n\nget_preconditioner(M::AbstractManifold, mho::ManifoldHessianObjective, p, X)\n\nevaluate the symmetric, positive definite preconditioner (approximation of the inverse of the Hessian of the cost function F) of a ManifoldHessianObjective mho at the point p applied to a tangent vector X.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"and internally","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_hessian_function","category":"page"},{"location":"plans/objective/#Manopt.get_hessian_function","page":"Objective","title":"Manopt.get_hessian_function","text":"get_hessian_function(amgo::AbstractManifoldGradientObjective{E<:AbstractEvaluationType})\n\nreturn the function to evaluate (just) the Hessian operatornameHess f(p). Depending on the AbstractEvaluationType E this is a function\n\n(M, p, X) -> Y for the AllocatingEvaluation case\n(M, Y, p, X) -> Y for the InplaceEvaluation, i.e. 
working inplace of Y.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Primal-Dual-based-Objectives","page":"Objective","title":"Primal-Dual based Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractPrimalDualManifoldObjective\nPrimalDualManifoldObjective\nPrimalDualManifoldSemismoothNewtonObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractPrimalDualManifoldObjective","page":"Objective","title":"Manopt.AbstractPrimalDualManifoldObjective","text":"AbstractPrimalDualManifoldObjective{E<:AbstractEvaluationType,C,P} <: AbstractManifoldCostObjective{E,C}\n\nA common abstract super type for objectives that consider primal-dual problems.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.PrimalDualManifoldObjective","page":"Objective","title":"Manopt.PrimalDualManifoldObjective","text":"PrimalDualManifoldObjective{E<:AbstractEvaluationType} <: AbstractPrimalDualManifoldObjective{E}\n\nDescribes an objective for the linearized or exact Chambolle-Pock algorithm, cf. Bergmann et al., Found. Comput. Math., 2021, Chambolle, Pock, JMIV, 2011\n\nFields\n\nAll fields with !! can either be mutating or nonmutating functions, which should be set depending on the parameter T <: AbstractEvaluationType.\n\ncost F + G(Λ()) to evaluate interim cost function values\nlinearized_forward_operator!! linearized operator for the forward operation in the algorithm DΛ\nlinearized_adjoint_operator!! The adjoint differential (DΛ)^* mathcal N Tmathcal M\nprox_f!! the proximal map belonging to f\nprox_G_dual!! the proximal map belonging to g_n^*\nΛ!! 
– (forward_operator) the forward operator (if given) Λ mathcal M mathcal N\n\nUsually either the linearized operator DΛ or Λ is required.\n\nConstructor\n\nPrimalDualManifoldObjective(cost, prox_f, prox_G_dual, adjoint_linearized_operator;\n linearized_forward_operator::Union{Function,Missing}=missing,\n Λ::Union{Function,Missing}=missing,\n evaluation::AbstractEvaluationType=AllocatingEvaluation()\n)\n\nThe last optional argument can be used to provide the 4 or 5 functions as allocating or mutating (in place computation) ones. Note that the first argument is always the manifold under consideration, the mutated one is the second.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.PrimalDualManifoldSemismoothNewtonObjective","page":"Objective","title":"Manopt.PrimalDualManifoldSemismoothNewtonObjective","text":"PrimalDualManifoldSemismoothNewtonObjective{E<:AbstractEvaluationType, TC, LO, ALO, PF, DPF, PG, DPG, L} <: AbstractPrimalDualManifoldObjective{E, TC, PF}\n\nDescribes a problem for the Primal-dual Riemannian semismooth Newton algorithm. Diepeveen, Lellmann, SIAM J. Imag. Sci., 2021\n\nFields\n\ncost F + G(Λ()) to evaluate interim cost function values\nlinearized_operator the linearization DΛ() of the operator Λ().\nlinearized_adjoint_operator The adjoint differential (DΛ)^* colon mathcal N to Tmathcal M\nprox_F the proximal map belonging to f\ndiff_prox_F the (Clarke Generalized) differential of the proximal maps of F\nprox_G_dual the proximal map belonging to g_n^*\ndiff_prox_dual_G the (Clarke Generalized) differential of the proximal maps of G^ast_n\nΛ – the exact forward operator. 
This operator is required if Λ(m)=n does not hold.\n\nConstructor\n\nPrimalDualManifoldSemismoothNewtonObjective(cost, prox_F, prox_G_dual, forward_operator, adjoint_linearized_operator,Λ)\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-4","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"adjoint_linearized_operator\nforward_operator\nget_differential_dual_prox\nget_differential_primal_prox\nget_dual_prox\nget_primal_prox\nlinearized_forward_operator","category":"page"},{"location":"plans/objective/#Manopt.adjoint_linearized_operator","page":"Objective","title":"Manopt.adjoint_linearized_operator","text":"X = adjoint_linearized_operator(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, m, n, Y)\nadjoint_linearized_operator(N::AbstractManifold, X, apdmo::AbstractPrimalDualManifoldObjective, m, n, Y)\n\nEvaluate the adjoint of the linearized forward operator of (DΛ(m))^*Y stored within the AbstractPrimalDualManifoldObjective (in place of X). 
Since YT_nmathcal N, both m and n=Λ(m) are necessary arguments, mainly because the forward operator Λ might be missing in p.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.forward_operator","page":"Objective","title":"Manopt.forward_operator","text":"q = forward_operator(M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, p)\nforward_operator!(M::AbstractManifold, N::AbstractManifold, q, apdmo::AbstractPrimalDualManifoldObjective, p)\n\nEvaluate the forward operator of Λ(x) stored within the TwoManifoldProblem (in place of q).\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_differential_dual_prox","page":"Objective","title":"Manopt.get_differential_dual_prox","text":"η = get_differential_dual_prox(N::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, n, τ, X, ξ)\nget_differential_dual_prox!(N::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, η, n, τ, X, ξ)\n\nEvaluate the differential proximal map of G_n^* stored within PrimalDualManifoldSemismoothNewtonObjective\n\nDoperatornameprox_τG_n^*(X)ξ\n\nwhich can also be computed in place of η.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_differential_primal_prox","page":"Objective","title":"Manopt.get_differential_primal_prox","text":"y = get_differential_primal_prox(M::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective σ, x)\nget_differential_primal_prox!(p::TwoManifoldProblem, y, σ, x)\n\nEvaluate the differential proximal map of F stored within AbstractPrimalDualManifoldObjective\n\nDoperatornameprox_σF(x)X\n\nwhich can also be computed in place of y.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_dual_prox","page":"Objective","title":"Manopt.get_dual_prox","text":"Y = get_dual_prox(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, n, τ, X)\nget_dual_prox!(N::AbstractManifold, 
apdmo::AbstractPrimalDualManifoldObjective, Y, n, τ, X)\n\nEvaluate the proximal map of g_n^* stored within AbstractPrimalDualManifoldObjective\n\n Y = operatornameprox_τG_n^*(X)\n\nwhich can also be computed in place of Y.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_primal_prox","page":"Objective","title":"Manopt.get_primal_prox","text":"q = get_primal_prox(M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, σ, p)\nget_primal_prox!(M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, q, σ, p)\n\nEvaluate the proximal map of F stored within AbstractPrimalDualManifoldObjective\n\noperatornameprox_σF(x)\n\nwhich can also be computed in place of q.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.linearized_forward_operator","page":"Objective","title":"Manopt.linearized_forward_operator","text":"Y = linearized_forward_operator(M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, m, X, n)\nlinearized_forward_operator!(M::AbstractManifold, N::AbstractManifold, Y, apdmo::AbstractPrimalDualManifoldObjective, m, X, n)\n\nEvaluate the linearized operator (differential) DΛ(m)X stored within the AbstractPrimalDualManifoldObjective (in place of Y), where n = Λ(m).\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Constrained-Objective","page":"Objective","title":"Constrained Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Besides the AbstractEvaluationType there is one further property to distinguish among constraint functions, especially the gradients of the constraints.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ConstraintType\nFunctionConstraint\nVectorConstraint","category":"page"},{"location":"plans/objective/#Manopt.ConstraintType","page":"Objective","title":"Manopt.ConstraintType","text":"ConstraintType\n\nAn abstract type 
to represent different ways of representing constraints\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.FunctionConstraint","page":"Objective","title":"Manopt.FunctionConstraint","text":"FunctionConstraint <: ConstraintType\n\nA type to indicate that constraints are implemented as one whole function, e.g. g(p) mathbb R^m.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.VectorConstraint","page":"Objective","title":"Manopt.VectorConstraint","text":"VectorConstraint <: ConstraintType\n\nA type to indicate that constraints are implemented as a vector of functions, e.g. g_i(p) mathbb R i=1m.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"The ConstraintType is a parameter of the corresponding Objective.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ConstrainedManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.ConstrainedManifoldObjective","page":"Objective","title":"Manopt.ConstrainedManifoldObjective","text":"ConstrainedManifoldObjective{T<:AbstractEvaluationType, C <: ConstraintType Manifold} <: AbstractManifoldObjective{T}\n\nDescribes the constrained objective\n\nbeginaligned\n operatorname*argmin_p mathcalM f(p)\n textsubject to g_i(p)leq0 quad text for all i=1m\n quad h_j(p)=0 quad text for all j=1n\nendaligned\n\nIt consists of\n\na cost function f(p)\nthe gradient of f, operatornamegradf(p) AbstractManifoldGradientObjective\ninequality constraints g(p), either a function g returning a vector or a vector [g1, g2,...,gm] of functions.\nequality constraints h(p), either a function h returning a vector or a vector [h1, h2,...,hn] of functions.\ngradient(s) of the inequality constraints operatornamegradg(p) (T_pmathcal M)^m, either a function or a vector of functions.\ngradient(s) of the equality constraints operatornamegradh(p) (T_pmathcal M)^n, either a function or a vector of 
functions.\n\nThere are two ways to specify the constraints g and h.\n\nas one Function returning a vector in mathbb R^m and mathbb R^n respectively. This might be easier to implement but requires evaluating all constraints even if only one is needed.\nas an AbstractVector{<:Function} where each function returns a real number. This requires each constraint to be implemented as a single function, but it is possible to evaluate only a single constraint.\n\nThe gradients operatornamegradg, operatornamegradh have to follow the same form. Additionally they can be implemented as in-place functions or as allocating ones. The gradient operatornamegradF has to be the same kind. This difference is indicated by the evaluation keyword.\n\nConstructors\n\nConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h;\n evaluation=AllocatingEvaluation()\n)\n\nWhere f, g, h describe the cost, inequality and equality constraints, respectively, as described above and grad_f, grad_g, grad_h are the corresponding gradient functions in one of the 4 formats. If the objective does not have inequality constraints, you can set G and gradG to nothing. 
If the problem does not have equality constraints, you can set H and gradH to nothing or leave them out.\n\nConstrainedManifoldObjective(M::AbstractManifold, F, gradF;\n G=nothing, gradG=nothing, H=nothing, gradH=nothing;\n evaluation=AllocatingEvaluation()\n)\n\nA keyword argument variant of the constructor above, where you can leave out either G and gradG or H and gradH but not both.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-5","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_constraints\nget_equality_constraint\nget_equality_constraints\nget_inequality_constraint\nget_inequality_constraints\nget_grad_equality_constraint\nget_grad_equality_constraints\nget_grad_equality_constraints!\nget_grad_equality_constraint!\nget_grad_inequality_constraint\nget_grad_inequality_constraint!\nget_grad_inequality_constraints\nget_grad_inequality_constraints!","category":"page"},{"location":"plans/objective/#Manopt.get_constraints","page":"Objective","title":"Manopt.get_constraints","text":"get_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nReturn the vector (g_1(p)g_m(p)h_1(p)h_n(p)) from the ConstrainedManifoldObjective P containing the values of all constraints at p.\n\n\n\n\n\nget_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\n\nReturn the vector (g_1(p)g_m(p)h_1(p)h_n(p)) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_equality_constraint","page":"Objective","title":"Manopt.get_equality_constraint","text":"get_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j)\n\nevaluate the jth equality constraint (h(p))_j or h_j(p).\n\nnote: Note\nFor the FunctionConstraint representation this still evaluates all 
constraints.\n\n\n\n\n\nget_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j)\n\nevaluate the jth equality constraint h_j(p) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_equality_constraints","page":"Objective","title":"Manopt.get_equality_constraints","text":"get_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nevaluate all equality constraints h(p) of bigl(h_1(p) h_2(p)ldotsh_n(p)bigr) of the ConstrainedManifoldObjective P at p.\n\n\n\n\n\nget_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\n\nEvaluate all equality constraints h(p) of bigl(h_1(p) h_2(p)ldotsh_n(p)bigr) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_inequality_constraint","page":"Objective","title":"Manopt.get_inequality_constraint","text":"get_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i)\n\nevaluate one inequality constraint (g(p))_i or g_i(p).\n\nnote: Note\nFor the FunctionConstraint representation this still evaluates all constraints.\n\n\n\n\n\nget_inequality_constraint(M::AbstractManifold, ems::EmbeddedManifoldObjective, p, i)\n\nEvaluate the ith inequality constraint g_i(p) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_inequality_constraints","page":"Objective","title":"Manopt.get_inequality_constraints","text":"get_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nEvaluate all inequality constraints g(p) or bigl(g_1(p) g_2(p)ldotsg_m(p)bigr) of the 
ConstrainedManifoldObjective P at p.\n\n\n\n\n\nget_inequality_constraints(M::AbstractManifold, ems::EmbeddedManifoldObjective, p)\n\nEvaluate all inequality constraints g(p) of bigl(g_1(p) g_2(p)ldotsg_m(p)bigr) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraint","page":"Objective","title":"Manopt.get_grad_equality_constraint","text":"get_grad_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j)\n\nevaluate the gradient of the j th equality constraint (operatornamegrad h(p))_j or operatornamegrad h_j(x).\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints. 
It also allocates a full tangent vector.\n\n\n\n\n\nX = get_grad_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j)\nget_grad_equality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, j)\n\nevaluate the gradient of the jth equality constraint operatornamegrad h_j(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraints","page":"Objective","title":"Manopt.get_grad_equality_constraints","text":"get_grad_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nevaluate all gradients of the equality constraints operatornamegrad h(x) or bigl(operatornamegrad h_1(x) operatornamegrad h_2(x)ldots operatornamegradh_n(x)bigr) of the ConstrainedManifoldObjective P at p.\n\nnote: Note\nFor the InplaceEvaluation and FunctionConstraint variant of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints.\n\n\n\n\n\nX = get_grad_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\nget_grad_equality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)\n\nevaluate the gradients of the equality constraints operatornamegrad h(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradients are then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraints!","page":"Objective","title":"Manopt.get_grad_equality_constraints!","text":"get_grad_equality_constraints!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p)\n\nevaluate 
all gradients of the equality constraints operatornamegrad h(p) or bigl(operatornamegrad h_1(p) operatornamegrad h_2(p)ldotsoperatornamegrad h_n(p)bigr) of the ConstrainedManifoldObjective P at p in place of X, which is a vector of n tangent vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraint!","page":"Objective","title":"Manopt.get_grad_equality_constraint!","text":"get_grad_equality_constraint!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p, j)\n\nEvaluate the gradient of the jth equality constraint (operatornamegrad h(x))_j or operatornamegrad h_j(x) in place of X.\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation of the FunctionConstraint of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints, and allocates a full vector of tangent vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraint","page":"Objective","title":"Manopt.get_grad_inequality_constraint","text":"get_grad_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i)\n\nEvaluate the gradient of the ith inequality constraint (operatornamegrad g(x))_i or operatornamegrad g_i(x).\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. 
For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_inequality_constraints, since this is the only way to determine the number of constraints.\n\n\n\n\n\nX = get_grad_inequality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, i)\nget_grad_inequality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, i)\n\nevaluate the gradient of the ith inequality constraint operatornamegrad g_i(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraint!","page":"Objective","title":"Manopt.get_grad_inequality_constraint!","text":"get_grad_inequality_constraint!(P, X, p, i)\n\nEvaluate the gradient of the ith inequality constraints (operatornamegrad g(x))_i or operatornamegrad g_i(x) of the ConstrainedManifoldObjective P in place of X\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_inequality_constraints,\n\nsince this is the only way to determine the number of constraints. 
evaluate all gradients of the inequality constraints operatornamegrad g(x) or bigl(g_1(x) g_2(x)ldotsg_m(x)bigr) of the ConstrainedManifoldObjective P at x in place of X, which is a vector of m tangent vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraints","page":"Objective","title":"Manopt.get_grad_inequality_constraints","text":"get_grad_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nevaluate all gradients of the inequality constraints operatornamegrad g(p) or bigl(operatornamegrad g_1(p) operatornamegrad g_2(p)operatornamegrad g_m(p)bigr) of the ConstrainedManifoldObjective P at p.\n\nnote: Note\n\n\nFor the InplaceEvaluation and FunctionConstraint variant of the problem, this function currently also calls get_inequality_constraints, since this is the only way to determine the number of constraints.\n\n\n\n\n\nX = get_grad_inequality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\nget_grad_inequality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)\n\nevaluate the gradients of the inequality constraints operatornamegrad g(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradients are then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraints!","page":"Objective","title":"Manopt.get_grad_inequality_constraints!","text":"get_grad_inequality_constraints!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p)\n\nevaluate all gradients of the inequality constraints operatornamegrad g(x) or bigl(operatornamegrad g_1(x) operatornamegrad g_2(x)ldotsoperatornamegrad g_m(x)bigr) of the ConstrainedManifoldObjective P at p in place of X, which is a vector of m tangent 
vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/stopping_criteria/#StoppingCriteria","page":"Stopping Criteria","title":"Stopping Criteria","text":"","category":"section"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Stopping criteria are implemented as a functor, i.e. inherit from the base type","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"StoppingCriterion","category":"page"},{"location":"plans/stopping_criteria/#Manopt.StoppingCriterion","page":"Stopping Criteria","title":"Manopt.StoppingCriterion","text":"StoppingCriterion\n\nAn abstract type for the functors representing stopping criteria, i.e. they are callable structures. The naming scheme follows functions, see for example StopAfterIteration.\n\nEvery StoppingCriterion has to provide a constructor and its function has to have the interface (p,o,i) where an AbstractManoptProblem as well as AbstractManoptSolverState and the current number of iterations are the arguments and returns a Bool whether to stop or not.\n\nBy default each StoppingCriterion should provide a field reason to provide details when a criterion is met (and that is empty otherwise).\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"They can also be grouped, which is summarized in the type of a set of criteria","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"StoppingCriterionSet","category":"page"},{"location":"plans/stopping_criteria/#Manopt.StoppingCriterionSet","page":"Stopping Criteria","title":"Manopt.StoppingCriterionSet","text":"StoppingCriterionGroup <: StoppingCriterion\n\nAn abstract type for a Stopping Criterion that itself consists of a set of Stopping criteria. In total it acts as a stopping criterion itself. 
Examples are StopWhenAny and StopWhenAll that can be used to combine stopping criteria.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Then the stopping criteria s might have certain internal values to check against, and this is done when calling them as a function s(amp::AbstractManoptProblem, ams::AbstractManoptSolverState), where the AbstractManoptProblem and the AbstractManoptSolverState together represent the current state of the solver. The functor returns either false when the stopping criterion is not fulfilled or true otherwise. One field all criteria should have is the s.reason, a string giving the reason to stop, see get_reason.","category":"page"},{"location":"plans/stopping_criteria/#Stopping-Criteria","page":"Stopping Criteria","title":"Stopping Criteria","text":"","category":"section"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"The following generic stopping criteria are available. Some require that, for example, the corresponding AbstractManoptSolverState have a field gradient when the criterion should check that.","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Further stopping criteria might be available for individual solvers.","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Modules = [Manopt]\nPages = [\"plans/stopping_criterion.jl\"]\nOrder = [:type]\nFilter = t -> t != StoppingCriterion && t != StoppingCriterionSet","category":"page"},{"location":"plans/stopping_criteria/#Manopt.StopAfter","page":"Stopping Criteria","title":"Manopt.StopAfter","text":"StopAfter <: StoppingCriterion\n\nstore a threshold when to stop looking at the complete runtime. It uses time_ns() to measure the time and you provide a Period as a time limit, i.e. 
Minute(15)\n\nConstructor\n\nStopAfter(t)\n\ninitialize the stopping criterion to a Period t to stop after.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopAfterIteration","page":"Stopping Criteria","title":"Manopt.StopAfterIteration","text":"StopAfterIteration <: StoppingCriterion\n\nA functor for an easy stopping criterion, i.e. to stop after a maximal number of iterations.\n\nFields\n\nmaxIter – stores the maximal iteration number where to stop at\nreason – stores a reason of stopping if the stopping criterion has one be reached, see get_reason.\n\nConstructor\n\nStopAfterIteration(maxIter)\n\ninitialize the stopafterIteration functor to indicate to stop after maxIter iterations.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenAll","page":"Stopping Criteria","title":"Manopt.StopWhenAll","text":"StopWhenAll <: StoppingCriterion\n\nstore an array of StoppingCriterion elements and indicates to stop, when all indicate to stop. The reason is given by the concatenation of all reasons.\n\nConstructor\n\nStopWhenAll(c::NTuple{N,StoppingCriterion} where N)\nStopWhenAll(c::StoppingCriterion,...)\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenAny","page":"Stopping Criteria","title":"Manopt.StopWhenAny","text":"StopWhenAny <: StoppingCriterion\n\nstore an array of StoppingCriterion elements and indicates to stop, when any single one indicates to stop. 
The reason is given by the concatenation of all reasons (assuming that all non-indicating return \"\").\n\nConstructor\n\nStopWhenAny(c::NTuple{N,StoppingCriterion} where N)\nStopWhenAny(c::StoppingCriterion...)\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenChangeLess","page":"Stopping Criteria","title":"Manopt.StopWhenChangeLess","text":"StopWhenChangeLess <: StoppingCriterion\n\nstores a threshold when to stop looking at the norm of the change of the optimization variable from within an AbstractManoptSolverState, i.e. get_iterate(o). For the storage a StoreStateAction is used.\n\nConstructor\n\nStopWhenChangeLess(\n M::AbstractManifold,\n ε::Float64;\n storage::StoreStateAction=StoreStateAction([:Iterate]),\n inverse_retraction_method::IRT=default_inverse_retraction_method(manifold)\n)\n\ninitialize the stopping criterion to a threshold ε using the StoreStateAction a, which is initialized to just store :Iterate by default. You can also provide an inverse_retraction_method for the distance or a manifold to use its default inverse retraction.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenCostLess","page":"Stopping Criteria","title":"Manopt.StopWhenCostLess","text":"StopWhenCostLess <: StoppingCriterion\n\nstore a threshold when to stop looking at the cost function of the optimization problem from within a AbstractManoptProblem, i.e get_cost(p,get_iterate(o)).\n\nConstructor\n\nStopWhenCostLess(ε)\n\ninitialize the stopping criterion to a threshold ε.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenGradientChangeLess","page":"Stopping Criteria","title":"Manopt.StopWhenGradientChangeLess","text":"StopWhenGradientChangeLess <: StoppingCriterion\n\nA stopping criterion based on the change of the gradient\n\n\\lVert \\mathcal T_{p^{(k)}\\gets p^{(k-1)}} \\operatorname{grad} f(p^{(k-1)}) - \\operatorname{grad} f(p^{(k)}) \\rVert < 
ε\n\nConstructor\n\nStopWhenGradientChangeLess(\n M::AbstractManifold,\n ε::Float64;\n storage::StoreStateAction=StoreStateAction([:Iterate]),\n vector_transport_method::IRT=default_vector_transport_method(M),\n)\n\nCreate a stopping criterion with threshold ε for the change of the gradient, that is, this criterion indicates to stop when the change of get_gradient is (in norm) less than ε, where vector_transport_method denotes the vector transport mathcal T used.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenGradientNormLess","page":"Stopping Criteria","title":"Manopt.StopWhenGradientNormLess","text":"StopWhenGradientNormLess <: StoppingCriterion\n\nA stopping criterion based on the current gradient norm.\n\nConstructor\n\nStopWhenGradientNormLess(ε::Float64)\n\nCreate a stopping criterion with threshold ε for the gradient, that is, this criterion indicates to stop when get_gradient returns a gradient vector of norm less than ε.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenSmallerOrEqual","page":"Stopping Criteria","title":"Manopt.StopWhenSmallerOrEqual","text":"StopWhenSmallerOrEqual <: StoppingCriterion\n\nA functor for a stopping criterion, where the algorithm is stopped when a variable is smaller than or equal to its minimum value.\n\nFields\n\nvalue – stores the variable which has to fall under a threshold for the algorithm to stop\nminValue – stores the threshold where, if the value is smaller or equal to this threshold, the algorithm stops\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenSmallerOrEqual(value, minValue)\n\ninitialize the stopifsmallerorequal functor to indicate to stop after value is smaller than or equal to minValue.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenStepsizeLess","page":"Stopping 
Criteria","title":"Manopt.StopWhenStepsizeLess","text":"StopWhenStepsizeLess <: StoppingCriterion\n\nstores a threshold when to stop looking at the last step size determined or found during the last iteration from within a AbstractManoptSolverState.\n\nConstructor\n\nStopWhenStepsizeLess(ε)\n\ninitialize the stopping criterion to a threshold ε.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Functions-for-Stopping-Criteria","page":"Stopping Criteria","title":"Functions for Stopping Criteria","text":"","category":"section"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"There are a few functions to update, combine and modify stopping criteria, especially to update internal values even for stopping criteria already being used within an AbstractManoptSolverState structure.","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Modules = [Manopt]\nPages = [\"plans/stopping_criterion.jl\"]\nOrder = [:function]","category":"page"},{"location":"plans/stopping_criteria/#Base.:&-Union{Tuple{T}, Tuple{S}, Tuple{S, T}} where {S<:StoppingCriterion, T<:StoppingCriterion}","page":"Stopping Criteria","title":"Base.:&","text":"&(s1,s2)\ns1 & s2\n\nCombine two StoppingCriterion within an StopWhenAll. 
If either s1 (or s2) is already an StopWhenAll, then s2 (or s1) is appended to the list of StoppingCriterion within s1 (or s2).\n\nExample\n\na = StopAfterIteration(200) & StopWhenChangeLess(1e-6)\nb = a & StopWhenGradientNormLess(1e-6)\n\nIs the same as\n\na = StopWhenAll(StopAfterIteration(200), StopWhenChangeLess(1e-6))\nb = StopWhenAll(StopAfterIteration(200), StopWhenChangeLess(1e-6), StopWhenGradientNormLess(1e-6))\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Base.:|-Union{Tuple{T}, Tuple{S}, Tuple{S, T}} where {S<:StoppingCriterion, T<:StoppingCriterion}","page":"Stopping Criteria","title":"Base.:|","text":"|(s1,s2)\ns1 | s2\n\nCombine two StoppingCriterion within an StopWhenAny. If either s1 (or s2) is already an StopWhenAny, then s2 (or s1) is appended to the list of StoppingCriterion within s1 (or s2)\n\nExample\n\na = StopAfterIteration(200) | StopWhenChangeLess(1e-6)\nb = a | StopWhenGradientNormLess(1e-6)\n\nIs the same as\n\na = StopWhenAny(StopAfterIteration(200), StopWhenChangeLess(1e-6))\nb = StopWhenAny(StopAfterIteration(200), StopWhenChangeLess(1e-6), StopWhenGradientNormLess(1e-6))\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_active_stopping_criteria-Tuple{sCS} where sCS<:StoppingCriterionSet","page":"Stopping Criteria","title":"Manopt.get_active_stopping_criteria","text":"get_active_stopping_criteria(c)\n\nreturns all active stopping criteria, if any, that are within a StoppingCriterion c, and indicated a stop, i.e. their reason is nonempty. To be precise for a simple stopping criterion, this returns either an empty array if no stop is indicated or the stopping criterion as the only element of an array. 
For a StoppingCriterionSet all internal (even nested) criteria that indicate to stop are returned.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_reason-Tuple{AbstractManoptSolverState}","page":"Stopping Criteria","title":"Manopt.get_reason","text":"get_reason(o)\n\nreturn the current reason stored within the StoppingCriterion from within the AbstractManoptSolverState This reason is empty if the criterion has never been met.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_reason-Tuple{sC} where sC<:StoppingCriterion","page":"Stopping Criteria","title":"Manopt.get_reason","text":"get_reason(c)\n\nreturn the current reason stored within a StoppingCriterion c. This reason is empty if the criterion has never been met.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_stopping_criteria-Tuple{S} where S<:StoppingCriterionSet","page":"Stopping Criteria","title":"Manopt.get_stopping_criteria","text":"get_stopping_criteria(c)\n\nreturn the array of internally stored StoppingCriterions for a StoppingCriterionSet c.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.indicates_convergence-Tuple{StoppingCriterion}","page":"Stopping Criteria","title":"Manopt.indicates_convergence","text":"indicates_convergence(c::StoppingCriterion)\n\nReturn whether (true) or not (false) a StoppingCriterion does always mean that, when it indicates to stop, the solver has converged to a minimizer or critical point.\n\nNote that this is independent of the actual state of the stopping criterion, i.e. 
whether some of them indicate to stop, but is a purely type-based, static decision\n\nExamples\n\nWith s1=StopAfterIteration(20) and s2=StopWhenGradientNormLess(1e-7) we have\n\nindicates_convergence(s1) is false\nindicates_convergence(s2) is true\nindicates_convergence(s1 | s2) is false, since this might also stop after 20 iterations\nindicates_convergence(s1 & s2) is true, since s2 is fulfilled if this stops.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{Any, Any, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StoppingCriterion, s::Symbol, v::value)\nupdate_stopping_criterion!(s::AbstractManoptSolverState, symbol::Symbol, v::value)\nupdate_stopping_criterion!(c::StoppingCriterion, ::Val{Symbol}, v::value)\n\nUpdate a value within a stopping criterion, specified by the symbol s, to v. If a criterion does not have a value assigned that corresponds to s, the update is ignored.\n\nFor the second signature, the stopping criterion within the AbstractManoptSolverState o is updated.\n\nTo see which symbol updates which value, see the specific stopping criteria. 
They should use dispatch per symbol value (the third signature).\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopAfter, Val{:MaxTime}, Dates.Period}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopAfter, :MaxTime, v::Period)\n\nUpdate the time period after which an algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopAfterIteration, Val{:MaxIteration}, Int64}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopAfterIteration, :MaxIteration, v::Int)\n\nUpdate the number of iterations after which the algorithm should stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenChangeLess, Val{:MinIterateChange}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenChangeLess, :MinIterateChange, v::Int)\n\nUpdate the minimal change below which an algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenCostLess, Val{:MinCost}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenCostLess, :MinCost, v)\n\nUpdate the minimal cost below which the algorithm shall stop\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenGradientChangeLess, Val{:MinGradientChange}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenGradientChangeLess, :MinGradientChange, v)\n\nUpdate the minimal change below which an algorithm shall 
stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenGradientNormLess, Val{:MinGradNorm}, Float64}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenGradientNormLess, :MinGradNorm, v::Float64)\n\nUpdate the minimal gradient norm when an algorithm shall stop\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenStepsizeLess, Val{:MinStepsize}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenStepsizeLess, :MinStepsize, v)\n\nUpdate the minimal step size below which the algorithm shall stop\n\n\n\n\n\n","category":"method"},{"location":"tutorials/HowToRecord/#How-to-Record-Data-During-the-Iterations","page":"Record values","title":"How to Record Data During the Iterations","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"The recording and debugging features make it possible to record nearly any data during the iterations. This tutorial illustrates how to:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"record one value during the iterations;\nrecord multiple values during the iterations and access them afterwards;\ndefine your own RecordAction to perform individual recordings.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Several predefined recordings exist, for example RecordCost or RecordGradient, if the problem the solver uses provides a gradient. For fields of the State the recording can also be done using RecordEntry. 
For other recordings, for example more advanced computations before storing a value, you can define your own RecordAction.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We illustrate these using the gradient descent from the Get Started: Optimize! tutorial.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Here we focus on ways to investigate the behaviour during iterations by using Recording techniques.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Let’s first load the necessary packages.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"using Manopt, Manifolds, Random\nRandom.seed!(42);","category":"page"},{"location":"tutorials/HowToRecord/#The-Objective","page":"Record values","title":"The Objective","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We generate data and define our cost and gradient:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Random.seed!(42)\nm = 30\nM = Sphere(m)\nn = 800\nσ = π / 8\nx = zeros(Float64, m + 1)\nx[2] = 1.0\ndata = [exp(M, x, σ * rand(M; vector_at=x)) for i in 1:n]\nf(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"grad_f (generic function with 1 method)","category":"page"},{"location":"tutorials/HowToRecord/#Plain-Examples","page":"Record values","title":"Plain Examples","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"For the high level interfaces of the solvers, like 
gradient_descent we have to set return_state to true to obtain the whole solver state and not only the resulting minimizer.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Then we can easily use the record= option to add recorded values. This keyword accepts RecordActions as well as several symbols as shortcuts, for example :Cost to record the cost, or if your options have a field f, :f would record that entry. An overview of the symbols that can be used is given here.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We first just record the cost after every iteration","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R = gradient_descent(M, f, grad_f, data[1]; record=:Cost, return_state=true)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 200 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: reached\n |grad f| < 1.0e-9: not reached\nOverall: reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordCost(),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"From the returned state, we see that the GradientDescentState is encapsulated (decorated) within a RecordSolverState.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"For such a state, one can attach different recorders to some operations, currently to :Start, 
:Stop, and :Iteration, where :Iteration is the default when using the record= keyword with a RecordAction as above. We can access all values recorded during the iterations by calling get_record(R, :Iteration) or, since this is the default, even shorter","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"200-element Vector{Float64}:\n 0.6868754085841272\n 0.6240211444102518\n 0.5900374782569906\n 0.5691425134106757\n 0.5512819383843194\n 0.5421368100229839\n 0.5374585627386622\n 0.5350045365259574\n 0.5337243124406585\n 0.5330491236590464\n 0.5326944302021913\n 0.5325071127227715\n 0.5324084047176342\n ⋮\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"To record more than one value, you can pass an array of a mix of symbols and RecordActions which formally introduces RecordGroup. 
Such a group records a tuple of values in every iteration:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R2 = gradient_descent(M, f, grad_f, data[1]; record=[:Iteration, :Cost], return_state=true)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 200 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: reached\n |grad f| < 1.0e-9: not reached\nOverall: reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordGroup([RecordIteration(), RecordCost()]),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Here, the symbol :Cost is mapped to the RecordCost action. The same holds for :Iteration, which obviously records the current iteration number i. To access these you can first extract the group of records (that is where the :Iterations are recorded – note the plural) and then access the :Cost","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record_action(R2, :Iteration)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"RecordGroup([RecordIteration(), RecordCost()])","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Since iteration is the default, we can also omit it here again. 
To access single recorded values, one can use","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record_action(R2)[:Cost]","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"200-element Vector{Float64}:\n 0.6868754085841272\n 0.6240211444102518\n 0.5900374782569906\n 0.5691425134106757\n 0.5512819383843194\n 0.5421368100229839\n 0.5374585627386622\n 0.5350045365259574\n 0.5337243124406585\n 0.5330491236590464\n 0.5326944302021913\n 0.5325071127227715\n 0.5324084047176342\n ⋮\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"This can also be done using the high-level interface get_record","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R2, :Iteration, :Cost)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"200-element Vector{Float64}:\n 0.6868754085841272\n 0.6240211444102518\n 0.5900374782569906\n 0.5691425134106757\n 0.5512819383843194\n 0.5421368100229839\n 0.5374585627386622\n 0.5350045365259574\n 0.5337243124406585\n 0.5330491236590464\n 0.5326944302021913\n 0.5325071127227715\n 0.5324084047176342\n ⋮\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676\n 0.5322977905736676","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Note that the first symbol again 
refers to the point where we record (not to the thing we record). We can also pass a tuple as second argument to have our own order within the tuples returned. Switching the order of the recorded cost and iteration can be done using","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R2, :Iteration, (:Iteration, :Cost))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"200-element Vector{Tuple{Int64, Float64}}:\n (1, 0.6868754085841272)\n (2, 0.6240211444102518)\n (3, 0.5900374782569906)\n (4, 0.5691425134106757)\n (5, 0.5512819383843194)\n (6, 0.5421368100229839)\n (7, 0.5374585627386622)\n (8, 0.5350045365259574)\n (9, 0.5337243124406585)\n (10, 0.5330491236590464)\n (11, 0.5326944302021913)\n (12, 0.5325071127227715)\n (13, 0.5324084047176342)\n ⋮\n (189, 0.5322977905736676)\n (190, 0.5322977905736676)\n (191, 0.5322977905736676)\n (192, 0.5322977905736676)\n (193, 0.5322977905736676)\n (194, 0.5322977905736676)\n (195, 0.5322977905736676)\n (196, 0.5322977905736676)\n (197, 0.5322977905736676)\n (198, 0.5322977905736676)\n (199, 0.5322977905736676)\n (200, 0.5322977905736676)","category":"page"},{"location":"tutorials/HowToRecord/#A-more-Complex-Example","page":"Record values","title":"A more Complex Example","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"To illustrate a more complex example, let’s record: * the iteration number, cost and gradient field, but only every sixth iteration; * the iteration at which we stop.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We first generate the problem and the state, to also illustrate how the low level works when not using the high-level interface gradient_descent.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record 
values","title":"Record values","text":"p = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f))\ns = GradientDescentState(\n M,\n copy(data[1]);\n stopping_criterion=StopAfterIteration(200) | StopWhenGradientNormLess(10.0^-9),\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: not reached\nOverall: not reached\nThis indicates convergence: No","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We now first build a RecordGroup to group the three entries we want to record per iteration. 
We then put this into a RecordEvery to only record this every 6th iteration","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"rI = RecordEvery(\n RecordGroup([\n :Iteration => RecordIteration(),\n :Cost => RecordCost(),\n :Gradient => RecordEntry(similar(data[1]), :X),\n ]),\n 6,\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"RecordEvery(RecordGroup([RecordIteration(), RecordCost(), RecordEntry(:X)]), 6, true)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and for recording the final iteration number","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"sI = RecordIteration()","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"RecordIteration()","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We now combine both into the RecordSolverState decorator. It acts completely the same as any AbstractManoptSolverState but additionally records something in every iteration. 
This is stored in a dictionary of RecordActions, where :Iteration maps to the group that records only every 6th iteration and :Stop to sI, which is executed when the solver stops.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Note that the keyword record= in the high-level interface gradient_descent would only fill the :Iteration entry of said dictionary.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"r = RecordSolverState(s, Dict(:Iteration => rI, :Stop => sI))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: not reached\nOverall: not reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordEvery(RecordGroup([RecordIteration(), RecordCost(), RecordEntry(:X)]), 6, true), Stop = RecordIteration())","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We now call the solver","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"res = solve!(p, r)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 200 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 
0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: reached\n |grad f| < 1.0e-9: not reached\nOverall: reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordEvery(RecordGroup([RecordIteration(), RecordCost(), RecordEntry(:X)]), 6, true), Stop = RecordIteration())","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"And we can check the recorded value at :Stop to see how many iterations were performed","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(res, :Stop)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"1-element Vector{Int64}:\n 200","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and the other values during the iterations are","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(res, :Iteration, (:Iteration, :Cost))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"33-element Vector{Tuple{Int64, Float64}}:\n (6, 0.5421368100229839)\n (12, 0.5325071127227715)\n (18, 0.5323023757104093)\n (24, 0.5322978928223222)\n (30, 0.5322977928970516)\n (36, 0.5322977906274986)\n (42, 0.53229779057494)\n (48, 0.5322977905736989)\n (54, 0.5322977905736691)\n (60, 0.532297790573668)\n (66, 0.5322977905736676)\n (72, 0.5322977905736676)\n (78, 0.5322977905736676)\n ⋮\n (132, 0.5322977905736676)\n (138, 0.5322977905736676)\n (144, 0.5322977905736676)\n (150, 0.5322977905736676)\n (156, 0.5322977905736676)\n (162, 0.5322977905736676)\n (168, 0.5322977905736676)\n (174, 0.5322977905736676)\n (180, 0.5322977905736676)\n (186, 0.5322977905736676)\n (192, 0.5322977905736676)\n (198, 
0.5322977905736676)","category":"page"},{"location":"tutorials/HowToRecord/#Writing-an-own-[RecordAction](https://manoptjl.org/stable/plans/record/#Manopt.RecordAction)s","page":"Record values","title":"Writing an own RecordActions","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Let’s investigate an example where we want to count the number of function evaluations, again just to illustrate, since for the gradient this is just one evaluation per iteration. We first define a cost that counts its own calls.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"mutable struct MyCost{T}\n data::T\n count::Int\nend\nMyCost(data::T) where {T} = MyCost{T}(data, 0)\nfunction (c::MyCost)(M, x)\n c.count += 1\n return sum(1 / (2 * length(c.data)) * distance.(Ref(M), Ref(x), c.data) .^ 2)\nend","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and we define our own new RecordAction, which is a functor, i.e. a struct that is also a function. The function we have to implement is similar to a single solver step in signature, since it might get called every iteration:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"mutable struct RecordCount <: RecordAction\n recorded_values::Vector{Int}\n RecordCount() = new(Vector{Int}())\nend\nfunction (r::RecordCount)(p::AbstractManoptProblem, ::AbstractManoptSolverState, i)\n if i > 0\n push!(r.recorded_values, get_cost_function(get_objective(p)).count)\n elseif i < 0 # reset if negative\n r.recorded_values = Vector{Int}()\n end\nend","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Now we can initialize the new cost and call the gradient descent. 
Note that this illustrates also the last use case – you can pass symbol-action pairs into the record=array.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"f2 = MyCost(data)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"MyCost{Vector{Vector{Float64}}}([[-0.054658825167894595, -0.5592077846510423, -0.04738273828111257, -0.04682080720921302, 0.12279468849667038, 0.07171438895366239, -0.12930045409417057, -0.22102081626380404, -0.31805333254577767, 0.0065859500152017645 … -0.21999168261518043, 0.19570142227077295, 0.340909965798364, -0.0310802190082894, -0.04674431076254687, -0.006088297671169996, 0.01576037011323387, -0.14523596850249543, 0.14526158060820338, 0.1972125856685378], [-0.08192376929745249, -0.5097715132187676, -0.008339904915541005, 0.07289741328038676, 0.11422036270613797, -0.11546739299835748, 0.2296996932628472, 0.1490467170835958, -0.11124820565850364, -0.11790721606521781 … -0.16421249630470344, -0.2450575844467715, -0.07570080850379841, -0.07426218324072491, -0.026520181327346338, 0.11555341205250205, -0.0292955762365121, -0.09012096853677576, -0.23470556634911574, -0.026214242996704013], [-0.22951484264859257, -0.6083825348640186, 0.14273766477054015, -0.11947823367023377, 0.05984293499234536, 0.058820835498203126, 0.07577331705863266, 0.1632847202946857, 0.20244385489915745, 0.04389826920203656 … 0.3222365119325929, 0.009728730325524067, -0.12094785371632395, -0.36322323926212824, -0.0689253407939657, 0.23356953371702974, 0.23489531397909744, 0.078303336494718, -0.14272984135578806, 0.07844539956202407], [-0.0012588500237817606, -0.29958740415089763, 0.036738459489123514, 0.20567651907595125, -0.1131046432541904, -0.06032435985370224, 0.3366633723165895, -0.1694687746143405, -0.001987171245125281, 0.04933779858684409 … -0.2399584473006256, 0.19889267065775063, 0.22468755918787048, 0.1780090580180643, 
0.023703860700539356, -0.10212737517121755, 0.03807004103115319, -0.20569120952458983, -0.03257704254233959, 0.06925473452536687], [-0.035534309946938375, -0.06645560787329002, 0.14823972268208874, -0.23913346587232426, 0.038347027875883496, 0.10453333143286662, 0.050933995140290705, -0.12319549375687473, 0.12956684644537844, -0.23540367869989412 … -0.41471772859912864, -0.1418984610380257, 0.0038321446836859334, 0.23655566917750157, -0.17500681300994742, -0.039189751036839374, -0.08687860620942896, -0.11509948162959047, 0.11378233994840942, 0.38739450723013735], [-0.3122539912469438, -0.3101935557860296, 0.1733113629107006, 0.08968593616209351, -0.1836344261367962, -0.06480023695256802, 0.18165070013886545, 0.19618275767992124, -0.07956460275570058, 0.0325997354656551 … 0.2845492418767769, 0.17406455870721682, -0.053101230371568706, -0.1382082812981627, 0.005830071475508364, 0.16739264037923055, 0.034365814374995335, 0.09107702398753297, -0.1877250428700409, 0.05116494897806923], [-0.04159442361185588, -0.7768029783272633, 0.06303616666722486, 0.08070518925253539, -0.07396265237309446, -0.06008109299719321, 0.07977141629715745, 0.019511027129056415, 0.08629917589924847, -0.11156298867318722 … 0.0792587504128044, -0.016444383900170008, -0.181746064577005, -0.01888129512990984, -0.13523922089388968, 0.11358102175659832, 0.07929049608459493, 0.1689565359083833, 0.07673657951723721, -0.1128480905648813], [-0.21221814304651335, -0.5031823821503253, 0.010326342133992458, -0.12438192100961257, 0.04004758695231872, 0.2280527500843805, -0.2096243232022162, -0.16564828762420294, -0.28325749481138984, 0.17033534605245823 … -0.13599096505924074, 0.28437770540525625, 0.08424426798544583, -0.1266207606984139, 0.04917635557603396, -0.00012608938533809706, -0.04283220254770056, -0.08771365647566572, 0.14750169103093985, 0.11601120086036351], [0.10683290707435536, -0.17680836277740156, 0.23767458301899405, 0.12011180867097299, -0.029404774462600154, 0.11522028383799933, 
-0.3318174480974519, -0.17859266746938374, 0.04352373642537759, 0.2530382802667988 … 0.08879861736692073, -0.004412506987801729, 0.19786810509925895, -0.1397104682727044, 0.09482328498485094, 0.05108149065160893, -0.14578343506951633, 0.3167479772660438, 0.10422673169182732, 0.21573150015891313], [-0.024895624707466164, -0.7473912016432697, -0.1392537238944721, -0.14948896791465557, -0.09765393283580377, 0.04413059403279867, -0.13865379004720355, -0.071032040283992, 0.15604054722246585, -0.10744260463413555 … -0.14748067081342833, -0.14743635071251024, 0.0643591937981352, 0.16138827697852615, -0.12656652133603935, -0.06463635704869083, 0.14329582429103488, -0.01113113793821713, 0.29295387893749997, 0.06774523575259782] … [0.011874845316569967, -0.6910596618389588, 0.21275741439477827, -0.014042545524367437, -0.07883613103495014, -0.0021900966696246776, -0.033836430464220496, 0.2925813113264835, -0.04718187201980008, 0.03949680289730036 … 0.0867736586603294, 0.0404682510051544, -0.24779813848587257, -0.28631514602877145, -0.07211767532456789, -0.15072898498180473, 0.017855923621826746, -0.09795357710255254, -0.14755229203084924, 0.1305005778855436], [0.013457629515450426, -0.3750353654626534, 0.12349883726772073, 0.3521803555005319, 0.2475921439420274, 0.006088649842999206, 0.31203183112392907, -0.036869203979483754, -0.07475746464056504, -0.029297797064479717 … 0.16867368684091563, -0.09450564983271922, -0.0587273302122711, -0.1326667940553803, -0.25530237980444614, 0.37556905374043376, 0.04922612067677609, 0.2605362549983866, -0.21871556587505667, -0.22915883767386164], [0.03295085436260177, -0.971861604433394, 0.034748713521512035, -0.0494065013245799, -0.01767479281403355, 0.0465459739459587, 0.007470494722096038, 0.003227960072276129, 0.0058328596338402365, -0.037591237446692356 … 0.03205152122876297, 0.11331109854742015, 0.03044900529526686, 0.017971704993311105, -0.009329252062960229, -0.02939354719650879, 0.022088835776251863, -0.02546111553658854, 
-0.0026257225461427582, 0.005702111697172774], [0.06968243992532257, -0.7119502191435176, -0.18136614593117445, -0.1695926215673451, 0.01725015359973796, -0.00694164951158388, -0.34621134287344574, 0.024709256792651912, -0.1632255805999673, -0.2158226433583082 … -0.14153772108081458, -0.11256850346909901, 0.045109821764180706, -0.1162754336222613, -0.13221711766357983, 0.005365354776191061, 0.012750671705879105, -0.018208207549835407, 0.12458753932455452, -0.31843587960340897], [-0.19830349374441875, -0.6086693423968884, 0.08552341811170468, 0.35781519334042255, 0.15790663648524367, 0.02712571268324985, 0.09855601327331667, -0.05840653973421127, -0.09546429767790429, -0.13414717696055448 … -0.0430935804718714, 0.2678584478951765, 0.08780994289014614, 0.01613469379498457, 0.0516187906322884, -0.07383067566731401, -0.1481272738354552, -0.010532317187265649, 0.06555344745952187, -0.1506167863762911], [-0.04347524125197773, -0.6327981074196994, -0.221116680035191, 0.0282207467940456, -0.0855024881522933, 0.12821801740178346, 0.1779499563280024, -0.10247384887512365, 0.0396432464100116, -0.0582580338112627 … 0.1253893207083573, 0.09628202269764763, 0.3165295473947355, -0.14915034201394833, -0.1376727867817772, -0.004153096613530293, 0.09277957650773738, 0.05917264554031624, -0.12230262590034507, -0.19655728521529914], [-0.10173946348675116, -0.6475660153977272, 0.1260284619729566, -0.11933160462857616, -0.04774310633937567, 0.09093928358804217, 0.041662676324043114, -0.1264739543938265, 0.09605293126911392, -0.16790474428001648 … -0.04056684573478108, 0.09351665120940456, 0.15259195558799882, 0.0009949298312580497, 0.09461980828206303, 0.3067004514287283, 0.16129258773733715, -0.18893664085007542, -0.1806865244492513, 0.029319680436405825], [-0.251780954320053, -0.39147463259941456, -0.24359579328578626, 0.30179309757665723, 0.21658893985206484, 0.12304585275893232, 0.28281133086451704, 0.029187615341955325, 0.03616243507191924, 0.029375588909979152 … 
-0.08071746662465404, -0.2176101928258658, 0.20944684921170825, 0.043033273425352715, -0.040505542460853576, 0.17935596149079197, -0.08454569418519972, 0.0545941597033932, 0.12471741052450099, -0.24314124407858329], [0.28156471341150974, -0.6708572780452595, -0.1410302363738465, -0.08322589397277698, -0.022772599832907418, -0.04447265789199677, -0.016448068022011157, -0.07490911512503738, 0.2778432295769144, -0.10191899088372378 … -0.057272155080983836, 0.12817478092201395, 0.04623814480781884, -0.12184190164369117, 0.1987855635987229, -0.14533603246124993, -0.16334072868597016, -0.052369977381939437, 0.014904286931394959, -0.2440882678882144], [0.12108727495744157, -0.714787344982596, 0.01632521838262752, 0.04437570556908449, -0.041199280304144284, 0.052984488452616, 0.03796520200156107, 0.2791785910964288, 0.11530429924056099, 0.12178223160398421 … -0.07621847481721669, 0.18353870423743013, -0.19066653731436745, -0.09423224997242206, 0.14596847781388494, -0.09747986927777111, 0.16041150122587072, -0.02296513951256738, 0.06786878373578588, 0.15296635978447756]], 0)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Now for the plain gradient descent, we have to modify the step (to a constant stepsize) and remove the default check whether the cost increases (setting debug to []). We also only look at the first 20 iterations to keep this example small in recorded values. 
We call","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R3 = gradient_descent(\n M,\n f2,\n grad_f,\n data[1];\n record=[:Iteration, :Count => RecordCount(), :Cost],\n stepsize = ConstantStepsize(1.0),\n stopping_criterion=StopAfterIteration(20),\n debug=[],\n return_state=true,\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 20 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nConstantStepsize(1.0, relative)\n\n## Stopping Criterion\nMax Iteration 20: reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordGroup([RecordIteration(), RecordCount([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]), RecordCost()]),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"For :Cost we already learned how to access the recorded values; the :Count => pair introduces the following action to obtain the :Count values. 
We can again access the whole sets of records","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R3)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"20-element Vector{Tuple{Int64, Int64, Float64}}:\n (1, 0, 0.5808287253777765)\n (2, 1, 0.5395268557323746)\n (3, 2, 0.5333529073733115)\n (4, 3, 0.5324514620174543)\n (5, 4, 0.5323201743667151)\n (6, 5, 0.5323010518577256)\n (7, 6, 0.5322982658416161)\n (8, 7, 0.532297859847447)\n (9, 8, 0.5322978006725337)\n (10, 9, 0.5322977920461375)\n (11, 10, 0.5322977907883957)\n (12, 11, 0.5322977906049865)\n (13, 12, 0.5322977905782369)\n (14, 13, 0.532297790574335)\n (15, 14, 0.5322977905737657)\n (16, 15, 0.5322977905736823)\n (17, 16, 0.5322977905736703)\n (18, 17, 0.5322977905736688)\n (19, 18, 0.5322977905736683)\n (20, 19, 0.5322977905736683)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"this is equivalent to calling R3[:Iteration]. 
Note that since we introduced :Count we can also access a single recorded value using","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R3[:Iteration, :Count]","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"20-element Vector{Int64}:\n 0\n 1\n 2\n 3\n 4\n 5\n 6\n 7\n 8\n 9\n 10\n 11\n 12\n 13\n 14\n 15\n 16\n 17\n 18\n 19","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and we see that the cost function is called once per iteration.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"If we use this counting cost and run the default gradient descent with Armijo linesearch, we can infer how many Armijo linesearch backtracks are performed:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"f3 = MyCost(data)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"MyCost{Vector{Vector{Float64}}}([[-0.054658825167894595, -0.5592077846510423, -0.04738273828111257, -0.04682080720921302, 0.12279468849667038, 0.07171438895366239, -0.12930045409417057, -0.22102081626380404, -0.31805333254577767, 0.0065859500152017645 … -0.21999168261518043, 0.19570142227077295, 0.340909965798364, -0.0310802190082894, -0.04674431076254687, -0.006088297671169996, 0.01576037011323387, -0.14523596850249543, 0.14526158060820338, 0.1972125856685378], [-0.08192376929745249, -0.5097715132187676, -0.008339904915541005, 0.07289741328038676, 0.11422036270613797, -0.11546739299835748, 0.2296996932628472, 0.1490467170835958, -0.11124820565850364, -0.11790721606521781 … -0.16421249630470344, -0.2450575844467715, -0.07570080850379841, -0.07426218324072491, -0.026520181327346338, 0.11555341205250205, -0.0292955762365121, -0.09012096853677576, 
-0.23470556634911574, -0.026214242996704013], [-0.22951484264859257, -0.6083825348640186, 0.14273766477054015, -0.11947823367023377, 0.05984293499234536, 0.058820835498203126, 0.07577331705863266, 0.1632847202946857, 0.20244385489915745, 0.04389826920203656 … 0.3222365119325929, 0.009728730325524067, -0.12094785371632395, -0.36322323926212824, -0.0689253407939657, 0.23356953371702974, 0.23489531397909744, 0.078303336494718, -0.14272984135578806, 0.07844539956202407], [-0.0012588500237817606, -0.29958740415089763, 0.036738459489123514, 0.20567651907595125, -0.1131046432541904, -0.06032435985370224, 0.3366633723165895, -0.1694687746143405, -0.001987171245125281, 0.04933779858684409 … -0.2399584473006256, 0.19889267065775063, 0.22468755918787048, 0.1780090580180643, 0.023703860700539356, -0.10212737517121755, 0.03807004103115319, -0.20569120952458983, -0.03257704254233959, 0.06925473452536687], [-0.035534309946938375, -0.06645560787329002, 0.14823972268208874, -0.23913346587232426, 0.038347027875883496, 0.10453333143286662, 0.050933995140290705, -0.12319549375687473, 0.12956684644537844, -0.23540367869989412 … -0.41471772859912864, -0.1418984610380257, 0.0038321446836859334, 0.23655566917750157, -0.17500681300994742, -0.039189751036839374, -0.08687860620942896, -0.11509948162959047, 0.11378233994840942, 0.38739450723013735], [-0.3122539912469438, -0.3101935557860296, 0.1733113629107006, 0.08968593616209351, -0.1836344261367962, -0.06480023695256802, 0.18165070013886545, 0.19618275767992124, -0.07956460275570058, 0.0325997354656551 … 0.2845492418767769, 0.17406455870721682, -0.053101230371568706, -0.1382082812981627, 0.005830071475508364, 0.16739264037923055, 0.034365814374995335, 0.09107702398753297, -0.1877250428700409, 0.05116494897806923], [-0.04159442361185588, -0.7768029783272633, 0.06303616666722486, 0.08070518925253539, -0.07396265237309446, -0.06008109299719321, 0.07977141629715745, 0.019511027129056415, 0.08629917589924847, -0.11156298867318722 … 
0.0792587504128044, -0.016444383900170008, -0.181746064577005, -0.01888129512990984, -0.13523922089388968, 0.11358102175659832, 0.07929049608459493, 0.1689565359083833, 0.07673657951723721, -0.1128480905648813], [-0.21221814304651335, -0.5031823821503253, 0.010326342133992458, -0.12438192100961257, 0.04004758695231872, 0.2280527500843805, -0.2096243232022162, -0.16564828762420294, -0.28325749481138984, 0.17033534605245823 … -0.13599096505924074, 0.28437770540525625, 0.08424426798544583, -0.1266207606984139, 0.04917635557603396, -0.00012608938533809706, -0.04283220254770056, -0.08771365647566572, 0.14750169103093985, 0.11601120086036351], [0.10683290707435536, -0.17680836277740156, 0.23767458301899405, 0.12011180867097299, -0.029404774462600154, 0.11522028383799933, -0.3318174480974519, -0.17859266746938374, 0.04352373642537759, 0.2530382802667988 … 0.08879861736692073, -0.004412506987801729, 0.19786810509925895, -0.1397104682727044, 0.09482328498485094, 0.05108149065160893, -0.14578343506951633, 0.3167479772660438, 0.10422673169182732, 0.21573150015891313], [-0.024895624707466164, -0.7473912016432697, -0.1392537238944721, -0.14948896791465557, -0.09765393283580377, 0.04413059403279867, -0.13865379004720355, -0.071032040283992, 0.15604054722246585, -0.10744260463413555 … -0.14748067081342833, -0.14743635071251024, 0.0643591937981352, 0.16138827697852615, -0.12656652133603935, -0.06463635704869083, 0.14329582429103488, -0.01113113793821713, 0.29295387893749997, 0.06774523575259782] … [0.011874845316569967, -0.6910596618389588, 0.21275741439477827, -0.014042545524367437, -0.07883613103495014, -0.0021900966696246776, -0.033836430464220496, 0.2925813113264835, -0.04718187201980008, 0.03949680289730036 … 0.0867736586603294, 0.0404682510051544, -0.24779813848587257, -0.28631514602877145, -0.07211767532456789, -0.15072898498180473, 0.017855923621826746, -0.09795357710255254, -0.14755229203084924, 0.1305005778855436], [0.013457629515450426, -0.3750353654626534, 
0.12349883726772073, 0.3521803555005319, 0.2475921439420274, 0.006088649842999206, 0.31203183112392907, -0.036869203979483754, -0.07475746464056504, -0.029297797064479717 … 0.16867368684091563, -0.09450564983271922, -0.0587273302122711, -0.1326667940553803, -0.25530237980444614, 0.37556905374043376, 0.04922612067677609, 0.2605362549983866, -0.21871556587505667, -0.22915883767386164], [0.03295085436260177, -0.971861604433394, 0.034748713521512035, -0.0494065013245799, -0.01767479281403355, 0.0465459739459587, 0.007470494722096038, 0.003227960072276129, 0.0058328596338402365, -0.037591237446692356 … 0.03205152122876297, 0.11331109854742015, 0.03044900529526686, 0.017971704993311105, -0.009329252062960229, -0.02939354719650879, 0.022088835776251863, -0.02546111553658854, -0.0026257225461427582, 0.005702111697172774], [0.06968243992532257, -0.7119502191435176, -0.18136614593117445, -0.1695926215673451, 0.01725015359973796, -0.00694164951158388, -0.34621134287344574, 0.024709256792651912, -0.1632255805999673, -0.2158226433583082 … -0.14153772108081458, -0.11256850346909901, 0.045109821764180706, -0.1162754336222613, -0.13221711766357983, 0.005365354776191061, 0.012750671705879105, -0.018208207549835407, 0.12458753932455452, -0.31843587960340897], [-0.19830349374441875, -0.6086693423968884, 0.08552341811170468, 0.35781519334042255, 0.15790663648524367, 0.02712571268324985, 0.09855601327331667, -0.05840653973421127, -0.09546429767790429, -0.13414717696055448 … -0.0430935804718714, 0.2678584478951765, 0.08780994289014614, 0.01613469379498457, 0.0516187906322884, -0.07383067566731401, -0.1481272738354552, -0.010532317187265649, 0.06555344745952187, -0.1506167863762911], [-0.04347524125197773, -0.6327981074196994, -0.221116680035191, 0.0282207467940456, -0.0855024881522933, 0.12821801740178346, 0.1779499563280024, -0.10247384887512365, 0.0396432464100116, -0.0582580338112627 … 0.1253893207083573, 0.09628202269764763, 0.3165295473947355, -0.14915034201394833, 
-0.1376727867817772, -0.004153096613530293, 0.09277957650773738, 0.05917264554031624, -0.12230262590034507, -0.19655728521529914], [-0.10173946348675116, -0.6475660153977272, 0.1260284619729566, -0.11933160462857616, -0.04774310633937567, 0.09093928358804217, 0.041662676324043114, -0.1264739543938265, 0.09605293126911392, -0.16790474428001648 … -0.04056684573478108, 0.09351665120940456, 0.15259195558799882, 0.0009949298312580497, 0.09461980828206303, 0.3067004514287283, 0.16129258773733715, -0.18893664085007542, -0.1806865244492513, 0.029319680436405825], [-0.251780954320053, -0.39147463259941456, -0.24359579328578626, 0.30179309757665723, 0.21658893985206484, 0.12304585275893232, 0.28281133086451704, 0.029187615341955325, 0.03616243507191924, 0.029375588909979152 … -0.08071746662465404, -0.2176101928258658, 0.20944684921170825, 0.043033273425352715, -0.040505542460853576, 0.17935596149079197, -0.08454569418519972, 0.0545941597033932, 0.12471741052450099, -0.24314124407858329], [0.28156471341150974, -0.6708572780452595, -0.1410302363738465, -0.08322589397277698, -0.022772599832907418, -0.04447265789199677, -0.016448068022011157, -0.07490911512503738, 0.2778432295769144, -0.10191899088372378 … -0.057272155080983836, 0.12817478092201395, 0.04623814480781884, -0.12184190164369117, 0.1987855635987229, -0.14533603246124993, -0.16334072868597016, -0.052369977381939437, 0.014904286931394959, -0.2440882678882144], [0.12108727495744157, -0.714787344982596, 0.01632521838262752, 0.04437570556908449, -0.041199280304144284, 0.052984488452616, 0.03796520200156107, 0.2791785910964288, 0.11530429924056099, 0.12178223160398421 … -0.07621847481721669, 0.18353870423743013, -0.19066653731436745, -0.09423224997242206, 0.14596847781388494, -0.09747986927777111, 0.16041150122587072, -0.02296513951256738, 0.06786878373578588, 0.15296635978447756]], 0)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"To not get too many entries 
let’s just look at the first 20 iterations again","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R4 = gradient_descent(\n M,\n f3,\n grad_f,\n data[1];\n record=[:Count => RecordCount()],\n return_state=true,\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 200 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: reached\n |grad f| < 1.0e-9: not reached\nOverall: reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordGroup([RecordCount([25, 29, 33, 37, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96, 100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140, 144, 148, 152, 156, 160, 164, 168, 172, 176, 180, 184, 188, 192, 196, 200, 204, 208, 212, 216, 220, 224, 229, 232, 236, 240, 242, 246, 248, 254, 257, 262, 265, 582, 644, 668, 670, 672, 674, 683, 685, 687, 689, 691, 693, 695, 697, 708, 710, 712, 714, 716, 718, 721, 723, 725, 736, 738, 740, 742, 744, 746, 748, 750, 752, 754, 756, 758, 760, 762, 764, 766, 768, 770, 780, 859, 861, 863, 865, 867, 869, 871, 873, 875, 877, 879, 881, 883, 885, 887, 889, 891, 893, 895, 897, 899, 901, 903, 905, 907, 909, 911, 913, 915, 917, 919, 921, 923, 925, 927, 929, 931, 933, 935, 937, 939, 941, 943, 945, 947, 949, 951, 953, 955, 957, 959, 961, 963, 965, 967, 969, 971, 973, 975, 977, 979, 981, 983, 985, 987, 989, 991, 993, 995, 997, 999, 1001, 1003, 1005, 1007, 1009, 1011, 1013, 1015, 1017, 1019, 1021, 1023, 1025, 1027, 1029, 1031, 1033, 1035, 1037, 1039, 1041, 1043, 1045, 1047, 
1049])]),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R4)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"200-element Vector{Tuple{Int64}}:\n (25,)\n (29,)\n (33,)\n (37,)\n (40,)\n (44,)\n (48,)\n (52,)\n (56,)\n (60,)\n (64,)\n (68,)\n (72,)\n ⋮\n (1027,)\n (1029,)\n (1031,)\n (1033,)\n (1035,)\n (1037,)\n (1039,)\n (1041,)\n (1043,)\n (1045,)\n (1047,)\n (1049,)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We can see that the number of cost function calls varies, depending on how many linesearch backtrack steps were required to obtain a good stepsize.","category":"page"},{"location":"solvers/ChambollePock/#ChambollePockSolver","page":"Chambolle-Pock","title":"The Riemannian Chambolle-Pock Algorithm","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"The Riemannian Chambolle–Pock algorithm is a generalization of the Chambolle–Pock algorithm by Chambolle and Pock [CP11]. It is also known as the primal-dual hybrid gradient (PDHG) or primal-dual proximal splitting (PDPS) algorithm.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"In order to minimize over p∈mathcal M the cost function consisting of","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"F(p) + G(Λ(p))","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"where Fmathcal M  overlineℝ, Gmathcal N  overlineℝ, and Λmathcal M  mathcal N. If the manifolds mathcal M or mathcal N are not Hadamard, it has to be considered locally, i.e. 
on geodesically convex sets mathcal C subset mathcal M and mathcal D subset mathcal N such that Λ(mathcal C) subset mathcal D.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"The algorithm is available in four variants: exact versus linearized (see variant) as well as with primal versus dual relaxation (see relax). For more details, see Bergmann, Herzog, Silva Louzeiro, Tenbrinck and Vidal-Núñez [BHS+21]. In the following we state the case of the exact, primal relaxed Riemannian Chambolle–Pock algorithm.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Given base points mmathcal C, n=Λ(m)mathcal D, initial primal and dual values p^(0) mathcal C, ξ_n^(0) T_n^*mathcal N, and primal and dual step sizes sigma_0, tau_0, relaxation theta_0, as well as acceleration gamma.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"As an initialization, perform bar p^(0) gets p^(0).","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"The algorithm performs the following steps for k=1,2,… (until a StoppingCriterion is fulfilled)","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"ξ^(k+1)_n = operatornameprox_tau_k G_n^*Bigl(ξ_n^(k) + tau_k bigl(log_n Λ (bar p^(k))bigr)^flatBigr)\np^(k+1) = operatornameprox_sigma_k Fbiggl(exp_p^(k)Bigl( operatornamePT_p^(k)gets mbigl(-sigma_k DΛ(m)^*ξ_n^(k+1)bigr)^sharpBigr)biggr)\nUpdate\ntheta_k = (1+2gammasigma_k)^-frac12\nsigma_k+1 = sigma_ktheta_k\ntau_k+1 = fractau_ktheta_k\nbar p^(k+1) = exp_p^(k+1)bigl(-theta_k log_p^(k+1) p^(k)bigr)","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Furthermore, you can exchange the exponential map, the logarithmic map, and the parallel transport by 
a retraction, an inverse retraction, and a vector transport.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Finally you can also update the base points m and n during the iterations. This introduces a few additional vector transports. The same holds for the case Λ(m^(k))neq n^(k) at some point. All these cases are covered in the algorithm.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"ChambollePock\nChambollePock!","category":"page"},{"location":"solvers/ChambollePock/#Manopt.ChambollePock","page":"Chambolle-Pock","title":"Manopt.ChambollePock","text":"ChambollePock(\n M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_linear_operator;\n forward_operator=missing,\n linearized_forward_operator=missing,\n evaluation=AllocatingEvaluation()\n)\n\nPerform the Riemannian Chambolle–Pock algorithm.\n\nGiven a cost function mathcal Emathcal M ℝ of the form\n\nmathcal E(p) = F(p) + G( Λ(p) )\n\nwhere Fmathcal M ℝ, Gmathcal N ℝ, and Λmathcal M mathcal N. 
The remaining input parameters are\n\np, X primal and dual start points xmathcal M and ξT_nmathcal N\nm,n base points on mathcal M and mathcal N, respectively.\nadjoint_linearized_operator the adjoint DΛ^* of the linearized operator DΛ(m) T_mmathcal M  T_Λ(m)mathcal N\nprox_F, prox_G_Dual the proximal maps of F and G^ast_n\n\nNote that depending on the AbstractEvaluationType evaluation the last three parameters as well as the forward operator Λ and the linearized_forward_operator can be given as allocating functions (Manifolds, parameters) -> result or as mutating functions (Manifold, result, parameters) -> result to spare allocations.\n\nBy default, this performs the exact Riemannian Chambolle–Pock algorithm, see the optional parameter DΛ for its linearized variant.\n\nFor more details on the algorithm, see Bergmann et al., Found. Comput. Math., 2021.\n\nOptional Parameters\n\nacceleration – (0.05)\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nevaluation (AllocatingEvaluation()) specify whether the proximal maps and operators are allocating functions (Manifolds, parameters) -> result or given as mutating functions (Manifold, result, parameters) -> result to spare allocations.\nΛ (missing) the (forward) operator Λ() (required for the :exact variant)\nlinearized_forward_operator (missing) its linearization DΛ() (required for the :linearized variant)\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\nrelaxation – (1.)\nrelax – (:primal) whether to relax the primal or dual\nvariant - (:exact if Λ is missing, otherwise :linearized) variant to use. 
Note that this changes the arguments the forward_operator will be called with.\nstopping_criterion – (stopAtIteration(100)) a StoppingCriterion\nupdate_primal_base – (missing) function to update m (identity by default/missing)\nupdate_dual_base – (missing) function to update n (identity by default/missing)\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.ChambollePock!","page":"Chambolle-Pock","title":"Manopt.ChambollePock!","text":"ChambollePock!(M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_linear_operator)\n\nPerform the Riemannian Chambolle–Pock algorithm in place of x, ξ, and potentially m, n if they are not fixed. See ChambollePock for details and optional parameters.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#State","page":"Chambolle-Pock","title":"State","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"ChambollePockState","category":"page"},{"location":"solvers/ChambollePock/#Manopt.ChambollePockState","page":"Chambolle-Pock","title":"Manopt.ChambollePockState","text":"ChambollePockState <: AbstractPrimalDualSolverState\n\nStores all options and variables within a linearized or exact Chambolle–Pock. 
The following list provides the order for the constructor, where the previous iterates are initialized automatically and values with a default may be left out.\n\nm - base point on mathcal M\nn - base point on mathcal N\np - an initial point on x^(0) mathcal M (and its previous iterate)\nX - an initial tangent vector X^(0)T^*mathcal N (and its previous iterate)\npbar - the relaxed iterate used in the next dual update step (when using :primal relaxation)\nXbar - the relaxed iterate used in the next primal update step (when using :dual relaxation)\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nacceleration – (0.) acceleration factor due to Chambolle & Pock\nrelaxation – (1.) relaxation in the primal relaxation step (to compute pbar)\nrelax – (:primal) which variable to relax (:primal or :dual)\nstop - a StoppingCriterion\nvariant – (exact) whether to perform an :exact or :linearized Chambolle-Pock\nupdate_primal_base ((p,o,i) -> o.m) function to update the primal base\nupdate_dual_base ((p,o,i) -> o.n) function to update the dual base\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use on the manifold mathcal M.\ninverse_retraction_method_dual - (default_inverse_retraction_method(N, typeof(n))) an inverse retraction to use on manifold mathcal N.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use on the manifold mathcal M.\nvector_transport_method_dual - (default_vector_transport_method(N, typeof(n))) a vector transport to use on manifold mathcal N.\n\nwhere for the last two functions an AbstractManoptProblem p, an AbstractManoptSolverState o, and the current iterate i are the arguments. 
If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing in the linearized case).\n\nConstructor\n\nChambollePockState(M::AbstractManifold, N::AbstractManifold,\n m::P, n::Q, p::P, X::T, primal_stepsize::Float64, dual_stepsize::Float64;\n kwargs...\n)\n\nwhere all other fields from above are keyword arguments with their default values given in brackets.\n\nif Manifolds.jl is loaded, N is also a keyword argument and set to TangentBundle(M) by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Useful-Terms","page":"Chambolle-Pock","title":"Useful Terms","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"primal_residual\ndual_residual","category":"page"},{"location":"solvers/ChambollePock/#Manopt.primal_residual","page":"Chambolle-Pock","title":"Manopt.primal_residual","text":"primal_residual(p, o, x_old, X_old, n_old)\n\nCompute the primal residual at current iterate k given the necessary values x_k-1 X_k-1, and n_k-1 from the previous iterate.\n\nBigllVert\nfrac1σoperatornameretr^-1_x_kx_k-1 -\nV_x_kgets m_kbigl(DΛ^*(m_k)biglV_n_kgets n_k-1X_k-1 - X_k bigr\nBigrrVert\n\nwhere V_gets is the vector transport used in the ChambollePockState\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.dual_residual","page":"Chambolle-Pock","title":"Manopt.dual_residual","text":"dual_residual(p, o, x_old, X_old, n_old)\n\nCompute the dual residual at current iterate k given the necessary values x_k-1 X_k-1, and n_k-1 from the previous iterate. 
The formula is slightly different depending on the o.variant used:\n\nFor the :linearized variant it reads\n\nBigllVert\nfrac1τbigl(\nV_n_kgets n_k-1(X_k-1)\n- X_k\nbigr)\n-\nDΛ(m_k)bigl\nV_m_kgets x_koperatornameretr^-1_x_kx_k-1\nbigr\nBigrrVert\n\nand for the :exact variant\n\nBigllVert\nfrac1τ V_n_kgets n_k-1(X_k-1)\n-\noperatornameretr^-1_n_kbigl(\nΛ(operatornameretr_m_k(V_m_kgets x_koperatornameretr^-1_x_kx_k-1))\nbigr)\nBigrrVert\n\nwhere in both cases V_gets is the vector transport used in the ChambollePockState.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Debug","page":"Chambolle-Pock","title":"Debug","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"DebugDualBaseIterate\nDebugDualBaseChange\nDebugPrimalBaseIterate\nDebugPrimalBaseChange\nDebugDualChange\nDebugDualIterate\nDebugDualResidual\nDebugPrimalChange\nDebugPrimalIterate\nDebugPrimalResidual\nDebugPrimalDualResidual","category":"page"},{"location":"solvers/ChambollePock/#Manopt.DebugDualBaseIterate","page":"Chambolle-Pock","title":"Manopt.DebugDualBaseIterate","text":"DebugDualBaseIterate(io::IO=stdout)\n\nPrint the dual base variable by using DebugEntry, see their constructors for detail. This method is further set to display o.n.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugDualBaseChange","page":"Chambolle-Pock","title":"Manopt.DebugDualBaseChange","text":"DebugDualBaseChange(; storage=StoreStateAction([:n]), io::IO=stdout)\n\nPrint the change of the dual base variable by using DebugEntryChange, see their constructors for detail, on o.n.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalBaseIterate","page":"Chambolle-Pock","title":"Manopt.DebugPrimalBaseIterate","text":"DebugPrimalBaseIterate()\n\nPrint the primal base variable by using DebugEntry, see their constructors for detail. 
This method is further set to display o.m.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalBaseChange","page":"Chambolle-Pock","title":"Manopt.DebugPrimalBaseChange","text":"DebugPrimalBaseChange(a::StoreStateAction=StoreStateAction([:m]),io::IO=stdout)\n\nPrint the change of the primal base variable by using DebugEntryChange, see their constructors for detail, on o.m.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugDualChange","page":"Chambolle-Pock","title":"Manopt.DebugDualChange","text":"DebugDualChange(opts...)\n\nPrint the change of the dual variable, similar to DebugChange, see their constructors for detail, but with a different calculation of the change, since the dual variable lives in (possibly different) tangent spaces.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Manopt.DebugDualIterate","page":"Chambolle-Pock","title":"Manopt.DebugDualIterate","text":"DebugDualIterate(e)\n\nPrint the dual variable by using DebugEntry, see their constructors for detail. This method is further set to display o.X.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugDualResidual","page":"Chambolle-Pock","title":"Manopt.DebugDualResidual","text":"DebugDualResidual <: DebugAction\n\nA Debug action to print the dual residual. 
The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.\n\nConstructor\n\nDebugDualResidual()\n\nwith the keywords\n\nio (stdout) - stream to perform the debug to\nformat (\"$prefix%s\") format to print the dual residual, using the\nprefix (\"Dual Residual: \") short form to just set the prefix\nstorage (a new StoreStateAction) to store values for the debug.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalChange","page":"Chambolle-Pock","title":"Manopt.DebugPrimalChange","text":"DebugPrimalChange(opts...)\n\nPrint the change of the primal variable by using DebugChange, see their constructors for detail.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalIterate","page":"Chambolle-Pock","title":"Manopt.DebugPrimalIterate","text":"DebugPrimalIterate(opts...;kwargs...)\n\nPrint the primal variable by using DebugIterate, see their constructors for detail.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalResidual","page":"Chambolle-Pock","title":"Manopt.DebugPrimalResidual","text":"DebugPrimalResidual <: DebugAction\n\nA Debug action to print the primal residual. The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.\n\nConstructor\n\nDebugPrimalResidual()\n\nwith the keywords\n\nio (stdout) - stream to perform the debug to\nformat (\"$prefix%s\") format to print the primal residual, using the\nprefix (\"Primal Residual: \") short form to just set the prefix\nstorage (a new StoreStateAction) to store values for the debug.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalDualResidual","page":"Chambolle-Pock","title":"Manopt.DebugPrimalDualResidual","text":"DebugPrimalDualResidual <: DebugAction\n\nA Debug action to print the primal-dual residual. 
The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.\n\nConstructor\n\nDebugPrimalDualResidual()\n\nwith the keywords\n\nio (stdout) - stream to perform the debug to\nformat (\"$prefix%s\") format to print the primal-dual residual, using the\nprefix (\"Primal Residual: \") short form to just set the prefix\nstorage (a new StoreStateAction) to store values for the debug.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Record","page":"Chambolle-Pock","title":"Record","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"RecordDualBaseIterate\nRecordDualBaseChange\nRecordDualChange\nRecordDualIterate\nRecordPrimalBaseIterate\nRecordPrimalBaseChange\nRecordPrimalChange\nRecordPrimalIterate","category":"page"},{"location":"solvers/ChambollePock/#Manopt.RecordDualBaseIterate","page":"Chambolle-Pock","title":"Manopt.RecordDualBaseIterate","text":"RecordDualBaseIterate(n)\n\nCreate a RecordAction that records the dual base point, i.e. RecordEntry of o.n.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordDualBaseChange","page":"Chambolle-Pock","title":"Manopt.RecordDualBaseChange","text":"RecordDualBaseChange(e)\n\nCreate a RecordAction that records the dual base point change, i.e. 
RecordEntryChange of o.n with distance to the last value to store a value.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordDualChange","page":"Chambolle-Pock","title":"Manopt.RecordDualChange","text":"RecordDualChange()\n\nCreate the action, either with a given (shared) storage, which can be set to the values Tuple if that is provided.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordDualIterate","page":"Chambolle-Pock","title":"Manopt.RecordDualIterate","text":"RecordDualIterate(X)\n\nCreate a RecordAction that records the dual variable, i.e. RecordEntry of o.X.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalBaseIterate","page":"Chambolle-Pock","title":"Manopt.RecordPrimalBaseIterate","text":"RecordPrimalBaseIterate(x)\n\nCreate a RecordAction that records the primal base point, i.e. RecordEntry of o.m.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalBaseChange","page":"Chambolle-Pock","title":"Manopt.RecordPrimalBaseChange","text":"RecordPrimalBaseChange()\n\nCreate a RecordAction that records the primal base point change, i.e. RecordEntryChange of o.m with distance to the last value to store a value.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalChange","page":"Chambolle-Pock","title":"Manopt.RecordPrimalChange","text":"RecordPrimalChange(a)\n\nCreate a RecordAction that records the primal value change, i.e. RecordChange, since we just record the change of o.x.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalIterate","page":"Chambolle-Pock","title":"Manopt.RecordPrimalIterate","text":"RecordPrimalIterate(x)\n\nCreate a RecordAction that records the primal iterate, i.e. RecordIterate, i.e. 
o.x.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Internals","page":"Chambolle-Pock","title":"Internals","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Manopt.update_prox_parameters!","category":"page"},{"location":"solvers/ChambollePock/#Manopt.update_prox_parameters!","page":"Chambolle-Pock","title":"Manopt.update_prox_parameters!","text":"update_prox_parameters!(o)\n\nupdate the prox parameters as described in Algorithm 2 of Chambolle, Pock, 2010, i.e.\n\nθ_n = frac1sqrt1+2γτ_n\nτ_n+1 = θ_nτ_n\nσ_n+1 = fracσ_nθ_n\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Literature","page":"Chambolle-Pock","title":"Literature","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Pages = [\"solvers/ChambollePock.md\"]\nCanonical=false","category":"page"},{"location":"tutorials/EmbeddingObjectives/#How-to-define-the-cost-in-the-embedding","page":"Define Objectives in the Embedding","title":"How to define the cost in the embedding","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Specifying a cost function fcolon mathcal M to mathbb R on a manifold is usually the model one starts with. Specifying its gradient operatornamegrad fcolonmathcal M to Tmathcal M, or more precisely operatornamegradf(p) in T_pmathcal M, and eventually a Hessian operatornameHess fcolon T_pmathcal M to T_pmathcal M are then necessary to perform optimization. 
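For intuition, a standard example: for the squared distance f(p) = frac12 d^2(p, q) to a fixed point q, the Riemannian gradient is grad f(p) = -log_p q, while the Hessian additionally involves the curvature of the manifold. 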
Since these might be challenging to compute, especially when manifolds and differential geometry are not the user's main area of expertise, easier-to-use methods are welcome.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"This tutorial discusses how to specify f in the embedding as tilde f, maybe only locally around the manifold, and use the Euclidean gradient nabla tilde f and Hessian nabla^2 tilde f within Manopt.jl.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"For the theoretical background see convert a Euclidean to a Riemannian Gradient, or Section 4.7 of [Bou23] for the gradient part or Section 5.11 as well as [Ngu23] for the background on converting Hessians.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Here we use the Examples 9.40 and 9.49 of [Bou23] and compare the different methods one can use to call the solver, depending on which gradient and/or Hessian one provides.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"using Manifolds, Manopt, ManifoldDiff\nusing LinearAlgebra, Random, Colors, Plots\nRandom.seed!(123)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"We consider the cost function on the Grassmann manifold given by","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"n = 5\nk = 2\nM = Grassmann(5,2)\nA = 
Symmetric(rand(n,n));","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"f(M, p) = 1 / 2 * tr(p' * A * p)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Note that this implementation is also a valid continuation of f into the (lifted) embedding of the Grassmann manifold. In the implementation we can use f for both the Euclidean tilde f and the Grassmann case f.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Its Euclidean gradient nabla f and Hessian nabla^2f are easy to compute as","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"∇f(M, p) = A * p\n∇²f(M,p,X) = A*X","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"On the other hand, from the aforementioned Example 9.49 we can also state the Riemannian gradient and Hessian for comparison as","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"grad_f(M, p) = A * p - p * (p' * A * p)\nHess_f(M, p, X) = A * X - p * p' * A * X - X * p' * A * p","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"We can check that these are correct, at least numerically, by calling the check_gradient","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the 
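A quick way to see that grad_f above indeed maps into the tangent space: for a Stiefel representative p with p'p = I, one has p'(A p - p(p' A p)) = p'Ap - p'Ap = 0. A small self-contained Julia sketch of this identity (plain LinearAlgebra only, not using Manopt.jl's check_gradient; the random A and p are illustrative):

```julia
using LinearAlgebra

n, k = 5, 2
A = Symmetric(rand(n, n))
# an orthonormal (Stiefel) representative of a point on Grassmann(5, 2)
p = Matrix(qr(randn(n, k)).Q)[:, 1:k]

grad_f(p) = A * p - p * (p' * A * p)   # Riemannian gradient from Example 9.49

# tangent vectors in this representation satisfy p'X = 0
X = grad_f(p)
norm(p' * X)  # ≈ 0 up to floating point error
```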
Embedding","text":"check_gradient(M, f, grad_f; plot=true)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"and the check_Hessian, which requires a bit more tolerance in its linearity check","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_Hessian(M, f, grad_f, Hess_f; plot=true, throw_error=true, atol=1e-15)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"While they look reasonable here and were already derived, for the general case this derivation might be more complicated.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Luckily there exist two functions in ManifoldDiff.jl that are implemented for several manifolds from Manifolds.jl, namely riemannian_gradient(M, p, eG) that converts the Euclidean gradient eG=nabla tilde f(p) into the Riemannian one operatornamegrad f(p) and riemannian_Hessian(M, p, eG, eH, X) which converts the Euclidean Hessian eH=nabla^2 tilde f(p)X into operatornameHess f(p)X, where we also require the Euclidean gradient eG=nabla tilde f(p).","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"So we can 
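For the Grassmann manifold in Stiefel representation, the conversion performed by riemannian_gradient amounts to projecting the Euclidean gradient onto the tangent space, Y ↦ Y - p(p'Y). A hedged, self-contained sketch verifying only this algebraic relation (the actual conversion lives in ManifoldDiff.jl; the random data here is illustrative):

```julia
using LinearAlgebra

n, k = 5, 2
A = Symmetric(rand(n, n))
p = Matrix(qr(randn(n, k)).Q)[:, 1:k]   # Stiefel representative, p'p = I

∇f(p) = A * p                           # Euclidean gradient
proj(p, Y) = Y - p * (p' * Y)           # tangent-space projection
grad_f(p) = A * p - p * (p' * A * p)    # Riemannian gradient from Example 9.49

# the projected Euclidean gradient equals the Riemannian one
norm(proj(p, ∇f(p)) - grad_f(p))  # ≈ 0
```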
define","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"grad2_f(M, p) = riemannian_gradient(M, p, ∇f(get_embedding(M), embed(M, p)))","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"where, formally, we call embed(M,p) before passing p to the Euclidean gradient, though here (for the Grassmann manifold with Stiefel representation) the embedding function is the identity.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Similarly for the Hessian, where in our example the embeddings of both the points and tangent vectors are the identity.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"function Hess2_f(M, p, X)\n return riemannian_Hessian(\n M,\n p,\n ∇f(get_embedding(M), embed(M, p)),\n ∇²f(get_embedding(M), embed(M, p), embed(M, p, X)),\n X\n )\nend","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"And we can again check these numerically,","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_gradient(M, f, grad2_f; plot=true)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the 
Embedding","text":"and","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_Hessian(M, f, grad2_f, Hess2_f; plot=true, throw_error=true, atol=1e-14)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"which yields the same result, but we see that the Euclidean conversion might be a bit less stable.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Now, if we want to use these in optimization, we can pass these two functions to a solver, e.g.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"p0 = [1.0 0.0; 0.0 1.0; 0.0 0.0; 0.0 0.0; 0.0 0.0]\nr1 = adaptive_regularization_with_cubics(\n M,\n f,\n grad_f,\n Hess_f,\n p0;\n debug=[:Iteration, :Cost, \"\\n\"],\n return_objective=true,\n return_state=true,\n)\nq1 = get_solver_result(r1)\nr1","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Initial f(x): 0.666814\n# 1 f(x): 0.333500\n# 2 f(x): -0.233243\n# 3 f(x): -0.440486\n# 4 f(x): -0.607487\n# 5 f(x): -0.608797\n# 6 f(x): -0.608797\n# 7 f(x): -0.608797\n\n# Solver state for `Manopt.jl`s Adaptive Regularization with Cubics (ARC)\nAfter 7 iterations\n\n## Parameters\n* η1 | η2 : 0.1 | 0.9\n* γ1 | γ2 : 0.1 | 2.0\n* σ (σmin) : 0.0004082482904638632 (1.0e-10)\n* ρ (ρ_regularization) : 0.9998886221507552 (1000.0)\n* retraction method : PolarRetraction()\n* 
sub solver state :\n | # Solver state for `Manopt.jl`s Lanczos Iteration\n | After 6 iterations\n | \n | ## Parameters\n | * σ : 0.0040824829046386315\n | * # of Lanczos vectors used : 6\n | \n | ## Stopping Criteria\n | (a) For the Lanczos Iteration\n | Stop When _one_ of the following are fulfilled:\n | Max Iteration 6: reached\n | First order progress with θ=0.5: not reached\n | Overall: reached\n | (b) For the Newton sub solver\n | Max Iteration 200: not reached\n | This indicates convergence: No\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 40: not reached\n |grad f| < 1.0e-9: reached\n All Lanczos vectors (5) used: not reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Debug\n [ (:Iteration, \"# %-6d\"), (:Cost, \"f(x): %f\"), \"\n\" ]","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"but if you choose to use the conversions, thinking about the embedding and defining two new functions might be tedious. 
There is a shortcut for these, which performs the conversion internally when necessary, by specifying objective_type=:Euclidean.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"r2 = adaptive_regularization_with_cubics(\n M,\n f,\n ∇f,\n ∇²f,\n p0;\n # The one line difference: specify that our grad/Hess are Euclidean:\n objective_type=:Euclidean,\n debug=[:Iteration, :Cost, \"\\n\"],\n return_objective=true,\n return_state=true,\n)\nq2 = get_solver_result(r2)\nr2","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Initial f(x): 0.666814\n# 1 f(x): 0.333500\n# 2 f(x): -0.233243\n# 3 f(x): -0.440486\n# 4 f(x): -0.607487\n# 5 f(x): -0.608797\n# 6 f(x): -0.608797\n# 7 f(x): -0.608797\n\n# Solver state for `Manopt.jl`s Adaptive Regularization with Cubics (ARC)\nAfter 7 iterations\n\n## Parameters\n* η1 | η2 : 0.1 | 0.9\n* γ1 | γ2 : 0.1 | 2.0\n* σ (σmin) : 0.0004082482904638632 (1.0e-10)\n* ρ (ρ_regularization) : 0.9998886221248858 (1000.0)\n* retraction method : PolarRetraction()\n* sub solver state :\n | # Solver state for `Manopt.jl`s Lanczos Iteration\n | After 6 iterations\n | \n | ## Parameters\n | * σ : 0.0040824829046386315\n | * # of Lanczos vectors used : 6\n | \n | ## Stopping Criteria\n | (a) For the Lanczos Iteration\n | Stop When _one_ of the following are fulfilled:\n | Max Iteration 6: reached\n | First order progress with θ=0.5: not reached\n | Overall: reached\n | (b) For the Newton sub solver\n | Max Iteration 200: not reached\n | This indicates convergence: No\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 40: not reached\n |grad f| < 1.0e-9: reached\n All Lanczos vectors (5) used: not reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Debug\n [ (:Iteration, \"# %-6d\"), (:Cost, 
\"f(x): %f\"), \"\n\" ]","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"which returns the same result, see","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"distance(M, q1, q2)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"3.2016811410571575e-16","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"This conversion also works for the gradients of constraints, and is passed down to subsolvers by default when these are created using the Euclidean objective f, nabla f and nabla^2 f.","category":"page"},{"location":"tutorials/EmbeddingObjectives/#Summary","page":"Define Objectives in the Embedding","title":"Summary","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"If you have the Euclidean gradient (or Hessian) available for a solver call, all you need to provide is objective_type=:Euclidean to convert the objective to a Riemannian one.","category":"page"},{"location":"tutorials/EmbeddingObjectives/#Literature","page":"Define Objectives in the Embedding","title":"Literature","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Pages = [\"tutorials/EmbeddingObjectives.md\"]\nCanonical=false","category":"page"},{"location":"tutorials/EmbeddingObjectives/#Technical-Details","page":"Define Objectives in the Embedding","title":"Technical 
Details","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"This notebook was rendered with the following environment","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Pkg.status()","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Status `~/work/Manopt.jl/Manopt.jl/tutorials/Project.toml`\n [6e4b80f9] BenchmarkTools v1.3.2\n [5ae59095] Colors v0.12.10\n [31c24e10] Distributions v0.25.100\n [26cc04aa] FiniteDifferences v0.12.30\n [7073ff75] IJulia v1.24.2\n [8ac3fa9e] LRUCache v1.4.1\n [af67fdf4] ManifoldDiff v0.3.6\n [1cead3c2] Manifolds v0.8.75\n [3362f125] ManifoldsBase v0.14.11\n [0fc0a36d] Manopt v0.4.34 `~/work/Manopt.jl/Manopt.jl`\n [91a5bcdd] Plots v1.39.0","category":"page"},{"location":"helpers/data/#Data","page":"Data","title":"Data","text":"","category":"section"},{"location":"helpers/data/","page":"Data","title":"Data","text":"For some manifolds there are artificial or real application data available that can be loaded using the following data functions. Note that these need additionally Manifolds.jl to be loaded.","category":"page"},{"location":"helpers/data/","page":"Data","title":"Data","text":"Modules = [Manopt]\nPages = [\"artificialDataFunctions.jl\"]","category":"page"},{"location":"helpers/data/#Manopt.artificialIn_SAR_image-Tuple{Integer}","page":"Data","title":"Manopt.artificialIn_SAR_image","text":"artificialIn_SAR_image([pts=500])\n\ngenerate an artificial InSAR image, i.e. phase valued data, of size pts x pts points.\n\nThis data set was introduced for the numerical examples in Bergmann et. 
al., SIAM J Imag Sci, 2014.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S1_signal","page":"Data","title":"Manopt.artificial_S1_signal","text":"artificial_S1_signal([pts=500])\n\ngenerate a real-valued signal having piecewise constant, linear and quadratic intervals with jumps in between. If the manifold the data lives on is the Circle, the data is also wrapped to -pipi). This is data for an example from Bergmann et. al., SIAM J Imag Sci, 2014.\n\nOptional\n\npts – (500) number of points to sample the function\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S1_signal-Tuple{Real}","page":"Data","title":"Manopt.artificial_S1_signal","text":"artificial_S1_signal(x)\n\nevaluate the example signal f(x), x 01, of phase-valued data introduced in Sec. 5.1 of Bergmann et. al., SIAM J Imag Sci, 2014; for values outside that interval, this signal is missing.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S1_slope_signal","page":"Data","title":"Manopt.artificial_S1_slope_signal","text":"artificial_S1_slope_signal([pts=500, slope=4.])\n\nCreates a signal of (phase-valued) data represented on the Circle with increasing slope.\n\nOptional\n\npts – (500) number of points to sample the function.\nslope – (4.0) initial slope that gets increased afterwards\n\nThis data set was introduced for the numerical examples in Bergmann et. 
al., SIAM J Imag Sci, 2014\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S2_composite_bezier_curve-Tuple{}","page":"Data","title":"Manopt.artificial_S2_composite_bezier_curve","text":"artificial_S2_composite_bezier_curve()\n\nCreate the artificial curve in the Sphere(2) consisting of 3 segments between the four points\n\np_0 = beginbmatrix001endbmatrix^mathrmT\np_1 = beginbmatrix0-10endbmatrix^mathrmT\np_2 = beginbmatrix-100endbmatrix^mathrmT\np_3 = beginbmatrix00-1endbmatrix^mathrmT\n\nwhere each segment is a cubic Bézier curve, i.e. each point, except p_3 has a first point within the following segment b_i^+, i=012 and a last point within the previous segment, except for p_0, which are denoted by b_i^-, i=123. This curve is differentiable by the conditions b_i^- = gamma_b_i^+p_i(2), i=12, where gamma_ab is the shortest_geodesic connecting a and b. The remaining points are defined as\n\nbeginaligned\n b_0^+ = exp_p_0fracpi8sqrt2beginpmatrix1-10endpmatrix^mathrmT\n b_1^+ = exp_p_1-fracpi4sqrt2beginpmatrix-101endpmatrix^mathrmT\n b_2^+ = exp_p_2fracpi4sqrt2beginpmatrix01-1endpmatrix^mathrmT\n b_3^- = exp_p_3-fracpi8sqrt2beginpmatrix-110endpmatrix^mathrmT\nendaligned\n\nThis example was used within minimization of acceleration of the paper Bergmann, Gousenbourger, Front. Appl. Math. 
Stat., 2018.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S2_lemniscate","page":"Data","title":"Manopt.artificial_S2_lemniscate","text":"artificial_S2_lemniscate(p, t::Float64; a::Float64=π/2)\n\nGenerate a point from the signal on the Sphere mathbb S^2 by creating the Lemniscate of Bernoulli in the tangent space of p sampled at t and use exp to obtain a point on the Sphere (https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html).\n\nInput\n\np – the tangent space the Lemniscate is created in\nt – value to sample the Lemniscate at\n\nOptional Values\n\na – (π/2) defines a half axis of the Lemniscate to cover a half sphere.\n\nThis dataset was used in the numerical example of Section 5.1 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S2_lemniscate-2","page":"Data","title":"Manopt.artificial_S2_lemniscate","text":"artificial_S2_lemniscate(p[, pts=128, a=π/2, interval=[0,2π]])\n\nGenerate a signal on the Sphere mathbb S^2 by creating the Lemniscate of Bernoulli in the tangent space of p sampled at pts points and use exp to get a signal on the Sphere.\n\nInput\n\np – the tangent space the Lemniscate is created in\npts – (128) number of points to sample the Lemniscate\na – (π/2) defines a half axis of the Lemniscate to cover a half sphere.\ninterval – ([0,2*π]) range to sample the lemniscate at, the default value refers to one closed curve\n\nThis dataset was used in the numerical example of Section 5.1 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S2_rotation_image-Tuple{}","page":"Data","title":"Manopt.artificial_S2_rotation_image","text":"artificial_S2_rotation_image([pts=64, rotations=(.5,.5)])\n\nCreate an image with a rotation on each axis as a parametrization.\n\nOptional Parameters\n\npts – (64) number of pixels along one dimension\nrotations – ((.5,.5)) number 
of total rotations performed on the axes.\n\nThis dataset was used in the numerical example of Section 5.1 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S2_whirl_image-Tuple{Int64}","page":"Data","title":"Manopt.artificial_S2_whirl_image","text":"artificial_S2_whirl_image([pts::Int=64])\n\nGenerate an artificial image of data on the 2-sphere,\n\nArguments\n\npts – (64) size of the image in ptstimespts pixel.\n\nThis example dataset was used in the numerical example in Section 5.5 of Laus et al., SIAM J Imag Sci., 2017\n\nIt is based on artificial_S2_rotation_image extended by small whirl patches.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S2_whirl_patch","page":"Data","title":"Manopt.artificial_S2_whirl_patch","text":"artificial_S2_whirl_patch([pts=5])\n\ncreate a whirl within a ptstimespts patch of Sphere(2)-valued image data.\n\nThese patches are used within artificial_S2_whirl_image.\n\nOptional Parameters\n\npts – (5) size of the patch. 
If the number is odd, the center is the north pole.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_SPD_image","page":"Data","title":"Manopt.artificial_SPD_image","text":"artificial_SPD_image([pts=64, stepsize=1.5])\n\ncreate an artificial image of symmetric positive definite matrices of size ptstimespts pixel with a jump of size stepsize.\n\nThis dataset was used in the numerical example of Section 5.2 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_SPD_image2-Tuple{Any, Any}","page":"Data","title":"Manopt.artificial_SPD_image2","text":"artificial_SPD_image2([pts=64, fraction=.66])\n\ncreate an artificial image of symmetric positive definite matrices of size ptstimespts pixel, where the right-hand side fraction is moved upwards.\n\nThis data set was introduced in the numerical examples of Bergmann, Persch, Steidl, SIAM J Imag Sci, 2016\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Literature","page":"Data","title":"Literature","text":"","category":"section"},{"location":"helpers/data/","page":"Data","title":"Data","text":"Pages = [\"helpers/data.md\"]\nCanonical=false","category":"page"},{"location":"solvers/alternating_gradient_descent/#AlternatingGradientDescentSolver","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"","category":"section"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"alternating_gradient_descent\nalternating_gradient_descent!","category":"page"},{"location":"solvers/alternating_gradient_descent/#Manopt.alternating_gradient_descent","page":"Alternating Gradient 
Descent","title":"Manopt.alternating_gradient_descent","text":"alternating_gradient_descent(M::ProductManifold, f, grad_f, p=rand(M))\nalternating_gradient_descent(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)\n\nperform an alternating gradient descent\n\nInput\n\nM – the product manifold mathcal M = mathcal M_1 mathcal M_2 mathcal M_n\nf – the objective function (cost) defined on M.\ngrad_f – a gradient, that can be of two cases\nis a single function returning an ArrayPartition or\nis a vector of functions, each returning a component of the whole gradient\np – an initial value p_0 mathcal M\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient(s) works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x) (elementwise).\nevaluation_order – (:Linear) whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default :Linear one.\ninner_iterations – (5) how many gradient steps to take in a component before alternating to the next\nstopping_criterion – (StopAfterIteration(1000)) a StoppingCriterion\nstepsize – (ArmijoLinesearch()) a Stepsize\norder – ([1:n]) the initial permutation, where n is the number of gradients in gradF.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.\n\nOutput\n\nusually the obtained (approximate) minimizer, see get_solver_return for details\n\nnote: Note\nThis problem requires the ProductManifold from Manifolds.jl, so Manifolds.jl needs to be loaded.\n\nnote: Note\nThe input of each of the (component) gradients is still the whole vector X, just that all but the i-th input component are assumed to be fixed and only the i-th component's gradient is computed / returned.\n\n\n\n\n\n","category":"function"},{"location":"solvers/alternating_gradient_descent/#Manopt.alternating_gradient_descent!","page":"Alternating Gradient 
Descent","title":"Manopt.alternating_gradient_descent!","text":"alternating_gradient_descent!(M::ProductManifold, f, grad_f, p)\nalternating_gradient_descent!(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)\n\nperform an alternating gradient descent in place of p.\n\nInput\n\nM – a product manifold mathcal M\nf – the objective function (cost)\ngrad_f – a gradient function, that either returns a vector of the subgradients or is a vector of gradients\np – an initial value p_0 mathcal M\n\nyou can also pass a ManifoldAlternatingGradientObjective ago containing f and grad_f instead.\n\nFor all optional parameters, see alternating_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/alternating_gradient_descent/#State","page":"Alternating Gradient Descent","title":"State","text":"","category":"section"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"AlternatingGradientDescentState","category":"page"},{"location":"solvers/alternating_gradient_descent/#Manopt.AlternatingGradientDescentState","page":"Alternating Gradient Descent","title":"Manopt.AlternatingGradientDescentState","text":"AlternatingGradientDescentState <: AbstractGradientDescentSolverState\n\nStore the fields for an alternating gradient descent algorithm, see also alternating_gradient_descent.\n\nFields\n\ndirection – (AlternatingGradient(zero_vector(M, x))) a DirectionUpdateRule\nevaluation_order – (:Linear) whether to use a randomly permuted sequence (:FixedRandom), a per cycle newly permuted sequence (:Random) or the default :Linear evaluation order.\ninner_iterations – (5) how many gradient steps to take in a component before alternating to the next\norder – the current permutation\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M,x,ξ) to use.\nstepsize – (ConstantStepsize(M)) a Stepsize\nstopping_criterion – (StopAfterIteration(1000)) a StoppingCriterion\np – 
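Following the docstring above, here is a minimal usage sketch on a product of two spheres, with the gradient given as a vector of component functions. This is a hedged sketch: the cost, the fixed points a and b, and the use of submanifold_component and M[i] to access the product components are illustrative assumptions about the Manifolds.jl product-manifold interface, not code taken from the Manopt.jl documentation.

```julia
using Manifolds, Manopt

M = Sphere(2) × Sphere(2)                      # a ProductManifold
a = [1.0, 0.0, 0.0]; b = [0.0, 1.0, 0.0]

# cost: squared distances to a fixed point, one per component
f(M, p) = distance(M[1], submanifold_component(p, 1), a)^2 / 2 +
          distance(M[2], submanifold_component(p, 2), b)^2 / 2

# gradient as a vector of functions, each returning one component part;
# the gradient of p ↦ d(p, a)²/2 on a sphere is -log_p(a)
grad_f = [
    (M, p) -> -log(M[1], submanifold_component(p, 1), a),
    (M, p) -> -log(M[2], submanifold_component(p, 2), b),
]

p0 = rand(M)
q = alternating_gradient_descent(M, f, grad_f, p0; inner_iterations=3)
```

Since the two components are independent here, the alternating scheme optimizes each factor separately; a coupled cost is where the alternation genuinely matters.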
the current iterate\nX – (zero_vector(M,p)) the current gradient tangent vector\nk, i – internal counters for the outer and inner iterations, respectively.\n\nConstructors\n\nAlternatingGradientDescentState(M, p; kwargs...)\n\nGenerate the options for point p, where inner_iterations, order_type, order, retraction_method, stopping_criterion, and stepsize are keyword arguments\n\n\n\n\n\n","category":"type"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"Additionally, the options share a DirectionUpdateRule, which chooses the current component, so they can be decorated further; the innermost one should always be the following one, though.","category":"page"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"AlternatingGradient","category":"page"},{"location":"solvers/alternating_gradient_descent/#Manopt.AlternatingGradient","page":"Alternating Gradient Descent","title":"Manopt.AlternatingGradient","text":"AlternatingGradient <: DirectionUpdateRule\n\nThe default gradient processor, which just evaluates the (alternating) gradient on one of the components\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#tCG","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint Truncated Conjugate-Gradient Method","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"The aim is to solve the trust-region subproblem","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"operatorname*argmin_η T_xmathcalM m_x(η) = F(x) +\noperatornamegradF(x) η_x + frac12 \nmathcalHη 
η_x","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"textst η η_x leq Δ^2","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"on a manifold by using the Steihaug-Toint truncated conjugate-gradient method, abbreviated tCG-method. All terms involving the trust-region radius use an inner product w.r.t. the preconditioner; this is because the iterates grow in length w.r.t. the preconditioner, guaranteeing that we do not re-enter the trust-region.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Initialization","page":"Steihaug-Toint TCG Method","title":"Initialization","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Initialize η_0 = η if using randomized approach and η the zero tangent vector otherwise, r_0 = operatornamegradF(x), z_0 = operatornameP(r_0), δ_0 = z_0 and k=0","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Iteration","page":"Steihaug-Toint TCG Method","title":"Iteration","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Repeat until a convergence criterion is reached","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Set α =fracr_k z_k_xδ_k mathcalHδ_k_x and η_k η_k_x^* = η_k operatornameP(η_k)_x + 2α η_k operatornameP(δ_k)_x + α^2 δ_k operatornameP(δ_k)_x.\nIf δ_k mathcalHδ_k_x 0 or η_k η_k_x^* Δ^2 return η_k+1 = η_k + τ δ_k and stop.\nSet η_k^*= η_k + α δ_k, if η_k η_k_x + frac12 η_k operatornameHessF (η_k)_x_x η_k^* η_k^*_x + frac12 
η_k^* operatornameHessF (η_k)_ x_x set η_k+1 = η_k else set η_k+1 = η_k^*.\nSet r_k+1 = r_k + α mathcalHδ_k, z_k+1 = operatornameP(r_k+1), β = fracr_k+1 z_k+1_xr_k z_k _x and δ_k+1 = -z_k+1 + β δ_k.\nSet k=k+1.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Result","page":"Steihaug-Toint TCG Method","title":"Result","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"The result is given by the last computed η_k.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Remarks","page":"Steihaug-Toint TCG Method","title":"Remarks","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"The operatornameP() denotes the symmetric, positive definite preconditioner. It is required if a randomized approach is used i.e. using a random tangent vector η_0 as the initial vector. The idea behind it is to avoid saddle points. Preconditioning is simply a rescaling of the variables and thus a redefinition of the shape of the trust region. Ideally operatornameP() is a cheap, positive approximation of the inverse of the Hessian of F at x. 
By default, the preconditioner is the identity.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Regarding step 2: τ is obtained as the positive root of leftlVert η_k + τ δ_k rightrVert_operatornameP x = Δ which, after rearranging, becomes","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":" τ = frac-η_k operatornameP(δ_k)_x +\n sqrtη_k operatornameP(δ_k)_x^2 +\n δ_k operatornameP(δ_k)_x ( Δ^2 -\n η_k operatornameP(η_k)_x)\n δ_k operatornameP(δ_k)_x","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"It can occur that δ_k operatornameHessF (δ_k)_x_x = κ 0 at iteration k. In this case, the model is not strictly convex, and the stepsize α =fracr_k z_k_x κ computed in step 1 does not give a reduction in the model function m_x(). Indeed, m_x() is unbounded from below along the line η_k + α δ_k. If our aim is to minimize the model within the trust-region, it makes far more sense to reduce m_x() along η_k + α δ_k as much as we can while staying within the trust-region, and this means moving to the trust-region boundary along this line. Thus, when κ 0 at iteration k, we replace α = fracr_k z_k_xκ with τ described as above. The other possibility is that η_k+1 would lie outside the trust-region at iteration k (i.e. η_k η_k_x^* Δ^2, which can be identified with the norm of η_k+1). In particular, when operatornameHessF ()_x is positive definite and η_k+1 lies outside the trust region, the solution to the trust-region problem must lie on the trust-region boundary. Thus, there is no reason to continue with the conjugate gradient iteration, as it stands, as subsequent iterates will move further outside the trust-region boundary. 
A sensible strategy, just as in the case considered above, is to move to the trust-region boundary by finding τ.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Although it is virtually impossible in practice to know how many iterations are necessary to provide a good estimate η_k of the trust-region subproblem, the method stops after a certain number of iterations, which is realised by StopAfterIteration. In order to increase the convergence rate of the underlying trust-region method, see trust_regions, a typical stopping criterion is to stop as soon as an iteration k is reached for which","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":" Vert r_k Vert_x leqq Vert r_0 Vert_x min left( Vert r_0 Vert^θ_x κ right)","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"holds, where 0 κ 1 and θ 0 are chosen in advance. This is realized in this method by StopWhenResidualIsReducedByFactorOrPower. It can be shown that under appropriate conditions the iterates x_k of the underlying trust-region method converge to nondegenerate critical points with an order of convergence of at least min left( θ + 1 2 right), see Absil, Mahony, Sepulchre, Princeton University Press, 2008. The method also aborts if the curvature of the model is negative, i.e. if langle delta_k mathcalHδ_k rangle_x leqq 0, which is realised by StopWhenCurvatureIsNegative. If the next possible approximate solution η_k^* calculated in iteration k lies outside the trust region, i.e. if lVert η_k^* rVert_x geq Δ, then the method aborts, which is realised by StopWhenTrustRegionIsExceeded. 
Furthermore, the method aborts if the new model value evaluated at η_k^* is greater than the previous model value evaluated at η_k, which is realised by StopWhenModelIncreased.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Interface","page":"Steihaug-Toint TCG Method","title":"Interface","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":" truncated_conjugate_gradient_descent\n truncated_conjugate_gradient_descent!","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.truncated_conjugate_gradient_descent","page":"Steihaug-Toint TCG Method","title":"Manopt.truncated_conjugate_gradient_descent","text":"truncated_conjugate_gradient_descent(M, f, grad_f, p; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, p, X; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, Hess_f; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, X; kwargs...)\ntruncated_conjugate_gradient_descent(M, mho::ManifoldHessianObjective, p, X; kwargs...)\n\nsolve the trust-region subproblem\n\noperatorname*argmin_η T_pM\nm_p(η) quadtextwhere\nm_p(η) = f(p) + operatornamegrad f(p)η_x + frac12operatornameHess f(p)ηη_x\n\ntextsuch thatquad ηη_x Δ^2\n\non a manifold M by using the Steihaug-Toint truncated conjugate-gradient method, abbreviated tCG-method. For a description of the algorithm and theorems offering convergence guarantees, see the reference:\n\nP.-A. Absil, C.G. Baker, K.A. Gallivan, Trust-region methods on Riemannian manifolds, FoCM, 2007. doi: 10.1007/s10208-005-0179-9\nA. R. Conn, N. I. M. Gould, P. L. Toint, Trust-region methods, SIAM, MPS, 2000. 
doi: 10.1137/1.9780898719857\n\nInput\n\nSee the signatures above: you can leave out the Hessian, the vector, both the point and the vector, or all three.\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of F\nHess_f – (optional, cf. ApproxHessianFiniteDifference) the hessian operatornameHessf T_pmathcal M T_pmathcal M, X operatornameHessF(p)X = _Xoperatornamegradf(p)\np – a point on the manifold p mathcal M\nX – an update tangent vector X T_pmathcal M\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient and hessian work by allocation (default) or InplaceEvaluation in place\npreconditioner – a preconditioner for the hessian H\nθ – (1.0) 1+θ is the superlinear convergence target rate. The method aborts if the residual is less than or equal to the initial residual to the power of 1+θ.\nκ – (0.1) the linear convergence target rate. The method aborts if the residual is less than or equal to κ times the initial residual.\nrandomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\ntrust_region_radius – (injectivity_radius(M)/4) a trust-region radius\nproject! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! 
to activate projection.\nstopping_criterion – (StopAfterIteration | StopWhenResidualIsReducedByFactorOrPower | StopWhenCurvatureIsNegative | StopWhenTrustRegionIsExceeded) a functor inheriting from StoppingCriterion indicating when to stop, where for the default, the maximal number of iterations is set to the dimension of the manifold, the power factor is θ, the reduction factor is κ.\n\nand the ones that are passed to decorate_state! for decorators.\n\nOutput\n\nthe obtained (approximate) minimizer eta^*, see get_solver_return for details\n\nsee also\n\ntrust_regions\n\n\n\n\n\n","category":"function"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.truncated_conjugate_gradient_descent!","page":"Steihaug-Toint TCG Method","title":"Manopt.truncated_conjugate_gradient_descent!","text":"truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p, X; kwargs...)\ntruncated_conjugate_gradient_descent!(M, f, grad_f, p, X; kwargs...)\n\nsolve the trust-region subproblem in place of X (and p).\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of f\nHess_f – the hessian operatornameHessf(x) T_pmathcal M T_pmathcal M, X operatornameHessf(p)X\np – a point on the manifold p mathcal M\nX – an update tangent vector X T_xmathcal M\n\nFor more details and all optional arguments, see truncated_conjugate_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/truncated_conjugate_gradient_descent/#State","page":"Steihaug-Toint TCG Method","title":"State","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"TruncatedConjugateGradientState","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.TruncatedConjugateGradientState","page":"Steihaug-Toint TCG 
Method","title":"Manopt.TruncatedConjugateGradientState","text":"TruncatedConjugateGradientState <: AbstractHessianSolverState\n\ndescribe the Steihaug-Toint truncated conjugate-gradient method, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\nx : a point, where the trust-region subproblem needs to be solved\nη : a tangent vector (called update vector), which solves the trust-region subproblem after successful calculation by the algorithm\nstop : a StoppingCriterion.\ngradient : the gradient at the current iterate\nδ : search direction\ntrust_region_radius : (injectivity_radius(M)/4) the trust-region radius\nresidual : the gradient\nrandomize : indicates if the trust-region solve and so the algorithm is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\nproject! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! 
to activate projection.\n\nConstructor\n\nTruncatedConjugateGradientState(M, p=rand(M), η=zero_vector(M,p);\n trust_region_radius=injectivity_radius(M)/4,\n randomize=false,\n θ=1.0,\n κ=0.1,\n project!=copyto!,\n)\n\nand a slightly involved `stopping_criterion`\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Stopping-Criteria","page":"Steihaug-Toint TCG Method","title":"Stopping Criteria","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"StopWhenResidualIsReducedByFactorOrPower\nStopWhenTrustRegionIsExceeded\nStopWhenCurvatureIsNegative\nStopWhenModelIncreased\nupdate_stopping_criterion!(::StopWhenResidualIsReducedByFactorOrPower, ::Val{:ResidualPower}, ::Any)\nupdate_stopping_criterion!(::StopWhenResidualIsReducedByFactorOrPower, ::Val{:ResidualFactor}, ::Any)","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenResidualIsReducedByFactorOrPower","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenResidualIsReducedByFactorOrPower","text":"StopWhenResidualIsReducedByFactorOrPower <: StoppingCriterion\n\nA functor for testing if the norm of residual at the current iterate is reduced either by a power of 1+θ or by a factor κ compared to the norm of the initial residual, i.e. 
Vert r_k Vert_x leqq Vert r_0 Vert_x \nmin left( kappa Vert r_0 Vert_x^theta right).\n\nFields\n\nκ – the reduction factor\nθ – part of the reduction power\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)\n\ninitialize the StopWhenResidualIsReducedByFactorOrPower functor to indicate to stop after the norm of the current residual is less than either the norm of the initial residual to the power of 1+θ or the norm of the initial residual times κ.\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenTrustRegionIsExceeded","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenTrustRegionIsExceeded","text":"StopWhenTrustRegionIsExceeded <: StoppingCriterion\n\nA functor for testing if the norm of the next iterate in the Steihaug-Toint tcg method is larger than the trust-region radius, i.e. Vert η_k^* Vert_x trust_region_radius. Terminate the algorithm when the trust region has been left.\n\nFields\n\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenTrustRegionIsExceeded()\n\ninitialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius.\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenCurvatureIsNegative","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenCurvatureIsNegative","text":"StopWhenCurvatureIsNegative <: StoppingCriterion\n\nA functor for testing if the curvature of the model is negative, i.e. langle delta_k operatornameHessF(delta_k)rangle_x leqq 0. 
In this case, the model is not strictly convex, and the stepsize as computed does not give a reduction of the model.\n\nFields\n\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenCurvatureIsNegative()\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenModelIncreased","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenModelIncreased","text":"StopWhenModelIncreased <: StoppingCriterion\n\nA functor for testing if the model value increased.\n\nFields\n\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenModelIncreased()\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.update_stopping_criterion!-Tuple{StopWhenResidualIsReducedByFactorOrPower, Val{:ResidualPower}, Any}","page":"Steihaug-Toint TCG Method","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualPower, v)\n\nUpdate the residual power θ to v.\n\n\n\n\n\n","category":"method"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.update_stopping_criterion!-Tuple{StopWhenResidualIsReducedByFactorOrPower, Val{:ResidualFactor}, Any}","page":"Steihaug-Toint TCG Method","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualFactor, v)\n\nUpdate the residual factor κ to v.\n\n\n\n\n\n","category":"method"},{"location":"solvers/truncated_conjugate_gradient_descent/#Literature","page":"Steihaug-Toint TCG Method","title":"Literature","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG 
Method","title":"Steihaug-Toint TCG Method","text":"Pages = [\"solvers/truncated_conjugate_gradient_descent.md\"]\nCanonical=false","category":"page"},{"location":"solvers/LevenbergMarquardt/#Levenberg-Marquardt","page":"Levenberg–Marquardt","title":"Levenberg-Marquardt","text":"","category":"section"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"LevenbergMarquardt\nLevenbergMarquardt!","category":"page"},{"location":"solvers/LevenbergMarquardt/#Manopt.LevenbergMarquardt","page":"Levenberg–Marquardt","title":"Manopt.LevenbergMarquardt","text":"LevenbergMarquardt(M, f, jacobian_f, p, num_components=-1)\n\nSolve an optimization problem of the form\n\noperatornameargmin_p mathcal M frac12 lVert f(p) rVert^2\n\nwhere fcolonmathcal M to ℝ^d is a continuously differentiable function, using the Riemannian Levenberg-Marquardt algorithm Peeters, Tech. Rep., 1993. The implementation follows Algorithm 1 Adachi, Okuno, Takeda, Preprint, 2022\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal Mℝ^d\njacobian_f – the Jacobian of f. The Jacobian jacF is supposed to accept a keyword argument basis_domain which specifies basis of the tangent space at a given point in which the Jacobian is to be calculated. By default it should be the DefaultOrthonormalBasis.\np – an initial value p mathcal M\nnum_components – length of the vector returned by the cost function (d). By default its value is -1 which means that it will be determined automatically by calling F one additional time. 
Only possible when evaluation is AllocatingEvaluation, for mutating evaluation this must be explicitly specified.\n\nThese can also be passed as a NonlinearLeastSquaresObjective, then the keyword jacobian_tangent_basis below is ignored\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x).\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M,x,ξ) to use.\nstopping_criterion – (StopWhenAny(StopAfterIteration(200),StopWhenGradientNormLess(1e-12))) a functor inheriting from StoppingCriterion indicating when to stop.\nexpect_zero_residual – (false) whether or not the algorithm might expect that the value of residual (objective) at minimum is equal to 0.\nη – Scaling factor for the sufficient cost decrease threshold required to accept new proposal points. Allowed range: 0 < η < 1.\ndamping_term_min – initial (and also minimal) value of the damping term\nβ – parameter by which the damping term is multiplied when the current new point is rejected\ninitial_residual_values – the initial residual vector of the cost function f.\ninitial_jacobian_f – the initial Jacobian of the cost function f.\njacobian_tangent_basis - AbstractBasis specify the basis of the tangent space for jacobian_f.\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\nReferences\n\n\n\n\n\n","category":"function"},{"location":"solvers/LevenbergMarquardt/#Manopt.LevenbergMarquardt!","page":"Levenberg–Marquardt","title":"Manopt.LevenbergMarquardt!","text":"LevenbergMarquardt!(M, f, jacobian_f, p, num_components=-1; kwargs...)\n\nFor more options see LevenbergMarquardt.\n\n\n\n\n\n","category":"function"},{"location":"solvers/LevenbergMarquardt/#Options","page":"Levenberg–Marquardt","title":"Options","text":"","category":"section"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"LevenbergMarquardtState","category":"page"},{"location":"solvers/LevenbergMarquardt/#Manopt.LevenbergMarquardtState","page":"Levenberg–Marquardt","title":"Manopt.LevenbergMarquardtState","text":"LevenbergMarquardtState{P,T} <: AbstractGradientSolverState\n\nDescribes a Gradient based descent algorithm, with\n\nFields\n\nA default value is given in brackets if a parameter can be left out in initialization.\n\nx – a point (of type P) on a manifold as starting point\nstop – (StopAfterIteration(200) | StopWhenGradientNormLess(1e-12) | StopWhenStepsizeLess(1e-12)) a StoppingCriterion\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use, defaults to the default set for your manifold.\nresidual_values – value of F calculated in the solver setup or the previous iteration\nresidual_values_temp – value of F for the current proposal point\njacF – the current Jacobian of F\ngradient – the current gradient of F\nstep_vector – the tangent vector at x that is used to move to the next point\nlast_stepsize – length of step_vector\nη – Scaling factor for the sufficient cost decrease threshold required to accept new proposal points. 
Allowed range: 0 < η < 1.\ndamping_term – current value of the damping term\ndamping_term_min – initial (and also minimal) value of the damping term\nβ – parameter by which the damping term is multiplied when the current new point is rejected\nexpect_zero_residual – (false) if true, the algorithm expects that the value of residual (objective) at minimum is equal to 0.\n\nConstructor\n\nLevenbergMarquardtState(M, initialX, initial_residual_values, initial_jacF; initial_vector, kwargs...)\n\nGenerate Levenberg-Marquardt options.\n\nSee also\n\ngradient_descent, LevenbergMarquardt\n\n\n\n\n\n","category":"type"},{"location":"solvers/LevenbergMarquardt/#Literature","page":"Levenberg–Marquardt","title":"Literature","text":"","category":"section"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"Pages = [\"solvers/LevenbergMarquardt.md\"]\nCanonical=false","category":"page"},{"location":"solvers/exact_penalty_method/#ExactPenaltySolver","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":" exact_penalty_method\n exact_penalty_method!","category":"page"},{"location":"solvers/exact_penalty_method/#Manopt.exact_penalty_method","page":"Exact Penalty Method","title":"Manopt.exact_penalty_method","text":"exact_penalty_method(M, F, gradF, p=rand(M); kwargs...)\nexact_penalty_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)\n\nperform the exact penalty method (EPM) Liu, Boumal, 2019, Appl. Math. 
Optim The aim of the EPM is to find a solution of the constrained optimisation task\n\nbeginaligned\nmin_p mathcalM f(p)\ntextsubject to g_i(p)leq 0 quad text for i= 1 m\nquad h_j(p)=0 quad text for j=1n\nendaligned\n\nwhere M is a Riemannian manifold, and f, g_i_i=1^m and h_j_j=1^n are twice continuously differentiable functions from M to ℝ. For that a weighted L_1-penalty term for the violation of the constraints is added to the objective\n\nf(x) + ρ (sum_i=1^m maxleft0 g_i(x)right + sum_j=1^n vert h_j(x)vert)\n\nwhere ρ0 is the penalty parameter. Since this is non-smooth, a SmoothingTechnique with parameter u is applied, see the ExactPenaltyCost.\n\nIn every step k of the exact penalty method, the smoothed objective is then minimized over all x mathcalM. Then, the accuracy tolerance ϵ and the smoothing parameter u are updated by setting\n\nϵ^(k)=maxϵ_min θ_ϵ ϵ^(k-1)\n\nwhere ϵ_min is the lowest value ϵ is allowed to become and θ_ϵ (01) is constant scaling factor, and\n\nu^(k) = max u_min theta_u u^(k-1) \n\nwhere u_min is the lowest value u is allowed to become and θ_u (01) is constant scaling factor.\n\nLast, we update the penalty parameter ρ according to\n\nρ^(k) = begincases\nρ^(k-1)θ_ρ textif displaystyle max_j in mathcalEi in mathcalI Bigl vert h_j(x^(k)) vert g_i(x^(k))Bigr geq u^(k-1) Bigr) \nρ^(k-1) textelse\nendcases\n\nwhere θ_ρ in (01) is a constant scaling factor.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\ngrad_f – the gradient of the cost function\n\nOptional (if not called with the ConstrainedManifoldObjective cmo)\n\ng – (nothing) the inequality constraints\nh – (nothing) the equality constraints\ngrad_g – (nothing) the gradient of the inequality constraints\ngrad_h – (nothing) the gradient of the equality constraints\n\nNote that one of the pairs (g, grad_g) or (h, grad_h) has to be provided. Otherwise the problem is not constrained and you can also call e.g. 
quasi_Newton\n\nOptional\n\nsmoothing – (LogarithmicSumOfExponentials) SmoothingTechnique to use\nϵ – (1e–3) the accuracy tolerance\nϵ_exponent – (1/100) exponent of the ϵ update factor;\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nu – (1e–1) the smoothing parameter and threshold for violation of the constraints\nu_exponent – (1/100) exponent of the u update factor;\nu_min – (1e-6) the lower bound for the smoothing parameter and threshold for violation of the constraints\nρ – (1.0) the penalty parameter\nmin_stepsize – (1e-10) the minimal step size\nsub_cost – (ExactPenaltyCost(problem, ρ, u; smoothing=smoothing)) use this exact penalty cost, especially with the same numbers ρ,u as in the options for the sub problem\nsub_grad – (ExactPenaltyGrad(problem, ρ, u; smoothing=smoothing)) use this exact penalty gradient, especially with the same numbers ρ,u as in the options for the sub problem\nsub_kwargs – keyword arguments to decorate the sub options, e.g. with debug.\nsub_stopping_criterion – (StopAfterIteration(200) |StopWhenGradientNormLess(ϵ) |StopWhenStepsizeLess(1e-10)) specify a stopping criterion for the subsolver.\nsub_problem – (DefaultManoptProblem(M,ManifoldGradientObjective(sub_cost, sub_grad; evaluation=evaluation) – ` problem for the subsolver\nsub_state – (QuasiNewtonState) using QuasiNewtonLimitedMemoryDirectionUpdate with InverseBFGS and sub_stopping_criterion as a stopping criterion. 
See also sub_kwargs.\nstopping_criterion – (StopAfterIteration(300) | (StopWhenSmallerOrEqual(ϵ, ϵ_min) & StopWhenChangeLess(1e-10))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/exact_penalty_method/#Manopt.exact_penalty_method!","page":"Exact Penalty Method","title":"Manopt.exact_penalty_method!","text":"exact_penalty_method!(M, f, grad_f, p; kwargs...)\nexact_penalty_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)\n\nperform the exact penalty method (EPM) in place of p.\n\nFor all options, see exact_penalty_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/exact_penalty_method/#State","page":"Exact Penalty Method","title":"State","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"ExactPenaltyMethodState","category":"page"},{"location":"solvers/exact_penalty_method/#Manopt.ExactPenaltyMethodState","page":"Exact Penalty Method","title":"Manopt.ExactPenaltyMethodState","text":"ExactPenaltyMethodState{P,T} <: AbstractManoptSolverState\n\nDescribes the exact penalty method, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\np – a point on a manifold as starting point\nsub_problem – an AbstractManoptProblem problem for the subsolver\nsub_state – an AbstractManoptSolverState for the subsolver\nϵ – (1e-3) the accuracy tolerance\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nu – (1e-1) the smoothing parameter and threshold for violation of the constraints\nu_min – (1e-6) the lower bound for the smoothing parameter and threshold for violation of the constraints\nρ – (1.0) the penalty parameter\nθ_ρ – (0.3) the scaling factor of the penalty parameter\nstopping_criterion – 
(StopWhenAny(StopAfterIteration(300),StopWhenAll(StopWhenSmallerOrEqual(ϵ, ϵ_min),StopWhenChangeLess(min_stepsize)))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nConstructor\n\nExactPenaltyMethodState(M::AbstractManifold, p, sub_problem, sub_state; kwargs...)\n\nconstruct an exact penalty method state with the fields and defaults as above, where the manifold M is used for defaults in the keyword arguments.\n\nSee also\n\nexact_penalty_method\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Helping-Functions","page":"Exact Penalty Method","title":"Helping Functions","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"ExactPenaltyCost\nExactPenaltyGrad\nSmoothingTechnique\nLinearQuadraticHuber\nLogarithmicSumOfExponentials","category":"page"},{"location":"solvers/exact_penalty_method/#Manopt.ExactPenaltyCost","page":"Exact Penalty Method","title":"Manopt.ExactPenaltyCost","text":"ExactPenaltyCost{S, Pr, R}\n\nRepresent the cost of the exact penalty method based on a ConstrainedManifoldObjective P and a parameter ρ given by\n\nf(p) + ρBigl(\n sum_i=0^m max0g_i(p) + sum_j=0^n lvert h_j(p)rvert\nBigr)\n\nwhere we use an additional parameter u and a smoothing technique, e.g. LogarithmicSumOfExponentials or LinearQuadraticHuber to obtain a smooth cost function. 
This struct is also a functor (M,p) -> v of the cost v.\n\nFields\n\nP, ρ, u as mentioned above.\n\nConstructor\n\nExactPenaltyCost(co::ConstrainedManifoldObjective, ρ, u; smoothing=LinearQuadraticHuber())\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.ExactPenaltyGrad","page":"Exact Penalty Method","title":"Manopt.ExactPenaltyGrad","text":"ExactPenaltyGrad{S, CO, R}\n\nRepresent the gradient of the ExactPenaltyCost based on a ConstrainedManifoldObjective co and a parameter ρ and a smoothing technique, which uses an additional parameter u.\n\nThis struct is also a functor in both formats\n\n(M, p) -> X to compute the gradient in allocating fashion.\n(M, X, p) to compute the gradient in in-place fashion.\n\nFields\n\nP, ρ, u as mentioned above.\n\nConstructor\n\nExactPenaltyGradient(co::ConstrainedManifoldObjective, ρ, u; smoothing=LinearQuadraticHuber())\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.SmoothingTechnique","page":"Exact Penalty Method","title":"Manopt.SmoothingTechnique","text":"abstract type SmoothingTechnique\n\nSpecify a smoothing technique, e.g. 
for the ExactPenaltyCost and ExactPenaltyGrad.\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.LinearQuadraticHuber","page":"Exact Penalty Method","title":"Manopt.LinearQuadraticHuber","text":"LinearQuadraticHuber <: SmoothingTechnique\n\nSpecify a smoothing based on max0x mathcal P(xu) for some u, where\n\nmathcal P(x u) = begincases\n 0 text if x leq 0\n fracx^22u text if 0 leq x leq u\n x-fracu2 text if x geq u\nendcases\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.LogarithmicSumOfExponentials","page":"Exact Penalty Method","title":"Manopt.LogarithmicSumOfExponentials","text":"LogarithmicSumOfExponentials <: SmoothingTechnique\n\nSpecify a smoothing based on maxab u log(mathrme^fracau+mathrme^fracbu) for some u.\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Literature","page":"Exact Penalty Method","title":"Literature","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"Pages = [\"solvers/exact_penalty_method.md\"]\nCanonical=false","category":"page"},{"location":"functions/proximal_maps/#proximalMapFunctions","page":"Proximal Maps","title":"Proximal Maps","text":"","category":"section"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"For a function varphimathcal M ℝ the proximal map is defined as","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"displaystyleoperatornameprox_λvarphi(x)\n= operatorname*argmin_y mathcal M d_mathcal M^2(xy) + λvarphi(y)\nquad λ 0","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"where d_mathcal M mathcal M times mathcal M ℝ denotes the geodesic distance on mathcal M. 
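This definition can be checked numerically in the Euclidean special case mathcal M = ℝ, where d(x,y) = lvert x-yrvert and, for varphi(y) = lvert yrvert, the proximal map is a soft-thresholding of x by λ/2 (note that the definition used here has no factor 1/2 in front of d^2). A minimal self-contained sketch in plain Julia; the function names are ours for illustration, not part of Manopt:

```julia
# Proximal map of φ(y) = |y| on M = ℝ with the definition
#   prox_{λφ}(x) = argmin_y d²(x,y) + λ φ(y),   d(x,y) = |x - y|,
# whose closed form is soft-thresholding by λ/2.
soft_threshold(x, λ) = sign(x) * max(abs(x) - λ / 2, 0.0)

x, λ = 0.8, 0.5
obj(y) = (x - y)^2 + λ * abs(y)
ys = range(-2, 2; length=400_001)     # brute-force grid search on [-2, 2]
y_grid = ys[argmin(obj.(ys))]

@assert isapprox(y_grid, soft_threshold(x, λ); atol=1e-4)   # both ≈ 0.55
```

In particular prox_λφ(0) = 0 here, illustrating the fixed-point property of a minimizer of varphi stated below.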
While it might still be difficult to compute the minimizer, there are several proximal maps known (locally) in closed form. Furthermore, if x^star mathcal M is a minimizer of varphi, then","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"displaystyleoperatornameprox_λvarphi(x^star) = x^star","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"i.e. a minimizer is a fixed point of the proximal map.","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"This page lists all proximal maps available within Manopt. To add your own, just extend the functions/proximal_maps.jl file.","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"Modules = [Manopt]\nPages = [\"proximal_maps.jl\"]","category":"page"},{"location":"functions/proximal_maps/#Manopt.project_collaborative_TV","page":"Proximal Maps","title":"Manopt.project_collaborative_TV","text":"project_collaborative_TV(M, λ, x, Ξ[, p=2,q=1])\nproject_collaborative_TV!(M, Θ, λ, x, Ξ[, p=2,q=1])\n\ncompute the projection onto collaborative Norm unit (or α-) ball, i.e. of the function\n\nF^q(x) = sum_imathcal G\n Bigl( sum_jmathcal I_i\n sum_k=1^d lVert X_ijrVert_x^pBigr)^fracqp\n\nwhere mathcal G is the set of indices for xmathcal M and mathcal I_i is the set of its forward neighbors. The computation can also be done in place of Θ.\n\nThis is adopted from the paper Duran, Möller, Sbert, Cremers, SIAM J Imag Sci, 2016, see their Example 3 for details.\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Manopt.prox_TV","page":"Proximal Maps","title":"Manopt.prox_TV","text":"ξ = prox_TV(M,λ,x [,p=1])\n\ncompute the proximal maps operatornameprox_λvarphi of all forward differences occurring in the power manifold array, i.e. 
varphi(xixj) = d_mathcal M^p(xixj), where xi and xj are array elements of x and j = i+e_k, where e_k is the kth unit vector. The parameter λ is the prox parameter.\n\nInput\n\nM – a manifold M\nλ – a real value, parameter of the proximal map\nx – a point.\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\ny – resulting point with all mentioned proximal points evaluated (in a cyclic order). The computation can also be done in place.\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Manopt.prox_TV-Union{Tuple{T}, Tuple{AbstractManifold, Number, Tuple{T, T}}, Tuple{AbstractManifold, Number, Tuple{T, T}, Int64}} where T","page":"Proximal Maps","title":"Manopt.prox_TV","text":"[y1,y2] = prox_TV(M, λ, [x1,x2] [,p=1])\nprox_TV!(M, [y1,y2] λ, [x1,x2] [,p=1])\n\nCompute the proximal map operatornameprox_λvarphi of φ(xy) = d_mathcal M^p(xy) with parameter λ.\n\nInput\n\nM – a manifold M\nλ – a real value, parameter of the proximal map\n(x1,x2) – a tuple of two points\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\n(y1,y2) – resulting tuple of points of the operatornameprox_λφ((x1,x2)). The result can also be computed in place.\n\n\n\n\n\n","category":"method"},{"location":"functions/proximal_maps/#Manopt.prox_TV2-Union{Tuple{T}, Tuple{AbstractManifold, Any, Tuple{T, T, T}}, Tuple{AbstractManifold, Any, Tuple{T, T, T}, Int64}} where T","page":"Proximal Maps","title":"Manopt.prox_TV2","text":"(y1,y2,y3) = prox_TV2(M,λ,(x1,x2,x3),[p=1], kwargs...)\nprox_TV2!(M, y, λ,(x1,x2,x3),[p=1], kwargs...)\n\nCompute the proximal map operatornameprox_λvarphi of varphi(x_1x_2x_3) = d_mathcal M^p(c(x_1x_3)x_2) with parameter λ>0, where c(xz) denotes the mid point of a shortest geodesic from x1 to x3 that is closest to x2. 
The result can be computed in place of y.\n\nInput\n\nM – a manifold\nλ – a real value, parameter of the proximal map\n(x1,x2,x3) – a tuple of three points\np – (1) exponent of the distance of the TV term\n\nOptional\n\nkwargs... – parameters for the internal subgradient_method (if M is neither Euclidean nor Circle, since for these a closed form is given)\n\nOutput\n\n(y1,y2,y3) – resulting tuple of points of the proximal map. The computation can also be done in place.\n\n\n\n\n\n","category":"method"},{"location":"functions/proximal_maps/#Manopt.prox_TV2-Union{Tuple{T}, Tuple{N}, Tuple{PowerManifold{N, T}, Any, Any}, Tuple{PowerManifold{N, T}, Any, Any, Int64}} where {N, T}","page":"Proximal Maps","title":"Manopt.prox_TV2","text":"y = prox_TV2(M, λ, x[, p=1])\nprox_TV2!(M, y, λ, x[, p=1])\n\ncompute the proximal maps operatornameprox_λvarphi of all centered second order differences occurring in the power manifold array, i.e. varphi(x_kx_ix_j) = d_2(x_kx_ix_j), where kj are backward and forward neighbors (along any dimension in the array of x). The parameter λ is the prox parameter.\n\nInput\n\nM – a manifold M\nλ – a real value, parameter of the proximal map\nx – a point.\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\ny – resulting point with all mentioned proximal points evaluated (in a cyclic order). The computation can also be done in place.\n\n\n\n\n\n","category":"method"},{"location":"functions/proximal_maps/#Manopt.prox_distance","page":"Proximal Maps","title":"Manopt.prox_distance","text":"y = prox_distance(M,λ,f,x [, p=2])\nprox_distance!(M, y, λ, f, x [, p=2])\n\ncompute the proximal map operatornameprox_λvarphi with parameter λ of φ(x) = frac1pd_mathcal M^p(fx). 
For the mutating variant the computation is done in place of y.\n\nInput\n\nM – a manifold M\nλ – the prox parameter\nf – a point f mathcal M (the data)\nx – the argument of the proximal map\n\nOptional argument\n\np – (2) exponent of the distance.\n\nOutput\n\ny – the result of the proximal map of φ\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Manopt.prox_parallel_TV","page":"Proximal Maps","title":"Manopt.prox_parallel_TV","text":"y = prox_parallel_TV(M, λ, x [,p=1])\nprox_parallel_TV!(M, y, λ, x [,p=1])\n\ncompute the proximal maps operatornameprox_λφ of all forward differences occurring in the power manifold array, i.e. φ(x_ix_j) = d_mathcal M^p(x_ix_j), where xi and xj are array elements of x and j = i+e_k, where e_k is the kth unit vector. The parameter λ is the prox parameter.\n\nInput\n\nM – a PowerManifold manifold\nλ – a real value, parameter of the proximal map\nx – a point\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\ny – resulting Array of points with all mentioned proximal points evaluated (in parallel within the array's elements). 
The computation can also be done in place.\n\nSee also prox_TV\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Literature","page":"Proximal Maps","title":"Literature","text":"","category":"section"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"Pages = [\"functions/proximal_maps.md\"]\nCanonical=false","category":"page"},{"location":"plans/#planSection","page":"Specify a Solver","title":"Plans for solvers","text":"","category":"section"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"For any optimisation performed in Manopt.jl we need information about both the optimisation task or “problem” at hand as well as the solver and all its parameters. This together is called a plan in Manopt.jl and it consists of two data structures:","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"The Manopt Problem describes all static data of our task, most prominently the manifold and the objective.\nThe Solver State describes all varying data and parameters for the solver we aim to use. This also means that each solver has its own data structure for the state.","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"By splitting these two parts, we can use one problem and solve it using different solvers.","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"Still there might be the need to set certain parameters within any of these structures. 
For that there is","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"set_manopt_parameter!\nManopt.status_summary","category":"page"},{"location":"plans/#Manopt.set_manopt_parameter!","page":"Specify a Solver","title":"Manopt.set_manopt_parameter!","text":"set_manopt_parameter!(f, element::Symbol , args...)\n\nFor any f and a Symbol element, we dispatch on its value, by default, to set some args... in f or one of its sub elements.\n\n\n\n\n\nset_manopt_parameter!(amo::AbstractManifoldObjective, element::Symbol, args...)\n\nSet certain args... of the AbstractManifoldObjective amo. This function should dispatch on Val(element).\n\nCurrently supported\n\n:Cost passes to the get_cost_function\n:Gradient passes to the get_gradient_function\n\n\n\n\n\nset_manopt_parameter!(ams::AbstractManoptProblem, element::Symbol, field::Symbol , value)\n\nSet a certain field/element of the AbstractManoptProblem ams to value. This function should dispatch on Val(element).\n\nBy default this passes on to the inner objective, see set_manopt_parameter!\n\n\n\n\n\nset_manopt_parameter!(ams::AbstractManoptSolverState, element::Symbol, args...)\n\nSet a certain field/element of the AbstractManoptSolverState ams to value. This function dispatches on Val(element).\n\n\n\n\n\nset_manopt_parameter!(ams::DebugSolverState, ::Val{:Debug}, args...)\n\nSet certain values specified by args... into the elements of the debugDictionary\n\n\n\n\n\nset_manopt_parameter!(ams::DebugSolverState, ::Val{:SubProblem}, args...)\n\nSet certain values specified by args... to the sub problem.\n\n\n\n\n\nset_manopt_parameter!(ams::DebugSolverState, ::Val{:SubState}, args...)\n\nSet certain values specified by args... 
to the sub state.\n\n\n\n\n\n","category":"function"},{"location":"plans/#Manopt.status_summary","page":"Specify a Solver","title":"Manopt.status_summary","text":"status_summary(e)\n\nReturn a string reporting about the current status of e, where e is a type from Manopt, e.g. an AbstractManoptSolverState.\n\nThis method is similar to show but just returns a string. It might also be more verbose in explaining, or hide internal information.\n\n\n\n\n\n","category":"function"},{"location":"tutorials/ConstrainedOptimization/#How-to-do-Constrained-Optimization","page":"Do Constrained Optimization","title":"How to do Constrained Optimization","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"This tutorial is a short introduction to using solvers for constrained optimisation in Manopt.jl.","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Introduction","page":"Do Constrained Optimization","title":"Introduction","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"A constrained optimisation problem is given by","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"tagP\nbeginalign*\noperatorname*argmin_pinmathcal M f(p)\ntextsuch that quad g(p) leq 0\nquad h(p) = 0\nendalign*","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"where fcolon mathcal M ℝ is a cost function, and gcolon mathcal M ℝ^m and hcolon mathcal M ℝ^n are the inequality and equality constraints, respectively. 
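In code, the three ingredients of (P) become plain Julia functions of the manifold and a point; a schematic sketch with placeholder cost and constraints, purely for illustration of the shapes involved:

```julia
using Manifolds

M = Sphere(2)                 # the manifold the problem lives on
f(M, p) = p[1]^2              # cost f: M → ℝ
g(M, p) = [-p[1], -p[2]]      # inequality constraints, g(p) ≤ 0 elementwise
h(M, p) = [p[3]]              # equality constraint, h(p) = 0
```
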
The leq and = in (P) are meant elementwise.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"This can be seen as a balance between moving constraints into the geometry of a manifold mathcal M and keeping some, since they can be handled well in algorithms, see [BH19], [LB19] for details.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"using Distributions, LinearAlgebra, Manifolds, Manopt, Random\nRandom.seed!(42);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"In this tutorial we want to look at different ways to specify the problem and its implications. We start by specifying an example problem to illustrate the different available forms.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We will consider the problem of a Nonnegative PCA, cf. 
Section 5.1.2 in [LB19]","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"let v_0 ℝ^d, lVert v_0 rVert=1 be a given spike signal, that is, a signal that is sparse with only s=lfloor δd rfloor nonzero entries.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Z = sqrtσ v_0v_0^mathrmT+N","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"where sigma is a signal-to-noise ratio and N is a matrix with random entries, distributed with zero mean and standard deviation 1d on the off-diagonals and 2d on the diagonal.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"d = 150; # dimension of v0\nσ = 0.1^2; # SNR\nδ = 0.1; s = Int(floor(δ * d)); # Sparsity\nS = sample(1:d, s; replace=false);\nv0 = [i ∈ S ? 
1 / sqrt(s) : 0.0 for i in 1:d];\nN = rand(Normal(0, 1 / d), (d, d)); N[diagind(N, 0)] .= rand(Normal(0, 2 / d), d);\nZ = sqrt(σ) * v0 * transpose(v0) + N;","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"In order to recover v_0 we consider the constrained optimisation problem on the sphere mathcal S^d-1 given by","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"beginalign*\noperatorname*argmin_pinmathcal S^d-1 -p^mathrmTZp\ntextsuch that quad p geq 0\nendalign*","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"or in the previous notation f(p) = -p^mathrmTZp and g(p) = -p. We first initialize the manifold under consideration","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"M = Sphere(d - 1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Sphere(149, ℝ)","category":"page"},{"location":"tutorials/ConstrainedOptimization/#A-first-Augmented-Lagrangian-Run","page":"Do Constrained Optimization","title":"A first Augmented Lagrangian Run","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We first define f and g as usual functions","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, p) = -transpose(p) * Z * p;\ng(M, p) = -p;","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained 
Optimization","title":"Do Constrained Optimization","text":"Since f is a function defined in the embedding ℝ^d as well, we obtain its gradient by projection.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"grad_f(M, p) = project(M, p, -transpose(Z) * p - Z * p);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"For the constraints this is a little more involved, since each function g_i = g(p)_i = -p_i has to return its own gradient. These are again in the embedding just operatornamegrad g_i(p) = -e_i, the ith unit vector. We can project these again onto the tangent space at p:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"grad_g(M, p) = project.(\n Ref(M), Ref(p), [[i == j ? -1.0 : 0.0 for j in 1:d] for i in 1:d]\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We further start at a random point:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"p0 = rand(M);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Let’s check a few things for the initial point","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, p0)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained 
Optimization","text":"0.005747604833124234","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"How much the function g is positive","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, p0))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"0.17885478285466855","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Now as a first method we can just call the Augmented Lagrangian Method with a simple call:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v1 = augmented_Lagrangian_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug=[:Iteration, :Cost, :Stop, \" | \", (:Change, \"Δp : %1.5e\"), 20, \"\\n\"],\n stopping_criterion = StopAfterIteration(300) | (\n StopWhenSmallerOrEqual(:ϵ, 1e-5) & StopWhenChangeLess(1e-8)\n )\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 20 f(x): -0.123842 | Δp : 9.99682e-01\n# 40 f(x): -0.123842 | Δp : 8.13541e-07\n# 60 f(x): -0.123842 | Δp : 7.85694e-04\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-5).\nThe algorithm performed a step with a change (1.7450108123172955e-15) less than 9.77237220955808e-6.\n 16.843524 seconds (43.34 M allocations: 32.293 GiB, 10.65% gc time, 37.25% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Now we have 
both a lower function value and the point is nearly within the constraints, … up to numerical inaccuracies","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384244779997305","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum( g(M, v1) )","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"7.912675333644102e-18","category":"page"},{"location":"tutorials/ConstrainedOptimization/#A-faster-Augmented-Lagrangian-Run","page":"Do Constrained Optimization","title":"A faster Augmented Lagrangian Run","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Now this is a little slow, so we can modify two things; we will do both directly, but one could also change just one of them:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Gradients should be evaluated in place, so for example","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"grad_f!(M, X, p) = project!(M, X, p, -transpose(Z) * p - Z * p);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"The constraints are currently always evaluated all together, since the function grad_g always returns a vector of gradients. 
We first change the constraints function into a vector of functions. We further change the gradient into a vector of gradient functions operatornamegrad g_i i=1ldotsd, which are computed in place.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"g2 = [(M, p) -> -p[i] for i in 1:d];\ngrad_g2! = [\n (M, X, p) -> project!(M, X, p, [i == j ? -1.0 : 0.0 for j in 1:d]) for i in 1:d\n];","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We obtain","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v2 = augmented_Lagrangian_method(\n M, f, grad_f!, p0; g=g2, grad_g=grad_g2!, evaluation=InplaceEvaluation(),\n debug=[:Iteration, :Cost, :Stop, \" | \", (:Change, \"Δp : %1.5e\"), 20, \"\\n\"],\n stopping_criterion = StopAfterIteration(300) | (\n StopWhenSmallerOrEqual(:ϵ, 1e-5) & StopWhenChangeLess(1e-8)\n )\n );","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 20 f(x): -0.123842 | Δp : 9.99544e-01\n# 40 f(x): -0.123842 | Δp : 1.92065e-03\n# 60 f(x): -0.123842 | Δp : 4.84931e-06\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-5).\nThe algorithm performed a step with a change (2.7435918100802105e-17) less than 9.77237220955808e-6.\n 3.547284 seconds (6.52 M allocations: 3.728 GiB, 6.70% gc time, 41.27% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"As a technical remark: Note that (by default) the change to InplaceEvaluations affects both the constrained solver as well 
as the inner solver of the subproblem in each iteration.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v2)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384239276300012","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, v2))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"2.2466899389459647e-18","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"These are very similar to the previous values, but the solver took much less time and needed fewer memory allocations.","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Exact-Penalty-Method","page":"Do Constrained Optimization","title":"Exact Penalty Method","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"As a second solver, we have the Exact Penalty Method, which is currently available with two smoothing variants that render the inner subproblem a smooth optimisation problem, which is by default again solved with quasi-Newton: LogarithmicSumOfExponentials and LinearQuadraticHuber. We compare both here as well. 
The first smoothing technique is the default, so we can just call","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v3 = exact_penalty_method(\n M, f, grad_f!, p0; g=g2, grad_g=grad_g2!, evaluation=InplaceEvaluation(),\n debug=[:Iteration, :Cost, :Stop, \" | \", :Change, 50, \"\\n\"],\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 50 f(x): -0.123071 | Last Change: 0.981116\n# 100 f(x): -0.123840 | Last Change: 0.014124\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (2.202641515349944e-7) less than 1.0e-6.\n 2.383160 seconds (5.78 M allocations: 3.123 GiB, 7.71% gc time, 64.51% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We obtain a similar cost value as for the Augmented Lagrangian Solver above, but here the constraint is actually fulfilled and not just numerically “on the boundary”.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v3)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384029692539944","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, v3))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained 
Optimization","text":"-3.582398293370528e-6","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"The second smoothing technique is often beneficial when we have many constraints (in the vectorial manner mentioned above), since we can avoid several gradient evaluations for the constraint functions. This leads to a faster iteration time.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v4 = exact_penalty_method(\n M, f, grad_f!, p0; g=g2, grad_g=grad_g2!,\n evaluation=InplaceEvaluation(),\n smoothing=LinearQuadraticHuber(),\n debug=[:Iteration, :Cost, :Stop, \" | \", :Change, 50, \"\\n\"],\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 50 f(x): -0.123845 | Last Change: 0.009235\n# 100 f(x): -0.123843 | Last Change: 0.000107\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (3.586352489111338e-7) less than 1.0e-6.\n 1.557075 seconds (2.76 M allocations: 514.648 MiB, 5.08% gc time, 79.85% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"For the result we see the same behaviour as for the other smoothing.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v4)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384258173223292","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained 
Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, v4))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"2.7028045565194566e-8","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Comparing-to-the-unconstraint-solver","page":"Do Constrained Optimization","title":"Comparing to the unconstrained solver","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We can compare this to the global optimum on the sphere, which is the unconstrained optimisation problem; we can just use Quasi Newton.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Note that this is much faster, since every iteration of the algorithms above does a quasi-Newton call as well.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time w1 = quasi_Newton(\n M, f, grad_f!, p0; evaluation=InplaceEvaluation()\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":" 0.706571 seconds (634.12 k allocations: 61.701 MiB, 3.18% gc time, 96.56% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, w1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.14021901809807297","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"But of course the 
constraints are not fulfilled, and we have positive entries in g(w_1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, w1))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"0.11762414497055226","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Literature","page":"Do Constrained Optimization","title":"Literature","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Pages = [\"tutorials/ConstrainedOptimization.md\"]\nCanonical=false","category":"page"},{"location":"helpers/exports/#Exports","page":"Exports","title":"Exports","text":"","category":"section"},{"location":"helpers/exports/","page":"Exports","title":"Exports","text":"Exports aim to provide a consistent generation of images of your results. For example, if you record the trace your algorithm walks on the Sphere, you can easily export this trace to a rendered image using asymptote_export_S2_signals and render the result with Asymptote. 
Besides these, you can always record values during your iterations, and export these, for example to CSV.","category":"page"},{"location":"helpers/exports/#Asymptote","page":"Exports","title":"Asymptote","text":"","category":"section"},{"location":"helpers/exports/","page":"Exports","title":"Exports","text":"The following functions provide exports both in graphics and/or raw data using Asymptote.","category":"page"},{"location":"helpers/exports/","page":"Exports","title":"Exports","text":"Modules = [Manopt]\nPages = [\"Asymptote.jl\"]","category":"page"},{"location":"helpers/exports/#Manopt.asymptote_export_S2_data-Tuple{String}","page":"Exports","title":"Manopt.asymptote_export_S2_data","text":"asymptote_export_S2_data(filename)\n\nExport given data as an array of points on the sphere, i.e. one-, two- or three-dimensional data with points on the Sphere mathbb S^2.\n\nInput\n\nfilename – a file to store the Asymptote code in.\n\nOptional Arguments (Data)\n\ndata – a point representing the 1-,2-, or 3-D array of points\nelevation_color_scheme - A ColorScheme for elevation\nscale_axes - ((1/3,1/3,1/3)) move spheres closer to each other by a factor per direction\n\nOptional Arguments (Asymptote)\n\narrow_head_size - (1.8) size of the arrowheads of the vectors (in mm)\ncamera_position - position of the camera in the scene (default: centered above the xy-plane)\ntarget - position the camera points at (default: center of xy-plane within data).\n\n\n\n\n\n","category":"method"},{"location":"helpers/exports/#Manopt.asymptote_export_S2_signals-Tuple{String}","page":"Exports","title":"Manopt.asymptote_export_S2_signals","text":"asymptote_export_S2_signals(filename; points, curves, tangent_vectors, colors, options...)\n\nExport given points, curves, and tangent_vectors on the sphere mathbb S^2 to Asymptote.\n\nInput\n\nfilename – a file to store the Asymptote code in.\n\nOptional Arguments (Data)\n\ncolors - dictionary of color arrays (indexed by symbols :points, :curves and :tvector) 
where each entry has to provide at least as many colors as the length of the corresponding sets.\ncurves – an Array of Arrays of points on the sphere, where each inner array is interpreted as a curve and is accompanied by an entry within colors\npoints – an Array of Arrays of points on the sphere where each inner array is interpreted as a set of points and is accompanied by an entry within colors\ntangent_vectors – an Array of Arrays of tuples, where the first is a point, the second a tangent vector and each set of vectors is accompanied by an entry from within colors\n\nOptional Arguments (Asymptote)\n\narrow_head_size - (6.0) size of the arrowheads of the tangent vectors\narrow_head_sizes – overrides the previous value to specify a value per tVector set.\ncamera_position - ((1., 1., 0.)) position of the camera in the Asymptote scene\nline_width – (1.0) size of the lines used to draw the curves.\nline_widths – overrides the previous value to specify a value per curve and tVector set.\ndot_size – (1.0) size of the dots used to draw the points.\ndot_sizes – overrides the previous value to specify a value per point set.\nsize - (nothing) a tuple for the image size, otherwise a relative size 4cm is used.\nsphere_color – (RGBA{Float64}(0.85, 0.85, 0.85, 0.6)) color of the sphere the data is drawn on\nsphere_line_color – (RGBA{Float64}(0.75, 0.75, 0.75, 0.6)) color of the lines on the sphere\nsphere_line_width – (0.5) line width of the lines on the sphere\ntarget – ((0.,0.,0.)) position the camera points at\n\n\n\n\n\n","category":"method"},{"location":"helpers/exports/#Manopt.asymptote_export_SPD-Tuple{String}","page":"Exports","title":"Manopt.asymptote_export_SPD","text":"asymptote_export_SPD(filename)\n\nexport given data as a point on a Power(SymmetricPositiveDefinite(3)) manifold, i.e. 
one-, two- or three-dimensional data with points on the manifold of symmetric positive definite matrices.\n\nInput\n\nfilename – a file to store the Asymptote code in.\n\nOptional Arguments (Data)\n\ndata – a point representing the 1-,2-, or 3-D array of SPD matrices\ncolor_scheme - A ColorScheme for Geometric Anisotropy Index\nscale_axes - ((1/3,1/3,1/3)) move symmetric positive definite matrices closer to each other by a factor per direction compared to the distance estimated by the maximal eigenvalue of all involved SPD points\n\nOptional Arguments (Asymptote)\n\ncamera_position - position of the camera in the scene (default: centered above the xy-plane).\ntarget - position the camera points at (default: center of xy-plane within data).\n\nBoth values camera_position and target are scaled by scaledAxes*EW, where EW is the maximal eigenvalue in the data.\n\n\n\n\n\n","category":"method"},{"location":"helpers/exports/#Manopt.render_asymptote-Tuple{Any}","page":"Exports","title":"Manopt.render_asymptote","text":"render_asymptote(filename; render=4, format=\"png\", ...)\n\nrender an exported Asymptote file specified in the filename, which can also be given as a relative or full path\n\nInput\n\nfilename – filename of the exported asy and rendered image\n\nKeyword Arguments\n\nthe default values are given in brackets\n\nrender – (4) render level of asymptote, i.e. its -render option. This can be removed from the command by setting it to nothing.\nformat – (\"png\") final rendered format, i.e. 
asymptote's -f option\nexport_file - (the filename with format as ending) specify the export filename\n\n\n\n\n\n","category":"method"},{"location":"plans/problem/#ProblemSection","page":"Problem","title":"A Manopt Problem","text":"","category":"section"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"A problem describes all static data of an optimisation task and has as a super type","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"AbstractManoptProblem\nget_objective\nget_manifold","category":"page"},{"location":"plans/problem/#Manopt.AbstractManoptProblem","page":"Problem","title":"Manopt.AbstractManoptProblem","text":"AbstractManoptProblem{M<:AbstractManifold}\n\nDescribe a Riemannian optimization problem with all static (non-changing) properties.\n\nThe most prominent features that should always be stated here are\n\nthe AbstractManifold mathcal M (cf. ManifoldsBase.jl#AbstractManifold)\nthe cost function fcolon mathcal M ℝ\n\nUsually the cost should be within an AbstractManifoldObjective.\n\n\n\n\n\n","category":"type"},{"location":"plans/problem/#Manopt.get_objective","page":"Problem","title":"Manopt.get_objective","text":"get_objective(o::AbstractManifoldObjective, recursive=true)\n\nreturn the (one step) undecorated AbstractManifoldObjective of the (possibly) decorated o. As long as your decorated objective stores the objective within o.objective and the dispatch_objective_decorator is set to Val{true}, the internal state is extracted automatically.\n\nBy default the objective that is stored within a decorated objective is assumed to be at o.objective. 
Overwrite _get_objective(o, ::Val{true}, recursive) to change this behaviour for your objective o for both the recursive and the nonrecursive case.\n\nIf recursive is set to false, only the most outer decorator is taken away instead of all.\n\n\n\n\n\nget_objective(mp::AbstractManoptProblem, recursive=false)\n\nreturn the objective AbstractManifoldObjective stored within an AbstractManoptProblem. If recursive is set to true, it additionally unwraps all decorators of the objective\n\n\n\n\n\n","category":"function"},{"location":"plans/problem/#Manopt.get_manifold","page":"Problem","title":"Manopt.get_manifold","text":"get_manifold(amp::AbstractManoptProblem)\n\nreturn the manifold stored within an AbstractManoptProblem\n\n\n\n\n\n","category":"function"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"Usually, such a problem is determined by the manifold or domain of the optimisation and the objective with all its properties used within an algorithm – see The Objective. 
For that we can just use","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"DefaultManoptProblem","category":"page"},{"location":"plans/problem/#Manopt.DefaultManoptProblem","page":"Problem","title":"Manopt.DefaultManoptProblem","text":"DefaultManoptProblem{TM <: AbstractManifold, Objective <: AbstractManifoldObjective}\n\nModel a default manifold problem, that (just) consists of the domain of optimisation, that is an AbstractManifold and an AbstractManifoldObjective\n\n\n\n\n\n","category":"type"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"The exception to these are the primal dual-based solvers (Chambolle-Pock and the PD Semismooth Newton), which both need two manifolds as their domain(s), hence there also exists a","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"TwoManifoldProblem","category":"page"},{"location":"plans/problem/#Manopt.TwoManifoldProblem","page":"Problem","title":"Manopt.TwoManifoldProblem","text":"TwoManifoldProblem{\n MT<:AbstractManifold,NT<:AbstractManifold,O<:AbstractManifoldObjective\n} <: AbstractManoptProblem{MT}\n\nAn abstract type for primal-dual-based problems.\n\n\n\n\n\n","category":"type"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"From the two ingredients here, you can find more information about","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"the AbstractManifold in ManifoldsBase.jl\nthe AbstractManifoldObjective on the page about the objective.","category":"page"},{"location":"solvers/quasi_Newton/#quasiNewton","page":"Quasi-Newton","title":"Riemannian quasi-Newton methods","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":" CurrentModule = Manopt","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":" quasi_Newton\n 
quasi_Newton!","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.quasi_Newton","page":"Quasi-Newton","title":"Manopt.quasi_Newton","text":"quasi_Newton(M, f, grad_f, p)\n\nPerform a quasi Newton iteration for f on the manifold M starting in the point p.\n\nThe kth iteration consists of\n\nCompute the search direction η_k = -mathcalB_k operatornamegradf (p_k) or solve mathcalH_k η_k = -operatornamegradf (p_k).\nDetermine a suitable stepsize α_k along the curve gamma(α) = R_p_k(α η_k) e.g. by using WolfePowellLinesearch.\nCompute p_k+1 = R_p_k(α_k η_k).\nDefine s_k = T_p_k α_k η_k(α_k η_k) and y_k = operatornamegradf(p_k+1) - T_p_k α_k η_k(operatornamegradf(p_k)).\nCompute the new approximate Hessian H_k+1 or its inverse B_k.\n\nInput\n\nM – a manifold mathcalM.\nf – a cost function F mathcalM ℝ to minimize.\ngrad_f– the gradient operatornamegradF mathcalM T_xmathcal M of F.\np – an initial value p mathcalM.\n\nOptional\n\nbasis – (DefaultOrthonormalBasis()) basis within the tangent space(s) to represent the Hessian (inverse).\ncautious_update – (false) – whether or not to use a QuasiNewtonCautiousDirectionUpdate\ncautious_function – ((x) -> x*10^(-4)) – a monotone increasing function that is zero at 0 and strictly increasing at 0 for the cautious update.\ndirection_update – (InverseBFGS()) the update rule to use.\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x).\ninitial_operator – (Matrix{Float64}(I,n,n)) initial matrix to use for the approximation, where n=manifold_dimension(M), see also scale_initial_operator.\nmemory_size – (20) limited memory, number of s_k y_k to store. 
Set to a negative value to use a full memory representation\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction method to use, by default the exponential map.\nscale_initial_operator - (true) scale initial operator with fracs_ky_k_p_klVert y_krVert_p_k in the computation\nstabilize – (true) stabilize the method numerically by projecting computed (Newton-) directions to the tangent space to reduce numerical errors\nstepsize – (WolfePowellLinesearch(retraction_method, vector_transport_method)) specify a Stepsize.\nstopping_criterion - (StopWhenAny(StopAfterIteration(max(1000, memory_size)), StopWhenGradientNormLess(10^(-6)))) specify a StoppingCriterion\nvector_transport_method – (default_vector_transport_method(M, typeof(p))) a vector transport to use.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details.\n\n\n\n\n\n","category":"function"},{"location":"solvers/quasi_Newton/#Manopt.quasi_Newton!","page":"Quasi-Newton","title":"Manopt.quasi_Newton!","text":"quasi_Newton!(M, F, gradF, x; options...)\n\nPerform a quasi Newton iteration for F on the manifold M starting in the point x using a retraction R and a vector transport T.\n\nInput\n\nM – a manifold mathcalM.\nF – a cost function F mathcalM ℝ to minimize.\ngradF– the gradient operatornamegradF mathcalM T_xmathcal M of F implemented as gradF(M,p).\nx – an initial value x mathcalM.\n\nFor all optional parameters, see quasi_Newton.\n\n\n\n\n\n","category":"function"},{"location":"solvers/quasi_Newton/#Background","page":"Quasi-Newton","title":"Background","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"The aim is to minimize a real-valued function on a Riemannian manifold, i.e.","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"min f(x) quad x 
mathcalM","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"Riemannian quasi-Newton methods are, as generalizations of their Euclidean counterparts, Riemannian line search methods. These methods determine a search direction η_k T_x_k mathcalM at the current iterate x_k and a suitable stepsize α_k along gamma(α) = R_x_k(α η_k), where R T mathcalM mathcalM is a retraction. The next iterate is obtained by","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"x_k+1 = R_x_k(α_k η_k)","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"In quasi-Newton methods, the search direction is given by","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"η_k = -mathcalH_k^-1operatornamegradf (x_k) = -mathcalB_k operatornamegradf (x_k)","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where mathcalH_k T_x_k mathcalM T_x_k mathcalM is a positive definite self-adjoint operator, which approximates the action of the Hessian operatornameHess f (x_k) and mathcalB_k = mathcalH_k^-1. The idea of quasi-Newton methods is that, instead of creating a completely new approximation of the Hessian operator operatornameHess f(x_k+1) or its inverse at every iteration, the previous operator mathcalH_k or mathcalB_k is updated by a convenient formula using the obtained information about the curvature of the objective function during the iteration. The resulting operator mathcalH_k+1 or mathcalB_k+1 acts on the tangent space T_x_k+1 mathcalM of the freshly computed iterate x_k+1. In order to get a well-defined method, the following requirements are placed on the new operator mathcalH_k+1 or mathcalB_k+1 that is created by an update. 
Since the Hessian operatornameHess f(x_k+1) is a self-adjoint operator on the tangent space T_x_k+1 mathcalM, and mathcalH_k+1 approximates it, we require that mathcalH_k+1 or mathcalB_k+1 is also self-adjoint on T_x_k+1 mathcalM. In order to achieve a steady descent, we want η_k to be a descent direction in each iteration. Therefore we require that mathcalH_k+1 or mathcalB_k+1 is a positive definite operator on T_x_k+1 mathcalM. In order to get information about the curvature of the objective function into the new operator mathcalH_k+1 or mathcalB_k+1, we require that it satisfies a form of a Riemannian quasi-Newton equation:","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"mathcalH_k+1 T_x_k rightarrow x_k+1(R_x_k^-1(x_k+1)) = operatornamegradf(x_k+1) - T_x_k rightarrow x_k+1(operatornamegradf(x_k))","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"or","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"mathcalB_k+1 operatornamegradf(x_k+1) - T_x_k rightarrow x_k+1(operatornamegradf(x_k)) = T_x_k rightarrow x_k+1(R_x_k^-1(x_k+1))","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where T_x_k rightarrow x_k+1 T_x_k mathcalM T_x_k+1 mathcalM and the chosen retraction R is the associated retraction of T. We note that, of course, not all updates in all situations will meet these conditions in every iteration. 
For specific quasi-Newton updates, the fulfillment of the Riemannian curvature condition, which requires that","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"g_x_k+1(s_k y_k) 0","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"holds, is a requirement for the inheritance of the self-adjointness and positive definiteness of the mathcalH_k or mathcalB_k to the operator mathcalH_k+1 or mathcalB_k+1. Unfortunately, the fulfillment of the Riemannian curvature condition is not given by a step size alpha_k 0 that satisfies the generalized Wolfe conditions. However, in order to create a positive definite operator mathcalH_k+1 or mathcalB_k+1 in each iteration, the so-called locking condition was introduced in Huang, Gallivan, Absil, SIAM J. Optim., 2015, which requires that the isometric vector transport T^S, which is used in the update formula, and its associated retraction R fulfill","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"T^Sx ξ_x(ξ_x) = β T^Rx ξ_x(ξ_x) quad β = fraclVert ξ_x rVert_xlVert T^Rx ξ_x(ξ_x) rVert_R_x(ξ_x)","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where T^R is the vector transport by differentiated retraction. 
With the requirement that the isometric vector transport T^S and its associated retraction R satisfy the locking condition and using the tangent vector","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"y_k = β_k^-1 operatornamegradf(x_k+1) - T^Sx_k α_k η_k(operatornamegradf(x_k))","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"β_k = fraclVert α_k η_k rVert_x_klVert T^Rx_k α_k η_k(α_k η_k) rVert_x_k+1","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"in the update, it can be shown that choosing a stepsize α_k 0 that satisfies the Riemannian Wolfe conditions leads to the fulfillment of the Riemannian curvature condition, which in turn implies that the operator generated by the updates is positive definite. In the following we denote the specific operators in matrix notation and hence use H_k and B_k, respectively.","category":"page"},{"location":"solvers/quasi_Newton/#Direction-Updates","page":"Quasi-Newton","title":"Direction Updates","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"In general there are different ways to compute a fixed AbstractQuasiNewtonUpdateRule. 
These are represented by","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"AbstractQuasiNewtonDirectionUpdate\nQuasiNewtonMatrixDirectionUpdate\nQuasiNewtonLimitedMemoryDirectionUpdate\nQuasiNewtonCautiousDirectionUpdate","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.AbstractQuasiNewtonDirectionUpdate","page":"Quasi-Newton","title":"Manopt.AbstractQuasiNewtonDirectionUpdate","text":"AbstractQuasiNewtonDirectionUpdate\n\nAn abstract representation of a quasi-Newton update rule to determine the next direction given the current QuasiNewtonState.\n\nAll subtypes should be functors, i.e. one should be able to call them as H(M,x,d) to compute a new direction update.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonMatrixDirectionUpdate","page":"Quasi-Newton","title":"Manopt.QuasiNewtonMatrixDirectionUpdate","text":"QuasiNewtonMatrixDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate\n\nThese AbstractQuasiNewtonDirectionUpdates represent any quasi-Newton update rule, where the operator is stored as a matrix. A distinction is made between the update of the approximation of the Hessian, H_k mapsto H_k+1, and the update of the approximation of the Hessian inverse, B_k mapsto B_k+1. For the first case, the coordinates of the search direction η_k with respect to a basis b_i^n_i=1 are determined by solving a linear system of equations, i.e.\n\ntextSolve quad hatη_k = - H_k widehatoperatornamegradf(x_k)\n\nwhere H_k is the matrix representing the operator with respect to the basis b_i^n_i=1 and widehatoperatornamegradf(x_k) represents the coordinates of the gradient of the objective function f in x_k with respect to the basis b_i^n_i=1. 
If a method is chosen where the Hessian inverse is approximated, the coordinates of the search direction η_k with respect to a basis b_i^n_i=1 are obtained simply by matrix-vector multiplication, i.e.\n\nhatη_k = - B_k widehatoperatornamegradf(x_k)\n\nwhere B_k is the matrix representing the operator with respect to the basis b_i^n_i=1 and widehatoperatornamegradf(x_k) as above. In the end, the search direction η_k is generated from the coordinates hatη_k and the vectors of the basis b_i^n_i=1 in both variants. The AbstractQuasiNewtonUpdateRule indicates which quasi-Newton update rule is used. In all of them, the Euclidean update formula is used to generate the matrix H_k+1 and B_k+1, and the basis b_i^n_i=1 is transported into the upcoming tangent space T_x_k+1 mathcalM, preferably with an isometric vector transport, or generated there.\n\nFields\n\nupdate – an AbstractQuasiNewtonUpdateRule.\nbasis – the basis.\nmatrix – (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix which represents the approximating operator.\nscale – (true) indicates whether the initial matrix (= identity matrix) should be scaled before the first update.\nvector_transport_method – (vector_transport_method) an AbstractVectorTransportMethod\n\nConstructor\n\nQuasiNewtonMatrixDirectionUpdate(M::AbstractManifold, update, basis, matrix;\nscale=true, vector_transport_method=default_vector_transport_method(M))\n\nGenerate the update rule with defaults from a manifold and the names corresponding to the fields above.\n\nSee also\n\nQuasiNewtonLimitedMemoryDirectionUpdate QuasiNewtonCautiousDirectionUpdate AbstractQuasiNewtonDirectionUpdate\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonLimitedMemoryDirectionUpdate","page":"Quasi-Newton","title":"Manopt.QuasiNewtonLimitedMemoryDirectionUpdate","text":"QuasiNewtonLimitedMemoryDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate\n\nThis AbstractQuasiNewtonDirectionUpdate represents the 
limited-memory Riemannian BFGS update, where the approximating operator is represented by m stored pairs of tangent vectors widetildes_i widetildey_i_i=k-m^k-1 in the k-th iteration. For the calculation of the search direction η_k, the generalisation of the two-loop recursion is used (see Huang, Gallivan, Absil, SIAM J. Optim., 2015), since it only requires inner products and linear combinations of tangent vectors in T_x_k mathcalM. For that the stored pairs of tangent vectors widetildes_i widetildey_i_i=k-m^k-1, the gradient operatornamegradf(x_k) of the objective function f in x_k and the positive definite self-adjoint operator\n\nmathcalB^(0)_k = fracg_x_k(s_k-1 y_k-1)g_x_k(y_k-1 y_k-1) mathrmid_T_x_k mathcalM\n\nare used. The two-loop recursion can be understood as follows: the InverseBFGS update is executed m times in a row on mathcalB^(0)_k using the tangent vectors widetildes_i widetildey_i_i=k-m^k-1, and at the same time the resulting operator mathcalB^LRBFGS_k is directly applied on operatornamegradf(x_k). When updating there are two cases: if there is still free memory, i.e. k m, the previously stored vector pairs widetildes_i widetildey_i_i=k-m^k-1 have to be transported into the upcoming tangent space T_x_k+1 mathcalM; if there is no free memory, the oldest pair widetildes_km widetildey_km has to be discarded and then all the remaining vector pairs widetildes_i widetildey_i_i=k-m+1^k-1 are transported into the tangent space T_x_k+1 mathcalM. After that we calculate and store s_k = widetildes_k = T^S_x_k α_k η_k(α_k η_k) and y_k = widetildey_k. 
This process ensures that new information about the objective function is always included and the old, probably no longer relevant, information is discarded.\n\nFields\n\nmemory_s – the set of the stored (and transported) search directions times step size widetildes_i_i=k-m^k-1.\nmemory_y – set of the stored gradient differences widetildey_i_i=k-m^k-1.\nξ – a variable used in the two-loop recursion.\nρ – a variable used in the two-loop recursion.\nscale –\nvector_transport_method – an AbstractVectorTransportMethod\nmessage – a string containing a potential warning that might have appeared\n\nConstructor\n\nQuasiNewtonLimitedMemoryDirectionUpdate(\n M::AbstractManifold,\n x,\n update::AbstractQuasiNewtonUpdateRule,\n memory_size;\n initial_vector=zero_vector(M,x),\n scale=1.0\n project=true\n )\n\nSee also\n\nInverseBFGS QuasiNewtonCautiousDirectionUpdate AbstractQuasiNewtonDirectionUpdate\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonCautiousDirectionUpdate","page":"Quasi-Newton","title":"Manopt.QuasiNewtonCautiousDirectionUpdate","text":"QuasiNewtonCautiousDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate\n\nThese AbstractQuasiNewtonDirectionUpdates represent any quasi-Newton update rule, which is based on the idea of a so-called cautious update. The search direction is calculated as given in QuasiNewtonMatrixDirectionUpdate or QuasiNewtonLimitedMemoryDirectionUpdate, but the update then is only executed if\n\nfracg_x_k+1(y_ks_k)lVert s_k rVert^2_x_k+1 geq theta(lVert operatornamegradf(x_k) rVert_x_k)\n\nis satisfied, where theta is a monotone increasing function satisfying theta(0) = 0 and theta is strictly increasing at 0. If this is not the case, the corresponding update will be skipped, which means that for QuasiNewtonMatrixDirectionUpdate the matrix H_k or B_k is not updated. 
The basis b_i^n_i=1 is nevertheless transported into the upcoming tangent space T_x_k+1 mathcalM, and for QuasiNewtonLimitedMemoryDirectionUpdate neither the oldest vector pair widetildes_km widetildey_km is discarded nor the newest vector pair widetildes_k widetildey_k is added into storage, but all stored vector pairs widetildes_i widetildey_i_i=k-m^k-1 are transported into the tangent space T_x_k+1 mathcalM. If BFGS or InverseBFGS is chosen as update, then the resulting method follows the method of Huang, Absil, Gallivan, SIAM J. Optim., 2018, taking into account that the corresponding step size is chosen.\n\nFields\n\nupdate – an AbstractQuasiNewtonDirectionUpdate\nθ – a monotone increasing function satisfying θ(0) = 0 and θ is strictly increasing at 0.\n\nConstructor\n\nQuasiNewtonCautiousDirectionUpdate(U::QuasiNewtonMatrixDirectionUpdate; θ = x -> x)\nQuasiNewtonCautiousDirectionUpdate(U::QuasiNewtonLimitedMemoryDirectionUpdate; θ = x -> x)\n\nGenerate a cautious update for either a matrix-based or a limited-memory-based update rule.\n\nSee also\n\nQuasiNewtonMatrixDirectionUpdate QuasiNewtonLimitedMemoryDirectionUpdate\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Hessian-Update-Rules","page":"Quasi-Newton","title":"Hessian Update Rules","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"Using","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"update_hessian!","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.update_hessian!","page":"Quasi-Newton","title":"Manopt.update_hessian!","text":"update_hessian!(d, amp, st, p_old, iter)\n\nupdate the Hessian within the QuasiNewtonState o given an AbstractManoptProblem amp as well as an AbstractQuasiNewtonDirectionUpdate d and the last iterate p_old. 
Note that the current (iter-th) iterate is already stored in o.x.\n\nSee also AbstractQuasiNewtonUpdateRule for the different rules that are available within d.\n\n\n\n\n\n","category":"function"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"the following update formulae for either H_k+1 or B_k+1 are available.","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"AbstractQuasiNewtonUpdateRule\nBFGS\nDFP\nBroyden\nSR1\nInverseBFGS\nInverseDFP\nInverseBroyden\nInverseSR1","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.AbstractQuasiNewtonUpdateRule","page":"Quasi-Newton","title":"Manopt.AbstractQuasiNewtonUpdateRule","text":"AbstractQuasiNewtonUpdateRule\n\nSpecify a type for the different AbstractQuasiNewtonDirectionUpdates, that is, e.g. for a QuasiNewtonMatrixDirectionUpdate there are several different updates to the matrix, while for QuasiNewtonLimitedMemoryDirectionUpdate the default and most prominent is InverseBFGS.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.BFGS","page":"Quasi-Newton","title":"Manopt.BFGS","text":"BFGS <: AbstractQuasiNewtonUpdateRule\n\nindicates in an AbstractQuasiNewtonDirectionUpdate that the Riemannian BFGS update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeH_k^mathrmBFGS the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nH^mathrmBFGS_k+1 = widetildeH^mathrmBFGS_k + fracy_k y^mathrmT_k s^mathrmT_k y_k - fracwidetildeH^mathrmBFGS_k s_k s^mathrmT_k widetildeH^mathrmBFGS_k s^mathrmT_k widetildeH^mathrmBFGS_k s_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.DFP","page":"Quasi-Newton","title":"Manopt.DFP","text":"DFP <: AbstractQuasiNewtonUpdateRule\n\nindicates in an AbstractQuasiNewtonDirectionUpdate that the Riemannian DFP update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeH_k^mathrmDFP the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). Then the update formula reads\n\nH^mathrmDFP_k+1 = Bigl(\n mathrmid_T_x_k+1 mathcalM - fracy_k s^mathrmT_ks^mathrmT_k y_k\nBigr)\nwidetildeH^mathrmDFP_k\nBigl(\n mathrmid_T_x_k+1 mathcalM - fracs_k y^mathrmT_ks^mathrmT_k y_k\nBigr) + fracy_k y^mathrmT_ks^mathrmT_k y_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.Broyden","page":"Quasi-Newton","title":"Manopt.Broyden","text":"Broyden <: AbstractQuasiNewtonUpdateRule\n\nindicates in an AbstractQuasiNewtonDirectionUpdate that the Riemannian Broyden update is used in the Riemannian quasi-Newton method, which is a convex combination of BFGS and DFP.\n\nWe denote by widetildeH_k^mathrmBr the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nH^mathrmBr_k+1 = widetildeH^mathrmBr_k\n - fracwidetildeH^mathrmBr_k s_k s^mathrmT_k widetildeH^mathrmBr_ks^mathrmT_k widetildeH^mathrmBr_k s_k + fracy_k y^mathrmT_ks^mathrmT_k y_k\n + φ_k s^mathrmT_k widetildeH^mathrmBr_k s_k\n Bigl(\n fracy_ks^mathrmT_k y_k - fracwidetildeH^mathrmBr_k s_ks^mathrmT_k widetildeH^mathrmBr_k s_k\n Bigr)\n Bigl(\n fracy_ks^mathrmT_k y_k - fracwidetildeH^mathrmBr_k s_ks^mathrmT_k widetildeH^mathrmBr_k s_k\n Bigr)^mathrmT\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively, and φ_k is the Broyden factor which is :constant by default but can also be set to :Davidon.\n\nConstructor\n\nBroyden(φ, update_rule::Symbol = :constant)\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.SR1","page":"Quasi-Newton","title":"Manopt.SR1","text":"SR1 <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian SR1 update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeH_k^mathrmSR1 the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). Then the update formula reads\n\nH^mathrmSR1_k+1 = widetildeH^mathrmSR1_k\n+ frac\n (y_k - widetildeH^mathrmSR1_k s_k) (y_k - widetildeH^mathrmSR1_k s_k)^mathrmT\n\n(y_k - widetildeH^mathrmSR1_k s_k)^mathrmT s_k\n\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\nThis method can be stabilized by only performing the update if the denominator is larger than rlVert s_krVert_x_k+1lVert y_k - widetildeH^mathrmSR1_k s_k rVert_x_k+1 for some r0. 
For more details, see Section 6.2 in Nocedal, Wright, Springer, 2006.\n\nConstructor\n\nSR1(r::Float64=-1.0)\n\nGenerate the SR1 update, which by default does not include the check (since the default sets r0).\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseBFGS","page":"Quasi-Newton","title":"Manopt.InverseBFGS","text":"InverseBFGS <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the inverse Riemannian BFGS update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeB_k^mathrmBFGS the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). Then the update formula reads\n\nB^mathrmBFGS_k+1 = Bigl(\n mathrmid_T_x_k+1 mathcalM - fracs_k y^mathrmT_k s^mathrmT_k y_k\nBigr)\nwidetildeB^mathrmBFGS_k\nBigl(\n mathrmid_T_x_k+1 mathcalM - fracy_k s^mathrmT_k s^mathrmT_k y_k\nBigr) + fracs_k s^mathrmT_ks^mathrmT_k y_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseDFP","page":"Quasi-Newton","title":"Manopt.InverseDFP","text":"InverseDFP <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the inverse Riemannian DFP update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeB_k^mathrmDFP the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nB^mathrmDFP_k+1 = widetildeB^mathrmDFP_k + fracs_k s^mathrmT_ks^mathrmT_k y_k\n - fracwidetildeB^mathrmDFP_k y_k y^mathrmT_k widetildeB^mathrmDFP_ky^mathrmT_k widetildeB^mathrmDFP_k y_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseBroyden","page":"Quasi-Newton","title":"Manopt.InverseBroyden","text":"InverseBroyden <: AbstractQuasiNewtonUpdateRule\n\nIndicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian Broyden update is used in the Riemannian quasi-Newton method, which is a convex combination of InverseBFGS and InverseDFP.\n\nWe denote by widetildeB_k^mathrmBr the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nB^mathrmBr_k+1 = widetildeB^mathrmBr_k\n - fracwidetildeB^mathrmBr_k y_k y^mathrmT_k widetildeB^mathrmBr_ky^mathrmT_k widetildeB^mathrmBr_k y_k\n + fracs_k s^mathrmT_ks^mathrmT_k y_k\n + φ_k y^mathrmT_k widetildeB^mathrmBr_k y_k\n Bigl(\n fracs_ks^mathrmT_k y_k - fracwidetildeB^mathrmBr_k y_ky^mathrmT_k widetildeB^mathrmBr_k y_k\n Bigr) Bigl(\n fracs_ks^mathrmT_k y_k - fracwidetildeB^mathrmBr_k y_ky^mathrmT_k widetildeB^mathrmBr_k y_k\n Bigr)^mathrmT\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively, and φ_k is the Broyden factor which is :constant by default but can also be set to :Davidon.\n\nConstructor\n\nInverseBroyden(φ, update_rule::Symbol = :constant)\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseSR1","page":"Quasi-Newton","title":"Manopt.InverseSR1","text":"InverseSR1 <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the inverse Riemannian SR1 update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeB_k^mathrmSR1 the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nB^mathrmSR1_k+1 = widetildeB^mathrmSR1_k\n+ frac\n (s_k - widetildeB^mathrmSR1_k y_k) (s_k - widetildeB^mathrmSR1_k y_k)^mathrmT\n\n (s_k - widetildeB^mathrmSR1_k y_k)^mathrmT y_k\n\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\nThis method can be stabilized by only performing the update if the denominator is larger than rlVert y_krVert_x_k+1lVert s_k - widetildeB^mathrmSR1_k y_k rVert_x_k+1 for some r0. For more details, see Section 6.2 in Nocedal, Wright, Springer, 2006.\n\nConstructor\n\nInverseSR1(r::Float64=-1.0)\n\nGenerate the InverseSR1 update, which by default does not include the check, since the default sets r0.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#State","page":"Quasi-Newton","title":"State","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"The quasi-Newton algorithm is based on a DefaultManoptProblem.","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"QuasiNewtonState","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonState","page":"Quasi-Newton","title":"Manopt.QuasiNewtonState","text":"QuasiNewtonState <: AbstractManoptSolverState\n\nThese quasi-Newton AbstractManoptSolverState represent any quasi-Newton based method and can be used with any update rule for the direction.\n\nFields\n\np – the current iterate, a point on a manifold\nX – the current gradient\nsk – the current step\nyk – the current gradient difference\ndirection_update - an AbstractQuasiNewtonDirectionUpdate rule.\nretraction_method – an AbstractRetractionMethod\nstop – a StoppingCriterion\n\nConstructor\n\nQuasiNewtonState(\n M::AbstractManifold,\n x;\n 
initial_vector=zero_vector(M,x),\n direction_update::D=QuasiNewtonLimitedMemoryDirectionUpdate(M, x, InverseBFGS(), 20;\n vector_transport_method=vector_transport_method,\n ),\n stopping_criterion=StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6),\n retraction_method::RM=default_retraction_method(M, typeof(p)),\n vector_transport_method::VTM=default_vector_transport_method(M, typeof(p)),\n stepsize=default_stepsize(M, QuasiNewtonState)\n)\n\nSee also\n\nquasi_Newton\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Literature","page":"Quasi-Newton","title":"Literature","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"Pages = [\"solvers/quasi_Newton.md\"]\nCanonical=false","category":"page"},{"location":"solvers/NelderMead/#NelderMeadSolver","page":"Nelder–Mead","title":"Nelder Mead Method","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":" NelderMead\n NelderMead!","category":"page"},{"location":"solvers/NelderMead/#Manopt.NelderMead","page":"Nelder–Mead","title":"Manopt.NelderMead","text":"NelderMead(M::AbstractManifold, f [, population::NelderMeadSimplex])\nNelderMead(M::AbstractManifold, mco::AbstractManifoldCostObjective [, population::NelderMeadSimplex])\n\nSolve a Nelder-Mead minimization problem for the cost function fcolon mathcal M ℝ on the manifold M. 
If the initial population p is not given, a random set of points is chosen.\n\nThis algorithm is adapted from the Euclidean Nelder-Mead method, see https://en.wikipedia.org/wiki/Nelder–Mead_method and http://www.optimization-online.org/DB_FILE/2007/08/1742.pdf.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function to minimize\npopulation – (n+1 rand(M)s) an initial population of n+1 points, where n is the dimension of the manifold M.\n\nOptional\n\nstopping_criterion – (StopAfterIteration(2000) |StopWhenPopulationConcentrated()) a StoppingCriterion\nα – (1.) reflection parameter (α 0)\nγ – (2.) expansion parameter (γ 0)\nρ – (1/2) contraction parameter, 0 ρ frac12,\nσ – (1/2) shrink coefficient, 0 σ 1\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\n\nand the ones that are passed to decorate_state! for decorators.\n\nnote: Note\nThe manifold M used here has to either provide a mean(M, pts) or you have to load Manifolds.jl to use its statistics part.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/NelderMead/#Manopt.NelderMead!","page":"Nelder–Mead","title":"Manopt.NelderMead!","text":"NelderMead!(M::AbstractManifold, f [, population::NelderMeadSimplex])\n\nSolve a Nelder-Mead minimization problem for the cost function f on the manifold M. If the initial population population is not given, a random set of points is chosen. 
If it is given, the computation is done in place of population.\n\nFor more options see NelderMead.\n\n\n\n\n\n","category":"function"},{"location":"solvers/NelderMead/#State","page":"Nelder–Mead","title":"State","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":" NelderMeadState","category":"page"},{"location":"solvers/NelderMead/#Manopt.NelderMeadState","page":"Nelder–Mead","title":"Manopt.NelderMeadState","text":"NelderMeadState <: AbstractManoptSolverState\n\nDescribes all parameters and the state of a Nelder-Mead heuristic based optimization algorithm.\n\nFields\n\nThe naming of these parameters follows the Wikipedia article of the Euclidean case. The default is given in brackets, the required value range after the description\n\npopulation – an Array{point,1} of n+1 points x_i, i=1n+1, where n is the dimension of the manifold.\nstopping_criterion – (StopAfterIteration(2000) |StopWhenPopulationConcentrated()) a StoppingCriterion\nα – (1.) reflection parameter (α 0)\nγ – (2.) 
expansion parameter (γ 0)\nρ – (1/2) contraction parameter, 0 ρ frac12,\nσ – (1/2) shrink coefficient, 0 σ 1\np – (copy(population.pts[1])) - a field to collect the current best value (initialized to some point here)\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use.\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\n\nConstructors\n\nNelderMead(M[, population::NelderMeadSimplex]; kwargs...)\n\nConstruct a Nelder-Mead state with a default population (if not provided) of a set of dimension(M)+1 random points stored in NelderMeadSimplex.\n\nIn the constructor all fields (besides the population) are keyword arguments.\n\n\n\n\n\n","category":"type"},{"location":"solvers/NelderMead/#Simplex","page":"Nelder–Mead","title":"Simplex","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":"NelderMeadSimplex","category":"page"},{"location":"solvers/NelderMead/#Manopt.NelderMeadSimplex","page":"Nelder–Mead","title":"Manopt.NelderMeadSimplex","text":"NelderMeadSimplex\n\nA simplex for the Nelder-Mead algorithm.\n\nConstructors\n\nNelderMeadSimplex(M::AbstractManifold)\n\nConstruct a simplex using n+1 random points from manifold M, where n is the manifold dimension of M.\n\nNelderMeadSimplex(\n M::AbstractManifold,\n p,\n B::AbstractBasis=DefaultOrthonormalBasis();\n a::Real=0.025,\n retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)),\n)\n\nConstruct a simplex from a basis B with one point being p and other points constructed by moving by a in each principal direction defined by basis B of the tangent space at point p using retraction retraction_method. 
This works similarly to how the initial simplex is constructed in the Euclidean Nelder-Mead algorithm, just in the tangent space at point p.\n\n\n\n\n\n","category":"type"},{"location":"solvers/NelderMead/#Additional-Stopping-Criteria","page":"Nelder–Mead","title":"Additional Stopping Criteria","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":"StopWhenPopulationConcentrated","category":"page"},{"location":"solvers/NelderMead/#Manopt.StopWhenPopulationConcentrated","page":"Nelder–Mead","title":"Manopt.StopWhenPopulationConcentrated","text":"StopWhenPopulationConcentrated <: StoppingCriterion\n\nA stopping criterion for NelderMead to indicate to stop when both\n\nthe maximal distance of the first to the remaining cost values and\nthe maximal distance of the first to the remaining population points\n\ndrop below a certain tolerance tol_f and tol_p, respectively.\n\nConstructor\n\nStopWhenPopulationConcentrated(tol_f::Real=1e-8, tol_p::Real=1e-8)\n\n\n\n\n\n","category":"type"}]
+[{"location":"notation/#Notation","page":"Notation","title":"Notation","text":"","category":"section"},{"location":"notation/","page":"Notation","title":"Notation","text":"In this package, we follow the notation introduced in Manifolds.jl – Notation","category":"page"},{"location":"notation/","page":"Notation","title":"Notation","text":"with the following additional notation","category":"page"},{"location":"notation/","page":"Notation","title":"Notation","text":"Symbol Description Also used Comment\n The Levi-Cevita connection ","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#Using-Automatic-Differentiation-in-Manopt.jl","page":"Use Automatic Differentiation","title":"Using Automatic Differentiation in Manopt.jl","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Since Manifolds.jl 0.7, the support of automatic differentiation support has been extended.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"This tutorial explains how to use Euclidean tools to derive a gradient for a real-valued function fcolon mathcal M ℝ. We will consider two methods: an intrinsic variant and a variant employing the embedding. 
These gradients can then be used within any gradient based optimization algorithm in Manopt.jl.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"While by default we use FiniteDifferences.jl, you can also use FiniteDiff.jl, ForwardDiff.jl, ReverseDiff.jl, or Zygote.jl.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"In this tutorial we will take a look at a few possibilities to approximate or derive the gradient of a function fmathcal M to ℝ on a Riemannian manifold, without computing it yourself. There are mainly two different philosophies:","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Working intrinsically, i.e. staying on the manifold and in the tangent spaces. Here, we will consider approximating the gradient by forward differences.\nWorking in an embedding – there we can use all tools from functions on Euclidean spaces – finite differences or automatic differentiation – and then compute the corresponding Riemannian gradient from there.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We first load all necessary packages","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"using Manopt, Manifolds, Random, LinearAlgebra\nusing FiniteDifferences, ManifoldDiff\nRandom.seed!(42);","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#.-(Intrinsic)-Forward-Differences","page":"Use Automatic Differentiation","title":"1. 
(Intrinsic) Forward Differences","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"A first idea is to generalize (multivariate) finite differences to Riemannian manifolds. Let X_1ldotsX_d T_pmathcal M denote an orthonormal basis of the tangent space T_pmathcal M at the point pmathcal M on the Riemannian manifold.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We can generalize the notion of a directional derivative, i.e. for the “direction” YT_pmathcal M. Let ccolon -εε, ε0, be a curve with c(0) = p, dot c(0) = Y, e.g. c(t)= exp_p(tY). We obtain","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":" Df(p)Y = left fracddt right_t=0 f(c(t)) = lim_t to 0 frac1t(f(exp_p(tY))-f(p))","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We can approximate Df(p)Y by a finite difference scheme for an h0 as","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Df(p)Y G_h(Y) = frac1h(f(exp_p(hY))-f(p))","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Furthermore, the gradient operatornamegradf is the Riesz representer of the differential, i.e.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":" Df(p)Y = g_p(operatornamegradf(p) Y)qquad text for all Y T_pmathcal 
M","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"and since it is a tangent vector, we can write it in terms of a basis as","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":" operatornamegradf(p) = sum_i=1^d g_p(operatornamegradf(p)X_i)X_i\n = sum_i=1^d Df(p)X_iX_i","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"and perform the approximation from above to obtain","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":" operatornamegradf(p) sum_i=1^d G_h(X_i)X_i","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"for some suitable step size h. This comes at the cost of d+1 function evaluations and d exponential maps.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"This is the first variant we can use. An advantage is that it is intrinsic in the sense that it does not require any embedding of the manifold.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#An-Example:-The-Rayleigh-Quotient","page":"Use Automatic Differentiation","title":"An Example: The Rayleigh Quotient","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"The Rayleigh quotient is concerned with finding eigenvalues (and eigenvectors) of a symmetric matrix Ain ℝ^(n+1)(n+1). 
The optimization problem reads","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Fcolon ℝ^n+1 to ℝquad F(mathbf x) = fracmathbf x^mathrmTAmathbf xmathbf x^mathrmTmathbf x","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Minimizing this function yields the smallest eigenvalue lambda_1 as a value and the corresponding minimizer mathbf x^* is a corresponding eigenvector.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Since the length of an eigenvector is irrelevant, there is an ambiguity in the cost function. It can be better phrased on the sphere 𝕊^n of unit vectors in mathbb R^n+1, i.e.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"operatorname*argmin_p in 𝕊^n f(p) = operatorname*argmin_ p in 𝕊^n p^mathrmTAp","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We can compute the Riemannian gradient exactly as","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"operatornamegrad f(p) = 2(Ap - pp^mathrmTAp)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"so we can compare it to the approximation by finite differences.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"n = 200\nA = randn(n + 1, n + 1)\nA = Symmetric(A)\nM = 
Sphere(n);\n\nf1(p) = p' * A'p\ngradf1(p) = 2 * (A * p - p * p' * A * p)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"gradf1 (generic function with 1 method)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Manifolds provides a finite difference scheme in tangent spaces that you can use to employ an existing framework (if the wrapper is implemented) from Euclidean space. Here we use FiniteDifferences.jl.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"r_backend = ManifoldDiff.TangentDiffBackend(\n ManifoldDiff.FiniteDifferencesBackend()\n)\ngradf1_FD(p) = ManifoldDiff.gradient(M, f1, p, r_backend)\n\np = zeros(n + 1)\np[1] = 1.0\nX1 = gradf1(p)\nX2 = gradf1_FD(p)\nnorm(M, p, X1 - X2)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"1.0003414846716736e-12","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We obtain quite a good approximation of the gradient.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#EmbeddedGradient","page":"Use Automatic Differentiation","title":"2. 
Conversion of a Euclidean Gradient in the Embedding to a Riemannian Gradient of a (not Necessarily Isometrically) Embedded Manifold","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Let tilde fcolonmathbb R^m to mathbb R be a function on the embedding of an n-dimensional manifold mathcal M subset mathbb R^m and let fcolon mathcal M to mathbb R denote the restriction of tilde f to the manifold mathcal M.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Since we can use the pushforward of the embedding to also embed the tangent space T_pmathcal M, pin mathcal M, we can similarly obtain the differential Df(p)colon T_pmathcal M to mathbb R by restricting the differential Dtilde f(p) to the tangent space.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"If both T_pmathcal M and T_pmathbb R^m have the same inner product, or in other words the manifold is isometrically embedded in mathbb R^m (like for example the sphere mathbb S^nsubsetmathbb R^m+1), then this restriction of the differential directly translates to a projection of the gradient, i.e.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"operatornamegradf(p) = operatornameProj_T_pmathcal M(operatornamegrad tilde f(p))","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"More generally we might have to take a change of the metric into account, i.e.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic 
Differentiation","text":"langle operatornameProj_T_pmathcal M(operatornamegrad tilde f(p)) X rangle\n= Df(p)X = g_p(operatornamegradf(p) X)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"or in words: we have to change the Riesz representer of the (restricted/projected) differential of f (tilde f) to the one with respect to the Riemannian metric. This is done using change_representer.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#A-Continued-Example","page":"Use Automatic Differentiation","title":"A Continued Example","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We continue with the Rayleigh Quotient from before, now just starting with the defintion of the Euclidean case in the embedding, the function F.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"F(x) = x' * A * x / (x' * x);","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"The cost function is the same by restriction","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"f2(M, p) = F(p);","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"The gradient is now computed combining our gradient scheme with FiniteDifferences.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"function grad_f2_AD(M, p)\n return Manifolds.gradient(\n M, F, p, 
Manifolds.RiemannianProjectionBackend(ManifoldDiff.FiniteDifferencesBackend())\n )\nend\nX3 = grad_f2_AD(M, p)\nnorm(M, p, X1 - X3)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"1.69683800899515e-12","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#An-Example-for-a-Nonisometrically-Embedded-Manifold","page":"Use Automatic Differentiation","title":"An Example for a Nonisometrically Embedded Manifold","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"on the manifold mathcal P(3) of symmetric positive definite matrices.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"The following function computes (half) the distance squared (with respect to the linear affine metric) on the manifold mathcal P(3) to the identity matrix I_3. We consider the function","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":" G(q)\n = frac12d^2_mathcal P(3)(qI_3)\n = frac12lVert operatornameLog(q) rVert_F^2","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"where operatornameLog denotes the matrix logarithm and lVert cdot rVert_F is the Frobenius norm. 
This can be computed for symmetric positive definite matrices by summing the squares of the logarithms of the eigenvalues of q and dividing by two:","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"G(q) = sum(log.(eigvals(Symmetric(q))) .^ 2) / 2","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"G (generic function with 1 method)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We can also interpret this as a function on the space of matrices and apply the Euclidean finite differences machinery; in this way we can easily derive the Euclidean gradient. But when computing the Riemannian gradient, we have to change the representer (see again change_representer) after projecting onto the tangent space T_pmathcal P(n) at p.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Let’s first define a point and the manifold N=mathcal P(3).","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"rotM(α) = [1.0 0.0 0.0; 0.0 cos(α) sin(α); 0.0 -sin(α) cos(α)]\nq = rotM(π / 6) * [1.0 0.0 0.0; 0.0 2.0 0.0; 0.0 0.0 3.0] * transpose(rotM(π / 6))\nN = SymmetricPositiveDefinite(3)\nis_point(N, q)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"true","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We could first just compute the gradient using 
FiniteDifferences.jl, but this yields the Euclidean gradient:","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"FiniteDifferences.grad(central_fdm(5, 1), G, q)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"([3.240417492806275e-14 -2.3531899864903462e-14 0.0; 0.0 0.3514812167654708 0.017000516835452926; 0.0 0.0 0.36129646973723023],)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Instead, we use the RiemannianProjectionBackend of Manifolds.jl, which in this case internally uses FiniteDifferences.jl to compute a Euclidean gradient but then uses the conversion explained above to derive the Riemannian gradient.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"We define this here again as a function grad_G_FD that could be used in the Manopt.jl framework within a gradient-based optimization.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"function grad_G_FD(N, q)\n return Manifolds.gradient(\n N, G, q, ManifoldDiff.RiemannianProjectionBackend(ManifoldDiff.FiniteDifferencesBackend())\n )\nend\nG1 = grad_G_FD(N, q)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"3×3 Matrix{Float64}:\n 3.24042e-14 -2.64734e-14 -5.09481e-15\n -2.64734e-14 1.86368 0.826856\n -5.09481e-15 0.826856 2.81845","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic 
Differentiation","text":"Now, we can again compare this to the (known) solution of the gradient, namely the gradient of (half of) the distance squared, i.e. G(q) = frac12d^2_mathcal P(3)(qI_3) is given by operatornamegrad G(q) = -operatornamelog_q I_3, where operatornamelog is the logarithmic map on the manifold.","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"G2 = -log(N, q, Matrix{Float64}(I, 3, 3))","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"3×3 Matrix{Float64}:\n -0.0 -0.0 -0.0\n -0.0 1.86368 0.826856\n -0.0 0.826856 2.81845","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"Both terms agree up to 1.8·10^-12:","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"norm(G1 - G2)\nisapprox(N, q, G1, G2; atol=2 * 1e-12)","category":"page"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"true","category":"page"},{"location":"tutorials/AutomaticDifferentiation/#Summary","page":"Use Automatic Differentiation","title":"Summary","text":"","category":"section"},{"location":"tutorials/AutomaticDifferentiation/","page":"Use Automatic Differentiation","title":"Use Automatic Differentiation","text":"This tutorial illustrates how to use tools from Euclidean spaces, finite differences or automatic differentiation, to compute gradients on Riemannian manifolds. 
The scheme allows using any differentiation framework within the embedding to derive a Riemannian gradient.","category":"page"},{"location":"solvers/conjugate_gradient_descent/#CGSolver","page":"Conjugate gradient descent","title":"Conjugate Gradient Descent","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"conjugate_gradient_descent\nconjugate_gradient_descent!","category":"page"},{"location":"solvers/conjugate_gradient_descent/#Manopt.conjugate_gradient_descent","page":"Conjugate gradient descent","title":"Manopt.conjugate_gradient_descent","text":"conjugate_gradient_descent(M, F, gradF, p=rand(M))\nconjugate_gradient_descent(M, gradient_objective, p)\n\nperform a conjugate gradient based descent\n\np_k+1 = operatornameretr_p_k bigl( s_kδ_k bigr)\n\nwhere operatornameretr denotes a retraction on the Manifold M and one can employ different rules to update the descent direction δ_k based on the last direction δ_k-1 and both gradients operatornamegradf(x_k),operatornamegradf(x_k-1). The Stepsize s_k may be determined by a Linesearch.\n\nAs an alternative to f and grad_f, you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nAvailable update rules are SteepestDirectionUpdateRule, which yields a gradient_descent, ConjugateDescentCoefficient (the default), DaiYuanCoefficient, FletcherReevesCoefficient, HagerZhangCoefficient, HestenesStiefelCoefficient, LiuStoreyCoefficient, and PolakRibiereCoefficient. 
These can all be combined with a ConjugateGradientBealeRestart rule.\n\nThey all compute β_k such that this algorithm updates the search direction as\n\ndelta_k= -operatornamegradf(p_k) + β_k delta_k-1\n\nInput\n\nM : a manifold mathcal M\nf : a cost function Fmathcal Mℝ to minimize implemented as a function (M,p) -> v\ngrad_f: the gradient operatornamegradFmathcal M Tmathcal M of F implemented also as (M,x) -> X\np : an initial value pmathcal M\n\nOptional\n\ncoefficient : (ConjugateDescentCoefficient <: DirectionUpdateRule) rule to compute the descent direction update coefficient β_k, as a functor, i.e. the resulting function maps (amp, cgs, i) -> β, where amp is an AbstractManoptProblem, cgs is the ConjugateGradientDescentState, and i is the current iterate.\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x).\nretraction_method - (default_retraction_method(M, typeof(p))) a retraction method to use.\nstepsize - (ArmijoLinesearch via default_stepsize) A Stepsize function applied to the search direction. 
\nstopping_criterion : (stopWhenAny( stopAtIteration(200), stopGradientNormLess(10.0^-8))) a function indicating when to stop.\nvector_transport_method – (default_vector_transport_method(M, typeof(p))) vector transport method to transport the old descent direction when computing the new descent direction.\n\nIf you provide the ManifoldGradientObjective directly, evaluation is ignored.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/conjugate_gradient_descent/#Manopt.conjugate_gradient_descent!","page":"Conjugate gradient descent","title":"Manopt.conjugate_gradient_descent!","text":"conjugate_gradient_descent!(M, F, gradF, x)\nconjugate_gradient_descent!(M, gradient_objective, p; kwargs...)\n\nperform a conjugate gradient based descent in place of x, i.e.\n\np_k+1 = operatornameretr_p_k bigl( s_kdelta_k bigr)\n\nwhere operatornameretr denotes a retraction on the Manifold M\n\nInput\n\nM : a manifold mathcal M\nf : a cost function Fmathcal Mℝ to minimize\ngrad_f: the gradient operatornamegradFmathcal M Tmathcal M of F\np : an initial value pmathcal M\n\nAs an alternative to f and grad_f, you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nFor more details and options, especially the DirectionUpdateRules, see conjugate_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/conjugate_gradient_descent/#State","page":"Conjugate gradient descent","title":"State","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"ConjugateGradientDescentState","category":"page"},{"location":"solvers/conjugate_gradient_descent/#Manopt.ConjugateGradientDescentState","page":"Conjugate gradient descent","title":"Manopt.ConjugateGradientDescentState","text":"ConjugateGradientDescentState <: 
AbstractGradientSolverState\n\nspecify options for a conjugate gradient descent algorithm, that solves a DefaultManoptProblem.\n\nFields\n\np – the current iterate, a point on a manifold\nX – the current gradient, also denoted as ξ or X_k for the gradient in the kth step.\nδ – the current descent direction, i.e. also a tangent vector\nβ – the current update coefficient, determined by the coefficient rule.\ncoefficient – (ConjugateDescentCoefficient()) a DirectionUpdateRule function to determine the new β\nstepsize – (default_stepsize(M, ConjugateGradientDescentState; retraction_method=retraction_method)) a Stepsize function\nstop – (StopAfterIteration(500) |StopWhenGradientNormLess(1e-8)) a StoppingCriterion\nretraction_method – (default_retraction_method(M, typeof(p))) a type of retraction\nvector_transport_method – (default_vector_transport_method(M, typeof(p))) a type of vector transport\n\nConstructor\n\nConjugateGradientDescentState(M, p)\n\nwhere the last five fields above can be set by their names as keywords and the X can be set to a tangent vector type using the keyword initial_gradient which defaults to zero_vector(M,p), and δ is initialized to a copy of this vector.\n\nSee also\n\nconjugate_gradient_descent, DefaultManoptProblem, ArmijoLinesearch\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#cg-coeffs","page":"Conjugate gradient descent","title":"Available Coefficients","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"The update rules act as DirectionUpdateRule, which internally always first evaluate the gradient itself.","category":"page"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient 
descent","text":"ConjugateGradientBealeRestart\nConjugateDescentCoefficient\nDaiYuanCoefficient\nFletcherReevesCoefficient\nHagerZhangCoefficient\nHestenesStiefelCoefficient\nLiuStoreyCoefficient\nPolakRibiereCoefficient\nSteepestDirectionUpdateRule","category":"page"},{"location":"solvers/conjugate_gradient_descent/#Manopt.ConjugateGradientBealeRestart","page":"Conjugate gradient descent","title":"Manopt.ConjugateGradientBealeRestart","text":"ConjugateGradientBealeRestart <: DirectionUpdateRule\n\nAn update rule might require a restart, that is, using the pure gradient as descent direction, if the last two gradients are nearly orthogonal, cf. Hager, Zhang, Pacific J Optim, 2006, page 12 (in the pdf, 46 in Journal page numbers). This method is named after E. Beale from his proceedings paper in 1972 [Bea72]. This method acts as a decorator to any existing DirectionUpdateRule direction_update.\n\nFrom the ConjugateGradientDescentState cgs, obtain the last iterate and gradient p_kX_k and the current ones p_k+1X_k+1, respectively.\n\nThen a restart is performed, i.e. β_k = 0 is returned, if\n\n frac X_k+1 P_p_k+1gets p_kX_klVert X_k rVert_p_k > ξ\n\nwhere P_agets b() denotes a vector transport from the tangent space at a to b, and ξ is the threshold. The default threshold is chosen as 0.2 as recommended in Powell, Math. 
Prog., 1977\n\nConstructor\n\nConjugateGradientBealeRestart(\n direction_update::D,\n threshold=0.2;\n manifold::AbstractManifold = DefaultManifold(),\n vector_transport_method::V=default_vector_transport_method(manifold),\n)\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.ConjugateDescentCoefficient","page":"Conjugate gradient descent","title":"Manopt.ConjugateDescentCoefficient","text":"ConjugateDescentCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Fletcher, 1987 adapted to manifolds:\n\nβ_k =\nfrac lVert X_k+1 rVert_p_k+1^2 \nlangle -delta_kX_k rangle_p_k\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nConjugateDescentCoefficient(a::StoreStateAction=())\n\nConstruct the conjugate descent coefficient update rule, a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.DaiYuanCoefficient","page":"Conjugate gradient descent","title":"Manopt.DaiYuanCoefficient","text":"DaiYuanCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Dai, Yuan, SIAM J Optim, 1999 adapted to manifolds:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nThen the coefficient reads\n\nβ_k =\nfrac lVert X_k+1 rVert_p_k+1^2 \nlangle P_p_k+1gets p_kdelta_k nu_k rangle_p_k+1\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nfunction DaiYuanCoefficient(\n 
M::AbstractManifold=DefaultManifold(2);\n t::AbstractVectorTransportMethod=default_vector_transport_method(M)\n)\n\nConstruct the Dai Yuan coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.FletcherReevesCoefficient","page":"Conjugate gradient descent","title":"Manopt.FletcherReevesCoefficient","text":"FletcherReevesCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Fletcher, Reeves, Comput. J, 1964 adapted to manifolds:\n\nβ_k =\nfraclVert X_k+1rVert_p_k+1^2lVert X_krVert_p_k^2\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nFletcherReevesCoefficient(a::StoreStateAction=())\n\nConstruct the Fletcher Reeves coefficient update rule, a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.HagerZhangCoefficient","page":"Conjugate gradient descent","title":"Manopt.HagerZhangCoefficient","text":"HagerZhangCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Hager, Zhang, SIAM J Optim, 2005. 
adapted to manifolds: let nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nβ_k = Bigllanglenu_k -\nfrac 2lVert nu_krVert_p_k+1^2 langle P_p_k+1gets p_kdelta_k nu_k rangle_p_k+1 \nP_p_k+1gets p_kdelta_k\nfracX_k+1 langle P_p_k+1gets p_kdelta_k nu_k rangle_p_k+1 \nBigrrangle_p_k+1\n\nThis method includes a numerical stability improvement proposed by those authors.\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nfunction HagerZhangCoefficient(t::AbstractVectorTransportMethod)\nfunction HagerZhangCoefficient(M::AbstractManifold = DefaultManifold(2))\n\nConstruct the Hager Zhang coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.HestenesStiefelCoefficient","page":"Conjugate gradient descent","title":"Manopt.HestenesStiefelCoefficient","text":"HestenesStiefelCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Hestenes, Stiefel, J. Research Nat. Bur. Standards, 1952 adapted to manifolds as follows:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k. 
Then the update reads\n\nβ_k = fraclangle X_k+1 nu_k rangle_p_k+1 \n langle P_p_k+1gets p_k delta_k nu_krangle_p_k+1 \n\nwhere P_agets b() denotes a vector transport from the tangent space at a to b.\n\nConstructor\n\nfunction HestenesStiefelCoefficient(transport_method::AbstractVectorTransportMethod)\nfunction HestenesStiefelCoefficient(M::AbstractManifold = DefaultManifold(2))\n\nConstruct the Hestenes Stiefel coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\nSee also conjugate_gradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.LiuStoreyCoefficient","page":"Conjugate gradient descent","title":"Manopt.LiuStoreyCoefficient","text":"LiuStoreyCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Liu, Storey, J. Optim. 
Theory Appl., 1991 adapted to manifolds:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nThen the coefficient reads\n\nβ_k = -\nfrac langle X_k+1nu_k rangle_p_k+1 \nlangle delta_kX_k rangle_p_k\n\nSee also conjugate_gradient_descent\n\nConstructor\n\nfunction LiuStoreyCoefficient(t::AbstractVectorTransportMethod)\nfunction LiuStoreyCoefficient(M::AbstractManifold = DefaultManifold(2))\n\nConstruct the Liu Storey coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.PolakRibiereCoefficient","page":"Conjugate gradient descent","title":"Manopt.PolakRibiereCoefficient","text":"PolakRibiereCoefficient <: DirectionUpdateRule\n\nComputes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds includes the last iterates p_kX_k, the current iterates p_k+1X_k+1 of the iterate and the gradient, respectively, and the last update direction delta=delta_k, based on Polak, Ribiere, ESAIM Math. Modelling Num. Anal., 1969 and Polyak, USSR Comp. Math. Math. 
Phys., 1969 adapted to manifolds:\n\nLet nu_k = X_k+1 - P_p_k+1gets p_kX_k, where P_agets b() denotes a vector transport from the tangent space at a to b.\n\nThen the update reads\n\nβ_k =\nfrac langle X_k+1 nu_k rangle_p_k+1 \nlVert X_k rVert_p_k^2 \n\nConstructor\n\nfunction PolakRibiereCoefficient(\n M::AbstractManifold=DefaultManifold(2);\n t::AbstractVectorTransportMethod=default_vector_transport_method(M)\n)\n\nConstruct the PolakRibiere coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.\n\nSee also conjugate_gradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Manopt.SteepestDirectionUpdateRule","page":"Conjugate gradient descent","title":"Manopt.SteepestDirectionUpdateRule","text":"SteepestDirectionUpdateRule <: DirectionUpdateRule\n\nThe simplest rule is to have no influence from the last direction and hence to return an update β = 0 for any ConjugateGradientDescentState cgds\n\nSee also conjugate_gradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/conjugate_gradient_descent/#Literature","page":"Conjugate gradient descent","title":"Literature","text":"","category":"section"},{"location":"solvers/conjugate_gradient_descent/","page":"Conjugate gradient descent","title":"Conjugate gradient descent","text":"
[Bea72]
\n
\n
E. M. Beale. A derivation of conjugate gradients. In: Numerical methods for nonlinear optimization, 39–43, London, Academic Press, London (1972).
","category":"page"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"EditURL = \"https://github.com/JuliaManifolds/Manopt.jl/blob/master/Changelog.md\"","category":"page"},{"location":"changelog/#Changelog","page":"Changelog","title":"Changelog","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"All notable Changes to the Julia package Manopt.jl will be documented in this file. The file was started with Version 0.4.","category":"page"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.","category":"page"},{"location":"changelog/#[0.4.40]-–-24/10/2023","page":"Changelog","title":"[0.4.40] – 24/10/2023","text":"","category":"section"},{"location":"changelog/#Added","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"add a --help argument to docs/make.jl to document all available command line arguments\nadd a --exclude-tutorials argument to docs/make.jl. 
This way, when quarto is not available on a computer, the docs can still be built with the tutorials not being added to the menu such that documenter does not expect them to exist.","category":"page"},{"location":"changelog/#Changes","page":"Changelog","title":"Changes","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Bump dependencies to ManifoldsBase.jl 0.15 and Manifolds.jl 0.9\nmove the ARC CG subsolver to the main package, since TangentSpace is now already available from ManifoldsBase.","category":"page"},{"location":"changelog/#[0.4.39]-–-09/10/2023","page":"Changelog","title":"[0.4.39] – 09/10/2023","text":"","category":"section"},{"location":"changelog/#Changes-2","page":"Changelog","title":"Changes","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"also use the pair of a retraction and the inverse retraction (see last update) to perform the relaxation within the Douglas-Rachford algorithm.","category":"page"},{"location":"changelog/#[0.4.38]-–-08/10/2023","page":"Changelog","title":"[0.4.38] – 08/10/2023","text":"","category":"section"},{"location":"changelog/#Changes-3","page":"Changelog","title":"Changes","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"avoid allocations when calling get_jacobian! 
within the Levenberg-Marquardt Algorithm.","category":"page"},{"location":"changelog/#Fixed","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Fix a lot of typos in the documentation","category":"page"},{"location":"changelog/#[0.4.37]-–-28/09/2023","page":"Changelog","title":"[0.4.37] – 28/09/2023","text":"","category":"section"},{"location":"changelog/#Changes-4","page":"Changelog","title":"Changes","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"add more of the Riemannian Levenberg-Marquardt algorithm's parameters as keywords, so they can be changed on call\ngeneralize the internal reflection of Douglas-Rachford, such that it also works with an arbitrary pair of a reflection and an inverse reflection.","category":"page"},{"location":"changelog/#[0.4.36]-–-20/09/2023","page":"Changelog","title":"[0.4.36] – 20/09/2023","text":"","category":"section"},{"location":"changelog/#Fixed-2","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Fixed a bug that caused non-matrix points and vectors to fail when working with approximate","category":"page"},{"location":"changelog/#[0.4.35]-–-14/09/2023","page":"Changelog","title":"[0.4.35] – 14/09/2023","text":"","category":"section"},{"location":"changelog/#Added-2","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"The access to functions of the objective is now unified and encapsulated in proper get_ functions.","category":"page"},{"location":"changelog/#[0.4.34]-–-02/09/2023","page":"Changelog","title":"[0.4.34] – 
02/09/2023","text":"","category":"section"},{"location":"changelog/#Added-3","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"a ManifoldEuclideanGradientObjective to allow the cost, gradient, and Hessian and other first or second derivative based elements to be Euclidean and converted when needed.\na keyword objective_type=:Euclidean for all solvers, that specifies that an Objective shall be created of the above type","category":"page"},{"location":"changelog/#[0.4.33]-24/08/2023","page":"Changelog","title":"[0.4.33] - 24/08/2023","text":"","category":"section"},{"location":"changelog/#Added-4","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"ConstantStepsize and DecreasingStepsize now have an additional field type::Symbol to assess whether the step-size should be relatively (to the gradient norm) or absolutely constant.","category":"page"},{"location":"changelog/#[0.4.32]-23/08/2023","page":"Changelog","title":"[0.4.32] - 23/08/2023","text":"","category":"section"},{"location":"changelog/#Added-5","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"The adaptive regularization with cubics (ARC) solver.","category":"page"},{"location":"changelog/#[0.4.31]-14/08/2023","page":"Changelog","title":"[0.4.31] - 14/08/2023","text":"","category":"section"},{"location":"changelog/#Added-6","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"A :Subsolver keyword in the debug= keyword argument, that activates the new DebugWhenActive to de/activate subsolver debug from the main solvers' DebugEvery.","category":"page"},{"location":"changelog/#[0.4.30]-03/08/2023","page":"Changelog","title":"[0.4.30] - 
03/08/2023","text":"","category":"section"},{"location":"changelog/#Changed","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"References in the documentation are now rendered using DocumenterCitations.jl\nAsymptote export now also accepts a size in pixels instead of its default 4cm size and render can be deactivated by setting it to nothing.","category":"page"},{"location":"changelog/#[0.4.29]-12/07/2023","page":"Changelog","title":"[0.4.29] - 12/07/2023","text":"","category":"section"},{"location":"changelog/#Fixed-3","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"fixed a bug, where cyclic_proximal_point did not work with decorated objectives.","category":"page"},{"location":"changelog/#[0.4.28]-24/06/2023","page":"Changelog","title":"[0.4.28] - 24/06/2023","text":"","category":"section"},{"location":"changelog/#Changed-2","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"max_stepsize was specialized for FixedRankManifold to follow Matlab Manopt.","category":"page"},{"location":"changelog/#[0.4.27]-15/06/2023","page":"Changelog","title":"[0.4.27] - 15/06/2023","text":"","category":"section"},{"location":"changelog/#Added-7","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"The AdaptiveWNGrad stepsize is now available as a new stepsize functor.","category":"page"},{"location":"changelog/#Fixed-4","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Levenberg-Marquardt now possesses its parameters initial_residual_values and initial_jacobian_f also as keyword arguments, such that their default initialisations can be adapted, if 
necessary","category":"page"},{"location":"changelog/#[0.4.26]-11/06/2023","page":"Changelog","title":"[0.4.26] - 11/06/2023","text":"","category":"section"},{"location":"changelog/#Added-8","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"simplify usage of gradient descent as sub solver in the DoC solvers.\nadd a get_state function\ndocument indicates_convergence.","category":"page"},{"location":"changelog/#[0.4.25]-05/06/2023","page":"Changelog","title":"[0.4.25] - 05/06/2023","text":"","category":"section"},{"location":"changelog/#Fixed-5","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Fixes an allocation bug in the difference of convex algorithm","category":"page"},{"location":"changelog/#[0.4.24]-04/06/2023","page":"Changelog","title":"[0.4.24] - 04/06/2023","text":"","category":"section"},{"location":"changelog/#Added-9","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"another workflow that deletes old PR renderings from the docs to keep them smaller in overall size.","category":"page"},{"location":"changelog/#Changes-5","page":"Changelog","title":"Changes","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"bump dependencies since the extension between Manifolds.jl and ManifoldsDiff.jl has been moved to Manifolds.jl","category":"page"},{"location":"changelog/#[0.4.23]-04/06/2023","page":"Changelog","title":"[0.4.23] - 04/06/2023","text":"","category":"section"},{"location":"changelog/#Added-10","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"More details on the Count and Cache 
tutorial","category":"page"},{"location":"changelog/#Changed-3","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"loosen constraints slightly","category":"page"},{"location":"changelog/#[0.4.22]-31/05/2023","page":"Changelog","title":"[0.4.22] - 31/05/2023","text":"","category":"section"},{"location":"changelog/#Added-11","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"A tutorial on how to implement a solver","category":"page"},{"location":"changelog/#[0.4.21]-22/05/2023","page":"Changelog","title":"[0.4.21] - 22/05/2023","text":"","category":"section"},{"location":"changelog/#Added-12","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"A ManifoldCacheObjective as a decorator for objectives to cache results of calls, using LRU Caches as a weak dependency. For now this works with cost and gradient evaluations\nA ManifoldCountObjective as a decorator for objectives to enable counting of calls to for example the cost and the gradient\nadds a return_objective keyword, that switches the return of a solver to a tuple (o, s), where o is the (possibly decorated) objective, and s is the “classical” solver return (state or point). 
This way the counted values can be accessed and the cache can be reused.\nchange solvers on the mid level (of the form solver(M, objective, p)) to also accept decorated objectives","category":"page"},{"location":"changelog/#Changed-4","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Switch all Requires weak dependencies to actual weak dependencies starting in Julia 1.9","category":"page"},{"location":"changelog/#[0.4.20]-11/05/2023","page":"Changelog","title":"[0.4.20] - 11/05/2023","text":"","category":"section"},{"location":"changelog/#Changed-5","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"the default tolerances for the numerical check_ functions were loosened a bit, such that check_vector can also be changed in its tolerances.","category":"page"},{"location":"changelog/#[0.4.19]-07/05/2023","page":"Changelog","title":"[0.4.19] - 07/05/2023","text":"","category":"section"},{"location":"changelog/#Added-13","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"the sub solver for trust_regions is now customizable, i.e. 
can be exchanged.","category":"page"},{"location":"changelog/#Changed-6","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"slightly changed the definitions of the solver states for ALM and EPM to be type stable","category":"page"},{"location":"changelog/#[0.4.18]-04/05/2023","page":"Changelog","title":"[0.4.18] - 04/05/2023","text":"","category":"section"},{"location":"changelog/#Added-14","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"A function check_Hessian(M, f, grad_f, Hess_f) to numerically check the (Riemannian) Hessian of a function f","category":"page"},{"location":"changelog/#[0.4.17]-28/04/2023","page":"Changelog","title":"[0.4.17] - 28/04/2023","text":"","category":"section"},{"location":"changelog/#Added-15","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"A new interface of the form alg(M, objective, p0) to allow to reuse objectives without creating AbstractManoptSolverStates and calling solve!. This especially still allows for any decoration of the objective and/or the state using e.g. debug=, or record=.","category":"page"},{"location":"changelog/#Changed-7","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"All solvers now have the initial point p as an optional parameter making it more accessible to first time users, e.g. 
gradient_descent(M, f, grad_f)","category":"page"},{"location":"changelog/#Fixed-6","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Unified the framework to work on manifolds where points are represented by numbers for several solvers","category":"page"},{"location":"changelog/#[0.4.16]-18/04/2023","page":"Changelog","title":"[0.4.16] - 18/04/2023","text":"","category":"section"},{"location":"changelog/#Fixed-7","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"the inner products used in truncated_gradient_descent now also work thoroughly on complex matrix manifolds","category":"page"},{"location":"changelog/#[0.4.15]-13/04/2023","page":"Changelog","title":"[0.4.15] - 13/04/2023","text":"","category":"section"},{"location":"changelog/#Changed-8","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"trust_regions(M, f, grad_f, hess_f, p) now has the Hessian hess_f as well as the start point p0 as optional parameters and approximates the Hessian otherwise.\ntrust_regions!(M, f, grad_f, hess_f, p) has the Hessian as an optional parameter and approximates it otherwise.","category":"page"},{"location":"changelog/#Removed","page":"Changelog","title":"Removed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"support for ManifoldsBase.jl 0.13.x, since with the definition of copy(M,p::Number), in 0.14.4, we now use that instead of defining it ourselves.","category":"page"},{"location":"changelog/#[0.4.14]-06/04/2023","page":"Changelog","title":"[0.4.14] - 
06/04/2023","text":"","category":"section"},{"location":"changelog/#Changed-9","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"particle_swarm now uses much more in-place operations","category":"page"},{"location":"changelog/#Fixed-8","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"particle_swarm used quite a few deepcopy(p) commands still, which were replaced by copy(M, p)","category":"page"},{"location":"changelog/#[0.4.13]-09/04/2023","page":"Changelog","title":"[0.4.13] - 09/04/2023","text":"","category":"section"},{"location":"changelog/#Added-16","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"get_message to obtain messages from sub steps of a solver\nDebugMessages to display the new messages in debug\nsafeguards in Armijo linesearch and L-BFGS against numerical over- and underflow that report in messages","category":"page"},{"location":"changelog/#[0.4.12]-04/04/2023","page":"Changelog","title":"[0.4.12] - 04/04/2023","text":"","category":"section"},{"location":"changelog/#Added-17","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Introduce the Difference of Convex Algorithm (DCA) difference_of_convex_algorithm(M, f, g, ∂h, p0)\nIntroduce the Difference of Convex Proximal Point Algorithm (DCPPA) difference_of_convex_proximal_point(M, prox_g, grad_h, p0)\nIntroduce a StopWhenGradientChangeLess stopping criterion","category":"page"},{"location":"changelog/#[0.4.11]-27/04/2023","page":"Changelog","title":"[0.4.11] - 
27/04/2023","text":"","category":"section"},{"location":"changelog/#Changed-10","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"adapt tolerances in tests to the speed/accuracy optimized distance on the sphere in Manifolds.jl (part II)","category":"page"},{"location":"changelog/#[0.4.10]-26/04/2023","page":"Changelog","title":"[0.4.10] - 26/04/2023","text":"","category":"section"},{"location":"changelog/#Changed-11","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"adapt tolerances in tests to the speed/accuracy optimized distance on the sphere in Manifolds.jl","category":"page"},{"location":"changelog/#[0.4.9]-–-03/03/2023","page":"Changelog","title":"[0.4.9] – 03/03/2023","text":"","category":"section"},{"location":"changelog/#Added-18","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"introduce a wrapper that allows line searches from LineSearches.jl to be used within Manopt.jl, introduce the manoptjl.org/stable/extensions/ page to explain the details.","category":"page"},{"location":"changelog/#[0.4.8]-21/02/2023","page":"Changelog","title":"[0.4.8] - 21/02/2023","text":"","category":"section"},{"location":"changelog/#Added-19","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"a status_summary that displays the main parameters within several structures of Manopt, most prominently a solver state","category":"page"},{"location":"changelog/#Changed-12","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Improved storage performance by introducing separate named tuples for points and vectors\nchanged the show methods of 
AbstractManoptSolverStates to display their status_summary\nMove tutorials to be rendered with Quarto into the documentation.","category":"page"},{"location":"changelog/#[0.4.7]-14/02/2023","page":"Changelog","title":"[0.4.7] - 14/02/2023","text":"","category":"section"},{"location":"changelog/#Changed-13","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Bump [compat] entry of ManifoldDiff to also include 0.3","category":"page"},{"location":"changelog/#[0.4.6]-03/02/2023","page":"Changelog","title":"[0.4.6] - 03/02/2023","text":"","category":"section"},{"location":"changelog/#Fixed-9","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Fixed a few stopping criteria that even indicated to stop before the algorithm started.","category":"page"},{"location":"changelog/#[0.4.5]-24/01/2023","page":"Changelog","title":"[0.4.5] - 24/01/2023","text":"","category":"section"},{"location":"changelog/#Changed-14","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"the new default functions that include p are used where possible\na first step towards faster storage handling","category":"page"},{"location":"changelog/#[0.4.4]-20/01/2023","page":"Changelog","title":"[0.4.4] - 20/01/2023","text":"","category":"section"},{"location":"changelog/#Added-20","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Introduce ConjugateGradientBealeRestart to allow CG restarts using Beale's rule","category":"page"},{"location":"changelog/#Fixed-10","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"fix a typo in 
HestenesStiefelCoefficient","category":"page"},{"location":"changelog/#[0.4.3]-17/01/2023","page":"Changelog","title":"[0.4.3] - 17/01/2023","text":"","category":"section"},{"location":"changelog/#Fixed-11","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"the CG coefficient β can now be complex\nfix a bug in grad_distance","category":"page"},{"location":"changelog/#[0.4.2]-16/01/2023","page":"Changelog","title":"[0.4.2] - 16/01/2023","text":"","category":"section"},{"location":"changelog/#Changed-15","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"the usage of inner in linesearch methods, such that they work well with complex manifolds as well","category":"page"},{"location":"changelog/#[0.4.1]-15/01/2023","page":"Changelog","title":"[0.4.1] - 15/01/2023","text":"","category":"section"},{"location":"changelog/#Fixed-12","page":"Changelog","title":"Fixed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"a max_stepsize per manifold to avoid leaving the injectivity radius, which it also defaults to","category":"page"},{"location":"changelog/#[0.4.0]-10/01/2023","page":"Changelog","title":"[0.4.0] - 10/01/2023","text":"","category":"section"},{"location":"changelog/#Added-21","page":"Changelog","title":"Added","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"Dependency on ManifoldDiff.jl and a start of moving actual derivatives, differentials, and gradients there.\nAbstractManifoldObjective to store the objective within the AbstractManoptProblem\nIntroduce a CostGrad structure to store a function that computes the cost and gradient within one 
function.","category":"page"},{"location":"changelog/#Changed-16","page":"Changelog","title":"Changed","text":"","category":"section"},{"location":"changelog/","page":"Changelog","title":"Changelog","text":"AbstractManoptProblem replaces Problem\nthe problem now contains a\nAbstractManoptSolverState replaces Options\nrandom_point(M) is replaced by rand(M) from ManifoldsBase.jl\nrandom_tangent(M, p) is replaced by rand(M; vector_at=p)","category":"page"},{"location":"helpers/errorMeasures/#ErrorMeasures","page":"Error Measures","title":"Error Measures","text":"","category":"section"},{"location":"helpers/errorMeasures/","page":"Error Measures","title":"Error Measures","text":"meanSquaredError\nmeanAverageError","category":"page"},{"location":"helpers/errorMeasures/#Manopt.meanSquaredError","page":"Error Measures","title":"Manopt.meanSquaredError","text":"meanSquaredError(M, p, q)\n\nCompute the (mean) squared error between the two points p and q on the (power) manifold M.\n\n\n\n\n\n","category":"function"},{"location":"helpers/errorMeasures/#Manopt.meanAverageError","page":"Error Measures","title":"Manopt.meanAverageError","text":"meanAverageError(M,x,y)\n\nCompute the (mean) average error between the two points x and y on the PowerManifold manifold M.\n\n\n\n\n\n","category":"function"},{"location":"functions/adjoint_differentials/#adjointDifferentialFunctions","page":"Adjoint Differentials","title":"Adjoint Differentials","text":"","category":"section"},{"location":"functions/adjoint_differentials/","page":"Adjoint Differentials","title":"Adjoint Differentials","text":"Modules = [Manopt]\nPages = [\"adjoint_differentials.jl\"]","category":"page"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, AbstractVector, AbstractVector}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(\n 
M::AbstractManifold,\n T::AbstractVector,\n X::AbstractVector,\n)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::AbstractVector{<:BezierSegment},\n T::AbstractVector,\n X::AbstractVector,\n)\n\nEvaluate the adjoint of the differential with respect to the controlpoints at several times T. This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, Any, Any}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n t,\n X\n)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::AbstractVector{<:BezierSegment},\n B::AbstractVector{<:BezierSegment},\n t,\n X\n)\n\nevaluate the adjoint of the differential of a composite Bézier curve on the manifold M with respect to its control points b based on a points T=(t_i)_i=1^n that are pointwise in t_i01 on the curve and given corresponding tangential vectors X = (η_i)_i=1^n, η_iT_β(t_i)mathcal M This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, BezierSegment, AbstractVector, AbstractVector}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(\n M::AbstractManifold,\n b::BezierSegment,\n t::AbstractVector,\n X::AbstractVector,\n)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::BezierSegment,\n b::BezierSegment,\n t::AbstractVector,\n X::AbstractVector,\n)\n\nevaluate the adjoint of the differential of a Bézier curve on the manifold M with respect to its control points b based on a 
points T=(t_i)_i=1^n that are pointwise in t_i01 on the curve and given corresponding tangential vectors X = (η_i)_i=1^n, η_iT_β(t_i)mathcal M. This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve and Bergmann, Gousenbourger, Front. Appl. Math. Stat., 2018\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_bezier_control-Tuple{AbstractManifold, BezierSegment, Any, Any}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_bezier_control","text":"adjoint_differential_bezier_control(M::AbstractManifold, b::BezierSegment, t, η)\nadjoint_differential_bezier_control!(\n M::AbstractManifold,\n Y::BezierSegment,\n b::BezierSegment,\n t,\n η,\n)\n\nevaluate the adjoint of the differential of a Bézier curve on the manifold M with respect to its control points b based on a point t01 on the curve and a tangent vector ηT_β(t)mathcal M. This can be computed in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Manopt.adjoint_differential_forward_logs-Union{Tuple{TPR}, Tuple{TSize}, Tuple{TM}, Tuple{𝔽}, Tuple{PowerManifold{𝔽, TM, TSize, TPR}, Any, Any}} where {𝔽, TM, TSize, TPR}","page":"Adjoint Differentials","title":"Manopt.adjoint_differential_forward_logs","text":"Y = adjoint_differential_forward_logs(M, p, X)\nadjoint_differential_forward_logs!(M, Y, p, X)\n\nCompute the adjoint differential of forward_logs F occurring, in the power manifold array p, the differential of the function\n\nF_i(p) = sum_j mathcal I_i log_p_i p_j\n\nwhere i runs over all indices of the PowerManifold manifold M and mathcal I_i denotes the forward neighbors of i. Let n be the number of dimensions of the PowerManifold manifold (i.e. length(size(x))). Then the input tangent vector lies on the manifold mathcal M = mathcal M^n. 
The adjoint differential can be computed in place of Y.\n\nInput\n\nM – a PowerManifold manifold\np – an array of points on a manifold\nX – a tangent vector from the n-fold power of p, where n is the ndims of p\n\nOutput\n\nY – resulting tangent vector in T_pmathcal M representing the adjoint differentials of the logs.\n\n\n\n\n\n","category":"method"},{"location":"functions/adjoint_differentials/#Literature","page":"Adjoint Differentials","title":"Literature","text":"","category":"section"},{"location":"functions/adjoint_differentials/","page":"Adjoint Differentials","title":"Adjoint Differentials","text":"
[BG18]
\n
\n
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
\n
\n
","category":"page"},{"location":"solvers/gradient_descent/#GradientDescentSolver","page":"Gradient Descent","title":"Gradient Descent","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":" gradient_descent\n gradient_descent!","category":"page"},{"location":"solvers/gradient_descent/#Manopt.gradient_descent","page":"Gradient Descent","title":"Manopt.gradient_descent","text":"gradient_descent(M, f, grad_f, p=rand(M); kwargs...)\ngradient_descent(M, gradient_objective, p=rand(M); kwargs...)\n\nperform a gradient descent\n\np_k+1 = operatornameretr_p_kbigl( s_koperatornamegradf(p_k) bigr)\nqquad k=01\n\nwith different choices of the stepsize s_k available (see stepsize option below).\n\nInput\n\nM – a manifold mathcal M\nf – a cost function f mathcal Mℝ to find a minimizer p^* for\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of f\nas a function (M, p) -> X or a function (M, X, p) -> X\np – an initial value p = p_0 mathcal M\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nOptional\n\ndirection – (IdentityUpdateRule) perform a processing of the direction, e.g.\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form grad_f(M, p) or InplaceEvaluation in place, i.e. 
is of the form grad_f!(M, X, p).\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use\nstepsize – (ConstantStepsize(1.)) specify a Stepsize functor.\nstopping_criterion – (StopWhenAny(StopAfterIteration(200),StopWhenGradientNormLess(10.0^-8))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nIf you provide the ManifoldGradientObjective directly, evaluation is ignored.\n\nAll other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective, respectively. If you provide the ManifoldGradientObjective directly, these decorations can still be specified\n\nOutput\n\nthe obtained (approximate) minimizer p^*. To obtain the whole final state of the solver, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/gradient_descent/#Manopt.gradient_descent!","page":"Gradient Descent","title":"Manopt.gradient_descent!","text":"gradient_descent!(M, f, grad_f, p; kwargs...)\ngradient_descent!(M, gradient_objective, p; kwargs...)\n\nperform a gradient_descent\n\np_k+1 = operatornameretr_p_kbigl( s_koperatornamegradf(p_k) bigr)\n\nin place of p with different choices of s_k available.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\ngrad_f – the gradient operatornamegradFmathcal M Tmathcal M of F\np – an initial value p mathcal M\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nFor more options, especially Stepsizes for s_k, see gradient_descent\n\n\n\n\n\n","category":"function"},{"location":"solvers/gradient_descent/#State","page":"Gradient Descent","title":"State","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"GradientDescentState","category":"page"},{"location":"solvers/gradient_descent/#Manopt.GradientDescentState","page":"Gradient 
Descent","title":"Manopt.GradientDescentState","text":"GradientDescentState{P,T} <: AbstractGradientSolverState\n\nDescribes a Gradient based descent algorithm, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\np – (rand(M)) the current iterate\nX – (zero_vector(M,p)) the current gradient operatornamegradf(p), initialised to the zero vector.\nstopping_criterion – (StopAfterIteration(100)) a StoppingCriterion\nstepsize – (default_stepsize(M, GradientDescentState)) a Stepsize\ndirection - (IdentityUpdateRule) a processor to compute the gradient\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use, defaults to the default set for your manifold.\n\nConstructor\n\nGradientDescentState(M, p=rand(M); X=zero_vector(M, p), kwargs...)\n\nGenerate gradient descent options, where X can be used to set the tangent vector to store the gradient in a certain type; it will be initialised accordingly at a later stage. All following fields are keyword arguments.\n\nSee also\n\ngradient_descent\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Direction-Update-Rules","page":"Gradient Descent","title":"Direction Update Rules","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"A field of the options is the direction, a DirectionUpdateRule, which by default IdentityUpdateRule just evaluates the gradient but can be enhanced for example to","category":"page"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"DirectionUpdateRule\nIdentityUpdateRule\nMomentumGradient\nAverageGradient\nNesterov","category":"page"},{"location":"solvers/gradient_descent/#Manopt.DirectionUpdateRule","page":"Gradient Descent","title":"Manopt.DirectionUpdateRule","text":"DirectionUpdateRule\n\nA general functor, that handles direction update rules. 
Its field(s) are usually only a StoreStateAction by default initialized to the fields required for the specific coefficient, but can also be replaced by a (common, global) individual one that provides these values.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.IdentityUpdateRule","page":"Gradient Descent","title":"Manopt.IdentityUpdateRule","text":"IdentityUpdateRule <: DirectionUpdateRule\n\nThe default gradient direction update is the identity, i.e. it just evaluates the gradient.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.MomentumGradient","page":"Gradient Descent","title":"Manopt.MomentumGradient","text":"MomentumGradient <: DirectionUpdateRule\n\nAppend a momentum to a gradient processor, where the last direction and last iterate are stored and the new one is composed as η_i = m*η_i-1 - s d_i, where sd_i is the current (inner) direction and η_i-1 is the vector transported last direction multiplied by momentum m.\n\nFields\n\np_old - (rand(M)) remember the last iterate for parallel transporting the last direction\nmomentum – (0.2) factor for momentum\ndirection – internal DirectionUpdateRule to determine directions to add the momentum to.\nvector_transport_method – default_vector_transport_method(M, typeof(p)) vector transport method to use\nX_old – (zero_vector(M,x0)) the last gradient/direction update added as momentum\n\nConstructors\n\nAdd momentum to a gradient problem, where by default just a gradient evaluation is used\n\nMomentumGradient(\n M::AbstractManifold;\n p=rand(M),\n s::DirectionUpdateRule=IdentityUpdateRule();\n X=zero_vector(p.M, x0), momentum=0.2\n vector_transport_method=default_vector_transport_method(M, typeof(p)),\n)\n\nInitialize a momentum gradient rule to s. 
Note that the keyword arguments p and X will be overridden often, so their initialisation is meant to set them to certain types of points or tangent vectors, if you do not use the default types with respect to M.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.AverageGradient","page":"Gradient Descent","title":"Manopt.AverageGradient","text":"AverageGradient <: DirectionUpdateRule\n\nAdd an average of gradients to a gradient processor. A set of previous directions (from the inner processor) and the last iterate are stored, the average is taken after vector transporting them to the current iterate's tangent space.\n\nFields\n\ngradients – (fill(zero_vector(M,x0),n)) the last n gradient/direction updates\nlast_iterate – last iterate (needed to transport the gradients)\ndirection – internal DirectionUpdateRule to determine directions to apply the averaging to\nvector_transport_method - vector transport method to use\n\nConstructors\n\nAverageGradient(\n M::AbstractManifold,\n p::P=rand(M);\n n::Int=10\n s::DirectionUpdateRule=IdentityUpdateRule();\n gradients = fill(zero_vector(p.M, o.x),n),\n last_iterate = deepcopy(x0),\n vector_transport_method = default_vector_transport_method(M, typeof(p))\n)\n\nAdd average to a gradient problem, where\n\nn determines the size of averaging\ns is the internal DirectionUpdateRule to determine the gradients to store\ngradients can be prefilled with some history\nlast_iterate stores the last iterate\nvector_transport_method determines how to transport all gradients to the current iterate's tangent space before averaging\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.Nesterov","page":"Gradient Descent","title":"Manopt.Nesterov","text":"Nesterov <: DirectionUpdateRule\n\nFields\n\nγ\nμ the strong convexity coefficient\nv (==v_k, v_0=x_0) an interim point to compute the next gradient evaluation point y_k\nshrinkage (= i -> 0.8) a function to compute the shrinkage β_k per 
iterate.\n\nLet's assume f is L-Lipschitz and μ-strongly convex. Given\n\na step size h_kfrac1L (from the GradientDescentState)\na shrinkage parameter β_k\nand a current iterate x_k\nas well as the interim values γ_k and v_k from the previous iterate.\n\nThis computes a Nesterov type update using the following steps, see Zhang, Sra, Preprint, 2018\n\nCompute the positive root, i.e. α_k(01) of α^2 = h_kbigl((1-α_k)γ_k+α_k μbigr).\nSet bar γ_k+1 = (1-α_k)γ_k + α_kμ\ny_k = operatornameretr_x_kBigl(fracα_kγ_kγ_k + α_kμoperatornameretr^-1_x_kv_k Bigr)\nx_k+1 = operatornameretr_y_k(-h_k operatornamegradf(y_k))\nv_k+1 = operatornameretr_y_kBigl(frac(1-α_k)γ_kbarγ_koperatornameretr_y_k^-1(v_k) - fracα_kbar γ_k+1operatornamegradf(y_k) Bigr)\nγ_k+1 = frac11+β_kbar γ_k+1\n\nThen the direction from x_k to x_k+1, i.e. d = operatornameretr^-1_x_kx_k+1 is returned.\n\nConstructor\n\nNesterov(M::AbstractManifold, p::P; γ=0.001, μ=0.9, shrinkage = k -> 0.8;\n inverse_retraction_method=LogarithmicInverseRetraction())\n\nInitialize the Nesterov acceleration, where x0 initializes v.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Debug-Actions","page":"Gradient Descent","title":"Debug Actions","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"DebugGradient\nDebugGradientNorm\nDebugStepsize","category":"page"},{"location":"solvers/gradient_descent/#Manopt.DebugGradient","page":"Gradient Descent","title":"Manopt.DebugGradient","text":"DebugGradient <: DebugAction\n\ndebug for the gradient evaluated at the current iterate\n\nConstructors\n\nDebugGradient(; long=false, prefix= , format= \"$prefix%s\", io=stdout)\n\ndisplay the short (false) or long (true) default text for the gradient, or set the prefix manually. 
Alternatively the complete format can be set.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.DebugGradientNorm","page":"Gradient Descent","title":"Manopt.DebugGradientNorm","text":"DebugGradientNorm <: DebugAction\n\ndebug for the norm of the gradient evaluated at the current iterate.\n\nConstructors\n\nDebugGradientNorm([long=false,p=print])\n\ndisplay the short (false) or long (true) default text for the gradient norm.\n\nDebugGradientNorm(prefix[, p=print])\n\ndisplay a prefix in front of the gradient norm.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.DebugStepsize","page":"Gradient Descent","title":"Manopt.DebugStepsize","text":"DebugStepsize <: DebugAction\n\ndebug for the current step size.\n\nConstructors\n\nDebugStepsize(;long=false,prefix=\"step size:\", format=\"$prefix%s\", io=stdout)\n\ndisplay a prefix in front of the step size.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Record-Actions","page":"Gradient Descent","title":"Record Actions","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"RecordGradient\nRecordGradientNorm\nRecordStepsize","category":"page"},{"location":"solvers/gradient_descent/#Manopt.RecordGradient","page":"Gradient Descent","title":"Manopt.RecordGradient","text":"RecordGradient <: RecordAction\n\nrecord the gradient evaluated at the current iterate\n\nConstructors\n\nRecordGradient(ξ)\n\ninitialize the RecordAction to the corresponding type of the tangent vector.\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.RecordGradientNorm","page":"Gradient Descent","title":"Manopt.RecordGradientNorm","text":"RecordGradientNorm <: RecordAction\n\nrecord the norm of the current gradient\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Manopt.RecordStepsize","page":"Gradient 
Descent","title":"Manopt.RecordStepsize","text":"RecordStepsize <: RecordAction\n\nrecord the step size\n\n\n\n\n\n","category":"type"},{"location":"solvers/gradient_descent/#Literature","page":"Gradient Descent","title":"Literature","text":"","category":"section"},{"location":"solvers/gradient_descent/","page":"Gradient Descent","title":"Gradient Descent","text":"
[Lue72]
\n
\n
D. G. Luenberger. The gradient projection method along geodesics. Management Science 18, 620–631 (1972).
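The Nesterov-type steps listed in the Nesterov docstring above can be made concrete in the Euclidean special case, where retr_x(X) = x + X and retr^{-1}_x(y) = y - x. The following is a minimal illustrative sketch, not Manopt.jl code: the helper name nesterov_step and the parameter values are assumptions for this example, and both γ̄ denominators use the freshly computed γ̄ for simplicity.

```python
import math

def nesterov_step(grad_f, x, v, gamma, h, mu, beta):
    # Hypothetical helper: one Nesterov-type update on R (Euclidean case,
    # so retr_x(X) = x + X and retr_x^{-1}(y) = y - x).
    # Positive root alpha in (0,1) of alpha^2 = h*((1 - alpha)*gamma + alpha*mu):
    b = h * (gamma - mu)
    alpha = (-b + math.sqrt(b * b + 4 * h * gamma)) / 2
    gamma_bar = (1 - alpha) * gamma + alpha * mu
    y = x + (alpha * gamma) / (gamma + alpha * mu) * (v - x)
    g = grad_f(y)
    x_new = y - h * g  # gradient step from the auxiliary point y
    v_new = y + ((1 - alpha) * gamma / gamma_bar) * (v - y) - (alpha / gamma_bar) * g
    gamma_new = gamma_bar / (1 + beta)  # shrink gamma by the shrinkage parameter
    return x_new, v_new, gamma_new

# minimize f(x) = x^2 on R (L = 2, mu = 2, so h = 0.4 < 1/L is admissible)
grad_f = lambda x: 2 * x
x, v, gamma = 5.0, 5.0, 0.001
for _ in range(100):
    x, v, gamma = nesterov_step(grad_f, x, v, gamma, h=0.4, mu=2.0, beta=0.8)
```

On this strongly convex quadratic the iterates contract rapidly toward the minimizer 0.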
","category":"page"},{"location":"solvers/#SolversSection","page":"Introduction","title":"Solvers","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Solvers can be applied to AbstractManoptProblems with solver specific AbstractManoptSolverState.","category":"page"},{"location":"solvers/#List-of-Algorithms","page":"Introduction","title":"List of Algorithms","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"The following algorithms are currently available","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Solver Function & State Objective\nAlternating Gradient Descent alternating_gradient_descent AlternatingGradientDescentState f=(f_1ldotsf_n), operatornamegrad f_i\nChambolle-Pock ChambollePock, ChambollePockState (using TwoManifoldProblem) f=F+G(Λcdot), operatornameprox_σ F, operatornameprox_τ G^*, Λ\nConjugate Gradient Descent conjugate_gradient_descent, ConjugateGradientDescentState f, operatornamegrad f\nCyclic Proximal Point cyclic_proximal_point, CyclicProximalPointState f=sum f_i, operatornameprox_lambda f_i\nDifference of Convex Algorithm difference_of_convex_algorithm, DifferenceOfConvexState f=g-h, h, and e.g. g, operatornamegrad g\nDifference of Convex Proximal Point difference_of_convex_proximal_point, DifferenceOfConvexProximalState f=g-h, h, and e.g. 
g, operatornamegrad g\nDouglas–Rachford DouglasRachford, DouglasRachfordState f=sum f_i, operatornameprox_lambda f_i\nExact Penalty Method exact_penalty_method, ExactPenaltyMethodState f, operatornamegrad f, g, operatornamegrad g_i, h, operatornamegrad h_j\nFrank-Wolfe algorithm Frank_Wolfe_method, FrankWolfeState sub-problem solver\nGradient Descent gradient_descent, GradientDescentState f, operatornamegrad f\nLevenberg-Marquardt LevenbergMarquardt, LevenbergMarquardtState f = sum_i f_i operatornamegrad f_i (Jacobian)\nNelder-Mead NelderMead, NelderMeadState f\nAugmented Lagrangian Method augmented_Lagrangian_method, AugmentedLagrangianMethodState f, operatornamegrad f, g, operatornamegrad g_i, h, operatornamegrad h_j\nParticle Swarm particle_swarm, ParticleSwarmState f\nPrimal-dual Riemannian semismooth Newton Algorithm primal_dual_semismooth_Newton, PrimalDualSemismoothNewtonState (using TwoManifoldProblem) f=F+G(Λcdot), operatornameprox_σ F & diff., operatornameprox_τ G^* & diff., Λ\nQuasi-Newton Method quasi_Newton, QuasiNewtonState f, operatornamegrad f\nSteihaug-Toint Truncated Conjugate-Gradient Method truncated_conjugate_gradient_descent, TruncatedConjugateGradientState f, operatornamegrad f, operatornameHess f\nSubgradient Method subgradient_method, SubGradientMethodState f, f\nStochastic Gradient Descent stochastic_gradient_descent, StochasticGradientDescentState f = sum_i f_i, operatornamegrad f_i\nThe Riemannian Trust-Regions Solver trust_regions, TrustRegionsState f, operatornamegrad f, operatornameHess f","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Note that the solvers (their AbstractManoptSolverState, to be precise) can also be decorated to enhance your algorithm by general additional properties, see debug output and recording values. This is done using the debug= and record= keywords in the function calls. 
Similarly, since 0.4 we provide a (simple) caching of the objective function using the cache= keyword in any of the function calls.","category":"page"},{"location":"solvers/#Technical-Details","page":"Introduction","title":"Technical Details","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"The main function a solver calls is","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"solve!(p::AbstractManoptProblem, s::AbstractManoptSolverState)","category":"page"},{"location":"solvers/#Manopt.solve!-Tuple{AbstractManoptProblem, AbstractManoptSolverState}","page":"Introduction","title":"Manopt.solve!","text":"solve!(p::AbstractManoptProblem, s::AbstractManoptSolverState)\n\nrun the solver implemented for the AbstractManoptProblem p and the AbstractManoptSolverState s employing initialize_solver!, step_solver!, as well as the stop_solver! of the solver.\n\n\n\n\n\n","category":"method"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"which is a framework that you should in general not change or redefine. 
It uses the following methods, which also need to be implemented for your own algorithm, if you want to provide one.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"initialize_solver!\nstep_solver!\nget_solver_result\nget_solver_return\nstop_solver!(p::AbstractManoptProblem, s::AbstractManoptSolverState, Any)","category":"page"},{"location":"solvers/#Manopt.initialize_solver!","page":"Introduction","title":"Manopt.initialize_solver!","text":"initialize_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState)\n\nInitialize the solver to the optimization AbstractManoptProblem amp by initializing the necessary values in the AbstractManoptSolverState ams.\n\n\n\n\n\ninitialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)\n\nExtend the initialization of the solver by a hook to run the debug actions that were added to the :Start and :All entries of the debug lists.\n\n\n\n\n\ninitialize_solver!(amp::AbstractManoptProblem, rss::RecordSolverState)\n\nExtend the initialization of the solver by a hook to run the record actions that were added to the :Start entry.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.step_solver!","page":"Introduction","title":"Manopt.step_solver!","text":"step_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i)\n\nDo one iteration step (the ith) for an AbstractManoptProblem amp by modifying the values in the AbstractManoptSolverState ams.\n\n\n\n\n\nstep_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\n\nExtend the ith step of the solver by a hook to run debug prints that were added to the :Step and :All entries of the debug lists.\n\n\n\n\n\nstep_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, i)\n\nExtend the ith step of the solver by a hook to run records that were added to the :Iteration 
entry.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.get_solver_result","page":"Introduction","title":"Manopt.get_solver_result","text":"get_solver_result(ams::AbstractManoptSolverState)\nget_solver_result(tos::Tuple{AbstractManifoldObjective,AbstractManoptSolverState})\nget_solver_result(o::AbstractManifoldObjective, s::AbstractManoptSolverState)\n\nReturn the final result after all iterations that is stored within the AbstractManoptSolverState ams, which was modified during the iterations.\n\nFor the case that the objective is passed as well, by default the objective is ignored, and the solver result for the state is returned.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.get_solver_return","page":"Introduction","title":"Manopt.get_solver_return","text":"get_solver_return(s::AbstractManoptSolverState)\nget_solver_return(o::AbstractManifoldObjective, s::AbstractManoptSolverState)\n\ndetermine the result value of a call to a solver. By default this returns the same as get_solver_result, i.e. the last iterate or (approximate) minimizer.\n\nget_solver_return(s::ReturnSolverState)\nget_solver_return(o::AbstractManifoldObjective, s::ReturnSolverState)\n\nreturn the internally stored state of the ReturnSolverState instead of the minimizer. 
This means that when the state is decorated like this, the user still has to call get_solver_result on the internal state separately.\n\nget_solver_return(o::ReturnManifoldObjective, s::AbstractManoptSolverState)\n\nreturn both the objective and the state as a tuple.\n\n\n\n\n\n","category":"function"},{"location":"solvers/#Manopt.stop_solver!-Tuple{AbstractManoptProblem, AbstractManoptSolverState, Any}","page":"Introduction","title":"Manopt.stop_solver!","text":"stop_solver!(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i)\n\ndepending on the current AbstractManoptProblem amp, the current state of the solver stored in AbstractManoptSolverState ams and the current iterate i, this function determines whether to stop the solver, which by default means to call the internal StoppingCriterion ams.stop.\n\n\n\n\n\n","category":"method"},{"location":"solvers/#API-for-solvers","page":"Introduction","title":"API for solvers","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"This is a short overview of the different types of high-level functions that are usually available for a solver. Let's assume the solver is called new_solver and requires a cost f and some first order information df as well as a starting point p on M. f and df together form the objective, called obj.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Then there are basically two different variants to call","category":"page"},{"location":"solvers/#The-easy-to-access-call","page":"Introduction","title":"The easy to access call","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"new_solver(M, f, df, p=rand(M); kwargs...)\nnew_solver!(M, f, df, p; kwargs...)","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Here the start point should be optional. 
Keyword arguments include the type of evaluation, decorators like debug= or record= as well as algorithm specific ones. If you provide an immutable point p, or the rand(M) point is immutable, like on the Circle(), this method should turn the point into a mutable one as well.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"The second variant works in place of p, so there p is mandatory.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"This first interface would set up the objective and pass all keywords on to the objective-based call.","category":"page"},{"location":"solvers/#The-objective-based-call","page":"Introduction","title":"The objective-based call","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"new_solver(M, obj, p=rand(M); kwargs...)\nnew_solver!(M, obj, p; kwargs...)","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"Here the objective would be created beforehand, e.g. to compare different solvers on the same objective, and for the first variant the start point is optional. Keyword arguments include decorators like debug= or record= as well as algorithm specific ones.","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"This variant would generate the problem and the state and check validity of all provided keyword arguments that affect the state. 
Then it would call the iterative process.","category":"page"},{"location":"solvers/#The-manual-call","page":"Introduction","title":"The manual call","text":"","category":"section"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"If you generate the corresponding problem and state as the previous step does, you can also use the third (lowest level) variant and just call","category":"page"},{"location":"solvers/","page":"Introduction","title":"Introduction","text":"solve!(problem, state)","category":"page"},{"location":"functions/gradients/#GradientFunctions","page":"Gradients","title":"Gradients","text":"","category":"section"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"For a function f: ℳ → ℝ the Riemannian gradient grad f(x) at x ∈ ℳ is given by the unique tangent vector fulfilling","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"⟨grad f(x), ξ⟩_x = D_x f[ξ]  for all ξ ∈ T_x ℳ,","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"where D_x f[ξ] denotes the differential of f at x with respect to the tangent direction (vector) ξ, or in other words the directional derivative.","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"This page collects the available gradients.","category":"page"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"Modules = [Manopt]\nPages = [\"gradients.jl\"]","category":"page"},{"location":"functions/gradients/#Manopt.forward_logs-Union{Tuple{TPR}, Tuple{TSize}, Tuple{TM}, Tuple{𝔽}, Tuple{PowerManifold{𝔽, TM, TSize, TPR}, Any}} where {𝔽, TM, TSize, TPR}","page":"Gradients","title":"Manopt.forward_logs","text":"Y = forward_logs(M,x)\nforward_logs!(M, Y, x)\n\ncompute the forward logs F (generalizing forward differences) occurring in the power manifold array, i.e. the 
function\n\nF_i(x) = ∑_{j ∈ ℐ_i} log_{x_i} x_j,  i ∈ 𝒢,\n\nwhere 𝒢 is the set of indices of the PowerManifold manifold M and ℐ_i denotes the forward neighbors of i. This can also be done in place of Y.\n\nInput\n\nM – a PowerManifold manifold\nx – a point.\n\nOutput\n\nY – resulting tangent vector in T_x 𝒩 representing the logs, where 𝒩 is the power manifold with the number of dimensions added to size(x). The computation can be done in place of Y.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_L2_acceleration_bezier-Union{Tuple{P}, Tuple{AbstractManifold, AbstractVector{P}, AbstractVector{<:Integer}, AbstractVector, Any, AbstractVector{P}}} where P","page":"Gradients","title":"Manopt.grad_L2_acceleration_bezier","text":"grad_L2_acceleration_bezier(\n M::AbstractManifold,\n B::AbstractVector{P},\n degrees::AbstractVector{<:Integer},\n T::AbstractVector,\n λ,\n d::AbstractVector{P}\n) where {P}\n\ncompute the gradient of the discretized acceleration of a composite Bézier curve on the Manifold M with respect to its control points B together with a data term that relates the junction points p_i to the data d with a weight λ compared to the acceleration. The curve is evaluated at the points given in T (elementwise in [0,N]), where N is the number of segments of the Bézier curve. The summands are grad_distance for the data term and grad_acceleration_bezier for the acceleration with interpolation constraints. Here the get_bezier_junctions are included in the optimization, i.e. setting λ=0 yields the unconstrained acceleration minimization. 
Note that this is ill-posed, since any Bézier curve identical to a geodesic is a minimizer.\n\nNote that the Bézier curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments; assuming a differentiable curve, the segments can internally be reconstructed.\n\nSee also\n\ngrad_acceleration_bezier, cost_L2_acceleration_bezier, cost_acceleration_bezier.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_TV","page":"Gradients","title":"Manopt.grad_TV","text":"X = grad_TV(M, λ, x[, p=1])\ngrad_TV!(M, X, λ, x[, p=1])\n\nCompute the (sub)gradient ∂F of all forward differences occurring in the power manifold array, i.e. of the function\n\nF(x) = ∑_i ∑_{j ∈ ℐ_i} d^p(x_i, x_j),\n\nwhere i runs over all indices of the PowerManifold manifold M and ℐ_i denotes the forward neighbors of i.\n\nInput\n\nM – a PowerManifold manifold\nx – a point.\n\nOutput\n\nX – resulting tangent vector in T_x ℳ. The computation can also be done in place.\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_TV-Union{Tuple{T}, Tuple{AbstractManifold, Tuple{T, T}}, Tuple{AbstractManifold, Tuple{T, T}, Any}} where T","page":"Gradients","title":"Manopt.grad_TV","text":"X = grad_TV(M, (x,y)[, p=1])\ngrad_TV!(M, X, (x,y)[, p=1])\n\ncompute the (sub)gradient of (1/p) d^p_ℳ(x,y) with respect to both x and y (in place of X and Y).\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_TV2","page":"Gradients","title":"Manopt.grad_TV2","text":"grad_TV2(M::PowerManifold, q[, p=1])\n\ncomputes the (sub)gradient of (1/p) d_2^p(q_1, q_2, q_3) with respect to all q_1, q_2, q_3 occurring along any array dimension in the point q, where M is the corresponding PowerManifold.\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_TV2-2","page":"Gradients","title":"Manopt.grad_TV2","text":"Y = grad_TV2(M, q[, p=1])\ngrad_TV2!(M, Y, q[, 
p=1])\n\ncomputes the (sub)gradient of (1/p) d_2^p(q_1, q_2, q_3) with respect to all three components of q ∈ ℳ^3, where d_2 denotes the second order absolute difference using the mid point model, i.e. let\n\n𝒞 = { c ∈ ℳ : c = g(1/2; q_1, q_3) for some geodesic g }\n\ndenote the mid points between q_1 and q_3 on the manifold ℳ. Then the absolute second order difference is defined as\n\nd_2(q_1, q_2, q_3) = min_{c ∈ 𝒞_{q_1,q_3}} d(c, q_2).\n\nWhile the (sub)gradient with respect to q_2 is easy, the other two require the evaluation of an adjoint_Jacobi_field.\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_acceleration_bezier-Tuple{AbstractManifold, AbstractVector, AbstractVector{<:Integer}, AbstractVector}","page":"Gradients","title":"Manopt.grad_acceleration_bezier","text":"grad_acceleration_bezier(\n M::AbstractManifold,\n B::AbstractVector,\n degrees::AbstractVector{<:Integer},\n T::AbstractVector\n)\n\ncompute the gradient of the discretized acceleration of a (composite) Bézier curve c_B(t) on the Manifold M with respect to its control points B given as a point on the PowerManifold assuming C1 conditions and known degrees. The curve is evaluated at the points given in T (elementwise in [0,N], where N is the number of segments of the Bézier curve). The get_bezier_junctions are fixed for this gradient (interpolation constraint). For the unconstrained gradient, see grad_L2_acceleration_bezier and set λ=0 therein. This gradient is computed using adjoint_Jacobi_fields. For details, see Bergmann, Gousenbourger, Front. Appl. Math. Stat., 2018. 
See de_casteljau for more details on the curve.\n\nSee also\n\ncost_acceleration_bezier, grad_L2_acceleration_bezier, cost_L2_acceleration_bezier.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Manopt.grad_distance","page":"Gradients","title":"Manopt.grad_distance","text":"grad_distance(M,y,x[, p=2])\ngrad_distance!(M,X,y,x[, p=2])\n\ncompute the (sub)gradient of the distance (squared), in place of X,\n\nf(x) = (1/p) d^p_ℳ(x, y),\n\nto a fixed point y on the manifold M, where p is an integer. The gradient reads\n\ngrad f(x) = -d_ℳ^{p-2}(x, y) log_x y\n\nfor p ≠ 1 or x ≠ y. Note that for the remaining case p=1, x=y the function is not differentiable. In this case, the function returns the corresponding zero tangent vector, since this is an element of the subdifferential.\n\nOptional\n\np – (2) the exponent of the distance, i.e. the default is the squared distance\n\n\n\n\n\n","category":"function"},{"location":"functions/gradients/#Manopt.grad_intrinsic_infimal_convolution_TV12-Tuple{AbstractManifold, Vararg{Any, 5}}","page":"Gradients","title":"Manopt.grad_intrinsic_infimal_convolution_TV12","text":"grad_u, grad_v = grad_intrinsic_infimal_convolution_TV12(M, f, u, v, α, β)\n\ncompute the (sub)gradient of the intrinsic infimal convolution model using the mid point model of second order differences, see costTV2, i.e. for some f ∈ ℳ on a PowerManifold manifold ℳ this function computes the (sub)gradient of\n\nE(u,v) = (1/2) ∑_{i ∈ 𝒢} d_ℳ(g(1/2; v_i, w_i), f_i) + α ( β TV(v) + (1-β) TV_2(w) ),\n\nwhere both total variations refer to the intrinsic ones, grad_TV and grad_TV2, respectively.\n\n\n\n\n\n","category":"method"},{"location":"functions/gradients/#Literature","page":"Gradients","title":"Literature","text":"","category":"section"},{"location":"functions/gradients/","page":"Gradients","title":"Gradients","text":"
[BG18]
\n
\n
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
\n
\n
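The grad_distance formula documented above, grad f(x) = -d^{p-2}(x,y) log_x y for f(x) = (1/p) d^p(x,y), specializes on Euclidean space to log_x y = y - x and d(x,y) = |x - y|. A minimal sketch under that assumption (plain Python, not the Manopt.jl implementation), including the zero-subgradient convention at p=1, x=y:

```python
def grad_distance(y, x, p=2):
    # Euclidean sketch: grad f(x) = -d^{p-2}(x, y) * log_x y with log_x y = y - x
    d = abs(x - y)
    if p == 1 and d == 0.0:
        return 0.0  # zero tangent vector, an element of the subdifferential
    return -(d ** (p - 2)) * (y - x)

# p = 2: gradient of 0.5*|x - y|^2 is x - y
print(grad_distance(1.0, 3.0))  # -> 2.0

# finite-difference check of f(x) = (1/p)|x - y|^p at p = 3
f = lambda x: (1 / 3) * abs(x - 1.0) ** 3
eps = 1e-6
num = (f(3.0 + eps) - f(3.0 - eps)) / (2 * eps)
```

The numerical derivative agrees with the closed form up to discretization error.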
","category":"page"},{"location":"extensions/#Extensions","page":"Extensions","title":"Extensions","text":"","category":"section"},{"location":"extensions/#LineSearches.jl","page":"Extensions","title":"LineSearches.jl","text":"","category":"section"},{"location":"extensions/","page":"Extensions","title":"Extensions","text":"Manopt can be used with line search algorithms implemented in LineSearches.jl. This can be illustrated by the following example of optimizing the Rosenbrock function constrained to the unit sphere.","category":"page"},{"location":"extensions/","page":"Extensions","title":"Extensions","text":"using Manopt, Manifolds, LineSearches\n\n# define objective function and its gradient\np = [1.0, 100.0]\nfunction rosenbrock(::AbstractManifold, x)\n val = zero(eltype(x))\n for i in 1:(length(x) - 1)\n val += (p[1] - x[i])^2 + p[2] * (x[i + 1] - x[i]^2)^2\n end\n return val\nend\nfunction rosenbrock_grad!(M::AbstractManifold, storage, x)\n storage .= 0.0\n for i in 1:(length(x) - 1)\n storage[i] += -2.0 * (p[1] - x[i]) - 4.0 * p[2] * (x[i + 1] - x[i]^2) * x[i]\n storage[i + 1] += 2.0 * p[2] * (x[i + 1] - x[i]^2)\n end\n project!(M, storage, x, storage)\n return storage\nend\n# define constraint\nn_dims = 5\nM = Manifolds.Sphere(n_dims)\n# set initial point\nx0 = vcat(zeros(n_dims - 1), 1.0)\n# use LineSearches.jl HagerZhang method with Manopt.jl quasiNewton solver\nls_hz = Manopt.LineSearchesStepsize(M, LineSearches.HagerZhang())\nx_opt = quasi_Newton(\n M,\n rosenbrock,\n rosenbrock_grad!,\n x0;\n stepsize=ls_hz,\n evaluation=InplaceEvaluation(),\n stopping_criterion=StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6),\n return_state=true,\n)","category":"page"},{"location":"extensions/#Manifolds.jl","page":"Extensions","title":"Manifolds.jl","text":"","category":"section"},{"location":"extensions/","page":"Extensions","title":"Extensions","text":"Manopt.LineSearchesStepsize\nmid_point\nManopt.max_stepsize(::TangentBundle, 
::Any)\nManopt.max_stepsize(::FixedRankMatrices, ::Any)","category":"page"},{"location":"extensions/#Manopt.LineSearchesStepsize","page":"Extensions","title":"Manopt.LineSearchesStepsize","text":"LineSearchesStepsize <: Stepsize\n\nWrapper for line searches available in the LineSearches.jl library.\n\nConstructors\n\nLineSearchesStepsize(\n M::AbstractManifold,\n linesearch;\n retraction_method::AbstractRetractionMethod=default_retraction_method(M),\n vector_transport_method::AbstractVectorTransportMethod=default_vector_transport_method(M),\n)\nLineSearchesStepsize(\n linesearch;\n retraction_method::AbstractRetractionMethod=ExponentialRetraction(),\n vector_transport_method::AbstractVectorTransportMethod=ParallelTransport(),\n)\n\nWrap linesearch (for example HagerZhang or MoreThuente). The initial step selection from LineSearches.jl is not yet supported and the value 1.0 is used. The retraction used for determining the line along which the search is performed can be provided as retraction_method. Gradient vectors are transported between points using vector_transport_method.\n\n\n\n\n\n","category":"type"},{"location":"extensions/#ManifoldsBase.mid_point","page":"Extensions","title":"ManifoldsBase.mid_point","text":"mid_point(M, p, q, x)\nmid_point!(M, y, p, q, x)\n\nCompute the mid point between p and q. If there is more than one mid point of (not necessarily minimizing) geodesics (e.g. on the sphere), the one nearest to x is returned (in place of y).\n\n\n\n\n\n","category":"function"},{"location":"extensions/#Manopt.max_stepsize-Tuple{FiberBundle{𝔽, ManifoldsBase.TangentSpaceType, M} where {𝔽, M<:AbstractManifold{𝔽}}, Any}","page":"Extensions","title":"Manopt.max_stepsize","text":"max_stepsize(M::TangentBundle, p)\n\nThe tangent bundle has injectivity radius of either infinity (for flat manifolds) or 0 (for non-flat manifolds). 
This makes a guess of what a reasonable maximum stepsize on a tangent bundle might be.\n\n\n\n\n\n","category":"method"},{"location":"extensions/#Manopt.max_stepsize-Tuple{FixedRankMatrices, Any}","page":"Extensions","title":"Manopt.max_stepsize","text":"max_stepsize(M::FixedRankMatrices, p)\n\nReturn a reasonable guess of maximum step size on FixedRankMatrices following the choice of typical distance in Matlab Manopt, i.e. the dimension of M. See this note\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#BezierCurves","page":"Bézier curves","title":"Bézier curves","text":"","category":"section"},{"location":"functions/bezier/","page":"Bézier curves","title":"Bézier curves","text":"Modules = [Manopt]\nPages = [\"bezier_curves.jl\"]","category":"page"},{"location":"functions/bezier/#Manopt.BezierSegment","page":"Bézier curves","title":"Manopt.BezierSegment","text":"BezierSegment\n\nA type to capture a Bézier segment. With n points, a Bézier segment of degree n-1 is stored. On the Euclidean manifold, this yields a polynomial of degree n-1.\n\nThis type is mainly used to encapsulate the points within a composite Bézier curve, which consists of an AbstractVector of BezierSegments where each of the points might be a nested array on a PowerManifold already.\n\nNote that this can also be used to represent tangent vectors on the control points of a segment.\n\nSee also: de_casteljau.\n\nConstructor\n\nBezierSegment(pts::AbstractVector)\n\nGiven an abstract vector of pts generate the corresponding Bézier segment.\n\n\n\n\n\n","category":"type"},{"location":"functions/bezier/#Manopt.de_casteljau-Tuple{AbstractManifold, Vararg{Any}}","page":"Bézier curves","title":"Manopt.de_casteljau","text":"de_casteljau(M::AbstractManifold, b::BezierSegment NTuple{N,P}) -> Function\n\nreturn the Bézier curve β(⋅; b_0,…,b_n): [0,1] → ℳ defined by the control points b_0,…,b_n ∈ ℳ, n ∈ ℕ, as a BezierSegment. 
This function implements de Casteljau's algorithm Casteljau, 1959, Casteljau, 1963 generalized to manifolds by Popiel, Noakes, J Approx Theo, 2007: Let γ_{a,b}(t) denote the shortest geodesic connecting a, b ∈ ℳ. Then the curve is defined by the recursion\n\nβ(t; b_0, b_1) = γ_{b_0,b_1}(t)\nβ(t; b_0,…,b_n) = γ_{β(t; b_0,…,b_{n-1}), β(t; b_1,…,b_n)}(t),\n\nand P is the type of a point on the Manifold M.\n\nde_casteljau(M::AbstractManifold, B::AbstractVector{<:BezierSegment}) -> Function\n\nGiven a vector of Bézier segments, i.e. a vector of control points B = ((b_{0,0},…,b_{n_0,0}), …, (b_{0,m},…,b_{n_m,m})), where the different segments might be of different degree(s) n_0,…,n_m. The resulting composite Bézier curve c_B: [0,m] → ℳ consists of m segments which are Bézier curves.\n\nc_B(t) = β(t; b_{0,0},…,b_{n_0,0}) if t ∈ [0,1], and\nc_B(t) = β(t-i; b_{0,i},…,b_{n_i,i}) if t ∈ (i,i+1], i ∈ {1,…,m-1}.\n\nde_casteljau(M::AbstractManifold, b::BezierSegment, t::Real)\nde_casteljau(M::AbstractManifold, B::AbstractVector{<:BezierSegment}, t::Real)\nde_casteljau(M::AbstractManifold, b::BezierSegment, T::AbstractVector) -> AbstractVector\nde_casteljau(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n T::AbstractVector\n) -> AbstractVector\n\nEvaluate the Bézier curve at time t or at times t in T.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_degree-Tuple{AbstractManifold, BezierSegment}","page":"Bézier curves","title":"Manopt.get_bezier_degree","text":"get_bezier_degree(M::AbstractManifold, b::BezierSegment)\n\nreturn the degree of the Bézier curve represented by the tuple b of control points on the manifold M, i.e. 
the number of points minus 1.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_degrees-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}}","page":"Bézier curves","title":"Manopt.get_bezier_degrees","text":"get_bezier_degrees(M::AbstractManifold, B::AbstractVector{<:BezierSegment})\n\nreturn the degrees of the components of a composite Bézier curve represented by tuples in B containing points on the manifold M.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_inner_points-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}}","page":"Bézier curves","title":"Manopt.get_bezier_inner_points","text":"get_bezier_inner_points(M::AbstractManifold, B::AbstractVector{<:BezierSegment} )\nget_bezier_inner_points(M::AbstractManifold, b::BezierSegment)\n\nreturns the inner points (i.e. all except the start and end points) of the segments of the composite Bézier curve specified by the control points B. For a single segment b, its inner points are returned.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_junction_tangent_vectors-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}}","page":"Bézier curves","title":"Manopt.get_bezier_junction_tangent_vectors","text":"get_bezier_junction_tangent_vectors(M::AbstractManifold, B::AbstractVector{<:BezierSegment})\nget_bezier_junction_tangent_vectors(M::AbstractManifold, b::BezierSegment)\n\nreturns the tangent vectors at the start and end points of the composite Bézier curve pointing from a junction point to the first and last inner control points for each segment of the composite Bézier curve specified by the control points B, given either as a vector of segments of control points or a single segment.\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Manopt.get_bezier_junctions","page":"Bézier curves","title":"Manopt.get_bezier_junctions","text":"get_bezier_junctions(M::AbstractManifold, 
B::AbstractVector{<:BezierSegment})\nget_bezier_junctions(M::AbstractManifold, b::BezierSegment)\n\nreturns the start and end point(s) of the segments of the composite Bézier curve specified by the control points B. For just one segment b, its start and end points are returned.\n\n\n\n\n\n","category":"function"},{"location":"functions/bezier/#Manopt.get_bezier_points","page":"Bézier curves","title":"Manopt.get_bezier_points","text":"get_bezier_points(\n M::AbstractManifold,\n B::AbstractVector{<:BezierSegment},\n reduce::Symbol=:default\n)\nget_bezier_points(M::AbstractManifold, b::BezierSegment, reduce::Symbol=:default)\n\nreturns the control points of the segments of the composite Bézier curve specified by the control points B, given either as a vector of segments of control points or a single segment.\n\nThis method reduces the points depending on the optional reduce symbol\n\n:default – no reduction is performed\n:continuous – for a continuous function, the junction points are doubled at b_{0,i} = b_{n_{i-1},i-1}, so only b_{0,i} is kept in the vector.\n:differentiable – for a differentiable function, additionally log_{b_{0,i}} b_{1,i} = -log_{b_{n_{i-1},i-1}} b_{n_{i-1}-1,i-1} holds. Hence b_{n_{i-1}-1,i-1} is omitted.\n\nIf only one segment is given, all points of b – i.e. b.pts – are returned.\n\n\n\n\n\n","category":"function"},{"location":"functions/bezier/#Manopt.get_bezier_segments-Union{Tuple{P}, Tuple{AbstractManifold, Vector{P}, Any}, Tuple{AbstractManifold, Vector{P}, Any, Symbol}} where P","page":"Bézier curves","title":"Manopt.get_bezier_segments","text":"get_bezier_segments(M::AbstractManifold, c::AbstractArray{P}, d[, s::Symbol=:default])\n\nreturns the array of BezierSegments B of a composite Bézier curve reconstructed from an array c of points on the manifold M and an array of degrees d.\n\nThere are a few (reduced) representations that can get extended; see also get_bezier_points. 
For ease of the following, let c=(c_1c_k) and d=(d_1d_m), where m denotes the number of components the composite Bézier curve consists of. Then\n\n:default – k = m + sum_i=1^m d_i since each component requires one point more than its degree. The points are then ordered in tuples, i.e.\nB = bigl c_1c_d_1+1 (c_d_1+2c_d_1+d_2+2 c_k-m+1+d_mc_k bigr\n:continuous – k = 1+ sum_i=1m d_i, since for a continuous curve start and end point of successive components are the same, so the very first start point and the end points are stored.\nB = bigl c_1c_d_1+1 c_d_1+1c_d_1+d_2+1 c_k-1+d_mb_k) bigr\n:differentiable – for a differentiable function additionally to the last explanation, also the second point of any segment was not stored except for the first segment. Hence k = 2 - m + sum_i=1m d_i and at a junction point b_n with its given prior point c_n-1, i.e. this is the last inner point of a segment, the first inner point in the next segment the junction is computed as b = exp_c_n(-log_c_n c_n-1) such that the assumed differentiability holds\n\n\n\n\n\n","category":"method"},{"location":"functions/bezier/#Literature","page":"Bézier curves","title":"Literature","text":"","category":"section"},{"location":"functions/bezier/","page":"Bézier curves","title":"Bézier curves","text":"
[Cas59]
\n
\n
P. de Casteljau. Outillage méthodes calcul. Enveloppe Soleau 40.040, Institut National de la Propriété Industrielle, Paris. (1959).
\n
[Cas63]
\n
\n
P. de Casteljau. Courbes et surfaces à pôles. Microfiche P 4147-1, Institut National de la Propriété Industrielle, Paris. (1963).
","category":"page"},{"location":"solvers/subgradient/#SubgradientSolver","page":"Subgradient method","title":"Subgradient Method","text":"","category":"section"},{"location":"solvers/subgradient/","page":"Subgradient method","title":"Subgradient method","text":"subgradient_method\nsubgradient_method!","category":"page"},{"location":"solvers/subgradient/#Manopt.subgradient_method","page":"Subgradient method","title":"Manopt.subgradient_method","text":"subgradient_method(M, f, ∂f, p; kwargs...)\nsubgradient_method(M, sgo, p; kwargs...)\n\nperform a subgradient method p_k+1 = mathrmretr(p_k s_k∂f(p_k)),\n\nwhere mathrmretr is a retraction and s_k is a step size, usually the ConstantStepsize, but it can also be specified. While the subgradient might be set-valued, the argument ∂f should always return one element from the subgradient, though not necessarily deterministically.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\n∂f – the (sub)gradient partial f mathcal M Tmathcal M of f restricted to always only returning one value/element from the subdifferential. This function can be passed as an allocating function (M, p) -> X or a mutating function (M, X, p) -> X, see evaluation.\np – an initial value p_0=p mathcal M\n\nalternatively to f and ∂f a ManifoldSubgradientObjective sgo can be provided.\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the subgradient works by allocation (default) in the form ∂f(M, y) or InplaceEvaluation in place, i.e. is of the form ∂f!(M, X, x).\nstepsize – (ConstantStepsize(M)) specify a Stepsize\nretraction – (default_retraction_method(M, typeof(p))) a retraction to use.\nstopping_criterion – (StopAfterIteration(5000)) a functor, see StoppingCriterion, indicating when to stop.\n\nand the ones that are passed to decorate_state! 
for decorators.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/subgradient/#Manopt.subgradient_method!","page":"Subgradient method","title":"Manopt.subgradient_method!","text":"subgradient_method!(M, f, ∂f, p)\nsubgradient_method!(M, sgo, p)\n\nperform a subgradient method p_k+1 = mathrmretr(p_k s_k∂f(p_k)) in place of p.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\n∂f – the (sub)gradient partial f mathcal M Tmathcal M of f restricted to always only returning one value/element from the subdifferential. This function can be passed as an allocating function (M, p) -> X or a mutating function (M, X, p) -> X, see evaluation.\np – an initial value p_0=p mathcal M\n\nalternatively to f and ∂f a ManifoldSubgradientObjective sgo can be provided.\n\nfor more details and all optional parameters, see subgradient_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/subgradient/#State","page":"Subgradient method","title":"State","text":"","category":"section"},{"location":"solvers/subgradient/","page":"Subgradient method","title":"Subgradient method","text":"SubGradientMethodState","category":"page"},{"location":"solvers/subgradient/#Manopt.SubGradientMethodState","page":"Subgradient method","title":"Manopt.SubGradientMethodState","text":"SubGradientMethodState <: AbstractManoptSolverState\n\nstores option values for a subgradient_method solver\n\nFields\n\nretraction_method – the retraction to use within the algorithm\nstepsize – (ConstantStepsize(M)) a Stepsize\nstop – (StopAfterIteration(5000)) a StoppingCriterion\np – (initial or current) value the algorithm is at\np_star – optimal value (initialized to a copy of p)\nX - (zero_vector(M, p)) the current element from the possible subgradients at p that was last evaluated.\n\nConstructor\n\nSubGradientMethodState(M::AbstractManifold, p; kwargs...)\n\nwith keywords for all fields above besides p_star 
which obtains the same type as p. You can use e.g. X= to specify the type of tangent vector to use.\n\n\n\n\n\n","category":"type"},{"location":"solvers/subgradient/","page":"Subgradient method","title":"Subgradient method","text":"For DebugActions and RecordActions to record the (sub)gradient, its norm and the step sizes, see the steepest descent actions.","category":"page"},{"location":"functions/#Functions","page":"Introduction","title":"Functions","text":"","category":"section"},{"location":"functions/","page":"Introduction","title":"Introduction","text":"There are several functions required within optimization, most prominently cost functions and gradients. This package includes several cost functions and corresponding gradients, but also corresponding proximal maps for variational methods for manifold-valued data. Most of these functions require the evaluation of Differentials or their adjoints.","category":"page"},{"location":"functions/differentials/#DifferentialFunctions","page":"Differentials","title":"Differentials","text":"","category":"section"},{"location":"functions/differentials/","page":"Differentials","title":"Differentials","text":"Modules = [Manopt]\nPages = [\"functions/differentials.jl\"]","category":"page"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, AbstractVector, AbstractVector{<:BezierSegment}}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(\n    M::AbstractManifold,\n    B::AbstractVector{<:BezierSegment},\n    T::AbstractVector,\n    Ξ::AbstractVector{<:BezierSegment}\n)\ndifferential_bezier_control!(\n    M::AbstractManifold,\n    Θ::AbstractVector{<:BezierSegment},\n    B::AbstractVector{<:BezierSegment},\n    T::AbstractVector,\n    Ξ::AbstractVector{<:BezierSegment}\n)\n\nevaluate the differential of the composite Bézier curve with respect to its control points B and tangent vectors Ξ in the tangent spaces of the control points. 
The result is the “change” of the curve at the points in T, which are elementwise in 0N, each depending on the corresponding segment(s). Here, N is the length of B. For the mutating variant the result is computed in Θ.\n\nSee de_casteljau for more details on the curve and Bergmann, Gousenbourger, Front. Appl. Math. Stat., 2018.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, AbstractVector{<:BezierSegment}, Any, AbstractVector{<:BezierSegment}}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(\n    M::AbstractManifold,\n    B::AbstractVector{<:BezierSegment},\n    t,\n    X::AbstractVector{<:BezierSegment}\n)\ndifferential_bezier_control!(\n    M::AbstractManifold,\n    Y::AbstractVector{<:BezierSegment},\n    B::AbstractVector{<:BezierSegment},\n    t,\n    X::AbstractVector{<:BezierSegment}\n)\n\nevaluate the differential of the composite Bézier curve with respect to its control points B and tangent vectors X in the tangent spaces of the control points. The result is the “change” of the curve at t0N, which depends only on the corresponding segment. Here, N is the length of B. The computation can be done in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, BezierSegment, AbstractVector, BezierSegment}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(\n    M::AbstractManifold,\n    b::BezierSegment,\n    T::AbstractVector,\n    X::BezierSegment,\n)\ndifferential_bezier_control!(\n    M::AbstractManifold,\n    Y,\n    b::BezierSegment,\n    T::AbstractVector,\n    X::BezierSegment,\n)\n\nevaluate the differential of the Bézier curve with respect to its control points b and tangent vectors X in the tangent spaces of the control points. 
The result is the “change” of the curve at the points T, elementwise in t01. The computation can be done in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_bezier_control-Tuple{AbstractManifold, BezierSegment, Any, BezierSegment}","page":"Differentials","title":"Manopt.differential_bezier_control","text":"differential_bezier_control(M::AbstractManifold, b::BezierSegment, t::Float, X::BezierSegment)\ndifferential_bezier_control!(\n    M::AbstractManifold,\n    Y,\n    b::BezierSegment,\n    t,\n    X::BezierSegment\n)\n\nevaluate the differential of the Bézier curve with respect to its control points b and tangent vectors X given in the tangent spaces of the control points. The result is the “change” of the curve at t01. The computation can be done in place of Y.\n\nSee de_casteljau for more details on the curve.\n\n\n\n\n\n","category":"method"},{"location":"functions/differentials/#Manopt.differential_forward_logs-Tuple{PowerManifold, Any, Any}","page":"Differentials","title":"Manopt.differential_forward_logs","text":"Y = differential_forward_logs(M, p, X)\ndifferential_forward_logs!(M, Y, p, X)\n\ncompute the differential of forward_logs F on the PowerManifold manifold M at p and direction X, i.e. in the power manifold array, the differential of the function\n\nF_i(x) = sum_j mathcal I_i log_p_i p_j quad i mathcal G\n\nwhere mathcal G is the set of indices of the PowerManifold manifold M and mathcal I_i denotes the forward neighbors of i.\n\nInput\n\nM – a PowerManifold manifold\np – a point.\nX – a tangent vector.\n\nOutput\n\nY – resulting tangent vector in T_xmathcal N representing the differentials of the logs, where mathcal N is the power manifold with the number of dimensions added to size(x). 
The computation can also be done in place.\n\n\n\n\n\n","category":"method"},{"location":"solvers/augmented_Lagrangian_method/#AugmentedLagrangianSolver","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":" augmented_Lagrangian_method\n augmented_Lagrangian_method!","category":"page"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.augmented_Lagrangian_method","page":"Augmented Lagrangian Method","title":"Manopt.augmented_Lagrangian_method","text":"augmented_Lagrangian_method(M, f, grad_f, p=rand(M); kwargs...)\naugmented_Lagrangian_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)\n\nperform the augmented Lagrangian method (ALM) Liu, Boumal, 2019, Appl. Math. Optim. The aim of the ALM is to find the solution of the constrained optimisation task\n\nbeginaligned\nmin_p mathcalM f(p)\ntextsubject to g_i(p)leq 0 quad text for i= 1 m\nquad h_j(p)=0 quad text for j=1n\nendaligned\n\nwhere M is a Riemannian manifold, and f, g_i_i=1^m and h_j_j=1^n are twice continuously differentiable functions from M to ℝ. 
In every step k of the algorithm, the AugmentedLagrangianCost mathcalL_ρ^(k-1)(p μ^(k-1) λ^(k-1)) is minimized on mathcalM, where μ^(k-1) in mathbb R^m and λ^(k-1) in mathbb R^n are the current iterates of the Lagrange multipliers and ρ^(k-1) is the current penalty parameter.\n\nThe Lagrange multipliers are then updated by\n\nλ_j^(k) =operatornameclip_λ_minλ_max (λ_j^(k-1) + ρ^(k-1) h_j(p^(k))) textfor all j=1n\n\nand\n\nμ_i^(k) =operatornameclip_0μ_max (μ_i^(k-1) + ρ^(k-1) g_i(p^(k))) text for all i=1m\n\nwhere λ_min leq λ_max and μ_max are the multiplier boundaries.\n\nNext, we update the accuracy tolerance ϵ by setting\n\nϵ^(k)=maxϵ_min θ_ϵ ϵ^(k-1)\n\nwhere ϵ_min is the lowest value ϵ is allowed to become and θ_ϵ (01) is a constant scaling factor.\n\nLast, we update the penalty parameter ρ. For this, we define\n\nσ^(k)=max_j=1n i=1m h_j(p^(k)) max_i=1mg_i(p^(k)) -fracμ_i^(k-1)ρ^(k-1) \n\nThen, we update ρ according to\n\nρ^(k) = begincases\nρ^(k-1)θ_ρ textif σ^(k)leq θ_ρ σ^(k-1) \nρ^(k-1) textelse\nendcases\n\nwhere θ_ρ in (01) is a constant scaling factor.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\ngrad_f – the gradient of the cost function\n\nOptional (if not called with the ConstrainedManifoldObjective cmo)\n\ng – (nothing) the inequality constraints\nh – (nothing) the equality constraints\ngrad_g – (nothing) the gradient of the inequality constraints\ngrad_h – (nothing) the gradient of the equality constraints\n\nNote that one of the pairs (g, grad_g) or (h, grad_h) has to be provided. Otherwise the problem is not constrained and you can also call e.g. 
quasi_Newton\n\nOptional\n\nϵ – (1e-3) the accuracy tolerance\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nϵ_exponent – (1/100) exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturally\nθ_ϵ – ((ϵ_min / ϵ)^(ϵ_exponent)) the scaling factor of the exactness\nμ – (ones(size(g(M,x),1))) the Lagrange multiplier with respect to the inequality constraints\nμ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the inequality constraints\nλ – (ones(size(h(M,x),1))) the Lagrange multiplier with respect to the equality constraints\nλ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the equality constraints\nλ_min – (- λ_max) a lower bound for the Lagrange multiplier belonging to the equality constraints\nτ – (0.8) factor for the improvement of the evaluation of the penalty parameter\nρ – (1.0) the penalty parameter\nθ_ρ – (0.3) the scaling factor of the penalty parameter\nsub_cost – (AugmentedLagrangianCost(problem, ρ, μ, λ)) use augmented Lagrangian, especially with the same numbers ρ,μ as in the options for the sub problem\nsub_grad – (AugmentedLagrangianGrad(problem, ρ, μ, λ)) use augmented Lagrangian gradient, especially with the same numbers ρ,μ as in the options for the sub problem\nsub_kwargs – keyword arguments to decorate the sub options, e.g. with debug.\nsub_stopping_criterion – (StopAfterIteration(200) | StopWhenGradientNormLess(ϵ) | StopWhenStepsizeLess(1e-8)) specify a stopping criterion for the subsolver.\nsub_problem – (DefaultManoptProblem(M,ConstrainedManifoldObjective(subcost, subgrad; evaluation=evaluation))) problem for the subsolver\nsub_state – (QuasiNewtonState) using QuasiNewtonLimitedMemoryDirectionUpdate with InverseBFGS and sub_stopping_criterion as a stopping criterion. 
See also sub_kwargs.\nstopping_criterion – (StopAfterIteration(300) | (StopWhenSmallerOrEqual(ϵ, ϵ_min) & StopWhenChangeLess(1e-10))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.augmented_Lagrangian_method!","page":"Augmented Lagrangian Method","title":"Manopt.augmented_Lagrangian_method!","text":"augmented_Lagrangian_method!(M, f, grad_f, p=rand(M); kwargs...)\n\nperform the augmented Lagrangian method (ALM) in-place of p.\n\nFor all options, see augmented_Lagrangian_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/augmented_Lagrangian_method/#State","page":"Augmented Lagrangian Method","title":"State","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"AugmentedLagrangianMethodState","category":"page"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.AugmentedLagrangianMethodState","page":"Augmented Lagrangian Method","title":"Manopt.AugmentedLagrangianMethodState","text":"AugmentedLagrangianMethodState{P,T} <: AbstractManoptSolverState\n\nDescribes the augmented Lagrangian method, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\np – a point on a manifold as starting point and current iterate\nsub_problem – an AbstractManoptProblem problem for the subsolver\nsub_state – an AbstractManoptSolverState for the subsolver\nϵ – (1e-3) the accuracy tolerance\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nλ – (ones(length(get_equality_constraints(p,x)))) the Lagrange multiplier with respect to the equality constraints\nλ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the equality constraints\nλ_min – (- λ_max) a lower bound for the Lagrange 
multiplier belonging to the equality constraints\nμ – (ones(length(get_inequality_constraints(p,x)))) the Lagrange multiplier with respect to the inequality constraints\nμ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the inequality constraints\nρ – (1.0) the penalty parameter\nτ – (0.8) factor for the improvement of the evaluation of the penalty parameter\nθ_ρ – (0.3) the scaling factor of the penalty parameter\nθ_ϵ – ((ϵ_min/ϵ)^(ϵ_exponent)) the scaling factor of the accuracy tolerance\npenalty – evaluation of the current penalty term, initialized to Inf.\nstopping_criterion – (StopAfterIteration(300) | (StopWhenSmallerOrEqual(ϵ, ϵ_min) & StopWhenChangeLess(1e-10))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nConstructor\n\nAugmentedLagrangianMethodState(M::AbstractManifold, co::ConstrainedManifoldObjective, p; kwargs...)\n\nconstruct an augmented Lagrangian method state with the fields and defaults as above, where the manifold M and the ConstrainedManifoldObjective co are used for defaults in the keyword arguments.\n\nSee also\n\naugmented_Lagrangian_method\n\n\n\n\n\n","category":"type"},{"location":"solvers/augmented_Lagrangian_method/#Helping-Functions","page":"Augmented Lagrangian Method","title":"Helping Functions","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"AugmentedLagrangianCost\nAugmentedLagrangianGrad","category":"page"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.AugmentedLagrangianCost","page":"Augmented Lagrangian Method","title":"Manopt.AugmentedLagrangianCost","text":"AugmentedLagrangianCost{CO,R,T}\n\nStores the parameters ρ mathbb R, μ mathbb R^m, λ mathbb R^n of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.\n\nThis struct is also a functor (M,p) -> v that can be used as a cost function within a solver, based on the internal 
ConstrainedManifoldObjective we can compute\n\nmathcal L_rho(p μ λ)\n= f(x) + fracρ2 biggl(\n sum_j=1^n Bigl( h_j(p) + fracλ_jρ Bigr)^2\n +\n sum_i=1^m maxBigl 0 fracμ_iρ + g_i(p) Bigr^2\nBigr)\n\nFields\n\nco::CO, ρ::R, μ::T, λ::T as mentioned above\n\n\n\n\n\n","category":"type"},{"location":"solvers/augmented_Lagrangian_method/#Manopt.AugmentedLagrangianGrad","page":"Augmented Lagrangian Method","title":"Manopt.AugmentedLagrangianGrad","text":"AugmentedLagrangianGrad{CO,R,T}\n\nStores the parameters ρ mathbb R, μ mathbb R^m, λ mathbb R^n of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.\n\nThis struct is also a functor in both formats\n\n(M, p) -> X to compute the gradient in allocating fashion.\n(M, X, p) to compute the gradient in in-place fashion.\n\nbased on the internal ConstrainedManifoldObjective and computes the gradient operatornamegrad mathcal L_ρ(p μ λ), see also AugmentedLagrangianCost.\n\n\n\n\n\n","category":"type"},{"location":"solvers/augmented_Lagrangian_method/#Literature","page":"Augmented Lagrangian Method","title":"Literature","text":"","category":"section"},{"location":"solvers/augmented_Lagrangian_method/","page":"Augmented Lagrangian Method","title":"Augmented Lagrangian Method","text":"
[LB19]
\n
\n
C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics and Optimization (2019).
\n
","category":"page"},{"location":"plans/record/#RecordSection","page":"Recording values","title":"Record values","text":"","category":"section"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"To record values during the iterations of a solver run, there are in general two possibilities. On the one hand, the high-level interfaces provide a record= keyword that accepts several different inputs. For more details see How to record.","category":"page"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"For example recording the gradient from the GradientDescentState is automatically available, as explained in the gradient_descent solver.","category":"page"},{"location":"plans/record/#RecordSolverState","page":"Recording values","title":"Record Solver States","text":"","category":"section"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"Modules = [Manopt]\nPages = [\"plans/record.jl\"]\nOrder = [:type, :function]\nPrivate = true","category":"page"},{"location":"plans/record/#Manopt.RecordAction","page":"Recording values","title":"Manopt.RecordAction","text":"RecordAction\n\nA RecordAction is a small functor to record values. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s that performs the record, where i is the current iteration.\n\nBy convention i<=0 is interpreted as \"For Initialization only\", i.e. only initialize internal values, but not trigger any record, the same holds for i=typemin(Int) which is used to indicate stop, i.e. that the record is called from within stop_solver! 
which returns true afterwards.\n\nFields (assumed by subtypes to exist)\n\nrecorded_values an Array of the recorded values.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordChange","page":"Recording values","title":"Manopt.RecordChange","text":"RecordChange <: RecordAction\n\nrecord the amount of change of the iterate (stored in o.x of the AbstractManoptSolverState) during the last iteration.\n\nAdditional Fields\n\nstorage a StoreStateAction to store (at least) o.x to use this as the last value (to compute the change)\ninverse_retraction_method - (default_inverse_retraction_method(manifold, p)) the inverse retraction to be used for approximating distance.\n\nConstructor\n\nRecordChange(M=DefaultManifold();)\n\nwith the above fields as keywords. For the DefaultManifold only the field storage is used. Providing the actual manifold moves the default storage to the efficient point storage.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordCost","page":"Recording values","title":"Manopt.RecordCost","text":"RecordCost <: RecordAction\n\nRecord the current cost function value, see get_cost.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordEntry","page":"Recording values","title":"Manopt.RecordEntry","text":"RecordEntry{T} <: RecordAction\n\nrecord a certain field's entry of type {T} during the iterates\n\nFields\n\nrecorded_values – the recorded Iterates\nfield – Symbol the entry can be accessed with within AbstractManoptSolverState\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordEntryChange","page":"Recording values","title":"Manopt.RecordEntryChange","text":"RecordEntryChange{T} <: RecordAction\n\nrecord a certain entry's change during the iterates\n\nAdditional Fields\n\nrecorded_values – the recorded Iterates\nfield – Symbol the field can be accessed with within AbstractManoptSolverState\ndistance – function (p,o,x1,x2) to compute the change/distance between two values of the 
entry\nstorage – a StoreStateAction to store (at least) getproperty(o, d.field)\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordEvery","page":"Recording values","title":"Manopt.RecordEvery","text":"RecordEvery <: RecordAction\n\nrecord only every ith iteration. Otherwise (optionally, but activated by default) just update internal tracking values.\n\nThis method does not perform any record itself but relies on its children's methods\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordGroup","page":"Recording values","title":"Manopt.RecordGroup","text":"RecordGroup <: RecordAction\n\ngroup a set of RecordActions into one action, where the internal RecordActions act independently, but the results can be collected in a grouped fashion, i.e. tuples per calls of this group. The entries can be later addressed either by index or semantic Symbols\n\nConstructors\n\nRecordGroup(g::Array{<:RecordAction, 1})\n\nconstruct a group consisting of an Array of RecordActions g,\n\nRecordGroup(g, symbols)\n\nExamples\n\nr = RecordGroup([RecordIteration(), RecordCost()])\n\nA RecordGroup to record the current iteration and the cost. The cost can then be accessed using get_record(r,2) or r[2].\n\nr = RecordGroup([RecordIteration(), RecordCost()], Dict(:Cost => 2))\n\nA RecordGroup to record the current iteration and the cost, which can then be accessed using get_record(:Cost) or r[:Cost].\n\nr = RecordGroup([RecordIteration(), :Cost => RecordCost()])\n\nA RecordGroup identical to the previous constructor, just a little easier to use.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordIterate","page":"Recording values","title":"Manopt.RecordIterate","text":"RecordIterate <: RecordAction\n\nrecord the iterate\n\nConstructors\n\nRecordIterate(x0)\n\ninitialize the iterate record array to the type of x0, e.g. 
your initial data.\n\nRecordIterate(P)\n\ninitialize the iterate record array to the data type P.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordIteration","page":"Recording values","title":"Manopt.RecordIteration","text":"RecordIteration <: RecordAction\n\nrecord the current iteration\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordSolverState","page":"Recording values","title":"Manopt.RecordSolverState","text":"RecordSolverState <: AbstractManoptSolverState\n\nappend to any AbstractManoptSolverState the decorator with record functionality. Internally a Dictionary is kept that stores a RecordAction for several concurrent modes using a Symbol as reference. The default mode is :Iteration, which is used to store information that is recorded during the iterations. RecordActions might be added to :Start or :Stop to record values at the beginning or for the stopping time point, respectively\n\nThe original options can still be accessed using the get_state function.\n\nFields\n\noptions – the options that are extended by record information\nrecordDictionary – a Dict{Symbol,RecordAction} to keep track of all different recorded values\n\nConstructors\n\nRecordSolverState(o,dR)\n\nconstruct record decorated AbstractManoptSolverState, where dR can be\n\na RecordAction, then it is stored within the dictionary at :Iteration\nan Array of RecordActions, then it is stored as a recordDictionary within the dictionary at :All.\na Dict{Symbol,RecordAction}.\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Manopt.RecordTime","page":"Recording values","title":"Manopt.RecordTime","text":"RecordTime <: RecordAction\n\nrecord the time elapsed during the current iteration.\n\nThe three possible modes are\n\n:cumulative record times without resetting the timer\n:iterative record times with resetting the timer\n:total record a time only at the end of an algorithm (see stop_solver!)\n\nThe default is :cumulative, and any 
non-listed symbol defaults to using this mode.\n\nConstructor\n\nRecordTime(; mode::Symbol=:cumulative)\n\n\n\n\n\n","category":"type"},{"location":"plans/record/#Base.getindex-Tuple{RecordGroup, Vararg{Any}}","page":"Recording values","title":"Base.getindex","text":"getindex(r::RecordGroup, s::Symbol)\nr[s]\ngetindex(r::RecordGroup, sT::NTuple{N,Symbol})\nr[sT]\ngetindex(r::RecordGroup, i)\nr[i]\n\nreturn an array of recorded values with respect to the symbol s, the symbols from the tuple sT or the index i. See get_record for details.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Base.getindex-Tuple{RecordSolverState, Symbol}","page":"Recording values","title":"Base.getindex","text":"get_index(rs::RecordSolverState, s::Symbol)\nro[s]\n\nGet the recorded values for recorded type s, see get_record for details.\n\nget_index(rs::RecordSolverState, s::Symbol, i...)\nro[s, i...]\n\nAccess the recording type of type s and call its RecordAction with [i...].\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.RecordActionFactory-Tuple{AbstractManoptSolverState, RecordAction}","page":"Recording values","title":"Manopt.RecordActionFactory","text":"RecordActionFactory(s)\n\ncreate a RecordAction where\n\na RecordAction is passed through\na Symbol creates a RecordEntry of that symbol, with the exceptions of\n:Change - to record the change of the iterates in o.x\n:Iterate - to record the iterate\n:Iteration - to record the current iteration number\n:Cost - to record the current cost function value\n:Time - to record the total time taken after every iteration\n:IterativeTime – to record the times taken for each iteration.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.RecordFactory-Tuple{AbstractManoptSolverState, Vector}","page":"Recording values","title":"Manopt.RecordFactory","text":"RecordFactory(s::AbstractManoptSolverState, a)\n\ngiven an array a of Symbols, RecordActions and Ints\n\nThe symbol :Cost creates a RecordCost\nThe 
symbol :Iteration creates a RecordIteration\nThe symbol :Change creates a RecordChange\nany other symbol creates a RecordEntry of the corresponding field in AbstractManoptSolverState\nany RecordAction is directly included\na semantic pair :symbol => RecordAction is directly included\nan Integer k introduces that the record is only performed every kth iteration\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.get_record","page":"Recording values","title":"Manopt.get_record","text":"get_record(s::AbstractManoptSolverState, [,symbol=:Iteration])\nget_record(s::RecordSolverState, [,symbol=:Iteration])\n\nreturn the recorded values from within the RecordSolverState s that were recorded with respect to the Symbol symbol as an Array. The default refers to any recordings during an :Iteration.\n\nWhen called with arbitrary AbstractManoptSolverState, this method looks for the RecordSolverState decorator and calls get_record on the decorator.\n\n\n\n\n\n","category":"function"},{"location":"plans/record/#Manopt.get_record-Tuple{RecordAction, Any}","page":"Recording values","title":"Manopt.get_record","text":"get_record(r::RecordAction)\n\nreturn the recorded values stored within a RecordAction r.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.get_record-Tuple{RecordGroup}","page":"Recording values","title":"Manopt.get_record","text":"get_record(r::RecordGroup)\n\nreturn an array of tuples, where each tuple is a recorded set, e.g. per iteration / record call.\n\nget_record(r::RecordGroup, i::Int)\n\nreturn an array of values corresponding to the ith entry in this record group\n\nget_record(r::RecordGroup, s::Symbol)\n\nreturn an array of recorded values with respect to the symbol s, see RecordGroup.\n\nget_record(r::RecordGroup, s1::Symbol, s2::Symbol,...)\n\nreturn an array of tuples, where each tuple is a recorded set corresponding to the symbols s1, s2,... 
per iteration / record call.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.get_record_action","page":"Recording values","title":"Manopt.get_record_action","text":"get_record_action(s::AbstractManoptSolverState, s::Symbol)\n\nreturn the action contained in the (first) RecordSolverState decorator within the AbstractManoptSolverState s.\n\n\n\n\n\n","category":"function"},{"location":"plans/record/#Manopt.get_record_state-Tuple{AbstractManoptSolverState}","page":"Recording values","title":"Manopt.get_record_state","text":"get_record_state(s::AbstractManoptSolverState)\n\nreturn the RecordSolverState among the decorators of the AbstractManoptSolverState s\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.has_record-Tuple{RecordSolverState}","page":"Recording values","title":"Manopt.has_record","text":"has_record(s::AbstractManoptSolverState)\n\ncheck whether the AbstractManoptSolverState s is decorated with a RecordSolverState\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.record_or_reset!-Tuple{RecordAction, Any, Int64}","page":"Recording values","title":"Manopt.record_or_reset!","text":"record_or_reset!(r,v,i)\n\neither record (i>0 and not Inf) the value v within the RecordAction r or reset (i<0) the internal storage, where v has to match the internal value type of the corresponding RecordAction.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"see recording values for details on the decorated solver.","category":"page"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"Further specific RecordActions can be found where specific types of AbstractManoptSolverState define them on their corresponding site.","category":"page"},{"location":"plans/record/#Technical-Details:-The-Record-Solver","page":"Recording values","title":"Technical Details: The Record 
Solver","text":"","category":"section"},{"location":"plans/record/","page":"Recording values","title":"Recording values","text":"initialize_solver!(amp::AbstractManoptProblem, rss::RecordSolverState)\nstep_solver!(p::AbstractManoptProblem, s::RecordSolverState, i)\nstop_solver!(p::AbstractManoptProblem, s::RecordSolverState, i)","category":"page"},{"location":"plans/record/#Manopt.initialize_solver!-Tuple{AbstractManoptProblem, RecordSolverState}","page":"Recording values","title":"Manopt.initialize_solver!","text":"initialize_solver!(ams::AbstractManoptProblem, rss::RecordSolverState)\n\nExtend the initialization of the solver by a hook to run records that were added to the :Start entry.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.step_solver!-Tuple{AbstractManoptProblem, RecordSolverState, Any}","page":"Recording values","title":"Manopt.step_solver!","text":"step_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, i)\n\nExtend the ith step of the solver by a hook to run records that were added to the :Iteration entry.\n\n\n\n\n\n","category":"method"},{"location":"plans/record/#Manopt.stop_solver!-Tuple{AbstractManoptProblem, RecordSolverState, Any}","page":"Recording values","title":"Manopt.stop_solver!","text":"stop_solver!(amp::AbstractManoptProblem, rss::RecordSolverState, i)\n\nExtend the check whether to stop the solver by a hook to run records that were added to the :Stop entry.\n\n\n\n\n\n","category":"method"},{"location":"solvers/adaptive-regularization-with-cubics/#ARSSection","page":"Adaptive Regularization with Cubics","title":"Adaptive regularization with Cubics","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive 
Regularization with Cubics","text":"adaptive_regularization_with_cubics\nadaptive_regularization_with_cubics!","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.adaptive_regularization_with_cubics","page":"Adaptive Regularization with Cubics","title":"Manopt.adaptive_regularization_with_cubics","text":"adaptive_regularization_with_cubics(M, f, grad_f, Hess_f, p=rand(M); kwargs...)\nadaptive_regularization_with_cubics(M, f, grad_f, p=rand(M); kwargs...)\nadaptive_regularization_with_cubics(M, mho, p=rand(M); kwargs...)\n\nSolve an optimization problem on the manifold M by iteratively minimizing\n\nm_k(X) = f(p_k) + X operatornamegrad f(p_k) + frac12X operatornameHess f(p_k)X + fracσ_k3lVert X rVert^3\n\non the tangent space at the current iterate p_k, i.e. X ∈ T_p_kmathcal M, and where σ_k > 0 is a regularization parameter.\n\nLet X_k denote the minimizer of the model m_k, then we use the model improvement\n\nρ_k = fracf(p_k) - f(operatornameretr_p_k(X_k))m_k(0) - m_k(X_k) + fracσ_k3lVert X_krVert^3\n\nWe use two thresholds η_2 ≥ η_1 > 0 and set p_k+1 = operatornameretr_p_k(X_k) if ρ ≥ η_1 and reject the candidate otherwise, i.e. set p_k+1 = p_k.\n\nWe further update the regularization parameter using factors 0 < γ_1 < 1 < γ_2\n\nσ_k+1 =\nbegincases\n max(σ_min, γ_1σ_k) text if ρ geq η_2 text (the model was very successful)\n σ_k text if ρ in [η_1, η_2) text (the model was successful)\n γ_2σ_k text if ρ < η_1 text (the model was unsuccessful)\nendcases\n\nFor more details see Agarwal, Boumal, Bullins, Cartis, Math. 
Prog., 2020.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional) the Hessian H( mathcal M x ξ) of F\np – an initial value p mathcal M\n\nFor the case that no Hessian is provided, the Hessian is computed using finite differences, see ApproxHessianFiniteDifference.\n\nThe cost f and its gradient and Hessian might also be provided as a ManifoldHessianObjective.\n\nKeyword arguments\n\nthe default values are given in brackets\n\nσ - (100.0 / sqrt(manifold_dimension(M))) initial regularization parameter\nσmin - (1e-10) minimal regularization value σ_min\nη1 - (0.1) lower model success threshold\nη2 - (0.9) upper model success threshold\nγ1 - (0.1) regularization reduction factor (for the success case)\nγ2 - (2.0) regularization increment factor (for the non-success case)\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form grad_f(M, p) or InplaceEvaluation in place, i.e. 
is of the form grad_f!(M, X, p), and analogously for the Hessian.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use\ninitial_tangent_vector - (zero_vector(M, p)) initialize any tangent vector data,\nmaxIterLanczos - (200) a shortcut to set the stopping criterion in the sub_solver,\nρ_regularization - (1e3) a regularization to avoid dividing by zero for small values of cost and model\nstopping_criterion - (StopAfterIteration(40) | StopWhenGradientNormLess(1e-9) | StopWhenAllLanczosVectorsUsed(maxIterLanczos))\nsub_state - (LanczosState(M, copy(M, p); maxIterLanczos=maxIterLanczos, σ=σ)) a state for the subproblem or an AbstractEvaluationType if the problem is a function.\nsub_objective - a shortcut to modify the objective of the subproblem used within the sub_problem\nsub_problem - (DefaultManoptProblem(M, sub_objective)) the problem (or a function) for the sub problem\n\nAll other keyword arguments are passed to decorate_state! for state decorators or decorate_objective! for objective decorators, respectively. 
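For example, a typical call might be sketched as\n\np0 = rand(M)\np_star = adaptive_regularization_with_cubics(M, f, grad_f, Hess_f, p0; η1=0.1, η2=0.9, γ1=0.1, γ2=2.0)\n\nwhere M, f, grad_f and Hess_f are assumed to be defined beforehand and the keyword values shown are just the defaults stated above, made explicit. 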
If you provide the ManifoldGradientObjective directly, these decorations can still be specified.\n\nBy default the debug= keyword is set to DebugIfEntry(:ρ_denominator, >(0); message=\"Denominator nonpositive\", type=:error) to avoid that, due to rounding errors, the denominator in the computation of ρ gets nonpositive.\n\n\n\n\n\n","category":"function"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.adaptive_regularization_with_cubics!","page":"Adaptive Regularization with Cubics","title":"Manopt.adaptive_regularization_with_cubics!","text":"adaptive_regularization_with_cubics!(M, f, grad_f, Hess_f, p; kwargs...)\nadaptive_regularization_with_cubics!(M, f, grad_f, p; kwargs...)\nadaptive_regularization_with_cubics!(M, mho, p; kwargs...)\n\nevaluate the Riemannian adaptive regularization with cubics solver in place of p.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional) the Hessian H( mathcal M x ξ) of F\np – an initial value p mathcal M\n\nFor the case that no Hessian is provided, the Hessian is computed using finite differences, see ApproxHessianFiniteDifference.\n\nThe cost f and its gradient and Hessian might also be provided as a ManifoldHessianObjective.\n\nfor more details and all options, see adaptive_regularization_with_cubics.\n\n\n\n\n\n","category":"function"},{"location":"solvers/adaptive-regularization-with-cubics/#State","page":"Adaptive Regularization with Cubics","title":"State","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"AdaptiveRegularizationState","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.AdaptiveRegularizationState","page":"Adaptive Regularization with 
Cubics","title":"Manopt.AdaptiveRegularizationState","text":"AdaptiveRegularizationState{P,T} <: AbstractHessianSolverState\n\nA state for the adaptive_regularization_with_cubics solver.\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\nη1, η2 – (0.1, 0.9) bounds for evaluating the regularization parameter\nγ1, γ2 – (0.1, 2.0) shrinking and expansion factors for the regularization parameter σ\np – (rand(M)) the current iterate\nX – (zero_vector(M,p)) the current gradient operatornamegradf(p)\ns - (zero_vector(M,p)) the tangent vector step resulting from minimizing the model problem in the tangent space mathcal T_p mathcal M\nσ – the current cubic regularization parameter\nσmin – (1e-7) lower bound for the cubic regularization parameter\nρ_regularization – (1e3) regularization parameter for computing ρ. As we approach convergence, ρ may be difficult to compute with numerator and denominator approaching zero. Regularizing the ratio lets ρ go to 1 near convergence.\nevaluation - (AllocatingEvaluation()) if you provide a\nretraction_method – (default_retraction_method(M)) the retraction to use\nstopping_criterion – (StopAfterIteration(100)) a StoppingCriterion\nsub_problem - sub problem solved in each iteration\nsub_state - sub state for solving the sub problem – either a solver state if the problem is an AbstractManoptProblem or an AbstractEvaluationType if it is a function, where it defaults to AllocatingEvaluation.\n\nFurthermore the following internal fields are defined\n\nq - (copy(M,p)) a point for the candidates to evaluate model and ρ\nH – (copy(M, p, X)) the current Hessian, operatornameHessF(p)\nS – (copy(M, p, X)) the current solution from the subsolver\nρ – the current regularized ratio of actual improvement and model improvement.\nρ_denominator – (one(ρ)) a value to store the denominator from the computation of ρ to allow for a warning or error when this value is 
non-positive.\n\nConstructor\n\nAdaptiveRegularizationState(M, p=rand(M); X=zero_vector(M, p); kwargs...)\n\nConstruct the solver state with all fields stated above as keyword arguments.\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Sub-solvers","page":"Adaptive Regularization with Cubics","title":"Sub solvers","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"There are several ways to approach the subsolver. The default is the first one.","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Lanczos-Iteration","page":"Adaptive Regularization with Cubics","title":"Lanczos Iteration","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"Manopt.LanczosState","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.LanczosState","page":"Adaptive Regularization with Cubics","title":"Manopt.LanczosState","text":"LanczosState{P,T,SC,B,I,R,TM,V,Y} <: AbstractManoptSolverState\n\nSolve the adaptive regularized subproblem with a Lanczos iteration\n\nFields\n\np – the current iterate\nstop – the stopping criterion\nσ – the current regularization parameter\nX – the current gradient\nLanczos_vectors – the obtained Lanczos vectors\ntridig_matrix – the tridiagonal coefficient matrix T\ncoefficients – the coefficients y_1,…,y_k that determine the solution\nHp – a temporary vector containing the evaluation of the Hessian\nHp_residual – a temporary vector containing the residual to the Hessian\nS – the current obtained / approximated solution\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#(Conjugate)-Gradient-Descent","page":"Adaptive Regularization with Cubics","title":"(Conjugate) 
Gradient Descent","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"There are two generic functors that implement the sub problem","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"AdaptiveRegularizationCubicCost\nAdaptiveRegularizationCubicGrad","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.AdaptiveRegularizationCubicCost","page":"Adaptive Regularization with Cubics","title":"Manopt.AdaptiveRegularizationCubicCost","text":"AdaptiveRegularizationCubicCost\n\nWe define the model m(X) in the tangent space of the current iterate p=p_k as\n\n m(X) = f(p) + X operatornamegradf(p)\n + frac12 X operatornameHess f(p)X + fracσ3 lVert X rVert^3\n\nFields\n\nmho – an AbstractManifoldObjective that should provide at least get_cost, get_gradient and get_hessian.\nσ – the current regularization parameter\nX – a storage for the gradient at p of the original cost\n\nConstructors\n\nAdaptiveRegularizationCubicCost(mho, σ, X)\nAdaptiveRegularizationCubicCost(M, mho, σ; p=rand(M), X=get_gradient(M, mho, p))\n\nInitialize the cubic cost to the objective mho, regularization parameter σ, and (temporary) gradient X.\n\nnote: Note\nFor this gradient function to work, we require the TangentSpaceAtPoint from Manifolds.jl\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.AdaptiveRegularizationCubicGrad","page":"Adaptive Regularization with Cubics","title":"Manopt.AdaptiveRegularizationCubicGrad","text":"AdaptiveRegularizationCubicGrad\n\nWe define the model m(X) in the tangent space of the current iterate p=p_k as\n\n m(X) = f(p) + X operatornamegradf(p)\n + frac12 X operatornameHess f(p)X + fracσ3 lVert X rVert^3\n\nThis struct represents its 
gradient, given by\n\n operatornamegrad m(X) = operatornamegradf(p) + operatornameHess f(p)X + σ lVert X rVert X\n\nFields\n\nmho – an AbstractManifoldObjective that should provide at least get_cost, get_gradient and get_hessian.\nσ – the current regularization parameter\nX – a storage for the gradient at p of the original cost\n\nConstructors\n\nAdaptiveRegularizationCubicGrad(mho, σ, X)\nAdaptiveRegularizationCubicGrad(M, mho, σ; p=rand(M), X=get_gradient(M, mho, p))\n\nInitialize the cubic cost to the original objective mho, regularization parameter σ, and (temporary) gradient X.\n\nnote: Note\nFor this gradient function to work, we require the TangentSpaceAtPoint from Manifolds.jl. The gradient functor provides both an allocating as well as an in-place variant.\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"Since the sub problem is given on the tangent space, you have to provide","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"g = AdaptiveRegularizationCubicCost(M, mho, σ)\ngrad_g = AdaptiveRegularizationCubicGrad(M, mho, σ)\nsub_problem = DefaultManoptProblem(TangentSpaceAtPoint(M, p), ManifoldGradientObjective(g, grad_g))","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"where mho is the hessian objective of f to solve. 
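Such an objective might, for example, be created as mho = ManifoldHessianObjective(f, grad_f, Hess_f), a sketch assuming f, grad_f and Hess_f are already defined for your problem. 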
Then use this for the sub_problem keyword and use your favourite gradient based solver for the sub_state keyword, for example a ConjugateGradientDescentState","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Additional-Stopping-Criteria","page":"Adaptive Regularization with Cubics","title":"Additional Stopping Criteria","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"StopWhenAllLanczosVectorsUsed\nStopWhenFirstOrderProgress","category":"page"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.StopWhenAllLanczosVectorsUsed","page":"Adaptive Regularization with Cubics","title":"Manopt.StopWhenAllLanczosVectorsUsed","text":"StopWhenAllLanczosVectorsUsed <: StoppingCriterion\n\nWhen an inner iteration has used up all Lanczos vectors, then this stopping criterion is a fallback / security stopping criterion in order not to access a non-existing field in the array allocated for vectors.\n\nNote that this stopping criterion (for now) is only implemented for the case that an AdaptiveRegularizationState uses a LanczosState subsolver\n\nFields\n\nmaxLanczosVectors – maximal number of Lanczos vectors\nreason – a String indicating the reason if the criterion indicated to stop\n\nConstructor\n\nStopWhenAllLanczosVectorsUsed(maxLanczosVectors::Int)\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Manopt.StopWhenFirstOrderProgress","page":"Adaptive Regularization with Cubics","title":"Manopt.StopWhenFirstOrderProgress","text":"StopWhenFirstOrderProgress <: StoppingCriterion\n\nA stopping criterion related to the Riemannian adaptive regularization with cubics (ARC) solver indicating that the model function at the current (outer) iterate, i.e.\n\n m(X) = f(p) + X operatornamegradf(p)\n + frac12 X operatornameHess f(p)X + fracσ3 lVert X rVert^3\n\ndefined 
on the tangent space T_pmathcal M fulfills at the current iterate X_k that\n\nm(X_k) leq m(0)\nquadtext and quad\nlVert operatornamegrad m(X_k) rVert ≤ θ lVert X_k rVert^2\n\nFields\n\nθ – the factor θ in the second condition above\nreason – a String indicating the reason if the criterion indicated to stop\n\n\n\n\n\n","category":"type"},{"location":"solvers/adaptive-regularization-with-cubics/#Literature","page":"Adaptive Regularization with Cubics","title":"Literature","text":"","category":"section"},{"location":"solvers/adaptive-regularization-with-cubics/","page":"Adaptive Regularization with Cubics","title":"Adaptive Regularization with Cubics","text":"
[ABBC20] N. Agarwal, N. Boumal, B. Bullins and C. Cartis. Adaptive regularization with cubics on manifolds. Mathematical Programming (2020).
","category":"page"},{"location":"solvers/trust_regions/#trust_regions","page":"Trust-Regions Solver","title":"The Riemannian Trust-Regions Solver","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"The aim is to solve an optimization problem on a manifold","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"operatorname*min_x∈mathcalM F(x)","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"by using the Riemannian trust-regions solver. It is the number-one choice for smooth optimization. This trust-region method uses the Steihaug-Toint truncated conjugate-gradient method truncated_conjugate_gradient_descent to solve the inner minimization problem, called the trust-regions subproblem. This inner solver can be preconditioned by providing a preconditioner (symmetric and positive definite, an approximation of the inverse of the Hessian of F). If no Hessian of the cost function F is provided, a standard approximation of the Hessian based on the gradient operatornamegradF with ApproxHessianFiniteDifference will be computed.","category":"page"},{"location":"solvers/trust_regions/#Initialization","page":"Trust-Regions Solver","title":"Initialization","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Initialize x_0 = x with an initial point x on the manifold. It can be given by the caller or set randomly. Set the initial trust-region radius Delta = frac18 barDelta, where barDelta is the maximum radius the trust-region can have. Usually one uses the square root of the manifold's dimension operatornamedim(mathcalM). For accepting the next iterate and evaluating the new trust-region radius, one needs an accept/reject threshold ρ' ∈ [0, frac14), which is ρ' = 0.1 by default. 
Set k=0.","category":"page"},{"location":"solvers/trust_regions/#Iteration","page":"Trust-Regions Solver","title":"Iteration","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Repeat until a convergence criterion is reached","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Set η as a random tangent vector if using the randomized approach. Else set η as the zero vector in the tangent space T_x_kmathcalM.\nSet η^* as the solution of the trust-region subproblem, computed by the tcg-method with η as initial vector.\nIf using the randomized approach, compare η^* with the Cauchy point η_c^* = -tau_c fracDeltalVert operatornameGradF (x_k) rVert_x_k operatornameGradF (x_k) by the model function m_x_k(). If the model decrease is larger by using the Cauchy point, set η^* = η_c^*.\nSet x^* = operatornameretr_x_k(η^*).\nSet rho = fracF(x_k)-F(x^*)m_x_k(η)-m_x_k(η^*), where m_x_k() describes the quadratic model function.\nUpdate the trust-region radius:Delta = begincasesfrac14 Delta text if rho < frac14 textor m_x_k(η)-m_x_k(η^*) leq 0 textor rho = pm infty operatornamemin(2 Delta barDelta) text if rho > frac34 textand the tcg-method stopped because of negative curvature or exceeding the trust-region Delta textotherwiseendcases\nIf m_x_k(η)-m_x_k(η^*) geq 0 and rho > rho' set x_k = x^*.\nSet k = k+1.","category":"page"},{"location":"solvers/trust_regions/#Result","page":"Trust-Regions Solver","title":"Result","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"The result is given by the last computed x_k.","category":"page"},{"location":"solvers/trust_regions/#Remarks","page":"Trust-Regions Solver","title":"Remarks","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions 
Solver","text":"To the initialization: a random point on the manifold.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 1: using a randomized approach means using a random tangent vector as the initial vector for the approximate solve of the trust-regions subproblem. If this is the case, keep in mind that the vector must be within the trust-region radius. This is achieved by multiplying η by the fourth root of eps(Float64) as long as its norm is greater than the current trust-region radius Delta. If the randomized approach is not used, the zero tangent vector is taken as the initial vector.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 2: obtain η^* by (approximately) solving the trust-regions subproblem","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"operatorname*argmin_η T_x_kmathcalM m_x_k(η) = F(x_k) +\nlangle operatornamegradF(x_k) η rangle_x_k + frac12 langle\noperatornameHessF(η)_ x_k η rangle_x_k","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"textst langle η η rangle_x_k leq Delta^2","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"with the Steihaug-Toint truncated conjugate-gradient (tcg) method. The problem as well as the solution method is described in the truncated_conjugate_gradient_descent. 
In this inner solver, the stopping criterion StopWhenResidualIsReducedByFactorOrPower is used so that superlinear or at least linear convergence in the trust-region method can be achieved.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 3: if using a random tangent vector as an initial vector, compare the result of the tcg-method with the Cauchy point. Convergence proofs assume that one achieves at least (a fraction of) the reduction of the Cauchy point. The idea is to go in the direction of the gradient to an optimal point. This can be on the edge, but also before. The parameter tau_c for the optimal length is defined by","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"tau_c = begincases 1 langle operatornameGradF (x_k) \noperatornameHessF (η_k)_ x_krangle_x_k leq 0 \noperatornamemin(fracoperatornamenorm(operatornameGradF (x_k))^3\nDelta langle operatornameGradF (x_k) \noperatornameHessF (η_k)_ x_krangle_x_k 1) textotherwise\nendcases","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To check the model decrease one compares","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"m_x_k(η_c^*) = F(x_k) + langle η_c^*\noperatornameGradF (x_k)rangle_x_k + frac12langle η_c^*\noperatornameHessF (η_c^*)_ x_krangle_x_k","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"with","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"m_x_k(η^*) = F(x_k) + langle η^*\noperatornameGradF (x_k)rangle_x_k + frac12langle η^*\noperatornameHessF (η^*)_ 
x_krangle_x_k","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"If m_x_k(η_c^*) < m_x_k(η^*) then m_x_k(η_c^*) is the better choice.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 4: operatornameretr_x_k() denotes the retraction, a mapping operatornameretr_x_kT_x_kmathcalM rightarrow mathcalM which approximates the exponential map. In some cases it is cheaper to use this instead of the exponential map.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 6: one knows that the truncated_conjugate_gradient_descent algorithm stopped for these reasons when the stopping criteria StopWhenCurvatureIsNegative, StopWhenTrustRegionIsExceeded are activated.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"To step number 7: the last step is to decide if the new point x^* is accepted.","category":"page"},{"location":"solvers/trust_regions/#Interface","page":"Trust-Regions Solver","title":"Interface","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"trust_regions\ntrust_regions!","category":"page"},{"location":"solvers/trust_regions/#Manopt.trust_regions","page":"Trust-Regions Solver","title":"Manopt.trust_regions","text":"trust_regions(M, f, grad_f, hess_f, p)\ntrust_regions(M, f, grad_f, p)\n\nrun the Riemannian trust-regions solver for optimization on manifolds to minimize f, cf. [Absil, Baker, Gallivan, FoCM, 2006; Conn, Gould, Toint, SIAM, 2000].\n\nFor the case that no Hessian is provided, the Hessian is computed using finite differences, see ApproxHessianFiniteDifference. 
For solving the inner trust-region subproblem of finding an update-vector, see truncated_conjugate_gradient_descent.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional), the Hessian operatornameHessF(x) T_xmathcal M T_xmathcal M, X operatornameHessF(x)X = _ξoperatornamegradf(x)\np – an initial value x mathcal M\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient and Hessian work by allocation (default) or InplaceEvaluation in place\nmax_trust_region_radius – the maximum trust-region radius\npreconditioner – a preconditioner (a symmetric, positive definite operator that should approximate the inverse of the Hessian)\nrandomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\nproject! : (copyto!) specify a projection operation for tangent vectors within the TCG for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.\nretraction – (default_retraction_method(M, typeof(p))) approximation of the exponential map\nstopping_criterion – (StopWhenAny(StopAfterIteration(1000), StopWhenGradientNormLess(10^(-6))) a functor inheriting from StoppingCriterion indicating when to stop.\ntrust_region_radius - the initial trust-region radius\nρ_prime – Accept/reject threshold: if ρ (the performance ratio for the iterate) is at least ρ', the outer iteration is accepted. Otherwise, it is rejected. In case it is rejected, the trust-region radius will have been decreased. To ensure this, ρ' >= 0 must be strictly smaller than 1/4. If ρ_prime is negative, the algorithm is not guaranteed to produce monotonically decreasing cost values. 
It is strongly recommended to set ρ' > 0, to aid convergence.\nρ_regularization – Close to convergence, evaluating the performance ratio ρ is numerically challenging. Meanwhile, close to convergence, the quadratic model should be a good fit and the steps should be accepted. Regularization lets ρ go to 1 as the model decrease and the actual decrease go to zero. Set this option to zero to disable regularization (not recommended). When this is not zero, it may happen that the iterates produced are not monotonically improving the cost when very close to convergence. This is because the corrected cost improvement could change sign if it is negative but very small.\nθ – (1.0) 1+θ is the superlinear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, which computes an approximate solution for the trust-region subproblem. The tCG-method aborts if the residual is less than or equal to the initial residual to the power of 1+θ.\nκ – (0.1) the linear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, which computes an approximate solution for the trust-region subproblem. 
The method aborts if the residual is less than or equal to κ times the initial residual.\nreduction_threshold – (0.1) Trust-region reduction threshold: if ρ (the performance ratio for the iterate) is less than this bound, the trust-region radius and thus the trust-region decreases.\naugmentation_threshold – (0.75) Trust-region augmentation threshold: if ρ (the performance ratio for the iterate) is greater than this and further conditions apply, the trust-region radius and thus the trust-region increases.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\nSee also\n\ntruncated_conjugate_gradient_descent\n\n\n\n\n\n","category":"function"},{"location":"solvers/trust_regions/#Manopt.trust_regions!","page":"Trust-Regions Solver","title":"Manopt.trust_regions!","text":"trust_regions!(M, f, grad_f, Hess_f, p; kwargs...)\ntrust_regions!(M, f, grad_f, p; kwargs...)\n\nevaluate the Riemannian trust-regions solver in place of p.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradF mathcal M T mathcal M of F\nHess_f – (optional) the Hessian H( mathcal M x ξ) of F\np – an initial value p mathcal M\n\nFor the case that no Hessian is provided, the Hessian is computed using a finite difference, see ApproxHessianFiniteDifference.\n\nfor more details and all options, see trust_regions\n\n\n\n\n\n","category":"function"},{"location":"solvers/trust_regions/#State","page":"Trust-Regions Solver","title":"State","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"TrustRegionsState","category":"page"},{"location":"solvers/trust_regions/#Manopt.TrustRegionsState","page":"Trust-Regions Solver","title":"Manopt.TrustRegionsState","text":"TrustRegionsState <: AbstractHessianSolverState\n\ndescribe the trust-regions solver, with\n\nFields\n\nwhere all but p are keyword arguments in the 
constructor\n\np : the current iterate\nstop : (StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6))\nmax_trust_region_radius : (sqrt(manifold_dimension(M))) the maximum trust-region radius\nproject! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.\nrandomize : (false) indicates if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\nρ_prime : (0.1) a lower bound of the performance ratio for the iterate that decides if the iteration will be accepted or not. If not, the trust-region radius will have been decreased. To ensure this, ρ'>= 0 must be strictly smaller than 1/4. If ρ' is negative, the algorithm is not guaranteed to produce monotonically decreasing cost values. It is strongly recommended to set ρ' > 0, to aid convergence.\nρ_regularization : (10000.0) Close to convergence, evaluating the performance ratio ρ is numerically challenging. Meanwhile, close to convergence, the quadratic model should be a good fit and the steps should be accepted. Regularization lets ρ go to 1 as the model decrease and the actual decrease go to zero. Set this option to zero to disable regularization (not recommended). When this is not zero, it may happen that the iterates produced are not monotonically improving the cost when very close to convergence. 
This is because the corrected cost improvement could change sign if it is negative but very small.\ntrust_region_radius : the (initial) trust-region radius\n\nConstructor\n\nTrustRegionsState(M,\n p=rand(M),\n X=zero_vector(M,p),\n sub_state=TruncatedConjugateGradientState(M, p, X),\n\n)\n\nconstruct a trust-regions state with all other fields from above being keyword arguments\n\nSee also\n\ntrust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Approximation-of-the-Hessian","page":"Trust-Regions Solver","title":"Approximation of the Hessian","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"We currently provide a few different methods to approximate the Hessian.","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"ApproxHessianFiniteDifference\nApproxHessianSymmetricRankOne\nApproxHessianBFGS","category":"page"},{"location":"solvers/trust_regions/#Manopt.ApproxHessianFiniteDifference","page":"Trust-Regions Solver","title":"Manopt.ApproxHessianFiniteDifference","text":"ApproxHessianFiniteDifference{E, P, T, G, RTR, VTR, R <: Real} <: AbstractApproxHessian\n\nA functor to approximate the Hessian by a finite difference of gradient evaluation.\n\nGiven a point p and a direction X and the gradient operatornamegradF mathcal M to Tmathcal M of a function F the Hessian is approximated as follows: Let c be a stepsize, X T_pmathcal M a tangent vector and q = operatornameretr_p(fracclVert X rVert_pX) be a step in direction X of length c following a retraction. Then we approximate the Hessian by the finite difference of the gradients, where mathcal T_cdotgetscdot is a vector transport.\n\noperatornameHessF(p)X\n \nfraclVert X rVert_pcBigl( mathcal T_pgets qbigr(operatornamegradF(q)bigl) - operatornamegradF(p)Bigl)\n\nFields\n\ngradient!! 
the gradient function (either allocating or mutating, see evaluation parameter)\nstep_length a step length for the finite difference\nretraction_method - a retraction to use\nvector_transport_method a vector transport to use\n\nInternal temporary fields\n\ngrad_tmp a temporary storage for the gradient at the current p\ngrad_dir_tmp a temporary storage for the gradient at the current p_dir\np_dir::P a temporary storage for the forward direction (i.e. q above)\n\nConstructor\n\nApproxHessianFiniteDifference(M, p, grad_f; kwargs...)\n\nKeyword arguments\n\nevaluation (AllocatingEvaluation) whether the gradient is given as an allocation function or an in-place (InplaceEvaluation).\nsteplength (2^-14) step length c to approximate the gradient evaluations\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use in the approximation.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Manopt.ApproxHessianSymmetricRankOne","page":"Trust-Regions Solver","title":"Manopt.ApproxHessianSymmetricRankOne","text":"ApproxHessianSymmetricRankOne{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian\n\nA functor to approximate the Hessian by the symmetric rank one update.\n\nFields\n\ngradient!! 
the gradient function (either allocating or mutating, see evaluation parameter).\nν a small real number to ensure that the denominator in the update does not become too small and thus the method does not break down.\nvector_transport_method a vector transport to use.\n\nInternal temporary fields\n\np_tmp a temporary storage the current point p.\ngrad_tmp a temporary storage for the gradient at the current p.\nmatrix a temporary storage for the matrix representation of the approximating operator.\nbasis a temporary storage for an orthonormal basis at the current p.\n\nConstructor\n\nApproxHessianSymmetricRankOne(M, p, gradF; kwargs...)\n\nKeyword arguments\n\ninitial_operator (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix representation of the initial approximating operator.\nbasis (DefaultOrthonormalBasis()) an orthonormal basis in the tangent space of the initial iterate p.\nnu (-1)\nevaluation (AllocatingEvaluation) whether the gradient is given as an allocation function or an in-place (InplaceEvaluation).\nvector_transport_method (ParallelTransport()) vector transport mathcal T_cdotgetscdot to use.\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Manopt.ApproxHessianBFGS","page":"Trust-Regions Solver","title":"Manopt.ApproxHessianBFGS","text":"ApproxHessianBFGS{E, P, G, T, B<:AbstractBasis{ℝ}, VTR, R<:Real} <: AbstractApproxHessian\n\nA functor to approximate the Hessian by the BFGS update.\n\nFields\n\ngradient!! 
the gradient function (either allocating or mutating, see evaluation parameter).\nscale\nvector_transport_method a vector transport to use.\n\nInternal temporary fields\n\np_tmp a temporary storage the current point p.\ngrad_tmp a temporary storage for the gradient at the current p.\nmatrix a temporary storage for the matrix representation of the approximating operator.\nbasis a temporary storage for an orthonormal basis at the current p.\n\nConstructor\n\nApproxHessianBFGS(M, p, gradF; kwargs...)\n\nKeyword arguments\n\ninitial_operator (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix representation of the initial approximating operator.\nbasis (DefaultOrthonormalBasis()) an orthonormal basis in the tangent space of the initial iterate p.\nnu (-1)\nevaluation (AllocatingEvaluation) whether the gradient is given as an allocation function or an in-place (InplaceEvaluation).\nvector_transport_method (ParallelTransport()) vector transport mathcal T_cdotgetscdot to use.\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"as well as their (non-exported) common supertype","category":"page"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"Manopt.AbstractApproxHessian","category":"page"},{"location":"solvers/trust_regions/#Manopt.AbstractApproxHessian","page":"Trust-Regions Solver","title":"Manopt.AbstractApproxHessian","text":"AbstractApproxHessian <: Function\n\nAn abstract supertypes for approximate hessian functions, declares them also to be functions.\n\n\n\n\n\n","category":"type"},{"location":"solvers/trust_regions/#Literature","page":"Trust-Regions Solver","title":"Literature","text":"","category":"section"},{"location":"solvers/trust_regions/","page":"Trust-Regions Solver","title":"Trust-Regions Solver","text":"
","category":"page"},{"location":"plans/debug/#DebugSection","page":"Debug Output","title":"Debug Output","text":"","category":"section"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"Debug output can easily be added to any solver run. On the high level interfaces, like gradient_descent, you can just use the debug= keyword.","category":"page"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"Modules = [Manopt]\nPages = [\"plans/debug.jl\"]\nOrder = [:type, :function]\nPrivate = true","category":"page"},{"location":"plans/debug/#Manopt.DebugAction","page":"Debug Output","title":"Manopt.DebugAction","text":"DebugAction\n\nA DebugAction is a small functor to print/issue debug output. The usual call is given by (amp::AbstractManoptProblem, ams::AbstractManoptSolverState, i) -> s, where i is the current iterate.\n\nBy convention i=0 is interpreted as \"For Initialization only\", i.e. only debug info that prints initialization reacts, i<0 triggers updates of variables internally but does not trigger any output. Finally typemin(Int) is used to indicate a call from stop_solver! that returns true afterwards.\n\nFields (assumed by subtypes to exist)\n\nprint method to perform the actual print. Can for example be set to a file export,\n\nor to @info. The default is the print function on the default Base.stdout.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugChange","page":"Debug Output","title":"Manopt.DebugChange","text":"DebugChange(M=DefaultManifold())\n\ndebug for the amount of change of the iterate (stored in get_iterate(o) of the AbstractManoptSolverState) during the last iteration. 
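The idea behind DebugChange, a callable object that remembers the previous iterate and formats the distance to the current one, can be sketched in plain Julia. This is a stand-in, not the Manopt.jl type, and it uses the Euclidean norm instead of a manifold distance via an inverse retraction:

```julia
using LinearAlgebra

# Minimal stand-in for a "last change" debug action: a functor (callable
# struct) that stores the previous iterate and reports the distance moved.
mutable struct LastChange
    prev::Union{Nothing,Vector{Float64}}
end
LastChange() = LastChange(nothing)

function (d::LastChange)(p::Vector{Float64})
    s = d.prev === nothing ? "Last Change: n/a" :
        "Last Change: $(round(norm(p - d.prev); digits=4))"
    d.prev = copy(p)    # remember the current iterate for the next call
    return s
end

d = LastChange()
d([0.0, 0.0])    # "Last Change: n/a" on the first call
d([3.0, 4.0])    # "Last Change: 5.0"
```

The real DebugAction additionally receives the problem, the solver state, and the iteration number, and reads the iterate via get_iterate.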
See DebugEntryChange for the general case.\n\nKeyword Parameters\n\nstorage – (StoreStateAction( [:Gradient] )) – (eventually shared) the storage of the previous action\nprefix – (\"Last Change:\") prefix of the debug output (ignored if you set format)\nio – (stdout) default stream to print the debug to.\nformat - ( \"$prefix %f\") format to print the output using an sprintf format.\ninverse_retraction_method - (default_inverse_retraction_method(M)) the inverse retraction to be used for approximating distance.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugCost","page":"Debug Output","title":"Manopt.DebugCost","text":"DebugCost <: DebugAction\n\nprint the current cost function value, see get_cost.\n\nConstructors\n\nDebugCost()\n\nParameters\n\nformat - (\"$prefix %f\") format to print the output using sprintf and a prefix (see long).\nio – (stdout) default stream to print the debug to.\nlong - (false) short form to set the format to f(x): (default) or current cost: and the cost\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugDivider","page":"Debug Output","title":"Manopt.DebugDivider","text":"DebugDivider <: DebugAction\n\nprint a small divider (default \" | \").\n\nConstructor\n\nDebugDivider(div,print)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugEntry","page":"Debug Output","title":"Manopt.DebugEntry","text":"DebugEntry <: DebugAction\n\nprint a certain field's entry of type {T} during the iterates, where a format can be specified for how to print the entry.\n\nAdditional Fields\n\nfield – Symbol the entry can be accessed with within AbstractManoptSolverState\n\nConstructor\n\nDebugEntry(f; prefix=\"$f:\", format = \"$prefix %s\", io=stdout)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugEntryChange","page":"Debug Output","title":"Manopt.DebugEntryChange","text":"DebugEntryChange{T} <: DebugAction\n\nprint a certain entry's change during the iterates\n\nAdditional 
Fields\n\nprint – (print) function to print the result\nprefix – (\"Change of :Iterate\") prefix to the print out\nformat – (\"$prefix %e\") format to print (uses the prefix by default and scientific notation)\nfield – Symbol the field can be accessed with within AbstractManoptSolverState\ndistance – function (p,o,x1,x2) to compute the change/distance between two values of the entry\nstorage – a StoreStateAction to store the previous value of :f\n\nConstructors\n\nDebugEntryChange(f,d)\n\nKeyword arguments\n\nio (stdout) an IOStream\nprefix (\"Change of $f\")\nstorage (StoreStateAction((f,))) a StoreStateAction\ninitial_value an initial value for the change of o.field.\nformat – (\"$prefix %e\") format to print the change\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugEvery","page":"Debug Output","title":"Manopt.DebugEvery","text":"DebugEvery <: DebugAction\n\nevaluate and print debug only every ith iteration. Otherwise no print is performed. Whether internal variables are updated is determined by always_update.\n\nThis method does not perform any print itself but relies on its children's print.\n\nConstructor\n\nDebugEvery(d::DebugAction, every=1, always_update=true)\n\nInitialise the DebugEvery.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugGradientChange","page":"Debug Output","title":"Manopt.DebugGradientChange","text":"DebugGradientChange()\n\ndebug for the amount of change of the gradient (stored in get_gradient(o) of the AbstractManoptSolverState o) during the last iteration. 
See DebugEntryChange for the general case.\n\nKeyword Parameters\n\nstorage – (StoreStateAction( (:Gradient,) )) – (eventually shared) the storage of the previous action\nprefix – (\"Last Change:\") prefix of the debug output (ignored if you set format)\nio – (stdout) default stream to print the debug to.\nformat - ( \"$prefix %f\") format to print the output using an sprintf format.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugGroup","page":"Debug Output","title":"Manopt.DebugGroup","text":"DebugGroup <: DebugAction\n\ngroup a set of DebugActions into one action, where the internal prints are removed by default and the resulting strings are concatenated\n\nConstructor\n\nDebugGroup(g)\n\nconstruct a group consisting of an Array of DebugActions g, that are evaluated en bloc; the method does not perform any print itself, but relies on the internal prints. It still concatenates the result and returns the complete string\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugIfEntry","page":"Debug Output","title":"Manopt.DebugIfEntry","text":"DebugIfEntry <: DebugAction\n\nIssue a warning, info or error if a certain field does not pass a check\n\nFields\n\nio – an io stream\ncheck – a function that takes the value of the field as input and returns a boolean\nfield – Symbol the entry can be accessed with within AbstractManoptSolverState\nmsg – if the check fails, this message is displayed\ntype – Symbol specifying the type of display, possible values :print, :warn, :info, :error, where :print prints to io.\n\nConstructor\n\nDebugIfEntry(field, check=(>(0)); type=:warn, message=\":$f is nonnegative\", io=stdout)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugIterate","page":"Debug Output","title":"Manopt.DebugIterate","text":"DebugIterate <: DebugAction\n\ndebug for the current iterate (stored in get_iterate(o)).\n\nConstructor\n\nDebugIterate()\n\nParameters\n\nio – (stdout) default stream to print the 
debug to.\nlong::Bool whether to print x: or current iterate\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugIteration","page":"Debug Output","title":"Manopt.DebugIteration","text":"DebugIteration <: DebugAction\n\nConstructor\n\nDebugIteration()\n\nKeyword parameters\n\nformat - (\"# %-6d\") format to print the output using an sprintf format.\nio – (stdout) default stream to print the debug to.\n\ndebug for the current iteration (prefixed with # by default)\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugMessages","page":"Debug Output","title":"Manopt.DebugMessages","text":"DebugMessages <: DebugAction\n\nAn AbstractManoptSolverState or one of its substeps like a Stepsize might generate warnings throughout their computations. This debug can be used to :print them, display them as :info or :warning, or even :error, depending on the message type.\n\nConstructor\n\nDebugMessages(mode=:Info; io::IO=stdout)\n\nInitialize the messages debug to a certain mode. Available modes are\n\n:Error – issue the messages as an error and hence stop at any issue occurring\n:Info – issue the messages as an @info\n:Print – print messages to the stream io.\n:Warning – issue the messages as a warning\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugSolverState","page":"Debug Output","title":"Manopt.DebugSolverState","text":"DebugSolverState <: AbstractManoptSolverState\n\nThe debug options append to any options a debug functionality, i.e. they act as a decorator pattern. Internally a Dictionary is kept that stores a DebugAction for several occasions using a Symbol as reference. 
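The dictionary dispatch that DebugSolverState performs, running the event-specific action and then the :All action on every occasion, can be sketched with plain functions. All names here are toy stand-ins, not the Manopt.jl implementation:

```julia
# Toy version of the Symbol-keyed debug dictionary: on each event, run the
# action stored for that event (if any), then the catch-all :All action.
function run_debug(dict::Dict{Symbol,Function}, event::Symbol, i::Int)
    out = String[]
    haskey(dict, event) && push!(out, dict[event](i))
    haskey(dict, :All) && push!(out, dict[:All](i))
    return join(out, " ")
end

dbg = Dict{Symbol,Function}(
    :Step => i -> "# $i",   # per-step action
    :All  => i -> "|",      # runs on every occasion
)

run_debug(dbg, :Step, 3)   # "# 3 |"
run_debug(dbg, :Stop, 3)   # "|"  (no :Stop entry, only :All fires)
```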
The default occasion is :All and for example solvers join this field with :Start, :Step and :Stop at the beginning, every iteration or the end of the algorithm, respectively.\n\nThe original options can still be accessed using the get_state function.\n\nFields (defaults in brackets)\n\noptions – the options that are extended by debug information\ndebugDictionary – a Dict{Symbol,DebugAction} to keep track of Debug for different actions\n\nConstructors\n\nDebugSolverState(o,dA)\n\nconstruct debug decorated options, where dA can be\n\na DebugAction, then it is stored within the dictionary at :All\nan Array of DebugActions, then it is stored as a debugDictionary within :All.\na Dict{Symbol,DebugAction}.\nan Array of Symbols, String and an Int for the DebugFactory\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugStoppingCriterion","page":"Debug Output","title":"Manopt.DebugStoppingCriterion","text":"DebugStoppingCriterion <: DebugAction\n\nprint the Reason provided by the stopping criterion. Usually this should be empty, unless the algorithm stops.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugTime","page":"Debug Output","title":"Manopt.DebugTime","text":"DebugTime()\n\nMeasure time and print the intervals. Using start=true you can start the timer on construction, for example to measure the runtime of an algorithm overall.\n\nThe measured time is rounded using the given time_accuracy and printed after canonicalization.\n\nKeyword Parameters\n\nprefix – (\"Last Change:\") prefix of the debug output (ignored if you set format)\nio – (stdout) default stream to print the debug to.\nformat - ( \"$prefix %s\") format to print the output using an sprintf format, where %s is the canonicalized time.\nmode – (:cumulative) whether to display the total time or reset on every call using :iterative.\nstart – (false) indicate whether to start the timer on creation or not. 
Otherwise it might only be started on first call.\ntime_accuracy – (Millisecond(1)) round the time to this period before printing the canonicalized time\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWarnIfCostIncreases","page":"Debug Output","title":"Manopt.DebugWarnIfCostIncreases","text":"DebugWarnIfCostIncreases <: DebugAction\n\nprint a warning if the cost increases.\n\nNote that this provides an additional warning for gradient descent with its default constant step size.\n\nConstructor\n\nDebugWarnIfCostIncreases(warn=:Once; tol=1e-13)\n\nInitialize the warning to warning level (:Once) and introduce a tolerance for the test of 1e-13.\n\nThe warn level can be set to :Once to only warn the first time the cost increases, to :Always to report an increase every time it happens, and it can be set to :No to deactivate the warning, then this DebugAction is inactive. All other symbols are handled as if they were :Always.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWarnIfCostNotFinite","page":"Debug Output","title":"Manopt.DebugWarnIfCostNotFinite","text":"DebugWarnIfCostNotFinite <: DebugAction\n\nA debug to see when a field (value or array) within the AbstractManoptSolverState is or contains values that are not finite, for example Inf or NaN.\n\nConstructor\n\nDebugWarnIfCostNotFinite(warn=:Once)\n\nInitialize the warning to warn :Once.\n\nThis can be set to :Once to only warn the first time the cost is NaN. It can also be set to :No to deactivate the warning, but this makes this Action also useless. 
All other symbols are handled as if they were :Always.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWarnIfFieldNotFinite","page":"Debug Output","title":"Manopt.DebugWarnIfFieldNotFinite","text":"DebugWarnIfFieldNotFinite <: DebugAction\n\nA debug to see when a field from the options is not finite, for example Inf or NaN\n\nConstructor\n\nDebugWarnIfFieldNotFinite(field::Symbol, warn=:Once)\n\nInitialize the warning to warn :Once.\n\nThis can be set to :Once to only warn the first time the field is NaN. It can also be set to :No to deactivate the warning, but this makes this Action also useless. All other symbols are handled as if they were :Always.\n\nExample\n\nDebugWarnIfFieldNotFinite(:Gradient)\n\nCreates a DebugAction to track whether the gradient does not become NaN or Inf.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugWhenActive","page":"Debug Output","title":"Manopt.DebugWhenActive","text":"DebugWhenActive <: DebugAction\n\nevaluate and print debug only if the active boolean is set. 
This can be set from outside and is for example triggered by DebugEvery on debugs on the subsolver.\n\nThis method does not perform any print itself but relies on its children's print.\n\nFor now, the main interaction is with DebugEvery which might activate or deactivate this debug.\n\nFields\n\nalways_update – whether or not to call the inner debugs with iteration -1 in an inactive state\nactive – a boolean that can be (de-)activated from outside to enable/disable debug\n\nConstructor\n\nDebugWhenActive(d::DebugAction, active=true, always_update=true)\n\nInitialise the DebugWhenActive.\n\n\n\n\n\n","category":"type"},{"location":"plans/debug/#Manopt.DebugActionFactory-Tuple{String}","page":"Debug Output","title":"Manopt.DebugActionFactory","text":"DebugActionFactory(s)\n\ncreate a DebugAction where\n\na String yields the corresponding divider\na DebugAction is passed through\na Symbol creates a DebugEntry of that symbol, with the exceptions of :Change, :Iterate, :Iteration, and :Cost.\na Tuple{Symbol,String} creates a DebugEntry of that symbol where the String specifies the format.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.DebugActionFactory-Tuple{Symbol}","page":"Debug Output","title":"Manopt.DebugActionFactory","text":"DebugActionFactory(s::Symbol)\n\nConvert certain Symbols in the debug=[ ... ] vector to DebugActions. Currently the following ones are done. 
Note that the Shortcut symbols should all start with a capital letter.\n\n:Cost creates a DebugCost\n:Change creates a DebugChange\n:GradientChange creates a DebugGradientChange\n:GradientNorm creates a DebugGradientNorm\n:Iterate creates a DebugIterate\n:Iteration creates a DebugIteration\n:IterativeTime creates a DebugTime(:Iterative)\n:Stepsize creates a DebugStepsize\n:WarnCost creates a DebugWarnIfCostNotFinite\n:WarnGradient creates a DebugWarnIfFieldNotFinite for the :Gradient.\n:Time creates a DebugTime\n:WarningMessages creates a DebugMessages(:Warning)\n:InfoMessages creates a DebugMessages(:Info)\n:ErrorMessages creates a DebugMessages(:Error)\n:Messages creates a DebugMessages() (i.e. the same as :InfoMessages)\n\nany other symbol creates a DebugEntry(s) to print the entry (o.:s) from the options.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.DebugActionFactory-Tuple{Tuple{Symbol, String}}","page":"Debug Output","title":"Manopt.DebugActionFactory","text":"DebugActionFactory(t::Tuple{Symbol,String})\n\nConvert certain Symbols in the debug=[ ... ] vector to DebugActions. Currently the following ones are done, where the string in t[2] is passed as the format to the corresponding debug. 
Note that the Shortcut symbols t[1] should all start with a capital letter.\n\n:Cost creates a DebugCost\n:Change creates a DebugChange\n:GradientChange creates a DebugGradientChange\n:Iterate creates a DebugIterate\n:Iteration creates a DebugIteration\n:Stepsize creates a DebugStepsize\n:Time creates a DebugTime\n:IterativeTime creates a DebugTime(:Iterative)\n\nany other symbol creates a DebugEntry(s) to print the entry (o.:s) from the options.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.DebugFactory-Tuple{Vector}","page":"Debug Output","title":"Manopt.DebugFactory","text":"DebugFactory(a)\n\ngiven an array of Symbols, Strings, DebugActions and Ints\n\nThe symbol :Stop creates an entry to display the stopping criterion at the end (:Stop => DebugStoppingCriterion()), for further symbols see DebugActionFactory\nThe symbol :Subsolver wraps all dictionary entries with DebugWhenActive that can be set from outside.\nTuples of a symbol and a string can be used to also specify a format, see DebugActionFactory\nany string creates a DebugDivider\nany DebugAction is directly included\nan Integer k introduces that debug is only printed every kth iteration\n\nReturn value\n\nThis function returns a dictionary with an entry :All containing one general DebugAction, possibly a DebugGroup of entries. It might contain an entry :Start, :Step, :Stop with an action (each) to specify what to do at the start, after a step or at the end of an Algorithm, respectively. On all three occasions the :All action is executed. Note that only the :Stop entry is actually filled when specifying the :Stop symbol.\n\nExample\n\nThe array\n\n[:Iterate, \" | \", :Cost, :Stop, 10]\n\nAdds a group to :All of three actions (DebugIteration, DebugDivider with \" | \" to display, DebugCost) as a DebugGroup inside a DebugEvery to only be executed every 10th iteration. 
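The every-k wrapping from the example above is easy to sketch as a higher-order function; the helper names below are illustrative, not the Manopt.jl DebugEvery implementation:

```julia
# Wrap a debug action so its output only appears every k-th iteration;
# all other iterations (and iteration 0) produce an empty string.
every(k::Int, action::Function) = i -> (i > 0 && i % k == 0) ? action(i) : ""

group = i -> "# $i"        # stand-in for the grouped debug actions
dbg = every(10, group)

dbg(3)    # ""  (suppressed)
dbg(10)   # "# 10"
```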
It also adds the DebugStoppingCriterion to the :Stop entry of the dictionary.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.reset!-Tuple{DebugTime}","page":"Debug Output","title":"Manopt.reset!","text":"reset!(d::DebugTime)\n\nreset the internal time of a DebugTime, that is start from now again.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.stop!-Tuple{DebugTime}","page":"Debug Output","title":"Manopt.stop!","text":"stop!(d::DebugTime)\n\nstop and reset the internal time of a DebugTime, that is set the time to 0 (undefined)\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Technical-Details:-The-Debug-Solver","page":"Debug Output","title":"Technical Details: The Debug Solver","text":"","category":"section"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"The decorator to print debug during the iterations can be activated by decorating the state of a solver and implementing your own DebugActions. For example, printing a gradient from the GradientDescentState is automatically available, as explained in the gradient_descent solver.","category":"page"},{"location":"plans/debug/","page":"Debug Output","title":"Debug Output","text":"initialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)\nstep_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\nstop_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i::Int)","category":"page"},{"location":"plans/debug/#Manopt.initialize_solver!-Tuple{AbstractManoptProblem, DebugSolverState}","page":"Debug Output","title":"Manopt.initialize_solver!","text":"initialize_solver!(amp::AbstractManoptProblem, dss::DebugSolverState)\n\nExtend the initialization of the solver by a hook to run debugs that were added to the :Start and :All entries of the debug lists.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.step_solver!-Tuple{AbstractManoptProblem, DebugSolverState, Any}","page":"Debug 
Output","title":"Manopt.step_solver!","text":"step_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\n\nExtend the ith step of the solver by a hook to run debug prints that were added to the :Step and :All entries of the debug lists.\n\n\n\n\n\n","category":"method"},{"location":"plans/debug/#Manopt.stop_solver!-Tuple{AbstractManoptProblem, DebugSolverState, Int64}","page":"Debug Output","title":"Manopt.stop_solver!","text":"stop_solver!(amp::AbstractManoptProblem, dss::DebugSolverState, i)\n\nExtend the check whether to stop the solver by a hook to run debugs that were added to the :Stop and :All entries of the debug lists.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Stepsize","page":"Stepsize","title":"Stepsize and Linesearch","text":"","category":"section"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Most iterative algorithms determine a direction along which the algorithm will proceed and determine a step size to find the next iterate. How advanced the step size computation can be implemented depends (among others) on the properties the corresponding problem provides.","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Within Manopt.jl, the step size determination is implemented as a functor which is a subtype of [Stepsize](@ref) based on","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Stepsize","category":"page"},{"location":"plans/stepsize/#Manopt.Stepsize","page":"Stepsize","title":"Manopt.Stepsize","text":"Stepsize\n\nAn abstract type for the functors representing step sizes, i.e. they are callable structures. The naming scheme is TypeOfStepSize, e.g. 
ConstantStepsize.\n\nEvery Stepsize has to provide a constructor and its function has to have the interface (p,o,i) where an AbstractManoptProblem as well as an AbstractManoptSolverState and the current number of iterations are the arguments, and which returns a number, namely the stepsize to use.\n\nSee also\n\nLinesearch\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Usually, a constructor should take the manifold M as its first argument, for consistency, to allow general step size functors to be set up based on default values that might depend on the manifold currently under consideration.","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Currently, the following step sizes are available","category":"page"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"Modules = [Manopt]\nPages = [\"plans/stepsize.jl\"]\nOrder = [:type,:function]\nFilter = t -> t != Stepsize","category":"page"},{"location":"plans/stepsize/#Manopt.AdaptiveWNGradient","page":"Stepsize","title":"Manopt.AdaptiveWNGradient","text":"AdaptiveWNGradient <: DirectionUpdateRule\n\nRepresent an adaptive gradient method introduced by Grapiglia, Stella, J. Optim. 
Theory Appl., 2023.\n\nGiven a positive threshold hat c mathbb N, a minimal bound b_mathrmmin 0, an initial b_0 b_mathrmmin, and a gradient reduction factor threshold α ∈ [0,1).\n\nSet c_0=0 and use omega_0 = lVert operatornamegrad f(p_0) rvert_p_0.\n\nFor the first iterate we use the initial step size s_0 = frac1b_0\n\nThen, given the last gradient X_k-1 = operatornamegrad f(x_k-1), and a previous omega_k-1, the values (b_k omega_k c_k) are computed using X_k = operatornamegrad f(p_k) and the following cases\n\nIf lVert X_k rVert_p_k leq alphaomega_k-1, then let hat b_k-1 in b_mathrmminb_k-1 and set\n\n(b_k omega_k c_k) = begincases\nbigl(hat b_k-1 lVert X_krVert_p_k 0 bigr) text if c_k-1+1 = hat c\nBigl(b_k-1 + fraclVert X_krVert_p_k^2b_k-1 omega_k-1 c_k-1+1 Bigr) text if c_k-1+1hat c\nendcases\n\nIf lVert X_k rVert_p_k alphaomega_k-1, then set\n\n(b_k omega_k c_k) =\nBigl( b_k-1 + fraclVert X_krVert_p_k^2b_k-1 omega_k-1 0)\n\nand return the step size s_k = frac1b_k.\n\nNote that for α=0 this is the Riemannian variant of WNGRad\n\nFields\n\ncount_threshold::Int (4) an Integer for hat c\nminimal_bound::Float64 (1e-4) for b_mathrmmin\nalternate_bound::Function ((bk, hat_c) -> min(gradient_bound, max(gradient_bound, bk/(3*hat_c)) how to determine hat b_k as a function of (bmin, bk, hat_c) -> hat_bk\ngradient_reduction::Float64 (0.9)\ngradient_bound norm(M, p0, grad_f(M,p0)) the bound b_k.\n\nas well as the internal fields\n\nweight for ω_k initialised to ω_0 =norm(M, p0, grad_f(M,p0)) if this is not zero, 1.0 otherwise.\ncount for the c_k, initialised to c_0 = 0.\n\nConstructor\n\nAdaptiveWNGrad(M=DefaultManifold, grad_f=(M,p) -> zero_vector(M,rand(M)), p=rand(M); kwargs...)\n\nWhere all fields above with defaults are keyword arguments. 
Additional keyword arguments\n\nadaptive (true) switches the gradient_reduction α to 0.\nevaluation (AllocatingEvaluation()) specifies whether the gradient (that is used for initialisation only) is mutating or allocating\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.ArmijoLinesearch","page":"Stepsize","title":"Manopt.ArmijoLinesearch","text":"ArmijoLinesearch <: Linesearch\n\nA functor representing Armijo line search including the last run's state, i.e. a last step size.\n\nFields\n\ninitial_stepsize – (1.0) an initial step size\nretraction_method – (default_retraction_method(M)) the retraction to use\ncontraction_factor – (0.95) exponent for line search reduction\nsufficient_decrease – (0.1) gain within Armijo's rule\nlast_stepsize – (initialstepsize) the last step size we start the search with\ninitial_guess - ((p,s,i,l) -> l) based on an AbstractManoptProblem p, AbstractManoptSolverState s and a current iterate i and a last step size l, this returns an initial guess. 
The default uses the last obtained stepsize.\n\nFurthermore the following fields act as safeguards\n\nstop_when_stepsize_less - (0.0) smallest stepsize when to stop (the last one before is taken)\nstop_when_stepsize_exceeds - (max_stepsize(M, p)) – largest stepsize when to stop.\nstop_increasing_at_step - (100) last step to increase the stepsize (phase 1),\nstop_decreasing_at_step - (1000) last step size to decrease the stepsize (phase 2),\n\nPass :Messages to a debug= to see @infos when these happen.\n\nConstructor\n\nArmijoLinesearch(M=DefaultManifold())\n\nwith the fields above as keyword arguments and the retraction set to the default retraction on M.\n\nThe constructors return the functor to perform Armijo line search, where two interfaces are available:\n\nbased on a tuple (amp, ams, i) of an AbstractManoptProblem amp, AbstractManoptSolverState ams and a current iterate i.\nwith (M, x, F, gradFx[,η=-gradFx]) -> s where M, a current point x, a function F, that maps from the manifold to the reals, its gradient (a tangent vector) gradFx=operatornamegradF(x) at x and an optional search direction tangent vector η=-gradFx are the arguments.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.ConstantStepsize","page":"Stepsize","title":"Manopt.ConstantStepsize","text":"ConstantStepsize <: Stepsize\n\nA functor that always returns a fixed step size.\n\nFields\n\nlength – constant value for the step size\ntype - a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant.\n\nConstructors\n\nConstantStepsize(s::Real, t::Symbol=:relative)\n\ninitialize the stepsize to a constant s of type t.\n\nConstantStepsize(M::AbstractManifold=DefaultManifold(2);\n stepsize=injectivity_radius(M)/2, type::Symbol=:relative\n)\n\ninitialize the stepsize to a constant stepsize, which by default is half the injectivity radius, unless the radius is infinity, then the default step 
size is 1.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.DecreasingStepsize","page":"Stepsize","title":"Manopt.DecreasingStepsize","text":"DecreasingStepsize()\n\nA functor that represents several decreasing step sizes\n\nFields\n\nlength – (1) the initial step size l.\nfactor – (1) a value f to multiply the initial step size with every iteration\nsubtrahend – (0) a value a that is subtracted every iteration\nexponent – (1) a value e; the current iteration number is taken to the eth power\nshift – (0) shift the denominator iterator i by s.\ntype - a symbol that indicates whether the stepsize is relatively (:relative), with respect to the gradient norm, or absolutely (:absolute) constant.\n\nIn total the complete formula for the ith iterate reads\n\ns_i = frac(l - i a)f^i(i+s)^e\n\nand hence the default simplifies to just s_i = fracli\n\nConstructor\n\nDecreasingStepsize(l=1,f=1,a=0,e=1,s=0,type=:relative)\n\nAlternatively one can also use the following keywords.\n\nDecreasingStepsize(\n M::AbstractManifold=DefaultManifold(3);\n length=injectivity_radius(M)/2, multiplier=1.0, subtrahend=0.0,\n exponent=1.0, shift=0, type=:relative\n)\n\ninitializes all fields above, where none of them is mandatory; the length is set to half the injectivity radius, and to 1 if the injectivity radius is infinite.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.Linesearch","page":"Stepsize","title":"Manopt.Linesearch","text":"Linesearch <: Stepsize\n\nAn abstract functor to represent line search type step size determinations, see Stepsize for details. One example is the ArmijoLinesearch functor.\n\nCompared to simple step sizes, the linesearch functors provide an interface of the form (p,o,i,η) -> s with an additional (but optional) fourth parameter to provide a search direction; this should default to something reasonable, e.g. 
the negative gradient.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.NonmonotoneLinesearch","page":"Stepsize","title":"Manopt.NonmonotoneLinesearch","text":"NonmonotoneLinesearch <: Linesearch\n\nA functor representing a nonmonotone line search using the Barzilai-Borwein step size Iannazzo, Porcelli, IMA J. Numer. Anal., 2017. Together with a gradient descent algorithm this line search represents the Riemannian Barzilai-Borwein with nonmonotone line-search (RBBNMLS) algorithm. We shifted the order of the algorithm steps from the paper by Iannazzo and Porcelli so that in each iteration we first find\n\ny_k = operatornamegradF(x_k) - operatornameT_x_k-1 x_k(operatornamegradF(x_k-1))\n\nand\n\ns_k = - α_k-1 * operatornameT_x_k-1 x_k(operatornamegradF(x_k-1))\n\nwhere α_k-1 is the step size computed in the last iteration and operatornameT is a vector transport. We then find the Barzilai–Borwein step size\n\nα_k^textBB = begincases\nmin(α_textmax max(α_textmin τ_k)) textif s_k y_k_x_k 0\nα_textmax textelse\nendcases\n\nwhere\n\nτ_k = fracs_k s_k_x_ks_k y_k_x_k\n\nif the direct strategy is chosen,\n\nτ_k = fracs_k y_k_x_ky_k y_k_x_k\n\nin case of the inverse strategy and an alternation between the two in case of the alternating strategy. Then we find the smallest h = 0 1 2 such that\n\nF(operatornameretr_x_k(- σ^h α_k^textBB operatornamegradF(x_k)))\nleq\nmax_1 j min(k+1m) F(x_k+1-j) - γ σ^h α_k^textBB operatornamegradF(x_k) operatornamegradF(x_k)_x_k\n\nwhere σ is a step length reduction factor (01), m is the number of iterations after which the function value has to be lower than the current one and γ is the sufficient decrease parameter (01). 
We can then find the new stepsize by\n\nα_k = σ^h α_k^textBB\n\nFields\n\ninitial_stepsize – (1.0) the step size we start the search with\nmemory_size – (10) number of iterations after which the cost value needs to be lower than the current one\nbb_min_stepsize – (1e-3) lower bound for the Barzilai-Borwein step size greater than zero\nbb_max_stepsize – (1e3) upper bound for the Barzilai-Borwein step size greater than min_stepsize\nretraction_method – (ExponentialRetraction()) the retraction to use\nstrategy – (direct) defines if the new step size is computed using the direct, inverse or alternating strategy\nstorage – (for :Iterate and :Gradient) a StoreStateAction\nstepsize_reduction – (0.5) step size reduction factor contained in the interval (0,1)\nsufficient_decrease – (1e-4) sufficient decrease parameter contained in the interval (0,1)\nvector_transport_method – (ParallelTransport()) the vector transport method to use\n\nFurthermore the following fields act as safeguards\n\nstop_when_stepsize_less - (0.0) smallest stepsize when to stop (the last one before is taken)\nstop_when_stepsize_exceeds - (max_stepsize(M, p)) – largest stepsize when to stop.\nstop_increasing_at_step - (100) last step to increase the stepsize (phase 1),\nstop_decreasing_at_step - (1000) last step size to decrease the stepsize (phase 2),\n\nPass :Messages to a debug= to see @infos when these happen.\n\nConstructor\n\nNonmonotoneLinesearch()\n\nwith the fields above in their order as optional arguments (deprecated).\n\nNonmonotoneLinesearch(M)\n\nwith the fields above in their order as keyword arguments and where the retraction and vector transport are set to the default ones on M, respectively.\n\nThe constructors return the functor to perform nonmonotone line search.\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.WolfePowellBinaryLinesearch","page":"Stepsize","title":"Manopt.WolfePowellBinaryLinesearch","text":"WolfePowellBinaryLinesearch <: 
Linesearch\n\nA Linesearch method that determines a step size t fulfilling the Wolfe conditions\n\nbased on a binary chop. Let η be a search direction and c1c_20 be two constants. Then with\n\nA(t) = f(x_+) c1 t operatornamegradf(x) η_x\nquadtextandquad\nW(t) = operatornamegradf(x_+) textV_x_+gets xη_x_+ c_2 η operatornamegradf(x)_x\n\nwhere x_+ = operatornameretr_x(tη) is the current trial point, and textV is a vector transport, we perform the following algorithm similar to Algorithm 7 from Huang, Thesis, 2014\n\nset α=0, β= and t=1.\nWhile either A(t) does not hold or W(t) does not hold do steps 3-5.\nIf A(t) fails, set β=t.\nIf A(t) holds but W(t) fails, set α=t.\nIf β set t=fracα+β2, otherwise set t=2α.\n\nConstructors\n\nThere exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and vector transport are the default. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility.\n\nWolfePowellLinesearch(\n M=DefaultManifold(),\n c1::Float64=10^(-4),\n c2::Float64=0.999;\n retraction_method = default_retraction_method(M),\n vector_transport_method = default_vector_transport(M),\n linesearch_stopsize = 0.0\n)\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.WolfePowellLinesearch","page":"Stepsize","title":"Manopt.WolfePowellLinesearch","text":"WolfePowellLinesearch <: Linesearch\n\nDo a backtracking linesearch to find a step size α that fulfils the Wolfe conditions along a search direction η starting from x, i.e.\n\nfbigl( operatornameretr_x(αη) bigr) f(x_k) + c_1 α_k operatornamegradf(x) η_x\nquadtextandquad\nfracmathrmdmathrmdt fbigr(operatornameretr_x(tη)bigr)\nBigvert_t=α\n c_2 fracmathrmdmathrmdt fbigl(operatornameretr_x(tη)bigr)Bigvert_t=0\n\nConstructors\n\nThere exist two constructors, where, when providing the manifold M as a first (optional) parameter, its default retraction and 
vector transport are the default. In this case the retraction and the vector transport are also keyword arguments for ease of use. The other constructor is kept for backward compatibility. Note that the linesearch_stopsize to stop for too small stepsizes is only available in the new signature including M.\n\nWolfePowellLinesearch(\n M,\n c1::Float64=10^(-4),\n c2::Float64=0.999;\n retraction_method = default_retraction_method(M),\n vector_transport_method = default_vector_transport(M),\n linesearch_stopsize = 0.0\n)\n\n\n\n\n\n","category":"type"},{"location":"plans/stepsize/#Manopt.default_stepsize-Tuple{AbstractManifold, Type{<:AbstractManoptSolverState}}","page":"Stepsize","title":"Manopt.default_stepsize","text":"default_stepsize(M::AbstractManifold, ams::AbstractManoptSolverState)\n\nReturns the default Stepsize functor used when running the solver specified by the AbstractManoptSolverState ams running with an objective on the AbstractManifold M.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Manopt.get_stepsize-Tuple{AbstractManoptProblem, AbstractManoptSolverState, Vararg{Any}}","page":"Stepsize","title":"Manopt.get_stepsize","text":"get_stepsize(amp::AbstractManoptProblem, ams::AbstractManoptSolverState, vars...)\n\nreturn the stepsize stored within AbstractManoptSolverState ams when solving the AbstractManoptProblem amp. 
This method also works for decorated options and the Stepsize function within the options, by default stored in o.stepsize.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Manopt.linesearch_backtrack-Union{Tuple{T}, Tuple{TF}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any, AbstractRetractionMethod}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any, AbstractRetractionMethod, T}, Tuple{AbstractManifold, TF, Any, T, Any, Any, Any, AbstractRetractionMethod, T, Any}} where {TF, T}","page":"Stepsize","title":"Manopt.linesearch_backtrack","text":"(s, msg) = linesearch_backtrack(\n M, F, x, gradFx, s, decrease, contract, retr, η = -gradFx, f0 = F(x);\n stop_when_stepsize_less=0.0,\n stop_when_stepsize_exceeds=max_stepsize(M, p),\n stop_increasing_at_step = 100,\n stop_decreasing_at_step = 1000,\n)\n\nperform a linesearch for\n\na manifold M\na cost function f,\nan iterate p\nthe gradient operatornamegradF(x)\nan initial stepsize s usually called γ\na sufficient decrease\na contraction factor σ\na retraction, which defaults to the default_retraction_method(M)\na search direction η = -operatornamegradF(x)\nan offset, f_0 = F(x)\n\nAnd use the 4 keywords to limit the maximal increase and decrease steps as well as a maximal stepsize (especially on non-Hadamard manifolds) and a minimal one.\n\nReturn value\n\nA stepsize s and a message msg (in case any of the 4 criteria hit)\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Manopt.max_stepsize-Tuple{AbstractManifold, Any}","page":"Stepsize","title":"Manopt.max_stepsize","text":"max_stepsize(M::AbstractManifold, p)\nmax_stepsize(M::AbstractManifold)\n\nGet the maximum stepsize (at point p) on manifold M. 
It should be used to limit the distance an algorithm is trying to move in a single step.\n\n\n\n\n\n","category":"method"},{"location":"plans/stepsize/#Literature","page":"Stepsize","title":"Literature","text":"","category":"section"},{"location":"plans/stepsize/","page":"Stepsize","title":"Stepsize","text":"
[GS23]
\n
\n
G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023), preprint: [optimization-online.org/wp-content/uploads/2022/04/8864.pdf](https://optimization-online.org/wp-content/uploads/2022/04/8864.pdf).
","category":"page"},{"location":"tutorials/Optimize!/#Get-Started:-Optimize!","page":"Get started: Optimize!","title":"Get Started: Optimize!","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"In this tutorial, we will both introduce the basics of optimisation on manifolds as well as how to use Manopt.jl to perform optimisation on manifolds in Julia.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"For more theoretical background, see e.g. [Car92] for an introduction to Riemannian manifolds and [AMS08] or [Bou23] to read more about optimisation thereon.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Let mathcal M denote a Riemannian manifold and let fcolon mathcal M ℝ be a cost function. We aim to compute a point p^* where f is minimal, or in other words, p^* is a minimizer of f.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"We also write this as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatorname*argmin_p mathcal M f(p)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"and would like to find p^* numerically. As an example we take the generalisation of the (arithmetic) mean. 
In the Euclidean case with din mathbb N, that is for nin mathbb N data points y_1ldotsy_n in mathbb R^d the mean","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" frac1nsum_i=1^n y_i","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"cannot be directly generalised to data q_1ldotsq_n, since on a manifold we do not have an addition. But the mean can also be characterised as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatorname*argmin_xinmathbb R^d frac12nsum_i=1^n lVert x - y_irVert^2","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"and using the Riemannian distance d_mathcal M, this can be written on Riemannian manifolds. We obtain the Riemannian Center of Mass [Kar77]","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatorname*argmin_pinmathcal M\n frac12n sum_i=1^n d_mathcal M^2(p q_i)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Fortunately the gradient can be computed and is","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":" operatornamegrad f(p) = frac1n sum_i=1^n -log_p q_i","category":"page"},{"location":"tutorials/Optimize!/#Loading-the-necessary-packages","page":"Get started: Optimize!","title":"Loading the necessary packages","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Let’s assume you have already installed both Manopt and Manifolds in Julia (using e.g. using Pkg; Pkg.add([\"Manopt\", \"Manifolds\"])). 
Then we can get started by loading both packages – and Random for reproducibility in this tutorial.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"using Manopt, Manifolds, Random, LinearAlgebra\nRandom.seed!(42);","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Now assume we are on the Sphere mathcal M = mathbb S^2 and we generate some random points “around” some initial point p","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"n = 100\nσ = π / 8\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Now we can define the cost function f and its (Riemannian) gradient operatornamegrad f for the Riemannian center of mass:","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"f(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)));","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"and just call gradient_descent. 
For a first start, we do not have to provide more than the manifold, the cost, the gradient, and a starting point, which we just set to the first data point","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m1 = gradient_descent(M, f, grad_f, data[1])","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"3-element Vector{Float64}:\n 0.6868392794790367\n 0.006531600680668244\n 0.7267799820834814","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"In order to get more details, we further add the debug= keyword argument, which acts as a decorator pattern.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"This way we can easily specify a certain debug to be printed. The goal is to get an output of the form","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"# i | Last Change: [...] | F(x): [...] |","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"but where we also want to fix the display format for the change and the cost numbers (the [...]) to have a certain format. Furthermore, the reason why the solver stopped should be printed at the end","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"These can easily be specified using either a Symbol – using the default format for numbers – or a tuple of a symbol and a format-string in the debug= keyword that is available for every solver. 
We can also – for illustration reasons – just look at the first 6 steps by setting a stopping_criterion=","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m2 = gradient_descent(M, f, grad_f, data[1];\n debug=[:Iteration,(:Change, \"|Δp|: %1.9f |\"),\n (:Cost, \" F(x): %1.11f | \"), \"\\n\", :Stop],\n stopping_criterion = StopAfterIteration(6)\n )","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial F(x): 0.32487988924 | \n# 1 |Δp|: 1.063609017 | F(x): 0.25232524046 | \n# 2 |Δp|: 0.809858671 | F(x): 0.20966960102 | \n# 3 |Δp|: 0.616665145 | F(x): 0.18546505598 | \n# 4 |Δp|: 0.470841764 | F(x): 0.17121604104 | \n# 5 |Δp|: 0.359345690 | F(x): 0.16300825911 | \n# 6 |Δp|: 0.274597420 | F(x): 0.15818548927 | \nThe algorithm reached its maximal number of iterations (6).\n\n3-element Vector{Float64}:\n 0.7533872481682505\n -0.060531070555836314\n 0.6547851890466334","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"See here for the list of available symbols.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"!!! info \\\"Technical Detail\\\" The debug= keyword is actually a list of DebugActions added to every iteration, even allowing you to write your own. Additionally, :Stop is an action added to the end of the solver to display the reason why the solver stopped.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"The default stopping criterion for gradient_descent is to either stop when the gradient is small (<1e-9) or a maximal number of iterations is reached (as a fallback). Combining stopping-criteria can be done by | or &. 
We further pass a number 25 to debug= to only print an output every 25th iteration:","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m3 = gradient_descent(M, f, grad_f, data[1];\n debug=[:Iteration,(:Change, \"|Δp|: %1.9f |\"),\n (:Cost, \" F(x): %1.11f | \"), \"\\n\", :Stop, 25],\n stopping_criterion = StopWhenGradientNormLess(1e-14) | StopAfterIteration(400),\n)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial F(x): 0.32487988924 | \n# 25 |Δp|: 0.459715605 | F(x): 0.15145076374 | \n# 50 |Δp|: 0.000551270 | F(x): 0.15145051509 | \nThe algorithm reached approximately critical point after 70 iterations; the gradient norm (9.399656483458736e-16) is less than 1.0e-14.\n\n3-element Vector{Float64}:\n 0.6868392794788667\n 0.006531600680779304\n 0.726779982083641","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"We can finally use another way to determine the stepsize, for example the slightly more expensive ArmijoLinesearch than the default stepsize rule used on the Sphere.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"m4 = gradient_descent(M, f, grad_f, data[1];\n debug=[:Iteration,(:Change, \"|Δp|: %1.9f |\"),\n (:Cost, \" F(x): %1.11f | \"), \"\\n\", :Stop, 2],\n stepsize = ArmijoLinesearch(M; contraction_factor=0.999, sufficient_decrease=0.5),\n stopping_criterion = StopWhenGradientNormLess(1e-14) | StopAfterIteration(400),\n)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial F(x): 0.32487988924 | \n# 2 |Δp|: 0.001318138 | F(x): 0.15145051509 | \n# 4 |Δp|: 0.000000004 | F(x): 0.15145051509 | \n# 6 |Δp|: 0.000000000 | F(x): 0.15145051509 | \n# 8 |Δp|: 0.000000000 | F(x): 
0.15145051509 | \nThe algorithm reached approximately critical point after 8 iterations; the gradient norm (6.7838288590006e-15) is less than 1.0e-14.\n\n3-element Vector{Float64}:\n 0.6868392794788671\n 0.006531600680779187\n 0.726779982083641","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Then we reach approximately the same point as in the previous run, but in far fewer steps","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"[f(M, m3)-f(M,m4), distance(M, m3, m4)]","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"2-element Vector{Float64}:\n 2.7755575615628914e-16\n 4.592670164656332e-16","category":"page"},{"location":"tutorials/Optimize!/#Example-2:-Computing-the-median-of-symmetric-positive-definite-matrices.","page":"Get started: Optimize!","title":"Example 2: Computing the median of symmetric positive definite matrices.","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"For the second example let’s consider the manifold of 3×3 symmetric positive definite matrices and again 100 random points","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"N = SymmetricPositiveDefinite(3)\nm = 100\nσ = 0.005\nq = Matrix{Float64}(I, 3, 3)\ndata2 = [exp(N, q, σ * rand(N; vector_at=q)) for i in 1:m];","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Instead of the mean, let’s consider a non-smooth optimisation task: The median can be generalized to manifolds as the minimiser of the sum of distances, see e.g. [Bac14]. 
We define","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"g(N, q) = sum(1 / (2 * m) * distance.(Ref(N), Ref(q), data2))","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"g (generic function with 1 method)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Since the function is non-smooth, we cannot use a gradient-based approach. But since for every summand the proximal map is available, we can use the cyclic proximal point algorithm (CPPA). We hence define the vector of proximal maps as","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"proxes_g = Function[(N, λ, q) -> prox_distance(N, λ / m, di, q, 1) for di in data2];","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Besides looking at some debug prints, we can also easily record these values. Similarly to debug=, record= also accepts Symbols, see list here, to indicate things to record. 
We further set return_state to true to obtain not just the (approximate) minimizer but the whole solver state.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"s = cyclic_proximal_point(N, g, proxes_g, data2[1];\n debug=[:Iteration,\" | \",:Change,\" | \",(:Cost, \"F(x): %1.12f\"),\"\\n\", 1000, :Stop,\n ],\n record=[:Iteration, :Change, :Cost, :Iterate],\n return_state=true,\n );","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"Initial | | F(x): 0.005875512856\n# 1000 | Last Change: 0.003704 | F(x): 0.003239019699\n# 2000 | Last Change: 0.000015 | F(x): 0.003238996105\n# 3000 | Last Change: 0.000005 | F(x): 0.003238991748\n# 4000 | Last Change: 0.000002 | F(x): 0.003238990225\n# 5000 | Last Change: 0.000001 | F(x): 0.003238989520\nThe algorithm reached its maximal number of iterations (5000).","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"!!! note \\\"Technical Detail\\\" The recording is realised by RecordActions that are (also) executed at every iteration. 
These can also be individually implemented and added to the record= array instead of symbols.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"\nFirst, the computed median can be accessed as\n\n::: {.cell execution_count=14}","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"{.julia .cell-code} median = get_solver_result(s)","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"3×3 Matrix{Float64}:\n 1.0 2.12236e-5 0.000398721\n 2.12236e-5 1.00044 0.000141798\n 0.000398721 0.000141798 1.00041","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":":::","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"but we can also look at the recorded values. For simplicity (of output), let's just look at the recorded values at iteration 42","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"get_record(s)[42]","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"(42, 1.0569455861045147e-5, 0.0032525477393699743, [0.9998583866917474 0.00020988803126553712 0.0002895445818457687; 0.0002098880312654816 1.0000931572564826 0.00020843715016866105; 0.00028954458184579646 0.00020843715016866105 1.0000709207432568])","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"But we can also access whole series and see that the cost does not decrease that fast; actually, the CPPA might converge relatively slowly. 
For that we can for example access the :Cost that was recorded every :Iterate as well as the (maybe a little boring) :Iteration-number in a semilogplot.","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"x = get_record(s, :Iteration, :Iteration)\ny = get_record(s, :Iteration, :Cost)\nusing Plots\nplot(x,y,xaxis=:log, label=\"CPPA Cost\")","category":"page"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"(Image: )","category":"page"},{"location":"tutorials/Optimize!/#Literature","page":"Get started: Optimize!","title":"Literature","text":"","category":"section"},{"location":"tutorials/Optimize!/","page":"Get started: Optimize!","title":"Get started: Optimize!","text":"
[AMS08]
\n
\n
P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press (2008). [open access](http://press.princeton.edu/chapters/absil/).
","category":"page"},{"location":"#Welcome-to-Manopt.jl","page":"Home","title":"Welcome to Manopt.jl","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"CurrentModule = Manopt","category":"page"},{"location":"","page":"Home","title":"Home","text":"Manopt.Manopt","category":"page"},{"location":"#Manopt.Manopt","page":"Home","title":"Manopt.Manopt","text":"🏔️ Manopt.jl – Optimization on Manifolds in Julia.\n\n📚 Documentation: manoptjl.org\n📦 Repository: github.com/JuliaManifolds/Manopt.jl\n💬 Discussions: github.com/JuliaManifolds/Manopt.jl/discussions\n🎯 Issues: github.com/JuliaManifolds/Manopt.jl/issues\n\n\n\n\n\n","category":"module"},{"location":"","page":"Home","title":"Home","text":"For a function fmathcal M ℝ defined on a Riemannian manifold mathcal M we aim to solve","category":"page"},{"location":"","page":"Home","title":"Home","text":"operatorname*argmin_p mathcal M f(p)","category":"page"},{"location":"","page":"Home","title":"Home","text":"or in other words: find the point p on the manifold, where f reaches its minimal function value.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Manopt.jl provides a framework for optimization on manifolds as well as a Library of optimization algorithms in Julia. It belongs to the “Manopt family”, which includes Manopt (Matlab) and pymanopt.org (Python).","category":"page"},{"location":"","page":"Home","title":"Home","text":"If you want to delve right into Manopt.jl check out the Get started: Optimize! tutorial.","category":"page"},{"location":"","page":"Home","title":"Home","text":"Manopt.jl makes it easy to use an algorithm for your favourite manifold as well as a manifold for your favourite algorithm. 
It already provides many manifolds and algorithms, which can easily be enhanced, for example to record certain data or debug output throughout iterations.","category":"page"},{"location":"","page":"Home","title":"Home","text":"If you use Manopt.jl in your work, please cite the following","category":"page"},{"location":"","page":"Home","title":"Home","text":"@article{Bergmann2022,\n Author = {Ronny Bergmann},\n Doi = {10.21105/joss.03866},\n Journal = {Journal of Open Source Software},\n Number = {70},\n Pages = {3866},\n Publisher = {The Open Journal},\n Title = {Manopt.jl: Optimization on Manifolds in {J}ulia},\n Volume = {7},\n Year = {2022},\n}","category":"page"},{"location":"","page":"Home","title":"Home","text":"To refer to a certain version or the source code in general we recommend citing, for example,","category":"page"},{"location":"","page":"Home","title":"Home","text":"@software{manoptjl-zenodo-mostrecent,\n Author = {Ronny Bergmann},\n Copyright = {MIT License},\n Doi = {10.5281/zenodo.4290905},\n Publisher = {Zenodo},\n Title = {Manopt.jl},\n Year = {2022},\n}","category":"page"},{"location":"","page":"Home","title":"Home","text":"for the most recent version or a corresponding version specific DOI, see the list of all versions. Note that both citations are in BibLaTeX format.","category":"page"},{"location":"#Main-Features","page":"Home","title":"Main Features","text":"","category":"section"},{"location":"#Optimization-Algorithms-(Solvers)","page":"Home","title":"Optimization Algorithms (Solvers)","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"For every optimization algorithm, a solver is implemented based on an AbstractManoptProblem that describes the problem to solve and an AbstractManoptSolverState that sets up the solver and stores interim values. 
Together they form a plan.","category":"page"},{"location":"#Manifolds","page":"Home","title":"Manifolds","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"This project is built upon ManifoldsBase.jl, a generic interface to implement manifolds. Certain functions are extended for specific manifolds from Manifolds.jl, but all other manifolds from that package can be used here, too.","category":"page"},{"location":"","page":"Home","title":"Home","text":"The notation in the documentation aims to follow that of these packages.","category":"page"},{"location":"#Functions-on-Manifolds","page":"Home","title":"Functions on Manifolds","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"Several functions are available, implemented on arbitrary manifolds: cost functions, differentials and their adjoints, and gradients as well as proximal maps.","category":"page"},{"location":"#Visualization","page":"Home","title":"Visualization","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"To visualize and interpret results, Manopt.jl aims to provide both easy plot functions as well as exports. Furthermore, there is a system to print debug output during the iterations of an algorithm as well as record capabilities, i.e. to record a specified tuple of values per iteration, most prominently RecordCost and RecordIterate. Take a look at the Get Started: Optimize! tutorial on how to easily activate this.","category":"page"},{"location":"#Literature","page":"Home","title":"Literature","text":"","category":"section"},{"location":"","page":"Home","title":"Home","text":"If you want to get started with manifolds, one book is [Car92], and if you want to directly dive into optimization on manifolds, good references are [AMS08] and [Bou23], which are both available online for free.","category":"page"},{"location":"","page":"Home","title":"Home","text":"
[AMS08]
\n
\n
P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press (2008). [open access](http://press.princeton.edu/chapters/absil/).
","category":"page"},{"location":"references/#Literature","page":"References","title":"Literature","text":"","category":"section"},{"location":"references/","page":"References","title":"References","text":"This is all literature mentioned / referenced in the Manopt.jl documentation. Usually you will find a small reference section at the end of every documentation page that contains references.","category":"page"},{"location":"references/","page":"References","title":"References","text":"
P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press (2008). [open access](http://press.princeton.edu/chapters/absil/).
M. Bačák, R. Bergmann, G. Steidl and A. Weinmann. A second order non-smooth variational model for restoring manifold-valued images. SIAM Journal on Scientific Computing 38, A567–A597 (2016), arXiv: [1506.02409](https://arxiv.org/abs/1506.02409).
\n
[Bea72]
\n
\n
E. M. Beale. A derivation of conjugate gradients. In: Numerical methods for nonlinear optimization, 39–43, Academic Press, London (1972).
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
\n
[BH19]
\n
\n
R. Bergmann and R. Herzog. Intrinsic formulation of KKT conditions and constraint qualifications on smooth manifolds. SIAM Journal on Optimization 29, 2423–2444 (2019), arXiv: [1804.06214](https://arxiv.org/abs/1804.06214).
\n
[BHS+21]
\n
\n
R. Bergmann, R. Herzog, M. Silva Louzeiro, D. Tenbrinck and J. Vidal-Núñez. Fenchel duality theory and a primal-dual algorithm on Riemannian manifolds. Foundations of Computational Mathematics 21, 1465–1504 (2021), arXiv: [1908.02022](http://arxiv.org/abs/1908.02022).
\n
[BLSW14]
\n
\n
R. Bergmann, F. Laus, G. Steidl and A. Weinmann. Second order differences of cyclic data and applications in variational denoising. SIAM Journal on Imaging Sciences 7, 2916–2953 (2014), arXiv: [1405.5349](https://arxiv.org/abs/1405.5349).
\n
[BPS16]
\n
\n
R. Bergmann, J. Persch and G. Steidl. A parallel Douglas–Rachford algorithm for minimizing ROF-like functionals on images with values in symmetric Hadamard manifolds. SIAM Journal on Imaging Sciences 9, 901–937 (2016), arXiv: [1512.02814](https://arxiv.org/abs/1512.02814).
W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv: [2102.10309](https://arxiv.org/abs/2102.10309).
\n
[DMSC16]
\n
\n
J. Duran, M. Moeller, C. Sbert and D. Cremers. Collaborative Total Variation: A General Framework for Vectorial TV Models. SIAM Journal on Imaging Sciences 9, 116–151 (2016), arXiv: [1508.01308](https://arxiv.org/abs/1508.01308).
G. N. Grapiglia and G. F. Stella. An Adaptive Riemannian Gradient Method Without Function Evaluations. Journal of Optimization Theory and Applications 197, 1140–1160 (2023), preprint: [optimization-online.org/wp-content/uploads/2022/04/8864.pdf](https://optimization-online.org/wp-content/uploads/2022/04/8864.pdf).
","category":"page"},{"location":"tutorials/StochasticGradientDescent/#How-to-Run-Stochastic-Gradient-Descent","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"","category":"section"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"This tutorial illustrates how to use the stochastic_gradient_descent solver and different DirectionUpdateRules in order to introduce the average or momentum variant, see Stochastic Gradient Descent.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Computationally, we look at a very simple but large scale problem, the Riemannian Center of Mass or Fréchet mean: for given points p_i mathcal M, i=1N this optimization problem reads","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"operatorname*argmin_xmathcal M frac12sum_i=1^N\n operatornamed^2_mathcal M(xp_i)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"which of course can be (and is) solved by a gradient descent, see the introductory tutorial or Statistics in Manifolds.jl. If N is very large, evaluating the complete gradient might be quite expensive. 
A remedy is to evaluate only one of the terms at a time and choose a random order for these.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"We first initialize the packages","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"using Manifolds, Manopt, Random, BenchmarkTools\nRandom.seed!(42);","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"We next generate a (little) large(r) data set","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"n = 5000\nσ = π / 12\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Note that due to the construction of the points as zero mean tangent vectors, the mean should be very close to our initial point p.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"In order to use the stochastic gradient, we now need a function that returns the vector of gradients. 
There are two ways to define it in Manopt.jl: either as a single function that returns a vector, or as a vector of functions.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"The first variant is of course easier to define, but the second is more efficient when only evaluating one of the gradients.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"For the mean, the gradient is","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"operatornamegradf(x) = sum_i=1^N operatornamegradf_i(x) quad textwhere operatornamegradf_i(x) = -log_x p_i","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"which we define in Manopt.jl in two different ways: either as one function returning all gradients as a vector (see gradF), or – maybe more fitting for a large scale problem – as a vector of small gradient functions (see gradf)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"F(M, p) = 1 / (2 * n) * sum(map(q -> distance(M, p, q)^2, data))\ngradF(M, p) = [grad_distance(M, q, p) for q in data]\ngradf = [(M, p) -> grad_distance(M, q, p) for q in data];\np0 = 1 / sqrt(3) * [1.0, 1.0, 1.0]","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.5773502691896258\n 0.5773502691896258\n 
0.5773502691896258","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"The calls are only slightly different, but notice that accessing the 2nd gradient element requires evaluating all logs in the first function, while we only call one of the functions in the second array of functions. So while you can use both gradF and gradf in the following call, the second one is (much) faster:","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt1 = stochastic_gradient_descent(M, gradF, p)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n -0.034408323150541376\n 0.028979490714898942\n -0.2172726573502577","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"@benchmark stochastic_gradient_descent($M, $gradF, $p0)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 1 sample with 1 evaluation.\n Single result which took 8.795 s (5.82% GC) to evaluate,\n with a memory estimate of 7.83 GiB, over 100161804 allocations.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt2 = stochastic_gradient_descent(M, gradf, p0)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient 
Descent","text":"3-element Vector{Float64}:\n 0.37206187599994556\n -0.11462522239619985\n 0.9211031531907937","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"@benchmark stochastic_gradient_descent($M, $gradf, $p0)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 890 samples with 1 evaluation.\n Range (min … max): 5.189 ms … 10.524 ms ┊ GC (min … max): 0.00% … 36.83%\n Time (median): 5.267 ms ┊ GC (median): 0.00%\n Time (mean ± σ): 5.611 ms ± 1.070 ms ┊ GC (mean ± σ): 5.35% ± 11.22%\n\n █▄ \n ██▄▅▃▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▃▃▂ ▂\n 5.19 ms Histogram: frequency by time 9.33 ms <\n\n Memory estimate: 3.43 MiB, allocs estimate: 50030.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"This result is reasonably close. But we can improve it by using a DirectionUpdateRule, namely:","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"On the one hand MomentumGradient, which requires both the manifold and the initial value, in order to keep track of the iterate and parallel transport the last direction to the current iterate. The necessary vector_transport_method keyword is set to a suitable default on every manifold, see default_vector_transport_method. 
We get:","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt3 = stochastic_gradient_descent(\n M, gradf, p0; direction=MomentumGradient(M, p0; direction=StochasticGradient(M))\n)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n -0.6605946566435753\n 0.24633535998595033\n -0.7091781088235515","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"MG = MomentumGradient(M, p0; direction=StochasticGradient(M));\n@benchmark stochastic_gradient_descent($M, $gradf, $p0; direction=$MG)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 200 samples with 1 evaluation.\n Range (min … max): 23.306 ms … 36.966 ms ┊ GC (min … max): 0.00% … 13.75%\n Time (median): 23.815 ms ┊ GC (median): 0.00%\n Time (mean ± σ): 24.993 ms ± 2.260 ms ┊ GC (mean ± σ): 4.76% ± 7.15%\n\n ▃█▂▄ \n ▅▇████▆▆▃▂▂▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃▃▅▄▇▄▄▂▄▃▁▁▁▂ ▃\n 23.3 ms Histogram: frequency by time 29.2 ms <\n\n Memory estimate: 11.36 MiB, allocs estimate: 249516.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"And on the other hand the AverageGradient computes an average of the last n gradients, i.e.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"p_opt4 = stochastic_gradient_descent(\n M, gradf, p0; 
direction=AverageGradient(M, p0; n=10, direction=StochasticGradient(M))\n)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.8041185045468516\n 0.08386875203799127\n 0.5885231202569053","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"AG = AverageGradient(M, p0; n=10, direction=StochasticGradient(M));\n@benchmark stochastic_gradient_descent($M, $gradf, $p0; direction=$AG)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 84 samples with 1 evaluation.\n Range (min … max): 55.228 ms … 65.851 ms ┊ GC (min … max): 0.00% … 7.58%\n Time (median): 60.566 ms ┊ GC (median): 8.15%\n Time (mean ± σ): 59.708 ms ± 2.240 ms ┊ GC (mean ± σ): 6.44% ± 3.44%\n\n ▅ █▃ \n ▃▅▇█▃▃▃▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▆████▇▄▇▅▄▁▁▁▁▁▁▁▁▁▁▁▁▁▁▃ ▁\n 55.2 ms Histogram: frequency by time 63.7 ms <\n\n Memory estimate: 34.25 MiB, allocs estimate: 569516.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"Note that the default StoppingCriterion is a fixed number of iterations which helps the comparison here.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"For both update rules we have to internally specify that we are still in the stochastic setting, since both rules can also be used with the IdentityUpdateRule within gradient_descent.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run 
Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"For this not-that-large-scale example we can of course also use a gradient descent with ArmijoLinesearch,","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"fullGradF(M, p) = sum(grad_distance(M, q, p) for q in data)\np_opt5 = gradient_descent(M, F, fullGradF, p0; stepsize=ArmijoLinesearch(M))","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"3-element Vector{Float64}:\n 0.6595265191812062\n 0.1457504051994757\n 0.7374154798218656","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"but it will be a little slower usually","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"AL = ArmijoLinesearch(M);\n@benchmark gradient_descent($M, $F, $fullGradF, $p0; stepsize=$AL)","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient Descent","title":"How to Run Stochastic Gradient Descent","text":"BenchmarkTools.Trial: 7 samples with 1 evaluation.\n Range (min … max): 783.478 ms … 805.992 ms ┊ GC (min … max): 7.44% … 7.38%\n Time (median): 786.469 ms ┊ GC (median): 7.51%\n Time (mean ± σ): 789.545 ms ± 7.991 ms ┊ GC (mean ± σ): 7.47% ± 0.06%\n\n ▁ ▁ ▁ █ ▁ ▁ \n █▁▁█▁▁█▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁█ ▁\n 783 ms Histogram: frequency by time 806 ms <\n\n Memory estimate: 703.16 MiB, allocs estimate: 9021018.","category":"page"},{"location":"tutorials/StochasticGradientDescent/","page":"How to Run Stochastic Gradient 
Descent","title":"How to Run Stochastic Gradient Descent","text":"Note that all 5 runs are very close to each other; here we check the distance to the first","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"EditURL = \"https://github.com/JuliaManifolds/Manopt.jl/blob/master/CONTRIBUTING.md\"","category":"page"},{"location":"contributing/#Contributing-to-Manopt.jl","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"First, thanks for taking the time to contribute. Any contribution is appreciated and welcome.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"The following is a set of guidelines for contributing to Manopt.jl.","category":"page"},{"location":"contributing/#Table-of-Contents","page":"Contributing to Manopt.jl","title":"Table of Contents","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"Contributing to Manopt.jl - Table of Contents\nI just have a question\nHow can I file an issue?\nHow can I contribute?\nAdd a missing method\nProvide a new algorithm\nProvide a new example\nCode style","category":"page"},{"location":"contributing/#I-just-have-a-question","page":"Contributing to Manopt.jl","title":"I just have a question","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"The developer can most easily be reached in the Julia Slack channel #manifolds. You can apply for the Julia Slack workspace here if you haven't joined yet. 
You can also ask your question on discourse.julialang.org.","category":"page"},{"location":"contributing/#How-can-I-file-an-issue?","page":"Contributing to Manopt.jl","title":"How can I file an issue?","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"If you found a bug or want to propose a feature, we track our issues within the GitHub repository.","category":"page"},{"location":"contributing/#How-can-I-contribute?","page":"Contributing to Manopt.jl","title":"How can I contribute?","text":"","category":"section"},{"location":"contributing/#Add-a-missing-method","page":"Contributing to Manopt.jl","title":"Add a missing method","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"There are still a lot of methods missing within the optimization framework of Manopt.jl, be it functions, gradients, differentials, proximal maps, step size rules or stopping criteria. If you notice a method missing and can contribute an implementation, please do so! Even providing a single new method is a good contribution.","category":"page"},{"location":"contributing/#Provide-a-new-algorithm","page":"Contributing to Manopt.jl","title":"Provide a new algorithm","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"A main contribution you can provide is another algorithm that is not yet included in the package. An algorithm is always based on a concrete type of an AbstractManoptProblem storing the main information of the task and a concrete type of an AbstractManoptSolverState storing all information that needs to be known to the solver in general. 
The actual algorithm is split into an initialization phase, see initialize_solver!, and the implementation of the ith step of the solver itself, see step_solver!. For these two functions, it would be great if a new algorithm uses functions from the ManifoldsBase.jl interface as generically as possible. For example, if possible use retract!(M,q,p,X) in favor of exp!(M,q,p,X) to perform a step starting in p in direction X (in place of q), since the exponential map might be too expensive to evaluate or might not be available on a certain manifold. See Retractions and inverse retractions for more details. Further, if possible, prefer retract!(M,q,p,X) in favor of retract(M,p,X), since a computation in place of a suitable variable q reduces memory allocations.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"Usually, the methods implemented in Manopt.jl also have a high-level interface that is easier to call, creates the necessary problem and options structure, and calls the solver.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"The two technical functions initialize_solver! and step_solver! 
should be documented with technical details, while the high level interface should usually provide a general description and some literature references to the algorithm at hand.","category":"page"},{"location":"contributing/#Provide-a-new-example","page":"Contributing to Manopt.jl","title":"Provide a new example","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"Example problems are available at ManoptExamples.jl, where also their reproducible Quarto-Markdown files are stored.","category":"page"},{"location":"contributing/#Code-style","page":"Contributing to Manopt.jl","title":"Code style","text":"","category":"section"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"We try to follow the documentation guidelines from the Julia documentation as well as Blue Style. We run JuliaFormatter.jl on the repo in the way set in the .JuliaFormatter.toml file, which enforces a number of conventions consistent with the Blue Style.","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"We also follow a few internal conventions:","category":"page"},{"location":"contributing/","page":"Contributing to Manopt.jl","title":"Contributing to Manopt.jl","text":"It is preferred that the AbstractManoptProblem's struct contains information about the general structure of the problem.\nAny implemented function should be accompanied by its mathematical formulae if a closed form exists.\nAbstractManoptProblem and option structures are stored within the plan/ folder and sorted by properties of the problem and/or solver at hand.\nWithin the source code of one algorithm, the high level interface should be first, then the initialization, then the step.\nOtherwise an alphabetical order is preferable.\nThe above implies that the mutating variant of a function follows the non-mutating 
variant.\nThere should be no dangling = signs.\nAlways add a newline between things of different types (struct/method/const).\nAlways add a newline between methods for different functions (including mutating/nonmutating variants).\nPrefer to have no newline between methods for the same function; when reasonable, merge the docstrings.\nAll import/using/include should be in the main module file.","category":"page"},{"location":"helpers/checks/#Checks","page":"Checks","title":"Checks","text":"","category":"section"},{"location":"helpers/checks/","page":"Checks","title":"Checks","text":"If you have computed a gradient or differential and you are not sure whether it is correct, the following functions provide numerical checks.","category":"page"},{"location":"helpers/checks/","page":"Checks","title":"Checks","text":"Modules = [Manopt]\nPages = [\"checks.jl\"]","category":"page"},{"location":"helpers/checks/#Manopt.check_Hessian","page":"Checks","title":"Manopt.check_Hessian","text":"check_Hessian(M, f, grad_f, Hess_f, p=rand(M), X=rand(M; vector_at=p), Y=rand(M; vector_at=p); kwargs...)\n\nCheck numerically whether the Hessian operatornameHess f(M,p,X) of f(M,p) is correct.\n\nFor this we require either a second-order retraction or a critical point p of f.\n\nThe check verifies whether\n\nf(operatornameretr_p(tX)) = f(p) + toperatornamegrad f(p) X + fract^22operatornameHessf(p)X X + mathcal O(t^3)\n\nor in other words, that the error between the function f and its second order Taylor expansion behaves like mathcal O(t^3), which indicates that the Hessian is correct, cf. 
also Section 6.8, Boumal, Cambridge Press, 2023.\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\ncheck_grad – (true) check whether operatornamegrad f(p) in T_pmathcal M.\ncheck_linearity – (true) check whether the Hessian is linear, see is_Hessian_linear using a, b, X, and Y\ncheck_symmetry – (true) check whether the Hessian is symmetric, see is_Hessian_symmetric\ncheck_vector – (false) check whether operatornameHess f(p)X in T_pmathcal M using is_vector.\nmode - (:Default) specify the mode; by default we assume to have a second order retraction given by retraction_method=. You can also use this mode if you already have a critical point p. Set to :CriticalPoint to use gradient_descent to find a critical point. Note: This requires (and evaluates) new tangent vectors X and Y.\natol, rtol – (same defaults as isapprox) tolerances that are passed down to all checks\na, b – two real values to check linearity of the Hessian (if check_linearity=true)\nexactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\ngradient - (grad_f(M, p)) instead of the gradient function you can also provide the gradient at p directly\nHessian - (Hess_f(M, p, X)) instead of the Hessian function you can provide the result of operatornameHess f(p)X directly. Note that evaluations of the Hessian might still be necessary for checking linearity and symmetry and/or when using :CriticalPoint mode.\nlimits - ((1e-8,1)) specify the limits in the log_range\nlog_range - (range(limits[1], limits[2]; length=N)) specify the range of points (in log scale) to sample the Hessian line\nN - (101) number of points to check within the log_range default range 10^-810^0\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). 
The plot is in log-log-scale. This is returned and can then also be saved.\nretraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\nthrow_error - (false) throw an error message if the Hessian is wrong\nwindow – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.\n\nThe kwargs... are also passed down to the check_vector call, such that tolerances can easily be set.\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.check_differential","page":"Checks","title":"Manopt.check_differential","text":"check_differential(M, F, dF, p=rand(M), X=rand(M; vector_at=p); kwargs...)\n\nCheck numerically whether the differential dF(M,p,X) of F(M,p) is correct.\n\nThis implements the method described in Section 4.8, Boumal, Cambridge Press, 2023.\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\nexactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\nlimits ((1e-8,1)) specify the limits in the log_range\nlog_range (range(limits[1], limits[2]; length=N)) - specify the range of points (in log scale) to sample the differential line\nN (101) – number of points to check within the log_range default range 10^-810^0\nname (\"differential\") – name to display in the check (e.g. if checking differential)\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. 
This is returned and can then also be saved.\nretraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\nthrow_error - (false) throw an error message if the differential is wrong\nwindow – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.check_gradient","page":"Checks","title":"Manopt.check_gradient","text":"check_gradient(M, F, gradF, p=rand(M), X=rand(M; vector_at=p); kwargs...)\n\nCheck numerically whether the gradient gradF(M,p) of F(M,p) is correct, that is whether\n\nf(operatornameretr_p(tX)) = f(p) + toperatornamegrad f(p) X + mathcal O(t^2)\n\nor in other words, that the error between the function f and its first order Taylor expansion behaves like mathcal O(t^2), which indicates that the gradient is correct, cf. also Section 4.8, Boumal, Cambridge Press, 2023.\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\ncheck_vector – (true) check whether operatornamegrad f(p) in T_pmathcal M using is_vector.\nexactness_tol - (1e-12) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\ngradient - (grad_f(M, p)) instead of the gradient function you can also provide the gradient at p directly\nlimits - ((1e-8,1)) specify the limits in the log_range\nlog_range - (range(limits[1], limits[2]; length=N)) - specify the range of points (in log scale) to sample the gradient line\nN - (101) – number of points to check within the log_range default range 10^-810^0\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. 
This is returned and can then also be saved.\nretraction_method - (default_retraction_method(M, typeof(p))) retraction method to use for the check\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\natol, rtol – (same defaults as isapprox) tolerances that are passed down to is_vector if check_vector is set to true\nthrow_error - (false) throw an error message if the gradient is wrong\nwindow – (nothing) specify window sizes within the log_range that are used for the slope estimation. The default is to use all window sizes 2:N.\n\nThe kwargs... are also passed down to the check_vector call, such that tolerances can easily be set.\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.find_best_slope_window","page":"Checks","title":"Manopt.find_best_slope_window","text":"(a,b,i,j) = find_best_slope_window(X,Y,window=nothing; slope=2.0, slope_tol=0.1)\n\nCheck data X,Y for the largest contiguous interval (window) with a regression line fitting “best”. Among all intervals with a slope within slope_tol to slope the longest one is taken. If no such interval exists, the one with the slope closest to slope is taken.\n\nIf the window is set to nothing (default), all window sizes 2,...,length(X) are checked. You can also specify a window size or an array of window sizes.\n\nFor each window size, all its translates in the data are checked. For all these (shifted) windows the regression line is computed (i.e. 
a,b in a + t*b) and the best line is computed.\n\nFrom the best line the following data is returned\n\na, b specifying the regression line a + t*b\ni, j determining the window, i.e. the regression line stems from data X[i], ..., X[j]\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.is_Hessian_linear","page":"Checks","title":"Manopt.is_Hessian_linear","text":"is_Hessian_linear(M, Hess_f, p,\n X=rand(M; vector_at=p), Y=rand(M; vector_at=p), a=randn(), b=randn();\n throw_error=false, io=nothing, kwargs...\n)\n\nCheck whether the Hessian function Hess_f fulfills linearity, i.e. that\n\noperatornameHess f(p)aX + bY = aoperatornameHess f(p)X\n + boperatornameHess f(p)Y\n\nwhich is checked using isapprox and the kwargs... are passed to this function.\n\nOptional Arguments\n\nthrow_error - (false) throw an error message if the Hessian is wrong\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.is_Hessian_symmetric","page":"Checks","title":"Manopt.is_Hessian_symmetric","text":"is_Hessian_symmetric(M, Hess_f, p=rand(M), X=rand(M; vector_at=p), Y=rand(M; vector_at=p);\nthrow_error=false, io=nothing, atol::Real=0, rtol::Real=atol>0 ? 0 : √eps\n\n)\n\nCheck whether the Hessian function Hess_f fulfills symmetry, i.e. that\n\noperatornameHess f(p)X Y = X operatornameHess f(p)Y\n\nwhich is checked using isapprox and the kwargs... are passed to this function.\n\nOptional Arguments\n\natol, rtol - with the same defaults as the usual isapprox\nthrow_error - (false) throw an error message if the Hessian is wrong\n\n\n\n\n\n","category":"function"},{"location":"helpers/checks/#Manopt.plot_slope-Tuple{Any, Any}","page":"Checks","title":"Manopt.plot_slope","text":"plot_slope(x, y; slope=2, line_base=0, a=0, b=2.0, i=1,j=length(x))\n\nPlot the result from the error check functions, e.g. 
check_gradient, check_differential, check_Hessian on data x,y with two comparison lines\n\nline_base + t*slope as the global slope the plot should have\na + b*t on the interval [x[i], x[j]] for some (best fitting) comparison slope\n\n\n\n\n\n","category":"method"},{"location":"helpers/checks/#Manopt.prepare_check_result-Tuple{Any, Any, Any}","page":"Checks","title":"Manopt.prepare_check_result","text":"prepare_check_result(log_range, errors, slope)\n\nGiven a range of values log_range, where we computed errors, check whether this yields a slope of slope in log-scale\n\nNote that if the errors are below the given tolerance and the method is exact, no plot will be generated.\n\nKeyword arguments\n\nexactness_tol - (1e3*eps(eltype(errors))) if all errors are below this tolerance, the check is considered to be exact\nio – (nothing) provide an IO to print the check result to\nname (\"differential\") – name to display in the check (e.g. if checking gradient)\nplot - (false) whether to plot the resulting check (if Plots.jl is loaded). The plot is in log-log-scale. This is returned and can then also be saved.\nslope_tol – (0.1) tolerance for the slope (global) of the approximation\nthrow_error - (false) throw an error message if the gradient or Hessian is wrong\n\n\n\n\n\n","category":"method"},{"location":"helpers/checks/#Literature","page":"Checks","title":"Literature","text":"","category":"section"},{"location":"helpers/checks/","page":"Checks","title":"Checks","text":"
","category":"page"},{"location":"solvers/difference_of_convex/#DifferenceOfConvexSolvers","page":"Difference of Convex","title":"Difference of Convex","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/difference_of_convex/#DCASolver","page":"Difference of Convex","title":"Difference of Convex Algorithm","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"difference_of_convex_algorithm\ndifference_of_convex_algorithm!","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_algorithm","page":"Difference of Convex","title":"Manopt.difference_of_convex_algorithm","text":"difference_of_convex_algorithm(M, f, g, ∂h, p=rand(M); kwargs...)\ndifference_of_convex_algorithm(M, mdco, p; kwargs...)\n\nCompute the difference of convex algorithm Bergmann, Ferreira, Santos, Souza, preprint, 2023 to minimize\n\n operatorname*argmin_pmathcal M g(p) - h(p)\n\nwhere you need to provide f(p) = g(p) - h(p), g and the subdifferential h of h.\n\nThis algorithm performs the following steps given a start point p= p^(0). Then repeat for k=01ldots\n\nTake X^(k) h(p^(k))\nSet the next iterate to the solution of the subproblem\n\n p^(k+1) in operatorname*argmin_qin mathcal M g(q) - X^(k) log_p^(k)q\n\nuntil the stopping_criterion is fulfilled.\n\nOptional parameters\n\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form grad_f(M, p) or in place (InplaceEvaluation) in the form grad_f!(M, X, x)\ngradient – (nothing) specify operatornamegrad f, for debug / analysis or enhancing stopping_criterion=\ngrad_g – (nothing) specify the gradient of g. 
If specified, a subsolver is automatically set up.\ninitial_vector - (zero_vector(M, p)) initialise the inner tangent vector to store the subgradient result.\nstopping_criterion – (StopAfterIteration(200) |StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm – includes a StopWhenGradientNormLess(1e-8), when a gradient is provided.\n\nIf you specify the ManifoldDifferenceOfConvexObjective mdco, additionally\n\ng - (nothing) specify the function g. If specified, a subsolver is automatically set up.\n\nWhile there are several parameters for a sub solver, the easiest is to provide the function grad_g=, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like\n\ndifference_of_convex_algorithm(M, f, g, grad_h, p; grad_g=grad_g)\n\nOptional parameters for the sub problem\n\nsub_cost - (LinearizedDCCost(g, p, initial_vector)) a cost to be used within the default sub_problem. Use this if you have a more efficient version than the default that is built using g from above.\nsub_grad - (LinearizedDCGrad(grad_g, p, initial_vector; evaluation=evaluation)) gradient to be used within the default sub_problem. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.\nsub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs\nsub_kwargs - ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.\nsub_objective - (a gradient or Hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)\nsub_problem - (DefaultManoptProblem(M, sub_objective)) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. 
Then evaluation= is taken into account for the form of this function.\nsub_state - (TrustRegionsState by default, requires sub_hessian to be provided; decorated with sub_kwargs). Choose the solver by specifying a solver state to solve the sub_problem. If the sub_problem is a function (i.e. a closed form solution), this is set to evaluation and can be changed to the evaluation type of the closed form solution accordingly.\nsub_stopping_criterion - (StopAfterIteration(300) |StopWhenStepsizeLess(1e-9) |StopWhenGradientNormLess(1e-9)) a stopping criterion used within the default sub_state=\nsub_stepsize - (ArmijoLinesearch(M)) specify a step size used within the sub_state\n\n...all others are passed on to decorate the inner DifferenceOfConvexState.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_algorithm!","page":"Difference of Convex","title":"Manopt.difference_of_convex_algorithm!","text":"difference_of_convex_algorithm!(M, f, g, ∂h, p; kwargs...)\ndifference_of_convex_algorithm!(M, mdco, p; kwargs...)\n\nRun the difference of convex algorithm and perform the steps in place of p. 
See difference_of_convex_algorithm for more details.\n\nIf you specify the ManifoldDifferenceOfConvexObjective mdco, the g is a keyword argument.\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#DCPPASolver","page":"Difference of Convex","title":"Difference of Convex Proximal Point","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"difference_of_convex_proximal_point\ndifference_of_convex_proximal_point!","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_proximal_point","page":"Difference of Convex","title":"Manopt.difference_of_convex_proximal_point","text":"difference_of_convex_proximal_point(M, grad_h, p=rand(M); kwargs...)\ndifference_of_convex_proximal_point(M, mdcpo, p=rand(M); kwargs...)\n\nCompute the difference of convex proximal point algorithm Souza, Oliveira, J. Glob. Optim., 2015 to minimize\n\n operatorname*argmin_pmathcal M g(p) - h(p)\n\nwhere you have to provide the (sub) gradient h of h and either\n\nthe proximal map operatornameprox_lambda g of g as a function prox_g(M, λ, p) or prox_g(M, q, λ, p)\nthe functions g and grad_g to compute the proximal map using a sub solver\nyour own sub-solver, see optional keywords below\n\nThis algorithm performs the following steps given a start point p= p^(0). Then repeat for k=01ldots\n\nX^(k) operatornamegrad h(p^(k))\nq^(k) = operatornameretr_p^(k)(λ_kX^(k))\nr^(k) = operatornameprox_λ_kg(q^(k))\nX^(k) = operatornameretr^-1_p^(k)(r^(k))\nCompute a stepsize s_k and\nset p^(k+1) = operatornameretr_p^(k)(s_kX^(k)).\n\nuntil the stopping_criterion is fulfilled. See Almeida, da Cruz Neto, Oliveira, Souza, Comput. Optim. 
Appl., 2020 for more details on the modified variant, where we slightly changed steps 4 to 6, since here we get the classical proximal point method for DC functions for s_k = 1 and we can employ linesearches similar to other solvers.\n\nOptional parameters\n\nλ – ( i -> 1/2 ) a function returning the sequence of prox parameters λi\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or in place (InplaceEvaluation), i.e. of the form gradF!(M, X, x).\ncost - (nothing) provide the cost f, e.g. for debug reasons.\ngradient – (nothing) specify operatornamegrad f, for debug / analysis or enhancing the stopping_criterion\nprox_g - (nothing) specify a proximal map for the sub problem or both of the following\ng – (nothing) specify the function g.\ngrad_g – (nothing) specify the gradient of g. If both g and grad_g are specified, a subsolver is automatically set up.\ninverse_retraction_method - (default_inverse_retraction_method(M)) an inverse retraction method to use (see step 4).\nretraction_method – (default_retraction_method(M)) a retraction to use (see step 2)\nstepsize – (ConstantStepsize(M)) specify a Stepsize functor to run the modified algorithm (experimental).\nstopping_criterion (StopAfterIteration(200) |StopWhenChangeLess(1e-8)) a StoppingCriterion for the algorithm – includes a StopWhenGradientNormLess(1e-8), when a gradient is provided.\n\nWhile there are several parameters for a sub solver, the easiest is to provide the functions g and grad_g, such that a default cost and gradient can be generated and passed to a default subsolver. 
Hence the easiest example call looks like\n\ndifference_of_convex_proximal_point(M, grad_h, p0; g=g, grad_g=grad_g)\n\nOptional parameters for the sub problem\n\nsub_cost – (ProximalDCCost(g, copy(M, p), λ(1))) cost to be used within the default sub_problem that is initialized as soon as g is provided.\nsub_grad – (ProximalDCGrad(grad_g, copy(M, p), λ(1); evaluation=evaluation)) gradient to be used within the default sub_problem, that is initialized as soon as grad_g is provided. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.\nsub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs\nsub_kwargs – ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.\nsub_objective – (a gradient or Hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)\nsub_problem – (DefaultManoptProblem(M, sub_objective)) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. 
Then evaluation= is taken into account for the form of this function.\nsub_state – (TrustRegionsState – requires the sub_hessian to be provided, decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem\nsub_stopping_criterion - (StopAfterIteration(300) |StopWhenStepsizeLess(1e-9) |StopWhenGradientNormLess(1e-9)) a stopping criterion used within the default sub_state=\n\n...all others are passed on to decorate the inner DifferenceOfConvexProximalState.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Manopt.difference_of_convex_proximal_point!","page":"Difference of Convex","title":"Manopt.difference_of_convex_proximal_point!","text":"difference_of_convex_proximal_point!(M, grad_h, p; cost=nothing, kwargs...)\ndifference_of_convex_proximal_point!(M, mdcpo, p; cost=nothing, kwargs...)\ndifference_of_convex_proximal_point!(M, mdcpo, prox_g, p; cost=nothing, kwargs...)\n\nCompute the difference of convex algorithm to minimize\n\n operatorname*argmin_pmathcal M g(p) - h(p)\n\nwhere you have to provide the proximal map of g and the gradient of h.\n\nThe computation is done in place of p.\n\nFor all further details, especially the keyword arguments, see difference_of_convex_proximal_point.\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Manopt-Solver-States","page":"Difference of Convex","title":"Manopt Solver States","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"DifferenceOfConvexState\nDifferenceOfConvexProximalState","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.DifferenceOfConvexState","page":"Difference of Convex","title":"Manopt.DifferenceOfConvexState","text":"DifferenceOfConvexState{Pr,St,P,T,SC<:StoppingCriterion} <:\n AbstractManoptSolverState\n\nA struct 
to store the current state of the [difference_of_convex_algorithm](@ref). It comes in two forms, depending on the realisation of the subproblem.\n\nFields\n\np – the current iterate, i.e. a point on the manifold\nX – the current subgradient, i.e. a tangent vector to p.\nsub_problem – problem for the subsolver\nsub_state – state of the subproblem\nstop – a functor inheriting from StoppingCriterion indicating when to stop.\n\nFor the sub task, we need a method to solve\n\n operatorname*argmin_qmathcal M g(p) - X log_p q\n\nBesides a problem and options, one can also provide a function and an AbstractEvaluationType, respectively, to indicate a closed form solution for the sub task.\n\nConstructors\n\nDifferenceOfConvexState(M, p, sub_problem, sub_state; kwargs...)\nDifferenceOfConvexState(M, p, sub_solver; evaluation=InplaceEvaluation(), kwargs...)\n\nGenerate the state either using a solver from Manopt, given by an AbstractManoptProblem sub_problem and an AbstractManoptSolverState sub_state, or a closed form solution sub_solver for the sub-problem, where by default its AbstractEvaluationType evaluation is in-place, i.e. the function is of the form (M, p, X) -> q or (M, q, p, X) -> q, such that the current iterate p and the subgradient X of h can be passed to that function and the result is q.\n\nFurther keyword Arguments\n\ninitial_vector – (zero_vector(M, p)) how to initialize the inner gradient tangent vector\nstopping_criterion – (StopAfterIteration(200)) a stopping criterion\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Manopt.DifferenceOfConvexProximalState","page":"Difference of Convex","title":"Manopt.DifferenceOfConvexProximalState","text":"DifferenceOfConvexProximalState{Type} <: Options\n\nA struct to store the current state of the algorithm as well as the form. 
It comes in two forms, depending on the realisation of the subproblem.\n\nFields\n\ninverse_retraction_method – (default_inverse_retraction_method(M)) an inverse retraction method to use within the algorithm.\nretraction_method – (default_retraction_method(M)) a type of retraction\np, q, r – the current iterate, the gradient step and the prox, respectively their type is set by initializing p\nstepsize – (ConstantStepsize(1.0)) a Stepsize function to run the modified algorithm (experimental)\nstop – (StopWhenChangeLess(1e-8)) a StoppingCriterion\nX, Y – (zero_vector(M,p)) the current gradient and descent direction, respectively their common type is set by the keyword X\n\nConstructor\n\nDifferenceOfConvexProximalState(M, p; kwargs...)\n\nKeyword arguments\n\nX, retraction_method, inverse_retraction_method, stepsize for the fields above\nstopping_criterion for the StoppingCriterion\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#The-difference-of-convex-objective","page":"Difference of Convex","title":"The difference of convex objective","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"ManifoldDifferenceOfConvexObjective","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.ManifoldDifferenceOfConvexObjective","page":"Difference of Convex","title":"Manopt.ManifoldDifferenceOfConvexObjective","text":"ManifoldDifferenceOfConvexObjective{E} <: AbstractManifoldCostObjective{E}\n\nSpecify an objective for a difference_of_convex_algorithm.\n\nThe objective f mathcal M to ℝ is given as\n\n f(p) = g(p) - h(p)\n\nwhere both g and h are convex, lsc. and proper. Furthermore we assume that the subdifferential h of h is given.\n\nFields\n\ncost – an implementation of f(p) = g(p)-h(p) as a function f(M,p).\n∂h!! – a deterministic version of h mathcal M Tmathcal M, i.e. 
calling ∂h(M, p) returns a subgradient of h at p and if there is more than one, it returns a deterministic choice.\n\nNote that the subdifferential might be given in two possible signatures\n\n∂h(M,p) which does an AllocatingEvaluation\n∂h!(M, X, p) which does an InplaceEvaluation in place of X.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"as well as for the corresponding sub problem","category":"page"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"LinearizedDCCost\nLinearizedDCGrad","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.LinearizedDCCost","page":"Difference of Convex","title":"Manopt.LinearizedDCCost","text":"LinearizedDCCost\n\nA functor (M,q) → ℝ to represent the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. a cost function of the form\n\n F_p_kX_k(p) = g(p) - X_k log_p_kp\n\nfor a point p_k and a tangent vector X_k at p_k (e.g. outer iterates) that are stored within this functor as well.\n\nFields\n\ng a function\npk a point on a manifold\nXk a tangent vector at pk\n\nBoth interim values can be set using set_manopt_parameter!(::LinearizedDCCost, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCCost, ::Val{:X}, X), respectively.\n\nConstructor\n\nLinearizedDCCost(g, p, X)\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Manopt.LinearizedDCGrad","page":"Difference of Convex","title":"Manopt.LinearizedDCGrad","text":"LinearizedDCGrad\n\nA functor (M,X,p) → ℝ to represent the gradient of the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. 
for a cost function of the form\n\n F_p_kX_k(p) = g(p) - X_k log_p_kp\n\nits gradient is given by using F=F_1(F_2(p)), where F_1(X) = X_kX and F_2(p) = log_p_kp and the chain rule as well as the adjoint differential of the logarithmic map with respect to its argument for D^*F_2(p)\n\n operatornamegrad F(q) = operatornamegrad g(q) - DF_2^*(q)X\n\nfor a point pk and a tangent vector Xk at pk (the outer iterates) that are stored within this functor as well.\n\nFields\n\ngrad_g!! the gradient of g (see also LinearizedDCCost)\npk a point on a manifold\nXk a tangent vector at pk\n\nBoth interim values can be set using set_manopt_parameter!(::LinearizedDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCGrad, ::Val{:X}, X), respectively.\n\nConstructor\n\nLinearizedDCGrad(grad_g, p, X; evaluation=AllocatingEvaluation())\n\nWhere you specify whether grad_g is AllocatingEvaluation or InplaceEvaluation, while this function still provides both signatures.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"ManifoldDifferenceOfConvexProximalObjective","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.ManifoldDifferenceOfConvexProximalObjective","page":"Difference of Convex","title":"Manopt.ManifoldDifferenceOfConvexProximalObjective","text":"ManifoldDifferenceOfConvexProximalObjective{E} <: Problem\n\nSpecify an objective for a difference_of_convex_proximal_point algorithm. The problem is of the form\n\n operatorname*argmin_pin mathcal M g(p) - h(p)\n\nwhere both g and h are convex, lsc. and proper.\n\nFields\n\ncost – (nothing) implementation of f(p) = g(p)-h(p) (optional)\ngradient - the gradient of the cost\ngrad_h!! 
– a function operatornamegradh mathcal M Tmathcal M.\n\nNote that both the gradients might be given in two possible signatures as allocating or in-place.\n\nConstructor\n\nManifoldDifferenceOfConvexProximalObjective(gradh; cost=nothing, gradient=nothing)\n\nand note that neither cost nor gradient is required for the algorithm, just for eventual debug or stopping criteria.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"as well as for the corresponding sub problems","category":"page"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"ProximalDCCost\nProximalDCGrad","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.ProximalDCCost","page":"Difference of Convex","title":"Manopt.ProximalDCCost","text":"ProximalDCCost\n\nA functor (M, p) → ℝ to represent the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the cost function of the proximal map of g.\n\n F_p_k(p) = frac12λd_mathcal M(p_kp)^2 + g(p)\n\nfor a point pk and a proximal parameter λ.\n\nFields\n\ng - a function\npk - a point on a manifold\nλ - the prox parameter\n\nBoth interim values can be set using set_manopt_parameter!(::ProximalDCCost, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCCost, ::Val{:λ}, λ), respectively.\n\nConstructor\n\nProximalDCCost(g, p, λ)\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Manopt.ProximalDCGrad","page":"Difference of Convex","title":"Manopt.ProximalDCGrad","text":"ProximalDCGrad\n\nA functor (M,X,p) → ℝ to represent the gradient of the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the gradient function of the proximal map cost function of g, i.e. 
of\n\n F_p_k(p) = frac12λd_mathcal M(p_kp)^2 + g(p)\n\nwhich reads\n\n operatornamegrad F_p_k(p) = operatornamegrad g(p) - frac1λlog_p p_k\n\nfor a point pk and a proximal parameter λ.\n\nFields\n\ngrad_g - a gradient function\npk - a point on a manifold\nλ - the prox parameter\n\nBoth interim values can be set using set_manopt_parameter!(::ProximalDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCGrad, ::Val{:λ}, λ), respectively.\n\nConstructor\n\nProximalDCGrad(grad_g, pk, λ; evaluation=AllocatingEvaluation())\n\nWhere you specify whether grad_g is AllocatingEvaluation or InplaceEvaluation, while this function still always provides both signatures.\n\n\n\n\n\n","category":"type"},{"location":"solvers/difference_of_convex/#Further-helper-functions","page":"Difference of Convex","title":"Further helper functions","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"get_subtrahend_gradient","category":"page"},{"location":"solvers/difference_of_convex/#Manopt.get_subtrahend_gradient","page":"Difference of Convex","title":"Manopt.get_subtrahend_gradient","text":"X = get_subtrahend_gradient(amp, q)\nget_subtrahend_gradient!(amp, X, q)\n\nEvaluate the (sub)gradient of the subtrahend h from within a ManifoldDifferenceOfConvexObjective amp at the point q (in place of X).\n\nThe evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. 
When the non-mutating variant is called with a T=InplaceEvaluation, memory for the result is allocated.\n\n\n\n\n\nX = get_subtrahend_gradient(M::AbstractManifold, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)\nget_subtrahend_gradient!(M::AbstractManifold, X, dcpo::ManifoldDifferenceOfConvexProximalObjective, p)\n\nEvaluate the gradient of the subtrahend h from within a ManifoldDifferenceOfConvexProximalObjective dcpo at the point p (in place of X).\n\n\n\n\n\n","category":"function"},{"location":"solvers/difference_of_convex/#Literature","page":"Difference of Convex","title":"Literature","text":"","category":"section"},{"location":"solvers/difference_of_convex/","page":"Difference of Convex","title":"Difference of Convex","text":"
","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/#PDRSSNSolver","page":"Primal-dual Riemannian semismooth Newton","title":"The Primal-dual Riemannian semismooth Newton Algorithm","text":"","category":"section"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The Primal-dual Riemannian semismooth Newton Algorithm is a second-order method derived from the ChambollePock.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The aim is to solve an optimization problem on a manifold with a cost function of the form","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"F(p) + G(Λ(p))","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"where Fmathcal M overlineℝ, Gmathcal N overlineℝ, and Λmathcal M mathcal N. If the manifolds mathcal M or mathcal N are not Hadamard, it has to be considered locally, i.e. 
on geodesically convex sets mathcal C subset mathcal M and mathcal D subsetmathcal N such that Λ(mathcal C) subset mathcal D.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The algorithm comes down to applying the Riemannian semismooth Newton method to the rewritten primal-dual optimality conditions, i.e., we define the vector field X mathcalM times mathcalT_n^* mathcalN rightarrow mathcalT mathcalM times mathcalT_n^* mathcalN as","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Xleft(p xi_nright)=left(beginarrayc\n-log _p operatornameprox_sigma Fleft(exp _pleft(mathcalP_p leftarrow mleft(-sigmaleft(D_m Lambdaright)^*leftmathcalP_Lambda(m) leftarrow n xi_nrightright)^sharpright)right) \nxi_n-operatornameprox_tau G_n^*left(xi_n+tauleft(mathcalP_n leftarrow Lambda(m) D_m Lambdaleftlog _m prightright)^flatright)\nendarrayright)","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"and solve for X(pξ_n)=0.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Given base points mmathcal C, n=Λ(m)mathcal D, initial primal and dual values p^(0) mathcal C, ξ_n^(0) mathcal T_n^*mathcal N, and primal and dual step sizes sigma, tau.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"The algorithm performs the following steps for k=1,2,… (until a StoppingCriterion is 
reached)","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Choose any element\nV^(k) _C X(p^(k)ξ_n^(k))\nof the Clarke generalized covariant derivative\nSolve\nV^(k) (d_p^(k) d_n^(k)) = - X(p^(k)ξ_n^(k))\nin the vector space mathcalT_p^(k) mathcalM times mathcalT_n^* mathcalN\nUpdate\np^(k+1) = exp_p^(k)(d_p^(k))\nand\nξ_n^(k+1) = ξ_n^(k) + d_n^(k)","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Furthermore you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction and a vector transport.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"Finally you can also update the base points m and n during the iterations. This introduces a few additional vector transports. The same holds for the case that Λ(m^(k))neq n^(k) at some point. 
All these cases are covered in the algorithm.","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"primal_dual_semismooth_Newton\nprimal_dual_semismooth_Newton!","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/#Manopt.primal_dual_semismooth_Newton","page":"Primal-dual Riemannian semismooth Newton","title":"Manopt.primal_dual_semismooth_Newton","text":"primal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)\n\nPerform the Primal-Dual Riemannian Semismooth Newton algorithm.\n\nGiven a cost function mathcal Ecolonmathcal M to overlineℝ of the form\n\nmathcal E(p) = F(p) + G( Λ(p) )\n\nwhere Fcolonmathcal M to overlineℝ, Gcolonmathcal N to overlineℝ, and Lambdacolonmathcal M to mathcal N. The remaining input parameters are\n\np, X primal and dual start points xinmathcal M and xiin T_nmathcal N\nm,n base points on mathcal M and mathcal N, respectively.\nlinearized_forward_operator the linearization DΛ() of the operator Λ().\nadjoint_linearized_operator the adjoint DΛ^* of the linearized operator DΛ(m)colon T_mmathcal M to T_Λ(m)mathcal N\nprox_F, prox_G_Dual the proximal maps of F and G^ast_n\ndiff_prox_F, diff_prox_dual_G the (Clarke Generalized) differentials of the proximal maps of F and G^ast_n\n\nFor more details on the algorithm, see Diepeveen, Lellmann, SIAM J. Imag. 
Sci., 2021.\n\nOptional Parameters\n\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\nΛ (missing) the exact operator, that is required if Λ(m)=n does not hold;\n\nmissing indicates that the forward operator is exact.\n\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nreg_param – (1e-5) regularisation parameter for the Newton matrix\n\nNote that this changes the arguments the forward_operator will be called with.\n\nstopping_criterion – (StopAfterIteration(50)) a StoppingCriterion\nupdate_primal_base – (missing) function to update m (identity by default/missing)\nupdate_dual_base – (missing) function to update n (identity by default/missing)\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/primal_dual_semismooth_Newton/#Manopt.primal_dual_semismooth_Newton!","page":"Primal-dual Riemannian semismooth Newton","title":"Manopt.primal_dual_semismooth_Newton!","text":"primal_dual_semismooth_Newton!(M, N, cost, x0, ξ0, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_G_dual, linearized_forward_operator, adjoint_linearized_operator)\n\nPerform the Primal-dual Riemannian semismooth Newton algorithm in place of x, ξ, and potentially m, n if they are not fixed. 
See primal_dual_semismooth_Newton for details and optional parameters.\n\n\n\n\n\n","category":"function"},{"location":"solvers/primal_dual_semismooth_Newton/#State","page":"Primal-dual Riemannian semismooth Newton","title":"State","text":"","category":"section"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"PrimalDualSemismoothNewtonState","category":"page"},{"location":"solvers/primal_dual_semismooth_Newton/#Manopt.PrimalDualSemismoothNewtonState","page":"Primal-dual Riemannian semismooth Newton","title":"Manopt.PrimalDualSemismoothNewtonState","text":"PrimalDualSemismoothNewtonState <: AbstractPrimalDualSolverState\n\nm - base point on $ \\mathcal M $\nn - base point on $ \\mathcal N $\nx - an initial point on x^(0) in mathcal M (and its previous iterate)\nξ - an initial tangent vector xi^(0)in T_n^*mathcal N (and its previous iterate)\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nreg_param – (1e-5) regularisation parameter for the Newton matrix\nstop - a StoppingCriterion\nupdate_primal_base (( amp, ams, i) -> o.m) function to update the primal base\nupdate_dual_base ((amp, ams, i) -> o.n) function to update the dual base\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\nwhere for the update functions a AbstractManoptProblem amp, AbstractManoptSolverState ams and the current iterate i are the arguments. 
If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing).\n\nConstructor\n\nPrimalDualSemismoothNewtonState(M::AbstractManifold,\n m::P, n::Q, x::P, ξ::T, primal_stepsize::Float64, dual_stepsize::Float64, reg_param::Float64;\n stopping_criterion::StoppingCriterion = StopAfterIteration(50),\n update_primal_base::Union{Function,Missing} = missing,\n update_dual_base::Union{Function,Missing} = missing,\n retraction_method = default_retraction_method(M, typeof(p)),\n inverse_retraction_method = default_inverse_retraction_method(M, typeof(p)),\n vector_transport_method = default_vector_transport_method(M, typeof(p)),\n)\n\n\n\n\n\n","category":"type"},{"location":"solvers/primal_dual_semismooth_Newton/#Literature","page":"Primal-dual Riemannian semismooth Newton","title":"Literature","text":"","category":"section"},{"location":"solvers/primal_dual_semismooth_Newton/","page":"Primal-dual Riemannian semismooth Newton","title":"Primal-dual Riemannian semismooth Newton","text":"
[DL21] W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv: [2102.10309](https://arxiv.org/abs/2102.10309).
","category":"page"},{"location":"solvers/DouglasRachford/#DRSolver","page":"Douglas–Rachford","title":"Douglas–Rachford Algorithm","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"The (Parallel) Douglas–Rachford ((P)DR) Algorithm was generalized to Hadamard manifolds in [BPS16].","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"The aim is to minimize the sum","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"F(p) = f(p) + g(p)","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"on a manifold, where the two summands have proximal maps operatornameprox_λ f operatornameprox_λ g that are easy to evaluate (maybe in closed form, or not too costly to approximate). Further, define the reflection operator at the proximal map as","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"operatornamerefl_λ f(p) = operatornameretr_operatornameprox_λ f(p) bigl( -operatornameretr^-1_operatornameprox_λ f(p) p bigr)","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Let alpha_k 01 with sum_k mathbb N alpha_k(1-alpha_k) = infty and λ 0 (which might depend on iteration k as well) be given.","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Then the (P)DRA algorithm for initial data x_0 mathcal H as","category":"page"},{"location":"solvers/DouglasRachford/#Initialization","page":"Douglas–Rachford","title":"Initialization","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Initialize t_0 = x_0 and 
k=0","category":"page"},{"location":"solvers/DouglasRachford/#Iteration","page":"Douglas–Rachford","title":"Iteration","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Repeat until a convergence criterion is reached","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"Compute s_k = operatornamerefl_λ foperatornamerefl_λ g(t_k)\nWithin that operation, store p_k+1 = operatornameprox_λ g(t_k) which is the prox the inner reflection reflects at.\nCompute t_k+1 = g(alpha_k t_k s_k), where g is a curve approximating the shortest geodesic, provided by a retraction and its inverse\nSet k = k+1","category":"page"},{"location":"solvers/DouglasRachford/#Result","page":"Douglas–Rachford","title":"Result","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"The result is given by the last computed p_K.","category":"page"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"For the parallel version, the first proximal map is a vectorial version where in each component one prox is applied to the corresponding copy of t_k and the second proximal map corresponds to the indicator function of the set, where all copies are equal (in mathcal H^n, where n is the number of copies), leading to the second prox being the Riemannian mean.","category":"page"},{"location":"solvers/DouglasRachford/#Interface","page":"Douglas–Rachford","title":"Interface","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":" DouglasRachford\n DouglasRachford!","category":"page"},{"location":"solvers/DouglasRachford/#Manopt.DouglasRachford","page":"Douglas–Rachford","title":"Manopt.DouglasRachford","text":"DouglasRachford(M, f, proxes_f, p)\nDouglasRachford(M, mpo, 
p)\n\nCompute the Douglas-Rachford algorithm on the manifold mathcal M, initial data p and the (two) proximal maps proxes_f, see Bergmann, Persch, Steidl, SIAM J Imag Sci, 2016.\n\nFor k>2 proximal maps, the problem is reformulated using the parallel Douglas Rachford: A vectorial proximal map on the power manifold mathcal M^k is introduced as the first proximal map and the second proximal map is set to the mean (Riemannian Center of mass). This hence also boils down to two proximal maps, though each evaluates proximal maps in parallel, i.e. component wise in a vector.\n\nIf you provide a ManifoldProximalMapObjective mpo instead, the proximal maps are kept unchanged.\n\nInput\n\nM – a Riemannian Manifold mathcal M\nF – a cost function consisting of a sum of cost functions\nproxes_f – functions of the form (M, λ, p)->... performing a proximal map, where λ denotes the proximal parameter, for each of the summands of F. These can also be given in the InplaceEvaluation variants (M, q, λ, p) -> ... computing in place of q.\np – initial data p mathcal M\n\nOptional values\n\nevaluation – (AllocatingEvaluation) specify whether the proximal maps work by allocation (default) form prox(M, λ, x) or InplaceEvaluation in place, i.e. is of the form prox!(M, y, λ, x).\nλ – ((iter) -> 1.0) function to provide the value for the proximal parameter during the calls\nα – ((iter) -> 0.9) relaxation of the step from old to new iterate, i.e. t_k+1 = g(α_k t_k s_k), where s_k is the result of the double reflection involved in the DR algorithm\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) the inverse retraction to use within\nthe reflection (ignored, if you set R directly)\nthe relaxation step\nR – method employed in the iteration to perform the reflection of x at the prox p. This uses by default reflect or reflect! 
depending on reflection_evaluation and the retraction and inverse retraction specified by retraction_method and inverse_retraction_method, respectively.\nreflection_evaluation – (AllocatingEvaluation) whether R works inplace or allocating\nretraction_method - (default_retraction_method(M, typeof(p))) the retraction to use in\nthe reflection (ignored, if you set R directly)\nthe relaxation step\nstopping_criterion – (StopWhenAny(StopAfterIteration(200),StopWhenChangeLess(10.0^-5))) a StoppingCriterion.\nparallel – (false) clarify that we are doing a parallel DR, i.e. on a PowerManifold manifold with two proxes. This can be used to trigger parallel Douglas–Rachford if you enter with two proxes. Keep in mind that a parallel Douglas–Rachford implicitly works on a PowerManifold manifold and its first argument is the result then (assuming all are equal after the second prox).\n\nand the ones that are passed to decorate_state! for decorators.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/DouglasRachford/#Manopt.DouglasRachford!","page":"Douglas–Rachford","title":"Manopt.DouglasRachford!","text":" DouglasRachford!(M, f, proxes_f, p)\n DouglasRachford!(M, mpo, p)\n\nCompute the Douglas-Rachford algorithm on the manifold mathcal M, initial data p in mathcal M and the (two) proximal maps proxes_f in place of p.\n\nFor k>2 proximal maps, the problem is reformulated using the parallel Douglas Rachford: A vectorial proximal map on the power manifold mathcal M^k is introduced as the first proximal map and the second proximal map is set to the mean (Riemannian Center of mass). This hence also boils down to two proximal maps, though each evaluates proximal maps in parallel, i.e. 
component wise in a vector.\n\nnote: Note\nWhile creating the new starting point p' on the power manifold, a copy of p is created, so that the (by k>2 implicitly generated) parallel Douglas Rachford does not work in-place for now.\n\nIf you provide a ManifoldProximalMapObjective mpo instead, the proximal maps are kept unchanged.\n\nInput\n\nM – a Riemannian Manifold mathcal M\nf – a cost function consisting of a sum of cost functions\nproxes_f – functions of the form (M, λ, p)->q or (M, q, λ, p)->q performing a proximal map, where λ denotes the proximal parameter, for each of the summands of f.\np – initial point p mathcal M\n\nFor more options, see DouglasRachford.\n\n\n\n\n\n","category":"function"},{"location":"solvers/DouglasRachford/#State","page":"Douglas–Rachford","title":"State","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"DouglasRachfordState","category":"page"},{"location":"solvers/DouglasRachford/#Manopt.DouglasRachfordState","page":"Douglas–Rachford","title":"Manopt.DouglasRachfordState","text":"DouglasRachfordState <: AbstractManoptSolverState\n\nStore all options required for the DouglasRachford algorithm.\n\nFields\n\np - the current iterate (result). For the parallel Douglas-Rachford, this is not a value from the PowerManifold manifold but the mean.\ns – the last result of the double reflection at the proxes relaxed by α.\nλ – function to provide the value for the proximal parameter during the calls\nα – relaxation of the step from old to new iterate, i.e. 
x^(k+1) = g(α(k) x^(k) t^(k)), where t^(k) is the result of the double reflection involved in the DR algorithm\ninverse_retraction_method – an inverse retraction method\nR – method employed in the iteration to perform the reflection of x at the prox p.\nreflection_evaluation – whether R works inplace or allocating\nretraction_method – a retraction method\nstop – a StoppingCriterion\nparallel – indicate whether we are running a parallel Douglas-Rachford or not.\n\nConstructor\n\nDouglasRachfordState(M, p; kwargs...)\n\nGenerate the options for a Manifold M and an initial point p, where the following keyword arguments can be used\n\nλ – ((iter)->1.0) function to provide the value for the proximal parameter during the calls\nα – ((iter)->0.9) relaxation of the step from old to new iterate, i.e. x^(k+1) = g(α(k) x^(k) t^(k)), where t^(k) is the result of the double reflection involved in the DR algorithm\nR – (reflect or reflect!) method employed in the iteration to perform the reflection of x at the prox p, which function is used depends on reflection_evaluation.\nreflection_evaluation – (AllocatingEvaluation()) specify whether the reflection works inplace or allocating (default)\nstopping_criterion – (StopAfterIteration(300)) a StoppingCriterion\nparallel – (false) indicate whether we are running a parallel Douglas-Rachford or not.\n\n\n\n\n\n","category":"type"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"For specific DebugActions and RecordActions see also Cyclic Proximal Point.","category":"page"},{"location":"solvers/DouglasRachford/#Literature","page":"Douglas–Rachford","title":"Literature","text":"","category":"section"},{"location":"solvers/DouglasRachford/","page":"Douglas–Rachford","title":"Douglas–Rachford","text":"
[BPS16] R. Bergmann, J. Persch and G. Steidl. A Parallel Douglas Rachford Algorithm for Minimizing ROF-like Functionals on Images with Values in Symmetric Hadamard Manifolds. SIAM Journal on Imaging Sciences 9, 901–937 (2016).
","category":"page"},{"location":"tutorials/CountAndCache/#How-to-Count-and-Cache-Function-Calls","page":"Count and use a Cache","title":"How to Count and Cache Function Calls","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"In this tutorial, we want to investigate the caching and counting (i.e. statistics) features of Manopt.jl. We will reuse the optimization tasks from the introductory tutorial Get Started: Optimize!.","category":"page"},{"location":"tutorials/CountAndCache/#Introduction","page":"Count and use a Cache","title":"Introduction","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"There are surely many ways to keep track for example of how often the cost function is called, for example with a functor, as we used in an example in How to Record Data","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"mutable struct MyCost{I<:Integer}\n count::I\nend\nMyCost() = MyCost{Int64}(0)\nfunction (c::MyCost)(M, x)\n c.count += 1\n # [ .. Actual implementation of the cost here ]\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"This still leaves a bit of work to the user, especially for tracking more than just the number of cost function evaluations.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"When a function like the objective or gradient is expensive to compute, it may make sense to cache its results. 
Manopt.jl tries to minimize the number of repeated calls but sometimes they are necessary and harmless when the function is cheap to compute. Caching of expensive function calls can for example be added using Memoize.jl by the user. The approach in the solvers of Manopt.jl aims to simplify adding both these capabilities on the level of calling a solver.","category":"page"},{"location":"tutorials/CountAndCache/#Technical-Background","page":"Count and use a Cache","title":"Technical Background","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"The two ingredients for a solver in Manopt.jl are the AbstractManoptProblem and the AbstractManoptSolverState, where the former consists of the domain, that is the manifold and AbstractManifoldObjective.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Both recording and debug capabilities are implemented in a decorator pattern to the solver state. They can be easily added using the record= and debug= in any solver call. This pattern was recently extended, such that also the objective can be decorated. This is how both caching and counting are implemented, as decorators of the AbstractManifoldObjective and hence for example changing/extending the behaviour of a call to get_cost.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Let’s finish off the technical background by loading the necessary packages. 
Besides Manopt.jl and Manifolds.jl we also need LRUCache.jl, which is (since Julia 1.9) a weak dependency and provides the least recently used strategy for our caches.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"using Manopt, Manifolds, Random, LRUCache, LinearAlgebra","category":"page"},{"location":"tutorials/CountAndCache/#Counting","page":"Count and use a Cache","title":"Counting","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"We first define our task, the Riemannian Center of Mass from the Get Started: Optimize! tutorial.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"n = 100\nσ = π / 8\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\nRandom.seed!(42)\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];\nf(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)));","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"To now count how often the cost and the gradient are called, we use the count= keyword argument that works in any solver to specify the elements of the objective whose calls we want to count. A full list is available in the documentation of the AbstractManifoldObjective. To also see the result, we have to set return_objective=true. This returns (objective, p) instead of just the solver result p. 
We can further also set return_state=true to get even more information about the solver run.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"gradient_descent(M, f, grad_f, data[1]; count=[:Cost, :Gradient], return_objective=true, return_state=true)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 68 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Statistics on function calls\n * :Gradient : 205\n * :Cost : 285","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"And we see that statistics are shown in the end.","category":"page"},{"location":"tutorials/CountAndCache/#Caching","page":"Count and use a Cache","title":"Caching","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"To now also cache these calls, we can use the cache= keyword argument. Since now both the cache and the count “extend” the functionality of the objective, the order is important: On the high-level interface, the count is treated first, which means that only actual function calls and not cache look-ups are counted. With the proper initialisation, you can use any caches here that support the get!(function, cache, key) update. 
All parts of the objective that can currently be cached are listed at ManifoldCachedObjective. The solver call has a keyword cache that takes a tuple (c, vs, n) of three arguments, where c is a symbol for the type of cache, vs is a vector of symbols specifying which calls to cache, and n is the size of the cache. If the last element is not provided, a suitable default (currently n=10) is used.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Here we want to use c=:LRU caches for vs=[:Cost, :Gradient] with a size of n=25.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"r = gradient_descent(M, f, grad_f, data[1];\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true, return_state=true)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 68 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLineseach() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Cache\n * :Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 68\n * :Cost : 157","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Since the default setup with ArmijoLinesearch needs the gradient and the cost, and similarly the stopping criterion might (independently) evaluate the 
gradient, the caching is quite helpful here.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"And of course also for this advanced return value of the solver, we can still access the result as usual:","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"get_solver_result(r)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"3-element Vector{Float64}:\n 0.6868392794790367\n 0.006531600680668244\n 0.7267799820834814","category":"page"},{"location":"tutorials/CountAndCache/#Advanced-Caching-Examples","page":"Count and use a Cache","title":"Advanced Caching Examples","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"There are more options other than caching single calls to specific parts of the objective. For example you may want to cache intermediate results of computing the cost and share that with the gradient computation. We will present three solutions to this:","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"An easy approach from within Manopt.jl: The ManifoldCostGradientObjective\nA shared storage approach using a functor\nA shared (internal) cache approach also using a functor","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"For that we switch to another example: The Rayleigh quotient. 
We aim to maximize the Rayleigh quotient \\displaystyle\\frac{x^{\\mathrm{T}}Ax}{x^{\\mathrm{T}}x}, for some A\\in\\mathbb R^{(m+1)\\times(m+1)} and x\\in\\mathbb R^{m+1}, but since we consider this on the sphere, and Manopt.jl (like many other optimization toolboxes) minimizes, we consider","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"g(p) = -p^{\\mathrm{T}}Ap,\\qquad p\\in\\mathbb S^{m}","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"The Euclidean gradient (that is, in \\mathbb R^{m+1}) is just \\nabla g(p) = -2Ap, and the Riemannian gradient is the projection of \\nabla g(p) onto the tangent space T_p\\mathbb S^{m}.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"m = 25\nRandom.seed!(42)\nA = randn(m + 1, m + 1)\nA = Symmetric(A)\np_star = eigvecs(A)[:, end] # minimizer (or similarly -p)\nf_star = -eigvals(A)[end] # cost (note that we get - the largest Eigenvalue)\n\nN = Sphere(m);\n\ng(M, p) = -p' * A*p\n∇g(p) = -2 * A * p\ngrad_g(M,p) = project(M, p, ∇g(p))\ngrad_g!(M,X, p) = project!(M, X, p, ∇g(p))","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"grad_g! 
(generic function with 1 method)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"But since both the cost and the gradient require the computation of the matrix-vector product Ap, it might be beneficial to only compute this once.","category":"page"},{"location":"tutorials/CountAndCache/#The-[ManifoldCostGradientObjective](@ref)-approach","page":"Count and use a Cache","title":"The ManifoldCostGradientObjective approach","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"The ManifoldCostGradientObjective uses a combined function to compute both the gradient and the cost at the same time. We define the inplace variant as","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"function g_grad_g!(M::AbstractManifold, X, p)\n X .= -A*p\n c = p'*X\n X .*= 2\n project!(M, X, p, X)\n return (c, X)\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"g_grad_g! (generic function with 1 method)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"where we only compute the matrix-vector product once. The small disadvantage might be, that we always compute both, the gradient and the cost. 
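For comparison, an allocating counterpart (a sketch of our own, not from the tutorial) that also computes the product A*p only once could read\n\nfunction g_grad_g(M::AbstractManifold, p)\n    Ap = A * p # compute the matrix-vector product once\n    c = -p' * Ap\n    X = project(M, p, -2 .* Ap)\n    return (c, X)\nend\n\n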
Luckily, the cache we used before, takes this into account and caches both results, such that we indeed end up computing A*p only once when asking to a cost and a gradient.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Let’s compare both methods","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p0 = [(1/5 .* ones(5))..., zeros(m-4)...];\n@time s1 = gradient_descent(N, g, grad_g!, p0;\n stopping_criterion = StopWhenGradientNormLess(1e-5),\n evaluation=InplaceEvaluation(),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true,\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 1.392875 seconds (1.55 M allocations: 124.750 MiB, 3.75% gc time, 99.36% compilation time)\n\n## Cache\n * :Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 602\n * :Cost : 1449\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"versus","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"obj = ManifoldCostGradientObjective(g_grad_g!; evaluation=InplaceEvaluation())\n@time s2 = gradient_descent(N, obj, p0;\n stopping_criterion=StopWhenGradientNormLess(1e-5),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true,\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 0.773684 seconds (773.96 k allocations: 60.275 MiB, 3.04% gc time, 97.88% compilation time)\n\n## Cache\n * 
:Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 1448\n * :Cost : 1448\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"first of all both yield the same result","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p1 = get_solver_result(s1)\np2 = get_solver_result(s2)\n[distance(N, p1, p2), g(N, p1), g(N, p2), f_star]","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"4-element Vector{Float64}:\n 0.0\n -7.8032957637779035\n -7.8032957637779035\n -7.803295763793953","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"and we can see that the combined number of evaluations is once 2051, once just the number of cost evaluations 1449. Note that the involved additional 847 gradient evaluations are merely a multiplication with 2. On the other hand, the additional caching of the gradient in these cases might be less beneficial. It is beneficial, when the gradient and the cost are very often required together.","category":"page"},{"location":"tutorials/CountAndCache/#A-shared-storage-approach-using-a-functor","page":"Count and use a Cache","title":"A shared storage approach using a functor","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"An alternative to the previous approach is the usage of a functor that introduces a “shared storage” of the result of computing A*p. 
We additionally have to store p though, since we have to check that we are still evaluating the cost and/or gradient at the same point at which the cached A*p was computed. We again consider the (more efficient) inplace variant. This can be done as follows","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"struct StorageG{T,M}\n A::M\n Ap::T\n p::T\nend\nfunction (g::StorageG)(::Val{:Cost}, M::AbstractManifold, p)\n if !(p==g.p) #We are at a new point -> Update\n g.Ap .= g.A*p\n g.p .= p\n end\n return -g.p'*g.Ap\nend\nfunction (g::StorageG)(::Val{:Gradient}, M::AbstractManifold, X, p)\n if !(p==g.p) #We are at a new point -> Update\n g.Ap .= g.A*p\n g.p .= p\n end\n X .= -2 .* g.Ap\n project!(M, X, p, X)\n return X\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"Here we use the first parameter to distinguish both functions. 
For the mutating case the signatures are different regardless of the additional argument but for the allocating case, the signatures of the cost and the gradient function are the same.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"#Define the new functor\nstorage_g = StorageG(A, zero(p0), zero(p0))\n# and cost and gradient that use this functor as\ng3(M,p) = storage_g(Val(:Cost), M, p)\ngrad_g3!(M, X, p) = storage_g(Val(:Gradient), M, X, p)\n@time s3 = gradient_descent(N, g3, grad_g3!, p0;\n stopping_criterion = StopWhenGradientNormLess(1e-5),\n evaluation=InplaceEvaluation(),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 2),\n return_objective=true#, return_state=true\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 0.487223 seconds (325.29 k allocations: 23.338 MiB, 98.24% compilation time)\n\n## Cache\n * :Cost : 2/2 entries of type Float64 used\n * :Gradient : 2/2 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 602\n * :Cost : 1449\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"This of course still yields the same result","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p3 = get_solver_result(s3)\ng(N, p3) - f_star","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"1.6049384043981263e-11","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"And while we again have a split off the cost and gradient evaluations, we can observe that the allocations are less 
than half of the previous approach.","category":"page"},{"location":"tutorials/CountAndCache/#A-local-cache-approach","page":"Count and use a Cache","title":"A local cache approach","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"This variant is very similar to the previous one, but uses a whole cache instead of just one place to store A*p. This makes the code a bit nicer, and it is possible to store more than just the last p either cost or gradient was called with.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"struct CacheG{C,M}\n A::M\n cache::C\nend\nfunction (g::CacheG)(::Val{:Cost}, M, p)\n Ap = get!(g.cache, copy(M,p)) do\n g.A*p\n end\n return -p'*Ap\nend\nfunction (g::CacheG)(::Val{:Gradient}, M, X, p)\n Ap = get!(g.cache, copy(M,p)) do\n g.A*p\n end\n X .= -2 .* Ap\n project!(M, X, p, X)\n return X\nend","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"However, the resulting solver run is not always faster, since the whole cache instead of storing just Ap and p is a bit more costly. 
Then the tradeoff is, whether this pays off.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"#Define the new functor\ncache_g = CacheG(A, LRU{typeof(p0),typeof(p0)}(; maxsize=25))\n# and cost and gradient that use this functor as\ng4(M,p) = cache_g(Val(:Cost), M, p)\ngrad_g4!(M, X, p) = cache_g(Val(:Gradient), M, X, p)\n@time s4 = gradient_descent(N, g4, grad_g4!, p0;\n stopping_criterion = StopWhenGradientNormLess(1e-5),\n evaluation=InplaceEvaluation(),\n count=[:Cost, :Gradient],\n cache=(:LRU, [:Cost, :Gradient], 25),\n return_objective=true,\n)","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":" 0.474319 seconds (313.56 k allocations: 22.981 MiB, 97.87% compilation time)\n\n## Cache\n * :Cost : 25/25 entries of type Float64 used\n * :Gradient : 25/25 entries of type Vector{Float64} used\n\n## Statistics on function calls\n * :Gradient : 602\n * :Cost : 1449\n\nTo access the solver result, call `get_solver_result` on this variable.","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"and for safety let’s check that we are reasonably close","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"p4 = get_solver_result(s4)\ng(N, p4) - f_star","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"1.6049384043981263e-11","category":"page"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"For this example, or maybe even gradient_descent in general it seems, this additional (second, inner) cache does not improve the result further, it is about the same effort both time and 
allocation-wise.","category":"page"},{"location":"tutorials/CountAndCache/#Summary","page":"Count and use a Cache","title":"Summary","text":"","category":"section"},{"location":"tutorials/CountAndCache/","page":"Count and use a Cache","title":"Count and use a Cache","text":"While the approach using the ManifoldCostGradientObjective is very easy to implement, both the storage and the (local) cache approach are more efficient. All three are an improvement over the first implementation without sharing intermediate results. The results with storage or cache have the further advantage of being more flexible, i.e. the stored information could also be reused in a third function, for example when also computing the Hessian.","category":"page"},{"location":"tutorials/InplaceGradient/#Speedup-using-Inplace-Evaluation","page":"Speedup using Inplace computations","title":"Speedup using Inplace Evaluation","text":"","category":"section"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"When it comes to time-critical operations, a key ingredient in Julia is mutating functions, i.e. those that compute in place without additional memory allocations. In the following, we illustrate how to do this with Manopt.jl.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Let’s start with the same function as in Get Started: Optimize! 
and compute the mean of some points, only that here we use the sphere \\mathbb S^{30} and n=800 points.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"From the aforementioned example.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We first load all necessary packages.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"using Manopt, Manifolds, Random, BenchmarkTools\nRandom.seed!(42);","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"And set up our data","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Random.seed!(42)\nm = 30\nM = Sphere(m)\nn = 800\nσ = π / 8\np = zeros(Float64, m + 1)\np[2] = 1.0\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];","category":"page"},{"location":"tutorials/InplaceGradient/#Classical-Definition","page":"Speedup using Inplace computations","title":"Classical Definition","text":"","category":"section"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"The variant from the previous tutorial defines a cost f(p) and its gradient \\operatorname{grad} f(p)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"f(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, 
Ref(p)))","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"grad_f (generic function with 1 method)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We further set the stopping criterion to be a little more strict. Then we obtain","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"sc = StopWhenGradientNormLess(3e-10)\np0 = zeros(Float64, m + 1); p0[1] = 1/sqrt(2); p0[2] = 1/sqrt(2)\nm1 = gradient_descent(M, f, grad_f, p0; stopping_criterion=sc);","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We can also benchmark this as","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"@benchmark gradient_descent($M, $f, $grad_f, $p0; stopping_criterion=$sc)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"BenchmarkTools.Trial: 100 samples with 1 evaluation.\n Range (min … max): 48.285 ms … 56.649 ms ┊ GC (min … max): 4.84% … 6.96%\n Time (median): 49.552 ms ┊ GC (median): 5.41%\n Time (mean ± σ): 50.151 ms ± 1.731 ms ┊ GC (mean ± σ): 5.56% ± 0.64%\n\n ▂▃ █▃▃▆ ▂ \n ▅████████▅█▇█▄▅▇▁▅█▅▇▄▇▅▁▅▄▄▄▁▄▁▁▁▄▄▁▁▁▁▁▁▄▁▁▁▁▁▁▄▁▄▁▁▁▁▁▁▄ ▄\n 48.3 ms Histogram: frequency by time 56.6 ms <\n\n Memory estimate: 194.10 MiB, allocs estimate: 655347.","category":"page"},{"location":"tutorials/InplaceGradient/#In-place-Computation-of-the-Gradient","page":"Speedup using Inplace computations","title":"In-place Computation of the 
Gradient","text":"","category":"section"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We can reduce the memory allocations by implementing the gradient to be evaluated in-place. We do this by using a functor. The motivation is twofold: on one hand, we want to avoid variables from the global scope, for example the manifold M or the data, from being used within the function. Doing the same for more complicated cost functions may also be worth pursuing.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"Here, we store the data (as a reference) and introduce temporary memory to avoid reallocating memory in each grad_distance computation. We get","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"struct GradF!{TD,TTMP}\n data::TD\n tmp::TTMP\nend\nfunction (grad_f!::GradF!)(M, X, p)\n fill!(X, 0)\n for di in grad_f!.data\n grad_distance!(M, grad_f!.tmp, di, p)\n X .+= grad_f!.tmp\n end\n X ./= length(grad_f!.data)\n return X\nend","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"For the actual call to the solver, we first have to generate an instance of GradF! and tell the solver that the gradient is provided in an InplaceEvaluation. We can further also use gradient_descent! to work in place of the initial point we pass.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"grad_f2! 
= GradF!(data, similar(data[1]))\nm2 = deepcopy(p0)\ngradient_descent!(\n M, f, grad_f2!, m2; evaluation=InplaceEvaluation(), stopping_criterion=sc\n);","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"We can again benchmark this","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"@benchmark gradient_descent!(\n $M, $f, $grad_f2!, m2; evaluation=$(InplaceEvaluation()), stopping_criterion=$sc\n) setup = (m2 = deepcopy($p0))","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"BenchmarkTools.Trial: 176 samples with 1 evaluation.\n Range (min … max): 27.419 ms … 34.154 ms ┊ GC (min … max): 0.00% … 0.00%\n Time (median): 28.001 ms ┊ GC (median): 0.00%\n Time (mean ± σ): 28.412 ms ± 1.079 ms ┊ GC (mean ± σ): 0.73% ± 2.24%\n\n ▁▅▇█▅▂▄ ▁ \n ▄▁███████▆█▇█▄▆▃▃▃▃▁▁▃▁▁▃▁▃▃▁▄▁▁▃▃▁▁▄▁▁▃▅▃▃▃▁▃▃▁▁▁▁▁▁▁▁▃▁▁▃ ▃\n 27.4 ms Histogram: frequency by time 31.9 ms <\n\n Memory estimate: 3.76 MiB, allocs estimate: 5949.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"which is faster by about a factor of 2 compared to the first solver-call. 
Note that the results m1 and m2 are of course the same.","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"distance(M, m1, m2)","category":"page"},{"location":"tutorials/InplaceGradient/","page":"Speedup using Inplace computations","title":"Speedup using Inplace computations","text":"2.0004809792350595e-10","category":"page"},{"location":"plans/state/#SolverStateSection","page":"Solver State","title":"The Solver State","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Given an AbstractManoptProblem, that is a certain optimisation task, the state specifies the solver to use. It contains the parameters of a solver and all fields necessary during the algorithm, e.g. the current iterate, a StoppingCriterion or a Stepsize.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"AbstractManoptSolverState\nget_state\nManopt.get_count","category":"page"},{"location":"plans/state/#Manopt.AbstractManoptSolverState","page":"Solver State","title":"Manopt.AbstractManoptSolverState","text":"AbstractManoptSolverState\n\nA general super type for all solver states.\n\nFields\n\nThe following fields are assumed to be default. If you use different ones, provide the access functions accordingly\n\np a point on a manifold with the current iterate\nstop a StoppingCriterion.\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.get_state","page":"Solver State","title":"Manopt.get_state","text":"get_state(s::AbstractManoptSolverState, recursive::Bool=true)\n\nreturn the (one step) undecorated AbstractManoptSolverState of the (possibly) decorated s. 
As long as your decorated state stores the state within s.state and the dispatch_objective_decorator is set to Val{true}, the internal state is extracted automatically.\n\nBy default the state that is stored within a decorated state is assumed to be at s.state. Overwrite _get_state(s, ::Val{true}, recursive) to change this behaviour for your states for both the recursive and the nonrecursive case.\n\nIf recursive is set to false, only the outermost decorator is taken away instead of all.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.get_count","page":"Solver State","title":"Manopt.get_count","text":"get_count(ams::AbstractManoptSolverState, ::Symbol)\n\nObtain the count for a certain countable size, e.g. the :Iterations. This function returns 0 if there was nothing to count.\n\nAvailable symbols from within the solver state\n\n:Iterations is passed on to the stop field to obtain the iteration at which the solver stopped.\n\n\n\n\n\nget_count(co::ManifoldCountObjective, s::Symbol, mode::Symbol=:None)\n\nGet the number of counts for a certain symbol s.\n\nDepending on the mode, different results appear if the symbol does not exist in the dictionary\n\n:None – (default) silent mode, returns -1 for non-existing entries\n:warn – issues a warning if a field does not exist\n:error – issues an error if a field does not exist\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Since every subtype of an AbstractManoptSolverState directly relates to a solver, the concrete states are documented together with the corresponding solvers. This page documents the general functionality available for every state.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A first example is to access, i.e. obtain or set, the current iterate. 
This might be useful to continue investigation at the current iterate, or to set up a solver for the next experiment, respectively.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_iterate\nset_iterate!\nget_gradient(s::AbstractManoptSolverState)\nset_gradient!","category":"page"},{"location":"plans/state/#Manopt.get_iterate","page":"Solver State","title":"Manopt.get_iterate","text":"get_iterate(O::AbstractManoptSolverState)\n\nreturn the (last stored) iterate within an AbstractManoptSolverState. By default also undecorates the state beforehand.\n\n\n\n\n\nget_iterate(agst::AbstractGradientSolverState)\n\nreturn the iterate stored within gradient options. The default returns agst.p.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.set_iterate!","page":"Solver State","title":"Manopt.set_iterate!","text":"set_iterate!(s::AbstractManoptSolverState, M::AbstractManifold, p)\n\nset the iterate within an AbstractManoptSolverState to some (start) value p.\n\n\n\n\n\nset_iterate!(agst::AbstractGradientSolverState, M, p)\n\nset the (current) iterate stored within an AbstractGradientSolverState to p. The default function modifies s.p.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.get_gradient-Tuple{AbstractManoptSolverState}","page":"Solver State","title":"Manopt.get_gradient","text":"get_gradient(s::AbstractManoptSolverState)\n\nreturn the (last stored) gradient within an AbstractManoptSolverState. 
By default also undecorates the state beforehand\n\n\n\n\n\n","category":"method"},{"location":"plans/state/#Manopt.set_gradient!","page":"Solver State","title":"Manopt.set_gradient!","text":"set_gradient!(s::AbstractManoptSolverState, M::AbstractManifold, p, X)\n\nset the gradient within an (possibly decorated) AbstractManoptSolverState to some (start) value X in the tangent space at p.\n\n\n\n\n\nset_gradient!(agst::AbstractGradientSolverState, M, p, X)\n\nset the (current) gradient stored within an AbstractGradientSolverState to X. The default function modifies s.X.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"An internal function working on the state and elements within a state is used to pass messages from (sub) activities of a state to the corresponding DebugMessages","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_message","category":"page"},{"location":"plans/state/#Manopt.get_message","page":"Solver State","title":"Manopt.get_message","text":"get_message(du::AbstractManoptSolverState)\n\nget a message (String) from e.g. performing a step computation. This should return any message a sub-step might have issued\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Furthermore, to access the stopping criterion use","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_stopping_criterion","category":"page"},{"location":"plans/state/#Manopt.get_stopping_criterion","page":"Solver State","title":"Manopt.get_stopping_criterion","text":"get_stopping_criterion(ams::AbstractManoptSolverState)\n\nReturn the StoppingCriterion stored within the AbstractManoptSolverState ams.\n\nFor an undecorated state, this is assumed to be in ams.stop. 
Overwrite _get_stopping_criterion(yms::YMS) to change this for your manopt solver (yms) assuming it has type YMS.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Decorators-for-AbstractManoptSolverState","page":"Solver State","title":"Decorators for AbstractManoptSolverState","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A solver state can be decorated using the following trait and function to initialize","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"dispatch_state_decorator\nis_state_decorator\ndecorate_state!","category":"page"},{"location":"plans/state/#Manopt.dispatch_state_decorator","page":"Solver State","title":"Manopt.dispatch_state_decorator","text":"dispatch_state_decorator(s::AbstractManoptSolverState)\n\nIndicate internally, whether an AbstractManoptSolverState s is of decorating type, i.e. it stores (encapsulates) a state in itself, by default in the field s.state.\n\nDecorators indicate this by returning Val{true} for further dispatch.\n\nThe default is Val{false}, i.e. by default a state is not decorated.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.is_state_decorator","page":"Solver State","title":"Manopt.is_state_decorator","text":"is_state_decorator(s::AbstractManoptSolverState)\n\nIndicate whether an AbstractManoptSolverState s is of decorator type.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.decorate_state!","page":"Solver State","title":"Manopt.decorate_state!","text":"decorate_state!(s::AbstractManoptSolverState)\n\ndecorate the AbstractManoptSolverState s with specific decorators.\n\nOptional Arguments\n\noptional arguments provide necessary details on the decorators. 
A specific one is used to activate certain decorators.\n\ndebug – (Array{Union{Symbol,DebugAction,String,Int},1}()) a set of symbols representing DebugActions, Strings used as dividers and a subsampling integer. These are passed as a DebugGroup within :All to the DebugSolverState decorator dictionary. The only exception is :Stop, which is passed to :Stop.\nrecord – (Array{Union{Symbol,RecordAction,Int},1}()) specify recordings by using Symbols or RecordActions directly. The integer can again be used for only recording every ith iteration.\nreturn_state - (false) indicate whether to wrap the options in a ReturnSolverState, indicating that the solver should return options and not (only) the minimizer.\n\nother keywords are ignored.\n\nSee also\n\nDebugSolverState, RecordSolverState, ReturnSolverState\n\n\n\n\n\n","category":"function"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A simple example is the","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"ReturnSolverState","category":"page"},{"location":"plans/state/#Manopt.ReturnSolverState","page":"Solver State","title":"Manopt.ReturnSolverState","text":"ReturnSolverState{O<:AbstractManoptSolverState} <: AbstractManoptSolverState\n\nThis internal type is used to indicate that the contained AbstractManoptSolverState state should be returned at the end of a solver instead of the usual minimizer.\n\nSee also\n\nget_solver_result\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"as well as DebugSolverState and RecordSolverState.","category":"page"},{"location":"plans/state/#State-Actions","page":"Solver State","title":"State Actions","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"A state action is a struct for callback functions that can be attached, for example, within the just mentioned debug decorator or the record 
decorator.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"AbstractStateAction","category":"page"},{"location":"plans/state/#Manopt.AbstractStateAction","page":"Solver State","title":"Manopt.AbstractStateAction","text":"AbstractStateAction\n\na common type for AbstractStateActions that might be triggered in decorators, for example within the DebugSolverState or within the RecordSolverState.\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"Several state decorators or actions might store intermediate values like the (last) iterate to compute some change or the last gradient. In order to minimise the storage of these, there is a generic StoreStateAction that acts as a common storage that can be shared among different actions.","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"StoreStateAction\nget_storage\nhas_storage\nupdate_storage!\nPointStorageKey\nVectorStorageKey","category":"page"},{"location":"plans/state/#Manopt.StoreStateAction","page":"Solver State","title":"Manopt.StoreStateAction","text":"StoreStateAction <: AbstractStateAction\n\ninternal storage for AbstractStateActions to store a tuple of fields from an AbstractManoptSolverState\n\nThis functor possesses the usual interface of functions called during an iteration, i.e. acts on (p,o,i), where p is an AbstractManoptProblem, o is an AbstractManoptSolverState and i is the current iteration.\n\nFields\n\nvalues – a dictionary to store interim values based on certain Symbols\nkeys – a Vector of Symbols to refer to fields of AbstractManoptSolverState\npoint_values – a NamedTuple of mutable values of points on a manifold to be stored in StoreStateAction. 
Manifold is later determined by AbstractManoptProblem passed to update_storage!.\npoint_init – a NamedTuple of boolean values indicating whether a point in point_values with matching key has already been initialized to a value. When it is false, it corresponds to a general value not being stored for the key present in the vector keys.\nvector_values – a NamedTuple of mutable values of tangent vectors on a manifold to be stored in StoreStateAction. Manifold is later determined by AbstractManoptProblem passed to update_storage!. It is not specified at which point the vectors are tangent but for storage it should not matter.\nvector_init – a NamedTuple of boolean values indicating whether a tangent vector in vector_values with matching key has already been initialized to a value. When it is false, it corresponds to a general value not being stored for the key present in the vector keys.\nonce – whether to update the internal values only once per iteration\nlastStored – last iterate, where this AbstractStateAction was called (to determine once)\n\nTo handle the general storage, use get_storage and has_storage with keys as Symbols. For the point storage use PointStorageKey. For tangent vector storage use VectorStorageKey. 
Point and tangent storage have been optimized to be more efficient.\n\nConstructors\n\nStoreStateAction(s::Vector{Symbol})\n\nThis is equivalent to providing s via the keyword store_fields, except that here no manifold is necessary for the construction.\n\nStoreStateAction(M)\n\nKeyword arguments\n\nstore_fields (Symbol[])\nstore_points (Symbol[])\nstore_vectors (Symbol[])\n\nas vectors of symbols each referring to fields of the state (lower case symbols) or semantic ones (upper case).\n\np_init (rand(M))\nX_init (zero_vector(M, p_init))\n\nare used to initialize the point and vector storages, change these if you use other types (than the default) for your points/vectors on M.\n\nonce (true) whether to update internal storage only once per iteration or on every update call\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.get_storage","page":"Solver State","title":"Manopt.get_storage","text":"get_storage(a::AbstractStateAction, key::Symbol)\n\nReturn the internal value of the AbstractStateAction a at the Symbol key.\n\n\n\n\n\nget_storage(a::AbstractStateAction, ::PointStorageKey{key}) where {key}\n\nReturn the internal value of the AbstractStateAction a at the Symbol key that represents a point.\n\n\n\n\n\nget_storage(a::AbstractStateAction, ::VectorStorageKey{key}) where {key}\n\nReturn the internal value of the AbstractStateAction a at the Symbol key that represents a vector.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.has_storage","page":"Solver State","title":"Manopt.has_storage","text":"has_storage(a::AbstractStateAction, key::Symbol)\n\nReturn whether the AbstractStateAction a has a value stored at the Symbol key.\n\n\n\n\n\nhas_storage(a::AbstractStateAction, ::PointStorageKey{key}) where {key}\n\nReturn whether the AbstractStateAction a has a point value stored at the Symbol key.\n\n\n\n\n\nhas_storage(a::AbstractStateAction, ::VectorStorageKey{key}) where {key}\n\nReturn whether the AbstractStateAction a has a tangent vector 
value stored at the Symbol key.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.update_storage!","page":"Solver State","title":"Manopt.update_storage!","text":"update_storage!(a::AbstractStateAction, amp::AbstractManoptProblem, s::AbstractManoptSolverState)\n\nUpdate the AbstractStateAction a internal values to the ones given on the AbstractManoptSolverState s. Optimized using the information from amp\n\n\n\n\n\nupdate_storage!(a::AbstractStateAction, d::Dict{Symbol,<:Any})\n\nUpdate the AbstractStateAction a internal values to the ones given in the dictionary d. The values are merged, where the values from d are preferred.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.PointStorageKey","page":"Solver State","title":"Manopt.PointStorageKey","text":"struct PointStorageKey{key} end\n\nRefer to point storage of StoreStateAction in get_storage and has_storage functions\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.VectorStorageKey","page":"Solver State","title":"Manopt.VectorStorageKey","text":"struct VectorStorageKey{key} end\n\nRefer to tangent storage of StoreStateAction in get_storage and has_storage functions\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"as well as two internal functions","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"_storage_copy_vector\n_storage_copy_point","category":"page"},{"location":"plans/state/#Manopt._storage_copy_vector","page":"Solver State","title":"Manopt._storage_copy_vector","text":"_storage_copy_vector(M::AbstractManifold, X)\n\nMake a copy of tangent vector X from manifold M for storage in StoreStateAction.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt._storage_copy_point","page":"Solver State","title":"Manopt._storage_copy_point","text":"_storage_copy_point(M::AbstractManifold, p)\n\nMake a copy of point p from manifold M for 
storage in StoreStateAction.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Abstract-States","page":"Solver State","title":"Abstract States","text":"","category":"section"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"In a few cases it is useful to have a hierarchy of types. These are","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"AbstractSubProblemSolverState\nAbstractGradientSolverState\nAbstractHessianSolverState\nAbstractPrimalDualSolverState","category":"page"},{"location":"plans/state/#Manopt.AbstractSubProblemSolverState","page":"Solver State","title":"Manopt.AbstractSubProblemSolverState","text":"AbstractSubProblemSolverState <: AbstractManoptSolverState\n\nAn abstract type for problems that involve a subsolver\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.AbstractGradientSolverState","page":"Solver State","title":"Manopt.AbstractGradientSolverState","text":"AbstractGradientSolverState <: AbstractManoptSolverState\n\nA generic AbstractManoptSolverState type for gradient based options data.\n\nIt assumes that\n\nthe iterate is stored in the field p\nthe gradient at p is stored in X.\n\nsee also\n\nGradientDescentState, StochasticGradientDescentState, SubGradientMethodState, QuasiNewtonState.\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.AbstractHessianSolverState","page":"Solver State","title":"Manopt.AbstractHessianSolverState","text":"AbstractHessianSolverState <: AbstractGradientSolverState\n\nAn AbstractManoptSolverState type to represent algorithms that employ the Hessian. 
These options are assumed to have a field (gradient) to store the current gradient operatornamegradf(x)\n\n\n\n\n\n","category":"type"},{"location":"plans/state/#Manopt.AbstractPrimalDualSolverState","page":"Solver State","title":"Manopt.AbstractPrimalDualSolverState","text":"AbstractPrimalDualSolverState\n\nA general type for all primal dual based options to be used within primal dual based algorithms\n\n\n\n\n\n","category":"type"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"For the sub problem state, there are two access functions","category":"page"},{"location":"plans/state/","page":"Solver State","title":"Solver State","text":"get_sub_problem\nget_sub_state","category":"page"},{"location":"plans/state/#Manopt.get_sub_problem","page":"Solver State","title":"Manopt.get_sub_problem","text":"get_sub_problem(ams::AbstractSubProblemSolverState)\n\nAccess the sub problem of a solver state that involves a sub optimisation task. By default this returns ams.sub_problem.\n\n\n\n\n\n","category":"function"},{"location":"plans/state/#Manopt.get_sub_state","page":"Solver State","title":"Manopt.get_sub_state","text":"get_sub_state(ams::AbstractSubProblemSolverState)\n\nAccess the sub state of a solver state that involves a sub optimisation task. By default this returns ams.sub_state.\n\n\n\n\n\n","category":"function"},{"location":"about/#About","page":"About","title":"About","text":"","category":"section"},{"location":"about/","page":"About","title":"About","text":"Manopt.jl inherited its name from Manopt, a Matlab toolbox for optimization on manifolds. 
This Julia package was started and is currently maintained by Ronny Bergmann.","category":"page"},{"location":"about/","page":"About","title":"About","text":"The following people contributed","category":"page"},{"location":"about/","page":"About","title":"About","text":"Constantin Ahlmann-Eltze implemented the gradient and differential check functions\nRenée Dornig implemented the particle swarm, the Riemannian Augmented Lagrangian Method, the Exact Penalty Method, as well as the NonmonotoneLinesearch\nWillem Diepeveen implemented the primal-dual Riemannian semismooth Newton solver.\nEven Stephansen Kjemsås contributed to the implementation of the Frank Wolfe Method solver\nMathias Ravn Munkvold contributed most of the implementation of the Adaptive Regularization with Cubics solver\nTom-Christian Riemer implemented the trust regions and quasi Newton solvers.\nManuel Weiss implemented most of the conjugate gradient update rules","category":"page"},{"location":"about/","page":"About","title":"About","text":"...as well as various contributors providing small extensions, finding small bugs and mistakes and fixing them by opening PRs.","category":"page"},{"location":"about/","page":"About","title":"About","text":"If you want to contribute a manifold or algorithm or have any questions, visit the GitHub repository to clone/fork the repository or open an issue.","category":"page"},{"location":"about/#Further-Packages-and-Links","page":"About","title":"Further Packages & Links","text":"","category":"section"},{"location":"about/","page":"About","title":"About","text":"Manopt.jl belongs to the Manopt family:","category":"page"},{"location":"about/","page":"About","title":"About","text":"manopt.org – The Matlab version of Manopt, see also their :octocat: GitHub repository\npymanopt.org – The Python version of Manopt – providing also several AD backends, see also their :octocat: GitHub 
repository","category":"page"},{"location":"about/","page":"About","title":"About","text":"but there are also more packages providing tools on manifolds:","category":"page"},{"location":"about/","page":"About","title":"About","text":"Jax Geometry (Python/Jax) for differential geometry and stochastic dynamics with deep learning\nGeomstats (Python with several backends) focusing on statistics and machine learning :octocat: GitHub repository\nGeoopt (Python & PyTorch) – Riemannian ADAM & SGD. :octocat: GitHub repository\nMcTorch (Python & PyTorch) – Riemannian SGD, Adagrad, ASA & CG.\nROPTLIB (C++) a Riemannian OPTimization LIBrary :octocat: GitHub repository\nTF Riemopt (Python & TensorFlow) Riemannian optimization using TensorFlow","category":"page"},{"location":"tutorials/GeodesicRegression/#How-to-perform-Geodesic-Regression","page":"Do Geodesic Regression","title":"How to perform Geodesic Regression","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Geodesic regression generalizes linear regression to Riemannian manifolds. Let’s first phrase it informally as follows:","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For given data points d_1ldotsd_n on a Riemannian manifold mathcal M, find the geodesic that “best explains” the data.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"The meaning of “best explain” still has to be clarified. 
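For intuition, the Euclidean special case reduces to classical least-squares line fitting, whose closed form X^* = sum_i (d_i-d^*)(t_i-t^*) / sum_i (t_i-t^*)^2 and p^* = d^* - t^*X^* is recalled later in this tutorial. A minimal plain-Julia sketch of that closed form (the helper name linear_regression is ours, not part of Manopt.jl):

```julia
# Closed-form least-squares line through (tᵢ, dᵢ):
#   X* = Σᵢ (dᵢ - d̄)(tᵢ - t̄) / Σᵢ (tᵢ - t̄)²,  p* = d̄ - t̄ X*
# so that γ(t) = p* + t X* minimizes ½ Σᵢ (γ(tᵢ) - dᵢ)².
function linear_regression(t::Vector{<:Real}, d::Vector{<:Real})
    tbar = sum(t) / length(t)  # mean time t*
    dbar = sum(d) / length(d)  # mean datum d*
    X = sum((d .- dbar) .* (t .- tbar)) / sum((t .- tbar) .^ 2)
    p = dbar - tbar * X
    return p, X
end
```

For data lying exactly on a line, e.g. d = 1 .+ 2 .* t, this recovers p = 1 and X = 2; the manifold case below generalizes exactly this least-squares principle.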
We distinguish two cases: time labelled data and unlabelled data","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" using Manopt, ManifoldDiff, Manifolds, Random, Colors\n using LinearAlgebra: svd\n Random.seed!(42);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"We use the following data, where we want to highlight one of the points.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"n = 7\nσ = π / 8\nS = Sphere(2)\nbase = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndir = [-0.75, 0.5, 0.75]\ndata_orig = [exp(S, base, dir, t) for t in range(-0.5, 0.5; length=n)]\n# add noise to the points on the geodesic\ndata = map(p -> exp(S, p, rand(S; vector_at=p, σ=σ)), data_orig)\nhighlighted = 4;","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: The given data)","category":"page"},{"location":"tutorials/GeodesicRegression/#Time-Labeled-Data","page":"Do Geodesic Regression","title":"Time Labeled Data","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"If for each data item d_i we are also given a time point t_iinmathbb R, which are pairwise different, then we can use the least squares error to state the objective function as [Fle13]","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"F(pX) = frac12sum_i=1^n d_mathcal M^2(γ_pX(t_i) d_i)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"where d_mathcal M is the Riemannian distance and γ_pX is the geodesic with γ(0) = p and 
dotgamma(0) = X.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the real-valued case mathcal M = mathbb R^m the solution (p^* X^*) is given in closed form as follows: with d^* = frac1ndisplaystylesum_i=1^nd_i and t^* = frac1ndisplaystylesum_i=1^n t_i we get","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" X^* = fracsum_i=1^n (d_i-d^*)(t_i-t^*)sum_i=1^n (t_i-t^*)^2\nquadtext and quad\np^* = d^* - t^*X^*","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"and hence the linear regression result is the line γ_p^*X^*(t) = p^* + tX^*.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"On a Riemannian manifold we can phrase this as an optimization problem on the tangent bundle, i.e. 
the disjoint union of all tangent spaces, as","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"operatorname*argmin_(pX) in mathrmTmathcal M F(pX)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Due to linearity, the gradient of F(pX) is the sum of the single gradients of","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" frac12d_mathcal M^2bigl(γ_pX(t_i)d_ibigr)\n = frac12d_mathcal M^2bigl(exp_p(t_iX)d_ibigr)\n quad i1ldotsn","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"which can be computed using a chain rule of the squared distance and the exponential map, see for example [BG18] for details or Equations (7) and (8) of [Fle13]:","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"M = TangentBundle(S)\nstruct RegressionCost{T,S}\n data::T\n times::S\nend\nRegressionCost(data::T, times::S) where {T,S} = RegressionCost{T,S}(data, times)\nfunction (a::RegressionCost)(M, x)\n pts = [geodesic(M.manifold, x[M, :point], x[M, :vector], ti) for ti in a.times]\n return 1 / 2 * sum(distance.(Ref(M.manifold), pts, a.data) .^ 2)\nend\nstruct RegressionGradient!{T,S}\n data::T\n times::S\nend\nfunction RegressionGradient!(data::T, times::S) where {T,S}\n return RegressionGradient!{T,S}(data, times)\nend\nfunction (a::RegressionGradient!)(M, Y, x)\n pts = [geodesic(M.manifold, x[M, :point], x[M, :vector], ti) for ti in a.times]\n gradients = grad_distance.(Ref(M.manifold), a.data, pts)\n Y[M, :point] .= sum(\n ManifoldDiff.adjoint_differential_exp_basepoint.(\n Ref(M.manifold),\n Ref(x[M, :point]),\n [ti * x[M, :vector] for ti in a.times],\n 
gradients,\n ),\n )\n Y[M, :vector] .= sum(\n ManifoldDiff.adjoint_differential_exp_argument.(\n Ref(M.manifold),\n Ref(x[M, :point]),\n [ti * x[M, :vector] for ti in a.times],\n gradients,\n ),\n )\n return Y\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the Euclidean case, the result is given by the first principal component of a principal component analysis, see PCR, i.e. with p^* = frac1ndisplaystylesum_i=1^n d_i the direction X^* is obtained by defining the zero mean data matrix","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"D = bigl(d_1-p^* ldots d_n-p^*bigr) in mathbb R^mn","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"and taking X^* as an eigenvector to the largest eigenvalue of D^mathrmTD.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"We can do something similar, when considering the tangent space at the (Riemannian) mean of the data and then do a PCA on the coordinate coefficients with respect to a basis.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"m = mean(S, data)\nA = hcat(\n map(x -> get_coordinates(S, m, log(S, m, x), DefaultOrthonormalBasis()), data)...\n)\npca1 = get_vector(S, m, svd(A).U[:, 1], DefaultOrthonormalBasis())\nx0 = ArrayPartition(m, pca1)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"([0.6998621681746481, -0.013681674945026638, 0.7141468737791822], [0.5931302057517893, -0.5459465115717783, -0.5917254139611094])","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do 
Geodesic Regression","title":"Do Geodesic Regression","text":"The optimal “time labels” are then just the projections t_i = d_iX^*, i=1ldotsn.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"t = map(d -> inner(S, m, pca1, log(S, m, d)), data)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"7-element Vector{Float64}:\n 1.0763904949888323\n 0.4594060193318443\n -0.5030195874833682\n 0.02135686940521725\n -0.6158692507563633\n -0.24431652575028764\n -0.2259012492666664","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"And we can call the gradient descent. Note that since gradF! works in place of Y, we have to set the evaluation type accordingly.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"y = gradient_descent(\n M,\n RegressionCost(data, t),\n RegressionGradient!(data, t),\n x0;\n evaluation=InplaceEvaluation(),\n stepsize=ArmijoLinesearch(\n M;\n initial_stepsize=1.0,\n contraction_factor=0.990,\n sufficient_decrease=0.05,\n stop_when_stepsize_less=1e-9,\n ),\n stopping_criterion=StopAfterIteration(200) |\n StopWhenGradientNormLess(1e-8) |\n StopWhenStepsizeLess(1e-9),\n debug=[:Iteration, \" | \", :Cost, \"\\n\", :Stop, 50],\n)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Initial | F(x): 0.142862\n# 50 | F(x): 0.141113\n# 100 | F(x): 0.141113\n# 150 | F(x): 0.141113\n# 200 | F(x): 0.141113\nThe algorithm reached its maximal number of iterations (200).\n\n([0.7119768725361988, 0.009463059143003981, 0.7021391482357537], [0.590008151835008, -0.5543272518659472, 
-0.5908038715512287])","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the result, we can generate and plot all involved geodesics","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"dense_t = range(-0.5, 0.5; length=100)\ngeo = geodesic(S, y[M, :point], y[M, :vector], dense_t)\ninit_geo = geodesic(S, x0[M, :point], x0[M, :vector], dense_t)\ngeo_pts = geodesic(S, y[M, :point], y[M, :vector], t)\ngeo_conn_highlighted = shortest_geodesic(\n S, data[highlighted], geo_pts[highlighted], 0.5 .+ dense_t\n);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: Result of Geodesic Regression)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"In this image, together with the blue data points, you see the geodesic of the initialization in black (evaluated on -frac12frac12), the final point on the tangent bundle in orange, as well as the resulting regression geodesic in teal, (on the same interval as the start) as well as small teal points indicating the time points on the geodesic corresponding to the data. Additionally, a thin blue line indicates the geodesic between a data point and its corresponding data point on the geodesic. While this would be the closest point in Euclidean space and hence the two directions (along the geodesic vs. 
to the data point) would be orthogonal, here we have","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"inner(\n S,\n geo_pts[highlighted],\n log(S, geo_pts[highlighted], geo_pts[highlighted + 1]),\n log(S, geo_pts[highlighted], data[highlighted]),\n)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"0.002487393068917863","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"But we also started with one of the best scenarios, i.e. equally spaced points on a geodesic obstructed by noise.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"This gets worse if you start with less evenly distributed data","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"data2 = [exp(S, base, dir, t) for t in [-0.5, -0.49, -0.48, 0.1, 0.48, 0.49, 0.5]]\ndata2 = map(p -> exp(S, p, rand(S; vector_at=p, σ=σ / 2)), data2)\nm2 = mean(S, data2)\nA2 = hcat(\n map(x -> get_coordinates(S, m, log(S, m, x), DefaultOrthonormalBasis()), data2)...\n)\npca2 = get_vector(S, m, svd(A2).U[:, 1], DefaultOrthonormalBasis())\nx1 = ArrayPartition(m, pca2)\nt2 = map(d -> inner(S, m2, pca2, log(S, m2, d)), data2)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"7-element Vector{Float64}:\n 0.8226008307680276\n 0.470952643700004\n 0.7974195537403082\n 0.01533949241264346\n -0.6546705405852389\n -0.8913273825362389\n -0.5775954445730889","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"then we run 
again","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"y2 = gradient_descent(\n M,\n RegressionCost(data2, t2),\n RegressionGradient!(data2, t2),\n x1;\n evaluation=InplaceEvaluation(),\n stepsize=ArmijoLinesearch(\n M;\n initial_stepsize=1.0,\n contraction_factor=0.990,\n sufficient_decrease=0.05,\n stop_when_stepsize_less=1e-9,\n ),\n stopping_criterion=StopAfterIteration(200) |\n StopWhenGradientNormLess(1e-8) |\n StopWhenStepsizeLess(1e-9),\n debug=[:Iteration, \" | \", :Cost, \"\\n\", :Stop, 3],\n);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Initial | F(x): 0.089844\n# 3 | F(x): 0.085364\n# 6 | F(x): 0.085364\n# 9 | F(x): 0.085364\n# 12 | F(x): 0.085364\n# 15 | F(x): 0.085364\n# 18 | F(x): 0.085364\n# 21 | F(x): 0.085364\n# 24 | F(x): 0.085364\n# 27 | F(x): 0.085364\n# 30 | F(x): 0.085364\n# 33 | F(x): 0.085364\n# 36 | F(x): 0.085364\n# 39 | F(x): 0.085364\n# 42 | F(x): 0.085364\n# 45 | F(x): 0.085364\n# 48 | F(x): 0.085364\n# 51 | F(x): 0.085364\n# 54 | F(x): 0.085364\n# 57 | F(x): 0.085364\n# 60 | F(x): 0.085364\n# 63 | F(x): 0.085364\n# 66 | F(x): 0.085364\n# 69 | F(x): 0.085364\n# 72 | F(x): 0.085364\n# 75 | F(x): 0.085364\n# 78 | F(x): 0.085364\n# 81 | F(x): 0.085364\n# 84 | F(x): 0.085364\n# 87 | F(x): 0.085364\n# 90 | F(x): 0.085364\n# 93 | F(x): 0.085364\n# 96 | F(x): 0.085364\n# 99 | F(x): 0.085364\n# 102 | F(x): 0.085364\n# 105 | F(x): 0.085364\n# 108 | F(x): 0.085364\n# 111 | F(x): 0.085364\n# 114 | F(x): 0.085364\n# 117 | F(x): 0.085364\n# 120 | F(x): 0.085364\n# 123 | F(x): 0.085364\n# 126 | F(x): 0.085364\n# 129 | F(x): 0.085364\n# 132 | F(x): 0.085364\n# 135 | F(x): 0.085364\n# 138 | F(x): 0.085364\n# 141 | F(x): 0.085364\n# 144 | F(x): 0.085364\n# 147 | F(x): 0.085364\n# 150 | F(x): 0.085364\n# 153 | F(x): 0.085364\n# 156 | F(x): 0.085364\n# 159 | F(x): 0.085364\n# 
162 | F(x): 0.085364\n# 165 | F(x): 0.085364\n# 168 | F(x): 0.085364\n# 171 | F(x): 0.085364\n# 174 | F(x): 0.085364\n# 177 | F(x): 0.085364\n# 180 | F(x): 0.085364\n# 183 | F(x): 0.085364\n# 186 | F(x): 0.085364\n# 189 | F(x): 0.085364\n# 192 | F(x): 0.085364\n# 195 | F(x): 0.085364\n# 198 | F(x): 0.085364\nThe algorithm reached its maximal number of iterations (200).","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For plotting we again generate all data","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"geo2 = geodesic(S, y2[M, :point], y2[M, :vector], dense_t)\ninit_geo2 = geodesic(S, x1[M, :point], x1[M, :vector], dense_t)\ngeo_pts2 = geodesic(S, y2[M, :point], y2[M, :vector], t2)\ngeo_conn_highlighted2 = shortest_geodesic(\n S, data2[highlighted], geo_pts2[highlighted], 0.5 .+ dense_t\n);","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: A second result with different time points)","category":"page"},{"location":"tutorials/GeodesicRegression/#Unlabeled-Data","page":"Do Geodesic Regression","title":"Unlabeled Data","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"If we are not given time points t_i, then the optimization problem extends – informally speaking – to also finding the “best fitting” (in the sense of smallest error). 
To formalize, the objective function here reads","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"F(p X t) = frac12sum_i=1^n d_mathcal M^2(γ_pX(t_i) d_i)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"where t = (t_1ldotst_n) in mathbb R^n is now an additional parameter of the objective function. We write F_1(p X) to refer to the function on the tangent bundle for fixed values of t (as the one in the last part) and F_2(t) for the function F(p X t) as a function in t with fixed values (p X).","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"For the Euclidean case, there is no necessity to optimize with respect to t, as we saw above for the initialization of the fixed time points.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"On a Riemannian manifold this can be stated as a problem on the product manifold mathcal N = mathrmTmathcal M times mathbb R^n, i.e.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"N = M × Euclidean(length(t2))","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"ProductManifold with 2 submanifolds:\n TangentBundle(Sphere(2, ℝ))\n Euclidean(7; field = ℝ)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" operatorname*argmin_bigl((pX)tbigr)inmathcal N F(p X t)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"In this tutorial we present an approach to solve 
this using an alternating gradient descent scheme. To be precise, we define the cost function now on the product manifold","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"struct RegressionCost2{T}\n data::T\nend\nRegressionCost2(data::T) where {T} = RegressionCost2{T}(data)\nfunction (a::RegressionCost2)(N, x)\n TM = N[1]\n pts = [\n geodesic(TM.manifold, x[N, 1][TM, :point], x[N, 1][TM, :vector], ti) for\n ti in x[N, 2]\n ]\n return 1 / 2 * sum(distance.(Ref(TM.manifold), pts, a.data) .^ 2)\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"The gradient comes in two parts, namely (a) the same gradient as before w.r.t. (pX) Tmathcal M, just now with a fixed t in mind for the second component of the product manifold mathcal N","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"struct RegressionGradient2a!{T}\n data::T\nend\nRegressionGradient2a!(data::T) where {T} = RegressionGradient2a!{T}(data)\nfunction (a::RegressionGradient2a!)(N, Y, x)\n TM = N[1]\n p = x[N, 1]\n pts = [geodesic(TM.manifold, p[TM, :point], p[TM, :vector], ti) for ti in x[N, 2]]\n gradients = Manopt.grad_distance.(Ref(TM.manifold), a.data, pts)\n Y[TM, :point] .= sum(\n ManifoldDiff.adjoint_differential_exp_basepoint.(\n Ref(TM.manifold),\n Ref(p[TM, :point]),\n [ti * p[TM, :vector] for ti in x[N, 2]],\n gradients,\n ),\n )\n Y[TM, :vector] .= sum(\n ManifoldDiff.adjoint_differential_exp_argument.(\n Ref(TM.manifold),\n Ref(p[TM, :point]),\n [ti * p[TM, :vector] for ti in x[N, 2]],\n gradients,\n ),\n )\n return Y\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Finally, we additionally look for a fixed point x=(pX) mathrmTmathcal M at the gradient with 
respect to tmathbb R^n, i.e. the second component, which is given by","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":" (operatornamegradF_2(t))_i\n = - dot γ_pX(t_i) log_γ_pX(t_i)d_i_γ_pX(t_i) i = 1 ldots n","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"struct RegressionGradient2b!{T}\n data::T\nend\nRegressionGradient2b!(data::T) where {T} = RegressionGradient2b!{T}(data)\nfunction (a::RegressionGradient2b!)(N, Y, x)\n TM = N[1]\n p = x[N, 1]\n pts = [geodesic(TM.manifold, p[TM, :point], p[TM, :vector], ti) for ti in x[N, 2]]\n logs = log.(Ref(TM.manifold), pts, a.data)\n pt = map(\n d -> vector_transport_to(TM.manifold, p[TM, :point], p[TM, :vector], d), pts\n )\n Y .= -inner.(Ref(TM.manifold), pts, logs, pt)\n return Y\nend","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"We can reuse the computed initial values from before, just that now we are on a product manifold","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"x2 = ArrayPartition(x1, t2)\nF3 = RegressionCost2(data2)\ngradF3_vector = [RegressionGradient2a!(data2), RegressionGradient2b!(data2)];","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"and we run the algorithm","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"y3 = alternating_gradient_descent(\n N,\n F3,\n gradF3_vector,\n x2;\n evaluation=InplaceEvaluation(),\n debug=[:Iteration, \" | \", :Cost, \"\\n\", :Stop, 50],\n stepsize=ArmijoLinesearch(\n M;\n contraction_factor=0.999,\n sufficient_decrease=0.066,\n 
stop_when_stepsize_less=1e-11,\n retraction_method=ProductRetraction(SasakiRetraction(2), ExponentialRetraction()),\n ),\n inner_iterations=1,\n)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Initial | F(x): 0.089844\n# 50 | F(x): 0.091097\n# 100 | F(x): 0.091097\nThe algorithm reached its maximal number of iterations (100).\n\n(ArrayPartition{Float64, Tuple{Vector{Float64}, Vector{Float64}}}(([0.750222090700214, 0.031464227399200885, 0.6604368380243274], [0.6636489079535082, -0.3497538263293046, -0.737208025444054])), [0.7965909273713889, 0.43402264218923514, 0.755822122896529, 0.001059348203453764, -0.6421135044471217, -0.8635572995105818, -0.5546338813212247])","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"which we can collect into an image by creating the geodesics again","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"geo3 = geodesic(S, y3[N, 1][M, :point], y3[N, 1][M, :vector], dense_t)\ninit_geo3 = geodesic(S, x1[M, :point], x1[M, :vector], dense_t)\ngeo_pts3 = geodesic(S, y3[N, 1][M, :point], y3[N, 1][M, :vector], y3[N, 2])\nt3 = y3[N, 2]\ngeo_conns = shortest_geodesic.(Ref(S), data2, geo_pts3, Ref(0.5 .+ 4*dense_t));","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"which yields","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"(Image: The third result)","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Note that the geodesics from the data to the regression geodesic meet at a nearly orthogonal 
angle.","category":"page"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"Acknowledgement. Parts of this tutorial are based on the bachelor thesis of Jeremias Arf.","category":"page"},{"location":"tutorials/GeodesicRegression/#Literature","page":"Do Geodesic Regression","title":"Literature","text":"","category":"section"},{"location":"tutorials/GeodesicRegression/","page":"Do Geodesic Regression","title":"Do Geodesic Regression","text":"
[BG18]
\n
\n
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
","category":"page"},{"location":"solvers/FrankWolfe/#FrankWolfe","page":"Frank-Wolfe","title":"Frank Wolfe Method","text":"","category":"section"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"Frank_Wolfe_method\nFrank_Wolfe_method!","category":"page"},{"location":"solvers/FrankWolfe/#Manopt.Frank_Wolfe_method","page":"Frank-Wolfe","title":"Manopt.Frank_Wolfe_method","text":"Frank_Wolfe_method(M, f, grad_f, p)\nFrank_Wolfe_method(M, gradient_objective, p; kwargs...)\n\nPerform the Frank-Wolfe algorithm to compute for mathcal C subset mathcal M\n\n operatorname*argmin_pmathcal C f(p)\n\nwhere the main step is a constrained optimisation within the algorithm, that is, the sub problem (Oracle)\n\n q_k = operatornameargmin_q in C operatornamegrad F(p_k) log_p_kq\n\nfor every iterate p_k together with a stepsize s_k1, by default s_k = frac2k+2. This algorithm is inspired by but slightly more general than Weber, Sra, Math. 
Prog., 2022.\n\nThe next iterate is then given by p_k+1 = γ_p_kq_k(s_k), where by default γ is the shortest geodesic between the two points but can also be changed to use a retraction and its inverse.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function f mathcal Mℝ to find a minimizer p^* for\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of f\nas a function (M, p) -> X or a function (M, X, p) -> X working in place of X.\np – an initial value p mathcal C, note that it really has to be a feasible point\n\nAlternatively to f and grad_f you can provide the AbstractManifoldGradientObjective gradient_objective directly.\n\nKeyword Arguments\n\nevaluation - (AllocatingEvaluation) whether grad_f is an inplace or allocating (default) function\ninitial_vector – (zero_vector(M,p)) how to initialize the inner gradient tangent vector\nstopping_criterion – (StopAfterIteration(500) |StopWhenGradientNormLess(1.0e-6)) a stopping criterion\nretraction_method – (default_retraction_method(M, typeof(p))) a type of retraction\nstepsize - (DecreasingStepsize(; length=2.0, shift=2)) a Stepsize to use; but it always has to be less than 1. The default is the one proposed by Frank & Wolfe: s_k = frac2k+2.\nsub_cost - (FrankWolfeCost(p, initial_vector)) – the cost of the Frank-Wolfe sub problem which by default uses the current iterate and (sub)gradient of the current iteration to define a default cost, this is used to define the default sub_objective. It is ignored, if you set that or the sub_problem directly\nsub_grad - (FrankWolfeGradient(p, initial_vector)) – the gradient of the Frank-Wolfe sub problem which by default uses the current iterate and (sub)gradient of the current iteration to define a default gradient, this is used to define the default sub_objective. 
It is ignored, if you set that or the sub_problem directly\nsub_objective - (ManifoldGradientObjective(sub_cost, sub_gradient)) – the objective for the Frank-Wolfe sub problem this is used to define the default sub_problem. It is ignored, if you set the sub_problem manually\nsub_problem - (DefaultManoptProblem(M, sub_objective)) – the Frank-Wolfe sub problem to solve. This can be given in three forms\nas an AbstractManoptProblem, then the sub_state specifies the solver to use\nas a closed form solution, e.g. a function, evaluating with new allocations, that is a function (M, p, X) -> q that solves the sub problem on M given the current iterate p and (sub)gradient X.\nas a closed form solution, e.g. a function, evaluating in place, that is a function (M, q, p, X) -> q working in place of q, with the parameters as in the last point\nFor points 2 and 3 the sub_state has to be set to the corresponding AbstractEvaluationType, AllocatingEvaluation and InplaceEvaluation, respectively\nsub_state - (evaluation if sub_problem is a function, a decorated GradientDescentState otherwise) for a function, the evaluation is inherited from the Frank-Wolfe evaluation keyword.\nsub_kwargs - ([]) – keyword arguments to decorate the sub_state default state in case the sub_problem is not a function\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/FrankWolfe/#Manopt.Frank_Wolfe_method!","page":"Frank-Wolfe","title":"Manopt.Frank_Wolfe_method!","text":"Frank_Wolfe_method!(M, f, grad_f, p; kwargs...)\nFrank_Wolfe_method!(M, gradient_objective, p; kwargs...)\n\nPerform the Frank Wolfe method in place of p.\n\nFor all options and keyword arguments, see Frank_Wolfe_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/FrankWolfe/#State","page":"Frank-Wolfe","title":"State","text":"","category":"section"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"FrankWolfeState","category":"page"},{"location":"solvers/FrankWolfe/#Manopt.FrankWolfeState","page":"Frank-Wolfe","title":"Manopt.FrankWolfeState","text":"FrankWolfeState <: AbstractManoptSolverState\n\nA struct to store the current state of the Frank_Wolfe_method\n\nIt comes in two forms, depending on the realisation of the subproblem.\n\nFields\n\np – the current iterate, i.e. a point on the manifold\nX – the current gradient operatornamegrad F(p), i.e. 
a tangent vector to p.\ninverse_retraction_method – (default_inverse_retraction_method(M, typeof(p))) an inverse retraction method to use within Frank Wolfe.\nsub_problem – an AbstractManoptProblem or a function (M, p, X) -> q or (M, q, p, X) for a closed form solution of the sub problem\nsub_state – an AbstractManoptSolverState for the subsolver or an AbstractEvaluationType in case the sub problem is provided as a function\nstop – (StopAfterIteration(200) |StopWhenGradientNormLess(1.0e-6)) a StoppingCriterion\nstepsize - (DecreasingStepsize(; length=2.0, shift=2)) s_k which by default is set to s_k = frac2k+2.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use within Frank-Wolfe\n\nFor the subtask, we need a method to solve\n\n operatorname*argmin_qmathcal M X log_p qqquad text where X=operatornamegrad f(p)\n\nConstructor\n\nFrankWolfeState(M, p, X, sub_problem, sub_state)\n\nwhere the remaining fields from above are keyword arguments with their defaults already given in brackets.\n\n\n\n\n\n","category":"type"},{"location":"solvers/FrankWolfe/#Helpers","page":"Frank-Wolfe","title":"Helpers","text":"","category":"section"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"For the inner sub-problem you can easily create the corresponding cost and gradient using","category":"page"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"FrankWolfeCost\nFrankWolfeGradient","category":"page"},{"location":"solvers/FrankWolfe/#Manopt.FrankWolfeCost","page":"Frank-Wolfe","title":"Manopt.FrankWolfeCost","text":"FrankWolfeCost{P,T}\n\nA structure to represent the oracle sub problem in the Frank_Wolfe_method. 
The cost function reads\n\nF(q) = X log_p q\n\nThe values p and X are stored within this functor and should be references to the iterate and gradient from within FrankWolfeState.\n\n\n\n\n\n","category":"type"},{"location":"solvers/FrankWolfe/#Manopt.FrankWolfeGradient","page":"Frank-Wolfe","title":"Manopt.FrankWolfeGradient","text":"FrankWolfeGradient{P,T}\n\nA structure to represent the gradient of the oracle sub problem in the Frank_Wolfe_method, that is for a given point p and a tangent vector X we have\n\nF(q) = X log_p q\n\nIts gradient can be computed easily using adjoint_differential_log_argument.\n\nThe values p and X are stored within this functor and should be references to the iterate and gradient from within FrankWolfeState.\n\n\n\n\n\n","category":"type"},{"location":"solvers/FrankWolfe/","page":"Frank-Wolfe","title":"Frank-Wolfe","text":"
","category":"page"},{"location":"tutorials/ImplementASolver/#How-to-implementing-your-own-solver","page":"Implement a Solver","title":"How to implement your own solver","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"When you have used a few solvers from Manopt.jl, for example as in the opening tutorial Get Started: Optimize!, you might come to the idea of implementing a solver yourself.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"After a short introduction of the algorithm we will implement, this tutorial first discusses the structural details, i.e. what a solver consists of and “works with”. Afterwards, we will show how to implement the algorithm. Finally, we will discuss how to make the algorithm both nice for the user and initialized in a way that it can benefit from features already available in Manopt.jl.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"note: Note\nIf you have implemented your own solver, we would be very happy to have that within Manopt.jl as well, so maybe consider opening a Pull Request","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"using Manopt, Manifolds, Random","category":"page"},{"location":"tutorials/ImplementASolver/#Our-Guiding-Example:-A-random-walk-Minimization","page":"Implement a Solver","title":"Our Guiding Example: A random walk Minimization","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Since most serious algorithms should be implemented in Manopt.jl 
itself directly, we will implement a solver that randomly walks on the manifold and keeps track of the lowest point visited. As for algorithms in Manopt.jl we aim to implement this generically for any manifold that is implemented using ManifoldsBase.jl.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The Random Walk Minimization","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Given:","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"a manifold mathcal M\na starting point p=p^(0)\na cost function f mathcal M tomathbb R.\na parameter sigma 0.\na retraction operatornameretr_p(X) that maps Xin T_pmathcal M to the manifold.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can run the following steps of the algorithm","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"set k=0\nset our best point q = p^(0)\nRepeat until a stopping criterion is fulfilled\nChoose a random tangent vector X^(k) in T_p^(k)mathcal M of length lVert X^(k) rVert = sigma\n“Walk” along this direction, i.e. 
p^(k+1) = operatornameretr_p^(k)(X^(k))\nIf f(p^(k+1)) < f(q), set q = p^(k+1) as our new best visited point\nReturn q as the resulting best point we visited","category":"page"},{"location":"tutorials/ImplementASolver/#Preliminaries-–-Elements-a-Solver-works-on","page":"Implement a Solver","title":"Preliminaries – Elements a Solver works on","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"There are two main ingredients a solver needs: a problem to work on and the state of a solver, which “identifies” the solver and stores intermediate results.","category":"page"},{"location":"tutorials/ImplementASolver/#The-“Task”-–-An-AbstractManoptProblem","page":"Implement a Solver","title":"The “Task” – An AbstractManoptProblem","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"A problem in Manopt.jl usually consists of a manifold (an AbstractManifold) and an AbstractManifoldObjective describing the function we have and its features. In our case the objective is (just) a ManifoldCostObjective that stores the cost function f(M,p) = .... 
More generally, it might for example store a gradient function or the Hessian or any other information we have about our task.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"This is something independent of the solver itself, since it only identifies the problem we want to solve independent of how we want to solve it – or in other words, this type contains all information that is static and independent of the specific solver at hand.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Usually the problem's variable is called mp.","category":"page"},{"location":"tutorials/ImplementASolver/#The-Solver-–-An-AbstractManoptSolverState","page":"Implement a Solver","title":"The Solver – An AbstractManoptSolverState","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Everything that is needed by a solver during the iterations, all its parameters, interim values that are needed beyond just one iteration, is stored in a subtype of the AbstractManoptSolverState. This identifies the solver uniquely.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"In our case we want to store five things","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"the current iterate p=p^(k)\nthe best visited point q\nthe variable sigma 0\nthe retraction operatornameretr to use (cf. retractions and inverse retractions)\na criterion when to stop, i.e. 
a StoppingCriterion","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can define this as","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"mutable struct RandomWalkState{\n P,\n R<:AbstractRetractionMethod,\n S<:StoppingCriterion,\n} <: AbstractManoptSolverState\n p::P\n q::P\n σ::Float64\n retraction_method::R\n stop::S\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The stopping criterion is usually stored in the state’s stop field. If you have a reason to do otherwise, you have one more function to implement (see next section). For ease of use, we can provide a constructor that, for example, chooses a good default for the retraction based on a given manifold.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function RandomWalkState(M::AbstractManifold, p::P=rand(M);\n σ = 0.1,\n retraction_method::R=default_retraction_method(M),\n stopping_criterion::S=StopAfterIteration(200)\n) where {P, R<:AbstractRetractionMethod, S<:StoppingCriterion}\n return RandomWalkState{P,R,S}(p, copy(M, p), σ, retraction_method, stopping_criterion)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Parametrising the state avoids abstractly typed fields. 
The keyword arguments for the retraction and stopping criterion are the ones usually used in Manopt.jl and provide an easy way to construct this state now.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"States usually have a shortened name as their variable; we will use rws for our state here.","category":"page"},{"location":"tutorials/ImplementASolver/#Implementing-the-Your-solver","page":"Implement a Solver","title":"Implementing your solver","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"There are only a few methods we need to implement for our solver","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"initialize_solver!(mp, rws) which initialises the solver before the first iteration\nstep_solver!(mp, rws, i) which implements the ith iteration, where i is given to you as the third parameter\nget_iterate(rws) which accesses the iterate from other places in the solver\nget_solver_result(rws) returning the solver's final (best) point we reached. By default this would return the last iterate rws.p (or more precisely calls get_iterate), but since we randomly walk and remember our best point in q, this has to return rws.q.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The first two functions are in-place functions, that is they modify our solver state rws. 
You implement these by multiple dispatch on the types after importing said functions from Manopt:","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"import Manopt: initialize_solver!, step_solver!, get_iterate, get_solver_result","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The state above has two fields where we use the common names used in Manopt.jl, that is the StoppingCriterion is usually in stop and the iterate in p. If your choice is different, you need to reimplement","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"stop_solver!(mp, rws, i) to determine whether or not to stop after the ith iteration.\nget_iterate(rws) to access the current iterate","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We recommend following the general scheme with the stop field. If you have specific criteria when to stop, consider implementing your own stopping criterion instead.","category":"page"},{"location":"tutorials/ImplementASolver/#Initialization-and-Iterate-Access","page":"Implement a Solver","title":"Initialization & Iterate Access","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"For our solver, there is not so much to initialize, just to be safe we should copy over the initial value in p we start with, to q. We do not have to care about remembering the iterate, that is done by Manopt.jl. 
For the iterate access we just have to pass p.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function initialize_solver!(mp::AbstractManoptProblem, rws::RandomWalkState)\n copyto!(get_manifold(mp), rws.q, rws.p) # Set q = p^{(0)}\n return rws\nend\nget_iterate(rws::RandomWalkState) = rws.p\nget_solver_result(rws::RandomWalkState) = rws.q","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"and similarly we implement the step. Here we make use of the fact that the problem (and also the objective in fact) have access functions for their elements, the one we need is get_cost.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function step_solver!(mp::AbstractManoptProblem, rws::RandomWalkState, i)\n M = get_manifold(mp) # for ease of use get the manifold from the problem\n X = rand(M; vector_at=rws.p) # generate a direction\n X .*= rws.σ/norm(M, rws.p, X)\n # Walk\n retract!(M, rws.p, rws.p, X, rws.retraction_method)\n # is the new point better? Then store it\n if get_cost(mp, rws.p) < get_cost(mp, rws.q)\n copyto!(M, rws.q, rws.p)\n end\n return rws\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Performance-wise we could improve the number of allocations by making X also a field of our rws, but let's keep it simple here. We could also store the cost of q in the state, but we will see how to easily enable this solver to allow for caching. In practice, however, it is preferable to cache intermediate values like the cost of q in the state when that can be easily achieved. 
This way we do not have to deal with overheads of an external cache.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Now we can just run the solver already! We take the same example as for the other tutorials","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We first define our task, the Riemannian Center of Mass from the Get Started: Optimize! tutorial.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Random.seed!(23)\nn = 100\nσ = π / 8\nM = Sphere(2)\np = 1 / sqrt(2) * [1.0, 0.0, 1.0]\ndata = [exp(M, p, σ * rand(M; vector_at=p)) for i in 1:n];\nf(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can now generate the problem with its objective and the state","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"mp = DefaultManoptProblem(M, ManifoldCostObjective(f))\ns = RandomWalkState(M; σ = 0.2)\n\nsolve!(mp, s)\nget_solver_result(s)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"3-element Vector{Float64}:\n -0.2412674850987521\n 0.8608618657176527\n -0.44800317943876844","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"The function solve! 
works also in place of s, but the last line illustrates how to access the result in general; we could also just look at s.q, but the function get_solver_result is also used in several other places.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We could for example easily set up a second solver to work from a specified starting point with a different σ like","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"s2 = RandomWalkState(M, [1.0, 0.0, 0.0]; σ = 0.1)\nsolve!(mp, s2)\nget_solver_result(s2)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"3-element Vector{Float64}:\n 1.0\n 0.0\n 0.0","category":"page"},{"location":"tutorials/ImplementASolver/#Ease-of-Use-I:-The-high-level-interface(s)","page":"Implement a Solver","title":"Ease of Use I: The high level interface(s)","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Manopt.jl offers a few additional features for solvers in their high level interfaces, for example the debug= and record= keywords for debugging and recording within solver states or the count= and cache keywords for the objective.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We can introduce these here as well with just a few lines of code. There are usually two steps. 
We further need three internal functions from Manopt.jl","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"using Manopt: get_solver_return, indicates_convergence, status_summary","category":"page"},{"location":"tutorials/ImplementASolver/#A-high-level-interface-using-the-objective","page":"Implement a Solver","title":"A high level interface using the objective","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"This could be considered an interim step to the high-level interface: If we already have the objective – in our case a ManifoldCostObjective – at hand, the high level interface consists of the steps","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"possibly decorate the objective\ngenerate the problem\ngenerate and possibly decorate the state\ncall the solver\ndetermine the return value","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We illustrate the steps with an in-place variant here. A variant that keeps the given start point unchanged would just add a copy(M, p) upfront. 
Manopt.jl provides both variants.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function random_walk_algorithm!(\n M::AbstractManifold,\n mgo::ManifoldCostObjective,\n p;\n σ = 0.1,\n retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)),\n stopping_criterion::StoppingCriterion=StopAfterIteration(200),\n kwargs...,\n)\n dmgo = decorate_objective!(M, mgo; kwargs...)\n dmp = DefaultManoptProblem(M, dmgo)\n s = RandomWalkState(M, p;\n σ=σ,\n retraction_method=retraction_method, stopping_criterion=stopping_criterion,\n )\n ds = decorate_state!(s; kwargs...)\n solve!(dmp, ds)\n return get_solver_return(get_objective(dmp), ds)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"random_walk_algorithm! (generic function with 1 method)","category":"page"},{"location":"tutorials/ImplementASolver/#The-high-level-interface","page":"Implement a Solver","title":"The high level interface","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Starting from the last section, the usual call a user would prefer is just passing a manifold M, the cost f, and maybe a start point p.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"function random_walk_algorithm!(M::AbstractManifold, f, p=rand(M); kwargs...)\n mgo = ManifoldCostObjective(f)\n return random_walk_algorithm!(M, mgo, p; kwargs...)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"random_walk_algorithm! 
(generic function with 3 methods)","category":"page"},{"location":"tutorials/ImplementASolver/#Ease-of-Use-II:-The-State-Summary","page":"Implement a Solver","title":"Ease of Use II: The State Summary","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"For the case that you set return_state=true the solver should return a summary of the run. When a show method is provided, users can easily read such summary in a terminal. It should reflect its main parameters, if they are not too verbose and provide information about the reason it stopped and whether this indicates convergence.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Here it would for example look like","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"import Base: show\nfunction show(io::IO, rws::RandomWalkState)\n i = get_count(rws, :Iterations)\n Iter = (i > 0) ? \"After $i iterations\\n\" : \"\"\n Conv = indicates_convergence(rws.stop) ? \"Yes\" : \"No\"\n s = \"\"\"\n # Solver state for `Manopt.jl`s Tutorial Random Walk\n $Iter\n ## Parameters\n * retraction method: $(rws.retraction_method)\n * σ : $(rws.σ)\n\n ## Stopping Criterion\n $(status_summary(rws.stop))\n This indicates convergence: $Conv\"\"\"\n return print(io, s)\nend","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"show (generic function with 671 methods)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"Now the algorithm can be easily called and provides – if wanted – all features of a Manopt.jl algorithm. 
For example to see the summary, we could now just call","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"q = random_walk_algorithm!(M, f; return_state=true)","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"# Solver state for `Manopt.jl`s Tutorial Random Walk\nAfter 200 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n* σ : 0.1\n\n## Stopping Criterion\nMax Iteration 200: reached\nThis indicates convergence: No","category":"page"},{"location":"tutorials/ImplementASolver/#Conclusion-and-Beyond","page":"Implement a Solver","title":"Conclusion & Beyond","text":"","category":"section"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"We saw in this tutorial how to implement a simple cost-based algorithm, to illustrate how optimization algorithms are covered in Manopt.jl.","category":"page"},{"location":"tutorials/ImplementASolver/","page":"Implement a Solver","title":"Implement a Solver","text":"One feature we did not cover is that most algorithms allow for inplace and allocation functions, as soon as they work on more than just the cost, e.g. gradients, proximal maps or Hessians. This is usually a keyword argument of the objective and hence also part of the high-level interfaces.","category":"page"},{"location":"tutorials/HowToDebug/#How-to-Print-Debug-Output","page":"Print Debug Output","title":"How to Print Debug Output","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"This tutorial aims to illustrate how to perform debug output. 
For that we consider an example that includes a subsolver, to also consider their debug capabilities.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"The problem itself is hence not the main focus.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"We consider a nonnegative PCA which we can write as a constraint problem on the Sphere","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Let’s first load the necessary packages.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"using Manopt, Manifolds, Random, LinearAlgebra\nRandom.seed!(42);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"d = 4\nM = Sphere(d - 1)\nv0 = project(M, [ones(2)..., zeros(d - 2)...])\nZ = v0 * v0'\n#Cost and gradient\nf(M, p) = -tr(transpose(p) * Z * p) / 2\ngrad_f(M, p) = project(M, p, -transpose.(Z) * p / 2 - Z * p / 2)\n# Constraints\ng(M, p) = -p # i.e. 
p ≥ 0\nmI = -Matrix{Float64}(I, d, d)\n# Vector of gradients of the constraint components\ngrad_g(M, p) = [project(M, p, mI[:, i]) for i in 1:d]","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Then we can take a starting point","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p0 = project(M, [ones(2)..., zeros(d - 3)..., 0.1])","category":"page"},{"location":"tutorials/HowToDebug/#Simple-debug-output","page":"Print Debug Output","title":"Simple debug output","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Any solver accepts the keyword debug=, which in the simplest case can be set to an array of strings, symbols and a number.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Strings are printed in every iteration as is (cf. DebugDivider) and should be used to finish the array with a line break.\nthe last number in the array is used with DebugEvery to print the debug only every ith iteration.\nAny Symbol is converted into certain debug prints","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Certain symbols starting with a capital letter are mapped to certain prints, e.g. :Cost is mapped to DebugCost() to print the current cost function value. A full list is provided in the DebugActionFactory. A special keyword is :Stop, which is only added to the final debug hook to print the stopping criterion.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Any symbol with a small letter is mapped to fields of the AbstractManoptSolverState which is used. 
This way you can easily print internal data, if you know their names.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Let's look at an example first: Say we want to print the current iteration number, the current cost function value, as well as the value ϵ from the ExactPenaltyMethodState. To keep the amount of output at a reasonable level, we want to only print the debug every 25th iteration.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Then we can write","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p1 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [:Iteration, :Cost, \" | \", :ϵ, 25, \"\\n\", :Stop]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.001\n# 25 f(x): -0.499449 | ϵ: 0.0001778279410038921\n# 50 f(x): -0.499995 | ϵ: 3.1622776601683734e-5\n# 75 f(x): -0.500000 | ϵ: 5.623413251903474e-6\n# 100 f(x): -0.500000 | ϵ: 1.0e-6\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/#Advanced-Debug-output","page":"Print Debug Output","title":"Advanced Debug output","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"There are two more advanced variants that can be used. The first is a tuple of a symbol and a string, where the string is used as the format for the print, which most DebugActions have. 
The second is to directly provide a DebugAction.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"We can for example change the way the :ϵ is printed by adding a format string and use DebugCost(), which is equivalent to using :Cost. Especially with the format change, the lines are more consistent in length.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p2 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [:Iteration, DebugCost(), (:ϵ,\" | ϵ: %.8f\"), 25, \"\\n\", :Stop]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.00100000\n# 25 f(x): -0.499449 | ϵ: 0.00017783\n# 50 f(x): -0.499995 | ϵ: 0.00003162\n# 75 f(x): -0.500000 | ϵ: 0.00000562\n# 100 f(x): -0.500000 | ϵ: 0.00000100\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"You can also write your own DebugAction functor, where the function to implement has the same signature as the step function, that is an AbstractManoptProblem, an AbstractManoptSolverState, as well as the current iterate. 
For example, the already mentioned [DebugDivider](@ref)(s) is given as","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"mutable struct DebugDivider{TIO<:IO} <: DebugAction\n io::TIO\n divider::String\n DebugDivider(divider=\" | \"; io::IO=stdout) = new{typeof(io)}(io, divider)\nend\nfunction (d::DebugDivider)(::AbstractManoptProblem, ::AbstractManoptSolverState, i::Int)\n (i >= 0) && (!isempty(d.divider)) && (print(d.io, d.divider))\n return nothing\nend","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"or you could of course implement that just for your specific problem or state.","category":"page"},{"location":"tutorials/HowToDebug/#Subsolver-Debug","page":"Print Debug Output","title":"Subsolver Debug","text":"","category":"section"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Most subsolvers have a sub_kwargs keyword, such that you can pass keywords to the sub solver as well. This works well if you do not plan to change the subsolver. If you do, you can wrap your own solver_state= argument in a decorate_state! and pass a debug= keyword to this function call. Keywords in such a nested call have to be passed as pairs (:debug => [...]).","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"A main problem now is that this debug is issued at every sub solver call or initialisation, as the following print of just a . 
per sub solver test/call illustrates","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p3 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [\"\\n\",:Iteration, DebugCost(), (:ϵ,\" | ϵ: %.8f\"), 25, \"\\n\", :Stop],\n sub_kwargs = [:debug => [\".\"]]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.00100000\n........................................................\n# 25 f(x): -0.499449 | ϵ: 0.00017783\n..................................................\n# 50 f(x): -0.499995 | ϵ: 0.00003162\n..................................................\n# 75 f(x): -0.500000 | ϵ: 0.00000562\n..................................................\n# 100 f(x): -0.500000 | ϵ: 0.00000100\n....The value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"The different lengths of the dotted lines come from the fact that —at least in the beginning— the subsolver performs a few steps.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"For this issue, there is the next symbol (similar to the :Stop) to indicate that a debug set is a subsolver set :Subsolver, which introduces a DebugWhenActive that is only activated when the outer debug is actually active, i.e. DebugEvery is active itself. 
Let’s","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"p4 = exact_penalty_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug = [:Iteration, DebugCost(), (:ϵ,\" | ϵ: %.8f\"), 25, \"\\n\", :Stop],\n sub_kwargs = [\n :debug => [\" | \", :Iteration, :Cost, \"\\n\",:Stop, :Subsolver]\n ]\n);","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"Initial f(x): -0.497512 | ϵ: 0.00100000\n | Initial f(x): -0.499127\n | # 1 f(x): -0.499147\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (0.0002121889852717264) is less than 0.001.\n# 25 f(x): -0.499449 | ϵ: 0.00017783\n | Initial f(x): -0.499993\n | # 1 f(x): -0.499994\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (1.6025009584517956e-5) is less than 0.001.\n# 50 f(x): -0.499995 | ϵ: 0.00003162\n | Initial f(x): -0.500000\n | # 1 f(x): -0.500000\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (9.966301158124465e-7) is less than 0.001.\n# 75 f(x): -0.500000 | ϵ: 0.00000562\n | Initial f(x): -0.500000\n | # 1 f(x): -0.500000\nThe algorithm reached approximately critical point after 1 iterations; the gradient norm (5.4875346930698466e-8) is less than 0.001.\n# 100 f(x): -0.500000 | ϵ: 0.00000100\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.","category":"page"},{"location":"tutorials/HowToDebug/","page":"Print Debug Output","title":"Print Debug Output","text":"where we now see that the subsolver always only requires one step. 
Note that since debug of an iteration is happening after a step, we see the sub solver run before the debug for an iteration number.","category":"page"},{"location":"functions/manifold/#Specific-manifold-functions","page":"Specific Manifold Functions","title":"Specific manifold functions","text":"","category":"section"},{"location":"functions/manifold/","page":"Specific Manifold Functions","title":"Specific Manifold Functions","text":"This small section extends the functions available from ManifoldsBase.jl and Manifolds.jl, especially a few random generators that are simpler than the available functions.","category":"page"},{"location":"functions/manifold/","page":"Specific Manifold Functions","title":"Specific Manifold Functions","text":"Modules = [Manopt]\nPages = [\"manifold_functions.jl\"]","category":"page"},{"location":"functions/manifold/#Manopt.reflect-Tuple{AbstractManifold, Any, Any}","page":"Specific Manifold Functions","title":"Manopt.reflect","text":"reflect(M, p, x, kwargs...)\nreflect!(M, q, p, x, kwargs...)\n\nReflect the point x from the manifold M at point p, i.e.\n\n operatornamerefl_p(x) = operatornameretr_p(-operatornameretr^-1_p x)\n\nwhere operatornameretr and operatornameretr^-1 denote a retraction and an inverse retraction, respectively. This can also be done in place of q.\n\nKeyword arguments\n\nretraction_method - (default_retraction_method(M, typeof(p))) the retraction to use in the reflection\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) the inverse retraction to use within the reflection\n\nand for the reflect! additionally\n\nX (zero_vector(M,p)) a temporary memory to compute the inverse retraction in place. 
Otherwise this is the memory that would be allocated anyway.\n\nPassing X to reflect will just have no effect.\n\n\n\n\n\n","category":"method"},{"location":"functions/manifold/#Manopt.reflect-Tuple{AbstractManifold, Function, Any}","page":"Specific Manifold Functions","title":"Manopt.reflect","text":"reflect(M, f, x; kwargs...)\nreflect!(M, q, f, x; kwargs...)\n\nreflect the point x from the manifold M at the point f(x) of the function f mathcal M mathcal M, i.e.,\n\n operatornamerefl_f(x) = operatornamerefl_f(x)(x)\n\nCompute the result in q.\n\nsee also reflect(M,p,x), to which the keywords are also passed.\n\n\n\n\n\n","category":"method"},{"location":"solvers/particle_swarm/#ParticleSwarmSolver","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"","category":"section"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":" particle_swarm\n particle_swarm!","category":"page"},{"location":"solvers/particle_swarm/#Manopt.particle_swarm","page":"Particle Swarm Optimization","title":"Manopt.particle_swarm","text":"particle_swarm(M, f; kwargs...)\nparticle_swarm(M, f, swarm; kwargs...)\nparticle_swarm(M, mco::AbstractManifoldCostObjective; kwargs...)\nparticle_swarm(M, mco::AbstractManifoldCostObjective, swarm; kwargs...)\n\nperform the particle swarm optimization algorithm (PSO), starting with an initial swarm Borkmanns, Ishteva, Absil, 7th IC Swarm Intelligence, 2010. If no swarm is provided, swarm_size many random points are used. 
Note that since this method does not work in-place – these points are duplicated internally.\n\nThe aim of PSO is to find the particle position g on the Manifold M that solves\n\nmin_x mathcalM F(x)\n\nTo this end, a swarm of particles is moved around the Manifold M in the following manner. For every particle k we compute the new particle velocities v_k^(i) in every step i of the algorithm by\n\nv_k^(i) = ω operatornameT_x_k^(i)gets x_k^(i-1)v_k^(i-1) + c r_1 operatornameretr_x_k^(i)^-1(p_k^(i)) + s r_2 operatornameretr_x_k^(i)^-1(g)\n\nwhere x_k^(i) is the current particle position, ω denotes the inertia, c and s are a cognitive and a social weight, respectively, r_j, j=12 are random factors which are computed new for each particle and step, operatornameretr^-1 denotes an inverse retraction on the Manifold M, and operatornameT is a vector transport.\n\nThen the position of the particle is updated as\n\nx_k^(i+1) = operatornameretr_x_k^(i)(v_k^(i))\n\nwhere operatornameretr denotes a retraction on the Manifold M. At the end of each step for every particle, we set\n\np_k^(i+1) = begincases\nx_k^(i+1) textif F(x_k^(i+1))F(p_k^(i))\np_k^(i) textelse\nendcases\n\n\nand\n\ng_k^(i+1) =begincases\np_k^(i+1) textif F(p_k^(i+1))F(g_k^(i))\ng_k^(i) textelse\nendcases\n\ni.e. 
p_k^(i) is the best known position for the particle k and g^(i) is the global best known position ever visited up to step i.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\nswarm – ([rand(M) for _ in 1:swarm_size]) – an initial swarm of points.\n\nInstead of a cost function f you can also provide an AbstractManifoldCostObjective mco.\n\nOptional\n\ncognitive_weight – (1.4) a cognitive weight factor\ninertia – (0.65) the inertia of the particles\ninverse_retraction_method - (default_inverse_retraction_method(M, eltype(x))) an inverse_retraction(M,x,y) to use.\nswarm_size - (100) number of random initial positions of x0\nretraction_method – (default_retraction_method(M, eltype(x))) a retraction(M,x,ξ) to use.\nsocial_weight – (1.4) a social weight factor\nstopping_criterion – (StopWhenAny(StopAfterIteration(500), StopWhenChangeLess(10^{-4}))) a functor inheriting from StoppingCriterion indicating when to stop.\nvector_transport_method - (default_vector_transport_method(M, eltype(x))) a vector transport method to use.\nvelocity – a set of tangent vectors (of type AbstractVector{T}) representing the velocities of the particles, per default a random tangent vector per initial position\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified.\n\nOutput\n\nthe obtained (approximate) minimizer g, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/particle_swarm/#Manopt.particle_swarm!","page":"Particle Swarm Optimization","title":"Manopt.particle_swarm!","text":"particle_swarm!(M, f, swarm; kwargs...)\nparticle_swarm!(M, mco::AbstractManifoldCostObjective, swarm; kwargs...)\n\nperform the particle swarm optimization algorithm (PSO), starting with the initial swarm which is then modified in place.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function Fmathcal Mℝ to minimize\nswarm – ([rand(M) for _ in 1:swarm_size]) – an initial swarm of points.\n\nInstead of a cost function f you can also provide an AbstractManifoldCostObjective mco.\n\nFor more details and optional arguments, see particle_swarm.\n\n\n\n\n\n","category":"function"},{"location":"solvers/particle_swarm/#State","page":"Particle Swarm Optimization","title":"State","text":"","category":"section"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"ParticleSwarmState","category":"page"},{"location":"solvers/particle_swarm/#Manopt.ParticleSwarmState","page":"Particle Swarm Optimization","title":"Manopt.ParticleSwarmState","text":"ParticleSwarmState{P,T} <: AbstractManoptSolverState\n\nDescribes a particle swarm optimizing algorithm, with\n\nFields\n\nx – a set of points (of type AbstractVector{P}) on a manifold as initial particle positions\nvelocity – a set of tangent vectors (of type AbstractVector{T}) representing the velocities of the particles\ninertia – (0.65) the inertia of the particles\nsocial_weight – (1.4) a social weight factor\ncognitive_weight – (1.4) a cognitive weight factor\np_temp – temporary storage for a point to avoid allocations during a step of the algorithm\nsocial_vec - temporary storage for a tangent vector related 
to social_weight\ncognitive_vector - temporary storage for a tangent vector related to cognitive_weight\nstopping_criterion – ([StopAfterIteration](@ref)(500) | [StopWhenChangeLess](@ref)(1e-4)) a functor inheriting from [StoppingCriterion](@ref) indicating when to stop.\nretraction_method – (default_retraction_method(M, eltype(x))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, eltype(x))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, eltype(x))) a vector transport to use\n\nConstructor\n\nParticleSwarmState(M, x0, velocity; kwargs...)\n\nconstruct a particle swarm state for the manifold M starting at the initial population x0 with velocities velocity, where the manifold is used within the defaults of the other fields mentioned above, which are keyword arguments here.\n\nSee also\n\nparticle_swarm\n\n\n\n\n\n","category":"type"},{"location":"solvers/particle_swarm/#Literature","page":"Particle Swarm Optimization","title":"Literature","text":"","category":"section"},{"location":"solvers/particle_swarm/","page":"Particle Swarm Optimization","title":"Particle Swarm Optimization","text":"
","category":"page"},{"location":"solvers/stochastic_gradient_descent/#StochasticGradientDescentSolver","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"","category":"section"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"stochastic_gradient_descent\nstochastic_gradient_descent!","category":"page"},{"location":"solvers/stochastic_gradient_descent/#Manopt.stochastic_gradient_descent","page":"Stochastic Gradient Descent","title":"Manopt.stochastic_gradient_descent","text":"stochastic_gradient_descent(M, grad_f, p; kwargs...)\nstochastic_gradient_descent(M, msgo, p; kwargs...)\n\nperform a stochastic gradient descent\n\nInput\n\nM a manifold mathcal M\ngrad_f – a gradient function, that either returns a vector of the subgradients or is a vector of gradients\np – an initial value p mathcal M\n\nAlternatively to the gradient you can provide a ManifoldStochasticGradientObjective msgo; then using the cost= keyword does not have any effect, since the cost is already within the objective.\n\nOptional\n\ncost – (missing) you can provide a cost function for example to track the function value\nevaluation – (AllocatingEvaluation) specify whether the gradient(s) works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x) (elementwise).\nevaluation_order – (:Random) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Linear) or the default :Random one.\nstopping_criterion (StopAfterIteration(1000)) – a StoppingCriterion\nstepsize (ConstantStepsize(1.0)) a Stepsize\norder_type (:RandomOrder) a type of ordering of gradient evaluations. 
values are :RandomOrder, a :FixedPermutation, :LinearOrder\norder - ([1:n]) the initial permutation, where n is the number of gradients in gradF.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction to use.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/stochastic_gradient_descent/#Manopt.stochastic_gradient_descent!","page":"Stochastic Gradient Descent","title":"Manopt.stochastic_gradient_descent!","text":"stochastic_gradient_descent!(M, grad_f, p)\nstochastic_gradient_descent!(M, msgo, p)\n\nperform a stochastic gradient descent in place of p.\n\nInput\n\nM a manifold mathcal M\ngrad_f – a gradient function, that either returns a vector of the subgradients or is a vector of gradients\np – an initial value p mathcal M\n\nAlternatively to the gradient you can provide an ManifoldStochasticGradientObjective msgo, then using the cost= keyword does not have any effect since if so, the cost is already within the objective.\n\nfor all optional parameters, see stochastic_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/stochastic_gradient_descent/#State","page":"Stochastic Gradient Descent","title":"State","text":"","category":"section"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"StochasticGradientDescentState","category":"page"},{"location":"solvers/stochastic_gradient_descent/#Manopt.StochasticGradientDescentState","page":"Stochastic Gradient Descent","title":"Manopt.StochasticGradientDescentState","text":"StochasticGradientDescentState <: AbstractGradientDescentSolverState\n\nStore the following fields for a default stochastic gradient descent algorithm, see also ManifoldStochasticGradientObjective and stochastic_gradient_descent.\n\nFields\n\np the current iterate\ndirection (StochasticGradient) a direction update to 
use\nstopping_criterion (StopAfterIteration(1000)) – a StoppingCriterion\nstepsize (ConstantStepsize(1.0)) a Stepsize\nevaluation_order – (:Random) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Linear) or the default :Random one.\norder the current permutation\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.\n\nConstructor\n\nStochasticGradientDescentState(M, p)\n\nCreate a StochasticGradientDescentState with start point p. All other fields are optional keyword arguments, and the defaults are taken from M.\n\n\n\n\n\n","category":"type"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"Additionally, the options share a DirectionUpdateRule, so you can also apply MomentumGradient and AverageGradient here. The innermost one should always be the StochasticGradient.","category":"page"},{"location":"solvers/stochastic_gradient_descent/","page":"Stochastic Gradient Descent","title":"Stochastic Gradient Descent","text":"AbstractGradientGroupProcessor\nStochasticGradient","category":"page"},{"location":"solvers/stochastic_gradient_descent/#Manopt.AbstractGradientGroupProcessor","page":"Stochastic Gradient Descent","title":"Manopt.AbstractGradientGroupProcessor","text":"AbstractStochasticGradientDescentSolverState <: AbstractManoptSolverState\n\nA generic type for all options related to stochastic gradient descent methods\n\n\n\n\n\n","category":"type"},{"location":"solvers/stochastic_gradient_descent/#Manopt.StochasticGradient","page":"Stochastic Gradient Descent","title":"Manopt.StochasticGradient","text":"StochasticGradient <: AbstractGradientGroupProcessor\n\nThe default gradient processor, which just evaluates the (stochastic) gradient or a subset thereof.\n\nConstructor\n\nStochasticGradient(M::AbstractManifold; p=rand(M), X=zero_vector(M, p))\n\nInitialize the stochastic Gradient processor with X, i.e. 
both M and p are just help variables, though M is mandatory by convention.\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#CPPSolver","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"The Cyclic Proximal Point (CPP) algorithm aims to minimize","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"F(x) = sum_i=1^c f_i(x)","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"assuming that the proximal maps operatornameprox_λ f_i(x) are given in closed form or can be computed efficiently (at least approximately).","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"The algorithm then cycles through these proximal maps, where the type of cycle might differ and the proximal parameter λ_k changes after each cycle k.","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"For a convergence result on Hadamard manifolds see Bačák [Bac14].","category":"page"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"cyclic_proximal_point\ncyclic_proximal_point!","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.cyclic_proximal_point","page":"Cyclic Proximal Point","title":"Manopt.cyclic_proximal_point","text":"cyclic_proximal_point(M, f, proxes_f, p)\ncyclic_proximal_point(M, mpo, p)\n\nperform a cyclic proximal point algorithm.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\nproxes_f – an Array of proximal maps (Functions) (M,λ,p) -> q or (M, q, λ, p) -> q 
for the summands of f (see evaluation)\np – an initial value p mathcal M\n\nwhere f and the proximal maps proxes_f can also be given directly as a ManifoldProximalMapObjective mpo\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the proximal maps work by allocation (default) form prox(M, λ, x) or InplaceEvaluation in place, i.e. is of the form prox!(M, y, λ, x).\nevaluation_order – (:Linear) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default linear one.\nλ – ( iter -> 1/iter ) a function returning the (square summable but not summable) sequence of λi\nstopping_criterion – (StopWhenAny(StopAfterIteration(5000),StopWhenChangeLess(10.0^-8))) a StoppingCriterion.\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. If you provide the ManifoldProximalMapObjective directly, these decorations can still be specified.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/cyclic_proximal_point/#Manopt.cyclic_proximal_point!","page":"Cyclic Proximal Point","title":"Manopt.cyclic_proximal_point!","text":"cyclic_proximal_point!(M, F, proxes, p)\ncyclic_proximal_point!(M, mpo, p)\n\nperform a cyclic proximal point algorithm in place of p.\n\nInput\n\nM – a manifold mathcal M\nF – a cost function Fmathcal Mℝ to minimize\nproxes – an Array of proximal maps (Functions) (M, λ, p) -> q or (M, q, λ, p) for the summands of F\np – an initial value p mathcal M\n\nwhere f and the proximal maps proxes_f can also be given directly as a ManifoldProximalMapObjective mpo\n\nfor all options, see cyclic_proximal_point.\n\n\n\n\n\n","category":"function"},{"location":"solvers/cyclic_proximal_point/#State","page":"Cyclic Proximal Point","title":"State","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal 
Point","title":"Cyclic Proximal Point","text":"CyclicProximalPointState","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.CyclicProximalPointState","page":"Cyclic Proximal Point","title":"Manopt.CyclicProximalPointState","text":"CyclicProximalPointState <: AbstractManoptSolverState\n\nstores options for the cyclic_proximal_point algorithm. These are the\n\nFields\n\np – the current iterate\nstopping_criterion – a StoppingCriterion\nλ – (@(i) -> 1/i) a function for the values of λ_k per iteration (cycle i)\norder_type – (:LinearOrder) – whether to use a randomly permuted sequence (:FixedRandomOrder), a per cycle permuted sequence (:RandomOrder) or the default linear one.\n\nConstructor\n\nCyclicProximalPointState(M, p)\n\nGenerate the options with the following keyword arguments\n\nstopping_criterion (StopAfterIteration(2000)) – a StoppingCriterion.\nλ ( i -> 1.0 / i) – a function to compute the λ_k k mathbb N,\nevaluation_order – (:LinearOrder) – a Symbol indicating the order the proxes are applied.\n\nSee also\n\ncyclic_proximal_point\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#Debug-Functions","page":"Cyclic Proximal Point","title":"Debug Functions","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"DebugProximalParameter","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.DebugProximalParameter","page":"Cyclic Proximal Point","title":"Manopt.DebugProximalParameter","text":"DebugProximalParameter <: DebugAction\n\nprint the current iterate's proximal point algorithm parameter given by AbstractManoptSolverStates o.λ.\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#Record-Functions","page":"Cyclic Proximal Point","title":"Record Functions","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal 
Point","text":"RecordProximalParameter","category":"page"},{"location":"solvers/cyclic_proximal_point/#Manopt.RecordProximalParameter","page":"Cyclic Proximal Point","title":"Manopt.RecordProximalParameter","text":"RecordProximalParameter <: RecordAction\n\nrecord the current iterate's proximal point algorithm parameter given by the AbstractManoptSolverStates o.λ.\n\n\n\n\n\n","category":"type"},{"location":"solvers/cyclic_proximal_point/#Literature","page":"Cyclic Proximal Point","title":"Literature","text":"","category":"section"},{"location":"solvers/cyclic_proximal_point/","page":"Cyclic Proximal Point","title":"Cyclic Proximal Point","text":"
","category":"page"},{"location":"functions/costs/#CostFunctions","page":"Cost functions","title":"Cost Functions","text":"","category":"section"},{"location":"functions/costs/","page":"Cost functions","title":"Cost functions","text":"The following cost functions are available","category":"page"},{"location":"functions/costs/","page":"Cost functions","title":"Cost functions","text":"Modules = [Manopt]\nPages = [\"costs.jl\"]","category":"page"},{"location":"functions/costs/#Manopt.costIntrICTV12-Tuple{AbstractManifold, Vararg{Any, 5}}","page":"Cost functions","title":"Manopt.costIntrICTV12","text":"costIntrICTV12(M, f, u, v, α, β)\n\nCompute the intrinsic infimal convolution model, where the addition is replaced by a mid point approach and the two functions involved are costTV2 and costTV. The model reads\n\nE(uv) =\n frac12sum_i mathcal G\n d_mathcal Mbigl(g(frac12v_iw_i)f_ibigr)\n +alphabigl( βmathrmTV(v) + (1-β)mathrmTV_2(w) bigr)\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costL2TV-NTuple{4, Any}","page":"Cost functions","title":"Manopt.costL2TV","text":"costL2TV(M, f, α, x)\n\ncompute the ℓ^2-TV functional on the PowerManifold manifold M for given (fixed) data f (on M), a nonnegative weight α, and evaluated at x (on M), i.e.\n\nE(x) = d_mathcal M^2(fx) + alpha operatornameTV(x)\n\nSee also\n\ncostTV\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costL2TV2-Tuple{PowerManifold, Any, Any, Any}","page":"Cost functions","title":"Manopt.costL2TV2","text":"costL2TV2(M, f, β, x)\n\ncompute the ℓ^2-TV2 functional on the PowerManifold manifold M for given data f, nonnegative parameter β, and evaluated at x, i.e.\n\nE(x) = d_mathcal M^2(fx) + βoperatornameTV_2(x)\n\nSee also\n\ncostTV2\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costL2TVTV2-Tuple{PowerManifold, Vararg{Any, 4}}","page":"Cost functions","title":"Manopt.costL2TVTV2","text":"costL2TVTV2(M, f, α, β, x)\n\ncompute the ℓ^2-TV-TV2 functional 
on the PowerManifold manifold M for given (fixed) data f (on M), nonnegative weight α, β, and evaluated at x (on M), i.e.\n\nE(x) = d_mathcal M^2(fx) + alphaoperatornameTV(x)\n + βoperatornameTV_2(x)\n\nSee also\n\ncostTV, costTV2\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costTV","page":"Cost functions","title":"Manopt.costTV","text":"costTV(M,x [,p=2,q=1])\n\nCompute the operatornameTV^p functional for data x on the PowerManifold manifold M, i.e. mathcal M = mathcal N^n, where n mathbb N^k denotes the dimensions of the data x. Let mathcal I_i denote the forward neighbors, i.e. with mathcal G as all indices from mathbf1 mathbb N^k to n we have mathcal I_i = i+e_j j=1kcap mathcal G. The formula reads\n\nE^q(x) = sum_i mathcal G\n bigl( sum_j mathcal I_i d^p_mathcal M(x_ix_j) bigr)^qp\n\nSee also\n\ngrad_TV, prox_TV\n\n\n\n\n\n","category":"function"},{"location":"functions/costs/#Manopt.costTV-Union{Tuple{T}, Tuple{AbstractManifold, Tuple{T, T}}, Tuple{AbstractManifold, Tuple{T, T}, Int64}} where T","page":"Cost functions","title":"Manopt.costTV","text":"costTV(M, x, p)\n\nCompute the operatornameTV^p functional for a tuple pT of points on a manifold M, i.e.\n\nE(x_1x_2) = d_mathcal M^p(x_1x_2) quad x_1x_2 mathcal M\n\nSee also\n\ngrad_TV, prox_TV\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.costTV2","page":"Cost functions","title":"Manopt.costTV2","text":"costTV2(M,x [,p=1])\n\ncompute the operatornameTV_2^p functional for data x on the PowerManifold manifold M, i.e. mathcal M = mathcal N^n, where n mathbb N^k denotes the dimensions of the data x. Let mathcal I_i^pm denote the forward and backward neighbors, respectively, i.e. with mathcal G as all indices from mathbf1 mathbb N^k to n we have mathcal I^pm_i = ipm e_j j=1kcap mathcal I. 
The formula then reads\n\nE(x) = sum_i mathcal I j_1 mathcal I^+_i j_2 mathcal I^-_i\nd^p_mathcal M(c_i(x_j_1x_j_2) x_i)\n\nwhere c_i() denotes the mid point between its two arguments that is nearest to x_i.\n\nSee also\n\ngrad_TV2, prox_TV2\n\n\n\n\n\n","category":"function"},{"location":"functions/costs/#Manopt.costTV2-Union{Tuple{T}, Tuple{MT}, Tuple{MT, Tuple{T, T, T}}, Tuple{MT, Tuple{T, T, T}, Any}} where {MT<:AbstractManifold, T}","page":"Cost functions","title":"Manopt.costTV2","text":"costTV2(M,(x1,x2,x3) [,p=1])\n\nCompute the operatornameTV_2^p functional for the 3-tuple of points (x1,x2,x3) on the manifold M. Denote by\n\n mathcal C = bigl c mathcal M g(tfrac12x_1x_3) text for some geodesic gbigr\n\nthe set of mid points between x_1 and x_3. Then the function reads\n\nd_2^p(x_1x_2x_3) = min_c mathcal C d_mathcal M(cx_2)\n\nSee also\n\ngrad_TV2, prox_TV2\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.cost_L2_acceleration_bezier-Union{Tuple{P}, Tuple{AbstractManifold, AbstractVector{P}, AbstractVector{<:Integer}, AbstractVector{<:AbstractFloat}, AbstractFloat, AbstractVector{P}}} where P","page":"Cost functions","title":"Manopt.cost_L2_acceleration_bezier","text":"cost_L2_acceleration_bezier(M,B,pts,λ,d)\n\ncompute the value of the discrete Acceleration of the composite Bezier curve together with a data term, i.e.\n\nfracλ2sum_i=0^N d_mathcal M(d_i c_B(i))^2+\nsum_i=1^N-1fracd^2_2 B(t_i-1) B(t_i) B(t_i+1)Delta_t^3\n\nwhere for this formula the pts along the curve are equispaced and denoted by t_i and d_2 refers to the second order absolute difference costTV2 (squared), the junction points are denoted by p_i, and to each p_i corresponds one data item in the manifold points given in d. For details on the acceleration approximation, see cost_acceleration_bezier. 
Note that the Bézier-curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments and assuming a differentiable curve, the segments can internally be reconstructed.\n\nSee also\n\ngrad_L2_acceleration_bezier, cost_acceleration_bezier, grad_acceleration_bezier\n\n\n\n\n\n","category":"method"},{"location":"functions/costs/#Manopt.cost_acceleration_bezier-Union{Tuple{P}, Tuple{AbstractManifold, AbstractVector{P}, AbstractVector{<:Integer}, AbstractVector{<:AbstractFloat}}} where P","page":"Cost functions","title":"Manopt.cost_acceleration_bezier","text":"cost_acceleration_bezier(\n M::AbstractManifold,\n B::AbstractVector{P},\n degrees::AbstractVector{<:Integer},\n T::AbstractVector{<:AbstractFloat},\n) where {P}\n\ncompute the value of the discrete Acceleration of the composite Bezier curve\n\nsum_i=1^N-1fracd^2_2 B(t_i-1) B(t_i) B(t_i+1)Delta_t^3\n\nwhere for this formula the pts along the curve are equispaced and denoted by t_i, i=1N, and d_2 refers to the second order absolute difference costTV2 (squared). Note that the Bézier-curve is given in reduced form as a point on a PowerManifold, together with the degrees of the segments and assuming a differentiable curve, the segments can internally be reconstructed.\n\nThis acceleration discretization was introduced in Bergmann, Gousenbourger, Front. Appl. Math. 
Stat., 2018.\n\nSee also\n\ngrad_acceleration_bezier, cost_L2_acceleration_bezier, grad_L2_acceleration_bezier\n\n\n\n\n\n","category":"method"},{"location":"plans/objective/#ObjectiveSection","page":"Objective","title":"A Manifold Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"The Objective describes the actual cost function and all its properties.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractManifoldObjective\nAbstractDecoratedManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractManifoldObjective","page":"Objective","title":"Manopt.AbstractManifoldObjective","text":"AbstractManifoldObjective{E<:AbstractEvaluationType}\n\nDescribe the collection of the optimization function `f\\colon \\mathcal M → \\bbR` (or even a vectorial range) and its corresponding elements, which might for example be a gradient or (one or more) proximal maps.\n\nAll these elements should usually be implemented as functions (M, p) -> ..., or (M, X, p) -> ... that is\n\nthe first argument of these functions should be the manifold M they are defined on\nthe argument X is present, if the computation is performed inplace of X (see InplaceEvaluation)\nthe argument p is the place the function (f or one of its elements) is evaluated at.\n\nthe type E indicates the global AbstractEvaluationType.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.AbstractDecoratedManifoldObjective","page":"Objective","title":"Manopt.AbstractDecoratedManifoldObjective","text":"AbstractDecoratedManifoldObjective{E<:AbstractEvaluationType,O<:AbstractManifoldObjective}\n\nA common supertype for all decorators of AbstractManifoldObjectives to simplify dispatch. 
The second parameter should refer to the undecorated objective (i.e. the most inner one).\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Which has two main different possibilities for its containing functions concerning the evaluation mode – not necessarily the cost, but for example gradient in an AbstractManifoldGradientObjective.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractEvaluationType\nAllocatingEvaluation\nInplaceEvaluation\nevaluation_type","category":"page"},{"location":"plans/objective/#Manopt.AbstractEvaluationType","page":"Objective","title":"Manopt.AbstractEvaluationType","text":"AbstractEvaluationType\n\nAn abstract type to specify the kind of evaluation a AbstractManifoldObjective supports.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.AllocatingEvaluation","page":"Objective","title":"Manopt.AllocatingEvaluation","text":"AllocatingEvaluation <: AbstractEvaluationType\n\nA parameter for a AbstractManoptProblem indicating that the problem uses functions that allocate memory for their result, i.e. they work out of place.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.InplaceEvaluation","page":"Objective","title":"Manopt.InplaceEvaluation","text":"InplaceEvaluation <: AbstractEvaluationType\n\nA parameter for a AbstractManoptProblem indicating that the problem uses functions that do not allocate memory but work on their input, i.e. 
in place.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.evaluation_type","page":"Objective","title":"Manopt.evaluation_type","text":"evaluation_type(mp::AbstractManoptProblem)\n\nGet the AbstractEvaluationType of the objective in AbstractManoptProblem mp.\n\n\n\n\n\nevaluation_type(::AbstractManifoldObjective{Teval})\n\nGet the AbstractEvaluationType of the objective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Decorators-for-Objectives","page":"Objective","title":"Decorators for Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"An objective can be decorated using the following trait and function to initialize","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"dispatch_objective_decorator\nis_objective_decorator\ndecorate_objective!","category":"page"},{"location":"plans/objective/#Manopt.dispatch_objective_decorator","page":"Objective","title":"Manopt.dispatch_objective_decorator","text":"dispatch_objective_decorator(o::AbstractManoptSolverState)\n\nIndicate internally, whether an AbstractManifoldObjective o to be of decorating type, i.e. it stores (encapsulates) an object in itself, by default in the field o.objective.\n\nDecorators indicate this by returning Val{true} for further dispatch.\n\nThe default is Val{false}, i.e. 
by default a state is not decorated.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.is_objective_decorator","page":"Objective","title":"Manopt.is_objective_decorator","text":"is_objective_decorator(s::AbstractManifoldObjective)\n\nIndicate, whether AbstractManifoldObjective s are of decorator type.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.decorate_objective!","page":"Objective","title":"Manopt.decorate_objective!","text":"decorate_objective!(M, o::AbstractManifoldObjective)\n\ndecorate the AbstractManifoldObjective o with specific decorators.\n\nOptional Arguments\n\noptional arguments provide necessary details on the decorators. A specific one is used to activate certain decorators.\n\ncache – (missing) specify a cache. Currently :Simple is supported and :LRU if you load LRUCache.jl. For this case a tuple specifying what to cache and how many can be provided, i.e. (:LRU, [:Cost, :Gradient], 10), where the number specifies the size of each cache, and 10 is the default if one omits the last tuple entry\ncount – (missing) specify calls to the objective to be counted, see ManifoldCountObjective for the full list\nobjective_type – (:Riemannian) specify that an objective is :Riemannian or :Euclidean. 
The :Euclidean symbol is equivalent to specifying it as :Embedded, since in the end, both refer to converting an objective from the embedding (whether it is Euclidean or not) to the Riemannian one.\n\nSee also\n\nobjective_cache_factory\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#ManifoldEmbeddedObjective","page":"Objective","title":"Embedded Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"EmbeddedManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.EmbeddedManifoldObjective","page":"Objective","title":"Manopt.EmbeddedManifoldObjective","text":"EmbeddedManifoldObjective{P, T, E, O2, O1<:AbstractManifoldObjective{E}} <:\n AbstractDecoratedManifoldObjective{O2, O1}\n\nDeclare an objective to be defined in the embedding. This also declares the gradient to be defined in the embedding, and especially being the Riesz representer with respect to the metric in the embedding. The types can be used to still dispatch on also the undecorated objective type O2.\n\nFields\n\nobjective – the objective that is defined in the embedding\np - (nothing) a point in the embedding.\nX - (nothing) a tangent vector in the embedding\n\nWhen a point in the embedding p is provided, embed! is used in place of this point to reduce memory allocations. Similarly X is used when embedding tangent vectors.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#CacheSection","page":"Objective","title":"Cache Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Since single function calls, e.g. 
to the cost or the gradient, might be expensive, a simple cache objective exists as a decorator, that caches one cost value or gradient.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"It can be activated/used with the cache= keyword argument available for every solver.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Manopt.reset_counters!\nManopt.objective_cache_factory","category":"page"},{"location":"plans/objective/#Manopt.reset_counters!","page":"Objective","title":"Manopt.reset_counters!","text":"reset_counters!(co::ManifoldCountObjective, value::Integer=0)\n\nReset all values in the count objective to value.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.objective_cache_factory","page":"Objective","title":"Manopt.objective_cache_factory","text":"objective_cache_factory(M::AbstractManifold, o::AbstractManifoldObjective, cache::Symbol)\n\nGenerate a cached variant of the AbstractManifoldObjective o on the AbstractManifold M based on the symbol cache.\n\nThe following caches are available\n\n:Simple generates a SimpleManifoldCachedObjective\n:LRU generates a ManifoldCachedObjective where you should use the form (:LRU, [:Cost, :Gradient]) to specify what should be cached or (:LRU, [:Cost, :Gradient], 100) to specify the cache size. Here this variant defaults to (:LRU, [:Cost, :Gradient], 100), i.e. 
to cache up to 100 cost and gradient values.[1]\n\n[1]: This cache requires LRUCache.jl to be loaded as well.\n\n\n\n\n\nobjective_cache_factory(M::AbstractManifold, o::AbstractManifoldObjective, cache::Tuple{Symbol, Array, Array})\nobjective_cache_factory(M::AbstractManifold, o::AbstractManifoldObjective, cache::Tuple{Symbol, Array})\n\nGenerate a cached variant of the AbstractManifoldObjective o on the AbstractManifold M based on the symbol cache[1], where the second element cache[2] are further arguments to the cache and the optional third is passed down as keyword arguments.\n\nFor all available caches see the simpler variant with symbols.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#A-simple-cache","page":"Objective","title":"A simple cache","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"A first generic cache is always available, but it only caches one gradient and one cost function evaluation (for the same point).","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"SimpleManifoldCachedObjective","category":"page"},{"location":"plans/objective/#Manopt.SimpleManifoldCachedObjective","page":"Objective","title":"Manopt.SimpleManifoldCachedObjective","text":" SimpleManifoldCachedObjective{O<:AbstractManifoldGradientObjective{E,TC,TG}, P, T,C} <: AbstractManifoldGradientObjective{E,TC,TG}\n\nProvide a simple cache for an AbstractManifoldGradientObjective that is for a given point p this cache stores a point p and a gradient operatornamegrad f(p) in X as well as a cost value f(p) in c.\n\nBoth X and c are accompanied by booleans to keep track of their validity.\n\nConstructor\n\nSimpleManifoldCachedObjective(M::AbstractManifold, obj::AbstractManifoldGradientObjective; kwargs...)\n\nKeyword\n\np (rand(M)) – a point on the manifold to initialize the cache with\nX (get_gradient(M, obj, p) or zero_vector(M,p)) – a tangent vector to store the 
gradient in, see also initialize\nc (get_cost(M, obj, p) or 0.0) – a value to store the cost function in initialize\ninitialized (true) – whether to initialize the cached X and c or not.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#A-Generic-Cache","page":"Objective","title":"A Generic Cache","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"For the more advanced cache, you need to implement some type of cache yourself, that provides a get! and implement init_caches. This is for example provided if you load LRUCache.jl. Then you obtain","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldCachedObjective\ninit_caches","category":"page"},{"location":"plans/objective/#Manopt.ManifoldCachedObjective","page":"Objective","title":"Manopt.ManifoldCachedObjective","text":"ManifoldCachedObjective{E,P,O<:AbstractManifoldObjective{<:E},C<:NamedTuple{}} <: AbstractDecoratedManifoldObjective{E,P}\n\nCreate a cache for an objective, based on a NamedTuple that stores some kind of cache.\n\nConstructor\n\nManifoldCachedObjective(M, o::AbstractManifoldObjective, caches::Vector{Symbol}; kwargs...)\n\nCreate a cache for the AbstractManifoldObjective where the Symbols in caches indicate, which function evaluations to cache.\n\nSupported Symbols\n\nSymbol Caches calls to (incl. ! 
variants) Comment\n:Constraints get_constraints vector of numbers\n:Cost get_cost \n:EqualityConstraint get_equality_constraint numbers per (p,i)\n:EqualityConstraints get_equality_constraints vector of numbers\n:GradEqualityConstraint get_grad_equality_constraint tangent vector per (p,i)\n:GradEqualityConstraints get_grad_equality_constraints vector of tangent vectors\n:GradInequalityConstraint get_inequality_constraint tangent vector per (p,i)\n:GradInequalityConstraints get_inequality_constraints vector of tangent vectors\n:Gradient get_gradient(M,p) tangent vectors\n:Hessian get_hessian tangent vectors\n:InequalityConstraint get_inequality_constraint numbers per (p,j)\n:InequalityConstraints get_inequality_constraints vector of numbers\n:Preconditioner get_preconditioner tangent vectors\n:ProximalMap get_proximal_map point per (p,λ,i)\n:StochasticGradients get_gradients vector of tangent vectors\n:StochasticGradient get_gradient(M, p, i) tangent vector per (p,i)\n:SubGradient get_subgradient tangent vectors\n:SubtrahendGradient get_subtrahend_gradient tangent vectors\n\nKeyword Arguments\n\np - (rand(M)) the type of the keys to be used in the caches. Defaults to the default representation on M.\nvalue - (get_cost(M, objective, p)) the type of values for numeric values in the cache, e.g. 
the cost\nX - (zero_vector(M,p)) the type of values to be cached for gradient and Hessian calls.\ncache - ([:Cost]) a vector of symbols indicating which function calls should be cached.\ncache_size - (10) number of (least recently used) calls to cache\ncache_sizes – (Dict{Symbol,Int}()) a named tuple or dictionary specifying the sizes individually for each cache.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.init_caches","page":"Objective","title":"Manopt.init_caches","text":"init_caches(M::AbstractManifold, caches, T; kwargs...)\n\nGiven a vector of symbols caches, this function sets up the NamedTuple of caches for points/vectors on M, where T is the type of cache to use.\n\n\n\n\n\ninit_caches(caches, T::Type{LRU}; kwargs...)\n\nGiven a vector of symbols caches, this function sets up the NamedTuple of caches, where T is the type of cache to use.\n\nKeyword arguments\n\np - (rand(M)) a point on a manifold, to both infer its type for keys and initialize caches\nvalue - (0.0) a value for both typing and initialising number-caches, e.g. 
for caching a cost.\nX - (zero_vector(M, p)) a tangent vector at p to both type and initialize tangent vector caches\ncache_size - (10) a default cache size to use\ncache_sizes – (Dict{Symbol,Int}()) a dictionary of sizes for the caches to specify different (non-default) sizes\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#ManifoldCountObjective","page":"Objective","title":"Count Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldCountObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldCountObjective","page":"Objective","title":"Manopt.ManifoldCountObjective","text":"ManifoldCountObjective{E,P,O<:AbstractManifoldObjective,I<:Integer} <: AbstractDecoratedManifoldObjective{E,P}\n\nA wrapper for any AbstractManifoldObjective of type O to count different calls to parts of the objective.\n\nFields\n\ncounts a dictionary of symbols mapping to integers keeping the counted values\nobjective the wrapped objective\n\nSupported Symbols\n\nSymbol Counts calls to (incl. ! 
variants) Comment\n:Constraints get_constraints \n:Cost get_cost \n:EqualityConstraint get_equality_constraint requires vector of counters\n:EqualityConstraints get_equality_constraints does not count single access\n:GradEqualityConstraint get_grad_equality_constraint requires vector of counters\n:GradEqualityConstraints get_grad_equality_constraints does not count single access\n:GradInequalityConstraint get_inequality_constraint requires vector of counters\n:GradInequalityConstraints get_inequality_constraints does not count single access\n:Gradient get_gradient(M,p) \n:Hessian get_hessian \n:InequalityConstraint get_inequality_constraint requires vector of counters\n:InequalityConstraints get_inequality_constraints does not count single access\n:Preconditioner get_preconditioner \n:ProximalMap get_proximal_map \n:StochasticGradients get_gradients \n:StochasticGradient get_gradient(M, p, i) \n:SubGradient get_subgradient \n:SubtrahendGradient get_subtrahend_gradient \n\nConstructors\n\nManifoldCountObjective(objective::AbstractManifoldObjective, counts::Dict{Symbol, <:Integer})\n\nInitialise the ManifoldCountObjective to wrap objective initializing the set of counts\n\nManifoldCountObjective(M::AbstractManifold, objective::AbstractManifoldObjective, count::AbstractVector{Symbol}, init=0)\n\nCount function calls on objective using the symbols in count initialising all entries to init.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Internal-Decorators","page":"Objective","title":"Internal Decorators","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ReturnManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.ReturnManifoldObjective","page":"Objective","title":"Manopt.ReturnManifoldObjective","text":"ReturnManifoldObjective{E,O2,O1<:AbstractManifoldObjective{E}} <:\n AbstractDecoratedManifoldObjective{E,O2}\n\nA wrapper to indicate that get_solver_result should return the 
inner objective.\n\nThe types are such that one can still dispatch on the undecorated type O2 of the original objective as well.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Specific-Objective-typed-and-their-access-functions","page":"Objective","title":"Specific Objective types and their access functions","text":"","category":"section"},{"location":"plans/objective/#Cost-Objective","page":"Objective","title":"Cost Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractManifoldCostObjective\nManifoldCostObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractManifoldCostObjective","page":"Objective","title":"Manopt.AbstractManifoldCostObjective","text":"AbstractManifoldCostObjective{T<:AbstractEvaluationType} <: AbstractManifoldObjective{T}\n\nRepresenting objectives on manifolds with a cost function implemented.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldCostObjective","page":"Objective","title":"Manopt.ManifoldCostObjective","text":"ManifoldCostObjective{T, TC} <: AbstractManifoldCostObjective{T, TC}\n\nspecify an AbstractManifoldObjective that only has information about the cost function fcolon mathcal M ℝ implemented as a function (M, p) -> c to compute the cost value c at p on the manifold M.\n\ncost – a function f mathcal M ℝ to minimize\n\nConstructors\n\nManifoldCostObjective(f)\n\nGenerate a problem. 
While this Problem does not have any allocating functions, the type T can be set for consistency reasons with other problems.\n\nUsed with\n\nNelderMead, particle_swarm\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_cost","category":"page"},{"location":"plans/objective/#Manopt.get_cost","page":"Objective","title":"Manopt.get_cost","text":"get_cost(amp::AbstractManoptProblem, p)\n\nevaluate the cost function f stored within the AbstractManifoldObjective of an AbstractManoptProblem amp at the point p.\n\n\n\n\n\nget_cost(M::AbstractManifold, obj::AbstractManifoldObjective, p)\n\nevaluate the cost function f defined on M stored within the AbstractManifoldObjective at the point p.\n\n\n\n\n\nget_cost(M::AbstractManifold, mco::AbstractManifoldCostObjective, p)\n\nEvaluate the cost function from within the AbstractManifoldCostObjective on M at p.\n\nBy default this implementation assumes that the cost is stored within mco.cost.\n\n\n\n\n\nget_cost(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p, i)\n\nEvaluate the ith summand of the cost.\n\nIf you use a single function for the stochastic cost, then only the index i=1 is available to evaluate the whole cost.\n\n\n\n\n\nget_cost(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\n\nEvaluate the cost function of an objective defined in the embedding, i.e. 
embed p before calling the cost function stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"and internally","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_cost_function","category":"page"},{"location":"plans/objective/#Manopt.get_cost_function","page":"Objective","title":"Manopt.get_cost_function","text":"get_cost_function(amco::AbstractManifoldCostObjective)\n\nreturn the function to evaluate (just) the cost f(p)=c as a function (M,p) -> c.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Gradient-Objectives","page":"Objective","title":"Gradient Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractManifoldGradientObjective\nManifoldGradientObjective\nManifoldAlternatingGradientObjective\nManifoldStochasticGradientObjective\nNonlinearLeastSquaresObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractManifoldGradientObjective","page":"Objective","title":"Manopt.AbstractManifoldGradientObjective","text":"AbstractManifoldGradientObjective{E<:AbstractEvaluationType, TC, TG} <: AbstractManifoldCostObjective{E, TC}\n\nAn abstract type for all functions that provide a (full) gradient, where T is an AbstractEvaluationType for the gradient function.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldGradientObjective","page":"Objective","title":"Manopt.ManifoldGradientObjective","text":"ManifoldGradientObjective{T<:AbstractEvaluationType} <: AbstractManifoldGradientObjective{T}\n\nspecify an objective containing a cost and its gradient\n\nFields\n\ncost – a function fcolonmathcal M ℝ\ngradient!! 
– the gradient operatornamegradfcolonmathcal M mathcal Tmathcal M of the cost function f.\n\nDepending on the AbstractEvaluationType T the gradient can have two forms\n\nas a function (M, p) -> X that allocates memory for X, i.e. an AllocatingEvaluation\nas a function (M, X, p) -> X that works in place of X, i.e. an InplaceEvaluation\n\nConstructors\n\nManifoldGradientObjective(cost, gradient; evaluation=AllocatingEvaluation())\n\nUsed with\n\ngradient_descent, conjugate_gradient_descent, quasi_Newton\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldAlternatingGradientObjective","page":"Objective","title":"Manopt.ManifoldAlternatingGradientObjective","text":"ManifoldAlternatingGradientObjective{E<:AbstractEvaluationType,TCost,TGradient} <: AbstractManifoldGradientObjective{E}\n\nAn alternating gradient objective consists of\n\na cost function F(x)\na gradient operatornamegradF that is either\ngiven as one function operatornamegradF returning a tangent vector X on M or\nan array of gradient functions operatornamegradF_i, i=1,…,n, each returning a component of the gradient\nwhich might be allocating or mutating variants, but not a mix of both.\n\nnote: Note\nThis Objective is usually defined using the ProductManifold from Manifolds.jl, so Manifolds.jl needs to be loaded.\n\nConstructors\n\nManifoldAlternatingGradientObjective(F, gradF::Function;\n evaluation=AllocatingEvaluation()\n)\nManifoldAlternatingGradientObjective(F, gradF::AbstractVector{<:Function};\n evaluation=AllocatingEvaluation()\n)\n\nCreate an alternating gradient problem with an optional cost and the gradient either as one function (returning an array) or a vector of functions.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.ManifoldStochasticGradientObjective","page":"Objective","title":"Manopt.ManifoldStochasticGradientObjective","text":"ManifoldStochasticGradientObjective{T<:AbstractEvaluationType} <: AbstractManifoldGradientObjective{T}\n\nA stochastic 
gradient objective consists of\n\na(n optional) cost function f(p) = displaystylesum_i=1^n f_i(p)\nan array of gradients, operatornamegradf_i(p) i=1ldotsn which can be given in two forms\nas one single function (mathcal M p) (X_1X_n) in (T_pmathcal M)^n\nas a vector of functions bigl( (mathcal M p) X_1 (mathcal M p) X_nbigr).\n\nWhere both variants can also be provided as InplaceEvaluation functions, i.e. (M, X, p) -> X, where X is the vector of X1,...Xn and (M, X1, p) -> X1, ..., (M, Xn, p) -> Xn, respectively.\n\nConstructors\n\nManifoldStochasticGradientObjective(\n grad_f::Function;\n cost=Missing(),\n evaluation=AllocatingEvaluation()\n)\nManifoldStochasticGradientObjective(\n grad_f::AbstractVector{<:Function};\n cost=Missing(), evaluation=AllocatingEvaluation()\n)\n\nCreate a stochastic gradient problem with the gradient either as one function (returning an array of tangent vectors) or a vector of functions (each returning one tangent vector).\n\nThe optional cost can also be given as either a single function (returning a number) or a vector of functions, each returning a value.\n\nUsed with\n\nstochastic_gradient_descent\n\nNote that this can also be used with a gradient_descent, since the (complete) gradient is just the sum of the single gradients.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.NonlinearLeastSquaresObjective","page":"Objective","title":"Manopt.NonlinearLeastSquaresObjective","text":"NonlinearLeastSquaresObjective{T<:AbstractEvaluationType} <: AbstractManifoldObjective{T}\n\nA type for nonlinear least squares problems. T is an AbstractEvaluationType for the F and Jacobian functions.\n\nSpecify a nonlinear least squares problem\n\nFields\n\nf – a function f mathcal M ℝ^d to minimize\njacobian!! 
– Jacobian of the function f\njacobian_tangent_basis – the basis of tangent space used for computing the Jacobian.\nnum_components – number of values returned by f (equal to d).\n\nDepending on the AbstractEvaluationType T the function F has to be provided:\n\nas a function (M::AbstractManifold, p) -> v that allocates memory for v itself for an AllocatingEvaluation,\nas a function (M::AbstractManifold, v, p) -> v that works in place of v for an InplaceEvaluation.\n\nAlso the Jacobian jacF is required:\n\nas a function (M::AbstractManifold, p; basis_domain::AbstractBasis) -> v that allocates memory for v itself for an AllocatingEvaluation,\nas a function (M::AbstractManifold, v, p; basis_domain::AbstractBasis) -> v that works in place of v for an InplaceEvaluation.\n\nConstructors\n\nNonlinearLeastSquaresObjective(M, F, jacF, num_components; evaluation=AllocatingEvaluation(), jacobian_tangent_basis=DefaultOrthonormalBasis())\n\nSee also\n\nLevenbergMarquardt, LevenbergMarquardtState\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"There is also a second variant, in case just one function is responsible for computing the cost and the gradient","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldCostGradientObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldCostGradientObjective","page":"Objective","title":"Manopt.ManifoldCostGradientObjective","text":"ManifoldCostGradientObjective{T} <: AbstractManifoldObjective{T}\n\nspecify an objective containing one function to perform a combined computation of cost and its gradient\n\nFields\n\ncostgrad!! – a function that computes both the cost fcolonmathcal M ℝ and its gradient operatornamegradfcolonmathcal M mathcal Tmathcal M\n\nDepending on the AbstractEvaluationType T the gradient can have two forms\n\nas a function (M, p) -> (c, X) that allocates memory for the gradient X, i.e. 
an AllocatingEvaluation\nas a function (M, X, p) -> (c, X) that works in place of X, i.e. an InplaceEvaluation\n\nConstructors\n\nManifoldCostGradientObjective(costgrad; evaluation=AllocatingEvaluation())\n\nUsed with\n\ngradient_descent, conjugate_gradient_descent, quasi_Newton\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-2","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_gradient\nget_gradients","category":"page"},{"location":"plans/objective/#Manopt.get_gradient","page":"Objective","title":"Manopt.get_gradient","text":"X = get_gradient(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)\nget_gradient!(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, X, p)\n\nEvaluate all summands' gradients at a point p on the ProductManifold M (in place of X)\n\n\n\n\n\nX = get_gradient(M::AbstractManifold, ago::ManifoldAlternatingGradientObjective, p, k)\nget_gradient!(M::AbstractManifold, ago::ManifoldAlternatingGradientObjective, X, p, k)\n\nEvaluate one of the component gradients operatornamegradf_k, k1n, at x (in place of Y).\n\n\n\n\n\nget_gradient(s::AbstractManoptSolverState)\n\nreturn the (last stored) gradient within an AbstractManoptSolverState. By default also undecorates the state beforehand\n\n\n\n\n\nget_gradient(amp::AbstractManoptProblem, p)\nget_gradient!(amp::AbstractManoptProblem, X, p)\n\nevaluate the gradient of an AbstractManoptProblem amp at the point p.\n\nThe evaluation is done in place of X for the !-variant.\n\n\n\n\n\nget_gradient(M::AbstractManifold, mgo::AbstractManifoldGradientObjective{T}, p)\nget_gradient!(M::AbstractManifold, X, mgo::AbstractManifoldGradientObjective{T}, p)\n\nevaluate the gradient of an AbstractManifoldGradientObjective{T} mgo at p.\n\nThe evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. 
When the non-mutating variant is called with a T=InplaceEvaluation memory for the result is allocated.\n\nNote that the order of parameters follows the philosophy of Manifolds.jl, namely that even for the mutating variant, the manifold is the first parameter and the (inplace) tangent vector X comes second.\n\n\n\n\n\nget_gradient(agst::AbstractGradientSolverState)\n\nreturn the gradient stored within gradient options. The default returns agst.X.\n\n\n\n\n\nget_gradient(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p, k)\nget_gradient!(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, Y, p, k)\n\nEvaluate one of the summands' gradients operatornamegradf_k, k1n, at x (in place of Y).\n\nIf you use a single function for the stochastic gradient that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient required for allocation) cannot be determined.\n\n\n\n\n\nget_gradient(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p)\nget_gradient!(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, X, p)\n\nEvaluate the complete gradient operatornamegrad f = displaystylesum_i=1^n operatornamegrad f_i(p) at p (in place of X).\n\nIf you use a single function for the stochastic gradient that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient required for allocation) cannot be determined.\n\n\n\n\n\nget_gradient(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\nget_gradient!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)\n\nEvaluate the gradient function of an objective defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradient is then converted to a Riemannian gradient calling 
riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_gradients","page":"Objective","title":"Manopt.get_gradients","text":"get_gradients(M::AbstractManifold, sgo::ManifoldStochasticGradientObjective, p)\nget_gradients!(M::AbstractManifold, X, sgo::ManifoldStochasticGradientObjective, p)\n\nEvaluate all summands' gradients operatornamegradf_i_i=1^n at p (in place of X).\n\nIf you use a single function for the stochastic gradient that works inplace, then get_gradient is not available, since the length (or number of elements of the gradient) cannot be determined.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"and internally","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_gradient_function","category":"page"},{"location":"plans/objective/#Manopt.get_gradient_function","page":"Objective","title":"Manopt.get_gradient_function","text":"get_gradient_function(amgo::AbstractManifoldGradientObjective, recursive=false)\n\nreturn the function to evaluate (just) the gradient operatornamegrad f(p), where either the gradient function using the decorator or without the decorator is used.\n\nBy default recursive is set to false, since usually, when just passing the gradient function somewhere, you still want e.g. the cached one or the one that still counts calls.\n\nDepending on the AbstractEvaluationType E this is a function\n\n(M, p) -> X for the AllocatingEvaluation case\n(M, X, p) -> X for the InplaceEvaluation, i.e. 
working inplace of X.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Internal-Helpers","page":"Objective","title":"Internal Helpers","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_gradient_from_Jacobian!","category":"page"},{"location":"plans/objective/#Manopt.get_gradient_from_Jacobian!","page":"Objective","title":"Manopt.get_gradient_from_Jacobian!","text":"get_gradient_from_Jacobian!(\n M::AbstractManifold,\n X,\n nlso::NonlinearLeastSquaresObjective{InplaceEvaluation},\n p,\n Jval=zeros(nlso.num_components, manifold_dimension(M)),\n)\n\nCompute the gradient of the NonlinearLeastSquaresObjective nlso at point p in place of X, with the temporary Jacobian stored in the optional argument Jval.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Subgradient-Objective","page":"Objective","title":"Subgradient Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldSubgradientObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldSubgradientObjective","page":"Objective","title":"Manopt.ManifoldSubgradientObjective","text":"ManifoldSubgradientObjective{T<:AbstractEvaluationType,C,S} <:AbstractManifoldCostObjective{T, C}\n\nA structure to store information about an objective for a subgradient based optimization problem\n\nFields\n\ncost – the function F to be minimized\nsubgradient – a function returning a subgradient partial F of F\n\nConstructor\n\nManifoldSubgradientObjective(f, ∂f)\n\nGenerate the ManifoldSubgradientObjective for a subgradient objective, i.e. 
a (cost) function f(M, p) and a function ∂f(M, p) that returns a not necessarily deterministic element from the subdifferential at p on a manifold M.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-Functions","page":"Objective","title":"Access Functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_subgradient","category":"page"},{"location":"plans/objective/#Manopt.get_subgradient","page":"Objective","title":"Manopt.get_subgradient","text":"get_subgradient(amp::AbstractManoptProblem, p)\nget_subgradient!(amp::AbstractManoptProblem, X, p)\n\nevaluate the subgradient of an AbstractManoptProblem amp at point p.\n\nThe evaluation is done in place of X for the !-variant. The result might not be deterministic, one element of the subdifferential is returned.\n\n\n\n\n\nX = get_subgradient(M::AbstractManifold, sgo::ManifoldSubgradientObjective, p)\nget_subgradient!(M::AbstractManifold, X, sgo::ManifoldSubgradientObjective, p)\n\nEvaluate the (sub)gradient of a ManifoldSubgradientObjective sgo at the point p.\n\nThe evaluation is done in place of X for the !-variant. 
The result might not be deterministic, one element of the subdifferential is returned.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Proximal-Map-Objective","page":"Objective","title":"Proximal Map Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldProximalMapObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldProximalMapObjective","page":"Objective","title":"Manopt.ManifoldProximalMapObjective","text":"ManifoldProximalMapObjective{E<:AbstractEvaluationType, TC, TP, V <: Vector{<:Integer}} <: AbstractManifoldCostObjective{E, TC}\n\nspecify a problem for solvers based on the evaluation of proximal map(s).\n\nFields\n\ncost - a function Fmathcal Mℝ to minimize\nproxes - proximal maps operatornameprox_λvarphimathcal Mmathcal M as functions (M, λ, p) -> q.\nnumber_of_proxes - (ones(length(proxes))) number of proximal maps per function, e.g. if one of the maps is a combined one such that the proximal map functions return more than one entry per function, you have to adapt this value. 
if not specified, it is set to one prox per function.\n\nSee also\n\ncyclic_proximal_point, get_cost, get_proximal_map\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-Functions-2","page":"Objective","title":"Access Functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_proximal_map","category":"page"},{"location":"plans/objective/#Manopt.get_proximal_map","page":"Objective","title":"Manopt.get_proximal_map","text":"q = get_proximal_map(M::AbstractManifold, mpo::ManifoldProximalMapObjective, λ, p)\nget_proximal_map!(M::AbstractManifold, q, mpo::ManifoldProximalMapObjective, λ, p)\nq = get_proximal_map(M::AbstractManifold, mpo::ManifoldProximalMapObjective, λ, p, i)\nget_proximal_map!(M::AbstractManifold, q, mpo::ManifoldProximalMapObjective, λ, p, i)\n\nevaluate the (ith) proximal map of the ManifoldProximalMapObjective mpo at the point p on M with parameter λ>0.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Hessian-Objective","page":"Objective","title":"Hessian Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ManifoldHessianObjective","category":"page"},{"location":"plans/objective/#Manopt.ManifoldHessianObjective","page":"Objective","title":"Manopt.ManifoldHessianObjective","text":"ManifoldHessianObjective{T<:AbstractEvaluationType,C,G,H,Pre} <: AbstractManifoldGradientObjective{T}\n\nspecify a problem for Hessian-based algorithms.\n\nFields\n\ncost : a function Fmathcal Mℝ to minimize\ngradient : the gradient operatornamegradFmathcal M mathcal Tmathcal M of the cost function F\nhessian : the Hessian operatornameHessF(x) mathcal T_x mathcal M mathcal T_x mathcal M of the cost function F\npreconditioner : the symmetric, positive definite preconditioner as an approximation of the inverse of the Hessian of f, i.e. 
as a map with the same input variables as the Hessian.\n\nDepending on the AbstractEvaluationType T the gradient and Hessian can have two forms\n\nas a function (M, p) -> X and (M, p, X) -> Y, resp. i.e. an AllocatingEvaluation\nas a function (M, X, p) -> X and (M, Y, p, X), resp., i.e. an InplaceEvaluation\n\nConstructor\n\nManifoldHessianObjective(f, grad_f, Hess_f, preconditioner = (M, p, X) -> X;\n evaluation=AllocatingEvaluation())\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-3","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_hessian\nget_preconditioner","category":"page"},{"location":"plans/objective/#Manopt.get_hessian","page":"Objective","title":"Manopt.get_hessian","text":"Y = get_hessian(amp::AbstractManoptProblem{T}, p, X)\nget_hessian!(amp::AbstractManoptProblem{T}, Y, p, X)\n\nevaluate the Hessian of an AbstractManoptProblem amp at p applied to a tangent vector X, i.e. 
compute operatornameHessf(q)X, which can also happen in-place of Y.\n\n\n\n\n\nget_hessian(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, X)\nget_hessian!(M::AbstractManifold, Y, emo::EmbeddedManifoldObjective, p, X)\n\nEvaluate the Hessian of an objective defined in the embedding, that is embed p and X before calling the Hessian function stored in the EmbeddedManifoldObjective.\n\nThe returned Hessian is then converted to a Riemannian Hessian calling riemannian_Hessian.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_preconditioner","page":"Objective","title":"Manopt.get_preconditioner","text":"get_preconditioner(amp::AbstractManoptProblem, p, X)\n\nevaluate the symmetric, positive definite preconditioner (approximation of the inverse of the Hessian of the cost function f) of an AbstractManoptProblem amp's objective at the point p applied to a tangent vector X.\n\n\n\n\n\nget_preconditioner(M::AbstractManifold, mho::ManifoldHessianObjective, p, X)\n\nevaluate the symmetric, positive definite preconditioner (approximation of the inverse of the Hessian of the cost function F) of a ManifoldHessianObjective mho at the point p applied to a tangent vector X.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"and internally","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_hessian_function","category":"page"},{"location":"plans/objective/#Manopt.get_hessian_function","page":"Objective","title":"Manopt.get_hessian_function","text":"get_hessian_function(amgo::AbstractManifoldGradientObjective{E<:AbstractEvaluationType})\n\nreturn the function to evaluate (just) the Hessian operatornameHess f(p). Depending on the AbstractEvaluationType E this is a function\n\n(M, p, X) -> Y for the AllocatingEvaluation case\n(M, Y, p, X) -> Y for the InplaceEvaluation, i.e. 
working inplace of Y.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Primal-Dual-based-Objectives","page":"Objective","title":"Primal-Dual based Objectives","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"AbstractPrimalDualManifoldObjective\nPrimalDualManifoldObjective\nPrimalDualManifoldSemismoothNewtonObjective","category":"page"},{"location":"plans/objective/#Manopt.AbstractPrimalDualManifoldObjective","page":"Objective","title":"Manopt.AbstractPrimalDualManifoldObjective","text":"AbstractPrimalDualManifoldObjective{E<:AbstractEvaluationType,C,P} <: AbstractManifoldCostObjective{E,C}\n\nA common abstract super type for objectives that consider primal-dual problems.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.PrimalDualManifoldObjective","page":"Objective","title":"Manopt.PrimalDualManifoldObjective","text":"PrimalDualManifoldObjective{E<:AbstractEvaluationType} <: AbstractPrimalDualManifoldObjective{E}\n\nDescribes an Objective for the linearized or exact Chambolle-Pock algorithm, cf. Bergmann et al., Found. Comput. Math., 2021, Chambolle, Pock, JMIV, 2011\n\nFields\n\nAll fields with !! can either be mutating or nonmutating functions, which should be set depending on the parameter T <: AbstractEvaluationType.\n\ncost F + G(Λ()) to evaluate interim cost function values\nlinearized_forward_operator!! linearized operator for the forward operation in the algorithm DΛ\nlinearized_adjoint_operator!! The adjoint differential (DΛ)^* mathcal N Tmathcal M\nprox_f!! the proximal map belonging to f\nprox_G_dual!! the proximal map belonging to g_n^*\nΛ!! 
– (forward_operator) the forward operator (if given) Λ mathcal M mathcal N\n\nUsually, either the linearized operator DΛ or Λ is required.\n\nConstructor\n\nPrimalDualManifoldObjective(cost, prox_f, prox_G_dual, adjoint_linearized_operator;\n linearized_forward_operator::Union{Function,Missing}=missing,\n Λ::Union{Function,Missing}=missing,\n evaluation::AbstractEvaluationType=AllocatingEvaluation()\n)\n\nThe last optional argument can be used to provide the 4 or 5 functions as allocating or mutating (in place computation) ones. Note that the first argument is always the manifold under consideration, the mutated one is the second.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.PrimalDualManifoldSemismoothNewtonObjective","page":"Objective","title":"Manopt.PrimalDualManifoldSemismoothNewtonObjective","text":"PrimalDualManifoldSemismoothNewtonObjective{E<:AbstractEvaluationType, TC, LO, ALO, PF, DPF, PG, DPG, L} <: AbstractPrimalDualManifoldObjective{E, TC, PF}\n\nDescribes a Problem for the Primal-dual Riemannian semismooth Newton algorithm. Diepeveen, Lellmann, SIAM J. Imag. Sci., 2021\n\nFields\n\ncost F + G(Λ()) to evaluate interim cost function values\nlinearized_operator the linearization DΛ() of the operator Λ().\nlinearized_adjoint_operator The adjoint differential (DΛ)^* colon mathcal N to Tmathcal M\nprox_F the proximal map belonging to f\ndiff_prox_F the (Clarke Generalized) differential of the proximal maps of F\nprox_G_dual the proximal map belonging to g_n^*\ndiff_prox_dual_G the (Clarke Generalized) differential of the proximal maps of G^ast_n\nΛ – the exact forward operator. 
This operator is required if Λ(m)=n does not hold.\n\nConstructor\n\nPrimalDualManifoldSemismoothNewtonObjective(cost, prox_F, prox_G_dual, forward_operator, adjoint_linearized_operator, Λ)\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-4","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"adjoint_linearized_operator\nforward_operator\nget_differential_dual_prox\nget_differential_primal_prox\nget_dual_prox\nget_primal_prox\nlinearized_forward_operator","category":"page"},{"location":"plans/objective/#Manopt.adjoint_linearized_operator","page":"Objective","title":"Manopt.adjoint_linearized_operator","text":"X = adjoint_linearized_operator(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, m, n, Y)\nadjoint_linearized_operator!(N::AbstractManifold, X, apdmo::AbstractPrimalDualManifoldObjective, m, n, Y)\n\nEvaluate the adjoint of the linearized forward operator of (DΛ(m))^*Y stored within the AbstractPrimalDualManifoldObjective (in place of X). 
Since YT_nmathcal N, both m and n=Λ(m) are necessary arguments, mainly because the forward operator Λ might be missing in the objective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.forward_operator","page":"Objective","title":"Manopt.forward_operator","text":"q = forward_operator(M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, p)\nforward_operator!(M::AbstractManifold, N::AbstractManifold, q, apdmo::AbstractPrimalDualManifoldObjective, p)\n\nEvaluate the forward operator of Λ(x) stored within the TwoManifoldProblem (in place of q).\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_differential_dual_prox","page":"Objective","title":"Manopt.get_differential_dual_prox","text":"η = get_differential_dual_prox(N::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, n, τ, X, ξ)\nget_differential_dual_prox!(N::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, η, n, τ, X, ξ)\n\nEvaluate the differential proximal map of G_n^* stored within PrimalDualManifoldSemismoothNewtonObjective\n\nDoperatornameprox_τG_n^*(X)ξ\n\nwhich can also be computed in place of η.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_differential_primal_prox","page":"Objective","title":"Manopt.get_differential_primal_prox","text":"y = get_differential_primal_prox(M::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, σ, x)\nget_differential_primal_prox!(M::AbstractManifold, pdsno::PrimalDualManifoldSemismoothNewtonObjective, y, σ, x)\n\nEvaluate the differential proximal map of F stored within AbstractPrimalDualManifoldObjective\n\nDoperatornameprox_σF(x)X\n\nwhich can also be computed in place of y.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_dual_prox","page":"Objective","title":"Manopt.get_dual_prox","text":"Y = get_dual_prox(N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, n, τ, X)\nget_dual_prox!(N::AbstractManifold, 
apdmo::AbstractPrimalDualManifoldObjective, Y, n, τ, X)\n\nEvaluate the proximal map of g_n^* stored within AbstractPrimalDualManifoldObjective\n\n Y = operatornameprox_τG_n^*(X)\n\nwhich can also be computed in place of Y.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_primal_prox","page":"Objective","title":"Manopt.get_primal_prox","text":"q = get_primal_prox(M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, σ, p)\nget_primal_prox!(M::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, q, σ, p)\n\nEvaluate the proximal map of F stored within AbstractPrimalDualManifoldObjective\n\noperatornameprox_σF(p)\n\nwhich can also be computed in place of q.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.linearized_forward_operator","page":"Objective","title":"Manopt.linearized_forward_operator","text":"Y = linearized_forward_operator(M::AbstractManifold, N::AbstractManifold, apdmo::AbstractPrimalDualManifoldObjective, m, X, n)\nlinearized_forward_operator!(M::AbstractManifold, N::AbstractManifold, Y, apdmo::AbstractPrimalDualManifoldObjective, m, X, n)\n\nEvaluate the linearized operator (differential) DΛ(m)X stored within the AbstractPrimalDualManifoldObjective (in place of Y), where n = Λ(m).\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Constrained-Objective","page":"Objective","title":"Constrained Objective","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"Besides the AbstractEvaluationType there is one further property to distinguish among constraint functions, especially the gradients of the constraints.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ConstraintType\nFunctionConstraint\nVectorConstraint","category":"page"},{"location":"plans/objective/#Manopt.ConstraintType","page":"Objective","title":"Manopt.ConstraintType","text":"ConstraintType\n\nAn abstract type 
to distinguish the different forms of implementing constraints\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.FunctionConstraint","page":"Objective","title":"Manopt.FunctionConstraint","text":"FunctionConstraint <: ConstraintType\n\nA type to indicate that constraints are implemented as one whole function, e.g. g(p) mathbb R^m.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Manopt.VectorConstraint","page":"Objective","title":"Manopt.VectorConstraint","text":"VectorConstraint <: ConstraintType\n\nA type to indicate that constraints are implemented as a vector of functions, e.g. g_i(p) mathbb R i=1m.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"The ConstraintType is a parameter of the corresponding Objective.","category":"page"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"ConstrainedManifoldObjective","category":"page"},{"location":"plans/objective/#Manopt.ConstrainedManifoldObjective","page":"Objective","title":"Manopt.ConstrainedManifoldObjective","text":"ConstrainedManifoldObjective{T<:AbstractEvaluationType, C <: ConstraintType} <: AbstractManifoldObjective{T}\n\nDescribes the constrained objective\n\nbeginaligned\n operatorname*argmin_p mathcalM f(p)\n textsubject to g_i(p)leq0 quad text for all i=1m\n quad h_j(p)=0 quad text for all j=1n\nendaligned\n\nIt consists of\n\na cost function f(p)\nthe gradient of f, operatornamegradf(p) AbstractManifoldGradientObjective\ninequality constraints g(p), either a function g returning a vector or a vector [g1, g2,...,gm] of functions.\nequality constraints h(p), either a function h returning a vector or a vector [h1, h2,...,hn] of functions.\ngradient(s) of the inequality constraints operatornamegradg(p) (T_pmathcal M)^m, either a function or a vector of functions.\ngradient(s) of the equality constraints operatornamegradh(p) (T_pmathcal M)^n, either a function or a vector of 
functions.\n\nThere are two ways to specify the constraints g and h.\n\nas one Function returning a vector in mathbb R^m and mathbb R^n respectively. This might be easier to implement but requires evaluating all constraints even if only one is needed.\nas an AbstractVector{<:Function} where each function returns a real number. This requires each constraint to be implemented as a single function, but it is possible to also evaluate only a single constraint.\n\nThe gradients operatornamegradg, operatornamegradh have to follow the same form. Additionally they can be implemented as in-place functions or as allocating ones. The gradient operatornamegradF has to be the same kind. This difference is indicated by the evaluation keyword.\n\nConstructors\n\nConstrainedManifoldObjective(f, grad_f, g, grad_g, h, grad_h;\n evaluation=AllocatingEvaluation()\n)\n\nWhere f, g, h describe the cost, inequality and equality constraints, respectively, as described above and grad_f, grad_g, grad_h are the corresponding gradient functions in one of the 4 formats. If the objective does not have inequality constraints, you can set G and gradG to nothing. 
If the problem does not have equality constraints, you can set H and gradH to nothing or leave them out.\n\nConstrainedManifoldObjective(M::AbstractManifold, F, gradF;\n G=nothing, gradG=nothing, H=nothing, gradH=nothing;\n evaluation=AllocatingEvaluation()\n)\n\nA keyword argument variant of the constructor above, where you can leave out either G and gradG or H and gradH but not both.\n\n\n\n\n\n","category":"type"},{"location":"plans/objective/#Access-functions-5","page":"Objective","title":"Access functions","text":"","category":"section"},{"location":"plans/objective/","page":"Objective","title":"Objective","text":"get_constraints\nget_equality_constraint\nget_equality_constraints\nget_inequality_constraint\nget_inequality_constraints\nget_grad_equality_constraint\nget_grad_equality_constraints\nget_grad_equality_constraints!\nget_grad_equality_constraint!\nget_grad_inequality_constraint\nget_grad_inequality_constraint!\nget_grad_inequality_constraints\nget_grad_inequality_constraints!","category":"page"},{"location":"plans/objective/#Manopt.get_constraints","page":"Objective","title":"Manopt.get_constraints","text":"get_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nReturn the vector (g_1(p)g_m(p)h_1(p)h_n(p)) from the ConstrainedManifoldObjective P containing the values of all constraints at p.\n\n\n\n\n\nget_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\n\nReturn the vector (g_1(p)g_m(p)h_1(p)h_n(p)) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_equality_constraint","page":"Objective","title":"Manopt.get_equality_constraint","text":"get_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j)\n\nevaluate the jth equality constraint (h(p))_j or h_j(p).\n\nnote: Note\nFor the FunctionConstraint representation this still evaluates all 
constraints.\n\n\n\n\n\nget_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j)\n\nevaluate the jth equality constraint h_j(p) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_equality_constraints","page":"Objective","title":"Manopt.get_equality_constraints","text":"get_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nevaluate all equality constraints h(p) of bigl(h_1(p) h_2(p)ldotsh_p(p)bigr) of the ConstrainedManifoldObjective P at p.\n\n\n\n\n\nget_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\n\nEvaluate all equality constraints h(p) of bigl(h_1(p) h_2(p)ldotsh_p(p)bigr) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_inequality_constraint","page":"Objective","title":"Manopt.get_inequality_constraint","text":"get_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i)\n\nevaluate one inequality constraint (g(p))_i or g_i(p).\n\nnote: Note\nFor the FunctionConstraint representation this still evaluates all constraints.\n\n\n\n\n\nget_inequality_constraint(M::AbstractManifold, ems::EmbeddedManifoldObjective, p, i)\n\nEvaluate the ith inequality constraint g_i(p) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_inequality_constraints","page":"Objective","title":"Manopt.get_inequality_constraints","text":"get_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nEvaluate all inequality constraints g(p) or bigl(g_1(p) g_2(p)ldotsg_m(p)bigr) of the 
ConstrainedManifoldObjective P at p.\n\n\n\n\n\nget_inequality_constraints(M::AbstractManifold, ems::EmbeddedManifoldObjective, p)\n\nEvaluate all inequality constraints g(p) of bigl(g_1(p) g_2(p)ldotsg_m(p)bigr) defined in the embedding, that is embed p before calling the constraint function(s) stored in the EmbeddedManifoldObjective.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraint","page":"Objective","title":"Manopt.get_grad_equality_constraint","text":"get_grad_equality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, j)\n\nevaluate the gradient of the j th equality constraint (operatornamegrad h(p))_j or operatornamegrad h_j(x).\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints. 
It also allocates a full tangent vector.\n\n\n\n\n\nX = get_grad_equality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, j)\nget_grad_equality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, j)\n\nevaluate the gradient of the jth equality constraint operatornamegrad h_j(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraints","page":"Objective","title":"Manopt.get_grad_equality_constraints","text":"get_grad_equality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nevaluate all gradients of the equality constraints operatornamegrad h(x) or bigl(operatornamegrad h_1(x) operatornamegrad h_2(x)ldots operatornamegradh_n(x)bigr) of the ConstrainedManifoldObjective P at p.\n\nnote: Note\nFor the InplaceEvaluation and FunctionConstraint variant of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints.\n\n\n\n\n\nX = get_grad_equality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\nget_grad_equality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)\n\nevaluate the gradients of the equality constraints operatornamegrad h(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradients are then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraints!","page":"Objective","title":"Manopt.get_grad_equality_constraints!","text":"get_grad_equality_constraints!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p)\n\nevaluate 
all gradients of the equality constraints operatornamegrad h(p) or bigl(operatornamegrad h_1(p) operatornamegrad h_2(p)ldotsoperatornamegrad h_n(p)bigr) of the ConstrainedManifoldObjective P at p in place of X, which is a vector of n tangent vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_equality_constraint!","page":"Objective","title":"Manopt.get_grad_equality_constraint!","text":"get_grad_equality_constraint!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p, j)\n\nEvaluate the gradient of the jth equality constraint (operatornamegrad h(x))_j or operatornamegrad h_j(x) in place of X.\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation and FunctionConstraint variant of the problem, this function currently also calls get_equality_constraints, since this is the only way to determine the number of constraints; it also allocates a full vector of tangent vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraint","page":"Objective","title":"Manopt.get_grad_inequality_constraint","text":"get_grad_inequality_constraint(M::AbstractManifold, co::ConstrainedManifoldObjective, p, i)\n\nEvaluate the gradient of the ith inequality constraint (operatornamegrad g(x))_i or operatornamegrad g_i(x).\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. 
For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_inequality_constraints, since this is the only way to determine the number of constraints.\n\n\n\n\n\nX = get_grad_inequality_constraint(M::AbstractManifold, emo::EmbeddedManifoldObjective, p, i)\nget_grad_inequality_constraint!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p, i)\n\nevaluate the gradient of the ith inequality constraint operatornamegrad g_i(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradient is then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraint!","page":"Objective","title":"Manopt.get_grad_inequality_constraint!","text":"get_grad_inequality_constraint!(P, X, p, i)\n\nEvaluate the gradient of the ith inequality constraint (operatornamegrad g(x))_i or operatornamegrad g_i(x) of the ConstrainedManifoldObjective P in place of X.\n\nnote: Note\nFor the FunctionConstraint variant of the problem, this function still evaluates the full gradient. For the InplaceEvaluation and FunctionConstraint of the problem, this function currently also calls get_inequality_constraints, since this is the only way to determine the number of constraints. 
evaluate all gradients of the inequality constraints operatornamegrad g(x) or bigl(operatornamegrad g_1(x) operatornamegrad g_2(x)ldotsoperatornamegrad g_m(x)bigr) of the ConstrainedManifoldObjective P at x in place of X, which is a vector of m tangent vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraints","page":"Objective","title":"Manopt.get_grad_inequality_constraints","text":"get_grad_inequality_constraints(M::AbstractManifold, co::ConstrainedManifoldObjective, p)\n\nevaluate all gradients of the inequality constraints operatornamegrad g(p) or bigl(operatornamegrad g_1(p) operatornamegrad g_2(p)operatornamegrad g_m(p)bigr) of the ConstrainedManifoldObjective P at p.\n\nnote: Note\nFor the InplaceEvaluation and FunctionConstraint variant of the problem, this function currently also calls get_inequality_constraints, since this is the only way to determine the number of constraints.\n\n\n\n\n\nX = get_grad_inequality_constraints(M::AbstractManifold, emo::EmbeddedManifoldObjective, p)\nget_grad_inequality_constraints!(M::AbstractManifold, X, emo::EmbeddedManifoldObjective, p)\n\nevaluate the gradients of the inequality constraints operatornamegrad g(p) defined in the embedding, that is embed p before calling the gradient function stored in the EmbeddedManifoldObjective.\n\nThe returned gradients are then converted to a Riemannian gradient calling riemannian_gradient.\n\n\n\n\n\n","category":"function"},{"location":"plans/objective/#Manopt.get_grad_inequality_constraints!","page":"Objective","title":"Manopt.get_grad_inequality_constraints!","text":"get_grad_inequality_constraints!(M::AbstractManifold, X, co::ConstrainedManifoldObjective, p)\n\nevaluate all gradients of the inequality constraints operatornamegrad g(x) or bigl(operatornamegrad g_1(x) operatornamegrad g_2(x)ldotsoperatornamegrad g_m(x)bigr) of the ConstrainedManifoldObjective P at p in place of X, which is a vector of m tangent 
vectors.\n\n\n\n\n\n","category":"function"},{"location":"plans/stopping_criteria/#StoppingCriteria","page":"Stopping Criteria","title":"Stopping Criteria","text":"","category":"section"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Stopping criteria are implemented as a functor, i.e. inherit from the base type","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"StoppingCriterion","category":"page"},{"location":"plans/stopping_criteria/#Manopt.StoppingCriterion","page":"Stopping Criteria","title":"Manopt.StoppingCriterion","text":"StoppingCriterion\n\nAn abstract type for the functors representing stopping criteria, i.e. they are callable structures. The naming scheme follows functions, see for example StopAfterIteration.\n\nEvery StoppingCriterion has to provide a constructor and its function has to have the interface (p,o,i) where an AbstractManoptProblem as well as an AbstractManoptSolverState and the current number of iterations are the arguments, and which returns a Bool indicating whether to stop or not.\n\nBy default each StoppingCriterion should provide a field reason to provide details when a criterion is met (and that is empty otherwise).\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"They can also be grouped, which is summarized in the type of a set of criteria","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"StoppingCriterionSet","category":"page"},{"location":"plans/stopping_criteria/#Manopt.StoppingCriterionSet","page":"Stopping Criteria","title":"Manopt.StoppingCriterionSet","text":"StoppingCriterionGroup <: StoppingCriterion\n\nAn abstract type for a stopping criterion that itself consists of a set of stopping criteria. In total it acts as a stopping criterion itself. 
Examples are StopWhenAny and StopWhenAll that can be used to combine stopping criteria.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Then the stopping criteria s might have certain internal values to check against, and this is done when calling them as a function s(amp::AbstractManoptProblem, ams::AbstractManoptSolverState), where the AbstractManoptProblem and the AbstractManoptSolverState together represent the current state of the solver. The functor returns either false when the stopping criterion is not fulfilled or true otherwise. One field all criteria should have is the s.reason, a string giving the reason to stop, see get_reason.","category":"page"},{"location":"plans/stopping_criteria/#Stopping-Criteria","page":"Stopping Criteria","title":"Stopping Criteria","text":"","category":"section"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"The following generic stopping criteria are available. Some require that, for example, the corresponding AbstractManoptSolverState have a field gradient when the criterion should check that.","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Further stopping criteria might be available for individual solvers.","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Modules = [Manopt]\nPages = [\"plans/stopping_criterion.jl\"]\nOrder = [:type]\nFilter = t -> t != StoppingCriterion && t != StoppingCriterionSet","category":"page"},{"location":"plans/stopping_criteria/#Manopt.StopAfter","page":"Stopping Criteria","title":"Manopt.StopAfter","text":"StopAfter <: StoppingCriterion\n\nstore a threshold when to stop looking at the complete runtime. It uses time_ns() to measure the time and you provide a Period as a time limit, i.e. 
Minute(15)\n\nConstructor\n\nStopAfter(t)\n\ninitialize the stopping criterion to a Period t to stop after.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopAfterIteration","page":"Stopping Criteria","title":"Manopt.StopAfterIteration","text":"StopAfterIteration <: StoppingCriterion\n\nA functor for an easy stopping criterion, i.e. to stop after a maximal number of iterations.\n\nFields\n\nmaxIter – stores the maximal iteration number where to stop at\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopAfterIteration(maxIter)\n\ninitialize the StopAfterIteration functor to indicate to stop after maxIter iterations.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenAll","page":"Stopping Criteria","title":"Manopt.StopWhenAll","text":"StopWhenAll <: StoppingCriterion\n\nstore an array of StoppingCriterion elements and indicates to stop when all indicate to stop. The reason is given by the concatenation of all reasons.\n\nConstructor\n\nStopWhenAll(c::NTuple{N,StoppingCriterion} where N)\nStopWhenAll(c::StoppingCriterion,...)\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenAny","page":"Stopping Criteria","title":"Manopt.StopWhenAny","text":"StopWhenAny <: StoppingCriterion\n\nstore an array of StoppingCriterion elements and indicates to stop when any single one indicates to stop. 
The reason is given by the concatenation of all reasons (assuming that all non-indicating return \"\").\n\nConstructor\n\nStopWhenAny(c::NTuple{N,StoppingCriterion} where N)\nStopWhenAny(c::StoppingCriterion...)\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenChangeLess","page":"Stopping Criteria","title":"Manopt.StopWhenChangeLess","text":"StopWhenChangeLess <: StoppingCriterion\n\nstores a threshold when to stop looking at the norm of the change of the optimization variable from within an AbstractManoptSolverState, i.e. get_iterate(o). For the storage a StoreStateAction is used.\n\nConstructor\n\nStopWhenChangeLess(\n M::AbstractManifold,\n ε::Float64;\n storage::StoreStateAction=StoreStateAction([:Iterate]),\n inverse_retraction_method::IRT=default_inverse_retraction_method(manifold)\n)\n\ninitialize the stopping criterion to a threshold ε using the StoreStateAction a, which is initialized to just store :Iterate by default. You can also provide an inverse_retraction_method for the distance or a manifold to use its default inverse retraction.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenCostLess","page":"Stopping Criteria","title":"Manopt.StopWhenCostLess","text":"StopWhenCostLess <: StoppingCriterion\n\nstore a threshold when to stop looking at the cost function of the optimization problem from within an AbstractManoptProblem, i.e. get_cost(p, get_iterate(o)).\n\nConstructor\n\nStopWhenCostLess(ε)\n\ninitialize the stopping criterion to a threshold ε.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenGradientChangeLess","page":"Stopping Criteria","title":"Manopt.StopWhenGradientChangeLess","text":"StopWhenGradientChangeLess <: StoppingCriterion\n\nA stopping criterion based on the change of the gradient\n\n\\lVert \\mathcal T_{p^{(k)}\\gets p^{(k-1)}} \\operatorname{grad} f(p^{(k-1)}) - \\operatorname{grad} f(p^{(k)}) \\rVert < 
ε\n\nConstructor\n\nStopWhenGradientChangeLess(\n M::AbstractManifold,\n ε::Float64;\n storage::StoreStateAction=StoreStateAction([:Iterate]),\n vector_transport_method::IRT=default_vector_transport_method(M),\n)\n\nCreate a stopping criterion with threshold ε for the gradient change, that is, this criterion indicates to stop when the (norm of the) change of get_gradient is less than ε, where vector_transport_method denotes the vector transport mathcal T used.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenGradientNormLess","page":"Stopping Criteria","title":"Manopt.StopWhenGradientNormLess","text":"StopWhenGradientNormLess <: StoppingCriterion\n\nA stopping criterion based on the current gradient norm.\n\nConstructor\n\nStopWhenGradientNormLess(ε::Float64)\n\nCreate a stopping criterion with threshold ε for the gradient, that is, this criterion indicates to stop when get_gradient returns a gradient vector of norm less than ε.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenSmallerOrEqual","page":"Stopping Criteria","title":"Manopt.StopWhenSmallerOrEqual","text":"StopWhenSmallerOrEqual <: StoppingCriterion\n\nA functor for a stopping criterion, where the algorithm is stopped when a variable is smaller than or equal to its minimum value.\n\nFields\n\nvalue – stores the variable which has to fall under a threshold for the algorithm to stop\nminValue – stores the threshold where, if the value is smaller or equal to this threshold, the algorithm stops\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenSmallerOrEqual(value, minValue)\n\ninitialize the StopWhenSmallerOrEqual functor to indicate to stop after value is smaller than or equal to minValue.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Manopt.StopWhenStepsizeLess","page":"Stopping 
Criteria","title":"Manopt.StopWhenStepsizeLess","text":"StopWhenStepsizeLess <: StoppingCriterion\n\nstores a threshold when to stop looking at the last step size determined or found during the last iteration from within a AbstractManoptSolverState.\n\nConstructor\n\nStopWhenStepsizeLess(ε)\n\ninitialize the stopping criterion to a threshold ε.\n\n\n\n\n\n","category":"type"},{"location":"plans/stopping_criteria/#Functions-for-Stopping-Criteria","page":"Stopping Criteria","title":"Functions for Stopping Criteria","text":"","category":"section"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"There are a few functions to update, combine and modify stopping criteria, especially to update internal values even for stopping criteria already being used within an AbstractManoptSolverState structure.","category":"page"},{"location":"plans/stopping_criteria/","page":"Stopping Criteria","title":"Stopping Criteria","text":"Modules = [Manopt]\nPages = [\"plans/stopping_criterion.jl\"]\nOrder = [:function]","category":"page"},{"location":"plans/stopping_criteria/#Base.:&-Union{Tuple{T}, Tuple{S}, Tuple{S, T}} where {S<:StoppingCriterion, T<:StoppingCriterion}","page":"Stopping Criteria","title":"Base.:&","text":"&(s1,s2)\ns1 & s2\n\nCombine two StoppingCriterion within an StopWhenAll. 
If either s1 (or s2) is already a StopWhenAll, then s2 (or s1) is appended to the list of StoppingCriterion within s1 (or s2).\n\nExample\n\na = StopAfterIteration(200) & StopWhenChangeLess(1e-6)\nb = a & StopWhenGradientNormLess(1e-6)\n\nIs the same as\n\na = StopWhenAll(StopAfterIteration(200), StopWhenChangeLess(1e-6))\nb = StopWhenAll(StopAfterIteration(200), StopWhenChangeLess(1e-6), StopWhenGradientNormLess(1e-6))\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Base.:|-Union{Tuple{T}, Tuple{S}, Tuple{S, T}} where {S<:StoppingCriterion, T<:StoppingCriterion}","page":"Stopping Criteria","title":"Base.:|","text":"|(s1,s2)\ns1 | s2\n\nCombine two StoppingCriterion within a StopWhenAny. If either s1 (or s2) is already a StopWhenAny, then s2 (or s1) is appended to the list of StoppingCriterion within s1 (or s2).\n\nExample\n\na = StopAfterIteration(200) | StopWhenChangeLess(1e-6)\nb = a | StopWhenGradientNormLess(1e-6)\n\nIs the same as\n\na = StopWhenAny(StopAfterIteration(200), StopWhenChangeLess(1e-6))\nb = StopWhenAny(StopAfterIteration(200), StopWhenChangeLess(1e-6), StopWhenGradientNormLess(1e-6))\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_active_stopping_criteria-Tuple{sCS} where sCS<:StoppingCriterionSet","page":"Stopping Criteria","title":"Manopt.get_active_stopping_criteria","text":"get_active_stopping_criteria(c)\n\nreturns all active stopping criteria, if any, that are within a StoppingCriterion c and indicate a stop, i.e. their reason is nonempty. To be precise, for a simple stopping criterion, this returns either an empty array if no stop is indicated or the stopping criterion as the only element of an array. 
For a StoppingCriterionSet, all internal (even nested) criteria that indicate to stop are returned.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_reason-Tuple{AbstractManoptSolverState}","page":"Stopping Criteria","title":"Manopt.get_reason","text":"get_reason(o)\n\nreturn the current reason stored within the StoppingCriterion from within the AbstractManoptSolverState. This reason is empty if the criterion has never been met.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_reason-Tuple{sC} where sC<:StoppingCriterion","page":"Stopping Criteria","title":"Manopt.get_reason","text":"get_reason(c)\n\nreturn the current reason stored within a StoppingCriterion c. This reason is empty if the criterion has never been met.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.get_stopping_criteria-Tuple{S} where S<:StoppingCriterionSet","page":"Stopping Criteria","title":"Manopt.get_stopping_criteria","text":"get_stopping_criteria(c)\n\nreturn the array of internally stored StoppingCriterions for a StoppingCriterionSet c.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.indicates_convergence-Tuple{StoppingCriterion}","page":"Stopping Criteria","title":"Manopt.indicates_convergence","text":"indicates_convergence(c::StoppingCriterion)\n\nReturn whether (true) or not (false) a StoppingCriterion always means that, when it indicates to stop, the solver has converged to a minimizer or critical point.\n\nNote that this is independent of the actual state of the stopping criterion, i.e. 
whether some of them indicate to stop, but a purely type-based, static decision.\n\nExamples\n\nWith s1=StopAfterIteration(20) and s2=StopWhenGradientNormLess(1e-7) we have\n\nindicates_convergence(s1) is false\nindicates_convergence(s2) is true\nindicates_convergence(s1 | s2) is false, since this might also stop after 20 iterations\nindicates_convergence(s1 & s2) is true, since s2 is fulfilled if this stops.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{Any, Any, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StoppingCriterion, s::Symbol, v::value)\nupdate_stopping_criterion!(s::AbstractManoptSolverState, symbol::Symbol, v::value)\nupdate_stopping_criterion!(c::StoppingCriterion, ::Val{Symbol}, v::value)\n\nUpdate a value within a stopping criterion, specified by the symbol s, to v. If a criterion does not have a value assigned that corresponds to s, the update is ignored.\n\nFor the second signature, the stopping criterion within the AbstractManoptSolverState o is updated.\n\nTo see which symbol updates which value, see the specific stopping criteria. 
They should use dispatch per symbol value (the third signature).\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopAfter, Val{:MaxTime}, Dates.Period}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopAfter, :MaxTime, v::Period)\n\nUpdate the time period after which an algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopAfterIteration, Val{:MaxIteration}, Int64}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopAfterIteration, :MaxIteration, v::Int)\n\nUpdate the number of iterations after which the algorithm should stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenChangeLess, Val{:MinIterateChange}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenChangeLess, :MinIterateChange, v::Int)\n\nUpdate the minimal change below which an algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenCostLess, Val{:MinCost}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenCostLess, :MinCost, v)\n\nUpdate the minimal cost below which the algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenGradientChangeLess, Val{:MinGradientChange}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenGradientChangeLess, :MinGradientChange, v)\n\nUpdate the minimal change below which an algorithm shall 
stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenGradientNormLess, Val{:MinGradNorm}, Float64}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenGradientNormLess, :MinGradNorm, v::Float64)\n\nUpdate the minimal gradient norm when an algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"plans/stopping_criteria/#Manopt.update_stopping_criterion!-Tuple{StopWhenStepsizeLess, Val{:MinStepsize}, Any}","page":"Stopping Criteria","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenStepsizeLess, :MinStepsize, v)\n\nUpdate the minimal step size below which the algorithm shall stop.\n\n\n\n\n\n","category":"method"},{"location":"tutorials/HowToRecord/#How-to-Record-Data-During-the-Iterations","page":"Record values","title":"How to Record Data During the Iterations","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"The recording and debugging features make it possible to record nearly any data during the iterations. This tutorial illustrates how to:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"record one value during the iterations;\nrecord multiple values during the iterations and access them afterwards;\ndefine your own RecordAction to perform individual recordings.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Several predefined recordings exist, for example RecordCost or RecordGradient, if the problem the solver uses provides a gradient. For fields of the State the recording can also be done using RecordEntry. 
For other recordings, for example more advanced computations before storing a value, an own RecordAction can be defined.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We illustrate these using the gradient descent from the Get Started: Optimize! tutorial.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Here we focus on ways to investigate the behaviour during iterations by using Recording techniques.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Let’s first load the necessary packages.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"using Manopt, Manifolds, Random\nRandom.seed!(42);","category":"page"},{"location":"tutorials/HowToRecord/#The-Objective","page":"Record values","title":"The Objective","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We generate data and define our cost and gradient:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Random.seed!(42)\nm = 30\nM = Sphere(m)\nn = 800\nσ = π / 8\nx = zeros(Float64, m + 1)\nx[2] = 1.0\ndata = [exp(M, x, σ * rand(M; vector_at=x)) for i in 1:n]\nf(M, p) = sum(1 / (2 * n) * distance.(Ref(M), Ref(p), data) .^ 2)\ngrad_f(M, p) = sum(1 / n * grad_distance.(Ref(M), data, Ref(p)))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"grad_f (generic function with 1 method)","category":"page"},{"location":"tutorials/HowToRecord/#Plain-Examples","page":"Record values","title":"Plain Examples","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"For the high level interfaces of the solvers, like 
gradient_descent, we have to set return_state to true to obtain the whole solver state and not only the resulting minimizer.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Then we can easily use the record= option to add recorded values. This keyword accepts RecordActions as well as several symbols as shortcuts, for example :Cost to record the cost, or if your options have a field f, :f would record that entry. An overview of the symbols that can be used is given here.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We first just record the cost after every iteration","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R = gradient_descent(M, f, grad_f, data[1]; record=:Cost, return_state=true)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 63 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLinesearch() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Record\n(Iteration = RecordCost(),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"From the returned state, we see that the GradientDescentState is encapsulated (decorated) within a RecordSolverState.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"For such a state, one can attach different recorders to some operations, currently to :Start, 
:Stop, and :Iteration, where :Iteration is the default when using the record= keyword with a RecordAction as above. We can access all values recorded during the iterations by calling get_record(R, :Iteration) or, since this is the default, even shorter","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"63-element Vector{Float64}:\n 0.6868754085841272\n 0.6240211444102516\n 0.5900374782569905\n 0.5691425134106757\n 0.5512819383843195\n 0.542136810022984\n 0.5374585627386623\n 0.5350045365259574\n 0.5337243124406587\n 0.5330491236590466\n 0.5326944302021914\n 0.5325071127227716\n 0.5324084047176342\n ⋮\n 0.5322977905736713\n 0.5322977905736701\n 0.5322977905736692\n 0.5322977905736687\n 0.5322977905736684\n 0.5322977905736682\n 0.5322977905736682\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736679","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"To record more than one value, you can pass an array of a mix of symbols and RecordActions which formally introduces RecordGroup. 
Such a group records a tuple of values in every iteration:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R2 = gradient_descent(M, f, grad_f, data[1]; record=[:Iteration, :Cost], return_state=true)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 63 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLinesearch() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Record\n(Iteration = RecordGroup([RecordIteration(), RecordCost()]),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Here, the symbol :Cost is mapped to using the RecordCost action. The same holds for :Iteration, which obviously records the current iteration number i. To access these you can first extract the group of records (that is where the :Iterations are recorded – note the plural) and then access the :Cost","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record_action(R2, :Iteration)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"RecordGroup([RecordIteration(), RecordCost()])","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Since iteration is the default, we can also omit it here again. 
To access single recorded values, one can use","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record_action(R2)[:Cost]","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"63-element Vector{Float64}:\n 0.6868754085841272\n 0.6240211444102516\n 0.5900374782569905\n 0.5691425134106757\n 0.5512819383843195\n 0.542136810022984\n 0.5374585627386623\n 0.5350045365259574\n 0.5337243124406587\n 0.5330491236590466\n 0.5326944302021914\n 0.5325071127227716\n 0.5324084047176342\n ⋮\n 0.5322977905736713\n 0.5322977905736701\n 0.5322977905736692\n 0.5322977905736687\n 0.5322977905736684\n 0.5322977905736682\n 0.5322977905736682\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736679","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"This can also be done by using the high-level interface get_record","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R2, :Iteration, :Cost)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"63-element Vector{Float64}:\n 0.6868754085841272\n 0.6240211444102516\n 0.5900374782569905\n 0.5691425134106757\n 0.5512819383843195\n 0.542136810022984\n 0.5374585627386623\n 0.5350045365259574\n 0.5337243124406587\n 0.5330491236590466\n 0.5326944302021914\n 0.5325071127227716\n 0.5324084047176342\n ⋮\n 0.5322977905736713\n 0.5322977905736701\n 0.5322977905736692\n 0.5322977905736687\n 0.5322977905736684\n 0.5322977905736682\n 0.5322977905736682\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736681\n 0.5322977905736679","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Note that the first symbol again 
refers to the point where we record (not to the thing we record). We can also pass a tuple as the second argument to have our own order within the tuples returned. Switching the order of recorded cost and Iteration can be done using","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R2, :Iteration, (:Iteration, :Cost))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"63-element Vector{Tuple{Int64, Float64}}:\n (1, 0.6868754085841272)\n (2, 0.6240211444102516)\n (3, 0.5900374782569905)\n (4, 0.5691425134106757)\n (5, 0.5512819383843195)\n (6, 0.542136810022984)\n (7, 0.5374585627386623)\n (8, 0.5350045365259574)\n (9, 0.5337243124406587)\n (10, 0.5330491236590466)\n (11, 0.5326944302021914)\n (12, 0.5325071127227716)\n (13, 0.5324084047176342)\n ⋮\n (52, 0.5322977905736713)\n (53, 0.5322977905736701)\n (54, 0.5322977905736692)\n (55, 0.5322977905736687)\n (56, 0.5322977905736684)\n (57, 0.5322977905736682)\n (58, 0.5322977905736682)\n (59, 0.5322977905736681)\n (60, 0.5322977905736681)\n (61, 0.5322977905736681)\n (62, 0.5322977905736681)\n (63, 0.5322977905736679)","category":"page"},{"location":"tutorials/HowToRecord/#A-more-Complex-Example","page":"Record values","title":"A more Complex Example","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"To illustrate a complicated example let’s record:\n\nthe iteration number, cost and gradient field, but only every sixth iteration;\nthe iteration at which we stop.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We first generate the problem and the state, to also illustrate how the low level works when not using the high-level interface gradient_descent.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record 
values","text":"p = DefaultManoptProblem(M, ManifoldGradientObjective(f, grad_f))\ns = GradientDescentState(\n M,\n copy(data[1]);\n stopping_criterion=StopAfterIteration(200) | StopWhenGradientNormLess(10.0^-9),\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLinesearch() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: not reached\nOverall: not reached\nThis indicates convergence: No","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We now first build a RecordGroup to group the three entries we want to record per iteration. 
We then put this into a RecordEvery to only record this every 6th iteration","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"rI = RecordEvery(\n    RecordGroup([\n        :Iteration => RecordIteration(),\n        :Cost => RecordCost(),\n        :Gradient => RecordEntry(similar(data[1]), :X),\n    ]),\n    6,\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"RecordEvery(RecordGroup([RecordIteration(), RecordCost(), RecordEntry(:X)]), 6, true)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and for recording the final iteration number","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"sI = RecordIteration()","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"RecordIteration()","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We now combine both into the RecordSolverState decorator. It acts completely the same as any AbstractManoptSolverState but additionally records something in every iteration. 
This is stored in a dictionary of RecordActions, where :Iteration maps to the action that records only every 6th iteration, and :Stop maps to sI, which is executed when the solver stops.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Note that the keyword record= in the high level interface gradient_descent would only fill the :Iteration symbol of said dictionary.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"r = RecordSolverState(s, Dict(:Iteration => rI, :Stop => sI))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLinesearch() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: not reached\nOverall: not reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordEvery(RecordGroup([RecordIteration(), RecordCost(), RecordEntry(:X)]), 6, true), Stop = RecordIteration())","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We now call the solver","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"res = solve!(p, r)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 63 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLinesearch() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor 
= 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Record\n(Iteration = RecordEvery(RecordGroup([RecordIteration(), RecordCost(), RecordEntry(:X)]), 6, true), Stop = RecordIteration())","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"And we can check the recorded value at :Stop to see how many iterations were performed","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(res, :Stop)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"1-element Vector{Int64}:\n 63","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and the other values during the iterations are","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(res, :Iteration, (:Iteration, :Cost))","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"10-element Vector{Tuple{Int64, Float64}}:\n (6, 0.542136810022984)\n (12, 0.5325071127227716)\n (18, 0.5323023757104095)\n (24, 0.5322978928223224)\n (30, 0.5322977928970518)\n (36, 0.5322977906274987)\n (42, 0.5322977905749401)\n (48, 0.5322977905736989)\n (54, 0.5322977905736692)\n (60, 0.5322977905736681)","category":"page"},{"location":"tutorials/HowToRecord/#Writing-an-own-[RecordAction](https://manoptjl.org/stable/plans/record/#Manopt.RecordAction)s","page":"Record values","title":"Writing an own RecordActions","text":"","category":"section"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Let’s investigate where we want to count the number of function evaluations, 
again just to illustrate, since for the gradient this is just one evaluation per iteration. We first define a cost that counts its own calls.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"mutable struct MyCost{T}\n    data::T\n    count::Int\nend\nMyCost(data::T) where {T} = MyCost{T}(data, 0)\nfunction (c::MyCost)(M, x)\n    c.count += 1\n    return sum(1 / (2 * length(c.data)) * distance.(Ref(M), Ref(x), c.data) .^ 2)\nend","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and we define our own, new RecordAction, which is a functor, i.e. a struct that is also a function. The function we have to implement is similar to a single solver step in signature, since it might get called every iteration:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"mutable struct RecordCount <: RecordAction\n    recorded_values::Vector{Int}\n    RecordCount() = new(Vector{Int}())\nend\nfunction (r::RecordCount)(p::AbstractManoptProblem, ::AbstractManoptSolverState, i)\n    if i > 0\n        push!(r.recorded_values, Manopt.get_cost_function(get_objective(p)).count)\n    elseif i < 0 # reset if negative\n        r.recorded_values = Vector{Int}()\n    end\nend","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Now we can initialize the new cost and call the gradient descent. 
Note that this illustrates also the last use case – you can pass symbol-action pairs into the record=array.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"f2 = MyCost(data)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"MyCost{Vector{Vector{Float64}}}([[-0.054658825167894595, -0.5592077846510423, -0.04738273828111257, -0.04682080720921302, 0.12279468849667038, 0.07171438895366239, -0.12930045409417057, -0.22102081626380404, -0.31805333254577767, 0.0065859500152017645 … -0.21999168261518043, 0.19570142227077295, 0.340909965798364, -0.0310802190082894, -0.04674431076254687, -0.006088297671169996, 0.01576037011323387, -0.14523596850249543, 0.14526158060820338, 0.1972125856685378], [-0.08192376929745249, -0.5097715132187676, -0.008339904915541005, 0.07289741328038676, 0.11422036270613797, -0.11546739299835748, 0.2296996932628472, 0.1490467170835958, -0.11124820565850364, -0.11790721606521781 … -0.16421249630470344, -0.2450575844467715, -0.07570080850379841, -0.07426218324072491, -0.026520181327346338, 0.11555341205250205, -0.0292955762365121, -0.09012096853677576, -0.23470556634911574, -0.026214242996704013], [-0.22951484264859257, -0.6083825348640186, 0.14273766477054015, -0.11947823367023377, 0.05984293499234536, 0.058820835498203126, 0.07577331705863266, 0.1632847202946857, 0.20244385489915745, 0.04389826920203656 … 0.3222365119325929, 0.009728730325524067, -0.12094785371632395, -0.36322323926212824, -0.0689253407939657, 0.23356953371702974, 0.23489531397909744, 0.078303336494718, -0.14272984135578806, 0.07844539956202407], [-0.0012588500237817606, -0.29958740415089763, 0.036738459489123514, 0.20567651907595125, -0.1131046432541904, -0.06032435985370224, 0.3366633723165895, -0.1694687746143405, -0.001987171245125281, 0.04933779858684409 … -0.2399584473006256, 0.19889267065775063, 0.22468755918787048, 0.1780090580180643, 
0.023703860700539356, -0.10212737517121755, 0.03807004103115319, -0.20569120952458983, -0.03257704254233959, 0.06925473452536687], [-0.035534309946938375, -0.06645560787329002, 0.14823972268208874, -0.23913346587232426, 0.038347027875883496, 0.10453333143286662, 0.050933995140290705, -0.12319549375687473, 0.12956684644537844, -0.23540367869989412 … -0.41471772859912864, -0.1418984610380257, 0.0038321446836859334, 0.23655566917750157, -0.17500681300994742, -0.039189751036839374, -0.08687860620942896, -0.11509948162959047, 0.11378233994840942, 0.38739450723013735], [-0.3122539912469438, -0.3101935557860296, 0.1733113629107006, 0.08968593616209351, -0.1836344261367962, -0.06480023695256802, 0.18165070013886545, 0.19618275767992124, -0.07956460275570058, 0.0325997354656551 … 0.2845492418767769, 0.17406455870721682, -0.053101230371568706, -0.1382082812981627, 0.005830071475508364, 0.16739264037923055, 0.034365814374995335, 0.09107702398753297, -0.1877250428700409, 0.05116494897806923], [-0.04159442361185588, -0.7768029783272633, 0.06303616666722486, 0.08070518925253539, -0.07396265237309446, -0.06008109299719321, 0.07977141629715745, 0.019511027129056415, 0.08629917589924847, -0.11156298867318722 … 0.0792587504128044, -0.016444383900170008, -0.181746064577005, -0.01888129512990984, -0.13523922089388968, 0.11358102175659832, 0.07929049608459493, 0.1689565359083833, 0.07673657951723721, -0.1128480905648813], [-0.21221814304651335, -0.5031823821503253, 0.010326342133992458, -0.12438192100961257, 0.04004758695231872, 0.2280527500843805, -0.2096243232022162, -0.16564828762420294, -0.28325749481138984, 0.17033534605245823 … -0.13599096505924074, 0.28437770540525625, 0.08424426798544583, -0.1266207606984139, 0.04917635557603396, -0.00012608938533809706, -0.04283220254770056, -0.08771365647566572, 0.14750169103093985, 0.11601120086036351], [0.10683290707435536, -0.17680836277740156, 0.23767458301899405, 0.12011180867097299, -0.029404774462600154, 0.11522028383799933, 
-0.3318174480974519, -0.17859266746938374, 0.04352373642537759, 0.2530382802667988 … 0.08879861736692073, -0.004412506987801729, 0.19786810509925895, -0.1397104682727044, 0.09482328498485094, 0.05108149065160893, -0.14578343506951633, 0.3167479772660438, 0.10422673169182732, 0.21573150015891313], [-0.024895624707466164, -0.7473912016432697, -0.1392537238944721, -0.14948896791465557, -0.09765393283580377, 0.04413059403279867, -0.13865379004720355, -0.071032040283992, 0.15604054722246585, -0.10744260463413555 … -0.14748067081342833, -0.14743635071251024, 0.0643591937981352, 0.16138827697852615, -0.12656652133603935, -0.06463635704869083, 0.14329582429103488, -0.01113113793821713, 0.29295387893749997, 0.06774523575259782] … [0.011874845316569967, -0.6910596618389588, 0.21275741439477827, -0.014042545524367437, -0.07883613103495014, -0.0021900966696246776, -0.033836430464220496, 0.2925813113264835, -0.04718187201980008, 0.03949680289730036 … 0.0867736586603294, 0.0404682510051544, -0.24779813848587257, -0.28631514602877145, -0.07211767532456789, -0.15072898498180473, 0.017855923621826746, -0.09795357710255254, -0.14755229203084924, 0.1305005778855436], [0.013457629515450426, -0.3750353654626534, 0.12349883726772073, 0.3521803555005319, 0.2475921439420274, 0.006088649842999206, 0.31203183112392907, -0.036869203979483754, -0.07475746464056504, -0.029297797064479717 … 0.16867368684091563, -0.09450564983271922, -0.0587273302122711, -0.1326667940553803, -0.25530237980444614, 0.37556905374043376, 0.04922612067677609, 0.2605362549983866, -0.21871556587505667, -0.22915883767386164], [0.03295085436260177, -0.971861604433394, 0.034748713521512035, -0.0494065013245799, -0.01767479281403355, 0.0465459739459587, 0.007470494722096038, 0.003227960072276129, 0.0058328596338402365, -0.037591237446692356 … 0.03205152122876297, 0.11331109854742015, 0.03044900529526686, 0.017971704993311105, -0.009329252062960229, -0.02939354719650879, 0.022088835776251863, -0.02546111553658854, 
-0.0026257225461427582, 0.005702111697172774], [0.06968243992532257, -0.7119502191435176, -0.18136614593117445, -0.1695926215673451, 0.01725015359973796, -0.00694164951158388, -0.34621134287344574, 0.024709256792651912, -0.1632255805999673, -0.2158226433583082 … -0.14153772108081458, -0.11256850346909901, 0.045109821764180706, -0.1162754336222613, -0.13221711766357983, 0.005365354776191061, 0.012750671705879105, -0.018208207549835407, 0.12458753932455452, -0.31843587960340897], [-0.19830349374441875, -0.6086693423968884, 0.08552341811170468, 0.35781519334042255, 0.15790663648524367, 0.02712571268324985, 0.09855601327331667, -0.05840653973421127, -0.09546429767790429, -0.13414717696055448 … -0.0430935804718714, 0.2678584478951765, 0.08780994289014614, 0.01613469379498457, 0.0516187906322884, -0.07383067566731401, -0.1481272738354552, -0.010532317187265649, 0.06555344745952187, -0.1506167863762911], [-0.04347524125197773, -0.6327981074196994, -0.221116680035191, 0.0282207467940456, -0.0855024881522933, 0.12821801740178346, 0.1779499563280024, -0.10247384887512365, 0.0396432464100116, -0.0582580338112627 … 0.1253893207083573, 0.09628202269764763, 0.3165295473947355, -0.14915034201394833, -0.1376727867817772, -0.004153096613530293, 0.09277957650773738, 0.05917264554031624, -0.12230262590034507, -0.19655728521529914], [-0.10173946348675116, -0.6475660153977272, 0.1260284619729566, -0.11933160462857616, -0.04774310633937567, 0.09093928358804217, 0.041662676324043114, -0.1264739543938265, 0.09605293126911392, -0.16790474428001648 … -0.04056684573478108, 0.09351665120940456, 0.15259195558799882, 0.0009949298312580497, 0.09461980828206303, 0.3067004514287283, 0.16129258773733715, -0.18893664085007542, -0.1806865244492513, 0.029319680436405825], [-0.251780954320053, -0.39147463259941456, -0.24359579328578626, 0.30179309757665723, 0.21658893985206484, 0.12304585275893232, 0.28281133086451704, 0.029187615341955325, 0.03616243507191924, 0.029375588909979152 … 
-0.08071746662465404, -0.2176101928258658, 0.20944684921170825, 0.043033273425352715, -0.040505542460853576, 0.17935596149079197, -0.08454569418519972, 0.0545941597033932, 0.12471741052450099, -0.24314124407858329], [0.28156471341150974, -0.6708572780452595, -0.1410302363738465, -0.08322589397277698, -0.022772599832907418, -0.04447265789199677, -0.016448068022011157, -0.07490911512503738, 0.2778432295769144, -0.10191899088372378 … -0.057272155080983836, 0.12817478092201395, 0.04623814480781884, -0.12184190164369117, 0.1987855635987229, -0.14533603246124993, -0.16334072868597016, -0.052369977381939437, 0.014904286931394959, -0.2440882678882144], [0.12108727495744157, -0.714787344982596, 0.01632521838262752, 0.04437570556908449, -0.041199280304144284, 0.052984488452616, 0.03796520200156107, 0.2791785910964288, 0.11530429924056099, 0.12178223160398421 … -0.07621847481721669, 0.18353870423743013, -0.19066653731436745, -0.09423224997242206, 0.14596847781388494, -0.09747986927777111, 0.16041150122587072, -0.02296513951256738, 0.06786878373578588, 0.15296635978447756]], 0)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"Now for the plain gradient descent, we have to modify the step (to a constant stepsize) and remove the default check whether the cost increases (setting debug to []). We also only look at the first 20 iterations to keep this example small in recorded values. 
We call","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R3 = gradient_descent(\n    M,\n    f2,\n    grad_f,\n    data[1];\n    record=[:Iteration, :Count => RecordCount(), :Cost],\n    stepsize = ConstantStepsize(1.0),\n    stopping_criterion=StopAfterIteration(20),\n    debug=[],\n    return_state=true,\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 20 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nConstantStepsize(1.0, relative)\n\n## Stopping Criterion\nMax Iteration 20: reached\nThis indicates convergence: No\n\n## Record\n(Iteration = RecordGroup([RecordIteration(), RecordCount([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19]), RecordCost()]),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"For :Cost we already learned how to access them; the :Count => introduces the following action to obtain the :Count. 
We can again access the whole set of records","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R3)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"20-element Vector{Tuple{Int64, Int64, Float64}}:\n (1, 0, 0.5808287253777765)\n (2, 1, 0.5395268557323746)\n (3, 2, 0.5333529073733115)\n (4, 3, 0.5324514620174543)\n (5, 4, 0.5323201743667151)\n (6, 5, 0.5323010518577256)\n (7, 6, 0.5322982658416161)\n (8, 7, 0.532297859847447)\n (9, 8, 0.5322978006725337)\n (10, 9, 0.5322977920461375)\n (11, 10, 0.5322977907883957)\n (12, 11, 0.5322977906049865)\n (13, 12, 0.5322977905782369)\n (14, 13, 0.532297790574335)\n (15, 14, 0.5322977905737657)\n (16, 15, 0.5322977905736823)\n (17, 16, 0.5322977905736703)\n (18, 17, 0.5322977905736688)\n (19, 18, 0.5322977905736683)\n (20, 19, 0.5322977905736683)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"this is equivalent to calling R3[:Iteration]. 
Note that since we introduced :Count we can also access a single recorded value using","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R3[:Iteration, :Count]","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"20-element Vector{Int64}:\n  0\n  1\n  2\n  3\n  4\n  5\n  6\n  7\n  8\n  9\n 10\n 11\n 12\n 13\n 14\n 15\n 16\n 17\n 18\n 19","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"and we see that the cost function is called once per iteration.","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"If we use this counting cost and run the default gradient descent with Armijo linesearch, we can infer how many Armijo linesearch backtracks are performed:","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"f3 = MyCost(data)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"MyCost{Vector{Vector{Float64}}}([[-0.054658825167894595, -0.5592077846510423, -0.04738273828111257, -0.04682080720921302, 0.12279468849667038, 0.07171438895366239, -0.12930045409417057, -0.22102081626380404, -0.31805333254577767, 0.0065859500152017645 … -0.21999168261518043, 0.19570142227077295, 0.340909965798364, -0.0310802190082894, -0.04674431076254687, -0.006088297671169996, 0.01576037011323387, -0.14523596850249543, 0.14526158060820338, 0.1972125856685378], [-0.08192376929745249, -0.5097715132187676, -0.008339904915541005, 0.07289741328038676, 0.11422036270613797, -0.11546739299835748, 0.2296996932628472, 0.1490467170835958, -0.11124820565850364, -0.11790721606521781 … -0.16421249630470344, -0.2450575844467715, -0.07570080850379841, -0.07426218324072491, -0.026520181327346338, 0.11555341205250205, -0.0292955762365121, -0.09012096853677576, 
-0.23470556634911574, -0.026214242996704013], [-0.22951484264859257, -0.6083825348640186, 0.14273766477054015, -0.11947823367023377, 0.05984293499234536, 0.058820835498203126, 0.07577331705863266, 0.1632847202946857, 0.20244385489915745, 0.04389826920203656 … 0.3222365119325929, 0.009728730325524067, -0.12094785371632395, -0.36322323926212824, -0.0689253407939657, 0.23356953371702974, 0.23489531397909744, 0.078303336494718, -0.14272984135578806, 0.07844539956202407], [-0.0012588500237817606, -0.29958740415089763, 0.036738459489123514, 0.20567651907595125, -0.1131046432541904, -0.06032435985370224, 0.3366633723165895, -0.1694687746143405, -0.001987171245125281, 0.04933779858684409 … -0.2399584473006256, 0.19889267065775063, 0.22468755918787048, 0.1780090580180643, 0.023703860700539356, -0.10212737517121755, 0.03807004103115319, -0.20569120952458983, -0.03257704254233959, 0.06925473452536687], [-0.035534309946938375, -0.06645560787329002, 0.14823972268208874, -0.23913346587232426, 0.038347027875883496, 0.10453333143286662, 0.050933995140290705, -0.12319549375687473, 0.12956684644537844, -0.23540367869989412 … -0.41471772859912864, -0.1418984610380257, 0.0038321446836859334, 0.23655566917750157, -0.17500681300994742, -0.039189751036839374, -0.08687860620942896, -0.11509948162959047, 0.11378233994840942, 0.38739450723013735], [-0.3122539912469438, -0.3101935557860296, 0.1733113629107006, 0.08968593616209351, -0.1836344261367962, -0.06480023695256802, 0.18165070013886545, 0.19618275767992124, -0.07956460275570058, 0.0325997354656551 … 0.2845492418767769, 0.17406455870721682, -0.053101230371568706, -0.1382082812981627, 0.005830071475508364, 0.16739264037923055, 0.034365814374995335, 0.09107702398753297, -0.1877250428700409, 0.05116494897806923], [-0.04159442361185588, -0.7768029783272633, 0.06303616666722486, 0.08070518925253539, -0.07396265237309446, -0.06008109299719321, 0.07977141629715745, 0.019511027129056415, 0.08629917589924847, -0.11156298867318722 … 
0.0792587504128044, -0.016444383900170008, -0.181746064577005, -0.01888129512990984, -0.13523922089388968, 0.11358102175659832, 0.07929049608459493, 0.1689565359083833, 0.07673657951723721, -0.1128480905648813], [-0.21221814304651335, -0.5031823821503253, 0.010326342133992458, -0.12438192100961257, 0.04004758695231872, 0.2280527500843805, -0.2096243232022162, -0.16564828762420294, -0.28325749481138984, 0.17033534605245823 … -0.13599096505924074, 0.28437770540525625, 0.08424426798544583, -0.1266207606984139, 0.04917635557603396, -0.00012608938533809706, -0.04283220254770056, -0.08771365647566572, 0.14750169103093985, 0.11601120086036351], [0.10683290707435536, -0.17680836277740156, 0.23767458301899405, 0.12011180867097299, -0.029404774462600154, 0.11522028383799933, -0.3318174480974519, -0.17859266746938374, 0.04352373642537759, 0.2530382802667988 … 0.08879861736692073, -0.004412506987801729, 0.19786810509925895, -0.1397104682727044, 0.09482328498485094, 0.05108149065160893, -0.14578343506951633, 0.3167479772660438, 0.10422673169182732, 0.21573150015891313], [-0.024895624707466164, -0.7473912016432697, -0.1392537238944721, -0.14948896791465557, -0.09765393283580377, 0.04413059403279867, -0.13865379004720355, -0.071032040283992, 0.15604054722246585, -0.10744260463413555 … -0.14748067081342833, -0.14743635071251024, 0.0643591937981352, 0.16138827697852615, -0.12656652133603935, -0.06463635704869083, 0.14329582429103488, -0.01113113793821713, 0.29295387893749997, 0.06774523575259782] … [0.011874845316569967, -0.6910596618389588, 0.21275741439477827, -0.014042545524367437, -0.07883613103495014, -0.0021900966696246776, -0.033836430464220496, 0.2925813113264835, -0.04718187201980008, 0.03949680289730036 … 0.0867736586603294, 0.0404682510051544, -0.24779813848587257, -0.28631514602877145, -0.07211767532456789, -0.15072898498180473, 0.017855923621826746, -0.09795357710255254, -0.14755229203084924, 0.1305005778855436], [0.013457629515450426, -0.3750353654626534, 
0.12349883726772073, 0.3521803555005319, 0.2475921439420274, 0.006088649842999206, 0.31203183112392907, -0.036869203979483754, -0.07475746464056504, -0.029297797064479717 … 0.16867368684091563, -0.09450564983271922, -0.0587273302122711, -0.1326667940553803, -0.25530237980444614, 0.37556905374043376, 0.04922612067677609, 0.2605362549983866, -0.21871556587505667, -0.22915883767386164], [0.03295085436260177, -0.971861604433394, 0.034748713521512035, -0.0494065013245799, -0.01767479281403355, 0.0465459739459587, 0.007470494722096038, 0.003227960072276129, 0.0058328596338402365, -0.037591237446692356 … 0.03205152122876297, 0.11331109854742015, 0.03044900529526686, 0.017971704993311105, -0.009329252062960229, -0.02939354719650879, 0.022088835776251863, -0.02546111553658854, -0.0026257225461427582, 0.005702111697172774], [0.06968243992532257, -0.7119502191435176, -0.18136614593117445, -0.1695926215673451, 0.01725015359973796, -0.00694164951158388, -0.34621134287344574, 0.024709256792651912, -0.1632255805999673, -0.2158226433583082 … -0.14153772108081458, -0.11256850346909901, 0.045109821764180706, -0.1162754336222613, -0.13221711766357983, 0.005365354776191061, 0.012750671705879105, -0.018208207549835407, 0.12458753932455452, -0.31843587960340897], [-0.19830349374441875, -0.6086693423968884, 0.08552341811170468, 0.35781519334042255, 0.15790663648524367, 0.02712571268324985, 0.09855601327331667, -0.05840653973421127, -0.09546429767790429, -0.13414717696055448 … -0.0430935804718714, 0.2678584478951765, 0.08780994289014614, 0.01613469379498457, 0.0516187906322884, -0.07383067566731401, -0.1481272738354552, -0.010532317187265649, 0.06555344745952187, -0.1506167863762911], [-0.04347524125197773, -0.6327981074196994, -0.221116680035191, 0.0282207467940456, -0.0855024881522933, 0.12821801740178346, 0.1779499563280024, -0.10247384887512365, 0.0396432464100116, -0.0582580338112627 … 0.1253893207083573, 0.09628202269764763, 0.3165295473947355, -0.14915034201394833, 
-0.1376727867817772, -0.004153096613530293, 0.09277957650773738, 0.05917264554031624, -0.12230262590034507, -0.19655728521529914], [-0.10173946348675116, -0.6475660153977272, 0.1260284619729566, -0.11933160462857616, -0.04774310633937567, 0.09093928358804217, 0.041662676324043114, -0.1264739543938265, 0.09605293126911392, -0.16790474428001648 … -0.04056684573478108, 0.09351665120940456, 0.15259195558799882, 0.0009949298312580497, 0.09461980828206303, 0.3067004514287283, 0.16129258773733715, -0.18893664085007542, -0.1806865244492513, 0.029319680436405825], [-0.251780954320053, -0.39147463259941456, -0.24359579328578626, 0.30179309757665723, 0.21658893985206484, 0.12304585275893232, 0.28281133086451704, 0.029187615341955325, 0.03616243507191924, 0.029375588909979152 … -0.08071746662465404, -0.2176101928258658, 0.20944684921170825, 0.043033273425352715, -0.040505542460853576, 0.17935596149079197, -0.08454569418519972, 0.0545941597033932, 0.12471741052450099, -0.24314124407858329], [0.28156471341150974, -0.6708572780452595, -0.1410302363738465, -0.08322589397277698, -0.022772599832907418, -0.04447265789199677, -0.016448068022011157, -0.07490911512503738, 0.2778432295769144, -0.10191899088372378 … -0.057272155080983836, 0.12817478092201395, 0.04623814480781884, -0.12184190164369117, 0.1987855635987229, -0.14533603246124993, -0.16334072868597016, -0.052369977381939437, 0.014904286931394959, -0.2440882678882144], [0.12108727495744157, -0.714787344982596, 0.01632521838262752, 0.04437570556908449, -0.041199280304144284, 0.052984488452616, 0.03796520200156107, 0.2791785910964288, 0.11530429924056099, 0.12178223160398421 … -0.07621847481721669, 0.18353870423743013, -0.19066653731436745, -0.09423224997242206, 0.14596847781388494, -0.09747986927777111, 0.16041150122587072, -0.02296513951256738, 0.06786878373578588, 0.15296635978447756]], 0)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"To not get too many entries 
let’s just look at the first 20 iterations again","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"R4 = gradient_descent(\n M,\n f3,\n grad_f,\n data[1];\n record=[:Count => RecordCount()],\n return_state=true,\n)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"# Solver state for `Manopt.jl`s Gradient Descent\nAfter 63 iterations\n\n## Parameters\n* retraction method: ExponentialRetraction()\n\n## Stepsize\nArmijoLinesearch() with keyword parameters\n * initial_stepsize = 1.0\n * retraction_method = ExponentialRetraction()\n * contraction_factor = 0.95\n * sufficient_decrease = 0.1\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 200: not reached\n |grad f| < 1.0e-9: reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Record\n(Iteration = RecordGroup([RecordCount([25, 29, 33, 37, 40, 44, 48, 52, 56, 60, 64, 68, 72, 76, 80, 84, 88, 92, 96, 100, 104, 108, 112, 116, 120, 124, 128, 132, 136, 140, 144, 148, 152, 156, 160, 164, 168, 172, 176, 180, 184, 188, 192, 196, 200, 204, 208, 212, 216, 220, 224, 229, 232, 236, 240, 242, 247, 254, 263, 268, 270, 272, 278])]),)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"get_record(R4)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"63-element Vector{Tuple{Int64}}:\n (25,)\n (29,)\n (33,)\n (37,)\n (40,)\n (44,)\n (48,)\n (52,)\n (56,)\n (60,)\n (64,)\n (68,)\n (72,)\n ⋮\n (229,)\n (232,)\n (236,)\n (240,)\n (242,)\n (247,)\n (254,)\n (263,)\n (268,)\n (270,)\n (272,)\n (278,)","category":"page"},{"location":"tutorials/HowToRecord/","page":"Record values","title":"Record values","text":"We can see that the number of cost function calls varies, depending on how many linesearch backtrack steps were required to obtain a good 
stepsize.","category":"page"},{"location":"solvers/ChambollePock/#ChambollePockSolver","page":"Chambolle-Pock","title":"The Riemannian Chambolle-Pock Algorithm","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"The Riemannian Chambolle–Pock algorithm is a generalization of the Chambolle–Pock algorithm introduced by Chambolle and Pock [CP11]. It is also known as the primal-dual hybrid gradient (PDHG) or primal-dual proximal splitting (PDPS) algorithm.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"In order to minimize over pmathcal M the cost function consisting of","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"F(p) + G(Λ(p))","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"where Fmathcal M overlineℝ, Gmathcal N overlineℝ, and Λmathcal M mathcal N. If the manifolds mathcal M or mathcal N are not Hadamard, it has to be considered locally, i.e. on geodesically convex sets mathcal C subset mathcal M and mathcal D subsetmathcal N such that Λ(mathcal C) subset mathcal D.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"The algorithm is available in four variants: exact versus linearized (see variant) as well as with primal versus dual relaxation (see relax). For more details, see Bergmann, Herzog, Silva Louzeiro, Tenbrinck and Vidal-Núñez [BHS+21]. 
In the following we state the case of the exact, primal relaxed Riemannian Chambolle–Pock algorithm.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Given base points mmathcal C, n=Λ(m)mathcal D, initial primal and dual values p^(0) mathcal C, ξ_n^(0) T_n^*mathcal N, and primal and dual step sizes sigma_0, tau_0, relaxation theta_0, as well as acceleration gamma.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"As an initialization, perform bar p^(0) gets p^(0).","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"The algorithm performs the following steps for k=1,2,… until a StoppingCriterion is fulfilled:","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"ξ^(k+1)_n = operatornameprox_tau_k G_n^*Bigl(ξ_n^(k) + tau_k bigl(log_n Λ (bar p^(k))bigr)^flatBigr)\np^(k+1) = operatornameprox_sigma_k Fbiggl(exp_p^(k)Bigl( operatornamePT_p^(k)gets mbigl(-sigma_k DΛ(m)^*ξ_n^(k+1)bigr)^sharpBigr)biggr)\nUpdate\ntheta_k = (1+2gammasigma_k)^-frac12\nsigma_k+1 = sigma_ktheta_k\ntau_k+1 = fractau_ktheta_k\nbar p^(k+1) = exp_p^(k+1)bigl(-theta_k log_p^(k+1) p^(k)bigr)","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Furthermore, you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction, and a vector transport, respectively.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Finally, you can also update the base points m and n during the iterations. This introduces a few additional vector transports. The same holds for the case Λ(m^(k))neq n^(k) at some point. 
All these cases are covered in the algorithm.","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"ChambollePock\nChambollePock!","category":"page"},{"location":"solvers/ChambollePock/#Manopt.ChambollePock","page":"Chambolle-Pock","title":"Manopt.ChambollePock","text":"ChambollePock(\n M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_linear_operator;\n forward_operator=missing,\n linearized_forward_operator=missing,\n evaluation=AllocatingEvaluation()\n)\n\nPerform the Riemannian Chambolle–Pock algorithm.\n\nGiven a cost function mathcal Emathcal M ℝ of the form\n\nmathcal E(p) = F(p) + G( Λ(p) )\n\nwhere Fmathcal M ℝ, Gmathcal N ℝ, and Λmathcal M mathcal N. The remaining input parameters are\n\np, X primal and dual start points xmathcal M and ξT_nmathcal N\nm,n base points on mathcal M and mathcal N, respectively.\nadjoint_linearized_operator the adjoint DΛ^* of the linearized operator DΛ(m) T_mmathcal M T_Λ(m)mathcal N\nprox_F, prox_G_Dual the proximal maps of F and G^ast_n\n\nNote that, depending on the AbstractEvaluationType evaluation, the last three parameters as well as the forward operator Λ and the linearized_forward_operator can be given as allocating functions (Manifolds, parameters) -> result or as mutating functions (Manifold, result, parameters) -> result to spare allocations.\n\nBy default, this performs the exact Riemannian Chambolle–Pock algorithm, see the optional parameter DΛ for its linearized variant.\n\nFor more details on the algorithm, see Bergmann et al., Found. Comput. 
Math., 2021.\n\nOptional Parameters\n\nacceleration – (0.05)\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nevaluation (AllocatingEvaluation()) specify whether the proximal maps and operators are allocating functions (Manifolds, parameters) -> result or given as mutating functions (Manifold, result, parameters) -> result to spare allocations.\nΛ (missing) the (forward) operator Λ() (required for the :exact variant)\nlinearized_forward_operator (missing) its linearization DΛ() (required for the :linearized variant)\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\nrelaxation – (1.)\nrelax – (:primal) whether to relax the primal or dual\nvariant - (:exact if Λ is missing, otherwise :linearized) variant to use. Note that this changes the arguments the forward_operator is called with.\nstopping_criterion – (StopAfterIteration(100)) a StoppingCriterion\nupdate_primal_base – (missing) function to update m (identity by default/missing)\nupdate_dual_base – (missing) function to update n (identity by default/missing)\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.ChambollePock!","page":"Chambolle-Pock","title":"Manopt.ChambollePock!","text":"ChambollePock!(M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_linear_operator)\n\nPerform the Riemannian Chambolle–Pock algorithm in place of x, ξ, and potentially m, n if they are not fixed. 
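A call of the solver might, as a sketch, look as follows; all arguments (manifolds, cost, proximal maps, adjoint operator, and start values) are placeholders the user has to provide, and the keyword names are taken from the docstring above:

```julia
# Hypothetical call sketch of the Riemannian Chambolle–Pock solver;
# M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_DΛ and DΛ
# are user-supplied placeholders.
p_star = ChambollePock(
    M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_DΛ;
    linearized_forward_operator=DΛ,  # providing DΛ selects the :linearized variant
    relax=:primal,
)
```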
See ChambollePock for details and optional parameters.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#State","page":"Chambolle-Pock","title":"State","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"ChambollePockState","category":"page"},{"location":"solvers/ChambollePock/#Manopt.ChambollePockState","page":"Chambolle-Pock","title":"Manopt.ChambollePockState","text":"ChambollePockState <: AbstractPrimalDualSolverState\n\nstores all options and variables within a linearized or exact Chambolle Pock. The following list provides the order for the constructor, where the previous iterates are initialized automatically and values with a default may be left out.\n\nm - base point on mathcal M\nn - base point on mathcal N\np - an initial point on x^(0) mathcal M (and its previous iterate)\nX - an initial tangent vector X^(0)T^*mathcal N (and its previous iterate)\npbar - the relaxed iterate used in the next dual update step (when using :primal relaxation)\nXbar - the relaxed iterate used in the next primal update step (when using :dual relaxation)\nprimal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox\ndual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox\nacceleration – (0.) acceleration factor due to Chambolle & Pock\nrelaxation – (1.) 
relaxation in the primal relaxation step (to compute pbar)\nrelax – (:primal) which variable to relax (:primal or :dual)\nstop - a StoppingCriterion\nvariant – (:exact) whether to perform an :exact or :linearized Chambolle-Pock\nupdate_primal_base ((p,o,i) -> o.m) function to update the primal base\nupdate_dual_base ((p,o,i) -> o.n) function to update the dual base\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use on the manifold mathcal M.\ninverse_retraction_method_dual - (default_inverse_retraction_method(N, typeof(n))) an inverse retraction to use on manifold mathcal N.\nvector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use on the manifold mathcal M.\nvector_transport_method_dual - (default_vector_transport_method(N, typeof(n))) a vector transport to use on manifold mathcal N.\n\nwhere for the two update functions an AbstractManoptProblem p, an AbstractManoptSolverState o, and the current iterate i are the arguments. 
If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing in the linearized case).\n\nConstructor\n\nChambollePockState(M::AbstractManifold, N::AbstractManifold,\n m::P, n::Q, p::P, X::T, primal_stepsize::Float64, dual_stepsize::Float64;\n kwargs...\n)\n\nwhere all other fields from above are keyword arguments with their default values given in brackets.\n\nif Manifolds.jl is loaded, N is also a keyword argument and set to TangentBundle(M) by default.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Useful-Terms","page":"Chambolle-Pock","title":"Useful Terms","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"primal_residual\ndual_residual","category":"page"},{"location":"solvers/ChambollePock/#Manopt.primal_residual","page":"Chambolle-Pock","title":"Manopt.primal_residual","text":"primal_residual(p, o, x_old, X_old, n_old)\n\nCompute the primal residual at current iterate k given the necessary values x_k-1 X_k-1, and n_k-1 from the previous iterate.\n\nBigllVert\nfrac1σoperatornameretr^-1_x_kx_k-1 -\nV_x_kgets m_kbigl(DΛ^*(m_k)biglV_n_kgets n_k-1X_k-1 - X_k bigr\nBigrrVert\n\nwhere V_gets is the vector transport used in the ChambollePockState\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.dual_residual","page":"Chambolle-Pock","title":"Manopt.dual_residual","text":"dual_residual(p, o, x_old, X_old, n_old)\n\nCompute the dual residual at current iterate k given the necessary values x_k-1 X_k-1, and n_k-1 from the previous iterate. 
The formula is slightly different depending on the o.variant used:\n\nFor the :linearized it reads\n\nBigllVert\nfrac1τbigl(\nV_n_kgets n_k-1(X_k-1)\n- X_k\nbigr)\n-\nDΛ(m_k)bigl\nV_m_kgets x_koperatornameretr^-1_x_kx_k-1\nbigr\nBigrrVert\n\nand for the :exact variant\n\nBigllVert\nfrac1τ V_n_kgets n_k-1(X_k-1)\n-\noperatornameretr^-1_n_kbigl(\nΛ(operatornameretr_m_k(V_m_kgets x_koperatornameretr^-1_x_kx_k-1))\nbigr)\nBigrrVert\n\nwhere in both cases V_gets is the vector transport used in the ChambollePockState.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Debug","page":"Chambolle-Pock","title":"Debug","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"DebugDualBaseIterate\nDebugDualBaseChange\nDebugPrimalBaseIterate\nDebugPrimalBaseChange\nDebugDualChange\nDebugDualIterate\nDebugDualResidual\nDebugPrimalChange\nDebugPrimalIterate\nDebugPrimalResidual\nDebugPrimalDualResidual","category":"page"},{"location":"solvers/ChambollePock/#Manopt.DebugDualBaseIterate","page":"Chambolle-Pock","title":"Manopt.DebugDualBaseIterate","text":"DebugDualBaseIterate(io::IO=stdout)\n\nPrint the dual base variable by using DebugEntry, see their constructors for detail. This method is further set display o.n.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugDualBaseChange","page":"Chambolle-Pock","title":"Manopt.DebugDualBaseChange","text":"DebugDualChange(; storage=StoreStateAction([:n]), io::IO=stdout)\n\nPrint the change of the dual base variable by using DebugEntryChange, see their constructors for detail, on o.n.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalBaseIterate","page":"Chambolle-Pock","title":"Manopt.DebugPrimalBaseIterate","text":"DebugPrimalBaseIterate()\n\nPrint the primal base variable by using DebugEntry, see their constructors for detail. 
This method is further set to display o.m.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalBaseChange","page":"Chambolle-Pock","title":"Manopt.DebugPrimalBaseChange","text":"DebugPrimalBaseChange(a::StoreStateAction=StoreStateAction([:m]),io::IO=stdout)\n\nPrint the change of the primal base variable by using DebugEntryChange, see their constructors for detail, on o.m.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugDualChange","page":"Chambolle-Pock","title":"Manopt.DebugDualChange","text":"DebugDualChange(opts...)\n\nPrint the change of the dual variable, similar to DebugChange, see their constructors for detail, but with a different calculation of the change, since the dual variable lives in (possibly different) tangent spaces.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Manopt.DebugDualIterate","page":"Chambolle-Pock","title":"Manopt.DebugDualIterate","text":"DebugDualIterate(e)\n\nPrint the dual variable by using DebugEntry, see their constructors for detail. This method is further set to display o.X.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugDualResidual","page":"Chambolle-Pock","title":"Manopt.DebugDualResidual","text":"DebugDualResidual <: DebugAction\n\nA Debug action to print the dual residual. 
The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.\n\nConstructor\n\nDebugDualResidual()\n\nwith the keywords\n\nio (stdout) - stream to perform the debug to\nformat (\"$prefix%s\") format to print the dual residual, using the\nprefix (\"Dual Residual: \") short form to just set the prefix\nstorage (a new StoreStateAction) to store values for the debug.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalChange","page":"Chambolle-Pock","title":"Manopt.DebugPrimalChange","text":"DebugPrimalChange(opts...)\n\nPrint the change of the primal variable by using DebugChange, see their constructors for detail.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalIterate","page":"Chambolle-Pock","title":"Manopt.DebugPrimalIterate","text":"DebugPrimalIterate(opts...;kwargs...)\n\nPrint the primal variable by using DebugIterate, see their constructors for detail.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalResidual","page":"Chambolle-Pock","title":"Manopt.DebugPrimalResidual","text":"DebugPrimalResidual <: DebugAction\n\nA Debug action to print the primal residual. The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.\n\nConstructor\n\nDebugPrimalResidual()\n\nwith the keywords\n\nio (stdout) - stream to perform the debug to\nformat (\"$prefix%s\") format to print the primal residual, using the\nprefix (\"Primal Residual: \") short form to just set the prefix\nstorage (a new StoreStateAction) to store values for the debug.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Manopt.DebugPrimalDualResidual","page":"Chambolle-Pock","title":"Manopt.DebugPrimalDualResidual","text":"DebugPrimalDualResidual <: DebugAction\n\nA Debug action to print the primal-dual residual. 
The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.\n\nConstructor\n\nDebugPrimalDualResidual()\n\nwith the keywords\n\nio (stdout) - stream to perform the debug to\nformat (\"$prefix%s\") format to print the primal-dual residual, using the\nprefix (\"Primal Residual: \") short form to just set the prefix\nstorage (a new StoreStateAction) to store values for the debug.\n\n\n\n\n\n","category":"type"},{"location":"solvers/ChambollePock/#Record","page":"Chambolle-Pock","title":"Record","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"RecordDualBaseIterate\nRecordDualBaseChange\nRecordDualChange\nRecordDualIterate\nRecordPrimalBaseIterate\nRecordPrimalBaseChange\nRecordPrimalChange\nRecordPrimalIterate","category":"page"},{"location":"solvers/ChambollePock/#Manopt.RecordDualBaseIterate","page":"Chambolle-Pock","title":"Manopt.RecordDualBaseIterate","text":"RecordDualBaseIterate(n)\n\nCreate a RecordAction that records the dual base point, i.e. RecordEntry of o.n.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordDualBaseChange","page":"Chambolle-Pock","title":"Manopt.RecordDualBaseChange","text":"RecordDualBaseChange(e)\n\nCreate a RecordAction that records the dual base point change, i.e. 
RecordEntryChange of o.n with distance to the last value to store a value.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordDualChange","page":"Chambolle-Pock","title":"Manopt.RecordDualChange","text":"RecordDualChange()\n\nCreate the action, either with a given (shared) storage, which can be set to the values Tuple, if that is provided.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordDualIterate","page":"Chambolle-Pock","title":"Manopt.RecordDualIterate","text":"RecordDualIterate(X)\n\nCreate a RecordAction that records the dual iterate, i.e. RecordEntry of o.X.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalBaseIterate","page":"Chambolle-Pock","title":"Manopt.RecordPrimalBaseIterate","text":"RecordPrimalBaseIterate(x)\n\nCreate a RecordAction that records the primal base point, i.e. RecordEntry of o.m.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalBaseChange","page":"Chambolle-Pock","title":"Manopt.RecordPrimalBaseChange","text":"RecordPrimalBaseChange()\n\nCreate a RecordAction that records the primal base point change, i.e. RecordEntryChange of o.m with distance to the last value to store a value.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalChange","page":"Chambolle-Pock","title":"Manopt.RecordPrimalChange","text":"RecordPrimalChange(a)\n\nCreate a RecordAction that records the primal value change, i.e. RecordChange, since we just record the change of o.x.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Manopt.RecordPrimalIterate","page":"Chambolle-Pock","title":"Manopt.RecordPrimalIterate","text":"RecordPrimalIterate(x)\n\nCreate a RecordAction that records the primal iterate, i.e. RecordIterate, i.e. 
o.x.\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Internals","page":"Chambolle-Pock","title":"Internals","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"Manopt.update_prox_parameters!","category":"page"},{"location":"solvers/ChambollePock/#Manopt.update_prox_parameters!","page":"Chambolle-Pock","title":"Manopt.update_prox_parameters!","text":"update_prox_parameters!(o)\n\nupdate the prox parameters as described in Algorithm 2 of Chambolle, Pock, 2010, i.e.\n\nθ_n = frac1sqrt1+2γτ_n\nτ_n+1 = θ_nτ_n\nσ_n+1 = fracσ_nθ_n\n\n\n\n\n\n","category":"function"},{"location":"solvers/ChambollePock/#Literature","page":"Chambolle-Pock","title":"Literature","text":"","category":"section"},{"location":"solvers/ChambollePock/","page":"Chambolle-Pock","title":"Chambolle-Pock","text":"
[BHS+21] R. Bergmann, R. Herzog, M. Silva Louzeiro, D. Tenbrinck and J. Vidal-Núñez. Fenchel duality theory and a primal-dual algorithm on Riemannian manifolds. Foundations of Computational Mathematics 21, 1465–1504 (2021), arXiv: [1908.02022](http://arxiv.org/abs/1908.02022).
","category":"page"},{"location":"tutorials/EmbeddingObjectives/#How-to-define-the-cost-in-the-embedding","page":"Define Objectives in the Embedding","title":"How to define the cost in the embedding","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Specifying a cost function fcolon mathcal M to mathbb R on a manifold is usually the model one starts with. Specifying its gradient operatornamegrad fcolonmathcal M to Tmathcal M, or more precisely operatornamegradf(p) in T_pmathcal M, and eventually a Hessian operatornameHess fcolon T_pmathcal M to T_pmathcal M are then necessary to perform optimization. Since these might be challenging to compute, especially when manifolds and differential geometry are not the main area of a user – easier to use methods might be welcome.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"This tutorial discusses how to specify f in the embedding as tilde f, maybe only locally around the manifold, and use the Euclidean gradient tilde f and Hessian ^2 tilde f within Manopt.jl.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"For the theoretical background see convert an Euclidean to an Riemannian Gradient, or Section 4.7 of [Bou23] for the gradient part or Section 5.11 as well as [Ngu23] for the background on converting Hessians.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Here we use the Examples 9.40 and 
9.49 of [Bou23] and compare the different ways one can call a solver, depending on which gradient and/or Hessian one provides.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"using Manifolds, Manopt, ManifoldDiff\nusing LinearAlgebra, Random, Colors, Plots\nRandom.seed!(123)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"We consider the cost function on the Grassmann manifold given by","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"n = 5\nk = 2\nM = Grassmann(5,2)\nA = Symmetric(rand(n,n));","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"f(M, p) = 1 / 2 * tr(p' * A * p)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Note that this implementation is also a valid continuation of f into the (lifted) embedding of the Grassmann manifold. 
In the implementation we can use f for both the Euclidean tilde f and the Grassmann case f.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Its Euclidean gradient nabla f and Hessian nabla^2f are easy to compute as","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"∇f(M, p) = A * p\n∇²f(M,p,X) = A*X","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"On the other hand, from the aforementioned Example 9.49 we can also state the Riemannian gradient and Hessian for comparison as","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"grad_f(M, p) = A * p - p * (p' * A * p)\nHess_f(M, p, X) = A * X - p * p' * A * X - X * p' * A * p","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"We can check that these are correct, at least numerically, by calling check_gradient","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_gradient(M, f, grad_f; plot=true)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"and the check_Hessian, which requires a bit more tolerance in its linearity 
check","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_Hessian(M, f, grad_f, Hess_f; plot=true, throw_error=true, atol=1e-15)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"While they look reasonable here and were already derived – for the general case this derivation might be more complicated.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Luckily there exist two functions in ManifoldDiff.jl that are implemented for several manifolds from Manifolds.jl, namely riemannian_gradient(M, p, eG) that converts a Riemannian gradient eG=nabla tilde f(p) into a the Riemannain one operatornamegrad f(p) and riemannian_Hessian(M, p, eG, eH, X) which converts the Euclidean Hessian eH=nabla^2 tilde f(p)X into operatornameHess f(p)X, where we also require the Euclidean gradient eG=nabla tilde f(p).","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"So we can define","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"grad2_f(M, p) = riemannian_gradient(M, p, ∇f(get_embedding(M), embed(M, p)))","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"where only formally we here call embed(M,p) before passing p to the Euclidean gradient, though here (for 
the Grassmann manifold with Stiefel representation) the embedding function is the identity.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Similarly for the Hessian, where in our example the embeddings of both the points and tangent vectors are the identity.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"function Hess2_f(M, p, X)\n return riemannian_Hessian(\n M,\n p,\n ∇f(get_embedding(M), embed(M, p)),\n ∇²f(get_embedding(M), embed(M, p), embed(M, p, X)),\n X\n )\nend","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"And we can again check these numerically,","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_gradient(M, f, grad2_f; plot=true)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"and","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"check_Hessian(M, f, grad2_f, Hess2_f; plot=true, throw_error=true, atol=1e-14)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"(Image: )","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives 
in the Embedding","text":"which yields the same result, but we see that the Euclidean conversion might be a bit less stable.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Now if we want to use these in optimization we would require these two functions to call e.g.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"p0 = [1.0 0.0; 0.0 1.0; 0.0 0.0; 0.0 0.0; 0.0 0.0]\nr1 = adaptive_regularization_with_cubics(\n M,\n f,\n grad_f,\n Hess_f,\n p0;\n debug=[:Iteration, :Cost, \"\\n\"],\n return_objective=true,\n return_state=true,\n)\nq1 = get_solver_result(r1)\nr1","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Initial f(x): 0.666814\n# 1 f(x): 0.333500\n# 2 f(x): -0.233243\n# 3 f(x): -0.440486\n# 4 f(x): -0.607487\n# 5 f(x): -0.608797\n# 6 f(x): -0.608797\n# 7 f(x): -0.608797\n\n# Solver state for `Manopt.jl`s Adaptive Regularization with Cubics (ARC)\nAfter 7 iterations\n\n## Parameters\n* η1 | η2 : 0.1 | 0.9\n* γ1 | γ2 : 0.1 | 2.0\n* σ (σmin) : 0.0004082482904638632 (1.0e-10)\n* ρ (ρ_regularization) : 0.9998886221507552 (1000.0)\n* retraction method : PolarRetraction()\n* sub solver state :\n | # Solver state for `Manopt.jl`s Lanczos Iteration\n | After 6 iterations\n | \n | ## Parameters\n | * σ : 0.0040824829046386315\n | * # of Lanczos vectors used : 6\n | \n | ## Stopping Criteria\n | (a) For the Lanczos Iteration\n | Stop When _one_ of the following are fulfilled:\n | Max Iteration 6: reached\n | First order progress with θ=0.5: not reached\n | Overall: reached\n | (b) For the Newton sub solver\n | Max Iteration 200: not reached\n | This indicates convergence: No\n\n## Stopping Criterion\nStop When _one_ of the following are 
fulfilled:\n Max Iteration 40: not reached\n |grad f| < 1.0e-9: reached\n All Lanczos vectors (5) used: not reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Debug\n [ (:Iteration, \"# %-6d\"), (:Cost, \"f(x): %f\"), \"\n\" ]","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"But if you choose to go with the conversions, thinking about the embedding and defining two new functions might be tedious. There is a shortcut for these, which performs the change internally when necessary, by specifying objective_type=:Euclidean.","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"r2 = adaptive_regularization_with_cubics(\n M,\n f,\n ∇f,\n ∇²f,\n p0;\n # The one line different: specify our grad/Hess are Euclidean:\n objective_type=:Euclidean,\n debug=[:Iteration, :Cost, \"\\n\"],\n return_objective=true,\n return_state=true,\n)\nq2 = get_solver_result(r2)\nr2","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Initial f(x): 0.666814\n# 1 f(x): 0.333500\n# 2 f(x): -0.233243\n# 3 f(x): -0.440486\n# 4 f(x): -0.607487\n# 5 f(x): -0.608797\n# 6 f(x): -0.608797\n# 7 f(x): -0.608797\n\n# Solver state for `Manopt.jl`s Adaptive Regularization with Cubics (ARC)\nAfter 7 iterations\n\n## Parameters\n* η1 | η2 : 0.1 | 0.9\n* γ1 | γ2 : 0.1 | 2.0\n* σ (σmin) : 0.0004082482904638632 (1.0e-10)\n* ρ (ρ_regularization) : 0.9998886221248858 (1000.0)\n* retraction method : PolarRetraction()\n* sub solver state :\n | # Solver state for `Manopt.jl`s Lanczos Iteration\n | After 6 iterations\n | \n | ## Parameters\n | * σ : 0.0040824829046386315\n | * # of Lanczos vectors used : 6\n | \n | ## Stopping Criteria\n | (a) For the Lanczos Iteration\n | Stop 
When _one_ of the following are fulfilled:\n | Max Iteration 6: reached\n | First order progress with θ=0.5: not reached\n | Overall: reached\n | (b) For the Newton sub solver\n | Max Iteration 200: not reached\n | This indicates convergence: No\n\n## Stopping Criterion\nStop When _one_ of the following are fulfilled:\n Max Iteration 40: not reached\n |grad f| < 1.0e-9: reached\n All Lanczos vectors (5) used: not reached\nOverall: reached\nThis indicates convergence: Yes\n\n## Debug\n [ (:Iteration, \"# %-6d\"), (:Cost, \"f(x): %f\"), \"\n\" ]","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"which returns the same result, see","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"distance(M, q1, q2)","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"3.2016811410571575e-16","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"This conversion also works for the gradients of constraints, and is passed down to subsolvers by default when these are created using the Euclidean objective f, nabla f and nabla^2 f.","category":"page"},{"location":"tutorials/EmbeddingObjectives/#Summary","page":"Define Objectives in the Embedding","title":"Summary","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"If you have the Euclidean gradient (or Hessian) available for a solver call, all you need to provide is objective_type=:Euclidean to convert the objective to a Riemannian 
one.","category":"page"},{"location":"tutorials/EmbeddingObjectives/#Literature","page":"Define Objectives in the Embedding","title":"Literature","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"
","category":"page"},{"location":"tutorials/EmbeddingObjectives/#Technical-Details","page":"Define Objectives in the Embedding","title":"Technical Details","text":"","category":"section"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"This notebook was rendered with the following environment","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Pkg.status()","category":"page"},{"location":"tutorials/EmbeddingObjectives/","page":"Define Objectives in the Embedding","title":"Define Objectives in the Embedding","text":"Status `~/work/Manopt.jl/Manopt.jl/tutorials/Project.toml`\n [6e4b80f9] BenchmarkTools v1.3.2\n [5ae59095] Colors v0.12.10\n [31c24e10] Distributions v0.25.100\n [26cc04aa] FiniteDifferences v0.12.30\n [7073ff75] IJulia v1.24.2\n [8ac3fa9e] LRUCache v1.4.1\n [af67fdf4] ManifoldDiff v0.3.6\n [1cead3c2] Manifolds v0.8.75\n [3362f125] ManifoldsBase v0.14.11\n [0fc0a36d] Manopt v0.4.34 `~/work/Manopt.jl/Manopt.jl`\n [91a5bcdd] Plots v1.39.0","category":"page"},{"location":"helpers/data/#Data","page":"Data","title":"Data","text":"","category":"section"},{"location":"helpers/data/","page":"Data","title":"Data","text":"For some manifolds there are artificial or real application data available that can be loaded using the following data functions. Note that these need additionally Manifolds.jl to be loaded.","category":"page"},{"location":"helpers/data/","page":"Data","title":"Data","text":"Modules = [Manopt]\nPages = [\"artificialDataFunctions.jl\"]","category":"page"},{"location":"helpers/data/#Manopt.artificialIn_SAR_image-Tuple{Integer}","page":"Data","title":"Manopt.artificialIn_SAR_image","text":"artificialIn_SAR_image([pts=500])\n\ngenerate an artificial InSAR image, i.e. 
phase-valued data, of size pts x pts points.\n\nThis data set was introduced for the numerical examples in Bergmann et. al., SIAM J Imag Sci, 2014.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S1_signal","page":"Data","title":"Manopt.artificial_S1_signal","text":"artificial_S1_signal([pts=500])\n\ngenerate a real-valued signal having piecewise constant, linear and quadratic intervals with jumps in between. If the manifold the data lives on is the Circle, the data is also wrapped to -pipi). This is data for an example from Bergmann et. al., SIAM J Imag Sci, 2014.\n\nOptional\n\npts – (500) number of points to sample the function\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S1_signal-Tuple{Real}","page":"Data","title":"Manopt.artificial_S1_signal","text":"artificial_S1_signal(x)\n\nevaluate the example signal f(x) x 01, of phase-valued data introduced in Sec. 5.1 of Bergmann et. al., SIAM J Imag Sci, 2014; for values outside that interval, this signal is missing.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S1_slope_signal","page":"Data","title":"Manopt.artificial_S1_slope_signal","text":"artificial_S1_slope_signal([pts=500, slope=4.])\n\nCreates a Signal of (phase-valued) data represented on the Circle with increasing slope.\n\nOptional\n\npts – (500) number of points to sample the function.\nslope – (4.0) initial slope that gets increased afterwards\n\nThis data set was introduced for the numerical examples in Bergmann et. 
al., SIAM J Imag Sci, 2014\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S2_composite_bezier_curve-Tuple{}","page":"Data","title":"Manopt.artificial_S2_composite_bezier_curve","text":"artificial_S2_composite_bezier_curve()\n\nCreate the artificial curve in the Sphere(2) consisting of 3 segments between the four points\n\np_0 = beginbmatrix001endbmatrix^mathrmT\np_1 = beginbmatrix0-10endbmatrix^mathrmT\np_2 = beginbmatrix-100endbmatrix^mathrmT\np_3 = beginbmatrix00-1endbmatrix^mathrmT\n\nwhere each segment is a cubic Bézier curve, i.e. each point, except p_3 has a first point within the following segment b_i^+, i=012 and a last point within the previous segment, except for p_0, which are denoted by b_i^-, i=123. This curve is differentiable by the conditions b_i^- = gamma_b_i^+p_i(2), i=12, where gamma_ab is the shortest_geodesic connecting a and b. The remaining points are defined as\n\nbeginaligned\n b_0^+ = exp_p_0fracpi8sqrt2beginpmatrix1-10endpmatrix^mathrmT\n b_1^+ = exp_p_1-fracpi4sqrt2beginpmatrix-101endpmatrix^mathrmT\n b_2^+ = exp_p_2fracpi4sqrt2beginpmatrix01-1endpmatrix^mathrmT\n b_3^- = exp_p_3-fracpi8sqrt2beginpmatrix-110endpmatrix^mathrmT\nendaligned\n\nThis example was used within minimization of acceleration of the paper Bergmann, Gousenbourger, Front. Appl. Math. 
Stat., 2018.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S2_lemniscate","page":"Data","title":"Manopt.artificial_S2_lemniscate","text":"artificial_S2_lemniscate(p, t::Float64; a::Float64=π/2)\n\nGenerate a point from the signal on the Sphere mathbb S^2 by creating the Lemniscate of Bernoulli in the tangent space of p sampled at t and use exp to obtain a point on the [Sphere](https://juliamanifolds.github.io/Manifolds.jl/stable/manifolds/sphere.html).\n\nInput\n\np – the tangent space the Lemniscate is created in\nt – value to sample the Lemniscate at\n\nOptional Values\n\na – (π/2) defines a half axis of the Lemniscate to cover a half sphere.\n\nThis dataset was used in the numerical example of Section 5.1 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S2_lemniscate-2","page":"Data","title":"Manopt.artificial_S2_lemniscate","text":"artificial_S2_lemniscate(p [,pts=128,a=π/2,interval=[0,2π]])\n\nGenerate a Signal on the Sphere mathbb S^2 by creating the Lemniscate of Bernoulli in the tangent space of p sampled at pts points and use exp to get a signal on the Sphere.\n\nInput\n\np – the tangent space the Lemniscate is created in\npts – (128) number of points to sample the Lemniscate\na – (π/2) defines a half axis of the Lemniscate to cover a half sphere.\ninterval – ([0,2*π]) range to sample the lemniscate at, the default value refers to one closed curve\n\nThis dataset was used in the numerical example of Section 5.1 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_S2_rotation_image-Tuple{}","page":"Data","title":"Manopt.artificial_S2_rotation_image","text":"artificial_S2_rotation_image([pts=64, rotations=(.5,.5)])\n\nCreate an image with a rotation on each axis as a parametrization.\n\nOptional Parameters\n\npts – (64) number of pixels along one dimension\nrotations – ((.5,.5)) number 
of total rotations performed on the axes.\n\nThis dataset was used in the numerical example of Section 5.1 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S2_whirl_image-Tuple{Int64}","page":"Data","title":"Manopt.artificial_S2_whirl_image","text":"artificial_S2_whirl_image([pts::Int=64])\n\nGenerate an artificial image of data on the 2-sphere.\n\nArguments\n\npts – (64) size of the image in pts×pts pixel.\n\nThis example dataset was used in the numerical example in Section 5.5 of Laus et al., SIAM J Imag Sci., 2017\n\nIt is based on artificial_S2_rotation_image extended by small whirl patches.\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Manopt.artificial_S2_whirl_patch","page":"Data","title":"Manopt.artificial_S2_whirl_patch","text":"artificial_S2_whirl_patch([pts=5])\n\ncreate a whirl within the pts×pts patch of Sphere(2)-valued image data.\n\nThese patches are used within artificial_S2_whirl_image.\n\nOptional Parameters\n\npts – (5) size of the patch. 
If the number is odd, the center is the north pole.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_SPD_image","page":"Data","title":"Manopt.artificial_SPD_image","text":"artificial_SPD_image([pts=64, stepsize=1.5])\n\ncreate an artificial image of symmetric positive definite matrices of size pts×pts pixel with a jump of size stepsize.\n\nThis dataset was used in the numerical example of Section 5.2 of Bačák et al., SIAM J Sci Comput, 2016.\n\n\n\n\n\n","category":"function"},{"location":"helpers/data/#Manopt.artificial_SPD_image2-Tuple{Any, Any}","page":"Data","title":"Manopt.artificial_SPD_image2","text":"artificial_SPD_image2([pts=64, fraction=.66])\n\ncreate an artificial image of symmetric positive definite matrices of size pts×pts pixel, in which the right hand side fraction is moved upwards.\n\nThis data set was introduced in the numerical examples of Section of Bergmann, Persch, Steidl, SIAM J Imag Sci, 2016\n\n\n\n\n\n","category":"method"},{"location":"helpers/data/#Literature","page":"Data","title":"Literature","text":"","category":"section"},{"location":"helpers/data/","page":"Data","title":"Data","text":"
[BBSW16]
\n
\n
M. Bačák, R. Bergmann, G. Steidl and A. Weinmann. A second order non-smooth variational model for restoring manifold-valued images. SIAM Journal on Scientific Computing 38, A567–A597 (2016), arXiv: [1506.02409](https://arxiv.org/abs/1506.02409).
\n
[BG18]
\n
\n
R. Bergmann and P.-Y. Gousenbourger. A variational model for data fitting on manifolds by minimizing the acceleration of a Bézier curve. Frontiers in Applied Mathematics and Statistics 4 (2018), arXiv: [1807.10090](https://arxiv.org/abs/1807.10090).
\n
[BLSW14]
\n
\n
R. Bergmann, F. Laus, G. Steidl and A. Weinmann. Second order differences of cyclic data and applications in variational denoising. SIAM Journal on Imaging Sciences 7, 2916–2953 (2014), arXiv: [1405.5349](https://arxiv.org/abs/1405.5349).
\n
[BPS16]
\n
\n
R. Bergmann, J. Persch and G. Steidl. A parallel Douglas Rachford algorithm for minimizing ROF-like functionals on images with values in symmetric Hadamard manifolds. SIAM Journal on Imaging Sciences 9, 901–937 (2016), arXiv: [1512.02814](https://arxiv.org/abs/1512.02814).
","category":"page"},{"location":"solvers/alternating_gradient_descent/#AlternatingGradientDescentSolver","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"","category":"section"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"alternating_gradient_descent\nalternating_gradient_descent!","category":"page"},{"location":"solvers/alternating_gradient_descent/#Manopt.alternating_gradient_descent","page":"Alternating Gradient Descent","title":"Manopt.alternating_gradient_descent","text":"alternating_gradient_descent(M::ProductManifold, f, grad_f, p=rand(M))\nalternating_gradient_descent(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)\n\nperform an alternating gradient descent\n\nInput\n\nM – the product manifold mathcal M = mathcal M_1 mathcal M_2 mathcal M_n\nf – the objective function (cost) defined on M.\ngrad_f – a gradient, that can be of two cases\nis a single function returning an ArrayPartition or\nis a vector functions each returning a component part of the whole gradient\np – an initial value p_0 mathcal M\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient(s) works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. 
is of the form gradF!(M, X, x) (elementwise).\nevaluation_order – (:Linear) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default :Linear one.\ninner_iterations– (5) how many gradient steps to take in a component before alternating to the next\nstopping_criterion (StopAfterIteration(1000))– a StoppingCriterion\nstepsize (ArmijoLinesearch()) a Stepsize\norder - ([1:n]) the initial permutation, where n is the number of gradients in gradF.\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.\n\nOutput\n\nusually the obtained (approximate) minimizer, see get_solver_return for details\n\nnote: Note\nThis Problem requires the ProductManifold from Manifolds.jl, so Manifolds.jl needs to be loaded.\n\nnote: Note\nThe input of each of the (component) gradients is still the whole vector X, just that all components other than the ith one are assumed to be fixed and just the ith component's gradient is computed / returned.\n\n\n\n\n\n","category":"function"},{"location":"solvers/alternating_gradient_descent/#Manopt.alternating_gradient_descent!","page":"Alternating Gradient Descent","title":"Manopt.alternating_gradient_descent!","text":"alternating_gradient_descent!(M::ProductManifold, f, grad_f, p)\nalternating_gradient_descent!(M::ProductManifold, ago::ManifoldAlternatingGradientObjective, p)\n\nperform an alternating gradient descent in place of p.\n\nInput\n\nM a product manifold mathcal M\nf – the objective function (cost)\ngrad_f – a gradient function, that either returns a vector of the subgradients or is a vector of gradients\np – an initial value p_0 mathcal M\n\nYou can also pass a ManifoldAlternatingGradientObjective ago containing f and grad_f instead.\n\nFor all optional parameters, see alternating_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/alternating_gradient_descent/#State","page":"Alternating Gradient 
Descent","title":"State","text":"","category":"section"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"AlternatingGradientDescentState","category":"page"},{"location":"solvers/alternating_gradient_descent/#Manopt.AlternatingGradientDescentState","page":"Alternating Gradient Descent","title":"Manopt.AlternatingGradientDescentState","text":"AlternatingGradientDescentState <: AbstractGradientDescentSolverState\n\nStore the fields for an alternating gradient descent algorithm, see also alternating_gradient_descent.\n\nFields\n\ndirection (AlternatingGradient(zero_vector(M, x)) a DirectionUpdateRule\nevaluation_order – (:Linear) – whether\ninner_iterations– (5) how many gradient steps to take in a component before alternating to the next to use a randomly permuted sequence (:FixedRandom), a per cycle newly permuted sequence (:Random) or the default :Linear evaluation order.\norder the current permutation\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M,x,ξ) to use.\nstepsize (ConstantStepsize(M)) a Stepsize\nstopping_criterion (StopAfterIteration(1000))– a StoppingCriterion\np the current iterate\nX (zero_vector(M,p)) the current gradient tangent vector\nk, ì` internal counters for the outer and inner iterations, respectively.\n\nConstructors\n\nAlternatingGradientDescentState(M, p; kwargs...)\n\nGenerate the options for point p and and where inner_iterations, order_type, order, retraction_method, stopping_criterion, and stepsize` are keyword arguments\n\n\n\n\n\n","category":"type"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"Additionally, the options share a DirectionUpdateRule, which chooses the current component, so they can be decorated further; The most inner one should always be the following one 
though.","category":"page"},{"location":"solvers/alternating_gradient_descent/","page":"Alternating Gradient Descent","title":"Alternating Gradient Descent","text":"AlternatingGradient","category":"page"},{"location":"solvers/alternating_gradient_descent/#Manopt.AlternatingGradient","page":"Alternating Gradient Descent","title":"Manopt.AlternatingGradient","text":"AlternatingGradient <: DirectionUpdateRule\n\nThe default gradient processor, which just evaluates the (alternating) gradient on one of the components\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#tCG","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint Truncated Conjugate-Gradient Method","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"The aim is to solve the trust-region subproblem","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"operatorname*argmin_η T_xmathcalM m_x(η) = F(x) +\noperatornamegradF(x) η_x + frac12 \nmathcalHη η_x","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"textst η η_x leq Δ^2","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"on a manifold by using the Steihaug-Toint truncated conjugate-gradient method, abbreviated tCG-method. All terms involving the trust-region radius use an inner product w.r.t. the preconditioner; this is because the iterates grow in length w.r.t. 
the preconditioner, guaranteeing that we do not re-enter the trust-region.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Initialization","page":"Steihaug-Toint TCG Method","title":"Initialization","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Initialize η_0 = η if using randomized approach and η the zero tangent vector otherwise, r_0 = operatornamegradF(x), z_0 = operatornameP(r_0), δ_0 = z_0 and k=0","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Iteration","page":"Steihaug-Toint TCG Method","title":"Iteration","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Repeat until a convergence criterion is reached","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Set α =fracr_k z_k_xδ_k mathcalHδ_k_x and η_k η_k_x^* = η_k operatornameP(η_k)_x + 2α η_k operatornameP(δ_k)_x + α^2 δ_k operatornameP(δ_k)_x.\nIf δ_k mathcalHδ_k_x 0 or η_k η_k_x^* Δ^2 return η_k+1 = η_k + τ δ_k and stop.\nSet η_k^*= η_k + α δ_k, if η_k η_k_x + frac12 η_k operatornameHessF (η_k)_x_x η_k^* η_k^*_x + frac12 η_k^* operatornameHessF (η_k)_ x_x set η_k+1 = η_k else set η_k+1 = η_k^*.\nSet r_k+1 = r_k + α mathcalHδ_k, z_k+1 = operatornameP(r_k+1), β = fracr_k+1 z_k+1_xr_k z_k _x and δ_k+1 = -z_k+1 + β δ_k.\nSet k=k+1.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Result","page":"Steihaug-Toint TCG Method","title":"Result","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"The result is given by the last computed 
η_k.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Remarks","page":"Steihaug-Toint TCG Method","title":"Remarks","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"The operatornameP() denotes the symmetric, positive definite preconditioner. It is required if a randomized approach is used, i.e. using a random tangent vector η_0 as the initial vector. The idea behind it is to avoid saddle points. Preconditioning is simply a rescaling of the variables and thus a redefinition of the shape of the trust region. Ideally operatornameP() is a cheap, positive approximation of the inverse of the Hessian of F at x. By default, the preconditioner is just the identity.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Concerning step 2: τ is obtained as the positive root of leftlVert η_k + τ δ_k rightrVert_operatornameP x = Δ, which after rearranging the equation becomes","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":" τ = frac-η_k operatornameP(δ_k)_x +\n sqrtη_k operatornameP(δ_k)_x^2 +\n δ_k operatornameP(δ_k)_x ( Δ^2 -\n η_k operatornameP(η_k)_x)\n δ_k operatornameP(δ_k)_x","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"It can occur that δ_k operatornameHessF (δ_k)_x_x = κ 0 at iteration k. In this case, the model is not strictly convex, and the stepsize α =fracr_k z_k_x κ computed in step 1 does not give a reduction in the model function m_x(). Indeed, m_x() is unbounded from below along the line η_k + α δ_k. 
If our aim is to minimize the model within the trust-region, it makes far more sense to reduce m_x() along η_k + α δ_k as much as we can while staying within the trust-region, and this means moving to the trust-region boundary along this line. Thus, when κ 0 at iteration k, we replace α = fracr_k z_k_xκ with τ described as above. The other possibility is that η_k+1 would lie outside the trust-region at iteration k (i.e. η_k η_k_x^* Δ^2 that can be identified with the norm of η_k+1). In particular, when operatornameHessF ()_x is positive definite and η_k+1 lies outside the trust region, the solution to the trust-region problem must lie on the trust-region boundary. Thus, there is no reason to continue with the conjugate gradient iteration, as it stands, as subsequent iterates will move further outside the trust-region boundary. A sensible strategy, just as in the case considered above, is to move to the trust-region boundary by finding τ.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"Although it is virtually impossible in practice to know how many iterations are necessary to provide a good estimate η_k of the trust-region subproblem, the method stops after a certain number of iterations, which is realised by StopAfterIteration. In order to increase the convergence rate of the underlying trust-region method, see trust_regions, a typical stopping criterion is to stop as soon as an iteration k is reached for which","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":" Vert r_k Vert_x leqq Vert r_0 Vert_x min left( Vert r_0 Vert^θ_x κ right)","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"holds, where 0 κ 1 and θ 0 are chosen in advance. 
This is realized in this method by StopWhenResidualIsReducedByFactorOrPower. It can be shown that under appropriate conditions the iterates x_k of the underlying trust-region method converge to nondegenerate critical points with an order of convergence of at least min left( θ + 1 2 right), see Absil, Mahony, Sepulchre, Princeton University Press, 2008. The method also aborts if the curvature of the model is negative, i.e. if langle delta_k mathcalHδ_k rangle_x leqq 0, which is realised by StopWhenCurvatureIsNegative. If the next possible approximate solution η_k^* calculated in iteration k lies outside the trust region, i.e. if lVert η_k^* rVert_x geq Δ, then the method aborts, which is realised by StopWhenTrustRegionIsExceeded. Furthermore, the method aborts if the new model value evaluated at η_k^* is greater than the previous model value evaluated at η_k, which is realised by StopWhenModelIncreased.","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Interface","page":"Steihaug-Toint TCG Method","title":"Interface","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":" truncated_conjugate_gradient_descent\n truncated_conjugate_gradient_descent!","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.truncated_conjugate_gradient_descent","page":"Steihaug-Toint TCG Method","title":"Manopt.truncated_conjugate_gradient_descent","text":"truncated_conjugate_gradient_descent(M, f, grad_f, p; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, p, X; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, Hess_f; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p; kwargs...)\ntruncated_conjugate_gradient_descent(M, f, grad_f, Hess_f, p, X; kwargs...)\ntruncated_conjugate_gradient_descent(M, mho::ManifoldHessianObjective, p, X; kwargs...)\n\nsolve the 
trust-region subproblem\n\noperatorname*argmin_η T_pM\nm_p(η) quadtextwhere\nm_p(η) = f(p) + operatornamegrad f(p)η_x + frac12operatornameHess f(p)ηη_x\n\ntextsuch thatquad ηη_x Δ^2\n\non a manifold M by using the Steihaug-Toint truncated conjugate-gradient method, abbreviated tCG-method. For a description of the algorithm and theorems offering convergence guarantees, see the references:\n\nP.-A. Absil, C.G. Baker, K.A. Gallivan, Trust-region methods on Riemannian manifolds, FoCM, 2007. doi: 10.1007/s10208-005-0179-9\nA. R. Conn, N. I. M. Gould, P. L. Toint, Trust-region methods, SIAM, MPS, 2000. doi: 10.1137/1.9780898719857\n\nInput\n\nSee the signatures above: you can leave out the Hessian, the vector, both the point and the vector, or all three.\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of F\nHess_f – (optional, cf. ApproxHessianFiniteDifference) the Hessian operatornameHessf T_pmathcal M T_pmathcal M, X operatornameHessF(p)X = _Xoperatornamegradf(p)\np – a point on the manifold p mathcal M\nX – an update tangent vector X T_pmathcal M\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient and Hessian work by allocation (default) or InplaceEvaluation in place\npreconditioner – a preconditioner for the Hessian H\nθ – (1.0) 1+θ is the superlinear convergence target rate. The method aborts if the residual is less than or equal to the initial residual to the power of 1+θ.\nκ – (0.1) the linear convergence target rate. The method aborts if the residual is less than or equal to κ times the initial residual.\nrandomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\ntrust_region_radius – (injectivity_radius(M)/4) a trust-region radius\nproject! : (copyto!) 
specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.\nstopping_criterion – (StopAfterIteration | StopWhenResidualIsReducedByFactorOrPower | StopWhenCurvatureIsNegative | StopWhenTrustRegionIsExceeded) a functor inheriting from StoppingCriterion indicating when to stop, where for the default, the maximal number of iterations is set to the dimension of the manifold, the power factor is θ, and the reduction factor is κ.\n\nand the ones that are passed to decorate_state! for decorators.\n\nOutput\n\nthe obtained (approximate) minimizer eta^*, see get_solver_return for details\n\nSee also\n\ntrust_regions\n\n\n\n\n\n","category":"function"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.truncated_conjugate_gradient_descent!","page":"Steihaug-Toint TCG Method","title":"Manopt.truncated_conjugate_gradient_descent!","text":"truncated_conjugate_gradient_descent!(M, f, grad_f, Hess_f, p, X; kwargs...)\ntruncated_conjugate_gradient_descent!(M, f, grad_f, p, X; kwargs...)\n\nsolve the trust-region subproblem in place of X (and p).\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal M ℝ to minimize\ngrad_f – the gradient operatornamegradf mathcal M Tmathcal M of f\nHess_f – the Hessian operatornameHessf(x) T_pmathcal M T_pmathcal M, X operatornameHessf(p)X\np – a point on the manifold p mathcal M\nX – an update tangent vector X T_xmathcal M\n\nFor more details and all optional arguments, see truncated_conjugate_gradient_descent.\n\n\n\n\n\n","category":"function"},{"location":"solvers/truncated_conjugate_gradient_descent/#State","page":"Steihaug-Toint TCG Method","title":"State","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG 
Method","text":"TruncatedConjugateGradientState","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.TruncatedConjugateGradientState","page":"Steihaug-Toint TCG Method","title":"Manopt.TruncatedConjugateGradientState","text":"TruncatedConjugateGradientState <: AbstractHessianSolverState\n\ndescribe the Steihaug-Toint truncated conjugate-gradient method, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\nx : a point, where the trust-region subproblem needs to be solved\nη : a tangent vector (called update vector), which solves the trust-region subproblem after successful calculation by the algorithm\nstop : a StoppingCriterion.\ngradient : the gradient at the current iterate\nδ : search direction\ntrust_region_radius : (injectivity_radius(M)/4) the trust-region radius\nresidual : the gradient\nrandomize : indicates whether the trust-region solve, and thus the algorithm, is initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.\nproject! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! 
to activate projection.\n\nConstructor\n\nTruncatedConjugateGradientState(M, p=rand(M), η=zero_vector(M,p);\n trust_region_radius=injectivity_radius(M)/4,\n randomize=false,\n θ=1.0,\n κ=0.1,\n project!=copyto!,\n)\n\nand a slightly involved `stopping_criterion`\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Stopping-Criteria","page":"Steihaug-Toint TCG Method","title":"Stopping Criteria","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG Method","title":"Steihaug-Toint TCG Method","text":"StopWhenResidualIsReducedByFactorOrPower\nStopWhenTrustRegionIsExceeded\nStopWhenCurvatureIsNegative\nStopWhenModelIncreased\nupdate_stopping_criterion!(::StopWhenResidualIsReducedByFactorOrPower, ::Val{:ResidualPower}, ::Any)\nupdate_stopping_criterion!(::StopWhenResidualIsReducedByFactorOrPower, ::Val{:ResidualFactor}, ::Any)","category":"page"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenResidualIsReducedByFactorOrPower","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenResidualIsReducedByFactorOrPower","text":"StopWhenResidualIsReducedByFactorOrPower <: StoppingCriterion\n\nA functor for testing if the norm of residual at the current iterate is reduced either by a power of 1+θ or by a factor κ compared to the norm of the initial residual, i.e. 
Vert r_k Vert_x leqq Vert r_0 Vert_x \nmin left( kappa Vert r_0 Vert_x^theta right).\n\nFields\n\nκ – the reduction factor\nθ – part of the reduction power\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenResidualIsReducedByFactorOrPower(; κ=0.1, θ=1.0)\n\ninitialize the StopWhenResidualIsReducedByFactorOrPower functor to indicate to stop after the norm of the current residual is less than either the norm of the initial residual to the power of 1+θ or the norm of the initial residual times κ.\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenTrustRegionIsExceeded","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenTrustRegionIsExceeded","text":"StopWhenTrustRegionIsExceeded <: StoppingCriterion\n\nA functor for testing if the norm of the next iterate in the Steihaug-Toint tCG method is larger than the trust-region radius, i.e. Vert η_k^* Vert_x trust_region_radius. Terminate the algorithm when the trust region has been left.\n\nFields\n\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenTrustRegionIsExceeded()\n\ninitialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius.\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenCurvatureIsNegative","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenCurvatureIsNegative","text":"StopWhenCurvatureIsNegative <: StoppingCriterion\n\nA functor for testing if the curvature of the model is negative, i.e. langle delta_k operatornameHessF(delta_k)rangle_x leqq 0. 
In this case, the model is not strictly convex, and the stepsize as computed does not give a reduction of the model.\n\nFields\n\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenCurvatureIsNegative()\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.StopWhenModelIncreased","page":"Steihaug-Toint TCG Method","title":"Manopt.StopWhenModelIncreased","text":"StopWhenModelIncreased <: StoppingCriterion\n\nA functor for testing if the curvature of the model value increased.\n\nFields\n\nreason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.\n\nConstructor\n\nStopWhenModelIncreased()\n\nSee also\n\ntruncated_conjugate_gradient_descent, trust_regions\n\n\n\n\n\n","category":"type"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.update_stopping_criterion!-Tuple{StopWhenResidualIsReducedByFactorOrPower, Val{:ResidualPower}, Any}","page":"Steihaug-Toint TCG Method","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualPower, v)\n\nUpdate the residual Power θ to v.\n\n\n\n\n\n","category":"method"},{"location":"solvers/truncated_conjugate_gradient_descent/#Manopt.update_stopping_criterion!-Tuple{StopWhenResidualIsReducedByFactorOrPower, Val{:ResidualFactor}, Any}","page":"Steihaug-Toint TCG Method","title":"Manopt.update_stopping_criterion!","text":"update_stopping_criterion!(c::StopWhenResidualIsReducedByFactorOrPower, :ResidualFactor, v)\n\nUpdate the residual Factor κ to v.\n\n\n\n\n\n","category":"method"},{"location":"solvers/truncated_conjugate_gradient_descent/#Literature","page":"Steihaug-Toint TCG Method","title":"Literature","text":"","category":"section"},{"location":"solvers/truncated_conjugate_gradient_descent/","page":"Steihaug-Toint TCG 
Method","title":"Steihaug-Toint TCG Method","text":"
[AMS08] P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press (2008). [open access](http://press.princeton.edu/chapters/absil/).
","category":"page"},{"location":"solvers/LevenbergMarquardt/#Levenberg-Marquardt","page":"Levenberg–Marquardt","title":"Levenberg-Marquardt","text":"","category":"section"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"LevenbergMarquardt\nLevenbergMarquardt!","category":"page"},{"location":"solvers/LevenbergMarquardt/#Manopt.LevenbergMarquardt","page":"Levenberg–Marquardt","title":"Manopt.LevenbergMarquardt","text":"LevenbergMarquardt(M, f, jacobian_f, p, num_components=-1)\n\nSolve an optimization problem of the form\n\noperatornameargmin_p mathcal M frac12 lVert f(p) rVert^2\n\nwhere fcolonmathcal M to ℝ^d is a continuously differentiable function, using the Riemannian Levenberg-Marquardt algorithm Peeters, Tech. Rep., 1993. The implementation follows Algorithm 1 Adachi, Okuno, Takeda, Preprint, 2022\n\nInput\n\nM – a manifold mathcal M\nf – a cost function F mathcal Mℝ^d\njacobian_f – the Jacobian of f. The Jacobian jacF is supposed to accept a keyword argument basis_domain which specifies basis of the tangent space at a given point in which the Jacobian is to be calculated. By default it should be the DefaultOrthonormalBasis.\np – an initial value p mathcal M\nnum_components – length of the vector returned by the cost function (d). By default its value is -1 which means that it will be determined automatically by calling F one additional time. Only possible when evaluation is AllocatingEvaluation, for mutating evaluation this must be explicitly specified.\n\nThese can also be passed as a NonlinearLeastSquaresObjective, then the keyword jacobian_tangent_basis below is ignored\n\nOptional\n\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. 
is of the form gradF!(M, X, x).\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction(M,x,ξ) to use.\nstopping_criterion – (StopWhenAny(StopAfterIteration(200),StopWhenGradientNormLess(1e-12))) a functor inheriting from StoppingCriterion indicating when to stop.\nexpect_zero_residual – (false) whether the algorithm expects the value of the residual (objective) at the minimum to equal 0.\nη – Scaling factor for the sufficient cost decrease threshold required to accept new proposal points. Allowed range: 0 < η < 1.\ndamping_term_min – initial (and also minimal) value of the damping term\nβ – parameter by which the damping term is multiplied when the current new point is rejected\ninitial_residual_values – the initial residual vector of the cost function f.\ninitial_jacobian_f – the initial Jacobian of the cost function f.\njacobian_tangent_basis - AbstractBasis specify the basis of the tangent space for jacobian_f.\n\nAll other keyword arguments are passed to decorate_state! for decorators or decorate_objective!, respectively. 
If you provide the ManifoldGradientObjective directly, these decorations can still be specified\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\nReferences\n\n\n\n\n\n","category":"function"},{"location":"solvers/LevenbergMarquardt/#Manopt.LevenbergMarquardt!","page":"Levenberg–Marquardt","title":"Manopt.LevenbergMarquardt!","text":"LevenbergMarquardt!(M, f, jacobian_f, p, num_components=-1; kwargs...)\n\nFor more options see LevenbergMarquardt.\n\n\n\n\n\n","category":"function"},{"location":"solvers/LevenbergMarquardt/#Options","page":"Levenberg–Marquardt","title":"Options","text":"","category":"section"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"LevenbergMarquardtState","category":"page"},{"location":"solvers/LevenbergMarquardt/#Manopt.LevenbergMarquardtState","page":"Levenberg–Marquardt","title":"Manopt.LevenbergMarquardtState","text":"LevenbergMarquardtState{P,T} <: AbstractGradientSolverState\n\nDescribes a Gradient based descent algorithm, with\n\nFields\n\nA default value is given in brackets if a parameter can be left out in initialization.\n\nx – a point (of type P) on a manifold as starting point\nstop – (StopAfterIteration(200) | StopWhenGradientNormLess(1e-12) | StopWhenStepsizeLess(1e-12)) a StoppingCriterion\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use, defaults to the default set for your manifold.\nresidual_values – value of F calculated in the solver setup or the previous iteration\nresidual_values_temp – value of F for the current proposal point\njacF – the current Jacobian of F\ngradient – the current gradient of F\nstep_vector – the tangent vector at x that is used to move to the next point\nlast_stepsize – length of step_vector\nη – Scaling factor for the sufficient cost decrease threshold required to accept new proposal points. 
Allowed range: 0 < η < 1.\ndamping_term – current value of the damping term\ndamping_term_min – initial (and also minimal) value of the damping term\nβ – parameter by which the damping term is multiplied when the current new point is rejected\nexpect_zero_residual – (false) if true, the algorithm expects that the value of the residual (objective) at the minimum is equal to 0.\n\nConstructor\n\nLevenbergMarquardtState(M, initialX, initial_residual_values, initial_jacF; initial_vector, kwargs...)\n\nGenerate Levenberg-Marquardt options.\n\nSee also\n\ngradient_descent, LevenbergMarquardt\n\n\n\n\n\n","category":"type"},{"location":"solvers/LevenbergMarquardt/#Literature","page":"Levenberg–Marquardt","title":"Literature","text":"","category":"section"},{"location":"solvers/LevenbergMarquardt/","page":"Levenberg–Marquardt","title":"Levenberg–Marquardt","text":"
","category":"page"},{"location":"solvers/exact_penalty_method/#ExactPenaltySolver","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":" exact_penalty_method\n exact_penalty_method!","category":"page"},{"location":"solvers/exact_penalty_method/#Manopt.exact_penalty_method","page":"Exact Penalty Method","title":"Manopt.exact_penalty_method","text":"exact_penalty_method(M, F, gradF, p=rand(M); kwargs...)\nexact_penalty_method(M, cmo::ConstrainedManifoldObjective, p=rand(M); kwargs...)\n\nperform the exact penalty method (EPM) Liu, Boumal, 2019, Appl. Math. Optim The aim of the EPM is to find a solution of the constrained optimisation task\n\nbeginaligned\nmin_p mathcalM f(p)\ntextsubject to g_i(p)leq 0 quad text for i= 1 m\nquad h_j(p)=0 quad text for j=1n\nendaligned\n\nwhere M is a Riemannian manifold, and f, g_i_i=1^m and h_j_j=1^n are twice continuously differentiable functions from M to ℝ. For that a weighted L_1-penalty term for the violation of the constraints is added to the objective\n\nf(x) + ρ (sum_i=1^m maxleft0 g_i(x)right + sum_j=1^n vert h_j(x)vert)\n\nwhere ρ0 is the penalty parameter. Since this is non-smooth, a SmoothingTechnique with parameter u is applied, see the ExactPenaltyCost.\n\nIn every step k of the exact penalty method, the smoothed objective is then minimized over all x mathcalM. 
Then, the accuracy tolerance ϵ and the smoothing parameter u are updated by setting\n\nϵ^(k)=maxϵ_min θ_ϵ ϵ^(k-1)\n\nwhere ϵ_min is the lowest value ϵ is allowed to become and θ_ϵ (01) is a constant scaling factor, and\n\nu^(k) = max u_min theta_u u^(k-1) \n\nwhere u_min is the lowest value u is allowed to become and θ_u (01) is a constant scaling factor.\n\nLastly, the penalty parameter ρ is updated according to\n\nρ^(k) = begincases\nρ^(k-1)θ_ρ textif displaystyle max_j in mathcalEi in mathcalI Bigl vert h_j(x^(k)) vert g_i(x^(k))Bigr geq u^(k-1) Bigr) \nρ^(k-1) textelse\nendcases\n\nwhere θ_ρ in (01) is a constant scaling factor.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function fmathcal Mℝ to minimize\ngrad_f – the gradient of the cost function\n\nOptional (if not called with the ConstrainedManifoldObjective cmo)\n\ng – (nothing) the inequality constraints\nh – (nothing) the equality constraints\ngrad_g – (nothing) the gradient of the inequality constraints\ngrad_h – (nothing) the gradient of the equality constraints\n\nNote that one of the pairs (g, grad_g) or (h, grad_h) has to be provided. Otherwise the problem is not constrained and you can also call e.g. 
quasi_Newton\n\nOptional\n\nsmoothing – (LogarithmicSumOfExponentials) SmoothingTechnique to use\nϵ – (1e-3) the accuracy tolerance\nϵ_exponent – (1/100) exponent of the ϵ update factor;\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nu – (1e-1) the smoothing parameter and threshold for violation of the constraints\nu_exponent – (1/100) exponent of the u update factor;\nu_min – (1e-6) the lower bound for the smoothing parameter and threshold for violation of the constraints\nρ – (1.0) the penalty parameter\nmin_stepsize – (1e-10) the minimal step size\nsub_cost – (ExactPenaltyCost(problem, ρ, u; smoothing=smoothing)) use this exact penalty cost, especially with the same numbers ρ,u as in the options for the sub problem\nsub_grad – (ExactPenaltyGrad(problem, ρ, u; smoothing=smoothing)) use this exact penalty gradient, especially with the same numbers ρ,u as in the options for the sub problem\nsub_kwargs – keyword arguments to decorate the sub options, e.g. with debug.\nsub_stopping_criterion – (StopAfterIteration(200) | StopWhenGradientNormLess(ϵ) | StopWhenStepsizeLess(1e-10)) specify a stopping criterion for the subsolver.\nsub_problem – (DefaultManoptProblem(M, ManifoldGradientObjective(sub_cost, sub_grad; evaluation=evaluation))) the problem for the subsolver\nsub_state – (QuasiNewtonState) using QuasiNewtonLimitedMemoryDirectionUpdate with InverseBFGS and sub_stopping_criterion as a stopping criterion. 
See also sub_kwargs.\nstopping_criterion – (StopAfterIteration(300) | (StopWhenSmallerOrEqual(ϵ, ϵ_min) & StopWhenChangeLess(1e-10))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/exact_penalty_method/#Manopt.exact_penalty_method!","page":"Exact Penalty Method","title":"Manopt.exact_penalty_method!","text":"exact_penalty_method!(M, f, grad_f, p; kwargs...)\nexact_penalty_method!(M, cmo::ConstrainedManifoldObjective, p; kwargs...)\n\nperform the exact penalty method (EPM) in place of p.\n\nFor all options, see exact_penalty_method.\n\n\n\n\n\n","category":"function"},{"location":"solvers/exact_penalty_method/#State","page":"Exact Penalty Method","title":"State","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"ExactPenaltyMethodState","category":"page"},{"location":"solvers/exact_penalty_method/#Manopt.ExactPenaltyMethodState","page":"Exact Penalty Method","title":"Manopt.ExactPenaltyMethodState","text":"ExactPenaltyMethodState{P,T} <: AbstractManoptSolverState\n\nDescribes the exact penalty method, with\n\nFields\n\na default value is given in brackets if a parameter can be left out in initialization.\n\np – a point on a manifold as starting point\nsub_problem – an AbstractManoptProblem problem for the subsolver\nsub_state – an AbstractManoptSolverState for the subsolver\nϵ – (1e-3) the accuracy tolerance\nϵ_min – (1e-6) the lower bound for the accuracy tolerance\nu – (1e-1) the smoothing parameter and threshold for violation of the constraints\nu_min – (1e-6) the lower bound for the smoothing parameter and threshold for violation of the constraints\nρ – (1.0) the penalty parameter\nθ_ρ – (0.3) the scaling factor of the penalty parameter\nstopping_criterion – 
(StopWhenAny(StopAfterIteration(300),StopWhenAll(StopWhenSmallerOrEqual(ϵ, ϵ_min),StopWhenChangeLess(min_stepsize)))) a functor inheriting from StoppingCriterion indicating when to stop.\n\nConstructor\n\nExactPenaltyMethodState(M::AbstractManifold, p, sub_problem, sub_state; kwargs...)\n\nconstruct an exact penalty method state with the fields and defaults as above, where the manifold M is used for defaults in the keyword arguments.\n\nSee also\n\nexact_penalty_method\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Helping-Functions","page":"Exact Penalty Method","title":"Helping Functions","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"ExactPenaltyCost\nExactPenaltyGrad\nSmoothingTechnique\nLinearQuadraticHuber\nLogarithmicSumOfExponentials","category":"page"},{"location":"solvers/exact_penalty_method/#Manopt.ExactPenaltyCost","page":"Exact Penalty Method","title":"Manopt.ExactPenaltyCost","text":"ExactPenaltyCost{S, Pr, R}\n\nRepresent the cost of the exact penalty method based on a ConstrainedManifoldObjective P and a parameter ρ given by\n\nf(p) + ρBigl(\n sum_i=0^m max0g_i(p) + sum_j=0^n lvert h_j(p)rvert\nBigr)\n\nwhere we use an additional parameter u and a smoothing technique, e.g. LogarithmicSumOfExponentials or LinearQuadraticHuber to obtain a smooth cost function. 
This struct is also a functor (M,p) -> v of the cost v.\n\nFields\n\nP, ρ, u as mentioned above.\n\nConstructor\n\nExactPenaltyCost(co::ConstrainedManifoldObjective, ρ, u; smoothing=LinearQuadraticHuber())\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.ExactPenaltyGrad","page":"Exact Penalty Method","title":"Manopt.ExactPenaltyGrad","text":"ExactPenaltyGrad{S, CO, R}\n\nRepresent the gradient of the ExactPenaltyCost based on a ConstrainedManifoldObjective co and a parameter ρ and a smoothing technique, which uses an additional parameter u.\n\nThis struct is also a functor in both formats\n\n(M, p) -> X to compute the gradient in allocating fashion.\n(M, X, p) to compute the gradient in in-place fashion.\n\nFields\n\nP, ρ, u as mentioned above.\n\nConstructor\n\nExactPenaltyGradient(co::ConstrainedManifoldObjective, ρ, u; smoothing=LinearQuadraticHuber())\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.SmoothingTechnique","page":"Exact Penalty Method","title":"Manopt.SmoothingTechnique","text":"abstract type SmoothingTechnique\n\nSpecify a smoothing technique, e.g. 
for the ExactPenaltyCost and ExactPenaltyGrad.\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.LinearQuadraticHuber","page":"Exact Penalty Method","title":"Manopt.LinearQuadraticHuber","text":"LinearQuadraticHuber <: SmoothingTechnique\n\nSpecify a smoothing based on max0x mathcal P(xu) for some u, where\n\nmathcal P(x u) = begincases\n 0 text if x leq 0\n fracx^22u text if 0 leq x leq u\n x-fracu2 text if x geq u\nendcases\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Manopt.LogarithmicSumOfExponentials","page":"Exact Penalty Method","title":"Manopt.LogarithmicSumOfExponentials","text":"LogarithmicSumOfExponentials <: SmoothingTechnique\n\nSpecify a smoothing based on maxab u log(mathrme^fracau+mathrme^fracbu) for some u.\n\n\n\n\n\n","category":"type"},{"location":"solvers/exact_penalty_method/#Literature","page":"Exact Penalty Method","title":"Literature","text":"","category":"section"},{"location":"solvers/exact_penalty_method/","page":"Exact Penalty Method","title":"Exact Penalty Method","text":"
","category":"page"},{"location":"functions/proximal_maps/#proximalMapFunctions","page":"Proximal Maps","title":"Proximal Maps","text":"","category":"section"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"For a function varphimathcal M ℝ the proximal map is defined as","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"displaystyleoperatornameprox_λvarphi(x)\n= operatorname*argmin_y mathcal M d_mathcal M^2(xy) + λvarphi(y)\nquad λ 0","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"where d_mathcal M mathcal M times mathcal M ℝ denotes the geodesic distance on mathcal M. While it might still be difficult to compute the minimizer, there are several proximal maps known (locally) in closed form. Furthermore if x^star mathcal M is a minimizer of varphi, then","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"displaystyleoperatornameprox_λvarphi(x^star) = x^star","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"i.e. a minimizer is a fixed point of the proximal map.","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"This page lists all proximal maps available within Manopt. 
To add your own, just extend the functions/proximal_maps.jl file.","category":"page"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"Modules = [Manopt]\nPages = [\"proximal_maps.jl\"]","category":"page"},{"location":"functions/proximal_maps/#Manopt.project_collaborative_TV","page":"Proximal Maps","title":"Manopt.project_collaborative_TV","text":"project_collaborative_TV(M, λ, x, Ξ[, p=2,q=1])\nproject_collaborative_TV!(M, Θ, λ, x, Ξ[, p=2,q=1])\n\ncompute the projection onto collaborative Norm unit (or α-) ball, i.e. of the function\n\nF^q(x) = sum_imathcal G\n Bigl( sum_jmathcal I_i\n sum_k=1^d lVert X_ijrVert_x^pBigr)^fracqp\n\nwhere mathcal G is the set of indices for xmathcal M and mathcal I_i is the set of its forward neighbors. The computation can also be done in place of Θ.\n\nThis is adopted from the paper Duran, Möller, Sbert, Cremers, SIAM J Imag Sci, 2016, see their Example 3 for details.\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Manopt.prox_TV","page":"Proximal Maps","title":"Manopt.prox_TV","text":"ξ = prox_TV(M,λ,x [,p=1])\n\ncompute the proximal maps operatornameprox_λvarphi of all forward differences occurring in the power manifold array, i.e. varphi(xixj) = d_mathcal M^p(xixj) with xi and xj being array elements of x and j = i+e_k, where e_k is the kth unit vector. The parameter λ is the prox parameter.\n\nInput\n\nM – a manifold M\nλ – a real value, parameter of the proximal map\nx – a point.\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\ny – resulting point with all mentioned proximal points evaluated (in a cyclic order). 
The computation can also be done in place.\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Manopt.prox_TV-Union{Tuple{T}, Tuple{AbstractManifold, Number, Tuple{T, T}}, Tuple{AbstractManifold, Number, Tuple{T, T}, Int64}} where T","page":"Proximal Maps","title":"Manopt.prox_TV","text":"[y1,y2] = prox_TV(M, λ, [x1,x2] [,p=1])\nprox_TV!(M, [y1,y2], λ, [x1,x2] [,p=1])\n\nCompute the proximal map operatornameprox_λvarphi of φ(xy) = d_mathcal M^p(xy) with parameter λ.\n\nInput\n\nM – a manifold M\nλ – a real value, parameter of the proximal map\n(x1,x2) – a tuple of two points\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\n(y1,y2) – resulting tuple of points of the operatornameprox_λφ((x1,x2)). The result can also be computed in place.\n\n\n\n\n\n","category":"method"},{"location":"functions/proximal_maps/#Manopt.prox_TV2-Union{Tuple{T}, Tuple{AbstractManifold, Any, Tuple{T, T, T}}, Tuple{AbstractManifold, Any, Tuple{T, T, T}, Int64}} where T","page":"Proximal Maps","title":"Manopt.prox_TV2","text":"(y1,y2,y3) = prox_TV2(M,λ,(x1,x2,x3),[p=1], kwargs...)\nprox_TV2!(M, y, λ,(x1,x2,x3),[p=1], kwargs...)\n\nCompute the proximal map operatornameprox_λvarphi of varphi(x_1x_2x_3) = d_mathcal M^p(c(x_1x_3)x_2) with parameter λ>0, where c(xz) denotes the mid point of a shortest geodesic from x1 to x3 that is closest to x2. The result can be computed in place of y.\n\nInput\n\nM – a manifold\nλ – a real value, parameter of the proximal map\n(x1,x2,x3) – a tuple of three points\np – (1) exponent of the distance of the TV term\n\nOptional\n\nkwargs... – parameters for the internal subgradient_method (if M is neither Euclidean nor Circle, since for these a closed form is given)\n\nOutput\n\n(y1,y2,y3) – resulting tuple of points of the proximal map. 
The computation can also be done in place.\n\n\n\n\n\n","category":"method"},{"location":"functions/proximal_maps/#Manopt.prox_TV2-Union{Tuple{T}, Tuple{N}, Tuple{PowerManifold{N, T}, Any, Any}, Tuple{PowerManifold{N, T}, Any, Any, Int64}} where {N, T}","page":"Proximal Maps","title":"Manopt.prox_TV2","text":"y = prox_TV2(M, λ, x[, p=1])\nprox_TV2!(M, y, λ, x[, p=1])\n\ncompute the proximal maps operatornameprox_λvarphi of all centered second order differences occurring in the power manifold array, i.e. varphi(x_kx_ix_j) = d_2(x_kx_ix_j), where kj are backward and forward neighbors (along any dimension in the array of x). The parameter λ is the prox parameter.\n\nInput\n\nM – a manifold M\nλ – a real value, parameter of the proximal map\nx – a point.\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\ny – resulting point with all mentioned proximal points evaluated (in a cyclic order). The computation can also be done in place.\n\n\n\n\n\n","category":"method"},{"location":"functions/proximal_maps/#Manopt.prox_distance","page":"Proximal Maps","title":"Manopt.prox_distance","text":"y = prox_distance(M,λ,f,x [, p=2])\nprox_distance!(M, y, λ, f, x [, p=2])\n\ncompute the proximal map operatornameprox_λvarphi with parameter λ of φ(x) = frac1pd_mathcal M^p(fx). For the mutating variant the computation is done in place of y.\n\nInput\n\nM – a manifold M\nλ – the prox parameter\nf – a point f mathcal M (the data)\nx – the argument of the proximal map\n\nOptional argument\n\np – (2) exponent of the distance.\n\nOutput\n\ny – the result of the proximal map of φ\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Manopt.prox_parallel_TV","page":"Proximal Maps","title":"Manopt.prox_parallel_TV","text":"y = prox_parallel_TV(M, λ, x [,p=1])\nprox_parallel_TV!(M, y, λ, x [,p=1])\n\ncompute the proximal maps operatornameprox_λφ of all forward differences occurring in the power manifold array, i.e. 
φ(x_ix_j) = d_mathcal M^p(x_ix_j) with xi and xj being array elements of x and j = i+e_k, where e_k is the kth unit vector. The parameter λ is the prox parameter.\n\nInput\n\nM – a PowerManifold manifold\nλ – a real value, parameter of the proximal map\nx – a point\n\nOptional\n\n(default is given in brackets)\n\np – (1) exponent of the distance of the TV term\n\nOutput\n\ny – resulting Array of points with all mentioned proximal points evaluated (in parallel within the array's elements). The computation can also be done in place.\n\nSee also prox_TV\n\n\n\n\n\n","category":"function"},{"location":"functions/proximal_maps/#Literature","page":"Proximal Maps","title":"Literature","text":"","category":"section"},{"location":"functions/proximal_maps/","page":"Proximal Maps","title":"Proximal Maps","text":"
[DMSC16]
\n
\n
J. Duran, M. Moeller, C. Sbert and D. Cremers. Collaborative Total Variation: A General Framework for Vectorial TV Models. SIAM Journal on Imaging Sciences 9, 116–151 (2016), arXiv: [1508.01308](https://arxiv.org/abs/1508.01308).
\n
\n
","category":"page"},{"location":"plans/#planSection","page":"Specify a Solver","title":"Plans for solvers","text":"","category":"section"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"For any optimisation performed in Manopt.jl we need information about both the optimisation task or “problem” at hand as well as the solver and all its parameters. This together is called a plan in Manopt.jl and it consists of two data structures:","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"The Manopt Problem describes all static data of our task, most prominently the manifold and the objective.\nThe Solver State describes all varying data and parameters for the solver we aim to use. This also means that each solver has its own data structure for the state.","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"By splitting these two parts, we can use one problem and solve it using different solvers.","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"Still there might be the need to set certain parameters within any of these structures. For that there is","category":"page"},{"location":"plans/","page":"Specify a Solver","title":"Specify a Solver","text":"set_manopt_parameter!\nManopt.status_summary","category":"page"},{"location":"plans/#Manopt.set_manopt_parameter!","page":"Specify a Solver","title":"Manopt.set_manopt_parameter!","text":"set_manopt_parameter!(f, element::Symbol, args...)\n\nFor any f and a Symbol e we dispatch on its value, by default, to set some args... in f or one of its sub elements.\n\n\n\n\n\nset_manopt_parameter!(amo::AbstractManifoldObjective, element::Symbol, args...)\n\nSet a certain args... from the AbstractManifoldObjective amo to value. 
This function should dispatch on Val(element).\n\nCurrently supported\n\n:Cost passes to the get_cost_function\n:Gradient passes to the get_gradient_function\n\n\n\n\n\nset_manopt_parameter!(ams::AbstractManoptProblem, element::Symbol, field::Symbol, value)\n\nSet a certain field/element from the AbstractManoptProblem ams to value. This function should dispatch on Val(element).\n\nBy default this passes on to the inner objective, see set_manopt_parameter!\n\n\n\n\n\nset_manopt_parameter!(ams::AbstractManoptSolverState, element::Symbol, args...)\n\nSet a certain field/element from the AbstractManoptSolverState ams to value. This function dispatches on Val(element).\n\n\n\n\n\nset_manopt_parameter!(ams::DebugSolverState, ::Val{:Debug}, args...)\n\nSet certain values specified by args... into the elements of the debugDictionary\n\n\n\n\n\nset_manopt_parameter!(ams::DebugSolverState, ::Val{:SubProblem}, args...)\n\nSet certain values specified by args... to the sub problem.\n\n\n\n\n\nset_manopt_parameter!(ams::DebugSolverState, ::Val{:SubState}, args...)\n\nSet certain values specified by args... to the sub state.\n\n\n\n\n\n","category":"function"},{"location":"plans/#Manopt.status_summary","page":"Specify a Solver","title":"Manopt.status_summary","text":"status_summary(e)\n\nReturn a string reporting about the current status of e, where e is a type from Manopt, e.g. an AbstractManoptSolverStates.\n\nThis method is similar to show but just returns a string. 
It might also be more verbose in explaining, or hide internal information.\n\n\n\n\n\n","category":"function"},{"location":"tutorials/ConstrainedOptimization/#How-to-do-Constrained-Optimization","page":"Do Constrained Optimization","title":"How to do Constrained Optimization","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Ronny Bergmann","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"This tutorial is a short introduction to using solvers for constrained optimisation in Manopt.jl.","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Introduction","page":"Do Constrained Optimization","title":"Introduction","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"A constrained optimisation problem is given by","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"tagP\nbeginalign*\noperatorname*argmin_pinmathcal M f(p)\ntextsuch that quad g(p) leq 0\nquad h(p) = 0\nendalign*","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"where fcolon mathcal M ℝ is a cost function, and gcolon mathcal M ℝ^m and hcolon mathcal M ℝ^n are the inequality and equality constraints, respectively. 
The leq and = in (P) are meant elementwise.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"This can be seen as a balance between moving constraints into the geometry of a manifold mathcal M and keeping some, since they can be handled well in algorithms, see [BH19], [LB19] for details.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"using Distributions, LinearAlgebra, Manifolds, Manopt, Random\nRandom.seed!(42);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"In this tutorial we want to look at different ways to specify the problem and its implications. We start with specifying an example problem to illustrate the different available forms.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We will consider the problem of a Nonnegative PCA, cf. 
Section 5.1.2 in [LB19]","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"let v_0 ℝ^d, lVert v_0 rVert=1 be a given spike signal, that is a signal that is sparse with only s=lfloor δd rfloor nonzero entries.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Z = sqrtσ v_0v_0^mathrmT+N","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"where sigma is a signal-to-noise ratio and N is a matrix with random entries, where the entries are distributed with zero mean and standard deviation 1d on the off-diagonals and 2d on the diagonal","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"d = 150; # dimension of v0\nσ = 0.1^2; # SNR\nδ = 0.1; s = Int(floor(δ * d)); # Sparsity\nS = sample(1:d, s; replace=false);\nv0 = [i ∈ S ? 
1 / sqrt(s) : 0.0 for i in 1:d];\nN = rand(Normal(0, 1 / d), (d, d)); N[diagind(N, 0)] .= rand(Normal(0, 2 / d), d);\nZ = sqrt(σ) * v0 * transpose(v0) + N;","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"In order to recover v_0 we consider the constrained optimisation problem on the sphere mathcal S^d-1 given by","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"beginalign*\noperatorname*argmin_pinmathcal S^d-1 -p^mathrmTZp\ntextsuch that quad p geq 0\nendalign*","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"or in the previous notation f(p) = -p^mathrmTZp and g(p) = -p. We first initialize the manifold under consideration","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"M = Sphere(d - 1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Sphere(149, ℝ)","category":"page"},{"location":"tutorials/ConstrainedOptimization/#A-first-Augmented-Lagrangian-Run","page":"Do Constrained Optimization","title":"A first Augmented Lagrangian Run","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We first define f and g as usual functions","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, p) = -transpose(p) * Z * p;\ng(M, p) = -p;","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained 
Optimization","title":"Do Constrained Optimization","text":"Since f is a function defined in the embedding ℝ^d as well, we obtain its gradient by projection.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"grad_f(M, p) = project(M, p, -transpose(Z) * p - Z * p);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"For the constraints this is a little more involved, since each function g_i = g(p)_i = -p_i has to return its own gradient. These are again in the embedding just operatornamegrad g_i(p) = -e_i, the i-th unit vector. We can project these again onto the tangent space at p:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"grad_g(M, p) = project.(\n Ref(M), Ref(p), [[i == j ? -1.0 : 0.0 for j in 1:d] for i in 1:d]\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We further start at a random point:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"p0 = rand(M);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Let’s check a few things for the initial point","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, p0)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained 
Optimization","text":"0.005747604833124234","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"How much the function g is positive","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, p0))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"0.17885478285466855","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Now as a first method we can just call the Augmented Lagrangian Method with a simple call:","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v1 = augmented_Lagrangian_method(\n M, f, grad_f, p0; g=g, grad_g=grad_g,\n debug=[:Iteration, :Cost, :Stop, \" | \", (:Change, \"Δp : %1.5e\"), 20, \"\\n\"],\n stopping_criterion = StopAfterIteration(300) | (\n StopWhenSmallerOrEqual(:ϵ, 1e-5) & StopWhenChangeLess(1e-8)\n )\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 20 f(x): -0.123842 | Δp : 9.99682e-01\n# 40 f(x): -0.123842 | Δp : 8.13541e-07\n# 60 f(x): -0.123842 | Δp : 7.85694e-04\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-5).\nThe algorithm performed a step with a change (1.7450108123172955e-15) less than 9.77237220955808e-6.\n 16.843524 seconds (43.34 M allocations: 32.293 GiB, 10.65% gc time, 37.25% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Now we have 
both a lower function value, and the point nearly satisfies the constraints, up to numerical inaccuracies","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384244779997305","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum( g(M, v1) )","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"7.912675333644102e-18","category":"page"},{"location":"tutorials/ConstrainedOptimization/#A-faster-Augmented-Lagrangian-Run","page":"Do Constrained Optimization","title":"A faster Augmented Lagrangian Run","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Now this is a little slow, so we can modify two things (we directly do both, but one could also change just one of them):","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Gradients should be evaluated in place, so for example","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"grad_f!(M, X, p) = project!(M, X, p, -transpose(Z) * p - Z * p);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"The constraints are currently always evaluated all together, since the function grad_g always returns a vector of gradients. 
We first change the constraint function into a vector of functions. We further change the gradient into a vector of gradient functions operatornamegrad g_i i=1ldotsd that are computed in place.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"g2 = [(M, p) -> -p[i] for i in 1:d];\ngrad_g2! = [\n (M, X, p) -> project!(M, X, p, [i == j ? -1.0 : 0.0 for j in 1:d]) for i in 1:d\n];","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We obtain","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v2 = augmented_Lagrangian_method(\n M, f, grad_f!, p0; g=g2, grad_g=grad_g2!, evaluation=InplaceEvaluation(),\n debug=[:Iteration, :Cost, :Stop, \" | \", (:Change, \"Δp : %1.5e\"), 20, \"\\n\"],\n stopping_criterion = StopAfterIteration(300) | (\n StopWhenSmallerOrEqual(:ϵ, 1e-5) & StopWhenChangeLess(1e-8)\n )\n );","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 20 f(x): -0.123842 | Δp : 9.99544e-01\n# 40 f(x): -0.123842 | Δp : 1.92065e-03\n# 60 f(x): -0.123842 | Δp : 4.84931e-06\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-5).\nThe algorithm performed a step with a change (2.7435918100802105e-17) less than 9.77237220955808e-6.\n 3.547284 seconds (6.52 M allocations: 3.728 GiB, 6.70% gc time, 41.27% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"As a technical remark: Note that (by default) the change to InplaceEvaluations affects both the constrained solver as well 
as the inner solver of the subproblem in each iteration.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v2)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384239276300012","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, v2))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"2.2466899389459647e-18","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"These are very similar to the previous values, but the solver took much less time and fewer memory allocations.","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Exact-Penalty-Method","page":"Do Constrained Optimization","title":"Exact Penalty Method","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"As a second solver, we have the Exact Penalty Method, which is currently available with two smoothing variants that turn the subproblem into a smooth optimisation problem, by default again solved with quasi Newton: LogarithmicSumOfExponentials and LinearQuadraticHuber. We compare both here as well. 
The first smoothing technique is the default, so we can just call","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v3 = exact_penalty_method(\n M, f, grad_f!, p0; g=g2, grad_g=grad_g2!, evaluation=InplaceEvaluation(),\n debug=[:Iteration, :Cost, :Stop, \" | \", :Change, 50, \"\\n\"],\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 50 f(x): -0.123071 | Last Change: 0.981116\n# 100 f(x): -0.123840 | Last Change: 0.014124\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (2.202641515349944e-7) less than 1.0e-6.\n 2.383160 seconds (5.78 M allocations: 3.123 GiB, 7.71% gc time, 64.51% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We obtain a similar cost value as for the Augmented Lagrangian Solver above, but here the constraint is actually fulfilled and not just numerically “on the boundary”.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v3)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384029692539944","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, v3))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained 
Optimization","text":"-3.582398293370528e-6","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"The second smoothing technique is often beneficial, when we have a lot of constraints (in the above mentioned vectorial manner), since we can avoid several gradient evaluations for the constraint functions here. This leads to a faster iteration time.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time v4 = exact_penalty_method(\n M, f, grad_f!, p0; g=g2, grad_g=grad_g2!,\n evaluation=InplaceEvaluation(),\n smoothing=LinearQuadraticHuber(),\n debug=[:Iteration, :Cost, :Stop, \" | \", :Change, 50, \"\\n\"],\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Initial f(x): 0.005748 | \n# 50 f(x): -0.123845 | Last Change: 0.009235\n# 100 f(x): -0.123843 | Last Change: 0.000107\nThe value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).\nThe algorithm performed a step with a change (3.586352489111338e-7) less than 1.0e-6.\n 1.557075 seconds (2.76 M allocations: 514.648 MiB, 5.08% gc time, 79.85% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"For the result we see the same behaviour as for the other smoothing.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, v4)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.12384258173223292","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained 
Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, v4))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"2.7028045565194566e-8","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Comparing-to-the-unconstraint-solver","page":"Do Constrained Optimization","title":"Comparing to the unconstrained solver","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"We can compare this to the global optimum on the sphere, which is the unconstrained optimisation problem; we can just use Quasi Newton.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"Note that this is much faster, since every iteration of the algorithms above does a quasi-Newton call as well.","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"@time w1 = quasi_Newton(\n M, f, grad_f!, p0; evaluation=InplaceEvaluation()\n);","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":" 0.706571 seconds (634.12 k allocations: 61.701 MiB, 3.18% gc time, 96.56% compilation time)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"f(M, w1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"-0.14021901809807297","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"But of course the 
constraints are not fulfilled and we have clearly positive entries in g(w_1)","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"maximum(g(M, w1))","category":"page"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"0.11762414497055226","category":"page"},{"location":"tutorials/ConstrainedOptimization/#Literature","page":"Do Constrained Optimization","title":"Literature","text":"","category":"section"},{"location":"tutorials/ConstrainedOptimization/","page":"Do Constrained Optimization","title":"Do Constrained Optimization","text":"
[BH19]
\n
\n
R. Bergmann and R. Herzog. Intrinsic formulation of KKT conditions and constraint qualifications on smooth manifolds. SIAM Journal on Optimization 29, 2423–2444 (2019), arXiv: [1804.06214](https://arxiv.org/abs/1804.06214).
","category":"page"},{"location":"helpers/exports/#Exports","page":"Exports","title":"Exports","text":"","category":"section"},{"location":"helpers/exports/","page":"Exports","title":"Exports","text":"Exports aim to provide a consistent generation of images of your results. For example if you record the trace your algorithm walks on the Sphere, you can easily export this trace to a rendered image using asymptote_export_S2_signals and render the result with Asymptote. Despite these, you can always record values during your iterations, and export these, for example to csv.","category":"page"},{"location":"helpers/exports/#Asymptote","page":"Exports","title":"Asymptote","text":"","category":"section"},{"location":"helpers/exports/","page":"Exports","title":"Exports","text":"The following functions provide exports both in graphics and/or raw data using Asymptote.","category":"page"},{"location":"helpers/exports/","page":"Exports","title":"Exports","text":"Modules = [Manopt]\nPages = [\"Asymptote.jl\"]","category":"page"},{"location":"helpers/exports/#Manopt.asymptote_export_S2_data-Tuple{String}","page":"Exports","title":"Manopt.asymptote_export_S2_data","text":"asymptote_export_S2_data(filename)\n\nExport given data as an array of points on the sphere, i.e. 
one-, two- or three-dimensional data with points on the Sphere mathbb S^2.\n\nInput\n\nfilename – a file to store the Asymptote code in.\n\nOptional Arguments (Data)\n\ndata – a point representing the 1-,2-, or 3-D array of points\nelevation_color_scheme - A ColorScheme for elevation\nscale_axes - ((1/3,1/3,1/3)) move spheres closer to each other by a factor per direction\n\nOptional Arguments (Asymptote)\n\narrow_head_size - (1.8) size of the arrowheads of the vectors (in mm)\ncamera_position - position of the camera (default: centered above xy-plane)\ntarget - position the camera points at (default: center of xy-plane within data).\n\n\n\n\n\n","category":"method"},{"location":"helpers/exports/#Manopt.asymptote_export_S2_signals-Tuple{String}","page":"Exports","title":"Manopt.asymptote_export_S2_signals","text":"asymptote_export_S2_signals(filename; points, curves, tangent_vectors, colors, options...)\n\nExport given points, curves, and tangent_vectors on the sphere mathbb S^2 to Asymptote.\n\nInput\n\nfilename – a file to store the Asymptote code in.\n\nOptional Arguments (Data)\n\ncolors - dictionary of color arrays (indexed by symbols :points, :curves and :tvector) where each entry has to provide at least as many colors as the length of the corresponding sets.\n curves – an Array of Arrays of points on the sphere, where each inner array is interpreted as a curve and is accompanied by an entry within colors\npoints – an Array of Arrays of points on the sphere where each inner array is interpreted as a set of points and is accompanied by an entry within colors\ntangent_vectors – an Array of Arrays of tuples, where the first is a point, the second a tangent vector and each set of vectors is accompanied by an entry from within colors\n\nOptional Arguments (Asymptote)\n\narrow_head_size - (6.0) size of the arrowheads of the tangent vectors\narrow_head_sizes – overrides the previous value to specify a value per tVector set.\ncamera_position - ((1., 1., 0.)) 
position of the camera in the Asymptote scene\nline_width – (1.0) size of the lines used to draw the curves.\nline_widths – overrides the previous value to specify a value per curve and tVector set.\ndot_size – (1.0) size of the dots used to draw the points.\ndot_sizes – overrides the previous value to specify a value per point set.\nsize - (nothing) a tuple for the image size, otherwise a relative size 4cm is used.\nsphere_color – (RGBA{Float64}(0.85, 0.85, 0.85, 0.6)) color of the sphere the data is drawn on\nsphere_line_color – (RGBA{Float64}(0.75, 0.75, 0.75, 0.6)) color of the lines on the sphere\nsphere_line_width – (0.5) line width of the lines on the sphere\ntarget – ((0.,0.,0.)) position the camera points at\n\n\n\n\n\n","category":"method"},{"location":"helpers/exports/#Manopt.asymptote_export_SPD-Tuple{String}","page":"Exports","title":"Manopt.asymptote_export_SPD","text":"asymptote_export_SPD(filename)\n\nexport given data as a point on a Power(SymmetricPositiveDefinite(3)) manifold, i.e. 
one-, two- or three-dimensional data with points on the manifold of symmetric positive definite matrices.\n\nInput\n\nfilename – a file to store the Asymptote code in.\n\nOptional Arguments (Data)\n\ndata – a point representing the 1-,2-, or 3-D array of SPD matrices\ncolor_scheme - A ColorScheme for Geometric Anisotropy Index\nscale_axes - ((1/3,1/3,1/3)) move symmetric positive definite matrices closer to each other by a factor per direction compared to the distance estimated by the maximal eigenvalue of all involved SPD points\n\nOptional Arguments (Asymptote)\n\ncamera_position - position of the camera (default: centered above xy-plane).\ntarget - position the camera points at (default: center of xy-plane within data).\n\nBoth values camera_position and target are scaled by scaledAxes*EW, where EW is the maximal eigenvalue in the data.\n\n\n\n\n\n","category":"method"},{"location":"helpers/exports/#Manopt.render_asymptote-Tuple{Any}","page":"Exports","title":"Manopt.render_asymptote","text":"render_asymptote(filename; render=4, format=\"png\", ...)\n\nrender an exported asymptote file specified in the filename, which can also be given as a relative or full path\n\nInput\n\nfilename – filename of the exported asy and rendered image\n\nKeyword Arguments\n\nthe default values are given in brackets\n\nrender – (4) render level of asymptote, i.e. its -render option. This can be removed from the command by setting it to nothing.\nformat – (\"png\") final rendered format, i.e. 
asymptote's -f option\nexport_file - (the filename with format as ending) specify the export filename\n\n\n\n\n\n","category":"method"},{"location":"plans/problem/#ProblemSection","page":"Problem","title":"A Manopt Problem","text":"","category":"section"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"CurrentModule = Manopt","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"A problem describes all static data of an optimisation task and has as a super type","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"AbstractManoptProblem\nget_objective\nget_manifold","category":"page"},{"location":"plans/problem/#Manopt.AbstractManoptProblem","page":"Problem","title":"Manopt.AbstractManoptProblem","text":"AbstractManoptProblem{M<:AbstractManifold}\n\nDescribe a Riemannian optimization problem with all static (not-changing) properties.\n\nThe most prominent features that should always be stated here are\n\nthe AbstractManifold mathcal M (cf. ManifoldsBase.jl#AbstractManifold)\nthe cost function fcolon mathcal M ℝ\n\nUsually the cost should be within an AbstractManifoldObjective.\n\n\n\n\n\n","category":"type"},{"location":"plans/problem/#Manopt.get_objective","page":"Problem","title":"Manopt.get_objective","text":"get_objective(o::AbstractManifoldObjective, recursive=true)\n\nreturn the (one step) undecorated AbstractManifoldObjective of the (possibly) decorated o. As long as your decorated objective stores the objective within o.objective and the dispatch_objective_decorator is set to Val{true}, the internal state are extracted automatically.\n\nBy default the objective that is stored within a decorated objective is assumed to be at o.objective. 
Overwrite _get_objective(o, ::Val{true}, recursive) to change this behaviour for your objective o for both the recursive and the nonrecursive case.\n\nIf recursive is set to false, only the outermost decorator is taken away instead of all.\n\n\n\n\n\nget_objective(mp::AbstractManoptProblem, recursive=false)\n\nreturn the objective AbstractManifoldObjective stored within an AbstractManoptProblem. If recursive is set to true, it additionally unwraps all decorators of the objective\n\n\n\n\n\n","category":"function"},{"location":"plans/problem/#Manopt.get_manifold","page":"Problem","title":"Manopt.get_manifold","text":"get_manifold(amp::AbstractManoptProblem)\n\nreturn the manifold stored within an AbstractManoptProblem\n\n\n\n\n\n","category":"function"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"Usually, such a problem is determined by the manifold or domain of the optimisation and the objective with all its properties used within an algorithm – see The Objective. 
For that we can just use","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"DefaultManoptProblem","category":"page"},{"location":"plans/problem/#Manopt.DefaultManoptProblem","page":"Problem","title":"Manopt.DefaultManoptProblem","text":"DefaultManoptProblem{TM <: AbstractManifold, Objective <: AbstractManifoldObjective}\n\nModel a default manifold problem that (just) consists of the domain of optimisation, that is an AbstractManifold and an AbstractManifoldObjective\n\n\n\n\n\n","category":"type"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"The exception to these are the primal-dual-based solvers (Chambolle-Pock and the PD Semismooth Newton), which both need two manifolds as their domain(s), hence there also exists a","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"TwoManifoldProblem","category":"page"},{"location":"plans/problem/#Manopt.TwoManifoldProblem","page":"Problem","title":"Manopt.TwoManifoldProblem","text":"TwoManifoldProblem{\n MT<:AbstractManifold,NT<:AbstractManifold,O<:AbstractManifoldObjective\n} <: AbstractManoptProblem{MT}\n\nAn abstract type for primal-dual-based problems.\n\n\n\n\n\n","category":"type"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"From the two ingredients here, you can find more information about","category":"page"},{"location":"plans/problem/","page":"Problem","title":"Problem","text":"the AbstractManifold in ManifoldsBase.jl\nthe AbstractManifoldObjective on the page about the objective.","category":"page"},{"location":"solvers/quasi_Newton/#quasiNewton","page":"Quasi-Newton","title":"Riemannian quasi-Newton methods","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":" CurrentModule = Manopt","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":" quasi_Newton\n 
 quasi_Newton!","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.quasi_Newton","page":"Quasi-Newton","title":"Manopt.quasi_Newton","text":"quasi_Newton(M, f, grad_f, p)\n\nPerform a quasi-Newton iteration for f on the manifold M starting in the point p.\n\nThe kth iteration consists of\n\nCompute the search direction η_k = -mathcalB_k operatornamegradf (p_k) or solve mathcalH_k η_k = -operatornamegradf (p_k).\nDetermine a suitable stepsize α_k along the curve gamma(α) = R_p_k(α η_k) e.g. by using WolfePowellLinesearch.\nCompute p_k+1 = R_p_k(α_k η_k).\nDefine s_k = T_p_k α_k η_k(α_k η_k) and y_k = operatornamegradf(p_k+1) - T_p_k α_k η_k(operatornamegradf(p_k)).\nCompute the new approximate Hessian H_k+1 or its inverse B_k.\n\nInput\n\nM – a manifold mathcalM.\nf – a cost function F mathcalM ℝ to minimize.\ngrad_f – the gradient operatornamegradF mathcalM T_xmathcal M of F.\np – an initial value p mathcalM.\n\nOptional\n\nbasis – (DefaultOrthonormalBasis()) basis within the tangent space(s) to represent the Hessian (inverse).\ncautious_update – (false) – whether or not to use a QuasiNewtonCautiousDirectionUpdate\ncautious_function – ((x) -> x*10^(-4)) – a monotone increasing function that is zero at 0 and strictly increasing at 0 for the cautious update.\ndirection_update – (InverseBFGS()) the update rule to use.\nevaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. in the form gradF!(M, X, x).\ninitial_operator – (Matrix{Float64}(I,n,n)) initial matrix to use for the approximation, where n=manifold_dimension(M), see also scale_initial_operator.\nmemory_size – (20) limited memory, number of s_k y_k to store. 
Set to a negative value to use a full memory representation\nretraction_method – (default_retraction_method(M, typeof(p))) a retraction method to use, by default the exponential map.\nscale_initial_operator - (true) scale initial operator with fracs_ky_k_p_klVert y_krVert_p_k in the computation\nstabilize – (true) stabilize the method numerically by projecting computed (Newton-) directions to the tangent space to reduce numerical errors\nstepsize – (WolfePowellLinesearch(retraction_method, vector_transport_method)) specify a Stepsize.\nstopping_criterion - (StopWhenAny(StopAfterIteration(max(1000, memory_size)), StopWhenGradientNormLess(10^(-6))) specify a StoppingCriterion\nvector_transport_method – (default_vector_transport_method(M, typeof(p))) a vector transport to use.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details.\n\n\n\n\n\n","category":"function"},{"location":"solvers/quasi_Newton/#Manopt.quasi_Newton!","page":"Quasi-Newton","title":"Manopt.quasi_Newton!","text":"quasi_Newton!(M, F, gradF, x; options...)\n\nPerform a quasi Newton iteration for F on the manifold M starting in the point x using a retraction R and a vector transport T.\n\nInput\n\nM – a manifold mathcalM.\nF – a cost function F mathcalM ℝ to minimize.\ngradF– the gradient operatornamegradF mathcalM T_xmathcal M of F implemented as gradF(M,p).\nx – an initial value x mathcalM.\n\nFor all optional parameters, see quasi_Newton.\n\n\n\n\n\n","category":"function"},{"location":"solvers/quasi_Newton/#Background","page":"Quasi-Newton","title":"Background","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"The aim is to minimize a real-valued function on a Riemannian manifold, i.e.","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"min f(x) quad x 
mathcalM","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"Riemannian quasi-Newton methods are, as generalizations of their Euclidean counterparts, Riemannian line search methods. These methods determine a search direction η_k T_x_k mathcalM at the current iterate x_k and a suitable stepsize α_k along gamma(α) = R_x_k(α η_k), where R T mathcalM mathcalM is a retraction. The next iterate is obtained by","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"x_k+1 = R_x_k(α_k η_k)","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"In quasi-Newton methods, the search direction is given by","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"η_k = -mathcalH_k^-1operatornamegradf (x_k) = -mathcalB_k operatornamegradf (x_k)","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where mathcalH_k T_x_k mathcalM T_x_k mathcalM is a positive definite self-adjoint operator, which approximates the action of the Hessian operatornameHess f (x_k) and mathcalB_k = mathcalH_k^-1. The idea of quasi-Newton methods is that, instead of creating a completely new approximation of the Hessian operator operatornameHess f(x_k+1) or its inverse at every iteration, the previous operator mathcalH_k or mathcalB_k is updated by a convenient formula using the information about the curvature of the objective function obtained during the iteration. The resulting operator mathcalH_k+1 or mathcalB_k+1 acts on the tangent space T_x_k+1 mathcalM of the freshly computed iterate x_k+1. In order to get a well-defined method, the following requirements are placed on the new operator mathcalH_k+1 or mathcalB_k+1 that is created by an update. 
Since the Hessian operatornameHess f(x_k+1) is a self-adjoint operator on the tangent space T_x_k+1 mathcalM, and mathcalH_k+1 approximates it, we require that mathcalH_k+1 or mathcalB_k+1 is also self-adjoint on T_x_k+1 mathcalM. In order to achieve a steady descent, we want η_k to be a descent direction in each iteration. Therefore we require that mathcalH_k+1 or mathcalB_k+1 is a positive definite operator on T_x_k+1 mathcalM. In order to get information about the curvature of the objective function into the new operator mathcalH_k+1 or mathcalB_k+1, we require that it satisfies a form of a Riemannian quasi-Newton equation:","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"mathcalH_k+1 T_x_k rightarrow x_k+1(R_x_k^-1(x_k+1)) = operatornamegradf(x_k+1) - T_x_k rightarrow x_k+1(operatornamegradf(x_k))","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"or","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"mathcalB_k+1 operatornamegradf(x_k+1) - T_x_k rightarrow x_k+1(operatornamegradf(x_k)) = T_x_k rightarrow x_k+1(R_x_k^-1(x_k+1))","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where T_x_k rightarrow x_k+1 T_x_k mathcalM T_x_k+1 mathcalM and the chosen retraction R is the associated retraction of T. We note that, of course, not all updates in all situations will meet these conditions in every iteration. 
For specific quasi-Newton updates, the fulfillment of the Riemannian curvature condition, which requires that","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"g_x_k+1(s_k y_k) 0","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"holds, is a requirement for the inheritance of the self-adjointness and positive definiteness of the mathcalH_k or mathcalB_k to the operator mathcalH_k+1 or mathcalB_k+1. Unfortunately, the fulfillment of the Riemannian curvature condition is not given by a step size alpha_k 0 that satisfies the generalized Wolfe conditions. However, in order to create a positive definite operator mathcalH_k+1 or mathcalB_k+1 in each iteration, the so-called locking condition was introduced in Huang, Gallivan, Absil, SIAM J. Optim., 2015, which requires that the isometric vector transport T^S, which is used in the update formula, and its associated retraction R fulfill","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"T^Sx ξ_x(ξ_x) = β T^Rx ξ_x(ξ_x) quad β = fraclVert ξ_x rVert_xlVert T^Rx ξ_x(ξ_x) rVert_R_x(ξ_x)","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where T^R is the vector transport by differentiated retraction. 
With the requirement that the isometric vector transport T^S and its associated retraction R satisfies the locking condition and using the tangent vector","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"y_k = β_k^-1 operatornamegradf(x_k+1) - T^Sx_k α_k η_k(operatornamegradf(x_k))","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"where","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"β_k = fraclVert α_k η_k rVert_x_klVert T^Rx_k α_k η_k(α_k η_k) rVert_x_k+1","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"in the update, it can be shown that choosing a stepsize α_k 0 that satisfies the Riemannian Wolfe conditions leads to the fulfillment of the Riemannian curvature condition, which in turn implies that the operator generated by the updates is positive definite. In the following we denote the specific operators in matrix notation and hence use H_k and B_k, respectively.","category":"page"},{"location":"solvers/quasi_Newton/#Direction-Updates","page":"Quasi-Newton","title":"Direction Updates","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"In general there are different ways to compute a fixed AbstractQuasiNewtonUpdateRule. 
In general these are represented by","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"AbstractQuasiNewtonDirectionUpdate\nQuasiNewtonMatrixDirectionUpdate\nQuasiNewtonLimitedMemoryDirectionUpdate\nQuasiNewtonCautiousDirectionUpdate","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.AbstractQuasiNewtonDirectionUpdate","page":"Quasi-Newton","title":"Manopt.AbstractQuasiNewtonDirectionUpdate","text":"AbstractQuasiNewtonDirectionUpdate\n\nAn abstract representation of an Quasi Newton Update rule to determine the next direction given current QuasiNewtonState.\n\nAll subtypes should be functors, i.e. one should be able to call them as H(M,x,d) to compute a new direction update.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonMatrixDirectionUpdate","page":"Quasi-Newton","title":"Manopt.QuasiNewtonMatrixDirectionUpdate","text":"QuasiNewtonMatrixDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate\n\nThese AbstractQuasiNewtonDirectionUpdates represent any quasi-Newton update rule, where the operator is stored as a matrix. A distinction is made between the update of the approximation of the Hessian, H_k mapsto H_k+1, and the update of the approximation of the Hessian inverse, B_k mapsto B_k+1. For the first case, the coordinates of the search direction η_k with respect to a basis b_i^n_i=1 are determined by solving a linear system of equations, i.e.\n\ntextSolve quad hatη_k = - H_k widehatoperatornamegradf(x_k)\n\nwhere H_k is the matrix representing the operator with respect to the basis b_i^n_i=1 and widehatoperatornamegradf(x_k) represents the coordinates of the gradient of the objective function f in x_k with respect to the basis b_i^n_i=1. 
If a method is chosen where Hessian inverse is approximated, the coordinates of the search direction η_k with respect to a basis b_i^n_i=1 are obtained simply by matrix-vector multiplication, i.e.\n\nhatη_k = - B_k widehatoperatornamegradf(x_k)\n\nwhere B_k is the matrix representing the operator with respect to the basis b_i^n_i=1 and widehatoperatornamegradf(x_k) as above. In the end, the search direction η_k is generated from the coordinates hateta_k and the vectors of the basis b_i^n_i=1 in both variants. The AbstractQuasiNewtonUpdateRule indicates which quasi-Newton update rule is used. In all of them, the Euclidean update formula is used to generate the matrix H_k+1 and B_k+1, and the basis b_i^n_i=1 is transported into the upcoming tangent space T_x_k+1 mathcalM, preferably with an isometric vector transport, or generated there.\n\nFields\n\nupdate – a AbstractQuasiNewtonUpdateRule.\nbasis – the basis.\nmatrix – (Matrix{Float64}(I, manifold_dimension(M), manifold_dimension(M))) the matrix which represents the approximating operator.\nscale – (`true) indicates whether the initial matrix (= identity matrix) should be scaled before the first update.\nvector_transport_method – (vector_transport_method)an AbstractVectorTransportMethod\n\nConstructor\n\nQuasiNewtonMatrixDirectionUpdate(M::AbstractManifold, update, basis, matrix;\nscale=true, vector_transport_method=default_vector_transport_method(M))\n\nGenerate the Update rule with defaults from a manifold and the names corresponding to the fields above.\n\nSee also\n\nQuasiNewtonLimitedMemoryDirectionUpdate QuasiNewtonCautiousDirectionUpdate AbstractQuasiNewtonDirectionUpdate\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonLimitedMemoryDirectionUpdate","page":"Quasi-Newton","title":"Manopt.QuasiNewtonLimitedMemoryDirectionUpdate","text":"QuasiNewtonLimitedMemoryDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate\n\nThis AbstractQuasiNewtonDirectionUpdate represents the 
limited-memory Riemannian BFGS update, where the approximating operator is represented by m stored pairs of tangent vectors widetildes_i widetildey_i_i=k-m^k-1 in the k-th iteration. For the calculation of the search direction η_k, the generalisation of the two-loop recursion is used (see Huang, Gallivan, Absil, SIAM J. Optim., 2015), since it only requires inner products and linear combinations of tangent vectors in T_x_k mathcalM. For that the stored pairs of tangent vectors widetildes_i widetildey_i_i=k-m^k-1, the gradient operatornamegradf(x_k) of the objective function f in x_k and the positive definite self-adjoint operator\n\nmathcalB^(0)_k = fracg_x_k(s_k-1 y_k-1)g_x_k(y_k-1 y_k-1) mathrmid_T_x_k mathcalM\n\nare used. The two-loop recursion can be understood as follows: the InverseBFGS update is executed m times in a row on mathcalB^(0)_k using the tangent vectors widetildes_i widetildey_i_i=k-m^k-1, and at the same time the resulting operator mathcalB^LRBFGS_k is directly applied on operatornamegradf(x_k). When updating there are two cases: if there is still free memory, i.e. k m, the previously stored vector pairs widetildes_i widetildey_i_i=k-m^k-1 have to be transported into the upcoming tangent space T_x_k+1 mathcalM; if there is no free memory, the oldest pair widetildes_km widetildey_km has to be discarded and then all the remaining vector pairs widetildes_i widetildey_i_i=k-m+1^k-1 are transported into the tangent space T_x_k+1 mathcalM. After that we calculate and store s_k = widetildes_k = T^S_x_k α_k η_k(α_k η_k) and y_k = widetildey_k. 
This process ensures that new information about the objective function is always included and the old, probably no longer relevant, information is discarded.\n\nFields\n\nmemory_s – the set of the stored (and transported) search directions times step size widetildes_i_i=k-m^k-1.\nmemory_y – set of the stored gradient differences widetildey_i_i=k-m^k-1.\nξ – a variable used in the two-loop recursion.\nρ – a variable used in the two-loop recursion.\nscale –\nvector_transport_method – an AbstractVectorTransportMethod\nmessage – a string containing a potential warning that might have appeared\n\nConstructor\n\nQuasiNewtonLimitedMemoryDirectionUpdate(\n M::AbstractManifold,\n x,\n update::AbstractQuasiNewtonUpdateRule,\n memory_size;\n initial_vector=zero_vector(M,x),\n scale=1.0,\n project=true\n )\n\nSee also\n\nInverseBFGS QuasiNewtonCautiousDirectionUpdate AbstractQuasiNewtonDirectionUpdate\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonCautiousDirectionUpdate","page":"Quasi-Newton","title":"Manopt.QuasiNewtonCautiousDirectionUpdate","text":"QuasiNewtonCautiousDirectionUpdate <: AbstractQuasiNewtonDirectionUpdate\n\nThese AbstractQuasiNewtonDirectionUpdates represent any quasi-Newton update rule based on the idea of a so-called cautious update. The search direction is calculated as given in QuasiNewtonMatrixDirectionUpdate or QuasiNewtonLimitedMemoryDirectionUpdate, but the update then is only executed if\n\nfracg_x_k+1(y_ks_k)lVert s_k rVert^2_x_k+1 geq theta(lVert operatornamegradf(x_k) rVert_x_k)\n\nis satisfied, where theta is a monotone increasing function satisfying theta(0) = 0 and theta is strictly increasing at 0. If this is not the case, the corresponding update will be skipped, which means that for QuasiNewtonMatrixDirectionUpdate the matrix H_k or B_k is not updated. 
The basis b_i^n_i=1 is nevertheless transported into the upcoming tangent space T_x_k+1 mathcalM, and for QuasiNewtonLimitedMemoryDirectionUpdate neither the oldest vector pair widetildes_km widetildey_km is discarded nor the newest vector pair widetildes_k widetildey_k is added into storage, but all stored vector pairs widetildes_i widetildey_i_i=k-m^k-1 are transported into the tangent space T_x_k+1 mathcalM. If BFGS or InverseBFGS is chosen as update, then the resulting method follows the method of Huang, Absil, Gallivan, SIAM J. Optim., 2018, taking into account that the corresponding step size is chosen.\n\nFields\n\nupdate – an AbstractQuasiNewtonDirectionUpdate\nθ – a monotone increasing function satisfying θ(0) = 0 and θ is strictly increasing at 0.\n\nConstructor\n\nQuasiNewtonCautiousDirectionUpdate(U::QuasiNewtonMatrixDirectionUpdate; θ = x -> x)\nQuasiNewtonCautiousDirectionUpdate(U::QuasiNewtonLimitedMemoryDirectionUpdate; θ = x -> x)\n\nGenerate a cautious update for either a matrix-based or a limited-memory-based update rule.\n\nSee also\n\nQuasiNewtonMatrixDirectionUpdate QuasiNewtonLimitedMemoryDirectionUpdate\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Hessian-Update-Rules","page":"Quasi-Newton","title":"Hessian Update Rules","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"Using","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"update_hessian!","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.update_hessian!","page":"Quasi-Newton","title":"Manopt.update_hessian!","text":"update_hessian!(d, amp, st, p_old, iter)\n\nupdate the Hessian within the QuasiNewtonState o given an AbstractManoptProblem amp as well as an AbstractQuasiNewtonDirectionUpdate d and the last iterate p_old. 
Note that the current (iterth) iterate is already stored in o.x.\n\nSee also AbstractQuasiNewtonUpdateRule for the different rules that are available within d.\n\n\n\n\n\n","category":"function"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"the following update formulae for either H_k+1 or B_k+1 are available.","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"AbstractQuasiNewtonUpdateRule\nBFGS\nDFP\nBroyden\nSR1\nInverseBFGS\nInverseDFP\nInverseBroyden\nInverseSR1","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.AbstractQuasiNewtonUpdateRule","page":"Quasi-Newton","title":"Manopt.AbstractQuasiNewtonUpdateRule","text":"AbstractQuasiNewtonUpdateRule\n\nSpecify a type for the different AbstractQuasiNewtonDirectionUpdates, that is, e.g. for a QuasiNewtonMatrixDirectionUpdate there are several different updates to the matrix, while the default for QuasiNewtonLimitedMemoryDirectionUpdate the most prominent is InverseBFGS.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.BFGS","page":"Quasi-Newton","title":"Manopt.BFGS","text":"BFGS <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian BFGS update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeH_k^mathrmBFGS the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nH^mathrmBFGS_k+1 = widetildeH^mathrmBFGS_k + fracy_k y^mathrmT_k s^mathrmT_k y_k - fracwidetildeH^mathrmBFGS_k s_k s^mathrmT_k widetildeH^mathrmBFGS_k s^mathrmT_k widetildeH^mathrmBFGS_k s_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.DFP","page":"Quasi-Newton","title":"Manopt.DFP","text":"DFP <: AbstractQuasiNewtonUpdateRule\n\nindicates in an AbstractQuasiNewtonDirectionUpdate that the Riemannian DFP update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeH_k^mathrmDFP the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). Then the update formula reads\n\nH^mathrmDFP_k+1 = Bigl(\n mathrmid_T_x_k+1 mathcalM - fracy_k s^mathrmT_ks^mathrmT_k y_k\nBigr)\nwidetildeH^mathrmDFP_k\nBigl(\n mathrmid_T_x_k+1 mathcalM - fracs_k y^mathrmT_ks^mathrmT_k y_k\nBigr) + fracy_k y^mathrmT_ks^mathrmT_k y_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.Broyden","page":"Quasi-Newton","title":"Manopt.Broyden","text":"Broyden <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian Broyden update is used in the Riemannian quasi-Newton method, which is as a convex combination of BFGS and DFP.\n\nWe denote by widetildeH_k^mathrmBr the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nH^mathrmBr_k+1 = widetildeH^mathrmBr_k\n - fracwidetildeH^mathrmBr_k s_k s^mathrmT_k widetildeH^mathrmBr_ks^mathrmT_k widetildeH^mathrmBr_k s_k + fracy_k y^mathrmT_ks^mathrmT_k y_k\n + φ_k s^mathrmT_k widetildeH^mathrmBr_k s_k\n Bigl(\n fracy_ks^mathrmT_k y_k - fracwidetildeH^mathrmBr_k s_ks^mathrmT_k widetildeH^mathrmBr_k s_k\n Bigr)\n Bigl(\n fracy_ks^mathrmT_k y_k - fracwidetildeH^mathrmBr_k s_ks^mathrmT_k widetildeH^mathrmBr_k s_k\n Bigr)^mathrmT\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively, and φ_k is the Broyden factor which is :constant by default but can also be set to :Davidon.\n\nConstructor\n\nBroyden(φ, update_rule::Symbol = :constant)\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.SR1","page":"Quasi-Newton","title":"Manopt.SR1","text":"SR1 <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian SR1 update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeH_k^mathrmSR1 the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). Then the update formula reads\n\nH^mathrmSR1_k+1 = widetildeH^mathrmSR1_k\n+ frac\n (y_k - widetildeH^mathrmSR1_k s_k) (y_k - widetildeH^mathrmSR1_k s_k)^mathrmT\n\n(y_k - widetildeH^mathrmSR1_k s_k)^mathrmT s_k\n\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\nThis method can be stabilized by only performing the update if denominator is larger than rlVert s_krVert_x_k+1lVert y_k - widetildeH^mathrmSR1_k s_k rVert_x_k+1 for some r0. 
For more details, see Section 6.2 in Nocedal, Wright, Springer, 2006.\n\nConstructor\n\nSR1(r::Float64=-1.0)\n\nGenerate the SR1 update, which by default does not include the check (since the default sets t0`)\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseBFGS","page":"Quasi-Newton","title":"Manopt.InverseBFGS","text":"InverseBFGS <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the inverse Riemannian BFGS update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeB_k^mathrmBFGS the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). Then the update formula reads\n\nB^mathrmBFGS_k+1 = Bigl(\n mathrmid_T_x_k+1 mathcalM - fracs_k y^mathrmT_k s^mathrmT_k y_k\nBigr)\nwidetildeB^mathrmBFGS_k\nBigl(\n mathrmid_T_x_k+1 mathcalM - fracy_k s^mathrmT_k s^mathrmT_k y_k\nBigr) + fracs_k s^mathrmT_ks^mathrmT_k y_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseDFP","page":"Quasi-Newton","title":"Manopt.InverseDFP","text":"InverseDFP <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the inverse Riemannian DFP update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeB_k^mathrmDFP the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nB^mathrmDFP_k+1 = widetildeB^mathrmDFP_k + fracs_k s^mathrmT_ks^mathrmT_k y_k\n - fracwidetildeB^mathrmDFP_k y_k y^mathrmT_k widetildeB^mathrmDFP_ky^mathrmT_k widetildeB^mathrmDFP_k y_k\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseBroyden","page":"Quasi-Newton","title":"Manopt.InverseBroyden","text":"InverseBroyden <: AbstractQuasiNewtonUpdateRule\n\nIndicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian Broyden update is used in the Riemannian quasi-Newton method, which is as a convex combination of InverseBFGS and InverseDFP.\n\nWe denote by widetildeH_k^mathrmBr the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nB^mathrmBr_k+1 = widetildeB^mathrmBr_k\n - fracwidetildeB^mathrmBr_k y_k y^mathrmT_k widetildeB^mathrmBr_ky^mathrmT_k widetildeB^mathrmBr_k y_k\n + fracs_k s^mathrmT_ks^mathrmT_k y_k\n + φ_k y^mathrmT_k widetildeB^mathrmBr_k y_k\n Bigl(\n fracs_ks^mathrmT_k y_k - fracwidetildeB^mathrmBr_k y_ky^mathrmT_k widetildeB^mathrmBr_k y_k\n Bigr) Bigl(\n fracs_ks^mathrmT_k y_k - fracwidetildeB^mathrmBr_k y_ky^mathrmT_k widetildeB^mathrmBr_k y_k\n Bigr)^mathrmT\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively, and φ_k is the Broyden factor which is :constant by default but can also be set to :Davidon.\n\nConstructor\n\nInverseBroyden(φ, update_rule::Symbol = :constant)\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Manopt.InverseSR1","page":"Quasi-Newton","title":"Manopt.InverseSR1","text":"InverseSR1 <: AbstractQuasiNewtonUpdateRule\n\nindicates in AbstractQuasiNewtonDirectionUpdate that the inverse Riemannian SR1 update is used in the Riemannian quasi-Newton method.\n\nWe denote by widetildeB_k^mathrmSR1 the operator concatenated with a vector transport and its inverse before and after to act on x_k+1 = R_x_k(α_k η_k). 
Then the update formula reads\n\nB^mathrmSR1_k+1 = widetildeB^mathrmSR1_k\n+ frac\n (s_k - widetildeB^mathrmSR1_k y_k) (s_k - widetildeB^mathrmSR1_k y_k)^mathrmT\n\n (s_k - widetildeB^mathrmSR1_k y_k)^mathrmT y_k\n\n\nwhere s_k and y_k are the coordinate vectors with respect to the current basis (from QuasiNewtonState) of\n\nT^S_x_k α_k η_k(α_k η_k) quadtextandquad\noperatornamegradf(x_k+1) - T^S_x_k α_k η_k(operatornamegradf(x_k)) T_x_k+1 mathcalM\n\nrespectively.\n\nThis method can be stabilized by only performing the update if denominator is larger than rlVert y_krVert_x_k+1lVert s_k - widetildeH^mathrmSR1_k y_k rVert_x_k+1 for some r0. For more details, see Section 6.2 in Nocedal, Wright, Springer, 2006.\n\nConstructor\n\nInverseSR1(r::Float64=-1.0)\n\nGenerate the InverseSR1 update, which by default does not include the check, since the default sets t0`.\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#State","page":"Quasi-Newton","title":"State","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"The quasi Newton algorithm is based on a DefaultManoptProblem.","category":"page"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"QuasiNewtonState","category":"page"},{"location":"solvers/quasi_Newton/#Manopt.QuasiNewtonState","page":"Quasi-Newton","title":"Manopt.QuasiNewtonState","text":"QuasiNewtonState <: AbstractManoptSolverState\n\nThese Quasi Newton AbstractManoptSolverState represent any quasi-Newton based method and can be used with any update rule for the direction.\n\nFields\n\np – the current iterate, a point on a manifold\nX – the current gradient\nsk – the current step\nyk the current gradient difference\ndirection_update - an AbstractQuasiNewtonDirectionUpdate rule.\nretraction_method – an AbstractRetractionMethod\nstop – a StoppingCriterion\n\nConstructor\n\nQuasiNewtonState(\n M::AbstractManifold,\n x;\n 
initial_vector=zero_vector(M,x),\n direction_update::D=QuasiNewtonLimitedMemoryDirectionUpdate(M, x, InverseBFGS(), 20;\n vector_transport_method=vector_transport_method,\n )\n stopping_criterion=StopAfterIteration(1000) | StopWhenGradientNormLess(1e-6),\n retraction_method::RM=default_retraction_method(M, typeof(p)),\n vector_transport_method::VTM=default_vector_transport_method(M, typeof(p)),\n stepsize=default_stepsize(M; QuasiNewtonState)\n)\n\nSee also\n\nquasi_Newton\n\n\n\n\n\n","category":"type"},{"location":"solvers/quasi_Newton/#Literature","page":"Quasi-Newton","title":"Literature","text":"","category":"section"},{"location":"solvers/quasi_Newton/","page":"Quasi-Newton","title":"Quasi-Newton","text":"
","category":"page"},{"location":"solvers/NelderMead/#NelderMeadSolver","page":"Nelder–Mead","title":"Nelder Mead Method","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":"CurrentModule = Manopt","category":"page"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":" NelderMead\n NelderMead!","category":"page"},{"location":"solvers/NelderMead/#Manopt.NelderMead","page":"Nelder–Mead","title":"Manopt.NelderMead","text":"NelderMead(M::AbstractManifold, f [, population::NelderMeadSimplex])\nNelderMead(M::AbstractManifold, mco::AbstractManifoldCostObjective [, population::NelderMeadSimplex])\n\nSolve a Nelder-Mead minimization problem for the cost function fcolon mathcal M on the manifold M. If the initial population p is not given, a random set of points is chosen.\n\nThis algorithm is adapted from the Euclidean Nelder-Mead method, see https://en.wikipedia.org/wiki/Nelder–Mead_method and http://www.optimization-online.org/DB_FILE/2007/08/1742.pdf.\n\nInput\n\nM – a manifold mathcal M\nf – a cost function to minimize\npopulation – (n+1 rand(M)s) an initial population of n+1 points, where n is the dimension of the manifold M.\n\nOptional\n\nstopping_criterion – (StopAfterIteration(2000) |StopWhenPopulationConcentrated()) a StoppingCriterion\nα – (1.) reflection parameter (α 0)\nγ – (2.) expansion parameter (γ)\nρ – (1/2) contraction parameter, 0 ρ frac12,\nσ – (1/2) shrink coefficient, 0 σ 1\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\n\nand the ones that are passed to decorate_state! 
for decorators.\n\nnote: Note\nThe manifold M used here has to either provide a mean(M, pts) or you have to load Manifolds.jl to use its statistics part.\n\nOutput\n\nthe obtained (approximate) minimizer p^*, see get_solver_return for details\n\n\n\n\n\n","category":"function"},{"location":"solvers/NelderMead/#Manopt.NelderMead!","page":"Nelder–Mead","title":"Manopt.NelderMead!","text":"NelderMead(M::AbstractManifold, f [, population::NelderMeadSimplex])\n\nSolve a Nelder Mead minimization problem for the cost function f on the manifold M. If the initial population population is not given, a random set of points is chosen. If it is given, the computation is done in place of population.\n\nFor more options see NelderMead.\n\n\n\n\n\n","category":"function"},{"location":"solvers/NelderMead/#State","page":"Nelder–Mead","title":"State","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":" NelderMeadState","category":"page"},{"location":"solvers/NelderMead/#Manopt.NelderMeadState","page":"Nelder–Mead","title":"Manopt.NelderMeadState","text":"NelderMeadState <: AbstractManoptSolverState\n\nDescribes all parameters and the state of a Nelder-Mead heuristic based optimization algorithm.\n\nFields\n\nThe naming of these parameters follows the Wikipedia article of the Euclidean case. The default is given in brackets, the required value range after the description\n\npopulation – an Array{point,1} of n+1 points x_i, i=1n+1, where n is the dimension of the manifold.\nstopping_criterion – (StopAfterIteration(2000) |StopWhenPopulationConcentrated()) a StoppingCriterion\nα – (1.) reflection parameter (α 0)\nγ – (2.) 
expansion parameter (γ 0)\nρ – (1/2) contraction parameter, 0 ρ frac12,\nσ – (1/2) shrink coefficient, 0 σ 1\np – (copy(population.pts[1])) - a field to collect the current best value (initialized to some point here)\nretraction_method – (default_retraction_method(M, typeof(p))) the retraction to use.\ninverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.\n\nConstructors\n\nNelderMead(M[, population::NelderMeadSimplex]; kwargs...)\n\nConstruct a Nelder-Mead Option with a default population (if not provided) of set of dimension(M)+1 random points stored in NelderMeadSimplex.\n\nIn the constructor all fields (besides the population) are keyword arguments.\n\n\n\n\n\n","category":"type"},{"location":"solvers/NelderMead/#Simplex","page":"Nelder–Mead","title":"Simplex","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":"NelderMeadSimplex","category":"page"},{"location":"solvers/NelderMead/#Manopt.NelderMeadSimplex","page":"Nelder–Mead","title":"Manopt.NelderMeadSimplex","text":"NelderMeadSimplex\n\nA simplex for the Nelder-Mead algorithm.\n\nConstructors\n\nNelderMeadSimplex(M::AbstractManifold)\n\nConstruct a simplex using n+1 random points from manifold M, where n is the manifold dimension of M.\n\nNelderMeadSimplex(\n M::AbstractManifold,\n p,\n B::AbstractBasis=DefaultOrthonormalBasis();\n a::Real=0.025,\n retraction_method::AbstractRetractionMethod=default_retraction_method(M, typeof(p)),\n)\n\nConstruct a simplex from a basis B with one point being p and other points constructed by moving by a in each principal direction defined by basis B of the tangent space at point p using retraction retraction_method. 
This works similarly to how the initial simplex is constructed in the Euclidean Nelder-Mead algorithm, just in the tangent space at point p.\n\n\n\n\n\n","category":"type"},{"location":"solvers/NelderMead/#Additional-Stopping-Criteria","page":"Nelder–Mead","title":"Additional Stopping Criteria","text":"","category":"section"},{"location":"solvers/NelderMead/","page":"Nelder–Mead","title":"Nelder–Mead","text":"StopWhenPopulationConcentrated","category":"page"},{"location":"solvers/NelderMead/#Manopt.StopWhenPopulationConcentrated","page":"Nelder–Mead","title":"Manopt.StopWhenPopulationConcentrated","text":"StopWhenPopulationConcentrated <: StoppingCriterion\n\nA stopping criterion for NelderMead to indicate to stop when both\n\nthe maximal distance of the first to the remaining the cost values and\nthe maximal distance of the first to the remaining the population points\n\ndrops below a certain tolerance tol_f and tol_p, respectively.\n\nConstructor\n\nStopWhenPopulationConcentrated(tol_f::Real=1e-8, tol_x::Real=1e-8)\n\n\n\n\n\n","category":"type"}]
}
diff --git a/dev/solvers/ChambollePock/index.html b/dev/solvers/ChambollePock/index.html
index a5dcf252a7..149a625524 100644
--- a/dev/solvers/ChambollePock/index.html
+++ b/dev/solvers/ChambollePock/index.html
@@ -1,16 +1,16 @@
The Riemannian Chambolle–Pock is a generalization of the Chambolle–Pock algorithm of Chambolle and Pock [CP11]. It is also known as the primal-dual hybrid gradient (PDHG) or primal-dual proximal splitting (PDPS) algorithm.
In order to minimize over $p∈\mathcal M$ the cost function consisting of
\[F(p) + G(Λ(p)),\]
where $F:\mathcal M → \overline{ℝ}$, $G:\mathcal N → \overline{ℝ}$, and $Λ:\mathcal M →\mathcal N$. If the manifolds $\mathcal M$ or $\mathcal N$ are not Hadamard, it has to be considered locally, i.e. on geodesically convex sets $\mathcal C \subset \mathcal M$ and $\mathcal D \subset\mathcal N$ such that $Λ(\mathcal C) \subset \mathcal D$.
The algorithm is available in four variants: exact versus linearized (see variant) as well as with primal versus dual relaxation (see relax). For more details, see Bergmann, Herzog, Silva Louzeiro, Tenbrinck and Vidal-Núñez [BHS+21]. In the following, we state the case of the exact, primal relaxed Riemannian Chambolle–Pock algorithm.
Given base points $m∈\mathcal C$, $n=Λ(m)∈\mathcal D$, initial primal and dual values $p^{(0)} ∈\mathcal C$, $ξ_n^{(0)} ∈T_n^*\mathcal N$, and primal and dual step sizes $\sigma_0$, $\tau_0$, relaxation $\theta_0$, as well as acceleration $\gamma$.
As an initialization, perform $\bar p^{(0)} \gets p^{(0)}$.
The algorithm performs the steps $k=1,…,$ until a StoppingCriterion is fulfilled.
Furthermore, you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction, and a vector transport, respectively.
Finally you can also update the base points $m$ and $n$ during the iterations. This introduces a few additional vector transports. The same holds for the case $Λ(m^{(k)})\neq n^{(k)}$ at some point. All these cases are covered in the algorithm.
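For intuition, the Euclidean specialization of this scheme is the classical primal-dual hybrid gradient iteration. The following NumPy sketch is an illustrative Euclidean analogue only, not Manopt.jl's API; the function name pdhg_tv1d, the test signal, and all parameter values are made up for the example. It applies the iteration to a small 1D total-variation denoising problem min_x ½‖x-b‖² + λ‖Dx‖₁:

```python
import numpy as np

def pdhg_tv1d(b, lam=0.2, tau=0.3, sigma=0.3, theta=1.0, iters=500):
    """Euclidean Chambolle-Pock (PDHG) for min_x 0.5*||x-b||^2 + lam*||Dx||_1."""
    n = b.size
    D = np.diff(np.eye(n), axis=0)            # forward-difference operator, shape (n-1, n)
    x, xbar, y = b.copy(), b.copy(), np.zeros(n - 1)
    for _ in range(iters):
        # dual step: the prox of G^* for G = lam*||.||_1 is projection onto [-lam, lam]
        y = np.clip(y + sigma * D @ xbar, -lam, lam)
        # primal step: the prox of F(x) = 0.5*||x - b||^2
        x_new = (x - tau * D.T @ y + tau * b) / (1.0 + tau)
        # primal relaxation with parameter theta (the theta_k of the text)
        xbar = x_new + theta * (x_new - x)
        x = x_new
    return x

b = np.array([0.0, 0.1, -0.1, 1.0, 0.9, 1.1])   # noisy two-level signal
x = pdhg_tv1d(b)
obj = lambda z: 0.5 * np.sum((z - b) ** 2) + 0.2 * np.sum(np.abs(np.diff(z)))
```

With the step sizes chosen so that στ‖D‖² < 1, the iteration converges; in the Riemannian setting, the linear updates above are replaced by retractions, inverse retractions, and vector transports as described in the text.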
ChambollePock(
M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_linear_operator;
forward_operator=missing,
linearized_forward_operator=missing,
evaluation=AllocatingEvaluation()
)
Perform the Riemannian Chambolle–Pock algorithm.
Given a cost function $\mathcal E:\mathcal M → ℝ$ of the form
\[\mathcal E(p) = F(p) + G( Λ(p) ),\]
where $F:\mathcal M → ℝ$, $G:\mathcal N → ℝ$, and $Λ:\mathcal M → \mathcal N$. The remaining input parameters are
p, X primal and dual start points $x∈\mathcal M$ and $ξ∈T_n\mathcal N$
m,n base points on $\mathcal M$ and $\mathcal N$, respectively.
adjoint_linearized_operator the adjoint $DΛ^*$ of the linearized operator $DΛ(m): T_{m}\mathcal M → T_{Λ(m)}\mathcal N$
prox_F, prox_G_Dual the proximal maps of $F$ and $G^\ast_n$
Note that, depending on the AbstractEvaluationType evaluation, the last three parameters as well as the forward operator Λ and the linearized_forward_operator can be given either as allocating functions (Manifolds, parameters) -> result or as mutating functions (Manifold, result, parameters) -> result to spare allocations.
By default, this performs the exact Riemannian Chambolle–Pock algorithm; see the optional parameter DΛ for its linearized variant.
dual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox
evaluation (AllocatingEvaluation()) specify whether the proximal maps and operators are allocating functions(Manifolds, parameters) -> resultor given as mutating functions(Manifold, result, parameters)-> result to spare allocations.
Λ (missing) the (forward) operator $Λ(⋅)$ (required for the :exact variant)
linearized_forward_operator (missing) its linearization $DΛ(⋅)[⋅]$ (required for the :linearized variant)
primal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox
relaxation – (1.)
relax – (:primal) whether to relax the primal or dual
variant - (:exact if Λ is missing, otherwise :linearized) variant to use. Note that this changes the arguments the forward_operator will be called with.
ChambollePock!(M, N, cost, x0, ξ0, m, n, prox_F, prox_G_dual, adjoint_linear_operator)
Perform the Riemannian Chambolle–Pock algorithm in place of x, ξ, and potentially m, n if they are not fixed. See ChambollePock for details and optional parameters.
Stores all options and variables of a linearized or exact Chambolle–Pock algorithm. The following list provides the order for the constructor, where the previous iterates are initialized automatically and values with a default may be left out.
m - base point on $\mathcal M$
n - base point on $\mathcal N$
p - an initial point $x^{(0)} ∈\mathcal M$ (and its previous iterate)
X - an initial tangent vector $X^{(0)}∈T^*\mathcal N$ (and its previous iterate)
pbar - the relaxed iterate used in the next dual update step (when using :primal relaxation)
Xbar - the relaxed iterate used in the next primal update step (when using :dual relaxation)
primal_stepsize – (1/sqrt(8)) proximal parameter of the primal prox
dual_stepsize – (1/sqrt(8)) proximal parameter of the dual prox
acceleration – (0.) acceleration factor due to Chambolle & Pock
relaxation – (1.) relaxation in the primal relaxation step (to compute pbar)
relax – (:primal) which variable to relax (:primal or :dual)
variant – (:exact) whether to perform an :exact or :linearized Chambolle–Pock
update_primal_base ((p,o,i) -> o.m) function to update the primal base
update_dual_base ((p,o,i) -> o.n) function to update the dual base
retraction_method – (default_retraction_method(M, typeof(p))) the retraction to use
inverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use on the manifold $\mathcal M$.
inverse_retraction_method_dual - (default_inverse_retraction_method(N, typeof(n))) an inverse retraction to use on manifold $\mathcal N$.
vector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use on the manifold $\mathcal M$.
vector_transport_method_dual - (default_vector_transport_method(N, typeof(n))) a vector transport to use on manifold $\mathcal N$.
where for the last two functions, an AbstractManoptProblem p, an AbstractManoptSolverState o, and the current iterate i are the arguments. If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing in the linearized case).
Compute the dual residual at current iterate $k$ given the necessary values $x_{k-1}, X_{k-1}$, and $n_{k-1}$ from the previous iterate. The formula is slightly different depending on the o.variant used:
For the :linearized it reads
\[\Bigl\lVert
\Bigr\rVert\]
where $V_{⋅\gets⋅}$ is the vector transport used in the ChambollePockState
Print the change of the dual variable, similar to DebugChange, see their constructors for detail, but with a different calculation of the change, since the dual variable lives in (possibly different) tangent spaces.
A Debug action to print the dual residual. The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.
Constructor
DebugDualResidual()
with the keywords
io (stdout) - stream to write the debug output to
format ("$prefix%s") format to print the dual residual
prefix ("Dual Residual: ") short form to just set the prefix
A Debug action to print the primal residual. The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.
Constructor
DebugPrimalResidual()
with the keywords
io (stdout) - stream to write the debug output to
format ("$prefix%s") format to print the primal residual
prefix ("Primal Residual: ") short form to just set the prefix
A Debug action to print the primal-dual residual. The constructor accepts a printing function and some (shared) storage, which should at least record :Iterate, :X and :n.
Constructor
DebugPrimalDualResidual()
with the keywords
io (stdout) - stream to write the debug output to
format ("$prefix%s") format to print the primal-dual residual
prefix ("Primal Residual: ") short form to just set the prefix
R. Bergmann, R. Herzog, M. Silva Louzeiro, D. Tenbrinck and J. Vidal-Núñez. Fenchel duality theory and a primal-dual algorithm on Riemannian manifolds. Foundations of Computational Mathematics 21, 1465–1504 (2021), arXiv: [1908.02022](http://arxiv.org/abs/1908.02022).
The (Parallel) Douglas–Rachford ((P)DR) Algorithm was generalized to Hadamard manifolds in [BPS16].
The aim is to minimize the sum
\[F(p) = f(p) + g(p)\]
on a manifold, where the two summands have proximal maps $\operatorname{prox}_{λ f}, \operatorname{prox}_{λ g}$ that are easy to evaluate (maybe in closed form, or not too costly to approximate). Further, define the reflection operator at the proximal map as
\[\operatorname{refl}_{λ f}(p) = \operatorname{retr}_{\operatorname{prox}_{λ f}(p)} \bigl( -\operatorname{retr}^{-1}_{\operatorname{prox}_{λ f}(p)} p \bigr).\]
Let $\alpha_k ∈ [0,1]$ with $\sum_{k ∈ \mathbb N} \alpha_k(1-\alpha_k) = \infty$ and $λ > 0$ (which might depend on iteration $k$ as well) be given.
Then, for initial data $t_0 = x_0 ∈ \mathcal H$, the (P)DR algorithm iterates $s_k = \operatorname{refl}_{λ f}\bigl(\operatorname{refl}_{λ g}(t_k)\bigr)$ and $t_{k+1} = g(α_k; t_k, s_k)$, where $g$ is a curve (by default the shortest geodesic) from $t_k$ to $s_k$, evaluated at $α_k$.
For the parallel version, the first proximal map is a vectorial version where in each component one prox is applied to the corresponding copy of $t_k$ and the second proximal map corresponds to the indicator function of the set, where all copies are equal (in $\mathcal H^n$, where $n$ is the number of copies), leading to the second prox being the Riemannian mean.
For $k>2$ proximal maps, the problem is reformulated using the parallel Douglas–Rachford algorithm: a vectorial proximal map on the power manifold $\mathcal M^k$ is introduced as the first proximal map, and the second proximal map is set to the mean (Riemannian center of mass). This hence also boils down to two proximal maps, though each evaluates proximal maps in parallel, i.e. component-wise in a vector.
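As a minimal, testable sketch of this iteration, consider the Euclidean special case, where the retraction is vector addition and the reflection reduces to $\operatorname{refl}_{λf}(p) = 2\operatorname{prox}_{λf}(p) - p$. The quadratic summands and their closed-form proxes below are illustrative choices; this is not Manopt.jl code.

```python
def prox_quadratic(a, lam):
    """Proximal map of f(x) = (x - a)^2 / 2: prox_{lf}(t) = (t + l*a)/(1 + l)."""
    return lambda t: (t + lam * a) / (1 + lam)

def reflect(prox):
    """Euclidean reflection at a proximal map: refl(t) = 2*prox(t) - t."""
    return lambda t: 2 * prox(t) - t

def douglas_rachford(prox_f, prox_g, t0, alpha=0.9, iterations=200):
    """Iterate s_k = refl_f(refl_g(t_k)) and t_{k+1} = (1-a) t_k + a s_k."""
    t = t0
    refl_f, refl_g = reflect(prox_f), reflect(prox_g)
    for _ in range(iterations):
        s = refl_f(refl_g(t))            # double reflection
        t = (1 - alpha) * t + alpha * s  # relaxed step t_{k+1} = g(alpha; t_k, s_k)
    # the candidate minimizer is the prox the inner reflection reflects at
    return prox_g(t)

# minimize (x - 0)^2/2 + (x - 4)^2/2, whose minimizer is x = 2
p = douglas_rachford(prox_quadratic(0.0, 1.0), prox_quadratic(4.0, 1.0), t0=1.0)
```

At a fixed point $t^*$ one checks $\operatorname{prox}_{λf}(2y - t^*) = \operatorname{prox}_{λg}(t^*) = y$, so the subgradient conditions of both summands cancel and $y$ minimizes $f+g$.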
F – a cost function consisting of a sum of cost functions
proxes_f – functions of the form (M, λ, p) -> ... performing a proximal map, where λ denotes the proximal parameter, for each of the summands of F. These can also be given in the InplaceEvaluation variant (M, q, λ, p) -> ... computing in place of q.
p – initial data $p ∈ \mathcal M$
Optional values
evaluation – (AllocatingEvaluation) specify whether the proximal maps work by allocation (default) form prox(M, λ, x) or InplaceEvaluation in place, i.e. is of the form prox!(M, y, λ, x).
λ – ((iter) -> 1.0) function to provide the value for the proximal parameter during the calls
α – ((iter) -> 0.9) relaxation of the step from old to new iterate, i.e. $t_{k+1} = g(α_k; t_k, s_k)$, where $s_k$ is the result of the double reflection involved in the DR algorithm
inverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) the inverse retraction to use within the reflection (ignored, if you set R directly) and the relaxation step
R – method employed in the iteration to perform the reflection of x at the prox p. This uses by default reflect or reflect! depending on reflection_evaluation and the retraction and inverse retraction specified by retraction_method and inverse_retraction_method, respectively.
reflection_evaluation – (AllocatingEvaluation()) whether R works in-place or allocating
retraction_method - (default_retraction_method(M, typeof(p))) the retraction to use within the reflection (ignored, if you set R directly) and the relaxation step
parallel – (false) indicate whether to run the parallel Douglas–Rachford, i.e. on a PowerManifold manifold with two proxes. This can be used to trigger parallel Douglas–Rachford if you enter with two proxes. Keep in mind that a parallel Douglas–Rachford implicitly works on a PowerManifold manifold, whose first component is then the result (assuming all components are equal after the second prox).
and the ones that are passed to decorate_state! for decorators.
Output
the obtained (approximate) minimizer $p^*$, see get_solver_return for details
Compute the Douglas-Rachford algorithm on the manifold $\mathcal M$, initial data $p \in \mathcal M$ and the (two) proximal maps proxes_f in place of p.
For $k>2$ proximal maps, the problem is reformulated using the parallel Douglas Rachford: A vectorial proximal map on the power manifold $\mathcal M^k$ is introduced as the first proximal map and the second proximal map of the is set to the mean (Riemannian Center of mass). This hence also boils down to two proximal maps, though each evaluates proximal maps in parallel, i.e. component wise in a vector.
Note
While creating the new starting point p' on the power manifold, a copy of p is created, so that the (for k>2 implicitly generated) parallel Douglas–Rachford does not work in-place for now.
f – a cost function consisting of a sum of cost functions
proxes_f – functions of the form (M, λ, p)->q or (M, q, λ, p)->q performing a proximal map, where λ denotes the proximal parameter, for each of the summands of f.
Store all options required for the DouglasRachford algorithm.
Fields
p - the current iterate (result). For the parallel Douglas-Rachford, this is not a value from the PowerManifold manifold but the mean.
s – the last result of the double reflection at the proxes relaxed by α.
λ – function to provide the value for the proximal parameter during the calls
α – relaxation of the step from old to new iterate, i.e. $x^{(k+1)} = g(α(k); x^{(k)}, t^{(k)})$, where $t^{(k)}$ is the result of the double reflection involved in the DR algorithm
inverse_retraction_method – an inverse retraction method
R – method employed in the iteration to perform the reflection of x at the prox p.
reflection_evaluation – whether R works inplace or allocating
parallel – indicate whether we are running a parallel Douglas-Rachford or not.
Constructor
DouglasRachfordState(M, p; kwargs...)
Generate the options for a Manifold M and an initial point p, where the following keyword arguments can be used
λ – ((iter)->1.0) function to provide the value for the proximal parameter during the calls
α – ((iter)->0.9) relaxation of the step from old to new iterate, i.e. $x^{(k+1)} = g(α(k); x^{(k)}, t^{(k)})$, where $t^{(k)}$ is the result of the double reflection involved in the DR algorithm
R – (reflect or reflect!) method employed in the iteration to perform the reflection of x at the prox p, which function is used depends on reflection_evaluation.
reflection_evaluation – (AllocatingEvaluation()) specify whether the reflection works inplace or allocating (default)
The Frank–Wolfe method computes the sub problem solution
\[q_k = \operatorname{arg\,min}_{q ∈ C} ⟨\operatorname{grad} f(p_k), \log_{p_k} q⟩\]
for every iterate $p_k$ together with a stepsize $s_k ≤ 1$, by default $s_k = \frac{2}{k+2}$. This algorithm is inspired by, but slightly more general than, Weber, Sra, Math. Prog., 2022.
The next iterate is then given by $p_{k+1} = γ_{p_k,q_k}(s_k)$, where by default $γ$ is the shortest geodesic between the two points but can also be changed to use a retraction and its inverse.
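A minimal Euclidean sketch of this scheme, assuming the constraint set C is a box so the linearized sub problem has a closed-form (corner) solution and the geodesic step reduces to a convex combination. This is illustrative code, not Manopt.jl API.

```python
def frank_wolfe_box(grad_f, lower, upper, p0, iterations=5000):
    """Frank-Wolfe over a box constraint set C = [lower, upper]^d.

    Sub problem: q_k = argmin_{q in C} <grad f(p_k), q - p_k>, solved
    coordinate-wise; step p_{k+1} = (1 - s_k) p_k + s_k q_k with s_k = 2/(k+2).
    """
    p = list(p0)
    for k in range(iterations):
        g = grad_f(p)
        # closed-form sub problem solution for a box: pick the corner
        # minimizing the linearization in each coordinate
        q = [lower[i] if g[i] > 0 else upper[i] for i in range(len(p))]
        s = 2.0 / (k + 2)  # default Frank-Wolfe stepsize, always <= 1
        p = [(1 - s) * p[i] + s * q[i] for i in range(len(p))]
    return p

# minimize f(p) = ||p - (2, 0.5)||^2 over [0,1]^2; the minimizer is (1, 0.5)
c = (2.0, 0.5)
grad = lambda p: [2 * (p[i] - c[i]) for i in range(2)]
p = frank_wolfe_box(grad, lower=(0.0, 0.0), upper=(1.0, 1.0), p0=(0.0, 0.0))
```

Because every iterate is a convex combination of feasible points, the iterates stay feasible without any projection, which is the main appeal of the method.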
Input
M – a manifold $\mathcal M$
f – a cost function $f: \mathcal M→ℝ$ to find a minimizer $p^*$ for
grad_f – the gradient $\operatorname{grad}f: \mathcal M → T\mathcal M$ of f
as a function (M, p) -> X or a function (M, X, p) -> X working in place of X.
p – an initial value $p ∈ \mathcal C$, note that it really has to be a feasible point
retraction_method – (default_retraction_method(M, typeof(p))) a type of retraction
stepsize - (DecreasingStepsize(; length=2.0, shift=2)) a Stepsize to use; it always has to be less than 1. The default is the one proposed by Frank & Wolfe: $s_k = \frac{2}{k+2}$.
sub_cost - (FrankWolfeCost(p, initial_vector)) the cost of the Frank–Wolfe sub problem, which by default uses the current iterate and (sub)gradient of the current iteration to define a default cost; this is used to define the default sub_objective. It is ignored if you set that or the sub_problem directly.
sub_grad - (FrankWolfeGradient(p, initial_vector)) the gradient of the Frank–Wolfe sub problem, which by default uses the current iterate and (sub)gradient of the current iteration to define a default gradient; this is used to define the default sub_objective. It is ignored if you set that or the sub_problem directly.
sub_objective - (ManifoldGradientObjective(sub_cost, sub_gradient)) the objective for the Frank–Wolfe sub problem; this is used to define the default sub_problem. It is ignored if you set the sub_problem manually.
sub_problem - (DefaultManoptProblem(M, sub_objective)) – the Frank-Wolfe sub problem to solve. This can be given in three forms
as a closed form solution, e.g. a function, evaluating with new allocations, that is a function (M, p, X) -> q that solves the sub problem on M given the current iterate p and (sub)gradient X.
as a closed form solution, e.g. a function, evaluating in place, that is a function (M, q, p, X) -> q working in place of q, with the parameters as in the last point
sub_state - (evaluation if sub_problem is a function, a decorated GradientDescentState otherwise) for a function, the evaluation is inherited from the Frank-Wolfe evaluation keyword.
sub_kwargs - ([]) – keyword arguments to decorate the sub_state default state in case the sub_problem is not a function
jacobian_f – the Jacobian of $f$. The Jacobian jacF is supposed to accept a keyword argument basis_domain which specifies the basis of the tangent space at a given point in which the Jacobian is to be calculated. By default it should be the DefaultOrthonormalBasis.
p – an initial value $p ∈ \mathcal M$
num_components – length of the vector returned by the cost function (d). By default its value is -1 which means that it will be determined automatically by calling F one additional time. Only possible when evaluation is AllocatingEvaluation, for mutating evaluation this must be explicitly specified.
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x).
retraction_method – (default_retraction_method(M, typeof(p))) a retraction(M,x,ξ) to use.
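The automatic determination of num_components described above can be sketched as follows; `infer_num_components` is a hypothetical helper written for illustration, not a Manopt.jl function.

```python
def infer_num_components(F, M, p, num_components=-1):
    """Return the residual dimension d.

    With the default num_components == -1, call F(M, p) one additional time
    and take the length of the returned residual vector. As noted above,
    this only works for an allocating F; an in-place (mutating) F must
    have num_components specified explicitly.
    """
    if num_components == -1:
        num_components = len(F(M, p))
    return num_components

# hypothetical residual function with d = 3 components at any point p
F = lambda M, p: [p[0] - 1.0, p[1] + 2.0, p[0] * p[1]]
d = infer_num_components(F, None, [0.0, 0.0])
```

Passing an explicit value skips the extra cost-function evaluation entirely.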
NelderMead(M::AbstractManifold, f [, population::NelderMeadSimplex])
NelderMead(M::AbstractManifold, mco::AbstractManifoldCostObjective [, population::NelderMeadSimplex])
Solve a Nelder-Mead minimization problem for the cost function $f\colon \mathcal M → ℝ$ on the manifold M. If the initial population population is not given, a random set of points is chosen.
NelderMead(M::AbstractManifold, f [, population::NelderMeadSimplex])
Solve a Nelder-Mead minimization problem for the cost function f on the manifold M. If the initial population population is not given, a random set of points is chosen. If it is given, the computation is done in place of population.
Describes all parameters and the state of a Nelder-Mead heuristic based optimization algorithm.
Fields
The naming of these parameters follows the Wikipedia article of the Euclidean case. The default is given in brackets, the required value range after the description
population – an Array{point,1} of $n+1$ points $x_i$, $i=1,…,n+1$, where $n$ is the dimension of the manifold.
Construct a simplex from a basis B with one point being p and other points constructed by moving by a in each principal direction defined by basis B of the tangent space at point p using retraction retraction_method. This works similarly to how the initial simplex is constructed in the Euclidean Nelder-Mead algorithm, just in the tangent space at point p.
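In the Euclidean special case, where the retraction is vector addition, the construction above reads as follows. This is a minimal illustrative sketch, not Manopt.jl code; `simplex_from_basis` is a hypothetical name.

```python
def simplex_from_basis(p, basis, a=1.0):
    """Build an (n+1)-point simplex: the point p itself, plus p moved by a
    along each of the n basis directions. This mimics the described
    NelderMeadSimplex construction in the Euclidean case, where the
    retraction reduces to vector addition."""
    points = [list(p)]
    for b in basis:
        points.append([p[i] + a * b[i] for i in range(len(p))])
    return points

# standard basis of R^2 at p = (0, 0) with displacement a = 0.5
simplex = simplex_from_basis((0.0, 0.0), [(1.0, 0.0), (0.0, 1.0)], a=0.5)
```

On a manifold the addition is replaced by the chosen retraction applied to the scaled basis vectors of the tangent space at p.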
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
We use two thresholds $η_2 ≥ η_1 > 0$ and set $p_{k+1} = \operatorname{retr}_{p_k}(X_k)$ if $ρ ≥ η_1$ and reject the candidate otherwise, i.e. set $p_{k+1} = p_k$.
We further update the regularization parameter using factors $0 < γ_1 < 1 < γ_2$
\[σ_{k+1} =
\begin{cases}
\max\{σ_{\min}, γ_1σ_k\} & \text{ if } ρ \geq η_2 &\text{ (the model was very successful)},\\
σ_k & \text{ if } ρ \in [η_1, η_2)&\text{ (the model was successful)},\\
γ_2σ_k & \text{ if } ρ < η_1&\text{ (the model was unsuccessful)}.
\end{cases}\]
σmin - (1e-10) minimal regularization value $σ_{\min}$
η1 - (0.1) lower model success threshold
η2 - (0.9) upper model success threshold
γ1 - (0.1) regularization reduction factor (for the success case)
γ2 - (2.0) regularization increment factor (for the non-success case)
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form grad_f(M, p) or InplaceEvaluation in place, i.e. is of the form grad_f!(M, X, p) and analogously for the hessian.
retraction_method – (default_retraction_method(M, typeof(p))) a retraction to use
initial_tangent_vector - (zero_vector(M, p)) initialize any tangent vector data,
maxIterLanczos - (200) a shortcut to set the stopping criterion in the sub_solver,
ρ_regularization - (1e3) a regularization to avoid dividing by zero for small values of cost and model
sub_state - (LanczosState(M, copy(M, p); maxIterLanczos=maxIterLanczos, σ=σ)) a state for the subproblem, or an AbstractEvaluationType if the problem is a function
sub_objective - a shortcut to modify the objective of the subproblem used within the sub_problem
sub_problem - DefaultManoptProblem(M, sub_objective) the problem (or a function) for the sub problem
By default the debug= keyword is set to DebugIfEntry(:ρ_denonimator, >(0); message="Denominator nonpositive", type=:error) to avoid that, due to rounding errors, the denominator in the computation of ρ becomes nonpositive.
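The update of σ shown in the cases above can be written directly as a small helper. This is a sketch of the rule using the parameter defaults listed above, not the Manopt.jl implementation.

```python
def update_sigma(sigma, rho, sigma_min=1e-10,
                 eta1=0.1, eta2=0.9, gamma1=0.1, gamma2=2.0):
    """Adapt the cubic regularization parameter from the success ratio rho."""
    if rho >= eta2:           # very successful: shrink, but not below sigma_min
        return max(sigma_min, gamma1 * sigma)
    if rho >= eta1:           # successful: keep sigma unchanged
        return sigma
    return gamma2 * sigma     # unsuccessful: increase sigma

sigma = 1.0
very_successful = update_sigma(sigma, 0.95)  # 0.1
successful = update_sigma(sigma, 0.5)        # 1.0
unsuccessful = update_sigma(sigma, 0.05)     # 2.0
```

The σ_min floor prevents the regularization from vanishing entirely after a long run of very successful steps.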
a default value is given in brackets if a parameter can be left out in initialization.
η1, η2 – (0.1, 0.9) bounds for evaluating the regularization parameter
γ1, γ2 – (0.1, 2.0) shrinking and expansion factors for regularization parameter σ
p – (rand(M)) the current iterate
X – (zero_vector(M,p)) the current gradient $\operatorname{grad}f(p)$
s - (zero_vector(M,p)) the tangent vector step resulting from minimizing the model problem in the tangent space $\mathcal T_{p} \mathcal M$
σ – the current cubic regularization parameter
σmin – (1e-7) lower bound for the cubic regularization parameter
ρ_regularization – (1e3) regularization parameter for computing ρ. As we approach convergence the ρ may be difficult to compute with numerator and denominator approaching zero. Regularizing the ratio lets ρ go to 1 near convergence.
evaluation - (AllocatingEvaluation()) if you provide a function as sub_problem, specify whether it works by allocation or in place
retraction_method – (default_retraction_method(M)) the retraction to use
where mho is the hessian objective of f to solve. Then use this for the sub_problem keyword and use your favourite gradient based solver for the sub_state keyword, for example a ConjugateGradientDescentState
When an inner iteration has used up all Lanczos vectors, then this stopping criterion is a fallback / security stopping criterion in order to not access a non-existing field in the array allocated for vectors.
A stopping criterion related to the Riemannian adaptive regularization with cubics (ARC) solver indicating that the model function at the current (outer) iterate, i.e.
M – the product manifold $\mathcal M = \mathcal M_1 × \mathcal M_2 × ⋯ ×\mathcal M_n$
f – the objective function (cost) defined on M.
grad_f – a gradient, that can be of two cases
is a single function returning an ArrayPartition or
is a vector of functions, each returning a component of the whole gradient
p – an initial value $p_0 ∈ \mathcal M$
Optional
evaluation – (AllocatingEvaluation) specify whether the gradient(s) works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x) (elementwise).
evaluation_order – (:Linear) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default :Linear one.
inner_iterations– (5) how many gradient steps to take in a component before alternating to the next
order - ([1:n]) the initial permutation, where n is the number of gradients in gradF.
retraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.
Output
usually the obtained (approximate) minimizer, see get_solver_return for details
Note
This Problem requires the ProductManifold from Manifolds.jl, so Manifolds.jl needs to be loaded.
Note
The input of each of the (component) gradients is still the whole vector X, just that all components other than the i-th are assumed to be fixed and only the i-th component's gradient is computed / returned.
inner_iterations – (5) how many gradient steps to take in a component before alternating to the next
order_type – (:Linear) whether to use a randomly permuted sequence (:FixedRandom), a per cycle newly permuted sequence (:Random) or the default :Linear evaluation order
order – the current permutation
retraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.
X – (zero_vector(M,p)) the current gradient tangent vector
k, i – internal counters for the outer and inner iterations, respectively.
Constructors
AlternatingGradientDescentState(M, p; kwargs...)
Generate the options for point p, where inner_iterations, order_type, order, retraction_method, stopping_criterion, and stepsize are keyword arguments
Additionally, the options share a DirectionUpdateRule, which chooses the current component, so they can be decorated further; the innermost one should always be the following one, though.
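A Euclidean sketch of the alternating scheme for two components in :Linear order, using the inner_iterations keyword as described above. This is illustrative code, not the Manopt.jl implementation; note how each component gradient still receives the whole point p.

```python
def alternating_gradient_descent(grads, p0, stepsize=0.1,
                                 inner_iterations=5, cycles=100):
    """Cycle through the components in :Linear order; in each, take
    inner_iterations gradient steps on that component only, keeping all
    other components of p fixed. Each grads[i] still sees all of p."""
    p = [list(comp) for comp in p0]
    for _ in range(cycles):
        for i, grad_i in enumerate(grads):       # order = 1:n
            for _ in range(inner_iterations):
                g = grad_i(p)                    # gradient w.r.t. component i
                p[i] = [p[i][j] - stepsize * g[j] for j in range(len(p[i]))]
    return p

# f(x, y) = (x - 1)^2 + (y + 2)^2 on R x R, with minimizer ((1,), (-2,))
grads = [lambda p: [2 * (p[0][0] - 1.0)],
         lambda p: [2 * (p[1][0] + 2.0)]]
p = alternating_gradient_descent(grads, p0=[[0.0], [0.0]])
```

On a product manifold, the coordinate-wise subtraction is replaced by a retraction on the i-th factor.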
M – the product manifold $\mathcal M = \mathcal M_1 × \mathcal M_2 × ⋯ ×\mathcal M_n$
f – the objective function (cost) defined on M.
grad_f – a gradient, that can be of two cases
is a single function returning an ArrayPartition or
is a vector functions each returning a component part of the whole gradient
p – an initial value $p_0 ∈ \mathcal M$
Optional
evaluation – (AllocatingEvaluation) specify whether the gradient(s) works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x) (elementwise).
evaluation_order – (:Linear) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default :Linear one.
inner_iterations– (5) how many gradient steps to take in a component before alternating to the next
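The calling convention above can be sketched as follows. This is a hedged, illustrative example: the manifold, the data points `a` and `b`, and the closed-form component gradients are assumptions for illustration, and the exact component-gradient signature should be checked against the Manopt.jl API.

```julia
using Manopt, Manifolds, RecursiveArrayTools

M = Sphere(2) × Euclidean(2)                  # a product manifold
a, b = [0.0, 0.0, 1.0], [1.0, 2.0]            # hypothetical data
f(M, p) = distance(M[1], p.x[1], a)^2 + distance(M[2], p.x[2], b)^2
# one gradient function per component; each still receives the whole point p
grad_f = [
    (M, p) -> -2 * log(M[1], p.x[1], a),      # gradient w.r.t. the first component
    (M, p) -> -2 * log(M[2], p.x[2], b),      # gradient w.r.t. the second component
]
p0 = ArrayPartition([1.0, 0.0, 0.0], [0.0, 0.0])
q = alternating_gradient_descent(M, f, grad_f, p0; inner_iterations=3)
```

Note that Manifolds.jl needs to be loaded for the ProductManifold, as stated in the note above.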
perform the augmented Lagrangian method (ALM), see Liu, Boumal, 2019, Appl. Math. Optim. The aim of the ALM is to find the solution of the constrained optimisation task
where M is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^m$ and $\{h_j\}_{j=1}^p$ are twice continuously differentiable functions from M to ℝ. In every step $k$ of the algorithm, the AugmentedLagrangianCost $\mathcal{L}_{ρ^{(k-1)}}(p, μ^{(k-1)}, λ^{(k-1)})$ is minimized on $\mathcal{M}$, where $μ^{(k-1)} \in \mathbb R^m$ and $λ^{(k-1)} \in \mathbb R^p$ are the current iterates of the Lagrange multipliers and $ρ^{(k-1)}$ is the current penalty parameter.
grad_g – (nothing) the gradient of the inequality constraints
grad_h – (nothing) the gradient of the equality constraints
Note that one of the pairs (g, grad_g) or (h, grad_h) has to be provided. Otherwise the problem is not constrained and you can also call e.g. quasi_Newton
Optional
ϵ – (1e-3) the accuracy tolerance
ϵ_min – (1e-6) the lower bound for the accuracy tolerance
ϵ_exponent – (1/100) exponent of the ϵ update factor; also 1/number of iterations until maximal accuracy is needed to end algorithm naturally
θ_ϵ – ((ϵ_min / ϵ)^(ϵ_exponent)) the scaling factor of the exactness
μ – (ones(size(g(M,x),1))) the Lagrange multiplier with respect to the inequality constraints
μ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the inequality constraints
λ – (ones(size(h(M,x),1))) the Lagrange multiplier with respect to the equality constraints
λ_max – (20.0) an upper bound for the Lagrange multiplier belonging to the equality constraints
λ_min – (- λ_max) a lower bound for the Lagrange multiplier belonging to the equality constraints
τ – (0.8) factor for the improvement of the evaluation of the penalty parameter
ρ – (1.0) the penalty parameter
θ_ρ – (0.3) the scaling factor of the penalty parameter
sub_cost – (AugmentedLagrangianCost(problem, ρ, μ, λ)) use augmented Lagrangian, especially with the same numbers ρ,μ as in the options for the sub problem
sub_grad – (AugmentedLagrangianGrad(problem, ρ, μ, λ)) use augmented Lagrangian gradient, especially with the same numbers ρ,μ as in the options for the sub problem
sub_kwargs – keyword arguments to decorate the sub options, e.g. with debug.
construct an augmented Lagrangian method options with the fields and defaults as above, where the manifold M and the ConstrainedManifoldObjective co are used for defaults in the keyword arguments.
Stores the parameters $ρ ∈ \mathbb R$, $μ ∈ \mathbb R^m$, $λ ∈ \mathbb R^p$ of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.
This struct is also a functor (M,p) -> v that can be used as a cost function within a solver; based on the internal ConstrainedManifoldObjective we can compute
\[\mathcal L_ρ(p, μ, λ) = f(p) + \frac{ρ}{2}\biggl(\sum_{j=1}^p \Bigl(h_j(p) + \frac{λ_j}{ρ}\Bigr)^2 + \sum_{i=1}^m \max\Bigl(0, \frac{μ_i}{ρ} + g_i(p)\Bigr)^2\biggr)\]
The penalty parameter is increased whenever the constraint violation does not improve sufficiently,
\[ρ^{(k)} = \begin{cases} ρ^{(k-1)}/θ_ρ, & \text{if } σ^{(k)} > τ σ^{(k-1)},\\ ρ^{(k-1)}, & \text{else,}\end{cases}\]
where $σ^{(k)}$ denotes the maximal constraint violation at iterate $k$ and $θ_ρ \in (0,1)$ is a constant scaling factor.
Stores the parameters $ρ ∈ \mathbb R$, $μ ∈ \mathbb R^m$, $λ ∈ \mathbb R^p$ of the augmented Lagrangian associated to the ConstrainedManifoldObjective co.
This struct is also a functor in both formats
(M, p) -> X to compute the gradient in allocating fashion.
(M, X, p) to compute the gradient in in-place fashion.
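A minimal sketch of an ALM call following the keywords above. The manifold, the cost, and the single inequality constraint are illustrative assumptions, not prescribed by the documentation:

```julia
using Manopt, Manifolds

M = Sphere(2)
f(M, p) = p[3]                                  # minimize the third coordinate
grad_f(M, p) = project(M, p, [0.0, 0.0, 1.0])   # Riemannian gradient via projection
g(M, p) = [-p[1]]                               # inequality constraint g(p) ≤ 0, i.e. p₁ ≥ 0
grad_g(M, p) = [project(M, p, [-1.0, 0.0, 0.0])]
p0 = [0.0, 1.0, 0.0]
q = augmented_Lagrangian_method(M, f, grad_f, p0; g=g, grad_g=grad_g, ϵ=1e-4)
```

Since only the pair (g, grad_g) is given here, the problem is inequality-constrained only; providing (h, grad_h) as well would add equality constraints.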
where $\operatorname{retr}$ denotes a retraction on the Manifold M and one can employ different rules to update the descent direction $δ_k$ based on the last direction $δ_{k-1}$ and both gradients $\operatorname{grad}f(x_k)$, $\operatorname{grad}f(x_{k-1})$. The Stepsize $s_k$ may be determined by a Linesearch.
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. of the form gradF!(M, X, x).
retraction_method - (default_retraction_method(M, typeof(p))) a retraction method to use.
stopping_criterion – (StopWhenAny(StopAfterIteration(200), StopWhenGradientNormLess(10.0^-8))) a functor indicating when to stop.
vector_transport_method – (default_vector_transport_method(M, typeof(p))) vector transport method to transport the old descent direction when computing the new descent direction.
retraction_method – (default_retraction_method(M, typeof(p))) a type of retraction
vector_transport_method – (default_vector_transport_method(M, typeof(p))) a type of vector transport
Constructor
ConjugateGradientState(M, p)
where the last five fields above can be set by their names as keywords; X can be set to a tangent vector type using the keyword initial_gradient, which defaults to zero_vector(M,p), and δ is initialized to a copy of this vector.
An update rule might require a restart, that is, using the pure gradient as descent direction, if the last two gradients are nearly orthogonal, cf. Hager, Zhang, Pacific J Optim, 2006, page 12 (in the pdf; page 46 in journal page numbers). This method is named after E. Beale from his proceedings paper in 1972 [Bea72]. It acts as a decorator to any existing DirectionUpdateRule direction_update.
It obtains from the ConjugateGradientDescentState cgs the last $p_k, X_k$ and the current $p_{k+1}, X_{k+1}$ iterate and gradient, respectively.
Then a restart is performed, i.e. $β_k = 0$ is returned, if
where $P_{a\gets b}(⋅)$ denotes a vector transport from the tangent space at $a$ to $b$, and $ξ$ is the threshold. The default threshold is chosen as 0.2 as recommended in Powell, Math. Prog., 1977
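A hedged sketch of a conjugate gradient descent call with an explicit coefficient rule; the manifold, the target point `a`, and the closed-form gradient are illustrative assumptions:

```julia
using Manopt, Manifolds

M = Sphere(2)
a = [0.0, 0.0, 1.0]                        # hypothetical target point
f(M, p) = distance(M, p, a)^2
grad_f(M, p) = -2 * log(M, p, a)           # Riemannian gradient of the squared distance
p0 = [1.0, 0.0, 0.0]
q = conjugate_gradient_descent(M, f, grad_f, p0;
    coefficient=FletcherReevesCoefficient())   # any DirectionUpdateRule can be used here
```

Swapping the `coefficient=` keyword exchanges the update rule for $β_k$ without changing the rest of the call.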
Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds include the last iterates $p_k,X_k$, the current iterates $p_{k+1},X_{k+1}$ of the iterate and the gradient, respectively, and the last update direction $\delta=\delta_k$, based on Fletcher, 1987 adapted to manifolds:
Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds include the last iterates $p_k,X_k$, the current iterates $p_{k+1},X_{k+1}$ of the iterate and the gradient, respectively, and the last update direction $\delta=\delta_k$, based on Dai, Yuan, SIAM J. Optim., 1999 adapted to manifolds:
Let $\nu_k = X_{k+1} - P_{p_{k+1}\gets p_k}X_k$, where $P_{a\gets b}(⋅)$ denotes a vector transport from the tangent space at $a$ to $b$.
Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds include the last iterates $p_k,X_k$, the current iterates $p_{k+1},X_{k+1}$ of the iterate and the gradient, respectively, and the last update direction $\delta=\delta_k$, based on Fletcher, Reeves, Comput. J., 1964 adapted to manifolds:
Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds include the last iterates $p_k,X_k$, the current iterates $p_{k+1},X_{k+1}$ of the iterate and the gradient, respectively, and the last update direction $\delta=\delta_k$, based on Hager, Zhang, SIAM J Optim, 2005, adapted to manifolds: let $\nu_k = X_{k+1} - P_{p_{k+1}\gets p_k}X_k$, where $P_{a\gets b}(⋅)$ denotes a vector transport from the tangent space at $a$ to $b$.
\[β_k = \Bigl\langle \nu_k - \frac{2\lVert \nu_k\rVert^2}{\langle P_{p_{k+1}\gets p_k}\delta_k, \nu_k\rangle} P_{p_{k+1}\gets p_k}\delta_k,\ \frac{X_{k+1}}{\langle P_{p_{k+1}\gets p_k}\delta_k, \nu_k\rangle} \Bigr\rangle\]
Construct the Dai Yuan coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
function HagerZhangCoefficient(t::AbstractVectorTransportMethod)
function HagerZhangCoefficient(M::AbstractManifold = DefaultManifold(2))
Construct the Hager Zhang coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds include the last iterates $p_k,X_k$, the current iterates $p_{k+1},X_{k+1}$ of the iterate and the gradient, respectively, and the last update direction $\delta=\delta_k$, based on Hestenes, Stiefel, J. Research Nat. Bur. Standards, 1952 adapted to manifolds as follows:
Let $\nu_k = X_{k+1} - P_{p_{k+1}\gets p_k}X_k$. Then the update reads
where $P_{a\gets b}(⋅)$ denotes a vector transport from the tangent space at $a$ to $b$.
Constructor
function HestenesStiefelCoefficient(transport_method::AbstractVectorTransportMethod)
function HestenesStiefelCoefficient(M::AbstractManifold = DefaultManifold(2))
Construct the Hestenes Stiefel coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
Computes an update coefficient for the conjugate gradient method, where the ConjugateGradientDescentState cgds include the last iterates $p_k,X_k$, the current iterates $p_{k+1},X_{k+1}$ of the iterate and the gradient, respectively, and the last update direction $\delta=\delta_k$, based on Liu, Storey, J. Optim. Theory Appl., 1991 adapted to manifolds:
Let $\nu_k = X_{k+1} - P_{p_{k+1}\gets p_k}X_k$, where $P_{a\gets b}(⋅)$ denotes a vector transport from the tangent space at $a$ to $b$.
function LiuStoreyCoefficient(t::AbstractVectorTransportMethod)
function LiuStoreyCoefficient(M::AbstractManifold = DefaultManifold(2))
Construct the Liu Storey coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
function PolakRibiereCoefficient(
    M::AbstractManifold=DefaultManifold(2);
    t::AbstractVectorTransportMethod=default_vector_transport_method(M)
)
Construct the PolakRibiere coefficient update rule, where the parallel transport is the default vector transport and a new storage is created by default.
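The Beale restart described earlier can decorate any of these coefficient rules. A hedged sketch (the problem data is hypothetical):

```julia
using Manopt, Manifolds

M = Sphere(2)
a = [0.0, 0.0, 1.0]                        # hypothetical target point
f(M, p) = distance(M, p, a)^2
grad_f(M, p) = -2 * log(M, p, a)
# wrap a Polak-Ribière rule in a Beale restart (default threshold ξ = 0.2)
dur = ConjugateGradientBealeRestart(PolakRibiereCoefficient(M))
q = conjugate_gradient_descent(M, f, grad_f, [1.0, 0.0, 0.0]; coefficient=dur)
```

Whenever the transported previous gradient and the current gradient are insufficiently orthogonal, the decorated rule returns $β_k = 0$, i.e. a pure gradient step.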
The Cyclic Proximal Point (CPP) algorithm aims to minimize
\[F(x) = \sum_{i=1}^c f_i(x)\]
assuming that the proximal maps $\operatorname{prox}_{λ f_i}(x)$ are given in closed form or can be computed efficiently (at least approximately).
The algorithm then cycles through these proximal maps, where the type of cycle might differ and the proximal parameter $λ_k$ changes after each cycle $k$.
evaluation – (AllocatingEvaluation) specify whether the proximal maps work by allocation (default) in the form prox(M, λ, x) or InplaceEvaluation in place, i.e. of the form prox!(M, y, λ, x).
evaluation_order – (:Linear) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Random) or the default linear one.
λ – ( iter -> 1/iter ) a function returning the (square summable but not summable) sequence of $λ_i$
λ – (i -> 1/i) a function for the values of $λ_k$ per iteration cycle $i$
order_type – (:LinearOrder) – whether to use a randomly permuted sequence (:FixedRandomOrder), a per cycle permuted sequence (:RandomOrder) or the default linear one.
Constructor
CyclicProximalPointState(M, p)
Generate the options with the following keyword arguments
stopping_criterion (StopAfterIteration(2000)) – a StoppingCriterion.
λ ( i -> 1.0 / i) – a function to compute the $λ_k, k ∈ \mathbb N$,
evaluation_order – (:LinearOrder) – a Symbol indicating the order the proxes are applied.
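A hedged sketch of a cyclic proximal point run. The data points are made up, and the closed-form proximal map of the squared distance (a step along the geodesic towards the data point with parameter $2λ/(1+2λ)$) is an assumption stated here, not taken from the text above:

```julia
using Manopt, Manifolds

M = Sphere(2)
data = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]   # hypothetical points
f(M, p) = sum(distance(M, p, d)^2 for d in data)
# prox of λ d(·, dᵢ)² moves along the geodesic towards dᵢ
proxes = [(M, λ, p) -> shortest_geodesic(M, p, d, 2λ / (1 + 2λ)) for d in data]
q = cyclic_proximal_point(M, f, proxes, [1.0, 0.0, 0.0]; λ=i -> 1.0 / i)
```

The λ keyword matches the (square summable but not summable) sequence requirement above.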
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form grad_f(M, p) or InplaceEvaluation form grad_f!(M, X, x)
gradient – (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing the stopping_criterion
grad_g – (nothing) specify the gradient of g. If specified, a subsolver is automatically set up.
initial_vector - (zero_vector(M, p)) initialise the inner tangent vector to store the subgradient result.
g – (nothing) specify the function g. If specified, a subsolver is automatically set up.
While there are several parameters for a sub solver, the easiest is to provide the function grad_g, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like
sub_cost – (LinearizedDCCost(g, p, initial_vector)) a cost to be used within the default sub_problem. Use this if you have a more efficient version than the default that is built using g from above.
sub_grad – (LinearizedDCGrad(grad_g, p, initial_vector; evaluation=evaluation)) gradient to be used within the default sub_problem. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
sub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver (see sub_state) needs
sub_kwargs - ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.
sub_objective - (a gradient or hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)
sub_problem – (DefaultManoptProblem(M, sub_objective)) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
sub_state – (TrustRegionsState by default, requires sub_hessian to be provided; decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem. If the sub_problem is a function (i.e. a closed form solution), this is set to evaluation and can be changed to the evaluation type of the closed form solution accordingly.
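A hedged sketch of the "easiest example call" mentioned above, with a hypothetical DC splitting $f = g - h$ on the sphere (the data points and closed-form gradients are illustrative assumptions):

```julia
using Manopt, Manifolds

M = Sphere(2)
a, b = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]          # hypothetical data
g(M, p) = distance(M, p, a)^2                     # first convex part
grad_g(M, p) = -2 * log(M, p, a)
grad_h(M, p) = -log(M, p, b)                      # gradient of h = ½ d(·, b)²
f(M, p) = g(M, p) - 0.5 * distance(M, p, b)^2     # f = g - h
q = difference_of_convex_algorithm(M, f, g, grad_h, [1.0, 0.0, 0.0]; grad_g=grad_g)
```

Providing grad_g triggers the automatic set-up of the default subsolver described above.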
set $p^{(k+1)} = \operatorname{retr}_{p^{(k)}}(s_kX^{(k)})$.
until the stopping_criterion is fulfilled. See Almeida, da Cruz Neto, Oliveira, Souza, Comput. Optim. Appl., 2020 for more details on the modified variant, where we slightly changed steps 4-6, since here we get the classical proximal point method for DC functions for $s_k = 1$ and we can employ linesearches similar to other solvers.
Optional parameters
λ – ( i -> 1/2 ) a function returning the sequence of prox parameters λi
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. of the form gradF!(M, X, x).
cost – (nothing) provide the cost f, e.g. for debug reasons or as the cost to be used within the default sub_problem. Use this if you have a more efficient version than using g from above.
gradient – (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing the stopping_criterion
prox_g - (nothing) specify a proximal map for the sub problem or both of the following
g – (nothing) specify the function g.
grad_g – (nothing) specify the gradient of g. If both g and grad_g are specified, a subsolver is automatically set up.
inverse_retraction_method - (default_inverse_retraction_method(M)) an inverse retraction method to use (see step 4).
retraction_method – (default_retraction_method(M)) a retraction to use (see step 2)
stepsize – (ConstantStepsize(M)) specify a Stepsize functor to run the modified algorithm (experimental).
While there are several parameters for a sub solver, the easiest is to provide the functions g and grad_g, such that a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like
sub_cost – (ProximalDCCost(g, copy(M, p), λ(1))) cost to be used within the default sub_problem that is initialized as soon as g is provided.
sub_grad – (ProximalDCGrad(grad_g, copy(M, p), λ(1); evaluation=evaluation)) gradient to be used within the default sub_problem, that is initialized as soon as grad_g is provided. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
sub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver, see sub_state needs
sub_kwargs – ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.
sub_objective – (a gradient or hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)
sub_problem – (DefaultManoptProblem(M, sub_objective)) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
sub_state – (TrustRegionsState, requires the sub_hessian to be provided; decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem
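A hedged sketch of the corresponding proximal point variant, with the same hypothetical DC splitting as assumption (data points and closed-form gradients are illustrative):

```julia
using Manopt, Manifolds

M = Sphere(2)
a, b = [0.0, 0.0, 1.0], [0.0, 1.0, 0.0]          # hypothetical data
g(M, p) = distance(M, p, a)^2
grad_g(M, p) = -2 * log(M, p, a)
grad_h(M, p) = -log(M, p, b)                      # gradient of h = ½ d(·, b)²
# providing g and grad_g triggers the automatic set-up of the sub solver
q = difference_of_convex_proximal_point(M, grad_h, [1.0, 0.0, 0.0]; g=g, grad_g=grad_g)
```

Here only grad_h is mandatory; the cost f can additionally be passed via the cost= keyword for debugging.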
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form grad_f(M, p) or InplaceEvaluation form grad_f!(M, X, x)
gradient – (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing stopping_criterion=
grad_g – (nothing) specify the gradient of g. If specified, a subsolver is automatically set up.
initial_vector - (zero_vector(M, p)) initialise the inner tangent vector to store the subgradient result.
g - (nothing) specify the function g If specified, a subsolver is automatically set up.
While there are several parameters for a sub solver, the easiest is to provide the function grad_g=, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like
sub_cost - (LinearizedDCCost(g, p, initial_vector)) a cost to be used within the default sub_problem Use this if you have a more efficient version than the default that is built using g from above.
sub_grad - (LinearizedDCGrad(grad_g, p, initial_vector; evaluation=evaluation) gradient to be used within the default sub_problem. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
sub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver, see sub_state needs
sub_kwargs - ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.
sub_objective - (a gradient or hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)
sub_problem - (DefaultManoptProblem(M, sub_objective) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
sub_state - (TrustRegionsState by default, requires sub_hessian to be provided; decorated with sub_kwargs). Choose the solver by specifying a solver state to solve the sub_problem if the sub_problem if a function (i.e. a closed form solution), this is set to evaluation and can be changed to the evaluation type of the closed form solution accordingly.
set $p^{(k+1)} = \operatorname{retr}_{p^{(k)}}(s_kX^{(k)})$.
until the stopping_criterion is fulfilled. See Almeida, da Cruz Neto, Oliveira, Souza, Comput. Optim. Appl., 2020 for more details on the modified variant, where we slightly changed step 4-6, sine here we get the classical proximal point method for DC functions for $s_k = 1$ and we can employ linesearches similar to other solvers.
Optional parameters
λ – ( i -> 1/2 ) a function returning the sequence of prox parameters λi
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) form gradF(M, x) or InplaceEvaluation in place, i.e. is of the form gradF!(M, X, x).
cost - (nothing) provide the cost f, e.g. for debug reasonscost to be used within the default sub_problem. Use this if you have a more efficient version than using g from above.
gradient – (nothing) specify $\operatorname{grad} f$, for debug / analysis or enhancing the stopping_criterion
prox_g - (nothing) specify a proximal map for the sub problem or both of the following
g – (nothing) specify the function g.
grad_g – (nothing) specify the gradient of g. If both gand grad_g are specified, a subsolver is automatically set up.
inverse_retraction_method - (default_inverse_retraction_method(M)) an inverse retraction method to use (see step 4).
retraction_method – (default_retraction_method(M)) a retraction to use (see step 2)
stepsize – (ConstantStepsize(M)) specify a Stepsize to run the modified algorithm (experimental.) functor.
While there are several parameters for a sub solver, the easiest is to provide the function g and grad_g, such that together with the mandatory function g a default cost and gradient can be generated and passed to a default subsolver. Hence the easiest example call looks like
sub_cost – (ProximalDCCost(g, copy(M, p), λ(1))) cost to be used within the default sub_problem that is initialized as soon as g is provided.
sub_grad – (ProximalDCGrad(grad_g, copy(M, p), λ(1); evaluation=evaluation) gradient to be used within the default sub_problem, that is initialized as soon as grad_g is provided. This is generated by default when grad_g is provided. You can specify your own by overwriting this keyword.
sub_hess – (a finite difference approximation by default) specify a Hessian of the subproblem, which the default solver, see sub_state needs
sub_kwargs – ([]) pass keyword arguments to the sub_state, in form of a Dict(:kwname=>value), unless you set the sub_state directly.
sub_objective – (a gradient or hessian objective based on the last 3 keywords) provide the objective used within sub_problem (if that is not specified by the user)
sub_problem – (DefaultManoptProblem(M, sub_objective) specify a manopt problem for the sub-solver runs. You can also provide a function for a closed form solution. Then evaluation= is taken into account for the form of this function.
sub_state – (TrustRegionsState – requires the sub_hess to be provided; decorated with sub_kwargs) choose the solver by specifying a solver state to solve the sub_problem
A struct to store the current state of the [difference_of_convex_algorithm](@ref). It comes in two forms, depending on the realisation of the subproblem.
Fields
p – the current iterate, i.e. a point on the manifold
X – the current subgradient, i.e. a tangent vector to p.
sub_problem – problem for the subsolver
sub_state – state of the subproblem
stop – a functor inheriting from StoppingCriterion indicating when to stop.
besides a problem and options, one can also provide a function and an AbstractEvaluationType, respectively, to indicate a closed form solution for the sub task.
Generate the state either using a solver from Manopt, given by an AbstractManoptProblem sub_problem and an AbstractManoptSolverState sub_state, or a closed form solution sub_solver for the sub-problem, where by default its AbstractEvaluationType evaluation is in-place, i.e. the function is of the form (M, p, X) -> q or (M, q, p, X) -> q, such that the current iterate p and the subgradient X of h can be passed to that function and the result is q.
Further keyword Arguments
initial_vector – (zero_vector(M, p)) how to initialize the inner gradient tangent vector
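As a minimal illustration of the closed form solution signature described above, consider the Euclidean case with the hypothetical choice $g(q)=\tfrac12\lVert q\rVert^2$: the linearized sub problem $\operatorname*{argmin}_q g(q) - ⟨X, q - p⟩$ then has the closed form solution $q = X$. This is a plain-Julia sketch, not Manopt.jl code:

```julia
# In-place closed form subsolver with the signature (M, q, p, X) -> q described
# above. For g(q) = ‖q‖²/2 the gradient condition of the linearized sub problem
# is q - X = 0, so the solution is simply q = X.
function dc_sub_solution!(M, q, p, X)
    q .= X
    return q
end
```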
where both $g$ and $h$ are convex, lsc. and proper. Furthermore we assume that the subdifferential $∂h$ of $h$ is given.
Fields
cost – an implementation of $f(p) = g(p)-h(p)$ as a function f(M,p).
∂h!! – a deterministic version of $∂h: \mathcal M → T\mathcal M$, i.e. calling ∂h(M, p) returns a subgradient of $h$ at p and if there is more than one, it returns a deterministic choice.
Note that the subdifferential might be given in two possible signatures
for a point p_k and a tangent vector X_k at p_k (e.g. outer iterates) that are stored within this functor as well.
Fields
g a function
pk a point on a manifold
Xk a tangent vector at pk
Both interims values can be set using set_manopt_parameter!(::LinearizedDCCost, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCCost, ::Val{:X}, X), respectively.
A functor (M, X, p) → X to represent the gradient of the inner problem of a ManifoldDifferenceOfConvexObjective, i.e. for a cost function of the form
\[ F_{p_k,X_k}(p) = g(p) - ⟨X_k, \log_{p_k}p⟩\]
its gradient is given by using $F=F_1(F_2(p))$, where $F_1(X) = ⟨X_k,X⟩$ and $F_2(p) = \log_{p_k}p$ and the chain rule as well as the adjoint differential of the logarithmic map with respect to its argument for $D^*F_2(p)$
Both interims values can be set using set_manopt_parameter!(::LinearizedDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::LinearizedDCGrad, ::Val{:X}, X), respectively.
Both interims values can be set using set_manopt_parameter!(::ProximalDCCost, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCCost, ::Val{:λ}, λ), respectively.
A functor (M, X, p) → X to represent the gradient of the inner cost function of a ManifoldDifferenceOfConvexProximalObjective, i.e. the gradient function of the proximal map cost function of g, i.e. of
Both interims values can be set using set_manopt_parameter!(::ProximalDCGrad, ::Val{:p}, p) and set_manopt_parameter!(::ProximalDCGrad, ::Val{:λ}, λ), respectively.
The evaluation is done in place of X for the !-variant. The T=AllocatingEvaluation problem might still allocate memory within. When the non-mutating variant is called with a T=InplaceEvaluation memory for the result is allocated.
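The two evaluation styles mentioned throughout (allocating versus in-place) can be sketched on a Euclidean toy gradient, here for the assumed cost $f(p)=\lVert p\rVert^2$ (plain Julia, not Manopt.jl code; the manifold argument is ignored):

```julia
# allocating variant: returns a freshly allocated tangent vector
grad_f(M, p) = 2 .* p

# in-place variant: writes the gradient into X and returns it
function grad_f!(M, X, p)
    X .= 2 .* p
    return X
end
```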
perform the exact penalty method (EPM) [Liu, Boumal, 2019, Appl. Math. Optim.]. The aim of the EPM is to find a solution of the constrained optimisation task
where M is a Riemannian manifold, and $f$, $\{g_i\}_{i=1}^m$ and $\{h_j\}_{j=1}^n$ are twice continuously differentiable functions from M to ℝ. For that a weighted $L_1$-penalty term for the violation of the constraints is added to the objective
where $ρ>0$ is the penalty parameter. Since this is non-smooth, a SmoothingTechnique with parameter u is applied, see the ExactPenaltyCost.
In every step $k$ of the exact penalty method, the smoothed objective is then minimized over all $x ∈\mathcal{M}$. Then, the accuracy tolerance $ϵ$ and the smoothing parameter $u$ are updated by setting
\[ϵ^{(k)}=\max\{ϵ_{\min}, θ_ϵ ϵ^{(k-1)}\},\]
where $ϵ_{\min}$ is the lowest value $ϵ$ is allowed to become and $θ_ϵ ∈ (0,1)$ is a constant scaling factor, and
\[u^{(k)} = \max\{u_{\min}, θ_u u^{(k-1)}\},\]
where $u_{\min}$ is the lowest value $u$ is allowed to become and $θ_u ∈ (0,1)$ is a constant scaling factor.
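A scalar sketch of one step of this parameter schedule, with hypothetical values for the lower bounds and scaling factors (the $u$ update is assumed to mirror the $ϵ$ update):

```julia
# one step of the accuracy (ϵ) and smoothing (u) parameter schedule
function epm_update(ϵ, u; ϵ_min=1e-6, u_min=1e-6, θ_ϵ=0.8, θ_u=0.8)
    return max(ϵ_min, θ_ϵ * ϵ), max(u_min, θ_u * u)
end
```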
grad_g – (nothing) the gradient of the inequality constraints
grad_h – (nothing) the gradient of the equality constraints
Note that one of the pairs (g, grad_g) or (h, grad_h) has to be provided. Otherwise the problem is not constrained and you can also call e.g. quasi_Newton
ϵ_exponent – (1/100) exponent of the ϵ update factor;
ϵ_min – (1e-6) the lower bound for the accuracy tolerance
u – (1e-1) the smoothing parameter and threshold for violation of the constraints
u_exponent – (1/100) exponent of the u update factor;
u_min – (1e-6) the lower bound for the smoothing parameter and threshold for violation of the constraints
ρ – (1.0) the penalty parameter
min_stepsize – (1e-10) the minimal step size
sub_cost – (ExactPenaltyCost(problem, ρ, u; smoothing=smoothing)) use this exact penalty cost, especially with the same numbers ρ,u as in the options for the sub problem
sub_grad – (ExactPenaltyGrad(problem, ρ, u; smoothing=smoothing)) use this exact penalty gradient, especially with the same numbers ρ,u as in the options for the sub problem
sub_kwargs – keyword arguments to decorate the sub options, e.g. with debug.
where we use an additional parameter $u$ and a smoothing technique, e.g. LogarithmicSumOfExponentials or LinearQuadraticHuber to obtain a smooth cost function. This struct is also a functor (M,p) -> v of the cost $v$.
Specify a smoothing based on $\max\{0,x\} ≈ \mathcal P(x,u)$ for some $u$, where
\[\mathcal P(x, u) = \begin{cases}
0 & \text{ if } x \leq 0,\\
\frac{x^2}{2u} & \text{ if } 0 \leq x \leq u,\\
x-\frac{u}{2} & \text{ if } x \geq u.
\end{cases}\]
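The piecewise definition of $\mathcal P(x,u)$ above transcribes directly; this is a plain-Julia sketch, not the Manopt.jl LinearQuadraticHuber type:

```julia
# smooth approximation of max{0, x} with smoothing parameter u > 0
function smooth_max(x, u)
    x ≤ 0 && return zero(x)       # flat part
    x ≤ u && return x^2 / (2u)    # quadratic transition
    return x - u / 2              # linear part
end
```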
\[p_{k+1} = \operatorname{retr}_{p_k}\bigl(-s_k\operatorname{grad}f(p_k)\bigr),\qquad k=0,1,…\]
with different choices of the stepsize $s_k$ available (see stepsize option below).
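On $\mathcal M = ℝ^n$ with $\operatorname{retr}_p(X) = p + X$, the iteration above reduces to classical gradient descent; a constant-stepsize sketch in plain Julia (not the Manopt.jl solver):

```julia
# Euclidean gradient descent: p_{k+1} = p_k - s * grad_f(p_k)
function gd(grad_f, p, s; iters=100)
    for _ in 1:iters
        p = p .- s .* grad_f(p)   # retr_p(-s grad f(p)) = p - s grad f(p) in ℝ^n
    end
    return p
end
```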
Input
M – a manifold $\mathcal M$
f – a cost function $f: \mathcal M→ℝ$ to find a minimizer $p^*$ for
grad_f – the gradient $\operatorname{grad}f: \mathcal M → T\mathcal M$ of f
as a function (M, p) -> X or a function (M, X, p) -> X
direction – (IdentityUpdateRule) perform a processing of the direction, e.g.
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default), i.e. is of the form grad_f(M, p), or InplaceEvaluation in place, i.e. is of the form grad_f!(M, X, p).
retraction_method – (default_retraction_method(M, typeof(p))) a retraction to use
Generate gradient descent options, where X can be used to set the tangent vector to store the gradient in a certain type; it will be initialised accordingly at a later stage. All following fields are keyword arguments.
A field of the options is the direction, a DirectionUpdateRule, which by default IdentityUpdateRule just evaluates the gradient but can be enhanced for example to
A general functor that handles direction update rules. Its field(s) usually only consist of a StoreStateAction, by default initialized to the fields required for the specific coefficient, but it can also be replaced by a (common, global) individual one that provides these values.
Append a momentum to a gradient processor, where the last direction and last iterate are stored and the new direction is composed as $η_i = m*η_{i-1}' - s d_i$, where $s d_i$ is the current (inner) direction and $η_{i-1}'$ is the vector transported last direction multiplied by momentum $m$.
Fields
p_old - (rand(M)) remember the last iterate for parallel transporting the last direction
momentum – (0.2) factor for momentum
direction – internal DirectionUpdateRule to determine directions to add the momentum to.
vector_transport_method – default_vector_transport_method(M, typeof(p)) vector transport method to use
X_old – (zero_vector(M,x0)) the last gradient/direction update added as momentum
Constructors
Add momentum to a gradient problem, where by default just a gradient evaluation is used
Initialize a momentum gradient rule to s. Note that the keyword arguments p and X will be overridden often, so their initialisation is meant to set them to certain types of points or tangent vectors, if you do not use the default types with respect to M.
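The momentum rule $η_i = m η_{i-1}' - s d_i$ above can be sketched in the Euclidean case, where the vector transport of the previous direction is the identity (plain Julia, not the Manopt.jl MomentumGradient rule):

```julia
# combine the transported previous direction with the current (inner) direction
momentum_step(η_prev, d, m, s) = m .* η_prev .- s .* d
```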
Add an average of gradients to a gradient processor. A set of previous directions (from the inner processor) and the last iterate are stored; the average is taken after vector transporting them to the current iterate's tangent space.
Fields
gradients – (fill(zero_vector(M,x0),n)) the last n gradient/direction updates
last_iterate – last iterate (needed to transport the gradients)
direction – internal DirectionUpdateRule to determine directions to apply the averaging to
vector_transport_method - vector transport method to use
Constructors
AverageGradient(
)
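The averaging described above can be sketched in the Euclidean case, where the vector transport of the stored directions to the current iterate is the identity (plain Julia, not the Manopt.jl AverageGradient rule):

```julia
# keep the last n directions in `gradients` and return their mean
function average_direction!(gradients, d)
    popfirst!(gradients)      # drop the oldest stored direction
    push!(gradients, d)       # store the newest one
    return sum(gradients) ./ length(gradients)
end
```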
Note that the solvers (their AbstractManoptSolverState, to be precise) can also be decorated to enhance your algorithm by general additional properties, see debug output and recording values. This is done using the debug= and record= keywords in the function calls. Similarly, since 0.4 we provide a (simple) caching of the objective function using the cache= keyword in any of the function calls.
which is a framework that you in general should not change or redefine. It uses the following methods, which also need to be implemented for your own algorithm, if you want to provide one.
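The framework amounts to the following loop; the three hook functions here are toy placeholders for a trivial halving iteration, not Manopt.jl's actual methods, which dispatch on the problem and state types:

```julia
# toy stand-ins for the methods a solver implements
my_initialize!(problem, state) = (state[:x] = problem[:start])
my_step!(problem, state, i) = (state[:x] /= 2)
my_stop(problem, state, i) = abs(state[:x]) < problem[:tol] || i > problem[:maxiter]

# the generic iteration skeleton, mirroring solve!(problem, state)
function my_solve!(problem, state)
    my_initialize!(problem, state)
    i = 1
    while !my_stop(problem, state, i)
        my_step!(problem, state, i)
        i += 1
    end
    return state
end
```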
determine the result value of a call to a solver. By default this returns the same as get_solver_result, i.e. the last iterate or (approximate) minimizer.
return the internally stored state of the ReturnSolverState instead of the minimizer. This means that when the state is decorated like this, the user still has to call get_solver_result on the internal state separately.
depending on the current AbstractManoptProblem amp, the current state of the solver stored in the AbstractManoptSolverState ams, and the current iterate i, this function determines whether to stop the solver, which by default means to call the internal StoppingCriterion ams.stop.
this is a short overview of the different types of high-level functions that are usually available for a solver. Let's assume the solver is called new_solver and requires a cost f and some first order information df as well as a starting point p on M. Together, f and df form the objective, called obj.
Then there are basically two different variants to call
Where the start point should be optional. Keyword arguments include the type of evaluation, decorators like debug= or record= as well as algorithm specific ones. If you provide an immutable point p, or the rand(M) point is immutable, like on Circle(), this method should turn the point into a mutable one as well.
The third variant works in place of p, so it is mandatory.
This first interface would set up the objective and pass all keywords on to the objective-based call.
new_solver(M, obj; kwargs...)
new_solver!(M, obj, p; kwargs...)
Here the objective would be created beforehand, e.g. to compare different solvers on the same objective, and for the first variant the start point is optional. Keyword arguments include decorators like debug= or record= as well as algorithm specific ones.
This variant would generate the problem and the state and check the validity of all provided keyword arguments that affect the state. Then it would call the iteration process.
If you generate the corresponding problem and state as the previous step does, you can also use the third (lowest level) variant and just call
solve!(problem, state)
Settings
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
perform the particle swarm optimization algorithm (PSO), starting with an initial swarm [Borckmans, Ishteva, Absil, 7th IC Swarm Intelligence, 2010]. If no swarm is provided, swarm_size many random points are used. Note that, since this method does not work in-place, these points are duplicated internally.
The aim of PSO is to find the particle position $g$ on the Manifold M that solves
\[\min_{x ∈\mathcal{M}} F(x).\]
To this end, a swarm of particles is moved around the Manifold M in the following manner. For every particle $k$ we compute the new particle velocities $v_k^{(i)}$ in every step $i$ of the algorithm by
\[v_k^{(i)} = ω \, \operatorname{T}_{x_k^{(i)}\gets x_k^{(i-1)}}v_k^{(i-1)} + c \, r_1 \operatorname{retr}_{x_k^{(i)}}^{-1}(p_k^{(i)}) + s \, r_2 \operatorname{retr}_{x_k^{(i)}}^{-1}(g),\]
where $x_k^{(i)}$ is the current particle position, $ω$ denotes the inertia, $c$ and $s$ are a cognitive and a social weight, respectively, $r_j$, $j=1,2$ are random factors which are computed new for each particle and step, $\operatorname{retr}^{-1}$ denotes an inverse retraction on the manifold M, and $\operatorname{T}$ is a vector transport.
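The velocity update above can be sketched in the Euclidean case, where the vector transport is the identity and $\operatorname{retr}_x^{-1}(y) = y - x$ (plain Julia, not the Manopt.jl solver):

```julia
# new velocity from old velocity v, current position x, personal best p_best,
# global best g_best, inertia ω, cognitive/social weights c, s, randoms r1, r2
pso_velocity(v, x, p_best, g_best, ω, c, s, r1, r2) =
    ω .* v .+ c .* r1 .* (p_best .- x) .+ s .* r2 .* (g_best .- x)
```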
vector_transport_method - (default_vector_transport_method(M, eltype(x))) a vector transport method to use.
velocity – a set of tangent vectors (of type AbstractVector{T}) representing the velocities of the particles, per default a random tangent vector per initial position
Describes a particle swarm optimizing algorithm, with
Fields
x – a set of points (of type AbstractVector{P}) on a manifold as initial particle positions
velocity – a set of tangent vectors (of type AbstractVector{T}) representing the velocities of the particles
inertia – (0.65) the inertia of the particles
social_weight – (1.4) a social weight factor
cognitive_weight – (1.4) a cognitive weight factor
p_temp – temporary storage for a point to avoid allocations during a step of the algorithm
social_vec - temporary storage for a tangent vector related to social_weight
cognitive_vector - temporary storage for a tangent vector related to cognitive_weight
stopping_criterion – ([StopAfterIteration](@ref)(500) | [StopWhenChangeLess](@ref)(1e-4)) a functor inheriting from [StoppingCriterion](@ref) indicating when to stop.
retraction_method – (default_retraction_method(M, eltype(x))) the retraction to use
inverse_retraction_method - (default_inverse_retraction_method(M, eltype(x))) an inverse retraction to use.
vector_transport_method - (default_vector_transport_method(M, eltype(x))) a vector transport to use
Constructor
ParticleSwarmState(M, x0, velocity; kwargs...)
construct a particle swarm state for the manifold M starting at initial population x0 with velocities velocity, where the manifold is used within the defaults of the other fields mentioned above, which are keyword arguments here.
The Primal-dual Riemannian semismooth Newton Algorithm is a second-order method derived from the ChambollePock algorithm.
The aim is to solve an optimization problem on a manifold with a cost function of the form
\[F(p) + G(Λ(p)),\]
where $F:\mathcal M → \overline{ℝ}$, $G:\mathcal N → \overline{ℝ}$, and $Λ:\mathcal M →\mathcal N$. If the manifolds $\mathcal M$ or $\mathcal N$ are not Hadamard, it has to be considered locally, i.e. on geodesically convex sets $\mathcal C \subset \mathcal M$ and $\mathcal D \subset\mathcal N$ such that $Λ(\mathcal C) \subset \mathcal D$.
The algorithm comes down to applying the Riemannian semismooth Newton method to the rewritten primal-dual optimality conditions, i.e., we define the vector field $X: \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N} \rightarrow \mathcal{T} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$ as
in the vector space $\mathcal{T}_{p^{(k)}} \mathcal{M} \times \mathcal{T}_{n}^{*} \mathcal{N}$
Update
\[p^{(k+1)} := \exp_{p^{(k)}}(d_p^{(k)})\]
and
\[ξ_n^{(k+1)} := ξ_n^{(k)} + d_n^{(k)}\]
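The update step above can be sketched in the Euclidean case, where $\exp_p(d) = p + d$, so both the primal point and the dual vector are updated additively (plain Julia, not the Manopt.jl solver):

```julia
# apply the Newton directions d_p (primal) and d_n (dual) to the iterates
pd_update(p, ξ, d_p, d_n) = (p .+ d_p, ξ .+ d_n)
```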
Furthermore you can exchange the exponential map, the logarithmic map, and the parallel transport by a retraction, an inverse retraction and a vector transport.
Finally you can also update the base points $m$ and $n$ during the iterations. This introduces a few additional vector transports. The same holds for the case that $Λ(m^{(k)})\neq n^{(k)}$ at some point. All these cases are covered in the algorithm.
primal_dual_semismooth_Newton(M, N, cost, p, X, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_dual_G, linearized_operator, adjoint_linearized_operator)
Perform the Primal-Dual Riemannian Semismooth Newton algorithm.
Given a cost function $\mathcal E\colon\mathcal M \to \overline{ℝ}$ of the form
\[\mathcal E(p) = F(p) + G( Λ(p) ),\]
where $F\colon\mathcal M \to \overline{ℝ}$, $G\colon\mathcal N \to \overline{ℝ}$, and $\Lambda\colon\mathcal M \to \mathcal N$. The remaining input parameters are
p, X primal and dual start points $x\in\mathcal M$ and $\xi\in T_n\mathcal N$
m,n base points on $\mathcal M$ and $\mathcal N$, respectively.
linearized_forward_operator the linearization $DΛ(⋅)[⋅]$ of the operator $Λ(⋅)$.
adjoint_linearized_operator the adjoint $DΛ^*$ of the linearized operator $DΛ(m)\colon T_{m}\mathcal M \to T_{Λ(m)}\mathcal N$
prox_F, prox_G_dual the proximal maps of $F$ and $G^\ast_n$
diff_prox_F, diff_prox_dual_G the (Clarke Generalized) differentials of the proximal maps of $F$ and $G^\ast_n$
primal_dual_semismooth_Newton(M, N, cost, x0, ξ0, m, n, prox_F, diff_prox_F, prox_G_dual, diff_prox_G_dual, linearized_forward_operator, adjoint_linearized_operator)
Perform the primal-dual Riemannian semismooth Newton algorithm in place of x, ξ, and potentially m, n if they are not fixed. See primal_dual_semismooth_Newton for details and optional parameters.
update_primal_base ((amp, ams, i) -> o.m) function to update the primal base
update_dual_base ((amp, ams, i) -> o.n) function to update the dual base
retraction_method – (default_retraction_method(M, typeof(p))) the retraction to use
inverse_retraction_method - (default_inverse_retraction_method(M, typeof(p))) an inverse retraction to use.
vector_transport_method - (default_vector_transport_method(M, typeof(p))) a vector transport to use
where for the update functions an AbstractManoptProblem amp, an AbstractManoptSolverState ams and the current iterate i are the arguments. If you activate these to be different from the default identity, you have to provide p.Λ for the algorithm to work (which might be missing).
W. Diepeveen and J. Lellmann. An Inexact Semismooth Newton Method on Riemannian Manifolds with Application to Duality-Based Total Variation Denoising. SIAM Journal on Imaging Sciences 14, 1565–1600 (2021), arXiv: [2102.10309](https://arxiv.org/abs/2102.10309).
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
cautious_function – ((x) -> x*10^(-4)) – a monotone increasing function that is zero at zero and strictly increasing at zero, used for the cautious update.
direction_update – (InverseBFGS()) the update rule to use.
evaluation – (AllocatingEvaluation) specify whether the gradient works by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. of the form gradF!(M, X, x).
initial_operator – (Matrix{Float64}(I,n,n)) initial matrix to use for the approximation, where n=manifold_dimension(M); see also scale_initial_operator.
memory_size – (20) limited memory, number of $s_k, y_k$ to store. Set to a negative value to use a full memory representation
retraction_method – (default_retraction_method(M, typeof(p))) a retraction method to use, by default the exponential map.
scale_initial_operator – (true) scale the initial operator with $\frac{⟨s_k,y_k⟩_{p_k}}{\lVert y_k\rVert_{p_k}}$ in the computation
stabilize – (true) stabilize the method numerically by projecting computed (Newton-) directions to the tangent space to reduce numerical errors
The aim is to minimize a real-valued function on a Riemannian manifold, i.e.
\[\min f(x), \quad x ∈ \mathcal{M}.\]
Riemannian quasi-Newton methods are, as generalizations of their Euclidean counterparts, Riemannian line search methods. These methods determine a search direction $η_k ∈ T_{x_k} \mathcal{M}$ at the current iterate $x_k$ and a suitable stepsize $α_k$ along $\gamma(α) = R_{x_k}(α η_k)$, where $R\colon T\mathcal{M} → \mathcal{M}$ is a retraction. The next iterate is obtained by
\[x_{k+1} = R_{x_k}(α_k η_k).\]
In quasi-Newton methods, the search direction is given by
\[η_k = -{\mathcal{H}_k}^{-1}[\operatorname{grad} f(x_k)] = -\mathcal{B}_k[\operatorname{grad} f(x_k)],\]
where $\mathcal{H}_k : T_{x_k} \mathcal{M} → T_{x_k} \mathcal{M}$ is a positive definite self-adjoint operator, which approximates the action of the Hessian $\operatorname{Hess} f (x_k)[⋅]$, and $\mathcal{B}_k = {\mathcal{H}_k}^{-1}$. The idea of quasi-Newton methods is that, instead of creating a completely new approximation of the Hessian operator $\operatorname{Hess} f(x_{k+1})$ or its inverse at every iteration, the previous operator $\mathcal{H}_k$ or $\mathcal{B}_k$ is updated by a convenient formula using the information about the curvature of the objective function obtained during the iteration. The resulting operator $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$ acts on the tangent space $T_{x_{k+1}} \mathcal{M}$ of the freshly computed iterate $x_{k+1}$. In order to get a well-defined method, the following requirements are placed on the new operator $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$ that is created by an update. Since the Hessian $\operatorname{Hess} f(x_{k+1})$ is a self-adjoint operator on the tangent space $T_{x_{k+1}} \mathcal{M}$, and $\mathcal{H}_{k+1}$ approximates it, we require that $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$ is also self-adjoint on $T_{x_{k+1}} \mathcal{M}$. In order to achieve a steady descent, we want $η_k$ to be a descent direction in each iteration. Therefore we require that $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$ is a positive definite operator on $T_{x_{k+1}} \mathcal{M}$. In order to get information about the curvature of the objective function into the new operator $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$, we require that it satisfies a form of a Riemannian quasi-Newton equation:
\[\mathcal{H}_{k+1}[T_{x_k \rightarrow x_{k+1}}(α_k η_k)] = \operatorname{grad} f(x_{k+1}) - T_{x_k \rightarrow x_{k+1}}(\operatorname{grad} f(x_k)),\]
where $T_{x_k \rightarrow x_{k+1}} : T_{x_k} \mathcal{M} → T_{x_{k+1}} \mathcal{M}$ is a vector transport and the chosen retraction $R$ is the associated retraction of $T$. Note that, of course, not all updates in all situations meet these conditions in every iteration. For specific quasi-Newton updates, the fulfillment of the Riemannian curvature condition, which requires that
\[g_{x_{k+1}}(s_k, y_k) > 0\]
holds, is a requirement for the inheritance of the self-adjointness and positive definiteness of $\mathcal{H}_k$ or $\mathcal{B}_k$ by the operator $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$. Unfortunately, the fulfillment of the Riemannian curvature condition is not guaranteed by a step size $\alpha_k > 0$ that satisfies the generalized Wolfe conditions. However, in order to create a positive definite operator $\mathcal{H}_{k+1}$ or $\mathcal{B}_{k+1}$ in each iteration, the so-called locking condition was introduced in Huang, Gallivan, Absil, SIAM J. Optim., 2015, which requires that the isometric vector transport $T^S$, which is used in the update formula, and its associated retraction $R$ fulfill
\[T^{S}_{x, ξ_x}(ξ_x) = β T^{R}_{x, ξ_x}(ξ_x), \quad β = \frac{\lVert ξ_x \rVert_x}{\lVert T^{R}_{x, ξ_x}(ξ_x) \rVert_{R_{x}(ξ_x)}},\]
where $T^R$ is the vector transport by differentiated retraction. With the requirement that the isometric vector transport $T^S$ and its associated retraction $R$ satisfy the locking condition, and using the tangent vector
in the update, it can be shown that choosing a stepsize $α_k > 0$ that satisfies the Riemannian Wolfe conditions leads to the fulfillment of the Riemannian curvature condition, which in turn implies that the operator generated by the updates is positive definite. In the following we denote the specific operators in matrix notation and hence use $H_k$ and $B_k$, respectively.
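The iteration scheme above can be made concrete with a minimal Euclidean sketch (Python, illustrative only, hypothetical names; on $ℝ^n$ the retraction is $x + αη$ and vector transport is the identity), showing the search direction, the line search, and the curvature condition $⟨s_k, y_k⟩ > 0$ guarding positive definiteness:

```python
import numpy as np

def backtracking(f, x, eta, g, alpha=1.0, c=1e-4, shrink=0.5):
    """Simple Armijo backtracking line search."""
    while f(x + alpha * eta) > f(x) + c * alpha * (g @ eta):
        alpha *= shrink
    return alpha

def bfgs(f, grad, x, iters=50, tol=1e-8):
    """Minimal Euclidean BFGS, maintaining an inverse Hessian approximation B."""
    n = len(x)
    B = np.eye(n)                      # approximation of the inverse Hessian
    g = grad(x)
    for _ in range(iters):
        if np.linalg.norm(g) < tol:
            break
        eta = -B @ g                   # quasi-Newton search direction
        alpha = backtracking(f, x, eta, g)
        x_new = x + alpha * eta        # "retraction" in the Euclidean case
        g_new = grad(x_new)
        s, y = x_new - x, g_new - g
        sy = s @ y
        if sy > 1e-12:                 # curvature condition <s, y> > 0
            rho = 1.0 / sy
            V = np.eye(n) - rho * np.outer(s, y)
            B = V @ B @ V.T + rho * np.outer(s, s)   # inverse BFGS update
        x, g = x_new, g_new
    return x

# usage: minimize a convex quadratic
A = np.array([[3.0, 1.0], [1.0, 2.0]])
f = lambda x: 0.5 * x @ A @ x
grad = lambda x: A @ x
x_min = bfgs(f, grad, np.array([1.0, -2.0]))
```

On a quadratic the curvature condition holds automatically, so the update is applied in every iteration.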
These AbstractQuasiNewtonDirectionUpdates represent any quasi-Newton update rule, where the operator is stored as a matrix. A distinction is made between the update of the approximation of the Hessian, $H_k \mapsto H_{k+1}$, and the update of the approximation of the Hessian inverse, $B_k \mapsto B_{k+1}$. For the first case, the coordinates of the search direction $η_k$ with respect to a basis $\{b_i\}^{n}_{i=1}$ are determined by solving a linear system of equations, i.e.
\[\text{solve} \quad H_k \hat{η}_k = -\widehat{\operatorname{grad} f(x_k)},\]
where $H_k$ is the matrix representing the operator with respect to the basis $\{b_i\}^{n}_{i=1}$ and $\widehat{\operatorname{grad} f(x_k)}$ represents the coordinates of the gradient of the objective function $f$ in $x_k$ with respect to the basis $\{b_i\}^{n}_{i=1}$. If a method is chosen where the Hessian inverse is approximated, the coordinates of the search direction $η_k$ with respect to a basis $\{b_i\}^{n}_{i=1}$ are obtained simply by matrix-vector multiplication, i.e.
\[\hat{η}_k = -B_k \widehat{\operatorname{grad} f(x_k)},\]
where $B_k$ is the matrix representing the operator with respect to the basis $\{b_i\}^{n}_{i=1}$ and $\widehat{\operatorname{grad} f(x_k)}$ as above. In the end, the search direction $η_k$ is generated from the coordinates $\hat{η}_k$ and the vectors of the basis $\{b_i\}^{n}_{i=1}$ in both variants. The AbstractQuasiNewtonUpdateRule indicates which quasi-Newton update rule is used. In all of them, the Euclidean update formula is used to generate the matrices $H_{k+1}$ and $B_{k+1}$, and the basis $\{b_i\}^{n}_{i=1}$ is transported into the upcoming tangent space $T_{x_{k+1}} \mathcal{M}$, preferably with an isometric vector transport, or generated there.
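The two coordinate computations can be contrasted in the Euclidean matrix case (Python, illustrative only): with a Hessian approximation one solves a linear system, with an inverse approximation a single matrix-vector product suffices.

```python
import numpy as np

H = np.array([[2.0, 0.5], [0.5, 1.0]])   # H_k, a symmetric positive definite matrix
B = np.linalg.inv(H)                      # B_k = H_k^{-1}
g_hat = np.array([1.0, -1.0])             # coordinates of grad f(x_k) in the basis

eta_solve = np.linalg.solve(H, -g_hat)    # solve H_k η̂ = -ĝ (Hessian variant)
eta_mult = -B @ g_hat                     # η̂ = -B_k ĝ (inverse-Hessian variant)
print(np.allclose(eta_solve, eta_mult))   # both yield the same search direction
```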
This AbstractQuasiNewtonDirectionUpdate represents the limited-memory Riemannian BFGS update, where the approximating operator is represented by $m$ stored pairs of tangent vectors $\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}$ in the $k$-th iteration. For the calculation of the search direction $η_k$, the generalisation of the two-loop recursion is used (see Huang, Gallivan, Absil, SIAM J. Optim., 2015), since it only requires inner products and linear combinations of tangent vectors in $T_{x_k} \mathcal{M}$. For that, the stored pairs of tangent vectors $\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}$, the gradient $\operatorname{grad} f(x_k)$ of the objective function $f$ in $x_k$, and the positive definite self-adjoint operator
\[\mathcal{B}^{(0)}_k[⋅] = \frac{g_{x_k}(s_{k-1}, y_{k-1})}{g_{x_k}(y_{k-1}, y_{k-1})}\;\mathrm{id}_{T_{x_k}\mathcal{M}}[⋅]\]
are used. The two-loop recursion can be understood as executing the InverseBFGS update $m$ times in a row on $\mathcal{B}^{(0)}_k[⋅]$ using the tangent vectors $\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}$, while at the same time the resulting operator $\mathcal{B}^{LRBFGS}_k [⋅]$ is applied directly to $\operatorname{grad} f(x_k)$. When updating, there are two cases: if there is still free memory, i.e. $k < m$, the previously stored vector pairs $\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}$ have to be transported into the upcoming tangent space $T_{x_{k+1}} \mathcal{M}$; if there is no free memory, the oldest pair $\{ \widetilde{s}_{k−m}, \widetilde{y}_{k−m}\}$ has to be discarded and then all the remaining vector pairs $\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m+1}^{k-1}$ are transported into the tangent space $T_{x_{k+1}} \mathcal{M}$. After that, we calculate and store $s_k = \widetilde{s}_k = T^{S}_{x_k, α_k η_k}(α_k η_k)$ and $y_k = \widetilde{y}_k$. This process ensures that new information about the objective function is always included and the old, probably no longer relevant, information is discarded.
Fields
memory_s – the set of the stored (and transported) search directions times step size $\{ \widetilde{s}_i\}_{i=k-m}^{k-1}$.
memory_y – set of the stored gradient differences $\{ \widetilde{y}_i\}_{i=k-m}^{k-1}$.
ξ – a variable used in the two-loop recursion.
ρ – a variable used in the two-loop recursion.
scale –
vector_transport_method – an AbstractVectorTransportMethod
message – a string containing a potential warning that might have appeared
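The two-loop recursion described above can be sketched in the Euclidean case (Python, illustrative only, hypothetical names; on a manifold the stored pairs would additionally be vector-transported between tangent spaces):

```python
import numpy as np

def two_loop(grad, s_list, y_list):
    """Apply the implicitly stored inverse Hessian approximation to grad."""
    q = grad.copy()
    alphas, rhos = [], []
    for s, y in zip(reversed(s_list), reversed(y_list)):  # first loop: newest to oldest
        rho = 1.0 / (s @ y)
        a = rho * (s @ q)
        q -= a * y
        alphas.append(a)
        rhos.append(rho)
    # initial operator B^(0)_k: a scaled identity with
    # gamma = <s_{k-1}, y_{k-1}> / <y_{k-1}, y_{k-1}>
    gamma = (s_list[-1] @ y_list[-1]) / (y_list[-1] @ y_list[-1])
    r = gamma * q
    for (s, y), a, rho in zip(
        zip(s_list, y_list), reversed(alphas), reversed(rhos)
    ):  # second loop: oldest to newest
        b = rho * (y @ r)
        r += (a - b) * s
    return r  # r approximates B_k[grad f(x_k)]; the search direction is -r

# usage with a single stored pair
s = np.array([1.0, 0.0])
y = np.array([0.9, 0.1])
g = np.array([1.0, 1.0])
direction = -two_loop(g, [s], [y])
```

Only inner products and linear combinations of the stored vectors appear, which is what makes the limited-memory variant cheap.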
is satisfied, where $\theta$ is a monotone increasing function satisfying $\theta(0) = 0$ and $\theta$ is strictly increasing at $0$. If this is not the case, the corresponding update will be skipped, which means that for QuasiNewtonMatrixDirectionUpdate the matrix $H_k$ or $B_k$ is not updated. The basis $\{b_i\}^{n}_{i=1}$ is nevertheless transported into the upcoming tangent space $T_{x_{k+1}} \mathcal{M}$, and for QuasiNewtonLimitedMemoryDirectionUpdate neither the oldest vector pair $\{ \widetilde{s}_{k−m}, \widetilde{y}_{k−m}\}$ is discarded nor the newest vector pair $\{ \widetilde{s}_{k}, \widetilde{y}_{k}\}$ is added into storage, but all stored vector pairs $\{ \widetilde{s}_i, \widetilde{y}_i\}_{i=k-m}^{k-1}$ are transported into the tangent space $T_{x_{k+1}} \mathcal{M}$. If BFGS or InverseBFGS is chosen as update, then the resulting method follows the method of Huang, Absil, Gallivan, SIAM J. Optim., 2018, taking into account that the corresponding step size is chosen.
We denote by $\widetilde{H}_k^\mathrm{BFGS}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
\[H^\mathrm{BFGS}_{k+1} = \widetilde{H}^\mathrm{BFGS}_k + \frac{y_k y^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k} - \frac{\widetilde{H}^\mathrm{BFGS}_k s_k s^{\mathrm{T}}_k \widetilde{H}^\mathrm{BFGS}_k}{s^{\mathrm{T}}_k \widetilde{H}^\mathrm{BFGS}_k s_k}.\]
We denote by $\widetilde{H}_k^\mathrm{DFP}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
\[H^\mathrm{DFP}_{k+1} = \Bigl(\mathrm{id} - \frac{y_k s^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k}\Bigr) \widetilde{H}^\mathrm{DFP}_k \Bigl(\mathrm{id} - \frac{s_k y^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k}\Bigr) + \frac{y_k y^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k}.\]
indicates in AbstractQuasiNewtonDirectionUpdate that the Riemannian Broyden update is used in the Riemannian quasi-Newton method, which is a convex combination of BFGS and DFP.
We denote by $\widetilde{H}_k^\mathrm{Br}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
We denote by $\widetilde{H}_k^\mathrm{SR1}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
\[H^\mathrm{SR1}_{k+1} = \widetilde{H}^\mathrm{SR1}_k + \frac{(y_k - \widetilde{H}^\mathrm{SR1}_k s_k)(y_k - \widetilde{H}^\mathrm{SR1}_k s_k)^{\mathrm{T}}}{(y_k - \widetilde{H}^\mathrm{SR1}_k s_k)^{\mathrm{T}} s_k}.\]
This method can be stabilized by only performing the update if the denominator is larger than $r\lVert s_k\rVert_{x_{k+1}}\lVert y_k - \widetilde{H}^\mathrm{SR1}_k s_k \rVert_{x_{k+1}}$ for some $r>0$. For more details, see Section 6.2 in Nocedal, Wright, Springer, 2006.
Constructor
SR1(r::Float64=-1.0)
Generate the SR1 update, which by default does not include the check (since the default sets $r<0$).
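The SR1 update with the stabilization test can be sketched in the Euclidean case (Python, illustrative only, hypothetical name sr1_update; here a positive $r$ is used for demonstration, whereas the default $r<0$ would disable the check):

```python
import numpy as np

def sr1_update(H, s, y, r=1e-8):
    """Euclidean SR1 rank-one update of a Hessian approximation H.

    The update is skipped when the denominator <y - Hs, s> is too small
    relative to r * ||s|| * ||y - Hs||, which would be numerically unsafe.
    """
    v = y - H @ s
    denom = v @ s
    if abs(denom) <= r * np.linalg.norm(s) * np.linalg.norm(v):
        return H                        # skip the update (stabilization)
    return H + np.outer(v, v) / denom   # H + (y-Hs)(y-Hs)^T / <y-Hs, s>

# usage: one update step; the result satisfies the secant equation H1 s = y
H = np.eye(2)
s = np.array([1.0, 0.0])
y = np.array([2.0, 0.5])
H1 = sr1_update(H, s, y)
```

When $y = Hs$ already holds, the rank-one term vanishes and the check skips the update, leaving $H$ unchanged.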
We denote by $\widetilde{B}_k^\mathrm{BFGS}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
\[B^\mathrm{BFGS}_{k+1} = \Bigl(\mathrm{id} - \frac{s_k y^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k}\Bigr) \widetilde{B}^\mathrm{BFGS}_k \Bigl(\mathrm{id} - \frac{y_k s^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k}\Bigr) + \frac{s_k s^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k}.\]
We denote by $\widetilde{B}_k^\mathrm{DFP}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
\[B^\mathrm{DFP}_{k+1} = \widetilde{B}^\mathrm{DFP}_k + \frac{s_k s^{\mathrm{T}}_k}{s^{\mathrm{T}}_k y_k} - \frac{\widetilde{B}^\mathrm{DFP}_k y_k y^{\mathrm{T}}_k \widetilde{B}^\mathrm{DFP}_k}{y^{\mathrm{T}}_k \widetilde{B}^\mathrm{DFP}_k y_k}.\]
We denote by $\widetilde{B}_k^\mathrm{Br}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
We denote by $\widetilde{B}_k^\mathrm{SR1}$ the operator concatenated with a vector transport and its inverse before and after to act on $x_{k+1} = R_{x_k}(α_k η_k)$. Then the update formula reads
\[B^\mathrm{SR1}_{k+1} = \widetilde{B}^\mathrm{SR1}_k + \frac{(s_k - \widetilde{B}^\mathrm{SR1}_k y_k)(s_k - \widetilde{B}^\mathrm{SR1}_k y_k)^{\mathrm{T}}}{(s_k - \widetilde{B}^\mathrm{SR1}_k y_k)^{\mathrm{T}} y_k}.\]
This method can be stabilized by only performing the update if the denominator is larger than $r\lVert y_k\rVert_{x_{k+1}}\lVert s_k - \widetilde{B}^\mathrm{SR1}_k y_k \rVert_{x_{k+1}}$ for some $r>0$. For more details, see Section 6.2 in Nocedal, Wright, Springer, 2006.
Constructor
InverseSR1(r::Float64=-1.0)
Generate the InverseSR1 update, which by default does not include the check, since the default sets $r<0$.
grad_f – a gradient function that either returns a vector of the gradients or is a vector of gradients
p – an initial value $p ∈ \mathcal M$
Alternatively to the gradient you can provide a ManifoldStochasticGradientObjective msgo; then using the cost= keyword does not have any effect, since the cost is already within the objective.
Optional
cost – (missing) you can provide a cost function, for example to track the function value
evaluation – (AllocatingEvaluation) specify whether the gradient(s) work by allocation (default) in the form gradF(M, x) or InplaceEvaluation in place, i.e. of the form gradF!(M, X, x) (elementwise).
evaluation_order – (:Random) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Linear) or the default :Random one.
perform a stochastic gradient descent in place of p.
Input
M a manifold $\mathcal M$
grad_f – a gradient function, that either returns a vector of the subgradients or is a vector of gradients
p – an initial value $p ∈ \mathcal M$
Alternatively to the gradient you can provide a ManifoldStochasticGradientObjective msgo; then using the cost= keyword does not have any effect, since the cost is already within the objective.
evaluation_order – (:Random) – whether to use a randomly permuted sequence (:FixedRandom), a per cycle permuted sequence (:Linear) or the default :Random one.
order the current permutation
retraction_method – (default_retraction_method(M, typeof(p))) a retraction(M, p, X) to use.
Constructor
StochasticGradientDescentState(M, p)
Create a StochasticGradientDescentState with start point p. All other fields are optional keyword arguments, and the defaults are taken from M.
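The three evaluation orders can be illustrated with a Euclidean stochastic gradient descent sketch (Python, illustrative only, hypothetical names; the symbols :Random, :FixedRandom, and :Linear follow the text above):

```python
import numpy as np

def stochastic_gradient_descent(grads, x, stepsize=0.1, cycles=100,
                                order=":Random", seed=0):
    """Cycle through the component gradients in the requested order.

    :Random      - resample a new permutation for every cycle (the default)
    :FixedRandom - draw one random permutation and keep it fixed
    :Linear      - process the components in their given (cyclic) order
    """
    rng = np.random.default_rng(seed)
    n = len(grads)
    perm = rng.permutation(n) if order == ":FixedRandom" else np.arange(n)
    for _ in range(cycles):
        if order == ":Random":
            perm = rng.permutation(n)
        for i in perm:
            x = x - stepsize * grads[i](x)   # Euclidean "retraction" step
    return x

# usage: the mean of the points q_i minimizes f(x) = 1/2 sum_i ||x - q_i||^2
qs = [np.array([0.0, 0.0]), np.array([2.0, 0.0]), np.array([1.0, 3.0])]
grads = [lambda x, q=q: x - q for q in qs]
x_star = stochastic_gradient_descent(grads, np.zeros(2))
```

With a constant stepsize the iterates settle in a neighborhood of the minimizer whose size scales with the stepsize.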
perform a subgradient method $p_{k+1} = \mathrm{retr}(p_k, s_k∂f(p_k))$,
where $\mathrm{retr}$ is a retraction and $s_k$ is a step size, usually the ConstantStepsize, but it can also be specified. Though the subgradient might be set-valued, the argument ∂f should always return one element from the subgradient, though not necessarily deterministically.
Input
M – a manifold $\mathcal M$
f – a cost function $f:\mathcal M→ℝ$ to minimize
∂f – the (sub)gradient $\partial f: \mathcal M → T\mathcal M$ of f, restricted to always returning a single value/element from the subdifferential. This function can be passed as an allocating function (M, p) -> X or a mutating function (M, X, p) -> X, see evaluation.
evaluation – (AllocatingEvaluation) specify whether the subgradient works by allocation (default) in the form ∂f(M, y) or InplaceEvaluation in place, i.e. of the form ∂f!(M, X, x).
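A Euclidean sketch of the subgradient method (Python, illustrative only, hypothetical names; a minus sign is used for the descent step, and the best iterate is tracked since the objective need not decrease monotonically):

```python
import numpy as np

def subgradient_method(f, subgrad, p, steps=500):
    """Subgradient method with diminishing step sizes s_k = 1/k.

    subgrad returns one element of the subdifferential; since a subgradient
    is not a descent direction in general, the best iterate seen is returned.
    """
    best_p, best_f = p, f(p)
    for k in range(1, steps + 1):
        p = p - (1.0 / k) * subgrad(p)   # Euclidean retraction: p + X
        if f(p) < best_f:
            best_p, best_f = p, f(p)
    return best_p

# usage: minimize the nonsmooth f(p) = ||p||_1; sign(p) is one subgradient
f = lambda p: np.sum(np.abs(p))
subgrad = lambda p: np.sign(p)
p_star = subgradient_method(f, subgrad, np.array([1.5, -2.0]))
```

The diminishing steps $s_k = 1/k$ sum to infinity while shrinking to zero, the classical condition for convergence of subgradient schemes.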
perform a subgradient method $p_{k+1} = \mathrm{retr}(p_k, s_k∂f(p_k))$,
Input
M – a manifold $\mathcal M$
f – a cost function $f:\mathcal M→ℝ$ to minimize
∂f – the (sub)gradient $\partial f: \mathcal M → T\mathcal M$ of f, restricted to always returning a single value/element from the subdifferential. This function can be passed as an allocating function (M, p) -> X or a mutating function (M, X, p) -> X, see evaluation.
on a manifold by using the Steihaug-Toint truncated conjugate-gradient method, abbreviated tCG-method. All terms involving the trust-region radius use an inner product w.r.t. the preconditioner; this is because the iterates grow in length w.r.t. the preconditioner, guaranteeing that we do not re-enter the trust-region.
Initialize $η_0 = η$ if using the randomized approach, and $η_0$ as the zero tangent vector otherwise; set $r_0 = \operatorname{grad}F(x)$, $z_0 = \operatorname{P}(r_0)$, $δ_0 = z_0$, and $k=0$.
The $\operatorname{P}(⋅)$ denotes the symmetric, positive definite preconditioner. It is required if a randomized approach is used i.e. using a random tangent vector $η_0$ as the initial vector. The idea behind it is to avoid saddle points. Preconditioning is simply a rescaling of the variables and thus a redefinition of the shape of the trust region. Ideally $\operatorname{P}(⋅)$ is a cheap, positive approximation of the inverse of the Hessian of $F$ at $x$. On default, the preconditioner is just the identity.
To step number 2: obtain $τ$ as the positive root of $\left\lVert η_k + τ δ_k \right\rVert_{\operatorname{P}, x} = Δ$, i.e. of the quadratic equation obtained by squaring both sides.
It can occur that $⟨δ_k, \operatorname{Hess}[F] (δ_k)_{x}⟩_{x} = κ ≤ 0$ at iteration $k$. In this case, the model is not strictly convex, and the stepsize $α =\frac{⟨r_k, z_k⟩_{x}} {κ}$ computed in step 1 does not give a reduction in the model function $m_x(⋅)$. Indeed, $m_x(⋅)$ is unbounded from below along the line $η_k + α δ_k$. If our aim is to minimize the model within the trust-region, it makes far more sense to reduce $m_x(⋅)$ along $η_k + α δ_k$ as much as we can while staying within the trust-region, and this means moving to the trust-region boundary along this line. Thus, when $κ ≤ 0$ at iteration $k$, we replace $α = \frac{⟨r_k, z_k⟩_{x}}{κ}$ with the $τ$ described above. The other possibility is that $η_{k+1}$ would lie outside the trust-region at iteration $k$, i.e. $⟨η_{k+1}, η_{k+1}⟩_{x} ≥ {Δ}^2$, where the inner product can be identified with the squared norm of $η_{k+1}$. In particular, when $\operatorname{Hess}[F] (⋅)_{x}$ is positive definite and $η_{k+1}$ lies outside the trust region, the solution to the trust-region problem must lie on the trust-region boundary. Thus, there is no reason to continue with the conjugate gradient iteration, as it stands, as subsequent iterates will move further outside the trust-region boundary. A sensible strategy, just as in the case considered above, is to move to the trust-region boundary by finding $τ$.
Although it is virtually impossible in practice to know how many iterations are necessary to provide a good estimate $η_{k}$ of the trust-region subproblem, the method stops after a certain number of iterations, which is realized by StopAfterIteration. In order to increase the convergence rate of the underlying trust-region method, see trust_regions, a typical stopping criterion is to stop as soon as an iteration $k$ is reached for which
holds, where $0 < κ < 1$ and $θ > 0$ are chosen in advance. This is realized in this method by StopWhenResidualIsReducedByFactorOrPower. It can be shown that under appropriate conditions the iterates $x_k$ of the underlying trust-region method converge to nondegenerate critical points with an order of convergence of at least $\min \left( θ + 1, 2 \right)$, see Absil, Mahony, Sepulchre, Princeton University Press, 2008. The method also aborts if the curvature of the model is negative, i.e. if $\langle \delta_k, \mathcal{H}[δ_k] \rangle_x \leq 0$, which is realized by StopWhenCurvatureIsNegative. If the next possible approximate solution $η_{k}^{*}$ calculated in iteration $k$ lies outside the trust region, i.e. if $\lVert η_{k}^{*} \rVert_x \geq Δ$, then the method aborts, which is realized by StopWhenTrustRegionIsExceeded. Furthermore, the method aborts if the new model value evaluated at $η_{k}^{*}$ is greater than the previous model value evaluated at $η_{k}$, which is realized by StopWhenModelIncreased.
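The iteration described above can be sketched in the Euclidean case, with the identity as preconditioner and a flat metric; this is an illustrative sketch of the Steihaug-Toint idea, not the Manopt.jl implementation. It minimizes the model $m(η) = ⟨g, η⟩ + \tfrac12 ⟨η, Hη⟩$ subject to $\lVert η \rVert ≤ Δ$:

```julia
using LinearAlgebra

# Euclidean truncated conjugate gradients: on negative curvature or when the
# next iterate would leave the trust region, move to the boundary; otherwise
# stop once the residual is reduced by the factor κ or the power 1+θ.
function truncated_cg(H, g, Δ; maxiter=100, κ=0.1, θ=1.0)
    η = zero(g)
    r = copy(g)                 # residual at η = 0 is the gradient g
    δ = -copy(r)                # first search direction
    r0 = norm(r)
    rr = dot(r, r)
    for k in 1:maxiter
        δHδ = dot(δ, H * δ)     # curvature of the model along δ
        α = rr / δHδ
        if δHδ <= 0 || norm(η + α * δ) >= Δ
            # positive root τ of the quadratic ‖η + τδ‖² = Δ²
            a, b, c = dot(δ, δ), 2 * dot(η, δ), dot(η, η) - Δ^2
            τ = (-b + sqrt(b^2 - 4 * a * c)) / (2 * a)
            return η + τ * δ
        end
        η += α * δ
        r += α * (H * δ)
        rr_new = dot(r, r)
        sqrt(rr_new) <= r0 * min(κ, r0^θ) && return η
        δ = -r + (rr_new / rr) * δ
        rr = rr_new
    end
    return η
end
```

For a small enough radius the returned vector lies exactly on the trust-region boundary; for a large radius it reproduces the unconstrained Newton step.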
on a manifold M by using the Steihaug-Toint truncated conjugate-gradient method, abbreviated tCG-method. For a description of the algorithm and theorems offering convergence guarantees, see the reference:
preconditioner – a preconditioner for the Hessian H
θ – (1.0) 1+θ is the superlinear convergence target rate. The method aborts if the residual is less than or equal to the initial residual to the power of 1+θ.
κ – (0.1) the linear convergence target rate. The method aborts if the residual is less than or equal to κ times the initial residual.
randomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
trust_region_radius – (injectivity_radius(M)/4) a trust-region radius
project! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.
stopping_criterion – (StopAfterIteration | StopWhenResidualIsReducedByFactorOrPower | StopWhenCurvatureIsNegative | StopWhenTrustRegionIsExceeded) a functor inheriting from StoppingCriterion indicating when to stop, where for the default, the maximal number of iterations is set to the dimension of the manifold, the power factor is θ, and the reduction factor is κ.
and the ones that are passed to decorate_state! for decorators.
Output
the obtained (approximate) minimizer $\eta^*$, see get_solver_return for details
trust_region_radius : (injectivity_radius(M)/4) the trust-region radius
residual : the gradient
randomize : indicates if the trust-region solve and so the algorithm is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
project! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.
A functor for testing if the norm of the residual at the current iterate is reduced either by a power of 1+θ or by a factor κ compared to the norm of the initial residual, i.e. $\Vert r_k \Vert_x \leq \Vert r_0 \Vert_{x} \min \left( \kappa, \Vert r_0 \Vert_{x}^{\theta} \right)$.
Fields
κ – the reduction factor
θ – part of the reduction power
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
initialize the StopWhenResidualIsReducedByFactorOrPower functor to indicate to stop after the norm of the current residual is less than either the norm of the initial residual to the power of 1+θ or the norm of the initial residual times κ.
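The check behind this criterion can be written in one line; the following is an illustrative stand-alone version, not the actual Manopt.jl type:

```julia
# Stop once ‖r_k‖ ≤ ‖r_0‖ · min(κ, ‖r_0‖^θ): the min switches between linear
# reduction by the factor κ and superlinear reduction by the power 1+θ.
residual_reduced(rk_norm, r0_norm; κ=0.1, θ=1.0) =
    rk_norm <= r0_norm * min(κ, r0_norm^θ)
```

For small initial residuals the power term $\Vert r_0 \Vert^θ$ is the smaller one and enforces the superlinear rate; otherwise the factor κ applies.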
A functor for testing if the norm of the next iterate in the Steihaug-Toint tCG method is larger than the trust-region radius, i.e. $\Vert η_{k}^{*} \Vert_x ≥ Δ$, where $Δ$ denotes the trust-region radius, and to terminate the algorithm when the trust region has been left.
Fields
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
Constructor
StopWhenTrustRegionIsExceeded()
initialize the StopWhenTrustRegionIsExceeded functor to indicate to stop after the norm of the next iterate is greater than the trust-region radius.
A functor for testing if the curvature of the model is negative, i.e. $\langle \delta_k, \operatorname{Hess}[F](\delta_k)\rangle_x \leq 0$. In this case, the model is not strictly convex, and the stepsize as computed does not give a reduction of the model.
Fields
reason – stores a reason of stopping if the stopping criterion has been reached, see get_reason.
P.-A. Absil, R. Mahony and R. Sepulchre. Optimization Algorithms on Matrix Manifolds. Princeton University Press (2008). [open access](http://press.princeton.edu/chapters/absil/).
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
The aim is to solve an optimization problem on a manifold
\[\operatorname*{min}_{x ∈ \mathcal{M}} F(x)\]
by using the Riemannian trust-regions solver. It is the method of choice for smooth optimization. This trust-region method uses the Steihaug-Toint truncated conjugate-gradient method truncated_conjugate_gradient_descent to solve the inner minimization problem, called the trust-regions subproblem. This inner solver can be preconditioned by providing a preconditioner (symmetric and positive definite, an approximation of the inverse of the Hessian of $F$). If no Hessian of the cost function $F$ is provided, a standard approximation of the Hessian based on the gradient $\operatorname{grad}F$ with ApproxHessianFiniteDifference will be computed.
Initialize $x_0 = x$ with an initial point $x$ on the manifold. It can be given by the caller or set randomly. Set the initial trust-region radius $\Delta =\frac{1}{8} \bar{\Delta}$, where $\bar{\Delta}$ is the maximum radius the trust-region can have. Usually one uses the square root of the manifold's dimension, $\sqrt{\operatorname{dim}(\mathcal{M})}$. For accepting the next iterate and evaluating the new trust-region radius, one needs an accept/reject threshold $\rho' ∈ [0,\frac{1}{4})$, which is $\rho' = 0.1$ by default. Set $k=0$.
Set $η$ as a random tangent vector if using the randomized approach. Otherwise set $η$ as the zero vector in the tangent space $T_{x_k}\mathcal{M}$.
Set $η^*$ as the solution of the trust-region subproblem, computed by the tcg-method with $η$ as initial vector.
If using the randomized approach, compare $η^*$ with the Cauchy point $η_{c}^* = -\tau_{c} \frac{\Delta}{\lVert \operatorname{grad}F (x_k) \rVert_{x_k}} \operatorname{grad}F (x_k)$ using the model function $m_{x_k}(⋅)$. If the model decrease is larger when using the Cauchy point, set $η^* = η_{c}^*$.
Set ${x}^* = \operatorname{retr}_{x_k}(η^*)$.
Set $\rho = \frac{F(x_k)-F({x}^*)}{m_{x_k}(η)-m_{x_k}(η^*)}$, where $m_{x_k}(⋅)$ describes the quadratic model function.
Update the trust-region radius: $\Delta = \begin{cases}\frac{1}{4} \Delta &\text{ if } \rho < \frac{1}{4} \text{ or } m_{x_k}(η)-m_{x_k}(η^*) \leq 0 \text{ or } \rho = \pm\infty, \\ \operatorname{min}(2 \Delta, \bar{\Delta}) &\text{ if } \rho > \frac{3}{4} \text{ and the tcg-method stopped because of negative curvature or exceeding the trust-region},\\ \Delta & \text{ otherwise.}\end{cases}$
If $m_{x_k}(η)-m_{x_k}(η^*) \geq 0$ and $\rho > \rho'$ set $x_k = {x}^*$.
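The acceptance test and radius update of the last two steps can be sketched as follows; this is an illustrative helper with the thresholds exposed as parameters, not the Manopt.jl implementation:

```julia
# ρ is the performance ratio, model_decrease is m(η) - m(η*), boundary_hit
# records whether the inner tCG solver stopped because of negative curvature
# or because it exceeded the trust region.
function update_radius(Δ, Δ_max, ρ, model_decrease; boundary_hit=false,
                       reduction_threshold=0.25, augmentation_threshold=0.75)
    if ρ < reduction_threshold || model_decrease <= 0 || !isfinite(ρ)
        return Δ / 4                    # poor model fit: shrink the region
    elseif ρ > augmentation_threshold && boundary_hit
        return min(2 * Δ, Δ_max)        # very good fit at the boundary: grow
    end
    return Δ                            # otherwise keep the radius
end
```

The `!isfinite(ρ)` branch covers the $\rho = \pm\infty$ case in the update formula above.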
To the initialization: if no initial point is provided, a random point on the manifold is used.
To step number 1: using a randomized approach means using a random tangent vector as the initial vector for the approximate solve of the trust-regions subproblem. If this is the case, keep in mind that the vector must lie within the trust-region radius. This is achieved by multiplying η by sqrt(4,eps(Float64)) as long as its norm is greater than the current trust-region radius $\Delta$. If the randomized approach is not used, one starts from the zero tangent vector.
To step number 2: obtain $η^*$ by (approximately) solving the trust-regions subproblem
\[\operatorname*{arg\,min}_{η ∈ T_{x_k}\mathcal{M}} m_{x_k}(η) = F(x_k) +
\langle \operatorname{grad}F(x_k), η \rangle_{x_k} + \frac{1}{2} \langle
\operatorname{Hess}[F](η)_{x_k}, η \rangle_{x_k}\]
\[\text{s.t.} \; \langle η, η \rangle_{x_k} \leq {\Delta}^2\]
with the Steihaug-Toint truncated conjugate-gradient (tcg) method. The problem as well as the solution method is described in the truncated_conjugate_gradient_descent. In this inner solver, the stopping criterion StopWhenResidualIsReducedByFactorOrPower is used so that superlinear, or at least linear, convergence in the trust-region method can be achieved.
To step number 3: if using a random tangent vector as an initial vector, compare the result of the tcg-method with the Cauchy point. Convergence proofs assume that one achieves at least (a fraction of) the reduction of the Cauchy point. The idea is to go in the direction of the gradient to an optimal point. This can be on the edge, but also before. The parameter $\tau_{c}$ for the optimal length is defined by
If $m_{x_k}(η_{c}^*) < m_{x_k}(η^*)$ then $m_{x_k}(η_{c}^*)$ is the better choice.
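A Euclidean sketch of this Cauchy-point computation, using the classical closed form for the optimal length $\tau_{c}$ (cf. Absil, Mahony, Sepulchre 2008), is shown below; it is illustrative only and not the Manopt.jl implementation:

```julia
using LinearAlgebra

# Steepest-descent step to the model minimizer along -g, clipped to the
# trust-region boundary: τ_c = 1 on negative curvature, otherwise
# min(‖g‖³ / (Δ ⟨g, Hg⟩), 1).
function cauchy_point(g, H, Δ)
    gHg = dot(g, H * g)
    τc = gHg <= 0 ? 1.0 : min(norm(g)^3 / (Δ * gHg), 1.0)
    return -τc * (Δ / norm(g)) * g
end
```

When the curvature along the gradient is negative or small, the Cauchy point lies on the boundary of the trust region; otherwise it may lie strictly inside.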
To step number 4: $\operatorname{retr}_{x_k}(⋅)$ denotes the retraction, a mapping $\operatorname{retr}_{x_k}:T_{x_k}\mathcal{M} \rightarrow \mathcal{M}$ which approximates the exponential map. In some cases it is cheaper to use this instead of the exponential.
max_trust_region_radius – the maximum trust-region radius
preconditioner – a preconditioner (a symmetric, positive definite operator that should approximate the inverse of the Hessian)
randomize – set to true if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
project! : (copyto!) specify a projection operation for tangent vectors within the TCG for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.
retraction – (default_retraction_method(M, typeof(p))) approximation of the exponential map
trust_region_radius - the initial trust-region radius
ρ_prime – Accept/reject threshold: if ρ (the performance ratio for the iterate) is at least ρ', the outer iteration is accepted, otherwise it is rejected. In case it is rejected, the trust-region radius will have been decreased. To ensure this, 0 ≤ ρ' < 1/4 must hold. If ρ_prime is negative, the algorithm is not guaranteed to produce monotonically decreasing cost values. It is strongly recommended to set ρ' > 0, to aid convergence.
ρ_regularization – Close to convergence, evaluating the performance ratio ρ is numerically challenging. Meanwhile, close to convergence, the quadratic model should be a good fit and the steps should be accepted. Regularization lets ρ go to 1 as the model decrease and the actual decrease go to zero. Set this option to zero to disable regularization (not recommended). When this is not zero, it may happen that the iterates produced are not monotonically improving the cost when very close to convergence. This is because the corrected cost improvement could change sign if it is negative but very small.
θ – (1.0) 1+θ is the superlinear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, which computes an approximate solution for the trust-region subproblem. The tCG-method aborts if the residual is less than or equal to the initial residual to the power of 1+θ.
κ – (0.1) the linear convergence target rate of the tCG-method truncated_conjugate_gradient_descent, which computes an approximate solution for the trust-region subproblem. The method aborts if the residual is less than or equal to κ times the initial residual.
reduction_threshold – (0.1) Trust-region reduction threshold: if ρ (the performance ratio for the iterate) is less than this bound, the trust-region radius and thus the trust-regions decreases.
augmentation_threshold – (0.75) Trust-region augmentation threshold: if ρ (the performance ratio for the iterate) is greater than this and further conditions apply, the trust-region radius and thus the trust-regions increases.
Output
the obtained (approximate) minimizer $p^*$, see get_solver_return for details
max_trust_region_radius : (sqrt(manifold_dimension(M))) the maximum trust-region radius
project! : (copyto!) specify a projection operation for tangent vectors for numerical stability. A function (M, Y, p, X) -> ... working in place of Y. By default, no projection is performed; set it to project! to activate projection.
randomize : (false) indicates if the trust-region solve is to be initiated with a random tangent vector. If set to true, no preconditioner will be used. This option is set to true in some scenarios to escape saddle points, but is otherwise seldom activated.
ρ_prime : (0.1) a lower bound of the performance ratio for the iterate that decides if the iteration will be accepted or not. If not, the trust-region radius will have been decreased. To ensure this, 0 ≤ ρ' < 1/4 must hold. If ρ' is negative, the algorithm is not guaranteed to produce monotonically decreasing cost values. It is strongly recommended to set ρ' > 0, to aid convergence.
ρ_regularization : (10000.0) Close to convergence, evaluating the performance ratio ρ is numerically challenging. Meanwhile, close to convergence, the quadratic model should be a good fit and the steps should be accepted. Regularization lets ρ go to 1 as the model decrease and the actual decrease go to zero. Set this option to zero to disable regularization (not recommended). When this is not zero, it may happen that the iterates produced are not monotonically improving the cost when very close to convergence. This is because the corrected cost improvement could change sign if it is negative but very small.
trust_region_radius : the (initial) trust-region radius
A functor to approximate the Hessian by a finite difference of gradient evaluation.
Given a point $p$, a direction $X$, and the gradient $\operatorname{grad}F: \mathcal M \to T\mathcal M$ of a function $F$, the Hessian is approximated as follows: let $c$ be a stepsize, $X ∈ T_p\mathcal M$ a tangent vector, and $q = \operatorname{retr}_p(\frac{c}{\lVert X \rVert_p}X)$ a step of length $c$ in direction $X$ following a retraction. Then we approximate the Hessian by the finite difference of the gradients, where $\mathcal T_{\cdot\gets\cdot}$ is a vector transport.
A functor to approximate the Hessian by a finite difference of gradient evaluation.
Given a point p and a direction X and the gradient $\operatorname{grad}F: \mathcal M \to T\mathcal M$ of a function $F$, the Hessian is approximated as follows: let $c$ be a stepsize, $X∈ T_p\mathcal M$ a tangent vector, and $q = \operatorname{retr}_p(\frac{c}{\lVert X \rVert_p}X)$ a step in direction $X$ of length $c$ following a retraction. Then we approximate the Hessian by the finite difference of the gradients, where $\mathcal T_{\cdot\gets\cdot}$ is a vector transport.
\[\operatorname{Hess}F(p)[X]
≈
\frac{\lVert X \rVert_p}{c}\Bigl( \mathcal T_{p\gets q}\bigl(\operatorname{grad}F(q)\bigr) - \operatorname{grad}F(p)\Bigr)\]
Fields
gradient!! the gradient function (either allocating or mutating, see evaluation parameter)
step_length a step length for the finite difference
retraction_method - a retraction to use
vector_transport_method a vector transport to use
Internal temporary fields
grad_tmp a temporary storage for the gradient at the current p
grad_dir_tmp a temporary storage for the gradient at the current p_dir
p_dir::P a temporary storage to the forward direction (i.e. $q$ above)
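As a sketch of how such an approximation can be used, consider a small Rayleigh-quotient-type cost on the sphere; the matrix, cost, and starting point below are illustrative and not from the documentation, and keyword or positional argument names may differ between Manopt.jl versions:

```julia
# Sketch: approximate the Hessian by finite gradient differences and pass it
# to trust_regions; all concrete values here are illustrative assumptions.
using Manopt, Manifolds, LinearAlgebra

M = Sphere(2)
A = [2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0]
f(M, p) = p' * A * p
grad_f(M, p) = project(M, p, 2 .* (A * p))  # Riemannian gradient via projection

p0 = [1.0, 0.0, 0.0]
approx_hess = ApproxHessianFiniteDifference(M, copy(p0), grad_f)
q = trust_regions(M, f, grad_f, approx_hess, p0)
```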
Since Manifolds.jl 0.7, the support of automatic differentiation has been extended.
This tutorial explains how to use Euclidean tools to derive a gradient for a real-valued function $f\colon \mathcal M → ℝ$. We will consider two methods: an intrinsic variant and a variant employing the embedding. These gradients can then be used within any gradient based optimization algorithm in Manopt.jl.
In this tutorial we will take a look at a few possibilities to approximate or derive the gradient of a function $f:\mathcal M \to ℝ$ on a Riemannian manifold, without computing it yourself. There are mainly two different philosophies:
Working intrinsically, i.e. staying on the manifold and in the tangent spaces. Here, we will consider approximating the gradient by forward differences.
Working in an embedding – there we can use all tools from functions on Euclidean spaces – finite differences or automatic differentiation – and then compute the corresponding Riemannian gradient from there.
We first load all necessary packages
using Manopt, Manifolds, Random, LinearAlgebra
using FiniteDifferences, ManifoldDiff
Random.seed!(42);
A first idea is to generalize (multivariate) finite differences to Riemannian manifolds. Let $X_1,\ldots,X_d ∈ T_p\mathcal M$ denote an orthonormal basis of the tangent space $T_p\mathcal M$ at the point $p∈\mathcal M$ on the Riemannian manifold.
We can generalize the notion of a directional derivative, i.e. for the “direction” $Y∈T_p\mathcal M$. Let $c\colon [-ε,ε] → \mathcal M$, $ε>0$, be a curve with $c(0) = p$, $\dot c(0) = Y$, e.g. $c(t)= \exp_p(tY)$. We obtain
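This directional derivative can be approximated by a forward difference along the curve; the following is a minimal sketch, assuming an illustrative stepsize `h` and the `exp` map from Manifolds.jl:

```julia
# Sketch: forward difference of f along the curve c(t) = exp_p(tY),
# approximating the directional derivative Df(p)[Y]; h is an assumed stepsize.
using Manifolds

function directional_fd(M, f, p, Y; h=1e-6)
    q = exp(M, p, h * Y)          # a small step along the curve c
    return (f(M, q) - f(M, p)) / h
end
```

Evaluating this for each basis vector $X_1,\ldots,X_d$ yields the coefficients of a gradient approximation in that basis.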
The Rayleigh quotient is concerned with finding eigenvalues (and eigenvectors) of a symmetric matrix $A\in ℝ^{(n+1)×(n+1)}$. The optimization problem reads
Minimizing this function yields the smallest eigenvalue $\lambda_1$ as a value, and the minimizer $\mathbf x^*$ is a corresponding eigenvector.
Since the length of an eigenvector is irrelevant, there is an ambiguity in the cost function. It can be better phrased on the sphere $𝕊^n$ of unit vectors in $\mathbb R^{n+1}$, i.e.
\[\operatorname*{arg\,min}_{p \in 𝕊^n}\ f(p) = \operatorname*{arg\,min}_{p \in 𝕊^n} p^\mathrm{T}Ap\]
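As a concrete sketch of this cost on the sphere (the matrix `A` below is an illustrative example, not from the tutorial):

```julia
# Sketch: the Rayleigh quotient cost restricted to the unit sphere.
using Manifolds, LinearAlgebra

n = 2
A = Symmetric([2.0 1.0 0.0; 1.0 3.0 1.0; 0.0 1.0 4.0])
M = Sphere(n)
f(M, p) = p' * A * p
# a minimizer of f on M is a unit eigenvector to the smallest eigenvalue of A
```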
This tutorial illustrates how to use tools from Euclidean spaces, finite differences or automatic differentiation, to compute gradients on Riemannian manifolds. The scheme allows to use any differentiation framework within the embedding to derive a Riemannian gradient.
Settings
This document was generated with Documenter.jl version 0.27.25 on Tuesday 17 October 2023. Using Julia version 1.9.3.
In this tutorial, we want to investigate the caching and counting (i.e. statistics) features of Manopt.jl. We will reuse the optimization tasks from the introductory tutorial Get Started: Optimize!.
There are surely many ways to keep track of how often the cost function is called, for example with a functor, as we did in an example in How to Record Data
mutable struct MyCost{I<:Integer}
count::I
end
MyCost() = MyCost{Int64}(0)
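Completing this counting idea, a functor call method for the struct above could look like the following sketch; the inner cost is only a placeholder for the tutorial's actual cost:

```julia
# Sketch: each evaluation increments the counter; the returned value is a
# placeholder cost, not the cost from the tutorial.
function (c::MyCost)(M, p)
    c.count += 1
    return sum(abs2, p)
end
```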
* :Cost : 1449
To access the solver result, call `get_solver_result` on this variable.
and for safety let’s check that we are reasonably close
p4 = get_solver_result(s4)
g(N, p4) - f_star
1.6049384043981263e-11
For this example, or maybe even gradient_descent in general it seems, this additional (second, inner) cache does not improve the result further, it is about the same effort both time and allocation-wise.
While the second approach of ManifoldCostGradientObjective is very easy to implement, both the storage and the (local) cache approach are more efficient. All three are an improvement over the first implementation without sharing interim results. The results with storage or cache have the further advantage of being more flexible: the stored information could also be reused in a third function, for example when also computing the Hessian.
Specifying a cost function $f\colon \mathcal M \to \mathbb R$ on a manifold is usually the model one starts with. Specifying its gradient $\operatorname{grad} f\colon\mathcal M \to T\mathcal M$, or more precisely $\operatorname{grad}f(p) \in T_p\mathcal M$, and eventually a Hessian $\operatorname{Hess} f\colon T_p\mathcal M \to T_p\mathcal M$ are then necessary to perform optimization. Since these might be challenging to compute, especially when manifolds and differential geometry are not the main area of expertise of a user, easier-to-use methods might be welcome.
This tutorial discusses how to specify $f$ in the embedding as $\tilde f$, maybe only locally around the manifold, and use the Euclidean gradient $∇ \tilde f$ and Hessian $∇^2 \tilde f$ within Manopt.jl.
Here we use Examples 9.40 and 9.49 of [Bou23] and compare the different ways one can call the solver, depending on which gradient and/or Hessian one provides.
using Manifolds, Manopt, ManifoldDiff
using LinearAlgebra, Random, Colors, Plots
Random.seed!(123)
We consider the cost function on the Grassmann manifold given by
This tutorial aims to illustrate how to perform debug output. For that we consider an example that includes a subsolver, to also consider their debug capabilities.
The problem itself is hence not the main focus.
We consider a nonnegative PCA, which we can write as a constrained problem on the Sphere
Let’s first load the necessary packages.
using Manopt, Manifolds, Random, LinearAlgebra
Random.seed!(42);
d = 4
M = Sphere(d - 1)
v0 = project(M, [ones(2)..., zeros(d - 2)...])
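The debug= keyword accepts symbols and format strings as shortcuts. The following sketch shows the general pattern on a gradient_descent call; the cost f and gradient grad_f are assumed here, since the tutorial's constrained cost is not shown above:

```julia
# Sketch: :Iteration, :Cost, :Stop and a print interval are standard
# Manopt.jl debug shortcuts; f and grad_f are assumed placeholders.
q = gradient_descent(M, f, grad_f, v0;
    debug=[:Iteration, " | ", :Cost, "\n", 10, :Stop],  # print every 10th iteration
)
```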
The algorithm reached approximately critical point after 1 iterations; the gradient norm (5.4875346930698466e-8) is less than 0.001.
# 100 f(x): -0.500000 | ϵ: 0.00000100
The value of the variable (ϵ) is smaller than or equal to its threshold (1.0e-6).
The algorithm performed a step with a change (6.534762378319523e-9) less than 1.0e-6.
where we now see that the subsolver always only requires one step. Note that since the debug of an iteration happens after a step, we see the subsolver run before the debug output for an iteration number.
The recording and debugging features make it possible to record nearly any data during the iterations. This tutorial illustrates how to:
record one value during the iterations;
record multiple values during the iterations and access them afterwards;
define your own RecordAction to perform individual recordings.
Several predefined recordings exist, for example RecordCost or RecordGradient, if the problem the solver uses provides a gradient. For fields of the State, the recording can also be done using RecordEntry. For other recordings, for example more advanced computations before storing a value, your own RecordAction can be defined.
We illustrate these using the gradient descent from the Get Started: Optimize! tutorial.
Here we focus on ways to investigate the behaviour during iterations by using Recording techniques.
Let’s first load the necessary packages.
using Manopt, Manifolds, Random
For the high level interfaces of the solvers, like gradient_descent we have to set return_state to true to obtain the whole solver state and not only the resulting minimizer.
Then we can easily use the record= option to add recorded values. This keyword accepts RecordActions as well as several symbols as shortcuts, for example :Cost to record the cost, or if your options have a field f, :f would record that entry. An overview of the symbols that can be used is given here.
We first just record the cost after every iteration
R = gradient_descent(M, f, grad_f, data[1]; record=:Cost, return_state=true)
# Solver state for `Manopt.jl`s Gradient Descent
After 63 iterations
## Parameters
* retraction method: ExponentialRetraction()
## Stepsize
ArmijoLinesearch() with keyword parameters
* initial_stepsize = 1.0
* retraction_method = ExponentialRetraction()
* contraction_factor = 0.95
## Stopping Criterion
Stop When _one_ of the following are fulfilled:
    Max Iteration 200: not reached
    |grad f| < 1.0e-9: reached
Overall: reached
This indicates convergence: Yes
## Record
(Iteration = RecordCost(),)
For such a state, one can attach different recorders to some operations, currently to :Start, :Stop, and :Iteration, where :Iteration is the default when using the record= keyword with a RecordAction as above. We can access all values recorded during the iterations by calling get_record(R, :Iteration) or, since this is the default, even shorter
To record more than one value, you can pass an array of a mix of symbols and RecordActions which formally introduces RecordGroup. Such a group records a tuple of values in every iteration:
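A sketch of such a call follows; the names M, f, grad_f, and data are assumed from the tutorial's setup above:

```julia
# Sketch: passing an array to record= introduces a RecordGroup, recording
# (iteration, cost) tuples per iteration.
R2 = gradient_descent(M, f, grad_f, data[1];
    record=[:Iteration, :Cost],
    return_state=true,
)
```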
# Solver state for `Manopt.jl`s Gradient Descent
After 63 iterations
## Parameters
* retraction method: ExponentialRetraction()
## Stepsize
ArmijoLinesearch() with keyword parameters
* initial_stepsize = 1.0
* retraction_method = ExponentialRetraction()
* contraction_factor = 0.95
## Stopping Criterion
Stop When _one_ of the following are fulfilled:
    Max Iteration 200: not reached
    |grad f| < 1.0e-9: reached
Overall: reached
This indicates convergence: Yes
## Record
(Iteration = RecordGroup([RecordIteration(), RecordCost()]),)
Here, the symbol :Cost is mapped to using the RecordCost action. The same holds for :Iteration, which obviously records the current iteration number i. To access these you can first extract the group of records (that is where the :Iterations are recorded – note the plural) and then access the :Cost
get_record_action(R2, :Iteration)
RecordGroup([RecordIteration(), RecordCost()])
Since iteration is the default, we can also omit it here again. To access single recorded values, one can use
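For instance, the following sketch queries one entry of the group (R2 as above):

```julia
# Sketch: accessing a single entry of the recorded group.
get_record(R2, :Iteration, :Cost)  # only the recorded cost values
```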
Note that the first symbol again refers to the point where we record (not to the thing we record). We can also pass a tuple as second argument to have our own order within the tuples returned. Switching the order of recorded cost and iteration can be done using
To illustrate a more complicated example, let’s record: the iteration number, cost and gradient field, but only every sixth iteration; and the iteration at which we stop.
We first generate the problem and the state, to also illustrate how the lower level works when not using the high-level interface gradient_descent.
Let’s investigate a case where we want to count the number of function evaluations, again just to illustrate, since for the gradient this is just one evaluation per iteration. We first define a cost that counts its own calls.
mutable struct MyCost{T}
data::T
count::Int
end
end
function (r::RecordCount)(p::AbstractManoptProblem, ::AbstractManoptSolverState, i)
if i > 0
        push!(r.recorded_values, Manopt.get_cost_function(get_objective(p)).count)
elseif i < 0 # reset if negative
r.recorded_values = Vector{Int}()
end
record=[:Count => RecordCount()],
return_state=true,
)
When you have used a few solvers from Manopt.jl for example like in the opening tutorial Get Started: Optimize! you might come to the idea of implementing a solver yourself.
After a short introduction of the algorithm we will implement, this tutorial first discusses the structural details, i.e. what a solver consists of and “works with”. Afterwards, we will show how to implement the algorithm. Finally, we will discuss how to make the algorithm both nice for the user as well as initialized in a way, that it can benefit from features already available in Manopt.jl.
Note
If you have implemented your own solver, we would be very happy to have that within Manopt.jl as well, so maybe consider opening a Pull Request
Since most serious algorithms should be implemented in Manopt.jl itself directly, we will implement a solver that randomly walks on the manifold and keeps track of the lowest point visited. As for algorithms in Manopt.jl we aim to implement this generically for any manifold that is implemented using ManifoldsBase.jl.
The Random Walk Minimization
Given:
a manifold $\mathcal M$
a starting point $p=p^{(0)}$
a cost function $f: \mathcal M \to\mathbb R$.
a parameter $\sigma > 0$.
a retraction $\operatorname{retr}_p(X)$ that maps $X\in T_p\mathcal M$ to the manifold.
We can run the following steps of the algorithm
set $k=0$
set our best point $q = p^{(0)}$
Repeat until a stopping criterion is fulfilled
Choose a random tangent vector $X^{(k)} \in T_{p^{(k)}}\mathcal M$ of length $\lVert X^{(k)} \rVert = \sigma$
“Walk” along this direction, i.e. $p^{(k+1)} = \operatorname{retr}_{p^{(k)}}(X^{(k)})$
If $f(p^{(k+1)}) < f(q)$, set $q = p^{(k+1)}$ as our new best visited point
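The steps above can be sketched generically as follows; this is an illustrative standalone sketch, not the Manopt.jl implementation developed in this tutorial, and it assumes the manifold supports rand with vector_at and a default retraction:

```julia
# Sketch of the random walk minimization; all names are illustrative.
using Manifolds, LinearAlgebra

function random_walk_minimize(M, f, p0; σ=0.1, max_iterations=100)
    p = copy(M, p0)
    q = copy(M, p0)                      # best point visited so far
    for k in 1:max_iterations
        X = rand(M; vector_at=p)         # random tangent direction
        X = (σ / norm(M, p, X)) * X      # rescale to length σ
        p = retract(M, p, X)             # walk along the direction
        f(M, p) < f(M, q) && (q = copy(M, p))
    end
    return q
end
```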
There are two main ingredients a solver needs: a problem to work on and the state of a solver, which “identifies” the solver and stores intermediate results.
A problem in Manopt.jl usually consists of a manifold (an AbstractManifold) and an AbstractManifoldObjective describing the function we have and its features. In our case the objective is (just) a ManifoldCostObjective that stores cost function f(M,p) = .... More generally, it might for example store a gradient function or the Hessian or any other information we have about our task.
This is something independent of the solver itself, since it only identifies the problem we want to solve independent of how we want to solve it – or in other words, this type contains all information that is static and independent of the specific solver at hand.
Everything that is needed by a solver during the iterations, all its parameters, and interim values that are needed beyond just one iteration, is stored in a subtype of the AbstractManoptSolverState. This identifies the solver uniquely.
We saw in this tutorial how to implement a simple cost-based algorithm, to illustrate how optimization algorithms are covered in Manopt.jl.
One feature we did not cover is that most algorithms allow for in-place and allocating variants of their functions, as soon as they work with more than just the cost, e.g. gradients, proximal maps or Hessians. This is usually a keyword argument of the objective and hence also part of the high-level interfaces.
When it comes to time critical operations, a main ingredient in Julia is given by mutating functions, i.e. those that compute in place without additional memory allocations. In the following, we illustrate how to do this with Manopt.jl.
Let’s start with the same function as in Get Started: Optimize! and compute the mean of some points, only that here we use the sphere $\mathbb S^{30}$ and $n=800$ points.
From the aforementioned example.
We first load all necessary packages.
using Manopt, Manifolds, Random, BenchmarkTools
Random.seed!(42);
And setup our data
m = 30
M = Sphere(m)
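A mutating gradient for the mean cost can be sketched as follows; the result is written into X to avoid allocations, and `pts` stands in for the data points (a name assumed here, not from the tutorial):

```julia
# Sketch: in-place gradient of f(p) = 1/(2n) Σ d(p, q_i)^2, i.e.
# grad f(p) = -1/n Σ log_p(q_i), written into X.
function grad_f!(M, X, p, pts)
    zero_vector!(M, X, p)          # reset X to the zero tangent vector
    for q in pts
        X .-= log(M, p, q)
    end
    X ./= length(pts)
    return X
end
```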
Benchmark histogram: 27.4 ms … 31.9 ms. Memory estimate: 3.76 MiB, allocs estimate: 5949.
which is faster by about a factor of 2 compared to the first solver-call. Note that the results m1 and m2 are of course the same.
distance(M, m1, m2)
2.0004809792350595e-10
In this tutorial, we will both introduce the basics of optimisation on manifolds as well as how to use Manopt.jl to perform optimisation on manifolds in Julia.
For more theoretical background, see e.g. [Car92] for an introduction to Riemannian manifolds and [AMS08] or [Bou23] to read more about optimisation thereon.
Let $\mathcal M$ denote a Riemannian manifold and let $f\colon \mathcal M → ℝ$ be a cost function. We aim to compute a point $p^*$ where $f$ is minimal or in other words $p^*$ is a minimizer of $f$.
and would like to find $p^*$ numerically. As an example we take the generalisation of the (arithmetic) mean. In the Euclidean case with $d\in\mathbb N$, that is for $n\in \mathbb N$ data points $y_1,\ldots,y_n \in \mathbb R^d$, the mean
\[ \frac{1}{n}\sum_{i=1}^n y_i\]
can not be directly generalised to data $q_1,\ldots,q_n$, since on a manifold we do not have an addition. But the mean can also be characterised as
\[ \operatorname*{arg\,min}_{x\in\mathbb R^d} \frac{1}{2n}\sum_{i=1}^n \lVert x - y_i\rVert^2\]
and using the Riemannian distance $d_\mathcal M$, this can be written on Riemannian manifolds. We obtain the Riemannian Center of Mass [Kar77]
Let’s assume you have already installed both Manopt and Manifolds in Julia (using e.g. using Pkg; Pkg.add(["Manopt", "Manifolds"])). Then we can get started by loading both packages – and Random for persistency in this tutorial.
using Manopt, Manifolds, Random, LinearAlgebra
Random.seed!(42);
Now assume we are on the Sphere $\mathcal M = \mathbb S^2$ and we generate some random points “around” some initial point $p$
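A sketch of such a data generation follows; the number of points and the spread σ are illustrative values, not those of the tutorial:

```julia
# Sketch: random points "around" p on the 2-sphere, obtained by following
# random tangent directions of (assumed) length π/8.
using Manifolds

M = Sphere(2)
p = 1 / sqrt(2) .* [1.0, 0.0, 1.0]
data = [exp(M, p, π / 8 * rand(M; vector_at=p)) for _ in 1:100]
```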
Computationally, we look at a very simple but large scale problem, the Riemannian Center of Mass or Fréchet mean: for given points $p_i ∈\mathcal M$, $i=1,…,N$ this optimization problem reads
which of course can be (and is) solved by a gradient descent, see the introductory tutorial or Statistics in Manifolds.jl. If $N$ is very large, evaluating the complete gradient might be quite expensive. A remedy is to evaluate only one of the terms at a time and choose a random order for these.
We first initialize the packages
using Manifolds, Manopt, Random, BenchmarkTools
Random.seed!(42);
We next generate a (little) large(r) data set
n = 5000
σ = π / 12
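The split of the gradient into its summands can be sketched as a vector of component functions; `data` is assumed to be the generated data set and the anonymous-function form is illustrative:

```julia
# Sketch: one gradient function per data point for the center-of-mass cost,
# each being -log_p(q) up to the 1/N factor.
gradF = [(M, p) -> -log(M, p, q) for q in data]
# which can then be passed to stochastic_gradient_descent(M, gradF, p0)
```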
Benchmark histogram: 783 ms … 806 ms. Memory estimate: 703.16 MiB, allocs estimate: 9021018.
Note that all 5 runs are very close to each other, here we check the distance to the first