Inexact Newton Methods for Solving Generalized Equations on Riemannian Manifolds
Abstract
The convergence of inexact Newton methods is studied for solving generalized equations on Riemannian manifolds by using the metric regularity property, which is also explored. Under appropriate conditions and without any additional geometric assumptions, local convergence results with linear and quadratic rates, as well as a semi-local convergence result, are obtained for the proposed method. Finally, the theory is applied to the problem of finding a singularity for the sum of two vector fields. In particular, the KKT system for the constrained Riemannian center of mass on the sphere is explored numerically.
Key words: Generalized equation, inexact Newton method, metric regularity, Riemannian manifolds, vector fields.
AMS subject classification: 90C33 49M37 65K05
1 Introduction
In recent years, constrained generalized equations on Banach spaces, i.e., finding a solution to the inclusion
(1) |
where and are two Banach spaces, is a nonempty, closed, and convex set, is a mapping, and is a set-valued mapping, have gained increased attention [18, 7]. This is attributed to the fact that the model (1) covers many well-known problems, such as constrained variational inequality and split variational inequality problem [15, 34], nonlinear equations, systems of equations and inequalities, optimality condition in mathematical programming and optimal control, and equilibrium problem. The readers are referred to [3, 5, 4, 36, 21, 17, 30, 28, 26, 9, 24, 22, 23, 8, 33, 31, 38] for a detailed study of (1) with .
For a Riemannian manifold , a closed set with a nonempty interior, a continuously differentiable mapping , and a set-valued mapping , we consider the generalized equation
(2) |
Evidently, (2) covers the nonlinear equation ( and the nonlinear inclusion problem () for a fixed cone [52]. In this paper, we shall prove that problem (2) also covers the problem of finding a singularity of the sum of two vector fields of the form
(3) |
where is a single-valued vector field, is a set-valued vector field. In particular, we demonstrate that (2) can also be used to obtain a solution for a variational inequality problem and the Karush–Kuhn–Tucker (KKT) conditions for a constrained optimization problem on manifolds.
Problem (3) has been extensively investigated [5, 28] and solved using Newton’s method for [6, 25, 51, 42, 2, 19]. This problem naturally arises, for example, in the first-order optimality conditions of the minimization problem
(4) |
where is a differentiable function defined over (the interior of ) and is non-differentiable. In fact, one can consider (3) with representing the Riemannian gradient of (), representing the Riemannian subdifferential of (), and , i.e.,
Problem (4) is associated with several important applications, including sparse principal component analysis [32], sparse blind deconvolution [54], unsupervised feature selection [49], and image restoration [12, 11].
The extensive scope of generalized equations and the growing interest in optimization on manifolds [47, 14, 1] in recent years have motivated us to explore the integration of these two areas in this paper. Our focus is on studying a Riemannian version of the inexact Newton method proposed by [16] with the aim of solving problem (2). Given an initial point , our method generates a sequence of iterations as follows:
(5) |
where is a sequence of set-valued mappings with closed graphs representing the inexactness, and denotes the differential of . In other words, the method involves: selecting an initial point on the manifold (sufficiently close to a solution), solving a subproblem to find in the intersection above (under certain conditions, it is possible to guarantee that it is nonempty), and computing the next iteration by applying the exponential map to . It is noteworthy that whenever is sufficiently small, (5) can be expressed as:
(6) |
Although it may appear that obtaining requires solving the subproblem in (5) exactly, the map is specifically introduced to circumvent this necessity. This is evident in the particular case of (5) where is defined as the closed ball centered at with a radius of , with being a forcing sequence of non-negative real numbers converging to 0, and . In this scenario, the subproblem in (5) consists of finding a that satisfies
where denotes the Euclidean norm, which can be interpreted as an inexact Newton method for by solving . The convergence analysis presented in this paper is conducted using (5) because it is more general and can be applied to other potential particular cases of (2).
Assuming that the set-valued mapping in (2) is metrically regular at for and that satisfies a suitable boundedness condition, we demonstrate that a sequence generated by (6) exhibits both linear and quadratic convergence towards , with the exact nature of the convergence depending on the specific assumptions regarding . These results can be obtained without requiring prior knowledge of the sequence of mappings in terms of problem-specific data. However, it is necessary to ensure that a sequence in converges to at the same rate as the sequence converges to . This requirement is a standard assumption in the context of inexact Newton-type methods, even when applied to nonlinear equations.
Under the condition of metric regularity of a linearization of at for , and provided that certain additional conditions are met, we present variations of the aforementioned results. In these variations, a neighborhood of is assumed to be known, which allows for a more suitable choice of the initial point for the sequence . We also provide a semi-local convergence result, where the required conditions are related to rather than . This result is new even in the case where is a Euclidean space.
To understand the concept of metric regularity well on Riemannian manifolds, we constructed examples of mappings that are metrically regular over the set of positive definite symmetric matrices equipped with a well-established Riemannian metric. Additionally, we present conditions that guarantee the metric regularity property for certain mappings, and in particular, we compare the metric regularity of with that of its linearization. Finally, some examples are given to show that our proposed concept is useful and a numerical example applied to the KKT system is given as well for the constrained Riemannian center of mass on the sphere to illustrate our theoretical results.
This work is organized as follows. In Section 2, some notations and basic concepts are reviewed. In Section 3 the metric regularity assumption is explored. In Section 4, local and semi-local convergence are studied for the proposed method. In Section 5 the relationship is investigated between (2) and (3), and some examples of classical problems which can be viewed as generalized equations are given. In Section 6, a numerical example is provided to illustrate our theoretical results. Finally, conclusions are presented in the last section.
2 Preliminary
In this section we recall some notations, definitions and basic properties of Riemannian manifolds used throughout the paper, which can be found in many introductory books on Riemannian geometry, for example [20, 41, 40, 50].
Suppose that is a connected, -dimensional smooth manifold. At each point , the tangent space is an -dimensional vector space with its origin at . The disjoint union of all tangent spaces, denoted as , is the tangent bundle of . We assume that is equipped with a Riemannian metric, making it a Riemannian manifold. At each point , this Riemannian metric is denoted by , and the associated norm in is represented as . The Riemannian distance between two points and in , denoted as , is defined as the infimum of the lengths of all piecewise smooth curve segments connecting to . Additionally, the distance from a point to a subset is defined as and the interior of is represented by .
A vector field on is a correspondence that associates to each point a vector . The point is said to be a singularity of if and only if . The set of smooth vector fields on is denoted by .
The tangent vector of a smooth curve defined on some open interval is denoted by . For each , , the Levi-Civita connection induces an isometry relative to the Riemannian metric on , given by , where is the unique vector field on such that and . The isometry is the parallel transport along joining to . When the geodesic connecting and is unique, the notation will be used instead of . It is well-known that is equal to the identity map over .
The differential of a smooth function at is the linear map which assigns to each the value
for every smooth curve satisfying and . The gradient at of , denoted as , is defined by the unique tangent vector at such that for all For a smooth multifunction , its differential is given by
(7) |
Note that is a linear map from into for all . The norm of a linear map is defined by where is the Euclidean norm on .
A vector field along a smooth curve is said to be parallel if and only if . The curve is a geodesic when is self-parallel. When the geodesic equation is a second-order nonlinear ordinary differential equation, the geodesic is determined by its position and velocity at . The restriction of a geodesic to a closed bounded interval is called a geodesic segment. Denote the unique geodesic segment satisfying and by . A geodesic segment joining to in is said to be minimal if its length is equal to . A Riemannian manifold is complete if its geodesics are defined for all values of . Hopf-Rinow’s theorem asserts that every pair of points in a complete, connected Riemannian manifold can be joined by a (not necessarily unique) minimal geodesic segment. Due to the completeness of , the exponential map is given by , for each . In this paper, all manifolds are assumed to be connected, finite-dimensional, and complete.
The open and closed balls on of radius centered at are defined, respectively, as and . Analogously, the open and closed balls on of radius centered at are defined, respectively, as and . It is well-known that there exists such that is a diffeomorphism, and is called a normal ball with center and radius . Whenever is a normal ball, it and its closure will be denoted by and , respectively. The injectivity radius of at , denoted by , is the supremum of all such that is a normal ball. The equality
(8) |
holds for all , where denotes the inverse of the exponential map. Recall that there exist and such that, for every , is a normal ball and , see [20, Theorem 3.7]. In this case, is called a totally normal ball of center and radius . When is a Hadamard manifold, that is, a complete, simply connected Riemannian manifold with nonpositive sectional curvature, is totally normal for all and .
A sequence is linearly convergent to when there exist and such that
It is said to be quadratically convergent to when there exist and such that
We end this section by presenting the concept of Lipschitz continuity for the differential of continuously differentiable functions in the Riemannian context. This concept will be fundamental for obtaining the quadratic convergence rate for our algorithm.
Definition 1.
Let be continuously differentiable at . Then is L-Lipschitz continuous around if there exists such that for every
(9) |
where is a geodesic connecting to .
3 Metric Regularity on Riemannian Manifolds
Let be a set-valued mapping, whose the graph of is the set with domain . The inverse of is defined as .
A function is said to be Lipschitz continuous relative to a set if there exists a constant such that
Let . The Lipschitz modulus of at is defined by
We now introduce the following notations on : denotes the Euclidean closed ball of radius and center , and denotes the Euclidean distance from a point to a set. The next concept plays an important role in role in this paper and it comes from [23, p. 279] in the metric spaces setting:
Definition 2.
A set-valued mapping is said to be metrically regular at for when , and there exist positive constants , and such that
(10) |
and
(11) |
The infimum of over all such combinations of , and is called the regularity modulus for at for and denoted by . The absence of metric regularity is signaled by .
It is important to emphasize that one of the main assumptions used to guarantee the convergence of the proposed algorithm in this paper is metric regularity. This concept, in the context of Riemannian manifolds, has been explored in recent works such as [4, 2, 28]. We will now briefly discuss this property in the case where . If is a differentiable function, the metric regularity of at for is equivalent to stating that the inverse of the Jacobian matrix of at exists, which is the standard regularity assumption applied to solve the nonlinear equation . On the other hand, if we are interested in solving a nonlinear system of equations and inequalities, for example,
(12) |
where and are given parameters, and and are continuously differentiable functions, then we can associate problem (12) with the following generalized equation:
In this particular case, it is well known that the metric regularity of at for is equivalent to the standard Mangasarian-Fromovitz constraint qualification at . Notably, the case where is of particular interest, with denoting the normal cone to a closed convex set . The metric regularity of at for is equivalent to the concept of strong metric regularity, meaning that is locally single-valued and Lipschitz continuous at for . Thus, if and are chosen such that forms the KKT system for a constrained optimization problem, then strong metric regularity is equivalent to the linear independence of the gradients of the active constraints and the strong second-order sufficient condition. These details are thoroughly discussed in [24]; see also Examples 6 and 7 in Section 5.
We present in Subsection 3.1 some examples of set-valued mapping that satisfy the previous concept. The next lemma is a version in the Riemannian setting of Corollary 3F.3 given in [24]. Its proof is similar to the corresponding result in Euclidean space and will be omitted here.
Lemma 1.
Consider a mapping and a point . Then for every with , one has .
In the theorem below, we demonstrate that the metric regularity condition of a mapping at for is equivalent to the same condition for its linearization at .
Theorem 1.
Consider a normal ball . Let be a continuously differentiable function at and be a set-valued mapping. Then for defined by
(13) |
one has .
Proof.
Pick arbitrary. By Lemma 5 given in the Appendix of this paper, there exists a totally normal ball such that
(14) |
Consider a function given by
(15) |
In particular, this function satisfies
(16) |
The first part of Proposition 2 guarantees that there exists such that
where is the geodesic satisfying and . Adding to both sides, it follows that
Applying norm on both sides, we conclude that
(17) |
On the other hand, from Proposition 3 with and we have
From this equality and (14) with , (17) yields
Then (16) implies that for all . Since was chosen arbitrarily, we can conclude that . Finally, to obtain the desired equality, just use Lemma 1 with as in (15) and . ∎
Next, we define the Aubin property for a set-valued mapping . To this end, it is necessary to introduce the concept of excess between subsets of . For sets and in , the excess of beyond is defined as
with the convention that for and otherwise.
Definition 3.
A set-valued mapping is said to have the Aubin property at for if , and there exist positive constants , , and such that
The infimum of over all such combinations of , , and is called the Lipschitz modulus of at for and is denoted by . The absence of this property is indicated by .
The next result provides a relationship between the metric regularity of a set-valued mapping and the Aubin property for its inverse. Its proof follows from a straightforward adaptation of the demonstration in [24, Theorem 3E.6] and the concepts introduced above, and hence will be omitted.
Theorem 2.
A set-valued mapping is metrically regular at for with a constant if and only if its inverse has the Aubin property at for with constant . Thus,
3.1 Example in the SPD Matrices Cone
In this section, some examples of metrically regular mappings on a particular manifold are presented.
Denote the set of symmetric matrices of size for some by , and the cone of symmetric positive definite matrices by . The latter is endowed with the affine invariant Riemannian metric given by
(18) |
where tr denotes the trace of a matrix. The tangent space can be identified with . Moreover, is a Hadamard manifold, see, for example, [39, Theorem 1.2. p. 325]. The Riemannian distance is given by
(19) |
where denotes the matrix logarithm. The Riemannian gradient of at is given by
(20) |
where represents the Euclidean gradient of at , see [48]. The identity matrix of size will be denoted by .
In the following two examples, we verify the property of metric regularity for single-valued functions.
Example 1.
Consider the function defined by . For every , the following equality holds:
(21) |
This implies that for all Thus, using (19) and simple algebraic manipulations, we obtain
On the other hand, since is Euclidean differentiable, it follows from (20) that will also be Riemannian differentiable with respect to metric (18). Therefore, is continuous and, by Proposition 4, it has closed graph. Given this and the last inequality above, we conclude that is metrically regular at for for all .
Example 2.
Define the function by . Note that for all . Using this and (19), we obtain
(22) |
Now recall that
(23) |
where are the eigenvalues of . Pick a constant . From (19), for all , one has
Then holds for all and . Consequently
With (23), we conclude that for all . Since the function is Lipschitz on the interval (its gradient is bounded on this interval), there exists such that
(24) |
Furthermore, taking into account the argument presented at the end of Example 1, it can be concluded that is closed. Then is metrically regular at for for all .
We now present two examples involving set-valued mappings. In these examples, we demonstrate that the given mappings are metrically regular at specific points in .
Example 3.
Define the set-valued mapping by
(25) |
By (25) and the calculations in (21), if and satisfy then . Now, consider the case . By applying on both sides, it follows that . Clearly, this implies that . Therefore, by (25), we get for this case as well. Thus, holds for all , which implies
On the other hand, it follows from (25) that
Overall, we conclude
With a justification analogous to the one presented in Example 1, we state that has closed graph. Therefore, is metrically regular at for for all .
Example 4.
Consider the set-valued mapping by
(26) |
Following the steps of Example 3, we can show that holds for all , . This means that for all , and hence
From (24), for all there exists such that
Pick . Exploiting the definition of in (26), it follows that
Combining this with the fact that is closed, we conclude that is metrically regular at for for all .
4 Convergence
In this section, we will establish three convergence theorems for the sequence generated by the inexact Newton method described in (6). The proofs presented here rely on a Riemannian version of Theorem 5G.3 from [24], which is presented in the following lemma. The proof of this lemma is a straightforward adaptation of its Euclidean version and will be omitted.
Lemma 2.
Consider a set-valued mapping and a point at which is metrically regular with positive constants , and satisfying (10)-(11). Let be such that and . Then for every positive and such that
and for every function satisfying
the mapping has the following property: for every and every there exists such that
Our first convergence result is a local analysis of the inexact Newton method (6) for solving (2). This approach makes assumptions around a solution of (2).
Theorem 3.
Let be a solution of (2). Suppose that the following conditions hold:
-
(i)
the function is continuously differentiable at and ;
-
(ii)
the sequence satisfies
(27)
Then there exist and a totally normal ball such that for every , every , and every there exists satisfying
(28) |
and
(29) |
Consequently, for any starting point , there exists a sequence generated by (6) that converges linearly to .
Proof.
First, (27) implies that there exists satisfying the following inequality:
(30) |
Choose satisfying (30) and positive constants , , and such that
(31) |
Pick and such that
(32) |
From the first inequality in (30), there exists such that
(33) |
Choose a totally normal ball . Note that the function defined by
(34) |
satisfies for all . On the other hand, since and satisfy , it follows from Theorem 1 that , where is the function defined in (13) with . Hence, by the definition of in Definition 2, there exist and such that (10) and (11) are satisfied for and . Moreover, by Lemma 5 and (34), there exists such that
for all . Therefore, for every one has
Overall, there exists (independent of ) such that
and
Applying Lemma 2 with , , , and , we conclude that for every and every there exists such that Taking and , we conclude that for every there exists such that
(35) |
Using the second part of Proposition 2 with chosen as in (31), we have the existence of a normal ball such that
(36) |
Fixed such that
(37) |
and , by choosing and , it follows from (33) that satisfies . Denote
(38) |
If , then satisfies (28) because , and (29) holds trivially. For , it follows from (38), (36) and (33) that
(39) |
Since , it follows from (39) that . Applying (35) with , we obtain that there exists such that . Hence, from the upper bound for given in (32), (39), and the last inequality in (31), it follows that
Furthermore, it comes from , (38), (34) and (13) that
An upper bound for the radius of the ball mentioned in the statement of Theorem 3 is given in (37). This bound depends on certain constants that, while not known a priori, are guaranteed to exist due to conditions i) and ii) in Lemma 2, as well as additional results presented in the Appendix of this paper.
Below, we present a version of Theorem 3 with more technical conditions, where it is assumed that the constants mentioned above are known. Under these new conditions, we can explicitly determine a radius of convergence for the sequence generated by (6). The proof of this new theorem follows a similar structure to the proof of Theorem 3 and will therefore be omitted.
Theorem 4.
Let be a solution to the equation (2). We assume that the following conditions are satisfied:
-
(i)
is a totally normal ball, and the mapping defined in (13) with is metrically regular at for with the constant and neighborhoods and .
-
(ii)
is chosen such that holds for all , , .
-
(iii)
, , and satisfy
-
(iv)
and
-
(v)
is such that and .
-
(vi)
satisfies for all .
Then, for every starting point with , there exists a sequence generated by (6) that is well defined and converges linearly to .
Next, we modify the assumption on the multifunction and introduce a stronger condition for the differentiability of the function in order to establish quadratic convergence of the sequence (6).
Theorem 5.
Let be a solution of (2). Suppose the following conditions hold:
-
(i)
the function is continuously differentiable at , and ;
-
(ii)
is L-Lipschitz continuous around ;
-
(iii)
the sequence satisfies
(40)
Then there exist and a totally normal ball such that for every , every , and every , there exists satisfying
(41) |
and
(42) |
Consequently, for any starting point , there exists a sequence generated by (6) that converges quadratically to .
Proof.
This proof is analogous to the proof of Theorem 3. Using (40), we can find and a totally normal ball such that
(43) |
Choose , , and satisfying
(44) |
Note that . From and Theorem 1, we obtain , where is the function defined in (13) with . Thus, there exist and such that is metrically regular at for with the constant and neighborhoods and . Now, consider the auxiliary functions , , defined in (34). Recall that for all . Make smaller, if necessary, to ensure
Hence, we can apply Lemma 2 with , , , , , , and . This yields the existence of (independent of the point that determines the function ) such that for each , there exists such that
(45) |
With , the last part of Proposition 2 implies that there exists such that
(46) |
Fixed such that
(47) |
and fixed, by choosing and it follows from (43) that satisfies . Denote
(48) |
If , then satisfies (41) because and (42) hold trivially. Assume that . Using (46) and (48), we obtain
(49) |
Since , it follows from (49) that . Applying (45) with , we obtain that there exists such that . Hence, utilizing the upper bound for given in (44) and (49), it follows that
Furthermore, it comes from , (48), (34) and (13) that
To finish the proof, choose any . If , then verifies (6) because . If , applying (41) and (42) we obtain that for every there exists such that
By considering the definition of and we obtain from the above inequality that Repeating the previous argument it is possible to construct a sequence in that satisfies (6) and converges quadratically to . ∎
Under suitable conditions, it is possible to determine the radius mentioned in Theorem 5. Details are given in the following result. The proof of this result is along the same lines as the proof of Theorem 5 and will therefore be omitted.
Theorem 6.
Let be a solution of (2). Suppose that the following conditions hold:
-
(i)
is a totally normal ball, and the mapping defined in (13) with is metrically regular at for with the constant and neighborhoods and .
-
(ii)
satisfies for all , , .
-
(iii)
, and satisfy
-
(iv)
and
-
(v)
satisfies and .
-
(vi)
and satisfy for all .
Then for every satisfying (47) and starting point there exists a sequence generated by (6) that is well defined and converges linearly to .
Some comments about the previous results are in order. First, Theorems 3 and 5 establish conditions for ensuring that the inexact Newton method (6) converges with linear and quadratic rates, respectively. Second, a similar result to Theorem 3 is presented in [17, Theorem 2.1] by considering as Banach spaces. Thus, if , where is a Banach space, and the derivative is replaced by a suitable approximation, then Theorems 3 and [17, Theorem 2.1] are equivalent. Third, to ensure superlinear and quadratic convergence, the authors in [17] assume a stronger condition, namely, the strong metric regularity (see [17, Theorem 2.3]). We recall that a set-valued mapping is strongly metrically regular at for if and only if its inverse has a single-valued graphical localization around for which is Lipschitz continuous around with Lipschitz modulus at equal to , see [17, p. 1007]. Fourth, we establish quadratic convergence of (6) by only assuming the metric regularity condition, which is clearly a weaker assumption than strong metric regularity, see the previous comment. Thus, the result obtained in Theorem 5 is stronger than [17, Theorem 2.3]. Finally, Theorems 4 and 6 refine Theorems 3 and 5, respectively, in the sense that they provide guidance on how to find the neighborhood to start the proposed method in (6).
It is also important to mention that in [17] is introduced a version of the Dennis-Moré theorem for the sequence generated by (6). We do not go futher in this topic because we do not proposed a quasi-Newton method in this paper.
We conclude this section by presenting a semi-local convergence result for the inexact Newton method proposed in this paper to solve (2). This result makes no assumptions about the unknown solution to the problem under investigation; instead, the assumptions are made about the starting point . It is worth mentioning that this result is novel, even in the Euclidean context.
Theorem 7.
Assume that for with and , the following conditions hold:
-
(i)
is a totally normal ball, and the mapping defined by
(50) is metrically regular at for with the constant and neighborhoods and .
-
(ii)
the positive constants , , , , and satisfy
-
(iii)
and for all there holds
(51) -
(iv)
holds for all , , , and .
-
(v)
is continuously differentiable, and the inequalities
and
hold for all .
Then there exist sequences and satisfying:
-
(A1)
-
(A2)
-
(A3)
In particular, remains in , converges to a solution of (2), and satisfies
Proof.
Note that the function , , defined by
(52) |
satisfies . From (52) and the second inequality in , it follows that
for all . Thus, by using , one has
and
Hence, we can apply Lemma 2 with , , , , and to obtain the following statement:
-
(S)
for every and every there exists such that .
Due to , for every , the following can be stated:
(53) |
We are now able to prove (A1)–(A3) by induction. By using statement (S) with and we conclude that there exists such that
where the last inequality follows from . Moreover, the inclusion is equivalent to
Hence, (A1)–(A3) hold for . For , assume the induction hypothesis: there exist and such that (A1)–(A3) hold for every . Denote
(54) | ||||
(55) |
By (A3), (52), (50) we find . On the other hand, the first inequality in , (53), (A1), (A2) and (51) yield
and
and
which implies . Therefore, we can apply (S) with and to conclude that there exists such that . Putting in this inequality and taking into account (54), (55), the first inequality in , (53) and (A2), we get
In view of this and (A1), we also have
Furthermore, it comes from that
Overall, we can conclude that (A1), (A2) and (A3) hold for all
Now our focus is to show the convergence of Using (A1) and (A2) in a simple induction procedure, we can show that
holds for all , . Hence, since , is a Cauchy sequence, and consequently, there exists such that From (A1) and (51), it follows that for all , which implies that . As is continuous and has closed graph, by letting in (53) and (A3), we conclude that is a solution of (2). Finally, using (A2), we get
for all . ∎
We conclude this section with some remarks about Theorem 7. First, it is a result to guarantee the existence of a neighborhood which the inexact Newton’s method is well-defined and, hence its convergence to a solution of (2). Thus, in practice, it can be hard to evaluate all the parameters in itens . Second, it states that if the initial point satisfies the inclusion where and
then the inexact Newton’s method finds a solution of (2) in , where
We can interpret this remark as follows: although we cannot evaluate in practice, if we choose close to then the proposed method will work. Third, the assumption in ensures that the first call (and the subsequent calls) to the inexact Newton’s method for solving (2) address the inexactness. Fourth, conditions in (i)-(ii) are usual in the context of metric regularity, and both inequalities in are related to the smoothness assumption of and the continuity of the exponential map. Thus, if and are small enough, this assumption holds.
5 Application
In this section, we investigate three well-known problems that can be formulated as generalized equations on Riemannian manifolds.
Example 5 (System of Inequalities and Equalities).
Consider the generalized equation (2) with , where is the fixed cone
It is easy to see that this generalized equation is equivalent to the following system of equalities and inequalities:
(56) | ||||
Let solve (56) and let each be continuously differentiable around for all . As defined in [10, Definition 3.12], the Mangasarian–Fromovitz condition for system (56) is as follows:
(57) |
After making simple adaptations of [24, Example 4D.3] to the Riemannian context, we can assert that condition (57) is equivalent to the Aubin property of at for , which, in turn, by Theorem 2, is equivalent to the metric regularity of at for .
The following proposition serves as a prerequisite for discussing the last two examples in this section. This result establishes an equivalence between problems (2) and (3).
Proposition 1.
Proof.
If is a solution of (3), then there exists such that . Consequently,
Using the definitions of and in (58), we find that satisfies (2). Conversely, if is a solution to (2) with and defined in (58), then there exists such that
Since forms a basis for , it follows that for all . Thus, , implying that is a solution to (3). ∎
In the following example, we discuss the variational inequality problem proposed in [44], which extends the problem introduced in [46].
Example 6 (Variational Inequality Problem).
Let be a single-valued vector field. Consider the variational inequality problem
(59) |
where is the set of all geodesics satisfying , , and for all . Since the normal cone associated with is the set-valued vector field defined by
for all (see [43]), the problem (59) is equivalent to
which, in turn, by Proposition 1, is equivalent to generalized equation (2) with and as in (58), with and being any basis for .
The following example proposes an approach based on the Riemannian extension of the analysis presented in [36].
Example 7 (KKT Conditions).
Consider the constrained nonlinear optimization problem on :
minimize | (60) | |||
subject to | (61) |
where , , and are assumed to be continuously differentiable functions. The constraint means that for all . The Lagrangian function is given by
where and . For each , consider the function defined by for all . Based on [10], we assert that the KKT conditions for (60)–(61) are:
(62) | |||
(63) | |||
(64) |
Let and consider the vector field defined by
and the set-valued vector field defined by
where denotes the set of vectors in with nonnegative coordinates. Note that KKT system (62)–(63)–(64) is equivalent to the problem
(65) |
Let be a basis for and be the canonical basis for . Then, forms a basis for and, by Proposition 1, (65) is equivalent to the generalized equation
where and are defined, respectively, by
and
6 Numerical Example
In this section, we present a numerical example based on a generalized equation derived from Example 7, and solve it using the inexact Newton method described in (5). All computations were performed on a MacBook Pro running macOS Sonoma 14.5, equipped with 16 GB RAM, an Apple M1 Pro CPU, and MATLAB R2022a. To ensure reproducibility, we fixed the randomness using MATLAB’s built-in function rng(2024).
Here, we consider the Riemannian constrained optimization problem on :
minimize | (66) | |||
subject to | (67) |
where and are chosen such that and . We will use the fact that
(68) |
The problem defined by (66)–(67) represents a constrained version of the Riemannian center of mass, also known as the (Riemannian) mean, as proposed in [10]. This problem was first introduced in [37] and has been extensively studied in recent literature (see, for example, [13, 29, 53]).
Particularly, we are interested in problem (66)–(67) with , equipped with the metric of the ambient space . According to [35], the skew-symmetric matrices
induce an orthonormal basis of vector fields on , defined by (standard matrix-vector product) for all and . By defining by
it follows from Example 7 that the KKT conditions for (66)–(67) are:
(69) | ||||
(70) |
which is equivalent to the generalized equation
(71) |
where and are defined, respectively, by
(72) |
and
(73) |
for all .
To apply the inexact Newton method in (5) to solve (71), the subproblem in each iteration involves computing such that
To accomplish this, select (we use , where denotes the transpose operation, for all ), and then solve the following optimization problem:
minimize | (74) | |||
subject to | (75) |
Based on the definition of in (73) and the fact that every can be expressed as a linear combination of , , and , we can find a solution to (74)–(75) by solving the Euclidean quadratic constraint problem:
minimize | (76) | |||
subject to | (77) |
Thus, the iteration is obtained as follows:
For the implementation of our algorithm, we use the following expressions derived from (68) and (72):
and
where, for each , denotes the Riemannian gradient of at , which is the projection of the Euclidean gradient of at onto . This projection can be obtained by multiplying the vector by the matrix , where denotes the identity matrix of dimension . Additionally, we need to know the following expressions:
and
These formulas on the sphere can be found in [27], for example.
Now, we are prepared for the numerical implementation. We use the Matlab built-in function rand to generate , normalizing them to unit vectors. Each component of lies within the interval . Specifically, we consider four cases:
-
A1.
, , , and ;
-
A2.
, , , and ;
-
A3.
, , , and ;
-
A4.
, , , and ;
where is the initial iteration point. For the stopping criteria, we use
where .
From Figure 1 and Table 1, we can claim that for the above four cases A1-A4, we find a solution for generalized equation (71). This is because for all four cases, we achieve , and . In addition, KKT system (69)-(70) is also satisfied under , which is asserted in theory and again ensured numerically. Furthermore, looking into cases A1-A2, we have , which means that lies in the interior of the constraint region. But for cases A3-A4, is on the boundary of the constraint region because is almost equal to . Due to the effect of the constraint, the convergence rate of the norm of for cases A1-A2 is stable. For cases A3-A4, the convergence rate of the norm of in the first several iterations is unstable but after several iterations, the convergence rate becomes stable. This coincides with the theory that the convergence rate of the inexact Newton method becomes stable when the iteration point is close to the solution.
No. of Iteration | ||||||
---|---|---|---|---|---|---|
Case A1 | [0.4388,0.4862,0.5665,0.5001] | 0 | -2.9037 | 0 | 21 | |
Case A2 | [0.4908,0.5100,0.4878,0.5109] | 0 | -2.9298 | 0 | 19 | |
Case A3 | [0.0504,0.0566,0.0650,0.9950] | 8.9114 | 16 | |||
Case A4 | [0.0570,0.0593,0.0566,0.9950] | 8.7870 | 16 |
7 Conclusion
In this paper, we address the problem of finding solutions to generalized equations on Riemannian manifolds. If the manifold is a Euclidean space, then it is well-known that this problem encompasses several other contexts, such as standard nonlinear optimization, variational inequalities, and equilibrium problems. Here, we present a general inexact Newton method for solving generalized equations. Firstly, we discuss the metric regularity property and provide some examples of mappings satisfying this property. Secondly, we establish local convergence results, including both linear and quadratic convergence under suitable assumptions, along with a semi-local convergence result. Finally, we discuss the relationship between problems (2) and (3).
All results obtained here are derived under assumptions that are highly natural in comparison to their Euclidean counterparts, without requiring additional conditions related to the geometry of the manifold. From this perspective, considering a problem as a generalized equation may be a promising approach in the context of manifolds as well.
As a next step, we plan to investigate how to extend the theory presented here to cases where the exponential map is replaced by a general retraction. Moreover, we intend to apply this new concept to quasi-Newton-type methods for solving problem (2).
Data Availability Data sharing not applicable to this article as no datasets were generated or analyzed during the current study
Declarations
Conflict of interest The authors declare that they have no conflict of interest
Appendix A Appendix
We begin this section by presenting two supporting lemmas. The proof of the first one can be found in [45, Lemma A.1]. The second one is analogous to [45, Lemma A.2], but its proof requires some adaptations. Therefore, we provide the proof here.
Lemma 3.
Let be a point on , and be a normal ball. Define the following vector field on :
Then, is a smooth vector field on .
Lemma 4.
Let be a continuously differentiable function at and a normal ball. Then, for each , there exists such that
Proof.
It follows from (7) and parallel transport properties that
From norm properties, we have
Since is a linear map, the above inequality implies that
Under the assumptions of the lemma, is a continuous vector field around for all . Hence, using Lemma 3 with and considering the fact that the function is continuous, we obtain
Thus, it follows from the last inequality above that , which is equivalent to what we want. ∎
In the following proposition, we provide a Riemannian version of the Fundamental Theorem of Calculus and two inequalities that play a crucial role throughout this paper. We claim that the inequalities in the following result do not hold for a general retraction.
Proposition 2.
Let be a continuously differentiable function at . Then, there exists such that for each , we have
(78) |
where is a geodesic satisfying and . In particular, for each , there exists a normal ball such that
(79) |
Furthermore, if is -Lipschitz continuous around then there exists a normal ball such that
(80) |
Proof.
Choose such that is continuously differentiable on . Pick and consider a geodesic connecting to . Using [20, Proposition 2.7], we conclude that
Applying integral with respect to on both sides and using the Fundamental Theorem of Calculus for the real function , we conclude that
(81) |
which completes the proof of (78). Now, let us prove (79) and (80). Let be a normal ball. For each , the geodesic connecting to can be written as , which implies that holds for all . Hence, it follows from (81) with that
By adding the term on both sides, it comes that
Applying the Euclidean norm of , we get
It follows from (8) and simple properties of integral that
(82) |
for all . For arbitrary , it follows from Lemma 4 and that for all which implies that for all and . Therefore, (82) yields (79). Finally, assume that is -Lipschitz continuous around . By Definition 1, there exists such that for all , which implies that
for all . Using the previous inequality and (82), we conclude the proof of (80). ∎
The following proposition provides a geometric property of the exponential map.
Proposition 3.
Let be a totally normal ball. Then, for each and in , we have
(83) |
where is the geodesic connecting to .
Proof.
Let and be arbitrary points in . Note that (83) is trivially satisfied when and . Thus, we only need to analyze the case where . Considering that the geodesic starting from with direction is unique, and that is a totally normal ball, we have
Applying parallel transport on both sides of the equation, we obtain
Since is a geodesic, the field is parallel along , and, consequently, the equality holds for all . Thus, the last expression implies
In a similar manner, we can show that
To conclude the proof, simply add the last two equalities and use the fact that for all . ∎
In the next result, we will establish a sufficient condition for the function to have closed graph.
Proposition 4.
Let be a continuous function. Then, the graph of is closed in .
Proof.
Since the function is continuous, the function defined by is also continuous. On the other hand, we have that
Therefore, is closed because it is the inverse image, by the continuous function , of the closed set . ∎
Throughout this paper, our strategy to prove some results requires the following lemma.
Lemma 5.
Let be a continuously differentiable function at . Then, for each , there exists a totally normal ball such that
Proof.
Let be a totally normal ball. Define the function by
(84) |
Let , and consider with the product metric. Since and are continuous at and continuously differentiable at , respectively, (the differential of at ) is continuous at . Therefore, there exists such that
From (84), it follows that for all which implies . Thus, the last line can be rewritten as
(85) |
Set . It follows from for all and (85) that for all . Consequently,
(86) |
For , it follows from the first part of Proposition 2 and (84) that
where is a geodesic that satisfies and . Since is totally normal, for all . Hence, applying the norm to both sides of the above equality and using norm properties, we obtain
The conclusion of this proof follows from the previous inequality and (86). ∎
References
- [1] P.-A. Absil, R. Mahony, and R. Sepulchre. Optimization algorithms on matrix manifolds. Princeton University Press, 2009.
- [2] R. L. Adler, J.-P. Dedieu, J. Y. Margulies, M. Martens, and M. Shub. Newton’s method on Riemannian manifolds and a geometric model for the human spine. IMA Journal of Numerical Analysis, 22(3):359–390, 2002.
- [3] S. Adly, R. Cibulka, and H. van Ngai. Newton’s method for solving inclusions using set-valued approximations. SIAM J. Optim., 25:159–184, 2015.
- [4] S. Adly, H. van Ngai, and N. V. Vu. Newton-type method for solving generalized equations on Riemannian manifolds. Journal of Convex Analysis, 25(2):341–370, 2018.
- [5] S. Adly, H. van Ngai, and N. V. Vu. Dennis-Moré condition for set-valued vector fields and the superlinear convergence of Broyden updates in Riemannian manifolds. Journal of Convex Analysis, 29(3), 2022.
- [6] F. Alvarez, J. Bolte, and J. Munier. A unifying local convergence result for Newton’s method in Riemannian manifolds. Foundations of Computational Mathematics, 8(2):197–226, 2008.
- [7] R. Andreani, R. M. de Carvalho, L. D. Secchin, and G. N. Silva. Convergence of Quasi-Newton methods for solving constrained generalized equations. ESAIM:COCV, 28, 2022.
- [8] F. J. Aragón Artacho, A. Belyakov, A. L. Dontchev, and M. López. Local convergence of quasi-Newton methods under metric regularity. Comput. Optim. Appl., 58(1):225–247, 2014.
- [9] F. J. Aragón Artacho, A. L. Dontchev, M. Gaydu, M. H. Geoffroy, and V. M. Veliov. Metric regularity of Newton’s iteration. SIAM J. Control Optim., 49(2):339–362, 2011.
- [10] R. Bergmann and R. Herzog. Intrinsic formulation of KKT conditions and constraint qualifications on smooth manifolds. SIAM Journal on Optimization, 29(4):2423–2444, 2019.
- [11] R. Bergmann, R. Herzog, M. Silva Louzeiro, D. Tenbrinck, and J. Vidal-Núñez. Fenchel duality theory and a primal-dual algorithm on Riemannian manifolds. Foundations of Computational Mathematics, 21(6):1465–1504, 2021.
- [12] R. Bergmann, J. Persch, and G. Steidl. A parallel Douglas–Rachford algorithm for minimizing rof-like functionals on images with values in symmetric Hadamard manifolds. SIAM Journal on Imaging Sciences, 9(3):901–937, 2016.
- [13] D. A. Bini and B. Iannazzo. Computing the karcher mean of symmetric positive definite matrices. Linear Algebra and its Applications, 438(4):1700–1710, 2013.
- [14] N. Boumal. An introduction to optimization on smooth manifolds. Cambridge University Press, 2023.
- [15] Y. Censor, A. Gibali, and S. Reich. Algorithms for the split variational inequality problem. Numerical Algorithms, 59:301–323, 2011.
- [16] R. Cibulka, A. Dontchev, and M. H. Geoffroy. Inexact Newton Methods and Dennis–Moré Theorems for Nonsmooth Generalized Equations. SIAM J. Control Optim., 53(2):1003–1019, 2015.
- [17] R. Cibulka, A. Dontchev, and M. H. Geoffroy. Inexact Newton methods and Dennis–Moré theorems for nonsmooth generalized equations. SIAM Journal on Control and Optimization, 53(2):1003–1019, 2015.
- [18] F. R. de Oliveira, O. P. Ferreira, and G. N. Silva. Newton’s method with feasible inexact projections for solving constrained generalized equations. Computational Optimization and Applications, 72:159–177, 2019.
- [19] J.-P. Dedieu, P. Priouret, and G. Malajovich. Newton’s method on Riemannian manifolds: covariant alpha theory. IMA Journal of Numerical Analysis, 23(3):395–419, 2003.
- [20] M. P. do Carmo. Riemannian geometry. Mathematics: Theory & Applications. Birkhäuser Boston, Inc., Boston, MA, 1992. Translated from the second Portuguese edition by Francis Flaherty.
- [21] A. L. Dontchev. Local analysis of a Newton-type method based on partial linearization. In The mathematics of numerical analysis (Park City, UT, 1995), volume 32 of Lectures in Appl. Math., pages 295–306. Amer. Math. Soc., Providence, RI, 1996.
- [22] A. L. Dontchev and R. T. Rockafellar. Newton’s method for generalized equations: a sequential implicit function theorem. Math. Program., 123(1, Ser. B):139–159, 2010.
- [23] A. L. Dontchev and R. T. Rockafellar. Convergence of inexact Newton methods for generalized equations. Math. Program., 139(1-2, Ser. B):115–137, 2013.
- [24] A. L. Dontchev and R. T. Rockafellar. Implicit Functions and Solution Mappings: A View from Variational Analysis. Springer, New York, 2nd edition, 2014.
- [25] T. A. Fernandes, O. P. Ferreira, and J. Yuan. On the superlinear convergence of Newton’s method on Riemannian manifolds. Journal of Optimization Theory and Applications, 173(3):828–843, 2017.
- [26] O. Ferreira and G. Silva. Local convergence analysis of Newton’s method for solving strongly regular generalized equations. Journal of Mathematical Analysis and Applications, 458(1):481–496, 2018.
- [27] O. P. Ferreira, A. N. Iusem, and S. Z. Németh. Projections onto convex sets on the sphere. Journal of Global Optimization, 57(3):663–676, 2013.
- [28] O. P. Ferreira, C. Jean-Alexis, and A. Piétrus. Metrically regular vector field and iterative processes for generalized equations in Hadamard manifolds. Journal of Optimization Theory and Applications, 175(3):624–651, 2017.
- [29] O. P. Ferreira, M. S. Louzeiro, and L. Prudente. Gradient method for optimization on riemannian manifolds with lower bounded curvature. SIAM Journal on Optimization, 29(4):2517–2541, 2019.
- [30] O. P. Ferreira and G. N. Silva. Kantorovich’s Theorem on Newton’s Method for Solving Strongly Regular Generalized Equation. SIAM J. Optim., 27(2):910–926, 2017.
- [31] M. Gaydu and G. N. Silva. A general iterative procedure to solve generalized equations with differentiable multifunction. Journal of Optimization Theory and Applications, 185:207–222, 2020.
- [32] M. Genicot, W. Huang, and N. T. Trendafilov. Weakly correlated sparse components with nearly orthonormal loadings. In International Conference on Geometric Science of Information, pages 484–490. Springer, 2015.
- [33] M. H. Geoffroy and A. Piétrus. Local convergence of some iterative methods for generalized equations. Journal of Mathematical Analysis and Applications, 290(2):497–505, 2004.
- [34] H. He, C. Ling, and H.-K. Xu. A relaxed projection method for split variational inequalities. Journal of Optimization Theory and Applications, 166:213–233, 2015.
- [35] L. Hesselholt. Vector fields on spheres. Preprint, http://www-math. mit. edu larsh/teaching/vectorfields. pdf.
- [36] A. F. Izmailov and M. V. Solodov. Inexact Josephy–Newton framework for generalized equations and its applications to local analysis of Newtonian methods for constrained optimization. Computational Optimization and Applications, 46(2):347–368, 2010.
- [37] H. Karcher. Riemannian center of mass and mollifier smoothing. Communications on pure and applied mathematics, 30(5):509–541, 1977.
- [38] D. Klatte and B. Kummer. Approximations and generalized Newton methods. Mathematical Programming, 168:673–716, 2018.
- [39] S. Lang. Fundamentals of differential geometry, volume 191 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1999.
- [40] J. M. Lee. Introduction to Smooth Manifolds, volume 218 of Graduate Texts in Mathematics. Springer-Verlag, New York, 2003.
- [41] J. M. Lee. Riemannian manifolds: an introduction to curvature, volume 176. Springer Science & Business Media, 2006.
- [42] C. Li and J. Wang. Convergence of the Newton method and uniqueness of zeros of vector fields on Riemannian manifolds. Science in China Series A: Mathematics, 48(11):1465–1478, 2005.
- [43] C. Li and J.-C. Yao. Variational inequalities for set-valued vector fields on Riemannian manifolds: Convexity of the solution set and the proximal point algorithm. SIAM Journal on Control and Optimization, 50(4):2486–2514, 2012.
- [44] S.-L. Li, C. Li, Y.-C. Liou, and J.-C. Yao. Existence of solutions for variational inequalities on Riemannian manifolds. Nonlinear Analysis: Theory, Methods & Applications, 71(11):5695–5706, 2009.
- [45] C. Liu and N. Boumal. Simple algorithms for optimization on Riemannian manifolds with constraints. Applied Mathematics & Optimization, 82(3):949–981, 2020.
- [46] S. Németh. Variational inequalities on hadamard manifolds. Nonlinear Analysis: Theory, Methods & Applications, 52(5):1491–1498, 2003.
- [47] H. Sato. Riemannian optimization and its applications. Springer, 2021.
- [48] S. Sra and R. Hosseini. Conic geometric optimization on the manifold of positive definite matrices. SIAM J. Optim., 25(1):713–739, 2015.
- [49] J. Tang and H. Liu. Unsupervised feature selection for linked social media data. In Proceedings of the 18th ACM SIGKDD international conference on Knowledge discovery and data mining, pages 904–912, 2012.
- [50] L. W. Tu. An Introduction to Manifolds. Universitext. Springer, New York, 2 edition, 2011.
- [51] C. Udrişte. Convex functions and optimization methods on Riemannian manifolds, volume 297 of Mathematics and its Applications. Kluwer Academic Publishers Group, Dordrecht, 1994.
- [52] J.-H. Wang, S. Huang, and C. Li. Extended Newton’s method for mappings on Riemannian manifolds with values in a cone. Taiwanese J. Math., 13(2B):633–656, 2009.
- [53] M. Weber and S. Sra. Riemannian optimization via frank-wolfe methods. Mathematical Programming, 199(1):525–556, 2023.
- [54] Y. Zhang, Y. Lau, H.-w. Kuo, S. Cheung, A. Pasupathy, and J. Wright. On the global geometry of sphere-constrained sparse blind deconvolution. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 4894–4902, 2017.