Aïcha BALHAG · Zaki CHBANI · Hassan RIAHI
Cadi Ayyad University, Sémlalia Faculty of Sciences, 40000 Marrakech, Morocco
[email protected]
Supported by COST Action CA16228
Damped inertial dynamics with vanishing Tikhonov regularization: strong asymptotic convergence towards the minimum norm solution
Abstract
In a Hilbert space, we provide a fast dynamic approach to the hierarchical minimization problem which consists in finding the minimum norm solution of a convex minimization problem. To this end, we study the convergence properties of the trajectories generated by a damped inertial dynamic with Tikhonov regularization. As time goes to infinity, the Tikhonov regularization parameter is supposed to tend towards zero, not too fast, which is a key property to make the trajectories strongly converge towards the minimizer of minimum norm. In accordance with the structure of the heavy ball method for strongly convex functions, the viscous damping coefficient is proportional to the square root of the Tikhonov regularization parameter. Therefore, it also converges to zero, which ensures rapid convergence of the values. Precisely, under a proper tuning of these parameters, based on Lyapunov analysis, we show that the trajectories strongly converge towards the minimizer of minimum norm, and we provide the convergence rate of the values. We show a trade-off between the property of fast convergence of the values and the property of strong convergence towards the minimum norm solution. This study improves several previous works where this type of result was obtained under restrictive hypotheses.
Keywords:
Accelerated gradient methods; convex optimization; damped inertial dynamics; hierarchical minimization; Nesterov accelerated gradient method; Tikhonov approximation.

MSC: 37N40, 46N10, 49M30, 65K05, 65K10, 90B50, 90C25.

1 Introduction
Throughout the paper, $\mathcal{H}$ is a real Hilbert space endowed with the scalar product $\langle \cdot , \cdot \rangle$, with $\|x\|^2 = \langle x , x \rangle$ for $x \in \mathcal{H}$. We consider the convex minimization problem
(1)   $\min \left\{ f(x) : \ x \in \mathcal{H} \right\},$
where $f : \mathcal{H} \to \mathbb{R}$ is a convex continuously differentiable function whose solution set $S = \operatorname{argmin} f$ is nonempty. We aim at finding by rapid methods the element of minimum norm of $S$. Our approach is in line with the dynamic approach developed by Attouch and László in AL to solve this question. It is based on the asymptotic analysis, as $t \to +\infty$, of the nonautonomous damped inertial dynamic
(TRIGS)   $\ddot{x}(t) + \delta \sqrt{\varepsilon(t)}\, \dot{x}(t) + \nabla f(x(t)) + \varepsilon(t)\, x(t) = 0,$
where the function $f$ and the Tikhonov regularization parameter $\varepsilon(\cdot)$ satisfy the following hypothesis (in section 4, we will extend our study to the case of a convex lower semicontinuous proper function $f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$):
The Cauchy problem for (TRIGS) is well posed. The proof of the existence and uniqueness of a global solution for the corresponding Cauchy problem is given in the appendix (see also AL ). It is based on classical arguments combining the Cauchy-Lipschitz theorem with energy estimates. Our main contribution is to develop a new Lyapunov analysis which gives the strong convergence of the trajectories of (TRIGS) to the element of minimum norm of $S$. Precisely, we give sufficient conditions on $\varepsilon(\cdot)$ which ensure that $x(t)$ converges strongly, as $t \to +\infty$, to the element of minimum norm of $S$. This improves the results of AL .
1.1 Attouch-László Lyapunov analysis of (TRIGS)
The main idea developed in AL consists of starting from the Polyak heavy ball with friction dynamic for strongly convex functions, then adapting it via Tikhonov approximation to deal with the case of general convex functions. Recall that a function $f$ is said to be $\mu$-strongly convex for some $\mu > 0$ if $f - \frac{\mu}{2} \| \cdot \|^2$ is convex. In this setting, we have the following exponential convergence result for the damped autonomous inertial dynamic where the damping coefficient is twice the square root of the modulus of strong convexity of $f$:
Theorem 1.1
Suppose that $f : \mathcal{H} \to \mathbb{R}$ is a function of class $\mathcal{C}^1$ which is $\mu$-strongly convex for some $\mu > 0$. Let $x(\cdot)$ be a solution trajectory of
(2)   $\ddot{x}(t) + 2\sqrt{\mu}\, \dot{x}(t) + \nabla f(x(t)) = 0.$
Then, the following property holds: $f(x(t)) - \min_{\mathcal{H}} f = \mathcal{O}\big( e^{-\sqrt{\mu}\, t} \big)$ as $t \to +\infty$.
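As a numerical sanity check of this exponential decay, one can integrate the heavy ball dynamic for a model strongly convex quadratic (a hypothetical one-dimensional instance, not taken from the paper); in this critically damped case a closed-form solution is available for comparison.

```python
import math

# Hypothetical 1D instance: f(x) = (mu/2) x^2 is mu-strongly convex, and
#     x''(t) + 2*sqrt(mu) x'(t) + f'(x(t)) = 0
# is then critically damped, with closed-form solution
#     x(t) = (x0 + (v0 + sqrt(mu) x0) t) exp(-sqrt(mu) t).

mu = 1.0

def f(x):
    return 0.5 * mu * x * x

def grad_f(x):
    return mu * x

def heavy_ball(x0, v0, T, dt=1e-3):
    """Semi-implicit Euler integration of the heavy ball dynamic."""
    x, v, t = x0, v0, 0.0
    while t < T:
        v += dt * (-2.0 * math.sqrt(mu) * v - grad_f(x))
        x += dt * v
        t += dt
    return x

x0, v0, T = 1.0, 0.0, 10.0
xT = heavy_ball(x0, v0, T)
exact = (x0 + (v0 + math.sqrt(mu) * x0) * T) * math.exp(-math.sqrt(mu) * T)

print(abs(xT - exact))                          # small discretization error
print(f(xT) <= math.exp(-math.sqrt(mu) * T))    # exponential decay of the values
```

For this quadratic the decay is in fact faster than the general $e^{-\sqrt{\mu}\, t}$ guarantee, which the last line confirms.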
To adapt this result to the case of a general convex differentiable function $f$, a natural idea is to use Tikhonov's method of regularization. This leads to consider the non-autonomous dynamic which at time $t$ is governed by the gradient of the strongly convex function
$\varphi_t(x) := f(x) + \frac{\varepsilon(t)}{2} \|x\|^2.$
Then, replacing $\nabla f$ by $\nabla \varphi_t$ in (2), and noticing that $\varphi_t$ is $\varepsilon(t)$-strongly convex, this gives the following dynamic which was introduced in AL ($\delta$ is a positive parameter):
(TRIGS)   $\ddot{x}(t) + \delta \sqrt{\varepsilon(t)}\, \dot{x}(t) + \nabla f(x(t)) + \varepsilon(t)\, x(t) = 0.$
(TRIGS) stands shortly for Tikhonov regularization of inertial gradient systems.
In order not to asymptotically modify the equilibria, it is supposed that $\varepsilon(t) \to 0$ as $t \to +\infty$ (this is the key property of the asymptotic version, $\varepsilon(t) \to 0$, of the Browder-Tikhonov regularization method). This condition implies that (TRIGS) falls within the framework of the inertial gradient systems with asymptotically vanishing damping.
The importance of this class of inertial dynamics has been highlighted by several recent studies
AAD1 , ABotCest , AC10 , ACPR , AP , CD , SBC , which make the link with the accelerated gradient method of Nesterov Nest1 ; Nest2 .
The control of the decay of $\varepsilon(t)$ to zero as $t \to +\infty$ plays a key role in the Lyapunov analysis of (TRIGS), and uses the following condition.
Definition 1
Let $\lambda > 0$ be given. We say that $\varepsilon(\cdot)$ satisfies the controlled decay property $(\mathrm{CD})_\lambda$ if it is a nonincreasing function which satisfies: there exists $t_1 \ge t_0$ such that for all $t \ge t_1$
(3)   $\dot{\varepsilon}(t) \ge -\lambda\, \varepsilon(t)^{3/2},$
where $\lambda$ is a parameter subject to an upper bound which depends on $\delta$, according to the two cases distinguished in Theorem 2.1.
By integrating the differential inequality (3), one can easily verify that this condition implies that $\varepsilon(t)$ is greater than or equal to a positive multiple of $1/t^2$ for $t$ large. Since the damping coefficient is proportional to $\sqrt{\varepsilon(t)}$, this means that it must be greater than or equal to a positive multiple of $1/t$. This is in accordance with the theory of inertial gradient systems with time-dependent viscosity coefficient, which states that the asymptotic optimization property is valid provided that the integral on $[t_0, +\infty[$ of the viscous damping coefficient is infinite, see AC10 , CEG . Let us state the following convergence result obtained in AL .
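The integration step can be spelled out. The display below assumes that inequality (3) takes the controlled decay form $\dot{\varepsilon}(t) \ge -\lambda\, \varepsilon(t)^{3/2}$ for $t \ge t_1$ (a reconstruction consistent with the bound just mentioned, since the original display did not survive):

```latex
% Assumed form of (3):  \dot{\varepsilon}(t) \ge -\lambda\,\varepsilon(t)^{3/2}
% for all t \ge t_1, with \varepsilon(\cdot) nonincreasing. Then
\frac{d}{dt}\, \varepsilon(t)^{-1/2}
  \;=\; -\frac{\dot{\varepsilon}(t)}{2\, \varepsilon(t)^{3/2}}
  \;\le\; \frac{\lambda}{2}
\quad\Longrightarrow\quad
\varepsilon(t)^{-1/2} \;\le\; \varepsilon(t_1)^{-1/2} + \frac{\lambda}{2}\,(t - t_1)
\quad\Longrightarrow\quad
\varepsilon(t) \;\ge\; \left( \varepsilon(t_1)^{-1/2} + \frac{\lambda}{2}\,(t - t_1) \right)^{-2}.
```

Hence $\varepsilon(t)$ is bounded below by a positive multiple of $1/t^2$ for large $t$, and the damping coefficient $\sqrt{\varepsilon(t)}$ by a positive multiple of $1/t$.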
Theorem 1.2
(Attouch-László AL ) Let $x(\cdot)$ be a solution trajectory of (TRIGS). Let $\delta$ be a positive parameter. Suppose that $\varepsilon(\cdot)$ satisfies the condition $(\mathrm{CD})_\lambda$ for some $\lambda > 0$. Then, we have the following rate of convergence of the values: for all $t \ge t_1$
(4) |
where
The proof is based on the following Lyapunov function
(5) |
where the auxiliary function involved is chosen appropriately. Based on this Lyapunov analysis, it is proved in AL that $\liminf_{t \to +\infty} \| x(t) - x^\ast \| = 0$, where $x^\ast$ is the element of minimum norm of $S$. We will improve this result, and show that $\lim_{t \to +\infty} \| x(t) - x^\ast \| = 0$. For this, we will develop a new Lyapunov analysis. Let us first recall some related previous results showing the progression of the understanding of these delicate questions, where the hierarchical minimization property is reached asymptotically.
1.2 Historical facts and related results
In relation to hierarchical optimization, a rich literature has been devoted to the coupling of dynamic gradient systems with Tikhonov regularization, and to the study of the corresponding algorithms.
1.2.1 First-order gradient dynamics
For first-order gradient systems and subdifferential inclusions, the asymptotic hierarchical minimization property which results from the introduction of a vanishing viscosity term in the dynamic (in our context the Tikhonov approximation Tikh ; TA ) has been highlighted in a series of papers AlvCab , Att2 , AttCom , AttCza2 , BaiCom , CPS , Hirstoaga . In a parallel way, there is a vast literature on convex descent algorithms involving Tikhonov and more general penalty or regularization terms. The historical evolution can be traced back to Fiacco and McCormick FM , and the interpretation of interior point methods with the help of a vanishing logarithmic barrier. Some more specific references for the coupling of Prox and Tikhonov can be found in Cominetti Com . The time discretization of the first-order gradient systems and subdifferential inclusions involving multiscale (in time) features provides a natural link between the continuous and discrete dynamics. The resulting algorithms combine proximal based methods (for example forward-backward algorithms), with the viscosity of penalization methods, see AttCzaPey1 , AttCzaPey2 , BotCse1 , Cabot-inertiel ; Cab , Hirstoaga .
1.2.2 Second order gradient dynamics
The first studies concerning the coupling of damped inertial dynamics with Tikhonov approximation dealt with the heavy ball with friction system of Polyak Polyak , where the damping coefficient $\gamma > 0$ is fixed. In AttCza1 Attouch-Czarnecki considered the system
(6)   $\ddot{x}(t) + \gamma\, \dot{x}(t) + \nabla f(x(t)) + \varepsilon(t)\, x(t) = 0.$
In the slow parametrization case $\int_{t_0}^{+\infty} \varepsilon(t)\, dt = +\infty$, they proved that any solution of (6) converges strongly to the minimum norm element of $S$, see also JM-Tikh . A parallel study has been developed for PDE's, see AA for damped hyperbolic equations with non-isolated equilibria, and AlvCab for semilinear PDE's. The system (6) is a special case of the general dynamic model
(7) |
which involves two functions and intervening with different time scale. When tends to zero moderately slowly, it was shown in Att-Czar-last that the trajectories of (7) converge asymptotically to equilibria that are solutions of the following hierarchical problem: they minimize the function on the set of minimizers of . When is a product space, defining for , and , where the are linear operators, (7) provides (weakly) coupled inertial systems. The continuous and discrete-time versions of these systems have a natural connection to the best response dynamics for potential games AttCza2 , domain decomposition for PDE’s abc2 , optimal transport abc , coupled wave equations HJ2 .
In the quest for a faster convergence, the following system
(8)   $\ddot{x}(t) + \frac{\alpha}{t}\, \dot{x}(t) + \nabla f(x(t)) + \varepsilon(t)\, x(t) = 0$
has been studied by Attouch-Chbani-Riahi ACR . It is a Tikhonov regularization of the dynamic
(9)   $\ddot{x}(t) + \frac{\alpha}{t}\, \dot{x}(t) + \nabla f(x(t)) = 0,$
which was introduced by Su, Boyd and Candès in SBC . When $\alpha = 3$, (9) can be viewed as a continuous version of the accelerated gradient method of Nesterov. It has been the subject of many recent studies which have given an in-depth understanding of the Nesterov acceleration method, see AAD1 , AC10 , ACPR , SBC , Siegel , WRJ .
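The fast decay of values along (9) can be observed numerically. The sketch below uses a hypothetical one-dimensional quadratic (the function, time grid and constants are illustration choices, not taken from the paper):

```python
import math

# Hypothetical 1D instance: f(x) = x^2/2 and alpha = 3, the threshold value
# for which (9) is the continuous counterpart of Nesterov's accelerated
# gradient method.
alpha = 3.0

def f(x):
    return 0.5 * x * x

def grad_f(x):
    return x

def avd(x0, v0, t0, T, dt=1e-3):
    """Semi-implicit Euler integration of x'' + (alpha/t) x' + grad f(x) = 0."""
    x, v, t = x0, v0, t0
    while t < T:
        v += dt * (-(alpha / t) * v - grad_f(x))
        x += dt * v
        t += dt
    return x

T = 100.0
xT = avd(1.0, 0.0, 1.0, T)
gap = f(xT)              # f(x(T)) - min f, since min f = 0
print(gap, 1.0 / T**2)   # the gap sits well below the O(1/t^2) benchmark here
```

For this quadratic the actual decay is even faster than the worst-case $\mathcal{O}(1/t^2)$ rate, which is consistent with the printed comparison.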
1.3 Model results
Let us illustrate our results in the case $\varepsilon(t) = 1/t^r$. In section 3, we will prove the following result:
Theorem 1.3
Take , . Let be a solution trajectory of
Then, we have the following rates of convergence for the values and the trajectory:
(10) | |||
(11) |
According to the strong convergence of $x_{\varepsilon(t)}$ to the minimum norm solution, the above property implies that $x(t)$ strongly converges to the minimum norm solution.
In many respects, these results represent an important advance compared to previous works:
i) Let us underline that in our approach the rapid convergence of the values and the strong convergence towards the solution of minimum norm are obtained for the same dynamic, whereas in the previous works ACR , AttCza1 , they are obtained for different dynamics corresponding to different settings of the parameters. Moreover, we obtain the strong convergence of the trajectories to the minimum norm solution, whereas in AL and ACR it was only obtained that $\liminf_{t \to +\infty} \| x(t) - x^\ast \| = 0$. It is clear that the results extend naturally to obtaining strong convergence towards the solution closest to a desired state $x_d$: it suffices to replace $\|x\|^2$ in Tikhonov's approximation by $\|x - x_d\|^2$. This is important for inverse problems. In addition, we obtain a convergence rate of the values which is better than the one obtained in AL .
ii) These results show the trade-off between the property of rapid convergence of the values, and the property of strong convergence towards the minimum norm solution. The two rates of convergence move in opposite directions as the parameter $r$ varies. The determination of a good compromise between these two antagonistic criteria is an interesting subject that we will consider later.
iii) Note that at the limit, when $\varepsilon(t)$ decays like $1/t^2$, which is the most interesting case to obtain a fast convergence of values comparable to the accelerated gradient method of Nesterov, our analysis does not allow us to conclude that the trajectories converge towards the solution of minimum norm. This question remains open; the interested reader can consult AL .
1.4 Contents
The paper is organized as follows. In section 2, for a general Tikhonov regularization parameter $\varepsilon(t)$, we study the asymptotic convergence properties of the solution trajectories of (TRIGS). Based on Lyapunov analysis, we show their strong convergence to the element of minimum norm of $S$, and we provide the convergence rate of the values. In section 3, we apply these results to the particular case $\varepsilon(t) = 1/t^r$. Section 4 considers the extension of these results to the nonsmooth case. Section 5 contains numerical illustrations. We conclude in section 6 with some perspectives and open questions.
2 Convergence analysis for general
We are going to analyze, via Lyapunov analysis, the convergence properties as $t \to +\infty$ of the solution trajectories of the inertial dynamic (TRIGS), which we recall below:
(12)   $\ddot{x}(t) + \delta \sqrt{\varepsilon(t)}\, \dot{x}(t) + \nabla f(x(t)) + \varepsilon(t)\, x(t) = 0.$
Throughout the paper, we assume that $t_0 > 0$ is the origin of time, and $\delta$ is a positive parameter. For each $t \ge t_0$, let us introduce the function $\varphi_t : \mathcal{H} \to \mathbb{R}$ defined by
(13)   $\varphi_t(x) := f(x) + \frac{\varepsilon(t)}{2} \|x\|^2,$
and set
$x_{\varepsilon(t)} := \operatorname{argmin}_{\mathcal{H}} \varphi_t,$
which is the unique minimizer of the strongly convex function $\varphi_t$. The first order optimality condition gives
(14)   $\nabla f\big(x_{\varepsilon(t)}\big) + \varepsilon(t)\, x_{\varepsilon(t)} = 0.$
The following properties are immediate consequences of the classical properties of the Tikhonov regularization:
(15)   $\| x_{\varepsilon(t)} \| \le \| x^\ast \|$ for all $t \ge t_0$, where $x^\ast$ is the element of minimum norm of $S$;
(16)   $x_{\varepsilon(t)} \to x^\ast$ strongly in $\mathcal{H}$, as $t \to +\infty$.
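The classical selection properties of the Tikhonov regularization can be checked on a toy problem. The sketch below assumes the hypothetical example $f(x_1, x_2) = \frac{1}{2}(x_1 + x_2 - 1)^2$, for which the regularized minimizer has a closed form:

```python
import math

# Hypothetical example: f(x1, x2) = (1/2)(x1 + x2 - 1)^2. argmin f is the
# line x1 + x2 = 1, whose minimum norm element is x_star = (1/2, 1/2).
# The optimality condition grad f(x) + eps*x = 0 gives, by symmetry, the
# closed-form regularized minimizer x_eps = (a, a) with a = 1/(2 + eps).

def x_eps(eps):
    a = 1.0 / (2.0 + eps)
    return (a, a)

x_star = (0.5, 0.5)

def norm(p):
    return math.hypot(p[0], p[1])

# The Tikhonov path stays inside the ball of radius ||x_star|| and
# converges to x_star as eps -> 0.
norms = [norm(x_eps(e)) for e in (1.0, 0.1, 0.01, 0.001)]
print(norms, norm(x_star))
print(norm((x_eps(1e-4)[0] - x_star[0], x_eps(1e-4)[1] - x_star[1])))
```

The printed norms increase monotonically towards $\|x^\ast\|$ while staying below it, and the last distance is already small for $\varepsilon = 10^{-4}$.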
2.1 Preparatory results for Lyapunov analysis
Let us introduce the real-valued function that plays a key role in our Lyapunov analysis. It is defined by
(17) |
where has been defined in (13), and
(18) |
The time dependent parameter will be adjusted during the proof. We will show that, under a judicious choice of the parameters, the energy function is decreasing. Moreover, we will estimate its rate of convergence towards zero. This will provide the rates of convergence of values and trajectories, as the following lemma shows.
Lemma 1
Let $x(\cdot)$ be a solution trajectory of the damped inertial dynamic (TRIGS), and let the energy function be defined as in (17). Then, the following inequalities are satisfied: for any $t \ge t_0$,
(19) | |||
(20) |
Therefore, $x(t)$ converges strongly to the minimum norm solution as soon as
Proof
According to the definition of , we have
By definition of we have
(21) |
which, combined with the above inequality, gives (19).
To estimate the energy function, we will show that it satisfies a first order differential inequality of the form
(22)
where the coefficients are positive functions that will be made precise in the proof. So the first step of the proof is to compute the time derivative of the energy function. This computation involves two terms, which are evaluated in the lemma below.
Lemma 2
For each , we have
-
-
.
Therefore,
Proof
We have
Since (see (Att-book, , Lemma 3.27), (Att2, , Corollary 6.2)), we have:
Therefore,
(23) |
On the other hand, we have
Since we get . This combined with (23) gives
We have
According to the monotonicity of we have
which implies
After division by we obtain
By letting we get
which completes the proof.
2.2 Main result
Given a general parameter $\varepsilon(t)$, let us proceed with the Lyapunov analysis.
Theorem 2.1
Suppose that is a convex function of class . Let be a solution trajectory of the system (TRIGS) with
Let us assume that there exists and such that for all
where is such that
Then, the following property holds:
(24) |
where and .
Proof
According to the classical chain rule and Lemma 2, differentiation of the energy function gives
(25) |
Using the constitutive equation (12), we have
Therefore,
(26) | |||||
Since is strongly convex, we have
Using the above inequality, we get
(27) |
Moreover, we have for any
and for any
By combining the two above inequalities with (25), (26) and (27), and after reduction we obtain
(28) | |||||
On the other hand, for a positive function we have
(29) |
By adding (28) and (29), we get
(30) | |||||
As we do not know a priori the sign of this term, we take the coefficient in front of it equal to zero, which gives
(31) |
Let us make the following choice of the time dependent parameter introduced in (18) (indeed, it is a key ingredient of our Lyapunov analysis)
where is a positive parameter to be fixed. Then, the relation (31) can be equivalently written
According to this choice for and and neglecting the term which is less than or equal to zero, the inequality (30) becomes
(32) |
According to item of Lemma 2, and inequality (15)
Using again inequality (15), and the fact that , we have
By inserting the two inequalities above in (32), we obtain
(33) |
By taking with we get
(34) |
We are looking for sufficient conditions on the parameters which make the coefficients above less than or equal to zero. It is here that the hypothesis formulated in the statement of Theorem 2.1 intervenes. It is recalled below for the convenience of the reader:
There exists and such that for all
where is such that for and for
Clearly, condition is equivalent to
Under condition we immediately obtain that and are less than or equal to zero:
Let us now examine .
When we have
When we have because .
This implies that
According to (34), under condition we conclude that
(35) |
By multiplying the inequality above with we obtain
(36) |
By integrating (36) on , we get
This completes the proof of the Lyapunov analysis.
Remark 1
Given , the condition gives the admissible values of the parameters and which enter into the convergence rates obtained in Theorem 2.1. Let us verify that the inequalities that define the values of these parameters are consistent. We consider successively the two cases , then .
a) When , we have for sufficiently large. Because , we have , and hence the interval is nonempty. Therefore, in this case the conditions are consistent.
b) Suppose now that . Then for all , and we can argue with arbitrarily large. Let us verify that we can find and such that
(37) |
and hence a value of belonging to the corresponding interval. By letting and in the above inequality we obtain
which is equivalent to , and hence is satisfied.
By a continuity argument, we obtain that the inequality
(37) is satisfied by taking and sufficiently large.
Note that, since we are interested in asymptotic results, the important point is to obtain the existence of parameters for which the Lyapunov analysis is valid. If we were interested in complexity results, then the precise value of these parameters would be important.
Remark 2
The above argument shows that the controlled decay property used in AL corresponds to the limiting case and in our condition .
Remark 3
As in AL , our Lyapunov analysis is valid for an arbitrary choice of the parameter . It would be interesting to know what is the best choice for .
3 Particular cases
Let us study the case $\varepsilon(t) = 1/t^r$, and discuss, according to the value of the parameter $r$, the convergence rate of the values, and the convergence rate to zero of $\| x(t) - x_{\varepsilon(t)} \|$. The following results were stated in Theorem 1.3, in the introduction, as model results. We reproduce them here for the convenience of the reader. The point is simply to apply the general Theorem 2.1 to this particular situation, and to show that the different quantities involved in the convergence results can be computed explicitly.
Theorem 3.1
Take , . Let be a solution trajectory of
Then, we have convergence of the values, and strong convergence to the minimum norm solution with the following rates:
(38) | |||
(39) |
Proof
a) Convergence rate of the values: With the notations of Theorem 2.1, we have
So
(40) | |||||
where Let us choose the parameters , such that
for and for .
Notice that, in these two cases, we have . Therefore, . On the other hand, since , we have . So we have, for large enough,
As a consequence, the condition is satisfied.
According to (24), we have where
(41) |
and
(42) |
According to (40) we have
Since and , we have that tends to zero at an exponential rate, as .
Thus, we only have to focus on the asymptotic behavior of . Let us simplify the formula
by setting in (24). Replacing and by their values in
(41) gives
To compute the above integral, we notice that
Then, note that, when we set such that , we obtain
Let us verify that the last above inequality is satisfied for large enough. First, since , we have , and hence . On the other hand, according to we have . This property combined with the choice of implies
Therefore . So, for large enough
Since has an exponential decay to zero, we deduce that there exists a positive constant such that for large enough
According to Lemma 1, we get
Since we have . We conclude that
b) Convergence rate to zero of . According to Lemma 1, we have
(43) |
and since, for large enough, , we obtain
which completes the proof.
Remark 4
The convergence rate of the values obtained in Theorem 3.1 notably improves the result obtained in AL , where the convergence rate was of a lower order. In addition, we have obtained that, for any trajectory of (TRIGS), there is strong convergence of the trajectory to the minimum norm solution, as time tends toward $+\infty$. In AL it was only obtained that $\liminf_{t \to +\infty} \| x(t) - x^\ast \| = 0$.
Remark 5
A close look at the proof of Theorem 3.1 shows that the convergence rate of the values is still valid in the case . Precisely, by taking with sufficiently small, we have that the condition is satisfied, and hence
Remark 6
As a key ingredient of our proof of the strong convergence of the trajectories of (TRIGS) to the element of minimum norm of $S$, we show that $\| x(t) - x_{\varepsilon(t)} \| \to 0$ as $t \to +\infty$. This strategy, which consists in showing that the trajectory is not too far from the viscosity curve $t \mapsto x_{\varepsilon(t)}$, was already present in the approach developed by Attouch and Cominetti in AttCom for the study of similar questions in the case of the steepest descent method.
3.1 Trade-off between the convergence rate of values and trajectories
The following elementary diagram shows the respective evolution, as the parameter $r$ varies, of the convergence rate of the values and the convergence rate of $\| x(t) - x_{\varepsilon(t)} \|$.
We observe that an intermediate value of $r$ is a good compromise between these two antagonistic properties. Let us state the corresponding result below.
Corollary 1
Take , . Let be a solution trajectory of
Then, we have convergence of the values, and strong convergence to the minimum norm solution with the following rates:
b) Another interesting case is to take , with close to . In this case, we have a convergence rate of the values which is arbitrarily close to the convergence rate of the accelerated gradient method of Nesterov, with a guarantee of strong convergence towards the minimum norm solution. The case has been studied extensively in AL . The strong convergence of the trajectories to the minimum norm solution is an open question in the case .
c) Estimating the convergence rate of $x(t)$ to $x^\ast$ relies on getting information about the viscosity trajectory $t \mapsto x_{\varepsilon(t)}$, and how fast it converges to $x^\ast$ as $t \to +\infty$. This is a difficult problem, because the viscosity trajectory can have an infinite length, as Torralba showed in Torralba . His counterexample involves the construction of a convex function from its sub-level sets, and relates to a convex function whose sub-level sets vary greatly. Our analysis concerns a general convex function $f$, i.e. the worst case. This suggests that, under good geometric properties of $f$, such as the Kurdyka-Lojasiewicz property, one should be able to obtain better results; see also AttCom , where the importance of this kind of question for the coupling of the steepest descent with Tikhonov approximation is shown.
4 Non-smooth case
Let us extend the previous results to the case of a proper lower semicontinuous and convex function $f : \mathcal{H} \to \mathbb{R} \cup \{+\infty\}$. We rely on the basic properties of the Moreau envelope. Recall that, for any $\theta > 0$, the Moreau envelope $f_\theta$ is defined by
(44)   $f_\theta(x) := \min_{\xi \in \mathcal{H}} \Big\{ f(\xi) + \frac{1}{2\theta} \| x - \xi \|^2 \Big\}.$
Then $f_\theta$ is a convex differentiable function, whose gradient is $\frac{1}{\theta}$-Lipschitz continuous, and such that $\min_{\mathcal{H}} f_\theta = \min_{\mathcal{H}} f$, $\operatorname{argmin}_{\mathcal{H}} f_\theta = \operatorname{argmin}_{\mathcal{H}} f$. Denoting by $\operatorname{prox}_{\theta f}(x)$ the unique point where the minimum value is achieved in (44), let us recall the following classical formulae:
1. $\nabla f_\theta(x) = \frac{1}{\theta} \big( x - \operatorname{prox}_{\theta f}(x) \big)$;
2. $f_\theta(x) = f\big( \operatorname{prox}_{\theta f}(x) \big) + \frac{1}{2\theta} \big\| x - \operatorname{prox}_{\theta f}(x) \big\|^2$.
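The classical Moreau envelope formulae can be verified numerically on the model nonsmooth function $f(x) = |x|$ on $\mathbb{R}$ (a standard illustration, not taken from the paper), whose proximal mapping is soft thresholding and whose envelope is the Huber function:

```python
import math

# Standard illustration: f(x) = |x| on R, theta = 1/2.
theta = 0.5

def f(x):
    return abs(x)

def prox(x):
    # prox_{theta f}(x) = argmin_y { |y| + (1/(2 theta)) (x - y)^2 }
    # = soft thresholding at level theta
    return math.copysign(max(abs(x) - theta, 0.0), x)

def envelope(x):
    # Moreau envelope, evaluated via the second classical formula
    p = prox(x)
    return f(p) + (x - p) ** 2 / (2.0 * theta)

def grad_envelope(x):
    # first classical formula: (1/theta)(x - prox_{theta f}(x))
    return (x - prox(x)) / theta

# min and argmin are preserved by the envelope: both are attained at 0
print(envelope(0.0), prox(0.0))

# finite-difference check of the gradient formula at x = 2
x, h = 2.0, 1e-6
fd = (envelope(x + h) - envelope(x - h)) / (2.0 * h)
print(abs(fd - grad_envelope(x)))
```

For $|x| \ge \theta$ the envelope equals $|x| - \theta/2$ (the linear Huber branch), and the finite difference agrees with the gradient formula to numerical precision.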
The interested reader may refer to BC ; Bre1 for a comprehensive treatment of the Moreau envelope in a Hilbert setting. Since the set of minimizers is preserved by taking the Moreau envelope, the idea is to replace $f$ by $f_\theta$ in the inertial dynamic (TRIGS). Then, (TRIGS) applied to $f_\theta$ now reads
$\ddot{x}(t) + \delta \sqrt{\varepsilon(t)}\, \dot{x}(t) + \nabla f_\theta(x(t)) + \varepsilon(t)\, x(t) = 0.$
Clearly, since $f_\theta$ is continuously differentiable, the hypotheses are satisfied by the above dynamic. By applying Theorem 3.1 to $f_\theta$, we get the following result, which provides the convergence rate of the values and strong convergence to the minimum norm solution.
Theorem 4.1
Let be a convex, lower semicontinuous, proper function. Take , , and . Let be a solution trajectory of , i.e.
Then, we have the following convergence rates: as
(45) | |||
(46) | |||
(47) |
Proof
By applying Theorem 3.1 to the function , and since , we get
(48) | |||
(49) |
According to , we get
which gives the claims.
Remark 7
The above result suggests that, in the case of a nonsmooth convex function $f$, the corresponding proximal algorithms will inherit the convergence properties of the continuous dynamic (TRIGS). When considering convex minimization problems with additive structure $\min \{ f(x) + g(x) \}$, with $f$ smooth and $g$ nonsmooth, it is in general difficult to compute the proximal mapping of $f + g$. A common device then consists of using a splitting method, and writing the minimization problem as the fixed point problem associated with the operator
$T := \operatorname{prox}_{\theta g} \circ \big( I - \theta \nabla f \big), \qquad \theta > 0.$
Under appropriate conditions, $T$ is an averaged nonexpansive operator BC , so the associated iterative method (proximal gradient method) converges to a fixed point of $T$, and therefore to a solution of the initial minimization problem. In our context, this naturally leads to studying the inertial system
Many properties of the Tikhonov approximation are still valid for maximally monotone operators, which allows to expect good convergence properties for the above system. This is a subject for further research.
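The splitting device described in the remark can be sketched on a toy one-dimensional instance (hypothetical choices of $f$, $g$ and step size):

```python
import math

# Toy 1D instance: f(x) = (1/2)(x - b)^2 (smooth, L = 1), g(x) = lam*|x|
# (nonsmooth), step size theta in (0, 2/L). The fixed points of
# T = prox_{theta g} o (I - theta grad f) are exactly the minimizers of
# f + g; here that minimizer is the soft threshold of b at level lam.
b, lam, theta = 2.0, 0.5, 0.5

def soft(x, tau):
    # prox of tau * |.|
    return math.copysign(max(abs(x) - tau, 0.0), x)

def T(x):
    # forward (gradient) step on f, backward (proximal) step on g
    return soft(x - theta * (x - b), theta * lam)

x = 0.0
for _ in range(100):
    x = T(x)

exact = soft(b, lam)   # = 1.5 for these values
print(x, exact)
```

The iteration contracts with factor $|1 - \theta|$ once it reaches the smooth branch, so a hundred iterations agree with the closed-form minimizer to machine precision.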
5 Numerical illustration
Let us illustrate our results in the following elementary situation. Take $\mathcal{H} = \mathbb{R}^2$ equipped with the classical Euclidean structure. The function $f$ is convex, but not strongly convex: its solution set $S$ is an entire affine subspace of $\mathbb{R}^2$, and the minimum norm solution is the projection of the origin onto $S$.
[Figure 1]
The numerical experiments described in Figure 1 are in agreement with our theoretical findings. They show the trade-off between the convergence rates of the values and of the trajectories, and that an intermediate value of $r$ is a good compromise. We limit our illustration to the most interesting case for obtaining a good convergence rate of the trajectories towards the minimum norm solution. We also notice that the Tikhonov regularization term in the system (TRIGS) reduces the oscillations. This suggests introducing the Hessian driven damping into these dynamics to further dampen oscillations, see ACFR , APR , BCL and references therein. This is related to the notion of strong damping for PDE's.
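An experiment of this kind can be reproduced in a few lines. The sketch below makes illustrative assumptions (the exact test function of Figure 1 is not shown): the (TRIGS) form $\ddot{x} + \delta\sqrt{\varepsilon(t)}\,\dot{x} + \nabla f(x) + \varepsilon(t)\, x = 0$ with $\varepsilon(t) = 1/t$ and $\delta = 2$, applied to the toy function $f(x_1, x_2) = \frac{1}{2}(x_1 + x_2 - 1)^2$, whose solution set is the line $x_1 + x_2 = 1$ with minimum norm solution $(1/2, 1/2)$:

```python
import math

# Illustrative toy setup, not the exact experiment of Figure 1.
delta = 2.0

def eps(t):
    return 1.0 / t

def grad_f(x):
    s = x[0] + x[1] - 1.0
    return (s, s)

def trigs(x, v, t0, T, dt=1e-2):
    """Semi-implicit Euler integration of
    x'' + delta*sqrt(eps(t)) x' + grad f(x) + eps(t) x = 0."""
    t = t0
    while t < T:
        g = grad_f(x)
        e, d = eps(t), delta * math.sqrt(eps(t))
        v = (v[0] + dt * (-d * v[0] - g[0] - e * x[0]),
             v[1] + dt * (-d * v[1] - g[1] - e * x[1]))
        x = (x[0] + dt * v[0], x[1] + dt * v[1])
        t += dt
    return x

x_final = trigs((3.0, -1.0), (0.0, 0.0), 1.0, 500.0)
dist = math.hypot(x_final[0] - 0.5, x_final[1] - 0.5)
print(x_final, dist)   # the trajectory ends close to the minimum norm solution
```

Although the starting point $(3, -1)$ is far from the origin, the trajectory is driven towards the projection of the origin onto the solution set, as the theory predicts.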
6 Conclusion, perspective
In the general framework of convex optimization in Hilbert spaces, we have introduced a damped inertial dynamic which generates trajectories rapidly converging towards the minimum norm solution. We obtained these results by removing restrictive assumptions concerning the convergence of trajectories made in previous works. This seems to be the first time that these two properties have been obtained for the same inertial dynamic. We have developed an in-depth mathematical Lyapunov analysis of the dynamic, which is a valuable tool for the development of corresponding results for algorithms obtained by temporal discretization. Precisely, the corresponding algorithmic study will be the subject of a subsequent work. Many interesting questions, such as the introduction of Hessian-driven damping to attenuate oscillations, and the study of the impact of error disturbances, merit further study. These results also adapt well to inverse problems, for which strong convergence of the trajectories and obtaining a solution close to a desired state are key properties. It is likely that a parallel approach can be developed for constrained optimization, for multiobjective optimization via the dynamical approach to Pareto optima, and within the framework of potential games. The Lyapunov analysis developed in this paper could also be very useful to study the asymptotic stabilization of several classes of PDE's, for example nonlinear damped wave equations.
Appendix A Auxiliary results
A.1 Existence and uniqueness for the Cauchy problem, energy estimates
Let us first show that the Cauchy problem for (TRIGS) is well posed. The proof relies on classical arguments combining the Cauchy-Lipschitz theorem with energy estimates. The following proof has been given in AL . We reproduce it for the convenience of the reader, and give supplementary energy estimates.
Theorem A.1
Let us make the assumptions on $f$ and $\varepsilon(\cdot)$ stated in the introduction. Then, for any Cauchy data $(x_0, v_0) \in \mathcal{H} \times \mathcal{H}$, there exists a unique global classical solution $x : [t_0, +\infty[ \to \mathcal{H}$ of the Cauchy problem
(50) |
In addition, the global energy function $W : [t_0, +\infty[ \to \mathbb{R}$ is decreasing, where
$W(t) := \frac{1}{2} \| \dot{x}(t) \|^2 + f(x(t)) + \frac{\varepsilon(t)}{2} \| x(t) \|^2,$
and we have the energy estimate
(51) |
Proof
Consider the Hamiltonian formulation of (50), which gives the first order system
(52)   $\dot{x}(t) = v(t), \qquad \dot{v}(t) = -\delta \sqrt{\varepsilon(t)}\, v(t) - \nabla f(x(t)) - \varepsilon(t)\, x(t), \qquad (x(t_0), v(t_0)) = (x_0, v_0).$
According to the hypotheses on $f$ and $\varepsilon(\cdot)$, and by applying the Cauchy-Lipschitz theorem in the locally Lipschitz case, we obtain the existence and uniqueness of a local solution of the Cauchy problem (52). Then, in order to pass from a local solution to a global solution, we use energy estimates. By taking the scalar product of (TRIGS) with $\dot{x}(t)$, we obtain
(53)   $\frac{d}{dt} W(t) = -\delta \sqrt{\varepsilon(t)}\, \| \dot{x}(t) \|^2 + \frac{\dot{\varepsilon}(t)}{2}\, \| x(t) \|^2,$
where $W(t) := \frac{1}{2} \| \dot{x}(t) \|^2 + f(x(t)) + \frac{\varepsilon(t)}{2} \| x(t) \|^2$. Since $\varepsilon(\cdot)$ is non-increasing, both terms of the right-hand side are nonpositive. Therefore, the energy function $W$ is decreasing.
The end of the proof follows a standard argument. Take a maximal solution defined on an interval $[t_0, T[$. If $T$ is infinite, the proof is over. Otherwise, if $T$ is finite, according to the above energy estimate, $W$ remains bounded on $[t_0, T[$; hence $\dot{x}(\cdot)$ and $x(\cdot)$ remain bounded, just like $\ddot{x}(\cdot)$ (use (TRIGS)). Therefore, the limits of $x(t)$ and $\dot{x}(t)$ exist as $t \to T$. Applying the local existence result at $T$ with the initial conditions thus obtained gives a contradiction to the maximality of the solution.
Let us complete the proof with the energy estimates. Returning to (53), and integrating on $[t_0, t]$, we get
(54)   $W(t) + \delta \int_{t_0}^{t} \sqrt{\varepsilon(s)}\, \| \dot{x}(s) \|^2\, ds + \int_{t_0}^{t} \frac{-\dot{\varepsilon}(s)}{2}\, \| x(s) \|^2\, ds = W(t_0).$
References
- (1) F. Alvarez, H. Attouch, Convergence and asymptotic stabilization for some damped hyperbolic equations with non-isolated equilibria, ESAIM Control Optim. Calc. Var. 6 (2001), 539–552.
- (2) F. Alvarez, A. Cabot, Asymptotic selection of viscosity equilibria of semilinear evolution equations by the introduction of a slowly vanishing term, Discrete Contin. Dyn. Syst. 15 (2006), 921–938.
- (3) V. Apidopoulos, J.-F. Aujol, Ch. Dossal, The differential inclusion modeling the FISTA algorithm and optimality of convergence rate in the case $b \leq 3$, SIAM J. Optim., 28(1) (2018), 551–574.
- (4) H. Attouch, Variational convergence for functions and operators, Applicable Mathematics Series, Pitman Advanced Publishing Program, 1984.
- (5) H. Attouch, Viscosity solutions of minimization problems, SIAM J. Optim. 6 (3) (1996), 769–806.
- (6) H. Attouch, R.I. Boţ, E.R. Csetnek, Fast optimization via inertial dynamics with closed-loop damping, Journal of the European Mathematical Society (JEMS), 2021, hal-02910307.
- (7) H. Attouch, L.M. Briceño-Arias, P.L. Combettes, A parallel splitting method for coupled monotone inclusions, SIAM J. Control Optim. 48 (5) (2010), 3246–3270.
- (8) H. Attouch, L.M. Briceño-Arias, P.L. Combettes, A strongly convergent primal-dual method for nonoverlapping domain decomposition, Numerische Mathematik, 133(3) (2016), 443–470.
- (9) H. Attouch, A. Cabot, Asymptotic stabilization of inertial gradient dynamics with time-dependent viscosity, J. Differential Equations, 263 (9), (2017), 5412–5458.
- (10) H. Attouch, Z. Chbani, J. Fadili, H. Riahi, First order optimization algorithms via inertial systems with Hessian driven damping, Math. Program. (2020), https://doi.org/10.1007/s10107-020-01591-1.
- (11) H. Attouch, Z. Chbani, J. Peypouquet, P. Redont, Fast convergence of inertial dynamics and algorithms with asymptotic vanishing viscosity, Mathematical Programming, 168 (1-2) (2018), 123–175.
- (12) H. Attouch, Z. Chbani, H. Riahi, Combining fast inertial dynamics for convex optimization with Tikhonov regularization, J. Math. Anal. Appl, 457 (2018), 1065–1094.
- (13) H. Attouch, R. Cominetti, A dynamical approach to convex minimization coupling approximation with the steepest descent method, J. Differential Equations, 128 (2) (1996), 519–540.
- (14) H. Attouch, M.-O. Czarnecki, Asymptotic control and stabilization of nonlinear oscillators with non-isolated equilibria, J. Differential Equations 179 (2002), 278–310.
- (15) H. Attouch, M.-O. Czarnecki, Asymptotic behavior of coupled dynamical systems with multiscale aspects, J. Differential Equations 248 (2010), 1315–1344.
- (16) H. Attouch, M.-O. Czarnecki, J. Peypouquet, Prox-penalization and splitting methods for constrained variational problems, SIAM J. Optim. 21 (2011), 149–173.
- (17) H. Attouch, M.-O. Czarnecki, J. Peypouquet, Coupling forward-backward with penalty schemes and parallel splitting for constrained variational inequalities, SIAM J. Optim. 21 (2011), 1251–1274.
- (18) H. Attouch, M.-O. Czarnecki, Asymptotic behavior of gradient-like dynamical systems involving inertia and multiscale aspects, J. Differential Equations, 262 (3) (2017), 2745–2770.
- (19) H. Attouch, S. László, Convex optimization via inertial algorithms with vanishing Tikhonov regularization: fast convergence to the minimum norm solution, arXiv:2104.11987v1 [math.OC] 24 Apr 2021.
- (20) H. Attouch, J. Peypouquet, The rate of convergence of Nesterov’s accelerated forward-backward method is actually faster than $1/k^2$, SIAM J. Optim., 26(3) (2016), 1824–1834.
- (21) H. Attouch, J. Peypouquet, P. Redont, Fast convex minimization via inertial dynamics with Hessian driven damping, J. Differential Equations, 261(10), (2016), 5734–5783.
- (22) J.-B. Baillon, R. Cominetti, A convergence result for non-autonomous subgradient evolution equations and its application to the steepest descent exponential penalty trajectory in linear programming, J. Funct. Anal. 187 (2001), 263–273.
- (23) H. Bauschke, P. L. Combettes, Convex Analysis and Monotone Operator Theory in Hilbert spaces, CMS Books in Mathematics, Springer, (2011).
- (24) R. I. Boţ, E. R. Csetnek, Forward-Backward and Tseng’s type penalty schemes for monotone inclusion problems, Set-Valued Var. Anal. 22 (2014), 313–331.
- (25) R. I. Boţ, E. R. Csetnek, S.C. László, Tikhonov regularization of a second order dynamical system with Hessian damping, Math. Program. (2020), https://doi.org/10.1007/s10107-020-01528-8.
- (26) H. Brézis, Opérateurs maximaux monotones dans les espaces de Hilbert et équations d’évolution, Lecture Notes 5, North Holland, (1972).
- (27) A. Cabot, Inertial gradient-like dynamical system controlled by a stabilizing term, J. Optim. Theory Appl. 120 (2004) 275–303.
- (28) A. Cabot, Proximal point algorithm controlled by a slowly vanishing term: Applications to hierarchical minimization, SIAM J. Optim. 15 (2) (2005), 555–572.
- (29) A. Cabot, H. Engler, S. Gadat, On the long time behavior of second order differential equations with asymptotically small dissipation, Trans. Amer. Math. Soc. 361 (2009), 5983–6017.
- (30) A. Chambolle, Ch. Dossal, On the convergence of the iterates of FISTA, J. Optim. Theory Appl., 166 (2015), 968–982.
- (31) R. Cominetti, Coupling the proximal point algorithm with approximation methods, J. Optim. Theory Appl. 95 (3) (1997), 581–600.
- (32) R. Cominetti, J. Peypouquet, S. Sorin, Strong asymptotic convergence of evolution equations governed by maximal monotone operators with Tikhonov regularization, J. Differential Equations, 245 (2008), 3753–3763.
- (33) A. Fiacco, G. McCormick, Nonlinear programming: Sequential Unconstrained Minimization Techniques, John Wiley and Sons, New York, (1968).
- (34) A. Haraux, M.A. Jendoubi, A Liapunov function approach to the stabilization of second-order coupled systems, arXiv:1604.06547, 2016.
- (35) S.A. Hirstoaga, Approximation et résolution de problèmes d’équilibre, de point fixe et d’inclusion monotone. PhD thesis, Université Pierre et Marie Curie - Paris VI, 2006, HAL Id: tel-00137228.
- (36) M.A. Jendoubi, R. May, On an asymptotically autonomous system with Tikhonov type regularizing term, Archiv der Mathematik 95 (4) (2010), 389–399.
- (37) Y. Nesterov, A method of solving a convex programming problem with convergence rate $O(1/k^2)$, Soviet Math. Dokl. 27 (1983), 372–376.
- (38) Y. Nesterov, Introductory lectures on convex optimization: A basic course, volume 87 of Applied Optimization. Kluwer Academic Publishers, Boston, MA, 2004.
- (39) B. Polyak, Introduction to Optimization, New York, NY: Optimization Software-Inc, 1987.
- (40) J. W. Siegel, Accelerated first-order methods: Differential equations and Lyapunov functions, arXiv:1903.05671v1 [math.OC], 2019.
- (41) W. Su, S. Boyd, E. J. Candès, A Differential Equation for Modeling Nesterov’s Accelerated Gradient Method: Theory and Insights. NIPS, December 2014.
- (42) A. N. Tikhonov, Solution of incorrectly formulated problems and the regularization method, Doklady Akademii Nauk SSSR, 151 (1963), 501–504; English translation in Soviet Mathematics, 4 (1963), 1035–1038.
- (43) A. N. Tikhonov, V. Y. Arsenin, Solutions of Ill-Posed Problems, Winston, New York, 1977.
- (44) D. Torralba, Convergence epigraphique et changements d’échelle en analyse variationnelle et optimisation, PhD thesis, Université Montpellier, 1996.
- (45) A. C. Wilson, B. Recht, M. I. Jordan, A Lyapunov analysis of momentum methods in optimization, arXiv:1611.02635, 2016.