A relaxed version of Ryu’s three-operator splitting method for structured nonconvex optimization
Abstract
In this work, we propose a modification of Ryu’s splitting algorithm for minimizing the sum of three functions, where two of them are convex with Lipschitz continuous gradients, and the third is an arbitrary proper closed function that is not necessarily convex. The modification is essential to facilitate the convergence analysis, particularly in establishing a sufficient descent property for an associated envelope function. This envelope, tailored to the proposed method, is an extension of the well-known Moreau envelope. Notably, the original Ryu splitting algorithm is recovered as a limiting case of our proposal. The results show that the descent property holds as long as the stepsizes remain sufficiently small. Leveraging this result, we prove global subsequential convergence to critical points of the nonconvex objective.
1 Introduction
Modern optimization applications often exhibit a structured form that is well-suited to the application of decomposition techniques. Operator splitting methods have emerged as powerful tools for efficiently solving complex optimization problems. Instances of operator splitting methods include the forward-backward splitting method Passty79 and the Douglas-Rachford splitting method DR ; LM . These algorithmic schemes have been pivotal in the development of techniques for image reconstruction, signal processing, and machine learning combettes2007douglas ; cai2010split ; combettes2011proximal ; boyd2011distributed ; glowinski2017splitting .
The aforementioned methods were originally designed to solve maximal monotone problems with a two-block structure. In the context of optimization problems, it translates to the sum of two convex functions. Several generalizations to three-block problems have been proposed in the literature, such as the Davis-Yin operator splitting method davis2017three , Ryu’s three-operator splitting method ryu2020uniqueness , and the Malitsky-Tam operator splitting method MT23 (which is also applicable to -block problems). All these methods, in the context of optimization, have provable convergence guarantees for convex optimization problems.
The literature is relatively scarce for methods that solve nonconvex optimization problems for the sum of three functions. Notable contributions in this direction include the Davis-Yin splitting to nonconvex problems analyzed in bian2021three , the extension of Davis-Yin to four-operator splitting investigated in ALT24 , and an extension of Douglas-Rachford splitting examined in AT25 (that also works for -operators). In this work, we propose a modification of Ryu’s splitting method to solve nonconvex three-block optimization problems with a specific structure.
Consider the optimization problem
(1) |
where for . Given , and , the (relaxed) Ryu’s three-operator splitting method ryu2020uniqueness is given by the following iteration: for ,
(2) |
where denotes the proximal mapping of defined in (4) below.
When the functions and in problem (1) are convex, (ryu2020uniqueness, , Theorem 4) states the convergence of the method to minimizers of problem (1). Our objective in this work is to investigate whether the convergence guarantees can be extended to a nonconvex setting.
In our convergence analysis, we adopt the “envelope technique”, a strategy that has been employed in the analysis of several splitting algorithms for nonconvex problems themelis2018forward ; themelis2020douglas ; ALT24 ; AT25 ; Atenas25 . A key challenge, however, lies in the fact that the standard form of Ryu’s splitting algorithm is not directly amenable to this type of analysis. To address this, we consider the following relaxed variant of Ryu’s method: given , , and , we define the iterates for all as follows:
(3) |
In particular, we modify the -proximal step in (2) to obtain (3). When , this reduces to Ryu’s splitting algorithm.
Throughout this paper, we adopt the following blanket assumptions.
Assumption 1 (Blanket assumption)
Suppose the following conditions are satisfied.
-
(a)
For , the function is a convex -smooth function, that is, is globally Lipschitz continuous with modulus .
-
(b)
The function is proper and lower semicontinuous (lsc for short).
-
(c)
Problem (1) has a nonempty set of solutions.
Remark 1 (On the simplified nonconvex setting)
The convexity in Assumption 1(a) can be relaxed. However, we impose it in our setting to simplify the analysis. Similar results can be obtained when and are merely - and -smooth, respectively, by employing a strategy similar to that in themelis2018forward ; bian2021three .
This paper is organized as follows. In Section 2, we introduce the notation and some variational analysis results we use throughout this paper. In Section 3, we define the merit function we use as the foundation of our analysis the Ryu’s three-operator splitting envelope and establish key properties relevant to the convergence analysis of the proposed method. Section 4 is dedicated to investigating the convergence properties of the method (3). In particular, we show that the defined envelope satisfies a sufficient decrease condition, and then we exploit this property to prove that all cluster points of the generated sequence are critical points of the objective function of problem (1). Finally, in Section 5 we comment on some ongoing works and future research directions.
2 Preliminaries and notation
Throughout this paper, denotes an inner product in , and its induced norm. We shall make use of the following technical result.
Lemma 1
For all , it holds
A function is called proper when its domain, the set , is nonempty. We say is lsc if at any , , and it is convex if for all , and all , . A function is said to be -smooth if it is differentiable and its gradient is -Lipschitz continuous, that is, for all , . In the next result, we recall some propeties of -smooth functions that will be important in our analysis (see, for instance, (Beck17, , Theorem 5.8)).
Lemma 2
Suppose is -smooth. Then,
If, in addiiton, is convex, then
For a proper function , and , we denote by the Fréchet (or regular) subdifferential of at , defined as
The limiting (or general) subdifferential of at , denoted , is defined as
where denotes convergence in the attentive sense, that is, and . When is smooth, then . If is proper lsc convex, then the Fréchet and limiting subdifferentials coincide with the subdifferential of convex analysis, namely,
A set of points of particular interest is the zeros of the subdifferential operator. We say is a critical point of if . If is convex, critical points are exactly the global minimizers of .
Remark 2 (Critical points of the sum)
Under Assumption 1 and for defined in (1), in view of (Rockafellar_Wets_2009, , Exercise 8.8(c)), is a critical point of if
Given a function , a parameter , and a point , the proximal operator of at is defined as
(4) |
The associated optimal value function is known as the Moreau envelope, defined as follows:
We say is prox-bounded if, for some , is bounded from below. The supremum of such parameters is called the threshold of prox-boundedness. If is a proper lsc prox-bounded function with threshold , then for any , is nonempty and compact-valued, and is finite-valued (Rockafellar_Wets_2009, , Theorem 1.25). In particular, if is a proper lsc convex function, then , and is a single-valued mapping (Rockafellar_Wets_2009, , Theorem 12.12, Theorem 12.17).
Remark 3 (On the well-definedness of proximal operations)
Under Assumption 1, and are single-valued since and are proper lsc convex, while is well-defined for , as is prox-bounded with threshold at least from (themelis2020douglas, , Remark 3.1).
3 An envelope for Ryu’s splitting method
Our convergence analysis for (3) under Assumption 1 builds on the approach in ALT24 ; AT25 ; themelis2020douglas , which utilizes an envelope function, akin to the Moreau envelope, well-suited to the corresponding iterative method. We begin our analysis by motivating such merit function.
Proposition 1
Proof
From the -update in (3), we have
(5) |
Meanwhile, the first-order optimality conditions of the -update and the -update yield, respectively,
(6) | ||||
(7) |
Hence,
which, combined with (5), yields
(8) |
for and . Following the calculations in (AT25, , Theorem 5.5), by expanding the squared norm, dropping the constant term , and adding the constant term , we get the desired result. ∎
In view of Proposition 1, we define the Relaxed Ryu envelope (RRE), for all , by
(9) |
Meanwhile, the corresponding set-valued iteration operator associated with (3) is given by
where
(10) |
In view of Remark 3, note that is nonempty- and compact-valued for any provided that .
Remark 4 (Relationship between the Moreau envelope and the RRE)
Observe that from (8), we have
We now establish some properties of the envelope function. The following result states that the RRE inherits continuity properties of the Moreau envelope (cf. (themelis2018forward, , Proposition 4.2) and (themelis2020douglas, , Proposition 3.2)).
Proposition 2 (Continuity of RRE)
The RRE is a real-valued and locally Lipschitz continuous function.
Proof
Since and are nonexpansive (Beck17, , Theorem 6.42(b)), then the maps and defined in (10) are (globally) Lipschitz continuous. Furthermore, from Assumption 1, is (globally) Lipschitz continuous, for . Then, as the Moreau envelope is locally Lipschitz continuous (Rockafellar_Wets_2009, , Example 10.32), the conclusion follows from Remark 4.∎
Next, we show some sandwich-type bounds relating the RRE and the original objective function in problem (1) (cf. (themelis2018forward, , Proposition 4.3) and (themelis2020douglas, , Proposition 3.3)).
Proposition 3
Proof
Take in (9) and apply Lemma 2 to to obtain
Similarly, taking in (9) and applying Lemma 2 to yields
From these, we immediately get (i). Moreover, since minimizes the problem in (9),
where the inequality holds by Lemma 2 and the last equality holds by plugging in and . This completes the proof of (ii). Finally, part (iii) immediately follows from (ii). ∎
Note that the iteration in (3) is designed to find a fixed point of the relaxed Ryu splitting operator , that is, a point in the set
In the next proposition, we establish a connection between such fixed points and the notion of criticality, as commonly used in optimization.
Proposition 4
Proof
It is straightforward from the definition of that is equivalent to having satisfying the conditions in (11). Hence, it suffices to prove that such is, in this case, always a critical point of . Evaluating the first-order optimality conditions of the -step, namely,
(12) |
at and , for , yields
Adding this inclusion with (6) and (7) using the substitutions and , we obtain
that is, by Remark 2.
At this point, we have built the necessary tools regarding the RRE for the analysis of convergence of the proposed relaxed Ryu splitting method. In the next section, we show subsequential convergence of the method under standard assumptions in the literature.
4 Convergence of modified Ryu’s three-operator splitting method via envelopes
To establish the (subsequential) convergence of the iterative method in (3), we follow the approach introduced in themelis2020douglas for the Douglas–Rachford splitting method. Specifically, the core argument relies on a sufficient decrease property satisfied by the RRE.
4.1 Sufficient decrease property for RRE
We first prove three technical lemmas that we will use in the main result of this section. We use the following notation: for ,
(13) |
Proof
Proof
Lemma 5
Let and let . Then the following holds:
-
(i)
.
-
(ii)
The interval is nonempty for any .
-
(iii)
For and , define the constants
(21) and the intervals
(22) If for , then is strictly positive for .
Proof
These results follow from straightforward calculations. ∎
With these lemmas in place, we are now ready to present the first main result of this paper. We establish the sufficient descent property of the RRE, provided that the stepsize is chosen sufficiently small. In particular, we restrict the stepsize in the interval
(23) |
where , and is given by (21) for , with taken from given in (22) for .
Theorem 4.1 (RRE sufficient descent)
Proof
From the definition of the relaxed Ryu envelope, we have
Thus, together with Lemma 1, we have
Using Lemmas 3 and 4, we obtain
Since , then
where the first inequality holds by -smoothness of , while the second inequality holds by the convexity of and -smoothness of . Moreover, by Young’s inequality, we have
where are arbitrary. With these, we obtain
(24) |
where the constants for are given by
Choosing for , it is not difficult to compute that since . In addition, since . Hence, for the given stepsize , we obtain from (24) that
(25) |
Meanwhile, from (16), (17) and -smoothness of for , it follows
Defining
we obtain from (25) that
The proof concludes by setting . ∎
Theorem 4.1 suggests that our modification of Ryu’s three-operator splitting method behaves like a descent method for the suitably defined RRE. Hence, one could expect this method to converge. In the next section, we formalize this idea.
4.2 Convergence properties of the modified Ryu’s algorithm
The convergence analysis of the method in (3) relies on a particular relationship between the RRE and a Lagrangian associated with a reformulation of problem (1), which we proceed to define. By duplicating variables, problem (1) can be reformulated as follows
We define the following Lagrangian associated with this problem reformulation:
where, for , is a Lagrangian multiplier associated with the constraint . The next result states the aformentioned relationship between the RRE and the augmented Lagrangian.
Lemma 6
We are now ready to state the second main result of this paper. We follow the approach taken in themelis2020douglas ; Atenas25 .
Theorem 4.2 (Subsequential convergence of nonconvex Ryu’s three-operator splitting)
Suppose that Assumption 1 holds. Let where , and . Let such that , where is given in (23). Then, for any sequence generated by algorithm (3), then
-
(i)
for , and and for .
If, in addition, is bounded, then
-
(ii)
For any cluster point of , , and .
-
(iii)
All cluster points of the sequences , for , coincide and are critical points of problem (1).
Proof
From the assumption, , then Proposition 4 implies is a bounded non-increasing sequence, thus convergent to some . As a consequence, for , Theorem 4.1 yields , and the -update rule in (3) gives . Since is convex, then is nonexpansive. In particular,
so that . Likewise, is nonexpansive, yielding
and thus . Moreover, as
then as well. This proves (i). Next, observe that and imply that the sequences , and have the same cluster points. Let be a cluster point of and be a cluster point of , and let for , and for . Note that from continuity of the proximal operator, and . Then
Hence, . Furthermore, from Lemma 6, . Since , , and are bounded, then part (i) implies , from where (ii) follows. Earlier in the proof we already showed the first part of item (iii). It remains to prove that is a critical point. A simple computation shows that for any ,
Hence, taking for ,
Then, in view of (6), (7), and (12), we get
Taking the limit as , since for , and for , item (i) then implies
where , and . In turn, this inclusion is equivalent to the following conditions:
Addind these equations yields , concluding the proof.
Remark 5 (Boundedness of the generated sequences)
Using a similar argument to (themelis2018forward, , Theorem 3.4(iii)), we can show that if has bounded level sets, then also has bounded level sets. Since for appropriately chosen stepsize, is a non-decreasing sequence as shown in Theorem 4.1, then for all , , and thus is bounded. Moreover, boundedness of follows from the continuity properties of and , and boundedness of is a consequence of Theorem 4.2(i).
Remark 6 (Global convergence)
In Theorem 4.2, we establish subsequential convergence of the relaxed variant of Ryu’s three-operator splitting method in a specific nonconvex setting. A natural question is whether the method converges globally. This can be affirmatively addressed by adopting the now standard approach based on the Kurdyka–Łojasiewicz (KL) inequality attouch2013convergence , or alternatively, by leveraging the subdifferential-based error bound technique used in Atenas25 within the unifying framework proposed in atenas2023unified . Under the same set of assumptions, one can also obtain local linear convergence rates.
5 Conclusion
By defining a Moreau-type envelope tailored to the algorithmic scheme in (3), the core of the analysis relies on the sufficient decrease property shown in Theorem 4.1. Observe that our analysis does not cover the limiting case (corresponding to the original method proposed by Ryu in (2)), as it would make the stepsize interval in (23) empty. Nevertheless, when is sufficiently close to 1, we can still guarantee global subsequential convergence for sufficiently small stepsizes. A full treatment of the limiting case requires a more refined analysis, which is the subject of ongoing work by the authors.
Acknowledgments
The authors thank the mathematical research institute MATRIX in Australia where part of this research was performed. JHA’s visit at MATRIX was supported in part by the MATRIX-Simons Travel Grant. The research of FA was supported in part by Australian Research Council grant DP230101749.
References
- [1] J. H. Alcantara, C.-p. Lee, and A. Takeda. A four-operator splitting algorithm for nonconvex and nonsmooth optimization. arXiv preprint arXiv:2406.16025, 2024.
- [2] J. H. Alcantara and A. Takeda. Douglas-Rachford algorithm for nonmonotone multioperator inclusion problems. arXiv preprint arXiv:2501.02752, 2025.
- [3] F. Atenas. Understanding the Douglas–Rachford splitting method through the lenses of Moreau-type envelopes. Computational Optimization and Applications, 90:1–30, 2025.
- [4] F. Atenas, C. Sagastizábal, P. J. Silva, and M. Solodov. A unified analysis of descent sequences in weakly convex optimization, including convergence rates for bundle methods. SIAM Journal on Optimization, 33(1):89–115, 2023.
- [5] H. Attouch, J. Bolte, and B. F. Svaiter. Convergence of descent methods for semi-algebraic and tame problems: proximal algorithms, forward–backward splitting, and regularized Gauss–Seidel methods. Mathematical programming, 137(1):91–129, 2013.
- [6] A. Beck. First-Order Methods in Optimization. SIAM - Society for Industrial and Applied Mathematics, Philadelphia, PA, United States, 2017.
- [7] F. Bian and X. Zhang. A three-operator splitting algorithm for nonconvex sparsity regularization. SIAM Journal on Scientific Computing, 43(4):A2809–A2839, 2021.
- [8] S. Boyd, N. Parikh, E. Chu, B. Peleato, J. Eckstein, et al. Distributed optimization and statistical learning via the alternating direction method of multipliers. Foundations and Trends in Machine learning, 3(1):1–122, 2011.
- [9] J.-F. Cai, S. Osher, and Z. Shen. Split Bregman methods and frame based image restoration. Multiscale Modeling & Simulation, 8(2):337–369, 2010.
- [10] P. L. Combettes and J.-C. Pesquet. A Douglas–Rachford splitting approach to nonsmooth convex variational signal recovery. IEEE Journal of Selected Topics in Signal Processing, 1(4):564–574, 2007.
- [11] P. L. Combettes and J.-C. Pesquet. Proximal Splitting Methods in Signal Processing, pages 185–212. Springer New York, New York, NY, 2011.
- [12] D. Davis and W. Yin. A three-operator splitting scheme and its optimization applications. Set-valued and variational analysis, 25:829–858, 2017.
- [13] J. Douglas and H. H. Rachford. On the numerical solution of heat conduction problems in two and three space variables. Transactions of the American mathematical Society, 82(2):421–439, 1956.
- [14] R. Glowinski, S. Osher, and W. Yin. Splitting Methods in Communication, Imaging, Science, and Engineering. Scientific Computation. Springer International Publishing, Cham, 2017.
- [15] P.-L. Lions and B. Mercier. Splitting algorithms for the sum of two nonlinear operators. SIAM Journal on Numerical Analysis, 16(6):964–979, 1979.
- [16] Y. Malitsky and M. K. Tam. Resolvent splitting for sums of monotone operators with minimal lifting. Mathematical Programming, 201(1):231–262, 2023.
- [17] G. B. Passty. Ergodic convergence to a zero of the sum of monotone operators in Hilbert space. Journal of Mathematical Analysis and Applications, 72(2):383–390, 1979.
- [18] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis, volume 317 of Grundlehren der Mathematischen Wissenschaften. Springer Verlag Berlin, Berlin, 3rd printing edition, 2009.
- [19] E. K. Ryu. Uniqueness of DRS as the 2 operator resolvent-splitting and impossibility of 3 operator resolvent-splitting. Mathematical Programming, 182:233–273, 2020.
- [20] A. Themelis and P. Patrinos. Douglas–Rachford splitting and ADMM for nonconvex optimization: Tight convergence results. SIAM Journal on Optimization, 30(1):149–181, 2020.
- [21] A. Themelis, L. Stella, and P. Patrinos. Forward-backward envelope for the sum of two nonconvex functions: Further properties and nonmonotone linesearch algorithms. SIAM Journal on Optimization, 28(3):2274–2303, 2018.