This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Variational Theory and Algorithms for a Class of Asymptotically Approachable Nonconvex Problems

Hanyang Li Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley 94720 ([email protected])    Ying Cui Department of Industrial Engineering and Operations Research, University of California, Berkeley, Berkeley 94720 ([email protected])
Abstract

We investigate a class of composite nonconvex functions, where the outer function is the sum of univariate extended-real-valued convex functions and the inner function is the limit of difference-of-convex functions. A notable feature of this class is that the inner function may fail to be locally Lipschitz continuous. It covers a range of important yet challenging applications, including inverse optimal value optimization and problems under value-at-risk constraints. We propose an asymptotic decomposition of the composite function that guarantees epi-convergence to the original function, leading to necessary optimality conditions for the corresponding minimization problem. The proposed decomposition also enables us to design a numerical algorithm such that any accumulation point of the generated sequence, if exists, satisfies the newly introduced optimality conditions. These results expand on the study of so-called amenable functions introduced by Poliquin and Rockafellar in 1992, which are compositions of convex functions with smooth maps, and the prox-linear methods for their minimization. To demonstrate that our algorithmic framework is practically implementable, we further present verifiable termination criteria and preliminary numerical results.

Keywords:

epi-convergence; optimality conditions; nonsmooth analysis; difference-of-convex functions

1 Introduction.

We consider a class of composite optimization problems of the form:

minimizexnp=1m[Fp(x)φp(fp(x))],\begin{array}[]{rl}\displaystyle\operatornamewithlimits{minimize}_{x\in\mathbb{R}^{n}}&\;\,\displaystyle\sum_{p=1}^{m}\left[F_{p}(x)\triangleq\varphi_{p}\big{(}f_{p}(x)\big{)}\right],\end{array} (CP0)

where for each p=1,,mp=1,\cdots,m, the outer function φp:{+}\varphi_{p}:\mathbb{R}\to\mathbb{R}\cup\{+\infty\} is proper, convex, lower semicontinuous (lsc), and the inner function fp:nf_{p}:\mathbb{R}^{n}\rightarrow\mathbb{R} is not necessarily locally Lipschitz continuous.

If each inner function fpf_{p} is continuously differentiable, then the objective in (CP0) belongs to the family of amenable functions under a constraint qualification [25, 26]. For a thorough exploration of the variational theory of amenable functions, readers are referred to [30, Chapter 10(F)]. The properties of amenable functions have also led to the development of prox-linear algorithms, where convex subproblems are constructed through the linearization of the inner smooth mapping [16, 4, 5, 19, 14].

However, there are various applications of composite optimization problem in the form of (CP0) where the inner function fpf_{p} is nondifferentiable. In the following, we provide two such examples.

Example 1.1 (The inverse optimal value optimization). For p=1,,mp=1,\cdots,m, consider the optimal value function

fp(x)infyd{(cp+Cpx)y+12yQpy|Apx+Bpybp}xnf_{p}(x)\triangleq\displaystyle{\inf_{y\in{\mathbb{R}^{d}}}}\left\{(c^{\,p}+C^{\,p}x)^{\top}y+\frac{1}{2}\,y^{\top}Q^{\,p}\,y\;\middle|\;A^{\,p}x+B^{\,p}y\leq b^{\,p}\,\right\}\qquad x\in{\mathbb{R}^{n}} (1)

with appropriate dimensional vectors bpb^{\,p} and cpc^{\,p}, and matrices Ap,Bp,CpA^{\,p},B^{\,p},C^{\,p} and QpQ^{\,p}. The function fpf_{p} is not smooth in general. The inverse (multi) optimal value problem [2, 24] finds a vector xnx\in{\mathbb{R}^{n}} that minimizes the discrepancy between observed optimal values {νp}p=1m{\{\nu_{p}\}_{p=1}^{m}} and true optimal values {fp(x)}p=1m\{f_{p}(x)\}_{p=1}^{m} based on a prescribed metric, such as the 1\ell_{1}-error:

minimizexnp=1m|νpfp(x)|.\displaystyle\operatornamewithlimits{minimize}_{x\in{\mathbb{R}^{n}}}\;\sum_{p=1}^{m}\left|{\nu_{p}}-f_{p}(x)\right|. (2)

If fpf_{p} is real-valued for p=1,,mp=1,\cdots,m, one can express problem (2) in the form of (CP0) by defining the outer function φp(t)=|νpt|\varphi_{p}(t)=|{\nu_{p}}-t|.

Example 1.2 (The portfolio optimization under a value-at-risk constraint). Given a random variable YY, the Value-at-risk (VaR) of YY at a confidence level α(0,1)\alpha\in(0,1) is defined as VaRα(Y)min{γ(Yγ)α}\mbox{VaR}_{\alpha}(Y)\triangleq{\min}\left\{\gamma\in\mathbb{R}\,\mid\,\mathbb{P}(Y\leq\gamma)\geq\alpha\right\}. Let ZZ be the random return of investments and c(,)c(\cdot,\cdot) be a lsc function representing the profit of ZZ parameterized by xnx\in\mathbb{R}^{n}. An agent’s goal is to maximize the expected profit, denoted by 𝔼[c(x,Z)]\mathbb{E}[\,c(x,Z)], while also controlling the risk via a constraint on VaRα[c(x,Z)]\mbox{VaR}_{\alpha}[c(x,Z)] under a prescribed level rr. The model can be written as

minimizexn𝔼[c(x,Z)]subject toVaRα[c(x,Z)]r.{\displaystyle\operatornamewithlimits{minimize}_{x\in\mathbb{R}^{n}}\;\;-\mathbb{E}\left[\,c(x,Z)\right]}\quad\mbox{subject to}\;\;\mbox{VaR}_{\alpha}[\,c(x,Z)]\,\geq r. (3)

Define δA\delta_{A} as the indicator function of a set AA, where δA(t)=0\delta_{A}(t)=0 for tAt\in A and δA(t)=+\delta_{A}(t)=+\infty for tAt\notin A. Problem (3) can then be put into the framework (CP0) by setting φ1(t)=t\varphi_{1}(t)=-t, f1(x)=𝔼[c(x,Z)]f_{1}(x)=\mathbb{E}[\,c(x,Z)], φ2(t)=δ[r,+)(t)\varphi_{2}(t)=\delta_{[r,+\infty)}(t), and f2(x)=VaRα[c(x,Z)]f_{2}(x)=\mbox{VaR}_{\alpha}[\,c(x,Z)]. We note that the function VaRα[c(,Z)]\mbox{VaR}_{\alpha}[\,c(\cdot,Z)] can be nondifferentiable even if the function c(,z)c(\cdot,z) is differentiable for every zz.

Due to the nondifferentiablity of the inner function fpf_{p} in (CP0), the overall objective is not amenable and the prox-linear algorithm [16] is not applicable to solve this composite optimization problem. In this paper, we develop an algorithmic framework for a subclass of (CP0), where each inner function fpf_{p}, although nondifferentiable, can be derived from DC functions through a limiting process. We refer to this class of functions as approachable difference-of-convex (ADC) functions (see section 2.1 for the formal definition). It is important to note that ADC functions are ubiquitous. In particular, we will show that the optimal value function fpf_{p} in (1) and VaRα[c(,Z)]\mbox{VaR}_{\alpha}[\,c(\cdot,Z)] in (3) are instances of ADC functions under mild conditions. In fact, based on the result recently shown in [31], any lsc function is the epi-limit of piecewise affine DC functions.

With this new class of functions in hand, we have made a first step to understand the variational properties of the composite ADC minimization problem (CP0), including an in-depth analysis of its necessary optimality conditions. The novel optimality conditions are defined through a handy approximation of the subdifferential mapping fp\partial f_{p} that explores the ADC structure of fpf_{p}. Using the notion of epi-convergence, we further show that these optimality conditions are necessary conditions for any local solution of (CP0). Additionally, we propose a double-loop algorithm to solve (CP0), where the outer loop dynamically updates the DC functions approximating each fpf_{p}, and the inner loop finds an approximate stationary point of the resulting composite DC problem through successive convex approximations. It can be shown that any accumulation point of the sequence generated by our algorithm satisfies the newly introduced optimality conditions.

Our strategy to handle the nondifferentiable and possibly discontinuous inner function fpf_{p} through a sequence of DC functions shares certain similarities with the approximation frameworks in the existing literature. For instance, Ermoliev et al. [15] have designed smoothing approximations for lsc functions utilizing convolutions with bounded mollifier sequences, a technique akin to local “averaging”. Research has sought to identify conditions that ensure gradient consistency for the smoothing approximation of composite nonconvex functions [10, 8, 6, 7]. Notably, Burke and Hoheisel [6] have emphasized the importance of epi-convergence for the approximating sequence, a less stringent requirement than the continuous convergence assumed in earlier works [10, 3]. In recent work, Royset [32] has studied the consistent approximation of the composite optimization in terms of the global minimizers and stationary solutions, where the inner function is assumed to be locally Lipschitz continuous. Our notion of subdifferentials and optimality conditions for (CP0) takes inspiration from these works but adapts to accommodate nonsmooth approximating sequences that exhibit the advantageous property of being DC.

The rest of the paper is organized as follows. Section 2 presents a class of ADC functions and introduces a new associated notion of subdifferential. In section 3, we investigate the necessary optimality conditions for problem (CP0). Section 4 is devoted to an algorithmic framework for solving (CP0) and its convergence analysis to the newly introduced optimality conditions. We also discuss termination criteria for practical implementation in section 4.3. Preliminary numerical experiments on the inverse optimal value problems are presented in the last section.

Notation and Terminology. Let \|\cdot\| denote the Euclidean norm in n\mathbb{R}^{n}. We use the symbol 𝔹(x¯,δ)\mathbb{B}(\bar{x},\delta) to denote the Euclidean ball {xnxx¯δ}\{x\in\mathbb{R}^{n}\mid\|x-\bar{x}\|\leq\delta\}. The set of nonpositive and nonnegative are denoted by \mathbb{R}_{-} and +\mathbb{R}_{+}, respectively, and the set of nonnegative integers is denoted by \mathbb{N}. We write {NN infinite}\mathbb{N}_{\infty}^{\sharp}\triangleq\{N\subset\mathbb{N}\,\mid\,N\mbox{ infinite}\} and {N\N finite}\mathbb{N}_{\infty}\triangleq\{N\,\mid\,\mathbb{N}\;\backslash N\text{ finite}\}. Notation {tk}\{t^{k}\} is used to simplify the expression of any sequence {tk}k\{t^{k}\}_{k\in\mathbb{N}}, where the elements can be points, sets, or functions. By tktt^{k}\to t and tkNtt^{k}\to_{N}t, we mean that the sequence {tk}\{t^{k}\} and the subsequence {tk}kN\{t^{k}\}_{k\in N} indexed by NN\in\mathbb{N}^{\sharp}_{\infty} converge to tt, respectively.

Given two sets AA and BB in n\mathbb{R}^{n} and a scalar λ\lambda\in\mathbb{R}, the Minkowski sum and the scalar multiple are defined as A+B{a+baA,bB}A+B\triangleq\{a+b\mid a\in A,b\in B\} and λA{λaaA}\lambda\,A\triangleq\{\lambda\,a\mid a\in A\}. We also define 0={0}0\cdot\emptyset=\{0\} and λ=\lambda\cdot\emptyset=\emptyset whenever λ0\lambda\neq 0. When AA and BB are nonempty and closed, we define the one-sided deviation of AA from BB as 𝔻(A,B)supxAdist(x,B)\mathbb{D}(A,B)\,\triangleq\,\sup_{x\in A}\;\operatorname{dist}(x,B), where dist(x,B)infyByx\operatorname{dist}(x,B)\triangleq\inf_{y\in B}\|y-x\|. The Hausdorff distance between AA and BB is given by (A,B)max{𝔻(A,B),𝔻(B,A)}\mathbb{H}(A,B)\triangleq\max\{\mathbb{D}(A,B),\,\mathbb{D}(B,A)\}. The boundary and interior of AA are denoted by bdry(A)\operatorname{bdry}(A) and int(A)\operatorname{int}(A). The topological closure and the convex hull of AA are indicated by cl(A)\operatorname{cl}(A) and conA\operatorname{con}A.

For a sequence of sets {Ck}\{C^{k}\}, we define its outer limit as

Limsupk+Ck{uN,ukNu with ukCk},\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}C^{k}\triangleq\{u\mid\exists\,N\in\mathbb{N}_{\infty}^{\sharp},\,u^{k}\to_{N}u\text{ with }u^{k}\in C^{k}\},

and the horizon outer limit as

Limsupk+Ck{0}{uN,λk0,λkukNu with ukCk}.{\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}}^{\infty}\,C^{k}\triangleq\{0\}\cup\left\{u\mid\exists\,N\in\mathbb{N}_{\infty}^{\sharp},\,\lambda_{k}\downarrow 0,\,\lambda_{k}u^{k}\rightarrow_{N}u\text{ with }u^{k}\in C^{k}\right\}.

The outer limit of a set-valued mapping S:nmS:\mathbb{R}^{n}\rightrightarrows\mathbb{R}^{m} is defined as

Limsupxx¯S(x)xkx¯Limsupk+S(xk)={uxkx¯,uku with ukS(xk)}x¯n.\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}S(x)\triangleq\bigcup_{x^{k}\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}S(x^{k})=\{u\mid\exists\,x^{k}\rightarrow\bar{x},\,u^{k}\rightarrow u\text{ with }u^{k}\in S(x^{k})\}\quad\bar{x}\in\mathbb{R}^{n}.

We say SS is outer semicontinuous (osc) at x¯n\bar{x}\in\mathbb{R}^{n} if Limsupxx¯S(x)S(x¯)\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}S(x)\subset S(\bar{x}). Consider some index set NN\in\mathbb{N}^{\sharp}_{\infty}. A sequence of sets {Ck}kN\{C^{k}\}_{k\in N} is equi-bounded if there exists a bounded set BB such that CkBC^{k}\subset B for all kNk\in N. Otherwise, the sequence is unbounded. If there is an integer KNK\in N such that {Ck}kN,kK\{C^{k}\}_{k\in N,k\geq K} is equi-bounded, then the sequence {Ck}kN\{C^{k}\}_{k\in N} is said to be eventually bounded. Interested readers are referred to [30, Chapter 4] for a comprehensive study of set convergence.

The regular normal cone and the limiting normal cone of a set CnC\subset\mathbb{R}^{n} at x¯C\bar{x}\in C are given by

𝒩^C(x¯){v|v(xx¯)o(xx¯) for all xC}and𝒩C(x¯)Limsupx(C)x¯𝒩^C(x).\widehat{\mathcal{N}}_{C}(\bar{x})\triangleq\left\{v\,\middle|\,v^{\top}(x-\bar{x})\leq o(\|x-\bar{x}\|)\text{ for all }x\in C\right\}\quad\mbox{and}\quad\mathcal{N}_{C}(\bar{x})\triangleq\displaystyle\operatornamewithlimits{Lim\,sup}_{x(\in C)\rightarrow\bar{x}}\widehat{\mathcal{N}}_{C}(x).

The proximal normal cone of a set CC at x¯C\bar{x}\in C is defined as 𝒩Cp(x¯){λ(xx¯)x¯PC(x),λ0}\mathcal{N}^{p}_{C}(\bar{x})\triangleq\{\lambda(x-\bar{x})\mid\bar{x}\in P_{C}(x),\lambda\geq 0\}, where PCP_{C} is the projection onto CC that maps any xx to the set of points in CC that are closest to xx.

For an extended-real-valued function f:n¯{±}f:\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}}\triangleq\mathbb{R}\cup\{\pm\infty\}, we write its effective domain as domf{xnf(x)<+}\operatorname{dom}f\triangleq\{x\in\mathbb{R}^{n}\mid f(x)<+\infty\}, and the epigraph as epif{(x,α)n+1αf(x)}\operatorname{epi}f\triangleq\{(x,\alpha)\in\mathbb{R}^{n+1}\mid\alpha\geq f(x)\}. We say ff is proper if domf\operatorname{dom}f is nonempty and f(x)>f(x)>-\infty for all xnx\in\mathbb{R}^{n}. We adopt the common rules for extended arithmetic operations, and the lower and upper limits of a sequence of scalars in ¯\overline{\mathbb{R}} (cf. [30, Chapter 1(E)]).

Let f:n¯f:\mathbb{R}^{n}\to\overline{\mathbb{R}} be a proper function. We write xfx¯x\rightarrow_{f}\bar{x}, if xx¯ and f(x)f(x¯)x\rightarrow\bar{x}\text{ and }f(x)\rightarrow f(\bar{x}). The regular subdifferential and the limiting subdifferential of ff at x¯domf\bar{x}\in\operatorname{dom}f are respectively defined as

^f(x¯){vf(x)f(x¯)+v(xx¯)+o(xx¯) for all x}andf(x¯)Limsupxfx¯^f(x).\widehat{\partial}f(\bar{x})\triangleq\{v\mid f(x)\geq f(\bar{x})+v^{\top}(x-\bar{x})+o(\|x-\bar{x}\|)\text{ for all }x\}\quad\mbox{and}\quad\partial f(\bar{x})\triangleq\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{f}\bar{x}}\widehat{\partial}f(x).

For any x¯domf\bar{x}\notin\operatorname{dom}f, we set ^f(x¯)=f(x¯)=\widehat{\partial}f(\bar{x})=\partial f(\bar{x})=\emptyset. When ff is locally Lipschitz continuous at x¯\bar{x}, conf(x¯)\operatorname{con}\partial f(\bar{x}) equals to the Clarke subdifferential Cf(x¯)\partial_{C}f(\bar{x}). We further say ff is subdifferentially regular at x¯domf\bar{x}\in\operatorname{dom}f if ff is lsc at x¯\bar{x} and ^f(x¯)=f(x¯)\widehat{\partial}f(\bar{x})=\partial f(\bar{x}). When ff is proper and convex, ^f\widehat{\partial}f, f\partial f, and Cf\partial_{C}f coincide with the concept of the subdifferential in convex analysis.

Finally, we introduce the notion of function convergence. A sequence of functions {fk:n¯}\{f^{k}:\mathbb{R}^{n}\to\overline{\mathbb{R}}\} is said to converge pointwise to f:n¯f:\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}}, written fkpff^{k}\overset{\text{p}}{\rightarrow}f, if limk+fk(x)=f(x)\lim_{k\to+\infty}f^{k}(x)=f(x) for any xnx\in\mathbb{R}^{n}. The sequence {fk}\{f^{k}\} is said to epi-converge to ff, written fkeff^{k}\overset{\text{e}}{\rightarrow}f, if for any xx, it holds

{lim infk+fk(xk)f(x)for every sequence xkx,lim supk+fk(xk)f(x)for some sequence xkx.\left\{\begin{array}[]{ll}\displaystyle\liminf_{k\rightarrow+\infty}f^{k}(x^{k})\geq f(x)&\;\text{for every sequence }x^{k}\rightarrow x,\\ \displaystyle\limsup_{k\rightarrow+\infty}f^{k}(x^{k})\leq f(x)&\;\text{for some sequence }x^{k}\rightarrow x.\end{array}\right.

The sequence {fk}\{f^{k}\} is said to converge continuously to ff, written fkcff^{k}\overset{\text{c}}{\rightarrow}f, if limk+fk(xk)=f(x)\lim_{k\rightarrow+\infty}f^{k}(x^{k})=f(x) for any xx and any sequence xkxx^{k}\rightarrow x.

2 Approachable difference-of-convex functions.

In this section, we formally introduce a class of functions that can be asymptotically approximated by DC functions. A new concept of subdifferential that is defined through the approximating functions is proposed. At the end of this section, we provide several examples that demonstrate the introduced concepts.

2.1 Definitions and properties.

An extended-real-valued function can be approximated by a sequence of functions in various notions of convergence, as comprehensively investigated in [30, Chapter 7(A-C)]. Among these approaches, epi-convergence has a notable advantage in its ability to preserve the global minimizers [30, Theorem 7.31]. Our focus lies on a particular class of approximating functions, wherein each function exhibits a DC structure.

Definition 1.

A function ff is said to be DC on its domain if there exist proper, lsc and convex functions g,h:n¯g,h:\mathbb{R}^{n}\to\overline{\mathbb{R}} such that domf=[domgdomh]\operatorname{dom}f=[\,\operatorname{dom}g\cap\operatorname{dom}h\,] and f(x)=g(x)h(x)f(x)=g(x)-h(x) for any xdomfx\in\operatorname{dom}f.

With this definition, we introduce the concept of ADC functions.

Definition 2 (ADC functions).

Let f:n¯f:\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}} be a proper function.
(a) ff is said to be pointwise approachable DC (p-ADC) if there exist proper functions {fk:n¯}\{f^{k}:\mathbb{R}^{n}\to\overline{\mathbb{R}}\}, DC on their respective domains, such that fkpff^{k}\overset{\text{p}}{\rightarrow}f.
(b) ff is said to be epigraphically approachable DC (e-ADC) if there exist proper functions {fk:n¯}\{f^{k}:\mathbb{R}^{n}\to\overline{\mathbb{R}}\}, DC on their respective domains, such that fkeff^{k}\overset{\text{e}}{\rightarrow}f.
(c) ff is said to be continuously approachable DC (c-ADC) if there exist proper functions {fk:n¯}\{f^{k}:\mathbb{R}^{n}\to\overline{\mathbb{R}}\}, DC on their respective domains, such that fkcff^{k}\overset{\text{c}}{\rightarrow}f.

A function ff is said to be ADC associated with {fk}\{f^{k}\} if {fk}\{f^{k}\} confirms one of these convergence properties. By a slight abuse of notation, we denote the DC decomposition of each fkf^{k} as fk=gkhkf^{k}=g^{k}-h^{k}, although the equality may only hold for xdomfkx\in\operatorname{dom}f^{k}.

A p-ADC function may not be lsc. An example is given by f(x)=𝟏{0}(x)+2𝟏(0,+)(x)f(x)=\mathbf{1}_{\{0\}}(x)+2\cdot\mathbf{1}_{(0,+\infty)}(x), where for a set CnC\subset\mathbb{R}^{n}, we write 𝟏C(x)=1\mathbf{1}_{C}(x)=1 if xCx\in C and 𝟏C(x)=0\mathbf{1}_{C}(x)=0 if xCx\notin C. In this case, ff is not lsc at x=0x=0. However, ff is p-ADC associated with fk(x)=max( 0, 2kx+1)max( 0, 2kx1)f^{k}(x)=\max\left(\,0,\,2kx+1\,\right)-\max\left(\,0,\,2kx-1\,\right). In contrast, any e-ADC function must be lsc [30, Proposition 7.4(a)], and any c-ADC function is continuous [30, Theorem 7.14].

The relationships among different notions of function convergence, including the unaddressed uniform convergence, have been thoroughly examined in [30]. Generally, pointwise convergence and epi-convergence do not imply one another, but they coincide when the sequence {fk}\{f^{k}\} is asymptotically equi-lsc everywhere [30, Theorem 7.10]. In addition, {fk}\{f^{k}\} converges continuously to ff if and only if both fkeff^{k}\overset{\text{e}}{\rightarrow}f and (fk)e(f)(-f^{k})\overset{\text{e}}{\rightarrow}(-f) are satisfied [30, Theorem 7.11]. While verifying epi-convergence is often challenging, it becomes simpler for a monotonic sequence {fk}\{f^{k}\} that converges pointwise to ff [30, Proposition 7.4(c-d)].

2.2 Subdifferentials of ADC functions.

Characterizing the limiting and Clarke subdifferentials can be challenging when dealing with functions that exhibit complex composite structures. Our focus in this subsection is on numerically computable approximations of the limiting subdifferentials. We begin with the definitions.

Definition 3 (approximate subdifferentials).

Consider an ADC function f:n¯f:\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}} associated with {fk=gkhk}\{f^{k}=g^{k}-h^{k}\}. The approximate subdifferential of ff (associated with {fk=gkhk}\{f^{k}=g^{k}-h^{k}\}) at x¯n\bar{x}\in\mathbb{R}^{n} is defined as

Af(x¯)xkx¯Limsupk+[gk(xk)hk(xk)].\begin{array}[]{rl}\partial_{A}f(\bar{x})&\triangleq\,\bigcup\limits_{x^{k}\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\big{[}\partial g^{k}(x^{k})-\partial h^{k}(x^{k})\big{]}.\end{array}

The approximate horizon subdifferential of ff (associated with {fk=gkhk}\{f^{k}=g^{k}-h^{k}\}) at x¯n\bar{x}\in\mathbb{R}^{n} is defined as

Af(x¯)xkx¯Limsupk+[gk(xk)hk(xk)].\partial^{\infty}_{A}f(\bar{x})\triangleq\,\bigcup\limits_{x^{k}\rightarrow\bar{x}}{\displaystyle\operatornamewithlimits{Lim\,sup}_{\,k\rightarrow+\infty}}^{\infty}\;\big{[}\partial g^{k}(x^{k})-\partial h^{k}(x^{k})\big{]}.

Unlike the limiting subdifferential which requires xkfx¯x^{k}\rightarrow_{f}\bar{x}, Af(x)\partial_{A}f(x) is defined using all the sequences xkx¯x^{k}\rightarrow\bar{x} without necessitating the convergence of function values. It follows directly from the definitions that the mappings xAf(x)x\mapsto\partial_{A}f(x) and xAf(x)x\mapsto\partial_{A}^{\infty}f(x) are osc. The following proposition presents a sufficient condition for Af(x¯)=f(x¯)=\partial_{A}f(\bar{x})=\partial f(\bar{x})=\emptyset at any x¯domf\bar{x}\notin\operatorname{dom}f.

Proposition 1.

Let x¯domf\bar{x}\notin\operatorname{dom}f. Then Af(x¯)=\partial_{A}f(\bar{x})=\emptyset if for any sequence xkx¯x^{k}\rightarrow\bar{x}, we have xkdomfkx^{k}\notin\operatorname{dom}f^{k} for all sufficiently large kk. The latter condition is particularly satisfied whenever domf\operatorname{dom}f is closed and domfkdomf\operatorname{dom}f^{k}\subset\operatorname{dom}f for all sufficiently large kk.

Proof.

Note that for any xkx¯domfx^{k}\rightarrow\bar{x}\notin\operatorname{dom}f, we have [gk(xk)hk(xk)]=[\partial g^{k}(x^{k})-\partial h^{k}(x^{k})]=\emptyset for all sufficiently large kk due to xkdomfk=[domgkdomhk]x^{k}\notin\operatorname{dom}f^{k}=[\operatorname{dom}g^{k}\cap\operatorname{dom}h^{k}]. Thus, Af(x¯)=\partial_{A}f(\bar{x})=\emptyset for any x¯domf\bar{x}\notin\operatorname{dom}f. ∎

In the subsequent analysis, we restrict our attention to x¯domf\bar{x}\in\operatorname{dom}f. Admittedly, the set Af(x¯)\partial_{A}f(\bar{x}) depends on the approximating sequence {fk}\{f^{k}\} and the DC decomposition of each fkf^{k}, which may contain irrelevant information concerning the local geometry of epif\operatorname{epi}f. In fact, for a given ADC function ff, we can make the set Af(x¯)\partial_{A}f(\bar{x}) arbitrarily large by adding the same nonsmooth functions to both gkg^{k} and hkh^{k}. By Attouch’s theorem (see for example [30, Theorem 12.35]), for proper, lsc, convex functions ff and {fk}\{f^{k}\}, if fkeff^{k}\overset{\text{e}}{\rightarrow}f, we immediately have Af=f\partial_{A}f=\partial f when taking gk=fkg^{k}=f^{k} and hk=0h^{k}=0. In what follows, we further explore the relationships among Af\partial_{A}f and other commonly employed subdifferentials in the literature beyond the convex setting. As it turns out, with respect to an arbitrary DC function fkf^{k} that is lsc, Af(x¯)\partial_{A}f(\bar{x}) contains the limiting subdifferential of ff at any x¯domf\bar{x}\in\operatorname{dom}f whenever fkeff^{k}\overset{\text{e}}{\rightarrow}f.

Theorem 1 (subdifferentials relationships).

Consider an ADC function f:n¯f:\mathbb{R}^{n}\to\overline{\mathbb{R}}. The following statements hold for any x¯domf\bar{x}\in\operatorname{dom}f.
(a) If ff is e-ADC associated with {fk}\{f^{k}\} and fkf^{k} is lsc, then f(x¯)Af(x¯)\partial f(\bar{x})\subset\partial_{A}f(\bar{x}) and f(x¯)Af(x¯)\partial^{\infty}f(\bar{x})\subset\partial^{\infty}_{A}f(\bar{x}).
(b) If ff is locally Lipschitz continuous and bounded from below, then there exists a sequence of DC functions {fk}\{f^{k}\} such that fkcff^{k}\overset{\text{c}}{\rightarrow}f, f(x¯)Af(x¯)Cf(x¯)\partial f(\bar{x})\subset\partial_{A}f(\bar{x})\subset\partial_{C}f(\bar{x}), and Af(x¯)={0}\partial^{\infty}_{A}f(\bar{x})=\{0\}. Consequently, conAf(x¯)=Cf(x¯)\operatorname{con}\partial_{A}f(\bar{x})=\partial_{C}f(\bar{x}), the set Af(x¯)\partial_{A}f(\bar{x}) is nonempty and bounded, and f(x¯)=Af(x¯)\partial f(\bar{x})=\partial_{A}f(\bar{x}) when ff is subdifferentially regular at x¯\bar{x}.

Proof.

(a) Let gkhkg^{k}-h^{k} be a DC decomposition of fkf^{k}. Since ff is e-ADC, it must be lsc [30, Proposition 7.4(a)]. Using epi-convergence of {fk}\{f^{k}\} to ff, we know from [30, corollary 8.47(b)] and [30, Proposition 8.46(e)] that any element of f(x¯)\partial f(\bar{x}) can be generated as a limit of regular subgradients at xkx^{k} with xkNx¯x^{k}\rightarrow_{N}\bar{x} and fk(xk)Nf(x¯)f^{k}(x^{k})\rightarrow_{N}f(\bar{x}) for some NN\in\mathbb{N}_{\infty}. Indeed, we can further restrict xkdomfkx^{k}\in\operatorname{dom}f^{k} since fk(xk)Nf(x¯)f^{k}(x^{k})\rightarrow_{N}f(\bar{x}) and x¯domf\bar{x}\in\operatorname{dom}f. Then, we have

f(x¯)xk(domfk)x¯Limsupk+^fk(xk)xk(domfk)x¯Limsupk+[gk(xk)hk(xk)]Af(x¯),\partial f(\bar{x})\subset\bigcup\limits_{x^{k}(\in\operatorname{dom}f^{k})\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}{\widehat{\partial}}f^{k}(x^{k})\subset\bigcup\limits_{x^{k}(\in\operatorname{dom}f^{k})\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\big{[}\partial g^{k}(x^{k})-\partial h^{k}(x^{k})\big{]}\subset\partial_{A}f(\bar{x}),

where the second inclusion can be verified as follows: Firstly, due to the lower semicontinuity of fkf^{k} and hkh^{k}, and xkdomfkdomgkx^{k}\in\operatorname{dom}f^{k}\subset\operatorname{dom}g^{k}, it follows from the sum rule of regular subdifferentials [30, corollary 10.9] that ^gk(xk)^fk(xk)+^hk(xk)\widehat{\partial}g^{k}(x^{k})\supset\widehat{\partial}f^{k}(x^{k})+\widehat{\partial}h^{k}(x^{k}). Consequently, ^fk(xk)^gk(xk)^hk(xk)=gk(xk)hk(xk)\widehat{\partial}f^{k}(x^{k})\subset\widehat{\partial}g^{k}(x^{k})-\widehat{\partial}h^{k}(x^{k})=\partial g^{k}(x^{k})-\partial h^{k}(x^{k}) since gkg^{k} and hkh^{k} are proper and convex [30, Proposition 8.12]. Similarly, by [30, corollary 8.47(b)], we have

f(x¯)xk(domfk)x¯Limsupk+^fk(xk)xk(domfk)x¯Limsupk+[gk(xk)hk(xk)]Af(x¯).\partial^{\infty}f(\bar{x})\subset\bigcup\limits_{x^{k}(\in\operatorname{dom}f^{k})\rightarrow\bar{x}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}}^{\infty}\;{\widehat{\partial}}f^{k}(x^{k})\subset\bigcup\limits_{x^{k}(\in\operatorname{dom}f^{k})\rightarrow\bar{x}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}}^{\infty}\big{[}\partial g^{k}(x^{k})-\partial h^{k}(x^{k})\big{]}\subset\partial^{\infty}_{A}f(\bar{x}).

(b) For a locally Lipschitz continuous function ff, consider its Moreau envelope eγf(x)infz{f(z)+zx2/(2γ)}e_{\gamma}f(x)\triangleq\inf_{z}\{f(z)+\|z-x\|^{2}/(2\gamma)\} and the set-valued mapping Pγf(x)argminz{f(z)+zx2/(2γ)}P_{\gamma f}(x)\triangleq\operatornamewithlimits{argmin}_{z}\{f(z)+\|z-x\|^{2}/(2\gamma)\}. For any sequence γk0\gamma_{k}\downarrow 0, we demonstrate in the following that {fkeγkf}\{f^{k}\triangleq e_{\gamma_{k}}f\} is the desired sequence of approximating functions. Firstly, since ff is bounded from below, it must be prox-bounded and, thus, each fkf^{k} is continuous and fk(x¯)f(x¯)f^{k}(\bar{x})\uparrow f(\bar{x}) for all x¯\bar{x} (cf. [30, Theorem 1.25]). By the continuity of ff and fkf^{k}, we have fkcff^{k}\overset{\text{c}}{\rightarrow}f from [30, Proposition 7.4(c-d)]. It then follows from part (a) that f(x¯)Af(x¯)\partial f(\bar{x})\subset\partial_{A}f(\bar{x}). Consider the following DC decomposition of each fkf^{k}:

fk(x)=x22γkgk(x)supzn{f(z)z22γk+zxγk}hk(x)xn.f^{k}(x)=\underbrace{\frac{\|x\|^{2}}{2\gamma_{k}}}_{\triangleq g^{k}(x)}-\underbrace{\sup_{z\in\mathbb{R}^{n}}\left\{-f(z)-\frac{\|z\|^{2}}{2\gamma_{k}}+\frac{z^{\top}x}{\gamma_{k}}\right\}}_{\triangleq h^{k}(x)}\qquad x\in\mathbb{R}^{n}.

It is clear that f(z)+z2/(2γk)+zx/γkf(z)+\|z\|^{2}/(2\gamma_{k})+z^{\top}x/\gamma_{k} is level-bounded in zz locally uniformly in xx, since for any rr\in\mathbb{R} and any bounded set XnX\subset\mathbb{R}^{n}, the set

{zn|xX,f(z)+z22γkzxγkr}{zn|xX,zx2x2+2γk[rinfzf(z)]}\left\{z\in\mathbb{R}^{n}\middle|\,x\in X,f(z)+\frac{\|z\|^{2}}{2\gamma_{k}}-\frac{z^{\top}x}{\gamma_{k}}\leq r\right\}\subset\left\{z\in\mathbb{R}^{n}\middle|\,x\in X,\|z-x\|^{2}\leq\|x\|^{2}+2\gamma_{k}\left[r-\inf_{z}f(z)\right]\right\}

is bounded. Due to the level-boundedness condition, we can apply the subdifferential formula of the parametric minimization [30, Theorem 10.13] to get

(hk)(x)zPγkf(x){y|(0,y)(z,x)(f(z)+z22γkzxγk)}zPγkf(x){f(z)xγk},\partial(-h^{k})(x)\subset\bigcup_{z\in P_{\gamma_{k}f}(x)}\left\{y\,\middle|\,(0,y)\in\partial_{(z,x)}\left(f(z)+\frac{\|z\|^{2}}{2\gamma_{k}}-\frac{z^{\top}x}{\gamma_{k}}\right)\right\}{\subset}\bigcup_{z\in P_{\gamma_{k}f}(x)}\left\{\partial f(z)-\frac{x}{\gamma_{k}}\right\},

where the last inclusion is due to the calculus rules [30, Proposition 10.5 and exercise 8.8(c)]. Since hkh^{k} is convex, we have hk(x)=C(hk)(x)=con(hk)(x)-\partial h^{k}(x)=\partial_{C}(-h^{k})(x)=\operatorname{con}\partial(-h^{k})(x) by [30, Theorem 9.61], which further yields that

[gk(x)hk(x)]con{f(z)|zPγkf(x)}xn,k.\left[\partial g^{k}(x)-\partial h^{k}(x)\right]\subset\operatorname{con}\,\bigcup\left\{\partial f(z)\,\middle|\,z\in P_{\gamma_{k}f}(x)\right\}\qquad\forall\,x\in\mathbb{R}^{n},\;{k\in\mathbb{N}}. (4)

For any xkx¯x^{k}\rightarrow\bar{x} and any zkPγkf(xk)z^{k}\in P_{\gamma_{k}f}(x^{k}), we have

12γkzkxk2+infxf(x)12γkzkxk2+f(zk)12γkx¯xk2+f(x¯).\frac{1}{2\gamma_{k}}\|z^{k}-x^{k}\|^{2}+\inf_{x}f(x)\leq\frac{1}{2\gamma_{k}}\|z^{k}-x^{k}\|^{2}+f(z^{k})\leq\frac{1}{2\gamma_{k}}\|\bar{x}-x^{k}\|^{2}+f(\bar{x}).

Then, zkxkx¯xk2+2γk[f(x¯)infxf(x)]0\|z^{k}-x^{k}\|\leq\sqrt{\|\bar{x}-x^{k}\|^{2}+2\gamma_{k}[f(\bar{x})-\inf_{x}f(x)]}\rightarrow 0 due to the assumption that ff is bounded from below and therefore zkx¯z^{k}\rightarrow\bar{x}. By the local Lipschitz continuity of ff, it follows from [30, Theorem 9.13] that the mapping f:xf(x)\partial f:x\mapsto\partial f(x) is locally bounded at x¯\bar{x}. Thus, there is a bounded set SS such that {f(zk)zkPγkf(xk)}S\bigcup\{\partial f(z^{k})\mid z^{k}\in P_{\gamma_{k}f}(x^{k})\}\subset S for all sufficiently large kk. It follows directly from [30, Example 4.22] and the definition of the approximate horizon subdifferential that Af(x¯)={0}\partial^{\infty}_{A}f(\bar{x})=\{0\}.

Next, we will prove Af(x¯)Cf(x¯)\partial_{A}f(\bar{x})\subset\partial_{C}f(\bar{x}). For any uAf(x¯)u\in\partial_{A}f(\bar{x}), from (4), there exist sequences of vectors xkx¯x^{k}\rightarrow\bar{x} and ukuu^{k}\rightarrow u with each uku^{k} taken from the convex hull of a bounded set {f(zk)zkPγkf(xk)}\bigcup\{\partial f(z^{k})\mid z^{k}\in P_{\gamma_{k}f}(x^{k})\}. By Carathéodory’s Theorem (see, e.g. [27, Theorem 17.1]), for each kk, we have uk=i=1n+1λk,ivk,iu^{k}=\sum_{i=1}^{n+1}\lambda_{k,i}\;v^{k,i} for some nonnegative scalars {λk,i}i=1n+1\{\lambda_{k,i}\}^{n+1}_{i=1} with i=1n+1λk,i=1\sum_{i=1}^{n+1}\lambda_{k,i}=1 and a sequence {vk,if(zk,i)}i=1n+1\big{\{}v^{k,i}\in\partial f(z^{k,i})\big{\}}^{n+1}_{i=1} with {zk,iPγkf(xk)}i=1n+1\{z^{k,i}\in P_{\gamma_{k}f}(x^{k})\}^{n+1}_{i=1}. It is easy to see that the sequences {λk,i}k\{\lambda_{k,i}\}_{k\in\mathbb{N}} and {vk,i}k\{v^{k,i}\}_{k\in\mathbb{N}} are bounded for each ii. We can then obtain convergent subsequences λk,iNλ¯i0\lambda_{k,i}\to_{N}\bar{\lambda}_{i}\geq 0 with i=1n+1λ¯i=1\sum_{i=1}^{n+1}\bar{\lambda}_{i}=1 and vk,iNv¯iv^{k,i}\to_{N}{\bar{v}}^{\,i} for each ii. Since zk,ix¯z^{k,i}\to\bar{x}, we have v¯if(x¯){\bar{v}}^{\,i}\in\partial f(\bar{x}) by using the outer semicontinuity of f\partial f. Thus, ukNu=i=1n+1λ¯iv¯iconf(x¯)=Cf(x¯)u^{k}\rightarrow_{N}u=\sum_{i=1}^{n+1}\bar{\lambda}_{i}\,{\bar{v}}^{\,i}\in\operatorname{con}\partial f(\bar{x})=\partial_{C}f(\bar{x}). This implies that Af(x¯)Cf(x¯)\partial_{A}f(\bar{x})\subset\partial_{C}f(\bar{x}). The rest of the statements in (b) follows from the fact that Cf(x¯)\partial_{C}f(\bar{x}) is nonempty and bounded whenever ff is locally Lipschitz continuous [30, Theorem 9.61]. ∎

Under suitable assumptions, Theorem 1(b) guarantees the existence of an ADC decomposition that has its approximate subdifferential contained in the Clarke subdifferential of the original function. Notably, this decomposition may not always be practically useful due to the necessity of computing the Moreau envelope for a generally nonconvex function. Another noteworthy remark is that the assumptions and results of Theorem 1 can be localized to any specific point x¯\bar{x}. This can be accomplished by defining a notion of “local epi-convergence” at x¯\bar{x} and extending the result of [30, corollary 8.47] accordingly.

2.3 Examples of ADC functions.

In this subsection, we provide examples of ADC functions, including functions that are discontinuous relative to their domains, with explicit and computationally tractable approximating sequences. Moreover, we undertake an investigation into the approximate subdifferentials of these ADC functions.

Example 2.1 (implicitly convex-concave functions). The concept of implicitly convex-concave (icc) functions is introduced in the monograph [13], and is further generalized to extended-real-valued functions in [20]. A proper function f:n¯f:\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}} is icc if there exists a lifted function f¯:n×n¯\overline{f}:\mathbb{R}^{n}\times\mathbb{R}^{n}\rightarrow\overline{\mathbb{R}} such that the following three conditions hold:

(i) f¯(z,x)=+\overline{f}(z,x)=+\infty if zdomf,xnz\notin\operatorname{dom}f,x\in\mathbb{R}^{n}, and f¯(z,x)=\overline{f}(z,x)=-\infty if zdomf,xdomfz\in\operatorname{dom}f,x\notin\operatorname{dom}f;

(ii) f¯(,x)\overline{f}(\cdot,x) is convex for any fixed xdomfx\in\operatorname{dom}f, and f¯(z,)\overline{f}(z,\cdot) is concave for any fixed zdomfz\in\operatorname{dom}f;

(iii) f(x)=f¯(x,x)f(x)=\overline{f}(x,x) for any xdomfx\in\operatorname{dom}f.

A notable example of icc functions is the optimal value function fpf_{p} in (1), which is associated with the lifted function defined by (the subscripts/superscripts pp are omitted for brevity):

f¯(z,x)infyd{(c+Cx)y+12yQy|Az+Byb}(x,z)domf×domf.\begin{array}[]{lr}\overline{f}(z,x)\triangleq\displaystyle{\inf_{y\in{\mathbb{R}^{d}}}}\left\{(c+Cx)^{\top}y+\frac{1}{2}y^{\top}Q\,y\;\middle|\;Az+By\leq b\right\}\qquad(x,z)\in\operatorname{dom}f\times\operatorname{dom}f.\end{array} (5)

Let 1f¯(,x)\partial_{1}\overline{f}(\cdot,x) and 2(f¯)(z,)\partial_{2}(-\overline{f})(z,\cdot) denote the subdifferentials of the convex functions f¯(,x)\overline{f}(\cdot,x) and (f¯)(z,)(-\overline{f})(z,\cdot), respectively, for any (x,z)domf×domf(x,z)\in\operatorname{dom}f\times\operatorname{dom}f. For any γ>0\gamma>0, the partial Moreau envelope of an icc function ff associated with f¯\overline{f} is given by

infzn{f¯(z,x)+12γzx2}=x22γgγ(x)supzn{f¯(z,x)z22γ+zxγ}hγ(x)xdomf.\inf_{z\in{\mathbb{R}^{n}}}\left\{\overline{f}(z,x)+\frac{1}{2\gamma}\|z-x\|^{2}\right\}=\underbrace{\frac{\|x\|^{2}}{2\gamma}}_{\triangleq g_{\gamma}(x)}-\underbrace{\sup_{z\in{\mathbb{R}^{n}}}\left\{-\overline{f}(z,x)-\frac{\|z\|^{2}}{2\gamma}+\frac{z^{\top}x}{\gamma}\right\}}_{\triangleq h_{\gamma}(x)}\qquad x\in\operatorname{dom}f.\vspace{-0.1in} (6)

This decomposition, established in [20], offers computational advantages compared to the standard Moreau envelope, as the maximization problem defining hγh_{\gamma} is concave in zz for any fixed xx. In what follows, we present new results on the conditions under which the icc function ff is e-ADC and c-ADC based on the partial Moreau envelope. Additionally, we explore a relationship between Af(x¯)\partial_{A}f(\bar{x}) and 1f¯(x¯,x¯)2(f¯)(x¯,x¯)\partial_{1}\overline{f}(\bar{x},\bar{x})-\partial_{2}(-\overline{f})(\bar{x},\bar{x}), where the latter is known to be an outer estimate of Cf(x¯)\partial_{C}f(\bar{x}) [13, Proposition 4.4.26]. The proof is deferred to Appendix A.

Proposition 2.

Let f:n¯f:{\mathbb{R}^{n}}\to\overline{\mathbb{R}} be a proper, lsc, icc function associated with f¯\overline{f}, where domf\operatorname{dom}f is closed and f¯\overline{f} is lsc on n×domf{\mathbb{R}^{n}}\times\operatorname{dom}f, bounded below on domf×domf\operatorname{dom}f\times\operatorname{dom}f, and continuous relative to int(domf)×int(domf)\operatorname{int}(\operatorname{dom}f)\times\operatorname{int}(\operatorname{dom}f). Given a sequence of scalars γk0\gamma_{k}\downarrow 0, we have:
(a) ff is e-ADC associated with {fk}\{f^{k}\}, where each fk(x)gγk(x)hγk(x)+δdomf(x)f^{k}(x)\triangleq{g_{\gamma_{k}}(x)-h_{\gamma_{k}}(x)}+\delta_{\operatorname{dom}f}(x). In addition, if domf=n\operatorname{dom}f={\mathbb{R}^{n}}, then ff is c-ADC associated with {fk}\{f^{k}\}.
(b) Af(x¯)1f¯(x¯,x¯)2(f¯)(x¯,x¯)\partial_{A}f(\bar{x})\subset\partial_{1}\overline{f}(\bar{x},\bar{x})-\partial_{2}(-\overline{f})(\bar{x},\bar{x}) and Af(x¯)={0}\partial^{\infty}_{A}f(\bar{x})=\{0\} for any x¯int(domf)\bar{x}\in\operatorname{int}(\operatorname{dom}f).

Example 2.2 (VaR for continuous random variables). Given a continuous random variable Y:ΩY:\Omega\to\mathbb{R}, its conditional value-at-risk (CVaR) at a confidence level α(0,1)\alpha\in(0,1) is defined as CVaRα(Y)𝔼[YYVaRα(Y)]\mbox{CVaR}_{\alpha}(Y)\triangleq\mathbb{E}[\,Y\mid Y\geq\mbox{VaR}_{\alpha}(Y)], where VaRα\mbox{VaR}_{\alpha} is the value-at-risk given in Example 1.2 (see, e.g., [29]). For any α(0,1)\alpha\in(0,1) and k>1/αk>1/{\alpha}, we define

gk(x)[k(1α)+1]CVaRα1/k[c(x,Z)],hk(x)k(1α)CVaRα[c(x,Z)]xn.g^{k}(x)\triangleq[k(1-\alpha)+1]\,{\mbox{CVaR}_{\alpha-1/k}}[\,c(x,Z)],\quad h^{k}(x)\triangleq k(1-\alpha)\,{\mbox{CVaR}_{\alpha}}[\,c(x,Z)]\quad x\in\mathbb{R}^{n}. (7)

The following properties of VaR for continuous random variables hold, with proofs provided in Appendix A.

Proposition 3.

Let c:n×mc:\mathbb{R}^{n}\times\mathbb{R}^{m}\to\mathbb{R} be a lsc function and Z:ΩmZ:\Omega\to\mathbb{R}^{m} be a random vector. Suppose that c(,z)c(\cdot,z) is convex for any fixed zmz\in\mathbb{R}^{m}, and c(x,Z)c(x,Z) is a random variable having a continuous distribution induced by that of ZZ for any fixed xnx\in\mathbb{R}^{n}. Additionally, assume that 𝔼[|c(x,Z)|]<+\mathbb{E}[\,|c(x,Z)|\,]<+\infty for any xnx\in\mathbb{R}^{n}. For any given constant α(0,1)\alpha\in(0,1), the following properties hold.
(a) VaRα[c(,Z)]\mbox{VaR}_{\alpha}[\,{c(\cdot,Z)}] is lsc and e-ADC associated with {gkhk}\{g^{k}-h^{k}\} (with the definitions of gkg^{k} and hkh^{k} in (7)). Additionally, if c(,)c(\cdot,\cdot) is continuous, then VaRα[c(,Z)]\mbox{VaR}_{\alpha}[\,{c(\cdot,Z)}] is continuous and c-ADC associated with {gkhk}\{g^{k}-h^{k}\}.
(b) If there exists a measurable function κ:m+\kappa:\mathbb{R}^{m}\to\mathbb{R}_{+} such that 𝔼[κ(Z)]<+\mathbb{E}[\,\kappa(Z)]<+\infty and |c(x,z)c(x,z)|κ(z)xx|c(x,z)-c(x^{\prime},z)|\leq\kappa(z)\|x-x^{\prime}\| for all x,xnx,x^{\prime}\in\mathbb{R}^{n} and zmz\in\mathbb{R}^{m}, then for any x¯n\bar{x}\in\mathbb{R}^{n},

AVaRα[c(,Z)](x¯)=xkx¯Limsupk+𝔼[1c(xk,Z)|VaRα1/k[c(xk,Z)]<c(xk,Z)<VaRα[c(xk,Z)]],\partial_{A}\mbox{VaR}_{\alpha}[\,c(\cdot,Z)](\bar{x})=\bigcup_{x^{k}\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\mathbb{E}\left[\,\partial_{1}\,c(x^{k},Z)\,\middle|\,\mbox{VaR}_{\alpha-1/k}[\,c(x^{k},Z)]<c(x^{k},Z)<\mbox{VaR}_{\alpha}[\,c(x^{k},Z)]\,\right],\vspace{-0.08in}

where 𝔼[𝒜(Z)B]\mathbb{E}[\,\mathcal{A}(Z)\mid B] for a random set-valued mapping 𝒜\mathcal{A} and an event BB is defined as the set of conditional expectations 𝔼[a(Z)B]\mathbb{E}[\,a(Z)\mid B] for all measurable selections a(Z)𝒜(Z)a(Z)\in\mathcal{A}(Z).

3 The convex composite ADC functions and minimization.

This section aims to derive necessary optimality conditions for (CP0), particularly focusing on the inner function fpf_{p} that lacks local Lipschitz continuity. Throughout the rest of this paper, we assume that φp:{+}\varphi_{p}:\mathbb{R}\to\mathbb{R}\cup\{+\infty\} is proper, convex, lsc and fp:nf_{p}:\mathbb{R}^{n}\to\mathbb{R} is real-valued for all p=1,,mp=1,\cdots,m. Depending on whether φp\varphi_{p} is nondecreasing or not, we partition {1,,m}\{1,\cdots,m\} into two categories:

I1{p{1,,m}|φp nondecreasing}andI2{1,,m}\I1.I_{1}\triangleq\left\{\,p\in\{1,\cdots,m\}\,\middle|\,\varphi_{p}\text{ nondecreasing}\right\}\quad\text{and}\quad I_{2}\triangleq\{1,\cdots,m\}\backslash I_{1}. (8)

We do not specifically address the case where φp\varphi_{p} is nonincreasing, as one can always redefine φ~p(t)=φp(t)\widetilde{\varphi}_{p}(t)=\varphi_{p}(-t) and f~p(x)=fp(x)\widetilde{f}_{p}(x)=-f_{p}(x), enabling the treatment of these indices in the same manner as those in I1I_{1}. Therefore, the set I2I_{2} should be viewed as the collection of indices pp where φp\varphi_{p} is not monotone. We further make the following assumptions on the functions φp\varphi_{p} and fpf_{p}.

Assumption 1 For each pp, we have (a) fpf_{p} is e-ADC associated with {fpk=gpkhpk}k\{f^{k}_{p}=g^{k}_{p}-h^{k}_{p}\}_{k\in\mathbb{N}}, and domgpk=domhpk=n\operatorname{dom}g^{k}_{p}=\operatorname{dom}h^{k}_{p}=\mathbb{R}^{n}; (b) <lim infxx,k+fpk(x)lim supxx,k+fpk(x)<+-\infty<\displaystyle\liminf_{x^{\prime}\rightarrow x,\,k\rightarrow+\infty}f^{k}_{p}(x^{\prime})\leq\limsup_{x^{\prime}\rightarrow x,\,k\rightarrow+\infty}f^{k}_{p}(x^{\prime})<+\infty for all xnx\in\mathbb{R}^{n}; (c) [Fpkφpfpk]eFp\big{[}F^{k}_{p}\triangleq\varphi_{p}\circ f^{k}_{p}\big{]}\overset{\text{e}}{\rightarrow}F_{p}.

From Assumption 1(a), each fpkf^{k}_{p} is locally Lipschitz continuous since any real-valued convex function is locally Lipschitz continuous. Obviously, fpkcfpf^{k}_{p}\overset{\text{c}}{\rightarrow}f_{p} is sufficient for Assumption 1(b) to hold. Since fpkefpf^{k}_{p}\overset{\text{e}}{\rightarrow}f_{p}, we have lim infxx,k+fpk(x)fp(x)>\liminf_{x^{\prime}\rightarrow x,k\rightarrow+\infty}f^{k}_{p}(x^{\prime})\geq f_{p}(x)>-\infty for each pp at any xnx\in\mathbb{R}^{n}. However, lim supxx,k+fpk(x)<+\limsup_{x^{\prime}\rightarrow x,k\rightarrow+\infty}f^{k}_{p}(x^{\prime})<+\infty does not hold trivially. For example, consider a continuous function ff and

fk(x)={f(x)+k2x+k if x[1/k,0]f(x)k2x+k if x(0,1/k]f(x) otherwise,f^{k}(x)=\left\{\begin{array}[]{cl}f(x)+k^{2}x+k&\text{ if }x\in[-1/k,0]\\ f(x)-k^{2}x+k&\text{ if }x\in(0,1/k]\\ f(x)&\text{ otherwise}\end{array}\right.,

which results in fkeff^{k}\overset{\text{e}}{\rightarrow}f but lim supk+fk(0)=+{\limsup_{k\rightarrow+\infty}f^{k}(0)=+\infty}. Additionally, Assumption 1(b) ensures that at each point xx and for any sequence xkxx^{k}\to x, the sequence {fpk(xk)}k\{f^{k}_{p}(x^{k})\}_{k\in\mathbb{N}} must be bounded.

It follows from [30, Exercise 7.8(c)] and [32, Theorem 2.4] that there are several sufficient conditions for Assumption 1(c) to hold, which differ based on the monotonicity of each φp\varphi_{p}: (i) For pI1p\in I_{1}, either φp\varphi_{p} is real-valued or fpkfpf^{k}_{p}\leq f_{p}; (ii) For pI2p\in I_{2}, fpf_{p} is c-ADC and for all xx with fp(x)bdry(domφp)f_{p}(x)\in\operatorname{bdry}(\operatorname{dom}{\varphi_{p}}), there exists a sequence xkx{x^{k}}\to x with f(xk)int(domφp)f(x^{k})\in\operatorname{int}(\operatorname{dom}\varphi_{p}). In addition, according to [30, Proposition 7.4(a)], Assumption 1(c) implies that Fp=φpfpF_{p}=\varphi_{p}\circ f_{p} is lsc. We also note that Assumption 1(c) doesn’t necessarily imply p=1mFpkep=1mFp\sum_{p=1}^{m}F^{k}_{p}\overset{\text{e}}{\rightarrow}\sum_{p=1}^{m}F_{p}. To maintain epi-convergence under addition of functions, one may refer to the sufficient conditions in [30, Theorem 7.46].

3.1 Asymptotic stationarity under epi-convergence.

In this subsection, we introduce a novel stationarity concept for problem (CP0), grounded in a monotonic decomposition of univariate convex functions. We demonstrate that under certain constraint qualifications, epi-convergence of approximating functions ensures this stationarity concept as a necessary optimality condition. Alongside the known fact that epi-convergence also ensures the consistency of global optimal solutions [30, Theorem 7.31(b)], this highlights the usefulness of epi-convergence as a tool for studying the approximation of problem (CP0).

The following lemma is an extension of [13, Lemma 6.1.1] from real-valued univariate convex functions to extended-real-valued univariate convex functions.

Lemma 1 (a monotonic decomposition of univariate convex functions).

Let φ:¯\varphi:\mathbb{R}\rightarrow\overline{\mathbb{R}} be a proper, lsc and convex function. Then there exist a proper, lsc, convex and nondecreasing function φ\varphi^{\uparrow}, as well as a proper, lsc, convex and nonincreasing function φ\varphi^{\downarrow}, such that φ=φ+φ\varphi=\varphi^{\uparrow}+\varphi^{\downarrow}. In addition, if int(domφ)\operatorname{int}(\operatorname{dom}\varphi)\neq\emptyset, then φ(z)=φ(z)+φ(z)\partial\varphi(z)=\partial\varphi^{\uparrow}(z)+\partial\varphi^{\downarrow}(z) for any zdomφz\in\operatorname{dom}\varphi.

Proof.

From the convexity of φ\varphi, domφ\operatorname{dom}\varphi is an interval on \mathbb{R}, possibly unbounded. In fact, we can explicitly construct φ\varphi^{\uparrow} and φ\varphi^{\downarrow} in following two cases.

Case 1. If φ\varphi has no direction of recession, i.e., there does not exist d0d\neq 0 such that for any zz, φ(z+λd)\varphi(z+\lambda d) is a nonincreasing function of λ>0\lambda>0, it follows from [27, Theorem 27.2] that φ\varphi attains its minimum at some zdomφz^{\ast}\in\operatorname{dom}\varphi. Define

φ(z)={φ(z)if zzφ(z)if z>zandφ(z)={φ(z)φ(z)if zz0if z>z.\varphi^{\uparrow}(z)=\left\{\begin{array}[]{cl}\varphi(z^{\ast})&\text{if }z\leq z^{\ast}\\ \varphi(z)&\text{if }z>z^{\ast}\end{array}\right.\quad\mbox{and}\quad\varphi^{\downarrow}(z)=\left\{\begin{array}[]{cl}\varphi(z)-\varphi(z^{\ast})&\text{if }z\leq z^{\ast}\\ 0&\text{if }z>z^{\ast}\end{array}\right..

Observe that int(domφ)[int(domφ)int(domφ)]\emptyset\neq\operatorname{int}(\operatorname{dom}\varphi)\subset\big{[}\operatorname{int}(\operatorname{dom}\varphi^{\uparrow})\cap\operatorname{int}(\operatorname{dom}\varphi^{\downarrow})\big{]}. Consequently, from [27, Theorem 23.8], we have φ(z)=φ(z)+φ(z)\partial\varphi(z)=\partial\varphi^{\uparrow}(z)+\partial\varphi^{\downarrow}(z) for any zz\in\mathbb{R}.

Case 2. Otherwise, there exists d0d\neq 0 such that for any zz\in\mathbb{R}, φ(z+λd)\varphi(z+\lambda d) is a nonincreasing function of λ>0\lambda>0. Consequently, domφ\operatorname{dom}\varphi must be an unbounded interval on \mathbb{R}. Let d=1d=1 (or 1-1) be such a recession direction, then φ\varphi is nonincreasing (or nondecreasing) on \mathbb{R}. We can set φ=0\varphi^{\uparrow}=0 and φ=φ\varphi^{\downarrow}=\varphi (or φ=φ\varphi^{\uparrow}=\varphi and φ=0\varphi^{\downarrow}=0). In this case, it is obvious that φ(z)=φ(z)+φ(z)\partial\varphi(z)=\partial\varphi^{\uparrow}(z)+\partial\varphi^{\downarrow}(z) for any zz\in\mathbb{R}. The proof is thus completed. ∎

In the subsequent analysis, we use φ\varphi^{\uparrow} and φ\varphi^{\downarrow} to denote the monotonic decomposition of any univariate, proper, lsc, and convex function φ\varphi constructed in the proof of Lemma 1 and, in particular, we take φ=0\varphi^{\downarrow}=0 whenever φ\varphi is nondecreasing. We are now ready to present the definition of asymptotically stationary points.

Definition 4 (asymptotically stationary points).

Let each fpf_{p} be an ADC function associated with {fpk=gpkhpk}k\{f^{k}_{p}=g^{k}_{p}-h^{k}_{p}\}_{k\in\mathbb{N}}. For each pp, define

Tp(x){tp|N,xkx with fpk(xk)Ntp}xn.T_{p}(x)\triangleq\left\{t_{p}\,\middle|\,\exists\,N\in\mathbb{N}_{\infty}^{\sharp},x^{k}\rightarrow x\text{ with }f^{k}_{p}(x^{k})\rightarrow_{N}t_{p}\right\}\qquad x\in\mathbb{R}^{n}. (9)

We say that x¯\bar{x} is an asymptotically stationary (A-stationary) point of problem (CP0) if for each pp, there exists yp{φp(tp)tpTp(x¯)}y_{p}\in\bigcup\{\partial\varphi_{p}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\} such that

0p=1m({ypAfp(x¯)}[±Afp(x¯)\{0}]).0\,\in\,\sum\limits_{p=1}^{m}\left(\,\big{\{}y_{p}\,\partial_{A}f_{p}(\bar{x})\big{\}}\cup{\big{[}\pm\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\,\big{]}}\,\right). (10)

We say that x¯\bar{x} is a weakly asymptotically stationary (weakly A-stationary) point of problem (CP0) if for each pp, there exist t¯pTp(x¯)\bar{t}_{p}\in T_{p}(\bar{x}), yp,1φp(t¯p)y_{p,1}\in\partial\varphi^{\uparrow}_{p}(\bar{t}_{p}) and yp,2φp(t¯p)y_{p,2}\in\partial\varphi^{\downarrow}_{p}(\bar{t}_{p}) such that

0p=1m({yp,1Afp(x¯)+yp,2Afp(x¯)}[±Afp(x¯)\{0}]).0\in\sum\limits_{p=1}^{m}\left(\,\left\{y_{p,1}\,\partial_{A}f_{p}(\bar{x})+y_{p,2}\,\partial_{A}f_{p}(\bar{x})\right\}\cup\big{[}\pm\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\big{]}\,\right).
Remark 1.

(i) Given that the approximate subdifferential Afp\partial_{A}f_{p} is determined by the approximating sequence {fpk}k\{f^{k}_{p}\}_{k\in\mathbb{N}} and their corresponding DC decompositions, the notion of (weak) A-stationarity also depends on these sequences and decompositions. (ii) It follows directly from Lemma 1 that an A-stationary point must be a weakly A-stationary point if int(domφp)\operatorname{int}(\operatorname{dom}\varphi_{p})\neq\emptyset for each p=1,,mp=1,\cdots,m. (iii) When each φp\varphi_{p} is nondecreasing or nonincreasing, the concepts of weak A-stationarity and A-stationarity coincide. (iv) Given a point x¯\bar{x}, we can rewrite (10) as

0pI[±Afp(x¯)\{0}]+p{1,,m}\I{ypAfp(x¯)}0\in\sum_{p\in I}\big{[}\pm\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\big{]}+\sum_{p\in\{1,\cdots,m\}\backslash I}\{y_{p}\,\partial_{A}f_{p}(\bar{x})\}

for some index set I{1,,m}I\subset\{1,\cdots,m\} that is potentially empty. For each pIp\in I, although the scalar ypy_{p} does not explicitly appear in this inclusion, its existence implies that {φp(tp)tpTp(x¯)}\bigcup\{\partial\varphi_{p}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\}\neq\emptyset, which plays a role in ensuring x¯dom(φpfp)\bar{x}\in\operatorname{dom}(\varphi_{p}\circ f_{p}). For instance, if fpkcfpf^{k}_{p}\overset{\text{c}}{\rightarrow}f_{p} for some pIp\in I, then Tp(x¯)={fp(x¯)}T_{p}(\bar{x})=\{f_{p}(\bar{x})\}, and the existence of yp{φp(tp)tpTp(x¯)}=φp(fp(x¯))y_{p}\in\bigcup\{\partial\varphi_{p}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\}=\partial\varphi_{p}(f_{p}(\bar{x})) yields x¯dom(φpfp)\bar{x}\in\operatorname{dom}(\varphi_{p}\circ f_{p}).

In the following, we take a detour to compare the A-stationarity with the stationarity defined in [32], where the author has focused on a more general composite problem

minimizexnφ(f(x)),\operatornamewithlimits{minimize}_{x\in\mathbb{R}^{n}}\;\varphi\left(f(x)\right),

where φ:m¯\varphi:\mathbb{R}^{m}\to\overline{\mathbb{R}} is proper, lsc, convex and f(f1,,fm):nmf\triangleq(f_{1},\cdots,f_{m}):\mathbb{R}^{n}\to\mathbb{R}^{m} is a locally Lipschitz continuous mapping. Consider the special case where φ(z)=p=1mφp(zp)\varphi(z)=\sum_{p=1}^{m}\varphi_{p}(z_{p}) with z=(z1,,zm)z=(z_{1},\cdots,z_{m}). Under this setting, a vector x¯\bar{x} is called a stationary point in [32] if there exist y¯\bar{y} and z¯\bar{z} such that

0S(x¯,y¯,z¯){(f1(x¯),,fm(x¯))z¯}×{φ1(z¯1)××φm(z¯m)y¯}×(p=1my¯pCfp(x¯)),0\in S(\bar{x},\bar{y},\bar{z})\triangleq\big{\{}(f_{1}(\bar{x}),\cdots,f_{m}(\bar{x}))-\bar{z}\big{\}}\times\big{\{}\partial\varphi_{1}(\bar{z}_{1})\times\cdots\times\partial\varphi_{m}(\bar{z}_{m})-\bar{y}\big{\}}\times\left(\sum_{p=1}^{m}\bar{y}_{p}\,\partial_{C}f_{p}(\bar{x})\right), (11)

which can be equivalently written as

0p=1my¯pCfp(x¯) for some y¯pφp(fp(x¯))p=1,,m.0\in\sum_{p=1}^{m}\bar{y}_{p}\,\partial_{C}f_{p}(\bar{x})\;\text{ for some }\;\bar{y}_{p}\in\partial\varphi_{p}(f_{p}(\bar{x}))\quad p=1,\cdots,m. (12)

For any fixed k{k\in\mathbb{N}}, a surrogate set-valued mapping SkS^{k} can be defined similarly as SS in (11) by substituting fpf_{p} and φp\varphi_{p} with fpkf^{k}_{p} and φpk\varphi^{k}_{p} for each pp. The cited paper provides sufficient conditions to ensure Limsupk+(gphSk)gphS\operatorname{Lim\,sup}_{k\rightarrow+\infty}(\operatorname{gph}S^{k})\subset\operatorname{gph}S, which asserts that any accumulation point (x¯,y¯,z¯)(\bar{x},\bar{y},\bar{z}) of a sequence {(xk,yk,zk)}\{(x^{k},y^{k},z^{k})\} with 0Sk(xk,yk,zk)0\in S^{k}(x^{k},y^{k},z^{k}) yields a stationary point x¯\bar{x}. Our study on the asymptotic stationarity differs from [32] in the following aspects:

  1. 1.

    Our outer convex function φ\varphi is assumed to have the separable form p=1mφp\sum_{p=1}^{m}\varphi_{p}, while [32] allows a general proper, lsc, convex function. In addition, each φp\varphi_{p} is fixed in our approximating problem while [32] considers a sequence of convex functions {φpk}k\{\varphi^{k}_{p}\}_{k\in\mathbb{N}} that epi-converges to φp\varphi_{p}.

  2. 2.

    We do not require the inner function fpf_{p} to be locally Lipschitz continuous.
    If each fpf_{p} is locally Lipschitz continuous and bounded from below, it then follows from Theorem 1 that fpf_{p} is c-ADC associated with {fpk=gpkhpk}k\{f^{k}_{p}=g^{k}_{p}-h^{k}_{p}\}_{k\in\mathbb{N}} such that fp(x)Afp(x)Cfp(x)\partial f_{p}(x)\subset\partial_{A}f_{p}(x)\subset\partial_{C}f_{p}(x) and Afp(x)={0}\partial^{\infty}_{A}f_{p}(x)=\{0\} for any xx. Moreover, by fpkcfpf^{k}_{p}\overset{\text{c}}{\rightarrow}f_{p}, one has Tp(x)={fp(x)}T_{p}(x)=\{f_{p}(x)\}. Thus, for any A-stationary point x¯\bar{x} induced by these ADC decompositions, there exists y¯pφp(fp(x¯))\bar{y}_{p}\in\partial\varphi_{p}(f_{p}(\bar{x})) for each pp such that

    0p=1m{y¯pAfp(x¯)}p=1m{y¯pCfp(x¯)}.0\in\sum_{p=1}^{m}\left\{\bar{y}_{p}\,\partial_{A}f_{p}(\bar{x})\right\}\;\subset\;\sum_{p=1}^{m}\left\{\bar{y}_{p}\,\partial_{C}f_{p}(\bar{x})\right\}. (13)

    Hence, x¯\bar{x} is also a stationary point defined in (12). Indeed, A-stationarity here can be sharper than the latter one as the last inclusion in (13) may not hold with equality.
    When fpf_{p} fails to be locally Lipschitz continuous for some pp, it is not known if (11) is still a necessary condition for a local solution of (CP0). This situation further complicates the fulfillment of conditions outlined in [32, Theorem 2.4], especially the requirement of fpkcfpf^{k}_{p}\overset{\text{c}}{\rightarrow}f_{p}, due to the potential discontinuity of fpf_{p}. As will be shown in Theorem 2 below, despite these challenges, weak A-stationarity continues to be a necessary optimality condition under Assumption 1.

To proceed, for each pp and any xdom(φpfp)x\in\operatorname{dom}(\varphi_{p}\circ f_{p}), we define Sp(x)S_{p}(x) to be a collection of sequences:

Sp(x){{xpk}k|xpkx with φp(fpk(xpk))φp(fp(x))}.{S_{p}(x)\,\triangleq\,\left\{\,\{x^{k}_{p}\}_{k\in\mathbb{N}}\,\middle|\,x^{k}_{p}\rightarrow x\text{ with }\varphi_{p}(f^{k}_{p}(x^{k}_{p}))\rightarrow\varphi_{p}(f_{p}(x))\,\right\}.} (14)
Theorem 2 (necessary conditions for optimality).

Let x¯p=1mdomFp\bar{x}\in\bigcap_{p=1}^{m}\operatorname{dom}F_{p} be a local minimizer of problem (CP0). Suppose that Assumption 1 and the following two conditions hold:
(i) For each pp and any sequence {xpk}kSp(x¯)\{x^{k}_{p}\}_{k\in\mathbb{N}}\in S_{p}(\bar{x}), there is a positive integer KK such that

0Cfpk(xpk)or𝒩domφp(fpk(xpk))={0}kK,0\notin\partial_{C}f^{k}_{p}(x^{k}_{p})\quad\text{or}\quad\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f^{k}_{p}(x^{k}_{p}))=\{0\}\quad\forall\,k\geq K, (15)

and

[0ypAfp(x¯),yp{𝒩domφp(tp)tpTp(x¯)}]yp=0,p=1,,m.\bigg{[}0\in y_{p}\,\partial_{A}f_{p}(\bar{x}),\;y_{p}\in\bigcup\big{\{}\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\big{\}}\bigg{]}\quad\Longrightarrow\quad y_{p}=0,\quad p=1,\cdots,m. (16)

(ii) One has

[p=1mwp=0,wp(φpfp)(x¯)]w1==wm=0.\left[\sum\limits_{p=1}^{m}w_{p}=0,\;w_{p}\in\partial^{\infty}(\varphi_{p}\circ f_{p})(\bar{x})\right]\quad\Longrightarrow\quad w_{1}=\cdots=w_{m}=0. (17)

Then x¯\bar{x} is an A-stationary point of (CP0). Additionally, x¯\bar{x} is a weakly A-stationary point of (CP0) if int(domφp)\operatorname{int}(\operatorname{dom}\varphi_{p})\neq\emptyset for each p=1,,mp=1,\cdots,m.

Proof.

By using Fermat’s rule [30, Theorem 10.1] and the sum rule of the limiting subdifferentials [30, Corrollary 10.9] due to the condition (17), we have

0[p=1m(φpfp)(x¯)]p=1m(φpfp)(x¯)(i)p=1m{xpk}kSp(x¯)Limsupk+(φpfpk)(xpk)(ii)p=1m{xpk}kSp(x¯)Limsupk+{(ypkfpk)(xpk)|ypkφp(fpk(xpk))}(iii)p=1m{xpk}kSp(x¯)Limsupk+{ypkvpk|ypkφp(fpk(xpk)),vpkCfpk(xpk)}(iv)p=1m{xpk}kSp(x¯)Limsupk+{ypkvpk|ypkφp(fpk(xpk)),vpk[gpk(xpk)hpk(xpk)]}.\begin{split}0&\;\in\partial\left[\sum\limits_{p=1}^{m}(\varphi_{p}\circ f_{p})(\bar{x})\right]\subset\sum\limits_{p=1}^{m}\partial(\varphi_{p}\circ f_{p})(\bar{x})\;\overset{(\rm i)}{\subset}\;\sum\limits_{p=1}^{m}\bigcup_{\{x^{k}_{p}\}_{k\in\mathbb{N}}\in{S_{p}(\bar{x})}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\partial(\varphi_{p}\circ f^{k}_{p})(x^{k}_{p})\\ &\overset{(\rm ii)}{\subset}\sum\limits_{p=1}^{m}\bigcup_{\{x^{k}_{p}\}_{k\in\mathbb{N}}\in{S_{p}(\bar{x})}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\bigcup\left\{\partial(y^{k}_{p}\,f^{k}_{p})(x^{k}_{p})\,\middle|\,y^{k}_{p}\in\partial\varphi_{p}(f^{k}_{p}(x^{k}_{p}))\right\}\\ &\overset{(\rm iii)}{\subset}\sum\limits_{p=1}^{m}\bigcup_{\{x^{k}_{p}\}_{k\in\mathbb{N}}\in{S_{p}(\bar{x})}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\left\{y^{k}_{p}\,v^{k}_{p}\,\middle|\,y^{k}_{p}\in\partial\varphi_{p}(f^{k}_{p}(x^{k}_{p})),\,v^{k}_{p}\in\partial_{C}f^{k}_{p}(x^{k}_{p})\right\}\\ &\overset{(\rm iv)}{\subset}\sum\limits_{p=1}^{m}\bigcup_{\{x^{k}_{p}\}_{k\in\mathbb{N}}\in{S_{p}(\bar{x})}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\left\{y^{k}_{p}\,v^{k}_{p}\,\middle|\,y^{k}_{p}\in\partial\varphi_{p}(f^{k}_{p}(x^{k}_{p})),\,v^{k}_{p}\in\left[\partial g^{k}_{p}(x^{k}_{p})-\partial h^{k}_{p}(x^{k}_{p})\right]\right\}.\end{split} (18)

The inclusion (i)(\rm i) is due to φpfpkeφpfp\varphi_{p}\circ f^{k}_{p}\overset{\text{e}}{\rightarrow}\varphi_{p}\circ f_{p} in Assumption 1(c) and approximation of subgradients under epi-convergence [30, corollary 8.47] and [30, Proposition 8.46(e)]; (ii)(\rm ii) follows from the nonsmooth Lagrange multiplier rule [30, Exercise 10.52] due to the local Lipschitz continuity of fpkf^{k}_{p} [30, Example 9.14] and the condition (15); (iii)(\rm iii) and (iv)(\rm iv) use the calculus rules of the Clarke subdifferential [12, Chapter 2.3]. For each pp, any sequence {xpk}kSp(x¯)\{x^{k}_{p}\}_{k\in\mathbb{N}}\in S_{p}(\bar{x}) and any element

w¯pLimsupk+{ypkvpk|ypkφp(fpk(xpk)),vpk[gpk(xpk)hpk(xpk)]},\bar{w}_{p}\in\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\left\{y^{k}_{p}\,v^{k}_{p}\,\middle|\,y^{k}_{p}\in\partial\varphi_{p}(f^{k}_{p}(x^{k}_{p})),v^{k}_{p}\in\big{[}\partial g^{k}_{p}(x^{k}_{p})-\partial h^{k}_{p}(x^{k}_{p})\big{]}\right\},

there is a subsequence wpkNw¯pw^{k}_{p}\rightarrow_{N}\bar{w}_{p} with wpk=ypkvpkw^{k}_{p}=y^{k}_{p}\,v^{k}_{p} for some NN\in\mathbb{N}^{\sharp}_{\infty}. Next, we show the existence of y¯p{φp(tp)tpTp(x¯)}\bar{y}_{p}\in\bigcup\{\partial\varphi_{p}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\} for each pp such that

w¯p{y¯pAfp(x¯)}[±Afp(x¯)\{0}].\bar{w}_{p}\in\left\{\,\bar{y}_{p}\,\partial_{A}f_{p}(\bar{x})\,\right\}\,\cup\,\big{[}\pm\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\,\big{]}. (19)

By Assumption 1(b), the subsequence {fpk(xpk)}kN\{f^{k}_{p}(x^{k}_{p})\}_{k\in N} is bounded. Taking a subsequence if necessary, we can suppose that fpk(xpk)Nz¯pTp(x¯)f^{k}_{p}(x^{k}_{p})\to_{N}\bar{z}_{p}\in T_{p}(\bar{x}). If {ypk}kN\{y^{k}_{p}\}_{k\in N} is unbounded, then {vpk}kN\{v^{k}_{p}\}_{k\in N} has a subsequence converging to 0 and, thus, 0Afp(x¯)0\in\partial_{A}f_{p}(\bar{x}). Additionally, there exists y~p0\widetilde{y}_{p}\neq 0 such that

ypk|ypk|Ny~pLimsupk(N)+φp(fpk(xpk))=(v)Limsupk(N)+^φp(fpk(xpk))(vi)φp(z¯p)=(vii)𝒩domφp(z¯p).\hskip-5.05942pt\frac{y^{k}_{p}}{|y^{k}_{p}|}\rightarrow_{N}\widetilde{y}_{p}\in{\displaystyle\operatornamewithlimits{Lim\,sup}_{k(\in N)\rightarrow+\infty}}^{\infty}\;\partial\varphi_{p}(f^{k}_{p}(x^{k}_{p}))\overset{(\rm v)}{=}{\displaystyle\operatornamewithlimits{Lim\,sup}_{k(\in N)\rightarrow+\infty}}^{\infty}\;\widehat{\partial}\varphi_{p}(f^{k}_{p}(x^{k}_{p}))\overset{(\rm vi)}{\subset}\partial^{\infty}\varphi_{p}(\bar{z}_{p})\overset{(\rm vii)}{=}\mathcal{N}_{\operatorname{dom}\varphi_{p}}(\bar{z}_{p}). (20)

The equation (v)(\rm v) follows from [30, Proposition 8.12] by the convexity of φp\varphi_{p}. From {xpk}kSp(x¯)\{x^{k}_{p}\}_{k\in\mathbb{N}}\in{S_{p}(\bar{x})} and x¯domFp\bar{x}\in\operatorname{dom}F_{p}, we must have fpk(xpk)domφpf^{k}_{p}(x^{k}_{p})\in\operatorname{dom}\varphi_{p} for sufficiently large kNk\in N. Since φp\varphi_{p} is lsc, it holds that φp(z¯p)lim infk(N)+φp(fpk(xpk))=φp(fp(x¯))\varphi_{p}(\bar{z}_{p})\leq\liminf_{k(\in N)\rightarrow+\infty}\varphi_{p}(f^{k}_{p}(x^{k}_{p}))=\varphi_{p}(f_{p}(\bar{x})) and, thus, z¯pdomφp\bar{z}_{p}\in\operatorname{dom}\varphi_{p}. Also, notice that φp\varphi_{p} is continuous relative to its domain as it is univariate convex and lsc [27, Theorem 10.2]. This continuity implies φp(fpk(xpk))Nφp(z¯p)\varphi_{p}(f^{k}_{p}(x^{k}_{p}))\rightarrow_{N}\varphi_{p}(\bar{z}_{p}). The inclusion (vi)(\rm vi) follows directly from the definition of the horizon subdifferential. Lastly, (vii)(\rm vii) is due to the lower semicontinuity of the proper convex function φ\varphi and [30, Proposition 8.12]. Therefore, we have (0)y~p{𝒩domφp(tp)tpTp(x¯)}(0\neq)\widetilde{y}_{p}\in\bigcup\{\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\big{\}} with 0y~pAfp(x¯)0\in\widetilde{y}_{p}\partial_{A}\,f_{p}(\bar{x}) due to 0Afp(x¯)0\in\partial_{A}f_{p}(\bar{x}), contradicting (16). So far, we conclude that {ypk}kN\{y^{k}_{p}\}_{k\in N} is a bounded sequence. Suppose that ypkNy¯py^{k}_{p}\rightarrow_{N}\bar{y}_{p} and, thus, y¯pφp(z¯p)\bar{y}_{p}\in\partial\varphi_{p}(\bar{z}_{p}) by the outer semicontinuity of φp\partial\varphi_{p} [30, Proposition 8.7].

Case 1. If y¯p=0\bar{y}_{p}=0, inclusion (19) holds trivially for w¯p=0\bar{w}_{p}=0, and for w¯p0\bar{w}_{p}\neq 0 we can find a subsequence {|ypk|}kN0\{|y^{k}_{p}|\}_{k\in N^{\prime}}\downarrow 0 such that {|ypk|vpk}kN\{|y^{k}_{p}|\,v^{k}_{p}\}_{k\in N^{\prime}} converges to w¯p\bar{w}_{p} or w¯p(0)-\bar{w}_{p}(\neq 0) with vpk[gpk(xpk)hpk(xpk)]v^{k}_{p}\in\big{[}\partial g^{k}_{p}(x^{k}_{p})-\partial h^{k}_{p}(x^{k}_{p})\big{]} for all kNk\in N^{\prime}. Therefore, (19) follows from

w¯p[(±Limsupk+[gpk(xpk)hpk(xpk)])\{0}][±Afp(x¯)\{0}].\bar{w}_{p}\in\left[\,\Big{(}\pm{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}}^{\infty}\;\big{[}\partial g^{k}_{p}(x^{k}_{p})-\partial h^{k}_{p}(x^{k}_{p})\big{]}\Big{)}\backslash\{0\}\,\right]\subset\big{[}\pm\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\big{]}.

Case 2. Otherwise, vpkNw¯p/|y¯p|\|v^{k}_{p}\|\rightarrow_{N}\|\bar{w}_{p}\|/|\bar{y}_{p}|. This means that {vpk}kN\{v^{k}_{p}\}_{k\in N} is bounded. Suppose vpkNv¯pv^{k}_{p}\to_{N}\bar{v}_{p}. Then, v¯pLimsupk+[gpk(xpk)hpk(xpk)]Afp(x¯)\bar{v}_{p}\in\operatorname{Lim\,sup}_{k\rightarrow+\infty}\big{[}\partial g^{k}_{p}(x^{k}_{p})-\partial h^{k}_{p}(x^{k}_{p})\big{]}\subset\partial_{A}f_{p}(\bar{x}), and (19) is evident from w¯p=y¯pv¯p\bar{w}_{p}=\bar{y}_{p}\,\bar{v}_{p}.

In either case, we have proved (19). Combining (18) with (19), for some y¯p{φp(tp)tpTp(x¯)}\bar{y}_{p}\in\bigcup\{\partial\varphi_{p}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\}, we know that x¯\bar{x} is an A-stationary point of (CP0). The final assertion follows from Remark 1(ii). ∎

3.2 An example of A-stationarity.

We present an example to illustrate the concept of A-stationarity and to study its relationship with other known optimality conditions.

Example 3.1 (bi-parametrized two-stage stochastic programs). Consider the following bi-parametrized two-stage stochastic program with fixed scenarios described in [21]:

minimizexnθ(x)+1m1p=1m1fp(x)subject toϕp(x)0,p=1,,m2,\displaystyle\operatornamewithlimits{minimize}_{x\in{\mathbb{R}^{n}}}\;\,\theta(x)+\frac{1}{m_{1}}\sum_{p=1}^{m_{1}}f_{p}(x)\quad\mbox{subject to}\;\;\phi_{p}(x)\leq 0,\quad p=1,\cdots,m_{2}, (21)

where θ,ϕp:n\theta,\phi_{p}:{\mathbb{R}^{n}}\to\mathbb{R} are convex, continuously differentiable for p=1,,m2p=1,\cdots,m_{2}, and fpf_{p}, as defined in (1), is real-valued for p=1,,m1p=1,\cdots,m_{1}. At x=x¯x=\bar{x}, let Yp(x¯)Y_{p}(\bar{x}) and Λp(x¯)\Lambda_{p}(\bar{x}) represent the optimal solutions and multipliers for each second-stage problem (1). Suppose that Yp(x¯)Y_{p}(\bar{x}) and Λp(x¯)\Lambda_{p}(\bar{x}) are bounded. Note that θ\theta and ϕp\phi_{p} are ADC functions since they are convex. Example 2.1 shows that fpf_{p} is an ADC function, and therefore, problem (21) is a specific case of the composite model (CP0). Given an A-stationary point x¯\bar{x} of (21), under the assumptions of Example 2.1, we have

0\displaystyle\hskip-7.22743pt0 θ(x¯)+1m1p=1m1({Afp(x¯)}[±Afp(x¯)\{0}])+p=1m2μ¯m1+pϕp(x¯)\displaystyle\in\nabla\theta(\bar{x})+\frac{1}{m_{1}}\sum_{p=1}^{m_{1}}\big{(}\left\{\partial_{A}f_{p}(\bar{x})\right\}\cup\left[\,\pm\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\right]\big{)}+\sum_{p=1}^{m_{2}}\bar{\mu}^{\,m_{1}+p}\,\nabla\phi_{p}(\bar{x}) (22)
θ(x¯)+1m1p=1m1{1f¯p(x¯,x¯)2(f¯p)(x¯,x¯)}+p=1m2μ¯m1+pϕp(x¯),\displaystyle\subset\nabla\theta(\bar{x})+\frac{1}{m_{1}}\sum_{p=1}^{m_{1}}\left\{\partial_{1}\overline{f}_{p}(\bar{x},\bar{x})-\partial_{2}(-\overline{f}_{p})(\bar{x},\bar{x})\right\}+\sum_{p=1}^{m_{2}}\bar{\mu}^{\,m_{1}+p}\,\nabla\phi_{p}(\bar{x}),

where μ¯m1+p𝒩(,0](ϕp(x¯))\bar{\mu}^{m_{1}+p}\in\mathcal{N}_{(-\infty,0]}(\phi_{p}(\bar{x})) for p=1,,m2p=1,\cdots,m_{2} and f¯p\overline{f}_{p} is defined in (5) for p=1,,m1p=1,\cdots,m_{1}. By assumptions, both Λp(x¯)\Lambda_{p}(\bar{x}) and Yp(x¯)Y_{p}(\bar{x}) are nonempty, bounded, and

Λp(x¯)×Yp(x¯)={(y¯p,μ¯p)|cp+Cpx¯+Qpy¯p+(Bp)μ¯p=0, 0bpApx¯Bpy¯pμ¯p0}.\Lambda_{p}(\bar{x})\times Y_{p}(\bar{x})=\left\{(\bar{y}^{p},\bar{\mu}^{p})\,\middle|\,c^{p}+C^{p}\bar{x}+Q^{p}\,\bar{y}^{p}+(B^{p})^{\top}\bar{\mu}^{p}=0,\;0\leq b^{p}-A^{p}\bar{x}-B^{p}\bar{y}^{p}\;\perp\;\bar{\mu}^{p}\geq 0\right\}.

It then follows from Danskin’s Theorem [11, Theorem 2.1] that

1f¯p(x¯,x¯)\displaystyle\partial_{1}\overline{f}_{p}(\bar{x},\bar{x}) =con{(Ap)μ¯p|μ¯pΛp(x¯)}={(Ap)μ¯p|μ¯pΛp(x¯)},\displaystyle=\operatorname{con}\left\{(A^{p})^{\top}\bar{\mu}^{p}\,\middle|\,\bar{\mu}^{p}\in\Lambda_{p}(\bar{x})\right\}=\left\{(A^{p})^{\top}\bar{\mu}^{p}\,\middle|\,\bar{\mu}^{p}\in\Lambda_{p}(\bar{x})\right\},
2(f¯p)(x¯,x¯)\displaystyle\partial_{2}(-\overline{f}_{p})(\bar{x},\bar{x}) =con{(Cp)y¯p|y¯pYp(x¯)}={(Cp)y¯p|y¯pYp(x¯)}.\displaystyle=\operatorname{con}\left\{-(C^{p})^{\top}\bar{y}^{p}\,\middle|\,\bar{y}^{p}\in Y_{p}(\bar{x})\right\}=\left\{-(C^{p})^{\top}\bar{y}^{p}\,\middle|\,\bar{y}^{p}\in Y_{p}(\bar{x})\right\}.

Combining these expressions with (22), we obtain

{0=θ(x¯)+1m1p=1m1[(Cp)y¯p+(Ap)μ¯p]+p=1m2μ¯m1+pϕp(x¯),cp+Cpx¯+Qpy¯p+(Bp)μ¯p=0,  0bpApx¯Bpy¯pμ¯p0,p=1,,m1,0ϕp(x¯)μ¯m1+p0,p=1,,m2,\left\{\begin{array}[]{ll}0=\nabla\theta(\bar{x})+\displaystyle\frac{1}{m_{1}}\sum\limits_{p=1}^{m_{1}}\left[(C^{p})^{\top}\bar{y}^{p}+(A^{p})^{\top}\bar{\mu}^{p}\right]+\sum_{p=1}^{m_{2}}\bar{\mu}^{m_{1}+p}\,\nabla\phi_{p}(\bar{x}),\\ c^{\,p}+C^{p}\bar{x}+Q^{p}\,\bar{y}^{p}+(B^{p})^{\top}\bar{\mu}^{p}=0,\;\,0\leq b^{p}-A^{p}\bar{x}-B^{p}\bar{y}^{p}\;\perp\;\bar{\mu}^{p}\geq 0,\quad p=1,\cdots,m_{1},\\[3.61371pt] 0\leq\phi_{p}(\bar{x})\;\perp\;\bar{\mu}^{\,m_{1}+p}\geq 0,\quad p=1,\cdots,m_{2},\end{array}\right.

which are the Karush-Kuhn-Tucker (KKT) conditions for the deterministic equivalent of (21).

4 A computational algorithm.

In this section, we consider a double-loop algorithm for solving problem (CP0). The inner loop finds an approximate stationary point of the perturbed composite optimization problem

minimizexnp=1m[Fpk(x)φp(fpk(x))]\displaystyle\operatornamewithlimits{minimize}_{x\in\mathbb{R}^{n}}\;\,\sum_{p=1}^{m}\left[F^{k}_{p}(x)\triangleq\varphi_{p}(f^{k}_{p}(x))\right] (23)

by solving a sequence of convex subproblems, while the outer loop drives k+k\to+\infty. It is important to note the potential infeasibility in (23) because [Fpk=φpfpk]eFp[F^{k}_{p}=\varphi_{p}\circ f^{k}_{p}]\overset{\text{e}}{\rightarrow}F_{p} in Assumption 1(c), together with dom(φpfp)\operatorname{dom}(\varphi_{p}\circ f_{p})\neq\emptyset, does not guarantee dom(φpfpk)\operatorname{dom}(\varphi_{p}\circ f^{k}_{p})\neq\emptyset for all k{k\in\mathbb{N}}. This can be seen from the example of φ(t)=δ(,0](t)\varphi(t)=\delta_{(-\infty,0]}(t), f(x)=max{x,0}1/10f(x)=\max\{x,0\}-1/10 and fk(x)=max{x,0}+1/k1/10f^{k}(x)=\max\{x,0\}+1/k-1/10. Obviously dom(φf)=(,1/10]\operatorname{dom}(\varphi\circ f)=(-\infty,1/10] and φfkeφf\varphi\circ f^{k}\overset{\text{e}}{\rightarrow}\varphi\circ f by [32, Theorem 2.4(d)], but we have dom(φfk)=\operatorname{dom}(\varphi\circ f^{k})=\emptyset for k=1,,9k=1,\cdots,9. Even though dom(φpfpk)\operatorname{dom}(\varphi_{p}\circ f^{k}_{p})\neq\emptyset for all k{k\in\mathbb{N}} and each pp, this does not imply the feasibility of convex subproblems used in the inner loop to approximate (23).

For simplicity of the analysis, we assume that in problem (CP0), φp\varphi_{p} is real-valued for p=1,,m1p=1,\cdots,m_{1}, and φp=δ(,0]\varphi_{p}=\delta_{(-\infty,0]} for p=m1+1,,mp=m_{1}+1,\cdots,m. Namely, the problem takes the following form:

minimizexnp=1m1[Fp(x)=φp(fp(x))]subject tofp(x)0,p=m1+1,,m.\displaystyle\operatornamewithlimits{minimize}_{x\in\mathbb{R}^{n}}\;\,\sum_{p=1}^{m_{1}}\left[F_{p}(x)=\varphi_{p}\big{(}f_{p}(x)\big{)}\right]\quad\mbox{subject to}\;\,f_{p}(x)\leq 0,\;\,p=m_{1}+1,\cdots,m. (CP1)

For p=1,,m1p=1,\cdots,m_{1}, the convexity of each real-valued function φp\varphi_{p} implies its continuity by [27, corollary 10.1.1]. Consequently, the composite function Fpk=φpfpkF^{k}_{p}=\varphi_{p}\circ f^{k}_{p} is also continuous for p=1,,m1p=1,\cdots,m_{1} and kk\in\mathbb{N} due to the continuity of each approximating function fpkf^{k}_{p}. It is important to note that model (CP1) still covers discontinuous objective functions since each fpf_{p} can be discontinuous, even though the approximating sequence {fpk}k\{f^{k}_{p}\}_{k\in\mathbb{N}} only consists of locally Lipschitz continuous functions.

4.1 Assumptions and examples.

Firstly, we make an assumption to address the feasibility issue outlined at the start of this section. For all kk\in\mathbb{N} and p=m1+1,,mp=m_{1}+1,\cdots,m, define

αpksupxXk[fpk+1(x)fpk(x)]+ with Xk{xn|fpk(x)0,p=m1+1,,m}.{\alpha^{k}_{p}\triangleq\sup_{x\in X^{k}}\left[f^{k+1}_{p}(x)-f^{k}_{p}(x)\right]_{+}}\text{ with }X^{k}\triangleq\left\{x\in\mathbb{R}^{n}\,\middle|\,f^{k}_{p}(x)\leq 0,\,p=m_{1}+1,\cdots,m\right\}.

Based on these auxiliary sequences, we need an initial point x0x^{0} that is strictly feasible to the constraints fp0(x)0f^{0}_{p}(x)\leq 0 for each p=m1+1,,mp=m_{1}+1,\cdots,m.

Assumption 2 (strict feasibility) There exist x0x^{0} and nonnegative sequences {αpk^}k\big{\{}\widehat{\alpha^{k}_{p}}\big{\}}_{k\in\mathbb{N}} for p=m1+1,,mp=m_{1}+1,\cdots,m, such that αpkαpk^\alpha^{k}_{p}\leq\widehat{\alpha^{k}_{p}} for all k{k\in\mathbb{N}} and k=0+αpk^<+,fp0(x0)k=0+αpk^,p=m1+1,,m.\sum_{{k^{\prime}}=0}^{+\infty}\widehat{\alpha^{{k^{\prime}}}_{p}}<+\infty,\quad f^{0}_{p}(x^{0})\leq-\sum_{{k^{\prime}}=0}^{+\infty}\widehat{\alpha^{{k^{\prime}}}_{p}},\qquad\,p=m_{1}+1,\cdots,m.

To streamline our notation and analysis, we extend the definitions of αpk\alpha^{k}_{p} and introduce αpk^\widehat{\alpha^{k}_{p}} for p=1,,m1p=1,\cdots,m_{1} by setting αpk=αpk^=0\alpha^{k}_{p}=\widehat{\alpha^{k}_{p}}=0 for all kk\in\mathbb{N} and p=1,,m1p=1,\cdots,m_{1}. Since the quantity αpk\alpha^{k}_{p} depends on the sequence {fpk}k\{f^{k}_{p}\}_{k\in\mathbb{N}}, Assumption 2 poses a condition for this approximating sequence. Consider a fixed index p{m1+1,,m}p\in\{m_{1}+1,\cdots,m\}. One can use the following way to construct {αpk}kN\{\alpha^{k}_{p}\}_{k\in N}. Suppose that there exist a function fp~:n×[0,1]\widetilde{f_{p}}:\mathbb{R}^{n}\times[0,1]\to\mathbb{R} and a nonnegative sequence γk0\gamma_{k}\downarrow 0 such that

fp~(x,γk)=fpk(x)andfp~(x,0)=fp(x).\widetilde{f_{p}}(x,\gamma_{k})=f^{k}_{p}(x)\quad\text{and}\quad\widetilde{f_{p}}(x,0)=f_{p}(x).

Additionally, assume that for any fixed xx, the function fp~(x,)\widetilde{f_{p}}(x,\cdot) is continuous on [0,1][0,1] and differentiable on (0,1)(0,1), and there exists a constant CpC_{p} such that |γfp~(x,γ)|Cp\big{|}\nabla_{\gamma}\widetilde{f_{p}}(x,\gamma)\big{|}\leq C_{p} for any xx and γ(0,1)\gamma\in(0,1). For any fixed xx, by the mean value theorem, there exists a point γ¯k(γk+1,γk)\bar{\gamma}_{k}\in(\gamma_{k+1},\gamma_{k}) such that fpk+1(x)fpk(x)=(γk+1γk)γfp~(x,γ¯k)f^{k+1}_{p}(x)-f^{k}_{p}(x)=(\gamma_{k+1}-\gamma_{k})\nabla_{\gamma}\widetilde{f_{p}}(x,\bar{\gamma}_{k}). Thus,

k=0+αpkk=0+(γkγk+1)supxn|γfp~(x,γ¯k)|k=0+[αpk^Cp(γkγk+1)]=Cpγ0<+.\sum^{+\infty}_{k^{\prime}=0}\alpha^{k^{\prime}}_{p}\leq\sum^{+\infty}_{k^{\prime}=0}(\gamma_{k^{\prime}}-\gamma_{k^{\prime}+1})\sup_{x\in\mathbb{R}^{n}}\left|\nabla_{\gamma}\widetilde{f_{p}}(x,\bar{\gamma}_{k^{\prime}})\right|\leq\sum^{+\infty}_{k^{\prime}=0}\left[\,\widehat{\alpha^{k^{\prime}}_{p}}\triangleq C_{p}(\gamma_{k^{\prime}}-\gamma_{k^{\prime}+1})\right]=C_{p}\gamma_{0}<+\infty.

Two more assumptions on the approximating sequences {fpk}k\{f_{p}^{k}\}_{k\in\mathbb{N}} are needed.

Assumption 3 (smoothness of gpkg_{p}^{k} or hpkh_{p}^{k}) For each k{k\in\mathbb{N}}, there exists k>0\ell_{k}>0 such that min{(gpk(x),gpk(x)),(hpk(x),hpk(x))}kxxx,xn,p=1,,m.\min\Big{\{}\,\mathbb{H}\big{(}\partial g^{k}_{p}(x),\partial g^{k}_{p}(x^{\prime})\big{)},\,\mathbb{H}\big{(}\partial h^{k}_{p}(x),\partial h^{k}_{p}(x^{\prime})\big{)}\,\Big{\}}\leq\ell_{k}\|x^{\prime}-x\|\quad\forall\,x,x^{\prime}\in\mathbb{R}^{n},\;p=1,\cdots,m. Assumption 4 (level-boundedness) For each k{k\in\mathbb{N}}, the function Hkp=1mFpk{H^{k}\triangleq}\sum_{p=1}^{m}F^{k}_{p} is level-bounded, i.e., for any rr\in\mathbb{R}, the set {xn|p=1m1φp(fpk(x))+p=m1+1mδ(,0](fpk(x))r}={xn|p=1m1φp(fpk(x))r}Xk{\left\{x\in\mathbb{R}^{n}\,\middle|\,\sum^{m_{1}}_{p=1}\varphi_{p}(f^{k}_{p}(x))+\sum^{m}_{p=m_{1}+1}\delta_{(-\infty,0]}(f^{k}_{p}(x))\leq r\right\}=}\left\{x\in\mathbb{R}^{n}\,\middle|\,\sum\limits_{p=1}^{m_{1}}\varphi_{p}\left(f^{k}_{p}(x)\right)\leq r\right\}\cap X^{k} is bounded.

Assumption 3 imposes conditions on the Lipschitz continuity of the subdifferential mapping gpk\partial g^{k}_{p} or hpk\partial h^{k}_{p}, which will be used to determine the termination rule of the inner loop. A straightforward sufficient condition for this assumption is that, for each pp and kk, at least one of the functions gpkg^{k}_{p} and hpkh^{k}_{p} is k\ell_{k}-smooth, i.e., gpk(x)gpk(x)kxx\|\nabla g^{k}_{p}(x)-\nabla g^{k}_{p}(x^{\prime})\|\leq\ell_{k}\|x-x^{\prime}\| or hpk(x)hpk(x)kxx\|\nabla h^{k}_{p}(x)-\nabla h^{k}_{p}(x^{\prime})\|\leq\ell_{k}\|x-x^{\prime}\| for any x,xnx,x^{\prime}\in\mathbb{R}^{n}. We also remark that Assumption 3 can hold even though both gpkg^{k}_{p} and hpkh^{k}_{p} are nondifferentiable. This can be seen from the following univariate example: gpk(x)=|x|g^{k}_{p}(x)=|x| and hpk(x)=|x1|h^{k}_{p}(x)=|x-1| for any xx\in\mathbb{R}. It is not difficult to verify that Assumption 3 holds for k=2\ell_{k}=2. Assumption 4 is a standard condition to ensure the boundedness of the generated sequences for each k{k\in\mathbb{N}}.

In addition, we need a technical assumption to ensure the boundedness of the multiplier sequences in our algorithm.

Assumption 5 (an asymptotic constraint qualification) For any x¯p=1mdomFp\bar{x}\in{\bigcap_{p=1}^{m}}\operatorname{dom}F_{p}, if there exists {yp}p=1m\{y_{p}\}_{p=1}^{m} satisfying 0=p=1mypvp0=\sum_{p=1}^{m}y_{p}\,v_{p} where for each pp (with the definition of Tp(x¯)T_{p}(\bar{x}) in (9)), (yp,vp)({𝒩domφp(tp)tpTp(x¯)}×conAfp(x¯))(×[Afp(x¯)\{0}]),(y_{p},\,v_{p})\in\Big{(}\bigcup\big{\{}\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\big{\}}\times\operatorname{con}\partial_{A}f_{p}(\bar{x})\Big{)}\cup\Big{(}\mathbb{R}\times\left[\,\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\,\right]\Big{)}, (24) then we must have y1==ym=0y_{1}=\cdots=y_{m}=0.

The normal cone 𝒩domφp(tp)\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p}) in (24) reduces to {0}\{0\} for p=1,,m1p=1,\cdots,m_{1} and 𝒩(,0](tp)\mathcal{N}_{(-\infty,0]}(t_{p}) for p=m1+1,,mp=m_{1}+1,\cdots,m. According to the definitions of Afp(x¯)\partial_{A}f_{p}(\bar{x}) and Afp(x¯)\partial^{\infty}_{A}f_{p}(\bar{x}), Assumption 5 depends on the approximating sequences {fpk}k\{f^{k}_{p}\}_{k\in\mathbb{N}} for p=1,,mp=1,\cdots,m. It holds trivially if each φp\varphi_{p} is real-valued and Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\}. By Theorem 1(b), the condition Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} holds when the ADC decompositions are constructed using the Moreau envelope, provided that fpf_{p} is locally Lipschitz continuous and bounded from below. However, in general, Assumption 5 is not easy to verify. For Example 3.1, the assumption translates into

[p=1m2λpϕp(x¯)=0,λp𝒩(,0](ϕp(x¯)),p=1,,m2]λ1==λm2=0.\left[\sum_{p=1}^{m_{2}}\lambda_{p}\nabla\phi_{p}(\bar{x})=0,\;\lambda_{p}\in\mathcal{N}_{(-\infty,0]}(\phi_{p}(\bar{x})),\;\;p=1,\cdots,m_{2}\right]\quad\Longrightarrow\quad\lambda_{1}=\cdots=\lambda_{m_{2}}=0.

This is equivalent to the Mangasarian-Fromovitz constraint qualification (MFCQ) for problem (21) by [30, Example 6.40]; see also [28].

Furthermore, if each fpf_{p} is c-ADC associated with {fpk=gpkhpk}k\{f^{k}_{p}=g^{k}_{p}-h^{k}_{p}\}_{k\in\mathbb{N}} such that conAfp(x¯)=Cfp(x¯)\operatorname{con}\partial_{A}f_{p}(\bar{x})=\partial_{C}f_{p}(\bar{x}), and Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\}, Assumption 5 states that

[ 0p=1mypCfp(x¯),yp𝒩domφp(fp(x¯)),p=1,,m]y1==ym=0.\left[\,0\in\sum_{p=1}^{m}y_{p}\,\partial_{C}f_{p}(\bar{x}),\quad y_{p}\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(\bar{x})),\;\;p=1,\cdots,m\,\right]\quad\Longrightarrow\quad y_{1}=\cdots=y_{m}=0.

This condition aligns with the constraint qualification for the composite optimization problem in [32, Proposition 2.1], and is stronger than the condition in the nonsmooth Lagrange multiplier rule [30, Exercise 10.52]. Finally, Assumption 5 implies the constraint qualifications (15)-(17) in Theorem 2. We formally present this conclusion in the following proposition. The proof of Proposition 4 is given in Appendix B.

Proposition 4 (consequences of Assumption 5).

Suppose that Assumptions 1 and 5 hold, and fpkefpf^{k}_{p}\overset{\text{e}}{\rightarrow}f_{p} for each pp. If supφp=+\sup\varphi_{p}=+\infty for pI1p\in I_{1}, and fpf_{p} is locally Lipschitz continuous for pI2p\in I_{2} (with the definitions of I1I_{1} and I2I_{2} in (8)), then conditions (15), (16), and (17) hold at any feasible point x¯\bar{x} of (CP1). Consequently, any local solution of (CP1) is a (weakly) A-stationary point of (CP1).

In the following, we use two examples to further illustrate Assumption 3 and the computation of {αpk^}k\{\widehat{\alpha^{k}_{p}}\}_{k\in\mathbb{N}} in Assumption 2.

Example 4.1 (icc constraints). Let fpf_{p} be real-valued and icc associated with f¯p\overline{f}_{p}, where f¯p(,x)\overline{f}_{p}(\cdot,x) is Lipschitz continuous with modulus LL for any xx. For the sequence {fpk}k\{f^{k}_{p}\}_{k\in\mathbb{N}} in Example 2.1, it follows from gpk(x)=x2/(2γk)g^{k}_{p}(x)=\|x\|^{2}/(2\gamma_{k}) that Assumption 3 holds for k=1/γk\ell_{k}=1/\gamma_{k}. To construct the quantities αpk^\widehat{\alpha^{k}_{p}} in Assumption 2, we notice that

αpksupxn[fpk+1(x)fpk(x)]+supxn[fp(x)fpk(x)]+γkL22αpk^k,\alpha^{k}_{p}\leq\sup_{x\in\mathbb{R}^{n}}\left[f^{k+1}_{p}(x)-f^{k}_{p}(x)\right]_{+}\,\leq\,\displaystyle\sup_{x\in\mathbb{R}^{n}}\left[f_{p}(x)-f^{k}_{p}(x)\right]_{+}\,\leq\,\frac{\gamma_{k}\,L^{2}}{2}\,\triangleq\,\widehat{\alpha^{k}_{p}}\qquad\forall\,{k\in\mathbb{N}}, (25)

where the second inequality is due to fpk+1(x)fp(x)f^{k+1}_{p}(x)\leq f_{p}(x) for any xx, and the last one uses the bound between the partial Moreau envelope and the original function [20, Lemma 3]. Thus, the sequence {αpk^}k\big{\{}\widehat{\alpha^{k}_{p}}\big{\}}_{k\in\mathbb{N}} satisfies k=0+αpk^<+\sum_{{k^{\prime}}=0}^{+\infty}\widehat{\alpha^{{k^{\prime}}}_{p}}<+\infty if {γk}\{\gamma_{k}\} is summable.

Alternatively, we can construct the quantities αpk^\widehat{\alpha^{k}_{p}} as follows. Let the partial Moreau envelope in (6) be the function fp~\widetilde{f_{p}} jointly defined for (x,γ)n×(0,1](x,\gamma)\in\mathbb{R}^{n}\times(0,1], and fp~(x,0)=fp(x)\widetilde{f_{p}}(x,0)=f_{p}(x) for any xx. We claim that fp~(x,)\widetilde{f_{p}}(x,\cdot) is continuous on [0,1][0,1] and differentiable on (0,1)(0,1) for any fixed xx. Continuity in γ\gamma can be simply checked by a standard argument [30, Theorem 1.17(c)], noting that the optimal value is achieved at a unique point as the function f¯p(,x)+x2/(2γ)\overline{f}_{p}(\cdot,x)+\|\cdot-x\|^{2}/(2\gamma) is strongly convex for any fixed xx. Differentiability follows from the Danskin’s Theorem [11, Theorem 2.1] that γfp~(x,γ)=zx2/(2γ2)\nabla_{\gamma}\widetilde{f_{p}}(x,\gamma)=-\|z-x\|^{2}/(2\gamma^{2}) with zz satisfying (xz)/γ1f¯(z,x)(x-z)/\gamma\in\partial_{1}\bar{f}(z,x) for any (x,γ)n×(0,1](x,\gamma)\in\mathbb{R}^{n}\times(0,1]. It then follows from the Lipschitz continuity of f¯p(,x)\overline{f}_{p}(\cdot,x) that |γfp~(x,γ)|L2/2Cp\big{|}\nabla_{\gamma}\widetilde{f_{p}}(x,\gamma)\big{|}\leq L^{2}/2\triangleq C_{p} for any (x,γ)n×(0,1](x,\gamma)\in\mathbb{R}^{n}\times(0,1]. Therefore, αpkCp(γkγk+1)αpk^\alpha^{k}_{p}\leq C_{p}(\gamma_{k}-\gamma_{k+1})\triangleq\widehat{\alpha^{k}_{p}} and k=0+αpk^=Cpγ0<+\sum^{+\infty}_{k^{\prime}=0}\widehat{\alpha^{k^{\prime}}_{p}}=C_{p}\gamma_{0}<+\infty for any sequence {fp~(,γk)}k\{\widetilde{f_{p}}(\cdot,\gamma_{k})\}_{k\in\mathbb{N}} defined by the partial Moreau envelope with γk0\gamma_{k}\downarrow 0.

Example 4.2 (VaR constraints for log-normal distributions). Consider fp(x)=VaRα[c(x,Z)]f_{p}(x)=\mbox{VaR}_{\alpha}[\,c(x,Z)] with c(x,Z)=exp(xZ)c(x,Z)=\exp(\,x^{\top}Z) for some random vector ZNormal(μ,Σ)Z\sim\mbox{Normal}(\mu,\Sigma), where Σ\Sigma is a positive definite covariance matrix. We have c(x,Z)Lognormal(xμ,xΣx)c(x,Z)\sim\mbox{Lognormal}\big{(}x^{\top}\mu,\sqrt{x^{\top}\Sigma x}\big{)}. The variable xx is restricted to a compact set XnX\subset\mathbb{R}^{n}. Denote the α\alpha-quantile of the standard normal distribution by qαq_{\alpha} and the cumulative distribution function of the standard normal distribution by Φ()\Phi(\cdot). By direct calculation (cf. [23, Section 3.2]), we have

VaRα[c(x,Z)]=exp(xμ+xΣxqα),CVaRα[c(x,Z)]=exp(xμ+xΣx2)Φ(xΣxqα)1α.\mbox{VaR}_{\alpha}[\,c(x,Z)]=\exp\left({x^{\top}\mu+\sqrt{x^{\top}\Sigma\,x}\,q_{\alpha}}\right),\;\mbox{CVaR}_{\alpha}[\,c(x,Z)]=\exp\left(x^{\top}\mu+\frac{x^{\top}\Sigma x}{2}\right)\frac{\Phi\big{(}\sqrt{x^{\top}\Sigma x}-q_{\alpha}\big{)}}{1-\alpha}.

Hence, fp(x)=VaRα[c(x,Z)]f_{p}(x)=\mbox{VaR}_{\alpha}[\,c(x,Z)] is neither convex nor concave if qα<0q_{\alpha}<0. For the sequence {fpk}k\{f^{k}_{p}\}_{k\in\mathbb{N}} in Example 2.2, we can derive that

hpk(x)=k(1α)CVaRα[c(x,Z)]=kexp(xμ+xΣx/2)Φ(xΣxqα).h^{k}_{p}(x)=k(1-\alpha)\,\mbox{CVaR}_{\alpha}[\,c(x,Z)]=k\,\exp\left(x^{\top}\mu+x^{\top}\Sigma x/2\right)\,\Phi\big{(}\sqrt{x^{\top}\Sigma x}-q_{\alpha}\big{)}.

Since Σ\Sigma is positive definite, it is easy to see that hpkh^{k}_{p} is twice continuously differentiable. Consequently, hpkh^{k}_{p} is k\ell_{k}-smooth relative to the compact set XX for some k\ell_{k}, and Assumption 3 holds (relative to XX). Next, we define fp~(x,γ)=1γαγαVaRt[c(x,Z)]dt\widetilde{f_{p}}(x,\gamma)=\frac{1}{\gamma}\int^{\alpha}_{\alpha-\gamma}\mbox{VaR}_{t}[\,c(x,Z)]\,\mbox{d}t for any (x,γ)n×(0,α2](x,\gamma)\in\mathbb{R}^{n}\times(0,\frac{\alpha}{2}] and fp~(x,0)=fp(x)\widetilde{f_{p}}(x,0)=f_{p}(x) for any xx. Obviously, fp~(x,)\widetilde{f_{p}}(x,\cdot) is continuous on [0,α2][0,\frac{\alpha}{2}] and differentiable on (0,α2)(0,\frac{\alpha}{2}) for any fixed xx. By using the Leibniz rule for differentiating the parametric integral, for γ(0,α2)\gamma\in(0,\frac{\alpha}{2}), we have

|γfp~(x,γ)|=1γ2αγα(VaRt[c(x,Z)]VaRαγ[c(x,Z)])dt1γ(VaRα[c(x,Z)]VaRαγ[c(x,Z)])=exp(xμ)exp(xΣxqα)exp(xΣxqαγ)γ=exp(xμ)[exp(xΣxqα)xΣxαqα]\begin{array}[]{rl}\left|\nabla_{\gamma}\widetilde{f_{p}}(x,\gamma)\right|=&\displaystyle\frac{1}{\gamma^{2}}\int^{\alpha}_{\alpha-\gamma}(\mbox{VaR}_{t}[\,c(x,Z)]-\mbox{VaR}_{\alpha-\gamma}[\,c(x,Z)])\,\mbox{d}t\\[9.39545pt] \leq&\displaystyle\frac{1}{\gamma}\left(\mbox{VaR}_{\alpha}[\,c(x,Z)]-\mbox{VaR}_{\alpha-\gamma}[\,c(x,Z)]\right)\\[7.22743pt] =&\displaystyle\exp({x^{\top}\mu})\;\frac{\exp\big{(}{\sqrt{x^{\top}\Sigma x}\,q_{\alpha}}\big{)}-\exp\big{(}{\sqrt{x^{\top}\Sigma x}\,q_{\alpha-\gamma}}\big{)}}{\gamma}\\[9.39545pt] =&\exp(x^{\top}\mu)\cdot\left[\exp\big{(}\sqrt{x^{\top}\Sigma x}\,q_{\alpha^{\prime}}\big{)}\sqrt{x^{\top}\Sigma x}\;\nabla_{\alpha}q_{\alpha^{\prime}}\right]\end{array}

for some α(αγ,α)\alpha^{\prime}\in(\alpha-\gamma,\alpha) by the mean-value theorem. By using the fact αqα=2πexp(qα2/2)\nabla_{\alpha}q_{\alpha}=\sqrt{2\pi}\,\exp(q_{\alpha}^{2}/2), the monotonicity qα/2<qα<qαq_{\alpha/2}<q_{\alpha^{\prime}}<q_{\alpha}, and the compactness of XX, we further have

supxX|γfp~(x,γ)|exp(max{qα2,qα/22}2)supxX{exp(xμ+xΣxqα)(2π)xΣx}Cp<+.{\sup_{x\in X}}\left|\nabla_{\gamma}\widetilde{f_{p}}(x,\gamma)\right|\leq\exp\left(\frac{\max\{q_{\alpha}^{2},q_{\,\alpha/2}^{2}\}}{2}\right)\sup_{x\in X}\left\{\exp\big{(}{x^{\top}\mu+\sqrt{x^{\top}\Sigma x}\,q_{\alpha}}\big{)}\sqrt{(2\pi)\,x^{\top}\Sigma x}\right\}\,\triangleq\,C_{p}<+\infty.

Therefore, αpkCp(γkγk+1)αpk^\alpha^{k}_{p}\leq C_{p}(\gamma_{k}-\gamma_{k+1})\triangleq\widehat{\alpha^{k}_{p}} and k=0αpk^=Cpγ0<+\sum^{\infty}_{k^{\prime}=0}\widehat{\alpha^{k}_{p}}=C_{p}\gamma_{0}<+\infty for any sequence {fp~(,γk)}k\{\widetilde{f_{p}}(\cdot,\gamma_{k})\}_{k\in\mathbb{N}} with γk0\gamma_{k}\downarrow 0.

4.2 The algorithmic framework and convergence analysis.

We now formalize the algorithm for solving (CP1). For p=m1+1,,mp=m_{1}+1,\cdots,m, recall the nonnegative sequences {αpk^}k\big{\{}\widehat{\alpha^{k}_{p}}\big{\}}_{k\in\mathbb{N}} introduced in Assumption 2, and observe that k=k+αpk^0\sum^{+\infty}_{k^{\prime}=k}\widehat{\alpha^{k^{\prime}}_{p}}\rightarrow 0 as k+k\to+\infty. For consistency of our notation, we also set αpk^0\widehat{\alpha^{k}_{p}}\equiv 0 for all k{k\in\mathbb{N}} and p=1,,m1p=1,\cdots,m_{1}. At the kk-th outer iteration and for p=1,,mp=1,\cdots,m, consider the upper and lower approximation of fpkf^{k}_{p} at a point yy by taking some apkhpk(y)a^{k}_{p}\in\partial h^{k}_{p}(y), bpkgpk(y)b^{k}_{p}\in\partial g^{k}_{p}(y) and incorporating sequences {αpk^}k\big{\{}\widehat{\alpha^{k}_{p}}\big{\}}_{k\in\mathbb{N}}:

fpk,upper(x;y)gpk(x)hpk(y)(apk)(xy)+k=k+αpk^,fpk,lower(x;y)gpk(y)+(bpk)(xy)hpk(x).\begin{array}[]{rll}f^{k,\text{upper}}_{p}(x;y)\triangleq&g^{k}_{p}(x)-h^{k}_{p}(y)-(a^{k}_{p})^{\top}(x-y)+\sum\limits^{+\infty}_{k^{\prime}=k}\widehat{\alpha^{k^{\prime}}_{p}},\\[7.22743pt] f^{k,\text{lower}}_{p}(x;y)\triangleq&g^{k}_{p}(y)+(b^{k}_{p})^{\top}(x-y)-h^{k}_{p}(x).\end{array} (26)

Observe that, for fixed yy, the upper approximation fpk,upper(;y)f^{k,\text{upper}}_{p}(\cdot\,;y) is convex while the lower approximation fpk,lower(;y)f^{k,\text{lower}}_{p}(\cdot\,;y) is concave. For p=1,,m1p=1,\cdots,m_{1}, consider the following function

Fpk^(x;y)φp(fpk,upper(x;y))+φp(fpk,lower(x;y)),\widehat{F^{k}_{p}}(x;y)\triangleq\varphi_{p}^{\uparrow}\left(f^{k,\text{upper}}_{p}(x;y)\right)+\varphi_{p}^{\downarrow}\left(f^{k,\text{lower}}_{p}(x;y)\right), (27)

which is a convex majorization of FpkF^{k}_{p} at a point yy by the fact that φp\varphi^{\uparrow}_{p} is nondecreasing and φp\varphi^{\downarrow}_{p} is nonincreasing. For p=m1+1,,mp=m_{1}+1,\cdots,m, consider the convex constraint fpk,upper(x;y)0f^{k,\text{upper}}_{p}(x;y)\leq 0 as an approximation for fpk(x)0f^{k}_{p}(x)\leq 0.

We summarize the properties of all the surrogate functions as follows. Note that (28a) and (28b) hold for p=1,,mp=1,\cdots,m, while (28c) holds only for p=1,,m1p=1,\cdots,m_{1}.

fpk,upper(x;y)fpk(x)+k=kαpk^\displaystyle f^{k,\text{upper}}_{p}(x;y)\geq f^{k}_{p}(x)+\sum^{\infty}_{k^{\prime}=k}\widehat{\alpha^{k^{\prime}}_{p}} fpk(x),fpk,upper(x;x)=fpk(x)+k=kαpk^,\displaystyle\geq f^{k}_{p}(x),\qquad f^{k,\text{upper}}_{p}(x;x)=f^{k}_{p}(x)+\sum^{\infty}_{k^{\prime}=k}\widehat{\alpha^{k^{\prime}}_{p}}, (28a)
fpk,lower(x;y)\displaystyle f^{k,\text{lower}}_{p}(x;y) fpk(x),fpk,lower(x;x)=fpk(x),\displaystyle\leq f^{k}_{p}(x),\qquad f^{k,\text{lower}}_{p}(x;x)=f^{k}_{p}(x), (28b)
Fpk^(x;y)\displaystyle\widehat{F^{k}_{p}}(x;y) Fpk(x),Fpk^(x;x)=Fpk(x).\displaystyle\geq F^{k}_{p}(x),\qquad\widehat{F^{k}_{p}}(x;x)=F^{k}_{p}(x). (28c)

The proposed method for solving problem (CP1) is outlined in Algorithm 1. The inner loop of the algorithm (indexed by ii) is terminated when the following conditions are satisfied:

{fpk,upper(xk,i+1;xk,i)fpk(xk,i+1)+k=k+αpk^+ϵk,p=1,,m,fpk,lower(xk,i+1;xk,i)fpk(xk,i+1)ϵk,pI2,xk,i+1xk,iδk/(λ+k),\left\{\begin{array}[]{rll}f^{k,\text{upper}}_{p}(x^{k,i+1};x^{k,i})&\leq f^{k}_{p}(x^{k,i+1})+\sum\limits^{+\infty}_{k^{\prime}=k}\widehat{\alpha^{k^{\prime}}_{p}}+\epsilon_{k},&\quad p=1,\cdots,m,\\[3.61371pt] f^{k,\text{lower}}_{p}(x^{k,i+1};x^{k,i})&\geq f^{k}_{p}(x^{k,i+1})-\epsilon_{k},&\quad p\in I_{2},\\[3.61371pt] \|x^{k,i+1}-x^{k,i}\|&\leq\delta_{k}/(\lambda+\ell_{k}),\end{array}\right. (29)
Algorithm 1 The prox-ADC method for solving (CP1)

Input: Given x0x^{0} and {αpk^}k\big{\{}\widehat{\alpha^{k}_{p}}\big{\}}_{k\in\mathbb{N}} satisfying Assumption 2. Let {k}\{\ell_{k}\} be a sequence satisfying Assumption 3. Choose λ>0\lambda>0, a positive sequence (ϵk,δk)0(\epsilon_{k},\delta_{k})\downarrow 0 such that δk/(λ+k)0\delta_{k}/(\lambda+\ell_{k})\downarrow 0. Set k=0k=0.

1:while a prescribed stopping criterion is not met do
2:     xk,0=xkx^{k,0}=x^{k}
3:     for i=0,1,i=0,1,\cdots do
4:         Take apk,igpk(xk,i)a^{k,i}_{p}\in\partial\,g^{k}_{p}(x^{k,i}) for p=1,,mp=1,\cdots,m and bpk,ihpk(xk,i)b^{k,i}_{p}\in\partial h^{k}_{p}(x^{k,i}) for p=1,,m1p=1,\cdots,m_{1}
5:         Solve the strongly convex subproblem:
xk,i+1=[argminxnp=1m1Fpk^(x;xk,i)+λ2xxk,i2subject tofpk,upper(x;xk,i)0,p=m1+1,,m]\hskip-10.84006ptx^{k,i+1}=\left[\begin{array}[]{cl}\displaystyle\operatornamewithlimits{argmin}_{x\in\mathbb{R}^{n}}&\;\;\sum\limits_{p=1}^{m_{1}}\widehat{F^{k}_{p}}(x;x^{k,i})+\frac{\lambda}{2}\|x-x^{k,i}\|^{2}\\[10.84006pt] \text{subject to}&\;\;f^{k,\text{upper}}_{p}(x;x^{k,i})\leq 0,\,p=m_{1}+1,\cdots,m\end{array}\right] (30)
6:         if the conditions (29) hold for λ,k,ϵk,δk\lambda,\ell_{k},\epsilon_{k},\delta_{k}, and k=k+αpk^\sum^{+\infty}_{k^{\prime}=k}\widehat{\alpha^{k^{\prime}}_{p}} then
7:              Break the for-loop
8:         else
9:              ii+1i\leftarrow i+1
10:         end if
11:     end for
12:     xk+1=xk,ix^{k+1}=x^{k,i}
13:     kk+1k\leftarrow k+1
14:end while
Refer to caption
(a) F1=φ1f1F_{1}=\varphi_{1}\circ f_{1} for a convex φ1\varphi_{1} and a smooth f1f_{1}.
Refer to caption
(b) F1=φ1f1F_{1}=\varphi_{1}\circ f_{1} for a convex nondecreasing φ1\varphi_{1} and a lsc f1f_{1}.
Figure 1: Illustrations of the prox-ADC method. (a): a comparison of the prox-ADC and the prox-linear method for minimizing an amenable function. (b): asymptotic approximations of a discontinuous composite function F1=φ1f1F_{1}=\varphi_{1}\circ f_{1} that are constructed by an epi-convergent sequence {F1k=φ1f1k}\{F^{k}_{1}=\varphi_{1}\circ f^{k}_{1}\}, and a convex majorization F1k^(;y)\widehat{F^{k}_{1}}(\cdot\,;y) for F1kF^{k}_{1}.

In contrast to the prox-linear algorithm that is designed to minimize amenable functions and adopts complete linearization of the inner maps, the prox-ADC method retains more curvature information inherent in these maps (see Figure 1). We emphasize that the prox-ADC method differs from [13, Algorithm 7.1.2] that is designed for solving a problem with a convex composite DC objective and DC constraints. Central to the prox-ADC method is the double-loop structure, where, in contrast to [13, Algorithm 7.1.2], the DC sequence fpkf^{k}_{p} is dynamically updated in the outer loop rather than remaining the same. This adaptation necessitates specialized termination criteria (29) and the incorporation of αpk^\widehat{\alpha^{k}_{p}} to maintain feasibility with each update of fpkf^{k}_{p}. In the following, we demonstrate the well-definedness of the prox-ADC method. Specifically, we establish that for each iteration kk, the criteria detailed in (29) are attainable in finitely many steps.

Theorem 3 (convergence of the inner loop).

Suppose that Assumptions 1-4 hold. Then the following statements hold.
(a) Problem (30) is feasible for any k,ik,i\in\mathbb{N}.
(b) The stopping rule of the inner loop is achievable in finitely many steps, i.e., the smallest integer ii satisfying conditions (29), denoted by iki_{k}, is finite for any k{k\in\mathbb{N}}.

Proof.

We prove (a) and (b) by induction. For k=0k=0, notice from Assumption 2 and (28a) that fp0,upper(x0;x0)=fp0(x0)+k=0+αpk^0f^{0,\text{upper}}_{p}(x^{0};x^{0})=f^{0}_{p}(x^{0})+\sum^{+\infty}_{{k^{\prime}}=0}\widehat{\alpha^{{k^{\prime}}}_{p}}\leq 0 for p=m1+1,,mp=m_{1}+1,\cdots,m. Thus, problem (30) is feasible for k=i=0k=i=0. Assume that (30) is feasible for k=0k=0 and some i=i¯()i=\bar{i}~{}({\in\mathbb{N}}). Consequently, x0,i¯+1x^{0,\bar{i}+1} is well-defined and for p=m1+1,,mp=m_{1}+1,\cdots,m,

fp0,upper(x0,i¯+1;x0,i¯+1)=(28a)fp0(x0,i¯+1)+k=0+αpk^(28a)fp0,upper(x0,i¯+1;x0,i¯)0,f^{0,\text{upper}}_{p}(x^{0,\bar{i}+1};x^{{0},\bar{i}+1})\overset{\eqref{eq:f_upper}}{=}f^{0}_{p}(x^{0,\bar{i}+1})+\sum\limits^{+\infty}_{{k^{\prime}}=0}\widehat{\alpha^{{k^{\prime}}}_{p}}\;\overset{\eqref{eq:f_upper}}{\leq}\;f^{0,\text{upper}}_{p}(x^{0,\bar{i}+1};x^{0,\bar{i}})\leq 0,

which yields the feasibility of (30) for k=0k=0, i=i¯+1i=\bar{i}+1. Hence, by induction, problem (30) is feasible for k=0k=0 and any ii\in\mathbb{N}. To proceed, recall the function HkH^{k} defined in Assumption 4. From the update of x0,i+1x^{0,i+1}, we have

H0(x0,i+1)=p=1m1Fp0(x0,i+1)(28c)p=1m1Fp0^(x0,i+1;x0,i)H0(x0,i)λ2x0,i+1x0,i2i.H^{0}(x^{0,i+1})=\sum_{p=1}^{m_{1}}F^{0}_{p}(x^{0,i+1})\,{\overset{\eqref{eq:F_hat}}{\leq}}\,\sum_{p=1}^{m_{1}}\widehat{F^{0}_{p}}(x^{0,i+1};x^{0,i})\leq H^{0}(x^{0,i})-\frac{\lambda}{2}\|x^{0,i+1}-x^{0,i}\|^{2}\quad\forall\,i{\in\mathbb{N}}. (31)

The last inequality follows from the definition of x0,i+1x^{0,i+1} and the second relation in (28c) that Fp0^(x0,i;x0,i)=Fp0(x0,i)\widehat{F^{0}_{p}}(x^{0,i};x^{0,i})=F^{0}_{p}(x^{0,i}) for p=1,,m1p=1,\cdots,m_{1}. Observe that H0H^{0} is bounded from below by the continuity of Fp0=φpfp0F^{0}_{p}=\varphi_{p}\circ f^{0}_{p} for p=1,,m1p=1,\cdots,m_{1} (see the discussion following model (CP1)) and the level-boundedness of H0H^{0}. Suppose for contradiction that the stopping rule of the inner loop is not achievable in finitely many steps. Then from (31), {H0(x0,i)}\left\{H^{0}(x^{0,i})\right\} converges and i=0x0,i+1x0,i2<+\sum_{i=0}^{\infty}\|x^{0,i+1}-x^{0,i}\|^{2}<+\infty. The latter further yields x0,i+1x0,i0\|x^{0,i+1}-x^{0,i}\|\rightarrow 0 and thus the last condition in (29) is achievable in finitely many iterations. Next, to derive a contradiction, it suffices to prove that the first two conditions in (29) can also be achieved in finitely many steps. We only show the first one since the other can be done with similar arguments. By the level-boundedness of H0H^{0}, the set S0{xH0(x)H0(x0,0)}S^{0}\triangleq\{x\mid H^{0}(x)\leq H^{0}(x^{0,0})\} is compact. Notice that x0,iS0x^{0,i}\in S^{0} for all ii\in\mathbb{N} due to (31). For p=1,,mp=1,\cdots,m, we then have

0fp0,upper(x0,i+1;x0,i)fp0(x0,i+1)k=0+αpk^=hp0(x0,i+1)hp0(x0,i)(ap0,i)(x0,i+1x0,i) 0,0\leq\;f^{0,\text{upper}}_{p}(x^{0,i+1};x^{0,i})-f^{0}_{p}(x^{0,i+1})-\sum^{+\infty}_{{k^{\prime}}=0}\widehat{\alpha^{{k^{\prime}}}_{p}}\;=\;h^{0}_{p}(x^{0,i+1})-h^{0}_{p}(x^{0,i})-(a^{0,i}_{p})^{\top}(x^{0,i+1}-x^{0,i})\;\longrightarrow\;0,

because hp0h^{0}_{p} is uniformly continuous on the compact set S0S^{0} and {ap0,i}i{hp0(x)xS0}\{a^{0,i}_{p}\}_{i\in\mathbb{N}}\subset\bigcup\left\{\partial h^{0}_{p}(x)\mid x\in S^{0}\right\} is bounded by [27, Theorem 24.7]. Therefore, for a fixed ϵ0>0\epsilon_{0}>0, there exists some i0i_{0} such that fp0,upper(x0,i0+1;x0,i0)fp0(x0,i0+1)+k=0+αpk^+ϵ0f^{0,\text{upper}}_{p}(x^{0,i_{0}+1};x^{0,i_{0}})\leq f^{0}_{p}(x^{0,i_{0}+1})+\sum^{+\infty}_{{k^{\prime}}=0}\widehat{\alpha^{{k^{\prime}}}_{p}}+\epsilon_{0} holds for p=1,,mp=1,\cdots,m. Thus, (a)-(b) hold for k=0k=0.

Now assume that (a)-(b) hold for some k=k¯()k=\bar{k}~{}({\in\mathbb{N}}) and, hence ik¯i_{\bar{k}} is finite. It then follows from xk¯+1,0=xk¯,ik¯Xk¯x^{\bar{k}+1,0}={x^{\bar{k},i_{\bar{k}}}}\in X^{\bar{k}} and fpk¯,upper(xk¯,ik¯;xk¯,ik¯)0f^{\bar{k},\text{upper}}_{p}({x^{\bar{k},i_{\bar{k}}}};x^{\bar{k},i_{\bar{k}}})\leq 0 that for each p=m1+1,,mp=m_{1}+1,\cdots,m,

fpk¯+1,upper(xk¯+1,0;xk¯+1,0)=(28a)fpk¯+1(xk¯+1,0)+k=k¯+1+αpk^fpk¯(xk¯+1,0)+supxXk¯[fpk¯+1(x)fpk¯(x)]++k=k¯+1+αpk^fpk¯(xk¯+1,0)+k=k¯+αpk^=(28a)fpk¯,upper(xk¯+1,0;xk¯,ik¯)0.\begin{array}[]{rl}&f^{\bar{k}+1,\text{upper}}_{p}(x^{\bar{k}+1,0};x^{\bar{k}+1,0}){\overset{\eqref{eq:f_upper}}{=}}f^{\bar{k}+1}_{p}(x^{\bar{k}+1,0})+\sum\limits^{+\infty}_{{k^{\prime}}=\bar{k}+1}\widehat{\alpha^{{k^{\prime}}}_{p}}\\[10.84006pt] \leq&f^{\bar{k}}_{p}(x^{\bar{k}+1,0})+\sup\limits_{x\in X^{\bar{k}}}\left[f^{\bar{k}+1}_{p}(x)-f^{\bar{k}}_{p}(x)\right]_{+}+\sum\limits^{+\infty}_{{k^{\prime}}=\bar{k}+1}\widehat{\alpha^{{k^{\prime}}}_{p}}\\[13.00806pt] \leq&f^{\bar{k}}_{p}(x^{\bar{k}+1,0})+\sum\limits^{+\infty}_{{k^{\prime}}=\bar{k}}\widehat{\alpha^{{k^{\prime}}}_{p}}\;\;{\overset{\eqref{eq:f_upper}}{=}}\;\;f^{\bar{k},\text{upper}}_{p}(x^{\bar{k}+1,0};x^{\bar{k},i_{\bar{k}}})\leq 0.\end{array}

Thus, problem (30) is feasible for k=k¯+1k=\bar{k}+1 and any i{i\in\mathbb{N}}. Building upon this, we can now clearly see the validity of (b) for k=k¯+1k=\bar{k}+1, as we have shown similar results earlier in the case of k=0k=0. By induction, we complete the proof of (a)-(b). ∎

For any kk\in\mathbb{N}, define the set of multipliers for problem (30) as

Yk(xk+1){(y1,1ky1,2kym,1kym,2k)|0p=1m[yp,1kfpk,upper(xk,ik+1;xk,ik)+yp,2kfpk,lower(xk,ik+1;xk,ik)]+λ(xk,ik+1xk,ik),yp,1kφp(fpk,upper(xk,ik+1;xk,ik)),p=1,,m,yp,2kφp(fpk,lower(xk,ik+1;xk,ik)),p=1,,m.}.Y^{k}(x^{k+1})\triangleq\left\{\left(\begin{array}[]{c}y^{k}_{1,1}\\ y^{k}_{1,2}\\ \vdots\\ y^{k}_{m,1}\\ y^{k}_{m,2}\end{array}\right)\middle|\begin{array}[]{rl}&0\in\sum\limits_{p=1}^{m}\Big{[}y^{k}_{p,1}\,\partial f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})+y^{k}_{p,2}\,\partial f^{k,\text{lower}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\Big{]}\\ &\qquad+\lambda(x^{k,i_{k}+1}-x^{k,i_{k}}),\\[3.61371pt] &y^{k}_{p,1}\in\partial\varphi^{\uparrow}_{p}(f_{p}^{k,\text{upper}}(x^{k,i_{k}+1};x^{k,i_{k}})),\,p=1,\cdots,m,\\[3.61371pt] &y^{k}_{p,2}\in\partial\varphi^{\downarrow}_{p}(f_{p}^{k,\text{lower}}(x^{k,i_{k}+1};x^{k,i_{k}})),\,p=1,\cdots,m.\end{array}\right\}.

Here xk,ik+1x^{k,i_{k}+1} is uniquely determined by xk+1=xk,ikx^{k+1}=x^{k,i_{k}} as the minimizer of a strongly convex problem (30). Notice that yp,2k=0y^{k}_{p,2}=0 for pI1p\in I_{1} since φp\varphi_{p} is nondecreasing and φp=0\varphi^{\downarrow}_{p}=0 for pI1p\in I_{1}. Let {xk+1}kN\{x^{k+1}\}_{k\in N} be a subsequence that converges to some point x¯\bar{x}. As we will see in the following lemma, the asymptotic constraint qualification in Assumption 5 implies the non-emptiness and the compactness of Yk(xk+1)Y^{k}(x^{k+1}) for all sufficiently large kNk\in N and the eventual boundedness of {Yk(xk+1)}kN\{Y^{k}(x^{k+1})\}_{k\in N}. These technical results play an important role in the convergence analysis of the prox-ADC method. However, a stronger property of equi-boundedness appears necessary for designing practical termination criteria for the algorithm. We will establish this strengthened property in section 4.3 under non-asymptotic constraint qualifications.

Lemma 2 (non-emptiness and eventual boundedness of multipliers).

Let x¯p=1mdomFp\bar{x}\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p} be a feasible point of problem (CP1). Suppose that Assumptions 1-5 hold. Consider any sequence {xk}\{x^{k}\} generated by the prox-ADC method, with a subsequence {xk+1}kN\{x^{k+1}\}_{k\in N} converging to x¯\bar{x}. The following statements hold.
(a) The set of multipliers Yk(xk+1)Y^{k}(x^{k+1}) is non-empty and compact for all sufficiently large kNk\in N.
(b) Additionally, if Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} for pI2p\in I_{2} (with the definition of I2I_{2} in (8)), then the sequence {Yk(xk+1)}kN\big{\{}Y^{k}(x^{k+1})\big{\}}_{k\in N} is eventually bounded.

Proof.

(a) Observe that xk,ik+1Nx¯x^{k,i_{k}+1}\to_{N}\bar{x} because xk+1=xk,ikNx¯x^{k+1}=x^{k,i_{k}}\to_{N}\bar{x} and xk,ikxk,ik+1δk/(λ+k)0\|x^{k,i_{k}}-x^{k,i_{k}+1}\|\leq\delta_{k}/(\lambda+\ell_{k})\downarrow 0 by conditions (29). The non-emptiness and compactness of Yk(xk+1)Y^{k}(x^{k+1}) for all sufficiently large kNk\in N is a direct consequence of the nonsmooth Lagrange multiplier rule [30, Exercise 10.52] for problem (30) if we can show that, for all sufficiently large kNk\in N, ym1+1k==ymk=0y^{k}_{m_{1}+1}=\cdots=y^{k}_{m}=0 is the unique solution of the following system

0p=m1+1mypkfpk,upper(xk,ik+1;xk,ik),ypk𝒩(,0](fpk,upper(xk,ik+1;xk,ik)),p=m1+1,,m.0\in\sum\limits_{p=m_{1}+1}^{m}y^{k}_{p}\,\partial f_{p}^{k,\text{upper}}(x^{k,i_{k}+1};x^{k,i_{k}}),\;\,y^{k}_{p}\in\mathcal{N}_{(-\infty,0]}(f_{p}^{k,\text{upper}}(x^{k,i_{k}+1};x^{k,i_{k}})),\;p=m_{1}+1,\cdots,m. (32)

Suppose that the above claim does not hold. Then, there exists a subsequence NNN^{\prime}\subset N such that ym1+1k==ymk=0y^{k}_{m_{1}+1}=\cdots=y^{k}_{m}=0 is not the unique solution of (32) for all kNk\in N^{\prime}. Without loss of generality, suppose N=NN^{\prime}=N and take {ypk}kN\{y^{k}_{p}\}_{k\in N} for p=m1+1,,mp=m_{1}+1,\cdots,m satisfying (32) and p=m1+1m|ypk|=1\sum_{p=m_{1}+1}^{m}|y^{k}_{p}|=1. For each pp and kNk\in N, define

Apk{ypkvpk|vpk{gpk(xk,ik)hpk(xk,ik)}{gpk(xk,ik+1)hpk(xk,ik+1)}}.A^{k}_{p}\triangleq\left\{y^{k}_{p}\,v^{k}_{p}\,\middle|\,v^{k}_{p}\in\left\{\partial g^{k}_{p}(x^{k,i_{k}})-\partial h^{k}_{p}(x^{k,i_{k}})\right\}\cup\left\{\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}+1}{)}\right\}\right\}.

Then, for all kNk\in N, we have

dist(0,p=m1+1mApk)(i)dist(0,p=m1+1mypk[gpk(xk,ik+1)hpk(xk,ik)])+p=m1+1m𝔻(ypk[gpk(xk,ik+1)hpk(xk,ik)],Apk)(ii)dist(0,p=m1+1mypkfpk,upper(xk,ik+1;xk,ik))+p=m1+1m|ypk|min{𝔻(gpk(xk,ik+1),gpk(xk,ik)),𝔻(hpk(xk,ik),hpk(xk,ik+1)}(iii)0+p=m1+1m|ypk|min{(gpk(xk,ik+1),gpk(xk,ik)),(hpk(xk,ik+1),hpk(xk,ik))}(iv)p=m1+1m|ypk|kxk,ik+1xk,ik(v)δk,\begin{array}[]{rl}&\operatorname{dist}\left(0,\sum\limits_{p=m_{1}+1}^{m}A^{k}_{p}\right)\\[10.84006pt] \overset{(\rm i)}{\leq}&\operatorname{dist}\left(0,\sum\limits_{p=m_{1}+1}^{m}y^{k}_{p}\left[\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}})\right]\right)+\sum\limits_{p=m_{1}+1}^{m}\mathbb{D}\left(y^{k}_{p}\left[\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}})\right],A^{k}_{p}\,\right)\\[10.84006pt] {\overset{(\rm ii)}{\leq}}&{\operatorname{dist}\left(0,\sum\limits_{p=m_{1}+1}^{m}y^{k}_{p}\,\partial f_{p}^{k,\text{upper}}(x^{k,i_{k}+1};x^{k,i_{k}})\right)}\\[8.67204pt] &\qquad\quad{+\sum\limits_{p=m_{1}+1}^{m}|y^{k}_{p}|\cdot\min\Big{\{}\mathbb{D}\left(\,\partial g^{k}_{p}(x^{k,i_{k}+1}),\,\partial g^{k}_{p}(x^{k,i_{k}})\,\right),\,\mathbb{D}\left(\,\partial h^{k}_{p}(x^{k,i_{k}}),\partial h^{k}_{p}(x^{k,i_{k}+1}\right)\Big{\}}}\\ \overset{(\rm iii)}{\leq}&0+\sum\limits_{p=m_{1}+1}^{m}|y^{k}_{p}|\cdot\min\Big{\{}\mathbb{H}(\partial g^{k}_{p}(x^{k,i_{k}+1}),\partial g^{k}_{p}(x^{k,i_{k}})),\,\mathbb{H}(\partial h^{k}_{p}(x^{k,i_{k}+1}),\partial h^{k}_{p}(x^{k,i_{k}}))\Big{\}}\\[7.22743pt] \overset{(\rm iv)}{\leq}&\sum\limits_{p=m_{1}+1}^{m}|y^{k}_{p}|\cdot\ell_{k}\,\|x^{k,i_{k}+1}-x^{k,i_{k}}\|\;\overset{(\rm v)}{\leq}\;\delta_{k},\end{array}

where (i)(\rm i) uses the inequalities 𝔻(A,C)𝔻(A,B)+𝔻(B,C)\mathbb{D}(A,C)\leq\mathbb{D}(A,B)+\mathbb{D}(B,C) and 𝔻(A+B,A+B)𝔻(A,A)+𝔻(B,B)\mathbb{D}(A+B,A^{\prime}+B^{\prime})\leq\mathbb{D}(A,A^{\prime})+\mathbb{D}(B,B^{\prime}); the first term in (ii)(\rm ii) is because of the construction of upper convex majorization (26); the second term in (ii)(\rm ii) is due to 𝔻(A,BC)min{𝔻(A,B),𝔻(A,C)}\mathbb{D}(A,B\cup C)\leq\min\{\mathbb{D}(A,B),\mathbb{D}(A,C)\} so that

𝔻(ypk[gpk(xk,ik+1)hpk(xk,ik)],Apk)=|ypk|𝔻(gpk(xk,ik+1)hpk(xk,ik),{gpk(xk,ik)hpk(xk,ik)}{gpk(xk,ik+1)hpk(xk,ik+1)})|ypk|min{𝔻(gpk(xk,ik+1),gpk(xk,ik)),𝔻(hpk(xk,ik),hpk(xk,ik+1)}.\begin{array}[]{rl}&\mathbb{D}\left(y^{k}_{p}\left[\,\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}})\right],\,A^{k}_{p}\,\right)\\[5.78172pt] =&|y^{k}_{p}|\cdot\mathbb{D}\Big{(}\,\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}}),\,\left\{\partial g^{k}_{p}(x^{k,i_{k}})-\partial h^{k}_{p}(x^{k,i_{k}})\right\}\cup\left\{\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}+1})\right\}\Big{)}\\[7.22743pt] \leq&|y^{k}_{p}|\cdot\min\Big{\{}\mathbb{D}\left(\,\partial g^{k}_{p}(x^{k,i_{k}+1}),\,\partial g^{k}_{p}(x^{k,i_{k}})\,\right),\,\mathbb{D}\left(\,\partial h^{k}_{p}(x^{k,i_{k}}),\partial h^{k}_{p}(x^{k,i_{k}+1}\right)\Big{\}}.\end{array}

Inequality (iii)(\rm iii) is due to (32) and 𝔻(A,B)(A,B)\mathbb{D}(A,B)\leq\mathbb{H}(A,B); (iv)(\rm iv) is by Assumption 3; and (v)(\rm v) is implied by conditions (29) and p=m1+1m|ypk|=1\sum_{p=m_{1}+1}^{m}|y^{k}_{p}|=1. Equivalently, for all kNk\in N and p=m1+1,,mp=m_{1}+1,\cdots,m, there exist ypk𝒩(,0](fpk,upper(xk,ik+1;xk,ik))y^{k}_{p}\in\mathcal{N}_{(-\infty,0]}\left(f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\right) with p=m1+1m|ypk|=1\sum_{p=m_{1}+1}^{m}|y^{k}_{p}|=1 and

vpk{gpk(xk,ik)hpk(xk,ik)}{gpk(xk,ik+1)hpk(xk,ik+1)}v^{k}_{p}\in\left\{\partial g^{k}_{p}(x^{k,i_{k}})-\partial h^{k}_{p}(x^{k,i_{k}})\right\}\cup\left\{\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}+1})\right\}

such that p=m1+1mypkvpkδk\|\sum_{p=m_{1}+1}^{m}y^{k}_{p}\,v^{k}_{p}\|\leq\delta_{k}. For p=m1+1,,mp=m_{1}+1,\cdots,m, since the subsequence {fpk(xk,ik+1)}kN\{f^{k}_{p}(x^{k,i_{k}+1})\}_{k\in N} is bounded by Assumption 1(b), we can assume without loss of generality that fpk(xk,ik+1)f^{k}_{p}(x^{k,i_{k}+1}) converges to some z¯pTp(x¯)\bar{z}_{p}\in T_{p}(\bar{x}) as k(N)+k(\in N)\to+\infty. Furthermore, it can be easily seen from (28a) and (29) that fpk,upper(xk,ik+1;xk,ik)f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}}) converges to the same limit point z¯p\bar{z}_{p} for p=m1+1,,mp=m_{1}+1,\cdots,m. Notice that fpk,upper(xk,ik+1;xi,ik)0f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{i,i_{k}})\leq 0 for all kNk\in N and p=m1+1,,mp=m_{1}+1,\cdots,m from Theorem 3(a) and, thus, each z¯p\bar{z}_{p} must satisfy z¯p0\bar{z}_{p}\leq 0. Suppose that ypkNy¯py^{k}_{p}\rightarrow_{N}\bar{y}_{p} for each pp. Then, by the outer semicontinuity of the normal cone [30, Proposition 6.6],

y¯p𝒩(,0](z¯p){𝒩domφp(tp)tpTp(x¯)},p=m1+1,,m.\bar{y}_{p}\in\mathcal{N}_{(-\infty,0]}(\bar{z}_{p})\subset\bigcup\big{\{}\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\big{\}},\quad p=m_{1}+1,\cdots,m.

Obviously, p=m1+1m|y¯p|=1\sum_{p=m_{1}+1}^{m}|\bar{y}_{p}|=1, and the sequence {y¯p}p=m1+1m\{\bar{y}_{p}\}^{m}_{p=m_{1}+1} has at least one nonzero element. Consider two cases.

Case 1. If {vpk}kN\{v^{k}_{p}\}_{k\in N} is bounded for p=m1+1,,mp=m_{1}+1,\cdots,m, then there are vectors {v¯p}p=m1+1m\{\bar{v}_{p}\}^{m}_{p=m_{1}+1} with v¯pAfp(x¯)\bar{v}_{p}\in\partial_{A}f_{p}(\bar{x}) such that vpkNv¯pv^{k}_{p}\rightarrow_{N}\bar{v}_{p} and 0=p=m1+1my¯pv¯pp=m1+1my¯pAfp(x¯)0=\sum_{p=m_{1}+1}^{m}\bar{y}_{p}\,\bar{v}_{p}\in\sum_{p=m_{1}+1}^{m}\bar{y}_{p}\,\partial_{A}f_{p}(\bar{x}), contradicting Assumption 5 since y¯m1+1,,y¯m\bar{y}_{m_{1}+1},\cdots,\bar{y}_{m} are not all zeros.

Case 2. Otherwise, there exists some pp such that {vpk}kN\{v^{k}_{p}\}_{k\in N} is unbounded. Define the index sets

Iub{p{m1+1,,m}|{vpk}kN unbounded}()andIb{m1+1,,m}\Iub.I_{\text{ub}}\triangleq\left\{\,p\in\{m_{1}+1,\cdots,m\}\,\middle|\,\{v^{k}_{p}\}_{k\in N}\text{ unbounded}\,\right\}\,(\neq\emptyset)\quad\mbox{and}\quad I_{\text{b}}\triangleq\{m_{1}+1,\cdots,m\}\backslash I_{\text{ub}}.

Notice that {pIbypkvpk}kN\big{\{}\sum_{p\in I_{\text{b}}}y^{k}_{p}\,v^{k}_{p}\big{\}}_{k\in N} is bounded. Without loss of generality, assume that this sequence converges to some w¯\bar{w} and, thus, pIubypkvpkN(w¯)\sum_{p\in I_{\text{ub}}}y^{k}_{p}\,v^{k}_{p}\rightarrow_{N}(-\bar{w}).

Step 1: Next we prove by contradiction that, for each pIubp\in I_{\text{ub}}, the sequence {ypkvpk}kN\{y^{k}_{p}\,v^{k}_{p}\}_{k\in N} is bounded. Suppose that the boundedness fails and pIubypkvpkN+\sum_{p\in I_{\text{ub}}}\|y^{k}_{p}\,v^{k}_{p}\|\rightarrow_{N}+\infty by passing to a subsequence. Consider w~pkypkvpk/pIubypkvpk\widetilde{w}^{k}_{p}\triangleq y^{k}_{p}\,v^{k}_{p}/\sum_{p\in I_{\text{ub}}}\|y^{k}_{p}\,v^{k}_{p}\| for pIubp\in I_{\text{ub}}. Then pIubw~pkN0\sum_{p\in I_{\text{ub}}}\widetilde{w}^{k}_{p}\rightarrow_{N}0. Since pIubw~pk=1\sum_{p\in I_{\text{ub}}}\|\widetilde{w}^{k}_{p}\|=1 for all kNk\in N, we can assume that there exist p1Iubp_{1}\in I_{\text{ub}} and w~p10\widetilde{w}_{p_{1}}\neq 0 such that w~p1kNw~p1\widetilde{w}^{k}_{p_{1}}\rightarrow_{N}\widetilde{w}_{p_{1}}. It then follows from the construction of w~pk\widetilde{w}^{k}_{p} that {w~pk}kN\{\widetilde{w}^{k}_{p}\}_{k\in N} has a subsequence converging to some element of ±Afp(x¯)\pm\partial^{\infty}_{A}f_{p}(\bar{x}) for each pIubp\in I_{\text{ub}} and, in particular, w~p1[±Afp1(x¯)\{0}]\widetilde{w}_{p_{1}}\in\big{[}\pm\partial^{\infty}_{A}f_{p_{1}}(\bar{x})\backslash\{0\}\big{]}. From pIubw~pkN0\sum_{p\in I_{\text{ub}}}\widetilde{w}^{k}_{p}\rightarrow_{N}0, we obtain

0[±Afp1(x¯)\{0}]+pIub\{p1}[±Afp(x¯)],0\in\left[\,\pm\partial^{\infty}_{A}f_{p_{1}}(\bar{x})\backslash\{0\}\,\right]+\sum_{p\in I_{\text{ub}}\backslash\{p_{1}\}}\left[\,\pm\partial^{\infty}_{A}f_{p}(\bar{x})\,\right],

a contradiction to Assumption 5 since the coefficient of the term [±Afp1(x¯)\{0}][\,\pm\partial^{\infty}_{A}f_{p_{1}}(\bar{x})\backslash\{0\}\,] is nonzero. So far, we have shown the boundedness of {ypkvpk}kN\{y^{k}_{p}\,v^{k}_{p}\}_{k\in N} for each pIubp\in I_{\text{ub}}.

Step 2: Now suppose that ypkvpkNw¯py^{k}_{p}\,v^{k}_{p}\rightarrow_{N}\bar{w}_{p} for each pIubp\in I_{\text{ub}} with pIubw¯p=w¯\sum_{p\in I_{\text{ub}}}\bar{w}_{p}=-\bar{w}. Thus ypkN0y^{k}_{p}\rightarrow_{N}0 and w¯p[±Afp(x¯)]\bar{w}_{p}\in\left[\,\pm\partial^{\infty}_{A}f_{p}(\bar{x})\,\right] for each pIubp\in I_{\text{ub}}. Since p=m1+1m|y¯p|=pIb|y¯p|=1\sum_{p=m_{1}+1}^{m}|\bar{y}_{p}|=\sum_{p\in I_{\text{b}}}|\bar{y}_{p}|=1, there exists p2Ibp_{2}\in I_{\text{b}} such that y¯p20\bar{y}_{p_{2}}\neq 0. Then p=m1+1mypkvpkN0\sum_{p=m_{1}+1}^{m}y^{k}_{p}\,v^{k}_{p}\rightarrow_{N}0 implies

0y¯p2Afp2(x¯)+pIb\{p2}y¯pAfp(x¯)+pIub[±Afp(x¯)],0\in\bar{y}_{p_{2}}\,\partial_{A}f_{p_{2}}(\bar{x})+\sum_{p\in I_{\text{b}}\backslash\{p_{2}\}}\bar{y}_{p}\,\partial_{A}f_{p}(\bar{x})+\sum_{p\in I_{\text{ub}}}\left[\,\pm\partial^{\infty}_{A}f_{p}(\bar{x})\,\right],

which leads to a contradiction to Assumption 5. Thus, Yk(xk+1)Y^{k}(x^{k+1}) is non-empty and compact for all sufficiently large kNk\in N.

(b) By part (a), assume from now on that Yk(xk+1)Y^{k}(x^{k+1})\neq\emptyset for all kNk\in N without loss of generality. We also assume that {fpk(xk,ik+1)}kN\{f^{k}_{p}(x^{k,i_{k}+1})\}_{k\in N} converges to some point z¯pTp(x¯)\bar{z}_{p}\in T_{p}(\bar{x}) for p=1,,mp=1,\cdots,m. Then, by (28a), (28b) and (29), fpk,upper(xk,ik+1;xk,ik)Nz¯pf^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\to_{N}\bar{z}_{p} for p=1,,mp=1,\cdots,m and fpk,lower(xk,ik+1;xk,ik)Nz¯pf^{k,\text{lower}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\to_{N}\bar{z}_{p} for pI2p\in I_{2}. For any kNk\in N and any (y1,1k,y1,2k,,ym,1k,ym,2k)Yk(xk+1)(y^{k}_{1,1},y^{k}_{1,2},\cdots,y^{k}_{m,1},y^{k}_{m,2})\in Y^{k}(x^{k+1}), we have yp,1kφp(fpk,upper(xk,ik+1;xk,ik))y^{k}_{p,1}\in\partial\varphi^{\uparrow}_{p}(f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})) and yp,2kφp(fpk,lower(xk,ik+1;xk,ik))y^{k}_{p,2}\in\partial\varphi^{\downarrow}_{p}(f^{k,\text{lower}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})) for p=1,,mp=1,\cdots,m satisfying

0p=1m[yp,1k[gpk(xk,ik+1)hpk(xk,ik)]+yp,2k[gpk(xk,ik)hpk(xk,ik+1)]]+λ(xk,ik+1xk,ik).0\in\sum\limits_{p=1}^{m}\Big{[}\,y^{k}_{p,1}\big{[}\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial\,h^{k}_{p}(x^{k,i_{k}})\big{]}+y^{k}_{p,2}\big{[}\partial g^{k}_{p}(x^{k,i_{k}})-\partial\,h^{k}_{p}(x^{k,i_{k}+1})\big{]}\Big{]}+\lambda(x^{k,i_{k}+1}-x^{k,i_{k}}). (33)

Due to Assumption 3 and similar arguments in the proof of part (a), the optimality condition (33) implies that

{p=1m(yp,1kvp,1k+yp,2kvp,2k)[λ+p=1m(|yp,1k|+|yp,2k|)k]δkλ+kmax{1,p=1m(|yp,1k|+|yp,2k|)}δk,[vp,1kgpk(xk,ik)hpk(xk,ik)vp,2kgpk(xk,ik+1)hpk(xk,ik+1)]or[vp,1kgpk(xk,ik+1)hpk(xk,ik+1)vp,2kgpk(xk,ik)hpk(xk,ik)],p=1,,m.\left\{\begin{array}[]{l}\displaystyle\left\|\sum_{p=1}^{m}\big{(}y^{k}_{p,1}\,v^{k}_{p,1}+y^{k}_{p,2}\,v^{k}_{p,2}\big{)}\right\|{\leq\left[\lambda+\sum\limits_{p=1}^{m}\big{(}|y^{k}_{p,1}|+|y^{k}_{p,2}|\big{)}\ell_{k}\right]\frac{\delta_{k}}{\lambda+\ell_{k}}\leq\max\left\{1,\sum\limits_{p=1}^{m}\big{(}|y^{k}_{p,1}|+|y^{k}_{p,2}|\big{)}\right\}\delta_{k}},\\[14.45377pt] \left[\begin{array}[]{cc}v^{k}_{p,1}\in\partial g^{k}_{p}(x^{k,i_{k}})-\partial h^{k}_{p}(x^{k,i_{k}})\\[7.22743pt] v^{k}_{p,2}\in\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}+1})\end{array}\right]\,\text{or}\,\left[\begin{array}[]{cc}v^{k}_{p,1}\in\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}+1})\\[7.22743pt] v^{k}_{p,2}\in\partial g^{k}_{p}(x^{k,i_{k}})-\partial h^{k}_{p}(x^{k,i_{k}})\end{array}\right],\,p=1,\cdots,m.\end{array}\right. (34)

Note that, for pI1p\in I_{1}, φp\varphi_{p} is nondecreasing, i.e., φp=0\varphi^{\downarrow}_{p}=0. Then yp,2k=0{y^{k}_{p,2}}=0 for all kNk\in N and pI1p\in{I_{1}}, and the first inequality of (34) is equivalent to

pI1yp,1kvp,1k+pI2(yp,1kvp,1k+yp,2kvp,2k)max{1,pI1|yp,1k|+pI2(|yp,1k|+|yp,2k|)}δk.\left\|\sum\limits_{p\in I_{1}}y^{k}_{p,1}\,v^{k}_{p,1}+\sum\limits_{p\in I_{2}}\big{(}y^{k}_{p,1}\,v^{k}_{p,1}+y^{k}_{p,2}\,v^{k}_{p,2}\big{)}\right\|\leq{\max\left\{1,\sum\limits_{p\in I_{1}}|y^{k}_{p,1}|+\sum\limits_{p\in I_{2}}\big{(}|y^{k}_{p,1}|+|y^{k}_{p,2}|\big{)}\right\}\delta_{k}}. (35)

Observe that the sequences {vp,1k}kN\{v^{k}_{p,1}\}_{k\in N} and {vp,2k}kN\{v^{k}_{p,2}\}_{k\in N} must be bounded for pI2p\in I_{2}. Otherwise, we could assume vp,1kN+\|v^{k}_{p,1}\|\rightarrow_{N}+\infty. Then every accumulation point of the unit vectors {vp,1k/vp,1k}kN\{v^{k}_{p,1}/\|v^{k}_{p,1}\|\}_{k\in N} would be in the set Afp(x¯)\partial^{\infty}_{A}f_{p}(\bar{x}), contradicting our assumption that Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} for each pI2p\in I_{2}.

For pI2{1,,m1}p\in I_{2}\subset\{1,\cdots,m_{1}\}, given that φp\varphi^{\uparrow}_{p} is convex, real-valued, and fpk,upper(xk,ik+1;xk,ik)Nz¯pf^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\rightarrow_{N}\bar{z}_{p}, we can invoke [27, Theorem 24.7] to deduce the boundedness of {yp,1k}kN\{y^{k}_{p,1}\}_{k\in N}. A parallel reasoning applies to demonstrate the boundedness of {yp,2k}kN\{y^{k}_{p,2}\}_{k\in N}.

For pI1p\in I_{1}, we proceed by contradiction to establish the boundedness of {yp,1k}kN\{y^{k}_{p,1}\}_{k\in N} based on Assumption 5. Suppose that {pI1|yp,1k|}kN\big{\{}\sum_{p\in I_{1}}|y^{k}_{p,1}|\big{\}}_{k\in N} is unbounded and pI1|yp,1k|N+\sum_{p\in I_{1}}|y^{k}_{p,1}|\rightarrow_{N}+\infty by passing to a subsequence. Consider the normalized subsequences {y~p,1kyp,1k/pI1|yp,1k|}kN\big{\{}\widetilde{y}^{k}_{p,1}\triangleq y^{k}_{p,1}/\sum_{{p^{\prime}}\in I_{1}}|y^{k}_{{p^{\prime}},1}|\big{\}}_{k\in N} and {y~p,2kyp,2k/pI1|yp,1k|}kN\big{\{}\widetilde{y}^{k}_{p,2}\triangleq y^{k}_{p,2}/\sum_{p^{\prime}\in I_{1}}|y^{k}_{p^{\prime},1}|\big{\}}_{k\in N} for each pp. Consequently, y~p,1kN0\widetilde{y}^{k}_{p,1}\rightarrow_{N}0 and y~p,2kN0\widetilde{y}^{k}_{p,2}\rightarrow_{N}0 for pI2p\in I_{2}. By the triangle inequality and (35), we have

|pI1y~p,1kvp,1kpI2(y~p,1kvp,1k+y~p,2kvp,2k)|pI1y~p,1kvp,1k+pI2(y~p,1kvp,1k+y~p,2kvp,2k)max{1pI1|yp,1k|,1+pI2(|yp,1k|+|yp,2k|)pI1|yp,1k|}δkN 0,\begin{array}[]{rl}&\left|\;\displaystyle\left\|\sum_{p\in I_{1}}\widetilde{y}^{k}_{p,1}\,v^{k}_{p,1}\right\|-\left\|\sum_{p\in I_{2}}\big{(}\widetilde{y}^{k}_{p,1}\,v^{k}_{p,1}+\widetilde{y}^{k}_{p,2}\,v^{k}_{p,2}\big{)}\right\|\;\right|\,\displaystyle\leq\,\left\|\sum_{p\in I_{1}}\widetilde{y}^{k}_{p,1}\,v^{k}_{p,1}+\sum_{p\in I_{2}}\big{(}\widetilde{y}^{k}_{p,1}\,v^{k}_{p,1}+\widetilde{y}^{k}_{p,2}\,v^{k}_{p,2}\big{)}\right\|\\[15.89948pt] \leq&{\displaystyle\max\left\{\frac{1}{\sum_{p\in I_{1}}|y^{k}_{p,1}|},1+\frac{\sum_{p\in I_{2}}\big{(}|y^{k}_{p,1}|+|y^{k}_{p,2}|\big{)}}{\sum_{p\in I_{1}}|y^{k}_{p,1}|}\right\}\delta_{k}}\;\longrightarrow_{N}\;0,\end{array}

which further implies pI1y~p,1kvp,1kN0\big{\|}\sum_{p\in I_{1}}\widetilde{y}^{k}_{p,1}\,v^{k}_{p,1}\big{\|}\rightarrow_{N}0 by the boundedness of {vp,1k}kN\{v^{k}_{p,1}\}_{k\in N} and {vp,2k}kN\{v^{k}_{p,2}\}_{k\in N} for pI2p\in I_{2}. Now suppose that y~p,1kNy~p,1\widetilde{y}^{k}_{p,1}\rightarrow_{N}\widetilde{y}_{p,1} for pI1p\in I_{1}. Then from a similar reasoning in (20), for pI1p\in I_{1},

y~p,1Limsupk(N)+φp(fpk,upper(xk,ik+1;xk,ik))φp(z¯p)=𝒩domφp(z¯p),\widetilde{y}_{p,1}\in{\displaystyle\operatornamewithlimits{Lim\,sup}_{k(\in N)\rightarrow+\infty}}^{\infty}\;\partial\varphi^{\uparrow}_{p}\left(f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\right)\subset\partial^{\infty}\varphi^{\uparrow}_{p}(\bar{z}_{p})=\mathcal{N}_{\operatorname{dom}\varphi^{\uparrow}_{p}}(\bar{z}_{p}),

and obviously pI1|y~p,1|=1\sum_{p\in I_{1}}|\widetilde{y}_{p,1}|=1. The remaining argument to derive a contradiction to Assumption 5 follows the same steps as the proof of part (a) for the two cases, with the exception that the index set {m1+1,,m}\{m_{1}+1,\cdots,m\} is replaced by I1I_{1}. Thus, the sequences {yp,1k}kN\{y^{k}_{p,1}\}_{k\in N} for pI1I2p\in I_{1}\cup I_{2} and {yp,2k}kN\{y^{k}_{p,2}\}_{k\in N} for pI1p\in I_{1} are bounded. We can conclude that {Yk(xk+1)kN,kK}\bigcup\{Y^{k}(x^{k+1})\mid k\in N,k\geq K\} is bounded for sufficiently large integer KK, because otherwise we could extract a subsequence of multipliers from Yk(xk+1)Y^{k}(x^{k+1}) whose norms diverge to ++\infty as k(N)+k(\in N)\to+\infty, in contradiction to the result of boundedness that we have shown. Hence, the subsequence {Yk(xk+1)}kN\{Y^{k}(x^{k+1})\}_{k\in N} is eventually bounded. ∎

We make a remark on Lemma 2(b) about the additional assumption. According to the proof of part (b), the assumption Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} for pI2p\in I_{2} ensures the boundedness of the set Afp(x¯)\partial_{A}f_{p}(\bar{x}) for pI2p\in I_{2}. There are some sufficient conditions for Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} to hold: (i) If fpf_{p} is locally Lipschitz continuous and bounded from below, by Theorem 1(b), we have Afp(x)={0}\partial^{\infty}_{A}f_{p}(x)=\{0\} at any xdomfpx\in\operatorname{dom}f_{p} for the approximating sequence generated by the Moreau envelope. (ii) If fpf_{p} is icc associated with f¯p\overline{f}_{p} satisfying all assumptions in Proposition 2, it then follows from Proposition 2(b) that Afp(x)={0}\partial^{\infty}_{A}f_{p}(x)=\{0\} at any xint(domfp)x\in\operatorname{int}(\operatorname{dom}f_{p}) for the approximating sequence based on the partial Moreau envelope. It is worth mentioning that the icc function fpf_{p} under condition (ii) is not necessarily locally Lipschitz continuous.

The main convergence result of the prox-ADC method follows.

Theorem 4.

Suppose that Assumptions 1-5 hold. Let {xk}\{x^{k}\} be the sequence generated by the prox-ADC method. Suppose that {xk}\{x^{k}\} has an accumulation point x¯\bar{x} and, in addition, Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} for pI2p\in I_{2}. Then x¯\bar{x} is a weakly A-stationary point of (CP1). Moreover, if for each pI2p\in I_{2}, the functions gpkg^{k}_{p} and hpkh^{k}_{p} are k\ell_{k}-smooth for all k{k\in\mathbb{N}}, i.e., there exists a sequence {k}\{\ell_{k}\} such that for all k{k\in\mathbb{N}},

max{gpk(x)gpk(x),hpk(x)hpk(x)}kxxx,xn,pI2,\max\Big{\{}\,\big{\|}\nabla g^{k}_{p}(x)-\nabla g^{k}_{p}(x^{\prime})\big{\|},\,\big{\|}\nabla h^{k}_{p}(x)-\nabla h^{k}_{p}(x^{\prime})\big{\|}\,\Big{\}}\leq\ell_{k}\|x^{\prime}-x\|\quad\forall\,x,x^{\prime}\in\mathbb{R}^{n},\;p\in I_{2}, (36)

then x¯\bar{x} is also an A-stationary point of (CP1).

Proof.

Let {xk+1}kN\{x^{k+1}\}_{k\in N} be a subsequence converging to x¯\bar{x}. By the stopping conditions (29) and xk,ikNx¯x^{k,i_{k}}\rightarrow_{N}\bar{x}, we also have xk,ik+1Nx¯x^{k,i_{k}+1}\rightarrow_{N}\bar{x}. First, we prove x¯p=1mdomFp\bar{x}\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p}. From Theorem 3(a), we have fpk(xk,ik+1)0f^{k}_{p}(x^{k,i_{k}+1})\leq 0 for p=m1+1,,mp=m_{1}+1,\cdots,m and all kk\in\mathbb{N}. Due to epi-convergence in Assumption 1(c), it holds that

δ(,0](fp(x¯))lim infk(N)+δ(,0](fpk(xk,ik+1))=0,p=m1+1,,m.\delta_{(-\infty,0]}(f_{p}(\bar{x}))\leq\liminf_{k(\in N)\to+\infty}\delta_{(-\infty,0]}(f^{k}_{p}(x^{k,i_{k}+1}))=0,\qquad p=m_{1}+1,\cdots,m.

Thus fp(x¯)0f_{p}(\bar{x})\leq 0 for p=m1+1,,mp=m_{1}+1,\cdots,m and x¯p=m1+1mdomFp\bar{x}\in{\bigcap^{m}_{p=m_{1}+1}\operatorname{dom}F_{p}}. By Assumption 1(a), domφp=n\operatorname{dom}\varphi_{p}=\mathbb{R}^{n} for all p=1,,m1p=1,\cdots,m_{1}. This implies x¯p=1m1domFp\bar{x}\in\bigcap^{m_{1}}_{p=1}\operatorname{dom}F_{p}, and we can conclude that x¯p=1mdomFp\bar{x}\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p}.

By Lemma 2(a), for all sufficiently large kNk\in N, we have

0p=1m[yp,1k(gpk(xk,ik+1)hpk(xk,ik))+yp,2k(gpk(xk,ik)hpk(xk,ik+1))]+λ(xk,ik+1xk,ik),\hskip-3.61371pt0\in\sum\limits_{p=1}^{m}\Big{[}\,y^{k}_{p,1}\left(\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial\,h^{k}_{p}(x^{k,i_{k}})\right)+y^{k}_{p,2}\left(\partial g^{k}_{p}(x^{k,i_{k}})-\partial\,h^{k}_{p}(x^{k,i_{k}+1})\right)\Big{]}+\lambda(x^{k,i_{k}+1}-x^{k,i_{k}}), (37)

where yp,1kφp(fpk,upper(xk,ik+1;xk,ik))y^{k}_{p,1}\in\partial\varphi^{\uparrow}_{p}(f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})) and yp,2kφp(fpk,lower(xk,ik+1;xk,ik))y^{k}_{p,2}\in\partial\varphi^{\downarrow}_{p}(f^{k,\text{lower}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})) for p=1,,mp=1,\cdots,m. It follows from Lemma 2(b) that the subsequences {yp,1k}kN\{y^{k}_{p,1}\}_{k\in N} and {yp,2k}kN\{y^{k}_{p,2}\}_{k\in N} are bounded for p=1,,mp=1,\cdots,m. Suppose that yp,1kNy¯p,1y^{k}_{p,1}\to_{N}\bar{y}_{p,1} and yp,2kNy¯p,2y^{k}_{p,2}\to_{N}\bar{y}_{p,2} for p=1,,mp=1,\cdots,m. Recall that the subsequence {fpk(xk,ik+1)}kN\{f^{k}_{p}(x^{k,i_{k}+1})\}_{k\in N} is bounded by Assumption 1(b) for p=1,,mp=1,\cdots,m. Without loss of generality, assume that {fpk(xk,ik+1)}kN\{f^{k}_{p}(x^{k,i_{k}+1})\}_{k\in N} converges to some point z¯pTp(x¯)\bar{z}_{p}\in T_{p}(\bar{x}) for p=1,,mp=1,\cdots,m. Then, by (28a), (28b) and (29), fpk,upper(xk,ik+1;xk,ik)Nz¯pf^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\to_{N}\bar{z}_{p} for p=1,,mp=1,\cdots,m and fpk,lower(xk,ik+1;xk,ik)Nz¯pf^{k,\text{lower}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})\to_{N}\bar{z}_{p} for pI2p\in I_{2}. From the outer semicontinuity of φp\partial\varphi^{\uparrow}_{p} and φ\partial\varphi^{\downarrow}, we have y¯p,1φp(z¯p)\bar{y}_{p,1}\in\partial\varphi^{\uparrow}_{p}(\bar{z}_{p}) for p=1,,mp=1,\cdots,m and y¯p,2φp(z¯p)\bar{y}_{p,2}\in\partial\varphi^{\downarrow}_{p}(\bar{z}_{p}) for pI2p\in I_{2}.

To proceed, we prove by contradiction that the sequence {yp,1kvp,1k}kN\{y^{k}_{p,1}\,v^{k}_{p,1}\}_{k\in N} is bounded for pI1p\in I_{1}. Suppose that pI1yp,1kvp,1kN+\sum_{p\in I_{1}}\|y^{k}_{p,1}\,v^{k}_{p,1}\|\rightarrow_{N}+\infty. For each pI2p\in I_{2}, the boundedness of {vp,1k}kN\{v^{k}_{p,1}\}_{k\in N} and {vp,2k}kN\{v^{k}_{p,2}\}_{k\in N} follows from the assumption Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\}; otherwise, any accumulation point of the unit vectors {vp,1k/vp,1k}kN\{v^{k}_{p,1}/\|v^{k}_{p,1}\|\}_{k\in N} would be in Afp(x¯)\partial^{\infty}_{A}f_{p}(\bar{x}), leading to a contradiction. Since {yp,1k}kN\{y^{k}_{p,1}\}_{k\in N} and {yp,1k}kN\{y^{k}_{p,1}\}_{k\in N} for pI2p\in I_{2} are also bounded, we conclude that the subsequence {pI2(yp,1kvp,1k+yp,2kvp,2k)}kN\big{\{}\sum_{p\in I_{2}}(y^{k}_{p,1}\,v^{k}_{p,1}+y^{k}_{p,2}\,v^{k}_{p,2})\big{\}}_{k\in N} is bounded. Thus, we can assume that

pI2(yp,1kvp,1k+yp,2kvp,2k)Nw¯(pI2(y¯p,1Afp(x¯)+y¯p,2Afp(x¯))).\sum_{p\in I_{2}}\big{(}y^{k}_{p,1}\,v^{k}_{p,1}+y^{k}_{p,2}\,v^{k}_{p,2}\big{)}\rightarrow_{N}\;\bar{w}~{}\left(\in\sum_{p\in I_{2}}(\bar{y}_{p,1}\,\partial_{A}f_{p}(\bar{x})+\bar{y}_{p,2}\,\partial_{A}f_{p}(\bar{x}))\right).

By (35), it follows that pI1yp,1kvp,1kN(w¯)\sum_{p\in I_{1}}y^{k}_{p,1}\,v^{k}_{p,1}\rightarrow_{N}(-\bar{w}). Consider w~pkyp,1kvp,1k/pI1yp,1kvp,1k\widetilde{w}^{k}_{p}\triangleq y^{k}_{p,1}\,v^{k}_{p,1}/\sum_{{p^{\prime}}\in I_{1}}\|y^{k}_{{p^{\prime}},1}\,v^{k}_{{p^{\prime}},1}\| for pI1p\in I_{1}, and then pI1w~pkN0\sum_{p\in I_{1}}\widetilde{w}^{k}_{p}\rightarrow_{N}0. Given pI1w~pk=1\sum_{p\in I_{1}}\|\widetilde{w}^{k}_{p}\|=1 for all kNk\in N, there must exist p1I1p_{1}\in I_{1} such that w~p1kNw~p10\widetilde{w}^{k}_{p_{1}}\rightarrow_{N}\widetilde{w}_{p_{1}}\neq 0. For each pI1p\in I_{1}, it then follows from yp,1k/pI1yp,1kvp,1kN0y^{k}_{p,1}/\sum_{{p^{\prime}}\in I_{1}}\|y^{k}_{{p^{\prime}},1}v^{k}_{{p^{\prime}},1}\|\to_{N}0 that {w~pk}kN\{\widetilde{w}^{k}_{p}\}_{k\in N} has a subsequence converging to some element in Afp(x¯)\partial^{\infty}_{A}f_{p}(\bar{x}). In particular, w~p1Afp1(x¯)\{0}\widetilde{w}_{p_{1}}\in\partial^{\infty}_{A}f_{p_{1}}(\bar{x})\backslash\{0\}. Since pI1w~pkN0\sum_{p\in I_{1}}\widetilde{w}^{k}_{p}\rightarrow_{N}0, this implies that

0[Afp1(x¯)\{0}]+pI1\{p1}Afp(x¯),0\in\left[\,\partial^{\infty}_{A}f_{p_{1}}(\bar{x})\backslash\{0\}\,\right]+\sum_{p\in I_{1}\backslash\{p_{1}\}}\partial^{\infty}_{A}f_{p}(\bar{x}),\vspace{-0.1in}

which contradicts Assumption 5. Hence, {yp,1kvp,1k}kN\{y^{k}_{p,1}\,v^{k}_{p,1}\}_{k\in N} is bounded for pI1p\in I_{1}.

We are now ready to prove that x¯\bar{x} is a weakly A-stationary point. Suppose that yp,1kvp,1kNw¯py^{k}_{p,1}\,v^{k}_{p,1}\rightarrow_{N}\bar{w}_{p} for pI1p\in I_{1} with pI1w¯p=w¯\sum_{p\in I_{1}}\bar{w}_{p}=-\bar{w}. It remains to show that for each pI1p\in I_{1}, there exists y¯p,1{φp(tp)tpTp(x¯)}\bar{y}_{p,1}\in\bigcup\{\partial\varphi^{\uparrow}_{p}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\} such that

w¯p{y¯p,1Afp(x¯)}[Afp(x¯)\{0}],\bar{w}_{p}\in\left\{\,\bar{y}_{p,1}\,\partial_{A}f_{p}(\bar{x})\,\right\}\cup\left[\,\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\,\right],

which can be derived similarly as the proof of (19) in Theorem 2. Summarizing these arguments, we conclude that x¯\bar{x} is a weakly A-stationary point of (CP1).

Under the additional assumption of the theorem, there exist yp,1kφp(fpk,upper(xk,ik+1;xk,ik))y^{k}_{p,1}\in\partial\varphi^{\uparrow}_{p}(f^{k,\text{upper}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})), yp,2kφp(fpk,lower(xk,ik+1;xk,ik))y^{k}_{p,2}\in\partial\varphi^{\downarrow}_{p}(f^{k,\text{lower}}_{p}(x^{k,i_{k}+1};x^{k,i_{k}})) for p=1,,mp=1,\cdots,m, and

vp,1k{gpk(xk,ik)hpk(xk,ik)}{gpk(xk,ik+1)hpk(xk,ik+1)} for pI1v^{k}_{p,1}\in\left\{\partial g^{k}_{p}(x^{k,i_{k}})-\partial h^{k}_{p}(x^{k,i_{k}})\right\}\cup\left\{\partial g^{k}_{p}(x^{k,i_{k}+1})-\partial h^{k}_{p}(x^{k,i_{k}+1})\right\}{\mbox{ for }p\in I_{1}}

such that

pI1yp,1kvp,1k+pI2(yp,1k+yp,2k)[gpk(xk,ik)hpk(xk,ik)]\displaystyle\left\|\sum_{p\in I_{1}}y^{k}_{p,1}\,v^{k}_{p,1}+\sum_{p\in I_{2}}(y^{k}_{p,1}+y^{k}_{p,2})\left[\nabla g^{k}_{p}(x^{k,i_{k}})-\nabla h^{k}_{p}(x^{k,i_{k}})\right]\right\|
(vi)\displaystyle\overset{(\rm vi)}{\leq} λxk,ik+1xk,ik+pI1|yp,1k|min{(gpk(xk,ik+1),gpk(xk,ik)),(hpk(xk,ik+1),hpk(xk,ik))}\displaystyle\lambda\|x^{k,i_{k}+1}-x^{k,i_{k}}\|+\sum\limits_{p\in I_{1}}|y^{k}_{p,1}|\cdot\min\Big{\{}{\mathbb{H}\left(\partial g^{k}_{p}(x^{k,i_{k}+1}),\partial g^{k}_{p}(x^{k,i_{k}})\right),\,\mathbb{H}\left(\partial h^{k}_{p}(x^{k,i_{k}+1}),\partial h^{k}_{p}(x^{k,i_{k}})\right)}\Big{\}}
+pI2(|yp,1k|gpk(xk,ik)gpk(xk,ik+1)+|yp,2k|hpk(xk,ik+1)hpk(xk,ik))\displaystyle+\sum_{p\in I_{2}}\Big{(}|y^{k}_{p,1}|\cdot\|\nabla g^{k}_{p}(x^{k,i_{k}})-\nabla g^{k}_{p}(x^{k,i_{k}+1})\|+|y^{k}_{p,2}|\cdot\|\nabla h^{k}_{p}(x^{k,i_{k}+1})-\nabla h^{k}_{p}(x^{k,i_{k}})\|\Big{)}
(vii)\displaystyle\overset{(\rm vii)}{\leq} [λ+(pI1|yp,1k|+pI2(|yp,1k|+|yp,2k|))k]xk,ik+1xk,ik\displaystyle\left[\lambda+\left(\sum_{p\in I_{1}}|y^{k}_{p,1}|+\sum_{p\in I_{2}}\big{(}|y^{k}_{p,1}|+|y^{k}_{p,2}|\big{)}\right)\ell_{k}\right]\|x^{k,i_{k}+1}-x^{k,i_{k}}\|
(viii)\displaystyle\overset{(\rm viii)}{\leq} max{1,pI1|yp,1k|+pI2(|yp,1k|+|yp,2k|)}δkk,\displaystyle{\max\left\{1,\sum\limits_{p\in I_{1}}|y^{k}_{p,1}|+\sum\limits_{p\in I_{2}}\big{(}|y^{k}_{p,1}|+|y^{k}_{p,2}|\big{)}\right\}\delta_{k}\qquad\forall\,k\in\mathbb{N}},

where (vi)(\rm vi) is implied by the optimality condition (37), (vii)(\rm vii) employs (36) and Assumption 3, and (viii)(\rm viii) follows from conditions (29). This inequality is a tighter version of (35) in the sense that, for each pI2p\in I_{2} and k{k\in\mathbb{N}}, vp,1kv^{k}_{p,1} and vp,2kv^{k}_{p,2} are elements taken from the single-valued mapping gpk()hpk()\nabla g^{k}_{p}(\cdot)-\nabla h^{k}_{p}(\cdot) evaluated at the same point xk,ikx^{k,i_{k}}. A straightforward adaptation of the preceding argument confirms that x¯\bar{x} is an A-stationary point of (CP1). ∎

4.3 Termination criteria.

The previous subsection demonstrates the asymptotic convergence of the algorithm, showing that any accumulation point of the sequence generated by the prox-ADC method is weakly A-stationary. This subsection is dedicated to the non-asymptotic analysis of verifiable termination criteria for practical implementation.

Assumption 6 (non-asymptotic constraint qualifications) Let λ\lambda be the parameter in Algorithm 1. For all kk\in\mathbb{N} and any pair (x,x′′)(x^{\prime},x^{\prime\prime}) satisfying x′′=[argminxnp=1m1Fpk^(x;x)+λ2xx2subject tofpk,upper(x;x)0,p=m1+1,,m],x^{\prime\prime}=\left[\begin{array}[]{cl}\displaystyle\operatornamewithlimits{argmin}_{x\in\mathbb{R}^{n}}&\;\;\sum\limits_{p=1}^{m_{1}}\widehat{F^{k}_{p}}(x;x^{\prime})+\frac{\lambda}{2}\|x-x^{\prime}\|^{2}\\[10.84006pt] \text{subject to}&\;\;f^{k,\text{upper}}_{p}(x;x^{\prime})\leq 0,\,p=m_{1}+1,\cdots,m\end{array}\right], if there exist ypk𝒩(,0](fpk,upper(x′′;x))y^{k}_{p}\in\mathcal{N}_{(-\infty,0]}(f_{p}^{k,\text{upper}}(x^{\prime\prime};x^{\prime})) for p=m1+1,,mp=m_{1}+1,\cdots,m such that 0p=m1+1mypkfpk,upper(x′′;x),0\in\sum\limits_{p=m_{1}+1}^{m}y^{k}_{p}\,\partial f_{p}^{k,\text{upper}}(x^{\prime\prime};x^{\prime}),\vspace{-0.08in} then we must have ym1+1k==ymk=0y^{k}_{m_{1}+1}=\cdots=y^{k}_{m}=0.

A direct consequence of Assumption 6 and the nonsmooth Lagrange multiplier rule [30, Exercise 10.52] is that the set of multipliers Yk(xk+1)Y^{k}(x^{k+1}) is non-empty and compact for any fixed kk\in\mathbb{N}. This is in contrast with Lemma 2(a), where the results only hold for sufficiently large kNk\in N. We will show below that the result on the eventual boundedness of the subsequence {Yk(xk+1)}kN\{Y^{k}(x^{k+1})\}_{k\in N} can be strengthened to the equi-boundedness under this assumption.

Proposition 5 (equi-boundedness of multipliers).

Suppose that Assumptions 1-6 hold. Consider any sequence {xk}\{x^{k}\} generated by the prox-ADC method. The following statements hold.
(a) If there is a subsequence {xk+1}kN\{x^{k+1}\}_{k\in N} converging to some x¯\bar{x} and Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} for pI2p\in I_{2}, then the subsequence {Yk(xk+1)}kN\big{\{}Y^{k}(x^{k+1})\big{\}}_{k\in N} is equi-bounded.
(b) If {xk}\{x^{k}\} is bounded and Afp(x)={0}\partial^{\infty}_{A}f_{p}(x)=\{0\} for any xp=1mdomFpx\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p} and pI2p\in I_{2}, then the sequence {Yk(xk+1)}\big{\{}Y^{k}(x^{k+1})\big{\}} is equi-bounded, i,e,

DsupksupyYk(xk+1)y<+.D\,\triangleq\,\sup_{k\in\mathbb{N}}\,\sup_{y\in Y^{k}(x^{k+1})}\|y\|<+\infty. (38)
Proof.

(a) We know from Lemma 2(b) that the subsequence {Yk(xk+1)}kN\{Y^{k}(x^{k+1})\}_{k\in N} is eventually bounded. This implies the existence of an index KNK\in N such that {Yk(xk+1)kN,kK}\bigcup\{Y^{k}(x^{k+1})\mid k\in N,k\geq K\} is bounded. On the other hand, it follows from Assumption 6 that Yk(xk+1)Y^{k}(x^{k+1}) is non-empty and compact for any fixed kNk\in N. Thus, {Yk(xk+1)kN}\bigcup\{Y^{k}(x^{k+1})\mid k\in N\} is bounded, and {Yk(xk+1)}kN\{Y^{k}(x^{k+1})\}_{k\in N} is equi-bounded.

(b) Suppose for contradiction that {Yk(xk+1)}\{Y^{k}(x^{k+1})\} is not equi-bounded. Then for any nonnegative integer jj, there is an index kjk_{j}\in\mathbb{N} such that ykjj\|y^{k_{j}}\|\geq j for some multiplier ykjYkj(xkj+1)y^{k_{j}}\in Y^{k_{j}}\big{(}x^{k_{j}+1}\big{)}. Observe that the nonnegative sequence of indices {kj}j\{k_{j}\}_{j\in\mathbb{N}} is either bounded or unbounded. It suffices to consider these two cases separately.

Suppose first that {kj}j\{k_{j}\}_{j\in\mathbb{N}} is bounded. There must be an index k¯\bar{k}\in\mathbb{N} that appears infinitely many times in {kj}j\{k_{j}\}_{j\in\mathbb{N}}. Consequently, the set Yk¯(xk¯+1)Y^{\bar{k}}(x^{\bar{k}+1}) is unbounded, a contradiction to Assumption 6.

Suppose next that {kj}j\{k_{j}\}_{j\in\mathbb{N}} is unbounded. For some index set NN^{\prime}\in\mathbb{N}^{\sharp}_{\infty}, we have kj+k_{j}\to+\infty as j(N)+j(\in N^{\prime})\to+\infty. Notice that the subsequence {xkj}jN\{x^{k_{j}}\}_{j\in N^{\prime}} is bounded since {xk}\{x^{k}\} is bounded. By passing to a subsequence if necessary, we assume that {xkj}jN\{x^{k_{j}}\}_{j\in N^{\prime}} converges to some x~\widetilde{x}. Using epi-convergence in Assumption 1(c) and following the same procedure as in the proof of Theorem 4, we can obtain that x~p=1mdomFp\widetilde{x}\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p}. Then, by the assumption in (b), Afp(x~)={0}\partial^{\infty}_{A}f_{p}(\widetilde{x})=\{0\} for pI2p\in I_{2}. Henceforth, there is a subsequence {xkj}jN{xk}\{x^{k_{j}}\}_{j\in N^{\prime}}\subset\{x^{k}\} converging to some x~\widetilde{x} with a corresponding subsequence of multipliers {ykjYkj(xkj+1)}jN\{y^{k_{j}}\in Y^{k_{j}}(x^{k_{j}+1})\}_{j\in N^{\prime}} such that ykj+\|y^{k_{j}}\|\to+\infty as j(N)+j(\in N^{\prime})\to+\infty, which is a contradiction to the result of part (a).

We have obtained contradictions for the two cases where {kj}j\{k_{j}\}_{j\in\mathbb{N}} is bounded or unbounded. Then we can conclude that {Yk(xk+1)}\{Y^{k}(x^{k+1})\} is equi-bounded and the quantity DD defined in (38) is finite. ∎

After obtaining the equi-boundedness of the multipliers, we next introduce a relaxation of the weakly A-stationary point for preparation of the termination criteria. For a proper and convex function ff and any β>0\beta>0, we denote βf(x¯){f(x)x𝔹(x¯,β)}\partial^{\beta}f(\bar{x})\triangleq\bigcup\{\partial f(x)\mid x\in\mathbb{B}(\bar{x},\beta)\}, which is related to the Goldstein’s β\beta-subdifferential [18].

Definition 5.

Given any η¯>0\bar{\eta}>0, β¯>0\bar{\beta}>0 and k¯\bar{k}\in\mathbb{N}, we say a point xx is a (η¯,β¯,k¯)(\bar{\eta},\bar{\beta},\bar{k})-weakly A-stationary point of problem (CP0) if there exists a nonnegative integer kk¯k\geq\bar{k} such that

dist(0,p=1m{yp,1[β¯gpk(x)β¯hpk(x)]+yp,2[β¯gpk(x)β¯hpk(x)]|yp,1β¯φp(fpk(x)),yp,2β¯φp(fpk(x))})η¯.\operatorname{dist}\left(0,\,\sum^{m}_{p=1}\bigcup\left\{y_{p,1}\big{[}\partial^{\bar{\beta}}g^{k}_{p}(x)-\partial^{\bar{\beta}}h^{k}_{p}(x)\big{]}+y_{p,2}\big{[}\partial^{\bar{\beta}}g^{k}_{p}(x)-\partial^{\bar{\beta}}h^{k}_{p}(x)\big{]}\,\middle|\begin{array}[]{c}\,y_{p,1}\in\partial^{\bar{\beta}}\varphi^{\uparrow}_{p}(f^{k}_{p}(x)),\\ y_{p,2}\in\partial^{\bar{\beta}}\varphi^{\downarrow}_{p}(f^{k}_{p}(x))\end{array}\right\}\right)\leq\bar{\eta}.

We remark that, if each outer function φp\varphi_{p} is an identity function, i.e., φp(t)=t\varphi_{p}(t)=t for any tt\in\mathbb{R}, and each inner function fpf_{p} is DC rather than ADC, the above definition in the context of a DC program is independent of kk and says about nearness to a η¯\bar{\eta}-critical point [35, definition 2]. For nonsmooth optimization problem, similar definitions based on the idea of small nearby subgradients, together with the termination criteria, have appeared in the literature [18, 9].

The following proposition reveals the relationship between a (η¯,β¯,k¯)(\bar{\eta},\bar{\beta},\bar{k})-weakly A-stationary point and a weakly A-stationary point.

Proposition 6.

Let x¯p=1mdomFp\bar{x}\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p} be a feasible point of (CP0). Suppose that Assumption 1 holds and Afp(x¯)={0}\partial^{\infty}_{A}f_{p}(\bar{x})=\{0\} for each p=1,,mp=1,\cdots,m. For any nonnegative sequence (ηk,βk)0(\eta_{k},\beta_{k})\downarrow 0 and some index set NN\in\mathbb{N}^{\sharp}_{\infty}, if each xkx^{k} is a (ηk,βk,k)(\eta_{k},\beta_{k},k)-weakly A-stationary point of (CP0) for kNk\in N and xkNx¯x^{k}\to_{N}\bar{x}, then x¯\bar{x} is a weakly A-stationary point of (CP0).

Proof.

By Assumption 1(b), the subsequence {fpk(xk)}kN\{f^{k}_{p}(x^{k})\}_{k\in N} is bounded for each pp. Then, there is an index set N(N)NN^{\prime}(\subset N)\in N^{\sharp}_{\infty} such that {fpk(xk)}kN\{f^{k}_{p}(x^{k})\}_{k\in N^{\prime}} converges to some t¯pTp(x¯)\bar{t}_{p}\in T_{p}(\bar{x}) for each pp. Using the outer semicontinuity of the subdifferential mapping of a convex function, we have

Limsupk(N)+βkφp(fpk(xk))φp(t¯p),Limsupk(N)+βkφp(fpk(xk))φp(t¯p)\displaystyle\operatornamewithlimits{Lim\,sup}_{k(\in N^{\prime})\to+\infty}\;\partial^{\beta_{k}}\varphi^{\uparrow}_{p}(f^{k}_{p}(x^{k}))\subset\partial\varphi^{\uparrow}_{p}(\bar{t}_{p}),\qquad\displaystyle\operatornamewithlimits{Lim\,sup}_{k(\in N^{\prime})\to+\infty}\;\partial^{\beta_{k}}\varphi^{\downarrow}_{p}(f^{k}_{p}(x^{k}))\subset\partial\varphi^{\downarrow}_{p}(\bar{t}_{p})

and

Limsupk(N)+[βkgpk(xk)βkhpk(xk)]Afp(x¯).\displaystyle\operatornamewithlimits{Lim\,sup}_{k(\in N^{\prime})\to+\infty}\big{[}\partial^{\beta_{k}}g^{k}_{p}(x^{k})-\partial^{\beta_{k}}h^{k}_{p}(x^{k})\big{]}\subset\partial_{A}f_{p}(\bar{x}).

Thus, by taking an outer limit of the subdifferentials involved in the condition that xkx^{k} is (ηk,βk,k)(\eta_{k},\beta_{k},k)-weakly A-stationary for all kNk\in N, we know that x¯\bar{x} is a weakly A-stationary point of (CP0). ∎

We conclude this section with our main result on the termination criteria.

Proposition 7 (termination criteria).

Suppose that Assumptions 1-6 hold. Let {xk}\{x^{k}\} be the sequence generated by the prox-ADC method. Suppose that {xk}\{x^{k}\} is bounded and Afp(x)={0}\partial^{\infty}_{A}f_{p}(x)=\{0\} for any xp=1mdomFpx\in\bigcap^{m}_{p=1}\operatorname{dom}F_{p} and pI2p\in I_{2}. For any η¯>0\bar{\eta}>0, β¯>0\bar{\beta}>0 and k¯\bar{k}\in\mathbb{N}, there exists a nonnegative integer k0k¯k_{0}\geq\bar{k} such that

max1pmk=k0+αpk^+ϵk0β¯,δk0λ+k0β¯,δk0η¯.\max\limits_{1\leq p\leq m}\sum\limits^{+\infty}_{k^{\prime}=k_{0}}\widehat{\alpha^{k^{\prime}}_{p}}+\epsilon_{k_{0}}\leq\bar{\beta},\quad\frac{\delta_{k_{0}}}{\lambda+\ell_{k_{0}}}\leq\bar{\beta},\quad\delta_{k_{0}}\leq\bar{\eta}. (39)

Consequently, xk0,ik0+1x^{k_{0},i_{k_{0}}+1} is a (η¯max{1,2mD},β¯,k¯)\left(\bar{\eta}\cdot\max\{1,\sqrt{2m}D\},\bar{\beta},\bar{k}\right)-weakly A-stationary point of problem (CP1), where DD is the constant defined in (38).

Proof.

The existence of k0k¯k_{0}\geq\bar{k} satisfying (39) is a direct consequence of k=0+αpk^<+\sum^{+\infty}_{k^{\prime}=0}\widehat{\alpha^{k^{\prime}}_{p}}<+\infty, (ϵk,δk)0(\epsilon_{k},\delta_{k})\downarrow 0, and δk/(λ+k)0\delta_{k}/(\lambda+\ell_{k})\downarrow 0. Recalling (34) in the proof of Lemma 2, at the k0k_{0}-th outer iteration, we have

{p=1m(yp,1k0vp,1k0+yp,2k0vp,2k0)max{1,p=1m(|yp,1k0|+|yp,2k0|)}δk0,[vp,1k0gpk0(xk0,ik0)hpk0(xk0,ik0)vp,2k0gpk0(xk0,ik0+1)hpk0(xk0,ik0+1)]or[vp,1k0gpk0(xk0,ik0+1)hpk0(xk0,ik0+1)vp,2k0gpk0(xk0,ik0)hpk0(xk0,ik0)],p=1,,m,\left\{\begin{array}[]{l}\displaystyle\left\|\sum_{p=1}^{m}\big{(}y^{k_{0}}_{p,1}\,v^{k_{0}}_{p,1}+y^{k_{0}}_{p,2}\,v^{k_{0}}_{p,2}\big{)}\right\|\leq\max\left\{1,\sum\limits_{p=1}^{m}\big{(}|y^{k_{0}}_{p,1}|+|y^{k_{0}}_{p,2}|\big{)}\right\}\delta_{k_{0}},\\[18.06749pt] \left[\begin{array}[]{cc}v^{k_{0}}_{p,1}\in\partial g^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}})-\partial h^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}})\\[7.22743pt] v^{k_{0}}_{p,2}\in\partial g^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}+1})-\partial h^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}+1})\end{array}\right]\text{or}\left[\begin{array}[]{cc}v^{k_{0}}_{p,1}\in\partial g^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}+1})-\partial h^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}+1})\\[7.22743pt] v^{k_{0}}_{p,2}\in\partial g^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}})-\partial h^{k_{0}}_{p}(x^{k_{0},i_{k_{0}}})\end{array}\right],\,p=1,\cdots,m,\end{array}\right.

where yp,1k0φp(fpk0,upper(xk0,ik0+1;xk0,ik0))y^{k_{0}}_{p,1}\in\partial\varphi^{\uparrow}_{p}(f^{k_{0},\text{upper}}_{p}(x^{k_{0},i_{k_{0}}+1};x^{k_{0},i_{k_{0}}})), yp,2k0φp(fpk0,lower(xk0,ik0+1;xk0,ik0))y^{k_{0}}_{p,2}\in\partial\varphi^{\downarrow}_{p}(f^{k_{0},\text{lower}}_{p}(x^{k_{0},i_{k_{0}}+1};x^{k_{0},i_{k_{0}}})) for p=1,,mp=1,\cdots,m. Thus, at the point x=xk0,ik0+1x^{\ast}=x^{k_{0},i_{k_{0}}+1}, we have

dist(0,p=1m{yp,1[βgpk0(x)βhpk0(x)]+yp,2[βgpk0(x)βhpk0(x)]|yp,1βφp(fpk0(x)),yp,2βφp(fpk0(x))})η,\operatorname{dist}\left(0,\,\sum^{m}_{p=1}\bigcup\left\{y_{p,1}\big{[}\partial^{\beta}g^{k_{0}}_{p}(x^{\ast})-\partial^{\beta}h^{k_{0}}_{p}(x^{\ast})\big{]}+y_{p,2}\big{[}\partial^{\beta}g^{k_{0}}_{p}(x^{\ast})-\partial^{\beta}h^{k_{0}}_{p}(x^{\ast})\big{]}\,\middle|\begin{array}[]{c}\,y_{p,1}\in\partial^{\beta}\varphi^{\uparrow}_{p}(f^{k_{0}}_{p}(x^{\ast})),\\ y_{p,2}\in\partial^{\beta}\varphi^{\downarrow}_{p}(f^{k_{0}}_{p}(x^{\ast}))\end{array}\right\}\right)\leq\eta,

where the parameters β\beta and η\eta are given by

β=max{max1pm[fpk0,upper(x;xk0,ik0)fpk0(x)],max1pm[fpk0(x)fpk0,lower(x;xk0,ik0)],xxk0,ik0}(29)max{max1pmk=k0+αpk^+ϵk0,δk0/(λ+k0)}(39)β¯,η=max{1,p=1m(|yp,1k0|+|yp,2k0|)}δk0Proposition 5max{1,2mD}δk0(39)η¯max{1,2mD}.\begin{array}[]{l}\beta=\max\left\{\begin{array}[]{c}\max\limits_{1\leq p\leq m}\left[f^{k_{0},\text{upper}}_{p}(x^{\ast}\,;x^{k_{0},i_{k_{0}}})-f^{k_{0}}_{p}(x^{\ast})\right],\\ \max\limits_{1\leq p\leq m}\left[f^{k_{0}}_{p}(x^{\ast})-f^{k_{0},\text{lower}}_{p}(x^{\ast}\,;x^{k_{0},i_{k_{0}}})\right],\\ \|x^{\ast}-x^{k_{0},i_{k_{0}}}\|\end{array}\right\}\overset{\eqref{eq:inner_stopping}}{\leq}\max\left\{\begin{array}[]{c}\max\limits_{1\leq p\leq m}\sum\limits^{+\infty}_{k^{\prime}=k_{0}}\widehat{\alpha^{k^{\prime}}_{p}}+\epsilon_{k_{0}},\\[10.84006pt] \delta_{k_{0}}/(\lambda+\ell_{k_{0}})\end{array}\right\}\overset{\eqref{eq:Termination}}{\leq}\bar{\beta},\\[28.90755pt] \eta={\max\left\{1,\sum\limits_{p=1}^{m}\big{(}|y^{k_{0}}_{p,1}|+|y^{k_{0}}_{p,2}|\big{)}\right\}\delta_{k_{0}}}\overset{\text{Proposition }\ref{prop:equi-bounded}}{\leq}\max\{1,\,\sqrt{2m}D\}\cdot\delta_{k_{0}}\overset{\eqref{eq:Termination}}{\leq}\bar{\eta}\cdot\max\{1,\,\sqrt{2m}D\}.\end{array}

Henceforth, for k0k_{0} satisfying (39), x=xk0,ik0+1x^{\ast}=x^{k_{0},i_{k_{0}}+1} is a (η¯max{1,2mD},β¯,k¯)\left(\bar{\eta}\cdot\max\{1,\,\sqrt{2m}D\},\bar{\beta},\bar{k}\right)-weakly A-stationary point of problem (CP1). ∎

5 Numerical examples.

We present some preliminary experiments to illustrate the performance of our algorithm on the inverse optimal value optimization with or without constraints. The first experiment aims to demonstrate the practical performance of the prox-ADC method under the termination criteria in section 4.3, by varying different approximating sequences and initial points. To demonstrate the computation of ADC constrained problems, especially the choice of the quantity αpk^\widehat{\alpha^{k}_{p}} and a feasible initial point in Assumption 2, we further consider the constrained inverse optimal value optimization. These experiments were tested on a MacBook Air laptop with an Apple M1 chip and 16GB of memory using Julia 1.10.2.

5.1 Inverse optimal value optimization with simple constraints.

Based on the setting in (2), we aim to find a vector x[1,1]nx\in[-1,1]^{n} to minimize the errors between the observed optimal values {νp}p=1m{\{\nu_{p}\}^{m}_{p=1}} and true optimal values {fp(x)}p=1m\{f_{p}(x)\}^{m}_{p=1}:

minimizex[1,1]nF(x)p=1m|νpfp(x)|,\displaystyle\operatornamewithlimits{minimize}_{x\in[-1,1]^{n}}\;F(x)\triangleq\sum_{p=1}^{m}\left|{\nu_{p}}-f_{p}(x)\right|, (40)

where each fpf_{p} is the optimal value function as defined in (1). We fix n=10n=10, m=11m=11, d=10d=10, and the number of inequality constraints =5\ell=5 in the minimization problem (1). Vectors bpb^{\,p} and cpc^{\,p}, and matrices Ap,Bp,CpA^{\,p},B^{\,p},C^{\,p} are randomly generated with each entry independent and normally distributed with mean μ=0\mu=0 and variance σ=1\sigma=1. For numerical stability, we then normalize matrices CpC^{\,p} and ApA^{\,p} by a factor of n\sqrt{n}. We also generate a positive definite matrix QpQ^{\,p} and a random solution x=u/ux^{\ast}=u/\|u\| with uNormal(0,In)u\sim\mbox{Normal}(0,I_{n}). We set νp=fp(x){\nu_{p}}=f_{p}(x^{\ast}) for each pp and, therefore, F(x)=0F(x^{\ast})=0.

We adopt the ADC decomposition in (6), denoted by fpk=(gp)γk(hp)γkf^{k}_{p}=(g_{p})_{\gamma_{k}}-(h_{p})_{\gamma_{k}} with a sequence {γk=1/(k+1)ρ}\{\gamma_{k}=1/(k+1)^{\rho}\} for some exponent ρ>0\rho>0. Consequently, k=1/γk=(k+1)ρ\ell_{k}=1/\gamma_{k}=(k+1)^{\rho}. We apply the prox-ADC algorithm to solve this example with ϵk=δk=1/(k+1)ρ\epsilon_{k}=\delta_{k}=1/(k+1)^{\rho} and λ=5\lambda=5. In this example, the strongly convex subproblem (30) can be easily reformulated to a problem with linear objective and convex quadratic constraints, which is solved by Gurobi in our experiments.

We first investigate the performance of our algorithm under the termination criteria with different values of parameters. Figure 2 displays the logarithm of the objective values against the number of outer iterations and the total number of inner iterations. We mark three different points on the curve where the termination criteria (39) with η¯=β¯=101,102,103\bar{\eta}=\bar{\beta}=10^{-1},10^{-2},10^{-3} and k¯=10,20,40\bar{k}=10,20,40 are satisfied.

Refer to caption
Refer to caption
Figure 2: Performance of the prox-ADC method for problem (40), under the termination criteria (39) with η¯=β¯=101,102,103\bar{\eta}=\bar{\beta}=10^{-1},10^{-2},10^{-3} and k¯=10,20,40\bar{k}=10,20,40, for a fixed exponent ρ=1.5\rho=1.5 and a fixed initial point.

We have also experimented with various values of exponent ρ\rho that determine the convergence rate of the approximating sequence and various initial points. In both cases, we terminate the algorithm under the conditions (39) with η¯=β¯=102\bar{\eta}=\bar{\beta}=10^{-2} and k¯=10\bar{k}=10. In Figure 3, we observe that setting different values of ρ\rho under the same termination criteria leads to candidate solutions with similar objective values, and there are roughly two phases of convergence in terms of the total number of iterations. Initially, the objective value decreases faster for smaller ρ\rho, corresponding to poorer approximation. When the objective value is sufficiently small (10310^{-3} on this particular instance), larger ρ\rho results in faster convergence to high accuracy. We remark that for ρ=0.5\rho=0.5, the algorithm reaches the maximum number of outer iterations and does not output a (102,102,10)(10^{-2},10^{-2},10)-weakly A-stationary point. Figure 4 demonstrates the influence of using various initial points that are uniformly distributed on [1,1]n[-1,1]^{n}. On this instance, two of the initial points find (102,102,10)(10^{-2},10^{-2},10)-weakly A-stationary points with large objective values. For these two initial points, we rerun the algorithm with η¯=β¯=103\bar{\eta}=\bar{\beta}=10^{-3}, and the algorithm still terminates with large objective values.

Refer to caption
Refer to caption
Figure 3: Performance of the prox-ADC method for problem (40) using different values of exponent ρ\rho, under the termination criteria (39) with η¯=β¯=102\bar{\eta}=\bar{\beta}=10^{-2} and k¯=10\bar{k}=10, for a fixed initial point.
Refer to caption
Refer to caption
Figure 4: Performance of the prox-ADC method for problem (40) using five initial points uniformly distributed on [1,1]n[-1,1]^{n}, under the termination criteria (39) with η¯=β¯=102\bar{\eta}=\bar{\beta}=10^{-2} and k¯=10\bar{k}=10, for a fixed exponent ρ=1.5\rho=1.5.

5.2 Inverse optimal value optimization with ADC constraints.

We consider a variant of the inverse optimal value optimization that is defined as follows:

minimizex[1,1]nF(x)=p=1m1|νpfp(x)|subject toνpfp(x)max{1,|νp|}ε,fp(x)νpmax{1,|νp|}ε,p=m1+1,,m.\begin{array}[]{cl}\displaystyle\operatornamewithlimits{minimize}_{x\in[-1,1]^{n}}&\;\,\displaystyle F(x)=\sum^{m_{1}}_{p=1}\left|{\nu_{p}}-f_{p}(x)\right|\\[14.45377pt] \mbox{subject to}&\;\,\displaystyle\frac{{\nu_{p}}-f_{p}(x)}{\max\{1,|{\nu_{p}}|\}}\leq\varepsilon,\quad\frac{f_{p}(x)-{\nu_{p}}}{\max\{1,|{\nu_{p}}|\}}\leq\varepsilon,\quad p=m_{1}+1,\cdots,m.\end{array} (41)

In this formulation, the observations of the optimal values {νp}p=1m{\{\nu_{p}\}^{m}_{p=1}} are divided into two groups indexed by {1,,m1}\{1,\cdots,m_{1}\} and {m1+1,,m}\{m_{1}+1,\cdots,m\}. We aim to minimize the errors for the first group while ensuring the relative errors for the second group do not exceed a specified feasibility tolerance, denoted by ε\varepsilon. In our experiment, we fix n=10n=10, m=11m=11, m1=8m_{1}=8, ε=101\varepsilon=10^{-1}, d=10d=10, and the number of inequality constraints =5\ell=5 in the minimization problem (1). The solution xx^{\ast} and the data, including {νp}p=1m{\{\nu_{p}\}^{m}_{p=1}}, are randomly generated in the same way as in section 5.1. We can see that xx^{\ast} is feasible to (41) and attains the minimal objective value F(x)=0F(x^{\ast})=0.

Similar to the first example, we adopt the ADC decomposition in (6), denoted by fpk=(gp)γk(hp)γkf^{k}_{p}=(g_{p})_{\gamma_{k}}-(h_{p})_{\gamma_{k}} with a sequence {γk=1/(k+k~)ρ}\{\gamma_{k}=1/(k+\tilde{k})^{\rho}\} for some positive integer k~\tilde{k} and ρ>0\rho>0. Due to the feasibility problem in Assumption 2, we introduce the additional parameter k~\tilde{k} to control the approximating sequences, which will be explained in details later. We also note that treating νpfp(x)max{1,|νp|}ε\frac{{\nu_{p}}-f_{p}(x)}{\max\{1,|{\nu_{p}}|\}}\leq\varepsilon and fp(x)νpmax{1,|νp|}ε\frac{f_{p}(x)-{\nu_{p}}}{\max\{1,|{\nu_{p}}|\}}\leq\varepsilon as two separate constraints for p=m1+1,,mp={m_{1}+1},\cdots,m leads to the failure of the asymptotic constraint qualification in Assumption 5 because the approximate subdifferentials of the ADC functions fpf_{p} and fp-f_{p} are linearly dependent. This issue can be resolved by rewriting the constraints in a composite ADC form |νpfp(x)|max{1,|νp|}ε\frac{|{\nu_{p}}-f_{p}(x)|}{\max\{1,|{\nu_{p}}|\}}\leq\varepsilon and assuming a corresponding version of Assumption 5. We omit this technical detail since the main focus of this section is to illustrate the practical implementation of our algorithm.

To verify Assumption 2 that states the existence of a strictly feasible point, we first follow the discussion after Assumption 2 to construct the quantity αpk^=(γkγk+1)Lp2/(2max{1,|νp|})\widehat{\alpha^{k}_{p}}=(\gamma_{k}-\gamma_{k+1})L_{p}^{2}/(2\max\{1,|{\nu_{p}}|\}) where LpL_{p} is the Lipschitz constant of f¯p(,x)\overline{f}_{p}(\cdot,x) for all x[1,1]nx\in[-1,1]^{n}. We can derive the Lipschitz constant LpL_{p} by characterizing the subdifferential 1fp¯(,x)\partial_{1}\overline{f_{p}}(\cdot,x) for a fixed xx based on Danskin’s Theorem [11, Theorem 2.1] and then upper bounding the norm of this subdifferential over x[1,1]nx\in[-1,1]^{n}. The extra denominator max{1,|νp|}\max\{1,|{\nu_{p}}|\} in the expression of αpk^\widehat{\alpha^{k}_{p}} is due to the scaling of the constraints in (41). Then, consider the following problem:

minimizex[1,1]nV(x)p=m1+1mmax{0,|νpfp0(x)|(εk=0+αpk^)max{1,|νp|}sp(constant)},\displaystyle\operatornamewithlimits{minimize}_{x\in[-1,1]^{n}}\;V(x)\triangleq\sum^{m}_{p=m_{1}+1}\max\Bigg{\{}0,\,\big{|}{\nu_{p}}-f^{0}_{p}(x)\big{|}-\underbrace{\left(\varepsilon-\sum^{+\infty}_{k^{\prime}=0}\widehat{\alpha^{k^{\prime}}_{p}}\right)\max\{1,|{\nu_{p}}|\}}_{\triangleq\,s_{p}\,(\text{constant})}\Bigg{\}}, (Feas)

where the objective is the sum of the compositions of univariate convex functions φp(t)=max{0,|νpt|sp}\varphi_{p}(t)=\max\{0,|{\nu_{p}}-t|-s_{p}\} and DC functions fp0f^{0}_{p}. Notice that problem (Feas) takes the same form as (23). Thus, we can apply the inner loop of the prox-ADC method to solve it approximately. If solving this problem gives a solution x0x^{0} with V(x0)=0V(x^{0})=0, then

|νpfp0(x)|max{1,|νp|}εk=0+αpk^=ε(Lp)22k~ρmax{1,|νp|},\frac{|{\nu_{p}}-f^{0}_{p}(x)|}{\max\{1,|{\nu_{p}}|\}}\leq\varepsilon-\sum^{+\infty}_{k^{\prime}=0}\widehat{\alpha^{k^{\prime}}_{p}}=\varepsilon-\frac{(L_{p})^{2}}{2\,\tilde{k}^{\rho}\cdot\max\{1,|{\nu_{p}}|\}}, (42)

and x0x^{0} is a strictly feasible point satisfying Assumption 2. We emphasize that using the inner loop of the prox-ADC method for solving problem (Feas) to obtain a strictly feasible point is merely a heuristic. Although this approach works well in our experiments, it is generally not easy to verify Assumption 2. We make a final remark on the role of k~\tilde{k}. For small values of ρ\rho and k~\tilde{k}, it is possible that ε<(Lp)22k~ρmax{1,|νp|}\varepsilon<\frac{(L_{p})^{2}}{2\,\tilde{k}^{\rho}\cdot\max\{1,|{\nu_{p}}|\}}, and, from (42), there is no strictly feasible point satisfying Assumption 2 for this fixed approximating sequence. Henceforth, the flexibility of the parameter k~{\tilde{k}} is necessary to ensure the validity of Assumption 2.

We implement the above procedure to find an initial point and then apply the prox-ADC method with ϵk=δk=1/(k+1)ρ\epsilon_{k}=\delta_{k}=1/(k+1)^{\rho} and λ=5\lambda=5. On most of the randomly generated instances, we observe that the point given by solving (Feas) is also feasible to the original problem (41) along the iterations, although this result cannot be implied by Assumption 2. In Figure 5, we again plot the logarithm of the objective values against the total number of iterations, using various combination of k~\tilde{k} and ρ\rho and various initial points. It is worth mentioning that for this constrained problem, the random initial point is not directly utilized in the prox-ADC method. Instead, it is first used in problem (Feas) to generate a strictly feasible point satisfying Assumption 2, and the candidate solution for (Feas) then becomes the initial point of the prox-ADC method for solving (41).

Refer to caption
(a)  Varying sequences {γk}\{\gamma_{k}\}.
Refer to caption
(b)  Varying initial points.
Figure 5: Performance of the prox-ADC method for problem (41) under the termination criteria (39) with η¯=β¯=2×102\bar{\eta}=\bar{\beta}=2\times 10^{-2} and k¯=5\bar{k}=5. (a): using different sequences {γk}\{\gamma_{k}\} for a fixed initial point. (b): using five initial points uniformly distributed on [1,1]n[-1,1]^{n} for a fixed sequence {γk=1/(k+10)2.5}\{\gamma_{k}=1/(k+10)^{2.5}\}.

Acknowledgments.

The authors are partially supported by the National Science Foundation under grants CCF-2416172 and DMS-2416250, and the National Institutes of Health under grant 1R01CA287413-01. The authors are grateful to the associate editor and two reviewers for their careful reading and constructive comments that have substantially improved the paper.

References

  • [1] Carlo Acerbi. Spectral measures of risk: A coherent representation of subjective risk aversion. Journal of Banking and Finance, 26(7):1505–1518, 2002.
  • [2] Shabbir Ahmed and Yongpei Guan. The inverse optimal value problem. Mathematical Programming, 102(1):91–110, 2005.
  • [3] Amir Beck and Marc Teboulle. Smoothing and first order methods: A unified framework. SIAM Journal on Optimization, 22(2):557–580, 2012.
  • [4] James V Burke. Descent methods for composite nondifferentiable optimization problems. Mathematical Programming, 33(3):260–279, 1985.
  • [5] James V Burke and Michael C Ferris. A Gauss-Newton method for convex composite optimization. Mathematical Programming, 71(2):179–194, 1995.
  • [6] James V Burke and Tim Hoheisel. Epi-convergent smoothing with applications to convex composite functions. SIAM Journal on Optimization, 23(3):1457–1479, 2013.
  • [7] James V Burke and Tim Hoheisel. Epi-convergence properties of smoothing by infimal convolution. Set-Valued and Variational Analysis, 25(1):1–23, 2017.
  • [8] James V Burke, Tim Hoheisel, and Christian Kanzow. Gradient consistency for integral-convolution smoothing functions. Set-Valued and Variational Analysis, 21(2):359–376, 2013.
  • [9] James V Burke, Adrian S Lewis, and Michael L Overton. A robust gradient sampling algorithm for nonsmooth, nonconvex optimization. SIAM Journal on Optimization, 15(3):751–779, 2005.
  • [10] Xiaojun Chen. Smoothing methods for nonsmooth, nonconvex minimization. Mathematical Programming, 134(1):71–99, 2012.
  • [11] Frank H Clarke. Generalized gradients and applications. Transactions of the American Mathematical Society, 205:247–262, 1975.
  • [12] Frank H Clarke. Optimization and Nonsmooth Analysis. Society for Industrial and Applied Mathematics, Philadelphia, 1990.
  • [13] Ying Cui and Jong-Shi Pang. Modern Nonconvex Nondifferentiable Optimization. Society for Industrial and Applied Mathematics, Philadelphia, 2021.
  • [14] Dmitriy Drusvyatskiy and Courtney Paquette. Efficiency of minimizing compositions of convex functions and smooth maps. Mathematical Programming, 178(1–2):503–558, 2019.
  • [15] Yuri M Ermoliev, Vladimir I Norkin, and Roger JB Wets. The minimization of semicontinuous functions: Mollifier subgradients. SIAM Journal on Control and Optimization, 33(1):149–167, 1995.
  • [16] Roger Fletcher. A model algorithm for composite nondifferentiable optimization problems. Mathematical Programming Studies, 17:67–76, 1982.
  • [17] Gerald B Folland. Real Analysis: Modern Techniques and Their Applications, volume 40. John Wiley & Sons, New York, 1999.
  • [18] Allen A Goldstein. Optimization of Lipschitz continuous functions. Mathematical Programming, 13(1):14–22, 1977.
  • [19] Adrian S Lewis and Stephen J Wright. A proximal method for composite minimization. Mathematical Programming, 158(1-2):501–546, 2016.
  • [20] Hanyang Li and Ying Cui. A decomposition algorithm for two-stage stochastic programs with nonconvex recourse. SIAM Journal on Optimization, 34(1):306–335, 2024.
  • [21] Junyi Liu, Ying Cui, Jong-Shi Pang, and Suvrajeet Sen. Two-stage stochastic programming with linearly bi-parameterized quadratic recourse. SIAM Journal on Optimization, 30(3):2530–2558, 2020.
  • [22] Boris S Mordukhovich. Generalized differential calculus for nonsmooth and set-valued mappings. Journal of Mathematical Analysis and Applications, 183(1):250–288, 1994.
  • [23] Matthew Norton, Valentyn Khokhlov, and Stan Uryasev. Calculating CVaR and bPOE for common probability distributions with application to portfolio optimization and density estimation. Annals of Operations Research, 299(1):1281–1315, 2021.
  • [24] Giuseppe Paleologo and Samer Takriti. Bandwidth trading: A new market looking for help from the OR community. AIRO News, 6(3):1–4, 2001.
  • [25] RA Poliquin and R Tyrrell Rockafellar. Amenable functions in optimization. Nonsmooth Optimization: Methods and Applications (Erice, 1991), pages 338–353, 1992.
  • [26] RA Poliquin and R Tyrrell Rockafellar. A calculus of epi-derivatives applicable to optimization. Canadian Journal of Mathematics, 45(4):879–896, 1993.
  • [27] R Tyrrell Rockafellar. Convex Analysis, volume 18. Princeton University Press, Princeton, NJ, 1970.
  • [28] R Tyrrell Rockafellar. Lagrange multipliers and optimality. SIAM Review, 35(2):183–238, 1993.
  • [29] R Tyrrell Rockafellar and Stanislav Uryasev. Optimization of conditional value-at-risk. Journal of Risk, 2(3):21–42, 2000.
  • [30] R Tyrrell Rockafellar and Roger JB Wets. Variational Analysis, volume 317. Springer Science & Business Media, New York, 2009.
  • [31] Johannes O Royset. Approximations of semicontinuous functions with applications to stochastic optimization and statistical estimation. Mathematical Programming, 184(1):289–318, 2020.
  • [32] Johannes O Royset. Consistent approximations in composite optimization. Mathematical Programming, 201(1–2):339–372, 2022.
  • [33] Alexander Shapiro, Darinka Dentcheva, and Andrzej Ruszczyński. Lectures on Stochastic Programming: Modeling and Theory (Third Edition). Society for Industrial and Applied Mathematics, Philadelphia, 2021.
  • [34] Wim van Ackooij. A discussion of probability functions and constraints from a variational perspective. Set-Valued and Variational Analysis, 28(4):585–609, 2020.
  • [35] Yao Yao, Qihang Lin, and Tianbao Yang. Large-scale optimization of partial AUC in a range of false positive rates. Advances in Neural Information Processing Systems, 35:31239–31253, 2022.

Appendix A. Proofs of Proposition 2 and Proposition 3.

Proof of Proposition 2..

(a) We first generalize the convergence result of the classical Moreau envelopes when γk0\gamma_{k}\downarrow 0 (see, e.g., [30, Theorem 1.25]) to the partial Moreau envelopes. Fixing any γ0>0\gamma_{0}>0, we consider the function ψ(z,x,γ)f¯(z,x)+δdomf(x)+ψ0(z,x,γ)\psi(z,x,\gamma)\triangleq\overline{f}(z,x)+\delta_{\operatorname{dom}f}(x)+\psi_{0}(z,x,\gamma) with

ψ0(z,x,γ){zx2/(2γ) if γ(0,γ0],0 if γ=0,z=x, otherwise.\psi_{0}(z,x,\gamma)\triangleq\left\{\begin{array}[]{cl}\|z-x\|^{2}/(2\gamma)&\text{ if }\gamma\in(0,\gamma_{0}],\\ 0&\text{ if }\gamma=0,z=x,\\ \infty&\text{ otherwise.}\end{array}\right.

Notice that fk(x)=gγk(x)hγk(x)+δdomf(x)=infzψ(z,x,γk)f^{k}(x)=g_{\gamma_{k}}(x)-h_{\gamma_{k}}(x)+\delta_{\operatorname{dom}f}(x)=\inf_{z}\psi(z,x,\gamma_{k}). It is easy to verify that ψ\psi is proper and lsc based on our assumptions. Under the assumption that f¯\overline{f} is bounded from below on domf×domf\operatorname{dom}f\times\operatorname{dom}f, we can also show by contradiction that ψ(z,x,γ)\psi(z,x,\gamma) is level-bounded in zz locally uniformly in (x,γ)(x,\gamma). Consequently, it follows from [30, Theorem 1.17] that fk(x)=infzψ(z,x,γk)f(x)f^{k}(x)=\inf_{z}\psi(z,x,\gamma_{k})\uparrow f(x) for any fixed xx and each fkf^{k} is lsc.

Hence, fkeff^{k}\overset{\text{e}}{\rightarrow}f is a direct consequence of [30, Proposition 7.4(d)] by fk(x)f(x)f^{k}(x)\uparrow f(x) for all xx and the lower semicontinuity of fkf^{k}. If domf=n\operatorname{dom}f={\mathbb{R}^{n}}, then ff is continuous, and thus fkcff^{k}\overset{\text{c}}{\rightarrow}f by [30, Proposition 7.4(c-d)]. We then complete the proof of (a).

(b) For any x¯int(domf)\bar{x}\in\operatorname{int}(\operatorname{dom}f),

Af(x¯)=xkx¯Limsupk+{gγk(xk)hγk(xk)}=(i)xkx¯Limsupk+{xkγk2(f¯)(zk,xk)zkγk|zk=argminzn[f¯(z,xk)+12γkzxk2]}(ii)(xk,zk)(x¯,x¯)Limsupk+[1f¯(zk,xk)2(f¯)(zk,xk)]=(iii)1f¯(x¯,x¯)2(f¯)(x¯,x¯),\begin{array}[]{rl}\partial_{A}f(\bar{x})=&\bigcup\limits_{x^{k}\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\left\{\partial{g_{\gamma_{k}}}(x^{k})-\partial{h_{\gamma_{k}}}(x^{k})\right\}\\ \overset{(\rm i^{\prime})}{=}&\bigcup\limits_{x^{k}\rightarrow\bar{x}}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\left\{\frac{x^{k}}{\gamma_{k}}-\partial_{2}(-\overline{f})(z^{k},x^{k})-\frac{z^{k}}{\gamma_{k}}\;\middle|\;z^{k}=\operatorname*{arg\,min}_{z\in{\mathbb{R}^{n}}}\left[\,\overline{f}(z,x^{k})+\frac{1}{2\gamma_{k}}\|z-x^{k}\|^{2}\right]\right\}\\ \,\overset{(\rm ii^{\prime})}{\subset}&\bigcup\limits_{(x^{k},z^{k})\rightarrow(\bar{x},\bar{x})}\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\left[\partial_{1}\overline{f}(z^{k},x^{k})-\partial_{2}(-\overline{f})(z^{k},x^{k})\right]\\ \overset{(\rm iii^{\prime})}{=}&\partial_{1}\overline{f}(\bar{x},\bar{x})-\partial_{2}(-\overline{f})(\bar{x},\bar{x}),\end{array}

where (i)(\rm i^{\prime}) follows from the convexity of (f¯)(z,)(-\overline{f})(z,\cdot) for any zdomfz\in\operatorname{dom}f and Danskin’s Theorem [11, Theorem 2.1]; (ii)(\rm ii^{\prime}) is due to the optimality condition for zkz^{k}, and zkx¯z^{k}\rightarrow\bar{x} is obtained by similar arguments in the proof of Theorem 1(b) due to our assumption that f¯\overline{f} is bounded from below on domf×domf\operatorname{dom}f\times\operatorname{dom}f; and (iii)(\rm iii^{\prime}) uses the outer semicontinuity of 1f¯\partial_{1}\overline{f} and 2(f¯)\partial_{2}(-\overline{f}) at (x¯,x¯)(\bar{x},\bar{x}) [20, Lemma 5]. Therefore, for any x¯int(domf)\bar{x}\in\operatorname{int}(\operatorname{dom}f), f(x¯)Af(x¯)1f¯(x¯,x¯)2(f¯)(x¯,x¯)\partial f(\bar{x})\subset\partial_{A}f(\bar{x})\subset\partial_{1}\overline{f}(\bar{x},\bar{x})-\partial_{2}(-\overline{f})(\bar{x},\bar{x}). Moreover, due to the local boundedness of the mappings 1f¯\partial_{1}\overline{f} and 2(f¯)\partial_{2}(-\overline{f}) at (x¯,x¯)(\bar{x},\bar{x}) [20, Lemma 5], it follows from [30, Example 4.22] that Af(x¯)={0}\partial^{\infty}_{A}f(\bar{x})=\{0\}. ∎

Proof of Proposition 3..

(a) Note that for any xnx\in\mathbb{R}^{n}, CVaRα[c(x,Z)]{\mbox{CVaR}_{\alpha}}[\,c(x,Z)\,] is well-defined and takes finite value due to 𝔼[|c(x,Z)|]<+\mathbb{E}[\,|c(x,Z)|\,]<+\infty. Since c(x,Z)c(x,Z) follows a continuous distribution for any xnx\in\mathbb{R}^{n}, we know from [29, Theorem 1] and [1] that CVaR has the following equivalent representations:

CVaRα[c(x,Z)]=inft{t+11α𝔼[max{c(x,Z)t,0}]}=11αα1VaRt[c(x,Z)]dt.{\mbox{CVaR}_{\alpha}}[\,c(x,Z)\,]=\inf_{t\in\mathbb{R}}\left\{t+\frac{1}{1-\alpha}\,\mathbb{E}\left[\,\max\{c(x,Z)-t,0\}\right]\right\}=\frac{1}{1-\alpha}\int_{\alpha}^{1}\mbox{VaR}_{t}[\,c(x,Z)\,]\,\mbox{d}t.

Moreover, CVaRα[c(,Z)]{\mbox{CVaR}_{\alpha}}[\,c(\cdot,Z)\,] is convex by the convexity of c(,z)c(\cdot,z) for any fixed zmz\in\mathbb{R}^{m} (cf. [29, Theorem 2]). Therefore, both gkg^{k} and hkh^{k} defined in (7) are convex. By the definitions of gkg^{k} and hkh^{k}, we have

gk(x)hk(x)=[k(1α)+1]CVaRα1/k[c(x,Z)]k(1α)CVaRα[c(x,Z)]=k(1α)+11(α1/k)α1/k1VaRt[c(x,Z)]dtk(1α)1αα1VaRt[c(x,Z)]dt=kα1/k1VaRt[c(x,Z)]dtkα1VaRt[c(x,Z)]dt=kα1/kαVaRt[c(x,Z)]dt.\begin{array}[]{rl}g^{k}(x)-h^{k}(x)&=[k(1-\alpha)+1]\,\mbox{CVaR}_{\alpha-1/k}[c(x,Z)]-k(1-\alpha)\,\mbox{CVaR}_{\alpha}[c(x,Z)]\\[5.78172pt] &=\displaystyle\frac{k(1-\alpha)+1}{1-(\alpha-1/k)}\int^{1}_{\alpha-1/k}\operatorname{VaR}_{t}[c(x,Z)]\,\mbox{d}t-\frac{k(1-\alpha)}{1-\alpha}\int^{1}_{\alpha}\operatorname{VaR}_{t}[c(x,Z)]\,\mbox{d}t\\[10.84006pt] &=k\displaystyle\int^{1}_{\alpha-1/k}\operatorname{VaR}_{t}[c(x,Z)]\,\mbox{d}t-k\int^{1}_{\alpha}\operatorname{VaR}_{t}[c(x,Z)]\,\mbox{d}t\\[10.84006pt] &=k\displaystyle\int^{\alpha}_{\alpha-1/k}\operatorname{VaR}_{t}[c(x,Z)]\,\mbox{d}t.\end{array}

Note that VaRt[c(x,Z)]\operatorname{VaR}_{t}[c(x,Z)] is nondecreasing as a function of tt for any fixed xnx\in\mathbb{R}^{n}. Namely,

α1/kαVaRα1/k[c(x,Z)]dtα1/kαVaRt[c(x,Z)]dtα1/kαVaRα[c(x,Z)]dt.\displaystyle\int^{\alpha}_{\alpha-1/k}\operatorname{VaR}_{\alpha-1/k}[c(x,Z)]\,\mbox{d}t\leq\displaystyle\int^{\alpha}_{\alpha-1/k}\operatorname{VaR}_{t}[c(x,Z)]\,\mbox{d}t\leq\displaystyle\int^{\alpha}_{\alpha-1/k}\operatorname{VaR}_{\alpha}[c(x,Z)]\,\mbox{d}t.

Thus, VaRα1/k[c(x,Z)]gk(x)hk(x)VaRα[c(x,Z)]\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)\,]\leq g^{k}(x)-h^{k}(x)\leq\mbox{VaR}_{\alpha}[\,c(x,Z)\,] for any xnx\in\mathbb{R}^{n} and k>1/αk>1/\alpha. Since VaRt[c(x,Z)]\mbox{VaR}_{t}[\,c(x,Z)\,] as a function of tt on (0,1)(0,1) is left-continuous, it follows that [gk(x)hk(x)]VaRα[c(x,Z)][g^{k}(x)-h^{k}(x)]\uparrow\mbox{VaR}_{\alpha}[\,c(x,Z)\,] for all xx. Observe that

{xVaRα[c(x,Z)]r}={x(c(x,Z)r)α}.\{x\mid\mbox{VaR}_{\alpha}[\,c(x,Z)\,]\leq r\}=\{x\mid\mathbb{P}(c(x,Z)\leq r)\geq\alpha\}.

Based on our assumptions and [34, Proposition 2.2], for any rr\in\mathbb{R}, the probability function x(c(x,Z)r)x\mapsto-\mathbb{P}(c(x,Z)\leq r) is lsc, which implies the closedness of the level set {x(c(x,Z)r)α}\{x\mid\mathbb{P}(c(x,Z)\leq r)\geq\alpha\} for any (r,α)×(0,1)(r,\alpha)\in\mathbb{R}\times(0,1). Hence, VaRα[c(,Z)]\mbox{VaR}_{\alpha}[\,{c(\cdot,Z)}] is lsc for any given α(0,1)\alpha\in(0,1) and is continuous if c(,)c(\cdot,\cdot) is further assumed to be continuous. Then (a) is a direct consequence of [30, Proposition 7.4(c-d)] by the monotonicity [gk(x)hk(x)]VaRα[c(x,Z)][g^{k}(x)-h^{k}(x)]\uparrow\operatorname{VaR}_{\alpha}[\,c(x,Z)] and the continuity of VaRα[c(,Z)]\mbox{VaR}_{\alpha}[\,c(\cdot,Z)].

(b) We use 1(Ω,,)\mathcal{L}_{1}(\Omega,\mathcal{F},\mathbb{P}) to denote the space of all random variables X:Ω{X}:\Omega\to\mathbb{R} with 𝔼[|X(ω)|]<+\mathbb{E}[\,|{X}(\omega)|]<+\infty. According to [33, Example 6.19], the function CVaRα:1(Ω,,){\mbox{CVaR}_{\alpha}}:\mathcal{L}_{1}(\Omega,\mathcal{F},\mathbb{P})\to\mathbb{R} is subdifferentiable (see [33, (9.281)] for the definition). Consider any fixed xnx\in\mathbb{R}^{n}. Given that c(x,Z)c(x,Z) is a continuous random variable in 1(Ω,,)\mathcal{L}_{1}(\Omega,\mathcal{F},\mathbb{P}), it follows from [33, (6.81)] that the subdifferential of CVaRα[]{\mbox{CVaR}_{\alpha}}[\cdot] at c(x,Z)c(x,Z) is:

(CVaRα[])[c(x,Z)]={ϕ(Ω,,)|ϕ(ω)=(1α)1if c(x,Z(ω))>VaRα[c(x,Z)]ϕ(ω)[0,(1α)1]if c(x,Z(ω))=VaRα[c(x,Z)]ϕ(ω)=0if c(x,Z(ω))<VaRα[c(x,Z)]}.{\partial\left(\mbox{CVaR}_{\alpha}[\,\cdot\,]\right)[\,c(x,Z)]}=\left\{\phi\in\mathcal{L}_{\infty}(\Omega,\mathcal{F},\mathbb{P})\,\middle|\begin{array}[]{ll}\phi(\omega)=(1-\alpha)^{-1}&\text{if }c(x,Z(\omega))>\mbox{VaR}_{\alpha}[\,c(x,Z)]\\ {\phi(\omega)\in[0,(1-\alpha)^{-1}]}&{\text{if }c(x,Z(\omega))=\mbox{VaR}_{\alpha}[\,c(x,Z)]}\\ \phi(\omega)=0&\text{if }c(x,Z(\omega))<\mbox{VaR}_{\alpha}[\,c(x,Z)]\end{array}\right\}. (43)

We would like to mention that the event {ωΩc(x,Z(ω))=VaRα[c(x,Z)]}\{\omega\in\Omega\mid c(x,Z(\omega))=\mbox{VaR}_{\alpha}[\,c(x,Z)]\} has zero probability and, thus, 𝔼[ϕ]=(1α)1(1α)=1\mathbb{E}[\,\phi\,]=(1-\alpha)^{-1}\cdot(1-\alpha)=1 for every random variable ϕ(CVaRα[])[c(x,Z)]\phi\in\partial\left(\mbox{CVaR}_{\alpha}[\,\cdot\,]\right)\,[\,c(x,Z)]. Let Z\mathbb{P}_{Z} denote the probability measure associated with ZZ. By using [33, Theorem 6.14], we obtain the subdifferential of the convex function CVaRα[c(,Z)]{\mbox{CVaR}_{\alpha}}[\,c(\cdot,Z)\,] at xx:

(CVaRα[c(,Z)])(x)=cl(ϕ(CVaRα[])[c(x,Z)]1c(x,Z(ω))ϕ(ω)dZ(ω))=(iv)cl(1c(x,Z(ω))ϕ¯(ω)dZ(ω))ϕ¯(CVaRα[])[c(x,Z)]=(v)1c(x,Z(ω))ϕ¯(ω)dZ(ω)ϕ¯(CVaRα[])[c(x,Z)].\begin{split}\begin{array}[]{rl}\partial\left({\mbox{CVaR}_{\alpha}}[\,c(\cdot,Z)\,]\right)(x)=&\operatorname{cl}\left(\displaystyle\bigcup_{\phi\in\partial({\operatorname{CVaR}_{\alpha}}[\,\cdot\,])\,[\,c(x,Z)]}\int\partial_{1}\,c(x,Z(\omega))\,\phi(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)\right)\\[19.5132pt] \overset{(\rm iv^{\prime})}{=}&{\operatorname{cl}\left(\displaystyle\int\partial_{1}\,c(x,Z(\omega))\,\bar{\phi}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)\right)\quad\forall\,\bar{\phi}\in\partial(\mbox{CVaR}_{\alpha}[\,\cdot\,])\,[\,c(x,Z)]}\\[12.28577pt] \overset{(\rm v^{\prime})}{=}&{\displaystyle\int\partial_{1}\,c(x,Z(\omega))\,\bar{\phi}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)\qquad\quad\;\;\forall\,\bar{\phi}\in\partial(\mbox{CVaR}_{\alpha}[\,\cdot\,])\,[\,c(x,Z)].}\end{array}\end{split} (44)

To see (iv)(\rm iv^{\prime}), it suffices to show that, for arbitrary two elements ϕ1\phi_{1} and ϕ2\phi_{2} in the set (CVaRα[])[c(x,Z)]\partial(\mbox{CVaR}_{\alpha}[\,\cdot\,])\,[\,c(x,Z)], we have

1c(x,Z(ω))ϕ1(ω)dZ(ω)=1c(x,Z(ω))ϕ2(ω)dZ(ω).\displaystyle\int\partial_{1}\,c(x,Z(\omega))\,\phi_{1}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)=\int\partial_{1}\,c(x,Z(\omega))\,\phi_{2}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega). (45)

To this end, we take any measurable selection a(x,Z(ω))1c(x,Z(ω))a(x,Z(\omega))\in\partial_{1}c(x,Z(\omega)). By the assumption that |c(x,z)c(x,z)|κ(z)xx|c(x,z)-c(x^{\prime},z)|\leq\kappa(z)\|x-x^{\prime}\| for all x,xnx,x^{\prime}\in\mathbb{R}^{n} and zmz\in\mathbb{R}^{m}, it holds that a(x,Z)κ(Z)\|a(x,Z)\|\leq\kappa(Z) since subgradients of a convex function are uniformly bounded in norm by the Lipschitz constant. Consequently, both a(x,Z(ω))ϕ1(ω)a(x,Z(\omega))\,\phi_{1}(\omega) and a(x,Z(ω))ϕ2(ω)a(x,Z(\omega))\,\phi_{2}(\omega) are integrable as |ϕ1(ω)|(1α)1|\phi_{1}(\omega)|\leq(1-\alpha)^{-1} and |ϕ2(ω)|(1α)1|\phi_{2}(\omega)|\leq(1-\alpha)^{-1} for any ω\omega by (43) and 𝔼[a(x,Z)]𝔼[κ(Z)]<+\mathbb{E}[\|a(x,Z)\|]\leq\mathbb{E}[\kappa(Z)]<+\infty by our assumption. Observing that a(x,Z(ω))ϕ1(ω)=a(x,Z(ω))ϕ2(ω)a(x,Z(\omega))\,\phi_{1}(\omega)=a(x,Z(\omega))\,\phi_{2}(\omega) almost surely, we can conclude from [17, Proposition 2.23] that a(x,Z(ω))ϕ1(ω)dZ(ω)=a(x,Z(ω))ϕ2(ω)dZ(ω)\int a(x,Z(\omega))\,\phi_{1}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)=\int a(x,Z(\omega))\,\phi_{2}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega). This completes the proof of (45).

Next, we will explain why the closure can be removed in (44). By the convexity of c(,z)c(\cdot,z) for any fixed zmz\in\mathbb{R}^{m} and the existence of a measurable function κ\kappa, it follows from [12, Theorem 2.7.2] that

1c(x,Z(ω))ϕ¯(ω)dZ(ω)=(c(,Z(ω))ϕ¯(ω)dZ(ω))(x),\int\partial_{1}c(x,Z(\omega))\,\bar{\phi}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)=\partial{\left(\int c(\cdot,Z(\omega))\,\bar{\phi}(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega)\right)(x)},

where the right-hand-side is the subdifferential of a convex function and, thus, is a closed set. Then, we can omit the closure to obtain the equation (v)(\rm v^{\prime}) in (44).

Now we use the expression of (CVaRα[c(,Z)])(x)\partial\left(\mbox{CVaR}_{\alpha}[\,c(\cdot,Z)\,]\right)(x) to characterize AVaRα[c(,Z)](x¯)\partial_{A}\mbox{VaR}_{\alpha}[\,c(\cdot,Z)](\bar{x}). For any k>1/αk>1/\alpha, taking any ϕ3(CVaRα1/k[])[c(x,Z)]\phi_{3}\in\partial(\operatorname{CVaR}_{\alpha-1/k}[\,\cdot\,])\,[\,c(x,Z)] and ϕ4(CVaRα[])[c(x,Z)]\phi_{4}\in\partial(\operatorname{CVaR}_{\alpha}[\,\cdot\,])\,[\,c(x,Z)], we have

gk(x)hk(x)=[k(1α)+1]CVaRα1/k[c(,Z)](x)k(1α)CVaRα[c(,Z)](x)=(44)1c(x,Z(ω))([k(1α)+1]ϕ3(ω)k(1α)ϕ4(ω))dZ(ω)=(43)1c(x,Z(ω))ϕ(ω)dZ(ω),\begin{array}[]{rl}\partial g^{k}(x)-\partial h^{k}(x)=&[k(1-\alpha)+1]\,\partial\operatorname{CVaR}_{\alpha-1/k}[\,c(\cdot,Z)](x)-k(1-\alpha)\,\partial\operatorname{CVaR}_{\alpha}[\,c(\cdot,Z)](x)\\[3.61371pt] \overset{\eqref{eq:CVaR_comp_subdiff}}{=}&\displaystyle\int\partial_{1}c(x,Z(\omega))\cdot\big{(}[k(1-\alpha)+1]\,\phi_{3}(\omega)-k(1-\alpha)\,\phi_{4}(\omega)\big{)}\,\mbox{d}\mathbb{P}_{Z}(\omega)\\ \overset{\eqref{eq:CVaR_subdiff}}{=}&\displaystyle\int\partial_{1}c(x,Z(\omega))\cdot\phi(\omega)\,\mbox{d}\mathbb{P}_{Z}(\omega),\end{array}

with

ϕ(ω){0 if c(x,Z(ω))>VaRα[c(x,Z)] or c(x,Z(ω))<VaRα1/k[c(x,Z)][0,k] if c(x,Z(ω))=VaRα[c(x,Z)] or c(x,Z(ω))=VaRα1/k[c(x,Z)]k if VaRα1/k[c(x,Z)]<c(x,Z(ω))<VaRα[c(x,Z)].\phi(\omega)\triangleq\left\{\begin{array}[]{cl}{0}&{\text{ if }c(x,Z(\omega))>\mbox{VaR}_{\alpha}[\,c(x,Z)]}{\text{ or }c(x,Z(\omega))<\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]}\\ {[0,k]}&{\text{ if }c(x,Z(\omega))=\mbox{VaR}_{\alpha}[\,c(x,Z)]}\text{ or }c(x,Z(\omega))=\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]\\ k&\text{ if }\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]<c(x,Z(\omega))<\mbox{VaR}_{\alpha}[\,c(x,Z)]\end{array}\right..

Since the event {ωΩc(x,Z(ω))=VaRα[c(x,Z)] or VaRα1/k[c(x,Z)]}\{\omega\in\Omega\mid c(x,Z(\omega))=\mbox{VaR}_{\alpha}[\,c(x,Z)]\text{ or }\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]\} has zero probability, we have

gk(x)hk(x)=1c(x,Z(ω))k 1{VaRα1/k[c(x,Z)]<c(x,Z)<VaRα[c(x,Z)]}dZ(ω)=1c(x,Z(ω))𝟏{VaRα1/k[c(x,Z)]<c(x,Z)<VaRα[c(x,Z)]}(VaRα1/k[c(x,Z)]<c(x,Z)<VaRα[c(x,Z)])dZ(ω)=𝔼[1c(x,Z)|VaRα1/k[c(x,Z)]<c(x,Z)<VaRα[c(x,Z)]].\begin{array}[]{rl}\partial g^{k}(x)-\partial h^{k}(x)=&\displaystyle\int\partial_{1}c(x,Z(\omega))\,k\,\mathbf{1}\{\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]<c(x,Z)<\mbox{VaR}_{\alpha}[\,c(x,Z)]\}\,\mbox{d}\mathbb{P}_{Z}(\omega)\\[2.168pt] =&\displaystyle\int\partial_{1}c(x,Z(\omega))\cdot\frac{\mathbf{1}\{\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]<c(x,Z)<\mbox{VaR}_{\alpha}[\,c(x,Z)]\}}{\mathbb{P}(\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]<c(x,Z)<\mbox{VaR}_{\alpha}[\,c(x,Z)])}\,\mbox{d}\mathbb{P}_{Z}(\omega)\\[10.84006pt] =&\mathbb{E}\left[\,\partial_{1}c(x,Z)\,\middle|\,\mbox{VaR}_{\alpha-1/k}[\,c(x,Z)]<c(x,Z)<\mbox{VaR}_{\alpha}[\,c(x,Z)]\,\right].\end{array}

By the definition of the approximate subdifferential, the proof is then completed. ∎

Appendix B. Proof of Proposition 4.

We start with the chain rules for (φf)\partial(\varphi\circ f) and (φf)\partial^{\infty}(\varphi\circ f), where the inner function ff is merely lsc. These results are extensions of the nonlinear rescaling [30, Proposition 10.19(b)] to the case where φ\varphi may lack the strictly increasing property at a given point. One can also derive the same results through a general chain rule of the coderivative for composite set-valued mappings [22, Theorem 5.1]. However, to avoid the complicated computations accompanied by the introduction of the coderivative, we give an alternative proof below that is more straightforward. To prepare for the chain rules, we need a technical lemma about the proximal normal cone.

Lemma 3.

Let f:nf:\mathbb{R}^{n}\to\mathbb{R} be a lsc function. For α¯>f(x¯)\bar{\alpha}>f(\bar{x}), it holds that

𝒩epifp(x¯,α¯)𝒩epifp(x¯,f(x¯)).\mathcal{N}^{p}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\subset\mathcal{N}^{p}_{\operatorname{epi}f}(\bar{x},f(\bar{x})).
Proof of Lemma 3..

For α¯>f(x¯)\bar{\alpha}>f(\bar{x}). By [30, Example 6.16], we have

𝒩epifp(x¯,α¯)={λ[(x,α)(x¯,α¯)](x,α)n+1 such that (x¯,α¯)Πepif(x,α),λ0},\mathcal{N}^{p}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})=\left\{\lambda[(x,\alpha)-(\bar{x},\bar{\alpha})]\mid(x,\alpha)\in\mathbb{R}^{n+1}\text{ such that }(\bar{x},\bar{\alpha})\in\Pi_{\operatorname{epi}f}(x,\alpha),\lambda\geq 0\right\},

where Πepif:n+1epif\Pi_{\operatorname{epi}f}:\mathbb{R}^{n+1}\to\operatorname{epi}f is the projection operator. For any (x,α)n+1(x,\alpha)\in\mathbb{R}^{n+1} with (x¯,α¯)Πepif(x,α)(\bar{x},\bar{\alpha})\in\Pi_{\operatorname{epi}f}(x,\alpha), we have

(x¯,α¯)argmin(u,t)epif(u,t)(x,α)2,(\bar{x},\bar{\alpha})\in\displaystyle\operatornamewithlimits{argmin}_{(u,t)\in\operatorname{epi}f}\|(u,t)-(x,\alpha)\|^{2},

which can be equivalently written as

(x¯,f(x¯))argmin(u,t+α¯f(x¯))epif(u,t)(x,αα¯+f(x¯))2.(\bar{x},f(\bar{x}))\in\displaystyle\operatornamewithlimits{argmin}_{(u,t+\bar{\alpha}-f(\bar{x}))\in\operatorname{epi}f}\|(u,t)-(x,\alpha-\bar{\alpha}+f(\bar{x}))\|^{2}.

Then restrict the feasible region of the above problem to a subset {(u,t)(u,t)epif}\{(u,t)\mid(u,t)\in\operatorname{epi}f\}. Since (x¯,f(x¯))(\bar{x},f(\bar{x})) is still a feasible point in this subset, we have (x¯,f(x¯))Πepif(x,αα¯+f(x¯))(\bar{x},f(\bar{x}))\in\Pi_{\operatorname{epi}f}(x,\alpha-\bar{\alpha}+f(\bar{x})). Hence, (x¯,α¯)Πepif(x,α)(\bar{x},\bar{\alpha})\in\Pi_{\operatorname{epi}f}(x,\alpha) implies (x¯,f(x¯))Πepif(x,αα¯+f(x¯))(\bar{x},f(\bar{x}))\in\Pi_{\operatorname{epi}f}(x,\alpha-\bar{\alpha}+f(\bar{x})). Using this result and the expression of 𝒩epifp(x¯,α¯)\mathcal{N}^{p}_{\operatorname{epi}f}(\bar{x},\bar{\alpha}), we conclude that 𝒩epifp(x¯,α¯)𝒩epifp(x¯,f(x¯))\mathcal{N}^{p}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\subset\mathcal{N}^{p}_{\operatorname{epi}f}(\bar{x},f(\bar{x})) for α¯>f(x¯)\bar{\alpha}>f(\bar{x}). ∎

We present the chain rules with a self-contained proof in the following lemma.

Lemma 4 (chain rules for the limiting subdifferential).

Let φ:¯\varphi:\mathbb{R}\rightarrow\overline{\mathbb{R}} be proper, lsc, convex, and nondecreasing with supφ=+\sup\varphi=+\infty, and f:nf:\mathbb{R}^{n}\rightarrow\mathbb{R} be lsc. Consider x¯dom(φf)\bar{x}\in\operatorname{dom}(\varphi\circ f). If the only scalar yLimsupx(φf)x¯𝒩domφ(f(x))y\in\operatorname{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\bar{x}}\,\mathcal{N}_{\operatorname{dom}\varphi}(f(x)) with 0yLimsupxx¯f(x)0\in y\cdot\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x) is y=0y=0, then

(φf)(x¯){yLimsupxx¯f(x)|yLimsupx(φf)x¯φ(f(x))}[(Limsupxx¯f(x))\{0}],(φf)(x¯){yLimsupxx¯f(x)|yLimsupx(φf)x¯𝒩domφ(f(x))}[(Limsupxx¯f(x))\{0}].\begin{array}[]{ll}\quad\partial(\varphi\circ f)(\bar{x})\subset\bigcup\left\{\,y\cdot\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x)\,\middle|\,y\in\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\,\bar{x}}\partial\varphi(f(x))\,\right\}\cup\left[\,{\Big{(}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x){\Big{)}}\backslash\{0\}\,\right],\\[14.45377pt] \,\partial^{\infty}(\varphi\circ f)(\bar{x})\subset\bigcup\left\{\,y\cdot\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x)\,\middle|\,y\in\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\,\bar{x}}\mathcal{N}_{\operatorname{dom}\varphi}(f(x))\,\right\}\cup\left[\,{\Big{(}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x){\Big{)}}\backslash\{0\}\,\right].\end{array}
Proof of Lemma 4..

The basic idea is to rewrite φf\varphi\circ f as a parametric minimization problem and apply [30, Theorem 10.13]. Note that φ(f(x))=infα[g(x,α)δepif(x,α)+φ(α)]\varphi\big{(}f(x)\big{)}=\inf_{\alpha}\;[g(x,\alpha)\triangleq\delta_{\operatorname{epi}f}(x,\alpha)+\varphi(\alpha)] for xdom(φf)x\in\operatorname{dom}(\varphi\circ f). Define the corresponding set of optimal solutions as M(x)M(x) for any xdom(φf)x\in\operatorname{dom}(\varphi\circ f). Then, we have f(x¯)M(x¯)f(\bar{x})\in{M(\bar{x})} and φ(α)=φ(f(x¯))\varphi(\alpha)=\varphi(f(\bar{x})) for any αM(x¯)\alpha\in{M(\bar{x})}. By our assumptions, it is obvious that domφ{(,b),(,b]}\operatorname{dom}\varphi\in\{(-\infty,b),(-\infty,b]\} for some b{+}b\in\mathbb{R}\cup\{+\infty\}. Based on our assumption that supφ=+\sup\varphi=+\infty and ff is lsc, it is easy to verify that gg is proper, lsc, and level-bounded in α\alpha locally uniformly in xx. Then we apply [30, Theorem 10.13] to obtain

(φf)(x¯){v(v,0)g(x¯,α¯),α¯M(x¯)},(φf)(x¯){v(v,0)g(x¯,α¯),α¯M(x¯)}.\partial(\varphi\circ f)(\bar{x})\subset\left\{v\mid(v,0)\in\partial g(\bar{x},\bar{\alpha}),\bar{\alpha}\in M(\bar{x})\right\},\;\partial^{\infty}(\varphi\circ f)(\bar{x})\subset\left\{v\mid(v,0)\in\partial^{\infty}g(\bar{x},\bar{\alpha}),\bar{\alpha}\in M(\bar{x})\right\}. (46)

Step 1: We will show that for any α¯M(x¯)\bar{\alpha}\in{M(\bar{x})},

𝒩epif(x¯,α¯)({0}×[𝒩domφ(α¯)])={0}.\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\cap\big{(}\,\{0\}\times[-\mathcal{N}_{\operatorname{dom}\varphi}(\bar{\alpha})]\,\big{)}=\{0\}. (47)

We divide the proof of (47) into two cases.

Case 1. If M(x¯){M(\bar{x})} is a singleton {f(x¯)}\{f(\bar{x})\}, we can characterize 𝒩epif(x¯,f(x¯))\mathcal{N}_{\operatorname{epi}f}(\bar{x},f(\bar{x})) by using the result in [30, Theorem 8.9]. Since f(x¯)Limsupxx¯f(x)\partial f(\bar{x})\subset\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x) and 𝒩domφ(f(x¯))Limsupx(φf)x¯𝒩domφ(f(x))\mathcal{N}_{\operatorname{dom}\varphi}(f(\bar{x}))\subset\operatorname{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\,\bar{x}}\,\mathcal{N}_{\operatorname{dom}\varphi}(f(x)), it follows from our assumption that either 0f(x¯)0\notin\partial f(\bar{x}) or 𝒩domφ(f(x¯))={0}\mathcal{N}_{\operatorname{dom}\varphi}(f(\bar{x}))=\{0\}. Hence, based on the characterization of 𝒩epif(x¯,f(x¯))\mathcal{N}_{\operatorname{epi}f}(\bar{x},f(\bar{x})), (47) is satisfied.

Case 2. Otherwise, there exists α¯max(f(x¯),+)\bar{\alpha}_{\max}\in(f(\bar{x}),+\infty) such that M(x¯)=[f(x¯),α¯max]{M(\bar{x})}=[f(\bar{x}),\bar{\alpha}_{\max}] since φ\varphi is lsc, nondecreasing and supφ=+\sup\varphi=+\infty. Thus, from (46),

(φf)(x¯)[{v(v,0)g(x¯,f(x¯))}{v(v,0)g(x¯,α¯),f(x¯)<α¯α¯max}],(φf)(x¯)[{v(v,0)g(x¯,f(x¯))}{v(v,0)g(x¯,α¯),f(x¯)<α¯α¯max}].\begin{split}\partial(\varphi\circ f)(\bar{x})&\subset\big{[}\{v\mid(v,0)\in\partial g(\bar{x},f(\bar{x}))\}\cup\{v\mid(v,0)\in\partial g(\bar{x},\bar{\alpha}),f(\bar{x})<\bar{\alpha}\leq\bar{\alpha}_{\max}\}\big{]},\\[5.78172pt] \partial^{\infty}(\varphi\circ f)(\bar{x})&\subset\big{[}\{v\mid(v,0)\in\partial^{\infty}g(\bar{x},f(\bar{x}))\}\cup\{v\mid(v,0)\in\partial^{\infty}g(\bar{x},\bar{\alpha}),f(\bar{x})<\bar{\alpha}\leq\bar{\alpha}_{\max}\}\big{]}.\end{split} (48)

Let M1(x¯){α¯(f(x¯),α¯max]|xkx¯ with f(xk)α¯}{M_{1}(\bar{x})}\triangleq\left\{\bar{\alpha}\in(f(\bar{x}),\bar{\alpha}_{\max}]\,\middle|\,\exists\,x^{k}\rightarrow\bar{x}\text{ with }f(x^{k})\rightarrow\bar{\alpha}\right\} and M2(x¯)M(x¯)\M1(x¯){M_{2}(\bar{x})\triangleq M(\bar{x})\backslash M_{1}(\bar{x})}. In the following, we characterize 𝒩epif(x¯,α¯)\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha}) and verify (47) separately for α¯M1(x¯)\bar{\alpha}\in{M_{1}(\bar{x})} and α¯M2(x¯)\bar{\alpha}\in{M_{2}(\bar{x})}.

Case 2.1. For any α¯M1(x¯)\bar{\alpha}\in{M_{1}(\bar{x})}, we first prove the inclusion:

𝒩epif(x¯,α¯)[{λ(v,1)|vLimsupxx¯f(x),λ>0}{(v,0)|vLimsupxx¯f(x)}].\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\subset\bigg{[}\left\{\lambda(v,-1)\,\middle|\,v\in\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x),\lambda>0\right\}\cup\left\{(v,0)\,\middle|\,v\in{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x)\right\}\bigg{]}. (49)

Observe that for any α¯M1(x¯)\bar{\alpha}\in{M_{1}(\bar{x})}, it holds that

𝒩epif(x¯,α¯)Limsup(x,α)(epif)(x¯,α¯)𝒩epifp(x,α)Limsupxx¯𝒩epifp(x,f(x))Limsupxx¯𝒩epif(x,f(x)),\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\subset\displaystyle\operatornamewithlimits{Lim\,sup}_{(x,\alpha)(\in\operatorname{epi}f)\rightarrow(\bar{x},\bar{\alpha})}\;\mathcal{N}^{p}_{\operatorname{epi}f}(x,\alpha)\subset\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\;\mathcal{N}^{p}_{\operatorname{epi}f}(x,f(x))\subset\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\;\mathcal{N}_{\operatorname{epi}f}(x,f(x)), (50)

where the first inclusion is because any normal vector is a limit of proximal normals at nearby points [30, Exercise 6.18]; the second one uses Lemma 3; the last inclusion follows from the fact that the proximal normal cone is a subset of the limiting normal cone [30, Example 6.16]. Based on the the result of [30, Theorem 8.9] that

𝒩epif(x,f(x))={λ(v,1)|vf(x),λ>0}{(v,0)|vf(x)},\mathcal{N}_{\operatorname{epi}f}(x,f(x))=\left\{\lambda(v,-1)\,\middle|\,v\in\partial f(x),\lambda>0\right\}\cup\left\{(v,0)\,\middle|\,v\in\partial^{\infty}f(x)\right\},

we conclude that 𝒩epif(x¯,α¯)n×\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\subset\mathbb{R}^{n}\times\mathbb{R}_{-} for any α¯M1(x¯)\bar{\alpha}\in{M_{1}(\bar{x})}. For any (v,1)𝒩epif(x¯,α¯)(v,-1)\in\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha}) with α¯M1(x¯)\bar{\alpha}\in{M_{1}(\bar{x})}, there exist xkx¯x^{k}\rightarrow\bar{x}, vkvv^{k}\rightarrow v with vkf(xk)v^{k}\in\partial f(x^{k}). Then vLimsupxx¯f(x)v\in\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x).

To prove (49), it remains to show that vLimsupxx¯f(x)v\in\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}^{\infty}\partial f(x) whenever (v,0)𝒩epif(x¯,α¯)(v,0)\in\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha}). It follows from (50) that (v,0)(v,0) is a limit of proximal normals of epif\operatorname{epi}f at (xk,f(xk))(x^{k},f(x^{k})) for some sequence xkx¯x^{k}\rightarrow\bar{x}. (i) First consider the case (vk,0)(v,0)(v^{k},0)\rightarrow(v,0) with (vk,0)𝒩epifp(xk,f(xk))(v^{k},0)\in\mathcal{N}^{p}_{\operatorname{epi}f}(x^{k},f(x^{k})). Following the argument in the proof of [30, Theorem 8.9], we can derive vkf(xk)v^{k}\in\partial^{\infty}f(x^{k}). Therefore,

vLimsupk+f(xk)Limsupk+(xk,ifxkLimsupi+f(xk,i))xjx¯Limsupj+f(xj),v\in\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\;\partial^{\infty}f(x^{k})\;\subset\;\displaystyle\operatornamewithlimits{Lim\,sup}_{k\rightarrow+\infty}\left(\bigcup_{x^{k,i}\rightarrow_{f}\,x^{k}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{i\rightarrow+\infty}}^{\infty}\partial f(x^{k,i})\right)\;\subset\;\bigcup_{x^{j}\rightarrow\bar{x}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{j\rightarrow+\infty}}^{\infty}\;\partial f(x^{j}),

where the first inclusion is due to the definition of the horizon subdifferential, and the last inclusion follows from a standard diagonal extraction procedure. (ii) In the other case, we have λk(vk,1)(v,0)\lambda_{k}(v^{k},-1)\rightarrow(v,0) with λk0\lambda_{k}\downarrow 0 and vkf(xk)v^{k}\in\partial f(x^{k}) for all k{k\in\mathbb{N}}. It is easy to see vLimsupxx¯f(x)v\in\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}^{\infty}\partial f(x). So far, we obtain inclusion (49). Since α¯M1(x¯)\bar{\alpha}\in{M_{1}(\bar{x})}, we have 𝒩domφ(α¯)Limsupx(φf)x¯𝒩domφ(f(x))\mathcal{N}_{\operatorname{dom}\varphi}(\bar{\alpha})\subset\operatorname{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\bar{x}}\,\mathcal{N}_{\operatorname{dom}\varphi}(f(x)), and our assumption implies that λ=0\lambda=0 is the unique solution satisfying 0λLimsupxx¯f(x)0\in\lambda\cdot\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x) with λ𝒩domφ(α¯)\lambda\in\mathcal{N}_{\operatorname{dom}\varphi}(\bar{\alpha}). Combining this with (49), we immediately obtain (47).

Case 2.2. For any α¯M2(x¯)\bar{\alpha}\in{M_{2}(\bar{x})}, consider any sequence {(xk,αk)}epif\big{\{}(x^{k},\alpha^{k})\big{\}}\subset\operatorname{epi}f converging to (x¯,α¯)(\bar{x},\bar{\alpha}). Then αk>f(xk)\alpha^{k}>f(x^{k}) for all sufficiently large kk since α¯M1(x¯)\bar{\alpha}\notin{M_{1}(\bar{x})}. It is easy to see that 𝒩epifp(xk,αk)n×{0}\mathcal{N}^{p}_{\operatorname{epi}f}(x^{k},\alpha^{k})\subset\mathbb{R}^{n}\times\{0\}, which gives us 𝒩epif(xk,αk)n×{0}\mathcal{N}_{\operatorname{epi}f}(x^{k},\alpha^{k})\subset\mathbb{R}^{n}\times\{0\} due to [30, Exercise 6.18]. By following a similar pattern as the final part of Case 2.1, it is not difficult to obtain, for any α¯M2(x¯)\bar{\alpha}\in{M_{2}(\bar{x})},

𝒩epif(x¯,α¯){(v,0)|vLimsupxx¯f(x)}.\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})\subset\left\{(v,0)\,\middle|\,v\in{\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x)\right\}. (51)

In this case, (47) holds trivially. Hence, we have verified (47) for cases 1 and 2.

Step 2: Based on (47) in step 1, we can now apply the sum rule [30, corollary 10.9] for g(x¯,α¯)\partial g(\bar{x},\bar{\alpha}) to obtain

g(x¯,α¯)𝒩epif(x¯,α¯)+{0}×φ(α¯),g(x¯,α¯)𝒩epif(x¯,α¯)+{0}×𝒩domφ(α¯).\partial g(\bar{x},\bar{\alpha})\subset\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})+\{0\}\times\partial\varphi(\bar{\alpha}),\qquad\partial^{\infty}g(\bar{x},\bar{\alpha})\subset\mathcal{N}_{\operatorname{epi}f}(\bar{x},\bar{\alpha})+\{0\}\times\mathcal{N}_{\operatorname{dom}\varphi}(\bar{\alpha}). (52)

Case 1. For M(x¯)={f(x¯)}{M(\bar{x})}=\{f(\bar{x})\}, by combining (52) with (46), we can derive the stated results for (φf)(x¯)\partial(\varphi\circ f)(\bar{x}) and (φf)(x¯)\partial^{\infty}(\varphi\circ f)(\bar{x}) based on the observations that φ(f(x¯))Limsupx(φf)x¯φ(f(x))\partial\varphi(f(\bar{x}))\subset\operatorname{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\bar{x}}\varphi(f(x)) and f(x¯)Limsupxx¯f(x)\partial^{\infty}f(\bar{x})\subset\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}^{\infty}\partial f(x).

Case 2. Otherwise, by (52), we have

{v(v,0)g(x¯,α¯),f(x¯)<α¯α¯max}(49)(51){yLimsupxx¯f(x)|yφ(α¯),α¯M1(x¯)}[{Limsupxx¯f(x)| 0φ(α¯),f(x¯)<α¯α¯max}]{yLimsupxx¯f(x)|yLimsupx(φf)x¯φ(f(x))}[(Limsupxx¯f(x))\{0}],\begin{array}[]{cl}&\{v\mid(v,0)\in\partial g(\bar{x},\bar{\alpha}),f(\bar{x})<\bar{\alpha}\leq\bar{\alpha}_{\max}\}\\ \overset{\eqref{eq:2.1normalcone_epi}\eqref{eq:2.2normalcone_epi}}{\subset}&\bigcup\left\{y\cdot\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x)\,\middle|\,y\in\partial\varphi(\bar{\alpha}),\bar{\alpha}\in{M_{1}}(\bar{x})\right\}\cup\left[\,{\bigcup}\left\{{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x)\,\middle|\,0\in\partial\varphi(\bar{\alpha}),f(\bar{x})<\bar{\alpha}\leq\bar{\alpha}_{\max}\right\}\right]\\[10.84006pt] \subset&\bigcup\left\{\,y\cdot\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x)\,\middle|\,y\in\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\,\bar{x}}\partial\varphi(f(x))\,\right\}\cup\left[\,{\Big{(}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x){\Big{)}}\backslash\{0\}\,\right],\end{array}

where the last inclusion is because 0 will be included in the first set if 0φ(α¯)0\in\partial\varphi(\bar{\alpha}) for some α¯(f(x¯),α¯max]\bar{\alpha}\in(f(\bar{x}),\bar{\alpha}_{\max}] and the second set will be empty otherwise. Similarly,

{v(v,0)g(x¯,α¯),f(x¯)<α¯α¯max}{yLimsupxx¯f(x)|yLimsupx(φf)x¯𝒩domφ(f(x))}[(Limsupxx¯f(x))\{0}].\begin{array}[]{rl}&\{v\mid(v,0)\in\partial g^{\infty}(\bar{x},\bar{\alpha}),f(\bar{x})<\bar{\alpha}\leq\bar{\alpha}_{\max}\}\\[5.78172pt] \subset&\bigcup\left\{\,y\cdot\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f(x)\,\middle|\,y\in\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{(\varphi\circ f)}\,\bar{x}}\mathcal{N}_{\operatorname{dom}\varphi}(f(x))\,\right\}\cup\left[\,{\Big{(}}{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}}^{\infty}\partial f(x){\Big{)}}\backslash\{0\}\,\right].\end{array}

We then complete the proof by using the inclusions in (48). ∎

Equipped with the chain rules, we are now ready to prove Proposition 4.

Proof of Proposition 4..

Let x¯\bar{x} be any feasible point, i.e., x¯p=1mdomFp\bar{x}\in{\bigcap_{p=1}^{m}}\operatorname{dom}F_{p}. Suppose for contradiction that (15) does not hold at x¯\bar{x}. Thus, there exist p1{1,,m}p_{1}\in\{1,\cdots,m\}, {xk}Sp1(x¯)\{x^{k}\}\in S_{p_{1}}(\bar{x}) and an index set NN\in\mathbb{N}_{\infty}^{\sharp} such that 0Cfp1k(xk)0\in\partial_{C}f^{k}_{p_{1}}(x^{k}) and 𝒩domφp1(fp1k(xk)){0}\mathcal{N}_{\operatorname{dom}\varphi_{p_{1}}}\left(f^{k}_{p_{1}}(x^{k})\right)\neq\{0\} for all kNk\in N. Take an arbitrary nonzero scalar yk𝒩domφp1(fp1k(xk))y^{k}\in\mathcal{N}_{\operatorname{dom}\varphi_{p_{1}}}\left(f^{k}_{p_{1}}(x^{k})\right) for all kNk\in N. Let y~\widetilde{y} be any accumulation point of the unit scalars {yk/|yk|}kN\{y^{k}/|y^{k}|\}_{k\in N}. Then, we have (0)y~{𝒩domφp1(tp1)tp1Tp1(x¯)}(0\neq)\widetilde{y}\in\bigcup\{\mathcal{N}_{\operatorname{dom}\varphi_{p_{1}}}(t_{p_{1}})\mid t_{p_{1}}\in T_{p_{1}}(\bar{x})\} and 0conAfp1(x¯)0\in\operatorname{con}\partial_{A}f_{p_{1}}(\bar{x}), contradicting Assumption 5. This proves condition (15).

For any fixed p=1,,mp=1,\cdots,m, let yp=0y_{p^{\prime}}=0 for any p{1,,m}\{p}p^{\prime}\in\{1,\cdots,m\}\backslash\{p\} in Assumption 5. Then the only scalar yp{𝒩domφp(tp)tpTp(x¯)}y_{p}\in\bigcup\left\{\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\right\} with 0ypconAfp(x¯)0\in y_{p}\,\operatorname{con}\partial_{A}f_{p}(\bar{x}) is yp=0y_{p}=0, which completes the proof of (16).

To derive the constraint qualification (17), we consider two cases.

Case 1. For pI2p\in I_{2}, we have 𝒩domφp(fp(x¯)){𝒩domφp(tp)tpTp(x¯)}\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(\bar{x}))\subset\bigcup\big{\{}\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\big{\}} due to fpkefpf^{k}_{p}\overset{\text{e}}{\rightarrow}f_{p} and (yfp)(x¯)yCfp(x¯)yconAfp(x¯)\partial(yf_{p})(\bar{x})\subset y\,\partial_{C}f_{p}(\bar{x})\subset y\cdot\operatorname{con}\partial_{A}f_{p}(\bar{x}) for any yy by Theorem 1(a). Together with Assumption 5, we deduce that the only scalar y𝒩domφp(fp(x¯))y\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(\bar{x})) with 0(yfp)(x¯)0\in\partial(yf_{p})(\bar{x}) is y=0y=0. From this condition and the local Lipschitz continuity of fpf_{p} for pI2p\in I_{2}, we can apply the chain rule [30, Theorem 10.49] to get

(φpfp)(x¯){yconAfp(x¯)y𝒩domφp(tp),tpTp(x¯)}.\partial^{\infty}(\varphi_{p}\circ f_{p})(\bar{x})\subset\bigcup\left\{y\cdot\operatorname{con}\partial_{A}f_{p}(\bar{x})\mid y\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p}),t_{p}\in T_{p}(\bar{x})\right\}. (53)

Case 2. For pI1p\in I_{1}, to utilize the chain rules (Lemma 4) for (φpfp)\partial^{\infty}(\varphi_{p}\circ f_{p}), we must first confirm the validity of the condition:

[0yLimsupxx¯fp(x),yLimsupxFpx¯𝒩domφp(fp(x))]y=0.\left[0\in y\cdot\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\partial f_{p}(x),\quad y\in\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{F_{p}}\bar{x}}\;\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(x))\right]\quad\Longrightarrow\quad y=0. (54)

Indeed, it suffices to consider the case of domφp=(,rp)\operatorname{dom}\varphi^{\uparrow}_{p}=(-\infty,r_{p}) or (,rp](-\infty,r_{p}] for some rpr_{p}\in\mathbb{R}, because the statement holds trivially when φp\varphi^{\uparrow}_{p} is real-valued. For any element y¯LimsupxFpx¯𝒩domφp(fp(x))\bar{y}\in\operatorname{Lim\,sup}_{x\rightarrow_{F_{p}}\,\bar{x}}\,\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(x)), there exist (xk,yk)(x¯,y¯)(x^{k},y^{k})\rightarrow(\bar{x},\bar{y}) with yk𝒩domφp(fp(xk))y^{k}\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(x^{k})) and Fp(xk)Fp(x¯)F_{p}(x^{k})\rightarrow F_{p}(\bar{x}). Since x¯domFp\bar{x}\in\operatorname{dom}F_{p}, we must have xkdomFpx^{k}\in\operatorname{dom}F_{p} for all sufficiently large k, i.e., fp(xk)domφpf_{p}(x^{k})\in\operatorname{dom}\varphi^{\uparrow}_{p}, and {fp(xk)}k\{f_{p}(x^{k})\}_{k\in\mathbb{N}} is bounded from above due to domφp=(,rp)\operatorname{dom}\varphi^{\uparrow}_{p}=(-\infty,r_{p}) or (,rp](-\infty,r_{p}]. The sequence {fp(xk)}k\{f_{p}(x^{k})\}_{k\in\mathbb{N}} is also bounded from below since fpf_{p} is lsc as a consequence of fpkefpf^{k}_{p}\overset{\text{e}}{\rightarrow}f_{p}. Then, we can assume that the bounded sequence {fp(xk)}k\{f_{p}(x^{k})\}_{k\in\mathbb{N}} converges to some z¯p\bar{z}_{p}. Note that z¯pdomφp\bar{z}_{p}\in\operatorname{dom}\varphi_{p} due to Fp(x¯)=lim infk+φp(fp(xk))φp(z¯p)F_{p}(\bar{x})=\liminf_{k\rightarrow+\infty}\varphi_{p}(f_{p}(x^{k}))\geq\varphi_{p}(\bar{z}_{p}). Thus, by the outer semicontinuity, yky¯𝒩domφp(z¯p)y^{k}\rightarrow\bar{y}\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(\bar{z}_{p}). By fpkefpf^{k}_{p}\overset{\text{e}}{\rightarrow}f_{p}, each fp(xk)f_{p}(x^{k}) can be expressed as the limit of a sequence {fpi(xk,i)}i\{f^{i}_{p}(x^{k,i})\}_{i\in\mathbb{N}} with xk,ixkx^{k,i}\rightarrow x^{k} for any fixed k{k\in\mathbb{N}}. Using a standard diagonal extraction procedure, one can extract a subsequence fpik(xk,ik)z¯pf^{i_{k}}_{p}(x^{k,i_{k}})\rightarrow\bar{z}_{p} with xk,ikx¯x^{k,i_{k}}\rightarrow\bar{x}. Hence, z¯pTp(x¯)\bar{z}_{p}\in T_{p}(\bar{x}) and

LimsupxFpx¯𝒩domφp(fp(x)){𝒩domφp(tp)tpTp(x¯)}.\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow_{F_{p}}\,\bar{x}}\,\mathcal{N}_{\operatorname{dom}\varphi_{p}}(f_{p}(x))\subset\bigcup\{\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p})\mid t_{p}\in T_{p}(\bar{x})\}. (55)

Using the subdifferentials relationships in Theorem 1 and the outer semicontinuity of Afp\partial_{A}f_{p}, we have

Limsupxx¯fp(x)Limsupxx¯Afp(x)=Afp(x¯).\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\;\partial f_{p}(x)\subset\displaystyle\operatornamewithlimits{Lim\,sup}_{x\rightarrow\bar{x}}\;\partial_{A}f_{p}(x)=\partial_{A}f_{p}(\bar{x}). (56)

By (55), (56) and Assumption 5, we immediately get (54). Thus, we can apply the chain rule in Lemma 4, and use (55), (56) again to obtain

(φpfp)(x¯){yAfp(x)|y𝒩domφp(tp),tpTp(x¯)}[Limsupxx¯fp(x)\{0}]{yAfp(x)|y𝒩domφp(tp),tpTp(x¯)}[Afp(x¯)\{0}].\begin{array}[]{rl}\partial^{\infty}(\varphi_{p}\circ f_{p})(\bar{x})&\displaystyle\subset\bigcup\left\{\,y\,\partial_{A}f_{p}(x)\,\middle|\,y\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p}),\,t_{p}\in T_{p}(\bar{x})\right\}\cup\left[\,{\displaystyle\displaystyle\operatornamewithlimits{Lim\,sup}_{x\to\bar{x}}}^{\infty}{\partial}f_{p}(x)\backslash\{0\}\,\right]\\[8.67204pt] &\displaystyle\subset\bigcup\left\{\,y\,\partial_{A}f_{p}(x)\,\middle|\,y\in\mathcal{N}_{\operatorname{dom}\varphi_{p}}(t_{p}),\,t_{p}\in T_{p}(\bar{x})\right\}\cup\left[\,\partial^{\infty}_{A}f_{p}(\bar{x})\backslash\{0\}\,\right].\end{array} (57)

For the last inclusion, we use Limsupxx¯fp(x)Limsupxx¯Afp(x)Afp(x¯)\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}^{\infty}\;\partial f_{p}(x)\subset\operatorname{Lim\,sup}_{x\rightarrow\bar{x}}^{\infty}\;\partial_{A}f_{p}(x)\subset\partial^{\infty}_{A}f_{p}(\bar{x}) by Theorem 1(a) and using a standard diagonal extraction procedure. Combining inclusions (53), (57) for two cases with Assumption 5, we derive (17) and complete the proof. ∎