
Relaxed constant positive linear dependence constraint qualification and its application to bilevel programs

Mengwei Xu and Jane J. Ye

Mengwei Xu: School of Mathematics, Tianjin University, Tianjin, 300072, China. E-mail: [email protected]. The research of this author was supported by the National Natural Science Foundation of China under Project No. 11601376.

Jane J. Ye (corresponding author): Department of Mathematics and Statistics, University of Victoria, Victoria, B.C., Canada V8W 2Y2. E-mail: [email protected]. The research of this author was partially supported by NSERC.

Abstract. The relaxed constant positive linear dependence constraint qualification (RCPLD) for a system of smooth equalities and inequalities is a constraint qualification that is weaker than the usual constraint qualifications such as the Mangasarian-Fromovitz constraint qualification and the linear constraint qualification. Moreover, RCPLD is known to induce an error bound property. In this paper we extend RCPLD to a very general feasibility system which may include Lipschitz continuous inequality constraints, complementarity constraints and abstract constraints. We show that this RCPLD for the general system is a constraint qualification for the optimality condition in terms of the limiting subdifferential and the limiting normal cone, and that it is a sufficient condition for the error bound property under the strict complementarity condition for the complementarity system and Clarke regularity conditions for the inequality constraints and the abstract constraint set. Moreover, we introduce and study some sufficient conditions for RCPLD, including the relaxed constant rank constraint qualification (RCRCQ). Finally, we apply our results to the bilevel program.

Key Words. Constraint qualification, error bound property, nonsmooth program, RCPLD, variational analysis, complementarity system, bilevel program.

2010 Mathematics Subject Classification. 49J52, 90C31, 90C33.

1 Introduction

A constraint qualification is a condition imposed on the constraint region of a mathematical program under which the Karush-Kuhn-Tucker (KKT) condition holds at any local optimal solution. Beyond guaranteeing that the KKT condition holds at all local optimal solutions, some constraint qualifications also ensure the existence of error bounds for the feasible region and hence play a key role in the convergence analysis of certain computational methods. Studying constraint qualifications is therefore essential from both theoretical and numerical points of view.

For smooth nonlinear programs with equality and inequality constraints, the classical constraint qualifications are the linear independence constraint qualification (LICQ), the Mangasarian-Fromovitz constraint qualification (MFCQ), and the linear constraint qualification (LCQ), i.e., all functions in the equality and inequality constraints are affine. These three classical constraint qualifications may be too restrictive for many problems. Janin [16] relaxed LICQ and proposed the constant rank constraint qualification (CRCQ), which neither implies nor is implied by MFCQ. CRCQ was weakened to the relaxed constant rank constraint qualification (RCRCQ), which was shown to be a constraint qualification by Minchenko and Stakhovski [19]. Qi and Wei [26] introduced the constant positive linear dependence (CPLD) condition, which is weaker than both CRCQ and MFCQ. CPLD was shown to be a constraint qualification by Andreani, Martínez and Schuverdt in [2].

In [1], Andreani, Haeser, Schuverdt and Silva introduced the following relaxed version of CPLD for a system of smooth equality and inequality constraints:

g_{i}(x)\leq 0,\ i=1,\cdots,n,\qquad h_{i}(x)=0,\ i=1,\cdots,m,

where g_{i},h_{i}:\mathbb{R}^{d}\rightarrow\mathbb{R} are smooth at x^{*}, a feasible solution. The point x^{*} is said to satisfy the relaxed constant positive linear dependence constraint qualification (RCPLD) if there exists \mathbb{U}(x^{*}), a neighborhood of x^{*}, such that

  • (i)

    \{\nabla h_{i}(x)\}_{i=1}^{m} has the same rank for every x\in\mathbb{U}(x^{*}).

  • (ii)

    Let J\subseteq\{1,\cdots,m\} be such that \{\nabla h_{i}(x^{*})\}_{i\in J} is a basis for {\rm span}\,\{\nabla h_{i}(x^{*})\}_{i=1}^{m}. For every I\subseteq I_{g}^{*}:=\{i:g_{i}(x^{*})=0\}, if \{\nabla g_{i}(x^{*})\}_{i\in I}\cup\{\nabla h_{i}(x^{*})\}_{i\in J} is positive linearly dependent, i.e., there exist scalars \lambda_{i}\geq 0, i\in I, and \mu_{i}, i\in J, not all zero, such that

    0=\sum_{i\in I}\lambda_{i}\nabla g_{i}(x^{*})+\sum_{i\in J}\mu_{i}\nabla h_{i}(x^{*}),

    then \{\nabla g_{i}(x)\}_{i\in I}\cup\{\nabla h_{i}(x)\}_{i\in J} is linearly dependent for every x\in\mathbb{U}(x^{*}).
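Since positive linear dependence implies linear dependence, a crude sampling check of these two conditions can be coded in a few lines. The sketch below is our own illustration and not part of the paper; the constraint system g_{1}(x)=x_{1}+x_{2}^{2}\leq 0, g_{2}(x)=-(x_{1}+x_{2}^{2})\leq 0, the point x^{*}=(0,0) and the sampling radius are all assumptions chosen for the example. For this system MFCQ and LCQ fail while RCPLD holds, and the check verifies that every subfamily of active gradients that is linearly dependent at x^{*} stays linearly dependent at sampled nearby points.

```python
import itertools
import numpy as np

# Sketch only: gradients of g_1(x) = x_1 + x_2^2 and g_2(x) = -(x_1 + x_2^2);
# there are no equality constraints h_i, so condition (i) is vacuous here.
def grad_g(x):
    return [np.array([1.0, 2.0 * x[1]]), np.array([-1.0, -2.0 * x[1]])]

def lin_dep(vectors):
    return np.linalg.matrix_rank(np.column_stack(vectors)) < len(vectors)

x_star = np.array([0.0, 0.0])          # both constraints are active at x*
rng = np.random.default_rng(0)
samples = [x_star + 1e-3 * rng.standard_normal(2) for _ in range(100)]

ok = True
for r in range(1, 3):
    for I in itertools.combinations([0, 1], r):
        if lin_dep([grad_g(x_star)[i] for i in I]):       # dependent at x* ...
            ok &= all(lin_dep([grad_g(x)[i] for i in I])  # ... stays dependent nearby
                      for x in samples)
print("sampled RCPLD-type check passed:", ok)             # expected: True
```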

It is easy to see that in the case where either LCQ or MFCQ holds, RCPLD also holds. Hence RCPLD is weaker than LCQ and MFCQ. In [1], the authors not only showed that RCPLD is a constraint qualification but also proved that if all functions g_{i},h_{i} have second-order derivatives at all points near x^{*}, then RCPLD is a sufficient condition for the error bound property: there exist \alpha>0 and \mathbb{U}(x^{*}), a neighborhood of x^{*}, such that

d_{{\cal F}}(x)\leq\alpha(\|g_{+}(x)\|+\|h(x)\|),\quad\forall x\in\mathbb{U}(x^{*}),

where {\cal F}:=\{x\,|\,g(x)\leq 0,\ h(x)=0\}, g(x)=(g_{1}(x),\dots,g_{n}(x)), h(x)=(h_{1}(x),\dots,h_{m}(x)), d_{{\cal F}}(x) is the distance from x to the set {\cal F}, \|\cdot\| denotes any norm and g_{+}(x):=\max\{0,g(x)\}, where the maximum is taken component-wise. Moreover, an open question was posed in [1]: can the error bound property be proved without imposing the second-order differentiability of all functions? In Guo, Zhang and Lin [14], it was shown that the error bound property holds under RCPLD without imposing the second-order differentiability of all functions. Besides serving as a constraint qualification that ensures the KKT condition, RCPLD is also used in the convergence analysis of the augmented Lagrangian method to obtain a KKT point (see e.g., [1, 3, 15, 28]). Recently, Guo and Ye [13, Corollary 3] extended RCPLD to the case where there is an extra abstract constraint set and showed that it is still a constraint qualification. In [11, Definition 4.3], a version of RCPLD called MPEC RCPLD was introduced for mathematical programs with equilibrium constraints (MPECs), and it was shown in [10, Corollary 4.1] to be a constraint qualification for the M-stationary condition. Moreover, it was shown in [14, Theorem 5.1] that the RCPLD for MPECs ensures the error bound property under the assumption of strict complementarity.

In this paper, we extend RCPLD to the following very general feasibility system:

\begin{array}{l}g_{i}(x)\leq 0,\ i=1,\cdots,n,\\ h_{i}(x)=0,\ i=1,\cdots,m,\\ (G(x),H(x))\in\Omega^{p},\\ x\in C,\end{array} \qquad (1.1)

where \Omega^{p}:=\{(y,z)\in\mathbb{R}^{p}\times\mathbb{R}^{p}\,|\,0\leq y\perp z\geq 0\} is the p-dimensional complementarity set, C:=C_{1}\times C_{2}\times\cdots\times C_{l} with C_{i}\subseteq\mathbb{R}^{q_{i}} closed, i=1,\cdots,l, q_{1}+\cdots+q_{l}=d, the functions g_{i}:\mathbb{R}^{d}\to\mathbb{R}, i=1,\cdots,n, are locally Lipschitz continuous, and h_{i}:\mathbb{R}^{d}\to\mathbb{R}, i=1,\cdots,m, and G,H:\mathbb{R}^{d}\rightarrow\mathbb{R}^{p} are continuously differentiable at x^{*}, a feasible solution.

Denote the feasible region of system (1.1) by {\cal F}. For a feasible point x^{*}\in{\cal F}, we define the following index sets:

I_{g}^{*}:=\{i=1,\cdots,n:g_{i}(x^{*})=0\},
\mathcal{I}^{*}:=\{i=1,\cdots,p:0=G_{i}(x^{*})<H_{i}(x^{*})\},
\mathcal{J}^{*}:=\{i=1,\cdots,p:0=G_{i}(x^{*})=H_{i}(x^{*})\},
\mathcal{K}^{*}:=\{i=1,\cdots,p:G_{i}(x^{*})>H_{i}(x^{*})=0\}.
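For concreteness, the following tiny sketch (ours; the numerical values of g(x^{*}), G(x^{*}) and H(x^{*}) are made up for illustration) computes these index sets; \mathcal{J}^{*} collects the degenerate (biactive) complementarity indices, and strict complementarity at x^{*} means \mathcal{J}^{*}=\emptyset.

```python
import numpy as np

# Hypothetical values at a feasible x* with n = 2 inequality constraints, p = 3.
g_star = np.array([0.0, -1.5])
G_star = np.array([0.0, 0.0, 2.0])
H_star = np.array([3.0, 0.0, 0.0])

I_g = [i for i, gi in enumerate(g_star) if gi == 0.0]                   # I_g*
I   = [i for i in range(3) if G_star[i] == 0.0 and H_star[i] > 0.0]     # I*
J   = [i for i in range(3) if G_star[i] == 0.0 and H_star[i] == 0.0]    # J* (biactive)
K   = [i for i in range(3) if G_star[i] > 0.0 and H_star[i] == 0.0]     # K*
print(I_g, I, J, K)   # -> [0] [0] [1] [2]
```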
Definition 1.1 (RCPLD for the nonsmooth system (1.1))

We say that the relaxed constant positive linear dependence constraint qualification (RCPLD) holds at x^{*}\in{\cal F} for system (1.1) if the following conditions hold:

  • (i)

    The vectors \{\nabla h_{i}(x)\}_{i=1}^{m}\cup\{\nabla G_{i}(x)\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x)\}_{i\in\mathcal{K}^{*}} have the same rank for all x in a neighbourhood of x^{*}.

  • (ii)

    Let \mathcal{I}_{1}\subseteq\{1,\cdots,m\}, \mathcal{I}_{2}\subseteq\mathcal{I}^{*}, \mathcal{I}_{3}\subseteq\mathcal{K}^{*} be such that the set of vectors \{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}_{2}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{I}_{3}} is a basis for

    {\rm span}\,\big\{\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}}\big\}.

    For any index sets \mathcal{I}_{4}\subseteq I_{g}^{*}, \mathcal{I}_{5},\mathcal{I}_{6}\subseteq\mathcal{J}^{*}, if there exist a nonzero vector (\lambda^{g},\lambda^{h},\lambda^{G},\lambda^{H},\eta^{*})\in\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{p}\times\mathbb{R}^{p}\times\mathbb{R}^{d} satisfying \lambda_{i}^{g}\geq 0 for i\in\mathcal{I}_{4}, either \lambda_{i}^{G}>0,\ \lambda_{i}^{H}>0 or \lambda_{i}^{G}\lambda_{i}^{H}=0 for all i\in\mathcal{J}^{*}, and \eta^{*}=(\eta_{1}^{*},\cdots,\eta_{l}^{*})\in\mathcal{N}_{C}(x^{*})=\mathcal{N}_{C_{1}}(x_{1}^{*})\times\cdots\times\mathcal{N}_{C_{l}}(x_{l}^{*}), together with vectors v^{*}_{i}\in\partial g_{i}(x^{*}) for i\in{\cal I}_{4}, such that

    0=\sum_{i\in\mathcal{I}_{4}}\lambda_{i}^{g}v_{i}^{*}+\sum_{i\in\mathcal{I}_{1}}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\lambda_{i}^{H}\nabla H_{i}(x^{*})+\eta^{*},

    then for all k sufficiently large, the set of vectors

    \{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{k}\}_{i\in L}, \qquad (1.2)

    where v_{i}^{k}\in\partial g_{i}(x^{k}), \nu_{i}^{k}:=\{0\}^{s_{i}}\times\{\eta_{i}^{k}\}\times\{0\}^{t_{i}} with \eta_{i}^{k}\in\mathcal{N}_{C_{i}}(x_{i}^{k}), s_{i}:=q_{1}+\cdots+q_{i-1}, t_{i}:=q_{i+1}+\cdots+q_{l}, L:=\{1,\dots,l:\eta_{i}^{k}\neq 0\} and x^{k}\neq x^{*}, is linearly dependent for all sequences \{x^{k}\},\{v^{k}\},\{\eta^{k}\} satisfying x^{k}\rightarrow x^{*}, v_{i}^{k}\rightarrow v_{i}^{*}, \eta^{k}:=(\eta_{1}^{k},\dots,\eta_{l}^{k})=\sum_{i\in L}\nu_{i}^{k}\rightarrow\eta^{*} as k\rightarrow\infty.

Remark 1.1

In Definition 1.1 (ii), we use sequences instead of neighborhoods. Although an equivalent definition could be given using a neighborhood, the sequential form is more convenient: if x^{*} is an isolated non-differentiable point, i.e., there exists a neighborhood of x^{*} on which g is differentiable except possibly at x^{*} itself, then v_{i}^{k} can be taken to be the gradient \nabla g_{i}(x^{k}). Since a Lipschitz continuous function is differentiable almost everywhere, such points are abundant.

In the case where the rank of \{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}} is equal to d, it is easy to see that RCPLD holds automatically. What is more, in Theorem 4.1 we will show that the error bound property holds in this case.

Note that Definition 1.1 is weaker than the one defined in Guo and Ye [13, Corollary 3] for the system containing only smooth equality and inequality constraints and one abstract constraint, in which the stronger condition that \{\nabla g_{i}(x)\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x)\}_{i\in\mathcal{I}_{1}} is linearly dependent for every x\in\mathbb{U}(x^{*}) is required.

If the set of vectors in (1.2) is replaced by the following set of vectors

\{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\cup\{\eta^{k}\},

then, since \eta^{k}=\sum_{i\in L}\nu_{i}^{k}, the resulting condition would be stronger than the RCPLD defined in Definition 1.1. We illustrate this by Example 4.1.

In this paper we show that RCPLD for the nonsmooth feasibility system is a constraint qualification for any optimization problem with a Lipschitz continuous objective function and constraints described by the feasibility system (1.1). Moreover, under some extra conditions, we will show that RCPLD is a sufficient condition for the following error bound property: there exist \alpha>0 and \mathbb{U}(x^{*}), a neighborhood of x^{*}, such that

d_{{\cal F}}(x)\leq\alpha\Big(\|g_{+}(x)\|+\|h(x)\|+\sum_{i=1}^{p}d_{\Omega}(G_{i}(x),H_{i}(x))\Big),\quad\forall x\in\mathbb{U}(x^{*})\cap C, \qquad (1.3)

where \Omega:=\Omega^{1}=\{(y,z)\in\mathbb{R}^{2}\,|\,0\leq y\perp z\geq 0\}.

One of the motivations to extend the concept of RCPLD to the nonsmooth system (1.1) is to study the constraint qualification and optimality condition for the following bilevel program:

({\rm BP})\qquad\min\ F(x,y)\quad{\rm s.t.}\quad y\in S(x),\ G(x,y)\leq 0,\ H(x,y)=0,

where S(x) denotes the solution set of the lower level program

({\rm P}_{x})\qquad\min_{y\in Y(x)}\ f(x,y),

and Y(x):=\{y\in\mathbb{R}^{s}:g(x,y)\leq 0,\ h(x,y)=0\}, F:\mathbb{R}^{d}\times\mathbb{R}^{s}\rightarrow\mathbb{R} is locally Lipschitz continuous, G:\mathbb{R}^{d}\times\mathbb{R}^{s}\rightarrow\mathbb{R}^{p} and H:\mathbb{R}^{d}\times\mathbb{R}^{s}\rightarrow\mathbb{R}^{q} are continuously differentiable, and f:\mathbb{R}^{d}\times\mathbb{R}^{s}\rightarrow\mathbb{R}, g:\mathbb{R}^{d}\times\mathbb{R}^{s}\rightarrow\mathbb{R}^{m}, h:\mathbb{R}^{d}\times\mathbb{R}^{s}\rightarrow\mathbb{R}^{n} are continuously differentiable and twice continuously differentiable with respect to the variable y.

Bilevel programs naturally fall in the domain of global optimization since, in the lower level problem, the global optimal solution is always required in either optimality conditions or numerical algorithms. A popular approach to deal with (BP) is to replace the set of global solutions S(x) by the KKT conditions of the lower level problem. This reformulation is based on the fact that if the lower level problem (P_{x}) is a convex program for each fixed x and a certain constraint qualification holds, then y\in S(x) if and only if the KKT condition holds at y. However, due to the introduction of the lower level multipliers as extra variables, the resulting reformulation may not be equivalent to the original bilevel program, even in the case where (P_{x}) is convex, when the multipliers are not unique. For a discussion of this issue and recent new results, the reader is referred to the recent paper [31] and the references therein. If the lower level problem (P_{x}) is not a convex program for each fixed x, the KKT condition for the lower level problem is usually only a necessary but not a sufficient condition for optimality; moreover, as pointed out by Mirrlees in [20], such a reformulation by the KKT condition may miss the true optimal solution of the original bilevel program.

Instead of using y\in S(x) as a constraint in (BP), Outrata [25] proposed, for numerical purposes, to replace it with f(x,y)-V(x)=0,\ y\in Y(x), where V(x):=\inf_{y\in Y(x)}f(x,y) is the value function of the lower level program. This so-called value function approach was further used in Ye and Zhu [32] and later in other papers such as [7, 8, 21, 22, 23, 24] to derive various necessary optimality conditions under the partial calmness condition. The partial calmness condition for the value function reformulation of the bilevel program, however, is a very strong assumption. To derive a necessary optimality condition under weaker assumptions, Ye and Zhu [33] proposed the following combined program where both the value function constraint and the KKT condition of the lower level program are used as constraints:

({\rm CP})\qquad\min_{x,y,u,v}\ F(x,y)
{\rm s.t.}\quad f(x,y)-V(x)\leq 0,\ G(x,y)\leq 0,\ H(x,y)=0,
\qquad\ \nabla_{y}f(x,y)+\nabla_{y}g(x,y)^{T}u+\nabla_{y}h(x,y)^{T}v=0,
\qquad\ (-g(x,y),u)\in\Omega^{m}.

As discussed in [33], such a reformulation can avoid some of the difficulties caused by using the value function or the classical KKT approach alone. In [30], necessary optimality conditions in the form of the Mordukhovich (M-) stationary condition for (CP) are studied under the partial calmness/weak calmness condition. These conditions, however, may not be easy to verify.

Note that the inclusion of the value function makes the problem nonsmooth, since the value function is usually nonsmooth; under some reasonable assumptions on the lower level problem, however, the value function is Lipschitz continuous. Hence the feasible set of (CP) is a special case of the general feasibility system (1.1). However, as shown in Ye and Zhu [33, Proposition 1.3], the no nonzero abnormal multiplier constraint qualification (NNAMCQ) as defined in Definition 4.1 never holds for (CP). Being able to deal with nonsmooth inequality constraints in RCPLD allows us to present verifiable constraint qualifications and exact penalty results for reformulations of the bilevel program in which the value function is used, such as (CP).
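To make the role of the value function concrete, consider the following toy instance (our own illustration, not taken from the paper): let the lower level problem be \min_{y\in\mathbb{R}}\ f(x,y):=y subject to g_{1}(x,y):=x-y\leq 0 and g_{2}(x,y):=-x-y\leq 0, with no lower level equality constraints and no upper level constraints G,H. Then S(x)=\{|x|\} and the value function V(x)=|x| is Lipschitz continuous but not differentiable at x=0. Since \nabla_{y}f(x,y)=1 and \nabla_{y}g_{1}(x,y)=\nabla_{y}g_{2}(x,y)=-1, the combined program (CP) for this instance reads

\min_{x,y,u}\ F(x,y)\quad{\rm s.t.}\quad y-|x|\leq 0,\quad 1-u_{1}-u_{2}=0,\quad(y-x,u_{1})\in\Omega,\quad(y+x,u_{2})\in\Omega,

so the value function constraint is a genuinely nonsmooth (Lipschitz) inequality of the type covered by the general system (1.1).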

The rest of the paper is organized as follows. In Section 2, we present basic definitions as well as some preliminaries which will be used in this paper. In Section 3, we show that RCPLD is a constraint qualification and a sufficient condition for the error bound property under certain regularity conditions. In Section 4, we introduce some sufficient conditions for RCPLD which are easier to verify. In Section 5, we apply RCPLD to the bilevel program.

We adopt the following standard notation in this paper. For any two vectors a and b, we denote by either \langle a,b\rangle or a^{T}b their inner product. Given a differentiable function G:\mathbb{R}^{d}\rightarrow\mathbb{R}, we denote its gradient vector by \nabla G(x)\in\mathbb{R}^{d}. For a differentiable mapping \Phi:\mathbb{R}^{d}\to\mathbb{R}^{n} with n\geq 2 and a vector x\in\mathbb{R}^{d}, we denote by \nabla\Phi(x)\in\mathbb{R}^{n\times d} the Jacobian matrix of \Phi at x. For a set C\subseteq\mathbb{R}^{d}, we denote by {\rm int}\,C, {\rm co}\,C, \bar{C}, {\rm bd}\,C and d_{C}(x) the interior, the convex hull, the closure, the boundary of C and the distance from x to C, respectively. We denote by |{\cal I}| the cardinality of an index set {\cal I}\subseteq\{1,\dots,n\}. For any vector v\in\mathbb{R}^{n} and a given index set {\cal I}\subseteq\{1,\dots,n\}, we use v_{\cal I} to denote the vector in \mathbb{R}^{|{\cal I}|} with components v_{i}, i\in{\cal I}. For a matrix A\in\mathbb{R}^{n\times m}, A^{T} denotes its transpose and r(A) denotes its rank. We denote by \mathbb{B}_{\delta}(\bar{x}) the open ball centered at \bar{x} with radius \delta>0 and by \mathbb{B} the open unit ball centered at the origin. Unless otherwise specified, we denote by \|\cdot\| any norm in the finite dimensional space.

2 Background and preliminaries

In this section, we present some background materials on variational analysis which will be used throughout the paper. Detailed discussions on these subjects can be found in [4, 5, 21, 27].

Given a set C\subseteq\mathbb{R}^{d}, its Fréchet/regular normal cone at z\in C is defined by

\hat{\mathcal{N}}_{C}(z):=\{v\in\mathbb{R}^{d}:\langle v,z^{\prime}-z\rangle\leq o(\|z^{\prime}-z\|)\ \mbox{ for all }z^{\prime}\in C\}

and the limiting/Mordukhovich/basic normal cone to C at the point z is defined by

\mathcal{N}_{C}(z)=\{\lim_{k\rightarrow\infty}v_{k}:v_{k}\in\hat{\mathcal{N}}_{C}(z_{k}),\ z_{k}\in C,\ z_{k}\rightarrow z\}.

A set C\subseteq\mathbb{R}^{d} is Clarke regular at z if it is locally closed at z and \mathcal{N}_{C}(z)=\hat{\mathcal{N}}_{C}(z) [27, Definition 6.4]. Note that for C:=C_{1}\times C_{2}\times\cdots\times C_{l} with C_{i} closed, i=1,\cdots,l, by [27, Proposition 6.41], C is regular at z if and only if each C_{i} is regular at z_{i}.

The exact expressions for the limiting normal cone of the complementarity set presented below will be useful. In this paper we denote by \Omega^{p} the complementarity set and, when p=1, \Omega:=\Omega^{1}=\{(y,z)\in\mathbb{R}^{2}\,|\,0\leq y\perp z\geq 0\}.

Proposition 2.1

(See e.g. [29, Proposition 3.7]) For any (a,b) lying in the complementarity set \Omega^{p}, the limiting normal cone to \Omega^{p} at (a,b) is

\mathcal{N}_{\Omega^{p}}(a,b)=\left\{-(\alpha,\beta)\in\mathbb{R}^{p}\times\mathbb{R}^{p}:\begin{array}{ll}\alpha_{i}\in\mathbb{R},\ \beta_{i}=0&{\rm if}\ a_{i}=0<b_{i}\\ \alpha_{i}=0,\ \beta_{i}\in\mathbb{R}&{\rm if}\ a_{i}>0=b_{i}\\ {\rm either}\ \alpha_{i}>0,\ \beta_{i}>0\ {\rm or}\ \alpha_{i}\beta_{i}=0&{\rm if}\ a_{i}=b_{i}=0\end{array}\right\}.
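A direct transcription of this case analysis into a small membership test (an illustration of ours; the function name and the tolerance are assumptions) reads:

```python
import numpy as np

def in_normal_cone_Omega_p(a, b, w, tol=1e-12):
    """Check whether w lies in N_{Omega^p}(a, b) for (a, b) in Omega^p, using the
    componentwise description of Proposition 2.1; we write w = -(alpha, beta)."""
    a, b, w = map(np.asarray, (a, b, w))
    p = a.size
    alpha, beta = -w[:p], -w[p:]
    for i in range(p):
        if a[i] > tol and b[i] <= tol:        # a_i > 0 = b_i: alpha_i = 0, beta_i free
            ok = abs(alpha[i]) <= tol
        elif a[i] <= tol and b[i] > tol:      # a_i = 0 < b_i: beta_i = 0, alpha_i free
            ok = abs(beta[i]) <= tol
        else:                                 # biactive: a_i = b_i = 0
            ok = (alpha[i] > tol and beta[i] > tol) or abs(alpha[i] * beta[i]) <= tol
        if not ok:
            return False
    return True

# e.g. at the biactive point (a, b) = (0, 0) with p = 1:
print(in_normal_cone_Omega_p([0.0], [0.0], [-1.0, -1.0]))  # alpha = beta = 1 > 0 -> True
print(in_normal_cone_Omega_p([0.0], [0.0], [1.0, -1.0]))   # alpha = -1, beta = 1 -> False
```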

Let f:\mathbb{R}^{d}\rightarrow\overline{\mathbb{R}} be a lower semicontinuous function which is finite at x\in\mathbb{R}^{d}. We define the Fréchet/regular subdifferential ([27, Definition 8.3]) of f at x as

\hat{\partial}f(x):=\{\zeta\in\mathbb{R}^{d}:f(x^{\prime})-f(x)-\langle\zeta,x^{\prime}-x\rangle\geq o(\|x^{\prime}-x\|)\},

and the limiting/Mordukhovich/basic subdifferential of f at x as

\partial f(x):=\{\lim_{k\rightarrow\infty}\xi_{k}:\xi_{k}\in\hat{\partial}f(x_{k}),\ x_{k}\rightarrow x,\ f(x_{k})\rightarrow f(x)\}.

Let f:\mathbb{R}^{d}\rightarrow\mathbb{R} be Lipschitz continuous at x. We say that f is subdifferentially/Clarke regular at x provided that \partial f(x)=\hat{\partial}f(x) [27, Corollary 8.11].

The following proposition collects some useful properties and calculus rules of the limiting subdifferential.

Proposition 2.2
  • (i)

    [27, Exercise 10.10] and [21, Theorem 2.33] Let f,g:\mathbb{R}^{n}\to[-\infty,\infty] be proper, lower semicontinuous around x^{*}\in\mathbb{R}^{n} and finite at x^{*}, and let \alpha,\beta be nonnegative scalars. Assume that at least one of f and g is Lipschitz continuous around x^{*}. Then

    \partial(\alpha f+\beta g)(x^{*})\subseteq\alpha\partial f(x^{*})+\beta\partial g(x^{*}).
  • (ii)

    [17, Theorem 2.5, Remark (2)] and [21, Theorem 3.41] Let g:\mathbb{R}^{n}\to\mathbb{R}^{m} be Lipschitz continuous around x^{*} and f:\mathbb{R}^{m}\to\mathbb{R} be Lipschitz continuous near g(x^{*}). Then the composite function f\circ g is Lipschitz continuous around x^{*} and

    \partial(f\circ g)(x^{*})\subseteq\bigcup_{\xi\in\partial f(g(x^{*}))}\partial\langle\xi,g\rangle(x^{*}).
  • (iii)

    [21, Theorem 3.46 and Proposition 1.113] Let \varphi_{i}:\mathbb{R}^{n}\to\mathbb{R}\ (i=1,\dots,n) be Lipschitz continuous around x^{*} and define \varphi_{\max}(x):=\max\{\varphi_{i}(x):i=1,\ldots,n\} and \varphi_{\min}(x):=\min\{\varphi_{i}(x):i=1,\ldots,n\}. Then \varphi_{\max} and \varphi_{\min} are Lipschitz continuous around x^{*}\in\mathbb{R}^{n} and

    \partial\varphi_{\max}(x^{*})\subseteq{\rm co}\,\{\partial\varphi_{i}(x^{*}):i\in I_{+}(x^{*})\},\qquad\partial\varphi_{\min}(x^{*})\subseteq\{\partial\varphi_{i}(x^{*}):i\in I_{-}(x^{*})\},

    where I_{+}(x^{*}):=\{i:\varphi_{i}(x^{*})=\varphi_{\max}(x^{*})\} and I_{-}(x^{*}):=\{i:\varphi_{i}(x^{*})=\varphi_{\min}(x^{*})\}, and the first inclusion holds as an equation if each \varphi_{i} is subdifferentially regular at x^{*}.
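For instance (a one-line illustration of our own), take n=2, \varphi_{1}(x)=x and \varphi_{2}(x)=-x on \mathbb{R}, so that \varphi_{\max}(x)=|x| and \varphi_{\min}(x)=-|x|. At x^{*}=0 the rules give \partial\varphi_{\max}(0)\subseteq{\rm co}\,\{1,-1\}=[-1,1] and \partial\varphi_{\min}(0)\subseteq\{1,-1\}; both inclusions hold here with equality, since \partial|\cdot|(0)=[-1,1] and \partial(-|\cdot|)(0)=\{-1,1\}.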

Taking into account that all norms in a finite dimensional space are equivalent, the following formulas for the distance function to the complementarity set \Omega can be used in the error bound property (1.3).

Proposition 2.3

(see e.g. [18]) When the norm in the distance function d_{\Omega} is chosen to be the l_{1}-norm, for any (a,b)\in\mathbb{R}^{2},

d_{\Omega}(a,b)=\max\{-a,-b,-(a+b),\min\{a,b\}\}.

When the norm in the distance function d_{\Omega} is chosen to be the l_{\infty}-norm, for any (a,b)\in\mathbb{R}^{2},

d_{\Omega}(a,b)=|\min\{a,b\}|.
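These closed forms are easy to check numerically. The sketch below (ours; it uses the fact that \Omega is the union of the two nonnegative half-axes and approximates the distance by sampling points of \Omega on a fine grid) compares both formulas with a crude direct minimization:

```python
import numpy as np

# Omega = {(t,0): t >= 0} U {(0,t): t >= 0}, sampled densely so that the distance
# can be approximated by direct minimization (illustration only).
t = np.linspace(0.0, 10.0, 200001)
omega_pts = np.vstack([np.column_stack([t, np.zeros_like(t)]),
                       np.column_stack([np.zeros_like(t), t])])

def dist(p, ord):   # approximate distance from p to Omega in the given norm
    return np.linalg.norm(omega_pts - np.asarray(p), ord=ord, axis=1).min()

rng = np.random.default_rng(1)
ok_l1 = ok_linf = True
for a, b in rng.uniform(-2.0, 2.0, size=(200, 2)):
    ok_l1 &= np.isclose(dist((a, b), 1), max(-a, -b, -(a + b), min(a, b)), atol=1e-4)
    ok_linf &= np.isclose(dist((a, b), np.inf), abs(min(a, b)), atol=1e-4)
print(ok_l1, ok_linf)   # expected: True True
```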

In Theorem 3.1, we need to calculate \partial\phi_{0}(x), where

\phi_{0}(x):=\sum_{i=1}^{n}\max\{0,g_{i}(x)\}+\sum_{i=1}^{m}|h_{i}(x)|+\sum_{i=1}^{p}|\min\{G_{i}(x),H_{i}(x)\}|.

In order to calculate \partial\phi_{0}(x), we define the following index sets for each x:

A(x):=\{i=1,\cdots,n:g_{i}(x)\geq 0\},
\mathcal{I}(x):=\{i=1,\cdots,p:G_{i}(x)<H_{i}(x)\},
\mathcal{J}(x):=\{i=1,\cdots,p:G_{i}(x)=H_{i}(x)\}, \qquad (2.2)
\mathcal{K}(x):=\{i=1,\cdots,p:G_{i}(x)>H_{i}(x)\}.

From the calculus rules in Proposition 2.2, we have the following estimate for the limiting subdifferential of \phi_{0}.

Lemma 2.1

Assume that the functions g_{i}:\mathbb{R}^{d}\to\mathbb{R}, i=1,\cdots,n, are locally Lipschitz continuous and h_{i}:\mathbb{R}^{d}\to\mathbb{R}, i=1,\cdots,m, and G,H:\mathbb{R}^{d}\rightarrow\mathbb{R}^{p} are continuously differentiable around x^{*}. Then \phi_{0} is a Lipschitz continuous function and, for any x^{*}, there exist \lambda_{i}^{g}\geq 0, i\in A(x^{*}), \lambda_{i}^{h}, i=1,\cdots,m, and \lambda_{i}^{G},\lambda_{i}^{H}, i=1,\cdots,p, satisfying

\left\{\begin{array}{ll}\lambda_{i}^{H}=0&\mbox{ if }i\in\mathcal{I}(x^{*})\\ \lambda_{i}^{G}=0&\mbox{ if }i\in\mathcal{K}(x^{*})\\ \mbox{either }\lambda_{i}^{G}>0,\ \lambda_{i}^{H}>0\mbox{ or }\lambda_{i}^{G}\lambda_{i}^{H}=0&\mbox{ if }i\in\mathcal{J}(x^{*})\end{array}\right. \qquad (2.3)

such that

\partial\phi_{0}(x^{*})\subseteq\sum_{i\in A(x^{*})}\lambda_{i}^{g}\partial g_{i}(x^{*})+\sum_{i=1}^{m}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i=1}^{p}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i=1}^{p}\lambda_{i}^{H}\nabla H_{i}(x^{*}).

Proof. For i=1,\cdots,p, let F_{i}(x):=|f_{i}(x)| and f_{i}(x):=\min\{G_{i}(x),H_{i}(x)\}. From the chain rule in Proposition 2.2(ii), we have \partial F_{i}(x^{*})\subseteq\{\partial(\mu f_{i})(x^{*}):\mu\in\partial|\cdot|(f_{i}(x^{*}))\}. We divide the analysis into three cases.

  • Case 1:

    i\in\mathcal{I}(x^{*}). In this case we have G_{i}(x^{*})<H_{i}(x^{*}) and hence f_{i}(x^{*})=G_{i}(x^{*}). By the chain rule we have \partial F_{i}(x^{*})=\{\nabla G_{i}(x^{*})\} if G_{i}(x^{*})>0 and \partial F_{i}(x^{*})=\{-\nabla G_{i}(x^{*})\} if G_{i}(x^{*})<0. If G_{i}(x^{*})=0, then \partial|\cdot|(f_{i}(x^{*}))=[-1,1] and hence \partial F_{i}(x^{*})\subseteq\{\mu\nabla G_{i}(x^{*}):\mu\in[-1,1]\}.

  • Case 2:

    i\in\mathcal{K}(x^{*}). Similarly to Case 1, we can show \partial F_{i}(x^{*})\subseteq\{\mu\nabla H_{i}(x^{*}):\mu\in[-1,1]\}.

  • Case 3:

    i\in\mathcal{J}(x^{*}). In this case, f_{i}(x^{*})=G_{i}(x^{*})=H_{i}(x^{*}). If G_{i}(x^{*})=H_{i}(x^{*})<0, then \partial|\cdot|(f_{i}(x^{*}))=\{-1\} and \partial F_{i}(x^{*})\subseteq\partial(-f_{i})(x^{*}). Since -f_{i}(x)=\max\{-G_{i}(x),-H_{i}(x)\}, by Proposition 2.2(iii), \partial F_{i}(x^{*})\subseteq{\rm co}\,\{-\nabla G_{i}(x^{*}),-\nabla H_{i}(x^{*})\}. If G_{i}(x^{*})=H_{i}(x^{*})>0, then \partial|\cdot|(f_{i}(x^{*}))=\{1\} and \partial F_{i}(x^{*})\subseteq\partial f_{i}(x^{*}). It follows by Proposition 2.2(iii) that \partial F_{i}(x^{*})\subseteq\{\nabla G_{i}(x^{*}),\nabla H_{i}(x^{*})\}. If G_{i}(x^{*})=H_{i}(x^{*})=0, then \partial|\cdot|(f_{i}(x^{*}))=[-1,1] and \partial F_{i}(x^{*})\subseteq\{\partial(\mu f_{i})(x^{*}):\mu\in[-1,1]\}. If \mu>0, then \partial(\mu f_{i})(x^{*})\subseteq\mu\partial f_{i}(x^{*})\subseteq\mu\{\nabla G_{i}(x^{*}),\nabla H_{i}(x^{*})\}. If \mu<0, then \partial(\mu f_{i})(x^{*})=-\mu\partial(-f_{i})(x^{*})=-\mu\,{\rm co}\,\{-\nabla G_{i}(x^{*}),-\nabla H_{i}(x^{*})\}. Hence

    \partial F_{i}(x^{*})\subseteq\{\lambda v:\lambda\geq 0,\ v\in{\rm co}\,\{-\nabla G_{i}(x^{*}),-\nabla H_{i}(x^{*})\}\cup\{\nabla G_{i}(x^{*}),\nabla H_{i}(x^{*})\}\}.

Summarizing the above cases, we have that for any x^{*},

\partial|\min\{G_{i}(x^{*}),H_{i}(x^{*})\}|\subseteq\left(\begin{array}{l}\nabla G(x^{*})\\ \nabla H(x^{*})\end{array}\right)^{T}(-\lambda^{G},-\lambda^{H}),

where (2.3) holds.

Since g_{i}(x), i=1,\cdots,n, |h_{i}(x)|, i=1,\cdots,m, and |\min\{G_{i}(x),H_{i}(x)\}|, i=1,\cdots,p, are all Lipschitz continuous around x^{*}, by the calculus rules in Proposition 2.2 and (2.3), there exist parameters \lambda_{i}^{g}\geq 0, i\in A(x^{*}), \lambda_{i}^{h}, i=1,\cdots,m, and \lambda_{i}^{G},\lambda_{i}^{H}, i=1,\cdots,p, satisfying (2.3) such that

\partial\phi_{0}(x^{*})\subseteq\sum_{i\in A(x^{*})}\lambda_{i}^{g}\partial g_{i}(x^{*})+\sum_{i=1}^{m}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i=1}^{p}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i=1}^{p}\lambda_{i}^{H}\nabla H_{i}(x^{*}).

 

We will need the following result which is an extension of Carathéodory’s lemma.

Lemma 2.2

[1, Lemma 1] If v=\sum_{i=1}^{m+n}\alpha_{i}v_{i} with v_{i}\in\mathbb{R}^{d} for every i, \{v_{i}\}_{i=1}^{m} is linearly independent and \alpha_{i}\neq 0, i=m+1,\cdots,m+n, then there exist I\subseteq\{m+1,\cdots,m+n\} and scalars \bar{\alpha}_{i} for every i\in\{1,\cdots,m\}\cup I such that
(i) v=\sum_{i\in\{1,\cdots,m\}\cup I}\bar{\alpha}_{i}v_{i} with \alpha_{i}\bar{\alpha}_{i}>0 for every i\in I,
(ii) \{v_{i}\}_{i\in\{1,\cdots,m\}\cup I} is linearly independent.
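A numerical sketch (ours, not taken from [1]) of the reduction behind this lemma may help: starting from the given representation, one repeatedly moves along a null-space direction of the active vectors until they become linearly independent, dropping one tail index at each step while preserving the signs of the remaining tail coefficients.

```python
import numpy as np

def caratheodory_reduce(V, alpha, m, tol=1e-10):
    """Sketch of the reduction behind Lemma 2.2.  V is a d x (m+n) array whose
    first m columns are assumed linearly independent, alpha the coefficients of
    v = V @ alpha with alpha[i] != 0 for i >= m.  Returns indices keep (containing
    0,...,m-1) and coefficients coeff with v = V[:, keep] @ coeff, V[:, keep] of
    full column rank, and the kept tail coefficients keeping their original sign."""
    coeff = {i: float(a) for i, a in enumerate(alpha)}
    while True:
        idx = sorted(coeff)
        A = V[:, idx]
        if np.linalg.matrix_rank(A, tol=tol) == len(idx):
            return idx, np.array([coeff[i] for i in idx])
        # a nonzero null-space direction of the active columns
        beta_vec = np.linalg.svd(A)[2][-1]
        beta = {i: beta_vec[j] for j, i in enumerate(idx)}
        tail = [i for i in idx if i >= m and abs(beta[i]) > tol]
        assert tail, "a null direction must involve a tail column"
        if not any(coeff[i] / beta[i] > 0 for i in tail):
            beta = {i: -b for i, b in beta.items()}
        ratios = {i: coeff[i] / beta[i] for i in tail if coeff[i] / beta[i] > 0}
        i_out, t = min(ratios.items(), key=lambda kv: kv[1])
        for i in idx:                 # shift along the null direction ...
            coeff[i] -= t * beta[i]
        del coeff[i_out]              # ... so that one tail coefficient vanishes

# tiny usage example: v1, v2 independent and v3 = v1 + v2, coefficients all 1
V = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]])
keep, coeff = caratheodory_reduce(V, [1.0, 1.0, 1.0], m=2)
print(keep, coeff, V[:, keep] @ coeff)   # v = (2, 2) is recovered
```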

3 RCPLD as a constraint qualification and a sufficient condition for error bounds

We first show that the RCPLD introduced in Definition 1.1 is a constraint qualification for any optimization problem of the form ({\rm P})\ \min_{x}\ f(x)\ {\rm s.t.}\ x\in\mathcal{F}, where f:\mathbb{R}^{d}\rightarrow\mathbb{R} is Lipschitz continuous around any local optimal solution and \mathcal{F} is the feasible region defined by the system (1.1).

Theorem 3.1

Let x^{*} be a local minimizer of the optimization problem ({\rm P}). Suppose that RCPLD holds at x^{*}. Then x^{*} is an M-stationary point, i.e., there exist \lambda_{i}^{g}\geq 0, i\in I_{g}^{*}, \lambda_{i}^{h}, i=1,\cdots,m, and \lambda_{i}^{G},\lambda_{i}^{H}, i=1,\cdots,p, such that

0\in\partial f(x^{*})+\sum_{i\in I_{g}^{*}}\lambda_{i}^{g}\partial g_{i}(x^{*})+\sum_{i=1}^{m}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i\in\mathcal{I}^{*}\cup\mathcal{J}^{*}}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i\in\mathcal{K}^{*}\cup\mathcal{J}^{*}}\lambda_{i}^{H}\nabla H_{i}(x^{*})+\mathcal{N}_{C}(x^{*}),

\lambda_{i}^{H}=0,\ i\in{\cal I}^{*},\qquad\lambda_{i}^{G}=0,\ i\in{\cal K}^{*},\qquad\mbox{either }\lambda_{i}^{G}>0,\ \lambda_{i}^{H}>0\mbox{ or }\lambda_{i}^{G}\lambda_{i}^{H}=0,\ i\in\mathcal{J}^{*}.

Proof. Step 1: Note that a point x\in C is feasible for problem ({\rm P}) if and only if

\phi_{0}(x):=\sum_{i=1}^{n}\max\{0,g_{i}(x)\}+\sum_{i=1}^{m}|h_{i}(x)|+\sum_{i=1}^{p}|\min\{G_{i}(x),H_{i}(x)\}|=0.

For each kk, we consider the following penalized problem:

({\rm P}_{k})\qquad\min\ F^{k}(x):=f(x)+\frac{k}{2}\phi_{0}^{2}(x)+\frac{1}{2}\|x-x^{*}\|^{2}\quad{\rm s.t.}\quad x\in\bar{\mathbb{B}}_{\varepsilon}(x^{*})\cap C,

where \|\cdot\| denotes the 2-norm and \varepsilon>0 is such that f(x^{*})\leq f(x) for every feasible point x\in\bar{\mathbb{B}}_{\varepsilon}(x^{*}). Since the feasible region is compact and the objective function is continuous, there exists an optimal solution x^{k} of problem ({\rm P}_{k}). Taking a subsequence if necessary, we assume that \lim_{k\to\infty}x^{k}=\bar{x} and thus \bar{x}\in\bar{\mathbb{B}}_{\varepsilon}(x^{*})\cap C. Moreover, by the optimality of x^{k}, we have

f(x^{k})+\frac{k}{2}\phi_{0}^{2}(x^{k})+\frac{1}{2}\|x^{k}-x^{*}\|^{2}=F^{k}(x^{k})\leq F^{k}(x^{*})=f(x^{*}). \qquad (3.1)

From the boundedness of \{f(x^{k})\}, we have that \phi_{0}(x^{k})\to 0 as k\to\infty. It follows that \bar{x} is a feasible point of problem ({\rm P}). Condition (3.1) yields f(x^{k})+\frac{1}{2}\|x^{k}-x^{*}\|^{2}\leq f(x^{*}). Taking the limit as k\to\infty, we obtain f(\bar{x})+\frac{1}{2}\|x^{*}-\bar{x}\|^{2}\leq f(x^{*}), which implies \bar{x}=x^{*}. Thus the sequence \{x^{k}\} converges to x^{*}.
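As an aside, and not as part of the proof, the localization device of Step 1 is easy to visualize numerically. The sketch below (a toy instance of our own, with f(x)=-x_{1}, g(x)=x_{1}\leq 0, h(x)=x_{2}=0, C=\mathbb{R}^{2} and x^{*}=(0,0), solved by a crude grid search rather than a real solver) shows the minimizers x^{k} of ({\rm P}_{k}) approaching x^{*} as k grows.

```python
import numpy as np

# Toy instance (ours): phi_0(x) = max(0, x_1) + |x_2|, and we minimize
# F_k(x) = -x_1 + (k/2) phi_0(x)^2 + (1/2)||x - x*||^2 over the closed ball of
# radius eps around x* = (0,0) by brute-force grid search.
eps = 1.0
grid = np.linspace(-eps, eps, 401)
X1, X2 = np.meshgrid(grid, grid)
inside = X1**2 + X2**2 <= eps**2

phi0 = np.maximum(0.0, X1) + np.abs(X2)
for k in [1, 10, 100, 1000]:
    Fk = -X1 + 0.5 * k * phi0**2 + 0.5 * (X1**2 + X2**2)
    Fk = np.where(inside, Fk, np.inf)
    i, j = np.unravel_index(np.argmin(Fk), Fk.shape)
    print(k, (X1[i, j], X2[i, j]))   # the grid minimizer approaches x* = (0, 0)
```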

Step 2: Since for sufficiently large k, x^{k} is an interior point of \bar{\mathbb{B}}_{\varepsilon}(x^{*}), by the necessary optimality condition in terms of the limiting subdifferential and the nonsmooth calculus rules, there exist v_{0}^{k}\in\partial f(x^{k}), u^{k}\in k\phi_{0}(x^{k})\partial\phi_{0}(x^{k})+\eta^{k} and \eta^{k}\in\mathcal{N}_{C}(x^{k}) such that

0=v_{0}^{k}+u^{k}+(x^{k}-x^{*}). \qquad (3.2)

Suppose first that there is a subsequence of \{u^{k}\} along which u^{k}=0. Since f is Lipschitz continuous, its limiting subdifferential is compact and so the sequence \{v_{0}^{k}\} is bounded. Taking a further subsequence if necessary, we may assume \lim_{k\rightarrow\infty}u^{k}=0 and \lim_{k\to\infty}v_{0}^{k}=v_{0}\in\partial f(x^{*}). Taking the limit as k\to\infty in (3.2), we have 0\in\partial f(x^{*}) by the outer semicontinuity of the limiting subdifferential, and thus x^{*} is automatically a stationary point of problem ({\rm P}). So without loss of generality, we may assume u^{k}\neq 0 for all sufficiently large k.

By Lemma 2.1, since k\phi_{0}(x^{k})\geq 0, there exist \lambda_{i,k}^{g}\geq 0, i\in A(x^{k}), \lambda_{i,k}^{h}, i=1,\cdots,m, v_{i}^{k}\in\partial g_{i}(x^{k}), i\in A(x^{k}), and \lambda_{i,k}^{G},\lambda_{i,k}^{H}, i=1,\cdots,p, such that either \lambda_{i,k}^{G}>0,\ \lambda_{i,k}^{H}>0 or \lambda_{i,k}^{G}\lambda_{i,k}^{H}=0 for all i\in{\cal J}(x^{k}) and

k\phi_{0}(x^{k})\partial\phi_{0}(x^{k})\subseteq\sum_{i\in A(x^{k})}\lambda_{i,k}^{g}v_{i}^{k}+\sum_{i=1}^{m}\lambda_{i,k}^{h}\nabla h_{i}(x^{k})-\sum_{i\in\mathcal{I}(x^{k})\cup\mathcal{J}(x^{k})}\lambda_{i,k}^{G}\nabla G_{i}(x^{k})-\sum_{i\in\mathcal{K}(x^{k})\cup\mathcal{J}(x^{k})}\lambda_{i,k}^{H}\nabla H_{i}(x^{k}).

By the continuity of g, it is easy to see that A(x^{k})\subseteq I_{g}^{*} for sufficiently large k. Similarly, by the continuity of G and H, \mathcal{I}(x^{k})\cup\mathcal{J}(x^{k})\subseteq\mathcal{I}^{*}\cup\mathcal{J}^{*} and \mathcal{K}(x^{k})\cup\mathcal{J}(x^{k})\subseteq\mathcal{K}^{*}\cup\mathcal{J}^{*} for sufficiently large k. Let \lambda_{i,k}^{g}=0 if i\in I_{g}^{*}\setminus A(x^{k}), \lambda_{i,k}^{G}=0 if i\in(\mathcal{I}^{*}\cup\mathcal{J}^{*})\setminus(\mathcal{I}(x^{k})\cup\mathcal{J}(x^{k})), and \lambda_{i,k}^{H}=0 if i\in(\mathcal{K}^{*}\cup\mathcal{J}^{*})\setminus(\mathcal{K}(x^{k})\cup\mathcal{J}(x^{k})). Then either \lambda_{i,k}^{G}>0,\ \lambda_{i,k}^{H}>0 or \lambda_{i,k}^{G}\lambda_{i,k}^{H}=0 for all i\in{\cal J}^{*}. Since u^{k}\in k\phi_{0}(x^{k})\partial\phi_{0}(x^{k})+\eta^{k}, it follows that

u^{k}=\sum_{i\in I_{g}^{*}}\lambda_{i,k}^{g}v_{i}^{k}+\sum_{i=1}^{m}\lambda_{i,k}^{h}\nabla h_{i}(x^{k})-\sum_{i\in\mathcal{I}^{*}\cup\mathcal{J}^{*}}\lambda_{i,k}^{G}\nabla G_{i}(x^{k})-\sum_{i\in\mathcal{K}^{*}\cup\mathcal{J}^{*}}\lambda_{i,k}^{H}\nabla H_{i}(x^{k})+\eta^{k}.

Since \eta^{k}:=(\eta^{k}_{1},\dots,\eta^{k}_{l})\in\mathcal{N}_{C}(x^{k})=\mathcal{N}_{C_{1}}(x^{k}_{1})\times\cdots\times\mathcal{N}_{C_{l}}(x^{k}_{l}), we have \nu_{i}^{k}:=\{0\}^{s_{i}}\times\{\eta_{i}^{k}\}\times\{0\}^{t_{i}}\in\mathcal{N}_{C}(x^{k}). We write \eta^{k}=\sum_{i\in L_{k}}\nu_{i}^{k} with L_{k}:=\{i=1,\cdots,l:\eta_{i}^{k}\not=0\}.

Let \mathcal{I}_{1}\subseteq\{1,\cdots,m\}, \mathcal{I}_{2}\subseteq\mathcal{I}^{*} and \mathcal{I}_{3}\subseteq\mathcal{K}^{*} be such that \{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}_{2}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{I}_{3}} is a basis for {\rm span}\,\{\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}}\}. Then by RCPLD (i), \{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}} is linearly independent and thus is a basis of {\rm span}\,\{\{\nabla h_{i}(x^{k})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{K}^{*}}\}. Hence there exist \tilde{\lambda}_{i,k}^{h}, i\in\mathcal{I}_{1}, and \tilde{\lambda}_{k}^{G},\tilde{\lambda}_{k}^{H} such that

u^{k}=\sum_{i\in I_{g}^{*}\cap{\rm supp}(\lambda_{k}^{g})}\lambda_{i,k}^{g}v_{i}^{k}+\sum_{i\in\mathcal{I}_{1}}\tilde{\lambda}_{i,k}^{h}\nabla h_{i}(x^{k})-\sum_{i\in\mathcal{I}_{2}}\tilde{\lambda}_{i,k}^{G}\nabla G_{i}(x^{k})-\sum_{i\in\mathcal{I}_{3}}\tilde{\lambda}_{i,k}^{H}\nabla H_{i}(x^{k})-\sum_{i\in\mathcal{J}^{*}\cap{\rm supp}(\tilde{\lambda}_{k}^{G})}\tilde{\lambda}_{i,k}^{G}\nabla G_{i}(x^{k})-\sum_{i\in\mathcal{J}^{*}\cap{\rm supp}(\tilde{\lambda}_{k}^{H})}\tilde{\lambda}_{i,k}^{H}\nabla H_{i}(x^{k})+\sum_{i\in L_{k}}\nu_{i}^{k}, \qquad (3.3)

where {\rm supp}(a):=\{i:a_{i}\neq 0\} and either \tilde{\lambda}_{i,k}^{G}>0,\ \tilde{\lambda}_{i,k}^{H}>0 or \tilde{\lambda}_{i,k}^{G}\tilde{\lambda}_{i,k}^{H}=0 if i\in\mathcal{J}^{*}.

Since u^{k}\not=0, applying the Carathéodory-type result of Lemma 2.2 to (3.3), we obtain subsets

\mathcal{I}_{4}^{k}\subseteq I_{g}^{*}\cap{\rm supp}(\lambda_{k}^{g}),\quad\mathcal{I}_{5}^{k}\subseteq\mathcal{J}^{*}\cap{\rm supp}(\tilde{\lambda}_{k}^{G}),\quad\mathcal{I}_{6}^{k}\subseteq\mathcal{J}^{*}\cap{\rm supp}(\tilde{\lambda}_{k}^{H}),\quad L^{\prime}_{k}\subseteq L_{k},

and \{\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\bar{\lambda}_{k}^{G},\bar{\lambda}_{k}^{H},\alpha^{k}\} with \alpha_{i}^{k}>0, i\in L^{\prime}_{k}, \bar{\lambda}_{i,k}^{g}>0, i\in\mathcal{I}_{4}^{k}, and either \bar{\lambda}_{i,k}^{G}>0,\ \bar{\lambda}_{i,k}^{H}>0 or \bar{\lambda}_{i,k}^{G}\bar{\lambda}_{i,k}^{H}=0 if i\in\mathcal{J}^{*}, such that

u^{k}=\sum_{i\in\mathcal{I}_{4}^{k}}\bar{\lambda}_{i,k}^{g}v_{i}^{k}+\sum_{i\in\mathcal{I}_{1}}\bar{\lambda}_{i,k}^{h}\nabla h_{i}(x^{k})-\sum_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}^{k}}\bar{\lambda}_{i,k}^{G}\nabla G_{i}(x^{k})-\sum_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}^{k}}\bar{\lambda}_{i,k}^{H}\nabla H_{i}(x^{k})+\sum_{i\in L^{\prime}_{k}}\alpha_{i}^{k}\nu_{i}^{k}, \qquad (3.4)

and the set of vectors \{v_{i}^{k}\}_{i\in\mathcal{I}_{4}^{k}}\cup\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}^{k}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}^{k}}\cup\{\nu_{i}^{k}\}_{i\in L^{\prime}_{k}} is linearly independent.

Since the index sets are all drawn from finitely many possibilities, we may assume without loss of generality that \mathcal{I}_{4}^{k}\equiv\mathcal{I}_{4}, L^{\prime}_{k}\equiv L, \mathcal{I}_{5}^{k}\equiv\mathcal{I}_{5} and \mathcal{I}_{6}^{k}\equiv\mathcal{I}_{6} for every large k. Hence the set of vectors \{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{k}\}_{i\in L} is linearly independent. From (3.4), condition (3.2) reduces to

0=v_{0}^{k}+(x^{k}-x^{*})+\sum_{i\in\mathcal{I}_{4}}\bar{\lambda}_{i,k}^{g}v_{i}^{k}+\sum_{i\in\mathcal{I}_{1}}\bar{\lambda}_{i,k}^{h}\nabla h_{i}(x^{k})-\sum_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\bar{\lambda}_{i,k}^{G}\nabla G_{i}(x^{k})-\sum_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\bar{\lambda}_{i,k}^{H}\nabla H_{i}(x^{k})+\sum_{i\in L}\alpha_{i}^{k}\nu_{i}^{k}, \qquad (3.5)

where \alpha_{i}^{k}>0, i\in L, \bar{\lambda}_{i,k}^{g}>0 if i\in\mathcal{I}_{4}, and either \bar{\lambda}_{i,k}^{G}>0,\ \bar{\lambda}_{i,k}^{H}>0 or \bar{\lambda}_{i,k}^{G}\bar{\lambda}_{i,k}^{H}=0 if i\in\mathcal{J}^{*}.

Step 3: We now prove that the sequence \{(\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\bar{\lambda}_{k}^{G},\bar{\lambda}_{k}^{H},\sum_{i\in L}\alpha_{i}^{k}\nu_{i}^{k})\} must be bounded. To the contrary, assume that it is unbounded. Let M_{k}:=\|(\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\bar{\lambda}_{k}^{G},\bar{\lambda}_{k}^{H})\|+\|\sum_{i\in L}\alpha_{i}^{k}\nu_{i}^{k}\|. Then there exists a subsequence K such that M_{k}\to\infty and

\lim_{k\to\infty,k\in K}\frac{(\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\bar{\lambda}_{k}^{G},\bar{\lambda}_{k}^{H},\sum_{i\in L}\alpha_{i}^{k}\nu_{i}^{k})}{M_{k}}=(\lambda^{g},\lambda^{h},\lambda^{G},\lambda^{H},\eta^{*}),

where \eta_{i}^{*}=\lim_{k\rightarrow\infty}\frac{\alpha_{i}^{k}}{M_{k}}\eta_{i}^{k} for i\in L and \eta_{i}^{*}=0 for i\notin L. Since \eta^{k}_{i}\in\mathcal{N}_{C_{i}}(x^{k}_{i}) and \alpha_{i}^{k}/M_{k}>0, it follows from the outer semicontinuity of the limiting normal cone that \eta^{*}_{i}\in\mathcal{N}_{C_{i}}(x^{*}_{i}) for each i\in L and hence \eta^{*}\in\mathcal{N}_{C}(x^{*}). It is easy to see that \lambda_{i}^{g}\geq 0 if i\in\mathcal{I}_{4}, and either \lambda_{i}^{G}>0,\ \lambda_{i}^{H}>0 or \lambda_{i}^{G}\lambda_{i}^{H}=0 for all i\in\mathcal{J}^{*}. Without loss of generality, assume that v_{i}^{k}\to v_{i}^{*} and v_{0}^{k}\to v_{0}^{*}. Then by the outer semicontinuity of the limiting subdifferential, we have v_{i}^{*}\in\partial g_{i}(x^{*}) and v_{0}^{*}\in\partial f(x^{*}). Dividing both sides of (3.5) by M_{k} and taking limits with k\rightarrow\infty, k\in K, we obtain

0=\sum_{i\in\mathcal{I}_{4}}\lambda_{i}^{g}v_{i}^{*}+\sum_{i\in\mathcal{I}_{1}}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\lambda_{i}^{H}\nabla H_{i}(x^{*})+\eta^{*}.

By RCPLD (ii), since the vectors \lambda^{g}_{\mathcal{I}_{4}},\lambda^{h}_{\mathcal{I}_{1}},\lambda^{G}_{\mathcal{I}_{2}\cup\mathcal{I}_{5}},\lambda^{H}_{\mathcal{I}_{3}\cup\mathcal{I}_{6}},\eta^{*} are not all equal to zero, x^{k}\rightarrow x^{*}, v_{i}^{k}\rightarrow v_{i}^{*}, and \frac{\alpha^{k}_{i}\eta^{k}_{i}}{M_{k}}\rightarrow\eta^{*}_{i}, i\in L, the set of vectors

\{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\cup\Big\{\frac{\alpha^{k}_{i}\nu^{k}_{i}}{M_{k}}\Big\}_{i\in L}

must be linearly dependent for all large k. Since \alpha_{i}^{k}/M_{k}>0 for each i\in L, this contradicts the conclusion of Step 2. The contradiction proves that \{(\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\bar{\lambda}_{k}^{G},\bar{\lambda}_{k}^{H},\sum_{i\in L}\alpha^{k}_{i}\nu_{i}^{k})\} is bounded.

Without loss of generality, assume that (\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\bar{\lambda}_{k}^{G},\bar{\lambda}_{k}^{H},\sum_{i\in L}\alpha^{k}_{i}\nu_{i}^{k})\to(\lambda^{g},\lambda^{h},\lambda^{G},\lambda^{H},\sum_{i\in L}\nu_{i}^{*}) as k\rightarrow\infty. It is easy to see that either \lambda_{i}^{G}>0,\ \lambda_{i}^{H}>0 or \lambda_{i}^{G}\lambda_{i}^{H}=0 for all i\in\mathcal{J}^{*} and that \lim_{k\to\infty}\sum_{i\in L}\alpha_{i}^{k}\nu_{i}^{k}=\sum_{i\in L}\nu_{i}^{*}\in\mathcal{N}_{C}(x^{*}). Taking limits along a subsequence in (3.5), we conclude that x^{*} is an M-stationary point of ({\rm P}).

We now turn to the task of proving that RCPLD is a sufficient condition for the error bound property under extra regularity conditions. First we prove the following result, which shows that RCPLD persists locally.

Proposition 3.1

Assume that RCPLD holds at a feasible point x^{*} of the system

g(x)\leq 0,\quad h(x)=0,\quad x\in C. \qquad (3.6)

Then RCPLD holds at every point belonging to a sufficiently small neighborhood of x^{*}.

Proof. Consider any sequence x^{k}\rightarrow x^{*} as k\to\infty. It is obvious that RCPLD (i) holds at each x^{k} when k is sufficiently large. Assume that \{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}} with \mathcal{I}_{1}\subseteq\{1,\cdots,m\} is a basis for {\rm span}\,\{\nabla h_{i}(x^{*})\}_{i=1}^{m}. Then \{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}} is a basis of {\rm span}\,\{\nabla h_{i}(x^{k})\}_{i=1}^{m} for all large k. We now show that RCPLD (ii) holds at x^{k} for all sufficiently large k. To the contrary, assume that RCPLD (ii) fails at each point of a subsequence \{x^{k}\}_{k\in K_{0}}. Then there exist an index set \mathcal{I}_{2}^{k}\subseteq I_{g}^{k}:=\{i=1,\cdots,n:g_{i}(x^{k})=0\}, v_{i}^{k}\in\partial g_{i}(x^{k}), i\in\mathcal{I}_{2}^{k}, and a nonzero vector (\lambda^{g}_{k},\lambda^{h}_{k},\eta^{k}) with \lambda^{g}_{\mathcal{I}_{2}^{k},k}\geq 0 and \eta^{k}:=(\eta^{k}_{1},\dots,\eta^{k}_{l})\in\mathcal{N}_{C}(x^{k}) such that

0=\sum_{i\in\mathcal{I}_{2}^{k}}\lambda_{i,k}^{g}v_{i}^{k}+\sum_{i\in\mathcal{I}_{1}}\lambda_{i,k}^{h}\nabla h_{i}(x^{k})+\eta^{k}, \qquad (3.7)

but the set of vectors \{v_{i}^{k,s}\}_{i\in\mathcal{I}_{2}^{k}}\cup\{\nabla h_{i}(y^{k,s})\}_{i\in\mathcal{I}_{1}}\cup\{\nu_{i}^{k,s}\}_{i\in L}, where L:=\{1,\dots,l:\eta_{i}^{k,s}\neq 0\}, is linearly independent for some sequences y^{k,s}\to x^{k} with \partial g_{i}(y^{k,s})\ni v_{i}^{k,s}\to v_{i}^{k}\ (i\in\mathcal{I}_{2}^{k}), \nu_{i}^{k,s}:=\{0\}^{s_{i}}\times\{\eta_{i}^{k,s}\}\times\{0\}^{t_{i}}, \eta_{i}^{k,s}\in\mathcal{N}_{C_{i}}(y_{i}^{k,s}), \eta_{i}^{k,s}\to\eta_{i}^{k} as s\to\infty. Since \mathcal{I}_{2}^{k}\subseteq I_{g}^{k}\subseteq I_{g}^{*} is drawn from a finite set, we may pass to a subsequence such that \mathcal{I}_{2}^{k}=\mathcal{I}_{2} for every large k\in K_{0}. Let M_{k}:=\|\lambda^{g}_{k}\|+\|\lambda_{k}^{h}\|+\|\eta^{k}\|. Suppose there exists a subsequence K\subseteq K_{0} such that

\lim_{k\to\infty,k\in K}\frac{(\lambda_{k}^{g},\lambda_{k}^{h},\eta^{k})}{M_{k}}=(\lambda^{g},\lambda^{h},\eta)

with \lambda_{i}^{g}\geq 0, i\in\mathcal{I}_{2}. Without loss of generality, assume that v_{i}^{k}\to v_{i}\in\partial g_{i}(x^{*}) and \lim_{k\to\infty,k\in K}\frac{\eta_{i}^{k}}{M_{k}}=\eta_{i}\in\mathcal{N}_{C_{i}}(x_{i}^{*}), i=1,\cdots,l. Since \eta_{i}^{k,s}\to\eta_{i}^{k} as s\to\infty, we also have \frac{\eta_{i}^{k,s}}{M_{k}}\to\eta_{i}, i\in L, as k,s\to\infty.

By a diagonalization argument, there exists a sequence \{z^{k}\} converging to x^{*} such that for each k, \bar{v}_{i}^{k}\in\partial g_{i}(z^{k}) with \bar{v}_{i}^{k}\to v_{i}^{k}, i=1,\cdots,n, \bar{\eta}^{k}_{i}\in\mathcal{N}_{C_{i}}(z_{i}^{k}) with \frac{\bar{\eta}_{i}^{k}}{M_{k}}\to\eta_{i}, i\in L, and \{\bar{v}_{i}^{k}\}_{i\in\mathcal{I}_{2}}\cup\{\nabla h_{i}(z^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\bar{\nu}_{i}^{k}\}_{i\in L} is linearly independent for all large k, where \bar{\nu}_{i}^{k}:=\{0\}^{s_{i}}\times\{\bar{\eta}_{i}^{k}\}\times\{0\}^{t_{i}}.

Dividing both sides of (3.7) by M_{k} and letting k\to\infty, k\in K, we have that \eta\in\mathcal{N}_{C}(x^{*}) and

0=\sum_{i\in\mathcal{I}_{2}}\lambda_{i}^{g}v_{i}+\sum_{i\in\mathcal{I}_{1}}\lambda_{i}^{h}\nabla h_{i}(x^{*})+\eta.

From RCPLD (ii), the set of vectors \{\bar{v}_{i}^{k}\}_{i\in\mathcal{I}_{2}}\cup\{\nabla h_{i}(z^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\frac{\bar{\nu}_{i}^{k}}{M_{k}}\}_{i\in L} must be linearly dependent, which is a contradiction. The contradiction shows that RCPLD holds at x^{k} for all sufficiently large k.

Theorem 3.2

Assume that RCPLD holds at a feasible point x^{*} of the system (3.6), that g_{i}(\cdot), i=1,\cdots,n, are subdifferentially regular around x^{*} and that C is Clarke regular around x^{*}. Then there exist \alpha>0 and \varepsilon>0 such that

d_{{\cal F}_{1}}(x)\leq\alpha(\|g_{+}(x)\|+\|h(x)\|),\quad\forall x\in\mathbb{B}_{\varepsilon}(x^{*})\cap C,

where {\cal F}_{1} denotes the set of feasible points of system (3.6).

Proof. If x^{*} is an interior point of {\cal F}_{1}, then d_{{\cal F}_{1}}(x)=0 for all x\in\mathbb{B}_{\varepsilon}(x^{*})\cap C when \varepsilon is small enough, and hence the result holds automatically. Now assume that x^{*} lies on the boundary of {\cal F}_{1}. Let \phi_{1}(x):=\|g_{+}(x)\|+\|h(x)\|. We rewrite the feasible set as {\cal F}_{1}:=\{x\in C:\phi_{1}(x)=0\}. Assume for a contradiction that there exists a sequence C\ni x^{k}\to x^{*} such that d_{{\cal F}_{1}}(x^{k})>k\phi_{1}(x^{k}).

Obviously x^{k}\notin{\cal F}_{1}. Let y^{k} be the projection of x^{k} onto {\cal F}_{1}. Then d_{{\cal F}_{1}}(x^{k})=\|y^{k}-x^{k}\|\not=0 and \lim_{k\to\infty}y^{k}=x^{*}. For each k, y^{k} is an optimal solution of the following problem:

({\rm P}^{\prime}_{k})\qquad\min\ F^{k}(x):=\|x-x^{k}\|\quad{\rm s.t.}\quad g(x)\leq 0,\ h(x)=0,\ x\in C,

where \|\cdot\| denotes the 2-norm. Since RCPLD persists in a neighborhood of x^{*} by Proposition 3.1, RCPLD holds at y^{k} for all sufficiently large k. From Theorem 3.1, y^{k} is a limiting stationary point of ({\rm P}^{\prime}_{k}). By the optimality condition, there exist parameters \lambda_{i,k}^{g}\geq 0 and v_{i}^{k}\in\partial g_{i}(y^{k}) for i\in I(y^{k}), \lambda_{i,k}^{h}, i=1,\cdots,m, and \eta^{k}\in\mathcal{N}_{C}(y^{k}) such that

0=\frac{y^{k}-x^{k}}{\|y^{k}-x^{k}\|}+\sum_{i\in I(y^{k})}\lambda_{i,k}^{g}v_{i}^{k}+\sum_{i=1}^{m}\lambda_{i,k}^{h}\nabla h_{i}(y^{k})+\eta^{k}. \qquad (3.8)

Let \nu_{i}^{k}:=\{0\}^{s_{i}}\times\{\eta_{i}^{k}\}\times\{0\}^{t_{i}}, so that \sum_{i\in L_{k}}\nu_{i}^{k}=\eta^{k}\in\mathcal{N}_{C}(y^{k}) with L_{k}:=\{1,\cdots,l:\eta_{i}^{k}\neq 0\}.

Assume that {hi(x)}i1\{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}} with 1{1,,m}\mathcal{I}_{1}\subseteq\{1,\cdots,m\} is a basis for span{hi(x)}i=1m\{\nabla h_{i}(x^{*})\}_{i=1}^{m}. From Lemma 2.2, we obtain 2kI(yk)supp(λi,kg)\mathcal{I}_{2}^{k}\subseteq I(y^{k})\cap{\rm supp}(\lambda_{i,k}^{g}) and LkLkL^{\prime}_{k}\subseteq L_{k} with {λ¯kg,λ¯kh,αk}\{\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\alpha^{k}\}, λ¯i,kg>0\bar{\lambda}_{i,k}^{g}>0 for i2ki\in\mathcal{I}_{2}^{k} and αik>0\alpha_{i}^{k}>0, iLki\in L^{\prime}_{k} such that

0=ykxkykxk+i2kλ¯i,kgvik+i1λ¯i,khhi(yk)+iLkαikνik\displaystyle 0=\frac{y^{k}-x^{k}}{\|y^{k}-x^{k}\|}+\sum_{i\in\mathcal{I}_{2}^{k}}\bar{\lambda}_{i,k}^{g}v_{i}^{k}+\sum_{i\in\mathcal{I}_{1}}\bar{\lambda}_{i,k}^{h}\nabla h_{i}(y^{k})+\sum_{i\in L^{\prime}_{k}}\alpha_{i}^{k}\nu_{i}^{k} (3.9)

where \xi^{k}:=\sum_{i\in L^{\prime}_{k}}\alpha_{i}^{k}\nu_{i}^{k}\in\mathcal{N}_{C}(y^{k}) and, by Theorem 3.1, the sequence of multipliers \{(\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h},\xi^{k})\} is bounded. Let M\geq\|(\bar{\lambda}_{k}^{g},\bar{\lambda}_{k}^{h})\|_{1} for all sufficiently large k.

From the subdifferential regularity of g_{i}(\cdot), i=1,\cdots,n, for all sufficiently large k,

gi(xk)gi(yk)vik,xkyk+14Mxkyk0.\displaystyle g_{i}(x^{k})-g_{i}(y^{k})-\langle v_{i}^{k},x^{k}-y^{k}\rangle+\frac{1}{4M}\|x^{k}-y^{k}\|\geq 0.

Similarly, ξk,xkyk14ykxk\langle\xi^{k},x^{k}-y^{k}\rangle\leq\frac{1}{4}\|y^{k}-x^{k}\|. Furthermore, for each i=1,,mi=1,\cdots,m,

hi(xk)=hi(yk)+hi(yk),xkyk+o(xkyk).\displaystyle h_{i}(x^{k})=h_{i}(y^{k})+\langle\nabla h_{i}(y^{k}),x^{k}-y^{k}\rangle+o(\|x^{k}-y^{k}\|).

Then from (3.9)(\ref{AKKT30}), we have

ykxk\displaystyle\|y^{k}-x^{k}\| =\displaystyle= i2kλ¯i,kgvik+i1λ¯i,khhi(yk)+ξk,xkyk\displaystyle\langle\sum_{i\in\mathcal{I}_{2}^{k}}\bar{\lambda}_{i,k}^{g}v_{i}^{k}+\sum_{i\in\mathcal{I}_{1}}\bar{\lambda}_{i,k}^{h}\nabla h_{i}(y^{k})+\xi^{k},x^{k}-y^{k}\rangle
\displaystyle\leq i2kλ¯i,kg(gi(xk)gi(yk))+i1λ¯i,kh(hi(xk)hi(yk))\displaystyle\sum_{i\in\mathcal{I}_{2}^{k}}\bar{\lambda}_{i,k}^{g}(g_{i}(x^{k})-g_{i}(y^{k}))+\sum_{i\in\mathcal{I}_{1}}\bar{\lambda}_{i,k}^{h}(h_{i}(x^{k})-h_{i}(y^{k}))
+14ykxk+(i2kλ¯i,kg+i1|λ¯i,kh|)14Mykxk\displaystyle+\frac{1}{4}\|y^{k}-x^{k}\|+(\sum_{i\in\mathcal{I}_{2}^{k}}\bar{\lambda}_{i,k}^{g}+\sum_{i\in\mathcal{I}_{1}}|\bar{\lambda}_{i,k}^{h}|)\frac{1}{4M}\|y^{k}-x^{k}\|
\displaystyle\leq i2kλ¯i,kggi(xk)+i1λ¯i,khhi(xk)+12ykxk.\displaystyle\sum_{i\in\mathcal{I}_{2}^{k}}\bar{\lambda}_{i,k}^{g}g_{i}(x^{k})+\sum_{i\in\mathcal{I}_{1}}\bar{\lambda}_{i,k}^{h}h_{i}(x^{k})+\frac{1}{2}\|y^{k}-x^{k}\|.

This means

d1(xk)=ykxk\displaystyle d_{{\cal F}_{1}}(x^{k})=\|y^{k}-x^{k}\| \displaystyle\leq 2M(i2kmax{0,gi(xk)}+i1|hi(xk)|)\displaystyle 2M(\sum_{i\in\mathcal{I}_{2}^{k}}\max\{0,g_{i}(x^{k})\}+\sum_{i\in\mathcal{I}_{1}}|h_{i}(x^{k})|)
\displaystyle\leq 2M(max{0,g(xk)}1+h(xk)1),\displaystyle 2M(\|\max\{0,g(x^{k})\}\|_{1}+\|h(x^{k})\|_{1}),

which contradicts d_{{\cal F}_{1}}(x^{k})>k\phi_{1}(x^{k}) for all sufficiently large k, by the equivalence of norms on finite dimensional spaces. Therefore the error bound property holds.
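To make the error bound above concrete, the following Python sketch (a toy verification of our own, not part of the paper's analysis) tests the inequality for the smooth system g(x)=x_{1}^{2}+x_{2}^{2}-1\leq 0, h(x)=x_{1}-x_{2}=0 with C=\mathbb{R}^{2} near the boundary point x^{*}=(\frac{\sqrt{2}}{2},\frac{\sqrt{2}}{2}). Here {\cal F}_{1} is the segment \{(t,t):|t|\leq\frac{\sqrt{2}}{2}\}, so d_{{\cal F}_{1}} can be computed exactly; the constant \alpha=1 is simply a value that works for this particular system.

import numpy as np

rng = np.random.default_rng(0)
r = np.sqrt(2.0) / 2.0
x_star = np.array([r, r])                      # boundary point of F_1

def dist_F1(x):
    # F_1 = {x : x1^2 + x2^2 <= 1, x1 = x2} is the segment {(t, t) : |t| <= r};
    # project onto the line x1 = x2 and clamp to the segment
    t = np.clip((x[0] + x[1]) / 2.0, -r, r)
    return np.linalg.norm(x - np.array([t, t]))

def residual(x):
    # ||g_+(x)|| + ||h(x)|| for g(x) = x1^2 + x2^2 - 1 and h(x) = x1 - x2
    return max(0.0, x[0] ** 2 + x[1] ** 2 - 1.0) + abs(x[0] - x[1])

alpha = 1.0                                    # a constant that works for this toy system
for _ in range(10000):
    x = x_star + 0.1 * rng.uniform(-1.0, 1.0, 2)   # sample points near x*
    assert dist_F1(x) <= alpha * residual(x) + 1e-12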

The error bound property for the general system (1.1) can now be obtained from Theorem 3.2. It extends [14, Theorem 5.1] by allowing nonsmooth inequality constraints and abstract constraints.

Corollary 3.1

Assume that RCPLD holds at x^{*}\in\mathcal{F}, that g_{i}(\cdot), i=1,\cdots,n, are subdifferentially regular, that C is Clarke regular around x^{*}, and that the constraint (G(x),H(x))\in\Omega^{p} satisfies the strict complementarity condition at x^{*}. Then there exist \alpha>0 and \varepsilon>0 such that

d(x)αϕ(x),x𝔹ε(x)C,\displaystyle d_{{\cal F}}(x)\leq\alpha\phi(x),\quad\forall x\in\mathbb{B}_{\varepsilon}(x^{*})\cap C, (3.10)

where ϕ(x):=g+(x)+h(x)+i|Gi(x)|+i𝒦|Hi(x)|\phi(x):=\|g_{+}(x)\|+\|h(x)\|+\sum_{i\in{\cal I}^{*}}|G_{i}(x)|+\sum_{i\in{\cal K}^{*}}|H_{i}(x)| and {\cal F} denotes the set of feasible points satisfying system (1.1).

Proof. Since the strict complementarity condition holds at x^{*}, we have \mathcal{J}^{*}=\emptyset. Hence every x\in{\cal F} sufficiently close to x^{*} can be represented as a solution of the system

gi(x)0,i=1,,n,hi(x)=0,i=1,,m,Gi(x)=0,i,Hi(x)=0,i𝒦,xC.g_{i}(x)\leq 0,\ i=1,\cdots,n,\\ h_{i}(x)=0,\ i=1,\cdots,m,G_{i}(x)=0,i\in\mathcal{I}^{*},H_{i}(x)=0,i\in\mathcal{K}^{*},x\in C.

Then the error bound property follows from Theorem 3.2 and the equivalence of norms on finite dimensional spaces.

4 Sufficient conditions for RCPLD

Note that although the RCPLD is a weak condition, it may not be easy to verify. In this section we investigate sufficient conditions for RCPLD which are stronger but easier to verify.

It is easy to see that the following well-known constraint qualification implies RCPLD.

Definition 4.1

Let x^{*}\in{\cal F}. We say that the no nonzero abnormal multiplier constraint qualification (NNAMCQ) holds at x^{*} if there is no nonzero vector (\lambda^{g},\lambda^{h},\lambda^{G},\lambda^{H},\eta^{*})\in\mathbb{R}^{n}\times\mathbb{R}^{m}\times\mathbb{R}^{p}\times\mathbb{R}^{p}\times\mathbb{R}^{d} with \lambda_{i}^{g}\geq 0 for i\in I_{g}^{*}, either \lambda_{i}^{G}>0,\lambda_{i}^{H}>0 or \lambda_{i}^{G}\lambda_{i}^{H}=0 for all i\in\mathcal{J}^{*}, and \eta^{*}\in\mathcal{N}_{C}(x^{*}), for which there exist v^{*}_{i}\in\partial g_{i}(x^{*}), i\in I_{g}^{*}, such that

0=iIgλigvi+i=1mλihhi(x)i𝒥λiGGi(x)i𝒦𝒥λiHHi(x)+η.\displaystyle 0=\sum_{i\in I_{g}^{*}}\lambda_{i}^{g}v_{i}^{*}+\sum_{i=1}^{m}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i\in\mathcal{I}^{*}\cup\mathcal{J}^{*}}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i\in\mathcal{K}^{*}\cup\mathcal{J}^{*}}\lambda_{i}^{H}\nabla H_{i}(x^{*})+\eta^{*}.

It is obvious that when the rank of {hi(x)}i=1m{Gi(x)}i{Hi(x)}i𝒦\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}} is equal to dd, RCPLD holds automatically. This condition is easy to verify and moreover it is not just a constraint qualification but also a sufficient condition for error bounds without imposing any regularity conditions.
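As an illustration of how this rank condition can be checked in practice, the following Python sketch stacks hypothetical gradients \nabla h_{i}(x^{*}), \nabla G_{i}(x^{*}) (i\in\mathcal{I}^{*}) and \nabla H_{i}(x^{*}) (i\in\mathcal{K}^{*}) (the numerical values are our own, chosen only for illustration) as rows of a matrix and tests whether its rank equals d.

import numpy as np

d = 3
# hypothetical gradients at x* (each gradient is a row vector in R^d)
grad_h = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 1.0]])        # equality constraints h_1, h_2
grad_G = np.array([[0.0, 0.0, 1.0]])        # G_i, i in I*
grad_H = np.array([[1.0, 1.0, 0.0]])        # H_i, i in K*

A = np.vstack([grad_h, grad_G, grad_H])
# rank d means that RCPLD and the error bound of Theorem 4.1 below hold at x*
print(np.linalg.matrix_rank(A) == d)        # True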

Theorem 4.1

For a feasible point xx^{*}\in\mathcal{F}, suppose that the rank of {{hi(x)}i=1m{Gi(x)}i{Hi(x)}i𝒦}\{\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}}\} is equal to dd. Then the error bound property (1.3) holds at xx^{*}.

Proof. Around the point xx^{*}, we can equivalently formulate the complementarity system (G(x),H(x))Ωp(G(x),H(x))\in\Omega^{p} as

Gi(x)=0,i,Hi(x)=0,i𝒦,(Gi(x),Hi(x))Ω,i𝒥.G_{i}(x)=0,i\in\mathcal{I}^{*},H_{i}(x)=0,i\in\mathcal{K}^{*},(G_{i}(x),H_{i}(x))\in\Omega,i\in\mathcal{J}^{*}.

Hence the constraints Gi(x)=0,i,Hi(x)=0,i𝒦G_{i}(x)=0,i\in\mathcal{I}^{*},H_{i}(x)=0,i\in\mathcal{K}^{*} can be treated as equality constraints. We denote by ϕ(x):=max{0,gi(x),i=1,,n,|hi(x)|,i=1,,m,|Gi(x)|,i,|Hi(x)|,i𝒦,dΩ(Gi(x),Hi(x)),i𝒥}\phi(x):=\max\{0,g_{i}(x),i=1,\cdots,n,|h_{i}(x)|,i=1,\cdots,m,\ |G_{i}(x)|,i\in\mathcal{I}^{*},|H_{i}(x)|,i\in\mathcal{K}^{*},\ d_{\Omega}(G_{i}(x),H_{i}(x)),i\in\mathcal{J}^{*}\}.

If x^{*} is an interior point of {\cal F}, then for \varepsilon>0 small enough d_{{\cal F}}(x)=0 for all x\in\mathbb{B}_{\varepsilon}(x^{*})\cap C and hence the result holds automatically. Now assume that x^{*} lies on the boundary of {\cal F}. We rewrite the feasible set as {\cal F}:=\{x\in C:\phi(x)=0\}. To the contrary, assume that there exists C\ni x^{k}\to x^{*} such that

d(xk)>kϕ(xk).\displaystyle d_{{\cal F}}(x^{k})>k\phi(x^{k}). (4.1)

For any i=1,,mi=1,\cdots,m, hi(x)=0h_{i}(x^{*})=0 and by the Taylor expansion,

hi(xk)=hi(x)+hi(x),xkx+o(xkx).h_{i}(x^{k})=h_{i}(x^{*})+\langle\nabla h_{i}(x^{*}),x^{k}-x^{*}\rangle+o(\|x^{k}-x^{*}\|). (4.2)

From (4.1), \|x^{*}-x^{k}\|\geq d_{{\cal F}}(x^{k})>k\phi(x^{k})\geq k|h_{i}(x^{k})|, which implies that \lim_{k\to\infty}\frac{h_{i}(x^{k})}{\|x^{*}-x^{k}\|}=0. Taking a subsequence if necessary, let d^{*}:=\lim_{k\to\infty}\frac{x^{k}-x^{*}}{\|x^{k}-x^{*}\|}. Dividing both sides of (4.2) by \|x^{k}-x^{*}\| and letting k\to\infty, we have \langle\nabla h_{i}(x^{*}),d^{*}\rangle=0. Similarly, \langle\nabla G_{i}(x^{*}),d^{*}\rangle=0 for all i\in\mathcal{I}^{*} and \langle\nabla H_{i}(x^{*}),d^{*}\rangle=0 for all i\in\mathcal{K}^{*}. This means that the unit vector d^{*} is orthogonal to every vector in {\rm span}\{\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}}\}. Since by assumption the rank of this family of vectors is d, its span is all of \mathbb{R}^{d}, which forces d^{*}=0, a contradiction. The proof of the theorem is therefore complete.

In the following definition we extend the well-known concept of RCRCQ (see [19] for smooth equality and inequality systems and [11, Definition 3.4] for systems with complementarity constraints) to our general system (1.1). RCRCQ is stronger than RCPLD but may be easier to verify.

Definition 4.2

Let x^{*}\in{\cal F}. We say that the relaxed constant rank constraint qualification (RCRCQ) holds at x^{*} if for any index sets \mathcal{I}_{4}\subseteq I_{g}^{*}, \mathcal{I}_{5}\subseteq\mathcal{J}^{*}, \mathcal{I}_{6}\subseteq\mathcal{J}^{*}, \mathcal{L}\subseteq\{1,\cdots,l\} and any vectors v^{*}_{i}\in\partial g_{i}(x^{*}) (i\in\mathcal{I}_{4}), \eta^{*}_{i}\in\mathcal{N}_{C_{i}}(x^{*}_{i}) (i\in\mathcal{L}), \nu_{i}^{*}:=\{0\}^{s_{i}}\times\{\eta_{i}^{*}\}\times\{0\}^{t_{i}} with s_{i}:=q_{1}+\cdots+q_{i-1} and t_{i}:=q_{i+1}+\cdots+q_{l}, the set of vectors

{vi}i4{hi(x)}i=1m{Gi(x)}i5{Hi(x)}i𝒦6{νi}i,\{v_{i}^{*}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{*}\}_{i\in\mathcal{L}},

and the set of vectors

{vik}i4{hi(xk)}i=1m{Gi(xk)}i5{Hi(xk)}i𝒦6{νik}i,\{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}^{*}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{K}^{*}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{k}\}_{i\in\mathcal{L}},

where x^{k}\neq x^{*} and \nu_{i}^{k}:=\{0\}^{s_{i}}\times\{\eta_{i}^{k}\}\times\{0\}^{t_{i}}, have the same rank for all sufficiently large k and all sequences \{x^{k}\},\{v^{k}\},\{\eta^{k}\} satisfying x^{k}\rightarrow x^{*}, v^{k}\rightarrow v^{*}, \eta^{k}\rightarrow\eta^{*} as k\rightarrow\infty, with v_{i}^{k}\in\partial g_{i}(x^{k}) and \eta_{i}^{k}\in\mathcal{N}_{C_{i}}(x_{i}^{k}).

Proposition 4.1

RCRCQ implies RCPLD.

Proof. Let x^{*}\in{\cal F} satisfy RCRCQ. Taking \mathcal{I}_{4}=\mathcal{I}_{5}=\mathcal{I}_{6}=\mathcal{L}=\emptyset, we see that RCPLD (i) holds. We now show that RCPLD (ii) holds at x^{*}. Let \mathcal{I}_{1}\subseteq\{1,\cdots,m\}, \mathcal{I}_{2}\subseteq\mathcal{I}^{*}, \mathcal{I}_{3}\subseteq\mathcal{K}^{*} be such that the set of vectors \{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}_{2}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{I}_{3}} is a basis for {\rm span}\{\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}}\}.

Let 4Ig\mathcal{I}_{4}\subseteq I_{g}^{*}, 5,6𝒥\mathcal{I}_{5},\mathcal{I}_{6}\subseteq\mathcal{J}^{*} be the index sets such that there exists a nonzero vector (λg,λh,λG,λH,η)(\lambda^{g},\lambda^{h},\lambda^{G},\lambda^{H},\eta^{*}) satisfying λig0\lambda_{i}^{g}\geq 0 for i4i\in\mathcal{I}_{4}, and either λiG>0,λiH>0 or λiGλiH=0,i𝒥\mbox{either }\lambda_{i}^{G}>0,\lambda_{i}^{H}>0\mbox{ or }\lambda_{i}^{G}\lambda_{i}^{H}=0,\forall i\in\mathcal{J}^{*}, η=(η1,,ηl)𝒩C(x){\eta^{*}=(\eta_{1}^{*},\cdots,\eta_{l}^{*})}\in\mathcal{N}_{C}(x^{*}), vigi(x)v^{*}_{i}\in\partial g_{i}(x^{*}) for i4i\in{\cal I}_{4} satisfying

0=i4λigvi+i1λihhi(x)i25λiGGi(x)i36λiHHi(x)+η.\displaystyle 0=\sum_{i\in\mathcal{I}_{4}}\lambda_{i}^{g}v_{i}^{*}+\sum_{i\in\mathcal{I}_{1}}\lambda_{i}^{h}\nabla h_{i}(x^{*})-\sum_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\lambda_{i}^{G}\nabla G_{i}(x^{*})-\sum_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\lambda_{i}^{H}\nabla H_{i}(x^{*})+\eta^{*}.

Then the vectors

{vi}i4{hi(x)}i1{Gi(x)}i25{Hi(x)}i36{νi}iL,\displaystyle\{v_{i}^{*}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{*}\}_{i\in L},

where νi:={0}si×{ηi}×{0}ti\nu_{i}^{*}:=\{0\}^{s_{i}}\times\{\eta_{i}^{*}\}\times\{0\}^{t_{i}} and L:={1,,l:ηi0}L:=\{1,\dots,l:\eta_{i}^{*}\neq 0\}, is linearly dependent. Since {hi(x)}i1{Gi(x)}i2{Hi(x)}i3\{\nabla h_{i}(x^{*})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}_{2}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{I}_{3}} is a basis for  span {{hi(x)}i=1m{Gi(x)}i{Hi(x)}i𝒦},\mbox{ span }\large\{\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}}\large\}, it follows that

{vi}i4{hi(x)}i=1m{Gi(x)}i5{Hi(x)}i𝒦6{νi}iL\displaystyle\{v_{i}^{*}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{*})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{*})\}_{i\in\mathcal{I}^{*}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{*})\}_{i\in\mathcal{K}^{*}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{*}\}_{i\in L}

is linearly dependent as well. By RCRCQ, it follows that

{vik}i4{hi(xk)}i=1m{Gi(xk)}i5{Hi(xk)}i𝒦6{νik}iL\displaystyle\{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}^{*}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{K}^{*}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{k}\}_{i\in L}

is linearly dependent for all sequences {xk},{vk},{ηk}\{x^{k}\},\{v^{k}\},\{\eta^{k}\} satisfying xkxx^{k}\rightarrow x^{*}, vkvv^{k}\rightarrow v^{*}, ηkη\eta^{k}\rightarrow\eta^{*}, vikgi(xk)v_{i}^{k}\in\partial g_{i}(x^{k}), ηik𝒩Ci(xik)\eta_{i}^{k}\in\mathcal{N}_{C_{i}}(x_{i}^{k}), νik:={0}si×{ηik}×{0}ti\nu_{i}^{k}:=\{0\}^{s_{i}}\times\{\eta_{i}^{k}\}\times\{0\}^{t_{i}}. From RCPLD (i), {hi(xk)}i1{Gi(xk)}i2{Hi(xk)}i3\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}} is a basis of  span {{hi(xk)}i=1m{Gi(xk)}i{Hi(xk)}i𝒦}\mbox{ span }\{\{\nabla h_{i}(x^{k})\}_{i=1}^{m}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}^{*}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{K}^{*}}\} and thus

{vik}i4{hi(xk)}i1{Gi(xk)}i25{Hi(xk)}i36{νik}iL\displaystyle\{v_{i}^{k}\}_{i\in\mathcal{I}_{4}}\cup\{\nabla h_{i}(x^{k})\}_{i\in\mathcal{I}_{1}}\cup\{\nabla G_{i}(x^{k})\}_{i\in\mathcal{I}_{2}\cup\mathcal{I}_{5}}\cup\{\nabla H_{i}(x^{k})\}_{i\in\mathcal{I}_{3}\cup\mathcal{I}_{6}}\cup\{\nu_{i}^{k}\}_{i\in L}

is linearly dependent. Since L\subseteq L_{k}:=\{i\in\{1,\dots,l\}:\eta_{i}^{k}\neq 0\} for all sufficiently large k, RCPLD (ii) holds.

Definition 4.3

We say that the linear constraint qualification (LCQ) holds if all functions g_{i},h_{i},G_{i},H_{i} are affine and the set C is the union of finitely many polyhedral convex sets.

We now show that LCQ implies RCRCQ.

Proposition 4.2

LCQ implies that RCRCQ holds at each x^{*}\in{\cal F}.

Proof. Since C is the union of finitely many polyhedral convex sets, for any x^{k}\rightarrow x^{*} and any \eta^{k}\in\mathcal{N}_{C}(x^{k}) with \eta^{k}\rightarrow\eta^{*}, we have \eta^{k}=\eta^{*} for all sufficiently large k. Moreover, since all functions are affine, their gradients are constant. Hence the matrix J^{\prime}(x,v_{\mathcal{I}_{4}},\nu_{\mathcal{L}}):=\left[\begin{array}[]{ccccc}v_{\mathcal{I}_{4}}&\nabla h(x)^{T}&\nabla G_{{\cal I}^{*}\cup{\cal I}_{5}}(x)^{T}&\nabla H_{{\cal K}^{*}\cup{\cal I}_{6}}(x)^{T}&\nu_{\mathcal{L}}\end{array}\right] is a constant matrix for all x, v_{\mathcal{I}_{4}}=\nabla g_{\mathcal{I}_{4}}(x), \eta\in\{\eta^{k},\eta^{*}\} and \nu_{i}:=\{0\}^{s_{i}}\times\{\eta_{i}\}\times\{0\}^{t_{i}}, i\in\mathcal{L}, and RCRCQ holds.

Example 4.1

Consider the nonsmooth system:

h(x):=2x1+x2=0,\displaystyle h(x):=2x_{1}+x_{2}=0,
g(x):=x1+x2max{12,x3}x4+10,\displaystyle g(x):=x_{1}+x_{2}-\max\{\frac{1}{2},x_{3}\}-x_{4}+1\leq 0,
xC:=C1×C2,\displaystyle x\in C:=C_{1}\times C_{2},

where C_{1} is the graph of a continuous function \varphi:[-1,1]\to\mathbb{R} as shown in Figure 1 and C_{2}:=\{(t,1-t):t\in[0,1]\}. We set \varphi(x)=0 for x=0,2^{-n},-2^{-n}, n=0,1,2,\cdots. Between any two points of the form 2^{-n-1} and 2^{-n}, or -2^{-n} and -2^{-n-1}, the graph of \varphi consists of the two equal sides of an isosceles triangle whose apex is located at (2^{-n-2}+2^{-n-1},1) or (-2^{-n-2}-2^{-n-1},1), respectively. Consider the feasible solution x^{*}=(x^{*}_{1},x^{*}_{2},x^{*}_{3},x^{*}_{4})=(0,0,\frac{1}{2},\frac{1}{2}).

Figure 1: Feasible region C_{1}

We have h(x)=(2,1,0,0)\nabla h(x)=(2,1,0,0),

g(x)={(1,1,1,1)ifx3>12,{(1,1,1,1),(1,1,0,1)}ifx3=12,(1,1,0,1)ifx3<12.\partial g(x)=\left\{\begin{array}[]{ll}(1,1,-1,-1)&{\rm if}\ x_{3}>\frac{1}{2},\\ \{(1,1,-1,-1),(1,1,0,-1)\}&{\rm if}\ x_{3}=\frac{1}{2},\\ (1,1,0,-1)&{\rm if}\ x_{3}<\frac{1}{2}.\end{array}\right.

Since C_{1} is symmetric, we only give the expression of the limiting normal cone for the case (x_{1},x_{2})\in C_{1} with x_{1}\geq 0; for n=0,1,2,\cdots,

\mathcal{N}_{C_{1}}(x_{1},x_{2})=\left\{(\eta_{1},\eta_{2}):\begin{array}[]{ll}
\eta_{1}=0,\ \eta_{2}\in\mathbb{R} & {\rm if}\ x_{1}\in(2^{-n-1},2^{-n}),\ x_{2}=0,\\
\eta_{2}=-2^{-n-2}\eta_{1},\ \eta_{1}\in\mathbb{R} & {\rm if}\ x_{1}\in(2^{-n-1},2^{-n-2}+2^{-n-1}),\ x_{2}>0,\\
\eta_{2}=2^{-n-2}\eta_{1},\ \eta_{1}\in\mathbb{R} & {\rm if}\ x_{1}\in(2^{-n-2}+2^{-n-1},2^{-n}),\ x_{2}>0,\\
\eta_{2}=\alpha\eta_{1}\ {\rm with}\ \alpha=\pm 2^{-n-2},\ {\rm or}\ \eta_{2}>0\ {\rm with}\ \alpha\in(-\infty,-2^{-n-2})\cup(2^{-n-2},+\infty) & {\rm if}\ x_{1}=2^{-n-2}+2^{-n-1},\ x_{2}=1,\\
\eta_{1}=0,\ {\rm or}\ \eta_{2}=\alpha\eta_{1}\ {\rm with}\ \alpha=-2^{-n-2},2^{-n-3} & {\rm if}\ x_{1}=2^{-n-1},\ x_{2}=0,\\
\eta_{1}\eta_{2}=0 & {\rm if}\ x_{1}=0,\ x_{2}=0
\end{array}\right\}

and

𝒩C2(x3,x4)={(η3,η4):η4η3,ifx3=0,x4=1η3=η4,ifx3(0,1),x4=1x3η3η4,ifx3=1,x4=0}.\displaystyle\mathcal{N}_{C_{2}}(x_{3},x_{4})=\left\{(\eta_{3},\eta_{4}):\begin{array}[]{ll}\eta_{4}\geq\eta_{3},&{\rm if}\ x_{3}=0,x_{4}=1\\ \eta_{3}=\eta_{4},&{\rm if}\ x_{3}\in(0,1),x_{4}=1-x_{3}\\ \eta_{3}\geq\eta_{4},&{\rm if}\ x_{3}=1,x_{4}=0\end{array}\right\}.

Also we have \mathcal{N}_{C}(x)=\mathcal{N}_{C_{1}}(x_{1},x_{2})\times\mathcal{N}_{C_{2}}(x_{3},x_{4}). Hence \mathcal{N}_{C}(x^{*})=\{(\eta_{1},\eta_{2}):\eta_{1}\eta_{2}=0\}\times\{(\alpha,\alpha):\alpha\in\mathbb{R}\}. Let \eta^{*}=(0,0,0,0), which is an element of the normal cone \mathcal{N}_{C}(x^{*}), and let v^{*}=(1,1,-1,-1)\in\partial g(x^{*}). For any x^{k}\to x^{*} with x_{1}^{k}=x_{2}^{k}=0, x_{3}^{k}>\frac{1}{2}, v^{k}=(1,1,-1,-1)\in\partial g(x^{k}) and \eta^{k}=(0,0,\alpha_{k},\alpha_{k})\in{\cal N}_{C}(x^{k}) with \alpha_{k}\rightarrow 0, we have

J(xk,vk,ηk):=[vkh(xk)ηk]=[12011010αk10αk].J^{\prime}(x^{k},v^{k},\eta^{k}):=\left[\begin{array}[]{ccc}v^{k}&\nabla h(x^{k})&\eta^{k}\end{array}\right]=\left[\begin{array}[]{ccc}1&2&0\\ 1&1&0\\ -1&0&\alpha_{k}\\ -1&0&\alpha_{k}\end{array}\right].

We have r(J^{\prime}(x^{k},v^{k},\eta^{k}))=3 whenever \alpha_{k}\neq 0, while r(J^{\prime}(x^{*},v^{*},\eta^{*}))=2. This shows that RCRCQ fails at x^{*}.

Since the condition

0=λv+μh(x)+η\displaystyle 0=\lambda v^{*}+\mu\nabla h({x^{*}})+\eta^{*} (4.5)

holds with \eta^{*}=(1,0,1,1)\in\mathcal{N}_{C}(x^{*}), v^{*}=(1,1,-1,-1)\in\partial g(x^{*}) and \lambda=1=-\mu, the NNAMCQ fails at x^{*}.

We now verify that RCPLD holds. Actually it is easy to see that

0=λv+μh(x)+η,vg(x),η𝒩C(x)0=\lambda v^{*}+\mu\nabla h(x^{*})+\eta^{*},\quad v^{*}\in\partial g(x^{*}),\quad\eta^{*}\in\mathcal{N}_{C}(x^{*})

holds with \lambda\geq 0 and (\lambda,\mu,\eta^{*}) not all equal to zero if and only if v^{*}=(1,1,-1,-1) and either (a) or (b) below holds:

  • (a)

    λ=μ\lambda=-\mu, η=μ(1,0,1,1){\eta}^{*}=\mu(-1,0,-1,-1).

  • (b)

\lambda=-2\mu, {\eta}^{*}=2\mu(0,\frac{1}{2},-1,-1).

In either case (a) or (b), the index set L=\{1,2\}. For any x^{k}\to x^{*}=(0,0,\frac{1}{2},\frac{1}{2}), denote \eta^{k}:=(\eta_{1}^{k},\eta_{2}^{k},\alpha_{k},\alpha_{k}), where (\eta_{1}^{k},\eta_{2}^{k})\in\mathcal{N}_{C_{1}}(x_{1}^{k},x_{2}^{k}), (\alpha_{k},\alpha_{k})\in\mathcal{N}_{C_{2}}(x_{3}^{k},x_{4}^{k}) and \eta^{k}\rightarrow\eta^{*}, and set \nu_{1}^{k}:=(\eta_{1}^{k},\eta_{2}^{k},0,0), \nu_{2}^{k}:=(0,0,\alpha_{k},\alpha_{k}). Note that for any x^{k}\rightarrow x^{*} and v^{k}\rightarrow v^{*} with v^{k}\in\partial g(x^{k}), we must have v^{k}=(1,1,-1,-1) for all k large enough, since the only other possible subgradient (1,1,0,-1) is bounded away from v^{*}. Since the matrix

J(xk,vk,νLk)=[12η1k011η2k0100αk100αk]J^{\prime}(x^{k},v^{k},\nu_{L}^{k})=\left[\begin{array}[]{cccc}1&2&\eta_{1}^{k}&0\\ 1&1&\eta_{2}^{k}&0\\ -1&0&0&\alpha_{k}\\ -1&0&0&\alpha_{k}\end{array}\right]

does not have full column rank, the vectors \nabla h(x^{k}), v^{k}, \nu_{1}^{k}, \nu_{2}^{k} are always linearly dependent and thus RCPLD holds at x^{*}.

Note that if αkη1k+2η2k0\alpha_{k}-\eta_{1}^{k}+2\eta_{2}^{k}\neq 0 for large kk, it is easy to see that the rank of the matrix

[12η1k11η2k10αk10αk]\left[\begin{array}[]{ccc}1&2&\eta_{1}^{k}\\ 1&1&\eta_{2}^{k}\\ -1&0&\alpha_{k}\\ -1&0&\alpha_{k}\end{array}\right]

equals 3 and thus the vectors \nabla h(x^{k}), v^{k}, \eta^{k} are linearly independent. Hence, as pointed out in Remark 1.1, our definition of RCPLD is weaker than the condition that the vectors \nabla h(x^{k}), v^{k}, \eta^{k} be linearly dependent.
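The two rank observations made in this example are immediate to confirm numerically. A minimal Python sketch (the sample values of \alpha_{k}, \eta_{1}^{k}, \eta_{2}^{k} are our own, chosen consistently with the normal cones computed above):

import numpy as np

v_k = np.array([1.0, 1.0, -1.0, -1.0])            # v^k in the subdifferential of g at x^k
grad_h = np.array([2.0, 1.0, 0.0, 0.0])           # gradient of h
alpha_k = 1e-3                                    # a nonzero entry of N_{C_2}(x_3^k, x_4^k)

# RCRCQ test with x_1^k = x_2^k = 0: eta^k = (0, 0, alpha_k, alpha_k)
eta_k = np.array([0.0, 0.0, alpha_k, alpha_k])
print(np.linalg.matrix_rank(np.column_stack([v_k, grad_h, eta_k])))        # 3
print(np.linalg.matrix_rank(np.column_stack([v_k, grad_h, np.zeros(4)])))  # 2 at x*, so RCRCQ fails

# RCPLD test: splitting eta^k into nu_1^k and nu_2^k always gives rank deficiency,
# because the last two rows of the resulting matrix coincide
eta1_k, eta2_k = 5e-4, 0.0                        # any values with eta1_k * eta2_k = 0
nu1_k = np.array([eta1_k, eta2_k, 0.0, 0.0])
nu2_k = np.array([0.0, 0.0, alpha_k, alpha_k])
J = np.column_stack([v_k, grad_h, nu1_k, nu2_k])
print(np.linalg.matrix_rank(J) < 4)               # True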

5 Applications to bilevel programs

In this section we apply RCPLD to the combined program (CP). Throughout this section we assume that the value function V(x) is Lipschitz continuous at the point of interest. For the case where Y(x)=Y is independent of x, one can use Danskin's theorem, and for the general case one can use Proposition 5.2, which is a special case of [4, Theorem 6.5.2]. Other, weaker sufficient conditions for Lipschitz continuity of the value function, as well as estimates for its subdifferentials, can be found, for example, in [12].

Proposition 5.1 (Danskin’s Theorem)

([5, page 99] or [6]) Let YsY\subseteq\mathbb{R}^{s} be a compact set and f(x,y)f(x,y) be a function defined on d×s\mathbb{R}^{d}\times\mathbb{R}^{s} that is continuously differentiable at xx^{*}. Then the value function V(x):=min{f(x,y):yY}V(x):=\min\{f(x,y):y\in Y\} is Lipschitz continuous near xx^{*} and its Clarke subdifferential is cV(x)=co{xf(x,y):yS(x)}\partial^{c}V(x^{*})=co\{\nabla_{x}f(x^{*},y):y\in S(x^{*})\}, where cV(x)=coV(x)\partial^{c}V(x^{*})=co\partial V(x^{*}) is the Clarke subdifferential of VV at xx^{*}.
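The following Python sketch illustrates Danskin's formula on a toy problem of our own choosing (not taken from the paper): f(x,y)=xy on the compact set Y=\{-1,1\}, for which V(x)=-|x|, S(0)=\{-1,1\} and \partial^{c}V(0)=co\{\nabla_{x}f(0,y):y\in S(0)\}=[-1,1].

import numpy as np

Y = np.array([-1.0, 1.0])                 # a compact (here finite) set

def f(x, y):
    return x * y

def V(x):
    return np.min(f(x, Y))                # value function V(x) = min_{y in Y} f(x, y)

def S(x, tol=1e-12):
    vals = f(x, Y)
    return Y[vals <= vals.min() + tol]    # lower-level solution set S(x)

# V(x) = -|x| is Lipschitz but not differentiable at x = 0; by Danskin's theorem
# its Clarke subdifferential at 0 is co{ grad_x f(0, y) : y in S(0) } = co{-1, 1} = [-1, 1].
print(V(0.0))                             # 0.0
print(np.sort(S(0.0)))                    # [-1.  1.]; since grad_x f(0, y) = y, these are the
                                          # endpoints of the Clarke subdifferential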

Proposition 5.2

Assume that the set-valued map Y(x) is uniformly bounded around x^{*}, i.e., there exists a neighborhood \mathbb{U} of x^{*} such that the set \bigcup_{x\in\mathbb{U}}Y(x) is bounded. Suppose that MFCQ holds at y for all y\in S(x^{*}). Then the value function V(x) is Lipschitz continuous near x^{*} and

cV(x)coW(x),\partial^{c}V(x^{*})\subseteq coW(x^{*}),

where

W(x)\displaystyle W(x^{*}) :=\displaystyle:= yS(x){xf(x,y)+uxg(x,y)+vxh(x,y):(u,v)M(x,y)},\displaystyle\bigcup_{y\in S(x^{*})}\left\{\nabla_{x}f(x^{*},y)+u\nabla_{x}g(x^{*},y)+v\nabla_{x}h(x^{*},y):(u,v)\in M(x^{*},y)\right\},
M(x^{*},y):=\left\{(u,v):\begin{array}[]{c}0=\nabla_{y}f(x^{*},y)+u\nabla_{y}g(x^{*},y)+v\nabla_{y}h(x^{*},y)\\ u\geq 0,\quad\langle g(x^{*},y),u\rangle=0\end{array}\right\},

where uyg:=i=1muiygiu\nabla_{y}g:=\sum_{i=1}^{m}u_{i}\nabla_{y}g_{i}.

Note that if, in addition to the assumptions of Proposition 5.2, S is inner semicontinuous at (x^{*},y^{*}) for some y^{*}\in S(x^{*}), then the union over y\in S(x^{*}) can be omitted in Proposition 5.2; see [21, Corollary 1.109]. In the case where LICQ holds at each y\in S(x^{*}), the set of multipliers M(x^{*},y) is a singleton and, by [9, Corollary 5.4], the inclusion becomes an equality in the above proposition and -V(x) is Clarke regular around x^{*}. In this case, LICQ for the lower level problem holds at every y\in S(x) for all x near x^{*} due to the outer semicontinuity of the solution mapping S(x). Thus we have

x(fV)(x,y)=xc(fV)(x,y)=xf(x,y)cV(x)=xf(x,y)coW(x).\partial_{x}(f-V)(x,y)=\partial_{x}^{c}(f-V)(x,y)=\nabla_{x}f(x,y)-\partial^{c}V(x)=\nabla_{x}f(x,y)-coW(x).

By Carathéodory's theorem, every element of the convex set coW(x)\subseteq\mathbb{R}^{d} can be written as a convex combination of at most d+1 elements of W(x). It follows that for any w\in coW(x), there exist \mu^{i}\geq 0 with \sum_{i=1}^{d+1}\mu^{i}=1, y^{i}\in S(x) and (u^{i},v^{i}) with M(x,y^{i})=\{(u^{i},v^{i})\} such that w=\sum_{i=1}^{d+1}\mu^{i}(\nabla_{x}f(x,y^{i})+u^{i}\nabla_{x}g(x,y^{i})+v^{i}\nabla_{x}h(x,y^{i})).

Given a feasible vector (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) of the problem (CP), define the following index sets:

IG=IG(x,y):={i:Gi(x,y)=0},\displaystyle I_{G}^{*}=I_{G}(x^{*},y^{*}):=\{i:G_{i}(x^{*},y^{*})=0\},
=(x,y,u):={i:gi(x,y)=0,ui>0},\displaystyle{\cal I}^{*}={\cal I}(x^{*},y^{*},u^{*}):=\{i:g_{i}(x^{*},y^{*})=0,u_{i}^{*}>0\},
𝒥=𝒥(x,y,u):={i:gi(x,y)=0,ui=0},\displaystyle{\cal J}^{*}={\cal J}(x^{*},y^{*},u^{*}):=\{i:g_{i}(x^{*},y^{*})=0,u_{i}^{*}=0\},
𝒦=𝒦(x,y,u):={i:gi(x,y)<0,ui=0}.\displaystyle{\cal K}^{*}={\cal K}(x^{*},y^{*},u^{*}):=\{i:g_{i}(x^{*},y^{*})<0,u_{i}^{*}=0\}.

Theorem 3.1 can now be applied to the problem (CP) to obtain the M-stationary condition under RCPLD at any local optimal solution.

Theorem 5.1

Let (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) be a local solution of (CP)(\rm CP) and suppose that the value function V(x)V(x) is Lipschitz continuous at xx^{*}. If the RCPLD holds at (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}), then (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) is an M-stationary point of problem (CP).

In the rest of this section, we apply the sufficient conditions for RCPLD introduced in Section 4 to bilevel programs. The following theorem follows immediately from Theorem 4.1. Its condition is easy to verify since the nonsmooth constraint f(x,y)-V(x)\leq 0 is not needed in the verification.

Theorem 5.2

Let (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) be a local solution of (CP)(\rm CP) and suppose that the value function V(x)V(x) is Lipschitz continuous at xx^{*}. If the rank of the matrix

J=[(yf+uyg+vyh)(x,y)yh(x,y)Tyg𝒥(x,y)Th(x,y)00H(x,y)00g(x,y)00]J^{*}=\left[\begin{array}[]{cccc}\nabla(\nabla_{y}f+u\nabla_{y}g+v\nabla_{y}h)(x^{*},y^{*})&\nabla_{y}h(x^{*},y^{*})^{T}&\nabla_{y}g_{{\cal I}^{*}\cup{\cal J}^{*}}(x^{*},y^{*})^{T}\\ \nabla h(x^{*},y^{*})&0&0\\ \nabla H(x^{*},y^{*})&0&0\\ \nabla g_{{\cal I}^{*}}(x^{*},y^{*})&0&0\end{array}\right]

is equal to d+s+m+n|𝒦|d+s+m+n-|{\cal K}^{*}|, then RCPLD holds and (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) is an M-stationary point of problem (CP). Moreover (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) is a local optimal solution of the penalized problem for some μ0\mu\geq 0:

(CPμ)minx,y,u,v\displaystyle({\rm CP}_{\mu})~{}~{}~{}~{}~{}~{}\min_{x,y,u,v} F(x,y)+μϕCP(x,y,u,v)\displaystyle F(x,y)+\mu\phi_{CP}(x,y,u,v)

where \phi_{CP}(x,y,u,v):=(f(x,y)-V(x))_{+}+\|H(x,y)\|+\|h(x,y)\|+\|G_{+}(x,y)\|+\|\nabla_{y}L(x,y,u,v)\|+\sum_{i=1}^{m}d_{\Omega}(-g_{i}(x,y),u_{i}).

Proof. Let Γ:=[yg𝒦(x,y)T0]\Gamma^{*}:=\left[\begin{array}[]{c}\nabla_{y}g_{{\cal K}^{*}}(x^{*},y^{*})^{T}\\ 0\end{array}\right]. It is easy to see that

r([JΓ0I|𝒦|])=r(J)+|𝒦|=d+s+m+n,r\left(\left[\begin{array}[]{cc}J^{*}&\Gamma^{*}\\ 0&I_{|\mathcal{K}^{*}|}\end{array}\right]\right)=r(J^{*})+|\mathcal{K}^{*}|=d+s+m+n,

where I|𝒦|I_{|\mathcal{K}^{*}|} is the identity matrix of size |𝒦||\mathcal{K}^{*}|. From Theorem 4.1, the error bound property holds for the problem (CP), i.e., there exist α>0\alpha>0 and ε>0\varepsilon>0 such that

dCP(x,y,u,v)αϕCP(x,y,u,v),(x,y,u,v)𝔹ε(x,y,u,v),\displaystyle d_{{\cal F}_{CP}}(x,y,u,v)\leq\alpha\phi_{CP}(x,y,u,v),\quad\forall(x,y,u,v)\in\mathbb{B}_{\varepsilon}(x^{*},y^{*},u^{*},v^{*}),

where {\cal F}_{CP} denotes the feasible region of problem (CP). It follows from Clarke's exact penalty principle [4, Proposition 2.4.3] that the penalized problem ({\rm CP}_{\mu}) is exact for any \mu\geq L_{F}\alpha, where L_{F} is a Lipschitz constant of the function F.

Corollary 5.1

Let (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) be a local solution of (CP)(\rm CP). Suppose that the value function V(x)V(x) is Lipschitz continuous at xx^{*} and LICQ holds at yy^{*}. Suppose the matrix

SJ=[h(x,y)H(x,y)g(x,y)]SJ^{*}=\left[\begin{array}[]{l}\nabla h(x^{*},y^{*})\\ \nabla H(x^{*},y^{*})\\ \nabla g_{{\cal I}^{*}}(x^{*},y^{*})\end{array}\right]

has full column rank d+sd+s. Then (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) is an M-stationary point of problem (CP) and a local optimal solution of the penalized problem (CPμ)({\rm CP}_{\mu}) for some μ0\mu\geq 0.

Proof. Since LICQ holds at yy^{*}, the rank of the matrix

SJ1=[yh(x,y)Tyg𝒥(x,y)T]SJ_{1}^{*}=\left[\begin{array}[]{cc}\nabla_{y}h(x^{*},y^{*})^{T}&\nabla_{y}g_{{\cal I}^{*}\cup{\cal J}^{*}}(x^{*},y^{*})^{T}\end{array}\right]

equals |{\cal I}^{*}|+|{\cal J}^{*}|+n. Since SJ^{*} has full column rank d+s and the columns of SJ_{1}^{*} are linearly independent, J^{*} has full column rank, i.e., r(J^{*})=r(SJ^{*})+r(SJ_{1}^{*})=d+s+m+n-|{\cal K}^{*}|, and thus the conclusions of Theorem 5.2 hold.
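The rank identity used in the last step reflects the block structure J^{*}=\left[\begin{array}{cc}A&SJ_{1}^{*}\\ SJ^{*}&0\end{array}\right]: if SJ^{*} and SJ_{1}^{*} both have full column rank, then J^{*} has full column rank. The following Python sketch confirms this rank additivity on generic random blocks (our own illustration, not the matrices of the paper):

import numpy as np

rng = np.random.default_rng(1)
p, q = 4, 3                                   # column counts of the two block columns
A = rng.standard_normal((q, p))               # upper-left block (arbitrary)
B = rng.standard_normal((q, q))               # upper-right block, full column rank (generic)
C = rng.standard_normal((p + 2, p))           # lower-left block, full column rank (generic)
J = np.block([[A, B], [C, np.zeros((p + 2, q))]])
assert np.linalg.matrix_rank(J) == np.linalg.matrix_rank(C) + np.linalg.matrix_rank(B) == p + q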

The following example illustrates the application of Theorem 5.2 and Corollary 5.1.

Example 5.1

Consider the following bilevel program:

min\displaystyle\min F(x,y)\displaystyle F(x,y)
s.t.\displaystyle{\rm s.t.} x[3,2],\displaystyle x\in[-3,2],
H(x,y):=x2+y2=0,\displaystyle H(x,y):=x^{2}+y-2=0,
yS(x):=argminyf(x,y):=y33y\displaystyle y\in S(x):=\mathop{\rm argmin}\limits_{y}\ f(x,y):=y^{3}-3y
s.t.g1(x,y):=xy0,\displaystyle~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}{\rm s.t.}\ g_{1}(x,y):=x-y\leq 0,
g2(x,y):=y30,\displaystyle~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}g_{2}(x,y):=y-3\leq 0,

where F(x,y)F(x,y) is a Lipschitz continuous function. It is easy to see that the solution set for the lower level program is

S(x)={{x} if x[3,2)(1,2],{2,1} if x=2,{1} if x(2,1],\displaystyle S(x)=\left\{\begin{array}[]{ll}\{x\}&\mbox{ if }x\in[-3,-2)\cup(1,2],\\ \{-2,1\}&\mbox{ if }x=-2,\\ \{1\}&\mbox{ if }x\in(-2,1],\end{array}\right.

and the value function is

V(x)={x33x if x[3,2)(1,2],2 if x[2,1].\displaystyle V(x)=\left\{\begin{array}[]{ll}x^{3}-3x&\mbox{ if }x\in[-3,-2)\cup(1,2],\\ -2&\mbox{ if }x\in[-2,1].\end{array}\right.
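The formulas for S(x) and V(x) can be confirmed numerically. The following Python check (a verification device of our own, not part of the example) compares the closed form of V with a grid minimization of the lower level objective over [x,3]:

import numpy as np

def V_numeric(x, npts=20001):
    # minimize y**3 - 3*y over the lower-level feasible set [x, 3]
    y = np.linspace(x, 3.0, npts)
    return np.min(y ** 3 - 3.0 * y)

def V_closed(x):
    # the closed form stated above
    return x ** 3 - 3.0 * x if (x < -2.0 or x > 1.0) else -2.0

for x in np.linspace(-3.0, 2.0, 51):
    assert abs(V_numeric(x) - V_closed(x)) < 1e-5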

In fact since the lower level feasible set Y(x):={y:xy3}Y(x):=\{y\in\mathbb{R}:x\leq y\leq 3\} is uniformly bounded whenever x[3,2]x\in[-3,2] and LICQ holds at each yS(x)y\in S(x), x[3,2]x\in[-3,2], we can conclude that the value function is Lipschitz continuous without actually calculating it. For any xx, the KKT condition for the lower level problem is

0=3y23u1+u2,gi(x,y)0,ui0,uigi(x,y)=0,i=1,2.0=3y^{2}-3-u_{1}+u_{2},\ g_{i}(x,y)\leq 0,\ u_{i}\geq 0,\ u_{i}g_{i}(x,y)=0,i=1,2.

Hence the combined problem can be written as follows:

minx,y,u\displaystyle\min_{x,y,u} F(x,y)\displaystyle F(x,y)
s.t.\displaystyle{\rm s.t.} f(x,y)V(x)0,\displaystyle f(x,y)-V(x)\leq 0,
H(x,y)=0,\displaystyle H(x,y)=0,
yL(x,y,u):=3y23u1+u2=0,\displaystyle\nabla_{y}L(x,y,u):=3y^{2}-3-u_{1}+u_{2}=0,
x[3,2],(g(x,y),u)Ω2.\displaystyle x\in[-3,2],(-g(x,y),u)\in\Omega^{2}.

It is easy to see that the feasible region of the bilevel problem contains three points: (x,y)=(2,2),(x^,y^)=(1,1),(x~,y~)=(1,1)(x^{*},y^{*})=(-2,-2),(\hat{x},\hat{y})=(1,1),(\tilde{x},\tilde{y})=(-1,1). From calculation,

(yL)(x,y,u)=(0,6y,1,1),H(x,y)=(2x,1),g1(x,y)=(1,1),g2(x,y)=(0,1).\displaystyle\nabla(\nabla_{y}L)(x,y,u)=(0,6y,-1,1),\nabla H(x,y)=(2x,1),\nabla g_{1}(x,y)=(1,-1),\nabla g_{2}(x,y)=(0,1).

Suppose that the optimal solution of the bilevel problem is (x,y)=(2,2)(x^{*},y^{*})=(-2,-2) and the one for the combined program is (x,y,u1,u2)=(2,2,9,0)(x^{*},y^{*},u_{1}^{*},u_{2}^{*})=(-2,-2,9,0). The index sets ={1},𝒦={2}{\cal I}^{*}=\{1\},{\cal K}^{*}=\{2\} and 𝒥{\cal J}^{*} is empty. The rank of the matrix

SJ=[H(x,y)g1(x,y)]=[4111]SJ^{*}=\left[\begin{array}[]{c}\nabla H(x^{*},y^{*})\\ \nabla g_{1}(x^{*},y^{*})\end{array}\right]=\left[\begin{array}[]{cc}-4&1\\ 1&-1\end{array}\right]

is equal to 22. Hence by Corollary 5.1, (x,y,u1,u2)(x^{*},y^{*},u_{1}^{*},u_{2}^{*}) is an M-stationary point of problem (CP).

Suppose that the optimal solution of the bilevel problem is (x~,y~)=(1,1)(\tilde{x},\tilde{y})=(-1,1) and the one for the combined program is (x~,y~,u~1,u~2)=(1,1,0,0)(\tilde{x},\tilde{y},\tilde{u}_{1},\tilde{u}_{2})=(-1,1,0,0). The index sets 𝒦~={1,2}\tilde{\cal K}=\{1,2\} and both ~,𝒥~\tilde{\cal I},\tilde{\cal J} are empty. The rank of the matrix

J(x~,y~,u~1,u~2)=[(yf+uyg)(x~,y~)H(x~,y~)]=[0621]J(\tilde{x},\tilde{y},\tilde{u}_{1},\tilde{u}_{2})=\left[\begin{array}[]{c}\nabla(\nabla_{y}f+u\nabla_{y}g)(\tilde{x},\tilde{y})\\ \nabla H(\tilde{x},\tilde{y})\end{array}\right]=\left[\begin{array}[]{cc}0&6\\ -2&1\end{array}\right]

is equal to 22. Hence by Theorem 5.2, (x~,y~,u~1,u~2)(\tilde{x},\tilde{y},\tilde{u}_{1},\tilde{u}_{2}) is an M-stationary point of problem (CP). Moreover (x,y,u1,u2)(x^{*},y^{*},u_{1}^{*},u_{2}^{*}) and (x~,y~,u~1,u~2)(\tilde{x},\tilde{y},\tilde{u}_{1},\tilde{u}_{2}) are local solutions of the penalized problem for some μ0\mu\geq 0:

minx,y,u\displaystyle\min_{x,y,u} F(x,y)+μ((f(x,y)V(x))++|H(x,y)|+|yL(x,y,u)|+|min{g(x,y),u}|)\displaystyle F(x,y)+\mu\left((f(x,y)-V(x))_{+}+|H(x,y)|+|\nabla_{y}L(x,y,u)|+|\min\{-g(x,y),u\}|\right)
s.t.\displaystyle{\rm s.t.} x[3,2].\displaystyle x\in[-3,2].
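Both rank computations used in this example can be verified directly; a short numerical check (our own verification, using the matrices exactly as displayed above):

import numpy as np

# at (x*, y*, u_1*, u_2*) = (-2, -2, 9, 0): SJ* = [grad H; grad g_1]
SJ = np.array([[-4.0,  1.0],
               [ 1.0, -1.0]])
print(np.linalg.matrix_rank(SJ))      # 2, so Corollary 5.1 applies

# at (x~, y~, u~_1, u~_2) = (-1, 1, 0, 0): J = [grad(grad_y f + u grad_y g); grad H]
J = np.array([[ 0.0, 6.0],
              [-2.0, 1.0]])
print(np.linalg.matrix_rank(J))       # 2, the value required by Theorem 5.2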

Since RCRCQ is stronger than RCPLD, the following result follows from Theorem 3.1 and Propositions 5.1 and 5.2.

Theorem 5.3

Let (x,y,u,v)(x^{*},y^{*},u^{*},v^{*}) be a local solution of (CP)(\rm CP). Suppose that either Y(x)=YY(x)=Y is independent of xx with YY compact or the set-valued map Y(x)Y(x) is uniformly bounded around xx^{*} and MFCQ holds at yy for all yS(x)y\in S(x^{*}). Given index sets 3,4𝒥{\cal I}_{3},{\cal I}_{4}\subseteq{\cal J}^{*} and 2IG{\cal I}_{2}\subseteq I_{G}^{*}, α{0,1}\alpha\in\{0,1\}, denote the matrix

J(x,y,u,v,α,w):=[(yf+uyg+vyh)(x,y)yh(x,y)Tyg(x,y)Th(x,y)00H(x,y)00G2(x,y)00g3(x,y)0000E𝒦4α(f(x,y)(w,0))00],\displaystyle J^{\prime}(x,y,u,v,\alpha,w):=\left[\begin{array}[]{ccc}\nabla(\nabla_{y}f+u\nabla_{y}g+v\nabla_{y}h)(x,y)&\nabla_{y}h(x,y)^{T}&\nabla_{y}g(x,y)^{T}\\ \nabla h(x,y)&0&0\\ \nabla H(x,y)&0&0\\ \nabla G_{{\cal I}_{2}}(x,y)&0&0\\ \nabla g_{{\cal I}^{*}\cup{\cal I}_{3}}(x,y)&0&0\\ 0&0&E_{\mathcal{K}^{*}\cup{\cal I}_{4}}\\ \alpha(\nabla f(x,y)-(w,0))&0&0\end{array}\right],

where E_{\mathcal{K}^{*}\cup{\cal I}_{4}}\in\mathbb{R}^{(|\mathcal{K}^{*}|+|{\cal I}_{4}|)\times m} denotes the matrix with rows e_{i}^{T}, i\in\mathcal{K}^{*}\cup{\cal I}_{4}, where e_{i}\in\mathbb{R}^{m} is the vector whose i-th component is one and whose other components are zero. Assume that the matrix J^{\prime}(x^{*},y^{*},u^{*},v^{*},\alpha,w^{*}), where w^{*}\in coW(x^{*}), has the same rank as the matrix J^{\prime}(x^{k},y^{k},u^{k},v^{k},\alpha,w^{k}) for all sequences \{x^{k}\},\{y^{k}\},\{u^{k}\},\{v^{k}\},\{w^{k}\} satisfying x^{k}\rightarrow x^{*},y^{k}\rightarrow y^{*},u^{k}\rightarrow u^{*},v^{k}\rightarrow v^{*},w^{k}\rightarrow w^{*} with w^{k}\in coW(x^{k}). Then RCRCQ for (CP) holds at (x^{*},y^{*},u^{*},v^{*}) and (x^{*},y^{*},u^{*},v^{*}) is an M-stationary point of problem (CP).

The following example illustrates the result.

Example 5.2

Consider the following bilevel program:

min\displaystyle\min F(x,y)\displaystyle F(x,y)
s.t.\displaystyle{\rm s.t.} H(x,y):=x1x2+y12=0,\displaystyle H(x,y):=x_{1}-x_{2}+y-\frac{1}{2}=0,
yargminyf(x,y):=x1exp(y)x2exp(y)\displaystyle y\in\mathop{\rm argmin}_{y}\ f(x,y):=x_{1}\exp(y)-x_{2}\exp(y)
s.t.g1(y):=yln20,\displaystyle~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}{\rm s.t.}\ g_{1}(y):=-y-\ln 2\leq 0,
g2(y):=yln20,\displaystyle~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}g_{2}(y):=y-\ln 2\leq 0,\,

where F(x,y)F(x,y) is a Lipschitz continuous function. It is easy to see that the solution set for the lower level program is

S(x)={[ln2,ln2] if x1=x2,{ln2} if x1<x2,{ln2} if x1>x2,\displaystyle S(x)=\left\{\begin{array}[]{ll}[-\ln 2,\ln 2]&\mbox{ if }x_{1}=x_{2},\\ \{\ln 2\}&\mbox{ if }x_{1}<x_{2},\\ \{-\ln 2\}&\mbox{ if }x_{1}>x_{2},\end{array}\right.

and the value function

V(x)={0 if x1=x2,2(x1x2) if x1<x2,12(x1x2) if x1>x2.\displaystyle V(x)=\left\{\begin{array}[]{ll}0&\mbox{ if }x_{1}=x_{2},\\ 2(x_{1}-x_{2})&\mbox{ if }x_{1}<x_{2},\\ \frac{1}{2}(x_{1}-x_{2})&\mbox{ if }x_{1}>x_{2}.\end{array}\right.

In fact we do not need the above explicit representation of the value function. Indeed, since the constraint set Y=[ln2,ln2]Y=[-\ln 2,\ln 2] is compact, by Danskin’s theorem, for any xx around xx^{*} the Clarke subdifferential of the value function is equal to

cV(x)=coW(x)=co{xf(x,yx)|yxS(x)}=co{exp(yx)|yxS(x)}(1,1).\partial^{c}V(x)=coW(x)=co\{\nabla_{x}f(x,y_{x})|y_{x}\in S(x)\}=\displaystyle co\{\exp(y_{x})|y_{x}\in S(x)\}(1,-1).
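As a numerical sanity check (again a verification device of our own), the closed form of V given above can be compared with a direct minimization of (x_{1}-x_{2})\exp(y) over Y=[-\ln 2,\ln 2]:

import numpy as np

lo, hi = -np.log(2.0), np.log(2.0)

def V_numeric(x1, x2, npts=20001):
    # minimize (x1 - x2) * exp(y) over y in [-ln 2, ln 2]
    y = np.linspace(lo, hi, npts)
    return np.min((x1 - x2) * np.exp(y))

def V_closed(x1, x2):
    if x1 == x2:
        return 0.0
    return 2.0 * (x1 - x2) if x1 < x2 else 0.5 * (x1 - x2)

rng = np.random.default_rng(0)
for _ in range(1000):
    x1, x2 = rng.uniform(-2.0, 2.0, 2)
    assert abs(V_numeric(x1, x2) - V_closed(x1, x2)) < 1e-8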

The combined problem can be written as follows:

minx,y,u\displaystyle\min_{x,y,u} F(x,y)\displaystyle F(x,y)
s.t.\displaystyle{\rm s.t.} f(x,y)V(x)0,H(x,y)=0,\displaystyle f(x,y)-V(x)\leq 0,H(x,y)=0,
yL(x,y,u):=x1exp(y)x2exp(y)u1+u2=0,\displaystyle\nabla_{y}L(x,y,u):=x_{1}\exp(y)-x_{2}\exp(y)-u_{1}+u_{2}=0,
(g(y),u)Ω2.\displaystyle(-g(y),u)\in\Omega^{2}.

It is easy to see that the bilevel program has only three kinds of optimal solutions: (x_{1}^{*},x_{2}^{*},y^{*})=(a,a,\frac{1}{2}) with a\in\mathbb{R}, (\hat{x}_{1},\hat{x}_{2},\ln 2) with \hat{x}_{1}<\hat{x}_{2}, and (\tilde{x}_{1},\tilde{x}_{2},-\ln 2) with \tilde{x}_{1}>\tilde{x}_{2}. We now verify that RCRCQ holds in each of the three cases.

(a) Consider the optimal solution (x_{1}^{*},x_{2}^{*},y^{*})=(a,a,\frac{1}{2}) with a\in\mathbb{R}; the corresponding solution of the combined program is (x^{*},y^{*},u_{1}^{*},u_{2}^{*})=(a,a,\frac{1}{2},0,0). Obviously, the index set {\cal K}^{*}=\{1,2\} while both {\cal I}^{*} and {\cal J}^{*} are empty. For \alpha\in\{0,1\}, denote

J(x,y,u,α,ω):=[exp(y)exp(y)x1exp(y)x2exp(y)11111000001000001α(exp(y)w)α(exp(y)w)α(x1exp(y)x2exp(y))00],J^{\prime}(x,y,u,\alpha,\omega):=\left[\begin{array}[]{ccccc}\exp(y)&-\exp(y)&x_{1}\exp(y)-x_{2}\exp(y)&-1&1\\ 1&-1&1&0&0\\ 0&0&0&1&0\\ 0&0&0&0&1\\ {\alpha(\exp(y)-w)}&{-\alpha(\exp(y)-w)}&\alpha(x_{1}\exp(y)-x_{2}\exp(y))&0&0\end{array}\right],

where w\in co\{\exp(y_{x})\,|\,y_{x}\in S(x)\}. It is easy to see that r(J^{\prime}(x^{*},y^{*},u^{*},\alpha,w^{*}))=r(J^{\prime}(x,y,u,\alpha,w))=4 for any \alpha\in\{0,1\} and all (x,y,u,w) with (x,y,u) sufficiently close to (x^{*},y^{*},u^{*}). Hence RCRCQ holds by Theorem 5.3.
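This rank claim can also be checked numerically. The following Python sketch (our own verification; the sampling box around (a,a,\frac{1}{2}) and the interval [\frac{1}{2},2] used for w are assumptions made only for the test) evaluates the matrix of case (a) at randomly sampled nearby points:

import numpy as np

def Jp(x1, x2, y, alpha, w):
    # the matrix J'(x, y, u, alpha, w) of case (a); columns ordered as (x1, x2, y, u1, u2)
    e = np.exp(y)
    return np.array([
        [e, -e, (x1 - x2) * e, -1.0, 1.0],                                  # grad of grad_y L
        [1.0, -1.0, 1.0, 0.0, 0.0],                                         # grad H
        [0.0, 0.0, 0.0, 1.0, 0.0],                                          # e_1 (i = 1 in K*)
        [0.0, 0.0, 0.0, 0.0, 1.0],                                          # e_2 (i = 2 in K*)
        [alpha * (e - w), -alpha * (e - w), alpha * (x1 - x2) * e, 0.0, 0.0],
    ])

a = 0.3                                                                     # any real number a
rng = np.random.default_rng(0)
for alpha in (0.0, 1.0):
    for _ in range(200):
        x1, x2, y = np.array([a, a, 0.5]) + 0.01 * rng.uniform(-1.0, 1.0, 3)
        w = rng.uniform(0.5, 2.0)          # [1/2, 2] contains co{exp(y_x) : y_x in S(x)}
        assert np.linalg.matrix_rank(Jp(x1, x2, y, alpha, w), tol=1e-8) == 4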

(b) Consider the optimal solution (\hat{x}_{1},\hat{x}_{2},\ln 2) with \hat{x}_{1}<\hat{x}_{2}; the corresponding solution of the combined program is (\hat{x}_{1},\hat{x}_{2},\ln 2,0,\hat{u}_{2}) with \hat{x}_{1}<\hat{x}_{2} and \hat{u}_{2}>0. The index sets are \hat{\cal K}=\{1\}, \hat{\cal I}=\{2\} and \hat{\cal J}=\emptyset, and for any x around \hat{x}=(\hat{x}_{1},\hat{x}_{2}), S(x)=\{\ln 2\} and coW(x)=\{(2,-2)^{T}\}. For \alpha\in\{0,1\}, denote

J(x,y,u,α,w):=[exp(y)exp(y)x1exp(y)x2exp(y)11111000010000010α(exp(y)2)α(exp(y)2)α(x1exp(y)x2exp(y))00].J^{\prime}(x,y,u,\alpha,w):=\left[\begin{array}[]{ccccc}\exp(y)&-\exp(y)&x_{1}\exp(y)-x_{2}\exp(y)&-1&1\\ 1&-1&1&0&0\\ 0&0&1&0&0\\ 0&0&0&1&0\\ \alpha(\exp(y)-2)&-\alpha(\exp(y)-2)&\alpha(x_{1}\exp(y)-x_{2}\exp(y))&0&0\end{array}\right].

It is easy to see that r(J^{\prime}(x,y,u,\alpha,w))=4 for any \alpha\in\{0,1\} and any (x,y,u) and w. Hence RCRCQ holds by Theorem 5.3.

(c) Consider the optimal solution (\tilde{x}_{1},\tilde{x}_{2},-\ln 2) with \tilde{x}_{1}>\tilde{x}_{2}; the corresponding solution of the combined program is (\tilde{x}_{1},\tilde{x}_{2},-\ln 2,\tilde{u}_{1},0) with \tilde{x}_{1}>\tilde{x}_{2} and \tilde{u}_{1}>0. Arguing as in case (b), RCRCQ holds.

Acknowledgement

The authors would like to thank the anonymous referees for their helpful suggestions and comments, which helped us improve the presentation of the paper.

References

  • [1] R. Andreani, G. Haeser, M.L. Schuverdt and P.J.S. Silva, A relaxed constant positive linear dependence constraint qualification and applications, Math. Program., 135(2012), 255–273.
  • [2] R. Andreani, J.M. Martínez and M.L. Schuverdt, On the relation between constant positive linear dependence condition and quasinormality constraint qualification, J. Optim. Theory Appl., 125(2005), 473–483.
  • [3] X. Chen, L. Guo, Z. Lu and J.J. Ye, An augmented Lagrangian method for non-Lipschitz nonconvex programming, SIAM J. Numer. Anal., 55(2017), 168–193.
  • [4] F.H. Clarke, Optimization and Nonsmooth Analysis, Wiley-Interscience, New York, 1983.
  • [5] F.H. Clarke, Yu.S. Ledyaev, R.J. Stern and P.R. Wolenski, Nonsmooth Analysis and Control Theory, Springer, New York, 1998.
  • [6] J.M. Danskin, The Theory of Max-Min and its Applications to Weapons Allocation Problems, Springer, New York, 1967.
  • [7] S. Dempe, J. Dutta and B.S. Mordukhovich, New necessary optimality conditions in optimistic bilevel programming, Optimization, 56 (2007), 577-604.
  • [8] S. Dempe and A.B. Zemkoho, The bilevel programming problems: reformulations, constraint qualifications and optimality conditions, Math. Program., 138 (2013), 447-473.
  • [9] J. Gauvin and F. Dubeau, Differential properties of the marginal function in mathematical programming, Math. Program. Study, 19(1982), 101–119.
  • [10] L. Guo and G.-H. Lin, Notes on some constraint qualifications for mathematical programs with equilibrium constraints, J. Optim. Theory Appl., 156(2013), 600–616.
  • [11] L. Guo, G.-H. Lin and J.J. Ye, Second-order optimality conditions for mathematical programs with equilibrium constraints, J. Optim. Theory Appl., 158(2013), 33–64.
  • [12] L. Guo, G.-H., Lin, J.J. Ye and J. Zhang, Sensitivity analysis of the value function for parametric mathematical programs with equilibrium constraints, SIAM J. Optim., 24(2014), 1206–1237.
  • [13] L. Guo and J.J. Ye, Necessary optimality conditions and exact penalization for non-Lipschitz nonlinear programs, Math. Program., 168(2018), 571–598.
  • [14] L. Guo, J. Zhang and G.-H. Lin, New results on constraint qualifications for nonlinear extremum problems and extensions, J. Optim. Theory Appl., 163(2014), 737–754.
  • [15] A.F. Izmailov, M.V. Solodov and E.I. Uskov, Global convergence of augmented Lagrangian methods applied to optimization problems with degenerate constraints, including problems with complementarity constraints, SIAM J. Optim., 22(2012), 1579–1606.
  • [16] R. Janin, Directional derivative of the marginal function in nonlinear programming, Math. Program. Study, 21(1984), 110–126.
  • [17] A. Jourani and L. Thibault, The approximate subdifferential of composite functions, Bull. Austral. Math. Soc., 47(1993), 443–455.
  • [18] C. Kanzow and A. Schwartz, Mathematical programs with equilibrium constraints: enhanced Fritz John-conditions, new constraint qualifications, and improved exact penalty results, SIAM J. Optim., 20(2010), 2730–2753.
  • [19] L. Minchenko and S. Stakhovski, On relaxed constant rank regularity condition in mathematical programming, Optim., 60(2011), 429–440.
  • [20] J. Mirrlees, The theory of moral hazard and unobservable behaviour: Part I, Rev. Econ. Stud., 66(1999), 3–22.
  • [21] B.S. Mordukhovich, Variational Analysis and Generalized Differentiation, I: Basic Theory; II: Applications, Springer, Berlin, 2006.
  • [22] B.S. Mordukhovich, Variational Analysis and Applications, Springer, New York, 2018.
  • [23] B.S. Mordukhovich, Bilevel optimization and variational analysis, in Advances in Bilevel Optimization, S. Dempe and A. Zemkoho, eds., arxiv:1907.06140, 2019.
  • [24] B.S. Mordukhovich, N. M. Nam and H. M. Phan, Variational analysis of marginal functions with applications to bilevel programming, J. Optim. Theory Appl., 152 (2012), pp. 557-586.
  • [25] J.V. Outrata, On the numerical solution of a class of Stackelberg problems, Z. Oper. Res., 34(1990), 255–277.
  • [26] L. Qi and Z. Wei, On the constant positive linear dependence condition and its application to SQP methods, SIAM J. Optim., 10(2000), 963–981.
  • [27] R.T. Rockafellar and R.J.-B. Wets, Variational Analysis, Springer, Berlin, 1998.
  • [28] X. Wang and Y. Yuan, An augmented Lagrangian trust region method for equality constrained optimization, Optim. Methods Softw., 30(2015), 559–582.
  • [29] J.J. Ye, Constraint qualifications and necessary optimality conditions for optimization problems with variational inequality constraints, SIAM J. Optim., 10(2000), 943–962.
  • [30] J.J. Ye, Necessary optimality conditions for multiobjective bilevel programs, Math. Oper. Res., 36(2011), 165–184.
  • [31] J.J. Ye, Constraint qualifications and optimality conditions in bilevel optimization, in Advances in Bilevel Optimization, S. Dempe and A. Zemkoho, eds., arxiv:1910.04072, 2019.
  • [32] J.J. Ye and D. Zhu, Optimality conditions for bilevel programming problems, Optim., 33(1995), 9–27.
  • [33] J.J. Ye and D. Zhu, New necessary optimality conditions for bilevel programs by combining the MPEC and value function approaches, SIAM J. Optim., 20(2010), 1885–1905.