D.T.V. An, Department of Mathematics and Informatics, Thai Nguyen University of Sciences, Thai Nguyen city, Vietnam; email: [email protected]

N.D. Yen, Institute of Mathematics, Vietnam Academy of Science and Technology, Hanoi, Vietnam; email: [email protected]

Optimality conditions based on the Fréchet second-order subdifferential

D.T.V. An N.D. Yen

Abstract

This paper focuses on second-order necessary optimality conditions for constrained optimization problems on Banach spaces. For problems in the classical setting, where the objective function is $C^{2}$-smooth, we show that strengthened second-order necessary optimality conditions are valid if the constraint set is generalized polyhedral convex. For problems in a new setting, where the objective function is only assumed to be $C^{1}$-smooth and the constraint set is generalized polyhedral convex, we establish sharp second-order necessary optimality conditions based on the Fréchet second-order subdifferential of the objective function and the second-order tangent set to the constraint set. Three examples are given to show that the hypotheses used are essential for the new theorems. Our second-order necessary optimality conditions refine and extend several existing results.

Keywords:

Constrained optimization problems on Banach spaces · second-order necessary optimality conditions · Fréchet second-order subdifferential · second-order tangent set · generalized polyhedral convex set

MSC:

49K27 49J53 90C30 90C46 90C20

1 Introduction

It is well known that second-order optimality conditions are fundamental results in nonlinear mathematical programming Ben-Tal1980 ; Ben-Tal1982 ; Bonnans_Shapiro_2000 ; L_Y_2008 ; McCormick1967 ; Penot1994 ; Penot1999 ; Polyak ; Ruszczynski2006 , which have numerous applications in stability and sensitivity analysis, as well as in numerical methods for optimization problems. The need to generalize these conditions to broader settings continues to attract the attention of many researchers; see, e.g., ChieuLeeYen2017 ; HSN_1984 ; Huy_Tuyen and the references therein.

In classical second-order optimality conditions, the objective function of the finite-dimensional optimization problem in question is assumed to be twice continuously differentiable (a $C^{2}$-smooth function for short). If the objective function is continuously Fréchet differentiable and the gradient mapping is locally Lipschitz, then one has to deal with a $C^{1,1}$-smooth problem. Second-order optimality conditions for finite-dimensional $C^{1,1}$-smooth optimization problems have been obtained by Hiriart-Urruty et al. HSN_1984 and by Huy and Tuyen Huy_Tuyen .

If the objective function of an optimization problem is continuously Fréchet differentiable and the gradient mapping is merely continuous, then one has to deal with a $C^{1}$-smooth problem. The class of $C^{1}$-smooth optimization problems is much larger than that of $C^{1,1}$-smooth optimization problems. As far as we know, the tools employed in HSN_1984 ; Huy_Tuyen are no longer suitable for $C^{1}$-smooth problems. To describe locally optimal solutions of $C^{1}$-smooth unconstrained minimization problems in a Banach space setting, Chieu et al. ChieuLeeYen2017 have explored the possibility of using the Fréchet second-order subdifferential and the limiting second-order subdifferential, which can be viewed as generalized Hessians of extended-real-valued functions. These concepts are due to Mordukhovich Mordukhovich_1992 ; Mordukhovich_2006a . The limiting second-order subdifferential has many applications in stability analysis of optimization problems; see, e.g., Mo_Ro_SIOPT2012 ; MRS_SIOPT2013 ; Poli_Roc_1998 and the references therein. As shown in ChieuChuongYaoYen2011 ; ChieuHuy2011 , the Fréchet second-order subdifferential is very useful in characterizing convexity of extended-real-valued functions. The authors of ChieuLeeYen2017 have shown that the Fréchet second-order subdifferential is suitable for presenting second-order necessary optimality conditions (ChieuLeeYen2017, Theorems 3.1 and 3.3), while the limiting second-order subdifferential works well for second-order sufficient optimality conditions (ChieuLeeYen2017, Theorem 4.7 and Corollary 4.8). Consulting a preprint version of ChieuLeeYen2017 , which appeared in 2013, Dai LVD2014 has extended the finite-dimensional version of (ChieuLeeYen2017, Theorem 3.3) to the case of $C^{1}$-smooth optimization problems whose constraint sets are described by linear equalities.

Our interest in understanding more deeply the role of second-order tangent sets in second-order optimality conditions comes mainly from the book of Bonnans and Shapiro Bonnans_Shapiro_2000 and Theorem 3.45 in the book by Ruszczynski Ruszczynski2006 . When the second-order derivative of the $C^{2}$-smooth objective function is replaced by the Fréchet second-order subdifferential or the limiting second-order subdifferential, nontrivial questions arise if one wants to have second-order optimality conditions based on second-order tangent sets. Since optimization problems with polyhedral convex constraint sets or generalized polyhedral convex constraint sets will be encountered frequently in our investigations, we remark that they are of great importance in optimization theory (see, for example, MRS_SIOPT2013 , where full stability of the local minimizers of such problems was characterized). An extended-real-valued function defined on a Banach space is said to be a generalized polyhedral convex function if its epigraph is a generalized polyhedral convex set. The interested reader is referred to (Luan_Yen, pp. 71–77) and Luan_Yao for more comments on the role of generalized polyhedral convex sets and generalized polyhedral convex functions.

The main goal of this paper is to clarify the applicability of the Fréchet second-order subdifferential to establishing second-order optimality conditions for constrained minimization problems. For problems in the classical setting, where the objective function is $C^{2}$-smooth, we show that strengthened second-order necessary optimality conditions are valid if the constraint set is generalized polyhedral convex. For problems in a new setting, where the objective function is only assumed to be $C^{1}$-smooth and the constraint set is generalized polyhedral convex, we establish sharp second-order necessary optimality conditions based on the Fréchet second-order subdifferential of the objective function and the second-order tangent set to the constraint set. Our second-order necessary optimality conditions refine and extend several existing results. We will give three examples to show that the hypotheses used are essential for the new theorems.

The paper is organized as follows. Section 2 presents some basic definitions and auxiliary results. Section 3 is devoted to second-order optimality conditions for constrained optimization problems, where the objective function is $C^{2}$-smooth. Section 4 studies the possibility of using the Fréchet second-order subdifferential in second-order necessary optimality conditions for constrained optimization problems, where the objective function is $C^{1}$-smooth.

2 Preliminaries

Let $X$ be a Banach space over the reals with the dual and the second dual being denoted, respectively, by $X^{*}$ and $X^{**}$ . As usual, for a subset $\Omega\subset X$ , we denote its convex hull (resp., interior and boundary) by ${\rm conv}\,\Omega$ (resp., ${\rm int}\,\Omega$ and $\partial\Omega$ ). One says that a nonempty subset $K\subset X$ is a cone if $tK\subset K$ for any $t>0.$ Following Luan_Yao_Yen , we denote the smallest convex cone containing $\Omega$ by ${\rm cone}\,\Omega$ . Then, ${\rm cone}\,\Omega=\{tx\mid t>0,\,x\in{\rm conv}\,\Omega\}.$ The polar to a cone $K\subset X$ is $K^{*}:=\{x^{*}\in X^{*}\mid\langle x^{*},x\rangle\leq 0,\ \forall x\in K\}$ . If $A$ is a matrix, then we denote its transpose by $A^{T}$ . The set of positive integers is denoted by $\mathbb{N}$ .

The forthcoming subsection recalls the definitions of contingent cone and second-order tangent set.

2.1 Second-order tangent sets

Definition 1

(See, e.g., (Ruszczynski2006, Definition 3.11)) A direction $v$ is called tangent to the set $C\subset X$ at a point $\bar{x}\in C$ if there exist sequences of points $x_{k}\in C$ and scalars $\tau_{k}>0$ , $k\in\mathbb{N}$ , such that $\tau_{k}\rightarrow 0$ and $v=\lim\limits_{k\rightarrow\infty}\big{[}\tau_{k}^{-1}(x_{k}-\bar{x})\big{]}.$

The set of all tangent directions to $C$ at a point $\bar{x}\in C$ , denoted by $T_{C}(\bar{x})$ , is called the contingent cone or the Bouligand-Severi tangent cone (Mordukhovich_2006a, Chapter 1) to $C$ at $\bar{x}$ . From the definition it follows that $v\in T_{C}(\bar{x})$ if and only if there exist a sequence $\{\tau_{k}\}$ of positive scalars and a sequence of vectors $\{v_{k}\}$ with $\tau_{k}\to 0$ and $v_{k}\to v$ as $k\to\infty$ such that $x_{k}:=\bar{x}+\tau_{k}v_{k}$ belongs to $C$ for all $k\in\mathbb{N}$ .

Definition 2

(See, e.g., (Ruszczynski2006, Definition 3.41)) A vector $w$ is called a second-order tangent direction to a set $C\subset X$ at a point $\bar{x}\in C$ and in a tangent direction $v$ if there exist a sequence of scalars $\tau_{k}>0$ and a sequence of points $x^{k}\in C$ such that $\tau_{k}\rightarrow 0$ and

w=\lim_{k\rightarrow\infty}\frac{x^{k}-\bar{x}-\tau_{k}v}{\frac{\tau_{k}^{2}}{2}}.

(2.1)

The set of all second-order tangent directions to $C$ at a point $\bar{x}\in C$ in a tangent direction $v$ , denoted by $T_{C}^{2}(\bar{x},v)$ , is said to be the second-order tangent set to $C$ at $\bar{x}$ in direction $v$ . Note that the equality (2.1) can be rewritten as

x^{k}=\bar{x}+\tau_{k}v+\frac{\tau_{k}^{2}}{2}w+\mathnormal{o}(\tau_{k}^{2}).

Thus, $w\in T_{C}^{2}(\bar{x},v)$ if and only if there exist a sequence $\{\tau_{k}\}$ of positive scalars and a sequence of vectors $\{w_{k}\}$ with $\tau_{k}\to 0$ and $w_{k}\to w$ as $k\to\infty$ such that $x_{k}:=\bar{x}+\tau_{k}v+\frac{\tau_{k}^{2}}{2}w_{k}$ belongs to $C$ for all $k\in\mathbb{N}$ .

In the next subsection, we recall the definition of the generalized polyhedral convex set from Bonnans_Shapiro_2000 and establish some auxiliary results.

2.2 Generalized polyhedral convex sets

Definition 3

(See (Bonnans_Shapiro_2000, p. 133) and (Luan_Yao_Yen, Definition 2.1)) A subset $D\subset X$ is said to be a generalized polyhedral convex set if there exist $x_{i}^{*}\in X^{*}$ , $\alpha_{i}\in\mathbb{R},$ $i=1,2,...,p$ , and a closed affine subspace $L\subset X$ , such that

\displaystyle D=\{x\in X\mid x\in L,\ \langle x_{i}^{*},x\rangle\leq\alpha_{i},\ i=1,2,...,p\}.

(2.2)

If $D$ can be represented in the form of (2.2) with $L=X$ , then we say that it is a polyhedral convex set.

From Definition 3 it follows that every generalized polyhedral convex set is a closed set. If $X$ is finite-dimensional, a subset $D\subset X$ is a generalized polyhedral convex set if and only if it is a polyhedral convex set; see (Luan_Yao_Yen, p. 541).

Let $D$ be given as in (2.2). According to (Bonnans_Shapiro_2000, Remark 2.196), there exist a continuous surjective linear mapping $A$ from $X$ to a Banach space $Y$ and a vector $y\in Y$ such that $L=\{x\in X\mid Ax=y\}$ . Hence,

\displaystyle D=\big{\{}x\in X\mid Ax=y,\ \langle x_{i}^{*},x\rangle\leq\alpha_{i},\ i=1,2,...,p\big{\}}.

(2.3)

Put $I=\{1,2,...,p\}$ and, for any $x\in D$ , let $I(x):=\{i\in I\mid\langle x_{i}^{*},x\rangle=\alpha_{i}\}$ .

The first assertion of the next proposition can be found in Ban_Mordukhovich_Song_2011 . The second assertion extends the result in (Ruszczynski2006, Lemma 3.43) to an infinite-dimensional setting.

Proposition 1

Let $D$ be a generalized polyhedral convex set in a Banach space $X$ . The contingent cones and the second-order tangent sets to $D$ are represented as follows:

(i) $T_{D}(\bar{x})=\{v\in X\mid Av=0,\;\langle x_{i}^{*},v\rangle\leq 0,\;i\in I(\bar{x})\}$ for any $\bar{x}\in D$ ;

(ii) $T^{2}_{D}(\bar{x},v)=T_{T_{D}(\bar{x})}(v)$ for any $\bar{x}\in D$ and $v\in T_{D}(\bar{x})$ .

Proof

(i) To show that

\displaystyle T_{D}(\bar{x})\subset\{v\in X\mid Av=0,\ \langle x_{i}^{*},v\rangle\leq 0,\ i\in I(\bar{x})\},

(2.4)

take any $v\in T_{D}(\bar{x})$ . Let $\tau_{k}\downarrow 0$ and $v_{k}\rightarrow v$ be such that $\bar{x}+\tau_{k}v_{k}\in D$ for $k\in\mathbb{N}$ . Then, we have $A(\bar{x}+\tau_{k}v_{k})=y$ and $\langle x_{i}^{*},\bar{x}+\tau_{k}v_{k}\rangle\leq\alpha_{i}$ for all $i\in I.$ This implies that

\displaystyle A(\tau_{k}v_{k})=0\ \;\mbox{and}\ \langle x_{i}^{*},\tau_{k}v_{k}\rangle\leq 0\ \;(\forall i\in I(\bar{x}),\ \forall k\in\mathbb{N}).

(2.5)

From (2.5) we have

\displaystyle A(v_{k})=0\ \;\mbox{and}\ \;\langle x_{i}^{*},v_{k}\rangle\leq 0\ \;(\forall i\in I(\bar{x}),\ \forall k\in\mathbb{N}).

(2.6)

Letting $k\rightarrow\infty$ , from (2.6) we get $A(v)=0$ and $\langle x_{i}^{*},v\rangle\leq 0$ for any $i\in I(\bar{x})$ . In other words, $v$ belongs to the right-hand-side of (2.4). So, the inclusion (2.4) is valid. To prove the opposite inclusion, pick any $v\in X$ satisfying $Av=0$ and $\langle x_{i}^{*},v\rangle\leq 0$ for $i\in I(\bar{x}).$ Since $\bar{x}\in D$ , one has $A\bar{x}=y$ , $\langle x_{i}^{*},\bar{x}\rangle=\alpha_{i}$ for $i\in I(\bar{x})$ , and $\langle x_{i}^{*},\bar{x}\rangle<\alpha_{i}$ for $i\in I\setminus I(\bar{x}).$ Hence, for all $t>0$ small enough, one has $A(\bar{x}+tv)=y,\ \langle x_{i}^{*},\bar{x}+tv\rangle\leq\alpha_{i}$ for $i\in I(\bar{x})$ and $\langle x_{i}^{*},\bar{x}+tv\rangle<\alpha_{i}$ for $i\in I\setminus I(\bar{x}).$ So, $\bar{x}+tv\in D$ for all $t>0$ small enough. It follows that $v\in T_{D}(\bar{x})$ . Thus, assertion (i) is justified.

(ii) Fix any $\bar{x}\in D$ and $v\in T_{D}(\bar{x})$ . By assertion (i), $Av=0$ and $\langle x_{i}^{*},v\rangle\leq 0$ for all $i\in I(\bar{x})$ . Moreover, since

\displaystyle T_{D}(\bar{x})=\{u\in X\mid Au=0,\;\langle x_{i}^{*},u\rangle\leq 0,\;i\in I(\bar{x})\},

(2.7)

applying the same assertion we can compute the contingent cone to the generalized polyhedral convex set $T_{D}(\bar{x})$ at $v$ as follows

\displaystyle T_{T_{D}(\bar{x})}(v)=\big{\{}u\in X\mid Au=0,\ \langle x_{i}^{*},u\rangle\leq 0,\ i\in I^{0}(v)\big{\}},

(2.8)

where $I^{0}(v):=\{i\in I(\bar{x})\mid\langle x_{i}^{*},v\rangle=0\}$ . On one hand, for any fixed vector $w\in T^{2}_{D}(\bar{x},v)$ , we can find sequences $\tau_{k}\downarrow 0$ and $w_{k}\rightarrow w$ such that

\bar{x}+\tau_{k}v+\frac{\tau^{2}_{k}}{2}w_{k}\in D\quad(\forall k\in\mathbb{N}).

By (2.3), one has $A(\bar{x}+\tau_{k}v+\frac{\tau^{2}_{k}}{2}w_{k})=y$ and $\langle x_{i}^{*},\bar{x}+\tau_{k}v+\frac{\tau^{2}_{k}}{2}w_{k}\rangle\leq\alpha_{i},\ i\in I.$ As $\bar{x}\in D$ and $v\in T_{D}(\bar{x})$ , this yields

\displaystyle A\Big{(}\frac{\tau^{2}_{k}}{2}w_{k}\Big{)}=0\ \mbox{and}\ \big{\langle}x_{i}^{*},\frac{\tau^{2}_{k}}{2}w_{k}\big{\rangle}\leq 0,\ \forall i\in I^{0}(v).

(2.9)

Since $\tau_{k}>0$ , (2.9) implies that $A\left(w_{k}\right)=0$ and $\langle x_{i}^{*},w_{k}\rangle\leq 0$ for all $i\in I^{0}(v).$ Letting $k\rightarrow\infty$ , we obtain $A\left(w\right)=0$ and $\langle x_{i}^{*},w\rangle\leq 0$ for all $i\in I^{0}(v).$ Therefore, by (2.8) we can assert that $w\in T_{T_{D}(\bar{x})}(v).$ On the other hand, taking any $w\in T_{T_{D}(\bar{x})}(v)$ , from (2.8) one gets $Aw=0$ and $\langle x_{i}^{*},w\rangle\leq 0$ for all $i\in I^{0}(v).$ By the definition of $I^{0}(v)$ , we have $\langle x_{i}^{*},v\rangle=0$ for any $i\in I^{0}(v)$ and $\langle x_{i}^{*},v\rangle<0$ for any $i\in I(\bar{x})\setminus I^{0}(v)$ . Moreover, since $\bar{x}\in D$ , it holds that $A\bar{x}=y$ , $\langle x_{i}^{*},\bar{x}\rangle=\alpha_{i}$ for $i\in I(\bar{x})$ , and $\langle x_{i}^{*},\bar{x}\rangle<\alpha_{i}$ for $i\in I\setminus I(\bar{x}).$ So, for every $t>0$ sufficiently small, one has $A(\bar{x}+tv+\frac{t^{2}}{2}w)=y,\ \langle x_{i}^{*},\bar{x}+tv+\frac{t^{2}}{2}w\rangle\leq\alpha_{i}$ for all $i\in I^{0}(v)$ and $\langle x_{i}^{*},\bar{x}+tv+\frac{t^{2}}{2}w\rangle<\alpha_{i}$ for all $i\in I\setminus I^{0}(v).$ This yields $\bar{x}+tv+\frac{t^{2}}{2}w\in D$ for every $t>0$ sufficiently small. Hence, $w\in T^{2}_{D}(\bar{x},v).$ We have thus proved the equality stated in assertion (ii). $\hfill\Box$

Remark 1

If $D\subset X$ is a generalized polyhedral convex set then, for any $\bar{x}\in D$ and $v\in T_{D}(\bar{x})$ , one has $T_{D}(\bar{x})\subset T^{2}_{D}(\bar{x},v)$ , and the inclusion can be strict. We can justify this observation by representing $D$ in the form (2.3) and applying some formulas established in the proof of Proposition 1. Indeed, since $I^{0}(v)\subset I(\bar{x})$ , from (2.7), (2.8), and the equality $T^{2}_{D}(\bar{x},v)=T_{T_{D}(\bar{x})}(v)$ , one can deduce that $T_{D}(\bar{x})\subset T^{2}_{D}(\bar{x},v)$ . When $I^{0}(v)$ is a proper subset of $I(\bar{x})$ , the last inclusion can be strict. To have an example, one can choose

D=\big{\{}x=(x_{1},x_{2})\in\mathbb{R}^{2}\mid x_{1}\geq 0,x_{2}\geq 0\big{\}},

$\bar{x}=(0,0)$ , $v=(1,0)$ , then use (2.8) and the equality $T^{2}_{D}(\bar{x},v)=T_{T_{D}(\bar{x})}(v)$ to show that $T^{2}_{D}(\bar{x},v)=\big{\{}w=(w_{1},w_{2})\in\mathbb{R}^{2}\mid w_{2}\geq 0\big{\}}$ , while

T_{D}(\bar{x})=\big{\{}u=(u_{1},u_{2})\in\mathbb{R}^{2}\mid u_{1}\geq 0,u_{2}\geq 0\big{\}}.
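The example in Remark 1 can be checked mechanically. The following Python sketch assumes a finite-dimensional polyhedral set written as $D=\{x\in\mathbb{R}^{2}\mid\langle g_{i},x\rangle\leq h_{i}\}$ , so that $L=X$ and no equality constraint appears. The names `G`, `H`, `active`, `critical_indices`, and the membership tests are illustrative choices, not notation from the paper. The script evaluates $I(\bar{x})$ and $I^{0}(v)$ and tests membership through Proposition 1(i) and formula (2.8).

```python
from fractions import Fraction

# Assumed data for Remark 1: D = {x in R^2 | x1 >= 0, x2 >= 0},
# written as <g_i, x> <= h_i with g_1 = (-1, 0), g_2 = (0, -1), h = 0.
G = [(-1, 0), (0, -1)]
H = [0, 0]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def in_D(x):
    return all(dot(g, x) <= h for g, h in zip(G, H))

def active(x_bar):
    """I(x_bar): indices of the constraints that are active at x_bar."""
    return [i for i in range(len(G)) if dot(G[i], x_bar) == H[i]]

def in_tangent_cone(x_bar, v):
    """Proposition 1(i), specialized to the case without equality constraints."""
    return all(dot(G[i], v) <= 0 for i in active(x_bar))

def critical_indices(x_bar, v):
    """I^0(v): active indices i with <g_i, v> = 0."""
    return [i for i in active(x_bar) if dot(G[i], v) == 0]

def in_second_order_tangent_set(x_bar, v, w):
    """Proposition 1(ii) combined with formula (2.8); v is assumed tangent."""
    return all(dot(G[i], w) <= 0 for i in critical_indices(x_bar, v))

def path_point(x_bar, v, w, t):
    """The point x_bar + t*v + (t^2/2)*w from Definition 2 (taking w_k = w)."""
    return tuple(x + t * a + t * t / 2 * b for x, a, b in zip(x_bar, v, w))

x_bar, v = (0, 0), (1, 0)
w = (-5, 3)  # w2 >= 0 but w1 < 0
print(active(x_bar), critical_indices(x_bar, v))
print(in_tangent_cone(x_bar, w), in_second_order_tangent_set(x_bar, v, w))
# Exact rational step sizes t = 10^-k avoid floating-point rounding.
print(all(in_D(path_point(x_bar, v, w, Fraction(1, 10**k))) for k in range(1, 8)))
```

With 0-based indices, the vector $w=(-5,3)$ is reported as belonging to $T^{2}_{D}(\bar{x},v)$ but not to $T_{D}(\bar{x})$ , which is one concrete witness of the strict inclusion. The final line only samples finitely many step sizes, so it illustrates Definition 2 rather than proving membership.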

As a preparation for getting optimality conditions based on the Fréchet second-order subdifferential, we now recall the latter concept and some related constructions.

2.3 Constructions from generalized differentiation

Definition 4

(See (Mordukhovich_2006a, p. 4)) Let $\Omega$ be a nonempty subset of $X.$ The Fréchet normal cone to $\Omega$ at $x\in\Omega$ is given by

\displaystyle\widehat{N}_{\Omega}(x):=\Big{\{}x^{*}\in X^{*}\mid\limsup\limits_{u\xrightarrow{\Omega}x}\dfrac{\langle x^{*},u-x\rangle}{\|u-x\|}\leq 0\Big{\}},

where $u\xrightarrow{\Omega}x$ means that $u\rightarrow x$ and $u\in\Omega$ . If $x\not\in\Omega$ , we put $\widehat{N}_{\Omega}(x)=\emptyset$ .

If $\Omega$ is convex, one has

\widehat{N}_{\Omega}(x)=N_{\Omega}(x):=\big{\{}x^{*}\in X^{*}\mid\langle x^{*},u-x\rangle\leq 0,\ \forall u\in\Omega\big{\}},

i.e., $\widehat{N}_{\Omega}(x)$ coincides with the normal cone in the sense of convex analysis. In that case, $[T_{\Omega}(x)]^{*}=N_{\Omega}(x)$ and $[N_{\Omega}(x)]^{*}=T_{\Omega}(x)$ , where

[N_{\Omega}(x)]^{*}:=\{u\in X\mid\langle x^{*},u\rangle\leq 0,\ \forall x^{*}\in N_{\Omega}(x)\}.

Given a set-valued map $F:X\rightrightarrows Y$ between Banach spaces, one defines the graph of $F$ by ${\rm{gph}}\,F=\{(x,y)\in X\times Y\mid y\in F(x)\}.$ The product space $X\times Y$ is equipped with the norm $\|(x,y)\|:=\|x\|+\|y\|$ .

Definition 5

(See (Mordukhovich_2006a, p. 40)) The Fréchet coderivative of $F$ at $\bar{z}=(\bar{x},\bar{y})$ in ${\rm{gph}}\,F$ is the multifunction $\widehat{D}^{*}F(\bar{x},\bar{y}):Y^{*}\rightrightarrows X^{*}$ given by

\displaystyle\widehat{D}^{*}F(\bar{z})(y^{*})=\!\left\{x^{*}\in X^{*}\mid(x^{*},-y^{*})\!\in\!\widehat{N}_{{\rm gph}\,F}(\bar{z})\right\},\,\forall y^{*}\in Y^{*}.

If $(\bar{x},\bar{y})\notin{\rm{gph}}\,F$ , one puts $\widehat{D}^{*}F(\bar{z})(y^{*})=\emptyset$ for any $y^{*}\in Y^{*}$ .

If $F(x)=\{f(x)\}$ for all $x\in X$ , where $f:X\to Y$ is a single-valued map, we will write $\widehat{D}f(\bar{x})(y^{*})$ instead of $\widehat{D}^{*}F(\bar{x},f(\bar{x}))(y^{*})$ .

Proposition 2

(See (Mordukhovich_2006a, Theorem 1.38)) Let $f:X\rightarrow Y$ be a Fréchet differentiable function at $\bar{x}$ . Then $\widehat{D}f(\bar{x})(y^{*})=\{\nabla f(\bar{x})^{*}y^{*}\}$ for every $y^{*}\in Y^{*},$ where $\nabla f(\bar{x})^{*}$ is the adjoint operator of $\nabla f(\bar{x}).$

Consider a function $f:X\rightarrow\overline{\mathbb{R}}$ , where $\overline{\mathbb{R}}=[-\infty,+\infty]$ is the extended real line. The epigraph of $f$ is given by ${\rm{epi}}\,f=\{(x,\alpha)\in X\times\mathbb{R}\mid\alpha\geq f(x)\}.$

Definition 6

(See (Mordukhovich_2006a, Chapter 1)) Let $f:X\rightarrow\overline{\mathbb{R}}$ be a function defined on a Banach space. Suppose that $\bar{x}\in X$ and $|f(\bar{x})|<\infty.$ One calls the set

\displaystyle\widehat{\partial}f(\bar{x}):=\left\{x^{*}\in X^{*}\mid(x^{*},-1)\in\widehat{N}_{{\rm epi}\,f}((\bar{x},f(\bar{x})))\right\}

the Fréchet subdifferential of $f$ at $\bar{x}$ . If $|f(\bar{x})|=\infty$ , one puts $\widehat{\partial}f(\bar{x})=\emptyset$ .

Definition 7

(See (Mordukhovich_2006a, p. 122)) Let $f:X\rightarrow\overline{\mathbb{R}}$ be a function with a finite value at $\bar{x}.$ For any $\bar{y}\in\widehat{\partial}f(\bar{x})$ , the map $\widehat{\partial}^{2}f(\bar{x},\bar{y}):X^{**}\rightrightarrows X^{*}$ with the values

\displaystyle\widehat{\partial}^{2}f(\bar{x},\bar{y})(u):=(\widehat{D}^{*}\widehat{\partial}f)(\bar{x},\bar{y})(u)\quad(u\in X^{**})

is said to be the Fréchet second-order subdifferential of $f$ at $\bar{x}$ relative to $\bar{y}.$

If $\widehat{\partial}f(\bar{x})$ is a singleton, the symbol $\bar{y}$ in the notation $\widehat{\partial}^{2}f(\bar{x},\bar{y})(u)$ will be omitted. If $f:X\rightarrow\overline{\mathbb{R}}$ is Fréchet differentiable in an open neighborhood of $\bar{x}$ , then $\widehat{\partial}f(\bar{x})=\{\nabla f(\bar{x})\}$ . Moreover, if the operator $\nabla f:X\rightarrow X^{*}$ is Fréchet differentiable at $\bar{x}$ with the second-order derivative $\nabla^{2}f(\bar{x}):=\nabla(\nabla f(\cdot))(\bar{x})$ , then $\nabla^{2}f(\bar{x})$ maps $X$ to $X^{*}$ , so its adjoint $\nabla^{2}f(\bar{x})^{*}$ maps $X^{**}$ to $X^{*}$ . By Proposition 2, $\widehat{\partial}^{2}f(\bar{x})(u)=\{\nabla^{2}f(\bar{x})^{*}u\}$ for every $u\in X^{**}$ . When $X$ is finite-dimensional and $f$ is $C^{2}$-smooth in an open neighborhood of $\bar{x}$ , $\nabla^{2}f(\bar{x})$ is identified with the Hessian matrix of $f$ at $\bar{x}$ , for which one has $\nabla^{2}f(\bar{x})^{*}=\nabla^{2}f(\bar{x})$ by Clairaut’s rule.

The forthcoming subsection presents two lemmas which will be used repeatedly in the sequel.

2.4 Auxiliary results

Lemma 1

Let $C=\big{\{}x\in X\mid Ax=y,\ \langle x_{i}^{*},x\rangle\leq\alpha_{i},\ i=1,2,...,p\big{\}},$ where $A$ , $y$ , $x_{i}^{*},$ and $\alpha_{i}$ for $i=1,\dots,p$ are the same as in (2.3), be a generalized polyhedral convex set. For any $\bar{x}\in C$ and any $v\in T_{C}(\bar{x})$ with $-v\in T_{C}(\bar{x})$ , it holds that

\displaystyle T^{2}_{C}(\bar{x},-v)=T^{2}_{C}(\bar{x},v).

(2.10)

Proof

By Proposition 1, $T_{C}^{2}(\bar{x},v)=T_{T_{C}(\bar{x})}(v)$ and $T_{C}^{2}(\bar{x},-v)=T_{T_{C}(\bar{x})}(-v).$ Moreover, one has $T_{T_{C}(\bar{x})}(v)=[N_{T_{C}(\bar{x})}(v)]^{*}$ and $T_{T_{C}(\bar{x})}(-v)=[N_{T_{C}(\bar{x})}(-v)]^{*}.$ Therefore,

\displaystyle T_{C}^{2}(\bar{x},v)=[N_{T_{C}(\bar{x})}(v)]^{*}\quad{\rm and}\quad T_{C}^{2}(\bar{x},-v)=[N_{T_{C}(\bar{x})}(-v)]^{*}.

(2.11)

On one hand, by (Luan_Yao_Yen, Proposition 4.2), $N_{C}(\bar{x})={\rm cone}\,\big{\{}x_{i}^{*}\mid i\in I(\bar{x})\big{\}}+({\rm ker}\,A)^{\intercal},$ where $I(\bar{x})=\{i\in I\mid\langle x_{i}^{*},\bar{x}\rangle=\alpha_{i}\}$ and

({\rm ker}\,A)^{\intercal}=\{x^{*}\in X^{*}\mid\langle x^{*},x\rangle=0,\ \forall x\in{\rm ker}\,A\}.

On the other hand, according to Proposition 1,

\displaystyle T_{C}(\bar{x})=\{v\in X\mid Av=0,\ \langle x_{i}^{*},v\rangle\leq 0,\ i\in I(\bar{x})\}.

So, $v\in T_{C}(\bar{x})$ and $-v\in T_{C}(\bar{x})$ if and only if $Av=0,$ $\langle x_{i}^{*},v\rangle\leq 0,$ and $\langle x_{i}^{*},-v\rangle\leq 0$ for all $i\in I(\bar{x}).$ This means that $Av=0$ and $\langle x_{i}^{*},v\rangle=0$ for all $i\in I(\bar{x}).$ Putting $I^{0}(u)=\{i\in I(\bar{x})\mid\langle x_{i}^{*},u\rangle=0\}$ for every $u\in T_{C}(\bar{x}),$ we see that $I^{0}(v)=I(\bar{x})=I^{0}(-v)$ . So, thanks to (Luan_Yao_Yen, Proposition 4.2), we have

\displaystyle N_{T_{C}(\bar{x})}(v)={\rm cone}\,\{x_{i}^{*}\mid i\in I^{0}(v)\}+({\rm ker}\,A)^{\intercal}

and $N_{T_{C}(\bar{x})}(-v)={\rm cone}\,\{x_{i}^{*}\mid i\in I^{0}(v)\}+({\rm ker}\,A)^{\intercal}$ . Thus, by (2.11) we get

\displaystyle T_{C}^{2}(\bar{x},-v)=[N_{T_{C}(\bar{x})}(-v)]^{*}=[N_{T_{C}(\bar{x})}(v)]^{*}=T_{C}^{2}(\bar{x},v).

This justifies (2.10) and completes the proof. $\hfill\Box$
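The key step of the proof, namely $I^{0}(v)=I(\bar{x})=I^{0}(-v)$ whenever both $v$ and $-v$ are tangent, can be illustrated numerically. The Python sketch below assumes a polyhedral set in $\mathbb{R}^{2}$ with inequality constraints only; the data `G`, `H` and all function names are hypothetical and chosen for illustration. It compares membership in $T^{2}_{C}(\bar{x},v)$ and $T^{2}_{C}(\bar{x},-v)$ on a small integer grid of candidate vectors $w$ , using the description of the second-order tangent set from Proposition 1.

```python
# Assumed data (not from the paper): C = {x in R^2 | -x2 <= 0, x1 + x2 <= 1},
# i.e. <g_i, x> <= h_i with no equality constraint (L = X).
G = [(0, -1), (1, 1)]
H = [0, 1]

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def active(x_bar):
    """I(x_bar): indices of the constraints active at x_bar."""
    return [i for i in range(len(G)) if dot(G[i], x_bar) == H[i]]

def in_tangent_cone(x_bar, v):
    """Proposition 1(i): <g_i, v> <= 0 for every active index i."""
    return all(dot(G[i], v) <= 0 for i in active(x_bar))

def critical_indices(x_bar, u):
    """I^0(u) = {i in I(x_bar) | <g_i, u> = 0}."""
    return [i for i in active(x_bar) if dot(G[i], u) == 0]

def in_second_order_tangent_set(x_bar, v, w):
    """Membership test from (2.8) and Proposition 1(ii); v is assumed tangent."""
    return all(dot(G[i], w) <= 0 for i in critical_indices(x_bar, v))

x_bar, v = (0, 0), (1, 0)
minus_v = tuple(-c for c in v)

# Compare the two second-order tangent sets on a grid of integer vectors w.
samples = [(a, b) for a in range(-2, 3) for b in range(-2, 3)]
same = all(
    in_second_order_tangent_set(x_bar, v, w)
    == in_second_order_tangent_set(x_bar, minus_v, w)
    for w in samples
)
print(in_tangent_cone(x_bar, v), in_tangent_cone(x_bar, minus_v))
print(critical_indices(x_bar, v), critical_indices(x_bar, minus_v), active(x_bar))
print(same)
```

Since both membership tests depend on $v$ only through the index set $I^{0}(v)$ , equality of the index sets is exactly what makes the two tests agree; the grid comparison is a sanity check, not a proof.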

Consider the problem

\min\{f(x)\mid x\in C\},

(P)

where $f:X\rightarrow\mathbb{R}$ is a Fréchet differentiable function and $C$ is a nonempty subset of $X$ .

Lemma 2

Suppose that $\bar{x}$ is a local minimum of (P), where $C$ is a generalized polyhedral convex set. Then, $\langle\nabla f(\bar{x}),v\rangle\geq 0$ for every $v\in T_{C}(\bar{x}).$ Moreover, if $v\in T_{C}(\bar{x})$ is such that $\langle\nabla f(\bar{x}),v\rangle=0$ , then

\displaystyle\langle\nabla f(\bar{x}),w\rangle\geq 0\ \;\mbox{for all}\ \,w\in T_{C}^{2}(\bar{x},v).

(2.12)

Proof

The first assertion is a special case of the result recalled in Theorem 3.1 below. Let $v\in T_{C}(\bar{x})$ be such that $\langle\nabla f(\bar{x}),v\rangle=0$ . To get (2.12), fix any $w\in T_{C}^{2}(\bar{x},v)$ . By Proposition 1 we have $T_{C}^{2}(\bar{x},v)=T_{T_{C}(\bar{x})}(v).$ Moreover, since $C$ is a generalized polyhedral convex set, $T_{C}(\bar{x})$ is a generalized polyhedral convex cone by (Luan_Yao_Yen, Proposition 2.22). So, applying (Luan_Yao_Yen, Proposition 2.22) once more, one has $T_{T_{C}(\bar{x})}(v)={\rm cone}\,(T_{C}(\bar{x})-v).$ Thus, the representation $w=\lambda(v^{\prime}-v)$ holds for some $v^{\prime}\in T_{C}(\bar{x})$ and $\lambda>0$ . Therefore,

\langle\nabla f(\bar{x}),w\rangle=\lambda\langle\nabla f(\bar{x}),v^{\prime}\rangle-\lambda\langle\nabla f(\bar{x}),v\rangle.

As $\langle\nabla f(\bar{x}),v^{\prime}\rangle\geq 0$ for any $v^{\prime}\in T_{C}(\bar{x})$ by the first assertion and $\langle\nabla f(\bar{x}),v\rangle=0$ by our assumption, this implies (2.12). $\hfill\Box$

3 Problems in the classical setting

In this section, we focus on second-order optimality conditions for problem (P) under the assumption that $f$ is twice continuously differentiable on $X$ (i.e., $f$ is a $C^{2}$-smooth function). By abuse of terminology, we call such a problem (P) a problem in the classical setting.

The next first-order and second-order necessary optimality conditions are known results. The proofs in a finite-dimensional setting given in (Ruszczynski2006, p. 114 and p. 144) are also valid for the infinite-dimensional setting adopted in the present paper. For the first statement, it suffices to assume that $f$ is Fréchet differentiable at $\bar{x}$ .

Theorem 3.1

(See, e.g., (Ruszczynski2006, Theorem 3.24)) If $\bar{x}$ is a local minimum of (P), then

\displaystyle\langle\nabla f(\bar{x}),v\rangle\geq 0\ \;\mbox{for all}\ \,v\in T_{C}(\bar{x}).

(3.1)

Theorem 3.2

(See, e.g., (Ruszczynski2006, Theorem 3.45)) Assume that $\bar{x}$ is a local minimum of (P). Then (3.1) holds and, for every $v\in T_{C}(\bar{x})$ satisfying $\langle\nabla f(\bar{x}),v\rangle=0$ , one has

\displaystyle\langle\nabla f(\bar{x}),w\rangle+\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0\ \;\mbox{for all}\ \,w\in T_{C}^{2}(\bar{x},v).

(3.2)

Clearly, the simultaneous fulfillment of the inequalities $\langle\nabla f(\bar{x}),w\rangle\geq 0$ and $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ yields the inequality $\langle\nabla f(\bar{x}),w\rangle+\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ in (3.2). Hence, it is reasonable to raise the next question.

Question 1: When can Theorem 3.2 be stated in the following stronger form: “If $\bar{x}$ is a local minimum of (P), then (3.1) holds and the conditions

(c1): $\langle\nabla f(\bar{x}),w\rangle\geq 0$ for all $w\in T_{C}^{2}(\bar{x},v)$ , where $v\in T_{C}(\bar{x})$ is such that $\langle\nabla f(\bar{x}),v\rangle=0$ (i.e., $v$ is a critical direction),
(c2): $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ for all $v\in T_{C}(\bar{x})$ satisfying $\langle\nabla f(\bar{x}),v\rangle=0$

are fulfilled.”?

If $C$ is a generalized polyhedral convex set, we can answer the above question as follows.

Theorem 3.3

Let $C$ be a generalized polyhedral convex set in a Banach space $X$ . If $\bar{x}$ is a local minimum of (P), then (3.1) holds and the conditions (c1) and (c2) are fulfilled.

Proof

To obtain (c1), pick an arbitrary vector $w\in T_{C}^{2}(\bar{x},v)$ , where $v\in T_{C}(\bar{x})$ and $\langle\nabla f(\bar{x}),v\rangle=0$ . Applying Lemma 2, we have $\langle\nabla f(\bar{x}),w\rangle\geq 0$ .

To prove (c2), take any $v\in T_{C}(\bar{x})$ with $\langle\nabla f(\bar{x}),v\rangle=0$ . If $v=0$ , then the inequality $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ is obvious. Now, assume that $v\neq 0$ . On one hand, since $C$ is a generalized polyhedral convex set, Proposition 2.22 from Luan_Yao_Yen guarantees that

\displaystyle T_{C}(\bar{x})={\rm cone}\,(C-\bar{x})=\{\lambda(x-\bar{x})\mid\lambda>0,\ x\in C\}.

Hence, we have $v=\lambda_{0}(y-\bar{x})$ for some $y\in C$ , $y\not=\bar{x}$ , and $\lambda_{0}>0$ . On the other hand, as $\bar{x}$ is a local minimum of (P), there exists $\varepsilon>0$ such that $f(\bar{x})\leq f(x)$ for every $x\in C$ with $\|x-\bar{x}\|\leq\varepsilon.$ Put $\bar{\lambda}=\min\{\lambda_{0}^{-1},\varepsilon(\lambda_{0}\|y-\bar{x}\|)^{-1}\}$ . Then, $\bar{\lambda}>0$ and, for all $\lambda\in(0,\bar{\lambda}]$ , one has $\lambda\lambda_{0}\in(0,1]$ , so the convexity of $C$ gives $\bar{x}+\lambda v=\bar{x}+\lambda\lambda_{0}(y-\bar{x})\in C$ , while $\|(\bar{x}+\lambda v)-\bar{x}\|=\lambda\lambda_{0}\|y-\bar{x}\|\leq\varepsilon$ . Therefore,

\displaystyle\begin{aligned}f(\bar{x})\leq f(\bar{x}+\lambda v)&=f(\bar{x})+\lambda\langle\nabla f(\bar{x}),v\rangle+\frac{\lambda^{2}}{2}\langle\nabla^{2}f(\bar{x})v,v\rangle+o(\lambda^{2})\\ &=f(\bar{x})+\frac{\lambda^{2}}{2}\langle\nabla^{2}f(\bar{x})v,v\rangle+o(\lambda^{2}).\end{aligned}

It follows that $\frac{\lambda^{2}}{2}\langle\nabla^{2}f(\bar{x})v,v\rangle+o(\lambda^{2})\geq 0$ for all $\lambda\in(0,\bar{\lambda}]$ . Dividing both sides of the last inequality by $\frac{\lambda^{2}}{2}$ and taking the limit as $\lambda\to 0^{+}$ , we get $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0,$ as desired. $\hfill\Box$
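The limiting step at the end of the proof can be observed numerically. The Python sketch below uses a hypothetical $C^{2}$-smooth function on $\mathbb{R}^{2}$ (not taken from the paper) and assumes the constraint set $C=\{x\in\mathbb{R}^{2}\mid x_{1}\geq 0,\ x_{2}\geq 0\}$ with $\bar{x}=(0,0)$ , so that $v=(1,0)$ is a feasible direction with $\langle\nabla f(\bar{x}),v\rangle=0$ . It evaluates the quotient $\big(f(\bar{x}+\lambda v)-f(\bar{x})\big)/(\lambda^{2}/2)$ , which should approach $\langle\nabla^{2}f(\bar{x})v,v\rangle$ as $\lambda\to 0^{+}$ .

```python
def f(x):
    # Hypothetical C^2 objective: grad f(0) = (0, 1), Hessian at 0 is diag(1, -1).
    x1, x2 = x
    return 0.5 * x1**2 + x1**3 - 0.5 * x2**2 + x2

def second_order_quotient(func, x_bar, v, lam):
    """(func(x_bar + lam*v) - func(x_bar)) / (lam^2 / 2) for a step size lam > 0."""
    x = tuple(a + lam * b for a, b in zip(x_bar, v))
    return (func(x) - func(x_bar)) / (lam**2 / 2)

x_bar, v = (0.0, 0.0), (1.0, 0.0)  # v is critical: <grad f(x_bar), v> = 0
for lam in (1e-1, 1e-2, 1e-3):
    # For this f the quotient equals 1 + 2*lam up to rounding.
    print(lam, second_order_quotient(f, x_bar, v, lam))
```

For this particular function the quotient tends to $1=\langle\nabla^{2}f(\bar{x})v,v\rangle$ , in agreement with (c2). The remainder term $o(\lambda^{2})$ is visible as the deviation $2\lambda$ .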

Remark 2

In the setting of Theorem 3.3, one has $T_{C}(\bar{x})\subset T^{2}_{C}(\bar{x},v)$ for any $v\in T_{C}(\bar{x})$ . Since the inclusion of sets can be strict (see Remark 1), the property (c1) asserted by Theorem 3.3 is more stringent than the first-order necessary condition in (3.1) which reads as follows: $\langle\nabla f(\bar{x}),u\rangle\geq 0$ for every $u\in T_{C}(\bar{x})$ .

As an application of Theorem 3.3, we now specialize it to the case of quadratic programming problems on Banach spaces with generalized polyhedral convex constraint sets. Note that the latter problems have been considered, for example, in Bonnans_Shapiro_2000 and Yen_Yang_2018 . One calls (P) a quadratic programming problem on a generalized polyhedral convex set if $C\subset X$ is a generalized polyhedral convex set and $f(x)=\frac{1}{2}\langle Mx,x\rangle+\langle q,x\rangle+\alpha$ , where $M:X\to X^{*}$ is a bounded linear operator, $q\in X^{*}$ , and $\alpha\in\mathbb{R}$ . It is assumed that $M$ is symmetric in the sense that $\langle Mx,y\rangle=\langle My,x\rangle$ for all $x,y\in X$ . Since $\nabla f(x)=Mx+q$ and $\nabla^{2}f(x)v=Mv$ for all $x,v\in X$ , the next statement follows directly from Theorem 3.3.

Theorem 3.4

Assume that (P) is a quadratic programming problem given by a generalized polyhedral convex set $C\subset X$ and a linear-quadratic function $f(x)=\frac{1}{2}\langle Mx,x\rangle+\langle q,x\rangle+\alpha$ with $M$ being symmetric. If $\bar{x}$ is a local minimum of this problem (P), then the following conditions are satisfied:

(c0): $\langle M\bar{x}+q,v\rangle\geq 0$ for all $v\in T_{C}(\bar{x})$ ;
(c1’): $\langle M\bar{x}+q,w\rangle\geq 0$ for all $w\in T_{C}^{2}(\bar{x},v)$ , where $v\in T_{C}(\bar{x})$ is such that $\langle M\bar{x}+q,v\rangle=0$ ;
(c2’): $\langle Mv,v\rangle\geq 0$ for all $v\in T_{C}(\bar{x})$ satisfying $\langle M\bar{x}+q,v\rangle=0$ .
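For a finite-dimensional illustration, the three conditions can be screened on sampled directions. The Python sketch below assumes $X=\mathbb{R}^{2}$ and $C=\{x\mid x_{1}\geq 0,\ x_{2}\geq 0\}$ written as $\langle g_{i},x\rangle\leq h_{i}$ without equality constraints; the data, the tolerance `TOL`, and the function `check_conditions` are hypothetical. Since all three conditions are positively homogeneous in $v$ and $w$ , the script samples the zero vector and unit directions only. A finite sample can refute a condition but cannot certify it, so the output is a heuristic check and not a proof.

```python
import math

# Assumed data: C = {x in R^2 | x1 >= 0, x2 >= 0} as <g_i, x> <= h_i.
G = [(-1.0, 0.0), (0.0, -1.0)]
H = [0.0, 0.0]
TOL = 1e-9  # tolerance for floating-point comparisons

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def mat_vec(M, x):
    return tuple(dot(row, x) for row in M)

def active(x_bar):
    return [i for i in range(len(G)) if abs(dot(G[i], x_bar) - H[i]) <= TOL]

def in_cone(indices, u):
    """u satisfies <g_i, u> <= 0 for every i in indices."""
    return all(dot(G[i], u) <= TOL for i in indices)

def sample_directions(n=360):
    unit = [(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
            for k in range(n)]
    return [(0.0, 0.0)] + unit

def check_conditions(M, q, x_bar, n=360):
    """Screen (c0), (c1'), (c2') of Theorem 3.4 on sampled directions."""
    grad = tuple(m + c for m, c in zip(mat_vec(M, x_bar), q))  # M x_bar + q
    act = active(x_bar)
    dirs = sample_directions(n)
    tangent = [v for v in dirs if in_cone(act, v)]          # Proposition 1(i)
    c0 = all(dot(grad, v) >= -TOL for v in tangent)
    critical = [v for v in tangent if abs(dot(grad, v)) <= TOL]
    c1 = True
    for v in critical:
        i0 = [i for i in act if abs(dot(G[i], v)) <= TOL]   # I^0(v)
        second = [w for w in dirs if in_cone(i0, w)]        # Proposition 1(ii)
        c1 = c1 and all(dot(grad, w) >= -TOL for w in second)
    c2 = all(dot(mat_vec(M, v), v) >= -TOL for v in critical)
    return c0, c1, c2

q, x_bar = (0.0, 1.0), (0.0, 0.0)
print(check_conditions(((1.0, 0.0), (0.0, -1.0)), q, x_bar))
print(check_conditions(((-1.0, 0.0), (0.0, 1.0)), q, x_bar))
```

In the second call, $M={\rm diag}(-1,1)$ gives $\langle Mv,v\rangle=-1$ for the critical direction $v=(1,0)$ , so (c2’) is violated; correspondingly $f(t,0)=-t^{2}/2+\alpha<f(\bar{x})$ for $t>0$ , and $\bar{x}$ is not a local minimum.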

According to the Majthay-Contesse theorem (see (Lee_Tam_Yen, , Theorem 3.4)), second-order necessary optimality conditions for finite-dimensional quadratic programs are also sufficient ones. Thus, it is of interest to know whether a similar assertion remains true for the second-order necessary optimality conditions in Theorem 3.4, or not.

Question 2: Under the assumptions of Theorem 3.4, if $\bar{x}\in C$ is such that the conditions (c0), (c1’), and (c2’) are fulfilled, is $\bar{x}$ necessarily a local minimum of (P)?

Turning our attention back to Theorem 3.3, observe that if $C$ is not a generalized polyhedral convex set, then the assertions of that theorem may not hold anymore. This means that, in general, the pair of conditions (c1) and (c2) is much stronger than condition (3.2).

To clarify the above observation, we first consider an example where $C$ is a compact convex set in $\mathbb{R}^{2}$ , which is given by a simple inequality.

Example 1

(See (LVD2014, , Example 2, p. 20)) Consider problem (P) where $X=\mathbb{R}^{2}$ , $f(x)=-2x_{1}^{2}-x_{2}^{2}$ for all $x=(x_{1},x_{2})$ , and

C=\big{\{}x=(x_{1},x_{2})\mid g(x)=2x_{1}^{2}+3x_{2}^{2}-6\leq 0\big{\}}.

Since $f$ is continuous and $C$ is compact, (P) has a global solution. As $f$ is Fréchet differentiable, by a well-known necessary optimality condition (see the proof of Theorem 5.1 in Mordukhovich_2006b ) which is a dual form of the condition recalled in Theorem 3.1, if $\bar{x}=(\bar{x}_{1},\bar{x}_{2})$ is a solution of (P) then

\displaystyle 0\in\nabla f(\bar{x})+\widehat{N}_{C}(\bar{x}).

(3.3)

On one hand, $\nabla f(\bar{x})=(-4\bar{x}_{1},-2\bar{x}_{2})^{T}$ . On the other hand, as $C$ is a convex set, $\widehat{N}_{C}(\bar{x})$ coincides with the normal cone to $C$ at $\bar{x}$ in the sense of convex analysis. Hence, by (IoffeTihomirov, , p. 206) we have $\widehat{N}_{C}(\bar{x})=\{\lambda\nabla g(\bar{x})=\lambda(4\bar{x}_{1},6\bar{x}_{2})^{T}\mid\lambda\geq 0\}$ whenever $\bar{x}\in\partial C$ . Therefore, if $\bar{x}\in\partial C$ , then (3.3) is equivalent to the existence of $\lambda\geq 0$ satisfying

\begin{cases}-4\bar{x}_{1}+4\lambda\bar{x}_{1}=0\\ -2\bar{x}_{2}+6\lambda\bar{x}_{2}=0.\end{cases}

From this condition, we get four critical points $\bar{x}^{1}=(\sqrt{3},0)^{T}$ , $\bar{x}^{2}=(-\sqrt{3},0)^{T}$ , $\bar{x}^{3}=(0,-\sqrt{2})^{T}$ , $\bar{x}^{4}=(0,\sqrt{2})^{T}$ . If $\bar{x}\in{\rm int}C$ , then (3.3) is equivalent to the condition $\nabla f(\bar{x})=0$ , which gives the fifth critical point $\bar{x}^{5}=(0,0)^{T}$ . Comparing the values of $f$ at these five points, we conclude that $\bar{x}^{1}=(\sqrt{3},0)^{T}$ and $\bar{x}^{2}=(-\sqrt{3},0)^{T}$ are the global minima of (P). Obviously, there exists $x^{0}\in\mathbb{R}^{2}$ such that $\langle\nabla g(\bar{x}^{1}),x^{0}\rangle<0.$ This means that the regularity condition in (Ruszczynski2006, , Lemma 3.16) is satisfied. So, according to (Ruszczynski2006, , formula (3.29), p. 115), one has

	$\displaystyle T_{C}(\bar{x}^{1})$	$\displaystyle=\{v\in\mathbb{R}^{2}\mid\langle\nabla g(\bar{x}^{1}),v\rangle\leq 0\}$
		$\displaystyle=\{v=(v_{1},v_{2})\in\mathbb{R}^{2}\mid v_{1}\leq 0,\ v_{2}\in\mathbb{R}\}.$

Since $\nabla f(\bar{x}^{1})=\left(-4\sqrt{3},0\right)^{T}$ , fixing any $v=(0,v_{2})^{T}\in T_{C}(\bar{x}^{1})$ , we have $\langle\nabla f(\bar{x}^{1}),v\rangle=0$ . Moreover, by (Ruszczynski2006, , Lemma 3.44),

	$\displaystyle T^{2}_{C}(\bar{x}^{1},v)$	$\displaystyle=\{w=(w_{1},w_{2})\in\mathbb{R}^{2}\mid\langle\nabla g(\bar{x}^{1}),w\rangle\leq-\langle\nabla^{2}g(\bar{x}^{1})v,v\rangle\}$
		$\displaystyle=\Big{\{}w=(w_{1},w_{2})\in\mathbb{R}^{2}\mid w_{1}\leq\dfrac{-6v_{2}^{2}}{4\sqrt{3}}\Big{\}}.$

It follows that $\langle\nabla f(\bar{x}^{1}),w\rangle=-4\sqrt{3}w_{1}\geq 6v_{2}^{2}\geq 0$ for every $w\in T_{C}^{2}(\bar{x}^{1},v)$ . Hence, condition (c1) in Theorem 3.3 is satisfied. Since $\langle\nabla^{2}f(\bar{x}^{1})v,v\rangle=-2v_{2}^{2}$ , the requirement $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ in condition (c2) is violated if $v_{2}\neq 0$ . Thus, the pair of conditions (c1) and (c2) does not hold, while condition (3.2) is fulfilled.
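The computations of Example 1 can be double-checked numerically. The Python sketch below is only an illustration under stated assumptions: it samples the boundary ellipse of $C$ on a grid of 3600 points (a hypothetical choice) and evaluates the quantities appearing in (c1) and (c2) at $\bar{x}^{1}=(\sqrt{3},0)^{T}$ for a direction $v=(0,v_{2})^{T}$. The helper names `check_direction` and `boundary_min` are not from the paper.

```python
import math

# Data of Example 1: f(x) = -2*x1^2 - x2^2 and g(x) = 2*x1^2 + 3*x2^2 - 6.
def f(x):
    return -2.0 * x[0] ** 2 - x[1] ** 2

def grad_f(x):
    return (-4.0 * x[0], -2.0 * x[1])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1]

x_bar = (math.sqrt(3.0), 0.0)

# Sample the boundary ellipse x = (sqrt(3) cos t, sqrt(2) sin t) and record the
# smallest sampled value of f; it should be close to f(x_bar) = -6.
boundary_min = min(
    f((math.sqrt(3.0) * math.cos(2 * math.pi * i / 3600),
       math.sqrt(2.0) * math.sin(2 * math.pi * i / 3600)))
    for i in range(3600)
)

def check_direction(v2):
    """Return (<grad f, v>, least value of <grad f, w> over T^2_C, <Hess f v, v>) for v = (0, v2)."""
    v = (0.0, v2)
    first_order = dot(grad_f(x_bar), v)
    # <grad f(x_bar), w> = -4*sqrt(3)*w1 is smallest at the largest admissible w1,
    # namely w1 = -6*v2^2 / (4*sqrt(3)), taken from the formula for T^2_C in the example.
    w = (-6.0 * v2 ** 2 / (4.0 * math.sqrt(3.0)), 0.0)
    c1_worst = dot(grad_f(x_bar), w)
    hess_form = -4.0 * v[0] ** 2 - 2.0 * v[1] ** 2  # Hess f = diag(-4, -2)
    return first_order, c1_worst, hess_form
```

For $v_{2}=1$ the three returned numbers are approximately $0$, $6$, and $-2$: the direction is critical, the least value of $\langle\nabla f(\bar{x}^{1}),w\rangle$ over $T_{C}^{2}(\bar{x}^{1},v)$ is $6v_{2}^{2}$, and the Hessian form is negative.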

Next, let us consider an example where $C$ is a nonconvex compact set given by an equality.

Example 2

(See (LVD2014, , Example 1, p. 29)) Consider problem (P) and suppose that $f(x)=-x_{1}^{2}-x_{2}^{2}$ for $x=(x_{1},x_{2})\in\mathbb{R}^{2}$ ,

C=\big{\{}x=(x_{1},x_{2})\in\mathbb{R}^{2}\mid h(x)=x_{1}^{2}+2x_{2}^{2}-1=0\big{\}}.

As shown in (LVD2014, , p. 29), $\bar{x}^{1}=(1,0)^{T}$ and $\bar{x}^{2}=(-1,0)^{T}$ are the global solutions of this problem. According to (Ruszczynski2006, , Formula (3.29), p. 115),

\displaystyle T_{C}(\bar{x}^{2})=\{v=(v_{1},v_{2})\in\mathbb{R}^{2}\mid v_{1}=0\}.

Fixing any $v=(0,v_{2})^{T}\in T_{C}(\bar{x}^{2})$ , we have $\langle\nabla f(\bar{x}^{2}),v\rangle=0$ . By (Ruszczynski2006, , Lemma 3.44),

	$\displaystyle T^{2}_{C}(\bar{x}^{2},v)$	$\displaystyle=\{w=(w_{1},w_{2})\in\mathbb{R}^{2}\mid\langle\nabla h(\bar{x}^{2}),w\rangle=-\langle\nabla^{2}h(\bar{x}^{2})v,v\rangle\}$
		$\displaystyle=\{w=(w_{1},w_{2})\in\mathbb{R}^{2}\mid w_{1}=2v_{2}^{2}\}.$

Since $\langle\nabla f(\bar{x}^{2}),w\rangle=2w_{1}=4v_{2}^{2}\geq 0$ for all $w\in T_{C}^{2}(\bar{x}^{2},v)$ , condition (c1) in Theorem 3.3 is satisfied. Meanwhile, since $\langle\nabla^{2}f(\bar{x}^{2})v,v\rangle=-2v_{2}^{2}\leq 0$ , the inequality $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ in condition (c2) is violated if $v_{2}\neq 0$ . Thus, the conditions (c1) and (c2) do not hold simultaneously, while condition (3.2) is fulfilled.
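The formula $w_{1}=2v_{2}^{2}$ in Example 2 can be illustrated by following a curve in $C$. The Python sketch below assumes the convention that $w\in T_{C}^{2}(\bar{x},v)$ is approximated by the quotient $(x(t)-\bar{x}-tv)/(t^{2}/2)$ along points $x(t)\in C$, which is consistent with the formula from Ruszczynski2006 used in the example; the parametrization `curve` and the function names are hypothetical helpers.

```python
import math

# Data of Example 2: f(x) = -x1^2 - x2^2 and h(x) = x1^2 + 2*x2^2 - 1.
def f(x):
    return -x[0] ** 2 - x[1] ** 2

def h(x):
    return x[0] ** 2 + 2.0 * x[1] ** 2 - 1.0

x_bar = (-1.0, 0.0)

def curve(t, v2):
    """Point of C near x_bar with second coordinate t*v2 (requires 2*(t*v2)^2 < 1)."""
    return (-math.sqrt(1.0 - 2.0 * (t * v2) ** 2), t * v2)

def second_order_quotient(t, v2):
    """(x(t) - x_bar - t*v) / (t^2/2) for v = (0, v2); expected to approach (2*v2^2, 0)."""
    x = curve(t, v2)
    half_t2 = 0.5 * t * t
    return ((x[0] - x_bar[0]) / half_t2, (x[1] - x_bar[1] - t * v2) / half_t2)
```

For $v_{2}=1$ and $t=10^{-3}$ the first component of the quotient is close to $2=2v_{2}^{2}$. Along the same curve one has $f(x(t))-f(\bar{x}^{2})=t^{2}v_{2}^{2}>0$, although $\langle\nabla^{2}f(\bar{x}^{2})v,v\rangle=-2v_{2}^{2}<0$.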

4 Problems in a new setting

The following second-order necessary optimality condition for (P) is one of the main results of this paper. It is based on the Fréchet second-order subdifferential of $f$ and the second-order tangent set to $C$ , which is assumed to be a convex set of a special type. Unlike the situation in Theorem 3.3 where $f$ was assumed to be a $C^{2}$ -smooth function, in the next theorem and throughout this section we just assume that $f$ is a $C^{1}$ -smooth function.

Theorem 4.1

(Second-order necessary optimality condition) Assume that $\bar{x}$ is a locally optimal solution of (P), where $C$ is a generalized polyhedral convex set. Suppose that there exists a constant $\ell>0$ such that

\displaystyle||\nabla f(x)-\nabla f(\bar{x})||\leq\ell||x-\bar{x}||

(4.1)

for every $x$ in some neighborhood of $\bar{x}$ . Consider the restricted second-order subdifferential $\widehat{\partial}^{2}f(\bar{x}):X\rightrightarrows X^{*}$ , where $X$ is canonically embedded in $X^{**}$ . Then, (3.1) is valid and, for each $v\in T_{C}(\bar{x})$ such that $-v\in T_{C}(\bar{x})$ and $\langle\nabla f(\bar{x}),v\rangle=0$ , one has

\displaystyle\langle\nabla f(\bar{x}),w\rangle\geq 0

(4.2)

and

\displaystyle\langle z,v\rangle\geq 0

(4.3)

for any $w\in T_{C}^{2}(\bar{x},v)$ and $z\in\widehat{\partial}^{2}f(\bar{x})(v).$

Proof

Let $\bar{x}$ be a locally optimal solution of (P) such that (4.1) is valid for all $x$ in a neighborhood $U$ of $\bar{x}$ , where $\ell$ is a positive constant. Let $v\in T_{C}(\bar{x})$ be such that $-v\in T_{C}(\bar{x})$ and $\langle\nabla f(\bar{x}),v\rangle=0$ . Suppose that $w\in T_{C}^{2}(\bar{x},v)$ and $z\in\widehat{\partial}^{2}f(\bar{x})(v)$ are given arbitrarily. Since $C$ is a generalized polyhedral convex set, by Lemma 2 we have (4.2). It remains to prove (4.3). To obtain a contradiction, suppose that

\displaystyle\langle z,v\rangle<0.

(4.4)

By the definition of Fréchet second-order subdifferential, from $z\in\widehat{\partial}^{2}f(\bar{x})(v)$ we get $z\in\widehat{D}^{*}\nabla f(\cdot)(\bar{x})(v)$ or, equivalently, $(z,-v)\in\widehat{N}_{\textrm{gph}\nabla f(\cdot)}((\bar{x},\nabla f(\bar{x}))).$ So, one has

\displaystyle\limsup\limits_{x\rightarrow\bar{x}}\frac{\langle(z,-v),\big{(}x,\nabla f(x)\big{)}-(\bar{x},\nabla f(\bar{x}))\rangle}{\|x-\bar{x}\|+\|\nabla f(x)-\nabla f(\bar{x})\|}\leq 0.

(4.5)

Recall that every vector $u\in X$ can be regarded as an element of $X^{**}$ by setting $\langle u,x^{*}\rangle=\langle x^{*},u\rangle$ for all $x^{*}\in X^{*}$ . Hence $\langle u,\nabla f(x)\rangle=\langle\nabla f(x),u\rangle$ for all $u,x\in X$ . Since $\langle\nabla f(\bar{x}),v\rangle=0$ , from (4.5) we obtain

\displaystyle\limsup\limits_{x\rightarrow\bar{x}}\frac{\langle z,x-\bar{x}\rangle-\langle\nabla f(x),v\rangle}{\|x-\bar{x}\|+\|\nabla f(x)-\nabla f(\bar{x})\|}\leq 0.

(4.6)

Moreover, as $C$ is a generalized polyhedral convex set and $-v\in T_{C}(\bar{x})$ , there exists $\bar{k}\in\mathbb{N}$ such that $x^{k}:=\bar{x}-\frac{1}{k}v$ belongs to $C$ for all $k\geq\bar{k}$ .

Since $\bar{x}$ is a local solution of (P) and $\displaystyle\lim_{k\to\infty}x^{k}=\bar{x}$ , there is no loss of generality in assuming that

\displaystyle f(x^{k})\geq f(\bar{x}),\ \,\forall k\geq\bar{k}.

(4.7)

For each $k\geq\bar{k}$ , by the classical mean value theorem one can find a vector

\xi_{k}\in(\bar{x},x^{k}):=\{(1-\tau)\bar{x}+\tau x^{k}\ |\ \tau\in(0,1)\}

such that $f(x^{k})-f(\bar{x})=\langle\nabla f(\xi_{k}),x^{k}-\bar{x}\rangle.$ Since $x^{k}=\bar{x}-\frac{1}{k}v$ , combining this with (4.7) yields $-\frac{1}{k}\langle\nabla f(\xi_{k}),v\rangle\geq 0.$ It follows that

\displaystyle\langle\nabla f(\xi_{k}),v\rangle\leq 0\quad(\forall k\geq\bar{k}).

(4.8)

From (4.6) we can deduce that

\displaystyle\limsup\limits_{{k}\rightarrow\infty}\frac{\langle z,\xi_{k}-\bar{x}\rangle-\langle\nabla f(\xi_{k}),v\rangle}{\|\xi_{k}-\bar{x}\|+\|\nabla f(\xi_{k})-\nabla f(\bar{x})\|}\leq 0.

Noting that $\xi_{k}=\bar{x}-t_{k}v$ for some $t_{k}\in\left(0,\frac{1}{k}\right)$ , from this one gets

\displaystyle\limsup\limits_{{k}\rightarrow\infty}\Delta_{k}\leq 0,

(4.9)

where

\Delta_{k}:=\frac{-t_{k}\langle z,v\rangle-\langle\nabla f(\xi_{k}),v\rangle}{\|-t_{k}v\|+\|\nabla f(\xi_{k})-\nabla f(\bar{x})\|}.

Clearly,

\displaystyle\Delta_{k}=\frac{-\langle z,v\rangle-t_{k}^{-1}\langle\nabla f(\xi_{k}),v\rangle}{\|v\|+t_{k}^{-1}\|\nabla f(\xi_{k})-\nabla f(\bar{x})\|}.

Hence, by (4.8) one has

\Delta_{k}\geq\frac{-\langle z,v\rangle}{\|v\|+t_{k}^{-1}\|\nabla f(\xi_{k})-\nabla f(\bar{x})\|}.

On one hand, using (4.1) we obtain

\displaystyle||\nabla f(\xi_{k})-\nabla f(\bar{x})||\leq\ell||\xi_{k}-\bar{x}||=\ell t_{k}||v||,

provided that $k$ is large enough. On the other hand, by virtue of (4.4) we have $-\langle z,v\rangle>0$ . Consequently, for all sufficiently large $k$ , it holds that

\Delta_{k}\geq\frac{-\langle z,v\rangle}{(1+\ell)\|v\|}.

So, we get $\limsup\limits_{{k}\rightarrow\infty}\Delta_{k}>0$ , which contradicts (4.9).

The proof is complete. $\hfill\Box$

Remark 3

To compare Theorem 4.1 with Theorem 3.3, assume for the moment that $f$ is $C^{2}$ -smooth. Let $\bar{x}$ be a locally optimal solution of (P), where $C$ is a generalized polyhedral convex set. Then, applying the mean-value theorem for vector-valued functions (see (IoffeTihomirov, , p. 27)) to the gradient mapping $\nabla f(\cdot):X\to X^{*}$ , one can show that there exists a constant $\ell>0$ such that (4.1) holds for every $x$ in some neighborhood of $\bar{x}$ . Since $\widehat{\partial}^{2}f(\bar{x})(u)=\{\nabla^{2}f(\bar{x})^{*}u\}$ for every $u$ in the space $X$ , which is canonically embedded in $X^{**}$ , inequality (4.3) means that $\langle\nabla^{2}f(\bar{x})^{*}v,v\rangle\geq 0$ . Hence, $\langle v,\nabla^{2}f(\bar{x})v\rangle\geq 0$ . By the definition of the canonical embedding of $X$ in $X^{**}$ , the latter means that $\langle\nabla^{2}f(\bar{x})v,v\rangle\geq 0$ . Therefore, the assertions of Theorem 4.1 coincide with those of Theorem 3.3, provided that the critical direction $v$ satisfies the condition $-v\in T_{C}(\bar{x})$ . Thus, in comparison with Theorem 3.3, although Theorem 4.1 helps us to treat optimization problems with objective functions from a larger class, it does not provide a complete extension of the former theorem.

When $C=X$ , (P) becomes the unconstrained optimization problem

\min\{f(x)\mid x\in X\}

(P1)

with $f:X\rightarrow\mathbb{R}$ being a $C^{1}$ -smooth function. From Theorem 4.1 one can easily derive the following second-order optimality condition for (P1), which is due to Chieu et al. ChieuLeeYen2017 .

Theorem 4.2

(See (ChieuLeeYen2017, , Theorem 3.3)) Suppose that $\bar{x}$ is a local solution of (P1) and there exists $\ell>0$ such that $||\nabla f(x)-\nabla f(\bar{x})||\leq\ell||x-\bar{x}||$ for every $x$ in some neighborhood of $\bar{x}$ . Then $\nabla f(\bar{x})=0$ and the second-order subdifferential $\widehat{\partial}^{2}f(\bar{x}):X\rightrightarrows X^{*}$ , where $X$ is canonically embedded in $X^{**}$ , is positive semi-definite, i.e., $\langle z,u\rangle\geq 0$ for any $u\in X$ and $z\in\widehat{\partial}^{2}f(\bar{x})(u).$

Dai (LVD2014, , Chapter 3) has extended the finite-dimensional version of Theorem 4.2 to the case of constrained $C^{1}$ -smooth optimization problems of the form

\min\{f(x)\mid h(x)=0\}

(P2)

with $h(x)=Ax+b$ , where $A\in\mathbb{R}^{p\times n}$ is a given matrix and $b\in\mathbb{R}^{p}$ is a given vector. In this case, one has $C=\{x\in\mathbb{R}^{n}\mid Ax+b=0\}$ . Thus, $C$ is a special polyhedral convex set in $\mathbb{R}^{n}$ . The Lagrange function associated with (P2) is defined by setting $L(x,\mu)=f(x)+\langle\mu,h(x)\rangle$ for $(x,\mu)\in\mathbb{R}^{n}\times\mathbb{R}^{p}$ .

Theorem 4.3

(See (LVD2014, , Theorem 3.3)) Suppose that $\bar{x}$ is a local solution of (P2) and $\bar{\mu}\in\mathbb{R}^{p}$ is a Lagrange multiplier corresponding to $\bar{x}$ , that is,

\nabla_{x}L(\bar{x},\bar{\mu})=\nabla f(\bar{x})+A^{T}\bar{\mu}=0.

(4.10)

Suppose that, in addition, there exist a constant $\ell>0$ and a neighborhood $U$ of $\bar{x}$ such that $||\nabla f(x)-\nabla f(\bar{x})||\leq\ell||x-\bar{x}||$ for all $x\in U$ . Then, for any $v\in\mathbb{R}^{n}$ with $Av=0$ , one has $\langle z,v\rangle\geq 0$ for any $z\in\widehat{\partial}^{2}L(\cdot,\bar{\mu})(\bar{x})(v)$ .

Theorem 4.1 is a generalization of Theorem 4.3. Indeed, the existence of $\bar{\mu}\in\mathbb{R}^{p}$ satisfying (4.10) follows from the necessary condition in (3.1) and Farkas’ Lemma (see, e.g., (Rockafellar_1970, , p. 200)). On one hand, since $\nabla_{x}L(x,\mu)=\nabla f(x)+A^{T}\mu$ for every $(x,\mu)\in\mathbb{R}^{n}\times\mathbb{R}^{p}$ , one has $\widehat{\partial}^{2}L(\cdot,\bar{\mu})(\bar{x})(\cdot)=\widehat{\partial}^{2}f(\bar{x})(\cdot)$ . Hence, the inclusion $z\in\widehat{\partial}^{2}L(\cdot,\bar{\mu})(\bar{x})(v)$ is equivalent to saying that $z\in\widehat{\partial}^{2}f(\bar{x})(v)$ . On the other hand, as $T_{C}(\bar{x})=\{u\in\mathbb{R}^{n}\mid Au=0\}$ , the condition $Av=0$ implies that $v\in T_{C}(\bar{x})$ and $-v\in T_{C}(\bar{x})$ . Moreover, from (3.1) one deduces that $\langle\nabla f(\bar{x}),v\rangle=0$ . Therefore, it follows from (4.3) that $\langle z,v\rangle\geq 0$ for any $z\in\widehat{\partial}^{2}L(\cdot,\bar{\mu})(\bar{x})(v)$ .

Theorem 4.1 asserts that inequality (4.3) holds for any $z\in\widehat{\partial}^{2}f(\bar{x})(v)$ if the critical direction $v$ satisfies the additional condition $-v\in T_{C}(\bar{x})$ . The following example will show that the last condition is essential for the validity of the assertion.

Example 3

Let $n=1$ , $C=\mathbb{R}_{+}$ , $g(x)=-x$ for $x\leq 0$ and $g(x)=x^{2}$ for $x\geq 0$ . Define $f(x)=\displaystyle\int_{0}^{x}g(t)dt$ for all $x\in\mathbb{R}$ , where the integral is understood in the Riemann sense. Since $g(\cdot)$ is continuous on $\mathbb{R}$ , $f$ is a $C^{1}$ -smooth function and $\nabla f(x)=g(x)$ for $x\in\mathbb{R}$ . Note that $f(x)=-\frac{1}{2}x^{2}$ for $x\leq 0$ , $f(x)=\frac{1}{3}x^{3}$ for $x\geq 0$ . Consider the point $\bar{x}:=0$ , which is the unique global solution of (P). Clearly, $f$ satisfies condition (4.1) for every $x\in(-1,1)$ with $\ell=1$ . On one hand, by Proposition 1 we have $T_{C}(\bar{x})=\mathbb{R}_{+}$ and

\displaystyle T^{2}_{C}(\bar{x},v)=T_{T_{C}(\bar{x})}(v)=\begin{cases}\mathbb{R}&\mbox{if}\ v>0,\\ \mathbb{R}_{+}&\mbox{if}\ v=0.\end{cases}

On the other hand, using the definition of the second-order subdifferential, we have

\displaystyle\begin{array}[]{rcl}z\in\widehat{\partial}^{2}f(\bar{x})(v)&\Leftrightarrow&z\in\widehat{D}^{*}\nabla f(\cdot)(\bar{x})(v)\\ &\ \Leftrightarrow&(z,-v)\in\widehat{N}_{\textrm{gph}\nabla f(\cdot)}((\bar{x},\nabla f(\bar{x})))\\ &\Leftrightarrow&\limsup\limits_{x\to\ \bar{x}}\dfrac{\langle(z,-v),(x,\nabla f(x))-(\bar{x},\nabla f(\bar{x}))\rangle}{|x-\bar{x}|+|\nabla f(x)-\nabla f(\bar{x})|}\leq 0.\end{array}

Since $\bar{x}=0$ and $\nabla f(\bar{x})=0$ , the last inequality is equivalent to

\displaystyle\limsup\limits_{x\to 0}\dfrac{zx-v\nabla f(x)}{|x|+|\nabla f(x)|}\leq 0.

(4.11)

From (4.11) one has

\displaystyle 0\geq\limsup\limits_{x\to 0^{+}}\dfrac{zx-vx^{2}}{x+x^{2}}=\limsup\limits_{x\to 0^{+}}\dfrac{z-vx}{1+x}=z

and

\displaystyle 0\geq\limsup\limits_{x\to 0^{-}}\dfrac{zx+vx}{-2x}=\dfrac{-(z+v)}{2}.

It follows that

\displaystyle z\leq 0\quad\mbox{and}\quad z+v\geq 0.

(4.12)

Conversely, if (4.12) is satisfied, then (4.11) holds. Consequently, the inclusion $z\in\widehat{\partial}^{2}f(\bar{x})(v)$ means that $-v\leq z\leq 0$ . So, choosing $v=1$ and $z=-1$ , one has $v\in T_{C}(\bar{x})$ , $\nabla f(\bar{x})v=0$ , and $z\in\widehat{\partial}^{2}f(\bar{x})(v)$ . Clearly, (4.2) holds for any $w\in T_{C}^{2}(\bar{x},v)$ because $\nabla f(\bar{x})=0$ . However, (4.3) is violated as $zv=-1$ . Note that $-v\notin T_{C}(\bar{x})$ .
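The characterization $-v\leq z\leq 0$ obtained in Example 3 can be probed numerically. The Python sketch below evaluates the quotient in (4.11) at a few sample points on both sides of $\bar{x}=0$; the sample points $\pm 10^{-j}$, the tolerance, and the function names are hypothetical choices, and such a sampled test can only suggest, not prove, that the upper limit is nonpositive.

```python
def grad_f(x):
    """Derivative of f in Example 3: g(x) = -x for x <= 0 and g(x) = x^2 for x >= 0."""
    return -x if x <= 0 else x * x

def quotient(z, v, x):
    """Quotient appearing in (4.11), evaluated at a point x != 0."""
    return (z * x - v * grad_f(x)) / (abs(x) + abs(grad_f(x)))

def in_subdifferential_sampled(z, v, tol=1e-9):
    """Sampled test of (4.11): the quotient is <= tol at x = +-10^(-j), j = 3, ..., 8."""
    points = [s * 10.0 ** (-j) for j in range(3, 9) for s in (1.0, -1.0)]
    return all(quotient(z, v, x) <= tol for x in points)
```

With $v=1$, the sampled test accepts $z=-1$ and $z=0$ and rejects $z=0.5$ and $z=-2$, in agreement with $-v\leq z\leq 0$. For the accepted pair $v=1$, $z=-1$ one has $zv=-1<0$, which is the violation of (4.3) described above.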

Acknowledgements. This research was supported by Vietnam Institute for Advanced Study in Mathematics (VIASM). Duong Thi Viet An was also supported by the Simons Foundation Grant Targeted for Institute of Mathematics, Vietnam Academy of Science and Technology.

References

(1) Ban, L., Mordukhovich, B.S., Song, W.: Lipschitzian stability of parametric variational inequalities over generalized polyhedra in Banach spaces. Nonlinear Anal. 74, 441–461 (2011)
(2) Ben-Tal, A.: Second-order and related extremality conditions in nonlinear programming. J. Optim. Theory Appl. 31, 143–165 (1980)
(3) Ben-Tal, A., Zowe, J.: Necessary and sufficient optimality conditions for a class of nonsmooth minimization problems. Math. Programming 24, 70–91 (1982)
(4) Bonnans, J.F., Shapiro, A.: Perturbation Analysis of Optimization Problems. Springer, New York (2000)
(5) Chieu, N.H., Chuong, T.D., Yao, J.-C., Yen, N.D.: Characterizing convexity of a function by its Fréchet and limiting second-order subdifferentials. Set-Valued Var. Anal. 19, 75–96 (2011)
(6) Chieu, N.H., Huy, N.Q.: Second-order subdifferentials and convexity of real-valued functions. Nonlinear Anal. 74, 154–160 (2011)
(7) Chieu, N.H., Lee, G.M., Yen, N.D.: Second-order subdifferentials and optimality conditions for $C^{1}$ -smooth optimization problems. Appl. Anal. Optim. 1, 461–476 (2017)
(8) Dai, L.V.: Necessary and Sufficient Optimality Conditions with Lagrange Multipliers. Undergraduate Thesis, University of Science, Vietnam National University (2014)
(9) Hiriart-Urruty, J.-B., Strodiot, J.-J., Nguyen, V.H.: Generalized Hessian matrix and second-order optimality conditions for problems with $C^{1,1}$ data. Appl. Math. Optim. 11, 43–56 (1984)
(10) Ioffe, A.D., Tihomirov, V.M.: Theory of Extremal Problems. North-Holland, Amsterdam (1979)
(11) Huy, N.Q., Tuyen, N.V.: New second-order optimality conditions for a class of differentiable optimization problems. J. Optim. Theory Appl. 171, 27–44 (2016)
(12) Lee, G.M., Tam, N.N., Yen, N.D.: Quadratic Programming and Affine Variational Inequalities. A Qualitative Study. Springer-Verlag, New York (2005)
(13) Luan, N.N., Yao, J.-C.: Generalized polyhedral convex optimization problems. J. Global Optim. 75, 789–811 (2019)
(14) Luan, N.N., Yao, J.-C., Yen, N.D.: On some generalized polyhedral convex constructions. Numer. Funct. Anal. Optim. 39, 537–570 (2018)
(15) Luan, N.N., Yen, N.D.: A representation of generalized convex polyhedra and applications. Optimization 69, 471–492 (2020)
(16) Luenberger, D.G., Ye, Y.: Linear and Nonlinear Programming. Springer, New York (2008)
(17) McCormick, G.P.: Second order conditions for constrained minima. SIAM J. Appl. Math. 15, 641–652 (1967)
(18) Mordukhovich, B.S.: Sensitivity analysis in nonsmooth optimization. In: Field, D.A., Komkov, V. (eds.) Theoretical Aspects of Industrial Design, pp. 32–46. SIAM, Philadelphia (1992)
(19) Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation, Volume I: Basic Theory. Springer, Berlin (2006)
(20) Mordukhovich, B.S.: Variational Analysis and Generalized Differentiation, Volume II: Applications. Springer, Berlin (2006)
(21) Mordukhovich, B.S., Rockafellar, R.T.: Second-order subdifferential calculus with applications to tilt stability in optimization. SIAM J. Optim. 22, 953–986 (2012)
(22) Mordukhovich, B.S., Rockafellar, R.T., Sarabi, M.E.: Characterizations of full stability in constrained optimization. SIAM J. Optim. 23, 1810–1849 (2013)
(23) Penot, J.-P.: Optimality conditions in mathematical programming and composite optimization. Math. Programming 67, 225–245 (1994)
(24) Penot, J.-P.: Second-order conditions for optimization problems with constraints. SIAM J. Control Optim. 37, 303–318 (1999)
(25) Poliquin, R.A., Rockafellar, R.T.: Tilt stability of a local minimum. SIAM J. Optim. 8, 287–299 (1998)
(26) Polyak, B.T.: Introduction to Optimization. Revised version. Optimization Software, Inc., New York (2010)
(27) Rockafellar, R.T.: Convex Analysis. Princeton University Press, Princeton, New Jersey (1970)
(28) Ruszczynski, A.: Nonlinear Optimization. Princeton University Press, Princeton, New Jersey (2006)
(29) Yen, N.D., Yang, X.: Affine variational inequalities on normed spaces. J. Optim. Theory Appl. 178, 36–55 (2018)