Duality for Composite Optimization Problem within the Framework of Abstract Convexity

Ewa Bednarczuk^a and Hung T.T.^a

^aDepartment of Modeling and Optimization in Dynamical System, System Research Institute-Polish Academy of Science, Warsaw, Poland

CONTACT The Hung Tran. Email: [email protected]

Disclaimer: This work represents only the author’s view and the European Commission is not responsible for any use that may be made of the information it contains.

Abstract

We study conjugate and Lagrange dualities for composite optimization problems within the framework of abstract convexity. We provide conditions for zero duality gap in conjugate duality. For Lagrange duality, intersection property is applied to obtain zero duality gap. Connection between Lagrange dual and conjugate dual is also established. Examples related to convex and weakly convex functions are given.

Keywords: Abstract convexity; $\Phi$-convexity; $\varepsilon$-subdifferentials; Conjugate dual; Lagrange dual; Zero duality gap; Nonconvex programming; Global Optimization

AMS classification: 49J27; 49J35; 49N15; 90C46; 90C26

1 Introduction

Over the last eighty years, convexity has become an essential part of the development of optimization, nonlinear analysis and related fields. With the surge in computational power, interest in machine learning and data science has risen, which has made convexity the backbone of many theories and applications. Alongside convexity, many works try to go beyond it and study classes of functions such as strongly convex [1], quasi-convex, pseudo-convex, strongly para-convex and para-convex [2], delta-convex [3], and approximately convex [4] functions.

Abstract convexity encompasses many of the above-mentioned classes of functions. Abstract convexity, presented in the monographs of Rubinov [5] and of Pallaschke and Rolewicz [6], is based on the idea of taking convexity beyond the linear setting by introducing a nonlinear environment in which a function is the upper envelope of a subset of functions from a prescribed class. The term “abstract convexity” was used by Rubinov [5] to describe functions which are upper envelopes of a given class $\Phi$ of functions $\phi:X\to\mathbb{R}$, i.e.

f(x)=\sup\left\{\phi(x)\ :\ \phi\in\Phi,\ \phi\leq f\right\},

where $X$ is a nonempty set.

In particular, classical convex functions fit into this framework. Since the supremum operation is retained, many global properties of convex analysis remain valid for abstract convexity. To date, the book of Pallaschke and Rolewicz [6] and the monograph of Singer [7] have gathered many results on abstract convexity, especially on conjugation, subdifferentials and duality. They also give historical accounts of the main ideas of abstract convexity and its applications. In [5], Rubinov presented the basic notions of abstract convexity and applications to global optimization problems.

It has been fifty years since the theory of abstract convexity started, and it has progressed considerably, with applications in both theory and computation. In [8], the authors investigated the class of increasing star-shaped functions and constructed an algorithm to solve the global optimization problem using abstract convexity. They also generalized the cutting plane algorithm for nonconvex global minimization problems [8, 9]. In [10, 11, 12], a general version of monotone operators is developed based on abstract convexity.

In this paper, we pay special attention to duality theory for the composite minimization problem by using abstract convexity. We mainly study conjugate and Lagrangian dualities within abstract convexity for the composite minimization problem

\inf_{x\in X}f(x)+g(Lx),

(CP)

where $f$ and $g$ are functions defined on the sets (spaces) $X$ and $Y$, respectively, and $L:X\to Y$ is a mapping from $X$ into $Y$.

We focus on the composite optimization problem because it covers many optimization formulations. One can view (CP) as a nonlinear programming problem, a min-max optimization problem, or a constrained or unconstrained problem. These problems have been studied extensively in [13, 14]. On the other hand, the composite problem can also be regarded as a regularized problem by treating $g(Lx)$ as a penalty term. Such problems are known as total variation models in image deblurring and denoising [15].

Over the last twenty years, several efforts have been made to investigate duality for optimization problems within the framework of abstract convexity. In [16], strong duality is proved for infimal convolution of Fenchel’s duality. In [17], zero duality gap and exact multipliers of augmented Lagrangians are investigated by using the framework of abstract convexity. In [18, 19], necessary and sufficient conditions are given to achieve the minimax equality for a general Lagrangian using the definition of abstract convexity. In the works of Dolgopolik [20] and of Gorokhovik and Tykoun [21, 22], the authors applied abstract convexity to approximate the class of nonsmooth functions and to formulate necessary optimality conditions for global nonsmooth nonconvex problems. The authors of [23] investigated the problem of minimizing a finite sum of arbitrary functions and provided conditions for zero duality gap through the infimal convolution dual. On the other hand, in [24], the problem of zero duality gap is studied with the help of a perturbation function.

Our contribution is as follows. Inspired by Moreau’s general subdifferential and conjugation [25], we utilize the framework of abstract conjugation and abstract subdifferential [16] and construct a conjugate dual problem to (CP). We derive conditions for zero duality gap and strong duality. Although $f$ and $g$ are defined on different spaces, the conditions we propose reduce to the ones proposed in e.g. [23] when $L$ is the identity mapping. In deriving the Lagrangian dual problem, we make use of the intersection property for abstract convex functions [24, 19] to obtain Lagrange zero duality gap. We also discuss the relationship between the conjugate and Lagrangian duals.

The structure of the paper is as follows. In Section 2, we state the framework of abstract convexity with conjugation and subdifferentials, as well as some basic properties of abstract convex functions. Section 3 contains the definition of the composite minimization problem and the construction of the conjugate dual problem. Section 4 contains our main result on zero duality gap, given in Theorem 4.3. In Section 5, we provide conditions for strong duality in Theorem 5.2 and Corollary 5.3. In Section 6, we prove Lagrange zero duality gap in Theorem 6.5. The equivalence of the conjugate dual and the Lagrange dual is examined under some conditions in Corollary 6.8.

2 Preliminaries

Let $X$ be a nonempty set. A function $f:X\to\left(-\infty,+\infty\right]$ is proper if its domain, denoted by $\text{dom }f=\left\{x\in X:f\left(x\right)<+\infty\right\}$ is nonempty.

Let $\Phi=\left\{\phi:X\to\mathbb{R}\right\}$ be a collection of real-valued functions, which is closed under addition of a constant. The support set of $f$ with respect to $\Phi$ is defined as

\text{supp}_{\Phi}f=\left\{\phi\in\Phi:f\left(x\right)\geq\phi\left(x\right)\left(\forall x\in X\right)\right\}.

For $f,g:X\to\left(-\infty,+\infty\right]$ , we write $f\leq g\Leftrightarrow f\left(x\right)\leq g\left(x\right)$ for all $x\in X$ . Elements of $\Phi$ are called elementary functions.

Definition 2.1.

[5, 6] A function $f:X\to\left(-\infty,+\infty\right]$ is $\Phi$ -convex if

f\left(x\right)=\sup_{\phi\in\text{supp}_{\Phi}f}\phi\left(x\right),\ \forall x\in X.

(1)

A function $f$ is $\Phi$ -convex at $x_{0}\in X$ if $f(x_{0})=\sup_{\phi\in\text{supp}_{\Phi}f}\phi\left(x_{0}\right)$ .

Note that if a function $f:X\to(-\infty,+\infty]$ is $\Phi$-convex, then $\text{supp}_{\Phi}f$ is nonempty, since otherwise the supremum in (1) is taken over the empty set and $f\equiv-\infty$.

Definition 2.2.

[5, 6] The $\Phi$-conjugate $f^{*}_{\Phi}:\Phi\to(-\infty,+\infty]$ of $f$ is defined as

f_{\Phi}^{*}\left(\phi\right):=\sup_{x\in X}\left\{\phi\left(x\right)-f\left(x\right)\right\}.

(2)

Analogously, we define the $\Phi$-biconjugate $f^{**}_{\Phi}:X\to(-\infty,+\infty]$ of $f$ as

f^{**}_{\Phi}(x):=\sup_{\phi\in\Phi}\left\{\phi(x)-f^{*}_{\Phi}(\phi)\right\}.

(3)

As in convex analysis, the following result holds.

Theorem 2.3.

[5, 6] A function $f:X\to\left(-\infty,+\infty\right]$ is $\Phi$ -convex if and only if

f(x)=f^{**}_{\Phi}(x),\ \forall x\in X

(4)

In view of this theorem, we say that $f$ is $\Phi$ -convex at a point $x_{0}\in X$ if $f^{**}_{\Phi}(x_{0})=f(x_{0})$ .
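The conjugate (2), the biconjugate (3) and the characterization in Theorem 2.3 can be explored numerically on a small finite example. The sketch below is only an illustration under assumptions that are not part of the paper: $X$ is the integer grid $\{-3,\dots,3\}$, $\Phi$ is the family of linear functions $x\mapsto ux$ with integer slopes $u\in\{-6,\dots,6\}$ (constants are left out since they cancel in the biconjugate), and the two test functions are chosen by hand.

```python
# Toy illustration of (2), (3) and Theorem 2.3 on a finite grid.
# Assumptions (not from the paper): X = {-3,...,3} and
# Phi = {x -> u*x : u in {-6,...,6}}; additive constants are omitted.

X = range(-3, 4)
SLOPES = range(-6, 7)


def conjugate(f, u):
    """Phi-conjugate f*(phi_u) = max_x (u*x - f(x)), cf. (2)."""
    return max(u * x - f(x) for x in X)


def biconjugate(f, x):
    """Phi-biconjugate f**(x) = max_u (u*x - f*(phi_u)), cf. (3)."""
    return max(u * x - conjugate(f, u) for u in SLOPES)


def convex_f(x):
    return x * x


def double_well(x):
    # Nonconvex: two minima at x = -2 and x = 2.
    return min((x - 2) ** 2, (x + 2) ** 2)


for name, f in [("x^2", convex_f), ("double well", double_well)]:
    # f(x) - f**(x) is zero exactly where f is Phi-convex at x.
    gaps = {x: f(x) - biconjugate(f, x) for x in X}
    print(name, gaps)
```

In this toy setting the quadratic coincides with its biconjugate on the whole grid, while the double-well function does not (for instance at $x=0$), which is the finite-dimensional picture behind $\Phi$-convexity at a point.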

Example 2.4.

Let $X$ be a Banach space with the topological dual $X^{*}$ and

\Phi_{conv}:=\left\{\phi:X\to\mathbb{R}\,\mid\,\phi\left(x\right)=\left\langle u,x\right\rangle+c,u\in X^{*},c\in\mathbb{R}\right\}.

(5)

If a function $f:X\to(-\infty,+\infty]$ is $\Phi_{conv}$ -convex, then it is proper, convex and lower semi-continuous. When $X$ is a locally convex space, [26], then $f$ is $\Phi_{conv}$ -convex if and only if $f$ is proper, convex and lower semi-continuous.

Example 2.5.

Let $X$ be a Hilbert space with the inner product $\left\langle\cdot,\cdot\right\rangle$ and the norm $\left\|x\right\|=\sqrt{\left\langle x,x\right\rangle}$. Let $a\in\mathbb{R}$ and take $\Phi_{Q,a}$ to be the collection of quadratic functions

\Phi_{Q,a}:=\left\{\phi:X\to\mathbb{R}\ |\ \phi\left(x\right)=a\left\|x\right\|^{2}+\left\langle u,x\right\rangle+c\text{ where }c\in\mathbb{R},u\in X\right\}.

(6)

A function $f:X\to(-\infty,+\infty]$ is $\Phi_{Q,a}$ -convex, if for every $x\in X$ , we have

f(x)=\sup_{\phi\in\Phi_{Q,a}}\{\phi(x)\ |\ \phi\leq f\}.

Since $\Phi_{Q,a}$ consists of continuous functions, $f$ is lower semi-continuous on $X$ . Depending upon the sign of $a$ , we get the following.

• If $a=0$, then we recover the affine elementary functions of Example 2.4, so $f$ is lower semi-continuous, proper and convex whenever $f$ is $\Phi_{Q,a}$-convex.

• If $a>0$ and $f$ is $\Phi_{Q,a}$-convex, then $f$ is strongly convex with modulus $a$; see [1, Definition 4.1] for the definition of a strongly convex function.

• If $a<0$ and $f$ is $\Phi_{Q,a}$-convex, then $f$ is a weakly convex function with modulus $a$ [1, Definition 4.1], see also [2].

For equivalent definitions of weakly convex functions and related facts, see e.g. [27, 28] and the references therein.

Definition 2.6.

Let $X$ be a normed space with the norm $\lVert\cdot\rVert$ . A function $f:X\to\left(-\infty,+\infty\right]$ is weakly convex with modulus $\rho\geq 0$ or $\rho$ -weakly convex if $f+\rho\left\|\cdot\right\|^{2}$ is a convex function. When $\rho=0$ , $f$ becomes a convex function.
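To make the link between $\Phi_{Q,a}$-convexity and weak convexity concrete, the following sketch compares the biconjugates of one nonconvex function with respect to two classes of elementary functions. Everything specific here is an assumption made for illustration: $X$ is the integer grid $\{-4,\dots,4\}$, $f(x)=|x|-x^{2}$ (so that $f+\rho|\cdot|^{2}$ is convex with $\rho=1$), the slopes $u$ range over $\{-3,\dots,3\}$, and constants are omitted from the elementary functions.

```python
# Toy comparison of Phi_{Q,a}-biconjugates for a = -1 (quadratic) and a = 0 (affine).
# Assumptions (not from the paper): integer grid X, f(x) = |x| - x^2,
# integer slopes u in {-3,...,3}, no additive constants.

X = range(-4, 5)
SLOPES = range(-3, 4)


def f(x):
    return abs(x) - x * x


def elementary(a, u):
    """Elementary function x -> a*x^2 + u*x from the class Phi_{Q,a}."""
    return lambda x: a * x * x + u * x


def biconjugate(a, x):
    """Biconjugate of f at x with respect to the class Phi_{Q,a}."""
    def conj(u):
        phi = elementary(a, u)
        return max(phi(y) - f(y) for y in X)
    return max(elementary(a, u)(x) - conj(u) for u in SLOPES)


values = [f(x) for x in X]
quadratic_envelope = [biconjugate(-1, x) for x in X]
affine_envelope = [biconjugate(0, x) for x in X]

# Discrete convexity check of f + rho*x^2 with rho = 1 (second differences >= 0).
shifted = [f(x) + x * x for x in X]
second_differences = [
    shifted[i - 1] - 2 * shifted[i] + shifted[i + 1]
    for i in range(1, len(shifted) - 1)
]

print("f            ", values)
print("a = -1 bicon.", quadratic_envelope)
print("a =  0 bicon.", affine_envelope)
```

On this grid the quadratic class with $a=-1$ reproduces $f$ exactly, whereas the affine class only gives a strict lower bound at, e.g., $x=0$.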

Note that many authors consider the class

\Phi_{Q}:=\left\{\phi:X\to\mathbb{R}\ |\ \phi\left(x\right)=a\left\|x\right\|^{2}+\left\langle u,x\right\rangle+c\text{ where }a\leq 0,c\in\mathbb{R},u\in X\right\},

(7)

where $X$ is a Hilbert space, see e.g. [5, Example 6.2]. For this class of elementary functions, the set of $\Phi_{Q}$-convex functions coincides with the set of all lower semi-continuous functions defined on $X$ and minorized by a function from $\Phi_{Q}$, see [5, Proposition 6.3]. Quadratically minorized functions have also been investigated in [28]. Numerous variants of convexity stem from the work of Hyers and Ulam [3], who defined $\delta$-convex functions for $\delta\geq 0$; approximately convex functions were subsequently investigated in e.g. [29, 4, 30] and the references therein. Rolewicz coined the term “paraconvexity” [2] by turning $\delta$ into a non-negative function. Definitions of strong and weak convexity were also given by Vial in [1].

Definition 2.7.

[5, 6] Let $X$ be a nonempty set and $f:X\to\left(-\infty,+\infty\right]$ . A $\Phi$ -subgradient of $f$ at a point $x\in\text{dom }f$ is any element $\phi\in\Phi$ such that

f\left(y\right)-f\left(x\right)\geq\phi\left(y\right)-\phi\left(x\right),\quad\forall y\in X.

(8)

The collection of all $\Phi$-subgradients of $f$ at $x$ is called the $\Phi$-subdifferential of $f$ at $x$ and is denoted by $\partial_{\Phi}f(x)$,

\partial_{\Phi}f\left(x\right):=\left\{\phi\in\Phi\,|\,\left(\forall y\in X\right)\ f\left(y\right)-f\left(x\right)\geq\phi\left(y\right)-\phi\left(x\right)\right\}.

(9)

Furthermore, for $\varepsilon\geq 0$, the $(\varepsilon,\Phi)$-subdifferential of $f$ at $x\in\text{dom }f$, denoted by $\partial_{\varepsilon,\Phi}f(x)$, is defined as

\partial_{\varepsilon,\Phi}f(x):=\{\phi\in\Phi\,\mid\,f\left(y\right)-f\left(x\right)\geq\phi\left(y\right)-\phi\left(x\right)-\varepsilon,\quad\forall y\in X\},

(10)

and elements of $\partial_{\varepsilon,\Phi}f(x)$ are called $(\varepsilon,\Phi)$-subgradients of $f$ at $x\in\text{dom }f$.

Additional facts and results on $(\varepsilon,\Phi)$-subdifferentials can be found in [16, 10].

The following properties follow directly from the definitions.

Proposition 2.8.

Let $X$ be a nonempty set and $f:X\to\left(-\infty,+\infty\right]$ .

(i)

For all $x\in\text{dom }f$ and $\varepsilon\geq 0$ , an element $\phi\in\Phi$ is a $(\varepsilon,\Phi)$ -subgradient of $f$ at $x$ if and only if

$f(x)+f^{*}_{\Phi}(\phi)\leq\phi(x)+\varepsilon.$ (11)
(ii)

$\text{dom }f^{*}_{\Phi}=\bigcap_{\varepsilon>0}\partial_{\varepsilon,\Phi}f\left(X\right)$ .

The proof for (i) can be found in [5, Proposition 7.10] and (ii) in [23, Proposition 2.4].
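Proposition 2.8-(i) can be sanity-checked by brute force on a finite example. The sketch below compares the defining inequality (10) with the Fenchel-Young type inequality (11) for every grid point, slope and tolerance. The grid, the nonconvex test function and the linear elementary functions $x\mapsto ux$ are assumptions chosen for illustration only.

```python
# Brute-force check of Proposition 2.8-(i) on a finite grid.
# Assumptions (not from the paper): X = {-3,...,3}, elementary functions
# phi_u(x) = u*x with u in {-6,...,6}, and a hand-picked nonconvex f.

X = range(-3, 4)
SLOPES = range(-6, 7)


def f(x):
    return min((x - 2) ** 2, (x + 2) ** 2)


def conjugate(u):
    """f*(phi_u) = max_y (u*y - f(y))."""
    return max(u * y - f(y) for y in X)


def is_eps_subgradient(u, x, eps):
    """Definition (10): f(y) - f(x) >= phi_u(y) - phi_u(x) - eps for all y."""
    return all(f(y) - f(x) >= u * y - u * x - eps for y in X)


def fenchel_young_within(u, x, eps):
    """Inequality (11): f(x) + f*(phi_u) <= phi_u(x) + eps."""
    return f(x) + conjugate(u) <= u * x + eps


def eps_subdifferential(x, eps):
    """Slopes of all (eps, Phi)-subgradients of f at x."""
    return [u for u in SLOPES if is_eps_subgradient(u, x, eps)]


mismatches = [
    (u, x, eps)
    for eps in (0, 1, 2, 5)
    for x in X
    for u in SLOPES
    if is_eps_subgradient(u, x, eps) != fenchel_young_within(u, x, eps)
]
print("mismatches:", mismatches)
print("subdifferential at 0, eps = 0:", eps_subdifferential(0, 0))
print("subdifferential at 0, eps = 4:", eps_subdifferential(0, 4))
```

In this example the exact subdifferential at the local maximum $x=0$ is empty, while enlarging $\varepsilon$ to $f(0)-\min f$ admits the zero slope, which illustrates why $\varepsilon$-subgradients are the natural tool for nonconvex functions.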

3 Construction of the conjugate dual

We consider the composite minimization problem

\inf_{x\in X}f\left(x\right)+g\left(Lx\right)

(CP)

where $X$ is a nonempty set and $Y$ is a vector space, $f:X\to\left(-\infty,+\infty\right]$ , $g:Y\to\left(-\infty,+\infty\right]$ and $L:X\to Y$ is a mapping from $X$ to $Y$ . Note that here we do not assume that $L$ is linear and continuous.

Let $\Phi$ be a set of elementary functions $\phi:X\to\mathbb{R}$ and let $\Psi$ be a set of elementary functions $\psi:Y\to\mathbb{R}$ . Assume that both classes are closed under addition of constants and $0\in\Phi,0\in\Psi$ .

We introduce the conjugate dual of (CP) by considering perturbed minimization problems of the form

\inf_{x\in X}f\left(x\right)+g\left(Lx+y\right),\quad y\in Y.

(12)

Following the classical ideas coming from convex analysis, see e.g. [31, 14], we calculate the conjugate of the function $\beta:X\times Y\to(-\infty,+\infty]$ , where

\beta(x,y):=f\left(x\right)+g\left(Lx+y\right),\ \ x\in X,\ y\in Y.

Note that, in the convex case, conjugation is taken with respect to linear functionals, while here we are considering elementary functions $\phi,\psi$ from given classes $\Phi$ and $\Psi$ , respectively, see e.g. [25].

We now define the conjugate of $\beta$ with respect to the coupling function

c:\Phi\times\Psi\times X\times Y\to\mathbb{R},\qquad\left(\phi,\psi,x,y\right)\mapsto\phi(x)+\psi(Lx+y)-\psi(Lx).

(13)

For other types of coupling functions defined on Cartesian products of elementary function sets, see e.g., [32].

The $c$-conjugate of $\beta$, $\beta^{c}:\Phi\times\Psi\rightarrow(-\infty,+\infty]$, is defined as

\beta^{c}(\phi,\psi)=\sup_{x\in X}\sup_{y\in Y}\left\{c(\phi,\psi,x,y)-\beta(x,y)\right\}.

(14)

The conjugate dual to problem (CP) is defined as

\sup_{\psi\in\Psi}-\beta^{c}(0,\psi).

(15)

Investigation of the relationship between (CP) and problem (15) is the main focus of the paper. To write (15) in a more explicit form, observe that the objective function of (15) (the dual objective) is equal to

\begin{aligned}
-\beta^{c}(0,\psi)&=-\sup_{x\in X}\sup_{y\in Y}\left\{c(0,\psi,x,y)-\beta(x,y)\right\}\\
&=-\sup_{x\in X}\sup_{y\in Y}\left\{\psi(Lx+y)-\psi(Lx)-f(x)-g(Lx+y)\right\}\\
&=-\sup_{x\in X}\sup_{z=Lx+y}\left\{\psi(z)-\psi(Lx)-f(x)-g(z)\right\}\\
&=-\sup_{x\in X}\left\{-\psi(Lx)-f(x)+\sup_{z\in Y}\{\psi(z)-g(z)\}\right\}\\
&=-\sup_{x\in X}\left\{-\psi(Lx)-f(x)\right\}-g^{*}_{\Psi}(\psi).
\end{aligned}

Hence, the conjugate dual problem takes the form

\sup_{\psi\in\Psi}-\beta^{c}(0,\psi)=\sup_{\psi\in\Psi}-\sup_{x\in X}\left\{-\psi(Lx)-f(x)\right\}-g^{*}_{\Psi}(\psi).

(DCP)

Notice that we have kept the formulation of $\sup_{x\in X}\left\{-\psi(Lx)-f(x)\right\}$ in the dual problem (DCP), since we do not know if $\psi\circ L$ belongs to the class $\Phi$ in general. Below, we give some examples where we further investigate the dual problem.
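The identity behind (DCP), namely that the supremum over the perturbation $y$ collapses to $g^{*}_{\Psi}(\psi)$ because $z=Lx+y$ runs over all of $Y$, can be verified by brute force in a finite model. The sketch below is a hypothetical stand-in, not the setting of the paper: $Y$ is taken to be $\mathbb{Z}_{5}$ (a one-dimensional vector space over the field with five elements), $X$ is a four-point set, $L$ is a nonlinear map, and $f$, $g$ and the candidate functions $\psi$ are integer tables chosen arbitrarily.

```python
# Finite check that -beta^c(0, psi) equals the (DCP) objective.
# Assumptions (not from the paper): Y = Z_5 with addition mod 5, X = {0,1,2,3},
# a nonlinear map L, and integer-valued f, g, psi given as small tables.
import itertools

P = 5
X = range(4)
Y = range(P)
G_TABLE = [3, 0, 4, 1, 2]


def L(x):
    return (x * x + 1) % P


def f(x):
    return (x - 2) ** 2


def g(y):
    return G_TABLE[y]


def beta_c(psi):
    """c-conjugate (14) at (0, psi): brute force over x and the perturbation y."""
    return max(
        psi[(L(x) + y) % P] - psi[L(x)] - f(x) - g((L(x) + y) % P)
        for x in X
        for y in Y
    )


def dcp_objective(psi):
    """Objective of (DCP): -sup_x{-psi(Lx) - f(x)} - g*(psi)."""
    g_conj = max(psi[z] - g(z) for z in Y)
    inner = max(-psi[L(x)] - f(x) for x in X)
    return -inner - g_conj


# All functions psi: Y -> {-1, 0, 2}, each stored as a list indexed by y.
PSIS = [list(p) for p in itertools.product((-1, 0, 2), repeat=P)]
agree = all(-beta_c(psi) == dcp_objective(psi) for psi in PSIS)
print("identity holds for all", len(PSIS), "candidates:", agree)
```

The only structural property of $Y$ used here is that $y\mapsto Lx+y$ is a bijection of $Y$ for each fixed $x$, which is what the substitution $z=Lx+y$ in the derivation relies on.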

Example 3.1.

In some cases, we can rewrite the conjugate dual (DCP) in an equivalent and more convenient way.

• If, for any $\psi\in\Psi$, $-\psi\circ L\in\Phi$ (alternatively, we can assume that $\psi\circ L\in\Phi$ and $\Phi$ is symmetric, i.e. $-\phi\in\Phi$ for all $\phi\in\Phi$), we obtain

\sup\limits_{x\in X}\{-f(x)-\psi(Lx)\}=f^{*}_{\Phi}\left(-\psi\circ L\right),

(17)

and the conjugate dual problem (DCP) takes the form

\sup_{\psi\in\Psi}-\beta^{c}(0,\psi)=\sup_{\psi\in\Psi}-f^{*}_{\Phi}\left(-\psi\circ L\right)-g^{*}_{\Psi}(\psi).

(18)

With the help of the operator $L:X\rightarrow Y$ we define another class of elementary functions as follows

\Phi_{L}:=\left\{-\psi\circ L:X\to\mathbb{R},\ \psi\in\Psi\right\},

and the dual (18) can be written as

\sup_{\begin{subarray}{c}\phi\in\Phi_{L},\psi\in\Psi\\ \phi+\psi\circ L=0\end{subarray}}-f^{*}_{\Phi_{L}}\left(\phi\right)-g^{*}_{\Psi}(\psi).

(19)

• If $\Phi$ is symmetric and, for any $\psi\in\Psi$, $\psi\circ L$ is $\Phi$-convex, we have

\psi(Lx)=\sup\left\{\phi(x):\phi\in\text{supp }\psi\circ L\right\}.

The optimal value of the dual problem (DCP) (obtained with the help of the coupling function $c$ from (13)) satisfies

\begin{aligned}
\sup_{\psi\in\Psi}-\beta^{c}(0,\psi)&=\sup_{\psi\in\Psi}\left\{\inf_{x\in X}\left(f(x)+\psi(Lx)\right)-g^{*}_{\Psi}(\psi)\right\}\\
&=\sup_{\psi\in\Psi}\left\{\inf_{x\in X}\left(f(x)+\sup_{\phi\in\text{supp }\psi\circ L}\phi(x)\right)-g^{*}_{\Psi}(\psi)\right\}\\
&\geq\sup_{\psi\in\Psi}\sup_{\phi\in\text{supp }\psi\circ L}\left\{\inf_{x\in X}\left(f(x)+\phi(x)\right)-g^{*}_{\Psi}(\psi)\right\}\\
&\geq\sup_{\psi\in\Psi,\ \phi\in\text{supp }\psi\circ L}-f^{*}_{\Phi}(-\phi)-g^{*}_{\Psi}(\psi).
\end{aligned}

(20)

Clearly, the $\Phi$-convexity of $\psi\circ L$ is more general than the condition $\psi\circ L\in\Phi$. However, assuming the latter, (DCP) takes the form

\sup_{\psi\in\Psi}-\beta^{c}(0,\psi)=\sup_{\psi\in\Psi}-f^{*}_{\Phi}(-\psi\circ L)-g^{*}_{\Psi}(\psi).

If $X=Y$, $\Phi=\Psi$ is symmetric, and $L:X\to X$ is such that $\psi\circ L\in\Phi$ for any $\psi\in\Psi$, then

\sup\limits_{x\in X}\{-f(x)-\psi(Lx)\}=f^{*}_{\Psi}(-\psi\circ L),

and the conjugate dual (DCP) takes the form

\sup_{\psi\in\Psi}-\beta^{c}(0,\psi)=\sup_{\psi\in\Psi}-f^{*}_{\Psi}(-\psi\circ L)-g^{*}_{\Psi}(\psi).

(21)

If $L$ is the identity operator, then (CP) becomes the minimization problem

\inf_{x\in X}f(x)+g(x)

for which the conjugate dual (21) has been discussed in [24, 23].

3.1 Conjugate dual for specific classes $\Phi,\Psi$

We now discuss the conjugate dual problem (DCP) when $\Phi,\Psi$ are the sets of affine and quadratic functions, i.e. $\Phi_{conv},\Psi_{conv}$ and $\Phi_{Q,a},\Psi_{Q,a}$ given by (5) and (6), respectively.

Example 3.2.

Let $X$ and $Y$ be Banach spaces with the topological duals $X^{*},Y^{*}$ and their bilinear forms $\langle\cdot,\cdot\rangle_{X},\langle\cdot,\cdot\rangle_{Y}$ , respectively. Let $L:X\to Y$ be a linear continuous operator with the conjugate $L^{*}:Y^{*}\to X^{*}$ . We define the sets $\Phi$ and $\Psi$ of elementary functions as follows.

(i)

For the case of affine elementary functions, (cf. (5) above),

\begin{aligned}
\Phi_{conv}&:=\left\{\phi:X\to\mathbb{R}\ \mid\ \phi\left(x\right)=\left\langle u,x\right\rangle_{X}+c,\ u\in X^{*},\ c\in\mathbb{R}\right\},\\
\Psi_{conv}&:=\left\{\psi:Y\to\mathbb{R}\ \mid\ \psi\left(y\right)=\left\langle v,y\right\rangle_{Y}+d,\ v\in Y^{*},\ d\in\mathbb{R}\right\},
\end{aligned}

we have $\psi(Lx+y)-\psi(Lx)=\psi(y)$ which leads to the coupling function as defined in the convex case [14]. We have

	$\displaystyle\sup\limits_{x\in X}\{-f(x)-\psi(Lx)\}$	$\displaystyle=\sup\limits_{x\in X}\{-f(x)-\langle v,Lx\rangle_{Y}\}-d$
		$\displaystyle=\sup\limits_{x\in X}\{-f(x)-\langle L^{*}v,x\rangle_{X}\}-d.$

Since $L^{*}v\in X^{*}$ , we have $-\psi\circ L\in\Phi_{conv}$ so

\sup\limits_{x\in X}\{-f(x)-\langle L^{*}v,x\rangle_{X}-d\}=f^{*}_{\Phi_{conv}}(-\psi\circ L)

and the dual problem

\sup_{\psi\in\Psi_{conv}}-\beta^{c}(0,\psi)=\sup_{\psi\in\Psi_{conv}}-f^{*}_{\Phi_{conv}}\left(-\psi\circ L\right)-g^{*}_{\Psi_{conv}}(\psi).

(22)

Note that the condition $\psi\circ L\in\Phi_{conv}$ is satisfied thanks to the form of $\Phi_{conv}$ , $\Psi_{conv}$ , and (22) coincides with the classical Fenchel dual investigated e.g. in [31, Chapter 1, Section 2].

(ii)

In the case of quadratic elementary functions ( $a,b\in\mathbb{R}$ , cf. formula (6) above),

\Phi_{Q,a}:=\left\{\phi:X\to\mathbb{R}\ \mid\ \phi\left(x\right)=a\left\|x\right\|_{X}^{2}+\left\langle u,x\right\rangle_{X}+c,\ u\in X^{*},\ c\in\mathbb{R}\right\},

(23)

\Psi_{Q,b}:=\left\{\psi:Y\to\mathbb{R}\ \mid\ \psi\left(y\right)=b\left\|y\right\|_{Y}^{2}+\left\langle v,y\right\rangle_{Y}+d,\ v\in Y^{*},\ d\in\mathbb{R}\right\},

(24)

in general one cannot express $\psi\circ L$ as a function in $\Phi_{Q,a}$. However, we have

\begin{aligned}
\psi\left(Lx\right)&=b\left\|Lx\right\|_{Y}^{2}+\left\langle v,Lx\right\rangle_{Y}+d\\
&=b\left\|x\right\|_{L}^{2}+\left\langle L^{*}v,x\right\rangle_{X}+d,
\end{aligned}

with $\|x\|_{L}^{2}=\langle L^{*}Lx,x\rangle_{X}$, [33]. Thus, the dual problem (DCP) takes the form

\begin{aligned}
\sup_{\psi\in\Psi_{Q,b}}-\beta^{c}(0,\psi)&=\sup_{\psi\in\Psi_{Q,b}}\left\{\inf_{x\in X}\left(f\left(x\right)+\psi\left(Lx\right)\right)-g^{*}_{\Psi_{Q,b}}\left(\psi\right)\right\}\\
&=\sup_{\psi\in\Psi_{Q,b}}\left\{\inf_{x\in X}\left(f\left(x\right)+b\|x\|_{L}^{2}+\left\langle L^{*}v,x\right\rangle_{X}+d\right)-g^{*}_{\Psi_{Q,b}}\left(\psi\right)\right\}\\
&=\sup_{\psi\in\Psi_{Q,b}}-\left(f+b\|\cdot\|_{L}^{2}\right)^{*}_{\Phi_{Q,a}}\left(-\left\langle L^{*}v,\cdot\right\rangle_{X}-d\right)-g^{*}_{\Psi_{Q,b}}\left(\psi\right).
\end{aligned}

(25)

(iii)

In the construction of dual problem (15), the roles of $\Phi$ and $\Psi$ are not symmetric. We can see this by considering the following pair of elementary functions,

\begin{aligned}
\Phi_{Q,a}&:=\left\{\phi:X\to\mathbb{R}\ \mid\ \phi\left(x\right)=a\left\|x\right\|_{X}^{2}+\left\langle u,x\right\rangle_{X}+c,\ u\in X^{*},\ c\in\mathbb{R}\right\},\\
\Psi_{conv}&:=\left\{\psi:Y\to\mathbb{R}\ \mid\ \psi\left(y\right)=\left\langle v,y\right\rangle_{Y}+d,\ v\in Y^{*},\ d\in\mathbb{R}\right\}.
\end{aligned}

In this case, since $\psi\in\Psi_{conv}$ is an affine function we have

-\psi\circ L(\cdot)=-\langle v,L\cdot\rangle_{Y}-d=-\langle L^{*}v,\cdot\rangle_{X}-d:=\phi(\cdot)\in\Phi_{Q,a}

and the dual problem has the same form as (22),

\sup_{\psi\in\Psi_{conv}}-\beta^{c}(0,\psi)=\sup_{\psi\in\Psi_{conv}}-f^{*}_{\Phi_{Q,a}}\left(-\psi\circ L\right)-g^{*}_{\Psi_{conv}}(\psi).

(26)

However, when reversing the roles of $\Phi$ and $\Psi$ i.e. by taking

\begin{aligned}
\Phi_{conv}&:=\left\{\phi:X\to\mathbb{R}\ \mid\ \phi\left(x\right)=\left\langle u,x\right\rangle_{X}+c,\ u\in X^{*},\ c\in\mathbb{R}\right\},\\
\Psi_{Q,b}&:=\left\{\psi:Y\to\mathbb{R}\ \mid\ \psi\left(y\right)=b\left\|y\right\|^{2}_{Y}+\left\langle v,y\right\rangle_{Y}+d,\ v\in Y^{*},\ b,d\in\mathbb{R}\right\},
\end{aligned}

we cannot write the dual problem in the form (22), but it is possible to write, for any $\psi\in\Psi_{Q,b}$ ,

\psi(Lx)=b\lVert Lx\rVert^{2}_{Y}+\langle v,Lx\rangle_{Y}+d=\psi_{1}(x)+\psi_{2}(x),

where $\psi_{1}(x)=b\lVert Lx\rVert^{2}_{Y},\psi_{2}(x)=\langle L^{*}v,x\rangle_{X}+d$ and the dual problem takes the form

\sup_{\begin{subarray}{c}\psi\in\Psi_{Q,b}\\ \psi\circ L=\psi_{1}+\psi_{2}\end{subarray}}-\beta^{c}(0,\psi)=\sup_{\begin{subarray}{c}\psi\in\Psi_{Q,b}\\ \psi\circ L=\psi_{1}+\psi_{2}\end{subarray}}-\left(f+\psi_{1}\right)^{*}_{\Phi_{A}}\left(-\psi_{2}\right)-g^{*}_{\Psi_{Q,b}}\left(\psi\right).

(27)

Unlike the dual problems considered in [24, 23], where $L=Id$, the presence of a general operator $L$ in the minimization problem (CP) makes it more difficult to present the dual problem without imposing additional assumptions.

Remark 1.

It is clear from the formula (3) and the above examples that the presence of constants in the definition of elementary functions is inessential in the $c$ -conjugate of $\beta$ . Thus, in the sequel, we consider classes $\Phi$ and $\Psi$ of functions in which constants are omitted. For the general consideration related to this fact, see [5, formula 1.4.4].

4 Zero Duality Gap for Conjugate Dual

Having defined the dual problem (DCP), we now discuss conditions for weak duality and zero duality gap. As in (CP), we assume that the sets $\Phi$ and $\Psi$ of elementary functions are defined on $X$ and $Y$, respectively, and that both contain the zero function, i.e., $0\in\Phi,0\in\Psi$.

Let $X$ be a nonempty set and let $Y$ be a vector space. Let $L:X\to Y$ be a mapping from $X$ to $Y$ . The dual problem (DCP) can be equivalently rewritten in the form

val(DCP):=\sup_{\psi\in\Psi}-\beta^{c}(0,\psi)=-\inf_{\psi\in\Psi}\left\{\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\right\}.

(28)

For the problem (CP), we have

val(CP):=\inf_{x\in X}f\left(x\right)+g\left(Lx\right)=-\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right).

(29)

Weak duality, i.e. the inequality $val(CP)\geq val(DCP)$ means that

\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)\leq\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right).

(30)

In fact, a more general inequality holds true.

Theorem 4.1.

For any $\phi\in\Phi$ , it holds

\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)\leq\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right).

Proof.

For any $\psi\in\Psi$ and $\phi\in\Phi$ , we have

\begin{aligned}
\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)&=\sup_{x\in X}\left\{\phi\left(x\right)-f\left(x\right)-g\left(Lx\right)\right\}\\
&=\sup_{x\in X}\left\{\phi\left(x\right)-f\left(x\right)-g\left(Lx\right)+\psi\left(Lx\right)-\psi\left(Lx\right)\right\}\\
&\leq\sup_{x\in X}\left\{\phi\left(x\right)-f\left(x\right)-\psi\left(Lx\right)\right\}+\sup_{x\in X}\left\{\psi\left(Lx\right)-g\left(Lx\right)\right\}\\
&\leq\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+\sup_{y\in L(X)}\left\{\psi\left(y\right)-g\left(y\right)\right\}\\
&\leq\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right).
\end{aligned}

Since $\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$ is a lower bound of $\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)$ for arbitrary $\psi\in\Psi$ , we get

\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)\leq\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right),

which completes the proof. ∎

By taking $\phi=0$ in the above theorem, we obtain the weak duality (30) between (CP) and (DCP). Note that no additional assumptions are imposed on the functions $\psi\circ L$ and $\phi$ . However, when zero duality gap is considered, we need to assume a relationship between $\Phi$ and $\Psi$ .
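Weak duality (30) can be observed on a small discrete instance with a nonlinear mapping $L$. The sketch below is only an illustration under assumptions not taken from the paper: $X=\{-3,\dots,3\}$, $L(x)=x^{2}$, $Y$ is replaced by the finite set $\{0,\dots,9\}\supseteq L(X)$, the functions $f$ and $g$ are chosen by hand, and $\Psi$ consists of the linear functions $y\mapsto vy$ with integer slopes $v\in\{-3,\dots,3\}$.

```python
# Toy instance of (CP) and (DCP) illustrating weak duality val(CP) >= val(DCP).
# Assumptions (not from the paper): finite X and Y, L(x) = x^2,
# Psi = {y -> v*y : v in {-3,...,3}}, hand-picked f and g.

X = range(-3, 4)
Y = range(0, 10)          # finite stand-in for Y; it contains L(X) = {0, 1, 4, 9}
SLOPES = range(-3, 4)


def L(x):
    return x * x


def f(x):
    return (x - 1) ** 2


def g(y):
    return abs(y - 4)


def g_conj(v):
    """Psi-conjugate g*(psi_v) = max_y (v*y - g(y))."""
    return max(v * y - g(y) for y in Y)


def dual_objective(v):
    """(DCP) objective: inf_x {f(x) + psi_v(Lx)} - g*(psi_v)."""
    return min(f(x) + v * L(x) for x in X) - g_conj(v)


val_cp = min(f(x) + g(L(x)) for x in X)
val_dcp = max(dual_objective(v) for v in SLOPES)
print("val(CP) =", val_cp, " val(DCP) =", val_dcp, " gap =", val_cp - val_dcp)
```

In this particular instance the inequality is strict, which is consistent with the remark above that zero duality gap requires additional assumptions relating the primal data and the classes $\Phi$ and $\Psi$.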

Our first result on zero duality gap is related to [23, Theorem 3.5].

Theorem 4.2.

Let $X$ be a nonempty set and $Y$ be a vector space. Let $f:X\to\left(-\infty,+\infty\right]$ , $g:Y\to\left(-\infty,+\infty\right]$ , and $L:X\to Y$ be a mapping. Assume that $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . Suppose $0\in\Phi$ and $0\in\Psi$ . The following are equivalent.

(i)

For every $\varepsilon>0$ , there exist $x_{\varepsilon}\in X,\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g(Lx_{\varepsilon})$ such that $\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)\leq-f(x_{\varepsilon})-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)+\varepsilon$ .
(ii)

$-val(CP)=\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)=\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)=-val(DCP)<+\infty$ .

Proof.

(i) $\Rightarrow$ (ii): Thanks to Theorem 4.1, we only need to prove that

\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\leq\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)<+\infty.

Let $\varepsilon>0$ . From (i), there exist $x_{\varepsilon}\in X$ and $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ such that

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)\leq-f(x_{\varepsilon})-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)+\varepsilon.

From the definition of infimum, we have

\begin{aligned}
\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)&\leq\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)\\
&\leq-f\left(x_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+\varepsilon.
\end{aligned}

(31)

By using inequality (11) applied to $g$ and taking into account $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g(Lx_{\varepsilon})$ , we obtain

\begin{aligned}
-f\left(x_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+\varepsilon&\leq-f\left(x_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+2\varepsilon\\
&\leq\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)+2\varepsilon.
\end{aligned}

(32)

Finally, by using (31) and (32),

\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\leq\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)+2\varepsilon.

As both sides of the above inequality do not depend on $x_{\varepsilon}$ or $\psi_{\varepsilon}$ , we can let $\varepsilon\to 0$ and obtain zero duality gap

\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\leq\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right).

Moreover, combining Theorem 4.1 with (31) and (32), we have

\left(f+g\circ L\right)_{\Phi}^{*}\left(0\right)\leq-f\left(x_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+2\varepsilon<+\infty,

since $f$ and $g$ do not take the value $-\infty$. Thus $\left(f+g\circ L\right)_{\Phi}^{*}\left(0\right)<+\infty$ .

(ii) $\Rightarrow$ (i): From the definition of supremum, for every $\varepsilon>0$ there exists an $x_{\varepsilon}\in X$ such that

\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)\leq-f\left(x_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+\varepsilon/2.

Also, there exists $\psi_{\varepsilon}\in\Psi$ such that

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)-\varepsilon/2<\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right).

Combining these two inequalities with the zero duality gap in (ii) gives

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)-\varepsilon/2<-f\left(x_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+\varepsilon/2.

After rearranging both sides, we get

\left[\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)\right]+\left[g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)\right]<\varepsilon.

By the definition of conjugate function, each of the two terms is non-negative, and so each of them has to be smaller than $\varepsilon$ . Thus

	$\displaystyle\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)$	$\displaystyle<-f\left(x_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)+\varepsilon$
	$\displaystyle g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)$	$\displaystyle<\varepsilon,$

which implies that $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ . Hence, (i) holds. ∎
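The membership test used at the end of the proof (the Fenchel-Young gap $g^{*}_{\Psi}(\psi)+g(y)-\psi(y)$ being at most $\varepsilon$ ) can be illustrated numerically. The following is only a minimal sketch under stated assumptions: the sample function $g(y)=4y^{2}$ , the class of linear functions $\psi(y)=cy$ , and all function names are hypothetical choices made for illustration.

```python
from fractions import Fraction

def g(y):
    # Hypothetical sample function g(y) = 4*y^2 on the real line.
    return 4 * y * y

def g_conj(c):
    # Conjugate with respect to linear functions psi(y) = c*y:
    # sup_y { c*y - 4*y^2 } = c^2/16, attained at y = c/8.
    return c * c / 16

def fenchel_young_gap(c, y):
    # g*(psi) + g(y) - psi(y); nonnegative by the definition of the conjugate.
    return g_conj(c) + g(y) - c * y

def in_eps_subdifferential(c, y, eps):
    # psi(y) = c*y is an eps-subgradient of g at y iff the gap is at most eps.
    return fenchel_young_gap(c, y) <= eps

y0 = Fraction(-1, 5)
slope = 8 * y0                                   # derivative of g at y0
gap_exact = fenchel_young_gap(slope, y0)         # gap for the exact slope
gap_shifted = fenchel_young_gap(slope + 1, y0)   # gap for a perturbed slope
print(gap_exact, gap_shifted)
```

Exact rational arithmetic is used so that the gap of the true derivative is exactly zero; a perturbed slope remains an $\varepsilon$ -subgradient only for sufficiently large $\varepsilon$ .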

Remark 2.

Observe that, in some cases, we cannot find $\phi\in\Phi,\psi\in\Psi$ such that $\phi+\psi=0$ (this latter condition is used in [23, Theorem 3.5]). In the case $X=Y,\Phi=\Psi,L=Id$ and $\Phi$ symmetric (i.e. $-\psi\in\Phi$ for any $\psi\in\Phi$ ), Theorem 4.2-(i) means that, for $\psi_{\varepsilon}\in\partial_{\varepsilon,\Phi}g(x_{\varepsilon})$ and all $x\in X$ ,

-f(x)-\psi_{\varepsilon}(x)\leq\left(f+\psi_{\varepsilon}\right)^{*}_{\Phi}\left(0\right)\leq-f(x_{\varepsilon})-\psi_{\varepsilon}\left(x_{\varepsilon}\right)+\varepsilon,

i.e. $-\psi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f(x_{\varepsilon})$ . This means that $0\in\bigcap_{\varepsilon>0}\partial_{\varepsilon,\Phi}(f+g)(X)$ , which reduces to the respective condition used in [23, Theorem 3.5-(i)] for proving zero duality gap.

Condition (i) of Theorem 4.2 can be replaced by conditions that are easier to check once we introduce the following assumption: for given $\phi\in\Phi,\psi\in\Psi$ ,

\displaystyle\phi-\psi\circ L\in\Phi.

(33)

Note that condition (34) below will be used in the sequel to check zero duality gap.

Remark 3.

When $\Phi$ is a convex cone, then (33) is satisfied when $-\psi\circ L\in\text{rec }\Phi$ where $\text{rec }\Phi=\left\{\varphi\in F(X):\varphi+\Phi\subset\Phi\right\}$ is the recession cone of $\Phi$ and $F(X)$ is a linear space of all functions defined on $X$ .

Theorem 4.3.

Let $f:X\to\left(-\infty,+\infty\right],g:Y\to\left(-\infty,+\infty\right]$ be such that $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . Let $L:X\to Y$ be a mapping from $X$ into $Y$ . Suppose that $0\in\Phi$ and $0\in\Psi$ . Consider the following conditions.

(i)

For every $\varepsilon>0$ , there exist $x_{\varepsilon}\in X,\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f(x_{\varepsilon}),\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ such that

\begin{cases}\phi_{\varepsilon}(z)+\psi_{\varepsilon}(Lz)\geq-\varepsilon&\forall z\in X\\ \phi_{\varepsilon}(x_{\varepsilon})+\psi_{\varepsilon}(Lx_{\varepsilon})\leq\varepsilon.\end{cases}

(34)

(ii)

$-val(CP)=\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)=\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)=-val(DCP)<+\infty$ .

We have (i) $\Rightarrow$ (ii). If, for every $\varepsilon>0$ , there exists $\psi_{\varepsilon}\in\Psi$ such that

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)-\varepsilon/2<\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)

and $-\psi_{\varepsilon}\circ L\in\Phi$ , then (i) $\Leftrightarrow$ (ii).

Proof.

(i) $\Rightarrow$ (ii). For any $\psi\in\Psi$

\inf_{\psi\in\Psi}\left(\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\right)\leq\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right).

By using inequality (11), for $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g(Lx_{\varepsilon}),\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f(x_{\varepsilon})$

	$\displaystyle\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)$	$\displaystyle\leq\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+\varepsilon$
		$\displaystyle=\sup_{x\in X}\left\{-\psi_{\varepsilon}\left(Lx\right)-f\left(x\right)\right\}+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+\varepsilon$
		$\displaystyle\leq f^{*}_{\Phi}\left(\phi_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+2\varepsilon$
		$\displaystyle\leq\phi_{\varepsilon}\left(x_{\varepsilon}\right)-f\left(x_{\varepsilon}\right)+\varepsilon+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)+2\varepsilon$
		$\displaystyle\leq 3\varepsilon-f\left(x_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right)$
		$\displaystyle\leq 3\varepsilon+\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right),$

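where, in the third estimate, we use the first inequality of (34), i.e. $-\psi_{\varepsilon}(Lx)\leq\phi_{\varepsilon}(x)+\varepsilon$ for all $x\in X$ , to obtain $f^{*}_{\Phi}(\phi_{\varepsilon})$ .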

By letting $\varepsilon\to 0$ we obtain zero duality gap. We can use the same argument as in Theorem 4.2 to prove that

\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)<+\infty.

(ii) $\Rightarrow$ (i). We have

\inf_{\psi\in\Psi}\left(\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\right)=\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)<+\infty,

so for every $\varepsilon>0$ , there exists $\psi_{\varepsilon}\in\Psi$ such that

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)-\varepsilon/2<\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right),

and $-\psi_{\varepsilon}\circ L\in\Phi$ . Moreover, there exists an $x_{\varepsilon}\in X$ such that

\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)<\varepsilon/2-f\left(x_{\varepsilon}\right)-g\left(Lx_{\varepsilon}\right).

Combining the two inequalities gives us

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)<\varepsilon.

By adding and subtracting $\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)$ , we get

\left[\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)\right]+\left[g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)\right]<\varepsilon.

Since each term is nonnegative, we obtain

	$\displaystyle\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)$	$\displaystyle\leq\varepsilon$
	$\displaystyle g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)$	$\displaystyle\leq\varepsilon.$

Since $-\psi_{\varepsilon}\circ L\in\Phi$ , we have $\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)=f^{*}_{\Phi}\left(-\psi_{\varepsilon}\circ L\right)$ . By the above inequalities, $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ and

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)=f^{*}_{\Phi}\left(-\psi_{\varepsilon}\circ L\right)+f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}(Lx_{\varepsilon})\leq\varepsilon

which is equivalent to $-\psi_{\varepsilon}\circ L\in\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon}\right)$ . Hence, (i) is proved. ∎

Next, we obtain the existence of an optimal solution to (CP) in the spirit of [23, Theorem 3.6].

Theorem 4.4.

Let $f:X\to\left(-\infty,+\infty\right],g:Y\to\left(-\infty,+\infty\right]$ be such that $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ and let $L:X\to Y$ be a mapping from $X$ into $Y$ . Suppose that $0\in\Phi,0\in\Psi$ and that $x\in\text{dom }f$ satisfies $Lx\in\text{dom }g$ . Consider the following conditions.

(i)

For all $\varepsilon>0$ , there exist $\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f(x),\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx\right)$ such that

\displaystyle\begin{cases}\phi_{\varepsilon}(z)+\psi_{\varepsilon}(Lz)\geq-\varepsilon\quad\text{for all }z\in X\\ \phi_{\varepsilon}(x)+\psi_{\varepsilon}(Lx)\leq\varepsilon.\end{cases}

(ii)

$-val(CP)=\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)=\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)=-val(DCP)<+\infty$ and $x$ is an optimal solution to (CP).

We have (i) $\Rightarrow$ (ii). If, for every $\varepsilon>0$ , there exists $\psi_{\varepsilon}\in\Psi$ such that $\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)-\varepsilon/2<\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)$ and $-\psi_{\varepsilon}\circ L\in\Phi$ , then the two statements are equivalent.

Proof.

The lines of the proof coincide with the ones of Theorem 4.3. We only need to prove that $x$ is an optimal solution to (CP). Now as (i) holds for all $\varepsilon>0$ , we have

	$\displaystyle\inf_{\psi\in\Psi}\left(\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\right)$	$\displaystyle\leq\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)$
		$\displaystyle\leq 3\varepsilon-f\left(x\right)-g\left(Lx\right)$
		$\displaystyle\leq 3\varepsilon+\left(f+g\circ L\right)^{*}_{\Phi}(0).$

Since the above inequality holds for all $\varepsilon>0$ , letting $\varepsilon\to 0$ we obtain zero duality gap

\inf_{\psi\in\Psi}\left(\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)\right)=(f+g\circ L)^{*}_{\Phi}(0).

Notice that

\inf_{z\in X}f(z)+g(Lz)=-(f+g\circ L)^{*}_{\Phi}(0)\geq f\left(x\right)+g\left(Lx\right).

As $(f+g\circ L)^{*}_{\Phi}(0)<+\infty$ , the primal problem (CP) is finite with $x$ as an optimal solution. ∎

Remark 4.

¹ We thank an anonymous referee for this observation.

Let us observe that the condition (i) of Theorem 4.3 and Theorem 4.4 can be replaced by the condition

\phi_{\varepsilon}(x)+\psi_{\varepsilon}(Lx)=0,\quad\forall x\in X.

(35)

Indeed, if (35) holds, which means $-\psi_{\varepsilon}\circ L=\phi_{\varepsilon}\in\Phi$ , then (34) and (33) are satisfied.

4.1 Zero duality gap for specific classes $\Phi,\ \Psi$

Consider the classes of elementary functions as in (6), i.e. $\Phi_{Q,a}$ and $\Psi_{Q,b}$ defined in (23) and (24) for $a,b\leq 0$ . We let $c=d=0$ because they affect neither the dual problem nor the condition for zero duality gap. The dual problem (25) takes the form

\sup_{\begin{subarray}{c}\psi\in\Psi_{Q,b}\\ \psi(\cdot)=-b\|\cdot\|^{2}_{Y}+\langle v,\cdot\rangle_{Y}\end{subarray}}-\left(f+b\left\|L\cdot\right\|_{Y}^{2}\right)^{*}_{\Phi_{Q,a}}\left(-\langle L^{*}v,\cdot\rangle\right)-g^{*}_{\Psi_{Q,b}}\left(\psi\right).

(36)

Proposition 4.5.

Let $X$ and $Y$ be Hilbert spaces with inner products, $\langle\cdot,\cdot\rangle_{X}$ and $\langle\cdot,\cdot\rangle_{Y}$ , respectively. Let $L:X\rightarrow Y$ be a continuous linear operator from $X$ to $Y$ , and $f:X\rightarrow(-\infty,+\infty]$ be a weakly convex function with modulus $a$ . Then $f(x)-b\lVert Lx\rVert^{2}_{Y}$ is weakly convex with modulus $a+b\lVert L\rVert^{2}_{\mathcal{L}(X,Y)}$ .

Proof.

By Definition 2.6, a function $f:X\rightarrow(-\infty,+\infty]$ is weakly convex on $X$ with modulus $a>0$ if $f+a\lVert\cdot\rVert^{2}_{X}$ is a convex function, or equivalently, for any $x_{1},x_{2}\in X$ and $\lambda\in[0,1]$ ,

f(\lambda x_{1}+(1-\lambda)x_{2})\leq\lambda f(x_{1})+(1-\lambda)f(x_{2})+a\lambda(1-\lambda)\|x_{1}-x_{2}\|_{X}^{2}.

Now we analyze the weak convexity of the function $f(x)-b\lVert Lx\rVert^{2}_{Y}$ , where $b\geq 0$ . We have

	$\displaystyle\lVert L(\lambda x_{1}+(1-\lambda)x_{2})\rVert^{2}_{Y}$	$\displaystyle=\lVert\lambda Lx_{1}+(1-\lambda)Lx_{2}\rVert^{2}_{Y}$
		$\displaystyle=\lambda^{2}\lVert Lx_{1}\rVert^{2}_{Y}+(1-\lambda)^{2}\lVert Lx_{2}\rVert^{2}_{Y}+2\lambda(1-\lambda)\langle Lx_{1},Lx_{2}\rangle_{Y}$
		$\displaystyle=\lambda\lVert Lx_{1}\rVert^{2}_{Y}+(1-\lambda)\lVert Lx_{2}\rVert^{2}_{Y}-\lambda(1-\lambda)\lVert Lx_{1}-Lx_{2}\rVert^{2}_{Y}.$

Denote $x_{\lambda}:=\lambda x_{1}+\left(1-\lambda\right)x_{2}$ for $\lambda\in\left[0,1\right]$ . Therefore,

	$\displaystyle f\left(x_{\lambda}\right)-b\left\|Lx_{\lambda}\right\|_{Y}^{2}$	$\displaystyle\leq\lambda\left(f\left(x_{1}\right)-b\left\|Lx_{1}\right\|_{Y}^{2}\right)+\left(1-\lambda\right)\left(f\left(x_{2}\right)-b\left\|Lx_{2}\right\|_{Y}^{2}\right)$
		$\displaystyle+b\lambda\left(1-\lambda\right)\left\|Lx_{1}-Lx_{2}\right\|_{Y}^{2}+a\lambda\left(1-\lambda\right)\left\|x_{1}-x_{2}\right\|_{X}^{2}.$

Since $L$ is linear and continuous, $\left\|Lx_{1}-Lx_{2}\right\|_{Y}\leq\lVert L\rVert_{\mathcal{L}(X,Y)}\left\|x_{1}-x_{2}\right\|_{X}$ , and we obtain

	$\displaystyle f\left(x_{\lambda}\right)-b\left\|Lx_{\lambda}\right\|_{Y}^{2}$	$\displaystyle\leq\lambda\left(f\left(x_{1}\right)-b\left\|Lx_{1}\right\|_{Y}^{2}\right)+\left(1-\lambda\right)\left(f\left(x_{2}\right)-b\left\|Lx_{2}\right\|_{Y}^{2}\right)$
		$\displaystyle+\left(a+b\lVert L\rVert^{2}_{\mathcal{L}(X,Y)}\right)\lambda\left(1-\lambda\right)\left\|x_{1}-x_{2}\right\|_{X}^{2},$

where $\lVert L\rVert_{\mathcal{L}(X,Y)}=\sup_{\|x\|_{X}\leq 1}\|Lx\|_{Y}$ . This means $f(x)-b\lVert Lx\rVert^{2}_{Y}$ is weakly convex with modulus $a+b\lVert L\rVert^{2}_{\mathcal{L}(X,Y)}$ . ∎
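The inequality established in the proof can be sanity-checked on random points. The sketch below is illustrative only: the choices $X=\mathbb{R}^{2}$ , $Y=\mathbb{R}$ , $L(x_{1},x_{2})=x_{1}-x_{2}$ (so $\lVert L\rVert^{2}=2$ ), the sample function $f$ , the constants, and all names in the code are assumptions that do not come from the text.

```python
import random

A_MOD = 0.5      # modulus a of f (assumed sample value)
B_COEF = 2.0     # coefficient b >= 0 (assumed sample value)
L_NORM_SQ = 2.0  # squared operator norm of L(x1, x2) = x1 - x2

def L(x):
    # Sample continuous linear operator from R^2 to R.
    return x[0] - x[1]

def sq_norm(x):
    return x[0] ** 2 + x[1] ** 2

def f(x):
    # f + A_MOD*||x||^2 = |x1| + x2^2 is convex, so f is weakly convex
    # with modulus A_MOD.
    return abs(x[0]) + x[1] ** 2 - A_MOD * sq_norm(x)

def h(x):
    # The function of the proposition: f(x) - b*||Lx||^2.
    return f(x) - B_COEF * L(x) ** 2

def violation(x1, x2, lam):
    # h(x_lam) minus the right-hand side of the weak convexity inequality
    # with modulus a + b*||L||^2; it should not be positive.
    modulus = A_MOD + B_COEF * L_NORM_SQ
    x_lam = (lam * x1[0] + (1 - lam) * x2[0],
             lam * x1[1] + (1 - lam) * x2[1])
    diff = (x1[0] - x2[0], x1[1] - x2[1])
    rhs = (lam * h(x1) + (1 - lam) * h(x2)
           + modulus * lam * (1 - lam) * sq_norm(diff))
    return h(x_lam) - rhs

rng = random.Random(0)  # fixed seed for reproducibility

def sample_point():
    return (rng.uniform(-5.0, 5.0), rng.uniform(-5.0, 5.0))

worst = max(violation(sample_point(), sample_point(), rng.random())
            for _ in range(2000))
print("largest violation found:", worst)
```

A random test of this kind cannot prove the proposition; it only guards against sign or constant errors in the modulus. Up to floating-point rounding, the largest violation should not be positive.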

The following corollary follows from Theorem 4.3.

Corollary 4.6.

Let $X,Y$ be Hilbert spaces. Let $f:X\to\left(-\infty,+\infty\right],\ g:Y\to\left(-\infty,+\infty\right]$ and $L:X\to Y$ be a continuous linear operator. Assume that $a\geq 0,b\in\mathbb{R}$ , $c=d=0$ in the definitions of the classes of elementary functions $\Phi_{Q,a}$ and $\Psi_{Q,b}$ ((23) and (24)), respectively, and $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . The following are equivalent.

(i)

For every $\varepsilon>0$ , there exist $x_{\varepsilon}\in X,\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi_{Q,b}}g\left(Lx_{\varepsilon}\right),\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi_{Q,a}}\left(f+b_{\varepsilon}\left\|L\cdot\right\|_{Y}^{2}\right)\left(x_{\varepsilon}\right)$ such that

\displaystyle\begin{cases}\phi_{\varepsilon}\left(x\right)+\psi_{\varepsilon}\left(Lx\right)\geq b\left\|Lx\right\|_{Y}^{2}-\varepsilon&\forall x\in X\\ \phi_{\varepsilon}\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)\leq b\left\|Lx_{\varepsilon}\right\|_{Y}^{2}+\varepsilon,\end{cases}

where $\psi_{\varepsilon}(y)=b\|y\|^{2}_{Y}+\langle v_{\varepsilon},y\rangle_{Y}$ .

(ii)

$\left(f+g\circ L\right)^{*}_{\Phi_{Q,a}}\left(0\right)={\displaystyle\inf_{\psi\in\Psi_{Q,b}}}\left(f+b\left\|L\cdot\right\|_{Y}^{2}\right)^{*}_{\Phi_{Q,a}}\left(-\langle L^{*}v,\cdot\rangle_{X}\right)+g^{*}_{\Psi_{Q,b}}\left(\psi\right)<+\infty$ , where $\psi(y)=b\|y\|^{2}_{Y}+\langle v,y\rangle_{Y}$ .

Proof.

Let $\varepsilon>0$ . By (i), there exist $x_{\varepsilon}\in X,\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi_{Q,b}}g\left(Lx_{\varepsilon}\right),\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi_{Q,a}}\left(f+b_{\varepsilon}\left\|L\cdot\right\|_{Y}^{2}\right)\left(x_{\varepsilon}\right)$ such that for all $x\in X$ ,

\displaystyle\phi_{\varepsilon}\left(x\right)+\psi_{\varepsilon}\left(Lx\right)=a\lVert x\rVert^{2}_{X}+\langle u_{\varepsilon},x\rangle_{X}+b\lVert Lx\rVert^{2}_{Y}+\langle v_{\varepsilon},Lx\rangle_{Y}\geq b\lVert Lx\rVert^{2}_{Y}-\varepsilon,

where $u_{\varepsilon}\in X,v_{\varepsilon}\in Y$ , i.e.,

a\lVert x\rVert^{2}_{X}+\langle u_{\varepsilon}+L^{*}v_{\varepsilon},x\rangle_{X}+\varepsilon\geq 0,\quad\forall x\in X.

(37)

Hence, it must be $u_{\varepsilon}=-L^{*}v_{\varepsilon}$ . Thus, for $\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi_{Q,a}}\left(f+b\left\|L\cdot\right\|_{Y}^{2}\right)\left(x_{\varepsilon}\right)$ , we can write

	$\displaystyle f\left(x_{\varepsilon}\right)+b\left\|Lx_{\varepsilon}\right\|_{Y}^{2}+\left(f+b\left\|L\cdot\right\|_{Y}^{2}\right)^{*}_{\Phi_{Q,a}}\left(\phi_{\varepsilon}\right)$	$\displaystyle\leq\phi_{\varepsilon}\left(x_{\varepsilon}\right)+\varepsilon$
	$\displaystyle\left(f+\psi\circ L\right)^{*}_{\Phi_{Q,a}}\left(0\right)$	$\displaystyle\leq-f\left(x_{\varepsilon}\right)-\psi\left(Lx_{\varepsilon}\right)+3\varepsilon,$

which is similar to the proof of Theorem 4.2, so the assertion follows from Theorem 4.2. ∎

Observe that the above corollary does not hold for $\Phi_{Q,a}$ with $a<0$ due to formula (37).

We give simple examples illustrating Theorem 4.3.

Example 4.7.

•

Let $f\left(x\right)=\left(x+1\right)^{2},g\left(x\right)=4x^{2},L=Id$ and

	$\displaystyle\Phi$	$\displaystyle=\left\{\phi\left(x\right)=-ax^{2}+bx,a\geq 0,b\in\mathbb{R}\right\},$
	$\displaystyle\Psi$	$\displaystyle=\left\{\psi\left(x\right)=cx,c\in\mathbb{R}\right\},$

(cf. Remark 1).

The conjugates of $f$ and $g$ are given as

	$\displaystyle f^{*}_{\Phi}\left(\phi\right)$	$\displaystyle=\frac{\left(b-2\right)^{2}}{4\left(a+1\right)}-1,\quad a\geq 0,b\in\mathbb{R}$
	$\displaystyle g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle=\frac{c^{2}}{16},\quad c\in\mathbb{R}.$

The $\varepsilon$ -subdifferentials of $f$ and $g$ at $x_{0}$ are given as

	$\displaystyle\partial_{\varepsilon,\Phi}f\left(x_{0}\right)$	$\displaystyle=\left\{\phi\left(x\right)=-ax^{2}+bx:\left(a+1\right)\left(x_{0}-\frac{b-2}{2\left(a+1\right)}\right)^{2}\leq\varepsilon,a\geq 0,b\in\mathbb{R}\right\},$
	$\displaystyle\partial_{\varepsilon,\Psi}g\left(x_{0}\right)$	$\displaystyle=\left\{\psi\left(x\right)=cx:\left(2x_{0}-\frac{c}{4}\right)^{2}\leq\varepsilon,c\in\mathbb{R}\right\}.$

To verify condition (i) of Theorem 4.3, we find $x_{\varepsilon}\in\mathbb{R},\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(x_{\varepsilon}\right),\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon}\right)$ such that

\begin{cases}\phi_{\varepsilon}\left(x\right)+\psi_{\varepsilon}\left(x\right)\geq-\varepsilon&\forall x\in\mathbb{R}\\ \phi_{\varepsilon}\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(x_{\varepsilon}\right)\leq\varepsilon\end{cases}.

Now consider the first inequality

\phi_{\varepsilon}\left(x\right)+\psi_{\varepsilon}\left(x\right)=-ax^{2}+\left(b+c\right)x\geq-\varepsilon\quad\forall x\in\mathbb{R},

which gives $a=0,b+c=0$ , so the second inequality, i.e. $\phi_{\varepsilon}\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(x_{\varepsilon}\right)\leq\varepsilon$ , holds for every $x_{\varepsilon}\in\mathbb{R}$ . Hence, Theorem 4.3-(i) holds.

Moreover, as $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(x_{\varepsilon}\right),\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon}\right)$ , we have

\begin{cases}\left(x_{\varepsilon}-\frac{b-2}{2}\right)^{2}\leq\varepsilon\\ \left(2x_{\varepsilon}+\frac{b}{4}\right)^{2}\leq\varepsilon\end{cases}.

Letting $\varepsilon\to 0$ , with $x_{\varepsilon}\to x_{0}$ , we obtain $b=2x_{0}+2,c=8x_{0}$ . Since $b+c=0$ , we get $x_{0}=-\frac{1}{5}$ . With $x_{0},a,b,c$ calculated above we get

\displaystyle\inf_{x\in X}f(x)+g(x)=f\left(x_{0}\right)+g\left(x_{0}\right)=\frac{4}{5},

and

\displaystyle\sup_{\psi\in\Psi}-(f+\psi)^{*}_{\Phi}(0)-g^{*}_{\Psi}(\psi)=-f^{*}_{\Phi}\left(\phi_{\varepsilon}\right)-g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)=\frac{24}{25}-\frac{4}{25}=\frac{4}{5}.

Hence, we achieve zero duality gap.

•

Now we reverse the roles of $\Phi$ and $\Psi$ i.e.

	$\displaystyle\Phi$	$\displaystyle=\left\{\phi\left(x\right)=cx,c\in\mathbb{R}\right\},$
	$\displaystyle\Psi$	$\displaystyle=\left\{\psi\left(x\right)=-ax^{2}+bx,a\geq 0,b\in\mathbb{R}\right\}.$

Then the conjugates of $f$ and $g$ are

	$\displaystyle f^{*}_{\Phi}\left(\phi\right)$	$\displaystyle=\frac{\left(c-2\right)^{2}}{4}-1,$
	$\displaystyle g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle=\frac{b^{2}}{4\left(a+4\right)}.$

The $\varepsilon$ -subdifferentials of $f$ and $g$ are

	$\displaystyle\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon}\right)$	$\displaystyle=\left\{\phi\left(x\right)=cx:\left(x_{\varepsilon}-\frac{c-2}{2}\right)^{2}\leq\varepsilon,c\in\mathbb{R}\right\},$
	$\displaystyle\partial_{\varepsilon,\Psi}g\left(x_{\varepsilon}\right)$	$\displaystyle=\left\{\psi\left(x\right)=-ax^{2}+bx:\left(x_{\varepsilon}-\frac{b}{2\left(a+4\right)}\right)^{2}\leq\frac{\varepsilon}{a+4},a\geq 0,b\in\mathbb{R}\right\}.$

We need to find $x_{\varepsilon}\in\mathbb{R},\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon}\right),\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(x_{\varepsilon}\right)$ such that

\begin{cases}\phi_{\varepsilon}\left(x\right)+\psi_{\varepsilon}\left(x\right)\geq-\varepsilon&\forall x\in\mathbb{R}\\ \phi_{\varepsilon}\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(x_{\varepsilon}\right)\leq\varepsilon\end{cases}.

The first inequality, for all $x\in\mathbb{R}$ ,

\phi_{\varepsilon}\left(x\right)+\psi_{\varepsilon}\left(x\right)=-ax^{2}+\left(b+c\right)x\geq-\varepsilon,

also implies $a=0,b+c=0$ , so the second inequality is automatically satisfied. Repeating the same steps as before, we achieve zero duality gap at the value $4/5$ , the same as in the previous case. However, this time condition (33) does not hold for the zero function $0\in\Phi$ and an arbitrary $\psi\in\Psi$ , because there is no $\phi\in\Phi$ such that

\phi(x)=-\psi\left(x\right)=ax^{2}-bx,

unless $a=0$ .
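The primal and dual values in the first part of Example 4.7 can be checked with exact rational arithmetic. This is a minimal sketch: the function names are hypothetical, and the closed form used in `dual` is derived here from $\left(f+\psi\right)^{*}_{\Phi}(0)=\sup_{x}\{-(x+1)^{2}-cx\}=\frac{(c+2)^{2}}{4}-1$ together with $g^{*}_{\Psi}(\psi)=\frac{c^{2}}{16}$ for $\psi(x)=cx$ .

```python
from fractions import Fraction

def primal(x):
    # f(x) + g(x) with f(x) = (x+1)^2, g(x) = 4x^2 and L = Id.
    return (x + 1) ** 2 + 4 * x ** 2

def dual(c):
    # -(f + psi)^*(0) - g^*(psi) for psi(x) = c*x, using
    # (f + psi)^*(0) = (c+2)^2/4 - 1 and g^*(psi) = c^2/16.
    return 1 - (c + 2) ** 2 / 4 - c ** 2 / 16

x0 = Fraction(-1, 5)   # primal solution found in the example
c0 = 8 * x0            # slope of psi, c = 8*x0 = -8/5

primal_value = primal(x0)
dual_value = dual(c0)

# Coarse exact grid: no sampled x beats x0, no sampled c beats c0.
grid = [Fraction(k, 100) for k in range(-300, 301)]
primal_ok = all(primal(x) >= primal_value for x in grid)
dual_ok = all(dual(c) <= dual_value for c in grid)

print(primal_value, dual_value, primal_ok, dual_ok)
```

Both optimal values coincide, in agreement with the zero duality gap obtained in the example. The grid comparison is only a plausibility check, not a proof of optimality.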

In some cases, if Theorem 4.3-(i) is not satisfied, we need to check zero duality gap in another way. Let us consider the following example.

Example 4.8.

Let $X=\mathbb{R}^{2}$ , $Y=\mathbb{R}$ , $f\left(x,y\right)=3x^{2}+2y^{2}$ , $g\left(x\right)=-\left(x-1\right)^{2},L\left(x,y\right)=x-y$ . The classes of elementary functions are defined as

	$\displaystyle\Phi$	$\displaystyle=\left\{\phi\left(x,y\right)=-a\left(x^{2}+y^{2}\right)+b_{1}x+b_{2}y,\ a\geq 0,b_{1},b_{2}\in\mathbb{R}\right\}$
	$\displaystyle\Psi$	$\displaystyle=\left\{\psi\left(x\right)=-cx^{2}+dx,\ c\geq 0,d\in\mathbb{R}\right\},$

(cf. Remark 1).

We have

	$\displaystyle f^{*}_{\Phi}\left(\phi\right)$	$\displaystyle=\frac{b_{1}^{2}}{4\left(a+3\right)}+\frac{b_{2}^{2}}{4\left(a+2\right)},$
	$\displaystyle\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon},y_{\varepsilon}\right)$	$\displaystyle=\begin{cases}\phi\left(x,y\right)=-a\left(x^{2}+y^{2}\right)+b_{1}x+b_{2}y,\\ \text{s.t. }\left(a+3\right)\left(x_{\varepsilon}-\frac{b_{1}}{2\left(a+3\right)}\right)^{2}+\left(a+2\right)\left(y_{\varepsilon}-\frac{b_{2}}{2\left(a+2\right)}\right)^{2}\leq\varepsilon\end{cases},$
	$\displaystyle g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle=\begin{cases}+\infty&0\leq c<1\text{ or }c=1,d\neq 2\\ 1&c=1,d=2\\ \frac{\left(d-2\right)^{2}}{4\left(c-1\right)}+1&c>1\end{cases},$
	$\displaystyle\partial_{\varepsilon,\Psi}g\left(x_{\varepsilon}\right)$	$\displaystyle=\begin{cases}\psi\left(x\right)=-cx^{2}+dx&\left(c-1\right)\left(x_{\varepsilon}-\frac{d-2}{2\left(c-1\right)}\right)^{2}\leq\varepsilon,c>1\\ \psi\left(x\right)=-x^{2}+2x&c=1,d=2,\forall x_{\varepsilon}.\end{cases}$

One option is to find, for $\varepsilon>0$ , $x_{\varepsilon},y_{\varepsilon}\in\mathbb{R},\phi_{\varepsilon}\in\partial_{\varepsilon,\Phi}f\left(x_{\varepsilon},y_{\varepsilon}\right),\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(L\left(x_{\varepsilon},y_{\varepsilon}\right)\right)$ such that

\begin{cases}\phi_{\varepsilon}\left(x,y\right)+\psi_{\varepsilon}\left(L\left(x,y\right)\right)\geq-\varepsilon&\forall x,y\in\mathbb{R}\\ \phi_{\varepsilon}\left(x_{\varepsilon},y_{\varepsilon}\right)+\psi_{\varepsilon}\left(L\left(x_{\varepsilon},y_{\varepsilon}\right)\right)\leq\varepsilon.\end{cases}

(38)

The first inequality is

	$\displaystyle\phi_{\varepsilon}\left(x,y\right)+\psi_{\varepsilon}\left(L\left(x,y\right)\right)$	$\displaystyle\geq-\varepsilon$
	$\displaystyle-a\left(x^{2}+y^{2}\right)+b_{1}x+b_{2}y-c\left(x-y\right)^{2}+d\left(x-y\right)+\varepsilon$	$\displaystyle\geq 0$
	$\displaystyle-\left(a+c\right)\left(x^{2}+y^{2}\right)+2cxy+\left(b_{1}+d\right)x+\left(b_{2}-d\right)y+\varepsilon$	$\displaystyle\geq 0,$

which gives us $a=c=0,b_{1}+d=b_{2}-d=0$ , so the second inequality of (38) holds for any $x,y\in\mathbb{R}$ . However, there is no element of $\partial_{\varepsilon,\Psi}g\left(L\left(x_{\varepsilon},y_{\varepsilon}\right)\right)$ with $c=0$ , so we cannot apply condition (i) of Theorem 4.3 to determine zero duality gap.

Another option is, instead of solving (38), to take $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(L\left(x_{\varepsilon},y_{\varepsilon}\right)\right)$ and calculate $\partial_{\varepsilon,\Phi}\left(f+\psi_{\varepsilon}\circ L\right)\left(x_{\varepsilon},y_{\varepsilon}\right)$ . We consider the simplest case where $\psi_{\varepsilon}\left(x\right)=-x^{2}+2x$ and we calculate

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(\phi\right)=-\left(a+2\right)A^{2}-\left(a+1\right)B^{2}-2AB+\left(b_{1}-2\right)A+\left(b_{2}+2\right)B,

where $A=\frac{ab_{1}-2a+b_{1}-b_{2}-4}{2\left(a^{2}+3a+1\right)},B=\frac{ab_{2}+2a-b_{1}+2b_{2}+6}{2\left(a^{2}+3a+1\right)}$ , for $a\geq 0$ . We examine whether $0\in\partial_{\varepsilon,\Phi}\left(f+\psi_{\varepsilon}\circ L\right)\left(x_{\varepsilon},y_{\varepsilon}\right)$ by checking the inequality

\left(f+\psi_{\varepsilon}\circ L\right)^{*}_{\Phi}\left(0\right)+\left(f+\psi_{\varepsilon}\circ L\right)\left(x_{\varepsilon},y_{\varepsilon}\right)\leq\varepsilon.

Substituting $\phi=0$ , i.e. $a=b_{1}=b_{2}=0$ , we have

2x_{\varepsilon}^{2}+y_{\varepsilon}^{2}+2x_{\varepsilon}y_{\varepsilon}+2\left(x_{\varepsilon}-y_{\varepsilon}\right)+5\leq\varepsilon.

This inequality has a solution $(x_{\varepsilon},y_{\varepsilon})$ for every $\varepsilon>0$ , since the left-hand side is a convex quadratic function (a paraboloid) whose minimum value is $0$ , attained at $(-2,3)$ . Thus, condition (i) of Theorem 4.2 is satisfied, and we have zero duality gap for the conjugate duality.
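The computation in Example 4.8 for the choice $\psi_{\varepsilon}(x)=-x^{2}+2x$ can be reproduced with exact arithmetic. The sketch below is illustrative: the helper names are hypothetical, and the stationary point $(-2,3)$ is obtained here by setting the gradient $(4x+2y+2,\,2x+2y-2)$ of $f+\psi_{\varepsilon}\circ L$ to zero (the Hessian is positive definite, so this is the global minimizer).

```python
from fractions import Fraction

def f(x, y):
    return 3 * x * x + 2 * y * y

def g(t):
    return -(t - 1) ** 2

def L(x, y):
    return x - y

def psi(t):
    # Element of Psi with c = 1, d = 2.
    return -t * t + 2 * t

def perturbed(x, y):
    # (f + psi o L)(x, y) = 2x^2 + y^2 + 2xy + 2(x - y)
    return f(x, y) + psi(L(x, y))

# Minimizer of the strictly convex quadratic `perturbed`.
x_star, y_star = Fraction(-2), Fraction(3)

conj_at_zero = -perturbed(x_star, y_star)   # (f + psi o L)^*(0)
g_conj_psi = Fraction(1)                    # g^*(psi) for c = 1, d = 2
primal_value = f(x_star, y_star) + g(L(x_star, y_star))
dual_value = -conj_at_zero - g_conj_psi

def lhs(x, y):
    # Left-hand side of the inequality checked in the example.
    return conj_at_zero + perturbed(x, y)

grid = range(-10, 11)
lhs_nonnegative = all(lhs(x, y) >= 0 for x in grid for y in grid)

print(conj_at_zero, primal_value, dual_value, lhs(x_star, y_star))
```

Since $g=\psi_{\varepsilon}-1$ for this particular $\psi_{\varepsilon}$ , the primal objective and $f+\psi_{\varepsilon}\circ L$ share the same minimizer, which is why the primal value and the dual value at $\psi_{\varepsilon}$ agree.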

Theorem 4.2 is more general than Theorem 4.3, but it requires more calculations. Sometimes it is more convenient to calculate the $\varepsilon$ -subdifferentials of $f$ and $g$ than the $\varepsilon$ -subdifferential of $(f+\psi_{\varepsilon}\circ L)$ where $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g(Lx_{\varepsilon})$ .

5 Strong Duality for Conjugate Dual

In this section, we investigate strong duality for the conjugate dual (DCP), i.e., we provide conditions for zero duality gap and the existence of a solution to the dual problem (DCP); the respective conditions are expressed in terms of additivity of epigraphs of the conjugate functions.

Definition 5.1.

The epigraph of a function $f:X\to(-\infty,+\infty]$ is a subset of $X\times\mathbb{R}$ defined as

\text{epi }f:=\left\{(x,r)\in X\times\mathbb{R}:f(x)\leq r\right\}.

(39)

We have

\text{epi }f^{*}_{\Phi}:=\left\{(\phi,r)\in\Phi\times\mathbb{R}:f^{*}_{\Phi}(\phi)\leq r\right\}.

(40)

In consequence, if $(\phi,r)\in\text{epi }f^{*}_{\Phi}$ , then $\phi\in\text{dom }f^{*}_{\Phi}$ . To investigate strong duality, together with (33), we need the following condition: for a given $\phi\in\Phi$ and $\psi\in\Psi$

\phi+\psi\circ L\in\Phi.

(41)

Remark 5.

Note that (41) is satisfied when $\psi\circ L\in\text{lin }\Phi$ where $\text{lin }\Phi$ is the linear space generated by $\Phi$ .

Theorem 5.2.

Let $X$ be a nonempty set and $Y$ be a vector space. Let $f:X\to\left(-\infty,+\infty\right]$ and $g:Y\to\left(-\infty,+\infty\right]$ be proper functions. Let $\Phi$ and $\Psi$ be sets of elementary functions defined on $X$ and $Y$ , respectively. Let $L:X\to Y$ be a mapping from $X$ to $Y$ . Assume that $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ and consider the following conditions.

(i)

Every $(\phi,c_{\phi})\in\text{epi }(f+g\circ L)^{*}_{\Phi}$ , can be expressed as $(\phi,c_{\phi})=(\varphi+\psi\circ L,c_{\varphi}+c_{\psi})$ , where $(\varphi,c_{\varphi})\in\text{epi }f^{*}_{\Phi}$ and $(\psi,c_{\psi})\in\text{epi }g^{*}_{\Psi}$ .
(ii)

For any $\phi\in\Phi$ , it holds $\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)=\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)$ and the infimum is attained.

We have (i) $\Rightarrow$ (ii). Moreover, for any $\phi\in\Phi$ , if $\bar{\psi}\in\Psi$ is a solution to the problem

\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right),

(42)

and conditions (41), (33) hold for $\bar{\psi}$ and $\phi$ , then (ii) $\Rightarrow$ (i).

Proof.

(i) $\Rightarrow$ (ii): If $\phi\notin\text{dom }(f+g\circ L)^{*}_{\Phi}$ , then $(f+g\circ L)^{*}_{\Phi}(\phi)=+\infty$ and (ii) holds. Consider the case $\phi\in\text{dom }(f+g\circ L)^{*}_{\Phi}$ . By Theorem 4.1, for every $\phi\in\Phi$ , we have

\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)\leq\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right).

(43)

We start by proving the opposite inequality. Since $\left(\phi,\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)\right)\in\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}$ , by (i), there exist $\left(\varphi,c_{f}\right)\in\text{epi }f^{*}_{\Phi}$ and $\left(\psi_{0},c_{g}\right)\in\text{epi }g^{*}_{\Psi}$ such that

	$\displaystyle\phi\left(x\right)$	$\displaystyle=\varphi\left(x\right)+\psi_{0}\left(Lx\right),\quad\forall x\in X$
	$\displaystyle\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$	$\displaystyle=c_{f}+c_{g}.$

Thus

	$\displaystyle\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$	$\displaystyle=c_{f}+c_{g}$
		$\displaystyle\geq f^{*}_{\Phi}\left(\varphi\right)+g^{*}_{\Psi}\left(\psi_{0}\right)$
		$\displaystyle=\sup_{x\in X}\left\{\varphi\left(x\right)-f\left(x\right)\right\}+g^{*}_{\Psi}\left(\psi_{0}\right)$
		$\displaystyle=\sup_{x\in X}\left\{\phi\left(x\right)-\psi_{0}\left(Lx\right)-f\left(x\right)\right\}+g^{*}_{\Psi}\left(\psi_{0}\right)$
		$\displaystyle=\left(f+\psi_{0}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi_{0}\right)$
		$\displaystyle\geq\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right),$

and we have (ii). The infimum is attained from the above inequality, as

	$\displaystyle+\infty>c_{f}+c_{g}=\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$	$\displaystyle\geq\left(f+\psi_{0}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi_{0}\right)$
		$\displaystyle\geq\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)$
		$\displaystyle\geq\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right),$

so $\psi_{0}\in\Psi$ solves problem (42).

(ii) $\Rightarrow$ (i): From (ii), let $(\phi,c_{f})\in\text{epi }f^{*}_{\Phi}$ and let $\bar{\psi}$ be a solution to the problem (42) for $\phi\in\Phi$ . As $\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}(\phi)+g^{*}_{\Psi}(\bar{\psi})<+\infty$ , we can find $c_{g}\in\mathbb{R}$ such that $(\bar{\psi},c_{g})\in\text{epi }g^{*}_{\Psi}$ . Moreover, by (41), $\varphi:=\phi+\bar{\psi}\circ L\in\Phi$ , and we have

	$\displaystyle c_{f}+c_{g}$	$\displaystyle\geq f^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)$
		$\displaystyle=\sup_{x\in X}\left\{\varphi\left(x\right)-\bar{\psi}(Lx)-f\left(x\right)\right\}+g^{*}_{\Psi}\left(\bar{\psi}\right)$
		$\displaystyle=(f+\bar{\psi}\circ L)^{*}_{\Phi}(\varphi)+g^{*}_{\Psi}(\bar{\psi})$
		$\displaystyle\geq\left(f+g\circ L\right)^{*}_{\Phi}(\varphi).$

This means $\left(\varphi,c_{f}+c_{g}\right)\in\text{epi }(f+g\circ L)^{*}_{\Phi}$ .

On the other hand, let $\left(\phi,r\right)\in\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}$ . We want to prove that there exist $(\varphi,c_{f})\in\text{epi }f^{*}_{\Phi}$ and $(\psi,c_{g})\in\text{epi }g^{*}_{\Psi}$ such that

(\phi,r)=(\varphi+\psi\circ L,c_{f}+c_{g}).

From (ii), for $\phi\in\Phi$ , there exists $\bar{\psi}\in\Psi$ such that $\bar{\psi}$ is the solution to the problem (42) i.e.

\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)=\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right).

Therefore, we have

	$\displaystyle r$	$\displaystyle\geq\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$
	$\displaystyle r$	$\displaystyle\geq\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)$
	$\displaystyle r-g^{*}_{\Psi}\left(\bar{\psi}\right)$	$\displaystyle\geq\sup_{x\in X}\left\{\phi\left(x\right)-\bar{\psi}\left(Lx\right)-f\left(x\right)\right\}.$

Condition (33) gives us $\varphi:=\phi-\bar{\psi}\circ L\in\Phi$ . Then

r-g^{*}_{\Psi}\left(\bar{\psi}\right)\geq\sup_{x\in X}\left\{\varphi\left(x\right)-f\left(x\right)\right\}=f^{*}_{\Phi}\left(\varphi\right).

Therefore, $\left(\varphi,r-g^{*}_{\Psi}\left(\bar{\psi}\right)\right)\in\text{epi }f^{*}_{\Phi}$ and $\left(\bar{\psi},g^{*}_{\Psi}\left(\bar{\psi}\right)\right)\in\text{epi }g^{*}_{\Psi}$ , which means we can decompose $\left(\phi,r\right)$ into $\left(\varphi,r-g^{*}_{\Psi}\left(\bar{\psi}\right)\right)$ and $\left(\bar{\psi},g^{*}_{\Psi}\left(\bar{\psi}\right)\right)$ . Hence, the proof is finished. ∎

Remark 6.

•

Observe that condition (ii) of Theorem 5.2 implies the strong duality relationship between (CP) and (DCP), i.e. zero duality gap holds

-val(CP)=\left(f+g\circ L\right)^{*}_{\Phi}\left(0\right)=\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(0\right)+g^{*}_{\Psi}\left(\psi\right)=-val(DCP),

and the dual problem (DCP) is solvable.

•

The assumption (i) in Theorem 5.2 does not coincide with the following

$\text{epi }(f+g\circ L)_{\Phi}^{*}=\text{epi }f_{\Phi}^{*}+\text{epi }(g\circ L)_{\Phi}^{*},$

(see also [16, Theorem 3.1]), because we consider the set $B:=\left\{(\psi\circ L,r):(\psi,r)\in\text{epi }g_{\Psi}^{*}\right\}$ instead of $\text{epi }(g\circ L)_{\Phi}^{*}$ , where $(\psi,r)\in\text{epi }g_{\Psi}^{*}$ . Then the assumption (i) of Theorem 5.2 is

$\text{epi }(f+g\circ L)_{\Phi}^{*}=\text{epi }f_{\Phi}^{*}+B,$

whenever $\phi+\psi\circ L\in\Phi$ holds true for any $\psi\in\text{dom }g^{*}_{\Psi},\phi\in\text{dom }f^{*}_{\Phi}$ . Examples of classes of elementary functions, where condition $\phi+\psi\circ L\in\Phi$ is satisfied, can be found in Example 3.2-3.

Corollary 5.3.

Let $X$ be a nonempty set and $Y$ be a vector space. Let $f:X\to\left(-\infty,+\infty\right],\ g:Y\to\left(-\infty,+\infty\right]$ and $L:X\to Y$ be a mapping from $X$ to $Y$ . Let $\Phi,\Psi$ be sets of elementary functions on $X$ and $Y$ , respectively. Assume that $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . Consider the following statements.

(i)

$\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}=\text{epi }f^{*}_{\Phi}+\text{epi }\left(g\circ L\right)^{*}_{\Phi}$ ,
(ii)

For $\phi\in\Phi$ , $\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)=\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$ and the infimum is attained.

Consider the following conditions.

a.

For any pair $(\phi,r)\in\text{epi }(g\circ L)^{*}_{\Phi}$ , there exists $\psi\in\Psi$ such that $\phi\leq\psi\circ L$ and $g^{*}_{\Psi}(\psi)\leq r$ .
b.

There exists $\psi\in\Psi$ such that $g^{*}_{\Psi}(\psi)\leq 0$ and $g\circ L\leq\psi\circ L$ .
c.

For any pair $(\phi,r)\in\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}$ and a given $\psi\in\Psi$ , there exist $\phi_{1},\phi_{2}\in\Phi$ such that $\phi-\psi\circ L\geq\phi_{1}$ and $\psi\circ L\geq\phi_{2}$ .
d.

$\Phi-\Phi\subset\Phi$ and, for any $\varepsilon_{1},\varepsilon_{2}>0$ and a given $\psi\in\Psi$ , there exist $x_{\varepsilon_{2}}\in X,\phi_{\varepsilon_{2}}\in\partial_{\varepsilon_{2},\Phi}\left(g\circ L\right)\left(x_{\varepsilon_{2}}\right)$ such that

\psi\left(Lx_{\varepsilon_{2}}\right)-\phi_{\varepsilon_{2}}\left(x_{\varepsilon_{2}}\right)+\varepsilon_{1}>\sup_{x\in X}\psi\left(Lx\right)-\phi_{\varepsilon_{2}}\left(x\right).

If $\left(a\right)$ or $\left(b\right)$ holds, then (i) $\Rightarrow$ (ii). If $\left(c\right)$ or $\left(d\right)$ holds for $\bar{\psi}\in\Psi$ at which the infimum in (ii) is attained, then (ii) $\Rightarrow$ (i).

Proof.

(i) $\Rightarrow$ (ii): For any $\phi\in\Phi,\left(\phi,\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)\right)\in\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}$ . There exist two pairs $\left(\phi_{1},c_{1}\right)\in\text{epi }f^{*}_{\Phi}$ and $\left(\phi_{2},c_{2}\right)\in\text{epi }\left(g\circ L\right)^{*}_{\Phi}$ such that $\phi=\phi_{1}+\phi_{2}$ and $c_{1}+c_{2}=\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)$ . Thanks to Theorem 4.1, for any $\phi\in\Phi$ , we have

\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)\geq\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right),

so we need to prove the reverse inequality. Consider

\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)\leq\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)

(44)

Assume $\left(a\right)$ . We can find $\psi_{0}\in\Psi$ such that $\phi_{2}\leq\psi_{0}\circ L$ and $g^{*}_{\Psi}(\psi_{0})\leq c_{2}$ . Note that

\phi_{1}=\phi-\phi_{2}\geq\phi-\psi_{0}\circ L.

Hence,

$\displaystyle\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle\leq\left(f+\psi_{0}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi_{0}\right)$
	$\displaystyle\leq\sup_{x\in X}\left\{\phi\left(x\right)-f\left(x\right)-\psi_{0}\left(Lx\right)\right\}+c_{2}$
	$\displaystyle\leq\sup_{x\in X}\left\{\phi_{1}\left(x\right)-f\left(x\right)\right\}+c_{2}$
	$\displaystyle=f^{*}_{\Phi}\left(\phi_{1}\right)+c_{2}$
	$\displaystyle\leq c_{1}+c_{2}=\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right).$	(45)

Inequality (45) gives us

(f+g\circ L)^{*}_{\Phi}(\phi)\leq\left(f+\psi_{0}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi_{0}\right)\leq(f+g\circ L)^{*}_{\Phi}(\phi)<+\infty,

and the infimum in (ii) is attained at $\psi_{0}$ .

Now let $\left(b\right)$ hold. There exists $\psi_{0}\in\Psi$ such that $g\circ L\leq\psi_{0}\circ L$ and $g^{*}_{\Psi}(\psi_{0})\leq 0$ . We have

$\displaystyle\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle\leq\left(f+\psi_{0}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi_{0}\right)$
	$\displaystyle\leq\sup_{x\in X}\left\{\phi_{1}\left(x\right)-f\left(x\right)\right\}+\sup_{x\in X}\left\{\phi_{2}(x)-\psi_{0}\left(Lx\right)\right\}$
	$\displaystyle\leq f^{*}_{\Phi}(\phi_{1})+\sup_{x\in X}\left\{\phi_{2}(x)-g\left(Lx\right)\right\}$
	$\displaystyle\leq f^{*}_{\Phi}\left(\phi_{1}\right)+(g\circ L)^{*}_{\Phi}(\phi_{2})$
	$\displaystyle\leq c_{1}+c_{2}=\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right),$	(46)

where we have used $\phi=\phi_{1}+\phi_{2}$ in the second inequality. The attainment of the infimum of (ii) at $\psi_{0}\in\Psi$ follows in the same way as in the proof with condition (a).

(ii) $\Rightarrow$ (i): Let $\left(\phi_{1},c_{1}\right)\in\text{epi }f^{*}_{\Phi},\left(\phi_{2},c_{2}\right)\in\text{epi }\left(g\circ L\right)^{*}_{\Phi}$ . Then we have

c_{1}+c_{2}\geq f^{*}_{\Phi}\left(\phi_{1}\right)+\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{2}\right)\geq\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi_{1}+\phi_{2}\right),

so $\left(\phi_{1}+\phi_{2},c_{1}+c_{2}\right)\in\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}$ , which means

\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}\supset\text{epi }f^{*}_{\Phi}+\text{epi }\left(g\circ L\right)^{*}_{\Phi}.

We want to prove the reverse inclusion. Take $\left(\phi,r\right)\in\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}$ . From (ii), there exists $\bar{\psi}\in\Psi$ which is a solution to the problem,

	$\displaystyle\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle=\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)$
		$\displaystyle=\left(f+g\circ L\right)^{*}_{\Phi}\left(\phi\right)\leq r.$		(47)

We assume $\left(c\right)$ holds. There exist $\phi_{1},\phi_{2}\in\Phi$ such that $\phi\geq\phi_{1}+\bar{\psi}\circ L$ , $\bar{\psi}\circ L\geq\phi_{2}$ , so that $(g\circ L)^{*}_{\Phi}(\phi_{2})\leq g^{*}_{\Psi}(\bar{\psi})$ and

\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)=\sup_{x\in X}\left\{\phi\left(x\right)-\bar{\psi}\left(Lx\right)-f\left(x\right)\right\}\geq\sup_{x\in X}\left\{\phi_{1}\left(x\right)-f\left(x\right)\right\}=f^{*}_{\Phi}(\phi_{1}).

(48)

Combining this with (47) and (48), we obtain

f^{*}_{\Phi}\left(\phi_{1}\right)+(g\circ L)^{*}_{\Phi}\left(\phi_{2}\right)\leq\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)\leq r.

By taking $\left(\phi_{2},\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{2}\right)\right)\in\text{epi }\left(g\circ L\right)^{*}_{\Phi}$ , we can make $\left(\phi_{1},r-\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{2}\right)\right)\in\text{epi }f^{*}_{\Phi}$ . This means

\left(\phi_{2},\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{2}\right)\right)+\left(\phi_{1},r-\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{2}\right)\right)=\left(\phi,r\right).

Thus, we have

\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}\subset\text{epi }f^{*}_{\Phi}+\text{epi }\left(g\circ L\right)^{*}_{\Phi}.

(49)

Now, let $\left(d\right)$ hold. Let $\bar{\psi}\in\Psi$ be a solution to the infimum problem in (ii). For every $\varepsilon_{1},\varepsilon_{2}>0$ , there exist $x_{\varepsilon_{2}}\in X$ , $\phi_{\varepsilon_{2}}\in\partial_{\varepsilon_{2},\Phi}\left(g\circ L\right)\left(x_{\varepsilon_{2}}\right)$ such that

-\sup_{x\in X}\left\{\bar{\psi}\left(Lx\right)-\phi_{\varepsilon_{2}}\left(x\right)\right\}>-\bar{\psi}\left(Lx_{\varepsilon_{2}}\right)+\phi_{\varepsilon_{2}}\left(x_{\varepsilon_{2}}\right)-\varepsilon_{1}.

We have

	$\displaystyle\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)$	$\displaystyle=\sup_{x\in X}\left\{\phi\left(x\right)-f\left(x\right)-\bar{\psi}\left(Lx\right)\right\}$
		$\displaystyle\geq\sup_{x\in X}\left\{\phi\left(x\right)-\phi_{\varepsilon_{2}}\left(x\right)-f\left(x\right)\right\}-\sup_{x\in X}\left\{\bar{\psi}\left(Lx\right)-\phi_{\varepsilon_{2}}\left(x\right)\right\}$		(50)

As $\phi-\phi_{\varepsilon_{2}}:=\varphi_{\varepsilon_{2}}\in\Phi$ , we can write $\sup_{x\in X}\left\{\phi\left(x\right)-\phi_{\varepsilon_{2}}\left(x\right)-f\left(x\right)\right\}=f^{*}_{\Phi}\left(\varphi_{\varepsilon_{2}}\right)$ . Since $g^{*}_{\Psi}\left(\bar{\psi}\right)\geq\bar{\psi}\left(y\right)-g\left(y\right)$ for any $y\in Y$ , we can set $y=Lx_{\varepsilon_{2}}$ . Combining this with (50) gives us

	$\displaystyle\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)$	$\displaystyle>f^{*}_{\Phi}\left(\varphi_{\varepsilon_{2}}\right)-\bar{\psi}\left(Lx_{\varepsilon_{2}}\right)+\phi_{\varepsilon_{2}}\left(x_{\varepsilon_{2}}\right)-\varepsilon_{1}+\bar{\psi}\left(Lx_{\varepsilon_{2}}\right)-g\left(Lx_{\varepsilon_{2}}\right)$
		$\displaystyle=f^{*}_{\Phi}\left(\varphi_{\varepsilon_{2}}\right)+\phi_{\varepsilon_{2}}\left(x_{\varepsilon_{2}}\right)-g\left(Lx_{\varepsilon_{2}}\right)-\varepsilon_{1}$
		$\displaystyle\geq f^{*}_{\Phi}\left(\varphi_{\varepsilon_{2}}\right)+(g\circ L)^{*}_{\Phi}(\phi_{\varepsilon_{2}})-\varepsilon_{1}-\varepsilon_{2},$

as $\phi_{\varepsilon_{2}}\in\partial_{\varepsilon_{2},\Phi}\left(g\circ L\right)\left(x_{\varepsilon_{2}}\right)$ . From (47), we have

	$\displaystyle r$	$\displaystyle\geq\left(f+\bar{\psi}\circ L\right)^{*}_{\Phi}\left(\phi\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)$
		$\displaystyle>f^{*}_{\Phi}\left(\varphi_{\varepsilon_{2}}\right)+\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{\varepsilon_{2}}\right)-\varepsilon_{2}-\varepsilon_{1}.$

We obtain $\left(\varphi_{\varepsilon_{2}},r+\varepsilon_{1}+\varepsilon_{2}-\left(g\circ L\right)^{*}_{\Phi}\left(\phi_{\varepsilon_{2}}\right)\right)\in\text{epi }f^{*}_{\Phi}$ and $\left(\phi_{\varepsilon_{2}},\left(g\circ L\right)^{*}_{\Phi}(\phi_{\varepsilon_{2}})\right)\in\text{epi }\left(g\circ L\right)^{*}_{\Phi}$ . This means $\text{epi }\left(f+g\circ L\right)^{*}_{\Phi}\subset\text{epi }f^{*}_{\Phi}+\text{epi }\left(g\circ L\right)^{*}_{\Phi}$ , so we have (49). ∎

Remark 7.

•

In Theorem 5.2 and Corollary 5.3, to obtain condition (ii), the assumptions mostly rely on $g$ and the set of elementary functions $\Psi$ . Thus, with appropriate choices of $\Psi$ , we can guarantee the attainment of the infimum in (ii). The motivation comes from the fact that in the construction of the conjugate dual, we perturb $g$ but not $f$ .
•

If $0\in\Phi$ , then the condition $\Phi-\Phi\subset\Phi$ implies symmetry of the set $\Phi$ , because $0-\phi\in\Phi$ for any $\phi\in\Phi$ .
•

It is true that $\text{epi }f^{*}=\text{supp }f$ . To see this, let us take $(\phi,r)\in\text{epi }f^{*}$ ; then $\phi(x)-r\leq f(x)$ for all $x\in X$ . As $\Phi$ is closed under addition of constants, we have $\phi-r\in\Phi$ and $\phi-r\in\text{supp }f$ . Conversely, if $\phi\in\text{supp }f$ then $(\phi,0)\in\text{epi }f^{*}$ (see [16]).
•

Corollary 5.3-(i) is close to the additivity of the support of the conjugate in [16, Theorem 5.1]

$\text{supp}_{\Phi}(f+g\circ L)=\text{supp }_{\Phi}f+\text{supp }_{\Phi}(g\circ L),$

while in general, [16, Proposition 2.2], we have

$\text{supp}_{\Phi}(f+g\circ L)=\text{co}_{\Phi}\left(\text{supp }_{\Phi}f+\text{supp }_{\Phi}(g\circ L)\right),$

where $\text{co}_{\Phi}A$ is the $\Phi$ -convex hull of $A\subset\Phi$ i.e.

$\text{co}_{\Phi}A=\text{supp }_{\Phi}f_{A},\quad\text{where }f_{A}(x)=\sup_{\phi\in A}\phi(x),\quad\forall x\in X.$
•

In classical convex analysis, when $X$ and $Y$ are separated locally convex spaces, $f,g$ are lower semicontinuous proper convex functions and $L$ is a continuous linear mapping, conditions ensuring strong duality [31, Theorem 14.1] are expressed with the help of the regularity condition $(RC)_{5}^{\Phi}$ as defined in [31, Chapter 4, Section 14], while in Theorem 5.2 and Corollary 5.3 we use assumptions related to the additivity of the epigraphs of the conjugates. We refer the reader to Section 16 in [31] for a discussion of the relationship between different regularity conditions ensuring strong duality.

We give an example of a convex problem satisfying the assumption (i) of Theorem 5.2.

Example 5.4.

Let $f\left(x,y\right)=x^{2}+y^{2},g\left(x\right)=2x^{2},L\left(x,y\right)=x+y$ and

	$\displaystyle\Phi$	$\displaystyle=\left\{\phi\left(x,y\right)=-a\left(x^{2}+y^{2}\right)+b_{1}x+b_{2}y,a\geq 0,b_{1},b_{2}\in\mathbb{R}\right\},$
	$\displaystyle\Psi$	$\displaystyle=\left\{\psi\left(x\right)=cx,c\in\mathbb{R}\right\}.$

Condition (41), $\phi+\psi\circ L\in\Phi$ , always holds for all $\phi\in\Phi,\psi\in\Psi$ (see Example 3.2-3). Let us denote

	$\displaystyle\phi(x,y)$	$\displaystyle=-a(x^{2}+y^{2})+b_{1}x+b_{2}y,a\geq 0$
	$\displaystyle\varphi(x,y)$	$\displaystyle=-a_{1}(x^{2}+y^{2})+b_{11}x+b_{12}y,a_{1}\geq 0$
	$\displaystyle\psi(x)$	$\displaystyle=cx.$

We want to calculate

	$\displaystyle\left(\phi,c_{1}\right)\in\text{epi }\left(f+g\circ L\right)_{\Phi}^{*}$	$\displaystyle\Leftrightarrow a\geq 0,\frac{4b_{1}b_{2}-\left(3+a\right)\left(b_{1}^{2}+b_{2}^{2}\right)}{4\left(4-\left(3+a\right)^{2}\right)}\leq c_{1}$
	$\displaystyle\left(\varphi,c_{2}\right)\in\text{epi }f_{\Phi}^{*}$	$\displaystyle\Leftrightarrow a_{1}\geq 0,\frac{b_{11}^{2}+b_{12}^{2}}{4\left(a_{1}+1\right)}\leq c_{2}$
	$\displaystyle\left(\psi,c_{3}\right)\in\text{epi }g_{\Psi}^{*}$	$\displaystyle\Leftrightarrow\psi\in\Psi,\frac{c^{2}}{8}\leq c_{3}$

Let us take $\left(\phi,c_{1}\right)\in\text{epi }\left(f+g\circ L\right)_{\Phi}^{*}$ . We want to decompose $\phi=\varphi+\psi\circ L$ and $c_{1}=c_{2}+c_{3}$ , where $\left(\varphi,c_{2}\right)\in\text{epi }f_{\Phi}^{*}$ and $\left(\psi,c_{3}\right)\in\text{epi }g_{\Psi}^{*}$ . We have the following system

\begin{cases}a=a_{1}\\ c+b_{11}=b_{1}\\ c+b_{12}=b_{2}\end{cases}

We can choose $c=c_{3}=0,a=a_{1}=0,b_{1}=b_{2}$ so $b_{1}=b_{11}=b_{2}=b_{12}$ and $c_{1}=c_{2}$ . Taking $b_{1}=1=c_{1}$ , assumption (i) of Theorem 5.2 is satisfied. Thus, we have strong duality for conjugate duality. Note that if $\Phi$ is composed of affine functions only, we arrive at the same conclusion.
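The closed-form conjugates of Example 5.4 can be checked numerically. The following is a minimal sketch, not part of the original argument: the function names and the grid over $c$ are illustrative assumptions, and the identity $(f+\psi\circ L)^{*}_{\Phi}(\phi)=f^{*}_{\Phi}(\phi-\psi\circ L)$ is used because $\phi-\psi\circ L\in\Phi$ for this choice of $\Phi$ and $\Psi$ . The sketch compares the dual value $\inf_{\psi\in\Psi}\left(f+\psi\circ L\right)^{*}_{\Phi}(\phi)+g^{*}_{\Psi}(\psi)$ with $\left(f+g\circ L\right)^{*}_{\Phi}(\phi)$ for given coefficients $a,b_{1},b_{2}$ .

```python
# Numerical sanity check for Example 5.4 (illustrative sketch only).
# The closed forms below are the ones stated in the example.

def conj_f(a1, b11, b12):
    """f*_Phi(varphi) for f(x, y) = x^2 + y^2."""
    return (b11 ** 2 + b12 ** 2) / (4.0 * (a1 + 1.0))


def conj_g(c):
    """g*_Psi(psi) for g(x) = 2 x^2 and psi(x) = c x."""
    return c ** 2 / 8.0


def conj_sum(a, b1, b2):
    """(f + g o L)*_Phi(phi) as stated in the example."""
    s = 3.0 + a
    return (4.0 * b1 * b2 - s * (b1 ** 2 + b2 ** 2)) / (4.0 * (4.0 - s ** 2))


def dual_value(a, b1, b2, c_lo=-10.0, c_hi=10.0, steps=20000):
    """Grid minimum over c of (f + psi o L)*_Phi(phi) + g*_Psi(psi).

    The interval [c_lo, c_hi] and the step count are arbitrary choices.
    """
    best_val, best_c = float("inf"), None
    for k in range(steps + 1):
        c = c_lo + (c_hi - c_lo) * k / steps
        # (f + psi o L)*(phi) = f*(phi - psi o L): subtract c from both slopes.
        val = conj_f(a, b1 - c, b2 - c) + conj_g(c)
        if val < best_val:
            best_val, best_c = val, c
    return best_val, best_c


if __name__ == "__main__":
    val, c_star = dual_value(0.0, 1.0, 1.0)
    print(val, c_star, conj_sum(0.0, 1.0, 1.0))
```

For $a=0,b_{1}=b_{2}=1$ the two values agree up to the grid resolution, which is consistent with condition (ii) of Theorem 5.2 at this particular $\phi$ ; the check is numerical evidence only and does not replace the proof.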

6 Lagrange Dual

6.1 Construction of Lagrangian Primal-Dual Problems

Of equal importance is the problem of restating the results from [24], which connect conjugate duality with Lagrange duality, in the case where an operator $L:X\to Y$ appears in the formulation of problem (CP). For other constructions of the Lagrangian dual proposed recently, see e.g. [34] and the references therein.

In this section, we give new results on zero duality gap for the Lagrange dual of composite problems. To construct the Lagrangian dual, we introduce the Lagrangian function $\mathcal{L}:X\times\Psi\to(-\infty,+\infty]$ as follows

\mathcal{L}(x,\psi)=f(x)+\psi(Lx)-g^{*}_{\Psi}(\psi),

(51)

where $\Psi$ is a set of elementary functions defined on $Y$ . If $\Psi$ is a convex set, then $\mathcal{L}(x,\psi)$ is concave with respect to $\psi\in\Psi$ . The Lagrangian dual is

val(LD)=\sup_{\psi\in\Psi}\inf_{x\in X}\mathcal{L}(x,\psi)=\sup_{\psi\in\Psi}\inf_{x\in X}f(x)+\psi(Lx)-g^{*}_{\Psi}(\psi),

(LD)

and the corresponding Lagrangian primal is

val(LP):=\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)=\inf_{x\in X}\sup_{\psi\in\Psi}f(x)+\psi(Lx)-g^{*}_{\Psi}(\psi),

(LP)

Let us state the weak duality for Lagrangian duality.

Theorem 6.1.

Let $f:X\to(-\infty,+\infty],\ g:Y\to(-\infty,+\infty]$ and $L:X\to Y$ be a mapping. Let $\Phi,\Psi$ be sets of elementary functions defined on $X$ and $Y$ , respectively. It holds

\sup_{\psi\in\Psi}\inf_{x\in X}\mathcal{L}(x,\psi)\leq\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)

This result holds true for any function $\mathcal{L}$ , not necessarily of the form (51). We first note that the Lagrangian primal (LP) is, in general, not equivalent to the composite problem (CP), as

	$\displaystyle\inf_{x\in X}\sup_{\psi\in\Psi}f(x)+\psi(Lx)-g^{*}_{\Psi}(\psi)$	$\displaystyle=\inf_{x\in X}f(x)+\sup_{\psi\in\Psi}\psi(Lx)-g^{*}_{\Psi}(\psi)$
		$\displaystyle=\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx)$
		$\displaystyle\leq\inf_{x\in X}f(x)+g(Lx).$

Even though the Lagrange dual and conjugate dual are the same, the primal problems are different. Since our main focus is (CP), we discuss the assumptions which make these two primal problems equivalent. Clearly, we can assume

\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx)=\inf_{x\in X}f(x)+g(Lx).

(52)

In the classical convex approach, $g$ is lsc and convex iff $g^{**}_{\Psi}=g$ , in which case condition (52) holds. Conversely, we have the following.

Proposition 6.2.

Let $f:X\to(-\infty,+\infty],\ g:Y\to(-\infty,+\infty]$ where $X$ is a nonempty set and $Y$ is a vector space. Let $L:X\to Y$ be a mapping. Let $\Phi,\Psi$ be the sets of elementary functions defined on $X$ and $Y$ , respectively. Assume (52) holds. If there exists an $x_{0}\in X$ such that

\inf_{x\in X}f(x)+g(Lx)\geq f(x_{0})+g(Lx_{0}),

then $g$ is $\Psi$ -convex at $Lx_{0}$ .

Proof.

We have

\inf_{x\in X}f\left(x\right)+g\left(Lx\right)\geq f\left(x_{0}\right)+g\left(Lx_{0}\right).

From (52) we have

f\left(x_{0}\right)+g\left(Lx_{0}\right)\leq\inf_{x\in X}f\left(x\right)+g^{**}_{\Psi}\left(Lx\right)\leq f\left(x_{0}\right)+g^{**}_{\Psi}\left(Lx_{0}\right),

which implies $g\left(Lx_{0}\right)\leq g^{**}_{\Psi}\left(Lx_{0}\right)$ . From the definition of biconjugate function, $g^{**}_{\Psi}\leq g$ , so we have $g\left(Lx_{0}\right)=g^{**}_{\Psi}\left(Lx_{0}\right)$ . Thus, $g$ is $\Psi$ -convex at $Lx_{0}$ . ∎
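The gap between $g$ and $g^{**}_{\Psi}$ , which is what separates (LP) from (CP), can be seen on a small discretised example. The sketch below is an illustration under stated assumptions, not taken from the paper: $Y=\mathbb{R}$ , $\Psi$ is the class of linear functions $\psi(y)=cy$ , the double-well function `g`, the sample grid `YS` and the slope grid `SLOPES` are all hypothetical choices, and suprema are replaced by maxima over these finite grids.

```python
# Discrete Psi-conjugation with Psi = {y -> c*y}; all grids are illustrative.

YS = [k / 100.0 for k in range(-300, 301)]    # sample points of Y = R
SLOPES = [k / 10.0 for k in range(-50, 51)]   # slopes c of psi(y) = c*y


def g(y):
    """A nonconvex double-well function (hypothetical test function)."""
    return min((y - 1.0) ** 2, (y + 1.0) ** 2)


def conjugate(c):
    """g*_Psi(psi) = sup_y { c*y - g(y) }, taken over the sample grid."""
    return max(c * y - g(y) for y in YS)


# Precompute the conjugate on the slope grid.
G_STAR = {c: conjugate(c) for c in SLOPES}


def biconjugate(y):
    """g**_Psi(y) = sup_c { c*y - g*_Psi(c) }, taken over the slope grid."""
    return max(c * y - G_STAR[c] for c in SLOPES)


if __name__ == "__main__":
    for y in (-1.0, 0.0, 0.5, 1.0):
        print(y, g(y), biconjugate(y))
```

On this sample, the biconjugate never exceeds $g$ , it agrees with $g$ at the two minimisers $y=\pm 1$ , and it lies strictly below $g$ at $y=0$ . In the language of Proposition 6.2, this $g$ is $\Psi$ -convex at $\pm 1$ but not at $0$ .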

6.2 Lagrange Zero Duality Gap

In the present subsection, we discuss zero duality gap for Lagrange duality. We follow the result of [24] and exploit the intersection property to prove zero duality gap. In our context, Theorem 6.1 in [24] takes the following form.

Theorem 6.3.

(Theorem 6.1 [24]) Let $X=Y$ be a vector space, $\Phi$ be a convex set of elementary functions defined on $X$ . Let $\mathcal{L}(x,\psi)$ be Lagrangian given by (51) with $L=Id$ . Then the following are equivalent.

(i)

For every $\alpha<\inf_{x\in X}\sup_{\psi\in\Phi}\mathcal{L}\left(x,\psi\right)$ , there exist $\psi_{1},\psi_{2}\in\Phi$ and $\phi_{1}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{1}\right),\phi_{2}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{2}\right)$ such that $\phi_{1},\phi_{2}$ have the intersection property at level $\alpha$ ; i.e., for all $t\in\left[0,1\right]$

\left[t\phi_{1}+\left(1-t\right)\phi_{2}<\alpha\right]\cap\left[\phi_{1}<\alpha\right]=\emptyset\text{ or }\left[t\phi_{1}+\left(1-t\right)\phi_{2}<\alpha\right]\cap\left[\phi_{2}<\alpha\right]=\emptyset,

where $\left[\phi_{1}<\alpha\right]:=\left\{x\in X:\phi_{1}\left(x\right)<\alpha\right\}$ .

(ii)

$val(LP)=val(DCP)$ i.e. $\inf_{x\in X}\sup_{\psi\in\Phi}\mathcal{L}\left(x,\psi\right)=\sup_{\psi\in\Phi}\inf_{x\in X}\mathcal{L}\left(x,\psi\right).$

The proof can be found in [19].

Remark 8.

In the case where for every $\alpha<\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)$ , we have $\psi_{1}=\psi_{2}$ , i.e. $\phi_{1},\phi_{2}$ belong to the same support set $\text{supp }\mathcal{L}\left(\cdot,\psi_{1}\right)$ for some $\psi_{1}\in\Psi$ , the assumption of concavity of the Lagrangian can be removed. (cf. [18]).

For convenience, we state a result from [24].

Lemma 6.4.

(Lemma 6.2 [24]) Let $X$ be a set, $\alpha\in\mathbb{R}$ and let $\phi_{1},\phi_{2}:X\to\mathbb{R}$ . Two functions $\phi_{1},\phi_{2}$ have the intersection property at level $\alpha$ if and only if there exists $t_{0}\in\left[0,1\right]$ such that $t_{0}\phi_{1}\left(x\right)+\left(1-t_{0}\right)\phi_{2}\left(x\right)\geq\alpha$ for all $x\in X$ .
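When $X$ is replaced by a finite sample, the condition on the right-hand side of Lemma 6.4 is a system of inequalities that are affine in $t$ , so the admissible values of $t_{0}$ form an interval that can be computed directly. The following sketch illustrates this; the helper name `find_t0` and the finite-sample setting are assumptions made for illustration and do not appear in [24].

```python
def find_t0(phi1_vals, phi2_vals, alpha):
    """Return some t0 in [0, 1] with t0*phi1 + (1 - t0)*phi2 >= alpha at every
    sample point, or None if no such t0 exists on this finite sample.

    phi1_vals[i] and phi2_vals[i] are the values of phi_1 and phi_2 at the
    i-th sample point of X.
    """
    lo, hi = 0.0, 1.0
    for p1, p2 in zip(phi1_vals, phi2_vals):
        d = p1 - p2                 # the constraint reads: p2 + t*d >= alpha
        if d > 0:
            lo = max(lo, (alpha - p2) / d)   # lower bound on t
        elif d < 0:
            hi = min(hi, (alpha - p2) / d)   # upper bound on t
        elif p2 < alpha:            # d == 0: the constraint is independent of t
            return None
    return lo if lo <= hi else None


if __name__ == "__main__":
    xs = [-1.0, 0.0, 1.0]
    phi1 = [x for x in xs]          # phi_1(x) = x
    phi2 = [-x for x in xs]         # phi_2(x) = -x
    print(find_t0(phi1, phi2, 0.0))
    print(find_t0(phi1, phi2, 0.1))
```

For $\phi_{1}(x)=x$ and $\phi_{2}(x)=-x$ on the sample $\{-1,0,1\}$ , the only admissible value at level $\alpha=0$ is $t_{0}=1/2$ , while at level $\alpha=0.1$ no admissible $t_{0}$ exists because both functions vanish at $x=0$ .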

Observe that in Theorem 6.3, the class of elementary functions $\Phi$ is arbitrary. Below, we relate the intersection property to the lower semi-continuity of the optimal value function, $V$ , in the case where elementary functions satisfy additional conditions. For other results along this line, see e.g. [35, 36].

Let us define the optimal value function $V:Y\to[-\infty,+\infty)$ as follows

V(y):=\inf_{x\in X}\beta(x,y)

(53)

where the function $\beta$ is defined at the beginning of Section 3. Similar to (52), we also define the following condition for any $y\in Y$

\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx+y)=\inf_{x\in X}f(x)+g(Lx+y).

(54)

A function $V:Y\to[-\infty,+\infty)$ is lower semi-continuous at a point $y_{0}\in Y$ , if for every $\varepsilon>0$ , there exists a neighborhood $W(y_{0})$ such that

V(y)>V(y_{0})-\varepsilon,\quad\forall y\in W(y_{0}).

(55)

The following theorem connects Theorem 4.2, the intersection property and the lower semi-continuity of the optimal value function $V$ at $y_{0}=0$ .

Theorem 6.5.

Let $f:X\to\left(-\infty,+\infty\right],\ g:Y\to\left(-\infty,+\infty\right]$ . Let $L:X\to Y$ be a mapping from $X$ to $Y$ with $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . Let $\Phi,\Psi$ be sets of elementary functions defined on $X$ and $Y$ , respectively. Let $\mathcal{L}(x,\psi)$ be the Lagrangian defined in (51). Assume $\Psi$ is convex, $0\in\Phi$ and $\inf_{x\in X}f(x)+g(Lx)<+\infty$ . Consider the following.

(i)

For every $\varepsilon>0$ , there exist $x_{\varepsilon}\in X$ and $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ such that $0\in\partial_{\varepsilon,\Phi}\left(f+\psi_{\varepsilon}\circ L\right)\left(x_{\varepsilon}\right)$ .
(ii)

For every $\alpha<\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)$ , there exist $\phi_{1},\phi_{2}\in\Phi$ , $\psi_{1},\psi_{2}\in\Psi$ , $\phi_{1}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{1}\right),\phi_{2}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{2}\right)$ such that the intersection property holds for $\phi_{1},\phi_{2}$ at level $\alpha$ .
(iii)

The function $V$ is lower semi-continuous at 0.

We have (i) $\Rightarrow$ (ii). If condition (52) holds, then (ii) $\Leftrightarrow$ (i). Moreover, if (52) holds and $\Psi$ is composed of lower semi-continuous functions, then (ii) $\Rightarrow$ (iii). (iii) $\Rightarrow$ (ii) when $\Psi$ is a collection of continuous functions and (54) holds.

Proof.

(i) $\Rightarrow$ (ii): Let $\alpha\in\mathbb{R}$ satisfy $\alpha<\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)$ . For every $\varepsilon>0$ , we can find $x_{\varepsilon}\in X$ and $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ such that $0\in\partial_{\varepsilon,\Phi}\left(f+\psi_{\varepsilon}\circ L\right)\left(x_{\varepsilon}\right)$ , that is,

\left(\forall z\in X\right)\quad f\left(z\right)+\psi_{\varepsilon}\left(Lz\right)\geq f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)-\varepsilon.

(56)

We also have $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ so $g(Lx_{\varepsilon})+g^{*}_{\Psi}(\psi_{\varepsilon})\leq\psi_{\varepsilon}(Lx_{\varepsilon})+\varepsilon$ . Putting this into (56),

	$\displaystyle f\left(z\right)+\psi_{\varepsilon}\left(Lz\right)$	$\displaystyle\geq f\left(x_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx_{\varepsilon}\right)-\varepsilon$
		$\displaystyle\geq f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)+g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)-2\varepsilon,$

\mathcal{L}\left(z,\psi_{\varepsilon}\right)=f\left(z\right)+\psi_{\varepsilon}\left(Lz\right)-g^{*}_{\Psi}\left(\psi_{\varepsilon}\right)\geq f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-2\varepsilon.

(57)

As $\varepsilon>0$ is arbitrary, we choose $\varepsilon=(f(x_{\varepsilon})+g(Lx_{\varepsilon})-\alpha)/2>0$ so that $\mathcal{L}(z,\psi_{\varepsilon})>\alpha$ . Pick $\phi_{1}\in\Phi$ and $\psi_{1}\in\Psi$ such that $\phi_{1}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{1}\right)$ . By choosing $\phi_{2}=\alpha\in\Phi$ , we have $\phi_{2}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{\varepsilon}\right)$ . From Lemma 6.4 with $t=1$ we have

\mathcal{L}\left(z,t\psi_{\varepsilon}+\left(1-t\right)\psi_{1}\right)=\mathcal{L}\left(z,\psi_{\varepsilon}\right)>t\alpha+\left(1-t\right)\phi_{1}\left(x\right)=\alpha,

so the intersection property holds for $\phi_{1},\phi_{2}$ at level $\alpha$ . As $\alpha$ is arbitrary, we have (ii).

(ii) $\Rightarrow$ (i): As (52) holds, we have $val(CP)=val(LP)$ . From the primal problem (CP), for every $\varepsilon>0$ , there exists an $x_{\varepsilon}\in X$ such that

\inf_{x\in X}f\left(x\right)+g\left(Lx\right)>f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-\varepsilon.

Denoting $\alpha:=val(\text{CP})-\varepsilon$ , we have $\alpha<val\left(\text{CP}\right)$ . There exist $\phi_{1},\phi_{2}\in\Phi$ and $\psi_{1},\psi_{2}\in\Psi$ such that $\phi_{1}\in\text{supp }\mathcal{L}(\cdot,\psi_{1}),\phi_{2}\in\text{supp }\mathcal{L}(\cdot,\psi_{2})$ and the intersection property holds at level $\alpha$ . Lemma 6.4 gives us a $t_{0}\in\left[0,1\right]$ such that

t_{0}\phi_{1}\left(x\right)+\left(1-t_{0}\right)\phi_{2}\left(x\right)\geq\alpha\quad\forall x\in X.

(58)

Because $\phi_{1}\in\text{supp}\mathcal{L}\left(\cdot,\psi_{1}\right),\phi_{2}\in\text{supp}\mathcal{L}\left(\cdot,\psi_{2}\right)$ , we have

\mathcal{L}\left(x,\psi_{1}\right)\geq\phi_{1}\left(x\right)\text{ and }\mathcal{L}\left(x,\psi_{2}\right)\geq\phi_{2}\left(x\right)\quad\forall x\in X.

As $\Psi$ is a convex set, $t_{0}\psi_{1}+\left(1-t_{0}\right)\psi_{2}\in\Psi$ . Moreover, $\mathcal{L}\left(x,\psi\right)$ is concave in the second variable, i.e. for all $t\in\left[0,1\right]$ and $\psi_{1},\psi_{2}\in\Psi$

$\displaystyle\mathcal{L}\left(x,t\psi_{1}+\left(1-t\right)\psi_{2}\right)$	$\displaystyle=f\left(x\right)+t\psi_{1}\left(Lx\right)+\left(1-t\right)\psi_{2}\left(Lx\right)-g^{*}_{\Psi}\left(t\psi_{1}+\left(1-t\right)\psi_{2}\right)$
	$\displaystyle\geq t\left[f\left(x\right)+\psi_{1}\left(Lx\right)-g^{*}_{\Psi}\left(\psi_{1}\right)\right]+\left(1-t\right)\left[f\left(x\right)+\psi_{2}\left(Lx\right)-g^{*}_{\Psi}\left(\psi_{2}\right)\right]$
	$\displaystyle=t\mathcal{L}\left(x,\psi_{1}\right)+\left(1-t\right)\mathcal{L}\left(x,\psi_{2}\right),$	(59)

where the inequality holds due to the concavity of $-g^{*}_{\Psi}$ . Combining inequality (59) with (58), we get

	$\displaystyle\mathcal{L}\left(x,t_{0}\psi_{1}+\left(1-t_{0}\right)\psi_{2}\right)$	$\displaystyle\geq t_{0}\mathcal{L}\left(x,\psi_{1}\right)+\left(1-t_{0}\right)\mathcal{L}\left(x,\psi_{2}\right)$
		$\displaystyle\geq t_{0}\phi_{1}\left(x\right)+\left(1-t_{0}\right)\phi_{2}\left(x\right)\geq\alpha$
		$\displaystyle=val(\text{CP})-\varepsilon$
		$\displaystyle=\inf_{x\in X}f\left(x\right)+g\left(Lx\right)-\varepsilon$
		$\displaystyle>f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-2\varepsilon.$

We obtain

\mathcal{L}\left(x,t_{0}\psi_{1}+\left(1-t_{0}\right)\psi_{2}\right)>f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-2\varepsilon\quad\forall x\in X

By denoting $\bar{\psi}=t_{0}\psi_{1}+\left(1-t_{0}\right)\psi_{2}$ , we get

\mathcal{L}\left(x,\bar{\psi}\right)=f\left(x\right)+\bar{\psi}\left(Lx\right)-g^{*}_{\Psi}\left(\bar{\psi}\right)>f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-2\varepsilon.

(60)

Rearranging both sides, we obtain

f\left(x\right)-f\left(x_{\varepsilon}\right)>-\bar{\psi}\left(Lx\right)+g\left(Lx_{\varepsilon}\right)+g^{*}_{\Psi}\left(\bar{\psi}\right)-2\varepsilon.

We have $g(Lx)+g^{*}_{\Psi}(\psi)\geq\psi(Lx)$ for all $x\in X$ and $\psi\in\Psi$ . From the previous inequality,

f\left(x\right)-f\left(x_{\varepsilon}\right)>-\bar{\psi}\left(Lx\right)+\bar{\psi}\left(Lx_{\varepsilon}\right)-2\varepsilon.

Because, in general, $-\bar{\psi}\circ L$ does not belong to $\Phi$ , we cannot claim that $-\bar{\psi}\circ L\in\partial_{2\varepsilon,\Phi}f\left(x_{\varepsilon}\right)$ , but we have

f\left(x\right)+\bar{\psi}\left(Lx\right)-f\left(x_{\varepsilon}\right)-\bar{\psi}\left(Lx_{\varepsilon}\right)>-2\varepsilon,

which means $0\in\partial_{2\varepsilon,\Phi}\left(f+\bar{\psi}\circ L\right)\left(x_{\varepsilon}\right)$ . On the other hand, from (60), for any $x\in X$ , we have

f\left(x\right)+\bar{\psi}\left(Lx\right)-g^{*}_{\Psi}\left(\bar{\psi}\right)>f\left(x_{\varepsilon}\right)+g\left(Lx_{\varepsilon}\right)-2\varepsilon.

By choosing $x=x_{\varepsilon}$ , we can write

\bar{\psi}\left(Lx_{\varepsilon}\right)-g^{*}_{\Psi}\left(\bar{\psi}\right)>g\left(Lx_{\varepsilon}\right)-2\varepsilon,

\bar{\psi}\left(Lx_{\varepsilon}\right)+2\varepsilon>g\left(Lx_{\varepsilon}\right)+g^{*}_{\Psi}\left(\bar{\psi}\right).

This gives $\bar{\psi}\in\partial_{2\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ . Thus, (i) holds.

(ii) $\Rightarrow$ (iii): As the intersection property holds for every $\alpha\in\mathbb{R}$ , for every $\varepsilon>0$ , we can take $\alpha_{\varepsilon}=\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)-\varepsilon/2=V(0)-\varepsilon/2$ . Now we can find $\phi_{1},\phi_{2}\in\Phi,\psi_{1},\psi_{2}\in\Psi$ such that $\phi_{1}\left(\cdot\right)\leq\mathcal{L}\left(\cdot,\psi_{1}\right),\phi_{2}\leq\mathcal{L}\left(\cdot,\psi_{2}\right)$ for all $x\in X$ , and $\phi_{1},\phi_{2}$ satisfy the intersection property at level $\alpha_{\varepsilon}$ . Using Lemma 6.4, there exists $t_{0}\in\left[0,1\right]$ such that for all $x\in X$ ,

	$\displaystyle\alpha_{\varepsilon}$	$\displaystyle\leq t_{0}\phi_{1}\left(x\right)+\left(1-t_{0}\right)\phi_{2}\left(x\right)\leq t_{0}\mathcal{L}\left(x,\psi_{1}\right)+\left(1-t_{0}\right)\mathcal{L}\left(x,\psi_{2}\right)$
		$\displaystyle\leq\mathcal{L}\left(x,t_{0}\psi_{1}+\left(1-t_{0}\right)\psi_{2}\right).$		(61)

On the other hand, we have

	$\displaystyle\mathcal{L}\left(x,\psi\right)$	$\displaystyle=f\left(x\right)+\psi\left(Lx\right)-g_{\Psi}^{*}\left(\psi\right)$
		$\displaystyle=\inf_{y\in Y}f\left(x\right)+g\left(y\right)+\psi\left(Lx\right)-\psi\left(y\right)$
		$\displaystyle\leq f\left(x\right)+g\left(Lx+y\right)+\psi\left(Lx\right)-\psi\left(Lx+y\right),$

where $x\in X,y\in Y$ . Denoting $\psi_{0}=t_{0}\psi_{1}+\left(1-t_{0}\right)\psi_{2}\in\Psi$ , as $\Psi$ is a convex set, (61) gives us

	$\displaystyle\alpha_{\varepsilon}$	$\displaystyle\leq f\left(x\right)+g\left(Lx+y\right)+\psi_{0}\left(Lx\right)-\psi_{0}\left(Lx+y\right)$
	$\displaystyle\alpha_{\varepsilon}+\psi_{0}\left(Lx+y\right)-\psi_{0}\left(Lx\right)$	$\displaystyle\leq f\left(x\right)+g\left(Lx+y\right),$

for any $x\in X,y\in Y$ . Since $\psi_{0}\in\Psi$ is lower semi-continuous, the function $\psi_{x}\left(y\right):=\psi_{0}\left(Lx+y\right)$ is also lower semi-continuous for all $y\in Y$ . There exists a neighborhood $W\left(0\right)\subset Y$ such that

\psi_{x}\left(y\right)\geq\psi_{x}\left(0\right)-\varepsilon/2,\quad\forall y\in W\left(0\right).

Thus,

f\left(x\right)+g\left(Lx+y\right)\geq\alpha_{\varepsilon}+\psi_{x}\left(y\right)-\psi_{x}\left(0\right)\geq\alpha_{\varepsilon}-\varepsilon/2,

for all $y\in W\left(0\right)$ . Since the inequality holds for all $x\in X$ , we can take the infimum with respect to $x\in X$ on both sides and obtain

V\left(y\right)=\inf_{x\in X}f\left(x\right)+g\left(Lx+y\right)\geq\alpha_{\varepsilon}-\varepsilon/2=V\left(0\right)-\varepsilon,

for all $y\in W\left(0\right)$ . Hence, $V\left(y\right)$ is lower semi-continuous at $y=0$ .

(iii) $\Rightarrow$ (ii): Conversely, for every $\alpha\in\mathbb{R}$ with $\alpha<\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)=V(0)$ , we set $\varepsilon=(V(0)-\alpha)/3>0$ . By the lower semi-continuity of $V$ at $0$ , there exists a neighborhood $W\left(0\right)\subset Y$ such that

V\left(0\right)-\varepsilon\leq V\left(y\right),\quad\forall y\in W\left(0\right).

Now consider

f\left(x\right)+g_{\Psi}^{**}\left(Lx+y\right)=\sup_{\psi\in\Psi}f\left(x\right)+\psi\left(Lx+y\right)-g_{\Psi}^{*}\left(\psi\right).

We can find $\psi_{\varepsilon}\in\Psi$ such that

	$\displaystyle f\left(x\right)+g_{\Psi}^{**}\left(Lx+y\right)$	$\displaystyle<f\left(x\right)+\psi_{\varepsilon}\left(Lx+y\right)-g_{\Psi}^{*}\left(\psi_{\varepsilon}\right)+\varepsilon$
		$\displaystyle\leq f\left(x\right)-g_{\Psi}^{*}\left(\psi_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx\right)+2\varepsilon,\quad\forall y\in W_{1}(0).$

Here $W_{1}(0)\subset Y$ is a neighborhood of $0$ on which the last inequality holds; it exists because $\psi_{\varepsilon}$ is continuous. Hence, for all $y\in W_{1}(0)\cap W(0)$ , we have

	$\displaystyle V\left(0\right)-\varepsilon$	$\displaystyle\leq V\left(y\right)=\inf_{x\in X}f\left(x\right)+g_{\Psi}^{**}\left(Lx+y\right)$
		$\displaystyle\leq\sup_{\psi\in\Psi}f\left(x\right)-g_{\Psi}^{*}\left(\psi\right)+\psi\left(Lx+y\right),\quad\forall x\in X$
		$\displaystyle<f\left(x\right)-g_{\Psi}^{*}\left(\psi_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx+y\right)+\varepsilon$
		$\displaystyle\leq f\left(x\right)-g_{\Psi}^{*}\left(\psi_{\varepsilon}\right)+\psi_{\varepsilon}\left(Lx\right)+2\varepsilon,\quad\forall y\in W_{1}(0)$
		$\displaystyle=\mathcal{L}\left(x,\psi_{\varepsilon}\right)+2\varepsilon.$

Finally, since $\alpha=V(0)-3\varepsilon\leq\mathcal{L}(x,\psi_{\varepsilon})$ for all $x\in X$ , we can take $\phi=\alpha\in\Phi$ , so that $\phi\in\text{supp }\mathcal{L}\left(\cdot,\psi_{\varepsilon}\right)$ . Then $\phi$ has the intersection property at level $\alpha$ with any $\phi_{1}\in\text{supp }\mathcal{L}\left(\cdot,\psi\right)$ , for $\psi\in\Psi$ . ∎

Now we can state the zero duality gap result for the Lagrange dual problem.

Proposition 6.6.

Let $f:X\to\left(-\infty,+\infty\right],g:Y\to\left(-\infty,+\infty\right]$ . Let $L:X\to Y$ be a mapping from $X$ to $Y$ with $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . Let $\Phi,\Psi$ be sets of elementary functions defined on $X$ and $Y$ , respectively. Let $\mathcal{L}(x,\psi)$ be the Lagrangian defined by (51). Assume $\Psi$ is convex, $0\in\Phi$ and (52) holds. We further assume $\inf_{x\in X}f(x)+g(Lx)<+\infty$ . The following are equivalent.

(i)

For every $\alpha<\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)$ , there exist $\psi_{1},\psi_{2}\in\Psi$ and $\bar{\phi}_{1}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{1}\right),\bar{\phi}_{2}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{2}\right)$ such that $\bar{\phi}_{1},\bar{\phi}_{2}$ have the intersection property at level $\alpha$ .
(ii)

$\displaystyle{\inf_{x\in X}f(x)+g(Lx)=\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)=\sup_{\psi\in\Psi}\inf_{x\in X}\mathcal{L}\left(x,\psi\right)<+\infty}.$

Proof.

By applying Theorem 6.5 and Theorem 4.2, we obtain the assertion. ∎

Remark 9.

•

In order to work with the intersection property, we need to find $\psi_{1},\psi_{2}\in\Psi$ and $\phi_{1}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{1}\right),\phi_{2}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{2}\right)$ . Because $\alpha<\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)$ , we have $\alpha\in\text{supp }\mathcal{L}\left(\cdot,\psi\right)$ for any $\psi\in\Psi$ . We can set $\phi_{1}=\alpha\in\Phi$ , and the intersection property then holds for any $\phi_{2}\in\text{supp }\mathcal{L}\left(\cdot,\psi_{2}\right)$ , as $\left[\phi_{1}<\alpha\right]=\emptyset$ .

•

Let $\Psi$ be symmetric, and suppose that for every $\varepsilon>0$ there exist $\bar{x}\in X$ and $\psi_{1}\in\Psi$ with $\psi_{1}\circ L\in\Phi$ such that $\psi_{1}\circ L\in\partial_{\varepsilon,\Phi}f\left(\bar{x}\right)$ . Then, for all $y\in X$ ,

	$\displaystyle f\left(y\right)-f\left(\bar{x}\right)$	$\displaystyle\geq\psi_{1}\left(Ly\right)-\psi_{1}\left(L\bar{x}\right)-\varepsilon$
	$\displaystyle f\left(y\right)-\psi_{1}\left(Ly\right)-g^{*}_{\Psi}\left(-\psi_{1}\right)$	$\displaystyle\geq f\left(\bar{x}\right)-\psi_{1}\left(L\bar{x}\right)-g^{*}_{\Psi}\left(-\psi_{1}\right)-\varepsilon$
	$\displaystyle\mathcal{L}\left(y,-\psi_{1}\right)$	$\displaystyle\geq\mathcal{L}\left(\bar{x},-\psi_{1}\right)-\varepsilon,$

so $\bar{x}$ is an $\varepsilon$ -approximate solution to the problem $\inf_{x\in X}\mathcal{L}\left(x,-\psi_{1}\right)$ .

The following lemma characterizes assumption (52) by an approximate-solution condition, which can be used in its place.

Lemma 6.7.

Assume $\inf_{x\in X}f(x)+g(Lx)<+\infty$ and let $\mathcal{L}(x,\psi)$ be the Lagrangian function given by (51). Then condition (52) i.e.,

\inf_{x\in X}f(x)+g(Lx)=\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx),

holds if and only if for every $\varepsilon>0$ , there exists $x_{\varepsilon}\in X$ such that

\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)>f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon.

(62)

Proof.

If condition (52) holds i.e.,

\inf_{x\in X}f(x)+g(Lx)=\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx),

then for every $\varepsilon>0$ , one can find an $x_{\varepsilon}\in X$ such that $\inf_{x\in X}f(x)+g(Lx)>f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon$ . Hence,

\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)=\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx)>f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon.

Conversely, assume that

\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)>f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon.

We have

\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)>f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon\geq\inf_{x\in X}f(x)+g(Lx)-\varepsilon.

Recall that,

\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)=\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx).

Because $g(Lx)\geq g^{**}_{\Psi}(Lx)$ for all $x\in X$ , we can write

\inf_{x\in X}f(x)+g(Lx)\geq\inf_{x\in X}f(x)+g^{**}_{\Psi}(Lx)\geq\inf_{x\in X}f(x)+g(Lx)-\varepsilon.

Since neither the left-hand side nor the middle term depends on $\varepsilon$ , we can let $\varepsilon\to 0$ and obtain the equality. ∎

In fact, condition (62) is stronger than the intersection property, as we can see in the corollary below.

Corollary 6.8.

Let $f:X\to\left(-\infty,+\infty\right],g:Y\to\left(-\infty,+\infty\right]$ and let $L:X\to Y$ be a mapping from $X$ to $Y$ with $\text{dom }g\cap L\left(\text{dom }f\right)\neq\emptyset$ . Let $\Phi,\Psi$ be sets of elementary functions defined on $X$ and $Y$ , respectively. Let the Lagrangian function be defined by (51). Assume $\Psi$ is convex, $0\in\Phi$ , $\inf_{x\in X}f(x)+g(Lx)<+\infty$ . The following are equivalent.

(i)

For every $\varepsilon>0$ , there exists $x_{\varepsilon}\in X$ such that $\displaystyle{\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)}\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon$ .
(ii)

For every $\varepsilon>0$ , there exist $x_{\varepsilon}\in X$ and $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g\left(Lx_{\varepsilon}\right)$ such that $0\in\partial_{\varepsilon,\Phi}\left(f+\psi_{\varepsilon}\circ L\right)\left(x_{\varepsilon}\right)$ .
(iii)

$\displaystyle{\inf_{x\in X}f(x)+g(Lx)=\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}\left(x,\psi\right)=\sup_{\psi\in\Psi}\inf_{x\in X}\mathcal{L}\left(x,\psi\right)<+\infty}.$

Proof.

Thanks to Theorem 4.2, Theorem 6.5 and Proposition 6.6, we have (ii) $\Leftrightarrow$ (iii). We only need to prove (i) $\Leftrightarrow$ (ii).

(i) $\Rightarrow$ (ii): For every $\varepsilon>0$ , we have

\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)\geq\inf_{x\in X}\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon.

There also exists $\psi_{\varepsilon}\in\Psi$ such that

\mathcal{L}(x,\psi_{\varepsilon})+\varepsilon\geq\sup_{\psi\in\Psi}\mathcal{L}(x,\psi)\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon.

Thus, $\mathcal{L}(x,\psi_{\varepsilon})+\varepsilon\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon$ for all $x\in X$ . Since $\mathcal{L}(x,\psi_{\varepsilon})=f(x)+\psi_{\varepsilon}(Lx)-g^{*}_{\Psi}(\psi_{\varepsilon})$ , we have

(\forall x\in X)\ f(x)+\psi_{\varepsilon}(Lx)-g^{*}_{\Psi}(\psi_{\varepsilon})+\varepsilon\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-\varepsilon.

(63)

Setting $x=x_{\varepsilon}$ , we obtain

\psi_{\varepsilon}(Lx_{\varepsilon})-g^{*}_{\Psi}(\psi_{\varepsilon})\geq g(Lx_{\varepsilon})-2\varepsilon,

so $\psi_{\varepsilon}\in\partial_{2\varepsilon,\Psi}g(Lx_{\varepsilon})$ . On the other hand, by the definition of the $\Psi$ -conjugate $g^{*}_{\Psi}$ ,

(\forall x\in X)\ g(Lx)+g^{*}_{\Psi}(\psi_{\varepsilon})\geq\psi_{\varepsilon}(Lx).

Substituting this inequality at $x=x_{\varepsilon}$ into the right-hand side of (63), we get $0\in\partial_{2\varepsilon,\Phi}(f+\psi_{\varepsilon}\circ L)(x_{\varepsilon})$ . Since $\varepsilon>0$ is arbitrary, (ii) follows.

(ii) $\Rightarrow$ (i): For every $\varepsilon>0$ , there exist $x_{\varepsilon}\in X$ and $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g(Lx_{\varepsilon})$ such that $0\in\partial_{\varepsilon,\Phi}(f+\psi_{\varepsilon}\circ L)(x_{\varepsilon})$ , i.e. for all $z\in X$

f(z)+\psi_{\varepsilon}(Lz)-f(x_{\varepsilon})-\psi_{\varepsilon}(Lx_{\varepsilon})\geq-\varepsilon.

Because $\psi_{\varepsilon}\in\partial_{\varepsilon,\Psi}g(Lx_{\varepsilon})$ , using inequality (11) we get,

	$\displaystyle f(z)+\psi_{\varepsilon}(Lz)-f(x_{\varepsilon})$	$\displaystyle\geq\psi_{\varepsilon}(Lx_{\varepsilon})-\varepsilon$
		$\displaystyle\geq g^{*}_{\Psi}(\psi_{\varepsilon})+g(Lx_{\varepsilon})-2\varepsilon,$

and

\sup_{\psi\in\Psi}\mathcal{L}(z,\psi)\geq\mathcal{L}(z,\psi_{\varepsilon})=f(z)+\psi_{\varepsilon}(Lz)-g^{*}_{\Psi}(\psi_{\varepsilon})\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-2\varepsilon.

Since the right-hand side does not depend on $z\in X$ , taking the infimum with respect to $z\in X$ gives us

\inf_{z\in X}\sup_{\psi\in\Psi}\mathcal{L}(z,\psi)\geq f(x_{\varepsilon})+g(Lx_{\varepsilon})-2\varepsilon.

Since $\varepsilon>0$ is arbitrary, we have proved (i). ∎
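To make condition (ii) of Corollary 6.8 concrete, the following Python sketch checks the two $\varepsilon$ -subdifferential inclusions numerically. It is only an illustration under stated assumptions: the data $f(x)=x^{2}-3x-10$ , $g(x)=2x+1$ , $L=Id$ are chosen for the purpose, the space $X=Y=\mathbb{R}$ is replaced by a finite grid of rationals, and the helper name `in_eps_subdiff` is hypothetical. The inclusion $\phi\in\partial_{\varepsilon,\Phi}h(\bar{x})$ is tested in the inequality form used in the proof above, $h(z)-h(\bar{x})\geq\phi(z)-\phi(\bar{x})-\varepsilon$ .

```python
from fractions import Fraction

# Assumed illustration data: f(x) = x^2 - 3x - 10, g(x) = 2x + 1, L = Id.
def f(x):
    return x * x - 3 * x - 10

def g(x):
    return 2 * x + 1

def psi(x):
    # Candidate elementary function psi(x) = 2x + 1 (here psi coincides with g).
    return 2 * x + 1

def zero(x):
    # The zero function, assumed to belong to Phi.
    return Fraction(0)

def f_plus_psi(x):
    return f(x) + psi(x)

# Finite grid of exact rationals standing in for the real line (an approximation).
GRID = [Fraction(k, 100) for k in range(-500, 501)]

def in_eps_subdiff(phi, h, x_bar, eps, grid=GRID):
    """Grid check of h(z) - h(x_bar) >= phi(z) - phi(x_bar) - eps for all z."""
    return all(h(z) - h(x_bar) >= phi(z) - phi(x_bar) - eps for z in grid)

x_eps = Fraction(3, 5)    # an approximate minimizer of f + g
eps = Fraction(1, 100)

psi_ok = in_eps_subdiff(psi, g, x_eps, eps)               # psi in eps-subdiff of g
zero_ok = in_eps_subdiff(zero, f_plus_psi, x_eps, eps)    # 0 in eps-subdiff of f + psi
zero_fails_for_half_eps = not in_eps_subdiff(zero, f_plus_psi, x_eps, eps / 2)

print(psi_ok, zero_ok, zero_fails_for_half_eps)
```

With exact rational arithmetic the check is free of rounding error, but it only certifies the inequalities on the grid, not on all of $\mathbb{R}$ .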

Having the class $\Psi$ of elementary functions, we recall the notion of $X$ -convexity in [6, Proposition 1.2.3], by using the formula $\phi(x)=x(\phi)$ for $x\in X,\phi\in\Phi$ . Together with the intersection property, we can prove the Lagrange zero duality gap as follows.

Proposition 6.9.

Assume that for every $\alpha\in\mathbb{R},\alpha\leq\inf_{x\in X}f\left(x\right)+\psi\left(Lx\right)$ , there exist $\psi\in\text{supp }g,x_{0}\in X$ , such that $f(x_{0})\leq\alpha\leq\psi(Lx_{0})$ . If $x_{0}\in A=\left\{x\in X:Lx\in\text{supp }g^{*}_{\Psi}\right\}\neq\emptyset$ , then we have $val(LP)=val(LD)$ .

Proof.

Let $x_{0}\in A,\psi\in\text{supp }g$ be such that

f(x_{0})\leq\alpha\leq\psi(Lx_{0}).

Since $\psi\in\text{supp }g$ , we have $g^{*}_{\Psi}\left(\psi\right)\leq 0$ . On the other hand, $x_{0}\in A$ means

(\forall\psi\in\Psi)\ \psi(Lx_{0})\leq g^{*}_{\Psi}(\psi)\Leftrightarrow\sup_{\psi\in\Psi}\psi\left(Lx_{0}\right)-g^{*}_{\Psi}\left(\psi\right)\leq 0.

Moreover, $\psi\left(Lx_{0}\right)\geq\alpha$ and

	$\displaystyle\alpha$	$\displaystyle\leq\inf_{x\in X}f\left(x\right)+\psi\left(Lx\right)$
		$\displaystyle\leq\inf_{x\in X}f\left(x\right)+\psi\left(Lx\right)-g^{*}_{\Psi}\left(\psi\right),\quad\text{as }g^{*}_{\Psi}\left(\psi\right)\leq 0$
		$\displaystyle\leq\sup_{\psi\in\Psi}\inf_{x\in X}f\left(x\right)+\psi\left(Lx\right)-g^{*}_{\Psi}\left(\psi\right).$

Moreover, since $x_{0}\in A$ ,

	$\displaystyle\alpha$	$\displaystyle\geq f\left(x_{0}\right)\geq\sup_{\psi\in\Psi}f\left(x_{0}\right)+\psi\left(Lx_{0}\right)-g^{*}_{\Psi}\left(\psi\right)$
		$\displaystyle\geq\inf_{x\in X}\sup_{\psi\in\Psi}f\left(x\right)+\psi\left(Lx\right)-g^{*}_{\Psi}\left(\psi\right).$

Combining the two above inequalities, we obtain

\sup_{\psi\in\Psi}\inf_{x\in X}f\left(x\right)+\psi\left(Lx\right)-g^{*}_{\Psi}\left(\psi\right)\geq\inf_{x\in X}\sup_{\psi\in\Psi}f\left(x\right)+\psi\left(Lx\right)-g^{*}_{\Psi}\left(\psi\right).

∎

Remark 10.

In Proposition 6.9, we can replace the assumption $x_{0}\in A$ with $g(Lx_{0})\leq 0$ , for $x_{0}\in X$ . Then zero duality gap holds between (CP) and (LD). Indeed, for $\psi_{0}\in\text{supp }g$ with $\psi_{0}(Lx_{0})\geq\alpha\geq f(x_{0})$ , we have

	$\displaystyle\sup_{\psi\in\Psi}\inf_{x\in X}f\left(x\right)+\psi\left(Lx\right)-g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle\geq\inf_{x\in X}f\left(x\right)+\psi_{0}\left(Lx\right)-g^{*}_{\Psi}\left(\psi_{0}\right)$
		$\displaystyle\geq\inf_{x\in X}f\left(x\right)+\psi_{0}\left(Lx\right),\quad\text{as }g^{*}_{\Psi}\left(\psi_{0}\right)\leq 0$
		$\displaystyle\geq\alpha\geq f(x_{0})\geq f(x_{0})+g(Lx_{0})$
		$\displaystyle\geq\inf_{x\in X}f(x)+g(Lx).$

We complete this Section with two examples which illustrate Proposition 6.9.

Example 6.10.

Consider the functions $f\left(x\right)=3x^{2}-3x-10,g\left(x\right)=-2x^{2}+x-8$ and $L=Id$ . Let

\Phi=\Psi=\left\{\psi\left(x\right)=-ax^{2}+bx+c,a\geq 0,b,c\in\mathbb{R}\right\},

be our sets of elementary functions. We want to find $\text{supp }g$ , i.e. the set of all $\psi\in\Psi$ such that

\psi\left(x\right)\leq g\left(x\right)\quad\forall x\in\mathbb{R}.

This gives us $\psi\in\Psi$ where $a>2,c\leq-\frac{\left(1-b\right)^{2}}{4\left(a-2\right)}-8$ or $\psi\left(x\right)=-2x^{2}+x-8$ . Consider first $a>2$ and let $\alpha\in\mathbb{R}$ be such that

\alpha\leq\inf_{x}f\left(x\right)+\psi\left(x\right)=\begin{cases}c-10&\text{if }a=3,b=3,c\leq-9\\ -\frac{\left(b-3\right)^{2}}{4\left(3-a\right)}-10+c&\text{if }2<a<3,c\leq-\frac{\left(1-b\right)^{2}}{4\left(a-2\right)}-8\\ -\infty&\text{if }a>3.\end{cases}

We need to find $x_{0}$ such that $f\left(x_{0}\right)\leq\alpha\leq\psi\left(x_{0}\right)$ ; we can take $\alpha=\inf_{x}f\left(x\right)+\psi\left(x\right)$ . There are two cases.

•

If $\alpha=c-10$ , then $\alpha\leq-19$ because $c\leq-9$ , and we cannot find $x_{0}$ such that $f\left(x_{0}\right)\leq-19$ , as $f_{\min}=-\frac{43}{4}$ .

•

If $\alpha=-\frac{\left(b-3\right)^{2}}{4\left(3-a\right)}-10+c,$ we have to solve $f(x_{0})\leq\alpha\leq\psi(x_{0})$ for $x_{0}$ . We arrive at the following system

	$\displaystyle-ax^{2}+bx+c$	$\displaystyle\geq-\frac{\left(b-3\right)^{2}}{4\left(3-a\right)}-10+c$
	$\displaystyle 3x^{2}-3x-10$	$\displaystyle\leq-\frac{\left(b-3\right)^{2}}{4\left(3-a\right)}-10+c$
	$\displaystyle 2<a<3,$	$\displaystyle\ c\leq-\frac{\left(1-b\right)^{2}}{4\left(a-2\right)}-8.$

By direct calculation, there is no solution $a,b,c,x$ of the above system of inequalities.

For $\psi\left(x\right)=-2x^{2}+x-8$ , i.e. $\psi=g$ , this turns into

\inf_{x\in\mathbb{R}}f\left(x\right)+g\left(x\right)=\alpha=\inf_{x\in\mathbb{R}}f\left(x\right)+\psi\left(x\right)\leq\sup_{\psi\in\Psi}\inf_{x\in\mathbb{R}}f\left(x\right)+\psi\left(x\right),

which is zero duality gap.
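The numerical values appearing in Example 6.10 can be double-checked with a short Python sketch. It is a minimal illustration, not part of the argument: quadratics are stored as hypothetical coefficient triples $(p,q,r)$ standing for $px^{2}+qx+r$ , and the helper names `quad_inf`, `add`, `neg` and `conjugate_at` are introduced here only for this purpose.

```python
from fractions import Fraction

def quad_inf(a, b, c):
    """Infimum over R of a*x^2 + b*x + c; None encodes -infinity."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a > 0:
        return c - b * b / (4 * a)   # value at the vertex x = -b / (2a)
    if a == 0 and b == 0:
        return c                     # constant function
    return None                      # unbounded below

def add(p, q):
    """Coefficient-wise sum of two quadratics given as (a, b, c) triples."""
    return tuple(pi + qi for pi, qi in zip(p, q))

def neg(p):
    """Coefficient-wise negation of a quadratic."""
    return tuple(-pi for pi in p)

def conjugate_at(psi, g):
    """g*(psi) = sup_x psi(x) - g(x) = -inf_x (g - psi)(x); None encodes +infinity."""
    m = quad_inf(*add(g, neg(psi)))
    return None if m is None else -m

f = (3, -3, -10)    # f(x) = 3x^2 - 3x - 10
g = (-2, 1, -8)     # g(x) = -2x^2 + x - 8

f_min = quad_inf(*f)                            # minimum of f
primal = quad_inf(*add(f, g))                   # inf of f + g
alpha_case1 = quad_inf(*add(f, (-3, 3, -9)))    # psi with a = 3, b = 3, c = -9
conj_g = conjugate_at(g, g)                     # g*(psi) for psi = g
dual_at_g = quad_inf(*add(f, g)) - conj_g       # inf_x f + psi - g*(psi) at psi = g

print(f_min, primal, alpha_case1, conj_g, dual_at_g)
```

Under these assumptions the sketch reproduces $f_{\min}=-\frac{43}{4}$ and shows that the dual objective evaluated at $\psi=g$ already attains the primal value.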

Example 6.11.

Now consider the functions $f\left(x\right)=x^{2}-3x-10,g\left(x\right)=2x+1$ with $L=Id$ , and the set of elementary functions is

\Phi=\Psi=\left\{\psi\left(x\right)=ax+b,\text{ where }a,b\in\mathbb{R}\right\}.

Calculating $\psi\in\text{supp }g$ gives us $a=2$ and $b\leq 1$ . Now we find

\alpha=\inf_{x\in\mathbb{R}}f\left(x\right)+\psi\left(x\right)=\inf_{x\in\mathbb{R}}x^{2}-x-10+b=-\frac{41}{4}+b.

We also calculate

g^{*}_{\Psi}\left(\psi\right)=\sup_{x\in X}\psi\left(x\right)-g\left(x\right)=b-1,

for $b\leq 1$ . Observe that

	$\displaystyle\sup_{\psi\in\text{supp }g}\inf_{x\in X}f\left(x\right)+\psi\left(x\right)-g^{*}_{\Psi}\left(\psi\right)$	$\displaystyle=\sup_{\psi\in\text{supp }g}\inf_{x\in X}x^{2}-x-10+b-b+1$
		$\displaystyle=\sup_{\psi\in\text{supp }g}\inf_{x\in X}x^{2}-x-9$
		$\displaystyle=\inf_{x\in X}x^{2}-x-9=\inf_{x\in\mathbb{R}}f\left(x\right)+g\left(x\right),$

while

\sup_{\psi\in\text{supp }g}\inf_{x\in X}f\left(x\right)+\psi\left(x\right)-g^{*}_{\Psi}\left(\psi\right)\leq\sup_{\psi\in\Psi}\inf_{x\in X}f\left(x\right)+\psi\left(x\right)-g^{*}_{\Psi}\left(\psi\right).

Thus we have zero duality gap.
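The computation in Example 6.11 can be reproduced with the following Python sketch, which evaluates the primal value and the dual objective for affine $\psi(x)=ax+b$ in closed form. It is a hedged illustration only: the function names `quad_inf`, `conjugate_of_affine` and `dual_value` are hypothetical, and the value `None` is used as a stand-in for $\pm\infty$ .

```python
from fractions import Fraction

def quad_inf(a, b, c):
    """Infimum over R of a*x^2 + b*x + c; None encodes -infinity."""
    a, b, c = Fraction(a), Fraction(b), Fraction(c)
    if a > 0:
        return c - b * b / (4 * a)   # value at the vertex x = -b / (2a)
    if a == 0 and b == 0:
        return c                     # constant function
    return None                      # unbounded below

def conjugate_of_affine(a, b):
    """g*(psi) = sup_x (a - 2)*x + b - 1 for psi(x) = a*x + b and g(x) = 2x + 1.

    The supremum is finite only when a == 2; None encodes +infinity.
    """
    return Fraction(b) - 1 if a == 2 else None

def dual_value(a, b):
    """inf_x f(x) + psi(x) - g*(psi) with f(x) = x^2 - 3x - 10; None encodes -infinity."""
    conj = conjugate_of_affine(a, b)
    if conj is None:
        return None
    return quad_inf(1, -3 + a, -10 + Fraction(b) - conj)

# Primal value: inf_x f(x) + g(x) = inf_x x^2 - x - 9.
primal = quad_inf(1, -1, -9)

# Dual objective for several psi with slope a = 2 (the only slope with finite conjugate).
dual_values = [dual_value(2, b) for b in (-5, 0, 1)]

print(primal, dual_values, dual_value(3, 0))
```

The dual objective does not depend on $b$ when $a=2$ , because the constant $b$ cancels against $g^{*}_{\Psi}(\psi)=b-1$ ; this is the cancellation used in the displayed chain of equalities.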

Acknowledgments

The authors are thankful to the anonymous referees for their constructive comments and remarks, which improved the quality of the paper.

This work has been supported by the ITN-ETN project TraDE-OPT funded by the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No. 861137.

References

[1] Vial JP. Strong and weak convexity of sets and functions. Mathematics of Operations Research. 1983;8(2):231–259.
[2] Rolewicz S. On paraconvex multifunctions. Oper Research Verf(Methods of Oper Res). 1979;31:540–546.
[3] Hyers DH, Ulam SM. Approximately convex functions. Proceedings of the American Mathematical Society. 1952;3:821–828.
[4] Páles Z. On approximately convex functions. Proceedings of the American Mathematical Society. 2003;131(1):243–252.
[5] Rubinov AM. Abstract convexity and global optimization. Vol. 44. Springer Science & Business Media; 2013.
[6] Pallaschke DE, Rolewicz S. Foundations of mathematical optimization: convex analysis without linearity. Vol. 388. Springer Science & Business Media; 2013.
[7] Singer I. Abstract convex analysis. Vol. 25. John Wiley & Sons; 1997.
[8] Rubinov AM, Andramonov MY. Minimizing increasing star-shaped functions based on abstract convexity. Journal of Global Optimization. 1999;15(1):19–39.
[9] Andramonov M. A survey of methods of abstract convex programming. Journal of Statistics and Management Systems. 2002;5(1-3):21–37.
[10] Burachik RS, Rubinov A. On abstract convexity and set valued analysis. Journal of Nonlinear and Convex Analysis. 2008;9(1):105.
[11] Dutta J, Martinez-Legaz J, Rubinov A. Monotonic analysis over cones: I. Optimization. 2004;53(2):129–146.
[12] Eberhard A, Mohebi H. Maximal abstract monotonicity and generalized Fenchel’s conjugation formulas. Set-Valued and Variational Analysis. 2010;18(1):79–108.
[13] Rockafellar RT, Wets RJB. Variational analysis. Vol. 317. Springer Science & Business Media; 2009.
[14] Bonnans JF, Shapiro A. Perturbation analysis of optimization problems. Springer Science & Business Media; 2013.
[15] Beck A, Teboulle M. Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems. IEEE transactions on image processing. 2009;18(11):2419–2434.
[16] Jeyakumar V, Rubinov A, Wu Z. Generalized Fenchel’s conjugation formulas and duality for abstract convex functions. Journal of optimization theory and applications. 2007;132(3):441–458.
[17] Burachik RS, Rubinov A. Abstract convexity and augmented lagrangians. SIAM Journal on Optimization. 2007;18(2):413–436.
[18] Syga M. Minimax theorems for $\Phi$-convex functions: sufficient and necessary conditions. Optimization. 2016;65(3):635–649.
[19] Syga M. Minimax theorems for extended real-valued abstract convex–concave functions. Journal of Optimization Theory and Applications. 2018;176(2):306–318.
[20] Dolgopolik MV. Abstract convex approximations of nonsmooth functions. Optimization. 2015;64(7):1439–1469.
[21] Gorokhovik V, Tykoun A. Support points of lower semicontinuous functions with respect to the set of Lipschitz concave functions. Doklady of the National Academy of Sciences of Belarus. 2020;63(6):647–653.
[22] Gorokhovik V, Tykoun A. Abstract convexity of functions with respect to the set of Lipschitz (concave) functions. Proceedings of the Steklov Institute of Mathematics. 2020;309(1):S36–S46.
[23] Bui HT, Burachik RS, Kruger AY, et al. Zero duality gap conditions via abstract convexity. Optimization. 2021;:1–37.
[24] Bednarczuk EM, Syga M. On duality for nonconvex minimization problems within the framework of abstract convexity. Optimization. 2021;To appear.
[25] Moreau JJ. Inf-convolution, sous-additivité, convexité des fonctions numériques. Journal de Mathématiques Pures et Appliquées. 1970;.
[26] Ekeland I, Temam R. Convex analysis and variational problems. SIAM; 1999.
[27] Cannarsa P, Sinestrari C. Semiconcave functions, Hamilton-Jacobi equations, and optimal control. Vol. 58. Springer Science & Business Media; 2004.
[28] Attouch H, Aze D. Approximation and regularization of arbitrary functions in Hilbert spaces by the Lasry-Lions method. In: Annales de l’Institut Henri Poincaré C, Analyse non linéaire; Vol. 10; Elsevier; 1993. p. 289–312.
[29] Daniilidis A, Georgiev P. Approximate convexity and submonotonicity. Journal of mathematical analysis and applications. 2004;291(1):292–301.
[30] Rolewicz S. On uniformly approximate convex and strongly alpha (.)-paraconvex functions. Control and Cybernetics. 2001;30(3):323–330.
[31] Bot RI. Conjugate duality in convex optimization. Vol. 637. Springer Science & Business Media; 2009.
[32] Oettli W, Schläger D. Conjugate functions for convex and nonconvex duality. Journal of Global Optimization. 1998;13(4):337–347.
[33] Bauschke HH, Combettes PL, et al. Convex analysis and monotone operator theory in Hilbert spaces. Vol. 408. Springer; 2011.
[34] Burachik RS. On asymptotic lagrangian duality for nonsmooth optimization. ANZIAM journal. 2016;58:C93–C123.
[35] Rubinov AM, Huang X, Yang X. The zero duality gap property and lower semicontinuity of the perturbation function. Mathematics of Operations Research. 2002;27(4):775–791.
[36] Huang X, Yang X. Further study on augmented lagrangian duality theory. Journal of Global Optimization. 2005;31(2):193–210.
