This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Distributionally robust possibilistic optimization problems

Romain Guillaume Université de Toulouse-IRIT Toulouse, France, [email protected] Adam Kasperski111Corresponding author Wrocław University of Science and Technology, Wrocław, Poland
[email protected]
Paweł Zieliński Wrocław University of Science and Technology, Wrocław, Poland
[email protected]
Abstract

In this paper a class of optimization problems with uncertain linear constraints is discussed. It is assumed that the constraint coefficients are random vectors whose probability distributions are only partially known. Possibility theory is used to model the imprecise probabilities. In one of the interpretations, a possibility distribution (a membership function of a fuzzy set) in the set of coefficient realizations induces a necessity measure, which in turn defines a family of probability distributions in this set. The distributionally robust approach is then used to transform the imprecise constraints into deterministic counterparts. Namely, the uncertain left-had side of each constraint is replaced with the expected value with respect to the worst probability distribution that can occur. It is shown how to represent the resulting problem by using linear or second order cone constraints. This leads to problems which are computationally tractable for a wide class of optimization models, in particular for linear programming.

Keywords: robust optimization; possibility theory; imprecise probabilities; fuzzy intervals

1 Introduction

In this paper we consider the following uncertain optimization problem:

min𝒄T𝒙𝒂~iT𝒙bii[m]𝒙𝕏.\begin{array}[]{llll}\min&\boldsymbol{c}^{T}\boldsymbol{x}\\ &\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x}\leq b_{i}&i\in[m]\\ &\boldsymbol{x}\in\mathbb{X}.\end{array} (1)

In formulation (1), 𝒄n\boldsymbol{c}\in\mathbb{R}^{n} is a vector of objective function coefficients, 𝒃m\boldsymbol{b}\in\mathbb{R}^{m} is a vector of constraint right-hand sides, and 𝒂~i\tilde{\boldsymbol{a}}_{i} is a vector of uncertain coefficients of the iith constraint. In the following we will use the notation [m]={1,,m}[m]=\{1,\dots,m\}. We assume that 𝒂~i\tilde{\boldsymbol{a}}_{i} is a random vector in n\mathbb{R}^{n} whose realization is unknown when a solution to (1) is determined. A particular realization 𝒂in\boldsymbol{a}_{i}\in\mathbb{R}^{n} of 𝒂~i\tilde{\boldsymbol{a}}_{i} is called a scenario. After replacing 𝒂~i\tilde{\boldsymbol{a}}_{i} with scenario 𝒂i\boldsymbol{a}_{i}, for each i[m]i\in[m], we get a deterministic counterpart of (1). The true probability distribution for 𝒂~i\tilde{\boldsymbol{a}}_{i} is only partially known. In particular, the set of possible scenarios 𝒰in\mathcal{U}_{i}\subseteq\mathbb{R}^{n} is provided, which is the support of 𝒂~i\tilde{\boldsymbol{a}}_{i} or its reasonable approximation. Finally, 𝕏\mathbb{X} is a nonempty and bounded subset of n\mathbb{R}^{n} which restricts the domain of decision variables 𝒙\boldsymbol{x}. If 𝕏\mathbb{X} is a polyhedron, for example, 𝕏={𝒙n:𝑳𝒙𝑼}\mathbb{X}=\{\boldsymbol{x}\in\mathbb{R}^{n}:\boldsymbol{L}\leq\boldsymbol{x}\leq\boldsymbol{U}\}, for some 𝑳,𝑼n\boldsymbol{L},\boldsymbol{U}\in\mathbb{R}^{n}, then we get uncertain linear programming problem. If, additionally, 𝕏{0,1}n\mathbb{X}\subseteq\{0,1\}^{n}, then we get uncertain combinatorial optimization problem.

The method of handling the uncertain constraints and solving (1) depends on the information available about 𝒂~i\tilde{\boldsymbol{a}}_{i}. If 𝒂~i\tilde{\boldsymbol{a}}_{i} is a random vector with a known probability distribution, then the iith constraint can be replaced with a stochastic chance constraint of the form

P(𝒂~iT𝒙bi)1ϵi,{\rm P}(\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x}\leq b_{i})\geq 1-\epsilon_{i},

where ϵi[0,1)\epsilon_{i}\in[0,1) is a given risk level [8, 21]. If the uncertainty set 𝒰i\mathcal{U}_{i} is the only information provided with 𝒂~i\tilde{\boldsymbol{a}}_{i}, then the imprecise constraint 𝒂~iT𝒙bi\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x}\leq b_{i} can be replaced with the following strict robust constraint [2]:

max𝒂i𝒰i𝒂iT𝒙bi.\max_{\boldsymbol{a}_{i}\in\mathcal{U}_{i}}\boldsymbol{a}_{i}^{T}\boldsymbol{x}\leq b_{i}. (2)

Choosing appropriate uncertainty set 𝒰i\mathcal{U}_{i} is crucial in (2). For the discrete uncertainty, 𝒰i\mathcal{U}_{i} consists of KK explicitly listed scenarios [25]. These scenarios correspond to some events which influence the value of 𝒂~i\tilde{\boldsymbol{a}}_{i} or can be the result of sampling the random vector 𝒂~i\tilde{\boldsymbol{a}}_{i}. For the interval uncertainty [25], one provides an interval for each component a~ij\tilde{a}_{ij} of 𝒂~i=(a~i1,,a~in)\tilde{\boldsymbol{a}}_{i}=(\tilde{a}_{i1},\dots,\tilde{a}_{in}) and 𝒰i\mathcal{U}_{i} is the Cartesian product of these intervals (a hyperrectangle in n\mathbb{R}^{n}). In order to cut off the extreme values of this hyperrectangle, whose probability of occurrence can be negligible, one can add a budget to 𝒰i\mathcal{U}_{i}. This budget typically limits deviations of scenarios from some nominal scenario 𝒂^i𝒰i\hat{\boldsymbol{a}}_{i}\in\mathcal{U}_{i} [5, 29]. In another approach 𝒰i\mathcal{U}_{i} is specified as an ellipsoid centered at some nominal scenario 𝒂^i\hat{\boldsymbol{a}}_{i} [2]. Using the ellipsoidal uncertainty one can model correlations between the components of 𝒂~i\tilde{\boldsymbol{a}}_{i}. One can also construct 𝒰i\mathcal{U}_{i} from available data [4], which leads to various data-deriven robust models.

Another approach to handle the uncertainty consists in modeling the uncertain coefficients in (1) by using fuzzy sets. There are a lot of concepts of solving fuzzy optimization problems (see, e.g., [20, 35, 26, 34, 37, 32, 31]) – for a comprehensive description of them we refer the reader to [28]. In one of the most common class of models, 𝒂~i\tilde{\boldsymbol{a}}_{i} is a vector of fuzzy intervals whose membership functions are interpreted as possibility distributions for the components of 𝒂~i\tilde{\boldsymbol{a}}_{i}. A description of possibility theory that offers a general framework of dealing with imprecise or incomplete knowledge can be found, for example, in [14]. It uses two dual possibility and necessity measures to handle the uncertainty. In the possibilistic setting one can replace the uncertain constraints in (1) with fuzzy chance constraints of the form [19, 27, 23]:

N(𝒂~iT𝒙bi)1ϵi,{\rm N}(\tilde{\boldsymbol{a}}^{T}_{i}\boldsymbol{x}\leq b_{i})\geq 1-\epsilon_{i},

where N{\rm N} is a necessity measure induced by a possibility distribution for 𝒂~i\tilde{\boldsymbol{a}}_{i}.

If a partial information about the probability distribution of 𝒂~i\tilde{\boldsymbol{a}}_{i} is available, then the so-called distributionally robust approach can be used (see, e.g., [16, 39, 10]). In this approach it is assumed that the true probability distribution P{\rm P} for 𝒂~i\tilde{\boldsymbol{a}}_{i} belongs to the so-called ambiguity set 𝒫i\mathcal{P}_{i} of probability distributions. For example, one can consider all probability distributions given some bounds on its mean and covariance matrix [10]. Alternatively, the ambiguity set can be specified by providing a family of confidence sets (this approach will be adopted in this paper) [39]. Using the distributionally robust approach, one can consider the following counterpart of the imprecise constraint 𝒂~iT𝒙bi\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x}\leq b_{i}:

maxP𝒫i𝐄P[𝒂~iT𝒙]bi,\max_{{\rm P}\in\mathcal{P}_{i}}\mathrm{\mathbf{E}}_{\rm P}[\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x}]\leq b_{i}, (3)

where the left hand side is the expected value of the random variable 𝒂~iT𝒙\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x} under a worst probability distribution P𝒫i{\rm P}\in\mathcal{P}_{i}. Observe that (3) reduces to (2) if 𝒫i\mathcal{P}_{i} contains all probability distributions with the support 𝒰i\mathcal{U}_{i}.

In this paper we propose a new approach in the area of fuzzy optimization. Will show how the distributionally robust optimization can be naturally used in the setting of possibility theory. A preliminary version of the paper has appeared in [17]. We will start by defining a possibility distribution (a membership function of a fuzzy set) for 𝒂~i\tilde{\boldsymbol{a}}_{i}. This can be done by using both available data or experts knowledge. Following the interpretation of possibility distribution [1, 13], we will define an ambiguity set of probability distributions 𝒫i\mathcal{P}_{i} for 𝒂~i\tilde{\boldsymbol{a}}_{i}. Finally, we will apply constraints of the type (3) to compute a solution. We will show that the model of uncertainty assumed leads to a family of linear or second order cone constraints. So, the resulting problem is tractable for important cases of (1), for example for the class of linear programming problems.

Remark 1.

If one uses (3), then the assumption that 𝐜\boldsymbol{c} and bib_{i}, i[m]i\in[m], are precise in (1) causes no loss of generality. Indeed, if 𝐜~\tilde{\boldsymbol{c}} is imprecise, then we can minimize an auxiliary variable tt, subject to the imprecise constraint 𝐜~T𝐱t0\tilde{\boldsymbol{c}}^{T}\boldsymbol{x}-t\leq 0 (tt reflects the possible realizations of objective function values). If b~i\tilde{b}_{i} is imprecise, then the iith constraint can be rewritten as 𝐚~iT𝐱b~ixn+10\tilde{\boldsymbol{a}}_{i}^{T}\boldsymbol{x}-\tilde{b}_{i}x_{n+1}\leq 0, where xn+1x_{n+1} is an auxiliary variable such that xn+1=1x_{n+1}=1. In both cases, we can replace the imprecise constraint with a one of the form (3).

This paper is organized as follows. In Section 2 we recall basic notions of possibility theory, in particular its probabilistic interpretation assumed in this paper. In Section 3 we discuss the discrete uncertainty representation in which a possibility degree for each scenario is specified. Then the ambiguity set 𝒫i\mathcal{P}_{i} contains a family of discrete probability distributions with a finite support 𝒰i\mathcal{U}_{i}. We will show that (3) is then equivalent to a family of linear constraints. In Section 4 we consider the interval uncertainty representation, in which unknown probability distribution for 𝒂~i\tilde{\boldsymbol{a}}_{i} has a compact continuous support. We propose a method of defining a continuous possibility distribution for 𝒂~i\tilde{\boldsymbol{a}}_{i}, which can be built by using available data or experts knowledge. We show that (3) is then equivalent to a family of second order cone constraints.

2 Possibility theory and imprecise probabilities

Let Ω\Omega be a set of alternatives. A primitive object of possibility theory (see, e.g., [1]) is a possibility distribution π:Ω[0,1]\pi:\Omega\rightarrow[0,1], which assigns to each element uΩu\in\Omega a possibility degree π(u)\pi(u). We only assume that π\pi is normal, i.e. there is uΩu\in\Omega such that π(u)=1\pi(u)=1. Possibility distribution can be built by using available data or experts opinions (see, e.g., [14]). A possibility distribution π\pi induces the following possibility and necessity measures in Ω\Omega:

Π(A)=supuAπ(u),AΩ.\Pi(A)=\sup_{u\in A}\pi(u),\;A\subseteq\Omega.
N(A)=1Π(Ac)=1supuAcπ(u),AΩ,{\rm N}(A)=1-\Pi(A^{c})=1-\sup_{u\in A^{c}}\pi(u),\;A\subseteq\Omega,

where Ac=ΩAA^{c}=\Omega\setminus A is the complement of AA. In this paper we assume that the possibility distribution π\pi represents uncertainty in Ω\Omega, i.e. some knowledge about uncertain quantity u~\tilde{u} taking values in Ω\Omega. We now recall, following [13], a probabilistic interpretation of the pair [Π,N][\Pi,{\rm N}] induced by the possibility distribution π\pi. Define

𝒫(π)\displaystyle\mathcal{P}(\pi) ={P:A measurable N(A)P(A)}\displaystyle=\{{\rm P}:\forall A\text{ measurable }{\rm N}(A)\leq{\rm P}(A)\}
={P:A measurable Π(A)P(A)}.\displaystyle=\{{\rm P}:\forall A\text{ measurable }\Pi(A)\geq{\rm P}(A)\}. (4)

In this case supP𝒫(π)P(A)=Π(A)\sup_{{\rm P}\in\mathcal{P}(\pi)}{\rm P}(A)=\Pi(A) and infP𝒫(π)P(A)=N(A)\inf_{{\rm P}\in\mathcal{P}(\pi)}{\rm P}(A)={\rm N}(A) (see [13, 15, 9]). So, the possibility distribution π\pi in Ω\Omega encodes a family of probability measures in Ω\Omega. Any probability measure P𝒫(π){\rm P}\in\mathcal{P}(\pi) is said to be consistent with π\pi and for any event AΩA\subseteq\Omega the inequalities

N(A)P(A)Π(A){\rm N}(A)\leq{\rm P}(A)\leq\Pi(A) (5)

hold. A detailed discussion on the expressive power of (5) can also be found in [38].

Observe that 𝒫(π)\mathcal{P}(\pi) is nonempty due the assumption that π\pi is normalized. Indeed, the probability distribution such that P({u})=1{\rm P}(\{u\})=1 for some uΩu\in\Omega such that π(u)=1\pi(u)=1 is in 𝒫(π)\mathcal{P}(\pi). To see this consider two cases. If uAu\notin A, then N(A)=0{\rm N}(A)=0 and P(A)N(A)=0{\rm P}(A)\geq{\rm N}(A)=0. If uAu\in A, then P(A)=1N(A){\rm P}(A)=1\geq{\rm N}(A).

3 Discrete uncertainty model

Consider uncertain constraint 𝒂~T𝒙b\tilde{\boldsymbol{a}}^{T}\boldsymbol{x}\leq b, where 𝒂~\tilde{\boldsymbol{a}} is a random vector in n\mathbb{R}^{n} with an unknown probability distribution (in order to simplify notation we skip the index ii in the constraint). In this section we assume that the support 𝒰\mathcal{U} of 𝒂~\tilde{\boldsymbol{a}} is finite, i.e. 𝒰={𝒂1,,𝒂K}\mathcal{U}=\{\boldsymbol{a}_{1},\dots,\boldsymbol{a}_{K}\} is the set of all possible, explicitly listed scenarios (realizations of 𝒂~\tilde{\boldsymbol{a}}) which can occur with a positive probability. We thus consider the discrete uncertainty model [25, 22]. Let μ\mu be a membership function of a fuzzy set in 𝒰\mathcal{U}, μ:𝒰[0,1]\mu:\mathcal{U}\rightarrow[0,1]. We will interpret μ\mu as a possibility distribution π\pi in the scenario set 𝒰\mathcal{U}, i.e. π=μ\pi=\mu. In order to simplify presentation, we will identify each scenario 𝒄i𝒰\boldsymbol{c}_{i}\in\mathcal{U} with its index i[K]i\in[K]. We can thus assume that π\pi is a possibility distribution in the index set [K][K], π:[K][0,1]\pi:[K]\rightarrow[0,1]. The value of π(i)\pi(i) is the possibility degree for scenario 𝒂i\boldsymbol{a}_{i}. Assume also w.l.o.g. that π(1)π(2)π(K)\pi(1)\geq\pi(2)\geq\dots\geq\pi(K) and thus π(1)=1\pi(1)=1.

Given π\pi, we can now construct an ambiguity set 𝒫(π)\mathcal{P}(\pi) of probability distributions for 𝒂~\tilde{\boldsymbol{a}} by using (4). Because 𝒫(π)\mathcal{P}(\pi) contains only discrete probability distributions in [K][K], we get

𝒫(π){𝒑[0,1]K:i[K]pi=1}\mathcal{P}(\pi)\subseteq\{\boldsymbol{p}\in[0,1]^{K}\,:\sum_{i\in[K]}p_{i}=1\}

and we can describe 𝒫(π)\mathcal{P}(\pi) by the following system of linear constraints:

iApi1maxiAπ(i)A[K],|A|1i[K]pi=1pi0i[K]\begin{array}[]{llll}\displaystyle\sum_{i\in A}p_{i}\geq 1-\max_{i\notin A}\pi(i)&\forall\,A\subset[K],|A|\geq 1\\ \displaystyle\sum_{i\in[K]}p_{i}=1\\ p_{i}\geq 0&i\in[K]\end{array} (6)

In description (6) the first set of constraints model the condition P(A)1Π(Ac)=N(A) for all events A[K]{\rm P}(A)\geq 1-\Pi(A^{c})={\rm N}(A)\text{ for all events }A\subset[K], (Ω=[K]\Omega=[K]). Notice that the formulation (6) has exponential number of constraints. In the following, we will show how to reduce the number of constraints to O(K)O(K).

Let us partition the set of indices [K][K] into \ell disjoint sets I1,,II_{1},\dots,I_{\ell} such that the elements of IjI_{j} have the same possibility degrees and the possibility degrees of the elements in IjI_{j} are greater than the possibility degrees of the elements in IkI_{k} for any j>kj>k. For example, let 𝝅=(1,1,0.5,0.5,0.3,0.3,0.3,0.1)\boldsymbol{\pi}=(1,1,0.5,0.5,0.3,0.3,0.3,0.1) be a possibility distribution (a fuzzy set) in [K][K] for K=8K=8 (i.e. a possibility distribution in the set of eight scenarios). Then I1={1,2}I_{1}=\{1,2\}, I2={3,4}I_{2}=\{3,4\}, I3={5,6,7}I_{3}=\{5,6,7\}, I4={8}I_{4}=\{8\}. Let πj\pi^{j} be the possibility degree of the elements in IjI_{j}. In the example we have π1=1\pi^{1}=1, π2=0.5\pi^{2}=0.5, π3=0.3\pi^{3}=0.3 and π4=0.1\pi^{4}=0.1.

Proposition 1.

System (6) is equivalent to the following system of constraints:

iI1Ijpi1πj+1j[1]i[K]pi=1pi0i[K]\begin{array}[]{llll}\displaystyle\sum_{i\in I_{1}\cup\dots\cup I_{j}}p_{i}\geq 1-\pi^{j+1}&j\in[\ell-1]\\ \displaystyle\sum_{i\in[K]}p_{i}=1\\ p_{i}\geq 0&i\in[K]\end{array} (7)
Proof.

It is easy to verify that each feasible solution to (6) is also feasible to (7). Indeed, for A=I1IjA=I_{1}\cup\dots\cup I_{j}, j[1]j\in[\ell-1], we get maxiAπ(i)=πj+1\max_{i\notin A}\pi(i)=\pi^{j+1}. So, (7) is a subset of the constraints (6). Assume that 𝒑\boldsymbol{p} is feasible to (7). Consider any set A[K]A\subseteq[K], |A|1|A|\geq 1. If A=[K]A=[K], then 𝒑\boldsymbol{p} is feasible to (6). Assume that A[K]A\subset[K] and let k[1]k\in[\ell-1] be the first index such that IkAI_{k}\not\subseteq A. If k=1k=1, then N(A)=0{\rm N}(A)=0 and the constraint in (6) holds for AA. Assume that k>1k>1. The inequality from (7)

iI1Ik1pi1πk\sum_{i\in I_{1}\cup\dots\cup I_{k-1}}p_{i}\geq 1-\pi^{k}

implies

iApiiI1Ik1pi1πk=1maxiAπ(i),\sum_{i\in A}p_{i}\geq\sum_{i\in I_{1}\cup\dots\cup I_{k-1}}p_{i}\geq 1-\pi^{k}=1-\max_{i\notin A}\pi(i),

where the last equality results from the fact that IkAI_{k}\not\subseteq A. In consequence the constraint in (6) holds for AA. ∎

For the sample possibility distribution 𝝅=(1,1,0.5,0.5,0.3,0.3,0.3,0.1)\boldsymbol{\pi}=(1,1,0.5,0.5,0.3,0.3,0.3,0.1), the ambiguity set 𝒫(π)\mathcal{P}(\pi) can be described by the following system of constraints:

p1+p20.5p1+p2+p3+p40.7p1+p2+p3+p4+p5+p6+p70.9p1+p2+p3+p4+p5+p6+p7+p8=1pi0i[K]\begin{array}[]{llll}p_{1}+p_{2}\geq 0.5\\ p_{1}+p_{2}+p_{3}+p_{4}\geq 0.7\\ p_{1}+p_{2}+p_{3}+p_{4}+p_{5}+p_{6}+p_{7}\geq 0.9\\ p_{1}+p_{2}+p_{3}+p_{4}+p_{5}+p_{6}+p_{7}+p_{8}=1\\ p_{i}\geq 0&i\in[K]\end{array}

Using Proposition 1, the value of maxP𝒫(π)𝐄P[𝒂~T𝒙]\max_{{\rm P}\in\mathcal{P}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[\tilde{\boldsymbol{a}}^{T}\boldsymbol{x}], for a given 𝒙\boldsymbol{x}, can be computed by using the following linear program:

maxi[K]pi𝒂iT𝒙iI1Ijpi1πj+1j[1]i[K]pi=1pi0i[K]\begin{array}[]{lllll}\max\displaystyle\sum_{i\in[K]}p_{i}\boldsymbol{a}^{T}_{i}\boldsymbol{x}\\ \displaystyle\sum_{i\in I_{1}\cup\dots\cup I_{j}}p_{i}\geq 1-\pi^{j+1}&j\in[\ell-1]\\ \displaystyle\sum_{i\in[K]}p_{i}=1\\ p_{i}\geq 0&i\in[K]\\ \end{array} (8)
Theorem 1.

Given 𝐱𝕏\boldsymbol{x}\in\mathbb{X}, the constraint

maxP𝒫(π)𝐄P[𝒂~T𝒙]b\max_{{\rm P}\in\mathcal{P}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[\tilde{\boldsymbol{a}}^{T}\boldsymbol{x}]\leq b (9)

is equivalent to the following system of linear constraints:

β+j[1]αj(πj+11)bβj[1]:iIjαj𝒂iT𝒙i[K]αj0j[1]β\begin{array}[]{lllll}\displaystyle\beta+\sum_{j\in[\ell-1]}\alpha_{j}(\pi^{j+1}-1)\leq b\\ \displaystyle\beta-\sum_{j\in[\ell-1]:i\in I_{j}}\alpha_{j}\geq\boldsymbol{a}_{i}^{T}\boldsymbol{x}&i\in[K]\\ \alpha_{j}\geq 0&j\in[\ell-1]\\ \beta\in\mathbb{R}&\end{array} (10)
Proof.

For a fixed 𝒙𝕏\boldsymbol{x}\in\mathbb{X}, (8) is a linear programming problem, whose dual is:

minβ+j[1]αj(πj+11)βj[1]:iIjαj𝒂iT𝒙i[K]αj0j[1]β\begin{array}[]{lllll}\min&\displaystyle\beta+\sum_{j\in[\ell-1]}\alpha_{j}(\pi^{j+1}-1)\\ &\displaystyle\beta-\sum_{j\in[\ell-1]:i\in I_{j}}\alpha_{j}\geq\boldsymbol{a}_{i}^{T}\boldsymbol{x}&i\in[K]\\ &\alpha_{j}\geq 0&j\in[\ell-1]\\ &\beta\in\mathbb{R}&\end{array} (11)

where β\beta and αj\alpha_{j} are dual variables. Now the strong duality for linear programming shows that the optimal objective values of (8) and (11) are the same (see, e.g., [30, Theorem 3.1]). This gives us (10). ∎

Consider two boundary cases of the proposed model. If π(k)=1\pi(k)=1 for some k[K]k\in[K], and π(i)=0\pi(i)=0 for each iki\neq k, then 𝒂k\boldsymbol{a}_{k} is the only scenario which can occur. The constraint (3) reduces then to 𝒂kT𝒙b\boldsymbol{a}_{k}^{T}\boldsymbol{x}\leq b. If π(i)=1\pi(i)=1 for all i[K]i\in[K], then 𝒫(π)={𝒑[0,1]K:i[K]pi=1}\mathcal{P}(\pi)=\{\boldsymbol{p}\in[0,1]^{K}:\sum_{i\in[K]}p_{i}=1\} and (3) becomes the strict robust constraint (2) for 𝒰={𝒂1,,𝒂K}\mathcal{U}=\{\boldsymbol{a}_{1},\dots,\boldsymbol{a}_{K}\}. Theorem 1 yields the following corollary.

Corollary 1.

If 𝕏\mathbb{X} is a polyhedron described by a system of linear constraints ((1) is an uncertain linear programming problem). Then the deterministic counterpart of (1) with the constraints of the type (9) and in consequence of the type (10) is a linear programming problem.

Corollary 1 shows that the resulting deterministic counterpart of uncertain linear programming problem (1), under the discrete uncertainty model, is polynomially solvable and thus can be solved efficiently by standard off-the-shelf solvers.

4 Interval uncertainty model

Let us again focus on an uncertain constraint 𝒂~T𝒙b\tilde{\boldsymbol{a}}^{T}\boldsymbol{x}\leq b, where 𝒂~=(a~1,,a~n)\tilde{\boldsymbol{a}}=(\tilde{a}_{1},\dots,\tilde{a}_{n}) is a random vector in n\mathbb{R}^{n} with an unknown probability distribution having now a compact continuous support 𝒰n\mathcal{U}\subseteq\mathbb{R}^{n}. We will start by describing the support 𝒰\mathcal{U}. Next, we will propose a continuous possibility distribution π\pi in 𝒰\mathcal{U} which will induce an ambiguity set of probability distributions 𝒫(π)\mathcal{P}(\pi) according to (4). This possibility distribution can be provided by experts or can be estimated by using available data. Assume that for each component a~j\tilde{a}_{j} of 𝒂~\tilde{\boldsymbol{a}} a nominal (expected, the most probable) value a^j\hat{a}_{j} and an interval [a^ja¯j,a^j+a¯j][\hat{a}_{j}-\underline{a}_{j},\hat{a}_{j}+\overline{a}_{j}] containing possible values of a~j\tilde{a}_{j} are known. Let

(0)=[a^1a¯1,a^1+a¯1]××[a^na¯n,a^n+a¯n],\mathcal{B}(0)=[\hat{a}_{1}-\underline{a}_{1},\hat{a}_{1}+\overline{a}_{1}]\times\dots\times[\hat{a}_{n}-\underline{a}_{n},\hat{a}_{n}+\overline{a}_{n}],

i.e. (0)\mathcal{B}(0) is a hyperrectangle containing possible scenarios 𝒂\boldsymbol{a} (realizations of 𝒂~\tilde{\boldsymbol{a}}). The components of 𝒂~\tilde{\boldsymbol{a}} are often correlated. Also, the probability that a subset of components will take the extreme values can be very small or even 0. So, (0)\mathcal{B}(0) is typically an overestimation of the support 𝒰\mathcal{U} of 𝒂~\tilde{\boldsymbol{a}}. In robust optimization some limit for the deviations of scenarios from the nominal vector 𝒂^\hat{\boldsymbol{a}} is often assumed, i.e. δ(𝒂)Γ\delta(\boldsymbol{a})\leq\Gamma, where δ(𝒂)\delta(\boldsymbol{a}) is a deviation prescribed. The parameter Γ0\Gamma\geq 0 is called a budget and it controls the amount of uncertainty in 𝒰\mathcal{U}. One can define, for example, δ(𝒂)=𝒂𝒂^1\delta(\boldsymbol{a})=||\boldsymbol{a}-\hat{\boldsymbol{a}}||_{1}, which leads to the continuous budgeted uncertainty [29]. In this paper, following [2, 3] we use

δ(𝒂)=𝑩(𝒂𝒂^)2,\delta(\boldsymbol{a})=||\boldsymbol{B}(\boldsymbol{a}-\hat{\boldsymbol{a}})||_{2},

where 𝑩\boldsymbol{B} is a given n×nn\times n matrix. Notice that δ(𝒂)=(𝒂𝒂^)T𝚺(𝒂𝒂^)\delta(\boldsymbol{a})=(\boldsymbol{a}-\hat{\boldsymbol{a}})^{T}\boldsymbol{\Sigma}(\boldsymbol{a}-\hat{\boldsymbol{a}}), where 𝑩=𝚺12\boldsymbol{B}=\boldsymbol{\Sigma}^{\frac{1}{2}} for some postive semidefinite matrix 𝚺\boldsymbol{\Sigma}. For example, 𝚺\boldsymbol{\Sigma} can be an estimation of the covariance matrix for 𝒂~\tilde{\boldsymbol{a}} (see, e.g., [6]). Therefore

(0)={𝒂n:δ(𝒂)Γ}\mathcal{E}(0)=\{\boldsymbol{a}\in\mathbb{R}^{n}:\delta(\boldsymbol{a})\leq\Gamma\}

is an ellipsoid in n\mathbb{R}^{n}. We postulate that (0)(0)\mathcal{B}(0)\cap\mathcal{E}(0) contains all possible realizations of 𝒂~\tilde{\boldsymbol{a}}, i.e. P((0)(0))=1{\rm P}(\mathcal{B}(0)\cap\mathcal{E}(0))=1. Hence (0)(0)\mathcal{B}(0)\cap\mathcal{E}(0) is the estimation of the support of 𝒂~\tilde{\boldsymbol{a}}. In the following, we will construct a possibility distribution for 𝒂~\tilde{\boldsymbol{a}}, which can be easily built from a^j,a¯j,a¯j\hat{a}_{j},\underline{a}_{j},\overline{a}_{j}, j[n]j\in[n], matrix 𝑩\boldsymbol{B} and the budget Γ\Gamma.

Refer to caption
Figure 1: Fuzzy intervals a^,a¯,a¯z1z2\braket{\hat{a},\underline{a},\overline{a}}_{z_{1}-z_{2}} and 0,0,Γz\braket{0,0,\Gamma}_{z} modeling the possibility distributions for a~j\tilde{a}_{j} and δ~\tilde{\delta}.

Let πa~j\pi_{\tilde{a}_{j}} be a possibility distribution for the component a~j\tilde{a}_{j} of 𝒂~\tilde{\boldsymbol{a}}, such that that πa~j(a^j)=1\pi_{\tilde{a}_{j}}(\hat{a}_{j})=1, πa~j(a^ja¯j)=πa~j(a^j+a¯j)=0\pi_{\tilde{a}_{j}}(\hat{a}_{j}-\underline{a}_{j})=\pi_{\tilde{a}_{j}}(\hat{a}_{j}+\overline{a}_{j})=0, πa~j\pi_{\tilde{a}_{j}} is continuous, strictly increasing in [a^ja¯j,a^j][\hat{a}_{j}-\underline{a}_{j},\hat{a}_{j}] and continuous strictly decreasing in [a^j,a^j+a¯j][\hat{a}_{j},\hat{a}_{j}+\overline{a}_{j}]. One can identify the possibility distribution πa~j\pi_{\tilde{a}_{j}} with membership function μa~j\mu_{\tilde{a}_{j}} of a fuzzy interval a^j,a¯j,a¯jz1z2\braket{\hat{a}_{j},\underline{a}_{j},\overline{a}_{j}}_{z_{1}-z_{2}} shown in Figure 1a (i.e. πa~j=μa~j\pi_{\tilde{a}_{j}}=\mu_{\tilde{a}_{j}}). Let

a~j(λ)={aj:πa~j(aj)λ}λ(0,1]\tilde{a}_{j}(\lambda)=\{a_{j}:\pi_{\tilde{a}_{j}}(a_{j})\geq\lambda\}\;\;\lambda\in(0,1]

be the λ\lambda-cut of a~j\tilde{a}_{j}. It is easy to check that a~j(λ)=[a¯j(λ),a¯j(λ)]\tilde{a}_{j}(\lambda)=[\underline{a}_{j}(\lambda),\overline{a}_{j}(\lambda)] is an interval for each fixed λ(0,1]\lambda\in(0,1]. Set [a¯j(0),a¯j(0)]=[a^ia¯j,a^j+a¯j][\underline{a}_{j}(0),\overline{a}_{j}(0)]=[\hat{a}_{i}-\underline{a}_{j},\hat{a}_{j}+\overline{a}_{j}]. The function a¯j(λ)\underline{a}_{j}(\lambda), the left profile of πa~j\pi_{\tilde{a}_{j}}, is continuous and strictly increasing and the function a¯j(λ)\overline{a}_{j}(\lambda), the right profile of πa~j\pi_{\tilde{a}_{j}}, is continuous and strictly decreasing for λ[0,1]\lambda\in[0,1]. For the fuzzy interval a^j,a¯j,a¯jz1z2\braket{\hat{a}_{j},\underline{a}_{j},\overline{a}_{j}}_{z_{1}-z_{2}}, the λ\lambda-cut of a~j\tilde{a}_{j} can be computed by using the following formula:

a~j(λ)=[a^ja¯j(1λz1),a^j+a¯j(1λz2)]λ[0,1].\tilde{a}_{j}(\lambda)=[\hat{a}_{j}-\underline{a}_{j}(1-\lambda^{z_{1}}),\hat{a}_{j}+\overline{a}_{j}(1-\lambda^{z_{2}})]\;\lambda\in[0,1].

We will assume that a¯j,a¯j,z1,z2>0\underline{a}_{j},\overline{a}_{j},z_{1},z_{2}>0. Then a~j(λ)\tilde{a}_{j}(\lambda) are nondegenerate intervals (they have nonempty interiors) for each λ[0,1)\lambda\in[0,1). The numbers z1z_{1} and z2z_{2} control the amount of uncertainty for a~j\tilde{a}_{j} below and over a^j\hat{a}_{j}, respectively. In particular, when z1,z20z_{1},z_{2}\rightarrow 0, then a~j\tilde{a}_{j} becomes a^j\hat{a}_{j}. On the other hand, if z1,z2z_{1},z_{2}\rightarrow\infty, then a~j\tilde{a}_{j} becomes the closed interval [a^ja¯j,a^j+a¯j][\hat{a}_{j}-\underline{a}_{j},\hat{a}_{j}+\overline{a}_{j}] (see Figure 1).

The deviation δ(𝒂)\delta(\boldsymbol{a}) depends on an unknown realization of 𝒂~\tilde{\boldsymbol{a}}, so we can treat the deviation as an uncertain quantity δ~\tilde{\delta}. We can now build the following possibility distribution for δ~\tilde{\delta}: πδ~(0)=1\pi_{\tilde{\delta}}(0)=1, πδ~\pi_{\tilde{\delta}} is continuous and strictly decreasing in [0,Γ][0,\Gamma] and πδ~(δ)=0\pi_{\tilde{\delta}}(\delta)=0 if δ[0,Γ]\delta\notin[0,\Gamma]. Let

δ~(λ)={δn:πδ~(δ)λ}=[0,δ¯(λ)]λ(0,1]\tilde{\delta}(\lambda)=\{\delta\in\mathbb{R}^{n}:\pi_{\tilde{\delta}}(\delta)\geq\lambda\}=[0,\overline{\delta}(\lambda)]\;\;\lambda\in(0,1]

be the λ\lambda-cut of δ~\tilde{\delta}. Fix δ¯(0)=Γ\overline{\delta}(0)=\Gamma. We can identify πδ~\pi_{\tilde{\delta}} with membership function μδ~\mu_{\tilde{\delta}} of a fuzzy interval δ~\tilde{\delta} given, for example, in the form of 0,0,Γz\braket{0,0,\Gamma}_{z} (see Figure 1b). The λ\lambda-cut of δ~\tilde{\delta} is then δ~(λ)=[0,Γ(1λz)]\tilde{\delta}(\lambda)=[0,\Gamma(1-\lambda^{z})] for λ[0,1]\lambda\in[0,1].

We now build the following joint possibility distribution π:n[0,1]\pi:\mathbb{R}^{n}\rightarrow[0,1], for 𝒂~\tilde{\boldsymbol{a}}:

π(𝒂)=min{πa~1(a1),,πa~n(an),πδ~(δ(𝒂))}.\pi(\boldsymbol{a})=\min\{\pi_{\tilde{a}_{1}}(a_{1}),\dots,\pi_{\tilde{a}_{n}}(a_{n}),\pi_{\tilde{\delta}}(\delta(\boldsymbol{a}))\}. (12)

The value of π(𝒂)\pi(\boldsymbol{a}) is the possibility degree for scenario 𝒂n\boldsymbol{a}\in\mathbb{R}^{n}. The first part of the possibility distribution, i.e. min{πa~1(a1),πa~2(a2),,πa~n(an)}\min\{\pi_{\tilde{a}_{1}}(a_{1}),\pi_{\tilde{a}_{2}}(a_{2}),\dots,\pi_{\tilde{a}_{n}}(a_{n})\}, is built from the possibility distributions of non-interacting components [11]. The second part πδ~(δ(𝒂))\pi_{\tilde{\delta}}(\delta(\boldsymbol{a})) is added to model interactions between the components of 𝒂~\tilde{\boldsymbol{a}}. Define the λ\lambda-cut of 𝒂~\tilde{\boldsymbol{a}}

𝒞(λ)={𝒂n:π(𝒂)λ},λ(0,1].\mathcal{C}(\lambda)=\{\boldsymbol{a}\in\mathbb{R}^{n}:\pi(\boldsymbol{a})\geq\lambda\},\;\;\lambda\in(0,1].

It is easily seen that

𝒞(λ)={𝒂n:\displaystyle\mathcal{C}(\lambda)=\{\boldsymbol{a}\in\mathbb{R}^{n}\,:\, aj[a¯j(λ),a¯j(λ)],j[n],\displaystyle a_{j}\in[\underline{a}_{j}(\lambda),\overline{a}_{j}(\lambda)],\;j\in[n],
||𝑩(𝒂𝒂^)||2δ¯(λ)}λ(0,1].\displaystyle||\boldsymbol{B}(\boldsymbol{a}-\hat{\boldsymbol{a}})||_{2}\leq\overline{\delta}(\lambda)\}\;\lambda\in(0,1].

Set 𝒞(0)=(0)(0)\mathcal{C}(0)=\mathcal{B}(0)\cap\mathcal{E}(0). Observe that 𝒞(1)={𝒂^}\mathcal{C}(1)=\{\hat{\boldsymbol{a}}\}. Furthermore 𝒞(λ)\mathcal{C}(\lambda), λ[0,1]\lambda\in[0,1], is a family of nested sets, i.e. 𝒞(λ1)𝒞(λ2)\mathcal{C}(\lambda_{1})\subset\mathcal{C}(\lambda_{2}) if λ1>λ2\lambda_{1}>\lambda_{2}. By the continuity and monotonicity of a¯j(λ)\underline{a}_{j}(\lambda), a¯j(λ)\overline{a}_{j}(\lambda) and δ¯(λ)\overline{\delta}(\lambda) we get

N(𝒞(λ))=1λ,λ[0,1].{\rm N}(\mathcal{C}(\lambda))=1-\lambda,\;\;\lambda\in[0,1]. (13)

Notice also that 𝒞(λ)\mathcal{C}(\lambda) is a closed convex set for each λ[0,1]\lambda\in[0,1] and has nonempty interior for all λ[0,1)\lambda\in[0,1).

Proposition 2.

The following equality

𝒫(π)={P𝒫(n):P(𝒞(λ))1λ,λ[0,1]}\mathcal{P}(\pi)=\{{\rm P}\in\mathcal{PM}(\mathbb{R}^{n}):{\rm P}(\mathcal{C}(\lambda))\geq 1-\lambda,\lambda\in[0,1]\}

holds, where 𝒫(n)\mathcal{PM}(\mathbb{R}^{n}) is the set of all probability measures on n\mathbb{R}^{n}.

Proof.

Let 𝒫(π)={P𝒫(n):P(𝒞(λ))1λ,λ[0,1]}\mathcal{P}^{\prime}(\pi)=\{{\rm P}\in\mathcal{PM}(\mathbb{R}^{n}):{\rm P}(\mathcal{C}(\lambda))\geq 1-\lambda,\lambda\in[0,1]\}. We need to show that 𝒫(π)=𝒫(π)\mathcal{P}^{\prime}(\pi)=\mathcal{P}(\pi). Observe first that 𝒫(π)𝒫(π)\mathcal{P}(\pi)\subseteq\mathcal{P}^{\prime}(\pi). To see this choose any probability distribution P𝒫(π){\rm P}\in\mathcal{P}(\pi). Since 𝒞(λ)n\mathcal{C}(\lambda)\subseteq\mathbb{R}^{n} is an event for each λ[0,1]\lambda\in[0,1], we get P(𝒞(λ))N(𝒞(λ))=1λ{\rm P}(\mathcal{C}(\lambda))\geq{\rm N}(\mathcal{C}(\lambda))=1-\lambda for each λ[0,1]\lambda\in[0,1] (see equation (13)). In consequence P𝒫(π){\rm P}\in\mathcal{P}^{\prime}(\pi). We now show that 𝒫(π)𝒫(π)\mathcal{P}^{\prime}(\pi)\subseteq\mathcal{P}(\pi). Let P𝒫(π){\rm P}\in\mathcal{P}^{\prime}(\pi) and consider event AnA\subseteq\mathbb{R}^{n}. If 𝒞(1)={𝒂^}A\mathcal{C}(1)=\{\hat{\boldsymbol{a}}\}\not\subseteq A, then N(A)=0{\rm N}(A)=0 and P(A)N(A)=0{\rm P}(A)\geq{\rm N}(A)=0. Suppose 𝒞(1)A\mathcal{C}(1)\subseteq A and let

λ=inf{λ[0,1]:𝒞(λ)A}.\lambda^{*}=\inf\{\lambda\in[0,1]:\mathcal{C}(\lambda)\subseteq A\}.

Then P(A)P(𝒞(λ))1λ=N(𝒞(λ)).{\rm P}(A)\geq{\rm P}(\mathcal{C}(\lambda^{*}))\geq 1-\lambda^{*}={\rm N}(\mathcal{C}(\lambda^{*})). Choose λ:=λϵ\lambda^{\prime}:=\lambda^{*}-\epsilon for arbitrarily small ϵ>0\epsilon>0. As 𝒞(λ)A\mathcal{C}(\lambda^{\prime})\not\subseteq A, there is scenario 𝒂\boldsymbol{a} such that 𝒂A\boldsymbol{a}\notin A and π(𝒂)λ\pi(\boldsymbol{a})\geq\lambda^{\prime}. But 𝒂A\boldsymbol{a}\in A for each scenario 𝒂\boldsymbol{a} such that π(𝒂)λ\pi(\boldsymbol{a})\geq\lambda^{*}. Consequently sup𝒂Aπ(𝒂)[λ,λ]\sup_{\boldsymbol{a}\notin A}\pi(\boldsymbol{a})\in[\lambda^{\prime},\lambda^{*}]. Because ϵ0\epsilon\rightarrow 0, N(A)=1λ{\rm N}(A)=1-\lambda^{*}. Hence P(A)N(A){\rm P}(A)\geq{\rm N}(A) and P𝒫(π){\rm P}\in\mathcal{P}(\pi). ∎

In order to construct a tractable reformulation of (3) for 𝒫(π)\mathcal{P}(\pi) we need to use a discretization of 𝒫(π)\mathcal{P}(\pi). Choose integer 0\ell\geq 0 and set Λ={0,1,,}\Lambda=\{0,1,\dots,\ell\}. Define λi=i/\lambda_{i}=i/\ell, iΛi\in\Lambda. We consider the following ambiguity set:

𝒫(π)={P𝒫(n):P(𝒞(λi))1λi:iΛ}.\mathcal{P}^{\ell}(\pi)=\{{\rm P}\in\mathcal{PM}(\mathbb{R}^{n}):{\rm P}(\mathcal{C}(\lambda_{i}))\geq 1-\lambda_{i}:i\in\Lambda\}. (14)

Set 𝒫(π)\mathcal{P}^{\ell}(\pi) is a discrete approximation of 𝒫(π)\mathcal{P}(\pi). It is easy to verify that 𝒫(π)𝒫(π)\mathcal{P}(\pi)\subseteq\mathcal{P}^{\ell}(\pi) for any 0\ell\geq 0. By fixing sufficiently large constant \ell, we obtain arbitrarily close approximation of 𝒫(π)\mathcal{P}(\pi).

Refer to caption
Figure 2: The ambiguity set 𝒫2(π)\mathcal{P}^{2}(\pi), P(𝒞(0))1{\rm P}(\mathcal{C}(0))\geq 1, P(𝒞(0.5))0.5{\rm P}(\mathcal{C}(0.5))\geq 0.5, P(𝒞(1))0{\rm P}(\mathcal{C}(1))\geq 0
Theorem 2.

The constraint

maxP𝒫(π)𝐄P[𝒂~T𝒙]b\max_{{\rm P}\in\mathcal{P}^{\ell}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[\tilde{\boldsymbol{a}}^{T}\boldsymbol{x}]\leq b (15)

is equivalent to the following system of second-order cone constraints:

w+iΛ(λi1)vibγiδ¯(λi)+j[n]αij(a¯j(λi)a^j)+j[n]βij(a^ja¯j(λi))++𝒂^T𝒙wjivjiΛαijβij+𝑩jT𝒖i=xjiΛ,j[n]𝒖i2γiiΛαij,βij0iΛ,j[n]vi,γi0iΛw,𝒖iniΛ\begin{array}[]{llll}\displaystyle w+\sum_{i\in\Lambda}(\lambda_{i}-1)v_{i}\leq b\\ \displaystyle\gamma_{i}\overline{\delta}(\lambda_{i})+\sum_{j\in[n]}\alpha_{ij}(\overline{a}_{j}(\lambda_{i})-\hat{a}_{j})+\sum_{j\in[n]}\beta_{ij}(\hat{a}_{j}-\underline{a}_{j}(\lambda_{i}))+\\ \begin{array}[]{lll}+\hat{\boldsymbol{a}}^{T}\boldsymbol{x}\leq\displaystyle w-\sum_{j\leq i}v_{j}&i\in\Lambda\\ \alpha_{ij}-\beta_{ij}+\boldsymbol{B}^{T}_{j}\boldsymbol{u}_{i}=x_{j}&i\in\Lambda,j\in[n]\\ ||\boldsymbol{u}_{i}||_{2}\leq\gamma_{i}&i\in\Lambda\\ \alpha_{ij},\beta_{ij}\geq 0&i\in\Lambda,j\in[n]\\ v_{i},\gamma_{i}\geq 0&i\in\Lambda\\ w\in\mathbb{R},\boldsymbol{u}_{i}\in\mathbb{R}^{n}&i\in\Lambda\end{array}\end{array} (16)

where Λ={0,1,,}\Lambda=\{0,1,\dots,\ell\} and 𝐁j\boldsymbol{B}_{j} is the jjth column of 𝐁\boldsymbol{B}.

Proof.

The proof is adapted from [39]. The left hand side of (15) can be expressed as the problem of moments (see, e.g., [36, 39]):

max𝒞(λ0)𝒂T𝒙dμ(𝒂)𝒞(λ0)𝟏[𝒂𝒞(λi)]dμ(𝒂)1λiiΛ𝒞(λ0)dμ(𝒂)=1μ+(n),\begin{array}[]{lll}\max&\displaystyle\int_{\mathcal{C}(\lambda_{0})}\boldsymbol{a}^{T}\boldsymbol{x}\;{\rm d}\mu(\boldsymbol{a})\\ &\displaystyle\int_{\mathcal{C}(\lambda_{0})}\boldsymbol{1}_{[\boldsymbol{a}\in\mathcal{C}(\lambda_{i})]}\;{\rm d}\mu(\boldsymbol{a})\geq 1-\lambda_{i}&i\in\Lambda\\ &\displaystyle\int_{\mathcal{C}(\lambda_{0})}{\rm d}\mu(\boldsymbol{a})=1\\ &\mu\in\mathcal{M}_{+}(\mathbb{R}^{n}),\end{array} (17)

where +(n)\mathcal{M}_{+}(\mathbb{R}^{n}) is the set of all nonnegative measures on n\mathbb{R}^{n}. Notice that the second (equality) constraint implies μ\mu is a probability measure supported on 𝒞(λ0)\mathcal{C}(\lambda_{0}). The dual of the problem of moments takes the following form (see, e.g., [24]):

minw+iΛ(λi1)viwiΛ𝟏[𝒂𝒞(λi)]vi𝒂T𝒙𝒂𝒞(λ0)vi0iΛw\begin{array}[]{llll}\min&\displaystyle w+\sum_{i\in\Lambda}(\lambda_{i}-1)v_{i}\\ &\displaystyle w-\sum_{i\in\Lambda}\boldsymbol{1}_{[\boldsymbol{a}\in\mathcal{C}(\lambda_{i})]}v_{i}\geq\boldsymbol{a}^{T}\boldsymbol{x}&\forall\boldsymbol{a}\in\mathcal{C}(\lambda_{0})\\ &v_{i}\geq 0&i\in\Lambda\\ &w\in\mathbb{R}&\end{array} (18)

Strong duality implies that the optimal objective values of (17) and (18) are the same (see, e.g., [24, Theorem 1, Corollary 1]). Define

𝒞¯(λi)=𝒞(λi)𝒞(λi+1),i[1]\overline{\mathcal{C}}(\lambda_{i})=\mathcal{C}(\lambda_{i})\setminus\mathcal{C}(\lambda_{i+1}),\;i\in[\ell-1]

and 𝒞¯(λ)=𝒞(λ)\overline{\mathcal{C}}(\lambda_{\ell})=\mathcal{C}(\lambda_{\ell}). Hence C¯(λi)\overline{C}(\lambda_{i}), iΛi\in\Lambda, form a partition of the support 𝒞(λ0)\mathcal{C}(\lambda_{0}) into +1\ell+1 disjoint sets. Model (18) can be then rewritten as follows:

minw+iΛ(λi1)viwiΛ𝟏[𝒂𝒞(λi)]vi𝒂T𝒙𝒂𝒞¯(λi),iΛvi0iΛw\begin{array}[]{llll}\min&\displaystyle w+\sum_{i\in\Lambda}(\lambda_{i}-1)v_{i}\\ &\displaystyle w-\sum_{i\in\Lambda}\boldsymbol{1}_{[\boldsymbol{a}\in\mathcal{C}(\lambda_{i})]}v_{i}\geq\boldsymbol{a}^{T}\boldsymbol{x}&\forall\boldsymbol{a}\in\overline{\mathcal{C}}(\lambda_{i}),i\in\Lambda\\ &v_{i}\geq 0&i\in\Lambda\\ &w\in\mathbb{R}&\end{array} (19)

which is equivalent to

minw+iΛ(λi1)viwjivj𝒂T𝒙𝒂𝒞¯(λi),iΛvi0iΛw\begin{array}[]{llll}\min&\displaystyle w+\sum_{i\in\Lambda}(\lambda_{i}-1)v_{i}\\ &\displaystyle w-\sum_{j\leq i}v_{j}\geq\boldsymbol{a}^{T}\boldsymbol{x}&\forall\boldsymbol{a}\in\overline{\mathcal{C}}(\lambda_{i}),i\in\Lambda\\ &v_{i}\geq 0&i\in\Lambda\\ &w\in\mathbb{R}&\end{array} (20)

The iith constraint in (20) can be reformulated as

max𝒂𝒞¯(λi)𝒂T𝒙wjivj.\max_{\boldsymbol{a}\in\overline{\mathcal{C}}(\lambda_{i})}\boldsymbol{a}^{T}\boldsymbol{x}\leq w-\sum_{j\leq i}v_{j}. (21)

By the linearity of 𝒂T𝒙\boldsymbol{a}^{T}\boldsymbol{x} in 𝒂\boldsymbol{a}, the left hand side is maximized at the boundary of 𝒞¯(λi)\overline{\mathcal{C}}(\lambda_{i}), which coincides with the boundary of 𝒞(λi)\mathcal{C}(\lambda_{i}). Hence (21) is equivalent to

max𝒂𝒞(λi)𝒂T𝒙wjivj.\max_{\boldsymbol{a}\in\mathcal{C}(\lambda_{i})}\boldsymbol{a}^{T}\boldsymbol{x}\leq w-\sum_{j\leq i}v_{j}.

The left hand side of this inequality, by definition of 𝒞(λi)\mathcal{C}(\lambda_{i}), yields the following problem:

max𝒂T𝒙aja¯j(λi)j[n]aja¯j(λi)j[n]𝑩(𝒂𝒂^)2δ¯(λi)𝒂n\begin{array}[]{lll}\max&\boldsymbol{a}^{T}\boldsymbol{x}\\ &a_{j}\leq\overline{a}_{j}(\lambda_{i})&j\in[n]\\ &-a_{j}\leq-\underline{a}_{j}(\lambda_{i})&j\in[n]\\ &\displaystyle||\boldsymbol{B}(\boldsymbol{a}-\hat{\boldsymbol{a}})||_{2}\leq\overline{\delta}(\lambda_{i})&\\ &\boldsymbol{a}\in\mathbb{R}^{n}&\end{array}

Let us substitute 𝒚=𝒂𝒂^\boldsymbol{y}=\boldsymbol{a}-\hat{\boldsymbol{a}}, which yields:

max𝒚T𝒙+𝒂^T𝒙yja^j+a¯j(λi)j[n]yja¯j(λi)+a^jj[n]𝑩𝒚2δ¯(λi)𝒚n\begin{array}[]{lll}\max&\boldsymbol{y}^{T}\boldsymbol{x}+\hat{\boldsymbol{a}}^{T}\boldsymbol{x}\\ &y_{j}\leq-\hat{a}_{j}+\overline{a}_{j}(\lambda_{i})&j\in[n]\\ &-y_{j}\leq-\underline{a}_{j}(\lambda_{i})+\hat{a}_{j}&j\in[n]\\ &\displaystyle||\boldsymbol{B}\boldsymbol{y}||_{2}\leq\overline{\delta}(\lambda_{i})&\\ &\boldsymbol{y}\in\mathbb{R}^{n}&\end{array}

Introducing new variables 𝒛\boldsymbol{z} leads to the following model:

max𝒚T𝒙+𝒂^T𝒙yja^j+a¯j(λi)j[n]yja¯j(λi)+a^jj[n]𝑩𝒚𝒛=𝟎𝒛2δ¯(λi)𝒚,𝒛n\begin{array}[]{lll}\max&\boldsymbol{y}^{T}\boldsymbol{x}+\hat{\boldsymbol{a}}^{T}\boldsymbol{x}\\ &y_{j}\leq-\hat{a}_{j}+\overline{a}_{j}(\lambda_{i})&j\in[n]\\ &-y_{j}\leq-\underline{a}_{j}(\lambda_{i})+\hat{a}_{j}&j\in[n]\\ &\boldsymbol{B}\boldsymbol{y}-\boldsymbol{z}=\boldsymbol{0}&\\ &\displaystyle||\boldsymbol{z}||_{2}\leq\overline{\delta}(\lambda_{i})&\\ &\boldsymbol{y},\boldsymbol{z}\in\mathbb{R}^{n}&\end{array} (22)

The dual to (22) is (see the Appendix):

minγδ¯(λi)+j=1nαj(a¯j(λi)a^j)+j=1nβj(a^ja¯j(λi))++𝒂^T𝒙αjβj+𝑩jT𝒖=xjj[n]𝒖2γαj,βj,γ0j[n]𝒖n\begin{array}[]{lll}\min&\displaystyle\gamma\overline{\delta}(\lambda_{i})+\sum_{j=1}^{n}\alpha_{j}(\overline{a}_{j}(\lambda_{i})-\hat{a}_{j})+\sum_{j=1}^{n}\beta_{j}(\hat{a}_{j}-\underline{a}_{j}(\lambda_{i}))+\\ &+\hat{\boldsymbol{a}}^{T}\boldsymbol{x}\\ &\begin{array}[]{ll}\alpha_{j}-\beta_{j}+\boldsymbol{B}^{T}_{j}\boldsymbol{u}=x_{j}&j\in[n]\\ ||\boldsymbol{u}||_{2}\leq\gamma\\ \alpha_{j},\beta_{j},\gamma\geq 0&j\in[n]\\ \boldsymbol{u}\in\mathbb{R}^{n}&\end{array}\end{array} (23)

where 𝑩j\boldsymbol{B}_{j} is the jjth column of 𝑩\boldsymbol{B}. The strong duality (see Appendix) implies that (22) and (23) have the same optimal objective function values. Model (23) together with (20) yield (16). ∎

Theorem 2 leads to the following corollary:

Corollary 2.

If 𝕏\mathbb{X} is a polyhedron described by a system of linear constraints (i.e. (1) is an uncertain linear programming problem), then the deterministic counterpart of (1) with the constraints of the type (15) is a second-order cone program.

Corollary 2 shows that the resulting deterministic counterpart of uncertain linear programming problem (1), under the interval uncertainty model, is polynomially solvable (see, e.g., [7]) and thus can solve efficiently by some standard off-the-shelf solvers like IBM ILOG CPLEX [18].

Consider a sample of optimization problem (1) in which the value of the objective function a~1x1+a~2x2\tilde{a}_{1}x_{1}+\tilde{a}_{2}x_{2} is uncertain and two constraints: x12.74x_{1}\geq 2.74 and x23.3x_{2}\geq 3.3 are precisely known. The uncertainty of the coefficients a~1\tilde{a}_{1} and a~2\tilde{a}_{2} is modeled by fuzzy intervals a~1=3,2.5,2.510.32\tilde{a}_{1}=\braket{3,2.5,2.5}_{1-0.32} and a~2=2,1,111\tilde{a}_{2}=\braket{2,1,1}_{1-1}, respectively, regarded as possibility distributions for their values. A possibility distribution for uncertain deviation δ~\tilde{\delta} is prescribed by fuzzy interval δ~=0,0,61\tilde{\delta}=\braket{0,0,6}_{1}, Γ=6\Gamma=6, and 𝑩=[22.513]\boldsymbol{B}=\left[\begin{array}[]{ll}2&2.5\\ 1&-3\end{array}\right]. By (12), we obtain a joint possibility distribution π\pi for 𝒂~=(a~1,a~2)\tilde{\boldsymbol{a}}=(\tilde{a}_{1},\tilde{a}_{2}). Fix =2\ell=2, which leads to the ambiguity set 𝒫2(π)\mathcal{P}^{2}(\pi) for 𝒂~=(a~1,a~2)\tilde{\boldsymbol{a}}=(\tilde{a}_{1},\tilde{a}_{2}) shown in Figure 2, built according to (14). In particular, the set 𝒞(0)\mathcal{C}(0), being the intersection of [0.5,5.5]×[1,3][0.5,5.5]\times[1,3] and the ellipse {𝒂:𝑩(𝒂𝒂^)26}\{\boldsymbol{a}:||\boldsymbol{B}(\boldsymbol{a}-\hat{\boldsymbol{a}})||_{2}\leq 6\} contains all possible scenarios. Applying now the distributionally robust approach in the possibilistic setting gives:

minmaxP𝒫(π)𝐄P[a~1x1+a~2x2]x12.74x23.3\begin{array}[]{lll}\min&\displaystyle\max_{{\rm P}\in\mathcal{P}^{\ell}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[\tilde{a}_{1}x_{1}+\tilde{a}_{2}x_{2}]\\ &x_{1}\geq 2.74\\ &x_{2}\geq 3.3\end{array} (24)

Remark 1 shows how to convert the above problem with the uncertain objective function to the one with the precise objective function and the uncertain constraint. Theorem 2 (see Corollary 2) now leads to a second-order cone program that is equivalent to (24). An optimal solution of the program is x1=2.74x_{1}=2.74 and x2=3.3x_{2}=3.3. Let us focus on evaluating the objective function

maxP𝒫2(π)𝐄P[2.74a~1+3.3a~2].\max_{{\rm P}\in\mathcal{P}^{2}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[2.74\tilde{a}_{1}+3.3\tilde{a}_{2}].

The worst probability distribution P{\rm P} in 𝒫2(π)\mathcal{P}^{2}(\pi) assigns probability 0.5 to scenario (5.15,2.68)𝒞(0)(5.15,2.68)\in\mathcal{C}(0) and probability 0.5 to scenario (3.5,2.5)(3.5,2.5) in 𝒞(0.5)\mathcal{C}(0.5). These two points can be obtained by maximizing the linear function 2.74a1+3.3a22.74a_{1}+3.3a_{2} over the convex sets 𝒞(0)\mathcal{C}(0) and 𝒞(0.5)\mathcal{C}(0.5), respectively (see Figure 2). We thus get maxP𝒫2(π)𝐄P[2.74a~1+3.3a~2]=0.5(2.745.15+3.32.68)+0.5(2.743.5+3.32.5)=20.39\max_{{\rm P}\in\mathcal{P}^{2}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[2.74\tilde{a}_{1}+3.3\tilde{a}_{2}]=0.5\cdot(2.74\cdot 5.15+3.3\cdot 2.68)+0.5\cdot(2.74\cdot 3.5+3.3\cdot 2.5)=20.39.

5 Risk aversion modeling

In this section we will show how the ambiguity set 𝒫(π)\mathcal{P}^{\ell}(\pi) can be relaxed to take individual decision maker risk aversion into account. Observe that the lower bounds on the probabilities for the sets 𝒞(λi)\mathcal{C}(\lambda_{i}) in 𝒫π\mathcal{P}_{\pi}^{\ell} change uniformly. In consequence, a worst probability distribution uniformly assigns probabilities to points on the boundaries of 𝒞(λ0),,𝒞(λ1)\mathcal{C}(\lambda_{0}),\dots,\mathcal{C}(\lambda_{\ell-1}). An example is shown in Figure 2, where (for =2\ell=2) probability 0.5 is assigned to points on the boundaries of 𝒞(0)\mathcal{C}(0) and 𝒞(0.5)\mathcal{C}(0.5). It has been observed that risk averse decision makers can perceive worst coefficient realizations as more probable (see, e.g., [33, 12]). We now propose an method of taking this risk-aversion into account. Let

gρ(z)=11ρ(1ρz)g_{\rho}(z)=\frac{1}{1-\rho}(1-\rho^{z})

for ρ(0,1)\rho\in(0,1) and z[0,1]z\in[0,1]. The function gρ(z)g_{\rho}(z) is continuous, concave, increasing in [0,1][0,1] and such that gρ(0)=0g_{\rho}(0)=0 and gρ(1)=1g_{\rho}(1)=1. If ρ0\rho\rightarrow 0, then gρ(z)1g_{\rho}(z)\rightarrow 1 for each z(0,1]z\in(0,1]. On the other hand, if ρ1\rho\rightarrow 1, then gρ(z)zg_{\rho}(z)\rightarrow z for each z[0,1]z\in[0,1], so gρ(z)g_{\rho}(z) tends to a linear function (see Fig. 3).

Refer to caption
Figure 3: Function gρ(z)=11ρ(1ρz)g_{\rho}(z)=\frac{1}{1-\rho}(1-\rho^{z}).

Fix ρ(0,1)\rho\in(0,1) and consider the following ambiguity set:

𝒫,ρ(π)={P𝒫(n):P(𝒞(λi))1gρ(λi),iΛ}.\mathcal{P}^{\ell,\rho}(\pi)=\left\{{\rm P}\in\mathcal{PM}(\mathbb{R}^{n}):{\rm P}(\mathcal{C}(\lambda_{i}))\geq 1-g_{\rho}(\lambda_{i}),\;i\in\Lambda\right\}.

Since 1λi1gρ(λi)1-\lambda_{i}\geq 1-g_{\rho}(\lambda_{i}), we still get a valid bound P(𝒞(λi))N(𝒞(λi))=1λi1gα(λi){\rm P}(\mathcal{C}(\lambda_{i}))\geq{\rm N}(\mathcal{C}(\lambda_{i}))=1-\lambda_{i}\geq 1-g_{\alpha}(\lambda_{i}). Notice that the lower bound can be now smaller (i.e. larger probabilities can be assigned to worse scenarios), which reflects the probability distortion.

As 𝒫π𝒫π,ρ\mathcal{P}_{\pi}^{\ell}\subseteq\mathcal{P}^{\ell,\rho}_{\pi} for each ρ(0,1)\rho\in(0,1) and 𝒫π,ρ𝒫π,ρ\mathcal{P}^{\ell,\rho}_{\pi}\subseteq\mathcal{P}^{\ell,\rho^{\prime}}_{\pi} if ρρ\rho\geq\rho^{\prime}, decreasing ρ\rho leads to more conservative constraint (3). If ρ1\rho\rightarrow 1, then 𝒫π,ρ=𝒫π\mathcal{P}^{\ell,\rho}_{\pi}=\mathcal{P}_{\pi}^{\ell}. On the other hand, if ρ0\rho\rightarrow 0, then 𝒫π,ρ\mathcal{P}^{\ell,\rho}_{\pi} is the set of all probability measures with support 𝒞(0)\mathcal{C}(0), and the constraint (3) becomes

max𝒂𝒞(0)𝒂T𝒙b,\max_{\boldsymbol{a}\in\mathcal{C}(0)}\boldsymbol{a}^{T}\boldsymbol{x}\leq b,

which is equivalent to the strict robust constraint (2). Theorem 2 remains valid for the ambiguity set 𝒫π,ρ\mathcal{P}^{\ell,\rho}_{\pi}. It is enough to replace the constant λi\lambda_{i} with the constant gρ(λi)g_{\rho}(\lambda_{i}) in the first inequality in (16).

6 Application to portfolio selection

In this section we will apply the model constructed in Section 4 (see also Section 5) to a portfolio selection problem. We are given a set of nn assets whose returns form a random vector 𝒂~=(a~1,,a~n)\tilde{\boldsymbol{a}}=(\tilde{a}_{1},\dots,\tilde{a}_{n}). A portfolio is a vector 𝒙[0,1]n\boldsymbol{x}\in[0,1]^{n}, where xix_{i} is the share of the iith asset. Then 𝒂~T𝒙\tilde{\boldsymbol{a}}^{T}\boldsymbol{x} is the uncertain return of portfolio 𝒙\boldsymbol{x} (accordingly, the quantity 𝒂~T𝒙-\tilde{\boldsymbol{a}}^{T}\boldsymbol{x} is the uncertain loss for 𝒙\boldsymbol{x}). The probability distribution for 𝒂~\tilde{\boldsymbol{a}} is unknown. However, we have a sample of past realizations 𝒂1,,𝒂K\boldsymbol{a}_{1},\dots,\boldsymbol{a}_{K} of 𝒂~\tilde{\boldsymbol{a}}. Using this sample we can estimate the mean return 𝒂^\hat{\boldsymbol{a}} and the covariance matrix 𝚺\boldsymbol{\Sigma} for 𝒂~\tilde{\boldsymbol{a}}. The sample mean and covariance matrix for 7 assets, computed for a sample of 30 subsequent observations in Polish stock market, are

𝒂^=[0.057,0.378,0.324,0.799,0.873,0.271,0.323]\hat{\boldsymbol{a}}=[0.057,-0.378,0.324,-0.799,-0.873,-0.271,-0.323]
𝚺=[7.4690.1490.0990.0762.2250.0441.6490.9670.8650.5781.5580.0530.1433.7140.4541.2651.1880.3202.1880.5290.1520.52518.1681.5614.55812.7451.3915.371]{\footnotesize\boldsymbol{\Sigma}=\left[\begin{array}[]{rrrrrrrr}7.469&0.149&0.099&0.076&2.225&0.044&1.649\\ &0.967&0.865&-0.578&-1.558&0.053&-0.143\\ &&3.714&-0.454&-1.265&1.188&0.320\\ &&&2.188&-0.529&-0.152&0.525\\ &&&&18.168&-1.561&4.558\\ &&&&&12.745&1.391\\ &&&&&&5.371\end{array}\right]}

We fix 𝑩=𝚺12\boldsymbol{B}=\boldsymbol{\Sigma}^{\frac{1}{2}}, =100\ell=100, a¯j=a¯j=6σj\underline{a}_{j}=\overline{a}_{j}=6\cdot\sigma_{j}, where σj=Σjj\sigma_{j}=\sqrt{\Sigma_{jj}} for each j[n]j\in[n]. Using Chebyshev’s inequality one can show that a~j[a^ja¯j,a^j+a¯j]\tilde{a}_{j}\in[\hat{a}_{j}-\underline{a}_{j},\hat{a}_{j}+\overline{a}_{j}] with high probability. We now use the fuzzy intervals a^j,a¯j,a¯j11\braket{\hat{a}_{j},\underline{a}_{j},\overline{a}_{j}}_{1-1} to model the possibility distributions for the asset returns. We use linear membership functions. One can, however, further refine the information for the return values using different shapes, i.e. changing the parameters z1z_{1} and z2z_{2}. We use the fuzzy interval 0,Γ1\braket{0,\Gamma}_{1} for Γ[0,50]\Gamma\in[0,50] to model the possibility distribution for the deviation δ~\tilde{\delta}. We consider the following problem:

minmaxP𝒫(π)𝐄P[𝒂~T𝒙]𝟏T𝒙=1𝒙𝟎\begin{array}[]{lllll}\displaystyle\min&\displaystyle\max_{{\rm P}\in\mathcal{P}^{\ell}(\pi)}\mathrm{\mathbf{E}}_{\rm P}[-\tilde{\boldsymbol{a}}^{T}\boldsymbol{x}]\\ &\boldsymbol{1}^{T}\boldsymbol{x}=1\\ &\boldsymbol{x}\geq\boldsymbol{0}\end{array} (25)
Refer to caption
Figure 4: Optimal objective values for various Γ\Gamma.

Figure 4 shows the optimal objective value of (25) for Γ=0,1,,50\Gamma=0,1,\dots,50. In the first boundary case, Γ=0\Gamma=0, there is no uncertainty and in the optimal portfolio all is allocated to the asset with the largest expected return (smallest expected loss). In the second boundary case, when Γ48\Gamma\geq 48, there is no limit on deviation from 𝒂^\hat{\boldsymbol{a}} and all is allocated to the asset with the smallest a^j+a¯j\hat{a}_{j}+\overline{a}_{j}. For the intermediate values of Γ\Gamma, we get a family of various portfolios in which a diversification is profitable. One such portfolio is shown in Figure 4.

Figure 5 shows the effect of taking the individual risk aversion into account. In this figure the optimal objective value of (25) for ρ(0,1)\rho\in(0,1) is shown (we fix Γ=20\Gamma=20). For smaller values of ρ\rho the objective value of (25) is larger. One can see that the optimal portfolio is adjusted (see Figure 5) to take larger sets of admissible probability distributions into account.

Refer to caption
Figure 5: Optimal objective values for various ρ\rho and Γ=20\Gamma=20.

7 Conclusions

In this paper we have proposed a method of unifying the fuzzy (possibilistic) optimization with the distributionally robust approach. Using the connections between the possibility and probability theories one can build a set of probability distributions for uncertain constraint coefficient, based on a possibilistic information. The uncertain constraint, the left-hand side, can be then replaced with the expected value with respect to the worst probability distribution that can occur. In practical applications the true probability distributions for the uncertain data are often unknown. However, in most cases some information about the possible scenarios (parameter realizations) is available. In this case the possibility theory is an attractive choice, because it is flexible and allows to take both available data and experts’ opinions into account. In this paper we have proposed some methods of defining possibility distributions for scenarios, which is easy to apply in practice. Furthermore, the resulting deterministic counterpart of the problem can be solved efficiently for a large class of problems (for example for linear programming).

The key element of our model is the definition of a joint possibility distribution in scenario set. This step is quite flexible, as there are only minor restrictions on possibility distribution function which must be imposed. The support of this possibility distribution should contain all possible scenarios and a budget should control the magnitude of deviations of scenarios from some nominal (expected) one. We believe that such possibility distribution can be a good tool to handle the uncertainty in optimization problems.

References

  • [1] C. Baudrit and D. Dubois. Practical representations of incomplete probabilistic knowledge. Computational Statistics and Data Analysis, 51:86–108, 2006.
  • [2] A. Ben-Tal, L. El Ghaoui, and A. Nemirovski. Robust optimization. Princeton Series in Applied Mathematics. Princeton University Press, Princeton, NJ, 2009.
  • [3] A. Ben-Tal and A. Nemirovski. Robust solutions of uncertain linear programs. Operation Research Letters, 25:1–13, 1999.
  • [4] D. Bertsimas, V. Gupta, and N. Kallus. Data-deriven robust optimization. Mathematical Programming, 167:235–292, 2017.
  • [5] D. Bertsimas and M. Sim. The price of robustness. Operations research, 52:35–53, 2004.
  • [6] D. Bertsimas and M. Sim. Robust discrete optimization under ellipsoidal uncertainty sets. Technical report, MIT, 2004.
  • [7] S. Boyd and L. Vandenberghe. Covex optimization. Cambridge University Press, 2008.
  • [8] A. Charnes and W. Cooper. Chance-constrainted programming. Management Science, 6(73–79), 1959.
  • [9] G. De Cooman and D. Aeyels. Supremum-preserving upper probabilities. Information Sciences, 118:173–212, 1999.
  • [10] E. Delage and Y. Ye. Distributionally robust optimization under moment uncertainty with application to data-deriven problems. Operations Research, 58:595–612, 2010.
  • [11] S. Destercke, D. Dubois, and E. Chojnacki. A consonant approximation of the product of independent consonant random sets. International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems, 17:773–792, 2009.
  • [12] E. Diecidue and P. P. Wakker. On the intuition of rank-depedent utility. The Journal of Risk and Uncertainty, 23:281–298, 2001.
  • [13] D. Dubois. Possibility theory and statistical reasoning. Computational Statistics and Data Analysis, 51:47–69, 2006.
  • [14] D. Dubois and H. Prade. Possibility theory: an approach to computerized processing of uncertainty. Plenum Press, New York, 1988.
  • [15] D. Dubois and H. Prade. When upper probabilities are possibility measures. Fuzzy Sets and Systems, 49:65–74, 1992.
  • [16] J. Goh and M. Sim. Distributionally robust optimization and its tractable approximations. Operations Research, 58:902–917, 2010.
  • [17] R. Guillaume, A. Kasperski, and P. Zieliński. Distributionally robust optimization in possibilistic setting. In 2021 IEEE International Conference on Fuzzy Systems (FUZZ-IEEE), pages 1–6, 2021.
  • [18] IBM ILOG CPLEX Optimization Studio. CPLEX User’s manual. https://www.ibm.com.
  • [19] M. Inuiguchi and J. Ramik. Possibilistic linear programming: a brief review of fuzzy mathematical programming and a comparison with stochastic programming in portfolio selection problem. Fuzzy Sets and Systems, 111:3–28, 2000.
  • [20] M. Inuiguchi, J. Ramík, T. Tanino, and M. Vlach. Satisficing solutions and duality in interval and fuzzy linear programming. Fuzzy Sets and Systems, 135:151–177, 2003.
  • [21] P. Kall and J. Mayer. Stochastic linear programming. Models, theory and computation. Springer, 2005.
  • [22] A. Kasperski and P. Zieliński. Robust Discrete Optimization Under Discrete and Interval Uncertainty: A Survey. In Robustness Analysis in Decision Aiding, Optimization, and Analytics, pages 113–143. Springer-Verlag, 2016.
  • [23] A. Kasperski and P. Zieliński. Soft robust solutions to possibilistic optimization problems. Fuzzy Sets and Systems, 422:130–148, 2021.
  • [24] E. d. Klerk and M. Laurent. A survey of semidefinite programming approaches to the generalized problem of moments and their error analysis. In C. Araujo, G. Benkart, C. E. Praeger, and B. Tanbay, editors, World Women in Mathematics 2018. Association for Women in Mathematics Series, pages 17–56. Springer, 2019.
  • [25] P. Kouvelis and G. Yu. Robust Discrete Optimization and its Applications. Kluwer Academic Publishers, 1997.
  • [26] Y.-J. Lai and C.-L. Hwang. Fuzzy Mathematical Programming. Springer-Verlag, 1992.
  • [27] B. Liu. Fuzzy random chance-constrained programming. IEEE Transactions on Fuzzy Systems, 9:713–720, 2001.
  • [28] W. A. Lodwick and J. Kacprzyk, editors. Fuzzy Optimization - Recent Advances and Applications, volume 254 of Studies in Fuzziness and Soft Computing. Springer-Verlag, 2010.
  • [29] E. Nasrabadi and J. B. Orlin. Robust optimization with incremental recourse. CoRR, abs/1312.4075, 2013.
  • [30] C. H. Papadimitriou and K. Steiglitz. Combinatorial optimization: algorithms and complexity. Dover Publications Inc., 1998.
  • [31] M. S. Pishvaee, J. Razmin, and S. A. Torabi. Robust possibilistic programming for socially responsible supply chain network design: A new approach. Fuzzy Sets and Systems, 206:1–20, 2012.
  • [32] E. Quaeghebeur, K. Shariatmadar, and G. De Cooman. Constrained optimization problems under uncertainty with coherent lower previsions. Fuzzy Sets and Systems, 206:74–88, 2012.
  • [33] J. Quiggin. A theory of anticipated utility. Journal of Economic Behaviour and Organization, 3:323–343, 1982.
  • [34] J. Ramík. Duality in fuzzy linear programming with possibility and necessity relations. Fuzzy Sets Systems, 157:1283–1302, 2006.
  • [35] J. Ramík and M. Vlach. Generalized concavity in fuzzy optimization and decision analysis. Kluwer Academic Publishers, 2002.
  • [36] A. Shapiro. On duality theory of conic linear problems. In M. Á. Goberna and M. A. López, editors, Semi-infinite programming, chapter 7, pages 135–165. Kluwer Academic Publishers, 2001.
  • [37] R. Słowiński and J. Teghem, editors. Stochastic Versus Fuzzy Approaches to Multiobjective Mathematical Programming under Uncertainty. Kluwer Academic Publishers, 1990.
  • [38] M. Troffaes, E. Miranda, and S. Destercke. On the connection between probability boxes and possibility measures. Information Sciences, 224:88–108, 2013.
  • [39] W. Wiesemann, D. Kuhn, and M. Sim. Distributionally robust convex optimization. Operations Research, 62:1358–1376, 2014.

Appendix

The proof of (23).

Let us rewrite (22) as follows:

min𝒚T𝒙𝒂^T𝒙yja^j+a¯j(λi)j[n]yja¯j(λi)+a^jj[n]𝑩𝒚𝒛=𝟎𝒛2δ¯(λi)𝒚,𝒛n\begin{array}[]{lll}\min&-\boldsymbol{y}^{T}\boldsymbol{x}-\hat{\boldsymbol{a}}^{T}\boldsymbol{x}\\ &y_{j}\leq-\hat{a}_{j}+\overline{a}_{j}(\lambda_{i})&j\in[n]\\ &-y_{j}\leq-\underline{a}_{j}(\lambda_{i})+\hat{a}_{j}&j\in[n]\\ &\boldsymbol{B}\boldsymbol{y}-\boldsymbol{z}=\boldsymbol{0}&\\ &\displaystyle||\boldsymbol{z}||_{2}\leq\overline{\delta}(\lambda_{i})&\\ &\boldsymbol{y},\boldsymbol{z}\in\mathbb{R}^{n}&\end{array} (26)

and write the Lagrangian function

L(𝒚,𝒛,𝜶,𝜷,𝒖,γ)=𝒚T𝒙𝒂^T𝒙+j[n]αj(yj+a^ja¯j(λi))+j[n]βj(yj+a¯j(λi)a^j)+(𝑩𝒚𝒛)T𝒖+γ(𝒛2δ¯(λi)),\begin{array}[]{lllll}L(\boldsymbol{y},\boldsymbol{z},\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{u},\gamma)=&-\boldsymbol{y}^{T}\boldsymbol{x}-\hat{\boldsymbol{a}}^{T}\boldsymbol{x}+\\ &\displaystyle\sum_{j\in[n]}\alpha_{j}(y_{j}+\hat{a}_{j}-\overline{a}_{j}(\lambda_{i}))+\\ &\displaystyle\sum_{j\in[n]}\beta_{j}(-y_{j}+\underline{a}_{j}(\lambda_{i})-\hat{a}_{j})+\\ &(\boldsymbol{B}\boldsymbol{y}-\boldsymbol{z})^{T}\boldsymbol{u}+\gamma(||\boldsymbol{z}||_{2}-\overline{\delta}(\lambda_{i})),\end{array}

where αj,βj,γ0\alpha_{j},\beta_{j},\gamma\geq 0 and 𝒖n\boldsymbol{u}\in\mathbb{R}^{n} are the Lagrangian multipliers. After rearranging the terms we get

L(𝒚,𝒛,𝜶,𝜷,𝒖,γ)=𝒂^T𝒙+j[n](xj+αjβj+𝑩jT𝒖)yj+(γ𝒛2𝒛T𝒖)j[n]αj(a^ja¯j(λi))+j[n]βj(a¯j(λi)a^j)γδ¯(λi).\begin{array}[]{lll}L(\boldsymbol{y},\boldsymbol{z},\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{u},\gamma)=&-\hat{\boldsymbol{a}}^{T}\boldsymbol{x}+\\ &\displaystyle\sum_{j\in[n]}(-x_{j}+\alpha_{j}-\beta_{j}+\boldsymbol{B}^{T}_{j}\boldsymbol{u})y_{j}+\\ &(\gamma||\boldsymbol{z}||_{2}-\boldsymbol{z}^{T}\boldsymbol{u})-\\ &\displaystyle\sum_{j\in[n]}\alpha_{j}(\hat{a}_{j}-\overline{a}_{j}(\lambda_{i}))+\\ &\displaystyle\sum_{j\in[n]}\beta_{j}(\underline{a}_{j}(\lambda_{i})-\hat{a}_{j})-\gamma\overline{\delta}(\lambda_{i}).\end{array}

Problem (26) is convex and the set of feasible solutions is bounded and has nonempty interior. Hence it has an optimal solution with the finite objective function value vv^{*}. The problem also satisfies the strong duality (see, e.g., [7]), i.e.

max𝜶𝟎,𝜷𝟎,𝒖,γ0min𝒚,𝒛L(𝒚,𝒛,𝜶,𝜷,𝒖,γ)=v.\max_{\boldsymbol{\alpha}\geq\boldsymbol{0},\boldsymbol{\beta}\geq\boldsymbol{0},\boldsymbol{u},\gamma\geq 0}\min_{\boldsymbol{y},\boldsymbol{z}}L(\boldsymbol{y},\boldsymbol{z},\boldsymbol{\alpha},\boldsymbol{\beta},\boldsymbol{u},\gamma)=v^{*}. (27)

Since yjy_{j}, j[n]j\in[n], are unrestricted, the equality:

xj+αjβj+𝑩jT𝒖=0,j[n]-x_{j}+\alpha_{j}-\beta_{j}+\boldsymbol{B}^{T}_{j}\boldsymbol{u}=0,\;j\in[n] (28)

must hold. We now show that min𝒛(γ𝒛2𝒛T𝒖)=0\min_{\boldsymbol{z}}(\gamma||\boldsymbol{z}||_{2}-\boldsymbol{z}^{T}\boldsymbol{u})=0 if 𝒖2γ||\boldsymbol{u}||_{2}\leq\gamma and -\infty, otherwise. Assume, that 𝒖2γ||\boldsymbol{u}||_{2}\leq\gamma. By Cauchy-Shwartz inequality, 𝒛T𝒖𝒛2𝒖2γ𝒛2\boldsymbol{z}^{T}\boldsymbol{u}\leq||\boldsymbol{z}||_{2}||\boldsymbol{u}||_{2}\leq\gamma||\boldsymbol{z}||_{2}. Hence γ𝒛2𝒛T𝒖0\gamma||\boldsymbol{z}||_{2}-\boldsymbol{z}^{T}\boldsymbol{u}\geq 0 and min𝒛(γ𝒛2𝒛T𝒖)=0\min_{\boldsymbol{z}}(\gamma||\boldsymbol{z}||_{2}-\boldsymbol{z}^{T}\boldsymbol{u})=0. Assume that 𝒖2>γ||\boldsymbol{u}||_{2}>\gamma and take 𝒛=s𝒖\boldsymbol{z}=s\boldsymbol{u} for some s>0s>0. Then γ𝒛2𝒛T𝒖=γs𝒖2s𝒖T𝒖=s𝒖2(γ𝒖2)\gamma||\boldsymbol{z}||_{2}-\boldsymbol{z}^{T}\boldsymbol{u}=\gamma s||\boldsymbol{u}||_{2}-s\boldsymbol{u}^{T}\boldsymbol{u}=s||\boldsymbol{u}||_{2}(\gamma-||\boldsymbol{u}||_{2}), which can be made arbitrarily small by increasing ss. Hence, it must be

𝒖2γ.||\boldsymbol{u}||_{2}\leq\gamma. (29)

Now, using (28) and (29), the problem (21) can be rewritten as

max𝒂^T𝒙+j=1nαj(a^ja¯j(λi))+j=1nβj(a¯j(λi)a^j)γδ¯(λi)xj+αjβj+𝑩jT𝒖=0j[n]𝒖2γαj,βj,γ0j[n]𝒖n\begin{array}[]{lllll}\max&\displaystyle-\hat{\boldsymbol{a}}^{T}\boldsymbol{x}+\sum_{j=1}^{n}\alpha_{j}(\hat{a}_{j}-\overline{a}_{j}(\lambda_{i}))+\sum_{j=1}^{n}\beta_{j}(\underline{a}_{j}(\lambda_{i})-\hat{a}_{j})-\\ &\gamma\overline{\delta}(\lambda_{i})\\ &\begin{array}[]{lll}-x_{j}+\alpha_{j}-\beta_{j}+\boldsymbol{B}^{T}_{j}\boldsymbol{u}=0&j\in[n]\\ ||\boldsymbol{u}||_{2}\leq\gamma&\\ \alpha_{j},\beta_{j},\gamma\geq 0&j\in[n]\\ \boldsymbol{u}\in\mathbb{R}^{n}&\end{array}\end{array}

which is equivalent to (23). ∎