Preference Robust Modified Optimized Certainty Equivalent
Abstract
Ben-Tal and Teboulle [6] introduce the concept of optimized certainty equivalent (OCE) of an uncertain outcome as the maximum present value of a combination of the cash to be taken out from the uncertain income at present and the expected utility value of the remaining uncertain income. In this paper, we consider two variations of the OCE. First, we introduce a modified OCE by maximizing the combination of the utility of the cash and the expected utility of the remaining uncertain income so that the combined quantity is in a unified utility value. Second, we consider a situation where the true utility function is unknown but it is possible to use partially available information to construct a set of plausible utility functions. To mitigate the risk arising from the ambiguity, we introduce a robust model where the modified OCE is based on the worst-case utility function from the ambiguity set. In the case when the ambiguity set of utility functions is constructed by a Kantorovich ball centered at a nominal utility function, we show how the modified OCE and the corresponding worst-case utility function can be identified by solving two linear programs alternately. We also show that the robust modified OCE is statistically robust in a data-driven environment where the underlying data are potentially contaminated. Some preliminary numerical results are reported to demonstrate the performance of the modified OCE and the robust modified OCE model.
Keywords. Robust modified optimized certainty equivalent, ambiguity of utility function, Kantorovich ball, piecewise linear approximation, error bounds, statistical robustness
1 Introduction
Let $(\Omega,\mathcal{F},\mathbb{P})$ be a probability space with $\sigma$-algebra $\mathcal{F}$ and probability measure $\mathbb{P}$, and let $\xi:\Omega\to\mathbb{R}$ be a random variable representing the future income of a decision maker (DM). The optimized certainty equivalent of $\xi$ is defined as
(1.1) $S_u(\xi) := \sup_{x\in\mathbb{R}} \left\{ x + \mathbb{E}_P[u(\xi-x)] \right\}$,
where $u$ is the decision maker’s utility function and $P := \mathbb{P}\circ\xi^{-1}$ is the probability measure on $\mathbb{R}$ induced by $\xi$. The concept was first introduced by Ben-Tal and Teboulle [6] and is closely related to other notions of certainty equivalent and risk measures; see [7] for a comprehensive discussion. The economic interpretation of this notion is that the decision maker may need to consume part of $\xi$ at present, denoted by $x$; the sure present value of $\xi$ under the consumption plan then becomes $x + \mathbb{E}_P[u(\xi-x)]$, and the optimized certainty equivalent gives rise to the optimal allocation of the consumption which maximizes the sure present value of $\xi$. As a measure, it enjoys a number of nice properties including constancy ($S_u(c)=c$ for constant $c$), risk aversion ($S_u(\xi)\le \mathbb{E}_P[\xi]$) and translation invariance ($S_u(\xi+c)=S_u(\xi)+c$). In particular, if $u$ is a normalized exponential utility function, it coincides with the classical certainty equivalent in the literature of economics. Moreover, if where and for , then the OCE effectively recovers the conditional value-at-risk (CVaR):
(1.2) | |||||
The last equality is Rockafellar and Uryasev’s formulation of CVaR; see [7, 41]. Since CVaR is an average of quantiles, it is also known as average value-at-risk (AVaR), tail value-at-risk (TVaR) and expected shortfall; see [38, 37].
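The Rockafellar and Uryasev formulation can be checked numerically on a small sample. The sketch below is only an illustration under stated assumptions: it uses the loss convention $\mathrm{CVaR}_\beta(L)=\min_a\{a+\frac{1}{1-\beta}\mathbb{E}[(L-a)_+]\}$ with confidence level $\beta$, which may differ in sign and parameterization from the convention in (1.2), and the sample and the function names are hypothetical.

```python
def cvar_rockafellar_uryasev(losses, beta):
    """CVaR of an equally weighted loss sample at confidence level beta.

    The objective a + E[(L - a)_+] / (1 - beta) is convex and piecewise
    linear in a with kinks only at the sample points, so it is enough to
    evaluate it at the sample points.
    """
    n = len(losses)

    def objective(a):
        excess = sum(max(l - a, 0.0) for l in losses) / n
        return a + excess / (1.0 - beta)

    return min(objective(a) for a in losses)


def tail_average(losses, beta):
    """Average of the worst (1 - beta) fraction of the sample.

    Assumes (1 - beta) * len(losses) is an integer.
    """
    k = round((1.0 - beta) * len(losses))
    worst = sorted(losses)[-k:]
    return sum(worst) / k


losses = [float(i) for i in range(1, 11)]   # hypothetical loss sample 1, ..., 10
cvar = cvar_rockafellar_uryasev(losses, 0.8)
avg = tail_average(losses, 0.8)
```

For this sample both quantities equal the mean of the two largest losses, which illustrates why CVaR is also called the average value-at-risk.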
In this paper, we revisit the OCE from two perspectives. One is to consider a modified version of the optimized certainty equivalent (MOCE),
(1.3) $M_u(\xi) := \sup_{x\in\mathbb{R}} \left\{ u(x) + \mathbb{E}_P[u(\xi-x)] \right\}$.
The modification is motivated by aligning the sure present value of $\xi$ with expected utility theory [35]: it uses the utility of present consumption, $u(x)$, instead of the monetary value $x$. Recall that in von Neumann-Morgenstern expected utility theory [35], the utility function is used to represent the decision maker’s preference relation over a prospect space including both random and deterministic prospects, and such a representation is unique up to positive linear transformation. This means we can use both $u$ and $au$ with $a>0$ to represent the DM’s preference. However, the two utility functions would lead to completely different optimal values and optimal solutions in the OCE model. In contrast, the optimal solution is not affected in the MOCE model, and the optimal value is only scaled by the same factor as the utility function. In our view, this kind of “invariance” of the optimal allocation and “scalability” w.r.t. the utility function is important because the optimal decision on the allocation/consumption should be determined by the DM’s risk preference irrespective of its equivalent representations.
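The invariance claim can be illustrated numerically. The following sketch rests on illustrative assumptions that are not taken from the paper: a two-point income distribution, the exponential utility $u(t)=1-e^{-t}$, the equivalent representation $v=2u+3$, and a plain grid search over the consumption. It compares the maximizers of the OCE and MOCE objectives under $u$ and $v$.

```python
import math

xi = [1.0, 3.0]                                   # hypothetical equally likely incomes
grid = [i / 1000.0 for i in range(-2000, 4001)]   # candidate consumptions x


def u(t):
    return 1.0 - math.exp(-t)                     # concave, increasing, u(0) = 0


def v(t):
    return 2.0 * u(t) + 3.0                       # equivalent representation a*u + b


def expected(util, x):
    return sum(util(z - x) for z in xi) / len(xi)


def oce_argmax(util):
    # maximizer of x + E[util(xi - x)] over the grid
    return max(grid, key=lambda x: x + expected(util, x))


def moce(util):
    # maximizer and value of util(x) + E[util(xi - x)] over the grid
    x_star = max(grid, key=lambda x: util(x) + expected(util, x))
    return x_star, util(x_star) + expected(util, x_star)


x_oce_u, x_oce_v = oce_argmax(u), oce_argmax(v)
(x_moce_u, val_u), (x_moce_v, val_v) = moce(u), moce(v)
```

In this example the OCE maximizer moves when $u$ is replaced by $v$, whereas the MOCE maximizer stays put and the MOCE value transforms as $2\,M_u(\xi)+6$, that is, by the scale factor plus twice the constant shift.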
The modified OCE model may be regarded as a special case of the well-known consumption/investment models in economics [35, 19, 12] where $u(x)$ is the utility of the current consumption/investment whereas $\mathbb{E}_P[u(\xi-x)]$ is the expected utility of the remaining asset to be consumed/invested in future. In these models, the utility functions for the current consumption and future consumption are identical. It is also possible to use different utility functions when the consumption at present is used for a new investment or production.
The other is to consider a situation where the decision maker’s utility function is ambiguous; in other words, there is incomplete information to identify a utility function which captures the decision maker’s true utility preference. Consequently we propose to consider a robust modified optimized certainty equivalent (RMOCE) measure
(1.4) $R(\xi) := \sup_{x\in\mathbb{R}} \inf_{u\in\mathcal{U}} \left\{ u(x) + \mathbb{E}_P[u(\xi-x)] \right\}$,
where $\mathcal{U}$ is a set of plausible utility functions consistent with the observed utility preferences of the decision maker. The definition is in line with the philosophy of robust optimization where the optimized certainty equivalent value is based on the worst-case utility function from set $\mathcal{U}$ to mitigate the risk arising from potential inaccurate use or misuse of the utility function. By convention, we call $\mathcal{U}$ the ambiguity set. In the case that $\mathcal{U}$ is a singleton, RMOCE reduces to MOCE. Note that RMOCE should be differentiated from the distributionally robust formulation of OCE by Wiesemann et al. [50] where the focus is on the ambiguity of $P$. In decision analysis, $P$ is known as a decision maker’s belief of the state of nature whereas $u$ characterizes the decision maker’s taste for risk/utility. The RMOCE model concerns the ambiguity of the decision maker’s taste rather than belief.
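To make the sup-inf structure concrete, here is a minimal sketch in which the ambiguity set is replaced by a small finite family of exponential utilities and the supremum is approximated by a grid search. The family, the income distribution and the grid are all hypothetical choices made for illustration; they are not the Kantorovich ball studied later in the paper.

```python
import math

xi = [1.0, 3.0]                                   # hypothetical equally likely incomes
grid = [i / 1000.0 for i in range(0, 3001)]       # candidate consumptions x in [0, 3]

# Hypothetical finite ambiguity set: exponential utilities with different
# degrees of risk aversion a.
utilities = [lambda t, a=a: (1.0 - math.exp(-a * t)) / a for a in (0.5, 1.0, 2.0)]


def objective(util, x):
    return util(x) + sum(util(z - x) for z in xi) / len(xi)


def moce(util):
    return max(objective(util, x) for x in grid)


def rmoce(utils):
    # sup over x of the worst-case (inf over u) objective
    return max(min(objective(util, x) for util in utils) for x in grid)


robust_value = rmoce(utilities)
single_values = [moce(util) for util in utilities]
```

The robust value can never exceed the smallest of the individual MOCE values, and for a singleton ambiguity set the two computations coincide, in line with the remark that RMOCE reduces to MOCE in that case.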
Ambiguity of utility preference is a well-discussed topic in behavioural economics. For instance, Thurstone [44] regards such ambiguity as a lack of accurate description of human behaviour. Karmarkar [27] and Weber [49] ascribe the ambiguity to cognitive difficulty and incomplete information. The ambiguity may also arise in decision making problems which involve several stakeholders who fail to reach a consensus. Parametric and non-parametric approaches have subsequently been proposed to assess the true utility function, including discrete choice models (Train [45]) and standard and paired gambling approaches for preference comparisons and certainty equivalence (Farquhar [15]); we refer readers to Hu et al. [23] for an excellent overview on this.
In decision making under uncertainty, a decision maker may choose the worst case utility function among a set of plausible utility functions representing his/her risk preference to mitigate the overall risk. This kind of research may be traced back to Maccheroni [33]. Cerreia-Vioglio et al. [10] seem to be the first to investigate ambiguity of decision maker’s utility function in the certainty equivalent model by considering the worst-case certainty equivalent from a given set of utility functions in their cautious expected utility model. They show that the DM’s risk preference can be represented by a worst-case certainty equivalent if and only if they are given by a binary relation satisfying the weak order, continuity, weak monotonicity and negative certainty independence (NCI) (NCI states that if a sure outcome is not enough to compensate the DM for a risky prospect, then its mixture with another lottery which reduces the certainty appeal, will not be more attractive than the same mixture of the risky prospect and the lottery).
Armbruster and Delage [3] give a comprehensive treatment of the topic from the minimax preference robust optimization (PRO) perspective. Specifically, they propose to use available information on the decision maker’s utility preference, such as preferring certain lotteries over other lotteries and being risk averse, S-shaped or prudent, to construct an ambiguity set of plausible utility functions and then base the optimal decision on the worst-case utility function from the ambiguity set. Hu and Mehrotra [24] consider a probabilistic representation of the class of increasing concave utility functions by confining them to a compact interval and normalizing them with range $[0,1]$. In doing so, they propose a moment-type framework for constructing the ambiguity set of the decision maker’s utility preference which covers a number of important approaches such as the certainty equivalent and pairwise comparison. Hu and Stepanyan [25] propose a so-called reference-based almost stochastic dominance method for constructing a set of utility functions near a reference utility which satisfies a certain stochastic dominance relationship and use the set to characterize the decision maker’s preference. Over the past few years, the research on PRO has received increasing attention in the communities of stochastic/robust optimization and risk management; see for instance [22, 21, 14, 52, 31, 32] and references therein.
In both (MOCE) and (RMOCE) models, the true probability distribution $P$ is assumed to be known. In data-driven problems, the true $P$ is unknown but it is possible to use empirical data to construct an approximation of $P$. Unfortunately, such data may be contaminated and consequently we may be concerned about the quality of the MOCE values calculated as such. This kind of issue is well studied in robust statistics [26] and can be traced back to the earlier work of Hampel [20]. Cont et al. [13] first study the quality of the plug-in estimators of law invariant risk measures using Hampel’s classical concept of qualitative robustness [20], that is, the plug-in estimator of a risk functional is said to be qualitatively robust if it is insensitive to the variation of sampling data. According to Hampel’s theorem, Cont et al. [13] demonstrate that the qualitative robustness of a plug-in estimator is equivalent to the weak continuity of the risk functional and that value at risk (VaR) is qualitatively robust whereas conditional value at risk (CVaR) is not. Krätschmer et al. [30] argue that the use of Hampel’s classical concept of qualitative robustness may be problematic because it requires the risk measure essentially to be insensitive with respect to the tail behaviour of the random variable, and the recent financial crisis shows that a faulty estimate of tail behaviour can lead to a drastic underestimation of the risk. Consequently, they propose a refined notion of qualitative robustness that applies also to tail-dependent statistical functionals and that allows one to compare statistical functionals with regard to their degree of robustness. The new concept captures the trade-off between robustness and sensitivity and can be quantified by an index of qualitative robustness. Guo and Xu [17] take a step forward by deriving quantitative statistical robustness of PRO models. Xu and Zhang [51] extend the analysis to distributionally robust optimization models.
In this paper, we consider a situation where the decision maker has a nominal utility function but lacks complete information as to whether it is the true one. Consequently we propose to use the Kantorovich ball centered at the nominal utility function as the ambiguity set. We begin with piecewise linear utility (PLU) functions defined over a convex and closed interval of $\mathbb{R}$ and show that the inner minimization problem in the definition of RMOCE can be reformulated as a linear program when $\xi$ has a finite discrete distribution. We then propose an iterative algorithm to compute the RMOCE by solving the inner minimization problem and the outer maximization problem alternately.
To extend the scope of the proposed computational method, we extend the discussion to the cases where the utility functions are not necessarily piecewise linear and the domain of the utility function is unbounded. We derive error bounds arising from using the PLU-based RMOCE to approximate the general RMOCE. Since our numerical scheme for computing the RMOCE is based on samples of $\xi$, we study statistical robustness of the sample-based RMOCE to address the case where the sample data of $\xi$ are potentially contaminated. Finally we carry out some numerical tests on the proposed computational schemes for concave utility functions.
The rest of the paper is organized as follows. Section 2 discusses the basic properties of MOCE and RMOCE. Section 3 presents numerical schemes for computing the RMOCE when the utility functions in the ambiguity set are piecewise linear. Section 4 details the approximation of the ambiguity set of general utility functions by the ambiguity set of piecewise linear utility functions and its effect on RMOCE. Section 5 discusses the RMOCE model with utility functions having unbounded domain and outlines potential extensions of the MOCE model to multi-attribute decision making. Section 6 discusses statistical robustness of RMOCE when it is calculated with contaminated data. Section 7 reports numerical results and finally Section 8 concludes with a brief summary of the main contributions of the paper.
2 Properties of MOCE and RMOCE
We begin by discussing the well-definedness of MOCE and RMOCE. Let denote the space of random variables mapping from to with finite -th order moments and . Let be the set of nondecreasing concave utility functions. Throughout this paper, we make a blanket assumption to ensure the well-definedness of the expected utility in the definitions of MOCE and RMOCE.
Assumption 2.1
There exist gauge functions and parameterized by satisfying for such that
The condition stipulates the interaction between the tails of the distribution of $\xi$ and the tails of the utility function. We refer readers to Guo and Xu [18] for more detailed discussions on this. To facilitate the forthcoming discussions, we let denote the set of probability measures on , and for each fixed , define
for . Let denote the class of continuous functions such that for all . The -topology, denoted by , is the coarsest topology on for which the mapping is continuous. A sequence is said to converge -weakly to written if it converges w.r.t. . Note that in the case that when the support set of is a compact set in , then the -topology reduces to ordinary topology of weak convergence.
Our first technical result is on the attainability of the optimum in the definition of MOCE.
Proposition 2.1
Assume: (a) Assumption 2.1 holds, (b) there exists such that is a compact set, (c) the support set of , denoted by , is bounded, (d) is strictly concave over . Then for ,
(2.5) |
Moreover, if and converges weakly to , the Dirac probability measure at , then converges to .
Proof. Since is a strictly concave function, (1.3) is a convex optimization problem. Condition (b) ensures existence of an optimal solution, denoted by . Following a similar analysis to the proof of [7, Lemma 2.1], we can write down the first order optimality condition of the program at ,
(2.6) |
where denotes convex subdifferential [40]. Since for any , where denote the left derivative and right derivative of at and
where the expectation/integration at the right hand side is in the sense of Aumann [4]. Consequently we can rewrite (2.6) as
(2.7) | |||||
which yields
Since and are non-increasing, the inequality above implies
(2.8) |
and
(2.9) |
Moreover, since for any , then inequalities (2.8)-(2.9) imply
and hence (2.5).
The second part of the claim follows directly from the first part in that the interval converges to a single point and -convergence coincides with the weak convergence because of the restriction of the range of to a compact subset of .
Ben-Tal and Teboulle [7] derive a similar result to the first part of the proposition for the optimized certainty equivalent and demonstrate that under the conditions that is concave and rather than strictly concave. Strict concavity is needed to ensure that the optimum in (2.5) is achieved in . Otherwise a counterexample can be found; see Example 2.3.
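The localization of the maximizer can be checked on a small example. The sketch below assumes, as our reading of the first-order condition $u'(x^*)=\mathbb{E}_P[u'(\xi-x^*)]$ with strictly decreasing $u'$, that the maximizer lies in $[\xi_{\min}/2,\ \xi_{\max}/2]$, where $\xi_{\min}$ and $\xi_{\max}$ denote the endpoints of the support; the distribution, the utility and the grid are hypothetical.

```python
import math

xi = [0.5, 2.0, 6.0]            # hypothetical support points of xi
prob = [0.2, 0.5, 0.3]          # hypothetical probabilities
grid = [i / 1000.0 for i in range(-5000, 10001)]


def u(t):
    return -math.exp(-0.7 * t)  # strictly concave and increasing


def objective(x):
    return u(x) + sum(p * u(z - x) for p, z in zip(prob, xi))


# grid approximation of the MOCE maximizer
x_star = max(grid, key=objective)

# interval suggested by the first-order condition (an assumption of this sketch)
lower, upper = min(xi) / 2.0, max(xi) / 2.0
```

For this exponential utility the maximizer is also available in closed form, $x^* = -\frac{1}{1.4}\ln \mathbb{E}_P[e^{-0.7\xi}]$, which gives an independent check on the grid search.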
Like the optimized certainty equivalent, the newly defined modified optimized certainty equivalent enjoys a number of properties as stated in the next proposition.
Proposition 2.2 (Properties of MOCE)
Let be a closed proper function. Under Assumption 2.1, the following assertions hold.
-
(i)
is law invariant.
-
(ii)
(Monotonicity) For any , with respective distributions (push-forward probabilities) , .
-
(iii)
(Risk aversion) If for all , then for any random variable .
-
(iv)
(Second-order stochastic dominance) Let be random variables with compact support. Then for any concave utility function ,
where is the classical certainty equivalent.
-
(v)
(Concavity and positive subhomogeneity) If is concave, then is also concave. Moreover, if , then
(2.10)
Proof. Parts (i)-(iii) follow straightforwardly from the definitions; we prove the rest.
Part (iv). “”. By the definition of certainty equivalent, implies for all concave utility functions. The latter implies dominates in second order, which in turn guarantees dominates in second order for any fixed . Consequently for any . Adding both sides of the inequality by and taking the maximum, we obtain .
“”. Let be the points where the supremum of and are attained. Then
which yields . The latter implies .
Part (v). First we prove the concavity of , i.e. for and any random variables , ,
Since is concave, the function is jointly concave over . Therefore, for any , with and , one has
Since , it follows that
Next, we turn to prove the subhomogeneity of . Let , for . Then
(2.11) |
Let . By the concavity of ,
Since , the above inequality implies
(2.12) |
Inequality (2.12) also implies
A combination of the two inequalities implies the objective function in (2.11) is non-increasing in and hence . By setting and to respectively in the inequality above, we obtain (2.10).
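The monotonicity used in this argument can be observed numerically. The sketch below assumes that the statement being proved is that $\lambda \mapsto M_u(\lambda\xi)/\lambda$ is non-increasing on $\lambda>0$ for a concave $u$ with $u(0)=0$, which is how we read the proof above; the utility, the distribution and the grid are hypothetical.

```python
import math

xi = [1.0, 3.0]                                  # hypothetical equally likely incomes
grid = [i / 1000.0 for i in range(-3000, 9001)]  # candidate consumptions x


def u(t):
    return 1.0 - math.exp(-t)                    # concave with u(0) = 0


def moce_scaled(lam):
    """MOCE of lam * xi, computed by grid search over the consumption x."""
    return max(u(x) + sum(u(lam * z - x) for z in xi) / len(xi) for x in grid)


lams = [0.5, 1.0, 2.0, 3.0]
ratios = [moce_scaled(lam) / lam for lam in lams]   # expected to be non-increasing
```

In particular, comparing $\lambda=1$ with $\lambda=2$ gives $M_u(2\xi)\le 2M_u(\xi)$ for this example.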
Next, we discuss how the utility function may be recovered from a given modified certainty equivalent , which is an important property enjoyed by the OCE. Let
where and . For a concave utility function , the modified optimized certainty equivalent can be written as
(2.13) |
Proposition 2.3
If is a strongly risk averse utility function, i.e., for all , and , then
Proof. Observe that is the optimal solution of problem (2.13) if and only if
(2.14) | |||||
The inequality above can be equivalently written as
(2.15) |
Since is strongly risk averse, that is, for all , then and hence inequality (2.15) holds for sufficiently small. This in turn shows that inequality (2.15) holds and hence is the optimal solution of problem (2.13) for all sufficiently small. Thus we have and the conclusion follows.
Example 2.1
We give a few examples which illustrate how MOCE can be calculated in a closed form and their difference in comparison with OCE. Let . Then and
Hence, the recovered utility function is .
Example 2.2 (Exponential Utility Function)
Example 2.3 (Piecewise Linear Utility Function)
Let
where . Then the utility function can be written as and the modified optimized certainty equivalent is
(2.16) |
Compared to optimized certainty equivalent (see [7])
we can also conclude that because for all .
It might be interesting to see where the optimum in (2.16) is achieved. We consider the case that follows a Dirac distribution at point , that is, . Consequently
(2.20) |
The set of optimal solutions is , which is not contained in . This explains that (2.5) may fail to hold without strict concavity of .
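The non-uniqueness can be reproduced with a few lines of code. The exact utility of this example is not recoverable here, so the sketch assumes the two-slope form $u(t)=\gamma_1 t$ for $t\ge 0$ and $u(t)=\gamma_2 t$ for $t<0$ with $0\le\gamma_1<1<\gamma_2$, and takes $\xi$ to be the Dirac mass at a hypothetical point $z>0$.

```python
gamma1, gamma2 = 0.5, 2.0      # hypothetical slopes with 0 <= gamma1 < 1 < gamma2
z = 4.0                        # xi is the Dirac mass at z


def u(t):
    # slope gamma1 for gains, steeper slope gamma2 for losses:
    # concave but not strictly concave
    return gamma1 * t if t >= 0 else gamma2 * t


def objective(x):
    # E[u(xi - x)] reduces to u(z - x) for a Dirac mass at z
    return u(x) + u(z - x)


inside = [objective(x) for x in (0.0, 1.0, z / 2.0, 3.0, z)]
outside = [objective(x) for x in (-1.0, z + 1.0)]
```

Under these assumptions the objective is constant, equal to $\gamma_1 z$, on the whole interval $[0,z]$ and strictly smaller outside it, so every point of $[0,z]$ is optimal rather than the midpoint alone.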
We now move on to discuss the properties of the robust modified optimized certainty equivalent.
Proposition 2.4 (Properties of RMOCE)
Let be a closed proper function. Under Assumption 2.1, the following assertions hold.
-
(i)
is law invariant.
-
(ii)
(Monotonicity) For any , with respective distributions (push-forward probabilities) , .
-
(iii)
(Risk aversion) If , for all and , then , for any random variable .
-
(iv)
(Second-order stochastic dominance) Let be random variables with compact support. Then for any concave utility function ,
where is the classical certainty equivalent.
-
(v)
(Concavity and positive subhomogeneity) If is concave, then is also concave. Moreover, if , then
(2.21)
Proof. Parts (i)-(iii) are obvious.
Part (iv). Following a similar argument to the proof of part (iv) of Proposition 2.2, we can show that implies and for any fixed and hence
Taking infimum on both sides w.r.t. over and then supremum w.r.t. , we obtain .
Part (v). Let We can show as in the proof of Proposition 2.2 (v) that is non-increasing over . This property is preserved after taking the infimum in over and then supremum in over .
Before concluding this section, we remark that it is possible to use a different utility function for the present consumption , i.e.,
(2.22) |
In that case, some of the properties of MOCE may be retained. For example, law invariance, monotonicity, risk aversion, concavity, positive subhomogeneity and second-order stochastic dominance are all satisfied when enjoys the same property as . Proposition 2.3 also holds when satisfies the same property as . However, the change will have an effect on Proposition 2.1, in which case it will be difficult to estimate the interval containing the optimal solution.
3 Computation of RMOCE
Having investigated the properties of MOCE and RMOCE in the previous section, we move on to discuss numerical schemes for computing RMOCE in this section. To this end, we need to have a concrete structure of the ambiguity set. As reviewed in the introduction, various approaches have been proposed for constructing an ambiguity set of utility functions in the literature of preference robust optimization depending on the availability of information. Here we consider a situation where the decision maker has a nominal utility function obtained from empirical data or subjective judgement but lacks complete information to identify whether it is the true utility function which captures precisely the decision maker’s preference. Consequently we may construct a ball of utility functions centered at the nominal utility function under some appropriate metric. Here we concentrate on the Kantorovich metric.
3.1 Kantorovich ball of piecewise linear utility functions
We begin by considering a ball of utility functions centered at a piecewise linear utility function under the Kantorovich metric. In practice, a decision maker’s utility preferences are often elicited through questionnaires. For example, a customer’s utility preference may be elicited via the customer’s willingness to pay at certain price points [43, 32]. From a computational point of view, piecewise linear utility functions may bring significant convenience to the calculation of OCE; see Nouiehed et al. [36].
Let be an ordered sequence of points in and with and . Let be a class of continuous, non-decreasing, concave, piecewise linear functions defined over an interval with kinks on , as well as Lipschitz condition with modulus and normalized conditions and . Let , we consider a ball in with the Kantorovich metric
(3.23) |
where the subscript represents the Kantorovich metric and
(3.24) |
and
(3.25) |
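A piecewise linear utility function in this class is determined by its values at the breakpoints, and the defining properties reduce to simple conditions on the slopes. The following sketch builds such a function and checks those conditions. The normalization $u(t_1)=0$, $u(t_N)=1$ is an assumption of this sketch, and the names `make_plu` and `in_plu_class`, the breakpoints and the values are hypothetical.

```python
from bisect import bisect_right


def make_plu(points, values):
    """Return the piecewise linear interpolant through (points[i], values[i])."""
    def u(t):
        # locate the piece containing t, clamping to the first/last piece
        j = min(max(bisect_right(points, t) - 1, 0), len(points) - 2)
        slope = (values[j + 1] - values[j]) / (points[j + 1] - points[j])
        return values[j] + slope * (t - points[j])
    return u


def in_plu_class(points, values, lipschitz, tol=1e-12):
    """Check normalization, monotonicity, concavity and the Lipschitz bound."""
    slopes = [(values[i + 1] - values[i]) / (points[i + 1] - points[i])
              for i in range(len(points) - 1)]
    normalized = abs(values[0]) <= tol and abs(values[-1] - 1.0) <= tol
    nondecreasing = all(s >= -tol for s in slopes)
    concave = all(slopes[i] >= slopes[i + 1] - tol for i in range(len(slopes) - 1))
    # for a concave function the first slope is the largest one
    bounded = slopes[0] <= lipschitz + tol
    return normalized and nondecreasing and concave and bounded


points = [0.0, 1.0, 2.0, 4.0]          # hypothetical breakpoints
values = [0.0, 0.5, 0.8, 1.0]          # hypothetical utility values at the breakpoints
u = make_plu(points, values)
```

Working with slopes rather than function values mirrors the way the distance between two such functions is handled in the linear programming reformulation that follows.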
Note that piecewise linear utility functions are used to approximate general utility functions in the utility preference robust optimization model [18]. The difference is that here we use the Kantorovich ball to construct the ambiguity set of DM’s utility function whereas the authors use pairwise comparison approach to elicit the DM’s utility preferences in [18]. The next proposition states that may be computed by solving a linear program.
Proposition 3.1
The Kantorovich distance is the optimal value of the following linear program:
(3.26a) | |||||
(3.26e) | |||||
Proof. Let . By definition,
where denotes the slope of at interval . Since for each , ,
where denotes the slope of at interval . Note that in this formulation, depends on the slopes of rather than their function values. Let and . Since for all , we have
for . Likewise, since for all , we have
for . To complete the proof, it suffices to show that conditions
(3.27) |
are adequate to cover the generic condition
(3.28) |
We consider two cases.
Case 1. for some . In this case, the generic condition is adequately covered by for all , because the objective depends only on .
3.2 Alternating iterative algorithm for computing RMOCE
We are now ready to discuss how to compute the RMOCE with the ambiguity set of piecewise linear utility functions constructed by the Kantorovich ball. Assume that the probability distribution of random variable is discrete with for and . Then we can rewrite the RMOCE problem (1.4) as
(3.29) |
Recall that Proposition 2.1 shows that the optimal solutions of MOCE are contained in the interval when the utility function is strictly concave. Unfortunately, this result is not applicable to problem (3.34) because is piecewise linear. However, under some fairly moderate conditions, we are able to show that the optimal solutions are bounded. The next proposition states this.
Proposition 3.2
Consider MOCE problem (1.3). Let denote the set of optimal solutions. Assume: (a) is a piecewise linear concave function and (b) has at least two pieces in the interval . Then the following assertions hold.
-
(i)
is a compact and convex set.
-
(ii)
If , then .
-
(iii)
If , then .
-
(iv)
If , then .
Proof. Part (i). Observe first that is a convex set since problem (1.3) is a convex optimization problem. Suppose for the sake of a contradiction that is unbounded. Then either is a right half line or a left half line. We consider the former. In that case, there exists sufficiently large such that
(3.30) |
By the first order optimality condition
(3.31) |
The equality holds because of Clarke regularity; see [9, 11]. Since , then any subgradient in set is greater than or equal to the subgradient from for all . This means the optimality condition holds if and only if and are in the domain of the same linear piece. But this contradicts assumption (b). Using a similar argument, we can also show that cannot be a left half line.
Part (ii). Assume for a contradiction that . Then inclusion (3.30) holds. Following a similar analysis to that in Part (i), we can show that in this case does not satisfy (3.31). If , then
(3.32) |
Consequently we can show that cannot satisfy the optimality condition (3.31).
Part (iii). In this case, we can show that cannot be larger than because otherwise we would have (3.30) and a contradiction to the optimality condition. Likewise if , then the inclusion (3.32) would be invoked.
Part (iv) is similar to Part (iii); we omit the details.
Note that if we strengthen the condition on two linear pieces in the interval to a smaller interval , then we will be able to strengthen the conclusions in Parts (ii)-(iv) whereby is included in ; we leave this to the reader as an exercise.
Now we propose the alternating iterative algorithm for solving the maximin problem (3.29).
Algorithm 3.1
Step 0. Choose an initial point .
Step 1. For s=1,…, solve
(3.33) |
and
(3.34) |
where is a compact subset of .
Step 2. Stop when and .
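The structure of the iteration can be sketched in a few lines. This is not the linear programming implementation described in the paper: the inner step (3.33) is replaced by enumeration over a small hypothetical set of candidate utilities standing in for the Kantorovich ball, and the outer step (3.34) by a grid search over a compact interval. All names and data below are assumptions made for illustration.

```python
import math

xi = [1.0, 3.0]                                  # hypothetical support of xi
prob = [0.5, 0.5]                                # hypothetical probabilities
X = [i / 1000.0 for i in range(0, 1001)]         # grid standing in for the compact set

# Hypothetical finite candidate set standing in for the Kantorovich ball.
candidates = [lambda t: 1.0 - math.exp(-t),
              lambda t: 2.0 * (1.0 - math.exp(-t))]


def objective(x, util):
    return util(x) + sum(p * util(z - x) for p, z in zip(prob, xi))


def alternating_iteration(x0, max_iter=50):
    x, k = x0, None
    for _ in range(max_iter):
        # inner step: worst-case utility at the current allocation
        k_new = min(range(len(candidates)), key=lambda j: objective(x, candidates[j]))
        # outer step: best allocation against that utility
        x_new = max(X, key=lambda y: objective(y, candidates[k_new]))
        if x_new == x and k_new == k:            # Step 2: nothing changed
            return x, k, True
        x, k = x_new, k_new
    return x, k, False                           # no fixed point within max_iter


x_star, k_star, converged = alternating_iteration(x0=0.0)
```

In this toy instance the first candidate is the worst case at every allocation in the grid, so the iteration stops after two passes; in general an alternating scheme of this kind may need many passes, which is why the sketch caps the number of iterations and reports whether a fixed point was reached.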
Note that in equation (3.34), we restrict to taking values in a convex and compact set since Proposition 3.2 guarantees that the optimal is contained in such a set. There is another important issue concerning the algorithm, that is, whether a sequence generated by the algorithm converges to the optimal solution of (RMOCE-PLU). The next proposition addresses this.
Proposition 3.3
Algorithm 3.1 either terminates in a finite number of steps with a solution of the (RMOCE-PLU) model or generates a sequence whose cluster points, if they exist, are optimal solutions of the (RMOCE-PLU) model.
Proof. Let be a cluster point of the sequence generated by Algorithm 3.1. Then for all and ,
(3.35) |
For ,
and
(3.36) |
If Algorithm 3.1 terminates in finite steps, then and for some and satisfies (3.35). In what follows we consider the case that Algorithm 3.1 generates an infinite sequence . Let be a cluster point of . For the simplicity of notation, we assume that . If is not a saddle point, then it violates one of the inequalities in (3.35). Without loss of generality, consider the case that the first inequality of (3.35) is violated, that is, there exists such that
Since is continuous, then for sufficiently large ,
which is a contradiction to (3.36). In the same manner, we can show that satisfies the second inequality in (3.35). The proof is complete.
Note that the cluster point is indeed a saddle point of the maximin problem (3.29) and existence of the latter is guaranteed by the fact that the objective function is linear in and concave in . Problem (3.33) is a convex problem because is a compact and convex set. By writing each utility function as
(3.37) |
for and writing down the Lagrange dual of problem (3.26),
(3.38a) | |||||
(3.38f) | |||||
We can effectively reformulate problem (3.33) as a linear program:
s.t. | |||||
where denotes the slope of at interval . Constraint (3.39) requires the piecewise linear function to be continuous at the kinks, constraint (3.39) and (3.39) represent the normalized conditions, (3.39) requires the concavity of utility function, (3.39) represents the Lipschitz condition, constraints (3.39)-(3.39) represent the bounded Kantorovich ball. Note that here we use the Lagrange dual problem (3.38) instead of the primal problem (3.26) because the latter would have bilinear terms otherwise.
4 RMOCE with non-piecewise linear utility functions
The computational schemes that we discussed in the previous section are applicable to the case when the ambiguity set is constructed by a Kantorovich ball of piecewise linear utility functions. In practice, the utility functions are not necessarily piecewise linear. This raises a question as to how much we may miss if we use to compute (RMOCE) with the ambiguity set constructed by the Kantorovich ball of general utility functions. In this section, we address the issue, which is essentially about an error bound for the modelling error. To maximize the scope of coverage, we consider -ball instead of the Kantorovich ball. Let be a class of continuous, non-decreasing, concave functions defined over with Lipschitz condition with modulus and normalized conditions and . For , we define
(4.40) |
where
(4.41) |
is a set of measurable functions defined over and . is known as a pseudo metric. It can be observed that if and only if for all but not necessarily unless is sufficiently large. By specifying particular properties of functions in set , we may obtain some specific metric such as Kantorovich metric and the Kolmogorov metric with , where consists of all indicator functions defined as
(4.42) |
With the definition of the -ball and , we may define the corresponding RMOCE as
(4.43) |
and the one when the utility functions are restricted to be piecewise linear:
(4.44) |
where
(4.45) |
We investigate the difference between and and its propagation to the optimal values. Let and be two sets of utility function, between and , be the deviation distance of from , and be the Hausdorff distance between and .
4.1 Error bound on the ambiguity set
We start by quantifying the difference between the ambiguity sets. To this effect, we need a couple of technical results.
Proposition 4.1
([18, Proposition 4.1]) For each fixed , let be such that for and
(4.46) |
Then
(4.47) |
where . Moreover, in the case when , it holds that
(4.48) |
In the case when , .
Here and later on, we call defined in (4.48) as a projection of on . Next, we quantify the deviation distance and Hausdorff distance between -balls in the and .
Lemma 4.1
Let , and , be any positive numbers. Then the following holds:
-
(i)
, ,
-
(ii)
If is defined as in (4.46) and , then and .
Proof. The proof is similar to that of [46]; here we include a sketch for completeness.
Part (i). We only prove the first inequality, as the second one can be proved analogously. Let and , where . By the definition of , we have , which implies . Thus
Since for , we have for all . By the definition of , we have
and hence (i) holds.
Part (ii). Let . By Proposition 4.1,
which implies . By Part (i),
Similarly, we have . The result holds due to the definition of Hausdorff distance under -metric.
Now we turn to prove . Since , then we can find a such that is the projection of . Hence for any , we have . Consequently, . On the other hand, for any , we have
hence . Therefore, according to the definition of and Part (i),
The proof is complete.
With Lemma 4.1, we are ready to quantify the difference between and .
Theorem 4.1
Let and let be a projection of defined as in (4.46) and . Then
(4.49) |
Proof. By the triangle inequality of the Hausdorff distance in the space of , we have
From Lemma 4.1, , so it suffices to show . By the definition of ,
where is the projection of . The second inequality follows from (4.48), the third inequality is due to the fact that for any , its projection satisfies
that is, . The last inequality follows from part (i) of Lemma 4.1. Likewise, we have
where the third inequality is derived from the fact that for any , that is, , we have , that is . The last inequality follows from part (ii) of Lemma 4.1. Finally, by the definition of the Hausdorff distance under the metric , the proof is complete.
4.2 Error bound on the optimal value
Proof. It is well known that
Let be a small positive number. For any , we can find and depending on such that
where denotes the Hausdorff distance in the space of continuous functions defined on equipped with infinity norm . Combining the above inequalities
By exchanging the positions of and , we have
Since can be arbitrarily small, we obtain
The main challenge here is that differs from . In what follows, we show that
(4.50) |
where . Let and ,
(4.51) |
The last equality is due to the fact that . By taking the infimum w.r.t. and the supremum w.r.t. on both sides of the equality above, we obtain Swapping the positions of and , we obtain (4.50). Combining this with Theorem 4.1, we obtain the conclusion.
5 Extensions
In this section, we discuss potential extensions of the MOCE and RMOCE models by considering utility functions with unbounded domain and multivariate utility functions.
5.1 Utility function with unbounded domain
In some important applications such as finance and economics, the underlying random variables which represent market demand, stock price and rate of return often have unbounded support. This raises a question as to whether our proposed model and computational schemes in the previous sections can be effectively applied to these situations. Here we discuss this issue.
We start by defining a set of nonconstant increasing functions defined over , denoted by . We no longer restrict the domain of to a bounded interval . Let , the -ball in is defined as
where is the pseudo metric defined in (4.41) and is a set of measurable functions throughout this section. The robust modified optimized certainty equivalent model based on is defined as
(5.52) |
where is a compact set of implementable decisions over . Our aim is to solve , and our concern is that the numerical schemes proposed in Section 3 cannot be applied to this problem directly. Let be the truncation of over and define
(5.53) |
where denotes the set of nonconstant nondecreasing functions defined over . We rewrite (4.43) as
(5.54) |
What we are interested in here is the difference between and in terms of the optimal value. We will show that, under some moderate conditions, the difference between and is related only to the radius of the -ball, and therefore we may solve approximately by solving . The latter can be solved by the piecewise linear approximation scheme detailed in Section 3.
To build the bridge between and , we define the following set
(5.55) |
Notice that is not a ball which is defined under the pseudo metric. Then we can establish the connection between and in the following proposition.
Proposition 5.1
Let and assume that there exists a positive number such that
(5.56) |
Then for any there exist constants and such that
(5.57) |
Proof. From the condition (5.56), for any there exist constants and such that
(5.58) |
For any fixed , let for and for and for . Then we can obtain
Hence and
(5.59) |
By taking supremum w.r.t. over on both sides of (5.59), we obtain
Note that , then we have
where the last inequality follows from part (i) of Lemma 4.1. Consequently (5.57) follows.
From Proposition 5.1, we can see that when the interval is large enough, the difference between and is not significant. We now turn to comparing with . As in [18, Section 6.2], we can conclude that the extended function of is in , where , because . By exploiting this relationship, we can quantify the difference between and in the following theorem.
Theorem 5.1
Assume there exists a constant such that
(5.60) |
and the condition in Proposition 5.1 is fulfilled. Then for any , there exist constants and such that
(5.61) |
Proof. It follows from conditions (5.60) and (5.56) that for any there exist constants such that
(5.62) |
and (5.58) holds. Since , the above inequality implies
(5.63) |
By definitions of and ,
Let us estimate the first term on the right-hand side of the last inequality above. Observe that
Thus
(5.64) |
The last inequality holds due to . Now let us turn to the second term. For any and a fixed positive number , we can find and its extended function such that
Consequently we have
(5.65) |
The second equality is satisfied because is the extended function of and . By exchanging the positions of and , we have
(5.66) |
Since can be arbitrarily small, we obtain
(5.67) |
5.2 Multiattribute utility case
The OCE models that we discussed so far are for single attribute decision making. It might be interesting to ask whether the models can be extended to multi-attribute decision making. The answer is yes. Here we present two potential extended models. One is to consider the case that the utility function has an additive structure, that is, the multivariate utility function is the sum of the marginal utility functions of each attribute. Such utility functions are widely used in the literature, see e.g. [28, 1, 2]. In that case, given and , we may define the MOCE as
(5.68) |
where the multiattribute utility function and is the marginal utility function with respect to the th attribute. The formulation can be simplified when the probability distribution of is the product of its marginal distributions:
(5.69) |
The economic interpretation of the model is that the decision maker might have a portfolio of random assets , and the DM would like to cash out from asset . The marginal utilities may be the same or different. Problem (5.69) is decomposable as it stands; thus it retains the properties outlined in Section 2 and can be computed by solving the single-attribute MOCE problems in parallel.
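The decomposition can be sketched as follows. The sketch assumes, following the description in the abstract, that the single-attribute MOCE maximizes the utility of the cash taken out plus the expected utility of the remaining income, and that the additive model is the sum of such problems over the attributes. The ternary search, the search intervals, the exponential utility and the sample data are our own illustrative choices rather than the scheme used in the paper.

```python
import math


def moce(u, samples, lo, hi, iters=200):
    """Maximize u(x) + mean(u(xi - x)) over x in [lo, hi] by ternary search.

    Assumes u is concave, so that the objective is concave in x.
    Returns the pair (optimal value, optimal cash amount).
    """
    def objective(x):
        return u(x) + sum(u(xi - x) for xi in samples) / len(samples)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if objective(m1) < objective(m2):
            lo = m1
        else:
            hi = m2
    x_star = 0.5 * (lo + hi)
    return objective(x_star), x_star


def additive_moce(utilities, sample_columns, bounds):
    """Sum of single-attribute MOCE values, solved one attribute at a time."""
    results = [moce(u, xs, lo, hi)
               for u, xs, (lo, hi) in zip(utilities, sample_columns, bounds)]
    return sum(val for val, _ in results), [x for _, x in results]


def u_exp(t):
    """Exponential utility (hypothetical choice, defined on the whole real line)."""
    return 1.0 - math.exp(-t)


# Two assets with equally likely scenarios; the second asset is riskless.
assets = [[0.0, 1.0, 2.0], [0.5, 0.5, 0.5]]
total, cash = additive_moce([u_exp, u_exp], assets, [(-5.0, 5.0), (-5.0, 5.0)])
```

For the exponential utility the single-attribute problem has a closed-form solution, which gives a convenient check: for the riskless asset the optimal cash amount is half of its value.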
When the utility function is non-additive, we may consider the following model:
(5.70) |
where is a fixed vector of weights. In this model, the cash to be taken out from the assets is in a prefixed proportion. (MMOCE-B) is essentially a single-variate MOCE model. Note that it is possible to further extend model (MMOCE-A) by replacing the deterministic vector with a random vector :
(5.71) |
This kind of model has potential applications in finance, where a firm detaches risky assets from non-risky assets in order to reduce the systemic risk [47]. In that context, problem (MMOCE-A’) is to find the optimal separation from the existing overall portfolio of assets . The problem is intrinsically two-stage; one may use a linear/polynomial decision rule [5] or the K-adaptability method [8] to obtain a (MMOCE-A)-version of approximation. Note also that model (MMOCE-A’) is related to the IDR-based CDE model recently studied by Qi et al. [39], who use OCE for optimizing individualised medical treatment. Since all of the extended models outlined above require much more detailed analysis, we leave them for future research.
6 Quantitative statistical robustness
6.1 Motivation
In Section 3, we discuss in detail how to obtain an approximate solution of (RMOCE) (to ease reading, we repeat the model here):
(6.72) |
where follows probability distribution . A key assumption is that the true probability distribution is known and discretely distributed. This assumption may not be satisfied in data-driven problems where the true is unknown, and one often uses empirical data to construct an approximation of . Worse still, such data may be contaminated.
Let denote the contaminated empirical data (we call them perceived data and we use to denote the size of samples rather than number of breakpoints without causing confusion henceforth). Let be the empirical distribution constructed with the perceived data, where is the Dirac measure at . We use the perceived data to solve the RMOCE model (assume that the model is solved precisely without computational error):
(6.73) |
We then ask whether is a good estimate of from a statistical point of view. This question is concerned with data perturbation rather than the modelling/computational errors discussed in Section 3.
To proceed with the analysis, we introduce another empirical distribution, denoted by , which is constructed from the purified perceived data (the noise in the perceived data is detached; we call them real data henceforth). In practice, it is impossible to detach the noise; we introduce the notion purely for the convenience of statistical analysis. Let be the optimal value of (RMOCE-P) obtained by replacing with . By the classical law of large numbers, we know that and under moderate conditions. Thus in the literature of stochastic programming, is called a statistical estimator of , and here we emphasize that this estimator is based on real data.
Our question is then whether is close to , because the former is the only quantity that we are able to obtain. To address this question, we assume that the perceived data are iid, which means for some as . In other words, the perceived data may be viewed as if they are generated by the invisible distribution . Let denote the optimal value of (RMOCE) with being replaced by . We then have
Thus if as uniformly for all close to and as , then is close to . This explains roughly the motivation of this section. The formal quantitative statistical robustness analysis is a bit more complex, as we will examine the difference between the probability distributions of and under some metric rather than estimating for each given set of perceived data.
6.2 Statistical analysis
For any two probability measures , define the pseudo-metric between and by
(6.74) |
It can be seen that is the maximal difference between the expected values of the class of measurable functions with respect to and . The specific pseudo metrics that we consider in this paper are the Fortet-Mourier metric and the Kantorovich metric. Recall that the -th order Fortet-Mourier metric with for :
(6.75) |
where
and
When , the functions in are globally Lipschitz continuous with modulus and coincides with in (3.25). Thus . For more details, see [16, 42, 48].
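For intuition on the Kantorovich metric between data sets, the sketch below computes it for two empirical distributions on the real line with the same number of equally weighted samples, in which case it reduces to the average absolute difference between the sorted samples. The function name and the toy data, which mimic real data and shifted perceived data, are illustrative assumptions.

```python
def kantorovich_empirical(xs, ys):
    """Kantorovich distance between two equally weighted empirical
    distributions on the real line with the same number of atoms."""
    if len(xs) != len(ys):
        raise ValueError("this shortcut needs equally sized samples")
    pairs = zip(sorted(xs), sorted(ys))
    return sum(abs(x - y) for x, y in pairs) / len(xs)


real = [0.0, 1.0, 2.0]          # hypothetical noise-free samples
perceived = [2.5, 0.5, 1.5]     # the same samples shifted by 0.5, in a different order
shift = kantorovich_empirical(real, perceived)
```

Here every atom is moved by the same amount, so the distance equals the size of the shift regardless of the order in which the samples are listed.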
To obtain the statistical robustness result, let and denote the Cartesian product and its Borel sigma algebra. Let denote the probability measure on the measurable space with marginal on each and with marginal . Now we can state the definition of statistical robustness of a statistical estimator, as proposed in [18, 48].
Definition 6.1 (Quantitative statistical robustness)
Let be a set of probability measures. A sequence of statistical estimators is said to be quantitatively statistically robust on w.r.t. if there exists a positive constant such that for all
(6.76) |
where is the Kantorovich metric on and is the Fortet-Mourier metric on .
Here and are probability measures/distributions on . The next theorem states quantitative statistical robustness of .
Theorem 6.1
Assume: (a) There exists a positive constant such that for all and ,
(b) set is chosen such that is a gauge function, that is, is continuous and holds outside a compact set. Then for any ,
(6.77) |
where for some constant .
Proof. By definition
(6.78) |
where and we write for . To see the well-definedness of the pseudo-metric, notice that for every and a fixed
(6.79) |
where is fixed. From condition (b) and nondecreasing property of , there exists a positive number such that
(6.80) |
By the definition of , it follows that
Moreover,
(6.81) |
where the equality holds due to the fact that are independent and identically distributed. Combining (6.79) and (6.81) we can obtain
A similar argument can be made for for any . Next, for any ,
where the last inequality follows from condition (a). Then we can obtain
(6.82) |
It follows by [48, Lemma 4.4] that
(6.83) |
and hence inequality (6.77).
7 Numerical tests
We have carried out some tests on the numerical schemes for computing RMOCE. In this section, we report the preliminary numerical results.
The first set of tests compares the MOCE model (1.3) with the OCE model (1.1) in terms of the optimal values and the optimal solutions. We do so by considering several specific distributions, including the uniform, Gamma, lognormal and normalized Pareto distributions. The second set of tests concerns the RMOCE model and the numerical schemes proposed in Section 3. We investigate how the optimal value and the worst case utility function in the RMOCE model change as the radius of the ambiguity set and the number of breakpoints vary. We use the parallel particle swarm optimization method [29, 34] to solve problem (3.29) and the CVX solver to solve the inner minimization problem (3.33). All the tests are carried out in Matlab R2021a installed on a PC (16GB RAM, CPU 2.3 GHz) with an Intel Core i7 processor.
Throughout the section we restrict to a set of all increasing concave utility functions mapping from a compact interval . We take as the domain of which is the union of ranges of and for by 3.2 because the number of breakpoints can guarantee that . We generate iid samples for random variable with equal probabilities for .
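A minimal sketch of such a sampling step is given below, written in Python rather than Matlab. The reading of the distribution parameters (shape and scale for the Gamma distribution, scale 1 and shape 1.5 for the Pareto distribution), the seed and the sample size are assumptions, and no normalization of the Pareto samples is applied since its form is not specified here.

```python
import random


def draw_samples(name, k, rng):
    """Draw k iid samples from one of the four test distributions."""
    if name == "uniform":
        return [rng.uniform(-1.0, 1.0) for _ in range(k)]
    if name == "lognormal":
        return [rng.lognormvariate(0.0, 1.0) for _ in range(k)]
    if name == "pareto":
        # Assumed reading of Pareto (1,1.5): scale 1 and shape 1.5.
        return [rng.paretovariate(1.5) for _ in range(k)]
    if name == "gamma":
        # Assumed reading of Gamma (0.53,3): shape 0.53 and scale 3.
        return [rng.gammavariate(0.53, 3.0) for _ in range(k)]
    raise ValueError("unknown distribution: " + name)


K = 100                      # illustrative sample size
rng = random.Random(0)       # fixed seed so that runs are reproducible
names = ("uniform", "lognormal", "pareto", "gamma")
samples = {name: draw_samples(name, K, rng) for name in names}
probabilities = [1.0 / K] * K   # each sample carries the same probability
```

Heavy-tailed choices such as the Pareto distribution can produce a few very large samples, which is one reason the range of the samples matters when the domain of the utility function is fixed.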
In the first set of tests of OCE and MOCE, we set the nominal utility function as . Table 1 displays the optimal values and the optimal solutions as well as the CPU times. The th and th columns present the optimal values of the OCE and MOCE models, and the th and th columns present the optimal solutions of the OCE and MOCE models, respectively. As we can see, the OCE values are consistently larger than the MOCE values; this is because . Moreover, we find that the optimal solutions of the MOCE problem (under ) fall within , although we do not display the intervals due to space limitations. This complies with 2.1.
Distribution | K | | | | | CPU time |
---|---|---|---|---|---|---|
Uniform (-1,1) | 10 | -0.5590 | -0.4440 | -0.2220 | -0.4441 | 0.8700 |
 | 100 | -0.1950 | -0.1782 | -0.0891 | -0.1782 | 0.9914 |
 | 1000 | -0.3508 | -0.3008 | -0.1504 | -0.3008 | 3.8415 |
Lognormal (0,1) | 10 | 0.4929 | 0.6792 | 0.3396 | 0.6792 | 0.4279 |
 | 100 | 0.5182 | 0.7303 | 0.3651 | 0.7303 | 0.8139 |
 | 1000 | 0.5313 | 0.7578 | 0.3789 | 0.7578 | 3.7001 |
Pareto (1,1.5) | 10 | 0.8692 | 2.0337 | 1.0169 | 2.0337 | 0.4484 |
 | 100 | 0.8990 | 2.2926 | 1.1463 | 2.2925 | 0.7263 |
 | 1000 | 0.8942 | 2.2461 | 1.1231 | 2.2461 | 3.6693 |
Gamma (0.53,3) | 10 | 0.3392 | 0.4143 | 0.2072 | 0.4143 | 0.5002 |
 | 100 | 0.4415 | 0.5824 | 0.2912 | 0.5825 | 0.7094 |
 | 1000 | 0.4088 | 0.5255 | 0.2628 | 0.5255 | 3.7729 |
In the second set of tests about RMOCE, we set the nominal utility as where is a parameter which determines the degree of concavity of the utility function. The number of random samples is fixed at for the uniform distribution and for the Gamma, lognormal and normalized Pareto distributions. The parameters of the tests are listed in Table 2, where the 4th column gives the Lipschitz modulus of the utility functions. For the cases where the random samples are generated from the uniform distribution, Figures 1 and 2 visualize the worst case utility functions and the optimal values as the radius decreases, and Figure 3 visualizes the change of the optimal values as the number of breakpoints increases. It can be seen that the number of breakpoints has little effect on the optimal value. For the cases when follows the Gamma, lognormal and normalized Pareto distributions, Figures 4 and 5 visualize the changes of the worst case utility functions and the optimal values as the radius decreases. We can see that the worst case utility function moves closer to the nominal utility function as the radius of the ambiguity set decreases to zero, and that the optimal value increases as the radius decreases. This is because the Kantorovich ball becomes smaller when the radius decreases. In the case that , the worst case utility function is the piecewise linear approximation of the nominal utility function. The error bound of the optimal value is also depicted in Figures 3 and 5; note that the error bound becomes smaller as the number of breakpoints increases in Figure 3. Table 3 provides the optimal values and running times for different numbers of breakpoints.
Distribution | N | L | |
---|---|---|---|
Uniform (-1,1) | 2 | 10 | 30 |
Lognormal (0,1) | 1/2 | 300 | 10 |
Pareto (1,1.5) | 1/3 | 300 | 10 |
Gamma (0.53,3) | 1/2 | 300 | 10 |
N | Optimal value | CPU time |
---|---|---|
20 | -108.4846 | 20.7796 |
40 | -108.6524 | 24.1097 |
60 | -108.5553 | 31.4506 |
80 | -108.5657 | 36.7656 |
100 | -108.5648 | 41.9548 |
Figure 1: Worst case utility functions with different r
Figure 2: Optimal values with different r
Figure 3: Optimal values with different N






8 Conclusion
In this paper we explore variations of the concept of optimized certainty equivalent with a number of new inputs. First, we propose a modified optimized certainty equivalent (MOCE) model by considering the utility of present consumption. The optimal strategy (which balances the present and future consumption) is uniquely determined by the decision maker’s risk preference rather than by his/her utility representations (which are not unique). The resulting MOCE value is positive homogeneous in . The MOCE is also in alignment with the consumption models in economics. Second, there is a distinction between OCE and MOCE in terms of the utility functions to be used in the model. The classical OCE model requires the utility function to satisfy and . The new MOCE model does not require these conditions. Third, we propose a preference robust version of the new MOCE model for the case that the decision maker’s true utility function is ambiguous. Ambiguity does exist in practice, and this paper provides a comprehensive treatment of the preference robust MOCE model from modelling to computational scheme and underlying theory. Fourth, in the case that the proposed RMOCE model is applied to data-driven problems where the underlying exogenous data (samples of ) are potentially contaminated, we derive sufficient conditions under which the RMOCE calculated with the data is statistically robust. Fifth, we outline potential extensions of the MOCE model from single-attribute decision making to multi-attribute decision making and point out potential applications in asset re-organization. In summary, this paper provides a new outlook on OCE in both modelling and analysis, which complements the existing research in the literature.
References
- [1] A. E. Abbas. Multiattribute utility copulas. Operations Research, 57(6):1367–1383, 2009.
- [2] A. E. Abbas and Z. Sun. Multiattribute utility functions satisfying mutual preferential independence. Operations Research, 63(2):378–393, 2015.
- [3] B. Armbruster and E. Delage. Decision making under uncertainty when preference information is incomplete. Management science, 61(1):111–128, 2015.
- [4] R. J. Aumann. Integrals of set-valued functions. Journal of mathematical analysis and applications, 12(1):1–12, 1965.
- [5] D. Bampou and D. Kuhn. Scenario-free stochastic programming with polynomial decision rules. In 2011 50th IEEE Conference on Decision and Control and European Control Conference, pages 7806–7812. IEEE, 2011.
- [6] A. Ben-Tal and M. Teboulle. Expected utility, penalty functions, and duality in stochastic nonlinear programming. Management Science, 32(11):1445–1466, 1986.
- [7] A. Ben‐Tal and M. Teboulle. An old‐new concept of convex risk measures: The optimized certainty equivalent. Mathematical Finance, 17(3):449–476, 2007.
- [8] D. Bertsimas and C. Caramanis. Adaptability via sampling. In 2007 46th IEEE Conference on Decision and Control, pages 4717–4722. IEEE, 2007.
- [9] J. V. Burke, X. Chen, and H. Sun. The subdifferential of measurable composite max integrands and smoothing approximation. Mathematical Programming, 181(2):229–264, 2020.
- [10] S. Cerreia-Vioglio, D. Dillenberger, and P. Ortoleva. Cautious expected utility and the certainty effect. Econometrica, 83(2):693–728, 2015.
- [11] F. H. Clarke. Optimization and Nonsmooth Analysis. SIAM, 1990.
- [12] J. H. Cochrane. Asset pricing: Revised edition. Princeton university press, 2009.
- [13] R. Cont, R. Deguest, and G. Scandolo. Robustness and sensitivity analysis of risk measurement procedures. Quantitative finance, 10(6):593–606, 2010.
- [14] E. Delage, S. Guo, and H. Xu. Shortfall risk models when information on loss function is incomplete. Operations Research, 2022.
- [15] P. H. Farquhar. State of the art-utility assessment methods. Management science, 30(11):1283–1300, 1984.
- [16] A. L. Gibbs and F. E. Su. On choosing and bounding probability metrics. International statistical review, 70(3):419–435, 2002.
- [17] S. Guo and H. Xu. Statistical robustness in utility preference robust optimization models. Mathematical Programming, pages 1–42, 2021.
- [18] S. Guo and H. Xu. Utility preference robust optimization with moment-type information structure. Optimization online, 2021.
- [19] N. H. Hakansson. Optimal investment and consumption strategies under risk for a class of utility functions, pages 525–545. Elsevier, 1975.
- [20] F. R. Hampel. A general qualitative definition of robustness. The annals of mathematical statistics, 42(6):1887–1896, 1971.
- [21] W. Haskell, H. Xu, and W. Huang. Preference robust optimization for choice functions on the space of cdfs. SIAM Journal on Optimization, 2022.
- [22] W. B. Haskell, W. Huang, and H. Xu. Preference elicitation and robust optimization with multi-attribute quasi-concave choice functions. arXiv preprint arXiv:1805.06632, 2018.
- [23] J. Hu, M. Bansal, and S. Mehrotra. Robust decision making using a general utility set. European Journal of Operational Research, 269(2):699–714, 2018.
- [24] J. Hu and S. Mehrotra. Robust decision making over a set of random targets or risk-averse utilities with an application to portfolio optimization. IIE Transactions, 47(4):358–372, 2015.
- [25] J. Hu and G. Stepanyan. Optimization with reference-based robust preference constraints. SIAM Journal on Optimization, 27(4):2230–2257, 2017.
- [26] P. J. Huber and E. M. Ronchetti. Robust Statistics. Wiley, Hoboken, 2nd edition, 2009.
- [27] U. S. Karmarkar. Subjectively weighted utility: A descriptive extension of the expected utility model. Organizational behavior and human performance, 21(1):61–72, 1978.
- [28] R. L. Keeney, H. Raiffa, and R. F. Meyer. Decisions with multiple objectives: preferences and value trade-offs. Cambridge university press, 1993.
- [29] J. Kennedy and R. Eberhart. Particle swarm optimization. In Proceedings of ICNN’95-international conference on neural networks, volume 4, pages 1942–1948. IEEE, 1995.
- [30] V. Krätschmer, A. Schied, and H. Zähle. Qualitative and infinitesimal robustness of tail-dependent statistical functionals. Journal of Multivariate Analysis, 103(1):35–47, 2012.
- [31] J. Y. Li. Inverse optimization of convex risk functions. Management Science, 67(11):7113–7141, 2021.
- [32] J. Liu, Z. Chen, and H. Xu. Multistage utility preference robust optimization. arXiv preprint arXiv:2109.04789, 2021.
- [33] F. Maccheroni. Maxmin under risk. Economic Theory, 19(4):823–831, 2002.
- [34] E. Mezura-Montes and C. A. C. Coello. Constraint-handling in nature-inspired numerical optimization: past, present and future. Swarm and Evolutionary Computation, 1(4):173–194, 2011.
- [35] O. Morgenstern and J. Von Neumann. Theory of games and economic behavior. Princeton university press, 1953.
- [36] M. Nouiehed, J. Pang, and M. Razaviyayn. On the pervasiveness of difference-convexity in optimization and statistics. Mathematical Programming, 174(1):195–222, 2019.
- [37] W. Ogryczak and A. Ruszczynski. Dual stochastic dominance and related mean-risk models. SIAM Journal on Optimization, 13(1):60–78, 2002.
- [38] G. C. Pflug and W. Römisch. Modeling, measuring and managing risk. World Scientific, 2007.
- [39] Z. Qi, Y. Cui, Y. Liu, and J. Pang. Estimation of individualized decision rules based on an optimized covariate-dependent equivalent of random outcomes. SIAM Journal on Optimization, 29(3):2337–2362, 2019.
- [40] R. T. Rockafellar. Convex analysis, volume 36. Princeton university press, 1970.
- [41] R. T. Rockafellar and S. Uryasev. Optimization of conditional value-at-risk. Journal of risk, 2:21–42, 2000.
- [42] W. Römisch. Stability of stochastic programming problems. Handbooks in operations research and management science, 10:483–554, 2003.
- [43] D. Sauré and J. P. Vielma. Ellipsoidal methods for adaptive choice-based conjoint analysis. Operations Research, 67(2):315–338, 2019.
- [44] L. L. Thurstone. A law of comparative judgment. Psychological review, 34(4):273, 1927.
- [45] K. E. Train. Discrete choice methods with simulation. Cambridge university press, 2009.
- [46] W. Wang and H. Xu. Robust spectral risk optimization when information on risk spectrum is incomplete. SIAM Journal on Optimization, 30(4):3198–3229, 2020.
- [47] W. Wang, H. Xu, and T. Ma. Optimal scenario-dependent multivariate shortfall risk measure and its application in capital allocation. Available at SSRN 3849125, 2021.
- [48] W. Wang, H. Xu, and T. Ma. Quantitative statistical robustness for tail-dependent law invariant risk measures. Quantitative Finance, pages 1–17, 2021.
- [49] M. Weber. Decision making with incomplete information. European journal of operational research, 28(1):44–57, 1987.
- [50] W. Wiesemann, D. Kuhn, and M. Sim. Distributionally robust convex optimization. Operations Research, 62(6):1358–1376, 2014.
- [51] H. Xu and S. Zhang. Quantitative statistical robustness in distributionally robust optimization models. Pacific Journal of Optimization Special Issue, 2021.
- [52] S. Zhang and H. Xu. Preference robust generalized shortfall risk measure based on the cumulative prospect theory when the value function and weighting functions are ambiguous. arXiv preprint arXiv:2112.10142, 2021.