Sensitivity of multiperiod optimization problems with respect to the adapted Wasserstein distance
Abstract.
We analyze the effect of small changes in the underlying probabilistic model on the value of multiperiod stochastic optimization problems and optimal stopping problems. We work in finite discrete time and measure these changes with the adapted Wasserstein distance. We prove explicit first-order approximations for both problems. Expected utility maximization is discussed as a special case.
Both authors thank Mathias Beiglböck, Yifan Jiang, and Jan Obłój for helpful discussions and two referees for a careful reading. DB acknowledges support from Austrian Science Fund (FWF) through projects ESP-31N and P34743. JW acknowledges support from NSF grant DMS-2205534.
1. Introduction
Consider a (real-valued) discrete-time stochastic process whose probabilistic behavior is governed by a reference model . Typically, such a model could describe the evolution of the stochastic process in an idealized probabilistic setting, as is customary in mathematical finance, or it could be derived from historical observations, as is a common assumption in statistics and machine learning. In both cases, one expects to merely approximate the true but unknown model. Consequently, an important question pertains to the effect that (small) misspecifications of have on quantities of interest in these areas. In this note, we analyze this question in two fundamental instances: optimal stopping problems and convex multiperiod stochastic optimization problems. For simplicity we focus on the latter in this introduction: consider
where is convex in the control variable (i.e., its second argument). The admissible controls are the (uniformly bounded) predictable functions , i.e., only depends on . For concreteness, let us mention that utility maximization—an essential problem in mathematical finance—falls into this framework, by setting
where is a concave utility function, is a payoff function and —see Example 2.6 for more details.
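To make the setup concrete, the following brute-force sketch (our own toy code, not from the paper) computes the value of a two-period problem of this type over a finite grid of uniformly bounded predictable controls; all names and the grid are illustrative. Predictability enters through the fact that the time-2 control may depend only on the path observed up to time 1.

```python
from itertools import product

def value_two_period(paths, f, grid):
    """Brute-force value of inf_H E[f(X, H)] over predictable controls on a
    finitely supported two-period model. paths: list of (prob, x0, x1, x2);
    H = (h1, h2) with h1 deterministic and h2 a function of (x0, x1); both
    range over the finite grid (playing the role of the uniform bound)."""
    histories = sorted({(x0, x1) for _, x0, x1, _ in paths})
    best = float("inf")
    for h1 in grid:
        for h2s in product(grid, repeat=len(histories)):
            h2 = dict(zip(histories, h2s))
            cost = sum(p * f((x0, x1, x2), (h1, h2[(x0, x1)]))
                       for p, x0, x1, x2 in paths)
            best = min(best, cost)
    return best
```

For utility maximization one would take f to be a loss applied to the hedged terminal wealth; on a binomial tree with the quadratic loss and the payoff equal to the terminal price, the grid containing the perfect hedge returns value zero.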
The question of how changes of the model influence the value of clearly depends on the chosen distance between models. In order to answer it in a generic way (i.e., without restricting to parametric models), one first needs to choose a suitable metric on the laws of stochastic processes . A crucial observation that has appeared in different contexts and goes back at least to Aldous (1981); Hellwig (1996); Pflug (2010); Pflug and Pichler (2012) is that any metric compatible with weak convergence (and also variants thereof that account for integrability, such as the Wasserstein distance) is too coarse to guarantee continuity of the map in general. Roughly put, the reason is that two processes can have very similar laws but completely different filtrations; hence completely different sets of admissible controls. This fact has been rediscovered several times during the last decades, and researchers from different fields have introduced modifications of the weak topology that guarantee such continuity properties of ; we refer to Backhoff-Veraguas et al. (2020b) for detailed historical references. Strikingly, all the different modifications of the weak topology turn out to be equivalent to the so-called weak adapted topology: this is the coarsest topology that makes multiperiod optimization problems continuous, see Backhoff-Veraguas et al. (2020b); Bartl et al. (2021a).
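The following toy computation (ours, not taken from the paper) makes the filtration issue tangible: two two-period processes whose laws are arbitrarily close in the ordinary Wasserstein distance, yet whose optimal stopping values stay far apart, because one model reveals the final value already at time 1 and the other does not.

```python
from collections import defaultdict

def stopping_value(paths):
    """Optimal expected stopped value for a finitely supported two-period
    process given as a list of (prob, x1, x2) paths; the filtration is the
    one generated by the path, so at time 1 we may condition on x1."""
    groups = defaultdict(list)
    for p, x1, x2 in paths:
        groups[x1].append((p, x2))
    value = 0.0
    for x1, branch in groups.items():
        p_branch = sum(p for p, _ in branch)
        cont = sum(p * x2 for p, x2 in branch) / p_branch  # E[X2 | X1 = x1]
        value += p_branch * max(x1, cont)                  # stop now or continue
    return value

eps = 0.01
# sign of the final value already revealed at time 1:
informative = [(0.5, +eps, +1.0), (0.5, -eps, -1.0)]
# same time-2 marginal, but time 1 carries no information:
uninformative = [(0.5, 0.0, +1.0), (0.5, 0.0, -1.0)]

print(stopping_value(informative))    # (1 - eps)/2 = 0.495
print(stopping_value(uninformative))  # 0.0
```

As eps tends to zero the two laws converge to each other in the Wasserstein distance, while the stopping values stay 1/2 apart; the adapted Wasserstein distance between the models does not vanish, consistent with the continuity results cited above.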
With the choice of topology settled, the next question pertains to the choice of a suitable distance. This is already relevant in a one-period framework, where the weak and weak adapted topologies coincide. Recent research shows that the Wasserstein distance, which metrizes the weak topology, is (perhaps surprisingly) powerful and versatile. Analogously, the adapted Wasserstein distance (see Section 2 for the definition) metrizes the weak adapted topology and Pflug (2010); Backhoff-Veraguas et al. (2020a) show that the multiperiod optimization problems considered in this note are Lipschitz-continuous w.r.t. . However, the Lipschitz constants depend on global continuity parameters of and are thus far from being sharp in general. Moreover, the exact computation of the (worst case) value of , where is in a neighbourhood of , requires solving an infinite-dimensional optimization problem, which does not admit explicit solutions in general. Both of these issues already occur in a one-period setting—despite the results of, e.g., Blanchet and Murthy (2019); Bartl et al. (2019), which relate this infinite-dimensional optimization problem to a simpler dual problem. In conclusion, computing the error
exactly is only possible in a few (arguably degenerate) cases.
In this note we address both issues by extending ideas of Bartl et al. (2021b); Oblój and Wiesel (2021) from a one-period setting to a multiperiod setting. The key insight of these works is that in a one-period setting, computing first-order approximations for is virtually always possible, while obtaining exact expressions might be infeasible in many cases. Our results go hand in hand with those of Bartl et al. (2021b), and we obtain explicit closed-form solutions for , which have intuitive interpretations. For instance, we show in Theorem 2.4 that, under mild integrability and differentiability assumptions,
holds as , where is the partial derivative with respect to the -th coordinate of , , is the unique optimizer for and denotes the Landau symbol.
In the case of utility maximization with (and for simplicity), the first-order correction term is essentially the expected quadratic variation of , computed not under , but distorted by the -distance of the conditioned Radon-Nikodym density of an equivalent martingale measure w.r.t. —see Example 2.6 for details.
Investigating robustness of optimization problems in varying formulations is a recurring theme in the optimization literature; we refer to Rahimian and Mehrotra (2019) and the references therein for an overview. In the context of mathematical finance, representing distributional uncertainty through Wasserstein neighbourhoods goes back (at least) to Pflug and Wozabal (2007) and has seen a spike in recent research activity, leading to many impressive developments, see, e.g., duality results in Gao and Kleywegt (2016); Blanchet and Murthy (2019); Kuhn et al. (2019); Bartl et al. (2019) and applications in mathematical finance Blanchet et al. (2021), machine learning and statistics Shafieezadeh-Abadeh et al. (2019); Blanchet et al. (2020). Our theoretical results are directly linked to Acciaio and Hou (2022); Backhoff et al. (2022) characterizing the speed of convergence between the true and the (modified) empirical measure in the adapted Wasserstein distance and to new developments for computationally efficient relaxations of optimal transport problems, see Eckstein and Pammer (2022). For completeness, we mention that other notions of distance have been used to model distributional uncertainty, see e.g., Lam (2016, 2018); Calafiore (2007) in the context of operations research, Huber (2011); Lindsay (1994) in the context of statistics, and Herrmann and Muhle-Karbe (2017); Hobson (1998); Karoui et al. (1998) in the context of mathematical finance.
2. Main results
2.1. Preliminaries
We start by setting up notation. Let , let be the path space of a stochastic process in finite discrete time, and let denote the set of all Borel probability measures on with finite -th moment. Throughout this article, is the identity (i.e., the canonical process) and denote the projections to the first and second coordinate, respectively. The filtration generated by is denoted by , i.e., and . Sometimes we write for the filtration generated by the processes .
For a function we write for the partial derivative of in -th coordinate of , that is,
where is the -th unit vector; for the gradient in , and for the Hessian in . We adopt the same notation for functions or and write for the partial derivative of in -th coordinate of . For univariate functions we simply write for the first and second derivatives.
Definition 2.1.
Let . A Borel probability measure on is called a coupling (between and ) if its first marginal distribution is and its second one is . A coupling is called causal if
(2.1) |
-almost surely for all Borel sets and all ; a causal coupling is called bicausal if (2.1) holds also with the roles of and reversed.
Phrased differently, (2.1) means that under , conditionally on the ‘past’ , the ‘future’ is independent of ; see e.g. (Bartl et al., 2021a, Lemma 2.2) for this and further equivalent characterizations of (bi-)causality. It is also instructive to analyze condition (2.1) in the case of a Monge-coupling, i.e., when there is a transport map such that -almost surely. Indeed, then (2.1) simply means that needs to be -measurable.
Fix and define the adapted Wasserstein distance on by
(2.2) |
where the infimum is taken over all bicausal couplings between and . Set
for , and denote by the conjugate Hölder exponent of .
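For finitely supported processes with two time steps, the infimum in (2.2) decomposes by backward induction: transport the time-1 values with a cost that adds the value of an optimal transport between the conditional laws at time 2. The following brute-force sketch (ours; standard library only, with toy assumptions stated in the comments) illustrates this nested structure and is not meant as an efficient algorithm.

```python
from itertools import permutations

def wpp(xs, ys, p):
    """W_p^p between equal-weight empirical measures on the real line:
    for the convex cost |x - y|^p the monotone (sorted) coupling is optimal."""
    xs, ys = sorted(xs), sorted(ys)
    return sum(abs(a - b) ** p for a, b in zip(xs, ys)) / len(xs)

def awpp(mu, nu, p=2):
    """AW_p^p for mu, nu given as lists of (x1, x2) paths. Toy assumptions:
    equal path weights, equally many (equally weighted) time-1 values on both
    sides, and conditional supports of equal size, so an optimal outer
    coupling may be taken to be a permutation (Birkhoff) and wpp applies to
    the conditional laws."""
    def group(paths):
        g = {}
        for x1, x2 in paths:
            g.setdefault(x1, []).append(x2)
        return sorted(g.items())  # [(x1, atoms of the conditional law), ...]
    gm, gn = group(mu), group(nu)
    best = float("inf")
    for perm in permutations(range(len(gn))):
        # outer transport cost = time-1 cost + value of the inner transport
        cost = sum(abs(gm[i][0] - gn[j][0]) ** p + wpp(gm[i][1], gn[j][1], p)
                   for i, j in enumerate(perm))
        best = min(best, cost / len(gm))
    return best
```

The bicausality constraint is built in: the inner transport between conditional laws may use only the time-1 information, which is exactly what the nested cost encodes.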
2.2. The uncontrolled case
We are now in a position to state the main results of the paper. We start with a simplified case, where depends on only and there are no controls. The sensitivities of the stochastic optimization and optimal stopping problems in Sections 2.3 and 2.4, respectively, can be seen as natural extensions of this result; indeed, the sensitivity computed in Theorem 2.2 already exhibits the structure common to all results presented here.
Theorem 2.2.
Let be continuously differentiable and assume that there exists such that
for every . Then, as ,
2.3. Multiperiod stochastic optimization problems
Fix a constant throughout this section, and denote by the set of all predictable controls bounded by , i.e., every is such that only depends on (with the convention that is deterministic) and that for a fixed constant . Recall that
where is assumed to be convex in the control variable (i.e., its second argument).
Assumption 2.3.
For every , is twice continuously differentiable and strongly convex in the sense that on , where is the identity matrix and . (For two -matrices and , we write if is positive semidefinite, that is, for all .)
Moreover, is differentiable for every , its partial derivatives are jointly continuous, and there is a constant such that
for every and .
Theorem 2.4.
If Assumption 2.3 holds true, then there exists exactly one such that . Furthermore, as ,
Remark 2.5.
The restriction to controls that are uniformly bounded (i.e., satisfy ) is necessary to guarantee continuity of in general. This can be seen easily in the utility maximization example below—even when restricting to models that satisfy a no-arbitrage condition, see, e.g., (Backhoff-Veraguas et al., 2020a, Remark 5.3).
Example 2.6.
Let be a convex loss function, i.e., is bounded from below and convex. Moreover let be (the negative of) a payoff function and consider the problem
where is a fixed value. As discussed in the introduction, corresponds to the utility maximization problem with payoff .
Suppose that is twice continuously differentiable with and , that is continuously differentiable with bounded derivative, and that for all . Then Assumption 2.3 is satisfied.
The assumption that is used to prove strong convexity in the sense of Assumption 2.3. In the present one-dimensional setting, it simply means that the stock price does not stay constant from time to with positive probability. It is satisfied, for instance, if is a binomial tree under , or if has a density with respect to the Lebesgue measure—in particular, if is a discretized SDE with non-zero volatility. Moreover, the assumption that the derivative of is bounded can be relaxed at the price of restricting to with slower growth.
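In this one-dimensional setting the role of the non-degeneracy condition can be checked by hand in one period with quadratic loss: the optimizer is unique precisely when the price increment is not almost surely zero. A minimal sketch (ours, with illustrative names):

```python
def quad_hedge(scenarios):
    """One-period quadratic hedging: v = inf_a E[(a*dS - f)^2] over a in R,
    with closed-form optimizer a* = E[dS*f] / E[dS^2] whenever E[dS^2] > 0.
    scenarios: list of (prob, dS, f). If dS = 0 a.s., the objective is flat
    in a (no strong convexity) and every control is optimal."""
    m2 = sum(p * dS * dS for p, dS, _ in scenarios)
    if m2 == 0.0:
        raise ValueError("dS = 0 a.s.: no strong convexity, optimizer not unique")
    cross = sum(p * dS * f for p, dS, f in scenarios)
    a = cross / m2
    value = sum(p * (a * dS - f) ** 2 for p, dS, f in scenarios)
    return a, value
```

On the symmetric binomial model with payoff equal to the price increment, the perfect hedge a* = 1 is returned with value zero.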
Corollary 2.7.
Note that for and , is essentially the expected quadratic variation of , computed not under , but distorted by the -distance of the conditioned Radon-Nikodym density of an equivalent martingale measure w.r.t. .
2.4. Optimal stopping problems
Let be such that is -measurable for and consider
where refers to the set of all bounded stopping times with respect to the canonical filtration, i.e., if is such that for every .
Theorem 2.8.
Assume that is continuously differentiable for every and that there is a constant such that
for every and . Furthermore assume that there exists exactly one optimal stopping time for . Then, as ,
Example 2.9.
It is instructive to consider Theorem 2.8 in the special case where is Markovian, i.e., there is a function such that for all and . Indeed, in this case, the first-order correction term simplifies to .
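In the Markovian case, the value and the optimal stopping time can be computed by backward induction on the Snell envelope. A minimal sketch (ours, with illustrative names), assuming a finite state space and time-indexed transition kernels:

```python
def snell_value(states, trans, reward, T):
    """Backward induction for a Markovian optimal stopping problem:
    V_T(x) = c(T, x),  V_t(x) = max(c(t, x), E[V_{t+1}(X_{t+1}) | X_t = x]).
    trans[t][x] is a dict y -> P(X_{t+1} = y | X_t = x); reward plays c."""
    V = {x: reward(T, x) for x in states}
    for t in range(T - 1, -1, -1):
        V = {x: max(reward(t, x),
                    sum(p * V[y] for y, p in trans[t][x].items()))
             for x in states}
    return V  # value function at time 0
```

Stopping as soon as the reward attains the envelope recovers the smallest optimal stopping time; uniqueness, as assumed in Theorem 2.8, corresponds to the maximum above never being attained by both terms with positive probability before the horizon.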
2.5. Extensions and open questions
To the best of our knowledge, this is the first work addressing the nonparametric sensitivity of multiperiod optimization problems (w.r.t. the adapted Wasserstein distance). Below we identify possible extensions, which are outside the scope of the current article. We plan to address these in future work.
- (1)
- (2)
A natural extension of our results from a financial perspective would be the analysis of sensitivities for robust option pricing: let be a martingale law (i.e., is a -martingale under ) and consider
In one-period models () this was carried out in Bartl et al. (2021b); Nendel and Sgarabottolo (2022). In a similar manner, it is natural to analyze the sensitivity of robust American option pricing by considering only martingales in Theorem 2.8.
- (3)
There are certain natural examples of that do not satisfy our regularity assumptions, e.g., in mathematical finance. In a one-period framework, the regularity of can be relaxed systematically, see Bartl et al. (2021b); Nendel and Sgarabottolo (2022), and it is interesting to investigate whether the same is possible here.
- (4)
In some examples, the restriction to bounded controls is automatic, see, e.g., Rásonyi and Stettner (2005). For instance, in the setting of Example 2.6 with , we suspect that similar arguments as used in Rásonyi and Stettner (2005) might show that a “conditional full support condition” of is sufficient to obtain first-order approximation with unbounded strategies.
- (5)
- (6)
Motivated from the literature on distributionally robust optimization problems cited in the introduction, one could also consider min-max problems of the form
An important observation is that most arguments in the analysis of such problems (in the one-period setting) heavily rely on (convexity and) compactness of ; both properties fail to hold true in multiple periods. It was recently shown in Bartl et al. (2021a) that these can be recovered by passing to an appropriate factor space of processes together with general filtrations.
- (7)
The present methods can be extended to cover functionals that depend not only on but also on its disintegrations—as is common in weak optimal transport (see, e.g., Gozlan et al. (2017)). As an example, consider and , where the functions and are suitably (Fréchet) differentiable. Using the same arguments as in the proof of Theorem 2.2, one can show that the first-order correction term equals . (When completing a first draft of this paper, we learned that similar results have been established by Jiang in independent research.)
3. Proofs
3.1. Proof of Theorem 2.2
We need the following technical lemma, which essentially states that causal couplings can be approximated by bicausal ones with similar marginals. For a Borel probability measure on and a Borel mapping from to another Polish space, denotes the push-forward of the measure under .
Lemma 3.1.
Let and let be a causal coupling between and . For each there exists such that is -measurable, is -measurable, and for every .
In particular, is a bicausal coupling between and .
Proof.
For we consider the Borel mappings
where is a (Borel-)isomorphism and . For , set
By definition is -measurable, is -measurable, and . It remains to note that the bicausality constraints (2.1) are clearly satisfied. ∎
Proof of Theorem 2.2.
To simplify notation, set
We first prove the upper bound, that is
(3.1) |
To that end, for any , let be such that
and let be an (almost) optimal bicausal coupling between and , i.e.,
The fundamental theorem of calculus and Fubini’s theorem imply
Moreover, by the tower property and Hölder’s inequality,
We next claim that, for every ,
(3.2) |
as . Indeed, since is bicausal, we have that
-almost surely, see, e.g., (Bartl et al., 2021a, Lemma 2.2). Therefore, Jensen’s inequality shows that
which converges to zero; this follows from the continuity of , since , since by the growth assumption, and since . Then (3.2) follows from the triangle inequality.
We conclude that
where the second inequality follows from Hölder’s inequality between and . Recalling that is an almost optimal bicausal coupling between and and that , this shows (3.1).
It remains to prove the lower bound, that is,
(3.3) |
To that end, we first use the duality between and , which yields the existence of satisfying
Next we use duality between and which yields the existence of random variables satisfying
for . Combining both results,
(3.4) |
Note that, since is -measurable, can be chosen -measurable as well.
At this point, for fixed , we would like to define as the law of and as the law of . Since is -measurable, is clearly causal. Unfortunately, however, it need not be bicausal in general. We thus first apply Lemma 3.1 to and with , which yields measures and processes which satisfy the assertion of Lemma 3.1.
Now fix . Since
we have that, for every ,
Using the fundamental theorem of calculus and Fubini’s theorem as before, the fact that is -measurable shows that
as , by the growth assumption since in .
In a final step we let . Applying the previous step to shows that
where the equality holds by using the growth assumption on . Recalling the choice of (see (3.4)) completes the proof. ∎
3.2. Proof of Theorem 2.4
The proof of Theorem 2.4 is similar in structure to the proof of Theorem 2.2, but some additional arguments are needed to take care of the optimization in . Throughout, we work under Assumption 2.3. We start with two auxiliary results.
Lemma 3.2.
Let be such that for . Then .
Proof.
Let and let be a bicausal coupling between and . Let be arbitrary and fix that satisfies . Next define by
By bicausality, is actually measurable with respect to (see, e.g., (Bartl et al., 2021a, Lemma 2.2)) and clearly ; thus . Moreover, convexity of implies that
The fundamental theorem of calculus and Hölder’s inequality yield
Using the growth assumption and arguing as in the proof of Theorem 2.2, the last term is at most of order . As was arbitrary, this shows (where again denotes the Landau symbol) and reversing the roles of and completes the proof. ∎
Lemma 3.3.
There exists exactly one such that .
Proof.
This is a standard result. The existence follows from Komlós' lemma Komlós (1967) and uniqueness from strict convexity. ∎
Proof of Theorem 2.4.
Let be the unique optimizer of (see Lemma 3.3) and, for shorthand notation, set for .
We first prove the upper bound. We claim that it follows from combining the reasoning in the proof of Theorem 2.2 and Lemma 3.2. Indeed, let be such that
and let be a bicausal coupling between and that is (almost) optimal for . Define by and use convexity of to conclude that
From here on, it follows from the fundamental theorem of calculus and Hölder’s inequality just as in the proof of Theorem 2.2 that
This completes the proof of the upper bound.
We proceed with the lower bound. To that end, we start with the same construction as in the proof of Theorem 2.2: let be the law of where satisfies (3.4), that is, is -measurable for every such that
Again, might be only causal and not bicausal, and we need to rely on Lemma 3.1. For the sake of a clearer presentation, we omit this step here.
For each , let be almost optimal for , that is
Observe that, by construction of (i.e., since is -measurable for each ), there is such that .
Now let be an arbitrary sequence that converges to zero. By Lemma 3.4 below, after passing to a subsequence , converges to -almost surely. Since
for all , the fundamental theorem of calculus and the growth assumption imply
Since -almost surely, the continuity of and the growth assumption imply that
To complete the proof, it remains to recall the choice of and that was an arbitrary sequence. ∎
Lemma 3.4.
In the setting of the proof of Theorem 2.4: there exists a subsequence such that -almost surely.
Proof.
Recall that was chosen almost optimally for , hence
where the last inequality holds by continuity and the growth assumptions on , see the proof of Theorem 2.4. Next recall that for with . In particular, a second order Taylor expansion shows that
The first term is non-negative by optimality of . Thus, since by Lemma 3.2, this implies that the second term must converge to zero. As is strictly positive, this can only happen if in -probability. Hence, after passing to a subsequence, -almost surely. ∎
3.3. Proof of Corollary 2.7
For shorthand notation, set
The goal is to apply Theorem 2.4 to the function
for . To that end, we start by checking Assumption 2.3. Since is continuously differentiable and is twice continuously differentiable, the parts of Assumption 2.3 pertaining to the differentiability of hold true. Moreover,
for any . Since and for every by assumption, one can readily verify that there is with such that
Next observe that
A quick computation involving the growth assumption on and shows that
In particular, Assumption 2.3 is satisfied, and the proof follows by applying Theorem 2.4. ∎
3.4. Proof of Theorem 2.8
We start with the upper bound. To that end, let be the optimal stopping time for , let be such that , and let be an (almost) optimal bicausal coupling for . Using an argument similar to that in Lemma 3.2, we can use the coupling to build a stopping time such that
—see (Backhoff-Veraguas et al., 2020b, Lemma 7.1) or (Bartl et al., 2021a, Proposition 5.8) for detailed proofs. Under the growth assumption on , the fundamental theorem of calculus and Fubini’s theorem yield
where the last inequality follows from Hölder’s inequality and since
in the same way as in the proof of Theorem 2.2. We also conclude using similar arguments that
for every and every .
We proceed with the lower bound. To make the presentation concise, we assume here that —the general case follows from a (somewhat tedious) adaptation of the arguments presented here. The assumption that the optimal stopping time is unique implies, by the Snell envelope theorem, that
in particular
(3.5) |
As before, set and take that satisfies (3.4), i.e., is -measurable for every , and
(3.6) |
Next, for every , set
Define the process
Since and are -measurable, the coupling is causal between and . Using Lemma 3.1 (just as in the proof of Theorem 2.2), we can assume without loss of generality that is in fact bicausal and that for each ; we leave this detail to the reader and proceed.
In particular, since
it follows from (3.6) that ; thus
(3.7) | ||||
(3.8) |
where the equality holds by the Snell envelope theorem and since .
Next note that
Combined with (3.7) and since
we get
where the inequality holds by the tower property. Using the fundamental theorem of calculus just as in the proof of Theorem 2.2 shows that
as , where the convergence holds because, by (3.5), and . To complete the proof, it remains to recall the definition of , see (3.6). ∎
References
- Acciaio and Hou [2022] Beatrice Acciaio and Songyan Hou. Convergence of adapted empirical measures on . arXiv preprint arXiv:2211.10162, 2022.
- Aldous [1981] D. Aldous. Weak convergence and general theory of processes. Department of Statistics, University of California, Berkeley, CA 94720, 1981.
- Backhoff et al. [2022] Julio Backhoff, Daniel Bartl, Mathias Beiglböck, and Johannes Wiesel. Estimating processes in adapted Wasserstein distance. The Annals of Applied Probability, 32(1):529–550, 2022.
- Backhoff-Veraguas et al. [2020a] Julio Backhoff-Veraguas, Daniel Bartl, Mathias Beiglböck, and Manu Eder. Adapted Wasserstein distances and stability in mathematical finance. Finance and Stochastics, 24(3):601–632, 2020a.
- Backhoff-Veraguas et al. [2020b] Julio Backhoff-Veraguas, Daniel Bartl, Mathias Beiglböck, and Manu Eder. All adapted topologies are equal. Probability Theory and Related Fields, 178(3):1125–1172, 2020b.
- Bartl et al. [2019] Daniel Bartl, Samuel Drapeau, and Ludovic Tangpi. Computational aspects of robust optimized certainty equivalents and option pricing. Mathematical Finance, 9(1):203, March 2019.
- Bartl et al. [2021a] Daniel Bartl, Mathias Beiglböck, and Gudmund Pammer. The Wasserstein space of stochastic processes. arXiv preprint arXiv:2104.14245, 2021a.
- Bartl et al. [2021b] Daniel Bartl, Samuel Drapeau, Jan Oblój, and Johannes Wiesel. Sensitivity analysis of Wasserstein distributionally robust optimization problems. Proceedings of the Royal Society A, 477(2256):20210176, 2021b.
- Blanchet and Murthy [2019] Jose Blanchet and Karthyek Murthy. Quantifying distributional model risk via optimal transport. Mathematics of Operations Research, 44(2):565–600, 2019.
- Blanchet et al. [2020] José Blanchet, Yang Kang, José Luis Montiel Olea, Viet Anh Nguyen, and Xuhui Zhang. Machine learning’s dropout training is distributionally robust optimal. arXiv preprint arXiv:2009.06111, 2020.
- Blanchet et al. [2021] Jose Blanchet, Lin Chen, and Xun Yu Zhou. Distributionally robust mean-variance portfolio selection with Wasserstein distances. Management Science, 2021.
- Calafiore [2007] Giuseppe C Calafiore. Ambiguous risk measures and optimal robust portfolios. SIAM Journal on Optimization, 18(3):853–877, 2007.
- Eckstein and Pammer [2022] Stephan Eckstein and Gudmund Pammer. Computational methods for adapted optimal transport. arXiv preprint arXiv:2203.05005, 2022.
- Gao and Kleywegt [2016] Rui Gao and Anton J Kleywegt. Distributionally robust stochastic optimization with Wasserstein distance. arXiv preprint arXiv:1604.02199, 2016.
- Gozlan et al. [2017] Nathael Gozlan, Cyril Roberto, Paul-Marie Samson, and Prasad Tetali. Kantorovich duality for general transport costs and applications. Journal of Functional Analysis, 273(11):3327–3405, 2017.
- Hellwig [1996] M. Hellwig. Sequential decisions under uncertainty and the maximum theorem. J. Math. Econom., 25(4):443–464, 1996.
- Herrmann and Muhle-Karbe [2017] Sebastian Herrmann and Johannes Muhle-Karbe. Model uncertainty, recalibration, and the emergence of delta–vega hedging. Finance and Stochastics, 21(4):873–930, 2017.
- Hobson [1998] David G Hobson. Volatility misspecification, option pricing and superreplication via coupling. Annals of Applied Probability, pages 193–205, 1998.
- Huber [2011] Peter J Huber. Robust statistics. In International encyclopedia of statistical science, pages 1248–1251. Springer, 2011.
- Jiang [2023] Yifan Jiang. Wasserstein distributional sensitivity to model uncertainty in a dynamic context. DPhil Transfer of Status Thesis, University of Oxford, January 2023. Private communication.
- Karoui et al. [1998] Nicole El Karoui, Monique Jeanblanc-Picqué, and Steven E Shreve. Robustness of the Black and Scholes formula. Mathematical Finance, 8(2):93–126, 1998.
- Komlós [1967] Janos Komlós. A generalization of a problem of Steinhaus. Acta Mathematica Academiae Scientiarum Hungaricae, 18(1-2):217–229, 1967.
- Kuhn et al. [2019] Daniel Kuhn, Peyman Mohajerin Esfahani, Viet Anh Nguyen, and Soroosh Shafieezadeh-Abadeh. Wasserstein distributionally robust optimization: Theory and applications in machine learning. In Operations research & management science in the age of analytics, pages 130–166. Informs, 2019.
- Lam [2016] Henry Lam. Robust sensitivity analysis for stochastic systems. Mathematics of Operations Research, 41(4):1248–1275, 2016.
- Lam [2018] Henry Lam. Sensitivity to serial dependency of input processes: A robust approach. Management Science, 64(3):1311–1327, 2018.
- Lindsay [1994] Bruce G Lindsay. Efficiency versus robustness: the case for minimum Hellinger distance and related methods. The Annals of Statistics, 22(2):1081–1114, 1994.
- Nendel and Sgarabottolo [2022] Max Nendel and Alessandro Sgarabottolo. A parametric approach to the estimation of convex risk functionals based on Wasserstein distances. arXiv preprint arXiv:2210.14340, 2022.
- Oblój and Wiesel [2021] Jan Oblój and Johannes Wiesel. Distributionally robust portfolio maximization and marginal utility pricing in one period financial markets. Mathematical Finance, 31(4):1454–1493, 2021.
- Pflug [2010] G Ch Pflug. Version-Independence and Nested Distributions in Multistage Stochastic Optimization. SIAM Journal on Optimization, 20(3):1406–1420, January 2010.
- Pflug and Wozabal [2007] Georg Pflug and David Wozabal. Ambiguity in portfolio selection. Quantitative Finance, 7(4):435–442, 2007.
- Pflug and Pichler [2012] Georg Ch Pflug and Alois Pichler. A Distance For Multistage Stochastic Optimization Models. SIAM Journal on Optimization, 22(1):1–23, January 2012.
- Rahimian and Mehrotra [2019] Hamed Rahimian and Sanjay Mehrotra. Distributionally robust optimization: A review. arXiv preprint arXiv:1908.05659, 2019.
- Rásonyi and Stettner [2005] Miklós Rásonyi and Lukasz Stettner. On utility maximization in discrete-time financial market models. The Annals of Applied Probability, 15(2):1367–1395, 2005.
- Shafieezadeh-Abadeh et al. [2019] Soroosh Shafieezadeh-Abadeh, Daniel Kuhn, and Peyman Mohajerin Esfahani. Regularization via mass transportation. Journal of Machine Learning Research, 20(103):1–68, 2019.