On the existence of Monge solutions to multi-marginal optimal transport with quadratic cost and uniform discrete marginals

Pedram Emami [email protected], [email protected] and Brendan Pass University of Alberta, Edmonton, Alberta, Canada

Abstract.

A natural and important question in multi-marginal optimal transport is whether the Monge ansatz is justified; does there exist a solution of Monge, or deterministic, form? We address this question for the quadratic cost when each marginal measure is $m$ -empirical (that is, uniformly supported on $m$ points). By direct computation, we provide an example showing that the ansatz can fail when the underlying dimension $d$ is $2$ , number of marginals $N$ to be matched is $3$ and the size $m$ of their supports is $3$ . As a consequence, the set of $m$ -empirical measures is not barycentrically convex when $N\geq 3$ , $d\geq 2$ and $m\geq 3$ . It is a well known consequence of the Birkhoff-von Neumann Theorem that the Monge ansatz holds for $N=2$ , standard techniques show it holds when $d=1$ , and we provide a simple proof here that it holds whenever $m=2$ . Therefore, the $N$ , $d$ and $m$ in our counterexample are as small as possible.

Key words and phrases:

multi-marginal optimal transport, discrete marginals, Monge solutions, Wasserstein barycenter

2010 Mathematics Subject Classification:

49Q22

BP is pleased to acknowledge support from Natural Sciences and Engineering Research Council of Canada Grant 04658-2018. The work of PE was completed in partial fulfillment of the requirements for a doctoral degree in mathematics at the University of Alberta.

1. Introduction

Given probability measures $\mu_{1},\mu_{2},...,\mu_{N}$ (known as the marginals) on respective bounded domains $X_{1},X_{2},...,X_{N}\subseteq\mathbb{R}^{d}$ and a cost function $c:X_{1}\times X_{2}\times...\times X_{N}\rightarrow\mathbb{R}$ , the multi-marginal optimal transport problem is to minimize

(1)

\inf_{\gamma\in\Pi(\mu_{1},\mu_{2},...,\mu_{N})}\int_{X_{1}\times X_{2}\times...\times X_{N}}c(x_{1},x_{2},...,x_{N})\,d\gamma,

over the set $\Pi(\mu_{1},\mu_{2},...,\mu_{N})$ of probability measures on $X_{1}\times X_{2}\times...\times X_{N}$ whose marginals are the $\mu_{i}$ . In particular, the $N=2$ case is the classical optimal transport problem, which represents a very active area of modern mathematics, with many applications, reviewed in, for example [27, 25, 28]. The more general case, $N\geq 3$ , has recently emerged as an important problem in its own right, owing largely to it own wide variety of applications; see, for instance, the survey papers [19, 8].

A central question in this theory is whether minimizers concentrate on the graphs of functions over the first variable $x_{1}$ ; solutions with this structure are known as Monge solutions. Restricting the minimization to measures arising from such mappings, or, adopting the terminology of [9], making the Monge ansatz, when justified, amounts to a spectacular dimensional reduction and therefore greatly simplifies the problem computationally. In the classical $N=2$ case, it is well known that if $\mu_{1}$ is absolutely continuous with respect to Lebesgue measure, then, for cost satisfying mild structural conditions, the solution to (1) is unique and of Monge type [4, 11, 15]. Central among such costs is the quadratic cost, $c(x_{1},x_{2})=|x_{1}-x_{2}|^{2}$ , which induces a metric, known as the Wasserstein distance, on the set of probability measures, defined by $W_{2}(\mu_{1},\mu_{2}):=\sqrt{\inf_{\gamma\in\Pi(\mu_{1},\mu_{2})}\int_{X_{1}\times X_{2}}|x_{1}-x_{2}|^{2}\,d\gamma}$ ; Monge solution and uniqueness results for this cost were first proven by Brenier [3]. The situation is much more subtle and less well understood when $N\geq 3$ . However, by now many examples of cost functions for which unique Monge solutions exist for absolutely continuous marginals [5, 12, 7, 22, 20], as well as many for which they do not [6, 18, 17], have been discovered, and some general (though very strong) conditions on $c$ ensuring this structure have been developed [16, 14, 21]. Of special note is the natural, multi-marginal extension of the quadratic cost,

(2)

c(x_{1},x_{2},...,x_{N})=\sum_{i,j=1}^{N}|x_{i}-x_{j}|^{2},

or equivalently, $c(x_{1},x_{2},...,x_{N})=-\left|\sum_{i=1}^{N}x_{i}\right|^{2}$ , for which existence of Monge solutions and uniqueness were proven by Gangbo and Swiech [10]. Solving (1) with this cost is known to be equivalent to finding a Wasserstein barycenter (with equal weights on the measures), which, as introduced by Agueh-Carlier [1], is a minimizer among probability measures $\nu$ on $\mathbb{R}^{d}$ of

(3)

\nu\mapsto\sum_{i=1}^{N}W_{2}^{2}(\nu,\mu_{i}).

Wasserstein barycenters are Fréchet means on the Wasserstein space, and are an extremely active area of current research, due to many applications in statistics, image processing and data science, among other areas (see, for instance, [24, 13, 29, 26] and the survey in the monograph [23]).

In this short note we are interested in discrete, rather than absolutely continuous, marginals $\mu_{i}$ . In general, optimal couplings between discrete measures need not be unique or of Monge type; indeed, in general, there may not even exist a single coupling $\gamma\in\Pi(\mu_{1},\mu_{2},...,\mu_{N})$ which is of Monge type. However, such transport maps at least exist when all the marginals are uniformly distributed on a common number $m$ of points, as is often the case in applications. We will call such measures $m$ -empirical measures. In this case, when $N=2$ , the existence of Monge solutions for any cost function $c$ is an easy consequence of the Birkhoff-von Neumann Theorem, although solutions to (1) are not unique in general. An immediate consequence is that the set of $m$ -empirical measures is geodesically convex in Wasserstein space; that is, any two elements can be connected by a Wasserstein geodesic (often called a displacement interpolant) which lies in this set.

On the other hand, even for $m$ -empirical measures, Monge solutions for $N\geq 3$ do not exist for some cost functions. Indeed, as exposed in [9], when $N\geq 3$ , the convex polytope $\Pi(\mu_{1},...,\mu_{N})$ has extreme points which are not of Monge type, unlike in the $N=2$ case. This immediately implies the existence of cost functions for which Monge solutions do not exist, and [9] provides examples with relevance in physics for which this occurs. Intuition provided there, however, suggests that it is the repulsive nature of certain costs which leads to non-Monge optimizers for $m$ -empirical measures, in agreement with known results in the continuous case. It remains reasonable to expect existence of Monge solutions for the quadratic cost function (2). It is similarly natural to conjecture that $m$ -empirical marginals always have $m$ -empirical Wasserstein barycenters (that is, that the set of $m$ -empirical measures is barycentrically convex); we note that upper bounds on the size of the support of barycenters are known for general discrete measures [2], but the bound obtained there is $N(m-1)+1$ in the case of $m$ -empirical marginals, much larger than $m$ .

We exhibit here, by direct computation, an example showing that Monge solutions may not exist for $N\geq 3$ even for the quadratic cost. An immediate consequence is that the set of $m$ -empirical measures is not barycentrically convex; there exist $m$ -empirical marginals $\mu_{i}$ for which there does not exist a Wasserstein barycenter within the same set.

Our counterexample is for three measures ( $N=3$ ) in two dimensions ( $d=2$ ) supported on three points ( $m=3$ ). These are the smallest values of these three parameters for which such counterexamples can exist. Indeed, for $N=2$ , Monge solutions can always be found as discussed above. When $d=1$ , optimal couplings are well known to be monotone increasing, which implies they are of Monge form and unique. Finally, we present here a simple proof that there always exists a Monge solution for any collection of $2$ -empirical marginals, for any values of $d$ and $N$ .

2. A counterexample to Monge solutions for marginals with three-point support in $\mathbb{R}^{2}$

We define the set of $m$ -empirical measures on $\mathbb{R}^{d}$ as $E_{m}:=\{\frac{1}{m}\sum_{i=1}^{m}\delta_{x_{i}}:x_{1},..,x_{m}\in\mathbb{R}^{d}\}$ .

Here we present a counterexample to the existence of Monge solutions for $3$ marginals supported uniformly on $3$ points in $\mathbb{R}^{2}$ .

Example 1.

Let $\mu_{1},\mu_{2},\mu_{3}\in E_{3}$ be the $3$ -empirical measures supported on $3$ points in $\mathbb{R}^{2}$ defined by

	$\displaystyle\mu_{1}$	$\displaystyle=\frac{1}{3}\left(\delta_{(0.4417,-4.7665)}+\delta_{(-0.27748,1.0397)}+\delta_{(1.4826,4.7896)}\right),$
	$\displaystyle\mu_{2}$	$\displaystyle=\frac{1}{3}\left(\delta_{(-2.1054,-3.9784)}+\delta_{(3.5763,-1.8988)}+\delta_{(3.328,-1.558)}\right),$
(4)		$\displaystyle\mu_{3}$	$\displaystyle=\frac{1}{3}\left(\delta_{(-3.6728,0.23451)}+\delta_{(1.6988,-2.2917)}+\delta_{(-1.1644,-2.386)}\right).$

The optimal transport problem (1) with cost (2) can easily be solved directly by linear programming, yielding the optimal measure $\gamma=\frac{1}{6}[\delta_{x^{113}}+\delta_{x^{122}}+\delta_{x^{211}}+\delta_{x^{223}}+\delta_{x^{331}}+\delta_{x^{332}}]$ , with an optimal cost of $\mathbf{68.027}$ . Here, $x^{\alpha^{1}\alpha^{2}\alpha^{3}}:=(x_{1}^{\alpha^{1}},x_{2}^{\alpha^{3}},x_{3}^{\alpha^{3}})\in(\mathbb{R}^{2})^{3}$ , where the elements $x_{i}^{\alpha^{i}}\in\mathbb{R}^{2}$ in the support of each $\mu_{i}$ are ordered as in (4): for example, $x^{113}=((0.4417,-4.7665),(-2.1054,-3.9784),(-1.1644,-2.386))$ .

On the other hand, potential Monge solutions arise from transport maps, which correspond to measures of the form $\gamma=\frac{1}{3}[\delta_{x^{1,\sigma_{2}(1),\sigma_{3}(1)}}+\delta_{x^{2,\sigma_{2}(2),\sigma_{3}(2)}}+\delta_{x^{3,\sigma_{2}(3),\sigma_{3}(3)}}]$ for permutations $\sigma_{2}$ and $\sigma_{3}$ of the set $\{1,2,3\}$ of indices. One can easily check that the minimal Monge cost (MMC), obtained by minimizing the transport cost among the $(N!)^{m-1}=(3!)^{2}=36$ transport maps, is obtained by $\gamma=\frac{1}{3}[\delta_{x^{113}}+\delta_{x^{222}}+\delta_{x^{331}}]$ , yielding a strictly higher total cost of $\mathbf{68.065}$ . It follows that the problem (1) with quadratic cost (2) and marginals (4) does not admit a Monge solution.

Since a $3$ -empirical Wasserstein barycenter of $\mu_{1},\mu_{2}$ and $\mu_{3}$ would immediately yield a Monge solution to (1) via the equivalence between the problems identified in [1], this also implies that there cannot exist a $3$ -empirical Wasserstein barycenter $\nu\in E_{3}$ of these marginals.

3. Existence of Monge solutions for uniform empirical measures with two-point supports in $\mathbb{R}^{d}$

We now present a simple proof that Monge solutions exist for $2$ -empirical marginals.

Theorem 1.

Let $\mu_{i}=\frac{1}{2}\left(\delta_{x_{i}^{1}}+\delta_{x_{i}^{2}}\right)\in E_{2}$ be $2$ -empirical measures on $\mathbb{R}^{d}$ for $i=1,...,N$ . Then problem (1) with cost (2) has a Monge solution.

Proof.

Since the transform $x_{i}\rightarrow x_{i}-\bar{x}_{i}$ , where each $x_{i}:=(x_{i}^{1}+x_{i}^{2})/2$ is the mean of $\mu_{i}$ , shifts the values of the functional (1) but does not change its minimizer, we can assume without loss of generality that the mean of each $\mu_{i}$ is $0$ ; that is, $x_{i}^{2}=-x_{i}^{1}$ . Now, choose $\alpha_{i}\in\{1,2\}$ for $i=1,2...,N$ , to maximize

M=\max_{\alpha_{i}\in\{1,2\}:i=1,2,...,N}\left|\sum_{i=1}^{n}x_{i}^{\alpha_{i}}\right|^{2}.

Then set $\hat{x}=(x_{1}^{\alpha^{1}},x_{2}^{\alpha^{2}},...,x_{N}^{\alpha^{N}})\in(\mathbb{R}^{d})^{N}$ and $\gamma=\frac{1}{2}[\delta_{\hat{x}}+\delta_{-\hat{x}}]$ . It is clear that $\gamma\in\Pi(\mu_{1},...,\mu_{N})$ and that $\gamma$ is supported on the graph of the transport map $x_{1}^{\alpha^{1}}\mapsto(x_{2}^{\alpha^{2}},...,x_{N}^{\alpha^{N}})$ , $-x_{1}^{\alpha^{1}}\mapsto(-x_{2}^{\alpha^{2}},...,-x_{N}^{\alpha^{N}})$ . Since $\left|\sum_{i=1}^{n}x_{i}\right|^{2}=M$ $\gamma$ a.e, we have

-\int_{(\mathbb{R}^{d})^{N}}\left|\sum_{i=1}^{n}x_{i}\right|^{2}d\gamma=-M,

whereas, for any other $\tilde{\gamma}\in\Pi(\mu_{1},...\mu_{N})$ we have

-\int_{(\mathbb{R}^{d})^{N}}\left|\sum_{i=1}^{n}x_{i}\right|^{2}d\tilde{\gamma}\geq-M,

establishing the optimality of the Monge coupling $\gamma$ for the cost function $c(x_{1},x_{2},...,x_{N})=-\left|\sum_{i=1}^{n}x_{i}\right|^{2}$ , which is well known to be equivalent to (2) [10].

Corollary 1.

Let $\mu_{i}=\frac{1}{2}\left(\delta_{x_{i}^{1}}+\delta_{x_{i}^{2}}\right)\in E_{2}$ be $2$ -empirical measures on $\mathbb{R}^{d}$ for $i=1,...,N$ . Then there exists a $2$ -empirical Wasserstein barycenter $\nu\in E_{2}$ of the $\mu_{i}$ .

Proof.

Let $x^{\pm}:=\pm\frac{1}{N}\sum_{i=1}^{N}x_{i}^{\alpha_{i}}\in\mathbb{R}^{2}$ , where the $\alpha_{i}\in{1,2}$ are as in the previous proof. The equivalence between multi-marginal optimal transport and the Wasserstein barycenter established in Proposition 4.1 of [1] then implies that $\frac{1}{2}[\delta_{x^{+}}+\delta_{x^{-}}]\in E_{2}$ is a Wasserstein barycenter.

4. Discussion

The counterexample in Example 1 was found by brute force, through randomly generating many collections of empirical marginals and computing the solutions to (1). A couple of observations can be gleaned from the data obtained by this procedure:

(1)

The occurrence of problems that do not have Monge solutions is quite rare; for three measures supported on three points in $\mathbb{R}^{2}$ , we found that only about one in every $5000$ to $8000$ problems does not have a Monge solution. The frequency of these counterexamples seems to increase with the underlying dimension; we found that the failure of the Monge ansatz occurs more often in $\mathbb{R}^{3}$ .
(2)

Whenever the Monge ansatz fails, the difference between the minimal Monge cost and the actual optimal cost is quite small. We identified around $500$ cases when the Monge ansatz fails in $\mathbb{R}^{2}$ ; among these the greatest difference between the cost of the minimal Monge cost and the actual optimal cost was $4.3\%$ . In fact, in most cases the error was much lower than the above upper bound as is shown in Figure 1.

The last observation suggest rigorously quantifying the error made by the Monge ansatz as an intriguing line of future research. It seems possible that the errors are small enough that the Monge ansatz as an approximation of (1) may still be a useful tool in some applications.

Refer to caption — Figure 1. The distribution of the errors of minimal Monge costs for $500$ occurrences of multi-marginal optimal transport problems with $3$ -empirical marginals in $\mathbb{R}^{3}$ which do not have Monge solutions.

References

[1] Martial Agueh and Guillaume Carlier, Barycenters in the wasserstein space, SIAM Journal on Mathematical Analysis 43 (2011), no. 2, 904–924.
[2] Ethan Anderes, Steffen Borgwardt, and Jacob Miller, Discrete wasserstein barycenters: Optimal transport for discrete data, Mathematical Methods of Operations Research 84 (2016), 389–409.
[3] Yann Brenier, Décomposition polaire et réarrangement monotone des champs de vecteurs, C. R. Acad. Sci. Paris Sér. I Math. 305 (1987), no. 19, 805–808. MR 923203
[4] Luis A. Caffarelli, Allocation maps with general cost functions, Partial differential equations and applications, Lecture Notes in Pure and Appl. Math., vol. 177, Dekker, New York, 1996, pp. 29–35. MR 1371577
[5] G. Carlier, On a class of multidimensional optimal transportation problems, J. Convex Anal. 10 (2003), no. 2, 517–529. MR 2044434
[6] Guillaume Carlier and Bruno Nazaret, Optimal transportation for the determinant, ESAIM Control Optim. Calc. Var. 14 (2008), no. 4, 678–698. MR 2451790
[7] Maria Colombo, Luigi De Pascale, and Simone Di Marino, Multimarginal optimal transport maps for one-dimensional repulsive costs, Canad. J. Math. 67 (2015), no. 2, 350–368. MR 3314838
[8] Simone Di Marino, Augusto Gerolin, and Luca Nenna, Optimal transportation theory with repulsive costs, Topological optimization and optimal transport, Radon Ser. Comput. Appl. Math., vol. 17, De Gruyter, Berlin, 2017, pp. 204–256. MR 3729378
[9] Gero Friesecke, A Simple Counterexample to the Monge Ansatz in Multimarginal Optimal Transport, Convex Geometry of the Set of Kantorovich Plans, and the Frenkel–Kontorova Model, SIAM Journal on Mathematical Analysis 51 (2019), no. 6, 4332–4355.
[10] W. Gangbo and A. Świȩch, Optimal maps for the multidimensional Monge-Kantorovich problem, Comm. Pure Appl. Math. 51 (1998), no. 1, 23–45.
[11] Wilfrid Gangbo and Robert J. McCann, The geometry of optimal transportation, Acta Math. 177 (1996), no. 2, 113–161. MR 1440931
[12] Henri Heinich, Problème de Monge pour $n$ probabilités, C. R. Math. Acad. Sci. Paris 334 (2002), no. 9, 793–795. MR 1905042
[13] Nhat Ho, XuanLong Nguyen, Mikhail Yurochkin, Hung Hai Bui, Viet Huynh, and Dinh Phung, Multilevel clustering via Wasserstein means, Proceedings of the 34th International Conference on Machine Learning (Doina Precup and Yee Whye Teh, eds.), Proceedings of Machine Learning Research, vol. 70, PMLR, 06–11 Aug 2017, pp. 1501–1509.
[14] Young-Heon Kim and Brendan Pass, A general condition for Monge solutions in the multi-marginal optimal transport problem, SIAM J. Math. Anal. 46 (2014), no. 2, 1538–1550. MR 3190751
[15] Vladimir Levin, Abstract cyclical monotonicity and Monge solutions for the general Monge-Kantorovich problem, Set-Valued Anal. 7 (1999), no. 1, 7–32. MR 1699061
[16] B. Pass, Uniqueness and monge solutions in the multimarginal optimal transportation problem, SIAM Journal on Mathematical Analysis 43 (2011), no. 6, 2758–2775.
[17] by same author, On the local structure of optimal measures in the multi-marginal optimal transportation problem, Calculus of Variations and Partial Differential Equations 43 (2012), 529–536, 10.1007/s00526-011-0421-z.
[18] Brendan Pass, Remarks on the semi-classical Hohenberg-Kohn functional, Nonlinearity 26 (2013), no. 9, 2731–2744. MR 3106579
[19] by same author, Multi-marginal optimal transport: theory and applications, ESAIM Math. Model. Numer. Anal. 49 (2015), no. 6, 1771–1790. MR 3423275
[20] Brendan Pass and Adolfo Vargas-Jiménez, Monge solutions and uniqueness in multi-marginal optimal transport via graph theory, Adv. Math. 428 (2023), Paper No. 109101, 43. MR 4601780
[21] Brendan W. Pass and Adolfo Vargas-Jiménez, A general framework for multi-marginal optimal transport, To appeach in Math. Programming.
[22] by same author, Multi-marginal optimal transportation problem for cyclic costs, SIAM J. Math. Anal. 53 (2021), no. 4, 4386–4400. MR 4296755
[23] Gabriel Peyré and Marco Cuturi, Computational optimal transport: With applications to data science, 2019.
[24] Julien Rabin, Gabriel Peyré, Julie Delon, and Marc Bernot, Wasserstein barycenter and its application to texture mixing, Scale Space and Variational Methods in Computer Vision (Berlin, Heidelberg) (Alfred M. Bruckstein, Bart M. ter Haar Romeny, Alexander M. Bronstein, and Michael M. Bronstein, eds.), Springer Berlin Heidelberg, 2012, pp. 435–446.
[25] Filippo Santambrogio, Optimal transport for applied mathematicians, vol. 55, Birkäuser, NY, 2015.
[26] M. Staib, S. Claici, J. Solomon, and S. Jegelka, Parallel Streaming Wasserstein Barycenters, Advances in Neural Information Processing Systems (I. Guyon et al., ed.), vol. 30, Curran Associates, Inc., 2017.
[27] Cédric Villani, Topics in optimal transportation, vol. 58, American Mathematical Soc., 2003.
[28] Cédric Villani, Optimal transport, Grundlehren der mathematischen Wissenschaften [Fundamental Principles of Mathematical Sciences], vol. 338, Springer-Verlag, Berlin, 2009, Old and new. MR 2459454
[29] H. Yang and E. Tabak, Conditional density estimation, latent variable discovery, and optimal transport, Comm. Pure Appl. Math. 75 (2022), no. 3, 610–663. MR 4373178

On the existence of Monge solutions to multi-marginal optimal transport with quadratic cost and uniform discrete marginals

Abstract.

Key words and phrases:

2010 Mathematics Subject Classification:

1. Introduction

2. A counterexample to Monge solutions for marginals with three-point support in ℝ2\mathbb{R}^{2}

Example 1.

3. Existence of Monge solutions for uniform empirical measures with two-point supports in ℝd\mathbb{R}^{d}

Theorem 1.

Proof.

Corollary 1.

Proof.

4. Discussion

References

2. A counterexample to Monge solutions for marginals with three-point support in $\mathbb{R}^{2}$

3. Existence of Monge solutions for uniform empirical measures with two-point supports in $\mathbb{R}^{d}$