
Convergence of asymptotic costs for random Euclidean matching problems

Michael Goldman (Université de Paris, CNRS, Sorbonne-Université, Laboratoire Jacques-Louis Lions (LJLL), F-75005 Paris, France, [email protected]) and Dario Trevisan (Dipartimento di Matematica, Università degli Studi di Pisa, 56125 Pisa, Italy, [email protected])
Abstract.

We investigate the average minimum cost of a bipartite matching between two samples of $n$ independent random points uniformly distributed on a unit cube in $d\geq 3$ dimensions, where the matching cost between two points is given by any power $p\geq 1$ of their Euclidean distance. As $n$ grows, we prove convergence, after a suitable renormalization, towards a finite and positive constant. We also consider the analogous problem of optimal transport between $n$ points and the uniform measure. The proofs combine sub-additivity inequalities with a PDE ansatz similar to the one proposed in the context of the matching problem in two dimensions and later extended to obtain upper bounds in higher dimensions.

Key words and phrases:
Matching problem, optimal transport, geometric probability
2010 Mathematics Subject Classification:
60D05, 90C05, 39B62, 60F25, 35J05

1. Introduction

The aim of this paper is to extend the results of [12, 9, 4, 11] on the existence of the thermodynamic limit for some random optimal matching problems. Because of their relations to computer science, statistical physics and quantization of measures, optimal matching problems have been the subject of intense research in both the mathematics and physics communities. We refer for instance to [29, 27, 10] for more details, in particular regarding the vast literature.

Probably the simplest and most studied variant of these problems is the bipartite (or Euclidean bipartite) matching on the unit cube in $d$ dimensions. Given $p\geq 1$ and two independent families of i.i.d. random variables $(X_{i})_{i\geq 1}$ and $(Y_{i})_{i\geq 1}$ with common law the uniform (Lebesgue) measure on $[0,1]^{d}$, the problem is to understand the behavior for large $n$ of

\mathbb{E}\left[\frac{1}{n}\min_{\pi}\sum_{i=1}^{n}|X_{i}-Y_{\pi(i)}|^{p}\right],

where the minimum is taken over all permutations $\pi$ of $\{1,\ldots,n\}$. Throughout, the notation $A\ll 1$, which we only use in assumptions, means that there exists an $\varepsilon>0$, depending only on the dimension $d$ and on $p\geq 1$, such that if $A\leq\varepsilon$ then the conclusion holds. Similarly, the notation $A\lesssim B$, which we use in output statements, means that there exists a global constant $C>0$ depending on the dimension $d$ and on $p\geq 1$ such that $A\leq CB$. We write $A\sim B$ if both $A\lesssim B$ and $B\lesssim A$. It is by now well-known, see [1, 4, 6, 21], that for $n\gg 1$ (see [13] for some non-asymptotic bounds)

\mathbb{E}\left[\frac{1}{n}\min_{\pi}\sum_{i=1}^{n}|X_{i}-Y_{\pi(i)}|^{p}\right]\sim\begin{cases}n^{-\frac{p}{2}}&\textrm{ for }d=1\\ \left(\frac{\log n}{n}\right)^{\frac{p}{2}}&\textrm{ for }d=2\\ n^{-\frac{p}{d}}&\textrm{ for }d\geq 3.\end{cases} (1.1)

Let us point out that while the case $d\geq 3$, $p\geq d/2$ is not explicitly covered in the literature, the proof of [21] clearly extends to any $p\neq 2$ (see also [7]). Our main result is the following:
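To make the quantity in (1.1) concrete, here is a minimal numerical sketch, not taken from the paper: for small $n$ the minimum over permutations can be computed exactly with the Hungarian algorithm, here via `scipy.optimize.linear_sum_assignment`; the helper name `matching_cost` is ours, introduced for illustration.

```python
import itertools

import numpy as np
from scipy.optimize import linear_sum_assignment

def matching_cost(X, Y, p):
    """Minimal bipartite matching cost min_pi sum_i |X_i - Y_pi(i)|^p."""
    # Cost matrix C[i, j] = |X_i - Y_j|^p.
    C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1) ** p
    rows, cols = linear_sum_assignment(C)
    return C[rows, cols].sum()

# Sanity check against brute force over all n! permutations (small n only).
rng = np.random.default_rng(0)
n, d, p = 5, 3, 2.0
X, Y = rng.random((n, d)), rng.random((n, d))
C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=-1) ** p
brute = min(sum(C[i, pi[i]] for i in range(n))
            for pi in itertools.permutations(range(n)))
assert abs(matching_cost(X, Y, p) - brute) < 1e-12
```

Averaging `matching_cost` over many samples and rescaling by $n^{p/d-1}$ would give a Monte Carlo estimate of the left-hand side of (1.2).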

Theorem 1.1.

For every $d\geq 3$ and $p\geq 1$, there exists a constant $f_{\infty}^{\textrm{bi}}=f_{\infty}^{\textrm{bi}}(p,d)>0$ such that

\lim_{n\to\infty}n^{\frac{p}{d}}\mathbb{E}\left[\frac{1}{n}\min_{\pi}\sum_{i=1}^{n}|X_{i}-Y_{\pi(i)}|^{p}\right]=f_{\infty}^{\textrm{bi}}. (1.2)

This extends earlier results of [12, 9, 4, 11], where the same conclusion was obtained under the more restrictive condition $p<d/2$. See also [26] for bounds on $f_{\infty}^{\textrm{bi}}(1,d)$ as $d$ becomes large. As in the previously quoted papers, our proof is based on a sub-additivity argument and makes use of the classical observation that, thanks to the Birkhoff–von Neumann Theorem, the bipartite matching problem is actually an optimal transport problem. Indeed, if $\mu=\sum_{i=1}^{n}\delta_{X_{i}}$ and $\lambda=\sum_{i=1}^{n}\delta_{Y_{i}}$ are the associated empirical measures, then

\min_{\pi}\sum_{i=1}^{n}|X_{i}-Y_{\pi(i)}|^{p}=W^{p}_{p}\left(\mu,\lambda\right),

where $W_{p}$ denotes the Wasserstein distance of order $p$ (see [28, 25]). However, the papers [9, 4, 11] then rely upon combinatorial arguments, and in fact their results apply to a larger class of random optimization problems, while [12] strongly uses the dual formulation of optimal transport, which in the case $p=1$ is quite specific, since it becomes a maximization over the set of $1$-Lipschitz functions.

The optimal transport point of view allows us to treat the defect in sub-additivity as a defect in the local distribution of mass rather than in the local distribution of points. More precisely, even if $\mu([0,1]^{d})=\lambda([0,1]^{d})$, it is in general not true that $\mu(Q_{i})=\lambda(Q_{i})$ for a given partition of $[0,1]^{d}$ into sub-cubes $Q_{i}$. Therefore, in order to use sub-additivity, one needs to relax the definition of the matching problem to take this disparity into account. In [9, 4, 11] this is done by requiring that as many points as possible are matched. Here we allow instead for varying weights. That is, for $\mu$ and $\lambda$ containing potentially a different number of points, we consider the problem

\mathbb{E}\left[W^{p}_{p}\left(\frac{\mu}{\mu([0,1]^{d})},\frac{\lambda}{\lambda([0,1]^{d})}\right)\right].
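As a one-dimensional illustration of this relaxation (our own sketch, not from the paper): `scipy.stats.wasserstein_distance` computes $W_1$ between weighted samples, so two empirical measures with different numbers of atoms can be compared once each is normalized to a probability measure.

```python
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(1)
x = rng.random(12)  # empirical measure mu with 12 atoms
y = rng.random(8)   # empirical measure lambda with 8 atoms

# W_1 between the normalized measures mu/mu([0,1]) and lambda/lambda([0,1]):
# uniform weights 1/12 and 1/8 make the two measures comparable despite the
# different numbers of atoms.
w1 = wasserstein_distance(x, y, np.full(12, 1 / 12), np.full(8, 1 / 8))
```

For $p>1$ no such closed-form scipy routine applies directly, but the same normalization makes the transport problem well posed.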

The main sub-additivity argument for this quantity is contained in Lemma 3.1. In order to estimate the error in sub-additivity, we then use in Lemma 3.4 a PDE ansatz similar to the one proposed in the context of the matching problem in [10] and then used in [3] (see also [21, 17, 5]) to show that when $d=2$

\lim_{n\to\infty}\frac{n}{\log n}\mathbb{E}\left[\frac{1}{n}W_{2}^{2}(\mu,\lambda)\right]=\frac{1}{2\pi}.

Notice however that our use of this linearization ansatz is quite different from [10, 3]. Indeed, for us the main contribution to the transportation cost is given by the transportation at the smaller scales, and linearization is only used to estimate the higher order error term. On the other hand, in [10, 3] (see also [2]), the main contribution to the cost is given by the linearized, i.e. $H^{-1}$, cost. This is somewhat in line with the prediction of [10] that for $d\geq 3$ the first order contribution to the Wasserstein cost is not given by the linearized problem, while higher order corrections are. In any case, we give in Proposition 5.3 an alternative argument to estimate the error term without relying on the PDE ansatz. There we use instead an elementary comparison argument with one-dimensional transport plans. We included both proofs since we believe that they could prove useful in other contexts.

Let us make a few more comments on the proof of Theorem 1.1. As in [12, 9, 4], the proof of (1.2) is actually first done on a variant of the problem where the number of points follows a Poisson distribution instead of being deterministic. This is due to the fact that the restriction of a Poisson point process is again a Poisson point process. For this variant of the problem, rather than working on a fixed cube $[0,1]^{d}$ with an increasing density of points, we prefer to make a blow-up at scale $L=n^{1/d}$ and consider in Theorem 5.1 a fixed Poisson point process of intensity $1$ on $\mathbb{R}^{d}$, restricted to cubes $Q_{L}=[0,L]^{d}$ with $L\gg 1$ (hence the terminology thermodynamic limit). We believe that the sub-additivity argument is slightly clearer in these variables (a similar rescaling is actually implicitly used in [9, 4]). This setting is somewhat reminiscent of [18], where super-additivity is used to construct an optimal coupling between the Poisson point process and the Lebesgue measure on $\mathbb{R}^{d}$. In order to pass from the Poisson version of the problem to the deterministic one, we prove a general de-Poissonization result in Proposition 6.1, which can hopefully be useful in other contexts.

Besides the bipartite matching, we also treat in Theorem 4.1 and Theorem 6.2 the case of the matching to the reference measure. We actually treat this problem first since the proof is a bit simpler. Indeed, while the general scheme of the proof is identical to the bipartite case, the PDE ansatz used in Lemma 3.4 works well for “regular” measures, and a more delicate argument is required for the bipartite matching. Notice that by the Jensen and triangle inequalities, (1.1) also holds for the matching to the reference measure.

We point out that in [12, 4, 11] it is more generally proven that if the points $X_{i}$ and $Y_{i}$ have a common law $\mu$ supported in $[0,1]^{d}$ instead of the Lebesgue measure (for measures with unbounded support a condition on the moments is required), then for $1\leq p<d/2$ and $d\geq 3$,

\limsup_{n\to\infty}n^{\frac{p}{d}}\mathbb{E}\left[\frac{1}{n}\min_{\pi}\sum_{i=1}^{n}|X_{i}-Y_{\pi(i)}|^{p}\right]\leq f_{\infty}^{\textrm{bi}}\int_{[0,1]^{d}}\left(\frac{d\mu}{dx}\right)^{1-\frac{p}{d}}.

However, when $p>d/2$ and without additional assumptions on $\mu$, the asymptotic rate may be different and thus this inequality may fail, see e.g. [13]. Positive results for specific densities can be obtained nonetheless. For instance, it is proven in [22] that for the standard Gaussian measure $\mu$ on $\mathbb{R}^{d}$, $d\geq 3$, the asymptotic bound

n^{\frac{p}{d}}\mathbb{E}\left[\frac{1}{n}\min_{\pi}\sum_{i=1}^{n}|X_{i}-Y_{\pi(i)}|^{p}\right]\sim 1

holds true also for $d/2\leq p<d$.

Finally, we notice that usual results on concentration of measure allow us to improve from convergence of the expectations to strong convergence. However, we are able to cover only the case $1\leq p<d$, see Remark 6.5.

The plan of the paper is the following. In Section 2, we fix some notation and recall basic moment and concentration bounds for Poisson random variables. In Section 3, we state and prove our two main lemmas, namely the sub-additivity estimate Lemma 3.1 and the error estimate Lemma 3.4. In Section 4, we then prove the existence of the thermodynamic limit for the matching problem of a Poisson point process to the reference measure. The analogous result for the bipartite matching between two Poisson point processes is obtained in Section 5. Finally, in Section 6 we pass from the Poissonized problem to the deterministic one and discuss stronger convergence results.

2. Notation and preliminary results

We use the notation $|A|$ for the Lebesgue measure of a Borel set $A\subseteq\mathbb{R}^{d}$, and $\int_{A}f$ for the Lebesgue integral of a function $f$ on $A$. For $L>0$, we let $Q_{L}=[0,L]^{d}$. We denote by $|p|$ the Euclidean norm of a vector $p\in\mathbb{R}^{N}$. For a function $\phi$, we use the notation $\nabla\phi$ for the gradient, $\nabla\cdot\phi$ for the divergence, $\Delta\phi$ for the Laplacian and $\nabla^{2}\phi$ for the Hessian.

2.1. Optimal transport

In this section we introduce some notation for the Wasserstein distance and recall a few simple properties that will be used throughout. Proofs can be found in any of the monographs [28, 25, 24], which expound the theory of optimal transport from different perspectives.

Given $p\geq 1$, a Borel subset $\Omega\subseteq\mathbb{R}^{d}$ and two positive Borel measures $\mu$, $\lambda$ with $\mu(\Omega)=\lambda(\Omega)\in(0,\infty)$ and finite $p$-th moments, the Wasserstein distance of order $p$ between $\mu$ and $\lambda$ is defined as the quantity

W_{p}(\mu,\lambda)=\left(\min_{\pi\in\mathcal{C}(\mu,\lambda)}\int_{\mathbb{R}^{d}\times\mathbb{R}^{d}}|x-y|^{p}\,d\pi(x,y)\right)^{\frac{1}{p}},

where $\mathcal{C}(\mu,\lambda)$ is the set of couplings between $\mu$ and $\lambda$. Moreover, if $\mu(\Omega)=\lambda(\Omega)=0$, we define $W_{p}(\mu,\lambda)=0$, while if $\mu(\Omega)\neq\lambda(\Omega)$, we let $W_{p}(\mu,\lambda)=\infty$.
Let us recall that since $W_{p}$ is a distance, we have the triangle inequality

W_{p}(\mu,\nu)\leq W_{p}(\mu,\lambda)+W_{p}(\nu,\lambda). (2.1)

We will also use the classical sub-additivity inequality

W_{p}^{p}\left(\sum_{i}\mu_{i},\sum_{i}\lambda_{i}\right)\leq\sum_{i}W^{p}_{p}(\mu_{i},\lambda_{i}), (2.2)

for a finite set of positive measures $\mu_{i}$, $\lambda_{i}$. This follows from the observation that if $\pi_{i}\in\mathcal{C}(\mu_{i},\lambda_{i})$, then $\sum_{i}\pi_{i}\in\mathcal{C}(\sum_{i}\mu_{i},\sum_{i}\lambda_{i})$.
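In one dimension, where the optimal coupling between two counting measures with equally many atoms simply matches sorted samples, inequality (2.2) is easy to check numerically. The sketch below assumes that 1D fact; the helper name `wpp_1d` is ours.

```python
import numpy as np

def wpp_1d(x, y, p):
    """W_p^p between the counting measures sum_i delta_{x_i} and
    sum_i delta_{y_i} with equally many atoms: in 1D (p >= 1) the optimal
    coupling matches the samples in increasing order."""
    assert len(x) == len(y)
    return np.sum(np.abs(np.sort(x) - np.sort(y)) ** p)

rng = np.random.default_rng(2)
p = 2.0
x1, y1 = rng.random(6), rng.random(6)
x2, y2 = rng.random(9), rng.random(9)

# sub-additivity (2.2): transporting the superposed measures is no more
# expensive than superposing the two optimal couplings
lhs = wpp_1d(np.concatenate([x1, x2]), np.concatenate([y1, y2]), p)
rhs = wpp_1d(x1, y1, p) + wpp_1d(x2, y2, p)
assert lhs <= rhs + 1e-12
```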

Remark 2.1.

In fact, our results deal with the transportation cost $W_{p}^{p}(\mu,\lambda)$ rather than $W_{p}(\mu,\lambda)$. To keep notation simple, we write

W^{p}_{\Omega}(\mu,\lambda)=W_{p}^{p}(\mu\llcorner\Omega,\lambda\llcorner\Omega).

Moreover, if a measure is absolutely continuous with respect to Lebesgue measure, we only write its density. For example,

W^{p}_{\Omega}\left(\mu,\frac{\mu(\Omega)}{|\Omega|}\right),

denotes the transportation cost between $\mu\llcorner\Omega$ and the uniform measure on $\Omega$ with total mass $\mu(\Omega)$.

Occasionally we may write $W_{\Omega}(\mu,\lambda)$ instead of $\left(W_{\Omega}^{p}(\mu,\lambda)\right)^{1/p}$. This may lead to some ambiguity, but it should be clear from the context.

2.2. Poisson point processes

As in [12, 9, 4], we exploit invariance properties of Poisson point processes on $\mathbb{R}^{d}$ with uniform intensity in order to obtain simpler sub-additivity estimates. We refer e.g. to [19] for a general introduction to Poisson point processes. Here we only recall that a Poisson point process on $\mathbb{R}^{d}$ with intensity one can be defined as a random variable taking values in the set of locally finite atomic measures

\mu=\sum_{i}\delta_{X_{i}}

such that, for every $k\geq 1$ and any disjoint Borel sets $A_{1},\ldots,A_{k}\subseteq\mathbb{R}^{d}$, the random variables $\mu(A_{1}),\ldots,\mu(A_{k})$ are independent and $\mu(A_{i})$ has a Poisson distribution of parameter $|A_{i}|$ for every $i=1,\ldots,k$. In particular, if $A\subseteq\mathbb{R}^{d}$ is Lebesgue negligible, then $\mu(A)=0$ almost surely.

Existence of a Poisson point process of intensity one is obtained via a superposition argument, noticing that on every bounded subset $\Omega\subseteq\mathbb{R}^{d}$ the law of $\mu\llcorner\Omega$ can be easily described: conditionally on $\mu(\Omega)=n$, the measure $\mu\llcorner\Omega$ has the same law as the random measure

\sum_{i=1}^{n}\delta_{X_{i}},

where $(X_{i})_{i=1}^{n}$ are independent random variables with uniform law on $\Omega$. Uniqueness in law can also be obtained, so that translation invariance of the Lebesgue measure entails that the process is stationary, i.e., any deterministic translation of the random measure $\mu$ leaves its law unchanged.
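This conditional description yields the standard two-step sampler for a Poisson point process restricted to a cube. The sketch below is ours; the helper name `sample_ppp` is hypothetical, introduced only for illustration.

```python
import numpy as np

def sample_ppp(L, d, rng):
    """Sample a Poisson point process of intensity 1 restricted to
    Q_L = [0, L]^d: draw N ~ Poisson(|Q_L|) = Poisson(L^d), then N
    independent points uniform on the cube."""
    N = rng.poisson(L ** d)
    return L * rng.random((N, d))

rng = np.random.default_rng(3)
pts = sample_ppp(4.0, 3, rng)
# every atom lies in the cube; the expected number of atoms is |Q_L| = 64
assert pts.ndim == 2 and pts.shape[1] == 3
assert np.all((pts >= 0) & (pts <= 4.0))
```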

Let us finally recall the classical Cramér-Chernoff concentration bounds for Poisson random variables.

Lemma 2.2.

Let $N$ be a Poisson random variable with parameter $n\gg 1$. Then, for every $t>0$,

\mathbb{P}[|N-n|\geq t]\leq 2\exp\left(-\frac{t^{2}}{2(t+n)}\right). (2.3)

As a consequence, for every $q\geq 1$,

\mathbb{E}\left[|N-n|^{q}\right]\lesssim_{q}n^{\frac{q}{2}}. (2.4)
Proof.

For the concentration bound (2.3), see for instance [8]. By the layer-cake representation,

\mathbb{E}\left[|N-n|^{q}\right]\lesssim\int_{0}^{\infty}t^{q-1}\exp\left(-\frac{t^{2}}{2(t+n)}\right)dt
\lesssim\int_{0}^{\sqrt{n}}t^{q-1}dt+\int_{\sqrt{n}}^{n}t^{q-1}\exp\left(-c\frac{t^{2}}{n}\right)dt+\int_{n}^{\infty}t^{q-1}\exp(-ct)dt
\lesssim_{q}n^{\frac{q}{2}}+n^{\frac{q}{2}}+n^{q-1}\exp(-cn)\lesssim_{q}n^{\frac{q}{2}}. ∎
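The bound (2.3) can be checked against the exact Poisson tail, computed from the distribution function. This is a numerical sanity check of ours, not part of the proof; it uses `scipy.stats.poisson`.

```python
import math

from scipy.stats import poisson

n = 100
for t in (5, 10, 20, 50):
    # exact two-sided tail P[|N - n| >= t] for N ~ Poisson(n):
    # P[N <= n - t] + P[N >= n + t]  (t integer here)
    tail = poisson.cdf(n - t, n) + poisson.sf(n + t - 1, n)
    bound = 2 * math.exp(-t ** 2 / (2 * (t + n)))
    assert tail <= bound
```

For $t$ of order $\sqrt{n}$ the bound is of the right Gaussian order, consistent with the central limit behavior of $N$.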

3. The main lemmas

Our sub-additivity argument rests on a general but relatively simple lemma (which we only apply here for rectangles).

Lemma 3.1.

For every $p\geq 1$, there exists a constant $C>0$ depending only on $p$ such that the following holds. For every Borel set $\Omega\subset\mathbb{R}^{d}$, every Borel partition $(\Omega_{i})_{i\in\mathbb{N}}$ of $\Omega$, all measures $\mu$, $\lambda$ on $\Omega$, and every $\varepsilon\in(0,1)$,

W^{p}_{\Omega}\left(\mu,\frac{\mu(\Omega)}{\lambda(\Omega)}\lambda\right)\leq(1+\varepsilon)\sum_{i}W^{p}_{\Omega_{i}}\left(\mu,\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\lambda\right)+\frac{C}{\varepsilon^{p-1}}W^{p}_{\Omega}\left(\sum_{i}\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\chi_{\Omega_{i}}\lambda,\frac{\mu(\Omega)}{\lambda(\Omega)}\lambda\right). (3.1)
Proof.

We first use the triangle inequality (2.1) to get

W^{p}_{\Omega}\left(\mu,\frac{\mu(\Omega)}{\lambda(\Omega)}\lambda\right)\leq\left(W_{\Omega}\left(\mu,\sum_{i}\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\chi_{\Omega_{i}}\lambda\right)+W_{\Omega}\left(\sum_{i}\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\chi_{\Omega_{i}}\lambda,\frac{\mu(\Omega)}{\lambda(\Omega)}\lambda\right)\right)^{p}.

The proof is then concluded by combining the elementary inequality

(a+b)^{p}\leq(1+\varepsilon)a^{p}+\frac{C}{\varepsilon^{p-1}}b^{p}\qquad\forall a,b>0\textrm{ and }\varepsilon\in(0,1), (3.2)

with the sub-additivity of WΩpW_{\Omega}^{p} (2.2) in the form

W^{p}_{\Omega}\left(\mu,\sum_{i}\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\chi_{\Omega_{i}}\lambda\right)\leq\sum_{i}W^{p}_{\Omega_{i}}\left(\mu,\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\lambda\right). ∎
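The elementary inequality (3.2) follows from convexity of $t\mapsto t^{p}$: writing $a+b=\lambda\frac{a}{\lambda}+(1-\lambda)\frac{b}{1-\lambda}$ gives $(a+b)^{p}\leq\lambda^{1-p}a^{p}+(1-\lambda)^{1-p}b^{p}$ for $\lambda\in(0,1)$, and choosing $\lambda=(1+\varepsilon)^{-1/(p-1)}$ makes the first coefficient $1+\varepsilon$ with a second coefficient of order $\varepsilon^{-(p-1)}$. A short numerical check of this choice (the helper name `split_constant` is ours):

```python
import numpy as np

def split_constant(p, eps):
    """C(p, eps) such that (a+b)^p <= (1+eps) a^p + C b^p, obtained from the
    convexity bound (a+b)^p <= lam^{1-p} a^p + (1-lam)^{1-p} b^p with
    lam = (1+eps)^{-1/(p-1)}.  For p = 1 the inequality is trivial with C = 1."""
    if p == 1:
        return 1.0
    lam = (1 + eps) ** (-1 / (p - 1))
    return (1 - lam) ** (1 - p)

for p in (1.0, 1.5, 2.0, 3.0):
    for eps in (0.1, 0.5, 0.9):
        C = split_constant(p, eps)
        for a in np.linspace(0.01, 5, 40):
            for b in np.linspace(0.01, 5, 40):
                assert (a + b) ** p <= (1 + eps) * a ** p + C * b ** p + 1e-9
```

For small $\varepsilon$ one has $1-\lambda\approx\varepsilon/(p-1)$, so `split_constant` degrades like $\varepsilon^{-(p-1)}$, matching the term $C/\varepsilon^{p-1}$ in (3.2).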

Remark 3.2.

Alternatively, we could have also stated (3.1) in the slightly more symmetric form:

W^{p}_{\Omega}\left(\frac{\mu}{\mu(\Omega)},\frac{\lambda}{\lambda(\Omega)}\right)\leq(1+\varepsilon)\sum_{i}\frac{\mu(\Omega_{i})}{\mu(\Omega)}W^{p}_{\Omega_{i}}\left(\frac{\mu}{\mu(\Omega_{i})},\frac{\lambda}{\lambda(\Omega_{i})}\right)+\frac{C}{\varepsilon^{p-1}}W^{p}_{\Omega}\left(\frac{1}{\mu(\Omega)}\sum_{i}\frac{\mu(\Omega_{i})}{\lambda(\Omega_{i})}\chi_{\Omega_{i}}\lambda,\frac{\lambda}{\lambda(\Omega)}\right).

However, the sub-additivity argument turns out to be a little bit simpler using (3.1) instead.

Lemma 3.1 shows that in order to estimate the defect in sub-additivity, it is enough to bound the local defect of mass distribution. This will be done here through a PDE argument.

Definition 3.3.

We say that a rectangle $R=x+\prod_{i=1}^{d}[0,L_{i}]$ is of moderate aspect ratio if $L_{i}/L_{j}\leq 2$ for every $i,j$. A partition $\mathcal{R}=\{R_{i}\}$ of $R$ is called admissible if every $R_{i}$ is a rectangle of moderate aspect ratio and $3^{-d}|R|\leq|R_{i}|\leq|R|$. Notice that in particular $\#\mathcal{R}\lesssim 1$ for every admissible partition.

Lemma 3.4.

Let $R$ be a rectangle of moderate aspect ratio and let $\mu$ and $\lambda$ be measures on $R$ with equal mass, both absolutely continuous with respect to Lebesgue and such that $\inf_{R}\lambda>0$. Then, for every $p\geq 1$,

W_{R}^{p}(\mu,\lambda)\lesssim\frac{\operatorname{diam}^{p}(R)}{(\inf_{R}\lambda)^{p-1}}\int_{R}|\mu-\lambda|^{p}. (3.3)
Proof.

Let $\phi$ be a solution of the Poisson equation with Neumann boundary conditions

\Delta\phi=\mu-\lambda\quad\textrm{in }R\qquad\textrm{ and }\qquad\nu\cdot\nabla\phi=0\quad\textrm{on }\partial R. (3.4)

We first argue that

W^{p}_{R}(\mu,\lambda)\lesssim\frac{1}{(\inf_{R}\lambda)^{p-1}}\int_{R}|\nabla\phi|^{p}. (3.5)

Let us point out that this estimate is well-known and has already been used in the context of the matching problem, see [3, 17] in the case $p=2$ and [21, Th. 2] for general $p\geq 1$. Still, we give a proof for the reader’s convenience.
We first argue as in [17, Lem. 2.7], and use the triangle inequality (2.1) and the sub-additivity of $W_{R}$ (2.2) to get

W_{R}(\mu,\lambda)\leq W_{R}\left(\mu,\frac{1}{2}(\mu+\lambda)\right)+W_{R}\left(\frac{1}{2}(\mu+\lambda),\lambda\right)
\leq W_{R}\left(\frac{1}{2}\mu,\frac{1}{2}\lambda\right)+W_{R}\left(\frac{1}{2}(\mu+\lambda),\lambda\right)
=2^{-\frac{1}{p}}\left(W_{R}(\mu,\lambda)+W_{R}(\mu+\lambda,2\lambda)\right)

and thus

W_{R}(\mu,\lambda)\lesssim W_{R}(\mu+\lambda,2\lambda).

We now recall that by the Benamou-Brenier formula (see [25, Th. 5.28])

W_{R}^{p}(\mu+\lambda,2\lambda)=\min_{\rho,j}\left\{\int_{0}^{1}\int_{R}\frac{1}{\rho^{p-1}}|j|^{p}\ :\ \partial_{t}\rho+\nabla\cdot j=0,\ \rho_{0}=\mu+\lambda,\ \rho_{1}=2\lambda\right\}.

Estimate (3.5) follows using

\rho_{t}=(1-t)\mu+t\lambda+\lambda\qquad\textrm{ and }\qquad j=\nabla\phi

as a competitor and noticing that $\rho_{t}\geq\inf_{R}\lambda$ for $t\in[0,1]$.
We now claim that

\int_{R}|\nabla\phi|^{p}\lesssim\operatorname{diam}^{p}(R)\int_{R}|\mu-\lambda|^{p}, (3.6)

which together with (3.5) would conclude the proof of (3.3). Estimate (3.6) is a direct consequence of the Poincaré inequality and Calderón-Zygmund estimates for the Laplacian. However, since we did not find a precise reference for (global) Calderón-Zygmund estimates on rectangles with Neumann boundary conditions, we give here a short proof.
By scaling, we may assume that $\operatorname{diam}(R)=1$. Furthermore, using even reflections along $\partial R$ we may replace the Neumann boundary conditions in (3.4) by periodic ones. By the Poincaré inequality [23, Prop. 12.29] (notice that thanks to the periodic boundary conditions we now have $\int_{R}\nabla\phi=0$),

\int_{R}|\nabla\phi|^{p}\lesssim\int_{R}|\nabla^{2}\phi|^{p}.

By interior Calderón-Zygmund estimates (see for instance [15, Th. 7.3]), periodicity and the fact that $R$ has moderate aspect ratio, we get

\int_{R}|\nabla^{2}\phi|^{p}\lesssim\int_{R}|\mu-\lambda|^{p}+\left(\int_{R}|\nabla^{2}\phi|^{2}\right)^{\frac{p}{2}}.

By Bochner’s formula and Hölder inequality,

\int_{R}|\nabla^{2}\phi|^{2}=\int_{R}|\mu-\lambda|^{2}\lesssim\left(\int_{R}|\mu-\lambda|^{p}\right)^{\frac{2}{p}},

which concludes the proof of (3.6). ∎
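The flux competitor $j=\nabla\phi$ used above can be reproduced numerically. The following sketch is ours, written in one dimension with periodic boundary conditions as in the reflection argument: it solves $\Delta\phi=\mu-\lambda$ spectrally, verifies the equation, and evaluates the quantity $\int|\nabla\phi|^{2}$ appearing on the right-hand side of (3.5) for $p=2$.

```python
import numpy as np

# Discretize densities on a periodic grid and solve Delta phi = mu - lambda.
N = 256
x = np.arange(N) / N
mu = 1.0 + 0.3 * np.cos(2 * np.pi * x)  # smooth perturbation of the uniform density
lam = np.ones(N)                         # mu and lam have equal mass

k = 2 * np.pi * np.fft.fftfreq(N, d=1 / N)   # integer frequencies times 2*pi
rhs_hat = np.fft.fft(mu - lam)
phi_hat = np.zeros_like(rhs_hat)
phi_hat[1:] = -rhs_hat[1:] / k[1:] ** 2      # zero-mean solution; k = 0 mode drops out
grad_phi = np.real(np.fft.ifft(1j * k * phi_hat))

# check that phi solves the equation up to round-off
laplacian = np.real(np.fft.ifft(-k ** 2 * phi_hat))
assert np.max(np.abs(laplacian - (mu - lam))) < 1e-8

# the H^{-1}-type quantity bounding W_2^2(mu, lam) in (3.5) (general p would
# use np.mean(np.abs(grad_phi) ** p))
energy = np.mean(grad_phi ** 2)
```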

Remark 3.5.

For $p=2$, combining the energy identity

\int_{R}|\nabla\phi|^{2}=\int_{R}\phi(\lambda-\mu)

with the Poincaré inequality, we see that (3.6) (and thus (3.3)) holds for any convex set $R$. Although the situation for $p\geq 2$ is more subtle, see [14, Prop. 2], we believe that (3.6) holds for any rectangle, not necessarily of moderate aspect ratio.

4. Matching to the reference measure

In this section, we consider the optimal matching problem between a Poisson point process $\mu$ on $\mathbb{R}^{d}$ with intensity one and the Lebesgue measure. More precisely, for every $L\geq 1$ we let

f^{\mathrm{ref}}(L)=\mathbb{E}\left[\frac{1}{|Q_{L}|}W_{Q_{L}}^{p}(\mu,\kappa)\right],

where $Q_{L}=[0,L]^{d}$ and

\kappa=\frac{\mu(Q_{L})}{|Q_{L}|}

is the constant density for which this quantity is well defined, i.e. for which the two measures have equal mass.

Theorem 4.1.

For every $d\geq 3$ and $p\geq 1$, the limit

f_{\infty}^{\mathrm{ref}}=\lim_{L\to\infty}f^{\mathrm{ref}}(L)

exists and is strictly positive. Moreover, there exists $C>0$ depending on $p$ and $d$ such that for $L\geq 1$,

f_{\infty}^{\mathrm{ref}}\leq f^{\mathrm{ref}}(L)+\frac{C}{L^{\frac{d-2}{2}}}. (4.1)

The proof follows the argument of [9] (see also [4]) and is mostly based on the following sub-additivity estimate.

Proposition 4.2.

For every $d\geq 3$ and $p\geq 1$, there exists a constant $C>0$ such that for every $L\geq 1$ and $m\in\mathbb{N}$,

f^{\mathrm{ref}}(mL)\leq f^{\mathrm{ref}}(L)+\frac{C}{L^{\frac{d-2}{2}}}. (4.2)
Proof.

We start by pointing out that since $f^{\mathrm{ref}}(L)\lesssim L^{p}$, it is not restrictive to assume $L\gg 1$ in the proof of (4.2).

Step 1. [The dyadic case] For the sake of clarity, we start with the simpler case $m=2^{k}$ for some $k\geq 1$. We claim that

f^{\mathrm{ref}}(2L)\leq f^{\mathrm{ref}}(L)+\frac{C}{L^{\frac{d-2}{2}}}. (4.3)

The desired estimate (4.2) would then follow by iterating (4.3) and using that $\sum_{k\geq 0}\frac{1}{2^{k(d-2)}}<\infty$ for $d\geq 3$. In order to prove (4.3), we divide the cube $Q_{2L}$ into $2^{d}$ sub-cubes $Q_{i}=x_{i}+Q_{L}$ and let $\kappa_{i}=\frac{\mu(Q_{i})}{|Q_{L}|}$ (and $\kappa=\frac{\mu(Q_{2L})}{|Q_{2L}|}$). Notice that we are considering a partition up to a Lebesgue negligible remainder, which gives no contribution almost surely. By (3.1), for every $\varepsilon\in(0,1)$,

W^{p}_{Q_{2L}}(\mu,\kappa)\leq(1+\varepsilon)\sum_{i}W_{Q_{i}}^{p}(\mu,\kappa_{i})+\frac{C}{\varepsilon^{p-1}}W^{p}_{Q_{2L}}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa\right).

Dividing by $|Q_{2L}|$, taking expectations and using the fact that by translation invariance $\mathbb{E}[\frac{1}{|Q_{L}|}W_{Q_{i}}^{p}(\mu,\kappa_{i})]=f^{\mathrm{ref}}(L)$, we get

f^{\mathrm{ref}}(2L)\leq(1+\varepsilon)\sum_{i}\frac{|Q_{L}|}{|Q_{2L}|}f^{\mathrm{ref}}(L)+\frac{C}{\varepsilon^{p-1}}\mathbb{E}\left[\frac{1}{|Q_{2L}|}W^{p}_{Q_{2L}}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa\right)\right]
=(1+\varepsilon)f^{\mathrm{ref}}(L)+\frac{C}{\varepsilon^{p-1}}\mathbb{E}\left[\frac{1}{|Q_{2L}|}W^{p}_{Q_{2L}}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa\right)\right].

We now estimate $\mathbb{E}[\frac{1}{|Q_{2L}|}W^{p}_{Q_{2L}}(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa)]$. Using $\frac{1}{|Q_{2L}|}W^{p}_{Q_{2L}}(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa)\lesssim L^{p}\kappa$ together with

\mathbb{P}\left[\kappa\leq\frac{1}{2}\right]\leq\exp(-cL^{d}),

which follows from the Cramér-Chernoff bounds (2.3), we may restrict ourselves to the event $\{\kappa\geq\frac{1}{2}\}$. On this event, by (3.3), we have

\frac{1}{|Q_{2L}|}W^{p}_{Q_{2L}}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa\right)\lesssim\frac{L^{p}}{L^{d}}\int_{Q_{2L}}\sum_{i}|\kappa_{i}-\kappa|^{p}\chi_{Q_{i}}
\lesssim L^{p}\left(|\kappa-1|^{p}+\sum_{i}|\kappa_{i}-1|^{p}\right).

Recalling that the $\mu(Q_{i})$ are Poisson random variables of parameter $|Q_{i}|$ and that $\kappa_{i}=\frac{\mu(Q_{i})}{|Q_{i}|}$, we get from (2.4)

\mathbb{E}\left[|\kappa-1|^{p}\right]\sim\mathbb{E}\left[|\kappa_{i}-1|^{p}\right]\lesssim\frac{1}{L^{\frac{pd}{2}}}.

Thus

\mathbb{E}\left[\frac{1}{|Q_{2L}|}W^{p}_{Q_{2L}}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}},\kappa\right)\right]\lesssim\frac{1}{L^{\frac{p}{2}(d-2)}} (4.4)

and we conclude that for $\varepsilon\in(0,1)$,

f^{\mathrm{ref}}(2L)\leq(1+\varepsilon)f^{\mathrm{ref}}(L)+\frac{C}{\varepsilon^{p-1}}\frac{1}{L^{\frac{p}{2}(d-2)}}. (4.5)

Optimizing in $\varepsilon$ by choosing $\varepsilon=L^{-(d-2)/2}$, and using that $f^{\mathrm{ref}}(L)$ is bounded (by (1.1) and (2.3), see for instance [16, Prop. 2.7] for details), we conclude the proof of (4.3). Let us point out that we used boundedness of $f^{\mathrm{ref}}(L)$ here for simplicity, but, as shown below, it can also be obtained as a consequence of our proof.
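The moment bound (2.4), which drives the error term (4.4), can be checked exactly from the Poisson probability mass function. This is a numerical illustration of ours; the helper name `central_moment` is hypothetical.

```python
import numpy as np
from scipy.stats import poisson

def central_moment(n, p):
    """E|N - n|^p for N ~ Poisson(n), summing the pmf over a wide truncated range."""
    k = np.arange(0, int(n + 40 * np.sqrt(n) + 50))
    return np.sum(poisson.pmf(k, n) * np.abs(k - n) ** p)

for n in (100, 400, 1600):
    # p = 2 is exact: Var(N) = n, so E|kappa - 1|^2 = n^{-1} exactly
    assert abs(central_moment(n, 2) - n) < 1e-6 * n
    # p = 3: E|N - n|^3 <~ n^{3/2}, i.e. E|kappa - 1|^3 <~ n^{-3/2} as in (2.4)
    assert central_moment(n, 3) <= 3 * n ** 1.5
```

With $n=|Q_L|=L^d$ this gives $\mathbb{E}|\kappa-1|^p\lesssim L^{-pd/2}$, matching the estimate used for (4.4).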

Step 2. [The general case] We now consider the case when $m\in\mathbb{N}$ is not necessarily dyadic. We will partition $Q_{mL}$ into rectangles of almost dyadic size and thus need to deal with slightly more general configurations than dyadic cubes. Let us introduce some notation. Let $R$ be a rectangle with moderate aspect ratio and let $\mathcal{R}=\{R_{i}\}$ be an admissible partition of $R$ (recall Definition 3.3). Slightly abusing notation, we define

f^{\mathrm{ref}}(R)=\mathbb{E}\left[\frac{1}{|R|}W^{p}_{R}(\mu,\kappa)\right]. (4.6)

Step 2.1. We claim that the following variant of (4.5) holds: for every rectangle $R$ of moderate aspect ratio with $|R|\gg 1$, every admissible partition $\mathcal{R}$ of $R$ and every $\varepsilon\in(0,1)$, we have

f^{\mathrm{ref}}(R)\leq(1+\varepsilon)\sum_{i}\frac{|R_{i}|}{|R|}f^{\mathrm{ref}}(R_{i})+\frac{C}{\varepsilon^{p-1}}\frac{1}{|R|^{\frac{p(d-2)}{2d}}}. (4.7)

Defining $\kappa_{i}=\frac{\mu(R_{i})}{|R_{i}|}$ and using (3.1) as above, we get

f^{\mathrm{ref}}(R)\leq(1+\varepsilon)\sum_{i}\frac{|R_{i}|}{|R|}f^{\mathrm{ref}}(R_{i})+\frac{C}{\varepsilon^{p-1}}\mathbb{E}\left[\frac{1}{|R|}W_{R}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}},\kappa\right)\right].

The estimate

\mathbb{E}\left[\frac{1}{|R|}W_{R}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}},\kappa\right)\right]\lesssim\frac{1}{|R|^{\frac{p(d-2)}{2d}}} (4.8)

is then obtained arguing exactly as for (4.4): we first use the Cramér-Chernoff bound (2.3) to reduce to the event $\{\kappa\geq\frac{1}{2}\}$, and then apply (3.3) (recalling that $\operatorname{diam}(R)\sim|R|^{\frac{1}{d}}$ since $R$ has moderate aspect ratio) in combination with (2.4) and the fact that $\#\mathcal{R}\lesssim 1$ since $\mathcal{R}$ is an admissible partition.

Step 2.2. Starting from the cube $Q_{mL}$, let us construct a sequence of finer and finer partitions of $Q_{mL}$ by rectangles of moderate aspect ratio with side-lengths given by integer multiples of $L$. We let $\mathcal{R}_{0}=\{Q_{mL}\}$ and define $\mathcal{R}_{k}$ inductively as follows. Let $R\in\mathcal{R}_{k}$. Up to a translation we may assume that $R=\prod_{i=1}^{d}(0,m_{i}L)$ for some $m_{i}\in\mathbb{N}$. We then split each interval $(0,m_{i}L)$ into $(0,\lfloor\frac{m_{i}}{2}\rfloor L)\cup(\lfloor\frac{m_{i}}{2}\rfloor L,m_{i}L)$. It is readily seen that this induces an admissible partition of $R$. Let us point out that when $m_{i}=1$ for some $i$, the corresponding interval $(0,\lfloor\frac{m_{i}}{2}\rfloor L)$ is empty. This procedure stops after a finite number of steps $K$ once $\mathcal{R}_{K}=\{Q_{L}+z_{i},\ z_{i}\in L\{0,\ldots,m-1\}^{d}\}$. It is also readily seen that $2^{K-1}<m\leq 2^{K}$ and that for every $k\in[0,K]$ and every $R\in\mathcal{R}_{k}$ we have $|R|\sim(2^{K-k}L)^{d}$.
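The halving procedure of Step 2.2 is easy to simulate. The sketch below is ours (the helper names `refine` and `check` are hypothetical): it runs the construction on the integer side-lengths $m_{i}$ (in units of $L$) and verifies that every rectangle keeps moderate aspect ratio and that the recursion stops after $K=\lceil\log_{2}m\rceil$ steps.

```python
import math

def refine(sides):
    """Split a rectangle with integer sides into the up-to-2^d children
    obtained by halving each side into floor and ceil parts."""
    children = [[]]
    for m in sides:
        halves = [h for h in (m // 2, m - m // 2) if h > 0]  # drop empty intervals
        children = [c + [h] for c in children for h in halves]
    return children

def check(m, d=3):
    level, K = [[m] * d], 0
    while any(max(r) > 1 for r in level):
        level = [c for r in level for c in refine(r)]
        K += 1
        # every rectangle in the partition keeps moderate aspect ratio
        for r in level:
            assert max(r) <= 2 * min(r)
    assert len(level) == m ** d          # final partition: m^d unit cubes
    assert 2 ** (K - 1) < m <= 2 ** K    # number of refinement steps
    return K

for m in (2, 3, 5, 7, 11, 23):
    assert check(m) == math.ceil(math.log2(m))
```

The key invariant is that at every level the side-lengths take at most two consecutive integer values, which is what keeps the aspect ratio below $2$.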
Let us prove by a downward induction that there exists Λ>0\Lambda>0 such that for every k[0,K]k\in[0,K] and every RkR\in\mathcal{R}_{k},

fref(R)fref(QL)+Λ(1+fref(QL))Ld22j=KkK2jd22.f^{\mathrm{ref}}(R)\leq f^{\mathrm{ref}}(Q_{L})+\Lambda(1+f^{\mathrm{ref}}(Q_{L}))L^{-\frac{d-2}{2}}\sum_{j=K-k}^{K}2^{-j\frac{d-2}{2}}. (4.9)

This is clearly true for k=Kk=K. Assume that it holds true for k+1k+1. Let RkR\in\mathcal{R}_{k}. Applying (4.7) with ε=(2KkL)(d2)/21\varepsilon=(2^{K-k}L)^{-(d-2)/2}\ll 1, we get

fref(R)\displaystyle f^{\mathrm{ref}}(R) (1+ε)Rik+1,RiR|Ri||R|fref(Ri)+Cεp11|R|p(d2)2d\displaystyle\leq(1+\varepsilon)\sum_{R_{i}\in\mathcal{R}_{k+1},R_{i}\subset R}\frac{|R_{i}|}{|R|}f^{\mathrm{ref}}(R_{i})+\frac{C}{\varepsilon^{p-1}}\frac{1}{|R|^{\frac{p(d-2)}{2d}}}
(4.9)(1+ε)(fref(QL)+Λ(1+fref(QL))Ld22j=Kk+1K2jd22)+C(2KkL)d22\displaystyle\stackrel{{\scriptstyle\eqref{induction}}}{{\leq}}(1+\varepsilon)\left(f^{\mathrm{ref}}(Q_{L})+\Lambda(1+f^{\mathrm{ref}}(Q_{L}))L^{-\frac{d-2}{2}}\sum_{j=K-k+1}^{K}2^{-j\frac{d-2}{2}}\right)+C(2^{K-k}L)^{-\frac{d-2}{2}}
fref(QL)+Λ(1+fref(QL))Ld22\displaystyle\leq f^{\mathrm{ref}}(Q_{L})+\Lambda(1+f^{\mathrm{ref}}(Q_{L}))L^{-\frac{d-2}{2}}
×[j=Kk+1K2jd22+2(Kk)d22(CΛ+Ld22j=Kk+1K2jd22)].\displaystyle\times\left[\sum_{j=K-k+1}^{K}2^{-j\frac{d-2}{2}}+2^{-(K-k)\frac{d-2}{2}}\left(\frac{C}{\Lambda}+L^{-\frac{d-2}{2}}\sum_{j=K-k+1}^{K}2^{-j\frac{d-2}{2}}\right)\right].

If LL is large enough, then (j=Kk+1K2jd22)L(d2)/212(\sum_{j=K-k+1}^{K}2^{-j\frac{d-2}{2}})L^{-(d-2)/2}\leq\frac{1}{2}. Finally, choosing Λ2C\Lambda\geq 2C yields (4.9).
Applying (4.9) to R=QmLR=Q_{mL} and using that j02jd22<\sum_{j\geq 0}2^{-j\frac{d-2}{2}}<\infty, we get

fref(mL)fref(L)+C(1+fref(L))1Ld22.f^{\mathrm{ref}}(mL)\leq f^{\mathrm{ref}}(L)+C(1+f^{\mathrm{ref}}(L))\frac{1}{L^{\frac{d-2}{2}}}.

Since fref(L)Lpf^{\mathrm{ref}}(L)\lesssim L^{p}, noting that every L1L\gg 1 may be written as L=mLL=mL^{\prime} for some mm\in\mathbb{N} and L[1,2]L^{\prime}\in[1,2], we conclude that fref(L)f^{\mathrm{ref}}(L) is bounded, and thus (4.2) follows. ∎

Remark 4.3.

We point out that as a consequence of the proof of Proposition 4.2 we have for every rectangle RR of moderate aspect ratio (recall definition (4.6))

fref(R)1.f^{\mathrm{ref}}(R)\lesssim 1. (4.10)

We can now prove Theorem 4.1.

Proof of Theorem 4.1.

The existence of a limit freff_{\infty}^{\mathrm{ref}} is obtained from (4.2) arguing exactly as in [9], using the continuity of Lfref(L)L\mapsto f^{\mathrm{ref}}(L) (which can be obtained for instance by dominated convergence). The fact that fref(L)1f^{\mathrm{ref}}(L)\gtrsim 1 and thus fref>0f_{\infty}^{\mathrm{ref}}>0 follows from (1.1) and (2.3). Finally, (4.1) follows from (4.2) by sending mm\to\infty for fixed LL. ∎

Remark 4.4.

It may be conjectured from [10] that

|fref(L)fref|1Ld2.|f^{\mathrm{ref}}(L)-f_{\infty}^{\mathrm{ref}}|\lesssim\frac{1}{L^{d-2}}.

Let us notice that if we could replace the term ε(p1)\varepsilon^{-(p-1)} in (4.5) by a constant, then letting ε0\varepsilon\to 0 we would get the lower bound

fref(L)fref1Lp(d2)2,f^{\mathrm{ref}}(L)-f_{\infty}^{\mathrm{ref}}\gtrsim-\frac{1}{L^{\frac{p(d-2)}{2}}},

which, at least for p=2p=2, is in line with the conjectured rate. See also Remark 6.4 for rates in the case of a deterministic number of points.

5. Bi-partite matching

We now turn to the bi-partite matching. For μ\mu and λ\lambda two independent Poisson point processes of intensity one, we want to study the asymptotic behavior as LL\to\infty of

fbi(L)=𝔼[1|QL|WQLp(μ,κλ)],f^{\textrm{bi}}(L)=\mathbb{E}\left[\frac{1}{|Q_{L}|}W_{Q_{L}}^{p}(\mu,\kappa\lambda)\right],

with κ=μ(QL)λ(QL)\kappa=\frac{\mu(Q_{L})}{\lambda(Q_{L})}. Analogously to Theorem 4.1, we have:

Theorem 5.1.

For every d3d\geq 3 and p1p\geq 1, the limit

fbi=limLfbi(L)f_{\infty}^{\textrm{bi}}=\lim_{L\to\infty}f^{\textrm{bi}}(L)

exists and is strictly positive. Moreover, there exists C>0C>0 depending on pp and dd such that for L1L\geq 1,

fbifbi(L)+CLd22p.f_{\infty}^{\textrm{bi}}\leq f^{\textrm{bi}}(L)+\frac{C}{L^{\frac{d-2}{2p}}}.

The proof of Theorem 5.1 follows the same line of argument as that of Theorem 4.1. We only detail the estimate of the sub-additivity defect, i.e., the counterpart of (4.8), since it is more delicate in the bipartite case.

Proposition 5.2.

Let RR be a rectangle of moderate aspect ratio with |R|1|R|\gg 1 and ={Ri}\mathcal{R}=\{R_{i}\} be an admissible partition of RR (recall Definition 3.3). Defining κi=μ(Ri)λ(Ri)\kappa_{i}=\frac{\mu(R_{i})}{\lambda(R_{i})} and κ=μ(R)λ(R)\kappa=\frac{\mu(R)}{\lambda(R)}, we have for every d3d\geq 3 and p1p\geq 1,

𝔼[1|R|WRp(iκiχRiλ,κλ)]1|R|d22d.\mathbb{E}\left[\frac{1}{|R|}W_{{R}}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}}\lambda,\kappa\lambda\right)\right]\lesssim\frac{1}{|R|^{\frac{d-2}{2d}}}. (5.1)
Proof.

As opposed to (4.8), since λ\lambda is atomic, we cannot directly use (3.3). The idea is to use as an intermediate step the matching between λ\lambda and the reference measure. Let us point out that since fref(Ri)=𝔼[1|Ri|WRip(λ,λ(Ri)|Ri|)]f^{\mathrm{ref}}(R_{i})=\mathbb{E}\left[\frac{1}{|R_{i}|}W^{p}_{R_{i}}\left(\lambda,\frac{\lambda(R_{i})}{|R_{i}|}\right)\right] is of order one, we cannot naively apply the triangle inequality. The main observation is that since |κiκ|1|\kappa_{i}-\kappa|\ll 1 with overwhelming probability, the amount of mass which actually needs to be transported is very small.

Let 0<θ10<\theta\ll 1 be a parameter to be optimized later on. For every ii let

θi:=(κκi)+θ\theta_{i}:=(\kappa-\kappa_{i})+\theta

and assume that

θ2maxi|κκi|\theta\geq 2\max_{i}|\kappa-\kappa_{i}| (5.2)

so that 32θθi12θ>0\frac{3}{2}\theta\geq\theta_{i}\geq\frac{1}{2}\theta>0. Notice that thanks to the Cramér–Chernoff bounds (2.3), (5.2) is satisfied with overwhelming probability as long as θ|R|1/2\theta\gg|R|^{-1/2}. Using the triangle inequality (2.1) we have

WRp(iκiχRiλ,κλ)\displaystyle W_{R}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}}\lambda,\kappa\lambda\right)
WRp(iκiχRiλ,iχRi[(κiθ)λ+θλ(Ri)|Ri|dx])\displaystyle\lesssim W_{R}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}}\lambda,\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta\frac{\lambda(R_{i})}{|R_{i}|}dx\right]\right)
+WRp(iχRi[(κiθ)λ+θλ(Ri)|Ri|dx],iχRi[(κiθ)λ+θiλ(Ri)|Ri|dx])\displaystyle+W_{R}^{p}\left(\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta\frac{\lambda(R_{i})}{|R_{i}|}dx\right],\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta_{i}\frac{\lambda(R_{i})}{|R_{i}|}dx\right]\right)
+WRp(iχRi[(κiθ)λ+θiλ(Ri)|Ri|dx],κλ).\displaystyle+W_{R}^{p}\left(\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta_{i}\frac{\lambda(R_{i})}{|R_{i}|}dx\right],\kappa\lambda\right).

Notice that iθλ(Ri)=iθiλ(Ri)\sum_{i}\theta\lambda(R_{i})=\sum_{i}\theta_{i}\lambda(R_{i}) by definition of θi\theta_{i} and the fact that κλ(R)=μ(R)=iκiλ(Ri)\kappa\lambda(R)=\mu(R)=\sum_{i}\kappa_{i}\lambda(R_{i}), so that the second term is well defined.
We now estimate the three terms separately. The first and third are estimated in a similar way and we thus focus only on the first one. By sub-additivity (2.2) of WRpW^{p}_{R}, we have

WRp(iκiχRiλ,iχRi[(κiθ)λ+θλ(Ri)|Ri|dx])θiWRip(λ,λ(Ri)|Ri|).W_{R}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}}\lambda,\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta\frac{\lambda(R_{i})}{|R_{i}|}dx\right]\right)\leq\theta\sum_{i}W_{R_{i}}^{p}\left(\lambda,\frac{\lambda(R_{i})}{|R_{i}|}\right).

We turn to the middle term. Again by sub-additivity of WRpW^{p}_{R},

WRp(iχRi[(κiθ)λ+θλ(Ri)|Ri|dx],iχRi[(κiθ)λ+θiλ(Ri)|Ri|dx])WRp(iχRiθλ(Ri)|Ri|,iχRiθiλ(Ri)|Ri|).W_{R}^{p}\left(\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta\frac{\lambda(R_{i})}{|R_{i}|}dx\right],\sum_{i}\chi_{R_{i}}\left[(\kappa_{i}-\theta)\lambda+\theta_{i}\frac{\lambda(R_{i})}{|R_{i}|}dx\right]\right)\\ \leq W_{R}^{p}\left(\sum_{i}\chi_{R_{i}}\theta\frac{\lambda(R_{i})}{|R_{i}|},\sum_{i}\chi_{R_{i}}\theta_{i}\frac{\lambda(R_{i})}{|R_{i}|}\right).

Using (3.3) in the event {λ(Ri)/|Ri|1}\{\lambda(R_{i})/|R_{i}|\sim 1\} (which has overwhelming probability) and recalling that we assumed 32θθi12θ\frac{3}{2}\theta\geq\theta_{i}\geq\frac{1}{2}\theta, we have

WRp(iχRiθλ(Ri)|Ri|,iχRiθiλ(Ri)|Ri|)diamp(R)θp1i|κκi|p|Ri|.W_{R}^{p}\left(\sum_{i}\chi_{R_{i}}\theta\frac{\lambda(R_{i})}{|R_{i}|},\sum_{i}\chi_{R_{i}}\theta_{i}\frac{\lambda(R_{i})}{|R_{i}|}\right)\lesssim\frac{\operatorname{diam}^{p}(R)}{\theta^{p-1}}\sum_{i}\left|\kappa-\kappa_{i}\right|^{p}|R_{i}|.

Putting these two estimates together, dividing by |R||R| and taking expectations we find

𝔼[1|R|WRp(iκiχRiλ,κλ)]\displaystyle\mathbb{E}\left[\frac{1}{|R|}W_{R}^{p}\left(\sum_{i}\kappa_{i}\chi_{R_{i}}\lambda,\kappa\lambda\right)\right] θifref(Ri)+diamp(R)θp1i𝔼[|κκi|p]\displaystyle\lesssim\theta\sum_{i}f^{\mathrm{ref}}(R_{i})+\frac{\operatorname{diam}^{p}(R)}{\theta^{p-1}}\sum_{i}\mathbb{E}\left[\left|\kappa-\kappa_{i}\right|^{p}\right]
(4.10)&(2.4)θ+1θp11|R|p(d2)2d.\displaystyle\stackrel{{\scriptstyle\eqref{eq:boundfR}\&\eqref{eq:momentPoi}}}{{\lesssim}}\theta+\frac{1}{\theta^{p-1}}\frac{1}{|R|^{p\frac{(d-2)}{2d}}}.

Optimizing in θ\theta by choosing θ=|R|d22d|R|1/2\theta=|R|^{-\frac{d-2}{2d}}\gg|R|^{-1/2} (so that (5.2) is satisfied), we obtain (5.1). ∎
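For the reader's convenience, the balancing behind this choice of θ\theta may be spelled out: the two terms θ\theta and θ(p1)|R|p(d2)2d\theta^{-(p-1)}|R|^{-\frac{p(d-2)}{2d}} in the last estimate are of the same order exactly when

```latex
\theta \;=\; \frac{1}{\theta^{p-1}}\,\frac{1}{|R|^{\frac{p(d-2)}{2d}}}
\quad\Longleftrightarrow\quad
\theta^{p} \;=\; |R|^{-\frac{p(d-2)}{2d}}
\quad\Longleftrightarrow\quad
\theta \;=\; |R|^{-\frac{d-2}{2d}}.
```

With this choice both terms are of order |R|d22d|R|^{-\frac{d-2}{2d}}, which is the right-hand side of (5.1), and since d22d<12\frac{d-2}{2d}<\frac{1}{2} the constraint θ|R|1/2\theta\gg|R|^{-1/2} is indeed satisfied.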

Comparing (4.4) and (5.1), one may wonder whether (5.1) is suboptimal and could be improved. Let us prove that this is not the case, at least if we consider the slightly more regular situation of the matching to a deterministic grid.

Proposition 5.3.

Let λ=X𝐙dδX\lambda=\sum_{X\in\mathbf{Z}^{d}}\delta_{X} and, for every L1L\gg 1, let (μ1,,μ2d)(\mu_{1},\cdots,\mu_{2^{d}}) be 2d2^{d} independent Poisson random variables of parameter LdL^{d}. Writing Q2L=i=12dQiQ_{2L}=\cup_{i=1}^{2^{d}}Q_{i}, where the QiQ_{i} are disjoint cubes of side-length LL, and defining κi=μiλ(Qi)\kappa_{i}=\frac{\mu_{i}}{\lambda(Q_{i})}, κ=iμiλ(Q2L)\kappa=\frac{\sum_{i}\mu_{i}}{\lambda(Q_{2L})}, we have for d3d\geq 3 and p1p\geq 1,

𝔼[1|Q2L|WQ2Lp(iκiχQiλ,κλ)]1Ld22.\mathbb{E}\left[\frac{1}{|Q_{2L}|}W_{Q_{2L}}^{p}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}}\lambda,\kappa\lambda\right)\right]\sim\frac{1}{L^{\frac{d-2}{2}}}.
Proof.

Step 1.[The lower bound] We start by proving that

𝔼[1|Q2L|WQ2Lp(iκiχQiλ,κλ)]1Ld22.\mathbb{E}\left[\frac{1}{|Q_{2L}|}W_{Q_{2L}}^{p}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}}\lambda,\kappa\lambda\right)\right]\gtrsim\frac{1}{L^{\frac{d-2}{2}}}. (5.3)

For this we notice that if π\pi is any admissible coupling, then |xy|1|x-y|\geq 1 for every (x,y)(x,y) in the support of π\pi with xyx\neq y, and thus (recall that WQ2L1W^{1}_{Q_{2L}} denotes the 11-Wasserstein distance)

WQ2Lp(iκiχQiλ,κλ)WQ2L1(iκiχQiλ,κλ).W_{Q_{2L}}^{p}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}}\lambda,\kappa\lambda\right)\geq W^{1}_{Q_{2L}}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}}\lambda,\kappa\lambda\right).

Let Σ=iQiQ2L\Sigma=\cup_{i}\partial Q_{i}\cap Q_{2L} and εi=sign(κiκ)\varepsilon_{i}=sign(\kappa_{i}-\kappa). Using ξ(x)=d(x,Σ)iεiχQi(x)\xi(x)=d(x,\Sigma)\sum_{i}\varepsilon_{i}\chi_{Q_{i}}(x) as test function in the dual formulation of WQ2L1W^{1}_{Q_{2L}}, we obtain

WQ2L1(iκiχQiλ,κλ)iX𝐙dQid(X,Σ)|κiκ|Ld+1i|κiκ|.W_{Q_{2L}}^{1}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}}\lambda,\kappa\lambda\right)\geq\sum_{i}\sum_{X\in\mathbf{Z}^{d}\cap Q_{i}}d(X,\Sigma)|\kappa_{i}-\kappa|\gtrsim L^{d+1}\sum_{i}|\kappa_{i}-\kappa|.

Taking expectations we find (5.3).

Step 2.[The upper bound] We turn to the corresponding upper bound,

𝔼[1|Q2L|WQ2Lp(iκiχQiλ,κλ)]1Ld22.\mathbb{E}\left[\frac{1}{|Q_{2L}|}W_{Q_{2L}}^{p}\left(\sum_{i}\kappa_{i}\chi_{Q_{i}}\lambda,\kappa\lambda\right)\right]\lesssim\frac{1}{L^{\frac{d-2}{2}}}. (5.4)

One could argue exactly as for (5.1) but we provide an alternative proof which uses a one-dimensional argument instead of (3.3). Notice that this argument could have also been used to give a different proof of (4.4).

Step 2.1.[The one-dimensional estimate] Let λ=X𝐙δX\lambda=\sum_{X\in\mathbf{Z}}\delta_{X} and for κ1\kappa_{1}, κ2>0\kappa_{2}>0 and L1L\gg 1, define κ=κ1λ(0,L/2)+κ2λ(L/2,L)λ(0,L)\kappa=\frac{\kappa_{1}\lambda(0,L/2)+\kappa_{2}\lambda(L/2,L)}{\lambda(0,L)}. We claim that if |κ1κ2|L1min(κ1,κ2)|\kappa_{1}-\kappa_{2}|\ll L^{-1}\min(\kappa_{1},\kappa_{2}), then

W(0,L)p(κ1χ(0,L/2)λ+κ2χ(L/2,L)λ,κλ)L2|κ1κ2|.W^{p}_{(0,L)}\left(\kappa_{1}\chi_{(0,L/2)}\lambda+\kappa_{2}\chi_{(L/2,L)}\lambda,\kappa\lambda\right)\lesssim L^{2}|\kappa_{1}-\kappa_{2}|. (5.5)

Let us assume without loss of generality that κ1κ2\kappa_{1}\geq\kappa_{2}. The optimal transport map is essentially symmetric around L/2L/2 and is given in (0,L/2)(0,L/2) by sending a mass (k+1)λ(L/2,L)λ(0,L)(κ1κ2)(k+1)\frac{\lambda(L/2,L)}{\lambda(0,L)}(\kappa_{1}-\kappa_{2}) from position kk to k+1k+1 (which is admissible since by hypothesis k|κ1κ2|min(κ1,κ2)k|\kappa_{1}-\kappa_{2}|\ll\min(\kappa_{1},\kappa_{2})) so that

W(0,L)p(κ1χ(0,L/2)λ+κ2χ(L/2,L)λ,κλ)k=0L/2k|κ1κ2|L2|κ1κ2|.W^{p}_{(0,L)}\left(\kappa_{1}\chi_{(0,L/2)}\lambda+\kappa_{2}\chi_{(L/2,L)}\lambda,\kappa\lambda\right)\lesssim\sum_{k=0}^{L/2}k|\kappa_{1}-\kappa_{2}|\sim L^{2}|\kappa_{1}-\kappa_{2}|.

This proves (5.5).
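The one-dimensional estimate (5.5) is easy to test numerically in the case p=1p=1, where the transport cost between the two weighted lattice measures can be evaluated exactly. The following sketch (purely illustrative, not part of the proof) assumes scipy is available and uses its one-dimensional W1W_{1} routine; the parameters are chosen so that |κ1κ2|L1min(κ1,κ2)|\kappa_{1}-\kappa_{2}|\ll L^{-1}\min(\kappa_{1},\kappa_{2}) as required.

```python
import numpy as np
from scipy.stats import wasserstein_distance

L = 100
xs = np.arange(L)                      # lattice points of lambda in (0, L)
k1, k2 = 1.0, 1.0 + 1e-4               # |k1 - k2| << L^{-1} min(k1, k2)
mu_w = np.where(xs < L / 2, k1, k2)    # weight k1 on (0, L/2), k2 on (L/2, L)
nu_w = np.full(L, mu_w.mean())         # kappa * lambda, same total mass
# scipy normalizes the weights to probability measures, so multiply
# back by the total mass to get the unnormalized transport cost
cost = mu_w.sum() * wasserstein_distance(xs, xs, mu_w, nu_w)
# cost stays below L^2 |k1 - k2|, as predicted by (5.5)
```

The monotone (sorted) coupling is optimal in one dimension, which is why this single scipy call computes the cost exactly.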

Step 2.2.[Proof of (5.4)] The proof proceeds recursively, layer by layer. In the first step, we pass from 2d2^{d} cubes to 2d12^{d-1} rectangles of the form x+(0,L)×(0,L/2)d1x+(0,L)\times(0,L/2)^{d-1}. For this we remark that for every cube QiQ_{i} there is exactly one cube QjQ_{j} such that Qj=Qi±L2e1Q_{j}=Q_{i}\pm\frac{L}{2}e_{1} (where (ei)(e_{i}) denotes the canonical basis of d\mathbb{R}^{d}). Let us focus on Q1=(0,L/2)dQ_{1}=(0,L/2)^{d} and Q2=L2e1+Q1Q_{2}=\frac{L}{2}e_{1}+Q_{1}. Let R=Q1Q2=(0,L)×(0,L/2)d1R=Q_{1}\cup Q_{2}=(0,L)\times(0,L/2)^{d-1}. Define κ^=κ1λ(Q1)+κ2λ(Q2)λ(R)=μ1+μ2λ(R)\hat{\kappa}=\frac{\kappa_{1}\lambda(Q_{1})+\kappa_{2}\lambda(Q_{2})}{\lambda(R)}=\frac{\mu_{1}+\mu_{2}}{\lambda(R)}. We claim that

𝔼[1|R|WRp(κ1χQ1λ+κ2χQ2λ,κ^λ)]1Ld22.\mathbb{E}\left[\frac{1}{|R|}W_{R}^{p}\left(\kappa_{1}\chi_{Q_{1}}\lambda+\kappa_{2}\chi_{Q_{2}}\lambda,\hat{\kappa}\lambda\right)\right]\lesssim\frac{1}{L^{\frac{d-2}{2}}}.

For this we notice that in RR, the measures κ1χQ1λ+κ2χQ2λ\kappa_{1}\chi_{Q_{1}}\lambda+\kappa_{2}\chi_{Q_{2}}\lambda and κ^λ\hat{\kappa}\lambda are constant in the directions orthogonal to e1e_{1} and thus

1|R|WRp(κ1χQ1λ+κ2χQ2λ,κ^λ)1LW(0,L)p(κ1χ(0,L/2)λ+κ2χ(L/2,L)λ,κ^λ).\frac{1}{|R|}W_{R}^{p}\left(\kappa_{1}\chi_{Q_{1}}\lambda+\kappa_{2}\chi_{Q_{2}}\lambda,\hat{\kappa}\lambda\right)\lesssim\frac{1}{L}W_{(0,L)}^{p}\left(\kappa_{1}\chi_{(0,L/2)}\lambda+\kappa_{2}\chi_{(L/2,L)}\lambda,\hat{\kappa}\lambda\right).

Since κi=μiλ(Qi)\kappa_{i}=\frac{\mu_{i}}{\lambda(Q_{i})}, by the Cramér-Chernoff bounds (2.3), we have |κi1|=O(Ld/2)|\kappa_{i}-1|=O(L^{-d/2}) with overwhelming probability. Hence, if d3d\geq 3 we may apply (5.5) to get

𝔼[1|R|WRp(κ1χQ1λ+κ2χQ2λ,κ^λ)]L𝔼[|κ1κ2|]1Ld22\mathbb{E}\left[\frac{1}{|R|}W_{R}^{p}\left(\kappa_{1}\chi_{Q_{1}}\lambda+\kappa_{2}\chi_{Q_{2}}\lambda,\hat{\kappa}\lambda\right)\right]\lesssim L\mathbb{E}[|\kappa_{1}-\kappa_{2}|]\lesssim\frac{1}{L^{\frac{d-2}{2}}}

which proves the claim.
We finally iterate this argument dd times. At every step kk we have 2dk2^{d-k} rectangles of the form x+(0,L)k×(0,L/2)dkx+(0,L)^{k}\times(0,L/2)^{d-k} and each iteration has a cost of the same order (namely L(d2)/2L^{-(d-2)/2}). Using the triangle inequality (2.1), this concludes the proof of (5.4). ∎

Remark 5.4.

Notice that in the proof of (5.5), we locally move a mass of order L(d2)/2L^{-(d-2)/2} which corresponds to the optimal choice of θ\theta in the proof of (5.1).

6. De-Poissonization

In this section we discuss how to transfer limit results from matching problems with a Poisson point process, i.e., with a random number of points, to those with a deterministic number of points.

We use the following general result.

Proposition 6.1.

Let f:(0,)×k[0,)f:(0,\infty)\times\mathbb{N}^{k}\to[0,\infty), (L,n)f(L|n)(L,n)\mapsto f(L|n), satisfy the following assumptions:

  1. (pp-homogeneity) f(L|n)=Lpf(1|n)f(L|n)=L^{p}f(1|n), for every nkn\in\mathbb{N}^{k}, L>0L>0;

  2. (boundedness) f(1|n)1f(1|n)\lesssim 1, for every nkn\in\mathbb{N}^{k};

  3. (monotonicity) f(1|n)f(1|m)f(1|n)\leq f(1|m), for every m,nkm,n\in\mathbb{N}^{k} such that minim_{i}\leq n_{i} for i=1,,ki=1,\ldots,k.

Define

f(L)=𝔼[f(L|NL)]f(L)=\mathbb{E}\left[f(L|N_{L})\right]

where NL=(NL,i)i=1kN_{L}=(N_{L,i})_{i=1}^{k} are i.i.d. Poisson random variables with parameter LdL^{d}. Then,

lim infnnpdf(1|(n,,n))=lim infLf(L),andlim supnnpdf(1|(n,,n))=lim supLf(L).\liminf_{n\to\infty}n^{\frac{p}{d}}f(1|(n,\ldots,n))=\liminf_{L\to\infty}f(L),\quad\text{and}\quad\limsup_{n\to\infty}n^{\frac{p}{d}}f(1|(n,\ldots,n))=\limsup_{L\to\infty}f(L).
Proof.

Let 0<δ<Ld0<\delta<L^{d} and introduce the event

A={|NL,iLd|<δfor i=1,,k}.A=\left\{|N_{L,i}-L^{d}|<\delta\quad\text{for $i=1,\ldots,k$}\right\}.

By independence and Poisson tail bounds (2.3),

(A)(12exp(12δ2Ld+δ))k.\mathbb{P}(A)\geq\left(1-2\exp\left(-\frac{1}{2}\frac{\delta^{2}}{L^{d}+\delta}\right)\right)^{k}. (6.1)
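The Poisson tail bound behind (6.1), namely a Cramér–Chernoff-type estimate of the form (|NLd|δ)2exp(12δ2Ld+δ)\mathbb{P}(|N-L^{d}|\geq\delta)\leq 2\exp(-\frac{1}{2}\frac{\delta^{2}}{L^{d}+\delta}), can be checked numerically against the exact Poisson tails. The following sketch assumes scipy is available; the intensity and deviation values are illustrative.

```python
import numpy as np
from scipy.stats import poisson

lam, delta = 1000.0, 150.0   # illustrative intensity L^d and deviation delta
# exact two-sided tail P(|N - lam| >= delta) for N ~ Poisson(lam):
# lower tail P(N <= lam - delta) plus upper tail P(N >= lam + delta)
tail = poisson.cdf(lam - delta, lam) + poisson.sf(lam + delta - 1, lam)
# Cramer-Chernoff-type bound as in (2.3)/(6.1)
bound = 2.0 * np.exp(-0.5 * delta**2 / (lam + delta))
# the exact tail sits strictly below the bound
```

The bound is not sharp (Bernstein-type inequalities give a smaller exponent denominator), but it is summable in the regime δLd/2logL\delta\sim L^{d/2}\sqrt{\log L} used below, which is all the proof requires.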

We decompose

f(L)=𝔼[f(L|NL)|A](A)+𝔼[f(L|NL)|Ac](Ac).f(L)=\mathbb{E}\left[f(L|N_{L})|A\right]\mathbb{P}(A)+\mathbb{E}\left[f(L|N_{L})|A^{c}\right]\mathbb{P}(A^{c}).

If AA holds, we use monotonicity of f(L|n)f(L|n) to argue that

Lpf(1|a)𝔼[f(L|NL)|A]Lpf(1|b),L^{p}f(1|a)\leq\mathbb{E}\left[f(L|N_{L})|A\right]\leq L^{p}f(1|b),

where a=Ld+δa=L^{d}+\delta, b=Ldδb=L^{d}-\delta. Otherwise, we use (1)(1) and (2)(2) to obtain

0𝔼[f(L|NL)|Ac]Lp.0\leq\mathbb{E}\left[f(L|N_{L})|A^{c}\right]\leq L^{p}.

Combining these inequalities, we have

Lpf(1|a)(A)f(L)Lp(f(1|b)+(1(A))).L^{p}f(1|a)\mathbb{P}(A)\leq f(L)\leq L^{p}\left(f(1|b)+(1-\mathbb{P}(A))\right). (6.2)

For any n1n\geq 1, let L=L(n)L=L(n) be such that Ld+Ld/22βlogL=nL^{d}+L^{d/2}\sqrt{2\beta\log L}=n, for some fixed β>p\beta>p. Then, we have from (6.2) with δ=Ld/22βlogL\delta=L^{d/2}\sqrt{2\beta\log L},

f(1|n)f(L)Lp(A).f(1|n)\leq\frac{f(L)}{L^{p}\mathbb{P}(A)}.

As nn\to\infty, we have

npdLp=(1+Ld22βlogL)pd=1+O(lognn).\frac{n^{\frac{p}{d}}}{L^{p}}=(1+L^{-\frac{d}{2}}\sqrt{2\beta\log L})^{\frac{p}{d}}=1+O\left(\sqrt{\frac{\log n}{n}}\right).

Moreover, by (6.1),

(A)(12exp(12δ2Ld+δ))k=1O(Lβ)=1O(nβd).\mathbb{P}(A)\geq\left(1-2\exp\left(-\frac{1}{2}\frac{\delta^{2}}{L^{d}+\delta}\right)\right)^{k}=1-O(L^{-\beta})=1-O\left(n^{-\frac{\beta}{d}}\right).

It follows that

lim supnnpdf(1|(n,,n))lim supLf(L),andlim infnnpdf(1|(n,,n))lim infLf(L).\limsup_{n\to\infty}n^{\frac{p}{d}}f(1|(n,\ldots,n))\leq\limsup_{L\to\infty}f(L),\quad\text{and}\quad\liminf_{n\to\infty}n^{\frac{p}{d}}f(1|(n,\ldots,n))\leq\liminf_{L\to\infty}f(L).

To obtain the converse inequalities, we argue analogously choosing instead L=L(n)L=L(n) such that LdLd/22βlogL=nL^{d}-L^{d/2}\sqrt{2\beta\log L}=n. ∎

We now apply Proposition 6.1 to matching problems. Let us first consider the case of matching to the reference measure.

Theorem 6.2.

Let d3d\geq 3, p1p\geq 1, and (Xi)i1(X_{i})_{i\geq 1} be i.i.d. uniform random variables on [0,1]d[0,1]^{d}. Then,

limnnpd𝔼[W[0,1]dp(1ni=1nδXi,1)]=fref,\lim_{n\to\infty}n^{\frac{p}{d}}\mathbb{E}\left[W_{[0,1]^{d}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},1\right)\right]=f_{\infty}^{\mathrm{ref}},

with freff_{\infty}^{\mathrm{ref}} as in Theorem 4.1.

Proof.

Recalling the notation Q1=[0,1]dQ_{1}=[0,1]^{d}, we introduce the function

f(L|n)=Lp𝔼[WQ1p(1ni=1nδXi,1)],if n1,\displaystyle f(L|n)=L^{p}\mathbb{E}\left[W_{Q_{1}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},1\right)\right],\quad\text{if $n\geq 1$,}

and f(L|0)=0f(L|0)=0. It is clearly bounded and pp-homogeneous in the sense of Proposition 6.1. To show monotonicity, let 1mn1\leq m\leq n and use the identity

1ni=1nδXi=(nm)1I{1,,n}|I|=m1miIδXi\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}}={n\choose m}^{-1}\sum_{\begin{subarray}{c}I\subseteq\left\{1,\ldots,n\right\}\\ |I|=m\end{subarray}}\frac{1}{m}\sum_{i\in I}\delta_{X_{i}}

in combination with the convexity of the transportation cost, to obtain

WQ1p(1ni=1nδXi,1)\displaystyle W_{Q_{1}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},1\right) =WQ1p((nm)1I{1,,n}|I|=m1miIδXi,(nm)1I{1,,n}|I|=m1)\displaystyle=W_{Q_{1}}^{p}\left({n\choose m}^{-1}\sum_{\begin{subarray}{c}I\subseteq\left\{1,\ldots,n\right\}\\ |I|=m\end{subarray}}\frac{1}{m}\sum_{i\in I}\delta_{X_{i}},{n\choose m}^{-1}\sum_{\begin{subarray}{c}I\subseteq\left\{1,\ldots,n\right\}\\ |I|=m\end{subarray}}1\right)
(nm)1I{1,,n}|I|=mWQ1p(1miIδXi,1).\displaystyle\leq{n\choose m}^{-1}\sum_{\begin{subarray}{c}I\subseteq\left\{1,\ldots,n\right\}\\ |I|=m\end{subarray}}W_{Q_{1}}^{p}\left(\frac{1}{m}\sum_{i\in I}\delta_{X_{i}},1\right).

Taking expectation yields f(1|n)f(1|m)f(1|n)\leq f(1|m), since for I{1,,n}I\subseteq\left\{1,\ldots,n\right\} with |I|=m|I|=m, {Xi}iI\left\{X_{i}\right\}_{i\in I} has the same law as {Xi}i=1m\left\{X_{i}\right\}_{i=1}^{m}.

Let μ\mu be a Poisson point process of intensity one on d\mathbb{R}^{d} and for L>1L>1 let NL=μ(QL)N_{L}=\mu(Q_{L}) be a Poisson random variable of parameter LdL^{d}. For n1n\geq 1, we notice that

f(L|n)=𝔼[WQLp(μμ(QL),1|QL|)|NL=n]=Ldnfref(L|n),\begin{split}f(L|n)&=\mathbb{E}\left[\left.W_{Q_{L}}^{p}\left(\frac{\mu}{\mu(Q_{L})},\frac{1}{|Q_{L}|}\right)\right|N_{L}=n\right]\\ &=\frac{L^{d}}{n}f^{\mathrm{ref}}(L|n),\end{split} (6.3)

where we write, extending the notation from Section 4,

fref(L|n)=𝔼[1|QL|WQLp(μ,κ)|NL=n],f^{\mathrm{ref}}(L|n)=\mathbb{E}\left[\left.\frac{1}{|Q_{L}|}W_{Q_{L}}^{p}\left(\mu,\kappa\right)\right|N_{L}=n\right],

with κ=μ(QL)/|QL|\kappa=\mu(Q_{L})/|Q_{L}| (and fref(L|0)=0f^{\mathrm{ref}}(L|0)=0). Notice that

fref(L)=𝔼[fref(L|NL)].f^{\mathrm{ref}}(L)=\mathbb{E}[f^{\mathrm{ref}}(L|N_{L})].

In order to apply Proposition 6.1 and conclude the proof, we argue that

limL𝔼[|f(L|NL)fref(L|NL)|]=0.\lim_{L\to\infty}\mathbb{E}\left[\left|f\left(L|N_{L}\right)-f^{\mathrm{ref}}\left(L|N_{L}\right)\right|\right]=0. (6.4)

To this aim, we first bound 𝔼[f(L|NL)]\mathbb{E}\left[f(L|N_{L})\right] uniformly from above as LL\to\infty. For this we combine the following two simple facts. First, since f(L|n)Lpf(L|n)\lesssim L^{p}, letting AL={NL|QL|/2}A_{L}=\{N_{L}\geq|Q_{L}|/2\}, we have

𝔼[f(L|NL)|ALc]Lp.\mathbb{E}\left[f(L|N_{L})|A_{L}^{c}\right]\lesssim L^{p}.

Second, since by (2.3) [AL]1\mathbb{P}[A_{L}]\gtrsim 1,

𝔼[f(L|NL)|AL]=(6.3)𝔼[LdNLfref(L|NL)|AL]𝔼[fref(L|NL)]=fref(L)1,\mathbb{E}[f(L|N_{L})|A_{L}]\stackrel{{\scriptstyle\eqref{eq:identity-f-fref}}}{{=}}\mathbb{E}\left[\left.\frac{L^{d}}{N_{L}}f^{\mathrm{ref}}(L|N_{L})\right|A_{L}\right]\lesssim\mathbb{E}[f^{\mathrm{ref}}(L|N_{L})]=f^{\mathrm{ref}}(L)\lesssim 1,

where in the last inequality we used that freff^{\mathrm{ref}} is bounded as a consequence of Theorem 4.1. Therefore,

𝔼[f(L|NL)]=𝔼[f(L|NL)|ALc][ALc]+𝔼[f(L|NL)|AL][AL]Lp[ALc]+1(2.3)Lpexp(cLd)+11.\begin{split}\mathbb{E}[f(L|N_{L})]&=\mathbb{E}[f(L|N_{L})|A_{L}^{c}]\mathbb{P}[A_{L}^{c}]+\mathbb{E}[f(L|N_{L})|A_{L}]\mathbb{P}[A_{L}]\\ &\lesssim L^{p}\mathbb{P}[A_{L}^{c}]+1\\ &\stackrel{{\scriptstyle\eqref{eq:Cramer}}}{{\lesssim}}L^{p}\exp(-cL^{d})+1\lesssim 1.\end{split}

Using Hölder's inequality with 1q+1q=1\frac{1}{q}+\frac{1}{q^{\prime}}=1, and the fact that f(L|NL)Lpf(L|N_{L})\lesssim L^{p},

𝔼[|f(L|NL)fref(L|NL)|]=(6.3)𝔼[|NL|QL|1|f(L|NL)]𝔼[|NL|QL|1|q]1q𝔼[f(L|NL)q]1q.q(2.4)qLd2Lp(q1)q𝔼[f(L|NL)]1qqLd2+p(q1)q.\begin{split}\mathbb{E}\left[\left|f(L|N_{L})-f^{\mathrm{ref}}(L|N_{L})\right|\right]&\stackrel{{\scriptstyle\eqref{eq:identity-f-fref}}}{{=}}\mathbb{E}\left[\left|\frac{N_{L}}{|Q_{L}|}-1\right|f(L|N_{L})\right]\\ &\leq\mathbb{E}\left[\left|\frac{N_{L}}{|Q_{L}|}-1\right|^{q^{\prime}}\right]^{\frac{1}{q^{\prime}}}\mathbb{E}\left[f\left(L|N_{L}\right)^{q}\right]^{\frac{1}{q}}.\\ &\stackrel{{\scriptstyle\eqref{eq:momentPoi}}}{{\lesssim_{q}}}L^{-\frac{d}{2}}L^{\frac{p(q-1)}{q}}\mathbb{E}\left[f\left(L|N_{L}\right)\right]^{\frac{1}{q}}\lesssim_{q}L^{-\frac{d}{2}+\frac{p(q-1)}{q}}.\end{split}

Choosing qq close enough to 11, in particular 1<q<2p2pd1<q<\frac{2p}{2p-d} if p>d2p>\frac{d}{2}, we get (6.4). ∎

Arguing similarly, we obtain the corresponding result for the bi-partite matching on the unit cube, that is Theorem 1.1, which we restate for the reader’s convenience.

Theorem 6.3.

Let d3d\geq 3, p1p\geq 1, and (Xi)i1(X_{i})_{i\geq 1}, (Yi)i1(Y_{i})_{i\geq 1}, be independent uniform random variables on [0,1]d[0,1]^{d}. Then,

limnnpd𝔼[W[0,1]dp(1ni=1nδXi,1ni=1nδYi)]=fbi,\lim_{n\to\infty}n^{\frac{p}{d}}\mathbb{E}\left[W_{[0,1]^{d}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},\frac{1}{n}\sum_{i=1}^{n}\delta_{Y_{i}}\right)\right]=f_{\infty}^{\textrm{bi}},

with fbif_{\infty}^{\textrm{bi}} as in Theorem 5.1.

Proof.

The proof is very similar to that of Theorem 6.2, but in this case we define the function

f(L|n1,n2)=Lp𝔼[W[0,1]dp(1n1i=1n1δXi,1n2i=1n2δYi)]f(L|n_{1},n_{2})=L^{p}\mathbb{E}\left[W_{[0,1]^{d}}^{p}\left(\frac{1}{n_{1}}\sum_{i=1}^{n_{1}}\delta_{X_{i}},\frac{1}{n_{2}}\sum_{i=1}^{n_{2}}\delta_{Y_{i}}\right)\right]

for n1n_{1}, n21n_{2}\geq 1, and let f(L|n1,n2)=0f(L|n_{1},n_{2})=0 otherwise. It is clearly bounded and pp-homogeneous. Using the identities

1n1i=1n1δXi=(n1m1)1(n2m2)1I1{1,,n1}|I1|=m1I2{1,,n2}|I2|=m21m1iI1δXi\frac{1}{n_{1}}\sum_{i=1}^{n_{1}}\delta_{X_{i}}={n_{1}\choose m_{1}}^{-1}{n_{2}\choose m_{2}}^{-1}\sum_{\begin{subarray}{c}I_{1}\subseteq\left\{1,\ldots,n_{1}\right\}\\ |I_{1}|=m_{1}\end{subarray}}\sum_{\begin{subarray}{c}I_{2}\subseteq\left\{1,\ldots,n_{2}\right\}\\ |I_{2}|=m_{2}\end{subarray}}\frac{1}{m_{1}}\sum_{i\in I_{1}}\delta_{X_{i}}
1n2i=1n2δYi=(n1m1)1(n2m2)1I1{1,,n1}|I1|=m1I2{1,,n2}|I2|=m21m2iI2δYi\frac{1}{n_{2}}\sum_{i=1}^{n_{2}}\delta_{Y_{i}}={n_{1}\choose m_{1}}^{-1}{n_{2}\choose m_{2}}^{-1}\sum_{\begin{subarray}{c}I_{1}\subseteq\left\{1,\ldots,n_{1}\right\}\\ |I_{1}|=m_{1}\end{subarray}}\sum_{\begin{subarray}{c}I_{2}\subseteq\left\{1,\ldots,n_{2}\right\}\\ |I_{2}|=m_{2}\end{subarray}}\frac{1}{m_{2}}\sum_{i\in I_{2}}\delta_{Y_{i}}

we obtain monotonicity by arguing as in Theorem 6.2. The proof is then concluded as in Theorem 6.2, and we omit the details. ∎
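Theorem 6.3 lends itself to a direct numerical illustration: for moderate nn the bipartite matching can be solved exactly with the Hungarian algorithm, and the rescaled cost npd𝔼[Wp]n^{\frac{p}{d}}\mathbb{E}[W^{p}] stabilizes as nn grows. The following Monte Carlo sketch (not part of the proof; the function name and parameter values are illustrative) assumes scipy is available.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def scaled_matching_cost(n, d=3, p=1, samples=20, seed=0):
    """Monte Carlo estimate of n^(p/d) * E[W^p(mu_n, nu_n)] for two
    independent uniform samples of n points on the unit cube."""
    rng = np.random.default_rng(seed)
    costs = []
    for _ in range(samples):
        X = rng.random((n, d))
        Y = rng.random((n, d))
        # pairwise Euclidean costs |X_i - Y_j|^p
        C = np.linalg.norm(X[:, None, :] - Y[None, :, :], axis=2) ** p
        rows, cols = linear_sum_assignment(C)   # optimal bipartite matching
        costs.append(C[rows, cols].mean())      # = W^p of the empirical measures
    return n ** (p / d) * np.mean(costs)
```

For d=3d=3 and p=1p=1, the values returned for increasing nn should settle near the constant fbif_{\infty}^{\textrm{bi}}; already nn of order a few hundred gives a fairly stable estimate.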

Remark 6.4 (Convergence rates).

An inspection of the proof of Proposition 6.1 shows that one can transfer rates of convergence (even only one-sided) from the Poisson case to that of a fixed number of points. In the case of matching to the reference measure this leads to the inequality

frefnpd𝔼[WQ1p(1ni=1nδXi,1)]+Cn2d2d,f_{\infty}^{\mathrm{ref}}\leq n^{\frac{p}{d}}\mathbb{E}\left[W_{Q_{1}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},1\right)\right]+Cn^{\frac{2-d}{2d}},

while for the bi-partite matching we obtain

fbinpd𝔼[WQ1p(1ni=1nδXi,1ni=1nδYi)]+Cn2d2dp,f_{\infty}^{\textrm{bi}}\leq n^{\frac{p}{d}}\mathbb{E}\left[W_{Q_{1}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},\frac{1}{n}\sum_{i=1}^{n}\delta_{Y_{i}}\right)\right]+Cn^{\frac{2-d}{2dp}},

for some constant C0C\geq 0. Besides being one-sided bounds, these are still far from the conjectured rates in [10], which for p=2p=2 read

n2d𝔼[WQ12(1ni=1nδXi,1ni=1nδYi)]=fbi+Cn2dd+o(n2dd),n^{\frac{2}{d}}\mathbb{E}\left[W_{Q_{1}}^{2}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},\frac{1}{n}\sum_{i=1}^{n}\delta_{Y_{i}}\right)\right]=f_{\infty}^{\textrm{bi}}+Cn^{\frac{2-d}{d}}+o(n^{\frac{2-d}{d}}),

with an explicit constant CC.

Remark 6.5 (Strong convergence).

If p<dp<d, standard concentration of measure arguments allow one to obtain strong convergence from convergence of the expected values (see also [3] for a similar argument in the case p=d=2p=d=2). Let us consider for example the case of bi-partite matching, and show that, both \mathbb{P}-a.s. and in L1()L^{1}(\mathbb{P}),

limnnpdWQ1p(1ni=1nδXi,1ni=1nδYi)=fbi,\lim_{n\to\infty}n^{\frac{p}{d}}W_{Q_{1}}^{p}\left(\frac{1}{n}\sum_{i=1}^{n}\delta_{X_{i}},\frac{1}{n}\sum_{i=1}^{n}\delta_{Y_{i}}\right)=f_{\infty}^{\textrm{bi}}, (6.5)

with (Xi)i1(X_{i})_{i\geq 1}, (Yi)i1(Y_{i})_{i\geq 1} independent uniform random variables on Q1=[0,1]dQ_{1}=[0,1]^{d}.

For any n1n\geq 1, the function

[0,1]d×2n(x1,,xn,y1,,yn)n1dWp(μ(xi)i=1n,μ(yi)i=1n),[0,1]^{d\times 2n}\ni(x_{1},\ldots,x_{n},y_{1},\ldots,y_{n})\mapsto n^{\frac{1}{d}}W_{p}\left(\mu_{\left(x_{i}\right)_{i=1}^{n}},\mu_{\left(y_{i}\right)_{i=1}^{n}}\right),

is 2n1dmin{12,1p}2n^{\frac{1}{d}-\min\left\{\frac{1}{2},\frac{1}{p}\right\}}-Lipschitz with respect to the Euclidean distance, where we write

μ(xi)i=1n=1ni=1nδxi,for xi[0,1]d.\mu_{\left(x_{i}\right)_{i=1}^{n}}=\frac{1}{n}\sum_{i=1}^{n}\delta_{x_{i}},\quad\text{for $x_{i}\in[0,1]^{d}$.}

This relies on the triangle inequality (2.1) and the fact that

(xi)i=1nμ(xi)i=1nis nmin{12,1p}-Lipschitz(x_{i})_{i=1}^{n}\mapsto\mu_{\left(x_{i}\right)_{i=1}^{n}}\quad\text{is $n^{-\min\left\{\frac{1}{2},\frac{1}{p}\right\}}$-Lipschitz}

if we endow the set of probability measures on [0,1]d[0,1]^{d} with the Wasserstein distance of order pp. Indeed, for (xi)i=1n,(yi)i=1n[0,1]d×n(x_{i})_{i=1}^{n},(y_{i})_{i=1}^{n}\in[0,1]^{d\times n}, we have

Wp(μ(xi)i=1n,μ(yi)i=1n)(1ni=1n|xiyi|p)1pnmin{12,1p}(i=1n|xiyi|2)12.W_{p}\left(\mu_{\left(x_{i}\right)_{i=1}^{n}},\mu_{\left(y_{i}\right)_{i=1}^{n}}\right)\leq\left(\frac{1}{n}\sum_{i=1}^{n}|x_{i}-y_{i}|^{p}\right)^{\frac{1}{p}}\leq n^{-\min\left\{\frac{1}{2},\frac{1}{p}\right\}}\left(\sum_{i=1}^{n}|x_{i}-y_{i}|^{2}\right)^{\frac{1}{2}}.
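The first inequality above (the diagonal coupling xiyix_{i}\mapsto y_{i} dominates the optimal one) is straightforward to test numerically for small nn. The following sketch is only an illustration, assuming scipy is available to evaluate WpW_{p} exactly via the Hungarian solver; the perturbation size is arbitrary.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

rng = np.random.default_rng(1)
n, d, p = 30, 3, 2.0
x = rng.random((n, d))
y = np.clip(x + 0.05 * rng.standard_normal((n, d)), 0.0, 1.0)  # nearby configuration
C = np.linalg.norm(x[:, None, :] - y[None, :, :], axis=2) ** p
r, c = linear_sum_assignment(C)                 # optimal matching
Wp = C[r, c].mean() ** (1 / p)                  # W_p(mu_x, mu_y)
diag = np.mean(np.linalg.norm(x - y, axis=1) ** p) ** (1 / p)  # coupling x_i -> y_i
# optimality of the matching gives Wp <= diag
```

Since any coupling gives an upper bound on the transport cost, the inequality holds with no error term; equality occurs when the identity pairing is itself optimal.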

Gaussian concentration for the uniform measure on the unit cube [20, Prop. 2.8] yields that if

Zn=n1dWp(μ(Xi)i=1n,μ(Yi)i=1n),Z_{n}=n^{\frac{1}{d}}W_{p}\left(\mu_{\left(X_{i}\right)_{i=1}^{n}},\mu_{\left(Y_{i}\right)_{i=1}^{n}}\right),

then, for r>0r>0,

(|Zn𝔼[Zn]|r)2exp(cn2(min{12,1p}1d)r2),\mathbb{P}\left(\left|Z_{n}-\mathbb{E}\left[Z_{n}\right]\right|\geq r\right)\leq 2\exp\left(-cn^{2\left(\min\left\{\frac{1}{2},\frac{1}{p}\right\}-\frac{1}{d}\right)}r^{2}\right),

where c>0c>0 is an absolute constant. A standard application of the Borel–Cantelli lemma gives that, if 1p<d1\leq p<d and d3d\geq 3, then

limn(Zn𝔼[Zn])=0,-a.s.\lim_{n\to\infty}\left(Z_{n}-\mathbb{E}\left[Z_{n}\right]\right)=0,\quad\text{$\mathbb{P}$-a.s.} (6.6)

Moreover, using the layer-cake formula, we obtain the inequality

𝔼[|Zn𝔼[Zn]|p]npdmin{p2,1},\mathbb{E}\left[\left|Z_{n}-\mathbb{E}\left[Z_{n}\right]\right|^{p}\right]\lesssim n^{\frac{p}{d}-\min\left\{\frac{p}{2},1\right\}}, (6.7)

which is infinitesimal as nn\to\infty. To conclude, it is sufficient to argue that limn𝔼[Zn]=(fbi)1/p\lim_{n\to\infty}\mathbb{E}\left[Z_{n}\right]=(f_{\infty}^{\textrm{bi}})^{1/p}, since it yields by (6.7) and (6.6) that limnZn=(fbi)1/p\lim_{n\to\infty}Z_{n}=(f_{\infty}^{\textrm{bi}})^{1/p} in Lp()L^{p}(\mathbb{P}) and \mathbb{P}-a.s. and hence (6.5). By Theorem 6.2, limn𝔼[Znp]=fbi\lim_{n\to\infty}\mathbb{E}\left[Z_{n}^{p}\right]=f_{\infty}^{\textrm{bi}}. By Jensen’s inequality,

lim supn𝔼[Zn]plimn𝔼[Znp]=fbi.\limsup_{n\to\infty}\mathbb{E}\left[Z_{n}\right]^{p}\leq\lim_{n\to\infty}\mathbb{E}\left[Z_{n}^{p}\right]=f_{\infty}^{\textrm{bi}}.

By the elementary inequality (3.2), we have, for any ε>0\varepsilon>0,

Znp(𝔼[Zn]+|Zn𝔼[Zn]|)p(1+ε)𝔼[Zn]p+Cεp1|Zn𝔼[Zn]|p.Z_{n}^{p}\leq\left(\mathbb{E}\left[Z_{n}\right]+|Z_{n}-\mathbb{E}\left[Z_{n}\right]|\right)^{p}\leq\left(1+\varepsilon\right)\mathbb{E}\left[Z_{n}\right]^{p}+\frac{C}{\varepsilon^{p-1}}|Z_{n}-\mathbb{E}\left[Z_{n}\right]|^{p}.

Taking expectation and letting nn\to\infty, using (6.7), we obtain

fbi=limn𝔼[Znp](1+ε)lim infn𝔼[Zn]p.f_{\infty}^{\textrm{bi}}=\lim_{n\to\infty}\mathbb{E}\left[Z_{n}^{p}\right]\leq\left(1+\varepsilon\right)\liminf_{n\to\infty}\mathbb{E}\left[Z_{n}\right]^{p}.

Letting ε0\varepsilon\to 0 we conclude.

Acknowledgments

M.G. was partially supported by the project ANR-18-CE40-0013 SHAPO financed by the French Agence Nationale de la Recherche (ANR) and by the LYSM LIA AMU CNRS ECM INdAM. M.G. also thanks the Centro De Giorgi in Pisa for its hospitality. D.T. was partially supported by the University of Pisa, Project PRA 2018-49, and Gnampa project 2019 “Proprietà analitiche e geometriche di campi aleatori”.

References

  • [1] M. Ajtai, J. Komlós, and G. Tusnády, On optimal matchings., Combinatorica 4 (1984), 259–264.
  • [2] L. Ambrosio and F. Glaudo, Finer estimates on the 2-dimensional matching problem, J. Éc. polytech. Math. 6 (2019), 737–765.
  • [3] L. Ambrosio, F. Stra, and D. Trevisan, A PDE approach to a 2-dimensional matching problem, Probab. Theory Relat. Fields 173 (2019), no. 1-2, 433–477.
  • [4] F. Barthe and C. Bordenave, Combinatorial optimization over two random point sets., Séminaire de probabilités XLV, Cham: Springer, 2013, pp. 483–535.
  • [5] D. Benedetto and E. Caglioti, Euclidean random matching in 2d for non-constant densities, Journal of Statistical Physics 181 (2020), no. 3, 854–869.
  • [6] S. Bobkov and M. Ledoux, One-dimensional empirical measures, order statistics, and Kantorovich transport distances, Mem. Amer. Math. Soc. 261 (2019), no. 1259, v+126.
  • [7] S. Bobkov and M. Ledoux, A simple Fourier analytic proof of the AKT optimal matching theorem, 2019.
  • [8] S. Boucheron, G. Lugosi, and P. Massart, Concentration inequalities, Oxford University Press, Oxford, 2013.
  • [9] J. H. Boutet de Monvel and O. C. Martin, Almost sure convergence of the minimum bipartite matching functional in Euclidean space, Combinatorica 22 (2002), no. 4, 523–530.
  • [10] S. Caracciolo, C. Lucibello, G. Parisi, and G. Sicuro, Scaling hypothesis for the Euclidean bipartite matching problem, Physical Review E 90 (2014), no. 1.
  • [11] S. Dereich, M. Scheutzow, and R. Schottstedt, Constructive quantization: approximation by empirical measures, Ann. Inst. Henri Poincaré Probab. Stat. 49 (2013), no. 4, 1183–1203.
  • [12] V. Dobrić and J.E. Yukich, Asymptotics for transportation cost in high dimensions., J. Theor. Probab. 8 (1995), no. 1, 97–118.
  • [13] N. Fournier and A. Guillin, On the rate of convergence in Wasserstein distance of the empirical measure, Probab. Theory Related Fields 162 (2015), no. 3-4, 707–738.
  • [14] S. J. Fromm, Potential space estimates for Green potentials in convex domains, Proc. Amer. Math. Soc. 119 (1993), no. 1, 225–233.
  • [15] M. Giaquinta and L. Martinazzi, An introduction to the regularity theory for elliptic systems, harmonic maps and minimal graphs, second ed., Appunti. Scuola Normale Superiore di Pisa (Nuova Serie), vol. 11, Edizioni della Normale, Pisa, 2012.
  • [16] M. Goldman, M. Huesmann, and F. Otto, A large-scale regularity theory for the Monge-Ampere equation with rough data and application to the optimal matching problem, Aug 2018.
  • [17] by same author, Quantitative linearization results for the Monge-Ampère equation, (2019).
  • [18] M. Huesmann and K.-T. Sturm, Optimal transport from Lebesgue to Poisson, Ann. Probab. 41 (2013), no. 4, 2426–2478.
  • [19] G. Last and M. Penrose, Lectures on the Poisson process, vol. 7, Cambridge University Press, 2017.
  • [20] M. Ledoux, The concentration of measure phenomenon, no. 89, American Mathematical Soc., 2001.
  • [21] by same author, On optimal matching of Gaussian samples, Zap. Nauchn. Sem. S.-Peterburg. Otdel. Mat. Inst. Steklov. (POMI) 457 (2017), no. Veroyatnost’  i Statistika. 25, 226–264.
  • [22] M. Ledoux and J.-X. Zhu, On optimal matching of Gaussian samples III, arXiv preprint arXiv:1911.07579 (2019).
  • [23] G. Leoni, A first course in Sobolev spaces, second ed., Graduate Studies in Mathematics, vol. 181, American Mathematical Society, Providence, RI, 2017.
  • [24] G. Peyré, M. Cuturi, et al., Computational optimal transport: With applications to data science, Foundations and Trends® in Machine Learning 11 (2019), no. 5-6, 355–607.
  • [25] F. Santambrogio, Optimal transport for applied mathematicians, Progress in Nonlinear Differential Equations and their Applications, vol. 87, Birkhäuser/Springer, Cham, 2015, Calculus of variations, PDEs, and modeling.
  • [26] M. Talagrand, Matching random samples in many dimensions, Ann. Appl. Probab. 2 (1992), no. 4, 846–856.
  • [27] by same author, Upper and lower bounds for stochastic processes: modern methods and classical problems, vol. 60, Springer Science & Business Media, 2014.
  • [28] C. Villani, Topics in optimal transportation, Graduate Studies in Mathematics, vol. 58, American Mathematical Society, Providence, RI, 2003.
  • [29] J. E. Yukich, Probability theory of classical Euclidean optimization problems., Berlin: Springer, 1998.