
Orlicz space regularization of continuous optimal transport problems

Dirk Lorenz and Hinrich Mahler, Institute of Analysis and Algebra, TU Braunschweig, 38092 Braunschweig, Germany
Abstract

In this work we analyze regularized optimal transport problems in the so-called Kantorovich form, i.e. given two Radon measures on two compact sets, the aim is to find a transport plan, which is another Radon measure on the product of the sets, that has these two measures as marginals and minimizes the sum of a certain linear cost function and a regularization term. We focus on regularization terms where a Young’s function applied to the (density of the) transport plan is integrated against a product measure. This forces the transport plan to belong to a certain Orlicz space. The predual problem is derived and proofs for strong duality and existence of primal solutions of the regularized problem are presented. Existence of (pre-)dual solutions is shown for the special case of $L^{p}$-regularization for $p\geq 2$. Moreover, two results regarding $\Gamma$-convergence are stated: The first is concerned with marginals that do not lie in the appropriate Orlicz space and guarantees $\Gamma$-convergence to the original Kantorovich problem, when smoothing the marginals. The second result gives convergence of a regularized and discretized problem to the unregularized, continuous problem.

1 Introduction

In this paper we consider the optimal transport problem in the Kantorovich form in the following setting: For compact sets $\Omega_{1},\,\Omega_{2}\subset\mathbb{R}^{n}$, probability measures $\mu_{1},\mu_{2}$ on $\Omega_{1},\Omega_{2}$, respectively, and a real-valued continuous cost function $c:\Omega_{1}\times\Omega_{2}\to\mathbb{R}$ we want to solve

\[
\inf_{\pi}\int_{\Omega_{1}\times\Omega_{2}}c\,\mathrm{d}\pi \tag{OT}
\]

where the infimum is taken over all probability measures on $\Omega_{1}\times\Omega_{2}$ which have $\mu_{1}$ and $\mu_{2}$ as their first and second marginals, respectively. This problem has been well studied, and an overview is given in the recent books [31, 33, 27]. For example, it is known that the problem has a solution $\bar{\pi}$ and that the support of $\bar{\pi}$ is contained in the so-called $c$-superdifferential of a $c$-concave function on $\Omega_{1}$, see [1, Theorem 1.13]. In the case where $c(x_{1},x_{2})=|x_{1}-x_{2}|^{2}$ is the squared Euclidean distance, this implies that an optimal plan $\pi$ is singular with respect to the Lebesgue measure. This motivates the use of regularization of the continuous problem to obtain approximate solutions that are absolutely continuous w.r.t. given measures. That in turn allows classical discretization techniques to be applied to solve the regularized problem approximately.

A regularization method that received much attention recently is regularization with the negative entropy of $\pi$, i.e. adding a term $\int_{\Omega_{1}\times\Omega_{2}}\Phi(\pi)\,\mathrm{d}\lambda$ with $\Phi(t)=t\log(t)$ and some measure $\lambda$ on $\Omega_{1}\times\Omega_{2}$ [8, 11, 12, 10, 2]. Since $\pi$ is a measure, one has to interpret $\Phi(\pi)$ appropriately: One should think of $\pi$ as the Radon-Nikodym derivative of $\pi$ with respect to the regularization measure $\lambda$, and we will make this distinction explicit in the following. In [10] entropic regularization with respect to the Lebesgue measure is considered and it is shown that the analysis of entropically regularized optimal transport problems naturally takes place in the function space $L\log L$ (also called Zygmund space [3]) and that optimal plans for entropic regularization are always in $L\log L(\Omega_{1}\times\Omega_{2})$ and exist if and only if the marginals are in the spaces $L\log L(\Omega_{i})$. These spaces are an example of so-called Orlicz spaces [28]. This motivates the analysis of regularization in arbitrary Orlicz spaces in this paper. Another motivation to study a more general regularization comes from the fact that regularization with the $L^{2}$-norm has been shown to be beneficial in some applications, see [29, 4, 13, 20]. Using the product of the marginals $\lambda=\mu_{1}\times\mu_{2}$ for regularization has been considered in the case of entropic regularization [17, 27, 32]. In this case one can show existence of solutions of the dual problem with different techniques. These observations motivate us to consider regularization with Young’s functions with respect to general measures. Notable regularizations that our approach covers are $L^{p}$ regularization with $p>1$ arbitrary and the Tsallis entropy [25].

The notion of Orlicz spaces in the context of convex integral functionals has previously been used in [18], where the author considers a more general setting than the one presented here. More precisely, the spaces used in [18], also known as Musielak-Orlicz spaces [24], are a generalization of the Orlicz spaces used here. Existence of both primal and dual optimizers is covered. By choosing $\gamma^{*}(z,t)=\varepsilon\cdot\Phi(t)+c(z)t+A(z)$ with $A(z):=\min_{t}\varepsilon\cdot\Phi(t)+c(z)t$ and regularization parameter $\varepsilon$, a problem similar to the one considered here is recovered. The difference lies in the fact that the cost function $c$ is part of the definition of the relevant Musielak-Orlicz spaces in this case and hence, the analysis takes place in different spaces. As the aim of [18] is to weaken the necessary assumptions as much as possible, the overall setting is more abstract and the proofs rely heavily on the author’s work [19]. Here we aim for a self-contained, more elementary treatment of the problem.

During the review process of the present work, we became aware of the preprint [22], which also considers regularization of the Kantorovich problem with Young’s functions. While the underlying domains $\Omega_{i}$ are chosen to be general complete separable metric spaces, only regularization with respect to the marginal measures is considered and this allows the authors to derive existence of dual solutions independent of the form of the regularization.

Moreover, in [26] regularization with general convex functionals $F:\mathfrak{M}(\Omega)\to\mathbb{R}\cup\{\infty\}$ is considered. A duality result similar to our Theorem 3.1 is derived and existence of primal solutions is covered. However, [26, Theorem 2] implies the existence of continuous dual optimizers. As Example 3.7 below demonstrates, this cannot be the case in general.

1.1 Notation and problem statement

Let us first fix some notation before we formulate our problem. The spaces of Radon and probability measures on $\Omega\subset\mathbb{R}^{n}$ will be denoted by $\mathfrak{M}(\Omega)$ and $\mathcal{P}(\Omega)$, respectively. The cone of non-negative Radon measures will be denoted by $\mathfrak{M}_{+}(\Omega)$. With $\mathcal{C}(\Omega)$ and $\mathcal{C}_{\mathrm{b}}(\Omega)$ we denote the spaces of continuous functions and bounded, continuous functions, respectively. The Lebesgue measure will be denoted by $\mathcal{L}$ and integrals w.r.t. the Lebesgue measure are simply denoted by $\mathrm{d}x$ with the appropriate integration variable $x$.

In the following we will consider compact domains $\Omega_{1}$, $\Omega_{2}$ equipped with finite measures $\lambda_{1}\in\mathfrak{M}_{+}(\Omega_{1})$ and $\lambda_{2}\in\mathfrak{M}_{+}(\Omega_{2})$, respectively. The measures $\lambda_{1}$ and $\lambda_{2}$ will be assumed to have full support, i.e. $\operatorname{spt}\lambda_{i}=\Omega_{i}$, for $i=1,2$. We will denote $\Omega:=\Omega_{1}\times\Omega_{2}$ and $\lambda:=\lambda_{1}\otimes\lambda_{2}$. For the space of $p$-integrable functions on $\Omega$ with respect to the measure $\nu$, the symbol $L^{p}(\Omega,\mathrm{d}\nu)$ will be used. When a measure $\nu$ is absolutely continuous with respect to another measure $\mu$, written as $\nu\ll\mu$, the Radon-Nikodym derivative of $\nu$ w.r.t. $\mu$, i.e. the density of $\nu$ w.r.t. $\mu$, will be denoted by $\tfrac{\mathrm{d}\nu}{\mathrm{d}\mu}$.

The characteristic function of a set $A$ will be denoted by $\mathds{1}_{A}$. For two functions $f:\Omega_{1}\to\mathbb{R}$ and $g:\Omega_{2}\to\mathbb{R}$, denote by $f\oplus g:\Omega_{1}\times\Omega_{2}\to\mathbb{R}$, $(x_{1},x_{2})\mapsto f(x_{1})+g(x_{2})$ the outer sum of $f$ and $g$. This notation generalizes to measures $\nu_{1}$, $\nu_{2}$ on $\Omega_{1}$, $\Omega_{2}$, respectively, by $\nu_{1}\oplus\nu_{2}:=\nu_{1}\otimes\mathcal{L}+\mathcal{L}\otimes\nu_{2}$. For $\nu\in\mathfrak{M}(\Omega_{1})$ and $f:\Omega_{1}\to\Omega_{2}$, the pushforward of $\nu$ by $f$ will be denoted as $f_{\#}\nu$, i.e. the measure on $\Omega_{2}$ defined by $f_{\#}\nu(A):=\nu(f^{-1}(A))$. Most importantly, the pushforward of the coordinate projections $P_{i}:\Omega_{1}\times\Omega_{2}\to\Omega_{i}$, $P_{i}(x_{1},x_{2})=x_{i}$ will be used. Note that $(P_{i})_{\#}\pi$ is the $i$-th marginal of $\pi\in\mathfrak{M}(\Omega_{1}\times\Omega_{2})$. For real-valued functions $f$ we denote by $f_{+}:=\max(f,0)$ the positive part and by $f_{-}:=-\min(f,0)$ the negative part. Finally, for a function $g:[0,\infty)\to\mathbb{R}\cup\{\infty\}$ denote by $g_{\infty}$ its extension to the real line by infinity, i.e.

\[
g_{\infty}(x):=\begin{cases}g(x),&x\geq 0,\\ \infty,&\text{else.}\end{cases}
\]

The Orlicz space regularized Kantorovich problem of optimal transport considered in this work now reads as

\[
\inf_{\substack{\pi\in\mathcal{P}(\Omega),\,\pi\ll\lambda\\ (P_{i})_{\#}\pi=\mu_{i},\,i=1,2}}\int_{\Omega}c\,\mathrm{d}\pi+\gamma\int_{\Omega}\Phi_{\infty}\big(\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\big)\,\mathrm{d}\lambda\,, \tag{P}
\]

where $\Phi$ is a so-called Young’s function. Note that the regularization of $\pi$ is employed w.r.t. some product measure $\lambda$. Important cases are $\lambda=\mathcal{L}$ and $\lambda=\mu_{1}\otimes\mu_{2}$. Note also that $\pi$ is required to be absolutely continuous w.r.t. $\lambda$. This is due to the fact that even for Young’s functions $\Phi$ satisfying modest conditions like $\frac{\Phi(t)}{t}\not\to\infty$ as $t\to\infty$, by e.g. [16, Theorem 5.19] the optimal $\pi$ may have a singular part w.r.t. $\lambda$. That, however, would make the process of regularizing futile. Therefore we will require $\lim_{t\to\infty}\Phi(t)/t=\infty$ throughout the paper.
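As an illustration that is not part of the original text, the following sketch specializes the regularization term of (P) to two choices of $\Phi$ named in the introduction and checks the superlinear growth requirement. The abbreviation $u$ for the density of the plan is introduced only for this example.

```latex
% Illustration only; u := d\pi/d\lambda >= 0 is shorthand introduced here.
% (a) L^p regularization, \Phi(t) = t^p / p with p > 1:
\[
  \gamma\int_{\Omega}\Phi_{\infty}(u)\,\mathrm{d}\lambda
  = \frac{\gamma}{p}\,\lVert u\rVert_{L^{p}(\Omega,\mathrm{d}\lambda)}^{p},
  \qquad
  \frac{\Phi(t)}{t}=\frac{t^{p-1}}{p}\longrightarrow\infty
  \quad (t\to\infty).
\]
% (b) Entropic regularization, \Phi(t) = t\log t (with \Phi(0) := 0):
\[
  \gamma\int_{\Omega}\Phi_{\infty}(u)\,\mathrm{d}\lambda
  = \gamma\int_{\Omega}u\log u\,\mathrm{d}\lambda,
  \qquad
  \frac{\Phi(t)}{t}=\log t\longrightarrow\infty
  \quad (t\to\infty).
\]
% (c) The borderline case \Phi(t) = t gives \Phi(t)/t = 1 for all t > 0,
%     so it violates the growth requirement and is excluded.
```

Case (c) shows why $p>1$ is needed for $L^{p}$ regularization to fit the growth assumption.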

Now consider the marginal constraints. As we will see later in Lemmas 2.9 and 3.4, it is necessary that the marginals $\mu_{i}$ are absolutely continuous with respect to $\lambda_{i}$ and that $\Phi(\tfrac{\mathrm{d}\mu_{i}}{\mathrm{d}\lambda_{i}})$ is integrable with respect to $\lambda_{i}$. To formulate the marginal constraints in terms of the densities, we recall that $(P_{1})_{\#}\pi=\mu_{1}$ if for all $\lambda_{1}$-measurable sets $A$ it holds that $\pi(A\times\Omega_{2})=\mu_{1}(A)$. In terms of densities and integrals, this reads as

\[
\int_{\Omega_{2}}\int_{A}\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{1}\,\mathrm{d}\lambda_{2}=\int_{A}\mathrm{d}\mu_{1}=\int_{A}\tfrac{\mathrm{d}\mu_{1}}{\mathrm{d}\lambda_{1}}\,\mathrm{d}\lambda_{1}.
\]

Using Fubini’s theorem we get

\[
\int_{A}\left(\int_{\Omega_{2}}\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{2}-\tfrac{\mathrm{d}\mu_{1}}{\mathrm{d}\lambda_{1}}\right)\mathrm{d}\lambda_{1}=0\,,
\]

and hence, the marginal constraints read as

\[
\int_{\Omega_{2}}\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{2}=\tfrac{\mathrm{d}\mu_{1}}{\mathrm{d}\lambda_{1}}\quad\text{$\lambda_{1}$-a.e.}\quad\text{and}\quad\int_{\Omega_{1}}\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{1}=\tfrac{\mathrm{d}\mu_{2}}{\mathrm{d}\lambda_{2}}\quad\text{$\lambda_{2}$-a.e.}
\]
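A small worked instance may make the density form of the marginal constraints concrete. The choices below ($\Omega_{1}=\Omega_{2}=[0,1]$, Lebesgue reference measures and the density $u(x_{1},x_{2})=x_{1}+x_{2}$) are assumptions made only for this illustration and do not appear in the original text.

```latex
% Illustration only: \Omega_1 = \Omega_2 = [0,1], \lambda_1 = \lambda_2 = \mathcal{L},
% and u := d\pi/d\lambda is a hypothetical plan density.
\[
  u(x_{1},x_{2}) = x_{1}+x_{2},
  \qquad
  \int_{0}^{1}\!\!\int_{0}^{1}(x_{1}+x_{2})\,\mathrm{d}x_{1}\,\mathrm{d}x_{2}
  = \tfrac12+\tfrac12 = 1 .
\]
% Integrating out one variable yields the marginal densities:
\[
  \frac{\mathrm{d}\mu_{1}}{\mathrm{d}\lambda_{1}}(x_{1})
  = \int_{0}^{1}(x_{1}+x_{2})\,\mathrm{d}x_{2} = x_{1}+\tfrac12,
  \qquad
  \frac{\mathrm{d}\mu_{2}}{\mathrm{d}\lambda_{2}}(x_{2})
  = \int_{0}^{1}(x_{1}+x_{2})\,\mathrm{d}x_{1} = x_{2}+\tfrac12 .
\]
```

Both marginal densities integrate to one, so this $\pi$ is a feasible plan between the two probability measures with these densities.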

Note that for the integral $\int_{\Omega}c\,\mathrm{d}\pi=\int_{\Omega}c\,\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda$ to exist, the cost function $c$ does not need to be continuous and the problem may be formulated for more general cost functions. However, some of the results in this work require $c$ to be continuous and for simplicity this shall be assumed throughout the paper.

Let us summarize our assumptions:

Assumption 1.1.

For $i=1,2$ let $\Omega_{i}\subset\mathbb{R}^{n}$ be compact domains equipped with finite measures $\lambda_{i}\in\mathfrak{M}_{+}(\Omega_{i})$ with $\operatorname{spt}\lambda_{i}=\Omega_{i}$. Let $c:\Omega\to\mathbb{R}$ be a continuous cost function. Let $\Phi$ be such that $\lim_{t\to\infty}\Phi(t)/t=\infty$. Finally, for $i=1,2$ let $\mu_{i}\in\mathcal{P}(\Omega_{i})$ such that $\mu_{i}\ll\lambda_{i}$ and $\Phi(\tfrac{\mathrm{d}\mu_{i}}{\mathrm{d}\lambda_{i}})$ is integrable with respect to $\lambda_{i}$.

1.2 Contribution and Organization

The notions of Young’s functions and Orlicz spaces are introduced in Section 2 alongside some auxiliary results that will be used in the later sections. Section 3 deals with the question of existence of solutions in the framework of Fenchel duality. The first contribution (Theorem 3.4) guarantees existence of solutions of problem (P), which generalizes the corresponding results of [10, 20]. Afterwards, the predual problem is analyzed for the special case $\Phi(t)=t^{p}/p$ for $p>1$. As a second contribution, Theorem 3.13 gives existence of dual optimizers in $L^{q}$, where $1/p+1/q=1$ and $p\geq 2$. This generalizes the corresponding result of [20]. In Section 4, $\Gamma$-convergence of different related problems is considered. First, a continuous, regularized problem with arbitrary marginals is considered. Theorem 4.2 extends [10, Theorem 5.1] and guarantees $\Gamma$-convergence to the unregularized problem (OT) when smoothing the marginals. Note that the case of $\Gamma$-convergence for fixed marginals in $L\log L(\Omega)$ has been treated in [8, Theorem 2.7] for $\Omega=\mathbb{R}^{n_{1}}\times\mathbb{R}^{n_{2}}$. While Theorem 4.2 is stated only for compact $\Omega$, it allows for marginals not in $L\log L(\Omega)$ and a coupled reduction of the regularization and the smoothing parameter. The final contribution (Theorem 4.9) is the proof of $\Gamma$-convergence of a discretized and regularized optimal transport problem to the unregularized continuous problem (OT). The result covers both entropic and quadratic regularization as special cases. Some of the results in this paper are direct generalizations of results of previous papers and their proofs also follow the general proof strategy.

2 Young’s functions and Orlicz spaces

In this section, some notions about Young’s functions and Orlicz spaces are introduced. For a more detailed introduction, see [3, 28].

Definition 2.1 (Young’s function [3, Definitions IV.8.1, IV.8.11]).
  i) Let $\varphi:[0,\infty)\to[0,\infty]$ be increasing and lower semi-continuous, with $\varphi(0)=0$. Suppose that $\varphi$ is neither identically zero nor identically infinite on $(0,\infty)$. Then the function $\Phi$ defined by $\Phi(t):=\int_{0}^{t}\varphi(s)\,\mathrm{d}s$ is said to be a Young’s function.

  ii) Let $\psi(s):=\inf\left\{t\,\middle|\,\varphi(t)\geq s\right\}$. Then, the function $\Psi$ defined by $\Psi(t):=\int_{0}^{t}\psi(s)\,\mathrm{d}s$ is said to be the complementary Young’s function of $\Phi$.

By definition, Young’s functions are convex and for a Young’s function $\Phi$ it holds that the complementary Young’s function $\Psi$ is also a Young’s function and actually equal to the convex conjugate $\Phi^{*}$.
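To make Definition 2.1 concrete, the following sketch works through the power case $\varphi(t)=t^{p-1}$, which is the one relevant for $L^{p}$ regularization. The exponents $p>1$ and $q=p/(p-1)$ are fixed here for illustration only.

```latex
% Illustration only: p > 1 and q := p/(p-1), so that 1/p + 1/q = 1.
\[
  \varphi(t)=t^{p-1}
  \;\Longrightarrow\;
  \Phi(t)=\int_{0}^{t}s^{p-1}\,\mathrm{d}s=\frac{t^{p}}{p}.
\]
% Generalized inverse of \varphi, using 1/(p-1) = q-1:
\[
  \psi(s)=\inf\{t \mid t^{p-1}\geq s\}=s^{1/(p-1)}=s^{q-1}
  \;\Longrightarrow\;
  \Psi(t)=\int_{0}^{t}s^{q-1}\,\mathrm{d}s=\frac{t^{q}}{q}.
\]
% The pair (\Phi,\Psi) satisfies Young's inequality for all s, t >= 0:
\[
  st\;\leq\;\frac{t^{p}}{p}+\frac{s^{q}}{q}.
\]
```

For $p=2$ the pair is self-complementary, $\Phi(t)=\Psi(t)=t^{2}/2$.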

The negative entropy regularization uses the function $\Phi(t)=t\log(t)$ which is not a Young’s function, but the function $t\mapsto(t\log(t))_{+}$ is. Hence, we introduce a slight generalization of the notion of Young’s function to be able to treat this case as well.

Definition 2.2 (Quasi-Young’s functions).

We say that $\Phi$ is a quasi-Young’s function if it is convex, lower semi-continuous and $\Phi_{+}$ is a Young’s function.

Note that convexity of $\Phi$ shows that $\Phi$ is bounded from below. Moreover, any Young’s function is also a quasi-Young’s function.

Example 2.3.

The function $\Phi(t)=t\log(t)$ is a quasi-Young’s function because $\Phi_{+}(t)=(t\log(t))_{+}$ is a Young’s function.
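The following short computation, added here as a sketch, spells out why $t\log t$ fails to be a Young’s function while its positive part qualifies. The explicit density $\varphi$ below is one admissible choice written out for illustration; it is not taken from the original text.

```latex
% Illustration only; \Phi(0) := 0 by the usual convention 0 \log 0 = 0.
% \Phi(t) = t\log t is negative on (0,1) and attains its minimum at t = 1/e:
\[
  \Phi'(t)=\log t+1=0 \iff t=\tfrac1e,
  \qquad
  \Phi(t)\;\geq\;\Phi\big(\tfrac1e\big)=-\tfrac1e .
\]
% A Young's function is the integral of a non-negative \varphi, hence >= 0,
% so \Phi itself cannot be one. Its positive part, however, is generated by
\[
  \varphi(s)=\begin{cases}0,& s\leq 1,\\ 1+\log s,& s>1,\end{cases}
  \qquad
  \int_{0}^{t}\varphi(s)\,\mathrm{d}s
  =\begin{cases}0,& t\leq 1,\\ t\log t,& t>1,\end{cases}
  \;=\;(t\log t)_{+}.
\]
```

This $\varphi$ is increasing, vanishes at zero and is lower semi-continuous at $s=1$ because its value there is the smaller one-sided limit.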

Definition 2.4 (Luxemburg and Orlicz spaces [3, Definition IV.8.10]).

Let $\Phi$ be a Young’s function, $\Omega\subset\mathbb{R}^{n}$ and $\nu\in\mathfrak{M}_{+}(\Omega)$. Define the Luxemburg norm of a measurable function $f:\Omega\to\mathbb{R}$ w.r.t. $\nu$ as

\[
\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}:=\inf\left\{\gamma\geq 0\,\middle|\,\int_{\Omega}\Phi\left(\frac{\lvert f\rvert}{\gamma}\right)\mathrm{d}\nu\leq 1\right\}\,.
\]

Then the space

\[
L^{\Phi}(\Omega,\mathrm{d}\nu):=\Big\{f:\Omega\to\mathbb{R}\ \text{measurable}\,\Big|\,\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}<\infty\Big\}
\]

of measurable functions on $\Omega$ with finite Luxemburg norm is called the Orlicz space of $\Phi$ w.r.t. $\nu$.
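As a sanity check on Definition 2.4, the Luxemburg norm can be computed in closed form for the power function $\Phi(t)=t^{p}/p$. The sketch below assumes $1<p<\infty$ and $f\in L^{p}(\Omega,\mathrm{d}\nu)$; these are choices made for the example only.

```latex
% Illustration only: \Phi(t) = t^p/p with 1 < p < \infty and \gamma > 0.
\[
  \int_{\Omega}\Phi\Big(\frac{\lvert f\rvert}{\gamma}\Big)\,\mathrm{d}\nu
  =\frac{1}{p\,\gamma^{p}}\,\lVert f\rVert_{L^{p}(\Omega,\mathrm{d}\nu)}^{p}
  \;\leq\;1
  \iff
  \gamma\;\geq\;p^{-1/p}\,\lVert f\rVert_{L^{p}(\Omega,\mathrm{d}\nu)} .
\]
% Taking the infimum over all admissible \gamma:
\[
  \lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}
  =p^{-1/p}\,\lVert f\rVert_{L^{p}(\Omega,\mathrm{d}\nu)} .
\]
```

Hence, for this $\Phi$, the Orlicz space coincides with $L^{p}(\Omega,\mathrm{d}\nu)$ and the Luxemburg norm is a constant multiple of the usual $L^{p}$ norm.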

Remark 2.5 ([9, Remark 1]).

The bound $1$ in the definition of the Luxemburg norm can be replaced by any $a\in(0,\infty)$. That is, all norms defined by

\[
\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu),a}:=\inf\left\{\gamma\geq 0\,\middle|\,\int_{\Omega}\Phi\left(\frac{\lvert f\rvert}{\gamma}\right)\mathrm{d}\nu\leq a\right\}
\]

are equivalent. This can be seen by combining the inequalities $\lVert\,\cdot\,\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu),b}\leq\lVert\,\cdot\,\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu),a}$ and $a\lVert\,\cdot\,\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu),a}\leq b\lVert\,\cdot\,\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu),b}$ for $0<a<b$.
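For completeness, here is one way the two inequalities of Remark 2.5 can be obtained. This is a sketch of a standard argument using only convexity and $\Phi(0)=0$; it is not quoted from [9], and the shorthand $\lVert f\rVert_{a}$ and $\theta$ is introduced only here.

```latex
% Sketch only; \lVert f\rVert_a abbreviates the norm with bound a, and 0 < a < b.
% Step 1 (monotonicity): every \gamma admissible for the bound a is also
% admissible for the bound b, so the infimum for b is taken over a larger set:
\[
  \lVert f\rVert_{b}\;\leq\;\lVert f\rVert_{a}.
\]
% Step 2 (convexity): since \Phi is convex with \Phi(0)=0, one has
% \Phi(\theta t) \leq \theta\,\Phi(t) for \theta \in [0,1].
% With \theta := a/b and any \gamma > 0 admissible for the bound b:
\[
  \int_{\Omega}\Phi\Big(\frac{\lvert f\rvert}{(b/a)\,\gamma}\Big)\,\mathrm{d}\nu
  =\int_{\Omega}\Phi\Big(\theta\,\frac{\lvert f\rvert}{\gamma}\Big)\,\mathrm{d}\nu
  \;\leq\;\theta\int_{\Omega}\Phi\Big(\frac{\lvert f\rvert}{\gamma}\Big)\,\mathrm{d}\nu
  \;\leq\;\frac{a}{b}\,b=a .
\]
% Hence (b/a)\gamma is admissible for the bound a; taking the infimum over \gamma:
\[
  a\,\lVert f\rVert_{a}\;\leq\;b\,\lVert f\rVert_{b}.
\]
```

Together the two steps give $\lVert f\rVert_{b}\leq\lVert f\rVert_{a}\leq\tfrac{b}{a}\lVert f\rVert_{b}$, which is the claimed equivalence.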

The definition of Orlicz spaces does not immediately allow for the concept of quasi-Young’s functions to be incorporated. However, the following results establish the desired connection.

Lemma 2.6.

Let $\Phi$ be a Young’s function and $\Omega\subset\mathbb{R}^{n}$, $\nu\in\mathfrak{M}_{+}(\Omega)$ with $\nu(\Omega)<\infty$ and let $t_{0}\geq 0$. Then $\tilde{\Phi}$ defined by $\tilde{\Phi}(t):=(\Phi(t)-\Phi(t_{0}))_{+}$ is a Young’s function and $L^{\Phi}(\Omega,\mathrm{d}\nu)=L^{\tilde{\Phi}}(\Omega,\mathrm{d}\nu)$.

A proof of Lemma 2.6 can be found in Appendix A.

Corollary 2.7.

Let $\Phi$ be a Young’s function, $\Theta$ a quasi-Young’s function with $\Theta_{+}=\Phi$, $\nu$ a measure and $\Omega\subset\mathbb{R}^{n}$ with $\nu(\Omega)<\infty$. Then, $f\in L^{\Phi}(\Omega,\mathrm{d}\nu)$ if and only if

\[
\int_{\Omega}\Theta\left(\frac{\lvert f\rvert}{\gamma}\right)\mathrm{d}\nu\leq 1
\]

for some $\gamma<\infty$.

Proof.

Let $t_{0}=\inf\{t\,|\,\Theta(\tau)\geq 0\ \forall\tau\geq t\}$ and $\tilde{\Phi}(t):=(\Phi(t)-\Phi(t_{0}))_{+}$. Let $\gamma$ be such that

\[
\int_{\Omega}\tilde{\Phi}\left(\frac{\lvert f\rvert}{\gamma}\right)\mathrm{d}\nu\leq 1.
\]

Then

\[
1\geq\int_{\Omega}\tilde{\Phi}\left(\frac{\lvert f\rvert}{\gamma}\right)\mathrm{d}\nu\geq\int_{\Omega}\Theta\left(\frac{\lvert f\rvert}{\gamma}\right)\mathrm{d}\nu
\]

together with Lemma 2.6, shows one implication.

For the other implication, by Remark 2.5 it suffices to show that $\int_{\Omega}\tilde{\Phi}\left(\lvert f\rvert/\gamma\right)\mathrm{d}\nu<\infty$ whenever $\int_{\Omega}\Theta\left(\lvert f\rvert/\gamma\right)\mathrm{d}\nu<\infty$. However, this holds trivially, since $\Theta$ is bounded from below and $\Omega$ has finite measure w.r.t. $\nu$. ∎

Lemma 2.6 and Corollary 2.7 state that the definitions of $\lVert\,\cdot\,\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}$ and $L^{\Phi}(\Omega,\mathrm{d}\nu)$ are essentially independent of whether $\Phi$ is a Young’s function or just a quasi-Young’s function. To simplify notation, $\lVert\,\cdot\,\rVert_{L^{\Theta}(\Omega,\mathrm{d}\nu)}$ and $L^{\Theta}(\Omega,\mathrm{d}\nu)$ will therefore be used for quasi-Young’s functions $\Theta$ as well. Note that for a quasi-Young’s function $\Theta$ with $\Theta_{+}=\Phi$, $L^{\Theta}(\Omega,\mathrm{d}\nu)=L^{\Phi}(\Omega,\mathrm{d}\nu)$, while in general $\lVert\,\cdot\,\rVert_{L^{\Theta}(\Omega,\mathrm{d}\nu)}$ and $\lVert\,\cdot\,\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}$ are equivalent but not equal. Moreover, for complementary Young’s functions $\Phi$ and $\Psi=\Phi^{*}$ which are proper, locally integrable and satisfy a certain growth condition, called the $\Delta_{2}$-property near infinity, it holds that $(L^{\Phi}(\Omega,\mathrm{d}\nu))^{*}$ is canonically isometrically isomorphic to $(L^{\Psi}(\Omega,\mathrm{d}\nu),\lVert\,\cdot\,\rVert_{L^{\Psi}(\Omega,\mathrm{d}\nu)})$ (see, e.g. [14]).

Example 2.8 ($L\log L$ and $L_{\mathrm{exp}}$).

Let $\Phi(t)=t\log t$ and $\tilde{\Phi}(t)=(t\log(t))_{+}$. The space of measurable functions $f$ with $\int_{\Omega}\tilde{\Phi}(\lvert f\rvert)\,\mathrm{d}\nu<\infty$ is called $L\log L(\Omega,\mathrm{d}\nu)$. By the above corollary, the space of measurable functions $g$ with $\int_{\Omega}\Phi(\lvert g\rvert)\,\mathrm{d}\nu<\infty$ is equal to $L\log L(\Omega,\mathrm{d}\nu)$. The complementary Young’s function $\tilde{\Psi}$ of $\tilde{\Phi}$ is given by

\[
\tilde{\Psi}(t)=\begin{cases}t,&t\leq 1\\ \mathrm{e}^{t-1},&\text{else.}\end{cases}
\]

As $\Phi$ satisfies the $\Delta_{2}$-property near infinity, the dual space of $L\log L(\Omega,\mathrm{d}\nu)$ is thus given by the space of measurable functions $h$ that satisfy $\int_{\Omega}\tilde{\Psi}(\lvert h\rvert)\,\mathrm{d}\nu<\infty$, which is called $L_{\mathrm{exp}}(\Omega,\mathrm{d}\nu)$.
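The formula for $\tilde{\Psi}$ in Example 2.8 can be checked directly against Definition 2.1 ii). The computation below is a sketch added for illustration; the density $\varphi$ of $\tilde{\Phi}$ is written out here and is not stated in the original text.

```latex
% Illustration only: \varphi is the density of \tilde\Phi(t) = (t\log t)_+ .
\[
  \varphi(t)=\begin{cases}0,& t\leq 1,\\ 1+\log t,& t>1,\end{cases}
  \qquad
  \psi(s)=\inf\{t \mid \varphi(t)\geq s\}
  =\begin{cases}1,& 0<s\leq 1,\\ \mathrm{e}^{s-1},& s>1.\end{cases}
\]
% Integrating \psi from 0 to t:
\[
  \tilde\Psi(t)=\int_{0}^{t}\psi(s)\,\mathrm{d}s
  =\begin{cases}
     t,& t\leq 1,\\[2pt]
     1+\displaystyle\int_{1}^{t}\mathrm{e}^{s-1}\,\mathrm{d}s=\mathrm{e}^{t-1},& t>1.
   \end{cases}
\]
```

The two branches agree at $t=1$, so $\tilde{\Psi}$ is continuous, and its exponential growth is what gives the space $L_{\mathrm{exp}}$ its name.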

The following result states that the marginals of a transport plan with density in $L^{\Phi}$ also have density in the respective $L^{\Phi}$ space.

Lemma 2.9.

Let $\nu_{i}\in\mathfrak{M}_{+}(\Omega_{i})$ be such that $\nu_{i}(\Omega_{i})<\infty$, for $i=1,2$ and set $\nu:=\nu_{1}\otimes\nu_{2}$. Let $\pi\in\mathfrak{M}_{+}(\Omega)$.

If $\tfrac{\mathrm{d}\pi}{\mathrm{d}\nu}\in L^{\Phi}(\Omega,\mathrm{d}\nu)$ for a quasi-Young’s function $\Phi$, then $\tfrac{\mathrm{d}(P_{i})_{\#}\pi}{\mathrm{d}\nu_{i}}\in L^{\Phi}(\Omega_{i},\mathrm{d}\nu_{i})$ for $i=1,2$ with

\[
\big\lVert\tfrac{\mathrm{d}(P_{i})_{\#}\pi}{\mathrm{d}\nu_{i}}\big\rVert_{L^{\Phi}(\Omega_{i},\mathrm{d}\nu_{i})}\leq\max\left(1,\nu_{3-i}(\Omega_{3-i})\right)\big\lVert\tfrac{\mathrm{d}\pi}{\mathrm{d}\nu}\big\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}\,.
\]
Proof.

Setting $\tfrac{\mathrm{d}\pi}{\mathrm{d}(\nu_{1}\otimes\nu_{2})}(x_{1},x_{2})=f(x_{1},x_{2})$, one observes that $\tfrac{\mathrm{d}(P_{2})_{\#}\pi}{\mathrm{d}\nu_{2}}=\int_{\Omega_{1}}f(x_{1},x_{2})\,\mathrm{d}\nu_{1}(x_{1})$, and similarly for $\tfrac{\mathrm{d}(P_{1})_{\#}\pi}{\mathrm{d}\nu_{1}}$. The proof given for [10, Lemma 2.11] then holds with minor modifications. ∎

Remark 2.10.

The above result immediately yields that no optimal solution $\bar{\pi}$ of problem (P) can exist for any $\gamma>0$, if $\mu_{i}\not\ll\lambda_{i}$ for some $i\in\{1,2\}$.

Next, a few facts are derived, which will be useful for the analysis of both the primal and dual regularized optimal transport problems and $\Gamma$-convergence.

Lemma 2.11.

Let $\Phi$ be a quasi-Young’s function and $f$ such that $\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}>1$. Then, $\int_{\Omega}\Phi(\lvert f\rvert)\,\mathrm{d}\nu\geq\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}$.

Proof.

Noting that $\Phi(0)\leq 0$, the proof given for [10, Lemma 2.6] holds with minor adjustments. ∎
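Lemma 2.11 can be verified by hand in the power case. The sketch below assumes $\Phi(t)=t^{p}/p$ with $1<p<\infty$ and introduces the shorthand $N$ for the Luxemburg norm; both are choices made for this illustration only. It uses the fact that, by directly solving the defining inequality for $\gamma$, the Luxemburg norm for this $\Phi$ equals $p^{-1/p}\lVert f\rVert_{L^{p}}$.

```latex
% Illustration only: \Phi(t) = t^p/p, 1 < p < \infty,
% N := \lVert f\rVert_{L^\Phi(\Omega,d\nu)} = p^{-1/p}\lVert f\rVert_{L^p(\Omega,d\nu)}.
\[
  \int_{\Omega}\Phi(\lvert f\rvert)\,\mathrm{d}\nu
  =\frac{1}{p}\,\lVert f\rVert_{L^{p}(\Omega,\mathrm{d}\nu)}^{p}
  =\big(p^{-1/p}\lVert f\rVert_{L^{p}(\Omega,\mathrm{d}\nu)}\big)^{p}
  =N^{p}.
\]
% For N > 1 and p > 1 one has N^p >= N, which is the claim of the lemma:
\[
  N>1\;\Longrightarrow\;
  \int_{\Omega}\Phi(\lvert f\rvert)\,\mathrm{d}\nu=N^{p}\;\geq\;N .
\]
```

The same computation shows that the inequality can fail for $N<1$, since then $N^{p}<N$, so the hypothesis on the norm is needed.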

For complementary Young’s functions $\Phi$ and $\Psi$, we have the following result on conjugating scaled Young’s functions.

Lemma 2.12.

Let $\Phi$ be a Young’s function, $\Psi=\Phi^{*}$ and $\gamma\neq 0$.

  1. Then for $\tilde{\Psi}(t):=\gamma\Psi(t/\gamma)$ it holds that $\tilde{\Psi}^{*}=\gamma\Phi$.

  2. Let $\Theta$ be a quasi-Young’s function with $\Theta_{+}=\Phi$. Then for $\widetilde{\Theta^{*}}(t):=\gamma\Theta^{*}(t/\gamma)$ it holds that $(\widetilde{\Theta^{*}})^{*}=\gamma\Theta$.

Proof.

The assertion follows directly from the definition of the convex conjugate and the Fenchel-Moreau theorem. ∎
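The one-line proof can be unpacked as follows. This is a sketch under the additional assumption $\gamma>0$ (the case relevant for a regularization parameter) and with $\Phi$ understood as a proper, convex, lower semi-continuous function so that the Fenchel-Moreau theorem applies; the substitution variable $u$ is introduced only here.

```latex
% Sketch only, assuming \gamma > 0; substitute t = \gamma u.
\[
  \tilde\Psi^{*}(s)
  =\sup_{t}\big(st-\gamma\,\Psi(t/\gamma)\big)
  =\sup_{u}\big(s\gamma u-\gamma\,\Psi(u)\big)
  =\gamma\,\sup_{u}\big(su-\Psi(u)\big)
  =\gamma\,\Psi^{*}(s).
\]
% Fenchel-Moreau: \Psi^{*} = \Phi^{**} = \Phi, hence \tilde\Psi^{*} = \gamma\Phi.
% Quick check with \Phi(t) = \Psi(t) = t^2/2, where \tilde\Psi(t) = t^2/(2\gamma):
\[
  \sup_{t}\Big(st-\frac{t^{2}}{2\gamma}\Big)
  =\frac{\gamma s^{2}}{2}
  =\gamma\,\Phi(s),
  \qquad\text{attained at } t=\gamma s .
\]
```

The second statement of the lemma follows in the same way with $\Theta$ in place of $\Phi$.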

The following Lemma 2.13 gives some useful insights into the behavior of the objective function of problem (P) under perturbation of $\pi$. Recall from Assumption 1.1 that for Young’s functions $\Phi$ we required $\lim_{t\to\infty}\Phi(t)/t=\infty$.

Lemma 2.13.

Let $\nu\in\mathfrak{M}_{+}(\Omega)$, $(\pi_{k})\subset\mathcal{P}(\Omega)$ and $\pi\in\mathcal{P}(\Omega)$ such that $\pi_{k}\xrightharpoonup{*}\pi$ and $\pi_{k}\ll\nu$. Let $g:=\Phi_{\infty}$ for some quasi-Young’s function $\Phi$. Then the following statements hold.

  1. Let $\pi\not\ll\nu$. Then

    \[
    \liminf_{k\to\infty}\int_{\Omega}g\left(\frac{\mathrm{d}\pi_{k}}{\mathrm{d}\nu}\right)\mathrm{d}\nu=\infty\,.
    \]

  2. Let $\pi\ll\nu$. Then

    \[
    \liminf_{k\to\infty}\int_{\Omega}g\left(\frac{\mathrm{d}\pi_{k}}{\mathrm{d}\nu}\right)\mathrm{d}\nu\geq\int_{\Omega}g\left(\frac{\mathrm{d}\pi}{\mathrm{d}\nu}\right)\mathrm{d}\nu\,.
    \]
Proof.

Since $g$ grows superlinearly at $\infty$, the recession function $g^{\infty}(t)=\lim_{h\to\infty}\frac{g(s+ht)-g(s)}{h}$ (which is independent of $s$) is infinite for all $t>0$. By [16, Theorem 5.19], it holds for every sequence $\pi_{k}$ which weakly* converges to $\pi$ that

\[
\liminf_{k\to\infty}\int_{\Omega}g\big(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\nu}\big)\,\mathrm{d}\nu\geq\int_{\Omega}g\big(\tfrac{\mathrm{d}\pi}{\mathrm{d}\nu}\big)\,\mathrm{d}\nu+\int_{\Omega}g^{\infty}\big(\tfrac{\mathrm{d}\pi}{\mathrm{d}\lvert\pi_{\mathrm{s}}\rvert}\big)\,\mathrm{d}\lvert\pi_{\mathrm{s}}\rvert\,,
\]

where $\pi_{\mathrm{s}}$ denotes the unique measure singular to $\nu$ in the Lebesgue decomposition $\pi=\pi_{\mathrm{s}}+\pi_{\mathrm{ac}}$ (e.g. [16, Theorem 1.115]) and $\pi_{\mathrm{ac}}$ denotes the corresponding measure with $\pi_{\mathrm{ac}}\ll\nu$.

  1. It suffices to show $\frac{\mathrm{d}\pi}{\mathrm{d}\lvert\pi_{\mathrm{s}}\rvert}(x)>0$ for every $x\in\operatorname{spt}\lvert\pi_{\mathrm{s}}\rvert$. First note that since $\pi\in\mathcal{P}(\Omega)$, $\pi_{\mathrm{s}}$ is non-negative and hence $\lvert\pi_{\mathrm{s}}\rvert=\pi_{\mathrm{s}}$. Let now $C$ be a bounded, convex closed set containing the origin in its interior. Then by [16, Definition 1.156], for every $x\in\operatorname{spt}\pi_{\mathrm{s}}$ it holds that

    \begin{align*}
    \frac{\mathrm{d}\pi}{\mathrm{d}\pi_{\mathrm{s}}}(x)
    &=\lim_{r\searrow 0}\frac{\pi((x+rC)\cap\Omega)}{\pi_{\mathrm{s}}((x+rC)\cap\Omega)}\\
    &=\lim_{r\searrow 0}\frac{\pi_{\mathrm{ac}}((x+rC)\cap\Omega)+\pi_{\mathrm{s}}((x+rC)\cap\Omega)}{\pi_{\mathrm{s}}((x+rC)\cap\Omega)}\\
    &=\lim_{r\searrow 0}\frac{\pi_{\mathrm{ac}}((x+rC)\cap\Omega)}{\pi_{\mathrm{s}}((x+rC)\cap\Omega)}+1\\
    &\geq 1\,,
    \end{align*}

    since $\pi_{\mathrm{s}}((x+rC)\cap\Omega)>0$ for all $r$ because of $x\in\operatorname{spt}\pi_{\mathrm{s}}$. In fact,

    \[
    \lim_{r\searrow 0}\pi_{\mathrm{ac}}((x+rC)\cap\Omega)=0
    \]

    and $\frac{\mathrm{d}\pi}{\mathrm{d}\lvert\pi_{\mathrm{s}}\rvert}(x)=1$ for every $x\in\operatorname{spt}\lvert\pi_{\mathrm{s}}\rvert$.

  2. The second statement follows directly, since for $\pi\ll\nu$ it holds that $\pi_{\mathrm{s}}=0$. ∎
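A concrete concentrating sequence illustrates the first statement of Lemma 2.13. The one-dimensional setting below ($\Omega=[0,1]$, $\nu$ the Lebesgue measure, densities $u_{k}$) is an assumption made purely for this illustration, and a finite value $\Phi(0)$ is assumed.

```latex
% Illustration only: \Omega = [0,1], \nu = \mathcal{L}, and
% u_k := d\pi_k/d\nu = k \,\mathds{1}_{[0,1/k]} (hypothetical example sequence).
% Each \pi_k is a probability measure with \pi_k \ll \nu, and for continuous f
\[
  \int_{0}^{1}f\,\mathrm{d}\pi_{k}
  =k\int_{0}^{1/k}f(x)\,\mathrm{d}x\;\longrightarrow\;f(0),
  \qquad\text{i.e. } \pi_{k}\xrightharpoonup{*}\delta_{0}\not\ll\nu .
\]
% The regularization term along the sequence:
\[
  \int_{0}^{1}\Phi(u_{k})\,\mathrm{d}x
  =\frac{\Phi(k)}{k}+\Big(1-\frac1k\Big)\Phi(0)
  \;\longrightarrow\;\infty ,
\]
% because \Phi(k)/k \to \infty by the superlinear growth assumption.
% Examples: \Phi(t)=t^2/2 gives k/2, and \Phi(t)=t\log t gives \log k.
```

The blow-up is driven entirely by the growth condition $\Phi(t)/t\to\infty$; for $\Phi(t)=t$ the same sequence would keep the integral bounded.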

3 Existence of Solutions

In this section, we show strong duality for the regularized mass transport (P) using Fenchel duality in the spaces $\mathcal{P}(\Omega)$ and $\mathcal{C}(\Omega)$. The result will then be used to study the question of existence of solutions for both the primal and the dual problem.

Theorem 3.1 (Strong duality).

Let $\Phi$ be a quasi-Young’s function and Assumption 1.1 hold. If $(\Phi_{\infty})^{*}\left(-c/\gamma\right)$ is integrable w.r.t. $\lambda$, then the predual problem to (P) is

\[
\sup_{\substack{\alpha_{i}\in\mathcal{C}(\Omega_{i}),\\ i=1,2}}\int_{\Omega_{1}}\alpha_{1}\,\mathrm{d}\mu_{1}+\int_{\Omega_{2}}\alpha_{2}\,\mathrm{d}\mu_{2}-\gamma\int_{\Omega}(\Phi_{\infty})^{*}\left(\frac{\alpha_{1}\oplus\alpha_{2}-c}{\gamma}\right)\mathrm{d}\lambda \tag{P*}
\]

and strong duality holds. Furthermore, if the supremum is finite, (P) possesses a minimizer.
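To see what (P*) looks like in practice, the conjugate $(\Phi_{\infty})^{*}$ can be computed explicitly for the quadratic and the entropic case. The following is a sketch for illustration, assuming $\gamma>0$; the shorthand $s$ for the argument of the conjugate is introduced only here.

```latex
% Illustration only, assuming \gamma > 0; s stands for (\alpha_1 \oplus \alpha_2 - c)/\gamma.
% (a) Quadratic case \Phi(t) = t^2/2: the supremum runs over t >= 0 only,
%     since \Phi_\infty = \infty on the negative half-line.
\[
  (\Phi_{\infty})^{*}(s)=\sup_{t\geq 0}\Big(st-\frac{t^{2}}{2}\Big)=\frac{(s_{+})^{2}}{2},
  \qquad
  \gamma\,(\Phi_{\infty})^{*}\Big(\frac{\alpha_{1}\oplus\alpha_{2}-c}{\gamma}\Big)
  =\frac{1}{2\gamma}\big(\alpha_{1}\oplus\alpha_{2}-c\big)_{+}^{2}.
\]
% (b) Entropic case \Phi(t) = t\log t: the maximizer is t = e^{s-1}.
\[
  (\Phi_{\infty})^{*}(s)=\sup_{t\geq 0}\big(st-t\log t\big)=\mathrm{e}^{s-1},
  \qquad
  \gamma\,(\Phi_{\infty})^{*}\Big(\frac{\alpha_{1}\oplus\alpha_{2}-c}{\gamma}\Big)
  =\gamma\exp\Big(\frac{\alpha_{1}\oplus\alpha_{2}-c}{\gamma}-1\Big).
\]
```

In both cases $(\Phi_{\infty})^{*}(-c/\gamma)$ is a continuous function of the continuous cost $c$ on the compact set $\Omega$, so the integrability hypothesis of the theorem holds automatically for finite $\lambda$.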

Proof.

Strong duality holds by standard arguments (see e.g. [6, Theorem 4.4.3]) and (assuming finiteness of the supremum) the primal problem (P) possesses a minimizer.

To derive the dual problem, we start from the primal problem, express the equality conditions Ω2dπdλdλ2=dμ1dλ1\int_{\Omega_{2}}\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda_{2}=\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}} and Ω1dπdλdλ1=dμ2dλ2\int_{\Omega_{1}}\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda_{1}=\tfrac{\mathop{}\!\mathrm{d}\mu_{2}}{\mathop{}\!\mathrm{d}\lambda_{2}} as suprema over continuous functions and get

\begin{align*}
&\inf_{\begin{subarray}{c}\frac{\mathrm{d}\pi}{\mathrm{d}\lambda}\in L^{\Phi}(\Omega,\mathrm{d}\lambda),\\ \int_{\Omega_{2}}\frac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{2}=\frac{\mathrm{d}\mu_{1}}{\mathrm{d}\lambda_{1}},\\ \int_{\Omega_{1}}\frac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{1}=\frac{\mathrm{d}\mu_{2}}{\mathrm{d}\lambda_{2}}\end{subarray}}\int_{\Omega}c\,\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}+\gamma\tilde{\Phi}\big(\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\big)\,\mathrm{d}\lambda\\
&\quad=\inf_{\frac{\mathrm{d}\pi}{\mathrm{d}\lambda}\in L^{\Phi}(\Omega,\mathrm{d}\lambda)}\;\sup_{\begin{subarray}{c}\alpha_{i}\in\mathcal{C}(\Omega_{i}),\\ i=1,2\end{subarray}}\int_{\Omega}c\,\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}+\gamma\tilde{\Phi}\big(\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\big)\,\mathrm{d}\lambda+\int_{\Omega_{1}}\Big(\tfrac{\mathrm{d}\mu_{1}}{\mathrm{d}\lambda_{1}}-\int_{\Omega_{2}}\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{2}\Big)\alpha_{1}\,\mathrm{d}\lambda_{1}+\int_{\Omega_{2}}\Big(\tfrac{\mathrm{d}\mu_{2}}{\mathrm{d}\lambda_{2}}-\int_{\Omega_{1}}\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\,\mathrm{d}\lambda_{1}\Big)\alpha_{2}\,\mathrm{d}\lambda_{2}\\
&\quad=\sup_{\begin{subarray}{c}\alpha_{i}\in\mathcal{C}(\Omega_{i}),\\ i=1,2\end{subarray}}\;\inf_{\frac{\mathrm{d}\pi}{\mathrm{d}\lambda}\in L^{\Phi}(\Omega,\mathrm{d}\lambda)}\Big(\int_{\Omega}(c-\alpha_{1}\oplus\alpha_{2})\,\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}+\gamma\tilde{\Phi}\big(\tfrac{\mathrm{d}\pi}{\mathrm{d}\lambda}\big)\,\mathrm{d}\lambda\Big)+\int_{\Omega_{1}}\alpha_{1}\,\mathrm{d}\mu_{1}+\int_{\Omega_{2}}\alpha_{2}\,\mathrm{d}\mu_{2}.
\end{align*}

The integrand of the first integral in (P*) is normal, so that it can be conjugated pointwise [30, Theorem 2]. Carrying out the conjugation with the help of Lemma 2.12, one obtains the claim. ∎

Example 3.2.
  1.

    Using Φ(t)=tlogt\Phi(t)=t\log t and λi=i\lambda_{i}=\mathcal{L}_{i}, one obtains the result for LlogLL\log L as stated in [10, Theorem 3.1]. In this case it holds that (Φ)(r)=exp(r)=Φ(r)({\Phi}_{\infty})^{*}(r)=\exp(r)=\Phi^{*}(r).

  2.

    Using $\Phi(t)=\frac{1}{2}t^{2}$ and $\lambda_{i}=\mathcal{L}_{i}$, one obtains the result for $L^{2}$ as stated in [20]. In this case it holds that $(\Phi_{\infty})^{*}(r)=\frac{1}{2}\max(0,r)^{2}$.

One can show that in general (Φ)(r)=Φ(r)({\Phi}_{\infty})^{*}(r)=\Phi^{*}(r) for rinfτ>0Φ(τ)r\geq\inf_{\tau>0}\partial\Phi(\tau) and equal to Φ(0)-\Phi(0) otherwise.

Remark 3.3.

Theorem 3.1 does not claim that the supremum is attained, i.e. that the predual problem (P*) admits a solution. Moreover, solutions of (P*) cannot be unique, since one can add a constant to $\alpha_{1}$ and subtract the same constant from $\alpha_{2}$ without changing the functional value. If, however, $(\Phi_{\infty})^{*}$ is strictly convex, then the functional in (P*) is strictly concave up to such a constant and therefore any solution is uniquely determined up to this constant. This is the case, e.g., for functions $\Phi$ with superlinear growth at $\infty$.

3.1 Existence result for the primal problem

The duality result can now be used to address the question of existence of a solution to (P).

Theorem 3.4 (Existence of solutions of (P)).

Let Assumption 1.1 hold. If (Φ)(c/γ)({\Phi}_{\infty})^{*}\left(\nicefrac{{-c}}{{\gamma}}\right) is integrable w.r.t. λ\lambda and

dμidλiLΦ(Ωi,dλi),i=1,2d(μ1μ2)dλLΦ(Ω,dλ)\tfrac{\mathop{}\!\mathrm{d}\mu_{i}}{\mathop{}\!\mathrm{d}\lambda_{i}}\in L^{\Phi}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}),\,i=1,2\,\Rightarrow\,\tfrac{\mathop{}\!\mathrm{d}(\mu_{1}\otimes\mu_{2})}{\mathop{}\!\mathrm{d}\lambda}\in L^{\Phi}(\Omega,\mathop{}\!\mathrm{d}\lambda) (3.1)

we have that problem (P) admits a minimizer π¯\bar{\pi} if and only if dμidλiLΦ(Ωi,dλi)\tfrac{\mathop{}\!\mathrm{d}\mu_{i}}{\mathop{}\!\mathrm{d}\lambda_{i}}\in L^{\Phi}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}) for i=1,2i=1,2. In this case, π¯LΦ(Ω,dλ)\bar{\pi}\in L^{\Phi}(\Omega,\mathop{}\!\mathrm{d}\lambda) and the minimizer is unique, if Φ\Phi is strictly convex.

Proof.

The proof given in [10, Theorem 3.3] for Φ(t)=tlogt\Phi(t)=t\log t holds for arbitrary Φ\Phi and arbitrary product measures λ=λ1λ2\lambda=\lambda_{1}\otimes\lambda_{2}, since it only relies on Lemma 2.9 and condition (3.1). ∎

Example 3.5.

Since d(μ1μ2)dλ=dμ1dλ1dμ2dλ2\tfrac{\mathop{}\!\mathrm{d}(\mu_{1}\otimes\mu_{2})}{\mathop{}\!\mathrm{d}\lambda}=\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}}\otimes\tfrac{\mathop{}\!\mathrm{d}\mu_{2}}{\mathop{}\!\mathrm{d}\lambda_{2}}, condition (3.1) is satisfied e.g. when Φ\Phi satisfies either Φ(xy)CΦ(x)Φ(y)\Phi(xy)\leq C\Phi(x)\Phi(y) for some C>0C>0 or Φ(xy)C1xΦ(y)+C2Φ(x)y\Phi(xy)\leq C_{1}x\Phi(y)+C_{2}\Phi(x)y for some C1,C20C_{1},C_{2}\geq 0. For Φ(t)=tp/p\Phi(t)=\nicefrac{{t^{p}}}{{p}}, p>1p>1, both conditions hold trivially. For Φ(t)=tlogt\Phi(t)=t\log t the second condition holds, since log(xy)=log(x)+log(y)\log(xy)=\log(x)+\log(y).

3.2 Existence result for the predual problem with Lp(Ω,dλ)L^{p}(\Omega,\mathop{}\!\mathrm{d}\lambda) regularization

The question of existence of solutions to the predual problem (P*) proves to be difficult for general Young’s functions. There are results that show existence for the predual problem in the entropic case [10] and in the quadratic case [20] (both considering the penalty w.r.t. the Lebesgue measure), but their proofs are quite different in nature. In the case where $\lambda=\mu_{1}\otimes\mu_{2}$ (i.e. the product of the marginals) and $\Phi(t)=t\log(t)$, one can show dual existence in a different way, see [17] for a proof based on the convergence of the Sinkhorn method and [32] for a sketch of a proof that uses regularity of the cost function. Our methods do not allow us to show existence of solutions for the dual problem in the general case considered up to now. Hence, in the following we specialize to the case of $L^{p}(\Omega,\mathrm{d}\lambda)$-regularization for a measure $\lambda=\lambda_{1}\otimes\lambda_{2}$ with $\mu_{i}\ll\lambda_{i}$, $i=1,2$. That is, we use the Young’s functions $\Phi(t)=\nicefrac{t^{p}}{p}$ for $p>1$, and thus $\Phi^{*}(t)=\nicefrac{t^{q}}{q}$ with $\nicefrac{1}{p}+\nicefrac{1}{q}=1$, and the predual is actually also the dual. Moreover, $(\Phi_{\infty})^{*}(s)=\tfrac{1}{q}(s_{+})^{q}$. To keep notation clean, $(t_{+})^{p}$ will be abbreviated as $t_{+}^{p}$ with slight abuse of notation, and similarly for $(t_{-})^{p}$.
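The closed form of the conjugate used in this subsection can be checked numerically. The following is a minimal sketch, assuming that $\Phi_{\infty}$ denotes $\Phi(t)=t^{p}/p$ extended by $+\infty$ to negative arguments, so that the supremum in the conjugate runs over $t\geq 0$ only. The function names, the grid, and the sample values of $p$ and $s$ are illustrative choices and do not come from the paper.

```python
def conjugate_on_halfline(s, p, t_max=10.0, n=100_000):
    """Approximate sup_{t >= 0} (s*t - t**p / p) by a grid search on [0, t_max].

    Assumption: Phi_infty equals Phi(t) = t**p / p for t >= 0 and +infinity
    for t < 0, so negative t never enter the supremum.
    """
    h = t_max / n
    return max(s * (k * h) - (k * h) ** p / p for k in range(n + 1))


def closed_form(s, p):
    """Closed form (1/q) * max(s, 0)**q with 1/p + 1/q = 1."""
    q = p / (p - 1.0)
    return max(s, 0.0) ** q / q


if __name__ == "__main__":
    for p in (1.5, 2.0, 3.0):
        for s in (-1.0, 0.5, 2.0):
            print(p, s, conjugate_on_halfline(s, p), closed_form(s, p))
```

For $s\leq 0$ the grid maximum is attained at $t=0$, which reflects that the conjugate vanishes on the negative half-line.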

Assumption 3.6.

Let Assumption 1.1 hold. In addition, let Φ(t)=tp/p\Phi(t)=\nicefrac{{t^{p}}}{{p}} for p>1p>1 and let the cost function cc be continuous and fulfill ccc\geq c^{\dagger} for some constant c>c^{\dagger}>-\infty. Furthermore, let the marginals μi\mu_{i} with dμidλiLp(Ωi,dλi)\tfrac{\mathop{}\!\mathrm{d}\mu_{i}}{\mathop{}\!\mathrm{d}\lambda_{i}}\in L^{p}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}) satisfy dμidλiδ>0\tfrac{\mathop{}\!\mathrm{d}\mu_{i}}{\mathop{}\!\mathrm{d}\lambda_{i}}\geq\delta>0 λi\lambda_{i}-a.e. for i=1,2i=1,2.

Note that the latter condition, i.e. that the densities are bounded away from zero, can be guaranteed by a proper choice of $\lambda_{i}$; e.g. $\lambda_{i}=\mu_{i}$ gives $\tfrac{\mathrm{d}\mu_{i}}{\mathrm{d}\lambda_{i}}=1$. Problem (P*) cannot be expected to have continuous optimizers $\alpha_{1}$, $\alpha_{2}$, as the following example demonstrates.

Example 3.7.

For $i=1,2$ let $\Omega_{i}=[-1,1]$, $\lambda_{i}=\mathcal{L}|_{\Omega_{i}}+\delta_{0}$, where $\delta_{0}$ denotes the Dirac measure at zero, and $\mu_{i}=\delta_{0}$. Moreover, let $c\equiv 0$ and $\gamma=1$. Clearly the minimum in problem (P) is $\frac{1}{p}$, which is attained for $\pi=\delta_{0}\times\delta_{0}$. By Theorem 3.1, the optimal value of problem (P*) is $\frac{1}{p}$ as well. Furthermore, it holds that

supαi𝒞(Ωi)Ω1α1dμ1+Ω2α2dμ21qΩ(α1α2)+qdλ\displaystyle\sup_{\alpha_{i}\in\mathcal{C}(\Omega_{i})}\int_{\Omega_{1}}\alpha_{1}\mathop{}\!\mathrm{d}\mu_{1}+\int_{\Omega_{2}}\alpha_{2}\mathop{}\!\mathrm{d}\mu_{2}-\frac{1}{q}\int_{\Omega}(\alpha_{1}\oplus\alpha_{2})_{+}^{q}\mathop{}\!\mathrm{d}\lambda
=\displaystyle= supαi𝒞(Ωi)α1(0)+α2(0)1q(α1(0)+α2(0))+q1qΩ(α1α2)+qd\displaystyle\sup_{\alpha_{i}\in\mathcal{C}(\Omega_{i})}\alpha_{1}(0)+\alpha_{2}(0)-\frac{1}{q}(\alpha_{1}(0)+\alpha_{2}(0))_{+}^{q}-\frac{1}{q}\int_{\Omega}(\alpha_{1}\oplus\alpha_{2})_{+}^{q}\mathop{}\!\mathrm{d}\mathcal{L}

If α1α20\alpha_{1}\oplus\alpha_{2}\leq 0, the supremum (1p\frac{1}{p}) can clearly not be attained. Hence, assume (α1α2)(x)>0(\alpha_{1}\oplus\alpha_{2})(x)>0 for some xΩx\in\Omega, which implies Ω(α1α2)+qd>0\int_{\Omega}(\alpha_{1}\oplus\alpha_{2})_{+}^{q}\mathop{}\!\mathrm{d}\mathcal{L}>0, as αi𝒞(Ωi)\alpha_{i}\in\mathcal{C}(\Omega_{i}). Thus,

α1(0)+α2(0)1q(α1(0)+α2(0))+q1qΩ(α1α2)+qd\displaystyle\alpha_{1}(0)+\alpha_{2}(0)-\frac{1}{q}(\alpha_{1}(0)+\alpha_{2}(0))_{+}^{q}-\frac{1}{q}\int_{\Omega}(\alpha_{1}\oplus\alpha_{2})_{+}^{q}\mathop{}\!\mathrm{d}\mathcal{L}
<\displaystyle< (α1α2)(0)1q((α1α2)(0))+q11q=1p,\displaystyle(\alpha_{1}\oplus\alpha_{2})(0)-\frac{1}{q}((\alpha_{1}\oplus\alpha_{2})(0))_{+}^{q}\leq 1-\frac{1}{q}=\frac{1}{p}\,,

where equality holds for $(\alpha_{1}\oplus\alpha_{2})(0)=1$. Due to the strict inequality, the supremum in problem (P*) cannot be attained for $\alpha_{i}\in\mathcal{C}(\Omega_{i})$, $i=1,2$.
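The non-attainment in this example can be made concrete with a family of continuous test functions. The following is a minimal sketch, assuming $\gamma=1$, $c\equiv 0$, and the hypothetical choice $\alpha_{1}=\alpha_{2}=\alpha_{\varepsilon}$ with $\alpha_{\varepsilon}(x)=\tfrac{1}{2}-\min(|x|/\varepsilon,1)$, which is not taken from the paper. The product measure $\lambda=(\mathcal{L}+\delta_{0})\otimes(\mathcal{L}+\delta_{0})$ is expanded into its atom at the origin, the two mixed parts, and the Lebesgue part; all integrals are approximated by a midpoint rule.

```python
def dual_value(eps, q=2.0, n=400):
    """Objective of (P*) in the example (c = 0, gamma = 1, mu_i = delta_0,
    lambda_i = Lebesgue measure on [-1, 1] plus delta_0), evaluated at the
    continuous pair alpha_1 = alpha_2 = alpha, alpha(x) = 1/2 - min(|x|/eps, 1).
    This family of test functions is an illustrative assumption."""
    def alpha(x):
        return 0.5 - min(abs(x) / eps, 1.0)

    def pos(s):
        return max(s, 0.0)

    h = 2.0 / n
    xs = [-1.0 + (k + 0.5) * h for k in range(n)]   # midpoint grid on [-1, 1]
    a = [alpha(x) for x in xs]
    a0 = alpha(0.0)

    linear = 2.0 * a0                               # alpha_1(0) + alpha_2(0)
    atom = pos(2.0 * a0) ** q                       # delta_0 x delta_0 part
    mixed = 2.0 * h * sum(pos(ai + a0) ** q for ai in a)   # two mixed parts
    leb = h * h * sum(pos(ai + aj) ** q for ai in a for aj in a)
    return linear - (atom + mixed + leb) / q


if __name__ == "__main__":
    q = 2.0
    p = q / (q - 1.0)
    for eps in (0.5, 0.1, 0.02):
        print(eps, dual_value(eps, q), "supremum:", 1.0 / p)
```

As $\varepsilon$ decreases, the computed values increase towards $\tfrac{1}{p}$ but stay strictly below it, in line with the strict inequality derived in the example.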

However, with the help of the following result, we can define a variant of the predual problem, for which existence of minimizers can be shown.

Lemma 3.8.

Let Assumption 1.1 hold and let αi:Ωi{±}\alpha_{i}:\Omega_{i}\to\mathbb{R}\cup\{\pm\infty\}, i=1,2i=1,2 be such that (α1α2c)+Lq(Ω,dλ){({\alpha_{1}}\oplus{\alpha_{2}}-c)}_{+}\in L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda). Then (αi)+Lq(Ωi,dλi){(\alpha_{i})}_{+}\in L^{q}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}), i=1,2i=1,2.

Proof.

First note that

\begin{align*}
(\alpha_{1}\oplus\alpha_{2}-c+c)_{+}^{q} &\leq \big((\alpha_{1}\oplus\alpha_{2}-c)_{+}+c_{+}\big)^{q}=2^{q}\left(\frac{(\alpha_{1}\oplus\alpha_{2}-c)_{+}}{2}+\frac{c_{+}}{2}\right)^{\!q}\\
&\leq 2^{q-1}\left((\alpha_{1}\oplus\alpha_{2}-c)_{+}^{q}+c_{+}^{q}\right),
\end{align*}

so that (α1α2)+Lq(Ω,dλ){(\alpha_{1}\oplus\alpha_{2})}_{+}\in L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda). This implies (αi)+<{(\alpha_{i})}_{+}<\infty λi\lambda_{i}-a.e. for i=1,2i=1,2.

We now consider (αi)+{(\alpha_{i})}_{+}, i=1,2i=1,2 separately and start with i=1i=1. Let MΩ2M\subset\Omega_{2} be such that α20\alpha_{2}\geq 0 λ2\lambda_{2}-a.e. on MM. If λ2(M)=0\lambda_{2}(M)=0 the assertion holds trivially, so we assume λ2(M)>0\lambda_{2}(M)>0. Then,

\begin{align*}
\lVert(\alpha_{1})_{+}\rVert_{q} &=\sup\left\{\int_{\Omega_{1}}(\alpha_{1})_{+}g_{1}\,\mathrm{d}\lambda_{1}\,\middle|\,g_{1}\in L^{p}(\Omega_{1},\mathrm{d}\lambda_{1}),\ \lVert g_{1}\rVert_{p}=1\right\}\\
&=\sup\left\{\lambda_{2}(M)^{\frac{1}{p}-1}\int_{M}\int_{\Omega_{1}}(\alpha_{1})_{+}(g_{1}\otimes g_{2})\,\mathrm{d}\lambda_{1}\,\mathrm{d}\lambda_{2}\,\middle|\,g_{1}\in L^{p}(\Omega_{1},\mathrm{d}\lambda_{1}),\ \lVert g_{1}\rVert_{p}=1,\ g_{2}=\mathds{1}_{M}\lambda_{2}(M)^{-\frac{1}{p}}\right\}\\
&\leq\lambda_{2}(M)^{\frac{1}{p}-1}\cdot\sup\left\{\int_{M}\int_{\Omega_{1}}(\alpha_{1}\oplus\alpha_{2})_{+}(g_{1}\otimes g_{2})\,\mathrm{d}\lambda_{1}\,\mathrm{d}\lambda_{2}\,\middle|\,g_{i}\in L^{p}(\Omega_{i},\mathrm{d}\lambda_{i}),\ \lVert g_{i}\rVert_{p}=1,\ i=1,2\right\}\\
&\leq\lambda_{2}(M)^{\frac{1}{p}-1}\cdot\sup\left\{\int_{\Omega}(\alpha_{1}\oplus\alpha_{2})_{+}f\,\mathrm{d}\lambda\,\middle|\,f\in L^{p}(\Omega,\mathrm{d}\lambda),\ \lVert f\rVert_{p}=1\right\}\\
&=\lambda_{2}(M)^{\frac{1}{p}-1}\cdot\lVert(\alpha_{1}\oplus\alpha_{2})_{+}\rVert_{q}<\infty\,,
\end{align*}

where the first inequality is justified because thanks to (α1)+0{(\alpha_{1})}_{+}\geq 0 we can take the supremum over g10g_{1}\geq 0. With the same argumentation, we can prove (α2)+Lq(Ω2,dλ2){(\alpha_{2})}_{+}\in L^{q}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}). ∎

Hence, the objective function of problem (P*) is also well defined for functions $\alpha_{i}\in L^{1}(\Omega_{i},\mathrm{d}\lambda_{i})$, $i=1,2$, with $(\alpha_{1}\oplus\alpha_{2}-c)_{+}\in L^{q}(\Omega,\mathrm{d}\lambda)$. We now consider the following variant of the predual problem:

\begin{multline*}
\min\Big\{\Lambda(\alpha_{1},\alpha_{2}):=\frac{1}{q}\lVert(\alpha_{1}\oplus\alpha_{2}-c)_{+}\rVert^{q}_{L^{q}(\Omega,\,\mathrm{d}\lambda)}-\gamma^{q-1}\int_{\Omega_{1}}\alpha_{1}\,\mathrm{d}\mu_{1}-\gamma^{q-1}\int_{\Omega_{2}}\alpha_{2}\,\mathrm{d}\mu_{2}\\
\Big|\;\alpha_{i}\in L^{1}(\Omega_{i},\mathrm{d}\lambda_{i}),\ i=1,2,\ \tfrac{1}{\gamma}(\alpha_{1}\oplus\alpha_{2}-c)_{+}\in L^{q}(\Omega,\mathrm{d}\lambda)\Big\}\,. \tag{P}
\end{multline*}

The strategy in this section is as follows.

  1.

    First, show that problem (P) admits a solution (α¯1,α¯2)L1(Ω1,dλ1)×L1(Ω2,dλ2)({\bar{\alpha}_{1}},{\bar{\alpha}_{2}})\in L^{1}(\Omega_{1},\mathop{}\!\mathrm{d}\lambda_{1})\times L^{1}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}).

  2.

    Then, prove that α¯1{\bar{\alpha}_{1}} and α¯2{\bar{\alpha}_{2}} possess higher regularity, namely that they are functions in Lq(Ωi,dλi)L^{q}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}).

The objective function needs to be extended in order to deal with weak-* converging sequences. To that end, define

G:Lq(Ω,dλ)wΩ(1qw+qwμ)dλ,G:L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda)\ni w\mapsto\int_{\Omega}\left(\frac{1}{q}{w}_{+}^{q}-w\mu\right)\mathop{}\!\mathrm{d}\lambda\in\mathbb{R}\,,

where $\mu:=\gamma^{q-1}\tfrac{\mathrm{d}(\mu_{1}\otimes\mu_{2})}{\mathrm{d}\lambda}$. Note that in the case $\lambda_{i}=\mu_{i}$, the function $\mu$ is the constant $\gamma^{q-1}$. Then, thanks to the normalization of $\mu_{1}$ and $\mu_{2}$,

Λ(α1,α2)=G(α1α2c)Ωcμdλα1,α2Lq.\Lambda({\alpha_{1}},{\alpha_{2}})=G({\alpha_{1}}\oplus{\alpha_{2}}-c)-\int_{\Omega}c\mu\mathop{}\!\mathrm{d}\lambda\quad\forall{\alpha_{1}},{\alpha_{2}}\in L^{q}\,.

Of course, GG is also well defined as a functional on the feasible set of problem (P) and this functional will be denoted by the same symbol to ease notation. In order to extend GG to the space of Radon measures, consider for a given measure w𝔐(Ω)w\in\mathfrak{M}(\Omega) the Hahn-Jordan decomposition w=w+ww={w}_{+}-{w}_{-} and assume dw+dλLq(Ω,dλ)\tfrac{\mathop{}\!\mathrm{d}{w}_{+}}{\mathop{}\!\mathrm{d}\lambda}\in L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda). Then, Ωd(w+μ)dλdλ\int_{\Omega}\tfrac{\mathop{}\!\mathrm{d}(w_{+}\mu)}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda is finite for μ\mu as defined above and μi\mu_{i} as in Assumption 3.6. Regarding the negative part, we set Ωμdw:=\int_{\Omega}\mu\mathop{}\!\mathrm{d}w_{-}:=\infty, whenever this expression is not properly defined, as ww_{-} and μ\mu are both positive. Combining this, we always have Ωμdw{}-\int_{\Omega}\mu\mathop{}\!\mathrm{d}w\in\mathbb{R}\cup\{\infty\} and define

G(w):=Ω1qdw+qdλdλΩμdw.G(w):=\int_{\Omega}\frac{1}{q}\frac{\mathop{}\!\mathrm{d}{w}_{+}^{q}}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda-\int_{\Omega}\mu\mathop{}\!\mathrm{d}w\,.
Remark 3.9.

If wλw\ll\lambda, then dw+dλL1(Ω,dλ)\tfrac{\mathop{}\!\mathrm{d}{w}_{+}}{\mathop{}\!\mathrm{d}\lambda}\in L^{1}(\Omega,\mathop{}\!\mathrm{d}\lambda) and dw+dλ(x)=max{0,dwdλ(x)}\tfrac{\mathop{}\!\mathrm{d}{w}_{+}}{\mathop{}\!\mathrm{d}\lambda}(x)=\max\{0,\tfrac{\mathop{}\!\mathrm{d}w}{\mathop{}\!\mathrm{d}\lambda}(x)\} λ\lambda-a.e. in Ω\Omega. Hence, both functionals denoted by GG coincide on Lq(Ω,dλ)L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda), which justifies this notation.

The following auxiliary results are generalizations of the corresponding results in [20]. The first lemma covers the coercivity of GG in L1(Ω,dλ)L^{1}(\Omega,\mathop{}\!\mathrm{d}\lambda). To keep notation simple, from now on we will abbreviate .Lp(Ω,dλ)\lVert\,\boldsymbol{.}\,\rVert_{L^{p}(\Omega,\mathop{}\!\mathrm{d}\lambda)} by .p\lVert\,\boldsymbol{.}\,\rVert_{p} and similarly for .Lp(Ωi,dλi)\lVert\,\boldsymbol{.}\,\rVert_{L^{p}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i})}, i=1,2i=1,2, where the underlying space will be clear from the context.

Lemma 3.10.

Let Assumption 3.6 hold and suppose that a sequence (wn)Lq(Ω,dλ)(w_{n})\subset L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda) fulfills

G(wn)C<nG(w_{n})\leq C<\infty\quad\forall n\in\mathbb{N}

for some C>0C>0. Then, the sequences (wn)+{(w_{n})}_{+} and (wn){(w_{n})}_{-} are bounded in Lq(Ω,dλ)L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda) and L1(Ω,dλ)L^{1}(\Omega,\mathop{}\!\mathrm{d}\lambda), respectively.

Proof.

This proof follows the outline of the proof in [20, Lemma 2.5].

We rewrite GG as G(w)=Ω1qw+qw+μdλ+ΩwμdλG(w)=\int_{\Omega}\frac{1}{q}{w}_{+}^{q}-{w}_{+}\mu\mathop{}\!\mathrm{d}\lambda+\int_{\Omega}{w}_{-}\mu\mathop{}\!\mathrm{d}\lambda. The positivity of μ\mu implies

1q(wn)+qq=G(wn)+Ω(wn)+μdλΩ(wn)μdλC+μp(wn)+q,\frac{1}{q}\lVert{(w_{n})}_{+}\rVert_{q}^{q}=G(w_{n})+\int_{\Omega}{(w_{n})}_{+}\mu\mathop{}\!\mathrm{d}\lambda-\int_{\Omega}{(w_{n})}_{-}\mu\mathop{}\!\mathrm{d}\lambda\leq C+\lVert\mu\rVert_{p}\lVert{(w_{n})}_{+}\rVert_{q}\,,

which gives the first assertion. The second one can be seen by making use of μγq1δ2\mu\geq\gamma^{q-1}\delta^{2} with δ\delta from Assumption 3.6, which yields the estimate

\begin{align*}
C\geq G(w_{n}) &=\frac{1}{q}\int_{\Omega}(w_{n})_{+}^{q}\,\mathrm{d}\lambda-\int_{\Omega}(w_{n})_{+}\mu\,\mathrm{d}\lambda+\int_{\Omega}(w_{n})_{-}\mu\,\mathrm{d}\lambda\\
&\geq\frac{1}{q}\lVert(w_{n})_{+}\rVert_{q}^{q}-\lVert\mu\rVert_{p}\lVert(w_{n})_{+}\rVert_{q}+\gamma^{q-1}\delta^{2}\lVert(w_{n})_{-}\rVert_{1}\\
&\geq-\lVert\mu\rVert_{p}\lVert(w_{n})_{+}\rVert_{q}+\gamma^{q-1}\delta^{2}\lVert(w_{n})_{-}\rVert_{1}\,.
\end{align*}

Since (wn)+q\lVert{(w_{n})}_{+}\rVert_{q} is already known to be bounded, the second assertion holds. ∎
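The two estimates in this proof yield explicit bounds. Writing $x=\lVert(w_{n})_{+}\rVert_{q}$ and $m=\lVert\mu\rVert_{p}$, the first estimate reads $x^{q}/q\leq C+mx$, and the second gives $\lVert(w_{n})_{-}\rVert_{1}\leq(C+mx)/(\gamma^{q-1}\delta^{2})$. The following is a minimal sketch that evaluates these bounds by bisection; the function names and the sample constants are hypothetical and serve only as an illustration.

```python
def bound_positive_part(C, m, q, tol=1e-12):
    """Largest x >= 0 with x**q / q <= C + m * x, for q > 1, C > 0, m >= 0.

    g(x) = x**q / q - m * x - C is convex with g(0) < 0 and g -> +infinity,
    so it has exactly one positive root and g <= 0 to the left of it.
    """
    def g(x):
        return x ** q / q - m * x - C

    lo, hi = 0.0, 1.0
    while g(hi) <= 0.0:          # enlarge the bracket until g(hi) > 0
        lo, hi = hi, 2.0 * hi
    while hi - lo > tol * max(1.0, hi):
        mid = 0.5 * (lo + hi)
        if g(mid) <= 0.0:
            lo = mid
        else:
            hi = mid
    return hi


def bound_negative_part(C, m, q, gamma, delta):
    """Bound on the L^1 norm of the negative part that follows from
    C >= -m * x + gamma**(q-1) * delta**2 * ||(w_n)_-||_1."""
    x = bound_positive_part(C, m, q)
    return (C + m * x) / (gamma ** (q - 1.0) * delta ** 2)


if __name__ == "__main__":
    print(bound_positive_part(C=1.0, m=1.0, q=2.0))
    print(bound_negative_part(C=1.0, m=1.0, q=2.0, gamma=1.0, delta=0.5))
```

Both bounds depend only on $C$, $\lVert\mu\rVert_{p}$, $q$, $\gamma$ and $\delta$, not on $n$, which is what the lemma asserts.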

The next lemma provides a lower semi-continuity result for $G$ w.r.t. weak-* convergence in $\mathfrak{M}(\Omega)$. Note that the extension of $G$ as introduced above is needed here.

Lemma 3.11.

Let Assumption 3.6 hold and a sequence (wn)Lq(Ω,dλ)(w_{n})\subset L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda) be given such that wnλw¯w_{n}\lambda\xrightharpoonup{*}\bar{w} in 𝔐(Ω)\mathfrak{M}(\Omega) and G(wn)C<G(w_{n})\leq C<\infty for all nn\in\mathbb{N}. Then it holds that w¯+λ{\bar{w}}_{+}\ll\lambda with dw¯+dλLq(Ω,dλ)\tfrac{{\mathop{}\!\mathrm{d}\bar{w}}_{+}}{\mathop{}\!\mathrm{d}\lambda}\in L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda) and

G(w¯)lim infnG(wn).G(\bar{w})\leq\liminf_{n\to\infty}G(w_{n}).
Proof.

The proof given in [20, Lemma 2.6] only uses Lemma 3.10 and fundamental properties of L2(Ω,d)L^{2}(\Omega,\mathop{}\!\mathrm{d}\mathcal{L}), which also hold for Lq(Ω,dλ)L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda), q>1q>1, and can thus be readily extended. ∎

Now all prerequisites for proving the existence result for problem (P) are gathered.

Proposition 3.12.

Let Assumption 3.6 hold. Then, problem (P) admits a solution (α¯1,α¯2)L1(Ω1,dλ1)×L1(Ω2,dλ2)({\bar{\alpha}_{1}},{\bar{\alpha}_{2}})\in L^{1}(\Omega_{1},\mathop{}\!\mathrm{d}\lambda_{1})\times L^{1}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}).

Proof.

In [20, Proposition 2.9] the statement is proven for p=2p=2 via the classical direct method of the calculus of variations using only [20, Lemmas 2.7 & 2.8] and Lemmas 3.10 and 3.11, where [20, Lemmas 2.7 & 2.8] are rather technical results holding independently of the choice of p>1p>1 and λ1\lambda_{1}, λ2\lambda_{2}. Hence, the proof also holds for p>1p>1. ∎

Next, it is shown that the optimal $\bar{\alpha}_{i}$, $i=1,2$, are indeed functions in $L^{q}(\Omega_{i},\mathrm{d}\lambda_{i})$.

Theorem 3.13.

Let Assumption 3.6 hold and let p2p\geq 2. Then every optimal solution (α¯1,α¯2)({\bar{\alpha}_{1}},{\bar{\alpha}_{2}}) from Proposition 3.12 satisfies α¯iLq(Ωi,dλi){\bar{\alpha}_{i}}\in L^{q}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}), i=1,2i=1,2. Moreover, the negative parts of α¯i\bar{\alpha}_{i} are bounded and the function 1γq1(α¯1α¯2c)+q1\tfrac{1}{\gamma^{q-1}}{{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q-1}} has the marginals dμ1dλ1\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}} and dμ2dλ2\tfrac{\mathop{}\!\mathrm{d}\mu_{2}}{\mathop{}\!\mathrm{d}\lambda_{2}}.

Proof.

We only consider the negative parts, as for the positive parts we already have $(\alpha_{i})_{+}\in L^{q}(\Omega_{i},\mathrm{d}\lambda_{i})$ by Lemma 3.8.

First note that for functions $f_{i}:\Omega_{i}\to\mathbb{R}$, $i=1,2$, and $g:\Omega_{1}\to\mathbb{R}$ it holds that

((f1+g)f2)+=max(0,(f1f2)+g0)(f1f2)++g+0,{((f_{1}+g)\oplus f_{2})}_{+}=\max\Big{(}0,(f_{1}\oplus f_{2})+g\oplus 0\Big{)}\leq{(f_{1}\oplus f_{2})}_{+}+{g}_{+}\oplus 0\,,

where 0 is to be understood as the constant mapping x20x_{2}\mapsto 0. Let now φ𝒞c(Ω1)\varphi\in\mathcal{C}_{\mathrm{c}}^{\infty}(\Omega_{1}) and fix some 0<t10<t\leq 1. Then, thanks to

0((α¯1+tφ)α¯2c)+(α¯1α¯2c)++tφ+0,0\leq{((\bar{\alpha}_{1}+t\varphi)\oplus\bar{\alpha}_{2}-c)}_{+}\leq{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}+t{\varphi}_{+}\oplus 0\,, (3.2)

Proposition 3.12 implies that ((α¯1+tφ)α¯2c)+Lq(Ω,dλ){((\bar{\alpha}_{1}+t\varphi)\oplus\bar{\alpha}_{2}-c)}_{+}\in L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda), so that (α¯1+tφ,α¯2)(\bar{\alpha}_{1}+t\varphi,\bar{\alpha}_{2}) is feasible for problem (P). Therefore, the optimality of (α¯1,α¯2)(\bar{\alpha}_{1},\bar{\alpha}_{2}) for problem (P) yields

1qΩ1t(((α¯1+tφ)α¯2c)+q(α¯1α¯2c)+q)dλγq1Ω1dμ1dλ1φdλ10\frac{1}{q}\int_{\Omega}\frac{1}{t}\left({((\bar{\alpha}_{1}+t\varphi)\oplus\bar{\alpha}_{2}-c)}_{+}^{q}-{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q}\right)\mathop{}\!\mathrm{d}\lambda-\gamma^{q-1}\int_{\Omega_{1}}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}}\varphi\mathop{}\!\mathrm{d}\lambda_{1}\geq 0

for all 0<t10<t\leq 1. Owing to the continuous differentiability of rr+q\mathbb{R}\ni r\mapsto{r}_{+}^{q}\in\mathbb{R}, the first integrand converges to q(α¯1α¯2c)+q1φq{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q-1}\varphi λ\lambda-a.e. in Ω\Omega.

Moreover, for x0x\geq 0, the mapping [0,1]t(x+tφ+)q[0,1]\ni t\mapsto(x+t{\varphi}_{+})^{q}\in\mathbb{R} is Lipschitz continuous with Lipschitz constant qφ+(x+φ+)q1q{\varphi}_{+}\left(x+{\varphi}_{+}\right)^{q-1}. Together with (3.2), this gives

1t(((α¯1+tφ)α¯2c)+q(α¯1α¯2c)+q)1t(((α¯1α¯2c)++tφ+0)q(α¯1α¯2c)+q)qφ+((α¯1α¯2c)++φ+0)q1qφ+(2max{(α¯1α¯2c)+,φ+0})q1qφ+2q1((α¯1α¯2c)+q1+(φ+0)q1),\begin{split}\frac{1}{t}\left({((\bar{\alpha}_{1}+t\varphi)\oplus\bar{\alpha}_{2}-c)}_{+}^{q}-{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q}\right)&\leq\frac{1}{t}\left(\left({(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}+t{\varphi}_{+}\oplus 0\right)^{q}-{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q}\right)\\ &\leq q{\varphi}_{+}\left({(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}+{\varphi}_{+}\oplus 0\right)^{q-1}\\ &\leq q{\varphi}_{+}\left(2\max\left\{{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}\,,\,{\varphi}_{+}\oplus 0\right\}\right)^{q-1}\\ &\leq q{\varphi}_{+}2^{q-1}\left({(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q-1}+({\varphi}_{+}\oplus 0)^{q-1}\right)\,,\end{split} (3.3)

since (x+y)r(2max{x,y})r(x+y)^{r}\leq\left(2\max\{x\,,\,y\}\right)^{r} for all r>0r>0, x,y0x,y\geq 0. As gLq(Ω,dλ)g\in L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda) with λ(Ω)<\lambda(\Omega)<\infty implies Ωgq1dλ<\int_{\Omega}g^{q-1}\mathop{}\!\mathrm{d}\lambda<\infty for all q>1q>1 (see e.g. [15, Proposition 6.12]), the right-hand side is integrable.

Hence, due to Lebesgue’s dominated convergence theorem, passing to the limit t0t\searrow 0 is allowed and yields

Ω1(Ω2(α¯1α¯2c)+q1dλ2γq1dμ1dλ1)φdλ10.\int_{\Omega_{1}}\left(\int_{\Omega_{2}}{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q-1}\mathop{}\!\mathrm{d}\lambda_{2}-\gamma^{q-1}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}}\right)\varphi\mathop{}\!\mathrm{d}\lambda_{1}\geq 0\,.

Since φ𝒞c(Ω1)\varphi\in\mathcal{C}_{\mathrm{c}}^{\infty}(\Omega_{1}) was arbitrary, the fundamental lemma of the calculus of variations gives

Ω2(α¯1α¯2c)+q1dλ2=γq1dμ1dλ1\int_{\Omega_{2}}{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)}_{+}^{q-1}\mathop{}\!\mathrm{d}\lambda_{2}=\gamma^{q-1}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}} (3.4)

λ1\lambda_{1}-a.e. in Ω1\Omega_{1}. Next, define a sequence of functions (fn)(f_{n}) by

fn(x2):=(n+α¯2(x2)c)+n,f_{n}(x_{2}):={(-n+\bar{\alpha}_{2}(x_{2})-c^{\dagger})}_{+}\quad\forall n\in\mathbb{N}\,,

where cc^{\dagger} is the lower bound for cc from Assumption 3.6. It holds that fnLq1(Ω2,dλ2)f_{n}\in L^{q-1}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}), which can be seen as follows: Since α¯2L1(Ω2,dλ2)\bar{\alpha}_{2}\in L^{1}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}), clearly (α¯2)q1L1/(q1)(Ω2,dλ2)(\bar{\alpha}_{2})^{q-1}\in L^{\nicefrac{{1}}{{(q-1)}}}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}). Furthermore, since Ω2\Omega_{2} is compact and q2q\leq 2, it holds L1/(q1)(Ω2,dλ2)L1(Ω2,dλ2)L^{\nicefrac{{1}}{{(q-1)}}}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2})\hookrightarrow L^{1}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}). Consequently, (α¯2)q1L1(Ω2,dλ2)(\bar{\alpha}_{2})^{q-1}\in L^{1}(\Omega_{2},\mathop{}\!\mathrm{d}\lambda_{2}) and fnq1f_{n}^{q-1} is also integrable for every nn.

The functions fnf_{n} satisfy fn0f_{n}\geq 0 and fn0f_{n}\searrow 0 λ2\lambda_{2}-a.e. in Ω2\Omega_{2} for nn\to\infty, so that the monotone convergence theorem gives

Ω2fnq1dλ2n0.\int_{\Omega_{2}}f_{n}^{q-1}\mathop{}\!\mathrm{d}\lambda_{2}\xrightarrow[n\to\infty]{}0\,.

Thus, there exists NN\in\mathbb{N} such that

Ω2(N+α¯2c)+q1dλ2<γq1δ,\int_{\Omega_{2}}{(-N+\bar{\alpha}_{2}-c^{\dagger})}_{+}^{q-1}\mathop{}\!\mathrm{d}\lambda_{2}<\gamma^{q-1}\delta\,,

with the threshold δ>0\delta>0 from Assumption 3.6. Now assume that α¯1N\bar{\alpha}_{1}\leq-N λ1\lambda_{1}-a.e. on a set EΩ1E\subset\Omega_{1} with λ1(E)>0\lambda_{1}(E)>0. Then

Ω2(α¯1α¯2c)+q1dλ2Ω2(N+α¯2c)+q1dλ2<γq1δγq1dμ1dλ1\int_{\Omega_{2}}{(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c^{\dagger})}_{+}^{q-1}\mathop{}\!\mathrm{d}\lambda_{2}\leq\int_{\Omega_{2}}{(-N+\bar{\alpha}_{2}-c^{\dagger})}_{+}^{q-1}\mathop{}\!\mathrm{d}\lambda_{2}<\gamma^{q-1}\delta\leq\gamma^{q-1}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}}

$\lambda_{1}$-a.e. in $E$. Since $c\geq c^{\dagger}$ implies $(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c)_{+}\leq(\bar{\alpha}_{1}\oplus\bar{\alpha}_{2}-c^{\dagger})_{+}$, this is a contradiction to (3.4). Therefore $\bar{\alpha}_{1}>-N$ $\lambda_{1}$-a.e. in $\Omega_{1}$, which even implies that $(\bar{\alpha}_{1})_{-}\in L^{\infty}(\Omega_{1},\mathrm{d}\lambda_{1})$. Concerning $(\bar{\alpha}_{2})_{-}$, one may argue in exactly the same way to conclude that $(\bar{\alpha}_{2})_{-}\in L^{\infty}(\Omega_{2},\mathrm{d}\lambda_{2})$, too. ∎
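The marginal property stated in Theorem 3.13 can be observed in a small discrete analogue. The following is a minimal sketch under strong simplifying assumptions that are not part of the paper: $p=q=2$, finite sets $\Omega_{1},\Omega_{2}$, counting measures $\lambda_{1},\lambda_{2}$, and plain gradient descent as the solver. All names and the sample data are hypothetical.

```python
def solve_toy_dual(c, mu1, mu2, gamma=1.0, iters=20000):
    """Gradient descent on the discrete analogue of Lambda for q = 2:

        Lambda(a, b) = 1/2 * sum_ij max(a_i + b_j - c_ij, 0)**2
                       - gamma * sum_i a_i * mu1_i - gamma * sum_j b_j * mu2_j,

    with lambda_1, lambda_2 taken as counting measures (toy assumption).
    """
    n, m = len(mu1), len(mu2)
    a, b = [0.0] * n, [0.0] * m
    # By Gershgorin, 2 * max(n, m) bounds the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * max(n, m))
    for _ in range(iters):
        w = [[max(a[i] + b[j] - c[i][j], 0.0) for j in range(m)] for i in range(n)]
        grad_a = [sum(w[i]) - gamma * mu1[i] for i in range(n)]
        grad_b = [sum(w[i][j] for i in range(n)) - gamma * mu2[j] for j in range(m)]
        a = [a[i] - step * grad_a[i] for i in range(n)]
        b = [b[j] - step * grad_b[j] for j in range(m)]
    return a, b


def plan(a, b, c, gamma=1.0):
    """Candidate primal density (a_i + b_j - c_ij)_+ / gamma for q = 2."""
    return [[max(ai + bj - cij, 0.0) / gamma for bj, cij in zip(b, row)]
            for ai, row in zip(a, c)]


if __name__ == "__main__":
    x = [0.0, 1.0 / 3.0, 2.0 / 3.0, 1.0]
    c = [[(xi - yj) ** 2 for yj in x] for xi in x]
    mu1, mu2 = [0.1, 0.2, 0.3, 0.4], [0.25] * 4
    a, b = solve_toy_dual(c, mu1, mu2)
    P = plan(a, b, c)
    print("row sums:", [sum(row) for row in P])
    print("column sums:", [sum(P[i][j] for i in range(4)) for j in range(4)])
```

At a stationary point the gradient components vanish, which is exactly the discrete counterpart of (3.4): the row and column sums of the plan reproduce the prescribed marginals.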

Remark 3.14.

While it seems clear that proving a generalization of Theorem 3.13 to general Young’s functions or even quasi-Young’s functions Φ\Phi is likely to be complicated or even impossible without making strict assumptions on Φ\Phi, not even the existence result for optimizers in L1L^{1} can be generalized directly. The problem occurs in Lemma 3.10, which could not be extended to the case of Young’s functions or quasi-Young’s functions Φ\Phi in this work. That additional assumptions on Φ\Phi might be necessary for Lemma 3.10 to hold can be seen as follows.

In the general case, the function GG would be defined as

G:Lq(Ω,dλ)wΩ(γΦ~(wγ)wμ)dλ,G:L^{q}(\Omega,\mathop{}\!\mathrm{d}\lambda)\ni w\mapsto\int_{\Omega}\left(\gamma\tilde{\Phi}^{*}\left(\tfrac{w}{\gamma}\right)-w\mu\right)\mathop{}\!\mathrm{d}\lambda\in\mathbb{R}\,,

where μ:=d(μ1μ2)dλ\mu:=\tfrac{\mathop{}\!\mathrm{d}({\mu_{1}}\otimes{\mu_{2}})}{\mathop{}\!\mathrm{d}\lambda} and in the proof of Lemma 3.10 an inequality of the form

CwnLΦ~(Ω,dλ)γΩΦ~(wnγ)dλnC\lVert w_{n}\rVert_{L^{\tilde{\Phi}^{*}}(\Omega,\mathop{}\!\mathrm{d}\lambda)}\leq\gamma\int_{\Omega}\tilde{\Phi}^{*}\left(\tfrac{w_{n}}{\gamma}\right)\mathop{}\!\mathrm{d}\lambda\quad\forall n\in\mathbb{N}

would be necessary. For this to hold, it would suffice to know

Φ(fLΦ(Ω,dλ))CΩΦ(|f|)dλ\Phi\left(\lVert f\rVert_{L^{\Phi}(\Omega,\mathop{}\!\mathrm{d}\lambda)}\right)\leq C\int_{\Omega}\Phi(\lvert f\rvert)\mathop{}\!\mathrm{d}\lambda

for (quasi-)Young’s functions Φ\Phi, but this is not true in general as the Young’s function Φ(t)=max(t2,t3)\Phi(t)=\max(t^{2},t^{3}) and Ω=(0,1)\Omega=(0,1) shows. Indeed for f=a𝟙(0,b)f=a\mathds{1}_{(0,b)} for some a,b(0,1)a,b\in(0,1) one readily computes that fLΦ(Ω,d)=ab1/3\lVert f\rVert_{L^{\Phi}(\Omega,\,\mathop{}\!\mathrm{d}\mathcal{L})}=ab^{1/3} and the above mentioned inequality would be

a2b2/3Ca2b,a^{2}b^{2/3}\leq Ca^{2}b\,,

which is not possible for any constant $C$ independent of $b$. (We thank the user harfe from MathOverflow, who provided this counterexample to our question https://mathoverflow.net/q/333925.) This counterexample indicates that both the growth of $\Phi$ at infinity and at zero are important properties for this problem.
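The counterexample can be reproduced numerically. The following is a minimal sketch, assuming that the norm on $L^{\Phi}$ is the Luxemburg norm $\inf\{k>0 : \int_{\Omega}\Phi(\lvert f\rvert/k)\,\mathrm{d}\mathcal{L}\leq 1\}$, which is consistent with the value $ab^{1/3}$ stated above. The function names and the sample values of $a$ and $b$ are illustrative.

```python
def phi(t):
    """Young's function Phi(t) = max(t^2, t^3) for t >= 0."""
    return max(t ** 2, t ** 3)


def luxemburg_norm(a, b, tol=1e-13):
    """Luxemburg norm of f = a * 1_(0,b) on (0,1) w.r.t. Lebesgue measure:
    the smallest k > 0 with b * Phi(a / k) <= 1, found by bisection."""
    lo, hi = 0.0, 1.0
    while b * phi(a / hi) > 1.0:     # make sure hi is feasible
        hi *= 2.0
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if b * phi(a / mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


def ratio(a, b):
    """Phi(||f||) divided by the integral of Phi(|f|), i.e. the smallest
    constant C for which the inequality would hold for this particular f."""
    return phi(luxemburg_norm(a, b)) / (b * phi(a))


if __name__ == "__main__":
    a = 0.5
    for b in (1e-1, 1e-3, 1e-6):
        print(b, luxemburg_norm(a, b), a * b ** (1.0 / 3.0), ratio(a, b))
```

The ratio behaves like $b^{-1/3}$ and therefore grows without bound as $b\to 0$, so no constant $C$ independent of $b$ can exist.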

4 Γ\Gamma-Convergence

We return to the general setting, i.e. we only assume that Assumption 1.1 holds, and consider results on Γ\Gamma-convergence of problems related to problem (P). Recall from, e.g., [7], that a sequence (Fn)(F_{n}) of functionals Fn:X{}F_{n}:X\to\mathbb{R}\cup\{\infty\} on a metric space XX is said to Γ\Gamma-converge to a functional F:X{}F:X\to\mathbb{R}\cup\{\infty\}, written F=ΓlimnFnF=\operatorname*{\Gamma-lim}_{n\to\infty}F_{n}, if

  1. (i)

    for every sequence {xn}X\{x_{n}\}\subset X with xnxx_{n}\to x,

    F(x)lim infnFn(xn),F(x)\leq\liminf_{n\to\infty}F_{n}(x_{n}),
  2. (ii)

    for every xXx\in X, there is a sequence {xn}X\{x_{n}\}\subset X with xnxx_{n}\to x and

    F(x)lim supnFn(xn).F(x)\geq\limsup_{n\to\infty}F_{n}(x_{n}).

It is a straightforward consequence of this definition that if $F_{n}$ $\Gamma$-converges to $F$ and $x_{n}$ is a minimizer of $F_{n}$ for every $n\in\mathbb{N}$, then every cluster point of the sequence $(x_{n})$ is a minimizer of $F$. Furthermore, $\Gamma$-convergence is stable under perturbations by continuous functionals.

4.1 Continuous case

When considering arbitrary measures as marginals, their densities w.r.t. λi\lambda_{i} may not be in LΦ(Ω,dλi)L^{\Phi}(\Omega,\mathop{}\!\mathrm{d}\lambda_{i}) and by Theorem 3.4, problem (P) will not admit a solution in that case. One may therefore consider smoothed marginals μiδ\mu^{\delta}_{i} with dμiδdλiLΦ(Ωi,dλi)\tfrac{\mathop{}\!\mathrm{d}\mu_{i}^{\delta}}{\mathop{}\!\mathrm{d}\lambda_{i}}\in L^{\Phi}(\Omega_{i},\mathop{}\!\mathrm{d}\lambda_{i}), i=1,2i=1,2, converging to μ1\mu_{1} and μ2\mu_{2}, respectively and show that the regularized problem with these marginals Γ\Gamma-converges to the unregularized problem with the original marginals.

Let φ\varphi be a smooth, compactly supported, non-negative kernel on n\mathbb{R}^{n} with unit integral (w.r.t. the Lebesgue measure) and set

φr(x)=1rnφ(xr),Gr:=φrφr.\varphi_{r}(x)=\tfrac{1}{r^{n}}\varphi(\tfrac{x}{r})\,,\qquad G_{r}:=\varphi_{r}\otimes\varphi_{r}\,.

Since the marginals and transport plans will be smoothed by convolutions, the domains Ω1\Omega_{1} and Ω2\Omega_{2} will be extended slightly to take care of boundary effects. Hence, let Ω~1,Ω~2\tilde{\Omega}_{1},\,\tilde{\Omega}_{2} be compact supersets of Ω1,Ω2\Omega_{1},\,\Omega_{2}, respectively, such that

Ωi+sptφΩ~i,i=1,2,\Omega_{i}+\operatorname{spt}\varphi\subseteq\tilde{\Omega}_{i},\quad i=1,2\,,

which is large enough to contain the supports of the smoothed marginals μir\mu_{i}^{r}, i=1,2i=1,2 for r1r\leq 1 (and the width of the convolution kernels will be assumed to be small enough for this in the following). For a function (or measure) ff on Ω1\Omega_{1} denote by f~\tilde{f} the extension of ff onto Ω~1\tilde{\Omega}_{1} by zero (and analogously for functions and measures on Ω2\Omega_{2} and Ω1×Ω2\Omega_{1}\times\Omega_{2}). Let λ^i\hat{\lambda}_{i} be the extension of λi\lambda_{i} onto Ω~i\tilde{\Omega}_{i} by the Lebesgue measure and λ^=λ^1λ^2\hat{\lambda}=\hat{\lambda}_{1}\otimes\hat{\lambda}_{2}. Let c^\hat{c} be a continuous extension of cc onto Ω~1×Ω~2\tilde{\Omega}_{1}\times\tilde{\Omega}_{2} and let

Fγ[π]=Ω~1×Ω~2c^dπ+γΩ~1×Ω~2Φ(dπdλ^)dλ^,F_{\gamma}[\pi]=\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}\hat{c}\mathop{}\!\mathrm{d}\pi+\gamma\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\hat{\lambda}})\mathop{}\!\mathrm{d}\hat{\lambda}\,,

where we set the second integral to ++\infty, if π≪̸λ^\pi\not\ll\hat{\lambda}, and

Eγν1,ν2[π]\displaystyle E_{\gamma}^{{\nu_{1}},{\nu_{2}}}[\pi] ={Fγ[π]if 0π𝒫(Ω~1×Ω~2),(Pi)#π=νi,i=1,2,else,\displaystyle=\begin{cases}F_{\gamma}[\pi]&\text{if }0\leq\pi\in\mathcal{P}(\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}),\,(P_{i})_{\#}{\pi}={\nu_{i}},\,i=1,2\,,\\ \infty&\text{else,}\end{cases}

for νi𝔐+(Ωi)\nu_{i}\in\mathfrak{M}_{+}(\Omega_{i}), i=1,2i=1,2.

First, we state an auxiliary result ensuring that the marginal constraints are preserved by convolution. For simplicity, we state it for measures on $\mathbb{R}^{n}$ (but we could restrict everything to the respective domains $\Omega_{i}$, $\tilde{\Omega}_{i}$). Note that for $\nu\in\mathfrak{M}(\Omega)$ the expression $\varphi_{r}*\nu$ can, thanks to the smoothness of $\varphi_{r}$, be interpreted either as a smooth function or as a measure in $\mathfrak{M}$ with that smooth function as density. Here, we choose to interpret $\varphi_{r}*\nu$ as a measure. A proof of the following lemma is given in Appendix B.

Lemma 4.1.

Let μ1,μ2𝒫(n)\mu_{1},\mu_{2}\in\mathcal{P}(\mathbb{R}^{n}) and let π𝒫(n×n)\pi\in\mathcal{P}(\mathbb{R}^{n}\times\mathbb{R}^{n}) with (Pi)#π=μi(P_{i})_{\#}{\pi}=\mu_{i}, i=1,2i=1,2. Let μiδ:=φδμi,i=1,2\mu_{i}^{\delta}:=\varphi_{\delta}*\mu_{i},\,i=1,2 and πδ:=Gδπ\pi_{\delta}:=G_{\delta}*\pi. Then (Pi)#πδ=μiδ(P_{i})_{\#}{\pi_{\delta}}=\mu_{i}^{\delta} for i=1,2i=1,2.
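A discrete analogue of Lemma 4.1 can be verified directly. The sketch below is an illustration under assumptions, not the proof from Appendix B: measures are replaced by finite arrays, $\varphi_\delta$ by a hypothetical nonnegative kernel summing to one, and convolution by full discrete convolution.

```python
def convolve_1d(signal, kernel):
    """Full discrete convolution of two lists."""
    out = [0.0] * (len(signal) + len(kernel) - 1)
    for i, s in enumerate(signal):
        for k, g in enumerate(kernel):
            out[i + k] += s * g
    return out


def convolve_2d_separable(plan, kernel):
    """Full convolution of a 2D array with the product kernel (kernel x kernel)."""
    # Convolve along the second axis, then along the first axis.
    rows = [convolve_1d(row, kernel) for row in plan]
    cols = [convolve_1d(list(col), kernel) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]  # transpose back to row-major order


def marginals(plan):
    """Row sums (first marginal) and column sums (second marginal)."""
    first = [sum(row) for row in plan]
    second = [sum(col) for col in zip(*plan)]
    return first, second


# Hypothetical toy data: a plan with total mass one and a kernel with unit sum.
kernel = [0.25, 0.5, 0.25]
plan = [[0.10, 0.00, 0.20],
        [0.00, 0.30, 0.00],
        [0.15, 0.05, 0.20]]

mu1, mu2 = marginals(plan)
smoothed_first, smoothed_second = marginals(convolve_2d_separable(plan, kernel))
expected_first = convolve_1d(mu1, kernel)
expected_second = convolve_1d(mu2, kernel)
```

The marginals of the smoothed plan agree with the smoothed marginals because the kernel sums to one in each variable. This is the discrete counterpart of the unit-integral assumption on $\varphi$.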

Theorem 4.2 (Γ\Gamma-convergence for smoothed marginals).

Let Assumption 1.1 hold and let (γ,δ)Φ0(\gamma,\delta)\xrightarrow{\Phi}0 denote

γ0,δ0,γΦ+(1δ2n)0.\gamma\to 0,\qquad\delta\to 0,\qquad\gamma{\Phi}_{+}\left(\frac{1}{{\delta}^{2n}}\right)\to 0\,.

Define the smoothed marginals as μiδ=φδμ~i{\mu_{i}^{\delta}}=\varphi_{\delta}*\tilde{\mu}_{i} for i=1,2i=1,2. Then it holds that

Γlim(γ,δ)Φ0Eγμ1δ,μ2δ=E0μ1,μ2\operatorname*{\Gamma-lim}_{(\gamma,\delta)\xrightarrow{\Phi}0}E_{\gamma}^{{\mu_{1}^{\delta}},{\mu_{2}^{\delta}}}=E_{0}^{{\mu_{1}},{\mu_{2}}}

with respect to weak-* convergence in 𝔐(Ω1×Ω2)\mathfrak{M}({\Omega}_{1}\times{\Omega}_{2}). Moreover, if γ,δ0{\gamma},\delta\to 0 are chosen such that

γdμ1δdλ^1LΦ(Ω1,dλ^1)orγdμ2δdλ^2LΦ(Ω2,dλ^2),\gamma\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}^{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}_{1}}\big{\rVert}_{L^{\Phi}(\Omega_{1},\mathop{}\!\mathrm{d}\hat{\lambda}_{1})}\to\infty\quad\text{or}\quad\gamma\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\mu_{2}^{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}_{2}}\big{\rVert}_{L^{\Phi}(\Omega_{2},\mathop{}\!\mathrm{d}\hat{\lambda}_{2})}\to\infty\,,

then Eγμ1δ,μ2δE^{{\mu_{1}^{\delta}},{\mu_{2}^{\delta}}}_{\gamma} does not have a finite Γ{\Gamma}-limit. More precisely, even for feasible πδ\pi_{\delta} (i.e. with marginals μiδ\mu_{i}^{\delta}) it holds that

lim(γ,δ)Φ0Eγμ1δ,μ2δ[πδ]=.\lim_{(\gamma,\delta)\xrightarrow{\Phi}0}E^{{\mu_{1}^{\delta}},{\mu_{2}^{\delta}}}_{\gamma}[\pi_{\delta}]=\infty\,.
Proof.

This proof follows the outline of the proof in [10, Theorem 5.1].

  1. i)

    lim inf\liminf-condition: Let πδπ\pi_{\delta}\xrightharpoonup{*}{\pi}, then limδ0F0[πδ]=F0[π~]\lim_{\delta\to 0}F_{0}[\pi_{\delta}]=F_{0}[\tilde{\pi}] due to c^\hat{c} being continuous and bounded. Furthermore,

    Ω~1×Ω~2Φ(dπˇdλ^)dλ^Cλ^(Ω~1×Ω~2)\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\check{\pi}}{\mathop{}\!\mathrm{d}\hat{\lambda}})\mathop{}\!\mathrm{d}\hat{\lambda}\geq-C\cdot\hat{\lambda}(\tilde{\Omega}_{1}\times\tilde{\Omega}_{2})

    for some C>0C>0 only dependent on Φ\Phi for any πˇ0\check{\pi}\geq 0. Thus,

    F0[π~]=lim(γ,δ)Φ0F0[πδ]γCλ^(Ω~1×Ω~2)lim inf(γ,δ)Φ0Fγ[πδ].F_{0}[\tilde{\pi}]=\lim_{(\gamma,\delta)\xrightarrow{\Phi}0}F_{0}[\pi_{\delta}]-\gamma C\cdot\hat{\lambda}\left(\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}\right)\leq\liminf_{(\gamma,\delta)\xrightarrow{\Phi}0}F_{\gamma}[\pi_{\delta}]\,.

    Finally, the pushforward operator is weak-* continuous, which implies that the marginal constraints are preserved under weak-* convergence of πδ\pi_{\delta}, μ1δ{\mu_{1}^{\delta}}, and μ2δ{\mu_{2}^{\delta}} (note that μiδμi\mu_{i}^{\delta}\xrightharpoonup{*}\mu_{i}).

  2. ii)

    lim sup\limsup-condition: It suffices to consider a recovery sequence for π𝒫(Ω1×Ω2){\pi}\in\mathcal{P}({\Omega}_{1}\times{\Omega}_{2}), because for π𝒫(Ω~1×Ω~2)𝒫(Ω1×Ω2){\pi}\in\mathcal{P}(\tilde{\Omega}_{1}\times\tilde{\Omega}_{2})\setminus\mathcal{P}({\Omega}_{1}\times{\Omega}_{2}) the marginal conditions for μ1{\mu_{1}} and μ2{\mu_{2}} can never be satisfied.

    If E0μ~1,μ~2[π~]=E_{0}^{{\tilde{\mu}_{1}},{\tilde{\mu}_{2}}}[\tilde{\pi}]=\infty, then Eγμ~1,μ~2[π~]=E_{\gamma}^{{\tilde{\mu}_{1}},{\tilde{\mu}_{2}}}[\tilde{\pi}]=\infty for every γ{\gamma} and the lim sup\limsup condition holds trivially. Let therefore E0μ~1,μ~2[π~]E_{0}^{{\tilde{\mu}_{1}},{\tilde{\mu}_{2}}}[\tilde{\pi}] be finite and set πδ=Gδπ~\pi_{\delta}=G_{\delta}*\tilde{\pi}. Then 0πδ0\leq\pi_{\delta} and πδπ~\pi_{\delta}\xrightharpoonup{*}\tilde{\pi} as well as (Pi)#πδ=μiδ(P_{i})_{\#}{\pi_{\delta}}={\mu_{i}^{\delta}} for i=1,2i=1,2, by Lemma 4.1. Finally, by Young’s convolution inequality,

    dπδdλ^GδL(Ω,dλ^)dπ~dλ^L1(Ω,dλ^)Cδ2n\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}}\leq\lVert G_{\delta}\rVert_{L^{\infty}(\Omega,\mathop{}\!\mathrm{d}\hat{\lambda})}\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\tilde{\pi}}{\mathop{}\!\mathrm{d}\hat{\lambda}}\big{\rVert}_{L^{1}(\Omega,\mathop{}\!\mathrm{d}\hat{\lambda})}\leq\frac{C}{{\delta}^{2n}}

    for some constant C>0C>0. Thus,

    Ω~1×Ω~2Φ(dπδdλ^)dλ^=Ω~1×Ω~2Φ(dπδdλ^)dλ^Ω~1×Ω~2Φ+(dπδdλ^)dλ^λ^(Ω~1×Ω~2)γΦ+(Cδ2n),\begin{split}\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}})\mathop{}\!\mathrm{d}\hat{\lambda}&=\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}\Phi(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}})\mathop{}\!\mathrm{d}\hat{\lambda}\leq\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}{\Phi}_{+}(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}})\mathop{}\!\mathrm{d}\hat{\lambda}\\ &\leq\hat{\lambda}\left(\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}\right)\gamma{\Phi}_{+}\left(\frac{C}{{\delta}^{2n}}\right)\,,\end{split} (4.1)

    and the right hand side vanishes for (γ,δ)Φ0(\gamma,\delta)\xrightarrow{\Phi}0 by the assumption on the (coupled) convergence of γ\gamma and δ\delta. Therefore,

    E0μ~1,μ~2[π~]=lim(γ,δ)Φ0(F0[πδ]+λ^(Ω~1×Ω~2)γΦ+(Cδ2n))lim(γ,δ)Φ0Fγ[πδ].E_{0}^{{\tilde{\mu}_{1}},{\tilde{\mu}_{2}}}[\tilde{\pi}]=\lim_{(\gamma,\delta)\xrightarrow{\Phi}0}\left(F_{0}[\pi_{\delta}]+\hat{\lambda}\left(\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}\right)\gamma{\Phi}_{+}\left(\frac{C}{{\delta}^{2n}}\right)\!\right)\geq\lim_{(\gamma,\delta)\xrightarrow{\Phi}0}F_{\gamma}[\pi_{\delta}]\,.
  3. iii)

    For the second statement, recall from Lemma 2.9 that

    γdμ1δdλ^1LΦ(Ω1,dλ^1)γmax(1,λ^2(Ω2))dπδdλ^LΦ(Ω,dλ^),γdμ2δdλ^2LΦ(Ω2,dλ^2)γmax(1,λ^1(Ω1))dπδdλ^LΦ(Ω,dλ^).\begin{split}{\gamma}\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}^{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}_{1}}\big{\rVert}_{L^{\Phi}(\Omega_{1},\mathop{}\!\mathrm{d}\hat{\lambda}_{1})}&\leq\gamma\max\left(1,\hat{\lambda}_{2}\left(\Omega_{2}\right)\right)\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}}\big{\rVert}_{L^{\Phi}(\Omega,\mathop{}\!\mathrm{d}\hat{\lambda})}\,,\\ {\gamma}\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\mu_{2}^{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}_{2}}\big{\rVert}_{L^{\Phi}(\Omega_{2},\mathop{}\!\mathrm{d}\hat{\lambda}_{2})}&\leq\gamma\max\left(1,\hat{\lambda}_{1}\left(\Omega_{1}\right)\right)\big{\lVert}\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\hat{\lambda}}\big{\rVert}_{L^{\Phi}(\Omega,\mathop{}\!\mathrm{d}\hat{\lambda})}\,.\end{split}

    By Lemma 2.11, this immediately yields Fγ[πδ]F_{\gamma}[\pi_{\delta}]\to\infty, and the assertion follows.∎

In the case λ=×\lambda=\mathcal{L}\times\mathcal{L}, the assumption γΦ+(1δ2n)0\gamma{\Phi}_{+}\left(\frac{1}{{\delta}^{2n}}\right)\to 0 is much stronger than necessary for some Young’s functions. For example consider Φ(t)=tp/p\Phi(t)=\nicefrac{{t^{p}}}{{p}} with p>1p>1 or Φ(t)=tlogt\Phi(t)=t\log t. In those cases the condition

γδ2nΦ+(1δ2n)0{\gamma}{\delta^{2n}}{\Phi}_{+}\left(\frac{1}{{\delta}^{2n}}\right)\to 0

suffices, as the following result states. For Φ(t)=tlogt\Phi(t)=t\log t this gives exactly the result in [10, Theorem 5.1].
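To make the difference between the two coupling conditions concrete, the following worked computation evaluates both for the two examples. It assumes $\delta<1$ and that $\Phi_{+}(t)=\Phi(t)$ for large $t$, which is how we read the positive part here.

```latex
% Assumptions: 0 < \delta < 1 and \Phi_+(t) = \Phi(t) for large t.
\begin{align*}
\Phi(t)=\tfrac{t^{p}}{p}:\quad
 & \gamma\,\Phi_{+}\!\left(\delta^{-2n}\right)=\tfrac{\gamma}{p}\,\delta^{-2np},
 & \gamma\,\delta^{2n}\,\Phi_{+}\!\left(\delta^{-2n}\right)&=\tfrac{\gamma}{p}\,\delta^{-2n(p-1)},\\
\Phi(t)=t\log t:\quad
 & \gamma\,\Phi_{+}\!\left(\delta^{-2n}\right)=2n\,\gamma\,\delta^{-2n}\log\tfrac{1}{\delta},
 & \gamma\,\delta^{2n}\,\Phi_{+}\!\left(\delta^{-2n}\right)&=2n\,\gamma\log\tfrac{1}{\delta}.
\end{align*}
```

In the entropic case the weaker condition therefore reduces to $\gamma\log(1/\delta)\to 0$, while the condition of Theorem 4.2 would additionally require $\gamma$ to compensate the factor $\delta^{-2n}$.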

Corollary 4.3.

Let λ=×\lambda=\mathcal{L}\times\mathcal{L} and let Φ\Phi be a quasi-Young’s function such that t1Φ(t)t^{-1}\Phi(t) is monotone. Then it suffices to assume

γδ2nΦ+(1δ2n)0{\gamma}{\delta^{2n}}{\Phi}_{+}\left(\frac{1}{{\delta}^{2n}}\right)\to 0

in Theorem 4.2.

Proof.

A refinement of estimate (4.1) can be given. Using the monotonicity of t1Φ(t)t^{-1}\Phi(t) we obtain

Ω~1×Ω~2Φ(dπδd)d\displaystyle\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}\Phi(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\mathcal{L}})\mathop{}\!\mathrm{d}\mathcal{L} Ω~1×Ω~2Φ+(dπδd)d\displaystyle\leq\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}{\Phi}_{+}(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\mathcal{L}})\mathop{}\!\mathrm{d}\mathcal{L}
=Ω~1×Ω~2dπδd(dπδd)1Φ+(dπδd)d\displaystyle=\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\mathcal{L}}\left(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\mathcal{L}}\right)^{-1}{\Phi}_{+}(\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\mathcal{L}})\mathop{}\!\mathrm{d}\mathcal{L}
δ2nCΦ+(Cδ2n)Ω~1×Ω~2dπδdd\displaystyle\leq\frac{\delta^{2n}}{C}{\Phi}_{+}\left(\frac{C}{\delta^{2n}}\right)\int_{\tilde{\Omega}_{1}\times\tilde{\Omega}_{2}}\tfrac{\mathop{}\!\mathrm{d}\pi_{\delta}}{\mathop{}\!\mathrm{d}\mathcal{L}}\mathop{}\!\mathrm{d}\mathcal{L}
=δ2nCΦ+(Cδ2n)\displaystyle=\frac{\delta^{2n}}{C}{\Phi}_{+}\left(\frac{C}{\delta^{2n}}\right)

where again the right hand side vanishes by assumption. The assertion follows as in Theorem 4.2. ∎

4.2 Discretized problems

Here we describe a discretization of problem (P) and show two approximation results:

  1. i)

    Γ\Gamma-convergence of the discretized and regularized problem towards the continuous, regularized problem (P)

  2. ii)

    Γ\Gamma-convergence of the discretized and regularized problem towards the continuous, unregularized problem (OT)

We recall the problem data: Marginals μi𝒫(Ωi)\mu_{i}\in\mathcal{P}(\Omega_{i}) and finite measures λi𝔐+(Ωi)\lambda_{i}\in\mathfrak{M}_{+}(\Omega_{i}) with μiλi\mu_{i}\ll\lambda_{i} for i=1,2i=1,2 and a continuous and positive cost function cc on Ω\Omega. We have λ=λ1λ2\lambda=\lambda_{1}\otimes\lambda_{2} and aim to discretize the problem

minπ𝒫(Ω)Ωcdπdλ+γΦ(dπdλ)dλ,s.t.Ω2dπdλdλ2=dμ1dλ1λ1-a.eΩ1dπdλdλ1=dμ2dλ2λ2-a.e.\begin{split}\min_{\pi\in\mathcal{P}(\Omega)}\int_{\Omega}c\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}+\gamma{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda,\quad\text{s.t.}\quad\int_{\Omega_{2}}\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda_{2}&=\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}}\quad\text{$\lambda_{1}$-a.e}\\ \int_{\Omega_{1}}\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda_{1}&=\tfrac{\mathop{}\!\mathrm{d}\mu_{2}}{\mathop{}\!\mathrm{d}\lambda_{2}}\quad\text{$\lambda_{2}$-a.e}.\end{split} (4.2)

We do a Galerkin discretization with piecewise constant functions. For kk\in\mathbb{N} let (Qji,k)(Q^{i,k}_{j}) be a sequence of finite partitions of Ωi\Omega_{i} such that for every jj there is an ll with Qji,k+1Qli,kQ_{j}^{i,k+1}\subset Q_{l}^{i,k} and such that (Qi1,k)(Q^{1,k}_{i}), (Qj2,k)(Q^{2,k}_{j}), and Iijk:=Qi1,k×Qj2,kI^{k}_{ij}:=Q^{1,k}_{i}\times Q^{2,k}_{j} satisfy the following assumption:

Assumption 4.4.

Let AΩA\subset\Omega be a Borel set and ε>0\varepsilon>0. Then there exists some KK\in\mathbb{N} such that for all KkK\leq k\in\mathbb{N} the sets A+kA^{k}_{+}, AkA^{k}_{-} defined by

Ak\displaystyle A^{k}_{-} :={(i,j)|Ii,jkA}Ii,jk,\displaystyle:=\bigcup_{\left\{(i,j)\,\middle|\,I^{k}_{i,j}\subseteq A\right\}}I^{k}_{i,j}\,, A+k\displaystyle A^{k}_{+} :={(i,j)|Ii,jkA}Ii,jk\displaystyle:=\bigcup_{\left\{(i,j)\,\middle|\,I^{k}_{i,j}\cap A\neq\emptyset\right\}}I^{k}_{i,j}

satisfy

ν(A+k)ν(Ak)\displaystyle\nu(A^{k}_{+})-\nu(A^{k}_{-}) <ε,\displaystyle<\varepsilon\,, (4.3)

for all ν𝔐+(Ω)\nu\in\mathfrak{M}_{+}(\Omega).

Remark 4.5.

If Assumption 4.4 is fulfilled for λ=λ1λ2\lambda=\lambda_{1}\otimes\lambda_{2}, condition (4.3) holds analogously for λ1\lambda_{1} and λ2\lambda_{2}, which can be seen as follows: For AiΩiA^{i}\subset\Omega_{i}, i=1,2i=1,2 let A+i,kA^{i,k}_{+} and Ai,kA^{i,k}_{-} be defined analogously based on Qji,kQ^{i,k}_{j}. It holds

A1,k×Ω2\displaystyle A^{1,k}_{-}\times\Omega_{2} =(A1×Ω2)kΩ\displaystyle=(A^{1}\times\Omega_{2})^{k}_{-}\subset\Omega
A+1,k×Ω2\displaystyle A^{1,k}_{+}\times\Omega_{2} =(A1×Ω2)+kΩ.\displaystyle=(A^{1}\times\Omega_{2})^{k}_{+}\subset\Omega\,.

Thus,

λ1(A+1,k)λ1(A1,k)\displaystyle\lambda_{1}(A^{1,k}_{+})-\lambda_{1}(A^{1,k}_{-}) =(P1)#λ(A+1,k)(P1)#λ(A1,k)=λ(A+1,k×Ω2)λ(A1,k×Ω2)\displaystyle=(P_{1})_{\#}\lambda(A^{1,k}_{+})-(P_{1})_{\#}\lambda(A^{1,k}_{-})=\lambda(A^{1,k}_{+}\times\Omega_{2})-\lambda(A^{1,k}_{-}\times\Omega_{2})
=λ((A1×Ω2)+k)λ((A1×Ω2)k)<ε\displaystyle=\lambda((A^{1}\times\Omega_{2})^{k}_{+})-\lambda((A^{1}\times\Omega_{2})^{k}_{-})<\varepsilon

by (4.3) and the argument holds analogously for λ2\lambda_{2}.

Assumption 4.4 yields the following auxiliary result about piecewise constant approximation of measures.

Lemma 4.6.

Let Assumption 4.4 hold. Let ν𝔐+(Ω)\nu\in\mathfrak{M}_{+}(\Omega), νi𝔐+(Ωi)\nu^{i}\in\mathfrak{M}_{+}(\Omega_{i}), i=1,2i=1,2 and define

νk:=i,jν(Ii,jk)λ(Ii,jk)𝟙Ii,jk,νki:=jνi(Qji,k)λi(Qji,k)𝟙Qji,k,i=1,2.\nu_{k}:=\sum_{i,j}\frac{\nu(I^{k}_{i,j})}{\lambda(I^{k}_{i,j})}\mathds{1}_{I^{k}_{i,j}}\,,\qquad\nu^{i}_{k}:=\sum_{j}\frac{\nu^{i}(Q^{i,k}_{j})}{\lambda_{i}(Q^{i,k}_{j})}\mathds{1}_{Q^{i,k}_{j}}\,,i=1,2\,.

Then νkλν\nu_{k}\lambda\xrightharpoonup{*}\nu and νkiλiνi\nu^{i}_{k}\lambda_{i}\xrightharpoonup{*}\nu^{i}.

A proof of Lemma 4.6 can be found in Appendix C. We now set

μ1,ik\displaystyle\mu_{1,i}^{k} :=μ1(Qi1,k),\displaystyle:=\mu_{1}(Q^{1,k}_{i}), μ2,jk\displaystyle\mu_{2,j}^{k} :=μ2(Qj2,k)\displaystyle:=\mu_{2}(Q^{2,k}_{j})
λ1,ik\displaystyle\lambda_{1,i}^{k} :=λ1(Qi1,k),\displaystyle:=\lambda_{1}(Q^{1,k}_{i}), λ2,jk\displaystyle\lambda_{2,j}^{k} :=λ2(Qj2,k).\displaystyle:=\lambda_{2}(Q^{2,k}_{j})\,.

Then by Lemma 4.6

iμ1,ikλ1,ik𝟙Qi1,kλ1\displaystyle\sum_{i}\tfrac{\mu_{1,i}^{k}}{\lambda_{1,i}^{k}}\mathds{1}_{Q^{1,k}_{i}}\lambda_{1} μ1\displaystyle\xrightharpoonup{*}\mu_{1} jμ2,jkλ2,jk𝟙Qj2,kλ2\displaystyle\sum_{j}\tfrac{\mu_{2,j}^{k}}{\lambda_{2,j}^{k}}\mathds{1}_{Q^{2,k}_{j}}\lambda_{2} μ2.\displaystyle\xrightharpoonup{*}\mu_{2}.

Note that division by zero is not a problem here, since λ1\lambda_{1} and λ2\lambda_{2} were assumed to have full support and hence λ1,ik,λ2,jk0\lambda^{k}_{1,i},\lambda^{k}_{2,j}\neq 0 for all i,j,ki,j,k. We define the finite-dimensional spaces

𝒱1,k:={𝟙Qi1,ki},𝒱2,k:={𝟙Qj2,kj},𝒱k=𝒱1,k𝒱2,k={𝟙Iijki,j}.\mathcal{V}^{1,k}:=\langle{\{\mathds{1}_{Q^{1,k}_{i}}\mid i\}}\rangle,\quad\mathcal{V}^{2,k}:=\langle{\{\mathds{1}_{Q^{2,k}_{j}}\mid j\}}\rangle,\quad\mathcal{V}^{k}=\mathcal{V}^{1,k}\otimes\mathcal{V}^{2,k}=\langle{\{\mathds{1}_{I^{k}_{ij}}\mid i,j\}}\rangle\,.

The discrete problem is then one in 𝒱k\mathcal{V}^{k}, namely

\inf_{\pi\in\mathcal{V}^{k}}\int_{\Omega}c\pi\mathop{}\!\mathrm{d}\lambda+\gamma\int_{\Omega}{\Phi}_{\infty}(\pi)\mathop{}\!\mathrm{d}\lambda,\quad\text{s.t.}\quad\int_{\Omega_{2}}\pi\mathop{}\!\mathrm{d}\lambda_{2}=\sum_{i}\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\mathds{1}_{Q^{1,k}_{i}},\quad\int_{\Omega_{1}}\pi\mathop{}\!\mathrm{d}\lambda_{1}=\sum_{j}\tfrac{\mu^{k}_{2,j}}{\lambda^{k}_{2,j}}\mathds{1}_{Q^{2,k}_{j}} (PD)

If we discretize the sought-after density dπdλ\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda} by

dπdλ=ijpij𝟙Iijk𝒱k,\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}=\sum_{ij}p_{ij}\mathds{1}_{I^{k}_{ij}}\in\mathcal{V}^{k}\,,

we can derive the optimization problem for the unknown coefficients pijp_{ij} as follows: The objective function is

Ωcdπdλ+γΦ(dπdλ)dλ=ijIijkcdλpij+γΦ(pij)λ1,ikλ2,jk.\int_{\Omega}c\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}+\gamma{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda=\sum_{ij}\int_{I^{k}_{ij}}c\mathop{}\!\mathrm{d}\lambda\;p_{ij}+\gamma{\Phi}_{\infty}(p_{ij})\lambda^{k}_{1,i}\lambda^{k}_{2,j}.

The first marginal constraint is given by

Ω2ijpij𝟙Qi1,k(x1)𝟙Qj2,k(x2)dλ2(x2)=iμ1,ikλ1,ik𝟙Qi1,k(x1)\displaystyle\int_{\Omega_{2}}\sum_{ij}p_{ij}\mathds{1}_{Q^{1,k}_{i}}(x_{1})\mathds{1}_{Q^{2,k}_{j}}(x_{2})\mathop{}\!\mathrm{d}\lambda_{2}(x_{2})=\sum_{i}\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\mathds{1}_{Q^{1,k}_{i}}(x_{1})

and this leads to the equation

jpijλ2,jk=μ1,ikλ1,ik.\sum_{j}p_{ij}\lambda^{k}_{2,j}=\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\,.

A similar equation can be derived for the other marginal constraint. With

cijk:=1λ1,ikλ2,jkIijkcdλ,c^{k}_{ij}:=\tfrac{1}{\lambda^{k}_{1,i}\lambda^{k}_{2,j}}\int_{I^{k}_{ij}}c\mathop{}\!\mathrm{d}\lambda\,,

we arrive at the fully discretized problem

minpij(cijkpij+γΦ(pij))λ1,ikλ2,jk,jpijλ2,jk=μ1,ikλ1,ikipijλ1,ik=μ2,jkλ2,jk.\begin{split}\min_{p}\sum_{ij}\Big{(}c^{k}_{ij}p_{ij}+\gamma{\Phi}_{\infty}(p_{ij})\Big{)}\lambda^{k}_{1,i}\lambda^{k}_{2,j},\quad\sum_{j}p_{ij}\lambda^{k}_{2,j}&=\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\\ \sum_{i}p_{ij}\lambda^{k}_{1,i}&=\tfrac{\mu^{k}_{2,j}}{\lambda^{k}_{2,j}}.\end{split}
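For the entropic choice $\Phi(t)=t\log t$, the fully discretized problem above can be solved by alternating scaling (Sinkhorn-type) iterations. The following is a minimal sketch under that assumption. It is one possible method and not necessarily the one used in the cited references; the function name and the toy data are hypothetical.

```python
import math


def sinkhorn_discretized(cost, mu1, mu2, lam1, lam2, gamma, iterations=500):
    """Scaling iterations for the discretized problem with Phi(t) = t*log(t).

    In terms of the cell masses P[i][j] = p[i][j] * lam1[i] * lam2[j] the
    constraints read sum_j P[i][j] = mu1[i] and sum_i P[i][j] = mu2[j], and
    the optimality conditions give p[i][j] = u[i] * exp(-cost[i][j]/gamma) * v[j].
    Returns the density coefficients p[i][j].
    """
    m, n = len(mu1), len(mu2)
    gibbs = [[math.exp(-cost[i][j] / gamma) for j in range(n)] for i in range(m)]
    kernel = [[lam1[i] * lam2[j] * gibbs[i][j] for j in range(n)] for i in range(m)]
    u = [1.0] * m
    v = [1.0] * n
    for _ in range(iterations):
        # Alternately enforce the first and the second marginal constraint.
        u = [mu1[i] / sum(kernel[i][j] * v[j] for j in range(n)) for i in range(m)]
        v = [mu2[j] / sum(kernel[i][j] * u[i] for i in range(m)) for j in range(n)]
    return [[u[i] * gibbs[i][j] * v[j] for j in range(n)] for i in range(m)]


# Hypothetical toy data: 4 equal cells per axis on [0,1], Lebesgue reference measures.
cells = 4
h = 1.0 / cells
centers = [(i + 0.5) * h for i in range(cells)]
lam1 = [h] * cells
lam2 = [h] * cells
mu1 = [0.1, 0.2, 0.3, 0.4]
mu2 = [0.4, 0.3, 0.2, 0.1]
# Exact cell average of c(x, y) = (x - y)^2 over a product of two cells of width h.
cost = [[(x - y) ** 2 + h * h / 6.0 for y in centers] for x in centers]
p = sinkhorn_discretized(cost, mu1, mu2, lam1, lam2, gamma=0.2)
```

The sketch works with the cell masses $p_{ij}\lambda^{k}_{1,i}\lambda^{k}_{2,j}$ so that the constraints take the familiar form of prescribed row and column sums. For other Young's functions a general convex solver would be needed instead.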

The fully discretized problem is a finite-dimensional convex minimization problem with linear constraints. Several general methods could be used to solve this problem numerically, see, e.g., [4, 27, 20, 21]. The following theorem guarantees that the discretized and regularized problem converges to the continuous regularized problem (P). The proof makes use of Jensen’s inequality (see, e.g., [5, Theorem 2.12.19]), which we recall for the convenience of the reader: Let $K,\Omega\subset\mathbb{R}^{d}$, $M\subset K$ and $\nu\in\mathfrak{M}_{+}(\mathbb{R}^{d})$ with $\nu(\Omega)<\infty$. Let $\varphi:K\to\mathbb{R}$ be convex and let $f:\Omega\to M$ be a $\nu$-integrable function such that $\varphi(f)$ is $\nu$-integrable. Then,

ν(Ω)φ(1ν(Ω)Ωfdν)Ωφ(f)dν.\nu(\Omega)\cdot\varphi\Big{(}\frac{1}{\nu(\Omega)}\int_{\Omega}f\mathop{}\!\mathrm{d}\nu\Big{)}\leq\int_{\Omega}\varphi(f)\mathop{}\!\mathrm{d}\nu\,.
Theorem 4.7.

Let Assumptions 1.1 and 4.4 hold. Let c𝒞(Ω,[0,))c\in\mathcal{C}(\Omega,[0,\infty)), Φ\Phi be a quasi-Young’s function, and let γ>0\gamma>0. Then the minimization problem (PD) Γ\Gamma-converges to problem (4.2) w.r.t. weak-* convergence in 𝔐(Ω)\mathfrak{M}(\Omega) as kk\to\infty.

Proof.

We define $F,F_{k}:\mathfrak{M}(\Omega)\to\mathbb{R}\cup\{\infty\}$ via

F_{k}(\pi)=\begin{cases}\int_{\Omega}c\mathop{}\!\mathrm{d}\pi+\gamma\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda,&\tfrac{\mathop{}\!\mathrm{d}(P_{1})_{\#}\pi}{\mathop{}\!\mathrm{d}\lambda_{1}}=\sum_{i}\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\mathds{1}_{Q^{1,k}_{i}}\,,\tfrac{\mathop{}\!\mathrm{d}(P_{2})_{\#}\pi}{\mathop{}\!\mathrm{d}\lambda_{2}}=\sum_{j}\tfrac{\mu^{k}_{2,j}}{\lambda^{k}_{2,j}}\mathds{1}_{Q^{2,k}_{j}}\,,\\ &0\leq\pi\ll\lambda,\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\in\mathcal{V}^{k}\\ \infty,&\text{else}\end{cases}

and

F(π)={Ωcdπ+γΩΦ(dπdλ)dλ,0πλ,(Pi)#π=μi,i=1,2,else.F(\pi)=\begin{cases}\int_{\Omega}c\mathop{}\!\mathrm{d}\pi+\gamma\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda,&0\leq\pi\ll\lambda,\,(P_{i})_{\#}\pi=\mu_{i}\,,i=1,2\\ \infty,&\text{else.}\end{cases}

Given an arbitrary π𝔐(Ω)\pi\in\mathfrak{M}(\Omega), we now check the two conditions for Γ\Gamma-convergence.

  1. i)

    lim inf\liminf-condition: Let (πk)𝔐(Ω)(\pi_{k})\subset\mathfrak{M}(\Omega) such that πkπ\pi_{k}\xrightharpoonup{*}\pi.

    If F(π)<F(\pi)<\infty, pass to a subsequence (denoted by the same symbol) with finite values Fk(πk)F_{k}(\pi_{k}). Because πkπ\pi_{k}\xrightharpoonup{*}\pi, we have

    ΩcdπkΩcdπ,\int_{\Omega}c\mathop{}\!\mathrm{d}\pi_{k}\to\int_{\Omega}c\mathop{}\!\mathrm{d}\pi\,,

    and moreover, lim infkΩΦ(dπkdλ)dλΩΦ(dπdλ)dλ\liminf_{k\to\infty}\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi_{k}}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda\geq\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda by Lemma 2.13.

    If F(π)=F(\pi)=\infty, assume for a contradiction that lim infkFk(πk)<\liminf_{k\to\infty}F_{k}(\pi_{k})<\infty. Pass to a subsequence (not renamed) (πk)(\pi_{k}) with limkFk(πk)=lim infkFk(πk)\lim_{k\to\infty}F_{k}(\pi_{k})=\liminf_{k\to\infty}F_{k}(\pi_{k}).

    Since $\pi_{k}\xrightharpoonup{*}\pi$ and, by Lemma 4.6, $\sum_{i}\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\mathds{1}_{Q^{1,k}_{i}}\xrightharpoonup{*}\tfrac{\mathop{}\!\mathrm{d}\mu_{1}}{\mathop{}\!\mathrm{d}\lambda_{1}}$ and $\sum_{j}\tfrac{\mu^{k}_{2,j}}{\lambda^{k}_{2,j}}\mathds{1}_{Q^{2,k}_{j}}\xrightharpoonup{*}\tfrac{\mathop{}\!\mathrm{d}\mu_{2}}{\mathop{}\!\mathrm{d}\lambda_{2}}$, we see that $\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}$ always satisfies the marginal constraints and positivity. Hence, $F(\pi)=\infty$ cannot occur due to a violation of these constraints.

    So, if π\pi satisfies the marginals but π≪̸λ\pi\not\ll\lambda or πλ\pi\ll\lambda with ΩΦ(dπdλ)dλ=\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda=\infty, then it holds lim infkFk(πk)=\liminf_{k\to\infty}F_{k}(\pi_{k})=\infty by Lemma 2.13, which again is a contradiction.

    Thus, lim infFk(πk)F(π)\liminf F_{k}(\pi_{k})\geq F(\pi) in every case.

  2. ii)

    $\limsup$-condition: If $F(\pi)=\infty$, the condition $\limsup_{k\to\infty}F_{k}(\pi_{k})\leq F(\pi)$ is trivially fulfilled for $\pi_{k}:=\pi$. Hence, consider $F(\pi)<\infty$ and define $(\pi_{k})$ by

    dπkdλ:=i,jπ(Iijk)λ(Iijk)𝟙Iijk.\frac{\mathop{}\!\mathrm{d}\pi_{k}}{\mathop{}\!\mathrm{d}\lambda}:=\sum_{i,j}\frac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\mathds{1}_{I_{ij}^{k}}\,.

    Note that by assumption λ(Iijk)>0\lambda(I^{k}_{ij})>0 and therefore πk0\pi_{k}\geq 0 by definition. One easily sees that

    Ω1dπkdλdλ1=jμ2,jkλ2,jk𝟙Qj2,k,andΩ2dπkdλdλ2=iμ1,ikλ1,ik𝟙Qi1,k,\int_{\Omega_{1}}\frac{\mathop{}\!\mathrm{d}\pi_{k}}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda_{1}=\sum_{j}\frac{\mu^{k}_{2,j}}{\lambda^{k}_{2,j}}\mathds{1}_{Q^{2,k}_{j}},\quad\text{and}\quad\int_{\Omega_{2}}\frac{\mathop{}\!\mathrm{d}\pi_{k}}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda_{2}=\sum_{i}\frac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\mathds{1}_{Q^{1,k}_{i}},

    i.e., Fk(πk)<F_{k}(\pi_{k})<\infty. In particular, πkπ\pi_{k}\xrightharpoonup{*}\pi by Lemma 4.6.

    It remains to show lim supkFk(πk)F(π)\limsup_{k\to\infty}F_{k}(\pi_{k})\leq F(\pi). As before, we have

    ΩcdπkΩcdπ.\int_{\Omega}c\mathop{}\!\mathrm{d}\pi_{k}\to\int_{\Omega}c\mathop{}\!\mathrm{d}\pi\,.

    Moreover, we get from Jensen’s inequality that

    Φ(π(Iijk)λ(Iijk))=Φ(1λ(Iijk)Iijkdπdλdλ)1λ(Iijk)IijkΦ(dπdλ)dλ.\Phi\Big{(}\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\Big{)}=\Phi\Big{(}\tfrac{1}{\lambda(I^{k}_{ij})}\int_{I^{k}_{ij}}\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\mathop{}\!\mathrm{d}\lambda\Big{)}\leq\tfrac{1}{\lambda(I^{k}_{ij})}\int_{I^{k}_{ij}}\Phi(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda.

    With this we obtain

    ΩΦ(dπkdλ)dλ\displaystyle\int_{\Omega}\Phi(\tfrac{\mathop{}\!\mathrm{d}\pi_{k}}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda =ΩΦ(ijπ(Iijk)λ(Iijk)𝟙Iijk)dλ=i,jIijkdλΦ(π(Iijk)λ(Iijk))=i,jλ(Iijk)Φ(π(Iijk)λ(Iijk))\displaystyle=\int_{\Omega}\Phi\left(\sum_{ij}\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\mathds{1}_{I^{k}_{ij}}\right)\mathop{}\!\mathrm{d}\lambda=\sum_{i,j}\int_{I^{k}_{ij}}\mathop{}\!\mathrm{d}\lambda\,\Phi\left(\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\right)=\sum_{i,j}\lambda(I^{k}_{ij})\Phi\left(\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\right)
    ijIijkΦ(dπdλ)dλ=ΩΦ(dπdλ)dλ\displaystyle\leq\sum_{ij}\int_{I^{k}_{ij}}\Phi(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda=\int_{\Omega}\Phi(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda

    which shows the lim sup\limsup-condition.∎
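The Jensen step in the $\limsup$-condition says that replacing a density by its cell averages cannot increase the regularization term. The following minimal sketch checks this on a one-dimensional toy example with $\Phi(t)=t\log t$. The grid, the weights and the function names are hypothetical, and the product structure of the cells $I^{k}_{ij}$ is not reproduced.

```python
import math


def phi(t):
    """Phi(t) = t*log(t), extended by Phi(0) = 0."""
    return t * math.log(t) if t > 0 else 0.0


def cell_average(density, weights, block):
    """Replace the density by its weighted mean on consecutive blocks of points."""
    averaged = []
    for start in range(0, len(density), block):
        idx = range(start, min(start + block, len(density)))
        mass = sum(density[i] * weights[i] for i in idx)
        size = sum(weights[i] for i in idx)
        averaged.extend([mass / size] * len(idx))
    return averaged


def modular(density, weights):
    """Discrete version of the integral of Phi(density) against the reference measure."""
    return sum(phi(d) * w for d, w in zip(density, weights))


# Hypothetical fine-grid density with total mass one.
weights = [0.125] * 8
density = [0.2, 1.8, 0.5, 1.5, 3.0, 0.0, 0.4, 0.6]
coarse = cell_average(density, weights, block=2)

fine_value = modular(density, weights)
coarse_value = modular(coarse, weights)
```

Averaging preserves the total mass and yields `coarse_value <= fine_value`, which is the discrete counterpart of the inequality used in the proof.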

Γ\Gamma-convergence for simultaneously decreasing the regularization parameter and refining the discretization proves to be a harder problem. The following example shows that the convergence rate of γk\gamma_{k} must be linked to the convergence rate of the discretization.

Example 4.8.

Let Ω=[0,1]2\Omega=[0,1]^{2}, μ1=μ2=δ0{\mu_{1}}={\mu_{2}}=\delta_{0} the Dirac measure at zero and λi=i\lambda_{i}=\mathcal{L}_{i} the Lebesgue measure on Ωi\Omega_{i}. Then clearly π=δ0\pi=\delta_{0} is the only feasible and thus the optimal transport plan. Let 0<hk0<h_{k} be a sequence, which is to be chosen.

We consider Φ(t)=tlogt\Phi(t)=t\log t and have the discretized optimal plan

πk=π([0,hk]2)λ([0,hk]2)𝟙[0,hk]2=𝟙[0,hk]2hk2.\pi_{k}=\frac{\pi([0,h_{k}]^{2})}{\lambda([0,h_{k}]^{2})}\mathds{1}_{{[0,h_{k}]}^{2}}=\frac{\mathds{1}_{{[0,h_{k}]}^{2}}}{h_{k}^{2}}\,.

However, it holds that

γkΩΦ(πk)dλ=γklog(hk2),\gamma_{k}\int_{\Omega}\Phi(\pi_{k})\mathop{}\!\mathrm{d}\lambda=\gamma_{k}\log(h_{k}^{-2}),

and hence, the $\limsup$ condition cannot be fulfilled if $h_{k}$ is such that $\gamma_{k}\log(h_{k}^{-2})\not\to 0$ (which holds, e.g., for $h_{k}=\exp(-1/\gamma_{k})$).

Let now λ~i=δ0\tilde{\lambda}_{i}=\delta_{0} be the Dirac measure at zero. The discretized optimal plan now is

π~k=π([0,hk]2)λ~([0,hk]2)𝟙[0,hk]2=𝟙[0,hk]2,with λ~:=λ~1λ~2\tilde{\pi}_{k}=\frac{\pi([0,h_{k}]^{2})}{\tilde{\lambda}([0,h_{k}]^{2})}\mathds{1}_{{[0,h_{k}]}^{2}}=\mathds{1}_{{[0,h_{k}]}^{2}}\,,\qquad\text{with }\tilde{\lambda}:=\tilde{\lambda}_{1}\otimes\tilde{\lambda}_{2}

and hence,

γkΩΦ(π~k)dλ~=γkΦ(π~k(0))=γkΦ(1)=0.\gamma_{k}\int_{\Omega}\Phi(\tilde{\pi}_{k})\mathop{}\!\mathrm{d}\tilde{\lambda}=\gamma_{k}\Phi(\tilde{\pi}_{k}(0))=\gamma_{k}\Phi(1)=0\,.

Thus, $h_{k}$ may be chosen arbitrarily in this case.
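The role of the coupling between $\gamma_{k}$ and $h_{k}$ in the Lebesgue case of Example 4.8 can be illustrated numerically. The sketch below compares the choice $h_{k}=\exp(-1/\gamma_{k})$ from the example with the choice $h_{k}=\gamma_{k}$, which is a hypothetical alternative introduced here only for comparison. The mesh width enters through $\log h_{k}$ to avoid floating-point underflow.

```python
import math


def entropy_penalty(gamma, log_h):
    """gamma * log(h^(-2)), with the mesh width given through log_h = log(h)."""
    return gamma * (-2.0 * log_h)


gammas = [10.0 ** (-k) for k in range(1, 7)]

# h_k = exp(-1/gamma_k): the penalty stays at the constant value 2.
exponential_mesh = [entropy_penalty(g, -1.0 / g) for g in gammas]

# h_k = gamma_k (hypothetical choice): the penalty 2*gamma*log(1/gamma) tends to zero.
polynomial_mesh = [entropy_penalty(g, math.log(g)) for g in gammas]
```

With the exponentially fine mesh the regularization term does not vanish, whereas a mesh width that shrinks only polynomially in $\gamma_{k}$ makes it vanish.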

Using the insights of Example 4.8 the desired result can be formulated.

Theorem 4.9.

Under the assumptions of Theorem 4.7, let (hk)(h_{k}) be a positive sequence with hkλ(Iijk)h_{k}\leq\lambda(I^{k}_{ij}) for all i,ji,j and let (γk)(\gamma_{k}) be a sequence converging to zero such that

γkΦ+(hk1)0.\gamma_{k}{\Phi}_{+}(h_{k}^{-1})\to 0\,. (4.4)

Then the minimization problem

inf{Ωcdπ\displaystyle\inf\left\{\int_{\Omega}\right.c\mathop{}\!\mathrm{d}\pi +γkΩΦ(dπdλ)dλ|πλ,0dπdλ𝒱k,\displaystyle+\gamma_{k}\left.\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda})\mathop{}\!\mathrm{d}\lambda\,\middle|\right.\,\pi\ll\lambda\,,0\leq\tfrac{\mathop{}\!\mathrm{d}\pi}{\mathop{}\!\mathrm{d}\lambda}\in\mathcal{V}^{k},
d(P1)#πdλ1=iμ1,ikλ1,ik𝟙Qi1,k,d(P2)#πdλ2=jμ2,jkλ2,jk𝟙Qj2,k}\displaystyle\left.\tfrac{\mathop{}\!\mathrm{d}(P_{1})_{\#}\pi}{\mathop{}\!\mathrm{d}\lambda_{1}}=\sum_{i}\tfrac{\mu^{k}_{1,i}}{\lambda^{k}_{1,i}}\mathds{1}_{Q^{1,k}_{i}},\,\tfrac{\mathop{}\!\mathrm{d}(P_{2})_{\#}\pi}{\mathop{}\!\mathrm{d}\lambda_{2}}=\sum_{j}\tfrac{\mu^{k}_{2,j}}{\lambda^{k}_{2,j}}\mathds{1}_{Q^{2,k}_{j}}\right\}

Γ\Gamma-converges to (OT) w.r.t. weak-* convergence in 𝔐(Ω)\mathfrak{M}(\Omega) as kk\to\infty.

Proof.

Let $F_{k}$ be defined as in the proof of Theorem 4.7 with $\gamma=\gamma_{k}$ and let $F:\mathfrak{M}(\Omega)\to\mathbb{R}\cup\{\infty\}$ be defined via

\[F(\pi)=\begin{cases}\int_{\Omega}c\,\mathrm{d}\pi,&0\leq\pi\in\mathcal{P}(\Omega),\,(P_{i})_{\#}{\pi}=\mu_{i},\,i=1,2,\\ \infty,&\text{else.}\end{cases}\]

Given an arbitrary $\pi\in\mathfrak{M}(\Omega)$, we now check the two conditions for $\Gamma$-convergence.

  1. i)

    $\liminf$-condition: Let $\pi$ be any measure and $(\pi_{k})$ be such that $\pi_{k}\xrightharpoonup{*}\pi$.

    The case $F(\pi)=\infty$ can be treated similarly to the proof of Theorem 4.7, with the difference that we don’t have to consider the case $\pi\not\ll\lambda$. For $F(\pi)<\infty$ we get

    \[\int_{\Omega}c\,\mathrm{d}\pi_{k}\to\int_{\Omega}c\,\mathrm{d}\pi\,.\]

    Moreover, since $\Phi$ is bounded from below, we can extract a subsequence such that $\int_{\Omega}\Phi(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\lambda})\,\mathrm{d}\lambda$ is bounded from above and below and obtain

    \[\lim_{k\to\infty}\gamma_{k}\int_{\Omega}\Phi(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\lambda})\,\mathrm{d}\lambda=0\]

    which proves $\liminf F_{k}(\pi_{k})\geq F(\pi)$.

  2. ii)

    $\limsup$-condition: For $F(\pi)=\infty$ we have nothing to prove.

    If $F(\pi)<\infty$, define $\pi_{k}$ as in the proof of Theorem 4.7. Then, the marginal constraints are satisfied and $\pi_{k}\xrightharpoonup{*}\pi$. Hence $\int_{\Omega}c\,\mathrm{d}\pi_{k}\to\int_{\Omega}c\,\mathrm{d}\pi$ and it remains to show that

    \[\limsup_{k\to\infty}\gamma_{k}\int_{\Omega}\Phi(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\lambda})\,\mathrm{d}\lambda\leq 0\,.\]

    Using $0\leq\pi(I^{k}_{ij})\leq\pi(\Omega)=1$ and monotonicity of $\Phi_{+}$ we get

    \[
    \begin{split}
    \int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\lambda})\,\mathrm{d}\lambda&=\sum_{ij}\Phi\left(\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\right)\lambda(I^{k}_{ij})\leq\sum_{ij}{\Phi}_{+}\left(\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\right)\lambda(I^{k}_{ij})\\
    &\leq\sum_{ij}{\Phi}_{+}\Big(\tfrac{1}{\lambda(I^{k}_{ij})}\Big)\lambda(I^{k}_{ij})\leq{\Phi}_{+}(h_{k}^{-1})\sum_{ij}\lambda(I^{k}_{ij})\\
    &={\Phi}_{+}(h_{k}^{-1})\lambda(\Omega)\,.
    \end{split}
    \tag{4.5}
    \]

    Hence,

    \[\gamma_{k}\int_{\Omega}\Phi(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\lambda})\,\mathrm{d}\lambda\leq\lambda(\Omega)\gamma_{k}{\Phi}_{+}(h_{k}^{-1})\to 0\]

    as desired. ∎
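The estimate (4.5) can be checked numerically in a simple setting. The following is a minimal sketch under assumptions that are not part of the paper: $\lambda$ is the Lebesgue measure on $\Omega=[0,1]^{2}$, the cells $I^{k}_{ij}$ form a uniform $n\times n$ grid, $\Phi(t)=t^{2}/2$ (so that $\Phi_{+}=\Phi$ on nonnegative arguments), and the plan puts mass $1/n$ on each diagonal cell. All function names are hypothetical.

```python
def phi(t):
    """Example Young's function Phi(t) = t^2 / 2 (an assumption of this sketch)."""
    return 0.5 * t * t


def phi_plus(t):
    """Positive part of Phi; assumed here to be the meaning of Phi_+."""
    return max(phi(t), 0.0)


def regularizer(cell_mass, cell_measure):
    """Sum over cells of Phi(pi(I_ij) / lambda(I_ij)) * lambda(I_ij)."""
    return sum(phi(m / cell_measure) * cell_measure
               for row in cell_mass for m in row)


def diagonal_plan(n):
    """Cell masses of a plan concentrated on the diagonal: pi(I_ii) = 1/n."""
    return [[1.0 / n if i == j else 0.0 for j in range(n)] for i in range(n)]


def check(n, gamma):
    """Return (gamma * left-hand side, gamma * right-hand side) of (4.5)."""
    cell_measure = 1.0 / (n * n)   # lambda(I_ij) for Lebesgue measure on [0,1]^2
    h = cell_measure               # admissible choice, since h_k <= lambda(I_ij)
    lhs = regularizer(diagonal_plan(n), cell_measure)
    rhs = phi_plus(1.0 / h) * 1.0  # Phi_+(h^{-1}) * lambda(Omega), lambda(Omega) = 1
    return gamma * lhs, gamma * rhs


if __name__ == "__main__":
    for n in (2, 4, 8, 16):
        print(n, check(n, gamma=0.1))
```

In this setting the left-hand side equals $n/2$ while the bound from (4.5) equals $n^{4}/2$, which illustrates that (4.5) is a worst-case estimate rather than a sharp one.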

Corollary 4.10.

Theorems 4.7 and 4.9 remain true if instead of a continuous cost function $c$, a sequence of step functions $(c_{k})$, constant on the sets $I^{k}_{ij}$ and converging to $c$ uniformly, is used.

Proof.

Rewrite the integral $\int_{\Omega}c_{k}\,\mathrm{d}\pi_{k}$ as the dual pairing $\langle c_{k}\,,\,\pi_{k}\rangle$, where $(c_{k})\subset\mathcal{C}_{\mathrm{b}}(\Omega)$ converges strongly and $(\pi_{k})\subset\mathfrak{M}(\Omega)$ converges weakly-*. Because the variation norm $\lVert\pi_{k}\rVert$ is bounded ([23, Corollary 2.6.10]), $\langle c_{k}\,,\,\pi_{k}\rangle\to\langle c\,,\,\pi\rangle$ can be seen by standard arguments, which yields the assertion. ∎

Similarly to Corollary 4.3, for $t^{-1}\Phi(t)$ monotone the assumption on $(\gamma_{k})$ can be weakened.

Corollary 4.11.

Let $\Phi$ be a quasi-Young’s function such that $t^{-1}\Phi(t)$ is monotone. Then it suffices to assume

\[\gamma_{k}{h_{k}}{\Phi}_{+}\left(h_{k}^{-1}\right)\to 0\]

instead of condition (4.4) in Theorem 4.9.

Proof.

Using the monotonicity of $t^{-1}\Phi(t)$, (4.5) can be refined to

\[
\begin{aligned}
\int_{\Omega}{\Phi}_{\infty}(\tfrac{\mathrm{d}\pi_{k}}{\mathrm{d}\lambda})\,\mathrm{d}\lambda&=\sum_{ij}\Phi\left(\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\right)\tfrac{\lambda(I^{k}_{ij})}{\pi(I^{k}_{ij})}\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\lambda(I^{k}_{ij})\leq\sum_{ij}\Phi_{+}\left(\tfrac{\pi(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\right)\tfrac{\lambda(I^{k}_{ij})}{\pi(I^{k}_{ij})}\pi(I^{k}_{ij})\\
&\leq\sum_{ij}\Phi_{+}\left(\left(\lambda(I^{k}_{ij})\right)^{-1}\right)\lambda(I^{k}_{ij})\pi(I^{k}_{ij})\leq h_{k}\Phi_{+}(h_{k}^{-1})\sum_{ij}\pi(I^{k}_{ij})\\
&=h_{k}\Phi_{+}(h_{k}^{-1})\pi(\Omega)=h_{k}\Phi_{+}(h_{k}^{-1})
\end{aligned}
\]

and the assertion follows as in Theorem 4.9. ∎
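To illustrate how much weaker the coupling of Corollary 4.11 is compared to condition (4.4), the two conditions can be written out for two common regularizers. This is only a worked example under assumptions: $\Phi_{+}$ is taken to be the positive part $\max\{\Phi,0\}$, $h_{k}<1$, and both functions are assumed to qualify as quasi-Young’s functions in the sense used here; in both cases $t^{-1}\Phi(t)$ is monotone.

```latex
\begin{align*}
\Phi(t)=\tfrac{t^{p}}{p},\ p>1:&\quad
\gamma_{k}\Phi_{+}(h_{k}^{-1})\to 0 \iff \gamma_{k}h_{k}^{-p}\to 0,
&\gamma_{k}h_{k}\Phi_{+}(h_{k}^{-1})\to 0 &\iff \gamma_{k}h_{k}^{1-p}\to 0,\\
\Phi(t)=t\log t:&\quad
\gamma_{k}\Phi_{+}(h_{k}^{-1})\to 0 \iff \gamma_{k}h_{k}^{-1}\log(h_{k}^{-1})\to 0,
&\gamma_{k}h_{k}\Phi_{+}(h_{k}^{-1})\to 0 &\iff \gamma_{k}\log(h_{k}^{-1})\to 0.
\end{align*}
```

In both cases the weakened condition gains exactly one power of $h_{k}$, so that $\gamma_{k}$ may tend to zero more slowly relative to the discretization fineness.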

5 Conclusion

Applying regularization in Orlicz spaces to the optimal transport problem allows us to generalize the existence results of [10, 20] for both the primal and the predual problem. Under mild assumptions, the results hold for regularization w.r.t. product measures $\lambda=\lambda_{1}\otimes\lambda_{2}$. More precisely, primal solutions exist if and only if the marginals are functions in the appropriate Orlicz spaces, and existence of optimizers in $L^{q}$ for the predual problem has been shown for the special case $\Phi(t)=t^{p}/p$, $p\geq 2$.

A combined regularization and smoothing approach leads to a family of well-posed approximations that $\Gamma$-converge to the original Kantorovich formulation if the regularization and smoothing parameters are coupled in an appropriate way. This generalizes the corresponding result [10, Theorem 5.1] for $\Phi(t)=t\log t$. Similarly, a combined regularization and discretization approach leads to another family of approximations. Again, $\Gamma$-convergence was shown to hold if the regularization parameter and the discretization fineness are coupled in an appropriate way.

Existence of solutions of the dual problem for general (quasi-)Young’s functions has been considered in a different framework in [18]. Still, future work might investigate whether this result can also be obtained by the approach considered here. Moreover, numerical methods for solving the regularized problem (P) have not been discussed here.

Acknowledgments

The authors would like to thank two anonymous reviewers for their suggestions regarding the presentation and valuable comments, which also helped to close some gaps in the argumentation.

Appendix

Appendix A Proof of Lemma 2.6

Proof.

Let $\Phi(t):=\int_{0}^{t}\varphi(s)\,\mathrm{d}s$ be as in Definition 2.1. Then with $\tilde{\varphi}:=\varphi\cdot\mathds{1}_{(t_{0},\infty)}$ it holds that $\tilde{\Phi}(t)=\int_{0}^{t}\tilde{\varphi}(s)\,\mathrm{d}s$ and hence, $\tilde{\Phi}$ is a Young’s function.

To show $L^{\Phi}(\Omega,\mathrm{d}\nu)=L^{\tilde{\Phi}}(\Omega,\mathrm{d}\nu)$ it suffices to show that for any $f$

\[\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}<\infty\Leftrightarrow\lVert f\rVert_{L^{\tilde{\Phi}}(\Omega,\mathrm{d}\nu)}<\infty\,.\]

If $\lVert f\rVert_{L^{\Phi}(\Omega,\mathrm{d}\nu)}<\infty$, then $\lVert f\rVert_{L^{\tilde{\Phi}}(\Omega,\mathrm{d}\nu)}<\infty$ trivially. Let therefore $\lVert f\rVert_{L^{\tilde{\Phi}}(\Omega,\mathrm{d}\nu)}<\infty$, i.e. there is a $\gamma$ such that

\[\int_{\Omega}\tilde{\Phi}\left(\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)=\int_{\Omega_{\gamma}(f)}\Phi\left(\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)-\nu(\Omega_{\gamma}(f))\Phi(t_{0})\leq 1\,,\]

where $\Omega_{\gamma}(f):=\left\{x\,\middle|\,\frac{\lvert f(x)\rvert}{\gamma}\geq t_{0}\right\}$. Let $r\in(0,1]$ and write

\[\int_{\Omega}\Phi\left(r\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)=\left(\int_{\left\{x\,\middle|\,\frac{\lvert f(x)\rvert}{\gamma}\leq t_{0}\right\}}+\int_{\left\{x\,\middle|\,t_{0}\leq\frac{\lvert f(x)\rvert}{\gamma}\leq\frac{t_{0}}{r}\right\}}+\int_{\left\{x\,\middle|\,\frac{t_{0}}{r}\leq\frac{\lvert f(x)\rvert}{\gamma}\right\}}\right)\Phi\left(r\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)\,.\]

Since $\Phi(0)=0$ and $\Phi$ is convex, for every $s\in[0,1]$ it holds that $\Phi(sx)\leq s\Phi(x)$ and we obtain an upper bound for the first integral by

\[\int_{\left\{x\,\middle|\,\frac{\lvert f(x)\rvert}{\gamma}\leq t_{0}\right\}}\Phi\left(r\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)\leq r\int_{\left\{x\,\middle|\,\frac{\lvert f(x)\rvert}{\gamma}\leq t_{0}\right\}}\Phi\left(\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)\leq rC\]

for some constant $C$. With the same argument,

\[
\begin{aligned}
\int_{\left\{x\,\middle|\,\frac{t_{0}}{r}\leq\frac{\lvert f(x)\rvert}{\gamma}\right\}}\Phi\left(r\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)&\leq r\int_{\left\{x\,\middle|\,\frac{t_{0}}{r}\leq\frac{\lvert f(x)\rvert}{\gamma}\right\}}\Phi\left(\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)\\
&\leq r\int_{\left\{x\,\middle|\,t_{0}\leq\frac{\lvert f(x)\rvert}{\gamma}\right\}}\Phi\left(\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)\\
&=r\left(\int_{\Omega}\tilde{\Phi}\left(\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)+\nu(\Omega_{\gamma}(f))\Phi(t_{0})\right)\\
&\leq r\left(1+\nu(\Omega_{\gamma}(f))\Phi(t_{0})\right)\,.
\end{aligned}
\]

Finally, since $\left\{x\,\middle|\,t_{0}\leq\frac{\lvert f(x)\rvert}{\gamma}\leq\frac{t_{0}}{r}\right\}\subseteq\left\{x\,\middle|\,t_{0}\leq\frac{\lvert f(x)\rvert}{\gamma}\right\}$, the same estimate holds for the second integral. Combining the estimates for the three integrals one obtains

\[\int_{\Omega}\Phi\left(r\frac{\lvert f(x)\rvert}{\gamma}\right)\mathrm{d}\nu(x)\leq r\left(C+2\cdot(1+\nu(\Omega_{\gamma}(f))\Phi(t_{0}))\right)\,,\]

which is less than or equal to $1$ for $r$ small enough. ∎
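The statement of the lemma can be explored numerically for a measure with finitely many atoms. The sketch below rests on assumptions not fixed by the text: the norm is taken to be the Luxemburg norm $\inf\{\gamma>0 : \int\Phi(\lvert f\rvert/\gamma)\,\mathrm{d}\nu\leq 1\}$, the Young’s function is $\Phi(t)=t^{2}/2$, and $t_{0}=1$. All names are hypothetical.

```python
def phi(t):
    """Example Young's function Phi(t) = t^2 / 2 (assumption of this sketch)."""
    return 0.5 * t * t


T0 = 1.0  # hypothetical threshold t_0


def phi_tilde(t):
    """Truncated function: Phi(t) - Phi(t_0) for t > t_0, and 0 otherwise."""
    return phi(t) - phi(T0) if t > T0 else 0.0


def modular(young, f, nu, gamma):
    """Integral of young(|f| / gamma) w.r.t. the discrete measure with weights nu."""
    return sum(w * young(abs(v) / gamma) for v, w in zip(f, nu))


def luxemburg_norm(young, f, nu, iters=200):
    """Bisection for inf{gamma > 0 : modular <= 1}; the modular is nonincreasing in gamma."""
    if all(v == 0 for v in f):
        return 0.0
    lo, hi = 0.0, 1.0
    while modular(young, f, nu, hi) > 1.0:  # enlarge until hi is feasible
        hi *= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if modular(young, f, nu, mid) <= 1.0:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    f = [3.0, 1.0, 0.5]   # values of f on three atoms
    nu = [0.2, 0.3, 0.5]  # weights of the atoms
    print(luxemburg_norm(phi, f, nu), luxemburg_norm(phi_tilde, f, nu))
```

Since $\tilde{\Phi}\leq\Phi$, the norm computed with `phi_tilde` never exceeds the one computed with `phi`; the lemma asserts that finiteness of one implies finiteness of the other, which is trivially visible for finitely many atoms and only becomes a genuine statement for general $\nu$.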

Appendix B Proof of Lemma 4.1

Proof.

We only treat the case $i=1$ (the case $i=2$ being analogous). Recalling $G_{\delta}=\varphi_{\delta}\otimes\varphi_{\delta}$ and using Fubini’s Theorem we get

\[
\begin{aligned}
\frac{\mathrm{d}\mu_{1}^{\delta}}{\mathrm{d}\mathcal{L}}(y_{1})&=\frac{\mathrm{d}\left(\varphi_{\delta}*(P_{1})_{\#}{\pi}\right)}{\mathrm{d}\mathcal{L}}(y_{1})=\int_{\mathbb{R}^{n}}\varphi_{\delta}(y_{1}-x_{1})\,\mathrm{d}((P_{1})_{\#}{\pi})(x_{1})\\
&=\int_{\mathbb{R}^{n}\times\mathbb{R}^{n}}\varphi_{\delta}(y_{1}-x_{1})\,\mathrm{d}\pi(x_{1},x_{2})=\int_{\mathbb{R}^{n}}\int_{\mathbb{R}^{n}\times\mathbb{R}^{n}}G_{\delta}(y_{1}-x_{1},y_{2}-x_{2})\,\mathrm{d}\pi(x_{1},x_{2})\,\mathrm{d}y_{2}\\
&=\int_{\mathbb{R}^{n}}\left(G_{\delta}*\pi\right)(y_{1},y_{2})\,\mathrm{d}y_{2}=\frac{\mathrm{d}(P_{1})_{\#}\pi_{\delta}}{\mathrm{d}\mathcal{L}}(y_{1})\,,
\end{aligned}
\]

where the third equality follows directly from the definition of the Lebesgue integral via simple functions. ∎
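A discrete analogue of this identity can be checked directly: smoothing a plan with a product kernel and then taking the first marginal gives the same result as smoothing the first marginal. The following sketch is an illustration under assumptions, not the construction from the paper: the plan is a finite matrix of masses, the mollifier $\varphi_{\delta}$ is replaced by a finite kernel with unit mass, and full (untruncated) discrete convolution is used so that no mass is lost at the boundary. All names are hypothetical.

```python
def conv1d(kernel, signal):
    """Full discrete convolution of two finite sequences."""
    out = [0.0] * (len(kernel) + len(signal) - 1)
    for i, k in enumerate(kernel):
        for j, s in enumerate(signal):
            out[i + j] += k * s
    return out


def conv2d_separable(kernel, plan):
    """Full 2D convolution of `plan` with the product kernel G = kernel (x) kernel."""
    # Smooth along the second variable (within each row).
    rows = [conv1d(kernel, row) for row in plan]
    # Smooth along the first variable (within each column), then transpose back.
    cols = [conv1d(kernel, list(col)) for col in zip(*rows)]
    return [list(r) for r in zip(*cols)]


def first_marginal(plan):
    """Sum out the second variable: (P_1)_# pi in the discrete setting."""
    return [sum(row) for row in plan]


if __name__ == "__main__":
    kernel = [0.25, 0.5, 0.25]       # unit-mass stand-in for the mollifier
    plan = [[0.1, 0.2], [0.3, 0.4]]  # masses pi(i, j), total mass 1
    print(first_marginal(conv2d_separable(kernel, plan)))
    print(conv1d(kernel, first_marginal(plan)))
```

Both printed vectors agree up to rounding. The unit mass of the kernel is what makes the integration over $y_{2}$ disappear, mirroring the step in the proof where $\varphi_{\delta}$ integrates to one.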

Appendix C Proof of Lemma 4.6

Proof.

Let $A\subset\Omega$ be a Borel set, $\varepsilon>0$ and $A^{k}_{-}$, $A^{k}_{+}$ as in Assumption 4.4. Then for all $i,j$ it holds

\[\lambda(I^{k}_{i,j}\cap A^{k}_{+})=\begin{cases}\lambda(I^{k}_{ij}),&I^{k}_{ij}\cap A\neq\emptyset,\\ 0,&\text{else}\end{cases}\]

and similarly for $\lambda(I^{k}_{i,j}\cap A^{k}_{-})$. In combination with (4.3) this yields

\[
\begin{aligned}
\int_{A}\nu_{k}\,\mathrm{d}\lambda-\nu(A)&\leq\int_{A^{k}_{+}}\nu_{k}\,\mathrm{d}\lambda-\nu(A^{k}_{-})=\sum_{i,j}\frac{\nu(I^{k}_{ij})}{\lambda(I^{k}_{ij})}\lambda(I^{k}_{i,j}\cap A^{k}_{+})-\nu(A^{k}_{-})\\
&=\sum_{\left\{(i,j)\,\middle|\,I^{k}_{i,j}\cap A\neq\emptyset\right\}}\nu(I^{k}_{i,j})-\nu(A^{k}_{-})=\nu(A^{k}_{+})-\nu(A^{k}_{-})<\varepsilon\,,
\end{aligned}
\]

for $k$ large enough. Using an analogous argument for a lower bound, we get

\[\left|\int_{A}\nu_{k}\,\mathrm{d}\lambda-\nu(A)\right|<\varepsilon\,,\]

which yields the first assertion. Analogous argumentation proves the result for $\nu^{i}\in\mathfrak{M}_{+}(\Omega_{i})$. ∎

References

  • [1] L. Ambrosio and N. Gigli. A user’s guide to optimal transport. In Modelling and Optimisation of Flows on Networks, pages 1–155. Springer, 2013. doi:10.1007/978-3-642-32160-3_1.
  • [2] J.-D. Benamou, G. Carlier, M. Cuturi, L. Nenna, and G. Peyré. Iterative Bregman projections for regularized transportation problems. SIAM Journal on Scientific Computing, 37(2):A1111–A1138, 2015. doi:10.1137/141000439.
  • [3] C. Bennett and R. Sharpley. Interpolation of Operators, volume 129 of Pure and Applied Mathematics. Academic Press, Inc., Boston, MA, 1988. doi:10.1016/S0079-8169(13)62909-8.
  • [4] M. Blondel, V. Seguy, and A. Rolet. Smooth and sparse optimal transport. In A. Storkey and F. Perez-Cruz, editors, Proceedings of the Twenty-First International Conference on Artificial Intelligence and Statistics, volume 84 of Proceedings of Machine Learning Research, pages 880–889. PMLR, 09–11 Apr 2018.
  • [5] V. I. Bogachev. Measure theory. Springer, Berlin New York, 2007.
  • [6] J. Borwein and Q. J. Zhu. Techniques of variational analysis. Springer, New York, 2005.
  • [7] A. Braides. $\Gamma$-convergence for Beginners, volume 22 of Oxford Lecture Series in Mathematics and its Applications. Oxford University Press, Oxford, 2002. doi:10.1093/acprof:oso/9780198507840.001.0001.
  • [8] G. Carlier, V. Duval, G. Peyré, and B. Schmitzer. Convergence of entropic schemes for optimal transport and gradient flows. SIAM Journal on Mathematical Analysis, 49, 12 2015. doi:10.1137/15M1050264.
  • [9] A. Caruso. Two properties of norms in Orlicz spaces. Le Matematiche, 56(1):183–194, 2001.
  • [10] C. Clason, D. A. Lorenz, H. Mahler, and B. Wirth. Entropic regularization of continuous optimal transport problems. Journal of Mathematical Analysis and Applications, 494(1):124432, Feb. 2021. doi:10.1016/j.jmaa.2020.124432.
  • [11] M. Cuturi. Sinkhorn distances: Lightspeed computation of optimal transportation distances. Advances in Neural Information Processing Systems, 26, 06 2013.
  • [12] M. Cuturi, G. Peyré, and A. Rolet. A smoothed dual approach for variational Wasserstein problems. SIAM Journal on Imaging Sciences, 9, 03 2015. doi:10.1137/15M1032600.
  • [13] A. Dessein, N. Papadakis, and J.-L. Rouas. Regularized optimal transport and the rot mover’s distance. The Journal of Machine Learning Research, 19(1):590–642, 2018.
  • [14] L. Diening, P. Harjulehto, P. Hästö, and M. Růžička. Lebesgue and Sobolev Spaces with Variable Exponents, volume 2017 of Lecture Notes in Mathematics. Springer, 4 2011. doi:10.1007/978-3-642-18363-8.
  • [15] G. B. Folland. Real analysis : modern techniques and their applications. Wiley, New York, 1999.
  • [16] I. Fonseca and G. Leoni. Modern Methods in the Calculus of Variations: $L^{p}$ Spaces. Springer Monographs in Mathematics. Springer, 2007. doi:10.1007/978-0-387-69006-3.
  • [17] A. Genevay. Entropy-regularized optimal transport for machine learning. PhD thesis, Université Paris Dauphine, 2019.
  • [18] C. Léonard. Minimization of entropy functionals. Journal of Mathematical Analysis and Applications, 346(1):183 – 204, 2008. doi:10.1016/j.jmaa.2008.04.048.
  • [19] C. Léonard. Convex minimization problems with weak constraint qualifications. Journal of Convex Analysis, 17(1):321–348, 2010.
  • [20] D. Lorenz, P. Manns, and C. Meyer. Quadratically regularized optimal transport. Applied Mathematics & Optimization, 09 2019. doi:10.1007/s00245-019-09614-w.
  • [21] D. A. Lorenz and H. Mahler. Orlicz-space regularization for optimal transport and algorithms for quadratic regularization, 2019. arXiv:1909.06082.
  • [22] S. D. Marino and A. Gerolin. Optimal transport losses and Sinkhorn algorithm with general convex regularization, 2020. arXiv:2007.00976.
  • [23] R. E. Megginson. An Introduction to Banach Space Theory (Graduate Texts in Mathematics). Springer, 1998. doi:10.1007/978-1-4612-0603-3.
  • [24] J. Musielak. Orlicz Spaces and Modular Spaces. Springer Berlin Heidelberg, 1983. doi:10.1007/bfb0072210.
  • [25] B. Muzellec, R. Nock, G. Patrini, and F. Nielsen. Tsallis regularized optimal transport and ecological inference. Proceedings of the AAAI Conference on Artificial Intelligence, 31(1), Feb. 2017. URL: https://ojs.aaai.org/index.php/AAAI/article/view/10854.
  • [26] F.-P. Paty and M. Cuturi. Regularized optimal transport is ground cost adversarial. In H. Daumé III and A. Singh, editors, Proceedings of the 37th International Conference on Machine Learning, volume 119 of Proceedings of Machine Learning Research, pages 7532–7542. PMLR, 13–18 Jul 2020. URL: http://proceedings.mlr.press/v119/paty20a.html.
  • [27] G. Peyré and M. Cuturi. Computational optimal transport. Foundations and Trends® in Machine Learning, 11(5-6):355–607, 2019. doi:10.1561/2200000073.
  • [28] M. M. Rao and Z. D. Ren. Theory of Orlicz Spaces. Pure and Applied Mathematics. Dekker, 1991.
  • [29] L. Roberts, L. Razoumov, L. Su, and Y. Wang. Gini-regularized optimal transport with an application to spatio-temporal forecasting, 2017. arXiv:1712.02512.
  • [30] R. T. Rockafellar. Integrals which are convex functionals. Pacific J. Math., 24:525–539, 1968. doi:10.2140/pjm.1968.24.525.
  • [31] F. Santambrogio. Optimal Transport for Applied Mathematicians: Calculus of Variations, PDEs, and Modeling (Progress in Nonlinear Differential Equations and Their Applications). Birkhäuser, 2015. doi:10.1007/978-3-319-20828-2.
  • [32] F.-X. Vialard. An elementary introduction to entropic regularization and proximal methods for numerical optimal transport. Lecture, May 2019. URL: https://hal.archives-ouvertes.fr/hal-02303456.
  • [33] C. Villani. Optimal Transport. Old and New, volume 338 of Grundlehren der Mathematischen Wissenschaften. Springer-Verlag, Berlin, 2009. doi:10.1007/978-3-540-71050-9.