
Global sensitivity analysis and Wasserstein spaces

Jean-Claude Fort, MAP5, Université Paris Descartes, SPC, 45 rue des Saints Pères, 75006 Paris, France. Thierry Klein, Institut de Mathématiques de Toulouse, UMR 5219; ENAC - Ecole Nationale de l'Aviation Civile, Université de Toulouse, France. Agnès Lagnoux, Institut de Mathématiques de Toulouse, UMR 5219; CNRS; UT2J, F-31058 Toulouse, France.
Abstract

Sensitivity indices are commonly used to quantify the relative influence of any specific group of input variables on the output of a computer code. In this paper, we focus both on computer codes the output of which is a cumulative distribution function and on stochastic computer codes. We propose a way to perform a global sensitivity analysis for these kinds of computer codes. In the first setting, we define two indices: the first one is based on Wasserstein Fréchet means while the second one is based on the Hoeffding decomposition of the indicators of Wasserstein balls. Further, when dealing with stochastic computer codes, we define an “ideal version” of the stochastic computer code that fits into the frame of the first setting. Finally, we deduce a procedure to perform a second level global sensitivity analysis, namely when one is interested in the sensitivity related to the input distributions rather than in the sensitivity related to the inputs themselves. Several numerical studies are proposed as illustrations in the different settings.

Keywords: Global sensitivity indices, functional computer codes, stochastic computer codes, second level uncertainty, Fréchet means, Wasserstein spaces.

AMS subject classification 62G05, 62G20, 62G30, 65C60, 62E17.

1 Introduction

The use of complex computer models for the analysis of applications from sciences, engineering and other fields is by now routine. For instance, in the area of marine submersion, complex computer codes have been developed to simulate submersion events (see, e.g., [3, 34] for more details). In the context of aircraft design, sensitivity analysis and metamodelling are intensively used to optimize the design of an airplane (see, e.g., [51]). Several other concrete examples of stochastic computer codes can be found in [42].

Often, the models are expensive to run in terms of computational time. It is thus crucial to understand the global influence of one or several inputs on the output of the system under study with only a moderate number of runs [53]. When these inputs are regarded as random elements, this problem is generally called (global) sensitivity analysis. We refer to [16, 52, 56] for an overview of the practical aspects of global sensitivity analysis.

A classical tool to perform global sensitivity analysis consists in computing the Sobol indices. These indices were first introduced in [50] and then considered in [55]. They are well tailored when the output space is $\mathbb{R}$. The Sobol indices compare, using the Hoeffding decomposition [33], the conditional variance of the output knowing some of the input variables to the total variance of the output. Many different estimation procedures for the Sobol indices have been proposed and studied in the literature. Some are based on Monte-Carlo or quasi Monte-Carlo designs of experiments (see [38, 47] and references therein for more details). More recently, a method based on nested Monte-Carlo [28] has been developed. In particular, an efficient estimation of the Sobol indices can be performed through the so-called Pick-Freeze method. For the description of this method and its theoretical study (consistency, central limit theorem, concentration inequalities and Berry-Esseen bounds), we refer to [36, 25] and references therein. Some other estimation procedures are based on different designs of experiments using, for example, polynomial chaos expansions (see [57] and the references therein for more details).

Since Sobol indices are variance based, they only quantify the influence of the inputs on the mean behaviour of the code. Many authors have proposed other criteria to compare the conditional distribution of the output knowing some of the inputs to the distribution of the output. In [47, 49, 48], the authors use higher moments to define new indices while, in [6, 7, 15], the use of divergences or distances between measures allows one to define new indices. In [20], the authors use contrast functions to build indices that are goal oriented. Although these works define nice theoretical indices, the existence of a relevant statistical estimation procedure is still, in most cases, an open question. The case of vector-valued computer codes is considered in [26], where a sensitivity index based on the whole distribution and on the Cramér-von-Mises distance is defined. Within this framework, the authors show that the Pick-Freeze estimation procedure provides an asymptotically Gaussian estimator of the index. The definition of the Cramér-von-Mises indices has been extended to computer codes valued in general metric spaces in [21, 27].

Nowadays, the computer code output is often no longer a real-valued or vector-valued variable but rather a function computed at various locations. In that sense, it can be considered as a functional output. In other cases, the computer code is stochastic, in the sense that the same inputs can lead to different outputs. When the output of the computer code is a function (for instance, a cumulative distribution function) or when the computer code is stochastic, Sobol indices are no longer well tailored. It is then crucial to define indices adapted to the functional or random aspect of the output. When the output is vector-valued or valued in a Hilbert space, some generalizations of the Sobol indices are available [39, 24]. Nevertheless, these indices are still based on the Hoeffding decomposition of the output, so that they only quantify the relative influence of an input through the variance. More recently, indices based on the whole distribution have been developed [15, 8, 6]. In particular, the method relying on the Cramér-von-Mises distance [26] compares the conditional cumulative distribution function with the unconditional one by considering the Hoeffding decomposition of half-space indicators (rather than the Hoeffding decomposition of the output itself) and by integrating them. This method was then extended to codes taking values in a Riemannian manifold [21] and then in general metric spaces [27].

In this work, we focus on two kinds of computer codes: 1) computer codes the output of which is the cumulative distribution function of a real random variable and 2) real-valued stochastic computer codes. A first step will consist in performing a global sensitivity analysis for these kinds of computer codes. Further, we will deduce how to perform a second level sensitivity analysis using the tools developed in the first step. A code with a cumulative distribution function as output can be seen as a code taking values in the space of all probability measures on $\mathbb{R}$. This space can be endowed with a metric (for example, the Wasserstein metric [59]). This point of view allows one to define at least two different indices for this kind of code, generalizing the framework of [27]. The first one is based on Wasserstein Fréchet means while the second one is based on the Hoeffding decomposition of the indicators of Wasserstein balls. Further, stochastic codes (see Section 5 for a bibliographical study) can be seen as a “discrete approximation” of codes having cumulative distribution functions as values. It is then possible to define “natural” indices for such stochastic codes. Finally, second level sensitivity analysis aims at considering uncertainties on the type of the input distributions and/or on the parameters of the input distributions (see Section 6 for a bibliographical study). Actually, this kind of problem can be embedded in the framework of stochastic codes.

The article is organized as follows. In Section 2, we introduce and precisely define a general class of global sensitivity indices. We also present statistical methods to estimate these indices. In Section 3, we recall some basic facts on Wasserstein distances, Wasserstein costs and Fréchet means. In Section 4, we define and study the statistical properties of two new global sensitivity indices for computer codes valued in general Wasserstein spaces. Further, in Section 5, we study the case of stochastic computer codes. Finally, Section 6 is dedicated to the sensitivity analysis with respect to the distributions of the input variables.

2 Sensitivity indices for codes valued in general metric spaces

We consider a black-box code $f$ defined on a product of measurable spaces $E=E_{1}\times E_{2}\times\ldots\times E_{p}$ ($p\in\mathbb{N}^{*}$) and taking its values in a metric space $\mathcal{X}$. The output, denoted by $Z$, is then given by

Z=f(X_{1},\ldots,X_{p}). (1)

We denote by $\mathbb{P}$ the distribution of the output $Z$.

The aim of this work is to give some partial answers to the following questions.

  1. Question 1

    How can we perform Global Sensitivity Analysis (GSA) when the output space is the space of probability density functions (p.d.f.’s) on $\mathbb{R}$ or the space of cumulative distribution functions (c.d.f.’s)?

  2. Question 2

    How can we perform GSA for stochastic computer codes?

  3. Question 3

    How can we perform GSA with respect to the choice of the distributions of the input variables?

2.1 The general metric spaces sensitivity index

In [27], the authors performed GSA for codes $f$ taking values in general metric spaces. To do so, they consider a family of test functions parameterized by $m\in\mathbb{N}^{*}$ elements of $\mathcal{X}$ and defined by

\begin{matrix}\mathcal{X}^{m}\times\mathcal{X}&\to&\mathbb{R}\\ (a,x)&\mapsto&T_{a}(x).\end{matrix}

Let $\textbf{u}\subset\{1,\ldots,p\}$ and $X_{\textbf{u}}=(X_{i},\,i\in\textbf{u})$. Assuming that the test functions $T_{a}$ are $L^{2}$-functions with respect to the product measure $\mathbb{P}^{\otimes m}\otimes\mathbb{P}$ (where $\mathbb{P}^{\otimes m}$ is the $m$-fold product of the distribution of the output $Z$) on $\mathcal{X}^{m}\times\mathcal{X}$, they define the general metric space sensitivity index with respect to $X_{\textbf{u}}$ by

S_{2,\text{GMS}}^{\textbf{u}}=\frac{\int_{\mathcal{X}^{m}}\mathbb{E}\left[\left(\mathbb{E}[T_{a}(Z)]-\mathbb{E}[T_{a}(Z)|X_{\textbf{u}}]\right)^{2}\right]d\mathbb{P}^{\otimes m}(a)}{\int_{\mathcal{X}^{m}}\mathrm{Var}(T_{a}(Z))\,d\mathbb{P}^{\otimes m}(a)}=\frac{\int_{\mathcal{X}^{m}}\mathrm{Var}\left(\mathbb{E}[T_{a}(Z)|X_{\textbf{u}}]\right)d\mathbb{P}^{\otimes m}(a)}{\int_{\mathcal{X}^{m}}\mathrm{Var}(T_{a}(Z))\,d\mathbb{P}^{\otimes m}(a)}. (2)

Roughly speaking, there are two parts in the previous indices. First, for any value of $a$, we consider the numerator $\mathbb{E}\left[\left(\mathbb{E}[T_{a}(Z)]-\mathbb{E}[T_{a}(Z)|X_{\textbf{u}}]\right)^{2}\right]$ and the denominator $\mathrm{Var}(T_{a}(Z))$ of the classical Sobol index of $T_{a}(Z)$. We call this part the Sobol’ part. Second, we integrate each part with respect to the measure $\mathbb{P}^{\otimes m}$; this will be called the integration part.

As explained in [27], by construction, the indices $S_{2,\text{GMS}}^{\textbf{u}}$ lie in $[0,1]$ and share the same properties as their Sobol counterparts:

  1. the different contributions sum to 1;

  2. they are invariant under translations, isometries, and non-degenerate scalings of $Z$.

Estimation

Three different estimation procedures are available in this context. The first two methods are based on the so-called Pick-Freeze scheme. More precisely, the Pick-Freeze scheme, considered in [36], is a well-tailored design of experiment. Namely, let $X^{\textbf{u}}$ be the random vector such that $X^{\textbf{u}}_{i}=X_{i}$ if $i\in\textbf{u}$ and $X^{\textbf{u}}_{i}=X^{\prime}_{i}$ if $i\notin\textbf{u}$, where $X^{\prime}_{i}$ is an independent copy of $X_{i}$. We then set

Z^{\textbf{u}}:=f(X^{\textbf{u}}). (3)

Further, the procedure consists in rewriting the variance of the conditional expectation in terms of a covariance as follows:

\mathrm{Var}(\mathbb{E}[Z|X^{\textbf{u}}])=\mathrm{Cov}(Z,Z^{\textbf{u}}). (4)
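To fix ideas, here is a minimal Python sketch of the identity (4) on a hypothetical toy model; the model, the sample size and all names are our own illustrative choices, not the paper's.

```python
import numpy as np

# Minimal Pick-Freeze sketch of (4) on a toy model Z = X1 + 2*X2 (our choice).
rng = np.random.default_rng(0)
N = 100_000
X1, X2, X2p = rng.random((3, N))   # X2p: independent copy used to "freeze" u = {1}

Z = X1 + 2.0 * X2                  # Z_j
Zu = X1 + 2.0 * X2p                # Z_j^u with u = {1}: X1 is kept, X2 is redrawn

# Var(E[Z | X^u]) = Cov(Z, Z^u), so the first-order Sobol index of X1 is
S1 = np.cov(Z, Zu)[0, 1] / np.var(Z)
print(S1)                          # close to Var(X1)/Var(Z) = (1/12)/(5/12) = 0.2
```

The empirical covariance of the output with its Pick-Freeze copy thus directly estimates the numerator of the Sobol index, without any conditional-expectation smoothing.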

Alternatively, a third estimation procedure, which can be seen as an ingenious and effective approximation of the Pick-Freeze scheme, is based on rank statistics. Until now, it is unfortunately only available to estimate first order indices in the case of real-valued inputs.

  • First method - Pick-Freeze. Introduced in [26], this procedure is based on a double Monte-Carlo scheme to estimate the Cramér-von-Mises indices $S^{\textbf{u}}_{2,\text{CVM}}$. More precisely, to estimate $S_{2,\text{GMS}}^{\textbf{u}}$ in our context, one considers the following design of experiment consisting of:

    1. a classical Pick-Freeze $N$-sample, that is, two $N$-samples of $Z$: $(Z_{j},Z_{j}^{\textbf{u}})$, $1\leqslant j\leqslant N$;

    2. $m$ additional $N$-samples of $Z$ independent of $(Z_{j},Z_{j}^{\textbf{u}})_{1\leqslant j\leqslant N}$: $W_{l,k}$, $1\leqslant l\leqslant m$, $1\leqslant k\leqslant N$.

    The empirical estimator of the numerator of $S_{2,\text{GMS}}^{\textbf{u}}$ is then given by

    \widehat{N}_{2,\text{GMS},\text{PF}}^{\textbf{u}}=\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{2N}\sum_{j=1}^{N}\left(T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})+T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})\right)\biggr]^{2}

    while that of the denominator is

    \widehat{D}_{2,\text{GMS},\text{PF}}^{\textbf{u}}=\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{2N}\sum_{j=1}^{N}\left(T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})^{2}+T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})^{2}\right)\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{2N}\sum_{j=1}^{N}\left(T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})+T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})\right)\biggr]^{2}.

    For $\mathcal{X}=\mathbb{R}^{k}$, $m=1$, and $T_{a}$ given by $T_{a}(x)=\mathbbm{1}_{\{x\leqslant a\}}$, the index $S_{2,\text{GMS},\text{PF}}^{\textbf{u}}$ is nothing more than the index $S_{2,\text{CVM}}^{\textbf{u}}$ defined in [26], based on the Cramér-von-Mises distance and the whole distribution of the output. Its estimator $\widehat{S}_{2,\text{CVM}}^{\textbf{u}}$, defined as the ratio of $\widehat{N}_{2,\text{GMS},\text{PF}}^{\textbf{u}}$ and $\widehat{D}_{2,\text{GMS},\text{PF}}^{\textbf{u}}$ with $T_{a}(x)=\mathbbm{1}_{\{x\leqslant a\}}$, has been proved to be asymptotically Gaussian [26, Theorem 3.8]. The proof relies on Donsker’s theorem and the functional delta method [58, Theorem 20.8]. Hence, in the general case of $S_{2,\text{GMS}}^{\textbf{u}}$, the central limit theorem will still hold as soon as the collection $(T_{a})_{a\in\mathcal{X}^{m}}$ forms a Donsker class of functions.

  • Second method - U-statistics. As done in [27], this method allows the practitioner to get rid of the additional random variables $(W_{l,k})$ for $l\in\{1,\ldots,m\}$ and $k\in\{1,\ldots,N\}$. The estimator is now based on U-statistics and deals simultaneously with the Sobol part and the integration part with respect to $d\mathbb{P}^{\otimes m}(a)$. It suffices to rewrite $S_{2,\text{GMS}}^{\textbf{u}}$ as

    S_{2,\text{GMS}}^{\textbf{u}}=\frac{I(\Phi_{1})-I(\Phi_{2})}{I(\Phi_{3})-I(\Phi_{4})}, (5)

    where,

    \Phi_{1}(\mathbf{z}_{1},\dots,\mathbf{z}_{m+1})=T_{z_{1},\dots,z_{m}}(z_{m+1})T_{z_{1},\dots,z_{m}}(z_{m+1}^{\textbf{u}}),
    \Phi_{2}(\mathbf{z}_{1},\dots,\mathbf{z}_{m+2})=T_{z_{1},\dots,z_{m}}(z_{m+1})T_{z_{1},\dots,z_{m}}(z_{m+2}^{\textbf{u}}), (6)
    \Phi_{3}(\mathbf{z}_{1},\dots,\mathbf{z}_{m+1})=T_{z_{1},\dots,z_{m}}(z_{m+1})^{2},
    \Phi_{4}(\mathbf{z}_{1},\dots,\mathbf{z}_{m+2})=T_{z_{1},\dots,z_{m}}(z_{m+1})T_{z_{1},\dots,z_{m}}(z_{m+2}),

    denoting by $\mathbf{z}_{i}$ the pair $(z_{i},z_{i}^{\textbf{u}})$ and, for $l=1,\dots,4$,

    I(\Phi_{l})=\int_{\mathcal{X}^{m(l)}}\Phi_{l}(\mathbf{z}_{1},\dots,\mathbf{z}_{m(l)})\,d\mathbb{P}_{2}^{u,\otimes m(l)}(\mathbf{z}_{1},\dots,\mathbf{z}_{m(l)}), (7)

    with $m(1)=m(3)=m+1$ and $m(2)=m(4)=m+2$. Finally, one considers the empirical version of (5) as estimator of $S_{2,\text{GMS}}^{\textbf{u}}$:

    \widehat{S}_{2,\text{GMS},\text{Ustat}}^{\textbf{u}}=\frac{U_{1,N}-U_{2,N}}{U_{3,N}-U_{4,N}}, (8)

    where, for $l=1,\dots,4$,

    U_{l,N}=\binom{N}{m(l)}^{-1}\sum_{1\leqslant i_{1}<\dots<i_{m(l)}\leqslant N}\Phi_{l}^{s}\left(\mathbf{Z}_{i_{1}},\dots,\mathbf{Z}_{i_{m(l)}}\right) (9)

    and the function

    \Phi_{l}^{s}(\mathbf{z}_{1},\dots,\mathbf{z}_{m(l)})=\frac{1}{(m(l))!}\sum_{\tau\in\mathcal{S}_{m(l)}}\Phi_{l}(\mathbf{z}_{\tau(1)},\dots,\mathbf{z}_{\tau(m(l))})

    is the symmetrized version of $\Phi_{l}$. In [27, Theorem 2.3], the estimator $\widehat{S}_{2,\text{GMS},\text{Ustat}}^{\textbf{u}}$ has been proved to be consistent and asymptotically Gaussian.

    Even though the Pick-Freeze procedure is quite general, it presents some drawbacks. First of all, the Pick-Freeze design of experiment is peculiar and may not be available in real applications. Moreover, it can unfortunately be very time consuming in practice. For instance, estimating all the first order Sobol indices requires $(p+1)N$ calls to the computer code.

  • Third method - Rank-based. In [14], Chatterjee proposes an efficient way based on ranks to estimate a new coefficient of correlation. This estimation procedure can be seen as an approximation of the Pick-Freeze scheme and has then been exploited in [23] to perform a more efficient estimation of $S_{2,\text{GMS}}^{\textbf{u}}$. However, this method is only well tailored for estimating first order indices, i.e. in the case of $\textbf{u}=\{i\}$ for some $i\in\{1,\ldots,p\}$ and when the input $X_{i}\in\mathbb{R}$.

    More precisely, an i.i.d. sample of pairs of real-valued random variables $(X_{i,j},Y_{j})_{1\leqslant j\leqslant N}$ ($i\in\{1,\cdots,p\}$) is considered, assuming for simplicity that the laws of $X_{i}$ and $Y$ are both diffuse (ties are excluded). The pairs $(X_{i,(1)},Y_{(1)}),\ldots,(X_{i,(N)},Y_{(N)})$ are rearranged in such a way that

    X_{i,(1)}<\ldots<X_{i,(N)}

    and, for any $j=1,\ldots,N$, $Y_{(j)}$ is the output computed from $X_{i,(j)}$. Let $r_{j}$ be the rank of $Y_{(j)}$, that is,

    r_{j}=\#\{j^{\prime}\in\{1,\dots,N\},\ Y_{(j^{\prime})}\leqslant Y_{(j)}\}.

    The new correlation coefficient is then given by

    \xi_{N}(X_{i},Y)=1-\frac{3\sum_{j=1}^{N-1}|r_{j+1}-r_{j}|}{N^{2}-1}. (10)

    In [14], it is proved that $\xi_{N}(X,Y)$ converges almost surely to a deterministic limit $\xi(X,Y)$ which is actually equal to $S_{2,\text{CVM}}^{i}$ when $Y=Z=f(X_{1},\cdots,X_{p})$. Further, the author also proves a central limit theorem when $X_{i}$ and $Y$ are independent, which is clearly not relevant in the context of sensitivity analysis (where $X_{i}$ and $Y$ are assumed to be dependent through the computer code).

    In our context, recall that $\textbf{u}=\{i\}$ and let $Y=Z$. Let also $\pi_{i}(j)$ be the rank of $X_{i,j}$ in the sample $(X_{i,1},\ldots,X_{i,N})$ of $X_{i}$ and define

    N_{i}(j)=\begin{cases}\pi_{i}^{-1}(\pi_{i}(j)+1)&\text{if }\pi_{i}(j)+1\leqslant N,\\ \pi_{i}^{-1}(1)&\text{if }\pi_{i}(j)=N.\end{cases} (11)

    Then the empirical estimator $\widehat{S}_{2,\text{GMS},\text{Rank}}^{i}$ of $S_{2,\text{GMS}}^{i}$ only requires an $N$-sample $(Z_{j})_{1\leqslant j\leqslant N}$ of $Z$ and is given by the ratio between

    \widehat{N}_{2,\text{GMS},\text{Rank}}^{i}=\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{Z_{i_{1}},\dots,Z_{i_{m}}}(Z_{j})T_{Z_{i_{1}},\dots,Z_{i_{m}}}(Z_{N_{i}(j)})\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{Z_{i_{1}},\dots,Z_{i_{m}}}(Z_{j})\biggr]^{2}

    and $\widehat{D}_{2,\text{GMS},\text{Rank}}^{i}$ given by

    \frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{Z_{i_{1}},\dots,Z_{i_{m}}}(Z_{j})^{2}\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{Z_{i_{1}},\dots,Z_{i_{m}}}(Z_{j})\biggr]^{2}.

    It is worth mentioning that $Z_{N_{i}(j)}$ plays the same role as $Z^{i}_{j}$ (the Pick-Freeze version of $Z_{j}$) in the Pick-Freeze estimation procedure. Indeed, the strength of the rank-based estimation procedure lies in the fact that only one $N$-sample of $Z$ is required, while $(m+2)$ samples of size $N$ are necessary in the Pick-Freeze estimation of a single index (worse, $(m+1+p)$ samples of size $N$ are required when one wants to estimate $p$ indices). A minimal numerical sketch of the coefficient (10) and of the pairing (11) is given right after this list.
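The following Python sketch implements the coefficient (10) and the rank pairing (11); the toy model, the sample size and all names are our own illustrative choices, not the paper's.

```python
import numpy as np

def chatterjee_xi(x, y):
    """Chatterjee's coefficient xi_N of Eq. (10), assuming no ties."""
    n = len(x)
    y_sorted = y[np.argsort(x)]                  # sort the pairs by increasing x
    r = np.argsort(np.argsort(y_sorted)) + 1     # ranks r_j of the reordered outputs
    return 1.0 - 3.0 * np.sum(np.abs(np.diff(r))) / (n**2 - 1)

def rank_neighbour(x):
    """Indices N_i(j) of Eq. (11): the point whose rank in x follows that of x_j."""
    pi = np.argsort(np.argsort(x))               # 0-based ranks pi_i(j)
    inv = np.argsort(pi)                         # inverse permutation pi_i^{-1}
    return inv[(pi + 1) % len(x)]                # the largest x wraps to the smallest

# toy model Z = X1 + 2*X2 with uniform inputs (our choice)
rng = np.random.default_rng(0)
N = 20_000
X = rng.random((N, 2))
Z = X[:, 0] + 2.0 * X[:, 1]

print(chatterjee_xi(X[:, 0], Z))   # consistent estimate of S^{1}_{2,CVM}
# Z[rank_neighbour(X[:, 0])] then plays the role of the Pick-Freeze copy Z^{1}_j
```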

Comparison of the estimation procedures

First, the Pick-Freeze estimation procedure allows the estimation of several sensitivity indices: the classical Sobol indices for real-valued outputs, as well as their generalization for vector-valued codes, but also the indices based on higher moments [49] and the Cramér-von-Mises indices which take into account the whole distribution [26, 21]. Moreover, the Pick-Freeze estimators have desirable statistical properties. More precisely, this estimation scheme has been proved to be consistent and asymptotically normal (i.e. the rate of convergence is $\sqrt{N}$) in [36, 25, 27]. The limiting variances can be computed explicitly, allowing the practitioner to build confidence intervals. In addition, for a given sample size $N$, exponential inequalities have been established. Last but not least, the sequence of estimators is asymptotically efficient for such a design of experiment (see [58] for the definition of asymptotic efficiency and [25] for more details on the result).

However, the Pick-Freeze estimators have two major drawbacks. First, they rely on a particular experimental design that may be unavailable in practice. Second, the number of model calls to estimate all first order Sobol indices grows linearly with the number of input parameters. For example, if we consider $p=99$ input parameters and only $n=1000$ calls are allowed, then only a sample of size $n/(p+1)=10$ is available to estimate each single first order Sobol index.

Secondly, the estimation procedure based on U-statistics offers the same kind of asymptotic guarantees, namely consistency and asymptotic normality. Furthermore, the estimation scheme only requires $2N$ evaluations of the code. Last but not least, using the results of Hoeffding [33] on U-statistics, the asymptotic normality is proved straightforwardly.
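To make the U-statistic scheme concrete, here is a minimal sketch of the estimator (8)-(9) for $m=1$ with the Cramér-von-Mises test functions $T_{a}(x)=\mathbbm{1}_{\{x\leqslant a\}}$; the toy model is our own choice, and $N$ is kept deliberately small since the exhaustive enumeration below costs $O(N^{3})$.

```python
import numpy as np
from itertools import combinations, permutations

def u_stat(kernel, pairs, order):
    """Symmetrized U-statistic (9) over the Pick-Freeze pairs (Z_j, Z_j^u)."""
    vals = [np.mean([kernel(*(pairs[k] for k in perm))
                     for perm in permutations(idx)])
            for idx in combinations(range(len(pairs)), order)]
    return np.mean(vals)

# kernels (6) for m = 1 and T_a(x) = 1{x <= a}; each argument is a pair (z, z^u)
T = lambda a, x: float(x <= a)
phi1 = lambda z1, z2: T(z1[0], z2[0]) * T(z1[0], z2[1])
phi2 = lambda z1, z2, z3: T(z1[0], z2[0]) * T(z1[0], z3[1])
phi3 = lambda z1, z2: T(z1[0], z2[0]) ** 2
phi4 = lambda z1, z2, z3: T(z1[0], z2[0]) * T(z1[0], z3[0])

# Pick-Freeze pairs on the toy model Z = X1 + 2*X2 with u = {1} (our choice)
rng = np.random.default_rng(1)
N = 50                                   # small on purpose: the estimate is noisy
X1, X2, X2p = rng.random((3, N))
pairs = np.column_stack([X1 + 2.0 * X2, X1 + 2.0 * X2p])

num = u_stat(phi1, pairs, 2) - u_stat(phi2, pairs, 3)
den = u_stat(phi3, pairs, 2) - u_stat(phi4, pairs, 3)
print(num / den)                         # estimate (8) of the CvM index of X1
```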

Finally, embedding Chatterjee’s method in the GSA framework (called the rank-based method in this framework) eliminates the two drawbacks of the classical Pick-Freeze estimation. In addition, the rank-based method allows for the estimation of a large class of GSA indices which includes the Sobol indices and the higher order moment indices proposed by Owen [47, 49, 48]. Using a single sample of size $N$, it is now possible to estimate at the same time all the first order Sobol indices, first order Cramér-von-Mises indices, and other useful first order sensitivity indices, as soon as all the inputs are real-valued.

2.2 The universal sensitivity index

Formula (2) can be generalized in the following ways.

  1. The point $a$ in the definition of the test functions is allowed to belong to a measurable space other than $\mathcal{X}^{m}$.

  2. The probability measure $\mathbb{P}^{\otimes m}$ in (2) can be replaced by any “admissible” probability measure.

Such generalizations lead to the definition of a universal sensitivity index and of its estimation procedures.

Definition 2.1.

Let $a$ belong to some measurable space $\Omega$ endowed with some probability measure $\mathbb{Q}$. For any $\textbf{u}\subset\{1,\cdots,p\}$, we define the universal sensitivity index with respect to $X_{\textbf{u}}$ by

S_{2,\text{Univ}}^{\textbf{u}}(T_{a},\mathbb{Q})=\frac{\int_{\Omega}\mathbb{E}\left[\left(\mathbb{E}[T_{a}(Z)]-\mathbb{E}[T_{a}(Z)|X_{\textbf{u}}]\right)^{2}\right]d\mathbb{Q}(a)}{\int_{\Omega}\mathrm{Var}(T_{a}(Z))\,d\mathbb{Q}(a)}=\frac{\int_{\Omega}\mathrm{Var}\left(\mathbb{E}[T_{a}(Z)|X_{\textbf{u}}]\right)d\mathbb{Q}(a)}{\int_{\Omega}\mathrm{Var}(T_{a}(Z))\,d\mathbb{Q}(a)}. (12)

Notice that the index $S_{2,\text{Univ}}^{\textbf{u}}(T_{a},\mathbb{Q})$ is obtained by integrating, over $a$ with respect to $\mathbb{Q}$, the Hoeffding decomposition of $T_{a}(Z)$. Hence, by construction, this index lies in $[0,1]$ and shares the same properties as its Sobol counterparts:

  1. the different contributions sum to 1;

  2. it is invariant under translations, isometries, and non-degenerate scalings of $Z$.

The universality is twofold. First, it allows one to consider more general relevant indices. Second, this definition encompasses, as particular cases, the classical sensitivity indices. Indeed,

  • the so-called Sobol index $S^{\textbf{u}}$ with respect to $X_{\textbf{u}}$ is $S_{2,\text{Univ}}^{\textbf{u}}(\operatorname{Id},\mathbb{P})$;

  • the Cramér-von-Mises index $S_{2,\text{CVM}}^{\textbf{u}}$ with respect to $X_{\textbf{u}}$ is $S_{2,\text{Univ}}^{\textbf{u}}(\mathbbm{1}_{\cdot\leqslant a},\mathbb{P}^{\otimes d})$ where $\mathcal{X}=\mathbb{R}^{d}$ and $\Omega=\mathcal{X}$;

  • the general metric space sensitivity index $S_{2,\text{GMS}}^{\textbf{u}}$ with respect to $X_{\textbf{u}}$ is $S_{2,\text{Univ}}^{\textbf{u}}(T_{a},\mathbb{P}^{\otimes m})$ where $\Omega=\mathcal{X}^{m}$.

An example where $\mathbb{Q}$ is different from $\mathbb{P}$ will be considered in Section 4.

Estimation

Here, we assume that $\mathbb{Q}$ is different from $\mathbb{P}^{\otimes m}$ and we follow the same lines as for the estimation of $S_{2,\text{GMS}}^{\textbf{u}}$ in Section 2.1.

  • First method - Pick-Freeze. We use the same design of experiment as in the First method of Section 2.1, but instead of considering that the $m$ additional $N$-samples $(W_{l,k})$, $l\in\{1,\ldots,m\}$ and $k\in\{1,\ldots,N\}$, are drawn with respect to the distribution $\mathbb{P}$ of the output, they are now drawn with respect to the law $\mathbb{Q}$. More precisely, one considers the following design of experiment consisting of:

    1. a classical Pick-Freeze sample, that is, two $N$-samples of $Z$: $(Z_{j},Z_{j}^{\textbf{u}})$, $1\leqslant j\leqslant N$;

    2. $m$ $\mathbb{Q}$-distributed $N$-samples $W_{l,k}$, $l\in\{1,\ldots,m\}$ and $k\in\{1,\ldots,N\}$, that are independent of $(Z_{j},Z_{j}^{\textbf{u}})$ for $1\leqslant j\leqslant N$.

    The empirical estimator of the numerator of $S_{2,\text{Univ}}^{\textbf{u}}$ is then given by

    \widehat{N}_{2,\text{Univ},\text{PF}}^{\textbf{u}}=\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{2N}\sum_{j=1}^{N}\left(T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})+T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})\right)\biggr]^{2}

    while that of the denominator is

    \widehat{D}_{2,\text{Univ},\text{PF}}^{\textbf{u}}=\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{2N}\sum_{j=1}^{N}\left(T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})^{2}+T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})^{2}\right)\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{2N}\sum_{j=1}^{N}\left(T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j})+T_{W_{1,i_{1}},\dots,W_{m,i_{m}}}(Z_{j}^{\textbf{u}})\right)\biggr]^{2}.

    As previously, it is straightforward (as soon as the collection $(T_{a})_{a\in\Omega}$ forms a Donsker class of functions) to adapt the proof of [26, Theorem 3.8] to prove the asymptotic normality of the estimator.

  • Second method - U-statistics. This method is not relevant in this case since $\mathbb{Q}\neq\mathbb{P}^{\otimes m}$.

  • Third method - Rank-based. Here, the design of experiment reduces to:

    1. an $N$-sample of $Z$: $Z_{j}$, $1\leqslant j\leqslant N$;

    2. an $N$-sample of $W$ that is $\mathbb{Q}$-distributed: $W_{k}$, $1\leqslant k\leqslant N$, independent of $Z_{j}$, $1\leqslant j\leqslant N$.

    The empirical estimator $\widehat{S}_{2,\text{Univ},\text{Rank}}^{\textbf{u}}$ of $S_{2,\text{Univ}}^{\textbf{u}}$ is then given by the ratio between

    \widehat{N}_{2,\text{Univ},\text{Rank}}^{\textbf{u}}=\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{W_{i_{1}},\dots,W_{i_{m}}}(Z_{j})T_{W_{i_{1}},\dots,W_{i_{m}}}(Z_{N(j)})\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{W_{i_{1}},\dots,W_{i_{m}}}(Z_{j})\biggr]^{2}

    and $\widehat{D}_{2,\text{Univ},\text{Rank}}$ given by

    \frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{W_{i_{1}},\dots,W_{i_{m}}}(Z_{j})^{2}\biggr]-\frac{1}{N^{m}}\sum_{1\leqslant i_{1},\dots,i_{m}\leqslant N}\biggl[\frac{1}{N}\sum_{j=1}^{N}T_{W_{i_{1}},\dots,W_{i_{m}}}(Z_{j})\biggr]^{2}.

We recall that this last method only applies to first order sensitivity indices and real-valued input variables.

2.3 A sketch of answers to Questions 1 to 3

In the sequel, we discuss how pertinent choices of the metric, of the class of test functions $T_{a}$, and of the probability measure $\mathbb{Q}$ can provide some partial answers to Questions 1 to 3 raised at the beginning of Section 2. For instance, in order to answer Question 1, we can consider that $\mathcal{X}=\mathcal{M}_{q}(\mathbb{R})$ is the space of probability measures $\mu$ on $\mathbb{R}$ with a finite moment of order $q$, and we endow this space with the Wasserstein metric $W_{q}$ (see Section 3.1 for some recalls on Wasserstein metrics). We will propose two possible approaches to define interesting sensitivity indices in this framework.

  • In Section 4.1, we use (2) with $m=2$, $a=(\mu_{1},\mu_{2})$ and $T_{a}(Z)=\mathbbm{1}_{\{Z\in B(\mu_{1},\mu_{2})\}}$, where $B(\mu_{1},\mu_{2})$ is the open ball $\{\mu\in\mathcal{M}_{q}(\mathbb{R}),\ W_{q}(\mu,\mu_{1})<W_{q}(\mu_{1},\mu_{2})\}$.

  • In Section 4.2, we use the notion of Fréchet means on Wasserstein spaces (see Section 3.2) and the index defined in (12) with appropriate choices of $a$, $T_{a}$, and $\mathbb{Q}$.

The case of stochastic computer codes raised in Question 2 will be addressed as follows. A computer code (to be defined) valued in $\mathcal{M}_{q}(\mathbb{R})$ will be seen as an ideal case of stochastic computer codes. Finally, it will be possible to treat Question 3 using the framework of Question 2.

3 Wasserstein spaces and random distributions

3.1 Definition

For any $q\geqslant 1$, we define the $q$-Wasserstein distance between two probability distributions that are $L^{q}$-integrable and characterized by their c.d.f.’s $F$ and $G$ on $\mathbb{R}^{d}$ by:

W_{q}(F,G)=\min_{X\sim F,\,Y\sim G}\left(\mathbb{E}[\|X-Y\|^{q}]^{1/q}\right),

where $X\sim F$ and $Y\sim G$ mean that $X$ and $Y$ are random variables with respective c.d.f.’s $F$ and $G$. We define the Wasserstein space $\mathcal{W}_{q}(\mathbb{R}^{d})$ as the space of all $L^{q}$-integrable measures defined on $\mathbb{R}^{d}$ endowed with the $q$-Wasserstein distance $W_{q}$. In the sequel, any measure is identified with its c.d.f. or, in some cases, with its p.d.f. In the unidimensional case ($d=1$), it is a well known fact that $W_{q}(F,G)$ has an explicit expression given by

W_{q}(F,G)=\left(\int_{0}^{1}|F^{-}(v)-G^{-}(v)|^{q}dv\right)^{1/q}=\mathbb{E}[|F^{-}(U)-G^{-}(U)|^{q}]^{1/q}. (13)

Here $F^{-}$ and $G^{-}$ are the generalized inverses of the non-decreasing functions $F$ and $G$, and $U$ is a random variable uniformly distributed on $[0,1]$. Of course, $F^{-}(U)$ and $G^{-}(U)$ have c.d.f.’s $F$ and $G$. The representation (13) of the $q$-Wasserstein distance when $d=1$ can be generalized to a wider class of “contrast functions”. For more details on Wasserstein spaces, one can refer to [59] and [5] and the references therein.
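As a quick numerical illustration of (13), the following Python sketch computes the $q$-Wasserstein distance between two empirical measures of the same size by matching sorted samples; the Gaussian example and all names are our own choices.

```python
import numpy as np

def wasserstein_q(x, y, q=2.0):
    """One-dimensional q-Wasserstein distance of Eq. (13) between two empirical
    measures of equal size: plug the empirical quantile functions into the
    integral, which amounts to matching the sorted samples."""
    xs, ys = np.sort(x), np.sort(y)
    return np.mean(np.abs(xs - ys) ** q) ** (1.0 / q)

rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 100_000)
y = rng.normal(1.0, 1.0, 100_000)
print(wasserstein_q(x, y, q=2))   # close to 1: W_2(N(0,1), N(1,1)) = |shift| = 1
```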

Definition 3.1.

We call a contrast function any map $c$ from $\mathbb{R}^{2}$ to $\mathbb{R}$ satisfying the “measure property” $\mathcal{P}$ defined by

\mathcal{P}:\ \forall x\leqslant x^{\prime}\ \text{and}\ \forall y\leqslant y^{\prime},\quad c(x^{\prime},y^{\prime})-c(x^{\prime},y)-c(x,y^{\prime})+c(x,y)\leqslant 0,

meaning that $c$ defines a negative measure on $\mathbb{R}^{2}$.

For instance, $c(x,y)=-xy$ satisfies $\mathcal{P}$. If $c$ satisfies $\mathcal{P}$, then any function of the form $a(x)+b(y)+c(x,y)$ also satisfies $\mathcal{P}$. If $C$ is a convex real function, $c(x,y)=C(x-y)$ satisfies $\mathcal{P}$. In particular, $c(x,y)=(x-y)^{2}=x^{2}+y^{2}-2xy$ satisfies $\mathcal{P}$ and actually so does $c(x,y)=|x-y|^{p}$ as soon as $p\geqslant 1$.

Definition 3.2.

We define the Skorohod space $\mathcal{D}:=\mathcal{D}\left([0,1]\right)$ of all distribution functions as the space of all non-decreasing functions from $\mathbb{R}$ to $[0,1]$ that are càd-làg with limit 0 (resp. 1) at $-\infty$ (resp. $+\infty$), equipped with the supremum norm.

Definition 3.3.

For any $F\in\mathcal{D}$, any $G\in\mathcal{D}$, and any positive contrast function $c$, we define the $c$-Wasserstein cost by

W_{c}(F,G)=\min_{X\sim F,\,Y\sim G}\mathbb{E}\left[c(X,Y)\right]<+\infty.

Obviously, $W_{q}^{q}=W_{c}$ with $c(x,y)=|x-y|^{q}$. The following theorem can be found in [11].

Theorem 3.4 (Cambanis, Simon, Stout [11]).

Let $c$ be a contrast function. Then

W_{c}(F,G)=\int_{0}^{1}c(F^{-}(v),G^{-}(v))dv=\mathbb{E}[c(F^{-}(U),G^{-}(U))],

where $U$ is a random variable uniformly distributed on $[0,1]$.

3.2 Extension of the Fréchet mean to contrast functions

Definition 3.5.

We call a loss function any positive and measurable function $l$. Then, we define a Fréchet feature ${\cal E}_{l}[X]$ of a random variable $X$ taking values in a measurable space ${\cal M}$ as (whenever it exists):

{\cal E}_{l}[X]\in\operatorname*{Argmin}_{\theta\in{\cal M}}\mathbb{E}[l(X,\theta)]. (14)

When $\mathcal{M}$ is a metric space endowed with a distance $d=l$, the Fréchet feature corresponds to the classical Fréchet mean (see [22]). In particular, ${\cal E}_{d}[X]$ minimizes $\mathbb{E}[d^{2}(X,\theta)]$, which is an extension of the definition of the classical mean in $\mathbb{R}^{d}$ that minimizes $\mathbb{E}[\|X-\theta\|^{2}]$.

Now we consider $\mathcal{M}=\mathcal{D}$ and $l=W_{c}$. Further, Equation (14) becomes

{\cal E}_{W_{c}}[\mathbb{F}]\in\operatorname*{Argmin}_{G\in\mathcal{D}}\mathbb{E}\left[W_{c}(\mathbb{F},G)\right],

where $\mathbb{F}$ is a measurable function from a measurable space $\Omega$ to $\mathcal{D}$.

Theorem 3.6.

Let $c$ be a positive contrast function. Assume that the map defined by $(\omega,v)\in\Omega\times(0,1)\mapsto\mathbb{F}^{-}(\omega,v)\in\mathbb{R}$ is measurable. In addition, assume that ${\cal E}_{c}[\mathbb{F}]$ exists and is unique. Then, for every $v$, there exists a unique minimizer of $s\mapsto\mathbb{E}[c(\mathbb{F}^{-}(v),s)]$, denoted by ${\cal E}_{c}[\mathbb{F}^{-}](v)$, and we have

({\cal E}_{c}[\mathbb{F}])^{-}(v)={\cal E}_{c}[\mathbb{F}^{-}](v)=\operatorname*{Argmin}_{s\in\mathbb{R}}\mathbb{E}[c(\mathbb{F}^{-}(v),s)].
Proof of Theorem 3.6.

Since $c$ satisfies $\mathcal{P}$, we have

\mathbb{E}[W_{c}(\mathbb{F},G)]=\mathbb{E}\left[\int_{0}^{1}c(\mathbb{F}^{-}(v),G^{-}(v))dv\right]=\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),G^{-}(v))]dv,

by Fubini’s theorem. Now, for all $v\in(0,1)$, the quantity $\mathbb{E}[c(\mathbb{F}^{-}(v),G^{-}(v))]$ is minimal for $G^{-}(v)={\cal E}_{c}[\mathbb{F}^{-}](v)$. Hence, for all $G$,

\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),{\cal E}_{c}[\mathbb{F}^{-}](v))]dv\leqslant\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),G^{-}(v))]dv

and, in particular, for $G^{-}=({\cal E}_{c}[\mathbb{F}])^{-}$, one gets

\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),{\cal E}_{c}[\mathbb{F}^{-}](v))]dv\leqslant\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),({\cal E}_{c}[\mathbb{F}])^{-}(v))]dv.

Conversely, by the definition of $({\cal E}_{c}[\mathbb{F}])^{-}$, we have, for all $G$,

\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),({\cal E}_{c}[\mathbb{F}])^{-}(v))]dv\leqslant\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),G^{-}(v))]dv

and, in particular, for $G^{-}={\cal E}_{c}[\mathbb{F}^{-}]$, one gets

\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),({\cal E}_{c}[\mathbb{F}])^{-}(v))]dv\leqslant\int_{0}^{1}\mathbb{E}[c(\mathbb{F}^{-}(v),{\cal E}_{c}[\mathbb{F}^{-}](v))]dv.

The theorem then follows by the uniqueness of the minimizer. ∎

In the previous theorem, we propose a very general nonparametric framework for the random element $\mathbb{F}$, together with some assumptions on the existence and uniqueness of the Fréchet feature and on the measurability of the map $(\omega,v)\mapsto\mathbb{F}^{-}(\omega,v)$. Nevertheless, it is possible to construct explicit parametric models for $\mathbb{F}$ for which these assumptions are satisfied. For instance, the authors of [4] ensure measurability for some parametric models on $\mathbb{F}$ using results of [18]. Notice that in [19] a new sensitivity index is defined for each feature associated with a contrast function. In Section 4.2, we will restrict our analysis to Fréchet means, hence to Sobol indices.

3.3 Examples

The Fréchet mean in the $\mathcal{W}_{2}(\mathbb{R})$ space is the inverse of the function $v\mapsto\mathbb{E}\left[\mathbb{F}^{-}(v)\right]$. Another example is the Fréchet median. Since the median in $\mathbb{R}$ is related to the $L^{1}$ cost, the Fréchet $\mathcal{W}_{1}(\mathbb{R})$ median of a random c.d.f. is

(\operatorname{Med}(\mathbb{F}))^{-}(v)\in\operatorname{Med}(\mathbb{F}^{-}(v)).

More generally, we recall that, for $\alpha\in(0,1)$, the $\alpha$-quantile in $\mathbb{R}$ is the Fréchet feature associated with the contrast function $c_{\alpha}(x,y)=(1-\alpha)(y-x)\mathbbm{1}_{x-y<0}+\alpha(x-y)\mathbbm{1}_{x-y\geqslant 0}$, also called the pinball function. Then we can define an $\alpha$-quantile $q_{\alpha}(\mathbb{F})$ of a random c.d.f. as:

(q_{\alpha}(\mathbb{F}))^{-}(v)\in q_{\alpha}(\mathbb{F}^{-}(v)),

where $q_{\alpha}(X)$ is the set of the $\alpha$-quantiles of a random variable $X$ taking values in $\mathbb{R}$. Naturally, taking $\alpha=1/2$ leads to the median.

Let us illustrate the previous definitions on an example. Let $X$ be a random variable with c.d.f. $F_{0}$, which is assumed to be increasing and continuous. Let also $m$ and $\sigma$ be two real random variables such that $\sigma>0$. Then we consider the random c.d.f. $\mathbb{F}$ of $\sigma X+m$:

\mathbb{F}(x)=F_{0}\left(\frac{x-m}{\sigma}\right)\quad\text{and}\quad\mathbb{F}^{-1}(v)=\sigma F_{0}^{-1}(v)+m.

Naturally, the Fréchet mean of $\mathbb{F}$ is ${\cal E}[\mathbb{F}](x)=F_{0}\left((x-\mathbb{E}[m])/\mathbb{E}[\sigma]\right)$ and its $\alpha$-quantile is given by

(q_{\alpha}(\mathbb{F}))^{-1}(v)=q_{\alpha}(\sigma F_{0}^{-1}(v)+m).
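This location-scale example can be checked numerically: averaging the quantile functions over the randomness of $(m,\sigma)$ recovers the predicted Fréchet mean. The sketch below assumes numpy and scipy are available; the standard normal choice of $F_{0}$ and the uniform distributions of $m$ and $\sigma$ are our own illustrative assumptions.

```python
import numpy as np
from scipy import stats

# Location-scale example: F(x) = F0((x - m) / sigma) with F0 standard normal.
rng = np.random.default_rng(0)
n_cdf = 5_000                          # number of random c.d.f.'s
m = rng.uniform(0.0, 2.0, n_cdf)       # hypothetical distributions of m and sigma
sigma = rng.uniform(0.5, 1.5, n_cdf)

v = np.linspace(0.001, 0.999, 999)     # grid on (0, 1)
quantiles = sigma[:, None] * stats.norm.ppf(v)[None, :] + m[:, None]  # F^{-1}(v)

# W2 Frechet mean: average the quantile functions over the randomness
frechet_quantile = quantiles.mean(axis=0)

# theoretical prediction: E[sigma] * F0^{-1}(v) + E[m] with E[sigma] = E[m] = 1
predicted = stats.norm.ppf(v) + 1.0
print(np.max(np.abs(frechet_quantile - predicted)))   # small Monte Carlo error
```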

4 Sensitivity analysis in general Wasserstein spaces

In this section, we consider that our computer code is $\mathcal{W}_{q}(\mathbb{R})$-valued; namely, the output of an experiment is the c.d.f. or the p.d.f. of a measure $\mu\in\mathcal{W}_{q}(\mathbb{R})$. For instance, in [9], [40] and [46], the authors deal with p.d.f.-valued computer codes (and stochastic computer codes). In other words, they define the following map:

\begin{matrix}f:&E&\to&\mathcal{F}\\ &x&\mapsto&f(x)\end{matrix}

where $\mathcal{F}$ is the set of p.d.f.’s:

\mathcal{F}=\left\{g\in L^{1}(\mathbb{R});\ g\geqslant 0,\ \int_{\mathbb{R}}g(x)dx=1\right\}.

Here, we choose to identify any element of $\mathcal{W}_{q}(\mathbb{R})$ with its c.d.f. In this framework, the output of the computer code is then a c.d.f. denoted by

\mathbb{F}=f(X_{1},\ldots,X_{p}). (16)

Here, $\mathbb{P}$ denotes the law of the c.d.f. $\mathbb{F}$. In addition, we set $q=2$. The case of a general $q$ can be handled analogously.

4.1 Sensitivity analysis using Equation (2) and Wasserstein balls

Consider three elements $F$, $F_{1}$, and $F_{2}$ of $\mathcal{W}_{2}(\mathbb{R})$ and, for $a=(F_{1},F_{2})$, the family of test functions

T_{a}(F)=T_{(F_{1},F_{2})}(F)=\mathbbm{1}_{W_{2}(F_{1},F)\leqslant W_{2}(F_{1},F_{2})}. (17)

Then, for all $\textbf{u}\subset\{1,\cdots,p\}$, (2) becomes

S_{2,W_{2}}^{\textbf{u}}=S_{2,\text{Univ}}^{\textbf{u}}((F_{1},F_{2},F)\mapsto T_{F_{1},F_{2}}(F),\mathbb{P}^{\otimes 2})
=\frac{\int_{\mathcal{W}_{2}(\mathbb{R})\times\mathcal{W}_{2}(\mathbb{R})}\mathbb{E}\left[\left(\mathbb{E}[\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})}]-\mathbb{E}[\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})}|X_{\textbf{u}}]\right)^{2}\right]d\mathbb{P}^{\otimes 2}(F_{1},F_{2})}{\int_{\mathcal{W}_{2}(\mathbb{R})\times\mathcal{W}_{2}(\mathbb{R})}\mathrm{Var}(\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})})\,d\mathbb{P}^{\otimes 2}(F_{1},F_{2})}
=\frac{\int_{\mathcal{W}_{2}(\mathbb{R})\times\mathcal{W}_{2}(\mathbb{R})}\mathrm{Var}\left(\mathbb{E}[\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})}|X_{\textbf{u}}]\right)d\mathbb{P}^{\otimes 2}(F_{1},F_{2})}{\int_{\mathcal{W}_{2}(\mathbb{R})\times\mathcal{W}_{2}(\mathbb{R})}\mathrm{Var}(\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})})\,d\mathbb{P}^{\otimes 2}(F_{1},F_{2})}. (18)

As explained in Section 2.1, $S_{2,W_{2}}^{\textbf{u}}$ is obtained by integration over $a$ of the Hoeffding decomposition of $T_{a}(\mathbb{F})$. Hence, by construction, this index lies in $[0,1]$ and shares the same properties as its Sobol counterparts:

  1. the different contributions sum to 1;

  2. it is invariant under translations, isometries, and non-degenerate scalings of $\mathbb{F}$.

4.2 Sensitivity analysis using Equation (12) and Fréchet means

In the classical framework where the output $Z$ is real, we recall that the Sobol index with respect to $X_{\textbf{u}}$ is defined by

S^{\textbf{u}}=\frac{\mathrm{Var}(\mathbb{E}[Z|X_{\textbf{u}}])}{\mathrm{Var}(Z)}=\frac{\mathrm{Var}(Z)-\mathbb{E}[\mathrm{Var}(Z|X_{\textbf{u}})]}{\mathrm{Var}(Z)}, (19)

by the properties of the conditional expectation. On the one hand, one may extend this formula to the framework of this section, where the output of interest is the c.d.f. $\mathbb{F}$:

S^{\textbf{u}}(\mathbb{F})=\frac{\mathrm{Var}(\mathbb{F})-\mathbb{E}[\mathrm{Var}(\mathbb{F}|X_{\textbf{u}})]}{\mathrm{Var}(\mathbb{F})},

where $\mathrm{Var}(\mathbb{F})=\mathbb{E}[W_{2}^{2}(\mathbb{F},{\cal E}_{W_{2}}(\mathbb{F}))]$ with ${\cal E}_{W_{2}}(\mathbb{F})$ the Fréchet mean of $\mathbb{F}$. From Theorem 3.6, we get

\mathrm{Var}(\mathbb{F})=\mathbb{E}\left[\int_{0}^{1}|\mathbb{F}^{-}(v)-{\cal E}(\mathbb{F})^{-}(v)|^{2}dv\right]=\mathbb{E}\left[\int_{0}^{1}|\mathbb{F}^{-}(v)-\mathbb{E}[\mathbb{F}^{-}(v)]|^{2}dv\right]=\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv,

leading to

S^{\textbf{u}}(\mathbb{F})=\frac{\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv-\int_{0}^{1}\mathbb{E}[\mathrm{Var}(\mathbb{F}^{-}(v)|X_{\textbf{u}})]dv}{\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv}=\frac{\int_{0}^{1}\mathrm{Var}(\mathbb{E}[\mathbb{F}^{-}(v)|X_{\textbf{u}}])dv}{\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv}. (20)

On the other hand, one can consider (12) with $m=1$,

T_{v}(\mathbb{F})=\mathbb{F}^{-}(v) (21)

and $\mathbb{Q}$ the uniform probability measure on $[0,1]$. In that case,

\mathrm{Var}(\mathbb{F})=\mathbb{E}\left[\int_{0}^{1}|\mathbb{F}^{-}(v)-{\cal E}_{W_{2}}(\mathbb{F})^{-}(v)|^{2}dv\right]=\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv=\mathbb{E}[W_{2}^{2}(\mathbb{F},{\cal E}_{W_{2}}(\mathbb{F}))].

Then

S_{2,\text{Univ}}^{\textbf{u}}(T_{v},\mathcal{U}([0,1]))=\frac{\int_{0}^{1}\mathbb{E}\left[\left({\cal E}_{W_{2}}(\mathbb{F})^{-}(v)-{\cal E}_{W_{2}}(\mathbb{F}|X_{\textbf{u}})^{-}(v)\right)^{2}\right]dv}{\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv}=\frac{\mathbb{E}\left[W_{2}^{2}({\cal E}_{W_{2}}(\mathbb{F}|X_{\textbf{u}}),{\cal E}_{W_{2}}(\mathbb{F}))\right]}{\mathbb{E}\left[W_{2}^{2}(\mathbb{F},{\cal E}_{W_{2}}(\mathbb{F}))\right]}

is exactly the same as $S^{\textbf{u}}(\mathbb{F})$ in (20). Thus, as explained in Section 2.2, $S^{\textbf{u}}(\mathbb{F})$ lies in $[0,1]$ and:

  1. the different contributions sum to 1;

  2. it is invariant under translations, isometries, and non-degenerate scalings of $\mathbb{F}$.

4.3 Estimation procedure

As noticed in the previous section, both

S_{2,W_{2}}^{\textbf{u}}=S_{2,\text{Univ}}^{\textbf{u}}(T_{a},\mathbb{P}^{\otimes 2})

with $T_{a}$ defined in (17) and

S^{\textbf{u}}(\mathbb{F})=S_{2,\text{Univ}}^{\textbf{u}}(T_{v},\mathcal{U}([0,1]))

with $T_{v}$ defined in (21), are particular cases of indices of the form (12).

When $a$ belongs to the same space as the output and when $\mathbb{Q}$ is equal to $\mathbb{P}^{\otimes m}$, one may first use the Pick-Freeze estimations of the indices given in (18) and (20). To do so, it is convenient once again to use (4), leading to

S_{2,W_{2}}^{\textbf{u}}=\frac{\int_{\mathcal{W}_{2}(\mathbb{R})\times\mathcal{W}_{2}(\mathbb{R})}\mathrm{Cov}\left(\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})},\mathbbm{1}_{W_{2}(F_{1},\mathbb{F}^{\textbf{u}})\leqslant W_{2}(F_{1},F_{2})}\right)d\mathbb{P}^{\otimes 2}(F_{1},F_{2})}{\int_{\mathcal{W}_{2}(\mathbb{R})\times\mathcal{W}_{2}(\mathbb{R})}\mathrm{Var}(\mathbbm{1}_{W_{2}(F_{1},\mathbb{F})\leqslant W_{2}(F_{1},F_{2})})\,d\mathbb{P}^{\otimes 2}(F_{1},F_{2})} (22)

and

S^{\textbf{u}}(\mathbb{F})=\frac{\int_{0}^{1}\mathrm{Cov}\left(\mathbb{F}^{-}(v),\mathbb{F}^{-,\textbf{u}}(v)\right)dv}{\int_{0}^{1}\mathrm{Var}(\mathbb{F}^{-}(v))dv} (23)

where $\mathbb{F}^{\textbf{u}}$ and $\mathbb{F}^{-,\textbf{u}}$ are respectively the Pick-Freeze versions of $\mathbb{F}$ and $\mathbb{F}^{-}$. Secondly, one may resort to the estimations based on U-statistics together with the Pick-Freeze design of experiment. Thirdly, it is also possible and easy to obtain rank-based estimations in the vein of (10).
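As an illustration, the following sketch implements the Pick-Freeze estimation of (23) on the toy model of Example 4.1 below, discretizing the integrals over $v$ on a regular grid; the choice of i.i.d. uniform inputs and all names are our own assumptions.

```python
import numpy as np

# Pick-Freeze estimation of S^u(F) via (23) on the toy model of Example 4.1,
# assuming (our illustrative choice) i.i.d. U(0,1) inputs.
rng = np.random.default_rng(0)
N = 100_000
v = np.linspace(0.0, 1.0, 101)[None, :]        # grid discretizing [0, 1]

X1, X2, X3, X2p, X3p = rng.random((5, N))      # inputs and independent copies

def quantile_output(x1, x2, x3):
    """Rows are the quantile functions F^{-1}(v) = v (1 + X1 + X2 + X1 X3)."""
    return v * (1.0 + x1 + x2 + x1 * x3)[:, None]

F = quantile_output(X1, X2, X3)                # F^{-}(v)
F_u = quantile_output(X1, X2p, X3p)            # Pick-Freeze version with u = {1}

# integrate the Pick-Freeze covariance and the variance over the grid in v
mean2 = ((F + F_u) / 2).mean(axis=0) ** 2
cov_v = (F * F_u).mean(axis=0) - mean2
var_v = ((F**2 + F_u**2) / 2).mean(axis=0) - mean2
print(cov_v.mean() / var_v.mean())             # close to S^1(F) = 27/43, about 0.628
```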

4.4 Numerical comparison of both indices

Example 4.1 (Toy model).

Let $X_{1},X_{2},X_{3}$ be 3 independent and positive random variables. We consider the c.d.f.-valued code $f$, the output of which is given by

\mathbb{F}(t)=\frac{t}{1+X_{1}+X_{2}+X_{1}X_{3}}\mathbbm{1}_{0\leqslant t\leqslant 1+X_{1}+X_{2}+X_{1}X_{3}}+\mathbbm{1}_{1+X_{1}+X_{2}+X_{1}X_{3}<t}, (24)

so that

\mathbb{F}^{-1}(v)=v\left(1+X_{1}+X_{2}+X_{1}X_{3}\right). (25)

In addition, one gets

\mathrm{Var}\left(\mathbb{F}^{-1}(v)\right)=v^{2}\left(\mathrm{Var}(X_{1}(1+X_{3}))+\mathrm{Var}(X_{2})\right)=v^{2}\left(\mathrm{Var}(X_{1})\mathrm{Var}(X_{3})+\mathrm{Var}(X_{1})(1+\mathbb{E}[X_{3}])^{2}+\mathrm{Var}(X_{3})\mathbb{E}[X_{1}]^{2}+\mathrm{Var}(X_{2})\right)

and

\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{1}\right]=v\left(1+X_{1}(1+\mathbb{E}[X_{3}])+\mathbb{E}[X_{2}]\right),
\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{2}\right]=v\left(1+\mathbb{E}[X_{1}](1+\mathbb{E}[X_{3}])+X_{2}\right),
\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{3}\right]=v\left(1+\mathbb{E}[X_{1}](1+X_{3})+\mathbb{E}[X_{2}]\right),
\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{1},X_{3}\right]=v\left(1+X_{1}(1+X_{3})+\mathbb{E}[X_{2}]\right),

and finally

\mathrm{Var}\left(\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{1}\right]\right)=v^{2}(1+\mathbb{E}[X_{3}])^{2}\mathrm{Var}(X_{1}),
\mathrm{Var}\left(\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{2}\right]\right)=v^{2}\mathrm{Var}(X_{2}),
\mathrm{Var}\left(\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{3}\right]\right)=v^{2}\mathbb{E}[X_{1}]^{2}\mathrm{Var}(X_{3}),
\mathrm{Var}\left(\mathbb{E}\left[\mathbb{F}^{-1}(v)|X_{1},X_{3}\right]\right)=v^{2}\left(\mathrm{Var}(X_{1})\mathrm{Var}(X_{3})+\mathrm{Var}(X_{1})(1+\mathbb{E}[X_{3}])^{2}+\mathrm{Var}(X_{3})\mathbb{E}[X_{1}]^{2}\right).

For $\textbf{u}=\{i\}$, $i\in\{1,2,3\}$, or $\textbf{u}=\{1,3\}$, it remains to plug the previous formulas into (20) to get the explicit expressions of the indices $S^{\textbf{u}}(\mathbb{F})$. A numerical evaluation for a particular choice of input distributions is sketched below.
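For instance, assuming (as an illustration of ours, not the paper's) that the $X_{i}$ are i.i.d. uniform on $[0,1]$, the factor $v^{2}$ cancels between the numerator and the denominator of (20), and the indices reduce to ratios of the constants above:

```python
import numpy as np

# Closed-form S^u(F) from (20) under the hypothetical assumption of i.i.d.
# U(0,1) inputs: the factor v^2 cancels, leaving ratios of constants.
EX1 = EX3 = 0.5
VX1 = VX2 = VX3 = 1.0 / 12.0

den = VX1 * VX3 + VX1 * (1 + EX3) ** 2 + VX3 * EX1**2 + VX2
print("S^1  =", VX1 * (1 + EX3) ** 2 / den)                       # 27/43 ~ 0.6279
print("S^2  =", VX2 / den)                                        # 12/43 ~ 0.2791
print("S^3  =", VX3 * EX1**2 / den)                               #  3/43 ~ 0.0698
print("S^13 =", (VX1 * VX3 + VX1 * (1 + EX3) ** 2 + VX3 * EX1**2) / den)
```

These closed-form values match the Monte-Carlo Pick-Freeze estimate sketched in Section 4.3.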

Now, in order to get a closed formula for the indices defined in (18), we assume that each $X_{i}$ is Bernoulli distributed with parameter $0<p_{i}<1$. In (18), the distributions $F_{1}$ and $F_{2}$ can be either $\mathcal{U}([0,1])$, $\mathcal{U}([0,2])$, $\mathcal{U}([0,3])$, or $\mathcal{U}([0,4])$ with respective probabilities $q_{1}=(1-p_{1})(1-p_{2})$, $q_{2}=(1-p_{1})p_{2}+p_{1}(1-p_{2})(1-p_{3})$, $q_{3}=p_{1}((1-p_{2})p_{3}+p_{2}(1-p_{3}))$, and $q_{4}=p_{1}p_{2}p_{3}$. In the sequel, we give, for all sixteen possibilities for the distribution of $(F_{1},F_{2})$, the corresponding contributions to the numerator and to the denominator of (18).

With probability $p_{1,1}=(1-p_{1})^{2}(1-p_{2})^{2}$, $F_{1}$ and $F_{2}$ are both $\mathcal{U}([0,1])$-distributed. Then $W_{2}^{2}(F_{1},F_{2})=0$, $W_{2}^{2}(F_{1},\mathbb{F})=\frac{1}{3}(X_{1}+X_{2}+X_{1}X_{3})^{2}$, and $W_{2}^{2}(F_{1},\mathbb{F})\leqslant W_{2}^{2}(F_{1},F_{2})$ if and only if $X_{1}+X_{2}+X_{1}X_{3}=0$. Since $\mathbb{P}\left(X_{1}+X_{2}+X_{1}X_{3}=0\right)=(1-p_{1})(1-p_{2})$, the contribution $d_{1,1}$ of this case to the denominator is thus

d_{1,1}=q_{1,1}(1-q_{1,1})\quad\text{with }q_{1,1}=(1-p_{1})(1-p_{2}).

Moreover,

\mathbb{E}[\mathbbm{1}_{X_{1}+X_{2}+X_{1}X_{3}=0}|X_{1}]=\mathbb{P}\left(X_{1}+X_{2}+X_{1}X_{3}=0|X_{1}\right)=\mathbbm{1}_{X_{1}=0}\mathbb{P}(X_{2}=0)=(1-p_{2})\mathbbm{1}_{X_{1}=0},

so that the contribution to the numerator is here given by

n_{1,1}^{1}=\mathrm{Var}(\mathbb{E}[\mathbbm{1}_{X_{1}+X_{2}+X_{1}X_{3}=0}|X_{1}])=p_{1}(1-p_{1})(1-p_{2})^{2}.

Similarly, one gets

n_{1,1}^{2}=\mathrm{Var}(\mathbb{E}[\mathbbm{1}_{X_{1}+X_{2}+X_{1}X_{3}=0}|X_{2}])=p_{2}(1-p_{2})(1-p_{1})^{2}\quad\text{and}\quad n_{1,1}^{3}=0.

Moreover, regarding the indices with respect to $X_{1}$ and $X_{3}$,

\mathbb{E}[\mathbbm{1}_{X_{1}+X_{2}+X_{1}X_{3}=0}|X_{1},X_{3}]=\mathbb{P}\left(X_{1}+X_{2}+X_{1}X_{3}=0|X_{1},X_{3}\right)=\mathbbm{1}_{X_{1}=0}\mathbb{P}(X_{2}=0)=(1-p_{2})\mathbbm{1}_{X_{1}=0}

and the contribution to the numerator is given by

n1,11,3=Var(𝔼[𝟙X1+X2+X1X3=0|X1,X3])=p1(1p1)(1p2)2.n_{1,1}^{1,3}=\hbox{{\rm Var}}(\mathbb{E}[\mathbbm{1}_{X_{1}+X_{2}+X_{1}X_{3}=0}|X_{1},X_{3}])=p_{1}(1-p_{1})(1-p_{2})^{2}.

The remaining fifteen cases can be treated similarly and are gathered, together with the first case developed above, in the list below. Finally, for $k=1$, $2$, $3$, one may compute the explicit expression of $S_{2,W_2}^k$ (taking $\textbf{u}=\{k\}$):

$S_{2,W_2}^{k}=\dfrac{\int_{\mathcal{W}_2(\mathbb{R})\times\mathcal{W}_2(\mathbb{R})}\mathrm{Cov}\left(\mathbbm{1}_{W_2(F_1,\mathbb{F})\leqslant W_2(F_1,F_2)},\mathbbm{1}_{W_2(F_1,\mathbb{F}^{\textbf{u}})\leqslant W_2(F_1,F_2)}\right)d\mathbb{P}^{\otimes 2}(F_1,F_2)}{\int_{\mathcal{W}_2(\mathbb{R})\times\mathcal{W}_2(\mathbb{R})}\mathrm{Var}\left(\mathbbm{1}_{W_2(F_1,\mathbb{F})\leqslant W_2(F_1,F_2)}\right)d\mathbb{P}^{\otimes 2}(F_1,F_2)}=\dfrac{\sum_{i,j}p_{i,j}n_{i,j}^{k}}{\sum_{i,j}p_{i,j}d_{i,j}}.$

Some of the entries are not made explicit in the list but are given below:

Case 2 Var(𝟙X1=1(1(1p2)𝟙X3=0))=p1(1p1)(1(1p2)(1p3))2+p1(1p2)2p3(1p3),\displaystyle\hbox{{\rm Var}}(\mathbbm{1}_{X_{1}=1}(1-(1-p_{2})\mathbbm{1}_{X_{3}=0}))=p_{1}(1-p_{1})(1-(1-p_{2})(1-p_{3}))^{2}+p_{1}(1-p_{2})^{2}p_{3}(1-p_{3}),
Case 6 Var(𝟙X1=1(p2(1p2)𝟙X3=0))=p1(1p1)(p2(1p2)(1p3))2+p1(1p2)2p3(1p3),\displaystyle\hbox{{\rm Var}}(\mathbbm{1}_{X_{1}=1}(p_{2}-(1-p_{2})\mathbbm{1}_{X_{3}=0}))=p_{1}(1-p_{1})(p_{2}-(1-p_{2})(1-p_{3}))^{2}+p_{1}(1-p_{2})^{2}p_{3}(1-p_{3}),
Case 11 Var(𝟙X1=1(p2+(12p2)𝟙X3=1))=p1(1p1)(p2+(12p2)p3)2+p1(12p2)2p3(1p3),\displaystyle\hbox{{\rm Var}}(\mathbbm{1}_{X_{1}=1}(p_{2}+(1-2p_{2})\mathbbm{1}_{X_{3}=1}))=p_{1}(1-p_{1})(p_{2}+(1-2p_{2})p_{3})^{2}+p_{1}(1-2p_{2})^{2}p_{3}(1-p_{3}),
Case 15 Var(𝟙X1=1(p2+(1p2)𝟙X3=1))=p1(1p1)(p2+(1p2)p3)2+p1(1p2)2p3(1p3).\displaystyle\hbox{{\rm Var}}(\mathbbm{1}_{X_{1}=1}(p_{2}+(1-p_{2})\mathbbm{1}_{X_{3}=1}))=p_{1}(1-p_{1})(p_{2}+(1-p_{2})p_{3})^{2}+p_{1}(1-p_{2})^{2}p_{3}(1-p_{3}).
In each case, "Prob." is the probability $p_{i,j}$ of the pair $(F_1,F_2)$, "Num. $k$" is the contribution $n_{i,j}^k$ to the numerator for $\textbf{u}=\{k\}$, "Num. $\{1,3\}$" is the contribution $n_{i,j}^{1,3}$, and "$q$ (Den.)" is the value $q_{i,j}$ whose associated denominator contribution is $d_{i,j}=q_{i,j}(1-q_{i,j})$.

Case 1 ($F_1\sim\mathcal{U}([0,1])$, $F_2\sim\mathcal{U}([0,1])$): Prob. $q_1^2$; Num. 1: $p_1(1-p_1)(1-p_2)^2$; Num. 2: $(1-p_1)^2p_2(1-p_2)$; Num. 3: $0$; Num. $\{1,3\}$: $p_1(1-p_1)(1-p_2)^2$; $q$ (Den.): $(1-p_1)(1-p_2)$.

Case 2 ($F_1\sim\mathcal{U}([0,1])$, $F_2\sim\mathcal{U}([0,2])$): Prob. $q_1q_2$; Num. 1: $p_1(1-p_1)(p_2+p_3-p_2p_3)^2$; Num. 2: $p_1^2p_2(1-p_2)(1-p_3)^2$; Num. 3: $p_1^2(1-p_2)^2p_3(1-p_3)$; Num. $\{1,3\}$: $\mathrm{Var}(\mathbbm{1}_{X_1=1}(1-(1-p_2)\mathbbm{1}_{X_3=0}))$; $q$ (Den.): $(1-p_1)+p_1(1-p_2)(1-p_3)$.

Case 3 ($F_1\sim\mathcal{U}([0,1])$, $F_2\sim\mathcal{U}([0,3])$): Prob. $q_1q_3$; Num. 1: $p_1(1-p_1)p_2^2p_3^2$; Num. 2: $p_1^2p_2(1-p_2)p_3^2$; Num. 3: $p_1^2p_2^2p_3(1-p_3)$; Num. $\{1,3\}$: $p_1p_2^2p_3(1-p_1p_3)$; $q$ (Den.): $1-p_1p_2p_3$.

Case 4 ($F_1\sim\mathcal{U}([0,1])$, $F_2\sim\mathcal{U}([0,4])$): Prob. $q_1q_4$; Num. 1: $0$; Num. 2: $0$; Num. 3: $0$; Num. $\{1,3\}$: $0$; $q$ (Den.): $0$.

Case 5 ($F_1\sim\mathcal{U}([0,2])$, $F_2\sim\mathcal{U}([0,1])$): Prob. $q_1q_2$; Num. 1: $p_1(1-p_1)p_2^2p_3^2$; Num. 2: $p_1^2p_2(1-p_2)p_3^2$; Num. 3: $p_1^2p_2^2p_3(1-p_3)$; Num. $\{1,3\}$: $p_1p_2^2p_3(1-p_1p_3)$; $q$ (Den.): $1-p_1p_2p_3$.

Case 6 ($F_1\sim\mathcal{U}([0,2])$, $F_2\sim\mathcal{U}([0,2])$): Prob. $q_2^2$; Num. 1: $p_1(1-p_1)(p_2-(1-p_2)(1-p_3))^2$; Num. 2: $p_2(1-p_2)(p_1(1-p_3)-(1-p_1))^2$; Num. 3: $p_1^2(1-p_2)^2p_3(1-p_3)$; Num. $\{1,3\}$: $\mathrm{Var}(\mathbbm{1}_{X_1=1}(p_2-(1-p_2)\mathbbm{1}_{X_3=0}))$; $q$ (Den.): $(1-p_1)p_2+p_1(1-p_2)(1-p_3)$.

Case 7 ($F_1\sim\mathcal{U}([0,2])$, $F_2\sim\mathcal{U}([0,3])$): Prob. $q_2q_3$; Num. 1: $p_1(1-p_1)p_2^2p_3^2$; Num. 2: $p_1^2p_2(1-p_2)p_3^2$; Num. 3: $p_1^2p_2^2p_3(1-p_3)$; Num. $\{1,3\}$: $p_1p_2^2p_3(1-p_1p_3)$; $q$ (Den.): $1-p_1p_2p_3$.

Case 8 ($F_1\sim\mathcal{U}([0,2])$, $F_2\sim\mathcal{U}([0,4])$): Prob. $q_2q_4$; Num. 1: $0$; Num. 2: $0$; Num. 3: $0$; Num. $\{1,3\}$: $0$; $q$ (Den.): $0$.

Case 9 ($F_1\sim\mathcal{U}([0,3])$, $F_2\sim\mathcal{U}([0,1])$): Prob. $q_1q_3$; Num. 1: $0$; Num. 2: $0$; Num. 3: $0$; Num. $\{1,3\}$: $0$; $q$ (Den.): $0$.

Case 10 ($F_1\sim\mathcal{U}([0,3])$, $F_2\sim\mathcal{U}([0,2])$): Prob. $q_2q_3$; Num. 1: $p_1(1-p_1)(1-p_2)^2$; Num. 2: $(1-p_1)^2p_2(1-p_2)$; Num. 3: $0$; Num. $\{1,3\}$: $p_1(1-p_1)(1-p_2)^2$; $q$ (Den.): $(1-p_1)p_2+p_1$.

Case 11 ($F_1\sim\mathcal{U}([0,3])$, $F_2\sim\mathcal{U}([0,3])$): Prob. $q_3^2$; Num. 1: $p_1(1-p_1)(p_2(1-p_3)+(1-p_2)p_3)^2$; Num. 2: $p_1^2p_2(1-p_2)(2p_3-1)^2$; Num. 3: $p_1^2(2p_2-1)^2p_3(1-p_3)$; Num. $\{1,3\}$: $\mathrm{Var}(\mathbbm{1}_{X_1=1}(p_2+(1-2p_2)\mathbbm{1}_{X_3=1}))$; $q$ (Den.): $p_1(p_2(1-p_3)+(1-p_2)p_3)$.

Case 12 ($F_1\sim\mathcal{U}([0,3])$, $F_2\sim\mathcal{U}([0,4])$): Prob. $q_3q_4$; Num. 1: $p_1(1-p_1)(1-p_2)^2$; Num. 2: $(1-p_1)^2p_2(1-p_2)$; Num. 3: $0$; Num. $\{1,3\}$: $p_1(1-p_1)(1-p_2)^2$; $q$ (Den.): $(1-p_1)p_2+p_1$.

Case 13 ($F_1\sim\mathcal{U}([0,4])$, $F_2\sim\mathcal{U}([0,1])$): Prob. $q_1q_4$; Num. 1: $0$; Num. 2: $0$; Num. 3: $0$; Num. $\{1,3\}$: $0$; $q$ (Den.): $0$.

Case 14 ($F_1\sim\mathcal{U}([0,4])$, $F_2\sim\mathcal{U}([0,2])$): Prob. $q_2q_4$; Num. 1: $p_1(1-p_1)(1-p_2)^2$; Num. 2: $(1-p_1)^2p_2(1-p_2)$; Num. 3: $0$; Num. $\{1,3\}$: $p_1(1-p_1)(1-p_2)^2$; $q$ (Den.): $(1-p_1)p_2+p_1$.

Case 15 ($F_1\sim\mathcal{U}([0,4])$, $F_2\sim\mathcal{U}([0,3])$): Prob. $q_3q_4$; Num. 1: $p_1(1-p_1)(p_2+(1-p_2)p_3)^2$; Num. 2: $p_1^2p_2(1-p_2)(1-p_3)^2$; Num. 3: $p_1^2(1-p_2)^2p_3(1-p_3)$; Num. $\{1,3\}$: $\mathrm{Var}(\mathbbm{1}_{X_1=1}(p_2+(1-p_2)\mathbbm{1}_{X_3=1}))$; $q$ (Den.): $p_1(p_2+(1-p_2)p_3)$.

Case 16 ($F_1\sim\mathcal{U}([0,4])$, $F_2\sim\mathcal{U}([0,4])$): Prob. $q_4^2$; Num. 1: $p_1(1-p_1)p_2^2p_3^2$; Num. 2: $p_1^2p_2(1-p_2)p_3^2$; Num. 3: $p_1^2p_2^2p_3(1-p_3)$; Num. $\{1,3\}$: $p_1p_2^2p_3(1-p_1p_3)$; $q$ (Den.): $p_1p_2p_3$.

In Figure 1, we represent the indices $S^1(\mathbb{F})$, $S^2(\mathbb{F})$, $S^3(\mathbb{F})$, and $S^{13}(\mathbb{F})$ given by (20) with respect to the values of $p_1$ and $p_2$ varying from 0 to 1, for a fixed value of $p_3$. We consider three different values of $p_3$: $p_3=0.01$ (first row), $0.5$ (second row), and $0.99$ (third row). Analogously, the same kind of illustration for the indices $S^1_{2,W_2}$, $S^2_{2,W_2}$, $S^3_{2,W_2}$, and $S^{13}_{2,W_2}$ given by (4.1) is provided in Figure 2. In addition, the regions of predominance of each index $S^{\textbf{u}}(\mathbb{F})$ are plotted in Figure 3. The values of $p_1$ and $p_2$ still vary from 0 to 1 and the fixed values of $p_3$ considered are $p_3=0.01$ (first row), $0.5$ (second row), and $0.99$ (third row). Finally, the same kind of illustration for the indices $S^{\textbf{u}}_{2,W_2}$ is given in Figure 4.

Figure 1: Model (24). Values of the indices S1(𝔽)S^{1}(\mathbb{F}), S2(𝔽)S^{2}(\mathbb{F}), S3(𝔽)S^{3}(\mathbb{F}), and S13(𝔽)S^{13}(\mathbb{F}) given by (20) (from left to right) with respect to the values of p1p_{1} and p2p_{2} (varying from 0 to 1). In the first row (resp. second and third), p3p_{3} is fixed to p3=0.01p_{3}=0.01 (resp. 0.50.5 and 0.990.99).
Figure 2: Model (24). Values of the indices S2,W21S^{1}_{2,W_{2}}, S2,W22S^{2}_{2,W_{2}}, S2,W23S^{3}_{2,W_{2}}, and S2,W213S^{13}_{2,W_{2}} given by (4.1) (from left to right) with respect to the values of p1p_{1} and p2p_{2} (varying from 0 to 1). In the first row (resp. second and third), p3p_{3} is fixed to p3=0.01p_{3}=0.01 (resp. 0.50.5 and 0.990.99).
Figure 3: Model (24). In the first row of the figure, regions where S1(𝔽)S2(𝔽)S^{1}(\mathbb{F})\geqslant S^{2}(\mathbb{F}) (black), S1(𝔽)S2(𝔽)S^{1}(\mathbb{F})\leqslant S^{2}(\mathbb{F}) (white), and S1(𝔽)=S2(𝔽)S^{1}(\mathbb{F})=S^{2}(\mathbb{F}) (gray) with respect to p1p_{1} and p2p_{2} varying from 0 to 1 and, from left to right, p3=0.01p_{3}=0.01, 0.50.5, and 0.990.99. Analogously, the second (resp. last) row considers the regions with S1(𝔽)S^{1}(\mathbb{F}) and S3(𝔽)S^{3}(\mathbb{F}) (resp. S2(𝔽)S^{2}(\mathbb{F}) and S3(𝔽)S^{3}(\mathbb{F})) with respect to p1p_{1} and p3p_{3} (resp. p2p_{2} and p3p_{3}) varying from 0 to 1 and, from left to right, p2=0.01p_{2}=0.01, 0.50.5, and 0.990.99 (resp. p1=0.01p_{1}=0.01, 0.50.5, and 0.990.99).
Figure 4: Model (24). In the first row of the figure, regions where S2,W21S2,W22S^{1}_{2,W_{2}}\geqslant S^{2}_{2,W_{2}} (black), S2,W21S2,W22S^{1}_{2,W_{2}}\leqslant S^{2}_{2,W_{2}} (white), and S2,W21=S2,W22S^{1}_{2,W_{2}}=S^{2}_{2,W_{2}} (gray) with respect to p1p_{1} and p2p_{2} varying from 0 to 1 and, from left to right, p3=0.01p_{3}=0.01, 0.50.5, and 0.990.99. Analogously, the second (resp. last) row considers the regions with S2,W21S^{1}_{2,W_{2}} and S2,W23S^{3}_{2,W_{2}} (resp. S2,W22S^{2}_{2,W_{2}} and S2,W23S^{3}_{2,W_{2}}) with respect to p1p_{1} and p3p_{3} (resp. p2p_{2} and p3p_{3}) varying from 0 to 1 and, from left to right, p2=0.01p_{2}=0.01, 0.50.5, and 0.990.99 (resp. p1=0.01p_{1}=0.01, 0.50.5, and 0.990.99).

In order to compare the estimation accuracy of the Pick-Freeze method and the rank-based method at a fixed sample size, we assume that only $N=450$ calls of the computer code are allowed to estimate the indices $S^{\textbf{u}}(\mathbb{F})$ and $S^{\textbf{u}}_{2,W_2}$ for $\textbf{u}=\{1\}$, $\{2\}$, and $\{3\}$. We only focus on the first-order indices since, as explained previously, the rank-based procedure has not been developed yet for higher-order indices. We repeat the estimation procedure 500 times. The boxplots of the mean square errors for the estimation of the Fréchet indices $S^{\textbf{u}}(\mathbb{F})$ and the Wasserstein indices $S^{\textbf{u}}_{2,W_2}$ are displayed in Figure 5. We observe that, for a fixed total budget of $N=450$ calls (corresponding to a Pick-Freeze sample size of 64), the rank-based estimation procedure performs much better than the Pick-Freeze method, with significantly lower mean square errors.

Figure 5: Model (24) with $p_1=1/3$, $p_2=2/3$, and $p_3=3/4$. Boxplots of the mean square errors of the estimation of the Fréchet indices $S^{\textbf{u}}(\mathbb{F})$ (top row) and the Wasserstein indices $S^{\textbf{u}}_{2,W_2}$ (bottom row) with a fixed sample size and 500 replications. The indices with respect to $\textbf{u}=\{1\}$, $\{2\}$, and $\{3\}$ are displayed from left to right. The results of the Pick-Freeze estimation procedure with $N=64$ are provided on the left-hand side of each graphic. The results of the rank-based methodology with $N=450$ are provided on the right-hand side of each graphic.

5 Sensitivity analysis for stochastic computer codes

This section deals with stochastic computer codes, in the sense that two evaluations of the code at the same input may lead to different outputs.

5.1 State of the art

A first natural way to handle stochastic computer codes is definitely to consider the expectation of the output code. Indeed, as mentioned in [9], previous works dealing with stochastic simulators together with robust design or optimization and sensitivity analysis consist mainly in approximating the mean and variance of the stochastic output [17, 10, 37, 2] and then performing a global sensitivity analysis on the expectation of the output code [42].

As pointed out by [35], another approach is to consider that the stochastic code is of the form f(X,D)f(X,D) where the random element XX contains the classical input variables and the variable DD is an extra unobserved random input.

Such an idea was exploited in [36] to compare the estimation of the Sobol indices in an “exact” model to the estimation of the Sobol indices in an associated metamodel.

Analogously, the author of [43] assumes the existence of an extra random variable DD which is not chosen by the practitioner but rather generated at each computation of the output YY independently of XX. In this framework, he builds two different indices. The first index is obtained by substituting f(X,D)f(X,D) for f(X)f(X) in the classical definition of the first order Sobol index Si=Var(𝔼[f(X)|Xi])/Var(f(X))S^{i}=\hbox{{\rm Var}}(\mathbb{E}[f(X)|X_{i}])/\hbox{{\rm Var}}(f(X)). In this case, DD is considered as another input, even though it is not observable. The second index is obtained by substituting 𝔼[f(X,D)|X]\mathbb{E}[f(X,D)|X] for f(X)f(X) in the Sobol index. The noise is then smoothed out.

Similarly, the authors of [31] model the randomness of the computer code through such an extra random variable $D$. In practice, their algorithm returns $m$ realizations of the first-order Sobol indices $S^i$ for $i=1,\ldots,p$, denoted by $\hat{S}^i_j(d_j)$ for $j=1,\ldots,m$ and $i=1,\ldots,p$. Then, for any $i=1,\ldots,p$, they approximate the statistical properties of $S^i$ by considering the sample $r$-th moments given by

μ^ri=1mj=1m(S^ji(dj))r\displaystyle\hat{\mu}^{i}_{r}=\frac{1}{m}\sum_{j=1}^{m}(\hat{S}^{i}_{j}(d_{j}))^{r}

noticing that

$\mathbb{E}_D[\hat{\mu}^i_r]=\mathbb{E}[(\hat{S}^i)^r]\quad\text{and}\quad\mathrm{Var}_D(\hat{\mu}^i_r)=\frac{1}{m}\mathrm{Var}_D((\hat{S}^i)^r).$
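For illustration, here is a minimal Python sketch of these sample moments; the array `S_hat` is a purely synthetic stand-in for the $m$ realizations returned by the algorithm of [31], so the numbers carry no meaning beyond the mechanics.

```python
import numpy as np

rng = np.random.default_rng(1)
m = 100
# synthetic stand-in for the m realizations \hat{S}^i_j(d_j) of [31]
S_hat = np.clip(rng.normal(0.30, 0.05, size=m), 0.0, 1.0)

mu_hat = {r: np.mean(S_hat ** r) for r in (1, 2, 3)}  # sample r-th moments
print(mu_hat)
```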

5.2 The space 𝒲q\mathcal{W}_{q} as an ideal version of stochastic computer codes

When dealing with stochastic computer codes, the practitioner is generally interested in the distribution $\mu_x$ of the output for a given input $x$. As previously seen, one can translate this type of code into a deterministic code by considering an extra input which is not chosen by the practitioner but is a latent variable generated randomly by the computer code, independently of the classical inputs. As usual in the framework of sensitivity analysis, one considers the inputs as random variables. All the random variables (those chosen by the practitioner and the one generated by the computer code) are built on the same probability space, leading to the function $f_s$:

$f_s:\ E\times\mathcal{D}\to\mathbb{R},\qquad (x,D)\mapsto f_s(x,D),$

where DD is the extra random variable lying in 𝒟\mathcal{D}. We naturally denote the output random variable fs(x,)f_{s}(x,\cdot{}) by fs(x)f_{s}(x).

Hence, one may define another (deterministic) computer code associated with fsf_{s} whose output is the probability measure:

$f:\ E\to\mathcal{W}_q(\mathbb{R}),\qquad x\mapsto\mu_x.$

The framework of the measure-valued code $f$ is exactly the one of Section 4.1 and has already been handled. Obviously, in practice, one cannot access the output of the code $f$ itself; one can only obtain an empirical approximation of the measure $\mu_x$ given by $n$ evaluations of $f_s$ at $x$, namely,

μx,n=1nk=1nδfs(x,Dk).\mu_{x,n}=\frac{1}{n}\sum_{k=1}^{n}\delta_{f_{s}(x,D_{k})}.

Further, the code $f$ can be seen as an ideal version of the stochastic code $f_s$. Concretely, for a single random input $X\in E=E_1\times\dots\times E_p$, we evaluate the code $f_s$ $n$ times (so that the code generates independently $n$ hidden variables $D_1,\dots,D_n$) and one may observe

fs(X,D1),,fs(X,Dn)f_{s}(X,D_{1}),\dots,f_{s}(X,D_{n})

leading to the measure $\mu_{X,n}=\sum_{k=1}^n\delta_{f_s(X,D_k)}/n$ approximating the distribution of $f_s(X)$. We emphasize that the random variables $D_1,\dots,D_n$ are not observed.
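To fix ideas, here is a minimal Python sketch of this construction for a toy stochastic code (the uniform one revisited in Section 5.5); the latent variable $D$ is simply the internal randomness of the generator, and the input value is an illustrative choice.

```python
import numpy as np

rng = np.random.default_rng(0)

def fs(x, rng):
    # toy stochastic code: the latent D is the generator's internal noise and
    # the output is uniform on [0, 1 + x1 + x2 + x1*x3] (cf. Section 5.5)
    x1, x2, x3 = x
    return rng.uniform(0.0, 1.0 + x1 + x2 + x1 * x3)

x = (1.0, 0.0, 1.0)    # one fixed input value, chosen for illustration
n = 100
mu_x_n = np.sort([fs(x, rng) for _ in range(n)])  # empirical measure mu_{x,n}
```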

5.3 Sensitivity analysis

Let us now present the methodology we adopt. In order to study the sensitivity of the distribution μx\mu_{x}, one can use the framework introduced in Section 4.1 and the index S2,WquS_{2,W_{q}}^{\textbf{u}} given by (4.1).

In an ideal scenario corresponding to the framework of the measure-valued code $f$, one may access the probability measure $\mu_x$ for any $x$. Then, following the estimation procedure of Section 4.3, one gets an estimator of the sensitivity index $S_{2,W_q}^{\textbf{u}}$ with good asymptotic properties [27, Theorem 2.3].

In the more realistic framework presented above, we only have access to the approximation $\mu_{x,n}$ of $\mu_x$, which makes the estimation procedure and the study of its asymptotic properties more involved. In this case, the general design of experiments is the following:

(X1,D1,1,,D1,n)\displaystyle(X_{1},D_{1,1},\ldots,D_{1,n}) fs(X1,D1,1),,fs(X1,D1,n),\displaystyle\to\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ f_{s}(X_{1},D_{1,1}),\dots,f_{s}(X_{1},D_{1,n}),
(X1u,D1,1,,D1,n)\displaystyle(X_{1}^{\textbf{u}},D_{1,1}^{\prime},\dots,D_{1,n}^{\prime}) fs(X1u,D1,1),,fs(X1u,D1,n),\displaystyle\to\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ f_{s}(X_{1}^{\textbf{u}},D_{1,1}^{\prime}),\dots,f_{s}(X_{1}^{\textbf{u}},D_{1,n}^{\prime}),
\displaystyle\vdots
(XN,DN,1,,DN,n)\displaystyle(X_{N},D_{N,1},\dots,D_{N,n}) fs(XN,DN,1),,fs(XN,DN,n),\displaystyle\to\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ f_{s}(X_{N},D_{N,1}),\dots,f_{s}(X_{N},D_{N,n}),
(XNu,DN,1,,DN,n)\displaystyle(X_{N}^{\textbf{u}},D_{N,1}^{\prime},\dots,D_{N,n}^{\prime}) fs(XNu,DN,1),,fs(XNu,DN,n),\displaystyle\to\leavevmode\nobreak\ \leavevmode\nobreak\ \leavevmode\nobreak\ f_{s}(X_{N}^{\textbf{u}},D_{N,1}^{\prime}),\dots,f_{s}(X_{N}^{\textbf{u}},D_{N,n}^{\prime}),

where $2\times N\times n$ is the total number of evaluations of the stochastic computer code $f_s$. Then we construct the approximations of $\mu_j$ (standing for $\mu_{X_j}$) for any $j=1,\dots,N$ given by

μj,n=1nk=1nδfs(Xj,Dj,k).\displaystyle\mu_{j,n}=\frac{1}{n}\sum_{k=1}^{n}\delta_{f_{s}(X_{j},D_{j,k})}. (28)

From there, one may use one of the three estimation procedures presented in Section 2.1.

  • First method - Pick-Freeze. It suffices to plug the empirical version μn\mu_{n} of each measure μ\mu under concern in (22).

  • Second method - U-statistics. For l=1,,4l=1,\dots,4, let

$U_{l,N,n}=\binom{N}{m(l)}^{-1}\sum_{1\leqslant i_1<\dots<i_{m(l)}\leqslant N}\Phi_l^s\left(\boldsymbol{\mu}_{i_1,n},\dots,\boldsymbol{\mu}_{i_{m(l)},n}\right)$ (29)

where, as previously seen, $\Phi_l^s$ is the symmetrized version of $\Phi_l$ defined in (2.1) and $\boldsymbol{\mu}=(\mu,\mu^{\textbf{u}})$. Then we estimate $S_{2,W_q}^{\textbf{u}}$ by

    S^2,Wq,Ustat,nu=U1,N,nU2,N,nU3,N,nU4,N,n.\widehat{S}_{2,W_{q},\text{Ustat},n}^{\textbf{u}}=\frac{U_{1,N,n}-U_{2,N,n}}{U_{3,N,n}-U_{4,N,n}}. (30)
  • Third method - Rank-based. The rank-based estimation procedure also extends easily to this context, by using the empirical version $\mu_n$ of each measure $\mu$ under concern instead of the true one, as explained in more detail in the numerical study of Section 5.5.

Actually, these estimators are easy to compute since, for two discrete measures supported on the same number of points and given by

ν1=1nk=1nδxk,ν2=1nk=1nδyk,\nu_{1}=\frac{1}{n}\sum_{k=1}^{n}\delta_{x_{k}},\;\nu_{2}=\frac{1}{n}\sum_{k=1}^{n}\delta_{y_{k}},

the Wasserstein distance between ν1\nu_{1} and ν2\nu_{2} simply writes

$W_q^q(\nu_1,\nu_2)=\frac{1}{n}\sum_{k=1}^{n}\left\lvert x_{(k)}-y_{(k)}\right\rvert^{q},$ (31)

where $x_{(k)}$ (resp. $y_{(k)}$) denotes the $k$-th order statistic of the sample $(x_1,\ldots,x_n)$ (resp. $(y_1,\ldots,y_n)$).
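A minimal Python sketch of (31) follows; the two 100-point samples are illustrative.

```python
import numpy as np

def wq_q(xs, ys, q=2):
    # W_q^q between two empirical measures with the same number of atoms,
    # computed from the order statistics as in (31)
    return np.mean(np.abs(np.sort(xs) - np.sort(ys)) ** q)

rng = np.random.default_rng(0)
nu1, nu2 = rng.random(100), 2.0 * rng.random(100)  # two illustrative samples
print(wq_q(nu1, nu2))
```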

5.4 Central limit theorem for the estimator based on U-statistics

Proposition 5.1.

Consider three i.i.d. copies X1X_{1}, X2X_{2} and X3X_{3} of a random variable XX. Let δ(N)\delta(N) be a sequence tending to 0 as NN goes to infinity and such that

(|Wq(μX1,μX3)Wq(μX1,μX2)|δ(N))=o(1N).\mathbb{P}\left(\left\lvert W_{q}(\mu_{X_{1}},\mu_{X_{3}})-W_{q}(\mu_{X_{1}},\mu_{X_{2}})\right\rvert\leqslant\delta(N)\right)=o\left(\frac{1}{\sqrt{N}}\right).

Let $n=n(N)$ be such that $\mathbb{E}[W_q(\mu_X,\mu_{X,n})]=o(\delta(N)/\sqrt{N})$. Under the assumptions of [27, Theorem 2.3], we get, for any $\textbf{u}\subset\{1,\cdots,p\}$,

$\sqrt{N}\left(\widehat{S}_{2,W_q,\text{Ustat},n}^{\textbf{u}}-S_{2,W_q}^{\textbf{u}}\right)\xrightarrow[N\to+\infty]{\mathcal{L}}\mathcal{N}(0,\sigma^2)$ (32)

where the asymptotic variance σ2\sigma^{2} is given by Equation (13) in the proof of Theorem 2.3 in [27].

In some particular frameworks, one may easily derive a suitable value of $\delta(N)$. Two examples are given below.

Example 5.2.

If the inverse of the random variable $W=\left\lvert W_q(\mu_{X_1},\mu_{X_3})-W_q(\mu_{X_1},\mu_{X_2})\right\rvert$ has a finite expectation, then, by Markov's inequality,

$\mathbb{P}\left(W\leqslant\delta(N)\right)=\mathbb{P}\left(W^{-1}\geqslant\delta(N)^{-1}\right)\leqslant\delta(N)\,\mathbb{E}\left[\frac{1}{W}\right]$

and it suffices to choose $\delta(N)$ so that $\delta(N)=o(N^{-1/2})$ as $N$ goes to infinity.

Example 5.3 (Uniform example).

Assume that $X$ is uniformly distributed on $[0,1]$ and that $\mu_X$ is a Gaussian distribution centered at $X$ with unit variance. Then the squared Wasserstein distance satisfies $W_2^2(\mu_{X_1},\mu_{X_2})=(X_1-X_2)^2$, so that, working with $W_2^2$, the random variable $W=\left\lvert W_2^2(\mu_{X_1},\mu_{X_3})-W_2^2(\mu_{X_1},\mu_{X_2})\right\rvert$ is given by

|(X1X3)2(X1X2)2|=|(X3X2)(X2+X32X1)|.\left\lvert(X_{1}-X_{3})^{2}-(X_{1}-X_{2})^{2}\right\rvert=\left\lvert(X_{3}-X_{2})(X_{2}+X_{3}-2X_{1})\right\rvert.

Consequently,

(Wδ(N))\displaystyle\mathbb{P}(W\leqslant\delta(N)) (|X3X2|δ(N))+(|X2+X32X1|δ(N)).\displaystyle\leqslant\mathbb{P}(\left\lvert X_{3}-X_{2}\right\rvert\leqslant\sqrt{\delta(N)})+\mathbb{P}(\left\lvert X_{2}+X_{3}-2X_{1}\right\rvert\leqslant\sqrt{\delta(N)}).

Notice that $\left\lvert X_3-X_2\right\rvert$ has a triangular distribution with parameters $a=0$, $b=1$, and $c=0$, leading to

(|X3X2|α)=α(2α),for all α[0,1].\mathbb{P}(\left\lvert X_{3}-X_{2}\right\rvert\leqslant\alpha)=\alpha(2-\alpha),\quad\text{for all $\alpha\in[0,1]$}.

In addition,

(|X2+X32X1|δ(N))\displaystyle\mathbb{P}(\left\lvert X_{2}+X_{3}-2X_{1}\right\rvert\leqslant\sqrt{\delta(N)}) (||X2X1||X3X1||δ(N))\displaystyle\leqslant\mathbb{P}(\left\lvert\left\lvert X_{2}-X_{1}\right\rvert-\left\lvert X_{3}-X_{1}\right\rvert\right\rvert\leqslant\sqrt{\delta(N)})
=01(||X2u||X3u||δ(N))𝑑u.\displaystyle=\int_{0}^{1}\mathbb{P}(\left\lvert\left\lvert X_{2}-u\right\rvert-\left\lvert X_{3}-u\right\rvert\right\rvert\leqslant\sqrt{\delta(N)})du.

Now, $X_2-u$ and $X_3-u$ are two independent random variables uniformly distributed on $[-u,1-u]$. Then (see Figure 6), one has

(||X2u||X3u||α)4α,\mathbb{P}(\left\lvert\left\lvert X_{2}-u\right\rvert-\left\lvert X_{3}-u\right\rvert\right\rvert\leqslant\alpha)\leqslant 4\alpha,

whence

(|X2+X32X1|δ(N))4δ(N).\mathbb{P}(\left\lvert X_{2}+X_{3}-2X_{1}\right\rvert\leqslant\sqrt{\delta(N)})\leqslant 4\sqrt{\delta(N)}.

Thus it turns out that (Wδ(N))=O(δ(N))\mathbb{P}(W\leqslant\delta(N))=O\Bigl{(}\sqrt{\delta(N)}\Bigr{)}. Consequently, a suitable choice for δ(N)\delta(N) is δ(N)=o(1/N)\delta(N)=o(1/N).

Figure 6: Domain $\Gamma_{u,\alpha}=\{(x_1,x_2)\in[0,1]^2;\ \left\lvert\left\lvert x_1-u\right\rvert-\left\lvert x_2-u\right\rvert\right\rvert\leqslant\alpha\}$ (in grey).
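A quick Monte-Carlo check in Python of the bound $\mathbb{P}(W\leqslant\delta)=O(\sqrt{\delta})$ derived above; the grid of values of $\delta$ and the sample size are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
X1, X2, X3 = rng.random((3, 1_000_000))
W = np.abs((X1 - X3) ** 2 - (X1 - X2) ** 2)
for delta in (1e-2, 1e-3, 1e-4):
    p = np.mean(W <= delta)
    print(delta, p, p / np.sqrt(delta))  # the last ratio should remain bounded
```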

Analogously, one may derive suitable choices for $n$ in some particular cases. For instance, we refer the reader to [5] for upper bounds on $\mathbb{E}[W_q(\mu_X,\mu_{X,n})]$ for several values of $q\geqslant 1$ and several assumptions on the distribution $\mu_X$: general, uniform, Gaussian, beta, log-concave, etc. Here are some results.

  • In the general framework, the upper bound for q1q\geqslant 1 relies on the functional

$J_q(\mu_X)=\int_{\mathbb{R}}\frac{\left(F_{\mu_X}(x)(1-F_{\mu_X}(x))\right)^{q/2}}{f_{\mu_X}(x)^{q-1}}\,dx$

where $F_{\mu_X}$ is the c.d.f. associated with $\mu_X$ and $f_{\mu_X}$ its p.d.f. See [5, Theorems 3.2, 5.1, and 5.3].

  • Assume that μX\mu_{X} is uniformly distributed on [0,1][0,1]. Then by [5, Theorems 4.7, 4.8 and 4.9], for any n1n\geqslant 1,

    𝔼[W2(μX,μX,n)2]16n,\mathbb{E}[W_{2}(\mu_{X},\mu_{X,n})^{2}]\leqslant\frac{1}{6n},

    for any q1q\geqslant 1 and for any n1n\geqslant 1,

$\mathbb{E}[W_q(\mu_X,\mu_{X,n})^q]^{1/q}\leqslant (Const)\sqrt{\frac{q}{n}},$

    and for any n1n\geqslant 1,

    𝔼[W(μX,μX,n)](Const)n.\mathbb{E}[W_{\infty}(\mu_{X},\mu_{X,n})]\leqslant\frac{(Const)}{n}.

E.g., one may take $(Const)=\sqrt{\pi/2}$. (A numerical check of the first of these bounds is sketched right after this list.)

  • Assume that μX\mu_{X} is a log-concave distribution with standard deviation σ\sigma. Then by [5, Corollaries 6.10 and 6.12], for any 1q<21\leqslant q<2 and for any n1n\geqslant 1,

    𝔼[Wq(μX,μX,n)q](Const)2q(σn)q,\mathbb{E}[W_{q}(\mu_{X},\mu_{X,n})^{q}]\leqslant\frac{(Const)}{2-q}\left(\frac{\sigma}{\sqrt{n}}\right)^{q},

    for any n1n\geqslant 1,

    𝔼[W2(μX,μX,n)2](Const)σ2lognn,\mathbb{E}[W_{2}(\mu_{X},\mu_{X,n})^{2}]\leqslant\frac{(Const)\sigma^{2}\log n}{n},

    and for any q>2q>2 and for any n1n\geqslant 1,

    𝔼[Wq(μX,μX,n)q]Cqσqn,\mathbb{E}[W_{q}(\mu_{X},\mu_{X,n})^{q}]\leqslant\frac{C_{q}\sigma^{q}}{n},

where $C_q$ depends on $q$ only. Furthermore, if $\mu_X$ is supported on $[a,b]$, then for any $n\geqslant 1$,

    𝔼[W2(μX,μX,n)2](Const)(ba)2n+1.\mathbb{E}[W_{2}(\mu_{X},\mu_{X,n})^{2}]\leqslant\frac{(Const)(b-a)^{2}}{n+1}.

E.g., $(Const)=4/\ln 2$; cf. [5, Corollary 6.11].
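As announced in the uniform case above, here is a minimal Monte-Carlo check in Python of the bound $\mathbb{E}[W_2(\mu_X,\mu_{X,n})^2]\leqslant 1/(6n)$, using the closed form of $W_2^2$ between $\mathcal{U}([0,1])$ and an empirical measure; the sample sizes and the number of replications are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def mean_w2sq_uniform(n, reps=2000):
    # exact W2^2 between U([0,1]) and the empirical measure of an n-sample:
    # on ((k-1)/n, k/n] the empirical quantile equals x_(k), so the integral
    # of (x_(k) - t)^2 over each such interval has a closed form
    x = np.sort(rng.random((reps, n)), axis=1)
    a, b = np.arange(n) / n, np.arange(1, n + 1) / n
    return np.mean(np.sum(((x - a) ** 3 - (x - b) ** 3) / 3.0, axis=1))

for n in (10, 100, 1000):
    print(n, mean_w2sq_uniform(n), 1.0 / (6.0 * n))  # empirical mean vs bound
```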

Example 5.3 (continued). Recall that $X$ is uniformly distributed on $[0,1]$ and that $\mu_X$ is a Gaussian distribution centered at $X$ with unit variance. Then, by [5, Corollary 6.14], we have, for any $n\geqslant 3$,

𝔼[W2(μX,μX,n)2](Const)loglognn,\mathbb{E}[W_{2}(\mu_{X},\mu_{X,n})^{2}]\leqslant\frac{(Const)\log\log n}{n},

and for any q>2q>2 and for any n3n\geqslant 3,

𝔼[Wq(μX,μX,n)q]Cqn(logn)q/2,\mathbb{E}[W_{q}(\mu_{X},\mu_{X,n})^{q}]\leqslant\frac{C_{q}}{n(\log n)^{q/2}},

where CqC_{q} depends only on qq. Since we have already chosen δ(N)=o(N1)\delta(N)=o(N^{-1}), it remains to take nn so that loglogn/n=o(N2)\log\log n/n=o(N^{-2}) to fulfill the condition 𝔼[W2(μX,μX,n)]=o(δ(N)/N)\mathbb{E}[W_{2}(\mu_{X},\mu_{X,n})]=o(\delta(N)/\sqrt{N}).

5.5 Numerical study

Example 4.1 - continued. Here, we consider again the code given by (24). Having in mind the notation of Section 5.2, we consider the ideal version of the code:

$f:\ E\to\mathcal{W}_q(\mathbb{R}),\qquad (X_1,X_2,X_3)\mapsto\mu_{(X_1,X_2,X_3)},$

where $\mu_{(X_1,X_2,X_3)}$ is the uniform distribution on $[0,1+X_1+X_2+X_1X_3]$, the c.d.f. of which is $\mathbb{F}$ given by (24), and its stochastic counterpart:

$f_s:\ E\times\mathcal{D}\to\mathbb{R},\qquad (X_1,X_2,X_3,D)\mapsto f_s(X_1,X_2,X_3,D),$

where fs(X1,X2,X3,D)f_{s}(X_{1},X_{2},X_{3},D) is a realization of μ(X1,X2,X3)\mu_{(X_{1},X_{2},X_{3})}.

Hence, we no longer assume that one may observe $N$ realizations of $\mathbb{F}$ associated with the $N$ initial realizations of $(X_1,X_2,X_3)$. Instead, for each of the $N$ initial realizations of $(X_1,X_2,X_3)$, we observe $n$ realizations of a uniform random variable on $[0,1+X_1+X_2+X_1X_3]$.

In order to compare the estimation accuracy of the Pick-Freeze method and the rank-based method at a fixed sample size, we assume that only $N=450$ calls of the computer code $f$ are allowed to estimate the indices $S^{\textbf{u}}(\mathbb{F})$ and $S^{\textbf{u}}_{2,W_2}$ for $\textbf{u}=\{1\}$, $\{2\}$, and $\{3\}$. We only focus on the first-order indices since, as explained previously, the rank-based procedure has not been developed yet for higher-order indices. The empirical c.d.f.s based on the empirical measures $\mu_{j,n}$, $j=1,\ldots,N$, in (28) are constructed from $n=100$ evaluations each. We repeat the estimation procedure 500 times. The boxplots of the mean square errors for the estimation of the Fréchet indices $S^{\textbf{u}}(\mathbb{F})$ and the Wasserstein indices $S^{\textbf{u}}_{2,W_2}$ are displayed in Figure 7. We observe that, for a fixed sample size $N=450$ (corresponding to a Pick-Freeze sample size of 64), the rank-based estimation procedure performs much better than the Pick-Freeze method, with significantly lower mean square errors.

Figure 7: Model (5.5) with $p_1=1/3$, $p_2=2/3$, and $p_3=3/4$. Boxplots of the mean square errors of the estimation of the Fréchet indices $S^{\textbf{u}}(\mathbb{F})$ (top row) and the Wasserstein indices $S^{\textbf{u}}_{2,W_2}$ (bottom row) with a fixed sample size and 200 replications. The indices with respect to $\textbf{u}=\{1\}$, $\{2\}$, and $\{3\}$ are displayed from left to right. The results of the Pick-Freeze estimation procedure with $N=64$ are provided on the left-hand side of each graphic. The results of the rank-based methodology with $N=450$ are provided on the right-hand side of each graphic. The approximation size $n$ is fixed at 500.

Another numerical study in the particular setting of stochastic computer codes and inspired by [32] will be considered in Section 6.3.

6 Sensitivity analysis with respect to the law of the inputs

6.1 State of the art

The paper [44] is devoted to second level uncertainty, which corresponds to the uncertainty on the type of the input distributions and/or on the parameters of the input distributions. As mentioned by the authors, such uncertainties can be handled in two different manners: (1) aggregating them with no distinction [13, 12] or (2) separating them [44]. In [13], e.g., the uncertainty concerns the parameters of the input distributions. The authors study the expectation, with respect to the distribution of the parameters, of the conditional output. In [12], the second level uncertainties are transformed into first level uncertainties by considering the aggregated vector containing the vector of input random variables together with the vector of uncertain parameters. Alternatively, in [44], the uncertainty brought by the lack of knowledge of the input distributions and the uncertainty of the random inputs are treated separately. A double Monte-Carlo algorithm is first considered. In the outer loop, a Monte-Carlo sample of input distributions is generated, while the inner loop performs a global sensitivity analysis associated with each distribution. A more efficient algorithm with a single Monte-Carlo loop is also proposed. The sensitivity analysis is then performed using the so-called Hilbert-Schmidt dependence measures (HSIC indices) on the input distributions rather than on the input random variables themselves. See, e.g., [29] for the definition of the HSIC indices and more details on the algorithms.

In [45], a different approach is adopted. A failure probability is studied while the uncertainty concerns the parameters of the input distributions. An algorithm with low computational cost is proposed to handle such uncertainty together with the rare event setting. A single initial sample allows one to compute the failure probabilities associated with different parameters of the input distributions. A similar idea is exploited in [41], in which the authors consider input perturbations and Perturbed-Law based Indices, which quantify the impact of a perturbation of an input p.d.f. on a failure probability. Analogously, the authors of [30, 32] are interested in (marginal) p.d.f. perturbations; the aim is to study the "robustness of the Sobol indices to distributional uncertainty and to marginal distribution uncertainty", which corresponds to second level uncertainty. For instance, the basic idea of the approach proposed in [30] is to view the total Sobol index as an operator which inputs the p.d.f. and returns the Sobol index. The analysis of robustness is then carried out by computing and studying the Fréchet derivative of this operator. The same principle is used in [32] to treat the robustness with respect to the marginal distribution uncertainty.

Last but not least, it is worth mentioning the classical approach of epistemic global sensitivity analysis based on Dempster-Shafer theory (see, e.g., [54, 1]). This theory describes the random variables together with an epistemic uncertainty expressed through an associated epistemic variable $Z$ on a set $A$ and a mass function representing a probability measure on the set $\mathcal{P}(A)$ of all subsets of $A$. This lack of knowledge leads to an upper and a lower bound on the c.d.f. and can be viewed as a second level uncertainty.

6.2 Link with stochastic computer codes

We propose a new procedure that stems from the methodology developed for stochastic computer codes in Section 5. We still denote by $\mu_i$ ($i=1,\ldots,p$) the distribution of the input $X_i$ in the model given by (1). There are several ways to model the uncertainty on the choice of each $\mu_i$. Here we adopt the following framework. We assume that each $\mu_i$ belongs to some family $\mathcal{P}_i$ of probability measures endowed with a probability measure $\mathbb{P}_{\mu_i}$. In general, there might be measurability issues, and the question of how to define a $\sigma$-field on some general space $\mathcal{P}_i$ can be tricky. We restrict our study to the simple case where the existence of the probability measure $\mathbb{P}_{\mu_i}$ on $\mathcal{P}_i$ is given by the construction of the set $\mathcal{P}_i$. More precisely, we proceed as follows.

  • First, for 1ip1\leqslant i\leqslant p, let did_{i} be an integer and let Θidi\Theta_{i}\subset\mathbb{R}^{d_{i}}. Then consider the probability space (Θi,(Θi),νΘi)\left(\Theta_{i},\mathcal{B}(\Theta_{i}),\nu_{\Theta_{i}}\right) where (Θi)\mathcal{B}(\Theta_{i}) is the Borel σ\sigma-field and νΘi\nu_{\Theta_{i}} is a probability measure on (Θi,(Θi))\left(\Theta_{i},\mathcal{B}(\Theta_{i})\right).

  • Second, for $1\leqslant i\leqslant p$, we consider an identifiable parametric set of probability measures on $E_i$: $\mathcal{P}_i:=\{\mu_\theta,\ \theta\in\Theta_i\}$. Let us denote by $\pi_i$ the one-to-one mapping from $\Theta_i$ to $\mathcal{P}_i$ defined by $\pi_i(\theta):=\mu_\theta\in\mathcal{P}_i$ and define the $\sigma$-field $\mathcal{F}_i$ on $\mathcal{P}_i$ by

    AiB(Θi),A=πi(B).A\in\mathcal{F}_{i}\iff\exists B\in\mathcal{B}(\Theta_{i}),\ A=\pi_{i}(B).

    Then we endow this measurable space with the probability Πi\Pi_{i} defined, for any AiA\in\mathcal{F}_{i}, by

    Πi(A)=νΘi(πi1(A)).\Pi_{i}(A)=\nu_{\Theta_{i}}\left(\pi_{i}^{-1}(A)\right).
  • Third, in order to perform a second level sensitivity analysis on (1), we introduce the stochastic mapping fsf_{s} from 𝒫1××𝒫p\mathcal{P}_{1}\times\ldots\times\mathcal{P}_{p} to 𝒳\mathcal{X} defined by

    fs(μ1,,μp)=f(X1,,Xp)f_{s}\left(\mu_{1},\ldots,\mu_{p}\right)=f(X_{1},\ldots,X_{p}) (34)

    where $X_1,\ldots,X_p$ are independently drawn according to the distribution $\mu_1\times\ldots\times\mu_p$. Hence $f_s$ is a stochastic computer code from $\mathcal{P}_1\times\ldots\times\mathcal{P}_p$ to $\mathcal{X}$ and, once the probability measures $\mathbb{P}_{\mu_i}$ on each $\mathcal{P}_i$ are defined, we can perform sensitivity analysis using the framework of Section 5 (a minimal sketch of (34) is given right after this list).
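Here is a minimal Python sketch of the construction (34); the Gaussian parametric family, the prior on $\theta$, and the placeholder code `f` are hypothetical choices made only for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def draw_mu():
    # one draw from (P_i, Pi_i): theta = (m, s) sampled from nu_Theta_i, then
    # mu_theta = N(m, s^2); the family and the prior are hypothetical choices
    m, s = rng.normal(0.0, 1.0), rng.uniform(0.5, 1.5)
    return lambda m=m, s=s: rng.normal(m, s)

def f(x1, x2):
    # placeholder deterministic code standing for (1)
    return x1 + x2 ** 2

def fs(mu1, mu2):
    # the stochastic code of (34): its inputs are distributions, not values
    return f(mu1(), mu2())

y = fs(draw_mu(), draw_mu())  # one evaluation of the stochastic code
```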

6.3 Numerical study

As in [32], let us consider the synthetic example defined on [0,1]3[0,1]^{3} by

f(X1,X2,X3)=2X2e2X1+X32.\displaystyle f(X_{1},X_{2},X_{3})=2X_{2}e^{-2X_{1}}+X_{3}^{2}. (35)

We are interested in the uncertainty in the support of the random variables X1X_{1}, X2X_{2} and X3X_{3}. To do so, we follow the notation and framework of [32]. For i=1i=1, 2, and 3, we assume that XiX_{i} is uniformly distributed on the interval [Ai,Bi][A_{i},B_{i}], where AiA_{i} and BiB_{i} are themselves uniformly distributed on [0,0.1][0,0.1] and [0.9,1][0.9,1] respectively. As remarked in [32], it seems natural that ff will vary more in the X2X_{2}-direction when X1X_{1} is close to 0 and less when X1X_{1} is close to 1.

As mentioned in Section 6.1, the authors of [32] view the total Sobol index as an operator which inputs the p.d.f. and returns the total Sobol index. Then they study the Fréchet derivative of this operator and determine the most influential p.d.f., which depends on a parameter denoted by $\delta$. Finally, they let this parameter $\delta$ vary.

Here, we adopt the methodology explained in the previous section (Section 6.2). Namely, we consider the stochastic computer code given by:

fs(μ1,μ2,μ3)=2X2e2X1+X32,\displaystyle f_{s}(\mu_{1},\mu_{2},\mu_{3})=2X_{2}e^{-2X_{1}}+X_{3}^{2}, (36)

where the $X_i$'s are independently drawn according to the uniform measure $\mu_i$ on $[A_i,B_i]$, with $A_i$ and $B_i$ themselves uniformly distributed on $[0,0.1]$ and $[0.9,1]$ respectively. Then, to estimate the indices $S_{2,W_2}^i$ for $i=1$, 2, and 3, we proceed as follows (a code sketch of the procedure is given after the list).

  1. For $i=1$, 2, and 3, produce an $N$-sample $\left([A_{i,j},B_{i,j}]\right)_{j=1,\ldots,N}$ of the interval $[A_i,B_i]$.

  2. For $i=1$, 2, and 3, and for $j=1,\ldots,N$, generate an $n$-sample $\left(X_{i,j,k}\right)_{k=1,\ldots,n}$ of $X_i$, where $X_{i,j,k}$ is uniformly distributed on $[A_{i,j},B_{i,j}]$.

  3. For $j=1,\ldots,N$, compute the $n$-sample $\left(Y_{j,k}\right)_{k=1,\ldots,n}$ of the output using $Y=f(X_1,X_2,X_3)=2X_2e^{-2X_1}+X_3^2$. Thus we get an $N$-sample of the empirical measures of the distribution of the output $Y$ given by $\mu_{j,n}=\frac{1}{n}\sum_{k=1}^{n}\delta_{Y_{j,k}}$, for $j=1,\ldots,N$.

  4. For $i=1$, 2, and 3, order the intervals $\left([A_{i,j},B_{i,j}]\right)_{j=1,\ldots,N}$ and get the Pick-Freeze versions of $Y$ to treat the sensitivity analysis regarding the $i$-th input.

  5. Finally, it remains to compute the indicators of the empirical version of (22) using (31) and their means to get the Pick-Freeze estimators of $S^{\textbf{u}}_{2,W_2}$, for $\textbf{u}=\{1\}$, $\{2\}$, $\{3\}$, $\{1,2\}$, $\{1,3\}$, and $\{2,3\}$.
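Below is a minimal end-to-end Python sketch of this procedure for model (35) and $\textbf{u}=\{1\}$. The Monte-Carlo loop implements one plausible empirical version of the covariance-over-variance ratio displayed in Section 4 rather than the exact estimator of (22), which is not reproduced here; the number $M$ of reference pairs $(F_1,F_2)$, their resampling from the same output sample, and the seed are illustrative choices. Also, instead of ordering the intervals as in step 4, the sketch simply freezes the intervals of the first input in an independent copy, which realizes the same Pick-Freeze coupling.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n, M = 500, 500, 200   # sample size, approximation size, reference pairs

def draw_intervals():
    # step 1: N intervals per input, A_i ~ U([0, 0.1]) and B_i ~ U([0.9, 1])
    return rng.uniform(0.0, 0.1, (3, N)), rng.uniform(0.9, 1.0, (3, N))

def output_measures(A, B):
    # steps 2-3: n-samples of the inputs, then the sorted n-samples of the
    # output, i.e. the empirical measures mu_{j,n}
    X = rng.uniform(A[:, :, None], B[:, :, None], (3, N, n))
    return np.sort(2.0 * X[1] * np.exp(-2.0 * X[0]) + X[2] ** 2, axis=1)

def w2sq(F1, Fs):
    # squared W2 distances via (31); broadcasts one measure against many
    return np.mean((F1 - Fs) ** 2, axis=-1)

A, B = draw_intervals()
Au, Bu = draw_intervals()
Au[[0]], Bu[[0]] = A[[0]], B[[0]]      # step 4: freeze [A_1, B_1] (u = {1})
mu, mu_u = output_measures(A, B), output_measures(Au, Bu)

# step 5: empirical covariance/variance ratio over random reference pairs
num = den = 0.0
for _ in range(M):
    i1, i2 = rng.integers(N, size=2)
    thr = w2sq(mu[i1], mu[i2])                     # W2^2(F1, F2)
    I = w2sq(mu[i1], mu) <= thr                    # indicators for F
    Iu = w2sq(mu[i1], mu_u) <= thr                 # indicators for F^u
    num += np.mean(I & Iu) - np.mean(I) * np.mean(Iu)
    den += np.mean(I) - np.mean(I) ** 2
print("estimate of S^1_{2,W2}:", num / den)
```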

Notice that we only consider the estimators based on the Pick-Freeze method since we allow both bounds of the intervals to vary and, as explained previously, the rank-based procedure has been developed neither for higher-order indices nor in higher dimensions.

First, we compute the estimators of S2,W2uS^{\textbf{u}}_{2,W_{2}} following the previous procedure with a sample size N=500N=500 and an approximation size n=500n=500. We also perform another batch of simulations allowing for higher variability on the bounds: AiA_{i} is now uniformly distributed on [0,0.45][0,0.45] while BiB_{i} is now uniformly distributed on [0.55,1][0.55,1]. The results are displayed in Table 1.

$\textbf{u}$ | $\{1\}$ | $\{2\}$ | $\{3\}$ | $\{1,2\}$ | $\{1,3\}$ | $\{2,3\}$
$A_i\in[0,0.1]$, $B_i\in[0.9,1]$: $\hat{S}^{\textbf{u}}_{2,W_2}$ | 0.07022 | 0.08791 | 0.09236 | 0.14467 | 0.21839 | 0.19066
$A_i\in[0,0.45]$, $B_i\in[0.55,1]$: $\hat{S}^{\textbf{u}}_{2,W_2}$ | 0.11587 | 0.06542 | 0.169529 | 0.22647 | 0.40848 | 0.34913
Table 1: Model (35). GSA on the parameters of the input distributions. Estimations of $S^{\textbf{u}}_{2,W_2}$ with a sample size $N=500$ and an approximation size $n=500$. In the first row, $A_i$ is uniformly distributed on $[0,0.1]$ while $B_i$ is uniformly distributed on $[0.9,1]$. In the second row, we allow for more variability: $A_i$ is uniformly distributed on $[0,0.45]$ while $B_i$ is uniformly distributed on $[0.55,1]$.

Second, we run another simulations allowing for more variability on the upper bound related to the third input X3X_{3} only: B3B_{3} is uniformly distributed on [0.5,1][0.5,1] (instead of [0.9,1][0.9,1]). The results are displayed in Table 2. We still use a sample size N=500N=500 and an approximation size n=500n=500.

u {1}\{1\} {2}\{2\} {3}\{3\} {1,2}\{1,2\} {1,3}\{1,3\} {2,3}\{2,3\}
S^2,W2u\hat{S}^{\textbf{u}}_{2,W_{2}} 0.01196 0.06069 0.56176 -0.01723 0.63830 0.59434
Table 2: Model (35). GSA on the parameters of the input distributions. Estimations of S2,W2uS^{\textbf{u}}_{2,W_{2}} with a sample size N=500N=500 and an approximation size n=500n=500 and more variability on B3B_{3}, now uniformly distributed on [0.5,1][0.5,1].

Third, we perform a classical GSA on the inputs rather than on the parameters of their distributions. Namely, we estimate the index S2,CVMuS^{\textbf{u}}_{2,CVM} with a sample size N=104N=10^{4}. The reader is referred to [26, Section 3] for the definition of the index S2,CVMuS^{\textbf{u}}_{2,CVM} and its Pick-Freeze estimator together with their properties. The results are displayed in Table 3.

u {1}\{1\} {2}\{2\} {3}\{3\} {1,2}\{1,2\} {1,3}\{1,3\} {2,3}\{2,3\}
$\hat{S}^{\textbf{u}}_{2,CVM}$ 0.13717 0.15317 0.33889 0.33405 0.468163 0.53536
Table 3: Model (35). Direct GSA on the inputs. Estimations of S2,CVMuS^{\textbf{u}}_{2,CVM} with a sample size N=104N=10^{4}. The reader is referred to [26, Section 3] for the definition of the index S2,CVMuS^{\textbf{u}}_{2,CVM} and its Pick-Freeze estimator together with their properties.

In Table 3, we see that the Cramér-von-Mises index related to $X_3$ is more than twice as large as those related to $X_1$ and $X_2$ (when considering only first-order effects). Nevertheless, when one is interested in the choice of the input distributions of $X_1$, $X_2$, and $X_3$, the first row of Table 1 shows that each choice is roughly equally important. Now, if one gives more freedom to the space where the distributions live, the relative importance may change, as one can see in Table 1 (second row) and in Table 2. More precisely, in Table 2, the variability of the third input distribution (namely, the variability of its upper bound) is five times bigger than the other variabilities. Not surprisingly, the choice of the third input distribution then becomes much more important than the choices of the distributions of the first two inputs.

7 Conclusion

In this article, we present a very general way to perform sensitivity analysis when the output $Z$ of a computer code lives in a metric space. The main idea is to consider real-valued square-integrable test functions $(T_a(Z))_{a\in\Omega}$ parameterized by a finite number of elements of a probability space. Then the Hoeffding decomposition of the test functions $T_a(Z)$ is computed and integrated with respect to the parameter $a$. This very general and flexible definition allows, on the one hand, recovery of several classical indices (namely, the Sobol indices and the Cramér-von-Mises indices) and, on the other hand, a well-tailored and interpretable sensitivity analysis. Furthermore, sensitivity analysis becomes possible for computer codes whose output is a c.d.f. and for stochastic computer codes (which are seen as approximations of c.d.f.-valued computer codes). Last but not least, it also enables second level sensitivity analysis, which is embedded as a particular case of the stochastic-computer-code setting.

Acknowledgment

Appendix A Proofs

A.1 Notation

It is convenient to have short expressions for terms that converge in probability to zero. We follow [58]. The notation o(1)o_{\mathbb{P}}(1) (respectively O(1)O_{\mathbb{P}}(1)) stands for a sequence of random variables that converges to zero in probability (resp. is bounded in probability) as nn\to\infty. More generally, for a sequence of random variables RnR_{n},

Xn\displaystyle X_{n} =o(Rn)meansXn=YnRnwithYn0\displaystyle=o_{\mathbb{P}}(R_{n})\quad\textrm{means}\quad X_{n}=Y_{n}R_{n}\quad\textrm{with}\quad Y_{n}\overset{\mathbb{P}}{\rightarrow}0
Xn\displaystyle X_{n} =O(Rn)meansXn=YnRnwithYn=O(1).\displaystyle=O_{\mathbb{P}}(R_{n})\quad\textrm{means}\quad X_{n}=Y_{n}R_{n}\quad\textrm{with}\quad Y_{n}=O_{\mathbb{P}}(1).

For deterministic sequences $X_n$ and $R_n$, the stochastic notation reduces to the usual $o$ and $O$. Finally, $c$ stands for a generic constant that may differ from one line to another.

A.2 Proof of Proposition 5.1

One has

$\sqrt{N}\left(\widehat{S}_{2,W_q,\text{Ustat},n}^{\textbf{u}}-S_{2,GMS}^{\textbf{u}}\right)=\sqrt{N}\left(\widehat{S}_{2,W_q,\text{Ustat},n}^{\textbf{u}}-\widehat{S}_{2,GMS,\text{Ustat}}^{\textbf{u}}\right)+\sqrt{N}\left(\widehat{S}_{2,GMS,\text{Ustat}}^{\textbf{u}}-S_{2,GMS}^{\textbf{u}}\right),$
where $S_{2,GMS}^{\textbf{u}}$ denotes the index in general metric spaces of [27], which coincides with $S_{2,W_q}^{\textbf{u}}$ for the Wasserstein metric, and $\widehat{S}_{2,GMS,\text{Ustat}}^{\textbf{u}}$ is its U-statistic estimator based on the exact measures $\mu_j$.

By [27, Theorem 2.3], the second term in the right-hand side of the previous equation is asymptotically Gaussian. If we prove that the first term in the right-hand side is $o_{\mathbb{P}}(1)$, then, by Slutsky's Lemma [58, Lemma 2.8], $\sqrt{N}\left(\widehat{S}_{2,W_q,\text{Ustat},n}^{\textbf{u}}-S_{2,GMS}^{\textbf{u}}\right)$ is asymptotically Gaussian.

Now we prove that $\sqrt{N}\left(\widehat{S}_{2,W_q,\text{Ustat},n}^{\textbf{u}}-\widehat{S}_{2,GMS,\text{Ustat}}^{\textbf{u}}\right)=o_{\mathbb{P}}(1)$. We write

S^2,Wq,Ustat,nuS^2,GMS,Ustatu=Ψ(U1,N,n,U2,N,n,U3,N,n,U4,N,n)Ψ(U1,N,U2,N,U3,N,U4,N)\displaystyle\widehat{S}_{2,W_{q},\text{Ustat},n}^{\textbf{u}}-\widehat{S}_{2,GMS,\text{Ustat}}^{\textbf{u}}=\Psi(U_{1,N,n},U_{2,N,n},U_{3,N,n},U_{4,N,n})-\Psi(U_{1,N},U_{2,N},U_{3,N},U_{4,N})
=[(U1,N,nU1,N)(U2,N,nU2,N)](U3,NU4,N)[(U3,N,nU3,N)(U4,N,nU4,N)+(U3,NU4,N)](U3,NU4,N)\displaystyle=\frac{\left[(U_{1,N,n}-U_{1,N})-(U_{2,N,n}-U_{2,N})\right](U_{3,N}-U_{4,N})}{\left[(U_{3,N,n}-U_{3,N})-(U_{4,N,n}-U_{4,N})+(U_{3,N}-U_{4,N})\right](U_{3,N}-U_{4,N})}
[(U3,N,nU3,N)(U4,N,nU4,N)](U1,NU2,N)[(U3,N,nU3,N)(U4,N,nU4,N)+(U3,NU4,N)](U3,NU4,N).\displaystyle\quad-\frac{\left[(U_{3,N,n}-U_{3,N})-(U_{4,N,n}-U_{4,N})\right](U_{1,N}-U_{2,N})}{\left[(U_{3,N,n}-U_{3,N})-(U_{4,N,n}-U_{4,N})+(U_{3,N}-U_{4,N})\right](U_{3,N}-U_{4,N})}.

Since $(U_{l,N,n}-U_{l,N})$, for $l=3$ and $4$, and $(U_{3,N}-U_{4,N})$ converge almost surely to $0$ and to $I(\Phi_3)-I(\Phi_4)$ respectively, the denominator converges almost surely. Thus it suffices to prove that the numerator is $o_{\mathbb{P}}(1/\sqrt{N})$, which reduces to proving that $\sqrt{N}\left(U_{l,N,n}-U_{l,N}\right)=o_{\mathbb{P}}(1)$ for $l=1,\dots,4$, where $U_{l,N,n}$ (respectively $U_{l,N}$) has been defined in (29) (resp. (9)). Consider $l=1$ for example; the other terms can be treated analogously. Here, $m(1)=3$. We write

$\mathbb{E}\left[\left\lvert U_{1,N,n}-U_{1,N}\right\rvert\right]\leqslant\binom{N}{3}^{-1}(3!)^{-1}\sum_{\substack{1\leqslant i_1<i_2<i_3\leqslant N\\ \tau\in\mathcal{S}_3}}\mathbb{E}\left[\left\lvert\Phi_1\left(\boldsymbol{\mu}_{\tau(i_1),n},\boldsymbol{\mu}_{\tau(i_2),n},\boldsymbol{\mu}_{\tau(i_3),n}\right)-\Phi_1\left(\boldsymbol{\mu}_{\tau(i_1)},\boldsymbol{\mu}_{\tau(i_2)},\boldsymbol{\mu}_{\tau(i_3)}\right)\right\rvert\right]$
$=\mathbb{E}\left[\left\lvert\Phi_1\left(\boldsymbol{\mu}_{1,n},\boldsymbol{\mu}_{2,n},\boldsymbol{\mu}_{3,n}\right)-\Phi_1\left(\boldsymbol{\mu}_1,\boldsymbol{\mu}_2,\boldsymbol{\mu}_3\right)\right\rvert\right]$
$\leqslant 2\,\mathbb{E}\left[\left\lvert\mathbbm{1}_{W_q(\mu_1,\mu_3)\leqslant W_q(\mu_1,\mu_2)}-\mathbbm{1}_{W_q(\mu_{1,n},\mu_{3,n})\leqslant W_q(\mu_{1,n},\mu_{2,n})}\right\rvert\right]=:2\,\mathbb{E}\left[B_n\right]$

where the random variable $B_n$ in the right-hand side of the previous inequality is a Bernoulli random variable whose distribution does not depend on the chosen indices. Let $\Delta(N)$ be the following event

Δ(N)={|Wq(μτ(1),μτ(3))Wq(μτ(1),μτ(2))|δ(N)}.\Delta(N)=\left\{\left\lvert W_{q}(\mu_{\tau(1)},\mu_{\tau(3)})-W_{q}(\mu_{\tau(1)},\mu_{\tau(2)})\right\rvert\geqslant\delta(N)\right\}.

Obviously, we get $\mathbb{E}\left[B_n\mathbbm{1}_{\Delta(N)^c}\right]\leqslant\mathbb{P}(\Delta(N)^c)$, where $A^c$ stands for the complement of $A$ in $\Omega$. Furthermore,

𝔼[Bn𝟙Δ(N)]\displaystyle\mathbb{E}\left[B_{n}\mathbbm{1}_{\Delta(N)}\right] 𝔼[Bn|Δ(N)]=(Bn=1|Δ(N))\displaystyle\leqslant\mathbb{E}\left[B_{n}|\Delta(N)\right]=\mathbb{P}\left(B_{n}=1|\Delta(N)\right)
r=13(Wq(μr,μr,n)δ(N)4)\displaystyle\leqslant\sum_{r=1}^{3}\mathbb{P}\left(W_{q}(\mu_{r},\mu_{r,n})\geqslant\frac{\delta(N)}{4}\right)
12δ(N)𝔼[Wq(μ1,μ1,n)].\displaystyle\leqslant\frac{12}{\delta(N)}\mathbb{E}[W_{q}(\mu_{1},\mu_{1,n})].

Finally, we introduce ε>0\varepsilon>0 and we study:

(N|U1,N,nU1,N|ε)\displaystyle\mathbb{P}\left(\sqrt{N}\left\lvert U_{1,N,n}-U_{1,N}\right\rvert\geqslant\varepsilon\right) Nε𝔼[|U1,N,nU1,N|]\displaystyle\leqslant\frac{\sqrt{N}}{\varepsilon}\mathbb{E}\left[\left\lvert U_{1,N,n}-U_{1,N}\right\rvert\right]
2Nε𝔼[Bn]\displaystyle\leqslant 2\frac{\sqrt{N}}{\varepsilon}\mathbb{E}\left[B_{n}\right]
Nε24δ(N)𝔼[Wq(μ1,μ1,n)]+2Nε(Δ(N)c).\displaystyle\leqslant\frac{\sqrt{N}}{\varepsilon}\frac{24}{\delta(N)}\mathbb{E}[W_{q}(\mu_{1},\mu_{1,n})]+2\frac{\sqrt{N}}{\varepsilon}\mathbb{P}(\Delta(N)^{c}).

It remains to choose, first, $\delta(N)$ so that $\mathbb{P}(\Delta(N)^c)=o\left(1/\sqrt{N}\right)$ and, second, $n$ such that $\mathbb{E}[W_q(\mu_1,\mu_{1,n})]=o(\delta(N)/\sqrt{N})$. Consequently, $\sqrt{N}(U_{1,N,n}-U_{1,N})=o_{\mathbb{P}}(1)$. Analogously, one gets $\sqrt{N}(U_{l,N,n}-U_{l,N})=o_{\mathbb{P}}(1)$ for $l=2$, $3$, and $4$.

References

  • [1] D. A. Alvarez. Reduction of uncertainty using sensitivity analysis methods for infinite random sets of indexable type. International journal of approximate reasoning, 50(5):750–762, 2009.
  • [2] B. Ankenman, B. L. Nelson, and J. Staum. Stochastic kriging for simulation metamodeling. In 2008 Winter Simulation Conference, pages 362–370. IEEE, 2008.
  • [3] J. D. Betancourt, F. Bachoc, T. Klein, D. Idier, R. Pedreros, and J. Rohmer. Gaussian process metamodeling of functional-input code for coastal flood hazard assessment. Reliability Engineering and System Safety, 198, June 2020.
  • [4] J. Bigot and T. Klein. Consistent estimation of a population barycenter in the Wasserstein space. ArXiv e-prints, Dec. 2012.
  • [5] S. Bobkov and M. Ledoux. One-dimensional empirical measures, order statistics, and Kantorovich transport distances. Memoirs of the American Mathematical Society, To appear, 2019.
  • [6] E. Borgonovo. A new uncertainty importance measure. Reliability Engineering & System Safety, 92(6):771–784, 2007.
  • [7] E. Borgonovo, W. Castaings, and S. Tarantola. Moment independent importance measures: New results and analytical test cases. Risk Analysis, 31(3):404–428, 2011.
  • [8] E. Borgonovo and B. Iooss. Moment Independent Importance Measures and a Common Rationale. Manuscript.
  • [9] T. Browne, B. Iooss, L. Le Gratiet, J. Lonchampt, and E. Remy. Stochastic simulators based optimization by gaussian process metamodels - application to maintenance investments planning issues. Quality and Reliability Engineering International, 32:2067–2080, 2016.
  • [10] D. Bursztyn and D. M. Steinberg. Screening experiments for dispersion effects. In Screening, pages 21–47. Springer, 2006.
  • [11] S. Cambanis, G. Simons, and W. Stout. Inequalities for Ek(X,Y)Ek(X,Y) when the marginals are fixed. Z. Wahrscheinlichkeitstheorie und Verw. Gebiete, 36(4):285–294, 1976.
  • [12] V. Chabridon. Reliability-oriented sensitivity analysis under probabilistic model uncertainty. PhD thesis, Université Clermont-Auvergne, 2018.
  • [13] V. Chabridon, M. Balesdent, J.-M. Bourinet, J. Morio, and N. Gayton. Evaluation of failure probability under parameter epistemic uncertainty: application to aerospace system reliability assessment. Aerospace Science and Technology, 69:526–537, 2017.
  • [14] S. Chatterjee. A new coefficient of correlation. arXiv e-prints, page arXiv:1909.10140, Sep 2019.
  • [15] S. Da Veiga. Global sensitivity analysis with dependence measures. J. Stat. Comput. Simul., 85(7):1283–1305, 2015.
  • [16] E. De Rocquigny, N. Devictor, and S. Tarantola. Uncertainty in industrial practice. Wiley, Chichester, England, 2008.
  • [17] G. Dellino and C. Meloni. Uncertainty management in simulation-optimization of complex systems. Springer, 2015.
  • [18] J. Fontbona, H. Guérin, and S. Méléard. Measurability of optimal transportation and strong coupling of martingale measures. Electron. Commun. Probab., 15:124–133, 2010.
  • [19] J.-C. Fort, T. Klein, and N. Rachdi. New sensitivity analysis subordinated to a contrast. May 2013.
  • [20] J.-C. Fort, T. Klein, and N. Rachdi. New sensitivity analysis subordinated to a contrast. Comm. Statist. Theory Methods, 45(15):4349–4364, 2016.
  • [21] R. Fraiman, F. Gamboa, and L. Moreno. Sensitivity indices for output on a Riemannian manifold. arXiv e-prints, page arXiv:1810.11591, Oct 2018.
  • [22] M. Fréchet. Les éléments aléatoires de nature quelconque dans un espace distancié. Ann. Inst. H.Poincaré, Sect. B, Prob. et Stat., 10:235–310, 1948.
  • [23] F. Gamboa, P. Gremaud, T. Klein, and A. Lagnoux. Global Sensitivity Analysis: a new generation of mighty estimators based on rank statistics. Working paper or preprint, Feb. 2020.
  • [24] F. Gamboa, A. Janon, T. Klein, and A. Lagnoux. Sensitivity analysis for multidimensional and functional outputs. Electronic Journal of Statistics, 8:575–603, 2014.
  • [25] F. Gamboa, A. Janon, T. Klein, A. Lagnoux, and C. Prieur. Statistical inference for Sobol pick-freeze Monte Carlo method. Statistics, 50(4):881–902, 2016.
  • [26] F. Gamboa, T. Klein, and A. Lagnoux. Sensitivity analysis based on Cramér–von Mises distance. SIAM/ASA J. Uncertain. Quantif., 6(2):522–548, 2018.
  • [27] F. Gamboa, T. Klein, A. Lagnoux, and L. Moreno. Sensitivity analysis in general metric spaces. Working paper or preprint, Feb. 2020.
  • [28] T. Goda. Computing the variance of a conditional expectation via non-nested Monte Carlo. Operations Research Letters, 45(1):63–67, 2017.
  • [29] A. Gretton, O. Bousquet, A. Smola, and B. Schölkopf. Measuring statistical dependence with Hilbert-Schmidt norms. In International conference on algorithmic learning theory, pages 63–77. Springer, 2005.
  • [30] J. Hart and P. A. Gremaud. Robustness of the Sobol’ indices to distributional uncertainty. International Journal for Uncertainty Quantification, 9(5), 2019.
  • [31] J. L. Hart, A. Alexanderian, and P. A. Gremaud. Efficient computation of Sobol’ indices for stochastic models. SIAM Journal on Scientific Computing, 39(4):A1514–A1530, 2017.
  • [32] J. L. Hart and P. A. Gremaud. Robustness of the Sobol’ indices to marginal distribution uncertainty. SIAM/ASA Journal on Uncertainty Quantification, 7(4):1224–1244, 2019.
  • [33] W. Hoeffding. A class of statistics with asymptotically normal distribution. Ann. Math. Statistics, 19:293–325, 1948.
  • [34] D. Idier, A. Aurouet, F. Bachoc, A. Baills, J. D. Betancourt, J. Durand, R. Mouche, J. Rohmer, F. Gamboa, T. Klein, J. Lambert, G. Le Cozannet, S. Leroy, J. Louisor, R. Pedreros, and A.-L. Véron. Toward a User-Based, Robust and Fast Running Method for Coastal Flooding Forecast, Early Warning, and Risk Prevention. Journal of Coastal Research, Special Issue, 95:11–15, 2020.
  • [35] B. Iooss, T. Klein, and A. Lagnoux. Sobol’ sensitivity analysis for stochastic numerical codes. In Proceedings of the SAMO 2016 Conference, Reunion Island, France, pages 48–49, 2016.
  • [36] A. Janon, T. Klein, A. Lagnoux, M. Nodet, and C. Prieur. Asymptotic normality and efficiency of two Sobol index estimators. ESAIM: Probability and Statistics, 18:342–364, 1 2014.
  • [37] J. P. Kleijnen. Design and analysis of simulation experiments. In International Workshop on Simulation, pages 3–22. Springer, 2015.
  • [38] S. Kucherenko and S. Song. Different numerical estimators for main effect global sensitivity indices. Reliability Engineering & System Safety, 165:222–238, 2017.
  • [39] M. Lamboni, H. Monod, and D. Makowski. Multivariate sensitivity analysis to measure global contribution of input factors in dynamic models. Reliability Engineering & System Safety, 96(4):450–459, 2011.
  • [40] L. Le Gratiet. Asymptotic normality of a Sobol index estimator in Gaussian process regression framework. Preprint, 2013.
  • [41] P. Lemaître, E. Sergienko, A. Arnaud, N. Bousquet, F. Gamboa, and B. Iooss. Density modification-based reliability sensitivity analysis. Journal of Statistical Computation and Simulation, 85(6):1200–1223, 2015.
  • [42] A. Marrel, B. Iooss, S. Da Veiga, and M. Ribatet. Global sensitivity analysis of stochastic computer models with joint metamodels. Statistics and Computing, 22(3):833–847, 2012.
  • [43] G. Mazo. An optimal tradeoff between explorations and repetitions in global sensitivity analysis for stochastic computer models. Preprint, 2019.
  • [44] A. Meynaoui, A. Marrel, and B. Laurent. New statistical methodology for second level global sensitivity analysis. Working paper or preprint, Feb. 2019.
  • [45] J. Morio. Influence of input PDF parameters of a model on a failure probability estimation. Simulation Modelling Practice and Theory, 19(10):2244–2255, 2011.
  • [46] V. Moutoussamy, S. Nanty, and B. Pauwels. Emulators for stochastic simulation codes. In CEMRACS 2013—modelling and simulation of complex systems: stochastic and deterministic approaches, volume 48 of ESAIM Proc. Surveys, pages 116–155. EDP Sci., Les Ulis, 2015.
  • [47] A. Owen. Better estimation of small Sobol’ sensitivity indices. arXiv preprint arXiv:1204.4763, 2012.
  • [48] A. Owen. Variance components and generalized Sobol’ indices. SIAM/ASA Journal on Uncertainty Quantification, 1(1):19–41, 2013.
  • [49] A. Owen, J. Dick, and S. Chen. Higher order Sobol’ indices. Information and Inference, 3(1):59–81, 2014.
  • [50] K. Pearson. On the partial correlation ratio. Proceedings of the Royal Society of London. Series A, Containing Papers of a Mathematical and Physical Character, 91(632):492–498, 1915.
  • [51] N. Peteilh, T. Klein, T. Druot, N. Bartoli, and R. P. Liem. Challenging Top Level Aircraft Requirements based on operations analysis and data-driven models, application to take-off performance design requirements. In AIAA AVIATION 2020 FORUM, Reno, NV, United States, June 2020. American Institute of Aeronautics and Astronautics.
  • [52] A. Saltelli, K. Chan, and E. Scott. Sensitivity analysis. Wiley Series in Probability and Statistics. John Wiley & Sons, Ltd., Chichester, 2000.
  • [53] T. J. Santner, B. Williams, and W. Notz. The Design and Analysis of Computer Experiments. Springer-Verlag, 2003.
  • [54] P. Smets et al. What is Dempster-Shafer’s model? Advances in the Dempster-Shafer theory of evidence, pages 5–34, 1994.
  • [55] I. M. Sobol. Global sensitivity indices for nonlinear mathematical models and their Monte Carlo estimates. Mathematics and Computers in Simulation, 55(1-3):271–280, 2001.
  • [56] I. M. Sobol. Sensitivity estimates for nonlinear mathematical models. Math. Modeling Comput. Experiment, 1(4):407–414, 1993.
  • [57] B. Sudret. Global sensitivity analysis using polynomial chaos expansions. Reliability Engineering & System Safety, 93(7):964–979, 2008.
  • [58] A. W. van der Vaart. Asymptotic statistics, volume 3 of Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, Cambridge, 1998.
  • [59] C. Villani. Topics in Optimal Transportation. American Mathematical Society, 2003.