
On the lower expected star discrepancy of jittered sampling compared with simple random sampling

Jun Xian
Department of Mathematics and Guangdong Province Key Laboratory of Computational Science
Sun Yat-sen University
510275 Guangzhou
China.
[email protected]

Xiaoda Xu
Department of Mathematics
Sun Yat-sen University
510275 Guangzhou
China.
[email protected]
Abstract.

We compare the expected star discrepancy of jittered sampling with that of simple random sampling, and prove the strong partition principle for the star discrepancy.

Key words and phrases:
Expected star discrepancy; Stratified sampling.
2010 Mathematics Subject Classification:
65C10, 11K38, 65D30.

1. Introduction

Classical jittered sampling (JS) achieves better expected star discrepancy bounds than traditional Monte Carlo (MC) sampling, see [9]. In this sense, jittered sampling is a refinement of the traditional Monte Carlo method; the question now is whether there is a direct comparison of the star discrepancy itself, not merely a comparison of bounds. This is the star discrepancy version of the strong partition principle, see the introduction in [13]. Among the various techniques for measuring the irregularity of point distributions, the star discrepancy is the most prominent; it is well established and has found applications in areas such as computer graphics, machine learning, numerical integration, and financial engineering, see [15, 1, 7, 10].

Star discrepancy. The star discrepancy of a sampling set $P_{N,d}=\{t_{1},t_{2},\ldots,t_{N}\}$ is defined by

(1.1) $D_{N}^{*}(t_{1},t_{2},\ldots,t_{N}):=\sup_{x\in[0,1]^{d}}\left|\lambda([0,x])-\frac{\sum_{n=1}^{N}I_{[0,x]}(t_{n})}{N}\right|,$

where $\lambda$ denotes the $d$-dimensional Lebesgue measure and $I_{[0,x]}$ denotes the characteristic function of the $d$-dimensional rectangle $[0,x]$.
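The definition (1.1) can be checked numerically. The following is a minimal sketch (ours, not from the paper) that approximates $D_{N}^{*}$ from below by evaluating the local discrepancy over a finite grid of anchored boxes; the function name and the resolution parameter `res` are our own choices.

```python
# Brute-force lower-bound approximation of the star discrepancy D_N^*:
# evaluate |lambda([0,x]) - (# points in [0,x])/N| on a res^d grid of
# corners x.  A grid maximum only bounds the supremum in (1.1) from below.
import itertools
import numpy as np

def star_discrepancy_lower_bound(points: np.ndarray, res: int = 50) -> float:
    n, d = points.shape
    corners = np.linspace(0.0, 1.0, res + 1)[1:]    # grid coordinates in (0, 1]
    best = 0.0
    for x in itertools.product(corners, repeat=d):
        x = np.asarray(x)
        volume = np.prod(x)                          # lambda([0, x])
        count = np.all(points <= x, axis=1).sum()    # points inside [0, x]
        best = max(best, abs(volume - count / n))
    return best

rng = np.random.default_rng(0)
print(star_discrepancy_lower_bound(rng.random((16, 2))))   # N = 16, d = 2
```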

Special deterministic point set constructions with small star discrepancy are called low-discrepancy point sets; the best known asymptotic upper bounds for their star discrepancy are of the form

$O\Big(\frac{(\ln N)^{\alpha_{d}}}{N}\Big),$

where $\alpha_{d}\geq 0$ for fixed $d$; for studies of such point sets, see [14, 6]. In the present paper we introduce random factors. Previous works have extensively compared expected discrepancies. In [13], it is proved that jittered sampling constitutes a set whose expected $L_{p}$-discrepancy is smaller than that of purely random points. Further, [12] shows that jittered sampling does not have the minimal expected $L_{2}$-discrepancy. We study the expected star discrepancy under these sampling models and prove the following comparison:

(1.2) $\mathbb{E}(D_{N}^{*}(Y))<\mathbb{E}(D_{N}^{*}(X)),$

where $X$ denotes a simple random sampling point set and $Y$ denotes a stratified sampling point set uniformly distributed in the grid-based strata. In 2016, F. Pausinger and S. Steinerberger [16] gave upper and lower bounds for the expected star discrepancy under jittered sampling and proved the strong partition principle for the $L_{2}$-discrepancy; they asked whether this conclusion could be generalized to the star discrepancy. In 2021, M. Kiderlen and F. Pausinger [12] referred again to the problem of proving the strong partition principle for the star discrepancy.

The rest of this paper is organized as follows. Section 2 presents preliminaries on stratified sampling and $\delta$-covers. Section 3 presents our main result, which compares the expected star discrepancy of simple random sampling and of stratified sampling under the grid-based equivolume partition. Section 4 contains the proof of the main result. Finally, in Section 5 we conclude the paper with a short summary.

2. Preliminaries on stratified sampling and $\delta$-covers

Before introducing the main result, we list the preliminaries used in this paper.

2.1. Jittered sampling

Jittered sampling is a type of grid-based equivolume partition: $[0,1]^{d}$ is divided into $N=m^{d}$ axis-parallel boxes $Q_{i}$, $1\leq i\leq N$, each with sides of length $\frac{1}{m}$; see the illustration in Figure 1. Research on jittered sampling is extensive, see [4, 9, 12, 13, 16].

Figure 1. Jittered sampling formed by the isometric grid partition: (a) two dimensions; (b) three dimensions.
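For concreteness, jittered sampling can be generated as follows; this is a minimal sketch under our own conventions (one uniform point per grid cell), not code from the paper.

```python
# Jittered sampling: one uniform point in each of the N = m^d subcubes
# with side 1/m that partition [0,1]^d.
import itertools
import numpy as np

def jittered_sample(m: int, d: int, rng: np.random.Generator) -> np.ndarray:
    corners = np.array(list(itertools.product(range(m), repeat=d))) / m
    return corners + rng.random(corners.shape) / m   # uniform offset inside each cell

rng = np.random.default_rng(1)
Y = jittered_sample(4, 2, rng)       # stratified set Y, N = 16 points in [0,1]^2
X = rng.random((16, 2))              # simple random sampling set X, for comparison
```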

For jittered sampling, we consider a rectangle $R=[0,x)$ (we shall call it the test set in the following) in $[0,1]^{d}$ anchored at 0. For the corresponding isometric grid partition $\Omega=\{Q_{1},Q_{2},\ldots,Q_{N}\}$ of $[0,1]^{d}$, we put

$I_{N}:=\{j:\partial R\cap Q_{j}\neq\emptyset\},$

and

$C_{N}:=|I_{N}|,$

that is, $C_{N}$ is the cardinality of the index set $I_{N}$. It is easy to show that

(2.1) $C_{N}\leq d\cdot N^{1-\frac{1}{d}}.$
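Bound (2.1) is easy to check numerically; below is a minimal sketch (our own construction) that counts the grid cells met by the boundary of a random anchored box and compares the count with $d\cdot N^{1-\frac{1}{d}}$.

```python
# Count the cells Q_j of the m^d grid that the boundary of R = [0, x)
# passes through, and check C_N <= d * N^(1 - 1/d) = d * m^(d-1).
import itertools
import numpy as np

def boundary_cell_count(x: np.ndarray, m: int) -> int:
    d = x.size
    count = 0
    for k in itertools.product(range(m), repeat=d):
        a = np.array(k) / m                  # lower corner of the cell
        b = a + 1.0 / m                      # upper corner of the cell
        touches = bool(np.all(a <= x))       # cell meets the closed box [0, x]
        inside = bool(np.all(b <= x))        # cell lies entirely inside [0, x]
        count += touches and not inside      # boundary cuts through this cell
    return count

m, d = 4, 2
N = m ** d
rng = np.random.default_rng(2)
for _ in range(5):
    x = rng.random(d)
    assert boundary_cell_count(x, m) <= d * N ** (1 - 1 / d)
```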

2.2. $\delta$-covers

To discretize the star discrepancy, we use the definition of $\delta$-covers as in [8], which is well known in empirical process theory, see, e.g., [17].

Definition 2.1.

For any $\delta\in(0,1]$, a finite set $\Gamma$ of points in $[0,1)^{d}$ is called a $\delta$-cover of $[0,1)^{d}$ if for every $y\in[0,1)^{d}$ there exist $x,z\in\Gamma\cup\{0\}$ such that $x\leq y\leq z$ and $\lambda([0,z])-\lambda([0,x])\leq\delta$. The number $\mathcal{N}(d,\delta)$ denotes the smallest cardinality of a $\delta$-cover of $[0,1)^{d}$.

From [8, 11], combined with Stirling's formula, the following estimate for $\mathcal{N}(d,\delta)$ holds: for any $d\geq 1$ and $\delta\in(0,1]$,

(2.2) $\mathcal{N}(d,\delta)\leq 2^{d}\cdot\frac{e^{d}}{\sqrt{2\pi d}}\cdot(\delta^{-1}+1)^{d}.$

Let $P=\{p_{1},p_{2},\ldots,p_{N}\}\subset[0,1]^{d}$ and let $\Gamma$ be a $\delta$-cover; then

$D_{N}^{*}(P)\leq D_{\Gamma}(P)+\delta,$

where

(2.3) $D_{\Gamma}(P):=\max_{x\in\Gamma}\Big|\lambda([0,x])-\frac{\sum_{n=1}^{N}I_{[0,x]}(p_{n})}{N}\Big|.$

Formula (2.3) is convenient for estimating the star discrepancy.
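As an illustration, one simple (suboptimal) $\delta$-cover is the equidistant grid with coordinates $\{1/k,\ldots,1\}$, which covers with $\delta=1-(1-1/k)^{d}$; the sketch below (our own construction, not from [8]) evaluates $D_{\Gamma}$ on such a grid, so that $D_{\Gamma}\leq D_{N}^{*}\leq D_{\Gamma}+\delta$.

```python
# Discretized star discrepancy D_Gamma over an equidistant grid Gamma,
# giving the two-sided bound D_Gamma <= D_N^* <= D_Gamma + delta of (2.3).
import itertools
import numpy as np

def discretized_discrepancy(points: np.ndarray, k: int):
    n, d = points.shape
    coords = np.arange(1, k + 1) / k                 # grid coordinates {1/k, ..., 1}
    d_gamma = max(
        abs(np.prod(x) - np.all(points <= np.asarray(x), axis=1).sum() / n)
        for x in itertools.product(coords, repeat=d)
    )
    delta = 1.0 - (1.0 - 1.0 / k) ** d               # cover radius of this grid
    return d_gamma, d_gamma + delta                  # lower / upper bound for D_N^*

rng = np.random.default_rng(3)
lower, upper = discretized_discrepancy(rng.random((16, 2)), k=32)
print(lower, upper)
```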

3. Expected star discrepancy for stratified and simple random sampling

In this section, a comparison of the expected star discrepancy for jittered sampling and simple random sampling is obtained, which shows that the strong partition principle holds for the star discrepancy.

3.1. Comparison of expected star discrepancy under jittered sampling and simple random sampling

Theorem 3.1.

Let $m,d\in\mathbb{N}$ with $m\geq d\geq 2$ and let $N=m^{d}$. Let $X=\{X_{1},X_{2},\ldots,X_{N}\}$ be a simple random sampling set, and let the stratified random $d$-dimensional point set $Y=\{Y_{1},Y_{2},\ldots,Y_{N}\}$ be uniformly distributed in the grid-based strata $\Omega^{*}=\{\Omega^{*}_{1},\ldots,\Omega^{*}_{N}\}$; then

(3.1) $\mathbb{E}(D_{N}^{*}(Y))<\mathbb{E}(D_{N}^{*}(X)).$
Remark 3.2.

Inequality (3.1) in Theorem 3.1 shows that grid-based stratified sampling has a lower expected star discrepancy than simple random sampling, which also means that the Strong Partition Principle [13] holds for the star discrepancy; it is stated in [12] that the strong partition principle for the star discrepancy remained to be proved.
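Theorem 3.1 can also be illustrated empirically. The following minimal simulation (ours; it uses the grid proxy $D_{\Gamma}$ for $D_{N}^{*}$ and hypothetical parameter choices) typically shows a smaller average discrepancy for the jittered points $Y$ than for the i.i.d. points $X$.

```python
# Monte Carlo illustration of (3.1): average the grid proxy D_Gamma of the
# star discrepancy over many draws of jittered points Y and i.i.d. points X.
import itertools
import numpy as np

def d_gamma(points, k=32):
    n, d = points.shape
    coords = np.arange(1, k + 1) / k
    return max(
        abs(np.prod(x) - np.all(points <= np.asarray(x), axis=1).sum() / n)
        for x in itertools.product(coords, repeat=d)
    )

m, d, trials = 4, 2, 200
N = m ** d
rng = np.random.default_rng(4)
corners = np.array(list(itertools.product(range(m), repeat=d))) / m
mean_y = np.mean([d_gamma(corners + rng.random((N, d)) / m) for _ in range(trials)])
mean_x = np.mean([d_gamma(rng.random((N, d))) for _ in range(trials)])
print(mean_y, mean_x)       # empirically mean_y < mean_x, matching Theorem 3.1
```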

4. Proofs

In this section, we present the proof of Theorem 3.1. We first state Bernstein's inequality, a standard result that can be found in textbooks on probability, see, e.g., [5]; we omit its proof.

Lemma 4.1 (Bernstein’s inequality).

Let $Z_{1},\ldots,Z_{N}$ be independent random variables with expected values $\mathbb{E}(Z_{j})=\mu_{j}$ and variances $\sigma_{j}^{2}$ for $j=1,\ldots,N$. Assume $|Z_{j}-\mu_{j}|\leq C$ for each $j$, where $C$ is a constant, and set $\Sigma^{2}:=\sum_{j=1}^{N}\sigma_{j}^{2}$. Then for any $\lambda\geq 0$,

$\mathbb{P}\left\{\Big|\sum_{j=1}^{N}[Z_{j}-\mu_{j}]\Big|\geq\lambda\right\}\leq 2\exp\left(-\frac{\lambda^{2}}{2\Sigma^{2}+\frac{2}{3}C\lambda}\right).$
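As a sanity check, the tail bound of Lemma 4.1 can be tested on a toy example; the setup below (centered Bernoulli variables, our own parameter choices) is ours and not part of the proof.

```python
# Empirical check of Bernstein's inequality for sums of centered
# Bernoulli(p) variables: |Z_j - mu_j| <= C with C = max(p, 1 - p).
import numpy as np

rng = np.random.default_rng(5)
N, p, trials = 100, 0.3, 20000
C = max(p, 1 - p)
Sigma2 = N * p * (1 - p)                     # sum of the N variances
sums = rng.binomial(1, p, size=(trials, N)).sum(axis=1) - N * p
for lam in (5.0, 10.0, 15.0):
    empirical = np.mean(np.abs(sums) >= lam)
    bound = 2 * np.exp(-lam ** 2 / (2 * Sigma2 + 2 * C * lam / 3))
    print(lam, empirical, bound)             # empirical <= bound in each case
```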

4.1. Proof of Theorem 3.1

For an arbitrary test set $R=[0,x)$, we use the unified label $W=\{W_{1},W_{2},\ldots,W_{N}\}$ for the sampling point sets formed by the different sampling models, and we consider the following discrepancy function:

(4.1) $\Delta_{\mathscr{P}}(x)=\frac{1}{N}\sum_{n=1}^{N}\mathbf{1}_{R}(W_{n})-\lambda(R).$

For the grid-based equivolume partition $\Omega=\{\Omega_{1},\Omega_{2},\ldots,\Omega_{N}\}$, we divide the test set $R$ into two parts: the disjoint union of the $\Omega_{i}$ entirely contained in $R$, and the union of the remaining pieces, which are the intersections of some $\Omega_{j}$ with $R$, i.e.,

(4.2) $R=\bigcup_{i\in I_{0}}\Omega_{i}\cup\bigcup_{j\in J_{0}}(\Omega_{j}\cap R),$

where $I_{0},J_{0}$ are two index sets.

We set

$T=\bigcup_{j\in J_{0}}(\Omega_{j}\cap R).$

Besides, for an equivolume partition $\Omega=\{\Omega_{1},\Omega_{2},\ldots,\Omega_{N}\}$ of $[0,1]^{d}$ and the corresponding stratified sampling set $P_{\Omega}$, we have

(4.3) $\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{R}(P_{\Omega})\Big)=\sum_{i=1}^{N}\frac{|\Omega_{i}\cap[0,x]|}{|\Omega_{i}|}\Big(1-\frac{|\Omega_{i}\cap[0,x]|}{|\Omega_{i}|}\Big)=N|[0,x]|-N^{2}\sum_{i=1}^{N}|\Omega_{i}\cap[0,x]|^{2}.$

For the sampling sets $Y=\{Y_{1},Y_{2},\ldots,Y_{N}\}$ and $X=\{X_{1},X_{2},\ldots,X_{N}\}$, we have

(4.4) $\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{R}(Y)\Big)=N|[0,x]|-N^{2}\sum_{i=1}^{N}|\Omega^{*}_{i}\cap[0,x]|^{2},$

and

(4.5) $\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{R}(X)\Big)=N|[0,x]|-N|[0,x]|^{2}.$

Hence, since the Cauchy–Schwarz inequality gives $\sum_{i=1}^{N}|\Omega^{*}_{i}\cap[0,x]|^{2}\geq\frac{1}{N}\big(\sum_{i=1}^{N}|\Omega^{*}_{i}\cap[0,x]|\big)^{2}=\frac{1}{N}|[0,x]|^{2}$, we have

(4.6) $\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{R}(Y)\Big)\leq\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{R}(X)\Big).$

Now we exclude the equality case in (4.6). Equality in the Cauchy–Schwarz step would require

(4.7) $N|\Omega^{*}_{i}\cap[0,x]|=|[0,x]|$

for $i=1,2,\ldots,N$ and for almost all $x\in[0,1]^{d}$.

Hence,

(4.8) $\int_{[0,x]}\mathbf{1}_{\Omega^{*}_{i}}(y)\,dy=\int_{[0,x]}\frac{1}{N}\,dy$

for almost all $x\in[0,1]^{d}$ and all $i=1,2,\ldots,N$, which implies $\mathbf{1}_{\Omega^{*}_{i}}=\frac{1}{N}$ almost everywhere; this is impossible for $N\geq 2$, since a characteristic function takes only the values 0 and 1.
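Inequalities (4.4)–(4.6) can also be verified numerically for the grid partition; the following minimal sketch (ours) computes the exact overlap volumes $|\Omega^{*}_{i}\cap[0,x]|$ and checks the variance comparison, with (4.5) taken as the binomial variance $N|[0,x]|(1-|[0,x]|)$.

```python
# Analytic check of Var over jittered Y vs i.i.d. X, using the exact cell
# overlap volumes of the m^d grid with an anchored box [0, x].
import itertools
import numpy as np

def overlaps(x: np.ndarray, m: int) -> np.ndarray:
    d = x.size
    vols = []
    for k in itertools.product(range(m), repeat=d):
        a = np.array(k) / m
        side = np.clip(np.minimum(x, a + 1.0 / m) - a, 0.0, None)
        vols.append(np.prod(side))           # |Omega_i ∩ [0, x]|
    return np.array(vols)

m, d = 4, 2
N = m ** d
rng = np.random.default_rng(6)
for _ in range(5):
    x = rng.random(d)
    v = overlaps(x, m)
    vol = v.sum()                                # |[0, x]|
    var_y = N * vol - N ** 2 * np.sum(v ** 2)    # (4.4), jittered sampling
    var_x = N * vol - N * vol ** 2               # (4.5), binomial variance
    assert var_y <= var_x + 1e-12                # (4.6), by Cauchy-Schwarz
```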

For the set $T$, we have

(4.9) $\mathrm{Var}\Big(\frac{1}{N}\sum_{n=1}^{N}\mathbf{1}_{T}(Y_{n})\Big)=\frac{1}{N^{2}}\sum_{i=1}^{|J_{0}|}\frac{|\Omega^{*}_{i}\cap[0,x]|}{|\Omega^{*}_{i}|}\Big(1-\frac{|\Omega^{*}_{i}\cap[0,x]|}{|\Omega^{*}_{i}|}\Big).$

Applying to $T$ the same analysis as for $R$, we obtain

(4.10) $\mathrm{Var}\Big(\frac{1}{N}\sum_{n=1}^{N}\mathbf{1}_{T}(Y_{n})\Big)<\mathrm{Var}\Big(\frac{1}{N}\sum_{n=1}^{N}\mathbf{1}_{T}(X_{n})\Big).$

For the test set $R=[0,x)$, we choose $R_{0}=[0,y)$ and $R_{1}=[0,z)$ such that $y\leq x\leq z$ and $\lambda(R_{1})-\lambda(R_{0})\leq\frac{1}{N}$; then the pairs $(R_{0},R_{1})$ constitute a $\frac{1}{N}$-cover. We divide $R_{0}$ and $R_{1}$ into two parts each, as we did for $R$ in (4.2). Let

$T_{0}=\bigcup_{j\in J_{0}}(\Omega_{j}\cap R_{0}),$

and

$T_{1}=\bigcup_{j\in J_{0}}(\Omega_{j}\cap R_{1}).$

We have the same conclusion for $T_{0}$ and $T_{1}$. In order to treat the two cases $T_{0}$ and $T_{1}$ uniformly (both are generated from test sets drawn from a family of the same cardinality, namely the covering number), we consider a set $R^{\prime}$ which can be divided into two parts

(4.11) $R^{\prime}=\bigcup_{k\in K}\Omega_{k}\cup\bigcup_{l\in L}(\Omega_{l}\cap R^{\prime}),$

where $K,L$ are two index sets. Moreover, the number of test sets $R^{\prime}\subset[0,1)^{d}$ is at most $2^{d-1}\frac{e^{d}}{\sqrt{2\pi d}}(N+1)^{d}$ (the $\delta$-covering number, where we choose $\delta=\frac{1}{N}$), and we let

$T^{\prime}=\bigcup_{l\in L}(\Omega_{l}\cap R^{\prime}).$

We define new random variables $\chi_{j}$, $1\leq j\leq|L|$, as follows:

$\chi_{j}=\begin{cases}1,&W_{j}\in\Omega_{j}\cap R^{\prime},\\ 0,&\text{otherwise}.\end{cases}$

Then,

(4.12) $N\cdot D^{*}_{N}(W_{1},W_{2},\ldots,W_{N};R^{\prime})=N\cdot D^{*}_{N}(W_{1},W_{2},\ldots,W_{N};T^{\prime})=\Big|\sum_{j=1}^{|L|}\chi_{j}-N\sum_{j=1}^{|L|}\lambda(\Omega_{j}\cap T^{\prime})\Big|.$

Since

$\mathbb{P}(\chi_{j}=1)=\frac{\lambda(\Omega_{j}\cap T^{\prime})}{\lambda(\Omega_{j})}=N\cdot\lambda(\Omega_{j}\cap T^{\prime}),$

we get

(4.13) $\mathbb{E}(\chi_{j})=N\cdot\lambda(\Omega_{j}\cap T^{\prime}).$

Thus, from (4.12) and (4.13), we obtain

(4.14) $N\cdot D^{*}_{N}(W_{1},W_{2},\ldots,W_{N};R^{\prime})=\Big|\sum_{j=1}^{|L|}(\chi_{j}-\mathbb{E}(\chi_{j}))\Big|.$

Let

$\sigma_{j}^{2}=\mathbb{E}(\chi_{j}-\mathbb{E}(\chi_{j}))^{2},\qquad\Sigma=\Big(\sum_{j=1}^{|L|}\sigma_{j}^{2}\Big)^{\frac{1}{2}}.$

Therefore, from Lemma 4.1 (with $C=1$), for every $R^{\prime}$ we have

$\mathbb{P}\left(\Big|\sum_{j=1}^{|L|}(\chi_{j}-\mathbb{E}(\chi_{j}))\Big|>\lambda\right)\leq 2\exp\Big(-\frac{\lambda^{2}}{2\Sigma^{2}+\frac{2\lambda}{3}}\Big).$

Let $\mathscr{B}=\bigcup_{R^{\prime}}\Big\{\Big|\sum_{j=1}^{|L|}(\chi_{j}-\mathbb{E}(\chi_{j}))\Big|>\lambda\Big\}$; then, using the $\delta$-covering number, we have

(4.15) $\mathbb{P}(\mathscr{B})\leq(2e)^{d}\cdot\frac{1}{\sqrt{2\pi d}}\cdot(N+1)^{d}\cdot\exp\Big(-\frac{\lambda^{2}}{2\Sigma^{2}+\frac{2\lambda}{3}}\Big).$

Combining with (4.14), we get

(4.16) $\mathbb{P}\Big(\bigcup_{R^{\prime}}\big\{N\cdot D_{N}^{*}(W_{1},W_{2},\ldots,W_{N};R^{\prime})>\lambda\big\}\Big)\leq(2e)^{d}\cdot\frac{1}{\sqrt{2\pi d}}\cdot(N+1)^{d}\cdot\exp\Big(-\frac{\lambda^{2}}{2\Sigma^{2}+\frac{2\lambda}{3}}\Big).$

For the point sets $Y$ and $X$, let

$\Sigma_{0}^{2}=\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{T^{\prime}}(Y_{n})\Big),\qquad\Sigma_{1}^{2}=\mathrm{Var}\Big(\sum_{n=1}^{N}\mathbf{1}_{T^{\prime}}(X_{n})\Big).$

Then (4.10) implies

$\Sigma_{0}^{2}<\Sigma_{1}^{2}.$

Besides, as in (4.16), we have

(4.17) $\mathbb{P}\Big(\bigcup_{R^{\prime}}\big\{N\cdot D_{N}^{*}(Y_{1},Y_{2},\ldots,Y_{N};R^{\prime})>\lambda\big\}\Big)\leq(2e)^{d}\cdot\frac{1}{\sqrt{2\pi d}}\cdot(N+1)^{d}\cdot\exp\Big(-\frac{\lambda^{2}}{2\Sigma_{0}^{2}+\frac{2\lambda}{3}}\Big),$

and

$\mathbb{P}\Big(\bigcup_{R^{\prime}}\big\{N\cdot D_{N}^{*}(X_{1},X_{2},\ldots,X_{N};R^{\prime})>\lambda\big\}\Big)\leq(2e)^{d}\cdot\frac{1}{\sqrt{2\pi d}}\cdot(N+1)^{d}\cdot\exp\Big(-\frac{\lambda^{2}}{2\Sigma_{1}^{2}+\frac{2\lambda}{3}}\Big),$

respectively.

Set $A(d,q,N)=d\ln(2e)+d\ln(N+1)-\frac{\ln(2\pi d)}{2}-\ln(1-q)$, and choose

$\lambda=\sqrt{2\Sigma_{0}^{2}\cdot A(d,q,N)+\frac{A^{2}(d,q,N)}{9}}+\frac{A(d,q,N)}{3}$

in (4.17); then we have

(4.18) $\mathbb{P}\Big(\bigcup_{R^{\prime}}\big\{N\cdot D_{N}^{*}(Y_{1},Y_{2},\ldots,Y_{N};R^{\prime})>\lambda\big\}\Big)\leq 1-q.$

Hence, from (4.18), it is easily verified that

(4.19) $\max_{R_{i},i=0,1}D_{N}^{*}(Y_{1},Y_{2},\ldots,Y_{N};R_{i})\leq\frac{\sqrt{2\Sigma_{0}^{2}\cdot A(d,q,N)+\frac{A^{2}(d,q,N)}{9}}}{N}+\frac{A(d,q,N)}{3N}$

holds with probability at least $q$.

From (2.3), combined with the $\delta$-cover property (where $\delta=\frac{1}{N}$), we get that

(4.20) $D_{N}^{*}(Y)\leq\frac{\sqrt{2\Sigma_{0}^{2}\cdot A(d,q,N)+\frac{A^{2}(d,q,N)}{9}}}{N}+\frac{A(d,q,N)+3}{3N}\leq(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q,N)}{N}$

holds with probability at least $q$; the last inequality in (4.20) holds because $A(d,q,N)\geq 3$ for all $q\in(0,1)$.
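For orientation, the quantities in (4.20) are easy to evaluate; the sketch below (ours) computes $A(d,q,N)$ and the resulting high-probability bound, with the variance proxy $\Sigma_{0}$ set to an illustrative placeholder rather than a value from the paper.

```python
# Evaluate A(d, q, N) and the high-probability star discrepancy bound of
# (4.20) for a jittered point set with variance proxy Sigma0.
import numpy as np

def A(d: int, q: float, N: int) -> float:
    return (d * np.log(2 * np.e) + d * np.log(N + 1)
            - 0.5 * np.log(2 * np.pi * d) - np.log(1 - q))

def bound_420(d: int, q: float, N: int, sigma0: float) -> float:
    a = A(d, q, N)
    return (np.sqrt(2 * sigma0 ** 2 * a + a ** 2 / 9) + a / 3 + 1) / N

d, m = 2, 8
N = m ** d
sigma0 = (d * N ** (1 - 1 / d) / 4) ** 0.5   # placeholder: cut-cell variance is at most C_N / 4
for q in (0.5, 0.9, 0.99):
    print(q, bound_420(d, q, N, sigma0))     # bound holding with probability >= q
```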

The same analysis for the point set $X$ shows that

(4.21) $D_{N}^{*}(X)\leq\frac{\sqrt{2\Sigma_{1}^{2}\cdot A(d,q,N)+\frac{A^{2}(d,q,N)}{9}}}{N}+\frac{A(d,q,N)+3}{3N}\leq(\sqrt{2}\cdot\Sigma_{1}+1)\frac{A(d,q,N)}{N}$

holds with probability at least $q$.

Now we fix a probability value $q_{0}\in[q,1)$ in (4.20); i.e., we suppose that (4.20) holds with probability exactly $q_{0}$. Choosing this $q_{0}$ in (4.21), we find that

$D_{N}^{*}(X)\leq(\sqrt{2}\cdot\Sigma_{1}+1)\frac{A(d,q_{0},N)}{N}$

holds with probability $q_{0}$.

Therefore, from $\Sigma_{0}<\Sigma_{1}$, we obtain that

(4.22) $D_{N}^{*}(X)\leq(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q_{0},N)}{N}$

holds with probability $q_{1}$, where $0<q_{1}<q_{0}$.

We use the following standard identity to estimate the expected star discrepancy:

(4.23) $\mathbb{E}[D^{*}_{N}(W)]=\int_{0}^{1}\mathbb{P}(D^{*}_{N}(W)\geq t)\,dt,$

where $D^{*}_{N}(W)$ denotes the star discrepancy of the point set $W$; the upper integration limit 1 suffices since $D^{*}_{N}(W)\leq 1$.
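Identity (4.23) is the layer-cake formula for a $[0,1]$-valued random variable; a quick numerical check (our toy example with uniform samples, not part of the proof) is below.

```python
# Check E[W] = integral_0^1 P(W >= t) dt for a [0,1]-valued variable W,
# here W = U^2 with U uniform on [0,1], so that E[W] = 1/3.
import numpy as np

rng = np.random.default_rng(7)
w = rng.random(200_000) ** 2
ts = np.linspace(0.0, 1.0, 1001)
layer_cake = np.trapz([np.mean(w >= t) for t in ts], ts)
print(w.mean(), layer_cake)      # both close to 1/3
```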

Plugging $q_{0}$ into (4.20), we have

(4.24) $D_{N}^{*}(Y)\leq(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q_{0},N)}{N}$

holds with probability $q_{0}$. Then (4.24) is equivalent to

$\mathbb{P}\Big(D_{N}^{*}(Y)\geq(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q_{0},N)}{N}\Big)=1-q_{0}.$

Now we release $q_{0}$ and set

(4.25) $t=(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q_{0},N)}{N},$

(4.26) $C_{0}(\Sigma_{0},N)=\frac{\sqrt{2}\cdot\Sigma_{0}+1}{N},$

and

(4.27) $C_{1}(d,\Sigma_{0},N)=\frac{\sqrt{2}\cdot\Sigma_{0}+1}{N}\cdot\Big(d\ln(2e)+d\ln(N+1)-\frac{\ln(2\pi d)}{2}\Big).$

Then

(4.28) $t=C_{1}(d,\Sigma_{0},N)-C_{0}(\Sigma_{0},N)\ln(1-q_{0}).$

Thus, from (4.23) and $q_{0}\in[q,1)$, we have

(4.29) $\mathbb{E}[D^{*}_{N}(Y)]=\int_{0}^{1}\mathbb{P}(D^{*}_{N}(Y)\geq t)\,dt=\int_{1-e^{\frac{C_{1}(d,\Sigma_{0},N)}{C_{0}(\Sigma_{0},N)}}}^{1-e^{\frac{C_{1}(d,\Sigma_{0},N)-1}{C_{0}(\Sigma_{0},N)}}}\mathbb{P}\Big(D^{*}_{N}(Y)\geq(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q_{0},N)}{N}\Big)\cdot C_{0}(\Sigma_{0},N)\cdot\frac{1}{1-q_{0}}\,dq_{0}=\int_{q}^{1-e^{\frac{C_{1}(d,\Sigma_{0},N)-1}{C_{0}(\Sigma_{0},N)}}}C_{0}(\Sigma_{0},N)\cdot\frac{1-q_{0}}{1-q_{0}}\,dq_{0}.$

Furthermore, from (4.22), we have

$\mathbb{P}\Big(D_{N}^{*}(X)\geq(\sqrt{2}\cdot\Sigma_{0}+1)\frac{A(d,q_{0},N)}{N}\Big)=1-q_{1}.$

Following the steps from (4.25) to (4.28), we obtain

$\mathbb{E}[D^{*}_{N}(X)]=\int_{0}^{1}\mathbb{P}(D^{*}_{N}(X)\geq t)\,dt=\int_{q}^{1-e^{\frac{C_{1}(d,\Sigma_{0},N)-1}{C_{0}(\Sigma_{0},N)}}}C_{0}(\Sigma_{0},N)\cdot\frac{1-q_{1}}{1-q_{0}}\,dq_{0}.$

From $q_{1}<q_{0}$, we obtain

$\frac{1-q_{1}}{1-q_{0}}>\frac{1-q_{0}}{1-q_{0}}=1.$

Hence,

(4.30) $\mathbb{E}(D_{N}^{*}(Y))<\mathbb{E}(D_{N}^{*}(X)).$

This completes the proof of Theorem 3.1.

5. Conclusion

We studied the expected star discrepancy under jittered sampling, and proved the strong partition principle for the star discrepancy. The most direct next task is to find sampling schemes whose expected star discrepancy is better than that of jittered sampling, which may be closely related to the variance of the characteristic function defined on convex test sets.

References

  • [1] H. Avron, V. Sindhwani, J. Yang and M. W. Mahoney, Quasi-Monte Carlo feature maps for shift-invariant kernels, J. Mach. Learn. Res., 17 (2016), 1–38.
  • [2] J. Beck, Some upper bounds in the theory of irregularities of distribution, Acta Arith., 43 (1984), 115–130.
  • [3] J. Beck, Irregularities of distribution. I, Acta Math., 159 (1987), 1–49.
  • [4] K. Chiu, P. Shirley and C. Wang, Multi-jittered sampling, Graphics Gems, 4 (1994), 370–374.
  • [5] F. Cucker and D. X. Zhou, Learning Theory: An Approximation Theory Viewpoint, Cambridge University Press, 2007.
  • [6] J. Dick and F. Pillichshammer, Digital Nets and Sequences. Discrepancy Theory and Quasi-Monte Carlo Integration, Cambridge University Press, 2010.
  • [7] J. Dick, F. Y. Kuo and I. H. Sloan, High-dimensional integration: the quasi-Monte Carlo way, Acta Numer., 22 (2013), 133–288.
  • [8] B. Doerr, M. Gnewuch and A. Srivastav, Bounds and constructions for the star-discrepancy via $\delta$-covers, J. Complexity, 21 (2005), 691–709.
  • [9] B. Doerr, A sharp discrepancy bound for jittered sampling, Math. Comp. (2022), http://doi.org/10.1090/mcom/3727.
  • [10] P. Glasserman, Monte Carlo Methods in Financial Engineering, Springer-Verlag, New York, 2004.
  • [11] M. Gnewuch, Bracketing numbers for axis-parallel boxes and applications to geometric discrepancy, J. Complexity, 24 (2008), 154–172.
  • [12] M. Kiderlen and F. Pausinger, On a partition with a lower expected $L_{2}$-discrepancy than classical jittered sampling, J. Complexity (2021), https://doi.org/10.1016/j.jco.2021.101616.
  • [13] M. Kiderlen and F. Pausinger, Discrepancy of stratified samples from partitions of the unit cube, Monatsh. Math., 195 (2021), 267–306.
  • [14] H. Niederreiter, Random Number Generation and Quasi-Monte Carlo Methods, SIAM, Philadelphia, 1992.
  • [15] A. B. Owen, Quasi-Monte Carlo sampling, in Monte Carlo Ray Tracing: ACM SIGGRAPH 2003 Course Notes, 44 (2003), 69–88.
  • [16] F. Pausinger and S. Steinerberger, On the discrepancy of jittered sampling, J. Complexity, 33 (2016), 199–216.
  • [17] A. W. van der Vaart and J. A. Wellner, Weak Convergence and Empirical Processes. With Applications to Statistics, Springer Series in Statistics, Springer-Verlag, New York, 1996.