
Strongly Efficient Rare-Event Simulation for Regularly Varying Lévy Processes with Infinite Activities

Xingyu Wang, Chang-Han Rhee
Abstract

In this paper, we address rare-event simulation for heavy-tailed Lévy processes with infinite activities. The presence of infinite activities poses a critical challenge, making it impractical to simulate or store the precise sample path of the Lévy process. We present a rare-event simulation algorithm that incorporates an importance sampling strategy based on heavy-tailed large deviations, the stick-breaking approximation for the extrema of Lévy processes, the Asmussen-Rosiński approximation, and the randomized debiasing technique. By establishing a novel characterization for the Lipschitz continuity of the law of Lévy processes, we show that the proposed algorithm is unbiased and strongly efficient under mild conditions, and hence applicable to a broad class of Lévy processes. In numerical experiments, our algorithm demonstrates significant improvements in efficiency compared to the crude Monte-Carlo approach.

1 Introduction

In this paper, we propose a strongly efficient rare-event simulation algorithm for heavy-tailed Lévy processes with infinite activities. Specifically, the goal is to estimate probabilities of the form $\mathbf{P}(X\in A)$, where $X=\{X(t):\ t\in[0,1]\}$ is a Lévy process in $\mathbb{R}$, $A$ is a subset of the càdlàg space that does not contain the typical path of $X$, so that $\mathbf{P}(X\in A)$ is close to 0, and the event $\{X\in A\}$ is "unsimulatable" because the process exhibits infinitely many jumps within any finite time interval. The defining features of the problem are as follows.

  • The increments of the Lévy process $X(t)$ are heavy-tailed. Throughout this paper, we characterize the heavy-tailed phenomenon through the notion of regular variation and assume that the tail probability $\mathbf{P}(\pm X(t)>x)$ decays roughly at a power-law rate $1/x^{\alpha}$; see Definition 1 for details. The notion of heavy tails provides the mathematical formulation for the extreme uncertainty that manifests in a wide range of real-world dynamics and systems, including the spread of COVID-19 (see, e.g., [21]), traffic in computer and communication networks (see, e.g., [45]), financial assets (see, e.g., [30, 10]), and the training of deep neural networks (see, e.g., [38, 41]).

  • $A$ is a general subset of $\mathbb{D}$ (i.e., the space of real-valued càdlàg functions over $[0,1]$) that involves the supremum of the path. For concreteness in our presentation, the majority of the paper focuses on

    $$A=\Big\{\xi\in\mathbb{D}:\ \sup_{t\in[0,1]}\xi(t)\geq a;\ \sup_{t\in(0,1]}\xi(t)-\xi(t-)<b\Big\}. \qquad (1.1)$$

    Intuitively speaking, this is closely related to ruin probabilities under reinsurance mechanisms, as $\{X\in A\}$ requires the supremum of the process $X(t)$ over $[0,1]$ to exceed some threshold $a$ even though all upward jumps in $X(t)$ are bounded by $b$. Nevertheless, we stress that the algorithmic framework proposed in this paper is flexible enough to address more general forms of events $\{X\in A\}$ that are of practical interest. For instance, we demonstrate in Section A of the Appendix that the framework can also address rare-event simulation in the context of barrier option pricing.

  • $X(t)$ possesses infinite activities; see Section 2.3 for the precise definition. Consequently, it is computationally infeasible to simulate or store the entire sample path of such processes. In other words, we focus on a computationally challenging case where $\mathbf{I}\{X\in A\}$ cannot be exactly simulated or evaluated. Addressing such "unsimulatable" cases is crucial due to the increasing popularity of Lévy models with infinite activities in risk management and mathematical finance (see, e.g., [15, 16, 56, 5, 57]), as they offer more accurate and flexible descriptions of the price and volatility of financial assets compared to the classical jump-diffusion models (see, e.g., [50]).

In summary, our goal is to tackle a practically significant yet computationally challenging task, where the nature of the rare events renders crude Monte Carlo methods highly inefficient, if not entirely infeasible, due to the infinite activities in $X(t)$. To address these challenges, we integrate several pieces of mathematical machinery: an importance sampling design based on sample-path large deviations for heavy-tailed Lévy processes in [54], the stick-breaking approximation in [35] for Lévy processes with infinite activities, and the randomized multilevel Monte Carlo debiasing technique in [55]. By combining these tools, we propose a rare-event simulation algorithm for heavy-tailed Lévy processes with infinite activities that attains strong efficiency (see Definition 2 for details).

As mentioned above, the first challenge is rooted in the nature of rare events: crude Monte Carlo methods can be prohibitively expensive when estimating a small $p=\mathbf{P}(X\in A)$. Instead, variance reduction techniques are often employed for efficient rare-event simulation. When the underlying uncertainties are light-tailed, the exponential tilting strategy guided by large deviations theory has been successfully applied in a variety of contexts; see, e.g., [14, 11, 59, 28]. In heavy-tailed settings, however, the exponential tilting approach falls short of providing a principled and provably efficient design of importance sampling estimators (see, for example, [4]), because the mechanisms through which the rare events occur are fundamentally different. Instead, different importance sampling strategies (e.g., [6, 27, 7, 9, 51, 8]) and other variance reduction techniques such as conditional Monte Carlo (e.g., [2, 42]) and Markov chain Monte Carlo (e.g., [37]) have been proposed to address problems associated with specific types of processes or events.

Recent developments in heavy-tailed large deviations, such as [54] and [61], offer critical insights into the design of efficient and universal importance sampling schemes for heavy-tailed systems. Central to this development is a discrete hierarchy of heavy-tailed rare events known as the catastrophe principle. The principle dictates that rare events in heavy-tailed systems arise due to catastrophic failures of a small number of system components, and the number of such components governs the asymptotic rate at which the associated rare events occur. This creates a discrete hierarchy among heavy-tailed rare events. By combining the defensive importance sampling design with this hierarchy, strongly efficient importance sampling algorithms have been proposed for a variety of rare events associated with random walks and compound Poisson processes in [20]. See also [62] for a tutorial on this topic. In this paper, we adopt and extend this framework to encompass Lévy processes with infinite activities. The specifics of the importance sampling distribution are detailed in Section 3.1.
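Although the path-space algorithm of Section 3 is far more involved, the defensive mixture idea can be illustrated in a toy setting. The sketch below (all parameters illustrative, with a heavier-tailed Pareto proposal standing in for a large-deviations-guided proposal) estimates a Pareto tail probability by sampling from a mixture of the nominal density $f$ and a proposal density $g$; mixing in the nominal density keeps the likelihood ratio bounded by $1/w$.

```python
import numpy as np

# Toy defensive importance sampling (not the path-space algorithm of Section 3):
# estimate p = P(X > a) for X ~ Pareto(alpha), i.e. P(X > x) = x^(-alpha), x >= 1.
rng = np.random.default_rng(0)
alpha, alpha_is, a, w, N = 2.0, 1.1, 100.0, 0.5, 200_000

f = lambda x: alpha * x ** (-alpha - 1)        # nominal density
g = lambda x: alpha_is * x ** (-alpha_is - 1)  # heavier-tailed proposal density

# sample from the defensive mixture w*f + (1-w)*g via inverse transform
u, v = rng.uniform(size=N), rng.uniform(size=N)
x = np.where(u < w, v ** (-1 / alpha), v ** (-1 / alpha_is))

# importance sampling estimator; the mixture likelihood ratio is at most 1/w
lr = f(x) / (w * f(x) + (1 - w) * g(x))
est = np.mean((x > a) * lr)
print(est, a ** (-alpha))   # estimate vs exact p = 1e-4
```

Crude Monte Carlo would see only about $Np=20$ hits here, whereas the heavier-tailed proposal produces hundreds of weighted exceedances, and the defensive component caps the variance of the weights.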

Another challenge arises from the simulation of Lévy processes with infinite activities. While the importance sampling design in [20] has been successfully applied to a wide range of stochastic systems that are exactly simulatable (including random walks, compound Poisson processes, iterates of stochastic gradient descent, and several classes of queueing systems), it cannot be implemented for Lévy processes with infinite activities. More specifically, the simulation of the random vector $(X(t),\ M(t))$, where $M(t)=\sup_{s\leq t}X(s)$, poses a significant challenge in the infinite-activity case. As of now, exact simulation of the extrema of Lévy processes (excluding the compound Poisson case) is only available for specific cases (see, for instance, [36, 23, 18]), let alone exact simulation of the joint law of $(X(t),\ M(t))$. We therefore approach the challenge by considering the following questions: (i) Does there exist a provably efficient approximation algorithm for $(X(t),\ M(t))$, and (ii) Are we able to remove the approximation bias while still attaining strong efficiency in our rare-event simulation algorithm?

Regarding the first question, several classes of algorithms have been proposed for the approximate simulation of the extrema of Lévy processes. These include random walk approximations based on Euler-type discretization of the process (see, e.g., [1, 26, 33]), the Wiener-Hopf approximation methods (see, e.g., [43, 31]) based on the fluctuation theory of Lévy processes, the jump-adapted Gaussian approximations (see, e.g., [24, 25]), and the characteristic function approach in [12, 13] based on efficient evaluation of the joint cdf. Nevertheless, the approximation errors of the aforementioned methods are either unavailable or exhibit a polynomial rate of decay. Thankfully, the recently developed stick-breaking approximation (SBA) algorithm in [35] provides a novel approach to the simulation of the joint law of $X(t)$ and $M(t)$. The theoretical foundation of SBA is the following description of the concave majorants of Lévy processes with infinite activities in [52]:

$$\big(X(t),\ M(t)\big)\ \stackrel{d}{=}\ \Big(\sum_{j\geq 1}\xi_{j},\ \sum_{j\geq 1}\max\{\xi_{j},0\}\Big).$$

Here, $(l_j)_{j\geq 1}$ is a sequence of iteratively generated non-negative random variables satisfying $\sum_{j\geq 1}l_j=t$ and $\mathbf{E}l_j=t/2^{j}$ for all $j\geq 1$; conditioned on the values of $(l_j)_{j\geq 1}$, the $\xi_j$'s are independently generated such that $\xi_j\stackrel{d}{=}X(l_j)$. While it is computationally infeasible to generate the infinite sequences $(l_j)_{j\geq 1}$ and $(\xi_j)_{j\geq 1}$ in their entirety, terminating the procedure at the $m$-th step yields approximations of the form

$$\big(\hat{X}_m(t),\ \hat{M}_m(t)\big)\ \triangleq\ \Big(\sum_{j=1}^{m}\xi_{j},\ \sum_{j=1}^{m}\max\{\xi_{j},0\}\Big). \qquad (1.2)$$

We provide a review in Section 2.3. In particular, since $\mathbf{E}\big[\sum_{j>m}l_j\big]=t/2^{m}$, each extra step in (1.2) is expected to halve the approximation error, leading to a geometric rate of convergence. See [35] for analyses of the approximation errors for different types of functionals.
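As a sanity check of the truncated approximation (1.2), the sketch below (illustrative parameters) applies it to a standard Brownian motion, an infinite-activity process for which $X(l)\sim N(0,l)$ is exactly simulatable and $\mathbf{E}M(1)=\sqrt{2/\pi}$ is known in closed form.

```python
import numpy as np

# Stick-breaking approximation (1.2) for X = standard Brownian motion on [0, 1]:
# given the stick l_j, xi_j ~ N(0, l_j), since X(l) ~ N(0, l).
rng = np.random.default_rng(1)
N, m = 200_000, 30                      # Monte Carlo samples, truncation level

rem = np.ones(N)                        # unassigned stick length; T = 1
X_hat = np.zeros(N)                     # approximates X(1)
M_hat = np.zeros(N)                     # approximates M(1) = sup_{t<=1} X(t)
for _ in range(m):
    l = rng.uniform(size=N) * rem       # l_j = V_j * (T - l_1 - ... - l_{j-1})
    rem -= l
    xi = rng.normal(0.0, np.sqrt(l))
    X_hat += xi
    M_hat += np.maximum(xi, 0.0)

# E[M(1)] = sqrt(2/pi) for standard BM; residual bias of the truncation is tiny
print(M_hat.mean(), np.sqrt(2 / np.pi))
```

With $m=30$ sticks, the unassigned length has mean $2^{-30}$, so the truncation bias is far below the Monte Carlo noise.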

Additionally, while SBA can be considered sufficiently accurate for a wide range of tasks, eliminating the approximation errors is crucial in the context of rare-event simulation. Otherwise, any effort to efficiently estimate a small probability could be overwhelmed by potentially large errors in the algorithm. To remove the approximation errors of SBA in (1.2), we employ the construction of unbiased estimators proposed in [55]. This can be interpreted as a randomized version of the multilevel Monte Carlo scheme [39, 32], applicable whenever a sequence of biased yet increasingly accurate approximations is available. It allows us to construct an unbiased estimation algorithm that terminates within a finite number of steps. By combining SBA, the randomized debiasing technique, and the design of importance sampling distributions based on heavy-tailed large deviations, we propose Algorithm 2 for rare-event simulation of Lévy processes with infinite activities. When exact sampling of $X(t)$, and hence of the increments $\xi_j$ in (1.2), is unavailable, we further incorporate the Asmussen-Rosiński approximation (ARA) of [3], which replaces the small-jump martingale in the Lévy process $X(t)$ with a Brownian motion of the same variance; this leads to Algorithm 3. We note that the combination of SBA and the randomized debiasing technique has been explored in [35], and an ARA-incorporated version of SBA has been proposed in [34]. However, the goal of proposing a strongly efficient rare-event simulation algorithm adds another layer of difficulty and sets our work apart from the existing literature. In particular, the notion of strong efficiency demands that the proposed estimator remain efficient under the importance sampling algorithm not just for a given task, but throughout a sequence of increasingly challenging rare-event simulation tasks as $\mathbf{P}(X\in A)$ tends to 0. This introduces a new dimension into the theoretical analysis that is not present in [35, 34] and necessitates the development of new technical tools to characterize the performance of the algorithm when all these components (importance sampling, SBA, the debiasing technique, and ARA) are in effect.

An important technical question in our analysis concerns the continuity of the law of the running supremum $M(t)$. At a high level, consider estimators for $\mathbf{P}(X\in A)=\mathbf{E}\big[\mathbf{I}\{X\in A\}\big]$ of the form $f(\hat{X})$, where $\hat{X}$ is some approximation to the Lévy process $X$ and $f(\xi)=\mathbf{I}\{\xi\in A\}$. SBA and the debiasing technique allow us to construct $\hat{X}$ such that the deviation $\hat{X}-X$ has a small variance. Nevertheless, the estimation can be fallible if $X$ concentrates on boundary cases, i.e., if $X$ falls into a neighborhood of $\partial A$ fairly often. Specializing to the case in (1.1), this requires sufficiently tight bounds on probabilities of the form $\mathbf{P}(M(t)\in[x,x+\delta])$. However, the continuity of the law of the supremum $M(t)$ remains an active area of study, with many essential questions left open. Recent developments regarding the law of $M(t)$ are mostly qualitative or focus on the cumulative distribution function (cdf); see, e.g., [19, 22, 44, 47, 49, 48]. In short, addressing this aspect of the challenge requires us to establish novel and useful quantitative characterizations of the law of the supremum $M(t)$.

For our purpose of efficient rare event simulation, particularly under the importance sampling scheme detailed in Section 3.1, the following condition proves to be sufficient:

$$\mathbf{P}\Big(X^{<z}(t)\in[x,x+\delta]\Big)\leq\frac{C}{t^{\lambda}\wedge 1}\,\delta\qquad\forall z\geq z_{0},\ t>0,\ x\in\mathbb{R},\ \delta\in[0,1]. \qquad (1.3)$$

Here, $X^{<z}(t)$ is a modulated version of the process $X(t)$ in which all upward jumps of size larger than $z$ are removed; see Section 3 for the rigorous definition. First, we establish in Theorem 3.2 (resp. Theorem 3.3) that Algorithm 2 (resp. Algorithm 3) does attain strong efficiency under condition (1.3). More importantly, we demonstrate in Section 4 that condition (1.3) is mild for Lévy processes with infinite activities, as it only requires the intensity of jumps to approach $\infty$ (hence attaining infinite activities in $X$) at a rate that is not too slow. In particular, in Theorems 4.2 and 4.4 we provide two sets of easily verifiable sufficient conditions for (1.3). We note that the representation of concave majorants for Lévy processes developed in [52] proves to be a valuable tool for studying the laws of $X(t)$ and $M(t)$. As will be elaborated in the proofs in Section 6, the key technical tool that allows us to connect condition (1.3) with the law of the supremum $M(t)$ is, again, the representation in (1.2). See also [17] for its application in studying the joint density of $X(t)$ and $M(t)$ for stable processes.

Some algorithmic contributions of this paper were presented in a preliminary form at a conference in [60] without rigorous proofs. The current paper presents several significant extensions: (i) In addition to Algorithm 2, we also propose an ARA-incorporated version of the importance sampling algorithm (see Algorithm 3) to address the case where $X(t)$ cannot be exactly simulated; (ii) Rigorous proofs of strong efficiency are provided in Section 6; (iii) We establish two sets of sufficient conditions for (1.3) in Section 4, leveraging the properties of regularly varying or semi-stable processes.

The rest of the paper is structured as follows. Section 2 reviews the theoretical foundations of our algorithms, including heavy-tailed large deviations theory (Section 2.2), the stick-breaking approximation (Section 2.3), and the debiasing technique (Section 2.4). Section 3 presents the importance sampling algorithms and establishes their strong efficiency. Section 4 investigates the continuity of the law of $X(t)$ and provides sufficient conditions for (1.3), a critical condition ensuring the strong efficiency of our importance sampling scheme. Numerical experiments are reported in Section 5. The proofs of all technical results are collected in Section 6. In the Appendix, Section A extends the algorithmic framework to the context of barrier option pricing.

2 Preliminaries

In this section, we introduce some notations and results that will be frequently used when developing the strongly efficient rare-event simulation algorithm.

2.1 Notations

Let $\mathbb{N}=\{0,1,2,\ldots\}$ be the set of non-negative integers. For any positive integer $k$, let $[k]=\{1,2,\ldots,k\}$. For any $x,y\in\mathbb{R}$, let $x\wedge y\triangleq\min\{x,y\}$ and $x\vee y\triangleq\max\{x,y\}$. For any $x\in\mathbb{R}$, we define $(x)^{+}\triangleq x\vee 0$ as the positive part of $x$, and

$$\lfloor x\rfloor\triangleq\max\{n\in\mathbb{Z}:\ n\leq x\},\qquad\lceil x\rceil\triangleq\min\{n\in\mathbb{Z}:\ n\geq x\}$$

as the floor and ceiling functions. Given a measure space $(\mathcal{X},\mathcal{F},\mu)$ and any set $A\in\mathcal{F}$, we use $\mu|_{A}(\cdot)\triangleq\mu(A\cap\cdot)$ to denote the restriction of the measure $\mu$ to $A$. For any random variable $X$ and any Borel measurable set $A$, let $\mathscr{L}(X)$ be the law of $X$, and $\mathscr{L}(X|A)$ be the law of $X$ conditioned on the event $A$. Let $(\mathbb{D},\bm{d})$ be the metric space where $\mathbb{D}=\mathbb{D}_{[0,1],\mathbb{R}}$ (i.e., the space of all real-valued càdlàg functions with domain $[0,1]$) is equipped with the Skorokhod $J_1$ metric $\bm{d}$. Here, the metric $\bm{d}$ is defined by

$$\bm{d}(x,y)\triangleq\inf_{\lambda\in\Lambda}\sup_{t\in[0,1]}|\lambda(t)-t|\vee|x(\lambda(t))-y(t)| \qquad (2.1)$$

with $\Lambda$ being the set of all increasing homeomorphisms from $[0,1]$ to itself.

Henceforth in this paper, the heavy-tailedness of any random element will be captured by the notion of regular variation.

Definition 1.

For any measurable function $\phi:(0,\infty)\to(0,\infty)$, we say that $\phi$ is regularly varying as $x\to\infty$ with index $\beta$ (denoted $\phi(x)\in\mathcal{RV}_{\beta}(x)$ as $x\to\infty$) if $\lim_{x\to\infty}\phi(tx)/\phi(x)=t^{\beta}$ for all $t>0$. Similarly, we say that a measurable function $\phi(\eta)$ is regularly varying as $\eta\downarrow 0$ with index $\beta$ (denoted $\phi(\eta)\in\mathcal{RV}_{\beta}(\eta)$ as $\eta\downarrow 0$) if $\lim_{\eta\downarrow 0}\phi(t\eta)/\phi(\eta)=t^{\beta}$ for all $t>0$.

For properties of regularly varying functions, see, for example, Chapter 2 of [53].
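For a concrete instance of Definition 1 (the function and parameters below are illustrative, not from the paper), $\phi(x)=x^{-\alpha}\log x$ is regularly varying with index $-\alpha$: the slowly varying factor $\log x$ cancels in the limit.

```python
import math

# phi(x) = x^(-alpha) * log(x) is regularly varying with index -alpha:
# phi(t*x)/phi(x) = t^(-alpha) * log(t*x)/log(x) -> t^(-alpha) as x -> infinity.
alpha, t = 1.5, 3.0
phi = lambda x: x ** (-alpha) * math.log(x)
for x in (1e2, 1e4, 1e8):
    print(x, phi(t * x) / phi(x))   # approaches t**(-alpha), about 0.1925
```

The convergence is slow precisely because of the logarithmic factor, which is typical of slowly varying corrections.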

Next, we discuss the Lévy-Itô decomposition of one-dimensional Lévy processes, i.e., $X(t)\in\mathbb{R}$. The law of a one-dimensional Lévy process $\{X(t):t\geq 0\}$ is completely characterized by its generating triplet $(c,\sigma,\nu)$, where $c\in\mathbb{R}$ represents the constant drift, $\sigma\geq 0$ is the magnitude of the Brownian motion term, and the Lévy measure $\nu$ characterizes the intensity of the jumps. More precisely,

$$X(t)\ \stackrel{d}{=}\ ct+\sigma B(t)+\int_{|x|\leq 1}x\big[N([0,t]\times dx)-t\,\nu(dx)\big]+\int_{|x|>1}x\,N([0,t]\times dx) \qquad (2.2)$$

where $B$ is a standard Brownian motion, the measure $\nu$ satisfies $\int(|x|^{2}\wedge 1)\nu(dx)<\infty$, and $N$ is a Poisson random measure over $(0,\infty)\times\mathbb{R}$ with intensity measure $\text{Leb}\times\nu$, independent of $B$, where $\text{Leb}(\cdot)$ denotes the Lebesgue measure on $(0,\infty)$. For standard references on this topic, see Chapter 4 of [58].
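For intuition about the triplet $(c,\sigma,\nu)$ in (2.2), the sketch below (illustrative parameters) samples $X(1)$ in a finite-activity special case where $\nu$ is a finite measure placing all its mass on jumps larger than 1, so both integrals in (2.2) collapse to a single uncompensated compound Poisson sum.

```python
import numpy as np

# Sample X(1) from a generating triplet (c, sigma, nu) in a finite-activity case:
# nu = lam * (law of a Pareto(3) jump on (1, inf)), so nu(R) = lam < infinity
# and no compensation is needed (all jumps exceed 1).
rng = np.random.default_rng(2)
N = 200_000
c, sigma, lam, alpha_J = 0.5, 1.0, 2.0, 3.0

K = rng.poisson(lam, size=N)                      # jump counts on [0, 1]
J = rng.uniform(size=K.sum()) ** (-1 / alpha_J)   # Pareto jumps, P(J > x) = x**-3
jumps = np.zeros(N)
np.add.at(jumps, np.repeat(np.arange(N), K), J)   # sum the jumps of each path

X1 = c + sigma * rng.normal(size=N) + jumps       # drift + Brownian part + jumps
# E[X(1)] = c + lam * E[J] = 0.5 + 2 * 1.5 = 3.5
print(X1.mean())
```

The infinite-activity case of interest in this paper is exactly the regime where such a direct jump-by-jump construction breaks down, since $\nu(\mathbb{R})=\infty$.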

Given two sequences of non-negative real numbers $(x_n)_{n\geq 1}$ and $(y_n)_{n\geq 1}$, we say that $x_n=\bm{O}(y_n)$ (as $n\to\infty$) if there exists some $C\in[0,\infty)$ such that $x_n\leq Cy_n$ for all $n\geq 1$. Besides, we say that $x_n=\bm{o}(y_n)$ if $\lim_{n\to\infty}x_n/y_n=0$. The goal of this paper is described in the following definition of strong efficiency.

Definition 2.

Let $(L_n)_{n\geq 1}$ be a sequence of random variables supported on a probability space $(\Omega,\mathcal{F},\mathbf{P})$ and $(A_n)_{n\geq 1}$ be a sequence of events (i.e., $A_n\in\mathcal{F}$ for all $n$). We say that $(L_n)_{n\geq 1}$ are unbiased and strongly efficient estimators of the probabilities $\big(\mathbf{P}(A_n)\big)_{n\geq 1}$ if

$$\mathbf{E}L_n=\mathbf{P}(A_n)\ \ \forall n\geq 1;\qquad\mathbf{E}L^{2}_{n}=\bm{O}\big(\mathbf{P}^{2}(A_n)\big)\ \text{ as }n\to\infty.$$

We stress that strongly efficient estimators $(L_n)_{n\geq 1}$ achieve uniformly bounded relative errors (i.e., the ratio between the standard deviation and the mean) for all $n\geq 1$.
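To see why Definition 2 is the right benchmark, note that the crude Monte Carlo estimator $L_n=\mathbf{I}\{A_n\}$ has $\mathbf{E}L_n^2=\mathbf{P}(A_n)$, so $\mathbf{E}L_n^2/\mathbf{P}^2(A_n)=1/\mathbf{P}(A_n)\to\infty$ and strong efficiency fails. The short sketch below tabulates the resulting relative error.

```python
# Crude Monte Carlo L = I{A} has E[L^2] = p, so E[L^2]/p^2 = 1/p -> infinity:
# its relative error sqrt(Var L)/E[L] = sqrt((1 - p)/p) grows like p**-0.5,
# violating the bounded-relative-error property of Definition 2.
for p in (1e-2, 1e-4, 1e-6):
    rel_err = ((1 - p) / p) ** 0.5
    print(p, rel_err)       # roughly 10, 100, 1000
```

Equivalently, hitting a fixed relative accuracy with crude Monte Carlo requires a sample size of order $1/p$, which is exactly what strong efficiency avoids.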

2.2 Sample-Path Large Deviations for Regularly Varying Lévy Processes

The key ingredient of our importance sampling algorithm is the recent development of sample-path large deviations for Lévy processes with regularly varying increments; see [54]. To familiarize the reader with this mathematical machinery, we start by reviewing the results in the one-sided case, and then move on to the more general two-sided results.

Let $X(t)\in\mathbb{R}$ be a centered Lévy process (i.e., $\mathbf{E}X(t)=0$ for all $t>0$) with generating triplet $(c,\sigma,\nu)$ such that the Lévy measure $\nu$ is supported on $(0,\infty)$. In other words, all the discontinuities in $X$ are positive, hence one-sided. Moreover, we are interested in the heavy-tailed setting where the function $H_{+}(x)=\nu[x,\infty)$ is regularly varying as $x\to\infty$ with index $-\alpha$, where $\alpha>1$. Define a scaled version of the process as $\bar{X}_n(t)\triangleq\frac{1}{n}X(nt)$, and let $\bar{X}_n\triangleq\{\bar{X}_n(t):\ t\in[0,1]\}$. Note that $\bar{X}_n$ is a random element taking values in $\mathbb{D}$.

For all $l\geq 1$, let $\mathbb{D}_l$ be the subset of $\mathbb{D}$ containing all the non-decreasing step functions that have exactly $l$ jumps (i.e., discontinuities) and vanish at the origin. Let $\mathbb{D}_0=\{\bm{0}\}$ be the set that only contains the zero function $\bm{0}(t)\equiv 0$, and let $\mathbb{D}_{<l}=\cup_{j=0}^{l-1}\mathbb{D}_j$. For any $\beta>0$, let $\nu_\beta$ be the measure supported on $(0,\infty)$ with $\nu_\beta(x,\infty)=x^{-\beta}$. For any positive integer $l$, let $\nu^{l}_{\beta}$ be the $l$-fold product measure of $\nu_\beta$ restricted to $\{\bm{y}=(y_1,\ldots,y_l)\in(0,\infty)^l:\ y_1\geq y_2\geq\cdots\geq y_l\}$. Define the measure (for $l\geq 1$)

$$\mathbf{C}_{l}(\cdot)\triangleq\mathbf{E}\Bigg[\nu_{\beta}^{l}\Big\{\bm{y}\in(0,\infty)^{l}:\ \sum_{j=1}^{l}y_{j}\mathbf{I}_{[U_{j},1]}\in\cdot\Big\}\Bigg]$$

where the $U_j$'s are iid copies of $\text{Unif}(0,1)$. In the case $l=0$, we set $\mathbf{C}_0$ to be the Dirac measure on the zero function $\bm{0}$. The following result provides sharp asymptotics for rare events associated with $\bar{X}_n$. Henceforth in this paper, all measurable sets are understood to be Borel measurable.

Result 1 (Theorem 3.1 of [54]).

Let $A\subset\mathbb{D}$ be measurable. Suppose that $\mathcal{J}(A)\triangleq\min\{j\in\mathbb{N}:\ \mathbb{D}_j\cap A\neq\emptyset\}<\infty$ and that $A$ is bounded away from $\mathbb{D}_{<\mathcal{J}(A)}$ in the sense that $\bm{d}(A,\mathbb{D}_{<\mathcal{J}(A)})>0$. Then

$$\mathbf{C}_{\mathcal{J}(A)}(A^{\circ})\leq\liminf_{n\to\infty}\frac{\mathbf{P}(\bar{X}_n\in A)}{(n\nu[n,\infty))^{\mathcal{J}(A)}}\leq\limsup_{n\to\infty}\frac{\mathbf{P}(\bar{X}_n\in A)}{(n\nu[n,\infty))^{\mathcal{J}(A)}}\leq\mathbf{C}_{\mathcal{J}(A)}(A^{-})<\infty$$

where $A^{\circ}$ and $A^{-}$ are the interior and closure of $A$, respectively.

Intuitively speaking, Result 1 embodies a general principle that, in heavy-tailed systems, rare events arise due to several "large jumps". Here, $\mathcal{J}(A)$ denotes the minimum number of jumps required in $\bar{X}_n$ for the event $\{\bar{X}_n\in A\}$ to occur. As shown in Result 1, $\mathcal{J}(A)$ dictates the polynomial rate of decay of the rare-event probabilities $\mathbf{P}(\bar{X}_n\in A)$. Furthermore, results such as Corollary 4.1 in [54] characterize the conditional limits of $\bar{X}_n$: conditioned on the occurrence of the rare event $\{\bar{X}_n\in A\}$, the conditional law $\mathscr{L}(\bar{X}_n|\{\bar{X}_n\in A\})$ converges in distribution, as $n\to\infty$, to that of a step function over $[0,1]$ with exactly $\mathcal{J}(A)$ jumps (of random sizes and arrival times). Therefore, $\mathcal{J}(A)$ also dictates the most likely scenarios of the rare events. This insight proves to be critical when we develop the importance sampling distributions for the rare-event simulation algorithm in Section 3.
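The "large jumps" principle is easy to observe numerically even in the simplest heavy-tailed model (a toy illustration with hypothetical parameters, not from the paper): for an iid Pareto sum, $\mathbf{P}(X_1+\cdots+X_n>a)\approx n\,\mathbf{P}(X_1>a)$ for large $a$, and the exceedance is typically caused by a single large summand.

```python
import numpy as np

# One-big-jump asymptotics for an iid Pareto(alpha) sum (the analogue of
# J(A) = 1): P(X_1 + ... + X_n > a) ~ n * P(X_1 > a) as a -> infinity.
rng = np.random.default_rng(3)
alpha, n, a, N = 1.5, 10, 200.0, 1_000_000

X = rng.uniform(size=(N, n)) ** (-1 / alpha)   # Pareto: P(X > x) = x**(-alpha)
S = X.sum(axis=1)
emp = (S > a).mean()                           # empirical rare-event probability
asym = n * a ** (-alpha)                       # one-big-jump approximation
frac = (X.max(axis=1) > a / 2)[S > a].mean()   # share of exceedances driven by
print(emp, asym, frac)                         # a single dominant summand
```

In this run almost every exceedance contains one summand larger than $a/2$, which is the finite-dimensional shadow of the conditional limit described above.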

Results for the two-sided case, where the Lévy process $X(t)$ exhibits both positive and negative jumps, admit a similar yet slightly more involved form. Specifically, let $X(t)$ be a centered Lévy process such that, for $H_{+}(x)=\nu[x,\infty)$ and $H_{-}(x)=\nu(-\infty,-x]$, we have $H_{+}(x)\in\mathcal{RV}_{-\alpha}(x)$ and $H_{-}(x)\in\mathcal{RV}_{-\alpha'}(x)$ as $x\to\infty$ for some $\alpha,\alpha'>1$. Let $\mathbb{D}_{j,k}$ be the set containing all step functions in $\mathbb{D}$ vanishing at the origin that have exactly $j$ upward jumps and $k$ downward jumps. As a convention, let $\mathbb{D}_{0,0}=\{\bm{0}\}$. Given $\alpha,\alpha'>1$, let $\mathbb{D}_{<j,k}\triangleq\bigcup_{(l,m)\in\mathbb{I}_{<j,k}}\mathbb{D}_{l,m}$ where $\mathbb{I}_{<j,k}\triangleq\big\{(l,m)\in\mathbb{N}^{2}\setminus\{(j,k)\}:\ l(\alpha-1)+m(\alpha'-1)\leq j(\alpha-1)+k(\alpha'-1)\big\}$. Let $\mathbf{C}_{0,0}$ be the Dirac measure on $\bm{0}$. For any $(j,k)\in\mathbb{N}^{2}\setminus\{(0,0)\}$, let

$$\mathbf{C}_{j,k}(\cdot)\triangleq\mathbf{E}\Bigg[\nu^{j}_{\alpha}\times\nu^{k}_{\alpha'}\bigg\{(\bm{x},\bm{y})\in(0,\infty)^{j}\times(0,\infty)^{k}:\ \sum_{l=1}^{j}x_{l}\mathbf{I}_{[U_{l},1]}-\sum_{m=1}^{k}y_{m}\mathbf{I}_{[V_{m},1]}\in\cdot\bigg\}\Bigg] \qquad (2.3)$$

where the $U_l$'s and $V_m$'s are iid copies of $\text{Unif}(0,1)$ random variables. Now, we are ready to state the two-sided result.

Result 2 (Theorem 3.4 of [54]).

Let $A\subset\mathbb{D}$ be measurable. Suppose that

$$\big(\mathcal{J}(A),\mathcal{K}(A)\big)\in\underset{(j,k)\in\mathbb{N}^{2}:\ \mathbb{D}_{j,k}\cap A\neq\emptyset}{\operatorname{argmin}}\ j(\alpha-1)+k(\alpha'-1) \qquad (2.4)$$

and $A$ is bounded away from $\mathbb{D}_{<\mathcal{J}(A),\mathcal{K}(A)}$. Then the minimizer in (2.4) is unique, and

$$\mathbf{C}_{\mathcal{J}(A),\mathcal{K}(A)}(A^{\circ})\leq\liminf_{n\to\infty}\frac{\mathbf{P}(\bar{X}_n\in A)}{(n\nu[n,\infty))^{\mathcal{J}(A)}(n\nu(-\infty,-n])^{\mathcal{K}(A)}}\leq\limsup_{n\to\infty}\frac{\mathbf{P}(\bar{X}_n\in A)}{(n\nu[n,\infty))^{\mathcal{J}(A)}(n\nu(-\infty,-n])^{\mathcal{K}(A)}}\leq\mathbf{C}_{\mathcal{J}(A),\mathcal{K}(A)}(A^{-})<\infty$$

where $A^{\circ}$ and $A^{-}$ are the interior and closure of $A$, respectively.

2.3 Concave Majorants and Stick-Breaking Approximations of Lévy Processes with Infinite Activities

Next, we review the distribution of the concave majorant of a Lévy process with infinite activities characterized in [52], which paves the way for the stick-breaking approximation algorithm proposed in [35]. Let $X(t)$ be a Lévy process with generating triplet $(c,\sigma,\nu)$. We say that $X$ has infinite activities if $\sigma>0$ or $\nu(\mathbb{R})=\infty$. Let $M(t)\triangleq\sup_{s\leq t}X(s)$ be the running supremum of $X(t)$. The results in [52] establish a Poisson-Dirichlet distribution that underlies the joint law of $X(t)$ and $M(t)$. Specifically, we fix some $T>0$ and let the $V_i$'s be iid copies of $\text{Unif}(0,1)$ random variables. Recursively, let

$$l_1=TV_1,\qquad l_j=V_j\cdot(T-l_1-l_2-\cdots-l_{j-1})\quad\forall j\geq 2. \qquad (2.5)$$

Conditioned on the values of $(l_j)_{j\geq 1}$, let $\xi_j$ be a random copy of $X(l_j)$, with all the $\xi_j$'s generated independently.
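A quick numerical check (with illustrative parameters) of the recursion (2.5): the expected stick lengths halve at each step, $\mathbf{E}l_j=T/2^{j}$, which is the property behind the geometric error decay of SBA noted in Section 1.

```python
import numpy as np

# Verify E[l_j] = T / 2**j for the stick-breaking recursion (2.5).
rng = np.random.default_rng(4)
T, N, J = 2.0, 500_000, 8

rem = np.full(N, T)                  # remaining stick length per sample
means = []
for _ in range(J):
    l = rng.uniform(size=N) * rem    # l_j = V_j * (T - l_1 - ... - l_{j-1})
    rem -= l
    means.append(l.mean())
print([round(v, 4) for v in means])  # roughly [1.0, 0.5, 0.25, 0.125, ...]
```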

Result 3 (Theorem 1 in [52]).

Suppose that the Lévy process $X$ has infinite activities. Then (with $(x)^{+}=\max\{x,0\}$)

$$\big(X(T),M(T)\big)\ \stackrel{d}{=}\ \Big(\sum_{j\geq 1}\xi_{j},\ \sum_{j\geq 1}(\xi_{j})^{+}\Big). \qquad (2.6)$$

Based on the distribution characterized in (2.6), the stick-breaking approximation algorithm was proposed in [35], where finitely many $\xi_i$'s are generated in order to approximate $X(T)$ and $M(T)$. This approximation technique is a key component of our rare-event simulation algorithm. In particular, we utilize a coupling between different Lévy processes based on the representation (2.6) above. For clarity of our description, we focus on two Lévy processes $X$ and $\widetilde{X}$ with generating triplets $(c,\sigma,\nu)$ and $(\widetilde{c},\widetilde{\sigma},\widetilde{\nu})$, respectively. Suppose that both $X$ and $\widetilde{X}$ have infinite activities. We first generate the $l_i$'s as described in (2.5). Conditioned on the values of $(l_i)_{i\geq 1}$, we then independently generate $\xi_i$ and $\widetilde{\xi}_i$, which are random copies of $X(l_i)$ and $\widetilde{X}(l_i)$, respectively. Let $\widetilde{M}(t)\triangleq\sup_{s\leq t}\widetilde{X}(s)$. Applying Result 3, we identify a coupling between $X(T),M(T),\widetilde{X}(T),\widetilde{M}(T)$ such that

(X(T),M(T),X~(T),M~(T))\ensurestackMath\stackon[1pt]=d(i1ξi,i1(ξi)+,i1ξ~i,i1(ξ~i)+).\displaystyle\big{(}X(T),M(T),\widetilde{X}(T),\widetilde{M}(T)\big{)}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\big{(}\sum_{i\geq 1}\xi_{i},\sum_{i\geq 1}(\xi_{i})^{+},\sum_{i\geq 1}\widetilde{\xi}_{i},\sum_{i\geq 1}(\widetilde{\xi}_{i})^{+}\big{)}. (2.7)
Remark 1.

It is worth noticing that the method described above in fact implies the existence of a probability space (Ω,,𝐏)(\Omega,\mathcal{F},\mathbf{P}) that supports the entire sample paths {X(t):t[0,T]}\{X(t):\ t\in[0,T]\} and {X~(t):t[0,T]}\{\widetilde{X}(t):\ t\in[0,T]\}, whose endpoint values X(T),X~(T)X(T),\widetilde{X}(T) and suprema M(T),M~(T)M(T),\widetilde{M}(T) admit the joint law in (2.7). In particular, once we obtain lil_{i} based on (2.5), one can generate Ξi\Xi_{i} that are iid copies of the entire paths of XX. That is, we generate a piece of sample path Ξi\Xi_{i} on the stick lil_{i}, and the quantities ξi\xi_{i} introduced earlier can be obtained by setting ξi=Ξi(li)\xi_{i}=\Xi_{i}(l_{i}). To recover the sample path of XX based on the pieces Ξi\Xi_{i}, it suffices to apply the Vervaat transform to each Ξi\Xi_{i} and then reorder the pieces based on their slopes. We refer the readers to Theorem 4 in [52]. In summary, the method described above leads to a coupling between the sample paths of the underlying Lévy processes XX and X~\widetilde{X} such that (2.7) holds.
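As an illustration, the stick-breaking representation in Result 3 can be sketched in a few lines of Python. We use a standard Brownian motion as the driving process, purely because its increments X(l) ~ N(0, l) can be sampled exactly; the truncation level J is an illustrative parameter, and the one-shot sampling of the residual stick follows the device described in Remark 1.

```python
import numpy as np

def stick_breaking_bm(rng, T=1.0, J=30):
    """Sketch of the representation (2.6) for a standard Brownian motion,
    chosen only because X(l) ~ N(0, l) can be sampled exactly.

    Returns (x, m): x is an exact sample of X(T) (the residual stick is
    sampled in one shot), and m is the stick-breaking approximation of
    M(T) based on the first J sticks.
    """
    remaining, x, m = T, 0.0, 0.0
    for _ in range(J):
        stick = rng.uniform() * remaining      # l_j = V_j * (T - l_1 - ... - l_{j-1})
        remaining -= stick
        xi = rng.normal(0.0, np.sqrt(stick))   # xi_j is a copy of X(l_j) ~ N(0, l_j)
        x += xi                                # accumulates toward X(T)
        m += max(xi, 0.0)                      # accumulates (xi_j)^+ toward M(T)
    x += rng.normal(0.0, np.sqrt(remaining))   # one-shot residual sum_{j > J} xi_j
    return x, m
```

Averaging m over repeated runs recovers E M(1) = sqrt(2/pi) ≈ 0.798 for Brownian motion up to a truncation bias that decays geometrically in J, while x is exactly N(0, T).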

2.4 Randomized Debiasing Technique

To achieve unbiasedness in our algorithm and remove the errors in the stick-breaking approximations, we apply the randomized multi-level Monte-Carlo technique studied in [55]. In particular, since τ\tau in Result 4 below is almost surely finite, the simulation of ZZ relies only on Y0,Y1,,YτY_{0},Y_{1},\cdots,Y_{\tau} instead of the infinite sequence (Yn)n0(Y_{n})_{n\geq 0}.

Result 4 (Theorem 1 in [55]).

Let random variables YY and (Ym)m0(Y_{m})_{m\geq 0} be such that limm𝐄Ym=𝐄Y\lim_{m\rightarrow\infty}\mathbf{E}Y_{m}=\mathbf{E}Y. Let τ\tau be a positive integer-valued random variable with unbounded support, independent of (Ym)m0(Y_{m})_{m\geq 0} and YY. Suppose that

m1𝐄|Ym1Y|2/𝐏(τm)<,\displaystyle\sum_{m\geq 1}\mathbf{E}|Y_{m-1}-Y|^{2}\big{/}\mathbf{P}(\tau\geq m)<\infty, (2.8)

then Z\ensurestackMath\stackon[1pt]=Δm=0τ(YmYm1)/𝐏(τm)Z\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sum_{m=0}^{\tau}(Y_{m}-Y_{m-1})\big{/}\mathbf{P}(\tau\geq m) (with the convention Y1=0Y_{-1}=0) satisfies

𝐄Z=𝐄Y,𝐄Z2=m0v¯m/𝐏(τm)\mathbf{E}Z=\mathbf{E}Y,\qquad\mathbf{E}Z^{2}=\sum_{m\geq 0}\bar{v}_{m}\big{/}\mathbf{P}(\tau\geq m)

where v¯m=𝐄|Ym1Y|2𝐄|YmY|2\bar{v}_{m}=\mathbf{E}|Y_{m-1}-Y|^{2}-\mathbf{E}|Y_{m}-Y|^{2}.
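As a toy illustration of Result 4 (not from the paper): take the deterministic sequence Y_m = 1 - 2^(-m) with limit Y = 1, and let tau be geometric so that P(tau >= m) = rho^(m-1). Condition (2.8) then holds for any rho > 1/4, and averaging independent copies of Z recovers E Y = 1 even though each Z only touches finitely many Y_m.

```python
import numpy as np

def Y(m):
    return 1.0 - 0.5 ** m            # toy approximations with E[Y_m] -> E[Y] = 1

def debiased_Z(rng, rho=0.6):
    """One sample of Z from Result 4, with P(tau >= m) = rho**(m-1)."""
    tau = rng.geometric(1.0 - rho)   # P(tau = m) = (1 - rho) * rho**(m-1), m >= 1
    z = Y(0)                         # m = 0 term: (Y_0 - Y_{-1}) / P(tau >= 0), Y_{-1} = 0
    for m in range(1, tau + 1):
        z += (Y(m) - Y(m - 1)) / rho ** (m - 1)
    return z

rng = np.random.default_rng(0)
est = np.mean([debiased_Z(rng) for _ in range(100_000)])  # close to E[Y] = 1
```

Here rho = 0.6 satisfies the summability condition; taking rho too small would make (2.8) fail and blow up the variance of Z.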

3 Algorithm

Throughout the rest of this paper, let X(t)X(t) be a Lévy process with generating triplet (cX,σ,ν)(c_{X},\sigma,\nu) satisfying the following heavy-tailed assumption.

Assumption 1.

𝐄X(1)=0\mathbf{E}X(1)=0. X(t)X(t) is of infinite activity. The Blumenthal-Getoor index β\ensurestackMath\stackon[1pt]=Δinf{p>0:(1,1)|x|pν(dx)<}\beta\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\inf\{p>0:\int_{(-1,1)}|x|^{p}\nu(dx)<\infty\} satisfies β<2\beta<2. In addition, one of the two claims below holds for the Lévy measure ν\nu.

  • (One-sided cases) ν\nu is supported on (0,)(0,\infty), and the function H+(x)=ν[x,)H_{+}(x)=\nu[x,\infty) is regularly varying as xx\rightarrow\infty with index α-\alpha, where α>1\alpha>1;

  • (Two-sided cases) There exist α,α>1\alpha,\alpha^{\prime}>1 such that H+(x)=ν[x,)H_{+}(x)=\nu[x,\infty) is regularly varying as xx\rightarrow\infty with index α-\alpha and H(x)=ν(,x]H_{-}(x)=\nu(-\infty,-x] is regularly varying as xx\rightarrow\infty with index α-\alpha^{\prime}.

The other assumption on X(t)X(t) concerns the continuity of the law of X<zX^{<z}, which is the Lévy process with generating triplet (cX,σ,ν|(,z))(c_{X},\sigma,\nu|_{(-\infty,z)}). That is, X<zX^{<z} is a modified version of XX in which all upward jumps of size at least zz are removed.

Assumption 2.

There exist z0,C,λ>0z_{0},C,\lambda>0 such that

𝐏(X<z(t)[x,x+δ])Cδtλ1zz0,t>0,x,δ>0.\mathbf{P}\big{(}X^{<z}(t)\in[x,x+\delta]\big{)}\leq\frac{C\delta}{t^{\lambda}\wedge 1}\qquad\forall z\geq z_{0},\ t>0,\ x\in\mathbb{R},\ \delta>0.

Assumption 2 can be interpreted as a uniform version of Lipschitz continuity in the law of X<z(t)X^{<z}(t). In Section 4, we show that Assumption 2 is a mild condition for Lévy processes with infinite activities and is easy to verify.
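For illustration, here is one standard sufficient condition (a generic Gaussian-smoothing argument, included here for concreteness): if σ>0\sigma>0, then Assumption 2 holds with λ=1/2\lambda=1/2 regardless of the jump part. Indeed, write X<z(t)=σB(t)+R(t)X^{<z}(t)=\sigma B(t)+R(t) with R(t)R(t) independent of the standard Brownian motion B(t)B(t). Conditioning on R(t)R(t) and bounding the Gaussian density by its peak value,

𝐏(X<z(t)[x,x+δ])supy𝐏(σB(t)[y,y+δ])δσ2πtδ/(σ2π)t1/21,\mathbf{P}\big{(}X^{<z}(t)\in[x,x+\delta]\big{)}\leq\sup_{y\in\mathbb{R}}\mathbf{P}\big{(}\sigma B(t)\in[y,y+\delta]\big{)}\leq\frac{\delta}{\sigma\sqrt{2\pi t}}\leq\frac{\delta/(\sigma\sqrt{2\pi})}{t^{1/2}\wedge 1},

uniformly in zz, so Assumption 2 holds with C=1/(σ2π)C=1/(\sigma\sqrt{2\pi}) and λ=1/2\lambda=1/2.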

Next, we describe a class of target events (An)n1(A_{n})_{n\geq 1} for which we propose a strongly efficient rare event simulation algorithm. Let X¯n(t)=1nX(nt)\bar{X}_{n}(t)=\frac{1}{n}X(nt) and X¯n={X¯n(t):t[0,1]}\bar{X}_{n}=\{\bar{X}_{n}(t):\ t\in[0,1]\} be the scaled version of the process. Define events

A\ensurestackMath\stackon[1pt]=Δ{ξ𝔻:supt[0,1]ξ(t)a;supt(0,1]ξ(t)ξ(t)<b},An\ensurestackMath\stackon[1pt]=Δ{X¯nA}.\displaystyle A\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\xi\in\mathbb{D}:\sup_{t\in[0,1]}\xi(t)\geq a;\sup_{t\in(0,1]}\xi(t)-\xi(t-)<b\},\qquad A_{n}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\bar{X}_{n}\in A\}. (3.1)

In words, ξA\xi\in A means that the path ξ\xi crosses the barrier aa even though no upward jump in ξ\xi is larger than bb. For technical reasons, we also impose the following mild condition on the values of the constants a,b>0a,b>0.

Assumption 3.

a,b>0a,b>0 and a/b.a/b\notin\mathbb{Z}.

In this section, we present a strongly efficient rare-event simulation algorithm for (An)n1(A_{n})_{n\geq 1}. Specifically, Section 3.1 presents the design of the importance sampling distribution 𝐐n\mathbf{Q}_{n}, Section 3.2 discusses how we apply the randomized Monte-Carlo debiasing technique in Result 4 in our algorithm, Section 3.3 discusses how we combine the debiasing technique with SBA in Result 3, and Section 3.4 explains how to sample from the importance sampling distribution 𝐐n\mathbf{Q}_{n}. Combining all these components in Section 3.5, we propose Algorithm 2 for rare-event simulation of 𝐏(An)\mathbf{P}(A_{n}) and establish its strong efficiency in Theorem 3.2. Section 3.6 addresses the case where the exact simulation of X<z(t)X^{<z}(t) is not available.

3.1 Importance Sampling Distributions 𝐐n\mathbf{Q}_{n}

At the core of our algorithm is a principled design of importance sampling strategies based on heavy-tailed large deviations. This can be seen as an extension of the framework proposed in [20]. First, note that

l\ensurestackMath\stackon[1pt]=Δa/b\displaystyle l^{*}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\lceil a/b\rceil (3.2)

indicates the number of jumps required to cross the barrier aa starting from the origin if no jump is allowed to be larger than bb. Based on the sample-path large deviations reviewed in Section 2.2, we expect the events An={X¯nA}A_{n}=\{\bar{X}_{n}\in A\} to be almost always caused by exactly ll^{*} large upward jumps in X¯n\bar{X}_{n}. These insights reveal critical information regarding the conditional law 𝐏(|X¯nA)\mathbf{P}(\ \cdot\ |\bar{X}_{n}\in A). More importantly, they lead to a natural yet effective choice of importance sampling distributions that focuses on the ll^{*}-large-jump paths and provides a sufficient approximation to 𝐏(|X¯nA)\mathbf{P}(\ \cdot\ |\bar{X}_{n}\in A). Specifically, for any γ(0,b)\gamma\in(0,b), define events Bnγ\ensurestackMath\stackon[1pt]=Δ{X¯nBγ}B^{\gamma}_{n}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\bar{X}_{n}\in B^{\gamma}\} with

Bγ\ensurestackMath\stackon[1pt]=Δ{ξ𝔻:#{t[0,1]:ξ(t)ξ(t)γ}l},\displaystyle B^{\gamma}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\big{\{}\xi\in\mathbb{D}:\#\{t\in[0,1]:\xi(t)-\xi(t-)\geq\gamma\}\geq l^{*}\big{\}}, (3.3)

where, for any ξ𝔻\xi\in\mathbb{D}, we define ξ(t)=limstξ(s)\xi(t-)=\lim_{s\uparrow t}\xi(s) as the left-limit of ξ\xi at time tt. Intuitively speaking, the parameter γ(0,b)\gamma\in(0,b) acts as a threshold of “large jumps”: any path ξBγ\xi\in B^{\gamma} has at least ll^{*} upward jumps that are considered large relative to the threshold level γ\gamma. To prevent the likelihood ratio from blowing up to infinity, we then consider an importance sampling distribution with defensive mixtures (see [40]) and define (for some w(0,1)w\in(0,1))

𝐐n()\ensurestackMath\stackon[1pt]=Δw𝐏()+(1w)𝐏(|Bnγ).\displaystyle\mathbf{Q}_{n}(\cdot)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}w\mathbf{P}(\cdot)+(1-w)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n}). (3.4)

Sampling from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n}), and hence 𝐐n()\mathbf{Q}_{n}(\cdot), is straightforward and will be addressed in Section 3.4.
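The unbiasedness mechanism behind the defensive mixture can be checked on a toy discrete example (all numbers here are illustrative: a uniform X on {1, ..., 10} stands in for the path, B = {X >= 8} plays the role of the conditioning event, and A = {X >= 9} is the rare event):

```python
import numpy as np

rng = np.random.default_rng(0)
w, P_B = 0.1, 0.3                 # mixture weight and P(B) = P(X >= 8)

def sample_Q():
    """Draw from Q = w * P + (1 - w) * P(. | B), cf. (3.4)."""
    if rng.random() < w:
        return int(rng.integers(1, 11))   # from P: uniform on {1, ..., 10}
    return int(rng.integers(8, 11))       # from P(. | B): uniform on {8, 9, 10}

N, total = 100_000, 0.0
for _ in range(N):
    x = sample_Q()
    lr = 1.0 / (w + (1.0 - w) / P_B * (x >= 8))  # likelihood ratio dP/dQ at x
    total += (x >= 9) * lr                       # importance sampling for P(A)
est = total / N                                  # close to P(A) = 0.2
```

The weight w keeps the likelihood ratio bounded by 1/w even off the event B, which is exactly the role of the defensive mixture in (3.4).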

With the design of the importance sampling distribution 𝐐n\mathbf{Q}_{n} in hand, one would naturally consider an estimator for 𝐏(An)\mathbf{P}(A_{n}) of form 𝐈And𝐏d𝐐n\mathbf{I}_{A_{n}}\cdot\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}. This is due to

𝐄𝐐n[𝐈And𝐏d𝐐n]=𝐄[𝐈An]=𝐏(An).\mathbf{E}^{\mathbf{Q}_{n}}\bigg{[}\mathbf{I}_{A_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\bigg{]}=\mathbf{E}[\mathbf{I}_{A_{n}}]=\mathbf{P}(A_{n}).

Here, we use 𝐄𝐐n\mathbf{E}^{\mathbf{Q}_{n}} to denote the expectation operator under law 𝐐n\mathbf{Q}_{n} and 𝐄\mathbf{E} for the expectation under 𝐏\mathbf{P}. Nevertheless, the exact evaluation or simulation of 𝐈An=𝐈{X¯nA}\mathbf{I}_{A_{n}}=\mathbf{I}\{\bar{X}_{n}\in A\} is generally not computationally feasible: the infinite activity of the process XX makes it impossible to simulate or store the entire sample path with finite computational resources. This marks a significant difference from the tasks in [20], which focus on random walks or compound Poisson processes with constant drifts that can be simulated exactly. To overcome this challenge, we instead consider estimators LnL_{n} in the form of

Ln=Znd𝐏d𝐐n=Znw+1w𝐏(Bnγ)𝐈Bnγ\displaystyle L_{n}=Z_{n}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}=\frac{Z_{n}}{w+\frac{1-w}{\mathbf{P}(B^{\gamma}_{n})}\mathbf{I}_{B^{\gamma}_{n}}} (3.5)

where ZnZ_{n} can be simulated within finite computational resources and allows LnL_{n} to recover the right expectation under the importance sampling distribution 𝐐n\mathbf{Q}_{n}, i.e., 𝐄𝐐n[Ln]=𝐏(An)\mathbf{E}^{\mathbf{Q}_{n}}[L_{n}]=\mathbf{P}(A_{n}). In Section 3.2, we elaborate on the design of the estimators ZnZ_{n}.

3.2 Estimators ZnZ_{n}

Intuitively speaking, the goal is to construct ZnZ_{n}’s that can be plugged into (3.5) as unbiased estimators of 𝐈An\mathbf{I}_{A_{n}}. To this end, we consider the following decomposition of the Lévy process XX. For any ξ𝔻\xi\in\mathbb{D} and t0t\geq 0, let Δξ(t)=ξ(t)ξ(t)\Delta\xi(t)=\xi(t)-\xi(t-) be the size of the discontinuity in ξ\xi at time tt. Recall that γ(0,b)\gamma\in(0,b) is the threshold of large jumps in the definition of BγB^{\gamma} in (3.3). Let

Jn(t)\displaystyle J_{n}(t) \ensurestackMath\stackon[1pt]=Δs[0,t]ΔX(s)𝐈(ΔX(s)nγ),\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sum_{s\in[0,t]}\Delta X(s)\mathbf{I}\big{(}\Delta X(s)\geq n\gamma\big{)}, (3.6)
Ξn(t)\displaystyle\Xi_{n}(t) \ensurestackMath\stackon[1pt]=ΔX(t)Jn(t)=X(t)s[0,t]ΔX(s)𝐈(ΔX(s)nγ).\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}X(t)-J_{n}(t)=X(t)-\sum_{s\in[0,t]}\Delta X(s)\mathbf{I}\big{(}\Delta X(s)\geq n\gamma\big{)}.

We highlight several important facts regarding the decomposition X(t)=Jn(t)+Ξn(t)X(t)=J_{n}(t)+\Xi_{n}(t).

  • By the definition of 𝐐n\mathbf{Q}_{n}, the law of Ξn\Xi_{n} remains unchanged under both 𝐐n\mathbf{Q}_{n} and 𝐏\mathbf{P}, and is identical to the law of X<nγX^{<n\gamma}, namely, a Lévy process with generating triplet (cX,σ,ν|(,nγ))(c_{X},\sigma,\nu|_{(-\infty,n\gamma)}).

  • Under 𝐏\mathbf{P}, the process JnJ_{n} admits the law of a Lévy process with generating triplet (0,0,ν|[nγ,))(0,0,\nu|_{[n\gamma,\infty)}), which is a compound Poisson process.

  • Under 𝐐n\mathbf{Q}_{n}, the path {Jn(t):t[0,n]}\{J_{n}(t):\ t\in[0,n]\} follows the same law as a Lévy process with generating triplet (0,0,ν|[nγ,))(0,0,\nu|_{[n\gamma,\infty)}), conditioned on having at least ll^{*} jumps over [0,n][0,n].

  • Under both 𝐏\mathbf{P} and 𝐐n\mathbf{Q}_{n}, the two processes JnJ_{n} and Ξn\Xi_{n} are independent.

Let J¯n(t)=1nJn(nt)\bar{J}_{n}(t)=\frac{1}{n}J_{n}(nt), J¯n={J¯n(t):t[0,1]}\bar{J}_{n}=\{\bar{J}_{n}(t):\ t\in[0,1]\}, Ξ¯n(t)=1nΞn(nt)\bar{\Xi}_{n}(t)=\frac{1}{n}\Xi_{n}(nt), and Ξ¯n={Ξ¯n(t):t[0,1]}\bar{\Xi}_{n}=\{\bar{\Xi}_{n}(t):\ t\in[0,1]\}. We now discuss how the decomposition

X¯n=J¯n+Ξ¯n\bar{X}_{n}=\bar{J}_{n}+\bar{\Xi}_{n}

can help us construct unbiased estimators of 𝐈An\mathbf{I}_{A_{n}}. First, recall that γ(0,b)\gamma\in(0,b). As a result, in the definition of events An={X¯nA}A_{n}=\{\bar{X}_{n}\in A\} in (3.1), the condition supt(0,1]ξ(t)ξ(t)<b\sup_{t\in(0,1]}\xi(t)-\xi(t-)<b only concerns the large jump process J¯n\bar{J}_{n} since any upward jump in Ξ¯n\bar{\Xi}_{n} is bounded by γ<b\gamma<b. Therefore, with

E\ensurestackMath\stackon[1pt]=Δ{ξ𝔻:supt(0,1]ξ(t)ξ(t)<b},En\ensurestackMath\stackon[1pt]=Δ{J¯nE}\displaystyle E\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\xi\in\mathbb{D}:\ \sup_{t\in(0,1]}\xi(t)-\xi(t-)<b\},\qquad E_{n}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\bar{J}_{n}\in E\} (3.7)

and

M(t)\ensurestackMath\stackon[1pt]=ΔsupstX(s),Yn\ensurestackMath\stackon[1pt]=Δ𝐈(M(n)na),M(t)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sup_{s\leq t}X(s),\qquad Y^{*}_{n}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\mathbf{I}\big{(}M(n)\geq na\big{)},

we have

𝐈An=Yn𝐈En.\mathbf{I}_{A_{n}}=Y^{*}_{n}\mathbf{I}_{E_{n}}.

As discussed above, the exact evaluation of YnY^{*}_{n} is generally not computationally feasible. Instead, suppose that we have access to a sequence of random variables (Y^nm)m0(\hat{Y}^{m}_{n})_{m\geq 0} that only take values in {0,1}\{0,1\} and provide progressively more accurate approximations of YnY^{*}_{n} as mm\rightarrow\infty. Then, in light of the debiasing technique in Result 4, one can consider (under the convention that Y^n10\hat{Y}^{-1}_{n}\equiv 0)

Zn=m=0τY^nmY^nm1𝐏(τm)𝐈En\displaystyle Z_{n}=\sum_{m=0}^{\tau}\frac{\hat{Y}^{m}_{n}-\hat{Y}^{m-1}_{n}}{\mathbf{P}(\tau\geq m)}\mathbf{I}_{E_{n}} (3.8)

where τ\tau is Geom(ρ)\text{Geom}(\rho) for some ρ(0,1)\rho\in(0,1) and is independent of everything else. That is, 𝐏(τm)=ρm1\mathbf{P}(\tau\geq m)=\rho^{m-1} for all m1m\geq 1. Indeed, this construction of ZnZ_{n} is justified by the following proposition. We defer the proof to Section 6.1.

Proposition 3.1.

Let C0>0C_{0}>0, ρ0(0,1)\rho_{0}\in(0,1), μ>2l(α1)\mu>2l^{*}(\alpha-1), and m¯\bar{m}\in\mathbb{N}. Suppose that

𝐏(YnY^nm|𝒟(J¯n)=k)C0ρ0m(k+1)k0,n1,mm¯\displaystyle\mathbf{P}\Big{(}Y^{*}_{n}\neq\hat{Y}^{m}_{n}\ \Big{|}\ \mathcal{D}(\bar{J}_{n})=k\Big{)}\leq C_{0}\rho^{m}_{0}\cdot(k+1)\qquad\forall k\geq 0,n\geq 1,m\geq\bar{m} (3.9)

where 𝒟(ξ)\mathcal{D}(\xi) counts the number of discontinuities in ξ\xi for any ξ𝔻\xi\in\mathbb{D}. In addition, suppose that for all Δ(0,1)\Delta\in(0,1),

𝐏(YnY^nm,X¯nAΔ|𝒟(J¯n)=k)C0ρ0mΔ2nμn1,m0,k=0,1,,l1,\displaystyle\mathbf{P}\Big{(}Y^{*}_{n}\neq\hat{Y}^{m}_{n},\ \bar{X}_{n}\notin A^{\Delta}\ \Big{|}\ \mathcal{D}(\bar{J}_{n})=k\Big{)}\leq\frac{C_{0}\rho^{m}_{0}}{\Delta^{2}n^{\mu}}\qquad\forall n\geq 1,m\geq 0,k=0,1,\cdots,l^{*}-1, (3.10)

where AΔ={ξ𝔻:supt[0,1]ξ(t)aΔ}A^{\Delta}=\big{\{}\xi\in\mathbb{D}:\sup_{t\in[0,1]}\xi(t)\geq a-\Delta\big{\}}. Then given ρ(ρ0,1)\rho\in(\rho_{0},1), there exists some γ¯=γ¯(ρ)(0,b)\bar{\gamma}=\bar{\gamma}(\rho)\in(0,b) such that for all γ(0,γ¯)\gamma\in(0,\bar{\gamma}), the estimators (Ln)n1(L_{n})_{n\geq 1} specified in (3.5) and (3.8) are unbiased and strongly efficient for 𝐏(An)=𝐏(X¯nA)\mathbf{P}(A_{n})=\mathbf{P}(\bar{X}_{n}\in A) under the importance sampling distribution 𝐐n\mathbf{Q}_{n} in (3.4).

3.3 Construction of Y^nm\hat{Y}^{m}_{n}

In light of Proposition 3.1, our next goal is to design Y^nm\hat{Y}^{m}_{n}’s that provide sufficient approximations to Yn=𝐈(M(n)na)Y^{*}_{n}=\mathbf{I}(M(n)\geq na) and satisfy the conditions (3.9) and (3.10).

Recall the decomposition of X(t)=Ξn(t)+Jn(t)X(t)=\Xi_{n}(t)+J_{n}(t) in (3.6). Under both 𝐐n\mathbf{Q}_{n} and 𝐏\mathbf{P}, the processes Ξn\Xi_{n} and JnJ_{n} are independent, and Ξn\Xi_{n} admits the law of X<nγX^{<n\gamma}, i.e., a Lévy process with generating triplet (cX,σ,ν|(,nγ))(c_{X},\sigma,\nu|_{(-\infty,n\gamma)}). This section discusses how, after sampling JnJ_{n} from 𝐐n\mathbf{Q}_{n}, we approximate the supremum of Ξn\Xi_{n}. Specifically, on event {𝒟(J¯n)=k}\{\mathcal{D}(\bar{J}_{n})=k\}, i.e., the process JnJ_{n} makes kk jumps over [0,n][0,n], JnJ_{n} admits the form of ζk\zeta_{k} with

ζk(t)=i=1kzi𝐈[ui,n](t)t[0,n]\displaystyle\zeta_{k}(t)=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}(t)\qquad\forall t\in[0,n] (3.11)

for some zi[nγ,)z_{i}\in[n\gamma,\infty) and u1<u2<<uku_{1}<u_{2}<\cdots<u_{k}. This allows us to partition [0,n][0,n] into k+1k+1 disjoint intervals [0,u1),[u1,u2),,[uk1,uk),[uk,n][0,u_{1}),\ [u_{1},u_{2}),\ldots,\ [u_{k-1},u_{k}),\ [u_{k},n]. We adopt the convention u00,uk+1nu_{0}\equiv 0,u_{k+1}\equiv n and set

Ii=[ui1,ui)i[k],Ik+1=[uk,n].\displaystyle I_{i}=[u_{i-1},u_{i})\quad\forall i\in[k],\qquad I_{k+1}=[u_{k},n]. (3.12)

For ζk=i=1kzi𝐈[ui,n]\zeta_{k}=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}, define

Mn(i),(ζk)\ensurestackMath\stackon[1pt]=ΔsuptIiΞn(t)Ξn(ui1)\displaystyle M_{n}^{(i),*}(\zeta_{k})\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sup_{t\in I_{i}}\Xi_{n}(t)-\Xi_{n}(u_{i-1}) (3.13)

as the supremum of the fluctuations of Ξn(t)\Xi_{n}(t) over IiI_{i}. Define random function

Yn(ζk)\ensurestackMath\stackon[1pt]=Δmaxi[k+1]𝐈(Ξn(ui1)+ζk(ui1)+Mn(i),(ζk)na),\displaystyle Y^{*}_{n}(\zeta_{k})\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\max_{i\in[k+1]}\mathbf{I}\Big{(}\Xi_{n}(u_{i-1})+\zeta_{k}(u_{i-1})+M^{(i),*}_{n}(\zeta_{k})\geq na\Big{)}, (3.14)

and note that Yn(Jn)=𝐈(supt[0,n]X(t)na)Y^{*}_{n}(J_{n})=\mathbf{I}(\sup_{t\in[0,n]}X(t)\geq na).
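The bookkeeping in (3.11)-(3.14) can be made concrete with a short helper (a sketch with hypothetical inputs; the interval suprema M_n^{(i),*} are taken as given here, since their simulation is exactly what the SBA below addresses):

```python
def Y_star(n, a, z, xi_incr, M_sup):
    """Evaluate (3.14) for the jump path zeta_k = sum_i z[i] * 1_{[u_i, n]}.

    z       : sizes of the k large jumps (z_1, ..., z_k)
    xi_incr : increments Xi_n(u_i) - Xi_n(u_{i-1}) over I_1, ..., I_{k+1}
    M_sup   : suprema M_n^{(i),*} of the fluctuation of Xi_n over each I_i
    """
    k = len(z)
    xi_level = 0.0                       # Xi_n(u_{i-1})
    for i in range(1, k + 2):
        jump_level = sum(z[:i - 1])      # zeta_k(u_{i-1})
        if xi_level + jump_level + M_sup[i - 1] >= n * a:
            return 1                     # barrier n*a is reached on interval I_i
        xi_level += xi_incr[i - 1]
    return 0
```

For instance, with n = 1, a = 1 and two jumps of size 0.6, a fluctuation supremum of 0.3 on the last interval pushes the path over the barrier, whereas a single jump of size 0.4 does not.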

In theory, the representation (3.14) provides an algorithm for the simulation of 𝐈(supt[0,n]X(t)na)\mathbf{I}(\sup_{t\in[0,n]}X(t)\geq na). Nevertheless, the exact simulation of the supremum Mn(i),(ζk)M^{(i),*}_{n}(\zeta_{k}) is generally not available. Instead, we apply SBA introduced in Section 2.3 to approximate Mn(i),(ζk)M^{(i),*}_{n}(\zeta_{k}), thus providing the construction of Y^nm\hat{Y}^{m}_{n}. Specifically, define

l1(i)\displaystyle l^{(i)}_{1} =V1(i)(uiui1);\displaystyle=V^{(i)}_{1}\cdot(u_{i}-u_{i-1}); (3.15)
lj(i)\displaystyle l^{(i)}_{j} =Vj(i)(uiui1l1(i)l2(i)lj1(i))j2\displaystyle=V^{(i)}_{j}\cdot(u_{i}-u_{i-1}-l^{(i)}_{1}-l^{(i)}_{2}-\cdots-l^{(i)}_{j-1})\qquad\forall j\geq 2 (3.16)

where each Vj(i)V^{(i)}_{j} is an iid copy of Unif(0,1)(0,1). That is, for each i[k+1]i\in[k+1], the sequence (lj(i))j1(l^{(i)}_{j})_{j\geq 1} is defined under the recursion in (2.5), with T=uiui1T=u_{i}-u_{i-1} set as the length of IiI_{i}. Then, conditioning on the values of lj(i)l^{(i)}_{j}’s, we sample

ξj(i)𝐏(Ξn(lj(i))),\displaystyle\xi^{(i)}_{j}\sim\mathbf{P}\Big{(}\Xi_{n}(l^{(i)}_{j})\in\ \cdot\ \Big{)}, (3.17)

i.e., ξj(i)\xi^{(i)}_{j} is an independent copy of Ξn(lj(i))\Xi_{n}(l^{(i)}_{j}), with all ξj(i)\xi^{(i)}_{j} being independently generated. Result 3 then implies (Ξn(ui)Ξn(ui1),Mn(i),(ζk))\ensurestackMath\stackon[1pt]=d(j1ξj(i),j1(ξj(i))+)\big{(}\Xi_{n}(u_{i})-\Xi_{n}(u_{i-1}),\ M^{(i),*}_{n}(\zeta_{k})\big{)}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\big{(}\sum_{j\geq 1}\xi^{(i)}_{j},\ \sum_{j\geq 1}(\xi^{(i)}_{j})^{+}\big{)} for each i[k+1]i\in[k+1]. Furthermore, by summing up only finitely many ξj(i)\xi^{(i)}_{j}’s, we define

M^n(i),m(ζk)=j=1m+log2(nd)(ξj(i))+\displaystyle\hat{M}^{(i),m}_{n}(\zeta_{k})=\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}(\xi^{(i)}_{j})^{+} (3.18)

as an approximation to Mn(i),(ζk)M^{(i),*}_{n}(\zeta_{k}) defined in (3.13). Here, d>0d>0 is another parameter of the algorithm. For technical reasons, we add an extra log2(nd)\lceil\log_{2}(n^{d})\rceil term in the summation in (3.18), which helps ensure that the algorithm achieves strong efficiency as nn\to\infty while only introducing a minor increase in the computational complexity.

Now, we are ready to present the design of the approximators Y^nm\hat{Y}^{m}_{n}. For ζk=i=1kzi𝐈[ui,n]\zeta_{k}=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}, define the random function

Y^nm(ζk)=maxi[k+1]𝐈(q=1i1j1ξj(q)+q=1i1zq+M^n(i),m(ζk)na).\displaystyle\hat{Y}^{m}_{n}(\zeta_{k})=\max_{i\in[k+1]}\mathbf{I}\bigg{(}\sum_{q=1}^{i-1}\sum_{j\geq 1}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}+\hat{M}^{(i),m}_{n}(\zeta_{k})\geq na\bigg{)}. (3.19)

Here, note that q=1i1zq=ζk(ui1)\sum_{q=1}^{i-1}z_{q}=\zeta_{k}(u_{i-1}). As a high-level description, the algorithm proceeds as follows. After sampling JnJ_{n} from the importance sampling distribution 𝐐n\mathbf{Q}_{n} defined in (3.4), we plug Y^nm(Jn)\hat{Y}^{m}_{n}(J_{n}) into ZnZ_{n} defined in (3.8), which in turn allows us to simulate Ln=Znd𝐏d𝐐n{L_{n}}=Z_{n}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}} as the importance sampling estimator under 𝐐n\mathbf{Q}_{n}.

Remark 2.

At first glance, one may get the impression that the simulation of Y^nm\hat{Y}^{m}_{n} involves the summation of infinitely many elements in q=1i1j1ξj(q)\sum_{q=1}^{i-1}\sum_{j\geq 1}\xi^{(q)}_{j}. Fortunately, the truncation index τ\tau in ZnZ_{n} (see (3.8)) is almost surely finite. Therefore, when running the algorithm in practice, after τ\tau is decided, there is no need to simulate any Y^nm\hat{Y}^{m}_{n} beyond mτm\leq\tau. Given the construction of M^n(i),m(ζk)\hat{M}^{(i),m}_{n}(\zeta_{k}) in (3.18), the simulation of Y^nm(ζk)\hat{Y}^{m}_{n}(\zeta_{k}) only requires (for each i[k+1]i\in[k+1]) ξ1(i),ξ2(i),,ξτ+log2(nd)(i),\xi^{(i)}_{1},\xi^{(i)}_{2},\ldots,\xi^{(i)}_{\tau+\lceil\log_{2}(n^{d})\rceil}, as well as the sum jτ+log2(nd)+1ξj(i)\sum_{j\geq\tau+\lceil\log_{2}(n^{d})\rceil+1}\xi^{(i)}_{j}. Furthermore, conditioning on the value of uiui1j=1τ+log2(nd)lj(i)=tu_{i}-u_{i-1}-\sum_{j=1}^{\tau+\lceil\log_{2}(n^{d})\rceil}l^{(i)}_{j}=t, the sum jτ+log2(nd)+1ξj(i)\sum_{j\geq\tau+\lceil\log_{2}(n^{d})\rceil+1}\xi^{(i)}_{j} admits the law of Ξn(t)\Xi_{n}(t) (see Result 3). This allows us to simulate jτ+log2(nd)+1ξj(i)\sum_{j\geq\tau+\lceil\log_{2}(n^{d})\rceil+1}\xi^{(i)}_{j} in one shot.

Note that to implement the importance sampling algorithm and ensure strong efficiency, the following tasks still remain to be addressed.

  1. (i)(i)

    As mentioned above, the evaluation of Y^nm(Jn)\hat{Y}^{m}_{n}(J_{n}) requires the ability to first sample JnJ_{n} from the importance sampling distribution 𝐐n\mathbf{Q}_{n} defined in (3.4). We address this in Algorithm 1 proposed in Section 3.4. In summary, the simulation algorithm of estimators LnL_{n} is detailed in Algorithm 2.

  2. (ii)(ii)

    The strong efficiency of the proposed algorithm needs to be justified by verifying the conditions in Proposition 3.1. This will be done in Section 3.5 by establishing Theorem 3.2.

  3. (iii)(iii)

    Simulating ξj(i)\xi^{(i)}_{j}’s requires the exact simulation of X<nγ(t)X^{<n\gamma}(t), which may not be computationally feasible in certain cases. To address this challenge, Section 3.6 proposes Algorithm 3, which builds upon Algorithm 2 and incorporates another layer of approximation via ARA.

3.4 Sampling from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n})

In this section, we revisit the task of sampling from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n}), which is at the core of the implementation of the importance sampling distribution 𝐐n\mathbf{Q}_{n} in (3.4).

Recall that under 𝐏\mathbf{P}, the process JnJ_{n} is a compound Poisson process with generating triplet (0,0,ν|[nγ,))(0,0,\nu|_{[n\gamma,\infty)}). More precisely, let N~n()\widetilde{N}_{n}(\cdot) be a Poisson process with rate ν[nγ,)\nu[n\gamma,\infty), and we use (Si)i1(S_{i})_{i\geq 1} to denote the arrival times of jumps in N~n()\widetilde{N}_{n}(\cdot). Let (Wi)i1(W_{i})_{i\geq 1} be a sequence of iid random variables from

νnnormalized()=νn()ν[nγ,),νn()=ν([nγ,))\nu^{\text{normalized}}_{n}(\cdot)=\frac{\nu_{n}(\cdot)}{\nu[n\gamma,\infty)},\qquad\nu_{n}(\cdot)=\nu\big{(}\cdot\cap[n\gamma,\infty)\big{)}

and let WiW_{i}’s be independent of N~n()\widetilde{N}_{n}(\cdot). Under 𝐏\mathbf{P}, we have

Jn(t)\ensurestackMath\stackon[1pt]=di=1N~n(t)Wi=i1Wi𝐈[Si,)(t)t0.J_{n}(t)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\sum_{i=1}^{\widetilde{N}_{n}(t)}W_{i}=\sum_{i\geq 1}W_{i}\mathbf{I}_{[S_{i},\infty)}(t)\qquad\forall t\geq 0.

Furthermore, for each k0k\geq 0, conditioning on {N~n(n)=k}\{\widetilde{N}_{n}(n)=k\}, the law of S1,,SkS_{1},\ldots,S_{k} is identical to that of the order statistics of kk iid samples from Unif(0,n)(0,n), and WiW_{i}’s are still independent of SiS_{i}’s with the law unaltered. Therefore, the sampling of JnJ_{n} from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n}) can proceed as follows. We first generate kk from the Poisson(nν[nγ,))(n\nu[n\gamma,\infty)) distribution conditioned on klk\geq l^{*}. Then, independently, we generate S1,,SkS_{1},\cdots,S_{k} as the order statistics of kk iid samples from Unif(0,n)(0,n), and W1,,WkW_{1},\cdots,W_{k} as iid samples of law νnnormalized()\nu^{\text{normalized}}_{n}(\cdot). It is worth mentioning that the sampling of WiW_{i} can be addressed with the help of the inverse measure. Specifically, define Qn(y)\ensurestackMath\stackon[1pt]=Δinf{s>0:νn[s,)<y}Q^{\leftarrow}_{n}(y)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}{}\inf\{s>0:\nu_{n}[s,\infty)<y\} as the generalized inverse of νn\nu_{n}, and observe that

yνn[s,)Qn(y)s.y\leq\nu_{n}[s,\infty)\qquad\Longleftrightarrow\qquad Q^{\leftarrow}_{n}(y)\geq s.

More importantly, for UUnif(0,νn[nγ,))U\sim\text{Unif}(0,\nu_{n}[n\gamma,\infty)), the law of Qn(U)Q^{\leftarrow}_{n}(U) is νnnormalized()\nu^{\text{normalized}}_{n}(\cdot). This leads to the steps detailed in Algorithm 1.

Algorithm 1 Simulation of JnJ_{n} from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n})
1:n,l,γ>0n\in\mathbb{N},l^{*}\in\mathbb{N},\gamma>0, the Lévy measure ν\nu.
2:Sample kk from a Poisson distribution with rate nν[nγ,)n\nu[n\gamma,\infty) conditioning on klk\geq l^{*}
3:Simulate Γ1,,ΓkiidUnif(0,νn[nγ,))\Gamma_{1},\cdots,\Gamma_{k}\stackrel{{\scriptstyle\text{iid}}}{{\sim}}Unif\big{(}0,\nu_{n}[n\gamma,\infty)\big{)}
4:Simulate U1,,UkiidUnif(0,n)U_{1},\cdots,U_{k}\stackrel{{\scriptstyle\text{iid}}}{{\sim}}Unif(0,n)
5:Return Jn=i=1kQn(Γi)𝐈[Ui,n]J_{n}=\sum_{i=1}^{k}Q^{\leftarrow}_{n}(\Gamma_{i})\mathbf{I}_{[U_{i},n]}
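As a sketch, Algorithm 1 can be implemented in a few lines once ν is concrete. We take nu[x, inf) = x^(-alpha) (an illustrative Pareto-type choice, not from the paper), for which the inverse measure is available in closed form: Q_n^{<-}(y) = y^(-1/alpha) for y < (n*gamma)^(-alpha). The rejection step used below for the conditioned Poisson draw is adequate when P(k >= l*) is not too small; a direct conditional sampler would be preferable deep in the rare-event regime.

```python
import numpy as np

def algorithm1(rng, n, gamma, l_star, alpha):
    """Sketch of Algorithm 1 for the illustrative measure nu[x, inf) = x**(-alpha).

    Returns sorted jump times (u_i) and jump sizes (z_i) of J_n under P(.|B_n^gamma).
    """
    lam = n * (n * gamma) ** (-alpha)        # Poisson rate n * nu[n*gamma, inf)
    k = rng.poisson(lam)
    while k < l_star:                        # crude conditioning on k >= l*
        k = rng.poisson(lam)
    Gamma = rng.uniform(0.0, (n * gamma) ** (-alpha), size=k)
    u = np.sort(rng.uniform(0.0, n, size=k)) # order statistics of Unif(0, n)
    z = Gamma ** (-1.0 / alpha)              # z_i = Q_n^{<-}(Gamma_i) >= n * gamma
    return u, z
```

Since every Gamma_i lies below (n*gamma)^(-alpha), each size z_i = Gamma_i^(-1/alpha) automatically lands in [n*gamma, inf), and the empirical tail of z reproduces the normalized measure.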

3.5 Strong Efficiency and Computational Complexity

With all the discussions above, we propose Algorithm 2 for rare-event simulation of 𝐏(An)\mathbf{P}(A_{n}). Specifically, here is a list of the parameters of the algorithm.

  • γ(0,b)\gamma\in(0,b): the threshold in BγB^{\gamma} defined in (3.3),

  • w(0,1)w\in(0,1): the weight of the defensive mixture in 𝐐n\mathbf{Q}_{n}; see (3.4),

  • ρ(0,1)\rho\in(0,1): the geometric rate of decay for 𝐏(τm)\mathbf{P}(\tau\geq m) in (3.8),

  • d>0d>0: determining the log2(nd)\log_{2}(n^{d}) term in (3.18).

The choice of w(0,1)w\in(0,1) does not affect the strong efficiency of the algorithm. Meanwhile, under proper parametrization, Algorithm 2 meets conditions (3.9) and (3.10) stated in Proposition 3.1 and attains strong efficiency. This is verified in Theorem 3.2.

Theorem 3.2.

Let d>max{2, 2l(α1)}d>\max\{2,\ 2l^{*}(\alpha-1)\} and w(0,1)w\in(0,1). There exists ρ0(0,1)\rho_{0}\in(0,1) such that the following claim holds: Given ρ(ρ0,1)\rho\in(\rho_{0},1), there exists γ¯(0,b)\bar{\gamma}\in(0,b) such that Algorithm 2 is unbiased and strongly efficient under any γ(0,γ¯)\gamma\in(0,\bar{\gamma}).

We defer the proof to Section 6.2. In fact, in Section 3.6 we propose Algorithm 3, which can be seen as an extended version of Algorithm 2 with another layer of approximation. The strong efficiency of Algorithm 2 follows directly from that of Algorithm 3 (i.e., by setting κ=0\kappa=0 in the proof of Theorem 3.3). The choices of γ¯\bar{\gamma} and ρ0\rho_{0} that ensure strong efficiency are also specified at the end of Section 3.6.

Algorithm 2 Strongly Efficient Estimation of 𝐏(An)\mathbf{P}(A_{n})
1:w(0,1),γ>0,d>0,ρ(0,1)w\in(0,1),\ \gamma>0,\ d>0,\ \rho\in(0,1) as the parameters of the algorithm; a,b>0a,b>0 as the characterization of the set AA; (cX,σ,ν)(c_{X},\sigma,\nu) as the generating triplet of X(t)X(t).
2:
3:Set tn=log2(nd)t_{n}=\lceil\log_{2}(n^{d})\rceil
4:
5:if Unif(0,1)<w\text{Unif}(0,1)<w then \triangleright Sample JnJ_{n} from 𝐐n\mathbf{Q}_{n}
6:    Sample Jn=i=1kzi𝐈[ui,n]J_{n}=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]} from 𝐏\mathbf{P}
7:else
8:    Sample Jn=i=1kzi𝐈[ui,n]J_{n}=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]} from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n}) using Algorithm 1
9:Set u0=0,uk+1=nu_{0}=0,u_{k+1}=n.
10:
11:Sample τGeom(ρ)\tau\sim\text{Geom}(\rho) \triangleright Decide Truncation Index τ\tau
12:
13:for i=1,2,,k+1i=1,2,\ldots,k+1 do \triangleright Stick-Breaking Procedure
14:    for j=1,2,,tn+τj=1,2,\ldots,t_{n}+\tau do
15:       Sample Vj(i)Unif(0,1)V^{(i)}_{j}\sim\text{Unif}(0,1)
16:       Set lj(i)=Vj(i)(uiui1l1(i)l2(i)lj1(i))l^{(i)}_{j}=V^{(i)}_{j}(u_{i}-u_{i-1}-l^{(i)}_{1}-l^{(i)}_{2}-\ldots-l^{(i)}_{j-1})
17:       Sample ξj(i)𝐏(X<nγ(lj(i)))\xi^{(i)}_{j}\sim\mathbf{P}\big{(}X^{<n\gamma}(l^{(i)}_{j})\in\ \cdot\ \big{)}     
18:    Set ltn+τ+1(i)=uiui1l1(i)l2(i)ltn+τ(i)l^{(i)}_{t_{n}+\tau+1}=u_{i}-u_{i-1}-l^{(i)}_{1}-l^{(i)}_{2}-\ldots-l^{(i)}_{t_{n}+\tau}
19:    Sample ξtn+τ+1(i)𝐏(X<nγ(ltn+τ+1(i)))\xi^{(i)}_{t_{n}+\tau+1}\sim\mathbf{P}\big{(}X^{<n\gamma}(l^{(i)}_{t_{n}+\tau+1})\in\ \cdot\ \big{)}
20:
21:for m=1,,τm=1,\cdots,\tau do \triangleright Evaluate Y^nm\hat{Y}^{m}_{n}
22:    for i=1,2,,k+1i=1,2,\ldots,k+1 do
23:       Set M^n(i),m=q=1i1j=1tn+τ+1ξj(q)+q=1i1zq+j=1tn+m(ξj(i))+\hat{M}^{(i),m}_{n}=\sum_{q=1}^{i-1}\sum_{j=1}^{t_{n}+\tau+1}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}+\sum_{j=1}^{t_{n}+m}(\xi^{(i)}_{j})^{+}     
24:    Set Y^nm=𝐈{maxi=1,,k+1M^n(i),mna}\hat{Y}^{m}_{n}=\mathbf{I}\big{\{}\max_{i=1,\ldots,k+1}\hat{M}^{(i),m}_{n}\geq na\big{\}}
25:Set Zn=Y^n1+m=2τ(Y^nmY^nm1)/ρm1Z_{n}=\hat{Y}^{1}_{n}+\sum_{m=2}^{\tau}(\hat{Y}^{m}_{n}-\hat{Y}^{m-1}_{n})\big{/}\rho^{m-1} \triangleright Return the Estimator LnL_{n}
26:if  maxi=1,,kzi>nb\max_{i=1,\cdots,k}z_{i}>nb  then
27:    Return Ln=0L_{n}=0.
28:else
29:    Set λn=nν[nγ,),pn=1l=0l1eλnλnll!,In=𝐈{JnBnγ}\lambda_{n}=n\nu[n\gamma,\infty),\ p_{n}=1-\sum_{l=0}^{l^{*}-1}e^{-\lambda_{n}}\frac{\lambda_{n}^{l}}{l!},\ I_{n}=\mathbf{I}\{J_{n}\in B^{\gamma}_{n}\}
30:    Return Ln=Zn/(w+1wpnIn)L_{n}=Z_{n}/(w+\frac{1-w}{p_{n}}I_{n})
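The return logic in steps 25-30 is a direct computation once the hat-Y values and the jump sizes are in hand. A sketch with hypothetical toy inputs (note that the E_n check applies to the unscaled jumps, which must stay below n*b):

```python
import math

def final_estimator(Y_hat, rho, jumps, n, b, l_star, lam_n, w):
    """Steps 25-30 of Algorithm 2: assemble Z_n and apply the likelihood ratio.

    Y_hat[m-1] holds hat-Y^m_n for m = 1, ..., tau (tau >= 1); jumps are the z_i.
    """
    tau = len(Y_hat)
    Z = Y_hat[0]
    for m in range(2, tau + 1):
        Z += (Y_hat[m - 1] - Y_hat[m - 2]) / rho ** (m - 1)
    if jumps and max(jumps) > n * b:         # path leaves E_n: estimator is 0
        return 0.0
    p_n = 1.0 - sum(math.exp(-lam_n) * lam_n ** l / math.factorial(l)
                    for l in range(l_star))  # P(B_n^gamma) = P(Poisson(lam_n) >= l*)
    I_n = 1.0 if len(jumps) >= l_star else 0.0
    return Z / (w + (1.0 - w) / p_n * I_n)
```

For example, with Y_hat = [1, 1, 1], rho = 0.5, a single jump of size 3 on horizon n = 10 with b = 1, lam_n = 2, l* = 1 and w = 0.1, the returned value is 1/(0.1 + 0.9/(1 - e^(-2))) ≈ 0.8765.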
Remark 3.

To conclude, we add a remark regarding the computational complexity of Algorithm 2 under the goal of attaining a given level of relative error at a specified confidence level. First, consider the case where the complexity of simulating X<z(t)X^{<z}(t) scales linearly with tt (uniformly for all z[z0,)z\in[z_{0},\infty) for some constant z0z_{0}). This is standard, since the number of jumps we expect to simulate over [0,t][0,t] grows linearly with tt. Then, the complexity of the SBA steps at step 13 of Algorithm 2 also scales linearly with nn, as the stick lengths lj(i)l^{(i)}_{j}, in expectation, grow linearly with nn because we deal with the time horizon [0,n][0,n] given the scale factor nn. Next, since the same law for the truncation index τ\tau (see step 11 of Algorithm 2) is applied for all nn, the only other factor that varies with nn is tn=log2(nd)t_{n}=\lceil\log_{2}(n^{d})\rceil in the loop at step 14. The strong efficiency of the algorithm then implies a computational complexity of order O(nlog2n)O(n\cdot\log_{2}n). If we instead assume that the cost of simulating X<z(t)X^{<z}(t) is uniformly bounded for all tt, then the overall complexity of Algorithm 2 is further reduced to O(log2n)O(\log_{2}n).

In comparison, the crude Monte Carlo method requires a number of samples that is inversely proportional to the target probability 𝐏(An)O(1/nl(α1))\mathbf{P}(A_{n})\approx O(1/n^{l^{*}(\alpha-1)}) (see Lemma 6.1), with α>1\alpha>1 being the heavy-tail index in Assumption 1 and l1l^{*}\geq 1 defined in (3.2). Hypothetically, assuming that the evaluation of 𝕀An\mathbb{I}_{A_{n}} (which at least requires the simulation of X(t)X(t) and M(t)M(t)) is computationally feasible at a cost that scales linearly with nn, we end up with a computational complexity of O(nnl(α1))O(n\cdot n^{l^{*}(\alpha-1)}) (compared to the O(nlog2n)O(n\cdot\log_{2}n) cost of our algorithm). Similarly, if we assume that the cost of generating 𝕀An\mathbb{I}_{A_{n}} is uniformly bounded for all nn, then the complexity of the crude Monte Carlo method is O(nl(α1))O(n^{l^{*}(\alpha-1)}) (compared to the O(log2n)O(\log_{2}n) cost of our algorithm). In summary, not only does the proposed importance sampling algorithm handle Lévy processes with infinite activities, which cannot be simulated exactly for crude Monte Carlo methods, but it also enjoys a significant improvement in terms of computational complexity, with the advantage becoming even more evident for multiple-jump events with large ll^{*}.

3.6 Construction of Y^nm\hat{Y}^{m}_{n} with ARA

As stressed earlier, implementing Algorithm 2 (introduced in Section 3.5) requires the ability to sample from 𝐏(X<nγ(t))\mathbf{P}(X^{<n\gamma}(t)\in\ \cdot\ ). The goal of this section is to address the challenge that arises when exact simulation of X<nγ(t)X^{<n\gamma}(t) is not available. The plan is to incorporate the Asmussen-Rosiński approximation (ARA) of [3] into the design of the approximation Y^nm\hat{Y}^{m}_{n} proposed in Section 3.3.

To be specific, let

κn,m\ensurestackMath\stackon[1pt]=Δκmnrn1,m0\displaystyle\kappa_{n,m}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\frac{\kappa^{m}}{n^{r}}\qquad\forall n\geq 1,\ m\geq 0 (3.20)

where κ(0,1)\kappa\in(0,1) and r>0r>0 are two additional parameters of our algorithm. As a convention, we set κn,11\kappa_{n,-1}\equiv 1. Without loss of generality, we consider nn large enough such that nγ>1=κn,1n\gamma>1=\kappa_{n,-1}. For the Lévy process Ξn=X<nγ\Xi_{n}=X^{<n\gamma} with the generating triplet (cX,σ,ν|(,nγ))(c_{X},\sigma,\nu|_{(-\infty,n\gamma)}), consider the following decomposition (with B(t)B(t) being a standard Brownian motion)

Ξn(t)\displaystyle\Xi_{n}(t) =cXt+σB(t)+stΔX(s)𝐈(ΔX(s)(,1][1,nγ))\ensurestackMath\stackon[1pt]=ΔJn,1(t)\displaystyle=c_{X}t+\sigma B(t)+\underbrace{\sum_{s\leq t}\Delta X(s)\mathbf{I}\Big{(}\Delta X(s)\in(-\infty,-1]\cup[1,n\gamma)\Big{)}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}J_{n,-1}(t)} (3.21)
+m0[stΔX(s)𝐈(|ΔX(s)|[κn,m,κn,m1))tν((κn,m1,κn,m][κn,m,κn,m1))\ensurestackMath\stackon[1pt]=ΔJn,m(t)].\displaystyle\quad+\sum_{m\geq 0}\Bigg{[}\underbrace{\sum_{s\leq t}\Delta X(s)\mathbf{I}\Big{(}|\Delta X(s)|\in[\kappa_{n,m},\kappa_{n,m-1})\Big{)}-t\cdot\nu\Big{(}(-\kappa_{n,m-1},-\kappa_{n,m}]\cup[\kappa_{n,m},\kappa_{n,m-1})\Big{)}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}J_{n,m}(t)}\Bigg{]}.

Here, for any m0m\geq 0, Jn,mJ_{n,m} is a martingale with var[Jn,m(1)]=σ¯2(κn,m1)σ¯2(κn,m)var[J_{n,m}(1)]=\bar{\sigma}^{2}(\kappa_{n,m-1})-\bar{\sigma}^{2}(\kappa_{n,m}) where

σ¯2(c)\ensurestackMath\stackon[1pt]=Δ(c,c)x2ν(dx)c(0,1].\displaystyle\bar{\sigma}^{2}(c)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\int_{(-c,c)}x^{2}\nu(dx)\qquad\forall c\in(0,1]. (3.22)
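The variance matching var[Jn,m(1)]=σ¯2(κn,m1)σ¯2(κn,m)var[J_{n,m}(1)]=\bar{\sigma}^{2}(\kappa_{n,m-1})-\bar{\sigma}^{2}(\kappa_{n,m}) can be computed numerically once a concrete Lévy measure is fixed. The stdlib-Python sketch below assumes, purely for illustration, the symmetric infinite-activity measure ν(dx) = |x|^(-1-β)dx on (-1,1), for which σ̄²(c) = 2c^(2-β)/(2-β) is available in closed form as a cross-check; all numeric values are assumptions for the example.

```python
import math

beta = 0.8   # assumed: nu(dx) = |x|^(-1-beta) dx, a toy infinite-activity measure

def sigma_bar_sq(c, n_grid=200_000):
    # midpoint rule for int_{(-c,c)} x^2 nu(dx); by symmetry, twice the (0,c) part
    h = c / n_grid
    total = 0.0
    for i in range(n_grid):
        x = (i + 0.5) * h            # midpoints avoid the singularity at x = 0
        total += x * x * abs(x) ** (-1.0 - beta) * h
    return 2.0 * total

c = 0.3
closed_form = 2.0 * c ** (2.0 - beta) / (2.0 - beta)   # exact value for this nu
print(sigma_bar_sq(c), closed_form)   # the two agree closely

# var[J_{n,m}(1)] = sigma_bar_sq(kappa_{n,m-1}) - sigma_bar_sq(kappa_{n,m})
kappa, r, n, m = 0.5, 1.5, 100, 2
var_J = sigma_bar_sq(kappa ** (m - 1) / n ** r) - sigma_bar_sq(kappa ** m / n ** r)
print(var_J)   # strictly positive: the band [kappa_{n,m}, kappa_{n,m-1}) carries mass
```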

Generally speaking, the difficulty of implementing the algorithm of Section 3.5 lies in the exact simulation of the martingale m0Jn,m\sum_{m\geq 0}J_{n,m}. In particular, whenever ν((,0)(0,))=\nu((-\infty,0)\cup(0,\infty))=\infty for the Lévy measure ν\nu, the expected number of jumps in m0Jn,m\sum_{m\geq 0}J_{n,m} (and hence in X<nγX^{<n\gamma} and XX) is infinite over any time interval of positive length. By applying ARA, our goal is to approximate the jump martingales Jn,mJ_{n,m} using Brownian motions, which yields a process that is amenable to exact simulation. To do so, let (Wm)m1(W^{m})_{m\geq 1} be a sequence of iid copies of the standard Brownian motion, independent of B(t)B(t). For each m0m\geq 0, define

Ξ˘nm(t)\displaystyle\breve{\Xi}^{m}_{n}(t) \ensurestackMath\stackon[1pt]=ΔcXt+σB(t)+q=1mJn,q(t)+qm+1σ¯2(κn,q1)σ¯2(κn,q)Wq(t).\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}c_{X}t+\sigma B(t)+\sum_{q=-1}^{m}J_{n,q}(t)+\sum_{q\geq m+1}\sqrt{\bar{\sigma}^{2}(\kappa_{n,q-1})-\bar{\sigma}^{2}(\kappa_{n,q})}\cdot W^{q}(t). (3.23)

Here, the process Ξ˘nm\breve{\Xi}^{m}_{n} can be interpreted as an approximation to Ξn\Xi_{n} in which each jump martingale with jump sizes below κn,m\kappa_{n,m} is substituted by a Brownian motion with the same variance. Note that for any t>0t>0, the random variable Ξ˘nm(t)\breve{\Xi}^{m}_{n}(t) is exactly simulatable, as its law is the convolution of the law of a compound Poisson process with drift (evaluated at tt) and a Gaussian distribution.
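To make the exact-simulatability claim concrete, the stdlib-Python sketch below draws Ξ˘nm(t)\breve{\Xi}^{m}_{n}(t) for an assumed toy measure ν(dx) = |x|^(-1-β)dx supported on (-1,1): with this symmetric ν there are no jumps at or above 1 and the compensator of the retained jumps vanishes, so one draw is drift plus Gaussians plus a compound Poisson sum. All numeric values are illustrative assumptions.

```python
import math, random

random.seed(1)
beta = 0.8                  # assumed: nu(dx) = |x|^(-1-beta) dx on (-1, 1), symmetric
c_X, sigma = 0.1, 0.2       # drift and Brownian coefficient (illustrative)

def poisson(lam):
    # Knuth-style Poisson sampler (adequate for moderate lam)
    threshold, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= random.random()
        if p <= threshold:
            return k
        k += 1

def sigma_bar_sq(c):
    # closed form of int_{(-c,c)} x^2 nu(dx) for this nu
    return 2.0 * c ** (2.0 - beta) / (2.0 - beta)

def sample_xi_breve(t, kappa_m):
    """One exact draw of the ARA approximation: jumps with |x| >= kappa_m are kept
    as a compound Poisson sum, while smaller jumps are replaced by a Gaussian with
    the matching variance sigma_bar_sq(kappa_m) * t. (By symmetry of nu, the
    compensator of the kept jumps is zero.)"""
    val = c_X * t + sigma * math.sqrt(t) * random.gauss(0.0, 1.0)
    lam = t * 2.0 * (kappa_m ** (-beta) - 1.0) / beta   # nu-mass of kappa_m <= |x| < 1
    for _ in range(poisson(lam)):
        u = random.random()                             # inverse cdf for the magnitude
        size = (kappa_m ** (-beta) - u * (kappa_m ** (-beta) - 1.0)) ** (-1.0 / beta)
        val += size if random.random() < 0.5 else -size
    val += math.sqrt(sigma_bar_sq(kappa_m) * t) * random.gauss(0.0, 1.0)
    return val

draws = [sample_xi_breve(t=1.0, kappa_m=0.05) for _ in range(20_000)]
mean = sum(draws) / len(draws)
print(mean)   # close to the drift c_X = 0.1, since all other terms are centered
```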

Based on the approximations Ξ˘nm\breve{\Xi}^{m}_{n} in (3.23), we apply SBA and reconstruct M^n(i),m\hat{M}^{(i),m}_{n} (originally defined in (3.18)) and Y^nm\hat{Y}^{m}_{n} (originally defined in (3.19)) as follows. Let ζk(t)=i=1kzi𝐈[ui,n](t)\zeta_{k}(t)=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}(t) be a piecewise step function with kk jumps over (0,n](0,n], i.e., admitting the form in (3.11). Recall that the jump times of ζk\zeta_{k} lead to the partition (Ii)i[k+1](I_{i})_{i\in[k+1]} of [0,n][0,n] defined in (3.12). For any IiI_{i}, let the sequence lj(i)l^{(i)}_{j} be defined as in (3.15)–(3.16). Next, conditional on (lj(i))j1(l^{(i)}_{j})_{j\geq 1}, one can sample ξj(i),m,ξj(i)\xi^{(i),m}_{j},\xi^{(i)}_{j} as

(ξj(i),ξj(i),0,ξj(i),1,ξj(i),2,)\ensurestackMath\stackon[1pt]=d(Ξn(lj(i)),Ξ˘n0(lj(i)),Ξ˘n1(lj(i)),Ξ˘n2(lj(i)),).\displaystyle\big{(}\xi^{(i)}_{j},\xi^{(i),0}_{j},\xi^{(i),1}_{j},\xi^{(i),2}_{j},\ldots)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\Big{(}\Xi_{n}(l^{(i)}_{j}),\ \breve{\Xi}^{0}_{n}(l^{(i)}_{j}),\ \breve{\Xi}^{1}_{n}(l^{(i)}_{j}),\ \breve{\Xi}^{2}_{n}(l^{(i)}_{j}),\ldots\Big{)}. (3.24)

The coupling in (2.7) then implies

(Ξn(ui)Ξn(ui1),suptIiΞn(t)Ξn(ui1),Ξ˘n0(ui)Ξ˘n0(ui1),suptIiΞ˘n0(t)Ξ˘n0(ui1),\displaystyle\Big{(}\Xi_{n}(u_{i})-\Xi_{n}(u_{i-1}),\ \sup_{t\in I_{i}}\Xi_{n}(t)-\Xi_{n}(u_{i-1}),\ \breve{\Xi}^{0}_{n}(u_{i})-\breve{\Xi}^{0}_{n}(u_{i-1}),\ \sup_{t\in I_{i}}\breve{\Xi}^{0}_{n}(t)-\breve{\Xi}^{0}_{n}(u_{i-1}), (3.25)
Ξ˘n1(ui)Ξ˘n1(ui1),suptIiΞ˘n1(t)Ξ˘n1(ui1),)\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\breve{\Xi}^{1}_{n}(u_{i})-\breve{\Xi}^{1}_{n}(u_{i-1}),\ \sup_{t\in I_{i}}\breve{\Xi}^{1}_{n}(t)-\breve{\Xi}^{1}_{n}(u_{i-1}),\ldots\Big{)}
\ensurestackMath\stackon[1pt]=d(j1ξj(i),j1(ξj(i))+,j1ξj(i),0,j1(ξj(i),0)+,j1ξj(i),1,j1(ξj(i),1)+,).\displaystyle\qquad\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\Big{(}\sum_{j\geq 1}\xi^{(i)}_{j},\ \sum_{j\geq 1}(\xi^{(i)}_{j})^{+},\ \sum_{j\geq 1}\xi^{(i),0}_{j},\ \sum_{j\geq 1}(\xi^{(i),0}_{j})^{+},\ \sum_{j\geq 1}\xi^{(i),1}_{j},\ \sum_{j\geq 1}(\xi^{(i),1}_{j})^{+},\ldots\Big{)}.

Now, we define

M^n(i),m(ζk)=j=1m+log2(nd)(ξj(i),m)+\displaystyle{\hat{M}^{(i),m}_{n}(\zeta_{k})}=\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}(\xi^{(i),m}_{j})^{+} (3.26)

as an approximation to Mn(i),(ζk)=suptIiΞn(t)Ξn(ui1)=j1(ξj(i))+{M_{n}^{(i),*}(\zeta_{k})}=\sup_{t\in I_{i}}\Xi_{n}(t)-\Xi_{n}(u_{i-1})=\sum_{j\geq 1}(\xi^{(i)}_{j})^{+} using both ARA and SBA. Compared to the original design in (3.18), the main difference in (3.26) is that we substitute ξj(i)\xi^{(i)}_{j} with ξj(i),m\xi^{(i),m}_{j}, and the latter is exactly simulatable since, conditional on the values of the lj(i)l^{(i)}_{j}, it admits the law of Ξ˘nm\breve{\Xi}^{m}_{n}. Similarly, let

Y^nm(ζk)=maxi[k+1]𝐈(q=1i1j0ξj(q),m+q=1i1zq+M^n(i),m(ζk)na);\displaystyle{\hat{Y}^{m}_{n}(\zeta_{k})}=\max_{i\in[k+1]}\mathbf{I}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j}+\sum_{q=1}^{i-1}z_{q}+\hat{M}^{(i),m}_{n}(\zeta_{k})\geq na\Big{)}; (3.27)

Again, the main difference between (3.27) and (3.19) is that we incorporate ARA and substitute ξj(i)\xi^{(i)}_{j}’s with ξj(i),m\xi^{(i),m}_{j}’s.

Plugging the design of Y^nm(ζk)\hat{Y}^{m}_{n}(\zeta_{k}) in (3.27) into the estimator ZnZ_{n} in (3.8), we propose Algorithm 3 for rare-event simulation of 𝐏(An)\mathbf{P}(A_{n}) when exact simulation of X<nγX^{<n\gamma} is not available. Below is a summary of the parameters of the algorithm.

  • γ(0,b)\gamma\in(0,b): the threshold in BγB^{\gamma} defined in (3.3),

  • w(0,1)w\in(0,1): the weight of the defensive mixture in 𝐐n\mathbf{Q}_{n}; see (3.4),

  • ρ(0,1)\rho\in(0,1): the geometric rate of decay for 𝐏(τm)\mathbf{P}(\tau\geq m) in (3.8),

  • κ[0,1),r>0\kappa\in[0,1),\ r>0: determining the truncation threshold κn,m\kappa_{n,m} in (3.20),

  • d>0d>0: determining the log2(nd)\log_{2}(n^{d}) term in (3.26).

Theorem 3.3 establishes that, under proper parameterization, Algorithm 3 is unbiased and strongly efficient.

Theorem 3.3.

Let μ>2l(α1)\mu>2l^{*}(\alpha-1) and β+(β,2)\beta_{+}\in(\beta,2) where α>1\alpha>1 is the heavy-tail index and β(0,2)\beta\in(0,2) is the Blumenthal-Getoor index in Assumption 1. Let w(0,1)w\in(0,1) and

κ2β+<12,r(2β+)>max{2,μ1},d>max{2,2μ1}.\displaystyle\kappa^{2-\beta_{+}}<\frac{1}{2},\qquad r(2-\beta_{+})>\max\{2,\mu-1\},\qquad d>\max\{2,2\mu-1\}. (3.28)

There exists ρ0(0,1)\rho_{0}\in(0,1) such that the following claim holds: Given ρ(ρ0,1)\rho\in(\rho_{0},1), there exists γ¯(0,b)\bar{\gamma}\in(0,b) such that Algorithm 3 is unbiased and strongly efficient under any γ(0,γ¯)\gamma\in(0,\bar{\gamma}).

In Section 6.2 we provide the proof, the key arguments of which are the verification of conditions (3.9) and (3.10) in Proposition 3.1. Here, we specify the choices of the parameters. First, pick α3(0,1λ),α4(0,12λ)\alpha_{3}\in(0,\frac{1}{\lambda}),\ \alpha_{4}\in(0,\frac{1}{2\lambda}) where λ>0\lambda>0 is the constant in Assumption 2. Next, pick α2(0,α321)\alpha_{2}\in(0,\frac{\alpha_{3}}{2}\wedge 1) and α1(0,α2λ)\alpha_{1}\in(0,\frac{\alpha_{2}}{\lambda}). Also, fix δ(1/2,1)\delta\in(1/\sqrt{2},1). This allows us to pick ρ0(0,1)\rho_{0}\in(0,1) such that

ρ0>max{δα1,κ2β+δ2,12δ,δα2λα1,δ1λα3,δα2+α32}.\rho_{0}>\max\bigg{\{}\delta^{\alpha_{1}},\ \frac{\kappa^{2-\beta_{+}}}{\delta^{2}},\ \frac{1}{\sqrt{2}\delta},\ \delta^{\alpha_{2}-\lambda\alpha_{1}},\ \delta^{1-\lambda\alpha_{3}},\delta^{-\alpha_{2}+\frac{\alpha_{3}}{2}}\bigg{\}}.

After picking ρ(ρ0,1)\rho\in(\rho_{0},1), one can find some q>1q>1 such that ρ01/q<ρ\rho_{0}^{1/q}<\rho. Let p>1p>1 be such that 1p+1q=1\frac{1}{p}+\frac{1}{q}=1. Let Δ>0\Delta>0 be small enough such that aΔ>(l1)ba-\Delta>(l^{*}-1)b. Then, we pick γ¯(0,b)\bar{\gamma}\in(0,b) small enough such that

aΔ(l1)bγ¯+l1>2lp.\frac{a-\Delta-(l^{*}-1)b}{\bar{\gamma}}+l^{*}-1>2l^{*}p.

Again, the details of the parameter choices can be found at the beginning of Section 6. It is also worth mentioning that, by setting κ=0\kappa=0, Algorithm 3 would reduce to Algorithm 2, as ξj(i),m\xi^{(i),m}_{j}’s in (3.25) would reduce to ξj(i)\xi^{(i)}_{j}’s in (3.17); in other words, the ARA mechanism is effective only if the truncation threshold κn,m=κm/nr>0\kappa_{n,m}=\kappa^{m}/n^{r}>0. As a result, Theorem 3.2 follows directly from Theorem 3.3 by setting κ=0\kappa=0.
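The parameter bookkeeping above can be carried out mechanically. Below is a stdlib-Python sketch that checks the constraints (3.28) and evaluates the ρ0\rho_{0} bound for one configuration; every concrete number, including the Assumption 2 constant λ\lambda, is an assumption made for this example, not a value prescribed by the paper.

```python
import math

alpha, beta, l_star = 1.45, 0.8, 2        # heavy-tail index, BG index, l* (assumed)
lam = 2.0                                 # lambda from Assumption 2 (assumed)

mu = 2.0
assert mu > 2 * l_star * (alpha - 1)      # mu > 2 l* (alpha - 1)
beta_plus = 1.0
assert beta < beta_plus < 2

kappa, r, d = 0.4, 2.5, 4.0
assert kappa ** (2 - beta_plus) < 0.5             # conditions (3.28)
assert r * (2 - beta_plus) > max(2, mu - 1)
assert d > max(2, 2 * mu - 1)

alpha3, alpha4 = 0.4, 0.2   # alpha3 < 1/lam, alpha4 < 1/(2 lam); alpha4 enters the proofs only
alpha2 = 0.15               # alpha2 < min(alpha3 / 2, 1)
alpha1 = 0.05               # alpha1 < alpha2 / lam
delta = 0.8                 # delta in (1/sqrt(2), 1)

# lower bound for rho_0 as displayed above; all six entries are < 1 by construction
rho_0 = max(delta ** alpha1,
            kappa ** (2 - beta_plus) / delta ** 2,
            1.0 / (math.sqrt(2) * delta),
            delta ** (alpha2 - lam * alpha1),
            delta ** (1 - lam * alpha3),
            delta ** (-alpha2 + alpha3 / 2))
rho = (rho_0 + 1.0) / 2.0   # any rho in (rho_0, 1) is admissible
print(rho_0, rho)
```

Note that the theoretical bound is conservative: here ρ0\rho_{0} comes out close to 1, whereas the numerical experiments in Section 5 use ρ=0.97\rho=0.97.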

Remark 4.

While Algorithm 3 terminates within finitely many steps almost surely, its computational complexity may not be finite in expectation. This is partially due to the implementation of ARA, as we approximate the jump martingale Jn,m(t)J_{n,m}(t) in (3.21) using an independent Brownian motion term in (3.23). In theory, a potential remedy is to identify a better coupling between the jump martingales and Brownian motions; see, for instance, Theorem 9 of [46]. This would allow us to pick a larger κ\kappa for the truncation threshold κn,m\kappa_{n,m} in ARA, under which the simulation algorithm generates significantly fewer jumps when sampling the ξj(i),m\xi^{(i),m}_{j}. However, to the best of our knowledge, there is no practical implementation of the coupling in [46]. We note that similar issues arise in works such as [34], where the coupling in [46] implies a much tighter error bound in theory but cannot be implemented in practice.

4 Lipschitz Continuity of the Distribution of X<z(t)X^{<z}(t)

This section investigates the sufficient conditions for Assumption 2. That is, there exist z0,C,λ>0{z_{0}},\ {C},\ {\lambda}>0 such that

𝐏(X<z(t)[x,x+δ])Cδtλ1zz0,t>0,x,δ>0.\displaystyle\mathbf{P}\big{(}X^{<z}(t)\in[x,x+\delta]\big{)}\leq\frac{C\delta}{t^{\lambda}\wedge 1}\qquad\forall z\geq z_{0},\ t>0,\ x\in\mathbb{R},\ \delta>0. (4.1)

Here, recall that X<zX^{<z} is the Lévy process with generating triplet (cX,σ,ν|(,z))(c_{X},\sigma,\nu|_{(-\infty,z)}). In other words, this is a modulated version of XX in which every upward jump of size zz or larger is removed.

Algorithm 3 Strongly Efficient Estimation of 𝐏(An)\mathbf{P}(A_{n}) with ARA
1:w(0,1),γ>0,r>0,d>0,κ[0,1),ρ(0,1)w\in(0,1),\ \gamma>0,\ r>0,\ d>0,\ \kappa\in[0,1),\ \rho\in(0,1) as the parameters in the algorithm; a,b>0a,b>0 as the characterization of the set AA; (cX,σ,ν)(c_{X},\sigma,\nu) as the generating triplet of X(t)X(t); σ¯()\bar{\sigma}(\cdot) is defined in (3.22).
2:
3:Set tn=log2(nd)t_{n}=\lceil\log_{2}(n^{d})\rceil and κn,m=κm/nr\kappa_{n,m}=\kappa^{m}/n^{r}
4:
5:if Unif(0,1)<w\text{Unif}(0,1)<w then \triangleright Sample JnJ_{n} from 𝐐n\mathbf{Q}_{n}
6:    Sample Jn=i=1kzi𝐈[ui,n]J_{n}=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]} from 𝐏\mathbf{P}
7:else
8:    Sample Jn=i=1kzi𝐈[ui,n]J_{n}=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]} from 𝐏(|Bnγ)\mathbf{P}(\ \cdot\ |B^{\gamma}_{n}) using Algorithm 1
9:Set u0=0,uk+1=nu_{0}=0,u_{k+1}=n.
10:
11:Sample τGeom(ρ)\tau\sim Geom(\rho) \triangleright Decide Truncation Index τ\tau
12:
13:for i=1,2,,k+1i=1,2,\ldots,k+1 do \triangleright Stick-Breaking Procedure
14:    for j=1,2,,tn+τj=1,2,\ldots,t_{n}+\tau do
15:       Sample Vj(i)Unif(0,1)V^{(i)}_{j}\sim\text{Unif}(0,1)
16:       Set lj(i)=Vj(i)(uiui1l1(i)l2(i)lj1(i))l^{(i)}_{j}=V^{(i)}_{j}(u_{i}-u_{i-1}-l^{(i)}_{1}-l^{(i)}_{2}-\ldots-l^{(i)}_{j-1})     
17:    Set ltn+τ+1(i)=uiui1l1(i)l2(i)ltn+τ(i)l^{(i)}_{t_{n}+\tau+1}=u_{i}-u_{i-1}-l^{(i)}_{1}-l^{(i)}_{2}-\ldots-l^{(i)}_{t_{n}+\tau}
18:
19:for i=1,,k+1i=1,\cdots,k+1  do \triangleright Sample ξj(i),m\xi^{(i),m}_{j}
20:    for j=1,2,,tn+τ+1j=1,2,\cdots,t_{n}+\tau+1 do
21:       Sample xj(i)N(0,σ2lj(i))x^{(i)}_{j}\sim N(0,\sigma^{2}\cdot l^{(i)}_{j})
22:       Sample yj(i),1𝐏(Jn,1(lj(i)))y^{(i),-1}_{j}\sim\mathbf{P}(J_{n,-1}(l^{(i)}_{j})\in\ \cdot\ )
23:       for m=0,1,,τm=0,1,\ldots,\tau do
24:          Sample yj(i),m𝐏(Jn,m(lj(i)))y^{(i),m}_{j}\sim\mathbf{P}(J_{n,m}(l^{(i)}_{j})\in\ \cdot\ )
25:          Sample wj(i),mN(0,(σ¯2(κn,m1)σ¯2(κn,m))lj(i))w^{(i),m}_{j}\sim N(0,(\bar{\sigma}^{2}(\kappa_{n,m-1})-\bar{\sigma}^{2}(\kappa_{n,m}))\cdot l^{(i)}_{j})        
26:       Sample wj(i),τ+1N(0,σ¯2(κn,τ)lj(i))w^{(i),\tau+1}_{j}\sim N(0,\bar{\sigma}^{2}(\kappa_{n,\tau})\cdot l^{(i)}_{j})
27:       for m=0,,τm=0,\ldots,\tau do
28:          Set ξj(i),m=cXlj(i)+xj(i)+q=1myj(i),q+q=m+1τ+1wj(i),q\xi^{(i),m}_{j}=c_{X}\cdot l^{(i)}_{j}+x^{(i)}_{j}+\sum_{q=-1}^{m}y^{(i),q}_{j}+\sum_{q=m+1}^{\tau+1}w^{(i),q}_{j}            
29:
30:for m=1,,τm=1,\cdots,\tau do \triangleright Evaluate Y^nm\hat{Y}^{m}_{n}
31:    for i=1,2,,k+1i=1,2,\ldots,k+1 do
32:       Set M^n(i),m=q=1i1j=1tn+τ+1ξj(q),m+q=1i1zq+j=1tn+m(ξj(i),m)+\hat{M}^{(i),m}_{n}=\sum_{q=1}^{i-1}\sum_{j=1}^{t_{n}+\tau+1}\xi^{(q),m}_{j}+\sum_{q=1}^{i-1}z_{q}+\sum_{j=1}^{t_{n}+m}(\xi^{(i),m}_{j})^{+}     
33:    Set Y^nm=𝐈{maxi=1,,k+1M^n(i),mna}\hat{Y}^{m}_{n}=\mathbf{I}\big{\{}\max_{i=1,\ldots,k+1}\hat{M}^{(i),m}_{n}\geq na\big{\}}
34:
35:Set Zn=Y^n1+m=2τ(Y^nmY^nm1)/ρm1Z_{n}=\hat{Y}^{1}_{n}+\sum_{m=2}^{\tau}(\hat{Y}^{m}_{n}-\hat{Y}^{m-1}_{n})\big{/}\rho^{m-1} \triangleright Return the Estimator LnL_{n}
36:if  maxi=1,,kzi>b\max_{i=1,\cdots,k}z_{i}>b  then
37:    Return Ln=0L_{n}=0.
38:else
39:    Set λn=nν[nγ,),pn=1l=0l1eλnλnll!,In=𝐈{JnBnγ}\lambda_{n}=n\nu[n\gamma,\infty),\ p_{n}=1-\sum_{l=0}^{l^{*}-1}e^{-\lambda_{n}}\frac{\lambda_{n}^{l}}{l!},\ I_{n}=\mathbf{I}\{J_{n}\in B^{\gamma}_{n}\}
40:    Return Ln=Zn/(w+1wpnIn)L_{n}=Z_{n}/(w+\frac{1-w}{p_{n}}I_{n})
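The stick-breaking steps 13-17, combined with the positive-part sums in step 32, can be sanity-checked on a toy case with a known answer: for a standard Brownian motion, the stick-breaking representation gives suptTB(t)\sup_{t\leq T}B(t) equal in law to j(ξj)+\sum_{j}(\xi_{j})^{+} with ξjN(0,lj)\xi_{j}\sim N(0,l_{j}) given the stick lengths, and 𝐄[supt1B(t)]=2/π0.798\mathbf{E}[\sup_{t\leq 1}B(t)]=\sqrt{2/\pi}\approx 0.798. A minimal stdlib-Python sketch (the truncation depth and sample size are arbitrary choices):

```python
import math, random

random.seed(5)

def sba_sup_bm(T=1.0, n_sticks=30):
    # stick-breaking (cf. steps 13-17): break [0, T] into uniform sticks, then
    # sum the positive parts of independent increments xi_j ~ N(0, l_j)
    remaining, total = T, 0.0
    for _ in range(n_sticks):
        l = random.random() * remaining
        remaining -= l
        total += max(random.gauss(0.0, math.sqrt(l)), 0.0)
    return total   # truncation error ~ sup over a leftover interval of length ~ T / 2**n_sticks

n = 100_000
est = sum(sba_sup_bm() for _ in range(n)) / n
print(est)   # E[sup_{t<=1} B(t)] = sqrt(2/pi) ~ 0.798
```

Since the expected stick lengths halve at each step, a logarithmic number of sticks already captures the supremum to high accuracy, which is the rationale behind the choice tn=log2(nd)t_{n}=\lceil\log_{2}(n^{d})\rceil.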

To demonstrate our approach for establishing condition (4.1), we start by considering a simple case where the Lévy process X(t)X(t) has generating triplet (cX,σ,ν)(c_{X},\sigma,\nu) with σ>0\sigma>0. This leads to the decomposition

X<z(t)\ensurestackMath\stackon[1pt]=dσB(t)+Y<z(t)t,z>0X^{<z}(t)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\sigma B(t)+Y^{<z}(t)\qquad\forall t,z>0

where BB is a standard Brownian motion, Y<zY^{<z} is a Lévy process with generating triplet (cX,0,ν|(,z))(c_{X},0,\nu|_{(-\infty,z)}), and the two processes are independent. Now, for any x,t>0x\in\mathbb{R},\ t>0 and δ(0,1)\delta\in(0,1),

𝐏(X<z(t)[x,x+δ])\displaystyle\mathbf{P}(X^{<z}(t)\in[x,x+\delta]) =𝐏(σB(t)[xy,xy+δ])𝐏(Y<z(t)dy)\displaystyle=\int_{\mathbb{R}}\mathbf{P}(\sigma B(t)\in[x-y,x-y+\delta])\cdot\mathbf{P}(Y^{<z}(t)\in dy)
=𝐏(B(t)t[xyσt,xy+δσt])𝐏(Y<z(t)dy)\displaystyle=\int_{\mathbb{R}}\mathbf{P}\bigg{(}\frac{B(t)}{\sqrt{t}}\in\Big{[}\frac{x-y}{\sigma\sqrt{t}},\frac{x-y+\delta}{\sigma\sqrt{t}}\Big{]}\bigg{)}\cdot\mathbf{P}(Y^{<z}(t)\in dy)
1σ2πδt.\displaystyle\leq\frac{1}{\sigma\sqrt{2\pi}}\cdot\frac{\delta}{\sqrt{t}}. (4.2)

The last inequality follows from the fact that the standard normal distribution admits a density function bounded by 1/2π1/\sqrt{2\pi}. Therefore, we have verified Assumption 2 with λ=1/2,C=1σ2π\lambda=1/2,C=\frac{1}{\sigma\sqrt{2\pi}}, and any z0>0z_{0}>0. The simple idea behind (4.2) is that continuity conditions such as (4.1) can be passed from one distribution to another through convolution. To generalize this approach to the scenario where σ=0\sigma=0 in the generating triplet of the Lévy process XX, we introduce the following definition.
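The smoothing argument in (4.2) is easy to test empirically: no matter what law the jump part has, convolving with an independent Gaussian caps interval probabilities at δ/(σ2πt)\delta/(\sigma\sqrt{2\pi t}). A stdlib-Python sketch with an assumed two-point jump part (all numeric values are illustrative):

```python
import math, random

random.seed(3)
sigma, t, delta, x = 0.5, 0.25, 0.1, 0.3   # illustrative values

def sample_X():
    # toy instance of the decomposition X^{<z}(t) = sigma*B(t) + Y^{<z}(t):
    # the jump part is an arbitrary independent variable (two-point here);
    # the bound below holds regardless of its law
    y = 1.0 if random.random() < 0.3 else -0.4
    return sigma * math.sqrt(t) * random.gauss(0.0, 1.0) + y

n = 200_000
prob = sum(1 for _ in range(n) if x <= sample_X() <= x + delta) / n
bound = delta / (sigma * math.sqrt(2 * math.pi * t))
print(prob, bound)   # the empirical probability stays below the bound
```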

Definition 3.

Let μ1\mu_{1} and μ2\mu_{2} be Borel measures on \mathbb{R}. For any Borel set AA\subset\mathbb{R}, we say that μ1\mu_{1} majorizes μ2\mu_{2} when restricted to AA (denoted by (μ1μ2)|A0(\mu_{1}-\mu_{2})|_{A}\geq 0) if μ(BA)=μ1(BA)μ2(BA)0\mu(B\cap A)=\mu_{1}(B\cap A)-\mu_{2}(B\cap A)\geq 0 for any Borel set BB\subset\mathbb{R}. In other words, μ|A=(μ1μ2)|A\mu|_{A}=(\mu_{1}-\mu_{2})|_{A} is a positive measure.

Now, let us consider the case where the generating triplet of XX is (cX,0,ν(c_{X},0,\nu). For the Lévy measure ν\nu, if we can find some z0>0z_{0}>0, some Borel set A(,z0)A\subseteq(-\infty,z_{0}) and some (positive) Borel measure μ\mu such that (νμ)|A0(\nu-\mu)|_{A}\geq 0, then through a straightforward superposition of Poisson random measures, we obtain the decomposition (let μA=μ|A\mu_{A}=\mu|_{A})

X<z(t)\ensurestackMath\stackon[1pt]=dY(t)+X~<z,A(t)zz0\displaystyle X^{<z}(t)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}Y(t)+\widetilde{X}^{<z,-A}(t)\qquad\forall z\geq z_{0} (4.3)

where Y(t)Y(t) is a Lévy process with generating triplet (0,0,μA)(0,0,\mu_{A}), X~<z,A(t)\widetilde{X}^{<z,-A}(t) is a Lévy process with generating triplet (cX,0,νμA)(c_{X},0,\nu-\mu_{A}), and the two processes are independent. Furthermore, if Assumption 2 (conditions of the form (4.1)) holds for the process Y(t)Y(t) with generating triplet (0,0,μA)(0,0,\mu_{A}), then by repeating the arguments in (4.2) we can show that Assumption 2 holds for X<z(t)X^{<z}(t) for any zz0z\geq z_{0}.

Recall our running assumption that the Lévy process X(t)X(t) has infinite activity (see Assumption 1). In case σ=0\sigma=0, we must have ν((,0)(0,))=\nu((-\infty,0)\cup(0,\infty))=\infty for XX to have infinite activity. Therefore, the key step is to identify a majorized measure μ\mu such that

  • (νμ)|A0(\nu-\mu)|_{A}\geq 0 holds for ν\nu with infinite mass and some set AA,

  • condition (4.1) holds for the Lévy process Y(t)Y(t) in (4.3) with generating triplet (0,0,μ|A)(0,0,\mu|_{A}).

In the first main result of this section, we show that measures whose tails μ[x,)\mu[x,\infty) roughly increase at a power-law rate 1/xα1/x^{\alpha} (as x0x\downarrow 0) provide ideal choices for such majorized measures. In particular, the corresponding Lévy process Y(t)Y(t) in (4.3) is intimately related to α\alpha-stable processes, which naturally satisfy continuity properties of the form (4.1). We collect the proof in Section 6.3.

Proposition 4.1.

Let α(0,2),z0>0\alpha\in(0,2),z_{0}>0, and ϵ(0,(2α)/2)\epsilon\in(0,(2-\alpha)/2). Suppose that μ[x,)\mu[x,\infty) is regularly varying as x0x\downarrow 0 with index (α+2ϵ)-(\alpha+2\epsilon). Then the Lévy process Y(t)Y(t) with generating triplet (0,0,μ|(0,z0))(0,0,\mu|_{(0,z_{0})}) has a continuous density function fY(t)f_{Y(t)} for each t>0t>0. Furthermore, there exists a constant C<C<\infty such that

fY(t)Ct1/α1t>0.\left\lVert f_{Y(t)}\right\rVert_{\infty}\leq\frac{C}{t^{1/\alpha}\wedge 1}\qquad\forall t>0.

where f=supx|f(x)|\left\lVert f\right\rVert_{\infty}=\sup_{x\in\mathbb{R}}|f(x)|.

Equipped with Proposition 4.1, we obtain the following set of sufficient conditions for Assumption 2.

Theorem 4.2.

Let (cX,σ,ν)(c_{X},\sigma,\nu) be the generating triplet of Lévy process XX.

  1. (i)

    If σ>0\sigma>0, then Assumption 2 holds for λ=1/2\lambda=1/2 and any z0>0z_{0}>0.

  2. (ii)

    If there exist a Borel measure μ\mu, some z0>0z_{0}>0, and some α(0,2)\alpha^{\prime}\in(0,2) such that (νμ)|(0,z0)0(\nu-\mu)|_{(0,z_{0})}\geq 0 (resp., (νμ)|(z0,0)0(\nu-\mu)|_{(-z_{0},0)}\geq 0) and μ[x,)\mu[x,\infty) (resp., μ(,x]\mu(-\infty,-x]) is regularly varying with index α-\alpha^{\prime} as x0x\downarrow 0, then Assumption 2 holds with λ=1/α\lambda=1/\alpha for any α(0,α)\alpha\in(0,\alpha^{\prime}).

Proof.

Part (i)(i) follows immediately from the calculations in (4.2). To prove part (ii)(ii), we fix some α(0,α)\alpha\in(0,\alpha^{\prime}), and without loss of generality assume that (νμ)|(0,z0)0(\nu-\mu)|_{(0,z_{0})}\geq 0 and μ[x,)\mu[x,\infty) is regularly varying with index α-\alpha^{\prime} as x0x\downarrow 0. This allows us to fix ϵ=(αα)/2(0,(2α)/2)\epsilon=(\alpha^{\prime}-\alpha)/2\in\big{(}0,(2-\alpha)/2\big{)}.

For any zz0z\geq z_{0}, let Y(t)Y(t) and X~<z,A(t)\widetilde{X}^{<z,-A}(t) be defined as in (4.3) with A=(0,z0)A=(0,z_{0}). First of all, applying Proposition 4.1, we can find C>0C>0 such that fY(t)Ct1/α1t>0.\left\lVert f_{Y(t)}\right\rVert_{\infty}\leq\frac{C}{t^{1/\alpha}\wedge 1}\ \forall t>0. Next, due to the independence between YY and X~<z,A(t)\widetilde{X}^{<z,-A}(t), it holds for all x,δ0x\in\mathbb{R},\delta\geq 0, and t>0t>0 that

𝐏(X<z(t)[x,x+δ])=𝐏(Y(t)[xy,xy+δ])𝐏(X~<z,A(t)dy)Ct1/α1δ.\mathbf{P}(X^{<z}(t)\in[x,x+\delta])=\int_{\mathbb{R}}\mathbf{P}(Y(t)\in[x-y,x-y+\delta])\cdot\mathbf{P}(\widetilde{X}^{<z,-A}(t)\in dy)\leq\frac{C}{t^{1/\alpha}\wedge 1}\cdot\delta.

This concludes the proof. ∎

Remark 5.

It is worth noting that the conditions stated in Theorem 4.2 are mild for Lévy processes X(t)X(t) with infinite activity. In particular, for XX to exhibit infinite activity, we must have either σ>0\sigma>0 or ν()=\nu(\mathbb{R})=\infty. Theorem 4.2 (i) deals with the case where σ>0\sigma>0. On the other hand, when σ=0\sigma=0 we must have either limϵ0ν[ϵ,)=\lim_{\epsilon\downarrow 0}\nu[\epsilon,\infty)=\infty or limϵ0ν(,ϵ]=\lim_{\epsilon\downarrow 0}\nu(-\infty,-\epsilon]=\infty. To satisfy the conditions in part (ii) of Theorem 4.2, the only other requirement is that ν[ϵ,)\nu[\epsilon,\infty) (or ν(,ϵ]\nu(-\infty,-\epsilon]) approaches infinity at a rate at least comparable to some power law.

The next set of sufficient conditions for Assumption 2 revolves around another type of self-similarity structure in the Lévy measure ν\nu.

Definition 4.

Given α(0,2)\alpha\in(0,2) and b>1b>1, a Lévy process XX is α\alpha-semi-stable with span bb if its Lévy measure ν\nu satisfies

ν=bαTbν\displaystyle\nu=b^{-\alpha}T_{b}\nu (4.4)

where, for any r>0r>0, the transformation TrT_{r} acting on a Borel measure ρ\rho on \mathbb{R} is given by (Trρ)(B)=ρ(r1B)(T_{r}\rho)(B)=\rho(r^{-1}B).

As a special case of semi-stable processes, note that XX is α\alpha-stable if

ν(dx)=c1dxx1+α𝐈{x>0}+c2dx|x|1+α𝐈{x<0}\nu(dx)=c_{1}\cdot\frac{dx}{x^{1+\alpha}}\mathbf{I}\{x>0\}+c_{2}\cdot\frac{dx}{|x|^{1+\alpha}}\mathbf{I}\{x<0\}

where c1,c20,c1+c2>0.c_{1},c_{2}\geq 0,\ c_{1}+c_{2}>0. See Theorem 14.3 in [58] for details. However, it is worth noting that neither class contains the other: the Lévy processes with regularly varying Lévy measures ν\nu studied in Proposition 4.1 do not form a subset of the semi-stable processes introduced in Definition 4, nor vice versa. For one direction, given a Borel measure ν\nu, suppose that f(x)=ν((,x][x,))f(x)=\nu\big{(}(-\infty,-x]\cup[x,\infty)\big{)} is regularly varying at 0 with index α-\alpha for some α>0\alpha>0. Even if ν\nu satisfies the scaling-invariance property in (4.4) for some b>1b>1, we can fix the sequence of points {xn=1bn}n1\{x_{n}=\frac{1}{b^{n}}\}_{n\geq 1} and assign an extra mass of lnn\ln n to ν\nu at each point xnx_{n}. In doing so, we break the scaling invariance but still maintain the regular variation of ν\nu. For the other direction, to show that a semi-stable process may not have a regularly varying Lévy measure (when restricted to some neighborhood of the origin), let us consider a simple example. For some b>1b>1 and α(0,2)\alpha\in(0,2), define the following measure:

ν({bn})=bnαn0;ν({bn:n0})=0.\nu(\{b^{-n}\})=b^{n\alpha}\ \ \forall n\geq 0;\qquad\nu\big{(}\mathbb{R}\char 92\relax\{b^{-n}:\ n\geq 0\}\big{)}=0.

Clearly, ν\nu can be seen as the restriction to (1,1)(-1,1) of the Lévy measure of some α\alpha-semi-stable process. Now define the function f(x)=ν[x,)f(x)=\nu[x,\infty) on (0,)(0,\infty). For any t>0t>0,

f(tx)f(x)=n=0logb(1/tx)bnαn=0logb(1/x)bnα=bα(logb(1/tx)+1)1bα(logb(1/x)+1)1.\frac{f(tx)}{f(x)}=\frac{\sum_{n=0}^{\lfloor\log_{b}(1/tx)\rfloor}b^{n\alpha}}{\sum_{n=0}^{\lfloor\log_{b}(1/x)\rfloor}b^{n\alpha}}=\frac{b^{\alpha(\lfloor\log_{b}(1/tx)\rfloor+1)}-1}{b^{\alpha(\lfloor\log_{b}(1/x)\rfloor+1)}-1}.

As x0x\rightarrow 0, we see that f(tx)/f(x)f(tx)/f(x) will be very close to

bα(logb(1/tx)logb(1/x)).b^{\alpha(\lfloor\log_{b}(1/tx)\rfloor-\lfloor\log_{b}(1/x)\rfloor)}.

As long as we didn’t pick t=bkt=b^{k} for some kk\in\mathbb{Z}, asymptotically, the value of f(tx)/f(x)f(tx)/f(x) will repeatedly cycle through the following three different values

{bαlogb(1/t),bαlogb(1/t)+α,bαlogb(1/t)α},\{b^{\alpha\lfloor\log_{b}(1/t)\rfloor},b^{\alpha\lfloor\log_{b}(1/t)\rfloor+\alpha},b^{\alpha\lfloor\log_{b}(1/t)\rfloor-\alpha}\},

thus implying that f(tx)/f(x)f(tx)/f(x) does not converge as xx approaches 0. This confirms that ν[x,)\nu[x,\infty) is not regularly varying as x0x\downarrow 0.
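The non-convergence can also be observed numerically. The stdlib-Python sketch below evaluates f(tx)/f(x)f(tx)/f(x) for the discrete measure above along a sequence x0x\downarrow 0; the specific choices of bb, α\alpha, tt, and the sequence 0.9k0.9^{k} are arbitrary.

```python
import math

b, alpha, t = 2.0, 1.5, 0.3   # span b, index alpha, and a ratio t that is not a power of b

def f(x):
    # f(x) = nu[x, infty) for the discrete measure nu({b^-n}) = b^(n*alpha), n >= 0
    n_max = math.floor(math.log(1.0 / x, b))
    return sum(b ** (n * alpha) for n in range(n_max + 1))

# drive x -> 0 along a sequence whose fractional part of log_b(1/x) keeps moving
ratios = [f(t * x) / f(x) for x in (0.9 ** k for k in range(40, 80))]
print(min(ratios), max(ratios))
# the ratios cluster near b^(alpha*1) ~ 2.83 and b^(alpha*2) = 8 rather than
# converging, so nu[x, infty) is not regularly varying at 0
```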

In Proposition 4.3, we show that semi-stable processes, as well as their truncated counterparts, satisfy continuity conditions of form (4.1). We say that the process Y(t)Y(t) is non-trivial if it is not a deterministic linear function (i.e., Y(t)ctY(t)\equiv ct for some cc\in\mathbb{R}). The proof is again detailed in Section 6.3.

Proposition 4.3.

Let α(0,2)\alpha\in(0,2) and NN\in\mathbb{Z}. Suppose that μ\mu is the Lévy measure of a non-trivial α\alpha-semi-stable process Y(t)Y^{\prime}(t) of span b>1b>1. Then under z0=bNz_{0}=b^{N}, the Lévy process {Y(t):t>0}\{Y(t):\ t>0\} with generating triplet (0,0,μ|(z0,z0))(0,0,\mu|_{(-z_{0},z_{0})}) has a continuous density function fY(t)f_{Y(t)} for any t>0t>0. Furthermore, there exists some C(0,)C\in(0,\infty) such that

fY(t)Ct1/α1t>0.\left\lVert f_{Y(t)}\right\rVert_{\infty}\leq\frac{C}{t^{1/\alpha}\wedge 1}\qquad\forall t>0.

Lastly, by applying Proposition 4.3, we obtain another set of sufficient conditions for Assumption 2.

Theorem 4.4.

Let (cX,σ,ν)(c_{X},\sigma,\nu) be the generating triplet of Lévy process XX. Suppose that there exist some Borel measure μ\mu and some z>0,α(0,2)z^{\prime}>0,\ \alpha\in(0,2) such that (νμ)|(z,z)0,(\nu-\mu)|_{(-z^{\prime},z^{\prime})}\geq 0, and μ\mu is the Lévy measure of some α\alpha-semi-stable process. Then Assumption 2 holds for λ=1/α\lambda=1/\alpha.

Proof.

Let b>1b>1 be the span of the α\alpha-semi-stable process. Fix some NN\in\mathbb{Z} such that z0\ensurestackMath\stackon[1pt]=ΔbNzz_{0}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}b^{N}\leq z^{\prime}. For any zz0z\geq z_{0}, let Y(t)Y(t) and X~<z,A(t)\widetilde{X}^{<z,-A}(t) be defined as in (4.3) with A=(z0,z0)A=(-z_{0},z_{0}). First of all, applying Proposition 4.3, we can find C>0C>0 such that fY(t)Ct1/α1t>0.\left\lVert f_{Y(t)}\right\rVert_{\infty}\leq\frac{C}{t^{1/\alpha}\wedge 1}\ \forall t>0. Next, due to the independence between YY and X~<z,A(t)\widetilde{X}^{<z,-A}(t), it holds for all x,δ0x\in\mathbb{R},\delta\geq 0, and t>0t>0 that

𝐏(X<z(t)[x,x+δ])=𝐏(Y(t)[xy,xy+δ])𝐏(X~<z,A(t)dy)Ct1/α1δ.\mathbf{P}(X^{<z}(t)\in[x,x+\delta])=\int_{\mathbb{R}}\mathbf{P}(Y(t)\in[x-y,x-y+\delta])\cdot\mathbf{P}(\widetilde{X}^{<z,-A}(t)\in dy)\leq\frac{C}{t^{1/\alpha}\wedge 1}\cdot\delta.

This concludes the proof. ∎

5 Numerical Experiments

In this section, we apply the importance sampling strategy outlined in Algorithms 2 and 3 and demonstrate (i)(i) the performance of the importance sampling estimators under different scaling factors and tail distributions, and (ii)(ii) the strong efficiency of the proposed algorithms when compared to crude Monte Carlo methods. Specifically, consider a Lévy process X(t)=B(t)+i=1N(t)WiX(t)=B(t)+\sum_{i=1}^{N(t)}W_{i}, where B(t)B(t) is a standard Brownian motion, NN is a Poisson process with arrival rate 0.50.5, and {Wi}i1\{W_{i}\}_{i\geq 1} is a sequence of iid random variables with law (for some α>1\alpha>1)

𝐏(W1>x)=𝐏(W1>x)=0.5(1+x)α,x>0.\mathbf{P}(W_{1}>x)=\mathbf{P}(-W_{1}>x)=\frac{0.5}{(1+x)^{\alpha}},\qquad\forall x>0.

For each n1n\geq 1, we define the scaled process X¯n(t)=X(nt)n\bar{X}_{n}(t)=\frac{X(nt)}{n}. The goal is to estimate the probability of An={X¯nA}A_{n}=\{\bar{X}_{n}\in A\}, where the set AA is defined as in (3.1) with a=2a=2 and b=1.15b=1.15. Note that this is a case with l=a/b=2l^{*}=\lceil a/b\rceil=2.
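The jump law above can be sampled by inverse transform: if UUnif(0,1)U\sim\text{Unif}(0,1), then |W|=U1/α1|W|=U^{-1/\alpha}-1 has tail (1+x)α(1+x)^{-\alpha}, and an independent fair sign recovers the two-sided law. A quick stdlib-Python check (the sample size and the test point x0x_{0} are arbitrary choices):

```python
import random

random.seed(11)
alpha = 1.45                     # one of the tail indices used in the experiments

def sample_W():
    # inverse transform: P(|W| > x) = (1+x)^(-alpha), then a fair random sign
    u = 1.0 - random.random()    # u in (0, 1]
    mag = u ** (-1.0 / alpha) - 1.0
    return mag if random.random() < 0.5 else -mag

n, x0 = 400_000, 3.0
emp = sum(1 for _ in range(n) if sample_W() > x0) / n
target = 0.5 / (1.0 + x0) ** alpha
print(emp, target)   # the empirical tail matches 0.5/(1+x0)^alpha
```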

To evaluate the performance of the importance sampling estimator under different scaling factors and tail distributions, we run experiments with α{1.45,1.6,1.75}\alpha\in\{1.45,1.6,1.75\} and n{100,200,,1000}n\in\{100,200,\cdots,1000\}. The efficiency is evaluated via the relative error of the algorithm, namely the ratio between the standard deviation and the estimated mean. In Algorithm 2, we set γ=0.25,w=0.05,ρ=0.97\gamma=0.25,\ w=0.05,\rho=0.97, and d=4d=4. In Algorithm 3, we further set κ=0.5\kappa=0.5 and r=1.5r=1.5. For both algorithms, we generate 10,000 independent samples for each combination of α{1.45,1.6,1.75}\alpha\in\{1.45,1.6,1.75\} and n{100,200,,1000}n\in\{100,200,\cdots,1000\}. For the crude Monte Carlo estimation, we ensure that at least 64/p^α,n64/\hat{p}_{\alpha,n} samples are generated, where p^α,n\hat{p}_{\alpha,n} is the probability estimated by Algorithm 2.

The results are summarized in Table 5.1 and Figure 5.1. In Table 5.1, we see that for a fixed α\alpha, the relative error of the importance sampling estimators stabilizes around a constant level as nn increases. This aligns with the strong efficiency established in Theorems 3.2 and 3.3. In comparison, the relative error of crude Monte Carlo estimators continues to increase as nn tends to infinity. Figure 5.1 further highlights that our importance sampling estimators significantly outperform crude Monte Carlo methods by orders of magnitude. In summary, when Algorithms 2 and 3 are appropriately parameterized, their efficiency becomes increasingly evident when compared against the crude Monte Carlo approach as the scaling factor nn grows larger and the target probability approaches 0.

Figure 5.1: Relative errors of the proposed importance sampling estimators, plotted on a log scale. Dashed lines: the importance sampling estimator in Algorithm 2; dotted lines: the importance sampling estimator with ARA in Algorithm 3; solid lines: the crude Monte Carlo method.
Table 5.1: Relative errors of Algorithm 2 (Alg. 2), Algorithm 3 (Alg. 3), and crude Monte Carlo (crude MC).
n                              200      400      600      800      1000
α=1.45\alpha=1.45 (Alg. 2)     11.70    13.65    14.40    15.33    15.82
α=1.45\alpha=1.45 (Alg. 3)     10.74    13.57    13.57    15.98    14.11
α=1.45\alpha=1.45 (crude MC)   97.86    136.03   195.40   238.81   273.13
α=1.6\alpha=1.6 (Alg. 2)       15.03    17.53    19.06    20.12    20.98
α=1.6\alpha=1.6 (Alg. 3)       15.59    18.23    19.59    21.30    21.30
α=1.6\alpha=1.6 (crude MC)     237.82   386.35   526.13   681.79   866.02
α=1.75\alpha=1.75 (Alg. 2)     19.03    22.54    23.94    25.97    25.77
α=1.75\alpha=1.75 (Alg. 3)     18.23    19.22    22.92    28.85    31.61
α=1.75\alpha=1.75 (crude MC)   524.78   1091.29  1298.98  1965.22  2089.82

6 Proofs

6.1 Proof of Proposition 3.1

We first prepare two technical lemmas using the sample-path large deviations for heavy-tailed Lévy processes reviewed in Section 2.2.

Lemma 6.1.

For the set A𝔻A\subset\mathbb{D} defined in (3.1) and the quantity ll^{*} defined in (3.2),

0<lim infn𝐏(X¯nA)(nν[n,))llim supn𝐏(X¯nA)(nν[n,))l<.\displaystyle 0<\liminf_{n\rightarrow\infty}\frac{\mathbf{P}(\bar{X}_{n}\in A)}{(n\nu[n,\infty))^{l^{*}}}\leq\limsup_{n\rightarrow\infty}\frac{\mathbf{P}(\bar{X}_{n}\in A)}{(n\nu[n,\infty))^{l^{*}}}<\infty.
Proof.

In this proof, we focus on the two-sided case in Assumption 1. The analysis for the one-sided case is almost identical, with the only major difference being that we apply Result 1 (i.e., the one-sided version of the large deviations of X¯n\bar{X}_{n}) instead of Result 2 (i.e., the two-sided version). Specifically, we claim that

  1. (i)(i)

    (l,0)argmin(j,k)2,𝔻j,kAj(α1)+k(α1);\big{(}l^{*},0\big{)}\in\underset{(j,k)\in\mathbb{N}^{2},\ \mathbb{D}_{j,k}\cap A\neq\emptyset}{\text{argmin}}j(\alpha-1)+k(\alpha^{\prime}-1);

  2. (ii)(ii)

    𝐂l,0(A)>0\mathbf{C}_{l^{*},0}(A^{\circ})>0;

  3. (iii)(iii)

    the set AA is bounded away from 𝔻<l,0\mathbb{D}_{<l^{*},0}.

Then by applying Result 2, we obtain

0<𝐂l,0(A)lim infn𝐏(X¯nA)(nν[n,))llim supn𝐏(X¯nA)(nν[n,))l𝐂l,0(A)<\displaystyle 0<{\mathbf{C}_{l^{*},0}(A^{\circ})}\leq\liminf_{n\rightarrow\infty}\frac{\mathbf{P}(\bar{X}_{n}\in A)}{(n\nu[n,\infty))^{l^{*}}}\leq\limsup_{n\rightarrow\infty}\frac{\mathbf{P}(\bar{X}_{n}\in A)}{(n\nu[n,\infty))^{l^{*}}}\leq{\mathbf{C}_{l^{*},0}(A^{-})}<\infty

and conclude the proof. Now, it remains to prove claims (i)(i), (ii)(ii), and (iii)(iii).

Proof of Claim (i)(i).

By definitions of 𝔻j,k\mathbb{D}_{j,k}, for any ξ𝔻j,k\xi\in\mathbb{D}_{j,k} there exist (ui)i=1j(0,)j(u_{i})_{i=1}^{j}\in(0,\infty)^{j}, (ti)i=1j(0,1]j(t_{i})_{i=1}^{j}\in(0,1]^{j} and (vi)i=1k(0,)k(v_{i})_{i=1}^{k}\in(0,\infty)^{k}, (si)i=1k(0,1]k(s_{i})_{i=1}^{k}\in(0,1]^{k} such that

ξ(t)=i=1jui𝐈[ti,1](t)i=1kvi𝐈[si,1](t)t[0,1].\displaystyle\xi(t)=\sum_{i=1}^{j}u_{i}\mathbf{I}_{[t_{i},1]}(t)-\sum_{i=1}^{k}v_{i}\mathbf{I}_{[s_{i},1]}(t)\qquad\forall t\in[0,1]. (6.1)

First, from Assumption 3, one can choose ϵ>0\epsilon>0 small enough such that l(bϵ)>al^{*}(b-\epsilon)>a. Then for the case with (j,k)=(l,0)(j,k)=(l^{*},0) in (6.1), by picking ui=bϵu_{i}=b-\epsilon for all i[l]i\in[l^{*}], we have supt[0,1]ξ(t)=i=1lui=l(bϵ)>a\sup_{t\in[0,1]}\xi(t)=\sum_{i=1}^{l^{*}}u_{i}=l^{*}(b-\epsilon)>a, and hence ξA\xi\in A. This verifies 𝔻l,0A\mathbb{D}_{l^{*},0}\cap A\neq\emptyset.

Next, suppose we can show that jlj\geq l^{*} is a necessary condition for 𝔻j,kA\mathbb{D}_{j,k}\cap A\neq\emptyset. Then we get

{(j,k)2:𝔻j,kA}{(j,k)2:jl,k0},\displaystyle\big{\{}(j,k)\in\mathbb{N}^{2}:\ \mathbb{D}_{j,k}\cap A\neq\emptyset\big{\}}\subseteq\big{\{}(j,k)\in\mathbb{N}^{2}:\ j\geq l^{*},\ k\geq 0\big{\}},

which immediately verifies claim (i)(i) due to α,α>1\alpha,\alpha^{\prime}>1; see Assumption 1. Now, to show that jlj\geq l^{*} is a necessary condition for 𝔻j,kA\mathbb{D}_{j,k}\cap A\neq\emptyset, note that from (6.1), it holds for any ξ𝔻j,kA\xi\in\mathbb{D}_{j,k}\cap A that a<supt[0,1]ξ(t)i=1jui<jb.a<\sup_{t\in[0,1]}\xi(t)\leq\sum_{i=1}^{j}u_{i}<jb. As a result, we must have j>a/bj>a/b and hence jl=a/bj\geq l^{*}=\lceil a/b\rceil due to a/ba/b\notin\mathbb{Z}; see Assumption 3. This concludes the proof of claim (i)(i).

Proof of Claim (ii)(ii).

Again, choose some ϵ>0\epsilon>0 small enough such that l(bϵ)>al^{*}(b-\epsilon)>a. Given any ui(bϵ,b)u_{i}\in(b-\epsilon,b) and 0<t1<t2<<tl<10<t_{1}<t_{2}<\cdots<t_{l^{*}}<1, the step function ξ(t)=i=1lui𝐈[ti,1](t)\xi(t)=\sum_{i=1}^{l^{*}}u_{i}\mathbf{I}_{[t_{i},1]}(t) satisfies supt[0,1]ξ(t)l(bϵ)>a\sup_{t\in[0,1]}\xi(t)\geq l^{*}(b-\epsilon)>a, thus implying ξA\xi\in A. Therefore, (for the definition of 𝐂j,k\mathbf{C}_{j,k}, see (2.3))

𝐂l,0(A)\displaystyle\mathbf{C}_{l^{*},0}(A^{\circ}) ναl((bϵ,b)l)=1l![1(bϵ)α1bα]l>0.\displaystyle\geq\nu^{l^{*}}_{\alpha}\Big{(}(b-\epsilon,b)^{l^{*}}\Big{)}=\frac{1}{l^{*}!}\bigg{[}\frac{1}{(b-\epsilon)^{\alpha}}-\frac{1}{b^{\alpha}}\bigg{]}^{l^{*}}>0.

Proof of Claim (iii)(iii).

Assumption 3 implies that a>(l1)ba>(l^{*}-1)b, allowing us to choose ϵ>0\epsilon>0 small enough that aϵ>(l1)(b+ϵ)a-\epsilon>(l^{*}-1)(b+\epsilon). It suffices to show that

𝒅(ξ,ξ)ϵξ𝔻<l,0,ξA.\displaystyle\bm{d}(\xi,\xi^{\prime})\geq\epsilon\qquad\forall\xi\in\mathbb{D}_{<l^{*},0},\ \xi^{\prime}\in A. (6.2)

Here, 𝒅\bm{d} is the Skorokhod J1J_{1} metric; see (2.1) for the definition. To prove (6.2), we start with the following observation: due to claim (i)(i), for any (j,k)2(j,k)\in\mathbb{N}^{2} with (j,k)𝕀<l,0(j,k)\in\mathbb{I}_{<l^{*},0}, we must have jl1j\leq l^{*}-1. Now, we proceed with a proof by contradiction. Suppose there is some ξ𝔻j,k\xi\in\mathbb{D}_{j,k} with jl1j\leq l^{*}-1 and some ξA\xi^{\prime}\in A such that 𝒅(ξ,ξ)<ϵ\bm{d}(\xi,\xi^{\prime})<\epsilon. Due to ξA\xi^{\prime}\in A (and hence no upward jump in ξ\xi^{\prime} is larger than bb) and 𝒅(ξ,ξ)<ϵ\bm{d}(\xi,\xi^{\prime})<\epsilon, under the representation (6.1) we must have ui<b+ϵi[j]u_{i}<b+\epsilon\ \forall i\in[j]. This implies supt[0,1]ξ(t)i=1jui<j(b+ϵ)(l1)(b+ϵ)\sup_{t\in[0,1]}\xi(t)\leq\sum_{i=1}^{j}u_{i}<j(b+\epsilon)\leq(l^{*}-1)(b+\epsilon). Due to 𝒅(ξ,ξ)<ϵ\bm{d}(\xi,\xi^{\prime})<\epsilon again, we arrive at the contradiction that supt[0,1]ξ(t)<(l1)(b+ϵ)+ϵ<a\sup_{t\in[0,1]}\xi^{\prime}(t)<(l^{*}-1)(b+\epsilon)+\epsilon<a (and hence ξA\xi^{\prime}\notin A). This concludes the proof of claim (iii)(iii). ∎

Lemma 6.2.

Let p>1p>1. Let Δ>0\Delta>0 be such that aΔ>(l1)ba-\Delta>(l^{*}-1)b and [aΔ(l1)b]/γ[a-\Delta-(l^{*}-1)b]/\gamma\notin\mathbb{Z}. Suppose that (Jγ+l1)/p>2l(J_{\gamma}+l^{*}-1)/p>2l^{*} holds for

Jγ\ensurestackMath\stackon[1pt]=ΔaΔ(l1)bγ.\displaystyle J_{\gamma}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\lceil\frac{a-\Delta-(l^{*}-1)b}{\gamma}\rceil. (6.3)

Then

𝐏(X¯nAΔE,𝒟(J¯n)l1)=𝒐((nν[n,))2pl)as n\mathbf{P}\big{(}\bar{X}_{n}\in A^{\Delta}\cap E,\ \mathcal{D}(\bar{J}_{n})\leq l^{*}-1\big{)}=\bm{o}\Big{(}\big{(}n\nu[n,\infty)\big{)}^{2pl^{*}}\Big{)}\qquad\text{as }n\to\infty

where AΔ={ξ𝔻:supt[0,1]ξ(t)aΔ},E\ensurestackMath\stackon[1pt]=Δ{ξ𝔻:supt(0,1]ξ(t)ξ(t)<b}A^{\Delta}=\{\xi\in\mathbb{D}:\sup_{t\in[0,1]}\xi(t)\geq a-\Delta\},\ E\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\xi\in\mathbb{D}:\sup_{t\in(0,1]}\xi(t)-\xi(t-)<b\} and the function 𝒟(ξ)\mathcal{D}(\xi) counts the number of discontinuities in ξ𝔻\xi\in\mathbb{D}.

Proof.

Similar to the proof of Lemma 6.1, we focus on the two-sided case in Assumption 1. The proof of the one-sided case is again almost identical, with the only major difference being that we apply Result 1 instead of Result 2.

First, observe that 𝐏(X¯nAΔE,𝒟(J¯n)l1)=𝐏(X¯nF)\mathbf{P}(\bar{X}_{n}\in A^{\Delta}\cap E,\ \mathcal{D}(\bar{J}_{n})\leq l^{*}-1)=\mathbf{P}(\bar{X}_{n}\in F) where

F\displaystyle F \ensurestackMath\stackon[1pt]=Δ{ξ𝔻:supt[0,1]ξ(t)aΔ;supt(0,1]ξ(t)ξ(t)<b,\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\big{\{}\xi\in\mathbb{D}:\ \sup_{t\in[0,1]}\xi(t)\geq a-\Delta;\ \sup_{t\in(0,1]}\xi(t)-\xi(t-)<b,
#{t[0,1]:ξ(t)ξ(t)γ}l1}.\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\#\{t\in[0,1]:\ \xi(t)-\xi(t-)\geq\gamma\}\leq l^{*}-1\big{\}}.

Furthermore, we claim that

  1. (i)(i)

    (Jγ+l1,0)argmin(j,k)2,𝔻j,kFj(α1)+k(α1);(J_{\gamma}+l^{*}-1,0)\in\underset{(j,k)\in\mathbb{N}^{2},\ \mathbb{D}_{j,k}\cap F\neq\emptyset}{\text{argmin}}j(\alpha-1)+k(\alpha^{\prime}-1);

  2. (ii)(ii)

    the set FF is bounded away from 𝔻<Jγ+l1,0\mathbb{D}_{<J_{\gamma}+l^{*}-1,0}.

Then we are able to apply Result 2 and obtain

𝐏(X¯nAΔE,𝒟(J¯n)l1)=𝐏(X¯nF)=𝑶((nν[n,))Jγ+l1)as n.\mathbf{P}(\bar{X}_{n}\in A^{\Delta}\cap E,\ \mathcal{D}(\bar{J}_{n})\leq l^{*}-1)=\mathbf{P}(\bar{X}_{n}\in F)=\bm{O}\big{(}(n\nu[n,\infty))^{J_{\gamma}+l^{*}-1}\big{)}\qquad\text{as }n\to\infty.

Lastly, by our assumption (Jγ+l1)/p>2l(J_{\gamma}+l^{*}-1)/p>2l^{*}, we get (nν[n,))l1+Jγ=𝒐((nν[n,))2pl)(n\nu[n,\infty))^{l^{*}-1+J_{\gamma}}=\bm{o}\big{(}\big{(}n\nu[n,\infty)\big{)}^{2pl^{*}}\big{)} and conclude the proof. Now, it remains to prove claims (i)(i) and (ii)(ii).

Proof of Claim (i)(i).

By definition of 𝔻j,k\mathbb{D}_{j,k}, given any ξ𝔻j,k\xi\in\mathbb{D}_{j,k} there exist (ui)i=1j(0,)j,(ti)i=1j(0,1]j(u_{i})_{i=1}^{j}\in(0,\infty)^{j},(t_{i})_{i=1}^{j}\in(0,1]^{j} and (vi)i=1k(0,)k,(si)i=1k(0,1]k(v_{i})_{i=1}^{k}\in(0,\infty)^{k},(s_{i})_{i=1}^{k}\in(0,1]^{k} such that the representation (6.1) holds. By assumption [aΔ(l1)b]/γ[a-\Delta-(l^{*}-1)b]/\gamma\notin\mathbb{Z}, for JγJ_{\gamma} defined in (6.3) we have

(Jγ1)γ<aΔ(l1)b<Jγγ.\displaystyle(J_{\gamma}-1)\gamma<a-\Delta-(l^{*}-1)b<J_{\gamma}\cdot\gamma. (6.4)

It then holds for all ϵ>0\epsilon>0 small enough that aΔ<Jγ(γϵ)+(l1)(bϵ)a-\Delta<J_{\gamma}(\gamma-\epsilon)+(l^{*}-1)(b-\epsilon). As a result, for the case with (j,k)=(l1+Jγ,0)(j,k)=(l^{*}-1+J_{\gamma},0) in (6.1), by picking ui=bϵu_{i}=b-\epsilon for all i[l1]i\in[l^{*}-1], and ui=γϵu_{i}=\gamma-\epsilon for all i=l,l+1,,l1+Jγi=l^{*},l^{*}+1,\cdots,l^{*}-1+J_{\gamma}, we get supt[0,1]ξ(t)=Jγ(γϵ)+(l1)(bϵ)>aΔ\sup_{t\in[0,1]}\xi(t)=J_{\gamma}(\gamma-\epsilon)+(l^{*}-1)(b-\epsilon)>a-\Delta. This proves that ξ𝔻l1+Jγ,0F\xi\in\mathbb{D}_{l^{*}-1+J_{\gamma},0}\cap F, and hence 𝔻l1+Jγ,0F\mathbb{D}_{l^{*}-1+J_{\gamma},0}\cap F\neq\emptyset.

Next, suppose we can show that jl1+Jγj\geq l^{*}-1+J_{\gamma} is a necessary condition for 𝔻j,kF\mathbb{D}_{j,k}\cap F\neq\emptyset. Then, we get

{(j,k)2:𝔻j,kF}{(j,k)2:jl1+Jγ,k0},\displaystyle\{(j,k)\in\mathbb{N}^{2}:\ \mathbb{D}_{j,k}\cap F\neq\emptyset\}\subseteq\{(j,k)\in\mathbb{N}^{2}:\ j\geq l^{*}-1+J_{\gamma},\ k\geq 0\},

which immediately verifies claim (i)(i) due to α,α>1\alpha,\alpha^{\prime}>1; see Assumption 1. Now, to show that jl1+Jγj\geq l^{*}-1+J_{\gamma} is a necessary condition, note that, from (6.1), it holds for any ξ𝔻j,kF\xi\in\mathbb{D}_{j,k}\cap F that aΔ<supt[0,1]ξ(t)i=1jui.a-\Delta<\sup_{t\in[0,1]}\xi(t)\leq\sum_{i=1}^{j}u_{i}. Furthermore, by the definition of the set FF, we must have (here, w.l.o.g., we order uiu_{i}’s by u1u2uju_{1}\geq u_{2}\geq\ldots\geq u_{j}) ui<bu_{i}<b for all i[l1]i\in[l^{*}-1] and ui<γu_{i}<\gamma for all i=l,l+1,,ji=l^{*},l^{*}+1,\cdots,j. This implies (l1)b+(jl+1)γ>aΔ,(l^{*}-1)b+(j-l^{*}+1)\gamma>a-\Delta, and hence j>aΔ(l1)bγ+l1,j>\frac{a-\Delta-(l^{*}-1)b}{\gamma}+l^{*}-1, which is equivalent to jJγ+l1j\geq J_{\gamma}+l^{*}-1.

Proof of Claim (ii)(ii).

From (6.4), we can fix some ϵ>0\epsilon>0 small enough such that

aΔϵ>(l1)(b+ϵ)+(Jγ1)(γ+ϵ).\displaystyle a-\Delta-\epsilon>(l^{*}-1)(b+\epsilon)+(J_{\gamma}-1)(\gamma+\epsilon). (6.5)

It suffices to show that

𝒅(ξ,ξ)ϵξ𝔻<Jγ+l1,0,ξF.\displaystyle\bm{d}(\xi,\xi^{\prime})\geq\epsilon\qquad\forall\xi\in\mathbb{D}_{<J_{\gamma}+l^{*}-1,0},\ \xi^{\prime}\in F. (6.6)

Here, 𝒅\bm{d} is the Skorokhod J1J_{1} metric; see (2.1) for the definition. To prove (6.6), we start with the following observation: using claim (i)(i), for any (j,k)2(j,k)\in\mathbb{N}^{2} with (j,k)𝕀<Jγ+l1,0(j,k)\in\mathbb{I}_{<J_{\gamma}+l^{*}-1,0}, we must have jJγ+l2j\leq J_{\gamma}+l^{*}-2. Next, we proceed with a proof by contradiction. Suppose there is some ξ𝔻j,k\xi\in\mathbb{D}_{j,k} with jJγ+l2j\leq J_{\gamma}+l^{*}-2 and some ξF\xi^{\prime}\in F such that 𝒅(ξ,ξ)<ϵ\bm{d}(\xi,\xi^{\prime})<\epsilon. By the definition of the set FF above, any upward jump in ξ\xi^{\prime} is bounded by bb, and at most l1l^{*}-1 of them are larger than γ\gamma. Then from 𝒅(ξ,ξ)<ϵ\bm{d}(\xi,\xi^{\prime})<\epsilon, we know that any upward jump in ξ\xi is bounded by b+ϵb+\epsilon, and at most l1l^{*}-1 of them are larger than γ+ϵ\gamma+\epsilon. Through (6.1), we now have

supt[0,1]ξ(t)\displaystyle\sup_{t\in[0,1]}\xi(t) i=1jui(l1)(b+ϵ)+(Jγ1)(γ+ϵ)<aΔϵ.\displaystyle\leq\sum_{i=1}^{j}u_{i}\leq(l^{*}-1)(b+\epsilon)+(J_{\gamma}-1)(\gamma+\epsilon)<a-\Delta-\epsilon.

The last inequality follows from (6.5). Using 𝒅(ξ,ξ)<ϵ\bm{d}(\xi,\xi^{\prime})<\epsilon again, we arrive at the contradiction that supt[0,1]ξ(t)<aΔ\sup_{t\in[0,1]}\xi^{\prime}(t)<a-\Delta and hence ξF\xi^{\prime}\notin F. This concludes the proof of (6.6). ∎

Now, we are ready to prove Proposition 3.1.

Proof of Proposition 3.1.

We start by proving the unbiasedness of the importance sampling estimator

Ln=Znd𝐏d𝐐n=m=0τY^nm𝐈End𝐏d𝐐nY^nm1𝐈End𝐏d𝐐n𝐏(τm).\displaystyle L_{n}=Z_{n}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}=\sum_{m=0}^{\tau}\frac{\hat{Y}^{m}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}-\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}}{\mathbf{P}(\tau\geq m)}.

under 𝐐n\mathbf{Q}_{n}. Note that under both 𝐏\mathbf{P} and 𝐐n\mathbf{Q}_{n}, we have τGeom(ρ)\tau\sim\text{Geom}(\rho) (i.e., 𝐏(τm)=ρm1\mathbf{P}(\tau\geq m)=\rho^{m-1}) and that τ\tau is independent of everything else. In light of Result 4, it suffices to verify limm𝐄𝐐n[Ym]=𝐄𝐐n[Y]\lim_{m\to\infty}\mathbf{E}^{\mathbf{Q}_{n}}[Y_{m}]=\mathbf{E}^{\mathbf{Q}_{n}}[Y] and condition (2.8) (under 𝐐n\mathbf{Q}_{n}) with the choice of Ym=Y^nm𝐈End𝐏d𝐐nY_{m}=\hat{Y}^{m}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}} and Y=Yn𝐈End𝐏d𝐐n.Y=Y^{*}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}. In particular, it suffices to show that (for any n1n\geq 1)

m1𝐄𝐐n[|Y^nm1𝐈End𝐏d𝐐nYn𝐈End𝐏d𝐐n|2]/𝐏(τm)<.\displaystyle\sum_{m\geq 1}\mathbf{E}^{\mathbf{Q}_{n}}\Bigg{[}\bigg{|}\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}-Y^{*}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\bigg{|}^{2}\Bigg{]}\Bigg{/}\mathbf{P}(\tau\geq m)<\infty. (6.7)

To see why, note that (6.7) directly verifies condition (2.8). Furthermore, it implies limm𝐄𝐐n|Y^nm1𝐈End𝐏d𝐐nYn𝐈End𝐏d𝐐n|2=0.\lim_{m\to\infty}\mathbf{E}^{\mathbf{Q}_{n}}\big{|}\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}-Y^{*}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\big{|}^{2}=0. The 2\mathcal{L}_{2} convergence then implies the 1\mathcal{L}_{1} convergence, i.e., limm𝐄𝐐n[Y^nm1𝐈End𝐏d𝐐n]=𝐄𝐐n[Yn𝐈End𝐏d𝐐n].\lim_{m\to\infty}\mathbf{E}^{\mathbf{Q}_{n}}[\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}]=\mathbf{E}^{\mathbf{Q}_{n}}[Y^{*}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}].

To prove claim (6.7), observe that

𝐄𝐐n[|Y^nm1𝐈End𝐏d𝐐nYn𝐈End𝐏d𝐐n|2]\displaystyle\mathbf{E}^{\mathbf{Q}_{n}}\Bigg{[}\bigg{|}\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}-Y^{*}_{n}\mathbf{I}_{E_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\bigg{|}^{2}\Bigg{]} 𝐄𝐐n[|Y^nm1Yn|2(d𝐏d𝐐n)2]\displaystyle\leq\mathbf{E}^{\mathbf{Q}_{n}}\bigg{[}|\hat{Y}^{m-1}_{n}-Y^{*}_{n}|^{2}\cdot\bigg{(}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\bigg{)}^{2}\bigg{]}
=𝐄[|Y^nm1Yn|2d𝐏d𝐐n]\displaystyle=\mathbf{E}\bigg{[}|\hat{Y}^{m-1}_{n}-Y^{*}_{n}|^{2}\cdot\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\bigg{]}
1w𝐄|Y^nm1Yn|2due to d𝐏d𝐐n1w, see (3.5).\displaystyle\leq\frac{1}{w}\mathbf{E}|\hat{Y}^{m-1}_{n}-Y^{*}_{n}|^{2}\qquad\text{due to }\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\leq\frac{1}{w}\text{, see \eqref{def: estimator Ln}}.

In particular, since Y^nm\hat{Y}^{m}_{n} and YnY^{*}_{n} only take values in {0,1}\{0,1\}, we have 𝐄|Y^nmYn|2=𝐏(Y^nmYn),\mathbf{E}|\hat{Y}^{m}_{n}-Y^{*}_{n}|^{2}=\mathbf{P}(\hat{Y}^{m}_{n}\neq Y^{*}_{n}), and

𝐏(Y^nmYn)\displaystyle\mathbf{P}(\hat{Y}^{m}_{n}\neq Y^{*}_{n}) =k0𝐏(YnY^nm|𝒟(J¯n)=k)𝐏(𝒟(J¯n)=k)\displaystyle=\sum_{k\geq 0}\mathbf{P}(Y^{*}_{n}\neq\hat{Y}^{m}_{n}\ |\ \mathcal{D}(\bar{J}_{n})=k)\mathbf{P}(\mathcal{D}(\bar{J}_{n})=k)
k0C0ρ0m(k+1)𝐏(𝒟(J¯n)=k)for all mm¯ due to (3.9)\displaystyle\leq\sum_{k\geq 0}C_{0}\rho^{m}_{0}\cdot(k+1)\cdot\mathbf{P}(\mathcal{D}(\bar{J}_{n})=k)\qquad\text{for all $m\geq\bar{m}$ due to \eqref{condition 1, proposition: design of Zn}}
=C0ρ0m𝐄[1+Poisson(nν[nγ,))]=C0ρ0m(1+nν[nγ,)).\displaystyle=C_{0}\rho^{m}_{0}\cdot\mathbf{E}\Big{[}1+\text{Poisson}\big{(}n\nu[n\gamma,\infty)\big{)}\Big{]}=C_{0}\rho^{m}_{0}\cdot\big{(}1+n\nu[n\gamma,\infty)\big{)}. (6.8)

The last line in the display above follows from the definition of J¯n(t)=1nJ(nt)\bar{J}_{n}(t)=\frac{1}{n}J(nt) in (3.6). To conclude, note that ν[x,)𝒱α(x)\nu[x,\infty)\in\mathcal{RV}_{-\alpha}(x) and hence nν[nγ,)𝒱(α1)(n)n\nu[n\gamma,\infty)\in\mathcal{RV}_{-(\alpha-1)}(n) with α>1\alpha>1, thus implying supn1nν[nγ,)<\sup_{n\geq 1}n\nu[n\gamma,\infty)<\infty. Combined with the bound (6.8), 𝐏(τm)=ρm1\mathbf{P}(\tau\geq m)=\rho^{m-1}, and the choice ρ(ρ0,1)\rho\in(\rho_{0},1) prescribed in Proposition 3.1, this shows that the series in (6.7) is dominated by a convergent geometric series, which completes the proof of unbiasedness.
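To make the debiasing mechanism concrete, the following is a minimal self-contained sketch of a Rhee–Glynn type randomized estimator with the same geometric randomization τGeom(ρ)\tau\sim\text{Geom}(\rho), 𝐏(τm)=ρm1\mathbf{P}(\tau\geq m)=\rho^{m-1}. It uses an invented deterministic approximation sequence Y(m)1Y(m)\to 1 in place of the SBA-based approximations Y^nm\hat{Y}^{m}_{n} of the paper:

```python
import random

def debiased_estimate(Y, rho, rng):
    """One sample of a Rhee-Glynn type debiased estimator
    Z = sum_{m=0}^{tau} (Y(m) - Y(m-1)) / P(tau >= m),
    where tau ~ Geom(rho) on {1, 2, ...} so that P(tau >= m) = rho**(m-1),
    with the convention Y(-1) := 0."""
    # Sample tau ~ Geom(rho): P(tau = m) = rho**(m-1) * (1 - rho).
    tau = 1
    while rng.random() < rho:
        tau += 1
    z = Y(0)  # m = 0 term: (Y(0) - 0) / P(tau >= 0), with P(tau >= 0) = 1
    for m in range(1, tau + 1):
        z += (Y(m) - Y(m - 1)) / rho ** (m - 1)
    return z

# Toy approximation sequence converging to 1 at a geometric rate, so the
# summability condition (the analogue of (6.7)) holds whenever rho > 1/4.
Y = lambda m: 1.0 - 2.0 ** (-(m + 1))

rng = random.Random(42)
samples = [debiased_estimate(Y, rho=0.6, rng=rng) for _ in range(100_000)]
print(sum(samples) / len(samples))  # close to lim_m Y(m) = 1
```

Even though each sample only evaluates finitely many Y(m)Y(m), the weights 1/𝐏(τm)1/\mathbf{P}(\tau\geq m) make the expectation telescope to the limit, which is exactly the role Result 4 plays above.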

The rest of the proof is devoted to establishing the strong efficiency of LnL_{n}. Observe that

𝐄𝐐n[Ln2]\displaystyle\mathbf{E}^{\mathbf{Q}_{n}}[L^{2}_{n}] =Zn2d𝐏d𝐐nd𝐏d𝐐n𝑑𝐐n=Zn2d𝐏d𝐐n𝑑𝐏=Zn2𝐈Bnγd𝐏d𝐐n𝑑𝐏+Zn2𝐈(Bnγ)cd𝐏d𝐐n𝑑𝐏.\displaystyle=\int Z^{2}_{n}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}d\mathbf{Q}_{n}=\int Z^{2}_{n}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}d\mathbf{P}=\int Z^{2}_{n}\mathbf{I}_{B^{\gamma}_{n}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}d\mathbf{P}+\int Z^{2}_{n}\mathbf{I}_{(B^{\gamma}_{n})^{c}}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}d\mathbf{P}.

By definitions in (3.5), on event (Bnγ)c(B^{\gamma}_{n})^{c} we have d𝐏d𝐐n1w\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\leq\frac{1}{w}, while on event BnγB^{\gamma}_{n} we have d𝐏d𝐐n𝐏(Bnγ)1w\frac{d\mathbf{P}}{d\mathbf{Q}_{n}}\leq\frac{\mathbf{P}(B^{\gamma}_{n})}{1-w}. As a result,

𝐄𝐐n[Ln2]𝐏(Bnγ)1w𝐄[Zn2𝐈Bnγ]+1w𝐄[Zn2𝐈(Bnγ)c].\displaystyle\mathbf{E}^{\mathbf{Q}_{n}}[L^{2}_{n}]\leq\frac{\mathbf{P}(B^{\gamma}_{n})}{1-w}\mathbf{E}[{Z^{2}_{n}\mathbf{I}_{B^{\gamma}_{n}}}]+\frac{1}{w}\mathbf{E}[{Z^{2}_{n}\mathbf{I}_{(B^{\gamma}_{n})^{c}}}]. (6.9)

Meanwhile, Lemma 6.1 implies that

0<lim infn𝐏(An)(nν[n,))llim supn𝐏(An)(nν[n,))l<.\displaystyle 0<\liminf_{n\rightarrow\infty}\frac{\mathbf{P}(A_{n})}{(n\nu[n,\infty))^{l^{*}}}\leq\limsup_{n\rightarrow\infty}\frac{\mathbf{P}(A_{n})}{(n\nu[n,\infty))^{l^{*}}}<\infty. (6.10)

Let Zn,1\ensurestackMath\stackon[1pt]=ΔZn𝐈BnγZ_{n,1}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}Z_{n}\mathbf{I}_{B^{\gamma}_{n}} and Zn,2\ensurestackMath\stackon[1pt]=ΔZn𝐈(Bnγ)cZ_{n,2}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}Z_{n}\mathbf{I}_{(B^{\gamma}_{n})^{c}}. Furthermore, given ρ(ρ0,1)\rho\in(\rho_{0},1), we claim the existence of some γ¯=γ¯(ρ)(0,b)\bar{\gamma}=\bar{\gamma}(\rho)\in(0,b) such that for any γ(0,γ¯)\gamma\in(0,\bar{\gamma}),

𝐏(Bnγ)\displaystyle\mathbf{P}(B^{\gamma}_{n}) =𝑶((nν[n,))l),\displaystyle=\bm{O}\big{(}(n\nu[n,\infty))^{l^{*}}\big{)}, (6.11)
𝐄[Zn,12]\displaystyle\mathbf{E}[Z_{n,1}^{2}] =𝑶((nν[n,))l),\displaystyle=\bm{O}\big{(}(n\nu[n,\infty))^{l^{*}}\big{)}, (6.12)
𝐄[Zn,22]\displaystyle\mathbf{E}[Z_{n,2}^{2}] =𝒐((nν[n,))2l),\displaystyle=\bm{o}\big{(}(n\nu[n,\infty))^{2l^{*}}\big{)}, (6.13)

as nn\to\infty. Then, using (6.11) and (6.12) we get 𝐏(Bnγ)𝐄[Zn2𝐈Bnγ]=𝑶((nν[n,))2l)=𝑶(𝐏2(An)).\mathbf{P}(B^{\gamma}_{n})\mathbf{E}[{Z^{2}_{n}\mathbf{I}_{B^{\gamma}_{n}}}]=\bm{O}\big{(}(n\nu[n,\infty))^{2l^{*}}\big{)}=\bm{O}\big{(}\mathbf{P}^{2}(A_{n})\big{)}. The last equality follows from (6.10). Similarly, from (6.10) and (6.13) we get 𝐄[Zn2𝐈(Bnγ)c]=𝒐((nν[n,))2l)=𝒐(𝐏2(An)).\mathbf{E}[{Z^{2}_{n}\mathbf{I}_{(B^{\gamma}_{n})^{c}}}]=\bm{o}\big{(}(n\nu[n,\infty))^{2l^{*}}\big{)}=\bm{o}\big{(}\mathbf{P}^{2}(A_{n})\big{)}. Therefore, in (6.9) we have 𝐄𝐐n[Ln2]=𝑶(𝐏2(An))\mathbf{E}^{\mathbf{Q}_{n}}[L^{2}_{n}]=\bm{O}\big{(}\mathbf{P}^{2}(A_{n})\big{)}, thus establishing the strong efficiency. Now, it remains to prove claims (6.11), (6.12), and (6.13).

Proof of Claim (6.11).

We show that the claim holds for all γ(0,b)\gamma\in(0,b). For any c>0c>0 and kk\in\mathbb{N}, note that

𝐏(Poisson(c)k)\displaystyle\mathbf{P}\big{(}\text{Poisson}(c)\geq k\big{)} =jkexp(c)cjj!=ckjkexp(c)cjkj!ckjkexp(c)cjk(jk)!=ck.\displaystyle=\sum_{j\geq k}\exp(-c)\frac{c^{j}}{j!}=c^{k}\sum_{j\geq k}\exp(-c)\frac{c^{j-k}}{j!}\leq c^{k}\sum_{j\geq k}\exp(-c)\frac{c^{j-k}}{(j-k)!}=c^{k}. (6.14)

Recall that Bnγ={X¯nBγ}B^{\gamma}_{n}=\{\bar{X}_{n}\in B^{\gamma}\} and Bγ\ensurestackMath\stackon[1pt]=Δ{ξ𝔻:#{t[0,1]:ξ(t)ξ(t)γ}l}.B^{\gamma}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\{\xi\in\mathbb{D}:\#\{t\in[0,1]:\xi(t)-\xi(t-)\geq\gamma\}\geq l^{*}\}. Therefore,

𝐏(Bnγ)\displaystyle\mathbf{P}(B^{\gamma}_{n}) =𝐏(#{t[0,n]:X(t)X(t)nγ}l)due to X¯n(t)=X(nt)/n\displaystyle=\mathbf{P}\big{(}\#\{t\in[0,n]:\ X(t)-X(t-)\geq n\gamma\}\geq l^{*}\big{)}\qquad\text{due to }\bar{X}_{n}(t)=X(nt)/n
=klexp(nν[nγ,))(nν[nγ,))kk!(nν[nγ,))ldue to (6.14).\displaystyle=\sum_{k\geq l^{*}}\exp\big{(}-n\nu[n\gamma,\infty)\big{)}\frac{\big{(}n\nu[n\gamma,\infty)\big{)}^{k}}{k!}\leq\big{(}n\nu[n\gamma,\infty)\big{)}^{l^{*}}\qquad\text{due to }\eqref{proof, bound poisson dist tail, proposition: design of Zn}.

Lastly, the regularly varying nature of ν[x,)\nu[x,\infty) (see Assumption 1) implies limn(nν[nγ,))l(nν[n,))l=1/γαl(0,),\lim_{n\rightarrow\infty}\frac{(n\nu[n\gamma,\infty))^{l^{*}}}{(n\nu[n,\infty))^{l^{*}}}=1/\gamma^{\alpha l^{*}}\in(0,\infty), and hence 𝐏(Bnγ)=𝑶((nν[n,))l)\mathbf{P}(B^{\gamma}_{n})=\bm{O}\big{(}(n\nu[n,\infty))^{l^{*}}\big{)}.
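The elementary Poisson tail bound (6.14) is easy to sanity-check numerically. In the sketch below, `poisson_tail` is a helper written for this check (not part of the paper's algorithms), and the grid of values of cc and kk is arbitrary:

```python
import math

def poisson_tail(c, k):
    """P(Poisson(c) >= k), computed as 1 minus the first k pmf terms."""
    return 1.0 - sum(math.exp(-c) * c ** j / math.factorial(j) for j in range(k))

# Bound (6.14): P(Poisson(c) >= k) <= c**k. It holds for all c > 0 but is
# only informative when c < 1, matching its use with c = n * nu[n*gamma, inf),
# which vanishes as n grows.
for c in (0.05, 0.2, 0.8):
    for k in (1, 2, 5):
        assert poisson_tail(c, k) <= c ** k
print("bound (6.14) verified on the sampled grid")
```

The derivation in (6.14) only uses (jk)!j!(j-k)!\leq j! for jkj\geq k, so no Chernoff-type argument is needed.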

Proof of Claim (6.12).

Again, we prove the claim for all γ(0,b)\gamma\in(0,b). By the definition of ZnZ_{n} in (3.8),

Zn,1=Zn𝐈Bnγ=m=0τY^nm𝐈EnBnγY^nm1𝐈EnBnγ𝐏(τm).\displaystyle Z_{n,1}=Z_{n}\mathbf{I}_{B^{\gamma}_{n}}=\sum_{m=0}^{\tau}\frac{\hat{Y}^{m}_{n}\mathbf{I}_{E_{n}\cap B^{\gamma}_{n}}-\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}\cap B^{\gamma}_{n}}}{\mathbf{P}(\tau\geq m)}.

Meanwhile, by the definition of BnγB^{\gamma}_{n}, we have 𝐈Bnγ=0\mathbf{I}_{B^{\gamma}_{n}}=0 on {𝒟(J¯n)<l}\{\mathcal{D}(\bar{J}_{n})<l^{*}\}, where 𝒟(ξ)\mathcal{D}(\xi) counts the number of discontinuities for any ξ𝔻\xi\in\mathbb{D}. By applying Result 4 under the choice of Ym=Y^nm𝐈EnBnγY_{m}=\hat{Y}^{m}_{n}\mathbf{I}_{E_{n}\cap B^{\gamma}_{n}} and Y=Yn𝐈EnBnγ,Y=Y^{*}_{n}\mathbf{I}_{E_{n}\cap B^{\gamma}_{n}}, we obtain

𝐄Zn,12\displaystyle\mathbf{E}Z^{2}_{n,1} m1𝐄[|Yn𝐈EnBnγY^nm1𝐈EnBnγ|2]𝐏(τm)\displaystyle\leq\sum_{m\geq 1}\frac{\mathbf{E}\Big{[}\big{|}Y^{*}_{n}\mathbf{I}_{E_{n}\cap B^{\gamma}_{n}}-\hat{Y}^{m-1}_{n}\mathbf{I}_{E_{n}\cap B^{\gamma}_{n}}\big{|}^{2}\Big{]}}{\mathbf{P}(\tau\geq m)}
m1kl𝐄[𝐈(YnY^nm1)|{𝒟(J¯n)=k}]𝐏(τm)𝐏(𝒟(J¯n)=k)due to 𝐈Bnγ=0 on {𝒟(J¯n)<l}\displaystyle\leq\sum_{m\geq 1}\sum_{k\geq l^{*}}\frac{\mathbf{E}\Big{[}\mathbf{I}\big{(}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n}\big{)}\ \Big{|}\ \{\mathcal{D}(\bar{J}_{n})=k\}\Big{]}}{\mathbf{P}(\tau\geq m)}\cdot\mathbf{P}\big{(}\mathcal{D}(\bar{J}_{n})=k\big{)}\ \ \text{due to }\mathbf{I}_{B^{\gamma}_{n}}=0\text{ on }\{\mathcal{D}(\bar{J}_{n})<l^{*}\}
kl𝐏(𝒟(J¯n)=k)m1𝐏(YnY^nm1|{𝒟(J¯n)=k})𝐏(τm)\displaystyle\leq\sum_{k\geq l^{*}}\mathbf{P}\big{(}\mathcal{D}(\bar{J}_{n})=k\big{)}\cdot\sum_{m\geq 1}\frac{\mathbf{P}\Big{(}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n}\ \Big{|}\ \{\mathcal{D}(\bar{J}_{n})=k\}\Big{)}}{\mathbf{P}(\tau\geq m)}
kl𝐏(𝒟(J¯n)=k)[m=1m¯1ρm1+mm¯+1C0ρ0m1(k+1)ρm1]by condition (3.9)\displaystyle\leq\sum_{k\geq l^{*}}\mathbf{P}\big{(}\mathcal{D}(\bar{J}_{n})=k\big{)}\cdot\bigg{[}\sum_{m=1}^{\bar{m}}\frac{1}{\rho^{m-1}}+\sum_{m\geq\bar{m}+1}\frac{C_{0}\rho_{0}^{m-1}\cdot(k+1)}{\rho^{m-1}}\bigg{]}\qquad\text{by condition \eqref{condition 1, proposition: design of Zn}}
kl𝐏(𝒟(J¯n)=k)(k+1)[m=1m¯1ρm1+mm¯+1C0ρ0m1ρm1\ensurestackMath\stackon[1pt]=ΔC~ρ,1].\displaystyle\leq\sum_{k\geq l^{*}}\mathbf{P}\big{(}\mathcal{D}(\bar{J}_{n})=k\big{)}\cdot(k+1)\cdot\bigg{[}\underbrace{\sum_{m=1}^{\bar{m}}\frac{1}{\rho^{m-1}}+\sum_{m\geq\bar{m}+1}\frac{C_{0}\rho_{0}^{m-1}}{\rho^{m-1}}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\widetilde{C}_{\rho,1}}\bigg{]}. (6.15)

In particular, given ρ(ρ0,1)\rho\in(\rho_{0},1), we have C~ρ,1<\widetilde{C}_{\rho,1}<\infty, and hence

𝐄Zn,12\displaystyle\mathbf{E}Z^{2}_{n,1} C~ρ,1kl(k+1)𝐏(𝒟(J¯n)=k)\displaystyle\leq\widetilde{C}_{\rho,1}\sum_{k\geq l^{*}}(k+1)\cdot\mathbf{P}\big{(}\mathcal{D}(\bar{J}_{n})=k\big{)}
=C~ρ,1kl(k+1)exp(nν[nγ,))(nν[nγ,))kk!\displaystyle=\widetilde{C}_{\rho,1}\sum_{k\geq l^{*}}(k+1)\cdot\exp\big{(}-n\nu[n\gamma,\infty)\big{)}\frac{\big{(}n\nu[n\gamma,\infty)\big{)}^{k}}{k!}
2C~ρ,1klkexp(nν[nγ,))(nν[nγ,))kk! due to l1k+1k2kl\displaystyle\leq 2\widetilde{C}_{\rho,1}\sum_{k\geq l^{*}}k\cdot\exp\big{(}-n\nu[n\gamma,\infty)\big{)}\frac{\big{(}n\nu[n\gamma,\infty)\big{)}^{k}}{k!}\qquad\text{ due to }l^{*}\geq 1\ \Longrightarrow\ \frac{k+1}{k}\leq 2\ \forall k\geq l^{*}
2C~ρ,1(nν[nγ,))lklexp(nν[nγ,))(nν[nγ,))kl(kl)! due to l1\displaystyle\leq 2\widetilde{C}_{\rho,1}\cdot\big{(}n\nu[n\gamma,\infty)\big{)}^{l^{*}}\sum_{k\geq l^{*}}\exp\big{(}-n\nu[n\gamma,\infty)\big{)}\frac{\big{(}n\nu[n\gamma,\infty)\big{)}^{k-l^{*}}}{(k-l^{*})!}\qquad\text{ due to }l^{*}\geq 1
=2C~ρ,1(nν[nγ,))l.\displaystyle=2\widetilde{C}_{\rho,1}\cdot\big{(}n\nu[n\gamma,\infty)\big{)}^{l^{*}}.

Again, the regularly varying nature of ν[x,)\nu[x,\infty) allows us to conclude that 𝐄Zn,12=𝑶((nν[n,))l)\mathbf{E}Z^{2}_{n,1}=\bm{O}\big{(}(n\nu[n,\infty))^{l^{*}}\big{)}.

Proof of Claim (6.13).

Fix some ρ(ρ0,1)\rho\in(\rho_{0},1) and some q>1q>1 such that ρ01/q<ρ\rho_{0}^{1/q}<\rho. Let p>1p>1 be such that 1p+1q=1\frac{1}{p}+\frac{1}{q}=1. By Assumption 3, we can pick some Δ0>0\Delta_{0}>0 small enough such that aΔ0>(l1)ba-\Delta_{0}>(l^{*}-1)b. This allows us to pick γ¯(0,b)\bar{\gamma}\in(0,b) small enough such that (J^+l1)/p>2l(\hat{J}+l^{*}-1)/p>2l^{*} where

J^\ensurestackMath\stackon[1pt]=ΔaΔ0(l1)bγ¯.\displaystyle\hat{J}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\frac{a-\Delta_{0}-(l^{*}-1)b}{\bar{\gamma}}. (6.16)

We prove the claim for all γ(0,γ¯)\gamma\in(0,\bar{\gamma}). Specifically, given any γ(0,γ¯)\gamma\in(0,\bar{\gamma}), one can pick Δ(0,Δ0)\Delta\in(0,\Delta_{0}) such that [aΔ(l1)b]/γ[a-\Delta-(l^{*}-1)b]/\gamma\notin\mathbb{Z}. Due to our choice of γ\gamma and Δ\Delta, it follows from (6.16) that (Jγ+l1)/p>2l(J_{\gamma}+l^{*}-1)/p>2l^{*} where

Jγ\ensurestackMath\stackon[1pt]=ΔaΔ(l1)bγ.\displaystyle J_{\gamma}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\lceil\frac{a-\Delta-(l^{*}-1)b}{\gamma}\rceil.

Let AΔ={ξ𝔻:supt[0,1]ξ(t)aΔ}A^{\Delta}=\{\xi\in\mathbb{D}:\sup_{t\in[0,1]}\xi(t)\geq a-\Delta\} and AnΔ={X¯nAΔ}A^{\Delta}_{n}=\{\bar{X}_{n}\in A^{\Delta}\}. Also, note that

Zn,2\displaystyle Z_{n,2} =Zn𝐈(Bnγ)c=Zn𝐈AnΔ(Bnγ)c\ensurestackMath\stackon[1pt]=ΔZn,3+Zn𝐈(AnΔ)c(Bnγ)c\ensurestackMath\stackon[1pt]=ΔZn,4.\displaystyle={Z_{n}\mathbf{I}_{(B^{\gamma}_{n})^{c}}}=\underbrace{Z_{n}\mathbf{I}_{A^{\Delta}_{n}\cap(B^{\gamma}_{n})^{c}}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}Z_{n,3}}+\underbrace{Z_{n}\mathbf{I}_{(A^{\Delta}_{n})^{c}\cap(B^{\gamma}_{n})^{c}}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}Z_{n,4}}.

Specifically, Zn,3=m=0τY^nm𝐈AnΔEn(Bnγ)cY^nm1𝐈AnΔEn(Bnγ)c𝐏(τm)Z_{n,3}=\sum_{m=0}^{\tau}\frac{\hat{Y}^{m}_{n}\mathbf{I}_{A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}-\hat{Y}^{m-1}_{n}\mathbf{I}_{A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}}{\mathbf{P}(\tau\geq m)}. Analogous to the calculations in (6.15), by applying Result 4 under the choice of Ym=Y^nm𝐈AnΔEn(Bnγ)cY_{m}=\hat{Y}^{m}_{n}\mathbf{I}_{A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}} and Y=Yn𝐈AnΔEn(Bnγ)c,Y=Y^{*}_{n}\mathbf{I}_{A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}, we obtain

𝐄Zn,32\displaystyle\mathbf{E}Z_{n,3}^{2} m1𝐄[|YnY^nm1|2𝐈AnΔEn(Bnγ)c]𝐏(τm)\displaystyle\leq\sum_{m\geq 1}\frac{\mathbf{E}\Big{[}\big{|}Y^{*}_{n}-\hat{Y}^{m-1}_{n}\big{|}^{2}\mathbf{I}_{A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}\Big{]}}{\mathbf{P}(\tau\geq m)}
=m1𝐄[𝐈(YnY^nm1)𝐈AnΔEn(Bnγ)c]𝐏(τm)because Y^nm and Yn only take values in {0,1}\displaystyle=\sum_{m\geq 1}\frac{\mathbf{E}\Big{[}\mathbf{I}\big{(}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n}\big{)}\cdot\mathbf{I}_{A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}\Big{]}}{\mathbf{P}(\tau\geq m)}\qquad\text{because }\hat{Y}^{m}_{n}\text{ and }Y^{*}_{n}\text{ only take values in }\{0,1\}
m1(𝐏(YnY^nm1))1/q(𝐏(AnΔEn(Bnγ)c))1/p𝐏(τm) by Hölder’s inequality.\displaystyle\leq\sum_{m\geq 1}\frac{\Big{(}\mathbf{P}\big{(}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n}\big{)}\Big{)}^{1/q}\cdot\Big{(}\mathbf{P}\big{(}A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c}\big{)}\Big{)}^{1/p}}{\mathbf{P}(\tau\geq m)}\qquad\text{ by Hölder's inequality}.

Applying Lemma 6.2, we get (𝐏(AnΔEn(Bnγ)c))1/p=𝒐((nν[n,))2l).\big{(}\mathbf{P}(A^{\Delta}_{n}\cap E_{n}\cap(B^{\gamma}_{n})^{c})\big{)}^{1/p}=\bm{o}\big{(}(n\nu[n,\infty))^{2l^{*}}\big{)}. On the other hand, it has been shown in (6.8) that for any n1n\geq 1 and mm¯m\geq\bar{m}, we have 𝐏(YnY^nm)C0Cγρ0m\mathbf{P}(Y^{*}_{n}\neq\hat{Y}^{m}_{n})\leq C_{0}C_{\gamma}\rho^{m}_{0} where Cγ\ensurestackMath\stackon[1pt]=Δsupn1nν[nγ,)+1<.C_{\gamma}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sup_{n\geq 1}n\nu[n\gamma,\infty)+1<\infty. In summary,

𝐄Zn,32\displaystyle\mathbf{E}Z_{n,3}^{2} 𝒐((nν[n,))2l)[m=1m¯1ρm1+mm¯+1(C0Cγ)1/q(ρ01/q)m1ρm1\ensurestackMath\stackon[1pt]=ΔC~ρ,2].\displaystyle\leq\bm{o}\big{(}(n\nu[n,\infty))^{2l^{*}}\big{)}\cdot\bigg{[}\underbrace{\sum_{m=1}^{\bar{m}}\frac{1}{\rho^{m-1}}+\sum_{m\geq\bar{m}+1}\frac{(C_{0}C_{\gamma})^{1/q}\cdot(\rho_{0}^{1/q})^{m-1}}{\rho^{m-1}}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\widetilde{C}_{\rho,2}}\bigg{]}. (6.17)

Note that C~ρ,2<\widetilde{C}_{\rho,2}<\infty due to our choice of ρ01/q<ρ\rho_{0}^{1/q}<\rho.

Similarly, to bound the second-order moment of Zn,4=m=0τY^nm𝐈(AnΔ)cEn(Bnγ)cY^nm1𝐈(AnΔ)cEn(Bnγ)c𝐏(τm)Z_{n,4}=\sum_{m=0}^{\tau}\frac{\hat{Y}^{m}_{n}\mathbf{I}_{(A^{\Delta}_{n})^{c}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}-\hat{Y}^{m-1}_{n}\mathbf{I}_{(A^{\Delta}_{n})^{c}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}}{\mathbf{P}(\tau\geq m)}, we apply Result 4 again and get

𝐄Zn,42\displaystyle\mathbf{E}Z^{2}_{n,4} m1𝐄[|YnY^nm1|2𝐈(AnΔ)cEn(Bnγ)c]𝐏(τm)\displaystyle\leq\sum_{m\geq 1}\frac{\mathbf{E}\Big{[}\big{|}Y^{*}_{n}-\hat{Y}^{m-1}_{n}\big{|}^{2}\mathbf{I}_{(A^{\Delta}_{n})^{c}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}\Big{]}}{\mathbf{P}(\tau\geq m)}
=m1𝐄[𝐈(YnY^nm1)𝐈(AnΔ)cEn(Bnγ)c]𝐏(τm)because Y^nm and Yn only take values in {0,1}\displaystyle=\sum_{m\geq 1}\frac{\mathbf{E}\Big{[}\mathbf{I}\big{(}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n}\big{)}\cdot\mathbf{I}_{(A^{\Delta}_{n})^{c}\cap E_{n}\cap(B^{\gamma}_{n})^{c}}\Big{]}}{\mathbf{P}(\tau\geq m)}\qquad\text{because }\hat{Y}^{m}_{n}\text{ and }Y^{*}_{n}\text{ only take values in }\{0,1\}
m1𝐏({YnY^nm1,X¯nAΔ}(Bnγ)c)𝐏(τm) due to AnΔ={X¯nAΔ}\displaystyle\leq\sum_{m\geq 1}\frac{\mathbf{P}\Big{(}\big{\{}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n},\ \bar{X}_{n}\notin A^{\Delta}\big{\}}\cap(B^{\gamma}_{n})^{c}\Big{)}}{\mathbf{P}(\tau\geq m)}\qquad\text{ due to }A^{\Delta}_{n}=\{\bar{X}_{n}\in A^{\Delta}\}
=m1𝐏({YnY^nm1,X¯nAΔ}{𝒟(J¯n)<l})𝐏(τm)due to Bnγ={𝒟(J¯n)l}\displaystyle=\sum_{m\geq 1}\frac{\mathbf{P}\Big{(}\big{\{}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n},\ \bar{X}_{n}\notin A^{\Delta}\big{\}}\cap\{\mathcal{D}(\bar{J}_{n})<l^{*}\}\Big{)}}{\mathbf{P}(\tau\geq m)}\qquad\text{due to }B^{\gamma}_{n}=\{\mathcal{D}(\bar{J}_{n})\geq l^{*}\}
=m1k=0l1𝐏(YnY^nm1,X¯nAΔ|{𝒟(J¯n)=k})𝐏(τm)𝐏(𝒟(J¯n)=k)\displaystyle=\sum_{m\geq 1}\sum_{k=0}^{l^{*}-1}\frac{\mathbf{P}\big{(}Y^{*}_{n}\neq\hat{Y}^{m-1}_{n},\ \bar{X}_{n}\notin A^{\Delta}\ \big{|}\ \{\mathcal{D}(\bar{J}_{n})=k\}\big{)}}{\mathbf{P}(\tau\geq m)}\cdot\mathbf{P}\big{(}\mathcal{D}(\bar{J}_{n})=k\big{)}
m1k=0l1C0ρ0m1Δ2nμρm1 due to (3.10)\displaystyle\leq\sum_{m\geq 1}\sum_{k=0}^{l^{*}-1}\frac{C_{0}\rho^{m-1}_{0}}{\Delta^{2}n^{\mu}\cdot\rho^{m-1}}\qquad\text{ due to \eqref{condition 2, proposition: design of Zn}}
=lm1C0ρ0m1Δ2nμρm1=C0lΔ2(1ρ0ρ)1nμ=𝒐((nν[n,))2l).\displaystyle=l^{*}\sum_{m\geq 1}\frac{C_{0}\rho^{m-1}_{0}}{\Delta^{2}n^{\mu}\cdot\rho^{m-1}}=\frac{C_{0}l^{*}}{\Delta^{2}\cdot(1-\frac{\rho_{0}}{\rho})}\cdot\frac{1}{n^{\mu}}=\bm{o}\Big{(}\big{(}n\nu[n,\infty)\big{)}^{2l^{*}}\Big{)}. (6.18)

The last equality follows from the condition μ>2l(α1)\mu>2l^{*}(\alpha-1) prescribed in Proposition 3.1 and the fact that nν[n,)𝒱(α1)(n)n\nu[n,\infty)\in\mathcal{RV}_{-(\alpha-1)}(n) as nn\to\infty. Combining (6.17) and (6.18) with the elementary bound (x+y)22x2+2y2(x+y)^{2}\leq 2x^{2}+2y^{2}, we obtain 𝐄Zn,222𝐄Zn,32+2𝐄Zn,42=𝒐((nν[n,))2l)\mathbf{E}Z^{2}_{n,2}\leq 2\mathbf{E}Z^{2}_{n,3}+2\mathbf{E}Z^{2}_{n,4}=\bm{o}\big{(}(n\nu[n,\infty))^{2l^{*}}\big{)} and conclude the proof of (6.13). ∎

6.2 Proof of Theorems 3.2 and 3.3

We stress again that Theorem 3.2 follows directly from Theorem 3.3 with κ=0\kappa=0 (i.e., by disabling the ARA step in Algorithm 3). We devote the remainder of this section to proving Theorem 3.3.

Throughout Section 6.2, we fix the following constants and parameters. First, let β[0,2)\beta\in[0,2) be the Blumenthal-Getoor index of X(t)X(t) and α>1\alpha>1 be the regularly varying index of ν[x,)\nu[x,\infty); see Assumption 1. Fix some

β+(β,2),μ>2l(α1).\displaystyle\beta_{+}\in(\beta,2),\qquad\mu>2l^{*}(\alpha-1). (6.19)

This allows us to pick d,rd,r large enough such that

r(2β+)>max{2,μ1},d>max{2,2μ1}\displaystyle r(2-\beta_{+})>\max\{2,\mu-1\},\qquad d>\max\{2,2\mu-1\} (6.20)

for dd in (3.26) and rr in (3.20). Let λ>0\lambda>0 be the constant in Assumption 2. Choose

α3(0,1λ),α4(0,12λ).\displaystyle\alpha_{3}\in(0,\frac{1}{\lambda}),\qquad\alpha_{4}\in(0,\frac{1}{2\lambda}). (6.21)

Next, fix

α2(0,α321).\displaystyle\alpha_{2}\in(0,\frac{\alpha_{3}}{2}\wedge 1). (6.22)

Based on the chosen value of α2\alpha_{2}, fix

α1(0,α2λ).\displaystyle\alpha_{1}\in(0,\frac{\alpha_{2}}{\lambda}). (6.23)

Pick

δ(1/2,1).\displaystyle\delta\in(1/\sqrt{2},1). (6.24)

Since we require α2\alpha_{2} to be strictly less than 11, there is some integer m¯\bar{m} such that

δmα2δmδmα22 and δmα2<amm¯\displaystyle\delta^{m\alpha_{2}}-\delta^{m}\geq\frac{\delta^{m\alpha_{2}}}{2}\text{ and }\delta^{m\alpha_{2}}<a\qquad\forall m\geq\bar{m} (6.25)

where a>0a>0 is the parameter in set AA; see Assumption 3. Based on the values of δ\delta and β+\beta_{+}, it holds for all κ[0,1)\kappa\in[0,1) small enough that

κ2β+<12<δ2.\displaystyle\kappa^{2-\beta_{+}}<\frac{1}{2}<\delta^{2}. (6.26)

Then, based on all previous choices, the following inequalities hold for all ρ1(0,1)\rho_{1}\in(0,1) sufficiently close to 1:

δα1\displaystyle\delta^{\alpha_{1}} <ρ1,\displaystyle<\rho_{1}, (6.27)
κ2β+δ2\displaystyle\frac{\kappa^{2-\beta_{+}}}{\delta^{2}} <ρ1\displaystyle<\rho_{1} (6.28)
12δ\displaystyle\frac{1}{\sqrt{2}\delta} <ρ1\displaystyle<\rho_{1} (6.29)
δα2λα1\displaystyle\delta^{\alpha_{2}-\lambda\alpha_{1}} <ρ1\displaystyle<\rho_{1} (6.30)
δ1λα3\displaystyle\delta^{1-\lambda\alpha_{3}} <ρ1\displaystyle<\rho_{1} (6.31)
δα2+α32\displaystyle\delta^{-\alpha_{2}+\frac{\alpha_{3}}{2}} <ρ1,\displaystyle<\rho_{1}, (6.32)
(1/2)κ2β+\displaystyle(1/\sqrt{2})\vee\kappa^{2-\beta_{+}} <ρ1.\displaystyle<\rho_{1}. (6.33)

Lastly, pick ρ0(ρ1,1)\rho_{0}\in(\rho_{1},1). By picking a larger m¯\bar{m} if necessary, we can ensure that

m2ρ1mρ0mmm¯.\displaystyle m^{2}\rho^{m}_{1}\leq\rho^{m}_{0}\qquad\forall m\geq\bar{m}. (6.34)
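The existence of such an m̄ in (6.34) is immediate, since m²(ρ1/ρ0)^m → 0 as m → ∞ and the log-ratio 2 log m + m log(ρ1/ρ0) is eventually decreasing. As a concrete illustration, the following sketch locates the smallest admissible m̄ for the hypothetical values ρ1 = 0.8 and ρ0 = 0.9 (these numbers are ours, chosen only for demonstration):

```python
def find_m_bar(rho1, rho0, m_max=10_000):
    """Smallest m_bar with m^2 * rho1^m <= rho0^m for every m >= m_bar.

    Such an m_bar exists because m^2 * (rho1 / rho0)^m -> 0 as m grows,
    and the log-ratio 2*log(m) + m*log(rho1 / rho0) is eventually
    decreasing, so it suffices to scan for the last violation.
    """
    assert 0 < rho1 < rho0 < 1
    last_violation = 0
    for m in range(1, m_max + 1):
        if m * m * rho1**m > rho0**m:
            last_violation = m
    return last_violation + 1

# Hypothetical parameter values (not from the paper): rho1 = 0.8, rho0 = 0.9.
m_bar = find_m_bar(0.8, 0.9)
```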

Next, we make a few observations. Given some non-negative integer kk, let

ζk(t)=i=1kzi𝐈[ui,n](t)\displaystyle\zeta_{k}(t)=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}(t) (6.35)

where 0<u1<u2<<uk<n0<u_{1}<u_{2}<\ldots<u_{k}<n are the order statistics of kk iid samples of Unif(0,n)(0,n), and ziz_{i}’s are iid samples from ν([nγ,))/ν[nγ,)\nu(\cdot\cap[n\gamma,\infty))/\nu[n\gamma,\infty). We adopt the convention that u00u_{0}\equiv 0 and uk+1nu_{k+1}\equiv n. Note that when k=0k=0, we set ζ0(t)0\zeta_{0}(t)\equiv 0 as the zero function, and set I1=[0,n]I_{1}=[0,n], u0=0u_{0}=0, and u1=nu_{1}=n.
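To make the construction of ζ_k concrete, the sketch below samples one realization under the illustrative assumption ν[x,∞) = x^{−α}, so that a jump conditioned to be at least nγ is a scaled Pareto variable; the parameter values α = 1.5 and γ = 0.3 are hypothetical and not part of the paper's setup.

```python
import random

def sample_zeta_k(k, n, alpha=1.5, gamma=0.3, seed=0):
    """Sample the step function zeta_k of (6.35) on [0, n].

    Illustrative assumption (ours, not the paper's): nu[x, inf) = x^{-alpha},
    so a jump size conditioned on being at least n*gamma is a scaled Pareto
    variable, z = n*gamma * U^{-1/alpha} with U ~ Unif(0, 1].
    """
    rng = random.Random(seed)
    # Jump times: order statistics of k iid Unif(0, n) samples.
    u = sorted(rng.uniform(0.0, n) for _ in range(k))
    # Jump sizes: iid from the restriction of nu to [n*gamma, inf), normalized.
    z = [n * gamma * (1.0 - rng.random()) ** (-1.0 / alpha) for _ in range(k)]

    def zeta(t):
        # zeta_k(t) = sum_i z_i * I_{[u_i, n]}(t): all jumps at or before t.
        return sum(zi for ui, zi in zip(u, z) if ui <= t)

    return u, z, zeta

u, z, zeta = sample_zeta_k(k=3, n=10)
```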

For Yn()Y^{*}_{n}(\cdot) defined in (3.14) and Y^nm()\hat{Y}^{m}_{n}(\cdot) defined in (3.27), note that

Yn(ζk)=maxi[k+1]𝐈{Wn(i),(ζk)na},Y^nm(ζk)=maxi[k+1]𝐈{W^n(i),m(ζk)na}\displaystyle Y^{*}_{n}(\zeta_{k})=\max_{i\in[k+1]}\mathbf{I}\{W^{(i),*}_{n}(\zeta_{k})\geq na\},\qquad\hat{Y}^{m}_{n}(\zeta_{k})=\max_{i\in[k+1]}\mathbf{I}\{\hat{W}^{(i),m}_{n}(\zeta_{k})\geq na\} (6.36)

where

Wn(i),(ζk)\displaystyle W^{(i),*}_{n}(\zeta_{k}) \ensurestackMath\stackon[1pt]=Δq=1i1j0ξj(q)+q=1i1zq+j1(ξj(i))+,\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}+\sum_{j\geq 1}(\xi^{(i)}_{j})^{+}, (6.37)
W^n(i),m(ζk)\displaystyle\hat{W}^{(i),m}_{n}(\zeta_{k}) \ensurestackMath\stackon[1pt]=Δq=1i1j0ξj(q),m+q=1i1zq+j=1m+log2(nd)(ξj(i),m)+.\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j}+\sum_{q=1}^{i-1}z_{q}+\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}(\xi^{(i),m}_{j})^{+}. (6.38)

See (3.15)–(3.17) and (3.24) for the definitions of ξj(i)\xi^{(i)}_{j}’s and ξj(i),m\xi^{(i),m}_{j}’s, respectively. Also, define

W~n(i),m(ζk)\displaystyle\widetilde{W}^{(i),m}_{n}(\zeta_{k}) \ensurestackMath\stackon[1pt]=Δq=1i1j0ξj(q)+q=1i1zq+j=1m+log2(nd)(ξj(i))+.\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}+\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}(\xi^{(i)}_{j})^{+}. (6.39)

As intermediate steps in the proof of Theorem 3.3, we present the following two results. Proposition 6.3 states that, using W~n(i),m(ζk)\widetilde{W}^{(i),m}_{n}(\zeta_{k}) as an anchor, Wn(i),(ζk)W^{(i),*}_{n}(\zeta_{k}) and W^n(i),m(ζk)\hat{W}^{(i),m}_{n}(\zeta_{k}) stay close to each other with high probability, and increasingly so as mm grows. Proposition 6.4 then shows that the law of W~n(i),m(ζk)\widetilde{W}^{(i),m}_{n}(\zeta_{k}) does not concentrate around any yy\in\mathbb{R} bounded away from 0.

Proposition 6.3.

There exists some constant C1(0,)C_{1}\in(0,\infty) such that the inequality

𝐏(|Wn(i),(ζk)W~n(i),m(ζk)||W^n(i),m(ζk)W~n(i),m(ζk)|>x)C1κm(2β+)x2nr(2β+)1+C1x1nd12m\displaystyle\mathbf{P}\bigg{(}\Big{|}W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\vee\Big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}>x\bigg{)}\leq\frac{C_{1}\kappa^{m(2-\beta_{+})}}{x^{2}\cdot n^{r(2-\beta_{+})-1}}+\frac{C_{1}}{x}\sqrt{\frac{1}{n^{d-1}\cdot 2^{m}}}

holds for any kk\in\mathbb{N}, i[k+1]i\in[k+1], n1n\geq 1, mm\in\mathbb{N}, and x>0x>0.

Proposition 6.4.

There exists some constant C2(0,)C_{2}\in(0,\infty) such that the inequality

𝐏(W~n(i),m(ζk)[yδmn,y+δmn] for some i[k+1])(k+1)C2ρ0m\displaystyle\mathbf{P}\bigg{(}\widetilde{W}^{(i),m}_{n}(\zeta_{k})\in\bigg{[}y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}\bigg{]}\text{ for some }i\in[k+1]\bigg{)}\leq(k+1)\cdot C_{2}\rho^{m}_{0}

holds for any kk\in\mathbb{N}, n1n\geq 1, mm¯m\geq\bar{m}, and y>δmα2y>\delta^{m\alpha_{2}}.

First, equipped with Propositions 6.3 and 6.4, we are able to prove the main result of Section 3.6, i.e., Theorem 3.3.

Proof of Theorem 3.3.

In light of Proposition 3.1, it suffices to verify conditions (3.9) and (3.10).

Verification of (3.9).

Conditioning on {𝒟(J¯n)=k}\{\mathcal{D}(\bar{J}_{n})=k\}, the conditional law of Jn={Jn(t):t[0,n]}J_{n}=\{J_{n}(t):\ t\in[0,n]\} is the same as the law of the process ζk\zeta_{k} specified in (6.35). This implies

𝐏(Yn(Jn)Y^nm(Jn)|𝒟(J¯n)=k)\displaystyle\mathbf{P}\big{(}Y^{*}_{n}(J_{n})\neq\hat{Y}^{m}_{n}(J_{n})\ \big{|}\ \mathcal{D}(\bar{J}_{n})=k\big{)} =𝐏(Yn(ζk)Y^nm(ζk)).\displaystyle=\mathbf{P}\big{(}Y^{*}_{n}(\zeta_{k})\neq\hat{Y}^{m}_{n}(\zeta_{k})\big{)}.

Next, on event

i[k+1]({|W(i),n(ζk)W~(i),mn(ζk)||W^(i),mn(ζk)W~(i),mn(ζk)|δmn}\displaystyle\bigcap_{i\in[k+1]}\Bigg{(}\bigg{\{}\Big{|}W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\vee\Big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\leq\frac{\delta^{m}}{\sqrt{n}}\bigg{\}}
{W~(i),mn(ζk)[naδmn,na+δmn]}),\displaystyle\qquad\qquad\qquad\qquad\qquad\qquad\qquad\qquad\cap\bigg{\{}\widetilde{W}^{(i),m}_{n}(\zeta_{k})\notin\bigg{[}na-\frac{\delta^{m}}{\sqrt{n}},na+\frac{\delta^{m}}{\sqrt{n}}\bigg{]}\bigg{\}}\Bigg{)},

we must have (for any i[k+1]i\in[k+1])

W(i),n(ζk)W^(i),mn(ζk)<na or W(i),n(ζk)W^(i),mn(ζk)>na.\displaystyle W^{(i),*}_{n}(\zeta_{k})\vee\hat{W}^{(i),m}_{n}(\zeta_{k})<na\qquad\text{ or }\qquad W^{(i),*}_{n}(\zeta_{k})\wedge\hat{W}^{(i),m}_{n}(\zeta_{k})>na.

It then follows from (6.36) that, on this event, we have Yn(ζk)=Y^mn(ζk)Y^{*}_{n}(\zeta_{k})=\hat{Y}^{m}_{n}(\zeta_{k}). Therefore,

𝐏(Yn(ζk)Y^mn(ζk))\displaystyle\mathbf{P}\big{(}Y^{*}_{n}(\zeta_{k})\neq\hat{Y}^{m}_{n}(\zeta_{k})\big{)} (6.40)
i[k+1]𝐏(|W(i),n(ζk)W~(i),mn(ζk)||W^(i),mn(ζk)W~(i),mn(ζk)|>δmn)\displaystyle\leq\sum_{i\in[k+1]}\mathbf{P}\bigg{(}\Big{|}W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\vee\Big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}>\frac{\delta^{m}}{\sqrt{n}}\bigg{)}
+𝐏(W~(i),mn(ζk)[naδmn,na+δmn] for some i[k+1]).\displaystyle+\mathbf{P}\bigg{(}\widetilde{W}^{(i),m}_{n}(\zeta_{k})\in\bigg{[}na-\frac{\delta^{m}}{\sqrt{n}},na+\frac{\delta^{m}}{\sqrt{n}}\bigg{]}\text{ for some }i\in[k+1]\bigg{)}.

Applying Proposition 6.3 (with x=δm/n)x=\delta^{m}/\sqrt{n}), we get (for any i[k+1]i\in[k+1])

𝐏(|W(i),n(ζk)W~(i),mn(ζk)||W^(i),mn(ζk)W~(i),mn(ζk)|>δmn)\displaystyle\mathbf{P}\bigg{(}\Big{|}W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\vee\Big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}>\frac{\delta^{m}}{\sqrt{n}}\bigg{)}
C1[κm(2β+)nδ2mnr(2β+)1+n(2δ)mnd1]\displaystyle\leq C_{1}\cdot\Bigg{[}\frac{\kappa^{m(2-\beta_{+})}\cdot n}{\delta^{2m}\cdot n^{r(2-\beta_{+})-1}}+\frac{\sqrt{n}}{(\sqrt{2}\delta)^{m}\cdot\sqrt{n^{d-1}}}\Bigg{]}
=C1[(κ2β+δ2)m1nr(2β+)2+(12δ)m1nd2]\displaystyle=C_{1}\cdot\Bigg{[}\bigg{(}\frac{\kappa^{2-\beta_{+}}}{\delta^{2}}\bigg{)}^{m}\cdot\frac{1}{n^{r(2-\beta_{+})-2}}+\bigg{(}\frac{1}{\sqrt{2}\delta}\bigg{)}^{m}\cdot\sqrt{\frac{1}{n^{d-2}}}\ \Bigg{]}
C1[(κ2β+δ2)m+(12δ)m]due to the choices of d and r in (6.20)\displaystyle\leq C_{1}\cdot\Bigg{[}\bigg{(}\frac{\kappa^{2-\beta_{+}}}{\delta^{2}}\bigg{)}^{m}+\bigg{(}\frac{1}{\sqrt{2}\delta}\bigg{)}^{m}\Bigg{]}\qquad\text{due to the choices of $d$ and $r$ in \eqref{proof, choose d and r, proposition: hat Y m n condition 1}}
2C1ρm0due to the choices in (6.28) and (6.29), and ρ0(ρ1,1).\displaystyle\leq 2C_{1}\rho^{m}_{0}\qquad\text{due to the choices in \eqref{proofChooseRhoByKappa} and \eqref{proofChooseRhoByDelta}, and $\rho_{0}\in(\rho_{1},1)$}. (6.41)

On the other hand, due to (6.25), we have naδmα2aδmα2>0na-\delta^{m\alpha_{2}}\geq a-\delta^{m\alpha_{2}}>0 for all n1n\geq 1 and mm¯m\geq\bar{m}. This allows us to apply Proposition 6.4 (with y=nay=na) and obtain

𝐏(W~(i),mn(ζk)[naδmn,na+δmn] for some i[k+1])(k+1)C2ρm0mm¯.\displaystyle\mathbf{P}\bigg{(}\widetilde{W}^{(i),m}_{n}(\zeta_{k})\in\bigg{[}na-\frac{\delta^{m}}{\sqrt{n}},na+\frac{\delta^{m}}{\sqrt{n}}\bigg{]}\text{ for some }i\in[k+1]\bigg{)}\leq(k+1)\cdot C_{2}\rho^{m}_{0}\qquad\forall m\geq\bar{m}. (6.42)

Plugging (6.41) and (6.42) into (6.40), we conclude the proof by setting C0=2C1+C2C_{0}=2C_{1}+C_{2}.

Verification of (3.10).

Fix some Δ(0,1)\Delta\in(0,1) and k=0,1,,l1k=0,1,\ldots,l^{*}-1. Again, conditioning on {𝒟(J¯n)=k}\{\mathcal{D}(\bar{J}_{n})=k\}, the conditional law of Jn={Jn(t):t[0,n]}J_{n}=\{J_{n}(t):\ t\in[0,n]\} is the same as the law of the process ζk\zeta_{k} specified in (6.35). This implies

𝐏(Yn(Jn)Y^mn(Jn),X¯nAΔ|𝒟(J¯n)=k)\displaystyle\mathbf{P}\Big{(}Y^{*}_{n}(J_{n})\neq\hat{Y}^{m}_{n}(J_{n}),\ \bar{X}_{n}\notin A^{\Delta}\ \Big{|}\ \mathcal{D}(\bar{J}_{n})=k\Big{)}
=𝐏(Yn(Jn)Y^mn(Jn),supt[0,n]X(t)<n(aΔ)|𝒟(J¯n)=k)by definition of set AΔ\displaystyle=\mathbf{P}\Big{(}Y^{*}_{n}(J_{n})\neq\hat{Y}^{m}_{n}(J_{n}),\ \sup_{t\in[0,n]}X(t)<n(a-\Delta)\ \Big{|}\ \mathcal{D}(\bar{J}_{n})=k\Big{)}\qquad\text{by definition of set $A^{\Delta}$}
=𝐏(maxi[k+1]W^(i),mn(ζk)na,maxi[k+1]W(i),n(ζk)<n(aΔ))\displaystyle=\mathbf{P}\Big{(}\max_{i\in[k+1]}\hat{W}^{(i),m}_{n}(\zeta_{k})\geq na,\ \max_{i\in[k+1]}W^{(i),*}_{n}(\zeta_{k})<n(a-\Delta)\Big{)}
i[k+1]𝐏(|W^(i),mn(ζk)W(i),n(ζk)|>nΔ)\displaystyle\leq\sum_{i\in[k+1]}\mathbf{P}\Big{(}\big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-W^{(i),*}_{n}(\zeta_{k})\big{|}>n\Delta\Big{)}
i[k+1]𝐏(|W(i),n(ζk)W~(i),mn(ζk)||W^(i),mn(ζk)W~(i),mn(ζk)|>nΔ2)\displaystyle\leq\sum_{i\in[k+1]}\mathbf{P}\bigg{(}\Big{|}W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\vee\Big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}>\frac{n\Delta}{2}\bigg{)}
(k+1)[4C1Δ2n2κm(2β+)nr(2β+)1+2C1Δ1n1nd12m]by Proposition 6.3\displaystyle\leq(k+1)\cdot\Bigg{[}\frac{4C_{1}}{\Delta^{2}n^{2}}\cdot\frac{\kappa^{m(2-\beta_{+})}}{n^{r(2-\beta_{+})-1}}+\frac{2C_{1}}{\Delta}\cdot\frac{1}{n}\sqrt{\frac{1}{n^{d-1}\cdot 2^{m}}}\ \Bigg{]}\qquad\text{by Proposition~{}\ref{proposition, intermediate 1, strong efficiency ARA}}
=(k+1)[4C1Δ2κm(2β+)nr(2β+)+1+2C1Δ(1/2)mnd+12]\displaystyle=(k+1)\cdot\Bigg{[}\frac{4C_{1}}{\Delta^{2}}\cdot\frac{\kappa^{m(2-\beta_{+})}}{n^{r(2-\beta_{+})+1}}+\frac{2C_{1}}{\Delta}\cdot\frac{(1/\sqrt{2})^{m}}{n^{\frac{d+1}{2}}}\Bigg{]}
k+1nμ[4C1Δ2κm(2β+)+2C1Δ(1/2)m]by the choices of r and d in (6.20)\displaystyle\leq\frac{k+1}{n^{\mu}}\cdot\Bigg{[}\frac{4C_{1}}{\Delta^{2}}\cdot\kappa^{m(2-\beta_{+})}+\frac{2C_{1}}{\Delta}\cdot(1/\sqrt{2})^{m}\Bigg{]}\qquad\text{by the choices of $r$ and $d$ in \eqref{proof, choose d and r, proposition: hat Y m n condition 1}}
k+1nμ[4C1Δ2ρ0m+2C1Δρ0m]due to the choice of ρ1 in (6.33) and ρ0(ρ1,1).\displaystyle\leq\frac{k+1}{n^{\mu}}\cdot\Bigg{[}\frac{4C_{1}}{\Delta^{2}}\cdot\rho_{0}^{m}+\frac{2C_{1}}{\Delta}\cdot\rho_{0}^{m}\Bigg{]}\quad\text{due to the choice of $\rho_{1}$ in \eqref{proofChooseRhoConstantBound} and $\rho_{0}\in(\rho_{1},1)$}.

Due to Δ(0,1)\Delta\in(0,1) (and hence 1Δ<1Δ2\frac{1}{\Delta}<\frac{1}{\Delta^{2}}) and kl1k\leq l^{*}-1, we conclude the proof by setting C0=6lC1C_{0}=6l^{*}C_{1}. ∎

The rest of this section is devoted to proving Propositions 6.3 and 6.4. First, we collect a useful result.

Result 5 (Lemma 1 of [35]).

Let ν\nu be the Lévy measure of a Lévy process XX. Let I0p(ν)\ensurestackMath\stackon[1pt]=Δ(1,1)|x|pν(dx).I_{0}^{p}(\nu)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\int_{(-1,1)}|x|^{p}\nu(dx). Suppose that β<2\beta<2 for the Blumenthal-Getoor index β\ensurestackMath\stackon[1pt]=Δinf{p>0:Ip0(ν)<}\beta\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\inf\{p>0:\ I^{p}_{0}(\nu)<\infty\}. Then

(κ,κ)x2ν(dx)κ2β+Iβ+0(ν)κ(0,1],β+(β,2).\displaystyle\int_{(-\kappa,\kappa)}x^{2}\nu(dx)\leq\kappa^{2-\beta_{+}}I^{\beta_{+}}_{0}(\nu)\qquad\forall\kappa\in(0,1],\ \beta_{+}\in(\beta,2).
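The mechanism behind Result 5 is the pointwise bound x² = |x|^{2−β+}·|x|^{β+} ≤ κ^{2−β+}|x|^{β+} on (−κ,κ). The snippet below checks the inequality in closed form for the toy measure ν(dx) = |x|^{−1−β}dx on (−1,1)∖{0}, with the illustrative choices β = 1/2 and β+ = 1 (these concrete values are ours, not the paper's):

```python
# Toy Levy measure nu(dx) = |x|^(-1-beta) dx on (-1, 1) \ {0}; its
# Blumenthal-Getoor index is beta. Both sides of Result 5 have closed forms.
beta, beta_plus = 0.5, 1.0

def small_jump_second_moment(kappa):
    # integral over (-kappa, kappa) of x^2 * |x|^(-1-beta) dx
    #   = 2 * kappa^(2-beta) / (2-beta)
    return 2.0 * kappa ** (2.0 - beta) / (2.0 - beta)

# I_0^{beta_plus}(nu) = integral over (-1, 1) of |x|^{beta_plus} nu(dx)
#                     = 2 / (beta_plus - beta)
I0_beta_plus = 2.0 / (beta_plus - beta)

def result5_bound(kappa):
    # Right-hand side of Result 5: kappa^{2 - beta_plus} * I_0^{beta_plus}(nu)
    return kappa ** (2.0 - beta_plus) * I0_beta_plus
```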

Next, we prepare two lemmas regarding the expectations of the supremum of Ξn\Xi_{n} (see (3.6) for the definition) and the difference between Ξn\Xi_{n} and Ξ˘mn\breve{\Xi}^{m}_{n} (see (3.23)).

Lemma 6.5.

There exists a constant CX<C_{X}<\infty (depending only on the law of Lévy process X(t)X(t)) such that

𝐄[sups[0,t]Ξn(s)]CX(t+t)t>0,n1.\displaystyle\mathbf{E}\bigg{[}\sup_{s\in[0,t]}\Xi_{n}(s)\bigg{]}\leq C_{X}(\sqrt{t}+t)\qquad\forall t>0,\ n\geq 1.
Proof.

Recall that the generating triplet of XX is (cX,σ,ν)(c_{X},\sigma,\nu) and for the Blumenthal-Getoor index β\ensurestackMath\stackon[1pt]=Δinf{p>0:(1,1)|x|pν(dx)<}\beta\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\inf\{p>0:\int_{(-1,1)}|x|^{p}\nu(dx)<\infty\} we have β<2\beta<2; see Assumption 1. Fix some β+(1β,2)\beta_{+}\in(1\vee\beta,2) in this proof. We prove the lemma for

CX\ensurestackMath\stackon[1pt]=Δmax{|σ|2π+2Iβ+0(ν),(cX)++I1+(ν)+2I0β+(ν)}\displaystyle C_{X}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\max\Big{\{}|\sigma|\sqrt{\frac{2}{\pi}}+2\sqrt{I^{\beta_{+}}_{0}(\nu)},\ (c_{X})^{+}+I^{1}_{+}(\nu)+2I_{0}^{\beta_{+}}(\nu)\Big{\}}

where (x)+=x0(x)^{+}=x\vee 0, I1+(ν)=[1,)xν(dx)I^{1}_{+}(\nu)=\int_{[1,\infty)}x\nu(dx), and Ip0(ν)=(1,1)|x|pν(dx).I^{p}_{0}(\nu)=\int_{(-1,1)}|x|^{p}\nu(dx).

Recall that Ξn\Xi_{n} is a Lévy process with generating triplet (cX,σ,ν|(,nγ))(c_{X},\sigma,\nu|_{(-\infty,n\gamma)}). Let νn\ensurestackMath\stackon[1pt]=Δν|(,nγ)\nu_{n}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\nu|_{(-\infty,n\gamma)}. It follows from Lemma 2 of [35] (specifically, by setting t=Tt=T in equation (26)) that, for all t>0t>0 and n1n\geq 1,

𝐄sups[0,t]Ξn(s)(|σ|2π+2Iβ+0(νn))t+((cX)++I1+(νn)+2I0β+(νn))t.\displaystyle\mathbf{E}\sup_{s\in[0,t]}\Xi_{n}(s)\leq\bigg{(}|\sigma|\sqrt{\frac{2}{\pi}}+2\sqrt{I^{\beta_{+}}_{0}(\nu_{n})}\bigg{)}\sqrt{t}+\Big{(}(c_{X})^{+}+I^{1}_{+}(\nu_{n})+2I_{0}^{\beta_{+}}(\nu_{n})\Big{)}t. (6.43)

In particular, note that Iβ+0(νn)=(1,1)|x|β+νn(dx)=(1,1)(,nγ)|x|β+ν(dx)Iβ+0(ν)I^{\beta_{+}}_{0}(\nu_{n})=\int_{(-1,1)}|x|^{\beta_{+}}\nu_{n}(dx)=\int_{(-1,1)\cap(-\infty,n\gamma)}|x|^{\beta_{+}}\nu(dx)\leq I^{\beta_{+}}_{0}(\nu) and I1+(νn)=[1,)xνn(dx)=[1,)(,nγ)xν(dx)I1+(ν).I^{1}_{+}(\nu_{n})=\int_{[1,\infty)}x\nu_{n}(dx)=\int_{[1,\infty)\cap(-\infty,n\gamma)}x\nu(dx)\leq I^{1}_{+}(\nu). Plugging these two bounds into (6.43), we conclude the proof. ∎

Lemma 6.6.

There exists some C(0,)C\in(0,\infty) (only depending on the choice of β+(β,2)\beta_{+}\in(\beta,2) in (6.19) and the law of Lévy process XX) such that

𝐏(supt[0,n]|Ξn(t)Ξ˘mn(t)|>x)Cκm(2β+)x2nr(2β+)1x>0,n1,m\mathbf{P}\Big{(}\sup_{t\in[0,n]}\Big{|}\Xi_{n}(t)-\breve{\Xi}^{m}_{n}(t)\Big{|}>x\Big{)}\leq\frac{C\kappa^{m(2-\beta_{+})}}{x^{2}n^{r(2-\beta_{+})-1}}\qquad\forall x>0,\ n\geq 1,\ m\in\mathbb{N}

where rr is the parameter in the truncation threshold κn,m=κm/nr\kappa_{n,m}=\kappa^{m}/n^{r} (see (3.20)).

Proof.

From the definitions of Ξn\Xi_{n} and Ξ˘mn\breve{\Xi}^{m}_{n} in (3.21) and (3.23), respectively, we have

Ξn(t)Ξ˘mn(t)\ensurestackMath\stackon[1pt]=dX(κn,m,κn,m)(t)σ¯(κn,m)B(t)\displaystyle\Xi_{n}(t)-\breve{\Xi}^{m}_{n}(t)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}X^{(-\kappa_{n,m},\kappa_{n,m})}(t)-\bar{\sigma}(\kappa_{n,m})B(t)

where X(c,c)X^{(-c,c)} is the Lévy process with generating triplet (0,0,ν|(c,c))(0,0,\nu|_{(-c,c)}), κn,m=κm/nr,\kappa_{n,m}=\kappa^{m}/n^{r}, and BB is a standard Brownian motion independent of X(κn,m,κn,m)X^{(-\kappa_{n,m},\kappa_{n,m})}. In particular, X(κn,m,κn,m)X^{(-\kappa_{n,m},\kappa_{n,m})} is a martingale with variance var[X(κn,m,κn,m)(1)]=σ¯2(κn,m)var\big{[}X^{(-\kappa_{n,m},\kappa_{n,m})}(1)\big{]}=\bar{\sigma}^{2}(\kappa_{n,m}); see (3.22) for the definition of σ¯2()\bar{\sigma}^{2}(\cdot). Therefore,

𝐏(supt[0,n]|Ξn(t)Ξ˘mn(t)|>x)\displaystyle\mathbf{P}\Big{(}\sup_{t\in[0,n]}\Big{|}\Xi_{n}(t)-\breve{\Xi}^{m}_{n}(t)\Big{|}>x\Big{)}
1x2𝐄|X(κn,m,κn,m)(n)σ¯(κn,m)B(n)|2using Doob’s inequality\displaystyle\leq\frac{1}{x^{2}}\mathbf{E}\Big{|}X^{(-\kappa_{n,m},\kappa_{n,m})}(n)-\bar{\sigma}(\kappa_{n,m})B(n)\Big{|}^{2}\qquad\text{using Doob's inequality}
=2nx2σ¯2(κn,m)due to the independence between X(κn,m,κn,m) and B\displaystyle=\frac{2n}{x^{2}}\bar{\sigma}^{2}(\kappa_{n,m})\qquad\text{due to the independence between $X^{(-\kappa_{n,m},\kappa_{n,m})}$ and $B$}
2nx2κ2β+n,mI0β+(ν) using Result 5\displaystyle\leq\frac{2n}{x^{2}}\cdot\kappa^{2-\beta_{+}}_{n,m}I_{0}^{\beta_{+}}(\nu)\qquad\text{ using Result \ref{result: bound bar sigma}}
=2I0β+(ν)x2nκm(2β+)nr(2β+)=2I0β+(ν)x2κm(2β+)nr(2β+)1 due to κn,m=κm/nr.\displaystyle=\frac{2I_{0}^{\beta_{+}}(\nu)}{x^{2}}\cdot\frac{n\kappa^{m(2-\beta_{+})}}{n^{r(2-\beta_{+})}}=\frac{2I_{0}^{\beta_{+}}(\nu)}{x^{2}}\cdot\frac{\kappa^{m(2-\beta_{+})}}{n^{r(2-\beta_{+})-1}}\qquad\text{ due to }\kappa_{n,m}=\kappa^{m}/n^{r}.

To conclude the proof, we set C=2I0β+(ν)=2(1,1)|x|β+ν(dx)C=2I_{0}^{\beta_{+}}(\nu)=2\int_{(-1,1)}|x|^{\beta_{+}}\nu(dx). ∎

To facilitate the presentation of the next few lemmas, we consider a slightly more general version of the stick-breaking procedure described in (3.15)–(3.24) that allows for an arbitrary initial stick length. Specifically, for any l>0l>0, let

l1(l)=V1l,lj(l)=Vj(ll1(l)l2(l)lj1(l))j2,\displaystyle l_{1}(l)=V_{1}\cdot l,\qquad l_{j}(l)=V_{j}\cdot\big{(}l-l_{1}(l)-l_{2}(l)-\cdots-l_{j-1}(l)\big{)}\quad\forall j\geq 2, (6.44)

where VjV_{j}’s are iid copies of Unif(0,1)(0,1). Independent of VjV_{j}’s, for any nn and mm, let Ξn\Xi_{n} and Ξ˘mn\breve{\Xi}^{m}_{n} be Lévy processes with joint law specified in (3.21) and (3.23), respectively. Conditioning on the values of lj(l)l_{j}(l), define ξ[n]j(l),ξ[n],mj(l)\xi^{[n]}_{j}(l),\xi^{[n],m}_{j}(l) using (for all j1j\geq 1)

(ξ[n]j(l),ξ[n],0j(l),ξ[n],1j(l),ξ[n],2j(l),)=(Ξn(lj(l)),Ξ˘0n(lj(l)),Ξ˘1n(lj(l)),Ξ˘2n(lj(l)),).\displaystyle\Big{(}\xi^{[n]}_{j}(l),\xi^{[n],0}_{j}(l),\xi^{[n],1}_{j}(l),\xi^{[n],2}_{j}(l),\ldots\Big{)}=\Big{(}\Xi_{n}\big{(}l_{j}(l)\big{)},\ \breve{\Xi}^{0}_{n}\big{(}l_{j}(l)\big{)},\ \breve{\Xi}^{1}_{n}\big{(}l_{j}(l)\big{)},\ \breve{\Xi}^{2}_{n}\big{(}l_{j}(l)\big{)},\ \ldots\Big{)}. (6.45)
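The generalized stick-breaking lengths in (6.44) are straightforward to simulate. The following minimal sketch (ours, for illustration only) confirms two facts used repeatedly below: the sticks and the remainder always sum to l, and the remainder after k sticks has mean l/2^k.

```python
import random

def stick_breaking(l, num_sticks, rng):
    """Stick lengths l_j(l) of (6.44): l_1 = V_1 * l and, for j >= 2,
    l_j = V_j * (length remaining after the first j-1 sticks), with
    V_j iid Unif(0, 1). Returns (sticks, remaining_length)."""
    sticks, remaining = [], l
    for _ in range(num_sticks):
        stick = rng.random() * remaining
        sticks.append(stick)
        remaining -= stick
    return sticks, remaining

rng = random.Random(1)
sticks, rem = stick_breaking(1.0, num_sticks=10, rng=rng)

# Monte Carlo check that the remaining length after k sticks has mean
# l / 2^k (here l = 1, k = 3), the identity used in the proof of Lemma 6.8.
k, reps = 3, 20000
mean_rem = sum(stick_breaking(1.0, k, rng)[1] for _ in range(reps)) / reps
```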
Lemma 6.7.

There exists some C(0,)C\in(0,\infty) (only depending on the choice of β+(β,2)\beta_{+}\in(\beta,2) in (6.19) and the law of Lévy process XX) such that, for all mm\in\mathbb{N} and n1n\geq 1,

𝐏(|j=1m+log2(nd)(ξ[n]j(l))+j=1m+log2(nd)(ξ[n],mj(l))+|>y)Cκm(2β+)y2nr(2β+)1y>0,l[0,n]\mathbf{P}\bigg{(}\bigg{|}\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}-\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}\big{(}\xi^{[n],m}_{j}(l)\big{)}^{+}\bigg{|}>y\bigg{)}\leq\frac{C\kappa^{m(2-\beta_{+})}}{y^{2}n^{r(2-\beta_{+})-1}}\qquad\forall y>0,\ l\in[0,n]

where rr is the parameter in the truncation threshold κn,m=κm/nr\kappa_{n,m}=\kappa^{m}/n^{r} (see (3.20)) and (x)+=x0(x)^{+}=x\vee 0.

Proof.

For notational simplicity, set k(n)=log2(nd)k(n)=\lceil\log_{2}(n^{d})\rceil. Due to |(x)+(y)+||xy||(x)^{+}-(y)^{+}|\leq|x-y|,

𝐏(|j=1m+k(n)(ξ[n]j(l))+j=1m+k(n)(ξ[n],mj(l))+|>y)\displaystyle\mathbf{P}\Big{(}\Big{|}\sum_{j=1}^{m+k(n)}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}-\sum_{j=1}^{m+k(n)}\big{(}\xi^{[n],m}_{j}(l)\big{)}^{+}\Big{|}>y\Big{)}
𝐏(j=1m+k(n)|(ξ[n]j(l))+(ξ[n],mj(l))+|>y)𝐏(j=1m+k(n)|ξ[n]j(l)ξ[n],mj(l)\ensurestackMath\stackon[1pt]=Δqj|>y).\displaystyle\leq\mathbf{P}\Big{(}\sum_{j=1}^{m+k(n)}\Big{|}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}-\big{(}\xi^{[n],m}_{j}(l)\big{)}^{+}\Big{|}>y\Big{)}\leq\mathbf{P}\Big{(}\sum_{j=1}^{m+k(n)}\Big{|}\underbrace{\xi^{[n]}_{j}(l)-\xi^{[n],m}_{j}(l)}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}q_{j}}\Big{|}>y\Big{)}. (6.46)

Furthermore, we claim the existence of some constant C~(0,)\tilde{C}\in(0,\infty) such that (for any y,d>0y,d>0, l[0,n]l\in[0,n], and any n1,mn\geq 1,\ m\in\mathbb{N})

𝐏(j=1m+k(n)|qj|>y)C~nσ¯2(κn,m)y2.\displaystyle\mathbf{P}(\sum_{j=1}^{m+k(n)}|q_{j}|>y)\leq\tilde{C}\cdot\frac{n\bar{\sigma}^{2}(\kappa_{n,m})}{y^{2}}. (6.47)

Then, using Result 5, we obtain

nσ¯2(κn,m)nκ2β+n,mI0β+(ν)=κm(2β+)nr(2β+)1I0β+(ν)\displaystyle n\bar{\sigma}^{2}(\kappa_{n,m})\leq n\cdot\kappa^{2-\beta_{+}}_{n,m}I_{0}^{\beta_{+}}(\nu)=\frac{\kappa^{m(2-\beta_{+})}}{n^{r(2-\beta_{+})-1}}\cdot I_{0}^{\beta_{+}}(\nu)

where I0β+(ν)=(1,1)|x|β+ν(dx)I_{0}^{\beta_{+}}(\nu)=\int_{(-1,1)}|x|^{\beta_{+}}\nu(dx). Setting C=C~I0β+(ν)C=\tilde{C}I_{0}^{\beta_{+}}(\nu), we conclude the proof.

Now, it only remains to prove claim (6.47). Let χ=21/4\chi=2^{1/4}. Note that

1=(χ1)j11χj(χ1)(1χ+1χ2++1χm+k(n)).\displaystyle 1=(\chi-1)\sum_{j\geq 1}\frac{1}{\chi^{j}}\geq(\chi-1)\Big{(}\frac{1}{\chi}+\frac{1}{\chi^{2}}+\cdots+\frac{1}{\chi^{m+k(n)}}\Big{)}.

As a result,

𝐏(j=1k(n)+m|qj|>y)\displaystyle\mathbf{P}\Big{(}\sum_{j=1}^{k(n)+m}|q_{j}|>y\Big{)}\leq 𝐏(j=1k(n)+m|qj|>y(χ1)j=1k(n)+m1χj)j=1k(n)+m𝐏(|qj|>yχ1χj)\displaystyle\mathbf{P}\Big{(}\sum_{j=1}^{k(n)+m}|q_{j}|>y(\chi-1)\sum_{j=1}^{k(n)+m}\frac{1}{\chi^{j}}\Big{)}\leq\sum_{j=1}^{k(n)+m}\mathbf{P}\Big{(}|q_{j}|>y\cdot\frac{\chi-1}{\chi^{j}}\Big{)} (6.48)

Next, we bound each 𝐏(|qj|>yχ1χj)\mathbf{P}(|q_{j}|>y\frac{\chi-1}{\chi^{j}}). Conditioning on lj(l)=tl_{j}(l)=t (for any t[0,l]t\in[0,l]), we get

𝐏(|qj|>yχ1χj|lj(l)=t)\displaystyle\mathbf{P}\bigg{(}|q_{j}|>y\frac{\chi-1}{\chi^{j}}\ \bigg{|}\ l_{j}(l)=t\bigg{)} =𝐏(|Ξn(t)Ξ˘mn(t)|>yχ1χj) due to (6.45)\displaystyle=\mathbf{P}\bigg{(}\Big{|}\Xi_{n}(t)-\breve{\Xi}^{m}_{n}(t)\Big{|}>y\frac{\chi-1}{\chi^{j}}\bigg{)}\qquad\text{ due to \eqref{def general xi n m l, 3}}
χ2jy2(χ1)2𝐄|X(κn,m,κn,m)(t)σ¯(κn,m)B(t)|2\displaystyle\leq\frac{\chi^{2j}}{y^{2}(\chi-1)^{2}}\mathbf{E}\Big{|}X^{(-\kappa_{n,m},\kappa_{n,m})}(t)-\bar{\sigma}(\kappa_{n,m})B(t)\Big{|}^{2}
=χ2jy2(χ1)22σ¯2(κn,m)t\displaystyle=\frac{\chi^{2j}}{y^{2}(\chi-1)^{2}}\cdot 2\bar{\sigma}^{2}(\kappa_{n,m})t
𝐏(|qj|>yχ1χj)\displaystyle\Longrightarrow\mathbf{P}\bigg{(}|q_{j}|>y\frac{\chi-1}{\chi^{j}}\bigg{)} χ2jy2(χ1)22σ¯2(κn,m)𝐄[lj(l)]\displaystyle\leq\frac{\chi^{2j}}{y^{2}(\chi-1)^{2}}\cdot 2\bar{\sigma}^{2}(\kappa_{n,m})\cdot\mathbf{E}[l_{j}(l)]
=2jy2(21/41)22σ¯2(κm,n)𝐄[lj(l)] due to χ=21/4\displaystyle=\frac{\sqrt{2^{j}}}{y^{2}(2^{1/4}-1)^{2}}\cdot 2\bar{\sigma}^{2}(\kappa_{m,n})\cdot\mathbf{E}[l_{j}(l)]\qquad\text{ due to }\chi=2^{1/4}
=2jy2(21/41)22σ¯2(κm,n)l2jby definition of lj(l) in (6.44)\displaystyle=\frac{\sqrt{2^{j}}}{y^{2}(2^{1/4}-1)^{2}}\cdot 2\bar{\sigma}^{2}(\kappa_{m,n})\cdot\frac{l}{2^{j}}\qquad\text{by definition of $l_{j}(l)$ in \eqref{def general xi n m l, 2}}
2(21/41)22jnσ¯2(κm,n)y2 due to ln.\displaystyle\leq\frac{2}{(2^{1/4}-1)^{2}\sqrt{2^{j}}}\cdot\frac{n\bar{\sigma}^{2}(\kappa_{m,n})}{y^{2}}\qquad\text{ due to }l\leq n.

Therefore, in (6.48), we get

𝐏(j=1k(n)+m|qj|>y)\displaystyle\mathbf{P}\Big{(}\sum_{j=1}^{k(n)+m}|q_{j}|>y\Big{)} nσ¯2(κm,n)y2j12(21/41)22j=nσ¯2(κm,n)y22(21/41)2(21)\ensurestackMath\stackon[1pt]=ΔC~,\displaystyle\leq\frac{n\bar{\sigma}^{2}(\kappa_{m,n})}{y^{2}}\sum_{j\geq 1}\frac{2}{(2^{1/4}-1)^{2}\sqrt{2^{j}}}=\frac{n\bar{\sigma}^{2}(\kappa_{m,n})}{y^{2}}\cdot\underbrace{\frac{2}{(2^{1/4}-1)^{2}(\sqrt{2}-1)}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\tilde{C}},

thus establishing claim (6.47). ∎
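The allocation of the overshoot level across sticks via the weights y(χ−1)/χ^j, and the geometric series used to collect the resulting per-stick bounds, can be sanity-checked numerically. A standalone sketch (with χ = 2^{1/4} as in the proof above):

```python
import math

chi = 2 ** 0.25  # chi = 2^{1/4}, as in the proof of Lemma 6.7

# Thresholds y * (chi - 1) / chi^j allocated to sticks j = 1..N:
# their total is (1 - chi^{-N}) < 1 (times y), justifying the union bound.
N = 50
weights = [(chi - 1) / chi**j for j in range(1, N + 1)]

# Geometric series collecting the per-stick bounds:
# sum over j >= 1 of (1 / sqrt(2))^j = 1 / (sqrt(2) - 1).
geo = sum(1.0 / math.sqrt(2) ** j for j in range(1, 400))
```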

Lemma 6.8.

Let n+n\in\mathbb{Z}_{+} and l[0,n]l\in[0,n]. Let CX<C_{X}<\infty be the constant characterized in Lemma 6.5 that only depends on the law of Lévy process XX. The inequality

𝐏(j>m+log2(nd)(ξ[n]j(l))+>x)2CXx1nd12m\displaystyle\mathbf{P}\Big{(}\sum_{j>m+\lceil\log_{2}(n^{d})\rceil}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}>x\Big{)}\leq\frac{2C_{X}}{x}\sqrt{\frac{1}{n^{d-1}\cdot 2^{m}}}

holds for all x>0x>0, n1n\geq 1, and m0m\geq 0, where (x)+=x0(x)^{+}=x\vee 0.

Proof.

For this proof, we adopt the notation l˘k(l)\ensurestackMath\stackon[1pt]=Δll1(l)l2(l)lk(l)\breve{l}_{k}(l)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}l-l_{1}(l)-l_{2}(l)-\ldots-l_{k}(l) for the remaining stick length after the first kk sticks. Conditioning on l˘m+log2(nd)(l)=t\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)=t,

𝐏(j>m+log2(nd)(ξ[n]j(l))+>x|l˘m+log2(nd)(l)=t)\displaystyle\mathbf{P}\Big{(}\sum_{j>m+\lceil\log_{2}(n^{d})\rceil}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}>x\ \bigg{|}\ \breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)=t\Big{)} =𝐏(sups[0,t]Ξn(s)>x) by Result 3\displaystyle=\mathbf{P}\Big{(}\sup_{s\in[0,t]}\Xi_{n}(s)>x\Big{)}\qquad\text{ by Result~{}\ref{result: concave majorant of Levy}}
CXx(t+t)using Lemma 6.5.\displaystyle\leq\frac{C_{X}}{x}(\sqrt{t}+t)\qquad\text{using Lemma~{}\ref{lemma: algo, bound supremum of truncated X}}.

Therefore, unconditionally,

𝐏(j>m+log2(nd)(ξ[n]j(l))+>x)\displaystyle\mathbf{P}\Big{(}\sum_{j>m+\lceil\log_{2}(n^{d})\rceil}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}>x\Big{)} CXx𝐄[l˘m+log2(nd)(l)+l˘m+log2(nd)(l)]\displaystyle\leq\frac{C_{X}}{x}\mathbf{E}\Big{[}\sqrt{\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)}+\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)\Big{]}
CXx[𝐄l˘m+log2(nd)(l)+𝐄l˘m+log2(nd)(l)]\displaystyle\leq\frac{C_{X}}{x}\Big{[}\sqrt{\mathbf{E}\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)}+\mathbf{E}\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)\Big{]}

The last line follows from Jensen’s inequality. Lastly, by definition of lj(l)l_{j}(l)’s in (6.44), we have

𝐄l˘m+log2(nd)(l)\displaystyle\mathbf{E}\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l) =l2m+log2(nd)l2mndn2mnd=1nd12mdue to l[0,n].\displaystyle=\frac{l}{2^{m+\lceil\log_{2}(n^{d})\rceil}}\leq\frac{l}{2^{m}\cdot n^{d}}\leq\frac{n}{2^{m}\cdot n^{d}}=\frac{1}{n^{d-1}\cdot 2^{m}}\qquad\text{due to $l\in[0,n]$}.

Since nd12m1n^{d-1}\cdot 2^{m}\geq 1, we have 𝐄l˘m+log2(nd)(l)1\mathbf{E}\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)\leq 1 and hence 𝐄l˘m+log2(nd)(l)𝐄l˘m+log2(nd)(l)\mathbf{E}\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)\leq\sqrt{\mathbf{E}\breve{l}_{m+\lceil\log_{2}(n^{d})\rceil}(l)}, so the preceding bound is at most 2CXx1nd12m\frac{2C_{X}}{x}\sqrt{\frac{1}{n^{d-1}\cdot 2^{m}}}. This concludes the proof. ∎

Lemma 6.9.

Let n+n\in\mathbb{Z}_{+} and l[0,n]l\in[0,n]. Let CC and λ\lambda be the constants in Assumption 2. Let CX<C_{X}<\infty be the constant characterized in Lemma 6.5 that only depends on the law of Lévy process XX. The inequality

𝐏(j=1m+log2(nd)(ξ[n]j(l))+[y,y+c])C(m+log2(nd))nα4λδmα3λc+4CX(m2+(log2(nd))2)δmα3/2y0nα4/2\displaystyle\mathbf{P}\Big{(}\sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}\in[y,y+c]\Big{)}\leq C\cdot\frac{\big{(}m+\lceil\log_{2}(n^{d})\rceil\big{)}n^{\alpha_{4}\lambda}}{\delta^{m\alpha_{3}\lambda}}\cdot c+4C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\cdot\frac{\delta^{m\alpha_{3}/2}}{y_{0}\cdot n^{\alpha_{4}/2}}.

holds for all yy0>0y\geq y_{0}>0, c>0c>0, n1n\geq 1, and mm\in\mathbb{N}.

Proof.

To simplify notation, in this proof we set k(n)=log2(nd)k(n)=\lceil\log_{2}(n^{d})\rceil and write lj=lj(l)l_{j}=l_{j}(l) when there is no ambiguity. For the sequence of random variables (l1,,lm+k(n))(l_{1},\cdots,l_{m+k(n)}), let l~1l~2l~m+k(n)\tilde{l}_{1}\geq\tilde{l}_{2}\geq\cdots\geq\tilde{l}_{m+k(n)} be its order statistics. Given any ordered positive real sequence t1t2tm+k(n)>0t_{1}\geq t_{2}\geq\cdots\geq t_{m+k(n)}>0, by conditioning on l~j=tjj[m+k(n)]\tilde{l}_{j}=t_{j}\ \forall j\in[m+k(n)], it follows from (6.45) that

𝐏(j=1m+k(n)(ξ[n]j(l))+[y,y+c]|l~j=tjj[m+k(n)])=𝐏(j=1m+k(n)(Ξn(j)(tj))+[y,y+c])\displaystyle\mathbf{P}\Big{(}\sum_{j=1}^{m+k(n)}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}\in[y,y+c]\ \Big{|}\ \tilde{l}_{j}=t_{j}\ \forall j\in[m+k(n)]\Big{)}=\mathbf{P}\Big{(}\sum_{j=1}^{m+k(n)}\big{(}\Xi_{n}^{(j)}(t_{j})\big{)}^{+}\in[y,y+c]\Big{)} (6.49)

where Ξn(j)\Xi_{n}^{(j)}’s are iid copies of the Lévy process Ξn=X<nγ\Xi_{n}=X^{<n\gamma}. Next, fix

η=δmα3/nα4.\displaystyle\eta=\delta^{m\alpha_{3}}/n^{\alpha_{4}}.

Given the sequence of real numbers tjt_{j}’s, we define J\ensurestackMath\stackon[1pt]=Δ#{j[m+k(n)]:tj>η}J\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\#\{j\in[m+k(n)]:\ t_{j}>\eta\} as the number of elements in the sequence that are larger than η\eta. If t1ηt_{1}\leq\eta, we set J=0J=0. With JJ defined, we decompose the event in (6.49) according to the first j[m+k(n)]j\in[m+k(n)] such that Ξn(j)(tj)>0\Xi_{n}^{(j)}(t_{j})>0 (and hence (Ξn(j)(tj))+>0\big{(}\Xi_{n}^{(j)}(t_{j})\big{)}^{+}>0), and in particular according to whether the corresponding tjt_{j} exceeds η\eta. To be specific,

𝐏(j=1m+k(n)(Ξn(j)(tj))+[y,y+c])=j=1J𝐏(Ξn(i)(ti)0i[j1];Ξn(j)(tj)>0;i=jm+k(n)(Ξn(i)(ti))+[y,y+c])\ensurestackMath\stackon[1pt]=Δpj+𝐏(Ξn(i)(ti)0i[J];j=J+1m+k(n)(Ξn(j)(tj))+[y,y+c])\ensurestackMath\stackon[1pt]=Δp.\begin{split}&\mathbf{P}\Big{(}\sum_{j=1}^{m+k(n)}\big{(}\Xi_{n}^{(j)}(t_{j})\big{)}^{+}\in[y,y+c]\Big{)}\\ &=\sum_{j=1}^{J}\underbrace{\mathbf{P}\bigg{(}\Xi_{n}^{(i)}(t_{i})\leq 0\ \forall i\in[j-1];\ \Xi_{n}^{(j)}(t_{j})>0;\ \sum_{i=j}^{m+k(n)}\big{(}\Xi_{n}^{(i)}(t_{i})\big{)}^{+}\in[y,y+c]\bigg{)}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}p_{j}}\\ &+\underbrace{\mathbf{P}\bigg{(}\Xi_{n}^{(i)}(t_{i})\leq 0\ \forall i\in[J];\sum_{j=J+1}^{m+k(n)}\big{(}\Xi_{n}^{(j)}(t_{j})\big{)}^{+}\in[y,y+c]\bigg{)}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}p_{*}}.\end{split} (6.50)

We first bound terms pjp_{j}’s. For any j[J]j\in[J], observe that

pj\displaystyle p_{j} 𝐏(Ξn(j)(tj)>0;i=jm+k(n)(Ξn(i)(ti))+[y,y+c])\displaystyle\leq\mathbf{P}\bigg{(}\Xi_{n}^{(j)}(t_{j})>0;\ \sum_{i=j}^{m+k(n)}\big{(}\Xi_{n}^{(i)}(t_{i})\big{)}^{+}\in[y,y+c]\bigg{)}
=𝐏(Ξn(j)(tj)[yx,yx+c](0,))𝐏(i=j+1m+k(n)(Ξn(i)(ti))+dx)\displaystyle=\int_{\mathbb{R}}\mathbf{P}\bigg{(}\Xi_{n}^{(j)}(t_{j})\in[y-x,y-x+c]\cap(0,\infty)\bigg{)}\mathbf{P}\bigg{(}\sum_{i=j+1}^{m+k(n)}\big{(}\Xi_{n}^{(i)}(t_{i})\big{)}^{+}\in dx\bigg{)}
\displaystyle\leq\frac{Cc}{t_{j}^{\lambda}\wedge 1}\qquad\text{ by Assumption 2}
Cnα4λδmα3λc due to jJ, and hence tj>η=δmα3/nα4.\displaystyle\leq\frac{Cn^{\alpha_{4}\lambda}}{\delta^{m\alpha_{3}\lambda}}\cdot c\qquad\text{ due to $j\leq J$, and hence $t_{j}>\eta=\delta^{m\alpha_{3}}/n^{\alpha_{4}}$}. (6.51)

On the other hand,

p\displaystyle p_{*} 𝐏(j=J+1m+k(n)(Ξn(j)(tj))+[y,y+c])𝐏(j=J+1m+k(n)(Ξn(j)(tj))+y0) due to yy0>0\displaystyle\leq\mathbf{P}\bigg{(}\sum_{j=J+1}^{m+k(n)}\big{(}\Xi_{n}^{(j)}(t_{j})\big{)}^{+}\in[y,y+c]\bigg{)}\leq\mathbf{P}\bigg{(}\sum_{j=J+1}^{m+k(n)}\big{(}\Xi_{n}^{(j)}(t_{j})\big{)}^{+}\geq y_{0}\bigg{)}\qquad\text{ due to }y\geq y_{0}>0
j=J+1m+k(n)𝐏(Ξn(j)(tj)y0/N) with N\ensurestackMath\stackon[1pt]=Δm+k(n)J\displaystyle\leq\sum_{j=J+1}^{m+k(n)}\mathbf{P}\Big{(}\Xi_{n}^{(j)}(t_{j})\geq y_{0}/N\Big{)}\qquad\text{ with }N\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}m+k(n)-J
\displaystyle\leq\sum_{j=J+1}^{m+k(n)}\frac{C_{X}(\sqrt{t_{j}}+t_{j})\cdot N}{y_{0}}\qquad\text{ by Lemma 6.5}
j=J+1m+k(n)CX(η+η)Ny0due to j>J, and hence tjη=δmα3/nα4\displaystyle\leq\sum_{j=J+1}^{m+k(n)}\frac{C_{X}(\sqrt{\eta}+\eta)\cdot N}{y_{0}}\qquad\text{due to }j>J\text{, and hence }t_{j}\leq\eta=\delta^{m\alpha_{3}}/n^{\alpha_{4}}
=N2CX(η+η)y0(m+k(n))2CX(η+η)y0 due to Nm+k(n)\displaystyle=N^{2}\cdot\frac{C_{X}(\sqrt{\eta}+\eta)}{y_{0}}\leq\big{(}m+k(n)\big{)}^{2}\cdot\frac{C_{X}(\sqrt{\eta}+\eta)}{y_{0}}\qquad\text{ due to }N\leq m+k(n)
2CX(m2+(log2(nd))2)η+ηy0 using (u+v)22(u2+v2)\displaystyle\leq 2C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\sqrt{\eta}+\eta}{y_{0}}\qquad\text{ using }(u+v)^{2}\leq 2(u^{2}+v^{2})
4CX(m2+(log2(nd))2)ηy0=4CX(m2+(log2(nd))2)δmα3/2y0nα4/2.\displaystyle\leq 4C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\sqrt{\eta}}{y_{0}}=4C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\delta^{m\alpha_{3}/2}}{y_{0}\cdot n^{\alpha_{4}/2}}. (6.52)

Plugging (6.51) and (6.52) into (6.50), we obtain

𝐏(j=1m+k(n)(ξ[n]j(l))+[y,y+c]|l~j=tjj[m+k(n)])\displaystyle\mathbf{P}\Big{(}\sum_{j=1}^{m+k(n)}\big{(}\xi^{[n]}_{j}(l)\big{)}^{+}\in[y,y+c]\ \Big{|}\ \tilde{l}_{j}=t_{j}\ \forall j\in[m+k(n)]\Big{)}
JCnα4λδmα3λc+4CX(m2+(log2(nd))2)δmα3/2y0nα4/2\displaystyle\leq J\cdot\frac{Cn^{\alpha_{4}\lambda}}{\delta^{m\alpha_{3}\lambda}}c+4C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\delta^{m\alpha_{3}/2}}{y_{0}\cdot n^{\alpha_{4}/2}}
\displaystyle\leq C\frac{(m+\lceil\log_{2}(n^{d})\rceil)n^{\alpha_{4}\lambda}}{\delta^{m\alpha_{3}\lambda}}c+4C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\delta^{m\alpha_{3}/2}}{y_{0}\cdot n^{\alpha_{4}/2}}\qquad\text{ due to }J\leq m+\lceil\log_{2}(n^{d})\rceil.

To conclude the proof, note that the inequality above holds conditionally on any sequence t_{1}\geq t_{2}\geq\cdots\geq t_{m+k(n)}>0, and hence it also holds unconditionally. ∎

Now, we are ready to prove Propositions 6.3 and 6.4.

Proof of Proposition 6.3.

In this proof, we fix some k\in\mathbb{N}, n\geq 1, and m\in\mathbb{N}. Let the process \zeta_{k} be defined as in (6.35). Recall the definitions of W^{(i),*}_{n}(\zeta_{k}), \hat{W}^{(i),m}_{n}(\zeta_{k}), and \widetilde{W}^{(i),m}_{n}(\zeta_{k}) in (6.37)–(6.39). See also (3.15)–(3.24) for the definitions of the \xi^{(i)}_{j}'s and \xi^{(i),m}_{j}'s.

To simplify notation, define t(n)=\lceil\log_{2}(n^{d})\rceil. Define events

E1(i)(x)\displaystyle E_{1}^{(i)}(x) \ensurestackMath\stackon[1pt]=Δ{|q=1i1j0ξ(q)jq=1i1j0ξ(q),mj|x2},\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\bigg{\{}\Big{|}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}-\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j}\Big{|}\leq\frac{x}{2}\bigg{\}},
E2(i)(x)\displaystyle E_{2}^{(i)}(x) \ensurestackMath\stackon[1pt]=Δ{|j=1m+t(n)(ξ(i)j)+j=1m+t(n)(ξ(i),mj)+|x2},\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\bigg{\{}\Big{|}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}-\sum_{j=1}^{m+t(n)}(\xi^{(i),m}_{j})^{+}\Big{|}\leq\frac{x}{2}\bigg{\}},
E3(i)(x)\displaystyle E_{3}^{(i)}(x) \ensurestackMath\stackon[1pt]=Δ{jm+t(n)+1(ξ(i)j)+x}.\displaystyle\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\bigg{\{}\sum_{j\geq m+t(n)+1}(\xi^{(i)}_{j})^{+}\leq x\bigg{\}}.

Note that on event E1(i)(x)E2(i)(x)E3(i)(x)E_{1}^{(i)}(x)\cap E_{2}^{(i)}(x)\cap E_{3}^{(i)}(x), we must have |W(i),n(ζk)W~(i),mn(ζk)|x|W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})|\leq x and |W^(i),mn(ζk)W~(i),mn(ζk)|x.|\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})|\leq x. As a result,

𝐏(|W(i),n(ζk)W~(i),mn(ζk)||W^(i),mn(ζk)W~(i),mn(ζk)|>x)q=13𝐏((E(i)q(x))c).\displaystyle\mathbf{P}\bigg{(}\Big{|}W^{(i),*}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}\vee\Big{|}\hat{W}^{(i),m}_{n}(\zeta_{k})-\widetilde{W}^{(i),m}_{n}(\zeta_{k})\Big{|}>x\bigg{)}\leq\sum_{q=1}^{3}\mathbf{P}\Big{(}\big{(}E^{(i)}_{q}(x)\big{)}^{c}\Big{)}.

Furthermore, we claim the existence of constants (\widetilde{C}_{q})_{q=1,2,3}, the values of which do not depend on x,k,n, and m, such that (for any x>0 and i\in[k+1])

𝐏((E1(i)(x))c)=𝐏(|q=1i1j0ξ(q)jq=1i1j0ξ(q),mj|>x2)\displaystyle\mathbf{P}\Big{(}\big{(}E_{1}^{(i)}(x)\big{)}^{c}\Big{)}=\mathbf{P}\bigg{(}\Big{|}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}-\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j}\Big{|}>\frac{x}{2}\bigg{)} C~1κm(2β+)x2nr(2β+)1,\displaystyle\leq\frac{\widetilde{C}_{1}\kappa^{m(2-\beta_{+})}}{x^{2}n^{r(2-\beta_{+})-1}}, (6.53)
𝐏((E2(i)(x))c)=𝐏(|j=1m+t(n)(ξ(i)j)+j=1m+t(n)(ξ(i),mj)+|>x2)\displaystyle\mathbf{P}\Big{(}\big{(}E_{2}^{(i)}(x)\big{)}^{c}\Big{)}=\mathbf{P}\bigg{(}\Big{|}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}-\sum_{j=1}^{m+t(n)}(\xi^{(i),m}_{j})^{+}\Big{|}>\frac{x}{2}\bigg{)} C~2κm(2β+)x2nr(2β+)1,\displaystyle\leq\frac{\widetilde{C}_{2}\kappa^{m(2-\beta_{+})}}{x^{2}n^{r(2-\beta_{+})-1}}, (6.54)
𝐏((E3(i)(x))c)=𝐏(jm+t(n)+1(ξ(i)j)+>x)\displaystyle\mathbf{P}\Big{(}\big{(}E_{3}^{(i)}(x)\big{)}^{c}\Big{)}=\mathbf{P}\bigg{(}\sum_{j\geq m+t(n)+1}(\xi^{(i)}_{j})^{+}>x\bigg{)} C~3x1nd12m.\displaystyle\leq\frac{\widetilde{C}_{3}}{x}\sqrt{\frac{1}{n^{d-1}\cdot 2^{m}}}. (6.55)

This allows us to conclude the proof by setting C1=q=13C~qC_{1}=\sum_{q=1}^{3}\widetilde{C}_{q}. Now, it remains to prove claims (6.53)–(6.55).

Proof of Claim (6.53)

The claim is trivial if i\leq 1, so we only consider the case where i\geq 2. Due to the coupling between \xi^{(i)}_{j} and \xi^{(i),m}_{j} in (3.24)–(3.25), we have

(q=1i1j0ξ(q)j,q=1i1j0ξ(q),mj)\ensurestackMath\stackon[1pt]=d(Ξn(ui1),Ξ˘mn(ui1))\displaystyle\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j},\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j}\Big{)}\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptstyle d}}}\big{(}\Xi_{n}(u_{i-1}),\breve{\Xi}^{m}_{n}(u_{i-1})\big{)}

where the laws of the processes \Xi_{n},\breve{\Xi}^{m}_{n} are stated in (3.21) and (3.23), respectively. Applying Lemma 6.6, we obtain

𝐏(|q=1i1j0ξ(q)jq=1i1j0ξ(q),mj|>x2)\displaystyle\mathbf{P}\bigg{(}\bigg{|}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}-\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j}\bigg{|}>\frac{x}{2}\bigg{)} 𝐏(supt[0,n]|Ξn(t)Ξ˘mn(t)|>x2)4Cκm(2β+)x2nr(2β+)1,\displaystyle\leq\mathbf{P}\bigg{(}\sup_{t\in[0,n]}|\Xi_{n}(t)-\breve{\Xi}^{m}_{n}(t)|>\frac{x}{2}\bigg{)}\leq\frac{4C\kappa^{m(2-\beta_{+})}}{x^{2}n^{r(2-\beta_{+})-1}},

where C<C<\infty is the constant characterized in Lemma 6.6 that only depends on β+\beta_{+} and the law of the Lévy process XX. To conclude the proof of claim (6.53), we pick C~1=4C\widetilde{C}_{1}=4C.

Proof of Claim (6.54)

It follows directly from Lemma 6.7 that

𝐏(|j=1m+t(n)(ξ(i)j)+j=1m+t(n)(ξ(i),mj)+|>x2)4Cκm(2β+)x2nr(2β+)1,\displaystyle\mathbf{P}\bigg{(}\bigg{|}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}-\sum_{j=1}^{m+t(n)}(\xi^{(i),m}_{j})^{+}\bigg{|}>\frac{x}{2}\bigg{)}\leq\frac{4C\kappa^{m(2-\beta_{+})}}{x^{2}n^{r(2-\beta_{+})-1}},

where C<C<\infty is the constant characterized in Lemma 6.7 that only depends on β+\beta_{+} and the law of the Lévy process XX. To conclude the proof of claim (6.54), we pick C~2=4C\widetilde{C}_{2}=4C.

Proof of Claim (6.55)

Using Lemma 6.8,

𝐏(jm+t(n)+1(ξ(i)j)+>x)\displaystyle\mathbf{P}\bigg{(}\sum_{j\geq m+t(n)+1}(\xi^{(i)}_{j})^{+}>x\bigg{)} 2CXx1nd12m\displaystyle\leq\frac{2C_{X}}{x}\cdot\sqrt{\frac{1}{n^{d-1}\cdot 2^{m}}}

where CX<C_{X}<\infty is the constant characterized in Lemma 6.8 that only depends on the law of the Lévy process XX. By setting C~3=2CX\widetilde{C}_{3}=2C_{X}, we conclude the proof of claim (6.55). ∎

Proof of Proposition 6.4.

In this proof, we fix some k\in\mathbb{N}. Recall the representation \zeta_{k}(t)=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}(t) in (6.35), where 0<u_{1}<u_{2}<\ldots<u_{k}<n are the order statistics of k i.i.d. samples of Unif(0,n). Recall the definition of \widetilde{W}^{(i),m}_{n}(\zeta_{k}) in (6.39). See also (3.15)–(3.24) for the definitions of the \xi^{(i)}_{j}'s and \xi^{(i),m}_{j}'s.

We start with the following decomposition of events:

𝐏(i[k+1]s.t.W~(i),mn(ζk)[yδmn,y+δmn])\displaystyle\mathbf{P}\bigg{(}\exists i\in[k+1]\ s.t.\ \widetilde{W}^{(i),m}_{n}(\zeta_{k})\in\bigg{[}y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}\bigg{]}\bigg{)}
𝐏(u1<nδmα1)+𝐏(i[k+1]s.t.W~(i),mn(ζk)[yδmn,y+δmn],u1nδmα1).\displaystyle\leq\mathbf{P}(u_{1}<n\delta^{m\alpha_{1}})+\mathbf{P}\bigg{(}\exists i\in[k+1]\ s.t.\ \widetilde{W}^{(i),m}_{n}(\zeta_{k})\in\bigg{[}y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}\bigg{]},\ u_{1}\geq n\delta^{m\alpha_{1}}\bigg{)}.

First, \mathbf{P}(u_{1}<n\delta^{m\alpha_{1}})\leq k\cdot\mathbf{P}(\text{Unif}(0,n)<n\delta^{m\alpha_{1}})=k\cdot\delta^{m\alpha_{1}}<k\cdot\rho_{0}^{m}. The last inequality follows from our choice of \rho_{1} in (6.27) and \rho_{0}\in(\rho_{1},1). Furthermore, for each i\in[k+1],

𝐏(W~(i),mn(ζk)[yδmn,y+δmn],u1nδmα1)\displaystyle\mathbf{P}\bigg{(}\widetilde{W}^{(i),m}_{n}(\zeta_{k})\in[y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\bigg{)}
=𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],W~(i),mn(ζk)[yδmn,y+δmn],u1nδmα1)\displaystyle=\mathbf{P}\bigg{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ \widetilde{W}^{(i),m}_{n}(\zeta_{k})\in[y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\bigg{)}
+𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],W~(i),mn(ζk)[yδmn,y+δmn],u1nδmα1)\displaystyle\qquad+\mathbf{P}\bigg{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\notin[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ \widetilde{W}^{(i),m}_{n}(\zeta_{k})\in[y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\bigg{)}
𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],u1nδmα1)\displaystyle\leq\mathbf{P}\bigg{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\bigg{)}
+𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],W~(i),mn[yδmn,y+δmn])\displaystyle\qquad+\mathbf{P}\bigg{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\notin[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ \widetilde{W}^{(i),m}_{n}\in[y-\frac{\delta^{m}}{\sqrt{n}},y+\frac{\delta^{m}}{\sqrt{n}}]\bigg{)}
𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],u1nδmα1)\displaystyle\leq\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}
\displaystyle\qquad+\int_{\mathbb{R}\setminus[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}]}\mathbf{P}\bigg{(}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}\in[y-x-\frac{\delta^{m}}{\sqrt{n}},y-x+\frac{\delta^{m}}{\sqrt{n}}]\bigg{)}\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in dx\Big{)}
=𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],u1nδmα1)\displaystyle=\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}
+(,yδmα2]𝐏(j=1m+t(n)(ξ(i)j)+[yxδmn,yx+δmn])𝐏(q=1i1j0ξ(q)j+q=1i1zqdx).\displaystyle\qquad+\int_{(-\infty,y-\delta^{m\alpha_{2}}]}\mathbf{P}\bigg{(}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}\in[y-x-\frac{\delta^{m}}{\sqrt{n}},y-x+\frac{\delta^{m}}{\sqrt{n}}]\bigg{)}\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in dx\Big{)}.

The last equality follows from the simple fact that j1(ξ(i)j)+0\sum_{j\geq 1}(\xi^{(i)}_{j})^{+}\geq 0. Furthermore, we claim the existence of constants C~1\widetilde{C}_{1} and C~2\widetilde{C}_{2}, the values of which do not vary with parameters n,m,k,y,in,m,k,y,i, such that for all n1n\geq 1 and mm¯m\geq\bar{m},

𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],u1nδmα1)\displaystyle\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)} C~1ρ0my>δmα2,\displaystyle\leq\widetilde{C}_{1}\rho_{0}^{m}\qquad\forall y>\delta^{m\alpha_{2}}, (6.56)
𝐏(j=1m+t(n)(ξ(i)j)+[w,w+2δmn])\displaystyle\mathbf{P}\bigg{(}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}\in[w,w+\frac{2\delta^{m}}{\sqrt{n}}]\bigg{)} C~2ρ0mwδmα2δmn.\displaystyle\leq\widetilde{C}_{2}\rho_{0}^{m}\qquad\forall w\geq\delta^{m\alpha_{2}}-\frac{\delta^{m}}{\sqrt{n}}. (6.57)

Then, we conclude the proof by setting C_{2}=1+\widetilde{C}_{1}+\widetilde{C}_{2}. It now remains to prove claims (6.56) and (6.57).

Proof of Claim (6.56)

If i\leq 1, the claim is trivial: the sum on the left is empty, while y>\delta^{m\alpha_{2}} ensures 0\notin[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}]. Now, we consider the case where i\geq 2. Due to the independence between the z_{q}'s and the \xi^{(q)}_{j}'s,

𝐏(q=1i1j0ξ(q)j+q=1i1zq[yδmα2,y+δmα2],u1nδmα1)\displaystyle\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}+\sum_{q=1}^{i-1}z_{q}\in[y-\delta^{m\alpha_{2}},y+\delta^{m\alpha_{2}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}
=𝐏(q=1i1j0ξ(q)j[yxδmα2,yx+δmα2],u1nδmα1)𝐏(q=1i1zqdx)\displaystyle=\int_{\mathbb{R}}\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}\in[y-x-\delta^{m\alpha_{2}},y-x+\delta^{m\alpha_{2}}],\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}\mathbf{P}(\sum_{q=1}^{i-1}z_{q}\in dx)
𝐏(q=1i1j0ξ(q)j[yxδmα2,yx+δmα2]|u1nδmα1)𝐏(q=1i1zqdx)\displaystyle\leq\int_{\mathbb{R}}\mathbf{P}\Big{(}\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q)}_{j}\in[y-x-\delta^{m\alpha_{2}},y-x+\delta^{m\alpha_{2}}]\ \Big{|}\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}\mathbf{P}(\sum_{q=1}^{i-1}z_{q}\in dx)
=𝐏(X<nγ(ui1)[yxδmα2,yx+δmα2]|u1nδmα1)𝐏(q=1i1zqdx)\displaystyle=\int_{\mathbb{R}}\mathbf{P}\Big{(}X^{<n\gamma}(u_{i-1})\in[y-x-\delta^{m\alpha_{2}},y-x+\delta^{m\alpha_{2}}]\ \Big{|}\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}\mathbf{P}(\sum_{q=1}^{i-1}z_{q}\in dx)

where (ui)i=1k(u_{i})_{i=1}^{k} are independent of the Lévy process X<nγX^{<n\gamma}. In particular, recall that 0=u0<u1<u2<<uk<n0=u_{0}<u_{1}<u_{2}<\ldots<u_{k}<n are order statistics. Therefore, on event {u1nδmα1}\{u_{1}\geq n\delta^{m\alpha_{1}}\} we must have ui1u1nδmα1u_{i-1}\geq u_{1}\geq n\delta^{m\alpha_{1}}. It then follows directly from Assumption 2 that

𝐏(X<nγ(ui1)[yxδmα2,yx+δmα2]|u1nδmα1)\displaystyle\mathbf{P}\Big{(}X^{<n\gamma}(u_{i-1})\in[y-x-\delta^{m\alpha_{2}},y-x+\delta^{m\alpha_{2}}]\ \Big{|}\ u_{1}\geq n\delta^{m\alpha_{1}}\Big{)}
C(nλδmα1λ)12δmα22C(δα2δλα1)m2Cρ0mdue to (6.30) and ρ0(ρ1,1),\displaystyle\leq\frac{C}{(n^{\lambda}\delta^{m\alpha_{1}\lambda})\wedge 1}\cdot 2\delta^{m\alpha_{2}}\leq 2C\cdot\bigg{(}\frac{\delta^{\alpha_{2}}}{\delta^{\lambda\alpha_{1}}}\bigg{)}^{m}\leq 2C\cdot\rho_{0}^{m}\qquad\text{due to \eqref{proofChooseRhoByAlpha_12} and }\rho_{0}\in(\rho_{1},1),

where CC and λ\lambda are the constants specified in Assumption 2. To conclude, it suffices to set C~1=2C\widetilde{C}_{1}=2C.
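The union bound \mathbf{P}(u_{1}<n\delta^{m\alpha_{1}})\leq k\cdot\delta^{m\alpha_{1}} invoked at the start of this proof can be compared against the exact law of the minimum of k i.i.d. Unif(0,n) points; a minimal numeric sketch, with illustrative values of k and q standing in for \delta^{m\alpha_{1}} (not the paper's constants):

```python
# Exact law of the minimum of k i.i.d. Unif(0, n) points versus the union bound:
# P(u_1 < n*q) = 1 - (1 - q)^k <= k*q.  (k and q below are illustrative.)
def min_uniform_cdf(k: int, q: float) -> float:
    """P(min of k i.i.d. Unif(0,1) samples < q); scaling handles Unif(0,n)."""
    return 1.0 - (1.0 - q) ** k

k, q = 50, 1e-3  # q plays the role of delta^{m * alpha_1}
exact = min_uniform_cdf(k, q)
union_bound = k * q
assert 0.0 < exact <= union_bound
```

The union bound is tight for small q, since 1-(1-q)^{k}=kq+O(q^{2}).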

Proof of Claim (6.57)

Applying Lemma 6.9 with y0=δmα2δmny_{0}=\delta^{m\alpha_{2}}-\frac{\delta^{m}}{\sqrt{n}} and c=2δmnc=\frac{2\delta^{m}}{\sqrt{n}}, we get (for all n1,mm¯,yy0n\geq 1,m\geq\bar{m},y\geq y_{0})

𝐏(j=1m+t(n)(ξ(i)j)+[y,y+2δmn])\displaystyle\mathbf{P}\bigg{(}\sum_{j=1}^{m+t(n)}(\xi^{(i)}_{j})^{+}\in[y,y+\frac{2\delta^{m}}{\sqrt{n}}]\bigg{)}
\displaystyle\leq C\frac{(m+\lceil\log_{2}(n^{d})\rceil)n^{\alpha_{4}\lambda}}{\delta^{m\alpha_{3}\lambda}}\cdot\frac{2\delta^{m}}{\sqrt{n}}+4C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\delta^{m\alpha_{3}/2}}{(\delta^{m\alpha_{2}}-\frac{\delta^{m}}{\sqrt{n}})\cdot n^{\alpha_{4}/2}}
\displaystyle\leq C\frac{(m+\lceil\log_{2}(n^{d})\rceil)n^{\alpha_{4}\lambda}}{\delta^{m\alpha_{3}\lambda}}\cdot\frac{2\delta^{m}}{\sqrt{n}}+8C_{X}\big{(}m^{2}+(\lceil\log_{2}(n^{d})\rceil)^{2}\big{)}\frac{\delta^{m\alpha_{3}/2}}{{\delta^{m\alpha_{2}}}\cdot n^{\alpha_{4}/2}}\qquad\text{due to \eqref{proofChooseMbar}}
=2Cmn12λα4(δδλα3)m\ensurestackMath\stackon[1pt]=Δpn,m,1+2Clog2(nd)n12λα4(δδλα3)m\ensurestackMath\stackon[1pt]=Δpn,m,2\displaystyle=\underbrace{2C\cdot\frac{m}{n^{\frac{1}{2}-\lambda\alpha_{4}}}\cdot\bigg{(}\frac{\delta}{\delta^{\lambda\alpha_{3}}}\bigg{)}^{m}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}p_{n,m,1}}+\underbrace{2C\cdot\frac{\lceil\log_{2}(n^{d})\rceil}{n^{\frac{1}{2}-\lambda\alpha_{4}}}\cdot\bigg{(}\frac{\delta}{\delta^{\lambda\alpha_{3}}}\bigg{)}^{m}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}p_{n,m,2}}
+8CXm2nα4/2(δα3/2δα2)m\ensurestackMath\stackon[1pt]=Δpn,m,3+8CX(log2(nd))2nα4/2(δα3/2δα2)m.\ensurestackMath\stackon[1pt]=Δpn,m,4\displaystyle\qquad+\underbrace{8C_{X}\cdot\frac{m^{2}}{n^{{\alpha_{4}}/{2}}}\cdot\bigg{(}\frac{\delta^{\alpha_{3}/2}}{\delta^{\alpha_{2}}}\bigg{)}^{m}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}p_{n,m,3}}+\underbrace{8C_{X}\cdot\frac{\big{(}\lceil\log_{2}(n^{d})\rceil\big{)}^{2}}{n^{{\alpha_{4}}/{2}}}\cdot\bigg{(}\frac{\delta^{\alpha_{3}/2}}{\delta^{\alpha_{2}}}\bigg{)}^{m}.}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}p_{n,m,4}}

Here, CX<C_{X}<\infty is the constant in Lemma 6.5 that only depends on the law of Lévy process XX, and C(0,),λ>0C\in(0,\infty),\lambda>0 are the constants in Assumption 2. First, for any n1n\geq 1 and mm¯m\geq\bar{m},

pn,m,1\displaystyle p_{n,m,1} 2Cm(δδλα3)mdue to 12>λα4; see (6.21)\displaystyle\leq 2C\cdot m\cdot\bigg{(}\frac{\delta}{\delta^{\lambda\alpha_{3}}}\bigg{)}^{m}\qquad\text{due to $\frac{1}{2}>\lambda\alpha_{4}$; see \eqref{proofChooseAlpha_34}}
2Cmρ1mdue to (6.31)\displaystyle\leq 2C\cdot m\rho_{1}^{m}\qquad\text{due to \eqref{proofChooseRhoSBALongStick}}
2Cρ0m due to (6.34).\displaystyle\leq 2C\cdot\rho_{0}^{m}\qquad\text{ due to \eqref{proof, choose bar m and rho 0}}.

For term pn,m,2p_{n,m,2}, note that log2(nd)n12λα40\frac{\lceil\log_{2}(n^{d})\rceil}{n^{\frac{1}{2}-\lambda\alpha_{4}}}\rightarrow 0 as nn\rightarrow\infty due to 12>λα4\frac{1}{2}>\lambda\alpha_{4}. This allows us to fix some Cd,1<C_{d,1}<\infty such that supn=1,2,log2(nd)n12λα4Cd,1\sup_{n=1,2,\cdots}\frac{\lceil\log_{2}(n^{d})\rceil}{n^{\frac{1}{2}-\lambda\alpha_{4}}}\leq C_{d,1}. As a result, for any n1,m0n\geq 1,m\geq 0,

pn,m,2\displaystyle p_{n,m,2} 2CCd,1(δδλα3)m2CCd,1ρm0 due to (6.31) and ρ0(ρ1,1).\displaystyle\leq 2CC_{d,1}\cdot\bigg{(}\frac{\delta}{\delta^{\lambda\alpha_{3}}}\bigg{)}^{m}\leq 2CC_{d,1}\cdot\rho^{m}_{0}\qquad\text{ due to \eqref{proofChooseRhoSBALongStick} and }\rho_{0}\in(\rho_{1},1).

Similarly, for all n1n\geq 1 and mm¯m\geq\bar{m},

pn,m,3\displaystyle p_{n,m,3} 8CXm2(δα3/2δα2)m8CXm2ρm1due to (6.32)\displaystyle\leq 8C_{X}\cdot m^{2}\cdot\bigg{(}\frac{\delta^{\alpha_{3}/2}}{\delta^{\alpha_{2}}}\bigg{)}^{m}\leq 8C_{X}\cdot m^{2}\rho^{m}_{1}\qquad\text{due to \eqref{proofChooseRhoSBAShortStick}}
8CXρm0 due to (6.34).\displaystyle\leq 8C_{X}\cdot\rho^{m}_{0}\qquad\text{ due to \eqref{proof, choose bar m and rho 0}}.
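Bounds of the form m^{2}\rho_{1}^{m}\leq\rho_{0}^{m} for all m\geq\bar{m}, used above via (6.34), rest on the elementary fact that geometric decay absorbs polynomial factors whenever \rho_{1}<\rho_{0}. A minimal numeric sketch with illustrative values of \rho_{1},\rho_{0} (not the paper's constants):

```python
# For any 0 < rho1 < rho0 < 1, m^2 * rho1^m <= rho0^m for all large m, since
# (rho1/rho0)^m decays geometrically while m^2 grows only polynomially.
rho1, rho0 = 0.5, 0.7  # illustrative values only

def find_m_bar(rho1: float, rho0: float, window: int = 200) -> int:
    """Smallest m such that m^2 * rho1^m <= rho0^m holds on [m, m + window)."""
    m = 1
    while not all(k * k * rho1 ** k <= rho0 ** k for k in range(m, m + window)):
        m += 1
    return m

m_bar = find_m_bar(rho1, rho0)
# beyond m_bar the ratio m^2 * (rho1 / rho0)^m keeps decreasing, so the bound persists
assert all(m * m * rho1 ** m <= rho0 ** m for m in range(m_bar, 1000))
```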

Besides, due to \frac{(\lceil\log_{2}(n^{d})\rceil)^{2}}{n^{{\alpha_{4}}/{2}}}\rightarrow 0 as n\rightarrow\infty, we can find C_{d,2}<\infty such that \sup_{n=1,2,\cdots}\frac{(\lceil\log_{2}(n^{d})\rceil)^{2}}{n^{{\alpha_{4}}/{2}}}\leq C_{d,2}. This leads to (for all n\geq 1,\ m\geq 0)

pn,m,4\displaystyle p_{n,m,4} 8CXCd,2(δα3/2δα2)m8CXCd,2ρm0 due to (6.31) and ρ0(ρ1,1).\displaystyle\leq 8C_{X}C_{d,2}\cdot\bigg{(}\frac{\delta^{\alpha_{3}/2}}{\delta^{\alpha_{2}}}\bigg{)}^{m}\leq 8C_{X}C_{d,2}\cdot\rho^{m}_{0}\qquad\text{ due to \eqref{proofChooseRhoSBALongStick} and }\rho_{0}\in(\rho_{1},1).

To conclude the proof, we can simply set C~2=2C+2CCd,1+8CX+8CXCd,2.\widetilde{C}_{2}=2C+2CC_{d,1}+8C_{X}+8C_{X}C_{d,2}.

6.3 Proof of Propositions 4.1 and 4.3

The proof of Proposition 4.1 is based on the inversion formula for characteristic functions (see, e.g., Theorem 3.3.14 of [29]). Specifically, we compare the characteristic function of Y(t) with that of an \alpha-stable process to draw connections between their distributions.

Proof of Proposition 4.1.

The Lévy-Khintchine formula (see, e.g., Theorem 8.1 of [58]) leads to the following expression for the characteristic function \varphi_{t}(z)=\mathbf{E}\exp(izY(t)) of Y(t):

φt(z)=exp(t(0,z0)[exp(izx)1izx𝐈(0,1](x)\ensurestackMath\stackon[1pt]=Δϕ(z,x)]μ(dx))z,t>0.\displaystyle\varphi_{t}(z)=\exp\Big{(}t\int_{(0,z_{0})}\big{[}\underbrace{\exp(izx)-1-izx\mathbf{I}_{(0,1]}(x)}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}\phi(z,x)}\big{]}\mu(dx)\Big{)}\qquad\forall z\in\mathbb{R},\ t>0.

Note that

ϕ(z,x)\displaystyle\phi(z,x) =cos(zx)1+i(sin(zx)zx𝐈(0,1](x)).\displaystyle=\cos(zx)-1+i\big{(}\sin(zx)-zx\mathbf{I}_{(0,1]}(x)\big{)}.

Then from |ex+iy|=ex|e^{x+iy}|=e^{x} for all x,yx,y\in\mathbb{R},

|φt(z)|=exp(t(0,z0)(1cos(zx))μ(dx))z,t>0.\displaystyle|\varphi_{t}(z)|=\exp\Big{(}-t\int_{(0,z_{0})}\big{(}1-\cos(zx)\big{)}\mu(dx)\Big{)}\qquad\forall z\in\mathbb{R},\ t>0. (6.58)

Furthermore, we claim the existence of some M~,C~(0,)\widetilde{M},\widetilde{C}\in(0,\infty) such that

(0,z0)(1cos(zx))μ(dx)C~|z|αz with |z|M~.\displaystyle\int_{(0,z_{0})}\big{(}1-\cos(zx)\big{)}\mu(dx)\geq\widetilde{C}|z|^{\alpha}\qquad\forall z\in\mathbb{R}\text{ with }|z|\geq\widetilde{M}. (6.59)

Plugging (6.59) into (6.58), we obtain |\varphi_{t}(z)|\leq\exp(-t\widetilde{C}|z|^{\alpha}) for all |z|\geq\widetilde{M} and t>0. It then follows directly from the inversion formula (see Theorem 3.3.14 of [29]) that, for all t>0, Y(t) admits a continuous density function f_{Y(t)} with a uniform bound

fY(t)\displaystyle\left\lVert f_{Y(t)}\right\rVert_{\infty} 12π|φt(z)|dz\displaystyle\leq\frac{1}{2\pi}\int|\varphi_{t}(z)|dz
12π(2M~+|z|M~exp(tC~|z|α)dz)\displaystyle\leq\frac{1}{2\pi}\Big{(}2\widetilde{M}+\int_{|z|\geq\tilde{M}}\exp\big{(}-t\widetilde{C}|z|^{\alpha}\big{)}dz\Big{)}
12π(2M~+1t1/αexp(C~|x|α)dx)by letting x=zt1/α\displaystyle\leq\frac{1}{2\pi}\Big{(}2\widetilde{M}+\frac{1}{t^{1/\alpha}}\int_{\mathbb{R}}\exp(-\widetilde{C}|x|^{\alpha})dx\Big{)}\qquad\text{by letting $x=zt^{1/\alpha}$}
=M~π+C1t1/αwhere C1=12πexp(C~|x|α)dx<.\displaystyle=\frac{\widetilde{M}}{\pi}+\frac{C_{1}}{t^{1/\alpha}}\qquad\text{where $C_{1}=\frac{1}{2\pi}\int_{\mathbb{R}}\exp(-\widetilde{C}|x|^{\alpha})dx<\infty$}.

To conclude the proof, pick C=M~π+C1C=\frac{\widetilde{M}}{\pi}+C_{1}. Now, it only remains to prove claim (6.59).
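The inversion-formula bound \lVert f_{Y(t)}\rVert_{\infty}\leq\frac{1}{2\pi}\int|\varphi_{t}(z)|dz used above can be sanity-checked numerically in a case where everything is explicit. The sketch below uses the standard Gaussian (an illustration only, not the process Y(t) of the proposition), for which the bound is attained with equality at the mode:

```python
import math

# Inversion bound ||f||_inf <= (1/(2*pi)) * \int |phi(z)| dz, checked on the
# standard Gaussian phi(z) = exp(-z^2/2), whose density peaks at 1/sqrt(2*pi).
def trapezoid(f, a, b, n=200_000):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

bound = trapezoid(lambda z: math.exp(-z * z / 2), -12.0, 12.0) / (2 * math.pi)
peak = 1 / math.sqrt(2 * math.pi)  # sup of the standard normal density
assert peak <= bound + 1e-9  # the inversion bound; equality for the Gaussian
assert abs(bound - peak) < 1e-6
```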

Proof of Claim (6.59).

We start by fixing some constants.

C0=0(1cosy)dyy1+α.\displaystyle C_{0}=\int_{0}^{\infty}(1-\cos{y})\frac{dy}{y^{1+\alpha}}. (6.60)

For y\in(0,1], note that 1-\cos y\leq y^{2}/2, and hence \frac{|1-\cos y|}{y^{1+\alpha}}\leq\frac{1}{2y^{\alpha-1}}. For y\in(1,\infty), note that 1-\cos y\in[0,2] and hence \frac{|1-\cos y|}{y^{1+\alpha}}\leq 2/y^{\alpha+1}. Due to \alpha\in(0,2), we have C_{0}=\int_{0}^{\infty}(1-\cos{y})\frac{dy}{y^{1+\alpha}}\in(0,\infty). Next, choose positive real numbers \theta,\ \delta such that

θ2α2(2α)C04,\displaystyle\frac{\theta^{2-\alpha}}{2(2-\alpha)}\leq\frac{C_{0}}{4}, (6.61)
δαθαC04.\displaystyle\frac{\delta}{\alpha\theta^{\alpha}}\leq\frac{C_{0}}{4}. (6.62)
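The finiteness of C_{0} argued above can be checked numerically: splitting the integral at y=1 gives the analytic bound C_{0}\leq\frac{1}{2(2-\alpha)}+\frac{2}{\alpha}, and a truncated quadrature stays below it. A minimal sketch for a sample \alpha (illustrative only):

```python
import math

# Truncated quadrature of C0 = \int_0^infty (1 - cos y) / y^{1 + alpha} dy,
# compared with the bound 1/(2*(2 - alpha)) + 2/alpha from splitting at y = 1.
alpha = 1.5  # sample value in (0, 2)

def trapezoid(f, a, b, n):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

c0_trunc = trapezoid(lambda y: (1 - math.cos(y)) / y ** (1 + alpha), 1e-4, 50.0, 500_000)
upper = 1 / (2 * (2 - alpha)) + 2 / alpha  # = 7/3 for alpha = 1.5
assert 0 < c0_trunc < upper
```

The truncation errors near 0 and beyond 50 are controlled by the same two tail bounds used in the text.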

For any M>0M>0 and z0z\neq 0, observe that (by setting y=|z|xy=|z|x in the last step)

xM|z|(1cos(zx))dxx1+α|z|α\displaystyle\frac{\int_{x\geq\frac{M}{|z|}}\big{(}1-\cos(zx)\big{)}\frac{dx}{x^{1+\alpha}}}{|z|^{\alpha}} =xM|z|(1cos(|z|x))dxx1+α|z|α=M(1cosy)dyy1+α.\displaystyle=\frac{\int_{x\geq\frac{M}{|z|}}\big{(}1-\cos(|z|x)\big{)}\frac{dx}{x^{1+\alpha}}}{|z|^{\alpha}}=\int_{M}^{\infty}\big{(}1-\cos{y}\big{)}\frac{dy}{y^{1+\alpha}}.
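The substitution y=|z|x above makes the normalized tail integral independent of z; this scale invariance can be verified numerically on a finite window (all parameters below are illustrative):

```python
import math

# Scale invariance from the substitution y = |z| x:
# |z|^{-alpha} * \int_{M/|z|}^{L/|z|} (1 - cos(z x)) x^{-(1+alpha)} dx
#     = \int_{M}^{L} (1 - cos y) y^{-(1+alpha)} dy.
# (alpha, M, L, and the z values are illustrative.)
alpha, M, L = 1.5, 5.0, 100.0

def trapezoid(f, a, b, n=200_000):
    h = (b - a) / n
    s = 0.5 * (f(a) + f(b)) + sum(f(a + i * h) for i in range(1, n))
    return s * h

ref = trapezoid(lambda y: (1 - math.cos(y)) / y ** (1 + alpha), M, L)
vals = {
    z: trapezoid(lambda x: (1 - math.cos(z * x)) / x ** (1 + alpha), M / z, L / z) / z ** alpha
    for z in (3.0, 7.0)
}
assert all(abs(v - ref) < 1e-5 for v in vals.values())
```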

Therefore, by fixing some M>θM>\theta large enough, we have

1|z|αxM/|z|(1cos(zx))dxx1+αC04z0.\displaystyle\frac{1}{|z|^{\alpha}}\int_{x\geq M/|z|}\big{(}1-\cos(zx)\big{)}\frac{dx}{x^{1+\alpha}}\leq\frac{C_{0}}{4}\qquad\forall z\neq 0. (6.63)

To proceed, we compare \int_{(0,z_{0})}\big{(}1-\cos(zx)\big{)}\mu(dx) with \int_{0}^{\infty}\big{(}1-\cos(zx)\big{)}\frac{dx}{x^{1+\alpha}}. Recall that z_{0} is the constant prescribed in the statement of Proposition 4.1. For any z\in\mathbb{R} such that |z|>M/z_{0},

1|z|α[(0,z0)(1cos(zx))μ(dx)0(1cos(zx))dxx1+α]1|z|α[(θ/|z|,M/|z|)(1cos(zx))μ(dx)0(1cos(zx))dxx1+α]due to our choice of M>θ and |z|>M/z01|z|α0θ/|z|(1cos(zx))dxx1+α\ensurestackMath\stackon[1pt]=ΔI1(z)1|z|αM/|z|(1cos(zx))dxx1+α\ensurestackMath\stackon[1pt]=ΔI2(z)+1|z|α[[θ/|z|,M/|z|)(1cos(zx))μ(dx)[θ/|z|,M/|z|)(1cos(zx))dxx1+α]\ensurestackMath\stackon[1pt]=ΔI3(z).\begin{split}&\frac{1}{|z|^{\alpha}}\bigg{[}\int_{(0,z_{0})}\Big{(}1-\cos(zx)\Big{)}\mu(dx)-\int_{0}^{\infty}\Big{(}1-\cos(zx)\Big{)}\frac{dx}{x^{1+\alpha}}\bigg{]}\\ &\geq\frac{1}{|z|^{\alpha}}\bigg{[}\int_{(\theta/|z|,M/|z|)}\Big{(}1-\cos(zx)\Big{)}\mu(dx)-\int_{0}^{\infty}\Big{(}1-\cos(zx)\Big{)}\frac{dx}{x^{1+\alpha}}\bigg{]}\\ &\qquad\text{due to our choice of $M>\theta$ and $|z|>M/z_{0}$}\\ &\geq-\underbrace{\frac{1}{|z|^{\alpha}}\int_{0}^{\theta/|z|}\Big{(}1-\cos(zx)\Big{)}\frac{dx}{x^{1+\alpha}}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}I_{1}(z)}-\underbrace{\frac{1}{|z|^{\alpha}}\int_{M/|z|}^{\infty}\Big{(}1-\cos(zx)\Big{)}\frac{dx}{x^{1+\alpha}}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}I_{2}(z)}\\ &\qquad+\underbrace{\frac{1}{|z|^{\alpha}}\bigg{[}\int_{[\theta/|z|,M/|z|)}\Big{(}1-\cos(zx)\Big{)}\mu(dx)-\int_{[\theta/|z|,M/|z|)}\Big{(}1-\cos(zx)\Big{)}\frac{dx}{x^{1+\alpha}}\bigg{]}}_{\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}I_{3}(z)}.\end{split} (6.64)

We bound the terms I1(z)I_{1}(z), I2(z)I_{2}(z), and I3(z)I_{3}(z) separately. First, for any z0z\neq 0,

I1(z)\displaystyle I_{1}(z) 1|z|α0θ/|z|z2x22dxx1+α due to 1cosww22w\displaystyle\leq\frac{1}{|z|^{\alpha}}\int_{0}^{\theta/|z|}\frac{z^{2}x^{2}}{2}\frac{dx}{x^{1+\alpha}}\qquad\text{ due to }1-\cos w\leq\frac{w^{2}}{2}\ \forall w\in\mathbb{R}
=120θy1αdyby setting y=|z|x\displaystyle=\frac{1}{2}\int_{0}^{\theta}y^{1-\alpha}dy\qquad\text{by setting }y=|z|x
=12θ2α2αC04due to (6.61).\displaystyle=\frac{1}{2}\cdot\frac{\theta^{2-\alpha}}{2-\alpha}\leq\frac{C_{0}}{4}\ \ \ \text{due to \eqref{chooseTheta_Lcont}}. (6.65)

For I2(z)I_{2}(z), it follows immediately from (6.63) that

I2(z)C04z0.\displaystyle I_{2}(z)\leq\frac{C_{0}}{4}\qquad\forall z\neq 0. (6.66)

Next, in order to bound I_{3}(z), we consider the function h(z)\mathrel{\ensurestackMath{\stackon[1pt]{=}{\scriptscriptstyle\Delta}}}1-\cos{z}. Since h(z) is uniformly continuous on [\theta,M], we can find some N\in\mathbb{N},\ t_{0}>1, and a sequence of real numbers M=x_{0}>x_{1}>\cdots>x_{N}=\theta such that

xj1xj=t0j=1,2,,N,|h(x)h(y)|<δj=1,2,,N,x,y[xj,xj1].\begin{split}&\frac{x_{j-1}}{x_{j}}=t_{0}\qquad\forall j=1,2,\cdots,N,\\ &|h(x)-h(y)|<\delta\qquad\forall j=1,2,\cdots,N,\ x,y\in[x_{j},x_{j-1}].\end{split} (6.67)

In other words, we use a geometric sequence \{x_{0},x_{1},\cdots,x_{N}\} to partition [\theta,M] into N intervals. On each of these intervals, the fluctuation of h(z)=1-\cos{z} is bounded by the constant \delta fixed in (6.62). Now fix some \Delta>0 such that (recall that \epsilon>0 is prescribed in the statement of this proposition)

(1Δ)t0α+ϵ>1.\displaystyle(1-\Delta)t_{0}^{\alpha+\epsilon}>1. (6.68)
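The geometric partition in (6.67) is straightforward to construct explicitly: take t_{0}=(M/\theta)^{1/N} and x_{j}=M/t_{0}^{j}; since |h'|\leq 1, choosing N large enough forces the fluctuation of h below \delta on every cell. A minimal sketch with illustrative values of \theta, M, \delta:

```python
import math

# Explicit geometric partition M = x_0 > x_1 > ... > x_N = theta with constant
# ratio t0 = (M/theta)^{1/N}.  Since |d/dz (1 - cos z)| <= 1, a large enough N
# keeps the fluctuation of h on every cell below delta.  (Values illustrative.)
theta, M, delta = 0.5, 20.0, 0.05
N = 1600  # chosen so the widest cell, M * (1 - 1/t0), is shorter than delta
t0 = (M / theta) ** (1.0 / N)
x = [M / t0 ** j for j in range(N + 1)]  # x[0] = M, ..., x[N] = theta

assert all(x[j - 1] > x[j] for j in range(1, N + 1))  # strictly decreasing
assert abs(x[-1] - theta) < 1e-9

h = lambda z: 1 - math.cos(z)
worst = 0.0
for j in range(1, N + 1):
    lo, hi = x[j], x[j - 1]
    samples = [h(lo + (hi - lo) * i / 50) for i in range(51)]
    worst = max(worst, max(samples) - min(samples))
assert worst < delta  # the uniform-continuity requirement in (6.67)
```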

Since \mu[x,\infty) is regularly varying as x\rightarrow 0 with index -(\alpha+2\epsilon), for g(y)=\mu[1/y,\infty) we have g\in\mathcal{RV}_{\alpha+2\epsilon}(y) as y\to\infty. By Potter's bound (see Proposition 2.6 in [53]), there exists \bar{y}_{1}>0 such that

g(ty)g(y)(1Δ)tα+ϵyy¯1,t1.\displaystyle\frac{g(ty)}{g(y)}\geq(1-\Delta)t^{\alpha+\epsilon}\qquad\forall y\geq\bar{y}_{1},\ t\geq 1. (6.69)

Meanwhile, define

g~(y)=yα,να(dx)=𝐈(0,)(x)dxx1+α\widetilde{g}(y)=y^{\alpha},\qquad\nu_{\alpha}(dx)=\mathbf{I}_{(0,\infty)}(x)\frac{dx}{x^{1+\alpha}}

and note that g~(y)=να(1/y,)\widetilde{g}(y)=\nu_{\alpha}(1/y,\infty). Due to g𝒱α+2ϵg\in\mathcal{RV}_{\alpha+2\epsilon}, we can find some y¯2>0\bar{y}_{2}>0 such that

g(y)t0α1(1Δ)t0α+ϵ1g~(y)yy¯2.\displaystyle g(y)\geq\frac{t_{0}^{\alpha}-1}{(1-\Delta)t_{0}^{\alpha+\epsilon}-1}\cdot\widetilde{g}(y)\ \ \ \forall y\geq\bar{y}_{2}. (6.70)

Let M~=max{M/z0,My¯1,My¯2}\widetilde{M}=\max\{M/z_{0},M\bar{y}_{1},M\bar{y}_{2}\}. For any |z|M~|z|\geq\widetilde{M}, we have |z|M/z0|z|\geq M/z_{0} and |z|xj|z|My¯1y¯2\frac{|z|}{x_{j}}\geq\frac{|z|}{M}\geq\bar{y}_{1}\vee\bar{y}_{2} for any j=0,1,,Nj=0,1,\cdots,N. As a result, for zz\in\mathbb{R} with |z|M~|z|\geq\widetilde{M} and any j=1,2,,Nj=1,2,\cdots,N,

μ[xj/|z|,xj1/|z|)\displaystyle\mu[x_{j}/|z|,x_{j-1}/|z|) =g(|z|/xj)g(|z|/xj1) by definition of g(y)=μ[1/y,)\displaystyle=g(|z|/x_{j})-g(|z|/x_{j-1})\qquad\text{ by definition of }g(y)=\mu[1/y,\infty)
=g(t0|z|/xj1)g(|z|/xj1) due to xj1=t0xj; see (6.67)\displaystyle=g(t_{0}|z|/x_{j-1})-g(|z|/x_{j-1})\qquad\text{ due to $x_{j-1}=t_{0}x_{j}$; see \eqref{uContOfG_lCont}}
g(|z|/xj1)((1Δ)t0α+ϵ1)due to |z|xjy¯1y¯2 and (6.69)\displaystyle\geq g(|z|/x_{j-1})\cdot\Big{(}(1-\Delta)t_{0}^{\alpha+\epsilon}-1\Big{)}\qquad\text{due to $\frac{|z|}{x_{j}}\geq\bar{y}_{1}\vee\bar{y}_{2}$ and \eqref{potterBound_lCont}}
g~(|z|/xj1)(t0α1)due to (6.70).\displaystyle\geq\widetilde{g}(|z|/x_{j-1})\cdot(t_{0}^{\alpha}-1)\qquad\text{due to \eqref{gBound_lCont}}.

On the other hand,

\nu_{\alpha}[x_{j}/|z|,\,x_{j-1}/|z|) = \widetilde{g}(|z|/x_{j})-\widetilde{g}(|z|/x_{j-1}) = \widetilde{g}(|z|/x_{j-1})\cdot(t_{0}^{\alpha}-1).

Therefore, for any $z\in\mathbb{R}$ with $|z|\geq\widetilde{M}$, we have $\mu\big(E_{j}(z)\big)\geq\nu_{\alpha}\big(E_{j}(z)\big)$ for all $j\in[N]$, where $E_{j}(z)=[x_{j}/|z|,\,x_{j-1}/|z|)$. This leads to

I_{3}(z)
= \frac{1}{|z|^{\alpha}}\sum_{j=1}^{N}\bigg[\int_{E_{j}(z)}\big(1-\cos(zx)\big)\mu(dx)-\int_{E_{j}(z)}\big(1-\cos(zx)\big)\frac{dx}{x^{1+\alpha}}\bigg]
\geq \frac{1}{|z|^{\alpha}}\sum_{j=1}^{N}\Big[\underline{m}_{j}\cdot\mu\big(E_{j}(z)\big)-\bar{m}_{j}\cdot\nu_{\alpha}\big(E_{j}(z)\big)\Big]\qquad\text{with }\bar{m}_{j}=\max_{u\in[x_{j},x_{j-1}]}h(u),\ \underline{m}_{j}=\min_{u\in[x_{j},x_{j-1}]}h(u)
= \frac{1}{|z|^{\alpha}}\sum_{j=1}^{N}\underline{m}_{j}\cdot\Big[\mu\big(E_{j}(z)\big)-\nu_{\alpha}\big(E_{j}(z)\big)\Big]+\frac{1}{|z|^{\alpha}}\sum_{j=1}^{N}\big(\underline{m}_{j}-\bar{m}_{j}\big)\cdot\nu_{\alpha}\big(E_{j}(z)\big)
\geq 0+\frac{1}{|z|^{\alpha}}\sum_{j=1}^{N}\big(\underline{m}_{j}-\bar{m}_{j}\big)\cdot\nu_{\alpha}\big(E_{j}(z)\big)\qquad\text{due to }\mu\big(E_{j}(z)\big)\geq\nu_{\alpha}\big(E_{j}(z)\big)\text{ and }\underline{m}_{j}\geq 0
\geq -\frac{\delta}{|z|^{\alpha}}\sum_{j=1}^{N}\nu_{\alpha}\big(E_{j}(z)\big) = -\frac{\delta}{|z|^{\alpha}}\nu_{\alpha}[\theta/|z|,\,M/|z|)\qquad\text{due to (6.67)}
= -\frac{\delta}{|z|^{\alpha}}\int_{\theta/|z|}^{M/|z|}\frac{dx}{x^{1+\alpha}}
\geq -\frac{\delta}{|z|^{\alpha}}\int_{\theta/|z|}^{\infty}\frac{dx}{x^{1+\alpha}} = -\frac{\delta}{\alpha\theta^{\alpha}}
\geq -\frac{C_{0}}{4}\qquad\text{due to (6.62).} (6.71)

Plugging (6.65), (6.66), and (6.71) back into (6.64), we have shown that for all $|z|\geq\widetilde{M}$,

\frac{1}{|z|^{\alpha}}\int_{(0,z_{0})}\big(1-\cos(zx)\big)\mu(dx)
\geq -\frac{3C_{0}}{4}+\frac{1}{|z|^{\alpha}}\int_{0}^{\infty}\big(1-\cos(zx)\big)\frac{dx}{x^{1+\alpha}}
= -\frac{3C_{0}}{4}+\int_{0}^{\infty}\big(1-\cos y\big)\frac{dy}{y^{1+\alpha}}\qquad\text{by substituting }y=|z|x
= -\frac{3C_{0}}{4}+C_{0} = \frac{C_{0}}{4}\qquad\text{by definition of }C_{0}=\int_{0}^{\infty}(1-\cos y)\frac{dy}{y^{1+\alpha}}.

To conclude the proof of claim (6.59), we set $\widetilde{C}=C_{0}/4$.
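As a quick numerical sanity check (not part of the proof), the constant $C_{0}=\int_{0}^{\infty}(1-\cos y)\,\frac{dy}{y^{1+\alpha}}$ admits the known closed form $\pi/\big(2\Gamma(1+\alpha)\sin(\pi\alpha/2)\big)$ for $\alpha\in(0,2)$; the following sketch, which assumes SciPy is available, compares the two:

```python
import math
from scipy.integrate import quad

def C0_numeric(alpha, n_periods=100):
    """Evaluate C0 = int_0^infty (1 - cos y) y^{-(1+alpha)} dy numerically.

    Integrate period by period up to M = 2*pi*n_periods, then add the
    analytic tail int_M^infty y^{-(1+alpha)} dy = M^{-alpha}/alpha; the
    neglected oscillatory remainder is O(M^{-(1+alpha)}).
    """
    f = lambda y: (1.0 - math.cos(y)) / y ** (1.0 + alpha)
    total = sum(quad(f, 2 * math.pi * k, 2 * math.pi * (k + 1), limit=200)[0]
                for k in range(n_periods))
    M = 2 * math.pi * n_periods
    return total + M ** (-alpha) / alpha

def C0_closed_form(alpha):
    """Known closed form pi / (2 * Gamma(1 + alpha) * sin(pi * alpha / 2))."""
    return math.pi / (2.0 * math.gamma(1.0 + alpha) * math.sin(math.pi * alpha / 2.0))

for a in (0.5, 1.0, 1.5):
    print(a, C0_numeric(a), C0_closed_form(a))
```

For $\alpha=1$ the closed form reduces to $\pi/2$, which matches the classical identity $\int_{0}^{\infty}(1-\cos y)y^{-2}\,dy=\pi/2$.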

Again, the proof of Proposition 4.3 makes use of the inversion formula.

Proof of Proposition 4.3.

Let us denote the characteristic functions of $Y^{\prime}(t)$ and $Y(t)$ by $\varphi_{t}$ and $\widetilde{\varphi}_{t}$, respectively. Repeating the arguments using complex conjugates in (6.58), we obtain

|\widetilde{\varphi}_{t}(z)| = \exp\Big(-t\int_{|x|<b^{N}}\big(1-\cos(zx)\big)\mu(dx)\Big).

As for $\varphi_{t}$, by Proposition 14.9 in [58], we get

|\varphi_{t}(z)| = \exp\big(-t|z|^{\alpha}\eta(z)\big) (6.72)

where $\eta(z)$ is a non-negative function, continuous on $\mathbb{R}\setminus\{0\}$, satisfying $\eta(bz)=\eta(z)$ and

\eta(z) = \frac{\int_{\mathbb{R}}\big(1-\cos(zx)\big)\mu(dx)}{|z|^{\alpha}}\qquad\forall z\neq 0.

This implies $\eta(z)=\eta(-z)$ for all $z\neq 0$. Furthermore, we claim the existence of some $c>0$ such that

\eta(z)\geq c\qquad\forall z\in[1,b]. (6.73)

Then, due to the self-similarity of $\mu$ (i.e., $\eta(bz)=\eta(z)$), we have $\eta(z)\geq c$ for all $z\neq 0$. In the meantime, since $1-\cos(zx)\leq 2$, note that

\frac{1}{|z|^{\alpha}}\int_{|x|\geq b^{N}}\big(1-\cos(zx)\big)\mu(dx) \leq \frac{2\,\mu\{x:\ |x|\geq b^{N}\}}{|z|^{\alpha}}.

By picking $M>0$ large enough, it holds for any $|z|\geq M$ that

\frac{1}{|z|^{\alpha}}\int_{|x|\geq b^{N}}\big(1-\cos(zx)\big)\mu(dx) \leq \frac{c}{2}. (6.74)

Therefore, for any $|z|\geq M$,

\int_{|x|<b^{N}}\big(1-\cos(zx)\big)\mu(dx) = \int_{x\in\mathbb{R}}\big(1-\cos(zx)\big)\mu(dx)-\int_{|x|\geq b^{N}}\big(1-\cos(zx)\big)\mu(dx)
= \eta(z)\cdot|z|^{\alpha}-\int_{|x|\geq b^{N}}\big(1-\cos(zx)\big)\mu(dx)
\geq c|z|^{\alpha}-\frac{c}{2}|z|^{\alpha} = \frac{c}{2}|z|^{\alpha}\qquad\text{using (6.73) and (6.74)},

and hence $|\widetilde{\varphi}_{t}(z)|\leq\exp\big(-\frac{c}{2}t|z|^{\alpha}\big)$ for all $|z|\geq M$. Applying the inversion formula, we get, for any $t>0$,

\lVert f_{Y(t)}\rVert_{\infty} \leq \frac{1}{2\pi}\int|\widetilde{\varphi}_{t}(z)|\,dz
\leq \frac{1}{2\pi}\bigg[2M+\int_{|z|\geq M}|\widetilde{\varphi}_{t}(z)|\,dz\bigg]
\leq \frac{M}{\pi}+\frac{1}{2\pi}\int\exp\Big(-\frac{c}{2}t|z|^{\alpha}\Big)\,dz
= \frac{M}{\pi}+\frac{1}{2\pi}\cdot\frac{1}{t^{1/\alpha}}\int\exp\Big(-\frac{c}{2}|x|^{\alpha}\Big)\,dx\qquad\text{using }x=t^{1/\alpha}z
= \frac{M}{\pi}+\frac{C_{1}}{t^{1/\alpha}}\qquad\text{where }C_{1}=\frac{1}{2\pi}\int\exp\Big(-\frac{c}{2}|x|^{\alpha}\Big)\,dx.

To conclude the proof, we set $C=\frac{M}{\pi}+C_{1}$. It only remains to prove claim (6.73).

Proof of Claim (6.73).

We proceed by contradiction. Suppose $\inf_{z\in[1,b]}\eta(z)=0$. Since $\eta$ is continuous on the compact set $[1,b]$, the infimum is attained, so there exists some $z\in[1,b]$ with $\eta(z)=0$, i.e.,

\int_{\mathbb{R}}\big(1-\cos(zx)\big)\mu(dx) = 0.

Now for any $\epsilon>0$, define the sets

S = \{x\in\mathbb{R}:\ 1-\cos(zx)>0\} = \mathbb{R}\setminus\{\tfrac{2\pi}{z}k:\ k\in\mathbb{Z}\};
S_{\epsilon} = \{x\in\mathbb{R}:\ 1-\cos(zx)\geq\epsilon\}.

Observe that:

  • For any $\epsilon>0$, we have $\epsilon\cdot\mu(S_{\epsilon})\leq\int_{S_{\epsilon}}\big(1-\cos(zx)\big)\mu(dx)\leq\int_{\mathbb{R}}\big(1-\cos(zx)\big)\mu(dx)=0$, which implies $\mu(S_{\epsilon})=0$;

  • Meanwhile, since $S=\bigcup_{\epsilon>0}S_{\epsilon}$ and $S_{\epsilon}$ increases as $\epsilon\downarrow 0$, continuity from below gives $\mu(S)=\lim_{\epsilon\downarrow 0}\mu(S_{\epsilon})=0$.

Together with the fact that $\mu(\mathbb{R})>0$ (so that the process is non-trivial) and that the Lévy measure $\mu$ does not charge the origin, there must be some $m\in\mathbb{Z}\setminus\{0\}$ and $\delta>0$ such that

\mu(\{\tfrac{2\pi}{z}m\}) = \delta > 0.

Besides, from $\mu(S)=0$ we know that $\mu\big((-\tfrac{2\pi}{z},\tfrac{2\pi}{z})\setminus\{0\}\big)=0$, since the punctured interval $(-\tfrac{2\pi}{z},\tfrac{2\pi}{z})\setminus\{0\}$ contains no lattice point $\tfrac{2\pi}{z}k$ and is therefore a subset of $S$. However, by the definition of semi-stable processes in (4.4), we have $\mu=b^{-\alpha}T_{b}\mu$, where the transformation $T_{r}$ ($r>0$) acting on a Borel measure $\rho$ on $\mathbb{R}$ is defined by $(T_{r}\rho)(B)=\rho(r^{-1}B)$. This implies

\mu(\{\tfrac{2\pi m}{z}b^{-k}\}) > 0\qquad\forall k=1,2,3,\ldots,

which contradicts $\mu\big((-\tfrac{2\pi}{z},\tfrac{2\pi}{z})\setminus\{0\}\big)=0$ once $k$ is large enough that $|\tfrac{2\pi m}{z}b^{-k}|<\tfrac{2\pi}{z}$. This contradiction establishes claim (6.73) and concludes the proof. ∎

References

  • [1] S. Asmussen, P. Glynn, and J. Pitman. Discretization Error in Simulation of One-Dimensional Reflecting Brownian Motion. The Annals of Applied Probability, 5(4):875–896, 1995.
  • [2] S. Asmussen and D. P. Kroese. Improved algorithms for rare event simulation with heavy tails. Advances in Applied Probability, 38(2):545–558, 2006.
  • [3] S. Asmussen and J. Rosiński. Approximations of small jumps of Lévy processes with a view towards simulation. Journal of Applied Probability, 38(2):482–493, 2001.
  • [4] A. Bassamboo, S. Juneja, and A. Zeevi. On the inefficiency of state-independent importance sampling in the presence of heavy tails. Operations Research Letters, 35(2):251–260, 2007.
  • [5] M. L. Bianchi, S. T. Rachev, Y. S. Kim, and F. J. Fabozzi. Tempered infinitely divisible distributions and processes. Theory of Probability & Its Applications, 55(1):2–26, 2011.
  • [6] J. Blanchet and P. Glynn. Efficient rare-event simulation for the maximum of heavy-tailed random walks. The Annals of Applied Probability, 18(4):1351 – 1378, 2008.
  • [7] J. Blanchet, P. Glynn, and J. Liu. Efficient rare event simulation for heavy-tailed multiserver queues. Technical report, Department of Statistics, Columbia University, 2008.
  • [8] J. Blanchet, H. Hult, and K. Leder. Rare-event simulation for stochastic recurrence equations with heavy-tailed innovations. ACM Trans. Model. Comput. Simul., 23(4), dec 2013.
  • [9] J. H. Blanchet and J. Liu. State-dependent importance sampling for regularly varying random walks. Advances in Applied Probability, 40(4):1104–1128, 2008.
  • [10] S. Borak, A. Misiorek, and R. Weron. Models for heavy-tailed asset returns, pages 21–55. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011.
  • [11] O. Boxma, E. Cahen, D. Koops, and M. Mandjes. Linear networks: rare-event simulation and Markov modulation. Methodology and Computing in Applied Probability, 2019.
  • [12] S. Boyarchenko and S. Levendorskii. Efficient evaluation of joint pdf of a Lévy process, its extremum, and hitting time of the extremum, 2023.
  • [13] S. Boyarchenko and S. Levendorskii. Simulation of a Lévy process, its extremum, and hitting time of the extremum via characteristic functions, 2023.
  • [14] J. A. Bucklew, P. Ney, and J. S. Sadowsky. Monte Carlo simulation and large deviations theory for uniformly recurrent Markov chains. Journal of Applied Probability, 27(1):44–59, 1990.
  • [15] P. Carr, H. Geman, D. Madan, and M. Yor. The fine structure of asset returns: An empirical investigation. The Journal of Business, 75(2):305–332, 2002.
  • [16] P. Carr, H. Geman, D. B. Madan, and M. Yor. Stochastic volatility for Lévy processes. Mathematical Finance, 13(3):345–382, 2003.
  • [17] J. I. G. Cázares, A. Kohatsu-Higa, and A. Mijatović. Joint density of the stable process and its supremum: Regularity and upper bounds. Bernoulli, 29(4):3443–3469, 2023.
  • [18] J. I. G. Cázares, A. Mijatović, and G. U. Bravo. ε-strong simulation of the convex minorants of stable processes and meanders. Electronic Journal of Probability, 25:1–33, 2020.
  • [19] L. Chaumont. On the law of the supremum of Lévy processes. The Annals of Probability, 41(3A):1191–1217, 2013.
  • [20] B. Chen, J. Blanchet, C.-H. Rhee, and B. Zwart. Efficient rare-event simulation for multiple jump events in regularly varying random walks and compound Poisson processes. Mathematics of Operations Research, 44(3):919–942, 2019.
  • [21] J. E. Cohen, R. A. Davis, and G. Samorodnitsky. COVID-19 cases and deaths in the United States follow Taylor's law for heavy-tailed distributions with infinite variance. Proceedings of the National Academy of Sciences, 119(38):e2209234119, 2022.
  • [22] L. Coutin, M. Pontier, and W. Ngom. Joint distribution of a lévy process and its running supremum. Journal of Applied Probability, 55(2):488–512, 2018.
  • [23] J. I. G. Cázares, F. Lin, and A. Mijatović. Fast exact simulation of the first passage of a tempered stable subordinator across a non-increasing function, 2023.
  • [24] S. Dereich. Multilevel Monte Carlo algorithms for Lévy-driven SDEs with Gaussian correction. The Annals of Applied Probability, 21(1):283 – 311, 2011.
  • [25] S. Dereich and F. Heidenreich. A multilevel Monte Carlo algorithm for Lévy-driven stochastic differential equations. Stochastic Processes and their Applications, 121(7):1565–1587, 2011.
  • [26] E. H. A. Dia and D. Lamberton. Connecting discrete and continuous lookback or hindsight options in exponential Lévy models. Advances in Applied Probability, 43(4):1136–1165, 2011.
  • [27] P. Dupuis, K. Leder, and H. Wang. Importance sampling for sums of random variables with regularly varying tails. ACM Trans. Model. Comput. Simul., 17(3):14–es, jul 2007.
  • [28] P. Dupuis, A. D. Sezer, and H. Wang. Dynamic importance sampling for queueing networks. The Annals of Applied Probability, 17(4):1306 – 1346, 2007.
  • [29] R. Durrett. Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019.
  • [30] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events: for insurance and finance, volume 33. Springer Science & Business Media, 2013.
  • [31] A. Ferreiro-Castilla, A. Kyprianou, R. Scheichl, and G. Suryanarayana. Multilevel Monte Carlo simulation for Lévy processes based on the Wiener–Hopf factorisation. Stochastic Processes and their Applications, 124(2):985–1010, 2014.
  • [32] M. B. Giles. Multilevel Monte Carlo path simulation. Operations Research, 56(3):607–617, 2008.
  • [33] M. B. Giles and Y. Xia. Multilevel Monte Carlo for exponential Lévy models. Finance and Stochastics, 21(4):995–1026, 2017.
  • [34] J. González Cázares and A. Mijatović. Simulation of the drawdown and its duration in Lévy models via stick-breaking Gaussian approximation. Finance and Stochastics, 26(4):671–732, 2022.
  • [35] J. I. González Cázares, A. Mijatović, and G. Uribe Bravo. Geometrically convergent simulation of the extrema of Lévy processes. Mathematics of Operations Research, 47(2):1141–1168, 2022.
  • [36] J. I. González Cázares, A. Mijatović, and G. U. Bravo. Exact simulation of the extrema of stable processes. Advances in Applied Probability, 51(4):967–993, 2019.
  • [37] T. Gudmundsson and H. Hult. Markov chain monte carlo for computing rare-event probabilities for a heavy-tailed random walk. Journal of Applied Probability, 51(2):359–376, 2014.
  • [38] M. Gurbuzbalaban, U. Simsekli, and L. Zhu. The heavy-tail phenomenon in SGD. In M. Meila and T. Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 3964–3975. PMLR, 18–24 Jul 2021.
  • [39] S. Heinrich. Multilevel Monte Carlo methods. In S. Margenov, J. Waśniewski, and P. Yalamov, editors, Large-Scale Scientific Computing, pages 58–67, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg.
  • [40] T. Hesterberg. Weighted average importance sampling and defensive mixture distributions. Technometrics, 37(2):185–194, 1995.
  • [41] L. Hodgkinson and M. Mahoney. Multiplicative noise and heavy tails in stochastic optimization. In International Conference on Machine Learning, pages 4262–4274. PMLR, 2021.
  • [42] H. Hult, S. Juneja, and K. Murthy. Exact and efficient simulation of tail probabilities of heavy-tailed infinite series. 2016.
  • [43] A. Kuznetsov, A. E. Kyprianou, J. C. Pardo, and K. van Schaik. A Wiener–Hopf Monte Carlo simulation technique for Lévy processes. The Annals of Applied Probability, 21(6):2171 – 2190, 2011.
  • [44] M. Kwaśnicki, J. Małecki, and M. Ryznar. Suprema of Lévy processes. The Annals of Probability, 41(3B):2047 – 2065, 2013.
  • [45] Y. Li. Queuing theory with heavy tails and network traffic modeling. working paper or preprint, Oct. 2018.
  • [46] E. Mariucci and M. Reiß. Wasserstein and total variation distance between marginals of Lévy processes. Electronic Journal of Statistics, 12(2):2482 – 2514, 2018.
  • [47] Z. Michna. Formula for the supremum distribution of a spectrally positive Lévy process, 2012.
  • [48] Z. Michna. Explicit formula for the supremum distribution of a spectrally negative stable process. Electronic Communications in Probability, 18:1–6, 2013.
  • [49] Z. Michna, Z. Palmowski, and M. Pistorius. The distribution of the supremum for spectrally asymmetric Lévy processes, 2014.
  • [50] A. Mijatović and P. Tankov. A new look at short-term implied volatility in asset price models with jumps. Mathematical Finance, 26(1):149–183, 2016.
  • [51] K. R. A. Murthy, S. Juneja, and J. Blanchet. State-independent importance sampling for random walks with regularly varying increments. Stochastic Systems, 4(2):321–374, 2014.
  • [52] J. Pitman and G. U. Bravo. The convex minorant of a Lévy process. The Annals of Probability, 40(4):1636 – 1674, 2012.
  • [53] S. I. Resnick. Heavy-tail phenomena: probabilistic and statistical modeling. Springer Science & Business Media, 2007.
  • [54] C.-H. Rhee, J. Blanchet, B. Zwart, et al. Sample path large deviations for Lévy processes and random walks with regularly varying increments. The Annals of Probability, 47(6):3551–3605, 2019.
  • [55] C.-H. Rhee and P. W. Glynn. Unbiased estimation with square root convergence for sde models. Operations Research, 63(5):1026–1043, 2015.
  • [56] J. Rosiński. Tempering stable processes. Stochastic Processes and their Applications, 117(6):677–707, 2007.
  • [57] P. Sabino. Pricing energy derivatives in markets driven by tempered stable and CGMY processes of Ornstein–Uhlenbeck type. Risks, 10(8), 2022.
  • [58] K.-i. Sato. Lévy processes and infinitely divisible distributions. Cambridge University Press, 1999.
  • [59] G. Torrisi. Simulating the ruin probability of risk processes with delay in claim settlement. Stochastic Processes and their Applications, 112(2):225–244, 2004.
  • [60] X. Wang and C.-H. Rhee. Rare-event simulation for multiple jump events in heavy-tailed Lévy processes with infinite activities. In Proceedings of the Winter Simulation Conference, WSC '20, pages 409–420. IEEE Press, 2021.
  • [61] X. Wang and C.-H. Rhee. Large deviations and metastability analysis for heavy-tailed dynamical systems, 2023.
  • [62] X. Wang and C.-H. Rhee. Importance sampling strategy for heavy-tailed systems with catastrophe principle. In Proceedings of the Winter Simulation Conference, WSC '23, pages 76–90. IEEE Press, 2024.

Appendix A Barrier Option Pricing

A.1 Problem Setting

This section considers the estimation of probabilities $\mathbf{P}(A_{n})$ with $A_{n}=\{\bar{X}_{n}\in A\}$ and

A \triangleq \{\xi\in\mathbb{D}:\ \xi(1)\leq-b,\ \sup_{t\leq 1}\xi(t)+ct\geq a\},

which corresponds to rare-event simulation in the context of a down-and-in barrier option. Here, we assume that $a,b>0$ and $c<a$. We consider the two-sided case in Assumption 1. That is, $X(t)$ is a centered Lévy process with Lévy measure $\nu$, and there exist some $\alpha,\alpha^{\prime}>1$ such that $\nu[x,\infty)\in\mathcal{RV}_{-\alpha}(x)$ and $\nu(-\infty,-x]\in\mathcal{RV}_{-\alpha^{\prime}}(x)$ as $x\to\infty$. Also, we impose an alternative version of Assumption 2 throughout. Let $X^{(-z,z)}(t)$ be the Lévy process with generating triplet $(c_{X},\sigma,\nu|_{(-z,z)})$. That is, $X^{(-z,z)}$ is a modified version of $X$ in which all jumps with absolute size at least $z$ are removed.
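As a side note, the two constraints defining $A$ are immediate to evaluate on a discretized path skeleton; the function below (with purely illustrative arrays and parameter values) is a direct transcription of the definition:

```python
import numpy as np

def in_A(t, xi, a, b, c):
    """Check whether a discretized path skeleton lies in the target set A.

    t, xi: arrays of time points in [0, 1] (with t[-1] == 1) and path values.
    Returns True iff xi(1) <= -b and sup_t { xi(t) + c*t } >= a.
    """
    t, xi = np.asarray(t, dtype=float), np.asarray(xi, dtype=float)
    return bool(xi[-1] <= -b and np.max(xi + c * t) >= a)

# A toy path that first rises above the barrier a and then ends below -b.
t = np.array([0.0, 0.25, 0.5, 0.75, 1.0])
xi = np.array([0.0, 2.5, 1.0, -0.5, -3.0])
print(in_A(t, xi, a=2.0, b=2.0, c=0.0))  # True: crosses a=2 and ends at -3
```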

Assumption 4.

There exist $z_{0},C,\lambda>0$ such that

\mathbf{P}\big(X^{(-z,z)}(t)\in[x,x+\delta]\big) \leq \frac{C\delta}{t^{\lambda}\wedge 1}\qquad\forall z\geq z_{0},\ t>0,\ x\in\mathbb{R},\ \delta>0.

A.2 Importance Sampling Algorithm

Below, we present the design of the importance sampling algorithm. For any $\xi\in\mathbb{D}$ and $t\in(0,1]$, let $\Delta\xi(t)=\xi(t)-\xi(t-)$ be the discontinuity of $\xi$ at time $t$, and set $\Delta\xi(0)\equiv 0$. Let

B^{\gamma} = \Big\{\xi\in\mathbb{D}:\ \#\{t\in[0,1]:\ \Delta\xi(t)\geq\gamma\}\geq 1,\ \#\{t\in[0,1]:\ \Delta\xi(t)\leq-\gamma\}\geq 1\Big\}

and let $B^{\gamma}_{n}=\{\bar{X}_{n}\in B^{\gamma}\}$. Intuitively speaking, on the event $B^{\gamma}_{n}$ there is at least one upward and one downward "large" jump in $\bar{X}_{n}$, where $\gamma>0$ is understood as the threshold beyond which a jump is considered "large".

Fix some $w\in(0,1)$, and let

\mathbf{Q}_{n}(\cdot) = w\,\mathbf{P}(\cdot)+(1-w)\,\mathbf{P}(\,\cdot\,|B^{\gamma}_{n}).

The algorithm samples

L_{n} = Z_{n}\frac{d\mathbf{P}}{d\mathbf{Q}_{n}} = \frac{Z_{n}}{w+\frac{1-w}{\mathbf{P}(B^{\gamma}_{n})}\mathbf{I}_{B^{\gamma}_{n}}}

under $\mathbf{Q}_{n}$. Now, we discuss the design of $Z_{n}$ that ensures the strong efficiency of $L_{n}$. Analogous to the decomposition in (3.6), let

J_{n}(t) = \sum_{s\in[0,t]}\Delta X(s)\mathbf{I}\big(|\Delta X(s)|\geq n\gamma\big),
\Xi_{n}(t) = X(t)-J_{n}(t) = X(t)-\sum_{s\in[0,t]}\Delta X(s)\mathbf{I}\big(|\Delta X(s)|\geq n\gamma\big).

Let $\bar{J}_{n}(t)=\frac{1}{n}J_{n}(nt)$, $\bar{J}_{n}=\{\bar{J}_{n}(t):\ t\in[0,1]\}$, $\bar{\Xi}_{n}(t)=\frac{1}{n}\Xi_{n}(nt)$, and $\bar{\Xi}_{n}=\{\bar{\Xi}_{n}(t):\ t\in[0,1]\}$. Meanwhile, set

M_{c}(t) \triangleq \sup_{s\leq t}X(s)+cs,\qquad Y^{*}_{n;c} \triangleq \mathbf{I}\big(M_{c}(n)\geq na,\ X(n)\leq-nb\big),

so that $\mathbf{I}_{A_{n}}=Y^{*}_{n;c}$. Under the convention $\hat{Y}^{-1}_{n;c}\equiv 0$, consider estimators $Z_{n}$ of the form

Z_{n} = \sum_{m=0}^{\tau}\frac{\hat{Y}^{m}_{n;c}-\hat{Y}^{m-1}_{n;c}}{\mathbf{P}(\tau\geq m)} (A.1)

where $\tau\sim\text{Geom}(\rho)$ for some $\rho\in(0,1)$ and is independent of everything else. Analogous to Proposition 3.1, the following result provides sufficient conditions on the $\hat{Y}^{m}_{n;c}$ for $L_{n}$ to attain strong efficiency.
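To make the debiasing device in (A.1) concrete, the following sketch (purely illustrative, not the paper's algorithm) applies the same coupled-sum construction to a toy target where $Y^{*}=\mathbf{I}(X>1)$ for $X\sim N(0,1)$ and the approximations $\hat{Y}^{m}=\mathbf{I}(X>1+2^{-m})$ converge to $Y^{*}$ at a geometric rate, mirroring condition (A.2); choosing $\rho$ with $(1-\rho)^{2}>1/2$ keeps the variance finite:

```python
import numpy as np

def debiased_estimate(rng, n_samples, rho=0.2):
    """Coupled-sum debiased estimator, as in (A.1), for a toy target.

    Target: E[Y*] = P(X > 1), X ~ N(0,1); approximations
    Yhat_m = I(X > 1 + 2^{-m}) converge to Y* geometrically.
    """
    x = rng.standard_normal(n_samples)
    # tau ~ Geom(rho) on {0, 1, 2, ...}: P(tau >= m) = (1 - rho)^m.
    tau = rng.geometric(rho, size=n_samples) - 1
    z = np.zeros(n_samples)
    for i in range(n_samples):
        y_prev = 0.0  # convention Yhat_{-1} = 0
        for m in range(tau[i] + 1):
            y_m = float(x[i] > 1.0 + 2.0 ** (-m))
            z[i] += (y_m - y_prev) / (1.0 - rho) ** m
            y_prev = y_m
    return z.mean()

rng = np.random.default_rng(0)
est = debiased_estimate(rng, 200_000)
print(est)  # close to P(X > 1) ~ 0.1587
```

Unbiasedness holds because the telescoping sum has expectation $\lim_{m}\mathbf{E}[\hat{Y}^{m}]$, while the geometric decay of the approximation error relative to $\mathbf{P}(\tau\geq m)$ controls both the variance and the finite expected cost.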

Proposition A.1.

Let $C_{0}>0$, $\rho_{0}\in(0,1)$, $\mu>\alpha+\alpha^{\prime}-2$, and $\bar{m}\in\mathbb{N}$. Suppose that

\mathbf{P}\Big(Y^{*}_{n;c}\neq\hat{Y}^{m}_{n;c}\ \Big|\ \mathcal{D}^{+}(\bar{J}_{n})=k,\ \mathcal{D}^{-}(\bar{J}_{n})=k^{\prime}\Big) \leq C_{0}\rho^{m}_{0}\cdot(k+k^{\prime}+1)\qquad\forall k,k^{\prime}\geq 0,\ n\geq 1,\ m\geq\bar{m} (A.2)

where $\mathcal{D}^{+}(\xi)$ and $\mathcal{D}^{-}(\xi)$ count the numbers of discontinuities of positive and negative size in $\xi$, respectively. Besides, suppose that for all $\Delta\in(0,1)$,

\mathbf{P}\Big(Y^{*}_{n;c}\neq\hat{Y}^{m}_{n;c},\ \bar{X}_{n}\notin A^{\Delta}\ \Big|\ \mathcal{D}^{+}(\bar{J}_{n})=0\text{ or }\mathcal{D}^{-}(\bar{J}_{n})=0\Big) \leq \frac{C_{0}\rho^{m}_{0}}{\Delta^{2}n^{\mu}}\qquad\forall n\geq 1,\ m\geq 0 (A.3)

where $A^{\Delta}=\big\{\xi\in\mathbb{D}:\ \sup_{t\in[0,1]}\xi(t)+ct\geq a-\Delta,\ \xi(1)\leq-b\big\}$. Then, given $\rho\in(\rho_{0},1)$, there exists some $\bar{\gamma}=\bar{\gamma}(\rho)\in(0,b)$ such that for all $\gamma\in(0,\bar{\gamma})$, the estimators $(L_{n})_{n\geq 1}$ are unbiased and strongly efficient for $\mathbf{P}(A_{n})=\mathbf{P}(\bar{X}_{n}\in A)$ under the importance sampling distribution $\mathbf{Q}_{n}$.

The proof is almost identical to that of Proposition 3.1. In particular, the proof requires that

\mathbf{P}(A_{n}) = \bm{O}\big(n\nu[n,\infty)\cdot n\nu(-\infty,-n]\big)

and that, for any $\beta>0$, it holds for all $\gamma$ small enough that

\mathbf{P}(A^{\Delta}_{n}\setminus B^{\gamma}_{n}) = \bm{o}(n^{-\beta})

where $A^{\Delta}_{n}=\{\bar{X}_{n}\in A^{\Delta}\}$. Both bounds follow directly from the sample path large deviations for heavy-tailed Lévy processes in Result 2. Proposition A.1 is then established by repeating the arguments in the proof of Proposition 3.1, using Result 4 for the randomized debiasing technique.
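To isolate the role of the defensive mixture $\mathbf{Q}_{n}$, the sketch below (an illustrative toy, not the paper's setting) estimates $p=\mathbf{P}(X>u^{\prime})$ for a single Pareto variable, with $B=\{X>u\}$, $u<u^{\prime}$, playing the role of $B^{\gamma}_{n}$; here both $\mathbf{P}(B)$ and the conditional law $\mathbf{P}(\cdot\,|\,B)$ are available in closed form:

```python
import numpy as np

def mixture_is_estimate(rng, n_samples, alpha=2.0, u=10.0, u_prime=100.0, w=0.1):
    """Defensive-mixture importance sampling for p = P(X > u'), X ~ Pareto(alpha).

    Q = w * P + (1 - w) * P(. | X > u); the likelihood ratio is
    dP/dQ(x) = 1 / (w + (1 - w) * I{x > u} / P(X > u)).
    """
    p_B = u ** (-alpha)  # P(X > u) for a Pareto(alpha) variable on [1, inf)
    pareto = (1.0 - rng.random(n_samples)) ** (-1.0 / alpha)  # inverse transform
    defensive = rng.random(n_samples) < w
    # Under P(. | X > u), the Pareto tail is self-similar: X =d u * Pareto.
    x = np.where(defensive, pareto, u * pareto)
    likelihood_ratio = 1.0 / (w + (1.0 - w) * (x > u) / p_B)
    return np.mean(likelihood_ratio * (x > u_prime))

rng = np.random.default_rng(1)
est = mixture_is_estimate(rng, 100_000)
print(est)  # true value is u'^(-alpha) = 1e-4
```

The weight $w$ on the nominal measure $\mathbf{P}$ caps the likelihood ratio at $1/w$, which is what prevents the variance blow-up associated with purely state-independent changes of measure in heavy-tailed settings.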

A.3 Construction of $\hat{Y}^{m}_{n;c}$

Next, we describe the construction of approximators $\hat{Y}^{m}_{n;c}$ that satisfy the conditions in Proposition A.1. Specifically, we consider the case where the Asmussen–Rosiński approximation (ARA) is involved. Let

\Xi_{n;c}(t) \triangleq \Xi_{n}(t)+ct.

Under both $\mathbf{P}$ and $\mathbf{Q}_{n}$, the process $\Xi_{n;c}$ admits the law of a Lévy process with generating triplet $(c_{X}+c,\sigma,\nu|_{(-n\gamma,n\gamma)})$. This leads to the Lévy–Itô decomposition

\Xi_{n;c}(t) \overset{d}{=} (c_{X}+c)t+\sigma B(t)+\underbrace{\sum_{s\leq t}\Delta X(s)\mathbf{I}\Big(\Delta X(s)\in(-n\gamma,-1]\cup[1,n\gamma)\Big)}_{\triangleq J_{n,-1}(t)}
\qquad +\sum_{m\geq 0}\Bigg[\underbrace{\sum_{s\leq t}\Delta X(s)\mathbf{I}\Big(|\Delta X(s)|\in[\kappa_{n,m},\kappa_{n,m-1})\Big)-t\cdot\nu\Big((-\kappa_{n,m-1},-\kappa_{n,m}]\cup[\kappa_{n,m},\kappa_{n,m-1})\Big)}_{\triangleq J_{n,m}(t)}\Bigg]

with $\kappa_{n,m}$ defined in (3.20). Besides, let $\bar{\sigma}^{2}(\cdot)$ be defined as in (3.22). For each $n\geq 1$ and $m\geq 0$, consider the approximation

\breve{\Xi}^{m}_{n;c}(t) \triangleq (c_{X}+c)t+\sigma B(t)+\sum_{q=-1}^{m}J_{n,q}(t)+\sum_{q\geq m+1}\sqrt{\bar{\sigma}^{2}(\kappa_{n,q-1})-\bar{\sigma}^{2}(\kappa_{n,q})}\cdot W^{q}(t)

where $(W^{m})_{m\geq 1}$ is a sequence of iid standard Brownian motions, independent of everything else.
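The principle behind the ARA terms in $\breve{\Xi}^{m}_{n;c}$ is that compensated small jumps are replaced by an independent Brownian term with matched variance. The sketch below (with the illustrative measure $\nu(dx)=x^{-1-\alpha}dx$ restricted to an interval $[\text{lo},\text{hi})$, chosen only for concreteness) checks that the compensated small-jump part at time $1$ indeed has variance $\int x^{2}\nu(dx)$, the quantity used for the Gaussian substitute:

```python
import numpy as np

def small_jump_samples(rng, n_paths, alpha=1.5, lo=0.02, hi=0.5):
    """Compensated sum at t=1 of jumps with sizes in [lo, hi) under
    nu(dx) = x^{-1-alpha} dx; its variance equals int_lo^hi x^2 nu(dx)."""
    lam = (lo ** (-alpha) - hi ** (-alpha)) / alpha                     # nu[lo, hi)
    mean_jump = (lo ** (1 - alpha) - hi ** (1 - alpha)) / (alpha - 1)  # int x nu(dx)
    counts = rng.poisson(lam, size=n_paths)
    u = rng.random(counts.sum())
    # Inverse transform for the normalized jump-size density nu/lam on [lo, hi):
    sizes = (lo ** (-alpha) - u * (lo ** (-alpha) - hi ** (-alpha))) ** (-1.0 / alpha)
    idx = np.repeat(np.arange(n_paths), counts)
    sums = np.bincount(idx, weights=sizes, minlength=n_paths)
    return sums - mean_jump  # compensate

rng = np.random.default_rng(2)
x = small_jump_samples(rng, 20_000)
alpha, lo, hi = 1.5, 0.02, 0.5
target_var = (hi ** (2 - alpha) - lo ** (2 - alpha)) / (2 - alpha)  # int x^2 nu(dx)
print(x.var(), target_var)  # the Gaussian substitute uses sqrt(target_var) * W(t)
```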

Next, we discuss how to apply the SBA and construct the approximators $\hat{Y}^{m}_{n;c}$ in (A.1). Let $\zeta_{k}(t)=\sum_{i=1}^{k}z_{i}\mathbf{I}_{[u_{i},n]}(t)$ be a piecewise step function with $k$ jumps over $(0,n]$, where $0<u_{1}<u_{2}<\cdots<u_{k}\leq n$ and $z_{i}\neq 0$ for each $i\in[k]$. Recall that the jump times of $\zeta_{k}$ lead to the partition $(I_{i})_{i\in[k+1]}$ of $[0,n]$ defined in (3.12). For each $I_{i}$, let the sequence of the $l^{(i)}_{j}$ be defined as in (3.15)–(3.16). Conditional on $(l^{(i)}_{j})_{j\geq 1}$, one can then sample $\xi^{(i),m}_{j;c},\ \xi^{(i)}_{j;c}$ using

\big(\xi^{(i)}_{j;c},\,\xi^{(i),0}_{j;c},\,\xi^{(i),1}_{j;c},\,\xi^{(i),2}_{j;c},\ldots\big) \overset{d}{=} \Big(\Xi_{n;c}(l^{(i)}_{j}),\ \breve{\Xi}^{0}_{n;c}(l^{(i)}_{j}),\ \breve{\Xi}^{1}_{n;c}(l^{(i)}_{j}),\ \breve{\Xi}^{2}_{n;c}(l^{(i)}_{j}),\ldots\Big).

The coupling in (2.7) then implies

\Big(\Xi_{n;c}(u_{i})-\Xi_{n;c}(u_{i-1}),\ \sup_{t\in I_{i}}\Xi_{n;c}(t)-\Xi_{n;c}(u_{i-1}),\ \breve{\Xi}^{0}_{n;c}(u_{i})-\breve{\Xi}^{0}_{n;c}(u_{i-1}),\ \sup_{t\in I_{i}}\breve{\Xi}^{0}_{n;c}(t)-\breve{\Xi}^{0}_{n;c}(u_{i-1}),
\qquad \breve{\Xi}^{1}_{n;c}(u_{i})-\breve{\Xi}^{1}_{n;c}(u_{i-1}),\ \sup_{t\in I_{i}}\breve{\Xi}^{1}_{n;c}(t)-\breve{\Xi}^{1}_{n;c}(u_{i-1}),\ \ldots\Big)
\overset{d}{=} \Big(\sum_{j\geq 1}\xi^{(i)}_{j;c},\ \sum_{j\geq 1}(\xi^{(i)}_{j;c})^{+},\ \sum_{j\geq 1}\xi^{(i),0}_{j;c},\ \sum_{j\geq 1}(\xi^{(i),0}_{j;c})^{+},\ \sum_{j\geq 1}\xi^{(i),1}_{j;c},\ \sum_{j\geq 1}(\xi^{(i),1}_{j;c})^{+},\ \ldots\Big).

Now, we define

\hat{M}^{(i),m}_{n;c}(\zeta_{k}) = \sum_{j=1}^{m+\lceil\log_{2}(n^{d})\rceil}(\xi^{(i),m}_{j;c})^{+}

as an approximation to $M^{(i),*}_{n;c}(\zeta_{k})=\sup_{t\in I_{i}}\Xi_{n;c}(t)-\Xi_{n;c}(u_{i-1})=\sum_{j\geq 1}(\xi^{(i)}_{j;c})^{+}$. Now, set

\hat{Y}^{m}_{n;c}(\zeta_{k}) = \bigg[\max_{i\in[k+1]}\mathbf{I}\Big(\sum_{q=1}^{i-1}\sum_{j\geq 0}\xi^{(q),m}_{j;c}+\sum_{q=1}^{i-1}z_{q}+\hat{M}^{(i),m}_{n;c}(\zeta_{k})\geq na\Big)\bigg]\cdot\mathbf{I}\Big(\sum_{q=1}^{k+1}\sum_{j\geq 0}\xi^{(q),m}_{j;c}+\sum_{q=1}^{k}z_{q}-cn\leq-nb\Big).

In (A.1), we plug in $\hat{Y}^{m}_{n;c}=\hat{Y}^{m}_{n;c}(J_{n})$.
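The stick-breaking mechanism itself can be checked in a minimal setting. For a standard Brownian motion (used here only as an assumed stand-in with a known supremum law), the supremum over $[0,1]$ has the exact representation $\sum_{j\geq 1}(\xi_{j})^{+}$, where the stick lengths $l_{j}$ come from uniform stick-breaking and $\xi_{j}\sim N(0,l_{j})$ conditionally on the sticks; truncating after finitely many sticks incurs a geometrically small bias, which is what the $m+\lceil\log_{2}(n^{d})\rceil$ truncation above exploits:

```python
import numpy as np

def sba_sup_bm(rng, n_paths, n_sticks=30):
    """Truncated stick-breaking approximation of sup_{t<=1} B(t).

    Stick lengths: L_0 = 1, l_j = L_{j-1} * (1 - U_j), L_j = L_{j-1} * U_j
    with U_j iid Uniform(0,1); increments xi_j ~ N(0, l_j) conditionally
    on the sticks. The truncation bias is of order (2/3)^n_sticks.
    """
    remaining = np.ones(n_paths)
    sup_approx = np.zeros(n_paths)
    for _ in range(n_sticks):
        u = rng.random(n_paths)
        stick = remaining * (1.0 - u)
        xi = rng.standard_normal(n_paths) * np.sqrt(stick)
        sup_approx += np.maximum(xi, 0.0)
        remaining *= u
    return sup_approx

rng = np.random.default_rng(3)
sup_est = sba_sup_bm(rng, 50_000).mean()
print(sup_est)  # E[sup_{t<=1} B(t)] = sqrt(2/pi) ~ 0.7979
```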

The proof of strong efficiency is almost identical to that of Theorem 3.3; the main difference is that, in Lemma 6.9, we apply Assumption 4 instead of Assumption 2.