Strongly Efficient Rare-Event Simulation for Regularly Varying Lévy Processes with Infinite Activities
Abstract
In this paper, we address rare-event simulation for heavy-tailed Lévy processes with infinite activities. The presence of infinite activities poses a critical challenge, making it impractical to simulate or store the precise sample path of the Lévy process. We present a rare-event simulation algorithm that incorporates an importance sampling strategy based on heavy-tailed large deviations, the stick-breaking approximation for the extrema of Lévy processes, the Asmussen-Rosiński approximation, and the randomized debiasing technique. By establishing a novel characterization for the Lipschitz continuity of the law of Lévy processes, we show that the proposed algorithm is unbiased and strongly efficient under mild conditions, and hence applicable to a broad class of Lévy processes. In numerical experiments, our algorithm demonstrates significant improvements in efficiency compared to the crude Monte-Carlo approach.
1 Introduction
In this paper, we propose a strongly efficient rare-event simulation algorithm for heavy-tailed Lévy processes with infinite activities. Specifically, the goal is to estimate probabilities of the form , where is a Lévy process in , is a subset of the càdlàg space that does not include the typical path of so that is close to , and the event is “unsimulatable” due to the infinitely many jumps within any finite time interval. The defining features of the problem are as follows.
• The increments of the Lévy process are heavy-tailed. Throughout this paper, we characterize the heavy-tailed phenomenon through the notion of regular variation and assume that the tail cdf decays roughly at a power-law rate of ; see Definition 1 for details. The notion of heavy tails provides the mathematical formulation for the extreme uncertainty that manifests in a wide range of real-world dynamics and systems, including the spread of COVID-19 (see, e.g., [21]), traffic in computer and communication networks (see, e.g., [45]), financial assets (see, e.g., [30, 10]), and the training of deep neural networks (see, e.g., [38, 41]).
• is a general subset of (i.e., the space of the real-valued càdlàg functions over ) that involves the supremum of the path. For concreteness in our presentation, the majority of the paper focuses on
(1.1) Intuitively speaking, this is closely related to ruin probabilities under reinsurance mechanisms, as requires the supremum of the process over to exceed some threshold even though all upward jumps in are bounded by . Nevertheless, we stress that the algorithmic framework proposed in this paper is flexible enough to address more general forms of events that are of practical interest. For instance, we demonstrate in Section A of the Appendix that the framework can also address rare-event simulation in the context of barrier option pricing.
• possesses infinite activities; see Section 2.3 for the precise definition. Consequently, it is computationally infeasible to simulate or store the entire sample path of such processes. In other words, we focus on a computationally challenging case where cannot be exactly simulated or evaluated. Addressing such “unsimulatable” cases is crucial due to the increasing popularity of Lévy models with infinite activities in risk management and mathematical finance (see, e.g., [15, 16, 56, 5, 57]), as they offer more accurate and flexible descriptions for the price and volatility of financial assets compared to the classical jump-diffusion models (see, e.g., [50]).
In summary, our goal is to tackle a practically significant yet computationally challenging task, where the nature of the rare events renders crude Monte Carlo methods highly inefficient, if not entirely infeasible, due to the infinite activities in . To address these challenges, we integrate several pieces of mathematical machinery: a design of importance sampling based on sample-path large deviations for heavy-tailed Lévy processes in [54], the stick-breaking approximation in [35] for Lévy processes with infinite activities, and the randomized multilevel Monte Carlo debiasing technique in [55]. By combining these tools, we propose a rare-event simulation algorithm for heavy-tailed Lévy processes with infinite activities that attains strong efficiency (see Definition 2 for details).
As mentioned above, the first challenge is rooted in the nature of rare events: crude Monte Carlo methods can be prohibitively expensive when estimating a small . Instead, variance reduction techniques are often employed for efficient rare-event simulation. When the underlying uncertainties are light-tailed, the exponential tilting strategy guided by large deviations theory has been successfully applied in a variety of contexts; see, e.g., [14, 11, 59, 28]. However, in the heavy-tailed setting the exponential tilting approach falls short of providing a principled and provably efficient design of importance sampling estimators (see, for example, [4]) due to the fundamentally different mechanisms through which the rare events occur. Instead, different importance sampling strategies (e.g., [6, 27, 7, 9, 51, 8]) and other variance reduction techniques such as conditional Monte Carlo (e.g., [2, 42]) and Markov chain Monte Carlo (e.g., [37]) have been proposed to address problems associated with specific types of processes or events.
Recent developments in heavy-tailed large deviations, such as those by [54] and [61], offer critical insights into the design of efficient and universal importance sampling schemes for heavy-tailed systems. Central to this development is the discrete hierarchy of heavy-tailed rare events, known as the catastrophe principle. The principle dictates that rare events in heavy-tailed systems arise due to catastrophic failures of a small number of system components, and the number of such components governs the asymptotic rate at which the associated rare events occur. This creates a discrete hierarchy among heavy-tailed rare events. By combining the defensive importance sampling design with such hierarchy, strongly efficient importance sampling algorithms have been proposed for a variety of rare events associated with random walks and compound Poisson processes in [20]. See also [62] for a tutorial on this topic. In this paper, we adopt and extend this framework to encompass Lévy processes with infinite activities. The specifics of the importance sampling distribution are detailed in Section 3.1.
Another challenge arises from the simulation of Lévy processes with infinite activities. While the importance sampling design in [20] has been successfully applied to a wide range of stochastic systems that are exactly simulatable (including random walks, compound Poisson processes, iterates of stochastic gradient descent, and several classes of queueing systems), it cannot be implemented for Lévy processes with infinite activities. More specifically, the simulation of the random vector , where , poses a significant challenge in the infinite-activity case. As of now, exact simulation of the extrema of Lévy processes (excluding the compound Poisson case) is only available in specific cases (see, for instance, [36, 23, 18]), let alone the exact simulation of the joint law of . We therefore approach the challenge by considering the following questions: does there exist a provably efficient approximation algorithm for , and can we remove the approximation bias while still attaining strong efficiency in our rare-event simulation algorithm?
Regarding the first question, several classes of algorithms have been proposed for the approximate simulation of the extrema of Lévy processes. This includes the random walk approximations based on Euler-type discretization of the process (see e.g., [1, 26, 33]), the Wiener-Hopf approximation methods (see e.g. [43, 31]) based on the fluctuation theory of Lévy processes, the jump-adapted Gaussian approximations (see e.g. [24, 25]), and the characteristic function approach in [12, 13] based on efficient evaluation of joint cdf. Nevertheless, the approximation errors in the aforementioned methods are either unavailable or exhibit a polynomial rate of decay. Thankfully, the recently developed stick-breaking approximation (SBA) algorithm in [35] provides a novel approach to the simulation of the joint law of and . The theoretical foundation of SBA is the following description for the concave majorants of Lévy processes with infinite activities in [52]:
Here, is a sequence of iteratively generated non-negative RVs satisfying and ; conditioned on the values of , ’s are independently generated such that . While it is computationally infeasible to generate the entirety of the infinite sequences and , by terminating the procedure at the -th step we obtain approximations of the form
(1.2)
We provide a review in Section 2.3. In particular, due to , each extra step in (1.2) is expected to halve the approximation error, leading to a geometric rate of convergence for the errors. See [35] for analyses of the approximation errors for different types of functionals.
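The geometric decay of the leftover stick mass can be seen in a minimal sketch. The snippet below is illustrative only (the function name and all numerical choices are ours, not the paper's): it generates uniform stick-breaking lengths on a horizon of length T and checks numerically that the expected leftover length after n sticks is T/2^n.

```python
import random

def stick_breaking_lengths(T, n, rng):
    """Generate the first n stick lengths ell_1, ..., ell_n on a horizon of
    length T: ell_k = V_k * L_{k-1} with V_k ~ Unif(0, 1), where
    L_k = L_{k-1} - ell_k is the length remaining after k sticks (L_0 = T)."""
    lengths, remaining = [], T
    for _ in range(n):
        ell = rng.random() * remaining
        lengths.append(ell)
        remaining -= ell
    return lengths, remaining

# Each stick removes half of the remaining length in expectation, so the
# expected leftover after n sticks is T / 2^n: a geometric rate of decay.
rng = random.Random(0)
avg_leftover = sum(stick_breaking_lengths(1.0, 10, rng)[1]
                   for _ in range(200_000)) / 200_000

lens, rem = stick_breaking_lengths(1.0, 5, random.Random(1))
```

Since the sticks partition the horizon, the generated lengths plus the leftover always sum to T exactly; the Monte Carlo average of the leftover after 10 sticks should sit near 2^{-10}.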
Additionally, while SBA can be considered sufficiently accurate for a wide range of tasks, eliminating the approximation errors is crucial in the context of rare-event simulation. Otherwise, any effort to efficiently estimate a small probability might be fruitless and could be overwhelmed by potentially large errors in the algorithm. In order to remove the approximation errors of SBA in (1.2), we employ the construction of unbiased estimators proposed in [55]. This can be interpreted as a randomized version of the multilevel Monte Carlo scheme [39, 32] when a sequence of biased yet increasingly more accurate approximations is available. It allows us to construct an unbiased estimation algorithm that terminates within a finite number of steps. By combining SBA, the randomized debiasing technique, and the design of importance sampling distributions based on heavy-tailed large deviations, we propose Algorithm 2 for rare-event simulation of Lévy processes with infinite activities. In case the exact sampling of , and hence of the increments ’s in (1.2), is not available, we further incorporate the Asmussen-Rosiński approximation (ARA) in [3]. This approximation replaces the small-jump martingale in the Lévy process with a Brownian motion of the same variance, thus leading to Algorithm 3. We note that the combination of SBA and the randomized debiasing technique has been explored in [35], and an ARA-incorporated version of SBA has been proposed in [34]. However, the goal of proposing a strongly efficient rare-event simulation algorithm adds another layer of difficulty and sets our work apart from the existing literature. In particular, the notion of strong efficiency demands that the proposed estimator remains efficient under the importance sampling algorithm w.r.t. not just a given task, but throughout a sequence of increasingly more challenging rare-event simulation tasks as tends to .
This introduces a new dimension into the theoretical analysis that is not present in [35, 34] and necessitates the development of new technical tools to characterize the performance of the algorithm when all these components (importance sampling, SBA, the debiasing technique, and ARA) are in effect.
An important technical question in our analysis concerns the continuity of the law of the running supremum . To provide a high-level description, let us consider estimators for that admit the form , where is some approximation to the Lévy process ; SBA and the debiasing technique allow us to construct such that the deviation has a small variance. Nevertheless, the estimation can be fallible if concentrates on boundary cases, i.e., falls into a neighborhood of fairly often. Specializing to the case in (1.1), this requires obtaining sufficiently tight bounds for probabilities of the form . However, the continuity of the law of the supremum remains an active area of study, with many essential questions left open. Recent developments regarding the law of are mostly qualitative or focus on the cumulative distribution function (cdf); see, e.g., [19, 22, 44, 47, 49, 48]. In short, addressing this challenge requires us to establish novel and useful quantitative characterizations of the law of the supremum .
For our purpose of efficient rare event simulation, particularly under the importance sampling scheme detailed in Section 3.1, the following condition proves to be sufficient:
(1.3)
Here, is a modulated version of the process where all the upward jumps with sizes larger than are removed; see Section 3 for the rigorous definition. First, we establish in Theorem 3.2 (resp. Theorem 3.3) that Algorithm 2 (resp. Algorithm 3) attains strong efficiency under condition (1.3). More importantly, we demonstrate in Section 4 that condition (1.3) is mild for Lévy processes with infinite activities, as it only requires the intensity of jumps to approach (hence attaining infinite activities in ) at a rate that is not too slow. In particular, in Theorems 4.2 and 4.4 we provide two sets of easily verifiable sufficient conditions for (1.3). We note that the representation of concave majorants for Lévy processes developed in [52] proves to be a valuable tool for studying the law of and . As will be elaborated in the proofs in Section 6, the key technical tool that allows us to connect condition (1.3) with the law of the supremum is, again, the representation in (1.2). See also [17] for its application in studying the joint density of and for stable processes.
Some algorithmic contributions of this paper were presented in preliminary form, without rigorous proofs, at a conference [60]. The current paper presents several significant extensions: in addition to Algorithm 2, we propose an ARA-incorporated version of the importance sampling algorithm (see Algorithm 3) to address the case where cannot be exactly simulated; rigorous proofs of strong efficiency are provided in Section 6; and we establish two sets of sufficient conditions for (1.3) in Section 4, leveraging the properties of regularly varying or semi-stable processes.
The rest of the paper is structured as follows. Section 2 reviews the theoretical foundations of our algorithms, including the heavy-tailed large deviation theories (Section 2.2), the stick-breaking approximations (Section 2.3), and the debiasing technique (Section 2.4). Section 3 presents the importance sampling algorithms and establishes their strong efficiency. Section 4 investigates the continuity of the law of and provides sufficient conditions for (1.3), a critical condition to ensure the strong efficiency of our importance sampling scheme. Numerical experiments are reported in Section 5. The proofs of all technical results are collected in Section 6. In the Appendix, Section A extends the algorithmic framework to the context of barrier option pricing.
2 Preliminaries
In this section, we introduce the notation and results that will be frequently used in the development of the strongly efficient rare-event simulation algorithm.
2.1 Notations
Let be the set of non-negative integers. For any positive integer , let . For any , let and . For any , we define as the positive part of , and
as the floor and ceiling functions. Given a measure space and any set , we use to denote the restriction of the measure to . For any random variable and any Borel measurable set , let be the law of , and the law of conditioned on the event . Let be the metric space of (i.e., the space of all real-valued càdlàg functions with domain ) equipped with the Skorokhod metric . Here, the metric is defined by
(2.1)
with being the set of all increasing homeomorphisms from to itself.
Henceforth in this paper, the heavy-tailedness of any random element will be captured by the notion of regular variation.
Definition 1.
For any measurable function , we say that is regularly varying as with index (denoted as as ) if for all . We also say that a measurable function is regularly varying as with index if for any . We denote this as as .
For properties of regularly varying functions, see, for example, Chapter 2 of [53].
Next, we discuss the Lévy–Itô decomposition of one-dimensional Lévy processes, i.e., . The law of a one-dimensional Lévy process is completely characterized by its generating triplet , where represents the constant drift, is the magnitude of the Brownian motion term, and the Lévy measure characterizes the intensity of the jumps. More precisely,
(2.2)
where is the Lebesgue measure on , is a standard Brownian motion, the measure satisfies , and is a Poisson random measure over with intensity measure and is independent of . For standard references on this topic, see Chapter 4 of [58].
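To make the decomposition concrete, the following sketch simulates one increment of a toy pure-jump Lévy process with symmetric Lévy measure ν(dx) = |x|^(−α−1)dx on [−1, 1] excluding 0 (our own illustrative choice, not a model from the paper). Jumps of size at least a cutoff κ are simulated exactly as a compound Poisson sum, while the compensated small-jump part is replaced, in the spirit of the Asmussen-Rosiński approximation discussed in the introduction, by a Gaussian term of matching variance. All names and parameter values are hypothetical.

```python
import math, random

ALPHA, KAPPA = 0.5, 0.01   # toy jump-activity index and small-jump cutoff

def ara_increment(rng, t=1.0):
    """One increment X(t) of a toy pure-jump Levy process with symmetric
    Levy measure nu(dx) = |x|^(-ALPHA-1) dx on [-1, 1] excluding 0.
    Jumps of size >= KAPPA form a compound Poisson sum; jumps below KAPPA
    are replaced by a Brownian term with the matching variance
    sigma^2 = integral of x^2 nu(dx) over |x| < KAPPA."""
    # Compound Poisson part: rate of jumps with |x| in [KAPPA, 1].
    lam = t * 2.0 * (KAPPA ** -ALPHA - 1.0) / ALPHA
    n, p, thresh = 0, rng.random(), math.exp(-lam)   # Knuth's Poisson sampler
    while p > thresh:
        n += 1
        p *= rng.random()
    x = 0.0
    for _ in range(n):
        # Inverse transform: P(|J| > y) = (y^-ALPHA - 1) / (KAPPA^-ALPHA - 1).
        u = rng.random()
        size = (u * (KAPPA ** -ALPHA - 1.0) + 1.0) ** (-1.0 / ALPHA)
        x += size if rng.random() < 0.5 else -size   # symmetric sign
    # ARA part: Gaussian with the variance of the removed small jumps.
    sigma2 = 2.0 * KAPPA ** (2.0 - ALPHA) / (2.0 - ALPHA)
    return x + rng.gauss(0.0, math.sqrt(t * sigma2))

rng = random.Random(1)
samples = [ara_increment(rng) for _ in range(20_000)]
emp_var = sum(s * s for s in samples) / len(samples)
# Total variance of X(1) is the integral of x^2 nu(dx) = 2 / (2 - ALPHA),
# which the ARA substitution preserves exactly.
```

By construction the substitution matches the second moment of the removed small jumps, so the empirical variance of the simulated increments should sit near 2/(2 − α) = 4/3 for this toy measure.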
Given two sequences of non-negative real numbers and , we say that (as ) if there exists some such that . Besides, we say that if . The goal of this paper is described in the following definition of strong efficiency.
Definition 2.
Let be a sequence of random variables supported on a probability space and be a sequence of events (i.e., ). We say that are unbiased and strongly efficient estimators of if
We stress again that strongly efficient estimators achieve uniformly bounded relative errors (i.e., the ratio between standard error and mean) for all .
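The bounded-relative-error requirement can be contrasted with crude Monte Carlo in a short sketch (illustrative names and numbers, not from the paper): the single-sample estimator 1{A} of p = P(A) has relative error sqrt((1−p)/p), which blows up as p → 0, so the sample size needed for a fixed relative accuracy explodes.

```python
def bernoulli_relative_error(p):
    """Relative error (standard deviation / mean) of the single-sample
    crude Monte Carlo estimator 1{A} of a probability p = P(A)."""
    return ((1.0 - p) / p) ** 0.5

# To keep the relative error of an n-sample average below 10%, crude Monte
# Carlo needs roughly n = 100 * (1 - p) / p samples, which explodes as p -> 0.
samples_needed = {p: int(100 * (1 - p) / p) for p in (1e-2, 1e-4, 1e-6)}
```

A strongly efficient estimator, by contrast, keeps the per-sample relative error bounded along the whole sequence of rare-event problems, so the required sample size stays bounded as the event probability vanishes.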
2.2 Sample-Path Large Deviations for Regularly Varying Lévy Processes
The key ingredient of our importance sampling algorithm is the recent development of the sample-path large deviations for Lévy processes with regularly varying increments; see [54]. To familiarize the readers with this mathematical machinery, we start by reviewing the results in the one-sided case, and then move on to the more general two-sided results.
Let be a centered Lévy process (i.e., ) with generating triplet such that the Lévy measure is supported on . In other words, all the discontinuities in will be positive, hence one-sided. Moreover, we are interested in the heavy-tailed setting where the function is regularly varying as with index where . Define a scaled version of the process as , and let . Note that is a random element taking values in .
For all , let be the subset of containing all non-decreasing step functions that have jumps (i.e., discontinuities) and vanish at the origin. Let be the set that only contains the zero function . Let . For any , let be the measure supported on with . For any positive integer , let be the fold product measure of restricted to . Define the measure (for )
where all ’s are iid copies of . In case that , we set as the Dirac measure on 0. The following result provides sharp asymptotics for rare events associated with . Henceforth in this paper, all measurable sets are understood to be Borel measurable.
Result 1 (Theorem 3.1 of [54]).
Let be measurable. Suppose that and is bounded away from in the sense that . Then
where are the interior and closure of respectively.
Intuitively speaking, Result 1 embodies a general principle that, in heavy-tailed systems, rare events arise due to several “large jumps”. Here, denotes the minimum number of jumps required in for event to occur. As shown above in Result 1, dictates the polynomial rate of decay of the probabilities of the rare events . Furthermore, results such as Corollary 4.1 in [54] characterize the conditional limits of : conditioning on the occurrence of rare events , the conditional law converges in distribution to that of a step function over with exactly jumps (of random sizes and arrival times) as . Therefore, also dictates the most likely scenarios of the rare events. This insight proves to be critical when we develop the importance sampling distributions for the rare-event simulation algorithm in Section 3.
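The "large jumps" principle can be observed numerically in a toy setting (our own illustrative choice, not the paper's Lévy model): for a sum of iid Pareto random variables, conditioned on the sum being unusually large, almost all of the sum is typically carried by the single largest term.

```python
import random

def pareto(rng, alpha=1.5):
    # Pareto with tail P(X > x) = x^(-alpha) for x >= 1, via inverse transform.
    return rng.random() ** (-1.0 / alpha)

rng = random.Random(42)
n, threshold, ratios = 20, 200.0, []
for _ in range(100_000):
    xs = [pareto(rng) for _ in range(n)]
    s = sum(xs)
    if s > threshold:
        # Share of the sum carried by the biggest summand on the rare event.
        ratios.append(max(xs) / s)
avg_share = sum(ratios) / len(ratios)
```

Here the threshold is far above the typical sum (mean about 60 for these parameters), and on the rare event the largest term accounts for the bulk of the total, in line with the one-big-jump heuristic; events requiring two or more big jumps are an order of magnitude rarer, which is the discrete hierarchy described above.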
Results for the two-sided case admit a similar yet slightly more involved form, where the Lévy process exhibits both positive and negative jumps. Specifically, let be a centered Lévy process such that for and , we have and as for some . Let be the set containing all step functions in vanishing at the origin that have exactly upward jumps and downward jumps. As a convention, let . Given , let , where . Let be the Dirac measure on . For any let
(2.3)
where all ’s and ’s are iid copies of Unif RVs. Now, we are ready to state the two-sided result.
2.3 Concave Majorants and Stick-Breaking Approximations of Lévy Processes with Infinite Activities
Next, we review the distribution of the concave majorant of a Lévy process with infinite activities characterized in [52], which paves the way for the stick-breaking approximation algorithm proposed in [35]. Let be a Lévy process with generating triplet . We say that has infinite activities if or . Let be the running supremum of . The results in [52] establish a Poisson–Dirichlet distribution that underlies the joint law of and . Specifically, we fix some and let ’s be iid copies of Unif RVs. Recursively, let
(2.5) |
Conditioning on the values of , let be a random copy of , with all being independently generated.
Result 3 (Theorem 1 in [52]).
Suppose that the Lévy process has infinite activities. Then (with )
(2.6) |
Based on the distribution characterized in (2.6), the stick-breaking approximation algorithm was proposed in [35] where finitely many ’s are generated in order to approximate and . This approximation technique is a key component of our rare event simulation algorithm. In particular, we utilize a coupling between different Lévy processes based on the representation (2.6) above. For clarity of our description, we focus on two Lévy processes and with generating triplets and , respectively. Suppose that both and have infinite activities. We first generate ’s as described in (2.5). Conditioning on the values of , we then independently generate and , which are random copies of and , respectively. Let . Applying Result 3, we identify a coupling between such that
(2.7) |
Remark 1.
It is worth noting that the method described above in fact implies the existence of a probability space that supports the entire sample paths and , whose endpoint values and suprema admit the joint law in (2.7). In particular, once we obtain based on (2.5), one can generate as iid copies of the entire path of . That is, we generate a piece of sample path on the stick , and the quantities introduced earlier can be obtained by setting . To recover the sample path of based on the pieces , it suffices to apply the Vervaat transform to each and then reorder the pieces based on their slopes. We refer the readers to Theorem 4 in [52]. In summary, the method described above leads to a coupling between the sample paths of the underlying Lévy processes and such that (2.7) holds.
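When the marginal increments of the process can be sampled exactly, the stick-breaking representation is directly simulatable. A minimal sketch for standard Brownian motion (an infinite-variation, hence infinite-activity, example; all names and truncation choices below are ours): given the stick lengths, the supremum over [0, T] is the sum of the positive parts of independent increments over the sticks, and truncating the series after a few sticks already gives a negligible bias because the leftover stick mass decays geometrically.

```python
import math, random

def sba_supremum_bm(T, n, rng):
    """Truncated stick-breaking approximation of the supremum of a standard
    Brownian motion on [0, T]: sup ~ sum_k max(xi_k, 0), where, given the
    stick lengths ell_k, xi_k is an independent copy of B(ell_k)."""
    remaining, sup = T, 0.0
    for _ in range(n):
        ell = rng.random() * remaining   # uniform stick-breaking
        remaining -= ell
        xi = rng.gauss(0.0, math.sqrt(ell))   # B(ell) ~ N(0, ell)
        sup += max(xi, 0.0)
    return sup

rng = random.Random(7)
est = sum(sba_supremum_bm(1.0, 20, rng) for _ in range(100_000)) / 100_000
# By the reflection principle, E[sup of B on [0,1]] = sqrt(2/pi) ~ 0.7979.
```

With 20 sticks the truncation bias is of order (2/3)^20, far below the Monte Carlo noise, so the sample mean should match sqrt(2/π) closely; for general Lévy processes the increments over the sticks are not exactly simulatable, which is where ARA enters in Section 3.6.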
2.4 Randomized Debiasing Technique
To achieve unbiasedness in our algorithm and remove the errors of the stick-breaking approximations, we apply the randomized multilevel Monte Carlo technique studied in [55]. In particular, since in Result 4 below is finite almost surely, the simulation of relies only on rather than the infinite sequence .
Result 4 (Theorem 1 in [55]).
Let random variables and be such that . Let be a positive integer-valued random variable with unbounded support, independent of and . Suppose that
(2.8) |
then (with the convention ) satisfies
where .
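The mechanics of Result 4 can be checked on a deterministic toy sequence (our own illustrative choice): with approximations Y_n converging to a limit Y, the debiased estimator sums the increments Δ_n = Y_n − Y_{n−1} up to an independent random level N, reweighted by the tail probabilities P(N ≥ n). Because the toy Y_n are deterministic, unbiasedness can be verified by enumerating N over (effectively) its whole support; note that the geometric decay of Δ_n relative to the tail of N is what keeps the variance condition of the kind in (2.8) in check.

```python
def debiased(delta, N, tail):
    """Single-run debiased estimator:
    Z = sum_{n=1}^{N} Delta_n / P(N >= n), with Delta_n = Y_n - Y_{n-1}."""
    return sum(delta(n) / tail(n) for n in range(1, N + 1))

r = 0.5                                   # geometric truncation level N
tail = lambda n: r ** (n - 1)             # P(N >= n)
pmf = lambda n: (1.0 - r) * r ** (n - 1)  # P(N = n)

Y = lambda n: 1.0 - 2.0 ** (-n) if n >= 1 else 0.0   # toy approximations, Y_n -> 1
delta = lambda n: Y(n) - Y(n - 1)                    # with the convention Y_0 = 0

# Unbiasedness, checked by enumerating N (the mass beyond n = 60 is ~1e-18):
expected_Z = sum(pmf(n) * debiased(delta, n, tail) for n in range(1, 60))
```

The enumeration returns the limit value 1 up to numerical truncation of the support of N, while each individual run of the estimator only touches finitely many levels.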
3 Algorithm
Throughout the rest of this paper, let be a Lévy process with generating triplet satisfying the following heavy-tailed assumption.
Assumption 1.
. is of infinite activity. The Blumenthal-Getoor index satisfies . Besides, one of the two claims below holds for the Lévy measure .
• (One-sided cases) is supported on , and function is regularly varying as with index where ;
• (Two-sided cases) There exist such that is regularly varying as with index and is regularly varying as with index .
The other assumption on revolves around the continuity of the law of , which is the Lévy process with generating triplet . That is, is a modulated version of where all the upward jumps with size larger than are removed.
Assumption 2.
There exist such that
Assumption 2 can be interpreted as a uniform version of Lipschitz continuity of the law of . In Section 4, we show that Assumption 2 is a mild condition for Lévy processes with infinite activities and is easy to verify.
Next, we describe a class of target events for which we propose a strongly efficient rare event simulation algorithm. Let and be the scaled version of the process. Define events
(3.1) |
In words, means that the path crosses the barrier even though no upward jump in is larger than . For technical reasons, we also impose the following mild condition on the values of the constants .
Assumption 3.
and
In this section, we present a strongly efficient rare-event simulation algorithm for . Specifically, Section 3.1 presents the design of the importance sampling distribution , Section 3.2 discusses how we apply the randomized Monte Carlo debiasing technique of Result 4 in our algorithm, Section 3.3 discusses how we combine the debiasing technique with the SBA in Result 3, and Section 3.4 explains how to sample from the importance sampling distribution . Combining all these components in Section 3.5, we propose Algorithm 2 for rare-event simulation of and establish its strong efficiency in Theorem 3.2. Section 3.6 addresses the case where the exact simulation of is not available.
3.1 Importance Sampling Distributions
At the core of our algorithm is a principled design of importance sampling strategies based on heavy-tailed large deviations. This can be seen as an extension of the framework proposed in [20]. First, note that
(3.2) |
indicates the number of jumps required to cross the barrier starting from the origin if no jump is allowed to be larger than . Based on the sample-path large deviations reviewed in Section 2.2, we expect the events to be almost always caused by exactly large upward jumps in . These insights reveal critical information regarding the conditional law . More importantly, they lead to a natural yet effective choice of importance sampling distributions that focus on the -large-jump paths and provide sufficiently accurate approximations to . Specifically, for any , define events with
(3.3) |
where, for any , we define as the left-limit of at time . Intuitively speaking, the parameter acts as a threshold of “large jumps”: any path has at least upward jumps that are considered large relative to the threshold level . To prevent the likelihood ratio from blowing up to infinity, we then consider an importance sampling distribution with defensive mixtures (see [40]) and define (for some )
(3.4) |
Sampling from , and hence , is straightforward and will be addressed in Section 3.4.
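The role of the defensive mixture can be illustrated in a one-dimensional toy problem (our own construction, not the paper's path-space scheme): estimating a Pareto tail probability with a mixture of the nominal law and the nominal law conditioned on the rare event. The defensive component with weight w caps the likelihood ratio at 1/w, which is precisely what prevents the ratio from blowing up.

```python
import random

ALPHA, W, T = 1.0, 0.1, 100.0   # tail index, defensive weight, rare threshold

def f_density(x):               # nominal Pareto(ALPHA) density on [1, inf)
    return ALPHA * x ** (-ALPHA - 1.0) if x >= 1.0 else 0.0

def g_density(x):               # nominal law conditioned on {X > T}
    return f_density(x) * T ** ALPHA if x > T else 0.0

def sample_q(rng):
    """Draw from the defensive mixture Q = W * P + (1 - W) * P( . | X > T)."""
    x = rng.random() ** (-1.0 / ALPHA)       # Pareto(ALPHA) via inverse transform
    return x if rng.random() < W else T * x  # for Pareto, conditioning = rescaling

def weight(x):                   # likelihood ratio dP/dQ, bounded by 1/W
    return f_density(x) / (W * f_density(x) + (1.0 - W) * g_density(x))

rng = random.Random(0)
n = 20_000
est = sum(weight(x) * (x > T) for x in (sample_q(rng) for _ in range(n))) / n
# True value: P(X > T) = T^(-ALPHA) = 0.01.
```

Under the mixture, the estimator is a bounded random variable and its relative error stays controlled even as the threshold grows, whereas crude Monte Carlo would need on the order of 1/P(X > T) samples to see the event at all.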
With the design of the importance sampling distribution in hand, one would naturally consider an estimator for of the form . This is due to
Here, we use to denote the expectation operator under the law and for the expectation under . Nevertheless, the exact evaluation or simulation of is generally infeasible due to the infinite activities of the process , which make it impossible to simulate or store the entire sample path with finite computational resources. This marks a significant difference from the tasks in [20], which focus on random walks or compound Poisson processes with constant drifts that can be simulated exactly. To overcome this challenge, we instead consider estimators of the form
(3.5) |
where can be simulated with finite computational resources and allows us to recover the correct expectation under the importance sampling distribution , i.e., . In Section 3.2, we elaborate on the design of the estimators .
3.2 Estimators
Intuitively speaking, the goal is to construct ’s that can be plugged into (3.5) as unbiased estimators of . To this end, we consider the following decomposition of the Lévy process . For any and , let be the size of the discontinuity in at time . Recall that is the threshold of large jumps in the definition of in (3.3). Let
(3.6)
We highlight several important facts regarding the decomposition .
• By the definition of , the law of remains unchanged under both and , which is identical to the law of , namely, a Lévy process with generating triplet .
• Under , the process admits the law of a Lévy process with generating triplet , which is a compound Poisson process.
• Under , the path follows the same law as a Lévy process with generating triplet , conditioned on having at least jumps over .
• Under both and , the two processes and are independent.
Let , , , and . We now discuss how the decomposition
can help us construct unbiased estimators of . First, recall that . As a result, in the definition of events in (3.1), the condition only concerns the large jump process since any upward jump in is bounded by . Therefore, with
(3.7) |
and
we have
As discussed above, the exact evaluation of is generally computationally infeasible. Instead, suppose that we have access to a sequence of random variables that only take values in and provide progressively more accurate approximations to as . Then, in light of the debiasing technique in Result 4, one can consider (under the convention that )
(3.8) |
where is for some and is independent of everything else. That is, for all . Indeed, this construction of is justified by the following proposition. We defer the proof to Section 6.1.
Proposition 3.1.
Let , , , and . Suppose that
(3.9) |
where counts the number of discontinuities in for any . Besides, suppose that for all ,
(3.10) |
where . Then given , there exists some such that for all , the estimators specified in (3.5) and (3.8) are unbiased and strongly efficient for under the importance sampling distribution in (3.4).
3.3 Construction of
In light of Proposition 3.1, our next goal is to design ’s that provide sufficient approximations to and satisfy the conditions (3.9) and (3.10).
Recall the decomposition of in (3.6). Under both and , the processes and are independent, and admits the law of , i.e., a Lévy process with generating triplet . This section discusses how, after sampling from , we approximate the supremum of . Specifically, on the event , i.e., when the process makes jumps over , admits the form of with
(3.11) |
for some and . This allows us to partition into disjoint intervals . We adopt the convention and set
(3.12) |
For , define
(3.13) |
as the supremum of the fluctuations of over . Define random function
(3.14) |
and note that .
In theory, the representation (3.14) provides an algorithm for the simulation of . Nevertheless, the exact simulation of the supremum is generally not available. Instead, we apply the SBA introduced in Section 2.3 to approximate , thus providing the construction of . Specifically, define
(3.15)
(3.16) |
where each is an iid copy of Unif. That is, for each , the sequence is defined by the recursion in (2.5), with set to the length of . Then, conditioning on the values of ’s, we sample
(3.17) |
i.e., is an independent copy of , with all being independently generated. Result 3 then implies for each . Furthermore, by summing up only finitely many ’s, we define
(3.18) |
as an approximation to defined in (3.13). Here, is another parameter of the algorithm. For technical reasons, we add an extra term in the summation in (3.18), which helps ensure that the algorithm achieves strong efficiency as while only introducing a minor increase in the computational complexity.
Now, we are ready to present the design of the approximators . For , define the random function
(3.19) |
Here, note that . As a high-level description, the algorithm proceeds as follows. After sampling from the importance sampling distribution defined in (3.4), we plug into defined in (3.8), which in turn allows us to simulate as the importance sampling estimator under .
Remark 2.
At first glance, one may get the impression that the simulation of involves the summation of infinitely many elements in . Fortunately, the truncation index in (see (3.8)) is almost surely finite. Therefore, when running the algorithm in practice, after is decided, there is no need to simulate any beyond . Given the construction of in (3.18), the simulation of only requires (for each ) as well as the sum . Furthermore, conditioning on the value of , the sum admits the law of (see Result 3). This allows us to simulate in one shot.
Note that to implement the importance sampling algorithm and ensure strong efficiency, the following tasks still remain to be addressed.
3.4 Sampling from
In this section, we revisit the task of sampling from , which is at the core of the implementation of the importance sampling distribution in (3.4).
Recall that under , the process is a compound Poisson process with generating triplet . More precisely, let be a Poisson process with rate , and let denote the arrival times of jumps in . Let be a sequence of iid random variables from
and let ’s be independent of . Under , we have
Furthermore, for each , conditioning on , the law of is equivalent to that of the order statistics of iid samples from Unif, and ’s are still independent of ’s with the law unaltered. Therefore, the sampling of from can proceed as follows. We first generate from the distribution of Poisson, conditioning on . Then, independently, we generate as the order statistics of iid samples from Unif, and as iid samples of law . It is worth mentioning that the sampling of can be addressed with the help of the inverse measure. Specifically, define as the inverse of , and observe that
More importantly, for , the law of is . This leads to the steps detailed in Algorithm 1.
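The sampling scheme just described can be sketched as follows. The helper `inv_tail` stands in for an inverse-transform sampler of the (normalized) jump measure, and the Pareto-type example at the end is purely illustrative, not the paper's measure. Generating arrival times from iid exponential inter-arrivals is equivalent to drawing a Poisson number of jumps and sorting iid uniforms.

```python
import random

def compound_poisson_jumps(rate, T, inv_tail):
    """Jump times and sizes of a compound Poisson process on [0, T].

    Inter-arrival times are iid Exp(rate), so conditional on their
    number the jump times follow the law of Unif(0, T) order
    statistics; jump sizes are drawn by inverse transform.
    """
    times, sizes = [], []
    t = random.expovariate(rate)
    while t <= T:
        times.append(t)
        # 1 - random.random() lies in (0, 1], avoiding u == 0
        sizes.append(inv_tail(1.0 - random.random()))
        t += random.expovariate(rate)
    return times, sizes

# Illustration only: Pareto(alpha) jump sizes via the inverse tail u -> u**(-1/alpha)
alpha = 1.5
times, sizes = compound_poisson_jumps(2.0, 10.0, lambda u: u ** (-1.0 / alpha))
```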
3.5 Strong Efficiency and Computational Complexity
With all the discussions above, we propose Algorithm 2 for rare-event simulation of . Specifically, here is a list of the parameters of the algorithm.
-
•
: the threshold in defined in (3.3),
-
•
: the weight of the defensive mixture in ; see (3.4),
-
•
: the geometric rate of decay for in (3.8),
-
•
: determining the term in (3.18).
The choice of will not affect the strong efficiency of the algorithm. Meanwhile, under proper parametrization, Algorithm 2 meets conditions (3.9) and (3.10) stated in Proposition 3.1 and attains strong efficiency. This is verified in Theorem 3.2.
Theorem 3.2.
Let and . There exists such that the following claim holds: Given , there exists such that Algorithm 2 is unbiased and strongly efficient under any .
We defer the proof to Section 6.2. In fact, in Section 3.6 we propose Algorithm 3, which can be seen as an extended version of Algorithm 2 with another layer of approximation. The strong efficiency of Algorithm 2 follows directly from that of Algorithm 3 (i.e., by setting in the proof of Theorem 3.3). The choices of and that ensure strong efficiency are also specified at the end of Section 3.6.
Remark 3.
To conclude, we add a remark regarding the computational complexity of Algorithm 2 under the goal of attaining a given level of relative error at a specified confidence level. First, consider the case where the complexity of simulating scales linearly with (uniformly for all for some constant ). This is a standard assumption, since the number of jumps we expect to simulate over grows linearly with . Then, the complexity of the SBA steps at step 13 of Algorithm 2 also scales linearly with , as the stick lengths of ’s, in expectation, grow linearly with because we deal with the time horizon given the scale factor . Next, since the same law for the truncation index (see step 8 of Algorithm 2) is applied for all , the only other factor that varies with is in the loop at step 10. The strong efficiency of the algorithm then implies a computational complexity of order . If we instead assume that the cost of simulating is also uniformly bounded for all , then the overall complexity of Algorithm 2 is further reduced to .
In comparison, the crude Monte Carlo method requires a number of samples that is inversely proportional to the target probability (see Lemma 6.1) with being the heavy-tailed index in Assumption 1 and defined in (3.2). Hypothetically, assuming that the evaluation of (which at least requires the simulation of and ) is computationally feasible at a cost that scales linearly with , we end up with a computational complexity of (compared to the cost of our algorithm). Similarly, if we assume that the cost of generating is uniformly bounded for all , then the complexity of the crude Monte Carlo method is (compared to the cost of our algorithm). In summary, not only does the proposed importance sampling algorithm address Lévy processes with infinite activities that are not simulatable for crude Monte Carlo methods, but it also enjoys a significant improvement in terms of computational complexity, with the advantage becoming even more evident for multiple-jump events with large .
3.6 Construction of with ARA
As stressed earlier, implementing Algorithm 2 from Section 3.5 requires the ability to sample from . The goal of this section is to address the challenge when the exact simulation of is not available. The plan is to incorporate the Asmussen-Rosiński approximation (ARA) in [3] into the design of the approximation proposed in Section 3.3.
To be specific, let
(3.20) |
where and are two additional parameters of our algorithm. As a convention, we set . Without loss of generality, we consider large enough such that . For the Lévy process with the generating triplet , consider the following decomposition (with being a standard Brownian motion)
(3.21) | ||||
Here, for any , is a martingale with where
(3.22) |
Generally speaking, the difficulty of implementing Algorithm 2 lies in the exact simulation of the martingale . In particular, whenever we have for the Lévy measure , the expected number of jumps in (and hence and ) is infinite over any time interval of positive length. By applying ARA, our goal is to approximate the jump martingales ’s by Brownian motions, which yields a process that is amenable to exact simulation. To do so, let be a sequence of iid standard Brownian motions, independent of . For each , define
(3.23) |
Here, the process can be interpreted as an approximation to , where the jump martingale (with jump sizes under ) is replaced by a standard Brownian motion with the same variance. Note that for any , the random variable is exactly simulatable, as its law is the convolution of that of a compound Poisson process with constant drift and a Gaussian distribution.
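In code, the ARA substitution amounts to simulating, over a window of length t, the kept large jumps exactly while replacing the compensated small-jump martingale by a Gaussian term with the matching variance. The sketch below uses our own names: `sigma2_small` stands for the truncated variance in (3.22), and the large-jump intensity and sampler are hypothetical inputs.

```python
import math
import random

def ara_increment(t, drift, sigma2_small, big_rate, sample_big_jump):
    """One ARA increment over a window of length t.

    Jumps below the truncation level are replaced by a Brownian term
    with per-unit-time variance sigma2_small; jumps above it are kept
    exactly as a compound Poisson sum with intensity big_rate.
    """
    # Gaussian surrogate for the compensated small-jump martingale, plus drift
    x = drift * t + math.sqrt(sigma2_small * t) * random.gauss(0.0, 1.0)
    # Exact large jumps: a Poisson(big_rate * t) number of them
    s = random.expovariate(big_rate)
    while s <= t:
        x += sample_big_jump()
        s += random.expovariate(big_rate)
    return x
```

With `sigma2_small = 0` and no large jumps, the increment degenerates to the pure drift, matching the decomposition with the martingale part removed.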
Based on the approximations in (3.23), we apply SBA and reconstruct (originally defined in (3.18)) and (originally defined in (3.19)) as follows. Let be a piecewise step function with jumps over , i.e., admitting the form in (3.11). Recall that the jump times in lead to a partition of of defined in (3.12). For any , let the sequence ’s be defined as in (3.15)–(3.16). Next, conditioning on , one can sample as
(3.24) |
The coupling in (2.7) then implies
(3.25) | |||
Now, we define
(3.26) |
as an approximation to using both ARA and SBA. Compared to the original design in (3.18), the main difference in (3.26) is that we substitute with , and the latter is exactly simulatable as, conditioning on the values of ’s, it admits the law of . Similarly, let
(3.27) |
Again, the main difference between (3.27) and (3.19) is that we incorporate ARA and substitute ’s with ’s.
Plugging the design of in (3.27) into the estimator in (3.8), we propose Algorithm 3 for rare-event simulation of when exact simulation of is not available. Below is a summary of the parameters of the algorithm.
-
•
: the threshold in defined in (3.3),
-
•
: the weight of the defensive mixture in ; see (3.4),
-
•
: the geometric rate of decay for in (3.8),
-
•
: determining the truncation threshold in (3.20),
-
•
: determining the term in (3.26).
Theorem 3.3 justifies that, under proper parametrization, Algorithm 3 is unbiased and strongly efficient.
Theorem 3.3.
In Section 6.2 we provide the proof, the key arguments of which are the verification of conditions (3.9) and (3.10) in Proposition 3.1. Here, we specify the choices of the parameters. First, pick where is the constant in Assumption 2. Next, pick and . Also, fix . This allows us to pick such that
After picking , one can find some such that . Let be such that . Let be small enough such that . Then, we pick small enough such that
Again, the details of the parameter choices can be found at the beginning of Section 6. It is also worth mentioning that, by setting , Algorithm 3 would reduce to Algorithm 2, as ’s in (3.25) would reduce to ’s in (3.17); in other words, the ARA mechanism is effective only if the truncation threshold . As a result, Theorem 3.2 follows directly from Theorem 3.3 by setting .
Remark 4.
While Algorithm 3 terminates in finitely many steps almost surely, its computational complexity may not be finite in expectation. This is partially due to the implementation of ARA, as we approximate the jump martingale in (3.21) using an independent Brownian motion term in (3.23). In theory, a potential remedy is to identify a better coupling between the jump martingales and Brownian motions; see, for instance, Theorem 9 of [46]. This would allow us to pick a larger for the truncation threshold in ARA, under which the simulation algorithm generates significantly fewer jumps when sampling ’s. However, to the best of our knowledge, there is no practical implementation of the coupling in [46]. We note that similar issues arise in works such as [34], where the coupling in [46] implies a much tighter error bound in theory but cannot be implemented in practice.
4 Lipschitz Continuity of the Distribution of
This section investigates the sufficient conditions for Assumption 2. That is, there exist such that
(4.1) |
Here, recall that is the Lévy process with generating triplet . In other words, this is a modulated version of where any upward jump larger than is removed.
To demonstrate our approach for establishing condition (4.1), we start by considering a simple case where the Lévy process has generating triplet with . This leads to the decomposition
where is a standard Brownian motion, is a Lévy process with generating triplet , and the two processes are independent. Now, for any and ,
(4.2) |
The last inequality follows from the fact that the standard normal distribution admits a density function bounded by . Therefore, we have verified Assumption 2 under , and any . The simple idea behind (4.2) is that continuity conditions such as (4.1) can be passed from one distribution to another through convolution. To generalize this approach to the scenarios where in the generating triplet of the Lévy process , we introduce the following definition.
Definition 3.
Let and be Borel measures on . For any Borel set , we say that majorizes when restricted on (denoted as ) if for any Borel set . In other words, is a positive measure.
Now, let us consider the case where the generating triplet of is . For the Lévy measure , if we can find some , some Borel set and some (positive) Borel measure such that , then through a straightforward superposition of Poisson random measures, we obtain the decomposition (let )
(4.3) |
where is a Lévy process with generating triplet , is a Lévy process with generating triplet , and the two processes are independent. Furthermore, if Assumption 2 (conditions of form (4.1)) holds for the process with generating triplet , then by repeating the arguments in (4.2) we can show that Assumption 2 holds in for any .
Recall our running assumption that the Lévy process is of infinite activities (see Assumption 1). In case that , we must have for to have infinite activity. Therefore, the key step is to identify the majorized measure such that
-
•
holds for with infinite mass and some set ,
- •
In the first main result of this section, we show that measures of form that roughly increase at a power-law rate (as ) provide ideal choices for such majorized measures. In particular, the corresponding Lévy process in (4.3) is intimately related to -stable processes that naturally satisfy continuity properties of form (4.1). We collect the proof in Section 6.3.
Proposition 4.1.
Let , and . Suppose that is regularly varying as with index . Then the Lévy process with generating triplet has a continuous density function for each . Furthermore, there exists a constant such that
where .
Equipped with Proposition 4.1, we obtain the following set of sufficient conditions for Assumption 2.
Theorem 4.2.
Proof.
Part follows immediately from the calculations in (4.2). To prove part , we fix some , and without loss of generality assume that and is regularly varying with index as . This allows us to fix some .
Remark 5.
It is worth noting that the conditions stated in Theorem 4.2 are mild for Lévy processes with infinite activities. In particular, for to exhibit infinite activity, we must have either or . Theorem 4.2 (i) deals with the case where . On the other hand, when we must have either or . To satisfy the conditions in part (ii) of Theorem 4.2, the only other requirement is that (or ) approaches infinity at a rate that is at least comparable to some power-law function.
The next set of sufficient conditions for Assumption 2 revolves around another type of self-similarity structure in the Lévy measure .
Definition 4.
Given and , a Lévy process is -semi-stable with span if its Lévy measure satisfies
(4.4) |
where the transformation () onto a Borel measure on is given by .
As a special case of semi-stable processes, note that is -stable if
where ; see Theorem 14.3 of [58] for details. However, it is worth noting that the class of Lévy processes with regularly varying Lévy measures studied in Proposition 4.1 is not a subset of the class of semi-stable processes introduced in Definition 4. For instance, given a Borel measure , suppose that is regularly varying at 0 with index . Even if satisfies the scaling-invariant property in (4.4) for some , we can fix a sequence of points and assign an extra mass of onto at each point . In doing so, we break the scaling-invariant property but still maintain the regular variation of . Conversely, to show that a semi-stable process need not have a regularly varying Lévy measure (when restricted to some neighborhood of the origin), let us consider a simple example. For some and , define the following measure:
Clearly, can be seen as the restriction of the Lévy measure (restricted on ) of some -semi-stable process. Now define function on . For any ,
As , we see that will be very close to
As long as we did not pick for some , the value of will asymptotically cycle through the following three different values
thus implying that does not converge as approaches . This confirms that is not regularly varying as .
In Proposition 4.3, we show that semi-stable processes, as well as their truncated counterparts, satisfy continuity conditions of form (4.1). We say that the process is non-trivial if it is not a deterministic linear function (i.e., for some ). The proof is again detailed in Section 6.3.
Proposition 4.3.
Let and . Suppose that is the Lévy measure of a non-trivial -semi-stable process of span . Then under , the Lévy process with generating triplet has a continuous density function for any . Furthermore, there exists some such that
Lastly, by applying Proposition 4.3, we yield another set of sufficient conditions for Assumption 2.
Theorem 4.4.
Let be the generating triplet of Lévy process . Suppose that there exist some Borel measure and some such that and is the Lévy measure of some -semi-stable process. Then Assumption 2 holds for .
5 Numerical Experiments
In this section, we apply the importance sampling strategy outlined in Algorithms 2 and 3 and demonstrate the performance of the importance sampling estimators under different scaling factors and tail distributions, as well as the strong efficiency of the proposed algorithms when compared to crude Monte Carlo methods. Specifically, consider a Lévy process , where is a standard Brownian motion, is a Poisson process with arrival rate , and is a sequence of iid random variables with law (for some )
For each , we define the scaled process . The goal is to estimate the probability of , where the set is defined as in (3.1) with and . Note that this is a case with .
To evaluate the performance of the importance sampling estimator under different scaling factors and tail distributions, we run experiments under , and . The efficiency is evaluated by the relative error of the algorithm, namely the ratio between the standard deviation and the estimated mean. In Algorithm 2, we set , and . In Algorithm 3, we further set and . For both algorithms, we generate 10,000 independent samples for each combination of and . For the number of samples in crude Monte Carlo estimation, we ensure that at least samples are generated, where is the probability estimated by Algorithm 2.
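For reference, the relative-error metric used throughout this section is simply the ratio of the sample standard deviation to the sample mean. A minimal helper (ours, not part of the paper's algorithms) reads:

```python
import math

def relative_error(samples):
    """Ratio of the (unbiased) sample standard deviation to the sample mean."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(var) / mean
```

Strong efficiency manifests empirically as this ratio stabilizing around a constant as the scaling factor grows, rather than blowing up as it does for crude Monte Carlo.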
The results are summarized in Table 5.1 and Figure 5.1. In Table 5.1, we see that for a fixed , the relative error of the importance sampling estimators stabilizes around a constant level as increases. This aligns with the strong efficiency established in Theorems 3.2 and 3.3. In comparison, the relative error of crude Monte Carlo estimators continues to increase as tends to infinity. Figure 5.1 further highlights that our importance sampling estimators significantly outperform crude Monte Carlo methods by orders of magnitude. In summary, when Algorithms 2 and 3 are appropriately parameterized, their efficiency becomes increasingly evident when compared against the crude Monte Carlo approach as the scaling factor grows larger and the target probability approaches .

| n | 200 | 400 | 600 | 800 | 1000 |
|---|---|---|---|---|---|
| | 97.86 | 136.03 | 195.40 | 238.81 | 273.13 |
| | 15.59 | 18.23 | 19.59 | 21.30 | 21.30 |
| | 237.82 | 386.35 | 526.13 | 681.79 | 866.02 |
| | 18.23 | 19.22 | 22.92 | 28.85 | 31.61 |
| | 524.78 | 1091.29 | 1298.98 | 1965.22 | 2089.82 |
6 Proofs
6.1 Proof of Proposition 3.1
We first prepare two technical lemmas using the sample-path large deviations for heavy-tailed Lévy processes reviewed in Section 2.2.
Proof.
In this proof, we focus on the two-sided case in Assumption 1. It is worth noticing that the analysis for the one-sided case is almost identical, with the only major difference being that we apply Result 1 (i.e., the one-sided version of the large deviations of ) instead of Result 2 (i.e., the two-sided version). Specifically, we claim that
-
()
-
()
;
-
()
the set is bounded away from .
Then by applying Result 2, we yield
and conclude the proof. Now, it remains to prove claims (), (), and ().
Proof of Claim .
By definitions of , for any there exist , and , such that
(6.1) |
First, from Assumption 3, one can choose small enough such that . Then for the case with in (6.1), by picking for all , we have , and hence . This verifies .
Next, suppose we can show that is a necessary condition for . Then we get
which immediately verifies claim due to ; see Assumption 1. Now, to show that is a necessary condition for , note that from (6.1), it holds for any that As a result, we must have and hence due to ; see Assumption 3. This concludes the proof of claim .
Proof of Claim .
Again, choose some small enough such that . Given any and , the step function satisfies , thus implying . Therefore, (for the definition of , see (2.3))
Proof of Claim .
Assumption 3 implies that , allowing us to choose small enough that . It suffices to show that
(6.2) |
Here, is the Skorokhod metric; see (2.1) for the definition. To prove (6.2), we start with the following observation: due to claim , for any with , we must have . Now, we proceed with a proof by contradiction. Suppose there is some with and some such that . Due to (and hence no upward jump in is larger than ) and , under the representation (6.1) we must have . This implies . Due to again, we yield the contradiction that (and hence ). This concludes the proof of claim . ∎
Lemma 6.2.
Let . Let be such that and . Suppose that holds for
(6.3) |
Then
where and the function counts the number of discontinuities in .
Proof.
Similar to the proof of Lemma 6.1, we focus on the two-sided case in Assumption 1. Still, it is worth noticing that the proof of the one-sided case is almost identical, with the only major difference being that we apply Result 1 instead of Result 2.
First, observe that where
Furthermore, we claim that
-
-
the set is bounded away from .
Then we are able to apply Result 2 and obtain
Lastly, by our assumption , we get and conclude the proof. Now, it remains to prove claims and .
Proof of Claim .
By definition of , given any there exist and such that the representation (6.1) holds. By assumption , for defined in (6.3) we have
(6.4) |
It then holds for all small enough that . As a result, for the case with in (6.1), by picking for all , and for all , we get . This proves that , and hence .
Next, suppose we can show that is the necessary condition for . Then, we get
which immediately verifies claim due to ; see Assumption 1. Now, to show that is a necessary condition, note that, from (6.1), it holds for any that Furthermore, by the definition of the set , we must have (here, w.l.o.g., we order ’s by ) for all and for all . This implies and hence which is equivalent to .
Proof of Claim .
From (6.4), we can fix some small enough such that
(6.5) |
It suffices to show that
(6.6) |
Here, is the Skorokhod metric; see (2.1) for the definition. To prove (6.6), we start with the following observation: using claim , for any with , we must have . Next, we proceed with a proof by contradiction. Suppose there is some with and some such that . By the definition of the set above, any upward jump in is bounded by , and at most of them are larger than . Then from , we know that any upward jump in is bounded by , and at most of them are larger than . Through (6.1), we now have
The last inequality follows from (6.5). Using again, we yield the contradiction that and hence . This concludes the proof of (6.6). ∎
Now, we are ready to prove Proposition 3.1.
Proof of Proposition 3.1.
We start by proving the unbiasedness of the importance sampling estimator
under . Note that under both and , we have (i.e., ) and that is independent of everything else. In light of Result 4, it suffices to verify and condition (2.8) (under ) with the choice of and . In particular, it suffices to show that (for any )
(6.7) |
To see why, note that (6.7) directly verifies condition (2.8). Furthermore, it implies . The convergence then implies the convergence, i.e.,
To prove claim (6.7), observe that
In particular, since and only take values in , we have and
(6.8) |
The last line in the display above follows from the definition of in (3.6). To conclude, note that and hence with , thus implying ; also, as prescribed in Proposition 3.1 we have .
The rest of the proof is devoted to establishing the strong efficiency of . Observe that
By definitions in (3.5), on event we have , while on event we have . As a result,
(6.9) |
Meanwhile, Lemma 6.1 implies that
(6.10) |
Let and . Furthermore, given , we claim the existence of some such that for any ,
(6.11) | ||||
(6.12) | ||||
(6.13) |
as . Then, using (6.11) and (6.12), we get . The last equality follows from (6.10). Similarly, from (6.10) and (6.13) we get . Therefore, in (6.9) we have , thus establishing the strong efficiency. Now, it remains to prove claims (6.11), (6.12), and (6.13).
Proof of Claim (6.11).
We show that the claim holds for all . For any and , note that
(6.14) |
Recall that and . Therefore,
Lastly, the regularly varying nature of (see Assumption 1) implies and hence .
Proof of Claim (6.12).
Again, we prove the claim for all . By the definition of in (3.8),
Meanwhile, by the definition of , we have on , where counts the number of discontinuities for any . By applying Result 4 under the choice of and , we yield
(6.15) |
In particular, given , we have , and hence
Again, the regularly varying nature of allows us to conclude that .
Proof of Claim (6.13).
Fix some and some such that . Let be such that . By Assumption 3, we can pick some small enough such that . This allows us to pick small enough such that where
(6.16) |
We prove the claim for all . Specifically, given any , one can pick such that . Due to our choice of and , it follows from (6.16) that where
6.2 Proof of Theorems 3.2 and 3.3
We stress again that Theorem 3.2 follows directly from Theorem 3.3 with (i.e., by disabling ARA from Algorithm 3). We devote the remainder of this section to proving Theorem 3.3.
Throughout Section 6.2, we fix the following constants and parameters. First, let be the Blumenthal-Getoor index of and be the regularly varying index of ; see Assumption 1. Fix some
(6.19) |
This allows us to pick large enough such that
(6.20) |
for in (3.26) and in (3.20). Let be the constant in Assumption 2. Choose
(6.21) |
Next, fix
(6.22) |
Based on the chosen value of , fix
(6.23) |
Pick
(6.24) |
Since we require to be strictly less than , there is some integer such that
(6.25) |
where is the parameter in set ; see Assumption 3. Based on the values of and , it holds for all small enough that
(6.26) |
Then, based on all previous choices, it holds for all close enough to 1 such that
(6.27) | ||||
(6.28) | ||||
(6.29) | ||||
(6.30) | ||||
(6.31) | ||||
(6.32) | ||||
(6.33) |
Lastly, pick . By picking a larger if necessary, we can ensure that
(6.34) |
Next, we make a few observations. Given some non-negative integer , let
(6.35) |
where are the order statistics of iid samples of Unif, and ’s are iid samples from . We adopt the convention that and . Note that when , we set as the zero function, and set , , and .
For defined in (3.14) and defined in (3.27), note that
(6.36) |
where
(6.37) | ||||
(6.38) |
See (3.15)–(3.17) and (3.24) for the definitions of ’s and ’s, respectively. Also, define
(6.39) |
As intermediate steps for the proof of Theorem 3.3, we present the following two results. Proposition 6.3 states that, using as an anchor, we see that and would stay close enough with high probability, especially for large . Proposition 6.4 then shows that it is unlikely for the law of to concentrate around any .
Proposition 6.3.
There exists some constant such that the inequality
holds for any , , , , and .
Proposition 6.4.
There exists some constant such that the inequality
holds for any , , , , and .
First, equipped with Propositions 6.3 and 6.4, we are able to prove the main results of Section 3.6, i.e., Theorem 3.3.
Proof of Theorem 3.3.
Verification of (3.9).
Conditioning on , the conditional law of is the same as the law of the process specified in (6.35). This implies
Next, on event
we must have (for any )
It then follows from (6.36) that, on this event, we have . Therefore,
(6.40) | |||
Applying Proposition 6.3 (with ), we get (for any )
(6.41) |
On the other hand, due to (6.25), we have for all and . This allows us to apply Proposition 6.4 (with ) and yield (for any )
(6.42) |
Plugging (6.41) and (6.42) into (6.40), we conclude the proof by setting .
Verification of (3.10).
Fix some and . Again, conditioning on , the conditional law of is the same as the law of the process specified in (6.35). This implies
Due to (and hence ) and , we conclude the proof by setting . ∎
The rest of this section is devoted to proving Propositions 6.3 and 6.4. First, we collect a useful result.
Result 5 (Lemma 1 of [35]).
Let be the Lévy measure of a Lévy process . Let . Suppose that for the Blumenthal-Getoor index . Then
Next, we prepare two lemmas regarding the expectations of the supremum of (see (3.6) for the definition) and the difference between and (see (3.23)).
Lemma 6.5.
There exists a constant (depending only on the law of Lévy process ) such that
Proof.
Recall that the generating triplet of is and for the Blumenthal-Getoor index we have ; see Assumption 1. Fix some in this proof. We prove the lemma for
where , , and
Lemma 6.6.
Proof.
From the definitions of and in (3.21) and (3.23), respectively, we have
where is the Lévy process with generating triplet , and is a standard Brownian motion independent of . In particular, is a martingale with variance ; see (3.22) for the definition of . Therefore,
To conclude the proof, we set . ∎
To facilitate the presentation of the next few lemmas, we consider a slightly more general version of the stick-breaking procedure described in (3.15)–(3.24), to allow for arbitrary stick length. Specifically, for any , let
(6.44) |
where ’s are iid copies of Unif. Independent of ’s, for any and , let and be Lévy processes with joint law specified in (3.21) and (3.23), respectively. Conditioning on the values of , define using (for all )
(6.45) |
Lemma 6.7.
Proof.
For notational simplicity, set . Due to ,
(6.46) |
Furthermore, we claim the existence of some constant such that (for any , , and any )
(6.47) |
Then using Result 5, we yield
where . Setting , we conclude the proof.
Lemma 6.8.
Let and . Let be the constant characterized in Lemma 6.5 that only depends on the law of Lévy process . The inequality
holds for all , , and , where .
Proof.
For this proof, we adopt the notation for the remaining stick length after the first sticks. Conditioning on ,
Therefore, unconditionally,
The last line follows from Jensen’s inequality. Lastly, by definition of ’s in (6.44), we have
This concludes the proof. ∎
Lemma 6.9.
Proof.
To simplify notation, in this proof we set and write when there is no ambiguity. For the sequence of random variables , let be its order statistics. Given any ordered positive real sequence , by conditioning on , it follows from (6.45) that
(6.49) |
where ’s are iid copies of the Lévy processes . Next, fix
Given the sequence of real numbers ’s, we define as the number of elements in the sequence that are larger than . In case that , we set . With defined, we consider a decomposition of the events in (6.49) based on the first such that (and hence ), and in particular on whether such is larger than or not. To be specific,
(6.50) |
We first bound terms ’s. For any , observe that
(6.51) |
On the other hand,
(6.52) |
Plugging (6.51) and (6.52) into (6.50), we yield
To conclude the proof, just note that the inequality above holds conditionally on any sequence of , so it also holds unconditionally. ∎
Proof of Proposition 6.3.
In this proof, we fix some , , and . Let the process be defined as in (6.35). Recall the definitions of , , and in (6.37)–(6.39). See also (3.15)–(3.24) for the definitions of ’s and ’s.
To simplify notation, define . Define the events
Note that on event , we must have and . As a result,
Furthermore, we claim the existence of constant , the values of which do not depend on , and , such that (for any and )
(6.53) | ||||
(6.54) | ||||
(6.55) |
This allows us to conclude the proof by setting . Now, it remains to prove claims (6.53)–(6.55).
Proof of Claim (6.53)
The claim is trivial if , so we only consider the case where . Due to the coupling between and in (3.24)–(3.25), we have
where the laws of processes are stated in (3.21) and (3.23), respectively. Applying Lemma 6.6, we yield
where is the constant characterized in Lemma 6.6 that only depends on and the law of the Lévy process . To conclude the proof of claim (6.53), we pick .
Proof of Claim (6.54)
It follows directly from Lemma 6.7 that
where is the constant characterized in Lemma 6.7 that only depends on and the law of the Lévy process . To conclude the proof of claim (6.54), we pick .
Proof of Claim (6.55)
Proof of Proposition 6.4.
In this proof, we fix some . Recall the representation in (6.35), where are the order statistics of iid samples of Unif. Recall the definition of in (6.39). See also (3.15)–(3.24) for the definitions of ’s and ’s.
We start with the following decomposition of events:
First, . The last inequality follows from our choice of in (6.27) and . Furthermore, for each
The last equality follows from the simple fact that . Furthermore, we claim the existence of constants and , the values of which do not vary with parameters , such that for all and ,
(6.56) | ||||
(6.57) |
Then, we conclude the proof by setting . Now, we prove claims (6.56) and (6.57).
Proof of Claim (6.56)
If , the claim is trivial due to and hence . Now, we consider the case where . Due to the independence between and ,
where are independent of the Lévy process . In particular, recall that are order statistics. Therefore, on event we must have . It then follows directly from Assumption 2 that
where and are the constants specified in Assumption 2. To conclude, it suffices to set .
Proof of Claim (6.57)
Applying Lemma 6.9 with and , we get (for all )
Here, is the constant in Lemma 6.5 that only depends on the law of Lévy process , and are the constants in Assumption 2. First, for any and ,
For term , note that as due to . This allows us to fix some such that . As a result, for any ,
Similarly, for all and ,
Besides, due to as , we can find such that . This leads to (for all )
To conclude the proof, we can simply set ∎
6.3 Proof of Propositions 4.1 and 4.3
The proof of Proposition 4.1 is based on the inversion formula for characteristic functions (see, e.g., Theorem 3.3.14 of [29]). Specifically, we compare the characteristic function of with that of an -stable process to draw connections between their distributions.
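As a side illustration of the inversion step (not part of the proof), the mechanism can be checked numerically: when the characteristic function is absolutely integrable, inversion recovers a bounded continuous density. The sketch below recovers the standard normal density from its characteristic function; all names are illustrative.

```python
import numpy as np

def density_via_inversion(phi, x, t_max=40.0, n=400_001):
    """Recover f(x) = (1/2*pi) * integral of e^{-itx} * phi(t) dt from an
    absolutely integrable characteristic function phi, via a Riemann sum
    on the symmetric grid [-t_max, t_max]."""
    t = np.linspace(-t_max, t_max, n)
    dt = t[1] - t[0]
    integrand = np.exp(-1j * t * x) * phi(t)
    return np.real(np.sum(integrand)) * dt / (2 * np.pi)

# Standard normal: phi(t) = exp(-t^2/2), so f(0) = 1/sqrt(2*pi).
phi_normal = lambda t: np.exp(-t ** 2 / 2)
f0 = density_via_inversion(phi_normal, 0.0)
```

The same computation with a stable-type characteristic function (e.g., exp(-|t|^alpha)) yields the uniform density bound exploited in the proof, since the bound on the density is controlled by the integral of the modulus of the characteristic function.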
Proof of Proposition 4.1.
The Lévy–Khintchine formula (see, e.g., Theorem 8.1 of [58]) leads to the following expression for the characteristic function of :
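For reference, the Lévy–Khintchine formula in its standard form (Theorem 8.1 of [58]) states that, for a Lévy process $X$ with generating triplet $(b, \sigma^2, \nu)$ (generic notation, which may differ from this paper's conventions),

```latex
\mathbf{E}\bigl[e^{i\theta X_t}\bigr]
  = \exp\!\Bigl( t \Bigl( i b \theta - \tfrac{1}{2}\sigma^2 \theta^2
  + \int_{\mathbb{R}} \bigl( e^{i\theta x} - 1 - i\theta x\, \mathbb{1}\{|x| \le 1\} \bigr)\, \nu(dx) \Bigr) \Bigr),
  \qquad \theta \in \mathbb{R}.
```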
Note that
Then from for all ,
(6.58) |
Furthermore, we claim the existence of some such that
(6.59) |
Plugging (6.59) into (6.58), we obtain that for all and , It then follows directly from the inversion formula (see Theorem 3.3.14 of [29]) that, for all , admits a continuous density function with a uniform bound
To conclude the proof, pick . Now, it only remains to prove claim (6.59).
Proof of Claim (6.59).
We start by fixing some constants.
(6.60) |
For , note that , and hence . For , note that and hence . Due to , we have . Next, choose positive real numbers such that
(6.61)
(6.62) |
For any and , observe that (by setting in the last step)
Therefore, by fixing some large enough, we have
(6.63) |
To proceed, we compare with . Recall that is the constant prescribed in the statement of Proposition 4.1. For any such that ,
(6.64) |
We bound the terms , , and separately. First, for any ,
(6.65) |
For , it follows immediately from (6.63) that
(6.66) |
Next, in order to bound , we consider the function . Since is uniformly continuous on , we can find some , and a sequence of real numbers such that
(6.67) |
In other words, we use a geometric sequence to partition into intervals. On each of these intervals, the fluctuation of is bounded by the constant fixed in (6.62). Now fix some such that (recall that is prescribed in the statement of this proposition)
(6.68) |
Since is regularly varying as with index , for we have as . By Potter’s bound (see Proposition 2.6 in [53]), there exists such that
(6.69) |
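For convenience, we recall Potter's bound in its generic form (see Proposition 2.6 of [53]): if $f$ is regularly varying at infinity with index $\rho$, then for any $A > 1$ and $\epsilon > 0$ there exists $x_0 > 0$ such that

```latex
\frac{f(y)}{f(x)} \;\le\; A \max\!\left\{ \Bigl(\tfrac{y}{x}\Bigr)^{\rho+\epsilon},\;
\Bigl(\tfrac{y}{x}\Bigr)^{\rho-\epsilon} \right\}
\qquad \text{for all } x, y \ge x_0.
```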
Meanwhile, define
and note that . Due to , we can find some such that
(6.70) |
Let . For any , we have and for any . As a result, for with and any ,
On the other hand,
Therefore, given any such that , we have for all where . This leads to
with
(6.71) |
Again, the proof of Proposition 4.3 makes use of the inversion formula.
Proof of Proposition 4.3.
Let us denote the characteristic functions of and by and , respectively. Repeating the arguments using complex conjugates in (6.58), we obtain
As for , using Proposition 14.9 in [58], we get
(6.72) |
where is a non-negative function continuous on satisfying and
This implies for all . Furthermore, we claim the existence of some such that
(6.73) |
Then due to the self-similarity of (i.e., ), we have for all . In the meantime, note that
By picking large enough, it holds for any that
(6.74) |
Therefore, for any ,
and hence for all . Applying the inversion formula, we get (for any )
To conclude the proof, we set . Now it only remains to prove claim (6.73).
Proof of Claim (6.73)
We proceed with a proof by contradiction. If , then by continuity of , there exists some such that
Now for any , define the following sets:
Observe that:
- For any , we have , which implies ;
- Meanwhile, .
Together with the fact that (so that the process is non-trivial), there must be some such that
Besides, from we know that . However, by definition of semi-stable processes in (4.4) we know that where the transformation () onto a Borel measure on is defined as . This implies
which would contradict eventually for large enough. This concludes the proof of for all . ∎
References
- [1] S. Asmussen, P. Glynn, and J. Pitman. Discretization Error in Simulation of One-Dimensional Reflecting Brownian Motion. The Annals of Applied Probability, 5(4):875 – 896, 1995.
- [2] S. Asmussen and D. P. Kroese. Improved algorithms for rare event simulation with heavy tails. Advances in Applied Probability, 38(2):545–558, 2006.
- [3] S. Asmussen and J. Rosiński. Approximations of small jumps of Lévy processes with a view towards simulation. Journal of Applied Probability, 38(2):482–493, 2001.
- [4] A. Bassamboo, S. Juneja, and A. Zeevi. On the inefficiency of state-independent importance sampling in the presence of heavy tails. Operations Research Letters, 35(2):251–260, 2007.
- [5] M. L. Bianchi, S. T. Rachev, Y. S. Kim, and F. J. Fabozzi. Tempered infinitely divisible distributions and processes. Theory of Probability & Its Applications, 55(1):2–26, 2011.
- [6] J. Blanchet and P. Glynn. Efficient rare-event simulation for the maximum of heavy-tailed random walks. The Annals of Applied Probability, 18(4):1351 – 1378, 2008.
- [7] J. Blanchet, P. Glynn, and J. Liu. Efficient rare event simulation for heavy-tailed multiserver queues. Technical report, Department of Statistics, Columbia University, 2008.
- [8] J. Blanchet, H. Hult, and K. Leder. Rare-event simulation for stochastic recurrence equations with heavy-tailed innovations. ACM Trans. Model. Comput. Simul., 23(4), December 2013.
- [9] J. H. Blanchet and J. Liu. State-dependent importance sampling for regularly varying random walks. Advances in Applied Probability, 40(4):1104–1128, 2008.
- [10] S. Borak, A. Misiorek, and R. Weron. Models for heavy-tailed asset returns, pages 21–55. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011.
- [11] O. Boxma, E. Cahen, D. Koops, and M. Mandjes. Linear networks: rare-event simulation and Markov modulation. Methodology and Computing in Applied Probability, 2019.
- [12] S. Boyarchenko and S. Levendorskii. Efficient evaluation of joint pdf of a Lévy process, its extremum, and hitting time of the extremum, 2023.
- [13] S. Boyarchenko and S. Levendorskii. Simulation of a Lévy process, its extremum, and hitting time of the extremum via characteristic functions, 2023.
- [14] J. A. Bucklew, P. Ney, and J. S. Sadowsky. Monte Carlo simulation and large deviations theory for uniformly recurrent Markov chains. Journal of Applied Probability, 27(1):44–59, 1990.
- [15] P. Carr, H. Geman, D. Madan, and M. Yor. The fine structure of asset returns: An empirical investigation. The Journal of Business, 75(2):305–332, 2002.
- [16] P. Carr, H. Geman, D. B. Madan, and M. Yor. Stochastic volatility for Lévy processes. Mathematical Finance, 13(3):345–382, 2003.
- [17] J. I. G. Cázares, A. Kohatsu-Higa, and A. Mijatović. Joint density of the stable process and its supremum: Regularity and upper bounds. Bernoulli, 29(4):3443 – 3469, 2023.
- [18] J. I. G. Cázares, A. Mijatović, and G. U. Bravo. -strong simulation of the convex minorants of stable processes and meanders. Electronic Journal of Probability, 25(none):1 – 33, 2020.
- [19] L. Chaumont. On the law of the supremum of Lévy processes. The Annals of Probability, 41(3A):1191 – 1217, 2013.
- [20] B. Chen, J. Blanchet, C.-H. Rhee, and B. Zwart. Efficient rare-event simulation for multiple jump events in regularly varying random walks and compound Poisson processes. Mathematics of Operations Research, 44(3):919–942, 2019.
- [21] J. E. Cohen, R. A. Davis, and G. Samorodnitsky. Covid-19 cases and deaths in the united states follow taylor’s law for heavy-tailed distributions with infinite variance. Proceedings of the National Academy of Sciences, 119(38):e2209234119, 2022.
- [22] L. Coutin, M. Pontier, and W. Ngom. Joint distribution of a lévy process and its running supremum. Journal of Applied Probability, 55(2):488–512, 2018.
- [23] J. I. G. Cázares, F. Lin, and A. Mijatović. Fast exact simulation of the first passage of a tempered stable subordinator across a non-increasing function, 2023.
- [24] S. Dereich. Multilevel Monte Carlo algorithms for Lévy-driven SDEs with Gaussian correction. The Annals of Applied Probability, 21(1):283 – 311, 2011.
- [25] S. Dereich and F. Heidenreich. A multilevel monte carlo algorithm for lévy-driven stochastic differential equations. Stochastic Processes and their Applications, 121(7):1565–1587, 2011.
- [26] E. H. A. Dia and D. Lamberton. Connecting discrete and continuous lookback or hindsight options in exponential Lévy models. Advances in Applied Probability, 43(4):1136–1165, 2011.
- [27] P. Dupuis, K. Leder, and H. Wang. Importance sampling for sums of random variables with regularly varying tails. ACM Trans. Model. Comput. Simul., 17(3):14–es, July 2007.
- [28] P. Dupuis, A. D. Sezer, and H. Wang. Dynamic importance sampling for queueing networks. The Annals of Applied Probability, 17(4):1306 – 1346, 2007.
- [29] R. Durrett. Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019.
- [30] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events: for insurance and finance, volume 33. Springer Science & Business Media, 2013.
- [31] A. Ferreiro-Castilla, A. Kyprianou, R. Scheichl, and G. Suryanarayana. Multilevel Monte Carlo simulation for Lévy processes based on the Wiener–Hopf factorisation. Stochastic Processes and their Applications, 124(2):985–1010, 2014.
- [32] M. B. Giles. Multilevel Monte Carlo path simulation. Operations Research, 56(3):607–617, 2008.
- [33] M. B. Giles and Y. Xia. Multilevel Monte Carlo for exponential Lévy models. Finance and Stochastics, 21(4):995–1026, 2017.
- [34] J. González Cázares and A. Mijatović. Simulation of the drawdown and its duration in Lévy models via stick-breaking Gaussian approximation. Finance and Stochastics, 26(4):671–732, 2022.
- [35] J. I. González Cázares, A. Mijatović, and G. Uribe Bravo. Geometrically convergent simulation of the extrema of Lévy processes. Mathematics of Operations Research, 47(2):1141–1168, 2022.
- [36] J. I. González Cázares, A. Mijatović, and G. U. Bravo. Exact simulation of the extrema of stable processes. Advances in Applied Probability, 51(4):967–993, 2019.
- [37] T. Gudmundsson and H. Hult. Markov chain Monte Carlo for computing rare-event probabilities for a heavy-tailed random walk. Journal of Applied Probability, 51(2):359–376, 2014.
- [38] M. Gurbuzbalaban, U. Simsekli, and L. Zhu. The heavy-tail phenomenon in SGD. In M. Meila and T. Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 3964–3975. PMLR, 18–24 Jul 2021.
- [39] S. Heinrich. Multilevel Monte Carlo methods. In S. Margenov, J. Waśniewski, and P. Yalamov, editors, Large-Scale Scientific Computing, pages 58–67, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg.
- [40] T. Hesterberg. Weighted average importance sampling and defensive mixture distributions. Technometrics, 37(2):185–194, 1995.
- [41] L. Hodgkinson and M. Mahoney. Multiplicative noise and heavy tails in stochastic optimization. In International Conference on Machine Learning, pages 4262–4274. PMLR, 2021.
- [42] H. Hult, S. Juneja, and K. Murthy. Exact and efficient simulation of tail probabilities of heavy-tailed infinite series. 2016.
- [43] A. Kuznetsov, A. E. Kyprianou, J. C. Pardo, and K. van Schaik. A Wiener–Hopf Monte Carlo simulation technique for Lévy processes. The Annals of Applied Probability, 21(6):2171 – 2190, 2011.
- [44] M. Kwaśnicki, J. Małecki, and M. Ryznar. Suprema of Lévy processes. The Annals of Probability, 41(3B):2047 – 2065, 2013.
- [45] Y. Li. Queuing theory with heavy tails and network traffic modeling. working paper or preprint, Oct. 2018.
- [46] E. Mariucci and M. Reiß. Wasserstein and total variation distance between marginals of Lévy processes. Electronic Journal of Statistics, 12(2):2482 – 2514, 2018.
- [47] Z. Michna. Formula for the supremum distribution of a spectrally positive Lévy process, 2012.
- [48] Z. Michna. Explicit formula for the supremum distribution of a spectrally negative stable process. Electronic Communications in Probability, 18(none):1 – 6, 2013.
- [49] Z. Michna, Z. Palmowski, and M. Pistorius. The distribution of the supremum for spectrally asymmetric Lévy processes, 2014.
- [50] A. Mijatović and P. Tankov. A new look at short-term implied volatility in asset price models with jumps. Mathematical Finance, 26(1):149–183, 2016.
- [51] K. R. A. Murthy, S. Juneja, and J. Blanchet. State-independent importance sampling for random walks with regularly varying increments. Stochastic Systems, 4(2):321–374, 2014.
- [52] J. Pitman and G. U. Bravo. The convex minorant of a Lévy process. The Annals of Probability, 40(4):1636 – 1674, 2012.
- [53] S. I. Resnick. Heavy-tail phenomena: probabilistic and statistical modeling. Springer Science & Business Media, 2007.
- [54] C.-H. Rhee, J. Blanchet, and B. Zwart. Sample path large deviations for Lévy processes and random walks with regularly varying increments. The Annals of Probability, 47(6):3551–3605, 2019.
- [55] C.-H. Rhee and P. W. Glynn. Unbiased estimation with square root convergence for sde models. Operations Research, 63(5):1026–1043, 2015.
- [56] J. Rosiński. Tempering stable processes. Stochastic Processes and their Applications, 117(6):677–707, 2007.
- [57] P. Sabino. Pricing energy derivatives in markets driven by tempered stable and cgmy processes of ornstein–uhlenbeck type. Risks, 10(8), 2022.
- [58] K. Sato. Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, 1999.
- [59] G. Torrisi. Simulating the ruin probability of risk processes with delay in claim settlement. Stochastic Processes and their Applications, 112(2):225–244, 2004.
- [60] X. Wang and C.-H. Rhee. Rare-event simulation for multiple jump events in heavy-tailed Lévy processes with infinite activities. In Proceedings of the Winter Simulation Conference, WSC ’20, pages 409–420. IEEE Press, 2021.
- [61] X. Wang and C.-H. Rhee. Large deviations and metastability analysis for heavy-tailed dynamical systems, 2023.
- [62] X. Wang and C.-H. Rhee. Importance sampling strategy for heavy-tailed systems with catastrophe principle. In Proceedings of the Winter Simulation Conference, WSC ’23, pages 76–90. IEEE Press, 2024.
Appendix A Barrier Option Pricing
A.1 Problem Setting
This section considers the estimation of probabilities with and
which corresponds to rare-event simulation in the context of down-and-in options. Here, we assume that and . We consider the two-sided case in Assumption 1. That is, is a centered Lévy process with Lévy measures , and there exists some such that and as . Also, we impose an alternative version of Assumption 2 throughout. Let be the Lévy process with generating triplet . That is, is obtained from by removing all jumps with size larger than .
Assumption 4.
There exist such that
A.2 Importance Sampling Algorithm
Below, we present the design of the importance sampling algorithm. For any and , let be the discontinuity in at time , and we set . Let
and let . Intuitively speaking, on event there is at least one upward and one downward “large” jump in , where is understood as the threshold for jump sizes to be considered “large”.
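The effect of conditioning on "large" jumps can be illustrated in a one-dimensional toy problem (not the algorithm of this paper): estimating a Pareto tail probability by proposing from a defensive mixture of the original law and the law conditioned on a large jump, then reweighting by the likelihood ratio dP/dQ. All names, thresholds, and parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def pareto_tail_is(gamma, alpha=1.5, w=0.5, n=100_000):
    """Estimate P(X > gamma) for X ~ Pareto(alpha) (density alpha * x^(-alpha-1)
    on (1, inf)) under a defensive-mixture proposal Q: with probability w,
    sample X conditioned on the 'large jump' event {X > gamma/2}; otherwise
    sample from the original law P. Each sample is reweighted by dP/dQ."""
    from_cond = rng.uniform(size=n) < w
    u = rng.uniform(size=n)
    x = np.where(from_cond,
                 (gamma / 2) * u ** (-1.0 / alpha),  # Pareto on (gamma/2, inf)
                 u ** (-1.0 / alpha))                # original Pareto on (1, inf)
    p = alpha * x ** (-alpha - 1)                    # original density
    q_cond = np.where(x > gamma / 2,                 # conditional density
                      alpha * (gamma / 2) ** alpha * x ** (-alpha - 1), 0.0)
    lr = p / (w * q_cond + (1 - w) * p)              # likelihood ratio dP/dQ
    return np.mean((x > gamma) * lr)

est = pareto_tail_is(100.0)   # true value: 100 ** (-1.5) = 1e-3
```

Under the mixture, tail samples become frequent while the defensive component keeps the likelihood ratio bounded, which is the same mechanism behind the strong efficiency sought here.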
Fix some , and let
The algorithm samples
under . Now, we discuss the design of to ensure the strong efficiency of . Analogous to the decomposition in (3.6), let
Let , , , and . Meanwhile, set
We have . Under the convention , consider estimators of the form
(A.1) |
where is for some and is independent of everything else. Analogous to Proposition 3.1, the following result provides sufficient conditions on for to attain strong efficiency.
Proposition A.1.
Let , , , and . Suppose that
(A.2) |
where and count the number of discontinuities of positive and negative sizes in , respectively. Besides, suppose that for all ,
(A.3) |
where . Then given , there exists some such that for all , the estimators are unbiased and strongly efficient for under the importance sampling distribution .
The proof is almost identical to that of Proposition 3.1. In particular, the proof requires that
and that, for any , it holds for all small enough that
where . These can be obtained directly using sample path large deviations for heavy-tailed Lévy processes in Result 2. Proposition A.1 is then established by repeating the arguments in the proof of Proposition 3.1 and using Result 4 for the randomized debiasing technique.
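The debiasing mechanism in (A.1) follows the randomized multilevel scheme of [55]. The following self-contained sketch, with a toy approximation sequence rather than the estimators of this paper, shows why the geometric randomization removes the bias.

```python
def debiased_estimator(y, N, p=0.5):
    """Coupled-sum randomized debiasing: given approximations y(0), y(1), ...
    converging to a limit, and a truncation level N ~ Geometric(p) on
    {0, 1, 2, ...}, the estimator
        Z = sum_{n=0}^{N} (y(n) - y(n-1)) / P(N >= n),
    with the convention y(-1) = 0, is unbiased for the limit."""
    z = 0.0
    for n in range(N + 1):
        delta = y(n) - (y(n - 1) if n > 0 else 0.0)
        z += delta / (1 - p) ** n  # P(N >= n) = (1 - p)^n
    return z

# Toy approximation sequence y(n) = 1 - 2^{-n}, with limit 1.
y = lambda n: 1.0 - 2.0 ** (-n)

# Unbiasedness check: averaging Z over the exact geometric weights
# recovers the limit (the tail beyond n = 80 is negligible).
expected = sum(0.5 * 0.5 ** n * debiased_estimator(y, n) for n in range(80))
```

Unbiasedness requires the summability condition on the moments of the increments, which is exactly what conditions of the form (A.2)–(A.3) deliver in the present setting.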
A.3 Construction of
Next, we describe the construction of that can satisfy the conditions in Proposition A.1. Specifically, we consider the case where ARA is involved. Let
Under both and , admits the law of a Lévy process with generating triplet . This leads to the Lévy-Ito decomposition
with defined in (3.20). Besides, let be defined as in (3.22). For each and , consider the approximation
where is a sequence of iid copies of a standard Brownian motion, independent of everything else.
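To make the role of ARA concrete, here is a minimal sketch for a hypothetical one-sided Lévy measure nu(dx) = alpha * x^(-alpha-1) dx on (0, inf), with drift and compensation terms omitted: jumps above the cutoff kappa are simulated exactly as a compound Poisson process, while the small-jump part is replaced by a Brownian motion with matched variance, in the spirit of [3].

```python
import numpy as np

rng = np.random.default_rng(7)

def ara_increment(T, alpha, kappa):
    """One increment over [0, T] under the Asmussen-Rosinski approximation,
    for the illustrative Levy measure nu(dx) = alpha * x^(-alpha-1) dx on
    (0, inf), 0 < alpha < 2 (drift and compensation omitted)."""
    rate = kappa ** (-alpha)              # nu((kappa, inf)): big-jump intensity
    n_big = rng.poisson(rate * T)
    big = kappa * rng.uniform(size=n_big) ** (-1.0 / alpha)  # Pareto jumps > kappa
    var_small = alpha * kappa ** (2.0 - alpha) / (2.0 - alpha)  # int_0^kappa x^2 nu(dx)
    gauss = np.sqrt(var_small * T) * rng.standard_normal()   # Gaussian substitute
    return big.sum() + gauss

x = ara_increment(1.0, 1.5, 0.1)
```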
Next, we discuss how to apply SBA and construct the approximators ’s in (A.1). Let be a piecewise step function with jumps over , where , and for each . Recall that the jump times in lead to a partition of defined in (3.12). For any , let the sequence ’s be defined as in (3.15)–(3.16). Conditioning on , one can then sample using
The coupling in (2.7) then implies
Now, we define
as an approximation to . Now, set
In (A.1), we plug in .
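The stick-breaking mechanism invoked throughout this construction can be sketched in isolation as follows (illustrative; function names are not from the paper). The uniform stick-breaking partition of [0, T] successively breaks off a uniform fraction of the remaining interval.

```python
import numpy as np

rng = np.random.default_rng(11)

def stick_breaking_partition(T, n):
    """First n stick lengths L_1, ..., L_n of the uniform stick-breaking
    partition of [0, T]: L_k = U_k * (T - L_1 - ... - L_{k-1}) with iid
    U_k ~ Uniform(0, 1). Returns the lengths and the unbroken remainder."""
    remaining = T
    lengths = np.empty(n)
    for k in range(n):
        u = rng.uniform()
        lengths[k] = u * remaining
        remaining -= lengths[k]
    return lengths, remaining

lengths, rem = stick_breaking_partition(1.0, 30)
```

Since the unbroken remainder halves in expectation at each step (E[remainder] = T / 2^n), truncating after n sticks incurs a geometrically small error, which is the source of the geometric convergence of SBA established in [35].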