Strongly Efficient Rare-Event Simulation for Regularly Varying Lévy Processes with Infinite Activities
Abstract
In this paper, we address rare-event simulation for heavy-tailed Lévy processes with infinite activities. The presence of infinite activities poses a critical challenge, making it impractical to simulate or store the precise sample path of the Lévy process. We present a rare-event simulation algorithm that incorporates an importance sampling strategy based on heavy-tailed large deviations, the stick-breaking approximation for the extrema of Lévy processes, the Asmussen-Rosiński approximation, and the randomized debiasing technique. By establishing a novel characterization for the Lipschitz continuity of the law of Lévy processes, we show that the proposed algorithm is unbiased and strongly efficient under mild conditions, and hence applicable to a broad class of Lévy processes. In numerical experiments, our algorithm demonstrates significant improvements in efficiency compared to the crude Monte-Carlo approach.
1 Introduction
In this paper, we propose a strongly efficient rare-event simulation algorithm for heavy-tailed Lévy processes with infinite activities. Specifically, the goal is to estimate probabilities of the form , where is a Lévy process in , is a subset of the càdlàg space that does not include the typical path of so that is close to , and the event is “unsimulatable” due to the infinitely many jumps within any finite time interval. The defining features of the problem are as follows.
• The increments of the Lévy process are heavy-tailed. Throughout this paper, we characterize the heavy-tailed phenomenon through the notion of regular variation and assume that the tail cdf decays roughly at a power-law rate of ; see Definition 1 for details. The notion of heavy tails provides the mathematical formulation for the extreme uncertainty that manifests in a wide range of real-world dynamics and systems, including the spread of COVID-19 (see, e.g., [21]), traffic in computer and communication networks (see, e.g., [45]), financial assets (see, e.g., [30, 10]), and the training of deep neural networks (see, e.g., [38, 41]).
• is a general subset of (i.e., the space of the real-valued càdlàg functions over ) that involves the supremum of the path. For concreteness in our presentation, the majority of the paper focuses on
(1.1) Intuitively speaking, this is closely related to ruin probabilities under reinsurance mechanisms, as requires the supremum of the process over to exceed some threshold even though all upward jumps in are bounded by . Nevertheless, we stress that the algorithmic framework proposed in this paper is flexible enough to address more general forms of events that are of practical interest. For instance, we demonstrate in Section A of the Appendix that the framework can also address rare-event simulation in the context of barrier option pricing.
• possesses infinite activities; see Section 2.3 for the precise definition. Consequently, it is computationally infeasible to simulate or store the entire sample path of such processes. In other words, we focus on a computationally challenging case where cannot be exactly simulated or evaluated. Addressing such “unsimulatable” cases is crucial due to the increasing popularity of Lévy models with infinite activities in risk management and mathematical finance (see, e.g., [15, 16, 56, 5, 57]), as they offer more accurate and flexible descriptions for the price and volatility of financial assets compared to the classical jump-diffusion models (see, e.g., [50]).
In summary, our goal is to tackle a practically significant yet computationally challenging task, where the nature of the rare events renders crude Monte Carlo methods highly inefficient, if not entirely infeasible, due to the infinite activities in . To address these challenges, we integrate several pieces of mathematical machinery: a design of importance sampling based on sample-path large deviations for heavy-tailed Lévy processes in [54], the stick-breaking approximation in [35] for Lévy processes with infinite activities, and the randomized multilevel Monte Carlo debiasing technique in [55]. By combining these tools, we propose a rare-event simulation algorithm for heavy-tailed Lévy processes with infinite activities that attains strong efficiency (see Definition 2 for details).
As mentioned above, the first challenge is rooted in the nature of rare events: crude Monte Carlo methods can be prohibitively expensive when estimating a small . Instead, variance reduction techniques are often employed for efficient rare-event simulation. When the underlying uncertainties are light-tailed, the exponential tilting strategy guided by large deviations theory has been successfully applied in a variety of contexts; see, e.g., [14, 11, 59, 28]. However, in the heavy-tailed setting the exponential tilting approach falls short of providing a principled and provably efficient design of importance sampling estimators (see, for example, [4]) due to the fundamentally different mechanisms through which the rare events occur. Instead, different importance sampling strategies (e.g., [6, 27, 7, 9, 51, 8]) and other variance reduction techniques such as conditional Monte Carlo (e.g., [2, 42]) and Markov chain Monte Carlo (e.g., [37]) have been proposed to address problems associated with specific types of processes or events.
Recent developments in heavy-tailed large deviations, such as those by [54] and [61], offer critical insights into the design of efficient and universal importance sampling schemes for heavy-tailed systems. Central to this development is the discrete hierarchy of heavy-tailed rare events, known as the catastrophe principle. The principle dictates that rare events in heavy-tailed systems arise due to catastrophic failures of a small number of system components, and the number of such components governs the asymptotic rate at which the associated rare events occur. This creates a discrete hierarchy among heavy-tailed rare events. By combining the defensive importance sampling design with such hierarchy, strongly efficient importance sampling algorithms have been proposed for a variety of rare events associated with random walks and compound Poisson processes in [20]. See also [62] for a tutorial on this topic. In this paper, we adopt and extend this framework to encompass Lévy processes with infinite activities. The specifics of the importance sampling distribution are detailed in Section 3.1.
Another challenge arises from the simulation of Lévy processes with infinite activities. While the importance sampling design in [20] has been successfully applied to a wide range of stochastic systems that are exactly simulatable (including random walks, compound Poisson processes, iterates of stochastic gradient descent, and several classes of queueing systems), it cannot be implemented for Lévy processes with infinite activities. More specifically, the simulation of the random vector , where , poses a significant challenge in the infinite-activity case. As of now, exact simulation of the extrema of Lévy processes (excluding the compound Poisson case) is only available in specific cases (see, for instance, [36, 23, 18]), let alone the exact simulation of the joint law of . We therefore approach the challenge by considering the following questions: does there exist a provably efficient approximation algorithm for , and can we remove the approximation bias while still attaining strong efficiency in our rare-event simulation algorithm?
Regarding the first question, several classes of algorithms have been proposed for the approximate simulation of the extrema of Lévy processes. This includes the random walk approximations based on Euler-type discretization of the process (see e.g., [1, 26, 33]), the Wiener-Hopf approximation methods (see e.g. [43, 31]) based on the fluctuation theory of Lévy processes, the jump-adapted Gaussian approximations (see e.g. [24, 25]), and the characteristic function approach in [12, 13] based on efficient evaluation of joint cdf. Nevertheless, the approximation errors in the aforementioned methods are either unavailable or exhibit a polynomial rate of decay. Thankfully, the recently developed stick-breaking approximation (SBA) algorithm in [35] provides a novel approach to the simulation of the joint law of and . The theoretical foundation of SBA is the following description for the concave majorants of Lévy processes with infinite activities in [52]:
Here, is a sequence of iteratively generated non-negative RVs satisfying and ; conditioned on the values of , ’s are independently generated such that . While it is computationally infeasible to generate the entirety of the infinite sequences and , by terminating the procedure at the -th step we obtain approximations of the form
(1.2)
We provide a review in Section 2.3. In particular, due to , each extra step in (1.2) is expected to halve the approximation error, leading to a geometric rate of convergence for the errors. See [35] for analyses of the approximation errors for different types of functionals.
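The geometric decay of the leftover stick mass can be seen in a minimal sketch. The snippet below is illustrative only (the function name and all numerical choices are ours, not the paper's): it generates uniform stick-breaking lengths on a horizon of length T and checks numerically that the expected leftover length after n sticks is T/2^n.

```python
import random

def stick_breaking_lengths(T, n, rng):
    """Generate the first n stick lengths ell_1, ..., ell_n on a horizon of
    length T: ell_k = V_k * L_{k-1} with V_k ~ Unif(0, 1), where
    L_k = L_{k-1} - ell_k is the length remaining after k sticks (L_0 = T)."""
    lengths, remaining = [], T
    for _ in range(n):
        ell = rng.random() * remaining
        lengths.append(ell)
        remaining -= ell
    return lengths, remaining

# Each stick removes half of the remaining length in expectation, so the
# expected leftover after n sticks is T / 2^n: a geometric rate of decay.
rng = random.Random(0)
avg_leftover = sum(stick_breaking_lengths(1.0, 10, rng)[1]
                   for _ in range(200_000)) / 200_000

lens, rem = stick_breaking_lengths(1.0, 5, random.Random(1))
```

Since the sticks partition the horizon, the generated lengths plus the leftover always sum to T exactly; the Monte Carlo average of the leftover after 10 sticks should sit near 2^{-10}.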
Additionally, while SBA can be considered sufficiently accurate for a wide range of tasks, eliminating the approximation errors is crucial in the context of rare-event simulation. Otherwise, any effort to efficiently estimate a small probability might be fruitless and could be overwhelmed by potentially large errors in the algorithm. In order to remove the approximation errors of SBA in (1.2), we employ the construction of unbiased estimators proposed in [55]. This can be interpreted as a randomized version of the multilevel Monte Carlo scheme [39, 32] when a sequence of biased yet increasingly more accurate approximations is available. It allows us to construct an unbiased estimation algorithm that terminates within a finite number of steps. By combining SBA, the randomized debiasing technique, and the design of importance sampling distributions based on heavy-tailed large deviations, we propose Algorithm 2 for rare-event simulation of Lévy processes with infinite activities. In case the exact sampling of , and hence of the increments ’s in (1.2), is not available, we further incorporate the Asmussen-Rosiński approximation (ARA) in [3]. This approximation replaces the small-jump martingale in the Lévy process with a Brownian motion of the same variance, thus leading to Algorithm 3. We note that the combination of SBA and the randomized debiasing technique has been explored in [35], and an ARA-incorporated version of SBA has been proposed in [34]. However, the goal of proposing a strongly efficient rare-event simulation algorithm adds another layer of difficulty and sets our work apart from the existing literature. In particular, the notion of strong efficiency demands that the proposed estimator remains efficient under the importance sampling algorithm w.r.t. not just a given task, but throughout a sequence of increasingly more challenging rare-event simulation tasks as tends to .
This introduces a new dimension into the theoretical analysis that is not present in [35, 34] and necessitates the development of new technical tools to characterize the performance of the algorithm when all these components (importance sampling, SBA, the debiasing technique, and ARA) are in effect.
An important technical question in our analysis concerns the continuity of the law of the running supremum . To provide a high-level description, let us consider estimators for that admit the form , where is some approximation to the Lévy process ; SBA and the debiasing technique allow us to construct such that the deviation has a small variance. Nevertheless, the estimation can be fallible if concentrates on boundary cases, i.e., falls into a neighborhood of fairly often. Specializing to the case in (1.1), this requires obtaining sufficiently tight bounds for probabilities of the form . However, the continuity of the law of the supremum remains an active area of study, with many essential questions left open. Recent developments regarding the law of are mostly qualitative or focus on the cumulative distribution function (cdf); see, e.g., [19, 22, 44, 47, 49, 48]. In short, addressing this challenge requires us to establish novel and useful quantitative characterizations of the law of the supremum .
For our purpose of efficient rare event simulation, particularly under the importance sampling scheme detailed in Section 3.1, the following condition proves to be sufficient:
(1.3)
Here, is a modulated version of the process where all the upward jumps with sizes larger than are removed; see Section 3 for the rigorous definition. First, we establish in Theorem 3.2 (resp. Theorem 3.3) that Algorithm 2 (resp. Algorithm 3) attains strong efficiency under condition (1.3). More importantly, we demonstrate in Section 4 that condition (1.3) is mild for Lévy processes with infinite activities, as it only requires the intensity of jumps to approach (hence attaining infinite activities in ) at a rate that is not too slow. In particular, in Theorems 4.2 and 4.4 we provide two sets of easily verifiable sufficient conditions for (1.3). We note that the representation of concave majorants for Lévy processes developed in [52] proves to be a valuable tool for studying the law of and . As will be elaborated in the proofs in Section 6, the key technical tool that allows us to connect condition (1.3) with the law of the supremum is, again, the representation in (1.2). See also [17] for its application in studying the joint density of and for stable processes.
Some algorithmic contributions of this paper were presented in preliminary form, without rigorous proofs, at a conference [60]. The current paper presents several significant extensions: in addition to Algorithm 2, we propose an ARA-incorporated version of the importance sampling algorithm (see Algorithm 3) to address the case where cannot be exactly simulated; rigorous proofs of strong efficiency are provided in Section 6; and we establish two sets of sufficient conditions for (1.3) in Section 4, leveraging the properties of regularly varying or semi-stable processes.
The rest of the paper is structured as follows. Section 2 reviews the theoretical foundations of our algorithms, including the heavy-tailed large deviation theories (Section 2.2), the stick-breaking approximations (Section 2.3), and the debiasing technique (Section 2.4). Section 3 presents the importance sampling algorithms and establishes their strong efficiency. Section 4 investigates the continuity of the law of and provides sufficient conditions for (1.3), a critical condition to ensure the strong efficiency of our importance sampling scheme. Numerical experiments are reported in Section 5. The proofs of all technical results are collected in Section 6. In the Appendix, Section A extends the algorithmic framework to the context of barrier option pricing.
2 Preliminaries
In this section, we introduce the notation and results that will be frequently used in the development of the strongly efficient rare-event simulation algorithm.
2.1 Notations
Let be the set of non-negative integers. For any positive integer , let . For any , let and . For any , we define as the positive part of , and
as the floor and ceiling functions. Given a measure space and any set , we use to denote the restriction of the measure to . For any random variable and any Borel measurable set , let be the law of , and the law of conditioned on the event . Let be the metric space of (i.e., the space of all real-valued càdlàg functions with domain ) equipped with the Skorokhod metric . Here, the metric is defined by
(2.1)
with being the set of all increasing homeomorphisms from to itself.
Henceforth in this paper, the heavy-tailedness of any random element will be captured by the notion of regular variation.
Definition 1.
For any measurable function , we say that is regularly varying as with index (denoted as as ) if for all . We also say that a measurable function is regularly varying as with index if for any . We denote this as as .
For properties of regularly varying functions, see, for example, Chapter 2 of [53].
Next, we discuss the Lévy–Itô decomposition of one-dimensional Lévy processes, i.e., . The law of a one-dimensional Lévy process is completely characterized by its generating triplet , where represents the constant drift, is the magnitude of the Brownian motion term, and the Lévy measure characterizes the intensity of the jumps. More precisely,
(2.2)
where is the Lebesgue measure on , is a standard Brownian motion, the measure satisfies , and is a Poisson random measure over with intensity measure and is independent of . For standard references on this topic, see Chapter 4 of [58].
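To make the decomposition concrete, the following sketch simulates one increment of a toy pure-jump Lévy process with symmetric Lévy measure ν(dx) = |x|^(−α−1)dx on [−1, 1] excluding 0 (our own illustrative choice, not a model from the paper). Jumps of size at least a cutoff κ are simulated exactly as a compound Poisson sum, while the compensated small-jump part is replaced, in the spirit of the Asmussen-Rosiński approximation discussed in the introduction, by a Gaussian term of matching variance. All names and parameter values are hypothetical.

```python
import math, random

ALPHA, KAPPA = 0.5, 0.01   # toy jump-activity index and small-jump cutoff

def ara_increment(rng, t=1.0):
    """One increment X(t) of a toy pure-jump Levy process with symmetric
    Levy measure nu(dx) = |x|^(-ALPHA-1) dx on [-1, 1] excluding 0.
    Jumps of size >= KAPPA form a compound Poisson sum; jumps below KAPPA
    are replaced by a Brownian term with the matching variance
    sigma^2 = integral of x^2 nu(dx) over |x| < KAPPA."""
    # Compound Poisson part: rate of jumps with |x| in [KAPPA, 1].
    lam = t * 2.0 * (KAPPA ** -ALPHA - 1.0) / ALPHA
    n, p, thresh = 0, rng.random(), math.exp(-lam)   # Knuth's Poisson sampler
    while p > thresh:
        n += 1
        p *= rng.random()
    x = 0.0
    for _ in range(n):
        # Inverse transform: P(|J| > y) = (y^-ALPHA - 1) / (KAPPA^-ALPHA - 1).
        u = rng.random()
        size = (u * (KAPPA ** -ALPHA - 1.0) + 1.0) ** (-1.0 / ALPHA)
        x += size if rng.random() < 0.5 else -size   # symmetric sign
    # ARA part: Gaussian with the variance of the removed small jumps.
    sigma2 = 2.0 * KAPPA ** (2.0 - ALPHA) / (2.0 - ALPHA)
    return x + rng.gauss(0.0, math.sqrt(t * sigma2))

rng = random.Random(1)
samples = [ara_increment(rng) for _ in range(20_000)]
emp_var = sum(s * s for s in samples) / len(samples)
# Total variance of X(1) is the integral of x^2 nu(dx) = 2 / (2 - ALPHA),
# which the ARA substitution preserves exactly.
```

By construction the substitution matches the second moment of the removed small jumps, so the empirical variance of the simulated increments should sit near 2/(2 − α) = 4/3 for this toy measure.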
Given two sequences of non-negative real numbers and , we say that (as ) if there exists some such that . Besides, we say that if . The goal of this paper is described in the following definition of strong efficiency.
Definition 2.
Let be a sequence of random variables supported on a probability space and be a sequence of events (i.e., ). We say that are unbiased and strongly efficient estimators of if
We stress again that strongly efficient estimators achieve uniformly bounded relative errors (i.e., the ratio between standard error and mean) for all .
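The bounded-relative-error requirement can be contrasted with crude Monte Carlo in a short sketch (illustrative names and numbers, not from the paper): the single-sample estimator 1{A} of p = P(A) has relative error sqrt((1−p)/p), which blows up as p → 0, so the sample size needed for a fixed relative accuracy explodes.

```python
def bernoulli_relative_error(p):
    """Relative error (standard deviation / mean) of the single-sample
    crude Monte Carlo estimator 1{A} of a probability p = P(A)."""
    return ((1.0 - p) / p) ** 0.5

# To keep the relative error of an n-sample average below 10%, crude Monte
# Carlo needs roughly n = 100 * (1 - p) / p samples, which explodes as p -> 0.
samples_needed = {p: int(100 * (1 - p) / p) for p in (1e-2, 1e-4, 1e-6)}
```

A strongly efficient estimator, by contrast, keeps the per-sample relative error bounded along the whole sequence of rare-event problems, so the required sample size stays bounded as the event probability vanishes.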
2.2 Sample-Path Large Deviations for Regularly Varying Lévy Processes
The key ingredient of our importance sampling algorithm is the recent development of the sample-path large deviations for Lévy processes with regularly varying increments; see [54]. To familiarize the readers with this mathematical machinery, we start by reviewing the results in the one-sided case, and then move on to the more general two-sided results.
Let be a centered Lévy process (i.e., ) with generating triplet such that the Lévy measure is supported on . In other words, all the discontinuities in will be positive, hence one-sided. Moreover, we are interested in the heavy-tailed setting where the function is regularly varying as with index where . Define a scaled version of the process as , and let . Note that is a random element taking values in .
For all , let be the subset of containing all non-decreasing step functions that have jumps (i.e., discontinuities) and vanish at the origin. Let be the set that only contains the zero function . Let . For any , let be the measure supported on with . For any positive integer , let be the fold product measure of restricted to . Define the measure (for )
where all ’s are iid copies of . In case that , we set as the Dirac measure on 0. The following result provides sharp asymptotics for rare events associated with . Henceforth in this paper, all measurable sets are understood to be Borel measurable.
Result 1 (Theorem 3.1 of [54]).
Let be measurable. Suppose that and is bounded away from in the sense that . Then
where are the interior and closure of respectively.
Intuitively speaking, Result 1 embodies a general principle that, in heavy-tailed systems, rare events arise due to several “large jumps”. Here, denotes the minimum number of jumps required in for event to occur. As shown above in Result 1, dictates the polynomial rate of decay of the probabilities of the rare events . Furthermore, results such as Corollary 4.1 in [54] characterize the conditional limits of : conditioning on the occurrence of rare events , the conditional law converges in distribution to that of a step function over with exactly jumps (of random sizes and arrival times) as . Therefore, also dictates the most likely scenarios of the rare events. This insight proves to be critical when we develop the importance sampling distributions for the rare-event simulation algorithm in Section 3.
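The "large jumps" principle can be observed numerically in a toy setting (our own illustrative choice, not the paper's Lévy model): for a sum of iid Pareto random variables, conditioned on the sum being unusually large, almost all of the sum is typically carried by the single largest term.

```python
import random

def pareto(rng, alpha=1.5):
    # Pareto with tail P(X > x) = x^(-alpha) for x >= 1, via inverse transform.
    return rng.random() ** (-1.0 / alpha)

rng = random.Random(42)
n, threshold, ratios = 20, 200.0, []
for _ in range(100_000):
    xs = [pareto(rng) for _ in range(n)]
    s = sum(xs)
    if s > threshold:
        # Share of the sum carried by the biggest summand on the rare event.
        ratios.append(max(xs) / s)
avg_share = sum(ratios) / len(ratios)
```

Here the threshold is far above the typical sum (mean about 60 for these parameters), and on the rare event the largest term accounts for the bulk of the total, in line with the one-big-jump heuristic; events requiring two or more big jumps are an order of magnitude rarer, which is the discrete hierarchy described above.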
Results for the two-sided case admit a similar yet slightly more involved form, where the Lévy process exhibits both positive and negative jumps. Specifically, let be a centered Lévy process such that for and , we have and as for some . Let be the set containing all step functions in vanishing at the origin that have exactly upward jumps and downward jumps. As a convention, let . Given , let , where . Let be the Dirac measure on . For any let
(2.3)
where all ’s and ’s are iid copies of Unif RVs. Now, we are ready to state the two-sided result.
2.3 Concave Majorants and Stick-Breaking Approximations of Lévy Processes with Infinite Activities
Next, we review the distribution of the concave majorant of a Lévy process with infinite activities characterized in [52], which paves the way for the stick-breaking approximation algorithm proposed in [35]. Let be a Lévy process with generating triplet . We say that has infinite activities if or . Let be the running supremum of . The results in [52] establish a Poisson–Dirichlet distribution that underlies the joint law of and . Specifically, we fix some and let ’s be iid copies of Unif RVs. Recursively, let
(2.5) |
Conditioning on the values of , let be a random copy of , with all being independently generated.
Result 3 (Theorem 1 in [52]).
Suppose that the Lévy process has infinite activities. Then (with )
(2.6) |
Based on the distribution characterized in (2.6), the stick-breaking approximation algorithm was proposed in [35] where finitely many ’s are generated in order to approximate and . This approximation technique is a key component of our rare event simulation algorithm. In particular, we utilize a coupling between different Lévy processes based on the representation (2.6) above. For clarity of our description, we focus on two Lévy processes and with generating triplets and , respectively. Suppose that both and have infinite activities. We first generate ’s as described in (2.5). Conditioning on the values of , we then independently generate and , which are random copies of and , respectively. Let . Applying Result 3, we identify a coupling between such that
(2.7) |
Remark 1.
It is worth noting that the method described above in fact implies the existence of a probability space that supports the entire sample paths and , whose endpoint values and suprema admit the joint law in (2.7). In particular, once we obtain based on (2.5), one can generate as iid copies of the entire path of . That is, we generate a piece of sample path on the stick , and the quantities introduced earlier can be obtained by setting . To recover the sample path of based on the pieces , it suffices to apply the Vervaat transform to each and then reorder the pieces based on their slopes. We refer the readers to Theorem 4 in [52]. In summary, the method described above leads to a coupling between the sample paths of the underlying Lévy processes and such that (2.7) holds.
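When the marginal increments of the process can be sampled exactly, the stick-breaking representation is directly simulatable. A minimal sketch for standard Brownian motion (an infinite-variation, hence infinite-activity, example; all names and truncation choices below are ours): given the stick lengths, the supremum over [0, T] is the sum of the positive parts of independent increments over the sticks, and truncating the series after a few sticks already gives a negligible bias because the leftover stick mass decays geometrically.

```python
import math, random

def sba_supremum_bm(T, n, rng):
    """Truncated stick-breaking approximation of the supremum of a standard
    Brownian motion on [0, T]: sup ~ sum_k max(xi_k, 0), where, given the
    stick lengths ell_k, xi_k is an independent copy of B(ell_k)."""
    remaining, sup = T, 0.0
    for _ in range(n):
        ell = rng.random() * remaining   # uniform stick-breaking
        remaining -= ell
        xi = rng.gauss(0.0, math.sqrt(ell))   # B(ell) ~ N(0, ell)
        sup += max(xi, 0.0)
    return sup

rng = random.Random(7)
est = sum(sba_supremum_bm(1.0, 20, rng) for _ in range(100_000)) / 100_000
# By the reflection principle, E[sup of B on [0,1]] = sqrt(2/pi) ~ 0.7979.
```

With 20 sticks the truncation bias is of order (2/3)^20, far below the Monte Carlo noise, so the sample mean should match sqrt(2/π) closely; for general Lévy processes the increments over the sticks are not exactly simulatable, which is where ARA enters in Section 3.6.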
2.4 Randomized Debiasing Technique
To achieve unbiasedness in our algorithm and remove the errors of the stick-breaking approximations, we apply the randomized multilevel Monte Carlo technique studied in [55]. In particular, since in Result 4 below is finite almost surely, the simulation of relies only on rather than the infinite sequence .
Result 4 (Theorem 1 in [55]).
Let random variables and be such that . Let be a positive integer-valued random variable with unbounded support, independent of and . Suppose that
(2.8) |
then (with the convention ) satisfies
where .
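The mechanics of Result 4 can be checked on a deterministic toy sequence (our own illustrative choice): with approximations Y_n converging to a limit Y, the debiased estimator sums the increments Δ_n = Y_n − Y_{n−1} up to an independent random level N, reweighted by the tail probabilities P(N ≥ n). Because the toy Y_n are deterministic, unbiasedness can be verified by enumerating N over (effectively) its whole support; note that the geometric decay of Δ_n relative to the tail of N is what keeps the variance condition of the kind in (2.8) in check.

```python
def debiased(delta, N, tail):
    """Single-run debiased estimator:
    Z = sum_{n=1}^{N} Delta_n / P(N >= n), with Delta_n = Y_n - Y_{n-1}."""
    return sum(delta(n) / tail(n) for n in range(1, N + 1))

r = 0.5                                   # geometric truncation level N
tail = lambda n: r ** (n - 1)             # P(N >= n)
pmf = lambda n: (1.0 - r) * r ** (n - 1)  # P(N = n)

Y = lambda n: 1.0 - 2.0 ** (-n) if n >= 1 else 0.0   # toy approximations, Y_n -> 1
delta = lambda n: Y(n) - Y(n - 1)                    # with the convention Y_0 = 0

# Unbiasedness, checked by enumerating N (the mass beyond n = 60 is ~1e-18):
expected_Z = sum(pmf(n) * debiased(delta, n, tail) for n in range(1, 60))
```

The enumeration returns the limit value 1 up to numerical truncation of the support of N, while each individual run of the estimator only touches finitely many levels.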
3 Algorithm
Throughout the rest of this paper, let be a Lévy process with generating triplet satisfying the following heavy-tailed assumption.
Assumption 1.
. is of infinite activity. The Blumenthal-Getoor index satisfies . Besides, one of the two claims below holds for the Lévy measure .
• (One-sided cases) is supported on , and function is regularly varying as with index where ;
• (Two-sided cases) There exist such that is regularly varying as with index and is regularly varying as with index .
The other assumption on revolves around the continuity of the law of , which is the Lévy process with generating triplet . That is, is a modulated version of where all the upward jumps with size larger than are removed.
Assumption 2.
There exist such that
Assumption 2 can be interpreted as a uniform version of Lipschitz continuity of the law of . In Section 4, we show that Assumption 2 is a mild condition for Lévy processes with infinite activities and is easy to verify.
Next, we describe a class of target events for which we propose a strongly efficient rare event simulation algorithm. Let and be the scaled version of the process. Define events
(3.1) |
In words, means that the path crosses the barrier even though no upward jump in is larger than . For technical reasons, we also impose the following mild condition on the values of the constants .
Assumption 3.
and
In this section, we present a strongly efficient rare-event simulation algorithm for . Specifically, Section 3.1 presents the design of the importance sampling distribution , Section 3.2 discusses how we apply the randomized Monte Carlo debiasing technique of Result 4 in our algorithm, Section 3.3 discusses how we combine the debiasing technique with the SBA in Result 3, and Section 3.4 explains how to sample from the importance sampling distribution . Combining all these components in Section 3.5, we propose Algorithm 2 for rare-event simulation of and establish its strong efficiency in Theorem 3.2. Section 3.6 addresses the case where the exact simulation of is not available.
3.1 Importance Sampling Distributions
At the core of our algorithm is a principled design of importance sampling strategies based on heavy-tailed large deviations. This can be seen as an extension of the framework proposed in [20]. First, note that
(3.2) |
indicates the number of jumps required to cross the barrier starting from the origin if no jump is allowed to be larger than . Based on the sample-path large deviations reviewed in Section 2.2, we expect the events to be almost always caused by exactly large upward jumps in . These insights reveal critical information regarding the conditional law . More importantly, they lead to a natural yet effective choice of importance sampling distributions that focus on the -large-jump paths and provide sufficiently accurate approximations to . Specifically, for any , define events with
(3.3) |
where, for any , we define as the left-limit of at time . Intuitively speaking, the parameter acts as a threshold of “large jumps”: any path has at least upward jumps that are considered large relative to the threshold level . To prevent the likelihood ratio from blowing up to infinity, we then consider an importance sampling distribution with defensive mixtures (see [40]) and define (for some )
(3.4) |
Sampling from , and hence , is straightforward and will be addressed in Section 3.4.
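The role of the defensive mixture can be illustrated in a one-dimensional toy problem (our own construction, not the paper's path-space scheme): estimating a Pareto tail probability with a mixture of the nominal law and the nominal law conditioned on the rare event. The defensive component with weight w caps the likelihood ratio at 1/w, which is precisely what prevents the ratio from blowing up.

```python
import random

ALPHA, W, T = 1.0, 0.1, 100.0   # tail index, defensive weight, rare threshold

def f_density(x):               # nominal Pareto(ALPHA) density on [1, inf)
    return ALPHA * x ** (-ALPHA - 1.0) if x >= 1.0 else 0.0

def g_density(x):               # nominal law conditioned on {X > T}
    return f_density(x) * T ** ALPHA if x > T else 0.0

def sample_q(rng):
    """Draw from the defensive mixture Q = W * P + (1 - W) * P( . | X > T)."""
    x = rng.random() ** (-1.0 / ALPHA)       # Pareto(ALPHA) via inverse transform
    return x if rng.random() < W else T * x  # for Pareto, conditioning = rescaling

def weight(x):                   # likelihood ratio dP/dQ, bounded by 1/W
    return f_density(x) / (W * f_density(x) + (1.0 - W) * g_density(x))

rng = random.Random(0)
n = 20_000
est = sum(weight(x) * (x > T) for x in (sample_q(rng) for _ in range(n))) / n
# True value: P(X > T) = T^(-ALPHA) = 0.01.
```

Under the mixture, the estimator is a bounded random variable and its relative error stays controlled even as the threshold grows, whereas crude Monte Carlo would need on the order of 1/P(X > T) samples to see the event at all.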
With the design of the importance sampling distribution in hand, one would naturally consider an estimator for of the form . This is due to
Here, we use to denote the expectation operator under the law and for the expectation under . Nevertheless, the exact evaluation or simulation of is generally infeasible due to the infinite activities of the process , which make it impossible to simulate or store the entire sample path with finite computational resources. This marks a significant difference from the tasks in [20], which focus on random walks or compound Poisson processes with constant drifts that can be simulated exactly. To overcome this challenge, we instead consider estimators of the form
(3.5) |
where can be simulated with finite computational resources and allows us to recover the correct expectation under the importance sampling distribution , i.e., . In Section 3.2, we elaborate on the design of the estimators .
3.2 Estimators
Intuitively speaking, the goal is to construct ’s that can be plugged into (3.5) as unbiased estimators of . To this end, we consider the following decomposition of the Lévy process . For any and , let be the size of the discontinuity in at time . Recall that is the threshold of large jumps in the definition of in (3.3). Let
(3.6)
We highlight several important facts regarding the decomposition .
• By the definition of , the law of remains unchanged under both and , which is identical to the law of , namely, a Lévy process with generating triplet .
• Under , the process admits the law of a Lévy process with generating triplet , which is a compound Poisson process.
• Under , the path follows the same law as a Lévy process with generating triplet , conditioned on having at least jumps over .
• Under both and , the two processes and are independent.
Let , , , and . We now discuss how the decomposition
can help us construct unbiased estimators of . First, recall that . As a result, in the definition of events in (3.1), the condition only concerns the large jump process since any upward jump in is bounded by . Therefore, with
(3.7) |
and
we have
As discussed above, the exact evaluation of is generally computationally infeasible. Instead, suppose that we have access to a sequence of random variables that only take values in and provide progressively more accurate approximations to as . Then, in light of the debiasing technique in Result 4, one can consider (under the convention that )
(3.8) |
where is for some and is independent of everything else. That is, for all . Indeed, this construction of is justified by the following proposition. We defer the proof to Section 6.1.
Proposition 3.1.
Let , , , and . Suppose that
(3.9) |
where counts the number of discontinuities in for any . Besides, suppose that for all ,
(3.10) |
where . Then given , there exists some such that for all , the estimators specified in (3.5) and (3.8) are unbiased and strongly efficient for under the importance sampling distribution in (3.4).
3.3 Construction of
In light of Proposition 3.1, our next goal is to design ’s that provide sufficient approximations to and satisfy the conditions (3.9) and (3.10).
Recall the decomposition of in (3.6). Under both and , the processes and are independent, and admits the law of , i.e., a Lévy process with generating triplet . This section discusses how, after sampling from , we approximate the supremum of . Specifically, on the event , i.e., when the process makes jumps over , admits the form of with
(3.11) |
for some and . This allows us to partition into disjoint intervals . We adopt the convention and set
(3.12) |
For , define
(3.13) |
as the supremum of the fluctuations of over . Define random function
(3.14) |
and note that .
In theory, the representation (3.14) provides an algorithm for the simulation of . Nevertheless, the exact simulation of the supremum is generally not available. Instead, we apply the SBA introduced in Section 2.3 to approximate , thus providing the construction of . Specifically, define
(3.15)
(3.16) |
where each is an iid copy of Unif. That is, for each , the sequence is defined by the recursion in (2.5), with set to the length of . Then, conditioning on the values of ’s, we sample
(3.17) |
i.e., is an independent copy of , with all being independently generated. Result 3 then implies for each . Furthermore, by summing up only finitely many ’s, we define
(3.18) |
as an approximation to defined in (3.13). Here, is another parameter of the algorithm. For technical reasons, we add an extra term in the summation in (3.18), which helps ensure that the algorithm achieves strong efficiency as while only introducing a minor increase in the computational complexity.
Now, we are ready to present the design of the approximators . For , define the random function
(3.19) |
Here, note that . As a high-level description, the algorithm proceeds as follows. After sampling from the importance sampling distribution defined in (3.4), we plug into defined in (3.8), which in turn allows us to simulate as the importance sampling estimator under .
Remark 2.
At first glance, one may get the impression that the simulation of involves the summation of infinitely many elements in . Fortunately, the truncation index in (see (3.8)) is almost surely finite. Therefore, when running the algorithm in practice, after is decided, there is no need to simulate any beyond . Given the construction of in (3.18), the simulation of only requires (for each ) as well as the sum . Furthermore, conditioning on the value of , the sum admits the law of (see Result 3). This allows us to simulate in one shot.
Note that to implement the importance sampling algorithm and ensure strong efficiency, the following tasks still remain to be addressed.
3.4 Sampling from
In this section, we revisit the task of sampling from , which is at the core of the implementation of the importance sampling distribution in (3.4).
Recall that under , the process is a compound Poisson process with generating triplet . More precisely, let be a Poisson process with rate , and let denote the arrival times of jumps in . Let be a sequence of iid random variables from
and let ’s be independent of . Under , we have
Furthermore, for each , conditioning on , the law of is equivalent to that of the order statistics of iid samples from Unif, and ’s are still independent of ’s with the law unaltered. Therefore, the sampling of from can proceed as follows. We first generate from the distribution of Poisson, conditioning on . Then, independently, we generate as the order statistics of iid samples from Unif, and as iid samples of law . It is worth mentioning that the sampling of can be addressed with the help of the inverse measure. Specifically, define as the inverse of , and observe that
More importantly, for , the law of is . This leads to the steps detailed in Algorithm 1.
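The sampling scheme just described can be sketched as follows. The helper `inv_tail` stands in for an inverse-transform sampler of the (normalized) jump measure, and the Pareto-type example at the end is purely illustrative, not the paper's measure. Generating arrival times from iid exponential inter-arrivals is equivalent to drawing a Poisson number of jumps and sorting iid uniforms.

```python
import random

def compound_poisson_jumps(rate, T, inv_tail):
    """Jump times and sizes of a compound Poisson process on [0, T].

    Inter-arrival times are iid Exp(rate), so conditional on their
    number the jump times follow the law of Unif(0, T) order
    statistics; jump sizes are drawn by inverse transform.
    """
    times, sizes = [], []
    t = random.expovariate(rate)
    while t <= T:
        times.append(t)
        # 1 - random.random() lies in (0, 1], avoiding u == 0
        sizes.append(inv_tail(1.0 - random.random()))
        t += random.expovariate(rate)
    return times, sizes

# Illustration only: Pareto(alpha) jump sizes via the inverse tail u -> u**(-1/alpha)
alpha = 1.5
times, sizes = compound_poisson_jumps(2.0, 10.0, lambda u: u ** (-1.0 / alpha))
```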
3.5 Strong Efficiency and Computational Complexity
With all the discussions above, we propose Algorithm 2 for rare-event simulation of . Specifically, here is a list of the parameters of the algorithm.
-
•
: the threshold in defined in (3.3),
-
•
: the weight of the defensive mixture in ; see (3.4),
-
•
: the geometric rate of decay for in (3.8),
-
•
: determining the term in (3.18).
The choice of will not affect the strong efficiency of the algorithm. Meanwhile, under proper parametrization, Algorithm 2 meets conditions (3.9) and (3.10) stated in Proposition 3.1 and attains strong efficiency. This is verified in Theorem 3.2.
Theorem 3.2.
Let and . There exists such that the following claim holds: Given , there exists such that Algorithm 2 is unbiased and strongly efficient under any .
We defer the proof to Section 6.2. In fact, in Section 3.6 we propose Algorithm 3, which can be seen as an extended version of Algorithm 2 with another layer of approximation. The strong efficiency of Algorithm 2 follows directly from that of Algorithm 3 (i.e., by setting in the proof of Theorem 3.3). The choices of and that ensure strong efficiency are also specified at the end of Section 3.6.
Remark 3.
To conclude, we add a remark regarding the computational complexity of Algorithm 2 under the goal of attaining a given level of relative error at a specified confidence level. First, consider the case where the complexity of simulating scales linearly with (uniformly for all for some constant ). This is a standard assumption, since the number of jumps we expect to simulate over grows linearly with . Then, the complexity of the SBA steps at step 13 of Algorithm 2 also scales linearly with , as the stick lengths of ’s, in expectation, grow linearly with because we deal with the time horizon given the scale factor . Next, since the same law for the truncation index (see step 8 of Algorithm 2) is applied for all , the only other factor that varies with is in the loop at step 10. The strong efficiency of the algorithm then implies a computational complexity of order . If we instead assume that the cost of simulating is also uniformly bounded for all , then the overall complexity of Algorithm 2 is further reduced to .
In comparison, the crude Monte Carlo method requires a number of samples that is inversely proportional to the target probability (see Lemma 6.1) with being the heavy-tailed index in Assumption 1 and defined in (3.2). Hypothetically, assuming that the evaluation of (which at least requires the simulation of and ) is computationally feasible at a cost that scales linearly with , we end up with a computational complexity of (compared to the cost of our algorithm). Similarly, if we assume that the cost of generating is uniformly bounded for all , then the complexity of the crude Monte Carlo method is (compared to the cost of our algorithm). In summary, not only does the proposed importance sampling algorithm address Lévy processes with infinite activities that are not simulatable for crude Monte Carlo methods, but it also enjoys a significant improvement in terms of computational complexity, with the advantage becoming even more evident for multiple-jump events with large .
3.6 Construction of with ARA
As stressed earlier, implementing Algorithm 2 from Section 3.5 requires the ability to sample from . The goal of this section is to address the challenge when the exact simulation of is not available. The plan is to incorporate the Asmussen-Rosiński approximation (ARA) in [3] into the design of the approximation proposed in Section 3.3.
To be specific, let
(3.20) |
where and are two additional parameters of our algorithm. As a convention, we set . Without loss of generality, we consider large enough such that . For the Lévy process with the generating triplet , consider the following decomposition (with being a standard Brownian motion)
(3.21) | ||||
Here, for any , is a martingale with where
(3.22) |
Generally speaking, the difficulty of implementing Algorithm 2 lies in the exact simulation of the martingale . In particular, whenever we have for the Lévy measure , the expected number of jumps in (and hence and ) is infinite over any time interval of positive length. By applying ARA, our goal is to approximate the jump martingales ’s by Brownian motions, which yields a process that is amenable to exact simulation. To do so, let be a sequence of iid standard Brownian motions, independent of . For each , define
(3.23) |
Here, the process can be interpreted as an approximation to , where the jump martingale (with jump sizes under ) is replaced by a standard Brownian motion with the same variance. Note that for any , the random variable is exactly simulatable, as its law is the convolution of that of a compound Poisson process with constant drift and a Gaussian distribution.
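In code, the ARA substitution amounts to simulating, over a window of length t, the kept large jumps exactly while replacing the compensated small-jump martingale by a Gaussian term with the matching variance. The sketch below uses our own names: `sigma2_small` stands for the truncated variance in (3.22), and the large-jump intensity and sampler are hypothetical inputs.

```python
import math
import random

def ara_increment(t, drift, sigma2_small, big_rate, sample_big_jump):
    """One ARA increment over a window of length t.

    Jumps below the truncation level are replaced by a Brownian term
    with per-unit-time variance sigma2_small; jumps above it are kept
    exactly as a compound Poisson sum with intensity big_rate.
    """
    # Gaussian surrogate for the compensated small-jump martingale, plus drift
    x = drift * t + math.sqrt(sigma2_small * t) * random.gauss(0.0, 1.0)
    # Exact large jumps: a Poisson(big_rate * t) number of them
    s = random.expovariate(big_rate)
    while s <= t:
        x += sample_big_jump()
        s += random.expovariate(big_rate)
    return x
```

With `sigma2_small = 0` and no large jumps, the increment degenerates to the pure drift, matching the decomposition with the martingale part removed.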
Based on the approximations in (3.23), we apply SBA and reconstruct (originally defined in (3.18)) and (originally defined in (3.19)) as follows. Let be a piecewise step function with jumps over , i.e., admitting the form in (3.11). Recall that the jump times in lead to a partition of of defined in (3.12). For any , let the sequence ’s be defined as in (3.15)–(3.16). Next, conditioning on , one can sample as
(3.24) |
The coupling in (2.7) then implies
(3.25) | |||
Now, we define
(3.26) |
as an approximation to using both ARA and SBA. Compared to the original design in (3.18), the main difference in (3.26) is that we substitute with , and the latter is exactly simulatable as, conditioning on the values of ’s, it admits the law of . Similarly, let
(3.27) |
Again, the main difference between (3.27) and (3.19) is that we incorporate ARA and substitute ’s with ’s.
Plugging the design of in (3.27) into the estimator in (3.8), we propose Algorithm 3 for rare-event simulation of when exact simulation of is not available. Below is a summary of the parameters of the algorithm.
-
•
: the threshold in defined in (3.3),
-
•
: the weight of the defensive mixture in ; see (3.4),
-
•
: the geometric rate of decay for in (3.8),
-
•
: determining the truncation threshold in (3.20),
-
•
: determining the term in (3.26).
Theorem 3.3 justifies that, under proper parametrization, Algorithm 3 is unbiased and strongly efficient.
Theorem 3.3.
In Section 6.2 we provide the proof, the key arguments of which are the verification of conditions (3.9) and (3.10) in Proposition 3.1. Here, we specify the choices of the parameters. First, pick where is the constant in Assumption 2. Next, pick and . Also, fix . This allows us to pick such that
After picking , one can find some such that . Let be such that . Let be small enough such that . Then, we pick small enough such that
Again, the details of the parameter choices can be found at the beginning of Section 6. It is also worth mentioning that, by setting , Algorithm 3 would reduce to Algorithm 2, as ’s in (3.25) would reduce to ’s in (3.17); in other words, the ARA mechanism is effective only if the truncation threshold . As a result, Theorem 3.2 follows directly from Theorem 3.3 by setting .
Remark 4.
While Algorithm 3 terminates in finitely many steps almost surely, its computational complexity may not be finite in expectation. This is partially due to the implementation of ARA, as we approximate the jump martingale in (3.21) using an independent Brownian motion term in (3.23). In theory, a potential remedy is to identify a better coupling between the jump martingales and Brownian motions; see, for instance, Theorem 9 of [46]. This would allow us to pick a larger for the truncation threshold in ARA, under which the simulation algorithm generates significantly fewer jumps when sampling ’s. However, to the best of our knowledge, there is no practical implementation of the coupling in [46]. We note that similar issues arise in works such as [34], where the coupling in [46] implies a much tighter error bound in theory but cannot be implemented in practice.
4 Lipschitz Continuity of the Distribution of
This section investigates the sufficient conditions for Assumption 2. That is, there exist such that
(4.1) |
Here, recall that is the Lévy process with generating triplet . In other words, this is a modulated version of where any upward jump larger than is removed.
To demonstrate our approach for establishing condition (4.1), we start by considering a simple case where the Lévy process has generating triplet with . This leads to the decomposition
where is a standard Brownian motion, is a Lévy process with generating triplet , and the two processes are independent. Now, for any and ,
(4.2) |
The last inequality follows from the fact that the standard normal distribution admits a density function bounded by . Therefore, we have verified Assumption 2 under , and any . The simple idea behind (4.2) is that continuity conditions such as (4.1) can be passed from one distribution to another through convolution. To generalize this approach to the scenarios where in the generating triplet of the Lévy process , we introduce the following definition.
Definition 3.
Let and be Borel measures on . For any Borel set , we say that majorizes when restricted on (denoted as ) if for any Borel set . In other words, is a positive measure.
Now, let us consider the case where the generating triplet of is . For the Lévy measure , if we can find some , some Borel set and some (positive) Borel measure such that , then through a straightforward superposition of Poisson random measures, we obtain the decomposition (let )
(4.3) |
where is a Lévy process with generating triplet , is a Lévy process with generating triplet , and the two processes are independent. Furthermore, if Assumption 2 (conditions of form (4.1)) holds for the process with generating triplet , then by repeating the arguments in (4.2) we can show that Assumption 2 holds in for any .
Recall our running assumption that the Lévy process is of infinite activities (see Assumption 1). In case that , we must have for to have infinite activity. Therefore, the key step is to identify the majorized measure such that
-
•
holds for with infinite mass and some set ,
- •
In the first main result of this section, we show that measures of form that roughly increase at a power-law rate (as ) provide ideal choices for such majorized measures. In particular, the corresponding Lévy process in (4.3) is intimately related to -stable processes that naturally satisfy continuity properties of form (4.1). We collect the proof in Section 6.3.
Proposition 4.1.
Let , and . Suppose that is regularly varying as with index . Then the Lévy process with generating triplet has a continuous density function for each . Furthermore, there exists a constant such that
where .
Equipped with Proposition 4.1, we obtain the following set of sufficient conditions for Assumption 2.
Theorem 4.2.
Proof.
Part follows immediately from the calculations in (4.2). To prove part , we fix some , and without loss of generality assume that and is regularly varying with index as . This allows us to fix some .
Remark 5.
It is worth noting that the conditions stated in Theorem 4.2 are mild for Lévy processes with infinite activities. In particular, for to exhibit infinite activity, we must have either or . Theorem 4.2 (i) deals with the case where . On the other hand, when we must have either or . To satisfy the conditions in part (ii) of Theorem 4.2, the only other requirement is that (or ) approaches infinity at a rate that is at least comparable to some power-law function.
The next set of sufficient conditions for Assumption 2 revolves around another type of self-similarity structure in the Lévy measure .
Definition 4.
Given and , a Lévy process is -semi-stable with span if its Lévy measure satisfies
(4.4) |
where the transformation () onto a Borel measure on is given by .
As a special case of semi-stable processes, note that is -stable if
where ; see Theorem 14.3 of [58] for details. However, it is worth noting that the class of Lévy processes with regularly varying Lévy measures studied in Proposition 4.1 is not a subset of the class of semi-stable processes introduced in Definition 4. For instance, given a Borel measure , suppose that is regularly varying at 0 with index . Even if satisfies the scaling-invariant property in (4.4) for some , we can fix a sequence of points and assign an extra mass of onto at each point . In doing so, we break the scaling-invariant property but still maintain the regular variation of . Conversely, to show that a semi-stable process need not have a regularly varying Lévy measure (when restricted to some neighborhood of the origin), let us consider a simple example. For some and , define the following measure:
Clearly, can be seen as the restriction of the Lévy measure (restricted on ) of some -semi-stable process. Now define function on . For any ,
As , we see that will be very close to
As long as we did not pick for some , the value of will asymptotically cycle through the following three different values
thus implying that does not converge as approaches . This confirms that is not regularly varying as .
In Proposition 4.3, we show that semi-stable processes, as well as their truncated counterparts, satisfy continuity conditions of form (4.1). We say that the process is non-trivial if it is not a deterministic linear function (i.e., for some ). The proof is again detailed in Section 6.3.
Proposition 4.3.
Let and . Suppose that is the Lévy measure of a non-trivial -semi-stable process of span . Then under , the Lévy process with generating triplet has a continuous density function for any . Furthermore, there exists some such that
Lastly, by applying Proposition 4.3, we yield another set of sufficient conditions for Assumption 2.
Theorem 4.4.
Let be the generating triplet of Lévy process . Suppose that there exist some Borel measure and some such that and is the Lévy measure of some -semi-stable process. Then Assumption 2 holds for .
5 Numerical Experiments
In this section, we apply the importance sampling strategy outlined in Algorithms 2 and 3 and demonstrate the performance of the importance sampling estimators under different scaling factors and tail distributions, as well as the strong efficiency of the proposed algorithms when compared to crude Monte Carlo methods. Specifically, consider a Lévy process , where is a standard Brownian motion, is a Poisson process with arrival rate , and is a sequence of iid random variables with law (for some )
For each , we define the scaled process . The goal is to estimate the probability of , where the set is defined as in (3.1) with and . Note that this is a case with .
To evaluate the performance of the importance sampling estimator under different scaling factors and tail distributions, we run experiments under , and . The efficiency is evaluated by the relative error of the algorithm, namely the ratio between the standard deviation and the estimated mean. In Algorithm 2, we set , and . In Algorithm 3, we further set and . For both algorithms, we generate 10,000 independent samples for each combination of and . For the number of samples in crude Monte Carlo estimation, we ensure that at least samples are generated, where is the probability estimated by Algorithm 2.
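For reference, the relative-error metric used throughout this section is simply the ratio of the sample standard deviation to the sample mean. A minimal helper (ours, not part of the paper's algorithms) reads:

```python
import math

def relative_error(samples):
    """Ratio of the (unbiased) sample standard deviation to the sample mean."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    return math.sqrt(var) / mean
```

Strong efficiency manifests empirically as this ratio stabilizing around a constant as the scaling factor grows, rather than blowing up as it does for crude Monte Carlo.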
The results are summarized in Table 5.1 and Figure 5.1. In Table 5.1, we see that for a fixed , the relative error of the importance sampling estimators stabilizes around a constant level as increases. This aligns with the strong efficiency established in Theorems 3.2 and 3.3. In comparison, the relative error of crude Monte Carlo estimators continues to increase as tends to infinity. Figure 5.1 further highlights that our importance sampling estimators significantly outperform crude Monte Carlo methods by orders of magnitude. In summary, when Algorithms 2 and 3 are appropriately parameterized, their efficiency becomes increasingly evident when compared against the crude Monte Carlo approach as the scaling factor grows larger and the target probability approaches .

| n | 200 | 400 | 600 | 800 | 1000 |
|---|---|---|---|---|---|
| | 97.86 | 136.03 | 195.40 | 238.81 | 273.13 |
| | 15.59 | 18.23 | 19.59 | 21.30 | 21.30 |
| | 237.82 | 386.35 | 526.13 | 681.79 | 866.02 |
| | 18.23 | 19.22 | 22.92 | 28.85 | 31.61 |
| | 524.78 | 1091.29 | 1298.98 | 1965.22 | 2089.82 |
6 Proofs
6.1 Proof of Proposition 3.1
We first prepare two technical lemmas using the sample-path large deviations for heavy-tailed Lévy processes reviewed in Section 2.2.
Proof.
In this proof, we focus on the two-sided case in Assumption 1. It is worth noticing that the analysis for the one-sided case is almost identical, with the only major difference being that we apply Result 1 (i.e., the one-sided version of the large deviations of ) instead of Result 2 (i.e., the two-sided version). Specifically, we claim that
-
()
-
()
;
-
()
the set is bounded away from .
Then by applying Result 2, we yield
and conclude the proof. Now, it remains to prove claims (), (), and ().
Proof of Claim .
By definitions of , for any there exist , and , such that
(6.1) |
First, from Assumption 3, one can choose small enough such that . Then for the case with in (6.1), by picking for all , we have , and hence . This verifies .
Next, suppose we can show that is a necessary condition for . Then we get
which immediately verifies claim due to ; see Assumption 1. Now, to show that is a necessary condition for , note that from (6.1), it holds for any that As a result, we must have and hence due to ; see Assumption 3. This concludes the proof of claim .
Proof of Claim .
Again, choose some small enough such that . Given any and , the step function satisfies , thus implying . Therefore, (for the definition of , see (2.3))
Proof of Claim .
Assumption 3 implies that , allowing us to choose small enough that . It suffices to show that
(6.2) |
Here, is the Skorokhod metric; see (2.1) for the definition. To prove (6.2), we start with the following observation: due to claim , for any with , we must have . Now, we proceed with a proof by contradiction. Suppose there is some with and some such that . Due to (and hence no upward jump in is larger than ) and , under the representation (6.1) we must have . This implies . Due to again, we yield the contradiction that (and hence ). This concludes the proof of claim . ∎
Lemma 6.2.
Let . Let be such that and . Suppose that holds for
(6.3) |
Then
where and the function counts the number of discontinuities in .
Proof.
Similar to the proof of Lemma 6.1, we focus on the two-sided case in Assumption 1. Still, it is worth noticing that the proof of the one-sided case is almost identical, with the only major difference being that we apply Result 1 instead of Result 2.
First, observe that where
Furthermore, we claim that
-
-
the set is bounded away from .
Then we are able to apply Result 2 and obtain
Lastly, by our assumption , we get and conclude the proof. Now, it remains to prove claims and .
Proof of Claim .
By definition of , given any there exist and such that the representation (6.1) holds. By assumption , for defined in (6.3) we have
(6.4) |
It then holds for all small enough that . As a result, for the case with in (6.1), by picking for all , and for all , we get . This proves that , and hence .
Next, suppose we can show that is the necessary condition for . Then, we get
which immediately verifies claim due to ; see Assumption 1. Now, to show that is a necessary condition, note that, from (6.1), it holds for any that Furthermore, by the definition of the set , we must have (here, w.l.o.g., we order ’s by ) for all and for all . This implies and hence which is equivalent to .
Proof of Claim .
From (6.4), we can fix some small enough such that
(6.5) |
It suffices to show that
(6.6) |
Here, is the Skorokhod metric; see (2.1) for the definition. To prove (6.6), we start with the following observation: using claim , for any with , we must have . Next, we proceed with a proof by contradiction. Suppose there is some with and some such that . By the definition of the set above, any upward jump in is bounded by , and at most of them are larger than . Then from , we know that any upward jump in is bounded by , and at most of them are larger than . Through (6.1), we now have
The last inequality follows from (6.5). Using again, we yield the contradiction that and hence . This concludes the proof of (6.6). ∎
Now, we are ready to prove Proposition 3.1.
Proof of Proposition 3.1.
We start by proving the unbiasedness of the importance sampling estimator
under . Note that under both and , we have (i.e., ) and that is independent of everything else. In light of Result 4, it suffices to verify and condition (2.8) (under ) with the choice of and . In particular, it suffices to show that (for any )
(6.7) |
To see why, note that (6.7) directly verifies condition (2.8). Furthermore, it implies . The convergence then implies the convergence, i.e.,
To prove claim (6.7), observe that
In particular, since and only take values in , we have and
(6.8) |
The last line in the display above follows from the definition of in (3.6). To conclude, note that and hence with , thus implying ; also, as prescribed in Proposition 3.1 we have .
The rest of the proof is devoted to establishing the strong efficiency of . Observe that
By definitions in (3.5), on event we have , while on event we have . As a result,
(6.9) |
Meanwhile, Lemma 6.1 implies that
(6.10) |
Let and . Furthermore, given , we claim the existence of some such that for any ,
(6.11) | ||||
(6.12) | ||||
(6.13) |
as . Then, using (6.11) and (6.12), we get . The last equality follows from (6.10). Similarly, from (6.10) and (6.13) we get . Therefore, in (6.9) we have , thus establishing the strong efficiency. Now, it remains to prove claims (6.11), (6.12), and (6.13).
Proof of Claim (6.11).
We show that the claim holds for all . For any and , note that
(6.14) |
Recall that and . Therefore,
Lastly, the regularly varying nature of (see Assumption 1) implies and hence .
Proof of Claim (6.12).
Again, we prove the claim for all . By the definition of in (3.8),
Meanwhile, by the definition of , we have on , where counts the number of discontinuities for any . By applying Result 4 under the choice of and , we yield
(6.15) |
In particular, given , we have , and hence
Again, the regularly varying nature of allows us to conclude that .
Proof of Claim (6.13).
Fix some and some such that . Let be such that . By Assumption 3, we can pick some small enough such that . This allows us to pick small enough such that where
(6.16) |
We prove the claim for all . Specifically, given any , one can pick such that . Due to our choice of and , it follows from (6.16) that where
6.2 Proof of Theorems 3.2 and 3.3
We stress again that Theorem 3.2 follows directly from Theorem 3.3 with (i.e., by disabling ARA from Algorithm 3). We devote the remainder of this section to proving Theorem 3.3.
Throughout Section 6.2, we fix the following constants and parameters. First, let be the Blumenthal-Getoor index of and be the regularly varying index of ; see Assumption 1. Fix some
(6.19) |
This allows us to pick large enough such that
(6.20) |
for in (3.26) and in (3.20). Let be the constant in Assumption 2. Choose
(6.21) |
Next, fix
(6.22) |
Based on the chosen value of , fix
(6.23) |
Pick
(6.24) |
Since we require to be strictly less than , there is some integer such that
(6.25) |
where is the parameter in set ; see Assumption 3. Based on the values of and , it holds for all small enough that
(6.26) |
Then, based on all previous choices, it holds for all close enough to 1 such that
(6.27) | ||||
(6.28) | ||||
(6.29) | ||||
(6.30) | ||||
(6.31) | ||||
(6.32) | ||||
(6.33) |
Lastly, pick . By picking a larger if necessary, we can ensure that
(6.34) |
Next, we make a few observations. Given some non-negative integer , let
(6.35) |
where are the order statistics of iid samples of Unif, and ’s are iid samples from . We adopt the convention that and . Note that when , we set as the zero function, and set , , and .
For defined in (3.14) and defined in (3.27), note that
(6.36) |
where
(6.37) | ||||
(6.38) |
See (3.15)–(3.17) and (3.24) for the definitions of ’s and ’s, respectively. Also, define
(6.39) |
As intermediate steps for the proof of Theorem 3.3, we present the following two results. Proposition 6.3 states that, using as an anchor, we see that and would stay close enough with high probability, especially for large . Proposition 6.4 then shows that it is unlikely for the law of to concentrate around any .
Proposition 6.3.
There exists some constant such that the inequality
holds for any , , , , and .
Proposition 6.4.
There exists some constant such that the inequality
holds for any , , , , and .
First, equipped with Propositions 6.3 and 6.4, we are able to prove the main results of Section 3.6, i.e., Theorem 3.3.
Proof of Theorem 3.3.
Verification of (3.9).
Conditioning on , the conditional law of is the same as the law of the process specified in (6.35). This implies
Next, on event
we must have (for any )
It then follows from (6.36) that, on this event, we have . Therefore,
(6.40) | |||
Applying Proposition 6.3 (with ), we get (for any )
(6.41) |
On the other hand, due to (6.25), we have for all and . This allows us to apply Proposition 6.4 (with ) and yield (for any )
(6.42) |
Plugging (6.41) and (6.42) into (6.40), we conclude the proof by setting .
Verification of (3.10).
Fix some and . Again, conditioning on , the conditional law of is the same as the law of the process specified in (6.35). This implies
Due to (and hence ) and , we conclude the proof by setting . ∎
The rest of this section is devoted to proving Propositions 6.3 and 6.4. First, we collect a useful result.
Result 5 (Lemma 1 of [35]).
Let be the Lévy measure of a Lévy process . Let . Suppose that for the Blumenthal-Getoor index . Then
Next, we prepare two lemmas regarding the expectations of the supremum of (see (3.6) for the definition) and the difference between and (see (3.23)).
Lemma 6.5.
There exists a constant (depending only on the law of Lévy process ) such that
Proof.
Recall that the generating triplet of is and for the Blumenthal-Getoor index we have ; see Assumption 1. Fix some in this proof. We prove the lemma for
where , , and
Lemma 6.6.
Proof.
From the definitions of and in (3.21) and (3.23), respectively, we have
where is the Lévy process with generating triplet , and is a standard Brownian motion independent of . In particular, is a martingale with variance ; see (3.22) for the definition of . Therefore,
To conclude the proof, we set . ∎
To facilitate the presentation of the next few lemmas, we consider a slightly more general version of the stick-breaking procedure described in (3.15)–(3.24), to allow for arbitrary stick length. Specifically, for any , let
(6.44) |
where ’s are iid copies of Unif. Independent of ’s, for any and , let and be Lévy processes with joint law specified in (3.21) and (3.23), respectively. Conditioning on the values of , define using (for all )
(6.45) |
Lemma 6.7.
Proof.
For notational simplicity, set . Due to ,
(6.46) |
Furthermore, we claim the existence of some constant such that (for any , , and any )
(6.47) |
Then using Result 5, we yield
where . Setting , we conclude the proof.
Lemma 6.8.
Let and . Let be the constant characterized in Lemma 6.5 that only depends on the law of Lévy process . The inequality
holds for all , , and , where .
Proof.
For this proof, we adopt the notation for the remaining stick length after the first sticks. Conditioning on ,
Therefore, unconditionally,
The last line follows from Jensen’s inequality. Lastly, by definition of ’s in (6.44), we have
This concludes the proof. ∎
Lemma 6.9.
Proof.
To simplify notation, in this proof we set and write when there is no ambiguity. For the sequence of random variables , let be its order statistics. Given any ordered positive real sequence , by conditioning on , it follows from (6.45) that
(6.49) |
where ’s are iid copies of the Lévy processes . Next, fix
Given the sequence of real numbers ’s, we define as the number of elements in the sequence that are larger than . In case that , we set . With defined, we consider a decomposition of the events in (6.49) based on the first such that (and hence ), and in particular on whether such is larger than or not. To be specific,
(6.50) |
We first bound terms ’s. For any , observe that
(6.51) |
On the other hand,
(6.52) |
Plugging (6.51) and (6.52) into (6.50), we yield
To conclude the proof, just note that the inequality above holds conditionally on any sequence of , so it also holds unconditionally. ∎
Proof of Proposition 6.3.
In this proof, we fix some , , and . Let the process be defined as in (6.35). Recall the definitions of , , and in (6.37)–(6.39). See also (3.15)–(3.24) for the definitions of ’s and ’s.
To simplify notation, define . Define the events
Note that on event , we must have and . As a result,
Furthermore, we claim the existence of constant , the values of which do not depend on , and , such that (for any and )
(6.53) | ||||
(6.54) | ||||
(6.55) |
This allows us to conclude the proof by setting . Now, it remains to prove claims (6.53)–(6.55).
Proof of Claim (6.53)
The claim is trivial if , so we only consider the case where . Due to the coupling between and in (3.24)–(3.25), we have
where the laws of processes are stated in (3.21) and (3.23), respectively. Applying Lemma 6.6, we yield
where is the constant characterized in Lemma 6.6 that only depends on and the law of the Lévy process . To conclude the proof of claim (6.53), we pick .
Proof of Claim (6.54)
It follows directly from Lemma 6.7 that
where is the constant characterized in Lemma 6.7 that only depends on and the law of the Lévy process . To conclude the proof of claim (6.54), we pick .
Proof of Claim (6.55)
Proof of Proposition 6.4.
In this proof, we fix some . Recall the representation in (6.35), where are the order statistics of iid samples of Unif. Recall the definition of in (6.39). See also (3.15)–(3.24) for the definitions of ’s and ’s.
We start with the following decomposition of events:
First, . The last inequality follows from our choice of in (6.27) and . Furthermore, for each
The last equality follows from the simple fact that . Furthermore, we claim the existence of constants and , the values of which do not vary with parameters , such that for all and ,
(6.56) | ||||
(6.57) |
Then, we conclude the proof by setting . Now, we prove claims (6.56) and (6.57).
Proof of Claim (6.56)
If , the claim is trivial due to and hence . Now, we consider the case where . Due to the independence between and ,
where are independent of the Lévy process . In particular, recall that are order statistics. Therefore, on event we must have . It then follows directly from Assumption 2 that
where and are the constants specified in Assumption 2. To conclude, it suffices to set .
Proof of Claim (6.57)
Applying Lemma 6.9 with and , we get (for all )
Here, is the constant in Lemma 6.5 that only depends on the law of Lévy process , and are the constants in Assumption 2. First, for any and ,
For term , note that as due to . This allows us to fix some such that . As a result, for any ,
Similarly, for all and ,
Besides, due to as , we can find such that . This leads to (for all )
To conclude the proof, we can simply set ∎
6.3 Proof of Propositions 4.1 and 4.3
The proof of Proposition 4.1 is based on the inversion formula for characteristic functions (see, e.g., Theorem 3.3.14 of [29]). Specifically, we compare the characteristic function of with that of an -stable process to draw connections between their distributions.
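As a side illustration of the inversion step (not part of the proof), the mechanism can be checked numerically: when the characteristic function is absolutely integrable, inversion recovers a bounded continuous density. The sketch below recovers the standard normal density from its characteristic function; all names are illustrative.

```python
import numpy as np

def density_via_inversion(phi, x, t_max=40.0, n=400_001):
    """Recover f(x) = (1/2*pi) * integral of e^{-itx} * phi(t) dt from an
    absolutely integrable characteristic function phi, via a Riemann sum
    on the symmetric grid [-t_max, t_max]."""
    t = np.linspace(-t_max, t_max, n)
    dt = t[1] - t[0]
    integrand = np.exp(-1j * t * x) * phi(t)
    return np.real(np.sum(integrand)) * dt / (2 * np.pi)

# Standard normal: phi(t) = exp(-t^2/2), so f(0) = 1/sqrt(2*pi).
phi_normal = lambda t: np.exp(-t ** 2 / 2)
f0 = density_via_inversion(phi_normal, 0.0)
```

The same computation with a stable-type characteristic function (e.g., exp(-|t|^alpha)) yields the uniform density bound exploited in the proof, since the bound on the density is controlled by the integral of the modulus of the characteristic function.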
Proof of Proposition 4.1.
The Lévy–Khintchine formula (see, e.g., Theorem 8.1 of [58]) leads to the following expression for the characteristic function of :
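For reference, the Lévy–Khintchine formula in its standard form (Theorem 8.1 of [58]) states that, for a Lévy process $X$ with generating triplet $(b, \sigma^2, \nu)$ (generic notation, which may differ from this paper's conventions),

```latex
\mathbf{E}\bigl[e^{i\theta X_t}\bigr]
  = \exp\!\Bigl( t \Bigl( i b \theta - \tfrac{1}{2}\sigma^2 \theta^2
  + \int_{\mathbb{R}} \bigl( e^{i\theta x} - 1 - i\theta x\, \mathbb{1}\{|x| \le 1\} \bigr)\, \nu(dx) \Bigr) \Bigr),
  \qquad \theta \in \mathbb{R}.
```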
Note that
Then from for all ,
(6.58) |
Furthermore, we claim the existence of some such that
(6.59) |
Plugging (6.59) into (6.58), we obtain that for all and , It then follows directly from the inversion formula (see Theorem 3.3.14 of [29]) that, for all , admits a continuous density function with a uniform bound
To conclude the proof, pick . Now, it only remains to prove claim (6.59).
Proof of Claim (6.59).
We start by fixing some constants.
(6.60) |
For , note that , and hence . For , note that and hence . Due to , we have . Next, choose positive real numbers such that
(6.61)
(6.62) |
For any and , observe that (by setting in the last step)
Therefore, by fixing some large enough, we have
(6.63) |
To proceed, we compare with . Recall that is the constant prescribed in the statement of Proposition 4.1. For any such that ,
(6.64) |
We bound the terms , , and separately. First, for any ,
(6.65) |
For , it follows immediately from (6.63) that
(6.66) |
Next, in order to bound , we consider the function . Since is uniformly continuous on , we can find some , and a sequence of real numbers such that
(6.67) |
In other words, we use a geometric sequence to partition into intervals. On each of these intervals, the fluctuation of is bounded by the constant fixed in (6.62). Now fix some such that (recall that is prescribed in the statement of this proposition)
(6.68) |
Since is regularly varying as with index , for we have as . By Potter’s bound (see Proposition 2.6 in [53]), there exists such that
(6.69) |
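For convenience, we recall Potter's bound in its generic form (see Proposition 2.6 of [53]): if $f$ is regularly varying at infinity with index $\rho$, then for any $A > 1$ and $\epsilon > 0$ there exists $x_0 > 0$ such that

```latex
\frac{f(y)}{f(x)} \;\le\; A \max\!\left\{ \Bigl(\tfrac{y}{x}\Bigr)^{\rho+\epsilon},\;
\Bigl(\tfrac{y}{x}\Bigr)^{\rho-\epsilon} \right\}
\qquad \text{for all } x, y \ge x_0.
```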
Meanwhile, define
and note that . Due to , we can find some such that
(6.70) |
Let . For any , we have and for any . As a result, for with and any ,
On the other hand,
Therefore, given any such that , we have for all where . This leads to
with
(6.71) |
Again, the proof of Proposition 4.3 makes use of the inversion formula.
Proof of Proposition 4.3.
Let us denote the characteristic functions of and by and , respectively. Repeating the arguments using complex conjugates in (6.58), we obtain
As for , using Proposition 14.9 in [58], we get
(6.72) |
where is a non-negative function continuous on satisfying and
This implies for all . Furthermore, we claim the existence of some such that
(6.73) |
Then due to the self-similarity of (i.e., ), we have for all . In the meantime, note that
By picking large enough, it holds for any that
(6.74) |
Therefore, for any ,
and hence for all . Applying the inversion formula, we get (for any )
To conclude the proof, we set . Now it only remains to prove claim (6.73).
Proof of Claim (6.73)
We proceed with a proof by contradiction. If , then by continuity of , there exists some such that
Now for any , define the following sets:
Observe that:
- For any , we have , which implies ;
- Meanwhile, .
Together with the fact that (so that the process is non-trivial), there must be some such that
Besides, from we know that . However, by definition of semi-stable processes in (4.4) we know that where the transformation () onto a Borel measure on is defined as . This implies
which would contradict eventually for large enough. This concludes the proof of for all . ∎
References
- [1] S. Asmussen, P. Glynn, and J. Pitman. Discretization Error in Simulation of One-Dimensional Reflecting Brownian Motion. The Annals of Applied Probability, 5(4):875 – 896, 1995.
- [2] S. Asmussen and D. P. Kroese. Improved algorithms for rare event simulation with heavy tails. Advances in Applied Probability, 38(2):545–558, 2006.
- [3] S. Asmussen and J. Rosiński. Approximations of small jumps of Lévy processes with a view towards simulation. Journal of Applied Probability, 38(2):482–493, 2001.
- [4] A. Bassamboo, S. Juneja, and A. Zeevi. On the inefficiency of state-independent importance sampling in the presence of heavy tails. Operations Research Letters, 35(2):251–260, 2007.
- [5] M. L. Bianchi, S. T. Rachev, Y. S. Kim, and F. J. Fabozzi. Tempered infinitely divisible distributions and processes. Theory of Probability & Its Applications, 55(1):2–26, 2011.
- [6] J. Blanchet and P. Glynn. Efficient rare-event simulation for the maximum of heavy-tailed random walks. The Annals of Applied Probability, 18(4):1351 – 1378, 2008.
- [7] J. Blanchet, P. Glynn, and J. Liu. Efficient rare event simulation for heavy-tailed multiserver queues. Technical report, Department of Statistics, Columbia University, 2008.
- [8] J. Blanchet, H. Hult, and K. Leder. Rare-event simulation for stochastic recurrence equations with heavy-tailed innovations. ACM Trans. Model. Comput. Simul., 23(4), December 2013.
- [9] J. H. Blanchet and J. Liu. State-dependent importance sampling for regularly varying random walks. Advances in Applied Probability, 40(4):1104–1128, 2008.
- [10] S. Borak, A. Misiorek, and R. Weron. Models for heavy-tailed asset returns, pages 21–55. Springer Berlin Heidelberg, Berlin, Heidelberg, 2011.
- [11] O. Boxma, E. Cahen, D. Koops, and M. Mandjes. Linear networks: rare-event simulation and Markov modulation. Methodology and Computing in Applied Probability, 2019.
- [12] S. Boyarchenko and S. Levendorskii. Efficient evaluation of joint pdf of a Lévy process, its extremum, and hitting time of the extremum, 2023.
- [13] S. Boyarchenko and S. Levendorskii. Simulation of a Lévy process, its extremum, and hitting time of the extremum via characteristic functions, 2023.
- [14] J. A. Bucklew, P. Ney, and J. S. Sadowsky. Monte Carlo simulation and large deviations theory for uniformly recurrent Markov chains. Journal of Applied Probability, 27(1):44–59, 1990.
- [15] P. Carr, H. Geman, D. Madan, and M. Yor. The fine structure of asset returns: An empirical investigation. The Journal of Business, 75(2):305–332, 2002.
- [16] P. Carr, H. Geman, D. B. Madan, and M. Yor. Stochastic volatility for Lévy processes. Mathematical Finance, 13(3):345–382, 2003.
- [17] J. I. G. Cázares, A. Kohatsu-Higa, and A. Mijatović. Joint density of the stable process and its supremum: Regularity and upper bounds. Bernoulli, 29(4):3443 – 3469, 2023.
- [18] J. I. G. Cázares, A. Mijatović, and G. U. Bravo. -strong simulation of the convex minorants of stable processes and meanders. Electronic Journal of Probability, 25(none):1 – 33, 2020.
- [19] L. Chaumont. On the law of the supremum of Lévy processes. The Annals of Probability, 41(3A):1191 – 1217, 2013.
- [20] B. Chen, J. Blanchet, C.-H. Rhee, and B. Zwart. Efficient rare-event simulation for multiple jump events in regularly varying random walks and compound Poisson processes. Mathematics of Operations Research, 44(3):919–942, 2019.
- [21] J. E. Cohen, R. A. Davis, and G. Samorodnitsky. Covid-19 cases and deaths in the united states follow taylor’s law for heavy-tailed distributions with infinite variance. Proceedings of the National Academy of Sciences, 119(38):e2209234119, 2022.
- [22] L. Coutin, M. Pontier, and W. Ngom. Joint distribution of a lévy process and its running supremum. Journal of Applied Probability, 55(2):488–512, 2018.
- [23] J. I. G. Cázares, F. Lin, and A. Mijatović. Fast exact simulation of the first passage of a tempered stable subordinator across a non-increasing function, 2023.
- [24] S. Dereich. Multilevel Monte Carlo algorithms for Lévy-driven SDEs with Gaussian correction. The Annals of Applied Probability, 21(1):283 – 311, 2011.
- [25] S. Dereich and F. Heidenreich. A multilevel monte carlo algorithm for lévy-driven stochastic differential equations. Stochastic Processes and their Applications, 121(7):1565–1587, 2011.
- [26] E. H. A. Dia and D. Lamberton. Connecting discrete and continuous lookback or hindsight options in exponential Lévy models. Advances in Applied Probability, 43(4):1136–1165, 2011.
- [27] P. Dupuis, K. Leder, and H. Wang. Importance sampling for sums of random variables with regularly varying tails. ACM Trans. Model. Comput. Simul., 17(3):14–es, July 2007.
- [28] P. Dupuis, A. D. Sezer, and H. Wang. Dynamic importance sampling for queueing networks. The Annals of Applied Probability, 17(4):1306 – 1346, 2007.
- [29] R. Durrett. Probability: Theory and Examples. Cambridge Series in Statistical and Probabilistic Mathematics. Cambridge University Press, 2019.
- [30] P. Embrechts, C. Klüppelberg, and T. Mikosch. Modelling extremal events: for insurance and finance, volume 33. Springer Science & Business Media, 2013.
- [31] A. Ferreiro-Castilla, A. Kyprianou, R. Scheichl, and G. Suryanarayana. Multilevel Monte Carlo simulation for Lévy processes based on the Wiener–Hopf factorisation. Stochastic Processes and their Applications, 124(2):985–1010, 2014.
- [32] M. B. Giles. Multilevel Monte Carlo path simulation. Operations Research, 56(3):607–617, 2008.
- [33] M. B. Giles and Y. Xia. Multilevel Monte Carlo for exponential Lévy models. Finance and Stochastics, 21(4):995–1026, 2017.
- [34] J. González Cázares and A. Mijatović. Simulation of the drawdown and its duration in Lévy models via stick-breaking Gaussian approximation. Finance and Stochastics, 26(4):671–732, 2022.
- [35] J. I. González Cázares, A. Mijatović, and G. Uribe Bravo. Geometrically convergent simulation of the extrema of Lévy processes. Mathematics of Operations Research, 47(2):1141–1168, 2022.
- [36] J. I. González Cázares, A. Mijatović, and G. U. Bravo. Exact simulation of the extrema of stable processes. Advances in Applied Probability, 51(4):967–993, 2019.
- [37] T. Gudmundsson and H. Hult. Markov chain Monte Carlo for computing rare-event probabilities for a heavy-tailed random walk. Journal of Applied Probability, 51(2):359–376, 2014.
- [38] M. Gurbuzbalaban, U. Simsekli, and L. Zhu. The heavy-tail phenomenon in SGD. In M. Meila and T. Zhang, editors, Proceedings of the 38th International Conference on Machine Learning, volume 139 of Proceedings of Machine Learning Research, pages 3964–3975. PMLR, 18–24 Jul 2021.
- [39] S. Heinrich. Multilevel Monte Carlo methods. In S. Margenov, J. Waśniewski, and P. Yalamov, editors, Large-Scale Scientific Computing, pages 58–67, Berlin, Heidelberg, 2001. Springer Berlin Heidelberg.
- [40] T. Hesterberg. Weighted average importance sampling and defensive mixture distributions. Technometrics, 37(2):185–194, 1995.
- [41] L. Hodgkinson and M. Mahoney. Multiplicative noise and heavy tails in stochastic optimization. In International Conference on Machine Learning, pages 4262–4274. PMLR, 2021.
- [42] H. Hult, S. Juneja, and K. Murthy. Exact and efficient simulation of tail probabilities of heavy-tailed infinite series. 2016.
- [43] A. Kuznetsov, A. E. Kyprianou, J. C. Pardo, and K. van Schaik. A Wiener–Hopf Monte Carlo simulation technique for Lévy processes. The Annals of Applied Probability, 21(6):2171 – 2190, 2011.
- [44] M. Kwaśnicki, J. Małecki, and M. Ryznar. Suprema of Lévy processes. The Annals of Probability, 41(3B):2047 – 2065, 2013.
- [45] Y. Li. Queuing theory with heavy tails and network traffic modeling. working paper or preprint, Oct. 2018.
- [46] E. Mariucci and M. Reiß. Wasserstein and total variation distance between marginals of Lévy processes. Electronic Journal of Statistics, 12(2):2482 – 2514, 2018.
- [47] Z. Michna. Formula for the supremum distribution of a spectrally positive Lévy process, 2012.
- [48] Z. Michna. Explicit formula for the supremum distribution of a spectrally negative stable process. Electronic Communications in Probability, 18(none):1 – 6, 2013.
- [49] Z. Michna, Z. Palmowski, and M. Pistorius. The distribution of the supremum for spectrally asymmetric Lévy processes, 2014.
- [50] A. Mijatović and P. Tankov. A new look at short-term implied volatility in asset price models with jumps. Mathematical Finance, 26(1):149–183, 2016.
- [51] K. R. A. Murthy, S. Juneja, and J. Blanchet. State-independent importance sampling for random walks with regularly varying increments. Stochastic Systems, 4(2):321–374, 2014.
- [52] J. Pitman and G. U. Bravo. The convex minorant of a Lévy process. The Annals of Probability, 40(4):1636 – 1674, 2012.
- [53] S. I. Resnick. Heavy-tail phenomena: probabilistic and statistical modeling. Springer Science & Business Media, 2007.
- [54] C.-H. Rhee, J. Blanchet, and B. Zwart. Sample path large deviations for Lévy processes and random walks with regularly varying increments. The Annals of Probability, 47(6):3551–3605, 2019.
- [55] C.-H. Rhee and P. W. Glynn. Unbiased estimation with square root convergence for sde models. Operations Research, 63(5):1026–1043, 2015.
- [56] J. Rosiński. Tempering stable processes. Stochastic Processes and their Applications, 117(6):677–707, 2007.
- [57] P. Sabino. Pricing energy derivatives in markets driven by tempered stable and cgmy processes of ornstein–uhlenbeck type. Risks, 10(8), 2022.
- [58] K. Sato. Lévy Processes and Infinitely Divisible Distributions. Cambridge University Press, 1999.
- [59] G. Torrisi. Simulating the ruin probability of risk processes with delay in claim settlement. Stochastic Processes and their Applications, 112(2):225–244, 2004.
- [60] X. Wang and C.-H. Rhee. Rare-event simulation for multiple jump events in heavy-tailed Lévy processes with infinite activities. In Proceedings of the Winter Simulation Conference, WSC ’20, pages 409–420. IEEE Press, 2021.
- [61] X. Wang and C.-H. Rhee. Large deviations and metastability analysis for heavy-tailed dynamical systems, 2023.
- [62] X. Wang and C.-H. Rhee. Importance sampling strategy for heavy-tailed systems with catastrophe principle. In Proceedings of the Winter Simulation Conference, WSC ’23, pages 76–90. IEEE Press, 2024.
Appendix A Barrier Option Pricing
A.1 Problem Setting
This section considers the estimation of probabilities with and
which corresponds to rare-event simulation in the context of down-and-in options. Here, we assume that and . We consider the two-sided case in Assumption 1. That is, is a centered Lévy process with Lévy measures , and there exists some such that and as . Also, we impose an alternative version of Assumption 2 throughout. Let be the Lévy process with generating triplet . That is, is obtained from by removing all jumps with size larger than .
Assumption 4.
There exist such that
A.2 Importance Sampling Algorithm
Below, we present the design of the importance sampling algorithm. For any and , let be the discontinuity in at time , and we set . Let
and let . Intuitively speaking, on event there is at least one upward and one downward “large” jump in , where is understood as the threshold for jump sizes to be considered “large”.
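The effect of conditioning on "large" jumps can be illustrated in a one-dimensional toy problem (not the algorithm of this paper): estimating a Pareto tail probability by proposing from a defensive mixture of the original law and the law conditioned on a large jump, then reweighting by the likelihood ratio dP/dQ. All names, thresholds, and parameters below are illustrative.

```python
import numpy as np

rng = np.random.default_rng(5)

def pareto_tail_is(gamma, alpha=1.5, w=0.5, n=100_000):
    """Estimate P(X > gamma) for X ~ Pareto(alpha) (density alpha * x^(-alpha-1)
    on (1, inf)) under a defensive-mixture proposal Q: with probability w,
    sample X conditioned on the 'large jump' event {X > gamma/2}; otherwise
    sample from the original law P. Each sample is reweighted by dP/dQ."""
    from_cond = rng.uniform(size=n) < w
    u = rng.uniform(size=n)
    x = np.where(from_cond,
                 (gamma / 2) * u ** (-1.0 / alpha),  # Pareto on (gamma/2, inf)
                 u ** (-1.0 / alpha))                # original Pareto on (1, inf)
    p = alpha * x ** (-alpha - 1)                    # original density
    q_cond = np.where(x > gamma / 2,                 # conditional density
                      alpha * (gamma / 2) ** alpha * x ** (-alpha - 1), 0.0)
    lr = p / (w * q_cond + (1 - w) * p)              # likelihood ratio dP/dQ
    return np.mean((x > gamma) * lr)

est = pareto_tail_is(100.0)   # true value: 100 ** (-1.5) = 1e-3
```

Under the mixture, tail samples become frequent while the defensive component keeps the likelihood ratio bounded, which is the same mechanism behind the strong efficiency sought here.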
Fix some , and let
The algorithm samples
under . Now, we discuss the design of to ensure the strong efficiency of . Analogous to the decomposition in (3.6), let
Let , , , and . Meanwhile, set
We have . Under the convention , consider estimators of the form
(A.1) |
where is for some and is independent of everything else. Analogous to Proposition 3.1, the following result provides sufficient conditions on for to attain strong efficiency.
Proposition A.1.
Let , , , and . Suppose that
(A.2) |
where and count the number of discontinuities of positive and negative sizes in , respectively. Besides, suppose that for all ,
(A.3) |
where . Then given , there exists some such that for all , the estimators are unbiased and strongly efficient for under the importance sampling distribution .
The proof is almost identical to that of Proposition 3.1. In particular, the proof requires that
and that, for any , it holds for all small enough that
where . These can be obtained directly using sample path large deviations for heavy-tailed Lévy processes in Result 2. Proposition A.1 is then established by repeating the arguments in the proof of Proposition 3.1 and using Result 4 for the randomized debiasing technique.
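The debiasing mechanism in (A.1) follows the randomized multilevel scheme of [55]. The following self-contained sketch, with a toy approximation sequence rather than the estimators of this paper, shows why the geometric randomization removes the bias.

```python
def debiased_estimator(y, N, p=0.5):
    """Coupled-sum randomized debiasing: given approximations y(0), y(1), ...
    converging to a limit, and a truncation level N ~ Geometric(p) on
    {0, 1, 2, ...}, the estimator
        Z = sum_{n=0}^{N} (y(n) - y(n-1)) / P(N >= n),
    with the convention y(-1) = 0, is unbiased for the limit."""
    z = 0.0
    for n in range(N + 1):
        delta = y(n) - (y(n - 1) if n > 0 else 0.0)
        z += delta / (1 - p) ** n  # P(N >= n) = (1 - p)^n
    return z

# Toy approximation sequence y(n) = 1 - 2^{-n}, with limit 1.
y = lambda n: 1.0 - 2.0 ** (-n)

# Unbiasedness check: averaging Z over the exact geometric weights
# recovers the limit (the tail beyond n = 80 is negligible).
expected = sum(0.5 * 0.5 ** n * debiased_estimator(y, n) for n in range(80))
```

Unbiasedness requires the summability condition on the moments of the increments, which is exactly what conditions of the form (A.2)–(A.3) deliver in the present setting.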
A.3 Construction of
Next, we describe the construction of that can satisfy the conditions in Proposition A.1. Specifically, we consider the case where ARA is involved. Let
Under both and , admits the law of a Lévy process with generating triplet . This leads to the Lévy-Ito decomposition
with defined in (3.20). Besides, let be defined as in (3.22). For each and , consider the approximation
where is a sequence of iid copies of a standard Brownian motion, independent of everything else.
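To make the role of ARA concrete, here is a minimal sketch for a hypothetical one-sided Lévy measure nu(dx) = alpha * x^(-alpha-1) dx on (0, inf), with drift and compensation terms omitted: jumps above the cutoff kappa are simulated exactly as a compound Poisson process, while the small-jump part is replaced by a Brownian motion with matched variance, in the spirit of [3].

```python
import numpy as np

rng = np.random.default_rng(7)

def ara_increment(T, alpha, kappa):
    """One increment over [0, T] under the Asmussen-Rosinski approximation,
    for the illustrative Levy measure nu(dx) = alpha * x^(-alpha-1) dx on
    (0, inf), 0 < alpha < 2 (drift and compensation omitted)."""
    rate = kappa ** (-alpha)              # nu((kappa, inf)): big-jump intensity
    n_big = rng.poisson(rate * T)
    big = kappa * rng.uniform(size=n_big) ** (-1.0 / alpha)  # Pareto jumps > kappa
    var_small = alpha * kappa ** (2.0 - alpha) / (2.0 - alpha)  # int_0^kappa x^2 nu(dx)
    gauss = np.sqrt(var_small * T) * rng.standard_normal()   # Gaussian substitute
    return big.sum() + gauss

x = ara_increment(1.0, 1.5, 0.1)
```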
Next, we discuss how to apply SBA and construct the approximators ’s in (A.1). Let be a piecewise step function with jumps over , where , and for each . Recall that the jump times in lead to a partition of defined in (3.12). For any , let the sequence ’s be defined as in (3.15)–(3.16). Conditioning on , one can then sample using
The coupling in (2.7) then implies
Now, we define
as an approximation to . Now, set
In (A.1), we plug in .
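The stick-breaking mechanism invoked throughout this construction can be sketched in isolation as follows (illustrative; function names are not from the paper). The uniform stick-breaking partition of [0, T] successively breaks off a uniform fraction of the remaining interval.

```python
import numpy as np

rng = np.random.default_rng(11)

def stick_breaking_partition(T, n):
    """First n stick lengths L_1, ..., L_n of the uniform stick-breaking
    partition of [0, T]: L_k = U_k * (T - L_1 - ... - L_{k-1}) with iid
    U_k ~ Uniform(0, 1). Returns the lengths and the unbroken remainder."""
    remaining = T
    lengths = np.empty(n)
    for k in range(n):
        u = rng.uniform()
        lengths[k] = u * remaining
        remaining -= lengths[k]
    return lengths, remaining

lengths, rem = stick_breaking_partition(1.0, 30)
```

Since the unbroken remainder halves in expectation at each step (E[remainder] = T / 2^n), truncating after n sticks incurs a geometrically small error, which is the source of the geometric convergence of SBA established in [35].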