
Simplified Stochastic Calculus With Applications in Economics and Finance

Aleš Černý, Business School (formerly Cass), City, University of London, 106 Bunhill Row, London, EC1Y 8TZ, UK. [email protected]

Johannes Ruf, Department of Mathematics, LSE, Houghton Street, London, WC2A 2AE, UK. [email protected]
Abstract.

The paper introduces a simple way of recording and manipulating general stochastic processes without explicit reference to a probability measure. In the new calculus, operations traditionally presented in a measure-specific way are instead captured by tracing the behaviour of jumps (also when no jumps are physically present). The calculus is fail-safe in that, under minimal assumptions, all informal calculations yield mathematically well-defined stochastic processes. The calculus is also intuitive as it allows the user to pretend all jumps are of compound Poisson type. The new calculus is very effective when it comes to computing drifts and expected values that possibly involve a change of measure. Such drift calculations yield, for example, partial integro–differential equations, Hamilton–Jacobi–Bellman equations, Feynman–Kac formulae, or exponential moments needed in numerous applications. We provide several illustrations of the new technique, among them a novel result on the Margrabe option to exchange one defaultable asset for another.

Keywords. finance; drift; Émery formula; Girsanov’s theorem; simplified stochastic calculus

2010 Mathematics Subject Classification:
(Primary) 60H05, 60H10; 60G44, 60G48; (Secondary) 91B02, 91B25; 91G10
Previous versions of this paper were circulated under the title “Finance without Brownian motions: An introduction to simplified stochastic calculus.” We thank Jan Kallsen, Jan-Frederik Mai, Lola Martinez Miranda, Johannes Muhle-Karbe, and two anonymous referees for helpful comments.

1. Introduction

Anyone who has attempted stochastic modelling with jumps will be aware of the sudden increase in mathematical complexity in models that are not of compound Poisson type. The difficulty is such that experienced researchers readily forgo generality in order to reduce the technical burden placed on their readers; see, for example, Feng and Linetsky (2008), Cai and Kou (2012), Hong and Jin (2018), and Aït-Sahalia and Matthys (2019).

In this paper we introduce an intuitive calculus that works for general processes but retains the simplicity of compound Poisson calculations. To achieve this, a change of paradigm is required. Classical Itô calculus is based on decomposing the increments of every process into signal (drift, expected change) and noise (Brownian motion, zero-mean shock). This is at once convenient and mathematically expedient. The convenience of knowing the drift is immediate. Many tasks where stochastic processes are concerned involve computation of the drift of some quantity. Hamilton–Jacobi–Bellman equations in optimal control, for example, express the fact that the optimal value function plus the integrated historical cost is a martingale and therefore has zero drift. Similarly, Feynman–Kac formulae reflect zero drift of an integral of costs discounted at a specified stochastic killing rate. Closer to home, the Black–Scholes partial differential equation can be obtained by setting the drift of the discounted option price process to zero under the risk-neutral measure.

The expediency of the signal–noise decomposition comes from the early construction of the Itô integral where the drift is integrated path-by-path but the Brownian motion integral is performed, loosely speaking, by summing up uncorrelated square-integrable random variables with zero mean. The paradigm shift is applied here: we separate how a process is recorded from the drift calculation. In other words, we do not carry the drift with us at all times but only evaluate it when the drift is really needed. This feels a little uncomfortable at first but there are ample rewards for the small intellectual effort required.

The consequences of the subtle change in perspective are far-reaching. By recording processes in a measure-invariant manner the technicalities of stochastic integration fall away, the importance of Brownian motions and Poisson processes recedes, and one begins to see deeper into the fundamental relationships among the modelled variables, which now take center stage. Measure change, too, becomes an easy application of the simplified calculus, for as long as the new measure is directly driven by the variables being studied, which is overwhelmingly the case in practice.

We have prepared the paper with two audiences in mind. First and foremost, the paper is intended for the research community whose members do not consider themselves experts in mathematics in general, or stochastic analysis in particular, but who nevertheless use stochastic calculus as a modelling tool. To this readership we want to demonstrate that the new calculus is easy to understand and apply in practice. Second, but no less important, we address colleagues specializing in stochastic analysis whom we wish to convince that all our arguments are mathematically rigorous.

The stated goal is not without its challenges. Where practical, we contrast the new approach with the more involved classical notation. In order to perform such a comparison, one has to introduce some advanced concepts, such as the Poisson random measure, which are needed in classical stochastic calculus. Plainly, lay readers will not be acquainted with some of these advanced concepts, nor do they need to be. Familiarity with Brownian motion, compound Poisson processes, some version of Itô’s formula, and perhaps Girsanov’s theorem ought to be enough to appreciate the backdrop against which the new calculus is constructed. The new calculus itself only needs a grasp of drift, volatility, and jump arrival intensity, plus three basic rules that are self-evident on an informal level.

The paper is organized as follows. In the rest of this introduction, we trace how the novel concept of this paper, the semimartingale representation (1.18), arises from classical Itô calculus. Section 2 provides a thorough introduction to the simplified stochastic calculus. It also explains how the proposed approach facilitates computation of drifts and expected values; in particular, it tackles the introductory example in the presence of jumps. Section 3 demonstrates the strength of the proposed approach on three additional examples. Section 4 amplifies this point by showcasing calculations that also require a change of measure. In particular, Example 4.3 contains a new result that makes use of a non-equivalent change of measure. Section 5 highlights the robustness of the proposed approach whereby, for a given task, the same representation applies in both discrete and continuous models. Such unification is unattainable in standard calculus.

The examples in the paper are inspired by applications in Economics and Financial Mathematics but the broader lessons are clearly applicable to Science at large. We explore wider repercussions of the proposed methodology and briefly mention applications to Statistics and Engineering in the concluding Section 6.

1.1. McKean calculus for Itô processes

For reasons of tractability, there is a preponderance of continuous-time stochastic models based on Brownian motion. Traditional stochastic calculus reflects this historical bias. As an example, consider a stochastic model for two economic variables, capital $K$ and labour $L$,

\frac{\mathrm{d}K_{t}}{K_{t}} = \mu_{K}\,\mathrm{d}t + \sigma_{K}\,\mathrm{d}W_{t},   (1.1)
\frac{\mathrm{d}L_{t}}{L_{t}} = \mu_{L}\,\mathrm{d}t + \sigma_{L}\left(\rho_{KL}\,\mathrm{d}W_{t} + \sqrt{1-\rho^{2}_{KL}}\,\mathrm{d}\widehat{W}_{t}\right).   (1.2)

Here $W$ and $\widehat{W}$ are two independent Brownian motions. The inputs in this model are $\mu_{K}$, $\mu_{L}$, $\sigma_{K}>0$, $\sigma_{L}>0$, and $\rho_{KL}\in[-1,1]$ describing the correlation between the changes in $K$ and $L$. Informally, the ‘drift part’ $\mu_{K}\,\mathrm{d}t$ represents expected change, while the ‘noise’ $\sigma_{K}\,\mathrm{d}W_{t}$ is loosely interpreted as a shock with mean zero and variance $\sigma_{K}^{2}\,\mathrm{d}t$.

The symbol $\mathrm{d}K_{t}$ represents an increase in capital over an infinitesimal time period $\mathrm{d}t$. The left-hand side of (1.1) signifies the percentage change in capital over the same period. The percentage change per unit of time is not well defined because the derivative $\mathrm{d}W_{t}/\mathrm{d}t$ does not exist. However, the expected percentage change per unit of time is finite and equal to $\mu_{K}$. This means the expected proportional increase in capital over a fixed time horizon $T$ equals

\textsf{E}\left[\frac{K_{T}}{K_{0}}\right] = \mathrm{e}^{\mu_{K}T}.   (1.3)

In the terminology of Samuelson (1965), $K$ and $L$ are geometric Brownian motions with drift rates $\mu_{K}$ and $\mu_{L}$, respectively.

Suppose we are interested in the evolution of the capital–labour ratio, $K/L$. The standard Itô calculus (Itô 1951, Theorem 6; Karatzas and Shreve 1991, Theorem 3.6; Duffie 2001, p. 95; Björk 2009, Theorem 4.16) yields, after simplifications,

\mathrm{d}\left(\frac{K_{t}}{L_{t}}\right) = \frac{K_{t}}{L_{t}}\left(\mu_{K}-\mu_{L}-\rho_{KL}\sigma_{K}\sigma_{L}+\sigma^{2}_{L}\right)\mathrm{d}t + \frac{K_{t}}{L_{t}}\left(\sigma_{K}-\rho_{KL}\sigma_{L}\right)\mathrm{d}W_{t} - \frac{K_{t}}{L_{t}}\sigma_{L}\sqrt{1-\rho^{2}_{KL}}\,\mathrm{d}\widehat{W}_{t}.   (1.4)

One can make two observations at this point:

(1) The formula (1.4) is not easy to decipher — the calculations are involved.

(2) The formula is ‘misleading’ — the processes $K$ and $L$ are already given, therefore the object on the left-hand side is defined path by path as the ratio $K_{T}(\omega)/L_{T}(\omega)$ and cannot depend on the reference probability measure. In contrast, some of the objects on the right-hand side are strongly measure-dependent: certainly, if we change the reference probability measure, there is no guarantee that $W$ and $\widehat{W}$ will still be Brownian motions under the new measure.

McKean (1969) addresses (1) and informally also (2) by rewriting (1.4) in the form

\mathrm{d}\left(\frac{K_{t}}{L_{t}}\right) = \frac{K_{t}}{L_{t}}\left(\frac{\mathrm{d}K_{t}}{K_{t}}-\frac{\mathrm{d}L_{t}}{L_{t}}-\frac{\mathrm{d}K_{t}}{K_{t}}\frac{\mathrm{d}L_{t}}{L_{t}}+\left(\frac{\mathrm{d}L_{t}}{L_{t}}\right)^{2}\right),   (1.5)

where $\mathrm{d}K_{t}\mathrm{d}L_{t}$ is understood to stand for $\mathrm{d}[K,L]_{t}$ and $[K,L]$ is the quadratic covariation of the processes $K$ and $L$. In the present case we have

\frac{\mathrm{d}[K,L]_{t}}{K_{t}L_{t}} = \rho_{KL}\sigma_{K}\sigma_{L}\,\mathrm{d}t \qquad\text{and}\qquad \frac{\mathrm{d}[L,L]_{t}}{L^{2}_{t}} = \sigma^{2}_{L}\,\mathrm{d}t.

Let us make two further observations:

(3) Formula (1.5) is not only measure-independent, it is also model-free in the sense that it holds for any two continuous semimartingales $K$ and $L$ such that the integrals of $\mathrm{d}K_{t}/K_{t}$ and $\mathrm{d}L_{t}/L_{t}$ are well defined.

(4) McKean (1969, p. 33) observes that one can obtain (1.5) much more directly without passing through (1.1)–(1.2) and (1.4), simply by writing down a second-order Taylor expansion for $f(K,L)=K/L$ in the form

\mathrm{d}f(K_{t},L_{t}) = \tfrac{\partial f}{\partial k}(K_{t},L_{t})\,\mathrm{d}K_{t} + \tfrac{\partial f}{\partial \ell}(K_{t},L_{t})\,\mathrm{d}L_{t} + \frac{1}{2}\left(\tfrac{\partial^{2}f}{\partial k^{2}}(K_{t},L_{t})(\mathrm{d}K_{t})^{2} + 2\tfrac{\partial^{2}f}{\partial k\partial\ell}(K_{t},L_{t})\,\mathrm{d}K_{t}\mathrm{d}L_{t} + \tfrac{\partial^{2}f}{\partial\ell^{2}}(K_{t},L_{t})(\mathrm{d}L_{t})^{2}\right).   (1.6)

Suppose now we wish to evaluate the expected value of the capital–labour ratio. Here the formula (1.4) is very helpful because it tells us immediately that $K/L$ is a geometric Brownian motion with the drift rate

b = \mu_{K}-\mu_{L}-\rho_{KL}\sigma_{K}\sigma_{L}+\sigma^{2}_{L}.   (1.7)

With this coefficient in hand one swiftly concludes, in analogy to (1.3),

\textsf{E}\left[\frac{K_{T}}{L_{T}}\right] = \frac{K_{0}}{L_{0}}\,\mathrm{e}^{bT}.   (1.8)

The need for equation (1.4) is only illusory, however. One can obtain formula (1.7) equally easily from the measure-independent formula (1.5) by inserting into its right-hand side the expected rates of change of $K$ and $L$ and their quadratic (co)variations as implied by (1.1)–(1.2),

\frac{\mathrm{d}(K_{t}/L_{t})}{K_{t}/L_{t}} = \frac{\mathrm{d}K_{t}}{K_{t}} - \frac{\mathrm{d}L_{t}}{L_{t}} - \frac{\mathrm{d}[K,L]_{t}}{K_{t}L_{t}} + \frac{\mathrm{d}[L,L]_{t}}{L_{t}^{2}}.   (1.9)

Replacing each term on the right-hand side of (1.9) by its expected rate of change implied by (1.1)–(1.2) gives

b = \mu_{K} - \mu_{L} - \rho_{KL}\sigma_{K}\sigma_{L} + \sigma^{2}_{L}.   (1.10)

Note that apart from the initial values $K_{0}$, $L_{0}$ and the time horizon $T$, the calculation requires five characteristics of the capital and labour processes: $\mu_{K}$, $\mu_{L}$, $\sigma_{K}$, $\sigma_{L}$, and $\rho_{KL}$.
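The drift calculation (1.7)–(1.8) can be sanity-checked numerically. The following Python sketch (all parameter values are assumed purely for illustration) simulates the geometric Brownian motions (1.1)–(1.2) exactly and compares the Monte Carlo estimate of E[K_T/L_T] with formula (1.8):

import numpy as np
rng = np.random.default_rng(0)
# assumed model parameters and horizon
mu_K, mu_L, s_K, s_L, rho, T = 0.04, 0.02, 0.25, 0.15, 0.3, 2.0
K0, L0, n_paths = 1.0, 1.0, 400_000
# exact simulation of the two correlated geometric Brownian motions (1.1)-(1.2)
W = rng.normal(0.0, np.sqrt(T), n_paths)
W_hat = rng.normal(0.0, np.sqrt(T), n_paths)
K_T = K0 * np.exp((mu_K - 0.5 * s_K**2) * T + s_K * W)
L_T = L0 * np.exp((mu_L - 0.5 * s_L**2) * T + s_L * (rho * W + np.sqrt(1 - rho**2) * W_hat))
b = mu_K - mu_L - rho * s_K * s_L + s_L**2          # drift rate (1.7)
print("Monte Carlo estimate of E[K_T/L_T]:", (K_T / L_T).mean())
print("Closed form (1.8), (K0/L0)*exp(b*T):", K0 / L0 * np.exp(b * T))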

1.2. First steps

Let us now modify (1.1)–(1.2) by adding two jump components $J^{K}$ and $J^{L}$ that jointly form a Lévy process,

\frac{\mathrm{d}K_{t}}{K_{t-}} = \mu_{K}\,\mathrm{d}t + \sigma_{K}\,\mathrm{d}W_{t} + \mathrm{d}J^{K}_{t};   (1.11)
\frac{\mathrm{d}L_{t}}{L_{t-}} = \mu_{L}\,\mathrm{d}t + \sigma_{L}\left(\rho_{KL}\,\mathrm{d}W_{t} + \sqrt{1-\rho^{2}_{KL}}\,\mathrm{d}\widehat{W}_{t}\right) + \mathrm{d}J^{L}_{t};   (1.12)
\mathrm{d}J^{K}_{t} = \int_{|x_{1}|\leq 1} x_{1}\bigl(N(\mathrm{d}t,\mathrm{d}(x_{1},x_{2})) - \Pi(\mathrm{d}(x_{1},x_{2}))\,\mathrm{d}t\bigr) + \int_{|x_{1}|>1} x_{1}\,N(\mathrm{d}t,\mathrm{d}(x_{1},x_{2}));   (1.13)
\mathrm{d}J^{L}_{t} = \int_{|x_{2}|\leq 1} x_{2}\bigl(N(\mathrm{d}t,\mathrm{d}(x_{1},x_{2})) - \Pi(\mathrm{d}(x_{1},x_{2}))\,\mathrm{d}t\bigr) + \int_{|x_{2}|>1} x_{2}\,N(\mathrm{d}t,\mathrm{d}(x_{1},x_{2})).   (1.14)

Here $N$ is a Poisson random measure and $\Pi$ the corresponding Lévy measure. In particular, when $(J^{K},J^{L})$ forms a compound Poisson process, i.e., when $\Pi(\mathbb{R}^{2})<\infty$, the quantity $\Pi(\mathbb{R}^{2})$ is the arrival intensity of jumps and

x=(x_{1},x_{2}) \mapsto \frac{\Pi((-\infty,x_{1})\times(-\infty,x_{2}))}{\Pi(\mathbb{R}^{2})}

is the bivariate cumulative distribution of jump sizes. The complicated expression in (1.14) is required to accommodate models where the small jumps of $L$ have infinite variation while, at the same time, the large jumps of $L$ have infinite mean.

In the presence of jumps, by convention, the paths of $K$ and $L$ are assumed to be right-continuous with left limits. This means the value of capital before a jump is given by the left limit $K_{t-}$ while $K_{t}$ is the value after the jump. The jump is naturally defined as the difference between the two, $\Delta K_{t}=K_{t}-K_{t-}$, and likewise for $L$; see Figure 1.1.

Figure 1.1. Illustration of a right-continuous path with left limits (process $X$).

Very little has changed between (1.1)–(1.2) and (1.11)–(1.12); we have replaced one process with time-homogeneous independent increments by another. But unlike in the Brownian case, it is now mathematically possible that the right-hand side of (1.12) — the percentage change in labour supply $L$ — does not have a finite first moment, while the mean of $K/L$ is nevertheless finite.[1]

[1] From a modelling point of view it is not realistic to believe that $L$ has infinite mean. We are simply stating that the calculus must be general enough to entertain such a possibility. There are other circumstances where an infinite mean arises more naturally; for example, a stock price $S$ will typically have finite mean but $\ln S$ may plausibly have mean equal to $-\infty$ in a model with jumps if returns close to $-100\%$ occur frequently enough.

The main takeaway message is that decomposing a stochastic integral into ‘signal’ and ‘noise’ as suggested by (1.1)–(1.2) is, in general, not straightforward. One possibility is to split $J^{L}$ into two components containing the small and large jumps, respectively, and decompose only the small-jump component into signal and noise as shown in (1.14). This makes (1.12) look more like (1.2) and largely represents the current practice in applications where jumps of the most general type are considered; see Kallsen (2000), Fujiwara and Miyahara (2003), Hubalek et al. (2006), Jeanblanc et al. (2007), Øksendal and Sulem (2007), Bender and Niethammer (2008), and Applebaum (2009).

In this paper, we do handle arbitrary jumps but we interpret the difficulty with the signal–noise decomposition very differently: we refrain from using such a decomposition altogether and instead look for a measure-invariant representation of $K/L$. The sought expression must reduce to the McKean calculus (1.6) when $K$ and $L$ are continuous. At the same time, it must correctly account for changes due to jumps in $K$ and $L$. Just such a formula, suitably reinterpreted, can be traced to Émery (1978, Section 3). Let us now describe what we mean, first informally and then rigorously.

In the present example we seek the representation of $K/L$. This will be written symbolically as

\mathrm{d}\left(\frac{K_{t}}{L_{t}}\right) = \frac{K_{t-}+\mathrm{d}K_{t}}{L_{t-}+\mathrm{d}L_{t}} - \frac{K_{t-}}{L_{t-}},   (1.15)

which informally leads to an expression for percentage changes

\frac{\mathrm{d}(K_{t}/L_{t})}{K_{t-}/L_{t-}} = \frac{1+\mathrm{d}K_{t}/K_{t-}}{1+\mathrm{d}L_{t}/L_{t-}} - 1.   (1.16)

For Émery, the right-hand side of (1.16) represents a deterministic, time-constant function

\eta(x_{1},x_{2}) = \frac{1+x_{1}}{1+x_{2}} - 1   (1.17)

that acts on the increments $\mathrm{d}K_{t}/K_{t-}$ and $\mathrm{d}L_{t}/L_{t-}$. The real meaning of the expression $\xi(\mathrm{d}X_{t})$ for a generic deterministic time-constant $\mathcal{C}^{2}$ function $\xi$ with $\xi(0)=0$ and any semimartingale $X$ is supplied by the Émery formula[2]

\xi_{t}(\mathrm{d}X_{t}) = D\xi_{t}(0)\,\mathrm{d}X_{t} + \frac{1}{2}\sum_{i,j=1}^{d} D^{2}_{ij}\xi_{t}(0)\,\mathrm{d}\bigl[X^{(i)},X^{(j)}\bigr]^{c}_{t} + \bigl(\xi_{t}(\Delta X_{t}) - D\xi_{t}(0)\,\Delta X_{t}\bigr),   (1.18)

where the last term yields absolutely summable jumps of finite variation and $[\cdot,\cdot]^{c}$ stands for the continuous part of quadratic covariation. An easy way to memorize the Émery formula is suggested in equation (2.12) and the subsequent paragraph.

[2] The symbols $D\xi$ and $D^{2}\xi$ stand for the first and second partial derivatives of $\xi$. Émery (1978) considers only the case when $\xi$ is deterministic and constant in time. We explicitly allow $\xi$ to be predictable in order to develop a calculus that includes stochastic integration and Itô’s formula.

Having assigned meaning to the right-hand side of (1.16) via the formula (1.18) with $\xi=\eta$ from (1.17), equality (1.16) is no longer an informal expression but a theorem whose validity one needs to establish. To accomplish this goal and build the simplified calculus, one can begin with the observation that (1.15), too, represents a function that acts on the increments of the underlying processes; in this case it acts on $\mathrm{d}K_{t}$ and $\mathrm{d}L_{t}$. One important difference is that the function in question is no longer deterministic, i.e., we have

\mathrm{d}\left(\frac{K_{t}}{L_{t}}\right) = \tilde{\eta}_{t}(\mathrm{d}K_{t},\mathrm{d}L_{t})

with

\tilde{\eta}_{t}(x_{1},x_{2}) = \frac{K_{t-}+x_{1}}{L_{t-}+x_{2}} - \frac{K_{t-}}{L_{t-}}.   (1.19)

Another practical consideration is that the predictable function $\tilde{\eta}$ in (1.19) is not finite-valued at the point where $x_{2}=-L_{-}$.

To accommodate functions with restricted domain, we say that

a predictable function $\xi$ is compatible with $X$ if $\xi(\Delta X)$ is finite-valued, $\textsf{P}$–a.s.

It is formally straightforward to apply the formula (1.18) to a predictable function $\xi$. However, it is a priori not clear that the formula will be well defined; in particular, for $\xi=\tilde{\eta}$ in (1.19) it is not clear that one obtains

\sum_{0<t\leq\cdot}\bigl|\xi_{t}(\Delta X_{t}) - D\xi_{t}(0)\,\Delta X_{t}\bigr| < \infty.   (1.20)

Ideally, we would like to have a calculus that gives rise only to those predictable functions $\xi$ for which the integral $\int_{0}^{\cdot}\xi_{t}(\mathrm{d}X_{t})$ is always well defined, so that we do not have to check admissibility manually every time a new $\xi$ arises. The precise formulation of two such subclasses, whose elements will be called universal representing functions, is given in Appendix A. We denote by $\mathfrak{I}_{0\mathbb{R}}^{d,n}$ the subset of universal representing functions that map $\mathbb{R}^{d}$–valued processes to $\mathbb{R}^{n}$–dimensional processes. We also set $\mathfrak{I}_{0\mathbb{R}}=\bigcup_{d,n\in\mathbb{N}}\mathfrak{I}_{0\mathbb{R}}^{d,n}$. For computations involving characteristic functions, it is convenient to generalize the Émery formula (1.18) to a subset of complex predictable functions by interpreting $D\xi$ and $D^{2}\xi$ as complex derivatives. The subset of all universal representing functions that are complex-differentiable near the origin will be denoted by $\mathfrak{I}_{0\mathbb{C}}$.

Propositions A.3–A.5 in Appendix A show that the class $\mathfrak{I}_{0\mathbb{R}}$ is self-contained; if we use standard real-valued operations we are guaranteed to stay within $\mathfrak{I}_{0\mathbb{R}}$. The class $\mathfrak{I}_{0\mathbb{C}}$ is also self-contained provided the transformations we apply are complex-differentiable, as will be the case in all our examples involving complex numbers. These results guarantee that we will only ever encounter functions in $\mathfrak{I}_{0\mathbb{R}}$ (resp., $\mathfrak{I}_{0\mathbb{C}}$) and so will never have to check the conditions of Definition A.1 manually.

1.3. Integral notation

Like the McKean calculus of Subsection 1.1, the simplified stochastic calculus is most intuitive when expressed in differential form, such as (1.15) and (1.16). When one wishes to speak of the integrated process whose increments are equal to $\xi_{t}(\mathrm{d}X_{t})$, one typically just introduces a new label, say $Y$, writing $\mathrm{d}Y_{t}=\xi_{t}(\mathrm{d}X_{t})$. There is nothing wrong with the relabelling approach; it does deliver all the immediate benefits of the simplified calculus and helps to keep technicalities to a minimum.

Side by side with the intuitive differential approach, we want to offer the reader an alternative ‘high-level’ view of the calculus where the roles of $\xi$ and $X$ are acknowledged explicitly. Accordingly, the process with increments $\xi_{t}(\mathrm{d}X_{t})$, starting at 0, will be denoted by $\xi\circ X$, i.e.,

\xi\circ X = \int_{0}^{\cdot}\xi_{t}(\mathrm{d}X_{t}).

The high-level notation may seem a little abstract at first, but it offers distinct benefits such as compactness and flexibility. For example, in the integral notation one can write

[X,X] = \int_{0}^{\cdot}(\mathrm{d}X_{t})^{2} = {\operatorname{id}}^{2}\circ X;
[[X,X],X] = \int_{0}^{\cdot}(\mathrm{d}X_{t})^{3} = {\operatorname{id}}^{3}\circ X;
[[[X,X],X],X] = [[X,X],[X,X]] = \int_{0}^{\cdot}(\mathrm{d}X_{t})^{4} = {\operatorname{id}}^{4}\circ X.

Here we let $\operatorname{id}$ denote the identity function;

\operatorname{id}(x) = x.

Below we use $\operatorname{id}_{1}(x)=x_{1}$ for the first component, $\operatorname{id}_{2}$ for the second, etc., where required.

The notation $\xi\circ X$ also emphasizes the universality of the transformation $X\mapsto\xi\circ X$. In the same way that $[X,X]$ is well defined for any semimartingale $X$, the process $|{\operatorname{id}}|^{\alpha}\circ X$ is well defined for any semimartingale $X$ and any real $\alpha\geq 2$. This brings us to other universal transformations that are commonly used in the literature. For example, provided that $X$ and $X_{-}$ are different from zero, the literature defines $\mathcal{L}(X)$ as the process of cumulative percentage change in $X$, i.e.,

\mathrm{d}\mathcal{L}(X)_{t} = \frac{\mathrm{d}X_{t}}{X_{t-}}.

Thus, in the integral notation formula (1.16) reads

\mathcal{L}\left(\frac{K}{L}\right) = \left(\frac{1+{\operatorname{id}}_{1}}{1+{\operatorname{id}}_{2}} - 1\right)\circ(\mathcal{L}(K),\mathcal{L}(L)).   (1.21)

Provided that the cumulative percentage changes in $K$ and $L$ are well defined, formula (1.21) holds for arbitrary semimartingales $K$ and $L$; as far as we know, it is new in this generality.
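As a quick plausibility check, the following Python sketch (jump sizes are assumed purely for illustration) verifies (1.21) pathwise for finite-variation pure-jump $K$ and $L$ that jump at common times: compounding the percentage jumps $(1+x_{1})/(1+x_{2})-1$ of $\mathcal{L}(K/L)$ reproduces the ratio $K_{T}/L_{T}$ computed directly.

import numpy as np
# assumed simultaneous percentage jumps of L(K) and L(L)
dLK = np.array([0.50, -0.20, 0.10])     # jumps x1 of L(K), i.e. dK/K_
dLL = np.array([0.20, 0.05, -0.30])     # jumps x2 of L(L), i.e. dL/L_
K0, L0 = 3.0, 2.0
ratio_direct = (K0 * np.prod(1 + dLK)) / (L0 * np.prod(1 + dLL))
# formula (1.21): L(K/L) jumps by (1 + x1)/(1 + x2) - 1
jumps_ratio = (1 + dLK) / (1 + dLL) - 1
ratio_via_formula = (K0 / L0) * np.prod(1 + jumps_ratio)
print(ratio_direct, ratio_via_formula)   # identical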

2. Simplified stochastic calculus

2.1. Composite rules

The simplified stochastic calculus rests on a sequential application of Propositions A.3–A.5. In practice, one would not use these propositions directly but instead combine them into composite rules that transform a represented process $Y=Y_{0}+\xi\circ X$ into another represented process $Z$. In differential notation, we have $\mathrm{d}Y_{t}=\xi_{t}(\mathrm{d}X_{t})$ and the first two rules read as follows.

Stochastic integration with respect to $Y$:

For a locally bounded process $\zeta$ and the integral $Z=\int_{0}^{\cdot}\zeta_{t}\,\mathrm{d}Y_{t}$ one has

\mathrm{d}Z_{t} = \zeta_{t}\,\xi_{t}(\mathrm{d}X_{t}).

Itô formula for $Z=f(Y)$:

For a suitably smooth function $f$ such that $Y$ and $Y_{-}$ lie in the interior of the domain of $f$ one has

\mathrm{d}Z_{t} = \mathrm{d}f(Y_{t}) = f(Y_{t-}+\mathrm{d}Y_{t}) - f(Y_{t-}) = f(Y_{t-}+\xi_{t}(\mathrm{d}X_{t})) - f(Y_{t-}).

We now restate the above fully rigorously in integral notation.

Corollary 2.1.

Assume $\xi\in\mathfrak{I}_{0\mathbb{R}}^{d,n}$ (resp., $\mathfrak{I}_{0\mathbb{C}}^{d,n}$) is compatible with $X$ and consider the $n$–dimensional process

Y = Y_{0} + \xi\circ X.

The following rules then apply.

  • Stochastic integration: For a locally bounded $\mathbb{R}^{m\times n}$–valued (resp., $\mathbb{C}^{m\times n}$–valued) predictable process $\zeta$ we have $\zeta\xi\in\mathfrak{I}^{d,m}_{0\mathbb{R}}$ (resp., $\mathfrak{I}^{d,m}_{0\mathbb{C}}$) and

    Z = Z_{0} + \int_{0}^{\cdot}\zeta_{u}\,\mathrm{d}Y_{u} = Z_{0} + \zeta\xi\circ X;   (2.1)

  • Smooth transformation (‘Itô’s formula’): Assume $Y$ and $Y_{-}$ remain in an open subset $\mathcal{U}$ of $\mathbb{R}^{n}$ (resp., $\mathbb{C}^{n}$) where the function $f:\mathcal{U}\to\mathbb{R}^{m}$ is twice continuously differentiable (resp., where $f:\mathcal{U}\to\mathbb{C}^{m}$ is analytic). Then $f(Y_{-}+\xi)-f(Y_{-})$ is in $\mathfrak{I}_{0\mathbb{R}}^{d,m}$ (resp., $\mathfrak{I}_{0\mathbb{C}}^{d,m}$) and it is compatible with $X$. Furthermore,

    Z = f(Y) = f(Y_{0}) + \bigl(f(Y_{-}+\xi) - f(Y_{-})\bigr)\circ X.   (2.2)

Let us outline a general scheme of how the two composite rules are applied in practice. At the outset, one designates the primitive input to the problem at hand; this input process is thereafter labelled $X$. For example, in Subsections 1.1 and 1.2 it is natural to start from the bivariate process

X = (\mathcal{L}(K),\mathcal{L}(L)).   (2.3)

Observe that $X$ is always representable with respect to itself thanks to Proposition A.3 with $\zeta$ equal to the $d\times d$ identity matrix. Starting off from the trivial representation $X=X_{0}+{\operatorname{id}}\circ X$, i.e., taking $Y=X$ and $\xi(x)=x$ in Corollary 2.1, one applies smooth transformation or stochastic integration as required to obtain the first intermediate result $Z$. This intermediate result (relabelled $Y$) becomes the input to the next application of Corollary 2.1, producing the next intermediate output $Z$. The $Y\to Z$ pattern is repeated until one reaches the desired output $Z$; in our example the goal is

Z = \mathcal{L}\left(\frac{K}{L}\right).   (2.4)

Table 2.1 illustrates the steps required in the transition from (2.3) to (2.4).

Step 1. Input: $\mathrm{d}Y = \mathrm{d}(\mathcal{L}(K),\mathcal{L}(L))$. Operation $Y\to Z$: integration with $\zeta = \operatorname{diag}(K_{-},L_{-})$. Output: $\mathrm{d}Z = \mathrm{d}(K,L) = \operatorname{diag}(K_{-},L_{-})\,\mathrm{d}(\mathcal{L}(K),\mathcal{L}(L))$.

Step 2. Input: $\mathrm{d}Y = \mathrm{d}(K,L)$. Operation $Y\to Z$: smooth transformation $f(K,L)=\frac{K}{L}$. Output: $\mathrm{d}Z = \mathrm{d}\left(\frac{K}{L}\right) = \frac{K_{-}+\mathrm{d}K}{L_{-}+\mathrm{d}L} - \frac{K_{-}}{L_{-}} = \frac{K_{-}}{L_{-}}\left(\frac{1+\mathrm{d}\mathcal{L}(K)}{1+\mathrm{d}\mathcal{L}(L)} - 1\right)$.

Step 3. Input: $\mathrm{d}Y = \mathrm{d}\left(\frac{K}{L}\right)$. Operation $Y\to Z$: integration with $\zeta = \frac{L_{-}}{K_{-}}$. Output: $\mathrm{d}Z = \mathrm{d}\mathcal{L}\left(\frac{K}{L}\right) = \frac{L_{-}}{K_{-}}\,\mathrm{d}\left(\frac{K}{L}\right) = \frac{1+\mathrm{d}\mathcal{L}(K)}{1+\mathrm{d}\mathcal{L}(L)} - 1$.

Table 2.1. Schematic derivation of (1.21) by means of Corollary 2.1

The discussion above concerns formal calculations where the rules of the simplified calculus are applied mechanically. Many users will prefer a more intuitive approach whose main idea is apparent in the last column of Table 2.1. Here one observes that the calculus traces the behaviour of jumps, and so one may effectively pretend that $X$, $Y$, and $Z$ are finite-variation pure-jump processes. In this way, it is possible to arrive at the correct $\xi$ even without applying the formal rules. In the context of (1.21), for example, suppose $K$ increases by 50% and $L$ increases by 20%. The percentage change in $K/L$ is then precisely

\frac{1+0.5}{1+0.2} - 1 = 25\%.

Therefore, formula (1.21), among other things, describes jump transformations: every time $\mathcal{L}(K)$ jumps by $x_{1}$ (e.g., 0.5) and $\mathcal{L}(L)$ jumps by $x_{2}$ (e.g., 0.2) the process $\mathcal{L}(K/L)$ jumps by

\xi(x) = \frac{1+x_{1}}{1+x_{2}} - 1.

As a further example, let us see how the rules of the simplified calculus can be used to obtain the representation of the logarithmic return in terms of the rate of return.

Example 2.2 (Representation of the log return in terms of the rate of return).

Let $S>0$ stand for the value of an investment with $S_{-}>0$ and let $X=\mathcal{L}(S)=\int_{0}^{\cdot}\mathrm{d}S_{t}/S_{t-}$ be the cumulative rate of return on this investment. On a purely intuitive level, thinking only of jump transformations, one can write

\mathrm{d}\ln S_{t} = \ln\bigl(S_{t-}+\mathrm{d}S_{t}\bigr) - \ln S_{t-} = \ln\left(1+\frac{\mathrm{d}S_{t}}{S_{t-}}\right) = \ln\bigl(1+\mathrm{d}\mathcal{L}(S)_{t}\bigr).

More formally, the integration rule yields $S=S_{0}+S_{-}\,{\operatorname{id}}\circ\mathcal{L}(S)$ and smooth transformation then gives

\ln S = \ln\bigl(S_{0}+S_{-}\,{\operatorname{id}}\circ\mathcal{L}(S)\bigr) = \ln S_{0} + \bigl(\ln(S_{-}+S_{-}\,{\operatorname{id}}) - \ln S_{-}\bigr)\circ\mathcal{L}(S).

Both approaches yield the representation

\ln S = \ln S_{0} + \ln(1+{\operatorname{id}})\circ\mathcal{L}(S)

for any semimartingale $S$ such that $S_{-}>0$ and $S>0$. ∎
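A one-line numerical illustration (jump sizes assumed) of the representation just derived, for a finite-variation pure-jump $S$: every percentage change $\mathrm{d}\mathcal{L}(S)$ contributes $\ln(1+\mathrm{d}\mathcal{L}(S))$ to $\ln S$.

import numpy as np
pct_changes = np.array([0.50, 0.20, -0.10, 0.05])   # assumed jumps of L(S), i.e. dS/S_
S0 = 100.0
S_T = S0 * np.prod(1 + pct_changes)                 # terminal value of S
lhs = np.log(S_T) - np.log(S0)                      # ln S_T - ln S_0
rhs = np.sum(np.log(1 + pct_changes))               # (ln(1+id) applied to the jumps of L(S))
print(lhs, rhs)                                     # equal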

As the final introductory example consider the representation of quadratic covariation.

Example 2.3 (Representation of quadratic covariation).

The quadratic covariation $[X,Y]$ satisfies (or, as in Meyer 1976, is defined by) the identity

XY = X_{0}Y_{0} + \int_{0}^{\cdot}X_{t-}\,\mathrm{d}Y_{t} + \int_{0}^{\cdot}Y_{t-}\,\mathrm{d}X_{t} + [X,Y].

This yields

\mathrm{d}[X,Y]_{t} = \mathrm{d}(X_{t}Y_{t}) - X_{t-}\,\mathrm{d}Y_{t} - Y_{t-}\,\mathrm{d}X_{t} = (X_{t-}+\mathrm{d}X_{t})(Y_{t-}+\mathrm{d}Y_{t}) - X_{t-}Y_{t-} - X_{t-}\,\mathrm{d}Y_{t} - Y_{t-}\,\mathrm{d}X_{t} = \mathrm{d}X_{t}\,\mathrm{d}Y_{t}.

More formally, the integration and smooth transformation rules yield

[X,Y] = \bigl((X_{-}+{\operatorname{id}}_{1})(Y_{-}+{\operatorname{id}}_{2}) - X_{-}Y_{-} - X_{-}\,{\operatorname{id}}_{2} - Y_{-}\,{\operatorname{id}}_{1}\bigr)\circ(X,Y) = {\operatorname{id}}_{1}\,{\operatorname{id}}_{2}\circ(X,Y).

Thus, in differential notation one can rigorously write $\mathrm{d}[X,X]_{t}=(\mathrm{d}X_{t})^{2}$ for any univariate semimartingale $X$. ∎
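For finite-variation pure-jump paths these identities can be checked by direct summation. The following Python sketch (jump sizes assumed for illustration) confirms the product rule defining $[X,Y]$ with $[X,Y]_{T}$ computed as the sum of the products $\mathrm{d}X\,\mathrm{d}Y$.

import numpy as np
dX = np.array([0.3, -0.1, 0.2, 0.0, -0.4])      # assumed jumps of X
dY = np.array([0.1, 0.2, -0.3, 0.5, 0.1])       # assumed jumps of Y
X0, Y0 = 1.0, 2.0
X = X0 + np.cumsum(dX)
Y = Y0 + np.cumsum(dY)
X_minus = np.concatenate(([X0], X[:-1]))        # left limits X_{t-}
Y_minus = np.concatenate(([Y0], Y[:-1]))        # left limits Y_{t-}
quad_cov = np.sum(dX * dY)                      # [X, Y]_T as the integral of (dX)(dY)
rhs = X0 * Y0 + np.sum(X_minus * dY) + np.sum(Y_minus * dX) + quad_cov
print(rhs, X[-1] * Y[-1])                       # both equal X_T * Y_T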

Section 3 contains many more explicit representations that are useful in practice. Some of these are well known in the specialist literature, while others are new. Proposition A.5, in particular, is a powerful tool for obtaining new representations from old ones. We summarize it here in the form of a composition rule. Observe that in the differential notation the rule is completely natural; it asserts that

$\mathrm{d}Y_{t}=\xi_{t}(\mathrm{d}X_{t})$ and $\mathrm{d}Z_{t}=\psi_{t}(\mathrm{d}Y_{t})$ yield $\mathrm{d}Z_{t}=\psi_{t}(\xi_{t}(\mathrm{d}X_{t}))$.   (2.5)
Corollary 2.4 (Composition of representations).

Assume $\xi\in\mathfrak{I}_{0\mathbb{R}}^{d,n}$ is compatible with $X$ and consider the $n$–dimensional process $Y=Y_{0}+\xi\circ X$. For $\psi\in\mathfrak{I}_{0\mathbb{R}}^{n,m}$ compatible with $Y$ one obtains that $\psi(\xi)\in\mathfrak{I}_{0\mathbb{R}}^{d,m}$ is compatible with $X$ and

Z = Z_{0} + \psi\circ Y = Z_{0} + \psi(\xi)\circ X.   (2.6)

An analogous statement holds with $\mathfrak{I}_{0\mathbb{C}}$ in place of $\mathfrak{I}_{0\mathbb{R}}$.

The composition rule allows the user to store some common calculations and ‘recycle’ them later without having to revisit their detailed derivation. Suppose, for instance, that we are given the evolution of $(\ln K,\ln L)$ as the primitive input. Thanks to Corollary 2.4, there is no need to calculate everything afresh all the way from $(\ln K,\ln L)$ to $\mathcal{L}(K/L)$. One only computes the passage from $(\ln K,\ln L)$ to $(\mathcal{L}(K),\mathcal{L}(L))$, which yields (see equation (3.4) below)

(\mathcal{L}(K),\mathcal{L}(L)) = \bigl(\mathrm{e}^{{\operatorname{id}}_{1}}-1,\;\mathrm{e}^{{\operatorname{id}}_{2}}-1\bigr)\circ(\ln K,\ln L),

while the passage from $(\mathcal{L}(K),\mathcal{L}(L))$ to $\mathcal{L}(K/L)$ can be recycled from (1.21). The two results composed together give

\mathcal{L}\left(\frac{K}{L}\right) = \bigl(\mathrm{e}^{{\operatorname{id}}_{1}-{\operatorname{id}}_{2}}-1\bigr)\circ(\ln K,\ln L).   (2.7)

In differential notation,

\mathrm{d}K_{t} = \mathrm{d}\mathrm{e}^{\ln K_{t}} = \mathrm{e}^{\ln K_{t-}+\mathrm{d}\ln K_{t}} - \mathrm{e}^{\ln K_{t-}} = K_{t-}\bigl(\mathrm{e}^{\mathrm{d}\ln K_{t}}-1\bigr)

substituted into (1.16) yields

\frac{\mathrm{d}(K_{t}/L_{t})}{K_{t-}/L_{t-}} = \frac{1+\mathrm{e}^{\mathrm{d}\ln K_{t}}-1}{1+\mathrm{e}^{\mathrm{d}\ln L_{t}}-1} - 1 = \mathrm{e}^{\mathrm{d}\ln K_{t}-\mathrm{d}\ln L_{t}} - 1,

which is the differential equivalent of formula (2.7).
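The composition can also be carried out symbolically. The following sympy sketch composes the representing function of (1.21) with that of the passage from $(\ln K,\ln L)$ to $(\mathcal{L}(K),\mathcal{L}(L))$ and recovers the representing function in (2.7).

import sympy as sp
x1, x2 = sp.symbols('x1 x2')
# representing function of (L(K), L(L)) on (ln K, ln L): (e^{x1} - 1, e^{x2} - 1)
xi1, xi2 = sp.exp(x1) - 1, sp.exp(x2) - 1
# representing function of L(K/L) on (L(K), L(L)), cf. (1.21)
psi = lambda y1, y2: (1 + y1) / (1 + y2) - 1
composed = sp.simplify(psi(xi1, xi2))
print(composed)                                          # simplifies to exp(x1 - x2) - 1
print(sp.simplify(composed - (sp.exp(x1 - x2) - 1)))     # 0, i.e. the function in (2.7)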

2.2. Émery formula and drift computation

Having mastered the art of representing one process by means of another, we would like to obtain an analogue of (1.9)–(1.10). Our task is to express the drift of the represented process with the help of the characteristics of the representing process.

To begin with, we collect the predictable characteristics (i.e., the drift rate, the covariance matrix of the associated Brownian motion, and the Lévy measure) of the input process $X=(\mathcal{L}(K),\mathcal{L}(L))$ in equations (1.11)–(1.14) in the more compact form

b^{X[h^{1}]} = \begin{bmatrix}\mu_{K}\\ \mu_{L}\end{bmatrix};\qquad c^{X} = \begin{bmatrix}\sigma_{K}^{2} & \rho_{KL}\sigma_{K}\sigma_{L}\\ \rho_{KL}\sigma_{K}\sigma_{L} & \sigma_{L}^{2}\end{bmatrix};\qquad F^{X} = \Pi.   (2.8)

Here $X[h^{1}]$ denotes the process $X$ with jumps greater than 1 in absolute value removed,

X[h^{1}] = X_{0} + \bigl({\operatorname{id}}_{1}\,\mathbf{1}_{|{\operatorname{id}}_{1}|\leq 1},\;{\operatorname{id}}_{2}\,\mathbf{1}_{|{\operatorname{id}}_{2}|\leq 1}\bigr)\circ X.   (2.9)

Observe that this precisely matches the decomposition of jumps appearing in (1.13)–(1.14) and ensures that the drift of $X[h^{1}]$ is finite.[3] More generally, we will denote by $X[h]$ the process containing the small jumps of $X$ as given by a specific truncation function $h$ and observe that $X[h^{1}]$ corresponds to the choice

h^{1} = \bigl({\operatorname{id}}_{1}\,\mathbf{1}_{|{\operatorname{id}}_{1}|\leq 1},\ldots,{\operatorname{id}}_{d}\,\mathbf{1}_{|{\operatorname{id}}_{d}|\leq 1}\bigr),   (2.10)

where $d$ is the dimension of $X$. The mechanics of truncation are described in Definition B.1.

[3] In contrast, the drift of $X$ need not be well defined in general; see also Footnote 1.

The reader must be warned that $X[0]$, the continuous part of $X$, is not always well defined. For this reason, $X[0]$ cannot be universally represented, in contrast to $X[h^{1}]$, whose universal representation appears in (2.9). Nonetheless, situations where $X[0]$ exists do arise in practice, for example in the Merton (1976) jump-diffusion model. In such models we may write

\mathrm{d}X_{t} = \mathrm{d}X[0]_{t} + \Delta X_{t}.   (2.11)

This, when substituted into (1.18), leads to the simplified expression

\xi_{t}(\mathrm{d}X_{t}) = D\xi_{t}(0)\,\mathrm{d}X[0]_{t} + \frac{1}{2}\sum_{i,j=1}^{d}D^{2}_{ij}\xi_{t}(0)\,\mathrm{d}\bigl[X^{(i)},X^{(j)}\bigr]^{c}_{t} + \xi_{t}(\Delta X_{t}),   (2.12)

which offers a valuable insight into the nature of the Émery formula. We observe that the first two terms of (2.12) correspond to the McKean calculus for the continuous part of $X$ while the last term accounts for the jumps in $X$. The two components do not interact and can be treated separately. In the most general situation where the decomposition (2.11) does not exist, one can make (2.12) rigorous by adding small jumps to the first term and subtracting them in the last term to obtain

\xi_{t}(\mathrm{d}X_{t}) = D\xi_{t}(0)\,\mathrm{d}X[h]_{t} + \frac{1}{2}\sum_{i,j=1}^{d}D^{2}_{ij}\xi_{t}(0)\,\mathrm{d}\bigl[X^{(i)},X^{(j)}\bigr]^{c}_{t} + \bigl(\xi_{t}(\Delta X_{t}) - D\xi_{t}(0)\,h(\Delta X_{t})\bigr).   (2.13)

The original Émery formula (1.18) corresponds simply to the case where all the jumps of $X$ have been added to the first term and subtracted in the last term of (2.12).

We thus come to understand the Émery formula as a spectrum of equivalent expressions in which one can dial the truncation function $h$ all the way down to 0 or all the way up to $h(x)=x$. In this sense, the truncation is unimportant – we can always choose $h$ to suit our needs. In the univariate case, one would thus pick $h=0$ if the jumps of $X$ have finite variation, as in the Merton jump-diffusion model; failing that, $h={\operatorname{id}}$ if the drift of $X$ exists; and finally $h=h^{1}={\operatorname{id}}\,\mathbf{1}_{|{\operatorname{id}}|\leq 1}$ in all remaining cases. In the multivariate case this choice can be made component-wise. With such a choice of $h$, the drift of each contributing term in (2.13) is guaranteed to be finite.

We can now perform the feat previously achieved on a smaller scale in (1.9)–(1.10). By matching each term in (2.13) with its drift contribution, one obtains

b^{\xi\circ X} = D\xi(0)\,b^{X[h]} + \frac{1}{2}\sum_{i,j=1}^{d}D^{2}_{ij}\xi(0)\,c^{X}_{ij} + \int_{\mathbb{R}^{d}}\bigl(\xi(x) - D\xi(0)h(x)\bigr)F^{X}(\mathrm{d}x).   (2.14)

Formula (2.14) is proved in Theorem B.6.

Specifically, with $\xi$ given in (1.17) we obtain

D\xi(0) = [\,1\ \ -1\,],\qquad D^{2}\xi(0) = \begin{bmatrix}0 & -1\\ -1 & 2\end{bmatrix}.

For the specific input parameters in (2.8) and the corresponding truncation function in (2.10) the drift conversion formula (2.14) yields

b^{\mathcal{L}(K/L)} = \mu_{K} - \mu_{L} - \rho_{KL}\sigma_{K}\sigma_{L} + \sigma_{L}^{2} + \int_{\mathbb{R}^{2}}\left(\frac{1+x_{1}}{1+x_{2}} - 1 - \bigl(x_{1}\mathbf{1}_{|x_{1}|\leq 1} - x_{2}\mathbf{1}_{|x_{2}|\leq 1}\bigr)\right)\Pi(\mathrm{d}(x_{1},x_{2})),   (2.15)

which is the appropriate generalization of (1.10), provided the integral in (2.15) converges.[4] Formula (1.8) continues to hold with this choice of $b$; see Theorems B.4 and B.6 below.

[4] We might consider, for example, a model where the jumps in $\mathcal{L}(K)$ and $\mathcal{L}(L)$ are independent, in which case $\Pi(\mathrm{d}x_{1},\mathrm{d}x_{2}) = \mathbf{1}_{x_{2}=0}\,\Pi^{K}(\mathrm{d}x_{1}) + \mathbf{1}_{x_{1}=0}\,\Pi^{L}(\mathrm{d}x_{2})$, meaning capital and labour do not jump simultaneously. We may take $\Pi^{K}(\mathrm{d}x_{1})$ to be lognormal so that $\int_{0}^{\infty}x_{1}\Pi^{K}(\mathrm{d}x_{1})<\infty$ and $K$ has finite mean. The choice $\Pi^{L}(\mathrm{d}x_{2})=x_{2}^{-2}\mathbf{1}_{x_{2}>0}\,\mathrm{d}x_{2}$ then provides an example where $L$ has infinite variation (even with $\sigma_{L}=0$) and infinite mean while the mean of $K/L$ remains finite (see also Theorem B.4).
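The derivative computations feeding into (2.14) are easily automated. A small sympy sketch along the following lines reproduces $D\xi(0)$, $D^{2}\xi(0)$ and hence the Brownian part of (2.15); the symbols are those of (2.8).

import sympy as sp
x1, x2 = sp.symbols('x1 x2')
muK, muL, sK, sL, rho = sp.symbols('mu_K mu_L sigma_K sigma_L rho_KL')
xi = (1 + x1) / (1 + x2) - 1                               # representing function (1.17)
grad = sp.Matrix([xi.diff(x1), xi.diff(x2)]).subs({x1: 0, x2: 0})
hess = sp.hessian(xi, (x1, x2)).subs({x1: 0, x2: 0})
b_X = sp.Matrix([muK, muL])                                # b^{X[h^1]} from (2.8)
c_X = sp.Matrix([[sK**2, rho*sK*sL], [rho*sK*sL, sL**2]])  # c^X from (2.8)
drift = (grad.T * b_X)[0] + sp.Rational(1, 2) * sum(
    hess[i, j] * c_X[i, j] for i in range(2) for j in range(2))
print(grad.T)               # Matrix([[1, -1]])
print(hess)                 # Matrix([[0, -1], [-1, 2]])
print(sp.expand(drift))     # mu_K - mu_L - rho_KL*sigma_K*sigma_L + sigma_L**2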

3. Further examples with drift computation

We will now showcase the strength of process representations such as (1.21) when it comes to computing drifts. We will do so side by side with the classical approach. Let us therefore start with an $\mathbb{R}$–valued Lévy process written in the classical notation,

X = X_{0} + \int_{0}^{\cdot}\alpha\,\mathrm{d}s + \int_{0}^{\cdot}\sigma\,\mathrm{d}W_{s} + \int_{0}^{\cdot}\int_{|x|\leq 1}x\,\widehat{N}(\mathrm{d}s,\mathrm{d}x) + \int_{0}^{\cdot}\int_{|x|>1}x\,N(\mathrm{d}s,\mathrm{d}x),   (3.1)

where $N$ is a Poisson jump measure, $\Pi$ the corresponding Lévy measure,

\widehat{N}(\mathrm{d}t,\mathrm{d}x) = N(\mathrm{d}t,\mathrm{d}x) - \Pi(\mathrm{d}x)\,\mathrm{d}t

the compensated Poisson jump measure, and $\alpha,\sigma\in\mathbb{R}$.

In the simplified stochastic calculus we will never have to write out the decomposition (3.1) in full. Instead we just note that $X$ is an Itô semimartingale[5] with characteristics

\bigl(b^{X[h^{1}]}=\alpha,\; c^{X}=\sigma^{2},\; F^{X}=\Pi\bigr).   (3.2)

The notation of (3.2) emphasizes the fact that some expressions below, such as (3.6), remain valid even if $b^{X}$, $c^{X}$, and $F^{X}$ are stochastic.

[5] A generalization of (3.1) in which $\alpha$, $\sigma^{2}$, and $\Pi$ are allowed to be stochastic; see Definition B.5 below.

In the next example, we will find the representation of the cumulative percentage change in $\mathrm{e}^{vX}$ for fixed $v\in\mathbb{C}$ and use it to compute the moment generating function of $X$.

Example 3.1 (Drift of $\mathcal{L}(\mathrm{e}^{vX})$ for $v\in\mathbb{C}$).

By integration and smooth transformation we have

\mathrm{d}\mathcal{L}(\mathrm{e}^{vX})_{t} = \frac{\mathrm{d}\mathrm{e}^{vX_{t}}}{\mathrm{e}^{vX_{t-}}} = \frac{\mathrm{e}^{v(X_{t-}+\mathrm{d}X_{t})}-\mathrm{e}^{vX_{t-}}}{\mathrm{e}^{vX_{t-}}} = \mathrm{e}^{v\,\mathrm{d}X_{t}} - 1,   (3.3)

or equivalently,

\mathcal{L}(\mathrm{e}^{vX}) = \bigl(\mathrm{e}^{v\,{\operatorname{id}}}-1\bigr)\circ X.   (3.4)

The representing function is $\xi=\mathrm{e}^{v\,{\operatorname{id}}}-1$ with $\xi^{\prime}(0)=v$ and $\xi^{\prime\prime}(0)=v^{2}$. The corresponding Émery formula (2.13) reads[6]

\xi_{t}(\mathrm{d}X_{t}) = v\,\mathrm{d}X[h]_{t} + \frac{1}{2}v^{2}\,\mathrm{d}[X,X]_{t}^{c} + \bigl(\mathrm{e}^{v\Delta X_{t}}-1-v\,h(\Delta X_{t})\bigr).   (3.5)

It is valid for any semimartingale $X$.

[6] A helpful mnemonic device for the Émery formula is shown in equation (2.12) and the subsequent paragraph.

It is now straightforward to compute the drift in (3.5), provided it exists. Specifically, for an Itô semimartingale $X$, (3.5) yields the drift rate

b^{\mathcal{L}(\mathrm{e}^{vX})} = v\,b^{X[h]} + \frac{1}{2}v^{2}c^{X} + \int_{\mathbb{R}}\bigl(\mathrm{e}^{vx}-1-v\,h(x)\bigr)F^{X}(\mathrm{d}x).   (3.6)

If, additionally, $X$ is a Lévy process as in (3.1)–(3.2), we obtain

b^{\mathcal{L}(\mathrm{e}^{vX})} = \alpha v + \frac{1}{2}\sigma^{2}v^{2} + \int_{\mathbb{R}}\bigl(\mathrm{e}^{vx}-1-vx\,\mathbf{1}_{|x|\leq 1}\bigr)\Pi(\mathrm{d}x)

as long as the integral is finite and, in analogy to (1.3),

\textsf{E}\bigl[\mathrm{e}^{v(X_{T}-X_{0})}\bigr] = \exp\bigl(b^{\mathcal{L}(\mathrm{e}^{vX})}T\bigr).   (3.7)

The drift rate $\kappa^{X}(v)=b^{\mathcal{L}(\mathrm{e}^{vX})}$ is known as the cumulant function of the random variable $X_{1}-X_{0}$.[7]

[7] Formula (3.7) holds thanks to Theorems B.4 and B.6. Note that (3.7) is in fact the Lévy–Khintchine formula applied to the Lévy process $X$ (Sato 1999, Theorem 8.1).
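As a numerical illustration (all parameter values assumed), the following Python sketch checks (3.7) for a Merton jump-diffusion. The jumps have finite activity, so the truncation $h=0$ may be used in (3.6); $\alpha$ then denotes the drift of the continuous part $X[0]$ and the integral against $\Pi=\lambda\,\mathrm{N}(m,\delta^{2})$ is available in closed form.

import numpy as np
rng = np.random.default_rng(1)
# assumed Merton jump-diffusion: X_t - X_0 = alpha*t + sigma*W_t + sum of N_t jumps ~ N(m, delta^2)
alpha, sigma, lam, m, delta = 0.05, 0.2, 0.8, -0.05, 0.1
T, v, n_paths = 1.0, 0.7, 500_000
# cumulant (3.6) with h = 0: kappa(v) = alpha*v + sigma^2 v^2 / 2 + integral of (e^{vx} - 1) against Pi
kappa = alpha * v + 0.5 * sigma**2 * v**2 + lam * (np.exp(v * m + 0.5 * v**2 * delta**2) - 1)
W_T = rng.normal(0.0, np.sqrt(T), n_paths)
N_T = rng.poisson(lam * T, n_paths)
jumps = m * N_T + delta * np.sqrt(N_T) * rng.normal(size=n_paths)   # sum of N_T iid N(m, delta^2) jumps
X_incr = alpha * T + sigma * W_T + jumps
print("Monte Carlo E[exp(v(X_T - X_0))]:", np.exp(v * X_incr).mean())
print("exp(kappa(v) T) from (3.7):      ", np.exp(kappa * T))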

Remark 3.2.

Let us now consider the same calculation using the form (3.1). Itô’s formula (Applebaum 2009, Theorem 4.4.7) gives

\mathrm{e}^{vX} - \mathrm{e}^{vX_{0}} = \int_{0}^{\cdot}v\mathrm{e}^{vX_{s-}}(\alpha\,\mathrm{d}s+\sigma\,\mathrm{d}W_{s}) + \frac{1}{2}\int_{0}^{\cdot}v^{2}\mathrm{e}^{vX_{s-}}\sigma^{2}\,\mathrm{d}s + \int_{0}^{\cdot}\int_{|x|\leq 1}\bigl(\mathrm{e}^{v(X_{s-}+x)}-\mathrm{e}^{vX_{s-}}\bigr)\widehat{N}(\mathrm{d}s,\mathrm{d}x) + \int_{0}^{\cdot}\int_{|x|>1}\bigl(\mathrm{e}^{v(X_{s-}+x)}-\mathrm{e}^{vX_{s-}}\bigr)N(\mathrm{d}s,\mathrm{d}x) + \int_{0}^{\cdot}\int_{|x|\leq 1}\bigl(\mathrm{e}^{v(X_{s-}+x)}-\mathrm{e}^{vX_{s-}}-v\mathrm{e}^{vX_{s-}}x\bigr)\Pi(\mathrm{d}x)\,\mathrm{d}s.   (3.8)

Integration yields (Applebaum 2009, Section 4.3.3)

\mathcal{L}(\mathrm{e}^{vX}) = \int_{0}^{\cdot}\mathrm{e}^{-vX_{s-}}\,\mathrm{d}\mathrm{e}^{vX_{s}} = \int_{0}^{\cdot}\left(\alpha v+\frac{1}{2}\sigma^{2}v^{2}\right)\mathrm{d}s + \int_{0}^{\cdot}\sigma v\,\mathrm{d}W_{s} + \int_{0}^{\cdot}\int_{|x|\leq 1}(\mathrm{e}^{vx}-1)\widehat{N}(\mathrm{d}s,\mathrm{d}x) + \int_{0}^{\cdot}\int_{|x|>1}(\mathrm{e}^{vx}-1)N(\mathrm{d}s,\mathrm{d}x) + \int_{0}^{\cdot}\int_{|x|\leq 1}(\mathrm{e}^{vx}-1-vx)\Pi(\mathrm{d}x)\,\mathrm{d}s.   (3.9)

Finally, the drift rate is evaluated by computing the drift of each contributing term in (3.9),

b^{\mathcal{L}(\mathrm{e}^{vX})} = \alpha v + \frac{1}{2}\sigma^{2}v^{2} + 0 + 0 + \int_{|x|>1}(\mathrm{e}^{vx}-1)\Pi(\mathrm{d}x) + \int_{|x|\leq 1}\bigl(\mathrm{e}^{vx}-1-vx\bigr)\Pi(\mathrm{d}x).

The calculations (3.8)–(3.9) become much easier in the approach (3.3)–(3.5) because the rules of the simplified stochastic calculus are more compact and easier to remember. ∎

The main advantage of the simplified calculus is that one does not have to keep track of the drift, volatility, and jump intensities through intermediate calculations. In the next example we will evaluate all three characteristics of the process $Y=\mathcal{L}(\mathrm{e}^{vX})$ when $X$ is an Itô semimartingale. Having all three characteristics is not necessary for our purposes; this example merely shows that the characteristics are easily recalled at any moment — if needed.

Example 3.3 (Characteristics of a represented Itô semimartingale).

Consider $\xi\in\mathfrak{I}_{0\mathbb{R}}^{d,n}\cup\mathfrak{I}_{0\mathbb{C}}^{d,n}$ compatible with an Itô semimartingale $X$.

(1) For the ‘volatility’ of the represented process $Y=Y_{0}+\xi\circ X$ we obtain

c^{Y} = D\xi(0)\,c^{X}D\xi(0)^{\top}.
(2) To compute the drift of $Y[g]$, observe that one naturally obtains

Y[g] = Y_{0} + g\circ Y,

for any truncation function $g$ equal to the identity on a neighbourhood of zero (see Proposition B.2). The chain rule (2.6) now yields $Y[g]=Y_{0}+g(\xi)\circ X$. As $Dg(0)$ is by assumption an identity matrix, we have $D(g\circ\xi)(0)=D\xi(0)$ and $D^{2}(g\circ\xi)(0)=D^{2}\xi(0)$, and the desired drift is

b^{Y[g]} = D\xi(0)\,b^{X[h]} + \frac{1}{2}\sum_{i,j=1}^{d}D^{2}_{ij}\xi(0)\,c_{ij}^{X} + \int_{\mathbb{R}^{d}}\bigl(g(\xi(x)) - D\xi(0)h(x)\bigr)F^{X}(\mathrm{d}x).
(3) Finally, let $G$ be a closed $n$–dimensional set not containing zero. The process $\mathbf{1}_{{\operatorname{id}}\in G}\circ Y$ counts the jumps of $Y$ whose size is in $G$; its drift yields the jump arrival intensity $F^{Y}(G)$. The chain rule (2.6) gives $\mathbf{1}_{{\operatorname{id}}\in G}\circ Y=\mathbf{1}_{\xi\in G}\circ X$. The function $\psi=\mathbf{1}_{\xi\in G}$ satisfies $D\psi(0)=D^{2}\psi(0)=0$, which yields

F^{Y}(G) = b^{\psi\circ X} = 0 + 0 + \int_{\mathbb{R}^{d}}\mathbf{1}_{\xi(x)\in G}\,F^{X}(\mathrm{d}x).

Thus, for each $(\omega,t)$, we recognize $F^{Y}$ as the image (a.k.a. push-forward) measure of $F^{X}$ obtained via the mapping $\xi$.

For concreteness, set $Y=\mathcal{L}(\mathrm{e}^{vX})=(\mathrm{e}^{v\,{\operatorname{id}}}-1)\circ X$ for some fixed $v\in\mathbb{C}$ and take $X$ to be the Lévy process defined by (3.1). As $\xi(x)=\mathrm{e}^{vx}-1$, we obtain $\xi^{\prime}(0)=v$ and $\xi^{\prime\prime}(0)=v^{2}$, and

b^{Y[h^{1}]} = \alpha v + \frac{1}{2}\sigma^{2}v^{2} + \int_{\mathbb{R}}\bigl((\mathrm{e}^{vx}-1)\mathbf{1}_{|\mathrm{e}^{vx}-1|\leq 1} - vx\,\mathbf{1}_{|x|\leq 1}\bigr)\Pi(\mathrm{d}x);
c^{Y} = \sigma^{2}v^{2};
F^{Y}(G) = \int_{\mathbb{R}}\mathbf{1}_{G}(\mathrm{e}^{vx}-1)\,\Pi(\mathrm{d}x).

We thus conclude that if $X$ is a Lévy process then $Y=\mathcal{L}(\mathrm{e}^{vX})$ is again a Lévy process for all $v\in\mathbb{C}$. ∎

The next example illustrates the convenience of composing two representations without having to work with their predictable characteristics.

Example 3.4 (Maximization of exponential utility).

Fix a time horizon $T>0$, and assume that $X$ is a one-dimensional Lévy process given by (3.1)–(3.2). Consider an economy consisting of one bond with constant price 1 and of one risky asset with price process $S=\mathrm{e}^{X}$. Moreover, consider an agent with exponential utility function $u:w\mapsto-\mathrm{e}^{-w}$.

Since $X$ is assumed to have stationary and independent increments and since we consider an exponential utility function, it is reasonable to conjecture that the optimal portfolio is a constant dollar amount $\lambda\in\mathbb{R}$ invested in the risky asset. Denote by $R=\mathcal{L}(\mathrm{e}^{X})$ the cumulative yield on a $1 investment in the risky asset. Normalizing initial wealth to zero, the optimal wealth process equals $\lambda R$ and its expected utility equals $-\textsf{E}[\mathrm{e}^{-\lambda R_{T}}]$. In analogy to (1.3) the quantity $\textsf{E}[\mathrm{e}^{-\lambda R_{T}}]$ can be obtained via the time rate of the expected percentage change of $\mathrm{e}^{-\lambda R}$, which is nothing other than the drift rate of the process $\mathcal{L}(\mathrm{e}^{-\lambda R})$. Provided this drift, commonly denoted by $\kappa^{R}(-\lambda)$, is finite, we have $\textsf{E}[\mathrm{e}^{-\lambda R_{T}}]=\mathrm{e}^{\kappa^{R}(-\lambda)T}$, cf. (3.7), so maximizing expected utility amounts to minimizing $\kappa^{R}(-\lambda)$ over $\lambda$.

Formula (3.3) and the composition rule (2.5) give $\mathrm{d}R_{t}=\mathrm{d}\mathrm{e}^{X_{t}}/\mathrm{e}^{X_{t-}}=\mathrm{e}^{\mathrm{d}X_{t}}-1$ and

\mathrm{d}\mathcal{L}(\mathrm{e}^{-\lambda R})_{t} = \frac{\mathrm{d}\mathrm{e}^{-\lambda R_{t}}}{\mathrm{e}^{-\lambda R_{t-}}} = \mathrm{e}^{-\lambda\,\mathrm{d}R_{t}} - 1 = \mathrm{e}^{-\lambda(\mathrm{e}^{\mathrm{d}X_{t}}-1)} - 1.   (3.10)

For $\xi(x)=\mathrm{e}^{-\lambda(\mathrm{e}^{x}-1)}-1$ one has $\xi^{\prime}(0)=-\lambda$ and $\xi^{\prime\prime}(0)=\lambda^{2}-\lambda$. Hence the desired drift reads

\kappa^{R}(-\lambda) = b^{\mathcal{L}(\mathrm{e}^{-\lambda R})} = -\alpha\lambda + \frac{\sigma^{2}}{2}\bigl(\lambda^{2}-\lambda\bigr) + \int_{\mathbb{R}}\Bigl(\mathrm{e}^{-\lambda(\mathrm{e}^{x}-1)}-1+\lambda x\,\mathbf{1}_{|x|\leq 1}\Bigr)\Pi(\mathrm{d}x).   (3.11)

This expresses the cumulant function of $R$ by means of the jump intensity of the process $X$.

Under the non-restrictive assumptions of Fujiwara and Miyahara (2003, Corollary 3.4) the expression (3.11) is finite for all $\lambda\in\mathbb{R}$ and $\lambda\mapsto\kappa^{R}(-\lambda)$ has a unique minimizer $\lambda_{*}$ (Fujiwara and Miyahara 2003, Proposition 3.3). Under the same assumptions, $R$ is locally bounded and it follows from the results in Biagini and Černý (2011) that $\lambda_{*}/S_{-}$ is the optimal strategy in a sufficiently wide class of admissible strategies for trading in $S$; therefore $\lambda_{*}$ is the optimal dollar amount to be invested. ∎
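The optimization is easy to carry out numerically. The following Python sketch (model parameters and jump atoms assumed for illustration, with all jump sizes in $[-1,1]$ so that the indicator in (3.11) is always one) evaluates (3.11) on a grid and locates the optimal dollar amount $\lambda_{*}$.

import numpy as np
# assumed Lévy model for X: drift alpha, volatility sigma, finitely many jump atoms in [-1, 1]
alpha, sigma = 0.08, 0.2
jump_x = np.array([-0.25, 0.15])          # jump sizes of X
jump_intens = np.array([0.5, 0.7])        # Pi({x}) for each atom
def kappa_R(lam):
    # formula (3.11); the truncation indicator equals 1 because |x| <= 1 for all atoms
    jump_term = np.sum(jump_intens * (np.exp(-lam * (np.exp(jump_x) - 1)) - 1 + lam * jump_x))
    return -alpha * lam + 0.5 * sigma**2 * (lam**2 - lam) + jump_term
lams = np.linspace(-2.0, 6.0, 8001)
vals = np.array([kappa_R(l) for l in lams])
lam_star = lams[np.argmin(vals)]
print("optimal dollar amount lambda_*:", lam_star)
print("kappa^R(-lambda_*):            ", vals.min())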

Remark 3.5.

In Fujiwara and Miyahara (2003) the previous calculation is performed in two steps: first the characteristics of the yield process $R=\mathcal{L}(\mathrm{e}^{X})$ are computed and these are then plugged into the Lévy–Khintchine formula (3.6) of Example 3.1 to evaluate the cumulant function $\kappa^{R}(-\lambda)$, which after some cancellations and a change of variables gives (3.11). The two-stage procedure is akin to using $\mathrm{d}t,\mathrm{d}W$ notation which, too, forces the user to keep track of the characteristics at every step of a multistage calculation; see (1.1)–(1.2) and (1.4). The simplified stochastic calculus allows us to maintain a model-free formulation until the very end so that the drift calculation is performed only once, when the drift is finally needed. ∎

The online appendix Černý and Ruf (2020c) discusses affine models and the derivation of Riccati equations as an additional example.

4. Drift under a change of measure

Next we will demonstrate that the simplified calculus becomes very powerful when it comes to evaluating drifts under a different measure.

Example 4.1 (Minimal entropy martingale measure).

Let us continue in the economic setting of Example 3.4 with the stock price process $S=\mathrm{e}^{X}$, dollar yield process $R=\mathcal{L}(\mathrm{e}^{X})$, exponential utility $u:w\mapsto-\mathrm{e}^{-w}$, and optimal wealth process $\lambda_{*}R$. Under the assumptions of Fujiwara and Miyahara (2003, Corollary 3.4) the Radon–Nikodym derivative $Z_{T}=\nicefrac{\mathrm{d}\textsf{Q}}{\mathrm{d}\textsf{P}}|_{\mathscr{F}_{T}}$ of the representative-agent pricing measure is proportional to the marginal utility evaluated at the optimal wealth, that is, to $\mathrm{e}^{-\lambda_{*}R_{T}}$; see Fujiwara and Miyahara (2003, Theorem 3.1 and Corollary 4.4(3)). This Q is known in the literature as the minimal entropy martingale measure and the corresponding density process $Z$ satisfies $Z_{t}=\mathrm{e}^{-\lambda_{*}R_{t}-\kappa^{R}(-\lambda_{*})t}$ for all $t\in[0,T]$. The process $Z$ is a true martingale thanks to Corollary C.3.

To value contingent claims on the stock S=eXS=\mathrm{e}^{X}, it is necessary to compute the characteristic function of XX under Q. The required cumulant function κQX(v)\kappa_{\textsf{Q}}^{X}(v) is just the expected rate of change of evX\mathrm{e}^{vX} under Q, i.e., the Q–drift rate of (evX)t\mathcal{L}(e^{vX})_{t}. By Theorem C.2(ii) and Example 2.3 this Q–drift is the same as the P–drift of

d(evX)t+d[(evX),(eλR)]t\displaystyle\mathrm{d}\mathcal{L}(e^{vX})_{t}+\mathrm{d}\scalebox{1.2}{$[$}\mathcal{L}(e^{vX}),\mathcal{L}(\mathrm{e}^{-\lambda_{*}R})\scalebox{1.2}{$]$}_{t} =d(evX)t+d(evX)td(eλR)t\displaystyle=\mathrm{d}\mathcal{L}(e^{vX})_{t}+\mathrm{d}\mathcal{L}(e^{vX})_{t}\mathrm{d}\mathcal{L}(\mathrm{e}^{-\lambda_{*}R})_{t}
=(evdXt1)eλ(edXt1),\displaystyle=(\mathrm{e}^{v\mathrm{d}X_{t}}-1)\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{\mathrm{d}X_{t}}-1)},

where the second equality combines the representations of (evX)\mathcal{L}(e^{vX}) and (eλR)\mathcal{L}(\mathrm{e}^{-\lambda_{*}R}) obtained earlier in (3.3) and (3.10).

The function ψ(x)=(evx1)eλ(ex1)\psi(x)=(\mathrm{e}^{vx}-1)\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{x}-1)} satisfies ψ(0)=v\psi^{\prime}(0)=v and ψ′′(0)=v22λv\psi^{\prime\prime}(0)=v^{2}-2\lambda_{*}v. Consequently, if the drift exists, the P–drift rate of ψX\psi\circ X reads

bψX=vbX[h]+cX2(v22λv)+(ψ(x)vh(x))FX(dx).b^{\psi\circ X}=vb^{X[h]}+\frac{c^{X}}{2}\left(v^{2}-2\lambda_{*}v\right)+\int_{\mathbb{R}}\left(\psi(x)-vh(x)\right)F^{X}(\mathrm{d}x). (4.1)

In the Lévy setting this yields

κQX(v)=bQ(evX)=b(evX)+[(evX),(eλR)]=αv+σ22(v22λv)+((evx1)eλ(ex1)vx𝟏|x|1)Π(dx),\begin{split}\kappa_{\textsf{Q}}^{X}(v)&=b_{\textsf{Q}}^{\mathcal{L}(\mathrm{e}^{vX})}=b^{\mathcal{L}(\mathrm{e}^{vX})+[\mathcal{L}(\mathrm{e}^{vX}),\mathcal{L}(\mathrm{e}^{-\lambda_{*}R})]}\\ &=\alpha v+\frac{\sigma^{2}}{2}\left(v^{2}-2\lambda_{*}v\right)+\int_{\mathbb{R}}\left((e^{vx}-1)\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{x}-1)}-vx\mathbf{1}_{\lvert x\rvert\leq 1}\right)\Pi(\mathrm{d}x),\end{split} (4.2)

whenever the integral on the right-hand side is finite. ∎
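Continuing the numerical sketch given after Example 3.4 (same illustrative Merton-type triplet and the $\lambda_{*}$ computed there), the following lines evaluate $\kappa_{\textsf{Q}}^{X}(v)$ in (4.2) for real $v$; complex $v$ would be handled by splitting the integrand into real and imaginary parts. The sanity check at $v=1$ uses the fact that, with zero interest rates, $S=\mathrm{e}^{X}$ is a Q–martingale under the minimal entropy martingale measure, so $\kappa_{\textsf{Q}}^{X}(1)$ should vanish.

```python
# Continuation of the earlier sketch (alpha, sigma, lam_J, m, s, res as above).
lam_star = res.x  # optimal dollar amount found earlier

def kappa_Q_X(v):
    """Q-cumulant function kappa_Q^X(v) of (4.2), for real v."""
    def integrand(x):
        return ((np.exp(v * x) - 1.0) * np.exp(-lam_star * (np.exp(x) - 1.0))
                - v * x * (abs(x) <= 1.0)) * lam_J * norm.pdf(x, m, s)
    jumps, _ = quad(integrand, m - 8.0 * s, m + 8.0 * s)
    return alpha * v + 0.5 * sigma**2 * (v**2 - 2.0 * lam_star * v) + jumps

# Sanity check: the stock e^X is a Q-martingale, hence kappa_Q^X(1) = 0
# up to numerical error.
print(f"kappa_Q^X(1) ~ {kappa_Q_X(1.0):.2e}")
```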

Remark 4.2.

The standard calculus using the formulation (3.1) requires much more work. First, one must find an explicit expression for lnZ\ln Z, which after a significant effort reads

lnZ=0λσdWs120λ2σ2ds+0λ(ex1)N^(ds,dx)+0(λ(ex1)(eλ(ex1)1))Π(dx)ds,\begin{split}\ln Z=&-\int_{0}^{\cdot}\lambda_{*}\sigma\mathrm{d}W_{s}-\frac{1}{2}\int_{0}^{\cdot}\lambda_{*}^{2}\sigma^{2}\mathrm{d}s+\int_{0}^{\cdot}\int_{\mathbb{R}}-\lambda_{*}(\mathrm{e}^{x}-1)\widehat{N}(\mathrm{d}s,\mathrm{d}x)\\ &+\int_{0}^{\cdot}\int_{\mathbb{R}}\left(-\lambda_{*}(\mathrm{e}^{x}-1)-\left(\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{x}-1)}-1\right)\right)\Pi(\mathrm{d}x)\mathrm{d}s,\end{split}

assuming lnZ\ln Z has finite mean. Next, one constructs a new Brownian motion for the measure Q,

dWtQ=dWt+λσdt,\mathrm{d}W^{\textsf{Q}}_{t}=\mathrm{d}W_{t}+\lambda_{*}\sigma\mathrm{d}t,

and a new compensated Poisson jump measure

N^Q(dt,dx)=N^(dt,dx)+(1eλ(ex1))Π(dx)dt,\widehat{N}^{\textsf{Q}}(\mathrm{d}t,\mathrm{d}x)=\widehat{N}(\mathrm{d}t,\mathrm{d}x)+\left(1-\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{x}-1)}\right)\Pi(\mathrm{d}x)\mathrm{d}t,

both using a custom-made formula, see Applebaum (2009, Theorem  5.2.12 and Exercise 5.2.14) and Øksendal and Sulem (2007, Theorem 1.32 and Lemma 1.33).

These quantities are then substituted into (3.9) to obtain

(evX)=0(αv+σ22(v22λv))ds+0σvdWsQ+0(evx1)N^Q(dx,ds)+0|x|1(eλ(ex1)(evx1)vx)Π(dx)ds+0|x|>1eλ(ex1)(evx1)Π(dx)ds,\begin{split}\mathcal{L}(\mathrm{e}^{vX})&=\int_{0}^{\cdot}\left(\alpha v+\frac{\sigma^{2}}{2}\left(v^{2}-2\lambda_{*}v\right)\right)\mathrm{d}s\\ &\qquad+\int_{0}^{\cdot}\sigma v\mathrm{d}W^{\textsf{Q}}_{s}+\int_{0}^{\cdot}\int_{\mathbb{R}}(\mathrm{e}^{vx}-1)\widehat{N}^{\textsf{Q}}(\mathrm{d}x,\mathrm{d}s)\\ &\qquad+\int_{0}^{\cdot}\int_{|x|\leq 1}\left(\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{x}-1)}(\mathrm{e}^{vx}-1)-vx\right)\Pi(\mathrm{d}x)\mathrm{d}s\\ &\qquad+\int_{0}^{\cdot}\int_{|x|>1}\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{x}-1)}(\mathrm{e}^{vx}-1)\Pi(\mathrm{d}x)\mathrm{d}s,\end{split} (4.3)

provided the Q–drift of $\mathcal{L}(\mathrm{e}^{vX})$ exists. The drift is now obtained by summing up the first, fourth, and fifth terms in (4.3). In (4.2) the same result is available directly after plugging the specific form $h(x)=x\mathbf{1}_{|x|\leq 1}$ and the characteristics (3.2) into the formula (4.1). The main difference between the two approaches is that (4.1) is more compact and arguably much easier to obtain than (4.3). ∎

We conclude this section with a bivariate example that makes use of a non-equivalent change of measure.

Example 4.3 (An option to exchange one defaultable asset for another).

Fix d=2d=2 and let X=(X(1),X(2))X=(X^{(1)},X^{(2)}) be an 2\mathbb{R}^{2}–valued Lévy martingale with the characteristic triplet

([00],[σ12σ12σ12σ22],Π)\left(\left[\begin{array}[]{cc}0\\ 0\end{array}\right],\left[\begin{array}[]{cc}\sigma_{1}^{2}&\sigma_{12}\\ \sigma_{12}&\sigma_{2}^{2}\end{array}\right],\Pi\right)

relative to the truncation function h(x)=xh(x)=x.

Consider next two assets with price dynamics given by stochastic exponentials (see B.2) as

S(1)=S0(1)(X(1))=S0(1)+0St(1)dXt(1);S(2)=S0(2)(X(2))=S0(2)+0St(2)dXt(2).S^{(1)}=S^{(1)}_{0}\mathscr{E}\scalebox{1.2}{$($}X^{(1)}\scalebox{1.2}{$)$}=S^{(1)}_{0}+\int_{0}^{\cdot}S^{(1)}_{t-}\mathrm{d}X^{(1)}_{t};\qquad S^{(2)}=S^{(2)}_{0}\mathscr{E}\scalebox{1.2}{$($}X^{(2)}\scalebox{1.2}{$)$}=S^{(2)}_{0}+\int_{0}^{\cdot}S^{(2)}_{t-}\mathrm{d}X^{(2)}_{t}.

In financial economics one interprets $\mathscr{E}(X)$ as the value of a closed-end fund with an initial investment of \$1 following a trading strategy whose cumulative rate of return equals $X$. For the sake of simplicity, we assume the existence of a risk-free asset that pays zero interest. We furthermore assume that the Lévy measure $\Pi$ is supported on $[-1,\infty)\times[-1,\infty)$, meaning that both assets can default, perhaps simultaneously.

To value an option to exchange asset S(1)S^{(1)} for asset S(2)S^{(2)} on a specific date TT, one must compute the expectation

p=E[(ST(1)ST(2))+];p=\textsf{E}\left[\left(S^{(1)}_{T}-S^{(2)}_{T}\right)^{+}\right];

see Margrabe (1978). Here the expectation is taken with respect to the valuation measure with the risk-free asset as numéraire, which explains why XX is a martingale.

Let $\textsf{Q}_{k}$ be the valuation measure with $S^{(k)}$ as numéraire, that is, $\nicefrac{\mathrm{d}\textsf{Q}_{k}}{\mathrm{d}\textsf{P}}=\nicefrac{S^{(k)}_{T}}{S^{(k)}_{0}}$ for each $k\in\{1,2\}$. Then one obtains an alternative expression for the price of the Margrabe option, namely888Note that $\textsf{Q}_{k}$ is not necessarily equivalent but only absolutely continuous with respect to P since the event $\{S^{(k)}_{T}=0\}$ is allowed to have positive probability under P. Nonetheless, formula (4.4) remains valid because the option payoff is zero whenever $S^{(1)}_{T}$ is zero. For other applications of defaultable numéraires see, for example, Fisher et al. (2019) and the references therein.

p=S0(1)EQ1[(1ST(2)ST(1))+].p=S^{(1)}_{0}\textsf{E}^{\textsf{Q}_{1}}\left[\left(1-\frac{S^{(2)}_{T}}{S^{(1)}_{T}}\right)^{\!\!+}\,\right]. (4.4)

To evaluate (4.4) by integral transform methods one needs to compute the expectation

EQ1[𝟏{ST(2)>0}(ST(2)ST(1))v]\textsf{E}^{\textsf{Q}_{1}}\left[\mathbf{1}_{\left\{S_{T}^{(2)}>0\right\}}\left(\frac{S^{(2)}_{T}}{S^{(1)}_{T}}\right)^{\!\!v}\,\right] (4.5)

for certain values vv\in\mathbb{C}. Let us fix such vv. In the absence of default (of either asset), the computation of (4.5) is achieved by evaluating the expected rate of change of V=𝟏{ST(2)>0}(S(2)/S(1))vV=\mathbf{1}_{\left\{S_{T}^{(2)}>0\right\}}\left(\nicefrac{{S^{(2)}}}{{S^{(1)}}}\right)^{v} under the measure Q1\textsf{Q}_{1}, in analogy to Example 4.1. This is an easy exercise in the simplified stochastic calculus: for a semimartingale YY with Y>0Y>0 and Y>0Y_{-}>0 one obtains

dYtvYtv=(Yt+dYt)vYtvYtv=(1+dYtYt)v1,\frac{\mathrm{d}Y_{t}^{v}}{Y_{t-}^{v}}=\frac{(Y_{t-}+\mathrm{d}Y_{t})^{v}-Y_{t-}^{v}}{Y_{t-}^{v}}=\left(1+\frac{\mathrm{d}Y_{t}}{Y_{t-}}\right)^{v}-1,

which yields

(Yv)=((1+id)v1)(Y).\mathcal{L}(Y^{v})=((1+{\operatorname{id}})^{v}-1)\circ\mathcal{L}(Y).

Composition with (S(2)/S(1))=((1+id2)/(1+id1)1)((S(1)),(S(2)))\mathcal{L}(\nicefrac{{S^{(2)}}}{{S^{(1)}}})=\left(\nicefrac{{(1+{\operatorname{id}}_{2})}}{{(1+{\operatorname{id}}_{1})}}-1\right)\circ(\mathcal{L}(S^{(1)}),\mathcal{L}(S^{(2)})) then gives

(V)=((1+id21+id1)v1)X.\mathcal{L}(V)=\left(\left(\frac{1+{\operatorname{id}}_{2}}{1+{\operatorname{id}}_{1}}\right)^{\!\!v}-1\right)\circ X. (4.6)

Representation (4.6) together with (S(1))=X(1)=id1X\mathcal{L}\scalebox{1.2}{$($}S^{(1)}\scalebox{1.2}{$)$}=X^{(1)}={\operatorname{id}}_{1}\circ X yields

\mathcal{L}(V)+\scalebox{1.2}{$[$}\mathcal{L}(V),\mathcal{L}\scalebox{1.2}{$($}S^{(1)}\scalebox{1.2}{$)$}\scalebox{1.2}{$]$}=(1+{\operatorname{id}}_{1})\left(\left(\frac{1+{\operatorname{id}}_{2}}{1+{\operatorname{id}}_{1}}\right)^{\!\!v}-1\right)\circ X. (4.7)

Let us denote the function appearing on the right-hand side of (4.7) by ψ\psi. In conclusion, without default (neither S(1)S^{(1)} nor S(2)S^{(2)} is allowed to hit zero) one obtains from Corollary C.3(ii) that

EQ1[𝟏{ST(2)>0}(ST(2)ST(1))v]=(S0(2)S0(1))vexp(bψXT).\textsf{E}^{\textsf{Q}_{1}}\left[\mathbf{1}_{\left\{S_{T}^{(2)}>0\right\}}\left(\frac{S^{(2)}_{T}}{S^{(1)}_{T}}\right)^{\!\!v}\,\right]=\left(\frac{S^{(2)}_{0}}{S^{(1)}_{0}}\right)^{v}\exp\left(b^{\psi\circ X}T\right). (4.8)

In the presence of default (i.e., if either S(1)S^{(1)} or S(2)S^{(2)} may hit zero), VV is no longer a P–semimartingale. However,

S(1)=S0(1)(𝟏id11id1X)S_{\uparrow}^{(1)}=S^{(1)}_{0}\mathscr{E}\scalebox{1.2}{$($}\mathbf{1}_{{\operatorname{id}}_{1}\neq-1}\,{\operatorname{id}}_{1}\circ X\scalebox{1.2}{$)$}

is Q1\textsf{Q}_{1}–indistinguishable from S(1)S^{(1)} with S(1)>0S_{\uparrow}^{(1)}>0 and S(1)>0S_{\uparrow-}^{(1)}>0, P–a.s. Therefore, the process

V=𝟏{S(2)>0}(S(2)S(1))v\displaystyle V_{\uparrow}=\mathbf{1}_{\left\{S^{(2)}>0\right\}}\left(\frac{S^{(2)}}{S_{\uparrow}^{(1)}}\right)^{\!\!v} =(S0(2)S0(1))v(𝟏id2=1X)((id2𝟏id21X)(id1𝟏id11X))v\displaystyle=\left(\frac{S^{(2)}_{0}}{S^{(1)}_{0}}\right)^{\!\!v}\mathscr{E}\left(-\mathbf{1}_{{\operatorname{id}}_{2}=-1}\circ X\right)\left(\frac{\mathscr{E}\left({\operatorname{id}}_{2}\mathbf{1}_{{\operatorname{id}}_{2}\neq-1}\circ X\right)}{\mathscr{E}\left({\operatorname{id}}_{1}\mathbf{1}_{{\operatorname{id}}_{1}\neq-1}\circ X\right)}\right)^{\!\!v}
=(S0(2)S0(1))v(((1+id2𝟏id211+id1𝟏id11)v𝟏id211)X)\displaystyle=\left(\frac{S^{(2)}_{0}}{S^{(1)}_{0}}\right)^{\!\!v}\mathscr{E}\left(\left(\left(\frac{1+{\operatorname{id}}_{2}\mathbf{1}_{{\operatorname{id}}_{2}\neq-1}}{1+{\operatorname{id}}_{1}\mathbf{1}_{{\operatorname{id}}_{1}\neq-1}}\right)^{\!\!v}\mathbf{1}_{{\operatorname{id}}_{2}\neq-1}-1\right)\circ X\right)

is a P–semimartingale Q1\textsf{Q}_{1}–indistinguishable from VV. Corollary C.3(ii) shows that (4.8) goes through with a modified jump transformation function

ψ(x1,x2)=(1+x1)(𝟏x21(1+𝟏x21x21+𝟏x11x1)v1).\psi\left(x_{1},x_{2}\right)=\left(1+x_{1}\right)\left(\mathbf{1}_{x_{2}\neq-1}\left(\frac{1+\mathbf{1}_{x_{2}\neq-1}x_{2}}{1+\mathbf{1}_{x_{1}\neq-1}x_{1}}\right)^{\!\!v}-1\right). (4.9)

We now proceed to compute the drift rate bψXb^{\psi\circ X} with ψ\psi in (4.9). To this end, note that

Dψ(0)=v[11];D2ψ(0)=v(v1)[1111].\displaystyle D\psi(0)=v\left[\begin{array}[]{cc}-1&1\end{array}\right];\qquad D^{2}\psi(0)=v(v-1)\left[\begin{array}[]{cc}1&-1\\ -1&1\end{array}\right].

Next, formula (2.14) with h(x)=xh(x)=x for all xx\in\mathbb{R} yields

bψX=\displaystyle b^{\psi\circ X}= 12(σ122σ12+σ22)v(v1)\displaystyle\frac{1}{2}\left(\sigma_{1}^{2}-2\sigma_{12}+\sigma_{2}^{2}\right)v(v-1)
+2((1+x1)((1+x21+x1)v𝟏x211)+vx1vx2)Π(dx1,dx2)\displaystyle+\int_{\mathbb{R}^{2}}\left(\left(1+x_{1}\right)\left(\left(\frac{1+x_{2}}{1+x_{1}}\right)^{\!\!v}\mathbf{1}_{x_{2}\neq-1}-1\right)+vx_{1}-vx_{2}\right)\Pi\left(\mathrm{d}x_{1},\mathrm{d}x_{2}\right)
=\displaystyle= 12(σ122σ12+σ22)v(v1)λ2Q1+v(λ2Q1λ1Q2)\displaystyle\frac{1}{2}\left(\sigma_{1}^{2}-2\sigma_{12}+\sigma_{2}^{2}\right)v(v-1)-\lambda_{2}^{\textsf{Q}_{1}}+v\left(\lambda_{2}^{\textsf{Q}_{1}}-\lambda_{1}^{\textsf{Q}_{2}}\right)
+(1,)2((1+x1)((1+x21+x1)v1)+vx1vx2)Π(dx1,dx2),\displaystyle+\int_{(-1,\infty)^{2}}\left(\left(1+x_{1}\right)\left(\left(\frac{1+x_{2}}{1+x_{1}}\right)^{\!\!v}-1\right)+vx_{1}-vx_{2}\right)\Pi\left(\mathrm{d}x_{1},\mathrm{d}x_{2}\right),

as long as the expectation in (4.8) is finite. Here, the coefficient

λ2Q1=2(1+x1)𝟏x2=1Π(dx1,dx2)\lambda_{2}^{\textsf{Q}_{1}}=\int_{\mathbb{R}^{2}}\left(1+x_{1}\right)\mathbf{1}_{x_{2}=-1}\Pi\left(\mathrm{d}x_{1},\mathrm{d}x_{2}\right)

signifies the arrival intensity of default of asset 22 under the probability measure Q1\textsf{Q}_{1} and λ1Q2\lambda_{1}^{\textsf{Q}_{2}} has the converse meaning.999The coefficient λ2Q1\lambda_{2}^{\textsf{Q}_{1}} is the drift rate of the process 𝟏id2=1X\mathbf{1}_{{\operatorname{id}}_{2}=-1}\circ X under Q1\textsf{Q}_{1}; see Theorem C.2(ii). Observe that without default κ(v)=bψX\kappa(v)=b^{\psi\circ X} can be interpreted as the cumulant function of lnS1(2)/S1(1)lnS0(2)/S0(1)\ln\nicefrac{{S^{(2)}_{1}}}{{S^{(1)}_{1}}}-\ln\nicefrac{{S^{(2)}_{0}}}{{S^{(1)}_{0}}} under Q1\textsf{Q}_{1}.

For concreteness let us now assume that, in the absence of default, our model follows a bivariate Merton (1976) jump-diffusion. In other words, on the open interval (1,)×(1,)(-1,\infty)\times(-1,\infty), the measure Π\Pi is a fixed multiple λ0\lambda\geq 0 of a push-forward measure of a bivariate normal distribution with parameters

([m1m2],[s12s12s12s22])\left(\left[\begin{array}[]{cc}m_{1}\\ m_{2}\end{array}\right],\left[\begin{array}[]{cc}s_{1}^{2}&s_{12}\\ s_{12}&s_{2}^{2}\end{array}\right]\right)

through the mapping (eid11,eid21)(\mathrm{e}^{{\operatorname{id}}_{1}}-1,\mathrm{e}^{{\operatorname{id}}_{2}}-1). Once the integrals have been evaluated one obtains

κ(v)=bψX=\displaystyle\kappa(v)=b^{\psi\circ X}\ =\ 12(σ122σ12+σ22)v(v1)λ2Q1\displaystyle\frac{1}{2}\left(\sigma_{1}^{2}-2\sigma_{12}+\sigma_{2}^{2}\right)v(v-1)-\lambda_{2}^{\textsf{Q}_{1}}
+v(λ(em1+12s11em2+12s22)+λ2Q1λ1Q2)\displaystyle+v\left(\lambda\left(\mathrm{e}^{m_{1}+\frac{1}{2}s_{11}}-\mathrm{e}^{m_{2}+\frac{1}{2}s_{22}}\right)+\lambda_{2}^{\textsf{Q}_{1}}-\lambda_{1}^{\textsf{Q}_{2}}\right)
+λe(1v)m1+vm2+12(1v)2s11+v(1v)s12+12v2s22λem1+12s11.\displaystyle+\lambda\mathrm{e}^{(1-v)m_{1}+vm_{2}+\frac{1}{2}(1-v)^{2}s_{11}+v(1-v)s_{12}+\frac{1}{2}v^{2}s_{22}}-\lambda\mathrm{e}^{m_{1}+\frac{1}{2}s_{11}}.

Continuing now with the Fourier transform, Hubalek et al. (2006, Lemma 4.1) yields

(1x)+=𝟏x=0+𝟏x>0β+ig(v)xvdv,x0,(1-x)^{+}=\mathbf{1}_{x=0}+\mathbf{1}_{x>0}\int_{\beta+i\mathbb{R}}g(v)x^{v}\mathrm{d}v,\qquad x\geq 0,

where g(v)=12πi1v(v1)g(v)=\frac{1}{2\pi i}\frac{1}{v(v-1)} and β<0\beta<0. Consequently, using (4.4), the price of the Margrabe option is given as

pS0(1)\displaystyle\frac{p}{S^{(1)}_{0}} =Q1[ST(2)=0]+β+ig(v)EQ1[𝟏{ST(2)>0}(ST(2)ST(1))v]dv\displaystyle=\textsf{Q}_{1}\left[S_{T}^{(2)}=0\right]+\int_{\beta+i\mathbb{R}}g(v)\textsf{E}^{\textsf{Q}_{1}}\left[\mathbf{1}_{\left\{S_{T}^{(2)}>0\right\}}\left(\frac{S_{T}^{(2)}}{S_{T}^{(1)}}\right)^{\!\!v}\right]\mathrm{d}v
\displaystyle=1-\mathrm{e}^{\kappa(0)T}+\int_{\beta+i\mathbb{R}}\left(\frac{S_{0}^{(2)}}{S_{0}^{(1)}}\right)^{\!\!v}\ \frac{1}{2\pi i}\frac{1}{v(v-1)}\mathrm{e}^{\kappa(v)T}\mathrm{d}v,

where $\kappa(0)=-\lambda_{2}^{\textsf{Q}_{1}}$, so that $1-\mathrm{e}^{\kappa(0)T}=\textsf{Q}_{1}[S_{T}^{(2)}=0]$ by (4.8) evaluated at $v=0$. The integrals are well-defined and Fubini may be applied in the first equality because the function $v\mapsto|g(v)|\mathbf{1}_{\{S_{T}^{(2)}>0\}}(\nicefrac{S_{T}^{(2)}}{S_{T}^{(1)}})^{\operatorname{Re}v}$ is integrable with respect to the product of the Lebesgue measure on $\beta+i\mathbb{R}$ and $\textsf{Q}_{1}$. ∎
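As an illustration of the final pricing formula, the following Python sketch implements the closed-form $\kappa(v)$ of the bivariate Merton specification and evaluates the contour integral along $\beta+i\mathbb{R}$. All parameter values are our own illustrative choices; as a cross-check, switching off jumps and default risk recovers the classical Margrabe (1978) formula.

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Illustrative parameters (our choice) for the bivariate Merton-type model.
sigma1, sigma2, sigma12 = 0.25, 0.35, 0.25 * 0.35 * 0.3           # diffusion part
lam, m1, m2, s11, s12, s22 = 0.4, -0.05, -0.10, 0.03, 0.01, 0.05  # joint jump part
lam2_Q1, lam1_Q2 = 0.02, 0.03                                     # default intensities
S01, S02, T, beta = 1.0, 0.9, 1.0, -0.5

def kappa(v):
    """Closed-form kappa(v) from Example 4.3; v may be complex."""
    diff = 0.5 * (sigma1**2 - 2.0 * sigma12 + sigma2**2) * v * (v - 1.0)
    lin = v * (lam * (np.exp(m1 + 0.5 * s11) - np.exp(m2 + 0.5 * s22))
               + lam2_Q1 - lam1_Q2)
    jump = lam * np.exp((1.0 - v) * m1 + v * m2 + 0.5 * (1.0 - v)**2 * s11
                        + v * (1.0 - v) * s12 + 0.5 * v**2 * s22) \
           - lam * np.exp(m1 + 0.5 * s11)
    return diff - lam2_Q1 + lin + jump

def margrabe_price():
    """Exchange option price via the contour integral along beta + i*R."""
    ratio = S02 / S01
    def integrand(u):
        v = beta + 1j * u
        val = ratio**v * np.exp(kappa(v) * T) / (v * (v - 1.0)) / (2.0 * np.pi)
        return val.real  # imaginary parts cancel by the symmetry u -> -u
    integral, _ = quad(integrand, -100.0, 100.0, limit=500)
    return S01 * (1.0 - np.exp(kappa(0.0) * T) + integral)

print(f"price with jumps and default risk ~ {margrabe_price():.6f}")

# Cross-check: without jumps and default risk the price reduces to the
# classical Margrabe (1978) formula.
lam, lam2_Q1, lam1_Q2 = 0.0, 0.0, 0.0
sbar = np.sqrt((sigma1**2 - 2.0 * sigma12 + sigma2**2) * T)
d1 = (np.log(S01 / S02) + 0.5 * sbar**2) / sbar
print(f"contour integral ~ {margrabe_price():.6f}")
print(f"Margrabe formula ~ {S01 * norm.cdf(d1) - S02 * norm.cdf(d1 - sbar):.6f}")
```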

5. Jumps at predictable times

This section illustrates the simplified stochastic calculus in a discrete-time model. The examples below preserve the independent increments feature of the Brownian and the Lévy-based examples in Subsection 1.1 and Sections 3 and 4. This forces the jumps to occur at fixed times. For simplicity we assume these times can be enumerated in an increasing sequence without an accumulation point. In general, the jumps could occur at all rational times, and if we dropped the independent increments assumption, also at random predictable times. These advanced features are handled in full generality in the companion papers Černý and Ruf (2020b, a).

Example 5.1 (Maximization of expected utility).

Denote by S>0S>0 the value of a risky asset and assume the logarithmic price X=lnSX=\ln S is a discrete-time process (Definition B.5) with independent and identically distributed increments. Namely, for each kk\in\mathbb{N} we let the distribution of ΔXk\Delta X_{k} take three values, ln1.1,0\ln 1.1,0, and ln0.9\ln 0.9, with probabilities pup_{u}, pmp_{m}, and pdp_{d}, respectively. With zero risk-free rate the value of a fund investing $1 in the risky asset equals R=(eX)R=\mathcal{L}(\mathrm{e}^{X}).

To evaluate the expected utility

E[eλRt],t0,\textsf{E}\left[\mathrm{e}^{-\lambda R_{t}}\right],\qquad t\geq 0,

we recall from Example 3.4 the representation of the cumulative percentage change in eλR\mathrm{e}^{-\lambda R}. Specifically, from (3.10) one obtains (eλR)=ηX\mathcal{L}(\mathrm{e}^{-\lambda R})=\eta\circ X with

η(x)=eλ(ex1)1.\eta(x)=\mathrm{e}^{-\lambda(\mathrm{e}^{x}-1)}-1. (5.1)

Formulae (B.4) and (B.5) now yield

E[eλRt]=k=1tE[1+ηk(ΔXk)]=k=1tE[eλ(eΔXk1)]=(pue0.1λ+pm+pde0.1λ)t,\begin{split}\textsf{E}\left[\mathrm{e}^{-\lambda R_{t}}\right]&=\prod_{k=1}^{\lfloor t\rfloor}\textsf{E}\left[1+\eta_{k}(\Delta X_{k})\right]\\ &=\prod_{k=1}^{\lfloor t\rfloor}\textsf{E}\left[\mathrm{e}^{-\lambda(\mathrm{e}^{\Delta X_{k}}-1)}\right]=\left(p_{u}\mathrm{e}^{-0.1\lambda}+p_{m}+p_{d}\mathrm{e}^{0.1\lambda}\right)^{\lfloor t\rfloor},\end{split} (5.2)

for all t0t\geq 0. ∎
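For a quick numerical illustration of (5.2), the following sketch compares the closed-form expression with a Monte Carlo estimate; the probabilities, the dollar amount $\lambda$, and the horizon are our illustrative choices.

```python
import numpy as np

# Illustrative probabilities of the three yields 0.1, 0, -0.1 and a dollar amount.
pu, pm, pd = 0.3, 0.45, 0.25
lam, t = 2.0, 5

# Closed form (5.2)
closed_form = (pu * np.exp(-0.1 * lam) + pm + pd * np.exp(0.1 * lam)) ** t

# Monte Carlo cross-check: R_t is the sum of the i.i.d. yields e^{Delta X_k} - 1.
rng = np.random.default_rng(seed=0)
yields = rng.choice([0.1, 0.0, -0.1], size=(200_000, t), p=[pu, pm, pd])
monte_carlo = np.exp(-lam * yields.sum(axis=1)).mean()

print(f"closed form  {closed_form:.6f}")
print(f"Monte Carlo  {monte_carlo:.6f}")
```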

Example 5.2 (Minimal entropy martingale measure).

We now compute the minimal entropy martingale measure in the setting of the previous example for some given time horizon $T>0$. Minimizing (5.2) over $\lambda$, one obtains an explicit expression for the optimal dollar amount in the risky asset, namely

λ=ln(pu/pd)0.2.\lambda_{*}=\frac{\ln(\nicefrac{{p_{u}}}{{p_{d}}})}{0.2}.

Then the random variable eλRT/E[eλRT]\nicefrac{{\mathrm{e}^{-\lambda_{*}R_{T}}}}{{\textsf{E}[\mathrm{e}^{-\lambda_{*}R_{T}}]}} gives the density dQ/dP\nicefrac{{\mathrm{d}\textsf{Q}}}{{\mathrm{d}\textsf{P}}} of the minimal entropy martingale measure Q. As in Example 4.1, we seek the expected value of evXt\mathrm{e}^{vX_{t}} under Q, for fixed t[0,T]t\in[0,T] and vv\in\mathbb{C}, provided the expectation is finite. Because (evX)=ξX\mathcal{L}(\mathrm{e}^{vX})=\xi\circ X with

ξ(x)=evx1,\xi(x)=\mathrm{e}^{vx}-1,

the desired expectation is given by Corollary C.3(i) with η\eta from (5.1) as follows,

EQ[evXt]\displaystyle\textsf{E}^{\textsf{Q}}\left[\mathrm{e}^{vX_{t}}\right] =k=1tE[(1+ηk(ΔXk))(1+ξk(ΔXk))]E[1+ηk(ΔXk)]\displaystyle=\prod_{k=1}^{\lfloor t\rfloor}\frac{\textsf{E}\left[(1+\eta_{k}(\Delta X_{k}))(1+\xi_{k}(\Delta X_{k}))\right]}{\textsf{E}\left[1+\eta_{k}(\Delta X_{k})\right]}
=k=1tE[eλ(eΔXk1)evΔXk]E[eλ(eΔXk1)]=((1.1v+0.9v)pupd+pm2pupd+pm)t,t0.\displaystyle=\prod_{k=1}^{\lfloor t\rfloor}\frac{\textsf{E}\left[\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{\Delta X_{k}}-1)}\mathrm{e}^{v\Delta X_{k}}\right]}{\textsf{E}\left[\mathrm{e}^{-\lambda_{*}(\mathrm{e}^{\Delta X_{k}}-1)}\right]}=\left(\frac{(1.1^{v}+0.9^{v})\sqrt[\ ]{p_{u}p_{d}}+p_{m}}{2\sqrt[\ ]{p_{u}p_{d}}+p_{m}}\right)^{\lfloor t\rfloor}\!\!\!\!\!,\quad t\geq 0.

Here $\textsf{E}^{\textsf{Q}}[\mathrm{e}^{vX_{t}}]$, considered as a function of $v\in\mathbb{C}$, gives the moment generating function of $X_{t}=\ln S_{t}$ under Q for each $t\geq 0$ and can therefore be used to price contingent claims by integral transform methods. ∎
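The resulting Q–moment generating function is easy to evaluate numerically; a small sketch, again with our illustrative probabilities and assuming $X_{0}=0$, also verifies the martingale property of $S=\mathrm{e}^{X}$ under Q by checking that the value at $v=1$ equals one.

```python
import numpy as np

# Illustrative probabilities as in Example 5.1; X_0 = 0 is assumed.
pu, pm, pd, t = 0.3, 0.45, 0.25, 5

def mgf_Q(v):
    """E_Q[e^{v X_t}] for the trinomial model under the minimal entropy measure."""
    root = np.sqrt(pu * pd)
    base = ((1.1 ** v + 0.9 ** v) * root + pm) / (2.0 * root + pm)
    return base ** t

print(mgf_Q(1.0))   # = 1.0: S = e^X is a Q-martingale
print(mgf_Q(2.0))   # second moment of S_t under Q
```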

6. Concluding remarks

In this paper we have introduced the notion of ‘$X$–representation’ to describe a generic modelling situation where one starts from a (multivariate) process $X$ whose predictable P–characteristics are given as the primitive input to the problem. The process $X$, which is trivially representable, is transformed by several applications of the composition rules (2.1)–(2.2) into another process $Y$ which is also $X$–representable. In many situations the required end product is the P–drift of $Y$. Such examples include i) the construction of partial integro-differential equations from martingale criteria (e.g., Večeř and Xu 2004, Theorem 3.3); ii) the computation of exponential compensators (e.g., Duffie et al. 2003, Proposition 11.2); iii) the formulation of optimality conditions for various dynamic optimization problems (e.g., Øksendal and Sulem 2007, Theorem 3.1(v)).

Existing methods force us to keep track of the characteristics (drift, volatility, and jump intensities) throughout all intermediate calculations (e.g., Øksendal and Sulem 2007, Theorem 1.14). One of the drawbacks of describing processes via their characteristic triplets is that the drift and the jump intensities are measure-dependent and that the drift additionally depends on the truncation function $h$. The new calculus, in contrast, works with $X$–representations, which themselves do not depend on the characteristics in an overt way. This makes individual steps such as a change of variables much simpler and the overall calculus more transparent and easier to use. An $X$–representation is converted into a drift only when the drift is really needed.

The proposed calculus emphasizes the universal nature of transformations such as stochastic integration or change of variables, which can typically be applied in the same way to any starting process $X$. For example, the conversion from the rate of return $\nicefrac{\mathrm{d}X_{t}}{X_{t-}}$ to the logarithmic return $\mathrm{d}\ln X_{t}$ always takes the form $\mathrm{d}\ln X_{t}=\ln\left(1+\nicefrac{\mathrm{d}X_{t}}{X_{t-}}\right)$. Robust results such as this are helpful in two ways. First, they offer an easy way to visualize fundamental relationships and to separate what is fundamental from what is model-specific. Secondly, they open an avenue for studying richer models where, say, a Brownian motion is replaced with a more general process with independent increments. In the proposed calculus this is possible without additional overheads as long as the Markovian structure of the problem remains unchanged.

Further advantages of the new calculus become apparent when the drift of $Y$ is to be computed under some new probability measure Q absolutely continuous with respect to P. The need to switch measures arises particularly in mathematical finance, as illustrated in Examples 4.1 and 4.3, but it also appears in the natural sciences as part of filtering theory (Särkkä and Sottinen 2008, and the references therein) and in Monte Carlo simulations (Grigoriu 2002, Section 5.4.2). In existing approaches a change of measure requires a custom-made formula that even depends on the form in which the density process $M$ of $\nicefrac{\mathrm{d}\textsf{Q}}{\mathrm{d}\textsf{P}}$ is supplied: if $M$ is written as a stochastic exponential we need one formula; if it appears as an ordinary exponential we need another. These formulae convey little intuition and are consequently hard to memorize. In the new calculus there is no need to refer to a formula: we simply notice that by Girsanov’s theorem the Q–drift of $V$ equals the P–drift of $V+[V,\mathcal{L}(M)]$. Since it is easy to write down the representation of $V+[V,\mathcal{L}(M)]$, the Girsanov computation comes at virtually no extra cost.

Somewhat surprisingly, the simplified calculus implies that one can perform classical Itô calculus on continuous processes by tracing the behaviour of a hypothetical pure-jump finite variation process. While this observation may seem paradoxical at first sight, we believe the emphasis on jumps makes the simplified stochastic calculus less intellectually taxing than classical approaches firmly rooted in Brownian motion.

References

  • Aït-Sahalia and Matthys (2019) Aït-Sahalia, Y. and F. Matthys (2019). Robust consumption and portfolio policies when asset prices can jump. Journal of Economic Theory 179, 1–56.
  • Applebaum (2009) Applebaum, D. (2009). Lévy Processes and Stochastic Calculus (2nd ed.), Volume 116 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge.
  • Bender and Niethammer (2008) Bender, C. and C. R. Niethammer (2008). On qq-optimal martingale measures in exponential Lévy models. Finance and Stochastics 12(3), 381–410.
  • Biagini and Černý (2011) Biagini, S. and A. Černý (2011). Admissible strategies in semimartingale portfolio selection. SIAM Journal on Control and Optimization 49(1), 42–72.
  • Björk (2009) Björk, T. (2009). Arbitrage Theory in Continuous Time (3rd ed.). Oxford University Press, Oxford.
  • Black and Scholes (1973) Black, F. and M. Scholes (1973). The pricing of options and corporate liabilities. Journal of Political Economy 81, 637–654.
  • Cai and Kou (2012) Cai, N. and S. Kou (2012). Pricing Asian options under a hyper-exponential jump diffusion model. Operations Research 60(1), 64–77.
  • Černý and Ruf (2020a) Černý, A. and J. Ruf (2020a). Simplified calculus for semimartingales: Multiplicative compensators and changes of measure. Available from ssrn.com/abstract=3633622.
  • Černý and Ruf (2020b) Černý, A. and J. Ruf (2020b). Simplified stochastic calculus via semimartingale representations. Available from ssrn.com/abstract=3633638.
  • Černý and Ruf (2020c) Černý, A. and J. Ruf (2020c). Supplement to: Simplified stochastic calculus with applications in Economics and Finance. Available from ssrn.com/abstract_id=3752072.
  • Duffie (2001) Duffie, D. (2001). Dynamic Asset Pricing Theory (3rd ed.). Princeton: Princeton University Press.
  • Duffie et al. (2003) Duffie, D., D. Filipović, and W. Schachermayer (2003). Affine processes and applications in finance. The Annals of Applied Probability 13(3), 984–1053.
  • Émery (1978) Émery, M. (1978). Stabilité des solutions des équations différentielles stochastiques application aux intégrales multiplicatives stochastiques. Zeitschrift für Wahrscheinlichkeitstheorie und Verwandte Gebiete 41(3), 241–262.
  • Feng and Linetsky (2008) Feng, L. and V. Linetsky (2008). Pricing options in jump-diffusion models: An extrapolation approach. Operations Research 56(2), 304–325.
  • Fisher et al. (2019) Fisher, T., S. Pulido, and J. Ruf (2019). Financial models with defaultable numéraires. Mathematical Finance 29(1), 117–136.
  • Fujiwara and Miyahara (2003) Fujiwara, T. and Y. Miyahara (2003). The minimal entropy martingale measures for geometric Lévy processes. Finance and Stochastics 7(4), 509–531.
  • Grigoriu (2002) Grigoriu, M. (2002). Stochastic Calculus: Applications in Science and Engineering. Springer, New York.
  • Hong and Jin (2018) Hong, Y. and X. Jin (2018). Semi-analytical solutions for dynamic portfolio choice in jump-diffusion models and the optimal bond-stock mix. European Journal of Operational Research 265(1), 389 – 398.
  • Hubalek et al. (2006) Hubalek, F., J. Kallsen, and L. Krawczyk (2006). Variance-optimal hedging for processes with stationary independent increments. The Annals of Applied Probability 16(2), 853–885.
  • Itô (1951) Itô, K. (1951). On a formula concerning stochastic differentials. Nagoya Mathematical Journal 3, 55–65.
  • Jacod and Shiryaev (2003) Jacod, J. and A. N. Shiryaev (2003). Limit Theorems for Stochastic Processes (2nd ed.), Volume 288 of Comprehensive Studies in Mathematics. Springer, Berlin.
  • Jeanblanc et al. (2007) Jeanblanc, M., S. Klöppel, and Y. Miyahara (2007). Minimal fqf^{q}-martingale measures of exponential Lévy processes. The Annals of Applied Probability 17(5-6), 1615–1638.
  • Kallsen (2000) Kallsen, J. (2000). Optimal portfolios for exponential Lévy processes. Mathematical Methods of Operations Research 51, 357–374.
  • Karatzas and Shreve (1991) Karatzas, I. and S. E. Shreve (1991). Brownian Motion and Stochastic Calculus (2nd ed.). Springer, New York.
  • Larsson and Ruf (2020) Larsson, M. and J. Ruf (2020). Convergence of local supermartingales. Annales de l’Institut Henri Poincaré (B) Probabilités et Statistiques, to appear.
  • Margrabe (1978) Margrabe, W. (1978). The value of an option to exchange one asset for another. Journal of Finance 33(1), 177–186.
  • McKean (1969) McKean, Jr., H. P. (1969). Stochastic Integrals. Probability and Mathematical Statistics, No. 5. Academic Press, New York.
  • Merton (1976) Merton, R. C. (1976). Option pricing when the underlying stock returns are discontinuous. Journal of Financial Economics 3, 125–144.
  • Meyer (1976) Meyer, P.-A. (1976). Un cours sur les intégrales stochastiques. In Séminaire de Probabilités X, Strasbourg, Volume 511 of Lecture Notes in Mathematics, pp.  245–400. Springer, Berlin.
  • Øksendal and Sulem (2007) Øksendal, B. and A. Sulem (2007). Applied Stochastic Control of Jump Diffusions (2nd ed.). Universitext. Springer, Berlin.
  • Samuelson (1965) Samuelson, P. (1965). Rational theory of warrant pricing. Industrial Management Review 6, 13–32.
  • Särkkä and Sottinen (2008) Särkkä, S. and T. Sottinen (2008). Application of Girsanov theorem to particle filtering of discretely observed continuous-time non-linear systems. Bayesian Analysis 3(3), 555–584.
  • Sato (1999) Sato, K. (1999). Lévy Processes and Infinitely Divisible Distributions, Volume 68 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge.
  • Večeř and Xu (2004) Večeř, J. and M. Xu (2004). Pricing Asian options in a semimartingale model. Quantitative Finance 4(2), 170–175.

Appendix A Notation and details about the representations

In this appendix we provide the setup of this paper and the proofs of the statements in Section 1. Unless specified otherwise, $d$, $m$, and $n$ are positive integers. The underlying filtered probability space is denoted by $(\Omega,\mathscr{F},\mathfrak{F},\textsf{P})$. The complex integral of a locally bounded $\mathbb{C}^{n}$–valued process $\zeta=\zeta^{\prime}+i\zeta^{\prime\prime}$ with respect to a $\mathbb{C}^{n}$–valued semimartingale $X=X^{\prime}+iX^{\prime\prime}$ is the $\mathbb{C}$–valued semimartingale

0ζtdXt\displaystyle\int_{0}^{\cdot}\zeta_{t}\mathrm{d}X_{t} =0ζtdXt0ζt′′dXt′′+i(0ζt′′dXt+0ζtdXt′′).\displaystyle=\int_{0}^{\cdot}\zeta^{\prime}_{t}\mathrm{d}X^{\prime}_{t}-\int_{0}^{\cdot}\zeta^{\prime\prime}_{t}\mathrm{d}X^{\prime\prime}_{t}+i\left(\int_{0}^{\cdot}\zeta^{\prime\prime}_{t}\,\mathrm{d}X^{\prime}_{t}+\int_{0}^{\cdot}\zeta^{\prime}_{t}\mathrm{d}X^{\prime\prime}_{t}\right).

We write $\bar{\mathbb{C}}^{n}=\mathbb{C}^{n}\cup\{\mathrm{NaN}\}$ for some ‘non-number’ $\mathrm{NaN}\notin\bigcup_{n\in\mathbb{N}}\mathbb{C}^{n}$ and $\bar{\Omega}^{n}_{\mathbb{C}}=\Omega\times[0,\infty)\times\bar{\mathbb{C}}^{n}$. The symbols $\bar{\mathbb{R}}$ and $\bar{\Omega}^{n}_{\mathbb{R}}$ have an analogous meaning. For a predictable function $\xi$ we shall always assume that $\xi(\mathrm{NaN})=\mathrm{NaN}$. If $\psi:\bar{\Omega}^{n}_{\mathbb{C}}\rightarrow\bar{\mathbb{C}}^{m}$, with $m\in\mathbb{N}$, denotes another predictable function, we shall write $\psi\circ\xi$ or $\psi(\xi)$ to denote the predictable function $(\omega,t,x)\mapsto\psi(\omega,t,\xi(\omega,t,x))$, and likewise with $\mathbb{C}$ replaced by $\mathbb{R}$.

Provided they exist, we write $D\xi$ and $D^{2}\xi$ for the complex derivatives of $\xi:\bar{\Omega}^{d}_{\mathbb{C}}\rightarrow\bar{\mathbb{C}}^{n}$, resp., the real derivatives of $\xi:\bar{\Omega}^{d}_{\mathbb{R}}\rightarrow\bar{\mathbb{R}}^{n}$. Note that $D\xi$ has dimension $n\times d$ and $D^{2}\xi$ has dimension $n\times d\times d$.

Definition A.1 (Two subclasses of universal representing functions).

Let $\mathfrak{I}^{d,n}_{0\mathbb{C}}$ denote the set of all predictable functions $\xi:\bar{\Omega}^{d}_{\mathbb{C}}\rightarrow\bar{\mathbb{C}}^{n}$ such that the following properties hold:

  1. (1)

    ξ(ω,t,0)=0\xi(\omega,t,0)=0 for all (ω,t)Ω×[0,)(\omega,t)\in\Omega\times[0,\infty).

  2. (2)

    There is a predictable process RR locally bounded away from zero, i.e., with strictly positive running infimum RR^{*}, such that

    1. (a)

      xξ(ω,t,x)x\mapsto\xi(\omega,t,x) is analytic on |x|R(ω,t)|x|\leq R(\omega,t), for all (ω,t)Ω×[0,)(\omega,t)\in\Omega\times[0,\infty);

    2. (b)

      sup|x|R|D2ξ(x)|\sup_{|x|\leq R}\scalebox{1.2}{$|$}D^{2}\xi(x)\scalebox{1.2}{$|$} is locally bounded.

  3. (3)

    Dξ(0)D\xi(0) is locally bounded.

We write $\mathfrak{I}_{0\mathbb{C}}=\bigcup_{k,r\in\mathbb{N}}\mathfrak{I}^{k,r}_{0\mathbb{C}}$. The subclass $\mathfrak{I}_{0\mathbb{R}}$ of predictable functions $\xi:\bar{\Omega}^{d}_{\mathbb{R}}\rightarrow\bar{\mathbb{R}}^{n}$ is defined by replacing (a) with the requirement (a’) $x\mapsto\xi(\omega,t,x)$ is twice differentiable for $|x|\leq R(\omega,t)$, for all $(\omega,t)\in\Omega\times[0,\infty)$. ∎

Let us provide some context to the previous definition. Most of the time we are interested in ‘real-valued’ transformations $\xi$, which map an $\mathbb{R}^{d}$–valued semimartingale to an $\mathbb{R}^{n}$–valued semimartingale. The core class $\mathfrak{I}_{0\mathbb{R}}$ is perfectly suited for this purpose. The Émery formula, as stated in (1.18), also works for complex-valued $\xi$ if we interpret $D\xi(0)$ and $D^{2}\xi(0)$ as complex derivatives. Such a generalization of the theory from $\mathbb{R}$ to $\mathbb{C}$, albeit limited by forcing $\xi$ to be analytic at $0$, is helpful when computing characteristic functions, for example. This leads to the definition of $\mathfrak{I}_{0\mathbb{C}}$, which is a proper subclass of a larger core class $\mathfrak{I}_{0}$ of complex-valued universal representing functions that also nests $\mathfrak{I}_{0\mathbb{R}}$.

We do not define 0\mathfrak{I}_{0} itself in this paper but it can be shown that 0d,n\mathfrak{I}^{d,n}_{0} has a one-to-one correspondence with 02d,2n\mathfrak{I}^{2d,2n}_{0\mathbb{R}}. A more general Émery formula is available for 0\mathfrak{I}_{0} but will not be needed in this paper. We hence refer the interested reader to Černý and Ruf (2020b) for more details. All computations in this paper can be performed either within 0\mathfrak{I}_{0\mathbb{R}} or within 0\mathfrak{I}_{0\mathbb{C}} and the two classes are never used jointly. The arguments for the two cases are often identical but should be read and understood as two separate arguments because the meaning of DξD\xi is different in the two cases. The proofs for each of the two classes are self-contained.

Let us now briefly show that all terms in (1.18) are well-defined.

Lemma A.2.

If ξ00\xi\in\mathfrak{I}_{0\mathbb{R}}\cup\mathfrak{I}_{0\mathbb{C}} then the integrals 0Dξt(0)dXt\int_{0}^{\cdot}D\xi_{t}(0)\mathrm{d}X_{t} and

0i,j=1dDij2ξt(0)d[X(i),X(j)]tc\int_{0}^{\cdot}\sum_{i,j=1}^{d}D^{2}_{ij}\xi_{t}(0)\,\mathrm{d}\big{[}X^{(i)},X^{(j)}\big{]}^{c}_{t}

are well-defined. If, additionally, ξ\xi is compatible with XX, then (1.20) holds.

Proof.

Because Dξ(0)D\xi(0) and D2ξ(0)D^{2}\xi(0) are locally bounded, the two integrals are well-defined by Jacod and Shiryaev (2003, Theorem I.4.31). By assumption, (τn)n(\tau_{n})_{n\in\mathbb{N}} given by τn=inf{t:Rt1/n}\tau_{n}=\inf\{t:R^{*}_{t}\leq\nicefrac{{1}}{{n}}\} is a localizing sequence. Next, let (ρn)n(\rho_{n})_{n\in\mathbb{N}} be the localizing sequence from Definition A.1(2)(2)(b). Then (τnρn)n(\tau_{n}\wedge\rho_{n})_{n\in\mathbb{N}} is again a localizing sequence such that, after localization, |ξ(x)Dξ(0)x|K|x|2|\xi(x)-D\xi(0)x|\leq K|x|^{2} for all |x|δ|x|\leq\delta for some constants K>0K>0 and δ>0\delta>0. This yields, after localization,

0<t\displaystyle\sum_{0<t\leq\cdot} |ξt(ΔXt)Dξt(0)ΔXt|=|ΔXt|δ0<t|ξt(ΔXt)Dξt(0)ΔXt|+|ΔXt|>δ0<t|ξt(ΔXt)Dξt(0)ΔXt|<\displaystyle|\xi_{t}(\Delta X_{t})-D\xi_{t}(0)\Delta X_{t}|=\!\!\!\!\sum_{\stackrel{{\scriptstyle 0<t\leq\cdot}}{{|\Delta X_{t}|\leq\delta}}}\!\!\!\!|\xi_{t}(\Delta X_{t})-D\xi_{t}(0)\Delta X_{t}|+\!\!\!\!\sum_{\stackrel{{\scriptstyle 0<t\leq\cdot}}{{|\Delta X_{t}|>\delta}}}\!\!\!\!|\xi_{t}(\Delta X_{t})-D\xi_{t}(0)\Delta X_{t}|<\infty

as the last sum has only finitely many summands. ∎

We are now ready to state and prove the main properties of semimartingale representations.

Proposition A.3 (Representation of stochastic integrals).

Let ζ\zeta be a locally bounded predictable n×d\mathbb{R}^{n\times d}–valued (resp., n×d\mathbb{C}^{n\times d}–valued) process. Then the predictable function ξ=ζid\xi=\zeta\,{\operatorname{id}} belongs to 0d,n\mathfrak{I}_{0\mathbb{R}}^{d,n} (resp., 0d,n\mathfrak{I}_{0\mathbb{C}}^{d,n}) and for any d\mathbb{R}^{d}–valued (resp., d\mathbb{C}^{d}–valued) semimartingale XX one has

0ζtdXt=(ζid)X.\int_{0}^{\cdot}\zeta_{t}\mathrm{d}X_{t}=(\zeta\,{\operatorname{id}})\circ X. (A.1)
Proof.

We start with the complex-valued case. As $D\xi=\zeta$ and $D^{2}\xi=0$, the function $\xi$ belongs to $\mathfrak{I}_{0\mathbb{C}}$ and is compatible with any $\mathbb{C}^{d}$–valued semimartingale $X$. The Émery formula (1.18) now yields (A.1). The real-valued proof proceeds analogously. ∎

Proposition A.4 (Representation of smooth transformations).

Let 𝒰d\mathcal{U}\subset\mathbb{R}^{d} (resp., 𝒰d\mathcal{U}\subset\mathbb{C}^{d}) be an open set such that X,X𝒰X_{-},X\in\mathcal{U}, let f:𝒰nf:\mathcal{U}\rightarrow\mathbb{R}^{n} be a twice continuously differentiable function (resp., let f:𝒰nf:\mathcal{U}\rightarrow\mathbb{C}^{n} be an analytic function), and let

ξf,X(x)={f(X+x)f(X),X+x𝒰NaN,X+x𝒰,xd(resp., xd).\displaystyle\xi^{f,X}(x)=\begin{cases}f\left(X_{-}+x\right)-f\left(X_{-}\right),&\quad X_{-}+x\in\mathcal{U}\\ \mathrm{NaN},&\quad X_{-}+x\notin\mathcal{U}\end{cases},\quad x\in\mathbb{R}^{d}\ \ (\text{resp., }x\in\mathbb{C}^{d}).

Then ξf,X0d,n\xi^{f,X}\in\mathfrak{I}_{0\mathbb{R}}^{d,n} (resp. ξf,X0d,n\xi^{f,X}\in\mathfrak{I}_{0\mathbb{C}}^{d,n}) is compatible with XX and

f(X)=f(X0)+ξf,XX.f(X)=f(X_{0})+\xi^{f,X}\circ X.
Proof.

The first part of the proof is identical for both cases. Note that Dξf,X(0)=Df(X)D\xi^{f,X}(0)=Df(X_{-}) and D2ξf,X(0)=D2f(X)D^{2}\xi^{f,X}(0)=D^{2}f(X_{-}). As both Df(X)Df(X_{-}) and D2f(X)D^{2}f(X_{-}) are finite-valued predictable processes, they are locally bounded by Larsson and Ruf (2020, Proposition 3.2). Next, denote by R(0,1]R\in(0,1] the minimum of 1 and half of the distance from XX_{-} to the boundary of 𝒰\mathcal{U} and by RR^{*} its running infimum. The left-continuity of RR now yields R>0R^{*}>0. Next, observe that

τn=inf{t0:Rt<1n}inf{t0:|Xt|>n},n,\tau_{n}=\inf\left\{t\geq 0:R^{*}_{t}<\frac{1}{n}\right\}\wedge\inf\{t\geq 0:|X_{t-}|>n\},\qquad n\in\mathbb{N},

is a localizing sequence of stopping times that makes $\sup_{|x|\leq R}\lvert D^{2}\xi(x)\rvert$ locally bounded, yielding $\xi^{f,X}\in\mathfrak{I}_{0\mathbb{R}}$ (resp., $\mathfrak{I}_{0\mathbb{C}}$). As $\xi^{f,X}(\Delta X)=f(X)-f(X_{-})$, the function $\xi^{f,X}$ is compatible with $X$.

For $\xi\in\mathfrak{I}_{0\mathbb{R}}$, Lemma A.2 and the Émery formula (1.18) now show that $f(X_{0})+\xi^{f,X}\circ X$ coincides with the right-hand side of the Itô–Meyer change-of-variables formula (Jacod and Shiryaev 2003, I.4.57) and hence equals $f(X)$. For $\xi\in\mathfrak{I}_{0\mathbb{C}}$ the result follows by identifying $\mathbb{C}^{d}$ with $\mathbb{R}^{2d}$ and using the real-valued statement we have just proved. ∎

Proposition A.5 (Composition of universal representing functions).

The space 0\mathfrak{I}_{0\mathbb{R}} is closed under dimensionally correct composition, i.e., if ξ0d,n\xi\in\mathfrak{I}_{0\mathbb{R}}^{d,n} and ψ0n,m\psi\in\mathfrak{I}_{0\mathbb{R}}^{n,m} then ψξ0d,m\psi\circ\xi\in\mathfrak{I}_{0\mathbb{R}}^{d,m}. An analogous statement holds for 0\mathfrak{I}_{0\mathbb{C}}.

Proof.

The proof is identical for both cases. By localization we may assume that Dψ(0)D\psi(0) is bounded and that there exists a constant δψ>0\delta_{\psi}>0 such that sup|y|δψD2ψ(y)\sup_{|y|\leq\delta_{\psi}}D^{2}\psi(y) and consequently also sup|y|δψDψ(y)\sup_{|y|\leq\delta_{\psi}}D\psi(y) are bounded. By the same construction, we may assume that there exists a constant δξ>0\delta_{\xi}>0 such that sup|x|δξD2ξ(x)\sup_{|x|\leq\delta_{\xi}}D^{2}\xi(x) and sup|x|δξDξ(x)\sup_{|x|\leq\delta_{\xi}}D\xi(x) are bounded. Moreover, there exists also δψξ(0,δξ)\delta_{\psi\circ\xi}\in(0,\delta_{\xi}) such that sup|x|δψξξ(x)<δψ\sup_{|x|\leq\delta_{\psi\circ\xi}}\xi(x)<\delta_{\psi}.

By direct computation, for all |x|δψξ|x|\leq\delta_{\psi\circ\xi} we now have

D(ψξ)(0)\displaystyle D(\psi\circ\xi)(0) =k=1nDkψ(0)Dξ(k)(0);\displaystyle=\sum_{k=1}^{n}D_{k}\psi(0)D\xi^{(k)}(0);
D2(ψξ)(x)\displaystyle D^{2}(\psi\circ\xi)(x) =k,l=1nDkl2ψ(ξ(x))Dξ(k)(x)Dξ(l)(x)+k=1nDkψ(ξ(x))D2ξ(k)(x).\displaystyle=\sum_{k,l=1}^{n}D^{2}_{kl}\psi(\xi(x))D\xi^{(k)}(x)^{\top}D\xi^{(l)}(x)+\sum_{k=1}^{n}D_{k}\psi(\xi(x))D^{2}\xi^{(k)}(x).

This yields a positive non-increasing sequence (δψξ(n))n\left(\delta_{\psi\circ\xi}^{(n)}\right)_{n\in\mathbb{N}} and a localizing sequence (τn)n(\tau_{n})_{n\in\mathbb{N}} of stopping times such that D(ψξ)(0)D(\psi\circ\xi)(0) and

sup|x|δψξ(n)D2(ψξ)(x)\sup_{|x|\leq\delta^{(n)}_{\psi\circ\xi}}D^{2}(\psi\circ\xi)(x)

are bounded on the stochastic interval $[\![\tau_{n-1},\tau_{n}[\![$ for each $n\in\mathbb{N}$. The desired process $R_{\psi\circ\xi}$ is obtained by setting $R_{\psi\circ\xi}=\sum_{n\in\mathbb{N}}\delta_{\psi\circ\xi}^{(n)}\mathbf{1}_{[\![\tau_{n-1},\tau_{n}[\![}$. ∎

Appendix B Truncation and predictable compensators

In this appendix we complement the observations in Subsection 2.2. We begin by formally introducing truncation functions.

Definition B.1 (Truncation function for XX).

We say that a predictable function hh is a truncation function for a semimartingale XX if hh is time-constant and deterministic, compatible with XX, 0<t|ΔXth(ΔXt)|<\sum_{0<t\leq\cdot}|\Delta X_{t}-h(\Delta X_{t})|<\infty, and if

X[h]=X0<t(ΔXth(ΔXt))X[h]=X-\sum_{0<t\leq\cdot}(\Delta X_{t}-h(\Delta X_{t}))

is a special semimartingale, i.e., if X[h]X[h] can be decomposed into the sum of a local martingale and a predictable process of finite variation. ∎

Proposition B.2 (Universal truncation functions).

If a bounded time-constant deterministic function hh equals identity on an open neighbourhood of 0 then it is a truncation function for any compatible semimartingale XX. Furthermore, one then has h00h\in\mathfrak{I}_{0\mathbb{R}}\cup\mathfrak{I}_{0\mathbb{C}} and

X[h]=X0+hX.X[h]=X_{0}+h\circ X.
Proof.

Clearly h00h\in\mathfrak{I}_{0\mathbb{R}}\cup\mathfrak{I}_{0\mathbb{C}}. Lemma A.2 with ξ=h\xi=h now yields 0<t|h(ΔXt)ΔXt|<\sum_{0<t\leq\cdot}|h(\Delta X_{t})-\Delta X_{t}|<\infty. Next, observe that ΔX[h]t=h(ΔXt)\Delta X[h]_{t}=h(\Delta X_{t}) is bounded, therefore X[h]X[h] is special by Jacod and Shiryaev (2003, I.4.24). Finally, the Émery formula (1.18) yields

X0+hX=X+0<t(h(ΔXt)ΔXt)=X[h],X_{0}+h\circ X=X+\sum_{0<t\leq\cdot}(h(\Delta X_{t})-\Delta X_{t})=X[h],

which completes the proof. ∎

Proposition B.3 (Émery formula with truncation).

Assume ξ0\xi\in\mathfrak{I}_{0\mathbb{R}} (resp., 0\mathfrak{I}_{0\mathbb{C}}) is compatible with XX and let hh be a truncation function for XX. Then

0<t|ξt(ΔXt)Dξt(0)h(ΔXt)|<\sum_{0<t\leq\cdot}\left|\xi_{t}(\Delta X_{t})-D\xi_{t}(0)h(\Delta X_{t})\right|<\infty

and

ξX=0Dξt(0)dX[h]t+120i,j=1dDij2ξt(0)d[X(i),X(j)]tc+0<t(ξt(ΔXt)Dξt(0)h(ΔXt)).\begin{split}\xi\circ X=\int_{0}^{\cdot}D\xi_{t}(0)\mathrm{d}X[h]_{t}&+\frac{1}{2}\int_{0}^{\cdot}\sum_{i,j=1}^{d}D^{2}_{ij}\xi_{t}(0)\,\mathrm{d}\scalebox{1.2}{$[$}X^{(i)},X^{(j)}\scalebox{1.2}{$]$}^{c}_{t}\\ &+\sum_{0<t\leq\cdot}\left(\xi_{t}(\Delta X_{t})-D\xi_{t}(0)h(\Delta X_{t})\right).\end{split} (B.1)
Proof.

First, the triangle inequality gives

0<t|ξt(ΔXt)Dξt(0)h(ΔXt)|\displaystyle\sum_{0<t\leq\cdot}\left|\xi_{t}(\Delta X_{t})-D\xi_{t}(0)h(\Delta X_{t})\right|\leq 0<t|ξt(ΔXt)Dξt(0)ΔXt|\displaystyle\sum_{0<t\leq\cdot}\left|\xi_{t}(\Delta X_{t})-D\xi_{t}(0)\Delta X_{t}\right|
+0<t|Dξt(0)ΔXtDξt(0)h(ΔXt)|\displaystyle+\sum_{0<t\leq\cdot}\left|D\xi_{t}(0)\Delta X_{t}-D\xi_{t}(0)h(\Delta X_{t})\right|
<\displaystyle< ,t0.\displaystyle\infty,\qquad t\geq 0.

Here the second sum is finite thanks to Lemma A.2 and the third due to the local boundedness of Dξ(0)D\xi(0) and Definition B.1. The identity

0Dξt(0)dXt=0Dξt(0)dX[h]t+0<t(Dξt(0)ΔXtDξt(0)h(ΔXt))\int_{0}^{\cdot}D\xi_{t}(0)\mathrm{d}X_{t}=\int_{0}^{\cdot}D\xi_{t}(0)\mathrm{d}X[h]_{t}+\sum_{0<t\leq\cdot}\left(D\xi_{t}(0)\Delta X_{t}-D\xi_{t}(0)h(\Delta X_{t})\right)

and the Émery formula (1.18) now yield the second part of the claim. ∎

We now introduce notation dealing with predictable compensators. If XX is a special semimartingale, we denote by BXB^{X} its predictable compensator, i.e., the unique predictable finite variation process starting at zero such that XBXX-B^{X} is a local P–martingale. If Q is another probability measure absolutely continuous with respect to P and XX is Q–special, we denote the corresponding Q–compensator by BQXB^{X}_{\textsf{Q}}. Finally, we denote by νX\nu^{X} the predictable P–compensator of the jumps of XX, i.e., for any compact interval JdJ\subset\mathbb{R}^{d} (resp., d\mathbb{C}^{d}) not containing the origin, νX([0,]×J)\nu^{X}([0,\cdot]\times J) is the predictable compensator of the finite variation process 0<t𝟏{ΔXtJ}\sum_{0<t\leq\cdot}\mathbf{1}_{\{\Delta X_{t}\in J\}}.

We shall say that a semimartingale is PII if it has independent increments. The following result for PII semimartingales relates drifts to expected values and will therefore be very useful. It is proved in Černý and Ruf (2020a, Proposition 2.14 and Theorem 4.1). At this point, we remind the reader that the stochastic exponential $\mathscr{E}(X)$ of a one-dimensional semimartingale $X$ is given as the (unique) solution of the stochastic differential equation

(X)=1+0(X)tdXt.\displaystyle\mathscr{E}(X)=1+\int_{0}^{\cdot}\mathscr{E}(X)_{t-}\mathrm{d}X_{t}. (B.2)
Theorem B.4.

Assume ξ00\xi\in\mathfrak{I}_{0\mathbb{R}}\cup\mathfrak{I}_{0\mathbb{C}} is compatible with XX. If ξ\xi is deterministic and XX is PII, then ξX\xi\circ X, too, is PII. Furthermore, if ξX\xi\circ X is special one has

E[(ξX)t]\displaystyle\textsf{E}[(\xi\circ X)_{t}] =BtξX,\displaystyle=B^{\xi\circ X}_{t}, t\displaystyle\qquad t 0;\displaystyle\geq 0; (B.3)
E[(ξX)t]\displaystyle\textsf{E}[\mathscr{E}(\xi\circ X)_{t}] =(BξX)t,\displaystyle=\mathscr{E}\scalebox{1.2}{$($}B^{\xi\circ X}\scalebox{1.2}{$)$}_{t}, t\displaystyle\qquad t 0.\displaystyle\geq 0. (B.4)

Below we evaluate the right-hand sides of (B.3) and (B.4) for two important classes of stochastic processes.

Definition B.5.

We say that XX is a discrete-time process if XX is constant on [k1,k)[k-1,k) for each kk\in\mathbb{N}. We say that XX is an Itô semimartingale if for all truncation functions hh for XX there exists a triplet (bX[h],cX,FX)(b^{X[h]},c^{X},F^{X}) of predictable processes such that BX[h]=0bX[h]dtB^{X[h]}=\int_{0}^{\cdot}b^{X[h]}\mathrm{d}t, [X,X]c=0cXdt[X,X]^{c}=\int_{0}^{\cdot}c^{X}\mathrm{d}t, and νX\nu^{X} can be written in disintegrated form as νX=0FX(dx)dt\nu^{X}=\int_{0}^{\cdot}\int F^{X}(\mathrm{d}x)\mathrm{d}t. ∎

Theorem B.6.

Let XX be a semimartingale and let hh be a truncation function for XX. Assume that ξ00\xi\in\mathfrak{I}_{0\mathbb{R}}\cup\mathfrak{I}_{0\mathbb{C}} is compatible with XX and that ξX\xi\circ X is special. The following statements then hold.

  1. (i)

    If XX is a discrete-time process then ξX\xi\circ X is a discrete-time process and

    BtξX\displaystyle B_{t}^{\xi\circ X} =k=1tEk[ξk(ΔXk)],\displaystyle=\sum_{k=1}^{\lfloor t\rfloor}\textsf{E}_{k-}[\xi_{k}(\Delta X_{k})], t\displaystyle\qquad t 0;\displaystyle\geq 0;
    (BξX)t\displaystyle\mathscr{E}(B^{\xi\circ X})_{t} =k=1tEk[1+ξk(ΔXk)],\displaystyle=\prod_{k=1}^{\lfloor t\rfloor}\textsf{E}_{k-}\left[1+\xi_{k}(\Delta X_{k})\right], t\displaystyle\qquad t 0.\displaystyle\geq 0. (B.5)
  2. (ii)

    If XX is an Itô semimartingale then ξX\xi\circ X is an Itô semimartingale and

    bξX\displaystyle b^{\xi\circ X} =Dξ(0)bX[h]+12i,j=1dDij2ξ(0)cijX+d(ξ(x)Dξ(0)h(x))FX(dx);\displaystyle=D\xi(0)b^{X[h]}+\frac{1}{2}\sum_{i,j=1}^{d}D^{2}_{ij}\xi(0)c^{X}_{ij}+\int_{\mathbb{R}^{d}}\left(\xi(x)-D\xi(0)h(x)\right)F^{X}(\mathrm{d}x);
    BξX\displaystyle B^{\xi\circ X} =0btξXdt;(BξX)=exp(0btξXdt).\displaystyle=\int_{0}^{\cdot}b_{t}^{\xi\circ X}\mathrm{d}t;\qquad\mathscr{E}(B^{\xi\circ X})=\exp\left(\int_{0}^{\cdot}b_{t}^{\xi\circ X}\mathrm{d}t\right).
Proof.

By (B.1), we have

BξX=0Dξt(0)dBtX[h]+120i,j=1dDij2ξt(0)d[X(i),X(j)]tc+0d(ξt(x)Dξt(0)h(x))νX(dt,dx),\begin{split}B^{\xi\circ X}=\int_{0}^{\cdot}D\xi_{t}(0)\mathrm{d}B^{X[h]}_{t}&+\frac{1}{2}\int_{0}^{\cdot}\sum_{i,j=1}^{d}D^{2}_{ij}\xi_{t}(0)\,\mathrm{d}\scalebox{1.2}{$[$}X^{(i)},X^{(j)}\scalebox{1.2}{$]$}^{c}_{t}\\ &+\int_{0}^{\cdot}\int_{\mathbb{R}^{d}}\left(\xi_{t}(x)-D\xi_{t}(0)h(x)\right)\nu^{X}(\mathrm{d}t,\mathrm{d}x),\end{split}

yielding the statement. ∎

Appendix C Change of measure

This appendix collects results on changes of measures.

Theorem C.1 (Girsanov’s theorem for absolutely continuous probability measures).

Let NN be a P–semimartingale such that

M=(N)M=\mathscr{E}(N)

is a uniformly integrable P–martingale with M0M\geq 0. Define the probability measure Q by

dQdP=M.\frac{\mathrm{d}\textsf{Q}}{\mathrm{d}\textsf{P}}=M_{\infty}.

For a Q–semimartingale VV and a P–semimartingale VV_{\uparrow}, Q–indistinguishable from VV, the following are equivalent.

  1. (1)

    VV is Q–special.

  2. (2)

    V+[V,N]V_{\uparrow}+[V_{\uparrow},N] is P–special.

If either condition holds then the corresponding compensators are equal, i.e.,

BQV=BV+[V,N],Q–almost surely.\displaystyle B^{V}_{\textsf{Q}}=B^{V_{\uparrow}+[V_{\uparrow},N]},\qquad\text{$\textsf{Q}$--almost surely}.

The proof is quite classical and we do not reproduce it here. For details see Černý and Ruf (2020a, Proposition 5.2).

Theorem C.2 (Girsanov’s theorem – representations).

Assume η,ξ0\eta,\xi\in\mathfrak{I}_{0\mathbb{R}} (resp., 0\mathfrak{I}_{0\mathbb{C}}) with η1\eta\geq-1 are compatible with a semimartingale XX such that ηX\eta\circ X is special and ΔBηX>1\Delta B^{\eta\circ X}>-1. Assume further that M=(ηX)/(BηX)M=\nicefrac{{\mathscr{E}(\eta\circ X)}}{{\mathscr{E}\left(B^{\eta\circ X}\right)}} is a real-valued uniformly integrable P–martingale and define the probability measure Q by

dQdP=M.\frac{\mathrm{d}\textsf{Q}}{\mathrm{d}\textsf{P}}=M_{\infty}.

Assume also that ξX\xi\circ X is Q–special. Then the following statements hold.

  1. (i)

    If XX is a discrete-time process under P then ξX\xi\circ X is a discrete-time process under Q and

    BQ,tξX\displaystyle B_{\textsf{Q},t}^{\xi\circ X} =k=1tEk[ξk(ΔXk)1+ηk(ΔXk)1+Ek[ηk(ΔXk)]],\displaystyle=\sum_{k=1}^{\lfloor t\rfloor}\textsf{E}_{k-}\Bigg{[}\xi_{k}(\Delta X_{k})\frac{1+\eta_{k}(\Delta X_{k})}{1+\textsf{E}_{k-}[\eta_{k}(\Delta X_{k})]}\Bigg{]},\qquad t0.\displaystyle t\geq 0.
  2. (ii)

    If XX is an Itô P–semimartingale then ξX\xi\circ X is an Itô Q–semimartingale and

    bQξX=b(1+η)ξX\displaystyle b^{\xi\circ X}_{\textsf{Q}}=b^{(1+\eta)\xi\circ X} =Dξ(0)bX[h]+12i,j=1d(Dij2ξ(0)+2Diξ(0)Djη(0))cijX\displaystyle=D\xi(0)b^{X[h]}+\frac{1}{2}\sum_{i,j=1}^{d}\left(D^{2}_{ij}\xi(0)+2D_{i}\xi(0)D_{j}\eta(0)\right)c^{X}_{ij}
    +d(ξ(x)(1+η(x))Dξ(0)h(x))FX(dx).\displaystyle\qquad\qquad\qquad+\int_{\mathbb{R}^{d}}\left(\xi(x)(1+\eta(x))-D\xi(0)h(x)\right)F^{X}(\mathrm{d}x).
Proof.

In this proof all predictable functions appearing in representations are in 0\mathfrak{I}_{0\mathbb{R}} (resp., 0\mathfrak{I}_{0\mathbb{C}}). By a standard calculation, the process M=(ηX)/(BηX)M=\nicefrac{{\mathscr{E}(\eta\circ X)}}{{\mathscr{E}\left(B^{\eta\circ X}\right)}} satisfies M=(N)M=\mathscr{E}(N) with

N=(1+η(id1)1+id21)(X,BηX).N=\left(\frac{1+\eta({\operatorname{id}}_{1})}{1+{\operatorname{id}}_{2}}-1\right)\circ(X,B^{\eta\circ X}).

From the representation of quadratic covariation in Example 2.3 we then obtain, for $V=\xi\circ X$,

V+[V,N]=ξ(id1)1+η(id1)1+id2(X,BηX)=ξ1+η1+ΔBηXX.V+[V,N]=\xi({\operatorname{id}}_{1})\frac{1+\eta({\operatorname{id}}_{1})}{1+{\operatorname{id}}_{2}}\circ(X,B^{\eta\circ X})=\xi\frac{1+\eta}{1+\Delta B^{\eta\circ X}}\circ X.

The rest follows from the general Girsanov theorem (Theorem C.1) and the drift formulae in Theorem B.6. ∎

Corollary C.3.

With the notation and assumptions as in Theorem C.2 above, if $X$ is PII under P and stopped at a finite time, and $\eta$ is deterministic, then $M=\nicefrac{\mathscr{E}(\eta\circ X)}{\mathscr{E}(B^{\eta\circ X})}$ is a uniformly integrable martingale. Furthermore, if $\xi$, too, is deterministic then $\xi\circ X$ is PII under Q and the following statements hold for all $t\geq 0$.

  1. (i)

    If XX is a discrete-time process under P then

    EQ[(ξX)t]\displaystyle\textsf{E}^{\textsf{Q}}[\mathscr{E}(\xi\circ X)_{t}] =k=1tE[(1+ξk(ΔXk))1+ηk(ΔXk)1+E[ηk(ΔXk)]].\displaystyle=\prod_{k=1}^{\lfloor t\rfloor}\textsf{E}\Bigg{[}(1+\xi_{k}(\Delta X_{k}))\frac{1+\eta_{k}(\Delta X_{k})}{1+\textsf{E}[\eta_{k}(\Delta X_{k})]}\Bigg{]}.
  2. (ii)

    If XX is an Itô P–semimartingale then

    EQ[(ξX)t]=exp(0tbu(1+η)ξXdu).\displaystyle\textsf{E}^{\textsf{Q}}[\mathscr{E}(\xi\circ X)_{t}]=\exp\left(\int_{0}^{t}b_{u}^{(1+\eta)\xi\circ X}\mathrm{d}u\right).
Proof.

First note that if η\eta is deterministic and if XX is PII, then Example 3.3 yields that ηX\eta\circ X is again PII. Next, Theorem B.4 yields the martingale property of MM. The PII property of XX under Q follows from Girsanov’s theorem. The argument then follows from Theorems B.4, B.6, and C.2. ∎