Finance from the viewpoint of physics

A. Jakovac (e-mail: [email protected])
Institute of Physics, Eotvos University, H-1117 Budapest, Hungary
Abstract

In this note we review the basic mathematical ideas used in finance in the language of modern physics. We focus on the discrete time formalism and derive path integral and Green’s function formulas for pricing. We also discuss various risk mitigation methods.

1 Introduction

Advanced mathematical methods have long been used in finance to understand how the market functions. In this continuously fluctuating environment, probability theory provides the solid basis on which the assessment of present values and the risk mitigation techniques can be built. This aspect of the market has become even more pronounced since the crisis of 2008. Since then the market has been more prudent, and collateralization is often applied even to simple products. New, more complicated financial products have appeared, and the use of computers in trading is becoming more and more widespread. All of these facts increase the role of mathematical methods in finance.

There are numerous well-written books on mathematical finance, for example [1, 2, 3]. These books, and most of the financial literature, use the language of probability theory that was founded by Kolmogorov [4] and Itô [5] in the first half of the twentieth century. This approach considers the stochastic process as a measure which can be used for integrating a function (adapted process). This idea fits nicely into the mathematical movements of the early twentieth century, namely the rise of measure theory and the Lebesgue integral.

At the same time, however, a different formalism describing probabilistic processes was also born, driven mainly by physicists: Einstein, Langevin, Fokker, Planck, and later Dirac and Feynman. Here we treat the stochastic process as a differential equation (the Langevin equation) in whose source term an unusual, rapidly oscillating function appears, called white noise. White noise is a normally distributed random function in which the correlation between different times is described by a Dirac delta. In the 1920s, however, it was entirely unclear how to deal with the Dirac delta “function”. It was only in the 1950s that Schwartz gave a mathematically satisfying description of it as a distribution [6].

An alternative formulation of the Langevin equation can be given using an integral representation, called the functional (or path) integral. This approach was initiated by Wiener in the 1920s, but it gained its full weight through the work of Dirac and Feynman in the 1940s [7]. With this formulation the same problem appeared as with the Dirac delta earlier: the continuum limit, except for some elementary cases like the Wiener integral, seemed to make no sense.

The solution that gave meaning to the path integral (and, in fact, to all of quantum field theory) arrived only in the 1970s with the ideas of renormalization (for a summary and references cf. [8]). The main idea is in fact related to the ones used in defining the Lebesgue integral and the Dirac delta: we approach the continuum limit through some discretization, and we study how the results change under a change of the discretization. But, unlike in the case of integrals and distributions, the continuum limit is much more complicated here, and we must always keep referring to the discretization scale. Although this may seem to be a flaw in the line of thought, it actually leads to new, measurable effects (running coupling constants, trace anomaly) [9].

This solution had a huge impact on the development of statistical physics and quantum field theory, disciplines where the formalism relies strongly on the path integral. Present-day numerical computations in elementary particle physics mostly use path integral methods in some discretization, and the continuum limit is taken only at the end of the computations. In this way, however, precise numerical results can be obtained (cf. for example [10]).

Monte Carlo (MC) simulations are used in various fields nowadays, including finance. In the financial sector most models are extensions of Brownian motion, and so Gaussian MC simulations can be applied to simulate the price movements.

The purpose of this note is to give an introduction to finance in the language of physics. As such, it is part of an ongoing effort to bring the ideas of physics into finance and vice versa [11, 12, 13, 14, 15, 16].

This note is organized as follows. We define the mathematical space that corresponds to the market (Section 2), then we discuss the value of a portfolio in Section 3. In Section 4 we look at the market from the point of view of statistics, and introduce the tools for treating the price changes in a discretized formulation. In Section 5 we turn to the possibility of a continuous approximation. In the next section (Section 6) we solve some stochastic differential equations. In Section 7 we discuss risk mitigation techniques applied in the market, and how the assumption of risk neutrality leads to the determination of the price of a derivative (Section 8). The paper closes with a Summary section (Section 9).

2 The space of trades

In order to be able to speak about financial products we have to define an abstract space that represents the trades. To understand the logic we recall that trading traditionally stems from the exchange of property between different people, families, tribes, or later firms. All tradeable property will be called an asset, be it direct material goods like vegetables, cattle or tools, or indirect ones such as land, labor or even the life of a person (which is traded, for example, when somebody enters the army). The assets can have parameters (for example quality, expiration date etc.); assets with different parameter values are treated as different assets.

The property of a trader usually consists of several assets. They can have a house, two horses, five and a half barrels of oil and also three and a half cows if two persons own seven cows together. In the property (we will call it a portfolio) each asset thus has some quantity. The property or portfolio is thus the list of all the assets with their available quantities.

The mathematical structure corresponding to this construction is the vector space. Let us denote by $A$ the vector space (asset space or portfolio space) whose basis elements are the assets. Although it can be thought of as infinite dimensional (because, for example, the quality forms a continuum), in practice only a finite number of asset types are traded, so we do not lose anything if we think of it as a finite dimensional vector space. We mathematically define the portfolio as an element of the asset space

$${\cal P}\in A.$$

In finance there is a distinguished asset that plays a universal role, and this is money. In economics money has various roles; here we consider just one aspect, the universal medium of exchange. We use US dollars as numeraire, and denote the corresponding asset by USD. So if we have ten dollars and two dogs, then our portfolio can be described as ${\cal P}=10\,\mathrm{USD}+2\,\mathrm{dogs}$. Logical.

2.1 Loans and other promises

What makes it more interesting is that not only actual goods can be traded in a spot exchange, but other “financial products” as well. One of the simplest financial products is a loan. This can be money, but other assets can be lent and borrowed, too.

Whoever has a debt has, in some sense, negative property. If we owe three cows then our portfolio could be written as $-3\,\mathrm{cow}$. But this is not the most adequate notation, and sometimes it can lead to misunderstandings. The reason is that if we have three cows and owe three cows, the above notation would suggest writing ${\cal P}=3\,\mathrm{cow}-3\,\mathrm{cow}=0$. But it is not true that we have nothing, because we can use the benefits of the cows, for example we can drink their milk.

Thus, somewhat generalizing the concept of the loan, we will speak about general promises or liabilities. A debt can be considered as a promise that we will give (back) a certain asset when asked. The loan is the opposite: somebody has promised us a payoff at some time. In fact the actual assets and the promises on actual assets are the main constituents of the more complicated financial products.

Let us denote the promise by $p$; its argument is the asset that is promised. The loan is a positive promise, because when it is fulfilled, one will possess the given asset. This means that if we have three cows and owe three cows, then our property is

$${\cal P}=3\,\mathrm{cows}-3\,p(\mathrm{cows}). \tag{1}$$

Now we cannot simplify this expression, and it means exactly what we want it to. The promise $p$ is defined to be a linear map on the asset space

$$p:A\to A,\qquad p(\alpha a+\beta b)=\alpha p(a)+\beta p(b). \tag{2}$$

A promise, since it concerns future events, can have several more parameters, which is why it is worth writing it as a function of them. A usual parameter is the maturity (also called tenor or expiration time), denoting when the promise is due. If we denote the present time as $t=0$, then

$${\cal P}=3\,\mathrm{cows}-3\,p(\mathrm{cows},T),\qquad T=1\,\mathrm{y} \tag{3}$$

means that we should deliver 3 cows one year from now. $T$ can also be a time interval, as discussed later.
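
The linear structure of eqs. (1)-(3) can be illustrated with a few lines of code. The following is only a minimal sketch: the class name `Portfolio`, the helpers `asset` and `promise`, and the encoding of $p(a,T)$ as a tuple are hypothetical choices made for this illustration, not notation used elsewhere in this note. The point it shows is that the promise of an asset is a separate basis element, so three cows owned and three cows owed do not cancel.

```python
from collections import defaultdict


class Portfolio:
    """Element of the asset space A: a finite linear combination of basis assets."""

    def __init__(self, holdings=None):
        self.holdings = defaultdict(float)
        for key, qty in (holdings or {}).items():
            self.holdings[key] += qty

    def __add__(self, other):
        result = Portfolio(self.holdings)
        for key, qty in other.holdings.items():
            result.holdings[key] += qty
        return result

    def __rmul__(self, scalar):
        # scalar * portfolio
        return Portfolio({key: scalar * qty for key, qty in self.holdings.items()})

    def __sub__(self, other):
        return self + (-1.0) * other

    def quantity(self, key):
        return self.holdings.get(key, 0.0)


def asset(name):
    """One unit of a basis asset (hypothetical helper)."""
    return Portfolio({name: 1.0})


def promise(portfolio, maturity):
    """Linear map p(., T): basis asset a is sent to the basis element ('p', a, T)."""
    return Portfolio({("p", key, maturity): qty
                      for key, qty in portfolio.holdings.items()})


# Eq. (3): three cows owned, three cows owed in one year.
cows = asset("cow")
P = 3 * cows - 3 * promise(cows, 1.0)
print(P.quantity("cow"), P.quantity(("p", "cow", 1.0)))
```

Because `promise` acts on each basis element separately, the linearity required in eq. (2) holds by construction.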

2.2 Common financial products

In this language we can describe a lot of financial products. For example a loan with notional $X$ USD, paid back in parts, can be described as

$${\cal P}=X\,\mathrm{USD}-\sum_{n=1}^{N}c_{n}\,p(\mathrm{USD},t_{n})-X_{r}\,p(\mathrm{USD},T), \tag{4}$$

where $c_{n}$ is the interest payment due at time $t_{n}$ (for example $t_{n}=n\,\mathrm{m}$ for monthly payments), and $X_{r}$ is the remainder due at expiration time $T$. To determine the values of the parameters $c_{n}$, $N$, $T$ and $X_{r}$ at fixed $t_{n}$, we can use different techniques discussed later. For a fixed rate loan $c_{n}$ is constant.

Another product is the futures trade, in which it is agreed that an asset $a$ will be bought or sold at a given strike price $K$ at maturity time $T$. If we agree to buy the asset, we are said to be in the long position, and our portfolio consists of

$${\cal P}=p(a,T)-K\,p(\mathrm{USD},T). \tag{5}$$

If we agree to sell the asset, we are said to be in the short position, and our portfolio is

$${\cal P}=-p(a,T)+K\,p(\mathrm{USD},T). \tag{6}$$

Another interesting parameter of the promise can be its optionality. One of the counterparties may have the right not to fulfill or not to exercise their promise. In this case the two parties are not equivalent. We say that the party who possesses the optionality is in the long position, while the other counterparty (who “sells the optionality”) is in the short position, irrespective of whether the promise is to buy or to sell something.

A possible notation for options is to multiply the possible payoffs by a number $\alpha\in\{0,1\}$. When $\alpha=1$, the promise is fulfilled; otherwise it is declined. It also matters who has the right to decide the value of $\alpha$, which we indicate by a $\pm$ index: if the index is $+$, then the portfolio owner has the right to set the value of $\alpha$ (i.e. she is in the long position); if the index is $-$, then someone else determines its value (so the portfolio owner is in the short position with respect to the option).

For example, if we agree that trader A has the option to buy a product $a$ at time (or time interval) $T$ for a strike price $K$ from trader B (European option), then their portfolios read

$${\cal P}_{A}=\alpha_{+}\bigl(p(a,T)-K\,p(\mathrm{USD},T)\bigr),\qquad{\cal P}_{B}=\alpha_{-}\bigl(-p(a,T)+K\,p(\mathrm{USD},T)\bigr). \tag{7}$$

The exercise date can also be optional: in an American option it is any value in $[0,T]$, while in a Bermudan option there are some fixed dates. As in the previous case, we can denote its optionality by a subscript $\pm$. An American option can be described as

$${\cal P}_{A}=\alpha_{+}\bigl(p(a,T_{+})-K\,p(\mathrm{USD},T_{+})\bigr),\qquad{\cal P}_{B}=\alpha_{-}\bigl(-p(a,T_{-})+K\,p(\mathrm{USD},T_{-})\bigr) \tag{8}$$

where $T_{+}=T_{-}\in[0,T]$.

We note that the strike price can also be a complicated construction, even depending on the price history. For example we can agree that the buyer of the option has the right to sell a given asset at the average price that was achieved in a given time interval (Asian option), or something more exotic.

We also note that, although the choice of $\alpha$ is completely up to the trader in the long position, sensible traders choose $\alpha=1$ exactly when it is beneficial to them. This makes it possible to determine the price of the option, see later.
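
The exercise rule of a sensible trader can be written down in a few lines. The sketch below assumes that “beneficial” means that the market price of the asset at maturity exceeds the strike price, so that exercising the right to buy has positive value; the function names are hypothetical and serve only this illustration of the European option of eq. (7).

```python
def exercise_flag(price_at_maturity, strike):
    """alpha_+ of the holder of a right to buy: 1 if exercising pays off, else 0."""
    return 1 if price_at_maturity > strike else 0


def long_value_at_maturity(price_at_maturity, strike):
    """Value at maturity of P_A in eq. (7), with alpha_+ chosen by the holder."""
    alpha = exercise_flag(price_at_maturity, strike)
    return alpha * (price_at_maturity - strike)


def short_value_at_maturity(price_at_maturity, strike):
    """Value at maturity of P_B in eq. (7): the same alpha, set by the counterparty."""
    return -long_value_at_maturity(price_at_maturity, strike)


for price in (80.0, 100.0, 120.0):
    print(price, long_value_at_maturity(price, 100.0),
          short_value_at_maturity(price, 100.0))
```

With this rule the long position never has a negative value at maturity, which is the asymmetry between the two parties mentioned above.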

3 Value of the portfolio

By now we can describe what we currently have. In a trade we exchange two (or more) assets. But the question is, how much is a given asset worth? Clearly no one would bargain away their property, but at the same time everybody wants to achieve the highest price possible.

On the other hand, there is no explicit measure of value for goods, in particular because goods may have a hidden advantage for somebody, and this person is willing to buy them at a higher price, too. So the only measure of the value of an asset is the price at which it is actually traded. A well informed trader will trade the asset at exactly the price that is adequate at that moment. The lack of information leads to a failed trade, or to arbitrage, when an asset can be bought from and sold to different parties, realizing a net profit.

If a market is well informed, and there are a lot of vigilant merchants around, then arbitrage cannot persist for long. If this were strictly true, then there would be a single price for each asset. But actually this is just an approximation, since nobody knows that value, and so every trade modifies the price somewhat. A momentary excess of demand will raise the price, while a momentary excess of supply will lower it, and this is repeated time and time again. So, if we insist on having a definite price, we have to say that the prices fluctuate.

If we sell or buy several assets, then we trade them separately. This means that the price (value of the portfolio) is a linear map from the asset space and time to the real numbers (actually $\bm{R}_{+}$). Thus

$$\begin{array}[t]{lrll}S\,:&A\times\bm{R}&\to\bm{R}_{+}&\qquad\mathrm{linear}\\&(a,t)&\mapsto S(a,t)&\end{array} \tag{9}$$

gives the price/value of the asset $a$ at time $t$.

In a fair business neither of the counterparties loses: both of them give or receive the price which corresponds to the assets they trade. If it is a spot bargain, then both parties know the market price, and this serves as a reference point. But if the payoffs happen in the future, one needs a tool to compute the value of the asset at present. This is the present value, and it forms the basis of a fair trade.

3.1 Discounting a risk free zero coupon bond

The simplest future payoff is the zero coupon bond, which is $p(\mathrm{USD},T)$, i.e. it pays 1 USD once, at a future time $T$. We also assume that it is risk free, meaning that we can count on the payoff with one hundred percent certainty. For example we may think of a US government bond. Our task is to tell its value at time $t$, which is called discounting the value of the payoff.

To tell the present value, we have to compare the investment in a zero coupon bond to a bank deposit in a safe bank. If it were more advantageous to invest in a bank deposit, then we would short the zero coupon bond now, and put the money in the bank deposit. At time $T$ the bank deposit would have a higher value, and so we could gain money with zero starting capital. If the investment in the zero coupon bond were more advantageous, we could do the inverse: we borrow money from a bank, put it into the bond, and realize a net profit at time $T$. To avoid these arbitrage possibilities, the present values of a risk free zero coupon bond and a risk free bank deposit must agree.

But the bank pays interest on all deposits. In the simplest case there is a fixed annual interest rate $r_{1}$. Technically the interest is paid periodically, in each time period $dt$, with the corresponding interest rate $r_{dt}$. The rate $r_{dt}$ can be determined from the condition that after one year we get the rate $r_{1}$ (assuming $1/dt$ is an integer):

$$(1+r_{dt})^{[1/dt]}=1+r_{1}\quad\Rightarrow\quad r_{dt}=(1+r_{1})^{dt}-1. \tag{10}$$

In the limit $dt\to 0$ (called continuous compounding) we write $r_{dt}=r\,dt$. Then

$$(1+r_{dt})^{[t/dt]}=\left(1+\frac{rt\,dt}{t}\right)^{t/dt}\stackrel{dt\to 0}{\longrightarrow}e^{rt}. \tag{11}$$

This also means that $r=\ln(1+r_{1})$.
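
The limit in eqs. (10) and (11) is easy to check numerically. The following sketch uses an arbitrarily chosen annual rate of 5% (an assumption of the example); the function names are hypothetical. It shows that $r_{dt}/dt$ approaches $\ln(1+r_{1})$ as the compounding period shrinks, while the yearly growth stays $1+r_{1}$ for every period.

```python
import math


def periodic_rate(annual_rate, dt):
    """Eq. (10): r_dt = (1 + r_1)**dt - 1, with dt measured in years."""
    return (1.0 + annual_rate) ** dt - 1.0


def continuous_rate(annual_rate):
    """Continuously compounded rate r = ln(1 + r_1)."""
    return math.log1p(annual_rate)


def compound_growth(r, t, dt):
    """Left-hand side of eq. (11) with r_dt = r * dt: (1 + r dt)**(t/dt)."""
    return (1.0 + r * dt) ** round(t / dt)


r1 = 0.05  # illustrative annual rate
for periods in (1, 12, 365, 1_000_000):
    dt = 1.0 / periods
    r_dt = periodic_rate(r1, dt)
    yearly_growth = (1.0 + r_dt) ** periods  # stays equal to 1 + r1
    print(periods, r_dt / dt, yearly_growth)

print(continuous_rate(r1), compound_growth(0.05, 2.0, 1e-5), math.exp(0.05 * 2.0))
```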

If we deposit $X$ USD in the bank at time $t$, at a later time $T$ it is worth $Xe^{r(T-t)}$ USD. This should be compared to the case when we buy a zero coupon bond at time $t$ with maturity $T$. In an arbitrage-free fair business both should have a value of 1 USD at time $T$, so we require $Xe^{r(T-t)}=1$. Thus the value of the zero coupon bond at time $t$ is

$$X=S(p(\mathrm{USD},T),t)=e^{-r(T-t)}=(1+r_{1})^{-(T-t)}. \tag{12}$$

This formula makes it possible to determine the value of $c$ for a fixed rate loan. The portfolio was given in (4). In a fair business the value of the portfolio is zero at all times. Let us compute it at time zero (the present time), when we have

$$0=S({\cal P},0)=X-\sum_{n=1}^{N}c\,S(p(\mathrm{USD},t_{n}),0)-X_{r}\,S(p(\mathrm{USD},T),0). \tag{13}$$

Let us choose $t_{n}=n\,\Delta t$, $T=(N+1)\Delta t$, and denote the actual interest rate (which is the risk free interest rate plus the spread) by $r$. Summing the geometric series of the discount factors $e^{-rn\Delta t}$, we find

$$X=c\,\frac{e^{-r\Delta t}-e^{-rT}}{1-e^{-r\Delta t}}+X_{r}e^{-rT}, \tag{14}$$

and, correspondingly,

$$c=(X-X_{r}e^{-rT})\frac{1-e^{-r\Delta t}}{e^{-r\Delta t}-e^{-rT}}. \tag{15}$$

Therefore the condition of arbitrage freeness in the absence of risk leads to a definite price for the zero coupon bond, and a definite value of the fixed rate payment.
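
As a consistency check, eq. (15) can be inserted back into the fair value condition (13). The sketch below does this for an illustrative loan; the numbers (notional, remainder, rate, monthly payments over two years) and the function names are assumptions made only for this example.

```python
import math


def zero_coupon_value(r, t, T):
    """Eq. (12): value at time t of 1 USD paid at time T."""
    return math.exp(-r * (T - t))


def fixed_payment(X, X_r, r, delta_t, N):
    """Eq. (15) with t_n = n * delta_t and T = (N + 1) * delta_t."""
    T = (N + 1) * delta_t
    q = math.exp(-r * delta_t)
    return (X - X_r * math.exp(-r * T)) * (1.0 - q) / (q - math.exp(-r * T))


def loan_value_at_zero(X, X_r, c, r, delta_t, N):
    """Right-hand side of eq. (13); it should vanish for a fair loan."""
    T = (N + 1) * delta_t
    payments = sum(c * zero_coupon_value(r, 0.0, n * delta_t)
                   for n in range(1, N + 1))
    return X - payments - X_r * zero_coupon_value(r, 0.0, T)


# Illustrative numbers: 23 monthly payments, remainder paid after two years.
X, X_r, r, delta_t, N = 10_000.0, 2_000.0, 0.06, 1.0 / 12.0, 23
c = fixed_payment(X, X_r, r, delta_t, N)
residual = loan_value_at_zero(X, X_r, c, r, delta_t, N)
print(c, residual)
```

The residual vanishes up to rounding errors, which confirms that the closed form (14) is the geometric sum of the discount factors in (13).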

3.2 Discounting the price of an asset

Let us assume that we have a promise that we will be given an asset $a$ at time $T$, so our portfolio is $p(a,T)$. What is the value of the portfolio at time $t$?

What we certainly know is that

$$S(p(a,T),T)=S(a,T), \tag{16}$$

since the promise is fulfilled then: we obtain the asset, and its price is what is determined by the market at that time. We claim that this is true at other times as well, i.e.

$$S(p(a,T),t)=S(a,t), \tag{17}$$

so the value does not depend on $T$.

The reason is that if $S(p(a,T),t)>S(a,t)$, then we buy the asset now and, at the same time, sell the promise of delivery at time $T$. Therefore we now have the asset $a$, have paid its value ($-S(a,t)\,\mathrm{USD}$), have promised a delivery of $a$ at time $T$ (this is $-p(a,T)$), and have obtained the price of the promise, $S(p(a,T),t)\,\mathrm{USD}$. Our portfolio therefore reads

$${\cal P}_{1}(t)=a-S(a,t)\,\mathrm{USD}-p(a,T)+S(p(a,T),t)\,\mathrm{USD}. \tag{18}$$

The value of the portfolio is zero at time $t$. Its value at time $T$, if the promise is fulfilled, is

$$S({\cal P}_{1},T)=S(a,T)-S(p(a,T),T)+(S(p(a,T),t)-S(a,t))\,S(\mathrm{USD},T). \tag{19}$$

But the first two terms cancel each other by equation (16), and so what remains is

$$S({\cal P}_{1},T)=(S(p(a,T),t)-S(a,t))\,S(\mathrm{USD},T)>0. \tag{20}$$

Therefore we could gain money. If $S(p(a,T),t)<S(a,t)$, then we build the portfolio

$${\cal P}_{2}=-a+S(a,t)\,\mathrm{USD}+p(a,T)-S(p(a,T),t)\,\mathrm{USD}, \tag{21}$$

for which $S({\cal P}_{2},t)=0$ and $S({\cal P}_{2},T)=(S(a,t)-S(p(a,T),t))\,S(\mathrm{USD},T)>0$ again. To exclude this arbitrage possibility we need to have $S(p(a,T),t)=S(a,t)$, which is what we wanted to demonstrate.

We remark that the two cases are somewhat different. If the price of the promise is larger than the actual price, we can immediately realize a profit without any initial capital. The other case is feasible only if we already own the asset, otherwise we cannot realize the $-a$ part of the portfolio. But, if the asset is liquid enough, there are enough assets in the market to forbid this arbitrage.
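
The argument can be followed numerically. The sketch below evaluates the portfolios of eqs. (18) and (21) at time $T$; it assumes, as in the text, that the cash is simply held until $T$ and that $S(\mathrm{USD},T)=1$ because USD is the numeraire. The function name and the sample prices are hypothetical.

```python
def arbitrage_profit(asset_price, promise_price, asset_price_at_T):
    """Value at time T of P_1 (promise overpriced) or P_2 (promise underpriced)."""
    usd_value_at_T = 1.0  # S(USD, T): USD is the numeraire
    if promise_price > asset_price:
        # P_1 = a - S(a,t) USD - p(a,T) + S(p(a,T),t) USD, eq. (18)
        asset_leg = asset_price_at_T - asset_price_at_T  # a and -p(a,T) cancel, eq. (16)
        cash_leg = (promise_price - asset_price) * usd_value_at_T
    else:
        # P_2 = -a + S(a,t) USD + p(a,T) - S(p(a,T),t) USD, eq. (21)
        asset_leg = -asset_price_at_T + asset_price_at_T
        cash_leg = (asset_price - promise_price) * usd_value_at_T
    return asset_leg + cash_leg


# The profit does not depend on where the asset price ends up at time T.
for final_price in (40.0, 100.0, 250.0):
    print(arbitrage_profit(100.0, 103.0, final_price),
          arbitrage_profit(100.0, 98.0, final_price))
```

The profit is the absolute price difference in both cases and vanishes only if the promise and the asset have the same price, which is the content of eq. (17).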

Using this result we can give the price of a futures trade. The portfolio of a long position is given by (5); its price is therefore

$$S(p(a,T)-K\,p(\mathrm{USD},T),t)=S(a,t)-Ke^{-r(T-t)}. \tag{22}$$

4 Statistical approach to the market

In fact the discounting of an asset price is the only result which is independent of the way the market operates. Already the calculation of the discount factor of a fixed payoff depends strongly on the details, in this case on the interest rate. A fair business takes into account the market rates which, however, fluctuate in time. Therefore we should understand how the market operates, how the prices are determined, and why and how they fluctuate. This is a very complicated question, and we can only hope to find a satisfactory approximation.

The first point we have to clarify is the recording of prices. Although previously we used a continuous time notation, this is an abstraction, an approximation. In reality all the recordings have a time stamp that is not infinitely fine. There is a smallest time difference that can be resolved, say $d\tau=1\,\mu\mathrm{sec}$ (as an upper estimate), thus all trades and prices can be labeled by an integer; in particular the price of asset $a$ at time $t=n\,d\tau$ will be denoted by $S_{na}$. We will use a fixed number $N$ of assets; the vector of all prices is then $S_{n}=(S_{n1},S_{n2},\dots,S_{nN})$. Sometimes we will put a comma between the two indices in order to avoid misunderstanding, for example we will write $S_{n+1,a}$.

When we think about a dynamic model of price changes we must stress that in a complete model the price in the future must depend solely on the information available at present. Indeed, we cannot make decisions based on past events if they are forgotten. The only way of remembering past events is to make notes about them (possibly in our memory), and then they are information available in the present. So we may write generally

$$S_{n+1}=S_{n}+{\cal F}(\mathrm{information\ available\ at\ present}). \tag{23}$$

The factors determining the evolution of the price are, of course, numerous. Moreover, for a quantitative prediction we would need to know the actual form of ${\cal F}$. Thus predicting the price in the future seems to be impossible.

Still, we can benefit from the generic form above. We may divide the information available at present into three parts. The first part consists of externalities that do not depend on the status of the market: for example natural events like the weather, new discoveries, inventions, political or military actions. In a market model we do not want to describe their dynamics; we take them as given processes, and as such they can be taken into account as an explicit time dependence. We may hope that these effects are slow (usually they are, but for example the weather can have a significant influence in certain areas even on a daily basis).

The second part of the variables describes the market. Among them are the asset prices, but other market factors, like forward rates, can also be present. They appear on both sides of the equation, and we denote them collectively by $S$.

The third part is again (mainly) independent of the status of the market, but consists of fast processes, for example the momentary intentions of the participants of the market. Let us denote them by $\xi_{i}$, where $i$ runs through some (large) index set. These processes are in principle well defined and follow their own dynamics, but it is impossible to tell their time dependence from the knowledge of the asset prices. All in all we have the equation

$$S_{n+1}=S_{n}+{\cal F}_{n}(S_{n},\xi_{in}). \tag{24}$$

Were the $\xi_{i}$ absent from the above equation, we could determine ${\cal F}$ from the observation of price changes in the past, and eventually recalibrate its form from time to time. But it is hopeless to determine the actual form of the $\xi_{i}$ functions. What helps us in this situation is that they are numerous, and although each of them is deterministic, their net effect is still something that can be described statistically. This means that we assume a time dependence for them, solve the above equation for all possible time dependences, and finally average over the results with some weight. We will assume that these variables are normalized in such a way that they fluctuate around zero (their mean is treated as a deterministic effect).

4.1 Linearization

Using the fact that the individual $\xi_{i}$ effects are small, we can expand the function ${\cal F}$ in powers of $\xi$ to first order:

$$S_{n+1}=S_{n}+{\cal F}_{n}(S_{n},0)+\xi_{in}\frac{\partial{\cal F}_{n}}{\partial\xi_{in}}\biggr|_{(S_{n},0)}+\dots. \tag{25}$$

The last term is a weighted sum of the $\xi_{i}$ variables at time index $n$. Now we can argue that the distribution of the sum of many, mostly independent random variables (with bounded variance) is Gaussian. This is the central limit theorem; it holds only under certain conditions, which we tacitly assume to be fulfilled here. Thus the last term can be replaced by a single term with some generic coefficient:

$$S_{n+1}=S_{n}+{\cal F}_{n}(S_{n},0)+Z_{n}(S_{n})\xi_{n}, \tag{26}$$

where the $\xi_{n}$ variables are all Gaussian distributed random variables with zero mean and unit variance. We will assume that these random variables are independent for different times: indeed, we can argue that there are different trades throughout the world at random times, and so their interrelation is weak. But we must keep in mind that this is again an approximation, because if we do not observe all effects, the effective dynamics of the rest will contain memory effects. What we assume is that these memory effects are small.

Although all the formulae are supposed to be written for multi-component variables, it may be useful to write out the indices explicitly. In the multi-component notation the above equation can be written as

$$S_{n+1,a}=S_{na}+{\cal F}_{na}(S_{n},0)+Z_{na}(S_{n})\xi_{na}. \tag{27}$$

The $\xi_{na}$ random variables are not necessarily independent for different assets:

$$\bm{E}\left(\xi_{na}\xi_{mb}\right)=C_{n,ab}\delta_{nm}, \tag{28}$$

and so the covariance matrix of the complete noise term reads

$$\bm{E}\left(Z_{na}\xi_{na}Z_{mb}\xi_{mb}\right)=\delta_{nm}Z_{na}Z_{nb}C_{n,ab}. \tag{29}$$

To simplify the treatment, we diagonalize the correlation matrix (which is a symmetric, regular, real matrix) as

$$C_{n,ab}=\sum_{k=1}^{N}\lambda_{nk}v_{a}^{(nk)}v_{b}^{(nk)}, \tag{30}$$

where the vectors $\bm{v}^{(nk)}$ are eigenvectors of the correlation matrix $\bm{C}_{n}$, and they are orthonormal: $\bm{v}^{(nk)}\bm{v}^{(n\ell)}=\delta_{k\ell}$. Then we can write (27) as

$$S_{n+1,a}=S_{na}+{\cal F}_{na}(S_{n},0)+\sum_{k=1}^{N}Z_{n,ak}(S_{n})\xi_{nk} \tag{31}$$

with the volatility matrix

$$Z_{n,ak}=Z_{na}\sqrt{\lambda_{nk}}\,v_{a}^{(nk)}, \tag{32}$$

and uncorrelated noise terms

$$\bm{E}\left(\xi_{nk}\xi_{m\ell}\right)=\delta_{k\ell}\delta_{nm}. \tag{33}$$

Indeed, the correlation of the noise term now reads

$$\bm{E}\left(\sum_{k=1}^{N}Z_{n,ak}\xi_{nk}\right)\left(\sum_{\ell=1}^{N}Z_{m,b\ell}\xi_{m\ell}\right)=\delta_{nm}Z_{na}Z_{nb}\sum_{k=1}^{N}\lambda_{nk}v_{a}^{(nk)}v_{b}^{(nk)} \tag{34}$$

which is exactly the complete covariance matrix (29).
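
For two assets the construction can be verified explicitly. The sketch below assumes a correlation matrix of the form $\bigl(\begin{smallmatrix}1&\rho\\ \rho&1\end{smallmatrix}\bigr)$, whose eigenvalues are $1\pm\rho$ with eigenvectors $(1,\pm 1)/\sqrt{2}$; the numerical values of $Z_{a}$ and $\rho$, as well as the function names, are illustrative assumptions. It builds the volatility matrix of eq. (32) and checks eq. (34) both algebraically and by sampling uncorrelated Gaussian noise.

```python
import math
import random


def volatility_matrix(z, rho):
    """Eq. (32) for two assets with correlation matrix [[1, rho], [rho, 1]]."""
    eigenvalues = (1.0 + rho, 1.0 - rho)
    s = 1.0 / math.sqrt(2.0)
    eigenvectors = ((s, s), (s, -s))  # eigenvectors[k][a]
    return [[z[a] * math.sqrt(eigenvalues[k]) * eigenvectors[k][a]
             for k in range(2)] for a in range(2)]


def noise_covariance(vol):
    """Sum over k of Z_ak Z_bk: the covariance implied by eqs. (31) and (33)."""
    return [[sum(vol[a][k] * vol[b][k] for k in range(2))
             for b in range(2)] for a in range(2)]


def sampled_cross_covariance(vol, samples, rng):
    """Monte Carlo estimate of the off-diagonal covariance of the noise term."""
    total = 0.0
    for _ in range(samples):
        xi = (rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0))  # uncorrelated, eq. (33)
        noise = [sum(vol[a][k] * xi[k] for k in range(2)) for a in range(2)]
        total += noise[0] * noise[1]
    return total / samples


z, rho = (0.2, 0.5), 0.3  # illustrative values
vol = volatility_matrix(z, rho)
cov = noise_covariance(vol)
target = [[z[0] * z[0], z[0] * z[1] * rho],
          [z[0] * z[1] * rho, z[1] * z[1]]]  # Z_a Z_b C_ab, eq. (29)
estimate = sampled_cross_covariance(vol, 100_000, random.Random(0))
print(cov, target, estimate)
```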

All the above means that it is enough to have as many Gaussian random variables as the number of assets on the market (originally we had many more). These variables can be taken to be independent, and they appear in the evolution equations multiplied by the volatility matrix $Z_{n,ak}$. Thus the joint probability density of the random variables is

$${\cal P}(\{\xi\})=\prod_{nk}{\cal P}_{G}(\xi_{nk}),\qquad{\cal P}_{G}(\xi)=\frac{1}{\sqrt{2\pi}}e^{-\frac{\xi^{2}}{2}}. \tag{35}$$

From now on we suppress the multidimensional indices, and treat $Z$ as a matrix $Z_{n,ak}$ and $\xi$ as a vector $\xi_{nk}$.

4.2 Scaling under changing of the discretization time

In the above discussion the value of $d\tau$ could be chosen arbitrarily. Our first guess was $1\,\mu\mathrm{sec}$, but it could just as well be $2\,\mu\mathrm{sec}$ or even $0.5\,\mu\mathrm{sec}$. What effect does this have on the form of the dynamic equation?

Let us first assume that we want to work with $dt=2\,d\tau$. This means that we want to obtain $S_{n+2}$ from $S_{n}$. When we recursively substitute the equation for $S_{n+1}$ into the equation for $S_{n+2}$ we get a lengthy expression. But the price changes are so small in this time interval that in the arguments of the ${\cal F}$ and $Z$ functions we can use the previous value. This simplifies the result to

$$S_{n+2}=S_{n}+2{\cal F}_{n}(S_{n},0)+\sqrt{2}\,Z_{n}(S_{n})\frac{\xi_{n}+\xi_{n+1}}{\sqrt{2}}, \tag{36}$$

where in the last term we divided and multiplied by $\sqrt{2}$. The sum of independent Gaussian random variables is again a Gaussian random variable. The correlation matrix coming from the last term,

$$\frac{1}{2}\bm{E}(\xi_{na}+\xi_{n+1,a})(\xi_{n,b}+\xi_{n+1,b})=\frac{1}{2}\bm{E}(\xi_{na}\xi_{nb})+\frac{1}{2}\bm{E}(\xi_{n+1,a}\xi_{n+1,b})=\delta_{ab}, \tag{37}$$

is thus the same as for $\xi_{n}$. Thus we may write

$$S_{n+2}=S_{n}+2{\cal F}_{n}(S_{n},0)+\sqrt{2}\,Z_{n}(S_{n})\xi_{n}. \tag{38}$$

This can be generalized to arbitrary $dt$ (as long as the change of the prices in this time interval is negligible): the first term is multiplied by $dt/d\tau$, while the second term is multiplied by $\sqrt{dt/d\tau}$:

$$S_{n+dt/d\tau}=S_{n}+\frac{dt}{d\tau}{\cal F}_{n}(S_{n},0)+\sqrt{\frac{dt}{d\tau}}\,Z_{n}(S_{n})\xi_{n}. \tag{39}$$

We may introduce the notations

$$\mu_{n}(S_{n})=\frac{1}{d\tau}{\cal F}_{n}(S_{n},0),\qquad\sigma_{n}(S_{n})=\frac{1}{\sqrt{d\tau}}Z_{n}(S_{n}),\qquad dS_{n}=S_{n+dt/d\tau}-S_{n}, \tag{40}$$

and then we can write for the discretization time $dt$

$$dS_{n}=\mu_{n}(S_{n})\,dt+\sigma_{n}(S_{n})\sqrt{dt}\,\xi_{n}. \tag{41}$$

We remark that in the multi-dimensional case $\sigma$ is a matrix in the asset price space.

This form shows that the continuous time limit is not trivial: not all the terms scale like $dt$, and so in the $dt\to 0$ limit the above equation does not go over to an ordinary differential equation. Instead, the continuous limit is known as a stochastic differential equation.
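
The different scaling of the two terms can be checked by simulation. The sketch below iterates eq. (41) for a single asset with constant $\mu$ and $\sigma$ (an assumption made so that the exact mean $\mu t$ and variance $\sigma^{2}t$ of the total change are known); the function name, the parameter values and the sample size are illustrative. If the noise term were scaled with $dt$ instead of $\sqrt{dt}$, the variance would depend on the chosen discretization.

```python
import math
import random
import statistics


def simulate_change(mu, sigma, horizon, dt, rng):
    """Iterate eq. (41) with constant mu and sigma; return S(horizon) - S(0)."""
    steps = round(horizon / dt)
    change = 0.0
    for _ in range(steps):
        change += mu * dt + sigma * math.sqrt(dt) * rng.gauss(0.0, 1.0)
    return change


rng = random.Random(1)
mu, sigma, horizon = 0.1, 0.4, 1.0  # illustrative values
results = []
for dt in (0.5, 0.1, 0.02):
    samples = [simulate_change(mu, sigma, horizon, dt, rng) for _ in range(10_000)]
    results.append((dt, statistics.fmean(samples), statistics.variance(samples)))

for dt, mean, variance in results:
    # mean is close to mu * horizon, variance to sigma**2 * horizon, for every dt
    print(dt, mean, variance)
```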

4.3 Numerical computation of an expectation value

From a practical point of view the treatment of (41) looks like the following. First we find the solution $S_{n+1}$ depending on the time series $\xi=\{\xi_{0},\xi_{1},\dots,\xi_{n}\}$ and on the initial condition $S_{0}$. Let us denote it by

$$S_{n+1}(S_{0},\xi). \tag{42}$$

Here we have used the fact that $S_{n+1}$ can depend only on past events. Then we calculate the expected value of any function of $S_{n+1}$ by averaging over the possible $\xi$ series, each element being drawn from an independent Gaussian distribution. As a formula this reads

$$\bm{E}f(S_{n+1})=\int\limits_{-\infty}^{\infty}\frac{d^{N}\xi_{0}}{(2\pi)^{N/2}}e^{-\frac{1}{2}\xi^{2}_{0}}\dots\frac{d^{N}\xi_{n}}{(2\pi)^{N/2}}e^{-\frac{1}{2}\xi^{2}_{n}}f(S_{n+1}(S_{0},\xi)). \tag{43}$$

The two equations (41) and (43) provide a well defined framework to solve any stochastic problem numerically.
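As an illustration of how (41) and (43) work together, the sketch below estimates $\bm{E}f(S_{n})$ by sampling the Gaussian series instead of integrating over it. This is plain Monte Carlo sampling, which is one possible way to evaluate (43) and not necessarily the one the text has in mind. The function name `expect`, the one-dimensional setting and the test case are assumptions made for the example.

```python
import numpy as np

def expect(f, s0, mu, sigma, dt, n_steps, n_paths=100_000, seed=0):
    """Monte Carlo estimate of E f(S_n) in the sense of Eq. (43).

    Returns the sample mean and its standard error.
    """
    rng = np.random.default_rng(seed)
    s = np.full(n_paths, float(s0))
    for _ in range(n_steps):
        xi = rng.standard_normal(n_paths)                   # one Gaussian per path and step
        s = s + mu(s) * dt + sigma(s) * np.sqrt(dt) * xi    # Eq. (41)
    values = f(s)
    return values.mean(), values.std(ddof=1) / np.sqrt(n_paths)

# Hypothetical test case: mu = 0, sigma = 1, f(S) = S^2, for which E S_n^2 = S_0^2 + n dt.
estimate, error = expect(lambda s: s**2, 0.0,
                         lambda s: np.zeros_like(s), lambda s: np.ones_like(s),
                         dt=0.01, n_steps=50)
print(estimate, "+/-", error)   # to be compared with n dt = 0.5
```

The standard error decreases like the inverse square root of the number of sampled paths, independently of the number of time steps.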

Often we use a moment generating function that is defined as

$$\bm{E}e^{JS}=\int\limits_{-\infty}^{\infty}\frac{d^{N}\xi_{0}}{(2\pi)^{N/2}}e^{-\frac{1}{2}\xi^{2}_{0}+J_{0}S_{0}}\dots\frac{d^{N}\xi_{n}}{(2\pi)^{N/2}}e^{-\frac{1}{2}\xi^{2}_{n}+J_{n}S_{n}}, \tag{44}$$

where $JS=\sum_{a,n}J_{na}S_{na}$ and the series $S$ satisfies (41).

4.4 Change of variables

A very interesting consequence of the different scaling properties of the various terms in (41) is that a nontrivial extra term appears when we change variables.

Let us assume that we have a new variable $X=f(t,S)$, where $f$ is a smooth function. In the discretized case it reads $X_{n}=f_{n}(S_{n})$. Let us consider the change in $X$ up to ${\cal O}(dt^{3/2})$:

$$dX_{n}=f_{n+1}(S_{n+1})-f_{n}(S_{n})=\partial_{t}f_{n}(S_{n})\,dt+f_{n}(S_{n}+dS_{n})-f_{n}(S_{n}). \tag{45}$$

If all terms in $dS$ scaled like $dt$, then we could expand $f$ to first order. But the different terms scale in different ways, so we must keep the second order term as well:

$$dX_{n}=\partial_{t}f_{n}(S_{n})\,dt+\partial_{S}f_{n}(S_{n})\,dS_{n}+\frac{1}{2}\partial_{S}^{2}f_{n}(S_{n})\,dS_{n}^{2}+{\cal O}(dS_{n}^{3}). \tag{46}$$

Here we can use (41) for the value of $dS_{n}$. We remark that in $dS_{n}^{2}$ there is a single term that is proportional to $dt$, namely $\sigma^{2}dt\,\xi^{2}$; all other terms are of higher order. Thus we write

$$dX=\partial_{t}f\,dt+\partial_{S}f\left(\mu\,dt+\sigma\sqrt{dt}\,\xi\right)+\frac{1}{2}\sigma^{2}\partial_{S}^{2}f\,dt\,\xi^{2}+{\cal O}(dt^{3/2}), \tag{47}$$

where we omitted the arguments for brevity (note that $f(t,S)$ is a differentiable function, so it is sensible to speak about $\partial_{t}f$ even if the time steps are discrete). We rewrite this formula as

$$dX=\partial_{t}f\,dt+\partial_{S}f\,\mu\,dt+\frac{1}{2}\sigma^{2}\partial_{S}^{2}f\,dt+\partial_{S}f\,\sigma\sqrt{dt}\,\bar{\xi}, \tag{48}$$

where we introduced a new random variable with zero mean as

$$\bar{\xi}=\xi+\frac{\sigma\partial_{S}^{2}f}{2\partial_{S}f}\sqrt{dt}\,(\xi^{2}-1). \tag{49}$$

As we see, the change of $X$ is not Gaussian distributed, so $X$ is not a Brownian motion any more. But the difference from a Brownian motion vanishes like $\sim\sqrt{dt}$ as $dt\to 0$. So in the limit we can neglect the difference between $\bar{\xi}$ and $\xi$. Then we find

$$dX=\left(\partial_{t}f+\mu\,\partial_{S}f+\frac{1}{2}\sigma^{2}\partial_{S}^{2}f\right)dt+\sigma\,\partial_{S}f\sqrt{dt}\,\xi. \tag{50}$$

This is the Itô formula. For any number of correlated assets it reads

$$dX_{c}=\left(\partial_{t}f_{c}+\mu_{a}\frac{\partial f_{c}}{\partial S_{a}}+\frac{1}{2}(\sigma^{T}\sigma)_{ab}\frac{\partial^{2}f_{c}}{\partial S_{a}\partial S_{b}}\right)dt+\frac{\partial f_{c}}{\partial S_{a}}\sigma_{ab}\xi_{b}\sqrt{dt}. \tag{51}$$
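The extra second-derivative term of (50) can be checked without any random sampling. The sketch below is a numerical illustration, assuming a one-dimensional process, $f(S)=\ln S$ and arbitrarily chosen parameter values. It evaluates the Gaussian average of $f(S+dS)-f(S)$ over one time step by Gauss-Hermite quadrature and compares it with the drift of (50) and with the naive first-order drift.

```python
import numpy as np
from numpy.polynomial.hermite_e import hermegauss

def mean_increment(f, s, mu, sigma, dt, order=40):
    """E[f(S + dS) - f(S)] with dS = mu dt + sigma sqrt(dt) xi, by Gauss-Hermite quadrature."""
    nodes, weights = hermegauss(order)          # weight function exp(-x^2 / 2)
    weights = weights / np.sqrt(2.0 * np.pi)    # normalise to a unit Gaussian
    ds = mu * dt + sigma * np.sqrt(dt) * nodes
    return np.sum(weights * (f(s + ds) - f(s)))

s, mu, sigma, dt = 1.5, 0.3, 0.4, 1e-4          # illustrative values only
f, df, d2f = np.log, lambda x: 1.0 / x, lambda x: -1.0 / x**2

ito_drift = (mu * df(s) + 0.5 * sigma**2 * d2f(s)) * dt   # drift term of Eq. (50)
naive_drift = mu * df(s) * dt                             # first-order expansion only
exact = mean_increment(f, s, mu, sigma, dt)

print(exact - ito_drift)     # of order dt^2
print(exact - naive_drift)   # of order dt: the naive rule misses a term
```

The mismatch of the naive rule is of the same order as the drift itself, which is why the second-order term cannot be dropped.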

4.5 Evolution equation of the distribution functions

Let us assume that we have statistical information about the price at present: we know its distribution function ${\cal P}_{0}(S)$. What will the distribution function be at later times?

To give a formal definition of the distribution function we note that for any quantity $g(\xi)$ depending on a real valued random variable the expected value can be written with the help of the Dirac-delta:

$$\bm{E}_{\xi}g(\xi)=\int\limits_{-\infty}^{\infty}dx\,\bm{E}_{\xi}\delta(\xi-x)g(x). \tag{52}$$

Here $g(x)$ does not depend on $\xi$, so we can take it out of the scope of the expected value and obtain

$$\bm{E}_{\xi}g(\xi)=\int\limits_{-\infty}^{\infty}dx\,{\cal P}(x)g(x), \tag{53}$$

where ${\cal P}$ is the distribution function

$${\cal P}(x)=\bm{E}_{\xi}\delta(\xi-x). \tag{54}$$

The question we want to answer is: what is the distribution function of the prices at time $t=n\,dt$ if we know the distribution ${\cal P}_{m}(S)$ at time $t=m\,dt$? What we have to do is to solve the price motion using equation (41), starting from some initial condition $S=S_{m}$ at $t=m\,dt$ and assuming a given $\xi=\{\xi_{m},\dots,\xi_{n-1}\}$. Having obtained a solution $S_{n}(S_{m},\xi)$, we finally have to average over all $\xi$ and $S_{m}$.

Then we can write for the expected value of any function $f(S_{n})$

$$\bm{E}f(S_{n})=\bm{E}_{S_{m}}\bm{E}_{\xi}f(S_{n}(S_{m},\xi))=\int\limits_{-\infty}^{\infty}dS\,dS^{\prime}\,\bm{E}_{S_{m}}\delta(S^{\prime}-S_{m})\,\bm{E}_{\xi}\delta(S-S_{n}(S^{\prime},\xi))f(S). \tag{55}$$

With the distribution functions we can write this expression as

$$\bm{E}f(S_{n})=\int\limits_{-\infty}^{\infty}dS\,dS^{\prime}\,{\cal P}_{m}(S^{\prime}){\cal P}_{mn}(S^{\prime},S)f(S), \tag{56}$$

where

$${\cal P}_{mn}(S^{\prime},S)=\bm{E}_{\xi}\delta(S-S_{n}(S^{\prime},\xi)). \tag{57}$$

There are different methods to derive this quantity; here we will use the Itô formula, applied to the expectation value of the function $f(S_{n})$ above. First let us fix the initial distribution to

$${\cal P}_{m}(S^{\prime})\to\delta(S^{\prime}-S_{m}), \tag{58}$$

then the $S^{\prime}$ integral disappears. Now we change $n$, and write the change in the expected value in two ways. On the one hand we have

$$d(\bm{E}f(S_{n}))=\int\limits_{-\infty}^{\infty}dS\,d{\cal P}_{mn}(S_{m},S)f(S). \tag{59}$$

On the other hand, from the Itô formula (50) applied to (41), and using that $\xi$ has zero mean, we have

$$\begin{split}d(\bm{E}f(S_{n}))&=\bm{E}\,df(S_{n})=\bm{E}\left(\mu\,\partial_{S}f+\frac{1}{2}\sigma^{2}\partial_{S}^{2}f\right)dt\\ &=\int\limits_{-\infty}^{\infty}dS\,\left(\mu\,\partial_{S}f+\frac{1}{2}\sigma^{2}\partial_{S}^{2}f\right){\cal P}_{mn}(S_{m},S)\,dt\\ &=\int\limits_{-\infty}^{\infty}dS\,f(S)\left(-\partial_{S}(\mu{\cal P})+\frac{1}{2}\partial_{S}^{2}(\sigma^{2}{\cal P})\right)dt,\end{split} \tag{60}$$

where in the last line we performed partial integration, and omitted the arguments of ${\cal P}$ for brevity. Since the two expressions are equal for any function $f$, we can conclude

$$d{\cal P}=\left(-\partial_{S}(\mu{\cal P})+\frac{1}{2}\partial_{S}^{2}(\sigma^{2}{\cal P})\right)dt. \tag{61}$$

In continuous time this leads to a partial differential equation known as the Fokker-Planck equation or Kolmogorov PDE:

$$\partial_{t}{\cal P}=-\partial_{S}(\mu{\cal P})+\frac{1}{2}\partial_{S}^{2}(\sigma^{2}{\cal P}). \tag{62}$$

If we wanted to write out the indices explicitly we would write

$$\frac{\partial}{\partial t}{\cal P}_{a}=-\frac{\partial}{\partial S_{b}}(\mu_{b}{\cal P}_{a})+\frac{1}{2}\frac{\partial^{2}}{\partial S_{b}\partial S_{c}}((\sigma^{T}\sigma)_{bc}{\cal P}_{a}). \tag{63}$$

4.5.1 Composition rule and dependence on the initial conditions

We can perform the evaluation of the expected value (57) in two parts, if we want. We choose an intermediate time $m<k<n$, impose the condition that at $k$ we arrived at $S=S_{k}$, and then, starting from this value, we proceed from $k\to n$. Formally we can write

$${\cal P}_{mn}(S^{\prime},S)=\int\limits_{-\infty}^{\infty}dS^{\prime\prime}\,\bm{E}_{\xi}\,\delta(S^{\prime\prime}-S_{k}(S^{\prime},\xi))\,\delta(S-S_{n}(S^{\prime\prime},\xi)), \tag{64}$$

where in the last delta function we tacitly assumed that we start the time evolution from $k$. The two Dirac-deltas are independent of each other, because in the first we have to average only over $\{\xi_{m},\dots,\xi_{k-1}\}$, and in the second only over $\{\xi_{k},\dots,\xi_{n-1}\}$. Therefore we can write

$${\cal P}_{mn}(S^{\prime},S)=\int\limits_{-\infty}^{\infty}dS^{\prime\prime}\,{\cal P}_{mk}(S^{\prime},S^{\prime\prime}){\cal P}_{kn}(S^{\prime\prime},S). \tag{65}$$
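The composition rule (65) can be verified numerically in a case where the transition function is known in closed form. The sketch below assumes constant $\mu$ and $\sigma$, for which the transition density over a time span $\tau$ is a Gaussian with mean shift $\mu\tau$ and variance $\sigma^{2}\tau$. The parameter values, the grid and the name `kernel` are illustrative assumptions.

```python
import numpy as np

def kernel(s_from, s_to, tau, mu=0.1, sigma=0.3):
    """Gaussian transition density for constant mu and sigma over a time span tau."""
    var = sigma**2 * tau
    return np.exp(-(s_to - s_from - mu * tau) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

s_start, s_end = 0.0, 0.4
t_mk, t_kn = 0.7, 1.3                        # hypothetical time spans m -> k and k -> n
grid = np.linspace(-10.0, 10.0, 20001)       # intermediate prices S''
step = grid[1] - grid[0]

# Right hand side of Eq. (65): integrate over the intermediate price
composed = np.sum(kernel(s_start, grid, t_mk) * kernel(grid, s_end, t_kn)) * step
# Left hand side of Eq. (65): go from m to n directly
direct = kernel(s_start, s_end, t_mk + t_kn)
print(composed, direct)
```

The two numbers agree to within the accuracy of the numerical integration, whatever intermediate time is chosen.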

This formula makes it possible to find a differential equation with respect to the initial conditions of the distribution function. The key observation is that the left hand side does not depend on $k$, so it does not vary if we change $k$. Thus

$$0=\int\limits_{-\infty}^{\infty}dS^{\prime\prime}\Bigl(\left[d_{k}{\cal P}_{mk}(S^{\prime},S^{\prime\prime})\right]{\cal P}_{kn}(S^{\prime\prime},S)+{\cal P}_{mk}(S^{\prime},S^{\prime\prime})\,d_{k}{\cal P}_{kn}(S^{\prime\prime},S)\Bigr). \tag{66}$$

We can use (61) to write for the first term

$$\int\limits_{-\infty}^{\infty}dS^{\prime\prime}\left(-\partial_{S^{\prime\prime}}(\mu{\cal P}_{mk}(S^{\prime},S^{\prime\prime}))+\frac{1}{2}\partial_{S^{\prime\prime}}^{2}(\sigma^{2}{\cal P}_{mk}(S^{\prime},S^{\prime\prime}))\right)dt_{k}\,{\cal P}_{kn}(S^{\prime\prime},S). \tag{67}$$

We perform partial integration, so that the derivatives act on ${\cal P}_{kn}$, and substitute the result back into the previous expression. Since this must be true for any ${\cal P}_{mk}(S^{\prime},S^{\prime\prime})$ we conclude

$$d{\cal P}_{kn}(S^{\prime\prime},S)=\left(-\mu(S^{\prime\prime})\,\partial_{S^{\prime\prime}}{\cal P}_{kn}(S^{\prime\prime},S)-\frac{1}{2}\sigma^{2}(S^{\prime\prime})\,\partial_{S^{\prime\prime}}^{2}{\cal P}_{kn}(S^{\prime\prime},S)\right)dt_{k}. \tag{68}$$

In continuous time it reads

$$\partial_{t_{0}}{\cal P}=-\mu\,\partial_{S_{0}}{\cal P}-\frac{1}{2}\sigma^{2}\partial_{S_{0}}^{2}{\cal P}, \tag{69}$$

where the index $0$ denotes the initial conditions.

4.5.2 Change of variables in the distribution function

We may also work out the change of the distribution function under a change of its argument. We change the variable as $x\to y=Y(x)$, where $Y$ is invertible. Then the distribution of $y$ reads

$${\cal P}_{y}(y)=\int dx\,{\cal P}_{x}(x)\,\delta(y-Y(x)). \tag{70}$$

Changing to the new integration variable $y^{\prime}=Y(x)$, the integral measure changes by the Jacobian, and we find

$${\cal P}_{y}(y)=\left|\frac{\partial Y}{\partial x}\right|^{-1}{\cal P}_{x}(x)\;\biggr|_{x=Y^{-1}(y)}. \tag{71}$$

4.6 Path integral

In (43) we have seen how to compute an expectation value numerically. Here we continue this line of thought and rewrite that formula.

To evaluate (43) we need the additional information of how to determine $S$, i.e. we need equation (41). We may work out a formula which is self-contained, i.e. it contains both the time evolution and the averaging. The key is that we can represent a recursion through an integral over a Dirac-delta

$$f(S_{m+1})=\int\delta\left(S_{m}+g_{m}(S_{m},\xi_{m})-S_{m+1}\right)f(S_{m+1})\,dS_{m+1}, \tag{72}$$

where in the present case (41) corresponds to $g_{m}(S,\xi)=\mu_{m}(S)\,dt+\sigma_{m}(S)\sqrt{dt}\,\xi$. This form can be applied for all $m=1,2,\dots,n$, and we obtain

$$\bm{E}f(S_{n})=\int\prod_{m=1}^{n}\left[e^{-\frac{1}{2}\xi^{2}_{m}}\delta(S_{m}-S_{m-1}-g_{m-1}(S_{m-1},\xi_{m-1}))\right]\frac{f(S_{n})\,{\cal D}\xi\,{\cal D}S}{(2\pi)^{Nn/2}}, \tag{73}$$

where the initial condition $S_{0}$ is given, and we also introduced the notation

$${\cal D}h=dh_{1}\,dh_{2}\dots dh_{n} \tag{74}$$

for $h=\xi$ and $h=S$.

In order to simplify the formulae, and to get rid of the disturbing constant factors, we may introduce

$$\left\langle f(S_{n})\right\rangle=\int\prod_{m=1}^{n}\left[e^{-\frac{1}{2}\xi^{2}_{m}}\delta(S_{m}-S_{m-1}-g_{m-1}(S_{m-1},\xi_{m-1}))\right]f(S_{n})\,{\cal D}\xi\,{\cal D}S, \tag{75}$$

and then

$$\bm{E}f(S_{n})=\frac{1}{\left\langle 1\right\rangle}\left\langle f(S_{n})\right\rangle. \tag{76}$$

We can also introduce the generator functional

$$Z[S_{0};J]=\left.\left\langle e^{JS}\right\rangle\right|_{S_{0}}, \tag{77}$$

where we also indicated the initial condition. We usually denote $Z(S_{0})=Z[S_{0};0]=\left\langle 1\right\rangle$; in physics it is sometimes called the partition function. Then

$$\bm{E}f(S_{n})=\frac{1}{Z(S_{0})}\left.\left\langle f(S_{n})\right\rangle\right|_{S_{0}}. \tag{78}$$

In the sequel we will omit all constant factors in all expected values; the division by the corresponding $Z$ takes care of the correct normalization.

We note that the upper limit of the product in (75) can be extended to infinity. The reason is that if the rest of the integrand does not depend on the last variable, then integrating the Dirac-delta over it simply gives one. In this way we can get rid of the last integral unless $f$ depends on it. Finally we have

$$\left\langle f(S_{n})\right\rangle=\int\prod_{m=1}^{\infty}\left[e^{-\frac{1}{2}\xi^{2}_{m}}\delta(S_{m}-S_{m-1}-g_{m-1}(S_{m-1},\xi_{m-1}))\right]f(S_{n})\,{\cal D}\xi\,{\cal D}S. \tag{79}$$

The next step is to integrate over the $\xi_{m}$ variables. This is not difficult, because the $g_{m}$ are linear in these variables. The master formula is

$$\int\limits_{-\infty}^{\infty}\frac{d^{N}\xi}{(2\pi)^{N/2}}e^{-\frac{1}{2}\xi^{2}}\delta(A-B\xi)=\frac{1}{(2\pi)^{N/2}\left|\det B\right|}e^{-\frac{1}{2}(B^{-1}A)^{2}}. \tag{80}$$

We then obtain, using (41) and dropping constant factors,

$$\left\langle f(S_{n})\right\rangle=\int e^{-\frac{1}{2}\sum_{m=0}^{\infty}dt\,(\dot{S}_{m}-\mu_{m})C_{m}^{-1}(\dot{S}_{m}-\mu_{m})}f(S_{n})\,{\cal D}_{C}S, \tag{81}$$

where we denoted

$$\dot{S}_{m}=\frac{S_{m+1}-S_{m}}{dt},\qquad C_{m}=\sigma_{m}^{T}\sigma_{m},\qquad{\cal D}_{C}S=\frac{dS_{1}}{\sqrt{\det C_{1}}}\dots\frac{dS_{n}}{\sqrt{\det C_{n}}}\dots. \tag{82}$$

Here we also used that $\det\sigma=\sqrt{\det C}$. This formula has the big advantage that it does not need any supplementary condition: we can calculate the expectation values simply by performing the integrals.

In physical terms the summand in the exponent is called the Hamiltonian or, in other contexts, the Euclidean Lagrangian. So we can write

$$L_{m}=\frac{1}{2}(\dot{S}_{m}-\mu_{m})C_{m}^{-1}(\dot{S}_{m}-\mu_{m}), \tag{83}$$

then

$$\left.\left\langle f(S_{n})\right\rangle\right|_{S_{0}}=\int\left.e^{-\sum_{m=0}^{\infty}dt\,L_{m}}f(S_{n})\,{\cal D}_{\sigma}S\right|_{S_{0}}, \tag{84}$$

which is called the path integral representation of the expectation value.
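To make the role of the exponent concrete, the sketch below evaluates the discretized sum $\sum_{m}dt\,L_{m}$, with $L_{m}$ taken from (83), for a given price path in one dimension (so that $C=\sigma^{2}$). The function name `action`, the two example paths and all parameter values are assumptions made for illustration only.

```python
import numpy as np

def action(path, mu, sigma, dt):
    """Sum over m of dt * L_m with L_m from Eq. (83), one-dimensional case (C = sigma^2)."""
    path = np.asarray(path, dtype=float)
    s_dot = np.diff(path) / dt                 # (S_{m+1} - S_m) / dt
    s_m = path[:-1]                            # coefficients are evaluated at S_m
    lagrangian = 0.5 * (s_dot - mu(s_m)) ** 2 / sigma(s_m) ** 2
    return float(np.sum(lagrangian) * dt)

mu = lambda s: 0.1 * np.ones_like(s)       # hypothetical constant drift
sigma = lambda s: 0.2 * np.ones_like(s)    # hypothetical constant volatility
dt = 0.01
t = dt * np.arange(101)

drift_path = 1.0 + 0.1 * t                               # follows the drift exactly
wiggly_path = drift_path + 0.05 * np.sin(2 * np.pi * t)  # deviates from the drift

print(action(drift_path, mu, sigma, dt))    # zero up to rounding
print(action(wiggly_path, mu, sigma, dt))   # positive, so exp(-action) is smaller
```

A path that follows the drift exactly has the largest weight; every deviation is suppressed exponentially, the more strongly the smaller the volatility.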

The distribution function is the expected value of the Dirac-delta. With $t=n\,dt$ and $L_{m}$ from (83) it reads

$${\cal P}(0,S_{0};t,S)=\frac{1}{Z(S_{0})}\int e^{-\sum_{m=0}^{\infty}dt\,L_{m}}\delta(S_{n}-S)\,{\cal D}_{\sigma}S\,\biggr|_{S_{0}}. \tag{85}$$

5 Continuous approaches

In the previous section we used a discrete representation of the stochastic process. Traditionally, however, continuous descriptions are used. In this section we review some of them.

5.1 Langevin equation: a differential equation form

In physics the usual procedure is to write down a formal differential equation

$$\frac{dS}{dt}=\mu+\sigma\xi, \tag{86}$$

known as the Langevin equation; the symbols $\mu$ and $\sigma$ denote general functions of $t$ and $S$, while $\xi(t)$ is a continuous-time random variable known as white noise.

In order to reproduce the discretized form (41) we have to choose the correlation function of these random variables carefully. The correct choice is

$$\bm{E}\,\xi^{(a)}(t)\,\xi^{(b)}(t^{\prime})=C_{ab}\,\delta(t-t^{\prime}), \tag{87}$$

where $\delta(t)$ is the Dirac-delta distribution. Indeed, in this case, integrating the Langevin equation from $t$ to $t+dt$ we obtain

$$dS=\mu\,dt+\sigma\int\limits_{t}^{t+dt}\xi(t^{\prime})\,dt^{\prime}. \tag{88}$$

We re-introduce $\xi_{n}$ as

$$\xi_{n}=\frac{1}{\sqrt{dt}}\int\limits_{t}^{t+dt}\xi(t^{\prime})\,dt^{\prime}. \tag{89}$$

The correlation between $\xi_{n}$ and $\xi_{m}$ is zero for $n\neq m$, and

$$\bm{E}\,\xi^{(a)}_{n}\xi^{(b)}_{n}=\frac{1}{dt}\int\limits_{t}^{t+dt}\int\limits_{t}^{t+dt}\left[\bm{E}\,\xi^{(a)}(t^{\prime})\,\xi^{(b)}(t^{\prime\prime})\right]dt^{\prime}dt^{\prime\prime}=C_{ab}. \tag{90}$$

Thus the "average" of a stochastic variable must be calculated by dividing by the square root of the time interval, not by the time interval itself.

5.2 Ito calculus: measures

Equation (41) can be thought of as a relation between measures. Then $dt$ serves as an ordinary Riemann measure, while the set $dW=\{\sqrt{dt}\,\xi_{n}\,|\,n=0,\dots,\infty\}$ is interpreted as a probability measure, usually referred to as the Brownian motion. We now discuss the one-dimensional case with $C=1$.

5.2.1 Probability theory in a nutshell

This approach needs somewhat more preparation, and we recommend that the interested reader turn to a more detailed description; here we just list the very essence of what we need. The point is that we try to generalize the concept of a random variable to continuous "indices". In the discrete version one defines the sample space $\Omega$ that consists of elementary events, like an actual series of results of a finite number of dice throws (e.g. $(1,3,3,2,4,5)$). The event space $\cal F$ is the power set of $\Omega$, consisting of all its subsets. Being closed under complements and countable unions, it is a $\sigma$-algebra.

A probability measure is first defined as a function $P:\Omega\to[0,1]$, but it can be lifted to $P:{\cal F}\to[0,1]$ with $P(E)=\sum_{\omega\in E}P(\omega)$. $P$ must satisfy $P(\Omega)=1$. A random variable is a function $X:\Omega\to\bm{R}$. The expected value of a random variable is defined as

$$\bm{E}_{P}X=\sum_{\omega\in\Omega}X(\omega)P(\omega). \tag{91}$$

In the continuous case the problem is that the elementary events (also called atoms) forming $\Omega$ all have zero probability. Therefore the probability measure can be defined only on $\cal F$, and it is required to be additive for countable unions of mutually disjoint subsets of $\Omega$:

$$P:{\cal F}\to[0,1],\qquad P(\Omega)=1,\qquad P(\cup_{i\in I}A_{i})=\sum_{i\in I}P(A_{i}), \tag{92}$$

where $I$ is a countable index set, and $A_{i}\cap A_{j}=\emptyset$ for $i\neq j$. The triple $(\Omega,{\cal F},P)$ is called a probability space.

The generalization of the discrete expected value to the continuous case is an integral with respect to the probability measure, denoted by

$$\bm{E}_{P}X=\int_{\Omega}X(\omega)\,dP(\omega). \tag{93}$$

This is defined by a limiting procedure. First we define the integral if $X$ is a step function, i.e. $X=\sum_{i\in I}x_{i}\bm{I}_{A_{i}}$, where the $A_{i}$ are disjoint elements of $\cal F$ and $\bm{I}_{A_{i}}(\omega)=1$ if $\omega\in A_{i}$ and $0$ otherwise (indicator function). Then

$$\bm{E}_{P}X=\int_{\Omega}X(\omega)\,dP(\omega)=\sum_{i\in I}x_{i}P(A_{i}). \tag{94}$$

This definition can then be extended to any function that can be approached as a limit of step functions.

5.2.2 The Ito process

The integral associated with the $dW$ measure is the Itô integral. In our approach, fixing the time steps $dt$, we can integrate a function that is constant during these time steps (i.e. a fine step function; in mathematics it is called a process adapted to the discretization). The result of the integral is then a stochastic variable

$$I_{T}=\int\limits_{0}^{T}\Delta(t)\,dW(t)=\sum_{n\leq T/dt}\Delta(n\,dt)\,\xi_{n}\sqrt{dt}. \tag{95}$$

If $\Delta$ itself is not random, this sum is again a Gaussian variable with zero mean and the following variance

$$\bm{E}I_{T}^{2}=\int\limits_{0}^{T}\Delta^{2}(t)\,dt, \tag{96}$$

as can be seen by squaring the sum and using the independence of the $\xi_{n}$.

It is not hard to see that this definition does not depend on the length of the time intervals, precisely because of the Gaussian nature of the $\xi_{n}$ variables. So by refining the time mesh we can approach the integral of any function that can be described as a limit of step functions (measurable functions).
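The variance formula (96) can be illustrated by sampling the sum (95) directly. This is a minimal sketch, assuming a deterministic integrand $\Delta(t)=t$ on $[0,1]$; the function name `ito_integral` and the sample sizes are choices made for the example.

```python
import numpy as np

def ito_integral(delta, T, n_steps, n_paths, seed=0):
    """Samples of I_T = sum_n Delta(n dt) xi_n sqrt(dt), Eq. (95)."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    t = dt * np.arange(n_steps)                     # left end of each time step
    xi = rng.standard_normal((n_paths, n_steps))    # independent unit Gaussians
    return (delta(t) * xi).sum(axis=1) * np.sqrt(dt)

samples = ito_integral(lambda t: t, T=1.0, n_steps=200, n_paths=20_000)
print(samples.mean())   # close to 0
print(samples.var())    # close to the integral of t^2 over [0, 1], i.e. 1/3
```

Note that the integrand is evaluated at the left end of each time step, as in (95); this is what distinguishes the Itô integral from other discretization conventions.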

The quadratic variation of the integration measure reads

$$dW\,dW=dt\,\xi_{n}\xi_{n}=dt+dt\,(\xi_{n}\xi_{n}-1)=dt+{\cal O}(dt^{3/2}), \tag{97}$$

and the last term can be neglected when $dt\to 0$. This formula forms the basis of Itô calculus.
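A quick numerical illustration of (97): summed over a finite interval $[0,T]$, the squared increments $dW^{2}$ add up to approximately $T$, and the fluctuation around $T$ decreases as the mesh is refined. The function name `quadratic_variation` and the chosen mesh sizes are assumptions of this sketch.

```python
import numpy as np

def quadratic_variation(T, n_steps, seed=0):
    """Sum of dW^2 over [0, T] for increments dW = sqrt(dt) * xi_n."""
    rng = np.random.default_rng(seed)
    dt = T / n_steps
    dW = np.sqrt(dt) * rng.standard_normal(n_steps)
    return float(np.sum(dW**2))

for n_steps in (10, 1_000, 100_000):
    print(n_steps, quadratic_variation(1.0, n_steps))   # tends to T = 1 as the mesh is refined
```

For an ordinary differentiable function the same sum would go to zero, which is why the second-order term has to be kept when changing variables.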

5.3 Path integral

There is also a continuous notation for the path integral. The sum over $dt\,L_{m}$ appearing in (84) naturally leads to the integral notation

$$\int\limits_{0}^{\infty}L(t)\,dt=\sum_{m=0}^{\infty}dt\,L_{m} \tag{98}$$

with $t=m\,dt$. Then

$$\left\langle f(S(t))\right\rangle=\int{\cal D}_{\sigma}S\,e^{-\int\limits_{0}^{\infty}dt\,L(t)}f(S(t)), \tag{99}$$

where

$$L(t,\dot{S},S)=\frac{1}{2}(\dot{S}-\mu)C^{-1}(\dot{S}-\mu). \tag{100}$$

6 Solutions of some stochastic differential equations

In this section we discuss some stochastic differential equations, and give their distribution functions. We will always start from the initial condition $S(t=0)=S_{0}$, or ${\cal P}(t=0,S)=\delta(S-S_{0})$.

6.1 The Brownian motion

The simplest stochastic equation is the one where the drift and the variance are constant. Then we can diagonalize the covariance matrix, and so we may deal with one-dimensional problems. The equation we have to solve reads, in the discrete notation,

$$S_{n+1}=S_{n}+\mu\,dt+\sigma\sqrt{dt}\,\xi_{n}, \tag{101}$$

where the $\xi_{n}$ are independent Gaussian variables with zero mean and unit variance. The solution of the recursion is very simple:

$$S_{n}=S_{0}+\mu\,n\,dt+\sigma\sqrt{dt}\sum_{i=0}^{n-1}\xi_{i}. \tag{102}$$

We introduce

$$\xi=\frac{1}{\sqrt{n}}\sum_{i=0}^{n-1}\xi_{i}, \tag{103}$$

which is a Gaussian random variable with zero mean and unit variance. So we have

$$S_{n}=S_{0}+\mu t+\sigma\sqrt{t}\,\xi, \tag{104}$$

where $t=n\,dt$. Thus the distribution function reads

$${\cal P}_{BM}(t,S)=\frac{1}{\sqrt{2\pi t\sigma^{2}}}e^{-\frac{(S-S_{0}-\mu t)^{2}}{2t\sigma^{2}}}. \tag{105}$$
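As a consistency check, the Gaussian (105) should satisfy the Fokker-Planck equation (62) with constant $\mu$ and $\sigma$. The sketch below tests this with central finite differences at a few sample points; the parameter values, the step size `h` and the sample points are arbitrary choices for illustration.

```python
import numpy as np

def p_bm(t, s, s0=0.0, mu=0.1, sigma=0.3):
    """Gaussian density of Eq. (105)."""
    var = sigma**2 * t
    return np.exp(-(s - s0 - mu * t) ** 2 / (2.0 * var)) / np.sqrt(2.0 * np.pi * var)

mu, sigma = 0.1, 0.3            # same values as the defaults of p_bm
t, h = 1.0, 1e-3                # evaluation time and finite-difference step
s = np.linspace(-1.0, 1.5, 11)  # sample prices

dp_dt = (p_bm(t + h, s) - p_bm(t - h, s)) / (2 * h)
dp_ds = (p_bm(t, s + h) - p_bm(t, s - h)) / (2 * h)
d2p_ds2 = (p_bm(t, s + h) - 2 * p_bm(t, s) + p_bm(t, s - h)) / h**2

# Eq. (62) with constant coefficients: dP/dt = -mu dP/dS + (sigma^2 / 2) d2P/dS2
residual = dp_dt - (-mu * dp_ds + 0.5 * sigma**2 * d2p_ds2)
print(np.max(np.abs(residual)))   # small, limited by the finite-difference step h
```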

6.2 Geometric Brownian motion (GBM)

The most prominent feature of market prices is that it is not important in which unit we measure them. We can use any currency, the gold price or any other asset price as numeraire; the dynamics of the market is the same. Therefore only the relative price changes can be important. The simplest stochastic differential equation with this property is

$$\frac{\dot{S}}{S}=\mu+\sigma\xi \tag{106}$$

in the Langevin notation.

With the new variable $X=\ln(S/S_{0})$ we obtain, using the Itô formula,

$$\dot{X}=\mu-\frac{1}{2}\sigma^{2}+\sigma\xi. \tag{107}$$

This is the Brownian motion discussed above, with drift $\mu-\frac{1}{2}\sigma^{2}$ and initial value $X=0$. Using this equation it is usual to give the solution of the GBM as

$$S=S_{0}\exp\left[\left(\mu-\frac{1}{2}\sigma^{2}\right)t+\sigma\sqrt{t}\,\xi\right]. \tag{108}$$

The distribution function of $X$ is the one given in (105), with the drift and initial value just mentioned. The formula (71), with the Jacobian $\partial X/\partial S=1/S$, gives the distribution function of $S$:

$${\cal P}_{GBM}(t,S)=\frac{1}{S}\,\frac{1}{\sqrt{2\pi t\sigma^{2}}}\exp\left[-\frac{1}{2t\sigma^{2}}\left(\ln\frac{S}{S_{0}}-\left(\mu-\frac{1}{2}\sigma^{2}\right)t\right)^{2}\right]. \tag{109}$$

This is a lognormal distribution.
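Two simple properties of the lognormal density (109) can be checked by direct numerical integration: it is normalized to one, and its mean is $S_{0}e^{\mu t}$, a standard property of the lognormal distribution that follows from (108). The sketch below uses a plain Riemann sum; the parameter values and the integration grid are illustrative assumptions.

```python
import numpy as np

def p_gbm(t, s, s0=1.0, mu=0.05, sigma=0.2):
    """Lognormal density of Eq. (109)."""
    x = np.log(s / s0) - (mu - 0.5 * sigma**2) * t
    return np.exp(-x**2 / (2.0 * sigma**2 * t)) / (s * np.sqrt(2.0 * np.pi * t * sigma**2))

t = 2.0
s = np.linspace(1e-6, 20.0, 400_001)   # the density is negligible outside this range
ds = s[1] - s[0]

norm = np.sum(p_gbm(t, s)) * ds        # total probability
mean = np.sum(s * p_gbm(t, s)) * ds    # E S(t)
print(norm, mean, np.exp(0.05 * t))    # mean is compared with S_0 exp(mu t)
```

The mean grows with the rate $\mu$, not with $\mu-\frac{1}{2}\sigma^{2}$: the correction in the exponent of (108) is compensated by the average of the exponentiated noise.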

6.3 Vasicek/Hull-White model

In finance a mean reverting model is one in which the random variable fluctuates around a single value in the long term. Such a model is the following:

$$\dot{S}=a(b-S)+\sigma\xi. \tag{110}$$

Depending on whether the parameters are time dependent or not, we call this model the (extended) Vasicek or the Hull-White model. Here we solve the model with constant parameters.

We introduce a new variable $R$ through $S=e^{-at}R+b$; then (110) gives

$$\dot{S}=e^{-at}\dot{R}-ae^{-at}R=-ae^{-at}R+\sigma\xi, \tag{111}$$

therefore

$$\dot{R}=\sigma e^{at}\xi. \tag{112}$$

This equation can be solved for $R$ by a simple integral, with $R(0)=S_{0}-b$. So we find for the original variable

$$S=S_{0}e^{-at}+b(1-e^{-at})+\sigma\int\limits_{0}^{t}ds\,e^{-a(t-s)}\xi(s). \tag{113}$$

This describes a Gaussian random variable with mean

$$\bar{\mu}=S_{0}e^{-at}+b(1-e^{-at}), \tag{114}$$

and variance

$$\bar{\sigma}^{2}=\sigma^{2}\int\limits_{0}^{t}ds\,ds^{\prime}\,e^{-a(t-s)-a(t-s^{\prime})}\left\langle\xi(s)\xi(s^{\prime})\right\rangle=\sigma^{2}\frac{1-e^{-2at}}{2a}. \tag{115}$$

So the distribution is

$${\cal P}_{VHW}(t,S)=\frac{1}{\sqrt{2\pi\bar{\sigma}^{2}(t)}}\exp\left[-\frac{(S-\bar{\mu}(t))^{2}}{2\bar{\sigma}^{2}(t)}\right], \tag{116}$$

where the dependence on $S_{0}$ is contained in $\bar{\mu}(t)$. As we see, in the long term the mean goes to $b$, and the process fluctuates around it with variance $\sigma^{2}/(2a)$.
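The mean (114) and the variance (115) can be cross-checked against the discretized recursion without sampling, because for a linear equation the first two moments obey closed recursions of their own. The sketch below iterates these recursions for the discretized form $S_{n+1}=S_{n}+a(b-S_{n})\,dt+\sigma\sqrt{dt}\,\xi_{n}$; all parameter values are hypothetical.

```python
import numpy as np

a, b, sigma, s0 = 1.5, 0.04, 0.02, 0.10     # hypothetical parameters
T, n_steps = 2.0, 20_000
dt = T / n_steps

# Moments of S_{n+1} = S_n + a (b - S_n) dt + sigma sqrt(dt) xi_n,
# using that xi_n has zero mean, unit variance and is independent of S_n
mean, var = s0, 0.0
for _ in range(n_steps):
    mean = mean + a * (b - mean) * dt
    var = (1.0 - a * dt) ** 2 * var + sigma**2 * dt

mean_exact = s0 * np.exp(-a * T) + b * (1.0 - np.exp(-a * T))       # Eq. (114)
var_exact = sigma**2 * (1.0 - np.exp(-2.0 * a * T)) / (2.0 * a)     # Eq. (115)
print(mean, mean_exact)
print(var, var_exact)
```

The remaining differences are of order $dt$ and disappear as the time step is refined.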

7 Risk of a portfolio

In the previous sections we discussed the general framework of the price dynamics. Now let us think about the evaluation of the present value of an asset.

The most striking question is the following: if there are two assets with interest rates $r_{1}>r_{2}$, why is there no arbitrage possibility? Indeed, the portfolio

$${\cal P}=S_{2}a_{1}-S_{1}a_{2} \tag{117}$$

has zero value at $t=0$, but at $t=T$ it is worth

$$S({\cal P},T)=S_{2}S_{1}(T)-S_{1}S_{2}(T)=\left(e^{r_{1}T}-e^{r_{2}T}\right)S_{1}S_{2}>0. \tag{118}$$

So it seems worthwhile to realize this portfolio: we gain money from nothing.

The main point that we did not take into account is the risk. Let us assume for example that $a_{2}$ is practically risk-free, while $a_{1}$ has an annual default risk $d$, and that a default wipes out the whole investment. One unit invested in $a_{1}$ is then worth on average $0\times d+(1+r_{1})\times(1-d)$ after a year, so the average annual rate is $(1+r_{1})(1-d)-1=r_{1}-d\,(1+r_{1})$. The risk therefore diminishes the rate.

The first problem here is that it is very hard to tell the exact value of $d$ before a real default occurs. We may give vague estimates, but we can easily be off by a factor of two or even ten. As a numerical example consider the case when $a_{1}$ pays an interest rate of 20% and $a_{2}$ has a risk-free rate of 10%. If the default risk of $a_{1}$ is 5%, then the average interest rate is still $1.2\times 0.95-1=14\%$, so $a_{1}$ is the better investment. But if the default risk is 10%, then the average rate is $1.2\times 0.9-1=8\%$, and $a_{2}$ already takes over.
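The figures of this example are reproduced if one assumes that a default wipes out both the principal and the interest, so that one unit invested is worth $(1+r_{1})(1-d)$ on average after a year. The short sketch below makes this assumption explicit; the function name `risk_adjusted_rate` is introduced here for illustration.

```python
def risk_adjusted_rate(rate, default_prob):
    """Expected annual rate when a default loses principal and interest (assumption)."""
    return (1.0 + rate) * (1.0 - default_prob) - 1.0

risk_free = 0.10
for d in (0.05, 0.10):
    r_eff = risk_adjusted_rate(0.20, d)
    better = "a1" if r_eff > risk_free else "a2"
    print(d, round(r_eff, 4), better)
```

Doubling the estimated default probability is enough to reverse the ranking of the two assets, which illustrates how sensitive the comparison is to the poorly known value of $d$.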

But there is another effect. Let us assume that we can borrow money at a rate $r<r_{2}<r_{1}$, and we want to buy the assets from a loan. To be safe we hold back a relative amount $c$ as collateral (usually this is also demanded by the bank lending the money). So if we have a principal of 1 USD we can borrow $1/c$ USD, and after a year we gain

$$1\,\mathrm{USD}\;\to\;\frac{r_{2}-r}{c}\,\mathrm{USD}. \tag{119}$$

This is the leverage effect: the effective rate of a risk-free investment can be raised very high. In the ideal case $c\to 0$, any small difference between the risk-free rate and the bank loan rate makes the effective rate grow to infinity.

The first lesson here is that if there were a risk-free investment possibility with a higher annual rate than another, then this would indeed open a very large arbitrage possibility. Therefore the completely risk-free rate is a unique number.

The second remark is that we can, of course, leverage the risky investments too. But then there is a non-negligible probability of losing all the money, in which case we are left with the debt. This means that we must reserve a higher collateral in the risky case, preparing for the worst case. This will easily make the effective rate much lower than the effective rate of a risk-free investment.

So the real question is how conservatively, how prudently the banks evaluate and treat the risk. The practice nowadays is that the banks do not tolerate risky investments too well. This has some psychological factors in it; the market could work in different ways. But present day practice requires the business to be practically risk-free.

7.1 Risk mitigation by creating indices

The individual assets are, of course, not risk-free, so we must make efforts to get rid of the risk. We can do this by combining assets into a portfolio. There are two main techniques. The first one is to combine independent assets into a single portfolio: these are called indices. So we consider the portfolio

$${\cal P}=\sum_{i=1}^{N}w_{i}a_{i}. \tag{120}$$

The value of the portfolio reads

$$S_{P}=\sum_{i=1}^{N}w_{i}S_{i}. \tag{121}$$

We will assume that the assets $a_{i}$ follow the equation

$$\dot{S}_{i}=S_{i}(\mu_{i}+\sigma_{i}\xi_{i}), \tag{122}$$

where we factored out the price itself, and $\left\langle\xi_{i}\xi_{j}\right\rangle=\delta_{ij}$. If the $w_{i}$ are independent of the prices of the underlying assets, then $S_{P}$ satisfies

$$\dot{S}_{P}=\sum_{i=1}^{N}w_{i}S_{i}(\mu_{i}+\sigma_{i}\xi_{i})=S_{P}(\bar{\mu}+\bar{\sigma}\xi), \tag{123}$$

where

$$\bar{\mu}=\sum_{i=1}^{N}w_{i}x_{i}\mu_{i},\qquad\bar{\sigma}^{2}=\sum_{i=1}^{N}w_{i}^{2}x_{i}^{2}\sigma_{i}^{2}, \tag{124}$$

with $x_{i}=S_{i}/S_{P}$.

To diminish the effective risk, we should minimize $\bar{\sigma}^{2}$ by choosing the weights appropriately, with the constraint that we keep the value of the portfolio fixed, i.e.

$$1=\sum_{i=1}^{N}w_{i}x_{i}. \tag{125}$$

Then we have to satisfy

$$\frac{\partial}{\partial w_{i}}\sum_{j=1}^{N}\left(w_{j}^{2}x_{j}^{2}\sigma_{j}^{2}-\lambda w_{j}x_{j}\right)=0, \tag{126}$$

where $\lambda$ is a Lagrange multiplier. This results in

$$w_{i}=\frac{\lambda}{2x_{i}\sigma_{i}^{2}}. \tag{127}$$

The value of $\lambda$ follows from the constraint (125):

$$1=\sum_{i}\frac{\lambda}{2\sigma_{i}^{2}}\quad\Rightarrow\quad\lambda=\frac{1}{\sum_{i}\frac{1}{2\sigma_{i}^{2}}}. \tag{128}$$

Putting all together, after some algebra, we find

$$\frac{1}{\bar{\sigma}^{2}}=\sum_{i=1}^{N}\frac{1}{\sigma_{i}^{2}}. \tag{129}$$

We see that in this way we cannot achieve a completely risk-free portfolio, but since $\bar{\sigma}^{2}$ is smaller than any of the $\sigma_{i}^{2}$, we can mitigate the risks of the single underlying assets.
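The algebra leading from (127) and (128) to (129) can be checked with a few lines of code. In the sketch below the volatilities and the relative prices $x_{i}$ are hypothetical inputs treated as given; the function name `min_variance_index` is introduced only for this example.

```python
import numpy as np

def min_variance_index(sigmas, x):
    """Weights of Eqs. (127)-(128) and the resulting index variance of Eq. (124)."""
    sigmas, x = np.asarray(sigmas, float), np.asarray(x, float)
    lam = 1.0 / np.sum(1.0 / (2.0 * sigmas**2))       # Eq. (128)
    w = lam / (2.0 * x * sigmas**2)                   # Eq. (127)
    var = np.sum(w**2 * x**2 * sigmas**2)             # Eq. (124)
    return w, var

sigmas = np.array([0.2, 0.3, 0.4])    # hypothetical volatilities
x = np.array([0.5, 0.3, 0.2])         # hypothetical relative prices S_i / S_P
w, var = min_variance_index(sigmas, x)

print(np.sum(w * x))                          # constraint (125): equals 1
print(1.0 / var, np.sum(1.0 / sigmas**2))     # both sides of Eq. (129)
print(np.sqrt(var), sigmas.min())             # index volatility vs. the least risky asset
```

The inverse variances add up like parallel resistances: every additional independent asset lowers the variance of the index, but it never reaches zero for a finite number of assets.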

While this is simple in theory, in practice it is not simple to estimate the $\sigma_{i}$ values reliably. It is also a question how many assets we want to include in the index, how we treat the default risk, etc. We can also perform the optimization in a different way, for example by fixing a given risk and optimizing the effective interest rate. As a result, there are various indices in the market that differ in the way the weights are computed.

7.2 Risk mitigation by hedging

The other way to mitigate the risk is to combine assets with interdependent risks in a portfolio. In the market there are asset classes where the asset prices depend on each other, so there is a correlation between the risks. The simplest example is when we consider an asset and a derivative of it. A derivative in this sense is an asset that is built exclusively on the other, underlying asset (e.g. an option, a swap or similar products).

So let us assume that we have a portfolio where the underlying asset is $a$, and we add some derivatives $a_{i}$ to it. So we have

$${\cal P}=\sum_{i}\alpha_{i}a_{i}-\delta a, \tag{130}$$

where the weights $\alpha_{i}$ and $\delta$ are real numbers. The value of the portfolio is

$$S_{P}=\sum_{i}\alpha_{i}f_{i}(t,S)-\delta S, \tag{131}$$

where we have denoted the value of the derivatives at time $t$ and at spot price $S$ by $f_{i}(t,S)$. For now we think of these functions as prices that can be obtained by observing the market.

What is somewhat more complicated here compared with the previous case is that the price of the portfolio may depend non-linearly on the price of the underlying, so its dynamics must be computed using the It\^{o} lemma. So, if

\[ \dot{S}=\mu S+\sigma S\xi, \tag{132} \]

where $\mu$ and $\sigma$ can be $S$ dependent, then we have for the complete portfolio

\[ \dot{S}_{P}=\partial_{t}S_{P}+\mu S\partial_{S}S_{P}+\frac{1}{2}\sigma^{2}S^{2}\partial^{2}_{S}S_{P}+\sigma S\partial_{S}S_{P}\,\xi. \tag{133} \]

This expression is risk-free if the term containing $\xi$ is zero. This leads to

\[ \partial_{S}S_{P}=0. \tag{134} \]

This would mean, however, that $S_{P}$ does not depend on $S$; put another way, it is not built on the asset $a$. This contradicts the construction of the portfolio in (130).

So perfect risk-freeness cannot be achieved in this way either. The best we can do is to ensure a vanishing derivative at a given price of the underlying, in practice at the actual spot price $S=S_{0}$. Thus we require

\[ 0=\partial_{S}S_{P}\Bigr|_{S_{0}}. \tag{135} \]

It is usual to introduce the $\Delta$ risk of the portfolio by the definition

\[ \Delta_{P}=\partial_{S}S_{P}\Bigr|_{S_{0}}. \tag{136} \]

Risk-freeness at the spot price requires that the delta-risk of the portfolio vanishes:

\[ \Delta_{P}=0. \tag{137} \]

It is also said that we have a delta-neutral portfolio, or that we have hedged out the delta risk.

Using our portfolio we have

\[ \Delta_{P}=\sum_{i}\alpha_{i}\Delta_{i}-\delta, \tag{138} \]

where

\[ \Delta_{i}=\partial_{S}f_{i}(t,S_{0}). \tag{139} \]

A delta-neutral portfolio can be achieved using one single derivative with $\alpha=1$ and the underlying, by choosing

\[ \delta=\partial_{S}f(t,S_{0}). \tag{140} \]
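The local nature of the delta hedge (135)-(140) can be seen in a small numerical sketch. Here the derivative price $f(t,S)$ is taken, purely as an assumption for illustration, to be a Black-Scholes call with hypothetical parameters; any smooth pricing function would do. The hedge ratio is estimated by a central finite difference.

```python
import math


def call_price(S, K=100.0, r=0.05, sigma=0.2, tau=1.0):
    """Stand-in for f(t, S): Black-Scholes call, tau = time to expiry (assumed model)."""
    d_plus = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * Phi(d_plus) - K * math.exp(-r * tau) * Phi(d_minus)


def delta_fd(f, S0, h=1e-3):
    """Central finite-difference estimate of the derivative of f at S0."""
    return (f(S0 + h) - f(S0 - h)) / (2.0 * h)


S0 = 100.0
delta = delta_fd(call_price, S0)          # hedge ratio, eq. (140)


def portfolio(S):
    """Portfolio value (131) with one derivative, alpha = 1."""
    return call_price(S) - delta * S


moves = [0.1, 1.0, 10.0]                  # price moves away from the spot
hedged = [portfolio(S0 + dS) - portfolio(S0) for dS in moves]
unhedged = [call_price(S0 + dS) - call_price(S0) for dS in moves]
```

For small moves the hedged change is far smaller than the unhedged one, while for large moves the residual risk grows, since the hedge is exact only at $S=S_{0}$.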

7.2.1 Higher order hedging and the ”greeks”

There are several issues with the hedging strategy described above. One is that we do not really know the relation between the underlying and the derivative prices. We can observe the spot price of the derivative, i.e. $f(t,S_{0})$, but to estimate $\partial_{S}f(t,S)$ we would need to know it at other prices as well. This cannot be observed directly, thus we need a market model. So, strictly speaking, what we can do is to use the estimated present value $\tilde{f}(t,S,{\cal M})$, which already depends on the market model ${\cal M}$.

In practice the market model has some parameters, first of all the (estimated) volatility parameter $\sigma_{0}$ of the underlying asset. But, since no market model is perfect, the actual market can be described only with a non-constant volatility parameter. So, in this sense, not just the price but also the model fluctuates. The complete analysis of the previous subsection can now be repeated with the substitution $S\to\sigma_{0}$. What we obtain is that for a risk-free portfolio we need both

\[ \partial_{S}S_{P}\Bigr|_{S_{0}}=\partial_{\sigma}S_{P}\Bigr|_{S_{0}}=0. \tag{141} \]

It is usual to introduce the quantity $\kappa$ (kappa; sometimes called vega, ${\cal V}$), the analogue of $\Delta$, describing the price change under a change of the volatility parameter:

\[ \kappa=\partial_{\sigma}S_{P}(t,S_{0},\sigma). \tag{142} \]

We need the kappa value of the complete portfolio to be zero (delta-kappa neutral position).

Another issue is that we can ensure a risk-free portfolio only at a single price $S=S_{0}$. As soon as the price moves, the risk grows. In practice one always has to fine-tune the portfolio by adjusting the $\Delta$ (and $\kappa$) to the actual price. If, however, $\Delta$ depends strongly on the price of the underlying, then a sudden price change is hard to follow. This motivates the introduction of $\Gamma$ as the derivative of $\Delta$ (the second derivative of the present value with respect to the price of the underlying)

\[ \Gamma_{P}=\partial_{S}\Delta_{P}(S)=\partial_{S}^{2}S_{P}(t,S_{0},\sigma). \tag{143} \]

To ensure the stability of a portfolio, not just the delta but also $\Gamma_{P}$ should be zero (delta-gamma neutral position).

We could continue this analysis and introduce other "greeks" to denote the higher derivatives, cf. for example [17], all of which characterize the health of a portfolio. But usually, besides the delta-risk, the kappa and/or the gamma are the most important to hedge out.

For all the greeks, the risk of the portfolio is the weighted sum of the risks of the individual assets

\[ \kappa_{P}=\sum_{i}\alpha_{i}\kappa_{i},\qquad\Gamma_{P}=\sum_{i}\alpha_{i}\Gamma_{i},\quad\dots \tag{144} \]

If, for example, we have two derivatives, then we can require

\[ \delta=\alpha_{1}\Delta_{1}+\alpha_{2}\Delta_{2} \tag{145} \]

to hedge out the delta-risk, and

\[ 0=\alpha_{1}\kappa_{1}+\alpha_{2}\kappa_{2} \tag{146} \]

to hedge out the kappa-risk. If we want to hedge out the gamma-risk as well, we need a third derivative.
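Conditions (145) and (146) are two linear equations for the three weights, so one weight can be fixed freely. The sketch below fixes $\alpha_{1}$ (an assumption made here for illustration) and solves for $\alpha_{2}$ and $\delta$; the numerical greeks are hypothetical.

```python
def delta_kappa_hedge(delta1, kappa1, delta2, kappa2, alpha1=1.0):
    """Solve (145)-(146) for alpha2 and delta at a given alpha1."""
    if kappa2 == 0.0:
        raise ValueError("second derivative has zero kappa; it cannot offset kappa-risk")
    alpha2 = -alpha1 * kappa1 / kappa2               # eq. (146)
    delta = alpha1 * delta1 + alpha2 * delta2        # eq. (145)
    return alpha2, delta


# Hypothetical greeks of two derivatives on the same underlying.
delta1, kappa1 = 0.60, 40.0
delta2, kappa2 = 0.45, 25.0
alpha2, delta = delta_kappa_hedge(delta1, kappa1, delta2, kappa2)

# Residual portfolio greeks, eqs. (138) and (144); both should vanish.
delta_P = 1.0 * delta1 + alpha2 * delta2 - delta
kappa_P = 1.0 * kappa1 + alpha2 * kappa2
```

With these inputs the second derivative is sold ($\alpha_{2}<0$) to cancel the kappa of the first, and the position in the underlying then removes the remaining delta.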

If we continuously monitor the different greeks of the portfolio, we see how sensitive it is to various kinds of price changes. The best practice is to keep all the risks within a given narrow range.

8 Present value and pricing

As we have argued, the market requires investments to be as risk-free as possible. This also means that single assets are practically never traded one by one, only in portfolios where the risks are mitigated. But all risk-free portfolios must grow at the same rate, otherwise arbitrage would show up. This means that the rates of the individual assets play no role at all. Being part of a portfolio, all assets must be treated as if they had a common drift factor. In this artificial world, called the risk-neutral world, we find for all derivatives (including the underlying asset)

\[ \frac{d}{dt}\left\langle f(t,S)\right\rangle_{rn}=r\left\langle f(t,S)\right\rangle_{rn}, \tag{147} \]

where $rn$ stands for "risk-neutral". The rate itself can be a time-dependent function, but it cannot depend on the individual asset prices.

This equation is, in fact, enough to determine the present value of an asset. We can do it in two equivalent ways, one leading to a differential equation, the other to an integral formula.

8.1 Black-Scholes-Merton formula

In this approach we consider a portfolio built on an underlying and one derivative. Its value is

\[ S_{P}=f(t,S)-\delta S. \tag{148} \]

If it is in the delta-neutral position, then

\[ \delta=\partial_{S}f(t,S_{0}). \tag{149} \]

Now we express the time derivative of the portfolio in two ways. On the one hand, the portfolio is risk-free at $S=S_{0}$, so we require (147) to hold:

\[ \frac{d}{dt}S_{P}(t,S_{0})=rS_{P}(t,S_{0}). \tag{150} \]

We find for our portfolio above

\[ \frac{d}{dt}S_{P}(t,S_{0})=rf(t,S_{0})-rS_{0}\partial_{S}f(t,S_{0}). \tag{151} \]

On the other hand, if $\partial_{S}S_{P}(t,S_{0})=0$, then from (133) we find

\[ \frac{d}{dt}S_{P}(t,S_{0})=\partial_{t}f(t,S_{0})+\frac{1}{2}\sigma^{2}S_{0}^{2}\partial^{2}_{S}f(t,S_{0}). \tag{152} \]

Putting the two equations together we find

\[ \partial_{t}f(t,S_{0})+\frac{1}{2}\sigma^{2}S_{0}^{2}\partial^{2}_{S}f(t,S_{0})=r\left(f(t,S_{0})-S_{0}\partial_{S}f(t,S_{0})\right). \tag{153} \]

Strictly speaking, the above equation is valid only at $t$ and $S_{0}$. But as the best approximation for the risk-free portfolio, we can demand that it holds for other $S$ as well. This leads to the Black-Scholes-Merton differential equation

\[ \partial_{t}f+rS\partial_{S}f+\frac{1}{2}\sigma^{2}S^{2}\partial^{2}_{S}f=rf. \tag{154} \]

The solution of the Black-Scholes-Merton equation requires an initial condition in time and boundary conditions in $S$. The latter are usually omitted, the boundaries being at infinity. The condition in time, on the other hand, is set by the promised payoff in the future,

\[ f(T,S)=P(S). \tag{155} \]

It is therefore a final condition, not an initial one, and we must evolve backwards in time to obtain the derivative price today, at $t=t_{0}$. This gives the present value of the derivative.
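One simple way to check a candidate solution against (154) and (155) is to evaluate the residual of the differential equation by finite differences. The sketch below does this for the closed-form European call price, used here as an assumed test function with hypothetical parameters $K$, $r$, $\sigma$, $T$; the step sizes are likewise illustrative choices.

```python
import math

K, r, sigma, T = 100.0, 0.05, 0.2, 1.0    # hypothetical contract and market parameters


def f(t, S):
    """Candidate solution: European call price with expiry T (assumed closed form)."""
    tau = T - t
    if tau <= 0.0:
        return max(S - K, 0.0)            # final condition (155), P(S) = (S - K)^+
    d_plus = (math.log(S / K) + (r + 0.5 * sigma**2) * tau) / (sigma * math.sqrt(tau))
    d_minus = d_plus - sigma * math.sqrt(tau)
    Phi = lambda x: 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))
    return S * Phi(d_plus) - K * math.exp(-r * tau) * Phi(d_minus)


def bsm_residual(t, S, ht=1e-4, hs=1e-2):
    """Left side minus right side of (154), using central finite differences."""
    f_t = (f(t + ht, S) - f(t - ht, S)) / (2.0 * ht)
    f_S = (f(t, S + hs) - f(t, S - hs)) / (2.0 * hs)
    f_SS = (f(t, S + hs) - 2.0 * f(t, S) + f(t, S - hs)) / hs**2
    return f_t + r * S * f_S + 0.5 * sigma**2 * S**2 * f_SS - r * f(t, S)


residuals = [bsm_residual(0.0, S) for S in (80.0, 100.0, 120.0)]
```

The residuals vanish up to discretization error, and `f(T, S)` reproduces the payoff, which is the final condition from which the backward evolution starts.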

8.2 Integral formula

We can take a different route to obtain an expression from the condition (147). First we find

\[ \frac{d}{dt}\,e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}\left\langle f(t,S)\right\rangle_{rn}=0. \tag{156} \]

This means that the quantity

\[ M(t,S)=e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}f(t,S) \tag{157} \]

is a random variable whose expected value under the risk-neutral measure is time independent (it is called a martingale under the risk-neutral measure).

At the present time $t=t_{0}$ we know the price of the asset, $S=S_{0}$, thus the price distribution is $\delta(S-S_{0})$, and so the expected value is $\left\langle M(t_{0},S)\right\rangle=f(t_{0},S_{0})$. From the time independence of the expected value of $M$ it follows that

\[ f(t_{0},S_{0})=e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}\left\langle f(t,S)\right\rangle_{t,rn}. \tag{158} \]

If we have a promised payoff $P(S)$ at time $t$, then $f(t,S)=P(S)$ (assuming the promise is fulfilled). Therefore

\[ f(t_{0},S_{0})=e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}\left\langle P(S)\right\rangle_{t,rn}. \tag{159} \]

This formula does not assume any underlying market model, so it can be used in general.
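Formula (159) lends itself to Monte Carlo estimation: sample the price at time $t$ under the risk-neutral measure, average the payoff, and discount. Sampling does require a concrete model; the sketch below assumes geometric Brownian motion with constant $r$ and $\sigma$, and all numerical values are hypothetical.

```python
import math
import random


def present_value_mc(payoff, S0, r, sigma, t, n_paths=100_000, seed=0):
    """Estimate (159) by sampling S(t) under an assumed risk-neutral GBM."""
    rng = random.Random(seed)
    drift = (r - 0.5 * sigma**2) * t
    vol = sigma * math.sqrt(t)
    total = 0.0
    for _ in range(n_paths):
        S_t = S0 * math.exp(drift + vol * rng.gauss(0.0, 1.0))
        total += payoff(S_t)
    return math.exp(-r * t) * total / n_paths    # discounted sample mean


S0, r, sigma, t = 100.0, 0.05, 0.2, 1.0
# Payoff P(S) = S: the discounted underlying is a martingale, so this should be near S0.
pv_underlying = present_value_mc(lambda S: S, S0, r, sigma, t)
# Payoff of a call with strike 100.
pv_call = present_value_mc(lambda S: max(S - 100.0, 0.0), S0, r, sigma, t)
```

The first estimate is a consistency check of the martingale property (157); the second can be compared with the closed-form option price of Section 8.3.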

If we write the payoff as an integral over Dirac-deltas, we can write

\[ f(t_{0},S_{0})=e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}\int\limits_{-\infty}^{\infty}dS^{\prime}\,P(S^{\prime})\left\langle\delta(S-S^{\prime})\right\rangle_{t,rn}. \tag{160} \]

The last term is the distribution function in the risk-neutral world:

\[ f(t_{0},S_{0})=e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}\int\limits_{-\infty}^{\infty}dS^{\prime}\,{\cal P}_{rn}(t_{0},S_{0};t,S^{\prime})\,P(S^{\prime}). \tag{161} \]

This last formula shows that the Green's function of the present value determination is

\[ {\cal G}(t_{0},S_{0};t,S)=e^{-\int_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}{\cal P}_{rn}(t_{0},S_{0};t,S). \tag{162} \]

Using (69) we see that, if the underlying follows a Langevin equation, then the Green's function satisfies

\[ \partial_{t_{0}}{\cal G}=r{\cal G}-\mu\partial_{S_{0}}{\cal G}-\frac{1}{2}\sigma^{2}\partial^{2}_{S_{0}}{\cal G}, \tag{163} \]

which, for the risk-neutral geometric Brownian motion (drift $rS_{0}$ and noise amplitude $\sigma S_{0}$ in place of $\mu$ and $\sigma$), is the Black-Scholes-Merton equation (154). This shows that ${\cal G}$ is the Green's function of the Black-Scholes equation, too. It also proves that $f(t_{0},S_{0})$ satisfies the Black-Scholes equation, so the integral approach is equivalent to the differential equation approach.

Using the path integral formula, we can write from (99)

\[ {\cal G}(t_{0},S_{0};t,S)=\frac{1}{Z(S_{0})}\int{\cal D}_{C}S\,e^{-\int\limits_{t_{0}}^{\infty}dt^{\prime}L(t^{\prime})}e^{-\int\limits_{t_{0}}^{t}dt^{\prime}r(t^{\prime})}\delta(S(t)-S)\Bigr|_{S_{0}}, \tag{164} \]

where

\[ Z(S_{0})=\int{\cal D}_{C}S\,e^{-\int\limits_{t_{0}}^{\infty}dt^{\prime}L(t^{\prime})}\Bigr|_{S_{0}}. \tag{165} \]

If there are several payoffs, then the linearity of the above equation tells us that the present values simply add up. So we can generalize the computation of a present value to arbitrary, continuously compounded payoffs $p(t,S)$:

\[ f(t_{0},S_{0})=\int\limits_{-\infty}^{\infty}dt\int\limits_{-\infty}^{\infty}dS\,{\cal G}(t_{0},S_{0};t,S)\,p(t,S). \tag{166} \]

A fixed payoff at time $T$ can be represented as $p(t,S)=\delta(t-T)P(S)$.
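For a single payoff at a fixed time, (166) reduces to a one-dimensional integral of the Green's function (162) against $P(S)$, which can be evaluated by direct quadrature. The sketch below assumes, for illustration only, that the risk-neutral transition density is the lognormal density of geometric Brownian motion with constant $r$ and $\sigma$; the cutoff `S_max` and the grid size are hypothetical numerical choices.

```python
import math


def green_gbm(S0, S, r, sigma, t):
    """Discounted risk-neutral transition density (162) for an assumed GBM model."""
    if S <= 0.0:
        return 0.0
    z = (math.log(S / S0) - (r - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    density = math.exp(-0.5 * z * z) / (S * sigma * math.sqrt(2.0 * math.pi * t))
    return math.exp(-r * t) * density


def present_value(payoff, S0, r, sigma, t, S_max=1000.0, n=20_000):
    """Midpoint-rule approximation of the integral over S in (161)."""
    h = S_max / n
    total = 0.0
    for k in range(n):
        S = (k + 0.5) * h
        total += green_gbm(S0, S, r, sigma, t) * payoff(S)
    return h * total


S0, r, sigma, t = 100.0, 0.05, 0.2, 1.0
pv_unit = present_value(lambda S: 1.0, S0, r, sigma, t)               # unit payoff
pv_call = present_value(lambda S: max(S - 100.0, 0.0), S0, r, sigma, t)
```

A unit payoff integrates to the discount factor $e^{-rt}$, which checks the normalization of the density; by linearity, several payoffs could be handled by summing such integrals.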

8.3 Option price in the GBM market model

As an example, we compute the present value of the European call option in the geometric Brownian motion market model. The promised payoff of the call option reads

\[ p(t,S)=(S-K)^{+}\delta(t-T), \tag{167} \]

where $x^{+}=x\Theta(x)$. To determine the present value, we use (159). It contains an expected value, which is best computed using the explicit solution (108) with the drift $\mu=r=$ const. Then we find, with $\xi\to-\xi$:

\[ f(0,S)=e^{-rt}\int\frac{d\xi}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\xi^{2}}\left(Se^{(r-\frac{1}{2}\sigma^{2})t-\sigma\xi\sqrt{t}}-K\right)^{+}. \tag{168} \]

The condition of positivity is $\xi<d_{-}$, where

\[ d_{-}=\frac{1}{\sigma\sqrt{t}}\left(\ln\frac{S}{K}+\left(r-\frac{1}{2}\sigma^{2}\right)t\right). \tag{169} \]

Thus we have

\[ f(0,S)=\int\limits_{-\infty}^{d_{-}}\frac{d\xi}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\xi^{2}}\left(Se^{-\frac{1}{2}\sigma^{2}t-\sigma\xi\sqrt{t}}-Ke^{-rt}\right). \tag{170} \]

The negative of the exponent in the first term is

\[ \frac{1}{2}\xi^{2}+\frac{1}{2}\sigma^{2}t+\sigma\sqrt{t}\,\xi=\frac{1}{2}\left(\xi+\sigma\sqrt{t}\right)^{2}. \tag{171} \]

We can change the variable in the first term to $\xi^{\prime}=\xi+\sigma\sqrt{t}\in(-\infty,d_{+}]$, so that the upper limit of the integration becomes

\[ d_{+}=\frac{1}{\sigma\sqrt{t}}\left(\ln\frac{S}{K}+\left(r+\frac{1}{2}\sigma^{2}\right)t\right). \tag{172} \]

Then in both terms we recognize the cumulative normal distribution function, and we finally arrive at the Black-Scholes formula

\[ f(0,S)=S\Phi(d_{+})-Ke^{-rt}\Phi(d_{-}), \tag{173} \]

where

\[ \Phi(x)=\int\limits_{-\infty}^{x}\frac{d\xi}{\sqrt{2\pi}}\,e^{-\frac{1}{2}\xi^{2}}. \tag{174} \]

A different form of it reads

\[ \frac{f(0,S)}{Ke^{-rt}}=e^{m}\Phi\left(\frac{m}{z}+\frac{z}{2}\right)-\Phi\left(\frac{m}{z}-\frac{z}{2}\right), \tag{175} \]

where

\[ m=\ln\frac{S}{Ke^{-rt}},\qquad z=\sigma\sqrt{t}. \tag{176} \]

$m$ at $t=0$ is sometimes called moneyness; $m=0$, i.e. $K=S$, corresponds to the at-the-money (ATM) trade.
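The closed forms (173) and (175) are straightforward to evaluate, with $\Phi$ expressed through the error function of the standard library. The following is a minimal sketch; the function names and the sample parameters are illustrative choices, not part of the text above.

```python
import math


def Phi(x):
    """Cumulative standard normal distribution, eq. (174)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def call_bs(S, K, r, sigma, t):
    """European call price from (173), with d_- and d_+ from (169) and (172)."""
    d_minus = (math.log(S / K) + (r - 0.5 * sigma**2) * t) / (sigma * math.sqrt(t))
    d_plus = d_minus + sigma * math.sqrt(t)
    return S * Phi(d_plus) - K * math.exp(-r * t) * Phi(d_minus)


def call_moneyness(S, K, r, sigma, t):
    """The same price written in the variables m and z of (175)-(176)."""
    m = math.log(S / (K * math.exp(-r * t)))
    z = sigma * math.sqrt(t)
    return K * math.exp(-r * t) * (math.exp(m) * Phi(m / z + z / 2.0) - Phi(m / z - z / 2.0))


price = call_bs(100.0, 100.0, 0.05, 0.2, 1.0)            # approximately 10.45
price_alt = call_moneyness(100.0, 100.0, 0.05, 0.2, 1.0)
```

Both functions return the same value, since $d_{\pm}=m/z\pm z/2$.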

From this form we can also calculate the greeks, for example

\[ \begin{split}\Delta&=\partial_{S}f=\Phi(d_{+})+\frac{1}{\sigma\sqrt{t}}\left({\cal N}(d_{+})-\frac{Ke^{-rt}}{S}{\cal N}(d_{-})\right),\\ \kappa&=\partial_{\sigma}f=S\frac{\partial d_{+}}{\partial\sigma}{\cal N}(d_{+})-Ke^{-rt}\frac{\partial d_{-}}{\partial\sigma}{\cal N}(d_{-}),\end{split} \tag{177} \]

where ${\cal N}(x)=e^{-x^{2}/2}/\sqrt{2\pi}$ denotes the standard normal density, and

\[ \frac{\partial d_{\pm}}{\partial\sigma}=-\frac{1}{\sigma^{2}\sqrt{t}}\left(\ln\frac{S}{K}+rt\right)\pm\frac{1}{2}\sqrt{t}. \tag{178} \]
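The expressions (177)-(178) can be cross-checked numerically. Since $S\,{\cal N}(d_{+})=Ke^{-rt}{\cal N}(d_{-})$ follows from the definitions (169) and (172), the bracket in $\Delta$ vanishes and the greeks reduce to $\Delta=\Phi(d_{+})$ and $\kappa=S\sqrt{t}\,{\cal N}(d_{+})$. The sketch below compares (177) with these compact forms and with finite differences of the price; parameter values and step sizes are hypothetical.

```python
import math


def Phi(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def N(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def d_pm(S, K, r, sigma, t):
    base = (math.log(S / K) + r * t) / (sigma * math.sqrt(t))
    half = 0.5 * sigma * math.sqrt(t)
    return base + half, base - half            # d_+, d_-


def call(S, K, r, sigma, t):
    dp, dm = d_pm(S, K, r, sigma, t)
    return S * Phi(dp) - K * math.exp(-r * t) * Phi(dm)


def greeks_177(S, K, r, sigma, t):
    """Delta and kappa exactly as written in (177)-(178)."""
    dp, dm = d_pm(S, K, r, sigma, t)
    disc_K = K * math.exp(-r * t)
    delta = Phi(dp) + (N(dp) - disc_K / S * N(dm)) / (sigma * math.sqrt(t))
    common = -(math.log(S / K) + r * t) / (sigma**2 * math.sqrt(t))
    ddp, ddm = common + 0.5 * math.sqrt(t), common - 0.5 * math.sqrt(t)
    kappa = S * ddp * N(dp) - disc_K * ddm * N(dm)
    return delta, kappa


S, K, r, sigma, t = 100.0, 100.0, 0.05, 0.2, 1.0
delta, kappa = greeks_177(S, K, r, sigma, t)
dp, _ = d_pm(S, K, r, sigma, t)
delta_compact, kappa_compact = Phi(dp), S * math.sqrt(t) * N(dp)
delta_fd = (call(S + 1e-3, K, r, sigma, t) - call(S - 1e-3, K, r, sigma, t)) / 2e-3
kappa_fd = (call(S, K, r, sigma + 1e-5, t) - call(S, K, r, sigma - 1e-5, t)) / 2e-5
```

All three ways of computing each greek agree within numerical error, which is a useful sanity check before the values are used in a hedge.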

9 Summary

The goal of this note was to summarize the ideas used in financial practice in the language of physics. We have used the discrete time description of the time evolution, which fits best the philosophy of the renormalization group.

This note is far from comprehensive, and many details are missing. Most of the material discussed is known and has been treated in various books in a more elaborate way. What makes this note somewhat different is its emphasis on topics that are not usually discussed (such as the discrete time formalism or the path integral).

Acknowledgment

The author gratefully acknowledges useful discussions with K. Cziszter, G. Fath and Z. Foris. This research was supported by the Hungarian Research Fund under the contract K104292.

References

  • [1] J. Hull, Options, Futures, and Other Derivatives. Upper Saddle River, NJ: Pearson Prentice Hall, 6th ed., 2006.
  • [2] S. E. Shreve, Stochastic Calculus for Finance I: The Binomial Asset Pricing Model. New York, NY: Springer-Verlag, 2003.
  • [3] S. E. Shreve, Stochastic Calculus for Finance II: Continuous-time models. New York, NY: Springer-Verlag, 2003.
  • [4] A. N. Kolmogorov, Foundations of the Theory of Probability. New York, NY; Heidelberg: Martino Fine Books, 2013.
  • [5] K. Ito, An Introduction to Probability Theory. Cambridge, United Kingdom: Cambridge University Press, 1st ed., 1986.
  • [6] “Distribution (mathematics).” https://en.wikipedia.org/wiki/Distribution_(mathematics).
  • [7] “Path integral formulation.” https://en.wikipedia.org/wiki/Path_integral_formulation.
  • [8] “Renormalization group.” https://en.wikipedia.org/wiki/Renormalization_group.
  • [9] J. C. Collins, Renormalization. Cambridge, United Kingdom: Cambridge University Press, 1984.
  • [10] S. Borsanyi et al., “Calculation of the axion mass based on high-temperature lattice quantum chromodynamics,” Nature, vol. 539, no. 7627, pp. 69–71, 2016.
  • [11] R. N. Mantegna and H. E. Stanley, An Introduction to Econophysics: Correlations and Complexity in Finance. Cambridge, United Kingdom: Cambridge University Press, 2000.
  • [12] B. E. Baaquie, C. Coriano, and M. Srikant, “Quantum mechanics, path integrals and option pricing: Reducing the complexity of finance,” in 2nd International Workshop on Nonlinear Physics: Theory and Experiment, Gallipoli, Lecce, Italy, June 27-July 6, 2002, 2002.
  • [13] A. B. Schmidt, Quantitative Finance for Physicists: An Introduction. Cambridge, MA: Academic Press, 1st ed., 2004.
  • [14] Z. Kakushadze, “Path Integral and Asset Pricing,” Quantitative Finance, vol. 15, no. 11, pp. 1759–1771, 2015.
  • [15] F. Jovanovic and C. Schinckus, Econophysics and Financial Economics: An Emerging Dialogue. New York, NY: Oxford University Press, 2017.
  • [16] B. E. Baaquie, Quantum Field Theory for Economics and Finance. Cambridge, United Kingdom: Cambridge University Press, 1st ed., 2018.
  • [17] “Greeks (finance).” https://en.wikipedia.org/wiki/Greeks_(finance).