Finance from the viewpoint of physics
Abstract
In this note we review the basic mathematical ideas used in finance in the language of modern physics. We focus on the discrete time formalism and derive path integral and Green’s function formulas for pricing. We also discuss various risk mitigation methods.
1 Introduction
Advanced mathematical methods have long been used in finance to understand the functioning of the market. In this continuously fluctuating environment, probability theory provides the solid basis on which the assessment of present values and the risk mitigation techniques can be built. This aspect of the market has become even more pronounced since the crisis of 2008. Since then the market has been more prudent, and collateralization is often applied even for simple products. New, more complicated financial products have appeared, and the use of computers in trading is becoming more and more widespread. All of these facts increase the role of mathematical methods in finance.
There are numerous well-written books on mathematical finance, for example [1, 2, 3]. These books, and most of the financial literature, use the phrasing of probability theory that was founded by Kolmogorov [4] and Itô [5] in the first half of the twentieth century. This approach considers the stochastic process as a measure which can be used for integrating a function (adapted process). This idea fits nicely into the mathematical movements of the early twentieth century, namely the rise of measure theory and the Lebesgue integral.
At the same time, however, a different formalism describing probabilistic processes was also born, mainly driven by physicists: Einstein, Langevin, Fokker, Planck, and later Dirac and Feynman. Here we treat the stochastic process as a differential equation (Langevin-equation), where an unusual, rapidly oscillating function, called white noise, appears in the source term. The white noise is a normally distributed random function whose correlation between different times is described by a Dirac-delta. In the 1920s, however, it was entirely unclear how to deal with the Dirac-delta ”function”. It was only in the 1950s that Schwartz gave a mathematically satisfying description of it as a distribution [6].
An alternative formulation of the Langevin-equation can be given using an integral representation, called the functional (or path) integral. This approach was initiated by Wiener in the 1920s, but it gained its full weight through the work of Dirac and Feynman in the 1940s [7]. With this formulation the same problem appeared as earlier for the Dirac-delta: the continuum limit, except for some elementary cases like the Wiener-integral, seemed to make no sense.
The solution that gave sense to the path integral (and, in fact, to all of quantum field theory) arrived only in the 1970s with the ideas of renormalization (for a summary and references cf. [8]). The main idea is in fact related to the ones used in defining the Lebesgue-integral and the Dirac-delta: we approach the continuum limit through some discretization, and we study how the results change when the discretization is changed. But, unlike in the case of integrals and distributions, the continuum limit is much more complicated here, and we must always keep referring to the discretization scale. Although this could seem to be a flaw in the line of thought, it actually leads to new, measurable effects (running coupling constants, trace anomaly) [9].
This solution had a huge impact on the development of statistical physics and quantum field theory, disciplines where the formalism relies strongly on the path integral. Present-day numerical computations in elementary particle physics mostly use path integral methods in some discretization, and the continuum limit is taken only at the end of the computations. In this way, however, precise numerical results can be obtained (cf. for example [10]).
Monte Carlo (MC) simulations are used in various fields nowadays, including finance. In the financial sector most models are extensions of the Brownian motion, and so Gaussian MC simulations can be applied to simulate the price movements.
The purpose of this note is to give an introduction to finance in the language of physics. As such, it is part of an ongoing effort to bring the ideas of physics into finance and vice versa [11, 12, 13, 14, 15, 16].
This note is built up as follows. We define the mathematical space that corresponds to the market (Section 2), then we discuss the value of a portfolio in Section 3. In Section 4 we look at the market from the point of view of statistics, and introduce the tools for treating the price changes in a discretized formulation. In Section 5 we turn to the possibility of a continuous approximation. In the next section (Section 6) we solve some stochastic differential equations. In Section 7 we discuss risk mitigation techniques applied in the market, and how the assumption of risk neutrality leads to the determination of the price of a derivative (Section 8). The paper closes with a Summary section (Section 9).
2 The space of trades
In order to be able to speak about financial products we have to define an abstract space that represents the trades. To understand the logic we recall that trading traditionally stems from the exchange of property between different people, families, tribes, or later firms. All tradeable properties will be called assets, be they direct material goods like vegetables, cattle or tools, or indirect ones such as land, labor or even the life of a person (which is traded for example when somebody enters the army). Assets can have parameters (for example quality, expiration date etc.); assets with different parameter values are treated as different assets.
The property of a trader usually consists of several assets. They can have a house, two horses, five and a half barrels of oil and also three and a half cows if two persons own seven cows together. In the property (we will call it a portfolio) every asset thus has some quantity. The property or portfolio is thus the list of all the assets with their available quantities.
The mathematical structure corresponding to this construction is the vector space. Let us denote by that vector space (asset space or portfolio space) where the basis elements are the assets. Although it can be thought of as infinite dimensional (because, for example, the quality forms a continuum), in practice only a finite number of asset types are traded, so we do not lose anything if we think of it as a finite dimensional vector space. We mathematically define the portfolio as an element of the asset space
In finance there is one distinguished asset that plays a universal role, and this is money. In economics money has various roles; here we consider just one aspect, the universal medium of exchange. We use US dollars as numeraire, and denote the corresponding asset by USD. So if we have ten dollars and two dogs, then our portfolio can be described as . Logical.
2.1 Loans and other promises
What makes it more interesting is that not only actual goods can be traded in a spot exchange, but other “financial products” as well. One of the simplest financial products is a loan. This can be money, but other assets can be lent and borrowed, too.
Whoever has a debt has, in some sense, a negative property. If we owe three cows then our portfolio could be written as . But this is not the most adequate notation, and sometimes it can lead to misunderstandings. The reason is that if we have three cows and owe three cows, the above notation would suggest writing . But it is not true that we have nothing, because we can use the benefits of the cows, for example we can drink their milk.
Thus, somewhat generalizing the concept of the loan, we will speak about general promises or liabilities. A debt can be considered as a promise that we will give (back) a certain asset when we are asked for it. The loan is the opposite: somebody has promised us a payoff at some time. In fact the actual assets and the promises on actual assets are the main constituents of the more complicated financial products.
Let us denote the promise with , and its argument is the asset that is promised. The loan is a positive promise, because when it is given, one will possess the given asset. This means that if we have three cows and owe three cows, then our property is
(1) |
Now we cannot simplify this expression, and it means exactly what we want it to. is defined to be a linear map of the asset space
(2) |
A promise, since it concerns future events, can have several more parameters, which is why it is worth denoting it as a function. A usual parameter is the maturity or tenor or expiration time, denoting when the promise is due. If we denote the present time as , then
(3) |
means that we should deliver 3 cows in one year from now. can be a time interval, discussed later.
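The bookkeeping of this section can be sketched in code. The following is a minimal illustration, assuming that a portfolio is stored as a dictionary from basis elements to quantities; the helper names `asset`, `promise` and `combine` are hypothetical and are not notation of this note.

```python
from collections import defaultdict

def asset(name):
    """Basis element for an asset that is held now (hypothetical encoding)."""
    return ("asset", name)

def promise(name, maturity):
    """Basis element for a promised asset due at `maturity` (hypothetical encoding)."""
    return ("promise", name, maturity)

def combine(*terms):
    """Linear combination: add up the quantities on identical basis elements."""
    total = defaultdict(float)
    for quantity, basis in terms:
        total[basis] += quantity
    # drop exact zeros, so positions that cancel disappear from the portfolio
    return {basis: q for basis, q in total.items() if q != 0.0}

# Three cows owned and three cows owed in one year do not cancel,
# because the asset and the promise are different basis elements.
portfolio = combine((3, asset("cow")), (-3, promise("cow", 1.0)), (10, asset("USD")))

# Receiving three cows and giving three cows away on the spot does cancel.
spot = combine((3, asset("cow")), (-3, asset("cow")))
```

Treating the promise as a separate basis element is what keeps the owned and the owed cows apart, while linearity still lets quantities on the same element add up.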
2.2 Common financial products
In this language we can describe a lot of financial products. For example a loan with notional USD, paid back in parts, can be described as
(4) |
where is the interest rate to be paid at time (for example for monthly payoff), and is the remainder due at expiration time . To determine the value of the parameters and at fixed , we can use different techniques discussed later. For a fixed rate loan is constant.
Another product is the futures trade, in which it is agreed that an asset ’a’ will be bought or sold at a given strike price at maturity time . If we want to buy that asset (we are then said to be in the long position), our portfolio consists of
(5) |
If we want to sell the asset (we are then said to be in the short position), our portfolio is
(6) |
Another interesting parameter of the promise can be its optionality. One of the counterparties may have the right not to fulfill or not to exercise their promise. In this case the two parties are not equivalent. We say that the one who possesses the optionality is in the long position, and the other counterparty (who “sells the optionality”) is in the short position, irrespective of whether the promise is to buy or to sell something.
A possible notation for the options is to multiply the possible payoffs by a number . When , then the promise is fulfilled, otherwise it is denied. It also matters who has the right to decide the value of , which we indicate by an index: if the index is , then the portfolio owner has the right to set the value of (i.e. she is in the long position), if the index is then someone else determines its value (so the portfolio owner is in the short position with respect to the option).
For example if we agreed that trader ’A’ has the option to buy a product ’a’ at time (or time interval) for a strike price from trader ’B’ (European option), then their portfolios read
(7) |
The exercise date can also be optional: in an American option it is any value in , in a Bermudan option there are some fixed dates. As in the previous case, we can denote its optionality by a subscript . An American option can be described as
(8) |
where
We note that the strike price can also be a complicated construction, even depending on the price history. For example we can agree that the buyer of the option has the right to sell a given asset at the average price that was achieved in a given time interval (Asian option), or something even more exotic.
We also note that, although the choice of is completely up to the trader in the long position, sensible traders choose if it is beneficial to them. This makes it possible to determine the price of the option, see later.
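The exercise decision of a sensible option holder can be illustrated with a short sketch. It assumes a European option to buy, a flag that takes the values 0 or 1, and a cash-settled payoff; the function names are hypothetical.

```python
def exercise_flag(price_at_maturity, strike):
    """1 if exercising the right to buy is beneficial to the holder, else 0."""
    return 1 if price_at_maturity > strike else 0

def long_payoff(price_at_maturity, strike):
    """Payoff of the party who owns the option (long position)."""
    flag = exercise_flag(price_at_maturity, strike)
    return flag * (price_at_maturity - strike)

def short_payoff(price_at_maturity, strike):
    """Payoff of the party who sold the option; the flag is set by someone else."""
    return -long_payoff(price_at_maturity, strike)
```

For example, with strike 100 the holder exercises at a market price of 120 and gains 20, and lets the option lapse at a market price of 80.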
3 Value of the portfolio
By now we can describe what we have currently. In a trade we exchange two (or more) assets. But the question is, how much is a given asset worth? Clearly no one would bargain away his property, but at the same time everybody wants to achieve the highest price possible.
On the other hand there is no explicit measure of value for the goods, in particular because goods may have a hidden advantage for somebody, and this person is willing to buy them at a higher price, too. So the only measure for the value of an asset is the price at which it is usually traded. A well informed trader will trade the asset at exactly the price that is adequate at that moment. The lack of information leads to a failed trade, or to arbitrage, when an asset can be bought from and sold to different parties, realizing a net profit.
If a market is well informed, and there are a lot of vigilant merchants around, then arbitrage cannot persist for long. If this were strictly true, then there would be a single price for each asset. But actually it is just an approximation, since nobody knows that value, and so all the trades modify the price somewhat. A momentary excess in demand will raise the price, while a momentary excess of offers will lower it, and this is repeated time and time again. So, if we insist on having a definite price, we have to say that the prices fluctuate.
If we sell or buy several assets, then we trade them separately. This means that the price (value of the portfolio) is a linear map from the asset space and time to the real numbers (actually ). Thus
(9) |
gives the price/value of the asset at a time .
In a fair business neither of the counterparties loses, both of them give or receive the price which corresponds to the assets they trade. If it is a spot bargain, then both parties know the market price, and this serves as a reference point. But if the payoffs happen in the future, one needs a tool to compute the value of the asset at present. This is the present value, and this forms the basis of a fair trade.
3.1 Discounting a risk free zero coupon bond
The simplest future payoff is the zero coupon bond, which is , i.e. it pays 1 USD once at a future time . We also assume that it is risk free, meaning that we can count on the payoff with one hundred percent certainty. For example we may think of a US government bond. Our task is to tell its value at time , which is called discounting the value of the payoff.
To tell the present value, we have to compare the investment in a zero coupon bond to a bank deposit in a safe bank. If it were more advantageous to invest in a bank deposit, then we would short the zero coupon bond now, and put the money in the bank deposit. At time the bank deposit would have a higher value, and so we could gain money with zero starting capital. If the investment in the zero coupon bond were more advantageous, we could do the inverse: we borrow money from a bank, put it into the bond, and realize a net profit at time . To avoid these arbitrage possibilities, the present values of a risk free zero coupon bond and a risk free bank deposit must agree.
But the bank pays interest on all deposits. In the simplest case it is a fixed annual interest rate . Technically the interest is paid periodically in each time period, with the corresponding interest rate . can be determined from the condition that after one year we get rate (assuming is integer)
(10) |
In case (called continuous compounding) we denote . Then
(11) |
This also means that .
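The relation between periodic and continuous compounding can be checked numerically. The sketch below assumes the reading that the per-period rate compounds to the annual rate over one year, and that the continuously compounded rate is the limit of the per-period rate divided by the period length; the function names are hypothetical.

```python
import math

def period_rate(annual_rate, periods_per_year):
    """Per-period rate r_N such that (1 + r_N)**N == 1 + annual_rate."""
    return (1.0 + annual_rate) ** (1.0 / periods_per_year) - 1.0

def continuous_rate(annual_rate):
    """Continuously compounded rate r such that exp(r) == 1 + annual_rate."""
    return math.log1p(annual_rate)

annual = 0.05
for n in (1, 12, 365, 10**6):
    # n * r_n is the per-period rate divided by the period length 1/n
    print(n, n * period_rate(annual, n))
print("continuous limit:", continuous_rate(annual))
```

As the number of periods grows, the annualized per-period rate decreases from the annual rate towards its logarithmic counterpart.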
If we deposited USD in the bank at time , at a later time it is worth USD. This should be compared to the case, when we buy a zero coupon bond at time with maturity . In an arbitrage-free fair business both should have a value of 1USD at time , so we require . Thus the value of the zero coupon bond at time is
(12) |
This formula makes it possible to determine the value of for a fixed rate loan. The portfolio was given in (4). In a fair business the value of the portfolio is zero at all times. Let us compute it at time zero (present time), when we have
(13) |
Let us choose , , and denote the actual interest rate (which is the risk free interest rate plus the spread) by . Then we find
(14) |
and, correspondingly,
(15) |
Therefore the condition of arbitrage freeness in the absence of risk leads to a definite price for the zero coupon bond, and a definite value of the fixed rate payment.
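The fair-value condition for a loan can be illustrated with a small sketch. It is not the exact setup of (13)-(15): it assumes level payments at equally spaced times, no remainder at expiration, and discounting with a constant continuously compounded rate. All names are hypothetical.

```python
import math

def discount_factor(rate, t, maturity):
    """Value at time t of 1 USD paid at `maturity`, for a constant rate."""
    return math.exp(-rate * (maturity - t))

def level_payment(notional, rate, dt, n_payments):
    """Constant payment that makes the loan worth zero at time zero."""
    annuity = sum(discount_factor(rate, 0.0, i * dt) for i in range(1, n_payments + 1))
    return notional / annuity

def loan_value_at_zero(notional, payment, rate, dt, n_payments):
    """Borrower's view: notional received now minus discounted payments owed."""
    owed = payment * sum(discount_factor(rate, 0.0, i * dt) for i in range(1, n_payments + 1))
    return notional - owed

notional, rate, dt, n = 1000.0, 0.06, 1.0 / 12.0, 24
payment = level_payment(notional, rate, dt, n)
# geometric series summed in closed form, as a cross-check
closed_form = notional * (math.exp(rate * dt) - 1.0) / (1.0 - math.exp(-rate * n * dt))
```

The closed form follows from summing the geometric series of discount factors; the numerical and the closed-form payment agree, and the loan has zero value at time zero.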
3.2 Discounting the price of an asset
Let us assume that we have a promise that we are given an asset at time , so our portfolio is . What is the value of the portfolio at time ?
What we certainly know is that
(16) |
since the promise is fulfilled then, we obtain the asset, and its price is what is determined by the market at that time. We claim that it is true at other times as well, i.e.
(17) |
it does not depend on .
The reason is that if , then we buy the asset now, and, at the same time we sell the promise of delivery at time . Therefore we have now the asset , paid its value (), we promised a delivery of at time (this is ), and we obtained the price for the promise . Our portfolio therefore reads
(18) |
The value of the portfolio is zero at time . Its value at time , if the promise is fulfilled
(19) |
But the first two terms cancel each other by equation (16), and so what remains is
(20) |
Therefore we could gain money. If , then we build a portfolio
(21) |
for that and again. To exclude this arbitrage possibility we need to have , which we wanted to demonstrate.
We remark that the two cases are somewhat different. If the price of the promise is larger than the actual price, we can immediately realize a profit without any original capital. The other case is feasible only if we already own the asset, otherwise we cannot realize the part of the portfolio. But, if the asset is liquid enough, there are enough assets in the market to forbid this arbitrage.
Using this result we can give the price of a futures trade. The portfolio of a long position is given by (5), its price is therefore
(22) |
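The value of a long futures position can be sketched by combining the two discounting results of this section: a promised asset is worth its current market price, and a promised cash amount is discounted. The sketch assumes a constant continuously compounded risk free rate, and the function names are hypothetical.

```python
import math

def long_futures_value(spot_price, strike, rate, t, maturity):
    """Promised asset (worth its spot price now) minus the discounted strike."""
    return spot_price - strike * math.exp(-rate * (maturity - t))

def fair_strike(spot_price, rate, t, maturity):
    """Strike at which the long position has zero value at time t."""
    return spot_price * math.exp(rate * (maturity - t))

strike = fair_strike(100.0, 0.03, 0.0, 2.0)
value_now = long_futures_value(100.0, strike, 0.03, 0.0, 2.0)
```

With the fair strike the position is worth zero when the trade is agreed; at maturity its value is simply the market price minus the strike.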
4 Statistical approach to the market
In fact the discounting of an asset price is the only one which is independent of the way the market operates. Already the calculation of the discount factor of a fixed payoff depends strongly on the details, in this case on the interest rate. A fair business takes into account the market rates which, however, fluctuate in time. Therefore we should understand how the market operates, how the prices are determined, and why and how they fluctuate. This is a very complicated question, and we can just hope that we find a satisfactory approximation.
The first point we have to clarify is the recording of the prices. Although previously we used a continuous time notation, it is an abstraction, an approximation. In reality all the recordings have a time stamp that is not infinitely fine. There is a smallest time difference that can be resolved, say sec (as an upper estimate), thus all trades and prices can be characterized by an integer; in particular the price of asset at time will be denoted as . We will use a fixed number of assets, then the vector of all prices is . Sometimes we will put a comma between the two indices in order to avoid misunderstanding, for example we will write .
When we think about a dynamic model of price changes we must pin down that in a complete model the price in the future must depend solely on the information available at the present. In fact, we can not make decisions based on past events if they are forgotten. The only way of remembering the past events is to make notes (eventually in our memory) about them, and then it is an available information in the present. So we may write generally
(23) |
The factors determining the evolution of the price, of course, are numerous. Moreover, for a quantitative prediction we would need to know the actual form of the . Thus predicting the price in the future seems to be impossible.
Still, we can benefit from the generic form above. We may divide the information available at present into three parts. The first part consists of externalities that do not depend on the status of the market: for example natural events like weather, new discoveries, inventions, political or military actions. In a market model we do not want to describe their dynamics, we take them as given processes, and as such these can be taken into account as an explicit time dependence. We may hope that these effects are slow (usually they are, but for example the weather can have a significant influence in certain areas also on a daily basis).
The second part of the variables describe the market. Among them there are the asset prices, but other market factors can also be present like forward rates. They appear on both sides of the equation, and we denote them unified with .
The third part is again (mainly) independent of the status of the market, but these are fast processes. They consist, for example, of the momentary intentions of the participants of the market. Let us denote them as , where runs through some (large) index set. These processes are in principle well defined, they follow their own dynamics, but it is impossible to tell their time dependence from the knowledge of the asset prices. All in all we have the equation
(24) |
Were the absent from the above equation, we could determine from the observation of price changes in the past, and eventually recalibrate its form from time to time. But it is hopeless to determine the actual form of the functions. What helps us in this situation is that they are numerous, and although they are deterministic one-by-one, their net effect is still something that can be described statistically. This means that we assume a time dependence for them, solve the above equation for all possible time dependences, and finally we average over the results with some weight. We will assume that these variables are normalized in a way that they fluctuate around zero (their mean is treated as a deterministic effect).
4.1 Linearization
Using the fact that the effects are small one-by-one, we can power expand the function to first order
(25) |
The last term is a weighted sum of the variables at time index . Now we can argue that the distribution of the sum of mostly independent random variables (with bounded variance) is a Gaussian. This is the central limit theorem; it requires some conditions to be fulfilled, which we tacitly assume to be the case here. Thus the last term can be substituted by a single term with some generic coefficient:
(26) |
where the variables are all Gaussian distributed random variables with zero mean and unit variance. We will assume that these random variables are independent for different times: indeed, we can argue that there are different trades throughout the world at random times, and so their interrelation is weak. But we must know that this is again an approximation, because if we do not observe all effects, the effective dynamics of the rest will contain memory effects. What we assume is that these memory effects are small.
Although all the formulae are supposed to be written for multi-component variables, it may be useful to write out the indices explicitly. In the multi-component notation the above equation can be written as
(27) |
The random variables are not necessarily independent for different assets
(28) |
and so the covariance matrix of the complete noise term reads
(29) |
To simplify the treatment, we diagonalize the correlation matrix (which is a symmetric regular real matrix) as
(30) |
where the vectors are eigenvectors of the covariance matrix , and they are orthonormal: . Then we can write (27) as
(31) |
with the volatility matrix
(32) |
and uncorrelated noise terms
(33) |
Indeed, the correlation of the noise term reads now as
(34) |
which is exactly the complete covariance matrix (29).
All the above means that it is enough to have as many random Gaussian variables, as the number of the assets on the market (originally we had much more). These variables can be thought to be independent, and appear in the evolution equations multiplied by the volatility matrix . Thus the cumulative distribution of the random variables is
(35) |
From now on we suppress the multidimensional indices, treat as a matrix , and as a vector .
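The diagonalization step can be illustrated for two assets. The sketch below assumes an example covariance matrix, builds a volatility matrix from the eigenvalues and orthonormal eigenvectors, and checks that it reproduces the covariance; the numbers and the function names are hypothetical.

```python
import math

def eigen_sym_2x2(c11, c12, c22):
    """Eigenvalues and orthonormal eigenvectors of [[c11, c12], [c12, c22]]."""
    theta = 0.5 * math.atan2(2.0 * c12, c11 - c22)   # rotation angle that diagonalizes
    c, s = math.cos(theta), math.sin(theta)
    eigenvectors = [(c, s), (-s, c)]
    eigenvalues = [c11 * c * c + 2.0 * c12 * c * s + c22 * s * s,
                   c11 * s * s - 2.0 * c12 * c * s + c22 * c * c]
    return eigenvalues, eigenvectors

def volatility_matrix(c11, c12, c22):
    """sigma[i][k] = sqrt(lambda_k) * v_k[i], so that sigma sigma^T is the covariance."""
    lam, vec = eigen_sym_2x2(c11, c12, c22)
    return [[math.sqrt(lam[k]) * vec[k][i] for k in range(2)] for i in range(2)]

def correlated_noise(sigma, xi):
    """Map independent unit Gaussians xi to the correlated noise of the assets."""
    return [sum(sigma[i][k] * xi[k] for k in range(2)) for i in range(2)]

cov = (0.04, 0.018, 0.09)   # assumed covariance entries C11, C12, C22
sigma = volatility_matrix(*cov)
rebuilt = [[sum(sigma[i][k] * sigma[j][k] for k in range(2)) for j in range(2)]
           for i in range(2)]
```

Feeding independent unit Gaussians through `correlated_noise` then produces noise terms with the prescribed covariance, with as many random variables as assets.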
4.2 Scaling under changing of the discretization time
In the above discussion the value of could be chosen arbitrarily. Our first guess was sec, but it could just as well be sec or even sec. What effect does it have on the form of the dynamic equation?
Let us first assume that we want to work with . This means that we want to tell from . When we recursively substitute the equation of into the equation of we have a lengthy expression. But the price changes are so small in this time interval that in the argument of and functions we can use the previous value. This simplifies the discussion to
(36) |
where in the last expression we divided and multiplied by . The distribution of the sum of independent Gaussian random variables is a Gaussian random variable. The correlation matrix coming from the last expression is thus
(37) |
is the same as for . Thus we may write
(38) |
This can be generalized to arbitrary (as long as the change of the prices in this time interval is negligible): the first term is multiplied by , the second term, on the other hand, by .
(39) |
We may introduce the notations
(40) |
and then we can write for the discretization time
(41) |
We remark that in the multi-dimensional case is a matrix in the asset price space.
This form shows that the continuous time limit is not trivial: not all the variables scale like , and so in the limit the above equation does not go to a differential equation. Indeed, the continuous limit is known as a stochastic differential equation.
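The different scaling of the two terms can be checked by simulation. The sketch below assumes constant drift and volatility over the coarse interval (as in the argument above), sums a number of fine steps into one coarse increment, and compares the sample mean and variance with the linear and square-root scaling. Parameter values and names are hypothetical.

```python
import math
import random

def coarse_increment(drift, vol, fine_dt, n_fine, rng):
    """Sum of n_fine fine steps: drift * dt plus vol * sqrt(dt) * Gaussian each."""
    total = 0.0
    for _ in range(n_fine):
        total += drift * fine_dt + vol * math.sqrt(fine_dt) * rng.gauss(0.0, 1.0)
    return total

rng = random.Random(0)
drift, vol, fine_dt, n_fine = 0.1, 0.2, 0.01, 25
coarse_dt = n_fine * fine_dt

samples = [coarse_increment(drift, vol, fine_dt, n_fine, rng) for _ in range(20000)]
mean = sum(samples) / len(samples)
variance = sum((s - mean) ** 2 for s in samples) / (len(samples) - 1)

# expectation: the mean grows like the time step, the spread like its square root
expected_mean = drift * coarse_dt
expected_variance = vol ** 2 * coarse_dt
```

The deterministic part grows in proportion to the number of fine steps, while the standard deviation of the noise part grows only with its square root.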
4.3 Numerical computation of an expectation value
From the practical point of view the treatment of (41) is as follows. First we find the solution depending on the time series and on the initial condition . Let us denote it
(42) |
Here we have used the fact that can depend only on the past events. Then we should calculate the expected value of any function of by averaging over the possible series over independent Gaussian distributions. In formula this reads
(43) |
The two equations (41) and (43) provide a well defined numerical framework to solve any stochastic problem numerically.
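A minimal Monte Carlo sketch of this procedure is given below. It assumes, purely for illustration, drift and volatility proportional to the price (a discretized geometric Brownian motion); the function names and parameter values are hypothetical.

```python
import math
import random

def simulate_final_price(x0, mu, sigma, dt, n_steps, rng):
    """Iterate the discretized evolution equation with fresh Gaussian noise."""
    x = x0
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)
        x = x + mu * x * dt + sigma * x * math.sqrt(dt) * xi
    return x

def expectation(func, x0, mu, sigma, dt, n_steps, n_paths, seed=0):
    """Average func over independently simulated noise series."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_paths):
        total += func(simulate_final_price(x0, mu, sigma, dt, n_steps, rng))
    return total / n_paths

estimate = expectation(lambda x: x, 100.0, 0.05, 0.2, 0.05, 20, 10000)
# for this linear drift the discrete-time mean is known exactly
exact_discrete_mean = 100.0 * (1.0 + 0.05 * 0.05) ** 20
```

For this choice of drift the expected price of the discretized process is known in closed form, which gives a simple consistency check of the sampled average.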
Often we use a moment generating function that is defined as
(44) |
where and the series satisfy (41).
4.4 Change of variables
A very interesting consequence of the different scaling properties of the various terms in (41) is that, in case of a variable change, a nontrivial factor appears.
Let us assume that we have a new variable , where is a smooth function. In discretized case it reads . Let us consider the change in up to :
(45) |
If all terms scaled as in , then we could power expand to first order. But the different terms scale in different ways, so we must go until the second order term:
(46) |
Here we can use (41) for the value of . We remark that in there is a single term that is proportional to , all other terms are of higher order. Thus we shall write
(47) |
where we omitted the arguments for brevity (note that is a differentiable function, so it is sensible to speak about even if the time steps are discrete). We rewrite this formula as
(48) |
where we introduced a new random variable having zero mean as
(49) |
As we see, the change of is not Gaussian distributed, so is not a Brownian motion any more. But the difference from a Brownian motion vanishes like as . So in the limit we can omit the difference of and . Then we find
(50) |
This is the Itô-formula. In case of any number of correlated assets it reads
(51) |
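The second order correction can be verified numerically for one step. The sketch below assumes drift and volatility proportional to the price and takes the logarithm as the new variable; the Gaussian average is done by trapezoidal quadrature, so no random sampling is involved. All names and values are hypothetical.

```python
import math

def gaussian_average(func, half_width=8.0, n_points=1601):
    """Average func(xi) over a standard normal xi by trapezoidal quadrature."""
    h = 2.0 * half_width / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        xi = -half_width + i * h
        weight = h * (0.5 if i in (0, n_points - 1) else 1.0)
        total += weight * func(xi) * math.exp(-0.5 * xi * xi) / math.sqrt(2.0 * math.pi)
    return total

mu, sigma, dt, x = 0.1, 0.4, 0.01, 100.0

def one_step(xi):
    """One step of the discretized evolution for the price."""
    return x + mu * x * dt + sigma * x * math.sqrt(dt) * xi

mean_log_change = gaussian_average(lambda xi: math.log(one_step(xi)) - math.log(x))
ito_drift = mu - 0.5 * sigma ** 2   # first order term plus the second order correction
naive_drift = mu                    # what a first order expansion alone would give
```

The mean change of the logarithm per unit time agrees with the corrected drift and clearly differs from the first order guess.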
4.5 Evolution equation of the distribution functions
Let us assume that we have statistical information about the price at present, i.e. we know its distribution function . Then what will be the distribution function at later times?
To give a formal definition for the distribution function we realize that for any quantity depending on a real valued random variable the expected value can be written with the help of the Dirac-delta
(52) |
The does not depend on , so we can take it out from the scope of the expected value and obtain
(53) |
where is the distribution function
(54) |
The question we want to answer is what the distribution function of the prices is at time if we know the distribution at time . What we have to do is to solve the price motion using the equation (41), starting from some initial condition at , and assuming given . Having obtained a solution , finally we have to average over all and .
Then we can write for the expected value of any function
(55) |
With the distribution functions we can write this expression as
(56) |
where
(57) |
There are different methods to derive this quantity, here we will use the Itô formula, applied to the expectation value of the function above. First let us fix the initial distribution to
(58) |
then the integral disappears. Now we change , and write down the change in the expected value in two ways. On the one hand we have
(59) |
On the other hand from (41) we have
(60) |
where in the last line we performed partial integration, and omitted the arguments of for brevity. Since the two expressions are equal for any function, we can conclude
(61) |
In continuous time this leads to a partial differential equation known as the Fokker-Planck-equation or Kolmogorov-PDE:
(62) |
If we wanted to write out the indices explicitly we would write
(63) |
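For constant drift and volatility the solution of the Fokker-Planck-equation is a Gaussian whose mean moves with the drift and whose variance grows linearly in time. The sketch below checks this by finite differences, assuming the standard one dimensional form (time derivative equals minus the drift term plus one half of the second derivative of the squared volatility times the density); the constants and function names are hypothetical.

```python
import math

F, SIGMA, X0 = 0.1, 0.3, 0.0   # assumed constant drift, volatility and starting point

def density(x, t):
    """Gaussian density with mean X0 + F*t and variance SIGMA**2 * t."""
    var = SIGMA ** 2 * t
    return math.exp(-(x - X0 - F * t) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def fokker_planck_residual(x, t, h=1e-3):
    """dP/dt + F dP/dx - (SIGMA^2 / 2) d2P/dx2, by central finite differences."""
    dP_dt = (density(x, t + h) - density(x, t - h)) / (2.0 * h)
    dP_dx = (density(x + h, t) - density(x - h, t)) / (2.0 * h)
    d2P_dx2 = (density(x + h, t) - 2.0 * density(x, t) + density(x - h, t)) / h ** 2
    return dP_dt + F * dP_dx - 0.5 * SIGMA ** 2 * d2P_dx2

residuals = [fokker_planck_residual(x, 1.0) for x in (-0.3, 0.0, 0.2, 0.5)]
```

The residual vanishes up to the finite difference error at every test point.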
4.5.1 Composition rule and dependence on the initial conditions
We can perform the evaluation of the expected value (57) in two parts, if we want. We choose an intermediate time, and we impose the condition that at we arrived at , and then, starting from this value, we proceed from . Formally we can write
(64) |
where in the last delta function we tacitly assumed that we start the time evolution from . The two Dirac-deltas are independent of each other, because in the first case we have to average only over , in the second case only over . Therefore we can write
(65) |
This formula makes it possible to find a differential equation with respect to the initial conditions of the distribution function. If we change , namely, the left hand side does not vary. Thus
(66) |
We can use (61) to write for the first term
(67) |
We perform partial integration, and substitute the result back into the previous expression. Since this must be true for any we conclude
(68) |
In continuous time it reads
(69) |
where the index denotes the initial conditions.
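The composition rule (65) can be checked numerically in the simplest case. The sketch below assumes constant drift and volatility, for which the transition density is a Gaussian kernel, and integrates over the intermediate value on a grid; the constants and function names are hypothetical.

```python
import math

F, SIGMA = 0.1, 0.3   # assumed constant drift and volatility

def kernel(x, t, x_start, t_start):
    """Transition density from (x_start, t_start) to (x, t)."""
    var = SIGMA ** 2 * (t - t_start)
    mean = x_start + F * (t - t_start)
    return math.exp(-(x - mean) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def composed_kernel(x, t, s, x0, half_width=5.0, n_points=4001):
    """Integrate over the intermediate value y reached at the intermediate time s."""
    h = 2.0 * half_width / (n_points - 1)
    total = 0.0
    for i in range(n_points):
        y = -half_width + i * h
        total += kernel(x, t, y, s) * kernel(y, s, x0, 0.0)
    return total * h

direct = kernel(0.3, 1.0, 0.0, 0.0)
via_intermediate = composed_kernel(0.3, 1.0, 0.4, 0.0)
```

Evolving in two stages through any intermediate time gives the same density as evolving directly.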
4.5.2 Change of variables in the distribution function
We may also work out the change of the distribution function under the change of its argument. We change the variable from , when is invertible. Then the distribution of reads
(70) |
Changing to new variable , the integral measure changes by the Jacobian, and we find
(71) |
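The Jacobian factor can be illustrated with an exponential change of variable. The sketch below assumes a Gaussian distributed original variable, transforms its density to the new variable, and checks by midpoint quadrature that the result is normalized and has the known lognormal mean. Names and parameter values are hypothetical.

```python
import math

M, S = 0.0, 0.5   # assumed mean and standard deviation of the original variable

def density_x(x):
    """Gaussian density of the original variable."""
    return math.exp(-(x - M) ** 2 / (2.0 * S ** 2)) / (S * math.sqrt(2.0 * math.pi))

def density_y(y):
    """Density of y = exp(x): original density at x = log(y) times |dx/dy| = 1/y."""
    return density_x(math.log(y)) / y

def integrate(func, lower, upper, n_points):
    """Midpoint rule, which never evaluates func at the end points."""
    h = (upper - lower) / n_points
    return h * sum(func(lower + (i + 0.5) * h) for i in range(n_points))

normalization = integrate(density_y, 0.0, 30.0, 60000)
mean_y = integrate(lambda y: y * density_y(y), 0.0, 30.0, 60000)
expected_mean = math.exp(M + 0.5 * S ** 2)
```

Without the factor 1/y the transformed function would not integrate to one.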
4.6 Path integral
In (43) we have seen how to compute an expectation value numerically. Here we continue this line of thought, rewriting that formula.
To treat (43) we have to know the additional information of how to determine , i.e. we need the equation (41). We may work out a formula which is self-contained, i.e. it contains both the time evolution as well as the averaging. The key is that we can represent a recursion through an integral over a Dirac-delta
(72) |
where in the present case (41) corresponds to . This form can be applied for all , and we obtain
(73) |
where with initial condition given, and we also introduced the notation
(74) |
for and .
In order to simplify the formulae, and get rid of the disturbing constant factors, we may introduce
(75) |
and then
(76) |
We can also introduce the generator functional
(77) |
where we also indicated the initial condition. We usually denote , it is sometimes called partition function in physics. Then
(78) |
In the sequel we will omit all constant factors in all expected values, the division with the corresponding will take care of the correct normalization.
We note that the upper limit of the product term in (75) can be extended to infinity. The reason is that if the integrand does not depend on the last variable, then the Dirac-delta simply gives one. In this way we can get rid of the last integral unless depends on it. Finally we have
(79) |
The next step is to integrate over the variables. This is not difficult, because the are linear in this variable. The master formula is
(80) |
We then obtain, using (41)
(81) |
where we denoted
(82) |
Here we also used that . This formula has the big advantage that it does not need any supplementary condition, we can calculate the expectation values simply by performing the integrals.
In physical terms the exponent is called the Hamiltonian, or, in other context, the Euclidean Lagrangian. So we can write
(83) |
then
(84) |
which is called the path integral representation of the expectation value.
The distribution function is the expected value of the Dirac-delta:
(85) |
5 Continuous approaches
In the previous section we used a discrete representation of the stochastic process. Traditionally, however, a continuous description is used. In this section we review some of the continuous approaches.
5.1 Langevin-equation: a differential equation form
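In physics the usual procedure is to write down a formal differential equation
(86) | $\dot x(t) = \mu(x,t) + \sigma(x,t)\, \eta(t)$
known as the Langevin-equation; the symbols $\mu$ and $\sigma$ denote general functions, while $\eta(t)$ is a continuous random variable known as white noise.
In order to reproduce the discretized form (41) we have to choose the correlation function of these random variables carefully. The correct choice is
(87) | $\langle \eta(t)\, \eta(t') \rangle = \delta(t - t')$
where $\delta$ is the Dirac-delta distribution. In this case, namely, by integrating the Langevin-equation from $t$ to $t + dt$ we obtain
(88) | $x(t+dt) - x(t) = \mu\, dt + \sigma \int_t^{t+dt} dt'\, \eta(t')$
We re-introduce $\xi_t$ as
(89) | $\xi_t = \frac{1}{\sqrt{dt}} \int_t^{t+dt} dt'\, \eta(t')$
The correlation between $\xi_t$ and $\xi_{t'}$ for different $t \neq t'$ is zero, and
(90) | $\langle \xi_t^2 \rangle = \frac{1}{dt} \int_t^{t+dt} dt' \int_t^{t+dt} dt''\, \delta(t' - t'') = 1$
Thus the "average" of a stochastic variable must be calculated by dividing by the square root of the time interval, not by the time interval itself.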
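The role of the square root of the time step can be illustrated with a short simulation. This is only a sketch under the assumption that the discretized recursion has the form $x_{t+dt} = x_t + \mu\, dt + \sigma \sqrt{dt}\, \xi_t$ with unit-variance Gaussian $\xi_t$; the function names and parameter values are hypothetical.

```python
import math
import random

def simulate_langevin(x0, drift, vol, dt, n_steps, rng):
    """Recursion x_{t+dt} = x_t + drift(x_t) dt + vol(x_t) sqrt(dt) xi_t."""
    x = x0
    for _ in range(n_steps):
        xi = rng.gauss(0.0, 1.0)  # unit-variance Gaussian, independent at each step
        x += drift(x) * dt + vol(x) * math.sqrt(dt) * xi
    return x

def sample_variance(dt, total_time=1.0, n_paths=4000, seed=1):
    """Variance of x(total_time) for pure noise (zero drift, unit volatility)."""
    rng = random.Random(seed)
    n_steps = round(total_time / dt)
    finals = [simulate_langevin(0.0, lambda x: 0.0, lambda x: 1.0, dt, n_steps, rng)
              for _ in range(n_paths)]
    mean = sum(finals) / n_paths
    return sum((v - mean) ** 2 for v in finals) / n_paths

# With the sqrt(dt) scaling the variance at a fixed time does not depend on the mesh.
var_coarse = sample_variance(dt=0.1)
var_fine = sample_variance(dt=0.01)
```

Both estimates should be close to the total time (here 1) up to statistical noise. If the noise term were scaled with $dt$ instead of $\sqrt{dt}$, the variance would shrink as the mesh is refined.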
5.2 Ito calculus: measures
Equation (41) can be thought as a relation for measures. Then serves as an ordinary Riemann-measure, while the set is interpreted as a probability measure, usually referred to as the Brownian motion. We now discuss the one dimensional case with .
5.2.1 Probability theory in nutshell
This approach needs somewhat more preparation, and we recommend that the interested reader turn to a more detailed description; here we just list the very essence of what we need. The point is that we try to generalize the concept of a random variable to continuous "indices". In the discrete version one defines the sample space $\Omega$ that consists of elementary events, like an actual series of results of a finite number of dice throws. The event space $\mathcal{F}$ is the power set of $\Omega$, consisting of all its subsets. Under the union operation this is a $\sigma$-algebra.
A probability measure is first defined as a function $P: \Omega \to [0,1]$, but it can be lifted to $\mathcal{F}$ with $P(A) = \sum_{\omega \in A} P(\omega)$. $P$ must satisfy $P(\Omega) = 1$. A random variable is a function $X: \Omega \to \mathbb{R}$. The expected value of a random variable is defined as
(91) | $\mathrm{E}[X] = \sum_{\omega \in \Omega} X(\omega)\, P(\omega)$
In the continuous case the problem is that the elementary events (also called atoms), forming $\Omega$, all have zero probability. Therefore the probability measure can be defined only on $\mathcal{F}$, where it is additive for unions of (countable) mutually disjoint subsets of $\Omega$:
(92) | $P\left( \bigcup_{i \in I} A_i \right) = \sum_{i \in I} P(A_i)$
where $I$ is a countable index set, and $A_i \cap A_j = \emptyset$ for $i \neq j$. The set $(\Omega, \mathcal{F}, P)$ is called the probability space.
The generalization of the discrete expected value to the continuous case is a stochastic integral denoted by
(93) | $\mathrm{E}[X] = \int_\Omega X\, dP$
This is defined by a limiting procedure. First define the integral if $X$ is a step function, i.e. $X = \sum_i x_i\, \chi_{A_i}$, where $A_i$ are disjoint elements of $\mathcal{F}$, and $\chi_{A}(\omega) = 1$ if $\omega \in A$ and 0 otherwise (indicator function). Then
(94) | $\int_\Omega X\, dP = \sum_i x_i\, P(A_i)$
Then this definition can be extended to any function that can be approached as a limit of step functions.
5.2.2 The Ito process
The integral associated with the measure is the Ito integral. In our approach, fixing the time steps, we can integrate a function that is constant during these time steps (i.e. a fine step function; in mathematics it is called a process adapted to the discretization). The result of the integral is then a stochastic variable
(95) |
This sum is also a Gaussian variable with zero mean and the following variance
(96) |
as it can be seen from the square of the sum.
It is not hard to see that this definition does not depend on the length of the time intervals, just because of the Gaussian nature of the variables. So refining the time mesh we can approach the integral of any functions that can be described as a limit of step functions (measurable functions).
The quadratic variation of the integration measure reads
(97) |
and the last term vanishes when $dt \to 0$. This formula forms the basis of Ito calculus.
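The statement behind the Ito rule can be made tangible numerically. Below is a minimal sketch, assuming that the increments of the measure are $dW_t = \sqrt{dt}\, \xi_t$ with independent unit-variance Gaussian $\xi_t$; the function name is hypothetical. It sums the squared increments over a time mesh.

```python
import math
import random

def quadratic_variation(total_time, n_steps, seed=0):
    """Sum of (dW)^2 over a mesh of n_steps, with dW = sqrt(dt) * xi."""
    rng = random.Random(seed)
    dt = total_time / n_steps
    return sum((math.sqrt(dt) * rng.gauss(0.0, 1.0)) ** 2 for _ in range(n_steps))

# On a coarse mesh the sum still fluctuates noticeably from sample to sample ...
qv_coarse = quadratic_variation(total_time=2.0, n_steps=10)
# ... while on a fine mesh it is pinned to the elapsed time.
qv_fine = quadratic_variation(total_time=2.0, n_steps=100000)
```

On the fine mesh the sum is close to the elapsed time, because the fluctuating part of each squared increment averages out. This is the content of the rule that $(dW)^2$ may be replaced by $dt$.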
5.3 Path integral
There is also a continuous notation for the path integral. The sum in (84) multiplied by naturally leads to the integral notation
(98) |
with . Then
(99) |
where
(100) |
6 Solutions of some stochastic differential equations
In this section we discuss some stochastic differential equations, and give their distribution functions. We will always start from the initial condition , or .
6.1 The Brownian motion
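The simplest stochastic equation is obtained when the drift and the variance are constant. Then we can diagonalize the covariance matrix, and so we may deal with one dimensional problems. The equation we have to solve reads, in the discrete notation,
(101) | $x_{t+dt} = x_t + \mu\, dt + \sigma \sqrt{dt}\, \xi_t$
where $\xi_t$ are independent Gaussian variables with zero mean and unit variance. The solution of the recursion is very simple
(102) | $x_t = x_0 + \mu t + \sigma \sqrt{dt} \sum_{n=0}^{N-1} \xi_{n\, dt}$
We introduce
(103) | $\xi = \frac{1}{\sqrt{N}} \sum_{n=0}^{N-1} \xi_{n\, dt}$
which is a Gaussian random variable with zero mean and unit variance. So we have
(104) | $x_t = x_0 + \mu t + \sigma \sqrt{t}\, \xi$
where $t = N\, dt$. Thus the distribution function reads
(105) | $P(x,t) = \frac{1}{\sqrt{2\pi \sigma^2 t}} \exp\left[ -\frac{(x - x_0 - \mu t)^2}{2 \sigma^2 t} \right]$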
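As a sanity check of the Gaussian distribution of the Brownian motion, one can integrate its moments numerically. The sketch below assumes constant drift `mu` and volatility `sigma` and the initial value `x0`; these names and the parameter values are illustrative.

```python
import math

def brownian_pdf(x, t, x0=0.0, mu=0.0, sigma=1.0):
    """Gaussian density with mean x0 + mu*t and variance sigma^2 * t."""
    var = sigma * sigma * t
    return math.exp(-(x - x0 - mu * t) ** 2 / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)

def moment(k, t, x0, mu, sigma, n=40000):
    """k-th moment by midpoint integration over +-10 standard deviations."""
    centre, width = x0 + mu * t, 10.0 * sigma * math.sqrt(t)
    h = 2.0 * width / n
    xs = (centre - width + (i + 0.5) * h for i in range(n))
    return h * sum(x ** k * brownian_pdf(x, t, x0, mu, sigma) for x in xs)

norm = moment(0, t=2.0, x0=1.0, mu=0.5, sigma=0.3)
mean = moment(1, t=2.0, x0=1.0, mu=0.5, sigma=0.3)
variance = moment(2, t=2.0, x0=1.0, mu=0.5, sigma=0.3) - mean ** 2
```

The density is normalized, its mean moves linearly with the drift, and its variance grows linearly in time.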
6.2 Geometric Brownian motion (GBM)
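The most prominent feature of market prices is that it is not important in which unit we measure them. We can use any currency, the gold price or any other asset price as numeraire; the dynamics of the market is the same. Therefore only the relative price changes can be important. The simplest stochastic differential equation that describes this property is
(106) | $\dot S = \mu S + \sigma S\, \eta$
in the Langevin notation.
With the new variable $x = \ln(S/\bar S)$, with some reference price $\bar S$, we obtain, using the Ito formula
(107) | $dx = \left( \mu - \frac{\sigma^2}{2} \right) dt + \sigma \sqrt{dt}\, \xi_t$
This is the Brownian motion discussed above. Using this equation it is usual to give the solution of the GBM as
(108) | $S_t = S_0 \exp\left[ \left( \mu - \frac{\sigma^2}{2} \right) t + \sigma \sqrt{t}\, \xi \right]$
where $S_0$ is the initial price and $\xi$ is a Gaussian random variable with zero mean and unit variance.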
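The explicit GBM solution is easy to sample. The following sketch assumes the standard form $S_t = S_0 \exp[(\mu - \sigma^2/2)\, t + \sigma \sqrt{t}\, \xi]$ with a unit Gaussian $\xi$; the function name and the parameter values are hypothetical.

```python
import math
import random

def gbm_price(s0, mu, sigma, t, xi):
    """Exact GBM solution for one unit-variance Gaussian draw xi."""
    return s0 * math.exp((mu - 0.5 * sigma ** 2) * t + sigma * math.sqrt(t) * xi)

rng = random.Random(42)
s0, mu, sigma, t = 100.0, 0.05, 0.2, 1.0
samples = [gbm_price(s0, mu, sigma, t, rng.gauss(0.0, 1.0)) for _ in range(200000)]
sample_mean = sum(samples) / len(samples)
expected_mean = s0 * math.exp(mu * t)  # the -sigma^2/2 term is what makes this hold
```

The sample mean should reproduce $S_0 e^{\mu t}$ within the statistical error, and all sampled prices are positive. The latter is the practical reason for preferring GBM over the plain Brownian motion for prices.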
6.3 Vasicek/Hull-White model
In finance a mean reverting model is one in which, in the long run, the random variable fluctuates around a single value. Such a model is the following
(110) |
Depending on whether the parameters are constant or time dependent, this model is called the Vasicek or the Hull-White (extended Vasicek) model. Here we solve the model with constant parameters.
Introduce a new variable , then
(111) |
therefore
(112) |
This equation can be solved to by a simple integral. So we find for the original variable
(113) |
This describes a Gaussian random variable with mean
(114) |
and variance
(115) |
So the distribution is
(116) |
As we see, the mean in long terms goes to , the process fluctuates around it with a variance .
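The mean and variance of the constant-parameter model can be written as two short functions. This is a sketch that assumes the common parametrization $dx = \kappa(\theta - x)\, dt + \sigma\, dW$; the symbols $\kappa$, $\theta$ and $\sigma$ are an assumption and may differ from the notation of (110).

```python
import math

def vasicek_mean(x0, kappa, theta, t):
    """Mean of x(t): relaxes exponentially from x0 to the long-term level theta."""
    return theta + (x0 - theta) * math.exp(-kappa * t)

def vasicek_variance(kappa, sigma, t):
    """Variance of x(t): saturates at sigma^2 / (2 kappa) instead of growing forever."""
    return sigma ** 2 / (2.0 * kappa) * (1.0 - math.exp(-2.0 * kappa * t))

long_mean = vasicek_mean(x0=0.10, kappa=0.5, theta=0.03, t=100.0)
long_var = vasicek_variance(kappa=0.5, sigma=0.02, t=100.0)
short_var = vasicek_variance(kappa=0.5, sigma=0.02, t=1e-6)
```

For short times the variance behaves like that of a Brownian motion, $\sigma^2 t$, while for long times the mean reversion stops its growth.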
7 Risk of a portfolio
In the previous sections we discussed the general framework of the price dynamics. Now let us think about the evaluation of the present value of an asset.
The most striking question is: if there are two assets with interest rates , why is there no arbitrage possibility? Indeed, the portfolio
(117) |
has zero value at , but at it is worth
(118) |
So it seems that it is worth realizing this portfolio: we gain money from nothing.
The main point that we did not take into account is the risk. Let us assume for example that is practically risk-free, while has an annual default risk . The average annual rate thus is . The risk therefore diminishes the rate.
The first problem here is that it is very hard to tell the exact value of the default risk before a real default occurs. We may give vague estimates, but we can easily miss a factor of two or even ten. As a numerical example consider the case when the risky asset pays an interest rate of 20%, while the risk-free asset pays 10%. If the default risk of the risky asset is 5%, then its average interest rate is still $1.2 \times 0.95 - 1 = 14\%$, so it is the better investment. But if the default risk is 10%, then the average rate is only 8%, and the risk-free asset already takes over.
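The arithmetic of this example can be reproduced in a few lines. The sketch assumes that a default wipes out the whole investment, which is the assumption that reproduces the 8% quoted above; the function name is hypothetical.

```python
def average_rate(promised_rate, default_prob):
    """Expected one-year rate when the whole investment is lost with probability default_prob."""
    return (1.0 + promised_rate) * (1.0 - default_prob) - 1.0

rate_low_risk = average_rate(0.20, 0.05)   # about 0.14, still above the risk-free 10%
rate_high_risk = average_rate(0.20, 0.10)  # about 0.08, already below the risk-free 10%
```

The break-even default probability, where the risky asset yields exactly the risk-free 10%, is $1 - 1.1/1.2 \approx 8.3\%$, which shows how sensitive the comparison is to the estimate of the default risk.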
But there is another effect. Let us assume that we can borrow money for rate , and we want to buy the assets from a loan. To be sure we hold back a relative amount as a collateral (usually it is demanded by the bank lending the money, too). So if we have a principal of 1USD we can borrow USD, and after a year we have
(119) |
This is the leverage effect: the effective rate of a risk-free investment can be raised to a very high value. In the ideal case of vanishing collateral, any small difference between the risk-free rate and the bank loan rate makes the effective rate grow to infinity.
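The growth of the effective rate can be illustrated with a toy calculation. The bookkeeping below is an assumption made only for illustration and need not coincide with (119): the own capital of 1 is held as collateral and amounts to a fraction `collateral_fraction` of the loan, the loan is invested at `asset_rate` and repaid at `loan_rate`.

```python
def leveraged_rate(asset_rate, loan_rate, collateral_fraction):
    """Effective one-year rate on own capital of 1 (hypothetical bookkeeping).

    Assumption: the own capital is the collateral and is a fraction
    collateral_fraction of the loan, so the loan is 1 / collateral_fraction.
    The loan is invested at asset_rate and repaid at loan_rate.
    """
    loan = 1.0 / collateral_fraction
    return loan * (asset_rate - loan_rate)

# One percentage point of rate difference, with decreasing collateral.
rates = {c: leveraged_rate(0.10, 0.09, c) for c in (1.0, 0.1, 0.01)}
```

In this toy setting a rate difference of one percentage point becomes an effective rate of roughly 100% when the collateral is 1% of the loan, and it diverges as the collateral goes to zero.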
The first lesson here is that if there were a risk-free investment possibility with a higher annual rate than another, this would indeed open a very large arbitrage possibility. Therefore the completely risk-free rate is a unique number.
The second remark is that we can, of course, leverage the risky investments, too. But then there is a non-negligible probability of losing all the money, in which case we are left with the debt liability. This means that we must reserve a higher collateral in the risky case, preparing for the worst case. This easily makes the effective rate much lower than the effective rate of a risk-free investment.
So the real question is how conservatively, how prudently the banks evaluate and treat the risk. The practice nowadays is that the banks do not tolerate risky investments too well. This has some psychological factors in it; the market could work in different ways. But present day practice requires the business to be practically risk-free.
7.1 Risk mitigation by creating indices
The assets, of course, are not risk-free one by one, so we must make efforts to get rid of the risk. We can do this by combining assets into a portfolio. There are two main techniques. The first one is to combine independent assets into a single portfolio: these are called indices. So we consider the portfolio
(120) |
The value of the portfolio reads:
(121) |
We will assume that the assets follow the equation
(122) |
where we factored out the price itself, and . If are independent of the prices of the underlying assets, then satisfies
(123) |
where
(124) |
where .
To diminish the effective risk, we should minimize the above expression by choosing the correct weights with the constraint that we should keep the value of the portfolio fixed, i.e.
(125) |
Then we have to satisfy
(126) |
where $\lambda$ is a Lagrange multiplier. This results in
(127) |
The value of $\lambda$ comes from
(128) |
Putting all together, after some algebra, we find
(129) |
We see that in this way we can not achieve a complete risk-free portfolio, but we can mitigate the risks of the single underlying assets.
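The minimization can be sketched numerically. The example assumes independent assets with relative volatilities `sigmas` and relative weights $w_i$ (the fraction of the portfolio value held in asset $i$) that sum to one, so that the portfolio variance is $\sum_i w_i^2 \sigma_i^2$; this notation is an assumption and may differ from the one used in (124)-(129).

```python
def min_variance_weights(sigmas):
    """Weights w_i (summing to 1) minimizing sum_i w_i^2 sigma_i^2 for independent assets."""
    inv_var = [1.0 / s ** 2 for s in sigmas]
    total = sum(inv_var)
    return [v / total for v in inv_var]

def portfolio_sigma(weights, sigmas):
    """Relative volatility of a portfolio of independent assets."""
    return sum((w * s) ** 2 for w, s in zip(weights, sigmas)) ** 0.5

sigmas = [0.2, 0.3, 0.4]
weights = min_variance_weights(sigmas)
sigma_index = portfolio_sigma(weights, sigmas)
sigma_equal = portfolio_sigma([1.0 / 3.0] * 3, sigmas)
```

Under these assumptions the optimal weights are proportional to $1/\sigma_i^2$ and the inverse variances add up, $1/\sigma^2 = \sum_i 1/\sigma_i^2$. The index is therefore less risky than its safest component, but its risk never vanishes for a finite number of assets.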
While this is simple in theory, in practice it is not simple to estimate the parameter values reliably. It is also a question how many assets we want to include in the index, how we treat the default risk, etc. We can also perform the optimization in a different way, for example by fixing a given risk and optimizing the effective interest rate. As a result, there are various indices in the market that differ in the way the weights are computed.
7.2 Risk mitigation by hedging
The other way to mitigate the risk is to combine in a portfolio assets that have interdependent risks. In the market there are asset classes where the asset prices depend on each other, so there is a correlation between the risks. The simplest example is when we consider an asset and a derivative of it. A derivative in this sense is an asset that is built exclusively on another, underlying asset (e.g. an option, a swap or similar products).
So let us assume that we have a portfolio where the underlying asset is , and we add some derivatives to it. So we have
(130) |
where the weights and are real numbers. The value of the portfolio is
(131) |
where we have denoted the value of the derivatives at time and at spot price as . Now we think about these functions as prices that can be obtained by observing the market.
What is somewhat more complicated here, compared with the previous case, is that the price of the portfolio may depend non-linearly on the price of the underlying, and so its dynamics must be computed using the Ito lemma. So, if
(132) |
where and can be dependent, then we have for the complete portfolio
(133) |
This expression is risk-free, if the term containing is zero. This leads to
(134) |
This would mean, however, that does not depend on , put another way, it is not built on the asset . This contradicts our first equation.
So perfect risk-freeness can not be achieved in this way, either. The best we can do is to ensure vanishing derivative at a given price of the underlying, practically at the actual spot price . Thus we require
(135) |
It is usual to introduce the $\Delta$ (delta) risk of the portfolio by the definition
(136) |
Risk-freeness at the spot price requires that the delta-risk of the portfolio vanishes
(137) |
It is also said that we have a delta-neutral portfolio, or that we hedged out the delta risk.
Using our portfolio we have
(138) |
where
(139) |
A delta-neutral portfolio can be achieved using one single derivative with and the underlying, by choosing
(140) |
7.2.1 Higher order hedging and the "greeks"
There are several issues with the hedging strategy described above. One is that we do not really know the relation of the underlying and the derivative prices. We can observe the spot price of the derivative, i.e. , but to estimate we should know it for any other prices as well. This can not be observed directly, thus we need a market model. So, strictly speaking, what we can do is to use the estimated present value which already depends on the market model .
In practice the market model has some parameters, first of all the (estimated) volatility parameter of the underlying asset. But, since no market model is perfect, the actual market can be described only with a non-constant volatility parameter. So, in this sense not just the price, but also the model has fluctuations. Now the complete analysis of the previous subsection can be repeated with the substitution . What we obtain is that for a risk-free portfolio we need both
(141) |
It is usual to introduce the quantity $\kappa$ (kappa; sometimes it is called vega), the analogue of $\Delta$, corresponding to the price change under a changing volatility parameter:
(142) |
We need that the kappa value of the complete portfolio is zero (delta-kappa neutral position).
Another issue is that we can ensure a risk-free portfolio only at a single price. As soon as the price moves, the risk will grow. In practice one always has to fine-tune the portfolio by adjusting the weights to the actual price. If, however, $\Delta$ strongly depends on the price of the underlying, then a sudden price change is hard to follow. This motivates the introduction of $\Gamma$ as the derivative of $\Delta$ (the second derivative of the present value of the derivative)
(143) |
To ensure the stability of a portfolio not just the delta, but also the $\Gamma$ should be zero (delta-gamma neutral position).
We could continue this analysis and introduce other "greeks" to denote the higher derivatives, cf. for example [17]; all of them characterize the health of a portfolio. But usually, besides the delta risk, the kappa and/or the gamma are the most important to hedge out.
For all the greeks, the risk of the portfolio is the weighted sum of the individual assets
(144) |
If, for example, we have two derivatives, then we can require
(145) |
to hedge out the Delta-risk, and
(146) |
to hedge out the kappa-risk. If we want to hedge out the gamma-risk as well, we need a third derivative.
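The two conditions form a small linear system that can be solved by hand. The sketch below assumes a position `w1` in a first derivative that we want to hedge, a second derivative with weight `w2` and the underlying with weight `n_underlying`; the underlying has delta 1 and kappa 0. The function name and the numerical greeks are hypothetical inputs; in practice they come from a market model.

```python
def delta_kappa_hedge(w1, delta1, kappa1, delta2, kappa2):
    """Return (n_underlying, w2) making the total delta and kappa vanish.

    The underlying has delta 1 and kappa 0, so the kappa is cancelled by the
    second derivative alone and the remaining delta by the underlying.
    """
    if kappa2 == 0.0:
        raise ValueError("the second derivative must have non-zero kappa")
    w2 = -w1 * kappa1 / kappa2
    n_underlying = -(w1 * delta1 + w2 * delta2)
    return n_underlying, w2

n_s, w2 = delta_kappa_hedge(w1=1.0, delta1=0.6, kappa1=40.0, delta2=0.3, kappa2=25.0)
total_delta = n_s + 1.0 * 0.6 + w2 * 0.3
total_kappa = 1.0 * 40.0 + w2 * 25.0
```

The order of the two steps matters: the kappa is fixed first with the second derivative, and only then is the leftover delta removed with the underlying, which does not reintroduce any kappa.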
If we continuously monitor the different greeks of the portfolio, we see how sensitive it is to the various kinds of price changes. The best practice is to keep all the risks within a given narrow range.
8 Present value and pricing
As we have argued, the market requires the investments to be as risk-free as possible. This also means that single assets are practically never traded one by one, only in portfolios where the risks are mitigated. But all risk-free portfolios must grow at the same rate, otherwise arbitrage would show up. This means that the rates of the individual assets play no role at all. Being part of a portfolio, all assets must be treated as if they had a common drift factor. In this artificial world, called the risk-neutral world, we find for all derivatives (including the underlying asset)
(147) |
where stands for ”risk-neutral”. The rate itself can be a time dependent function, but it can not depend on the single asset prices.
This equation, in fact, is enough to determine the present value of an asset. We can do it in two equivalent ways, one leading to a differential equation, the other an integral formula.
8.1 Black-Scholes-Merton formula
In this approach we consider a portfolio built on an underlying and one derivative. Its value is
(148) |
If it is in the delta-neutral position, then
(149) |
Now we express the time derivative of the portfolio in two ways. On the one hand the portfolio is risk free at , so we require (147) to hold
(150) |
We find for our portfolio above
(151) |
On the other hand, if , then from (133) we find
(152) |
Putting the two equations together we find
(153) |
Strictly speaking the above equation is valid only at and . But as the best approximation for the risk-free portfolio, we can demand that it holds for other as well. This leads to the Black-Scholes-Merton differential equation
(154) |
The solution of the Black-Scholes-Merton model requires an initial condition in time and boundary conditions in . The latter are usually omitted, the boundaries being at infinity. The initial condition in time, on the other hand, is set by the promised payoff in the future
(155) |
It is then a final condition, not an initial one, and we should evolve the time backwards in order to obtain the derivative price today at . This will give the present value of the derivative.
8.2 Integral formula
We can use a different route to have an expression from the condition (147). First we find
(156) |
This means that the quantity
(157) |
is a random variable whose expected value under the risk-neutral measure is time independent (such a variable is called a martingale under the risk-neutral measure).
At present time we know the price of the asset, , thus the price distribution is , and so the expected value . From the time independence of the expected value of follows
(158) |
If we have a promised payoff at time , then (assuming the promise is fulfilled). Therefore
(159) |
This formula does not assume any underlying market model, so it can be used in general.
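The model independence of the integral formula can be mirrored in code: the present value routine only needs a sampler of terminal prices in the risk-neutral world. The sketch below assumes a constant rate and uses a GBM sampler purely as an example model; all names and parameter values are hypothetical.

```python
import math
import random

def present_value(payoff, sample_terminal_price, rate, maturity, n_samples=100000):
    """Discounted risk-neutral average of the payoff; the model enters only via the sampler."""
    total = sum(payoff(sample_terminal_price()) for _ in range(n_samples))
    return math.exp(-rate * maturity) * total / n_samples

rng = random.Random(7)
rate, maturity, s0, sigma = 0.05, 1.0, 100.0, 0.2

def gbm_sampler():
    # Hypothetical model choice: GBM with drift equal to the risk-neutral rate.
    xi = rng.gauss(0.0, 1.0)
    return s0 * math.exp((rate - 0.5 * sigma ** 2) * maturity + sigma * math.sqrt(maturity) * xi)

# Paying out the underlying itself must give back today's price (martingale property).
pv_underlying = present_value(lambda s: s, gbm_sampler, rate, maturity)
# A fixed payoff of 1 is worth just the discount factor.
pv_bond = present_value(lambda s: 1.0, gbm_sampler, rate, maturity)
```

Replacing `gbm_sampler` by any other risk-neutral sampler changes the model without touching the pricing routine.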
If we write the payoff as an integral over Dirac-deltas, we can write
(160) |
The last term is the distribution function in the risk-neutral world:
(161) |
This last formula shows that the Green’s function of the present value determination is
(162) |
Using (69) we see that, if the underlying follows a Langevin equation, then the Green’s function satisfies
(163) |
which is the Black-Scholes-Merton equation (154). This shows that is the Green’s function of the Black-Scholes equation, too. It also proves that satisfies the Black-Scholes equation, so the integral approach is equivalent to the differential equation approach.
If there are several payoffs, then the linearity of the above equation tells us that the present values simply add up. So we can generalize the computation of a present value to arbitrary, continuously compounded payoffs
(166) |
A fixed payoff at time can be represented as .
8.3 Option price in the GBM market model
To see an example we will compute the present value of the European call option in the geometric Brownian motion market model. The promised payoff of the call option reads
(167) |
where . To determine the present value, we use (159). It contains an expected value calculation, where the best is to use the explicit solution (108), where we shall use the drift const. Then we find, with :
(168) |
The condition of positivity is , where
(169) |
Thus we have
(170) |
The negative of the exponent in the first term is
(171) |
We can change variable in the first term to , then the upper limit of the integration is
(172) |
Then in both terms we can recognize the erf function, and we finally arrive at the Black-Scholes formula
(173) |
where
(174) |
A different form for it reads
(175) |
where
(176) |
at is sometimes called moneyness, , i.e. corresponds to the at-the-money (ATM) trade.
From this form we can also calculate the greeks, for example
(177) |
where denotes the normal Gaussian function, and
(178) |
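The closed formulas are straightforward to evaluate. The sketch below uses the textbook form of the Black-Scholes call price, $S N(d_1) - K e^{-rT} N(d_2)$, and of its delta, $N(d_1)$, for constant rate and volatility; the variable names are assumptions and need not match the symbols of (173)-(178).

```python
import math

def norm_cdf(x):
    """Cumulative distribution of the standard Gaussian, via the erf function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def bs_call(spot, strike, rate, sigma, maturity):
    """European call price and delta under GBM with constant rate and volatility."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike) + (rate + 0.5 * sigma ** 2) * maturity) / (sigma * sqrt_t)
    d2 = d1 - sigma * sqrt_t
    price = spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)
    return price, norm_cdf(d1)

price, delta = bs_call(spot=100.0, strike=100.0, rate=0.05, sigma=0.2, maturity=1.0)

# Cross-check of the delta by a symmetric finite difference in the spot price.
bump = 0.01
fd_delta = (bs_call(100.0 + bump, 100.0, 0.05, 0.2, 1.0)[0]
            - bs_call(100.0 - bump, 100.0, 0.05, 0.2, 1.0)[0]) / (2.0 * bump)
```

For these inputs the price is about 10.45 and the delta about 0.637; the finite-difference estimate agrees with the analytic delta, which is a useful check whenever a greek is derived by hand.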
9 Summary
The goal of this note was to summarize the ideas used in financial practice in the language of physics. We have used the discrete time description of the time evolution, which fits best with the philosophy of the renormalization group.
This note is far from being comprehensive; a lot of details are missing. Also, most of the discussed material is known and has been presented in various books in a more elaborate way. What makes this note somewhat different is that it puts emphasis on topics that are not usually discussed (such as the discrete time formalism or the path integral).
Acknowledgment
The author gratefully acknowledges useful discussions with K. Cziszter, G. Fath and Z. Foris. This research was supported by the Hungarian Research Fund under the contract K104292.
References
- [1] J. Hull, Options, Futures, and Other Derivatives. Upper Saddle River, NJ: Pearson Prentice Hall, 6th ed., 2006.
- [2] S. E. Shreve, Stochastic Calculus for Finance I: The Binomial Asset Pricing Model. New York, NY: Springer-Verlag, 2003.
- [3] S. E. Shreve, Stochastic Calculus for Finance II: Continuous-time models. New York, NY: Springer-Verlag, 2003.
- [4] A. N. Kolmogorov, Foundations of the Theory of Probability. New York, NY; Heidelberg: Martino Fine Books, 2013.
- [5] K. Ito, An Introduction to Probability Theory. Cambridge, United Kingdom: Cambridge University Press, 1986.
- [6] “Distribution (mathematics).” https://en.wikipedia.org/wiki/Distribution_(mathematics).
- [7] “Path integral formulation.” https://en.wikipedia.org/wiki/Path_integral_formulation.
- [8] “Renormalization group.” https://en.wikipedia.org/wiki/Renormalization_group.
- [9] J. C. Collins, Renormalization. Cambridge, United Kingdom: Cambridge University Press, 1984.
- [10] S. Borsanyi et al., “Calculation of the axion mass based on high-temperature lattice quantum chromodynamics,” Nature, vol. 539, no. 7627, pp. 69–71, 2016.
- [11] R. N. Mantegna and H. E. Stanley, An introduction to econophysics: correlations and complexity in finance. Cambridge, United Kingdom: Cambridge University Press, 2000.
- [12] B. E. Baaquie, C. Coriano, and M. Srikant, “Quantum mechanics, path integrals and option pricing: Reducing the complexity of finance,” in 2nd International Workshop on Nonlinear Physics: Theory and Experiment Gallipoli, Lecce, Italy, June 27-July 6, 2002, 2002.
- [13] A. B. Schmidt, Quantitative Finance for Physicists: An Introduction. Cambridge, MA: Academic Press, 2004.
- [14] Z. Kakushadze, “Path Integral and Asset Pricing,” Quantitative Finance, vol. 15, no. 11, pp. 1759–1771, 2015.
- [15] F. Jovanovic and C. Schinckus, Econophysics and Financial Economics: An Emerging Dialogue. New York, NY: Oxford University Press, 2017.
- [16] B. E. Baaquie, Quantum Field Theory for Economics and Finance. Cambridge, United Kingdom: Cambridge University Press, 2018.
- [17] “Greeks (finance).” https://en.wikipedia.org/wiki/Greeks_(finance).