GAN-MC: a Variance Reduction Tool for Derivatives Pricing

Weishi Wang, Department of Statistics, The University of Chicago

1 Abstract

We propose a parameter-free model for estimating the price or valuation of financial derivatives such as options, forwards and futures using unsupervised learning networks and Monte Carlo. Although some arbitrage-based pricing formulas perform well on derivatives pricing, such as Black-Scholes on option pricing, generative model-based Monte Carlo estimation (GAN-MC) is more accurate and generalizes better when training samples on derivatives are scarce, when the underlying asset’s price dynamics are unknown, or when the no-arbitrage conditions cannot be solved analytically. We analyze the variance reduction feature of our model. To validate the potential value of the pricing model, we collect real-world market derivatives data and show that our model outperforms other arbitrage-based pricing models and non-parametric machine learning models. For comparison, we estimate the price of derivatives using the Black-Scholes model, ordinary least squares, radial basis function networks, multilayer perceptron regression, projection pursuit regression and Monte Carlo only models.

2 Introduction

Financial derivatives are used for risk management, hedging, speculation, and arbitrage. A better understanding of derivatives pricing could help traders hedge against risk more effectively, and the price of derivatives could reflect fluctuations in the underlying assets. Much of the success and growth of the market for options and other derivative securities can be traced to the seminal work of Black and Scholes [1] and Merton [2], who introduced closed-form option pricing formulas through no-arbitrage conditions and dynamic hedging arguments. The celebrated Black-Scholes and Merton formulas have been generalized, extended and applied to various securities. Nicole, Monique and Steven [3] provide conditions under which the Black-Scholes formula is robust with respect to a misspecification of volatility. Wu [4] introduces fuzzy set theory to the Black-Scholes formula, which attaches a belief degree to the European option. Marcin [5] introduces a subdiffusive geometric Brownian motion for the dynamics of underlying asset prices and tests the pricing model on prices of European options. Carmona and Valdo [6] generalize the Black-Scholes formula to all dimensions through approximate formulas and provide lower and upper bounds for the hedging of multivariate contingent claims. Moreover, even when closed-form expressions are not available in some generalizations and extensions, pricing formulas can still be evaluated numerically.

However, the derivation of the pricing formula via the hedging or no-arbitrage approach, either analytically or numerically, depends heavily on the particular parametric form of the underlying asset’s price dynamics. Thus a misspecification of the stochastic process will lead to systematic pricing and hedging errors for derivatives linked to this price. Previous parametric pricing methods are therefore only as good as their ability to capture the dynamics of the underlying asset price process.

In this paper, we introduce generative model-based Monte Carlo estimation (GAN-MC) for derivatives pricing and hedging. We do not assume any specific dynamics for the underlying asset prices. We treat the asset prices simply as multivariate random variables and approximate their distribution with a neural network. We then obtain the pricing formula from the derivative’s definition and use Monte Carlo estimation for statistical stability. Compared to previous non-parametric pricing approaches such as Hutchinson [7], our model relies less on the derivatives’ regime and is more stable thanks to Monte Carlo.

To better capture the dynamics of underlying asset prices through a non-parametric approach, we introduce generative adversarial nets (GAN) [8] to approximate the distribution of underlying asset prices. GAN is a framework for estimating generative models via an adversarial process. This celebrated neural network model has been widely generalized and extended. Zhang, Goodfellow, et al. [9] propose the Self-Attention Generative Adversarial Network, which allows attention-driven, long-range dependency modeling for generation tasks. Mehdi and Simon [10] introduce Conditional GAN, which uses additional information to direct the data generation process. Chen, Lin, et al. [11] propose the Depth-image Guided GAN, which adds architectural constraints to the network and generates realistic depth maps conditioned on the input image. Martin and Soumith [12] introduce the Wasserstein GAN, which stabilizes the training process by replacing the original metric with the Wasserstein-1 distance. Chen, Duan, et al. [13] propose InfoGAN, an information-theoretic extension to the generative adversarial net that learns disentangled representations in a completely unsupervised manner by attempting to learn the conditioning information automatically.

Monte Carlo can be used for option pricing under different assumptions on the dynamics of underlying asset prices. The original approach is due to Boyle [14], who uses risk neutrality to obtain the equilibrium rate of return on underlying assets and uses Monte Carlo to improve the efficiency of estimation. Fu and Hu [15] introduce techniques for the sensitivity analysis of Monte Carlo option pricing and propose an approach for pricing options with early exercise features. Birge [16] introduces quasi-Monte Carlo sequences, which have an order of magnitude better asymptotic error rate and can be used in option pricing. Mark [17] presents several enhancements that reduce the bias as well as the variance of Monte Carlo estimators and improve the efficiency of branching-based estimators. Poirot and Tankov [18] relate the underlying asset prices to tempered stable (also known as CGMY) processes; under an appropriate equivalent probability measure a tempered stable process becomes a stable process, which provides a fast Monte Carlo algorithm for European option pricing.

Training on biased datasets has attracted increasing attention recently. Yo-whan Kim, Samarth Mishra, et al. [19] propose pre-training on synthetic video data. Compared with training directly on real video clips, the model performs better on downstream tasks when pre-trained on synthetic or biased datasets. The success of our GAN-MC model on derivatives pricing is analogous: estimation based on synthetic or fake underlying asset prices outperforms non-parametric models trained directly on real derivatives prices.

2.1 Our Contributions

We summarize the major contributions of our paper as follows:

  • We introduce generative model-based Monte Carlo estimation for derivatives pricing. We assume that the underlying asset prices follow a multivariate distribution and use GAN to approximate it. We then use Monte Carlo estimation to obtain a pricing formula for each derivative: options, forwards and futures.

  • We obtain consistent estimators for the prices of different derivatives theoretically and validate the accuracy of our pricing algorithms on real market data. Compared with arbitrage-based pricing formulas such as the Black-Scholes formula, non-parametric pricing models such as radial basis function networks, multilayer perceptron regression and projection pursuit regression, and simple models such as linear regression and Monte Carlo only models, our GAN-MC pricing model consistently achieves state-of-the-art accuracy on real market data.

We organize this paper as follows. In Section 3, we define the problem setup and some assumptions for data representation and training. In Section 4, we introduce our GAN-MC model for derivatives pricing, including options, forwards and futures. For option pricing we cover European call, European put, American call and American put options, and for forward and futures pricing we cover commodity and equity markets. In the same section we also prove that the variance of the estimator does not increase as the generated sample size grows. In Section 5, we conduct experiments to test the generated samples and the accuracy of our algorithms. Compared with other models, our GAN-MC consistently achieves state-of-the-art accuracy on real market option, futures and forward data.

3 Problem setup

Figure 1: GAN structure. (Diagram omitted. Labels: random noise $Z$ on $(\Omega_{1},F_{1},\mathbb{P}_{Y})$ with $\mathbb{R}^{n}\times\mathbb{R}^{m_{1}}$ as the generator input; generator $G$ with output $Y=G_{\theta_{g}}\circ Z\in\mathbb{R}^{m\times m}$; real data $\gamma$ on $(\Omega_{2},F_{2},\mathbb{P}_{\gamma})$; discriminator $D$ with parameters in $\mathbb{R}^{m_{2}}$ and output in $\{0,1\}$ or $\mathbb{R}$.)

The pricing methods for financial derivatives rely heavily on the prediction of underlying assets, such as stock prices. The basic idea of Monte Carlo for derivatives pricing is to generate fake stock price samples. Similarly, we denote the stock price vector $\mathbf{S}_{t,T}=\left(S_{t},S_{t+1},\cdots,S_{t+T-1}\right)^{\top}\in\mathbb{R}_{+}^{T}$ as a multivariate random variable from time $t$ to time $t+T-1$, where $\mathbf{S}_{t,T}:\Omega\to\mathbb{R}_{+}^{T}$ is a measurable function mapping from the sample space to the $T$-dimensional positive real space. Then one stock price vector on the real stock market, $\mathbf{s}_{t,T}=\left(s_{t},s_{t+1},\cdots,s_{t+T-1}\right)$, is a realization of $\mathbf{S}_{t,T}$.

For the generative adversarial nets, as shown in Figure 1, we denote $G:\mathbb{R}^{n}\times\mathbb{R}^{m_{1}}\to\mathbb{R}^{m\times m}$ as the generator mapping, where $m_{1}$ is the number of parameters of the generator network and $n$ is the dimension of the random noise $\mathbf{Z}$. We denote $D:\mathbb{R}^{T}\times\mathbb{R}^{m_{2}}\to\{0,1\}$ as the discriminator mapping, where $m_{2}$ is the number of parameters of the discriminator network, and we denote $\gamma:\Omega_{2}\rightarrow\mathbb{R}^{m\times m}$ as the real distribution mapping to image space. The training loss of GAN can then be described as $\min_{G}\max_{D}M(\mathbb{P}_{Y},\mathbb{P}_{\gamma})$, where $M(\cdot,\cdot)$ is a metric between two probability measures, $\mathbb{P}_{\gamma}$ is the probability measure of the random variable $\gamma$, and $\mathbb{P}_{Y}$ is the probability measure of the random noise after the generator mapping $G$.

We first consider the dynamic structure of $\mathbf{S}_{t,T}$.

Assumption 1 (Date Independence).

If $T$ is a relatively large number, the distribution of $\mathbf{S}_{t,T}$ does not depend on the initial time point $t$; equivalently, the distribution of $\mathbf{S}_{t,T}$ can be written as the distribution of $\mathbf{S}_{T}$.

Assumption 1 states that if $T$ is large, the high-dimensional distribution is complex enough to include all the volatility of the stock market within a period of time, such as three years. GAN is a powerful tool for learning a specific high-dimensional distribution from a training set. Thus, under this assumption, we can treat the stock price data as the 1-D real distribution ($\gamma$ in Figure 1).

Assumption 2 (Covariance Rank).

The matrix rank of $\textrm{Cov}\left(\mathbf{S}_{T}\right)$ should not be too small.

Successful training of GAN models requires that the generator is not trained on highly correlated samples, in order to avoid loss collapse. For a given $T$, the rank of $\textrm{Cov}\left(\mathbf{S}_{T}\right)$ ranges from $1$ to $T$. Assumption 2 states that the rank should not be close to 1, which guarantees that the covariance structure of stock market data is complex enough for GAN to learn and that the training process does not collapse most of the time.
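Assumption 2 can be checked numerically before training. The following is a minimal sketch, assuming the realizations of $\mathbf{S}_{T}$ are stacked as rows of a NumPy array; the function name `covariance_rank` and the relative tolerance are illustrative choices and not part of the original method.

```python
import numpy as np

def covariance_rank(windows, tol=1e-8):
    """Numerical rank of the sample covariance of length-T windows.

    windows: array of shape (num_windows, T), one realization of S_T per row.
    tol: hypothetical relative tolerance used to decide which singular
         values count as zero.
    """
    windows = np.asarray(windows, dtype=float)
    cov = np.atleast_2d(np.cov(windows, rowvar=False))  # T x T sample covariance
    threshold = tol * np.abs(cov).max()
    return int(np.linalg.matrix_rank(cov, tol=threshold))
```

A rank close to 1 suggests that the windows are nearly copies of one another, which is the situation Assumption 2 rules out.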

4 Methodology

In this section, we introduce the main algorithm, the Generative Adversarial Nets-Monte Carlo (GAN-MC) model, for derivatives pricing such as option, futures and forward pricing.

4.1 GAN-MC for Option Pricing

The theory of option pricing estimates the value of an option contract by assigning a price, known as the premium, based on the calculated probability that the contract will finish in the money (ITM) at expiration. Option pricing theory provides an evaluation of an option’s fair value, and an accurate pricing model can help traders better incorporate the option value into their strategies.

We denote a specific underlying stock price series as $\mathbf{s}_{1,n}=(s_{1},s_{2},\cdots,s_{n})$, which starts at day 1 and spans $n-1$ time steps, the continuously compounded annual risk-free interest rate as $r$, the value of a call or put option on a stock that pays no dividend as $C$ or $P$, the exercise price of the given option as $X$, the proportion of a year before the option expires as $T_{0}$, and the time unit as $\Delta t$. $T$ is a fixed parameter that satisfies Assumption 1 and Assumption 2. We set $N_{1}$ as the sample size threshold for training and $N_{2}$ as the number of generated fake samples. The proportion parameter $\alpha$ controls the weights that different $\mathbf{s}_{t,T}$ contribute to the generation. Algorithm 1 illustrates how to use GAN-MC for option pricing.

Input: $\mathbf{s}_{1,n},r,X,T_{0},T,N_{1},N_{2},\alpha$
Output: Estimated call or put option price $\widehat{C}$ or $\widehat{P}$
for $d=1,2,\cdots,T$ do
      Partition the stock price data $\mathbf{s}_{1,n}$ into a training set $\mathcal{S}^{n}_{d,T}$ according to (1);
      Train GAN on training set $\mathcal{S}^{n}_{d,T}$ and check training loss;
      if loss does not collapse and $\left|\mathcal{S}^{n}_{d,T}\right|\geq N_{1}$ then
            break
      end if
end for
for $i=1,2,\cdots,N_{2}$ do
      Generate random noise $\mathbf{Z}_{i}$ and denote $\tilde{\mathbf{S}}^{(i)}_{T}=G(\mathbf{Z}_{i})=(\tilde{s}^{(i)}_{n+1},\tilde{s}^{(i)}_{n+2},\cdots,\tilde{s}^{(i)}_{n+T})$;
      Calculate $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(i)}_{T},\mathbf{s}_{n-T,T}\right)$ between two stock prices by (2);
end for
Sort the list of similarities to form $\pi_{\alpha}=(p_{1},p_{2},\cdots,p_{N_{2}})$ such that $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(p_{i})}_{T},\mathbf{s}_{n-T,T}\right)\leq\textrm{tSim}\left(\tilde{\mathbf{S}}^{(p_{j})}_{T},\mathbf{s}_{n-T,T}\right)$ for every $1\leq i\leq j\leq N_{2}$, take $\mathcal{I}_{\alpha}=\{p_{i}:i\geq\lceil\alpha N_{2}\rceil\}$;
Calculate $\widehat{C}$ or $\widehat{P}$ accordingly;
return $\widehat{C}$ or $\widehat{P}$
Algorithm 1 GAN-MC for option pricing

First, given the historical stock price data $\mathbf{s}_{1,n}$, we need to separate the data into realizations of $\mathbf{S}_{T}$ to construct the training set for the subsequent steps. The training set $\mathcal{S}^{n}_{d,T}$ with sliding window step $d$ and given $T$ is defined as

$$\mathcal{S}^{n}_{d,T}=\left\{\mathbf{s}_{1,T},\mathbf{s}_{1+d,T},\cdots,\mathbf{s}_{1+\lfloor\frac{n-T}{d}\rfloor d,T}\right\} \tag{1}$$

Obviously, if $d$ were equal to 0, all the realizations would be identical and the training process of GAN would fail and collapse. As $d$ increases, the overlap between different training samples becomes smaller, which makes it much harder for the generator to deceive the discriminator, but the size of the training set also becomes smaller. This trade-off is handled during the iteration: we start from a small $d$ and keep checking the training loss of GAN; if the loss does not collapse and the training set is not too small, we keep the GAN model for the subsequent generation.
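The partition in (1) can be sketched as follows. This is an illustrative implementation under the assumption that the price history is a 1-D array; the helper name `build_training_set` is hypothetical, and the 1-based start indices $1, 1+d, \cdots$ of (1) become 0-based starts $0, d, \cdots$ in the code.

```python
import numpy as np

def build_training_set(prices, T, d):
    """Return the windows s_{1,T}, s_{1+d,T}, ... of Eq. (1) as rows."""
    prices = np.asarray(prices, dtype=float)
    n = prices.shape[0]
    if d < 1 or T < 1 or T > n:
        raise ValueError("need d >= 1 and 1 <= T <= n")
    num_windows = (n - T) // d + 1           # floor((n - T) / d) + 1 windows
    starts = d * np.arange(num_windows)      # 0-based starts 0, d, 2d, ...
    return np.stack([prices[s:s + T] for s in starts])
```

For example, with $n=10$, $T=4$ and $d=3$ the function returns three windows, the last one covering $s_{7},\cdots,s_{10}$. A larger $d$ yields fewer but less overlapping rows, which is the trade-off described above.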

Following the work on GAN [8], the 1-D optimization process of GAN here can be stated as

$$\min_{G}\max_{D}\ \mathbb{E}_{\mathbf{s}_{t,T}\sim p(\mathbf{S}_{T})}\left[\log D(\mathbf{s}_{t,T})\right]+\mathbb{E}_{\mathbf{Z}\sim p(\mathbf{Z})}\left[\log(1-D(G(\mathbf{Z})))\right]$$

where $p(\mathbf{S}_{T})$ is the probability distribution of $\mathbf{S}_{T}$ and $p(\mathbf{Z})$ is the probability distribution of the noise $\mathbf{Z}$. We design both the generator and the discriminator as fully connected neural networks (FCNNs) [20]. Compared with convolutional layers, dense layers can better capture the structural information in 1-dimensional data.
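To make the objective concrete, the sketch below evaluates its empirical counterpart from discriminator outputs. It assumes, as is common in practice, that the discriminator returns a probability in $(0,1)$ rather than a hard label; the function name `gan_value` and the clipping constant `eps` are illustrative and not taken from the paper.

```python
import numpy as np

def gan_value(d_real, d_fake, eps=1e-12):
    """Empirical estimate of E[log D(s)] + E[log(1 - D(G(Z)))].

    d_real: discriminator outputs in (0, 1) on real windows.
    d_fake: discriminator outputs in (0, 1) on generated windows.
    eps: hypothetical clipping constant that avoids log(0).
    """
    d_real = np.clip(np.asarray(d_real, dtype=float), eps, 1.0 - eps)
    d_fake = np.clip(np.asarray(d_fake, dtype=float), eps, 1.0 - eps)
    return float(np.mean(np.log(d_real)) + np.mean(np.log1p(-d_fake)))
```

The discriminator tries to increase this value and the generator tries to decrease it. When the discriminator cannot tell real from generated windows and outputs 0.5 everywhere, the value equals $-\log 4$, the equilibrium value derived in [8].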

After training, the generated data $\{\tilde{\mathbf{S}}^{(i)}_{T}\}_{i=1}^{N_{2}}$ can be treated as fake stock prices that follow the distribution of $\mathbf{S}_{T}$. Under Assumption 1, these fake stock prices can be treated as predictions for the following $T$ days, which means we can denote $\tilde{\mathbf{S}}^{(i)}_{T}=(\tilde{s}^{(i)}_{n+1},\tilde{s}^{(i)}_{n+2},\cdots,\tilde{s}^{(i)}_{n+T})$. However, there is endogenous variation within real-world stock prices, which makes Assumption 1 hard to satisfy completely. The price of an option should rely much more on recent stock prices, and older stock prices should contribute less to the option price. Similar to the work of Cassisi [21], we define the similarity of two time series $X=(x_{1},x_{2},\cdots,x_{n})$ and $Y=(y_{1},y_{2},\cdots,y_{n})$ by

$$\textrm{tSim}\left(X,Y\right)=\frac{1}{n}\sum_{i=1}^{n}\left(1-\frac{|x_{i}-y_{i}|}{|x_{i}|+|y_{i}|}\right) \tag{2}$$

and we calculate $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(i)}_{T},\mathbf{s}_{n-T,T}\right)$ for each $i$. Some of the generated fake stock prices will be similar to the recent stock trend, while others may look completely different and be closer to earlier stock data. We therefore rank the similarities to form an order $\pi_{\alpha}=(p_{1},p_{2},\cdots,p_{N_{2}})$ in which the similarities are increasing: for every $1\leq i\leq j\leq N_{2}$ we have $\textrm{tSim}(\tilde{\mathbf{S}}^{(p_{i})}_{T},\mathbf{s}_{n-T,T})\leq\textrm{tSim}(\tilde{\mathbf{S}}^{(p_{j})}_{T},\mathbf{s}_{n-T,T})$. For a proportion parameter $\alpha\in(0,1)$, we take the generated fake stock prices that are close to the recent market trend, $\mathcal{I}_{\alpha}=\{p_{i}:i\geq\lceil\alpha N_{2}\rceil\}$, as the candidate sample index set for Monte Carlo.
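The similarity (2) and the selection of the index set $\mathcal{I}_{\alpha}$ can be sketched as below. The function names `t_sim` and `select_candidates` are hypothetical, the generated paths are assumed to be rows of an array, and ties in similarity are broken by original order, which the text does not specify.

```python
import math
import numpy as np

def t_sim(x, y):
    """Similarity of Eq. (2); assumes |x_i| + |y_i| > 0 for every i."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return float(np.mean(1.0 - np.abs(x - y) / (np.abs(x) + np.abs(y))))

def select_candidates(generated, recent, alpha):
    """Return 0-based indices of the generated paths kept for Monte Carlo.

    Paths are sorted by increasing similarity to `recent`, and the sorted
    positions i >= ceil(alpha * N2) (1-based) are kept, as in the text.
    """
    generated = np.asarray(generated, dtype=float)
    n2 = generated.shape[0]
    sims = np.array([t_sim(path, recent) for path in generated])
    order = np.argsort(sims, kind="stable")      # p_1, ..., p_{N2}
    first_kept = max(math.ceil(alpha * n2), 1)   # 1-based sorted position
    return order[first_kept - 1:]
```

With this convention the number of retained paths is $N_{2}-\lceil\alpha N_{2}\rceil+1$, so a larger $\alpha$ keeps fewer paths, and only those most similar to the recent window.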

The basic theory of option pricing relies on risk-neutral valuation. According to the original Monte Carlo method used for European option pricing [14], the contract holder can purchase the stock at a future date, $\frac{T_{0}}{\Delta t}$ time units ahead, at a price $X$ agreed upon in the contract. The payoff function of a call option can be stated as $f(S)=\max(S-X,0)$, where $S$ is the stock price at the expiration date. We require the expected payoff to equal the compound total return obtained by investing the option premium $C$. For a European call option,

$$\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}f(\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}})=(1+r\Delta t)^{\frac{T_{0}}{\Delta t}}C$$

and solving the equation we get the GAN-based Monte Carlo estimation for the call option

$$\widehat{C}=\left(1+r\Delta t\right)^{-\frac{T_{0}}{\Delta t}}\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\max\left(\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}}-X,0\right) \tag{3}$$

The only difference for put option pricing lies in the payoff function. A put option is a contract giving the option buyer the right to sell a specified amount of an underlying security, so the payoff function for a put option is $f(S)=\max(X-S,0)$. We get the GAN-based Monte Carlo estimation for the European put option

$$\widehat{P}=\left(1+r\Delta t\right)^{-\frac{T_{0}}{\Delta t}}\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\max\left(X-\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}},0\right) \tag{4}$$
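The two European estimators can be computed directly from the selected terminal prices. The sketch below assumes that the generated prices at index $n+\frac{T_{0}}{\Delta t}$ for the indices in $\mathcal{I}_{\alpha}$ have already been collected into an array and that $\Delta t$ is expressed in years; the function name `european_prices` is illustrative.

```python
import numpy as np

def european_prices(terminal_prices, strike, r, T0, dt):
    """GAN-MC estimates of Eqs. (3) and (4) from selected terminal prices.

    terminal_prices: generated prices at expiration for the indices in I_alpha.
    r: annual risk-free rate; T0: time to expiry in years;
    dt: time unit in years (an assumption, e.g. 1/252 for trading days).
    """
    s = np.asarray(terminal_prices, dtype=float)
    discount = (1.0 + r * dt) ** (-T0 / dt)
    call = discount * np.mean(np.maximum(s - strike, 0.0))
    put = discount * np.mean(np.maximum(strike - s, 0.0))
    return float(call), float(put)
```

Because $\max(S-X,0)-\max(X-S,0)=S-X$, the two estimates always differ by the discounted mean of $\tilde{s}-X$ over $\mathcal{I}_{\alpha}$, which gives a quick consistency check on an implementation.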

For an American option, since the contract allows holders to exercise their right at any time up to and including the expiration date, the equation for the call option becomes

$$C\leq\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}f(\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}})\leq(1+r\Delta t)^{\frac{T_{0}}{\Delta t}}C$$

and we get the lower and upper bounds for the American call option

$$\left(1+r\Delta t\right)^{-\frac{T_{0}}{\Delta t}}\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\max\left(\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}}-X,0\right)\leq\widehat{C}\leq\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\max\left(\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}}-X,0\right)$$

Similarly, for the American put option

$$\left(1+r\Delta t\right)^{-\frac{T_{0}}{\Delta t}}\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\max\left(X-\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}},0\right)\leq\widehat{P}\leq\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\max\left(X-\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}},0\right)$$

After obtaining the lower and upper bounds, we take the average of the two bounds as the final pricing formula for the American option.
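The averaging of the two bounds can be sketched as follows. The code assumes that the put bounds use the put payoff $\max(X-S,0)$ on both sides, that $\Delta t$ is in years, and that the selected terminal prices are already available; the function name `american_price` is hypothetical.

```python
import numpy as np

def american_price(terminal_prices, strike, r, T0, dt, kind="call"):
    """Lower bound, upper bound and their midpoint for an American option."""
    s = np.asarray(terminal_prices, dtype=float)
    if kind == "call":
        payoff = np.maximum(s - strike, 0.0)
    elif kind == "put":
        payoff = np.maximum(strike - s, 0.0)
    else:
        raise ValueError("kind must be 'call' or 'put'")
    upper = float(np.mean(payoff))                  # undiscounted mean payoff
    lower = (1.0 + r * dt) ** (-T0 / dt) * upper    # discounted mean payoff
    return lower, upper, 0.5 * (lower + upper)
```

The lower bound coincides with the European estimate, so the midpoint exceeds the European price by half of the discounting gap whenever $r>0$ and the mean payoff is positive.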

One significant advantage of Monte Carlo estimation methods is that the variance of the estimator decreases as the sample size increases. Lower variance is associated with lower risk for investors. This advantage is stated in the following theorem.

Theorem 1.

Given $r,T_{0},\alpha,X$, $\textrm{Var}(\widehat{C})$ or $\textrm{Var}(\widehat{P})$ does not increase as $N_{2}$ increases.

Proof.

We design the generator as a fully connected network consisting of dense layers and activation layers, thus $G$ is a continuous function. $\mathbf{Z}$ is random noise, and we generate $\{\mathbf{Z}_{1},\mathbf{Z}_{2},\cdots,\mathbf{Z}_{N_{2}}\}$ as independent and identically distributed random variables. If we denote $G(\mathbf{Z}_{i})_{j}$ as the $j$-th element of $G(\mathbf{Z}_{i})$, then the random variables $\max(G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}}-X,0)$ and $\max(X-G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}},0)$ are continuous functions of the independent variables $\{\mathbf{Z}_{i}\}_{i=1}^{N_{2}}$. Therefore $\{\max(G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}}-X,0)\}_{i=1}^{N_{2}}$ are i.i.d. random variables, and so are $\{\max(X-G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}},0)\}_{i=1}^{N_{2}}$. We know that $\mathcal{I}_{\alpha}\subseteq[N_{2}]$, and we take $\sigma^{2}_{1}\coloneqq\textrm{Var}(\max(G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}}-X,0))$ and $\sigma^{2}_{2}\coloneqq\textrm{Var}(\max(X-G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}},0))$; then the variance of the estimated call and put option prices can be stated as

$$\textrm{Var}(\widehat{C})\propto\frac{\sigma^{2}_{1}}{|\mathcal{I}_{\alpha}|}\quad\textrm{Var}(\widehat{P})\propto\frac{\sigma^{2}_{2}}{|\mathcal{I}_{\alpha}|}$$

As $N_{2}$ increases, the size of the set $\mathcal{I}_{\alpha}$ increases or stays the same. So if we keep $r,T_{0},\alpha,X$ unchanged, $\textrm{Var}(\widehat{C})$ or $\textrm{Var}(\widehat{P})$ decreases or stays the same. ∎
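The $1/|\mathcal{I}_{\alpha}|$ scaling can be illustrated with a small simulation. The sketch below is not the trained GAN: a lognormal draw stands in for the generator output, the similarity-based selection step is omitted, and the function name `estimator_variance` and all numeric defaults are assumptions made for illustration.

```python
import numpy as np

def estimator_variance(n_samples, n_repeats=2000, strike=100.0, seed=0):
    """Empirical variance of the mean call payoff over repeated experiments.

    A lognormal draw stands in for the generator output G(Z); this is an
    assumption for illustration only, not the trained GAN.
    """
    rng = np.random.default_rng(seed)
    prices = 100.0 * rng.lognormal(mean=0.0, sigma=0.2, size=(n_repeats, n_samples))
    payoffs = np.maximum(prices - strike, 0.0)
    # Each row is one Monte Carlo experiment; take the variance of row means.
    return float(np.var(payoffs.mean(axis=1)))
```

Comparing `estimator_variance(100)` with `estimator_variance(400)` shows the variance shrinking roughly in proportion to the sample size, as the i.i.d. argument in the proof predicts.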

4.2 GAN-MC for Forward and Futures Pricing

A forward contract is a customized contract between two parties to buy or sell an asset at a specified price on a future date, and a futures contract is a standardized legal contract to buy or sell the underlying assets at a predetermined price for delivery at a specified time in the future. A forward contract is similar to a futures contract, but settlement of a forward contract takes place at the end of the contract, whereas futures settle on a daily basis. The underlying asset transacted is usually a commodity or financial instrument. Depending on the underlying commodities, securities, currencies or intangibles such as interest rates and stock indexes, forwards or futures can be categorized into markets such as the foreign exchange market, bond market, equity market and commodity market. Here we focus our pricing model on the equity market and the commodity market. The valuation of equity forwards or futures originates from a single stock, a customized basket of stocks or an index of stocks, while the valuation of commodity forwards or futures depends on the cost of carry during the interim before delivery.

4.2.1 Equity Market

Similar to the setup in Section 4.1, we denote the underlying stock or index price as $\mathbf{s}_{1,n}=(s_{1},s_{2},\cdots,s_{n})$, the value of the equity forward or futures as $F^{\textrm{eq}}$, the historical data set of annual dividend per share of this stock as $\{D(t)\}_{t=1}^{n}$, the proportion of a year before the delivery date as $T_{0}$, and the time unit as $\Delta t$. The other parameters $r,T,N_{1},N_{2},\alpha$ are defined as in Section 4.1. We introduce Algorithm 2 to use GAN-MC for equity forward or futures pricing.

Input: $\mathbf{s}_{1,n},r,\{D(t)\}_{t=1}^{n},T_{0},T,N_{1},N_{2},\alpha$
Output: Estimated equity futures or forward price $\widehat{F}^{\textrm{eq}}$
for $d=1,2,\cdots,T$ do
      Partition the stock price data $\mathbf{s}_{1,n}$ into a training set $\mathcal{S}^{n}_{d,T}$ according to (1);
      Train GAN on training set $\mathcal{S}^{n}_{d,T}$ and check training loss;
      if loss does not collapse and $\left|\mathcal{S}^{n}_{d,T}\right|\geq N_{1}$ then
            break
      end if
end for
for $i=1,2,\cdots,N_{2}$ do
      Generate random noise $\mathbf{Z}_{i}$ and denote $\tilde{\mathbf{S}}^{(i)}_{T}=G(\mathbf{Z}_{i})=(\tilde{s}^{(i)}_{n+1},\tilde{s}^{(i)}_{n+2},\cdots,\tilde{s}^{(i)}_{n+T})$;
      Calculate $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(i)}_{T},\mathbf{s}_{n-T,T}\right)$ between two stock or index prices by (2);
end for
Sort the list of similarities to form $\pi_{\alpha}=(p_{1},p_{2},\cdots,p_{N_{2}})$ such that $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(p_{i})}_{T},\mathbf{s}_{n-T,T}\right)\leq\textrm{tSim}\left(\tilde{\mathbf{S}}^{(p_{j})}_{T},\mathbf{s}_{n-T,T}\right)$ for every $1\leq i\leq j\leq N_{2}$, take $\mathcal{I}_{\alpha}=\{p_{i}:i\geq\lceil\alpha N_{2}\rceil\}$;
Fit a linear model $D(t)=at+b+\epsilon_{t}$ on the set $\{D(t)\}_{t=1}^{n}$, where $\{\epsilon_{t}\}_{t=1}^{n}$ is the set of random noise, and predict the annual dividend per share by $\widehat{D}(n+\frac{T_{0}}{\Delta t})=\hat{a}(n+\frac{T_{0}}{\Delta t})+\hat{b}$;
Calculate
$$\widehat{F}^{\textrm{eq}}=s_{n}\cdot\exp{\left\{\left(r-\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\frac{\widehat{D}(n+\frac{T_{0}}{\Delta t})}{\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}}}\right)T_{0}\right\}} \tag{5}$$
return $\widehat{F}^{\textrm{eq}}$
Algorithm 2 GAN-MC for equity futures/forward pricing

The first step for equity market pricing is the same as for option pricing. We need to partition the stock or index data into a training set $\mathcal{S}^{n}_{d,T}$ with a proper sliding window step $d$, so that the GAN can be trained successfully with the given $T$. As before, the stock or index price at different times $t$ contributes differently to forward or futures pricing. We again use the rank of similarities between the generated fake stock or index data $\{\tilde{\mathbf{S}}^{(i)}_{T}\}_{i=1}^{N_{2}}$ and $\mathbf{s}_{n-T,T}$ to control the effect of training samples from different times. Equity forward or futures prices are usually quoted by exchanges in the same way as equity prices in the underlying cash market. A pricing model is mainly used to calculate risk for a futures contract, although it is used for computing both price and risk for a forward. The theoretical value of an equity forward or futures depends on the dividend model assumption [22]. Under the dividend yield assumption, the theoretical equity forward or futures price is given by

$$F^{\textrm{eq}}_{\tau}=s_{t}\exp{\left\{\left(r-\frac{D(\tau)}{s_{\tau}}\right)(\tau-t)\right\}} \tag{6}$$

where $F^{\textrm{eq}}_{\tau}$ denotes the forward or futures price at delivery date $\tau$, $s_{t}$ and $s_{\tau}$ denote the stock or index price at times $t$ and $\tau$ respectively, and $D(\tau)$ is the annual dividend per share at time $\tau$.

Given the historical annual dividend per share data $\{D(t)\}_{t=1}^{n}$, we need to predict the annual dividend per share at time $n+\frac{T_{0}}{\Delta t}$, which is the delivery date for the equity forward or futures. There are many methods for prediction, and here we consider only the simple linear model

$$D(t)=at+b+\epsilon_{t}\quad\epsilon_{t}\overset{\text{i.i.d.}}{\sim}\mathcal{N}(0,\sigma^{2})\quad t\in[n]$$

where $\epsilon_{t}$ denotes random noise. We use least squares estimation to get $\hat{a}=\frac{\sum_{t=1}^{n}(t-\bar{t})(D(t)-\overline{D(t)})}{\sum_{t=1}^{n}(t-\bar{t})^{2}}$ and $\hat{b}=\overline{D(t)}-\hat{a}\bar{t}$, where $\bar{t}=\frac{1}{n}\sum_{t=1}^{n}t$ and $\overline{D(t)}=\frac{1}{n}\sum_{t=1}^{n}D(t)$. We therefore predict the annual dividend per share at the delivery date as $\widehat{D}(n+\frac{T_{0}}{\Delta t})=\hat{a}(n+\frac{T_{0}}{\Delta t})+\hat{b}$.

Given the annual dividend per share, the dividend yield estimated by GAN-based Monte Carlo is $\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\frac{\widehat{D}(n+\frac{T_{0}}{\Delta t})}{\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}}}$. Substituting it into the equity futures pricing formula (6), we get the GAN-based Monte Carlo estimation formula (5).
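The dividend forecast and the resulting forward price estimate can be sketched in a few lines. The code assumes that the dividend history has at least two observations, that the selected generated prices at the delivery index are already available, and that $\Delta t$ is in years; the function name `equity_forward_price` is illustrative, and `np.polyfit` is used as one standard way to obtain the least squares estimates $\hat{a}$ and $\hat{b}$.

```python
import numpy as np

def equity_forward_price(s_n, r, dividends, terminal_prices, T0, dt):
    """GAN-MC estimate of Eq. (5).

    dividends: historical annual dividend per share D(1), ..., D(n).
    terminal_prices: generated prices at index n + T0/dt for the indices
                     in I_alpha (assumed strictly positive).
    """
    d_hist = np.asarray(dividends, dtype=float)
    n = d_hist.shape[0]
    t = np.arange(1, n + 1, dtype=float)
    a_hat, b_hat = np.polyfit(t, d_hist, deg=1)   # least-squares slope, intercept
    d_pred = a_hat * (n + T0 / dt) + b_hat        # predicted dividend at delivery
    s = np.asarray(terminal_prices, dtype=float)
    dividend_yield = np.mean(d_pred / s)          # Monte Carlo dividend yield
    return float(s_n * np.exp((r - dividend_yield) * T0))
```

Two sanity checks follow from the formula: with no dividends the estimate reduces to $s_{n}e^{rT_{0}}$, and when the estimated dividend yield equals $r$ the forward price equals the spot price $s_{n}$.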

As with our method for option pricing, the variance of the estimated prices can also be reduced.

Theorem 2.

Given $r,T_{0},\alpha,\{D(t)\}_{t=1}^{n}$, $\textrm{Var}(\widehat{F}^{\textrm{eq}})$ does not increase as $N_{2}$ increases.

Proof.

We design the generator as a fully connected network consisting of dense layers and activation layers, thus $G$ is a continuous function. $\mathbf{Z}$ is random noise, and we generate $\{\mathbf{Z}_{1},\mathbf{Z}_{2},\cdots,\mathbf{Z}_{N_{2}}\}$ as independent and identically distributed random variables. Hence, if $\{D(t)\}_{t=1}^{n}$ is fixed, the random variables $\{X_{i}=\frac{\widehat{D}(n+\frac{T_{0}}{\Delta t})}{G(\mathbf{Z}_{i})_{n+\frac{T_{0}}{\Delta t}}}\}_{i\in\mathcal{I}_{\alpha}}$ are independent and identically distributed, because each $X_{i}$ is a continuous function of the corresponding $\mathbf{Z}_{i}$ and the $\{\mathbf{Z}_{i}\}_{i=1}^{N_{2}}$ are independent. We denote by $\mu_{0}$ the mean of $X_{i}$ and by $\sigma^{2}_{0}$ the variance of $X_{i}$, then

$$\mathbb{E}\left(\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}X_{i}\right)=\mu_{0},\quad\textrm{Var}\left(\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}X_{i}\right)=\frac{1}{|\mathcal{I}_{\alpha}|}\sigma^{2}_{0}$$

As $N_{2}$ increases, with $\mathcal{I}_{\alpha}\subseteq[N_{2}]$, $|\mathcal{I}_{\alpha}|$ increases or stays the same, so $\textrm{Var}(\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}X_{i})$ does not increase. If $\mathbb{E}(\widehat{F}^{\textrm{eq}})$ remains unchanged, $\textrm{Var}(\widehat{F}^{\textrm{eq}})$ decreases or stays the same. ∎
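The $\sigma_{0}^{2}/|\mathcal{I}_{\alpha}|$ scaling used in the proof can be checked numerically. The sketch below is only an illustration under stated assumptions: the generated prices are replaced by independent lognormal draws (a stand-in, not the GAN generator), the dividend forecast is fixed at 2.0, and the helper name `variance_of_mean` is hypothetical.

```python
import random
import statistics


def variance_of_mean(sample_size, n_repeats=1000, seed=0):
    """Empirical variance of the average of `sample_size` i.i.d. draws of X_i."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_repeats):
        # X_i = D_hat / s_i, with s_i a positive stand-in for a generated price
        draws = [2.0 / rng.lognormvariate(0.0, 0.2) for _ in range(sample_size)]
        means.append(sum(draws) / sample_size)
    return statistics.variance(means)


if __name__ == "__main__":
    for m in (50, 100, 200):
        print(m, variance_of_mean(m))
```

Under these assumptions the printed variance shrinks roughly in proportion to $1/m$ as the number of averaged samples $m$ grows.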

4.2.2 Commodity Market

For commodity forward contracts or futures, the underlying asset can usually be classified as food, energy or materials. Similar to the parameter setup in section 4.1, we denote the underlying commodity spot price as $\mathbf{s}_{1,n}=(s_{1},s_{2},\cdots,s_{n})$, the average cost of carry from time $t$ to $\tau$ as $P(t,\tau)$, the historical commodity forward contract or futures prices as $\{F^{\textrm{co}}_{t}\}_{t=n-N_{3}}^{n}$, the proportion of a year before the delivery date as $T_{0}$ and the time unit as $\Delta t$. The other parameters $r,T,N_{1},N_{2},\alpha$ are defined as in section 4.1. We introduce algorithm 3 to apply GAN-MC to commodity forward or futures pricing.

Input: $\mathbf{s}_{1,n},r,\{F^{\textrm{co}}_{t}\}_{t=n-N_{3}}^{n},T_{0},T,N_{1},N_{2},N_{3},\alpha$
Output: Estimated commodity forward or futures price $\widehat{F}^{\textrm{co}}$
for $d=1,2,\cdots,T$ do
    Partition the spot price data $\mathbf{s}_{1,n}$ into a training set $\mathcal{S}^{n}_{d,T}$ according to (1);
    Train GAN on training set $\mathcal{S}^{n}_{d,T}$ and check training loss;
    if loss does not collapse and $\left|\mathcal{S}^{n}_{d,T}\right|\geq N_{1}$ then
        break
    end if
end for
for $i=1,2,\cdots,N_{2}$ do
    Generate random noise $\mathbf{Z}_{i}$ and denote $\tilde{\mathbf{S}}^{(i)}_{T}=G(\mathbf{Z}_{i})=(\tilde{s}^{(i)}_{n+1},\tilde{s}^{(i)}_{n+2},\cdots,\tilde{s}^{(i)}_{n+T})$;
    Calculate $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(i)}_{T},\mathbf{s}_{n-T,T}\right)$ between the two spot price tracks by (2);
end for
Sort the list of similarities to form $\pi_{\alpha}=(p_{1},p_{2},\cdots,p_{N_{2}})$ such that $\textrm{tSim}\left(\tilde{\mathbf{S}}^{(p_{i})}_{T},\mathbf{s}_{n-T,T}\right)\leq\textrm{tSim}\left(\tilde{\mathbf{S}}^{(p_{j})}_{T},\mathbf{s}_{n-T,T}\right)$ for every $1\leq i\leq j\leq N_{2}$, and take $\mathcal{I}_{\alpha}=\{p_{i}:i\geq\lceil\alpha N_{2}\rceil\}$;
Estimate the cost of carry $\widehat{P}\left(n,n+\frac{T_{0}}{\Delta t}\right)$ by formula (7);
Calculate $\widehat{F}^{\textrm{co}}$ from formula (8);
return $\widehat{F}^{\textrm{co}}$
Algorithm 3: GAN-MC for commodity futures/forward pricing
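The sorting and selection step of algorithm 3 can be sketched as follows. This is an illustrative sketch only: the similarity measure is passed in as a function because formula (2) is defined elsewhere in the paper, and `neg_squared_distance` is a hypothetical stand-in used purely to make the example runnable, not the tSim measure itself.

```python
import math


def neg_squared_distance(track, reference):
    """Hypothetical stand-in similarity: larger means closer (NOT formula (2))."""
    return -sum((a - b) ** 2 for a, b in zip(track, reference))


def select_most_similar(tracks, reference, similarity, alpha):
    """Return the index set I_alpha.

    Tracks are sorted by ascending similarity to `reference`; the positions
    i >= ceil(alpha * N2) (1-based) of the sorted order are kept.
    """
    n2 = len(tracks)
    scores = [similarity(track, reference) for track in tracks]
    order = sorted(range(n2), key=lambda i: scores[i])   # p_1, ..., p_N2
    start = max(math.ceil(alpha * n2), 1)                # 1-based cut-off
    return order[start - 1:]


if __name__ == "__main__":
    tracks = [[0, 0], [2, 2], [10, 10], [3, 3]]
    kept = select_most_similar(tracks, [2, 2], neg_squared_distance, alpha=0.5)
    print(kept)
```

The kept indices would then be used to average the generated prices at the delivery date, as in formula (8).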

Similar to the equity pricing formula (6), the theoretical commodity forward price is based on the current spot price plus the cost of carry during the interim before delivery [1]. The price of a simple commodity forward contract or futures can be expressed as

$$F_{\tau}^{\textrm{co}}=(s_{t}+P(t,\tau))\exp\{r(\tau-t)\}$$

where $F^{\textrm{co}}_{\tau}$ denotes the forward or futures price at delivery date $\tau$ and $s_{t}$ denotes the commodity spot price at time $t$. If we replace the term $s_{t}\exp\{r(\tau-t)\}$ with $s_{\tau}$, the price of the commodity forward or futures becomes $F^{\textrm{co}}_{\tau}=s_{\tau}+P(t,\tau)\exp\{r(\tau-t)\}$. Following the notation in the algorithms, to price the commodity forward or futures at time $n+\frac{T_{0}}{\Delta t}$ we use the empirical estimate of $P\left(n,n+\frac{T_{0}}{\Delta t}\right)$

$$\widehat{P}\left(n,n+\frac{T_{0}}{\Delta t}\right)=\frac{1}{N_{3}+1}\sum_{j=0}^{N_{3}}\left[\frac{F^{\textrm{co}}_{n-j}}{\exp(rT_{0})}-s_{n-j}\right]\tag{7}$$

where $N_{3}+1$ is the sample size for empirical estimation. We then use GAN-MC to estimate $\tilde{s}_{n+\frac{T_{0}}{\Delta t}}$. By analogy with the equity futures or forward pricing formula (5), the commodity futures or forward pricing formula is

$$\widehat{F}^{\textrm{co}}=\frac{1}{|\mathcal{I}_{\alpha}|}\sum_{i\in\mathcal{I}_{\alpha}}\tilde{s}^{(i)}_{n+\frac{T_{0}}{\Delta t}}+\widehat{P}\left(n,n+\frac{T_{0}}{\Delta t}\right)\exp(rT_{0})\tag{8}$$
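Formulas (7) and (8) translate directly into code. The sketch below is a minimal illustration, assuming the historical forward and spot prices are supplied as paired lists of length $N_{3}+1$ and the selected generated prices at the delivery date are already available; the function names are hypothetical.

```python
import math


def estimate_cost_of_carry(forward_prices, spot_prices, r, t0):
    """Formula (7): average of F / exp(r * T0) - s over paired observations."""
    if not forward_prices or len(forward_prices) != len(spot_prices):
        raise ValueError("forward and spot histories must be non-empty and paired")
    growth = math.exp(r * t0)
    diffs = [f / growth - s for f, s in zip(forward_prices, spot_prices)]
    return sum(diffs) / len(diffs)


def commodity_forward_price(selected_terminal_prices, cost_of_carry, r, t0):
    """Formula (8): mean selected generated price plus compounded cost of carry."""
    if not selected_terminal_prices:
        raise ValueError("need at least one selected generated price")
    mean_spot = sum(selected_terminal_prices) / len(selected_terminal_prices)
    return mean_spot + cost_of_carry * math.exp(r * t0)
```

With $r=0$, forwards `[105.0, 103.0]` and spots `[100.0, 100.0]`, the estimated cost of carry is the plain average spread between forward and spot prices.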

Similar to the variance reduction theorem in equity forward or futures pricing, we claim

Theorem 3.

Given $r,T_{0},\alpha,\{F^{\textrm{co}}_{t}\}_{t=n-N_{3}}^{n}$ fixed, $\textrm{Var}(\widehat{F}^{\textrm{co}})$ does not increase as $N_{2}$ increases.

5 Experiments

5.1 Stock and Index Price Prediction

The accuracy of Monte Carlo pricing models depends heavily on the accuracy and variation of the predictions of the underlying asset prices. Equations (3), (4), (5) and (8) show that the pricing results depend directly on the generated future prices. Thus, before testing our pricing models on real-world market data, we first examine the price tracks generated after GAN training.

Figure 2: Stock and index prediction for the following 80 days. The upper figure shows prediction results for TSLA, the blue line denotes the real stock price of TSLA from 11/16/2021 to 3/11/2022, the red dotted line and green dotted line denote two generated samples from generator GG. The lower figure shows prediction results for index S&P 500, the blue line denotes real index price from 2/9/2022 to 6/3/2022, the red dotted line and green dotted line indicate two generated samples from generator GG.

We collect TSLA historical daily stock prices from 3/22/2019 to 4/11/2022 and S&P 500 index prices from 6/14/2019 to 7/6/2022 for training and prediction. Excluding all holidays and weekends, there are 771 daily observations in total. We set $n$ in $\mathbf{s}_{1,n}$ to 670, the dimension of the generator output $T$ to 128, the sample size threshold $N_{1}$ to 270, the number of generated samples $N_{2}$ to 1600 and the proportion parameter $\alpha$ to 0.8. All the financial data is collected from the Bloomberg database.

We follow the price generation procedures in algorithm 1 and algorithm 2. We then form the index set $\mathcal{I}_{\alpha}$ and pick two generated price tracks as predictions for comparison with real market data. As shown in figure 2, the red and green dotted lines indicate two generated samples $G(\mathbf{Z}_{i}),G(\mathbf{Z}_{j})$ where $i,j\in\mathcal{I}_{\alpha}$. Overall, the generated price tracks are more volatile than the real prices, with sharp decreases and increases between consecutive days. In the TSLA stock price prediction, the generated samples are distributed around the real stock prices. For pricing, the variation between different generations would make the Monte Carlo estimation more accurate. An interesting observation in the S&P 500 index prediction is that the two generated tracks share the same trend, which indicates a strong linear correlation. The trends are similar, but the predicted index prices are not identical on each day. The prediction experiments show that our GAN can generate suitable samples for pricing.

5.2 Option Pricing

Before showing our experiments on option pricing, we first introduce other basic models for option pricing.

5.2.1 Black–Scholes

The Black-Scholes model, also known as the Black-Scholes-Merton (BSM) model, estimates the theoretical value of derivatives based on other investment instruments, taking into account the impact of time and other risk factors. Introduced by Black and Scholes [23], the Black-Scholes model for option pricing assumes that no dividends are paid out during the life of the option, markets are random, there are no transaction costs in buying the option, the risk-free rate and volatility of the underlying asset are known and constant, the returns of the underlying asset are normally distributed and the option can only be exercised at expiration. We denote $C$ as the call option price and $N(x)$ as the standard normal cumulative distribution function, $N(x)=\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x}e^{-z^{2}/2}dz$. We further denote $X$ as the strike price, $s_{t}$ as the spot price at time $t$, $T_{1}$ as the date of option expiration, $r$ as the annual risk-free interest rate and $\sigma$ as the standard deviation of the stock's returns, which is known as volatility. The formula for the call option is

$$C=N(d_{1})s_{t}-N(d_{2})Xe^{-r(T_{1}-t)}$$

where

$$\begin{aligned}d_{1}&=\frac{1}{\sigma\sqrt{T_{1}-t}}\left[\log\left(\frac{s_{t}}{X}\right)+\left(r+\frac{\sigma^{2}}{2}\right)(T_{1}-t)\right]\\d_{2}&=d_{1}-\sigma\sqrt{T_{1}-t}\end{aligned}$$

The formula for the put option is

$$P=N(-d_{2})Xe^{-r(T_{1}-t)}-N(-d_{1})s_{t}$$

where $d_{1},d_{2}$ are defined as for the call option.
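The call and put formulas can be evaluated with the Python standard library alone. The following is a minimal sketch for illustration, not the Bloomberg pricing tool used in the experiments; the function names are hypothetical and `time_to_expiry` stands for $T_{1}-t$ measured in years.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def black_scholes(spot, strike, r, sigma, time_to_expiry):
    """Return (call, put) prices of a European option without dividends."""
    if min(spot, strike, sigma, time_to_expiry) <= 0:
        raise ValueError("spot, strike, sigma and time_to_expiry must be positive")
    vol_sqrt_t = sigma * math.sqrt(time_to_expiry)
    d1 = (math.log(spot / strike)
          + (r + 0.5 * sigma ** 2) * time_to_expiry) / vol_sqrt_t
    d2 = d1 - vol_sqrt_t
    discounted_strike = strike * math.exp(-r * time_to_expiry)
    call = norm_cdf(d1) * spot - norm_cdf(d2) * discounted_strike
    put = norm_cdf(-d2) * discounted_strike - norm_cdf(-d1) * spot
    return call, put


if __name__ == "__main__":
    print(black_scholes(100.0, 100.0, 0.05, 0.2, 1.0))
```

A quick sanity check is put-call parity: the difference between the call and put prices should equal the spot price minus the discounted strike.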

In implementing Black-Scholes formula, we actually use the pricing tools in Bloomberg where volatility is automatically calculated according to market data.

5.2.2 Radial Basis Function Network

As mentioned in the work by Hutchinson, Lo and Poggio [7], the Radial Basis Function (RBF) network can be used for data fitting [24]. If we replace the Euclidean norm with a matrix norm, the fitted function can be expressed as

$$\hat{f}(x)=\sum_{i=1}^{k}c_{i}h_{i}((x-z_{i})^{\top}W^{\top}W(x-z_{i}))+\alpha_{0}+\alpha_{1}^{\top}x$$

where $W,c_{i},z_{i},\alpha_{0},\alpha_{1}$ are the parameters to be optimized, and $h_{i}$ is the basis function, which can be either Gaussian or multiquadric. In the pricing model we add one more sigmoid layer as output. The augmented network takes the form $g(\hat{f}(x))$ where $g(u)=1/(1+e^{-u})$.

In implementing the RBF network, we use the multiquadric basis function and set the model input to $(s/X,1,T_{1}-t)$, i.e. we fit $\hat{g}(s/X,1,T_{1}-t)$, where $s$ is the stock price, $X$ is the strike price and $T_{1}-t$ is the time until maturity.
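A forward pass of the augmented RBF network $g(\hat{f}(x))$ can be sketched as below. This is an illustration under assumptions rather than the trained model: the multiquadric is taken as $h(u)=\sqrt{u+c^{2}}$ with a hypothetical shape parameter `c`, all parameters are supplied by the caller, and no fitting procedure is shown.

```python
import math


def multiquadric(u, c=1.0):
    """Multiquadric basis on a squared distance u; shape parameter c is assumed."""
    return math.sqrt(u + c * c)


def rbf_predict(x, centers, weights, W, alpha0, alpha1, basis=multiquadric):
    """Evaluate g(f_hat(x)) for the RBF network with a sigmoid output layer."""
    # Linear part: alpha_0 + alpha_1^T x
    total = alpha0 + sum(a * xi for a, xi in zip(alpha1, x))
    for c_i, z_i in zip(weights, centers):
        diff = [xi - zi for xi, zi in zip(x, z_i)]
        w_diff = [sum(w * d for w, d in zip(row, diff)) for row in W]
        sq_norm = sum(v * v for v in w_diff)     # (x - z)^T W^T W (x - z)
        total += c_i * basis(sq_norm)
    return 1.0 / (1.0 + math.exp(-total))        # sigmoid g(u)
```

Swapping `basis` for a Gaussian function of the squared distance gives the other variant mentioned in the text.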

5.2.3 Multilayer Perceptrons Regression

Also following Hutchinson, Lo and Poggio [7], multilayer perceptrons (MLPs) [25] are classical methods for high-dimensional regression. Built from fully connected layers, a general MLP with univariate output can be written as

$$\hat{f}(x)=h\left(\sum_{i=1}^{k}\delta_{i}h(\beta_{0i}+\beta_{1i}^{\top}x)+\delta_{0}\right)$$

where $h(\cdot)$ is the sigmoid function, and $\delta_{i},\delta_{0},\beta_{0i},\beta_{1i}$ are the parameters to be optimized. Unlike in the RBF network, the nonlinear function $h$ in an MLP is usually fixed for the entire network.

In implementing MLP regression, we set the model input to $(s/X,1,T_{1}-t)$, i.e. we fit $\hat{f}(s/X,1,T_{1}-t)$, where $s$ is the stock price, $X$ is the strike price and $T_{1}-t$ is the time until maturity.
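The single-hidden-layer formula can be evaluated directly, as in the following sketch. It only illustrates the forward pass with caller-supplied parameters; the function names are hypothetical and training is not shown.

```python
import math


def sigmoid(u):
    """Sigmoid nonlinearity h(u)."""
    return 1.0 / (1.0 + math.exp(-u))


def mlp_predict(x, hidden_biases, hidden_weights, out_weights, out_bias):
    """Evaluate h(sum_i delta_i * h(beta_0i + beta_1i^T x) + delta_0)."""
    hidden = [
        sigmoid(b0 + sum(w * xi for w, xi in zip(b1, x)))
        for b0, b1 in zip(hidden_biases, hidden_weights)
    ]
    return sigmoid(out_bias + sum(d * h for d, h in zip(out_weights, hidden)))
```

Because the output passes through a sigmoid, predictions lie strictly between 0 and 1.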

5.2.4 Projection Pursuit Regression

Also discussed by Hutchinson, Lo and Poggio [7], projection pursuit regression (PPR) [26] was developed for high-dimensional regression. PPR models combine projections of the data with nonlinear combining functions estimated from the data. The regression model can be stated as

$$\hat{f}(x)=\sum_{i=1}^{k}\delta_{i}h_{i}(\beta_{i}^{\top}x)+\delta_{0}$$

where $h_{i}$ are functions estimated from data, $\delta_{i},\delta_{0},\beta_{i}$ are the parameters to be optimized, and $k$ is the number of projections.

In implementing PPR, we set the model input to $(s/X,1,T_{1}-t)$, i.e. we fit $\hat{f}(s/X,1,T_{1}-t)$, where $s$ is the stock price, $X$ is the strike price and $T_{1}-t$ is the time until maturity.

5.2.5 Monte Carlo

There are many Monte Carlo methods for option pricing, and the main difference lies in the assumptions on stock price generation. Here we adapt the method of Brewer et al. [27]. We assume that the stock price follows geometric Brownian motion. If we denote $s$ as the stock price and $t$ as time, then

$$ds=\mu s\,dt+\sigma s\,dz,\quad dz=\epsilon\sqrt{dt}$$

where $\mu$ is the drift parameter, $\sigma$ is the volatility parameter and $\epsilon$ is a random draw from the standard normal distribution. Given the stock price at time $t$, we relate the drift parameter to the expected stock price and the exercise price through $\mu=\frac{1}{T_{1}-t}\log(X/s_{t})$, where $T_{1}$ is the date of option expiration. Following this stochastic process, we generate $N_{2}$ different stock price tracks $(s^{(i)}_{t},\tilde{s}^{(i)}_{t+1},\cdots,\tilde{s}^{(i)}_{T_{1}})_{i=1}^{N_{2}}$ for pricing. If we denote $\Delta t$ as the time unit and $r$ as the annual risk-free interest rate, the estimated prices for European options are $\widehat{C}=(1+r\Delta t)^{T_{1}-t}\frac{1}{N_{2}}\sum_{i=1}^{N_{2}}\max(\tilde{s}^{(i)}_{T_{1}}-X,0)$ and $\widehat{P}=(1+r\Delta t)^{T_{1}-t}\frac{1}{N_{2}}\sum_{i=1}^{N_{2}}\max(X-\tilde{s}^{(i)}_{T_{1}},0)$; for American option pricing we take the average of the lower and upper bounds.

In implementing Monte Carlo, we collect the volatility parameter for each option from Bloomberg.
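The Monte Carlo baseline can be sketched as follows. This is an illustration under assumptions, not the spreadsheet procedure of [27]: paths are advanced with the exact log-normal step of geometric Brownian motion, the multiplicative factor in front of the average payoff is passed in by the caller as `factor` rather than hard-coded, and all function names are hypothetical.

```python
import math
import random


def simulate_terminal_prices(s0, mu, sigma, horizon, n_steps, n_paths, seed=0):
    """Simulate GBM paths and return the price at the end of each path."""
    rng = random.Random(seed)
    dt = horizon / n_steps
    terminal = []
    for _ in range(n_paths):
        log_s = math.log(s0)
        for _ in range(n_steps):
            eps = rng.gauss(0.0, 1.0)            # standard normal draw
            log_s += (mu - 0.5 * sigma ** 2) * dt + sigma * math.sqrt(dt) * eps
        terminal.append(math.exp(log_s))
    return terminal


def mc_option_values(terminal_prices, strike, factor=1.0):
    """Average call and put payoffs over the simulated terminal prices."""
    n = len(terminal_prices)
    call = factor * sum(max(s - strike, 0.0) for s in terminal_prices) / n
    put = factor * sum(max(strike - s, 0.0) for s in terminal_prices) / n
    return call, put
```

With zero volatility every path ends at $s_{t}\exp\{\mu(T_{1}-t)\}$, which gives a simple deterministic check of the simulator.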

5.2.6 Experiments

We first test our model on option pricing, covering European and American options as well as call and put options. For American options, we collect TSLA historical daily stock prices from 3/22/2019 to 1/27/2022 for GAN training and Monte Carlo estimation; excluding all holidays and weekends, there are 720 daily observations in total. We collect TSLA call option data with strike price, last deal price and expiration date from 2/25/2022 to 3/10/2022; there are 8 different expiration dates for each day and 10 different strike prices for each expiration date. We collect TSLA put option data from 3/21/2022 to 4/1/2022 with the same setup as for calls. Thus we have 800 option observations each for calls and puts; we use 720 observations for training the RBF network, MLP regression, PPR and linear regression, and the rest for testing.

We set the dimension of the generator output $T$ to 128, the sample size threshold $N_{1}$ to 290, the number of generated samples $N_{2}$ to 5120 and the proportion parameter $\alpha$ to 0.8. We collect the annual risk-free interest rate from Bloomberg. We use the mean absolute percentage error (MAPE) as the metric for model evaluation, where

$$\text{MAPE}=\frac{100\%}{N_{\text{test}}}\sum_{j=1}^{N_{\text{test}}}\frac{|\widehat{V}_{j}-V_{j}|}{V_{j}}$$

where $\widehat{V}_{j}$ is the predicted call or put option price, $V_{j}$ is the real option price and $N_{\text{test}}$ is the size of the test set.
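The evaluation metric is straightforward to compute; a minimal sketch is given below, assuming positive real prices and paired lists of predictions and observations (the function name `mape` is an illustrative choice).

```python
def mape(predicted, actual):
    """Mean absolute percentage error, in percent, over paired prices."""
    if not actual or len(predicted) != len(actual):
        raise ValueError("predicted and actual must be non-empty and paired")
    if any(a <= 0 for a in actual):
        raise ValueError("actual prices must be positive")
    total = sum(abs(p - a) / a for p, a in zip(predicted, actual))
    return 100.0 * total / len(actual)
```

For instance, predictions of 110 and 90 against real prices of 100 and 100 are each off by 10%, so the MAPE is about 10.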

Call options
Model             MAPE
GAN-MC            1.42%
MC                2.55%
BS                1.33%
RBF Network       8.75%
MLP Regression    6.42%
PPR               4.52%
LR                10.3%
LR-ITM            10.88%
LR-OTM            10.55%

Put options
Model             MAPE
*GAN-MC           1.02%
MC                1.91%
BS                2.82%
RBF Network       12.1%
MLP Regression    9.46%
PPR               10.69%
LR                15.32%
LR-ITM            12.59%
LR-OTM            17.25%

Table 1: Performance table for TSLA option pricing. Here GAN-MC means our model, MC means Monte Carlo estimation, BS is the abbreviation of the Black-Scholes model, LR is the linear model using all the option data, LR-ITM is the In-the-Money Linear Model and LR-OTM is the Out-of-the-Money Linear Model. The first table reports performance on call options and the second on put options.

As seen from table 1, our model's performance is close to that of the Black-Scholes model on call option pricing, and our model achieves the lowest error among all compared models for TSLA put option pricing. For a single price prediction, a smaller variance of $\widehat{C}$ or $\widehat{P}$ makes the prediction more accurate, so we set $N_{2}$ to a relatively large number to make the Monte Carlo estimation more accurate. In both cases our GAN-MC performs better than the MC-only model, which suggests that the GAN has better generation capacity for real market data. As for the three non-parametric learning models, RBF network, MLP regression and PPR, their performances on TSLA call and put options are quite similar, which is consistent with the equivalence of the three representations.

Apart from common stock prices, our model also works on index prices. For European options, we test our model on S&P 500 index option prices. Similar to TSLA, for call option pricing we collect S&P 500 historical daily index prices from 6/14/2019 to 1/10/2022 for GAN training and Monte Carlo estimation; excluding all holidays and weekends, there are 650 daily observations in total. We collect S&P 500 call option (SPXW) data with strike price, last deal price and expiration date from 4/6/2022 to 4/28/2022; there are 8 different expiration dates mixed with call option types for each day and 10 different strike prices for each expiration date. After removing the SPX part, we have 700 S&P 500 call option observations. We use 650 of them for training and the rest for testing. For put options, we collect S&P 500 historical daily index prices from 1/6/2020 to 9/30/2022 for GAN training and Monte Carlo estimation; excluding all holidays and weekends, there are 690 daily observations in total. We collect S&P 500 put option (SPXW) data with strike price, last deal price and expiration date from 10/7/2022 to 10/28/2022; we use 690 observations for training and the rest for testing.

Call options
Model             MAPE
*GAN-MC           4.50%
MC                7.27%
BS                19.96%
RBF Network       17.00%
MLP Regression    14.2%
PPR               10.40%
LR                6.72%
LR-ITM            6.52%
LR-OTM            7.82%

Put options
Model             MAPE
*GAN-MC           2.68%
MC                20.61%
BS                8.20%
RBF Network       16.83%
MLP Regression    12.28%
PPR               19.97%
LR                10.58%
LR-ITM            10.39%
LR-OTM            11.04%

Table 2: Performance table for S&P 500 Weeklys (SPXW) options pricing. The first table reports performance on call options and the second on put options.

As seen from table 2, our model performs best among all the pricing models. Black-Scholes performs poorly on SPXW pricing, possibly because the last deal price of SPXW sometimes falls outside the range between the bid and ask prices. The linear trend also appears to be significant in SPXW, so the linear models perform better here than on the other datasets.

5.3 Equity Forward or Futures Pricing

Similar to the methods used in option pricing, we conduct experiments for equity futures pricing with GAN-MC, RBF network, MLP regression, PPR, linear regression and Monte Carlo. We first collect S&P 500 historical daily index prices from 6/14/2019 to 4/21/2022 for GAN training and Monte Carlo estimation; excluding all holidays and weekends, there are 720 daily observations in total. We then collect E-Mini S&P 500 futures data, which includes ESU22 (delivery at 9/16/2022) and ESZ22 (delivery at 12/16/2022), from 7/12/2021 to 7/6/2022. In addition, we collect historical E-Mini S&P 500 futures ESH21 (delivery at 3/18/2022) from 1/8/2021 to 3/18/2022. All the futures data includes the last futures price, the remaining time before the delivery date and the S&P 500 index price. We use 720 observations for training the RBF network, MLP regression, PPR and linear regression, and the rest for testing. Different from option pricing, we use the stock price $s$ and the time until delivery $T_{1}-t$ as the input variables of the non-parametric machine learning models for equity futures pricing.

As for Monte Carlo, following the assumption of geometric Brownian motion, we estimate the drift parameter from historical prices, since no strike price is available:

$$\hat{\mu}=\frac{1}{n}\sum_{t=1}^{n}\frac{(ds)_{t}}{s_{t}\,dt}$$

where $(ds)_{t}=s_{t+1}-s_{t}$. We estimate the volatility parameter as the historical volatility

$$\hat{\sigma}=\sqrt{\frac{1}{n-1}\sum_{t=1}^{n}(R_{t}-\bar{R})^{2}}$$

with $R_{t}=\log(s_{t}/s_{t-1})$ and $\bar{R}$ the mean of $R_{t}$. The Monte Carlo estimate of the equity forward or futures price is then given by $\widehat{F}^{\textrm{eq}}=s_{t}\cdot\exp\left\{\left(r-\frac{1}{N_{2}}\sum_{i=1}^{N_{2}}\frac{\widehat{D}(T_{1})}{\tilde{s}^{(i)}_{T_{1}}}\right)\frac{T_{1}-t}{\Delta t}\right\}$, where $t$ is the current date, $T_{1}$ is the delivery date, $\Delta t$ is the time unit and $\tilde{s}^{(i)}_{T_{1}}$ is the generated stock price in track $(s^{(i)}_{t},\tilde{s}^{(i)}_{t+1},\cdots,\tilde{s}^{(i)}_{T_{1}})_{i=1}^{N_{2}}$.
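The two estimators can be sketched as follows. This is a minimal illustration with hypothetical function names: both functions use all consecutive price pairs available in the input list, which is an assumption about how the sums are indexed at the boundaries, and the volatility is returned per time step without annualization.

```python
import math
import statistics


def estimate_drift(prices, dt):
    """Average of (s_{t+1} - s_t) / (s_t * dt) over consecutive prices."""
    if len(prices) < 2:
        raise ValueError("need at least two prices")
    rel_changes = [(prices[k + 1] - prices[k]) / (prices[k] * dt)
                   for k in range(len(prices) - 1)]
    return sum(rel_changes) / len(rel_changes)


def estimate_volatility(prices):
    """Sample standard deviation of log returns R_t = log(s_t / s_{t-1})."""
    if len(prices) < 3:
        raise ValueError("need at least three prices")
    log_returns = [math.log(prices[k] / prices[k - 1])
                   for k in range(1, len(prices))]
    return statistics.stdev(log_returns)   # uses the n - 1 denominator
```

A price series growing by exactly 10% per step, such as `[100.0, 110.0, 121.0]` with `dt = 1.0`, gives a drift of about 0.1 and a volatility of essentially zero.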

We set the dimension of the generator output $T$ to 128, the sample size threshold $N_{1}$ to 290, the number of generated samples $N_{2}$ to 5120 and the proportion parameter $\alpha$ to 0.8. We collect the annual risk-free interest rate from Bloomberg, and we use the mean absolute percentage error (MAPE) as the metric for model evaluation.

Model MAPE
*GAN-MC 0.03%
MC 0.36%
RBF Network(Gauss) 0.17%
RBF Network(Sqrt) 0.31%
MLP Regression 0.43%
PPR 0.11%
LR 0.31%
Table 3: Performance table for E-Mini S&P 500 futures pricing. Here RBF Network(Gauss) means we use Gaussian as basis functions and RBF Network(Sqrt) means we use multiquadric as basis functions.

As seen from table 3, our model again performs best on equity futures pricing. For a single price prediction, a smaller variance of $\widehat{F}^{\textrm{eq}}$ makes the predicted value more accurate, so we set $N_{2}$ to a relatively large number to make the Monte Carlo estimation more accurate. In all cases our GAN-MC performs better than the MC-only model, which suggests that the GAN offers better stability and generation capacity for market data.

5.4 Commodity Forward or Futures Pricing

Similar to the methods used in section 5.3, we conduct experiments for commodity forward contract pricing with GAN-MC, RBF network, MLP regression, PPR, linear regression and Monte Carlo. We first collect LME copper daily spot prices from 12/4/19 to 9/2/22 for GAN training and Monte Carlo estimation; excluding all holidays and weekends, there are 700 daily observations in total. We then collect LME copper 3-month rolling forward daily prices from 10/17/18 to 9/30/22. All the forward data includes the last forward price, the remaining time before the delivery date (which is three months) and the spot price. We use 700 observations for training the RBF network, MLP regression, PPR and linear regression, and the rest for testing. We use the spot price $s$ and the time until delivery as the input variables of the non-parametric machine learning models for commodity forward or futures pricing.

Similar to the method used in equity futures pricing, the Monte Carlo estimate of the commodity forward contract or futures price is given by $\widehat{F}^{\textrm{co}}=\frac{1}{N_{2}}\sum_{i=1}^{N_{2}}\tilde{s}^{(i)}_{T_{1}}+\widehat{P}(t,T_{1})\exp(r(T_{1}-t))$, where $t$ is the current date, $T_{1}$ is the settlement date and $\tilde{s}^{(i)}_{T_{1}}$ is the generated spot price in track $(s^{(i)}_{t},\tilde{s}^{(i)}_{t+1},\cdots,\tilde{s}^{(i)}_{T_{1}})_{i=1}^{N_{2}}$.

We set the dimension of the generator output $T$ to 128, the sample size threshold $N_{1}$ to 290, the number of generated samples $N_{2}$ to 5120, the sample size parameter for estimating the cost of carry $N_{3}$ to 50 and the proportion parameter $\alpha$ to 0.8. We collect the annual risk-free interest rate from Bloomberg, and we use the mean absolute percentage error (MAPE) as the metric for model evaluation.

Model MAPE
*GAN-MC 0.08%
MC 0.53%
RBF Network(Gauss) 0.90%
RBF Network(Sqrt) 2.33%
MLP Regression 1.11%
PPR 1.24%
LR 1.13%
Table 4: Performance table for LME copper 3 month forward pricing.

The results in table 4 show that our model performs best on commodity forward pricing. Compared with MC, the better performance of GAN-MC indicates the efficiency and capacity of the generative network during generation. Apart from our model, the Monte Carlo and RBF Network (Gauss) models also perform well on LME copper forward pricing.

Apart from commodity forward contracts, GAN-MC can also handle commodity futures. We therefore test our model on crude oil futures. Similar to copper, we collect Cushing, OK WTI crude oil historical daily spot prices from 7/11/2019 to 8/30/2022 for GAN training and Monte Carlo estimation; excluding all holidays and weekends, there are 700 daily observations in total. We then collect CLV2, the WTI crude oil future settled in October 2022, from 2/5/2019 to 2/9/2022, and CLX2, which is settled in November 2022, from 2/5/2020 to 2/9/2022. All the futures data includes the last futures price, the remaining time before the delivery date and the WTI crude oil spot price. We use 700 daily observations for training the RBF network, MLP regression, PPR and linear regression, and the rest for testing.

Model MAPE
*GAN-MC 0.58%
MC 7.44%
RBF Network(Gauss) 2.59%
RBF Network(Sqrt) 1.88%
MLP Regression 4.75%
PPR 2.26%
LR 4.29%
Table 5: Performance table for WTI crude oil futures pricing.

The results in table 5 show that our model performs best on commodity futures pricing. We attribute this to the capacity, generating ability and variance reduction properties of GAN-MC. Apart from our model, the RBF Network (Sqrt) and PPR models also perform well on WTI crude oil futures pricing.

6 Conclusion

The success of our model on pricing different real-market derivatives supports the validity of our GAN-MC model. GAN is a powerful tool for capturing the trend and variation of underlying asset prices such as stock or index prices. Monte Carlo can be used to reduce the variance of estimators given independent sequences and is efficient for derivatives pricing. Although parametric derivatives pricing formulas are preferred when they are available, our results show that generative model-based Monte Carlo alternatives can be useful substitutes when arbitrage-based pricing formulas or non-parametric pricing models fail. While our results are promising, we cannot claim that our approach will be successful in general. We have not yet covered swaps and other derivatives, and we hope to provide a more comprehensive analysis of these alternatives in the near future.

References

  • Black [1976] Fischer Black. The pricing of commodity contracts. Journal of financial economics, 3(1-2):167–179, 1976.
  • Merton [1973] Robert C Merton. Theory of rational option pricing. The Bell Journal of economics and management science, pages 141–183, 1973.
  • Karoui et al. [1998] Nicole El Karoui, Monique Jeanblanc-Picqué, and Steven E Shreve. Robustness of the black and scholes formula. Mathematical finance, 8(2):93–126, 1998.
  • Wu [2004] Hsien-Chung Wu. Pricing european options based on the fuzzy pattern of black–scholes formula. Computers & Operations Research, 31(7):1069–1081, 2004.
  • Magdziarz [2009] Marcin Magdziarz. Black-scholes formula in subdiffusive regime. Journal of Statistical Physics, 136(3):553–564, 2009.
  • Carmona and Durrleman [2005] René Carmona and Valdo Durrleman. Generalizing the black-scholes formula to multivariate contingent claims. Journal of computational finance, 9(2):43, 2005.
  • Hutchinson et al. [1994] James M Hutchinson, Andrew W Lo, and Tomaso Poggio. A nonparametric approach to pricing and hedging derivative securities via learning networks. The journal of Finance, 49(3):851–889, 1994.
  • Goodfellow et al. [2014] Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. Generative adversarial nets. Advances in neural information processing systems, 27, 2014.
  • Zhang et al. [2019] Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. Self-attention generative adversarial networks. In International conference on machine learning, pages 7354–7363. PMLR, 2019.
  • Mirza and Osindero [2014] Mehdi Mirza and Simon Osindero. Conditional generative adversarial nets. arXiv preprint arXiv:1411.1784, 2014.
  • Chen et al. [2020] Liangjian Chen, Shih-Yao Lin, Yusheng Xie, Yen-Yu Lin, Wei Fan, and Xiaohui Xie. Dggan: Depth-image guided generative adversarial networks for disentangling rgb and depth images in 3d hand pose estimation. In Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pages 411–419, 2020.
  • Arjovsky et al. [2017] Martin Arjovsky, Soumith Chintala, and Léon Bottou. Wasserstein generative adversarial networks. In International conference on machine learning, pages 214–223. PMLR, 2017.
  • Chen et al. [2016] Xi Chen, Yan Duan, Rein Houthooft, John Schulman, Ilya Sutskever, and Pieter Abbeel. Infogan: Interpretable representation learning by information maximizing generative adversarial nets. Advances in neural information processing systems, 29, 2016.
  • Boyle [1977] Phelim P Boyle. Options: A monte carlo approach. Journal of financial economics, 4(3):323–338, 1977.
  • Fu and Hu [1995] Michael C Fu and Jian-Qiang Hu. Sensitivity analysis for monte carlo simulation of option pricing. Probability in the Engineering and Informational Sciences, 9(3):417–446, 1995.
  • Birge [1995] John R Birge. Quasi-monte carlo approaches to option pricing. Technical report, 1995.
  • Broadie et al. [1997] Mark Broadie, Paul Glasserman, and Gautam Jain. Enhanced monte carlo estimates for american option prices. Journal of Derivatives, 5:25–44, 1997.
  • Poirot and Tankov [2006] Jérémy Poirot and Peter Tankov. Monte carlo option pricing for tempered stable (cgmy) processes. Asia-Pacific Financial Markets, 13(4):327–344, 2006.
  • Kim [2022] Yo-whan Kim. How Transferable are Video Representations Based on Synthetic Data? PhD thesis, Massachusetts Institute of Technology, 2022.
  • Rosenblatt [1961] Frank Rosenblatt. Principles of neurodynamics. perceptrons and the theory of brain mechanisms. Technical report, Cornell Aeronautical Lab Inc Buffalo NY, 1961.
  • Cassisi et al. [2012] Carmelo Cassisi, Placido Montalto, Marco Aliotta, Andrea Cannata, and Alfredo Pulvirenti. Similarity measures and dimensionality reduction techniques for time series data mining. Advances in data mining knowledge discovery and applications, pages 71–96, 2012.
  • Quail and Overdahl [2009] Rob Quail and James A Overdahl. Financial derivatives: pricing and risk management, volume 5. John Wiley & Sons, 2009.
  • Black and Scholes [2019] Fischer Black and Myron Scholes. The pricing of options and corporate liabilities. In World Scientific Reference on Contingent Claims Analysis in Corporate Finance: Volume 1: Foundations of CCA and Equity Valuation, pages 3–21. World Scientific, 2019.
  • Poggio and Girosi [1990] Tomaso Poggio and Federico Girosi. Networks for approximation and learning. Proceedings of the IEEE, 78(9):1481–1497, 1990.
  • Rumelhart et al. [1985] David E Rumelhart, Geoffrey E Hinton, and Ronald J Williams. Learning internal representations by error propagation. Technical report, California Univ San Diego La Jolla Inst for Cognitive Science, 1985.
  • Friedman and Stuetzle [1981] Jerome H Friedman and Werner Stuetzle. Projection pursuit regression. Journal of the American statistical Association, 76(376):817–823, 1981.
  • Brewer et al. [2012] Kevin D Brewer, Yi Feng, and Clarence CY Kwan. Geometric brownian motion, option pricing, and simulation: Some spreadsheet-based exercises in financial modeling. Spreadsheets in Education, 5(3):4598, 2012.