Multiscale Markowitz

Raphael Douady, Revant Nayar
(August 2024)
Abstract

Traditional Markowitz portfolio optimization constrains daily portfolio variance to a target value, optimising returns, Sharpe or variance within this constraint. However, this approach overlooks the relationship between variance at different time scales, typically described by $\sigma(\Delta t)\propto(\Delta t)^{H}$ where $H$ is the Hurst exponent, most of the time assumed to be $\frac{1}{2}$. This paper introduces a multifrequency optimization framework that allows investors to specify target portfolio variance across a range of frequencies, characterized by a target Hurst exponent $H_{\text{target}}$, or optimize the portfolio at multiple time scales. By incorporating this scaling behavior, we enable a more nuanced and comprehensive risk management strategy that aligns with investor preferences at various time scales. This approach effectively manages portfolio risk across multiple frequencies and adapts to different market conditions, providing a robust tool for dynamic asset allocation. This overcomes some of the traditional limitations of Markowitz when it comes to dealing with crashes, regime changes, volatility clustering or multifractality in markets. We illustrate this concept with a toy example and discuss the practical implementation for assets with varying scaling behaviors.

1 Introduction

In the classical Markowitz portfolio optimization framework, the daily variance is typically constrained to a predefined maximum value, such that $\sum_{i}\sigma_{i}^{2}<\sigma_{\text{target}}^{2}$. The objective is then to maximize returns while adhering to this variance constraint. This approach assumes a fixed variance at a single time scale, usually on a daily basis.

However, it is well understood that the variance of asset returns across different time scales is, to a large extent, related by a scaling law of the form:

$$\sigma(\Delta t)\propto(\Delta t)^{H}$$

where $H=0.5$ for Brownian motion [1]. Despite this, the target portfolio variance is typically defined at only one frequency (commonly daily), without consideration for how variance behaves across other time scales. In practice, it would be reasonable to expect that investors might have target portfolio variances across a range of frequencies, from the fast trade adjustment frequency to long-term portfolio objectives. The most general form of this would be an arbitrary function of scale, $\sigma_{\text{target}}^{2}(\Delta t)$, which need not be continuous or differentiable. This function allows investors to specify their risk preferences over a spectrum of relevant time scales, for example, from one day to one month or more [6].
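The scaling law above suggests a simple way to measure $H$ from data: regress $\log\sigma(\Delta t)$ on $\log\Delta t$. The following is a minimal sketch under stated assumptions, not the authors' estimator; the function name `estimate_hurst`, the choice of scales and the simulated random walk are all illustrative.

```python
import numpy as np

def estimate_hurst(log_prices, scales=(1, 2, 4, 8, 16)):
    """Estimate H from sigma(dt) ~ dt**H by a log-log least-squares fit."""
    log_prices = np.asarray(log_prices, dtype=float)
    sigmas = []
    for dt in scales:
        # Overlapping dt-step log returns: x(t + dt) - x(t).
        increments = log_prices[dt:] - log_prices[:-dt]
        sigmas.append(increments.std(ddof=1))
    # The slope of log sigma against log dt is the Hurst exponent.
    slope, _intercept = np.polyfit(np.log(scales), np.log(sigmas), 1)
    return slope

# Assumption: a simulated Gaussian random walk stands in for a log-price series.
rng = np.random.default_rng(0)
random_walk = np.cumsum(rng.standard_normal(50_000))
hurst = estimate_hurst(random_walk)
print(f"estimated H for a simulated random walk: {hurst:.3f}")
```

For a Brownian-like series the estimate should be close to 0.5; values noticeably above or below would indicate the anomalous scaling discussed in the rest of the paper.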

For simplicity, however, we can assume self-similarity in the variance structure and instead fix the target scaling behavior, characterized by a target Hurst exponent $H_{\text{target}}$. This gives us the relationship:

$$\sigma_{\text{target}}^{2}(\Delta t)\propto(\Delta t)^{H_{\text{target}}}$$

This approach allows for flexibility in portfolio optimization based on different investor risk preferences across time scales [8]. For example, an investor who is more risk-averse to lower frequency volatility (e.g., weekly returns) than higher frequency volatility (e.g., daily returns) might prefer a target Hurst exponent $H_{\text{target}}<H_{\text{market}}$. This situation could arise if a fund seeks to limit drawdowns at lower frequencies, where volatility is naturally higher. Conversely, if an investor is willing to tolerate more variance at lower frequencies, the target Hurst exponent might be set such that $H_{\text{target}}>H_{\text{market}}$ [2].

In this article, we first justify the need for a multifrequency approach to time series analysis through observed stylized facts, such as varying volatility, fat tails and autocorrelation. Then we study a toy example based on multifractality and show how optimal weights in this context depend on the various parameters.

Finally, using US sector index-tracking ETFs, we provide evidence that multifrequency optimization outperforms traditional Markowitz optimization based on daily variances and covariances.

1.1 Why Multiscale Optimization?

We might question why there is a need at all to do multiscale portfolio optimization. Isn’t the single scale case enough?

In reality, two facts occur in stock and stock index prices. On the one hand, non-trivial self-similarity with varying critical exponents has been observed in pricing time series since the time of Mandelbrot (who noticed it for cotton prices). At lower frequencies (weekly and below), momentum appears, i.e. a Hurst exponent above $\frac{1}{2}$, while fat tails fade away. On the contrary, intraday returns display fatter tails but negative autocorrelation, i.e. a Hurst exponent below $\frac{1}{2}$.

On the other hand, periods of high and low volatility operate like an acceleration and slowdown of the time clock. Using a fixed time clock for the return series boils down to using some kind of random time sampling with respect to the market "volatility time". We will see that, in fact, volatility is a hidden market variable. In other words, unless volatility is treated as a hidden variable, price series are not Markovian, despite the Efficient Market Hypothesis, which one needs to go to high-frequency trading to invalidate.

The logic of all this is that investors at marginally different time horizons behave in a similar way. Dynamic phase transitions are also characterised by self-similar fixed points (markets show these near crashes and regime shifts). Accordingly, there are certain cases where Markowitz breaks down, particularly when volatility spikes: exposures that are calibrated on a low-volatility period turn out to be far too large for the newly appearing high volatility. We will show that this can be addressed by multiscale optimization.

1.2 Stylised Facts

The following are typical stylised facts observed in the stock market and in financial markets in general:

  • Volatility Clusters: Market volatility varies through time in an irregular and time-asymmetric manner. Volatility may suddenly surge following an event or a piece of news that triggers instability. It then takes time to progressively decrease.

  • Rough volatility: This causes the volatility of some assets to increase disproportionately at longer scales, causing over-allocation to them [2]. Globally, volatility remains in a limited range, implying a negative autocorrelation of its variations at various time scales.

  • Non-Ellipticity: Asset returns exhibit varying fat-tailed distributions subject to the tail concentration effect: under extreme conditions, the number of effective variables driving the market shrinks down to just a few of them. The Hurst exponent also varies, due to market impact, e.g. for illiquids, corporate bonds, PE/VC funds, emerging markets etc. We know that Markowitz generally overallocates to illiquids [12, 13, 14].

  • Crashes and Bubbles: During crashes, Markowitz often worsens drawdowns compared with an equally weighted portfolio [9]. This is because the volatility begins to scale non-trivially with scale. We expect to see power laws during bubbles as well, with imaginary critical exponents.

  • Stochastic Volatility: This introduces uncertainty on the volatility of the different assets, resulting in instabilities and estimation errors, in addition to non-generalizability out of sample [4].

  • Regime Changes: Markowitz is unable to handle stochastic regime changes such as shocks induced by changes in interest rates, unemployment numbers, and global macroeconomic and political factors [7].

A lot of these are instances when the volatility becomes increasingly rough, with Hurst exponents substantially different from one half, or when tails become disproportionately fat [2]. In the case of market crashes, in fact, that is what really happens. Bubbles too are characterized by Hurst exponents, but imaginary ones, corresponding to dynamics governed by the Log Periodic Power Law (LPPL). Markowitz is based on the condition of ellipticity, where all stocks have the same Hurst exponent, so that optimal weights at one time scale generalize across scales.

2 Motivations

2.1 Fractional Diffusion PDE

We are motivated by the fact that pricing time series have been shown to exhibit self-similarity in various studies, starting with Mandelbrot [12]. Self-similar dynamics also sit at the endpoints of renormalization group flows. We start with the most general non-interacting PDE we can write down describing fractional diffusion in both space and time:

$$\frac{\partial^{\beta}P(x,t)}{\partial t^{\beta}}=-K_{\alpha}(-\Delta)^{\alpha}P(x,t)$$

where

  • $P(x,t)$ is the probability density function

  • $\frac{\partial^{\beta}}{\partial t^{\beta}}$ is the fractional Caputo derivative (with $0<\beta\leq 1$)

  • $(-\Delta)^{\alpha}$ is the fractional Laplacian (Riesz derivative) of order $\alpha$ (with $0<\alpha\leq 2$)

  • $K_{\alpha}$ is a generalised diffusion coefficient.

Here we see that there are qualitatively four types of cases:

  • Brownian motion ($\beta=1,\alpha=2$): This case equates to the usual Markowitz case. Here $|x|\propto t^{0.5}$.

  • Fat tails ($\beta=1,1\leq\alpha<2$): This corresponds to no time dependence but fat tails, with a power law $|x|\propto t^{1/\alpha}$.

  • Gaussian distribution with time dependence ($0<\beta<2,\alpha=2$): Here we have the benefit of no fat tails, but time dependence causes anomalous scaling $|x|\propto t^{\beta/2}$.

  • Time dependence and fat tails ($0<\beta<2,1\leq\alpha\leq 2$): This is the most general case, with time dependence and fat tails. Here we have that $|x|\propto t^{\beta/\alpha}$.

Note that in most studies, the Hurst exponent is considered to be solely due to time dependence, with $H:=\beta/2$. However, this obscures the fact that anomalous scaling can be due to both time dependence and fat tails. Hence, in the remainder of this study, when we refer to the Hurst exponent, we refer to the standardised Hurst exponent defined as

$$H:=\beta/\alpha$$

In a similar vein, in the presence of nonlinearities we can in principle have multifractality due to both the effect of the probability distribution ($\beta_{n}\neq n\beta$) and the interactions ($\alpha_{n}\neq\alpha/n$). Analogously, we have a standardised generalized Hurst exponent:

$$H_{n}:=\beta_{n}/\alpha_{n}$$

The direct observable here is the scaling law $|x|^{n}\propto t^{H_{n}}$. Separating the contributions of the i.i.d. distribution and of the dependence structure to it is more subtle and will not be addressed.
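To make the definition $H:=\beta/\alpha$ concrete, the sketch below evaluates it for one parameter choice in each of the cases listed earlier. The helper name `standardised_hurst` and the specific parameter values are illustrative assumptions, not values taken from the paper.

```python
def standardised_hurst(beta: float, alpha: float) -> float:
    """Return H = beta / alpha, the exponent in |x| ~ t**H."""
    if alpha <= 0 or beta <= 0:
        raise ValueError("beta and alpha must be positive")
    return beta / alpha

# Illustrative parameter choices, one per qualitative case.
cases = {
    "Brownian motion (beta=1, alpha=2)": standardised_hurst(1.0, 2.0),
    "fat tails only (beta=1, alpha=1.5)": standardised_hurst(1.0, 1.5),
    "time dependence only (beta=0.8, alpha=2)": standardised_hurst(0.8, 2.0),
    "both (beta=0.8, alpha=1.5)": standardised_hurst(0.8, 1.5),
}
for name, h in cases.items():
    print(f"{name}: H = {h:.3f}")
```

Note that the second and fourth cases both give $H>\frac{1}{2}$ for different reasons, which is exactly the ambiguity the standardised definition is meant to absorb.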

The discerning reader might be concerned here about the inconsistency of the Markowitz paradigm with the presence of fat tails, since the covariance matrix does not in principle converge under the presence of fat tails. The solution to that is straightforward: one can replace the covariance matrix with the $L^{1}$-modified covariance matrix and solve the optimisation problem:

$$\min_{w}\,w_{i}\Sigma^{L^{1}}_{ij}w_{j}$$

where the $L^{1}$-modified covariance matrix is written as:

$$\Sigma^{L^{1}}_{ij}=\rho_{ij}\,|R_{i}-\tilde{R}_{i}|\,|R_{j}-\tilde{R}_{j}|$$

where $\rho_{ij}$ are just the cross-asset correlations and $|R_{i}-\tilde{R}_{i}|$ and $|R_{j}-\tilde{R}_{j}|$ are absolute deviations from the robust central estimate (median). This will in principle have better convergence properties in the presence of tails, but should not make a substantial difference to the allocations.
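One possible reading of the $L^{1}$-modified covariance matrix is sketched below. Two assumptions are made that the text does not spell out: the absolute deviation from the median is averaged over time (a mean absolute deviation), and $\rho_{ij}$ is the ordinary Pearson correlation. The function name `l1_modified_covariance` and the Student-t toy data are illustrative only.

```python
import numpy as np

def l1_modified_covariance(returns):
    """returns: array of shape (T, N), one column per asset."""
    returns = np.asarray(returns, dtype=float)
    # Assumption: |R_i - median(R_i)| is averaged over time.
    medians = np.median(returns, axis=0)
    abs_dev = np.mean(np.abs(returns - medians), axis=0)
    # Assumption: rho_ij is the ordinary Pearson correlation.
    rho = np.corrcoef(returns, rowvar=False)
    return rho * np.outer(abs_dev, abs_dev)

rng = np.random.default_rng(1)
sample = rng.standard_t(df=3, size=(2_000, 4))  # fat-tailed toy returns
sigma_l1 = l1_modified_covariance(sample)
print(np.round(sigma_l1, 3))
```

The diagonal of the result holds squared mean absolute deviations rather than variances, so it stays finite for return distributions with a finite first moment.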

3 Implementation

In practice, it would work in the following manner. The returns are invariant under rescaling, so the return maximization condition will remain as is across a range of scales. As we progressively move to higher and higher frequencies, we will measure the variance of each asset across a range of frequencies, $\sigma_{i}^{2}(\Delta t)$, and then for each frequency we get:

$$\sigma_{\text{portfolio}}(\Delta t)^{2}=\sum_{i,j}w_{i}w_{j}\Sigma_{ij}(\Delta t)$$
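A minimal sketch of how $\Sigma_{ij}(\Delta t)$ and the portfolio variance could be measured at several scales from one series of daily log returns. It assumes non-overlapping aggregation windows and uses simulated i.i.d. returns; the function names and the toy weights are hypothetical.

```python
import numpy as np

def covariance_at_scale(log_returns, dt):
    """Covariance of non-overlapping dt-period log returns (input shape (T, N))."""
    log_returns = np.asarray(log_returns, dtype=float)
    n_blocks = log_returns.shape[0] // dt
    # Sum dt consecutive one-period log returns to obtain one dt-period return.
    trimmed = log_returns[: n_blocks * dt]
    aggregated = trimmed.reshape(n_blocks, dt, -1).sum(axis=1)
    return np.cov(aggregated, rowvar=False)

def portfolio_variance(weights, cov):
    return float(weights @ cov @ weights)

rng = np.random.default_rng(2)
daily = 0.01 * rng.standard_normal((5_000, 3))  # i.i.d. toy daily log returns
w = np.array([0.5, 0.3, 0.2])
variances = {dt: portfolio_variance(w, covariance_at_scale(daily, dt))
             for dt in (1, 5, 20)}
for dt, var in variances.items():
    print(f"dt={dt:>2}: portfolio variance = {var:.6f}")
```

For i.i.d. returns the variance grows roughly linearly in $\Delta t$; departures from that in real data are what the multiscale approach is designed to exploit.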

We have a few options on how to implement this:

  1. For each frequency, we impose maximal return under the constraint $\sigma_{\text{portfolio}}(\Delta t)<\sigma_{\text{target}}(\Delta t)$, where $\sigma^{2}_{\text{target}}(\Delta t)\propto\sigma^{2}_{\text{target}}(\Delta t=1)\,|\Delta t|^{H_{\text{target}}}$. This approach ensures that risk, in the form of variance, is effectively managed across a range of frequencies [5].

  2. Construct a minimum variance or maximal Sharpe portfolio with the variance estimated by taking an average across scales, or by averaging weights after multiscale optimization [10]. In the latter alternative, one computes the weights at various scales, $w_{i}(\Lambda)$, and then averages them, $\tilde{w}_{i}=\langle w_{i}(\Lambda)\rangle_{\Lambda}$, before renormalization.

  3. Construct a minimum variance or maximal Sharpe portfolio where variance is a multiscale estimate of variance, involving the average over scales $\Sigma_{ij}=\langle\Sigma_{ij}(\Delta t)/(\Delta t)\rangle_{\Delta t}$.
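The third option can be sketched as follows. This is an illustration under assumptions, not the authors' implementation: covariances are estimated on non-overlapping windows, the set of scales is arbitrary, and the minimum variance weights use only the budget constraint (so they may be negative in general).

```python
import numpy as np

def covariance_at_scale(log_returns, dt):
    """Covariance of non-overlapping dt-period log returns."""
    n_blocks = log_returns.shape[0] // dt
    aggregated = log_returns[: n_blocks * dt].reshape(n_blocks, dt, -1).sum(axis=1)
    return np.cov(aggregated, rowvar=False)

def multiscale_covariance(log_returns, scales=(1, 2, 5, 10)):
    """Average of Sigma(dt) / dt over the chosen scales."""
    log_returns = np.asarray(log_returns, dtype=float)
    per_scale = [covariance_at_scale(log_returns, dt) / dt for dt in scales]
    return np.mean(per_scale, axis=0)

# Toy data: three uncorrelated assets with daily volatilities of 1%, 2% and 3%.
rng = np.random.default_rng(3)
vols = np.array([0.01, 0.02, 0.03])
daily = vols * rng.standard_normal((2_500, 3))

sigma_ms = multiscale_covariance(daily)
raw = np.linalg.solve(sigma_ms, np.ones(3))  # proportional to Sigma^{-1} 1
weights_ms = raw / raw.sum()                 # enforce the budget constraint
print(np.round(weights_ms, 3))
```

Dividing each $\Sigma(\Delta t)$ by $\Delta t$ puts all scales on a per-period footing before averaging, so no single scale dominates merely because its returns are larger.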

We consider only the last case since the first does not sufficiently constrain the portfolio. We use minimum variance optimisation, the advantage of which is that it directly measures the efficacy of the variance estimator without dependence on the mean estimator [10]. Let us see how multi-frequency optimization constrains our weights at a range of scales, beginning with special cases.

4 Special Cases

4.1 Elliptical Case

Consider a simplified scenario where all assets exhibit the same standardised Hurst exponent, $H$, so $|x|=|t|^{H}$. Here, although we have anomalous scaling, the optimization problem can be solved at any given frequency, and the resulting solution will be applicable across all frequencies. This is due to the fact that when the scale is varied through a scaling transformation, the portfolio variance is modified by a factor of $(\Delta t_{1}/\Delta t_{2})^{H}$ [1], thus rescaling our optimisation function by a constant.
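As a worked check of this invariance, assume the minimum variance problem with only the budget constraint, whose solution is $\mathbf{w}=\Sigma^{-1}\mathbf{1}/(\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})$. The constant $c$ below stands for the common rescaling factor and is notation introduced here for illustration:

```latex
\begin{aligned}
\Sigma(\Delta t_{1}) &= c\,\Sigma(\Delta t_{2}), \qquad c>0 \text{ common to all assets},\\
\mathbf{w}(\Delta t_{1})
  &= \frac{\bigl(c\,\Sigma(\Delta t_{2})\bigr)^{-1}\mathbf{1}}
          {\mathbf{1}^{\top}\bigl(c\,\Sigma(\Delta t_{2})\bigr)^{-1}\mathbf{1}}
   = \frac{c^{-1}\,\Sigma(\Delta t_{2})^{-1}\mathbf{1}}
          {c^{-1}\,\mathbf{1}^{\top}\Sigma(\Delta t_{2})^{-1}\mathbf{1}}
   = \mathbf{w}(\Delta t_{2}).
\end{aligned}
```

The factor $c$ cancels, so the weights are the same at every scale. The cancellation fails as soon as different entries of $\Sigma$ scale with different exponents.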

However, in practice, assets often exhibit diverse scaling behaviors with frequency and are typically not self-similar. In such cases, it is necessary to compute the optimization independently at each frequency to account for these differences [6].

4.2 General Case

In a more generalized scenario, the volatilities and correlations of different assets, denoted as $\sigma_{i}(\Delta t)$ and $\rho_{ij}(\Delta t)$, respectively, can vary arbitrarily with scale and are derived from empirical data. Similarly, the target variance $\sigma_{\text{target}}(\Delta t)$ can be expressed as an arbitrary function of $\Delta t$. Under these circumstances, the optimization problem is formulated as follows:

  • Minimize the portfolio variance: $\sum_{i,j}w_{i}\Sigma^{MS}_{ij}w_{j}$

  • Here $\Sigma^{MS}_{ij}=\langle\Sigma_{ij}(\Delta t)/\Delta t\rangle_{\Delta t}$

  • Ensure portfolio weights sum to one at each scale: $\sum_{i}w_{i}(\Lambda)=1$

  • Maintain non-negative weights: $w_{i}(\Lambda)\geq 0\,\forall\,i,\Lambda$

This general case highlights the complexity of portfolio optimization when dealing with assets that exhibit non-uniform scaling behavior across different time scales.
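The long-only problem above has no closed form in general because of the non-negativity constraint. The sketch below solves it with projected gradient descent onto the simplex, a solver chosen here only because it needs nothing beyond NumPy; the paper does not say which solver was used, and the covariance matrix is a hypothetical stand-in for $\Sigma^{MS}$.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {w : w >= 0, sum(w) = 1}."""
    u = np.sort(v)[::-1]                      # entries in descending order
    css = np.cumsum(u) - 1.0
    ks = np.arange(1, v.size + 1)
    k = ks[u - css / ks > 0][-1]              # number of strictly positive weights
    tau = css[k - 1] / k
    return np.maximum(v - tau, 0.0)

def long_only_min_variance(cov, n_iter=2_000):
    """Minimise w' cov w subject to sum(w) = 1 and w >= 0."""
    cov = np.asarray(cov, dtype=float)
    n = cov.shape[0]
    # Step size 1/L, where L = 2 * lambda_max bounds the gradient's Lipschitz constant.
    step = 1.0 / (2.0 * np.linalg.eigvalsh(cov)[-1])
    w = np.full(n, 1.0 / n)
    for _ in range(n_iter):
        w = project_to_simplex(w - step * 2.0 * cov @ w)
    return w

# Hypothetical multiscale covariance matrix, used only for illustration.
sigma_ms = np.array([[0.04, 0.01, 0.00],
                     [0.01, 0.09, 0.02],
                     [0.00, 0.02, 0.16]])
weights = long_only_min_variance(sigma_ms)
print(np.round(weights, 4))
```

When the unconstrained solution already has positive weights, as in this example, the result coincides with $\Sigma^{-1}\mathbf{1}/(\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1})$; otherwise some weights are pinned at zero.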

5 Effect of Multifractality

5.1 Introduction to Multifractality

Multifractality in financial time series reflects the idea that different parts of the data may exhibit different scaling behaviors. Unlike a monofractal process, which is characterized by a single Hurst exponent $H$, a multifractal process is described by a spectrum of exponents $H(q)$, where $q$ is a moment order. This spectrum captures the complex, heterogeneous nature of financial markets, where the roughness of returns can vary depending on the time scale and the statistical moment being considered.

In a multifractal framework, the variance scaling law is generalized to:

$$\langle\sigma^{q}(\Delta t)\rangle\propto(\Delta t)^{H(q)},$$

where $H(q)$ is the multifractal scaling function, which depends on the moment $q$. For example, $H(1)$ corresponds to the traditional Hurst exponent used in variance scaling. Multifractality corresponds to $H(q)\neq qH(1)$.

5.2 Multifractal Portfolio Variance

To incorporate multifractality into portfolio optimization, the variance of each asset $i$ at a given time scale $\Delta t$ is modeled using the multifractal formalism:

$$\sigma_{i}^{q}(\Delta t)\propto\left(\Delta t\right)^{H_{i}(q)},$$

where $H_{i}(q)$ is the multifractal scaling exponent for asset $i$ at moment $q$. For Markowitz, we will be interested in $q=2$, where multifractality implies that $H(2)\neq 2H(1)$. Similarly, the covariance between stocks $i$ and $j$ scales as:

$$\Sigma_{ij}(\Delta t)=\rho_{ij}(\Delta t)\sigma_{i}(\Delta t)\sigma_{j}(\Delta t)\propto\left(\Delta t\right)^{H_{ij}(2)},$$

where multifractality ensures $H_{ij}(2)\neq H_{i}(1)+H_{j}(1)$. In fact, from the above expression, if we assume a scaling exponent for the correlation:

$$\rho_{ij}(\Delta t)\propto(\Delta t)^{H^{\rho}_{ij}}$$

we find that:

$$H^{\rho}_{ij}=H_{ij}(2)-(H_{i}(1)+H_{j}(1))$$

This is well known to be non-zero and positive as a result of the Epps effect [15], where correlation between any two pairs of stocks is expected to increase with increasing length scale. Empirically, from studies of the Epps effect, we find that $H^{\rho}_{ij}\approx 0.3$, which implies that the variation of cross correlation with scale decreases as one moves to lower and lower frequencies. The portfolio variance at time scale $\Delta t$ then becomes:

$$\sigma_{\text{portfolio}}^{2}(\Delta t)=\sum_{i=1}^{N}w_{i}^{2}\sigma_{i}^{2}(\Delta t)+2\sum_{i=1}^{N}\sum_{j=i+1}^{N}w_{i}w_{j}\rho_{ij}(\Delta t)\sigma_{i}(\Delta t)\sigma_{j}(\Delta t).$$

This expression now accounts for multifractality by incorporating the $q$-dependent scaling behavior of the assets, in terms of both the variance and covariance.

5.3 Multifractal Optimization Problem

The optimization problem in a multifractal setting aims to minimize the portfolio variance across both time scales and moments. The optimization problem can be formulated as:

$$\text{Minimize }\sum_{k=1}^{K}\sigma_{\text{portfolio}}^{2}(\Delta t_{k},q),$$

Here:

$$\sigma_{\text{portfolio}}^{2}(\Delta t,q)=\sum_{i}w_{i}^{2}\sigma_{i}^{2}(\Delta t)+2\sum_{i<j}w_{i}w_{j}\rho_{ij}(\Delta t)\sigma_{i}(\Delta t)\sigma_{j}(\Delta t)$$

where the variances and covariances are expanded with their respective scaling properties. The minimization is subject to:

$$\sum_{i=1}^{N}w_{i}\mu_{i}\geq\mu_{\text{target}},$$
$$\sum_{i=1}^{N}w_{i}=1,$$
$$w_{i}\geq 0,\quad\forall i.$$

Here, the optimization takes into account not only the scaling behavior across different time scales $\Delta t_{k}$ but also the varying roughness of the time series as captured by the multifractal spectrum $H_{i}(q)$. This leads to a more complex, yet more accurate, description of portfolio risk that better reflects the true nature of financial markets.
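Because the weights are shared across scales, the summed objective of this section collapses to a single quadratic form. The short worked step below makes this explicit; the aggregated matrix $\Sigma^{\text{sum}}$ is notation introduced here for illustration and does not appear in the original text:

```latex
\sum_{k=1}^{K}\sigma_{\text{portfolio}}^{2}(\Delta t_{k})
  =\sum_{k=1}^{K}\mathbf{w}^{\top}\Sigma(\Delta t_{k})\,\mathbf{w}
  =\mathbf{w}^{\top}\Sigma^{\text{sum}}\,\mathbf{w},
\qquad
\Sigma^{\text{sum}}_{ij}:=\sum_{k=1}^{K}\Sigma_{ij}(\Delta t_{k}).
```

Under this reading, the multifractal problem remains an ordinary quadratic program in $\mathbf{w}$; the scaling exponents enter only through how each $\Sigma_{ij}(\Delta t_{k})$ is modeled or estimated.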

5.4 Estimation of Multifractal Parameters

To implement the multifractal optimization framework, it is necessary to estimate the multifractal scaling exponents $H_{i}(q)$ for each asset. This can be done using methods such as the multifractal detrended fluctuation analysis (MF-DFA) or the wavelet transform modulus maxima (WTMM) method. These techniques allow for the extraction of the multifractal spectrum from historical return data.

The estimated multifractal spectrum can then be used to calculate the portfolio variance at different time scales and moments, which serves as the input for the optimization problem.
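As a rough illustration of what estimating $H(q)$ involves, the sketch below uses a plain structure-function fit, i.e. the slope of $\log\langle|x(t+\Delta t)-x(t)|^{q}\rangle$ against $\log\Delta t$. This is a deliberately simpler stand-in for MF-DFA or WTMM, not an implementation of either, and the function name and simulated data are assumptions.

```python
import numpy as np

def scaling_exponent(log_prices, q, scales=(1, 2, 4, 8, 16, 32)):
    """Slope of log <|x(t + dt) - x(t)|**q> against log dt."""
    log_prices = np.asarray(log_prices, dtype=float)
    moments = [
        np.mean(np.abs(log_prices[dt:] - log_prices[:-dt]) ** q)
        for dt in scales
    ]
    slope, _intercept = np.polyfit(np.log(scales), np.log(moments), 1)
    return slope

# Assumption: a Gaussian random walk, which is monofractal by construction.
rng = np.random.default_rng(4)
walk = np.cumsum(rng.standard_normal(100_000))
h1 = scaling_exponent(walk, q=1)
h2 = scaling_exponent(walk, q=2)
print(f"H(1) = {h1:.3f}, H(2) = {h2:.3f}, H(2) - 2 H(1) = {h2 - 2 * h1:.3f}")
```

For this monofractal test series, $H(2)-2H(1)$ should be close to zero; a clearly non-zero value on real return data would be the signature of multifractality in the sense used above.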

6 Sensitivity Analysis

We can compute the sensitivity of the weights to the Hursts, multifractal Hursts etc. First let us evaluate the dependence on volatility.

To find the optimal weights $\mathbf{w}$ that minimize the portfolio variance $\sigma_{p}^{2}=\mathbf{w}^{\top}\Sigma\mathbf{w}$ subject to the constraint $\mathbf{w}^{\top}\mathbf{1}=1$, we employ the method of Lagrange multipliers. The Lagrangian $\mathcal{L}$ is given by:

$$\mathcal{L}(\mathbf{w},\lambda)=\mathbf{w}^{\top}\Sigma\mathbf{w}-\lambda(\mathbf{w}^{\top}\mathbf{1}-1)$$

where $\lambda$ is the Lagrange multiplier associated with the budget constraint. For the first-order conditions, take the derivative of $\mathcal{L}$ with respect to $\mathbf{w}$ and set it to zero:

$$\frac{\partial\mathcal{L}}{\partial\mathbf{w}}=2\Sigma\mathbf{w}-\lambda\mathbf{1}=0$$

Solving for $\mathbf{w}$:

$$2\Sigma\mathbf{w}=\lambda\mathbf{1}\quad\Rightarrow\quad\mathbf{w}=\frac{\lambda}{2}\Sigma^{-1}\mathbf{1}$$

Using the budget constraint $\mathbf{w}^{\top}\mathbf{1}=1$:

$$\mathbf{w}^{\top}\mathbf{1}=\left(\frac{\lambda}{2}\Sigma^{-1}\mathbf{1}\right)^{\top}\mathbf{1}=\frac{\lambda}{2}\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}=1$$

Let $S=\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}$, a scalar. Then:

$$\frac{\lambda}{2}S=1\quad\Rightarrow\quad\lambda=\frac{2}{S}$$

Substituting back into the expression for $\mathbf{w}$:

$$\mathbf{w}=\frac{1}{S}\Sigma^{-1}\mathbf{1}$$

Thus, the optimal weights are:

$$\mathbf{w}=\frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}}$$
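The closed-form result can be evaluated directly. A minimal sketch follows, where the function name `min_variance_weights` and the two-asset diagonal covariance are illustrative choices:

```python
import numpy as np

def min_variance_weights(cov):
    """w = cov^{-1} 1 / (1' cov^{-1} 1); budget constraint only, so weights may be negative."""
    cov = np.asarray(cov, dtype=float)
    # s = cov^{-1} 1, obtained by solving a linear system rather than forming the inverse.
    s = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return s / s.sum()  # s.sum() equals S = 1' cov^{-1} 1

# Two uncorrelated assets with variances 0.01 and 0.04.
cov = np.diag([0.01, 0.04])
weights = min_variance_weights(cov)
print(weights)  # → [0.8 0.2]
```

With uncorrelated assets the weights are proportional to inverse variances, here $100:25$, which gives the $0.8/0.2$ split.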

6.1 Effect of Variance

We now show that increasing the variance $\sigma_{k}^{2}$ of a specific asset $k$ leads to a decrease in its optimal weight $w_{k}$, while keeping all other parameters constant.

From the optimal weights expression, the weight of asset $k$ is:

$$w_{k}=\frac{(\Sigma^{-1}\mathbf{1})_{k}}{S}=\frac{s_{k}}{S}$$

where $s_{k}=(\Sigma^{-1}\mathbf{1})_{k}$ and $S=\mathbf{1}^{\top}\Sigma^{-1}\mathbf{1}$.

To analyze the effect of increasing $\sigma_{k}^{2}$, we compute the derivative $\frac{\partial w_{k}}{\partial\sigma_{k}^{2}}$ and show that it is negative:

$$\frac{\partial w_{k}}{\partial\sigma_{k}^{2}}=\frac{\partial}{\partial\sigma_{k}^{2}}\left(\frac{s_{k}}{S}\right)=\frac{\frac{\partial s_{k}}{\partial\sigma_{k}^{2}}S-s_{k}\frac{\partial S}{\partial\sigma_{k}^{2}}}{S^{2}}$$

We first compute $\frac{\partial s_{k}}{\partial\sigma_{k}^{2}}$ and $\frac{\partial S}{\partial\sigma_{k}^{2}}$. Since $\mathbf{s}=\Sigma^{-1}\mathbf{1}$, we have:

$$\frac{\partial\mathbf{s}}{\partial\sigma_{k}^{2}}=-\Sigma^{-1}\left(\frac{\partial\Sigma}{\partial\sigma_{k}^{2}}\right)\Sigma^{-1}\mathbf{1}$$

Given that $\Sigma$ only depends on $\sigma_{k}^{2}$ through the $(k,k)$-element (the covariances being held fixed):

$$\frac{\partial\Sigma}{\partial\sigma_{k}^{2}}=\mathbf{e}_{k}\mathbf{e}_{k}^{\top}$$

where $\mathbf{e}_{k}$ is the $k$-th standard basis vector. Thus:

$$\frac{\partial\mathbf{s}}{\partial\sigma_{k}^{2}}=-\Sigma^{-1}\mathbf{e}_{k}\mathbf{e}_{k}^{\top}\Sigma^{-1}\mathbf{1}$$

Specifically, the derivative of $s_{k}$ is:

$$\frac{\partial s_{k}}{\partial\sigma_{k}^{2}}=-(\Sigma^{-1})_{kk}s_{k}$$

Similarly, the derivative of $S$ is:

$$\frac{\partial S}{\partial\sigma_{k}^{2}}=-s_{k}^{2}$$

Substituting back gives the final expression for the derivative:

$$\frac{\partial w_{k}}{\partial\sigma_{k}^{2}}=\frac{-(\Sigma^{-1})_{kk}s_{k}S+s_{k}^{3}}{S^{2}}=\frac{s_{k}(-(\Sigma^{-1})_{kk}S+s_{k}^{2})}{S^{2}}$$

Since $\Sigma^{-1}$ is positive definite, $(\Sigma^{-1})_{kk}>0$, and the Cauchy-Schwarz inequality in the inner product defined by $\Sigma^{-1}$ gives $s_{k}^{2}=(\mathbf{e}_{k}^{\top}\Sigma^{-1}\mathbf{1})^{2}<(\Sigma^{-1})_{kk}S$ whenever there is more than one asset, so the bracket is negative. For an asset held with positive weight ($s_{k}>0$), it follows that:

$$\frac{\partial w_{k}}{\partial\sigma_{k}^{2}}<0$$
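The sign and magnitude of this derivative can be checked numerically. The sketch below compares the analytic expression $s_{k}(s_{k}^{2}-(\Sigma^{-1})_{kk}S)/S^{2}$ with a finite difference on a hypothetical three-asset covariance matrix; the matrix, the asset index and the bump size are arbitrary illustrative choices.

```python
import numpy as np

def min_variance_weights(cov):
    s = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return s / s.sum()

def analytic_derivative(cov, k):
    """d w_k / d sigma_k^2 = s_k (s_k^2 - (cov^{-1})_kk S) / S^2."""
    inv = np.linalg.inv(cov)
    s = inv @ np.ones(cov.shape[0])
    S = s.sum()
    return s[k] * (s[k] ** 2 - inv[k, k] * S) / S ** 2

# Hypothetical covariance matrix, used only for illustration.
cov = np.array([[0.04, 0.01, 0.00],
                [0.01, 0.09, 0.02],
                [0.00, 0.02, 0.16]])
k, eps = 1, 1e-7
bumped = cov.copy()
bumped[k, k] += eps  # raise the variance of asset k, covariances held fixed
numeric = (min_variance_weights(bumped)[k] - min_variance_weights(cov)[k]) / eps
analytic = analytic_derivative(cov, k)
print(f"numeric = {numeric:.6f}, analytic = {analytic:.6f}")
```

Both values come out negative and agree to within the finite-difference error, consistent with the derivation for an asset held with positive weight.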

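Under the scaling law $\sigma_{k}(\Delta t)\propto|\Delta t|^{H_{k}}$, we know that

$$\frac{\partial\sigma_{k}}{\partial H_{k}}=|\Delta t|^{H_{k}}\ln(\Delta t),$$

which is positive for $\Delta t>1$, i.e. for scales longer than the reference scale. By the chain rule, it follows that

$$\frac{\partial w_{k}}{\partial H_{k}}<0$$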

6.2 Effect of Correlation and Multifractality

Next, we consider the effect of increasing the correlation between two assets, say $i$ and $j$, on their combined weight $w_{i}+w_{j}$. The covariance between assets $i$ and $j$ is given by:

$$\sigma_{ij}=\rho_{ij}\sigma_{i}\sigma_{j}$$

where $\rho_{ij}$ is the correlation between assets $i$ and $j$. Increasing $\rho_{ij}$ increases the covariance $\sigma_{ij}$.

The weights $w_{i}$ and $w_{j}$ depend on the inverse of the covariance matrix $\Sigma^{-1}$. To compute the effect of increasing $\rho_{ij}$, we differentiate the weights $w_{i}$ and $w_{j}$ with respect to $\rho_{ij}$:

$$\frac{\partial w_{i}}{\partial\rho_{ij}}=\frac{\partial s_{i}}{\partial\rho_{ij}}\cdot\frac{1}{S}-s_{i}\cdot\frac{\partial S}{\partial\rho_{ij}}\cdot\frac{1}{S^{2}}$$
$$\frac{\partial w_{j}}{\partial\rho_{ij}}=\frac{\partial s_{j}}{\partial\rho_{ij}}\cdot\frac{1}{S}-s_{j}\cdot\frac{\partial S}{\partial\rho_{ij}}\cdot\frac{1}{S^{2}}$$

where $s_{i}=(\Sigma^{-1}\mathbf{1})_{i}$ and $s_{j}=(\Sigma^{-1}\mathbf{1})_{j}$. The derivative $\frac{\partial\Sigma}{\partial\rho_{ij}}$ has nonzero entries only at positions $(i,j)$ and $(j,i)$. Individually, $w_{i}$ and $w_{j}$ might increase or decrease. However, if we optimise over the combined asset with weight $w_{i}+w_{j}$, we can conclude that the combined weight of the synthetic asset decreases with an increase in correlation. This is because, in this modified picture, all we are doing is changing the variance of the synthetic asset (with $lr$ denoting log returns):

$$\langle lr_{\text{synthetic},ij}^{2}\rangle=\langle(lr_{i}+lr_{j})^{2}\rangle=\langle(lr_{i})^{2}\rangle+\langle(lr_{j})^{2}\rangle+2\langle lr_{i}lr_{j}\rangle$$

So all the other variances and covariances are the same, except that of the synthetic combination of $i$ and $j$. Thus, if we freeze the composition of the synthetic asset and denote its weight by $w_{ij}$, we get from the previous result:

$$\frac{\partial w_{ij}}{\partial\rho_{ij}}\leq 0$$

Thus, for $\Delta t>1$:

$$\frac{\partial w_{ij}}{\partial H_{ij}}=\frac{\partial w_{ij}}{\partial\rho_{ij}}\frac{\partial\rho_{ij}}{\partial H_{ij}}=\frac{\partial w_{ij}}{\partial\rho_{ij}}\cdot|\Delta t|^{H_{ij}}\ln(\Delta t)<0$$

Here note that although empirically $H_{ij}>0$, which causes a positive multifractality, one could have had $H_{ij}<0$ as well in principle. In that case the correlations would decay with scale, and hence the weight of the synthetic asset would increase with an increase in the multifractal parameter $H_{ij}$.
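The effect of correlation on the combined weight can be illustrated on a toy case: three unit-variance assets, where assets 0 and 1 have correlation $\rho$ and asset 2 is uncorrelated with both. For this particular setup the minimum variance solution gives a combined weight of $2/(3+\rho)$. The example is an illustrative assumption and not a proof of the general statement.

```python
import numpy as np

def min_variance_weights(cov):
    s = np.linalg.solve(cov, np.ones(cov.shape[0]))
    return s / s.sum()

def combined_weight(rho):
    """Weight of assets 0 and 1 together when their correlation is rho."""
    cov = np.array([[1.0, rho, 0.0],
                    [rho, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])
    w = min_variance_weights(cov)
    return w[0] + w[1]

rhos = [0.0, 0.3, 0.6, 0.9]
combined = [combined_weight(r) for r in rhos]
for r, c in zip(rhos, combined):
    print(f"rho = {r:.1f}: combined weight = {c:.4f}, 2/(3+rho) = {2 / (3 + r):.4f}")
```

The combined weight falls monotonically as $\rho$ rises, in line with the sign derived above.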

6.3 Summary

To summarize our results we get as expected:

  • If you increase the volatility of a stock, its portfolio weight goes down.

  • If you increase the Hurst exponent ($\beta/2$) of a stock, $H_{i}$ goes up and its portfolio weight goes down.

  • If you increase the fat-tailedness of a stock ($1/\alpha$), $H_{i}$ goes up and its portfolio weight goes down.

  • If you increase the correlation between two stocks, their combined weight goes down.

  • If you introduce positive multifractality in two stocks' correlation (increased Epps effect), $H_{ij}$ goes up and their combined weight goes down; and vice versa.

7 Out of Sample Performance

To rigorously assess the efficacy of the proposed multiscale optimization methods, we conducted an out-of-sample evaluation in two scenarios: the minimum variance portfolio and the maximal Sharpe portfolio, with the mean estimated using a simple moving average. The portfolio weights correspond to allocations across the 11 sectors of the S&P 500. The following choices were used for the backtest:

  • We utilized five years of data, spanning from 2019 to 2024, a period that notably includes significant market events such as the March 2020 COVID-19 crash and the 2022 market correction.

  • The SPDR sector ETFs were used in lieu of allocation to the 11 sectors of the S&P 500, making it a long-only sector rotation strategy. We also considered a factor rotation strategy taking 9 factors (quality, growth, value, low volatility, etc.), where each factor was represented by ETFs.

  • A lookback of six months was chosen (125 days), which is common in such studies.

  • Minimum variance portfolios with overlapping vs non-overlapping averages were considered at lower frequencies.

  • At each scale, the covariance matrix was evaluated on the same period, but lower frequency covariances were averaged over all possible non-overlapping sets to increase robustness. We compute the covariance matrix at a range of scales and average: $\Sigma_{ij}^{MS}=\langle\Sigma_{ij}(\Delta t)/\Delta t\rangle_{\Delta t}$.

  • Transaction costs were assumed to be negligible for the purpose of this study.

Table 1: Backtest Results for Various Portfolio Optimization Methods (Sector Rotation)

| Method | Sharpe Ratio | Sortino Ratio | Max Drawdown (%) |
|---|---|---|---|
| Equally Weighted | 0.45 | 0.53 | -39.9 |
| Traditional Markowitz | 0.35 | 0.41 | -33.1 |
| Multiscale Markowitz | 0.53 | 0.62 | -31.5 |
| Multiscale Markowitz (Overlapping) | 0.49 | 0.58 | -30.3 |

Table 2: Backtest Results for Various Portfolio Optimization Methods (Factor Rotation)

| Method | Sharpe Ratio | Sortino Ratio | Max Drawdown (%) |
|---|---|---|---|
| Equally Weighted | 0.42 | 0.52 | -39.1 |
| Traditional Markowitz | 0.43 | 0.49 | -36.9 |
| Multiscale Markowitz | 0.53 | 0.61 | -36.3 |
| Multiscale Markowitz (Overlapping) | 0.57 | 0.66 | -34.9 |

Our findings indicate that the multiscale optimization method results in portfolios with higher Sharpe ratios and Sortino ratios, as well as lower kurtosis and drawdowns. This enhanced performance is attributable to the multiscale method’s ability to account for the effects of non-ellipticity and fat tails, phenomena that are often inadequately captured by traditional minimum variance portfolios [3, 10].
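For readers who wish to reproduce the three reported metrics, a minimal sketch follows. The paper does not state its conventions, so everything here is an assumption: 252 trading days per year, a zero risk-free rate, a zero target for the Sortino downside deviation, and drawdown measured on compounded wealth starting from 1.

```python
import numpy as np

TRADING_DAYS = 252  # assumed annualisation factor

def sharpe_ratio(returns):
    returns = np.asarray(returns, dtype=float)
    return np.sqrt(TRADING_DAYS) * returns.mean() / returns.std(ddof=1)

def sortino_ratio(returns):
    returns = np.asarray(returns, dtype=float)
    # Downside deviation: root mean square of the negative returns only.
    downside = np.sqrt(np.mean(np.minimum(returns, 0.0) ** 2))
    return np.sqrt(TRADING_DAYS) * returns.mean() / downside

def max_drawdown_pct(returns):
    returns = np.asarray(returns, dtype=float)
    # Wealth path including the starting value of 1.
    wealth = np.concatenate(([1.0], np.cumprod(1.0 + returns)))
    running_peak = np.maximum.accumulate(wealth)
    return 100.0 * np.min(wealth / running_peak - 1.0)

toy = np.array([0.10, -0.50, 0.20, 0.10])  # toy simple returns
print(f"max drawdown: {max_drawdown_pct(toy):.1f}%")
print(f"Sharpe: {sharpe_ratio(toy):.2f}, Sortino: {sortino_ratio(toy):.2f}")
```

Different conventions, for instance a non-zero risk-free rate or a different downside target, would shift the Sharpe and Sortino levels but not the ranking logic behind the comparison.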

References

  • [1] Benoit B. Mandelbrot and John W. Van Ness, "Fractional Brownian motions, fractional noises and applications," SIAM Review, vol. 10, no. 4, pp. 422–437, 1968.
  • [2] Jim Gatheral, Thibault Jaisson, and Mathieu Rosenbaum, "Volatility is rough," Quantitative Finance, vol. 18, no. 6, pp. 933–949, 2018.
  • [3] Laurent E. Calvet and Adlai J. Fisher, "Multifractality in asset returns: Theory and evidence," Review of Economics and Statistics, vol. 84, no. 3, pp. 381–406, 2002.
  • [4] Steven L. Heston, "A closed-form solution for options with stochastic volatility with applications to bond and currency options," The Review of Financial Studies, vol. 6, no. 2, pp. 327–343, 1993.
  • [5] Gregory Berman and Lawrence Hochberg, "Multiscale modeling of financial time series and portfolio optimization," Journal of Investment Strategies, vol. 1, no. 2, pp. 45–67, 2008.
  • [6] Jean-François Muzy, Jérôme Delour, and Emmanuel Bacry, "Modelling fluctuations of financial time series: from cascade process to stochastic volatility model," The European Physical Journal B-Condensed Matter and Complex Systems, vol. 17, no. 3, pp. 537–548, 2000.
  • [7] Andrew Ang and Allan Timmermann, "Regime Changes and Financial Markets," Annual Review of Financial Economics, vol. 4, no. 1, pp. 313–337, 2012.
  • [8] Edgar E. Peters, Fractal Market Analysis: Applying Chaos Theory to Investment and Economics, John Wiley & Sons, 1994.
  • [9] Rama Cont and Jean-Philippe Bouchaud, "Herd behavior and aggregate fluctuations in financial markets," Macroeconomic Dynamics, vol. 4, no. 2, pp. 170–196, 2000.
  • [10] Robert J. Bianchi, Michael E. Drew, and Jesse H. Fan, "Multiscale Hedge Fund Performance Persistence: Evidence from Wavelet Analysis," Journal of Financial Econometrics, vol. 13, no. 1, pp. 50–78, 2015.
  • [11] H. E. Hurst, "Long-term storage capacity of reservoirs," Transactions of the American Society of Civil Engineers, vol. 116, no. 1, pp. 770–799, 1951.
  • [12] Benoit Mandelbrot, "The variation of certain speculative prices," The Journal of Business, vol. 36, no. 4, pp. 394–419, 1963.
  • [13] Rama Cont, "Empirical properties of asset returns: stylized facts and statistical issues," Quantitative Finance, vol. 1, no. 2, pp. 223–236, 2001.
  • [14] Yakov Amihud and Haim Mendelson, "Asset pricing and the bid-ask spread," Journal of Financial Economics, vol. 17, no. 2, pp. 223–249, 1986.
  • [15] Thomas W. Epps, "Comovements in stock prices in the very short run," Journal of the American Statistical Association, vol. 74, no. 366a, pp. 291–298, 1979.