Multiscale Markowitz
Abstract
Traditional Markowitz portfolio optimization constrains daily portfolio variance to a target value, optimising returns, Sharpe ratio or variance within this constraint. However, this approach overlooks the relationship between variance at different time scales, typically described by $\sigma^2(\tau) \propto \tau^{2H}$, where $\tau$ is the time scale and $H$ is the Hurst exponent, most of the time assumed to be $1/2$. This paper introduces a multifrequency optimization framework that allows investors to specify target portfolio variance across a range of frequencies, characterized by a target Hurst exponent $H_{\text{target}}$, or to optimize the portfolio at multiple time scales. By incorporating this scaling behavior, we enable a more nuanced and comprehensive risk management strategy that aligns with investor preferences at various time scales. This approach manages portfolio risk across multiple frequencies and adapts to different market conditions, providing a robust tool for dynamic asset allocation. It addresses some of the traditional limitations of Markowitz optimization in dealing with crashes, regime changes, volatility clustering or multifractality in markets. We illustrate this concept with a toy example and discuss the practical implementation for assets with varying scaling behaviors.
1 Introduction
In the classical Markowitz portfolio optimization framework, the daily portfolio variance $\sigma_p^2$ is typically constrained to a predefined maximum value, such that $\sigma_p^2 \le \sigma_{\max}^2$. The objective is then to maximize returns while adhering to this variance constraint. This approach assumes a fixed variance at a single time scale, usually on a daily basis.
However, it is well understood that the variance of asset returns across different time scales $\tau$ is, to a large extent, related by a scaling law of the form:
$$\sigma^2(\tau) \propto \tau^{2H},$$
where $H = 1/2$ for Brownian motion [1]. Despite this, the target portfolio variance is typically defined at only one frequency (commonly daily), without consideration for how variance behaves across other time scales. In practice, it would be reasonable to expect that investors might have target portfolio variances across a range of frequencies, from the fast trade adjustment frequency to long-term portfolio objectives. The most general form of this would be an arbitrary function of scale, $\sigma^2_{\text{target}}(\tau)$, which need not be continuous or differentiable. This function allows investors to specify their risk preferences over a spectrum of relevant time scales, for example, from one day to one month or more [6].
For simplicity, however, we can assume self-similarity in the variance structure and instead fix the target scaling behavior, characterized by a target Hurst exponent $H_{\text{target}}$. This gives us the relationship:
$$\sigma^2_{\text{target}}(\tau) \propto \tau^{2H_{\text{target}}}.$$
This approach allows for flexibility in portfolio optimization based on different investor risk preferences across time scales [8]. For example, an investor who is more risk-averse to lower frequency volatility (e.g., weekly returns) than higher frequency volatility (e.g., daily returns) might prefer a target Hurst exponent $H_{\text{target}} < 1/2$. This situation could arise if a fund seeks to limit drawdowns at lower frequencies, where volatility is naturally higher. Conversely, if an investor is willing to tolerate more variance at lower frequencies, the target Hurst exponent might be set such that $H_{\text{target}} > 1/2$ [2].
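As a small illustration of a self-similar target, the sketch below tabulates a target variance of the form $\sigma^2_{\text{target}}(\tau) = \sigma_1^2\,\tau^{2H_{\text{target}}}$ at horizons of one day, one week and one month. The function name `target_variance`, the daily variance of $10^{-4}$ and the three exponents are illustrative assumptions, not values taken from this study.

```python
def target_variance(tau_days, daily_variance, hurst_target):
    """Target variance at horizon tau_days under sigma^2(tau) = sigma_1^2 * tau^(2H)."""
    if tau_days <= 0:
        raise ValueError("tau_days must be positive")
    return daily_variance * tau_days ** (2.0 * hurst_target)


daily_variance = 1e-4  # hypothetical: 1% daily volatility
for hurst in (0.4, 0.5, 0.6):  # hypothetical target exponents
    schedule = [target_variance(tau, daily_variance, hurst) for tau in (1, 5, 21)]
    print(hurst, schedule)
```

With an exponent below one half the tolerated variance grows more slowly than linearly in the horizon, which is the behaviour an investor averse to low-frequency risk would ask for.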
In this article, we first justify the need for a multifrequency approach to time series analysis through observed stylized facts, such as varying volatility, fat tails and autocorrelation. Then we study a toy example based on multifractality and show how optimal weights in this context depend on the various parameters.
Finally, we show on US sector index-tracking ETFs that multifrequency optimization outperforms traditional Markowitz optimization based on daily variances and covariances.
1.1 Why Multiscale Optimization?
We might question why there is a need at all to do multiscale portfolio optimization. Isn’t the single scale case enough?
In reality, two facts are observed in stock and stock index prices. On the one hand, non-trivial self-similarity with varying critical exponents has been observed in price time series since the time of Mandelbrot (who noticed it for cotton prices). At lower frequencies (weekly and below), momentum appears, i.e. a Hurst exponent above $1/2$, while fat tails fade away. In contrast, intraday returns display fatter tails but negative autocorrelation, i.e. a Hurst exponent below $1/2$.
On the other hand, periods of high and low volatility act like an acceleration and a slowdown of the market clock. Using a fixed time clock for the return series amounts to a kind of random time sampling with respect to the market "volatility time". We will see that volatility is, in fact, a hidden market variable. In other words, unless volatility is treated as a hidden variable, price series are not Markovian, despite the Efficient Market Hypothesis, which one needs to enter high-frequency trading to invalidate.
The logic of all this is that investors at marginally different time horizons behave in a similar way. Dynamic phase transitions are also characterised by self-similar fixed points (markets show these near crashes and regime shifts). This matters because there are cases where Markowitz breaks down, particularly when volatility spikes: exposures calibrated on a low-volatility period turn out to be far too large for the newly appearing high volatility. We will show that this can be addressed by multiscale optimization.
1.2 Stylised Facts
The following are typical stylised facts observed in the stock market and in financial markets in general:
- Volatility clusters: Market volatility varies through time in an irregular and time-asymmetric manner. Volatility may suddenly surge following an event or news item that triggers instability, and then takes time to decrease progressively.
- Rough volatility: This causes the volatility of some assets to increase disproportionately at longer scales, causing over-allocation to them [2]. Globally, volatility remains in a limited range, implying a negative autocorrelation of its variations at various time scales.
- Non-ellipticity: Asset returns exhibit varying fat-tailed distributions subject to the tail concentration effect: under extreme conditions, the number of effective variables driving the market shrinks to just a few. The Hurst exponent also varies due to market impact, e.g. for illiquid assets, corporate bonds, PE/VC funds and emerging markets. Markowitz is known to generally over-allocate to illiquid assets [12, 13, 14].
- Crashes and bubbles: During crashes, Markowitz often worsens drawdowns relative to an equally weighted portfolio [9]. This is because the volatility begins to scale non-trivially with scale. We expect to see power laws during bubbles as well, with imaginary critical exponents.
- Stochastic volatility: This introduces uncertainty in the volatility of the different assets, resulting in instabilities and estimation errors, in addition to poor generalizability out of sample [4].
- Regime changes: Markowitz is unable to handle stochastic regime changes such as shocks induced by changes in interest rates, unemployment numbers, and global macroeconomic and political factors [7].
Many of these are instances where the volatility becomes increasingly rough, with Hurst exponents substantially different from one half, or where tails become disproportionately fat [2]. This is, in fact, what happens in market crashes. Bubbles are also characterized by Hurst exponents, but imaginary ones, corresponding to dynamics governed by the Log Periodic Power Law (LPPL). Markowitz is based on the condition of ellipticity, where all stocks have the same Hurst exponent, so that optimal weights at one time scale generalize across scales.
2 Motivations
2.1 Fractional Diffusion PDE
We are motivated by the fact that price time series have been shown to exhibit self-similarity in various studies, starting with Mandelbrot [12]. Self-similar dynamics also arise at the endpoints of renormalization group flows. We start with the most general non-interacting PDE we can write down describing fractional diffusion in both space and time:
$$\frac{\partial^{\alpha} P(x,t)}{\partial t^{\alpha}} = -D_{\alpha,\beta}\,(-\Delta)^{\beta/2} P(x,t),$$
where
- $P(x,t)$ is the probability density function,
- $\partial^{\alpha}/\partial t^{\alpha}$ is the fractional Caputo derivative (with $0 < \alpha \le 1$),
- $(-\Delta)^{\beta/2}$ is the fractional Laplacian (Riesz derivative) of order $\beta$ (with $0 < \beta \le 2$),
- $D_{\alpha,\beta}$ is a generalised diffusion coefficient.
Here we see that there are qualitatively four types of cases, depending on the order $\alpha$ of the time derivative and the order $\beta$ of the space derivative:
- Brownian motion ($\alpha = 1$, $\beta = 2$): This is the usual Markowitz case. Here $H = 1/2$.
- Fat tails ($\alpha = 1$, $\beta < 2$): This corresponds to no time dependence but fat tails with a power law.
- Gaussian distribution with time dependence ($\alpha < 1$, $\beta = 2$): Here we have the benefit of no fat tails, but time dependence causes anomalous scaling.
- Time dependence and fat tails ($\alpha < 1$, $\beta < 2$): This is the most general case, with time dependence and fat tails. Here we have that $H = \alpha/\beta$.
Note that in most studies, the Hurst exponent is considered to be solely due to time dependence, with $\beta = 2$. However, this obscures the fact that anomalous scaling can be due to both time dependence and fat tails. Hence, in the rest of this study, when we refer to the Hurst exponent we mean the standardised Hurst exponent, defined as
$$H = \frac{\alpha}{\beta},$$
where $\alpha$ is the order of the time derivative and $\beta$ is the order of the fractional Laplacian. This recognises that anomalous scaling can be due to both fat tails and time dependence. In a similar vein, in the presence of nonlinearities we can in principle have multifractality due to both the effect of the probability distribution ($\beta(q)$) and the interactions ($\alpha(q)$). Analogously, we have a standardised generalized Hurst exponent:
$$H(q) = \frac{\alpha(q)}{\beta(q)}.$$
The direct observable here is the scaling law $\sigma^2(\tau) \propto \tau^{2H}$. Separating the contributions of the i.i.d. distribution and of the dependence structure to it is more subtle and will not be addressed here.
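The ratio of the two orders can be motivated by a short scaling argument, sketched here for the one-dimensional equation under the assumption of a self-similar solution. The symbols $\alpha$ (time order), $\beta$ (space order), $D$ (diffusion coefficient) and the scaling function $f$ are notation chosen for this sketch.

```latex
% Assume a self-similar solution of the space-time fractional diffusion equation
%   \partial_t^{\alpha} P = -D\,(-\Delta)^{\beta/2} P :
P(x,t) = t^{-H} f\!\left(\frac{x}{t^{H}}\right)
% The time derivative of order \alpha scales as t^{-\alpha};
% the fractional Laplacian of order \beta scales as x^{-\beta} \sim t^{-H\beta}.
\partial_t^{\alpha} P \sim t^{-\alpha} P,
\qquad
(-\Delta)^{\beta/2} P \sim t^{-H\beta} P
% Matching the two powers of t gives the standardised exponent:
\alpha = H\beta \;\Longrightarrow\; H = \frac{\alpha}{\beta},
\qquad \text{e.g. } \alpha = 1,\ \beta = 2 \ \Rightarrow\ H = \tfrac{1}{2}
```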
The discerning reader might be concerned about the inconsistency of the Markowitz paradigm with fat tails, since the covariance matrix does not in principle converge in their presence. The solution is straightforward: one can replace the covariance matrix with a modified covariance matrix $\tilde{\Sigma}$ and solve the optimisation problem
$$\min_{w}\; w^\top \tilde{\Sigma}\, w \quad \text{subject to} \quad \sum_i w_i = 1,$$
where the modified covariance matrix is written as
$$\tilde{\Sigma}_{ij} = \rho_{ij}\, d_i\, d_j,$$
where $\rho_{ij}$ are just the cross-asset correlations and $d_i$ and $d_j$ are absolute deviations from the robust central estimate (median). This will in principle have better convergence properties in the presence of fat tails, but should not make a substantial difference to the allocations.
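A minimal sketch of such a robust covariance estimate is given below. It assumes that the deviation of each asset is the mean absolute deviation about its median and that the correlations are plain Pearson correlations; both choices, and the function names, are assumptions made for illustration rather than the estimator used in this study.

```python
from math import sqrt
from statistics import mean, median


def mean_abs_dev_about_median(xs):
    """Mean absolute deviation of a series about its median."""
    centre = median(xs)
    return mean([abs(x - centre) for x in xs])


def pearson_correlation(xs, ys):
    """Plain Pearson correlation of two equally long series."""
    mean_x, mean_y = mean(xs), mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def modified_covariance(returns):
    """Matrix rho_ij * d_i * d_j built from a list of per-asset return series."""
    devs = [mean_abs_dev_about_median(r) for r in returns]
    n = len(returns)
    return [[(1.0 if i == j else pearson_correlation(returns[i], returns[j]))
             * devs[i] * devs[j]
             for j in range(n)]
            for i in range(n)]


# Hypothetical toy data: the second series is twice the first.
print(modified_covariance([[1, 2, 3, 4, 5], [2, 4, 6, 8, 10]]))
```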
3 Implementation
In practice, it would work in the following manner. The returns are invariant under rescaling, so the return maximization condition remains as is across a range of scales. As we progressively move across frequencies, we measure the variances and covariances of the assets at each time scale $\tau$, $\Sigma_{ij}(\tau)$, and then for each frequency we get the portfolio variance:
$$\sigma_p^2(\tau) = \sum_{i,j} w_i\, w_j\, \Sigma_{ij}(\tau).$$
We have a few options on how to implement this:
1. For each frequency, we impose maximal return under the constraint
$$\sigma_p^2(\tau) \le \sigma^2_{\text{target}}(\tau),$$
where $\sigma^2_{\text{target}}(\tau) \propto \tau^{2H_{\text{target}}}$. This approach ensures that risk, in the form of variance, is effectively managed across a range of frequencies [5].
2. Construct minimum variance or maximal Sharpe portfolios with the variance estimated by taking an average across scales, or by averaging weights after multiscale optimization [10]. In the latter alternative, weights are computed at each scale and then averaged before renormalization.
3. Construct a minimum variance or maximal Sharpe portfolio where the variance is a multiscale estimate, involving the average over scales $\tau$.
We consider only the last case since the first does not sufficiently constrain the portfolio. We use the minimum variance optimisation, the advantage of which is that it directly measures the efficacy of the variance estimator without dependence on the mean estimator [10]. Let us see how multi-frequency optimization constrains our weights at a range of scales, beginning with special cases.
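A multiscale covariance estimate of this kind can be sketched as follows. The sketch aggregates daily returns over non-overlapping windows of length $\tau$, averages the covariance over every possible window offset, and then averages across scales. Dividing each scale by $\tau$ to express it in per-day units is an assumption of this sketch, as are the function names; each offset must leave at least two windows.

```python
def aggregate(returns, tau, offset=0):
    """Sum returns over consecutive non-overlapping windows of length tau."""
    return [sum(returns[s:s + tau])
            for s in range(offset, len(returns) - tau + 1, tau)]


def sample_covariance(xs, ys):
    """Unbiased sample covariance of two equally long series."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / (n - 1)


def multiscale_covariance(asset_returns, scales):
    """Average over scales of the per-day covariance matrix Sigma(tau) / tau."""
    n = len(asset_returns)
    total = [[0.0] * n for _ in range(n)]
    for tau in scales:
        for offset in range(tau):  # every possible non-overlapping partition
            agg = [aggregate(r, tau, offset) for r in asset_returns]
            for i in range(n):
                for j in range(n):
                    cov = sample_covariance(agg[i], agg[j])
                    # one factor 1/tau averages the offsets,
                    # the other converts the scale-tau value to per-day units
                    total[i][j] += cov / (tau * tau * len(scales))
    return total


# Hypothetical mean-reverting daily series: its two-day returns are all zero,
# so the multiscale estimate is lower than the daily variance alone.
daily = [1.0, -1.0] * 4
print(multiscale_covariance([daily], scales=[1, 2]))
```

For a series with negative autocorrelation, as in the toy input, the longer scale contributes less variance than Brownian scaling would predict, which is exactly the information a daily-only estimate discards.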
4 Special Cases
4.1 Elliptical Case
Consider a simplified scenario where all assets exhibit the same standardised Hurst exponent, $H_i = H$ for all $i$, so $\Sigma_{ij}(\tau) \propto \tau^{2H}$. Here, although we have anomalous scaling, the optimization problem can be solved at any given frequency, and the resulting solution will be applicable across all frequencies. This is due to the fact that when the scale is varied through a scaling transformation $\tau \to \lambda\tau$, the portfolio variance is modified by a factor of $\lambda^{2H}$ [1], thus rescaling our optimisation function by a constant.
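The invariance can be written out in one line. The sketch below assumes a common exponent $H$, a scale change $\tau \to \lambda\tau$ and a minimum-variance objective under a budget constraint; $\lambda$ and $\Sigma(\tau)$ are notation introduced for this sketch.

```latex
\Sigma(\lambda\tau) = \lambda^{2H}\,\Sigma(\tau)
\;\Longrightarrow\;
\arg\min_{w:\,\mathbf{1}^\top w = 1} w^\top \Sigma(\lambda\tau)\, w
= \arg\min_{w:\,\mathbf{1}^\top w = 1} \lambda^{2H}\, w^\top \Sigma(\tau)\, w
= \arg\min_{w:\,\mathbf{1}^\top w = 1} w^\top \Sigma(\tau)\, w
```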
However, in practice, assets often exhibit diverse scaling behaviors with frequency and are typically not self-similar. In such cases, it is necessary to compute the optimization independently at each frequency to account for these differences [6].
4.2 General Case
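In a more general scenario, the variances and covariances of different assets, denoted as $\sigma_i^2(\tau)$ and $\Sigma_{ij}(\tau)$ respectively, can vary arbitrarily with scale and are derived from empirical data. Similarly, the target variance can be expressed as an arbitrary function of $\tau$. Under these circumstances, the optimization problem is formulated as follows:
- Minimize the portfolio variance averaged over the $K$ scales considered: $\min_{w} \frac{1}{K}\sum_{k=1}^{K} \sigma_p^2(\tau_k)$.
- Here $\sigma_p^2(\tau) = \sum_{i,j} w_i\, w_j\, \Sigma_{ij}(\tau)$.
- Ensure the portfolio weights, which are shared across scales, sum to one: $\sum_i w_i = 1$.
- Maintain non-negative weights: $w_i \ge 0$.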
This general case highlights the complexity of portfolio optimization when dealing with assets that exhibit non-uniform scaling behavior across different time scales.
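A long-only, fully invested minimum-variance problem of this form can be solved numerically in many ways. The sketch below uses projected gradient descent onto the probability simplex, which is a generic solver chosen here for illustration and not necessarily the one used in this study. The input `cov` stands for whatever covariance matrix is being minimised, for example an average over scales; the function names and the toy matrices are hypothetical.

```python
def project_to_simplex(v):
    """Euclidean projection of v onto {w : w_i >= 0, sum_i w_i = 1}."""
    u = sorted(v, reverse=True)
    cumulative, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        cumulative += u_k
        shift = (cumulative - 1.0) / k
        if u_k - shift > 0:
            theta = shift  # keep the shift of the last index that stays positive
    return [max(x - theta, 0.0) for x in v]


def long_only_min_variance(cov, steps=5000):
    """Minimise w' cov w subject to w >= 0 and sum(w) = 1."""
    n = len(cov)
    w = [1.0 / n] * n
    # 2 * (largest absolute row sum) bounds the Lipschitz constant of the gradient
    step_size = 1.0 / (2.0 * max(sum(abs(c) for c in row) for row in cov))
    for _ in range(steps):
        grad = [2.0 * sum(cov[i][j] * w[j] for j in range(n)) for i in range(n)]
        w = project_to_simplex([w[i] - step_size * grad[i] for i in range(n)])
    return w


# Hypothetical inputs: in the second case the unconstrained solution would short
# asset 2, so the non-negativity constraint binds.
print(long_only_min_variance([[1.0, 0.0], [0.0, 4.0]]))
print(long_only_min_variance([[1.0, 1.5], [1.5, 4.0]]))
```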
5 Effect of Multifractality
5.1 Introduction to Multifractality
Multifractality in financial time series reflects the idea that different parts of the data may exhibit different scaling behaviors. Unlike a monofractal process, which is characterized by a single Hurst exponent $H$, a multifractal process is described by a spectrum of exponents $H(q)$, where $q$ is a moment order. This spectrum captures the complex, heterogeneous nature of financial markets, where the roughness of returns can vary depending on the time scale and the statistical moment being considered.
In a multifractal framework, the variance scaling law is generalized to:
$$\mathbb{E}\left[\,|r_\tau|^{q}\,\right] \propto \tau^{\zeta(q)},$$
where $r_\tau$ is the return over time scale $\tau$ and $\zeta(q) = q\,H(q)$ is the multifractal scaling function, which depends on the moment $q$. For example, $q = 2$ corresponds to the traditional Hurst exponent used in variance scaling. Multifractality corresponds to $\zeta(q)$ being a nonlinear function of $q$.
5.2 Multifractal Portfolio Variance
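To incorporate multifractality into portfolio optimization, the variance of each asset at a given time scale $\tau$ is modeled using the multifractal formalism:
$$\sigma_i^2(\tau) = \sigma_i^2(1)\,\tau^{\zeta_i(2)},$$
where $\zeta_i(q)$ is the multifractal scaling exponent for asset $i$ at moment $q$. For Markowitz, we will be interested in $q = 2$, where multifractality implies that $\zeta_i(2) \neq 2\,\zeta_i(1)$. Similarly, the covariance between stocks $i$ and $j$ scales as:
$$\Sigma_{ij}(\tau) = \Sigma_{ij}(1)\,\tau^{\zeta_{ij}(2)},$$
where multifractality ensures $\zeta_{ij}(2) \neq \tfrac{1}{2}\left[\zeta_i(2) + \zeta_j(2)\right]$. In fact, from the above expressions, if we assume a scaling exponent $\gamma_{ij}$ for the correlation:
$$\rho_{ij}(\tau) = \frac{\Sigma_{ij}(\tau)}{\sigma_i(\tau)\,\sigma_j(\tau)} \propto \tau^{\gamma_{ij}},$$
we find that:
$$\gamma_{ij} = \zeta_{ij}(2) - \tfrac{1}{2}\left[\zeta_i(2) + \zeta_j(2)\right].$$
This is well known to be non-zero and positive as a result of the Epps effect [15], where the correlation between any pair of stocks is expected to increase with increasing time scale. Empirical studies of the Epps effect further indicate that the variation of cross-correlation with scale decreases as one moves to lower and lower frequencies. The portfolio variance at time scale $\tau$ then becomes:
$$\sigma_p^2(\tau) = \sum_i w_i^2\,\sigma_i^2(1)\,\tau^{\zeta_i(2)} + \sum_{i \neq j} w_i\, w_j\, \Sigma_{ij}(1)\,\tau^{\zeta_{ij}(2)}.$$
This expression now accounts for multifractality by incorporating the $q$-dependent scaling behavior of the assets, in terms of both the variance and covariance.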
5.3 Multifractal Optimization Problem
The optimization problem in a multifractal setting aims to minimize the portfolio variance across both time scales and moments. The optimization problem can be formulated as:
$$\min_{w}\; \frac{1}{K}\sum_{k=1}^{K} \sigma_p^2(\tau_k).$$
Here:
$$\sigma_p^2(\tau) = \sum_{i,j} w_i\, w_j\, \Sigma_{ij}(\tau),$$
where the variances and covariances are expanded with their multifractal scaling properties. This is subject to:
$$\sum_i w_i = 1, \qquad w_i \ge 0.$$
Here, the optimization takes into account not only the scaling behavior across different time scales but also the varying roughness of the time series as captured by the multifractal spectrum $\zeta(q)$. This leads to a more complex, yet more accurate, description of portfolio risk that better reflects the true nature of financial markets.
5.4 Estimation of Multifractal Parameters
To implement the multifractal optimization framework, it is necessary to estimate the multifractal scaling exponents for each asset. This can be done using methods such as the multifractal detrended fluctuation analysis (MF-DFA) or the wavelet transform modulus maxima (WTMM) method. These techniques allow for the extraction of the multifractal spectrum from historical return data.
The estimated multifractal spectrum can then be used to calculate the portfolio variance at different time scales and moments, which serves as the input for the optimization problem.
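As a much simpler stand-in for MF-DFA or WTMM, the sketch below estimates a scaling exponent by regressing the logarithm of the $q$-th absolute moment of aggregated returns on the logarithm of the scale (a structure-function regression). It is not an implementation of either of the methods named above. The function name, the scales and the simulated Gaussian returns are assumptions made for illustration.

```python
import math
import random


def structure_function_exponent(returns, q, scales):
    """Slope of log E|r_tau|^q against log tau, using non-overlapping windows."""
    log_tau, log_moment = [], []
    for tau in scales:
        agg = [sum(returns[s:s + tau])
               for s in range(0, len(returns) - tau + 1, tau)]
        moment = sum(abs(r) ** q for r in agg) / len(agg)
        log_tau.append(math.log(tau))
        log_moment.append(math.log(moment))
    mean_x = sum(log_tau) / len(log_tau)
    mean_y = sum(log_moment) / len(log_moment)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(log_tau, log_moment))
    den = sum((x - mean_x) ** 2 for x in log_tau)
    return num / den  # ordinary least-squares slope


random.seed(0)
# Hypothetical data: i.i.d. Gaussian daily returns, for which the exponent at
# q = 2 should be close to 1, i.e. a Hurst exponent close to 0.5.
gaussian_returns = [random.gauss(0.0, 0.01) for _ in range(20000)]
zeta_2 = structure_function_exponent(gaussian_returns, q=2, scales=[1, 2, 4, 8, 16])
print(zeta_2 / 2.0)
```

Repeating the regression for several values of `q` gives a rough picture of how far the estimated exponents depart from a straight line in `q`, which is the signature of multifractality.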
6 Sensitivity Analysis
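We can compute the sensitivity of the weights to the Hurst exponents, the multifractal Hurst exponents and so on. First, let us evaluate the dependence on volatility.
To find the optimal weights $w$ that minimize $\sigma_p^2 = w^\top \Sigma\, w$ subject to the constraint $\mathbf{1}^\top w = 1$, we employ the method of Lagrange multipliers. The Lagrangian is given by:
$$\mathcal{L}(w, \lambda) = w^\top \Sigma\, w - \lambda\left(\mathbf{1}^\top w - 1\right),$$
where $\lambda$ is the Lagrange multiplier associated with the budget constraint. We impose the first-order conditions. Take the derivative of $\mathcal{L}$ with respect to $w$ and set it to zero:
$$\frac{\partial \mathcal{L}}{\partial w} = 2\,\Sigma\, w - \lambda\,\mathbf{1} = 0.$$
Solving for $w$:
$$w = \frac{\lambda}{2}\,\Sigma^{-1}\mathbf{1}.$$
Using the budget constraint $\mathbf{1}^\top w = 1$:
$$\frac{\lambda}{2}\,\mathbf{1}^\top \Sigma^{-1}\mathbf{1} = 1.$$
Let $S = \mathbf{1}^\top \Sigma^{-1}\mathbf{1}$, a scalar. Then:
$$\lambda = \frac{2}{S}.$$
Substituting back into the expression for $w$:
$$w = \frac{1}{S}\,\Sigma^{-1}\mathbf{1}.$$
Thus, the optimal weights are:
$$w^{*} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1}\mathbf{1}}.$$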
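The closed-form minimum-variance solution $w^{*} = \Sigma^{-1}\mathbf{1} / (\mathbf{1}^\top \Sigma^{-1}\mathbf{1})$ under the budget constraint alone can be evaluated with a few lines of code. The sketch below solves $\Sigma u = \mathbf{1}$ by Gaussian elimination and normalises $u$; the function names and the toy covariance matrices are illustrative assumptions.

```python
def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [list(row) + [b] for row, b in zip(matrix, rhs)]  # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-15:
            raise ValueError("matrix is singular")
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        x[r] = (a[r][n] - sum(a[r][c] * x[c] for c in range(r + 1, n))) / a[r][r]
    return x


def min_variance_weights(cov):
    """Weights Sigma^{-1} 1 / (1' Sigma^{-1} 1); short positions are allowed."""
    u = solve(cov, [1.0] * len(cov))  # u = Sigma^{-1} 1
    s = sum(u)                        # S = 1' Sigma^{-1} 1
    return [u_i / s for u_i in u]


# Hypothetical two-asset example with uncorrelated assets.
print(min_variance_weights([[1.0, 0.0], [0.0, 4.0]]))
```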
6.1 Effect of Variance
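We now demonstrate that increasing the variance $\sigma_k^2$ of a specific asset $k$ leads to a decrease in its optimal weight $w_k$, while keeping all other parameters constant.
We first express the weight of asset $k$. From the optimal weights expression:
$$w_k = \frac{a_k}{S},$$
where $a_k = e_k^\top \Sigma^{-1}\mathbf{1}$ and $S = \mathbf{1}^\top \Sigma^{-1}\mathbf{1}$.
Now we want to analyze the effect of increasing $\sigma_k^2$. We aim to compute the derivative $\partial w_k / \partial \sigma_k^2$ and show that it is negative.
Computing $\partial a_k / \partial \sigma_k^2$ and $\partial S / \partial \sigma_k^2$:
Since $\Sigma\,\Sigma^{-1} = I$, we have:
$$\frac{\partial \Sigma^{-1}}{\partial \sigma_k^2} = -\Sigma^{-1}\,\frac{\partial \Sigma}{\partial \sigma_k^2}\,\Sigma^{-1}.$$
Given that $\Sigma$ only depends on $\sigma_k^2$ through the $(k,k)$-element:
$$\frac{\partial \Sigma}{\partial \sigma_k^2} = e_k e_k^\top,$$
where $e_k$ is the $k$-th standard basis vector. Thus:
$$\frac{\partial \Sigma^{-1}}{\partial \sigma_k^2} = -\Sigma^{-1} e_k e_k^\top \Sigma^{-1}.$$
Specifically, the derivative of $a_k$ is:
$$\frac{\partial a_k}{\partial \sigma_k^2} = -\left(e_k^\top \Sigma^{-1} e_k\right)\left(e_k^\top \Sigma^{-1}\mathbf{1}\right) = -\left(\Sigma^{-1}\right)_{kk}\, a_k.$$
Similarly, the derivative of $S$ is:
$$\frac{\partial S}{\partial \sigma_k^2} = -\left(\mathbf{1}^\top \Sigma^{-1} e_k\right)\left(e_k^\top \Sigma^{-1}\mathbf{1}\right) = -a_k^2.$$
Let us get the final expression for the derivative. Substituting back:
$$\frac{\partial w_k}{\partial \sigma_k^2} = \frac{1}{S}\frac{\partial a_k}{\partial \sigma_k^2} - \frac{a_k}{S^2}\frac{\partial S}{\partial \sigma_k^2} = -\frac{a_k}{S^2}\left[\left(\Sigma^{-1}\right)_{kk} S - a_k^2\right].$$
Given that $\Sigma^{-1}$ is positive definite, $S > 0$, and $a_k > 0$ (asset $k$ has a positive weight), it follows that:
$$\operatorname{sign}\!\left(\frac{\partial w_k}{\partial \sigma_k^2}\right) = -\operatorname{sign}\!\left[\left(\Sigma^{-1}\right)_{kk} S - a_k^2\right].$$
We know that, by the Cauchy-Schwarz inequality in the inner product induced by $\Sigma^{-1}$ (strict for more than one asset),
$$a_k^2 = \left(e_k^\top \Sigma^{-1}\mathbf{1}\right)^2 < \left(e_k^\top \Sigma^{-1} e_k\right)\left(\mathbf{1}^\top \Sigma^{-1}\mathbf{1}\right) = \left(\Sigma^{-1}\right)_{kk} S.$$
It follows that
$$\frac{\partial w_k}{\partial \sigma_k^2} < 0.$$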
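The sign of this sensitivity can be checked numerically in the two-asset case, where the minimum-variance weight of asset 1 is $w_1 = (\sigma_2^2 - \Sigma_{12}) / (\sigma_1^2 + \sigma_2^2 - 2\Sigma_{12})$ and its derivative with respect to $\sigma_1^2$, holding the covariance fixed, is $-(\sigma_2^2 - \Sigma_{12}) / (\sigma_1^2 + \sigma_2^2 - 2\Sigma_{12})^2$. The sketch below compares this expression with a central finite difference; the function names and the numerical inputs are hypothetical.

```python
def two_asset_weight(var_1, var_2, cov_12):
    """Minimum-variance weight of asset 1 under the budget constraint only."""
    return (var_2 - cov_12) / (var_1 + var_2 - 2.0 * cov_12)


def weight_sensitivity(var_1, var_2, cov_12):
    """Analytic d w_1 / d var_1 for two assets, holding cov_12 fixed."""
    return -(var_2 - cov_12) / (var_1 + var_2 - 2.0 * cov_12) ** 2


var_1, var_2, cov_12 = 0.04, 0.09, 0.012  # hypothetical inputs
step = 1e-6
finite_difference = (two_asset_weight(var_1 + step, var_2, cov_12)
                     - two_asset_weight(var_1 - step, var_2, cov_12)) / (2.0 * step)
# Both numbers should agree and be negative while asset 1 has a positive weight.
print(weight_sensitivity(var_1, var_2, cov_12), finite_difference)
```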
6.2 Effect of Correlation and Multifractality
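Next, we consider the effect of increasing the correlation between two assets, say $i$ and $j$, on their combined weight $w_i + w_j$. The covariance between assets $i$ and $j$ is given by:
$$\Sigma_{ij} = \rho_{ij}\,\sigma_i\,\sigma_j,$$
where $\rho_{ij}$ is the correlation between assets $i$ and $j$. Increasing $\rho_{ij}$ increases the covariance $\Sigma_{ij}$.
The weights $w_i$ and $w_j$ depend on the inverse of the covariance matrix $\Sigma^{-1}$. To compute the effect of increasing $\rho_{ij}$, we differentiate the weights with respect to $\rho_{ij}$:
$$\frac{\partial w}{\partial \rho_{ij}} = \frac{1}{S}\,\frac{\partial \Sigma^{-1}}{\partial \rho_{ij}}\,\mathbf{1} - \frac{\Sigma^{-1}\mathbf{1}}{S^2}\,\frac{\partial S}{\partial \rho_{ij}},$$
where $\frac{\partial \Sigma^{-1}}{\partial \rho_{ij}} = -\Sigma^{-1}\frac{\partial \Sigma}{\partial \rho_{ij}}\Sigma^{-1}$ and $S = \mathbf{1}^\top \Sigma^{-1}\mathbf{1}$. The derivative $\partial \Sigma / \partial \rho_{ij}$ has nonzero entries only at positions $(i,j)$ and $(j,i)$. Here $w_i$ and $w_j$ might individually increase or decrease. However, if we optimise over the combined asset, we can conclude that the combined weight of the synthetic asset decreases with an increase in correlation. This is because, in this modified picture, all we are doing is changing the variance of the synthetic asset $s$ that holds $i$ and $j$ in proportions $a$ and $1 - a$:
$$\sigma_s^2 = a^2\sigma_i^2 + (1-a)^2\sigma_j^2 + 2a(1-a)\,\rho_{ij}\,\sigma_i\,\sigma_j.$$
So all the other variances and covariances are the same, except that of the synthetic combination of $i$ and $j$. Thus, if we freeze the proportion $a$, we get from the previous result:
$$\frac{\partial w_s}{\partial \sigma_s^2} < 0.$$
Thus, for $0 < a < 1$:
$$\frac{\partial w_s}{\partial \rho_{ij}} = \frac{\partial w_s}{\partial \sigma_s^2}\cdot 2a(1-a)\,\sigma_i\,\sigma_j < 0.$$
Note that although empirically the scaling exponent of the correlation is positive, $\gamma_{ij} > 0$ with $\rho_{ij}(\tau) \propto \tau^{\gamma_{ij}}$, which causes a positive multifractality, one could in principle have had $\gamma_{ij} < 0$ as well. In that case the correlations would decay with scale, and hence the weight of the synthetic asset would increase with an increase in the magnitude of the multifractal parameter $\gamma_{ij}$.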
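The synthetic-asset argument can be illustrated with a toy calculation. The sketch below assumes the pair is held in fixed proportions (equal shares by default) and is optimised against a single uncorrelated third asset, so that the two-asset minimum-variance formula applies; these simplifications, the function names and the inputs are assumptions made for illustration.

```python
def synthetic_variance(vol_i, vol_j, rho, share_i=0.5):
    """Variance of a fixed-proportion combination of assets i and j."""
    share_j = 1.0 - share_i
    return ((share_i * vol_i) ** 2 + (share_j * vol_j) ** 2
            + 2.0 * share_i * share_j * rho * vol_i * vol_j)


def combined_weight(vol_i, vol_j, rho, var_other, share_i=0.5):
    """Minimum-variance weight of the synthetic pair against one uncorrelated asset."""
    var_pair = synthetic_variance(vol_i, vol_j, rho, share_i)
    return var_other / (var_pair + var_other)


# Hypothetical unit volatilities: the pair's weight falls as correlation rises.
for rho in (-0.5, 0.0, 0.5, 0.9):
    print(rho, combined_weight(1.0, 1.0, rho, var_other=1.0))
```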
6.3 Summary
To summarize, we obtain the expected results:
- If you increase the volatility of a stock, its portfolio weight goes down.
- If you increase the Hurst exponent ($H$) of a stock, its variance at longer scales goes up and its portfolio weight goes down.
- If you increase the fat-tailedness of a stock (lower $\beta$), its standardised Hurst exponent $H = \alpha/\beta$ goes up and its portfolio weight goes down.
- If you increase the correlation between two stocks, their combined weight goes down.
- If you introduce positive multifractality in two stocks' correlation (increased Epps effect), their correlation at longer scales goes up and their combined weight goes down; and vice versa.
7 Out of Sample Performance
To rigorously assess the efficacy of the proposed multiscale optimization methods, we conducted an out-of-sample evaluation in two scenarios: the minimum variance portfolio and the maximal Sharpe portfolio, with the mean estimated using a simple moving average. The portfolio weights correspond to allocations across the 11 sectors of the S&P 500. The following choices were used for the backtest:
- We utilized five years of data, spanning from 2019 to 2024, a period that notably includes significant market events such as the March 2020 COVID-19 crash and the 2022 market correction.
- The SPDR sector ETFs were used in lieu of allocation to the 11 sectors of the S&P 500, making it a long-only sector rotation strategy. We also considered a factor rotation strategy using 9 factors (quality, growth, value, low volatility, etc.), where each factor was represented by ETFs.
- A lookback of six months (125 trading days) was chosen, which is common in such studies.
- Minimum variance portfolios with overlapping versus non-overlapping averages were considered at lower frequencies.
- At each scale, the covariance matrix was evaluated on the same period, but lower frequency covariances were averaged over all possible non-overlapping sets to increase robustness. We compute the covariance matrix at a range of scales.
- Transaction costs were assumed to be negligible for the purpose of this study.
Method | Sharpe Ratio | Sortino Ratio | Max Drawdown (%) |
---|---|---|---|
Equally Weighted | 0.45 | 0.53 | -39.9 |
Traditional Markowitz | 0.35 | 0.41 | -33.1 |
Multiscale Markowitz | 0.53 | 0.62 | -31.5 |
Multiscale Markowitz (Overlapping) | 0.49 | 0.58 | -30.3 |
Method | Sharpe Ratio | Sortino Ratio | Max Drawdown (%) |
---|---|---|---|
Equally Weighted | 0.42 | 0.52 | -39.1 |
Traditional Markowitz | 0.43 | 0.49 | -36.9 |
Multiscale Markowitz | 0.53 | 0.61 | -36.3 |
Multiscale Markowitz (Overlapping) | 0.57 | 0.66 | -34.9 |
Our findings indicate that the multiscale optimization method results in portfolios with higher Sharpe ratios and Sortino ratios, as well as lower kurtosis and drawdowns. This enhanced performance is attributable to the multiscale method’s ability to account for the effects of non-ellipticity and fat tails, phenomena that are often inadequately captured by traditional minimum variance portfolios [3, 10].
References
- [1] Benoit B. Mandelbrot and John W. Van Ness, "Fractional Brownian motions, fractional noises and applications," SIAM Review, vol. 10, no. 4, pp. 422–437, 1968.
- [2] Jim Gatheral, Thibault Jaisson, and Mathieu Rosenbaum, "Volatility is rough," Quantitative Finance, vol. 18, no. 6, pp. 933–949, 2018.
- [3] Laurent E. Calvet and Adlai J. Fisher, "Multifractality in asset returns: Theory and evidence," Review of Economics and Statistics, vol. 84, no. 3, pp. 381–406, 2002.
- [4] Steven L. Heston, "A closed-form solution for options with stochastic volatility with applications to bond and currency options," The Review of Financial Studies, vol. 6, no. 2, pp. 327–343, 1993.
- [5] Gregory Berman and Lawrence Hochberg, "Multiscale modeling of financial time series and portfolio optimization," Journal of Investment Strategies, vol. 1, no. 2, pp. 45–67, 2008.
- [6] Jean-François Muzy, Jérôme Delour, and Emmanuel Bacry, "Modelling fluctuations of financial time series: from cascade process to stochastic volatility model," The European Physical Journal B-Condensed Matter and Complex Systems, vol. 17, no. 3, pp. 537–548, 2000.
- [7] Andrew Ang and Allan Timmermann, "Regime Changes and Financial Markets," Annual Review of Financial Economics, vol. 4, no. 1, pp. 313–337, 2012.
- [8] Edgar E. Peters, Fractal Market Analysis: Applying Chaos Theory to Investment and Economics, John Wiley & Sons, 1994.
- [9] Rama Cont and Jean-Philippe Bouchaud, "Herd behavior and aggregate fluctuations in financial markets," Macroeconomic Dynamics, vol. 4, no. 2, pp. 170–196, 2000.
- [10] Robert J. Bianchi, Michael E. Drew, and Jesse H. Fan, "Multiscale Hedge Fund Performance Persistence: Evidence from Wavelet Analysis," Journal of Financial Econometrics, vol. 13, no. 1, pp. 50–78, 2015.
- [11] H. E. Hurst, "Long-term storage capacity of reservoirs," Transactions of the American Society of Civil Engineers, vol. 116, no. 1, pp. 770–799, 1951.
- [12] Benoit Mandelbrot, "The variation of certain speculative prices," The Journal of Business, vol. 36, no. 4, pp. 394–419, 1963.
- [13] Rama Cont, "Empirical properties of asset returns: stylized facts and statistical issues," Quantitative Finance, vol. 1, no. 2, pp. 223–236, 2001.
- [14] Yakov Amihud and Haim Mendelson, "Asset pricing and the bid-ask spread," Journal of Financial Economics, vol. 17, no. 2, pp. 223–249, 1986.
- [15] Thomas W. Epps, "Comovements in stock prices in the very short run," Journal of the American Statistical Association, vol. 74, no. 366a, pp. 291–298, 1979.