
SSAAM: Sentiment Signal-based Asset Allocation Method with Causality Information

Rei Taguchi School of Engineering
The University of Tokyo
Tokyo, Japan
[email protected]
   Hiroki Sakaji School of Engineering
The University of Tokyo
Tokyo, Japan
[email protected]
   Kiyoshi Izumi School of Engineering
The University of Tokyo
Tokyo, Japan
[email protected]
Abstract

This study examines whether financial text is useful for tactical asset allocation with stocks, using natural language processing to create polarity indexes from financial news. We clustered the created polarity indexes using a change-point detection algorithm, constructed a stock portfolio, and rebalanced it at each change point using an optimization algorithm. The asset allocation method proposed in this study outperforms the comparative approaches. This result suggests that the polarity index helps construct equity asset allocation methods.

Index Terms:
Financial news, MLM scoring, causal inference, change-point detection, portfolio optimization

I Introduction

This study proposes that financial text can be useful for tactical asset allocation methods using equities. We focus on the points at which stock and portfolio prices change rapidly due to external factors, that is, the points of regime change. Regimes in finance theory refer to unobservable market states, such as expansion, recession, and bull or bear markets. In this study, we specifically drew on the two studies presented below. Wood et al.[1] used a change-point detection module to capture regime changes and created a simple and expressive model. Ito et al.[2] developed a method for switching investment strategies in response to market conditions. In this study, we go one step further and focus on how to measure future regime changes. If information on future regime changes (i.e., future changes in the market environment) is known, active management with a higher degree of freedom becomes possible. However, there are limitations in calculating future regimes using only traditional financial time-series data. Therefore, this study constructs an investment strategy that combines financial time-series data with alternative data, which has been attracting attention in recent years.

In this study, we hypothesized the following:

  • Portfolio performance can be improved by switching between risk-minimizing and return-maximizing optimization strategies according to the change points created by the polarity index.

The contributions of this study are as follows:

  • We demonstrate that estimating regime change points from financial text benefits active management, and we propose a highly expressive asset allocation framework.

The framework of this study consists of the following four steps.

  • Step 1 (Creating polarity index): Score financial news titles using MLM scoring. Quartiles are then calculated from the scored data, and a three-value classification into positive, negative, and neutral is performed according to the quartile range. The calculated values are aggregated daily.

  • Step 2 (Demonstration of leading effects): We use statistical causal inference to demonstrate whether financial news has a leading effect on a stock portfolio, using the polarity index created in Step 1 and a portfolio combining 10 stocks. The algorithm used is VAR-LiNGAM.

  • Step 3 (Change-point detection): Having verified in Step 2 that the polarity index has a leading effect, we calculate the regime change points of the polarity index using a change-point detection algorithm. The algorithm used is the binary segmentation search method.

  • Step 4 (Portfolio optimization): Portfolio optimization is performed based on the change points detected in Step 3. The algorithm used is EVaR optimization.

II Method

II-A Creating polarity index

This study used pseudo-log-likelihood scores (PLLs) to create the polarity index. PLLs are scores based on probabilistic language models proposed by Salazar et al.[3]. Because masked language models (MLMs) are pre-trained by predicting words in both directions, they cannot be handled by conventional left-to-right probabilistic language models. However, PLLs can determine the naturalness of sentences at a high level because they are represented by the sum of the log-likelihoods of the conditional probabilities obtained when each word is masked and predicted in turn. Token $\psi_{t}$ is replaced by [MASK] and predicted from the remaining tokens $\Psi_{\backslash t}=[\psi_{1},...,\psi_{t-1},\psi_{t+1},...,\psi_{|\Psi|}]$, where $t$ denotes the token position, $\Theta$ is the model parameter, and $P_{MLM}(\cdot)$ denotes the probability of each sentence token. BERT (Devlin et al.[4]) is selected as the MLM.

$\mathrm{PLL}(\Psi):=\sum^{|\Psi|}_{t=1}\log_{2}P_{MLM}(\psi_{t}\,|\,\Psi_{\backslash t};\Theta)$ (1)
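As a concrete illustration of Eq. (1), the following sketch computes a PLL by masking each token in turn and summing the base-2 log-probabilities of the true tokens. The `uniform_mlm` model is a toy stand-in included only so the snippet is self-contained; in practice it would be replaced by a call to a pre-trained BERT masked-LM head.

```python
import math

def pll_score(tokens, masked_prob):
    """Pseudo-log-likelihood (Eq. 1): mask each token in turn and sum
    the base-2 log-probabilities the model assigns to the true token."""
    total = 0.0
    for t, token in enumerate(tokens):
        context = tokens[:t] + ["[MASK]"] + tokens[t + 1:]
        total += math.log2(masked_prob(context, t, token))
    return total

# Toy stand-in for P_MLM: a uniform model over a 4-word vocabulary.
# A real implementation would query a BERT masked-LM head here.
VOCAB = {"stocks", "rose", "fell", "sharply"}

def uniform_mlm(context, position, token):
    return 1.0 / len(VOCAB)

score = pll_score(["stocks", "rose", "sharply"], uniform_mlm)
print(score)  # 3 * log2(1/4) = -6.0
```

Under the uniform toy model every masked prediction contributes $\log_2(1/4)=-2$, so longer (and, under a real MLM, less natural) sentences receive lower PLLs.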

After pre-processing, the financial news text is scored with PLLs one sentence at a time, and quartiles are calculated from the scored data. (Arranging the data in ascending order, the value one quarter of the way through is called the 1st quartile, the halfway value the 2nd quartile, and the three-quarters value the 3rd quartile; the difference between the 3rd and 1st quartiles is called the interquartile range.) The table below illustrates the polarity classification method.

TABLE I: Polarity Classification Method
Classification Method | Sentiment Score
3rd quartile < PLLs | 1 (positive)
1st quartile ≤ PLLs ≤ 3rd quartile | 0 (neutral)
1st quartile > PLLs | -1 (negative)

The scores computed for each financial news title are then aggregated chronologically by release date.
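The quartile rule of Table I and the daily aggregation can be sketched as follows; the dates and PLL scores are hypothetical examples, not values from the paper's dataset.

```python
import numpy as np

def classify_polarity(plls):
    """Map raw PLL scores to {-1, 0, +1} using the quartile rule of Table I."""
    q1, q3 = np.percentile(plls, [25, 75])
    return [1 if s > q3 else (-1 if s < q1 else 0) for s in plls]

def daily_polarity_index(dates, polarities):
    """Aggregate headline-level polarity labels into a daily index."""
    index = {}
    for d, p in zip(dates, polarities):
        index[d] = index.get(d, 0) + p
    return index

# Hypothetical PLL scores for eight headlines over two days.
dates = ["2019-01-02"] * 4 + ["2019-01-03"] * 4
plls = [-40.0, -12.0, -25.0, -18.0, -30.0, -8.0, -10.0, -35.0]
labels = classify_polarity(plls)
print(daily_polarity_index(dates, labels))  # {'2019-01-02': -1, '2019-01-03': 1}
```

Summing the three-valued labels per day yields the polarity index used in the rest of the pipeline.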

II-B Demonstration of leading effects

In this study, we used VAR-LiNGAM to demonstrate the leading effect. VAR-LiNGAM is a statistical causal inference model proposed by Hyvärinen et al.[5]. The model inferred by VAR-LiNGAM is as follows:

$\textbf{x}(t)=\sum^{T}_{\tau=1}\textbf{B}_{\tau}\textbf{x}(t-\tau)+\textbf{e}(t)$ (2)

where $\textbf{x}(t)$ is the vector of variables at time $t$, $\tau$ is the time lag, and $T$ is the maximum lag. $\textbf{B}_{\tau}$ is a coefficient matrix representing the causal relationships between the variables $\textbf{x}(t-\tau)$, and $\textbf{e}(t)$ denotes the disturbance term. VAR-LiNGAM is implemented in two steps. First, a VAR (vector autoregressive) model is fitted to capture the causal relationships among variables from the lagged times to the current time. Second, LiNGAM inference is performed on the residuals of the VAR model to estimate the causal relationships among variables at the current time. This study uses the model to confirm whether financial news precedes the stock portfolio.
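The first (VAR) stage of this two-step procedure can be sketched with plain NumPy on synthetic data; the subsequent LiNGAM inference on the residuals is omitted here and would in practice be delegated to the lingam package. The synthetic coefficients below are illustrative assumptions.

```python
import numpy as np

def fit_var1(X):
    """OLS fit of a VAR(1) model x(t) = B1 x(t-1) + e(t) (Eq. 2 with T = 1).
    Returns the lag coefficient matrix B1 and the residuals, which the
    second (LiNGAM) stage of VAR-LiNGAM would then analyse."""
    Y, Z = X[1:], X[:-1]                         # current and lagged values
    B1 = np.linalg.lstsq(Z, Y, rcond=None)[0].T  # solve Y ≈ Z B1ᵀ
    resid = Y - Z @ B1.T
    return B1, resid

# Synthetic example: the first series leads the second with weight 0.8,
# mimicking a polarity index that precedes a portfolio.
rng = np.random.default_rng(0)
n = 2000
X = np.zeros((n, 2))
e = rng.standard_normal((n, 2)) * 0.1
for t in range(1, n):
    X[t, 0] = 0.3 * X[t - 1, 0] + e[t, 0]
    X[t, 1] = 0.8 * X[t - 1, 0] + 0.3 * X[t - 1, 1] + e[t, 1]

B1, resid = fit_var1(X)
print(np.round(B1, 2))  # B1[1, 0] should recover the 0.8 lead effect
```

The off-diagonal entry of the estimated lag matrix plays the same role as the Index(t-1) → Portfolio(t) coefficient reported later in Table II.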

II-C Change point detection

Binary segmentation search (Bai[6]; Fryzlewicz[7]) is a greedy sequential algorithm; our notation follows Truong et al.[8]. The operation is greedy in the sense that at each step it seeks the single change point yielding the lowest sum of costs. The signal is then split in two at the detected change point, and the same operation is repeated on the resulting sub-signals until a stopping criterion is met. The binary segmentation search is expressed in Algorithm 1. We define a signal $y=\{y_{s}\}^{S}_{s=1}$ that follows a multivariate non-stationary stochastic process with $S$ samples. $L$ is the list of estimated change points, $s$ denotes a candidate change point, and $G$ is an ordered list of gains to be computed. Given the signal $y$, the $(b-a)$-sample sub-signal $\{y_{s}\}^{b}_{s=a+1}$, $(1\leq a<b\leq S)$, is simply denoted $y_{a,b}$. Hats represent estimated values. Other notation is explained in the algorithm's comments.

Algorithm 1 Binary Segmentation Search
Input: signal $y=\{y_{s}\}^{S}_{s=1}$, cost function $c(\cdot)$, stopping criterion.
Initialize $L\leftarrow\{\}$. $\triangleright$ Estimated change points
repeat
    $k\leftarrow|L|$. $\triangleright$ Current number of change points
    $s_{0}\leftarrow 0$ and $s_{k+1}\leftarrow S$. $\triangleright$ Dummy variables
    if $k>0$ then
        Denote by $s_{i}$ $(i=1,...,k)$ the elements (in ascending order) of $L$, i.e. $L=\{s_{1},...,s_{k}\}$.
    end if
    Initialize $G$, a $(k+1)$-long array. $\triangleright$ List of gains
    for $i=0,...,k$ do
        $G[i]\leftarrow c(y_{s_{i},s_{i+1}})-\min_{s_{i}<s<s_{i+1}}[c(y_{s_{i},s})+c(y_{s,s_{i+1}})]$.
    end for
    $\hat{i}\leftarrow\arg\max_{i}G[i]$
    $\hat{s}\leftarrow\arg\min_{s_{\hat{i}}<s<s_{\hat{i}+1}}[c(y_{s_{\hat{i}},s})+c(y_{s,s_{\hat{i}+1}})]$. $\triangleright$ Estimated change point
    $L\leftarrow L\cup\{\hat{s}\}$
until stopping criterion is met.
Output: set $L$ of estimated change-point indexes.

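A minimal Python rendering of Algorithm 1, using a sum-of-squared-errors cost and a fixed number of change points as the stopping criterion (one of several criteria the algorithm admits):

```python
import numpy as np

def sse_cost(seg):
    """Cost of a segment: sum of squared deviations from its mean."""
    return float(np.sum((seg - seg.mean()) ** 2)) if len(seg) else 0.0

def binary_segmentation(y, n_bkps, min_size=2):
    """Greedy binary segmentation (Algorithm 1) with an SSE cost.
    Repeatedly splits the segment whose best split yields the largest
    gain, until n_bkps change points have been found."""
    y = np.asarray(y, dtype=float)
    bkps = []
    while len(bkps) < n_bkps:
        bounds = sorted([0] + bkps + [len(y)])
        best_gain, best_s = -np.inf, None
        for a, b in zip(bounds[:-1], bounds[1:]):
            whole = sse_cost(y[a:b])
            for s in range(a + min_size, b - min_size + 1):
                gain = whole - sse_cost(y[a:s]) - sse_cost(y[s:b])
                if gain > best_gain:
                    best_gain, best_s = gain, s
        if best_s is None:
            break
        bkps.append(best_s)
    return sorted(bkps)

# A signal with a clear mean shift at index 50.
signal = np.concatenate([np.zeros(50), np.ones(50) * 5.0])
print(binary_segmentation(signal, n_bkps=1))  # [50]
```

The ruptures library used in the experiments provides an optimized implementation of the same scheme; this sketch only illustrates the greedy split-and-recurse structure.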

II-D Portfolio optimization

The entropic value-at-risk (EVaR) is a coherent risk measure that is an upper bound for both the value at risk (VaR) and the conditional value at risk (CVaR), derived from the Chernoff inequality (Ahmadi-Javid[9]; Ahmadi-Javid[10]). EVaR has the advantage of being computationally tractable when incorporated into stochastic optimization problems, compared with other risk measures such as CVaR (Ahmadi-Javid[10]). EVaR is defined as follows.

$\mathrm{EVaR}_{\alpha}(X):=\min_{z>0}\left\{z\ln\left(\frac{1}{\alpha}M_{X}\left(\frac{1}{z}\right)\right)\right\}$ (3)

$X$ is a random variable, $M_{X}$ is its moment-generating function, $\alpha$ denotes the significance level, and $z$ is a variable. A general convex programming framework for EVaR is proposed by Cajas[11]. In this study, we switch between the following two optimization strategies depending on the regime classified in Section II-C.
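A direct sample-based evaluation of Eq. (3) can be sketched by estimating the moment-generating function empirically and minimizing over $z$ numerically. This illustrates the definition only; it is not the conic reformulation used in the optimization problems below, and the Gaussian loss sample is a hypothetical example.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def evar(losses, alpha=0.05):
    """Sample version of Eq. (3): EVaR_alpha = min_{z>0} z*ln(M_X(1/z)/alpha),
    with the moment-generating function M_X estimated from the sample."""
    x = np.asarray(losses, dtype=float)

    def objective(log_z):
        z = np.exp(log_z)                  # parametrize log z to keep z > 0
        mgf = np.mean(np.exp(x / z))       # empirical M_X(1/z)
        return z * np.log(mgf / alpha)

    res = minimize_scalar(objective, bounds=(-3.0, 3.0), method="bounded")
    return res.fun

rng = np.random.default_rng(0)
losses = rng.standard_normal(10_000)
print(evar(losses, alpha=0.05))  # lies between the 95% VaR and the maximum loss
```

For this sample the result sits above the empirical CVaR and below the worst observed loss, consistent with EVaR being the tightest of the three upper bounds mentioned above.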

  • Risk-minimizing optimization: A convex optimization problem that minimizes EVaR subject to a given level of expected return $\widehat{\mu}$.

$\begin{aligned}&\text{minimize}&&q+z\log_{e}\left(\frac{1}{T\alpha}\right)\\&\text{subject to}&&\mu w^{\top}\geq\widehat{\mu}\\&&&\sum^{N}_{i=1}w_{i}=1\\&&&z\geq\sum^{T}_{j=1}u_{j}\\&&&(-r_{j}w^{\top}-q,\,z,\,u_{j})\in K_{exp}\quad(\forall j=1,...,T)\\&&&w_{i}\geq 0\quad(\forall i=1,...,N)\end{aligned}$ (4)
  • Return-maximizing optimization: A convex optimization problem that maximizes the expected return subject to a given level of EVaR $\widehat{EVaR}$.

$\begin{aligned}&\text{maximize}&&\mu w^{\top}\\&\text{subject to}&&q+z\log_{e}\left(\frac{1}{T\alpha}\right)\leq\widehat{EVaR}\\&&&\sum^{N}_{i=1}w_{i}=1\\&&&z\geq\sum^{T}_{j=1}u_{j}\\&&&(-r_{j}w^{\top}-q,\,z,\,u_{j})\in K_{exp}\quad(\forall j=1,...,T)\\&&&w_{i}\geq 0\quad(\forall i=1,...,N)\end{aligned}$ (5)

where $q$, $z$, and $u$ are auxiliary variables, $K_{exp}$ is the exponential cone, and $T$ is the number of observations. $w$ is the vector of weights for the $N$ assets, $r$ is the matrix of returns, and $\mu$ is the mean vector of asset returns.

III Experiments & Results

III-A Dataset description

This study calculates signals for portfolio rebalancing and tactical asset allocation, actively seeking alpha on the assumption that financial news precedes the equity portfolio. Two types of data were used.

  • Stock Data: We used daily stock data provided by Yahoo! Finance (https://finance.yahoo.com/). The stocks are the components of the NYSE FANG+ Index: Facebook, Apple, Amazon, Netflix, Google, Microsoft, Alibaba, Baidu, NVIDIA, and Tesla. Adjusted closing prices are used. The period for this data is January 2015 through December 2019.

  • Financial News Data: We used the daily historical financial news archive provided on Kaggle (https://www.kaggle.com/), a data-analysis platform. The archive covers news on U.S. stocks listed on the NYSE/NASDAQ over the past 12 years and was confirmed to contain information on the ten stocks above. The data consist of 9 columns and 221,513 rows; the title and release-date columns are used in this study. The period for this data is January 2015 through December 2019.

III-B Preparation for backtesting

The polarity index is created as described in Section II-A, after pre-processing the financial news data. Both the financial news and the stock data are in daily units; to match the periods, rows containing blanks in either source are dropped. Once the polarity index is created, a stock portfolio is built by adding the adjusted closing prices of the 10 stocks, with a uniform investment ratio across all stocks. Next, causal inference is performed with VAR-LiNGAM as described in Section II-B. For the change-point detection of Section II-C, the Python library ruptures (Truong et al.[8]) was used. The causal inference results are as follows.

TABLE II: Causal Inference in VAR-LiNGAM
Causal Graph | Value
Index(t-1) → Index(t) | 0.39
Index(t-1) → Portfolio(t) | 0.11
Portfolio(t-1) → Portfolio(t) | 1.00

The values in Table II are elements of the adjacency matrix; coefficients below the lower limit of 0.05 were truncated. The results in the table show that the polarity index has a leading effect on the equity portfolio. The Python library lingam (Hyvärinen et al.[5]) was used.
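The pre-processing described above (aligning the daily news and price series by dropping rows with blanks, then summing adjusted closes into an equal-weight portfolio) can be sketched as follows; the tickers, dates, and prices are hypothetical placeholders for the ten NYSE FANG+ series.

```python
import pandas as pd

# Hypothetical daily series; real inputs would be the polarity index and
# the ten adjusted closing prices from Yahoo! Finance.
prices = pd.DataFrame(
    {"AAPL": [100.0, 101.0, 102.0, 103.0],
     "MSFT": [200.0, 202.0, 201.0, 205.0]},
    index=pd.to_datetime(["2019-01-02", "2019-01-03", "2019-01-04", "2019-01-07"]),
)
polarity = pd.Series(
    [1, -2, 3],
    index=pd.to_datetime(["2019-01-02", "2019-01-04", "2019-01-07"]),
    name="polarity",
)

# Keep only dates present in both sources (rows with blanks are dropped),
# then form the portfolio series as the sum of adjusted closes.
merged = prices.join(polarity, how="inner")
merged["portfolio"] = merged[["AAPL", "MSFT"]].sum(axis=1)
print(merged)
```

The resulting two aligned columns (polarity index and portfolio) are exactly the bivariate input expected by the VAR-LiNGAM step above.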

III-C Backtesting scenarios

In this study, the following rebalancing strategies were merged and backtested. The Python libraries vectorbt (Polakow[12]) and Riskfolio-Lib (Cajas[13]) were used for backtesting. In addition to EVaR optimization, CVaR optimization and the mean-variance model were used as comparative optimization algorithms. The number of regimes was set to 5 and 10, and the regular rebalancing intervals were 30, 90, and 180 days. The backtesting strategies were as follows; CPD-EVaR++ is positioned as the proposed strategy and CPD-EVaR+ as the runner-up strategy.

  • CPD-EVaR++ (proposed): Change-point rebalancing using risk-minimizing and return-maximizing EVaR optimization + regular-interval rebalancing strategy

  • CPD-EVaR+: Change-point rebalancing using risk-minimizing and unconstrained EVaR optimization + regular-interval rebalancing strategy

  • EVaR: EVaR optimization with regular-interval rebalancing strategy

  • CVaR: CVaR optimization with regular-interval rebalancing strategy

  • MV: Mean-variance optimization with regular-interval rebalancing strategy

The binary determination of whether the polarity index within each regime shows an upward or downward trend is made by examining the divided regimes. MinRiskOpt (Section II-D, Eq. (4)) is assigned to an upward trend, and MaxReturnOpt (Section II-D, Eq. (5)) is assigned to a downward trend.
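This regime-to-strategy assignment might be sketched as follows. The start-versus-end trend test is an assumption made for illustration, since the exact form of the binary trend determination is not specified above; the index values are hypothetical.

```python
def assign_strategies(index_values, change_points):
    """For each regime delimited by the change points, determine the trend
    of the polarity index and assign the optimization of Section II-D:
    risk minimization (Eq. 4) for an upward trend, return maximization
    (Eq. 5) for a downward one."""
    bounds = [0] + sorted(change_points) + [len(index_values)]
    plan = []
    for a, b in zip(bounds[:-1], bounds[1:]):
        segment = index_values[a:b]
        trend_up = segment[-1] >= segment[0]   # simple start-vs-end trend test
        plan.append((a, b, "MinRiskOpt" if trend_up else "MaxReturnOpt"))
    return plan

# Hypothetical polarity index: a rising regime followed by a falling one.
index_values = [0, 1, 2, 3, 4, 3, 2, 1, 0, -1]
print(assign_strategies(index_values, change_points=[5]))
```

Each tuple gives a regime's boundaries and the optimization objective applied at the rebalance triggered by that regime's change point.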

III-D Evaluation by backtesting

The following metrics were employed to assess the portfolio performance.

  • Total Return (TR): TR is the total return earned from investing in an investment product within a given period: TR = Valuation Amount + Cumulative Distributions Received + Cumulative Amount Sold - Cumulative Amount Bought. This study does not incorporate taxes or trading commissions.

  • Maximum Drawdown (MDD): MDD is the rate of decline from the peak asset value: MDD = (Trough Value - Peak Value) / Peak Value.
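Both metrics are straightforward to compute from an equity curve; the following sketch uses hypothetical values.

```python
def max_drawdown(equity_curve):
    """MDD = (trough - peak) / peak, taken over the running peak."""
    peak = equity_curve[0]
    mdd = 0.0
    for v in equity_curve:
        peak = max(peak, v)
        mdd = min(mdd, (v - peak) / peak)
    return mdd

def total_return(valuation, distributions, sold, bought):
    """TR as defined above (taxes and commissions ignored)."""
    return valuation + distributions + sold - bought

curve = [100.0, 120.0, 90.0, 110.0, 80.0]
print(max_drawdown(curve))  # (80 - 120) / 120 = -1/3
```

A drawdown of -1/3 would be reported as 33.33% in the tables below, where MDD is quoted as a positive percentage.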

TABLE III: Backtesting (SSAAM)
Rebalance | Regime | Algorithm | TR [%] | MDD [%]
30-days | 5 | CPD-EVaR++ | 810.9915 | 26.8629
30-days | 5 | CPD-EVaR+ | 594.7410 | 26.8629
30-days | 10 | CPD-EVaR++ | 485.5201 | 45.0235
30-days | 10 | CPD-EVaR+ | 392.1392 | 42.4803
90-days | 5 | CPD-EVaR++ | 535.7349 | 27.6386
90-days | 5 | CPD-EVaR+ | 410.8530 | 27.6386
90-days | 10 | CPD-EVaR++ | 417.8354 | 27.7646
90-days | 10 | CPD-EVaR+ | 373.5849 | 27.7646
180-days | 5 | CPD-EVaR++ | 152.0988 | 27.3924
180-days | 5 | CPD-EVaR+ | 131.2210 | 27.3924
180-days | 10 | CPD-EVaR++ | 169.2992 | 25.3050
180-days | 10 | CPD-EVaR+ | 232.4513 | 25.3050
TABLE IV: Backtesting (comparison)
Rebalance | Algorithm | TR [%] | MDD [%]
30-days | EVaR | 587.9630 | 46.6651
30-days | CVaR | 558.7446 | 44.4532
30-days | MV | 527.2827 | 42.9851
90-days | EVaR | 500.1421 | 44.9860
90-days | CVaR | 496.7423 | 44.0592
90-days | MV | 459.1195 | 42.7358
180-days | EVaR | 353.2412 | 44.7714
180-days | CVaR | 382.9451 | 44.2525
180-days | MV | 360.4298 | 42.8165

IV Discussion & Conclusion

Table III shows that the more frequent the regular rebalancing, the higher the total return. In addition, the maximum drawdowns hovered between 25% and 45%, which is considered acceptable to the average systematic trader. Experiments were conducted with both five and ten regimes: total return was higher with five regimes, whereas the maximum drawdown was almost the same in both cases. Moreover, as hypothesized, CPD-EVaR++, which combines risk-minimizing and return-maximizing optimization, performed better than the others. Under this method, the best practice for managing an equity portfolio is therefore to use CPD-EVaR++ with five regimes, rebalancing irregularly at change points in addition to regular rebalancing every 30 days.

Table IV reports backtests using the same parameters as Table III. Among the comparison algorithms, EVaR optimization performed better than the others, consistent with the results of Cajas[11]. This may be because EVaR is more computationally efficient in stochastic optimization problems than other risk measures, such as CVaR.

This study demonstrates the utility of financial text for asset allocation with equity portfolios. In the future, we would like to develop a tactical asset allocation strategy that mixes stocks with other asset classes, such as bonds, and to apply this research to monetary policy and other macroeconomic analyses.

Acknowledgment

This work was supported by the JST-Mirai Program Grant Number JPMJMI20B1, Japan. The authors declare that the research was conducted without any commercial or financial relationships that could be construed as potential conflicts of interest.

References

  • [1] Kieran Wood, Stephen Roberts, and Stefan Zohren. Slow momentum with fast reversion: A trading strategy using deep learning and changepoint detection. The Journal of Financial Data Science, 4(1):111–129, December 2021.
  • [2] Masatake Ito, Kabun Jo, and Norio Hibiki. Application of asset allocation models in practice and mutual fund design [in Japanese]. Operations Research as a Management Science, 66(10):683–689, 2021.
  • [3] Julian Salazar, Davis Liang, Toan Q. Nguyen, and Katrin Kirchhoff. Masked language model scoring. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics, pages 2699–2712, Online, July 2020. Association for Computational Linguistics.
  • [4] Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding, 2019.
  • [5] Aapo Hyvärinen, Kun Zhang, Shohei Shimizu, and Patrik O Hoyer. Estimation of a structural vector autoregression model using non-gaussianity. Journal of Machine Learning Research, 11(5), 2010.
  • [6] Jushan Bai. Estimating multiple breaks one at a time. Econometric theory, 13(3):315–352, 1997.
  • [7] Piotr Fryzlewicz. Wild binary segmentation for multiple change-point detection. The Annals of Statistics, 42(6):2243–2281, 2014.
  • [8] Charles Truong, Laurent Oudre, and Nicolas Vayatis. Selective review of offline change point detection methods. Signal Processing, 167:107299, 2020.
  • [9] A. Ahmadi-Javid. An information-theoretic approach to constructing coherent risk measures. In 2011 IEEE International Symposium on Information Theory Proceedings, pages 2125–2127, 2011.
  • [10] Amir Ahmadi-Javid. Entropic value-at-risk: A new coherent risk measure. Journal of Optimization Theory and Applications, 155(3):1105–1123, 2012.
  • [11] Dany Cajas. Entropic portfolio optimization: a disciplined convex programming framework. Available at SSRN 3792520, 2021.
  • [12] Oleg Polakow. vectorbt (1.4.2), 2022.
  • [13] Dany Cajas. Riskfolio-lib (3.0.0), 2022.