This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

On Spurious Causality, CO2, and Global Temperature

Philippe Goulet Coulombe        Maximilian Göbel Department of Economics, [email protected]. We thanks an anonymous referee who suggested the use of information flows in an earlier paper of ours. We are grateful to Boyuan Zhang for many helpful comments and suggestions.
(           University of Pennsylvania    ISEG - Universidade de Lisboa
First Draft: October 10, 2020
This Draft: December 12, 2024
)
Abstract

Stips, Macias, Coughlan, Garcia-Gorriz, and Liang (2016) (Nature Scientific Reports) use information flows (Liang,, 2008, 2014) to establish causality from various forcings to global temperature. We show that the formulas being used hinges on a simplifying assumption that is nearly always rejected by the data. We propose an adequate measure of information flow based on Vector Autoregressions, and find that most results in StipsEtAl2016 cannot be corroborated. Then, it is discussed which modeling choices (e.g., the choice of CO2 series and assumptions about simultaneous relationships) may help in extracting credible estimates of causal flows and the transient climate response simply by looking at the joint dynamics of two climatic time series.

Keywords: information flows, vector autoregressions, global warming, climate econometrics

1 Introduction

Causality is fundamental to science. While causal statements can reasonably be made without hardship in controlled environments, things are far less straightforward when only observational data is available. Answering the question of what happens to YY if one intervenes on XX is compromised by the simple fact that XX and YY were not generated by exogenously modulating XX, but by (often) endogenous interactions between the two variables.

Yet, for many scientific interrogations of capital importance – like the relationship between greenhouse gases and global temperature, only observational data is available. How does different forcings cause global mean surface temperature anomalies (GMTA)? In a popular article, Stips, Macias, Coughlan, Garcia-Gorriz, and Liang (2016) (henceforth, SMCGL) set up to answer the question using information flow (IF) from one variable to another for bivariate stochastic dynamical systems – a methodology developed in Liang2008; Liang2014; Liang2015; Liang2016. Authors make grand claims about the technique being able to extract "true" and "rigorous" causality, which clash with the usually somber tone of causal analysis.

We show that the IF methodology will not extract causality for the near-universe of bivariate stochastic dynamic systems estimated on real data – with those of SMCGL included. In a nutshell, this occurs because Liang2008 formulas being used throughout, assume that a certain statistical quantity is 0 when it is not. This can be tested, and it is rejected almost all the time. More precisely, Liang2014 formula assumes that when conditioning on the past state of the system (Xt1X_{t-1} and Yt1Y_{t-1}), the remaining variation in XtX_{t} and YtY_{t} (the innovations driving the system) are uncorrelated. In SMCGL, where the time unit tt is one year, this implies that forcings and GMTA are uncorrelated within the same year, conditional on last year’s values of both. That is unlikely to happen, and the data will testify to that. The assumption is presented as "linearity" (Liang2014), but it has little to do with it. Rather, it assumes that the system is identified, meaning that the joint dynamics of the time series data fall on the knife edge case where causality can indeed be claimed from the data without any external assumption (sims1980). The problem is that for discretely sampled time series, the assumption is almost always rejected by lower-frequency data.111In tawia2019time, IF are used on daily data, which can alleviate the problem if there are no intra-day relationships. This last condition is something that should be verified, not assumed. In short, SMCGL assume there to be no identification problem (in the sense of sims1980), and find no identification problem.

In this note, we detail the consequences of this high-standing omission on IF measures and SMCGL’s results. First, in simulations (using data generating processes proposed by Liang2014) of which we know the true causality structure, it is shown that IF will often conclude that XtX_{t} is largely causing YtY_{t}, when in fact the reverse is true. Second, we reconsider a key part of SMCGL’s study where the authors investigate the causal structure between different forcings and GMTA. Using an appropriate methodology that accounts for the correlated innovations, it is found that in most instances, the data by itself cannot back SMCGL’s claims. That is, unlike what the authors have put forward, it is not possible (within this framework) to claim that many forcings are causing GMTA’s increase as a direct implication of the data. In other words, from the data alone, it is impossible to discriminate between certain forcings mostly causing GMTA or the reverse. External assumptions based on physical knowledge could remedy that. Or different data. We show that in the case of CO2, results in accord with the scientific consensus can be obtained when using concentration directly rather than its radiative content.

The alternative methodology that we use are structural Vector Autoregressions (SVAR) which are simultaneous dynamic systems of equations. They can characterize a linear dynamic system in discrete time. The methodology was introduced to macroeconomics by sims1980 and is now so in many fields, ranging from neuroscience (chen2011vector) to climate (VARCTIC). To document its reliability for this application, we report implied transient climate response (TCR) estimates of our alternative methodology. We find that those (i) largely depend on the necessary assumptions made about the simultaneous (with one year) impact of CO2 forcing on GMTA but (ii) are in the range of recent estimates (OttoEtAl2013; MontamatStock2020) if one assumes simultaneous causality running from CO2 forcing to GMTA.

Section 2 briefly review IFs and explain where the problematic assumption occurs. It also discusses relevant notions of Vector Autoregressions (VARs, sims1980) as a comprehensive framework to think about causality in multivariate time series systems. Section 3 display IFs possibly spurious behavior using simulated data where the true causality is known. Section 4 revisit the question of causality between different forcings and GMTA using appropriate tools. Section 5 concludes.

2 Information Flows, Non-Innocuous Assumptions and VARs

In this section, we review the basics of IFs as applied in empirical work, pin down the problematic assumption, explain why it is harmful through the lenses of a VAR, and propose an alternative measure of IF based on the VAR.

Liang2008 considers the data generating process (DGP)

dX=F(X,t)dt+B(x,t)dW,\displaystyle d\textbf{X}=\textbf{F}\left(\textbf{X},t\right)dt+\textbf{B}\left(\textbf{x},t\right)d\textbf{W}\;, (1)

where X=(X1,X2)2\textbf{X}=\left(X_{1},X_{2}\right)\in\mathbb{R}^{2} are the state variables and F=(F1,F2)\textbf{F}=\left(F_{1},F_{2}\right). W=(W1,W2)\textbf{W}=\left(W_{1},W_{2}\right) is a standard Wiener process, with ΔW=dWdt\Delta\textbf{W}=\frac{d\textbf{W}}{dt} and E(ΔWi)=0E\left(\Delta W_{i}\right)=0 and E(ΔWi)2=ΔtE\left(\Delta W_{i}\right)^{2}=\Delta t. B is a 2×22\times 2 matrix and its entries bijb_{ij} govern how perturbations instantaneously impact the system. At this point, the only non-innocuous assumption is that of a bivariate system. The information flow from X2X_{2} to X1X_{1}, T21T_{2\rightarrow 1}, is then defined as

T21=dH1dtdH12dt\displaystyle T_{2\rightarrow 1}=\frac{dH_{1}}{dt}-\frac{dH_{1\bcancel{2}}}{dt} (2)

where dH1dt\frac{dH_{1}}{dt} is the evolution of the marginal entropy of X1X_{1} and dH12dt\frac{dH_{1\bcancel{2}}}{dt} denotes dH1dt\frac{dH_{1}}{dt} where the spillovers from X2X_{2} are excluded. After lengthy derivations and reasonable assumptions (like that WW’s, soon to be called structural shocks, are uncorrelated) it is obtained that

Tji=E[1ρi(Fiρi)Xi]+12E[1ρi2(giiρi)Xi2],\displaystyle T_{j\rightarrow i}=-E\left[\frac{1}{\rho_{i}}\frac{\partial\left(F_{i}\rho_{i}\right)}{\partial X_{i}}\right]+\frac{1}{2}E\left[\frac{1}{\rho_{i}}\frac{\partial^{2}\left(g_{ii}\rho_{i}\right)}{\partial X^{2}_{i}}\right]\;, (3)

where ρ\rho is the joint probability density of variables XiX_{i} and XjX_{j}, and ρi\rho_{i} denotes the marginal density of series XiX_{i}. Given that the model in (1) is not readily identified from the data, this is not yet operational. After further derivations and assuming 𝐁=[bii00bjj]\mathbf{B}=\left[\begin{smallmatrix}b_{ii}&0\\ 0&b_{jj}\end{smallmatrix}\right], Liang2014 transforms (3) into a workable formula made of empirical moments

Tji=σi,iσi,jσj,Δi(σi,j)2σi,Δi(σi,i)2σj,jσi,i(σi,j)2\displaystyle T_{j\rightarrow i}=\frac{\sigma_{i,i}\,\sigma_{i,j}\,\sigma_{j,\Delta i}-\left(\sigma_{i,j}\right)^{2}\,\sigma_{i,\Delta i}}{\left(\sigma_{i,i}\right)^{2}\,\sigma_{j,j}-\sigma_{i,i}\,\left(\sigma_{i,j}\right)^{2}} (4)

where σi,i\sigma_{i,i} is the variance of ii, σij,\sigma_{ij,} the covariance between ii and jj, and σi,Δi\sigma_{i,\Delta i} is the covariance between ii and the kthk^{th} difference of ii (we follow Liang2015 and set k=1k=1). However, the validity of this appealing formula rests on the seemingly technical assumption of a diagonal 𝐁\mathbf{B}. Our point is that this assumption is far from merely technical and very frequently wrong. Motivating the diagonal 𝐁\mathbf{B} among other assumptions, Liang2014 states

Since the dynamics is unknown, we first need to choose a model. As always, a linear model is the natural choice, at least at the initial stage of development.

The problem is that (i) of course, we must choose an empirical model, but we will try to avoid those that the data blatantly reject, (ii) 𝐁\mathbf{B} being diagonal has nothing to do with linearity, and (iii) there is no further potential "development" possible without this assumption, which, in effect, assumes the causal problem away. As a result, whenever the diagonal 𝐁\mathbf{B} is violated by the data, IFs – as currently used in empirical studies – provide spurious causality.

We now use a very popular framework, from macroeconomics, to think more clearly about 𝐁\mathbf{B}. Liang2008’s flow of simplifications and assumptions makes his once sophisticated (1) collapse to that of a bivariate VAR with 1 lag

[X1,tX2,t]=[c1c2]𝒄+[a111a121a211a221]𝑨[X1,t1X2,t1]+[b11b12b21b22]𝐁[ε1tε2t].\left[\begin{array}[]{l}X_{1,t}\\ X_{2,t}\end{array}\right]=\underbrace{\left[\begin{array}[]{l}c_{1}\\ c_{2}\end{array}\right]}_{\bm{c}}+\underbrace{\left[\begin{array}[]{ll}a_{11}^{1}&a_{12}^{1}\\ a_{21}^{1}&a_{22}^{1}\end{array}\right]}_{\bm{A}}\left[\begin{array}[]{l}X_{1,t-1}\\ X_{2,t-1}\end{array}\right]+\underbrace{\left[\begin{array}[]{ll}b_{11}&b_{12}\\ b_{21}&b_{22}\end{array}\right]}_{\mathbf{B}}\left[\begin{array}[]{l}\varepsilon_{1t}\\ \varepsilon_{2t}\end{array}\right]. (5)

The seemingly innocuous assumption is much less so within a statistical framework relating assumptions directly to observable quantities. Indeed, the uncorrelatedness of WW’s, which here translates to that of ϵ\epsilon, combined with b21=b12=0b_{21}=b_{12}=0 have a very stark implication. Let the number of endogenous variables be MM and the number of lags PP. Provided imposing M=2M=2 and P=1P=1 is reasonable (4) is valid if and only if regression residuals [u1tu2t]=[b11b12b21b22][ε1tε2t]\left[\begin{smallmatrix}u_{1t}\\ u_{2t}\end{smallmatrix}\right]=\left[\begin{smallmatrix}b_{11}&b_{12}\\ b_{21}&b_{22}\end{smallmatrix}\right]\left[\begin{smallmatrix}\varepsilon_{1t}\\ \varepsilon_{2t}\end{smallmatrix}\right] are not cross-correlated. This is easily testable: one needs to estimate equations 1 and 2 separately by least squares, collect the residuals and calculate their correlation ρu\rho_{u}. If the latter is different from 0 (and this could be formally tested with a t-test), then Liang2008’s simple formula does not apply. While ρu=0\rho_{u}=0 might be plausible in continuous time (or anything near it), this is a monumental stretch for data sampled at the yearly frequency (like in SMCGL). Fortunately, unlike true structural causality, the data can directly inform us on whether Liang2014’s formula is valid or not for specific pairs of time series.

2.1 Acknowledging the Identification Problem: Vector Autoregressions

This exposition follows closely VARCTIC. In time series analysis, the "identification" problem originates from simultaneity in the data. We can learn whether

Xt1YtorYt1XtX_{t-1}\rightarrow Y_{t}\quad\text{or}\quad Y_{t-1}\rightarrow X_{t}

is more plausible. This is predictive causality in the sense of granger1969. However, the data itself cannot discriminate between

XtYtandXtYt.X_{t}\rightarrow Y_{t}\quad\text{and}\quad X_{t}\leftarrow Y_{t}.

In words, a correlation between XtX_{t} and YtY_{t} can be generated by two different causal structures. Liang2008’s solution is to assume such relationships do not exist — yet, they do. Within a VAR, the problem boils down to the need for identifying CC in

C𝒚t=Ψ0+p=1PΨp𝒚tp+𝜺t,\displaystyle C\bm{y}_{t}=\Psi_{0}+\sum_{p=1}^{P}{\Psi_{p}}\bm{y}_{t-p}+\bm{\varepsilon}_{t}, (6)

where 𝒚t\bm{y}_{t} is an MM by 1 vector – meaning the dynamic system incorporates MM variables. Ψp\Psi_{p}’s parameterizes how each of these variables is predicted by its own lags and lags of the M1M-1 remaining variables. PP is the number of lags being included. The matrix CC characterizes how the MM different variables interact contemporaneously — e.g., how total forcing affects GMTA within the same year (a time unit tt in SMCGL). Finally, the structural disturbances are mutually uncorrelated with mean zero:

𝜺t=[ε1,t,,εM,t]N(0,IM).\bm{\varepsilon}_{t}=\left[\varepsilon_{1,t},\enskip...\enskip,\varepsilon_{M,t}\right]\leavevmode\nobreak\ \sim\leavevmode\nobreak\ \,N\left(0,\leavevmode\nobreak\ I_{M}\right).

Equation (6) is the so-called structural form of the VAR, which cannot be estimated because CC is not identified by the data. SMCGL uses formula (4) which implicitly assumes a constrained version of (6) with M=2M=2, P=1P=1, and, most importantly, CC being a diagonal matrix. The validity of their analysis hinges upon those constraints not being rejected by the data. In section 4, we find that the data disagrees with at least two of them.

Equation (6) is a structural model which can be used to answer causal questions directly. However, the elements of CC are not plain regression coefficients, and cannot be estimated as such — they would be biased. It does not mean that they do not exist. The implications of their existence can best be understood by looking at an estimable "reduced-form" VAR

𝒚t=𝒄+p=1PΦp𝒚tp+𝒖t,\displaystyle\bm{y}_{t}=\bm{c}+\sum_{p=1}^{P}{\Phi_{p}}\bm{y}_{t-p}+\bm{u}_{t}, (7)

where 𝒄=C1Ψ0\bm{c}=C^{-1}\Psi_{0} and Φp=C1Ψp\Phi_{p}=C^{-1}\Psi_{p} are both regression coefficients obtained by running least squares separately on each equation. 𝒖t\bm{u}_{t} are now regression residuals

𝒖t=[u1,t,,uM,t]N(0,Σu)\bm{u}_{t}=\left[u_{1,t},\enskip...\enskip,u_{M,t}\right]\leavevmode\nobreak\ \sim\leavevmode\nobreak\ \,N\left(0,\leavevmode\nobreak\ \Sigma_{u}\right)

which will be cross-correlated if the true CC is not diagonal. As mentioned earlier, Liang2014’s simplifying assumptions translate into Σu=C1C1\Sigma_{u}=C^{-1^{\prime}}C^{-1} being diagonal which is often at odds with the data. All the parameters of (7) can be estimated with traditional methods, but the model is not "structural" and cannot be used for causal inference. Structural VARs, which aim at uncovering "structural" causality (instead of predictive causality à la granger1969) acknowledge Σu\Sigma_{u} being non-diagonal and provide ways to obtain CC. As a byproduct, they can procure valid measures of information flow.

The raw material of causal measures are 𝜺t\bm{\varepsilon}_{t}, the structural disturbances entering the systems. However, those, like structural causality, are not directly extractable from the data: we only have 𝒖t\bm{u}_{t} and translating those back to 𝜺t\bm{\varepsilon}_{t} necessitates CC. The latter is not directly attainable, but can be retrieved using the mapping Σu=C1C1\Sigma_{u}=C^{-1^{\prime}}C^{-1}. In words, this means the covariance matrix of regression residuals from (7) can be used as raw material to retrieve the "structural" CC. Mechanically, the identification problem emerges because there are many CC’s satisfying Σ^u=C1C1\hat{\Sigma}_{u}=C^{-1^{\prime}}C^{-1} — numerous causal structures deliver the same residuals’ cross-correlations.

The strategy we opt for is the traditional Choleski decomposition of Σ^u\hat{\Sigma}_{u}. This is one of many identification strategies for the VAR (kilian2017svar). But among the catalog of methodologies, the Choleski decomposition is certainly popular (if not the most popular) in applied work and is simple to implement. Mechanically, it provides a lower-triangular matrix CC, satisfying Σ^u=𝐁𝐁\hat{\Sigma}_{u}=\mathbf{B}^{\prime}\mathbf{B} where 𝐁\mathbf{B} is the same as from equation (5), but with dimensions M×MM\times M. Its purpose is to transform cross-correlated regressions residuals utu_{t} (equation (7)) into uncorrelated structural shocks 𝜺t\bm{\varepsilon}_{t} (equation (6)). This is done by reversing the relationship 𝒖t=C𝜺t\bm{u}_{t}=C\bm{\varepsilon}_{t}.

Uncorrelatedness is essential to study how GMTA responds to a given forcing, keeping everything else constant. Such a causal claim would be impossible when considering an impulse from correlated residuals utu_{t} as those always co-move. A Choleski decompostion of Σu\Sigma_{u} is one way to transform the observed utu_{t} into the unobserved fundamental shocks 𝜺t\bm{\varepsilon}_{t}. The assumption underlying such an approach to orthogonalization is a causal ordering of shocks. The ordering restricts how variables interact with each other within the same year, conditional on the previous state of the system. In our bivariate setup, ordering forcing jj after GMTA implies that forcing cannot impact GMTA within the same year. Ordering GMTA after a given forcing implies that GMTA cannot impact forcing jj within the same year. SMCGL implicitly assumes both restrictions at the same time, which results in a rejected overidentified model. In contrast, when the model is just identified (when only one restriction is imposed), this choice cannot be validated by the data itself as it does not alter the likelihood.

When revisiting SMCGL’s empirical work, we consider both orderings and document how sensitive conclusions are to that necessary choice.222Of course, there are identification schemes outside of the family of ”orderings” obtained by Choleski decomposition, but those are beyond the scope of this paper – and unnecessary to make our main point. There are cases where the sign of net IF (a qualitative notion) between jj and GMTA depends on the ordering choice, and cases where it does not. For instance, we will find that whatever is assumed about CC in a proper VAR system, total forcing appears to be causing GMTA much more than the reverse. In other cases, like CO2CO_{2}-induced radiative forcing, this cannot be simply ruled out by the data.

2.2 An Adequate Measure of IF Based on the VAR

In a complete multivariate system like a VAR, the errors of the hh-steps ahead forecast yt+h,my_{t+h,m} can be related back to structural shocks – that is, the anomalies driving the dynamics of the system. For instance, we can compute the share of the forecast error variance of GMTA 10 years from now that is attributable to CO2CO_{2} anomalies. Intuitively, if CO2CO_{2} is causing GMTA to increase, its exogenous impulses should be an important driver of GMTA’s variance rather than GMTA anomalies themselves – a high information flow/transfer between the two variables. Accordingly, VAR forecast errors are

𝒖t+h=𝒚t+h𝒚^t+h=h=0h1Θh𝜺t+hh\bm{u}_{t+h}=\bm{y}_{t+h}-\hat{\bm{y}}_{t+h}=\sum_{h^{\prime}=0}^{h-1}\Theta_{h^{\prime}}\bm{\varepsilon}_{t+h-h^{\prime}}

where Θh\Theta_{h} is a function of Φp\Phi_{p}’s and CC. See kilian2017svar for further details. The forecast error variance decomposition (FEVD) of the whole system at horizon hh can be analytically calculated using the entries of matrix Θh\Theta_{h^{\prime}}. Precisely,

MSPEh=E(𝒖t+h𝒖t+h)=h=0h1ΘhΘh.\displaystyle\operatorname{MSPE}_{h}=E(\bm{u}_{t+h}\bm{u}{t+_{h}}^{\prime})=\sum_{h^{\prime}=0}^{h-1}\Theta_{h^{\prime}}\Theta_{h^{\prime}}^{\prime}\;.

An information flow "share" of ii for jj can be characterized by the share of forecast error variance of variable jj attributable to structural shocks of ii.333For a discussion on how to think about ”shocks” in a physical system, see VARCTIC. While those measures can be assessed for any horizon hh (which can contain useful information), we focus on the cumulative sum since it is a measure of total flow. For non-stationary VARs (like those of the empirical section) we use a horizon of h=15h=15 years. In the case of stationary VAR process, FEVD measures quickly converge to their long-run value as hh increases. Moreover, in that case, our FEVD-based IF measures can be more directly motivated from the Wold representation of a VAR process (kilian2017svar). Finally, note that whenever the b12=b21=0b_{12}=b_{21}=0 assumption is approximately true, the FEVD approach and IFs give qualitatively similar answers.

3 Simulations

This section showcases how IF can lead to a pretense of causal knowledge, with conclusions that are sometimes the exact opposite of the truth. We consider four data-generating processes (DGPs) where the true PP is one. The first two correspond to what is proposed in Liang2014. The last two are generic VAR positively autocorrelated processes with differing degrees of persistence. Following the notation above (equation (5)):

𝑨\displaystyle\bm{A} =[0.50.500.6],𝒄=[0.10.7]\displaystyle=\left[\begin{array}[]{ll}0.5&0.5\\ 0&0.6\end{array}\right],\quad\bm{c}=\left[\begin{array}[]{ll}0.1\\ 0.7\end{array}\right] (8e)
𝑨\displaystyle\bm{A} =[0.50.90.20.5],𝒄=[00]\displaystyle=\left[\begin{array}[]{ll}-0.5&0.9\\ -0.2&0.5\end{array}\right],\quad\bm{c}=\left[\begin{array}[]{ll}0\\ 0\end{array}\right] (8j)
𝑨\displaystyle\bm{A} =[0.50.20.50.25],𝒄=[00]\displaystyle=\left[\begin{array}[]{ll}0.5&-0.2\\ -0.5&0.25\end{array}\right],\quad\bm{c}=\left[\begin{array}[]{ll}0\\ 0\end{array}\right] (8o)
𝑨\displaystyle\bm{A} =[0.250.10.20.1],𝒄=[00]\displaystyle=\left[\begin{array}[]{ll}0.25&-0.1\\ -0.2&0.1\end{array}\right],\quad\bm{c}=\left[\begin{array}[]{ll}0\\ 0\end{array}\right] (8t)

As highlighted in the previous section, IFs are calculated assuming bij=0b_{ij}=0. However, as will be reported and formally tested in section 4, this is frequently not the case for most time series, especially those sampled at low frequencies. Using a controlled simulation environment, we study how IFs behave for values of ρu:=corr(utX1,utX2)[1,1]\rho_{u}:=corr(u_{t}^{X_{1}},u_{t}^{X_{2}})\in[-1,1].444Variances of utX1u_{t}^{X_{1}} and utX2u_{t}^{X_{2}} are one. Note that IFs are invariant to ρu0\rho_{u}\neq 0 emerging from b12=0b_{12}=0 or b21=0b_{21}=0, which is obviously problematic given the causal content of bijb_{ij}.555It is important to note that while we consider cases where either b12=0b_{12}=0 or b21=0b_{21}=0, there is a continuum of possibilities between those. We do so for simplicity of exposition (it makes the problem dichotomous). Moreover, setting either b12=0b_{12}=0 or b21=0b_{21}=0 to 0 corresponds to a causal ordering which is by far the most common identification scheme used in practice – and what we will be using in section 4.

As our indicator of true underlying information flow, taking into account ρu0\rho_{u}\neq 0, we use the FEVD-based measure described earlier. Note that here, unlike the application to real data in section 4, we know what are the true bjib_{ji}’s. When ρu0\rho_{u}\neq 0, the correlation must be attributed to either b12b_{12} or b21b_{21}, or a combination of both. In a bivariate setup, this amounts to setting bij=0b_{ij}=0, and attributing ρ0\rho\neq 0 to bjib_{ji}. Hence, it is possible to tell when standard IFs conclude falsehoods, because in simulations b12b_{12} and b21b_{21} are known.

In terms of notation, Υiji,j\Upsilon^{i,j}_{i\rightarrow j} means the FEVD share at horizon h=10h=10 with i=1,2i=1,2 and iji\neq j. The superscripts in Υiji,j\Upsilon^{i,j}_{i\rightarrow j} determine the true ordering, hence ii ordered before jj. The subscripts indicate that we plot the contribution of variable ii to the forecast error variance of variable jj at horizon h=10h=10. The simulations have shown that h=10h=10 is sufficient for convergence.

Refer to caption
(a) DGP(1)
Refer to caption
(b) DGP(2)
Refer to caption
(c) DGP(3)
Refer to caption
(d) DGP(4)
Refer to caption

Notes: Color specification: the light blue area shows the region, in which FEVDs suggest Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2} >> Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1}, regardless of the ordering. Similarly, the light green area shows the region, in which FEVDs suggest Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1} >> Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2} – independent of the ordering. The white/non-colored areas shows those regions, for which the ordering of X1X1 and X2X2 does matter. An unambiguous determination of Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2} >> Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1}, respectively Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1} >> Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2}, is not possible.

Figure 1: Ranking of NIFs (τij\tau_{i\rightarrow j}) vs Ranking of FEVDs (Υiji,j\Upsilon^{i,j}_{i\rightarrow j})
Different Levels of Correlation; Horizon h=10h=10

Figure 1 plots the absolute normalized information flows (NIF), τij\tau_{i\rightarrow j} for DGP(1) through DGP(4), with varying ρu\rho_{u}. A first observation is that τij\tau_{i\rightarrow j} often varies significantly with ρu\rho_{u}, even when the very formula underlying it assumes ρu=0\rho_{u}=0. The relative strength of τ12\tau_{1\rightarrow 2} and τ21\tau_{2\rightarrow 1} can easily collapse to 0 or be much higher than what IFs report for ρu=0\rho_{u}=0.

The fact that IFs vary with ρu\rho_{u} could give false hopes that they account for simultaneous relationships. We conduct a simple exercise to show it does not. An interesting empirical question is whether ii is causing jj more than jj is causing ii, which is the appealing promise of IFs. This moves beyond testing for Granger Causality and aims at quantifying the flow of information. The shaded green region corresponds to the values of ρu\rho_{u} for which either the true underlying causality is that X2X_{2} causes X1X_{1} more than the reverse, irrespective of whether it emerges from either b12=0b_{12}=0 or b12=0b_{12}=0. For those values of ρu\rho_{u}, when the true ordering is unknown, the qualitative conclusion about the sign of the net causality flow does not hinge on knowledge of bijb_{ij}. Analogously, the blue shade represents values of ρu\rho_{u} for which X1X_{1} causing more X2X_{2} is unanimous among bijb_{ij} configurations. White regions are values of ρu\rho_{u} where conclusions about the sign of net causality flow cannot be determined from the data, i.e., the unknown bijb_{ij} is necessary to settle. In other words, given the data available to the modeler, X2X_{2} causing more X1X_{1} and vice versa are both equally likely, and sorting it out decisively implies making a successful guess on the true value of bijb_{ij}.

If IFs were correctly calibrated, the green dotted line should only be above the blue in the shaded green regions, and the dotted blue line above the green one in blue regions. Figure 1 clearly demonstrates this not to be the case. For every DGP, there is a substantial range of values (white spaces) for which IFs clearly conclude τij>τji\tau_{i\rightarrow j}>\tau_{j\rightarrow i} when there is one pair of (bij,bji)(b_{ij},b_{ji}) out of two for which the reverse is true. To address concerns about an horizon mismatch hh between IFs and FEVD, Figure 4 in Appendix B shows results for h=2h=2. As it turns out, things worsen for IFs (the blue and green regions shrink) since the simultaneity problem is less diluted in dynamics at short horizons.

In sum, those simulations (largely inspired by Liang2015 DGPs) show two things. First, the tractable IFs (from formula (4)) are functions of ρu\rho_{u} even though they assume it to be zero. This compromises any statement on the strength of causal links. Second, for any DGP, there is an underlying pair (bij,bji)\left(b_{ij},b_{ji}\right) for which IFs’ conclusion about the net causal flow is the opposite of reality. This is not only an artifact of large |ρu||\rho_{u}|, as exemplified by DGP(4).

4 GMTA and Radiative Forcing Revisited

In this section, we first revisit SMCGL’s application of IFs to GMTA and forcings. Second, we look at what lies behind opaque FEVD measurements by reporting impulse response functions, and computing the transient climate response implied by simplistic bivariate VARs.

Global warming generated by man-made forcing is the prevalent generalization of the notion climate change. Numerous researchers have dedicated their works to this very relationship between anthropogenic forcing and constituents of global climate (Hansen2006; AndrewsEtAl2010; LiEtAl2013; notz2016observed) and the socio-economic consequences thereof (Nord14; ipcc2019). Despite overwhelming evidence for anthropogenic forcing being the main driver of global climate change (ipccGlobWarm2018), scientists have also observed that especially since the turn of the millennium, global temperature has plateaued despite ever-rising greenhouse gases and contrary to projections from key climate models.666Resolving this puzzle has led to reevaluate the role of oceans in the interplay of radiative forcing and the climatic response (Tollefson14; Marotzke2015).

IFs hinge exactly on these bivariate relationships, which are attractive in their clarity, but are certainly a stark oversimplification of a complex system. Nevertheless, to give empirical content to our critique of the methodology, we study the relationship between GMTA and 7 forcings from SMCGL using both IFs and our FEVD-based remedy. The sample of annual means ranges from 1850 - 2005.777Data on the Pacific Decadal Oscillation (PDO) ranges from 1900 - 2005. We follow SMCGL and take our data from therein referenced data providers. Table 1 summarizes the results.888The sample was restricted to 1850-2005 to match that of SMCGL. Table 5 reports results extending the sample to 2017.

In Table 1 we report estimated correlation between residuals of the bivariate VAR(1) implied by IFs. When forcing P=1P=1, in 5 cases out of 7, the null that ρ^ui,GMTA=0\hat{\rho}_{u}^{i,GMTA}=0 is rejected at least at the 10% level. When choosing PP with Bayesian Information Criterion (BIC), only ρ^uSolar,GMTA=0\hat{\rho}_{u}^{\text{Solar},GMTA}=0 cannot be rejected. Hence, as repeatedly mentioned in the text, IFs assume something that can be, and is rejected by the data. Moreover, in the light of simulations carried earlier, the qualitative and quantitative insights from IFs are often spurious under such conditions.

Table 1: Empirical Results for the Bivariate Relationship
Between Various Forcings and GMTA

Correlation Normalized IF (IF ×\times 100) FEVD Lags (PP) Correlation of Residuals (ρu\rho_{u}) Ordering: ii, GMTA Ordering: GMTA, ii ii \rightarrow GMTA GMTA \rightarrow ii iGMTA{i\rightarrow\text{GMTA}} GMTAi{\text{GMTA}\rightarrow i} iGMTA{i\rightarrow\text{GMTA}} GMTAi{\text{GMTA}\rightarrow i} Total Forcing 0.73 30.6 20.8 4 0.23*** 47.4 13.0 28.0 25.4 (15.3) (11.1) 1 0.29*** 51.4 9.6 27.6 28.7 Anthropogenic 0.86 39.8 -20.0 4 -0.19** 6.5 3.7 3.8 13.4 (35.7) (-0.6) 1 -0.19** 5.0 5.8 2.2 17.1 CO2 - ERF (W/m2{}^{\text{2}}) SMCGL 0.86 39.6 -15.2 4 - 0.14* 6.5 8.4 5.6 17.4 (35.1) (-0.4) 1 -0.15* 2.8 4.7 1.1 12.8 Aerosol -0.82 35.9 -24.5 4 -0.19** 2.9 0.6 2.1 1.6 (24.3) (-0.4) 1 -0.10 3.5 4.0 1.8 1.2 Solar 0.49 13.5 6.7 8 0.05 8.5 1.6 6.6 2.5 (3.8) (2.3) 1 0.08 16.6 4.2 12.3 6.8 Volcanic 0.09 0.9 -0.5 4 0.18** 10.9 0.8 3.1 3.6 (0.2) (-0.4) 1 0.20** 7.1 1.4 0.6 3.7 PDO 0.17 -1.2 -0.6 4 0.35*** 31.1 0.9 6.3 10.3 (-0.2) (-0.5) 1 0.34*** 9.1 0.5 0.2 10.7 CO2 (Mt/yr) 0.82 37.1 -4.3 2 -0.10 8.9 2.1 10.7 0.6 (27.0) (-0.0) 1 -0.05 4.2 0.0 4.4 0.4 CO2 (W/m2{}^{\text{2}}) 0.86 39.5 -14.0 4 0.23*** 5.2 16.8 2.9 4.7 (34.5) (-0.3) 1 0.07 1.6 4.1 0.9 1.8

Notes: ii corresponds to the type of radiative forcing, listed in the left most column. The second column ("Correlation") documents the correlation between GMTA and variable ii. FEVD values are taken at horizon h=15h=15, which translates into the contribution of variable ii in the variance of the forecast error of variable jj a decade and a half after the in-sample end date. Numbers in bold underline the highest absolute causal flow among a (ii,GMTA) pair for a given measure. "*", "**", and "***" means that the null of the residuals cross-correlation of residuals is rejected at the 10%,5% and 1% level respectively.

Naturally, we concentrate on FEVD results which correctly account for ρu0\rho_{u}\neq 0. However, setting ρu=0\rho_{u}=0 is only one of the empirical shortcomings of the empirical IF formula — it also sets PP, the number of lags of each XtX_{t}, to be 1. Clearly, it is empirically plausible that Xi,t2X_{i,t-2} or Xj,t4X_{j,t-4} may have an impact on Xi,tX_{i,t} beyond what is channeled by single year lags (Xi,t1X_{i,t-1} and Xj,t1X_{j,t-1}) . In other words, P=1P=1 is extremely restrictive on climatic dynamics. When choosing PP with BIC, results align better with prior scientific knowledge. Nonetheless, for the sake of completeness, we report both results (P=1P=1 and P=PP=P^{*}, with PP^{*} being BIC’s choice).

With P=PP=P^{*}, the fact that total forcing causes GMTA more than the reverse is without appeal. Nevertheless, the quantitative answer is, again, highly dependent on the ordering choice. After 15 years, total forcing anomalies are responsible for explaining between 28.0% and 47.4% of that of GMTA – depending on the preferred ordering. The net causal flow being higher from aerosol and solar to GMTA are also unanimous, but much smaller. Indecisive results are reported for Anthropogenic, Volcanic and PDO. Overall, Table 1 suggests that the data itself does not support the strong qualitative conclusions of SMCGL for their CO2 measure and Anthropogenic.

When PP is forced to one, as in IFs, inconclusive results are reported for total forcing, aerosol, volcanic and PDO. P=1P=1 specifications, irrespective of the ordering999There are other identification schemes which cannot be cast as ”orderings”. That is, there are rotations of 𝒖t\bm{u}_{t} (even when ρu=0\rho_{u}=0) giving different structural shocks. Hence, while the two orderings span a lot of possibilities (and those traditionally considered first in practice), they do represent the universe of rotations of 𝒖t\bm{u}_{t} into ϵt\bm{\epsilon}_{t}., conclude that GMTA is causing more CO2CO_{2} and Anthropogenic than the reverse. Only solar forcing results are unanimous, and in line with what climatic common wisdom suggests. Note that it also agrees with results from original IFs, which is not surprising given that ρ^Solar,GMTA\hat{\rho}_{Solar,GMTA} is in the close vicinity of 0. However, choosing a proper PP nearly doubles the share of forecast errors attributable to solar forcing.

4.1 What’s up with CO2?

Results for CO2 are rather surprising. Irrespective of the ordering, GMTA is reported to cause CO2 more than the reverse, a finding contradicting SMCGL’s results and common wisdom.101010Hansen2006 states that global warming started to accelerate not prior to the 1970s. Only after 1975 did the global temperature increase by approximately 0.2C per decade. However, SMCGL’s CO2 measure exhibit ill dynamic behavior that cannot possibly be that of a natural quantity. This becomes obvious when plotting their CO2-ERF measure in first difference: it evolves according to a series of dichotomous jumps. This CO2-ERF series, in which ERF stands for effective radiative forcing, is a concept presented in MyhreEtAl2013. The series itself is based on EtminanEtAl2016.

Given the strange jumping behavior of the CO2-ERF series, and to make our calculations comparable to other findings in the literature (BrunsEtAl2020; MontamatStock2020), we derive RF from CO2 concentration as follows: we use the well established Meinshausen2017 data set on annual global means of CO2 concentration, measured in parts per million (ppm). We follow MyhreEtAl2013 and transform the increase in CO2 concentration in year tt measured in ppm relative to the concentration in a given base year, CO2,base, into radiative forcing, RFtCO2RF^{\text{CO}_{2}}_{t} – measured in W/m2{}^{\text{2}} – as follows: RFtCO2=5.35×ln(CO2,t/CO2,base)RF^{\text{CO}_{2}}_{t}=5.35\times\text{ln}\left(\text{CO}_{2,t}/\text{CO}_{2,base}\right). We use 1850 as base year following BrunsEtAl2020.

As it turns out, considering this less contentious CO2 series does not resolve the apparently counterintuitive finding that GMTA explains a larger portion of the forecast error variance of CO2 than vice versa. Such a finding has also been reported using a different methodology in koutsoyiannis2020atmospheric. We explore a last avenue, that of using annual CO2 emissions rather than RFtCO2RF^{\text{CO}_{2}}_{t}. This last attempt is successful in reconciling the FEVD approach with the traditional wisdom that CO2 is causing GMTA "more" than the reverse. This finding is independent of the ordering choice.

In sum, based on this particular time series evidence, the causal link between certain forcings and GMTA remains disputable. What is less disputable are the effects of total forcing, CO2 emissions, and solar forcing which all explain an important share of GMTA anomalies independently of arbitrary ordering preferences. Nonetheless, we see such analyses as rather primitive and potentially misleading. For instance, XX causing ”more” YY does not mean that the reverse causality is not quantitatively important, or climatologically relevant. We now turn to a more promising way to extract meaning out of selected bivariate VARs.

4.2 Impulse Response Functions

To open the black box of those rather abstract measurements, we report in Figure 3 impulse response functions (IRF) for the bivariate models of CO2-GMTA and total forcing-GMTA. Since sims1980, the dominant approach for studying the properties of the VAR around its deterministic path has been IRFs to structural shocks. Their dynamic effect can be analyzed as that of a randomly assigned treatment because those have been transformed to be uncorrelated, which provides the "keeping everything else constant" interpretation.

The IRF of a variable ii to a one standard deviation shock of εj,t\varepsilon_{j,t} is defined as

IRF(ji,h)=E(yi,t+h|𝒚𝒕,εt,j=σεj)E(yi,t+h|𝒚𝒕,εt,j=0).\displaystyle IRF(j\rightarrow i,h)=E(y_{i,t+h}|\bm{y_{t}},\varepsilon_{t,j}=\sigma_{\varepsilon_{j}})-E(y_{i,t+h}|\bm{y_{t}},\varepsilon_{t,j}=0). (9)

Thus, it is the expected difference, hh months after "impact", between a bivariate system that responded to an unexpected CO2 increase, and the same system where no such increase occurred. In a linear VAR, the above takes a closed-form solution in terms of the matrices from (6). Models are now estimated with Bayesian methods, optimizing the hyperparameters of a standard Minnesota prior, and choosing the number of lags as reported in the gray-shaded rows of Table 1. The primary motivation is to obtain valid inference even in the presence of nonstationarity. Point estimates are nearly identical to that of OLS. For details on such choices in the context of climate data and a more thorough (yet introductory) treatment of IRFs, see VARCTIC.

Figure 2: IRFs: Annual Emissions and Global Temperature
Refer to caption
(a) CO2 (Mt/yr) \rightarrow GMTA
Refer to caption
(b) CO2 (Mt/yr) \rightarrow GMTA (including trend)
Refer to caption
(c) Total Forcing \rightarrow GMTA
Refer to caption
(d) Total Forcing \rightarrow GMTA, (including trend)
Refer to caption

Notes: We show impulse response functions from bivariate Vector Autoregressions of GMTA and annual CO2 emissions. The right column includes a time trend as an additional exogenous regressor. The solid line is the median of 10,000 draws from the posterior distribution. Hyperparameters were optimized. The shaded area comprises the 68% credible region. Lags are those reported in Table 1 under the gray shaded rows.

In Figure 2 we show the effect of an unexpected increase in annual emissions on GMTA, while Figure 3 shows the response of GMTA radiative forcing generated by an unexpected rise in cumulative emissions.

Figure 3: IRFs: Cumulative Emissions and Global Temperature
Refer to caption
(a) CO2 \rightarrow GMTA
Refer to caption
(b) CO2 \rightarrow GMTA (including trend)
Refer to caption

Notes: We show impulse response functions from bivariate Vector Autoregressions of GMTA and CO2, and Total Forcing respectively. The right column includes a time trend as an additional exogenous regressor. The solid line is the median of 10,000 draws from the posterior distribution. Hyperparameters were optimized. The shaded area comprises the 68% credible region. Lags are those reported in Table 1 under the gray shaded rows.

Qualitatively, the impact of annual emissions and cumulative CO2-induced forcing shocks on global temperature is vastly similar. In accord with findings in VARCTIC for the effect of CO2 on Arctic sea ice extent, the impact of total forcing and CO2 shocks is highly durable. This time, it is on GMTA rather than sea ice extent. As reflected in Figures 2 and 3, this result is qualitatively independent of the ordering choice. In both cases (and for both forcing variables), the effect of forcing takes about two years to completely settle in. However, it is clear that the reported short-run impact strongly depends on the identification assumptions, which SMCGL completely abstract from.

Whether the slightly negative short-run response of GMTA in the first panel of Figure 2 favors the ordering {GMTA,CO2}\left\{\text{{GMTA}},\,\text{{CO${}_{2}$}}\right\} over {CO2,GMTA}\left\{\text{{CO${}_{2}$}},\,\text{{GMTA}}\right\} is debatable: ForsterEtAl2020 find the reduction in global CO2 emissions during the COVID-19 pandemic to have resulted in a short-run rise of global temperature. The key mechanism is a decline in the cooling-effect of aerosols as a result of less SO2 emissions. The authors project a rise in global temperature over the first 24 months following the pandemic-induced reduction in global nitrogen oxide (NOx) emissions.111111Especially NO2 is found to be well-correlated with CO2 emissions (ForsterEtAl2020).

Lastly, a robustness check. Our variables are clearly nonstationary. While this does not pose a problem for Bayesian inference, it is nevertheless natural to wonder if results would be significantly altered by including a time trend. Accordingly, the second column reports the same IRFs, but for VAR specifications augmented with trends. Those show that adding such an exogenous explanatory variable does not change the dynamics of a GMTA response to an unexpected shock to CO2. The addition of a trend to the bivariate model of GMTA and total forcing, allows GMTA to slowly revert to a lower impact – which is nevertheless highly persistent.

4.3 Are VAR Estimates Quantitatively Reasonable?

The transient climate response (TCR) is a frequently used metric to measure the impact of rising atmospheric CO2 concentration on temperature. It is not only an indication of the trajectory of ongoing climate change, but also serves as a benchmark to evaluate the results of climate model projections (PhillipsLeirvikStorelvmo2020). The TCR is defined as the increase in temperature, between h0h_{0} and hTh_{T}, under the assumption that CO2 increases annually by 1%. hTh_{T} is defined as that point in time when – due to the steady annual increase of 1% – CO2 concentration is twice as high as at date h0h_{0} (Pretis2020; MontamatStock2020). Such a doubling of CO2 would occur approximately after 70 years (OttoEtAl2013). Following the transformation of ppm to W/m2{}^{\text{2}} as suggested by MyhreEtAl2013 and MontamatStock2020, a doubling of CO2 under an annual increase in concentration of 1% would generate a radiative forcing of 5.35×ln(2)3.7 W/m25.35\times\text{ln}\left(2\right)\approx 3.7\text{ W/m}^{2}.121212An annual increase of 1% in atmospheric CO2 concentration results in a doubling of CO2 after approximately 70 years, which is described more formally as: h×ln(1.011)=ln(2),h\times ln\left(\frac{1.01}{1}\right)=ln\left(2\right)\;, for h70h\approx 70 (MontamatStock2020).

Typical estimates for TCR fall within a range of 1C-2.5C with a 66% probability, as summarized in the IPCC 5th Assessment Report (BindoffEtAl2013). More recent estimates are well aligned with this range. BrunsEtAl2020 reports a point estimate of TCR ranging from 1.17C to 1.85C, depending on the type of data and model specification. Pretis2020 embeds a two-component energy balance model into a cointegrated vector autoregressive model. His estimates vary across model specification and range from 1.24C to 1.38C. PhillipsLeirvikStorelvmo2020 report a global transient climate sensitivity of 2.05C. The IV regression in MontamatStock2020 allows for a differentiation of TCR measurements across different horizons, normalized to giving 70-year-horizon estimates. Their point estimates range in the neighborhood of 1.5C within a 95% confidence interval of roughly 0.9C to 2.1C.

Despite being much more simplistic than the models of the aforementioned works on TCR, the bivariate VARs do also allow to estimate the impact of a doubling of CO2 on temperature. Here we make use of the concept of IRFs, as presented in Equation (9). In particular, we estimate the impact on temperature when RFCO2RF^{\text{CO}_{2}} increases by one standard deviation of its reduced-form residuals, σεj\sigma_{\varepsilon_{j}}, where j=RFCO2j=RF^{\text{CO}_{2}}, from a bivariate VAR of RFCO2RF^{\text{CO}_{2}} and GMTA. Recalling the definition of TCR, a doubling of CO2 concentration, which is achieved by an annual increase of 1% in atmospheric CO2 concentration, generates an additional radiative forcing of approximately 5.35×ln(2)3.7 W/m25.35\times\text{ln}\left(2\right)\approx 3.7\text{ W/m}^{2}. In our case, the shock σεj\sigma_{\varepsilon_{j}} to CO2 is a one-time event at horizon h=0h=0, but its effects are distributed over horizons h=1,2,3,,Hh=1,2,3,...,H. This allows us to measure the cumulative increase, Ξi\Xi_{i}, in i{RFCO2,GMTA}i\in\left\{RF^{\text{CO}_{2}},GMTA\right\} generated by σεj\sigma_{\varepsilon_{j}}, where j=RFCO2j=RF^{\text{CO}_{2}}, at any horizon hh:

Ξj,h=s=0hIRF(ji,s),\Xi_{j,h}=\sum^{h}_{s=0}IRF(j\rightarrow i,s)\enskip,

where IRF(ji,s)IRF(j\rightarrow i,s) is defined as in Equation (9). Adapting the formula of OttoEtAl2013, we estimate TCRh as the increase in GMTA at horizon hh as follows:

TCRh=ΞGMTA,h×5.35×ln(2)ΞRFCO2,h,\displaystyle TCR_{h}=\Xi_{GMTA,h}\times\frac{5.35\times\text{ln}\left(2\right)}{\Xi_{RF^{\text{CO}_{2}},h}}\enskip, (10)

where ΞGMTA,h\Xi_{GMTA,h} is the cumulative increase in global temperature at horizon hh, resulting from the shock σεj\sigma_{\varepsilon_{j}}, where j=RFCO2j=RF^{\text{CO}_{2}}, at horizon h=0h=0. Likewise, ΞRFCO2,h\Xi_{RF^{\text{CO}_{2}},h} is the cumulative increase in radiative forcing of CO2 at horizon hh, resulting from the shock σεj\sigma_{\varepsilon_{j}}, where j=RFCO2j=RF^{\text{CO}_{2}}, at horizon h=0h=0.

In Table 2 we present median point estimates of TCRh for h=20h=20 and h=70h=70 (as in MontamatStock2020) from bivariate VARs of RFCO2RF^{\text{CO}_{2}} and GMTA – with and without an exogenous time trend. Thus, Table 2 reports the TCR corresponding to the model specifications in Figure 3 (a) and (b). That is, we use Bayesian estimation techniques and deploy a Minnesota prior on our parameter estimates. Our VAR has four lags and the estimation is based on annual observations between 1850 and 2005.

Table 2: Transient Climate Response

Ordering Without Trend With Trend TCR20 TCR70 TCR20 TCR70 CO2, GMTA 1.99C 2.06C 2.17C 2.58C GMTA, CO2 0.57C 1.82C 0.85C 2.39C

The main message of Table 2 is twofold: first, the ordering of the variables heavily influences the final results for TCR20\text{TCR}_{20}, demonstrating the importance of respecting the possibility of cross-correlated residuals. Mechanically, the discordance brought up by the ordering choice vanishes at much longer horizons, and, as a result, TCR70\text{TCR}_{70} estimates are largely similar. However, at that horizon, it is the choice of whether or not to include a trend that can alter results significantly. Second, even though the TCRh point estimates of the trend models are rather located at the upper bound of the IPCC range of 1C-2.5C, a simplistic bivariate VAR model including a constant and a time trend as additional exogenous regressors, is capable of providing a reasonable approximation of the rise in global mean temperature, triggered by a doubling of atmospheric CO2 concentration.

TCR estimates can be helpful in choosing which ordering is most plausible. For instance, only by ordering CO2 first do we get TCR20\text{TCR}_{20} to fall within the IPCC range. IRFs can also help sort things out. Ordering CO2 second leads to a surprisingly lasting negative effect of CO2 shocks on GMTA. An increasingly popular approach to VAR identification in macroeconomics is to use sign restrictions, where implausible IRF draws (based on economic theory) are tossed out (Uhlig2005). This dispenses the researcher from formulating a likely contentious causal ordering of variables. In a climate application, one could identify the VAR by rejecting specifications generating implausible IRF(CO2GMTA,h)IRF(\text{CO${}_{2}$}\rightarrow\text{GMTA},h) or TCRs. Applying this sort of reasoning leads us to favor – to nobody’s surprise – the specification where simultaneous causality runs from CO2 to GMTA. However, it is important to stress that this choice is obtained from prior knowledge on what is deemed reasonable and what is not, rather than our two time series.

5 Conclusion

This note is a cautionary tale about how seemingly innocuous simplifying assumptions can go wrong – especially when they are formulated without consulting the data. IFs, as proposed by SMCGL, is a concept that hinges on the assumption of zero correlation between the residuals of a bivariate VAR(1) process. In discrete time, especially with observations at lower frequencies, such an assumption is most often not justified. Both stylized simulations and an empirical application in the form of the transient climate response demonstrate that being negligent about cross-correlated residuals can lead to markedly different outcomes. Our results show that already in a bivariate system of CO2 and GMTA, the resulting TCR depends on how one deals with correlated residuals.

FEVDs provide an alternative to IFs. Albeit being a decisive improvement over IFs, FEVDs are as good as their underlying statistical model. To avoid departing too much from the SMCGL’s framework, we only considered bivariate models. Climate systems obviously comprise numerous additional variables. wilson2010atmospheric considers four in a VAR setup, but there could be many more. Additionally, the dynamics were assumed to be linear and time invariant, an approximation that should be eventually tested. estrada2013statistically trend "breaks" model is one way to do it and they report results pointing in the same direction as ours. However, with machine learning tools becoming increasingly common use, more flexible alternatives could be used to yield further insights. Finally, we considered simplistic identification schemes which implied a causal ordering of variables. There is a plethora of more sophisticated schemes available (kilian2017svar) and those could be used in future work, especially when moving beyond bivariate systems. Another – simpler – avenue is the use of data sampled at higher frequencies (like daily) which, by construction, makes the simultaneity problem much less of a Damocles sword.

References

  • Andrews et al., (2010) Andrews, T., Forster, P., Boucher, O., Bellouin, N., and Jones, A. (2010). Precipitation, Radiative Forcing and Global Temperature Change. Geophysical Research Letters, 37(14).
  • Bindoff et al., (2013) Bindoff, N., Stott, P., AchutaRao, P., Allen, M., Gillett, N., Gutzler, D., Hansingo, K., Hegerl, G., Hu, Y., Jain, S., Mokhov, I., Overland, J., Perlwitz, J., Sebbari, R., and Zhang, X. (2013). Chapter 10 - Detection and Attribution of Climate Change: From Global to Regional. In Climate Change 2013: The Physical Science Basis. IPCC Working Group I Contribution to AR5. Cambridge University Press, Cambridge.
  • Bruns et al., (2020) Bruns, S., Csereklyei, Z., and Stern, D. (2020). A Multicointegration Model of Global Climate Change. Journal of Econometrics, 214(1):175–197. Annals Issue: Econometric Models of Climate Change.
  • Chen et al., (2011) Chen, G., Glen, D. R., Saad, Z. S., Hamilton, J. P., Thomason, M. E., Gotlib, I. H., and Cox, R. W. (2011). Vector autoregression, structural equation modeling, and their synthesis in neuroimaging data analysis. Computers in biology and medicine, 41(12):1142–1155.
  • Estrada et al., (2013) Estrada, F., Perron, P., and Martínez-López, B. (2013). Statistically derived contributions of diverse human influences to twentieth-century temperature changes. Nature Geoscience, 6(12):1050–1055.
  • Etminan et al., (2016) Etminan, M., Myhre, G., Highwood, E., and Shine, K. (2016). Radiative Forcing of Carbon Dioxide, Methane, and Nitrous Oxide: A significant Revision of the Methane Radiative Forcing. Geophysical Research Letters, 43(24):12,614–12,623.
  • Forster et al., (2020) Forster, P., Forster, H., Evans, M., Gidden, M., Jones, C., Keller, C., Lamboll, R., Quéré, C., Rogelj, J., Rosen, D., Schleussner, C.-F., Richardson, T., Smith, C., and Turnock, S. (2020). Current and Future Global Climate Impacts Resulting from COVID-19. Nature Climate Change, 10(10):913–919.
  • Goulet Coulombe and Göbel, (2021) Goulet Coulombe, P. and Göbel, M. (2021). Arctic amplification of anthropogenic forcing: A vector autoregressive analysis. Journal of Climate, Forthcoming.
  • Granger, (1969) Granger, C. W. (1969). Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society, pages 424–438.
  • Granville Tunnicliffe, (2015) Granville Tunnicliffe, W. (2015). Atmospheric co2 and global temperatures: the strength and nature of their dependence. Working Paper.
  • Hansen et al., (2006) Hansen, J., Sato, M., Ruedy, R., Lo, K., Lea, D., and Medina-Elizade, M. (2006). Global Temperature Change. Proceedings of the National Academy of Sciences, 103(39):14288–14293.
  • Kilian and Lütkepohl, (2017) Kilian, L. and Lütkepohl, H. (2017). Structural vector autoregressive analysis. Cambridge University Press.
  • Koutsoyiannis and Kundzewicz, (2020) Koutsoyiannis, D. and Kundzewicz, Z. W. (2020). Atmospheric temperature and co2: Hen-or-egg causality? Sci, 2(4):83.
  • Li et al., (2013) Li, C., Notz, D., Tietsche, S., and Marotzke, J. (2013). The Transient versus the Equilibrium Response of Sea Ice to Global Warming. Journal of Climate, 26(15):5624–5636.
  • Liang, (2008) Liang, X. (2008). Information Flow within Stochastic Dynamical System. Phys. Rev. E, 78(5):031113.
  • Liang, (2014) Liang, X. (2014). Unraveling the Cause-Effect Relation between Time Series. Phys. Rev. E, 90:052150.
  • Liang, (2015) Liang, X. (2015). Normalizing the Causality between Time Series. Phys. Rev. E, 92:022126.
  • Liang, (2016) Liang, X. (2016). Information Flow and Causality as rigorous Notions ab initio. Phys. Rev. E, 94:052201.
  • Marotzke and Forster, (2015) Marotzke, J. and Forster, P. (2015). Forcing, Feedback and Internal Variability in Global Temperature Trends. Nature, 517(7536):565–570.
  • Masson-Delmotte et al., (2018) Masson-Delmotte, V., Zhai, P., Pörtner, H.-O., Roberts, D., Skea, J., Shukla, P., Pirani, A., Moufouma-Okia, W., Péan, C., Pidcock, R., Connors, S., Matthews, J.B.R.and Chen, Y., Zhou, X., Gomis, M., Lonnoy, E., Maycock, T., Tignor, M., and Waterfield, T. (2018). IPCC, 2018: Global Warming of 1.5C. An IPCC Special Report on the Impacts of Global Warming of 1.5C above Pre-Industrial Levels and related Global Greenhouse Gas Emission Pathways, in the Context of Strengthening the Global Response to the Threat of Climate Change, Sustainable Development, and Efforts to Eradicate Poverty. In press.
  • Meinshausen et al., (2017) Meinshausen, M., Vogel, E., Nauels, A., Lorbacher, K., Meinshausen, N., Etheridge, D., Fraser, P., Montzka, S., Rayner, P., Trudinger, C., Krummel, P., Beyerle, U., Canadell, J., Daniel, J., Enting, I. G., Law, R., Lunder, C., O’Doherty, S., Prinn, R., Reimann, S., Rubino, M., Velders, G., Vollmer, M., Wang, R., and Weiss, R. (2017). Historical greenhouse gas concentrations for climate modelling (cmip6). Geoscientific Model Development, 10(5):2057–2116.
  • Montamat and Stock, (2020) Montamat, G. and Stock, J. (2020). Quasi-experimental estimates of the transient climate response using observational data. Climatic Change, 160(3):361–371.
  • Myhre et al., (2013) Myhre, G., Shindell, D., Bréon, F.-M., Collins, W., Fuglestvedt, J., Huang, J., Koch, D., Lamarque, J.-F., Lee, D., Mendoza, B., Nakajima, T., Robock, A., Stephens, G., Takemura, T., and Zhang, H. (2013). Anthropogenic and Natural Radiative Forcing. In Stocker, T., Qin, D., Plattner, G.-K., Tignor, M., Allen, S. K., Doschung, J., Nauels, A., Xia, Y., Bex, V., and Midgley, P., editors, Climate Change 2013: The Physical Science Basis. Contribution of Working Group I to the Fifth Assessment Report of the Intergovernmental Panel on Climate Change, pages 659–740. Cambridge University Press, Cambridge, UK.
  • Nordhaus, (2014) Nordhaus, W. (2014). Estimates of the social cost of carbon: Concepts and results from the dice-2013r model and alternative approaches. Journal of the Association of Environmental and Resource Economists, 1(1/2):273–312.
  • Notz and Stroeve, (2016) Notz, D. and Stroeve, J. (2016). Observed arctic sea-ice loss directly follows anthropogenic co2 emission. Science, 354(6313):747–750.
  • Otto et al., (2013) Otto, A., Otto, F., Boucher, O., Church, J., Hegerl, G., Forster, P., Gillett, N. P., Gregory, J., Johnson, G., Knutti, R., Lewis, N., Lohmann, U., Marotzke, J., Myhre, G., Shindell, D., Stevens, B., and Allen, M. (2013). Energy Budget Constraints on Climate Response. Nature Geoscience, 6(6):415–416.
  • Phillips et al., (2020) Phillips, P., Leirvik, T., and Storelvmo, T. (2020). Econometric Estimates of Earth’s Transient Climate Sensitivity. Journal of Econometrics, 214(1):6–32. Annals Issue: Econometric Models of Climate Change.
  • Pretis, (2020) Pretis, F. (2020). Econometric Modelling of Climate Systems: The Equivalence of Energy Balance Models and Cointegrated Vector Autoregressions. Journal of Econometrics, 214(1):256–273. Annals Issue: Econometric Models of Climate Change.
  • Shukla et al., (2019) Shukla, P., Skea, J., Calvo Buendia, E., Masson-Delmotte, V., Pörtner, H.-O., Roberts, D., Zhai, P., Slade, R., Connors, S., van Diemen, R., Ferrat, M., Haughey, E., Luz, S., Neogi, S., Pathak, M., Petzold, J., Portugal Pereira, J., Vyas, P., Huntley, E., Kissick, K., Belkacemi, M., and Malley, J. (2019). Climate Change and Land: an IPCC Special Report on Climate Change, Desertification, Land Degradation, Sustainable Land Management, Food Security, and Greenhouse Gas Fluxes in Terrestrial Ecosystems. In press.
  • Sims, (1980) Sims, C. A. (1980). Macroeconomics and reality. Econometrica: journal of the Econometric Society, pages 1–48.
  • Stips et al., (2016) Stips, A., Macias, D., Coughlan, C., Garcia-Gorriz, E., and Liang, X. (2016). On the Causal Structure between CO2 and Global Temperature. Scientific Reports, 6:21691.
  • Tawia Hagan et al., (2019) Tawia Hagan, D. F., Wang, G., San Liang, X., and Dolman, H. A. (2019). A time-varying causality formalism based on the liang–kleeman information flow for analyzing directed interactions in nonstationary climate systems. Journal of Climate, 32(21):7521–7537.
  • Tollefson, (2014) Tollefson, J. (2014). Climate Change: The Case of the Missing Heat. Nature, 505:276–278.
  • Uhlig, (2005) Uhlig, H. (2005). What are the effects of monetary policy on output? results from an agnostic identification procedure. Journal of Monetary Economics, 52(2):381 – 419.

Appendix A Data Sources

For the empirical part of Section 4, we refer to the variables presented in Table 1 of SMCGL. Due to the sparse description of the data sources, our data set reduces to the annual global mean surface temperature anomalies (GMTA) and seven other representatives of radiative forcing. For five variables we could match the reported correlation coefficients as well as the IF values of SMCGL, Table 1. For total forcing and solar these measures do not coincide. For the analysis in Section 4.2 and 4.3, we use the CO2 series based on Meinshausen2017. We transform ppm into W/m2{}^{\text{2}} as described in Section 4.

Table 3: List of Variables
Abbreviation Description Data Source
Total Forcing annual; 1850-2005 KNMI Climate Explorer
Anthropogenic annual; 1850-2005 KNMI Climate Explorer
CO2-ERF (W/m2{}^{\text{2}}) annual; 1850-2005 KNMI Climate Explorer
Aerosol annual; 1850-2005 KNMI Climate Explorer
Solar annual; 1850-2005 KNMI Climate Explorer
Volcanic annual; 1850-2005 KNMI Climate Explorer
PDO annual averages of monthly observations; 1900-2005 KNMI Climate Explorer
GMTA annual; global; 1900-2005 HadCRUT4
CO2 (Million Tonnes / Year) annual; global production-based emissions; 1850-2005 Our World in Data – not in use
CO2 (ppm) annual; global; 1850-2014 – based on Meinshausen2017 annual; global; 2015-2017 – based on NOAA-ESRL 1850-2014: IAC ETH Zürich 2015-2017: NOAA-ESRL

Appendix B Additional Simulation Results

In Figure 4 below, we compare NIF and FEVDh=2. One might be concerned that results in Figure 1 suffer from a horizon mismatch between NIF and FEVDh=10. Results suggest that reducing hh worsens NIFs’ problems by shrinking the "safe" regions. This is intuitive, the effect of assumptions on simultaneous relationships gets milder as we get further from h=0h=0.

Refer to caption
(a) DGP(1)
Refer to caption
(b) DGP(2)
Refer to caption
(c) DGP(3)
Refer to caption
(d) DGP(4)
Refer to caption

Notes: Color specification: the light blue area shows the region, in which FEVDs suggest Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2} >> Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1}, regardless of the ordering. Similarly, the light green area shows the region, in which FEVDs suggest Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1} >> Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2} – independent of the ordering. The white/non-colored areas shows those regions, for which the ordering of X1X1 and X2X2 does matter. An unambiguous determination of Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2} >> Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1}, respectively Υ21i,j\Upsilon^{i,j}_{2\rightarrow 1} >> Υ12i,j\Upsilon^{i,j}_{1\rightarrow 2}, is not possible.

Figure 4: Ranking of NIFs (τij\tau_{i\rightarrow j}) vs Ranking of FEVDs (Υiji,j\Upsilon^{i,j}_{i\rightarrow j})
Different Levels of Correlation; Horizon h=2h=2

Appendix C Additional Empirical Results

Table 4: Empirical Results for the Bivariate Relationship
Between Various Forcings and GMTA
Sample Period: 1850-2017

Correlation Normalized IF (IF ×\times 100) FEVD Lags (PP) Correlation of Residuals (ρu\rho_{u}) Ordering: ii, GMTA Ordering: GMTA, ii ii \rightarrow GMTA GMTA \rightarrow ii iGMTA{i\rightarrow\text{GMTA}} GMTAi{\text{GMTA}\rightarrow i} iGMTA{i\rightarrow\text{GMTA}} GMTAi{\text{GMTA}\rightarrow i} Total Forcing 0.82 36.8 29.5 4 0.23*** 48.0 15.6 28.3 29.3 (17.2) (15.3) 1 0.29*** 55.7 14.4 30.0 36.6 Anthropogenic 0.91 43.7 -18.9 4 -0.16** 6.5 5.0 4.9 13.5 (39.9) (-0.6) 1 -0.15** 4.2 5.1 2.4 13.7 CO2 - ERF (W/m2{}^{\text{2}}) SMCGL 0.91 43.5 -13.0 4 -0.10 5.75 9.2 5.3 15.4 (39.0) (-0.3) 1 -0.11 2.1 3.1 1.2 8.0 Aerosol -0.84 37.9 -45.7 4 -0.17** 3.9 2.9 6.1 0.0 (19.4) (-1.3) 1 -0.00 2.0 33.3 2.0 33.1 Solar 31.4 7.0 2.1 8 0.05 4.6 0.9 3.1 1.4 (1.1) (0.6) 1 0.06 6.7 1.0 4.4 2.0 Volcanic 0.11 1.3 -0.3 4 0.18** 10.1 0.5 2.7 2.4 (0.2) (-0.2) 1 0.20*** 7.1 0.3 0.6 3.7 PDO 0.15 -2.3 -0.4 4 0.4*** 31.0 0.8 3.9 13.7 (-0.3) (-0.3) 1 0.38*** 9.5 0.3 0.7 13.5 CO2 (Mt/yr) 0.89 42.0 1.0 2 -0.12 7.9 1.9 8.8 0.3 (31.0) (0.00) 1 -0.10 5.4 0.0 5.4 0.7 CO2 (W/m2{}^{\text{2}}) 0.91 43.4 -13.4 2 0.18** 5.0 16.1 6.1 6.2 (38.3) (-0.3) 1 0.06 1.5 3.4 1.0 1.6

Notes: ii corresponds to the type of radiative forcing, listed in the left most column. The second column ("Correlation") documents the correlation between GMTA and variable ii. FEVD values are taken at horizon h=15h=15, which translates into the contribution of variable ii in the variance of the forecast error of variable jj a decade and a half after the in-sample end date. Numbers in bold underline the highest absolute causal flow among a (ii,GMTA) pair for a given measure. "*", "**", and "***" means that the null of the residuals cross-correlation of residuals is rejected at the 10%,5% and 1% level respectively. Sample period: 1850-2017.  

Table 5: Transient Climate Response
Sample Period: 1850-2017

Ordering Without Trend With Trend TCR20 TCR70 TCR20 TCR70 CO2, GMTA 1.46C 1.94C 1.76C 2.35C GMTA, CO2 0.58C 1.79C 0.97C 2.22C