This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Continuity scaling: A rigorous framework for detecting and quantifying causality accurately

Xiong Ying School of Mathematical Sciences, SCMS, and SCAM, Fudan University, Shanghai 200433, China Research Institute for Intelligent Complex Systems, CCSB, and LCNBI, Fudan University, Shanghai 200433, China State Key Laboratory of Medical Neurobiology, and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai 200032, China    Si-Yang Leng Research Institute for Intelligent Complex Systems, CCSB, and LCNBI, Fudan University, Shanghai 200433, China Institute of AI and Robotics, Academy for Engineering and Technology, Fudan University, Shanghai 200433, China    Huan-Fei Ma School of Mathematical Sciences, Soochow University, Suzhou 215006, China    Qing Nie Department of Mathematics, Department of Developmental and Cell Biology, and NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA 92697-3875, USA    Ying-Cheng Lai School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, Arizona 85287-5706, USA    Wei Lin [email protected] School of Mathematical Sciences, SCMS, and SCAM, Fudan University, Shanghai 200433, China Research Institute for Intelligent Complex Systems, CCSB, and LCNBI, Fudan University, Shanghai 200433, China State Key Laboratory of Medical Neurobiology, and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai 200032, China Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China
Abstract

Data based detection and quantification of causation in complex, nonlinear dynamical systems is of paramount importance to science, engineering and beyond. Inspired by the widely used methodology in recent years, the cross-map-based techniques, we develop a general framework to advance towards a comprehensive understanding of dynamical causal mechanisms, which is consistent with the natural interpretation of causality. In particular, instead of measuring the smoothness of the cross map as conventionally implemented, we define causation through measuring the scaling law for the continuity of the investigated dynamical system directly. The uncovered scaling law enables accurate, reliable, and efficient detection of causation and assessment of its strength in general complex dynamical systems, outperforming those existing representative methods. The continuity scaling based framework is rigorously established and demonstrated using datasets from model complex systems and the real world.

I Introduction

Identifying and ascertaining causal relations is a problem of paramount importance to science and engineering with broad applications bunge2017causality ; pearl2009causality ; Runge2019inferring . For example, accurate detection of causation is key to identifying the origin of diseases in precision medicine CV:2015 and is important to fields such as psychiatry SSFRLPWBBA:2016 . Traditionally, associational concepts are often misinterpreted as causation CoxHinkley1979 ; CoverThomas2012 , while causal analysis in fact goes one step further beyond association in a sense that, instead of using static conditions, causation is induced under changing conditions JPearl2009 . The principle of Granger causality formalizes a paradigmatic framework Wiener:1956 ; Granger:1969 ; Haufe2013 , quantifying causality in terms of prediction improvements, but, because of its linear, multivariate, and statistical regression nature, the various derived methods require extensive data ding2006granger . Entropy based methods Schreiber:2000 ; frenzel2007partial ; vicente2011transfer ; runge2012escaping ; SCB:2014 ; CSB:2015 ; STB:2015 ; research2021 face a similar difficulty. Another issue with the Granger causality is the fundamental requirement of separability of the underlying dynamical variables, which usually cannot be met in the real world systems. To overcome these difficulties, the cross-map-based techniques, paradigms tailored to dynamical systems, have been developed and have gained widespread attention in the past decade PRE_Hirata2010 ; PRE_Quiroga2000 ; PhyD_Arnhold1999 ; harnack2017topological ; SMYHDFM:2012 ; DFHKMMPYS:2013 ; WPCFMCHMPW:2014 ; MAC:2014 ; MW:2014 ; MAC:2014 ; YDGS:2015 ; clark2015spatial ; JHHLL:2016 ; ma2017detection ; ma2017detection ; Amigo2018 ; wang2018detecting ; pcm .

The cross map is originated from nonlinear time series analysis Takens:1981 ; PCFS:1980 ; sauer1991embedology ; stark1999delay ; stark2003delay ; muldoon1998delay . A brief understanding of such a map is as follows. Consider two subsystems: XX and YY. In the reconstructed phase space of XX, if for any state vector at a time a set of neighboring vectors can be found, the set of the cross mapped vectors, which are the partners with equal time of XX, could be available in YY. The cross map underlying the reconstructed spaces can be written as Yt=Φ(Xt)Y_{t}=\Phi(X_{t}) (where XtX_{t} and YtY_{t} are delay coordinates with sufficiently large dimensions) for the case of YY unidirectionally causing XX while, mathematically, its inverse map does not exist Amigo2018 . In practice, using the prior knowledge on the true causality in toy models or/and the assumption on the expanding property of Φ\Phi (representing by its Jacobian’s singular value larger than one in the topological causality framework harnack2017topological ), scientists developed many practically useful techniques based on the cross map for causality detection. For instance, the “activity” method, originally designed to measure the continuity of the inverse of the cross map, compares the divergence of the cross mapped vectors to the state vector in XX with the divergence of the independently-selected neighboring vectors to the same state vector PRE_Quiroga2000 ; PhyD_Arnhold1999 . The topological causality measures the divergence rate of the cross mapped vectors from the state vectors in YY harnack2017topological , and the convergent cross mapping (CCM), increasing the length of time series, compares the true state vector YY with the average of the cross mapped vectors, as the estimation of YY SMYHDFM:2012 ; DFHKMMPYS:2013 ; WPCFMCHMPW:2014 ; MAC:2014 ; MW:2014 ; YDGS:2015 ; clark2015spatial ; JHHLL:2016 ; ma2017detection ; Amigo2018 ; wang2018detecting ; PRE_Hirata2010 ; pcm . Then, the change of the divergence or the accuracy of the estimation is statistically evaluated for determining the causation from YY to XX. Inversely, the causation from XX to YY can be evaluated in an analogous manner. The above evaluations DFHKMMPYS:2013 ; WPCFMCHMPW:2014 ; MAC:2014 ; MW:2014 ; YDGS:2015 ; clark2015spatial ; JHHLL:2016 ; ma2017detection ; harnack2017topological ; Amigo2018 ; wang2018detecting ; PRE_Hirata2010 ; pcm can be understood at a conceptional and qualitative level and perform well in many demonstrations.

In this work, striving for a comprehensive understanding of causal mechanisms and inspired by the cross-map-based techniques, we develop a mathematically rigorous framework for detecting causality in nonlinear dynamical systems, turning eyes towards investigating the original systems from their cross maps, which is also logically consistent with the natural interpretation of causality as functional dependences pearl2009causality ; JPearl2009 . The skills used in cross-map-based methods are assmilated in our framework, while we directly study the original dynamical systems or the reconstructed systems instead of the cross maps. The foundation of our framework is the scaling law for the changing relation of ε\varepsilon with δ\delta arising from the continuity for the investigated system, henceforth the term “continuity scaling”. In addition to providing a theory, we demonstrate, using synthetic and real-world data, that our continuity scaling framework is accurate, computationally efficient, widely applicable, showing advantages over the existing methods.

II Continuity Scaling Framework

To explain the mathematical idea behind the development of our framework, we use the following class of discrete time dynamical systems: 𝒙t+1=𝒇(𝒙t,𝒚t)\bm{x}_{t+1}=\bm{f}(\bm{x}_{t},\bm{y}_{t}) and 𝒚t+1=𝒈(𝒙t,𝒚t)\bm{y}_{t+1}=\bm{g}(\bm{x}_{t},\bm{y}_{t}) for tt\in\mathbb{N}, where the state variables {𝒙t}t\{\bm{x}_{t}\}_{t\in\mathbb{N}}, {𝒚t}t\{\bm{y}_{t}\}_{t\in\mathbb{N}} evolve in the compact manifolds \mathcal{M}, 𝒩\mathcal{N} of dimension DD_{\mathcal{M}}, D𝒩D_{\mathcal{N}} under sufficiently smooth map 𝒇\bm{f}, 𝒈\bm{g}, respectively. We adopt the common recognition of causality in dynamical systems.

Definition. If the dependence of 𝒇(𝒙,𝒚)\bm{f}(\bm{x},\bm{y}) on 𝒚\bm{y} is nontrivial (i.e., a directional coupling exists), a variation in 𝒚\bm{y} will result in a change in the value of 𝒇(𝒙,𝒚)\bm{f}(\bm{x},\bm{y}) for any given 𝒙\bm{x}, which, according to the natural interpretation of causality spirtes2000causation ; pearl2009causality , admits that 𝒚:{𝒚t}t\bm{y}:\{\bm{y}_{t}\}_{t\in\mathbb{N}} has a direct causal effect on 𝒙:{𝒙t}t\bm{x}:\{\bm{x}_{t}\}_{t\in\mathbb{N}}, denoted by 𝒚𝒙\bm{y}\hookrightarrow\bm{x}, as shown in the upper panel of Fig. 1.

Refer to caption
Figure 1: Illustration of causal relation between two sets of dynamical variables. (a) Existence of causation from 𝒚\bm{y} in 𝒩\mathcal{N} to 𝒙\bm{x} in \mathcal{M}, where each correspondence from 𝒙t+1\bm{x}_{t+1} to 𝒚t\bm{y}_{t} is one-to-one, represented by the line or the arrow, respectively, in the upper and the middle panels. In this case, a change in lnε𝒙\ln{\varepsilon_{\bm{x}}} results in a direct change in δ𝒚\delta_{\bm{y}} (the lower panel) with ε𝒙\varepsilon_{\bm{x}} and δ𝒚\delta_{\bm{y}} denoting the neighborhood size of the resulting variable 𝒙\bm{x} and of the causal variable 𝒚\bm{y}, respectively. (b) Absence of causation from 𝒚\bm{y} to 𝒙\bm{x}, where every point on each trajectory, {𝒚t}\{\bm{y}_{t}\}, in 𝒩\mathcal{N} could be the correspondent point from 𝒙t+1\bm{x}_{t+1} in \mathcal{M} (the upper panel) and thus every point in 𝒩\mathcal{N} belongs to the largest δ\delta-neighborhood of 𝒚t\bm{y}_{t} (the middle panel). In this case, δ𝒚\delta_{\bm{y}} does not depend on ε𝒙\varepsilon_{\bm{x}} (the lower panel). Also refer to the supplemental animation for illustration.

We now interpret the causal relationship stipulated by the continuity of a function. Let 𝒇𝒙g()𝒇(𝒙g,)\bm{f}_{\bm{x}_{\text{g}}}(\cdot)\triangleq\bm{f}(\bm{x}_{\text{g}},\cdot) for a given point 𝒙g\bm{x}_{\text{g}}\in\mathcal{M}. For any 𝒚P𝒩\bm{y}_{P}\in\mathcal{N}, we denote its image under the given function by 𝒙I𝒇𝒙g(𝒚P)\bm{x}_{I}\triangleq\bm{f}_{\bm{x}_{\text{g}}}(\bm{y}_{P}). Applying the logic statement of a continuous function to 𝒇𝒙g()\bm{f}_{\bm{x}_{\text{g}}}(\cdot), we have that, for any neighborhood 𝒪(𝐱I,ε)\mathcal{O}(\bm{x}_{I},\varepsilon) centered at 𝐱I\bm{x}_{I} and of radius ε>0\varepsilon>0, there exists a neighborhood 𝒪(𝐲P,δ)\mathcal{O}(\bm{y}_{P},\delta) centered at 𝐲P\bm{y}_{P} of radius δ>0\delta>0, such that 𝐟𝐱g(𝒪(𝐲P,δ))𝒪(𝐱I,ε)\bm{f}_{\bm{x}_{\text{g}}}(\mathcal{O}(\bm{y}_{P},\delta))\subset\mathcal{O}(\bm{x}_{I},\varepsilon). The neighborhood and its radius are defined by

𝒪(𝒑,h)={𝒔𝒮|dist𝒮(𝒔,𝒑)<h},𝒑𝒮,h>0,\mathcal{O}(\bm{p},h)=\big{\{}\bm{s}\in\mathcal{S}~{}\big{|}~{}\text{dist}_{\mathcal{S}}(\bm{s},\bm{p})<h\big{\}},~{}\bm{p}\in\mathcal{S},~{}h>0,

where dist𝒮(,)\text{dist}_{\mathcal{S}}(\cdot,\cdot) represents an appropriate metric describing the distance between two given points in a specified manifold 𝒮\mathcal{S} with 𝒮=\mathcal{S}=\mathcal{M} or 𝒩\mathcal{N}. The meaning of this mathematical statement is that, if we have a neighborhood of the resulting variable 𝒙I\bm{x}_{I} first, we can then find a neighborhood for the causal variable 𝒚P\bm{y}_{P} to satisfy the above mapping and inclusion relation. This operation of “first-ε\varepsilon-then-δ\delta” provides a rigorous base for the principle that the information about the resulting variable can be used to estimate the information of the causal variable and therefore to ascertain causation, as indicated by the long arrow in the middle panels of Fig. 1. Note that, the existence of the δ>0\delta>0 neighborhood is always guaranteed for a continuous map 𝒇𝒙g\bm{f}_{\bm{x}_{\text{g}}}. In fact, due to the compactness of the manifold 𝒩\mathcal{N}, a largest value of δ\delta exists. However, if 𝒚P\bm{y}_{P} does not have an explicit causal effect on the variable 𝒙I\bm{x}_{I}, i.e., 𝒇𝒙g\bm{f}_{\bm{x}_{\rm g}} is independent of 𝒚P\bm{y}_{P}, the existence of δ\delta is still assured but it is independent of the value of ε\varepsilon, as shown in the upper panel of Fig. 1. This means that merely determining the existence of a δ\delta-neighborhood is not enough for inferring causation - it is necessary to vary ε\varepsilon systematically and to examine the scaling relation between δ\delta and ε\varepsilon. In the following we discuss a number of scenarios.

Case I: Dynamical variables {(𝒙t,𝒚t)}t\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}} are fully measurable. For any given constant ε𝒙>0\varepsilon_{\bm{x}}>0, the set {𝒙τ|τI𝒙t(ε𝒙)}\{\bm{x}_{\tau}\in\mathcal{M}~{}|~{}\tau\in I^{t}_{\bm{x}}(\varepsilon_{\bm{x}})\} can be used to approximate the neighborhood 𝒪(𝒙t+1,ε𝒙)\mathcal{O}(\bm{x}_{t+1},\varepsilon_{\bm{x}}), where the time index set is

I𝒙t(ε𝒙){τ|dist(𝒙t+1,𝒙τ)<ε𝒙}.I^{t}_{\bm{x}}(\varepsilon_{\bm{x}})\triangleq\left\{\tau\in\mathbb{N}~{}|~{}\text{dist}_{\mathcal{M}}(\bm{x}_{t+1},\bm{x}_{\tau})<\varepsilon_{\bm{x}}\right\}. (1)

The radius δ𝒚t=δ𝒚t(ε𝒙)\delta^{t}_{\bm{y}}=\delta^{t}_{\bm{y}}(\varepsilon_{\bm{x}}) of the neighborhood 𝒪(𝒚t,δ𝒚t)\mathcal{O}(\bm{y}_{t},\delta^{t}_{\bm{y}}) satisfying 𝒇𝒙g=𝒙t(𝒪(𝒚t,δ𝒚t))𝒪(𝒙t+1,ε𝒙)\bm{f}_{\bm{x}_{\rm g}=\bm{x}_{t}}(\mathcal{O}(\bm{y}_{t},\delta^{t}_{\bm{y}}))\subset\mathcal{O}(\bm{x}_{t+1},\varepsilon_{\bm{x}}) can be estimated as

δ𝒚t(ε𝒙){#[I¯𝒙t(ε𝒙)]}1τI¯𝒙t(ε𝒙)dist𝒩(𝒚t,𝒚τ1),\delta^{t}_{\bm{y}}(\varepsilon_{\bm{x}})\triangleq\left\{\#[\bar{I}^{t}_{\bm{x}}(\varepsilon_{\bm{x}})]\right\}^{-1}\sum_{\tau\in\bar{I}^{t}_{\bm{x}}(\varepsilon_{\bm{x}})}\text{dist}_{\mathcal{N}}(\bm{y}_{t},\bm{y}_{\tau-1}), (2)

where #[]\#[\cdot] is the cardinality of the given set and the index set is I¯𝒙t(ε𝒙){τI𝒙t(ε𝒙)|dist(𝒙t,𝒙τ1)<ε𝒙}\bar{I}^{t}_{\bm{x}}(\varepsilon_{\bm{x}})\triangleq\{\tau\in I^{t}_{\bm{x}}(\varepsilon_{\bm{x}})~{}|~{}~{}\text{dist}_{\mathcal{M}}(\bm{x}_{t},\bm{x}_{\tau-1})<\varepsilon_{\bm{x}}\}.

The strict mathematical steps for estimating δ𝒚t\delta^{t}_{\bm{y}} are given in Section II of Supplementary Information (SI). We emphasize that here correspondence between 𝒙t+1\bm{x}_{t+1} and 𝒚t\bm{y}_{t} is investigated, differing from the cross-map-based methods, with one-step time difference naturally arising. This consideration yields a key condition [DD], which is only need when considering the original iteration/flow and whose detailed description and universality are demonstrated in SI. We reveal a linear scaling law between δ𝒚tt\langle\delta^{t}_{\bm{y}}\rangle_{t\in\mathbb{N}} and lnε𝒙\ln{\varepsilon_{\bm{x}}}, as shown in the lower panels of Fig. 1, whose slope s𝒚𝒙s_{\bm{y}\hookrightarrow\bm{x}} is an indicator of the correspondent relation between ε\varepsilon and δ\delta and hence the causal relation 𝒚𝒙\bm{y}\hookrightarrow\bm{x}. Here, t\langle\cdot\rangle_{t\in\mathbb{N}} denotes the average over time. In particular, a larger slope value implies a stronger causation in the direction from 𝒚\bm{y} to 𝒙\bm{x} as represented by the map functions 𝒇(𝒙t,𝒚t)\bm{f}(\bm{x}_{t},\bm{y}_{t}) [Fig. 1], while a near zero slope indicates null causation in this direction [Fig. 1]. Likewise, possible causation in the reversed direction, 𝒙𝒚\bm{x}\hookrightarrow\bm{y}, as represented by the function 𝒈(𝒙t,𝒚t)\bm{g}(\bm{x}_{t},\bm{y}_{t}), can be assessed analogously. And the unidirectional case when 𝒇(𝒙,𝒚)=𝒇0(𝒙)\bm{f}(\bm{x},\bm{y})=\bm{f}_{0}(\bm{x}) independent of 𝒚\bm{y} is uniformly considered in Case II. We summarize the consideration below and an argument for the generic existence of the scaling law is provided in Section II of SI.

Theorem. For dynamical variables {(𝒙t,𝒚t)}t\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}} measured directly from the dynamical systems, if the slope s𝒚𝒙s_{\bm{y}\hookrightarrow\bm{x}} defined above is zero, no causation exists from 𝒚\bm{y} to 𝒙\bm{x}. Otherwise, a directional coupling can be confirmed from 𝒚\bm{y} to 𝒙\bm{x} and the slope s𝒚𝒙s_{\bm{y}\hookrightarrow\bm{x}} increases monotonically with the coupling strength.

Case II: The dynamical variables {(𝒙t,𝒚t)}t\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}} are not directly accessible but measurable time series {ut}t\{u_{t}\}_{t\in\mathbb{N}} and {vt}t\{v_{t}\}_{t\in\mathbb{N}} are available, where ut=u(𝒙t)u_{t}=u(\bm{x}_{t}) and vt=v(𝒚t)v_{t}=v(\bm{y}_{t}) with uu: ru\mathcal{M}\to\mathbb{R}^{r_{u}} and vv: 𝒩rv\mathcal{N}\to\mathbb{R}^{r_{v}} being smooth observational functions. To assess causation from 𝒚\bm{y} to 𝒙\bm{x}, we assume one-dimensional observational time series (for simplicity): ru=rv=1r_{u}=r_{v}=1, and use the classical delay-coordinate embedding method Takens:1981 ; PCFS:1980 ; sauer1991embedology ; stark1999delay ; stark2003delay ; Cummins2015 ; muldoon1998delay to reconstruct the phase space: 𝒖t=(ut,ut+τu,,ut+(du1)τu)\bm{u}_{t}=(u_{t},u_{t+\tau_{u}},\cdots,u_{t+(d_{u}-1)\tau_{u}})^{\top} and 𝒗t=(vt,vt+τv,,vt+(dv1)τv)\bm{v}_{t}=(v_{t},v_{t+\tau_{v}},\cdots,v_{t+(d_{v}-1)\tau_{v}})^{\top}, where τu,v\tau_{u,v} is the delay time and du,v>2(D+D𝒩)d_{u,v}>2(D_{\mathcal{M}}+D_{\mathcal{N}}) is the embedding dimension that can be determined using some standard criteria kantz2004nonlinear . As illustrated in Fig. 2, the dynamical evolution of the reconstructed states {(𝒖t,𝒗t)}t\{(\bm{u}_{t},\bm{v}_{t})\}_{t\in\mathbb{N}} is governed by

𝒖t+1=𝒇~(𝒖t,𝒗t),𝒗t+1=𝒈~(𝒖t,𝒗t).\bm{u}_{t+1}=\tilde{\bm{f}}(\bm{u}_{t},\bm{v}_{t}),~{}~{}\bm{v}_{t+1}=\tilde{\bm{g}}(\bm{u}_{t},\bm{v}_{t}). (3)

The map functions can be calculated as 𝒇~(𝒖,𝒗)𝑬u[𝒇,𝒈](𝚷1𝑬u1(𝒖),𝚷2𝑬v1(𝒗))\tilde{\bm{f}}(\bm{u},\bm{v})\triangleq\bm{E}_{u}\circ[\bm{f},\bm{g}]\left(\bm{\Pi}_{1}\circ\bm{E}_{u}^{-1}(\bm{u}),\bm{\Pi}_{2}\circ\bm{E}_{v}^{-1}(\bm{v})\right), 𝒈~(𝒖,𝒗)𝑬v[𝒇,𝒈](𝚷1𝑬u1(𝒖),𝚷2𝑬v1(𝒗))\tilde{\bm{g}}(\bm{u},\bm{v})\triangleq\bm{E}_{v}\circ[\bm{f},\bm{g}]\left(\bm{\Pi}_{1}\circ\bm{E}_{u}^{-1}(\bm{u}),\bm{\Pi}_{2}\circ\bm{E}_{v}^{-1}(\bm{v})\right), where the embedding (diffeomorphism) 𝑬s\bm{E}_{s}: ×𝒩~sds\mathcal{M}\times\mathcal{N}\to\widetilde{\mathcal{L}}_{s}\subset\mathbb{R}^{d_{s}} with ~s𝑬s(×𝒩)\widetilde{\mathcal{L}}_{s}\triangleq\bm{E}_{s}(\mathcal{M}\times\mathcal{N}), s=uorvs=u\ \mathrm{or}\ v, is given by

𝑬u(𝒙,𝒚)(u(𝒙),u𝚷1[𝒇,𝒈]τu(𝒙,𝒚),u𝚷1[𝒇,𝒈]2τu(𝒙,𝒚),,u𝚷1[𝒇,𝒈](du1)τu(𝒙,𝒚)),𝑬v(𝒙,𝒚)(v(𝒚),v𝚷2[𝒇,𝒈]τv(𝒙,𝒚),v𝚷2[𝒇,𝒈]2τv(𝒙,𝒚),,v𝚷2[𝒇,𝒈](dv1)τv(𝒙,𝒚)),\begin{array}[]{l}\bm{E}_{u}(\bm{x},\bm{y})\triangleq\big{(}u(\bm{x}),u\circ\bm{\Pi}_{1}\circ[\bm{f},\bm{g}]^{\tau_{u}}(\bm{x},\bm{y}),u\circ\bm{\Pi}_{1}\circ\\ \hskip 28.45274pt[\bm{f},\bm{g}]^{2\tau_{u}}(\bm{x},\bm{y}),\cdots,u\circ\bm{\Pi}_{1}\circ[\bm{f},\bm{g}]^{(d_{u}-1)\tau_{u}}(\bm{x},\bm{y})\big{)},\\ \bm{E}_{v}(\bm{x},\bm{y})\triangleq\big{(}v(\bm{y}),v\circ\bm{\Pi}_{2}\circ[\bm{f},\bm{g}]^{\tau_{v}}(\bm{x},\bm{y}),v\circ\bm{\Pi}_{2}\circ\\ \hskip 28.45274pt[\bm{f},\bm{g}]^{2\tau_{v}}(\bm{x},\bm{y}),\cdots,v\circ\bm{\Pi}_{2}\circ[\bm{f},\bm{g}]^{(d_{v}-1)\tau_{v}}(\bm{x},\bm{y})\big{)},\end{array}

with the inverse function 𝑬s1\bm{E}_{s}^{-1} defined on ~s\widetilde{\mathcal{L}}_{s}, [𝒇,𝒈]k[\bm{f},\bm{g}]^{k} representing the kkth iteration of the map, and the projection mappings as 𝚷1(𝒙,𝒚)=𝒙\bm{\Pi}_{1}(\bm{x},\bm{y})=\bm{x} and 𝚷2(𝒙,𝒚)=𝒚\bm{\Pi}_{2}(\bm{x},\bm{y})=\bm{y}. Case II has now been reduced to Case I and our continuity scaling framework can be used to ascertain the causation from 𝒗\bm{v} to 𝒖\bm{u} based on the measured time series with the indices I𝒖t(ε𝒖)I^{t}_{\bm{u}}(\varepsilon_{\bm{u}}), δ𝒗t(ε𝒖)\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}}) and s𝒗𝒖s_{\bm{v}\hookrightarrow\bm{u}} [Eqs. (1) and (2)].

Refer to caption
Figure 2: Illustration of system dynamics before and after embedding for Case II. In the left panel, the arrows describe how the original systems (𝒇,𝒈)(\bm{f},\bm{g}) is equivalent to the system (𝒇~,𝒈~)(\tilde{\bm{f}},\tilde{\bm{g}}) after embedding. In the right panel, causation between the internal variables 𝒙\bm{x} and 𝒚\bm{y} can be ascertained by detecting the causation between the variables 𝒖\bm{u} and 𝒗\bm{v} reconstructed from measured time series.

Does the causation from 𝒗\bm{v} to 𝒖\bm{u} imply causation from 𝒚\bm{y} to 𝒙\bm{x}? The answer is affirmative, which can be argued, as follows. If the original map function 𝒇\bm{f} is independent of 𝒚\bm{y}: 𝒇(𝒙,𝒚)=𝒇0(𝒙)\bm{f}(\bm{x},\bm{y})=\bm{f}_{0}(\bm{x}), there is no causation from 𝒚\bm{y} to 𝒙\bm{x}. In this case, the embedding 𝑬u(𝒙,𝒚)\bm{E}_{u}(\bm{x},\bm{y}) becomes independent of 𝒚\bm{y}, degenerating into the form of 𝑬u(𝒙,𝒚)=𝑬u0(𝒙)\bm{E}_{u}(\bm{x},\bm{y})=\bm{E}_{u0}(\bm{x}), a diffeomorphism from \mathcal{M} to ~u0=𝑬u0()\widetilde{\mathcal{L}}_{u0}=\bm{E}_{u0}(\mathcal{M}) only. As a result, Eq. (3) becomes: 𝒖t+1=𝒇~0(𝒖t)\bm{u}_{t+1}=\tilde{\bm{f}}_{0}(\bm{u}_{t}) and 𝒗t+1=𝒈~(𝒖t,𝒗t)\bm{v}_{t+1}=\tilde{\bm{g}}(\bm{u}_{t},\bm{v}_{t}), where 𝒇~0(𝒖)=𝑬u0𝒇𝑬u01(𝒖)\tilde{\bm{f}}_{0}(\bm{u})=\bm{E}_{u0}\circ\bm{f}\circ\bm{E}_{u0}^{-1}(\bm{u}) and the resulting mapping 𝒇~0\tilde{\bm{f}}_{0} is independent of 𝒗\bm{v}. The independence can be validated by computing the slope s𝒗𝒖s_{\bm{v}\hookrightarrow\bm{u}} associated with the scaling relation between δ𝒗tt\langle\delta^{t}_{\bm{v}}\rangle_{t\in\mathbb{N}} and lnε𝒖\ln{\varepsilon_{\bm{u}}}, where a zero slope indicates null causation from 𝒗\bm{v} to 𝒖\bm{u} and hence null causation from 𝒚\bm{y} to 𝒙\bm{x}. Conversely, a finite slope signifies causation between the variables. Thus, any type of causal relation (unidirectional or bi-directional) detected between the reconstructed state variables {(𝒖t,𝒗t)}t\{(\bm{u}_{t},\bm{v}_{t})\}_{t\in\mathbb{N}} implies the same type of causal relation between the internal but inaccessible variables 𝒙\bm{x} and 𝒚\bm{y} of the original system.

Case III: The structure of the internal variables is completely unknown. Given the observational functions u~,v~\tilde{u},\tilde{v}: \mathcal{M}×\times𝒩\mathcal{N}\to\mathbb{R} with u~t=u~(𝒙t,𝒚t)\tilde{u}_{t}=\tilde{u}(\bm{x}_{t},\bm{y}_{t}) and v~t=v~(𝒙t,𝒚t)\tilde{v}_{t}=\tilde{v}(\bm{x}_{t},\bm{y}_{t}), we first reconstruct the state space: 𝒖~t=(u~t,u~t+τ,,u~t+(d1)τ)\tilde{\bm{u}}_{t}=(\tilde{u}_{t},\tilde{u}_{t+\tau},\cdots,\tilde{u}_{t+(d-1)\tau})^{\top} and 𝒗~t=(v~t,v~t+τ,,v~t+(d1)τ)\tilde{\bm{v}}_{t}=(\tilde{v}_{t},\tilde{v}_{t+\tau},\cdots,\tilde{v}_{t+(d-1)\tau})^{\top}. To detect and quantify causation from 𝒗~\tilde{\bm{v}} to 𝒖~\tilde{\bm{u}} (or vice versa), we carry out a continuity scaling analysis with the modified indices I𝒖~t(ε𝒖~)I^{t}_{\tilde{\bm{u}}}(\varepsilon_{\tilde{\bm{u}}}), δ𝒗~t(ε𝒖~)\delta^{t}_{\tilde{\bm{v}}}(\varepsilon_{\tilde{\bm{u}}}) and s𝒗~𝒖~s_{\tilde{\bm{v}}\hookrightarrow\tilde{\bm{u}}}. Differing from Case II, here, due to the lack of knowledge about the correspondence structure between the internal and observational variables, a causal relation for the latter does not definitely imply the same for the former.

Case IV: Continuous-time dynamical systems possessing a sufficiently smooth flow {𝑺t;t}\{\bm{S}_{t};t\in\mathbb{R}\} on a compact manifold \mathcal{H}: d𝑺t(𝒖0)/dt=𝝌(𝑺t(𝒖0))\text{d}\bm{S}_{t}(\bm{u}_{0})/\text{d}t=\bm{\chi}(\bm{S}_{t}(\bm{u}_{0})), where 𝝌\bm{\chi} is the vector field. Let {u^t=ωn+ν}n\{\hat{u}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}} and {v^t=ωn+ν}n\{\hat{v}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}} be two respective time series from the smooth observational functions u^,v^\hat{u},~{}\hat{v}: \mathcal{H}\rightarrow\mathbb{R} with u^t=u^(𝑺t)\hat{u}_{t}=\hat{u}(\bm{S}_{t}) and v^t=v^(𝑺t)\hat{v}_{t}=\hat{v}(\bm{S}_{t}), where 1/ω1/\omega is the sampling rate and ν\nu is the time shift. Defining 𝚵𝑺ω\bm{\Xi}\triangleq\bm{S}_{\omega}: \mathcal{H}\rightarrow\mathcal{H} and 𝑺^n𝑺ωn+ν(𝒖0)\hat{\bm{S}}_{n}\triangleq{\bm{S}_{\omega n+\nu}}(\bm{u}_{0}), we obtain a discrete-time system as 𝑺^n+1=𝚵(𝑺^n)\hat{\bm{S}}_{n+1}=\bm{\Xi}(\hat{\bm{S}}_{n}) with the observational functions as u^n=u^(𝑺^n)\hat{u}_{n}=\hat{u}(\hat{\bm{S}}_{n}) and v^n=v^(𝑺^n)\hat{v}_{n}=\hat{v}(\hat{\bm{S}}_{n}), reducing the case to Case III and rendering applicable our continuity scaling analysis to unveil and quantify the causal relation between {u^t=ωn+ν}n\{\hat{u}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}} and {v^t=ωn+ν}n\{\hat{v}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}}. If the domains of u^\hat{u} and v^\hat{v} have their own restrictions on some particular subspaces, e.g., u^\hat{u}: u\mathcal{H}_{u}\rightarrow\mathbb{R} and v^\hat{v}: v\mathcal{H}_{v}\rightarrow\mathbb{R} with =uv\mathcal{H}=\mathcal{H}_{u}\oplus\mathcal{H}_{v}, the case is further reduced to Case II, so the detected causal relation between the observational variables imply causation between the internal variables belonging to their respective subspaces.

Refer to caption
Figure 3: Ascertaining and characterizing causation in various ecological systems of interacting species. (a,b) Unidirectional causation of two coupled species. In (a), the values of the slope sx1x2s_{x_{1}\hookrightarrow x_{2}} associated with the causal relation x1x2x_{1}\hookrightarrow x_{2} are approximately 0.0004, 0.1167, 0.1203, and 0.1238 for four different values of the coupling parameter μ21\mu_{21}. (b) Near zero slope values sx2x1s_{x_{2}\hookrightarrow x_{1}} for x2x1x_{2}\hookrightarrow x_{1}, indicating its nonexistence. (c,d) Inferred causal network of five species whose interacting structure is, respectively, that of a ring: xixi+1(mod5)x_{i}\hookrightarrow x_{i+1(\mathrm{mod}~{}5)} (i=1,,5i=1,\cdots,5) and of a tree: xjxj+1,j+3x_{j}\hookrightarrow x_{j+1,j+3} (j=1,2j=1,2), where the estimated slope values are color-coded. Results of a statistical analysis of the accuracy and reliability of the determined causal interactions are presented in SI section III. Time series of length 50005000 are used in all these simulations. The embedding parameters are τs=1\tau_{s}=1 and ds=3d_{s}=3 with s=x1,,x5s=x_{1},\cdots,x_{5}.

III Demonstrations: From Complex Dynamical Models to Real-World Networks

To demonstrate the efficacy of our continuity scaling framework and its superior performance, we have carried out extensive numerical tests with a large number of synthetic and empirical datasets, including those from gene regulatory networks as well as those of air pollution and hospital admission. The practical steps of the continuity scaling framework together with the significance test procedures are described in Methods. We present three representative examples here, while leaving others of significance to SI.

The first example is an ecological model of two unidirectionally interacting species: x1,t+1=x1,t(3.83.8x1,tμ12x2,t)x_{1,t+1}=x_{1,t}(3.8-3.8x_{1,t}-\mu_{12}x_{2,t}) and x2,t+1=x2,t(3.73.7x2,tμ21x1,t)x_{2,t+1}=x_{2,t}(3.7-3.7x_{2,t}-\mu_{21}x_{1,t}). With time series {(x1,t,x2,t)}t\{(x_{1,t},x_{2,t})\}_{t\in\mathbb{N}} obtained from different values of the coupling parameters, our continuity scaling framework yields correct results of different degree of unidirectional causation, as shown in Figs. 3-3. In all cases, there exists a reasonable range of lnεx2\ln{\varepsilon_{x_{2}}} (neither too small nor too large) from which the slope sx1x2s_{x_{1}\hookrightarrow x_{2}} of the linear scaling can be extracted. The statistical significance of the estimated slope values and consequently the strength of causation can be assessed with the standard pp-value test lancaster2018surrogate (Methods and SI). An ecological model with bidirectional coupling has also been tested (see Section III of SI). Figures 3-3 show the results from ecological networks of five mutually interacting species on a ring and on a tree structure, respectively, where the color-coded slope values reflect accurately the interaction patterns in both cases.

Refer to caption
Figure 4: Detecting causation in the unidirectionally coupled Lorenz system. The results are for different values of μ21\mu_{21} (μ12=0\mu_{12}=0) and sampling rate 1/ω1/\omega. (a,b) Color-coded values of the slopes sy1y2s_{y_{1}\hookrightarrow y_{2}} and sy2y1s_{y_{2}\hookrightarrow y_{1}}, respectively. The integration time step is 10310^{-3} and the embedding parameters are ds=7d_{s}=7, τs0.05\tau_{s}\approx 0.05 with ω|τs\omega|\tau_{s} (s=y1s=y_{1} or y2y_{2}). See Section III and Tab. S9 of SI for all the other parameters including the time series lengths used in the simulations.

The second example is the coupled Lorenz system: x˙i=σi(yixi)+μijxj\dot{x}_{i}=\sigma_{i}(y_{i}-x_{i})+\mu_{ij}x_{j}, y˙i=xi(ρizi)yi\dot{y}_{i}=x_{i}(\rho_{i}-z_{i})-y_{i}, z˙i=xiyiβizi\dot{z}_{i}=x_{i}y_{i}-\beta_{i}z_{i} with i,j=1,2i,j=1,2 and iji\not=j. We use time series {y1,t,y2,t}t=nω\{y_{1,t},y_{2,t}\}_{t=n\omega} for detecting different configurations of causation (see Section III of SI). Figure 4 presents the overall result, where the color-coded estimated values of the slope from the continuity scaling are shown for different combinations of the sampling rate 1/ω1/\omega and coupling strength. Even with relatively low sampling rate, our continuity scaling framework can successfully detect and quantify the strength of causation. Note that the accuracy does not vary monotonously with the sampling rate, indicating the potential of our framework to ascertain and quantify causation even with rare data. Moreover, the proposed index can accurately reflect the true causal strength (denoting by the coupling parameter), which is also evidenced by numerical tests in Sections III and IV of SI. Robustness tests against different noise perturbations are provided in Section III of SI demonstrating the practical usefulness of our framework. Additionally, analogous to the first example, we present in SI several examples on causation detection in the coupled Lorenz system with nonlinear couplings, and the Rössler-Lorenz system, etc., which further demonstrates the generic efficacy of our framework.

In addition, we present study on several real-world dataset, which brings new insights to the evolutionary mechanism of underlying systems. We study gene expression data from DREAM4 in silico Network Challenge marbach2009generating ; marbach2010revealing , whose intrinsic gene regulatory networks (GRNs) are known for verification [Fig. 5 and Fig. S17 of SI]. Applying our framework to these data, we ascertain the causations between each pair of genes by using the continuity scaling framework. The corresponding ROC curves for five different networks as well as their AUROC values are shown in Fig. 5, which indicates a high detection accuracy in dealing with real-world data.

Refer to caption
Figure 5: Detecting causal interactions in five GRNs. (a) One representative GRN containing 20 randomly-selected genes. Other four structures can be find in Fig. S17 of SI. (b) The ROC curves as well as their AUROC values demonstrate the efficacy of our framework.

We then test the causal relationship in a marine ecosystem consisting of Pacific sardine landings, northern anchovy landings and sea surfae temperature (SST). We reveal new findings to support the competing relationship hypothesis stated in Lasker1983Ocean which cannot be detected by CCM SMYHDFM:2012 . As pointed out in Fig. 6, while common influence from SST to both species is verified with both methods, our continuity scaling additionally illuminates notable influence from anchovy to sardine with its reverse direction being less significant. While competing relationship plays an important role in ecosystems, continuity scaling can reveal more essential interaction mechanism. See Section III.E of SI for more details.

Refer to caption
Figure 6: The comparison of causal network structure detected by continuity scaling and CCM among sea surface temperature, sardine and anchovy.

Moreover, we study the transmission mechanism of the recent COVID-19 pandemic. Particularly, we analyze the daily new cases of COVID-19 of representative contries for two stages: day 11 (January 22nd{}^{\text{nd}} 2020) to day 100100 (April 30th{}^{\text{th}} 2020) and day 101101 (May 1st{}^{\text{st}} 2020) to day 391391 (February 15th{}^{\text{th}} 2021). Our continuity scaling is pairwisely applied to reconstruct the transmission causal network. As shown in Fig. 7, China shows a significant effect on a few countries at the first stage and this effect disappears at the second stage. However, other countries show a different situation with China, whose external effect lasts as shown in Section III.E and Fig. S18 of SI. Our results accord with that China holds stringent epidemic control strategies with sporadic domestic infections, as evidenced by official daily briefings, demonstrating the potential of continuity scaling in detecting causal networks for ongoing complex systems. Additionally, We emphasize that day 100100 is a suitable critical day to distinguish the early severe stage and the late well-under-control stage of the pandemic [see Fig. S18(a) of SI], while slight change of the critical day will not nullify our result. As shown in Fig. S18(b) of SI, when the critical day varies from day 9494 to day 106106, no significant change (less than 5%5\%) of the detected causal links occurs at both stages, and the number of countries under influence of China at Stage 22 remains zero. See more details in Section III.E of SI.

Refer to caption
Figure 7: The causal effect from China to other countries of the COVID-19 pandemic detected by continuity scaling between Stage 11 and 22. Here, Stage 11 is from January 22nd{}^{\text{nd}} 2020 to April 30th{}^{\text{th}} 2020, and Stage 22 is from May 1st{}^{\text{st}} 2020 to February 15th{}^{\text{th}} 2021. For those detected causal links between all countries, refer to Section III.E and Fig. S18 of SI. These maps are for illustration only.

Additional real world examples including air pollutants and hospital admission record from Hong Kong are also shown in Section III of SI.

IV Discussion

To summarize, we have developed a novel framework for data based detection and quantification of causation in complex dynamical systems. On the basis of the widely used cross-map-based techniques, our framework enjoys a rigorous foundation, focusing on the continuity scaling law of the concerned system directly instead of only investigating the continuity of its cross map. Therefore, our framework is consistent with the standard interpretation of causality, and works even in the typical cases where several existing typical methods do not perform that well or even they fail (see the comparison results in Section IV of SI). In addition, the mathematical reasoning leading to the core of our framework, the continuity scaling, helps resolve the long-standing issue associated with techniques directly using cross map that information about the resulting variables is required to project the dynamical behavior of the causal variables, whereas several works in the literature Quyen1999PhyD , which directly studied the continuity or the smoothness of the cross map, likely yielded confused detected results on causal directions.

Computational complexity. The computational complexity of the algorithm is O(T2Nε)O(T^{2}N_{\varepsilon}), which is relatively smaller than the CCM method, whose computational complexity is O(T2logT)O(T^{2}\log T).

Limitations and future works. Nevertheless, there are still some limitations in the presently proposed framework. For instance, currently, only bivariate detection algorithm is designed, so generalization to multivariate network inference requires further considerations, as analogous to those works presented in Refs. Peters2017elements ; Runge2018causal ; research2019 . In addition, the causal time delay has not been taken into account in the current framework, so it also could be further investigated, similar to the work reported in Ref. ma2017detection . Definitely, we will settle these questions in our future work.

Detecting causality in complex dynamical systems has broad applications not only in science and engineering, but also in many aspects of the modern society, demanding accurate, efficient, and rigorously justified and hence trustworthy methodologies. Our present work provides a vehicle along this feat, and indeed resolves the puzzles arising in the use of those influential methods.

V Methods

Continuity scaling framework: a detailed description of algorithms. Let {ut}t=1,2,,T\{u_{t}\}_{t=1,2,\dots,T} and {vt}t=1,2,,T\{v_{t}\}_{t=1,2,\dots,T} be two experimentally measured time series of internal variables {(𝒙t,𝒚t)}t\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}}. Typically, if the dynamical variables {(𝒙t,𝒚t)}t\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}} are accessible, {(ut,vt)}\{(u_{t},v_{t})\} reduce to one-dimensional coordinate of the internal system. The key computational steps of our continuity scaling framework are described, as follows.

We reconstruct the phase space using the classical method of delay coordinate embedding Takens:1981 with the optimal embedding dimension dzd_{z} and time lag τz\tau_{z} determined by the methods in Refs. MutualInfoMethod ; FNNmethod (i.e., the false nearest neighbors and the delayed mutual information respectively):

z{𝒛(t)=(zt,zt+τz,,zt+(dz1)τz)|t=1,,T0},\mathcal{L}_{z}\triangleq\left\{\bm{z}(t)=\left(z_{t},z_{t+\tau_{z}},\dots,z_{t+(d_{z}-1)\tau_{z}}\right)~{}|~{}t=1,\dots,T_{0}\right\}, (4)

where z=u,vz=u,v, T0=min{T(dz1)τz|z=u,v}T_{0}=\min\{T-(d_{z}-1)\tau_{z}~{}|~{}z=u,v\}, and Euclidean distance is used for both u,v\mathcal{L}_{u,v}.

We present the steps for causation detection using the case of 𝒗𝒖\bm{v}\hookrightarrow\bm{u} as an example.

We calculate the respective diameters for u,v\mathcal{L}_{u,v} as

Dzmax{distz(𝒛(t),𝒛(τ))|1t,τT0},D_{z}\triangleq\max\left\{\text{dist}_{\mathcal{L}_{z}}(\bm{z}(t),\bm{z}(\tau))~{}\big{|}~{}1\leqslant t,\tau\leqslant T_{0}\right\}, (5)

where z=u,vz=u,v, and 𝒛=𝒖,𝒗\bm{z}=\bm{u},\bm{v}. We set up a group of numbers, {ε𝒖,j}j=1,,Nϵ\{\varepsilon_{\bm{u},j}\}_{j=1,\cdots,N_{\epsilon}}, as ε𝒖,1=eDu\varepsilon_{\bm{u},1}=e\cdot D_{u}, ε𝒖,Nϵ=Du\varepsilon_{\bm{u},N_{\epsilon}}=D_{u}, with the other elements satisfying

(lnε𝒖,jlnε𝒖,1)/(j1)=(lnε𝒖,Nεlnε𝒖,1)/(Nε1)(\ln\varepsilon_{\bm{u},j}-\ln\varepsilon_{\bm{u},1})/(j-1)=(\ln\varepsilon_{\bm{u},N_{\varepsilon}}-\ln\varepsilon_{\bm{u},1})/(N_{\varepsilon}-1) (6)

for j=2,,Nε1j=2,\dots,N_{\varepsilon}-1. Then, in light of (1) with (2), we have

δ𝒗t(ε𝒖)=#[I𝒖t(ε𝒖)]1τI𝒖t(ε𝒖)distv(𝒗(t),𝒗(τ1)),\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})=\#[I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})]^{-1}\sum_{\tau\in I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})}\text{dist}_{\mathcal{L}_{v}}(\bm{v}(t),\bm{v}(\tau-1)), (7)

with

I𝒖t(ε𝒖)={τ|distu(𝒖(t+1),𝒖(τ))<ε𝒖,|t+1τ|>E},\begin{array}[]{l}I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})=\{\tau\in\mathbb{N}~{}\big{|}~{}\text{dist}_{\mathcal{L}_{u}}(\bm{u}(t+1),\bm{u}(\tau))<\varepsilon_{\bm{u}},\\ ~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}|t+1-\tau|>E\},\end{array} (8)

where, numerically, ε𝒖\varepsilon_{\bm{u}} alters its value successively from the set {ε𝒖,j}j=1,,Nϵ\{\varepsilon_{\bm{u},j}\}_{j=1,\cdots,N_{\epsilon}}, and the threshold EE is a positive number chosen to avoid the situation where the nearest neighboring points are induced by the consecutive time order only.

As defined, δ𝒗t(ε𝒖)t\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}} is the average of δ𝒗t(ε𝒖)\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}}) over all possible time tt. We use a finite number of pairs {(δ𝒗t(ε𝒖,j)tT0,lnε𝒖,j)}j=1,,Nε\{(\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})\rangle_{t\in\mathbb{N}_{T_{0}}},\ln\varepsilon_{\bm{u},j})\}_{j=1,\dots,N_{\varepsilon}} to approximate the scaling relation between δ𝒗t(ε𝒖)t\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}} and lnε𝒖\ln\varepsilon_{\bm{u}}, where T0={1,2,,T0}\mathbb{N}_{T_{0}}=\{1,2,\cdots,T_{0}\}. Theoretically, a larger value of NϵN_{\epsilon} and a smaller value of ee will result in a more accurate approximation of the scaling relation. In practice, the accuracy is determined by the length of the observational time series, the sampling duration, and different types of noise perturbations. In numerical simulations, we set e=0.001e=0.001 and Nϵ=33N_{\epsilon}=33. In addition, a too large or a too small value of ε𝒖\varepsilon_{\bm{u}} can induce insufficient data to restore the neighborhood and/or the entire manifold. We thus set δ𝒗t(ε𝒖,j)=δ𝒗t(ε𝒖,j+1)\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})=\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j+1}) as a practical technique as the number of points is limited practically in a small neighborhood. As a result, near zero slope values would appear on both sides of the scaling curve δ𝒗t(ε𝒖)t\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}}-lnε𝒖\ln\varepsilon_{\bm{u}}, as demonstrated in Fig. 3 and in SI. In such a case, to estimate the slope of the scaling relation, we take the following approach.

Define a group of numbers by

Sjδ𝒗t(ε𝒖,j+1)tT0δ𝒗t(ε𝒖,j)tT0lnε𝒖,j+1lnε𝒖,j,S_{j}\triangleq\frac{\langle\delta_{\bm{v}}^{t}(\varepsilon_{\bm{u},j+1})\rangle_{t\in\mathbb{N}_{T_{0}}}-\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})\rangle_{t\in\mathbb{N}_{T_{0}}}}{\ln\varepsilon_{\bm{u},j+1}-\ln\varepsilon_{\bm{u},j}}, (9)

where j=1,,Nε1j=1,\cdots,N_{\varepsilon}-1, sort them in a descending order, from which we determine the [Nε+12][\frac{N_{\varepsilon}+1}{2}] largest numbers, collect their subscripts - jj’s together as an index set J^\hat{J}, and set H{j,j+1|jJ^}H\triangleq\left\{j,~{}j+1~{}\big{|}~{}j\in\hat{J}\right\}. Applying the least squares method to the linear regression model:

δ𝒗t(ε𝒖)t=Slnε𝒖+b\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}}=S\cdot\ln\varepsilon_{\bm{u}}+b (10)

with the dataset {(δ𝒗t(ε𝒖,j)tT0,lnε𝒖,j)}jH\{(\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})\rangle_{t\in\mathbb{N}_{T_{0}}},\ln\varepsilon_{\bm{u},j})\}_{j\in H}, we get the optimal values (S^,b^)(\hat{S},\hat{b}) for the parameters (S,b)(S,b) in (10) and finally obtain the slope of the scaling relation as s𝒗𝒖S^s_{\bm{v}\hookrightarrow\bm{u}}\triangleq\hat{S}.

For the other causal direction from 𝒖\bm{u} to 𝒗\bm{v}, these steps are equally applicable to estimating the slope s𝒖𝒗s_{\bm{u}\hookrightarrow\bm{v}}.

To assess the statistical significance of the numerically determined causation, we devise the following surrogate test using the case of 𝒗\bm{v} causing 𝒖\bm{u} as an illustrative example.

Divide the time series {𝒖(t)}tT0\{\bm{u}(t)\}_{t\in\mathbb{N}_{T_{0}}} into NGN_{G} consecutive segments of equal length (except for the last segment - the shortest segment). Randomly shuffle these segments and then regroup them into a surrogate sequence {𝒖^(t)}tT0\{\bm{\hat{u}}(t)\}_{t\in\mathbb{N}_{T_{0}}}. Applying such a random permutation method to {𝒗(t)}tT0\{\bm{v}(t)\}_{t\in\mathbb{N}_{T_{0}}} generates another surrogate sequence {𝒗^(t)}tT0\{\bm{\hat{v}}(t)\}_{t\in\mathbb{N}_{T_{0}}}. Carrying out the slope computation yields s𝒗^𝒖^s_{\bm{\hat{v}}\hookrightarrow\bm{\hat{u}}}. The procedure can be repeated for a sufficient number of times, say QQ, which consequently yields a group of estimated slopes, denoted as {s𝒗^𝒖^q}q=0,1,Q\{s^{q}_{\bm{\hat{v}}\hookrightarrow\bm{\hat{u}}}\}_{q=0,1\cdots,Q}, where s𝒗^𝒖^0s^{0}_{\bm{\hat{v}}\hookrightarrow\bm{\hat{u}}} is set as s𝒗𝒖s_{\bm{v}\hookrightarrow\bm{u}} obtained from the original time series. For all the estimated slopes, we calculate their mean μ^𝒗𝒖\hat{\mu}_{\bm{v}\hookrightarrow\bm{u}} and the standard deviation σ^𝒗𝒖\hat{\sigma}_{\bm{v}\hookrightarrow\bm{u}}. The pp-value for s𝒗𝒖s_{\bm{v}\hookrightarrow\bm{u}} is calculated as

ps𝒗𝒖1normcdf[s𝒗𝒖μ^𝒗𝒖σ^𝒗𝒖],p_{s_{\bm{v}\hookrightarrow\bm{u}}}\triangleq 1-\text{normcdf}\left[\frac{s_{\bm{v}\hookrightarrow\bm{u}}-\hat{\mu}_{\bm{v}\hookrightarrow\bm{u}}}{\hat{\sigma}_{\bm{v}\hookrightarrow\bm{u}}}\right], (11)

where normcdf[]\text{normcdf}[\cdot] is the cumulative Gaussian distribution function. The principle of statistical hypothesis testing guarantees the existence of causation from 𝒗\bm{v} to 𝒖\bm{u} if ps𝒗𝒖<0.05p_{s_{\bm{v}\hookrightarrow\bm{u}}}<0.05.

In simulations, we set the number of segments to be NG=25N_{G}=25 and the number of times for random permutations to be Q20Q\geq 20.

References

  • (1) Bunge, M. Causality and modern science (Routledge, 2017).
  • (2) Pearl, J. Causality (Cambridge university press, 2009).
  • (3) Runge, J., Bathiany, S. et al. Inferring causation from time series in Earth system sciences. Nature communications 10, 2553 (2019). doi: 10.1038/s41467-019-10105-3.
  • (4) Collins, F. S. & Varmus, H. A new initiative on precision medicine. New England Journal of Medicine 372, 793–795 (2015). doi: 10.1056/NEJMp1500523.
  • (5) Saxe, G. N. et al. A complex systems approach to causal discovery in psychiatry. PloS ONE 11, e0151174 (2016). doi: 10.1371/journal.pone.0151174.
  • (6) Cox, D. R. & Hinkley, D. V. Theoretical statistics (CRC Press, 1979). doi: 10.1201/b14832.
  • (7) Cover, T. M. Elements of information theory (John Wiley & Sons, 1999).
  • (8) Pearl, J. et al. Causal inference in statistics: An overview. Statistics Surveys 3, 96–146 (2009). doi: 10.1214/09-SS057.
  • (9) Wiener, N. The theory of prediction. Modern mathematics for engineers (1956).
  • (10) Granger, C. W. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society 424–438 (1969). doi: 10.2307/1912791.
  • (11) Haufe, S. et al. A critical assessment of connectivity measures for EEG data: A simulation study. NeuroImage 120–133(2013). doi: 10.1016/j.neuroimage.2012.09.036.
  • (12) Ding, M., Chen, Y. & Bressler, S. L. Granger causality: basic theory and application to neuroscience. Handbook of time series analysis: recent theoretical developments and applications 437 (2006). doi: 10.1002/9783527609970.ch17.
  • (13) Schreiber, T. Measuring information transfer. Physical Review Letters 85, 461 (2000). doi: 10.1103/PhysRevLett.85.461.
  • (14) Frenzel, S. & Pompe, B. Partial mutual information for coupling analysis of multivariate time series. Physical Review Letters 99, 204101 (2007). doi: 10.1103/PhysRevLett.99.204101.
  • (15) Vicente, R., Wibral, M., Lindner, M. & Pipa, G. Transfer entropy–a model-free measure of effective connectivity for the neurosciences. Journal of Computational Neuroscience 30, 45–67 (2011). doi: 10.1007/s10827-010-0262-3.
  • (16) Runge, J., Heitzig, J., Petoukhov, V. & Kurths, J. Escaping the curse of dimensionality in estimating multivariate transfer entropy. Physical Review Letters 108, 258701 (2012). doi: 10.1103/PhysRevLett.108.258701.
  • (17) Sun, J., Cafaro, C. & Bollt, E. M. Identifying the coupling structure in complex systems through the optimal causation entropy principle. Entropy 16, 3416–3433 (2014). doi: 10.3390/e16063416.
  • (18) Cafaro, C., Lord, W. M., Sun, J. & Bollt, E. M. Causation entropy from symbolic representations of dynamical systems. Chaos: An Interdisciplinary Journal of Nonlinear Science 25, 043106 (2015). doi: 10.1063/1.4916902.
  • (19) Sun, J., Taylor, D. & Bollt, E. M. Causal network inference by optimal causation entropy. SIAM Journal on Applied Dynamical Systems 14, 73–106 (2015). doi: 10.1137/140956166.
  • (20) Solyanik-Gorgone, M., Ye, J., Miscuglio, M., Afanasev, A., Willner, A. E. & Sorger, V. J. Quantifying Information via Shannon Entropy in Spatially Structured Optical Beams. Research 2021, 9780760 (2021). doi: 10.34133/2021/9780760.
  • (21) Hirata, Y. & Aihara, K. Identifying hidden common causes from bivariate time series: A method using recurrence plots. Physical Review E 81, 016203 (2010). doi: 10.1103/PhysRevE.81.016203.
  • (22) Quiroga, R. Q., Arnhold, J. & Grassberger, P. Learning driver-response relationships from synchronization patterns. Physical Review E 61, 5142 (2000). doi: 10.1103/PhysRevE.61.5142.
  • (23) Arnhold, J., Grassberger, P., Lehnertz, K. & Elger, C. E. A robust method for detecting interdependences: application to intracranially recorded eeg. Physica D: Nonlinear Phenomena 134, 419–430 (1999). doi: 10.1016/S0167-2789(99)00140-2.
  • (24) Harnack, D., Laminski, E., Schünemann, M. & Pawelzik, K. R. Topological causality in dynamical systems. Physical Review Letters 119, 098301 (2017). doi: 10.1103/PhysRevLett.119.098301.
  • (25) Sugihara, G. et al. Detecting causality in complex ecosystems. Science 338, 496–500 (2012). doi: 10.1126/science.1227079.
  • (26) Deyle, E. R. et al. Predicting climate effects on pacific sardine. Proceedings of the National Academy of Sciences 110, 6430–6435 (2013). doi: 10.1073/pnas.1215506110.
  • (27) Wang, X. et al. A two-fold increase of carbon cycle sensitivity to tropical temperature variations. Nature 506, 212–215 (2014). doi: 10.1038/nature12915.
  • (28) Ma, H., Aihara, K. & Chen, L. Detecting causality from nonlinear dynamics with short-term time series. Scientific Reports 4, 1–10 (2014). doi: 10.1038/srep07464.
  • (29) McCracken, J. M. & Weigel, R. S. Convergent cross-mapping and pairwise asymmetric inference. Physical Review E 90, 062903 (2014). doi: 10.1103/PhysRevE.90.062903.
  • (30) Ye, H., Deyle, E. R., Gilarranz, L. J. & Sugihara, G. Distinguishing time-delayed causal interactions using convergent cross mapping. Scientific Reports 5, 14750 (2015). doi: 10.1038/srep14750.
  • (31) Clark, A. T. et al. Spatial convergent cross mapping to detect causal relationships from short time series. Ecology 96, 1174–1181 (2015). doi: 10.1890/14-1479.1.
  • (32) Jiang, J.-J., Huang, Z.-G., Huang, L., Liu, H. & Lai, Y.-C. Directed dynamical influence is more detectable with noise. Scientific Reports 6, 24088 (2016). doi: 10.1038/srep24088.
  • (33) Ma, H. et al. Detection of time delays and directional interactions based on time series from complex dynamical systems. Physical Review E 96, 012221 (2017). doi: 10.1103/PhysRevE.96.012221.
  • (34) Amigó, J. M. & Hirata, Y. Detecting directional couplings from multivariate flows by the joint distance distribution. Chaos: An Interdisciplinary Journal of Nonlinear Science 28, 075302 (2018). doi: 10.1063/1.5010779.
  • (35) Wang, Y. et al. Detecting the causal effect of soil moisture on precipitation using convergent cross mapping. Scientific Reports 8, 1–8 (2018). doi: 10.1038/s41598-018-30669-2.
  • (36) Leng, S. et al. Partial cross mapping eliminates indirect causal influences. Nature Communications 11, 1–9 (2020). doi: 10.1038/s41467-020-16238-0.
  • (37) Takens, F. Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980, 366–381 (Springer, 1981). doi: 10.1007/BFb0091924.
  • (38) Packard, N. H., Crutchfield, J. P., Farmer, J. D. & Shaw, R. S. Geometry from a time series. Physical Review Letters 45, 712 (1980). doi: 10.1103/PhysRevLett.45.712.
  • (39) Sauer, T., Yorke, J. A. & Casdagli, M. Embedology. Journal of Statistical Physics 65, 579–616 (1991). doi: 10.1007/BF01053745.
  • (40) Stark, J. Delay embeddings for forced systems. i. deterministic forcing. Journal of Nonlinear Science 9, 255–332 (1999). doi: 10.1007/s003329900072.
  • (41) Stark, J., Broomhead, D. S., Davies, M. E. & Huke, J. Delay embeddings for forced systems. ii. stochastic forcing. Journal of Nonlinear Science 13, 519–577 (2003). doi: 10.1007/s00332-003-0534-4.
  • (42) Muldoon, M. R., Broomhead, D. S., Huke, J. P. & Hegger, R. Delay embedding in the presence of dynamical noise. Dynamics and Stability of Systems 13, 175–186 (1998). doi: 10.1080/02681119808806259.
  • (43) Spirtes, P., Glymour, C. N., Scheines, R. & Heckerman, D. Causation, prediction, and search (MIT press, 2000).
  • (44) Cummins, B., Gedeon, T. & Spendlove, K. On the efficacy of state space reconstruction methods in determining causality. SIAM Journal on Applied Dynamical Systems 14, 335–381 (2015). doi: 10.1137/130946344.
  • (45) Kantz, H. & Schreiber, T. Nonlinear time series analysis, vol. 7 (Cambridge university press, 2004). doi: 10.1017/CBO9780511755798.
  • (46) Lancaster, G., Iatsenko, D., Pidde, A., Ticcinelli, V. & Stefanovska, A. Surrogate data for hypothesis testing of physical systems. Physics Reports 748, 1–60 (2018). doi: 10.1016/j.physrep.2018.06.001.
  • (47) Marbach, D., Schaffter, T., Mattiussi, C. & Floreano, D. Generating realistic in silico gene networks for performance assessment of reverse engineering methods. Journal of Computational Biology 16, 229–239 (2009). doi: 10.1089/cmb.2008.09TT.
  • (48) Marbach, D. et al. Revealing strengths and weaknesses of methods for gene network inference. Proceedings of the National Academy of Sciences 107, 6286–6291 (2010). doi: 10.1073/pnas.0913357107.
  • (49) R. Lasker & A. MacCall. New ideas on the fluctuations of the clupeoid stocks off California. In Proceedings of the Joint Oceanographic Assembly, Halifax, August 1982: General Symposia, 110–120 (Department of Fisheries and Oceans, Ontario, 1983).
  • (50) Quyen, M. L. V., Martinerie, J., Adam, C., Varela, F. J. Nonlinear analyses of interictal EEG map the brain interdependences in human focal epilepsy. Physica D: Nonlinear Phenomena 127,250–266(1999). doi: 10.1016/S0167-2789(98)00258-9.
  • (51) Peters, J., Janzing, D. et al. Elements of causal inference: foundations and learning algorithms (MIT Press, 2017).
  • (52) Runge, J. Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos: An Interdisciplinary Journal of Nonlinear Science 28, 075310 (2018). doi: 10.1063/1.5025050.
  • (53) Lou, Y., Wang, L. & Chen, G. Enhancing Controllability Robustness of q-Snapback Networks through Redirecting Edges. Research 2019, 7857534 (2019). doi: 10.34133/2019/7857534.
  • (54) Fraser, A. M. & Swinney, H. L. Independent coordinates for strange attractors from mutual information. Physical Review A 33, 1134 (1986). doi: 10.1103/PhysRevA.33.1134.
  • (55) Kennel, M. B., Brown, R. & Abarbanel, H. D. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Physical Review A 45, 3403 (1992). doi: 10.1103/PhysRevA.45.3403.

VI Acknowledgements

W.L. is supported by the National Key R&D Program of China (Grant No. 2018YFC0116600), by the National Natural Science Foundation of China (Grant Nos. 11925103 & 61773125), by the STCSM (Grant No. 18DZ1201000), and by the Shanghai Municipal Science and Technology Major Project (No. 2021SHZDZX0103). Y.-C.L. is supported by AFOSR (Grant No. FA9550-21-1-0438). S.-Y.L. is supported by the National Natural Science Foundation of China (No. 12101133) and “Chenguang Program” supported by Shanghai Education Development Foundation and Shanghai Municipal Education Commission (No. 20CG01). Q.N. is partially supported by NSF (Grant No. DMS1763272) and the Simons Foundation (Grant No. 594598).

VII Competing Interests

The authors declare no competing interests.

VIII Code Availability

The source codes for our CS framework are available at https://github.com/bianzhiyu/ContinuityScaling.

IX Contributions

W.L. conceived idea; X.Y., S.-Y.L., and W.L. designed and performed research; X.Y., S.-Y.L., H.-F.M., and W.L. analyzed data, H.-F.M., Y.-C.L., and Q.N. contributed data and analysis tools, and all the authors wrote the paper. X.Y. and S.-Y.L. were equally contributed.