Continuity scaling: A rigorous framework for detecting and quantifying causality accurately

Xiong Ying School of Mathematical Sciences, SCMS, and SCAM, Fudan University, Shanghai 200433, China Research Institute for Intelligent Complex Systems, CCSB, and LCNBI, Fudan University, Shanghai 200433, China State Key Laboratory of Medical Neurobiology, and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai 200032, China Si-Yang Leng Research Institute for Intelligent Complex Systems, CCSB, and LCNBI, Fudan University, Shanghai 200433, China Institute of AI and Robotics, Academy for Engineering and Technology, Fudan University, Shanghai 200433, China Huan-Fei Ma School of Mathematical Sciences, Soochow University, Suzhou 215006, China Qing Nie Department of Mathematics, Department of Developmental and Cell Biology, and NSF-Simons Center for Multiscale Cell Fate Research, University of California, Irvine, CA 92697-3875, USA Ying-Cheng Lai School of Electrical, Computer, and Energy Engineering, Arizona State University, Tempe, Arizona 85287-5706, USA Wei Lin [email protected] School of Mathematical Sciences, SCMS, and SCAM, Fudan University, Shanghai 200433, China Research Institute for Intelligent Complex Systems, CCSB, and LCNBI, Fudan University, Shanghai 200433, China State Key Laboratory of Medical Neurobiology, and MOE Frontiers Center for Brain Science, Institutes of Brain Science, Fudan University, Shanghai 200032, China Shanghai Artificial Intelligence Laboratory, Shanghai 200232, China

Abstract

Data based detection and quantification of causation in complex, nonlinear dynamical systems is of paramount importance to science, engineering and beyond. Inspired by the widely used methodology in recent years, the cross-map-based techniques, we develop a general framework to advance towards a comprehensive understanding of dynamical causal mechanisms, which is consistent with the natural interpretation of causality. In particular, instead of measuring the smoothness of the cross map as conventionally implemented, we define causation through measuring the scaling law for the continuity of the investigated dynamical system directly. The uncovered scaling law enables accurate, reliable, and efficient detection of causation and assessment of its strength in general complex dynamical systems, outperforming those existing representative methods. The continuity scaling based framework is rigorously established and demonstrated using datasets from model complex systems and the real world.

I Introduction

Identifying and ascertaining causal relations is a problem of paramount importance to science and engineering with broad applications bunge2017causality ; pearl2009causality ; Runge2019inferring . For example, accurate detection of causation is key to identifying the origin of diseases in precision medicine CV:2015 and is important to fields such as psychiatry SSFRLPWBBA:2016 . Traditionally, associational concepts are often misinterpreted as causation CoxHinkley1979 ; CoverThomas2012 , while causal analysis in fact goes one step further beyond association in a sense that, instead of using static conditions, causation is induced under changing conditions JPearl2009 . The principle of Granger causality formalizes a paradigmatic framework Wiener:1956 ; Granger:1969 ; Haufe2013 , quantifying causality in terms of prediction improvements, but, because of its linear, multivariate, and statistical regression nature, the various derived methods require extensive data ding2006granger . Entropy based methods Schreiber:2000 ; frenzel2007partial ; vicente2011transfer ; runge2012escaping ; SCB:2014 ; CSB:2015 ; STB:2015 ; research2021 face a similar difficulty. Another issue with the Granger causality is the fundamental requirement of separability of the underlying dynamical variables, which usually cannot be met in the real world systems. To overcome these difficulties, the cross-map-based techniques, paradigms tailored to dynamical systems, have been developed and have gained widespread attention in the past decade PRE_Hirata2010 ; PRE_Quiroga2000 ; PhyD_Arnhold1999 ; harnack2017topological ; SMYHDFM:2012 ; DFHKMMPYS:2013 ; WPCFMCHMPW:2014 ; MAC:2014 ; MW:2014 ; MAC:2014 ; YDGS:2015 ; clark2015spatial ; JHHLL:2016 ; ma2017detection ; ma2017detection ; Amigo2018 ; wang2018detecting ; pcm .

The cross map is originated from nonlinear time series analysis Takens:1981 ; PCFS:1980 ; sauer1991embedology ; stark1999delay ; stark2003delay ; muldoon1998delay . A brief understanding of such a map is as follows. Consider two subsystems: $X$ and $Y$ . In the reconstructed phase space of $X$ , if for any state vector at a time a set of neighboring vectors can be found, the set of the cross mapped vectors, which are the partners with equal time of $X$ , could be available in $Y$ . The cross map underlying the reconstructed spaces can be written as $Y_{t}=\Phi(X_{t})$ (where $X_{t}$ and $Y_{t}$ are delay coordinates with sufficiently large dimensions) for the case of $Y$ unidirectionally causing $X$ while, mathematically, its inverse map does not exist Amigo2018 . In practice, using the prior knowledge on the true causality in toy models or/and the assumption on the expanding property of $\Phi$ (representing by its Jacobian’s singular value larger than one in the topological causality framework harnack2017topological ), scientists developed many practically useful techniques based on the cross map for causality detection. For instance, the “activity” method, originally designed to measure the continuity of the inverse of the cross map, compares the divergence of the cross mapped vectors to the state vector in $X$ with the divergence of the independently-selected neighboring vectors to the same state vector PRE_Quiroga2000 ; PhyD_Arnhold1999 . The topological causality measures the divergence rate of the cross mapped vectors from the state vectors in $Y$ harnack2017topological , and the convergent cross mapping (CCM), increasing the length of time series, compares the true state vector $Y$ with the average of the cross mapped vectors, as the estimation of $Y$ SMYHDFM:2012 ; DFHKMMPYS:2013 ; WPCFMCHMPW:2014 ; MAC:2014 ; MW:2014 ; YDGS:2015 ; clark2015spatial ; JHHLL:2016 ; ma2017detection ; Amigo2018 ; wang2018detecting ; PRE_Hirata2010 ; pcm . Then, the change of the divergence or the accuracy of the estimation is statistically evaluated for determining the causation from $Y$ to $X$ . Inversely, the causation from $X$ to $Y$ can be evaluated in an analogous manner. The above evaluations DFHKMMPYS:2013 ; WPCFMCHMPW:2014 ; MAC:2014 ; MW:2014 ; YDGS:2015 ; clark2015spatial ; JHHLL:2016 ; ma2017detection ; harnack2017topological ; Amigo2018 ; wang2018detecting ; PRE_Hirata2010 ; pcm can be understood at a conceptional and qualitative level and perform well in many demonstrations.

In this work, striving for a comprehensive understanding of causal mechanisms and inspired by the cross-map-based techniques, we develop a mathematically rigorous framework for detecting causality in nonlinear dynamical systems, turning eyes towards investigating the original systems from their cross maps, which is also logically consistent with the natural interpretation of causality as functional dependences pearl2009causality ; JPearl2009 . The skills used in cross-map-based methods are assmilated in our framework, while we directly study the original dynamical systems or the reconstructed systems instead of the cross maps. The foundation of our framework is the scaling law for the changing relation of $\varepsilon$ with $\delta$ arising from the continuity for the investigated system, henceforth the term “continuity scaling”. In addition to providing a theory, we demonstrate, using synthetic and real-world data, that our continuity scaling framework is accurate, computationally efficient, widely applicable, showing advantages over the existing methods.

II Continuity Scaling Framework

To explain the mathematical idea behind the development of our framework, we use the following class of discrete time dynamical systems: $\bm{x}_{t+1}=\bm{f}(\bm{x}_{t},\bm{y}_{t})$ and $\bm{y}_{t+1}=\bm{g}(\bm{x}_{t},\bm{y}_{t})$ for $t\in\mathbb{N}$ , where the state variables $\{\bm{x}_{t}\}_{t\in\mathbb{N}}$ , $\{\bm{y}_{t}\}_{t\in\mathbb{N}}$ evolve in the compact manifolds $\mathcal{M}$ , $\mathcal{N}$ of dimension $D_{\mathcal{M}}$ , $D_{\mathcal{N}}$ under sufficiently smooth map $\bm{f}$ , $\bm{g}$ , respectively. We adopt the common recognition of causality in dynamical systems.

Definition. If the dependence of $\bm{f}(\bm{x},\bm{y})$ on $\bm{y}$ is nontrivial (i.e., a directional coupling exists), a variation in $\bm{y}$ will result in a change in the value of $\bm{f}(\bm{x},\bm{y})$ for any given $\bm{x}$ , which, according to the natural interpretation of causality spirtes2000causation ; pearl2009causality , admits that $\bm{y}:\{\bm{y}_{t}\}_{t\in\mathbb{N}}$ has a direct causal effect on $\bm{x}:\{\bm{x}_{t}\}_{t\in\mathbb{N}}$ , denoted by $\bm{y}\hookrightarrow\bm{x}$ , as shown in the upper panel of Fig. 1.

Refer to caption — Figure 1: Illustration of causal relation between two sets of dynamical variables. (a) Existence of causation from $\bm{y}$ in $\mathcal{N}$ to $\bm{x}$ in $\mathcal{M}$ , where each correspondence from $\bm{x}_{t+1}$ to $\bm{y}_{t}$ is one-to-one, represented by the line or the arrow, respectively, in the upper and the middle panels. In this case, a change in $\ln{\varepsilon_{\bm{x}}}$ results in a direct change in $\delta_{\bm{y}}$ (the lower panel) with $\varepsilon_{\bm{x}}$ and $\delta_{\bm{y}}$ denoting the neighborhood size of the resulting variable $\bm{x}$ and of the causal variable $\bm{y}$ , respectively. (b) Absence of causation from $\bm{y}$ to $\bm{x}$ , where every point on each trajectory, $\{\bm{y}_{t}\}$ , in $\mathcal{N}$ could be the correspondent point from $\bm{x}_{t+1}$ in $\mathcal{M}$ (the upper panel) and thus every point in $\mathcal{N}$ belongs to the largest $\delta$ -neighborhood of $\bm{y}_{t}$ (the middle panel). In this case, $\delta_{\bm{y}}$ does not depend on $\varepsilon_{\bm{x}}$ (the lower panel). Also refer to the supplemental animation for illustration.

We now interpret the causal relationship stipulated by the continuity of a function. Let $\bm{f}_{\bm{x}_{\text{g}}}(\cdot)\triangleq\bm{f}(\bm{x}_{\text{g}},\cdot)$ for a given point $\bm{x}_{\text{g}}\in\mathcal{M}$ . For any $\bm{y}_{P}\in\mathcal{N}$ , we denote its image under the given function by $\bm{x}_{I}\triangleq\bm{f}_{\bm{x}_{\text{g}}}(\bm{y}_{P})$ . Applying the logic statement of a continuous function to $\bm{f}_{\bm{x}_{\text{g}}}(\cdot)$ , we have that, for any neighborhood $\mathcal{O}(\bm{x}_{I},\varepsilon)$ centered at $\bm{x}_{I}$ and of radius $\varepsilon>0$ , there exists a neighborhood $\mathcal{O}(\bm{y}_{P},\delta)$ centered at $\bm{y}_{P}$ of radius $\delta>0$ , such that $\bm{f}_{\bm{x}_{\text{g}}}(\mathcal{O}(\bm{y}_{P},\delta))\subset\mathcal{O}(\bm{x}_{I},\varepsilon)$ . The neighborhood and its radius are defined by

\mathcal{O}(\bm{p},h)=\big{\{}\bm{s}\in\mathcal{S}~{}\big{|}~{}\text{dist}_{\mathcal{S}}(\bm{s},\bm{p})<h\big{\}},~{}\bm{p}\in\mathcal{S},~{}h>0,

where $\text{dist}_{\mathcal{S}}(\cdot,\cdot)$ represents an appropriate metric describing the distance between two given points in a specified manifold $\mathcal{S}$ with $\mathcal{S}=\mathcal{M}$ or $\mathcal{N}$ . The meaning of this mathematical statement is that, if we have a neighborhood of the resulting variable $\bm{x}_{I}$ first, we can then find a neighborhood for the causal variable $\bm{y}_{P}$ to satisfy the above mapping and inclusion relation. This operation of “first- $\varepsilon$ -then- $\delta$ ” provides a rigorous base for the principle that the information about the resulting variable can be used to estimate the information of the causal variable and therefore to ascertain causation, as indicated by the long arrow in the middle panels of Fig. 1. Note that, the existence of the $\delta>0$ neighborhood is always guaranteed for a continuous map $\bm{f}_{\bm{x}_{\text{g}}}$ . In fact, due to the compactness of the manifold $\mathcal{N}$ , a largest value of $\delta$ exists. However, if $\bm{y}_{P}$ does not have an explicit causal effect on the variable $\bm{x}_{I}$ , i.e., $\bm{f}_{\bm{x}_{\rm g}}$ is independent of $\bm{y}_{P}$ , the existence of $\delta$ is still assured but it is independent of the value of $\varepsilon$ , as shown in the upper panel of Fig. 1. This means that merely determining the existence of a $\delta$ -neighborhood is not enough for inferring causation - it is necessary to vary $\varepsilon$ systematically and to examine the scaling relation between $\delta$ and $\varepsilon$ . In the following we discuss a number of scenarios.

Case I: Dynamical variables $\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}}$ are fully measurable. For any given constant $\varepsilon_{\bm{x}}>0$ , the set $\{\bm{x}_{\tau}\in\mathcal{M}~{}|~{}\tau\in I^{t}_{\bm{x}}(\varepsilon_{\bm{x}})\}$ can be used to approximate the neighborhood $\mathcal{O}(\bm{x}_{t+1},\varepsilon_{\bm{x}})$ , where the time index set is

I^{t}_{\bm{x}}(\varepsilon_{\bm{x}})\triangleq\left\{\tau\in\mathbb{N}~{}|~{}\text{dist}_{\mathcal{M}}(\bm{x}_{t+1},\bm{x}_{\tau})<\varepsilon_{\bm{x}}\right\}.

(1)

The radius $\delta^{t}_{\bm{y}}=\delta^{t}_{\bm{y}}(\varepsilon_{\bm{x}})$ of the neighborhood $\mathcal{O}(\bm{y}_{t},\delta^{t}_{\bm{y}})$ satisfying $\bm{f}_{\bm{x}_{\rm g}=\bm{x}_{t}}(\mathcal{O}(\bm{y}_{t},\delta^{t}_{\bm{y}}))\subset\mathcal{O}(\bm{x}_{t+1},\varepsilon_{\bm{x}})$ can be estimated as

\delta^{t}_{\bm{y}}(\varepsilon_{\bm{x}})\triangleq\left\{\#[\bar{I}^{t}_{\bm{x}}(\varepsilon_{\bm{x}})]\right\}^{-1}\sum_{\tau\in\bar{I}^{t}_{\bm{x}}(\varepsilon_{\bm{x}})}\text{dist}_{\mathcal{N}}(\bm{y}_{t},\bm{y}_{\tau-1}),

(2)

where $\#[\cdot]$ is the cardinality of the given set and the index set is $\bar{I}^{t}_{\bm{x}}(\varepsilon_{\bm{x}})\triangleq\{\tau\in I^{t}_{\bm{x}}(\varepsilon_{\bm{x}})~{}|~{}~{}\text{dist}_{\mathcal{M}}(\bm{x}_{t},\bm{x}_{\tau-1})<\varepsilon_{\bm{x}}\}$ .

The strict mathematical steps for estimating $\delta^{t}_{\bm{y}}$ are given in Section II of Supplementary Information (SI). We emphasize that here correspondence between $\bm{x}_{t+1}$ and $\bm{y}_{t}$ is investigated, differing from the cross-map-based methods, with one-step time difference naturally arising. This consideration yields a key condition [DD], which is only need when considering the original iteration/flow and whose detailed description and universality are demonstrated in SI. We reveal a linear scaling law between $\langle\delta^{t}_{\bm{y}}\rangle_{t\in\mathbb{N}}$ and $\ln{\varepsilon_{\bm{x}}}$ , as shown in the lower panels of Fig. 1, whose slope $s_{\bm{y}\hookrightarrow\bm{x}}$ is an indicator of the correspondent relation between $\varepsilon$ and $\delta$ and hence the causal relation $\bm{y}\hookrightarrow\bm{x}$ . Here, $\langle\cdot\rangle_{t\in\mathbb{N}}$ denotes the average over time. In particular, a larger slope value implies a stronger causation in the direction from $\bm{y}$ to $\bm{x}$ as represented by the map functions $\bm{f}(\bm{x}_{t},\bm{y}_{t})$ [Fig. 1], while a near zero slope indicates null causation in this direction [Fig. 1]. Likewise, possible causation in the reversed direction, $\bm{x}\hookrightarrow\bm{y}$ , as represented by the function $\bm{g}(\bm{x}_{t},\bm{y}_{t})$ , can be assessed analogously. And the unidirectional case when $\bm{f}(\bm{x},\bm{y})=\bm{f}_{0}(\bm{x})$ independent of $\bm{y}$ is uniformly considered in Case II. We summarize the consideration below and an argument for the generic existence of the scaling law is provided in Section II of SI.

Theorem. For dynamical variables $\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}}$ measured directly from the dynamical systems, if the slope $s_{\bm{y}\hookrightarrow\bm{x}}$ defined above is zero, no causation exists from $\bm{y}$ to $\bm{x}$ . Otherwise, a directional coupling can be confirmed from $\bm{y}$ to $\bm{x}$ and the slope $s_{\bm{y}\hookrightarrow\bm{x}}$ increases monotonically with the coupling strength.

Case II: The dynamical variables $\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}}$ are not directly accessible but measurable time series $\{u_{t}\}_{t\in\mathbb{N}}$ and $\{v_{t}\}_{t\in\mathbb{N}}$ are available, where $u_{t}=u(\bm{x}_{t})$ and $v_{t}=v(\bm{y}_{t})$ with $u$ : $\mathcal{M}\to\mathbb{R}^{r_{u}}$ and $v$ : $\mathcal{N}\to\mathbb{R}^{r_{v}}$ being smooth observational functions. To assess causation from $\bm{y}$ to $\bm{x}$ , we assume one-dimensional observational time series (for simplicity): $r_{u}=r_{v}=1$ , and use the classical delay-coordinate embedding method Takens:1981 ; PCFS:1980 ; sauer1991embedology ; stark1999delay ; stark2003delay ; Cummins2015 ; muldoon1998delay to reconstruct the phase space: $\bm{u}_{t}=(u_{t},u_{t+\tau_{u}},\cdots,u_{t+(d_{u}-1)\tau_{u}})^{\top}$ and $\bm{v}_{t}=(v_{t},v_{t+\tau_{v}},\cdots,v_{t+(d_{v}-1)\tau_{v}})^{\top}$ , where $\tau_{u,v}$ is the delay time and $d_{u,v}>2(D_{\mathcal{M}}+D_{\mathcal{N}})$ is the embedding dimension that can be determined using some standard criteria kantz2004nonlinear . As illustrated in Fig. 2, the dynamical evolution of the reconstructed states $\{(\bm{u}_{t},\bm{v}_{t})\}_{t\in\mathbb{N}}$ is governed by

\bm{u}_{t+1}=\tilde{\bm{f}}(\bm{u}_{t},\bm{v}_{t}),~{}~{}\bm{v}_{t+1}=\tilde{\bm{g}}(\bm{u}_{t},\bm{v}_{t}).

(3)

The map functions can be calculated as $\tilde{\bm{f}}(\bm{u},\bm{v})\triangleq\bm{E}_{u}\circ[\bm{f},\bm{g}]\left(\bm{\Pi}_{1}\circ\bm{E}_{u}^{-1}(\bm{u}),\bm{\Pi}_{2}\circ\bm{E}_{v}^{-1}(\bm{v})\right)$ , $\tilde{\bm{g}}(\bm{u},\bm{v})\triangleq\bm{E}_{v}\circ[\bm{f},\bm{g}]\left(\bm{\Pi}_{1}\circ\bm{E}_{u}^{-1}(\bm{u}),\bm{\Pi}_{2}\circ\bm{E}_{v}^{-1}(\bm{v})\right)$ , where the embedding (diffeomorphism) $\bm{E}_{s}$ : $\mathcal{M}\times\mathcal{N}\to\widetilde{\mathcal{L}}_{s}\subset\mathbb{R}^{d_{s}}$ with $\widetilde{\mathcal{L}}_{s}\triangleq\bm{E}_{s}(\mathcal{M}\times\mathcal{N})$ , $s=u\ \mathrm{or}\ v$ , is given by

\begin{array}[]{l}\bm{E}_{u}(\bm{x},\bm{y})\triangleq\big{(}u(\bm{x}),u\circ\bm{\Pi}_{1}\circ[\bm{f},\bm{g}]^{\tau_{u}}(\bm{x},\bm{y}),u\circ\bm{\Pi}_{1}\circ\\ \hskip 28.45274pt[\bm{f},\bm{g}]^{2\tau_{u}}(\bm{x},\bm{y}),\cdots,u\circ\bm{\Pi}_{1}\circ[\bm{f},\bm{g}]^{(d_{u}-1)\tau_{u}}(\bm{x},\bm{y})\big{)},\\ \bm{E}_{v}(\bm{x},\bm{y})\triangleq\big{(}v(\bm{y}),v\circ\bm{\Pi}_{2}\circ[\bm{f},\bm{g}]^{\tau_{v}}(\bm{x},\bm{y}),v\circ\bm{\Pi}_{2}\circ\\ \hskip 28.45274pt[\bm{f},\bm{g}]^{2\tau_{v}}(\bm{x},\bm{y}),\cdots,v\circ\bm{\Pi}_{2}\circ[\bm{f},\bm{g}]^{(d_{v}-1)\tau_{v}}(\bm{x},\bm{y})\big{)},\end{array}

with the inverse function $\bm{E}_{s}^{-1}$ defined on $\widetilde{\mathcal{L}}_{s}$ , $[\bm{f},\bm{g}]^{k}$ representing the $k$ th iteration of the map, and the projection mappings as $\bm{\Pi}_{1}(\bm{x},\bm{y})=\bm{x}$ and $\bm{\Pi}_{2}(\bm{x},\bm{y})=\bm{y}$ . Case II has now been reduced to Case I and our continuity scaling framework can be used to ascertain the causation from $\bm{v}$ to $\bm{u}$ based on the measured time series with the indices $I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})$ , $\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})$ and $s_{\bm{v}\hookrightarrow\bm{u}}$ [Eqs. (1) and (2)].

Does the causation from $\bm{v}$ to $\bm{u}$ imply causation from $\bm{y}$ to $\bm{x}$ ? The answer is affirmative, which can be argued, as follows. If the original map function $\bm{f}$ is independent of $\bm{y}$ : $\bm{f}(\bm{x},\bm{y})=\bm{f}_{0}(\bm{x})$ , there is no causation from $\bm{y}$ to $\bm{x}$ . In this case, the embedding $\bm{E}_{u}(\bm{x},\bm{y})$ becomes independent of $\bm{y}$ , degenerating into the form of $\bm{E}_{u}(\bm{x},\bm{y})=\bm{E}_{u0}(\bm{x})$ , a diffeomorphism from $\mathcal{M}$ to $\widetilde{\mathcal{L}}_{u0}=\bm{E}_{u0}(\mathcal{M})$ only. As a result, Eq. (3) becomes: $\bm{u}_{t+1}=\tilde{\bm{f}}_{0}(\bm{u}_{t})$ and $\bm{v}_{t+1}=\tilde{\bm{g}}(\bm{u}_{t},\bm{v}_{t})$ , where $\tilde{\bm{f}}_{0}(\bm{u})=\bm{E}_{u0}\circ\bm{f}\circ\bm{E}_{u0}^{-1}(\bm{u})$ and the resulting mapping $\tilde{\bm{f}}_{0}$ is independent of $\bm{v}$ . The independence can be validated by computing the slope $s_{\bm{v}\hookrightarrow\bm{u}}$ associated with the scaling relation between $\langle\delta^{t}_{\bm{v}}\rangle_{t\in\mathbb{N}}$ and $\ln{\varepsilon_{\bm{u}}}$ , where a zero slope indicates null causation from $\bm{v}$ to $\bm{u}$ and hence null causation from $\bm{y}$ to $\bm{x}$ . Conversely, a finite slope signifies causation between the variables. Thus, any type of causal relation (unidirectional or bi-directional) detected between the reconstructed state variables $\{(\bm{u}_{t},\bm{v}_{t})\}_{t\in\mathbb{N}}$ implies the same type of causal relation between the internal but inaccessible variables $\bm{x}$ and $\bm{y}$ of the original system.

Case III: The structure of the internal variables is completely unknown. Given the observational functions $\tilde{u},\tilde{v}$ : $\mathcal{M}$ $\times$ $\mathcal{N}$ $\to$ $\mathbb{R}$ with $\tilde{u}_{t}=\tilde{u}(\bm{x}_{t},\bm{y}_{t})$ and $\tilde{v}_{t}=\tilde{v}(\bm{x}_{t},\bm{y}_{t})$ , we first reconstruct the state space: $\tilde{\bm{u}}_{t}=(\tilde{u}_{t},\tilde{u}_{t+\tau},\cdots,\tilde{u}_{t+(d-1)\tau})^{\top}$ and $\tilde{\bm{v}}_{t}=(\tilde{v}_{t},\tilde{v}_{t+\tau},\cdots,\tilde{v}_{t+(d-1)\tau})^{\top}$ . To detect and quantify causation from $\tilde{\bm{v}}$ to $\tilde{\bm{u}}$ (or vice versa), we carry out a continuity scaling analysis with the modified indices $I^{t}_{\tilde{\bm{u}}}(\varepsilon_{\tilde{\bm{u}}})$ , $\delta^{t}_{\tilde{\bm{v}}}(\varepsilon_{\tilde{\bm{u}}})$ and $s_{\tilde{\bm{v}}\hookrightarrow\tilde{\bm{u}}}$ . Differing from Case II, here, due to the lack of knowledge about the correspondence structure between the internal and observational variables, a causal relation for the latter does not definitely imply the same for the former.

Case IV: Continuous-time dynamical systems possessing a sufficiently smooth flow $\{\bm{S}_{t};t\in\mathbb{R}\}$ on a compact manifold $\mathcal{H}$ : $\text{d}\bm{S}_{t}(\bm{u}_{0})/\text{d}t=\bm{\chi}(\bm{S}_{t}(\bm{u}_{0}))$ , where $\bm{\chi}$ is the vector field. Let $\{\hat{u}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}}$ and $\{\hat{v}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}}$ be two respective time series from the smooth observational functions $\hat{u},~{}\hat{v}$ : $\mathcal{H}$ $\rightarrow$ $\mathbb{R}$ with $\hat{u}_{t}=\hat{u}(\bm{S}_{t})$ and $\hat{v}_{t}=\hat{v}(\bm{S}_{t})$ , where $1/\omega$ is the sampling rate and $\nu$ is the time shift. Defining $\bm{\Xi}\triangleq\bm{S}_{\omega}$ : $\mathcal{H}$ $\rightarrow$ $\mathcal{H}$ and $\hat{\bm{S}}_{n}\triangleq{\bm{S}_{\omega n+\nu}}(\bm{u}_{0})$ , we obtain a discrete-time system as $\hat{\bm{S}}_{n+1}=\bm{\Xi}(\hat{\bm{S}}_{n})$ with the observational functions as $\hat{u}_{n}=\hat{u}(\hat{\bm{S}}_{n})$ and $\hat{v}_{n}=\hat{v}(\hat{\bm{S}}_{n})$ , reducing the case to Case III and rendering applicable our continuity scaling analysis to unveil and quantify the causal relation between $\{\hat{u}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}}$ and $\{\hat{v}_{t=\omega n+\nu}\}_{n\in\mathbb{Z}}$ . If the domains of $\hat{u}$ and $\hat{v}$ have their own restrictions on some particular subspaces, e.g., $\hat{u}$ : $\mathcal{H}_{u}$ $\rightarrow$ $\mathbb{R}$ and $\hat{v}$ : $\mathcal{H}_{v}$ $\rightarrow$ $\mathbb{R}$ with $\mathcal{H}=\mathcal{H}_{u}\oplus\mathcal{H}_{v}$ , the case is further reduced to Case II, so the detected causal relation between the observational variables imply causation between the internal variables belonging to their respective subspaces.

III Demonstrations: From Complex Dynamical Models to Real-World Networks

To demonstrate the efficacy of our continuity scaling framework and its superior performance, we have carried out extensive numerical tests with a large number of synthetic and empirical datasets, including those from gene regulatory networks as well as those of air pollution and hospital admission. The practical steps of the continuity scaling framework together with the significance test procedures are described in Methods. We present three representative examples here, while leaving others of significance to SI.

The first example is an ecological model of two unidirectionally interacting species: $x_{1,t+1}=x_{1,t}(3.8-3.8x_{1,t}-\mu_{12}x_{2,t})$ and $x_{2,t+1}=x_{2,t}(3.7-3.7x_{2,t}-\mu_{21}x_{1,t})$ . With time series $\{(x_{1,t},x_{2,t})\}_{t\in\mathbb{N}}$ obtained from different values of the coupling parameters, our continuity scaling framework yields correct results of different degree of unidirectional causation, as shown in Figs. 3-3. In all cases, there exists a reasonable range of $\ln{\varepsilon_{x_{2}}}$ (neither too small nor too large) from which the slope $s_{x_{1}\hookrightarrow x_{2}}$ of the linear scaling can be extracted. The statistical significance of the estimated slope values and consequently the strength of causation can be assessed with the standard $p$ -value test lancaster2018surrogate (Methods and SI). An ecological model with bidirectional coupling has also been tested (see Section III of SI). Figures 3-3 show the results from ecological networks of five mutually interacting species on a ring and on a tree structure, respectively, where the color-coded slope values reflect accurately the interaction patterns in both cases.

The second example is the coupled Lorenz system: $\dot{x}_{i}=\sigma_{i}(y_{i}-x_{i})+\mu_{ij}x_{j}$ , $\dot{y}_{i}=x_{i}(\rho_{i}-z_{i})-y_{i}$ , $\dot{z}_{i}=x_{i}y_{i}-\beta_{i}z_{i}$ with $i,j=1,2$ and $i\not=j$ . We use time series $\{y_{1,t},y_{2,t}\}_{t=n\omega}$ for detecting different configurations of causation (see Section III of SI). Figure 4 presents the overall result, where the color-coded estimated values of the slope from the continuity scaling are shown for different combinations of the sampling rate $1/\omega$ and coupling strength. Even with relatively low sampling rate, our continuity scaling framework can successfully detect and quantify the strength of causation. Note that the accuracy does not vary monotonously with the sampling rate, indicating the potential of our framework to ascertain and quantify causation even with rare data. Moreover, the proposed index can accurately reflect the true causal strength (denoting by the coupling parameter), which is also evidenced by numerical tests in Sections III and IV of SI. Robustness tests against different noise perturbations are provided in Section III of SI demonstrating the practical usefulness of our framework. Additionally, analogous to the first example, we present in SI several examples on causation detection in the coupled Lorenz system with nonlinear couplings, and the Rössler-Lorenz system, etc., which further demonstrates the generic efficacy of our framework.

In addition, we present study on several real-world dataset, which brings new insights to the evolutionary mechanism of underlying systems. We study gene expression data from DREAM4 in silico Network Challenge marbach2009generating ; marbach2010revealing , whose intrinsic gene regulatory networks (GRNs) are known for verification [Fig. 5 and Fig. S17 of SI]. Applying our framework to these data, we ascertain the causations between each pair of genes by using the continuity scaling framework. The corresponding ROC curves for five different networks as well as their AUROC values are shown in Fig. 5, which indicates a high detection accuracy in dealing with real-world data.

We then test the causal relationship in a marine ecosystem consisting of Pacific sardine landings, northern anchovy landings and sea surfae temperature (SST). We reveal new findings to support the competing relationship hypothesis stated in Lasker1983Ocean which cannot be detected by CCM SMYHDFM:2012 . As pointed out in Fig. 6, while common influence from SST to both species is verified with both methods, our continuity scaling additionally illuminates notable influence from anchovy to sardine with its reverse direction being less significant. While competing relationship plays an important role in ecosystems, continuity scaling can reveal more essential interaction mechanism. See Section III.E of SI for more details.

Moreover, we study the transmission mechanism of the recent COVID-19 pandemic. Particularly, we analyze the daily new cases of COVID-19 of representative contries for two stages: day $1$ (January 22 ${}^{\text{nd}}$ 2020) to day $100$ (April 30 ${}^{\text{th}}$ 2020) and day $101$ (May 1 ${}^{\text{st}}$ 2020) to day $391$ (February 15 ${}^{\text{th}}$ 2021). Our continuity scaling is pairwisely applied to reconstruct the transmission causal network. As shown in Fig. 7, China shows a significant effect on a few countries at the first stage and this effect disappears at the second stage. However, other countries show a different situation with China, whose external effect lasts as shown in Section III.E and Fig. S18 of SI. Our results accord with that China holds stringent epidemic control strategies with sporadic domestic infections, as evidenced by official daily briefings, demonstrating the potential of continuity scaling in detecting causal networks for ongoing complex systems. Additionally, We emphasize that day $100$ is a suitable critical day to distinguish the early severe stage and the late well-under-control stage of the pandemic [see Fig. S18(a) of SI], while slight change of the critical day will not nullify our result. As shown in Fig. S18(b) of SI, when the critical day varies from day $94$ to day $106$ , no significant change (less than $5\%$ ) of the detected causal links occurs at both stages, and the number of countries under influence of China at Stage $2$ remains zero. See more details in Section III.E of SI.

Additional real world examples including air pollutants and hospital admission record from Hong Kong are also shown in Section III of SI.

IV Discussion

To summarize, we have developed a novel framework for data based detection and quantification of causation in complex dynamical systems. On the basis of the widely used cross-map-based techniques, our framework enjoys a rigorous foundation, focusing on the continuity scaling law of the concerned system directly instead of only investigating the continuity of its cross map. Therefore, our framework is consistent with the standard interpretation of causality, and works even in the typical cases where several existing typical methods do not perform that well or even they fail (see the comparison results in Section IV of SI). In addition, the mathematical reasoning leading to the core of our framework, the continuity scaling, helps resolve the long-standing issue associated with techniques directly using cross map that information about the resulting variables is required to project the dynamical behavior of the causal variables, whereas several works in the literature Quyen1999PhyD , which directly studied the continuity or the smoothness of the cross map, likely yielded confused detected results on causal directions.

Computational complexity. The computational complexity of the algorithm is $O(T^{2}N_{\varepsilon})$ , which is relatively smaller than the CCM method, whose computational complexity is $O(T^{2}\log T)$ .

Limitations and future works. Nevertheless, there are still some limitations in the presently proposed framework. For instance, currently, only bivariate detection algorithm is designed, so generalization to multivariate network inference requires further considerations, as analogous to those works presented in Refs. Peters2017elements ; Runge2018causal ; research2019 . In addition, the causal time delay has not been taken into account in the current framework, so it also could be further investigated, similar to the work reported in Ref. ma2017detection . Definitely, we will settle these questions in our future work.

Detecting causality in complex dynamical systems has broad applications not only in science and engineering, but also in many aspects of the modern society, demanding accurate, efficient, and rigorously justified and hence trustworthy methodologies. Our present work provides a vehicle along this feat, and indeed resolves the puzzles arising in the use of those influential methods.

V Methods

Continuity scaling framework: a detailed description of algorithms. Let $\{u_{t}\}_{t=1,2,\dots,T}$ and $\{v_{t}\}_{t=1,2,\dots,T}$ be two experimentally measured time series of internal variables $\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}}$ . Typically, if the dynamical variables $\{(\bm{x}_{t},\bm{y}_{t})\}_{t\in\mathbb{N}}$ are accessible, $\{(u_{t},v_{t})\}$ reduce to one-dimensional coordinate of the internal system. The key computational steps of our continuity scaling framework are described, as follows.

We reconstruct the phase space using the classical method of delay coordinate embedding Takens:1981 with the optimal embedding dimension $d_{z}$ and time lag $\tau_{z}$ determined by the methods in Refs. MutualInfoMethod ; FNNmethod (i.e., the false nearest neighbors and the delayed mutual information respectively):

\mathcal{L}_{z}\triangleq\left\{\bm{z}(t)=\left(z_{t},z_{t+\tau_{z}},\dots,z_{t+(d_{z}-1)\tau_{z}}\right)~{}|~{}t=1,\dots,T_{0}\right\},

(4)

where $z=u,v$ , $T_{0}=\min\{T-(d_{z}-1)\tau_{z}~{}|~{}z=u,v\}$ , and Euclidean distance is used for both $\mathcal{L}_{u,v}$ .

We present the steps for causation detection using the case of $\bm{v}\hookrightarrow\bm{u}$ as an example.

We calculate the respective diameters for $\mathcal{L}_{u,v}$ as

D_{z}\triangleq\max\left\{\text{dist}_{\mathcal{L}_{z}}(\bm{z}(t),\bm{z}(\tau))~{}\big{|}~{}1\leqslant t,\tau\leqslant T_{0}\right\},

(5)

where $z=u,v$ , and $\bm{z}=\bm{u},\bm{v}$ . We set up a group of numbers, $\{\varepsilon_{\bm{u},j}\}_{j=1,\cdots,N_{\epsilon}}$ , as $\varepsilon_{\bm{u},1}=e\cdot D_{u}$ , $\varepsilon_{\bm{u},N_{\epsilon}}=D_{u}$ , with the other elements satisfying

(\ln\varepsilon_{\bm{u},j}-\ln\varepsilon_{\bm{u},1})/(j-1)=(\ln\varepsilon_{\bm{u},N_{\varepsilon}}-\ln\varepsilon_{\bm{u},1})/(N_{\varepsilon}-1)

(6)

for $j=2,\dots,N_{\varepsilon}-1$ . Then, in light of (1) with (2), we have

\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})=\#[I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})]^{-1}\sum_{\tau\in I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})}\text{dist}_{\mathcal{L}_{v}}(\bm{v}(t),\bm{v}(\tau-1)),

(7)

with

\begin{array}[]{l}I^{t}_{\bm{u}}(\varepsilon_{\bm{u}})=\{\tau\in\mathbb{N}~{}\big{|}~{}\text{dist}_{\mathcal{L}_{u}}(\bm{u}(t+1),\bm{u}(\tau))<\varepsilon_{\bm{u}},\\ ~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}~{}|t+1-\tau|>E\},\end{array}

(8)

where, numerically, $\varepsilon_{\bm{u}}$ alters its value successively from the set $\{\varepsilon_{\bm{u},j}\}_{j=1,\cdots,N_{\epsilon}}$ , and the threshold $E$ is a positive number chosen to avoid the situation where the nearest neighboring points are induced by the consecutive time order only.

As defined, $\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}}$ is the average of $\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})$ over all possible time $t$ . We use a finite number of pairs $\{(\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})\rangle_{t\in\mathbb{N}_{T_{0}}},\ln\varepsilon_{\bm{u},j})\}_{j=1,\dots,N_{\varepsilon}}$ to approximate the scaling relation between $\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}}$ and $\ln\varepsilon_{\bm{u}}$ , where $\mathbb{N}_{T_{0}}=\{1,2,\cdots,T_{0}\}$ . Theoretically, a larger value of $N_{\epsilon}$ and a smaller value of $e$ will result in a more accurate approximation of the scaling relation. In practice, the accuracy is determined by the length of the observational time series, the sampling duration, and different types of noise perturbations. In numerical simulations, we set $e=0.001$ and $N_{\epsilon}=33$ . In addition, a too large or a too small value of $\varepsilon_{\bm{u}}$ can induce insufficient data to restore the neighborhood and/or the entire manifold. We thus set $\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})=\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j+1})$ as a practical technique as the number of points is limited practically in a small neighborhood. As a result, near zero slope values would appear on both sides of the scaling curve $\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}}$ - $\ln\varepsilon_{\bm{u}}$ , as demonstrated in Fig. 3 and in SI. In such a case, to estimate the slope of the scaling relation, we take the following approach.

Define a group of numbers by

S_{j}\triangleq\frac{\langle\delta_{\bm{v}}^{t}(\varepsilon_{\bm{u},j+1})\rangle_{t\in\mathbb{N}_{T_{0}}}-\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})\rangle_{t\in\mathbb{N}_{T_{0}}}}{\ln\varepsilon_{\bm{u},j+1}-\ln\varepsilon_{\bm{u},j}},

(9)

where $j=1,\cdots,N_{\varepsilon}-1$ , sort them in a descending order, from which we determine the $[\frac{N_{\varepsilon}+1}{2}]$ largest numbers, collect their subscripts - $j$ ’s together as an index set $\hat{J}$ , and set $H\triangleq\left\{j,~{}j+1~{}\big{|}~{}j\in\hat{J}\right\}$ . Applying the least squares method to the linear regression model:

\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u}})\rangle_{t\in\mathbb{N}}=S\cdot\ln\varepsilon_{\bm{u}}+b

(10)

with the dataset $\{(\langle\delta^{t}_{\bm{v}}(\varepsilon_{\bm{u},j})\rangle_{t\in\mathbb{N}_{T_{0}}},\ln\varepsilon_{\bm{u},j})\}_{j\in H}$ , we get the optimal values $(\hat{S},\hat{b})$ for the parameters $(S,b)$ in (10) and finally obtain the slope of the scaling relation as $s_{\bm{v}\hookrightarrow\bm{u}}\triangleq\hat{S}$ .

For the other causal direction from $\bm{u}$ to $\bm{v}$ , these steps are equally applicable to estimating the slope $s_{\bm{u}\hookrightarrow\bm{v}}$ .

To assess the statistical significance of the numerically determined causation, we devise the following surrogate test using the case of $\bm{v}$ causing $\bm{u}$ as an illustrative example.

Divide the time series $\{\bm{u}(t)\}_{t\in\mathbb{N}_{T_{0}}}$ into $N_{G}$ consecutive segments of equal length (except for the last segment - the shortest segment). Randomly shuffle these segments and then regroup them into a surrogate sequence $\{\bm{\hat{u}}(t)\}_{t\in\mathbb{N}_{T_{0}}}$ . Applying such a random permutation method to $\{\bm{v}(t)\}_{t\in\mathbb{N}_{T_{0}}}$ generates another surrogate sequence $\{\bm{\hat{v}}(t)\}_{t\in\mathbb{N}_{T_{0}}}$ . Carrying out the slope computation yields $s_{\bm{\hat{v}}\hookrightarrow\bm{\hat{u}}}$ . The procedure can be repeated for a sufficient number of times, say $Q$ , which consequently yields a group of estimated slopes, denoted as $\{s^{q}_{\bm{\hat{v}}\hookrightarrow\bm{\hat{u}}}\}_{q=0,1\cdots,Q}$ , where $s^{0}_{\bm{\hat{v}}\hookrightarrow\bm{\hat{u}}}$ is set as $s_{\bm{v}\hookrightarrow\bm{u}}$ obtained from the original time series. For all the estimated slopes, we calculate their mean $\hat{\mu}_{\bm{v}\hookrightarrow\bm{u}}$ and the standard deviation $\hat{\sigma}_{\bm{v}\hookrightarrow\bm{u}}$ . The $p$ -value for $s_{\bm{v}\hookrightarrow\bm{u}}$ is calculated as

p_{s_{\bm{v}\hookrightarrow\bm{u}}}\triangleq 1-\text{normcdf}\left[\frac{s_{\bm{v}\hookrightarrow\bm{u}}-\hat{\mu}_{\bm{v}\hookrightarrow\bm{u}}}{\hat{\sigma}_{\bm{v}\hookrightarrow\bm{u}}}\right],

(11)

where $\text{normcdf}[\cdot]$ is the cumulative Gaussian distribution function. The principle of statistical hypothesis testing guarantees the existence of causation from $\bm{v}$ to $\bm{u}$ if $p_{s_{\bm{v}\hookrightarrow\bm{u}}}<0.05$ .

In simulations, we set the number of segments to be $N_{G}=25$ and the number of times for random permutations to be $Q\geq 20$ .

References

(1) Bunge, M. Causality and modern science (Routledge, 2017).
(2) Pearl, J. Causality (Cambridge university press, 2009).
(3) Runge, J., Bathiany, S. et al. Inferring causation from time series in Earth system sciences. Nature communications 10, 2553 (2019). doi: 10.1038/s41467-019-10105-3.
(4) Collins, F. S. & Varmus, H. A new initiative on precision medicine. New England Journal of Medicine 372, 793–795 (2015). doi: 10.1056/NEJMp1500523.
(5) Saxe, G. N. et al. A complex systems approach to causal discovery in psychiatry. PloS ONE 11, e0151174 (2016). doi: 10.1371/journal.pone.0151174.
(6) Cox, D. R. & Hinkley, D. V. Theoretical statistics (CRC Press, 1979). doi: 10.1201/b14832.
(7) Cover, T. M. Elements of information theory (John Wiley & Sons, 1999).
(8) Pearl, J. et al. Causal inference in statistics: An overview. Statistics Surveys 3, 96–146 (2009). doi: 10.1214/09-SS057.
(9) Wiener, N. The theory of prediction. Modern mathematics for engineers (1956).
(10) Granger, C. W. Investigating causal relations by econometric models and cross-spectral methods. Econometrica: Journal of the Econometric Society 424–438 (1969). doi: 10.2307/1912791.
(11) Haufe, S. et al. A critical assessment of connectivity measures for EEG data: A simulation study. NeuroImage 120–133(2013). doi: 10.1016/j.neuroimage.2012.09.036.
(12) Ding, M., Chen, Y. & Bressler, S. L. Granger causality: basic theory and application to neuroscience. Handbook of time series analysis: recent theoretical developments and applications 437 (2006). doi: 10.1002/9783527609970.ch17.
(13) Schreiber, T. Measuring information transfer. Physical Review Letters 85, 461 (2000). doi: 10.1103/PhysRevLett.85.461.
(14) Frenzel, S. & Pompe, B. Partial mutual information for coupling analysis of multivariate time series. Physical Review Letters 99, 204101 (2007). doi: 10.1103/PhysRevLett.99.204101.
(15) Vicente, R., Wibral, M., Lindner, M. & Pipa, G. Transfer entropy–a model-free measure of effective connectivity for the neurosciences. Journal of Computational Neuroscience 30, 45–67 (2011). doi: 10.1007/s10827-010-0262-3.
(16) Runge, J., Heitzig, J., Petoukhov, V. & Kurths, J. Escaping the curse of dimensionality in estimating multivariate transfer entropy. Physical Review Letters 108, 258701 (2012). doi: 10.1103/PhysRevLett.108.258701.
(17) Sun, J., Cafaro, C. & Bollt, E. M. Identifying the coupling structure in complex systems through the optimal causation entropy principle. Entropy 16, 3416–3433 (2014). doi: 10.3390/e16063416.
(18) Cafaro, C., Lord, W. M., Sun, J. & Bollt, E. M. Causation entropy from symbolic representations of dynamical systems. Chaos: An Interdisciplinary Journal of Nonlinear Science 25, 043106 (2015). doi: 10.1063/1.4916902.
(19) Sun, J., Taylor, D. & Bollt, E. M. Causal network inference by optimal causation entropy. SIAM Journal on Applied Dynamical Systems 14, 73–106 (2015). doi: 10.1137/140956166.
(20) Solyanik-Gorgone, M., Ye, J., Miscuglio, M., Afanasev, A., Willner, A. E. & Sorger, V. J. Quantifying Information via Shannon Entropy in Spatially Structured Optical Beams. Research 2021, 9780760 (2021). doi: 10.34133/2021/9780760.
(21) Hirata, Y. & Aihara, K. Identifying hidden common causes from bivariate time series: A method using recurrence plots. Physical Review E 81, 016203 (2010). doi: 10.1103/PhysRevE.81.016203.
(22) Quiroga, R. Q., Arnhold, J. & Grassberger, P. Learning driver-response relationships from synchronization patterns. Physical Review E 61, 5142 (2000). doi: 10.1103/PhysRevE.61.5142.
(23) Arnhold, J., Grassberger, P., Lehnertz, K. & Elger, C. E. A robust method for detecting interdependences: application to intracranially recorded eeg. Physica D: Nonlinear Phenomena 134, 419–430 (1999). doi: 10.1016/S0167-2789(99)00140-2.
(24) Harnack, D., Laminski, E., Schünemann, M. & Pawelzik, K. R. Topological causality in dynamical systems. Physical Review Letters 119, 098301 (2017). doi: 10.1103/PhysRevLett.119.098301.
(25) Sugihara, G. et al. Detecting causality in complex ecosystems. Science 338, 496–500 (2012). doi: 10.1126/science.1227079.
(26) Deyle, E. R. et al. Predicting climate effects on pacific sardine. Proceedings of the National Academy of Sciences 110, 6430–6435 (2013). doi: 10.1073/pnas.1215506110.
(27) Wang, X. et al. A two-fold increase of carbon cycle sensitivity to tropical temperature variations. Nature 506, 212–215 (2014). doi: 10.1038/nature12915.
(28) Ma, H., Aihara, K. & Chen, L. Detecting causality from nonlinear dynamics with short-term time series. Scientific Reports 4, 1–10 (2014). doi: 10.1038/srep07464.
(29) McCracken, J. M. & Weigel, R. S. Convergent cross-mapping and pairwise asymmetric inference. Physical Review E 90, 062903 (2014). doi: 10.1103/PhysRevE.90.062903.
(30) Ye, H., Deyle, E. R., Gilarranz, L. J. & Sugihara, G. Distinguishing time-delayed causal interactions using convergent cross mapping. Scientific Reports 5, 14750 (2015). doi: 10.1038/srep14750.
(31) Clark, A. T. et al. Spatial convergent cross mapping to detect causal relationships from short time series. Ecology 96, 1174–1181 (2015). doi: 10.1890/14-1479.1.
(32) Jiang, J.-J., Huang, Z.-G., Huang, L., Liu, H. & Lai, Y.-C. Directed dynamical influence is more detectable with noise. Scientific Reports 6, 24088 (2016). doi: 10.1038/srep24088.
(33) Ma, H. et al. Detection of time delays and directional interactions based on time series from complex dynamical systems. Physical Review E 96, 012221 (2017). doi: 10.1103/PhysRevE.96.012221.
(34) Amigó, J. M. & Hirata, Y. Detecting directional couplings from multivariate flows by the joint distance distribution. Chaos: An Interdisciplinary Journal of Nonlinear Science 28, 075302 (2018). doi: 10.1063/1.5010779.
(35) Wang, Y. et al. Detecting the causal effect of soil moisture on precipitation using convergent cross mapping. Scientific Reports 8, 1–8 (2018). doi: 10.1038/s41598-018-30669-2.
(36) Leng, S. et al. Partial cross mapping eliminates indirect causal influences. Nature Communications 11, 1–9 (2020). doi: 10.1038/s41467-020-16238-0.
(37) Takens, F. Detecting strange attractors in turbulence. In Dynamical systems and turbulence, Warwick 1980, 366–381 (Springer, 1981). doi: 10.1007/BFb0091924.
(38) Packard, N. H., Crutchfield, J. P., Farmer, J. D. & Shaw, R. S. Geometry from a time series. Physical Review Letters 45, 712 (1980). doi: 10.1103/PhysRevLett.45.712.
(39) Sauer, T., Yorke, J. A. & Casdagli, M. Embedology. Journal of Statistical Physics 65, 579–616 (1991). doi: 10.1007/BF01053745.
(40) Stark, J. Delay embeddings for forced systems. i. deterministic forcing. Journal of Nonlinear Science 9, 255–332 (1999). doi: 10.1007/s003329900072.
(41) Stark, J., Broomhead, D. S., Davies, M. E. & Huke, J. Delay embeddings for forced systems. ii. stochastic forcing. Journal of Nonlinear Science 13, 519–577 (2003). doi: 10.1007/s00332-003-0534-4.
(42) Muldoon, M. R., Broomhead, D. S., Huke, J. P. & Hegger, R. Delay embedding in the presence of dynamical noise. Dynamics and Stability of Systems 13, 175–186 (1998). doi: 10.1080/02681119808806259.
(43) Spirtes, P., Glymour, C. N., Scheines, R. & Heckerman, D. Causation, prediction, and search (MIT press, 2000).
(44) Cummins, B., Gedeon, T. & Spendlove, K. On the efficacy of state space reconstruction methods in determining causality. SIAM Journal on Applied Dynamical Systems 14, 335–381 (2015). doi: 10.1137/130946344.
(45) Kantz, H. & Schreiber, T. Nonlinear time series analysis, vol. 7 (Cambridge university press, 2004). doi: 10.1017/CBO9780511755798.
(46) Lancaster, G., Iatsenko, D., Pidde, A., Ticcinelli, V. & Stefanovska, A. Surrogate data for hypothesis testing of physical systems. Physics Reports 748, 1–60 (2018). doi: 10.1016/j.physrep.2018.06.001.
(47) Marbach, D., Schaffter, T., Mattiussi, C. & Floreano, D. Generating realistic in silico gene networks for performance assessment of reverse engineering methods. Journal of Computational Biology 16, 229–239 (2009). doi: 10.1089/cmb.2008.09TT.
(48) Marbach, D. et al. Revealing strengths and weaknesses of methods for gene network inference. Proceedings of the National Academy of Sciences 107, 6286–6291 (2010). doi: 10.1073/pnas.0913357107.
(49) R. Lasker & A. MacCall. New ideas on the fluctuations of the clupeoid stocks off California. In Proceedings of the Joint Oceanographic Assembly, Halifax, August 1982: General Symposia, 110–120 (Department of Fisheries and Oceans, Ontario, 1983).
(50) Quyen, M. L. V., Martinerie, J., Adam, C., Varela, F. J. Nonlinear analyses of interictal EEG map the brain interdependences in human focal epilepsy. Physica D: Nonlinear Phenomena 127,250–266(1999). doi: 10.1016/S0167-2789(98)00258-9.
(51) Peters, J., Janzing, D. et al. Elements of causal inference: foundations and learning algorithms (MIT Press, 2017).
(52) Runge, J. Causal network reconstruction from time series: From theoretical assumptions to practical estimation. Chaos: An Interdisciplinary Journal of Nonlinear Science 28, 075310 (2018). doi: 10.1063/1.5025050.
(53) Lou, Y., Wang, L. & Chen, G. Enhancing Controllability Robustness of q-Snapback Networks through Redirecting Edges. Research 2019, 7857534 (2019). doi: 10.34133/2019/7857534.
(54) Fraser, A. M. & Swinney, H. L. Independent coordinates for strange attractors from mutual information. Physical Review A 33, 1134 (1986). doi: 10.1103/PhysRevA.33.1134.
(55) Kennel, M. B., Brown, R. & Abarbanel, H. D. Determining embedding dimension for phase-space reconstruction using a geometrical construction. Physical Review A 45, 3403 (1992). doi: 10.1103/PhysRevA.45.3403.

VI Acknowledgements

W.L. is supported by the National Key R&D Program of China (Grant No. 2018YFC0116600), by the National Natural Science Foundation of China (Grant Nos. 11925103 & 61773125), by the STCSM (Grant No. 18DZ1201000), and by the Shanghai Municipal Science and Technology Major Project (No. 2021SHZDZX0103). Y.-C.L. is supported by AFOSR (Grant No. FA9550-21-1-0438). S.-Y.L. is supported by the National Natural Science Foundation of China (No. 12101133) and “Chenguang Program” supported by Shanghai Education Development Foundation and Shanghai Municipal Education Commission (No. 20CG01). Q.N. is partially supported by NSF (Grant No. DMS1763272) and the Simons Foundation (Grant No. 594598).

VII Competing Interests

The authors declare no competing interests.

VIII Code Availability

The source codes for our CS framework are available at https://github.com/bianzhiyu/ContinuityScaling.

IX Contributions

W.L. conceived idea; X.Y., S.-Y.L., and W.L. designed and performed research; X.Y., S.-Y.L., H.-F.M., and W.L. analyzed data, H.-F.M., Y.-C.L., and Q.N. contributed data and analysis tools, and all the authors wrote the paper. X.Y. and S.-Y.L. were equally contributed.