On the Complexity of Counterfactual Reasoning
Abstract
We study the computational complexity of counterfactual reasoning in relation to the complexity of associational and interventional reasoning on structural causal models (SCMs). We show that counterfactual reasoning is no harder than associational or interventional reasoning on fully specified SCMs in the context of two computational frameworks. The first framework is based on the notion of treewidth and includes the classical variable elimination and jointree algorithms. The second framework is based on the more recent and refined notion of causal treewidth, which is directed towards models with functional dependencies such as SCMs. Our results are constructive and based on bounding the (causal) treewidth of twin networks (used in standard counterfactual reasoning that contemplates two worlds, real and imaginary) by the (causal) treewidth of the underlying SCM structure. In particular, we show that the former (causal) treewidth is no more than twice the latter plus one. Hence, if associational or interventional reasoning is tractable on a fully specified SCM then counterfactual reasoning is tractable too. We extend our results to general counterfactual reasoning that requires contemplating more than two worlds and discuss applications of our results to counterfactual reasoning with partially specified SCMs that are coupled with data. We finally present empirical results that measure the gap between the complexities of counterfactual reasoning and associational/interventional reasoning on random SCMs.
1 Introduction
A theory of causality has emerged over the last few decades based on two parallel hierarchies, an information hierarchy and a reasoning hierarchy, often called the causal hierarchy Pearl and Mackenzie (2018); Bareinboim et al. (2021). On the reasoning side, this theory has crystallized three levels of reasoning with increased sophistication and proximity to human reasoning: associational, interventional and counterfactual, which are exemplified by the following canonical probabilities.

- Associational $\Pr(y \mid x)$: probability of $y$ given that $x$ was observed. Example: the probability that a patient has the flu given they have a fever.
- Interventional $\Pr(y \mid do(x))$: probability of $y$ given that $x$ was established by an intervention. This is different from $\Pr(y \mid x)$. Example: seeing the barometer fall tells us about the weather, but moving the barometer needle won't bring rain.
- Counterfactual $\Pr(y_x \mid x', y')$: probability of $y$ if we were to establish $x$ by an intervention, given that neither $x$ nor $y$ are true. Example: the probability that a patient who did not take a vaccine and died would have recovered had they been vaccinated.

On the information side, these forms of reasoning were shown to require different levels of knowledge, encoded as (1) associational models, (2) causal models and (3) functional (mechanistic) models, respectively, with each class of models containing more information than the preceding one. In the framework of probabilistic graphical models Koller and Friedman (2009), this information is encoded by (1) Bayesian networks Darwiche (2009); Pearl (1988), (2) causal Bayesian networks Pearl (2009); Peters et al. (2017); Spirtes et al. (2000), and (3) functional Bayesian networks Balke and Pearl (1995); Pearl (2009).

Counterfactual reasoning has received much interest as it involves both introspection and the contemplation of scenarios that have not been seen before, and is therefore viewed by many as a hallmark of human intelligence. Figure 1 depicts a functional Bayesian network, also known as a structural causal model (SCM) Galles and Pearl (1998); Halpern (2000), which can be used to answer counterfactual queries. Variables without causes are called exogenous or root and variables with causes are called endogenous or internal. The only uncertainty in SCMs concerns the states of exogenous variables and this uncertainty is quantified using distributions over these variables. Endogenous variables are assumed to be functional: they are functionally determined by their causes, where the functional relationships, also known as causal mechanisms, are specified by structural equations. (Footnote 1: These equations can also be specified using conditional probability tables (CPTs) that are normally used in Bayesian networks, but the CPTs will contain only deterministic distributions.) These equations and the distributions over exogenous variables define the parameters of the causal graph, leading to a fully specified SCM which can be used to evaluate associational, interventional and counterfactual queries. For example, the SCM in Figure 1 has enough information to evaluate the counterfactual query $\Pr(y_x \mid x', y')$: the probability that a patient who did not take the treatment ($x'$) and died ($y'$) would have been alive ($y$) had they been given the treatment ($x$). A causal Bayesian network contains less information than a functional one (SCM) as it does not require endogenous variables to be functional, but it is sufficient to compute associational and interventional probabilities. A Bayesian network contains even less information as it does not require network edges to have a causal interpretation, only that the conditional independences encoded by the network are correct, so it can only compute associational probabilities.
All three forms of reasoning (and models) involve a directed acyclic graph (DAG) which we call the base network; see left of Figure 1. Associational and interventional reasoning can be implemented by applying classical inference algorithms to the base network. The time complexity can be bounded by $O(n \exp(w))$, where $n$ is the number of network nodes and $w$ is its treewidth (a graph-theoretic measure of connectivity). Counterfactual reasoning is more sophisticated and is based on a three-step process that involves abduction, intervention and prediction Balke and Pearl (1994b). When contemplating two worlds, this process can be implemented by applying classical inference algorithms to a twin network Balke and Pearl (1994b), obtained by duplicating endogenous nodes in the base network; see right of Figure 1. To compute the counterfactual query $\Pr(y_x \mid x', y')$, one asserts $x', y'$ as an observation on one side of the twin network (real world) and computes the interventional query $\Pr(y \mid do(x))$ on the other side of the network (imaginary world). The time complexity can be bounded by $O(n' \exp(w'))$, where $n'$ is the number of nodes in the twin network and $w'$ is its treewidth. A recent result provides a much tighter bound using the notion of causal treewidth Chen and Darwiche (2022); Darwiche (2021), which is no greater than treewidth but applies only when certain nodes in the base network are functional — in SCMs every endogenous node is functional.
One would expect the more sophisticated counterfactual reasoning with twin networks to be more expensive than associational/interventional reasoning with base networks since the former networks are larger and have more complex topologies. But the question is: How much more expensive? For example, can counterfactual reasoning be intractable on a twin network when associational/interventional reasoning is tractable on its base network? We address this question in the context of reasoning algorithms whose complexity is exponential only in the (causal) treewidth, such as the jointree algorithm Lauritzen and Spiegelhalter (1988), the variable elimination algorithm Zhang and Poole (1994); Dechter (1996) and circuit-based algorithms Darwiche (2003, 2022). In particular, we show in Sections 3 & 4 that the (causal) treewidth of a twin network is at most twice the (causal) treewidth of its base network plus one. Hence, the complexity of counterfactual reasoning on fully specified SCMs is no more than quadratic in the complexity of associational and interventional reasoning, so the former must be tractable if the latter is tractable. We extend our results in Section 5 to counterfactual reasoning that requires contemplating more than two worlds, where we also discuss a class of applications that require this type of reasoning and for which fully specified SCMs can be readily available. Our results apply directly to counterfactual reasoning on fully specified SCMs but we also discuss in Section 6 how they can sometimes be used in the context of counterfactual reasoning on data and a partially specified SCM. We finally present empirical results in Section 7 which reveal that, on average, the complexity gap between counterfactual and associational/interventional reasoning on fully specified SCMs can be smaller than what our worst-case bounds may suggest.
2 Technical Preliminaries
We next review the notions of treewidth Robertson and Seymour (1986) and causal treewidth Chen and Darwiche (2022); Darwiche (2021, 2020) which we use to characterize the computational complexity of counterfactual reasoning on fully specified SCMs. We also review the notions of elimination orders, jointrees and thinned jointrees, which are the basis for defining (causal) treewidth and act as data structures that characterize the computational complexity of various reasoning algorithms. We use these notions extensively when stating and proving our results (proofs of all results are in Appendix A). We assume all variables are discrete. A variable is denoted by an uppercase letter (e.g. $X$) and its values by a lowercase letter (e.g. $x$). A set of variables is denoted by a bold uppercase letter (e.g. $\mathbf{X}$) and its instantiations by a bold lowercase letter (e.g. $\mathbf{x}$).
2.1 Elimination Orders and Treewidth
These are total orders of the network variables which drive, and characterize the complexity of, the classical variable elimination algorithm when computing associational, interventional and counterfactual queries. Consider a DAG $G$ where every node represents a variable. An elimination order $\pi$ for $G$ is a total ordering $\pi(1), \dots, \pi(n)$ of the variables in $G$, where $\pi(i)$ is the $i$-th variable in the order, starting from $i = 1$. An elimination order defines an elimination process on the moral graph $G_m$ of DAG $G$ which is used to define the treewidth of $G$. The moral graph is obtained from $G$ by adding an undirected edge between every pair of nodes that share a common child and then removing directions from all directed edges. When we eliminate variable $\pi(i)$ from $G_m$, we connect every pair of neighbors of $\pi(i)$ in $G_m$ and remove $\pi(i)$ from $G_m$. This elimination process induces a cluster sequence $C_1, \dots, C_n$, where cluster $C_i$ is $\pi(i)$ and its neighbors in $G_m$ just before eliminating $\pi(i)$. The width of an elimination order is the size of its largest induced cluster minus $1$. The treewidth of DAG $G$ is the minimum width attained by any elimination order for $G$. The variable elimination algorithm computes queries in $O(n \exp(w))$ time, where $n$ is the number of nodes in the (base or twin) network and $w$ is the width of a corresponding elimination order. Elimination orders are usually constructed using heuristics that aim to minimize their width. We use the popular minfill heuristic Kjaerulff (1990) in our experiments while noting that more effective heuristics may exist as shown in Kjærulff (1994); Larrañaga et al. (1997).
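The elimination process above is mechanical enough to state in a few lines of code. The following sketch (ours, not from the paper) computes the width of an elimination order by moralizing a DAG, given as a dict mapping each node to its list of parents, and simulating the elimination process:

```python
from itertools import combinations

def moral_graph(dag):
    """Undirected adjacency: drop edge directions and marry co-parents."""
    adj = {v: set() for v in dag}
    for child, parents in dag.items():
        for p in parents:                      # undirected version of each edge
            adj[child].add(p); adj[p].add(child)
        for p, q in combinations(parents, 2):  # connect parents of a common child
            adj[p].add(q); adj[q].add(p)
    return adj

def order_width(dag, order):
    """Width of an elimination order: largest induced cluster size minus 1."""
    adj, width = moral_graph(dag), 0
    for v in order:
        cluster = {v} | adj[v]                 # v and its neighbors, pre-elimination
        width = max(width, len(cluster) - 1)
        for p, q in combinations(adj[v], 2):   # connect v's neighbors pairwise
            adj[p].add(q); adj[q].add(p)
        for n in adj[v]:                       # remove v from the graph
            adj[n].discard(v)
        del adj[v]
    return width
```

For instance, the chain $A \to B \to C$ with order $A, B, C$ has width $1$.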



2.2 Jointrees and Treewidth
These are data structures that drive, and characterize the complexity of, the classical jointree algorithm; see Figure 2(b). Let the family of variable $X$ in DAG $G$ be the set containing $X$ and its parents in $G$. A jointree for DAG $G$ is a pair $(T, hosts)$ where $T$ is a tree and $hosts$ is a function that maps each family $\mathbf{F}$ of $G$ into nodes in $T$ called the hosts of family $\mathbf{F}$. The requirements are: only nodes with a single neighbor (called leaves) can be hosts; each leaf node hosts exactly one family; and each family must be hosted by at least one node. (Footnote 2: The standard definition of jointrees allows any node to be a host of any number of families. Our definition facilitates the upcoming treatment and does not preclude optimal jointrees.) This induces a cluster $\mathbf{C}_i$ for each jointree node $i$ and a separator $\mathbf{S}_{ij}$ for each jointree edge $(i,j)$, which are defined as follows. Separator $\mathbf{S}_{ij}$ is the set of variables hosted at both sides of edge $(i,j)$. If jointree node $i$ is a leaf then cluster $\mathbf{C}_i$ is the family hosted by $i$; otherwise, $\mathbf{C}_i$ is the union of the separators adjacent to node $i$. The width of a jointree is the size of its largest cluster minus $1$. The minimum width attained by any jointree for $G$ corresponds to the treewidth of $G$. The jointree algorithm computes queries in $O(n \exp(w))$ time, where $n$ is the number of nodes and $w$ is the width of a corresponding jointree. Jointrees are usually constructed from elimination orders, and there are polytime, width-preserving transformations between elimination orders and jointrees; see (Darwiche, 2009, Ch 9) for details.
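To make the cluster and separator definitions concrete, here is a small sketch (ours; the representation is an assumption) that computes them for a jointree given as a list of tree edges plus a `hosts` dict mapping each leaf to the family it hosts:

```python
def side_vars(edges, hosts, i, j):
    """Variables hosted by leaves on the i-side of edge (i, j)."""
    seen, stack, found = {i, j}, [i], set()
    while stack:
        n = stack.pop()
        found |= set(hosts.get(n, ()))
        for a, b in edges:
            for nxt in ((b,) if a == n else (a,) if b == n else ()):
                if nxt not in seen:
                    seen.add(nxt); stack.append(nxt)
    return found

def jointree_width(edges, hosts):
    # Separator of an edge: variables hosted on both of its sides.
    seps = {frozenset((i, j)): side_vars(edges, hosts, i, j) &
                               side_vars(edges, hosts, j, i)
            for i, j in edges}
    width = 0
    for n in {x for e in edges for x in e}:
        if n in hosts:   # a leaf's cluster is the family it hosts
            cluster = set(hosts[n])
        else:            # an internal node's cluster: union of adjacent separators
            cluster = set().union(*(s for e, s in seps.items() if n in e))
        width = max(width, len(cluster) - 1)
    return seps, width
```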
2.3 Thinned Jointrees and Causal Treewidth
To thin a jointree is to remove some variables from its separators (and hence clusters, which are defined in terms of separators); see Figure 2(c). Thinning can reduce the jointree width quite significantly, leading to exponential savings in reasoning time. Thinning is possible only when some variables in the network are functional, even without knowing the specific functional relationships (i.e., structural equations). The causal treewidth is the minimum width for any thinned jointree. Causal treewidth is no greater than treewidth and the former can be bounded when the latter is not. While this notion can be applied broadly as in Darwiche (2020), it is particularly relevant to counterfactual reasoning since every internal node in an SCM is functional so the causal treewidth for SCMs can be much smaller than their treewidth. There are alternate definitions of thinned jointrees. The next definition is based on thinning rules Chen and Darwiche (2022).
A thinning rule removes a variable from a separator under certain conditions. There are two thinning rules, which apply only to functional variables. The first rule removes variable $X$ from a separator $\mathbf{S}_{ij}$ if edge $(i,j)$ is on the path between two leaf nodes that host the family of $X$ and every separator on that path contains $X$. The second rule removes variable $X$ from a separator $\mathbf{S}_{ij}$ if no other separator on the $i$-side of the edge contains $X$, or no other separator on the $j$-side of the edge contains $X$. A thinned jointree is obtained by applying these rules to exhaustion. Figure 2 depicts an optimal, classical jointree and a thinned jointree for the same DAG (the latter has smaller width).
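A naive sketch of these two rules follows, reusing the jointree representation from the previous sketch (separators keyed by frozenset edges, as returned by `jointree_width`). The condition used for the second rule is our reading of Chen and Darwiche (2022), and no attempt is made at efficiency or at a clever rule-application order:

```python
def tree_adj(edges):
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b); adj.setdefault(b, set()).add(a)
    return adj

def path_edges(adj, src, dst, prev=None):
    """Edges on the unique tree path from src to dst."""
    if src == dst:
        return []
    for m in adj[src]:
        if m != prev:
            rest = path_edges(adj, m, dst, src)
            if rest is not None:
                return [frozenset((src, m))] + rest
    return None

def side_edges(adj, i, j):
    """Edges on the i-side of edge {i, j}, excluding the edge itself."""
    seen, stack, out = {j}, [i], set()
    while stack:
        n = stack.pop(); seen.add(n)
        for m in adj[n]:
            if m not in seen:
                out.add(frozenset((n, m))); stack.append(m)
    return out

def thin(edges, seps, family_hosts, functional):
    """Apply both thinning rules to exhaustion (naive fixpoint loop)."""
    adj, changed = tree_adj(edges), True
    while changed:
        changed = False
        for X in functional:
            for e in [e for e in seps if X in seps[e]]:
                i, j = tuple(e)
                hosts_X = family_hosts[X]
                # Rule 1: e lies on a host-to-host path whose separators all contain X.
                rule1 = any((p := path_edges(adj, l1, l2)) and e in p and
                            all(X in seps[f] for f in p)
                            for l1 in hosts_X for l2 in hosts_X if l1 != l2)
                # Rule 2 (as reconstructed above): X in no other separator on one side.
                rule2 = (all(X not in seps[f] for f in side_edges(adj, i, j)) or
                         all(X not in seps[f] for f in side_edges(adj, j, i)))
                if rule1 or rule2:
                    seps[e].discard(X); changed = True
    return seps
```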
The effectiveness of thinning rules depends on the number of jointree nodes that host a family $\mathbf{F}$, which we denote by $r(\mathbf{F})$, and on the location of these nodes in the jointree. One can enable more thinnings by increasing the number of jointree nodes that host each family $\mathbf{F}$. This process is called replication, where $r(\mathbf{F})$ is called the number of replicas for family $\mathbf{F}$. Replication comes at the expense of increasing the number of jointree nodes, so the definition of causal treewidth limits this growth by requiring the jointree size to be a polynomial in the number of nodes in the underlying DAG; see Chen and Darwiche (2022) for details. (Footnote 3: Thinning rules will not trigger if families are not replicated, i.e., if $r(\mathbf{F}) = 1$ for all families $\mathbf{F}$.) Replication usually increases the width of a jointree, with the goal of having thinning rules reduce the enlarged width to below the original one. The replication strategy may sometimes not be effective on certain networks, leaving the thinned width above the original width. We use the strategy in Chen and Darwiche (2022), which exhibits this behavior on certain networks.
3 The Treewidth of Twin Networks
Consider Figure 3(a) which depicts a 2-bit half adder. Suppose the binary inputs $A$ and $B$ are randomly sampled from some distribution and the gates may not be functioning properly. This circuit can be modeled using the network in Figure 3(b). Variables $A, B, X, Y$ represent the inputs and outputs of the circuit; $H_1$ and $H_2$ represent the health of the XOR gate and the AND gate; and $U$ represents an unknown external random sampler that decides the state of inputs $A$ and $B$. Suppose that currently input $A$ is high, input $B$ is low, yet both outputs $X$ and $Y$ are low, which is an abnormal circuit behavior. We wish to know whether the half adder would still behave correctly if we were to turn both inputs $A$ and $B$ on. This question can be formulated using the following counterfactual query: $\Pr(X_{A=1,B=1}=0, Y_{A=1,B=1}=1 \mid A=1, B=0, X=0, Y=0)$. This query can be answered using a twin network as shown in Figure 3(c), where each non-root variable $V$ has a duplicate $V^*$. The current evidence is asserted on the variables representing the real world and the interventional query is computed on the duplicate variables representing the imaginary world. This is done by removing the edges incoming into the intervened-upon variables $A^*, B^*$, asserting evidence $A^*=1, B^*=1$ and finally computing the probability of $X^*=0, Y^*=1$ as shown in Figure 3(d); see Pearl (2009) for an elaborate discussion of these steps. This illustrates how a counterfactual query can be computed using algorithms for associational queries, like variable elimination, but on a mutilated twin network instead of the base network.
We next show that the treewidth of a twin network is at most twice the treewidth of its base network plus one, which allows us to relate the complexities of associational, interventional and counterfactual reasoning on fully specified SCMs. We first recall the definition of twin networks as proposed by Balke and Pearl (1994b).
Definition 1.
Given a base network $G$, its twin network $G^t$ is constructed as follows. For each internal variable $V$ in $G$, add a new variable labeled $V^*$. For each parent $P$ of $V$, if $P$ is an internal variable, make $P^*$ a parent of $V^*$; otherwise, make $P$ a parent of $V^*$. We will call $V$ a base variable and $V^*$ a duplicate variable.
For convenience, we use $V^* = V$ when $V$ is a root. For a set of variables $\mathbf{V}$, we use $\mathbf{V}^*$ to denote $\{V^* : V \in \mathbf{V}\}$. Figure 3(c) depicts the twin network for the base network in Figure 3(b).
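The construction of Definition 1 is direct to implement. The following sketch (ours) represents a network as a dict from each variable to its parent list and returns the twin network:

```python
def twin_network(dag):
    star = lambda v: v + "*"          # label of the duplicate variable V*
    twin = dict(dag)                  # base variables are kept as-is
    for v, parents in dag.items():
        if parents:                   # internal variable: add its duplicate
            twin[star(v)] = [star(p) if dag[p] else p for p in parents]
    return twin

base = {"A": [], "B": ["A"], "C": ["B"]}
print(twin_network(base))
# {'A': [], 'B': ['A'], 'C': ['B'], 'B*': ['A'], 'C*': ['B*']}
```

Note how the root $A$ is shared between the two worlds while the internal variables are duplicated.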
3.1 Twin elimination orders
Our result on the treewidth of twin networks is based on converting every elimination order for the base network into an elimination order for its twin network while providing a guarantee on the width of the latter in terms of the width of the former. We provide a similar result for jointrees that we use when discussing the causal treewidth of twin networks.
Definition 2.
Consider an elimination order $\pi$ for a base network $G$. The twin elimination order $\pi^t$ is an elimination order for its twin network $G^t$ constructed by replacing each non-root variable $X$ in $\pi$ by $X, X^*$.
Consider the base network in Figure 3(b) and an elimination order for it such as $U, H_1, H_2, A, B, X, Y$. The corresponding twin elimination order is then $U, H_1, H_2, A, A^*, B, B^*, X, X^*, Y, Y^*$. Recall that eliminating variables from a base network induces a cluster sequence $C_1, \dots, C_n$. We use $C(X)$ to denote the cluster of eliminated variable $X$. Similarly, eliminating variables from a twin network induces a cluster sequence, and we use $C^t(X)$ to denote the cluster of eliminated variable $X$ and $C^t(X^*)$ to denote the cluster of its eliminated duplicate $X^*$.
Theorem 1.
Suppose we are eliminating variables from base network $G$ using an elimination order $\pi$ and eliminating variables from its twin network $G^t$ using the twin elimination order $\pi^t$. For every variable $X$ in $G$, we have $C^t(X) \subseteq twin(C(X))$ and $C^t(X^*) \subseteq twin(C(X))$, where $twin(\mathbf{V}) = \mathbf{V} \cup \mathbf{V}^*$.
This theorem has two key corollaries. The first relates the widths of an elimination order and its twin elimination order.
Corollary 1.
Let $w$ be the width of elimination order $\pi$ for base network $G$ and let $w^t$ be the width of twin elimination order $\pi^t$ for twin network $G^t$. We then have $w^t \le 2w + 1$.
The above bound is tight as shown in Appendix B. The next corollary gives us our first major result.
Corollary 2.
If $tw$ is the treewidth of base network $G$ and $tw^t$ is the treewidth of its twin network $G^t$, then $tw^t \le 2 \cdot tw + 1$.
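Under the assumptions of the earlier sketches, Definition 2 and the bound of Corollary 1 can be checked empirically as follows (the helpers `order_width` and `twin_network` come from the previous sketches):

```python
def twin_order(dag, order):
    out = []
    for v in order:
        out.append(v)
        if dag[v]:                    # non-root: eliminate its duplicate next
            out.append(v + "*")
    return out

base = {"A": [], "B": ["A"], "C": ["B"]}
base_order = ["A", "B", "C"]
w = order_width(base, base_order)
wt = order_width(twin_network(base), twin_order(base, base_order))
assert wt <= 2 * w + 1                # Corollary 1 (here: wt = 2, w = 1)
```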
3.2 Twin jointrees


We will now provide a similar result for jointrees. That is, we will show how to convert a jointree for a base network into a jointree for its twin network while providing a guarantee on the width/size of the twin jointree in terms of the width/size of the base jointree. This may seem like a redundant result given Corollary 1 but the provided conversion will actually be critical for our later result on bounding the causal treewidth of twin networks. It can also be significantly more efficient than constructing a jointree by operating on the (larger) twin network.
Our conversion process operates on a jointree after directing its edges away from some node $r$, called the root. This defines a single parent for each jointree node $i$, which is the neighbor of $i$ closest to root $r$, with all other neighbors of $i$ being its children. These parent-child relationships are invariant when running the algorithm. We also use a subroutine for duplicating the jointree nodes rooted at some node $i$. This subroutine duplicates node $i$ and its descendants while also duplicating the edges connecting these nodes. If a duplicated node $j$ hosts a family $\mathbf{F}$, this subroutine will make its duplicate $j^*$ host the duplicate family $\mathbf{F}^*$ (so $V^* \in \mathbf{F}^*$ iff $V \in \mathbf{F}$).
The conversion process is given in Algorithm 1, which should be called initially with a root $r$ that does not host a family for an internal DAG node. The twin jointree in Figure 4(b) was obtained from the base jointree in Figure 4(a) by this algorithm, which simply adds nodes and edges to the base jointree. If an edge $(i,j)$ in the base jointree is duplicated by Algorithm 1, we call $(i,j)$ a duplicated edge and its copy $(i^*,j^*)$ a duplicate edge. Otherwise, we call $(i,j)$ an invariant edge. In Figure 4(b), duplicate edges are shown in red and invariant edges are shown in green. We now have the following key result on these twin jointrees.
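Algorithm 1 itself is given in the paper; the sketch below (ours) shows only the subtree-duplication subroutine described above, under an assumed representation: a child-to-parent map for the directed jointree and a `hosts` dict for leaves. Where the duplicate subtree is attached is decided by the algorithm, so here we simply attach the copy where the original subtree attaches:

```python
def duplicate_subtree(parent, hosts, n, star):
    """Copy node n and its descendants; copied leaves host duplicate families."""
    copies, todo = {}, [n]
    while todo:
        m = todo.pop()
        copies[m] = m + "'"                            # fresh label for the copy
        if m in hosts:                                 # V* is in F* iff V is in F
            hosts[copies[m]] = {star(v) for v in hosts[m]}
        todo += [c for c, p in parent.items() if p == m]
    for m, m_copy in copies.items():                   # duplicate connecting edges
        p = parent.get(m)
        parent[m_copy] = copies.get(p, p)
    return copies
```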
Theorem 2.
If the input jointree to Alg. 1 has separators $\mathbf{S}_{ij}$ and the output jointree has separators $\mathbf{S}'_{ij}$, then for duplicated edges $(i,j)$, $\mathbf{S}'_{ij} = \mathbf{S}_{ij}$; for duplicate edges $(i^*,j^*)$, $\mathbf{S}'_{i^*j^*} = \mathbf{S}^*_{ij}$; and for invariant edges $(i,j)$, $\mathbf{S}'_{ij} = \mathbf{S}_{ij} \cup \mathbf{S}^*_{ij}$.
One can verify that the separators in Figure 4 satisfy these properties. The following result bounds the width and size of twin jointrees generated by Algorithm 1.
Corollary 3.
Let $w$ be the width of a jointree for base network $G$ and let $m$ be the number of its nodes. Calling Algorithm 1 on this jointree will generate a jointree for twin network $G^t$ whose width is no greater than $2w + 1$ and whose number of nodes is no greater than $2m$.
The above bound on width is tight as shown in Appendix B. Since treewidth can be defined in terms of jointree width, the above result leads to the same guarantee of Corollary 2 on the treewidth of twin networks. However, the main role of the construction in this section is in bounding the causal treewidth of twin networks. This is discussed next.
4 The Causal Treewidth of Twin Networks
Recall that causal treewidth is a more refined notion than treewidth as it uses more information about the network. In particular, this notion is relevant when we know that some variables in the network are functional, without needing to know the specific functions (equations) of these variables. By exploiting this information, one can construct thinned jointrees that have smaller separators and clusters compared to classical jointrees, which can lead to exponential savings in reasoning time Chen and Darwiche (2022); Darwiche (2021, 2020). As mentioned earlier, the causal treewidth corresponds to the minimum width of any thinned jointree. This is guaranteed to be no greater than treewidth and can be bounded when treewidth is not Darwiche (2021). We next show that the causal treewidth of a twin network is also at most twice the causal treewidth of its base network plus one.
Theorem 3.
Consider a twin jointree constructed by Algorithm 1 from a base jointree with thinned separators $\mathbf{S}_{ij}$. The following are valid thinned separators $\mathbf{S}'$ for this twin jointree: for duplicated edges $(i,j)$, $\mathbf{S}'_{ij} = \mathbf{S}_{ij}$; for duplicate edges $(i^*,j^*)$, $\mathbf{S}'_{i^*j^*} = \mathbf{S}^*_{ij}$; and for invariant edges $(i,j)$, $\mathbf{S}'_{ij} = \mathbf{S}_{ij} \cup \mathbf{S}^*_{ij}$.
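Reading Theorem 3 operationally: given the thinned separators of the base jointree and a classification of its edges, the thinned separators of the twin jointree can be written down directly. A sketch (ours, with `star` mapping internal $V$ to $V^*$ and acting as identity on roots, and `dup` giving the label of a duplicate edge):

```python
def twin_thinned_separators(base_seps, kind, star, dup):
    twin_seps = {}
    for e, S in base_seps.items():
        S_star = {star(v) for v in S}
        if kind[e] == "duplicated":
            twin_seps[e] = set(S)             # S'_ij   = S_ij
            twin_seps[dup(e)] = S_star        # S'_i*j* = S*_ij
        else:                                 # invariant edge
            twin_seps[e] = set(S) | S_star    # S'_ij   = S_ij ∪ S*_ij
    return twin_seps
```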
This theorem shows that a thinned, base jointree can be easily converted into a thinned, twin jointree. This is significant for two reasons. First, this method avoids the explicit construction of thinned jointrees for twin networks which can be quite expensive computationally Chen and Darwiche (2022). Second, we have the following guarantee on the width of thinned, twin jointrees constructed by Theorem 3.
Corollary 4.
Consider the thinned, base and twin jointrees in Theorem 3. If the thinned, base jointree has width $w$, then the thinned, twin jointree has width no greater than $2w + 1$.
Due to space constraints, we include a thinned jointree for the base network and the corresponding thinned, twin jointree constructed by Algorithm 1 and Theorem 3 in Appendix C. We can now bound the causal treewidth of twin networks.
Corollary 5.
If $ctw$ and $ctw^t$ are the causal treewidths of a base network and its twin network, then $ctw^t \le 2 \cdot ctw + 1$.
5 Counterfactual Reasoning Beyond Two Worlds
Standard counterfactual reasoning contemplates two worlds, one real and another imaginary, while assuming that exogenous variables correspond to causal mechanisms that govern both worlds. This motivates the notion of a twin network as it ensures that these causal mechanisms are invariant. We can think of counterfactual reasoning as a kind of temporal reasoning where endogenous variables can change their states over time. A more general setup arises when we allow some exogenous variables to change their states over time. For example, consider again the half adder in Figure 3(a) and its base network in Figure 5. Suppose we set inputs $A$ and $B$ to high and low and observe outputs $X$ and $Y$ to be high and low, which is a normal behavior. We then set both inputs to low and observe that the outputs do not change, which is an abnormal behavior. We then aim to predict the state of the outputs if we were to set both inputs to high. This scenario involves three time steps (worlds). Moreover, while the health of gates $H_1$ and $H_2$ is invariant over time, we do not wish to make the same assumption about the inputs $A$ and $B$. We can model this situation using the network in Figure 5, which belongs to a more general type of network that we call $N$-world networks.
Definition 3.
Consider a base network $G$, and let $\mathbf{R}$ be a subset of its roots and $N \ge 1$ be an integer. The $N$-world network of $G$ is constructed as follows. For each variable $V$ in $G$ that is not in $\mathbf{R}$, replace it with $N$ duplicates of $V$, labeled $V^1, \dots, V^N$. For each parent $P$ of $V$, if $P$ is in $\mathbf{R}$, make $P$ a parent of $V^i$ for all $i$. Otherwise, make $P^i$ a parent of $V^i$ for all $i$.
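A sketch of this construction (ours), where `shared` plays the role of the root set $\mathbf{R}$ that is common to all worlds:

```python
def n_world_network(dag, shared, N):
    label = lambda v, k: v if v in shared else f"{v}^{k}"
    out = {r: [] for r in shared}
    for v, parents in dag.items():
        if v in shared:
            continue
        for k in range(1, N + 1):             # one copy of V per world
            out[label(v, k)] = [label(p, k) for p in parents]
    return out

# Sharing only a gate-health root across three worlds, in the spirit of Figure 5:
# n_world_network({"U": [], "H1": [], "A": ["U"], "X": ["A", "H1"]}, {"H1"}, 3)
```

With $N = 2$ and `shared` containing all roots, this reduces to the twin construction of Definition 1 (up to labels).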
This definition corresponds to the notion of a parallel worlds model Avin et al. (2005) when $\mathbf{R}$ contains all roots in the base network. Moreover, twin networks arise as a special case when $N = 2$ and $\mathbf{R}$ contains all roots of the base network. We next bound the (causal) treewidth of $N$-world networks by the (causal) treewidth of their base networks.
Theorem 4.
If $tw$ and $tw^N$ are the (causal) treewidths of a base network and its $N$-world network, then $tw^N \le N \cdot tw + N - 1$.
The class of $N$-world networks is a subclass of dynamic Bayesian networks Dean and Kanazawa (1989) and is significant for a number of reasons. First, as illustrated above, it arises when reasoning about the behavior of systems consisting of function blocks (e.g., gates) Hamscher et al. (1992). These kinds of physical systems can be easily modeled using fully specified SCMs, where the structural equations correspond to component behaviors and the distributions over exogenous variables correspond to component reliabilities; see (Darwiche, 2009, Ch 5) for a textbook discussion and Mengshoel et al. (2010) for a case study of a real-world electrical power system. More broadly, $N$-world networks allow counterfactual reasoning that involves conflicting observations and actions that arise in multiple worlds, as in the unit selection problem Li and Pearl (2022); for example, Huang and Darwiche (2023) used Theorem 4 to obtain bounds on the complexity of this problem. See also Avin et al. (2005); Shpitser and Pearl (2007, 2008) for further applications of $N$-world networks in the context of counterfactual reasoning. Appendix D shows that the treewidth bound of Theorem 4 holds for a generalization of $N$-world networks that permits the duplication of only a subset of base nodes and allows certain edges that extend between worlds.
6 Counterfactual Reasoning with Partially Specified SCMs
The results we presented on $N$-world networks, which include twin networks, apply directly to fully specified SCMs. In particular, in the context of variable elimination and jointree algorithms, these results allow us to bound the complexity of computing counterfactual queries in terms of the complexity of computing associational/interventional queries. Moreover, they provide efficient methods for constructing elimination orders and jointrees that can be used for computing counterfactual queries based on the ones used for answering associational/interventional queries, while ensuring that the stated bounds will be realized. Recall again that our bounds and constructions apply to both traditional treewidth and the more recent causal treewidth.
Causal reasoning can also be conducted on partially specified SCMs and data, which is a more common and challenging task. A partially specified SCM typically includes the SCM structure and some information about its parameters (i.e., its structural equations and the distributions over its exogenous variables). For example, we may not know any of the SCM parameters, or we may know the structural equations but not the distributions over exogenous variables, as assumed in Zaffalon et al. (2021). A central question in this setup is whether the available information, which includes data, is sufficient to obtain a point estimate for the causal query of interest, in which case the query is said to be identifiable. A significant amount of work has focused on characterizing conditions under which causal queries (both counterfactual and interventional) are identifiable; see Pearl (2009); Spirtes et al. (2000) for textbook discussions of this subject and Shpitser and Pearl (2008); Correa et al. (2021) for some results on the identification of counterfactual queries.
When a query is identifiable, the classical approach for estimating it is to derive an estimand using techniques such as the do-calculus for interventional queries Pearl (1995); Tian and Pearl (2002); Shpitser and Pearl (2006). (Footnote 4: See Jung et al. (2021b, a, 2020) for some recent work on estimating identifiable interventional queries from finite data.) Some recent approaches take a different direction by first estimating the SCM parameters to yield a fully specified SCM that is then used to answer (identifiable) interventional and counterfactual queries using classical inference algorithms Zaffalon et al. (2022, 2021); Darwiche (2021). Our results on twin and $N$-world networks apply directly in this case as they can be used when conducting inference on the fully parameterized SCM. For unidentifiable queries, the classical approach is to derive a closed-form bound on the query; see, for example, Balke and Pearl (1994a); Pearl (1999); Tian and Pearl (2000); Dawid et al. (2017); Rosset et al. (2018); Evans (2018); Zhang et al. (2021); Mueller et al. (2022). Some recent approaches take a different direction for establishing bounds, such as reducing the problem into one of polynomial programming Duarte et al. (2021); Zhang et al. (2022) or inference on credal networks Zaffalon et al. (2020); Cozman (2000); Mauá and Cozman (2020). Another recent direction is to establish (approximate) bounds by estimating SCM parameters and then using classical inference algorithms on the fully specified SCM to obtain point estimates Zaffalon et al. (2021, 2022). Since the query is not identifiable, different parametrizations can lead to different point estimates, which are employed to improve (widen) the computed bounds. Our results can also be used in this case for computing point estimates based on a particular parametrization (fully specified SCM) within the overall process of establishing bounds.
7 Experimental Results
We consider experiments that target random networks whose structures emulate the structures of SCMs used in counterfactual reasoning. We have a few objectives in mind. First, we wish to compare the widths of base and twin jointrees, with and without thinning. These widths do not correspond to (causal) treewidth since the jointrees are constructed using heuristics (finding optimal jointrees is NP-hard). Next, we want to compare the quality of twin jointrees constructed by Algorithm 1 (TWIN-ALG1), which operates directly on a base jointree, to the quality of twin jointrees obtained by applying the minfill heuristic to a twin network (TWIN-MF). Recall that the former method is more efficient than the latter method. Finally, we wish to conduct a similar comparison between the thinned, twin jointrees constructed according to Theorem 3 (TWIN-THM3) and the thinned, twin jointrees obtained by applying the minfill heuristic and thinning rules to a twin network (TWIN-MF-RLS). Again, the former method is more efficient than the latter. The widths of these jointrees will be compared to the widths of base jointrees constructed by minfill (BASE-MF) and thinned, base jointrees constructed by minfill and thinning rules (BASE-MF-RLS).
We generated random networks according to the method used in Darwiche (2020). Given a number of nodes $n$ and a maximum number of parents $p$, the method chooses the parents of each node randomly from the nodes generated before it, where the number of parents is chosen randomly up to the maximum $p$. We refer to these networks as rNET. We then consider each internal node $X$ and add a unique root $U_X$ as a parent for $X$. This is meant to emulate the structure of SCMs as the exogenous variable $U_X$ can be viewed as representing the different causal mechanisms for endogenous variable $X$. We refer to these modified networks as rSCM. The twin networks of rSCM are more complex than those for rNET since more variables are shared between the two slices representing the real and imaginary worlds (i.e., more information is shared between the two worlds). We considered multiple combinations of $n$ and $p$. For each combination, we generated a set of random, base networks and reported averages of two properties for the constructed jointrees: width and normalized width. If a jointree has clusters $\mathbf{C}_1, \dots, \mathbf{C}_m$, then its normalized width is $\log_2 \sum_i 2^{|\mathbf{C}_i|}$. This accounts for all clusters in the jointree (instead of just the largest one) and the jointree size. The data we generated occupies significant space so we included it in Appendix E, while providing representative plots of jointree widths in Figure 6. We next discuss patterns exhibited in these plots and the full data in Appendix E, which also includes experiments using random networks generated according to the method in Ide and Cozman (2002).
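For concreteness, here is one plausible reading (ours; the paper's exact sampling sets are not reproduced here) of the rNET/rSCM generators and of normalized width:

```python
import math
import random

def rnet(n, p):
    dag = {0: []}
    for i in range(1, n):
        k = random.randint(1, min(p, i))          # number of parents for node i
        dag[i] = random.sample(range(i), k)       # parents among earlier nodes
    return dag

def rscm(n, p):
    dag = rnet(n, p)
    for i in [v for v, ps in dag.items() if ps]:  # each internal node gets a
        dag[f"U{i}"] = []                         # unique exogenous root parent
        dag[i] = dag[i] + [f"U{i}"]
    return dag

def normalized_width(clusters):
    """log2 of the total cluster state space: log2(sum_i 2^|C_i|)."""
    return math.log2(sum(2 ** len(c) for c in clusters))
```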
First, the widths of twin jointrees are always less than twice the widths of their base jointrees, and often significantly less than that. This is not guaranteed by our theoretical bounds as those apply to (causal) treewidth, not to the widths of jointrees produced by heuristics; the latter widths are an upper bound on the former. Second, constructing a twin jointree by directly applying Algorithm 1 to a base jointree (TWIN-ALG1) is relatively comparable to constructing the twin jointree by operating on the twin network (TWIN-MF), as would normally be done. This also holds for thinned jointrees (TWIN-THM3 vs TWIN-MF-RLS) and is encouraging since the former methods are much more efficient than the latter ones. Third, the employment of thinned jointrees can lead to a significant reduction in width and hence an exponential reduction in reasoning time. This can be seen by comparing the widths of twin jointrees TWIN-THM3 and TWIN-ALG1 since the former is thinned but the latter is not (similarly for TWIN-MF-RLS and TWIN-MF). Fourth, the twin jointrees of rSCM have larger widths than those of rNET. Recall that in rSCM, every endogenous variable has its own exogenous variable as a parent. Therefore, the distribution over exogenous variables has a larger space in rSCM compared to rNET. Since this distribution needs to be shared between the real and imaginary worlds, counterfactual reasoning with rSCM is indeed expected to be more complex computationally than reasoning with rNET. Finally, consider Figure 6(b) for a bottom-line comparison between the complexity of counterfactual reasoning and the complexity of associational/interventional reasoning in practice. Jointrees BASE-MF have the smallest widths for base networks, so these are the jointrees one would use for associational/interventional reasoning. The best twin jointrees are TWIN-MF-RLS, which are thinned. This is what one would use for counterfactual reasoning. The widths of the latter jointrees are always less than twice the widths of the former, and quite often significantly less. (Footnote 5: See footnote 3 for why BASE-MF is better than BASE-MF-RLS for rSCM.)


8 Conclusion
We studied the complexity of counterfactual reasoning on fully specified SCMs in relation to the complexity of associational and interventional reasoning on these models. Our basic finding is that in the context of algorithms based on (causal) treewidth, the former complexity is no greater than quadratic in the latter when counterfactual reasoning involves only two worlds. We extended these results to counterfactual reasoning that requires multiple worlds, showing that the gap in complexity is bounded polynomially by the number of needed worlds. Our empirical results suggest that for two types of random SCMs, the complexity of counterfactual reasoning is closer to that of associational and interventional reasoning than our worst-case theoretical analysis may suggest. While our results directly target counterfactual reasoning on fully specified SCMs, we also discussed cases when they can be applied to counterfactual reasoning on partially specified SCMs that are coupled with data.
Acknowledgements
We wish to thank Haiying Huang, Scott Mueller, Judea Pearl, Ilya Shpitser, Jin Tian and Marco Zaffalon for providing useful feedback on an earlier version of this paper. This work has been partially supported by ONR grant N000142212501.
References
- Avin et al. [2005] Chen Avin, Ilya Shpitser, and Judea Pearl. Identifiability of path-specific effects. In IJCAI, pages 357–363. Professional Book Center, 2005.
- Balke and Pearl [1994a] Alexander Balke and Judea Pearl. Counterfactual probabilities: Computational methods, bounds and applications. In UAI, pages 46–54. Morgan Kaufmann, 1994.
- Balke and Pearl [1994b] Alexander Balke and Judea Pearl. Probabilistic evaluation of counterfactual queries. In AAAI, pages 230–237. AAAI Press / The MIT Press, 1994.
- Balke and Pearl [1995] Alexander Balke and Judea Pearl. Counterfactuals and policy analysis in structural models. In UAI, pages 11–18. Morgan Kaufmann, 1995.
- Bareinboim et al. [2021] E. Bareinboim, Juan David Correa, D. Ibeling, and Thomas F. Icard. On Pearl's hierarchy and the foundations of causal inference. 2021. Technical Report, R-60, Columbia University.
- Chen and Darwiche [2022] Yizuo Chen and Adnan Darwiche. On the definition and computation of causal treewidth. In UAI, volume 180 of Proceedings of Machine Learning Research, pages 368–377. PMLR, 2022.
- Correa et al. [2021] Juan D. Correa, Sanghack Lee, and Elias Bareinboim. Nested counterfactual identification from arbitrary surrogate experiments. In NeurIPS, pages 6856–6867, 2021.
- Cozman [2000] Fábio Gagliardi Cozman. Credal networks. Artif. Intell., 120(2):199–233, 2000.
- Darwiche [2003] Adnan Darwiche. A differential approach to inference in Bayesian networks. J. ACM, 50(3):280–305, 2003.
- Darwiche [2009] Adnan Darwiche. Modeling and Reasoning with Bayesian Networks. Cambridge University Press, 2009.
- Darwiche [2020] Adnan Darwiche. An advance on variable elimination with applications to tensor-based computation. In ECAI, volume 325 of Frontiers in Artificial Intelligence and Applications, pages 2559–2568. IOS Press, 2020.
- Darwiche [2021] Adnan Darwiche. Causal inference with tractable circuits. In WHY Workshop, NeurIPS, 2021. https://arxiv.org/abs/2202.02891.
- Darwiche [2022] Adnan Darwiche. Tractable Boolean and arithmetic circuits. In Pascal Hitzler and Md Kamruzzaman Sarker, editors, Neuro-symbolic Artificial Intelligence: The State of the Art, volume 342, chapter 6. Frontiers in Artificial Intelligence and Applications. IOS Press, 2022.
- Dawid et al. [2017] Philip Dawid, Monica Musio, and Rossella Murtas. The probability of causation. Law, Probability and Risk, 16:163–179, 2017.
- Dean and Kanazawa [1989] Thomas Dean and Keiji Kanazawa. A model for reasoning about persistence and causation. Computational Intelligence, 5, 1989.
- Dechter [1996] Rina Dechter. Bucket elimination: A unifying framework for probabilistic inference. In Proceedings of the Twelfth Annual Conference on Uncertainty in Artificial Intelligence (UAI), pages 211–219, 1996.
- Duarte et al. [2021] Guilherme Duarte, Noam Finkelstein, Dean Knox, Jonathan Mummolo, and Ilya Shpitser. An automated approach to causal inference in discrete settings. CoRR, abs/2109.13471, 2021.
- Evans [2018] Robin J. Evans. Margins of discrete Bayesian networks. The Annals of Statistics, 46(6A):2623 – 2656, 2018.
- Galles and Pearl [1998] David Galles and Judea Pearl. An axiomatic characterization of causal counterfactuals. Foundations of Science, 3(1):151–182, 1998.
- Halpern [2000] Joseph Y Halpern. Axiomatizing causal reasoning. Journal of Artificial Intelligence Research, 12:317–337, 2000.
- Hamscher et al. [1992] Walter Hamscher, Luca Console, and Johan de Kleer, editors. Readings in Model-Based Diagnosis. Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, 1992.
- Huang and Darwiche [2023] Haiying Huang and Adnan Darwiche. An algorithm and complexity results for causal unit selection. In 2nd Conference on Causal Learning and Reasoning, Proceedings of Machine Learning Research, 2023.
- Ide and Cozman [2002] Jaime Shinsuke Ide and Fábio Gagliardi Cozman. Random generation of Bayesian networks. In SBIA, volume 2507 of Lecture Notes in Computer Science, pages 366–375. Springer, 2002.
- Jung et al. [2020] Yonghan Jung, Jin Tian, and Elias Bareinboim. Learning causal effects via weighted empirical risk minimization. In NeurIPS, 2020.
- Jung et al. [2021a] Yonghan Jung, Jin Tian, and Elias Bareinboim. Estimating identifiable causal effects on Markov equivalence class through double machine learning. In ICML, volume 139 of Proceedings of Machine Learning Research, pages 5168–5179. PMLR, 2021.
- Jung et al. [2021b] Yonghan Jung, Jin Tian, and Elias Bareinboim. Estimating identifiable causal effects through double machine learning. In AAAI, pages 12113–12122. AAAI Press, 2021.
- Kjaerulff [1990] Uffe Kjaerulff. Triangulation of graphs – algorithms giving small total state space. 1990. Technical Report R-90-09, Department of Mathematics and Computer Science, University of Aalborg, Denmark.
- Kjærulff [1994] Uffe Kjærulff. Reduction of computational complexity in Bayesian networks through removal of weak dependences. In UAI, pages 374–382. Morgan Kaufmann, 1994.
- Koller and Friedman [2009] Daphne Koller and Nir Friedman. Probabilistic Graphical Models - Principles and Techniques. MIT Press, 2009.
- Larrañaga et al. [1997] Pedro Larrañaga, Cindy M. H. Kuijpers, Mikel Poza, and Roberto H. Murga. Decomposing Bayesian networks: triangulation of the moral graph with genetic algorithms. Statistics and Computing, 7, 1997.
- Lauritzen and Spiegelhalter [1988] S. L. Lauritzen and D. J. Spiegelhalter. Local computations with probabilities on graphical structures and their application to expert systems. Journal of the Royal Statistical Society. Series B (Methodological), 50(2):157–224, 1988.
- Li and Pearl [2022] Ang Li and Judea Pearl. Unit selection with nonbinary treatment and effect. CoRR, abs/2208.09569, 2022.
- Mauá and Cozman [2020] Denis Deratani Mauá and Fabio Gagliardi Cozman. Thirty years of credal networks: Specification, algorithms and complexity. International Journal of Approximate Reasoning, 126:133–157, 2020.
- Mengshoel et al. [2010] Ole J. Mengshoel, Mark Chavira, Keith Cascio, Scott Poll, Adnan Darwiche, and N. Serdar Uckun. Probabilistic model-based diagnosis: An electrical power system case study. IEEE Trans. Syst. Man Cybern. Part A, 40(5):874–885, 2010.
- Mueller et al. [2022] Scott Mueller, Ang Li, and Judea Pearl. Causes of effects: Learning individual responses from population data. In IJCAI, pages 2712–2718. ijcai.org, 2022.
- Pearl and Mackenzie [2018] Judea Pearl and Dana Mackenzie. The Book of Why: The New Science of Cause and Effect. Basic Books, 2018.
- Pearl [1988] Judea Pearl. Probabilistic Reasoning in Intelligent Systems: Networks of Plausible Inference. Morgan Kaufmann, 1988.
- Pearl [1995] Judea Pearl. Causal diagrams for empirical research. Biometrika, 82(4):669–688, 1995.
- Pearl [1999] Judea Pearl. Probabilities of causation: Three counterfactual interpretations and their identification. Synth., 121(1-2):93–149, 1999.
- Pearl [2009] Judea Pearl. Causality. Cambridge University Press, second edition, 2009.
- Peters et al. [2017] Jonas Peters, Dominik Janzing, and Bernhard Schölkopf. Elements of Causal Inference: Foundations and Learning Algorithms. MIT Press, 2017.
- Robertson and Seymour [1986] Neil Robertson and Paul D. Seymour. Graph minors. II. algorithmic aspects of tree-width. J. Algorithms, 7(3):309–322, 1986.
- Rosset et al. [2018] Denis Rosset, Nicolas Gisin, and Elie Wolfe. Universal bound on the cardinality of local hidden variables in networks. Quantum Info. Comput., 18(11–12):910–926, sep 2018.
- Shpitser and Pearl [2006] Ilya Shpitser and Judea Pearl. Identification of joint interventional distributions in recursive semi-markovian causal models. In AAAI, pages 1219–1226. AAAI Press, 2006.
- Shpitser and Pearl [2007] Ilya Shpitser and Judea Pearl. What counterfactuals can be tested. In UAI, pages 352–359. AUAI Press, 2007.
- Shpitser and Pearl [2008] Ilya Shpitser and Judea Pearl. Complete identification methods for the causal hierarchy. J. Mach. Learn. Res., 9:1941–1979, 2008.
- Spirtes et al. [2000] Peter Spirtes, Clark Glymour, and Richard Scheines. Causation, Prediction, and Search, Second Edition. Adaptive computation and machine learning. MIT Press, 2000.
- Tian and Pearl [2000] Jin Tian and Judea Pearl. Probabilities of causation: Bounds and identification. Ann. Math. Artif. Intell., 28(1-4):287–313, 2000.
- Tian and Pearl [2002] Jin Tian and Judea Pearl. A general identification condition for causal effects. In AAAI/IAAI, pages 567–573. AAAI Press / The MIT Press, 2002.
- Zaffalon et al. [2020] Marco Zaffalon, Alessandro Antonucci, and Rafael Cabañas. Structural causal models are (solvable by) credal networks. In International Conference on Probabilistic Graphical Models, PGM, volume 138 of Proceedings of Machine Learning Research, pages 581–592. PMLR, 2020.
- Zaffalon et al. [2021] Marco Zaffalon, Alessandro Antonucci, and Rafael Cabañas. Causal Expectation-Maximisation. In WHY Workshop, NeurIPS, 2021. https://arxiv.org/abs/2011.02912.
- Zaffalon et al. [2022] Marco Zaffalon, Alessandro Antonucci, Rafael Cabañas, David Huber, and Dario Azzimonti. Bounding counterfactuals under selection bias. In PGM, volume 186 of Proceedings of Machine Learning Research, pages 289–300. PMLR, 2022.
- Zhang and Poole [1994] Nevin Lianwen Zhang and David L. Poole. Intercausal independence and heterogeneous factorization. In UAI, pages 606–614. Morgan Kaufmann, 1994.
- Zhang et al. [2021] Junzhe Zhang, Jin Tian, and Elias Bareinboim. Partial counterfactual identification from observational and experimental data. CoRR, abs/2110.05690, 2021.
- Zhang et al. [2022] Junzhe Zhang, Jin Tian, and Elias Bareinboim. Partial counterfactual identification from observational and experimental data. In ICML, volume 162 of Proceedings of Machine Learning Research, pages 26548–26558. PMLR, 2022.
Appendix A Proofs
Proof of Theorem 1
Consider a base network $G$ and its twin network $G^t$. We first introduce a set notation $twin(V)$, which contains both the base and duplicate variable, $twin(V) = \{V, V^*\}$, if $V$ is an internal variable, and collapses to a single variable, $twin(V) = \{V\}$, if $V$ is a root variable. For a set of variables $\mathbf{V}$, we use $twin(\mathbf{V})$ to denote $\bigcup_{V \in \mathbf{V}} twin(V)$.
For an elimination order $\pi$ on a base network $G$, Darwiche [2009] defines a graph sequence $G_1, \dots, G_n$ induced by $\pi$, where $G_1$ is the moral graph of $G$, and $G_{i+1}$ is the result of eliminating variable $\pi(i)$ from $G_i$. Similarly, we define a twin graph sequence $G^t_1, \dots, G^t_n$, where $G^t_1$ is the moral graph of $G^t$, and $G^t_{i+1}$ is the result of eliminating the variables $twin(\pi(i))$ from $G^t_i$.
Consider $G_i$ in the graph sequence induced by $\pi$ on $G$. For each variable $V$ in $G_i$, let $M_i(V)$ be the set consisting of $V$ and its neighbors in $G_i$. Similarly, let $N_i(V)$ and $N_i(V^*)$ denote the set consisting of $V$ and its neighbors, and the set consisting of $V^*$ and its neighbors, in $G^t_i$, respectively. By definition, $C(\pi(i)) = M_i(\pi(i))$ and $C^t(\pi(i)) = N_i(\pi(i))$.
We first propose a lemma that relates $M_i$ and $N_i$.
Lemma 1.
Suppose we apply elimination order $\pi$ to a base network $G$ and apply $\pi^t$ to its twin network $G^t$. Then for each variable $V$ in $G_i$, we have $N_i(V) \subseteq twin(M_i(V))$ and $N_i(V^*) \subseteq twin(M_i(V))$.
Proof.
We will prove this by induction on $i$. The statement holds initially for $i = 1$ by the definition of twin networks. Suppose the statement holds for $i$, i.e. $N_i(V) \subseteq twin(M_i(V))$ and $N_i(V^*) \subseteq twin(M_i(V))$ for every $V$ in $G_i$; we need to show the statement holds for $i + 1$.
For simplicity, let $E$ denote the variable being eliminated at step $i$, i.e. $E = \pi(i)$. WLG, consider each base variable $V$ in $G^t_{i+1}$ (a similar argument can be applied to each duplicate variable $V^*$). $V$ is affected by the elimination of $twin(E)$ iff $V$ is a neighbor of $E$ or $E^*$ in $G^t_i$. Moreover, by induction, $V$ is a neighbor of $E$ or $E^*$ in $G^t_i$ only if $V$ is a neighbor of $E$ in $G_i$.

When $V$ is not a neighbor of $E$, then $N_{i+1}(V) = N_i(V)$ and $M_{i+1}(V) = M_i(V)$, so the statement holds.

When $V$ is a neighbor of $E$, $M_{i+1}(V) = (M_i(V) \cup M_i(E)) \setminus \{E\}$ by the definition of variable elimination. We can then bound $N_{i+1}(V)$ as follows:

$N_{i+1}(V) \subseteq (N_i(V) \cup N_i(E) \cup N_i(E^*)) \setminus twin(E)$  (eliminating $twin(E)$ from $G^t_i$)
$\subseteq (twin(M_i(V)) \cup twin(M_i(E))) \setminus twin(E)$  (by inductive hypothesis)
$= twin((M_i(V) \cup M_i(E)) \setminus \{E\}) = twin(M_{i+1}(V))$. ∎
Proof of Theorem 1.
Consider each variable $X$ that is eliminated at step $i$, i.e. $X = \pi(i)$. By Lemma 1, $C^t(X) = N_i(X) \subseteq twin(M_i(X)) = twin(C(X))$.
We next bound $C^t(X^*)$ when $X$ is an internal variable:

$C^t(X^*) \subseteq (N_i(X^*) \cup N_i(X)) \setminus \{X\}$  (eliminating $X$ from $G^t_i$)
$\subseteq twin(M_i(X)) = twin(C(X))$  (by Lemma 1)

∎
Proof of Theorem 2
We first state a key observation about Algorithm 1, which is formulated as Lemma 2. For simplicity, we say a leaf node hosts variable $V$ if it hosts a family that contains $V$.
Lemma 2.
Consider an invariant edge $(i,j)$ in a twin jointree constructed by Algorithm 1. A variable $V$ is hosted in some leaf on the $i$-side of the edge iff $V^*$ is also hosted in some leaf on the $i$-side of the edge.
Proof.
If $V$ is a root variable, then $V^* = V$ and the lemma holds. Now suppose $V$ is an internal variable. Let $l$ be a leaf node on the $i$-side of the edge that hosts $V$; it suffices to show that $l$ is duplicated into a leaf $l^*$ on the $i$-side that hosts $V^*$.

Since $V$ is an internal variable, $l$ hosts either the family for $V$ or a family for some child of $V$. In either case, $l$ hosts a family for an internal variable. Moreover, $l$ is contained in some subtree duplicated by Algorithm 1 whose leaves host only families for internal variables. Since $l$ is on the $i$-side of the invariant edge $(i,j)$, that subtree is also on the $i$-side of the edge. Hence the duplicate leaf $l^*$, which hosts $V^*$, is also on the $i$-side of the edge.

Conversely, suppose $V^*$ is hosted by some leaf $l^*$ on the $i$-side of edge $(i,j)$; then $l^*$ hosts some family for a duplicate variable. $l^*$ is contained in some duplicate subtree whose leaves only host families for duplicate variables. It follows that the corresponding base subtree is located on the $i$-side of the edge and contains a base leaf hosting $V$. ∎
We first recall the definition of separators: $V \in \mathbf{S}_{ij}$ if and only if $V$ is hosted on both sides of the edge $(i,j)$. For simplicity, given an edge $e = (i,j)$, we use $vars_i(e)$ to denote the variables that are hosted on the $i$-side of $e$ in the base jointree. Similarly, we use $vars'_i(e)$ and $vars'_j(e)$ to denote the variables that are hosted on the $i$-side and the $j$-side of $e$ in the twin jointree. By definition, for each edge $e = (i,j)$, $\mathbf{S}_{ij} = vars_i(e) \cap vars_j(e)$ and $\mathbf{S}'_{ij} = vars'_i(e) \cap vars'_j(e)$. Given a jointree and its root, we say that a jointree node $i$ is above a jointree node $j$ if $i$ is closer to the root than $j$, and that $i$ is below $j$ if $i$ is further from the root than $j$.
Proof of Theorem 2.
We derive the separators for each type of edge. WLG, for each edge $e = (i,j)$, assume that $i$ is above $j$. First consider each duplicated edge $(i,j)$: we have $vars'_j(e) = vars_j(e)$ by Algorithm 1. Moreover, $vars'_i(e)$ can only contain extra duplicate variables compared to $vars_i(e)$. Thus, $\mathbf{S}'_{ij} = \mathbf{S}_{ij}$.

For each duplicate edge $(i^*, j^*)$, we have $vars'_{j^*} = (vars_j(e))^*$ by Algorithm 1. We next show that for each variable $V$, $V \in \mathbf{S}_{ij}$ iff $V^* \in \mathbf{S}'_{i^*j^*}$, which then concludes $\mathbf{S}'_{i^*j^*} = \mathbf{S}^*_{ij}$. We first show the if-part. Let $a$ be the least common ancestor of $j$ and $j^*$ in the twin jointree. If $V \in \mathbf{S}_{ij}$, then $V$ is hosted by some leaf $l$ that appears either below $a$, or above $a$, in the base jointree. Suppose $l$ is below $a$; then $l$'s duplicate $l^*$ hosts $V^*$ on the corresponding side in the twin jointree by Algorithm 1, which implies $V^* \in \mathbf{S}'_{i^*j^*}$. Suppose $l$ is above $a$. If $l$ is the root of the jointree, then $V$ is a root variable and $V^* = V \in \mathbf{S}'_{i^*j^*}$. Otherwise, let $b$ be the parent of $a$; then $(a,b)$ is an invariant edge, and $V^*$ appears on the $b$-side of the edge by Lemma 2.

A similar argument applies for the only-if part. Suppose $V^* \in \mathbf{S}'_{i^*j^*}$; then $V^*$ is hosted by some duplicate leaf $l^*$ either below or above $a$. If $l^*$ is below $a$, then the duplicated node $l$ is also below $a$. If $l^*$ is above $a$, then a leaf hosting $V$ must also appear above $a$ by Lemma 2.

For each invariant edge $(i,j)$, it follows from Lemma 2 that $V$ is hosted on the $i$-side of the edge iff $V^*$ is, and similarly for the $j$-side. Hence, $\mathbf{S}'_{ij} = \mathbf{S}_{ij} \cup \mathbf{S}^*_{ij}$. ∎
Proof of Corollary 3.
Let $J$ be a jointree for $G$ with width $w$, and let $J'$ be the twin jointree for $G^t$ obtained from Algorithm 1. Let $i'$ be a node in jointree $J'$ and let $i$ be the node of $J$ that $i'$ equals or duplicates. By Theorem 2, every separator adjacent to $i'$ satisfies $\mathbf{S}' \subseteq \mathbf{S}_{ij} \cup \mathbf{S}^*_{ij}$ for the corresponding edge $(i,j)$ of $J$. So $\mathbf{C}'_{i'} \subseteq \mathbf{C}_i \cup \mathbf{C}^*_i$. So $|\mathbf{C}'_{i'}| \le 2|\mathbf{C}_i| \le 2(w+1)$. So the width of $J'$ is no greater than $2(w+1) - 1 = 2w + 1$.

The number of nodes in the twin jointree is at most twice the number of nodes in the base jointree since Algorithm 1 adds at most one duplicate for each node in the base jointree. ∎
Proof of Theorem 3
Proof of Theorem 3.
From Chen and Darwiche [2022], a thinned jointree can be obtained by applying a sequence of thinning rules to a jointree. Let $t_1, \dots, t_m$ denote the thinning rules applied to the base jointree, in order. We next construct a thinning sequence for the twin jointree, which leads to the thinned separators defined in the theorem.
We first define a parallel thinning step as simultaneously thinning a functional variable $X$ from the base jointree and thinning the corresponding variables from the twin jointree. Consider any thinning rule that thins $X$ from a separator $\mathbf{S}_{ij}$; the parallel thinning on the twin jointree is defined as follows:
- Suppose $(i,j)$ is a duplicated edge; then we thin $X$ from $\mathbf{S}'_{ij}$ and $X^*$ from $\mathbf{S}'_{i^*j^*}$.
- Suppose $(i,j)$ is an invariant edge; then we thin both $X$ and $X^*$ from $\mathbf{S}'_{ij}$.
First note that the definition of parallel thinning ensures that the relation between the base and twin separators specified in the theorem holds after every parallel thinning step. It remains to show that the parallel thinnings on the twin jointree are indeed valid.
Let $\mathbf{S}_{ij}$ and $\mathbf{S}'_{ij}$ denote the separators for the base and twin jointrees during the parallel thinnings. Consider a parallel thinning step applied to a duplicated edge $(i,j)$, which, by definition, removes $X$ from $\mathbf{S}_{ij}$, $X$ from $\mathbf{S}'_{ij}$, and $X^*$ from $\mathbf{S}'_{i^*j^*}$. Suppose the removal of $X$ from $\mathbf{S}_{ij}$ is supported by the first thinning rule, i.e. the edge $(i,j)$ is on the path between two leaf nodes, call them $l_1$ and $l_2$, both hosting the family of $X$, and every separator on that path contains $X$. We claim that the removal of $X$ from $\mathbf{S}'_{ij}$ and the removal of $X^*$ from $\mathbf{S}'_{i^*j^*}$ are both supported by the first thinning rule. By Algorithm 1, the leaf nodes $l_1, l_2$ host the family of $X$, and the leaf nodes $l_1^*, l_2^*$ host the family of $X^*$, in the twin jointree. Moreover, by the inductive assumption on separators, $X$ appears on every separator between $l_1$ and $l_2$ and $X^*$ appears on every separator between $l_1^*$ and $l_2^*$. This is based on the observation that the path from $l_1$ to $l_2$ can be divided into three sub-paths: a sub-path from $l_1$ to some node $a$ consisting of only duplicated edges, a sub-path from $a$ to some node $b$ consisting of only invariant edges, and a sub-path from $b$ to $l_2$ consisting of only duplicated edges. (Footnote 6: Note that we do not preclude the possibility of having no invariant edge on the path.) Given the path from $l_1$ to $l_2$, we can then express the path from $l_1^*$ to $l_2^*$ as three sub-paths as well: a sub-path from $l_1^*$ to $a$ consisting of only duplicate edges, the same sub-path from $a$ to $b$ consisting of only invariant edges, and a sub-path from $b$ to $l_2^*$ consisting of only duplicate edges. For each duplicated edge $(i,j)$ on the sub-path from $l_1$ to $a$ or from $b$ to $l_2$, $X \in \mathbf{S}_{ij}$ iff $X^* \in \mathbf{S}'_{i^*j^*}$. For each invariant edge $(i,j)$ on the sub-path from $a$ to $b$, $X \in \mathbf{S}'_{ij}$ iff $X^* \in \mathbf{S}'_{ij}$.
Suppose instead that the removal of $X$ from $\mathbf{S}_{ij}$ is supported by the second thinning rule, i.e. no other separator on one side of the edge contains $X$. Then the removal of $X$ from $\mathbf{S}'_{ij}$ and the removal of $X^*$ from $\mathbf{S}'_{i^*j^*}$ are both supported by the second thinning rule due to the inductive assumption on separators.
Consider now a parallel thinning step applied to an invariant edge $(i,j)$, which removes $X$ and $X^*$ from $\mathbf{S}'_{ij}$. Suppose the removal of $X$ from $\mathbf{S}_{ij}$ is supported by the first thinning rule, where the edge is on the path between two leaf nodes $l_1$ and $l_2$ both hosting the family of $X$ and every separator on that path contains $X$. Again, by the inductive assumption on separators, $X^*$ appears in all separators on the path between $l_1^*$ and $l_2^*$, which also includes the invariant edge $(i,j)$. Hence, $X^*$ can also be removed from $\mathbf{S}'_{ij}$ using the first thinning rule. Suppose the removal of $X$ from $\mathbf{S}_{ij}$ is supported by the second thinning rule; then we can apply the second thinning rule to remove $X$ and $X^*$ from $\mathbf{S}'_{ij}$ due to the inductive assumption. ∎
Proof of Theorem 4
We first extend the notion of a twin graph sequence to an $N$-world graph sequence, denoted $G^N_1, \dots, G^N_n$, where $G^N_1$ is the moral graph of the $N$-world network $G^N$, and $G^N_{i+1}$ is the result of eliminating the variables $world(\pi(i))$ from $G^N_i$. Here, in analogy to $twin(\cdot)$, $world(V)$ denotes $\{V^1, \dots, V^N\}$ if $V$ is duplicated in $G^N$ and $\{V\}$ if $V$ is in $\mathbf{R}$. For each variable $V^k$ ($1 \le k \le N$) in $G^N_i$, let $N_i(V^k)$ be the set consisting of $V^k$ and its neighbors in $G^N_i$. For a set of variables $\mathbf{V}$, we use $world(\mathbf{V})$ to denote $\bigcup_{V \in \mathbf{V}} world(V)$.
Lemma 3.
Consider a base network $G$ and its $N$-world network $G^N$. Let $\pi$ be an elimination order for $G$. Then for each variable $V^k$ ($1 \le k \le N$) in $G^N_i$, we have $N_i(V^k) \subseteq world(M_i(V))$.
Proof.
We will prove this by induction on $i$. The statement holds initially for $i = 1$ by the definition of $N$-world networks. Suppose the statement holds for $i$, i.e. $N_i(V^k) \subseteq world(M_i(V))$ for every variable $V^k$; we need to show the statement holds for $i + 1$.
Let $E = \pi(i)$. Suppose we eliminate the variables $world(E)$ from $G^N_i$. $V^k$ is affected by the elimination of $world(E)$ iff $V^k$ is a neighbor of some variable in $world(E)$ in $G^N_i$. Moreover, by induction, $V^k$ is a neighbor of some variable in $world(E)$ in $G^N_i$ only if $V$ is a neighbor of $E$ in $G_i$.
When $V$ is not a neighbor of $E$, then $N_{i+1}(V^k) = N_i(V^k)$ for all $k$ and $M_{i+1}(V) = M_i(V)$, so the statement holds. When $V$ is a neighbor of $E$, for all $k$, we can then bound $N_{i+1}(V^k)$ as follows:

$N_{i+1}(V^k) \subseteq (N_i(V^k) \cup \bigcup_{W \in world(E)} N_i(W)) \setminus world(E)$  (eliminating $world(E)$ from $G^N_i$)
$\subseteq (world(M_i(V)) \cup world(M_i(E))) \setminus world(E)$  (by induction hypothesis)
$= world((M_i(V) \cup M_i(E)) \setminus \{E\}) = world(M_{i+1}(V))$. ∎
Lemma 4.
Let $X = \pi(i)$ be the variable eliminated at step $i$ of an $N$-world elimination and let $k \le N$ be a positive integer. Then the cluster of $X^k$ satisfies $C^N(X^k) \subseteq world(C(X))$.
Proof.
Let $X = \pi(i)$. By Lemma 3, $N_i(X^k) \subseteq world(M_i(X))$ for all $k$. We next bound the clusters $C^N(X^k)$, where the variables $X^1, \dots, X^N$ are eliminated in sequence, by induction on $k$. For each $k$, assume $C^N(X^m) \subseteq world(M_i(X))$ for all $m < k$; then eliminating $X^m$ only connects variables inside $world(M_i(X))$, so $C^N(X^k) \subseteq world(M_i(X)) = world(C(X))$. ∎
Proof of Theorem 4.
By Lemma 4, for all $i$ and $j$, the cluster $C^j_i$ of the $N$-world elimination order $\pi^N$ satisfies $|C^j_i| \le N \cdot (w + 1)$, where $w$ is the width of the base order $\pi$. Therefore, the width of $\pi^N$ is at most $N \cdot (w + 1) - 1$. Taking $\pi$ to be an optimal elimination order for the base network yields $treewidth(G^N) \le N \cdot (treewidth(G) + 1) - 1$.
For causal treewidth, we can extend Algorithm 1 to construct $N$-world jointrees by making $N$ duplicates of each duplicated subtree, and construct $N$-world thinned jointrees by extending Theorem 3. By analogous arguments, we can show that the causal treewidth of $G^N$ is at most $N \cdot (w_c + 1) - 1$, where $w_c$ is the causal treewidth of $G$. ∎
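The bound can be checked numerically. The sketch below (using `eliminate` from the earlier sketch; the chain example is hypothetical) computes the width of an elimination order, i.e., the maximum cluster size minus one.

```python
def width(graph, order):
    """Induced width of an elimination order over a moral graph."""
    g, w = graph, 0
    for X in order:
        w = max(w, len(g[X]))  # cluster size minus one = number of neighbors
        g = eliminate(g, X)
    return w

# toy moral graph of a chain A - B - C
base = {'A': {'B'}, 'B': {'A', 'C'}, 'C': {'B'}}
w = width(base, ['A', 'B', 'C'])  # w == 1
# for the N-world order, the proof above guarantees
# width(n_world_graph, n_world_order) <= N * (w + 1) - 1
```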


Appendix B Tightness of Bounds
As we claimed in the paper, our corollaries provide tight bounds for the widths of twin elimination orders (as defined in the paper) and of twin jointrees constructed by Algorithm 1, respectively. Moreover, we provide a concrete example where the treewidth of the twin network is exactly twice the treewidth of the base network plus one, which suggests that our bound on treewidth may also be tight (we found these examples through exhaustive search, which we were able to do on small examples only, given the allotted computational resources).
For the tightness of the bound on twin elimination orders, consider the base network shown in Figure 7(a) together with an elimination order for it of width $w$. The twin network is shown in Figure 7(b). The corresponding twin elimination order has a width of $2w + 1$, matching the bound.
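The exhaustive search mentioned above can be realized by brute force over all elimination orders, which is feasible only for very small networks (the sketch relies on `width` and `eliminate` from the earlier sketches).

```python
from itertools import permutations

def treewidth_exact(graph):
    """Exact treewidth by minimizing induced width over all n! orders."""
    return min(width(graph, order) for order in permutations(graph))
```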
Appendix C Additional Figures
Figure 10(b) shows a thinned jointree for the base network in Figure 10(a). Figure 10(c) shows the corresponding thinned twin jointree constructed by Algorithm 1, annotated with the thinned separators (and clusters) as given by Theorem 3. The jointree width is preserved in this case.


Appendix D Generalized $N$-World Networks
We show that the bound on treewidth also applies to a more general version of $N$-world networks, called generalized $N$-world networks, which allow any selected subset of nodes to be duplicated and also allow certain edges between duplicates of the same variable.
Definition 4.
Let $G$ be a base network and let $\mathbf{Q}$ be any subset of its nodes. A generalized $N$-world network is constructed as follows. We replace each node $X \in \mathbf{Q}$ with $N$ duplicates denoted $X^1, \ldots, X^N$. For each parent $P$ of $X$, if $P$ is not in $\mathbf{Q}$, then make $P$ a parent of each $X^j$; otherwise, make $P^j$ a parent of $X^j$ for each $1 \le j \le N$. Moreover, for each pair of duplicates $X^i$ and $X^j$ where $i < j$, an edge may be added from $X^i$ to $X^j$.
Figure 11(b) shows a generalized $N$-world network for the base network in Figure 11(a). The notion of generalized $N$-world networks is significant as it covers a vast subclass of dynamic Bayesian networks, which are widely used for temporal reasoning. The construction is sketched below.
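A minimal Python sketch of this construction follows; it assumes DAGs are dicts mapping each node to the list of its parents, that nodes outside $\mathbf{Q}$ have no parents in $\mathbf{Q}$ (as in twin networks, where only roots are shared), and, for simplicity, that the same inter-duplicate edges `dup_edges` are added for every duplicated variable.

```python
def generalized_n_world(dag, Q, N, dup_edges=()):
    """Build a generalized N-world network per the construction above.
    dup_edges: pairs (i, j) with i < j, adding X^i -> X^j for each X in Q."""
    out = {}
    for X, parents in dag.items():
        if X not in Q:
            assert all(P not in Q for P in parents)  # sketch assumption
            out[X] = list(parents)
            continue
        for j in range(1, N + 1):
            ps = [f"{P}^{j}" if P in Q else P for P in parents]
            ps += [f"{X}^{i}" for (i, k) in dup_edges if k == j]
            out[f"{X}^{j}"] = ps
    return out

# hypothetical base DAG U -> A -> B with Q = {A, B}, N = 3,
# plus an edge between the first and second duplicates of each node in Q
print(generalized_n_world({'U': [], 'A': ['U'], 'B': ['A']},
                          {'A', 'B'}, 3, dup_edges=[(1, 2)]))
```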
The following result shows that our bound still applies to generalized $N$-world networks. It is worth mentioning that the bound also applies to any subgraph of a generalized $N$-world network, as the treewidth of a subgraph is no greater than the treewidth of the original network.
Theorem 5.
Let $w$ and $w^N$ be the treewidths of a base network and its generalized $N$-world network, respectively. Then $w^N \le N \cdot (w + 1) - 1$.
Proof.
The proof is similar to the one for Theorem 4. We are assuming here that the duplicates of each node $X \in \mathbf{Q}$ are $X^1, \ldots, X^N$. We maintain the following invariant: for each $i$ and each node $X^j$ ($1 \le j \le N$) in $G^N_i$, $\mathbf{A}^N_i(X^j) \subseteq \mathbf{A}_i(X)^1 \cup \cdots \cup \mathbf{A}_i(X)^N$. To show the invariant holds initially, observe that the construction of a generalized $N$-world network involves two steps: duplicating the nodes in $\mathbf{Q}$ and adding edges between the duplicates. After the first step, the invariant holds since an edge is added between two nodes in $G^N_1$ only if the corresponding parent-child relationship exists in the base network. We next show that the invariant still holds after the second step. Each time we add an edge from one duplicate $X^i$ to another duplicate $X^j$ ($i < j$), $X^i$ becomes a neighbor of $X^j$ and of the parents of $X^j$ in the moral graph. Since $X^i$, $X^j$, and the parents of $X^j$ all belong to the set $\mathbf{A}_1(X)^1 \cup \cdots \cup \mathbf{A}_1(X)^N$, the sets $\mathbf{A}^N_1(X^i)$ and $\mathbf{A}^N_1(X^j)$ are still subsets of $\mathbf{A}_1(X)^1 \cup \cdots \cup \mathbf{A}_1(X)^N$ after the edge is added. Similarly, for each parent $P$ of $X^j$, since $X^i$ already belongs to the set bounding the neighborhood of $P$ (as $X$ belongs to $\mathbf{A}_1$ of the corresponding base node), that neighborhood remains appropriately bounded after the edge is added. It follows inductively that the invariant still holds after all edges are added.
We next show that the invariant holds after each elimination step. This can be done with the same inductive argument as in Lemma 3. Suppose the invariant holds in $G^N_i$ and we want to eliminate $X = X_i$; we consider how the eliminations affect the neighbors of each node in $G^N_i$. Suppose $X$ is not in $\mathbf{Q}$; then none of the duplicates $X^1, \ldots, X^N$ are in $G^N_i$, and eliminating the single copy of $X$ affects its neighbors just as in the base graph, so the invariant is preserved by the inductive assumption. Suppose $X$ is in $\mathbf{Q}$; then we eliminate $X^1, \ldots, X^N$ from $G^N_i$, and the same bound as in Lemma 3 applies: for each affected node $Y^j$, $\mathbf{A}^N_{i+1}(Y^j) \subseteq (\mathbf{A}^N_i(Y^j) \cup \mathbf{A}^N_i(X^1) \cup \cdots \cup \mathbf{A}^N_i(X^N)) \setminus \{X^1, \ldots, X^N\} \subseteq \mathbf{A}_{i+1}(Y)^1 \cup \cdots \cup \mathbf{A}_{i+1}(Y)^N$. The inductive assumption holds in both cases.
We finally show that the cluster formed when eliminating each $X^j$ ($1 \le j \le N$), denoted $C^j_i$, is a subset of $\mathbf{A}_i(X)^1 \cup \cdots \cup \mathbf{A}_i(X)^N$, where $X = X_i$. This is similar to Lemma 4. For each $j$, by the definition of variable elimination, the cluster is bounded as $C^j_i \subseteq \mathbf{A}^N_i(X^j) \cup C^1_i \cup \cdots \cup C^{j-1}_i \subseteq \mathbf{A}_i(X)^1 \cup \cdots \cup \mathbf{A}_i(X)^N$, which has size at most $N \cdot (w + 1)$. We can then conclude the bound on widths and treewidths. ∎
Appendix E More on Experiments
Our experiments were run on a 2.60GHz Intel Xeon E5-2670 CPU with 256GB of memory.
In the main paper, we plotted the jointree widths for random networks with a maximum number of parents (3, 5, or 7; see the tables below). Here we show the complete experimental results for these random networks, generated according to the method of Darwiche [2020] which we discussed in the main paper. We record the widths and normalized widths for each twin jointree. Recall that if a jointree has clusters $C_1, \ldots, C_n$, then the normalized width is $\log_2 \sum_i 2^{|C_i|}$. Table 1 shows the complete results for rNET and Table 2 shows the complete results for rSCM.
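Under this definition, both statistics are easy to compute from a jointree's clusters; the sketch below assumes binary variables.

```python
import math

def jointree_stats(clusters):
    """Width and normalized width of a jointree, given its clusters."""
    wd = max(len(C) for C in clusters) - 1
    nwd = math.log2(sum(2 ** len(C) for C in clusters))
    return wd, nwd

print(jointree_stats([{'A', 'B'}, {'B', 'C'}, {'C', 'D', 'E'}]))  # (2, 4.0)
```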
In addition, we report results on random networks generated by a second method, proposed in Ide and Cozman [2002]. Given a number of nodes and a maximum degree for each node (the degree of a node is the number of its parents and children), the method generates a random network by repeatedly adding/removing random edges from the current DAG. We refer to these random networks as rNET2. Similarly to what we did for rNET, we then added a unique root as parent for each internal variable in rNET2; we refer to these modified networks as rSCM2. For each combination of the number of nodes and the maximum degree, we generated random base networks and report the average widths and normalized widths for each twin jointree (a sketch of the generation scheme appears below). Table 3 shows the complete results for rNET2 and Table 4 shows the complete results for rSCM2. The patterns in these tables resemble the ones for rNET and rSCM.
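The following is a loose sketch of this generation scheme (the exact procedure of Ide and Cozman [2002] differs in details, e.g., it maintains connectedness; parameter names are ours). It also includes the rSCM2-style step of adding a unique root per internal variable.

```python
import random

def random_dag(n, max_degree, steps=10_000, seed=0):
    """Start from a chain DAG and repeatedly toggle random edges,
    rejecting additions that create a cycle or exceed max_degree."""
    rng = random.Random(seed)
    parents = {i: {i - 1} if i else set() for i in range(n)}  # chain DAG

    def degree(v):
        return len(parents[v]) + sum(v in ps for ps in parents.values())

    def creates_cycle(u, v):  # would adding u -> v close a directed cycle?
        stack, seen = [u], set()
        while stack:
            x = stack.pop()
            if x == v:
                return True
            if x not in seen:
                seen.add(x)
                stack += list(parents[x])  # walk towards ancestors
        return False

    for _ in range(steps):
        u, v = rng.sample(range(n), 2)
        if u in parents[v]:
            parents[v].discard(u)  # remove an existing edge
        elif (degree(u) < max_degree and degree(v) < max_degree
              and not creates_cycle(u, v)):
            parents[v].add(u)      # add edge u -> v
    return parents

def add_roots(parents):
    """rSCM-style step: a fresh exogenous root per internal variable."""
    out = {v: set(ps) for v, ps in parents.items()}
    for v in list(out):
        if out[v]:
            out[f"U_{v}"] = set()
            out[v].add(f"U_{v}")
    return out

net = add_roots(random_dag(50, max_degree=5))
```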
Table 1: Complete results for rNET (wd = width, nwd = normalized width).

num vars | max pars | stats | BASE-MF wd | BASE-MF nwd | TWIN-ALG1 wd | TWIN-ALG1 nwd | TWIN-MF wd | TWIN-MF nwd | BASE-MF-RLS wd | BASE-MF-RLS nwd | TWIN-THM3 wd | TWIN-THM3 nwd | TWIN-MF-RLS wd | TWIN-MF-RLS nwd
50 | 3 | mean | 7.5 | 11.5 | 14.4 | 17.1 | 10.0 | 14.0 | 5.2 | 11.9 | 7.6 | 13.2 | 5.6 | 13.0
50 | 3 | std | 1.4 | 0.9 | 2.8 | 2.4 | 2.0 | 1.4 | 1.0 | 0.7 | 1.2 | 0.7 | 1.2 | 0.7
50 | 5 | mean | 14.3 | 17.4 | 26.8 | 29.2 | 15.9 | 19.6 | 7.2 | 15.0 | 10.0 | 16.2 | 7.3 | 16.0
50 | 5 | std | 2.0 | 1.7 | 4.0 | 3.8 | 1.9 | 1.7 | 1.2 | 0.8 | 2.1 | 0.9 | 1.1 | 0.8
50 | 7 | mean | 19.5 | 22.4 | 36.9 | 39.1 | 20.4 | 24.1 | 8.9 | 17.1 | 12.2 | 18.3 | 8.9 | 18.1
50 | 7 | std | 2.0 | 1.9 | 4.1 | 3.9 | 1.8 | 1.7 | 0.8 | 0.7 | 2.4 | 0.8 | 0.9 | 0.7
75 | 3 | mean | 10.2 | 13.8 | 18.9 | 21.5 | 14.5 | 17.9 | 6.1 | 12.9 | 8.6 | 14.3 | 6.9 | 14.0
75 | 3 | std | 1.9 | 1.5 | 3.6 | 3.4 | 2.4 | 2.0 | 1.6 | 0.8 | 1.9 | 0.9 | 1.9 | 0.8
75 | 5 | mean | 21.3 | 23.9 | 39.1 | 41.1 | 23.9 | 27.2 | 8.4 | 16.4 | 11.9 | 17.8 | 9.1 | 17.5
75 | 5 | std | 2.3 | 2.2 | 5.1 | 4.9 | 3.0 | 2.7 | 1.6 | 0.8 | 2.2 | 1.0 | 1.8 | 0.8
75 | 7 | mean | 28.8 | 31.4 | 54.0 | 56.0 | 30.3 | 33.7 | 11.0 | 18.6 | 16.4 | 21.2 | 11.7 | 19.7
75 | 7 | std | 2.6 | 2.5 | 5.2 | 5.2 | 2.2 | 2.1 | 1.9 | 0.9 | 4.1 | 2.4 | 2.0 | 0.9
100 | 3 | mean | 13.7 | 16.7 | 25.1 | 27.5 | 19.9 | 22.8 | 7.1 | 13.8 | 10.2 | 15.3 | 8.3 | 14.9
100 | 3 | std | 2.4 | 2.1 | 4.7 | 4.4 | 3.3 | 2.8 | 1.7 | 0.8 | 2.2 | 1.2 | 1.9 | 0.9
100 | 5 | mean | 27.1 | 29.7 | 50.8 | 52.9 | 31.4 | 34.5 | 9.5 | 17.1 | 13.5 | 18.9 | 11.2 | 18.4
100 | 5 | std | 2.5 | 2.5 | 5.1 | 5.0 | 3.6 | 3.2 | 2.2 | 0.6 | 2.9 | 1.5 | 3.0 | 0.9
100 | 7 | mean | 38.7 | 40.9 | 72.9 | 74.7 | 40.8 | 43.8 | 14.5 | 20.3 | 22.9 | 26.6 | 15.9 | 21.9
100 | 7 | std | 3.5 | 3.4 | 6.7 | 6.5 | 3.3 | 3.2 | 3.3 | 1.5 | 7.1 | 5.8 | 3.4 | 1.8
125 | 3 | mean | 16.7 | 19.6 | 30.8 | 33.0 | 26.9 | 29.3 | 7.9 | 14.6 | 10.5 | 16.1 | 9.2 | 15.9
125 | 3 | std | 2.1 | 1.9 | 4.0 | 3.8 | 3.9 | 3.6 | 2.0 | 0.8 | 2.3 | 1.2 | 2.3 | 1.0
125 | 5 | mean | 34.9 | 37.3 | 65.1 | 66.9 | 39.1 | 42.1 | 12.1 | 18.3 | 17.2 | 21.4 | 14.8 | 20.3
125 | 5 | std | 3.1 | 2.9 | 5.9 | 5.7 | 3.6 | 3.2 | 2.4 | 1.0 | 3.8 | 2.5 | 3.6 | 1.9
125 | 7 | mean | 47.9 | 50.1 | 89.8 | 91.7 | 50.6 | 53.5 | 19.7 | 23.7 | 32.5 | 35.1 | 21.7 | 26.1
125 | 7 | std | 3.2 | 3.1 | 6.8 | 6.7 | 3.8 | 3.6 | 4.2 | 3.3 | 8.7 | 8.3 | 4.5 | 3.7
150 | 3 | mean | 20.3 | 22.9 | 36.9 | 38.9 | 31.8 | 34.2 | 8.6 | 15.2 | 11.1 | 16.7 | 9.8 | 16.4
150 | 3 | std | 2.4 | 2.3 | 4.7 | 4.7 | 4.2 | 3.9 | 2.1 | 0.8 | 2.2 | 1.1 | 2.5 | 1.1
150 | 5 | mean | 41.4 | 43.6 | 76.9 | 78.6 | 49.1 | 51.7 | 14.6 | 19.7 | 21.2 | 24.7 | 18.6 | 22.8
150 | 5 | std | 3.3 | 3.1 | 7.4 | 7.2 | 6.3 | 5.8 | 2.9 | 1.8 | 5.3 | 4.5 | 4.0 | 3.2
150 | 7 | mean | 56.7 | 58.7 | 106.7 | 108.2 | 60.6 | 63.3 | 22.5 | 26.1 | 36.8 | 39.3 | 25.6 | 29.4
150 | 7 | std | 4.1 | 3.8 | 8.1 | 7.9 | 3.6 | 3.5 | 4.8 | 3.9 | 9.6 | 9.1 | 5.1 | 4.5
200 | 3 | mean | 25.8 | 28.2 | 47.2 | 49.1 | 42.9 | 44.8 | 10.7 | 16.4 | 13.6 | 18.3 | 12.3 | 17.9
200 | 3 | std | 2.5 | 2.5 | 4.9 | 4.9 | 5.4 | 5.2 | 2.9 | 1.5 | 2.9 | 1.5 | 3.2 | 1.9
200 | 5 | mean | 55.1 | 57.2 | 102.4 | 104.0 | 65.7 | 68.0 | 19.9 | 23.8 | 29.5 | 32.3 | 26.0 | 29.4
200 | 5 | std | 3.6 | 3.5 | 7.8 | 7.6 | 7.0 | 6.8 | 4.2 | 3.3 | 8.1 | 7.6 | 6.1 | 5.3
200 | 7 | mean | 75.3 | 77.4 | 141.5 | 143.1 | 80.4 | 82.7 | 34.3 | 37.1 | 57.5 | 59.8 | 39.0 | 42.3
200 | 7 | std | 4.1 | 3.9 | 8.4 | 8.2 | 4.3 | 4.1 | 4.8 | 4.7 | 10.2 | 10.2 | 5.1 | 4.8
250 | 3 | mean | 31.6 | 34.0 | 58.1 | 60.0 | 55.7 | 57.4 | 12.1 | 17.4 | 15.0 | 19.5 | 14.1 | 19.2
250 | 3 | std | 3.1 | 2.9 | 6.1 | 6.0 | 6.4 | 6.2 | 2.8 | 1.5 | 3.1 | 2.1 | 3.2 | 2.0
250 | 5 | mean | 68.6 | 70.5 | 128.5 | 129.9 | 84.9 | 86.8 | 25.8 | 28.9 | 38.9 | 41.3 | 36.1 | 38.9
250 | 5 | std | 4.1 | 3.8 | 7.9 | 7.7 | 9.9 | 9.6 | 4.7 | 4.4 | 9.4 | 9.2 | 7.5 | 7.1
250 | 7 | mean | 95.6 | 97.4 | 180.3 | 181.7 | 102.7 | 104.7 | 44.2 | 46.7 | 74.6 | 76.6 | 49.5 | 52.7
250 | 7 | std | 4.9 | 4.7 | 10.1 | 9.9 | 5.5 | 5.3 | 6.5 | 6.3 | 13.2 | 13.1 | 6.8 | 6.7
300 | 3 | mean | 40.1 | 42.3 | 73.9 | 75.6 | 69.9 | 71.5 | 13.9 | 18.4 | 17.4 | 21.2 | 16.3 | 20.5
300 | 3 | std | 3.9 | 3.6 | 7.6 | 7.3 | 8.0 | 7.9 | 2.7 | 1.8 | 3.9 | 3.3 | 3.4 | 2.7
300 | 5 | mean | 82.4 | 84.1 | 153.3 | 154.5 | 104.8 | 106.4 | 32.8 | 35.7 | 50.1 | 52.3 | 46.1 | 48.6
300 | 5 | std | 4.7 | 4.5 | 9.7 | 9.6 | 14.6 | 14.3 | 5.4 | 5.3 | 10.4 | 10.2 | 8.8 | 8.5
300 | 7 | mean | 114.3 | 116.0 | 215.9 | 217.1 | 123.4 | 125.2 | 55.5 | 58.0 | 95.2 | 97.1 | 62.2 | 65.3
300 | 7 | std | 4.1 | 4.0 | 8.4 | 8.4 | 5.8 | 5.6 | 7.2 | 7.2 | 14.3 | 14.2 | 7.5 | 7.3
Table 2: Complete results for rSCM (wd = width, nwd = normalized width).

num vars | max pars | stats | BASE-MF wd | BASE-MF nwd | TWIN-ALG1 wd | TWIN-ALG1 nwd | TWIN-MF wd | TWIN-MF nwd | BASE-MF-RLS wd | BASE-MF-RLS nwd | TWIN-THM3 wd | TWIN-THM3 nwd | TWIN-MF-RLS wd | TWIN-MF-RLS nwd
50 | 3 | mean | 7.5 | 11.9 | 14.4 | 17.4 | 14.0 | 17.1 | 11.1 | 15.6 | 14.7 | 18.3 | 14.2 | 18.3
50 | 3 | std | 1.4 | 0.7 | 2.8 | 2.2 | 2.7 | 2.3 | 2.5 | 2.0 | 3.0 | 2.3 | 2.6 | 2.2
50 | 5 | mean | 14.3 | 17.4 | 26.9 | 29.3 | 26.4 | 28.9 | 19.0 | 22.7 | 23.5 | 26.5 | 22.9 | 26.2
50 | 5 | std | 2.0 | 1.7 | 4.0 | 3.8 | 3.9 | 3.7 | 2.5 | 2.2 | 3.3 | 2.9 | 2.8 | 2.4
50 | 7 | mean | 19.5 | 22.4 | 37.0 | 39.3 | 37.0 | 39.1 | 24.3 | 27.7 | 31.0 | 33.5 | 29.4 | 32.4
50 | 7 | std | 2.0 | 1.9 | 4.1 | 3.9 | 4.2 | 4.0 | 2.6 | 2.2 | 3.5 | 3.2 | 3.4 | 2.9
75 | 3 | mean | 10.2 | 14.0 | 18.9 | 21.6 | 18.5 | 21.3 | 14.9 | 18.5 | 19.1 | 22.1 | 18.1 | 21.5
75 | 3 | std | 1.9 | 1.3 | 3.6 | 3.4 | 3.6 | 3.4 | 3.2 | 2.8 | 4.1 | 3.6 | 3.7 | 3.2
75 | 5 | mean | 21.3 | 23.9 | 39.1 | 41.1 | 38.5 | 40.7 | 26.9 | 29.7 | 33.5 | 35.8 | 31.8 | 34.5
75 | 5 | std | 2.3 | 2.2 | 5.1 | 4.9 | 5.1 | 4.9 | 2.7 | 2.5 | 3.8 | 3.6 | 3.7 | 3.3
75 | 7 | mean | 28.8 | 31.4 | 54.1 | 56.2 | 53.9 | 55.9 | 34.5 | 36.9 | 44.9 | 46.9 | 42.3 | 44.5
75 | 7 | std | 2.6 | 2.5 | 5.3 | 5.3 | 5.2 | 5.1 | 3.4 | 3.1 | 5.5 | 5.2 | 4.9 | 4.6
100 | 3 | mean | 13.7 | 16.8 | 25.1 | 27.5 | 24.4 | 26.9 | 19.5 | 22.4 | 24.9 | 27.1 | 23.1 | 25.9
100 | 3 | std | 2.4 | 2.0 | 4.7 | 4.4 | 4.7 | 4.4 | 3.6 | 3.2 | 4.6 | 4.2 | 4.0 | 3.8
100 | 5 | mean | 27.1 | 29.7 | 50.8 | 52.9 | 50.8 | 52.8 | 34.6 | 37.0 | 43.9 | 45.8 | 40.5 | 42.8
100 | 5 | std | 2.5 | 2.5 | 5.4 | 5.2 | 5.1 | 5.0 | 3.9 | 3.7 | 5.4 | 5.2 | 4.8 | 4.5
100 | 7 | mean | 38.7 | 40.9 | 73.0 | 74.8 | 72.8 | 74.5 | 45.8 | 47.8 | 61.3 | 62.9 | 57.4 | 59.4
100 | 7 | std | 3.5 | 3.4 | 6.7 | 6.5 | 6.8 | 6.7 | 4.7 | 4.5 | 7.9 | 7.7 | 7.4 | 7.1
125 | 3 | mean | 16.7 | 19.6 | 30.8 | 33.0 | 30.5 | 32.7 | 23.8 | 26.5 | 29.5 | 31.8 | 28.0 | 30.6
125 | 3 | std | 2.1 | 1.8 | 4.0 | 3.8 | 3.7 | 3.5 | 2.8 | 2.6 | 4.0 | 3.6 | 3.6 | 3.3
125 | 5 | mean | 34.9 | 37.3 | 65.1 | 66.9 | 64.9 | 66.6 | 44.4 | 46.5 | 56.5 | 58.1 | 51.8 | 54.0
125 | 5 | std | 3.1 | 2.9 | 5.9 | 5.7 | 6.1 | 6.0 | 5.0 | 4.7 | 7.0 | 6.7 | 5.6 | 5.4
125 | 7 | mean | 47.9 | 50.1 | 90.1 | 91.9 | 89.9 | 91.6 | 55.4 | 57.5 | 76.1 | 77.6 | 71.2 | 73.2
125 | 7 | std | 3.2 | 3.1 | 6.7 | 6.6 | 6.6 | 6.4 | 3.9 | 3.7 | 7.3 | 7.0 | 6.5 | 6.2
150 | 3 | mean | 20.3 | 22.9 | 36.9 | 38.9 | 36.2 | 38.2 | 27.3 | 30.0 | 34.0 | 36.1 | 31.9 | 34.3
150 | 3 | std | 2.4 | 2.3 | 4.7 | 4.7 | 4.4 | 4.3 | 3.5 | 3.3 | 4.4 | 4.2 | 4.5 | 4.2
150 | 5 | mean | 41.4 | 43.6 | 76.9 | 78.6 | 76.3 | 77.9 | 50.6 | 52.5 | 65.3 | 66.8 | 60.6 | 62.5
150 | 5 | std | 3.3 | 3.1 | 7.4 | 7.2 | 6.9 | 6.9 | 4.3 | 4.2 | 7.3 | 7.1 | 6.5 | 6.2
150 | 7 | mean | 56.7 | 58.7 | 106.7 | 108.2 | 106.0 | 107.5 | 65.7 | 67.4 | 91.1 | 92.4 | 83.2 | 85.0
150 | 7 | std | 4.1 | 3.8 | 8.1 | 7.9 | 7.5 | 7.4 | 4.5 | 4.4 | 8.0 | 7.9 | 7.0 | 6.9
200 | 3 | mean | 25.8 | 28.3 | 47.2 | 49.1 | 46.2 | 48.2 | 35.5 | 37.6 | 44.2 | 45.9 | 40.7 | 42.8
200 | 3 | std | 2.5 | 2.5 | 5.0 | 4.9 | 4.9 | 4.9 | 4.5 | 4.3 | 6.4 | 6.1 | 5.6 | 5.3
200 | 5 | mean | 55.0 | 57.2 | 102.5 | 104.1 | 102.0 | 103.5 | 67.7 | 69.4 | 88.7 | 90.0 | 82.6 | 84.1
200 | 5 | std | 3.6 | 3.5 | 7.3 | 7.2 | 7.6 | 7.5 | 5.5 | 5.4 | 8.8 | 8.6 | 7.8 | 7.6
200 | 7 | mean | 75.3 | 77.4 | 141.5 | 143.1 | 75.3 | 77.4 | 87.0 | 88.6 | 124.6 | 125.8 | 115.2 | 116.7
200 | 7 | std | 4.1 | 3.9 | 8.4 | 8.2 | 4.1 | 3.9 | 4.6 | 4.4 | 9.2 | 9.2 | 8.4 | 8.3
250 | 3 | mean | 31.6 | 34.0 | 58.1 | 60.0 | 57.5 | 59.4 | 43.8 | 45.8 | 54.3 | 55.9 | 49.8 | 51.9
250 | 3 | std | 3.1 | 3.0 | 6.0 | 6.0 | 6.6 | 6.5 | 5.2 | 4.9 | 7.7 | 7.4 | 6.3 | 6.0
250 | 5 | mean | 68.6 | 70.5 | 128.5 | 129.9 | 127.1 | 128.5 | 82.1 | 83.6 | 110.3 | 111.5 | 101.7 | 103.2
250 | 5 | std | 4.1 | 3.8 | 7.9 | 7.7 | 7.2 | 7.1 | 5.6 | 5.5 | 8.9 | 8.8 | 7.2 | 7.1
250 | 7 | mean | 95.7 | 97.4 | 180.4 | 181.7 | 179.3 | 180.7 | 109.1 | 110.4 | 157.5 | 158.7 | 146.6 | 147.9
250 | 7 | std | 4.9 | 4.7 | 10.1 | 9.9 | 8.9 | 8.8 | 6.0 | 5.9 | 12.2 | 12.2 | 10.9 | 10.8
300 | 3 | mean | 40.1 | 42.3 | 73.9 | 75.6 | 72.4 | 74.1 | 54.1 | 55.8 | 66.7 | 68.1 | 61.8 | 63.6
300 | 3 | std | 3.9 | 3.6 | 7.6 | 7.3 | 7.5 | 7.4 | 5.1 | 5.1 | 6.8 | 6.7 | 7.1 | 7.0
300 | 5 | mean | 82.5 | 84.1 | 153.3 | 154.5 | 152.7 | 154.0 | 99.1 | 100.4 | 135.0 | 136.2 | 124.3 | 125.8
300 | 5 | std | 4.7 | 4.5 | 9.7 | 9.6 | 10.4 | 10.3 | 6.3 | 6.2 | 10.3 | 10.2 | 9.9 | 9.8
300 | 7 | mean | 114.4 | 116.0 | 215.9 | 217.2 | 215.1 | 216.3 | 131.5 | 132.7 | 194.1 | 195.2 | 178.8 | 180.1
300 | 7 | std | 4.1 | 4.0 | 8.4 | 8.3 | 9.1 | 9.1 | 5.7 | 5.6 | 12.8 | 12.7 | 11.2 | 11.1
Table 3: Complete results for rNET2 (wd = width, nwd = normalized width).

num vars | max degree | stats | BASE-MF wd | BASE-MF nwd | TWIN-ALG1 wd | TWIN-ALG1 nwd | TWIN-MF wd | TWIN-MF nwd | BASE-MF-RLS wd | BASE-MF-RLS nwd | TWIN-THM3 wd | TWIN-THM3 nwd | TWIN-MF-RLS wd | TWIN-MF-RLS nwd
50 | 5 | mean | 20.0 | 22.4 | 37.3 | 39.0 | 20.3 | 23.6 | 8.7 | 14.5 | 13.5 | 17.1 | 8.7 | 15.5
50 | 5 | std | 1.3 | 1.1 | 3.3 | 3.1 | 1.3 | 1.1 | 1.4 | 0.6 | 2.9 | 1.8 | 1.4 | 0.7
50 | 10 | mean | 32.9 | 35.4 | 55.5 | 57.7 | 32.9 | 36.4 | 13.7 | 18.6 | 21.6 | 24.7 | 13.7 | 19.6
50 | 10 | std | 1.0 | 0.9 | 11.9 | 11.3 | 1.0 | 0.9 | 1.8 | 1.0 | 6.0 | 4.5 | 1.8 | 1.0
50 | 15 | mean | 38.1 | 40.7 | 62.5 | 64.9 | 38.1 | 41.7 | 19.7 | 23.4 | 30.6 | 33.1 | 19.7 | 24.4
50 | 15 | std | 1.1 | 0.9 | 15.6 | 14.8 | 1.1 | 0.9 | 2.1 | 1.6 | 8.3 | 7.2 | 2.1 | 1.6
100 | 5 | mean | 40.0 | 41.8 | 76.7 | 78.1 | 40.1 | 42.7 | 14.5 | 18.4 | 23.5 | 25.8 | 14.6 | 19.4
100 | 5 | std | 1.2 | 1.4 | 6.1 | 5.9 | 1.7 | 1.5 | 2.1 | 1.6 | 5.0 | 4.6 | 2.1 | 1.6
100 | 10 | mean | 65.1 | 67.4 | 110.6 | 112.7 | 65.5 | 68.6 | 33.0 | 35.9 | 52.3 | 54.5 | 33.0 | 36.9
100 | 10 | std | 1.7 | 1.5 | 25.1 | 24.3 | 1.8 | 1.5 | 2.5 | 2.4 | 11.8 | 11.2 | 2.5 | 2.4
100 | 15 | mean | 75.1 | 77.5 | 114.7 | 117.1 | 75.2 | 78.5 | 45.3 | 48.0 | 66.0 | 68.5 | 45.3 | 49.0
100 | 15 | std | 1.8 | 1.6 | 33.0 | 32.2 | 1.8 | 1.6 | 3.1 | 3.1 | 18.7 | 17.7 | 3.1 | 3.1
150 | 5 | mean | 59.7 | 61.2 | 110.6 | 112.0 | 60.3 | 62.4 | 21.5 | 24.5 | 36.6 | 38.6 | 21.5 | 25.5
150 | 5 | std | 2.6 | 2.3 | 17.2 | 16.8 | 2.5 | 2.3 | 2.5 | 2.3 | 7.8 | 7.4 | 2.5 | 2.3
150 | 10 | mean | 97.1 | 99.0 | 159.6 | 161.7 | 97.3 | 100.1 | 49.8 | 52.5 | 80.9 | 83.1 | 49.8 | 53.5
150 | 10 | std | 2.0 | 1.8 | 37.0 | 36.7 | 2.0 | 1.8 | 3.6 | 3.6 | 20.1 | 19.3 | 3.6 | 3.6
150 | 15 | mean | 109.7 | 111.8 | 168.9 | 171.3 | 109.8 | 112.8 | 66.7 | 69.1 | 92.7 | 95.3 | 72.3 | 75.6
150 | 15 | std | 2.5 | 2.2 | 49.8 | 49.2 | 2.5 | 2.3 | 4.2 | 4.2 | 28.4 | 27.6 | 4.2 | 4.2
200 | 3 | mean | 109.7 | 111.8 | 168.9 | 171.3 | 109.8 | 112.8 | 66.7 | 69.1 | 92.7 | 95.3 | 72.3 | 75.6
200 | 3 | std | 3.0 | 2.8 | 23.3 | 22.9 | 2.6 | 2.5 | 3.3 | 3.3 | 10.0 | 9.6 | 3.2 | 3.2
200 | 5 | mean | 127.0 | 128.5 | 193.6 | 195.8 | 127.2 | 129.6 | 65.0 | 67.4 | 97.2 | 99.6 | 65.0 | 68.4
200 | 5 | std | 2.8 | 2.7 | 56.4 | 55.9 | 3.0 | 2.9 | 4.7 | 4.6 | 27.8 | 27.0 | 4.7 | 4.6
200 | 7 | mean | 141.1 | 142.7 | 210.8 | 212.9 | 141.3 | 143.9 | 82.9 | 85.2 | 121.3 | 123.7 | 82.9 | 86.3
200 | 7 | std | 3.2 | 3.1 | 61.0 | 60.6 | 3.3 | 3.1 | 4.8 | 4.7 | 33.9 | 33.2 | 4.8 | 4.7
Table 4: Complete results for rSCM2 (wd = width, nwd = normalized width).

num vars | max degree | stats | BASE-MF wd | BASE-MF nwd | TWIN-ALG1 wd | TWIN-ALG1 nwd | TWIN-MF wd | TWIN-MF nwd | BASE-MF-RLS wd | BASE-MF-RLS nwd | TWIN-THM3 wd | TWIN-THM3 nwd | TWIN-MF-RLS wd | TWIN-MF-RLS nwd
50 | 5 | mean | 20.0 | 22.4 | 36.2 | 39.9 | 37.2 | 39.2 | 27.9 | 30.2 | 39.6 | 41.3 | 37.1 | 39.1
50 | 5 | std | 1.3 | 1.1 | 2.4 | 2.3 | 2.7 | 2.4 | 2.0 | 2.0 | 3.4 | 3.1 | 2.9 | 2.6
50 | 10 | mean | 32.9 | 35.4 | 64.8 | 66.7 | 64.9 | 66.8 | 37.4 | 39.8 | 54.0 | 55.8 | 50.6 | 52.9
50 | 10 | std | 1.0 | 0.9 | 2.4 | 2.3 | 2.5 | 2.5 | 1.5 | 1.2 | 3.1 | 2.9 | 2.3 | 2.3
50 | 15 | mean | 38.1 | 40.7 | 75.3 | 77.4 | 76.4 | 78.5 | 40.8 | 43.4 | 61.1 | 63.2 | 60.3 | 62.5
50 | 15 | std | 1.1 | 0.9 | 2.4 | 2.3 | 3.4 | 3.1 | 1.3 | 1.2 | 3.5 | 3.4 | 3.4 | 3.4
100 | 5 | mean | 40.1 | 41.7 | 78.4 | 79.8 | 77.7 | 79.1 | 54.8 | 56.2 | 80.6 | 81.8 | 74.3 | 75.7
100 | 5 | std | 1.5 | 1.4 | 3.6 | 3.5 | 3.5 | 3.3 | 3.1 | 3.1 | 6.2 | 6.1 | 4.7 | 4.6
100 | 10 | mean | 65.1 | 67.4 | 129.3 | 131.0 | 128.6 | 130.4 | 75.1 | 76.6 | 116.2 | 117.5 | 106.4 | 108.2
100 | 10 | std | 1.7 | 1.5 | 3.6 | 3.4 | 3.9 | 3.6 | 2.5 | 2.3 | 5.6 | 5.4 | 4.1 | 4.0
100 | 15 | mean | 75.1 | 77.5 | 149.6 | 151.5 | 149.3 | 151.2 | 81.2 | 83.1 | 131.0 | 132.5 | 125.7 | 127.8
100 | 15 | std | 1.8 | 1.6 | 3.8 | 3.5 | 3.4 | 3.3 | 2.2 | 2.0 | 5.0 | 4.9 | 4.9 | 4.8
150 | 5 | mean | 59.7 | 61.1 | 118.5 | 119.7 | 118.1 | 119.4 | 83.0 | 84.3 | 125.1 | 126.2 | 113.8 | 115.0
150 | 5 | std | 2.5 | 2.3 | 4.8 | 4.7 | 4.9 | 4.8 | 3.7 | 3.6 | 6.1 | 6.0 | 4.5 | 4.4
150 | 10 | mean | 97.1 | 99.0 | 193.4 | 194.9 | 193.3 | 194.8 | 111.1 | 112.5 | 174.7 | 175.8 | 161.1 | 162.7
150 | 10 | std | 2.0 | 1.8 | 4.2 | 4.0 | 4.2 | 4.0 | 2.3 | 2.2 | 5.5 | 5.5 | 5.0 | 4.9
150 | 15 | mean | 109.7 | 111.8 | 218.8 | 220.4 | 218.8 | 220.4 | 120.3 | 121.9 | 196.6 | 197.9 | 186.0 | 187.7
150 | 15 | std | 2.5 | 2.2 | 5.2 | 5.0 | 5.4 | 5.1 | 2.9 | 2.7 | 6.0 | 5.9 | 5.4 | 5.3
200 | 5 | mean | 75.6 | 76.6 | 150.7 | 151.8 | 150.8 | 151.9 | 102.5 | 103.6 | 155.0 | 156.1 | 143.3 | 144.4
200 | 5 | std | 2.8 | 2.6 | 5.6 | 5.5 | 5.3 | 5.2 | 4.9 | 4.8 | 7.2 | 7.1 | 5.4 | 5.4
200 | 10 | mean | 127.0 | 128.5 | 253.4 | 254.6 | 252.7 | 254.0 | 149.4 | 148.6 | 233.0 | 234.2 | 212.9 | 214.3
200 | 10 | std | 2.8 | 2.7 | 5.5 | 5.5 | 4.5 | 4.4 | 3.4 | 3.4 | 7.7 | 7.7 | 6.2 | 6.1
200 | 15 | mean | 141.1 | 142.7 | 281.7 | 283.0 | 281.5 | 282.8 | 156.3 | 159.6 | 255.6 | 256.7 | 238.6 | 240.2
200 | 15 | std | 3.2 | 3.1 | 6.4 | 6.2 | 6.5 | 6.4 | 4.3 | 4.2 | 9.7 | 9.7 | 7.0 | 7.0