A Semantics for Counterfactuals
in Quantum Causal Models
Abstract
We introduce a formalism for the evaluation of counterfactual queries in the framework of quantum causal models, generalising Pearl’s semantics for counterfactuals in classical causal models [1], thus completing the last rung in the quantum analogue of Pearl’s “ladder of causation”. To this end, we define a suitable extension of Pearl’s notion of a ‘classical structural causal model’, which we denote analogously by ‘quantum structural causal model’, and a corresponding extension of Pearl’s three-step procedure of abduction, action, and prediction. We show that every classical (probabilistic) structural causal model can be extended to a quantum structural causal model, and prove that counterfactual queries that can be formulated within a classical structural causal model agree with their corresponding queries in the quantum extension, but the latter is more expressive. Counterfactuals in quantum causal models come in different forms: we distinguish between active and passive counterfactual queries, depending on whether or not an intervention is to be performed in the action step. This is in contrast to the classical case, where counterfactuals are always interpreted in the active sense. Another distinctive feature of our formalism is that it breaks the connection between causal and counterfactual dependence that exists in the classical case: quantum counterfactuals allow for counterfactual dependence without causal dependence. This distinction between classical and quantum causal models may shed light on how the latter can reproduce quantum correlations that violate Bell inequalities while being faithful to the relativistic causal structure.
1 Introduction
Alternative possibilities are routinely pondered and analyzed in many fields of study, including but not limited to social [2] and public policy [3], psychiatry [4], economics [5], weather and climate change [6], artificial intelligence [7], philosophy and causality [8, 9]. For example, questions involving counterfactuals can have important social and legal implications, such as “Given that the patient has died after treatment, would they have survived had they been given a different treatment?”, or “How many lives could the US have saved had it authorized booster vaccines sooner?” [10].
The status of counterfactual questions also figures centrally in debates about quantum mechanics [11], where results such as Bell’s theorem [12] and the Kochen-Specker theorem [13] have been interpreted as requiring the abandonment of “counterfactual definiteness”[14], encapsulated in Peres’ famous dictum “unperformed experiments have no results” [15]. Could this assertion be used by a lawyer in an argument to dismiss a medical malpractice lawsuit as meaningless? Presumably not. Dismissing all counterfactual questions as meaningless due to quantum theory thus seems too strong. Here, we seek to delineate what counterfactual questions involving quantum systems can be unambiguously answered when unambiguously formulated, and to provide some direction for resolving the ambiguity that is inherent in counterfactual questions that are not so carefully constructed.
The semantics of counterfactuals has a controversial history. In one of the early accounts, David Lewis [16] proposed to evaluate counterfactuals via a similarity analysis of possible worlds, where “a counterfactual ‘If it were that A, then it would be that C’ is (non-vacuously) true if and only if some (accessible) world where both A and C are true is more similar to our actual world, overall, than is any world where A is true but C is false” [16]. This analysis is inevitably vague, as it requires an account of “similarity” among possible worlds, which Lewis attempts to resolve via a system of priorities. The goal is to identify closest worlds as possible worlds in which things are kept more or less the same as in our actual world, except for some ‘minimal changes’, required to make the antecedent of a given counterfactual true.
A recent approach, due to Judea Pearl, proposes to define counterfactuals in terms of a sufficiently well-specified causal model for a given situation, denoted by a (classical) structural causal model [1]. In Pearl’s approach, the ‘minimal changes’ required to make the antecedent of a counterfactual true are conceptualised in terms of an intervention, which breaks the causal connections into the variable being intervened upon while fixing it to the required counterfactual value. Structural causal models feature at the top of a hierarchy of progressively sophisticated models that can answer progressively sophisticated questions, which Pearl has dubbed the “ladder of causation” [17] (see Fig. 1).
As is well known, however, the classical causal model-framework of Pearl fails to reproduce quantum correlations while maintaining faithfulness to the relativistic causal structure—as vividly expressed by Bell’s theorem [12] and recent ‘fine-tuning theorems’ [18, 19, 20]. The program of quantum causal models [21, 22, 23, 24, 25, 26, 27, 28, 29] aims to resolve this tension by extending the classical causal model framework, while maintaining compatibility with relativistic causality. One of the aims of our work is to complete the last rung in the quantum analogue of Pearl’s “ladder of causation”, by proposing a framework to answer counterfactual queries in quantum causal models.
Pearl argues that the levels of interventions and counterfactuals are particularly important for human intelligence and understanding, as they are crucial for our internal modeling of the world and of the effects of our actions. In contrast, he argues that current artificial intelligence (AI) models —however impressive— are still restricted to level 1 of his causal hierarchy [17]. Another motivation of our extension of Pearl’s analysis to the framework of quantum causal models is thus its potential applications for quantum AI.
A key distinction from the classical case is that, due to the indeterminism inherent in quantum causal models, counterfactual queries do not always have truth values (unlike in Lewis’ and Pearl’s accounts). Another difference is that an intervention is not always required in order to make the antecedent of a counterfactual true. This leads to a richer semantics for counterfactuals in the quantum case, which contains Pearl’s classical structural causal model as a special case, as we show.
Finally, an important distinction regards the connection between counterfactual dependence and causal dependence. In Pearl’s account, counterfactual dependence requires causal dependence. Similarly, Lewis [30] proposed an analysis of causal dependence based on his own notion of counterfactual dependence. In contrast, in quantum causal models there can be counterfactual dependence among events without causal dependence. This fact sheds new light on the nature of the compatibility with relativistic causality that is offered by quantum causal models. It can be thought of as a clarification and generalisation of Shimony’s notion of “passion at a distance” [31].
The rest of the paper is organised as follows. In Sec. 2, we review the basic ingredients of Pearl’s ladder of causation (see Fig. 1), as well as his three-step procedure for evaluating counterfactuals based on the notion of (classical) structural causal models. In Sec. 3, we highlight the issues in accommodating quantum theory within this framework, in the light of Bell’s theorem and the assumption of “no fine-tuning” [18, 19]. The framework of quantum causal models aims to resolve this discrepancy. We introduce some key notions and notation of the latter in Sec. 4, which will set the stage for our definition of quantum counterfactuals and their semantics based on a novel notion of quantum structural causal models in Sec. 5. In Sec. 6, we show that Pearl’s classical formalism of counterfactuals naturally embeds into our framework; conversely, in Sec. 7 we elaborate on how our framework generalizes Pearl’s formalism, by distinguishing passive from active counterfactuals in quantum causal models. This results in a difference between causal and counterfactual dependence in quantum causal models, which pinpoints a remarkable departure from classical counterfactual reasoning. We discuss this using the pertinent example of the Bell scenario in Sec. 7.3. In Sec. 8, we briefly review the debate surrounding the notion of counterfactual definiteness in quantum mechanics. We argue that, although counterfactual definiteness fails in our formalism, this failure is not the distinguishing feature of quantum counterfactuals, and that rejecting counterfactual definiteness cannot by itself resolve Bell’s theorem. Sec. 9 reflects on some of the key assumptions underlying our notion of quantum counterfactuals in the context of recent developments on quantum statistical inference. Sec. 10 concludes with a brief summary and discussion of the results, and questions for further work.

2 The Classical Causal Model Framework
This section contains the minimal background on classical causal models and the evaluation of counterfactuals required for the generalization to the quantum case in Sec. 5. We will review a small fraction of the framework outlined in much more detail in Ref. [1]; readers familiar with the latter may readily skip this section.
In his book on causality [1], Judea Pearl identifies a hierarchy of progressively sophisticated models that are capable of answering progressively sophisticated causal queries. This hierarchy is often depicted as the three-rung ‘Ladder of Causation’ (see Fig. 1).
At the bottom of the ladder is the level of association (‘level 1’), related to observations and statistical relations. It answers questions such as “How would seeing $X$ change my belief in $Y$?” The second rung is the level of intervention (‘level 2’), which considers questions such as “If I take aspirin, will my headache be cured?” The final rung in the ladder of causation is the level of counterfactuals (‘level 3’), associated with activities such as imagining, retrospecting, and understanding. It considers questions such as “Was it the aspirin that stopped my headache?” and “Had I not taken the aspirin, would my headache not have been cured?” In other words, counterfactuals deal with ‘why’-questions. Formally, levels 1, 2 and 3 are related to Bayesian networks, causal Bayesian networks, and structural causal models, respectively. We will formally define these in the coming subsections.
2.1 Level 1 - Bayesian networks
In Pearl’s framework, level 1 of the ladder of causation (Fig. 1) is the level of association, which encodes statistical data in the form of a probability distribution $P(\mathbf{V})$ over random variables $\mathbf{V} = \{V_1, \ldots, V_n\}$. (Footnote 1: Throughout, we will use boldface notation to indicate tuples of variables.) The latter are assumed to take values in a finite set, whose elements are denoted by the corresponding lowercase $v_i$. The proposition ‘$V_i = v_i$’ represents an event where the random variable $V_i$ takes the value $v_i$, and $P(v_i)$ denotes the probability that this event occurs.
Statistical independence conditions in a probability distribution can be conveniently represented graphically using directed acyclic graphs (DAGs), which in this context are also known as Bayesian networks. The nodes in a Bayesian network $G$ represent the random variables $\mathbf{V}$, while arrows (‘$\to$’) in $G$ impose a ‘kinship’ relation: we call $Pa(V_i)$ the “parents” and $Ch(V_i)$ the “children” of the node $V_i$. For example, in Fig. 2, is the parent node of and ; is a child node of , and .
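Definition 1 (Classical Markov condition).
A joint probability distribution $P(\mathbf{V})$ is said to be Markov relative to a DAG $G$ with nodes $\mathbf{V} = \{V_1, \ldots, V_n\}$ if and only if there exist conditional probability distributions $P(V_i \mid Pa(V_i))$ for each $V_i \in \mathbf{V}$ such that
$$P(\mathbf{v}) = \prod_{i=1}^{n} P(v_i \mid pa_i), \tag{1}$$
where $pa_i$ denotes the values taken by the parents $Pa(V_i)$ of $V_i$ in $G$.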
In general, a probability distribution may be Markov relative to many Bayesian networks, corresponding to different ways it can be decomposed into conditional distributions. Moreover, a Bayesian network will have many distributions which are Markov with respect to it. Note that at this level (level 1), the DAG representing a Bayesian network does not carry causal meaning, but is merely a convenient representation of statistical conditional independences.
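The Markov condition can be illustrated numerically. The following is a minimal sketch, assuming a hypothetical three-node fork $V_1 \to V_2$, $V_1 \to V_3$ with binary variables and made-up conditional probability tables (none of which are taken from the figures of this paper). It builds the joint distribution from the factorization and checks the conditional independence that the graph implies.

```python
from itertools import product

# Hypothetical DAG (a fork): V1 -> V2 and V1 -> V3; all variables are binary.
parents = {"V1": (), "V2": ("V1",), "V3": ("V1",)}

# Conditional probability tables P(v_i | pa_i), keyed by the tuple of parent
# values. The numbers are illustrative assumptions.
cpt = {
    "V1": {(): [0.6, 0.4]},
    "V2": {(0,): [0.9, 0.1], (1,): [0.2, 0.8]},
    "V3": {(0,): [0.7, 0.3], (1,): [0.5, 0.5]},
}
names = list(parents)

def joint(assignment):
    """Markov factorization: P(v) = prod_i P(v_i | pa_i)."""
    p = 1.0
    for var, pa in parents.items():
        pa_values = tuple(assignment[q] for q in pa)
        p *= cpt[var][pa_values][assignment[var]]
    return p

# Tabulate the joint distribution over all assignments.
table = {vals: joint(dict(zip(names, vals)))
         for vals in product([0, 1], repeat=len(names))}

def marginal(fixed):
    """Sum the joint over all assignments consistent with `fixed`."""
    return sum(p for vals, p in table.items()
               if all(vals[names.index(k)] == v for k, v in fixed.items()))

total = sum(table.values())

# The fork implies that V2 and V3 are independent given V1.
p_v1 = marginal({"V1": 1})
lhs = marginal({"V1": 1, "V2": 1, "V3": 0}) / p_v1
rhs = (marginal({"V1": 1, "V2": 1}) / p_v1) * (marginal({"V1": 1, "V3": 0}) / p_v1)
```

Here `total` equals one up to rounding, and `lhs` agrees with `rhs`, which is the statistical independence encoded by the graph. No causal reading of the arrows is needed for this check.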
2.2 Level 2 - Causal Bayesian networks and classical causal models
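At level 2 of the hierarchy are causal (Bayesian) networks. In contrast to Bayesian networks, the arrows between nodes in a causal Bayesian network do encode causal relationships. In particular, the parents $Pa(V_i)$ of a node $V_i$ are now interpreted as direct causes of $V_i$. Moreover, a causal network is an oracle for interventions. The effect of an intervention is modeled as a “mini-surgery” in the graph that cuts all incoming arrows into the node being intervened upon and sets it to a specified value. We define the do-intervention $do(\mathbf{X} = \mathbf{x})$ on a subset of nodes $\mathbf{X} \subseteq \mathbf{V}$ as the submodel $\langle G_{\mathbf{X}}, P_{\mathbf{x}} \rangle$, where $G_{\mathbf{X}}$ is the modified DAG with the same nodes as $G$, but with all incoming arrows for the nodes in $\mathbf{X}$ removed from $G$, and where $P_{\mathbf{x}}$ arises from $P$ by setting the values at $\mathbf{X}$ to $\mathbf{x}$. More precisely, $P_{\mathbf{x}}$ is given by the truncated factorization
$$P_{\mathbf{x}}(\mathbf{v}) = \begin{cases} \prod_{i:\, V_i \notin \mathbf{X}} P(v_i \mid pa_i) & \text{if } \mathbf{v} \text{ is consistent with } \mathbf{x}, \\ 0 & \text{otherwise.} \end{cases} \tag{2}$$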
Definition 2 (Classical Causal Model).
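A classical causal model is a pair $\langle G, P \rangle$, consisting of a directed acyclic graph $G$ with nodes $\mathbf{V}$ and a probability distribution $P$ that is Markov with respect to $G$ according to Def. 1, together with all its submodels $\langle G_{\mathbf{X}}, P_{\mathbf{x}} \rangle$ arising from do-interventions $do(\mathbf{X} = \mathbf{x})$ with $\mathbf{X} \subseteq \mathbf{V}$.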
For example, if we perform a do-intervention on the classical causal model with DAG in Fig. 2, then is the DAG shown in Fig. 3, and the truncated factorization formula for the remaining variables reads
(3) |
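To make the difference between observing and intervening concrete, here is a minimal sketch of a do-intervention via the truncated factorization. It assumes a hypothetical confounded DAG $Z \to X$, $Z \to Y$, $X \to Y$ with made-up conditional probability tables; this graph is an illustration only and is not the graph of Fig. 2.

```python
from itertools import product

# Hypothetical confounded DAG: Z -> X, Z -> Y, X -> Y; all variables binary.
parents = {"Z": (), "X": ("Z",), "Y": ("Z", "X")}
cpt = {  # illustrative numbers, keyed by the tuple of parent values
    "Z": {(): [0.5, 0.5]},
    "X": {(0,): [0.8, 0.2], (1,): [0.3, 0.7]},
    "Y": {(0, 0): [0.9, 0.1], (0, 1): [0.6, 0.4],
          (1, 0): [0.5, 0.5], (1, 1): [0.2, 0.8]},
}
names = list(parents)

def joint(v, do=None):
    """Markov factorization; with `do`, the truncated factorization."""
    do = do or {}
    # Assignments inconsistent with the intervention have probability zero.
    if any(v[k] != x for k, x in do.items()):
        return 0.0
    p = 1.0
    for var, pa in parents.items():
        if var in do:
            continue  # the factor of an intervened node is dropped
        p *= cpt[var][tuple(v[q] for q in pa)][v[var]]
    return p

def prob(event, given=None, do=None):
    """P(event | given) in the model, optionally under a do-intervention."""
    given = given or {}
    num = den = 0.0
    for vals in product([0, 1], repeat=len(names)):
        v = dict(zip(names, vals))
        p = joint(v, do)
        if all(v[k] == x for k, x in given.items()):
            den += p
            if all(v[k] == x for k, x in event.items()):
                num += p
    return num / den

p_obs = prob({"Y": 1}, given={"X": 1})  # conditioning on an observation
p_do = prob({"Y": 1}, do={"X": 1})      # intervening on X
```

With these numbers `p_do` is 0.6 while `p_obs` is about 0.71: conditioning on $X = 1$ also updates beliefs about the common cause $Z$, whereas the intervention cuts the arrow $Z \to X$.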
2.3 Level 3 - Structural causal models and the evaluation of counterfactuals
At level 3 of the hierarchy are (classical) structural causal models. Such models consist of a set of nodes, distinguished into endogenous variables $\mathbf{V}$ and exogenous variables $\mathbf{U}$, together with a set of functions $\mathbf{F}$ that encode structural relations between the variables. The term “exogenous” indicates that any causes of such variables lie outside the model; they can be thought of as local ‘noise variables’.
Definition 3 (Classical Structural Causal Model).
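A (classical) structural causal model (CSM) is a triple $M = \langle \mathbf{U}, \mathbf{V}, \mathbf{F} \rangle$, where $\mathbf{V} = \{V_1, \ldots, V_n\}$ is a set of endogenous variables, $\mathbf{U} = \{U_1, \ldots, U_n\}$ is a set of exogenous variables and $\mathbf{F} = \{f_1, \ldots, f_n\}$ is a set of functions such that $v_i = f_i(pa_i, u_i)$ for some $Pa(V_i) \subseteq \mathbf{V}$.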
Every structural causal model $M$ is associated with a directed graph $G(M)$, which represents the causal structure of the model as specified by the relations $\mathbf{F}$. Here, we will restrict CSMs to those defining directed acyclic graphs. For example, the causal model of Fig. 2 can be extended to a CSM with causal relations as depicted in Fig. 4.
In analogy with the do-interventions for causal Bayesian networks in Sec. 2.2, we define do-interventions in a CSM . Let with corresponding exogenous variables and functions , and let , and . Then the do-intervention defines a submodel . In terms of the causal graph , the action removes all incoming arrows to the nodes , thus generating a new graph .
The submodel $M_{\mathbf{x}}$ represents a minimal change to the original model $M$ such that $\mathbf{X} = \mathbf{x}$ is true while keeping fixed the values of the exogenous variables, which are thought of as “background conditions”. In turn, we can use $M_{\mathbf{x}}$ to analyze counterfactual statements with antecedent $\mathbf{X} = \mathbf{x}$.
Definition 4 (Counterfactual).
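Let $M = \langle \mathbf{U}, \mathbf{V}, \mathbf{F} \rangle$ be a structural causal model, and let $\mathbf{X}, \mathbf{Y} \subseteq \mathbf{V}$. The counterfactual statement “$\mathbf{Y}$ would have been $\mathbf{y}$, had $\mathbf{X}$ been $\mathbf{x}$, in a situation specified by the background variables $\mathbf{U} = \mathbf{u}$” is denoted by $\mathbf{Y}_{\mathbf{x}}(\mathbf{u}) = \mathbf{y}$, where $\mathbf{Y}_{\mathbf{x}}(\mathbf{u})$ is the potential response of $\mathbf{Y}$ to the action $do(\mathbf{X} = \mathbf{x})$, that is, the solution for $\mathbf{Y}$ of the modified set of equations in the submodel $M_{\mathbf{x}}$. $\mathbf{X} = \mathbf{x}$ is called the antecedent and $\mathbf{Y} = \mathbf{y}$ is the consequent of the counterfactual.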
Note that given any complete specification $\mathbf{U} = \mathbf{u}$ of the exogenous variables, every counterfactual statement of the form above has a truth value. (Footnote 2: Here, by “truth values” we mean Boolean truth values ‘true’ and ‘false’. We do not consider more general truth values such as ‘indefinite’ which may arise e.g. in intuitionistic logic.) Denoting a “causal world” by the pair $\langle M, \mathbf{u} \rangle$, we can say that a counterfactual has a truth value in every causal world where it can be defined. This is the case even when the model $M$ with $\mathbf{U} = \mathbf{u}$ determines $\mathbf{X}$ to have a value different from that specified in the antecedent, because the counterfactual is evaluated relative to the modified submodel $M_{\mathbf{x}}$.
Definition 5 (Probabilistic structural causal model).
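A probabilistic structural causal model (PSM) is defined by a pair $\langle M, P(\mathbf{u}) \rangle$, where $M = \langle \mathbf{U}, \mathbf{V}, \mathbf{F} \rangle$ is a structural causal model (see Def. 3) and $P(\mathbf{u})$ is a probability distribution defined over the exogenous variables $\mathbf{U}$ of $M$.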
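Since every endogenous variable $V_i \in \mathbf{V}$ is a function of $U_i$ and its parent nodes, $v_i = f_i(pa_i, u_i)$, the distribution $P(\mathbf{u})$ in a PSM defines a probability distribution over every subset $\mathbf{Y} \subseteq \mathbf{V}$ by
$$P(\mathbf{y}) := P(\mathbf{Y} = \mathbf{y}) = \sum_{\mathbf{u} \,:\, \mathbf{Y}(\mathbf{u}) = \mathbf{y}} P(\mathbf{u}). \tag{4}$$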
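In particular, the probability of the counterfactual “$\mathbf{Y}$ would have been $\mathbf{y}$, had $\mathbf{X}$ been $\mathbf{x}$” can be computed using the submodel $M_{\mathbf{x}}$ as
$$P(\mathbf{Y}_{\mathbf{x}} = \mathbf{y}) = \sum_{\mathbf{u} \,:\, \mathbf{Y}_{\mathbf{x}}(\mathbf{u}) = \mathbf{y}} P(\mathbf{u}). \tag{5}$$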
More generally, the probability of a counterfactual query might be conditioned on prior observations ‘$\mathbf{e}$’. In this case, we first update the probability distribution $P(\mathbf{u})$ in the PSM to obtain a modified probability distribution $P(\mathbf{u} \mid \mathbf{e})$ conditioned on the observed data $\mathbf{e}$, and then use this updated probability distribution to evaluate the probability for the counterfactual as in Eq. (5). Combining the above steps, one arrives at the following theorem, proved in Ref. [1]:
Theorem 1 (Pearl [1]).
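Given a probabilistic structural causal model (PSM) $\langle M, P(\mathbf{u}) \rangle$ (see Def. 5), and subsets $\mathbf{X}, \mathbf{Y}, \mathbf{E} \subseteq \mathbf{V}$, the probability for the counterfactual “$\mathbf{Y}$ would have been $\mathbf{y}$, had $\mathbf{X}$ been $\mathbf{x}$”, given the observation of $\mathbf{E} = \mathbf{e}$, is denoted by $P(\mathbf{Y}_{\mathbf{x}} = \mathbf{y} \mid \mathbf{e})$ and can be evaluated systematically by a three-step procedure:
• Step 1: Abduction: using the observed data $\mathbf{e}$, use Bayesian inference to update the probability distribution $P(\mathbf{u})$ in the PSM to obtain $P(\mathbf{u} \mid \mathbf{e})$.
• Step 2: Action: perform a do-intervention $do(\mathbf{X} = \mathbf{x})$, by which the values of $\mathbf{X}$ are specified independently of their parent nodes. The resultant model is denoted as $M_{\mathbf{x}}$.
• Step 3: Prediction: in the modified model $\langle M_{\mathbf{x}}, P(\mathbf{u} \mid \mathbf{e}) \rangle$, compute the probability of $\mathbf{Y} = \mathbf{y}$ via Eq. (5).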
As an example, consider the situation where and are observed, that is, . (Footnote 3: Note that and in Thm. 1 are not necessarily disjoint.) We evaluate the probability of the counterfactual “ would have been , had been ” as:
(6) |
where we used Eq. (5) in the first and Bayes’ theorem in the second step. (Footnote 4: Note that a probabilistic structural causal model implies the existence of a joint probability distribution over all variables. In this case, an alternative expression for the probability of the counterfactual in Eq. (6) reads (7). In the quantum case such a distribution does not generally exist [13].)
In temporal metaphors, step 1 explains the past (the exogenous variables $\mathbf{U}$) in light of the current evidence $\mathbf{e}$; step 2 minimally bends the course of history to comply with the hypothetical antecedent $\mathbf{X} = \mathbf{x}$; and step 3 predicts the future based on our new understanding of the past and our newly established condition.
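The three steps of abduction, action, and prediction can be sketched in a few lines. The toy model below is an assumption made for illustration and does not appear in the paper: $X$ ("aspirin taken") is set by $U_X$, and $Y$ ("headache cured") is given by $Y = (X \wedge U_1) \vee U_2$, where $U_1$ stands for "the aspirin is effective" and $U_2$ for "spontaneous recovery". All names and numbers are hypothetical.

```python
from itertools import product

# Prior P(u) over independent binary exogenous variables (illustrative numbers).
p_u = {"U_X": [0.5, 0.5], "U_1": [0.3, 0.7], "U_2": [0.8, 0.2]}
names = list(p_u)

def solve(u, do=None):
    """Solve the structural equations for background u, optionally under do(.)."""
    do = do or {}
    x = do["X"] if "X" in do else u["U_X"]
    y = do["Y"] if "Y" in do else int((x and u["U_1"]) or u["U_2"])
    return {"X": x, "Y": y}

def counterfactual(consequent, antecedent, evidence):
    """P(consequent under do(antecedent) | evidence) by the three-step procedure."""
    # Step 1 (abduction): keep the backgrounds u compatible with the evidence,
    # weighted by their prior; normalising gives P(u | e).
    weights = {}
    for vals in product([0, 1], repeat=len(names)):
        u = dict(zip(names, vals))
        world = solve(u)
        if all(world[k] == e for k, e in evidence.items()):
            prior = 1.0
            for k in names:
                prior *= p_u[k][u[k]]
            weights[vals] = prior
    norm = sum(weights.values())
    # Step 2 (action): re-solve each background in the submodel with do(antecedent).
    # Step 3 (prediction): add up the posterior weight where the consequent holds.
    hit = 0.0
    for vals, w in weights.items():
        world = solve(dict(zip(names, vals)), do=antecedent)
        if all(world[k] == y for k, y in consequent.items()):
            hit += w
    return hit / norm

# "Given that I took aspirin and was cured, would I have been cured without it?"
p_cf = counterfactual({"Y": 1}, {"X": 0}, {"X": 1, "Y": 1})
# For comparison: the purely interventional probability P(Y = 1 | do(X = 0)).
p_do = counterfactual({"Y": 1}, {"X": 0}, {})
```

In this toy model `p_cf` is 0.2/0.76 (about 0.26), whereas `p_do` is 0.2. The difference comes entirely from the abduction step, which raises the posterior weight of spontaneous recovery once a cure has been observed.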
3 Quantum violations of classical causality
Classical causal models face notorious difficulties in explaining quantum correlations. Firstly, Bell’s theorem [12, 32, 33] can be interpreted in terms of classical causal models, as proving that such models cannot reproduce all quantum correlations (in particular, those that violate a Bell inequality) while maintaining relativistic causal structure and the assumption of “free choice”. The latter is the assumption that experimentally controllable parameters like measurement settings can always be chosen via “free variables”, which can be understood as variables that have no relevant causes in a causal model for the experiment. That is, they share no common causes with, nor are caused by, any other variables in the model. Thus, “free variables” can be modeled as exogenous variables.
For concreteness, consider the standard Bell scenario with a causal structure represented in the DAG in Fig. 5, where variables $A$ and $B$ denote the outcomes of experiments performed by two agents, Alice and Bob. Variables $X$ and $Y$ denote their choices of experiment, which are assumed to be “free variables” and thus have no incoming arrows. Since Alice and Bob perform measurements in space-like separated regions, no relativistic causal connection is allowed between $X$ and $B$ nor between $Y$ and $A$. In this scenario, Reichenbach’s principle of common cause [34, 22], which is a consequence of the classical causal Markov condition, implies the existence of common causes underlying any correlations between the two sides of the experiment. $\Lambda$ denotes a complete specification of any such common causes. As we are assuming a relativistic causal structure, those must be in the common past light cone of Alice’s and Bob’s experiments.
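Marginalizing over the common cause variable $\Lambda$, the classical causal Markov condition applied to the DAG in Fig. 5 implies the factorization:
$$P(a, b \mid x, y) = \sum_{\lambda} P(\lambda)\, P(a \mid x, \lambda)\, P(b \mid y, \lambda). \tag{8}$$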
A model satisfying Eq. (8) is also called a local hidden variable model. Importantly, local hidden variable models satisfy the Bell inequalities [12, 32], which have been experimentally violated by quantum correlations [35, 36, 37, 38]. (Footnote 5: The 2022 Nobel Prize in Physics was awarded in part for the demonstration of Bell inequality violations.) It follows that no classical causal model can explain quantum correlations under the above assumptions.
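As a concrete instance, the sketch below uses the CHSH inequality, chosen here as an illustrative Bell inequality; the text above refers to Bell inequalities in general. Since every local hidden variable model is a mixture of deterministic local strategies, enumerating the latter suffices to find the classical bound. The quantum value assumes the textbook singlet-state correlation $E(x, y) = -\cos(\alpha_x - \beta_y)$ with a standard choice of measurement angles; these specifics are assumptions of the example.

```python
from itertools import product
from math import cos, pi, sqrt

def chsh(E):
    """CHSH expression S = E(0,0) + E(0,1) + E(1,0) - E(1,1)."""
    return E(0, 0) + E(0, 1) + E(1, 0) - E(1, 1)

# Deterministic local strategies: Alice outputs a[x], Bob outputs b[y], each +1 or -1.
best_local = 0.0
for a0, a1, b0, b1 in product([-1, 1], repeat=4):
    a, b = (a0, a1), (b0, b1)
    s = chsh(lambda x, y: a[x] * b[y])
    best_local = max(best_local, abs(s))

# Singlet-state correlations for measurement angles alpha_x (Alice), beta_y (Bob).
alpha = (0.0, pi / 2)
beta = (pi / 4, -pi / 4)
s_quantum = chsh(lambda x, y: -cos(alpha[x] - beta[y]))
```

The enumeration gives `best_local == 2`, while `abs(s_quantum)` equals $2\sqrt{2}$, so no distribution of the form of Eq. (8) can reproduce these correlations.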
More recently, Wood and Spekkens [18] showed that certain Bell inequality violations cannot be reproduced by any classical causal model that satisfies the assumption of “no fine-tuning”. This is the requirement that any conditional independence between variables in the model be explained as arising from the structure of the causal graph, rather than from fine-tuned model parameters. This assumption is essential for causal discovery – without it, it is generally not possible to experimentally determine which of a number of candidate graphs faithfully represents a given situation. This result was later generalized to arbitrary Bell and Kochen-Specker inequality violations in Refs. [19, 20].
These results motivate the search for a generalization of classical causal models that accommodates quantum correlations and allows for causal discovery, while maintaining faithfulness to relativistic causal structure. Ref. [22] considers modifications of Reichenbach’s principle of common cause [34], which is implied by the causal Markov condition in the special case of the common cause scenario in Fig. 5, as assumed in Bell’s theorem [12, 32]. The authors of Ref. [22] argue that one could maintain the principle of common cause (the requirement that correlations between two causally disconnected events should be explained via common causes) by relaxing the condition that a full specification of those common causes factorizes the probabilities for the events in question, as in Eq. (8). Using the Leifer-Spekkens formalism for quantum conditional states, they instead propose that Eq. (8) should be replaced by the requirement that the channels between the common cause and Alice and Bob’s labs factorize or, more precisely, that the Choi-Jamiołkowski operators corresponding to those channels factorize. This is essentially the type of resolution of Bell’s theorem that is provided by quantum causal models, to which we now turn. After introducing quantum structural causal models in Sec. 4.1 and quantum counterfactual queries in Sec. 5, in Sec. 7.3 we will revisit the Bell scenario from the perspective of counterfactuals in quantum causal models.
4 Quantum causal models
In recent years a growing number of papers have addressed the problem of generalizing the classical causal model formalism to accommodate quantum correlations, in a way that is compatible with relativistic causality and faithfulness. This has led to the development of various frameworks for quantum causal models. The most developed of these are the frameworks by Costa and Shrapnel [26] and Barrett, Lorenz and Oreshkov [29]. In this work, we use a combination of the notation and features of both of these formalisms.
Quantum nodes and quantum interventions. Recall that in a classical causal model, a node represents a locus for potential interventions. In order to generalize this to the quantum case, we start by introducing a quantum node $A$, which is associated with two Hilbert spaces $\mathcal{H}_{A^{\mathrm{in}}}$ and $\mathcal{H}_{A^{\mathrm{out}}}$, corresponding to the incoming system and the outgoing system, respectively. An intervention at a quantum node $A$ is represented by a quantum instrument $\mathcal{I}^z_A$ (see Fig. 6). This is a set of trace-non-increasing completely positive (CP) maps from the space of linear operators on $\mathcal{H}_{A^{\mathrm{in}}}$ to the space of linear operators on $\mathcal{H}_{A^{\mathrm{out}}}$,
$$\mathcal{I}^z_A = \left\{ \mathcal{M}^{a|z}_A : \mathcal{L}(\mathcal{H}_{A^{\mathrm{in}}}) \to \mathcal{L}(\mathcal{H}_{A^{\mathrm{out}}}) \right\}_a, \tag{9}$$
such that $\mathcal{M}^z_A = \sum_a \mathcal{M}^{a|z}_A$ is a completely positive, trace-preserving (CPTP) map, i.e. a quantum channel. (Footnote 6: We sometimes write for this CPTP map to indicate that it is associated with the instrument $\mathcal{I}^z_A$. Note however that a given CPTP map will in general be associated with many different instruments.) Here, $z$ is a label for the (choice of) instrument, and $a$ labels the classical outcome of the instrument, which occurs with probability $P_z(a) = \mathrm{Tr}[\mathcal{M}^{a|z}_A(\rho)]$ for an input state $\rho$; consequently, the state on the output system conditioned on the outcome of the intervention is given by $\mathcal{M}^{a|z}_A(\rho) / P_z(a)$. For simplicity, we consider finite-dimensional systems only.
Using the Choi-Jamiołkowski (CJ) isomorphism (Footnote 7: Here, we follow the notation in Ref. [26]. This differs from the one used in Refs. [27, 39, 29], which applies a basis-independent version of the Choi-Jamiołkowski isomorphism, by identifying the Hilbert space associated with outgoing systems with its dual; see also Ref. [40].), we represent a quantum instrument $\mathcal{I}^z_A$ in terms of a positive operator-valued measure $\{\tau^{a|z}_A\}_a$. More precisely, every completely positive map $\mathcal{M}^{a|z}_A$ is represented by a positive semi-definite operator $\tau^{a|z}_A$ given by
(10) |
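In a slight abuse of notation, we will write $\mathcal{I}^z_A = \{\tau^{a|z}_A\}_a$ also for the representation of an instrument in terms of positive operators under the Choi-Jamiołkowski isomorphism. Note that the fact that $\mathcal{M}^z_A$ is trace-preserving imposes the following trace condition on $\tau^z_A = \sum_a \tau^{a|z}_A$ (cf. Ref. [41]),
$$\mathrm{Tr}_{A^{\mathrm{out}}}\!\left[\tau^z_A\right] = \mathbb{I}_{A^{\mathrm{in}}}. \tag{11}$$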
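The trace condition can be checked numerically for a simple instrument. The sketch below is an illustration under stated assumptions: it uses the convention $\tau = \sum_{ij} \mathcal{M}(|i\rangle\langle j|) \otimes |i\rangle\langle j|$ with the output system as the first tensor factor and no transposition, which may differ from the convention of Eq. (10) by a transpose. The trace condition itself does not depend on that choice. The instrument (an X-basis measurement on a qubit) and all helper names are hypothetical.

```python
def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product; the first factor carries the slow index."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    """Entry-wise sum of two matrices."""
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def unit(i, j, d):
    """Matrix unit |i><j| in dimension d."""
    return [[complex(r == i and c == j) for c in range(d)] for r in range(d)]

def choi(kraus_ops, d_in):
    """CJ operator sum_ij M(|i><j|) (x) |i><j| of the CP map M(rho) = sum_k K rho K^dagger.

    Assumed convention: output factor first, input factor second, no transpose.
    """
    d_out = len(kraus_ops[0])
    tau = [[0j] * (d_out * d_in) for _ in range(d_out * d_in)]
    for i in range(d_in):
        for j in range(d_in):
            e_ij = unit(i, j, d_in)
            for k in kraus_ops:
                image = matmul(matmul(k, e_ij), dagger(k))
                tau = add(tau, kron(image, e_ij))
    return tau

def trace_out(op, d_out, d_in):
    """Partial trace over the output (first) tensor factor."""
    return [[sum(op[o * d_in + r][o * d_in + c] for o in range(d_out))
             for c in range(d_in)] for r in range(d_in)]

# Hypothetical instrument: projective measurement of a qubit in the X basis.
plus = [[0.5, 0.5], [0.5, 0.5]]      # Kraus operator |+><+|
minus = [[0.5, -0.5], [-0.5, 0.5]]   # Kraus operator |-><-|
instrument = {"+": choi([plus], 2), "-": choi([minus], 2)}

# Summing over outcomes gives the CJ operator of the associated channel.
tau_total = add(instrument["+"], instrument["-"])
reduced = trace_out(tau_total, 2, 2)
```

Here `reduced` is the 2×2 identity, as required by Eq. (11), while each individual outcome operator only has trace one.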
Quantum process operators. In a quantum causal model we will distinguish between two types of quantum operations: quantum interventions, which are local to a quantum node, and a quantum process operator, which acts between quantum nodes and contains information about the causal (influence) relations between the nodes in the model.
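To motivate the general definition (Def. 6 below), we first consider the simplest case: for a single quantum node $A$, a quantum process operator is any operator $\sigma_A \in \mathcal{L}(\mathcal{H}_{A^{\mathrm{in}}} \otimes \mathcal{H}_{A^{\mathrm{out}}})$ such that the pairing (Footnote 8: With Ref. [29], we will adopt the shorthand $\mathrm{Tr}_A[\cdot] := \mathrm{Tr}_{A^{\mathrm{in}} A^{\mathrm{out}}}[\cdot]$.)
$$P_z(a) = \mathrm{Tr}_A\!\left[\sigma_A\, \tau^{a|z}_A\right] \tag{12}$$
defines a probability for every positive semi-definite operator $\tau^{a|z}_A$, and satisfies the normalisation condition
$$\mathrm{Tr}_A\!\left[\sigma_A\, \tau^{z}_A\right] = 1 \tag{13}$$
for every quantum channel (CPTP map) $\tau^z_A$. Consequently, given a process operator $\sigma_A$, we may interpret $P_z(a)$ as the probability to obtain outcome $a$ when performing an instrument $\mathcal{I}^z_A$.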
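As a generalisation of the Born rule (on the composite system $\mathcal{H}_{A^{\mathrm{in}}} \otimes \mathcal{H}_{A^{\mathrm{out}}}$), Eq. (12) in particular implies that $\sigma_A$ is positive and hence corresponds to a completely positive map $\mathcal{L}(\mathcal{H}_{A^{\mathrm{out}}}) \to \mathcal{L}(\mathcal{H}_{A^{\mathrm{in}}})$.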
More generally, it will be useful to introduce a notation for the positive semi-definite operator corresponding to a bipartite channel of the form :
(14) |
Note that is distinguished from the representation of the Choi matrices corresponding to quantum instruments in Eq. (10) by an overall transposition, indicating the different roles played by instruments and processes in the inner product of Eq. (12). In particular, we have for some channel satisfying the normalisation condition in Eq. (13).
Generalizing this idea to finitely many quantum nodes, a quantum process operator is defined as follows.
Definition 6 (Process operator).
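A (quantum) process operator over quantum nodes $A_1, \ldots, A_n$ is a positive semi-definite operator $\sigma_{A_1 \cdots A_n} \in \mathcal{L}\big(\bigotimes_{i=1}^{n} \mathcal{H}_{A_i^{\mathrm{in}}} \otimes \mathcal{H}_{A_i^{\mathrm{out}}}\big)$, which satisfies the normalisation condition,
$$\mathrm{Tr}_{A_1 \cdots A_n}\!\left[\sigma_{A_1 \cdots A_n} \left(\tau^{z_1}_{A_1} \otimes \cdots \otimes \tau^{z_n}_{A_n}\right)\right] = 1, \tag{15}$$
for any choice of quantum channels $\tau^{z_1}_{A_1}, \ldots, \tau^{z_n}_{A_n}$ at nodes $A_1, \ldots, A_n$. (Footnote 9: Every process operator satisfies a trace condition analogous to Eq. (11): $\mathrm{Tr}_{A_1^{\mathrm{in}} \cdots A_n^{\mathrm{in}}}[\sigma_{A_1 \cdots A_n}] = \mathbb{I}_{A_1^{\mathrm{out}} \cdots A_n^{\mathrm{out}}}$, and hence defines a CPTP map $\mathcal{L}\big(\bigotimes_i \mathcal{H}_{A_i^{\mathrm{out}}}\big) \to \mathcal{L}\big(\bigotimes_i \mathcal{H}_{A_i^{\mathrm{in}}}\big)$. Yet, the converse is generally not true.)
Comparing with Eq. (12), we define the probability of obtaining outcomes $a_1, \ldots, a_n$ when performing interventions $\mathcal{I}^{z_1}_{A_1}, \ldots, \mathcal{I}^{z_n}_{A_n}$ at quantum nodes $A_1, \ldots, A_n$ by
$$P_{z_1, \ldots, z_n}(a_1, \ldots, a_n) = \mathrm{Tr}_{A_1 \cdots A_n}\!\left[\sigma_{A_1 \cdots A_n} \left(\tau^{a_1|z_1}_{A_1} \otimes \cdots \otimes \tau^{a_n|z_n}_{A_n}\right)\right]. \tag{16}$$
Eq. (16) defines a generalization of the Born rule (on the composite system $\bigotimes_{i=1}^{n} \mathcal{H}_{A_i^{\mathrm{in}}} \otimes \mathcal{H}_{A_i^{\mathrm{out}}}$) [42, 28].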
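For a single node, the generalized Born rule can be checked against the ordinary one. The sketch below is an illustration only and makes two assumptions that are not taken from the paper: the CJ convention $\tau = \sum_{ij} \mathcal{M}(|i\rangle\langle j|) \otimes |i\rangle\langle j|$ (output factor first, no transposition), and, matching that convention, the single-node process operator $\sigma = \mathbb{I}_{\mathrm{out}} \otimes \rho^{T}$ for an input state $\rho$ whose output is discarded. Under a different CJ convention the transpose would sit elsewhere. The state, the measurement, and the helper names are hypothetical.

```python
def matmul(a, b):
    """Matrix product of two nested-list matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]

def dagger(a):
    """Conjugate transpose."""
    return [[a[j][i].conjugate() for j in range(len(a))] for i in range(len(a[0]))]

def transpose(a):
    return [[a[j][i] for j in range(len(a))] for i in range(len(a[0]))]

def kron(a, b):
    """Kronecker product; the first factor carries the slow index."""
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]

def add(a, b):
    return [[x + y for x, y in zip(r, s)] for r, s in zip(a, b)]

def trace(m):
    return sum(m[k][k] for k in range(len(m)))

def unit(i, j, d):
    """Matrix unit |i><j| in dimension d."""
    return [[complex(r == i and c == j) for c in range(d)] for r in range(d)]

def choi(kraus_ops, d_in):
    """CJ operator sum_ij M(|i><j|) (x) |i><j| (assumed convention, see lead-in)."""
    d_out = len(kraus_ops[0])
    tau = [[0j] * (d_out * d_in) for _ in range(d_out * d_in)]
    for i in range(d_in):
        for j in range(d_in):
            e_ij = unit(i, j, d_in)
            for k in kraus_ops:
                tau = add(tau, kron(matmul(matmul(k, e_ij), dagger(k)), e_ij))
    return tau

# Hypothetical input state with Bloch vector (0, 0.6, 0); it has complex entries,
# so the placement of the transpose is not a matter of taste here.
rho = [[0.5, -0.3j], [0.3j, 0.5]]

# Hypothetical instrument: projective measurement in the Y basis.
kraus = {"+i": [[0.5, -0.5j], [0.5j, 0.5]], "-i": [[0.5, 0.5j], [-0.5j, 0.5]]}

identity = [[1, 0], [0, 1]]
sigma = kron(identity, transpose(rho))  # process operator I_out (x) rho^T

# Generalized Born rule P(a) = Tr[sigma tau^a] versus the direct Tr[K rho K^dagger].
probs = {a: trace(matmul(sigma, choi([k], 2))).real for a, k in kraus.items()}
direct = {a: trace(matmul(matmul(k, rho), dagger(k))).real for a, k in kraus.items()}
```

Both computations give 0.8 for outcome `"+i"` and 0.2 for `"-i"`, and the probabilities sum to one, which is the normalisation condition for this single-node process operator.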
Quantum causal models. With the above ingredients, we obtain quantum generalizations of the causal Markov condition in Def. 1 and thereby of classical causal models (causal networks) in Def. 2.
Definition 7 (Quantum causal Markov condition).
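A quantum process operator $\sigma_{A_1 \cdots A_n}$ is Markov for a given DAG $G$ if and only if there exist positive operators $\rho_{A_i \mid Pa(A_i)}$ (corresponding to quantum channels from the outgoing systems of the parents $Pa(A_i)$ to the incoming system of $A_i$) for each quantum node $A_i$ of $G$ such that (Footnote 10: Here and below, we implicitly assume the individual operators $\rho_{A_i \mid Pa(A_i)}$ to be ‘padded’ with identities on all nodes not explicitly involved in $\rho_{A_i \mid Pa(A_i)}$ such that the multiplication of operators is well-defined.)
$$\sigma_{A_1 \cdots A_n} = \prod_{i=1}^{n} \rho_{A_i \mid Pa(A_i)}, \tag{17}$$
and $[\rho_{A_i \mid Pa(A_i)}, \rho_{A_j \mid Pa(A_j)}] = 0$ for all $i, j \in \{1, \ldots, n\}$.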
Definition 8 (Quantum causal model).
A quantum causal model is a pair $\langle G, \sigma_{A_1 \cdots A_n} \rangle$, consisting of a DAG $G$, whose vertices represent quantum nodes $A_1, \ldots, A_n$, and a quantum process operator $\sigma_{A_1 \cdots A_n}$ that is Markov with respect to $G$, according to Def. 7.
4.1 Quantum structural causal models
Recall that in the classical case, counterfactuals are evaluated relative to a classical structural causal model (CSM) $M$ (see Def. 3), which associates an exogenous variable $U_i$ and a function $f_i$ to every node $V_i$. Given a CSM, we thus have full information about the underlying process and any uncertainty arises solely from our lack of knowledge about the values of the variables at exogenous nodes, which is encoded in the probability distribution $P(\mathbf{u})$ of the probabilistic structural causal model (PSM) $\langle M, P(\mathbf{u}) \rangle$.
In order to define a notion of quantum structural causal models, we find it useful to introduce the lack of knowledge on exogenous nodes directly in terms of a special type of quantum instruments (Footnote 11: Here, our formalism diverges from the one in Ref. [29], which assigns the lack of knowledge about exogenous degrees of freedom as part of the process operator $\sigma$, and which does not distinguish between different state preparations. This is a change in perspective in so far as we will place our lack of knowledge as a lack of knowledge about events at the exogenous nodes, rather than a lack of knowledge about the process.),
(18) |
Quantum instruments of this form discard the input to the node and with probability prepare the state in the output. In other words, is a discard-and-prepare instrument. Ignoring the outcome of this instrument, one obtains the channel , corresponding to the preparation of state in the output of node .
Note that the outcome and output of a discard-and-prepare instrument are independent of the input state . In order to avoid carrying around arbitrary input states in formulas below (as required for normalization), we will therefore adopt the convention,
(19) |
such that for any state .
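Definition 9 (No-influence condition).
Let $\rho^U_{CD|AB}$ be the Choi-Jamiołkowski (CJ) representation of the channel corresponding to the unitary transformation $U: \mathcal{H}_A \otimes \mathcal{H}_B \to \mathcal{H}_C \otimes \mathcal{H}_D$. We say that system $A$ does not influence system $D$ (denoted as $A \nrightarrow D$) if and only if there exists a quantum channel $\mathcal{M}: \mathcal{L}(\mathcal{H}_B) \to \mathcal{L}(\mathcal{H}_D)$ with corresponding CJ representation $\rho^{\mathcal{M}}_{D|B}$ such that $\mathrm{Tr}_C[\rho^U_{CD|AB}] = \rho^{\mathcal{M}}_{D|B} \otimes \mathbb{I}_A$. (Footnote 12: We remark that the labels $A, B, C, D$ refer to arbitrary systems, not necessarily nodes in a quantum causal model. Within a quantum causal model, two of those labels, say $A$ and $C$, may refer to output and input Hilbert spaces of the same node.)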
Given these preliminaries, we define a quantum version of the structural causal models in Def. 3.
Definition 10 (Quantum structural causal model).
A quantum structural causal model (QSM) is a triple, specified by:
(i) a set of quantum nodes, which are split into
  – a set of endogenous nodes,
  – a set of exogenous nodes,
  – and a sink node;
(ii) a unitary process operator that satisfies the no-influence conditions of Eq. (20);
(iii) a set of discard-and-prepare instruments for every exogenous node.
Note that in general we need to include an additional sink node in order for the process operator to be unitary. The sink node contains any excess information that is discarded in the process (cf. Ref. [29]).
We emphasize the subtle, but conceptually crucial difference between Def. 4.5 in Ref. [29] and our Def. 10. The former specifies the input states on ancillary nodes directly, as part of a ‘unitary process with inputs’, while the latter encodes input states in terms of discard-and-prepare instruments, acting on an arbitrary input state. This will enable us to use classical Bayesian inference on the outcomes of instruments at exogenous nodes in the abduction step of the evaluation of quantum counterfactuals, as we’ll see in Sec. 5 below. In contrast, this is not possible using Def. 4.5 in Ref. [29], but may require a generalisation of Bayesian inference to the quantum case (see Sec. 9).
Following Ref. [29], we define a notion of structural compatibility of a process operator with a graph .
Definition 11 (Compatibility of a quantum process operator with a DAG).
A quantum process operator over nodes is said to be structurally compatible with a DAG if and only if there exists a quantum structural causal model (QSM) that recovers as a marginal,
(21) |
where satisfies the no-influence relations
(22) |
with defined by .
Similar to Thm. 4.10 in Ref. [29], one shows that a process operator is structurally compatible with if and only if it is Markov for .
Theorem 2 (Equivalence of quantum compatibility and Markovianity).
For a DAG with nodes and a quantum process operator , the following are equivalent:
1. is structurally compatible with .
2. is Markov for .
Proof.
The difference between our definition of ‘structural compatibility’ in Def. 11 and that of ‘compatibility’ in Def. 4.8 in Ref. [29] is that the latter applies to a “unitary process with inputs” (see Def. 4.5 in Ref. [29]), while Def. 11 applies to a QSM as defined in Def. 10. Yet, we show that is compatible with if and only if it is structurally compatible with . The result then follows from the proof of Thm. 4.10 in Ref. [29].
First, let be compatible with , then by Def. 4.8 in Ref. [29] there exists a unitary process that satisfies the no-influence conditions and , and states such that is recovered as a marginal,
(23) |
where we traced over the inputs of exogenous nodes . Choosing discard-and-prepare instruments such that (cf. Eq. (19)), defines a QSM (cf. Def. 10): in particular, satisfies Eq. (20). Moreover, also satisfies Eq. (22), and Eq. (23) implies Eq. (21). From this it follows that is structurally compatible with .
Conversely, if is structurally compatible with it admits a QSM , from which we extract the unitary process operator satisfying the no-influence conditions in Eq. (20) and Eq. (22), and which recovers as a marginal in Eq. (23) for inputs , as a consequence of Eq. (21). It then follows that is compatible with . ∎
Theorem 2 establishes that for every process operator that is Markov for a graph , there exists a QSM over that reproduces that process. Note, however, that this does not necessarily give us information about which QSM correctly describes a given physical process. This requires that the outcomes of instruments at the exogenous nodes correspond to “stable events” (cf. Ref. [43]), e.g. due to decoherence. That is, for a QSM to be taken to correctly describe a physical process, the events represented at the exogenous nodes must be effectively classical events, in line with their treatment as fixed background events. The evaluation of counterfactuals will be relative to a QSM, and different QSMs compatible with the same process will in general give different answers to the same counterfactual query. This situation is analogous to the classical case. Determining which (classical or quantum) structural causal model correctly describes a given physical realisation of a process is an important question, but beyond the scope of this work.
Finally, we need the following notion (cf. Eq. (19) in Ref. [28]). Given a particular set of outcomes at the exogenous instruments, we define a conditional process operator as follows,
(24) |
This allows us to calculate the conditional probability ‘’ to obtain a set of outcomes for a set of instruments at endogenous nodes, given a set of outcomes for the exogenous instruments:
(25) |
Assuming that a QSM correctly describes a given physical scenario, and in particular that the events associated with can be thought of as well-decohered, stable events, we can think of Eq. (24) as representing the actual process realised in a given run of the experiment, where our (prior) ignorance about which process is actually realised is encoded in the subjective probabilities .
5 Counterfactuals in Quantum Causal Models
Classically, a counterfactual query has the form “Given evidence , would have been had been ?”. In Pearl’s formalism, the corresponding counterfactual statement can be assigned a truth value given a full specification of the background conditions in a structural causal model. In that formalism, probabilities only arise out of our lack of knowledge about exogenous variables, and one can define the probability for the counterfactual to be true as the probability that lies in the range of values where the counterfactual is evaluated as true. In contrast, in quantum causal models, a counterfactual statement will in general not have a truth value! This is the case even if we are given maximal information about the process (represented as a unitary process) and maximal information about the events at the exogenous nodes (represented as a full specification of the exogenous variables ‘’ in a quantum structural causal model). (Footnote 13: Here we are assuming that maximal information about an event corresponding to the preparation of a quantum state is given by a (pure) quantum state. This of course assumes that quantum mechanics is “complete” in the sense that there are no hidden variables that would further specify the outcomes of instruments. While this is admittedly an important assumption, it is the natural assumption to make in the context of quantum causal models, which aim to maintain compatibility with relativistic causality [44].)
In order to avoid the implicit assumption of ‘counterfactual definiteness’ inherent to the notion of a probability of a counterfactual as in the classical case (see Def. 4), we seek a notion of counterfactual probability in the quantum case.
Definition 12 (Counterfactual probability).
Let be a quantum structural causal model. Then the counterfactual probability that outcomes would have obtained for a subset of nodes C, had instruments been implemented and outcomes obtained at a set of nodes B (disjoint from C), in the situation specified by the background variables , is denoted by and given by
(26) |
where , , and . For , we set for counterfactuals with impossible antecedent (‘counterpossibles’).
More generally, we want to calculate the expected value of the counterfactual probability given some evidence, for which we define a standard quantum counterfactual query as follows.
Definition 13 (Standard quantum counterfactual query).
Let be a quantum structural causal model. Then a standard quantum counterfactual query, denoted by , is the expected probability that outcomes would have obtained for a subset of nodes C, had instruments been implemented and outcomes obtained at a set of nodes B (disjoint from C), given the evidence that a set of instruments has been implemented and outcomes obtained.
Note that to obtain an unambiguous answer, one needs to specify all the instruments in all the nodes, both actual and counterfactual. Def. 13 may not look general enough to accommodate all types of counterfactuals one can envisage, but we will discuss later how the answer to seemingly different types of counterfactual queries can be obtained from the answer to a standard query after suitable interpretation. At times there will be ambiguity in how to interpret some counterfactual queries, and the task of interpretation will be to reduce any counterfactual query to the appropriate standard query—we will return to this later. We now proceed to show how we can answer a quantum counterfactual query.
5.1 Evaluation of counterfactuals
The evaluation of a standard counterfactual query within a quantum structural causal model proceeds through a three-step process of abduction, action and prediction, in analogy with the classical case.
Abduction. We infer what the past must have been, given the information we have at present; that is, we want to update our information about the instrument outcomes at the exogenous nodes , given that outcomes have been observed upon performing instruments at nodes . (Footnote 14: In the language of Ref. [43], we treat the outcomes at exogenous nodes as “stable facts”. In taking this stance we set aside the question of when an instrument outcome can be said to be a stable fact, i.e. we set aside the measurement problem, which applies to the quantum causal model framework in the same way as to standard quantum theory [45, 44].) Since we are talking about jointly measured variables, we can perform a Bayesian update to calculate the conditional probability (Footnote 15: Here, we assume that since is an actually observed event.)
(27) |
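The abduction step is an ordinary Bayesian update over the classical outcomes at the exogenous nodes. The following is a minimal sketch of that update, not the authors' implementation: the function name `abduct`, the dictionary encoding, and the numerical values are hypothetical, and the likelihoods are assumed to have been computed beforehand from the conditional process operators (cf. Eq. (25)).

```python
def abduct(prior, likelihood):
    """Bayesian update over exogenous outcomes (sketch of the abduction step).

    prior:      dict mapping each exogenous outcome lam -> P(lam)
    likelihood: dict mapping lam -> P(evidence | lam); assumed to be supplied,
                e.g. computed from the conditional process operator for lam
    Returns a dict lam -> P(lam | evidence).
    """
    joint = {lam: prior[lam] * likelihood[lam] for lam in prior}
    evidence_prob = sum(joint.values())
    if evidence_prob == 0:
        # The evidence is assumed to be an actually observed event,
        # so its probability must be nonzero.
        raise ValueError("evidence has zero probability under the model")
    return {lam: p / evidence_prob for lam, p in joint.items()}


# Hypothetical numbers for a two-valued exogenous variable.
prior = {0: 0.5, 1: 0.5}
likelihood = {0: 0.9, 1: 0.3}
posterior = abduct(prior, likelihood)
print(posterior)
```

With these illustrative numbers the evidence favours the outcome `0` by a factor of three, so the updated distribution shifts from uniform towards that outcome.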
Action. Next, we modify the instruments at endogenous nodes to , as required by the antecedent of the counterfactual query. We highlight an important distinction from the classical case: unlike in Pearl’s formalism, we do not need to modify the process itself, since an ‘arrow-breaking’ intervention at a node can always be emulated via some appropriate discard-and-prepare instrument, for example, by the instrument
(28) |
Deciding what instruments are appropriate for a given counterfactual query not in standard form is part of the interpretational task we will return to in Sec. 7 below. For a standard quantum counterfactual query, this is unambiguous since the counterfactual instruments are defined as part of the query (see Def. 13).
Prediction. Finally, we calculate the expected value of the counterfactual probability
(29) |
Whenever the counterfactual has an impossible antecedent for some values of the background variables with nonzero probability, that is, whenever for some with , we set .
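The prediction step, including the convention for counterpossibles, can be sketched as follows. This is an illustration under stated assumptions rather than the paper's definition: we assume that Eq. (26) has the usual form of a conditional probability (joint probability of consequent and antecedent divided by the probability of the antecedent), and we use the Python value `None` as a hypothetical stand-in for the conventional value assigned to counterpossibles. All names and numbers are invented for the example.

```python
COUNTERPOSSIBLE = None  # hypothetical sentinel for the conventional value


def counterfactual_probability(p_joint, p_antecedent):
    """Ratio P(c', b' | lam) / P(b' | lam), or the sentinel if P(b' | lam) = 0."""
    if p_antecedent == 0:
        return COUNTERPOSSIBLE
    return p_joint / p_antecedent


def expected_counterfactual(posterior, p_joint, p_antecedent):
    """Average the counterfactual probability over the abducted distribution.

    posterior:    dict lam -> P(lam | evidence)
    p_joint:      dict lam -> P(c', b' | lam) under the counterfactual instruments
    p_antecedent: dict lam -> P(b' | lam) under the counterfactual instruments
    """
    total = 0.0
    for lam, weight in posterior.items():
        if weight == 0:
            continue  # outcomes excluded by the evidence do not contribute
        value = counterfactual_probability(p_joint[lam], p_antecedent[lam])
        if value is COUNTERPOSSIBLE:
            # impossible antecedent for some lam with nonzero weight
            return COUNTERPOSSIBLE
        total += weight * value
    return total


# Hypothetical inputs.
posterior = {0: 0.75, 1: 0.25}
well_defined = expected_counterfactual(
    posterior, {0: 0.2, 1: 0.1}, {0: 0.4, 1: 0.5}
)
impossible = expected_counterfactual(
    posterior, {0: 0.0, 1: 0.1}, {0: 0.0, 1: 0.5}
)
print(well_defined, impossible)
```

Returning the sentinel as soon as one contributing background value has an impossible antecedent mirrors the convention stated above; values of the background variables that the evidence has already excluded are skipped.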
If a counterfactual query can be interpreted as a standard quantum counterfactual query, then it will have an unambiguous answer as above. In Sec. 7, we will discuss the task of interpreting a general quantum counterfactual query that is not already in standard form. Before doing so, we proceed by proving that the present formalism extends Pearl’s classical formalism.
6 From classical to quantum structural causal models
Having defined a notion of quantum structural causal models (QSM) in Def. 10, it is an important question to ask in what sense this definition extends that of a probabilistic structural causal model (PSM) in Def. 5 and, in particular, that of a classical structural causal model (CSM) in Def. 3. In this section, we show that QSMs indeed provide a generalization of PSMs—by extending an arbitrary PSM to a QSM . In order to do so, we need to take care of two crucial physical differences between Def. 3 and Def. 10.
First, note that the structural relations in a CSM are generally not reversible, while unitary evolution in QSMs postulates an underlying reversible process. We therefore need to lift a generic CSM to a reversible CSM, whose structural relations are given in terms of bijective functions, yet whose independence conditions coincide with those of the original CSM. Second, while classical information (in a CSM) can be copied, quantum information famously cannot. We therefore need to find a mechanism to encode classical copy operations into a QSM. This will require us to introduce auxiliary systems, which also need to preserve the no-influence conditions required between exogenous variables in Def. 10, (ii).
The next theorem asserts that an extension of a CSM to a QSM satisfying these constraints always exists.
Theorem 3.
Every PSM , consisting of a CSM and a probability distribution over exogenous variables, can be extended to a QSM such that
(30) | ||||
(31) |
In particular, preserves the independence conditions between variables in (as defined by ),
(32) |
Proof.
(Sketch) The proof consists of several parts:
(i) we find a binary extension of the CSM ,
(ii) we extend the binary CSM to a binary, reversible CSM, where all functional relations are bijective,
(iii) we encode classical copy operations in a QSM using CNOT-gates,
(iv) by promoting classical variables to quantum nodes, and by linearly extending bijective functions between classical variables to isometries, we construct a QSM , which extends the PSM as desired.
For details of the proof, see App. A. ∎
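To illustrate steps (ii) and (iii) at the classical level, the sketch below uses a standard textbook construction, which may differ in detail from the one in App. A: an irreversible Boolean function f is lifted to the bijection (x, y) -> (x, y XOR f(x)), and a classical copy is realised by a CNOT acting on an ancilla initialised to 0. The function names and the choice of AND as the example function are hypothetical.

```python
from itertools import product


def reversible_lift(f):
    """Lift f: {0,1}^n -> {0,1} to the bijection (x, y) -> (x, y XOR f(x))."""
    def lifted(x, y):
        return x, y ^ f(x)
    return lifted


def cnot(control, target):
    """Classical action of a CNOT gate on bits: flips target iff control is 1."""
    return control, target ^ control


def AND(bits):
    """Example of an irreversible structural relation on two bits."""
    return bits[0] & bits[1]


lifted_and = reversible_lift(AND)

# All inputs (x, y) with x a pair of bits and y a single ancilla bit.
inputs = [(x, y) for x in product((0, 1), repeat=2) for y in (0, 1)]
images = [lifted_and(x, y) for x, y in inputs]

is_bijective = len(set(images)) == len(inputs)
recovers_f = all(lifted_and(x, 0)[1] == AND(x) for x in product((0, 1), repeat=2))
copies_bit = all(cnot(x, 0) == (x, x) for x in (0, 1))
print(is_bijective, recovers_f, copies_bit)
```

Setting the ancilla to 0 recovers the original function value, while the lifted map as a whole is a permutation of bit strings and can therefore be extended linearly to a unitary, as required in step (iv).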
We will see in Sec. 7 that a QSM admits different types of counterfactual queries, some of which are genuinely quantum, that is, they do not arise in a CSM. Nevertheless, Thm. 3 implies that counterfactual queries arising in a PSM coincide with the corresponding queries in its quantum extension .
Corollary 1.
The evaluation of a counterfactual in a PSM coincides with the evaluation of the corresponding do-interventional counterfactual (see also Sec. 7) in its quantum extension .
Proof.
Given a distribution over exogenous nodes, Thm. 3 assures that do-interventions in Eq. (2) yield the same prediction—whether evaluated via Eq. (6) in or as a do-interventional counterfactual via Eq. (29) in . This leaves us with the update step in Pearl’s analysis of counterfactuals (cf. Thm. 1). More precisely, we need to show that the Bayesian update in Eq. (27) does not affect the distribution over the space of additional ancillae and in the proof of Thm. 3. This is a simple consequence of the way distributions over exogenous nodes in are encoded in .
First, the distribution over copy ancillae is given by a delta distribution peaked on the state (see Eq. (90) in App. A). In other words, we have full knowledge of the initialization of the copy ancillae; hence, the update step in Eq. (27) is trivial in this case.
Second, let be any distribution over exogenous nodes in the binary, reversible extension of (see (i) and (ii) in App. A) such that , that is, arises from by marginalisation under the discarding operation (see (ii) in App. A). (Footnote 16: A canonical choice for is the product distribution of and the uniform distribution over , .) But since the variables in are related only to the sink node via (see Eq. (79) in App. A), we have . The marginalised updated distribution thus reads
(33) |
In other words, Bayesian inference in Eq. (27) commutes with marginalisation. ∎
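The commutation of Bayesian updating with marginalisation can be checked numerically in a toy case. The sketch below assumes, as in the argument above, that the likelihood of the evidence does not depend on the additional variables; the labels `u` (original exogenous variable) and `w` (additional variable), and all numbers, are hypothetical.

```python
def bayes(prior, likelihood):
    """Generic Bayesian update; likelihood is a function of the dict key."""
    joint = {k: p * likelihood(k) for k, p in prior.items()}
    norm = sum(joint.values())
    return {k: v / norm for k, v in joint.items()}


def marginalise(dist):
    """Sum a distribution over pairs (u, w) down to a distribution over u."""
    out = {}
    for (u, _w), p in dist.items():
        out[u] = out.get(u, 0.0) + p
    return out


# Hypothetical joint prior over (u, w); it need not be a product distribution.
prior_uw = {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.4, (1, 1): 0.2}
# Hypothetical likelihood P(evidence | u), independent of w by assumption.
lik_u = {0: 0.8, 1: 0.25}

update_then_marginalise = marginalise(bayes(prior_uw, lambda k: lik_u[k[0]]))
marginalise_then_update = bayes(marginalise(prior_uw), lambda u: lik_u[u])
print(update_then_marginalise, marginalise_then_update)
```

Both orders give the same distribution over `u`, because the common factor P(evidence | u) can be pulled out of the sum over `w`.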
Thm. 3 and Cor. 1 show that our definition of QSMs in Def. 10 generalizes that of CSMs in Def. 3. What is more, this generalization is proper: a QSM cannot generally be thought of as a CSM while also keeping the relevant independence conditions between the variables of the model. Indeed, casting a QSM as a CSM amounts to specifying a local hidden variable model for the QSM, yet a general QSM will not admit a local hidden variable model.
In short, the counterfactual probabilities defined by a QSM can generally not be interpreted as probabilities of counterfactuals (to be true). In Sec. 7, we will further analyse the distinctions between classical and quantum counterfactuals, and see some instances of counterfactual queries in the quantum case that do not have an analog in the classical case.
7 Interpretation of quantum counterfactual queries
In this section, we emphasize some crucial differences between the semantics of counterfactuals in classical and quantum causal models. Recall that in order to compute the probability of a counterfactual in a classical structural causal model (CSM), a do-intervention has to be considered in at least one of the nodes. Indeed, there is no way for the antecedent of the counterfactual query to be true without some modification in the model, since a complete specification of the values of exogenous variables determines the values of endogenous variables, and thus determines the antecedent to have its actual value. CSMs are inherently deterministic.
In contrast, in a quantum structural causal model (QSM) the probability that a different outcome would have been obtained can be nonzero even without a do-intervention, since even maximal knowledge of the events at the exogenous nodes does not, in general, determine the outcomes of endogenous instruments. QSMs are inherently probabilistic.
As a consequence, we will distinguish between two kinds of counterfactuals in the quantum case, namely, passive and active counterfactuals, which we define and discuss examples of in Sec. 7.1. In Sec. 7.2, we provide an argument for the disambiguation between passive and active counterfactuals, when faced with an ambiguous (classical) counterfactual query. Moreover, as a consequence of the richer semantics of quantum counterfactuals, in Sec. 7.3 we show how (passive) quantum counterfactuals break the link between causal and counterfactual dependence that exists in the classical setting. We discuss this explicitly in the case of the Bell scenario.
7.1 Passive and active counterfactuals
In Sec. 5.1 we outlined a three-step procedure to evaluate counterfactual probabilities in quantum causal models. Note that, unlike in its classical counterpart (Thm. 1), an arrow-breaking do-intervention is not necessary in order to make the antecedent of the counterfactual true. Counterfactual queries can therefore be evaluated without a do-intervention on the underlying causal graph, and, in particular, without changing the instruments performed at quantum nodes at all. Indeed, according to Def. 13, the expected counterfactual probability has a well-defined numerical value whenever the antecedent has a nonzero probability of occurring for all values of the background variables that are compatible with the evidence , that is,
(34) |
It is possible for Eq. (34) to be satisfied even while keeping the endogenous instruments fixed, i.e. even if . Crucially, unlike in the classical case, we will see that this may be the case even if the antecedent is incompatible with the observed values . This motivates the following distinction for quantum counterfactuals.
Definition 14.
Let be a quantum structural causal model. A counterfactual query (following Def. 13) is called a passive counterfactual if , that is, if no intervention is performed on the nodes specified by the antecedent; otherwise it is called an active counterfactual.
The special case of an active counterfactual where specifies a do-intervention, (see Eq. (28)), will also be called a do-interventional counterfactual.
In the following, we discuss two examples of passive, active and do-interventional counterfactuals.
Example 1.
Consider the causal graph in Fig. 8 and a compatible QSM , where represent endogenous nodes, and represents an exogenous node with the following discard-and-prepare instrument,
(35) |
such that prepares the maximally mixed state, and we assume identity channels between pairs of nodes,
(36) |
With respect to the model , we will calculate expected counterfactual probabilities of the form , where we fix the actual instruments at endogenous nodes with
(37) |
but consider different counterfactual instruments , corresponding to (i) passive, (ii) do-interventional, and (iii) active counterfactual queries. To this end, we first calculate the conditional process operators (cf. Eq. (24)), conditioned on outcomes of the instrument in Eq. (35):
(38) |
(i) Passive case: “Given that occurred in the actually performed instrument , what is the probability that would have obtained using the instrument , had it been that , using the instrument ?”.
In the abduction step, we update our information about the exogenous node, given the evidence. Note that observing the outcome at node in this particular case gives us no information about the outcome at the exogenous node, since both outcomes at occur with equal probability for both of the possible values of the exogenous variable. This is expressed in the following conditional probabilities (cf. Eq. (27) and the denominator of Eq. (26)), noting that here the counterfactual antecedent is , in place of in Eq. (26):
(39) |
(40) |
From the above, we see that we satisfy the conditions for a passive counterfactual to have a well-defined numerical value, given in Eq. (34). Since , this is a passive counterfactual; hence no action step, that is, no intervention, is needed. For the prediction step, we first compute the required counterfactual probabilities from Eq. (26),
(41) |
Using Eq. (1), the counterfactual probability for the different values of can be written as
(42) |
(43) |
The expected counterfactual probability (cf. Eq. (29)) is then
(44) |
(45) |
(46) |
(47) |
The obtained expected counterfactual probability is thus numerically equivalent to the probability of obtaining outcome for the instrument specified by , given the preparation of state at the input of node . In other words, the counterfactual probability is simply dictated by the counterfactual outcome at . This is a sensible result: in this case the actual outcome gives us no information about the exogenous variables , the counterfactual outcome is possible for either of the values of , and furthermore the effect of the exogenous variables on is screened off by the outcome of the projective counterfactual instrument considered for node .
(ii) Do-interventional case: “Given that occurred in the actually performed instrument , what is the probability that would have obtained using the instrument , had it been that , using the instrument ?”.
Here, instead of , we perform the single-element instrument corresponding to a do-intervention,
(48) |
which discards the input and prepares the state at the output of A. As before, the actual instruments are denoted by . Since the actual instruments and evidence are the same as in case (i), the abduction step yields the same result, given on the left-hand side of Eqs. (39) and (40). In the action step, however, we modify the instrument at node , and the counterfactual instruments are now denoted by .
For the prediction step, we again first compute the required counterfactual probabilities from Eq. (26),
(49) |
The counterfactual probabilities for the different values of thus take the same form as in Eqs. (42) and (43), but with the instrument element in place of . The expected counterfactual probability corresponding to query (ii) can then be computed as
(50) |
(51) |
(52) |
As the instrument is arbitrary, we see that we obtain the same result for the do-interventional as for the passive counterfactual in this case. So we see that although the do-intervention guarantees the conditions for the counterfactual antecedent to have occurred, it is not necessary in situations where it could have occurred passively, as in case (i).
(iii) Active case: we ask “Given that occurred in the actually performed instrument , what is the probability that would have obtained using the instrument , had it been that , using the instrument ?”.
Again, the abduction step yields the same results as in cases (i) and (ii). For the action step, this is an active counterfactual whenever . Specifically, let us consider
(53) |
where is an arbitrary projector and is its orthogonal complement. This instrument performs a projective measurement in the basis on the input of , and prepares the corresponding state at the output. Considering the counterfactual probabilities from Eq. (26), we find them to have the same value for the two possible values of , largely independent of ,
(54) |
provided the denominator of Eq. (54) is nonzero (which does depend on ). And since the abducted probabilities for the two values of in this case are equal, the expected counterfactual probability also has the same numerical value whenever the denominator of Eq. (54) is nonzero for both values of , namely,
(55) |
However, there are exceptions to this for some specific values of . The denominator of Eq. (54) will be zero (in other words, the counterfactual antecedent will be impossible) for when and for when . In these cases we set the value of the corresponding counterfactual probability in Eq. (54), and of the expected counterfactual probability in Eq. (55), to , by definition, to indicate a counterpossible (see Sec. 5.1). Whenever it is numerically well-defined, on the other hand, the expected counterfactual probability has the same value as in cases (i) and (ii), considering that the instrument denoted by is again arbitrary.
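The passive case (i) can be reproduced numerically in a concrete instance of this setup. Since the explicit instruments are not reproduced here, the sketch below fixes them by assumption: the exogenous node prepares |0> or |1> with probability 1/2 each (a maximally mixed state on average), node A measures in the |+>, |-> basis and re-prepares the state it found, the evidence is the outcome "+", the antecedent is the outcome "-", and node B performs a projective measurement whose relevant outcome projects onto a real qubit state parametrised by a hypothetical angle `theta`. Identity channels connect the nodes in a chain.

```python
import math


def overlap2(u, v):
    """|<u|v>|^2 for real, normalised qubit vectors given as 2-tuples."""
    return (u[0] * v[0] + u[1] * v[1]) ** 2


s = 1 / math.sqrt(2)
ZERO, ONE = (1.0, 0.0), (0.0, 1.0)
PLUS, MINUS = (s, s), (s, -s)

# Assumed exogenous instrument: lam -> (prior probability, prepared state).
exo = {0: (0.5, ZERO), 1: (0.5, ONE)}
evidence_a, antecedent_a = PLUS, MINUS   # assumed actual / counterfactual outcomes at A
theta = 0.3                              # hypothetical measurement direction at B
b_vec = (math.cos(theta), math.sin(theta))

# Abduction: Bayesian update of lam given the evidence at A.
joint = {lam: p * overlap2(evidence_a, psi) for lam, (p, psi) in exo.items()}
norm = sum(joint.values())
posterior = {lam: j / norm for lam, j in joint.items()}

# Prediction (passive reading): same instrument at A, antecedent outcome "-".
expected = 0.0
for lam, (_p, psi) in exo.items():
    p_ant = overlap2(antecedent_a, psi)                # P(a' | lam), nonzero here
    p_joint = p_ant * overlap2(b_vec, antecedent_a)    # P(b, a' | lam)
    expected += posterior[lam] * p_joint / p_ant

print(posterior, expected)
```

Under these assumptions the evidence leaves the distribution over the exogenous outcomes uniform, and the expected counterfactual probability equals the probability of the outcome at B given that the state re-prepared at A is the antecedent state, in agreement with the discussion of case (i).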
Example 2.
Consider the same setup as in Ex. 1, but with a different instrument at the exogenous node ,
(56) |
This instrument also prepares at the output of the exogenous node, on average, the maximally mixed state, that is, . This implies that, averaging over the values of the exogenous variable, we obtain the same process operator for the endogenous nodes, and thus the same quantum causal model as per Def. 7. However, it implies a distinct quantum structural causal model, as per Def. 10. Let us analyse some of the consequences of this fact for the evaluation of counterfactuals.
In contrast to Ex. 1, the conditional process operators for outcomes now become
(57) | ||||
(58) |
Again, we compute the (i) passive, (ii) do-interventional, and (iii) active counterfactual probabilities for the same counterfactual queries as in Ex. 1.
The required abducted probabilities (Eq. (27)) will be again the same for all queries – which share the same actual instruments and evidence – and are calculated to be
(59) | ||||
(60) |
(i) Passive case: “Given that occurred in the actually performed instrument , what is the probability that would have obtained using the instrument , had it been that , using the instrument ?”
Now the antecedent becomes a counterpossible for , and the corresponding counterfactual probability is set to
(61) |
And since the abducted probability for is nonzero, the expected counterfactual probability is also by definition,
(62) |
We thus see that the change in the preparation instrument at the exogenous node leads to a different result from Ex. 1, even though both cases lead to the same process operator for the endogenous nodes when averaged over the background variables. This illustrates the dependence of counterfactual questions on the quantum structural causal model rather than on the quantum causal model for the endogenous nodes alone.
(ii) Do-interventional case: “Given that occurred in the actually performed instrument , what is the probability that would have obtained using the instrument , had it been that , using the instrument ?”
The do-intervention in Eq. (48) now guarantees that the denominator of the counterfactual probability is nonzero (indeed 1),
(63) |
Since the abducted probability for , given the observed evidence, is zero, the expected counterfactual probability (Eq. (29)) is equal to the counterfactual probability corresponding to , which is computed as
(64) |
(65) |
(66) |
Note that this is the same result as for the same query in Example 1, since the do-intervention breaks the causal dependence on the exogenous variables in this particular case.
(iii) Active case: “Given that occurred in the actually performed instrument , what is the probability that would have obtained using the instrument , had it been that , using the instrument ?”
Using the instrument in Eq. (53), we find the counterfactual probability to have the same form for the two values of , and largely independent of (provided the denominator is nonzero, which again depends on ),
(67) |
Distinct from Example 1, however, the denominator of Eq. (67) will be zero (in other words, the counterfactual antecedent will be impossible) for when and for when . However, since the only nonzero abducted probability is , the expected counterfactual probability is numerically well-defined for all ,
(68) |
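The contrast between the passive and do-interventional readings in this example can also be checked numerically. As the explicit instruments are not reproduced here, the sketch below again fixes them by assumption: the exogenous node prepares |+> or |-> with probability 1/2 each, node A measures in the same basis and re-prepares the state it found, the evidence is the outcome "+", the antecedent is the outcome "-", and the outcome at B projects onto a real qubit state with a hypothetical angle `theta`. The value `None` is a hypothetical stand-in for the conventional value assigned to counterpossibles.

```python
import math


def overlap2(u, v):
    """|<u|v>|^2 for real, normalised qubit vectors given as 2-tuples."""
    return (u[0] * v[0] + u[1] * v[1]) ** 2


EPS = 1e-12
s = 1 / math.sqrt(2)
PLUS, MINUS = (s, s), (s, -s)

# Assumed exogenous instrument: lam -> (prior probability, prepared state).
exo = {"+": (0.5, PLUS), "-": (0.5, MINUS)}
evidence_a, antecedent_a = PLUS, MINUS
theta = 0.3
b_vec = (math.cos(theta), math.sin(theta))

# Abduction: the evidence now identifies the exogenous outcome.
joint = {lam: p * overlap2(evidence_a, psi) for lam, (p, psi) in exo.items()}
norm = sum(joint.values())
posterior = {lam: j / norm for lam, j in joint.items()}


def passive_expected():
    """Keep the measurement at A; return None for a counterpossible."""
    total = 0.0
    for lam, (_p, psi) in exo.items():
        if posterior[lam] < EPS:
            continue                      # excluded by the evidence
        if overlap2(antecedent_a, psi) < EPS:
            return None                   # antecedent impossible for this lam
        total += posterior[lam] * overlap2(b_vec, antecedent_a)
    return total


def do_expected():
    """Replace the instrument at A by one that discards its input and prepares
    the antecedent state, so the antecedent has probability 1 for every lam."""
    return sum(posterior[lam] * overlap2(b_vec, antecedent_a) for lam in exo)


print(posterior, passive_expected(), do_expected())
```

Under these assumptions the passive reading is a counterpossible, because the only exogenous outcome compatible with the evidence makes the antecedent impossible, whereas the do-interventional reading remains well-defined and depends only on the state prepared by the intervention.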
Comparing the two examples, we see that while the passive counterfactual in Ex. 2 is always a counterpossible, that is, it has an impossible antecedent (and is thus assigned a conventional value ), the same passive counterfactual evaluated in Ex. 1 yields an expected counterfactual probability that is always numerically well-defined. This is in stark contrast to the classical case, where, as a consequence of the intrinsic determinism of classical structural causal models, a passive interpretation of a counterfactual query would always result in a counterpossible. Note also that both examples have the same average state (a maximally mixed state) prepared at the exogenous node, showing that different contexts for the state preparations of the same mixed state, and thus different quantum structural models, can result in different evaluations for a quantum counterfactual. We also see that in both examples the do-interventional counterfactual is numerically well-defined and has the same value, whereas the counterfactual probabilities in active cases are counterpossibles for different counterfactual instruments.
These distinctions then lead to the question: what should we do when a counterfactual statement is not already in standard form, and thus it is unclear whether it should be interpreted as a passive or active counterfactual?
7.2 Disambiguation of counterfactual queries: the principle of minimality
Note that classical counterfactuals evaluated with respect to a probabilistic structural causal model correspond to do-interventional (quantum) counterfactual queries when evaluated with respect to the quantum extension (cf. Cor. 1). In fact, a classical counterfactual query in Def. 4 is always defined in terms of a do-intervention, since this is the only way to make the antecedent true. In this sense, we may say that classical counterfactual queries naturally embed into our formalism as do-interventional counterfactuals.
Yet, the richer structure of quantum counterfactuals, as seen in the previous section, may sometimes allow for a different interpretation of a classical counterfactual query, in particular, the antecedent of a quantum counterfactual can sometimes be true without an intervention. This leaves a certain ambiguity if we want to interpret a classical counterfactual as a quantum counterfactual query according to Def. 13: for the latter, one must specify a counterfactual instrument, in particular, one must decide whether to interpret the classical counterfactual query passively or actively (do-interventionally). For example, again referring to the scenario represented in Fig. 8, consider the query:
Given that , what is the probability that , had it been that ?
Note that all of the counterfactuals in the previous section are of this form, until we specify what instruments those outcomes correspond to. And in the model of Example 1, both the passive and do-interventional interpretations of this query are always numerically well-defined.
This ambiguity does not occur in a classical structural causal model (CSM), since in that case all the variables are determined by a complete specification of the exogenous variables. Consequently, the only way the antecedent of a counterfactual like the one above could be realized while keeping the background variables fixed is via some modification of the model. (Footnote 17: We remark that, contrary to Pearl, a counterfactual may also be interpreted as a backtracking counterfactual, where the background conditions are not necessarily kept fixed. A semantics for backtracking counterfactuals within a classical SCM has recently been proposed in Ref. [46].) Pearl justifies the do-intervention as “the minimal change (to a model) necessary for establishing the antecedent” [1]. In our case, due to the split-node structure, a do-intervention is reflected not as a change in the model itself, but as a change in the instrument used at the antecedent nodes, that is, via the use of a do-instrument.
To decide whether a counterfactual query not in standard form should be analyzed as passive or active when interpreted with respect to a QSM, we thus propose a principle of minimality, motivated by the minimal changes from actuality required in Pearl’s analysis. If the antecedent of a counterfactual can be established with no change to (the instruments applied to) a model – that is, as in a passive reading of the counterfactual – this is by definition the minimal change.
Definition 15 (Principle of Minimality).
Lewis’ account of counterfactuals invokes a notion of similarity among possible worlds [16]. For Lewis, one should order the closest possible worlds by some measure of similarity, based on which a counterfactual is declared true in a world if the consequent of the counterfactual is true in all the closest worlds to where the antecedent of the counterfactual is true. Arguably, a world in which both the model and instruments are the same, but where the counterfactual antecedent occurs, is closer to the actual world than any world where a different instrument is used. Thus the Principle of Minimality is also justified by a form of Lewis’ analysis applied to our case.
In Example 1, however, where both the passive and do-interventional readings are numerically well-defined, they also produce the same (expected) counterfactual probability, rendering the ambiguity essentially irrelevant. We next turn to an important example where this is not the case.
7.3 Causal dependence and counterfactual dependence in the Bell scenario
A conceptually important consequence of our semantics for counterfactuals (and relevant for the disambiguation of passive from active counterfactual queries) is that, unlike in the case of Pearl’s framework, counterfactual dependence does not in general imply causal dependence. We establish this claim using the pertinent example of a common cause scenario, as shown in Fig. 9. This is essentially the causal structure of a Bell scenario, as in Fig. 5, although here we omit the nodes associated with the choices of setting and , which are now the choices of instrument for the quantum nodes and , and are left implicit. (Footnote 18: Similarly, the labels and of the quantum nodes themselves are not to be confused with the outcomes of instruments that may be used at those nodes.)
Example 3 (Bell scenario).
Consider the causal scenario in Fig. 9, with instruments
(69) | ||||
(70) | ||||
(71) |
where the output of factorises as and where is a Bell state. Let the unitary channel be given by identities
(72) |
Here for simplicity we consider a case where we have complete information about the event at the common cause node (that is, the single-outcome instrument that prepares the Bell state), and thus there is no exogenous node, and no abduction step is necessary. Now consider the counterfactual query
:“Given that , what is the probability that had it been that ?”.
This query is ambiguous until we specify what instruments those counterfactual outcomes correspond to. On the one hand, interpreting as a do-interventional counterfactual with , while keeping the instrument at fixed – the most parsimonious interpretation, since the consequent of a counterfactual does not call for an intervention even in the classical case – we obtain the counterfactual probability (which here coincides with the expected counterfactual probability, as there is no abduction involved),
(73) |
Similarly, consider the query
:“Given that , what is the probability that had it been that ?”.
Interpreted do-interventionally, the answer to this query is
(74) |
In other words, there is no counterfactual dependence between and in this case, when the counterfactual antecedent is interpreted as the outcome of a do-intervention.
On the other hand, both and can be interpreted as a passive counterfactual query (with ), since the antecedents of both queries have nonzero model probabilities . In this passive reading, we obtain
(75) |
for , and
(76) |
for . According to the above equations, it would have been the case that with certainty had it been the case that , and it would have been the case that with certainty had it been the case that (which is, in fact, the actual case). In other words, the outcomes at and are counterfactually dependent.
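The contrast between the two readings can be checked numerically. The following is a minimal sketch rather than the model of Example 3 itself: it assumes that the common cause prepares the Bell state (|00⟩ + |11⟩)/√2, that both nodes apply projective measurements in the computational basis, and that a do-instrument discards the incoming system. The function names are hypothetical.

```python
from math import sqrt, isclose

# Assumption: the common cause prepares |Phi+> = (|00> + |11>)/sqrt(2),
# and both nodes measure in the computational basis.
AMPLITUDES = {(0, 0): 1 / sqrt(2), (1, 1): 1 / sqrt(2)}


def joint(a, b):
    """Born-rule probability of the outcome pair (a, b)."""
    return abs(AMPLITUDES.get((a, b), 0.0)) ** 2


def passive(b, a_cf):
    """Passive reading: condition on the counterfactual outcome a_cf
    of the same (actual) instrument at the first node."""
    p_a = sum(joint(a_cf, y) for y in (0, 1))
    return joint(a_cf, b) / p_a


def do_interventional(b, a_cf):
    """Active reading: the do-instrument discards the incoming system,
    so the second node only sees its marginal state; a_cf plays no role."""
    return sum(joint(x, b) for x in (0, 1))


for a_cf in (0, 1):
    for b in (0, 1):
        print(a_cf, b, passive(b, a_cf), do_interventional(b, a_cf))
```

Under these assumptions the passive reading assigns probability 1 to matching outcomes, whereas the do-interventional reading returns the marginal 1/2 irrespective of the antecedent.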
In Pearl’s classical semantics, counterfactual dependence of the type in Eqs. (75) and (76) would imply that is a cause of . (Footnote 19: In Lewis’s account [30], such counterfactual dependence also implies causal dependence. The difference is that Pearl analyzes counterfactuals in terms of causation, which he takes to be more fundamental, whereas Lewis analyzes causation in terms of counterfactuals, which he takes as more fundamental.) Nevertheless, the quantum structural causal model we used to derive this result has by construction no causal dependence from to . This shows that in quantum causal models, (passive) counterfactual dependence does not imply causal dependence.
Note also that in the passive reading, the counterfactual antecedent corresponds to an event – in the technical sense of an instrument element, that is, a CP map – that was one of the potential outcomes of the actual instrument. A counterfactual antecedent interpreted as a do-intervention, on the other hand, is a different event altogether – technically distinct from any event in the actual instrument. This fact is obscured in the classical case, since in Pearl’s formalism we identify the incoming and outgoing systems, and it is implicitly assumed that we can always (at least in principle) perform ideal non-disturbing measurements of the variables involved. Classically, the event ‘’ can ambiguously correspond to “an ideal non-disturbing measurement of has produced the outcome ” or “the variable was set to the value ”. The distinction between those interpretations, in Pearl, is attributed to the structural relations in the model; the second interpretation is represented as a surgical excision of causal arrows, while leaving the variables themselves otherwise intact. In a quantum causal model, on the other hand, a do-intervention corresponds to a related but technically distinct event in an otherwise intact model.
In our view, counterfactual dependence without causal dependence is a much more distinctively quantum feature than the failure of “counterfactual definiteness”, which we turn to next.
8 A Note on Counterfactual Definiteness and Bell’s theorem
As we’ve seen in Sec. 7, one of the features of our formalism is that a counterfactual proposition is not always either true or false, unlike in the classical semantics. In the classical case, a structural causal model underpins the deterministic nature of the system by defining functional dependencies among the nodes. This is not so in the quantum case, even with full knowledge about a quantum structural model. Whereas in the classical case we can define the probability of a counterfactual (to be true), in the quantum case we can in general only define a counterfactual probability.
The lack of definite truth values for counterfactuals in quantum mechanics can be thought of as a failure of “counterfactual definiteness”, a concept that has a long and controversial history in discussions of Bell’s theorem. Skyrms [14] defines counterfactual definiteness (CFD) as follows (attributing it to Stapp [47], who expressed the idea that Bell’s theorem requires it as an underlying assumption):
“Counterfactual definiteness; essentially the assumption that subjunctive conditionals of the form: ‘If measurement had been performed, result would have been obtained’ always have a definite truth value (even for measurements that were not carried out because incompatible measurements were being made) and that the quantum mechanical statistics are the probabilities of such conditionals.”
Skyrms [14] argues, contrary to Stapp, that some forms of Bell’s theorem can be proved using conditional probabilities rather than probabilities of subjunctive conditionals, and hence that CFD is not a necessary assumption in its derivation. Since then there has been a long debate about the status of CFD as an assumption underlying the derivation of Bell’s theorem (see e.g. [48, 49, 50, 51, 52, 53]). Analysing the details of this long and nuanced literature is beyond our scope, but from the perspective of our framework, we can make some (hopefully) clarifying remarks.
Firstly, Bell inequalities can be derived from many different sets of propositions [33, 44]. Disagreements often arise due to different “camps” using the same terms (most notably ‘locality’) to refer to different concepts. Bell’s 1964 theorem [12] explicitly used a notion of ‘locality’ and an assumption of determinism (as well as an implicit “free choice” assumption). Bell inequalities can also be derived [33], however, by replacing Locality and determinism by Bell’s 1976 notion of Local Causality [54] (stronger than Locality).
In the causal language we are using here, Bell’s 1964 propositions, using determinism, are analogous to assuming the existence of a classical structural causal model (over the common-cause causal structure of a Bell scenario). As we’ve seen above, in Pearl’s semantics, given a classical structural causal model, counterfactuals always have well-defined truth values. When one argues that CFD is necessary for derivations of Bell inequalities, rather than ‘locality’ alone, one is (we surmise) implicitly assuming something like Bell’s 1964 notion of Locality, rather than Local Causality. To derive a Bell inequality, the notion of Locality indeed needs to be supplemented with something else, and the upshot is that CFD carries the same effect in this context as assuming determinism, as Bell did in 1964.
However, one may also assume Local Causality instead, as Bell did in 1976 (or indeed other assumptions [33, 44]), without assuming determinism. In this case, in causal language, one can effectively assume a Classical Causal Model, without further assuming that there exists an underlying Structural Causal Model. In this case, CFD may in general fail, or at least we fail to have the structure necessary to define Pearl’s semantics of counterfactuals. Thus, Bell inequalities can be derived without assuming CFD.
Nevertheless, one may argue (essentially as in [49]), that determinism can be derived from Local Causality plus the perfect correlations of a pure entangled state. This would then make CFD true after all. However, perfect correlations are not observable in practice in real experiments, and in any case this does not change the fact that the derivation of Bell inequalities does not require perfect correlations (and indeed this is what makes them experimentally testable!).
Clearly, CFD as defined by Skyrms [14] does indeed fail in the quantum framework presented here. This however does not imply that Bell’s theorem can be resolved merely by rejecting CFD. Indeed, CFD may fail even in a completely classical but indeterministic causal model. The violation of Bell inequalities is explained within quantum causal models not simply as due to the indeterminism of quantum theory, but as due to something deeper. One way of thinking about it is that it is due to the failure of Reichenbach’s principle of common cause, which is in turn a consequence of the Classical Causal Markov Condition: in quantum causal models, a complete specification of the causes of an event (in the case of a Bell scenario, the preparation of the entangled state) does not in general render it uncorrelated with its non-effects – even if those non-effects are space-like separated, like the outcome of a distant instrument. Another way of thinking about it, we argue here, is that instead of the failure of CFD, the “quantumness” of quantum causal models is better captured by the fact that, as discussed in Sec. 7.3, quantum causal models allow for counterfactual dependence without causal dependence. A more detailed discussion of this point, and how it does not arise merely as an artefact of the split-node structure of quantum causal models, will be left for future work [55].
9 Generalisations and related work: quantum Bayes’ theorem
Thm. 3 and Cor. 1 show that our formalism for counterfactuals in quantum causal models (see Sec. 5) is a valid generalization of Pearl’s formalism in the classical case (see Sec. 2). In this section, we review the key assumptions of our formalism, discuss possible generalizations, and draw parallels with related work on quantum Bayesian inference.
Recall that our notion of a ‘quantum counterfactual’ in Def. 13 is evaluated with respect to a quantum structural causal model (QSM) (see Def. 10). A QSM reproduces a given physical process operator over observed nodes , that is, arises from coarse-graining of ancillary (environmental) degrees of freedom in (cf. Eq. (21)). As such, encodes additional information that is not present in : namely, (i) it assumes an underlying unitary process , and (ii) it incorporates partial knowledge about the preparation of ancillary states at exogenous nodes in the form of preparation instruments (cf. Eq. (19)), acting on an arbitrary input state. Together, this allowed us to reduce the abduction step in our formalism to classical Bayesian inference.
We remark that this situation (of a unitary background process with ancillas prepared in a fixed basis) arises naturally in the context of quantum circuits, future quantum computers, and thus supposedly in the context of future quantum AI. Nevertheless, for other use cases it might be less clear how to model our background knowledge on a physical process in terms of a QSM, thus prompting relaxations of the assumptions baked into Def. 10. First, one may want to drop our assumption of a unitary background process. This assumption closely resembles Pearl’s classical formalism, which models any uncertainty about a stochastic physical process as a probabilistic mixture of deterministic processes. Yet, one might argue that assuming a unitary background process is too restrictive (or perhaps even in general fundamentally unwarranted) and that one should allow for arbitrary convex decompositions of a quantum stochastic process (CPTP map). To this end, note that knowledge about stable facts [43] that lead to a preferred convex decomposition of the process operator (into valid process operators ) is all that is necessary to perform (classical) Bayesian inference (cf. Eq. (27)).
A more radical generalization could arise by taking our information about exogenous variables to be inherently quantum. That is, without information in the form of stable facts regarding the distribution over outcomes of preparation instruments in a QSM, our knowledge about exogenous variables merely takes the form of a generic quantum state . (Footnote 20: Note that without the extra information about exogenous instruments in a QSM, we reduce to the situation described by a “unitary process operator with inputs”, as defined in Def. 4.5 of Ref. [29] (see also Thm. 2).) In this case, inference can no longer be described by (classical) Bayes’ theorem but requires a quantum generalization. Much recent work has been devoted to finding a generalization of Bayes’ theorem to the quantum case, which has given rise to various different proposals for a quantum Bayesian inverse – see Refs. [56, 57, 58] for a recent categorical (process-diagrammatic) definition and Ref. [59] for an attempt at an axiomatic derivation. Once a definition for the Bayesian inverse has been fixed, and provided it exists (Footnote 21: Ref. [58] characterises the existence of a Bayesian inverse in the categorical setting for finite-dimensional C*-algebras), we can perform a generalized abduction step in Sec. 5.1 and, consequently, obtain a generalised formalism for counterfactuals. We leave a more careful analysis of counterfactuals arising in this way and their comparison to our formalism for future study.
10 Conclusion
We defined a notion of counterfactual in quantum causal models and provided a semantics to evaluate counterfactual probabilities, generalizing the three-step procedure of evaluating probabilities of counterfactuals in classical causal models due to Pearl [1]. The third level in Pearl’s ladder of causality (see Fig. 1) had thus far remained an open gap in the generalization of Pearl’s formalism to quantum causal models; here, we fill this gap.
To this end, we introduce the notion of a quantum structural causal model, which takes inspiration from Pearl’s notion of a classical structural causal model, yet differs from the latter in several ways. A quantum structural model is fundamentally probabilistic; it does not assign truth values to all counterfactuals, and in this sense violates “counterfactual definiteness”.
Despite these differences, we prove that every classical structural causal model admits an extension to a quantum structural causal model, which preserves the relevant independence conditions and yields the same probabilities for counterfactual queries arising in the classical case. Thus, quantum structural causal models and the evaluation of counterfactuals therein subsume Pearl’s classical formalism.
On the other hand, quantum structural causal models have a richer structure than their classical counterparts. We identify different types of counterfactual queries arising in quantum causal models, and explain how they are distinguished from counterfactual queries in classical causal models. Based on this distinction, we evaluate these different types of quantum counterfactual queries in the Bell scenario and show that counterfactual dependence does not generally imply causal dependence in this case. In this way, our analysis provides a new way of understanding how quantum causal models generalize Reichenbach’s principle of common cause to the quantum case [34, 21, 22]: a quantum common cause allows for counterfactual dependence without causal dependence, unlike a classical common cause.
Our work opens up several avenues for future study. Of practical importance are applications of counterfactuals in quantum technologies. For example, questions such as “Given that certain outcomes were observed at receiver nodes in a quantum network, what is the probability that different outcomes would have been observed, had there been an eavesdropper in the network?” can be relevant for security applications.
It is well-known that quantum theory violates “counterfactual definiteness” in the sense of the phenomenon often referred to as ‘quantum contextuality’ [13]. The latter has been identified as a key distinguishing feature between classical and quantum physics, as well as a resource for quantum computation [60, 61, 62, 63]. It would thus be interesting to study contextuality from the perspective of the counterfactual semantics spelled out here.
Finally, our analysis hinges on the classicality (‘stable facts’ in Ref. [43]) of background (exogenous) variables in the model, as it allows us to apply (classical) Bayesian inference to our classical knowledge about exogenous variables. In turn, considering the possibility of our ‘prior’ knowledge about exogenous variables to be genuinely quantum motivates a generalization of Bayes’ theorem to the quantum case (see Sec. 9). We expect that combining our ideas with recent progress along those lines will constitute a fruitful direction for future research.
Acknowledgements.
The authors acknowledge financial support through grant number FQXi-RFP-1807 from the Foundational Questions Institute and Fetzer Franklin Fund, a donor advised fund of Silicon Valley Community Foundation, and ARC Future Fellowship FT180100317.
References
- [1] J. Pearl. Causality: Models, Reasoning and Inference. Cambridge University Press, 2000.
- [2] J. D. Fearon. Causes and Counterfactuals in Social Science: Exploring an analogy between cellular automata and historical processes, pages 39–68. Princeton University Press, 1996.
- [3] M. Loi and M. Rodrigues. A note on the impact evaluation of public policies: the counterfactual analysis. Nov 2012. https://mpra.ub.uni-muenchen.de/id/eprint/42444.
- [4] S. Tagini et al. Counterfactual thinking in psychiatric and neurological diseases: A scoping review. PLOS ONE, 16(2):e0246388, Feb 2021. https://doi.org/10.1371/journal.pone.0246388.
- [5] M. Ravallion. Poverty in China since 1950: A Counterfactual Perspective. Technical report, National Bureau of Economic Research, Jan 2021. 10.3386/w28370.
- [6] G. Woo. A counterfactual perspective on compound weather risk. Weather and Climate Extremes, 32:100314, Jun 2021. https://doi.org/10.1016/j.wace.2021.100314.
- [7] K. Holtman. Counterfactual Planning in AGI systems. arXiv e-prints, Jan 2021. https://doi.org/10.48550/arXiv.2102.00834.
- [8] C. Hoerl et al. Understanding Counterfactuals, Understanding Causation: Issues in Philosophy and Psychology. Oxford University Press, 2011.
- [9] J. Collins et al. Causation and Counterfactuals. MIT Press, 2004.
- [10] B. Black et al. Covid-19 boosters: If the US had matched Israel’s speed and take-up, an estimated 29,000 US lives would have been saved: Study compares Israel’s Covid-19 booster performance to the US. Health Affairs, 42(12):1747–1757, 2023. https://doi.org/10.1377/hlthaff.2023.00718.
- [11] L. Vaidman. Counterfactuals in Quantum Mechanics. In Compendium of Quantum Physics, pages 132–136. Springer, 2009.
- [12] J. S. Bell. On the Einstein-Podolsky-Rosen paradox. Physics, 1:195, Nov 1964. https://doi.org/10.1103/PhysicsPhysiqueFizika.1.195.
- [13] S. Kochen and E. P. Specker. The problem of hidden variables in quantum mechanics. Journal of Mathematics and Mechanics, 17:59–87, 1967.
- [14] B. Skyrms. Counterfactual definiteness and local causation. Philosophy of Science, 49(1):43–50, 1982. https://doi.org/10.1086/289033.
- [15] A. Peres. Unperformed experiments have no results. American Journal of Physics, 46(7):745–747, 1978. https://doi.org/10.1119/1.11393.
- [16] D. Lewis. Counterfactuals and Comparative Possibility. In IFS, pages 57–85. Springer, 1973.
- [17] J. Pearl and D. Mackenzie. The Book of Why: the New Science of Cause and Effect. Basic Books, 2018.
- [18] C. J. Wood and R. W. Spekkens. The lesson of causal discovery algorithms for quantum correlations: causal explanations of Bell-inequality violations require fine-tuning. New J. Phys., 17(3):033002, Mar 2015. 10.1088/1367-2630/17/3/033002.
- [19] E. G. Cavalcanti. Classical causal models for Bell and Kochen-Specker inequality violations require fine-tuning. Phys. Rev. X, 8(2):021018, Apr 2018. https://doi.org/10.1103/PhysRevX.8.021018.
- [20] J. C. Pearl and E. G. Cavalcanti. Classical causal models cannot faithfully explain Bell nonlocality or Kochen-Specker contextuality in arbitrary scenarios. Quantum, 5:518, Aug 2021. https://doi.org/10.22331/q-2021-08-05-518.
- [21] M. S. Leifer and R. W. Spekkens. Towards a formulation of quantum theory as a causally neutral theory of Bayesian inference. Phys. Rev. A, 88(5), Nov 2013. https://doi.org/10.1103/PhysRevA.88.052130.
- [22] E. G. Cavalcanti and R. Lal. On modifications of Reichenbach’s principle of common cause in light of Bell’s theorem. J. Phys. A, 47(42):424018, Oct 2014. https://iopscience.iop.org/article/10.1088/1751-8113/47/42/424018.
- [23] J. Henson, R. Lal, and M. F. Pusey. Theory-independent limits on correlations from generalized Bayesian networks. New J. Phys., 16(11):113043, Nov 2014. arXiv:1405.2572.
- [24] R. Chaves, C. Majenz, and D. Gross. Information–theoretic implications of quantum causal structures. Nature Communications, 6:5766, Jan 2015. arXiv:1407.3800.
- [25] J. Pienaar and Č. Brukner. A graph-separation theorem for quantum causal models. New J. Phys., 17(7):073020, Jul 2015. arXiv:1406.0430.
- [26] F. Costa and S. Shrapnel. Quantum causal modelling. New J. Phys., 18(6):063032, Jun 2016. 10.1088/1367-2630/18/6/063032.
- [27] J. M. A. Allen et al. Quantum common causes and quantum causal models. Phys. Rev. X, 7:031021, Jul 2017. https://doi.org/10.1103/PhysRevX.7.031021.
- [28] S. Shrapnel et al. Updating the Born rule. New J. Phys., 20(5):053010, May 2018. 10.1088/1367-2630/aabe12.
- [29] J. Barrett, R. Lorenz, and O. Oreshkov. Quantum causal models, Jun 2019. https://doi.org/10.48550/arXiv.1906.10726.
- [30] D. Lewis. Causation. J. Philos, 70(17):556, Oct 1973. 10.2307/2025310.
- [31] A. Shimony. Controllable and uncontrollable non-locality. In S. Kamefuchi, editor, Foundations of Quantum Mechanics in the Light of New Technology, pages 225–230. Physical Society of Japan, Tokyo, 1984.
- [32] J. S. Bell. The theory of local beables. Epistemol. Lett., 9:11–24, Nov 1975.
- [33] H. M. Wiseman and E. G. Cavalcanti. Causarum Investigatio and the Two Bell’s Theorems of John Bell, pages 119–142. Springer, Cham, 2017.
- [34] H. Reichenbach. The Direction of Time, volume 65. University of California Press, 1991.
- [35] J. F. Clauser et al. Proposed experiment to test local hidden-variable theories. Phys. Rev. Lett., 23:880–884, Oct 1969. https://doi.org/10.1103/PhysRevLett.23.880.
- [36] A. Aspect et al. Experimental tests of realistic local theories via Bell’s theorem. Phys. Rev. Lett., 47:460–463, Aug 1981. https://doi.org/10.1103/PhysRevLett.47.460.
- [37] M. Giustina et al. Significant-loophole-free test of Bell’s theorem with entangled photons. Phys. Rev. Lett., 115:250401, Dec 2015. https://doi.org/10.1103/PhysRevLett.115.250401.
- [38] L. K. Shalm et al. Strong loophole-free test of local realism. Phys. Rev. Lett., 115:250402, Dec 2015. https://doi.org/10.1103/PhysRevLett.115.250402.
- [39] T. Hoffreumon and O. Oreshkov. The multi-round process matrix. Quantum, 5:384, Jan 2021. https://doi.org/10.22331/q-2021-01-20-384.
- [40] M. Frembs and E. G. Cavalcanti. Variations on the Choi-Jamiołkowski isomorphism. arXiv e-prints, Nov 2022. https://doi.org/10.48550/arXiv.2211.16533.
- [41] A. Jamiołkowski. Linear transformations which preserve trace and positive semidefiniteness of operators. Rep. Math. Phys., 3(4):275 – 278, Dec 1972. https://doi.org/10.1016/0034-4877(72)90011-0.
- [42] M. Araújo et al. Witnessing causal nonseparability. New J. Phys., 17(10):102001, Oct 2015. 10.1088/1367-2630/17/10/102001.
- [43] A. Di Biagio and C. Rovelli. Stable facts, relative facts. Found. Phys., 51(1):30, Feb 2021. https://doi.org/10.1007/s10701-021-00429-w.
- [44] E. G. Cavalcanti and H. M. Wiseman. Implications of local friendliness violation for quantum causality. Entropy, 23(8):925, 2021. https://doi.org/10.3390/e23080925.
- [45] Eric G. Cavalcanti. Bell’s theorem and the measurement problem: reducing two mysteries to one? Journal of Physics: Conference Series, 701:012002, March 2016. arXiv: 1602.07404.
- [46] J. von Kügelgen et al. Backtracking counterfactuals. arXiv e-prints, Nov 2022. https://doi.org/10.48550/arXiv.2211.00472.
- [47] Henry Pierce Stapp. S-matrix interpretation of quantum theory. Physical Review D, 3(6):1303, 1971. https://doi.org/10.1103/PhysRevD.3.1303.
- [48] Guy Blaylock. The EPR paradox, Bell’s inequality, and the question of locality. American Journal of Physics, 78(1):111–120, 2010. https://doi.org/10.1119/1.3243279.
- [49] Tim Maudlin. What Bell proved: A reply to Blaylock. American Journal of Physics, 78(1):121–125, 2010. https://doi.org/10.1119/1.3243280.
- [50] Robert B Griffiths. EPR, Bell, and quantum locality. American Journal of Physics, 79(9):954–965, 2011. https://doi.org/10.1119/1.3606371.
- [51] Tim Maudlin. How Bell reasoned: A reply to Griffiths. American Journal of Physics, 79(9):966–970, 2011. https://doi.org/10.1119/1.3606476.
- [52] Justo Pastor Lambare and Rodney Franco. A note on Bell’s theorem logical consistency. Foundations of Physics, 51(4):84, 2021. https://doi.org/10.1007/s10701-021-00488-z.
- [53] Marek Żukowski and Časlav Brukner. Quantum non-locality-it ain’t necessarily so… Journal of Physics A: Mathematical and Theoretical, 47(42):424009, 2014. 10.1088/1751-8113/47/42/424009.
- [54] John Stewart Bell. Epistemological Lett. 9. pages 11–24, 1976.
- [55] Ardra Kooderi Suresh and Eric G. Cavalcanti. Counterfactuals in classical split-node causal models (in preparation).
- [56] K. Cho and B. Jacobs. Disintegration and Bayesian inversion via string diagrams. Math. Struct. Comput. Sci., 29(7):938–971, 2019. https://doi.org/10.1017/S0960129518000488.
- [57] T. Fritz. A synthetic approach to Markov kernels, conditional independence and theorems on sufficient statistics. Adv. Math., 370:107239, 2020. https://doi.org/10.1016/j.aim.2020.107239.
- [58] A.J. Parzygnat and B. P. Russo. A non-commutative Bayes’ theorem. Lin. Alg. Appl., 644:28–94, Jul 2022. https://doi.org/10.1016/j.laa.2022.02.030.
- [59] A. J. Parzygnat and F. Buscemi. Axioms for retrodiction: achieving time-reversal symmetry with a prior. arXiv e-prints, Oct 2022. https://doi.org/10.48550/arXiv.2210.13531.
- [60] R. Raussendorf. Contextuality in measurement-based quantum computation. Phys. Rev. A, 88(2):022322, Aug 2013. https://doi.org/10.1103/PhysRevA.88.022322.
- [61] M. Howard et al. Contextuality supplies the ‘magic’ for quantum computation. Nature, 510(7505):351–355, Jun 2014. https://doi.org/10.1038/nature13460.
- [62] M. Frembs et al. Contextuality as a resource for measurement-based quantum computation beyond qubits. New J. Phys., 20(10):103011, Oct 2018. https://iopscience.iop.org/article/10.1088/1367-2630/aae3ad.
- [63] M. Frembs et al. Hierarchies of resources for measurement-based quantum computation. New J. Phys., 25(1):013002, Jan 2023. https://iopscience.iop.org/article/10.1088/1367-2630/acaee2.
- [64] M. Bataille. Quantum circuits of CNOT gates. arXiv preprint arXiv:2009.13247, Oct 2020. https://doi.org/10.48550/arXiv.2009.13247.
Appendix A Proof of Thm. 3
The proof consists of several parts: (i) we find a binary extension of a classical structural causal model (CSM), (ii) we provide a protocol that extends a (binary) CSM to a reversible one, where all functional relations are bijective, (iii) we encode classical copy operations in a quantum structural causal model (QSM) using CNOT-gates. In the final part (iv), we combine (i)-(iii) to construct a QSM , which (linearly) extends a PSM as desired.
(i) binary encoding: every CSM has a binary extension. Let be a CSM (see Def. 3). First, we enlarge the sets in to sets of cardinality a power of two. For all , let denote a set of cardinality . Second, we extend to the enlarged sets and as follows. For every , let be the function given by
(77) |
In words, identifies all elements in and all elements in with the same (arbitrary) element . It is easy to see that the causal structure of the CSM is the same as that of : the extensions and of and are defined locally, and has the same functional form as . Consequently, admits a (Markov) factorization of the same form as in Eq. (30).
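The binary extension in part (i) can be illustrated on a toy example. The following is a minimal sketch under simplifying assumptions: a single node whose function takes one parent value and one exogenous value, finite sets given as lists, and dummy labels that are hypothetical and assumed not to collide with original elements. All names are illustrative.

```python
def next_power_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    p = 1
    while p < n:
        p *= 2
    return p


def pad(values, tag):
    """Enlarge a finite set (given as a list) to power-of-two cardinality.

    The added elements are hypothetical dummy labels (tag, k); they are
    assumed not to collide with any original element.
    """
    extra = next_power_of_two(len(values)) - len(values)
    return list(values) + [(tag, k) for k in range(extra)]


def binary_extension(f, parent_values, exogenous_values, default):
    """Extend f(v, u) to the padded sets.

    Inputs containing a dummy element are all sent to the same arbitrary
    element `default` of the original codomain.
    """
    original_v, original_u = set(parent_values), set(exogenous_values)

    def f_ext(v, u):
        if v in original_v and u in original_u:
            return f(v, u)
        return default

    return f_ext, pad(parent_values, "dummy_v"), pad(exogenous_values, "dummy_u")


# Example: a three-valued parent and a five-valued exogenous variable.
f_ext, V_ext, U_ext = binary_extension(
    lambda v, u: (v + u) % 3, [0, 1, 2], [0, 1, 2, 3, 4], default=0
)
print(len(V_ext), len(U_ext))  # cardinalities of the padded sets
```

On original inputs the extended function agrees with the original one, so the functional form (and hence the causal structure) is unchanged; only the cardinalities grow. In the construction of the text the codomain is padded in the same way, which the sketch omits.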
(ii) reversibility: every (binary) CSM has a (binary) reversible extension. Let be a CSM. For every , we decompose its domain according to its pre-images for all . In particular, let be a set of cardinality . Next, we label the elements in . We will use the lexicographic order on (any choice of) alphabets labeling the elements in the sets and , respectively. For every , let be any order on such that for all . Combining orders on , we obtain an order on , denoted by . With this order, we define an extension of the form
Let denote the projection onto the first factor , i.e., ; clearly, . (Footnote 22: can be interpreted as a ‘discarding operation’.)
Now, is injective; however, it is not surjective in general. In order to obtain a bijective map, we also need to extend the domain of . Let , and be sets of cardinalities such that
(78) |
Similar to before, we denote the lexicographic order corresponding to (some choice of) alphabets on and by . Let be any decomposition such that and for all . Moreover, let be any order such that for all and . Then we define by
(79) |
In this case, we have . Moreover, is a bijection by construction.
Finally, we note that the extensions are local: they only add degrees of freedom and , which are local to the node , is a cause of only and the are all discarded. It follows that (up to an additional ‘sink’ node ) the CSM for , and has the same causal structure as , that is, admits a (Markov) factorisation of the same form as in Eq. (30). Moreover, note that we can apply the above protocol to the binary extension of a CSM , resulting in a binary, reversible CSM, which we will denote by also. (Footnote 23: Note that the binary encoding in Eq. (77) will generally increase the cardinality of the sink nodes .)
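The first step of part (ii), turning a function into an injective one by recording the position of each input within its pre-image, can be sketched as follows. This is an illustration under stated assumptions, not the full protocol: the function and domain are hypothetical, the iteration order of the domain stands in for the chosen order on each pre-image, and the subsequent extension of the domain to obtain a bijection is not covered.

```python
def injective_extension(f, domain):
    """Return g with g(x) = (f(x), t), where t indexes x within its pre-image.

    The iteration order of `domain` plays the role of the chosen order on
    each pre-image f^{-1}(y).
    """
    position, counts = {}, {}
    for x in domain:
        y = f(x)
        position[x] = counts.get(y, 0)
        counts[y] = position[x] + 1
    return lambda x: (f(x), position[x])


def f(x):
    # Hypothetical non-injective function on {0, ..., 5}.
    return x % 2


domain = list(range(6))
g = injective_extension(f, domain)
images = [g(x) for x in domain]

# g is injective, and projecting onto the first factor recovers f.
print(len(set(images)) == len(domain))
print(all(g(x)[0] == f(x) for x in domain))
```

The second component takes as many values as the largest pre-image has elements (three in this example), which matches the role of the additional set introduced in the text.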
(iii) copy operation. Since classical information can be copied and processed to various nodes in a CSM , we need a way to encode such copying in a quantum extension . Indeed, classical information is copied in the CSM whenever there exist , such that . Let be the set containing all nodes that are related to via . Since we are only interested in proving the existence of a quantum extension of , we will content ourselves with an inefficient copy protocol: namely, we will copy all of to every node . (Footnote 24: This copy protocol will generally increase the cardinality of variables in the CSM, and is inefficient in this regard.)
The idea is to use CNOT gates to implement classical copying in a fixed basis. This requires additional ancillary (quantum) nodes in the quantum extension of a CSM , which we will denote by . More precisely, assume that are quantum nodes with associated Hilbert spaces of respective dimension .252525By (i) and (ii), we may assume that is a binary, reversible CSM, in particular, . Further, let , where denotes a quantum node with associated Hilbert spaces of the same dimension as , that is, . Then we devise a copy protocol involving a total of CNOT-gates as follows,262626Here and below, we implicitly assume the individual CNOT gates in to be ‘padded’ with identities on ancillary systems not involved in the present CNOT gate. As an example, see the copy operation between three-qubit nodes in Eq. (84).
(80) |
where denotes the CNOT gate with control and target given by the qubit in and , respectively; that is,
whenever . We will adopt the convention that the Hilbert space after the operation is discarded. Note that Eq. (80) is then valid for , namely we have in this case.
We are left to show that the encoding in Eq. (80) satisfies the relations whenever in Def. 10 for the additional ancillary nodes . This follows immediately from the commutation relations of CNOT gates (see Prop. 1 in [64]), together with Thm. 2.
Below, we explicitly analyze the case of two CNOT-gates, copying a single qubit from a node to two children . The situation is depicted in Fig. 10.
We want to show that and are local influences only, that is, does not signal to and does not signal to . Let
(81) |
be a state at and let
(82) |
be any bipartite state over the joint system and . (Footnote 27: We remark that in the QSM constructed below, is in fact always a product state.) In order to check the no-influence relation ‘’ we evaluate
(83) |
where is the copy operation in Eq. (80); in our case, it is the product of two CNOT gates,
(84) |
and is given by,
(85) |
By straightforward computation we obtain,
(86) |
which shows that is independent of . Similarly, one shows that is not affected by the matrix elements of .
The argument can be extended to more than two (blocks of) CNOT gates at a node (cf. Eq. (80)) and for nodes with more than one child node.
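The two-CNOT case can also be checked numerically. The following is a minimal sketch, assuming a single control qubit and two single-qubit copy ancillae in product input states; the labels `A`, `B1`, `B2`, the qubit ordering, and the helper names are choices made for this illustration and do not appear in the text. It verifies that the output marginal at one ancilla does not change when the input state of the other ancilla is varied.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
X = np.array([[0, 1], [1, 0]], dtype=complex)
P0 = np.diag([1, 0]).astype(complex)  # projector onto |0>
P1 = np.diag([0, 1]).astype(complex)  # projector onto |1>


def kron(*ops):
    """Tensor product of several operators, left to right."""
    out = np.eye(1, dtype=complex)
    for op in ops:
        out = np.kron(out, op)
    return out


# Qubit order (A, B1, B2): A is the control, B1 and B2 are the targets.
CNOT_A_B1 = kron(P0, I2, I2) + kron(P1, X, I2)
CNOT_A_B2 = kron(P0, I2, I2) + kron(P1, I2, X)
U_COPY = CNOT_A_B2 @ CNOT_A_B1  # the two gates commute


def random_state(rng):
    """Random single-qubit density matrix."""
    g = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
    rho = g @ g.conj().T
    return rho / np.trace(rho)


# einsum patterns for the partial trace over all qubits but one.
_MARGINALS = {"A": "abcdbc->ad", "B1": "abcadc->bd", "B2": "abcabd->cd"}


def output_marginal(rho_a, rho_b1, rho_b2, keep):
    """Reduced output state on `keep` after applying the copy unitary."""
    rho_in = kron(rho_a, rho_b1, rho_b2)
    rho_out = U_COPY @ rho_in @ U_COPY.conj().T
    return np.einsum(_MARGINALS[keep], rho_out.reshape([2] * 6))


rng = np.random.default_rng(0)
rho_a, rho_b1 = random_state(rng), random_state(rng)
# Vary the input at B2 and compare the output marginal at B1.
m1 = output_marginal(rho_a, rho_b1, random_state(rng), "B1")
m2 = output_marginal(rho_a, rho_b1, random_state(rng), "B1")
print(np.allclose(m1, m2))
```

A numerical check of this kind illustrates the no-influence relation for sampled states only; the general statement rests on the computation leading to Eq. (86).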
(iv) quantum extension: every PSM has a quantum extension. Let be a PSM (see Def. 5) with CSM and a probability distribution over exogenous variables of . The structural relations define independence conditions between variables . Representing these by a DAG, let be an order compatible with the order of nodes in the DAG, that is, . Compatibility of with the DAG, equivalently the causal Markov condition in Def. 1, reads
By steps (i) and (ii), we may assume that are sets of cardinality a power of two and that the are bijective functions. In order to construct a QSM from , we promote these classical variables to quantum nodes, by associating them with the (free) Hilbert spaces , , , , and , over the outcome sets associated to , together with a respective choice of orthonormal basis, as well as an identification of bases between input and output spaces, e.g. in . Moreover, since the are bijective, complex-linear extension yields isometries
(87) |
Part (iii) further yields unitaries implementing local classical copy operations,
(88) |
where by our above convention, . Setting (see Fig. 11) and composing the ’s for all nodes in order, we define a global isometry by
(89) |
Recall that by our convention that is discarded after the copy process, we have for . Moreover, adopting the notation of the copy protocol from Eq. (80), we set as well as for all with in the input and output quantum nodes of the isometry and consequently for , where consists of exogenous variables and , and consists of and (see Fig. 11).
Note that the ordering of unitaries in Eq. (89) maintains the partial ordering of the directed acyclic graph corresponding to . Consequently, admits the factorisation in Eq. (32).
Finally, we encode the distribution over exogenous nodes of . First, we trivially extend to by setting for and for in step (i). Second, we define as the product distribution of and the uniform distribution over , i.e., in step (ii). Finally, step (iii) requires us to initialise the input state of the copy ancillae . Taken together, we define instruments
(90) |
In summary, we thus obtain a QSM , which by construction reproduces the classical probability distribution defined by and in Eq. (31), when evaluated on instruments corresponding to projective measurements in the (-times copied) preferred bases, . This completes the proof.
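The mechanism behind this agreement can be illustrated for a single node. The sketch below is an assumption-laden toy example rather than the construction of the proof: it takes a hypothetical bijection `f` on a four-element set and a hypothetical distribution `p`, extends `f` linearly to a permutation matrix, and compares the statistics of a measurement in the preferred basis with the classical push-forward of `p` under `f`.

```python
import numpy as np


def permutation_unitary(f, dim):
    """Linear extension of a bijection f on {0, ..., dim-1}: |x> -> |f(x)>."""
    U = np.zeros((dim, dim))
    for x in range(dim):
        U[f[x], x] = 1.0
    return U


f = [2, 0, 3, 1]                    # hypothetical bijective mechanism
p = np.array([0.1, 0.2, 0.3, 0.4])  # hypothetical exogenous distribution

U = permutation_unitary(f, 4)
rho_in = np.diag(p)                 # p encoded as a diagonal state
rho_out = U @ rho_in @ U.T
# Outcome probabilities of a projective measurement in the preferred basis.
quantum_probs = np.diag(rho_out)

# Classical push-forward of p under f.
classical_probs = np.zeros(4)
for x, px in enumerate(p):
    classical_probs[f[x]] += px

print(np.allclose(quantum_probs, classical_probs))
```

Because the input state is diagonal in the preferred basis and the unitary permutes basis vectors, no coherences are created, so the measurement statistics reduce to the classical ones.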