State-independent all-versus-nothing arguments
Abstract
Contextuality is a key feature of quantum information that challenges classical intuitions and provides the basis for constructing explicit proofs of quantum advantage. While many demonstrations of quantum advantage rest on a contextuality argument, the definition of contextuality differs from one line of research to another, which makes it difficult to relate their results directly. In this report, we review the mathematical structure of sheaf-theoretic contextuality and extend this framework to explain Kochen-Specker type contextuality. We first cover the definitions of contextuality with detailed examples. Then, we state the all-versus-nothing (AvN) argument and define a state-independent AvN class. We show that Kochen-Specker type contextuality, or contextuality in a partial closure, can be translated into this framework by taking the partial closure of observables under the multiplication of commuting measurements. Finally, we compare each case of contextuality in an operator-side view, where the strict hierarchy of contextuality classes in a state-side view seems to merge into the state-independent AvN class together with the partial closure formalism. Overall, this report provides a unified interpretation of contextuality by integrating Kochen-Specker type notions into the state-independent AvN argument. The results offer new insights into contextuality and pave the way for a coherent approach to constructing proofs of quantum advantage.
1 Introduction
So, why do we need a quantum computer? This must be one of the questions most frequently put to those who research quantum computers. The claim of exponential quantum advantage in the Deutsch-Jozsa algorithm [1] and the monumental factoring algorithm by Shor [2] opened an intriguing discussion on quantum advantage. They inspired numerous computational tasks for quantum information systems, for example, the GHZ [3] and magic square games [4], the quantum search algorithm [5], and the variational quantum eigensolver (VQE) [6] for quantum simulation and optimization.
Such cases of quantum advantage are crucial for the development of quantum technologies, since they can open up entirely new areas of application, new markets, and new research fields for quantum scientists. However, discovering and proving a new instance of quantum advantage is challenging work due to the complexity and counter-intuitive nature of quantum mechanics.
Contextuality is a useful tool to characterize the specific point where quantum information systems depart from classical systems. Originating from Bell’s pioneering work on the non-local behavior of quantum mechanics [7], Kochen and Specker [8] characterized the idea of contextuality based on the problem of a hidden variable theory. Subsequent investigations on quantum observables were made by Mermin [9, 10] who first used the term contextuality in his articles. Recent attempts to apply mathematical structures to contextuality [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24] have gained a certain degree of success, producing proofs of quantum advantage in some computational processes [18, 20, 23].
1.1 Applications of contextuality to quantum advantage
Contextuality as a resource of quantum advantage was first discussed by Raussendorf [19] in a measurement-based quantum computing model (MBQC). His major results state that an MBQC requires contextuality when it computes a non-linear Boolean function with a certain probability. This idea was developed by Bermejo-Vega et al. [20], establishing the explicit proof that contextuality is a necessary resource for a large class of MBQC schemes.
It has also been observed that contextuality plays a crucial role in a gate-based quantum computing model. Bravyi, Gosset, and König [25] reported the quantum advantage of shallow circuits, which was the first mathematical proof of an unconditional quantum advantage for a certain class of quantum circuits. Here they considered strategies for non-local games recast into the circuit, proving a separation between classical and quantum computational models. Subsequently, this strategy was shown to be noise robust [26] and extended to distributed non-local games [18].
Karanjai, Wallman, and Bartlett [23] also developed their own framework of contextuality to prove that the spatial complexity of classical simulation of a quantum measurement process is bounded by contextuality. This result is further developed by Kirby and Love [24], who suggested applying contextuality to evaluate the quantum advantage of the VQE algorithm. They proposed a test to determine whether or not the given objective function for the VQE is contextual, employing the compatibility graph of Pauli operators and the measure, contextual -distance. This test detects the non-classicality of the objective function, thus filtering out classically simulatable procedures. This research has important implications for the application of contextuality to evaluate practical quantum algorithms.
1.2 Classification of contextuality
The sheaf-theoretic definition of contextuality has been instrumental in our understanding of contextuality, as it provides precise mathematical structure to the intuitive concept of contextuality. The sheaf-theoretic framework was first proposed by Abramsky and Brandenburger [11, 13], where they defined events and distributions on the measurement scenario and identified the sheaf structure of those concepts. Here, one can connect the global distribution to the hidden-variable model, which is well-known for its failure to explain the distinct features of quantum theory. Further discussion by Abramsky, Barbosa, and Mansfield [16] investigated a measure of contextuality. This work opened the way to quantify contextuality in the given quantum scenario.
The subsequent development of the cohomological approach to contextuality also provides a substantial methodology for witnessing contextuality in a given measurement scenario. Abramsky, Mansfield, and Barbosa [12] proposed an approach based on the Čech cohomology invariant, which leverages powerful tools of sheaf cohomology to detect contextuality in an empirical model. Okay, Roberts, Bartlett, and Raussendorf [21] established a topological approach to identifying contextuality, which has the potential to provide a more refined analysis, although it requires an additional topological structure to be taken into account. These approaches were connected by Aasnæss [18], who translated arguments from one to the other so that each complements the other in generality and completeness.
On the other hand, a stronger form of contextuality, namely, the all-versus-nothing (AvN) argument was also characterized by the same group. Abramsky et al. [14, 15] formalized the logical inconsistency in quantum information systems into an AvN argument referring back to the observation by Mermin [9, 10]. This class of contextuality is also observed as an obstruction in the cohomology group in the work by Aasnæss [18].
While the sheaf-theoretic framework provides the basis for the arguments on the quantum advantage of MBQCs and shallow circuits, the last application discussed above, Refs. 23, 24, goes back to Kochen and Specker’s framework for formalizing contextuality, the so-called contextuality in a closed sub-theory. This notion seems to describe the same idea as sheaf-theoretic contextuality, but it is based solely on arguments about the operator algebra of observables. Moreover, the interesting point is that these works place a 2-qubit measurement scenario of certain observables in the stronger class of contextuality, while the 2-qubit scenario was previously claimed to be unable to show strong non-locality [17]. This conflict motivated our research on the classification of contextuality, as it gives a specific case on which the two points of view differ. To anticipate the result, it turns out that Kochen-Specker type contextuality actually deals with an abelian partial group generated by the given set of observables. Furthermore, we do find strong contextuality in this generated abelian partial group, i.e., in a partial closure.
In this report, we review the mathematical structure of the sheaf-theoretic formalism of contextuality and refine the framework to characterize its connection to Kochen-Specker type contextuality. We first go through strict definitions of measurement scenarios, measurement covers, events, distributions, and empirical models. Under these definitions, we review the connection between contextuality and sheaf theory. Then, we further develop those ideas to classify strong contextuality and all-versus-nothing (AvN) arguments, which define stronger classes of contextuality. The main proposal of this report is the definition of a state-independent AvN class, which we prove induces AvN arguments for any state realizing the measurement scenario. Then, we show that Kochen-Specker type contextuality is translated into the state-independent AvN in a partial closure, by considering an abelian partial group structure of a Pauli $n$-group. Finally, we summarize the classes of contextuality in a state-side view and an operator-side view, where we can notice that state-dependent contextual scenarios seem to merge into the state-independent AvN class in a partial closure.
2 Sheaf-theoretic contextuality
The sheaf-theoretic framework of Abramsky and Brandenburger [11] provides a mathematical structure to formalize the concept of contextuality. In this section, we go through the definitions used in the sheaf-theoretic approach with the example of the Bell scenario. We also discuss the contextual fraction as a useful concept for quantifying contextuality, referring to the paper by Abramsky, Barbosa, and Mansfield [16].
2.1 Measurement scenarios
Definition 2.1.
A measurement scenario is a triple $\langle X, \mathcal{M}, O \rangle$ where:
• $X$ is a finite set of measurements;
• $\mathcal{M}$ is a family of measurement contexts, where each context $C \in \mathcal{M}$ represents a set of measurements that can be performed together;
• $O$ is a finite set of outcomes.
For example, the well-known Bell scenario, where two experimenters, Alice and Bob, can each choose between performing one of two different measurements, say, $a_1$ or $a_2$ for Alice and $b_1$ or $b_2$ for Bob, obtaining one of two possible outcomes, is represented as follows:
$$X = \{a_1, a_2, b_1, b_2\}, \quad \mathcal{M} = \{\{a_1, b_1\}, \{a_1, b_2\}, \{a_2, b_1\}, \{a_2, b_2\}\}, \quad O = \{0, 1\}.$$
Here, the measurement contexts express that we are allowed to measure either $a_1$ or $a_2$, but not both (and likewise for $b_1$ and $b_2$). We will generally assume a measurement scenario $\langle X, \mathcal{M}, O \rangle$ for the rest of this paper. We also refer to Bell-type measurement scenarios by a tuple $(n, k, l)$, where $n$ is the number of agents, $k$ is the number of possible measurements for each agent, and $l$ is the number of possible outcomes. Thus, the example is a $(2, 2, 2)$ scenario.
The example also shows that some additional information about $\mathcal{M}$ is needed to describe situations in quantum mechanics. While we described $\mathcal{M}$ as consisting of “sets of measurements that can be performed together,” it remains unclear how many (or how few) sets are needed to fully characterize a given scenario. Hence, we aim to collect enough sets to cover all possible measurements, while keeping the family as compact as possible by letting each set be a maximal set of compatible measurements. This leads to the definition of the measurement cover.
Definition 2.2.
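A measurement cover $\mathcal{M}$ on the set of measurements $X$ is a family of measurement contexts $C \subseteq X$ such that:
• $\mathcal{M}$ covers $X$: $\bigcup \mathcal{M} = X$;
• $\mathcal{M}$ is an anti-chain: if $C, C' \in \mathcal{M}$ and $C \subseteq C'$, then $C = C'$.
We impose the anti-chain condition because we focus on the maximal compatible sets of measurements.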
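The two conditions of a measurement cover are easy to check mechanically. The following minimal sketch does so for the Bell scenario; the labels `a1`, `a2`, `b1`, `b2` and the function name `is_measurement_cover` are our own illustrative choices rather than notation fixed by the text.

```python
from itertools import combinations

def is_measurement_cover(measurements, contexts):
    """Return True if `contexts` covers `measurements` and forms an anti-chain."""
    contexts = {frozenset(c) for c in contexts}
    # Cover condition: the union of all contexts is the whole measurement set.
    covers = set().union(*contexts) == set(measurements)
    # Anti-chain condition: no context is contained in a different context.
    antichain = all(not (c <= d or d <= c) for c, d in combinations(contexts, 2))
    return covers and antichain

# Bell scenario with hypothetical labels for Alice's and Bob's measurements.
X = {"a1", "a2", "b1", "b2"}
bell_cover = [{"a1", "b1"}, {"a1", "b2"}, {"a2", "b1"}, {"a2", "b2"}]
```

Adding the singleton context `{"a1"}` to `bell_cover` breaks the anti-chain condition, since it is contained in `{"a1", "b1"}`, which is not maximal information about compatibility.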
2.2 Events
Definition 2.3.
Given a set of measurements $U \subseteq X$ and a set of outcomes $O$, an event or an assignment over $U$ is a function $s : U \to O$.
For example, consider the event of observing and in the Bell scenario and obtaining outcomes 0 and 1, each. We represent this event by a tuple , or simply, . Here we further focus on the mathematical structure of events. We must strictly distinguish events and . In the latter case, there is no information about what we measured on Bob’s side. It means not only that we do not know the outcome Bob obtained, but also that we even cannot say which measurement either or Bob actually measured. It implies that the events are given with respect to the set of measurements as well as assigning a value for each measurement . In fact, we can define this set of events as a categorical functor including the restriction map, which formalizes “forgetting” information out of what is concerned in the smaller context.
Definition 2.4.
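Given a set of measurements $X$ and a set of outcomes $O$, a functor $\mathcal{E} : \mathcal{P}(X)^{\mathrm{op}} \to \mathbf{Set}$, called the event sheaf, is defined as follows:
• $\forall U \in \mathcal{P}(X)$, $\mathcal{E}(U) := O^U$;
• $\forall U, U' \in \mathcal{P}(X)$ and $U \subseteq U'$, a map of events $\mathrm{res}^{U'}_{U} : \mathcal{E}(U') \to \mathcal{E}(U)$ is defined by the usual functional restriction $s \mapsto s|_U$.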

Remark.
denotes the category of sets and functions. Here, it is composed of sets of events and restriction maps. denotes the power set category of , conventionally composed of sets of subsets of and inclusion maps. denotes its opposite, having projection maps instead of inclusion maps. Thus, for , a projection maps to by . Refer to Ref. 27 to study categorical notions used for quantum theory.
Fig. 1 illustrates an example of an event sheaf defined in the Bell scenario. basically maps each subset to a set of functions, what we call events on . Each event is a mapping from to such that assigns an outcome for each measurement .
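To make the event sheaf concrete, the sketch below enumerates the events over a set of measurements and implements the restriction map as "forgetting" outcomes. Representing an event as a Python dict, the binary outcome set, and the labels `a1`, `b1` are assumptions made for illustration only.

```python
from itertools import product

def events(U, outcomes=(0, 1)):
    """All assignments s: U -> O, each represented as a dict."""
    U = sorted(U)
    return [dict(zip(U, values)) for values in product(outcomes, repeat=len(U))]

def restrict(s, V):
    """Restriction map: forget the outcomes of measurements outside V."""
    return {m: o for m, o in s.items() if m in V}

# An event over the context {a1, b1} (hypothetical labels) and its restriction.
s = {"a1": 0, "b1": 1}
s_alice = restrict(s, {"a1"})
```

Note that `s_alice` carries no information about which measurement Bob performed, which is exactly the distinction between an event over a context and an event over one of its subsets.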
2.3 Distributions and empirical models
Definition 2.5.
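For any set $X$ and a semiring $R$, the support of a function $d : X \to R$ is defined by:
$$\mathrm{supp}(d) := \{x \in X : d(x) \neq 0\}.$$
Definition 2.6.
For any set $X$ and a semiring $R$, an $R$-distribution on $X$ is a function $d : X \to R$ which has finite support, and is such that:
$$\sum_{x \in X} d(x) = 1.$$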
Definition 2.7.
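For any sets $X, Y$ and a semiring $R$, $\mathcal{D}_R : \mathbf{Set} \to \mathbf{Set}$ is a functor defined as follows:
• $\mathcal{D}_R(X)$ is the set of $R$-distributions on $X$;
• given a function $f : X \to Y$, the action of $\mathcal{D}_R(f) : \mathcal{D}_R(X) \to \mathcal{D}_R(Y)$ on $d \in \mathcal{D}_R(X)$ is defined by:
$$\mathcal{D}_R(f)(d)(y) := \sum_{x \in X,\ f(x) = y} d(x).$$
Now we can compose the functor $\mathcal{D}_R$ with the event sheaf $\mathcal{E}$ to form a functor $\mathcal{D}_R\mathcal{E} : \mathcal{P}(X)^{\mathrm{op}} \to \mathbf{Set}$. For each $U \subseteq X$, a distribution $d \in \mathcal{D}_R\mathcal{E}(U)$ maps each event $s \in \mathcal{E}(U)$ to a weight $d(s) \in R$.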
| Context | 00 | 01 | 10 | 11 |
|---|---|---|---|---|
| $(a_1, b_1)$ | 1/2 | 0 | 0 | 1/2 |
| $(a_1, b_2)$ | 3/8 | 1/8 | 1/8 | 3/8 |
| $(a_2, b_1)$ | 3/8 | 1/8 | 1/8 | 3/8 |
| $(a_2, b_2)$ | 1/8 | 3/8 | 3/8 | 1/8 |

For example, consider the Bell scenario again. Table 1 realizes the Clauser-Horne-Shimony-Holt (CHSH) model, obtained from local projective measurements equatorial at angles 0 for $a_1, b_1$ and $\pi/3$ for $a_2, b_2$, on the two-qubit Bell state. In the table, each row represents a context $C \in \mathcal{M}$, and each column represents an event indexed by the outcome for each measurement. Then the first row of the table is a distribution $d$ mapping each event to the semiring $\mathbb{R}_{\geq 0}$ as follows:
$$d(00) = 1/2, \quad d(01) = 0, \quad d(10) = 0, \quad d(11) = 1/2.$$
Remark.
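Given $U \subseteq U'$, $\mathcal{D}_R\mathcal{E}$ maps the projection map $\mathrm{res}^{U'}_{U}$ to:
$$\mathcal{D}_R\mathcal{E}(U') \to \mathcal{D}_R\mathcal{E}(U) :: d \mapsto d|_U,$$
where for each event $s \in \mathcal{E}(U)$:
$$d|_U(s) := \sum_{s' \in \mathcal{E}(U'),\ s'|_U = s} d(s').$$
Thus $d|_U$ is the marginal of the distribution $d$, which assigns to each event $s$ in the smaller context $U$ the sum of weights of all events $s'$ in the larger context $U'$ which restrict to $s$.
In the later part of this paper, it becomes clear that the functor $\mathcal{D}_R\mathcal{E}$ cannot be a sheaf on the semirings we are concerned with. However, we actually deal with a specific family of distributions when we look into a quantum system. This concept is characterized by an empirical model.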
Definition 2.8.
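Given a measurement cover $\mathcal{M}$, a no-signaling empirical model $e$ for $\mathcal{M}$ is a compatible family of distributions $\{e_C\}_{C \in \mathcal{M}}$ such that:
• for any measurement context $C \in \mathcal{M}$, $e_C$ is a local distribution of $e$ at $C$, i.e. $e_C \in \mathcal{D}_R\mathcal{E}(C)$;
• $e$ is compatible for any measurement contexts $C, C' \in \mathcal{M}$, i.e. $e_C|_{C \cap C'} = e_{C'}|_{C \cap C'}$.
The compatibility condition corresponds to the concept of no-signaling in the sense that the choice of context, $C$ or $C'$, does not affect the local distribution at $C \cap C'$.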
Now for our Bell scenario, we can show that it is an empirical model for a measurement cover where is the distribution specified by each row of the table. The compatibility condition is verified by computing the marginal, for example, for the first two rows:
Thus, we verify that . Here we denoted the restriction map to the one-point set by rather than . We shall keep this abbreviation throughout this paper since it does not violate our intuition too much. The same computation for the other contexts shows that is a no-signaling empirical model. We shall also use the term empirical model for the no-signaling model as we will only focus on quantum-like models with the no-signaling condition.
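The compatibility check described above can be carried out for all pairs of contexts at once. The sketch below is a minimal illustration using exact rational arithmetic; the measurement labels and the assignment of table rows to contexts are assumptions, and since two distinct contexts of this cover share at most one measurement, comparing single-measurement marginals suffices here.

```python
from fractions import Fraction as F
from itertools import combinations

# Outcome pairs in column order 00, 01, 10, 11.
OUTCOME_PAIRS = [(0, 0), (0, 1), (1, 0), (1, 1)]

# Table 1 with hypothetical context labels (row order is an assumption).
table1 = {
    ("a1", "b1"): [F(1, 2), F(0), F(0), F(1, 2)],
    ("a1", "b2"): [F(3, 8), F(1, 8), F(1, 8), F(3, 8)],
    ("a2", "b1"): [F(3, 8), F(1, 8), F(1, 8), F(3, 8)],
    ("a2", "b2"): [F(1, 8), F(3, 8), F(3, 8), F(1, 8)],
}

def marginal(context, weights, measurement):
    """Marginal of a context distribution onto a single measurement."""
    position = context.index(measurement)
    result = {0: F(0), 1: F(0)}
    for pair, weight in zip(OUTCOME_PAIRS, weights):
        result[pair[position]] += weight
    return result

def is_no_signalling(model):
    """Check that overlapping contexts agree on their shared measurements."""
    for (c1, w1), (c2, w2) in combinations(model.items(), 2):
        for m in set(c1) & set(c2):
            if marginal(c1, w1, m) != marginal(c2, w2, m):
                return False
    return True
```

Every single-measurement marginal of this table is uniform, so the choice of the other party's measurement cannot be detected locally.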
2.4 Sheaf condition and global distributions
We previously defined two functors $\mathcal{E}$ and $\mathcal{D}_R\mathcal{E}$. In fact, these two functors can be identified as presheaves, and $\mathcal{E}$ can be further identified as a sheaf, justifying its name, ’event sheaf.’ Here, we look through the definitions of presheaves and sheaves, referring to Ref. 28, to make those facts clear. Note that we can consider the measurement cover $\mathcal{M}$ as a generating set of the topology of $X$, giving each context an open subset of $X$.
Definition 2.9.
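Let $X$ be a topological space. A presheaf $\mathcal{F}$ on $X$ consists of the data:
(i) for every open subset $U \subseteq X$, an object $\mathcal{F}(U)$;
(ii) for every inclusion $V \subseteq U$ of open subsets of $X$, a morphism of objects $\rho_{UV} : \mathcal{F}(U) \to \mathcal{F}(V)$;
subject to the conditions:
(a) $\rho_{UU}$ is the identity map $\mathcal{F}(U) \to \mathcal{F}(U)$;
(b) if $W \subseteq V \subseteq U$ are three open subsets, then $\rho_{UW} = \rho_{VW} \circ \rho_{UV}$.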
Definition 2.10.
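A presheaf $\mathcal{F}$ on a topological space $X$ is a sheaf if it satisfies the following supplementary conditions:
(c) (Locality) if $U$ is an open set, if $\{V_i\}$ is an open covering of $U$, and if $s, t \in \mathcal{F}(U)$ are elements such that $s|_{V_i} = t|_{V_i}$ for all $i$, then $s = t$;
(d) (Gluing) if $U$ is an open set, if $\{V_i\}$ is an open covering of $U$, and if we have elements $s_i \in \mathcal{F}(V_i)$ for each $i$, with the property that for each $i, j$, $s_i|_{V_i \cap V_j} = s_j|_{V_i \cap V_j}$, then there is an element $s \in \mathcal{F}(U)$ such that $s|_{V_i} = s_i$ for each $i$.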
In general, we assume the discrete topology on the set of measurements $X$. It is easy to show that $\mathcal{E}$ is a sheaf and $\mathcal{D}_R\mathcal{E}$ is a presheaf. However, $\mathcal{D}_R\mathcal{E}$ generally fails to satisfy the sheaf conditions.
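| Distribution | 00 | 01 | 10 | 11 |
|---|---|---|---|---|
| $d_1$ | 1/2 | 0 | 0 | 1/2 |
| $d_2$ | 1/4 | 1/4 | 1/4 | 1/4 |

As a counter-example to locality, consider the two different probability distributions $d_1$ and $d_2$ defined on a measurement context as in Table 2. We can verify that $d_1$ and $d_2$ agree on every local measurement event, that is, on the marginal of each single measurement, although they differ on the context itself.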
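The failure of locality can be confirmed by direct computation. In the sketch below, the names `d1` and `d2` for the two rows of Table 2 and the helper `single_marginals` are our own; under the discrete topology, the two singletons form an open covering of the two-measurement context.

```python
from fractions import Fraction as F

# Two distributions on one two-measurement context; events ordered 00, 01, 10, 11.
d1 = [F(1, 2), F(0), F(0), F(1, 2)]
d2 = [F(1, 4), F(1, 4), F(1, 4), F(1, 4)]

def single_marginals(d):
    """Marginals onto the first and the second measurement, as (P(0), P(1)) pairs."""
    first = (d[0] + d[1], d[2] + d[3])
    second = (d[0] + d[2], d[1] + d[3])
    return first, second
```

Both distributions restrict to the uniform distribution on each singleton, so they agree on every member of the covering while remaining distinct on the context, which is precisely what the locality condition forbids.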
| Context | 00 | 01 | 10 | 11 |
|---|---|---|---|---|
| $(a_1, b_1)$ | 1/2 | 0 | 0 | 1/2 |
| $(a_1, b_2)$ | 1/2 | 0 | 0 | 1/2 |
| $(a_2, b_1)$ | 1/2 | 0 | 0 | 1/2 |
| $(a_2, b_2)$ | 0 | 1/2 | 1/2 | 0 |

As a counter-example to gluing, the best example is the PR-box. Table 3 illustrates the empirical model of the PR-box, which is not quantum realizable but still satisfies the conditions of a no-signaling empirical model. We can try gluing the first three rows to obtain a probability distribution over all four measurements, but in this case there exists a unique global distribution $d$, such that $d(0000) = d(1111) = 1/2$ and 0 otherwise. However, it fails to match the distribution over the remaining context $(a_2, b_2)$, since this global distribution assigns zero probability to the outcomes 01 and 10.
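The gluing failure of the PR-box can be seen already at the level of supports by brute force over the 16 global assignments. This is only a sketch: the labels and the choice of $(a_2, b_2)$ as the anti-correlated context are assumptions about how the rows of Table 3 are indexed.

```python
from itertools import product

MEASUREMENTS = ("a1", "a2", "b1", "b2")  # hypothetical labels

# Outcome pairs with nonzero probability in each context of the PR-box.
pr_support = {
    ("a1", "b1"): {(0, 0), (1, 1)},
    ("a1", "b2"): {(0, 0), (1, 1)},
    ("a2", "b1"): {(0, 0), (1, 1)},
    ("a2", "b2"): {(0, 1), (1, 0)},  # assumed to be the anti-correlated row
}

def consistent_globals(support):
    """Global assignments whose restriction to every listed context is possible."""
    found = []
    for values in product((0, 1), repeat=len(MEASUREMENTS)):
        g = dict(zip(MEASUREMENTS, values))
        if all((g[x], g[y]) in allowed for (x, y), allowed in support.items()):
            found.append(values)
    return found

# Gluing only the three correlated contexts leaves two candidate assignments.
first_three = {c: s for c, s in pr_support.items() if c != ("a2", "b2")}
```

The three correlated contexts force all four outcomes to be equal, leaving only the all-zero and all-one assignments, and neither of them is possible in the anti-correlated context. Hence no global assignment is consistent with the full support.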
In general, empirical models are compatible families of distributions. Although the sheaf condition does not hold for the entire presheaf , it is possible to ask if the gluing property holds for such a specific family . This is equivalent to asking the existence of a global distribution , defined on the entire set of measurements . Such a global distribution defines a distribution on the set , which marginalizes to yield the behavior of the empirical model: i.e. .
Remark.
A global distribution is often called a global section in a sheaf-theoretic sense. However, in this paper, we will keep the notation of event/assignment and distribution to distinguish one from another.
Now, let’s consider the meaning of the existence of a global distribution for a given empirical model . Here we will show that has a global distribution if and only if it is realized by a factorisable hidden-variable model.
Definition 2.11.
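Given a measurement cover $\mathcal{M}$, let $\Lambda$ be a set of values for a hidden variable. A hidden-variable model $h$ over $\Lambda$ is an assignment such that:
• $h_\Lambda \in \mathcal{D}_R(\Lambda)$;
• $\forall \lambda \in \Lambda$ and $C \in \mathcal{M}$, $h^\lambda_C \in \mathcal{D}_R\mathcal{E}(C)$;
• $\forall \lambda \in \Lambda$ and $C, C' \in \mathcal{M}$, $h^\lambda_C|_{C \cap C'} = h^\lambda_{C'}|_{C \cap C'}$.
Definition 2.12.
A hidden-variable model $h$ realizes an empirical model $e$ if, $\forall C \in \mathcal{M}$ and $s \in \mathcal{E}(C)$:
$$e_C(s) = \sum_{\lambda \in \Lambda} h_\Lambda(\lambda)\, h^\lambda_C(s).$$
Definition 2.13.
A hidden-variable model $h$ is factorisable if, $\forall \lambda \in \Lambda$, $C \in \mathcal{M}$ and $s \in \mathcal{E}(C)$:
$$h^\lambda_C(s) = \prod_{x \in C} h^\lambda_C|_{\{x\}}(s|_{\{x\}}).$$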
Theorem 2.14.
Let $e$ be an empirical model defined on a measurement cover $\mathcal{M}$ for a distribution functor $\mathcal{D}_R$. The following are equivalent.
(a) $e$ has a global distribution.
(b) $e$ has a realization by a factorisable hidden-variable model.
Proof.
: For each , we define a distribution such that for and otherwise. Then if has a global distribution :
Here, we identify to , to , and to , so we showed is realized by a hidden variable model. Moreover, it is factorisable from that:
: Suppose that is realized by a factorisable hidden-variable model . For each , we define for any such that . By the compatibility of the family , this definition is independent of the choice C, being well defined. We define a distribution for each by:
Now to show that is a distribution, let the set of measurements be enumerated as . Specify the global assignment by a tuple , where . Then we can calculate:
We can also show that for each context , . We choose an enumeration of such that , . Then:
Now we define a distribution by averaging over the hidden variables:
The condition for distribution is automatically satisfied from that . Moreover, restricts at each context to yield as:
Thus is a global distribution for . ∎
This result provides a definitive justification for equating the phenomena of non-locality and contextuality with obstructions to the existence of global distributions. We shall say the empirical model is noncontextual if there exists a global distribution for .
2.5 Existence of global distributions and the contextual fraction
Definition 2.15.
Given a measurement cover , let be the number of global assignments, and be the number of local assignments ranging over contexts. The incidence matrix is an Boolean matrix that records the restriction relation between global and local assignments:
The incidence matrix conceptually represents the tuple of restriction maps
At the same time, viewing it as a matrix over the semiring , it acts by matrix multiplication on distributions in , represented as row vectors:
Thus, the image of this map will be the set of families which arise from global distributions.
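As an illustration, the incidence matrix of the Bell-type cover can be built by enumeration. The sketch below assumes that rows are indexed by local assignments and columns by global assignments, so that the matrix acts on a column vector of weights; the orientation, the labels, and the function names are our own choices.

```python
from itertools import product

MEASUREMENTS = ("a1", "a2", "b1", "b2")  # hypothetical labels
COVER = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]

global_assignments = [dict(zip(MEASUREMENTS, v)) for v in product((0, 1), repeat=4)]
local_assignments = [(c, o) for c in COVER for o in product((0, 1), repeat=2)]

def incidence_matrix():
    """Row (C, s), column g: 1 iff global assignment g restricts to s on C."""
    return [[int(tuple(g[m] for m in context) == outcome) for g in global_assignments]
            for context, outcome in local_assignments]

def marginals_of(weights, M):
    """Apply M to a column vector of weights on global assignments."""
    return [sum(row[j] * weights[j] for j in range(len(weights))) for row in M]

M = incidence_matrix()
```

For this scenario the matrix is 16 by 16. Every global assignment restricts to exactly one local assignment per context, so each column contains four ones, and a point mass on the all-zero global assignment is sent to the family that assigns weight 1 to the event 00 in every context.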
Proposition 2.16.
Let $e$ be an empirical model defined on a measurement cover $\mathcal{M}$. Let $\mathbf{M}$ be the incidence matrix of $\mathcal{M}$. Let $\mathbf{v}^e$ be the vector defined by $\mathbf{v}^e[(C, s)] = e_C(s)$, which encodes the value of the empirical model on each local assignment. Then, solutions to the following equation correspond bijectively to global distributions for $e$:
$$\mathbf{M}\,\mathbf{b} = \mathbf{v}^e. \tag{1}$$
With this proposition, we can now decide the existence of global distributions for a given empirical model. However, the existence of global distributions depends on the semiring $R$ over which empirical models are discussed. So far we have naturally thought of probability distributions, which are defined on the non-negative reals $\mathbb{R}_{\geq 0}$; here we address how the situation differs according to the semiring.
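There are three main examples of semirings discussed in the field of contextuality: the Booleans
$$\mathbb{B} = (\{0, 1\}, \vee, \wedge),$$
the non-negative reals
$$\mathbb{R}_{\geq 0} = (\mathbb{R}_{\geq 0}, +, \cdot),$$
and the reals
$$\mathbb{R} = (\mathbb{R}, +, \cdot).$$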
We call the distribution on Boolean semiring a possibility distribution. In the case of the semiring , is the set of probability distributions. In the case of the reals , is the set of signed probability measures with finite support, allowing for ’negative probabilities.’ It can be shown that we can always find global distributions for the given empirical model on the semiring , making a sheaf. It is also straightforward to show that the probabilistic global distribution can exist only when the possibilistic global distribution exists.
Now we will extend this discussion about the incidence matrix further to define the measure of contextuality. Here we consider the semiring . First, define a convex sum of two empirical models and on the same measurement cover, by taking the convex sum of probability distributions on each context. Compatibility is preserved in the convex sum; hence, it yields a well-defined empirical model.
Definition 2.17.
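Given an empirical model $e$, consider a convex decomposition
$$e = \lambda e^{NC} + (1 - \lambda) e', \tag{2}$$
where $e^{NC}$ denotes a noncontextual model and $e'$ is another empirical model. The noncontextual fraction $\mathsf{NCF}(e)$ is the maximum possible value of such $\lambda$. The contextual fraction is given by $\mathsf{CF}(e) := 1 - \mathsf{NCF}(e)$.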
Then, we will consider the computation of this contextual fraction using the incidence matrix. Recall the equation (1), which is valid only for a noncontextual model. Since now we can no longer guarantee the solution of , we will consider a relaxed condition:
where the comparison is elementwise throughout this section. Let an empirical model and some convex decomposition be given. Because both empirical models are defined on the same measurement cover, the incidence matrix is the same for both models. Let encode the noncontextual model and suppose a vector satisfies , . Then, since ,
which implies existence of a solution with a lower bound . Here, the norm . Thus, an optimal solution of the following linear programming(LP) gives the noncontextual fraction by :
Thus, we obtain a method for computing the contextual fraction of a given empirical model $e$. One can refer to Ref. 16 to check that this contextual fraction is monotone under the free operations of a resource theory and thus works as a useful measure.
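A feasible point of the linear program already certifies a lower bound on the noncontextual fraction. The sketch below assumes the relaxed condition takes the elementwise form $\mathbf{M}\,\mathbf{b} \leq \mathbf{v}^e$ with $\mathbf{b} \geq 0$, as in Ref. 16, and checks it for the CHSH model of Table 1. The weights in `b` are hand-picked for illustration, and the labels and row order are assumptions; confirming that the bound is optimal would require actually solving the program.

```python
from fractions import Fraction as F
from itertools import product

MEASUREMENTS = ("a1", "a2", "b1", "b2")  # hypothetical labels

# Table 1; events ordered 00, 01, 10, 11 (row order is an assumption).
table1 = {
    ("a1", "b1"): [F(1, 2), F(0), F(0), F(1, 2)],
    ("a1", "b2"): [F(3, 8), F(1, 8), F(1, 8), F(3, 8)],
    ("a2", "b1"): [F(3, 8), F(1, 8), F(1, 8), F(3, 8)],
    ("a2", "b2"): [F(1, 8), F(3, 8), F(3, 8), F(1, 8)],
}

# Hand-picked subprobability weights on global assignments (a1, a2, b1, b2).
b = {g: F(1, 8) for g in [(0, 0, 0, 0), (0, 0, 0, 1), (0, 1, 0, 0),
                          (1, 1, 1, 1), (1, 1, 1, 0), (1, 0, 1, 1)]}

def is_feasible(weights, model):
    """Check that the marginals of `weights` are elementwise <= the model."""
    for (x, y), bounds in model.items():
        ix, iy = MEASUREMENTS.index(x), MEASUREMENTS.index(y)
        for outcome, bound in zip(product((0, 1), repeat=2), bounds):
            mass = sum(w for g, w in weights.items() if (g[ix], g[iy]) == outcome)
            if mass > bound:
                return False
    return True

lower_bound = sum(b.values())  # total weight of the noncontextual part
```

Since `b` is feasible with total weight 3/4, the noncontextual fraction of this model is at least 3/4, and so its contextual fraction is at most 1/4.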
2.6 Possibilistic empirical models and strong contextuality
Once we have this measure of contextuality, we can single out the empirical models whose contextual fraction equals 1. In fact, an empirical model has a contextual fraction of 1 if and only if it is strongly contextual. Here we define strong contextuality based on a possibilistic empirical model and show the equivalence of maximal contextuality and strong contextuality.
Definition 2.18 (Possibilistic empirical model).
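A possibilistic empirical model $\mathcal{S}$ is a sub-presheaf of $\mathcal{E}$ such that:
• every compatible family for the measurement cover $\mathcal{M}$ induces a global assignment;
• $\mathcal{S}$ is flasque beneath the cover: if $U \subseteq U' \subseteq C \in \mathcal{M}$, then every $s \in \mathcal{S}(U)$ is the restriction of some $s' \in \mathcal{S}(U')$.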
When an empirical model is given, a possibilistic empirical model of can be derived as follows:
Remark.
Here we define the possibilistic empirical model of as a family of sets, rather than a family of possibilistic distributions . In fact, there is no problem with interpreting the possibilistic empirical model as a normal empirical model of a possibility distribution, but, we rather choose that the possibilistic empirical model returns a set of possible events, i.e. if to clarify the notation.
Definition 2.19.
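Let $\mathcal{S}$ be a possibilistic empirical model for a measurement scenario $\langle X, \mathcal{M}, O \rangle$. We say that $\mathcal{S}$ is
• logically contextual at $s \in \mathcal{S}(C)$ if there is no global assignment $g \in \mathcal{S}(X)$ such that $g|_C = s$;
• logically contextual if $\mathcal{S}$ is logically contextual at some local assignment. Otherwise, it is non-contextual;
• strongly contextual, written $\mathsf{SC}(\mathcal{S})$, if $\mathcal{S}$ has no global assignment, i.e. $\mathcal{S}(X) = \emptyset$.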
Remark.
Note that possibilistic contextuality and strong contextuality differ from each other. An empirical model is possibilistically contextual if there is no global possibility distribution and is strongly contextual if there is no global assignment .
Proposition 2.20.
An empirical model is strongly contextual if and only if it is maximally contextual.
Proof.
Suppose that admits a convex decomposition (2) as follows:
By the Theorem 2.14, we can take the non-contextual empirical model to be a convex sum of deterministic models , where each is a global assignment. If , then from (2), for each . Thus strong contextuality implies maximal contextuality.
For the converse, suppose that . Taking , we shall define such that (2) holds. For each and :
It is easily verified that, for each , . To ensure that is always non-negative, we must have . Since this is the infimum of a finite set of positive numbers, we can find satisfying this condition.
It remains to verify that is no-signalling, i.e. that forms a compatible family. Given , fix . Now
A similar analysis applies to . Using the compatibility of , we conclude that ∎
3 All-versus-nothing arguments and partial groups
When we look into a quantum system, we see that the measurement depends on both the quantum state and the observable. However, it has been observed that some sets of observables inhere the contextuality independently from the quantum state. This type of contextuality, earlier observed by Kochen and Specker [8], has been developed to define different types of contextuality [10, 21, 23, 24]. Here, we formulate them with a sheaf-theoretic structure, starting from what is formally studied as an all-versus-nothing (AvN) argument [14, 15, 18]. We extend this argument to state-independent AvN and claim that Kochen-Specker type contextuality is, in fact, state-independent AvN in a partial closure.
3.1 Mermin’s square

Before we start, let’s first look at a simple example of state-independent AvN. Fig. 2 shows Mermin’s square [10], an array of Pauli observables on two qubits in which the observables in each row and in each column mutually commute. In each row and each column, the product of the first two operators equals the last one, except for the last column, where it equals minus the last one. Once we rearrange the product, we get that the product of every row and column equals $\pm I$: $+I$ for all three rows and the first two columns from the left, and $-I$ for the last column.
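These product relations can be verified with explicit matrices. The sketch below uses one common arrangement of Mermin's square; this arrangement is an assumption and may differ from the one drawn in Fig. 2, although it reproduces the signs described above. The helper names are our own.

```python
# Single-qubit Pauli matrices as nested lists.
I2 = [[1, 0], [0, 1]]
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]

def kron(A, B):
    """Kronecker (tensor) product of two square matrices."""
    n, m = len(A), len(B)
    return [[A[i // m][j // m] * B[i % m][j % m] for j in range(n * m)]
            for i in range(n * m)]

def matmul(A, B):
    """Product of two square matrices of equal size."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def scaled_identity(c, n=4):
    """The matrix c * I of size n."""
    return [[c if i == j else 0 for j in range(n)] for i in range(n)]

# One common arrangement of Mermin's square (assumed, not taken from Fig. 2).
square = [
    [kron(X, I2), kron(I2, X), kron(X, X)],
    [kron(I2, Y), kron(Y, I2), kron(Y, Y)],
    [kron(X, Y), kron(Y, X), kron(Z, Z)],
]

def context_product(ops):
    """Product of the three observables in one context."""
    A, B, C = ops
    return matmul(matmul(A, B), C)

rows = [context_product(r) for r in square]
cols = [context_product([square[i][j] for i in range(3)]) for j in range(3)]
```

With this arrangement, every row and the first two columns multiply to the identity, while the last column multiplies to minus the identity.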
Here, consider a measurement scenario where each row and each column translates to a measurement context . Then, try assigning a global assignment to this measurement scenario. The product equations are mapped to the following linear equations,
(4) |
where the outcome $o \in \{0, 1\}$ corresponds to the eigenvalue $(-1)^o$ of the observable. This system of linear equations is not satisfiable by any global assignment. To see this, add all the equations together modulo 2: each variable appears twice on the left-hand side, which therefore sums to 0, while the right-hand side sums to 1. Therefore, no quantum state can realize a non-contextual empirical model with these observables, i.e., this set of observables is inherently inconsistent.
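The parity argument can also be confirmed exhaustively, since there are only $2^9$ candidate global assignments. In the sketch below, the cells of the square are indexed 0 to 8 row by row, and the odd parity is placed on the last column as described in the text; the indexing scheme and names are our own.

```python
from itertools import product

# Cells of the 3x3 square are indexed 0..8 row by row.
ROWS = [(0, 1, 2), (3, 4, 5), (6, 7, 8)]
COLS = [(0, 3, 6), (1, 4, 7), (2, 5, 8)]

# Required parity of each context: 0 for product +I, 1 for product -I.
PARITY = {c: 0 for c in ROWS + COLS}
PARITY[(2, 5, 8)] = 1  # the last column multiplies to -I

def satisfying_assignments(parity):
    """All global assignments in {0,1}^9 that satisfy every parity equation."""
    return [g for g in product((0, 1), repeat=9)
            if all(sum(g[i] for i in c) % 2 == b for c, b in parity.items())]
```

No assignment satisfies all six equations. If the right-hand side of the last column is changed to 0, the system becomes satisfiable, which shows that the obstruction comes from the single odd parity.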
000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 | |
---|---|---|---|---|---|---|---|---|
1/2 | 1/2 | |||||||
1/2 | 1/2 | |||||||
1/2 | 1/2 | |||||||
1/4 | 1/4 | 1/4 | 1/4 | |||||
1/4 | 1/4 | 1/4 | 1/4 | |||||
1 |
For example, consider this measurement scenario realized by a Bell state. Table 4 illustrates the resulting empirical model on Mermin’s square. This empirical model is strongly contextual. In fact, we can prove that every empirical model with this measurement scenario is strongly contextual.
3.2 Consistency of an R-linear theory
The key concept to notice first is the ring structure given to the set of outcomes, which justifies writing a measurement scenario in the form $\langle X, \mathcal{M}, R \rangle$, where $R$ is the ring of outcomes.
Definition 3.1.
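Let $\langle X, \mathcal{M}, R \rangle$ be a measurement scenario. An $R$-linear equation is a triple $\phi = \langle C, a, b \rangle$ where $C \in \mathcal{M}$ is a context, $a : C \to R$ assigns a coefficient in $R$ to each $m \in C$, and $b \in R$ is a constant. A local assignment $s \in \mathcal{E}(C)$ satisfies $\phi$, written $s \models \phi$, if
$$\sum_{m \in C} a(m)\, s(m) = b.$$
An $R$-linear theory $T$ is a set of $R$-linear equations. A global assignment $g \in \mathcal{E}(X)$ satisfies $T$, written $g \models T$, if $g|_C \models \phi$ for every $\phi = \langle C, a, b \rangle \in T$. $T$ is consistent if there exists a global assignment that satisfies $T$.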
Remark.
Note that the consistency of an -linear theory is defined on an event sheaf , not on an empirical model.
For example, consider Mermin’s square. The linear equation for a local assignment on the context translates to a triple where 1 denotes a constant function that maps every measurement to 1. The explicit statement of the whole linear theory of Mermin’s square is given as follows:
Suppose now that there is a global assignment satisfying this theory. It would have to satisfy the same equations as (4), which is a contradiction. Therefore, the linear theory of Mermin’s square is inconsistent.
3.3 AvN arguments
An AvN argument [14, 15, 18] is characterized by an -linear theory of a possibilistic empirical model. Here, we consider an -linear theory derived from a set of assignments , as follows:
Once we have a possibilistic empirical model , gives a set of possible assignments. Thus, we can connect a possibilistic empirical model to an -linear theory.
Definition 3.2.
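Given a possibilistic empirical model $\mathcal{S}$, an $R$-linear theory $T_R(\mathcal{S})$ is defined by:
$$T_R(\mathcal{S}) := \bigcup_{C \in \mathcal{M}} T_R(\mathcal{S}(C)).$$
We say that $\mathcal{S}$ is $\mathsf{AvN}_R$ if its $R$-linear theory is inconsistent, i.e. there is no global assignment $g$ such that $g \models T_R(\mathcal{S})$.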
000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 | |
---|---|---|---|---|---|---|---|---|
1 | 1 | |||||||
1 | 1 | |||||||
1 | 1 | |||||||
1 | 1 | 1 | 1 | |||||
1 | 1 | 1 | 1 | |||||
1 |
For example, return to the empirical model on Mermin’s square realized by the Bell state . The possibilistic empirical model obtained from the probabilistic empirical model is given in Table 5. The theory contains the following linear equations.
Each local assignment is defined on the specific context in each line. Note that this theory contains more linear equations than the one in the previous example. Again, supposing a global assignment leads to an inconsistency, so is AvN.
Proposition 3.3.
If $\mathcal{S}$ is $\mathsf{AvN}_R$ then $\mathcal{S}$ is strongly contextual.
Proof.
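Suppose $\mathcal{S}$ is not strongly contextual, i.e., that there is some $g \in \mathcal{S}(X)$. Then for each $C \in \mathcal{M}$, $g|_C \in \mathcal{S}(C)$, hence $g|_C \models \mathbb{T}_R(\mathcal{S}(C))$ by the definition of $\mathbb{T}_R$. Thus, $\mathbb{T}_R(\mathcal{S})$ is consistent. ∎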
3.4 State-independent AvN
While we defined the AvN argument on a possibilistic empirical model, Mermin’s square seems to lie in a stronger class of contextuality: as we have seen, the linear theory of Mermin’s square with a possibilistic empirical model is larger than the one without an empirical model. A measurement in a quantum system is associated with an observable and a state , or more generally a density matrix , which together characterize a measurement . Here, we focus on the algebra given by the set of observables, so that each measurement is specified by an observable . In the remainder of Section 3, we denote the set of observables by , which is realized as a set of measurements by a state , and a measurement cover of observables by , which is realized as a measurement cover by a state . We may use the term “measurement cover” for the measurement cover of observables, but the measurement cover of observables must be realized by a state to be an actual measurement cover.
Turning to the set of observables , we first restrict our attention to subsets of the Pauli $n$-group . This is a precaution before dealing with general quantum observables, i.e., arbitrary self-adjoint operators on a complex Hilbert space, where Tsirelson’s problem [29] may arise. Specifically, when we consider the span of a set of nonlocal observables, it may not be possible to approximate the generated operator algebra with a set of local Pauli operators, even if the dimension of the Hilbert space is finite [30, 31]. Thus, we restrict our attention to Pauli groups, where each basis operator is a tensor product of local Pauli operators.
Now in , we have a multiplicative group operation, which is generally not commutative. However, an event sheaf can still be assigned to this set of Pauli observables. For any subset , we consider an assignment that maps to the measurement of each observable as for its eigenvalue .
Once we have a multiplicative relation in , we observe that it translates to a linear equation according to the following rules:
- when holds for mutually commuting , the relation corresponds to a linear equation ;
- when holds for mutually commuting , the relation corresponds to a linear equation .
For example, when we have , it is equivalent to , so we can map it to a linear equation. In the , , so it further reduces to an equation in . The second relation stands for the dual element of such that . This is dual in the sense of a local assignment where always holds.
Note that mutual commutation is required in order to map multiplication in to addition in , which is commutative. Fortunately, we already have an object that encodes commutativity, namely, the measurement cover .
By the definition of a measurement cover, is a family of maximal commuting subsets of , i.e., for any , , and if there exists for such that the elements of commute with each other, then . Since commutative multiplication only occurs in a commuting subset, it only occurs in a measurement context. This means that we can define a linear theory of a set of observables , as presented in the following definition.
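As an aside before the formal definition, the measurement cover of a small set of Pauli observables can be computed directly from the commutation relation. The sketch below is a brute-force enumeration that is only practical for small sets. The string encoding of Pauli operators, the names `commute` and `measurement_cover`, and the labelling of Mermin’s square in `SQUARE` are assumptions made for illustration.

```python
from itertools import combinations


def commute(p, q):
    """Two Pauli strings commute iff they carry different non-identity
    letters on an even number of qubits."""
    clashes = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def measurement_cover(observables):
    """Return all maximal pairwise-commuting subsets of `observables`."""
    commuting = [
        frozenset(subset)
        for r in range(1, len(observables) + 1)
        for subset in combinations(observables, r)
        if all(commute(p, q) for p, q in combinations(subset, 2))
    ]
    # Keep only subsets that are not strictly contained in another one.
    return [c for c in commuting if not any(c < d for d in commuting)]


# Assumed labelling of the nine observables of Mermin's square.
SQUARE = ["XI", "IX", "XX", "IZ", "ZI", "ZZ", "XZ", "ZX", "YY"]

for context in measurement_cover(SQUARE):
    print(sorted(context))
```

For this labelling the cover consists of six contexts of three observables each, namely the rows and columns of the square.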
Definition 3.4 (State-independent AvN).
Given a set of observables , a linear theory is defined by:
We say that is state-independently AvN if its linear theory is inconsistent, i.e., there is no such that .
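The definition can be made concrete for Pauli observables. The sketch below derives a $\mathbb{Z}_2$-linear equation from every subset of a context whose operator product is $\pm I$, and then tests consistency by brute force. It is a minimal illustration under stated assumptions: the encoding of a phased Pauli operator as a pair `(k, s)` meaning $i^k s$, the helper names `mul`, `linear_theory`, and `is_consistent`, the labelling in `MERMIN_COVER`, and the convention that outcome bit $b$ stands for the eigenvalue $(-1)^b$ are all hypothetical choices, not the notation of this report.

```python
from itertools import combinations, product

# Products of distinct single-qubit Paulis: (a, b) -> (c, k) with a*b = i^k * c.
_TABLE = {("X", "Y"): ("Z", 1), ("Y", "Z"): ("X", 1), ("Z", "X"): ("Y", 1),
          ("Y", "X"): ("Z", 3), ("Z", "Y"): ("X", 3), ("X", "Z"): ("Y", 3)}


def mul(p, q):
    """Multiply phased Pauli strings p = (k, s), each standing for i^k * s."""
    k, out = p[0] + q[0], []
    for a, b in zip(p[1], q[1]):
        if a == "I" or b == "I":
            c, dk = (b if a == "I" else a), 0
        elif a == b:
            c, dk = "I", 0
        else:
            c, dk = _TABLE[(a, b)]
        out.append(c)
        k += dk
    return (k % 4, "".join(out))


def linear_theory(cover):
    """Equations (subset, b): the outcome bits of `subset` sum to b mod 2.
    One equation per subset of a context whose product is +I (b=0) or -I (b=1)."""
    theory = set()
    for context in cover:
        for r in range(1, len(context) + 1):
            for subset in combinations(context, r):
                prod = (0, "I" * len(subset[0]))
                for obs in subset:
                    prod = mul(prod, (0, obs))
                if set(prod[1]) == {"I"} and prod[0] in (0, 2):
                    theory.add((subset, prod[0] // 2))
    return theory


def is_consistent(cover):
    """Return True if some global assignment satisfies the linear theory."""
    observables = sorted({obs for context in cover for obs in context})
    theory = linear_theory(cover)
    for bits in product((0, 1), repeat=len(observables)):
        g = dict(zip(observables, bits))
        if all(sum(g[o] for o in subset) % 2 == b for subset, b in theory):
            return True
    return False


MERMIN_COVER = [("XI", "IX", "XX"), ("IZ", "ZI", "ZZ"), ("XZ", "ZX", "YY"),
                ("XI", "IZ", "XZ"), ("IX", "ZI", "ZX"), ("XX", "ZZ", "YY")]

print(is_consistent(MERMIN_COVER))      # False
print(is_consistent(MERMIN_COVER[:3]))  # True
```

No state enters the computation: the equations come only from the operator products within each context, which is the sense in which the resulting inconsistency is state-independent.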
Proposition 3.5.
If is state-independently AvN then any possibilistic model on is AvN.
Proof.
What we want to show is that any possibilistic empirical model realized by any state satisfies . Here we deal with a density matrix for generality. For each linear equation , what we want to have is:
for realized by an arbitrary density matrix . Here we show it by contradiction.
Suppose not. Then it means such that . This equation with the assignment maps to an equation with an assignment of eigenvalues as follows:
When , it means that , where is obtaining eigenvalues sequentially going through each . Now look into the updated state:
where is the projector of each eigenvalue of . We have all ’s in , so they mutually commute, and so do their projectors , so the expression is valid. Then, from the definition of , we have . Then we can try measuring this from the updated state as follows:
Here we derive a contradiction:
This is because , which means , thus , resulting in that . Hence we have , , therefore . Since we have shown it with arbitrary , . However, since is not satisfiable, is also not satisfiable. ∎
We now highlight that this definition captures the idea discussed by Kochen and Specker [8]. Here, can be translated to a partial algebra in Kochen and Specker’s argument, where commeasurability corresponds to the measurement cover . The rule mapping the multiplicative equations of into a linear theory coincides with an embedding of into a Boolean algebra. The inconsistency of the linear theory implies the failure of embeddability, which was previously connected to the non-classical logic of quantum mechanics.
However, some ambiguity remains in matching these concepts to Kochen and Specker’s argument. In particular, the set is not necessarily closed under multiplication, which prevents a direct interpretation of as a partial algebra as in Kochen and Specker’s work. Since some useful arguments on quantum advantage, e.g., Refs. 23, 24, are based on Kochen-Specker type contextuality, we should characterize the relation between the set of observables and a partial algebra by defining the measurement cover of more rigorously.
3.5 Partial group and Kochen-Specker type contextuality
We start with the concept of a partial group. While the idea of a partial algebra was raised earlier by Kochen and Specker in 1975 [8], a definition of a partial group was only given later, by Assiry in 2018 [32]. Here we define an abelian partial group based on those two studies, without showing exactly how this definition connects to the aforementioned mathematical objects.
Definition 3.6.
An abelian partial group consists of a set , a binary relation , and a binary operation , satisfying the following properties:
- the relation is reflexive and symmetric, i.e., for any , and ;
- the relation is closed under the operation , i.e., if , and then ;
- there exists an identity such that for any and ;
- if a subset of satisfies for any , then generates an abelian group .
Such a relation is called the commutativity or commeasurability of . The binary operation is in fact a partial binary operation on , written . Note that for a commuting subset , the partial binary operation on becomes a commutative binary operation on the abelian group . Meanwhile, we may refer to an abelian partial group without specifying its commutativity and partial binary operation.
For example, the set of observables in Mermin’s square (including signs and the identity ) forms an abelian partial group where the commutativity corresponds to the usual commutativity relation of observables, and the partial binary operation is defined on a commuting set of observables. Note the following properties in this example:
- ;
- , but ;
- is a commuting subset of and generates an abelian group .
Remark.
A group can be interpreted as an abelian partial group where the commutativity corresponds to the usual commutativity relation of the elements of , and the partial binary operation is derived from the binary operation of . Thus, we shall say a group is an abelian partial group without specifying its commutativity relation and partial binary operation explicitly.
This definition of an abelian partial group characterizes the algebraic structure of quantum observables where the multiplication of observables is restricted by commutativity so that we can bring the hidden variable arguments to each commuting subset. Furthermore, it can extend to a partial algebra by introducing addition and scalar multiplication, so that it can coincide with an operator algebra, as in Kochen and Specker’s original work.
Kochen and Specker found that the hidden variable model and the quantum model branch out when we try to glue the measurement outcomes of overlapping commuting subsets to obtain the global hidden variable model, i.e., there exists a quantum model of measurement inconsistent with the hidden variable model on the global domain, but still satisfies the classical arguments on each commuting domain. This concept justifies the statement: the quantum model of measurement depends on a “context.”
While this partial algebraic structure plays an important role in quantum theory, we do notice that a set of observables in a measurement scenario is not generally a closed abelian partial group. For example, an scenario has an observable set which is not closed under multiplication, e.g., . This is exactly the difference between sheaf-theoretic contextuality and Kochen-Specker type contextuality: Kochen-Specker type contextuality is defined on an abelian partial group while sheaf-theoretic contextuality does not require the measurement set to be an abelian partial group.
Having said that, we can try linking those two definitions by considering the generation of an abelian partial group by the given set of measurements, or in other words, a partial closure of the given set.
Definition 3.7 (Partial closure).
For a given subset of , the partial closure of on is the smallest abelian partial group in containing .
For example, is a subset of a Pauli group of which a partial closure is an abelian partial group in . Likewise, the partial closure of is .
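A partial closure can be computed by repeatedly multiplying commuting pairs until no new element appears. The following sketch does this for the local two-qubit observables $\{XI, IX, ZI, IZ\}$. This generating set, the encoding of a signed Pauli operator as `(k, s)` meaning $i^k s$, and the names `mul`, `commute`, and `partial_closure` are assumptions made for illustration and need not coincide with the examples above.

```python
# Products of distinct single-qubit Paulis: (a, b) -> (c, k) with a*b = i^k * c.
_TABLE = {("X", "Y"): ("Z", 1), ("Y", "Z"): ("X", 1), ("Z", "X"): ("Y", 1),
          ("Y", "X"): ("Z", 3), ("Z", "Y"): ("X", 3), ("X", "Z"): ("Y", 3)}


def mul(p, q):
    """Multiply phased Pauli strings p = (k, s), each standing for i^k * s."""
    k, out = p[0] + q[0], []
    for a, b in zip(p[1], q[1]):
        if a == "I" or b == "I":
            c, dk = (b if a == "I" else a), 0
        elif a == b:
            c, dk = "I", 0
        else:
            c, dk = _TABLE[(a, b)]
        out.append(c)
        k += dk
    return (k % 4, "".join(out))


def commute(s, t):
    """Pauli strings commute iff they clash on an even number of qubits."""
    clashes = sum(1 for a, b in zip(s, t) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def partial_closure(generators):
    """Close a set of Pauli strings under products of commuting pairs only."""
    closure = {(0, g) for g in generators}
    while True:
        new = {mul(p, q) for p in closure for q in closure
               if commute(p[1], q[1])} - closure
        if not new:
            return closure
        closure |= new


CLOSURE = partial_closure(["XI", "IX", "ZI", "IZ"])
print(len(CLOSURE))                                # 20
print((0, "YY") in CLOSURE, (2, "YY") in CLOSURE)  # True True
print((0, "YI") in CLOSURE)                        # False
```

Under these assumptions the closure contains the identity and the nine observables of Mermin’s square, each with both signs, because $+YY = (XI \cdot IZ)(ZI \cdot IX)$ and $-YY = (XI \cdot IX)(ZI \cdot IZ)$ are both reachable. Operators such as $YI$ are excluded, since producing them would require multiplying the anticommuting pair $XI$, $ZI$.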
Finally, we can characterize the measurement cover of a set in the context of an abelian partial group of quantum observables. is defined on an abelian partial group , where each context is a maximal commuting subset of . Similarly, we can derive the measurement cover of , where each context is a maximal abelian group in . One may interpret as a family of the intersections of and the abelian groups , which also catches the idea that the measurement cover originates in the commutativity relation of .
Now, it is straightforward to specify Kochen-Specker type contextuality in the language of an abelian partial group.
Definition 3.8 (Kochen-Specker type contextuality).
Given a set , has Kochen-Specker type contextuality if there is no global assignment of eigenvalues consistent with the partial commutative multiplication in .
Kochen-Specker type contextuality is also called contextuality in a closed sub-theory, since the argument only deals with a part of the quantum observables. It is also clear that Kochen-Specker type contextuality is nothing but state-independent AvN in a partial closure, once we define state-independent AvN in a partial closure as follows:
Definition 3.9.
Given a set of observables , is state-independently AvN in a partial closure if is inconsistent.
Following the argument so far, we state the following corollary without giving proof.
Corollary 3.10.
A set of observables has Kochen-Specker type contextuality if and only if is state-independently AvN in a partial closure.

Kirby and Love [24] developed this concept to define their own way of witnessing contextuality, and applied it to evaluate the classical simulatability of practical quantum algorithms. Fig. 3 illustrates an example of the formalism Kirby and Love developed.
Definition 3.11.
A determining tree for a Pauli observable over a set of Pauli observables is a tree whose nodes are Pauli operators and whose leaves are operators in , such that:
- the root is ;
- all children of any particular parent pairwise commute as operators;
- every parent node is the operator product of its children.
We say that is determined by if there exists a determining tree over . For a determining tree , the determining set is defined to be the set containing one copy of each operator with odd multiplicity as a leaf in .
Note that if and only if there exists over . The determining tree depicts the inductive production of . Whenever we have , , which maps to a linear equation in . Thus, each determining tree gives a linear theory .
Now, given a determining tree , it is tempting to induce an assignment for from some global assignment , assigning a value by following the determining tree over . However, the determining tree for is not unique, and in fact, in general. The following theorem characterizes this failure to assign values consistently via determining trees.
Theorem 3.12 (KL contextuality).
A set of Pauli operators is state-independently AvN in a partial closure if and only if there exists a determining tree over and a determining tree over such that .

Corollary 3.13.
A set is state-independently AvN in a partial closure if and only if it contains a subset consisting of four operators whose commutability graph has one of the forms given in Fig. 4 (up to permutations of the operators).
We refer to Ref. 24 for the proof of Corollary 3.13. The converse direction of Theorem 3.12 is easily confirmed. Suppose that there exists such a pair of determining trees, and suppose induces a valid such that and . Then , since . However, since , , we have an equation , which leads to , a contradiction. Thus, no such assignment is valid.
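The situation in this argument can be reproduced on a small example. The sketch below evaluates two determining trees over the set $\{XI, IX, ZI, IZ\}$ whose roots are $+YY$ and $-YY$ but whose determining sets coincide. The choice of this set, the encoding of trees as nested tuples with Pauli strings as leaves, and the names `mul`, `commute`, `evaluate`, and `determining_set` are assumptions made for illustration; they are not taken from Ref. 24.

```python
from collections import Counter
from itertools import combinations

# Products of distinct single-qubit Paulis: (a, b) -> (c, k) with a*b = i^k * c.
_TABLE = {("X", "Y"): ("Z", 1), ("Y", "Z"): ("X", 1), ("Z", "X"): ("Y", 1),
          ("Y", "X"): ("Z", 3), ("Z", "Y"): ("X", 3), ("X", "Z"): ("Y", 3)}


def mul(p, q):
    """Multiply phased Pauli strings p = (k, s), each standing for i^k * s."""
    k, out = p[0] + q[0], []
    for a, b in zip(p[1], q[1]):
        if a == "I" or b == "I":
            c, dk = (b if a == "I" else a), 0
        elif a == b:
            c, dk = "I", 0
        else:
            c, dk = _TABLE[(a, b)]
        out.append(c)
        k += dk
    return (k % 4, "".join(out))


def commute(s, t):
    """Pauli strings commute iff they clash on an even number of qubits."""
    clashes = sum(1 for a, b in zip(s, t) if a != "I" and b != "I" and a != b)
    return clashes % 2 == 0


def evaluate(tree):
    """Return the phased Pauli (k, s) at the root of a determining tree."""
    if isinstance(tree, str):  # leaf: an operator from the given set
        return (0, tree)
    children = [evaluate(child) for child in tree]
    if not all(commute(p[1], q[1]) for p, q in combinations(children, 2)):
        raise ValueError("children of a node must pairwise commute")
    root = (0, "I" * len(children[0][1]))
    for child in children:
        root = mul(root, child)
    return root


def determining_set(tree):
    """Leaves that occur with odd multiplicity in the tree."""
    def leaves(t):
        if isinstance(t, str):
            return Counter([t])
        return sum((leaves(child) for child in t), Counter())
    return {leaf for leaf, n in leaves(tree).items() if n % 2 == 1}


TREE_PLUS = (("XI", "IZ"), ("ZI", "IX"))   # (XI*IZ) * (ZI*IX)
TREE_MINUS = (("XI", "IX"), ("ZI", "IZ"))  # (XI*IX) * (ZI*IZ)

print(evaluate(TREE_PLUS))   # (0, 'YY'), i.e. +YY
print(evaluate(TREE_MINUS))  # (2, 'YY'), i.e. -YY
print(determining_set(TREE_PLUS) == determining_set(TREE_MINUS))  # True
```

Any global assignment on the four leaves would force the same value for $+YY$ and $-YY$, which is the contradiction used in the converse direction.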
On the other hand, the forward direction of the theorem is less obvious, although we conjecture it to be true, as in Ref. 24. We leave this proof for future work, but state the result as a theorem rather than a conjecture.
3.6 Classification of contextuality

Even if a given empirical model is merely contextual, or even non-contextual, it might have AvN contextuality in a partial closure. For example, consider the Bell scenario. This is in fact the case that motivated this research: the Bell scenario with measurements is not contextual at all, but Mermin’s square shows that it is state-independently AvN in a partial closure. Thus, the argument of Ref. 17, that the 2-qubit scenario is unable to show strong non-locality, seems to disagree with this result. However, the statement is still correct, since it concerns non-locality, where only local observables are considered.
000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 | |
---|---|---|---|---|---|---|---|---|
1/4 | 1/4 | 1/4 | 1/4 | |||||
1/4 | 1/4 | 1/4 | 1/4 | |||||
1/4 | 1/4 | 1/4 | 1/4 | |||||
1/4 | 1/4 | 1/4 | 1/4 | |||||
⋮ |
000 | 001 | 010 | 011 | 100 | 101 | 110 | 111 | |
---|---|---|---|---|---|---|---|---|
1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | |
1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | |
1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | |
1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | 1/8 | |
⋮ |
Another point is that the state-independent AvN class is strictly smaller than the ordinary AvN class. In other words, there exist state-dependent AvN models. For example, consider Mermin’s star illustrated in Fig. 5. Table 6 shows the empirical model on scenario realized by the GHZ state, which is AvN. However, in the same measurement setting, we can realize another empirical model, shown in Table 7, with the equal superposition state . Here, every probability in the scenario is $1/8$, and the model turns out to be non-contextual.


As a result, we can summarize the relationship between different definitions of contextuality as in Fig. 6. However, the picture seems quite different in the operator-side view, as we illustrate in Fig. 7. Although we know of only a few cases of state-dependently contextual scenarios, they seem to merge into the state-independent AvN class when their partial closures are considered. In this view, we conjecture that state-dependent contextuality is in fact state-independent AvN in a partial closure. We present this idea in the following conjecture.
Conjecture 3.14.
Any measurement cover that realizes a contextual empirical model for some state is state-independently AvN in a partial closure.
This conjecture suggests that the partial closure may provide a way to connect sheaf-theoretic contextuality to state-independent contextuality. We leave the proof of this conjecture for future work.
4 Conclusion
In this report, we reviewed the sheaf-theoretic framework of contextuality and proposed state-independent AvN arguments. This work provides a coherent mathematical structure to compare each class of contextuality, clarifying the hierarchy of state-independent AvN - AvN - strong contextuality - contextuality. Kochen-Specker type contextuality integrates into this framework by considering a partial closure of the given set of measurements.
This work also develops the idea that contextuality does not necessarily require measurements to be “local.” While the compatibility condition of events and distributions originates in the no-signaling principle, the condition is still valid when we include non-local observables in the set of measurements. However, non-local observables must be handled with caution because of Tsirelson’s problem [29, 30, 31], which is also discussed in Ref. 33. Here we restricted our attention to the Pauli $n$-group to avoid this problem.
Whilst we organized a consistent framework of contextuality, it cannot be affirmed that this framework is the most effective framework for formalizing contextuality in every case. A graph-theoretic approach based on Kochen-Specker type contextuality may be suitable for some proofs of quantum advantage, or a topological framework may be more effective in other cases. However, this work, together with Aasnæss’s thesis [18], implies a certain relationship between those approaches, thus enabling the translation of arguments in each framework from one to another.
Future studies may work on proving that state-dependent contextuality merges into state-independent AvN when its partial closure is considered. It also remains open whether there is a set of observables that is state-independently contextual but not state-independently AvN, or an empirical model that is strongly contextual but not AvN. The generalization of state-independent contextuality to arbitrary self-adjoint operators on a complex Hilbert space would be another intriguing problem. Such discussions will clarify the point where classical and quantum information systems diverge.
In summary, this work presents an extensive approach from a sheaf-theoretic framework to Kochen-Specker type contextuality and state-independent contextuality, providing a consistent mathematical language to compare notions of contextuality. This will serve as a key tool to evaluate quantum advantage, guiding how the contextuality arguments from different frameworks can be translated to each other.
References
- Deutsch and Jozsa [1992] D. Deutsch and R. Jozsa, Proceedings of the Royal Society of London. Series A: Mathematical and Physical Sciences 439, 553 (1992).
- Shor [1994] P. Shor, in Proceedings 35th Annual Symposium on Foundations of Computer Science (1994) pp. 124–134.
- Greenberger et al. [1990] D. M. Greenberger, M. A. Horne, A. Shimony, and A. Zeilinger, American Journal of Physics 58, 1131 (1990).
- Cleve et al. [2010] R. Cleve, P. Hoyer, B. Toner, and J. Watrous, Consequences and limits of nonlocal strategies (2010), arXiv:quant-ph/0404076 [quant-ph] .
- Grover [1996] L. K. Grover, in Proceedings of the Twenty-Eighth Annual ACM Symposium on Theory of Computing, STOC ’96 (Association for Computing Machinery, New York, NY, USA, 1996) pp. 212–219.
- Peruzzo et al. [2014] A. Peruzzo, J. McClean, P. Shadbolt, M.-H. Yung, X.-Q. Zhou, P. J. Love, A. Aspuru-Guzik, and J. L. O’Brien, Nature Communications 5, 4213 (2014).
- Bell [1964] J. S. Bell, Physics Physique Fizika 1, 195 (1964).
- Kochen and Specker [1975] S. Kochen and E. P. Specker, The problem of hidden variables in quantum mechanics, in The Logico-Algebraic Approach to Quantum Mechanics: Volume I: Historical Evolution, edited by C. A. Hooker (Springer Netherlands, Dordrecht, 1975) pp. 293–328.
- Mermin [1990] N. D. Mermin, Phys. Rev. Lett. 65, 3373 (1990).
- Mermin [1993] N. D. Mermin, Rev. Mod. Phys. 65, 803 (1993).
- Abramsky and Brandenburger [2011] S. Abramsky and A. Brandenburger, New Journal of Physics 13, 113036 (2011).
- Abramsky et al. [2012] S. Abramsky, S. Mansfield, and R. S. Barbosa, Electronic Proceedings in Theoretical Computer Science 95, 1 (2012).
- Abramsky [2014] S. Abramsky, Contextual semantics: From quantum mechanics to logic, databases, constraints, and complexity (2014), arXiv:1406.7386 [quant-ph] .
- Abramsky et al. [2015] S. Abramsky, R. S. Barbosa, K. Kishida, R. Lal, and S. Mansfield, in 24th EACSL Annual Conference on Computer Science Logic (CSL 2015), Leibniz International Proceedings in Informatics (LIPIcs), Vol. 41, edited by S. Kreutzer (Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 2015) pp. 211–228.
- Abramsky et al. [2017a] S. Abramsky, R. S. Barbosa, G. Carù, and S. Perdrix, Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences 375, 20160385 (2017a).
- Abramsky et al. [2017b] S. Abramsky, R. S. Barbosa, and S. Mansfield, Phys. Rev. Lett. 119, 050504 (2017b).
- Abramsky et al. [2018] S. Abramsky, R. S. Barbosa, G. Carù, N. de Silva, K. Kishida, and S. Mansfield, in 12th Conference on the Theory of Quantum Computation, Communication and Cryptography (TQC 2017), Leibniz International Proceedings in Informatics (LIPIcs), Vol. 73, edited by M. M. Wilde (Schloss Dagstuhl–Leibniz-Zentrum fuer Informatik, Dagstuhl, Germany, 2018) pp. 9:1–9:20.
- Aasnæss [2022] S. Aasnæss, Comparing two cohomological obstructions for contextuality, and a generalised construction of quantum advantage with shallow circuits (2022), arXiv:2212.09382 [quant-ph] .
- Raussendorf [2013] R. Raussendorf, Phys. Rev. A 88, 022322 (2013).
- Bermejo-Vega et al. [2017] J. Bermejo-Vega, N. Delfosse, D. E. Browne, C. Okay, and R. Raussendorf, Phys. Rev. Lett. 119, 120505 (2017).
- Okay et al. [2017] C. Okay, S. Roberts, S. D. Bartlett, and R. Raussendorf, Topological proofs of contextuality in quantum mechanics (2017), arXiv:1701.01888 [quant-ph] .
- Cabello et al. [2014] A. Cabello, S. Severini, and A. Winter, Phys. Rev. Lett. 112, 040401 (2014).
- Karanjai et al. [2018] A. Karanjai, J. J. Wallman, and S. D. Bartlett, Contextuality bounds the efficiency of classical simulation of quantum processes (2018), arXiv:1802.07744 [quant-ph] .
- Kirby and Love [2019] W. M. Kirby and P. J. Love, Phys. Rev. Lett. 123, 200501 (2019).
- Bravyi et al. [2018] S. Bravyi, D. Gosset, and R. König, Science 362, 308 (2018).
- Bravyi et al. [2020] S. Bravyi, D. Gosset, R. König, and M. Tomamichel, Nature Physics 16, 1040 (2020).
- Heunen and Vicary [2019] C. Heunen and J. Vicary, Categories for Quantum Theory: An Introduction (Oxford University Press, 2019).
- Hartshorne [1977] R. Hartshorne, Algebraic geometry, Graduate Texts in Mathematics, Vol. 52 (Springer New York, NY, 1977).
- Tsirel’son [1987] B. S. Tsirel’son, Journal of Soviet Mathematics 36, 557 (1987).
- Vidick [2021] T. Vidick, MIP*=RE: A negative resolution to Connes’ embedding problem and Tsirelson’s problem (2021).
- Ji et al. [2022] Z. Ji, A. Natarajan, T. Vidick, J. Wright, and H. Yuen, MIP*=RE (2022), arXiv:2001.04383 [quant-ph] .
- Assiry [2018] A. Assiry, Partial groups (2018).
- Cleve et al. [2017] R. Cleve, L. Liu, and W. Slofstra, Journal of Mathematical Physics 58, 012202 (2017).