State-independent all-versus-nothing arguments

Boseong Kim Department of Physics, University College London,
Gower Street, London, WC1E 6BT, United Kingdom
   Samson Abramsky Department of Computer Science, University College London,
Gower Street, London, WC1E 6BT, United Kingdom
Abstract

Contextuality is a key feature of quantum information that challenges classical intuitions, providing the basis for constructing explicit proofs of quantum advantage. While many demonstrations of quantum advantage are based on contextuality arguments, the definition of contextuality differs from one study to another, which makes it difficult to relate their results directly. In this report, we review the mathematical structure of sheaf-theoretic contextuality and extend this framework to explain Kochen-Specker type contextuality. We first cover the definitions of contextuality with detailed examples. Then, we state the all-versus-nothing (AvN) argument and define a state-independent AvN class. It is shown that Kochen-Specker type contextuality, or contextuality in a partial closure, can be translated into this framework by the partial closure of observables under the multiplication of commuting measurements. Finally, we compare each case of contextuality in an operator-side view, where the strict hierarchy of contextuality classes in a state-side view seems to merge into the state-independent AvN class together with the partial closure formalism. Overall, this report provides a unified interpretation of contextuality by integrating Kochen-Specker type notions into the state-independent AvN argument. The results present novel insights into contextuality, which pave the way for a coherent approach to constructing proofs of quantum advantage.

preprint: Project Report

1 Introduction

So, why do we need a quantum computer? This question must be one of the most frequently asked questions for those who research quantum computers. The claim of exponential quantum advantage in the Deutsch-Jozsa algorithm [1] and the monumental factoring algorithm by Shor [2] opened an intriguing discussion on quantum advantage. They inspired numerous computational tasks for quantum information systems, for example, the GHZ [3] and magic square games [4], quantum search algorithm [5], and variational quantum eigensolver (VQE) [6] for quantum simulation and optimization.

These cases of quantum advantage are crucial for the development of quantum technologies since they can spark entirely new areas of application, opening new markets and research fields for quantum scientists. However, discovering and proving a new instance of quantum advantage is challenging work due to the complexity and counter-intuitive nature of quantum mechanics.

Contextuality is a useful tool to characterize the specific point where quantum information systems depart from classical systems. Originating from Bell’s pioneering work on the non-local behavior of quantum mechanics [7], Kochen and Specker [8] characterized the idea of contextuality based on the problem of a hidden variable theory. Subsequent investigations on quantum observables were made by Mermin [9, 10] who first used the term contextuality in his articles. Recent attempts to apply mathematical structures to contextuality [11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24] have gained a certain degree of success, producing proofs of quantum advantage in some computational processes [18, 20, 23].

1.1 Applications of contextuality to quantum advantage

Contextuality as a resource of quantum advantage was first discussed by Raussendorf [19] in a measurement-based quantum computing model (MBQC). His major results state that an MBQC requires contextuality when it computes a non-linear Boolean function with a certain probability. This idea was developed by Bermejo-Vega et al. [20], establishing the explicit proof that contextuality is a necessary resource for a large class of MBQC schemes.

It has also been observed that contextuality plays a crucial role in a gate-based quantum computing model. Bravyi, Gosset, and König [25] reported the quantum advantage of shallow circuits, which was the first mathematical proof of an unconditional quantum advantage for a certain class of quantum circuits. Here they considered strategies for non-local games recast into the circuit, proving a separation between classical and quantum computational models. Subsequently, this strategy was shown to be noise robust [26] and extended to distributed non-local games [18].

Karanjai, Wallman, and Bartlett [23] also developed their own framework of contextuality to prove that the spatial complexity of classical simulation of a quantum measurement process is bounded by contextuality. This result was further developed by Kirby and Love [24], who suggested applying contextuality to evaluate the quantum advantage of the VQE algorithm. They proposed a test to determine whether or not the given objective function for the VQE is contextual, employing the compatibility graph of Pauli operators and a measure called the contextual $p$-distance. This test detects the non-classicality of the objective function, thus filtering out classically simulatable procedures. This research has important implications for the application of contextuality to evaluate practical quantum algorithms.

1.2 Classification of contextuality

The sheaf-theoretic definition of contextuality has been instrumental in our understanding of contextuality, as it provides precise mathematical structure to the intuitive concept of contextuality. The sheaf-theoretic framework was first proposed by Abramsky and Brandenburger [11, 13], where they defined events and distributions on the measurement scenario and identified the sheaf structure of those concepts. Here, one can connect the global distribution to the hidden-variable model, which is well-known for its failure to explain the distinct features of quantum theory. Further discussion by Abramsky, Barbosa, and Mansfield [16] investigated a measure of contextuality. This work opened the way to quantify contextuality in the given quantum scenario.

The subsequent development of the cohomological approach to contextuality also provides a substantial methodology for witnessing contextuality in a given measurement scenario. Abramsky, Mansfield, and Barbosa [12] proposed an approach based on the Čech cohomology invariant, which leverages powerful tools of sheaf cohomology to detect contextuality in the empirical model. The proposal by Okay, Roberts, Bartlett, and Raussendorf [21] established a topological approach to identifying contextuality, which has the potential to provide a more refined analysis, although an additional topological structure must be considered. Those approaches were connected by Aasnæss [18], complementing generality and completeness in each approach by translating arguments from one to another.

On the other hand, a stronger form of contextuality, namely, the all-versus-nothing (AvN) argument was also characterized by the same group. Abramsky et al. [14, 15] formalized the logical inconsistency in quantum information systems into an AvN argument referring back to the observation by Mermin [9, 10]. This class of contextuality is also observed as an obstruction in the cohomology group in the work by Aasnæss [18].

While the sheaf-theoretic framework provides the basis of the arguments for the quantum advantage of MBQCs and shallow circuits, the last application, Refs. 23, 24, traces back to Kochen and Specker's framework for formalizing contextuality, so-called contextuality in a closed sub-theory. This notion seems to describe the same idea as sheaf-theoretic contextuality, but it is based solely on arguments about the operator algebra of observables. Moreover, the interesting point is that they place the 2-qubit measurement scenario of certain observables in the stronger class of contextuality, while the 2-qubit scenario was previously claimed to be unable to show strong non-locality [17]. This conflict motivated our research on the classification of contextuality, giving a specific case for the two different points of view. To anticipate the result, it turns out that Kochen-Specker type contextuality actually deals with an abelian partial group generated by the given set of observables. Furthermore, we do find strong contextuality in this generated abelian partial group, i.e., in a partial closure.

In this report, we review the mathematical structure of the sheaf-theoretic formalism of contextuality and refine the framework to characterize the connecting point to Kochen-Specker type contextuality. We first go through strict definitions of measurement scenarios, measurement covers, events, distributions, and empirical models. Under these definitions, we review the connection between contextuality and sheaf theory. Then, we further develop those ideas to classify strong contextuality and all-versus-nothing (AvN) arguments, which define stronger classes of contextuality. The main proposal of this report is the definition of a state-independent AvN class, which we prove induces AvN arguments for any state realizing the measurement scenario. Then, we show that Kochen-Specker type contextuality is translated into the state-independent AvN class in a partial closure, by considering the abelian partial group structure of a Pauli $n$-group. Finally, we summarize the classes of contextuality in a state-side view and an operator-side view, where we can notice that state-dependent contextual scenarios seem to merge into the state-independent AvN class in a partial closure.

2 Sheaf-theoretic contextuality

The sheaf-theoretic framework of Abramsky and Brandenburger [11] provides a mathematical structure to formalize the concept of contextuality. In this section, we go through the definitions used in the sheaf-theoretic approach with the example of the Bell scenario. We also discuss the contextual fraction as a useful concept for quantifying contextuality, referring to the paper by Abramsky, Barbosa, and Mansfield [16].

2.1 Measurement scenarios

Definition 2.1.

A measurement scenario is a triple $\langle X, \mathcal{M}, O \rangle$ where:

  • $X$ is a finite set of measurements;

  • $\mathcal{M} \subset \mathcal{P}(X)$ is a family of measurement contexts, where each context $C \in \mathcal{M}$ represents a set of measurements that can be performed together;

  • $O$ is a finite set of outcomes.

For example, consider the well-known Bell scenario, in which two experimenters, Alice and Bob, each choose one of two measurements, say $a_1$ or $a_2$ for Alice and $b_1$ or $b_2$ for Bob, and each obtain one of two possible outcomes. It is represented as follows:

\[
X = \{a_1, a_2, b_1, b_2\}, \quad O = \{0, 1\}, \quad
\mathcal{M} = \left\{\{a_1, b_1\}, \{a_1, b_2\}, \{a_2, b_1\}, \{a_2, b_2\}\right\}.
\]

Here, the measurement contexts $C \in \mathcal{M}$ express that we are allowed to measure either $a_1$ or $a_2$, but not both (and likewise for $b_1$ and $b_2$). We will generally assume a measurement scenario $\langle X, \mathcal{M}, O \rangle$ for the rest of this paper. We also refer to Bell-type measurement scenarios by a tuple $(n, k, l)$, where $n$ is the number of agents, $k$ is the number of possible measurements for each agent, and $l$ is the number of possible outcomes. Thus, the example is a $(2, 2, 2)$ scenario.

The example also shows that some additional information about $\mathcal{M}$ is needed to describe situations in quantum mechanics. While we described $\mathcal{M}$ as consisting of “sets of measurements that can be performed together,” it remains unclear how many (or how few) sets are needed to fully characterize a given scenario. Hence, we aim to collect enough sets to cover all possible measurements, while keeping the family as compact as possible by letting each set represent a maximal set of compatible measurements. This leads to the definition of the measurement cover.

Definition 2.2.

A measurement cover $\mathcal{M}$ on the set $X$ of measurements is a family of measurement contexts such that:

  • $\mathcal{M}$ covers $X$: $\bigcup_{C \in \mathcal{M}} C = X$;

  • $\mathcal{M}$ is an anti-chain: if $C, C' \in \mathcal{M}$ and $C \subset C'$, then $C = C'$.

We impose the anti-chain condition because we focus on the maximal compatible sets of measurements.
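The two conditions of Definition 2.2 can be checked mechanically. The following is a minimal Python sketch for the Bell scenario; the function names `covers`, `is_antichain`, and `is_measurement_cover` are illustrative choices, not part of the original framework.

```python
from itertools import combinations

# Bell (2,2,2) scenario from the text.
X = {"a1", "a2", "b1", "b2"}
M = [frozenset(c) for c in (("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2"))]

def covers(contexts, measurements):
    """True if the union of all contexts equals the measurement set."""
    return set().union(*contexts) == set(measurements)

def is_antichain(contexts):
    """True if no context is strictly contained in another one."""
    return not any(c < d or d < c for c, d in combinations(contexts, 2))

def is_measurement_cover(contexts, measurements):
    """Both conditions of Definition 2.2."""
    return covers(contexts, measurements) and is_antichain(contexts)

print(is_measurement_cover(M, X))                        # True
# Adding the sub-context {a1} breaks the anti-chain condition.
print(is_measurement_cover(M + [frozenset({"a1"})], X))  # False
```

Representing contexts as `frozenset` objects lets the strict-subset operator `<` express the anti-chain condition directly.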

2.2 Events

Definition 2.3.

Given a set of measurements $X$ and a set of outcomes $O$, an event or an assignment over $U \subset X$ is a function $s: U \rightarrow O$.

For example, consider the event of measuring $a_1$ and $b_1$ in the Bell scenario and obtaining outcomes 0 and 1, respectively. We represent this event by a tuple $(a_1 \mapsto 0, b_1 \mapsto 1)$, or simply, $(0, 1)$. Here we further focus on the mathematical structure of events. We must strictly distinguish the events $(a_1 \mapsto 0, b_1 \mapsto 1)$ and $(a_1 \mapsto 0)$. In the latter case, there is no information about what was measured on Bob's side. This means not only that we do not know the outcome Bob obtained, but also that we cannot even say which measurement, $b_1$ or $b_2$, Bob performed. An event is therefore specified by the set $U$ of measurements together with a value $o \in O$ assigned to each measurement $x \in U$. In fact, we can define this set of events as a categorical functor including the restriction map, which formalizes “forgetting” the information that lies outside the smaller context.

Definition 2.4.

Given a set of measurements $X$ and a set of outcomes $O$, a functor $\mathcal{E}: \mathcal{P}(X)^{\mathsf{op}} \rightarrow \mathbf{Set}$, called the event sheaf, is defined as follows:

  • $\forall U \subset X$, $\mathcal{E}(U) := \prod_{x \in U} O$;

  • $\forall U, U' \subset X$ with $U \subset U'$, a map $\mathsf{res}^{U'}_{U}: \mathcal{E}(U') \rightarrow \mathcal{E}(U)$ of events is defined by the usual functional restriction $\mathsf{res}^{U'}_{U}(s) = \left.s\right|_{U}$.

Figure 1: Visualization of an event sheaf. An event sheaf is a functor consisting of a mapping from $\mathcal{P}(X)$ to a set of events and a restriction mapping between sets of events.
Remark.

$\mathbf{Set}$ denotes the category of sets and functions. Here, it is composed of sets of events and restriction maps. $\mathcal{P}(X)$ denotes the power set category of $X$, whose objects are the subsets of $X$ and whose morphisms are the inclusion maps. $\mathcal{P}(X)^{\mathsf{op}}$ denotes its opposite, having projection maps instead of inclusion maps. Thus, for $U \subset U'$, a projection $\pi: U' \rightarrow U$ is mapped by $\mathcal{E}$ to $\mathsf{res}^{U'}_{U}: \mathcal{E}(U') \rightarrow \mathcal{E}(U)$. Refer to Ref. 27 for the categorical notions used in quantum theory.

Fig. 1 illustrates an example of an event sheaf defined in the Bell scenario. $\mathcal{E}$ maps each subset $U \subset X$ to a set of functions, which we call events on $U$. Each event $s$ is a mapping from $U$ to $O$ that assigns an outcome to each measurement $x \in U$.
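As a concrete illustration of Definition 2.4, the sketch below enumerates the events over a set of measurements and implements the restriction map. Representing an event as a Python `dict` and the helper names `events` and `restrict` are assumptions made for this example only.

```python
from itertools import product

O = (0, 1)  # outcome set of the Bell scenario

def events(U):
    """E(U): all assignments s: U -> O, each stored as a dict."""
    U = sorted(U)
    return [dict(zip(U, outcomes)) for outcomes in product(O, repeat=len(U))]

def restrict(s, U):
    """res(s) = s|_U: forget the outcomes of measurements outside U."""
    return {m: s[m] for m in U}

s = {"a1": 0, "b1": 1}
print(len(events({"a1", "b1"})))  # 4
print(restrict(s, {"a1"}))        # {'a1': 0}
```

Restricting in two steps gives the same result as restricting once, which is the functoriality condition on the restriction maps.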

2.3 Distributions and empirical models

Definition 2.5.

For any set $X$ and a semiring $R$, the support $\mathsf{supp}(\phi)$ of a function $\phi: X \rightarrow R$ is defined by:

\[
\mathsf{supp}(\phi) = \left\{x \in X \mid \phi(x) \neq 0\right\}.
\]

Definition 2.6.

For any set $X$ and a semiring $R$, an $R$-distribution on $X$ is a function $d: X \rightarrow R$ which has finite support, and is such that:

\[
\sum_{x \in X} d(x) = 1.
\]

Definition 2.7.

For any sets $X, Y$ and a semiring $R$, $\mathcal{D}_R$ is a functor defined as follows:

  • $\mathcal{D}_R(X)$ is the set of $R$-distributions on $X$;

  • given a function $f: X \rightarrow Y$, the action of $\mathcal{D}_R$ on $f$ is defined by:

    \[
    \mathcal{D}_R(f): \mathcal{D}_R(X) \rightarrow \mathcal{D}_R(Y) :: d \mapsto \left[y \mapsto \sum_{f(x) = y} d(x)\right].
    \]

Now we can compose the functor $\mathcal{D}_R: \mathbf{Set} \rightarrow \mathbf{Set}$ with the event sheaf $\mathcal{E}: \mathcal{P}(X)^{\mathsf{op}} \rightarrow \mathbf{Set}$ to form a functor $\mathcal{D}_R\mathcal{E}: \mathcal{P}(X)^{\mathsf{op}} \rightarrow \mathbf{Set}$. For each $U \subset X$, a distribution $d \in \mathcal{D}_R\mathcal{E}(U)$ is a map $\mathcal{E}(U) \rightarrow R$.
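The action of $\mathcal{D}_R$ on a function in Definition 2.7 is the familiar pushforward of a distribution. The sketch below assumes $R$ is the semiring of non-negative rationals (via Python's `fractions.Fraction`); the names `is_distribution` and `pushforward`, and the sample distribution, are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

def is_distribution(d):
    """Finite-support weights that are non-negative and sum to 1."""
    return all(w >= 0 for w in d.values()) and sum(d.values()) == 1

def pushforward(f, d):
    """D_R(f): send d on X to the map y -> sum of d(x) over all x with f(x) = y."""
    out = defaultdict(Fraction)  # Fraction() is 0
    for x, w in d.items():
        out[f(x)] += w
    return dict(out)

# A hypothetical distribution on X = {1, 2, 3}, pushed forward along parity.
d = {1: Fraction(1, 2), 2: Fraction(1, 4), 3: Fraction(1, 4)}
parity = lambda x: x % 2
print(pushforward(parity, d))  # {1: Fraction(3, 4), 0: Fraction(1, 4)}
```

When `f` is a restriction map of the event sheaf, this pushforward is exactly the marginalization of a distribution onto a smaller set of measurements.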

Table 1: Empirical model on the CHSH $(2,2,2)$ scenario. The table is obtained from local projective equatorial measurements at angles $0$ for $a_1$, $b_1$ and $\pi/3$ for $a_2$, $b_2$ on the two-qubit Bell state $\left|\phi^{+}\right>$.
Context          $(0,0)$   $(0,1)$   $(1,0)$   $(1,1)$
$\{a_1,b_1\}$    1/2       0         0         1/2
$\{a_1,b_2\}$    3/8       1/8       1/8       3/8
$\{a_2,b_1\}$    3/8       1/8       1/8       3/8
$\{a_2,b_2\}$    1/8       3/8       3/8       1/8

For example, consider the Bell scenario again. Table 1 realizes the Clauser-Horne-Shimony-Holt (CHSH) model, obtained from local projective equatorial measurements at angles $0$ for $a_1$, $b_1$ and $\pi/3$ for $a_2$, $b_2$ on the two-qubit Bell state. In the table, each row represents a context $C$, and each column represents an event $s \in \mathcal{E}(C)$, $s: C \rightarrow O$, indexed by the outcome for each measurement. Then the first row of the table is a distribution $d \in \mathcal{D}_R\mathcal{E}(\{a_1, b_1\})$ mapping each event $s \in \mathcal{E}(\{a_1, b_1\})$ to the semiring $\mathbb{R}_{\geq 0}$ as follows:

\[
d((0,0)) = 1/2, \quad d((0,1)) = 0, \quad d((1,0)) = 0, \quad d((1,1)) = 1/2.
\]
Remark.

Given $U \subset U'$, $\mathcal{D}_R\mathcal{E}$ maps the projection map $\pi: U' \rightarrow U$ to:

\[
\mathcal{D}_R\mathcal{E}(U') \rightarrow \mathcal{D}_R\mathcal{E}(U) :: d \mapsto \left.d\right|_{U},
\]

where, for each event $s \in \mathcal{E}(U)$:

\[
\left.d\right|_{U}(s) = \sum_{s' \in \mathcal{E}(U'),\, \left.s'\right|_{U} = s} d(s').
\]

Thus $\left.d\right|_{U}$ is the marginal of the distribution $d$, which assigns to each event $s$ in the smaller context $U$ the sum of the weights of all events $s'$ in the larger context $U'$ which restrict to $s$.

As will become clear later in this paper, the functor $\mathcal{D}_R\mathcal{E}$ is not a sheaf for the semirings we are concerned with. However, when we look into a quantum system, we only deal with a specific family of distributions. Such a family is characterized by an empirical model.

Definition 2.8.

Given a measurement cover $\mathcal{M}$, a no-signaling empirical model $e$ for $\mathcal{M}$ is a compatible family of distributions such that:

  • for any measurement context $C \in \mathcal{M}$, $e_C$ is a local distribution of $\mathcal{D}_R\mathcal{E}$ at $C$, i.e. $e_C \in \mathcal{D}_R\mathcal{E}(C)$;

  • $e$ is compatible for any measurement contexts $C, C' \in \mathcal{M}$, i.e. $\left.e_C\right|_{C \cap C'} = \left.e_{C'}\right|_{C \cap C'}$.

The compatibility condition corresponds to the concept of no-signaling in the sense that the choice of context, $C$ or $C'$, does not affect the local distribution at $C \cap C'$.

Now for our Bell scenario, we can show that Table 1 defines an empirical model $e$ for the measurement cover $\mathcal{M} = \left\{\{a_1, b_1\}, \{a_1, b_2\}, \{a_2, b_1\}, \{a_2, b_2\}\right\}$, where $e_C$ is the distribution specified by the corresponding row of the table. The compatibility condition is verified by computing the marginals, for example, for the first two rows:

\[
\begin{aligned}
\left.e_{\{a_1,b_1\}}\right|_{a_1}((0)) &= \sum_{\left.s'\right|_{a_1} = (0)} e_{\{a_1,b_1\}}(s') = e_{\{a_1,b_1\}}((0,0)) + e_{\{a_1,b_1\}}((0,1)) = 1/2 + 0 = 1/2, \\
\left.e_{\{a_1,b_2\}}\right|_{a_1}((0)) &= \sum_{\left.s'\right|_{a_1} = (0)} e_{\{a_1,b_2\}}(s') = e_{\{a_1,b_2\}}((0,0)) + e_{\{a_1,b_2\}}((0,1)) = 3/8 + 1/8 = 1/2; \\
\left.e_{\{a_1,b_1\}}\right|_{a_1}((1)) &= \sum_{\left.s'\right|_{a_1} = (1)} e_{\{a_1,b_1\}}(s') = e_{\{a_1,b_1\}}((1,0)) + e_{\{a_1,b_1\}}((1,1)) = 0 + 1/2 = 1/2, \\
\left.e_{\{a_1,b_2\}}\right|_{a_1}((1)) &= \sum_{\left.s'\right|_{a_1} = (1)} e_{\{a_1,b_2\}}(s') = e_{\{a_1,b_2\}}((1,0)) + e_{\{a_1,b_2\}}((1,1)) = 1/8 + 3/8 = 1/2.
\end{aligned}
\]

Thus, we verify that $\left.e_{\{a_1,b_1\}}\right|_{a_1} = \left.e_{\{a_1,b_2\}}\right|_{a_1}$. Here we denote the restriction to the one-point set $\{m\}$ by $\left.s\right|_{m}$ rather than $\left.s\right|_{\{m\}}$. We shall keep this abbreviation throughout this paper since it causes no ambiguity. The same computation for the other contexts $C \in \mathcal{M}$ shows that $e$ is a no-signaling empirical model. We shall also use the term empirical model for the no-signaling model, as we will only focus on quantum-like models satisfying the no-signaling condition.
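The marginal computation above can be repeated for every pair of contexts. The following sketch encodes Table 1 with exact fractions and checks the compatibility condition of Definition 2.8; the data layout and the function names `marginal` and `is_no_signalling` are assumptions made for illustration.

```python
from fractions import Fraction as F
from itertools import combinations

# Rows of Table 1: context -> {outcome tuple: probability},
# with outcomes listed in the same order as the context.
table1 = {
    ("a1", "b1"): {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(0), (1, 1): F(1, 2)},
    ("a1", "b2"): {(0, 0): F(3, 8), (0, 1): F(1, 8), (1, 0): F(1, 8), (1, 1): F(3, 8)},
    ("a2", "b1"): {(0, 0): F(3, 8), (0, 1): F(1, 8), (1, 0): F(1, 8), (1, 1): F(3, 8)},
    ("a2", "b2"): {(0, 0): F(1, 8), (0, 1): F(3, 8), (1, 0): F(3, 8), (1, 1): F(1, 8)},
}

def marginal(context, dist, U):
    """Marginal of dist (defined on `context`) onto the sub-context U.

    Events on U are keyed by a sorted tuple of (measurement, outcome) pairs.
    """
    out = {}
    for outcomes, w in dist.items():
        s = tuple(sorted((m, o) for m, o in zip(context, outcomes) if m in U))
        out[s] = out.get(s, F(0)) + w
    return out

def is_no_signalling(model):
    """Check e_C|_{C ∩ C'} == e_C'|_{C ∩ C'} for every pair of contexts."""
    for (C, eC), (D, eD) in combinations(model.items(), 2):
        U = set(C) & set(D)
        if marginal(C, eC, U) != marginal(D, eD, U):
            return False
    return True

print(is_no_signalling(table1))  # True
```

Exact rational arithmetic avoids spurious mismatches that floating-point rounding could introduce when comparing marginals.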

2.4 Sheaf condition and global distributions

We previously defined two functors, $\mathcal{E}$ and $\mathcal{D}_R\mathcal{E}$. In fact, both can be identified as presheaves, and $\mathcal{E}$ can further be identified as a sheaf, justifying its name, “event sheaf.” Here, we go through the definitions of presheaves and sheaves, following Ref. 28, to make these facts clear. Note that we can regard the measurement cover $\mathcal{M}$ as a generating set for a topology on $X$, making each context $C$ an open subset of $X$.

Definition 2.9.

Let $X$ be a topological space. A presheaf $\mathscr{F}$ on $X$ consists of the data:

  (i) for every open subset $U \subset X$, an object $\mathscr{F}(U)$;

  (ii) for every inclusion $U \subset U'$ of open subsets of $X$, a morphism of objects $\rho_{U}^{U'}: \mathscr{F}(U') \rightarrow \mathscr{F}(U)$;

subject to the conditions:

  (a) $\rho_{U}^{U}$ is the identity map $\mathscr{F}(U) \rightarrow \mathscr{F}(U)$;

  (b) if $U \subset U' \subset U''$ are three open subsets, then $\rho_{U}^{U''} = \rho_{U}^{U'} \circ \rho_{U'}^{U''}$.

Definition 2.10.

A presheaf $\mathscr{F}$ on a topological space $X$ is a sheaf if it satisfies the following supplementary conditions:

  (c) (Locality) if $U$ is an open set, if $\{V_i\}$ is an open covering of $U$, and if $s, t \in \mathscr{F}(U)$ are elements such that $\left.s\right|_{V_i} = \left.t\right|_{V_i}$ for all $i$, then $s = t$;

  (d) (Gluing) if $U$ is an open set, if $\{V_i\}$ is an open covering of $U$, and if we have elements $s_i \in \mathscr{F}(V_i)$ for each $i$, with the property that for each $i, j$, $\left.s_i\right|_{V_i \cap V_j} = \left.s_j\right|_{V_i \cap V_j}$, then there is an element $s \in \mathscr{F}(U)$ such that $\left.s\right|_{V_i} = s_i$ for each $i$.

In general, we assume the discrete topology on the set of measurements $X$. It is easy to show that $\mathcal{E}$ is a sheaf and $\mathcal{D}_R\mathcal{E}$ is a presheaf. However, $\mathcal{D}_R\mathcal{E}$ generally fails to satisfy the sheaf conditions.

Table 2: Two different distributions over the same measurement context, which agree on the marginal of each single measurement.
Distribution   $(0,0)$   $(0,1)$   $(1,0)$   $(1,1)$
$d_1$          1/2       0         0         1/2
$d_2$          1/4       1/4       1/4       1/4

For a counter-example to locality, consider the two different probability distributions defined on a measurement context $\{a, b\}$ in Table 2. We can verify that $d_1$ and $d_2$ agree on every single-measurement event:

\[
\begin{aligned}
& \left.d_1\right|_{a}((0)) = \left.d_1\right|_{a}((1)) = \left.d_1\right|_{b}((0)) = \left.d_1\right|_{b}((1)) \\
={} & \left.d_2\right|_{a}((0)) = \left.d_2\right|_{a}((1)) = \left.d_2\right|_{b}((0)) = \left.d_2\right|_{b}((1)) = 1/2.
\end{aligned}
\]
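The failure of locality for Table 2 can be confirmed in a few lines. This is a minimal sketch in which the helper `marginal_on` is a hypothetical name and distributions are indexed by outcome pairs $(s(a), s(b))$.

```python
from fractions import Fraction as F

# The two distributions of Table 2 on the context {a, b}.
d1 = {(0, 0): F(1, 2), (0, 1): F(0), (1, 0): F(0), (1, 1): F(1, 2)}
d2 = {s: F(1, 4) for s in d1}

def marginal_on(dist, index):
    """Marginal on one measurement: index 0 stands for a, index 1 for b."""
    out = {0: F(0), 1: F(0)}
    for s, w in dist.items():
        out[s[index]] += w
    return out

same_marginals = all(marginal_on(d1, i) == marginal_on(d2, i) for i in (0, 1))
print(same_marginals, d1 == d2)  # True False
```

The two distributions restrict to the same data on the covering $\{\{a\}, \{b\}\}$ yet differ on $\{a, b\}$, which is exactly what the locality condition forbids.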
Table 3: Empirical model of the PR-box. This model is not realizable in quantum information systems, but it still satisfies the conditions of a no-signaling empirical model.
Context          $(0,0)$   $(0,1)$   $(1,0)$   $(1,1)$
$\{a_1,b_1\}$    1/2       0         0         1/2
$\{a_1,b_2\}$    1/2       0         0         1/2
$\{a_2,b_1\}$    1/2       0         0         1/2
$\{a_2,b_2\}$    0         1/2       1/2       0

For a counter-example to gluing, the best example is the PR-box. Table 3 shows the empirical model of the PR-box, which is not realizable by a quantum system but still satisfies the conditions of a no-signaling empirical model. We can try gluing the first three rows to obtain a probability distribution over $X$; in this case, there exists a unique global distribution $d \in \mathcal{D}_R\mathcal{E}(X)$ compatible with them, such that $d((0,0,0,0)) = d((1,1,1,1)) = 1/2$ and $d(s) = 0$ otherwise. However, it fails to match the distribution over the context $\{a_2, b_2\}$, since this global distribution assigns zero probability to $s(a_2) \neq s(b_2)$.
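The gluing argument can be checked by brute force, assuming $R = \mathbb{R}_{\geq 0}$: a global distribution can only put weight on global assignments whose restriction to every context lies in the support of that context, because each marginal weight is a sum of non-negative terms. The sketch below enumerates such assignments for Table 3; the names `support` and `consistent` are illustrative.

```python
from itertools import product

X = ("a1", "a2", "b1", "b2")
# Support of each row of Table 3: outcome pairs with non-zero probability.
support = {
    ("a1", "b1"): {(0, 0), (1, 1)},
    ("a1", "b2"): {(0, 0), (1, 1)},
    ("a2", "b1"): {(0, 0), (1, 1)},
    ("a2", "b2"): {(0, 1), (1, 0)},
}

def consistent(s, contexts):
    """True if the global assignment s restricts into the support of each context."""
    return all(tuple(s[m] for m in C) in support[C] for C in contexts)

global_assignments = [dict(zip(X, o)) for o in product((0, 1), repeat=len(X))]
first_three = list(support)[:3]

# Gluing the first three rows leaves only the two constant assignments.
glued = [s for s in global_assignments if consistent(s, first_three)]
print([tuple(s[m] for m in X) for s in glued])  # [(0, 0, 0, 0), (1, 1, 1, 1)]

# None of them is compatible with the fourth row, so no global distribution exists.
print([s for s in glued if consistent(s, list(support))])  # []
```

Since no global assignment is consistent with all four supports, the obstruction here is already visible at the level of supports, without using the numerical probabilities.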

In general, empirical models are compatible families of distributions. Although the sheaf condition does not hold for the entire presheaf $\mathcal{D}_R\mathcal{E}$, it is possible to ask whether the gluing property holds for such a specific family $\{e_C\}_{C \in \mathcal{M}}$. This is equivalent to asking whether there exists a global distribution $d \in \mathcal{D}_R\mathcal{E}(X)$, defined on the entire set of measurements $X$. Such a global distribution defines a distribution on the set $\mathcal{E}(X) = O^X$, which marginalizes to yield the behavior of the empirical model: i.e. $\forall C \in \mathcal{M}$, $\left.d\right|_{C} = e_C$.

Remark.

A global distribution is often called a global section in a sheaf-theoretic sense. However, in this paper, we will keep the notation of event/assignment and distribution to distinguish one from another.

Now, let’s consider the meaning of the existence of a global distribution for a given empirical model $e$. Here we will show that $e$ has a global distribution if and only if it is realized by a factorisable hidden-variable model.

Definition 2.11.

Given a measurement cover $\mathcal{M}$, let $\Lambda$ be a set of values for a hidden variable. A hidden-variable model $h$ over $\Lambda$ is an assignment such that:

  • $h_\Lambda \in \mathcal{D}_R(\Lambda)$;

  • $\forall \lambda \in \Lambda$ and $C \in \mathcal{M}$, $h_C^\lambda \in \mathcal{D}_R\mathcal{E}(C)$;

  • $\forall \lambda \in \Lambda$ and $C, C' \in \mathcal{M}$, $\left.h_C^\lambda\right|_{C \cap C'} = \left.h_{C'}^\lambda\right|_{C \cap C'}$.

Definition 2.12.

A hidden-variable model $h$ realizes an empirical model $e$ if, $\forall C \in \mathcal{M}$ and $s \in \mathcal{E}(C)$:

\[
e_C(s) = \sum_{\lambda \in \Lambda} h_C^\lambda(s) \cdot h_\Lambda(\lambda).
\]

Definition 2.13.

A hidden-variable model $h$ is factorisable if, $\forall C \in \mathcal{M}$ and $s \in \mathcal{E}(C)$:

\[
h_C^\lambda(s) = \prod_{m \in C} \left.h_C^\lambda\right|_{m}(\left.s\right|_{m}).
\]
Theorem 2.14.

Let $e$ be an empirical model defined on a measurement cover $\mathcal{M}$ for a distribution functor $\mathcal{D}_R$. The following are equivalent.

  (a) $e$ has a global distribution.

  (b) $e$ has a realization by a factorisable hidden-variable model.

Proof.

$(a) \rightarrow (b)$: For each $s \in O^X$, we define a distribution $\delta_s \in \mathcal{D}_R\mathcal{E}(X)$ such that $\delta_s(s') = 1$ for $s = s'$ and $0$ otherwise; the distributions $\delta_{\left.s\right|_{C}}$ on the smaller sets $\mathcal{E}(C)$ are defined in the same way. Then, if $e$ has a global distribution $d$:

\[
e_C(s) = \left.d\right|_{C}(s) = \sum_{s' \in \mathcal{E}(X),\, \left.s'\right|_{C} = s} d(s') = \sum_{s' \in \mathcal{E}(X)} \delta_{\left.s'\right|_{C}}(s) \cdot d(s').
\]

Here, we identify $\mathcal{E}(X)$ with $\Lambda$, $d$ with $h_\Lambda$, and $\delta_{\left.s'\right|_{C}}$ with $h_C^\lambda$, which shows that $e$ is realized by a hidden-variable model. Moreover, this model is factorisable because:

\[
\delta_{\left.s'\right|_{C}}(s) = \prod_{m \in C} \delta_{\left.s'\right|_{m}}(\left.s\right|_{m}).
\]

(b)(a)(b)\rightarrow(a): Suppose that ee is realized by a factorisable hidden-variable model hh. For each mXm\in X, we define hmλ:=hCλ|m𝒟R(m)h_{m}^{\lambda}:=\left.h_{C}^{\lambda}\right|_{m}\in\mathcal{D}_{R}\mathcal{E}(m) for any CC\in\mathcal{M} such that mCm\in C. By the compatibility of the family {hCλ}\left\{h_{C}^{\lambda}\right\}, this definition is independent of the choice C, being well defined. We define a distribution hXλ𝒟R(X)h_{X}^{\lambda}\in\mathcal{D}_{R}\mathcal{E}(X) for each λΛ\lambda\in\Lambda by:

hXλ(s)=mXhmλ(s|m).h_{X}^{\lambda}(s)=\prod_{m\in X}h_{m}^{\lambda}(\left.s\right|_{m}).

Now to show that hXλh_{X}^{\lambda} is a distribution, let the set of measurements XX be enumerated as X={m1,,mp}X=\{m_{1},\cdots,m_{p}\}. Specify the global assignment s(X)s\in\mathcal{E}(X) by a tuple (o1,,op)(o_{1},\cdots,o_{p}), where oi=s(mi)o_{i}=s(m_{i}). Then we can calculate:

s(X)mXhmλ(s|m)\displaystyle\displaystyle\sum_{s\in\mathcal{E}(X)}\prod_{m\in X}h_{m}^{\lambda}(\left.s\right|_{m})
=\displaystyle= s(X)i=1phmiλ(s|mi)\displaystyle\displaystyle\sum_{s\in\mathcal{E}(X)}\prod_{i=1}^{p}h_{m_{i}}^{\lambda}(\left.s\right|_{m_{i}})
=\displaystyle= o1hm1λ(m1o1)(o2hm2λ(m2o2)((ophmpλ(mpop))))\displaystyle\displaystyle\sum_{o_{1}}h_{m_{1}}^{\lambda}(m_{1}\mapsto o_{1})\cdot\left(\sum_{o_{2}}h_{m_{2}}^{\lambda}(m_{2}\mapsto o_{2})\cdot\left(\cdots\left(\sum_{o_{p}}h_{m_{p}}^{\lambda}(m_{p}\mapsto o_{p})\right)\right)\right)
=\displaystyle= o1hm1λ(m1o1)(o2hm2λ(m2o2)((1)))\displaystyle\displaystyle\sum_{o_{1}}h_{m_{1}}^{\lambda}(m_{1}\mapsto o_{1})\cdot\left(\sum_{o_{2}}h_{m_{2}}^{\lambda}(m_{2}\mapsto o_{2})\cdot\left(\cdots\left(1\right)\right)\right)
=\displaystyle= \displaystyle\cdots
=\displaystyle= o1hm1λ(m1o1)1=1.\displaystyle\displaystyle\sum_{o_{1}}h_{m_{1}}^{\lambda}(m_{1}\mapsto o_{1})\cdot 1=1.

We can also show that for each context CC\in\mathcal{M}, hXλ|C=hCλ\left.h_{X}^{\lambda}\right|_{C}=h_{C}^{\lambda}. We choose an enumeration of XX such that C={m1,,mq}C=\{m_{1},\cdots,m_{q}\}, qpq\leq p. Then:

\[
\begin{aligned}
\left.h_{X}^{\lambda}\right|_{C}(s)
&= \sum_{s^{\prime}\in\mathcal{E}(X),\left.s^{\prime}\right|_{C}=s}h_{X}^{\lambda}(s^{\prime})\\
&= \sum_{s^{\prime}\in\mathcal{E}(X),\left.s^{\prime}\right|_{C}=s}\prod_{i=1}^{p}h_{m_{i}}^{\lambda}(m_{i}\mapsto o_{i})\\
&= \prod_{i=1}^{q}h_{m_{i}}^{\lambda}(m_{i}\mapsto o_{i})\cdot\left(\sum_{o_{q+1},\cdots,o_{p}}\prod_{j=q+1}^{p}h_{m_{j}}^{\lambda}(m_{j}\mapsto o_{j})\right)\\
&= h_{C}^{\lambda}(s)\cdot 1=h_{C}^{\lambda}(s),
\end{aligned}
\]

where $o_{i}=s^{\prime}(m_{i})$, and the last line uses the factorisability of $h_{C}^{\lambda}$.

Now we define a distribution d𝒟R(X)d\in\mathcal{D}_{R}\mathcal{E}(X) by averaging over the hidden variables:

d(s):=λΛhXλ(s)hΛ(λ).d(s):=\sum_{\lambda\in\Lambda}h_{X}^{\lambda}(s)\cdot h_{\Lambda}(\lambda).

Since each $h_{X}^{\lambda}$ is a distribution and $\sum_{\lambda\in\Lambda}h_{\Lambda}(\lambda)=1$, $d$ is normalised and hence a distribution. Moreover, $d$ restricts at each context $C$ to yield $e_{C}$:

d|C(s)=\displaystyle\left.d\right|_{C}(s)= s(X),s|C=sd(s)\displaystyle\displaystyle\sum_{s^{\prime}\in\mathcal{E}(X),\left.s^{\prime}\right|_{C}=s}d(s^{\prime})
=\displaystyle= s(X),s|C=sλΛhXλ(s)hΛ(λ)\displaystyle\displaystyle\sum_{s^{\prime}\in\mathcal{E}(X),\left.s^{\prime}\right|_{C}=s}\sum_{\lambda\in\Lambda}h_{X}^{\lambda}(s^{\prime})\cdot h_{\Lambda}(\lambda)
=\displaystyle= λΛhΛ(λ)hXλ|C(s)\displaystyle\displaystyle\sum_{\lambda\in\Lambda}h_{\Lambda}(\lambda)\cdot\left.h_{X}^{\lambda}\right|_{C}(s)
=\displaystyle= λΛhΛ(λ)hCλ(s)\displaystyle\displaystyle\sum_{\lambda\in\Lambda}h_{\Lambda}(\lambda)\cdot h_{C}^{\lambda}(s)
=\displaystyle= eC(s).\displaystyle e_{C}(s).

Thus dd is a global distribution for ee. ∎
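The construction in this proof can be checked numerically on a small case. The following is a minimal sketch, assuming three hypothetical measurements `a`, `b`, `c` with illustrative single-measurement distributions for one fixed value of $\lambda$; the function names are not from the paper. It builds the product distribution $h_{X}^{\lambda}$ and verifies that it is normalised and that it marginalises to the product of the local distributions.

```python
from fractions import Fraction
from itertools import product

def product_distribution(local):
    """local: dict measurement -> dict outcome -> probability (h_m^lambda).
    Returns a dict from global assignments, stored as tuples of
    (measurement, outcome) pairs, to their product probability."""
    measurements = sorted(local)
    dist = {}
    for outcomes in product(*(sorted(local[m]) for m in measurements)):
        p = Fraction(1)
        for m, o in zip(measurements, outcomes):
            p *= local[m][o]
        dist[tuple(zip(measurements, outcomes))] = p
    return dist

def marginal(dist, context):
    """Restrict a global distribution to a context by summing out the rest."""
    out = {}
    for s, p in dist.items():
        key = tuple((m, o) for m, o in s if m in context)
        out[key] = out.get(key, Fraction(0)) + p
    return out

# Illustrative (assumed) single-measurement distributions for one fixed lambda.
h = {
    "a": {0: Fraction(1, 3), 1: Fraction(2, 3)},
    "b": {0: Fraction(1, 2), 1: Fraction(1, 2)},
    "c": {0: Fraction(1, 4), 1: Fraction(3, 4)},
}
h_X = product_distribution(h)
total = sum(h_X.values())          # normalisation of h_X^lambda
h_ab = marginal(h_X, {"a", "b"})   # restriction to the context {a, b}
```

Exact rational arithmetic is used so that the normalisation check is an equality rather than a floating-point comparison.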

This result provides a definitive justification for equating the phenomena of non-locality and contextuality with obstructions to the existence of global distributions. We shall say the empirical model ee is noncontextual if there exists a global distribution for ee.

2.5 Existence of global distributions and the contextual fraction

Definition 2.15.

Given a measurement cover \mathcal{M}, let n:=|(X)|n:=\left|\mathcal{E}(X)\right| be the number of global assignments, and m:=C|(C)|=|{C,sC,s(C)}|m:=\sum_{C\in\mathcal{M}}\left|\mathcal{E}(C)\right|=\left|\left\{\left<C,s\right>\mid C\in\mathcal{M},s\in\mathcal{E}(C)\right\}\right| be the number of local assignments ranging over contexts. The incidence matrix 𝐌\mathbf{M} is an m×nm\times n Boolean matrix that records the restriction relation between global and local assignments:

𝐌[C,s,g]:={1ifg|C=s;0otherwise.\mathbf{M}\left[\left<C,s\right>,g\right]:=\begin{cases}1&\mathrm{if}\;\left.g\right|_{C}=s;\\ 0&\mathrm{otherwise.}\end{cases}

The incidence matrix conceptually represents the tuple of restriction maps

(X)C(C)::s(s|C)C.\mathcal{E}(X)\rightarrow\prod_{C\in\mathcal{M}}\mathcal{E}(C)::s\mapsto\left(\left.s\right|_{C}\right)_{C\in\mathcal{M}}.

At the same time, viewing it as a matrix over the semiring RR, it acts by matrix multiplication on distributions in 𝒟R(X)\mathcal{D}_{R}\mathcal{E}(X), represented as row vectors:

d(d|C)C.d\mapsto\left(\left.d\right|_{C}\right)_{C\in\mathcal{M}}.

Thus, the image of this map will be the set of families {eC}C\left\{e_{C}\right\}_{C\in\mathcal{M}} which arise from global distributions.
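As an illustration of Definition 2.15, the following sketch builds the incidence matrix for a small scenario. The scenario (two parties with two binary measurements each, labelled `a1`, `a2`, `b1`, `b2`) and the function name are assumptions chosen for the example, not taken from the text.

```python
from itertools import product

def incidence_matrix(X, cover, outcomes):
    """Return (rows, global_assignments, M) where M[i][j] = 1 iff the j-th
    global assignment restricts to the i-th local assignment <C, s>."""
    global_assignments = [dict(zip(X, vals))
                          for vals in product(outcomes, repeat=len(X))]
    rows = [(C, s) for C in cover for s in product(outcomes, repeat=len(C))]
    M = [[1 if all(g[m] == o for m, o in zip(C, s)) else 0
          for g in global_assignments]
         for (C, s) in rows]
    return rows, global_assignments, M

# Assumed example scenario: each context pairs one 'a' with one 'b' measurement.
X = ["a1", "a2", "b1", "b2"]
cover = [("a1", "b1"), ("a1", "b2"), ("a2", "b1"), ("a2", "b2")]
rows, global_assignments, M = incidence_matrix(X, cover, (0, 1))
```

In this example $m = n = 16$. Every column contains one entry equal to 1 per context, since a global assignment restricts to exactly one local assignment in each context.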

Proposition 2.16.

Let $e$ be an empirical model defined on a measurement cover $\mathcal{M}$, and let $\mathbf{M}$ be the incidence matrix of $\mathcal{M}$. Let $\mathbf{V}$ be the vector defined by $\mathbf{V}[i]=e_{C}(s_{i})$, which encodes the value of the empirical model on each local assignment. Then solutions to the following equation correspond bijectively to global distributions for $e$.

𝐌𝐗=𝐕where𝐗Rn,𝐗1=1.\mathbf{M}\mathbf{X}=\mathbf{V}\quad\mathrm{where}\>\mathbf{X}\in R^{n},\;\left\|\mathbf{X}\right\|_{1}=1. (1)

With this proposition, we can decide whether a global distribution exists for a given empirical model. However, the answer depends on the semiring $R$ over which empirical models are considered. So far we have implicitly worked with probability distributions, which are defined over the non-negative reals $\mathbb{R}_{\geq 0}$; here we address how the situation differs according to the semiring.

There are three main examples of semirings discussed in the field of contextuality: the Booleans

(𝔹,,),(\mathbb{B},\vee,\wedge),

the non-negative reals

(0,+,×),(\mathbb{R}_{\geq 0},+,\times),

and the reals

(,+,×).(\mathbb{R},+,\times).

We call a distribution $d\in\mathcal{D}_{\mathbb{B}}$ on the Boolean semiring a possibility distribution. In the case of the semiring $\mathbb{R}_{\geq 0}$, $\mathcal{D}_{\mathbb{R}_{\geq 0}}$ is the set of probability distributions. In the case of the reals $\mathbb{R}$, $\mathcal{D}_{\mathbb{R}}$ is the set of signed probability measures with finite support, allowing for "negative probabilities". It can be shown that we can always find global distributions for the given empirical model on the semiring $\mathbb{R}$, making $\mathcal{D}_{\mathbb{R}}$ a sheaf. It is also straightforward to show that a probabilistic global distribution can exist only when a possibilistic global distribution exists.

Now we will extend this discussion about the incidence matrix further to define the measure of contextuality. Here we consider the semiring 0\mathbb{R}_{\geq 0}. First, define a convex sum λe+(1λ)e\lambda e+(1-\lambda)e^{\prime} of two empirical models ee and ee^{\prime} on the same measurement cover, by taking the convex sum of probability distributions on each context. Compatibility is preserved in the convex sum; hence, it yields a well-defined empirical model.

Definition 2.17.

Given an empirical model ee, consider a convex decomposition

e=λeNC+(1λ)e,λ[0,1],e=\lambda e^{NC}+(1-\lambda)e^{\prime},\quad\lambda\in[0,1], (2)

where eNCe^{NC} denotes a noncontextual model and ee^{\prime} is another empirical model. The noncontextual fraction 𝖭𝖢𝖥(e)\mathsf{NCF}(e) is the maximum possible value of such λ\lambda. The contextual fraction is given by 𝖢𝖥(e):=1𝖭𝖢𝖥(e)\mathsf{CF}(e):=1-\mathsf{NCF}(e).

Then, we will consider the computation of this contextual fraction using the incidence matrix. Recall the equation (1), which is valid only for a noncontextual model. Since now we can no longer guarantee the solution of 𝐌𝐗=𝐕\mathbf{M}\mathbf{X}=\mathbf{V}, we will consider a relaxed condition:

𝐌𝐗𝐕,\mathbf{M}\mathbf{X}\leq\mathbf{V},

where the comparison is elementwise throughout this section. Let an empirical model $e$ and some convex decomposition $e=\lambda e^{NC}+(1-\lambda)e^{\prime}$ be given. Because both empirical models are defined on the same measurement cover, the incidence matrix is the same for both models. Let $\mathbf{V}^{NC}$ and $\mathbf{V}^{\prime}$ encode $e^{NC}$ and $e^{\prime}$, and suppose a vector $\mathbf{X}^{NC}\in\mathbb{R}_{\geq 0}^{n}$ satisfies $\mathbf{M}\mathbf{X}^{NC}=\mathbf{V}^{NC}$, $\left\|\mathbf{X}^{NC}\right\|_{1}=1$. Let $\mathbf{X}^{\prime}\in\mathbb{R}_{\geq 0}^{n}$ be any vector with $\mathbf{M}\mathbf{X}^{\prime}\leq\mathbf{V}^{\prime}$ (for instance $\mathbf{X}^{\prime}=\mathbf{0}$), and set $\mathbf{X}:=\lambda\mathbf{X}^{NC}+(1-\lambda)\mathbf{X}^{\prime}$. Then, since $\mathbf{V}=\lambda\mathbf{V}^{NC}+(1-\lambda)\mathbf{V}^{\prime}$,

\[
\lambda\mathbf{V}^{NC}=\mathbf{M}\lambda\mathbf{X}^{NC}\leq\mathbf{M}\left(\lambda\mathbf{X}^{NC}+(1-\lambda)\mathbf{X}^{\prime}\right)=\mathbf{M}\mathbf{X}\leq\mathbf{V},
\]

which implies the existence of a solution of $\mathbf{M}\mathbf{X}\leq\mathbf{V}$ with lower bound $\lambda\mathbf{V}^{NC}$. Here, the norm satisfies $\left\|\mathbf{X}\right\|_{1}=\left\|\lambda\mathbf{X}^{NC}+(1-\lambda)\mathbf{X}^{\prime}\right\|_{1}\geq\lambda$. Thus, an optimal solution $\mathbf{X}^{*}$ of the following linear program (LP) gives the noncontextual fraction by $\mathsf{NCF}(e)=\max\lambda=\left\|\mathbf{X}^{*}\right\|_{1}$:

\[
\begin{array}{ll}
\mathtt{Find} & \mathbf{X}\in\mathbb{R}^{n}\\
\mathtt{maximizing} & \left\|\mathbf{X}\right\|_{1}\\
\mathtt{subject\;to} & \mathbf{M}\mathbf{X}\leq\mathbf{V},\;\mathbf{X}\geq\mathbf{0}.
\end{array}
\]

Thus, we obtain a method for computing the contextual fraction of a given empirical model $e$. Ref. [16] shows that the contextual fraction is monotone under the free operations of a resource theory, so it serves as a useful measure.
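A short worked consequence of the constraint $\mathbf{M}\mathbf{X}\leq\mathbf{V}$ may help to see how the LP behaves. It is a sketch that assumes the entries of $\mathbf{X}$ are non-negative, as for the semiring $\mathbb{R}_{\geq 0}$ considered here; the index notation $X_{g}$ for the entry of $\mathbf{X}$ at the global assignment $g$ is introduced only for this illustration.

```latex
% Row <C, g|_C> of M X <= V, for a global assignment g and a context C:
\sum_{g' \in \mathcal{E}(X),\; g'|_{C} = g|_{C}} X_{g'} \;\leq\; e_{C}(g|_{C}).
% Since every X_{g'} >= 0, each single term is bounded by the same right-hand side:
X_{g} \;\leq\; \min_{C \in \mathcal{M}} e_{C}(g|_{C}).
% Hence, if every g has some context C with e_C(g|_C) = 0, then X = 0 is the
% only feasible point and the optimum of the LP is \|X^*\|_1 = 0.
```

In other words, a global assignment can carry weight in a feasible $\mathbf{X}$ only if all of its restrictions lie in the support of $e$.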

2.6 Possibilistic empirical models and strong contextuality

Given this measure of contextuality, it is natural to single out the empirical models whose contextual fraction equals 1, which we call maximally contextual. In fact, an empirical model has a contextual fraction of 1 if and only if it is strongly contextual. Here we define strong contextuality based on a possibilistic empirical model and show that maximal contextuality and strong contextuality are equivalent.

Definition 2.18 (Possibilistic empirical model).

A possibilistic empirical model 𝒮\mathcal{S} is a sub-presheaf of \mathcal{E} such that:

  • every compatible family for the measurement cover \mathcal{M} induces a global assignment;

  • 𝒮\mathcal{S} is flasque beneath the cover: If UUCU\subset U^{\prime}\subset C\in\mathcal{M} then every s𝒮(U)s\in\mathcal{S}(U) is the restriction of some s𝒮(U)s^{\prime}\in\mathcal{S}(U^{\prime}).

When an empirical model ee is given, a possibilistic empirical model 𝒮e\mathcal{S}_{e} of ee can be derived as follows:

𝒮e(U):={s(U)C,s|UCsupp(eC|UC)}.\mathcal{S}_{e}(U):=\left\{s\in\mathcal{E}(U)\mid\forall C\in\mathcal{M},\left.s\right|_{U\cap C}\in\mathrm{supp}\left(\left.e_{C}\right|_{U\cap C}\right)\right\}.
Remark.

Here we define the possibilistic empirical model 𝒮e\mathcal{S}_{e} of ee as a family of sets, rather than a family of possibilistic distributions 𝒮e,C:C𝔹\mathcal{S}_{e,C}:C\rightarrow\mathbb{B}. In fact, there is no problem with interpreting the possibilistic empirical model as a normal empirical model of a possibility distribution, but, we rather choose that the possibilistic empirical model returns a set of possible events, i.e. s𝒮e(U)s\in\mathcal{S}_{e}(U) if s|UCsupp(eC|UC)\left.s\right|_{U\cap C}\in\mathrm{supp}(\left.e_{C}\right|_{U\cap C}) to clarify the notation.

Definition 2.19.

Let 𝒮\mathcal{S} be a possibilistic empirical model for a measurement scenario X,,O\left<X,\mathcal{M},O\right>. We say that 𝒮\mathcal{S} is

  • logically contextual at s𝒮(C)s\in\mathcal{S}(C) if there is no global assignment g𝒮(X)g\in\mathcal{S}(X) such that g|C=s\left.g\right|_{C}=s;

  • logically contextual if 𝒮\mathcal{S} is logically contextual at some local assignment. Otherwise, it is non-contextual;

  • strongly contextual, written SC(𝒮)\mathrm{SC}(\mathcal{S}), if 𝒮\mathcal{S} has no global assignment. i.e. 𝒮(X)=\mathcal{S}(X)=\emptyset.

Remark.

Note that possibilistic contextuality and strong contextuality differ from each other. An empirical model is possibilistically contextual if there is no global possibility distribution $d\in\mathcal{D}_{\mathbb{B}}\mathcal{E}(X)$ for it, and is strongly contextual if there is no global assignment $g\in\mathcal{E}(X)$ consistent with its support, i.e. $\mathcal{S}_{e}(X)=\emptyset$.
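The notions in Definition 2.19 can be tested by brute force on small scenarios. The sketch below assumes binary outcomes and represents a possibilistic model by its supports on the contexts; the two example models (a PR-box-like support and a perfectly correlated one) and all function names are illustrative assumptions.

```python
from itertools import product

def global_sections(X, support):
    """All global assignments g whose restriction to every context C
    lies in support[C]; this plays the role of S(X)."""
    sections = []
    for vals in product((0, 1), repeat=len(X)):
        g = dict(zip(X, vals))
        if all(tuple(g[m] for m in C) in S_C for C, S_C in support.items()):
            sections.append(g)
    return sections

def logically_contextual_at(X, support, C, s):
    """True iff no global section restricts to the local assignment s on C."""
    return not any(tuple(g[m] for m in C) == s
                   for g in global_sections(X, support))

def strongly_contextual(X, support):
    """True iff there is no global section at all."""
    return not global_sections(X, support)

X = ["a1", "a2", "b1", "b2"]
equal = {(0, 0), (1, 1)}
differ = {(0, 1), (1, 0)}
# Assumed example: outcomes agree in three contexts and disagree in the fourth.
pr_box = {("a1", "b1"): equal, ("a1", "b2"): equal,
          ("a2", "b1"): equal, ("a2", "b2"): differ}
# Assumed example: outcomes agree in every context.
correlated = {C: equal for C in pr_box}
```

For the first support, the three agreement constraints force `a2` and `b2` to agree, which the fourth context forbids, so no global section exists.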

Proposition 2.20.

An empirical model ee is strongly contextual if and only if it is maximally contextual.

Proof.

Suppose that ee admits a convex decomposition (2) as follows:

e=λeNC+(1λ)e,λ[0,1].e=\lambda e^{NC}+(1-\lambda)e^{\prime},\quad\lambda\in[0,1].

By Theorem 2.14, we can take the non-contextual empirical model $e^{NC}$ to be a convex sum of deterministic models $\sum_{i}\mu_{i}\delta_{s_{i}}$, where each $s_{i}\in\mathcal{E}(X)$ is a global assignment. If $\lambda>0$, then from (2), $s_{i}\in\mathcal{S}_{e}(X)$ for each $i$ with $\mu_{i}>0$, so $e$ is not strongly contextual. Thus strong contextuality implies maximal contextuality.

For the converse, suppose that $s\in\mathcal{S}_{e}(X)$. Taking $e^{NC}=\delta_{s}$, we shall define $q$, playing the role of $e^{\prime}$, such that (2) holds. For each $C\in\mathcal{M}$ and $s^{\prime}\in\mathcal{E}(C)$:

\[
q_{C}(s^{\prime}):=\frac{e_{C}(s^{\prime})-\lambda\cdot\delta_{\left.s\right|_{C}}(s^{\prime})}{1-\lambda}.
\]

It is easily verified that, for each $C$, $\sum_{s^{\prime}\in\mathcal{E}(C)}q_{C}(s^{\prime})=1$. To ensure that $q$ is always non-negative, we must have $\lambda\leq\inf_{C\in\mathcal{M}}e_{C}(\left.s\right|_{C})$. Since this is the infimum of a finite set of positive numbers, we can find $\lambda>0$ satisfying this condition.

It remains to verify that qq is no-signalling, i.e. that {qC}\left\{q_{C}\right\} forms a compatible family. Given C,CC,C^{\prime}\in\mathcal{M}, fix s0(CC)s_{0}\in\mathcal{E}\left(C\cap C^{\prime}\right). Now

qC|CC(s0)=11λ[(s(C),s|CC=s0eC(s))λδs|CC(s0)].\left.q_{C}\right|_{C\cap C^{\prime}}(s_{0})=\frac{1}{1-\lambda}\left[\left(\sum_{s^{\prime}\in\mathcal{E}(C),\left.s^{\prime}\right|_{C\cap C^{\prime}}=s_{0}}e_{C}(s^{\prime})\right)-\lambda\cdot\left.\delta_{s}\right|_{C\cap C^{\prime}}(s_{0})\right].

A similar analysis applies to $\left.q_{C^{\prime}}\right|_{C\cap C^{\prime}}(s_{0})$. Using the compatibility of $e$, we conclude that $\left.q_{C}\right|_{C\cap C^{\prime}}=\left.q_{C^{\prime}}\right|_{C\cap C^{\prime}}$. Hence $e$ admits a decomposition (2) with $\lambda>0$ and is not maximally contextual. ∎

3 All-versus-nothing arguments and partial groups

When we look into a quantum system, we see that the measurement depends on both the quantum state and the observable. However, it has been observed that some sets of observables inhere the contextuality independently from the quantum state. This type of contextuality, earlier observed by Kochen and Specker [8], has been developed to define different types of contextuality [10, 21, 23, 24]. Here, we formulate them with a sheaf-theoretic structure, starting from what is formally studied as an all-versus-nothing (AvN) argument [14, 15, 18]. We extend this argument to state-independent AvN and claim that Kochen-Specker type contextuality is, in fact, state-independent AvN in a partial closure.

3.1 Mermin’s square

Refer to caption
Figure 2: Mermin’s square. The observables within each row and each column mutually commute, and the product of all three observables equals +I+I except for the last column being I-I.

Before we start, let us look at a simple example of state-independent AvN. Fig. 2 shows Mermin’s square [10], which consists of Pauli observables on two qubits that mutually commute within each row and each column. In each row and each column, the product of the first two operators equals the last one, except for the last column: $XX\cdot ZZ=-YY$. Rearranging, the product of every row and column equals $\pm I$: $+I$ for all three rows and the first two columns from the left, and $-I$ for the last column.

Here, consider a measurement scenario X,,2\left<X,\mathcal{M},\mathbb{Z}_{2}\right> where each row and each column translates to a measurement context CC\in\mathcal{M}. Then, try assigning a global assignment g:X2g:X\rightarrow\mathbb{Z}_{2} to this measurement scenario. The product equations are mapped to the following linear equations,

\[
\begin{split}
g(XI)\oplus g(IX)\oplus g(XX)&=0,\\
g(IZ)\oplus g(ZI)\oplus g(ZZ)&=0,\\
g(XZ)\oplus g(ZX)\oplus g(YY)&=0,\\
g(XI)\oplus g(IZ)\oplus g(XZ)&=0,\\
g(IX)\oplus g(ZI)\oplus g(ZX)&=0,\\
g(XX)\oplus g(ZZ)\oplus g(YY)&=1,
\end{split}
\qquad (4)
\]

where the outcome $g(x)\in\mathbb{Z}_{2}$ corresponds to the eigenvalue of $x$ via $\lambda(x)=(-1)^{g(x)}$. This system of linear equations is not satisfiable by any global assignment $g$. To see this, add all six equations: each observable appears exactly twice on the left-hand side, so the left-hand side sums to 0, while the right-hand side sums to 1. Therefore, no quantum state can realize a non-contextual empirical model with these observables, i.e., the inconsistency is inherent in this set of observables.
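The parity argument can be confirmed by exhaustive search over all $2^{9}$ global assignments. The following sketch encodes the six equations of (4) directly; the data layout and names are choices made for this illustration.

```python
from itertools import product

OBS = ["XI", "IX", "XX", "IZ", "ZI", "ZZ", "XZ", "ZX", "YY"]

# Each entry is (context, right-hand side) for one equation of (4).
EQUATIONS = [
    (("XI", "IX", "XX"), 0),
    (("IZ", "ZI", "ZZ"), 0),
    (("XZ", "ZX", "YY"), 0),
    (("XI", "IZ", "XZ"), 0),
    (("IX", "ZI", "ZX"), 0),
    (("XX", "ZZ", "YY"), 1),
]

def satisfies(g, equations):
    """Check every parity equation for the global assignment g."""
    return all(sum(g[x] for x in ctx) % 2 == rhs for ctx, rhs in equations)

def solutions(equations):
    """All global assignments g: OBS -> Z_2 satisfying the equations."""
    return [vals for vals in product((0, 1), repeat=len(OBS))
            if satisfies(dict(zip(OBS, vals)), equations)]

mermin_solutions = solutions(EQUATIONS)

# Control case: with the last right-hand side set to 0 the system is solvable.
control = solutions(EQUATIONS[:-1] + [(("XX", "ZZ", "YY"), 0)])
```

The control case shows that the obstruction comes solely from the sign of the last column.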

Table 4: Empirical model on Mermin’s square realized by Bell state $\left|\phi^{+}\right>$.
C               000   001   010   011   100   101   110   111
{XI, IX, XX}    1/2   0     0     0     0     0     1/2   0
{IZ, ZI, ZZ}    1/2   0     0     0     0     0     1/2   0
{XZ, ZX, YY}    0     0     0     1/2   0     1/2   0     0
{XI, IZ, XZ}    1/4   0     0     1/4   0     1/4   1/4   0
{IX, ZI, ZX}    1/4   0     0     1/4   0     1/4   1/4   0
{XX, ZZ, YY}    0     1     0     0     0     0     0     0

For example, consider this measurement scenario realized by $\left|\phi^{+}\right>=\left(\left|00\right>+\left|11\right>\right)/\sqrt{2}$. Table 4 illustrates the empirical model on Mermin’s square realized by $\left|\phi^{+}\right>$. This empirical model is strongly contextual, since no global assignment is compatible with the supports of all six contexts. In fact, we can prove that every empirical model with this measurement scenario is strongly contextual.

3.2 Consistency of an R-linear theory

The key concept first to notice is a ring structure given to the set of outcomes OO, which justifies writing a measurement scenario in the form X,,R\left<X,\mathcal{M},R\right>, where RR is the ring of outcomes.

Definition 3.1.

Let X,,R\left<X,\mathcal{M},R\right> be a measurement scenario. An RR-linear equation is a triple ϕ=C,r,a\phi=\left<C,r,a\right> where CC\in\mathcal{M} is a context, r:CRr:C\rightarrow R assigns a coefficient in RR to each xCx\in C, and aRa\in R is a constant. A local assignment s(C)s\in\mathcal{E}(C) satisfies ϕ\phi, written sϕs\vDash\phi, if

xCr(x)s(x)=a.\sum_{x\in C}r(x)s(x)=a.

An RR-linear theory 𝕋R\mathbb{T}_{R} is a set of RR-linear equations. A global assignment g(X)g\in\mathcal{E}(X) satisfies 𝕋R\mathbb{T}_{R}, written g𝕋Rg\vDash\mathbb{T}_{R}, if ϕ=C,r,a𝕋R,g|Cϕ\forall\phi=\left<C,r,a\right>\in\mathbb{T}_{R},\left.g\right|_{C}\vDash\phi. 𝕋R\mathbb{T}_{R} is consistent if there exists a global assignment gg that satisfies 𝕋R\mathbb{T}_{R}.

Remark.

Note that the consistency of an RR-linear theory is defined on an event sheaf \mathcal{E}, not on an empirical model.

For example, consider Mermin’s square. The linear equation s(XI)s(IX)s(XX)=0s(XI)\oplus s(IX)\oplus s(XX)=0 for a local assignment s(C)s\in\mathcal{E}(C) on the context C={XI,IX,XX}C=\{XI,IX,XX\}\in\mathcal{M} translates to a triple {XI,IX,XX},1,0\left<\{XI,IX,XX\},1,0\right> where 1 denotes a constant function that maps every measurement xx to 1. The explicit statement of the whole linear theory 𝕋2\mathbb{T}_{\mathbb{Z}_{2}} of Mermin’s square is given as follows:

\[
\begin{aligned}
\mathbb{T}_{\mathbb{Z}_{2}}=\{&\left<\{XI,IX,XX\},1,0\right>,\left<\{IZ,ZI,ZZ\},1,0\right>,\left<\{XZ,ZX,YY\},1,0\right>,\\
&\left<\{XI,IZ,XZ\},1,0\right>,\left<\{IX,ZI,ZX\},1,0\right>,\left<\{XX,ZZ,YY\},1,1\right>\}.
\end{aligned}
\]

Suppose a global assignment $g\in\mathcal{E}(X)$ satisfies $\left.g\right|_{C}\vDash\phi$ for every $\phi=\left<C,r,a\right>\in\mathbb{T}_{\mathbb{Z}_{2}}$. This yields exactly the equations (4), which is a contradiction. Therefore, the linear theory of Mermin’s square is inconsistent.
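For larger theories, exhaustive search over global assignments becomes impractical, and consistency of a $\mathbb{Z}_{2}$-linear theory can instead be decided by Gaussian elimination over $\mathbb{Z}_{2}$. The sketch below is one possible implementation, not a method prescribed by the text: each equation is stored as a bitmask, with the constant $a$ kept in an extra high bit, and an equation is given by the set of measurements with coefficient $r(x)=1$.

```python
def consistent_gf2(variables, equations):
    """Decide whether a system of parity equations has a solution.
    equations: list of (measurements with coefficient 1, constant a)."""
    index = {v: i for i, v in enumerate(variables)}
    n = len(variables)
    basis = {}  # pivot bit -> reduced row with that leading bit
    for members, a in equations:
        cur = a << n  # constant term lives in bit n
        for v in members:
            cur ^= 1 << index[v]
        for bit in range(n - 1, -1, -1):
            if not (cur >> bit) & 1:
                continue
            if bit in basis:
                cur ^= basis[bit]   # eliminate this leading variable
            else:
                basis[bit] = cur    # new pivot, row is independent
                break
        else:
            # All variables were eliminated: the row reads 0 = constant.
            if (cur >> n) & 1:
                return False
    return True

OBS = ["XI", "IX", "XX", "IZ", "ZI", "ZZ", "XZ", "ZX", "YY"]
MERMIN = [
    (("XI", "IX", "XX"), 0), (("IZ", "ZI", "ZZ"), 0), (("XZ", "ZX", "YY"), 0),
    (("XI", "IZ", "XZ"), 0), (("IX", "ZI", "ZX"), 0), (("XX", "ZZ", "YY"), 1),
]
mermin_consistent = consistent_gf2(OBS, MERMIN)
```

For Mermin’s square, the sixth equation reduces to $0=1$ after elimination by the other five, which is the algebraic counterpart of adding all equations together.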

3.3 AvN arguments

An AvN argument [14, 15, 18] is characterized by an RR-linear theory of a possibilistic empirical model. Here, we consider an RR-linear theory 𝕋R\mathbb{T}_{R} derived from a set of assignments S(U)S\subset\mathcal{E}(U), as follows:

𝕋R(S):={ϕsS,sϕ}.\mathbb{T}_{R}(S):=\left\{\phi\mid\forall s\in S,s\vDash\phi\right\}.

Once we have a possibilistic empirical model 𝒮\mathcal{S}, 𝒮(C)\mathcal{S}(C) gives a set of possible assignments. Thus, we can connect a possibilistic empirical model 𝒮\mathcal{S} to an RR-linear theory.

Definition 3.2.

Given a possibilistic empirical model 𝒮\mathcal{S}, an RR-linear theory 𝕋R(𝒮)\mathbb{T}_{R}(\mathcal{S}) is defined by:

𝕋R(𝒮):=C𝕋R(𝒮(C))={ϕ=C,r,as𝒮(C),sϕ}.\mathbb{T}_{R}(\mathcal{S}):=\bigcup_{C\in\mathcal{M}}\mathbb{T}_{R}\left(\mathcal{S}(C)\right)=\left\{\phi=\left<C,r,a\right>\mid\forall s\in\mathcal{S}(C),s\vDash\phi\right\}.

We say that 𝒮\mathcal{S} is AvN\mathrm{AvN} if its RR-linear theory 𝕋R(𝒮)\mathbb{T}_{R}(\mathcal{S}) is inconsistent. i.e. there is no global assignment g(X)g\in\mathcal{E}(X) such that g𝕋R(𝒮)g\vDash\mathbb{T}_{R}(\mathcal{S}).

Table 5: Possibilistic empirical model on Mermin’s square realized by Bell state $\left|\phi^{+}\right>$.
C               000   001   010   011   100   101   110   111
{XI, IX, XX}    1     0     0     0     0     0     1     0
{IZ, ZI, ZZ}    1     0     0     0     0     0     1     0
{XZ, ZX, YY}    0     0     0     1     0     1     0     0
{XI, IZ, XZ}    1     0     0     1     0     1     1     0
{IX, ZI, ZX}    1     0     0     1     0     1     1     0
{XX, ZZ, YY}    0     1     0     0     0     0     0     0

For example, go back to the empirical model on Mermin’s square realized by Bell state |ϕ+\left|\phi^{+}\right>. The possibilistic empirical model obtained from the probabilistic empirical model is characterized in Table 5. The theory 𝕋R(𝒮)\mathbb{T}_{R}(\mathcal{S}) contains the following linear equations.

\[
\begin{aligned}
s(XI)\oplus s(IX)=0,~s(XX)=0 &\quad \mathrm{on}~\{XI,IX,XX\}\\
s(IZ)\oplus s(ZI)=0,~s(ZZ)=0 &\quad \mathrm{on}~\{IZ,ZI,ZZ\}\\
s(XZ)\oplus s(ZX)=1,~s(YY)=1 &\quad \mathrm{on}~\{XZ,ZX,YY\}\\
s(XI)\oplus s(IZ)\oplus s(XZ)=0 &\quad \mathrm{on}~\{XI,IZ,XZ\}\\
s(IX)\oplus s(ZI)\oplus s(ZX)=0 &\quad \mathrm{on}~\{IX,ZI,ZX\}\\
s(XX)=0,~s(ZZ)=0,~s(YY)=1 &\quad \mathrm{on}~\{XX,ZZ,YY\}
\end{aligned}
\]

Each local assignment ss is defined on the specific context in each line. Note that this theory contains more linear equations compared to the previous example. Again, once we suppose a global assignment gg, we find an inconsistency, so 𝒮\mathcal{S} is AvN\mathrm{AvN}.
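The theory $\mathbb{T}_{\mathbb{Z}_{2}}(\mathcal{S})$ can be generated mechanically by enumerating every coefficient vector $r$ and constant $a$ and keeping those satisfied by all sections in the support. In the sketch below, the support sets are an assumption: they are the local assignments consistent with the equations listed above, with outcome bits ordered as the observables appear in each context. The function names are illustrative.

```python
from itertools import product

OBS = ["XI", "IX", "XX", "IZ", "ZI", "ZZ", "XZ", "ZX", "YY"]
EVEN = {(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)}

# Assumed supports S(C) for the Bell-state model on Mermin's square.
SUPPORT = {
    ("XI", "IX", "XX"): {(0, 0, 0), (1, 1, 0)},
    ("IZ", "ZI", "ZZ"): {(0, 0, 0), (1, 1, 0)},
    ("XZ", "ZX", "YY"): {(0, 1, 1), (1, 0, 1)},
    ("XI", "IZ", "XZ"): EVEN,
    ("IX", "ZI", "ZX"): EVEN,
    ("XX", "ZZ", "YY"): {(0, 0, 1)},
}

def linear_theory(context, sections):
    """All pairs (r, a) such that every section s satisfies sum r(x)s(x) = a."""
    theory = []
    for r in product((0, 1), repeat=len(context)):
        for a in (0, 1):
            if all(sum(ri * si for ri, si in zip(r, s)) % 2 == a
                   for s in sections):
                theory.append((r, a))
    return theory

def is_avn(theory, observables):
    """True iff no global assignment satisfies every equation of the theory."""
    for vals in product((0, 1), repeat=len(observables)):
        g = dict(zip(observables, vals))
        if all(sum(ri * g[x] for ri, x in zip(r, C)) % 2 == a
               for C, eqs in theory.items() for r, a in eqs):
            return False
    return True

theory = {C: linear_theory(C, S) for C, S in SUPPORT.items()}
avn = is_avn(theory, OBS)
```

The enumeration also returns the trivial equation with $r=0$, $a=0$ for every context. The smaller the support of a context, the more equations it contributes, which is why this theory is larger than the one in the previous example.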

Proposition 3.3.

If 𝒮\mathcal{S} is AvN\mathrm{AvN} then 𝒮\mathcal{S} is strongly contextual.

Proof.

Suppose 𝒮\mathcal{S} is not strongly contextual, i.e. that there is some g𝒮(X)g\in\mathcal{S}(X). Then for each ϕ=C,r,a𝕋R(𝒮)\phi=\left<C,r,a\right>\in\mathbb{T}_{R}(\mathcal{S}), g|C𝒮(C)\left.g\right|_{C}\in\mathcal{S}(C), hence g|Cϕ\left.g\right|_{C}\vDash\phi by the definition of 𝕋R(𝒮)\mathbb{T}_{R}(\mathcal{S}). Thus, 𝕋R(𝒮)\mathbb{T}_{R}(\mathcal{S}) is consistent. ∎

3.4 State-independent AvN

While we defined the AvN argument on a possibilistic empirical model, Mermin’s square seems to lie in a stronger class of contextuality: the linear theory of Mermin’s square derived from a possibilistic empirical model is larger than the one obtained without any empirical model. A measurement $x$ in a quantum system is associated with an observable $O$ and a state $\psi$ (or, more generally, a density matrix $\rho$), so we may write $x=O_{\psi}$. From here on, we focus on the algebra given by the set of observables, and each measurement $x$ is specified by an observable $O$. In the remainder of Section 3, $X=\{x_{i}\}$ denotes the set of observables, which a state $\psi$ realizes as a set of measurements $X_{\psi}=\{x_{i,\psi}\}$, and $\mathcal{M}$ denotes a measurement cover of observables, which $\psi$ realizes as a measurement cover $\mathcal{M}_{\psi}$. We sometimes write “measurement cover” for the measurement cover of observables, with the understanding that it becomes an actual measurement cover only once it is realized by a state $\psi$.

Turning to the set of observables $X$, we first restrict our attention to subsets of the Pauli $n$-group $G_{n}$. This restriction is a precaution before dealing with general quantum observables, i.e., arbitrary self-adjoint operators on a complex Hilbert space, where Tsirelson’s problem [29] may arise. Specifically, the operator algebra generated by a set of nonlocal observables may not be approximable by a set of local Pauli operators, even if the dimension of the Hilbert space is finite [30, 31]. Thus, we restrict our attention to Pauli groups $G_{n}$, where each basis operator is a tensor product of local Pauli operators.

Now in GnG_{n}, we have a multiplicative group operation, which is generally not commutative. However, an event sheaf \mathcal{E} still can be assigned on this set of Pauli observables. For any subset UXGnU\subset X\subset G_{n}, we consider an assignment s:U2s:U\rightarrow\mathbb{Z}_{2} that maps to the measurement of each observable xUx\in U as λ(x)=(1)s(x)\lambda(x)=\left(-1\right)^{s(x)} for its eigenvalue λ(x)=±1\lambda(x)=\pm 1.

Once we have a multiplicative relation in XX, we observe that it translates to a linear equation according to the following rules:

  • when ixi=I\prod_{i}x_{i}=I holds for mutually commuting xiXx_{i}\in X, the relation corresponds to a linear equation is(xi)=0\sum_{i}s(x_{i})=0;

  • when ixi=I\prod_{i}x_{i}=-I holds for mutually commuting xiXx_{i}\in X, the relation corresponds to a linear equation is(xi)=1\sum_{i}s(x_{i})=1.

For example, the relation $x\cdot y=z$ is equivalent to $x\cdot y\cdot z^{-1}=I$, so it maps to a linear equation. For the Hermitian elements of $G_{n}$ considered here, $x^{-1}=x$, so the relation further reduces to $x\cdot y\cdot z=I$. The second rule accounts for the dual element $-x$ of $x$, which satisfies $x\cdot(-x)=-I$. It is dual in the sense that $s(x)\oplus s(-x)=1$ always holds for a local assignment $s$.
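The two translation rules can be sketched in code by multiplying Pauli strings while tracking the phase. In this illustration a Pauli operator is a pair `(k, word)` standing for $i^{k}$ times the tensor product spelled by `word`; this representation and the function names are assumptions made for the example, and the input observables are assumed to commute mutually.

```python
CYCLIC = {("X", "Y"), ("Y", "Z"), ("Z", "X")}

def mul1(a, b):
    """Product of two single-qubit Paulis as (letter, power of i)."""
    if a == "I":
        return b, 0
    if b == "I":
        return a, 0
    if a == b:
        return "I", 0
    c = ({"X", "Y", "Z"} - {a, b}).pop()
    return c, (1 if (a, b) in CYCLIC else 3)  # e.g. XY = iZ, YX = -iZ

def mul(p, q):
    """Product of two Pauli strings given as (phase exponent, word)."""
    k = p[0] + q[0]
    out = []
    for a, b in zip(p[1], q[1]):
        c, dk = mul1(a, b)
        out.append(c)
        k += dk
    return (k % 4, "".join(out))

def relation_to_equation(words):
    """Map a product relation prod(words) = (-1)^a I to the pair (words, a)."""
    acc = (0, "I" * len(words[0]))
    for w in words:
        acc = mul(acc, (0, w))
    k, word = acc
    if set(word) != {"I"} or k not in (0, 2):
        raise ValueError("product is not +I or -I")
    return tuple(words), k // 2

last_column = relation_to_equation(["XX", "ZZ", "YY"])
third_row = relation_to_equation(["XZ", "ZX", "YY"])
```

Applied to Mermin’s square, the last column yields the constant $a=1$ and the third row yields $a=0$, in agreement with (4).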

Note that the condition of mutual commutation is required to map multiplication in XX to addition in 2\mathbb{Z}_{2}, which is commutative. However, there is an intriguing object we already have that encodes the commutability, namely, the measurement cover \mathcal{M}.

From the definition of a measurement cover, $\mathcal{M}$ is a family of maximal commuting subsets of $X$: for any $C\in\mathcal{M}$ and any $x,y\in C$, $x\cdot y=y\cdot x$, and if $C\subset C^{\prime}\subset X$ for some $C^{\prime}$ whose elements commute with each other, then $C=C^{\prime}$. Since commutative multiplication only occurs in a commuting subset, it only occurs in a measurement context. This means that we can define a linear theory of a set of observables $X$, as presented in the following definition.
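The measurement cover of a small set of Pauli observables can be computed directly from this description. The sketch below assumes unsigned Pauli strings and uses the standard fact that two Pauli strings commute exactly when they differ, with both letters non-identity, on an even number of positions; the brute-force subset search and the function names are illustrative choices suited only to small sets.

```python
from itertools import combinations

def commute(p, q):
    """Two Pauli strings commute iff they anticommute on an even number of qubits."""
    anti = sum(1 for a, b in zip(p, q) if a != "I" and b != "I" and a != b)
    return anti % 2 == 0

def measurement_cover(X):
    """All maximal mutually commuting subsets of X, found by brute force."""
    cover = []
    for size in range(len(X), 0, -1):        # largest subsets first
        for sub in combinations(X, size):
            if all(commute(p, q) for p, q in combinations(sub, 2)):
                s = frozenset(sub)
                if not any(s < c for c in cover):  # skip non-maximal subsets
                    cover.append(s)
    return cover

OBS = ["XI", "IX", "XX", "IZ", "ZI", "ZZ", "XZ", "ZX", "YY"]
cover = measurement_cover(OBS)
```

For the nine observables of Mermin’s square, this returns exactly the three rows and three columns.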

Definition 3.4 (State-independent AvN).

Given a set of observables XGnX\subset G_{n}, a linear theory 𝕋2(X)\mathbb{T}_{\mathbb{Z}_{2}}(X) is defined by:

𝕋2(X):={ϕ=C,r,aC,xCxr(x)=(1)aI}.\mathbb{T}_{\mathbb{Z}_{2}}(X):=\left\{\phi=\left<C,r,a\right>\mid C\in\mathcal{M},\prod_{x\in C}x^{r(x)}=(-1)^{a}\cdot I\right\}.

We say that XX is state-independently AvN if its linear theory 𝕋2(X)\mathbb{T}_{\mathbb{Z}_{2}}(X) is inconsistent. i.e. there is no g(X)g\in\mathcal{E}(X) such that g𝕋2(X)g\vDash\mathbb{T}_{\mathbb{Z}_{2}}(X).

Proposition 3.5.

If XX is state-independently AvN\mathrm{AvN} then any possibilistic model 𝒮\mathcal{S} on Xψ,ψ,2\left<X_{\psi},\mathcal{M}_{\psi},\mathbb{Z}_{2}\right> is AvN\mathrm{AvN}.

Proof.

What we want to show is that any possibilistic empirical model 𝒮\mathcal{S} realized by any state satisfies 𝕋2(X)\mathbb{T}_{\mathbb{Z}_{2}}(X). Here we deal with a density matrix ρ\rho for generality. For each linear equation ϕ=C,r,a𝕋2(X)\phi=\left<C,r,a\right>\in\mathbb{T}_{\mathbb{Z}_{2}}(X), what we want to have is:

xCr(x)s(xρ)=a,\sum_{x\in C}r(x)s(x_{\rho})=a,

for s𝒮(C)\forall s\in\mathcal{S}(C) realized by an arbitrary density matrix ρ\rho. Here we show it by contradiction.

Suppose not. Then it means s𝒮(C)\exists s\in\mathcal{S}(C) such that xCr(x)s(xρ)a\sum_{x\in C}r(x)s(x_{\rho})\neq a. This equation with the assignment ss maps to an equation with an assignment of eigenvalues λ(x)=(1)s(xρ)\lambda(x)=(-1)^{s(x_{\rho})} as follows:

xCλ(x)r(x)(1)a.\prod_{x\in C}\lambda(x)^{r(x)}\neq(-1)^{a}.

When $s\in\mathcal{S}(C)$, we have $\mathrm{Prob}(\lambda)\neq 0$, where $\mathrm{Prob}(\lambda)$ is the probability of obtaining the eigenvalues $\lambda(x)$ when measuring each $x\in C$ sequentially. Now consider the updated state:

ρ=(xCPλ(x))ρ(xCPλ(x)),\rho^{\prime}=\left(\prod_{x\in C}P_{\lambda(x)}\right)\rho\left(\prod_{x\in C}P_{\lambda(x)}\right),

where $P_{\lambda(x)}$ is the projector onto the eigenspace of $x$ with eigenvalue $\lambda(x)$. All $x\in C$ mutually commute, and so do their projectors $P_{\lambda(x)}$, so the expression is well defined. From the definition of $\mathbb{T}_{\mathbb{Z}_{2}}(X)$, we have $\prod_{x\in C}x^{r(x)}=(-1)^{a}\cdot I$. Evaluating this operator on the updated state $\rho^{\prime}$, normalised so that $\mathrm{Tr}(\rho^{\prime})=1$, gives:

Tr((xCxr(x))ρ)=(1)aTr(ρ)=(1)a.\mathrm{Tr}\left(\left(\prod_{x\in C}x^{r(x)}\right)\rho^{\prime}\right)=(-1)^{a}\cdot\mathrm{Tr}\left(\rho^{\prime}\right)=(-1)^{a}.

Here we derive a contradiction:

lhs\displaystyle\mathrm{lhs} =\displaystyle= Tr((xCxr(x))(xCPλ(x))ρ(xCPλ(x)))\displaystyle\mathrm{Tr}\left(\left(\prod_{x\in C}x^{r(x)}\right)\left(\prod_{x\in C}P_{\lambda(x)}\right)\rho\left(\prod_{x\in C}P_{\lambda(x)}\right)\right)
=\displaystyle= Tr((xCxr(x)Pλ(x))ρ(xCPλ(x)))\displaystyle\mathrm{Tr}\left(\left(\prod_{x\in C}x^{r(x)}P_{\lambda(x)}\right)\rho\left(\prod_{x\in C}P_{\lambda(x)}\right)\right)
=\displaystyle= Tr((xCλ(x)r(x)Pλ(x))ρ(xCPλ(x)))\displaystyle\mathrm{Tr}\left(\left(\prod_{x\in C}\lambda(x)^{r(x)}\cdot P_{\lambda(x)}\right)\rho\left(\prod_{x\in C}P_{\lambda(x)}\right)\right)
=\displaystyle= Tr((xCλ(x)r(x))(xCPλ(x))ρ(xCPλ(x)))\displaystyle\mathrm{Tr}\left(\left(\prod_{x\in C}\lambda(x)^{r(x)}\right)\cdot\left(\prod_{x\in C}P_{\lambda(x)}\right)\rho\left(\prod_{x\in C}P_{\lambda(x)}\right)\right)
=\displaystyle= (xCλ(x)r(x))Tr(ρ)\displaystyle\left(\prod_{x\in C}\lambda(x)^{r(x)}\right)\mathrm{Tr}\left(\rho^{\prime}\right)
=\displaystyle= xCλ(x)r(x)(1)a=rhs.\displaystyle\prod_{x\in C}\lambda(x)^{r(x)}\neq(-1)^{a}=\mathrm{rhs}.

The contradiction can only be avoided if $\rho^{\prime}$ cannot be normalised, i.e. $\mathrm{Tr}(\rho^{\prime})=0$, which means $\mathrm{Prob}(\lambda)=0$ and hence $s\notin\mathcal{S}(C)$. Hence for all $s\in\mathcal{S}(C)$ we have $s\vDash\phi$, and therefore $\phi\in\mathbb{T}_{\mathbb{Z}_{2}}(\mathcal{S})$. Since $\phi\in\mathbb{T}_{\mathbb{Z}_{2}}(X)$ was arbitrary, $\mathbb{T}_{\mathbb{Z}_{2}}(X)\subset\mathbb{T}_{\mathbb{Z}_{2}}(\mathcal{S})$. Since $\mathbb{T}_{\mathbb{Z}_{2}}(X)$ is not satisfiable, $\mathbb{T}_{\mathbb{Z}_{2}}(\mathcal{S})$ is not satisfiable either. ∎

Now, we highlight that this definition catches the idea discussed by Kochen and Specker [8]. Here, XX can be translated to a partial algebra in Kochen and Specker’s argument where the commeasurability corresponds to the measurement cover \mathcal{M}. The rule of mapping the multiplicative equations of XX into a linear theory 𝕋2(X)\mathbb{T}_{\mathbb{Z}_{2}}(X) coincides with an embedding of XX into a Boolean algebra. The inconsistency of the linear theory implies the failure of embeddability, which was previously connected to the non-classical logic of quantum mechanics.

However, some ambiguity remains in matching these concepts to Kochen and Specker’s argument. In particular, the set $X\subset G_{n}$ is not necessarily closed under multiplication, which prevents a direct interpretation of $X$ as a partial algebra as in Kochen and Specker’s work. Considering that some useful arguments on quantum advantage, e.g., Refs. [23, 24], are based on Kochen-Specker type contextuality, we should characterize the relation between the set of observables and a partial algebra by defining the measurement cover $\mathcal{M}$ of $X$ more rigorously.

3.5 Partial group and Kochen-Specker type contextuality

First, let’s start with the concept of a partial group. While the idea of partial algebra was raised earlier by Kochen and Specker in 1975 [8], the definition of a partial group is discussed only after Assiry in 2018 [32]. Here we define an abelian partial group based on those two studies without exactly showing how this definition connects to the aforementioned mathematical objects.

Definition 3.6.

An abelian partial group $\left<\Gamma,\smallfrown,\cdot\right>$ consists of a set $\Gamma$, a binary relation $\smallfrown\,\subset\Gamma\times\Gamma$, and a binary operation $\cdot:\,\smallfrown\,\rightarrow\Gamma$, satisfying the following properties:

  • the relation $\smallfrown$ is reflexive and symmetric, i.e., for any $a,b\in\Gamma$, $a\smallfrown a$ and $a\smallfrown b\Leftrightarrow b\smallfrown a$;

  • the relation $\smallfrown$ is closed under the operation $\cdot$, i.e., if $a\smallfrown b$, $b\smallfrown c$ and $c\smallfrown a$ then $a\cdot b\smallfrown c$;

  • there exists an identity $1\in\Gamma$ such that $1\smallfrown a$ for any $a\in\Gamma$ and $1\cdot a=a\cdot 1=a$;

  • if a subset $S$ of $\Gamma$ satisfies $a\smallfrown b$ for any $a,b\in S$, then $S$ generates an abelian group $\left<S\right>\subset\Gamma$.

Such a relation $\smallfrown$ is called the commutativity or commeasurability of $\Gamma$. The binary operation $\cdot:\,\smallfrown\,\rightarrow\Gamma$ is in fact a partial binary operation on $\Gamma$, written as $\cdot:\Gamma\times\Gamma\rightharpoonup\Gamma$. Note that for a commuting subset $S\subset\Gamma$, the partial binary operation $\cdot$ on $\Gamma$ becomes a commutative binary operation on the abelian group $\left<S\right>\subset\Gamma$. In what follows, we may refer to an abelian partial group $\Gamma$ without specifying its commutativity and partial binary operation.

For example, the set of observables in Mermin’s square (including the $\pm$ signs and the identity $II$) forms an abelian partial group where the commutativity corresponds to the usual commutativity relation of observables, and the partial binary operation is defined on a commuting set of observables. Note the following properties in this example:

  • $\Gamma=\pm\{II,XI,IX,XX,IZ,ZI,ZZ,XZ,ZX,YY\}$;

  • $XI\smallfrown IZ$, but $XI\not\smallfrown ZI$;

  • $S=\{XI,IZ\}$ is a commuting subset of $\Gamma$ and generates an abelian group $\left<S\right>=\{II,XI,IZ,XZ\}\subset\Gamma$.
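The commeasurability relation in this example can be checked mechanically. The following is a minimal sketch, assuming Pauli strings are written as plain text such as "XI" and that phases are ignored; the names BITS, commeasurable and generated_group are illustrative and do not come from the cited works.

```python
from itertools import product

# Each single-qubit Pauli letter as a pair of bits (x, z); phases are ignored here.
BITS = {"I": (0, 0), "X": (1, 0), "Z": (0, 1), "Y": (1, 1)}
LETTER = {bits: letter for letter, bits in BITS.items()}


def commeasurable(p, q):
    """True when Pauli strings p and q commute (symplectic form vanishes mod 2)."""
    total = 0
    for a, b in zip(p, q):
        (ax, az), (bx, bz) = BITS[a], BITS[b]
        total += ax * bz + az * bx
    return total % 2 == 0


def generated_group(generators):
    """Elements, up to phase, of the group generated by pairwise commuting strings."""
    assert all(commeasurable(p, q) for p in generators for q in generators)
    n = len(generators[0])
    elements = set()
    for picks in product([0, 1], repeat=len(generators)):
        bits = [(0, 0)] * n
        for pick, g in zip(picks, generators):
            if pick:
                bits = [(bx ^ BITS[c][0], bz ^ BITS[c][1])
                        for (bx, bz), c in zip(bits, g)]
        elements.add("".join(LETTER[b] for b in bits))
    return elements


print(commeasurable("XI", "IZ"), commeasurable("XI", "ZI"))
print(sorted(generated_group(["XI", "IZ"])))
```

Dropping the phase is enough for deciding commeasurability, but not for tracking the $\pm$ signs that appear in $\Gamma$.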

Remark.

A group $G$ can be interpreted as an abelian partial group where the commutativity corresponds to the usual commutativity relation of the elements of $G$, and the partial binary operation is derived from the binary operation of $G$. Thus, we shall say that a group $G$ is an abelian partial group without specifying its commutativity relation and partial binary operation explicitly.

This definition of an abelian partial group characterizes the algebraic structure of quantum observables where the multiplication of observables is restricted by commutativity so that we can bring the hidden variable arguments to each commuting subset. Furthermore, it can extend to a partial algebra by introducing addition and scalar multiplication, so that it can coincide with an operator algebra, as in Kochen and Specker’s original work.

Kochen and Specker found that the hidden variable model and the quantum model branch out when we try to glue the measurement outcomes of overlapping commuting subsets to obtain a global hidden variable model, i.e., there exists a quantum model of measurement that is inconsistent with the hidden variable model on the global domain, but still satisfies the classical arguments on each commuting domain. This concept justifies the statement: the quantum model of measurement depends on a “context.”

While this partial algebraic structure plays an important role in quantum theory, we note that a set of observables $X$ in a measurement scenario is not generally a closed abelian partial group. For example, an $XZ-(2,2,2)$ scenario has an observable set $X=\{XI,IX,ZI,IZ\}$ which is not closed under multiplication, e.g., $XI\cdot IX=XX\notin X$. This is exactly the difference between sheaf-theoretic contextuality and Kochen-Specker type contextuality: Kochen-Specker type contextuality is defined on an abelian partial group, while sheaf-theoretic contextuality does not require the measurement set to be an abelian partial group.

Having said that, we can try linking those two definitions by considering the generation of an abelian partial group by the given set of measurements, or in other words, a partial closure of the given set.

Definition 3.7 (Partial closure).

For a given subset $X$ of $\Gamma$, a partial closure $\overline{X}$ of $X$ on $\Gamma$ is the smallest abelian partial group in $\Gamma$ containing $X$.

For example, $X_{1}=\{XI,IZ\}$ is a subset of the Pauli group $G_{2}$ whose partial closure $\overline{X_{1}}=\{II,XI,IZ,XZ\}$ is an abelian partial group in $G_{2}$. Likewise, the partial closure of $X_{2}=\{XI,IX,ZI,IZ\}$ is $\overline{X_{2}}=\pm\{II,XI,IX,XX,IZ,ZI,ZZ,XZ,ZX,YY\}\subset G_{2}$.

Finally, we can characterize the measurement cover $\mathcal{M}$ of a set $X$ in the context of an abelian partial group of quantum observables. $\mathcal{M}$ is defined on an abelian partial group $G_{n}\supset X$, where each context $C\in\mathcal{M}$ is a maximal commuting subset of $X$. Similarly, we can derive the measurement cover $\overline{\mathcal{M}}$ of $\overline{X}$, where each context $C\in\overline{\mathcal{M}}$ is a maximal abelian group in $\overline{X}$. One may interpret $\mathcal{M}$ as the family of intersections of $X$ with the abelian groups $C\in\overline{\mathcal{M}}$, which also captures the idea that the measurement cover $\mathcal{M}$ originates in the commutativity relation of $G_{n}\supset X$.
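The two closures in this example can be reproduced by repeatedly multiplying commuting pairs until nothing new appears. The sketch below is one possible way to do this and is not taken from the cited works; it assumes a phased Pauli string is stored as a pair (k, s) standing for $i^{k}s$, and the helper names mul_letter, mul, partial_closure and show are hypothetical.

```python
def mul_letter(a, b):
    """Single-qubit product a*b, returned as (power of i, resulting letter)."""
    if a == "I":
        return 0, b
    if b == "I":
        return 0, a
    if a == b:
        return 0, "I"
    c = (set("XYZ") - {a, b}).pop()
    # X*Y = iZ, Y*Z = iX, Z*X = iY; the reversed order gives -i instead.
    cyclic = ("XYZ".index(a) + 1) % 3 == "XYZ".index(b)
    return (1 if cyclic else 3), c


def mul(p, q):
    """Product of two phased Pauli strings (k, s), each standing for i**k * s."""
    k, letters = p[0] + q[0], []
    for a, b in zip(p[1], q[1]):
        dk, c = mul_letter(a, b)
        k += dk
        letters.append(c)
    return k % 4, "".join(letters)


def partial_closure(generators):
    """Close a set of Pauli strings under products of commuting pairs."""
    n = len(generators[0])
    closure = {(0, "I" * n)} | {(0, g) for g in generators}
    while True:
        new = {mul(p, q) for p in closure for q in closure
               if mul(p, q) == mul(q, p)} - closure
        if not new:
            return closure
        closure |= new


def show(closure):
    """Render elements as signed strings such as '+XI' or '-YY'."""
    return sorted({0: "+", 2: "-"}[k] + s for k, s in closure)


print(show(partial_closure(["XI", "IZ"])))
print(len(partial_closure(["XI", "IX", "ZI", "IZ"])))
```

Only the phases $\pm 1$ can occur in the result, because a product of commuting Hermitian Pauli operators is again Hermitian; this is why show only handles k = 0 and k = 2.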

Now, it is straightforward to specify Kochen-Specker type contextuality in the language of an abelian partial group.

Definition 3.8 (Kochen-Specker type contextuality).

Given a set $X\subset G_{n}$, $X$ has Kochen-Specker type contextuality if there is no global assignment of eigenvalues $\lambda:\overline{X}\rightarrow\{\pm 1\}$ consistent with the partial commutative multiplication in $\overline{X}$.

Kochen-Specker type contextuality is also called contextuality in a closed sub-theory, since the argument deals with only a part of the quantum observables. It is also clear that Kochen-Specker type contextuality is nothing but state-independent AvN in a partial closure, once we define the latter as follows:

Definition 3.9.

Given a set of observables $X\subset G_{n}$, $X$ is state-independently $\mathrm{AvN}$ in a partial closure if $\mathbb{T}_{\mathbb{Z}_{2}}(\overline{X})$ is inconsistent.

Following the argument so far, we state the following corollary without giving proof.

Corollary 3.10.

A set of observables $X$ has Kochen-Specker type contextuality if and only if $X$ is state-independently $\mathrm{AvN}$ in a partial closure.
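As an illustration of an inconsistent linear theory, the six product relations of Mermin’s square can be written as parity equations over $\mathbb{Z}_{2}$ and tested by Gaussian elimination. This is a minimal sketch under the assumption that an outcome bit $a$ encodes the eigenvalue $(-1)^{a}$, so that a product equal to $+I$ gives parity 0 and a product equal to $-I$ gives parity 1; only these six equations, not the full theory $\mathbb{T}_{\mathbb{Z}_{2}}(\overline{X})$, are encoded, and the function name is hypothetical.

```python
VARIABLES = ["XI", "IX", "XX", "IZ", "ZI", "ZZ", "XZ", "ZX", "YY"]

# Each context of Mermin's square with the parity of its product (+I -> 0, -I -> 1).
MERMIN_SQUARE = [
    (["XI", "IX", "XX"], 0),
    (["IZ", "ZI", "ZZ"], 0),
    (["XZ", "ZX", "YY"], 0),
    (["XI", "IZ", "XZ"], 0),
    (["IX", "ZI", "ZX"], 0),
    (["XX", "ZZ", "YY"], 1),  # XX * ZZ * YY = -I
]


def consistent_over_z2(equations, variables):
    """Decide by Gaussian elimination whether the parity equations have a solution."""
    index = {v: i for i, v in enumerate(variables)}
    pivots = []  # reduced rows as (bitmask, parity); each pivot bit is the lowest set bit
    for names, parity in equations:
        mask = 0
        for v in names:
            mask ^= 1 << index[v]
        for pmask, pparity in pivots:
            if mask & (pmask & -pmask):  # row contains this pivot's leading variable
                mask ^= pmask
                parity ^= pparity
        if mask:
            pivots.append((mask, parity))
        elif parity:
            return False  # the row reduced to 0 = 1
    return True


print(consistent_over_z2(MERMIN_SQUARE, VARIABLES))
print(consistent_over_z2(MERMIN_SQUARE[:5], VARIABLES))
```

Every observable occurs in exactly two contexts, so adding all six equations gives $0=1$; removing any single context restores consistency, which the second call illustrates for the last one.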

Figure 3: Determining tree proposed by Kirby and Love [24]. Each parent operator and its children operators pairwise commute and the product of children operators equals the parent operator.

Kirby and Love [24] developed this concept into their own way of witnessing contextuality, and applied it to evaluate the classical simulatability of practical quantum algorithms. Fig. 3 illustrates an example of the formalism they developed.

Definition 3.11.

A determining tree $\tau_{x}$ for a Pauli observable $x$ over a set of Pauli observables $X$ is a tree whose nodes are Pauli operators and whose leaves are operators in $X$, such that:

  • the root is $x$;

  • all children of any particular parent pairwise commute as operators;

  • every parent node is the operator product of its children.

We say that $x$ is determined by $X$ if there exists a determining tree $\tau_{x}$ over $X$. For a determining tree $\tau_{x}$, the determining set $D(\tau_{x})$ is defined to be the set containing one copy of each operator with odd multiplicity as a leaf in $\tau_{x}$.

Note that $x\in\overline{X}$ if and only if there exists a determining tree $\tau_{x}$ over $X$. The determining tree depicts the inductive generation of $\overline{X}$. Whenever we have $\prod_{i}x_{i}=x$, we also have $\left(\prod_{i}x_{i}\right)\cdot x=I$, which maps to a linear equation in $\mathbb{T}_{\mathbb{Z}_{2}}(\overline{X})$. Thus, each determining tree $\tau_{x}$ gives a linear equation $\phi_{\tau_{x}}\in\mathbb{T}_{\mathbb{Z}_{2}}(\overline{X})$.

Now, once we have a determining tree $\tau_{x}$, it is natural to try inducing an assignment $g_{\tau_{x}}$ for $x\in\overline{X}$ from some global assignment $g\in\mathcal{E}(X)$, which assigns the value $g_{\tau_{x}}=\sum_{x^{\prime}\in D(\tau_{x})}g(x^{\prime})$ following the determining tree $\tau_{x}$ over $X$. However, the determining tree for $x$ is not unique, and in fact, $g_{\tau_{x}}(x)\neq g_{\tau_{x}^{\prime}}(x)$ in general. The following theorem characterizes this failure of assigning values to the determining tree.

Theorem 3.12 (KL contextuality).

A set $X$ of Pauli operators is state-independently $\mathrm{AvN}$ in a partial closure if and only if there exists a determining tree $\tau_{x}$ over $X$ and a determining tree $\tau_{-x}^{\prime}$ over $X$ such that $D(\tau_{x})=D(\tau_{-x}^{\prime})$.
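The condition in the theorem can be made concrete for $X=\{XI,IX,ZI,IZ\}$, where $YY$ and $-YY$ admit determining trees with the same determining set. The sketch below is illustrative and is not the construction of Ref. 24; it assumes a tree is encoded as a leaf string or a list of subtrees, stores a phased Pauli string as (k, s) standing for $i^{k}s$, and uses hypothetical helper names.

```python
from collections import Counter
from itertools import combinations


def mul(p, q):
    """Product of phased Pauli strings (k, s), each standing for i**k * s."""
    k, letters = p[0] + q[0], []
    for a, b in zip(p[1], q[1]):
        if a == "I" or b == "I" or a == b:
            c = "I" if a == b else (a if b == "I" else b)
        else:
            c = (set("XYZ") - {a, b}).pop()
            # X*Y = iZ, Y*Z = iX, Z*X = iY; the reversed order gives -i.
            k += 1 if ("XYZ".index(a) + 1) % 3 == "XYZ".index(b) else 3
        letters.append(c)
    return k % 4, "".join(letters)


def evaluate(tree):
    """Return (root operator, Counter of leaves) for a leaf string or list of subtrees."""
    if isinstance(tree, str):
        return (0, tree), Counter([tree])
    results = [evaluate(child) for child in tree]
    operators = [op for op, _ in results]
    for p, q in combinations(operators, 2):
        assert mul(p, q) == mul(q, p), "children must pairwise commute"
    root, leaves = operators[0], Counter()
    for op in operators[1:]:
        root = mul(root, op)
    for _, counts in results:
        leaves += counts
    return root, leaves


def determining_set(tree):
    """Leaves that occur with odd multiplicity."""
    return {leaf for leaf, m in evaluate(tree)[1].items() if m % 2 == 1}


tree_plus = [["XI", "IZ"], ["ZI", "IX"]]    # (XZ)(ZX) = +YY
tree_minus = [["XI", "IX"], ["ZI", "IZ"]]   # (XX)(ZZ) = -YY
print(evaluate(tree_plus)[0], evaluate(tree_minus)[0])
print(determining_set(tree_plus) == determining_set(tree_minus))
```

Here the roots come out as (0, "YY") and (2, "YY"), i.e. $+YY$ and $-YY$, while both determining sets equal $X$ itself.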

Figure 4: Commutability graphs that induce state-independent AvN in a partial closure.
Corollary 3.13.

A set $X\subset G_{n}$ is state-independently $\mathrm{AvN}$ in a partial closure if and only if it contains a subset consisting of four operators whose commutability graph has one of the forms given in Fig. 4 (up to permutations of the operators).

We refer to Ref. 24 for the proof of Corollary 3.13. The converse direction of Theorem 3.12 is easily confirmed. Suppose that there exists such a pair of determining trees, and suppose $g\in\mathcal{E}(X)$ induces a valid $g_{\mathcal{T}}\in\mathcal{E}(\overline{X})$ such that $\mathcal{T}_{x}=\tau_{x}$ and $\mathcal{T}_{-x}=\tau_{-x}^{\prime}$. Then $g_{\mathcal{T}}(x)=\sum_{x^{\prime}\in D(\tau_{x})}g(x^{\prime})=\sum_{x^{\prime}\in D(\tau_{-x}^{\prime})}g(x^{\prime})=g_{\mathcal{T}}(-x)$, since $D(\tau_{x})=D(\tau_{-x}^{\prime})$. However, since $\pm x\in\overline{X}$ and $\{\pm I,\pm x\}\in\overline{\mathcal{M}}$, we have the equation $x(-x)=-I$, which gives $g_{\mathcal{T}}(x)+g_{\mathcal{T}}(-x)=1$. Together with $g_{\mathcal{T}}(x)=g_{\mathcal{T}}(-x)$, this would force $0=1$ in $\mathbb{Z}_{2}$, which is a contradiction. Thus, no such assignment $g_{\mathcal{T}}$ is valid.

On the other hand, the forward direction of the theorem is less obvious, although we conjecture it to be true, as in Ref. 24. We leave its proof for future work, but nevertheless state the result as a theorem rather than a conjecture.

3.6 Classification of contextuality

Figure 5: Mermin’s star. The observables lying on each line mutually commute, and the product of all four observables equals $+I$, except for the horizontal line, where it equals $-I$.

Even if a given empirical model is merely contextual, or even non-contextual, it might be AvN contextual in a partial closure. For example, consider the $\mathrm{XZ}-(2,2,2)$ Bell scenario. This is the case that motivated this research: the Bell scenario with measurements $(XI,ZI,IX,IZ)$ is not contextual at all, but from Mermin’s square, it turns out to be state-independently AvN in a partial closure. At first sight, the argument of Ref. 17, that a 2-qubit scenario is unable to show strong non-locality, seems to disagree with this result. However, the statement remains correct, since it concerns non-locality, where only local observables are considered.

Table 6: Empirical model on the $XY-(3,2,2)$ scenario realized by the GHZ state.
$C$                    000  001  010  011  100  101  110  111
$\{XII,IXI,IIX\}$      1/4  0    0    1/4  0    1/4  1/4  0
$\{XII,IYI,IIY\}$      0    1/4  1/4  0    1/4  0    0    1/4
$\{YII,IXI,IIY\}$      0    1/4  1/4  0    1/4  0    0    1/4
$\{YII,IYI,IIX\}$      0    1/4  1/4  0    1/4  0    0    1/4
$\vdots$
Table 7: Empirical model on the $XY-(3,2,2)$ scenario realized by the equal superposition state.
$C$                    000  001  010  011  100  101  110  111
$\{XII,IXI,IIX\}$      1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
$\{XII,IYI,IIY\}$      1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
$\{YII,IXI,IIY\}$      1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
$\{YII,IYI,IIX\}$      1/8  1/8  1/8  1/8  1/8  1/8  1/8  1/8
$\vdots$

Another point we make is that the state-independent AvN class is strictly smaller than the ordinary AvN class. In other words, there exist state-dependent AvN models. For example, consider Mermin’s star illustrated in Fig. 5. Table 6 shows the empirical model on the $XY-(3,2,2)$ scenario realized by the GHZ state, which is AvN. However, in the same measurement setting, we can realize another empirical model, as in Table 7, with the equal superposition state $\left|+++\right>$. Here, every probability in the scenario is $1/8$, and the model turns out to be non-contextual.
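The AvN property of the GHZ model can be checked by exhaustive search over global assignments. The sketch below assumes the standard GHZ state $(\left|000\right>+\left|111\right>)/\sqrt{2}$, for which $XXX=+1$ and $XYY=YXY=YYX=-1$, and assumes that an outcome bit $a$ encodes the eigenvalue $(-1)^{a}$; the variable names such as X1 and Y2 (observable and qubit index) are hypothetical labels.

```python
from itertools import product

VARIABLES = ["X1", "Y1", "X2", "Y2", "X3", "Y3"]

# Parity of the outcome bits in each context (assumed GHZ convention, see lead-in).
GHZ_EQUATIONS = [
    (("X1", "X2", "X3"), 0),
    (("X1", "Y2", "Y3"), 1),
    (("Y1", "X2", "Y3"), 1),
    (("Y1", "Y2", "X3"), 1),
]


def global_assignments(equations):
    """List every assignment of bits to all observables satisfying the parity equations."""
    solutions = []
    for bits in product([0, 1], repeat=len(VARIABLES)):
        g = dict(zip(VARIABLES, bits))
        if all(sum(g[v] for v in names) % 2 == parity for names, parity in equations):
            solutions.append(g)
    return solutions


print(len(global_assignments(GHZ_EQUATIONS)))
print(len(global_assignments(GHZ_EQUATIONS[:3])))
```

No assignment satisfies all four equations, since each variable appears exactly twice on the left-hand sides while the parities sum to 1. The uniform model of Table 7 imposes no parity equations at all, so every assignment is admissible there.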

Figure 6: State-side view. With the Bell pair, Mermin’s square lies in $B_{1}$, the $\mathrm{CHSH}-(2,2,2)$ Bell scenario lies in $B_{4}$, and the $XZ-(2,2,2)$ scenario lies in $B_{5}$. With the GHZ state, Mermin’s star lies in $G_{1}$, the $XY-(3,2,2)$ scenario lies in $G_{2}$, and the $XZ-(3,2,2)$ scenario lies in $G_{5}$. The areas $B_{2},B_{3},G_{3},G_{4}$ appear to be unoccupied, although no explicit proof is given.
Figure 7: Operator-side view with Conjecture 3.14. Regarding 2-qubit scenarios, Mermin’s square is state-independently AvN, and the $\mathrm{CHSH}-(2,2,2)$ and $XZ-(2,2,2)$ scenarios are state-dependently contextual. A non-contextual example may be given by $\mathcal{M}=\{ZI,IZ\}$. However, in a partial closure, Mermin’s square, $\mathrm{CHSH}-(2,2,2)$, and $XZ-(2,2,2)$ are all identical scenarios.

As a result, we can summarize the relationship between the different definitions of contextuality as in Fig. 6. However, the picture looks quite different in the operator-side view, as we illustrate in Fig. 7. Although we know only a few cases of state-dependently contextual scenarios, they seem to merge into the state-independent AvN class when their partial closures are considered. In this view, we conjecture that state-dependent contextuality is in fact state-independent AvN in a partial closure. We present this idea in the following conjecture.

Conjecture 3.14.

Any measurement cover $\mathcal{M}$ that realizes a contextual empirical model for some state $\psi$ is state-independently $\mathrm{AvN}$ in a partial closure.

This conjecture suggests that the partial closure may provide a way to connect sheaf-theoretic contextuality to state-independent contextuality. We leave the proof of this conjecture for future work.

4 Conclusion

In this report, we reviewed the sheaf-theoretic framework of contextuality and proposed state-independent AvN arguments. This work provides a coherent mathematical structure to compare each class of contextuality, clarifying the hierarchy of state-independent AvN - AvN - strong contextuality - contextuality. Kochen-Specker type contextuality integrates into this framework by considering a partial closure of the given set of measurements.

This work also develops the idea that contextuality does not necessarily require measurements to be “local.” While the compatibility condition of events and distributions originates in the no-signaling principle, the condition is still valid when we include non-local observables in the set of measurements. However, non-local observables must be handled with caution because of Tsirelson’s problem [29, 30, 31], which is also discussed in Ref. 33. Here we restricted our attention to the Pauli $n$-group to avoid this problem.

Whilst we organized a consistent framework of contextuality, it cannot be affirmed that this framework is the most effective framework for formalizing contextuality in every case. A graph-theoretic approach based on Kochen-Specker type contextuality may be suitable for some proofs of quantum advantage, or a topological framework may be more effective in other cases. However, this work, together with Aasnæss’s thesis [18], implies a certain relationship between those approaches, thus enabling the translation of arguments in each framework from one to another.

Future studies may work on proving that state-dependent contextuality merges into state-independent AvN when its partial closure is considered. It also remains open whether there is a set of observables that is state-independently contextual but not state-independently AvN, or an empirical model that is strongly contextual but not AvN. The generalization of state-independent contextuality to arbitrary self-adjoint operators on a complex Hilbert space would be another intriguing problem. Such discussions will clarify the point where classical and quantum information systems diverge.

In summary, this work presents an extensive approach from a sheaf-theoretic framework to Kochen-Specker type contextuality and state-independent contextuality, providing a consistent mathematical language to compare notions of contextuality. This will serve as a key tool to evaluate quantum advantage, guiding how the contextuality arguments from different frameworks can be translated to each other.

References