Algorithmic Randomness, Effective Disintegrations, and Rates of Convergence to the Truth
Abstract.
Lévy’s Upward Theorem says that the conditional expectation of an integrable random variable converges with probability one to its true value with increasing information. In this paper, we use methods from effective probability theory to characterise the probability one set along which convergence to the truth occurs, and the rate at which the convergence occurs. We work within the setting of computable probability measures defined on computable Polish spaces and introduce a new general theory of effective disintegrations. We use this machinery to prove our main results, which (1) identify the points along which certain classes of effective random variables converge to the truth in terms of certain classes of algorithmically random points, and which further (2) identify when computable rates of convergence exist. Our convergence results significantly generalize earlier results within a unifying novel abstract framework, and there are no precursors of our results on computable rates of convergence. Finally, we make a case for the importance of our work for the foundations of Bayesian probability theory.
2010 Mathematics Subject Classification:
Primary 03D32 Secondary: 03A10, 03D78, 03F60, 60A10, 60B05, 60G481. Introduction
Measure-theoretic probability was developed in the early 20th Century in response to pressing problems in statistical physics, astronomy, and pure mathematics, and today it is used throughout the mathematical sciences.111For a historical survey, see [73]. What proved to be an especially significant conceptual progress was the ability to say that certain properties are true with probability one. Early examples include Borel’s strong law of large numbers, irrational rotations of the unit interval, Birkhoff’s ergodic theorem, and Poincaré’s recurrence theorem. It is often unclear, however, what these sets are. That is to say, measure-theoretic results only assert the existence of certain sets of probability one but fail to characterise the points that are elements of those sets. This was pointed out as early as 1916 by Weyl, who insisted that a deeper understanding of the sets involved in zero-one laws was necessary in order to interpret the results of measure-theoretic probability.222[74].
The theory of algorithmic randomness involves a fine-grained classification of different measure one sets, with the primary exemplars being the Martin-Löf random points, the Schnorr random points, and the Kurtz random points.333The original papers of Martin-Löf, Schnorr, and Kurtz are: [41], [42], [64], [65], [36]. There are now several comprehensive references on algorithmic randomness, including [39], [50], [15], [68]. Originally this was done for the uniform “fair coin” measure on Cantor space (the space of infinite sequences of 0’s and 1’s) and the famous results pertained to algorithmic incompressibility and the Turing degrees.444For instance, the Levin-Schnorr characterisation of Martin-Löf randomness in terms of initial segment complexity, and the Kučera-Gács proof that every Turing degree is below the degree of a Martin-Löf random. See, e.g., [15, Theorem 6.3.10 p. 239, Theorem 8.3.2 p. 326] for statement and references. However, the theory has been recently developed for a more general class of computable probability measures on computable spaces, by authors such as Gács, Hoyrup and Rojas, Reimann, Rute, and Miyabe.555[22], [29], [30], [56], [61], [46], [32] (listed in rough chronological order). A related recent trend has been showing that effectivized versions of classical theorems on almost sure convergence prove convergence exactly on various classes of algorithmically random points.666[7], [47]. The latter is, in part, a survey and contains many further references. This arguably contributes to the deeper understanding along the lines suggested by Weyl. Further, this recent trend suggests reconceiving of the various notions of algorithmic randomness less as on a par with rival conceptual analyses of a pre-theoretic phenomenon, à la the Church-Turing thesis, and more as delineations of extensionally and conceptually distinct kinds of probability one events.777This point is due to [54]. Or, if one puts the point in terms of the corresponding null sets, the various notions demarcate different types of impossibility that occur throughout measure-theoretic mathematics and its many applications.
Our main theorems (Theorems 1.5, 1.6, 1.8, 1.9, 1.11) contribute to this recent literature by characterising, in terms of algorithmic randomness, the points at which Lévy’s Upward Theorem holds for various classes of effective random variables, as well as providing information about the rates of convergence to the truth.
Let us recall the classical statement of Lévy’s Theorem.888[75, p. 134], [38, §41 pp. 128 ff]. Suppose is a probability triple. Let be an increasing sequence of sub--algebras of whose union generates . Then Lévy’s Upward Theorem states that one has both -a.s. and in , for any -measurable function in . In this, denotes the conditional expectation of relative to , which, recall, is defined as the -a.s. unique -measurable function such that for all events in the sub--algebra of .
The convergence in Lévy’s Upward Theorem is one of the cornerstones of Bayesian epistemology.999[18, pp. 144 ff], [28, pp. 28-29]. The random variable can be thought of as a quantity that a Bayesian agent, whose degrees of belief are captured by the underlying probability measure , is trying to estimate by repeatedly performing an experiment. The quantity can be seen as encoding the agent’s opinions regarding the value of after having observed the outcomes of the first experiments. Lévy’s Upward Theorem then implies that, with probability one, the Bayesian agent’s opinions regarding the value of will converge to ’s true value in the limit.
To be able to characterise the -measure one set on which , one needs to choose versions of and . It seems natural to focus attention on classes of effective random variables defined relative to spaces and probability measures which are themselves computable. For, computability seems like a natural constraint to place on our Bayesian agent, and many of the examples of probability measures and random variables that occur in practice and applications are computable. However, the Bayesian perspective recommends few other general constraints on what is eligible to be a credence or a prior. Hence it is important to develop the theory for a maximally broad class of computable spaces and probability measures.101010The distinctive status of the computability hypothesis which we are suggesting raises a host of interesting and complex conceptual questions, ranging from the nature of cognition to the character of inductive inference. We put these issues aside here.
1.1. Effective probability and algorithmic randomness
The computable Polish spaces with computable probability measures are such an appropriately general class of spaces and measures. In this brief section we collect together the few definitions we need about their theory. The reader already familiar with these concepts can easily skip to the next section (§1.2).
A Polish space is a topological space which is separable and completely metrizable. All the paradigmatic spaces such as the reals and their products and their closed and open subspaces are Polish, and similarly for Cantor space. Descriptive set theory takes as its subject matter the Borel and projective subsets of Polish spaces.111111[34], [48]. When topological considerations are salient or when one needs i.i.d. sequences with prescribed distributions, it is often assumed in contemporary probability theory that the sample space is a Polish space or a Borel subset thereof.121212A Borel subset of a Polish space together with its Borel subsets is known as a standard Borel space in descriptive set theory (cf. §[34, Definition 12.5, Corollary 13.4]). For representative examples of standard Borel spaces within probability, see e.g. [17, p. 51], [33, p. 7, pp. 561 ff]. For a classic probability text that foregrounds standard Borel spaces, see [51, Chapter 1].
A computable Polish space is a Polish space with a distinguished countable dense set and a distinguished complete compatible metric such that the distance between any two elements of the countable dense set is a computable real, uniformly in .131313A standard reference for computable Polish spaces is [48, Chapter 3]. A comparison to the Weihrauch approach to computable analysis is given in [24]. One can view the treatment of metric spaces in [70] as an axiomatization of Polish spaces and reals which are computable in an oracle. Finally, the study of computable Polish spaces in and of themselves is distinct from effective descriptive set theory, which usually refers to techniques for proving results about all Borel sets by first proving it for lightface Borel sets in Baire space (the most famous example of this is the Glimm-Effros dichotomy (cf. [23, Chapter 6])). (The enumeration of the distinguished countable dense set can contain repetitions, and will need to do so in finite spaces.)
In a computable Polish space, an open set is c.e. open if there is a computable function which enumerates a subsequence of the countable dense set and a computable sequence of rational radii such that . In this, denotes the open ball with centre and radius relative to metric (when the metric is clear from context, we just write ). The name “c.e. open” is chosen since the natural numbers are a computable Polish space with the discrete metric, and the c.e. opens in this space are precisely the computably enumerable sets of natural numbers, one of the canonical objects of the contemporary theory of computation.141414See [71], a standard reference. Further, many of the elementary methods of studying c.e. sets (e.g., universal enumerations) extend to c.e. open sets. The complements of c.e. open sets are called effectively closed sets (cf. §2.1).
In a computable Polish space, we say that a sequence at geometric rate if for all . We say that a sequence fast if at geometric rate . We then say that a point is computable if there is a computable function which enumerates a subsequence of the countable dense set such that fast. This subsequence is called a witness to the computability of . In Cantor space with its usual metric, the computable points are precisely the computable subsets of natural numbers.
In the real numbers, the countable dense set is the rationals, and the above definition of computable points is precisely how Turing defined computable real numbers at the outset of the theory of computation nearly a century ago.151515[72]. An equivalent formalisation of computable reals is by Dedekind cuts. A real is left-c.e. (resp. right-c.e.) if its left Dedekind cut in the rationals is a c.e. set (resp. if its right Dedekind cut in the rationals is a c.e. set). One can show that a real is computable iff it is both left-c.e. and right-c.e., and uniformly so. (An example of a left-c.e. real that is not computable is , where is an injective computable function with non-computable range.)161616These and other effective aspects of real numbers are treated extensively in e.g. [57], [15, Chapter 5].
These preliminaries in place, one can then quickly define the required core notions from algorithmic randomness and effective probability. These are all needed in order to formally state our main theorems, but one might restrict oneself to (1)-(8) on a first pass and come back to the others as needed.
Definition 1.1.
(Core notions)
-
(1)
A function is lower semi-computable (abbreviated lsc) if for all rational , the set is uniformly c.e. open.
-
(2)
A function is upper semi-computable (abbreviated usc) if for all rational , the set is uniformly c.e. open.
-
(3)
A probability measure is computable if is uniformly left-c.e. as ranges over c.e. opens.
-
(4)
Given a computable probability measure and a computable real , a function is an Schnorr test if it is lsc and if is a computable real, where this denotes the -norm .
-
(5)
Given a computable probability measure and a computable real , a function is an Martin-Löf test if it is lsc and .
-
(6)
A point in is Kurtz random relative to (abbreviated ) if is in every c.e. open with .
-
(7)
A point in is Schnorr random relative to (abbreviated ) if for any Schnorr test (equivalently, for any Schnorr test, for computable).
-
(8)
A point in is Martin-Löf random relative to (abbreviated ) if for any Martin-Löf test (equivalently, for any Martin-Löf test, for computable).
-
(9)
A computable basis for is a computable sequence of c.e. opens such that every c.e. open can be effectively written as a union of elements in .
-
(10)
If is a computable probability measure, then a -computable basis for is a computable basis such that (i) finite unions from uniformly have -computable measure, and (ii) each c.e. open in is uniformly paired with an effectively closed superset of the same -measure. If is clear from context, we simply say measure computable basis instead of -computable basis.
-
(11)
A sub--algebra of the Borel sets on is -effective if it is generated by a computable sequence of events from the algebra generated by a -computable basis .171717When working with , we assume that we are working with the codes for Boolean combinations of elements of , and only by extension with the sets that they define. This is because there are some spaces where Boolean algebra structure on the quotient is not computable. We say that is generated by .
-
(12)
A full -effective filtration (resp. almost-full -effective filtration) is an increasing sequence of uniformly -effective sub--algebras generated by a uniformly computable sequence from the algebra generated by a -computable basis , which is further equipped with a uniform procedure for going from a c.e. open to a computable sequence such that (resp. on ).
-
(13)
If is a point of the computable Polish space and is a subset of Baire space (the space of all functions from natural numbers to natural numbers), then weakly computes an element of if, for every sequence from the countable dense set of such that fast, there is in which is Turing reducible to the function . If , then we just say that weakly computes .181818One can extend Turing reducibility from a relation between sets of natural numbers to a relation between closed subsets of Baire space. In this setting, the notion of weak computation is called Muchnik reducibility, and is contrasted to a strong uniform notion called Medvedev reducibility. See [69], [27] for introduction and references, although this theory is usually focused on effectively closed sets. Given a point , the set of functions such that at a fixed rate, in the sense of (15), is a closed subset of Baire space. 191919If is Cantor space or the reals, then for each point of the space, there is a sequence of least Turing degree such that fast. In these settings, computational properties of the point of the space usually refer to those of this sequence. However, there are spaces for which there are points with no sequence of least Turing degree. See Miller [43].
-
(14)
If is a point of the computable Polish space , and is any collection of Turing degrees (equivalence classes of elements of Baire space under Turing reducibility), then we say that is in if there is some whose Turing degree is in such that fast, where again enumerates the countable dense set. In the case where just consists of the computable degree, note that is in iff is computable as a point of .
-
(15)
Suppose are elements in a metric space such that . Then a rate of convergence for is a function such that for all rational and all one has .202020If at geometric rate in a computable Polish space, then one defines a rate in the sense of (15) by setting for the least such that . Often in practice we use the case where is the reals and and , where are real-valued functions. A synonym for rate is modulus, and so we often use the variable for rates.
For the algorithmic randomness notions in (6)-(8), we just write instead of when is clear from context; and similarly for and . For -algebras , it is always understood that they are sub--algebras of the Borel -algebra, and when is clear from context we just say effective instead of -effective.
Algorithmic randomness is often formulated in terms of effective null sets, called sequential tests. But the definitions given above in terms of integral tests are easier to work with in our setting and are known to be equivalent to the sequential definitions, by theorems of Levin and Miyabe.212121[37], [45, Theorem 3.5].
Before turning to disintegrations, it is helpful to introduce notational conventions regarding versions of integrable functions vs. equivalence classes thereof. In the following definition, the -algebra on is simply , and similarly for .
Definition 1.2.
(Conventions on functions defined pointwise vs. functions defined up to -a.s. equivalence)
Suppose is a Polish space and suppose that is a finite non-negative measure on the Borel events of . Then we define:
is the set of pointwise defined Borel measurable functions such that .
is the set of pointwise defined Borel measurable functions such that .
is the set of equivalence classes of elements of under -a.s. equivalence. That is, is the classical Banach space with norm .
is the set of equivalence classes of elements of under -a.s. equivalence. That is, is a positive cone in the Banach space .
1.2. Classical and effective disintegrations
We use disintegrations for the versions of conditional expectation. While, classically, conditional expectation is defined only -a.s., in order to characterise algorithmic randomness notions in terms of Lévy’s Upward Theorem, we need to select specific versions of conditional expectation. The concept of a disintegration provides a very general way of making such selections. It is due to Rohlin222222[58], [59], [60]. and is routinely used today in ergodic theory and optimal transport,232323It is often used in the proof of the Ergodic Decomposition Theorem and the Gluing Lemma. See [19, 154], [63, 182]. and it is closely related to conditional probability distributions.242424[11], [53, §5.3].
Suppose that is a Polish space, is a probability measure on the Borel sets of and is a countably generated sub--algebra of the Borel -algebra. Define the equivalence relation on by iff, for all in , one has in iff in , and let be the corresponding equivalence class.252525Since we are focused on Lévy’s Upward Theorem, we are focusing on countably generated sub--algebras of the Borel -algebra. Note that this has the consequence that the relation is a smooth Borel equivalence relation (cf. [23, §5.4]). More complicated Borel equivalence relations occur naturally in nearby topics. For instance, Rute examines Lévy’s Downward Theorem (cf. [61, Theorem 11.2], [75, Theorem 14.4]), which in Cantor space results naturally in the sub--algebra of -invariant events, where is the Borel equivalence relation featuring in the Glimm-Effros dichotomy (cf. [23, Definition 6.1.1]).
Let be the Polish space of non-negative Borel measures on (cf. §2.2). For Borel measurable , whose action is written as , we define the partial map
(1.1) |
This map is a version, that is, it is partially defined on all pairs . Note that it is totally defined on , with range ,262626Indeed, it is totally defined on all pairs ( where is non-negative Borel measurable. But for our purpose of defining a version of we only need to pay attention to when is in . and it is totally defined and finite on all simple functions. It is further helpful to keep in mind that whether is finite depends on whether the element of is additionally in : that is, it is integrability with respect to rather than which is at issue.
For a Polish space , one says that a map is the disintegration of with respect to if both the following happen:272727We are following the treatment of Einsiedler-Ward [19, 135]. Since the main examples of disintegrations involve products (cf. Appendicies A-B), often alternative definitions of disintegrations involve maps that axiomatize the role that the projection operators play in the paradigmatic examples. For an example of definitions along these lines, see [11], [53, §5.3].
-
–
for all in , one has that is a version of the conditional expectation of with respect to and .282828Hence it is in , and thus it is defined and finite for -a.s. many from .
-
–
For -a.s. many from , one has and .
A disintegration of with respect to exists for any countably generated sub--algebra of the Borel -algebra on .292929[19, 135]. Indeed, a little more is true: one can replace by one of its Borel subsets. Further, one can relax the assumption that is countably generated, provided that one does not insist on . Further, it is possible to be more agnostic about the codomain of outside the -measure one set on which it outputs probability measures. See Appendix A for two classical examples of disintegrations.
Here is our key definition of effective disintegration:
Definition 1.3.
(Effective disintegrations).
Let be a computable Polish space. Let be a computable probability measure on . Let be a -effective -algebra. Let be a -measure one subset of . Then the map is an disintegration of with respect to if each of the following happen:
-
(1)
For all in , one has that is a version of the conditional expectation of with respect to and .
-
(2)
For all in , one has that and .
-
(3)
For c.e. open , the map is uniformly lsc from to .
We further define:
A Kurtz disintegration is simply a disintegration.
A Schnorr disintegration is a map which is both a disintegration and a disintegration.
A Martin-Löf disintegration is a map which is a disintegration and a disintegration and a disintegration.
Technically, it appears possible to, e.g., be a disintegration but not a disintegration. This is due to the universal quantifier over at the outset of (2). But this possibility does not appear to occur naturally among examples.
Due to space constraints, we have opted to focus on theory in the body of the text, and have put a brief discussion of the many interesting examples of effective disintegrations in Appendix B.
Finally, we can define:
Definition 1.4.
Let be an almost-full effective filtration, equipped uniformly with Kurtz disintegrations . A point in is said to be density random with respect to , abbreviated , if is in and for every c.e. open .
In this, is the Dirac measure centred at . With the limit written as such, by the Portmanteau Theorem one sees that it is a strengthening of the weak convergence of the measures . Since we use the disintegration to define the conditional expectation, as in equation (1.1) above, the limit in Definition 1.4 can be written equivalently as . With respect to the canonical filtration of length -strings on Cantor space and its natural disintegration (cf. Example B.1), density randomness has been a focal topic in recent literature on algorithmic randomness.303030[3], [47], [35]. In this setting with being the uniform measure, it is known that is a proper subset of . One example which shows this properness is an element of such that is c.e. open, where is the lexicographic order. Definition 1.4 is our suggestion for how to generalise this to the setting of arbitrary effective disintegrations.
1.3. Statement of main results
Our first main theorem is the following:
Theorem 1.5.
(Effective Upward Lévy Theorem for Schnorr Randomness). Suppose that is a computable Polish space and is a computable probability measure. Suppose that is an almost-full effective filtration, equipped uniformly with Kurtz disintegrations.
If is computable, then the following four items are equivalent for in :
-
(1)
is in .
-
(2)
is in and for all Schnorr tests .
-
(3)
is in and exists for all Schnorr tests and for all c.e. opens with computable.
-
(4)
is in and exists for all Schnorr tests .
In condition (2), means that the limit of exists and is finite and equal to . Likewise in (3)-(4), the existence of the limit means that it is finite. Note that (3) implies that already proves the analogue of density randomness where we restrict to c.e. open with computable.
For rates of convergence, we have:
Theorem 1.6.
(Rates for Upward Lévy Theorem for Schnorr Randomness). For all as in Theorem 1.5, one has:
-
(1)
For all in and all computable and all Schnorr tests one has that weakly computes a rate of convergence for .
-
(2)
For all in of computably dominated degree and all computable and all Schnorr tests one has that there is a computable rate for the convergence .
The notion in Theorem 1.6(2) is a classical notion from the theory of computation: a Turing degree is computably dominated if any function from natural numbers to natural numbers that is computable from the degree is dominated by a computable function, in the sense that the computable function is eventually above it.313131[71, 124], [50, 27]. A more traditional name for this concept is “of hyperimmune-free degree.” This more traditional name comes from an equivalent definition that emerged in the context of Post’s Problem (cf. [71, 133 ff]). For many but not all computable Polish spaces and in computable, there are non-atoms in (and hence in ) of computably dominated degree. This is a consequence of the existence of universal tests for and the Computably Dominated Basis Theorem (cf. discussion at Proposition 2.18, Example 2.19, Question 2.20).
It is unknown to us whether Theorem 1.6(2) can be improved, in the sense of an affirmative answer to the following question:
Question 1.7.
For all in that are not of computably dominated degree and all computable and for all Schnorr tests is there a computable rate for the convergence ?
The following is the simplest concrete version of the question (cf. Example B.1):
If is uniform measure on Cantor space, and in is not of computably dominated degree, and if is c.e. open with computable and not in , then does the convergence have a computable rate?323232For such , the set can be rather complex. In particular, Carotenuto-Nies [8] show that it is -complete when is dense. It is not clear to us whether this complexity is located among the ’s or the non-computably dominated ’s, or whether it is reflected in their rates of convergence.
Under uniform measure on Cantor space, the points which are not of computably dominated degree have measure one, a result due to Martin.333333Martin’s paper [40] is unpublished, but his proof has subsequently appeared in other sources, such as [15, Theorem 8.21.1 p. 381], [13, Theorem 1.2]. One way to negatively resolve the question would be to show that the non-computable-domination in Martin’s proof (or a variation on it) could be witnessed by a rate of convergence associated to an Schnorr test, or perhaps even to an indicator function of a c.e. open with computable.
We prove Theorems 1.5-1.6 in §8. Theorem 1.5 extends and unifies prior work by Pathak, Rojas, and Simpson, and of Rute (see discussion in §1.4 below), while Theorem-1.6 is entirely new.
Our next theorem pertains to convergence along Martin-Löf tests:
Theorem 1.8.
(Effective Upward Lévy Theorem for Density Randomness ).
Suppose that is a computable Polish space and in is computable. Suppose that is an almost-full effective filtration, equipped uniformly with Kurtz disintegrations .
If is computable, then the following three items are equivalent for in :
-
(1)
is in .
-
(2)
is in and for all Martin-Löf tests .
-
(3)
is in and exists for all Martin-Löf tests and for every c.e. open .
In contrast to Theorem 1.6(2), one has the following, whose proof is a traditional diagonalization argument deploying the halting set:
Theorem 1.9.
(Rates for Upward Lévy Theorem for Density Randomness).
There are as in Theorem 1.8 which have the property that for every computable and every in there is Martin-Löf test such that the convergence has no computable rate.
Hence, once we shift the tests from Schnorr tests to Martin-Löf tests, we never have points which possess computable rates for all tests. In one sense, Question 1.7 is asking whether there is some way to emulate a halting-set-like construction among the non-computably dominated ’s.
We prove Theorems 1.8-1.9 in §9. Theorem 1.8 extends work from a paper of Miyabe, Nies, and Zhang, which we discuss in the next section, while Theorem 1.9 is entirely new.
Given Theorem 1.5 and Theorem 1.8, it is natural to try to understand whether there is a convergence to the truth characterisation of , or at least some nearby superset of it (by contrast, is a subset of ). We thus isolate a class of Martin-Löf tests which have approximations such that in at an exponential rate, but not a rate that can necessarily be computed. Hence we define the following, where clauses (1)-(3) mimic the canonical approximations of Schnorr tests (cf. Proposition 2.16, Lemma 3.2), and where clause (4) pertains to exponential rates:
Definition 1.10.
Suppose that is computable.
A maximal Doob test is an lsc function in such that there is a uniformly computable sequence of Schnorr tests satisfying
-
(1)
on and on .
-
(2)
is equal on to a non-negative lsc function.
-
(3)
for is equal on to an Schnorr test, uniformly in .
-
(4)
For all , .343434By taking , we have in , and so is an Martin-Löf test, and hence in conjunction with (2) we have that is equal on to an Martin-Löf test.
A point is -maximal Doob random relative to , abbreviated , if for all maximal Doob tests.
Our theorem on this is the following:
Theorem 1.11.
(Effective Upward Lévy Theorem for Maximal Doob Randomness, ).
Suppose that is a computable Polish space and in is computable. Suppose that is an almost-full effective filtration, equipped uniformly with Kurtz disintegrations.
If is computable, then the following three items are equivalent for in :
-
(1)
in .
-
(2)
is in and for all maximal Doob tests .
-
(3)
is in and exists for all maximal Doob tests .
We prove Theorem 1.11 in §10. The name “Maximal Doob” in Theorem 1.11 and Definition 1.10 comes from the role played by Doob’s Maximal Inequality (cf. Lemma 5.1(1)) in the proofs in §10. In Proposition 10.1, we note that . But we do not know the answer to the following question:
Question 1.12.
Are the inclusions proper?
We suspect that is a proper subset of , and that one could show this by establishing the analogue of Theorem 1.9.
We add that we do not know the answer to the following:
1.4. Relation to previous work
Theorem 1.5 generalises the result of Pathak, Rojas, and Simpson, who show it for the specific case of and , being the -fold product of Lebesgue measure on with itself, and with being given by dyadic partitions.353535[52]. They state their result not in terms of Schnorr tests, but in terms of computable points of . See §11. Under this guise, Lévy’s Upward Theorem just is the Lebesgue Differentiation Theorem. Their proof goes through Tarski’s decidability results on the first-order theory of the reals, and so seems in certain key steps specific to the reals with Lebesgue measure.363636Such as at [52, Lemma 3.3 p. 339]. However, see Proposition 4.1 below, which generalises rather directly from their setting to the general setting.
In conjunction with the properties of effective disintegrations (cf. §7), one can derive the equivalence of (1)-(2) in Theorem 1.5 from results of Rute.373737In particular, for the (1) to (2) direction of Theorem 1.5, see Rute’s “Effective Levy 0/1 law” [61, Theorem 6.3 p. 31]. Our Proposition 7.5 and Proposition 2.4 implies that if is an Schnorr test, then is a computable point of , and so Rute’s Theorem 6.3 applies, once one internalises how to translate back and forth between Schnorr tests and computable points of (cf. §11). For the (2) to (1) direction of Theorem 1.5, see Rute’s [61, Example 12.1 p. 31]. As for Schnorr randomness, our work then expands on Rute’s primarily by finding a large class of versions of conditional expectations to which his results apply, and by identifying the information on rates of convergence in Theorem 1.6. More generally, the theory we develop is organised around the elementary concept of an integral Schnorr test, and so we hope might be of value to others by virtue of being accessible.383838In particular, we can avoid appeal to Rute’s theory of a.e. convergence, which is an alternative way to organise effective convergence in (cf. §2.4). See [61, Proposition 3.15 p. 15] and his Convergence Lemma [61, Lemma 3.19 p. 17].
Further, we are able to strengthen what is, in our view, one of the more foundationally significant parts of Rute’s work. He notes that traditionally “algorithmic randomness is more concerned with success than convergence” and that “only computable randomness has a well-known characterisation in terms of martingale convergence instead of martingale success.”393939[61, p. 7]. He is referring to what is called a “folklore” characterisation of computable randomness on Cantor space with the uniform measure in [15, Theorem 7.1.3 p. 270]. In Cantor space with the uniform measure, Rute has a characterisation of Schnorr randomness in terms of convergence of martingales.404040See items (1), (4) in his Example 1.5, immediately below the preceding quotation. We have been able to generalise this to all computable measures on computable Polish spaces: see Theorem 12.2. This proof follows Rute’s Hilbert space proof in broad outline. It seems to us that keeping track of the maximal function, which we can then use in DCT arguments, has been helpful here.
In the setting of Cantor space with the uniform measure and the natural filtration of length -strings and the natural disintegration (cf. Example B.1), our Theorem 1.8 was already known for and hence all computable . This result is in a paper of Miyabe, Nies, and Zhang, where it is attributed to the Madison group of Andrews, Cai, Diamondstone, Lempp and Miller.414141[47, Theorem 3.3 p. 312]. Their proof is a little more general, in that it just concerns martingale convergence rather than martingales associated to random variables. Their proof, in the Cantor space setting, also can be modified to give not only convergence but convergence to the truth for random variables. Their argument goes through an auxiliary test notion of Madison test. While we only have it for computable , our proof of Theorem 1.8 goes through first principles about density randomness and effective disintegrations. It is not presently clear to us whether the Cantor space proof using Madison tests can be generalised to arbitrary computable probability measures on computable Polish spaces equipped with effective disintegrations.424242As a final remark about the previous literature, we should mention that Lévy’s Upward Theorem has also been studied in the context of Shafer and Vovk’s game-theoretic probability ([67], [66, Chapter 8]). Their approach conceives of martingales primarily as game-theoretic strategies, and does not treat computational matters explicitly. By contrast, here we are focusing on the martingales and on effective properties of them conceived of as sequences of random variables. Discerning the relation between our approach and their approach would involve, as a first step, carefully going through their approach and ascertaining the exact levels of effectivity needed to secure their results, and secondly translating back and forth between the strategy and random variable paradigms.
We hope our efforts brings this prior important work on algorithmic randomness to the attention of a broader audience. Bayesianism is an important and increasingly dominant framework in a variety of disciplines, and this prior work and our work hopefully makes vivid the way in which computability theory and algorithmic randomness bears directly on the question of when and how fast Bayesian inductive methods converge to the truth. As its proof makes clear, the non-computable rate of convergence to the truth in Theorem 1.9 is another presentation of the halting set, and so this proof gives the theory of computation a central role in a limitative theorem of inductive inference, similar to its central role in the great limitative theorems of deductive inference like the Incompleteness Theorems. Further, the existence of Schnorr random worlds that are computably dominated, as in Theorem 1.6(2), shows that a central kind of randomness is entirely compatible with there being effective ways of determining how close we are to the truth. This optimistic inductive possibility is not one that would be visible in absence of recent work in algorithmic randomness.
Internal to the discussion about Bayesianism within philosophy, authors such as Belot have voiced the concern that the classical theory only tells us that worlds at which we fail to converge to the truth have probability zero, but otherwise tells us little about when and where the failure happens.434343[2]. From the perspective of Theorems 1.5, 1.8, 1.11, the probability zero event of non-convergence is not arbitrary, so long as one is insisting on convergence along a broad enough class of effective random variables. Namely, the sequences along which convergence to the truth fails for some element of this class are exactly those that are not random with respect to the underlying computable prior probability measure. In other words, those sequences can be determined by effective means to be atypical from the agent’s point of view.
Finally, we should emphasise that ours is not the only perspective on conditional expectations and its effectivity that one could adopt. In focusing on disintegrations, we are presupposing a framework where pointwise there is a single “formula” for the conditional expectation, namely the one displayed in equation (1.1) (and again see Appendicies A-B for examples). Likewise, the effectivity constraints in Definition 1.3 have the consequence that the conditional expectation operator is a continuous computable function (cf. Proposition 7.5), and so sends computable points to computable points (cf. Proposition 2.4). Both of these presuppositions constrain the applicability of our framework. For instance, Rao points out that conditional expectations are used throughout econometrics, but there one often uses the Dynkin-Doob Lemma as definitional of the conditional expectation,444444[55, 376]. For an example, see the presentation of conditional expectation in [20, Chapter 7]. and there is no more hope of having a single formula come out of it than there is of having all variables expressible in linear terms of one another. Likewise, conditional expectations and martingales can be used to prove theorems like the Radon-Nikodym Theorem,454545[75, p. 145-146]. which is “computably false” in that there are computable absolutely continuous probability measures with no computable Radon-Nikodym derivative.464646[70, p. 396], [77], [31]. That, of course, is not to say that these are not of interest or that determining how non-effective they are is not of interest, but just to say they will not be available in a framework like ours where we restrict to computable continuous conditional expectation operators.474747The paper Ackerman et. al. [1] is an important recent paper studying how non-effective, in general, it is to have disintegrations. Our Definition 1.3, by contrast, restricts attention to those disintegrations that are highly effective. This will not be all of them, and we do not claim that it would be all of the interesting ones.
1.5. Outline of paper
The paper is organised as follows. In §2, we begin with a brief discussion of some aspects of effectively closed sets and computable continuous functions and lsc functions which we need for our proof, and then we go over relevant aspects of the three computable Polish spaces which are central for effective probability theory:
-
–
The computable Polish space of non-negative finite Borel measures on and its computable Polish subspace of probability measures.
-
–
For each computable in and each computable , the computable Polish space .
-
–
For each computable in the computable Polish space of Borel measurable functions which are finite -a.s. and whose topology is given by convergence in measure.
The space is needed since when is in , the maximal function is in , and is guaranteed to be in iff . In addition to being needed in the proofs of the main theorems, the material in §2 also can serve to contextualise many components of Definition 1.1. For instance, we mention in §2.2 a result of Hoyrup-Rojas that an element of is computable in the sense of Definition 1.1(3) iff it is computable as an element of the Polish space . Likewise, in §2.3 we mention a result saying that an Schnorr test is simply a non-negative lsc function whose equivalence class is a computable element of (cf. Proposition 2.16). Finally, towards the close of §2.4, we define an Schnorr test and prove a new characterisation of in terms of these tests (cf. Definition 2.28, Proposition 2.29).
In §3 we present two lemmas on Schnorr randomness. The second of these, called the Self-location Lemma (3.3) is a distinctive feature of Schnorr randomness (vis-à-vis the other algorithmic randomness notions), and is central to our proof of Theorem 1.6. In §4 we present some results on recovering the pointwise values of effective random variables on . In §5, we review various classical features of the maximal function which we shall need later. In §6 we present an abstract treatment of Theorem 1.5 in terms of various effective constraints that a version of the conditional expectation may satisfy. In §7 we develop the fundamental properties of effective disintegrations. In §8, we prove Theorems 1.5-1.6, and in §9 we prove Theorems 1.8-1.9 and in §10 we prove Theorem 1.11. In §11, we show how Miyabe’s translation method allows us to recast Theorem 1.5 in terms of computable points of . In §12, we develop the theory of martingales in and prove the aforementioned generalisation of Rute’s result characterising Schnorr randomness in terms of martingale convergence. In Appendix A we briefly exposit two classical examples of disintegrations, and in Appendix B we present several examples of effective disintegrations.
In a sequel to this paper, we present a similar analysis of the Blackwell-Dubins Theorem,484848[5] which is also a “convergence to the truth” result, but wherein the pair “agent and world” is replaced with a pair of agents whose credences are variously absolutely continuous with respect to one another.
2. Computable Polish spaces for effective probability theory
2.1. Effectively closed, computable continuous, and lsc
In this section, we briefly describe two further concepts from the theory of computable Polish spaces: namely effectively closed subsets and computable continuous functions, and we close by mentioning a few brief aspects of lsc functions.
Before we do that, we mention one elementary proposition on computable Polish spaces which is worth having in hand (for e.g. the Self-location Lemma 3.3):
Proposition 2.1.
Let be a computable Polish space with metric and countable dense set . Suppose the map is such that fast. Then
-
(1)
The set is c.e. in graph of .
-
(2)
The point is computable iff the set is c.e.
Proof.
For (1), since the distances between the points of the countable dense set is uniformly computable, they are also uniformly right-c.e. Hence, it suffices to note that iff there is with .
As mentioned in §1, the complement of a c.e. open set is called an effectively closed set. In Cantor space and Baire space, the effectively closed sets can be represented as paths through computable trees.494949[9, p. 41]. The following provides a simple example on the real line of a classically closed set which is not effectively closed:
Example 2.2.
Suppose that and is right-c.e. and is left-c.e but neither are computable. Then is a computable Polish space. Further, is a classically closed subset of the reals which is not an effectively closed subset of the reals.
It is a computable Polish space since its countable dense set is c.e. since it is the intersection of the the right Dedekind cut of and the left Dedekind cut of . If were effectively closed in the reals then would be c.e. open in the reals. Then by choosing a rational in , one has that is c.e., contrary to hypothesis.
Effectively closed subsets of a computable Polish space need not themselves have the structure of a computable Polish space, since one in addition needs to produce an enumeration of a countable dense set where the distance between the points is uniformly computable.505050By contrast, classically, the Polish subspaces of a Polish space are precisely the subsets. See [34, p. 17]. Hence, we define: a computable Polish subspace of is given by a an effectively closed subset of and a countable sequence of points which are uniformly computable points of and which are dense in . One can check that the c.e. opens relative to are just the c.e. opens of the space intersected with , and further any effectively closed subset of is also an effectively closed subset of . Similarly, one can check that a computable point of is just a computable point of which happens to be in . As a simple example of a Polish subspace which is not a computable Polish subspace, one has:
Example 2.3.
Suppose that and is left-c.e. and is right-c.e but neither are computable. Then the closed interval is an effectively closed subset of the reals which is not a computable Polish subspace of the reals.
It is effectively closed since being left-c.e. and being right-c.e. implies that and are c.e. open. And if were a computable Polish subspace of the reals, and if were a sequence of uniformly computable reals dense in , then for a rational we would have iff there is such that , which is a c.e. condition and so would be right-c.e. and thus computable.
An effectively closed set is computably compact if there is a partial computable procedure which, when given an index for a computable sequence of c.e. opens in which covers , returns a natural number such that covers . We further say that is strongly computably compact if there is a partial computable procedure which, when given an index for a computable sequence of c.e. opens in halts iff this is a cover of , and when it halts returns a natural number such that covers . If itself is strongly computably compact, then so are all of its effectively closed sets. If is computable, then is strongly computably compact. Likewise, Cantor space is strongly computably compact, and if is computable then the computably bounded set is a computable Polish subspace of Baire space which is strongly computably compact. Example 2.2 is an example of a compact computable Polish space which is not computably compact, since if it were then we could compute the endpoints using maxs and mins of the centres of finite coverings with fast decreasing radii. For another example of a compact computable Polish space which is not computably compact, one can take the paths through a computable subtree of Baire space which is not computably bounded.515151See [10, Example 2.1.5 p. 59].
If are two computable Polish spaces, then a function is computable continuous if inverse images of c.e. opens are uniformly c.e. open.525252See Moschovakis [48, 110]. Simpson [70, Exercise II.6.9 p. 88] notes that it is equivalent to his preferred definition at [70, 85]. The following characterisation usefully parameterises each continuous computable function by a single c.e. set, where it is assumed for the sake of simplicity that both countable dense sets are identified with the natural numbers:535353This is from [26, 1169]. It can be seen as a simplification of Simpson’s definition in [70, 85].
-
–
A function is computable continuous iff there is a c.e. set such that both (i) if is in then and (ii) for all in and all there is in with in and .
Computable continuous maps are also computable continuous when restricted to computable Polish subspaces. The computable continuous maps preserve computability of points:
Proposition 2.4.
If is computable continuous and in is computable, then in is computable.
Proof.
This proposition is important because many arguments for the computability of points in effective analysis and probability can be seen as the result of applying computable continuous functions to computable points.
There is a partial converse to the previous proposition in the uniformly continuous setting. A computable modulus of uniform continuity for a uniformly continuous function is a computable function such that implies for all in . For instance, if is rational, then a -Lipschitz function is just a function with linear modulus of uniform continuity . The partial converse to Proposition 2.4 is the following:
Proposition 2.5.
Suppose are computable Polish spaces and that has a computable modulus of uniform continuity. Suppose that the image of the countable dense set in under is uniformly computable in . Then is computable continuous.
Proof.
Suppose is the countable dense set in . Suppose that is the computable modulus of uniform continuity. Suppose that fast, where is a uniformly computable sequence from the countable dense set in . Then define the c.e. set . First we show that . For, suppose that is in . Then . Then . Further since fast, we have , from which we obtain by triangle inequality. Second suppose that is in and . Let be such that . Since is an enumeration of the countable dense set, there is such that . Then is in , and the tuple is in and . ∎
This proposition is widely applicable in our context since many operators in functional analysis are uniformly continuous, and since one often in practice has good control over what happens with the countable dense set (see the proofs of Proposition 2.16 and Proposition 2.21 for representative examples).
Finally, recall the notion of core notion of lsc from Definition 1.1(1), which is the effectivization of the classical notion of lower semi-continuous. This class of functions has some paradigmatic examples and useful closure conditions which we briefly enumerate without proof:
Proposition 2.6.
Constant functions that are left-c.e. reals are lsc. Indicator functions of c.e. opens are lsc.
Lsc functions are closed under addition, maxs and mins. Non-negative lsc functions are closed under multiplication.
Sups of uniformly lsc functions are lsc. Infinite sums of non-negative lsc functions are lsc. Compositions of lsc functions with computable continuous functions are lsc.
Since is lsc iff is usc, one can use this proposition to obtain examples and closure conditions for usc functions as well.
The analogue of Proposition 2.4 for lsc functions is that they send computable points to left-c.e. reals.
2.2. The space of probability measures
If is a Polish space, then the space of real-valued finite signed Borel measures on is written as . Recall that the weak∗-topology on is the smallest topology such that all the linear maps are continuous, where ranges over bounded continuous functions on the space.545454Or, equivalently, as ranges over all bounded uniformly continuous functions on the space ([34, 110]). Unless the space is finite, the weak∗-topology on is not metrizable.555555[6, 17, 102] However, when is a Polish space, the space of all probability Borel measures on with the weak∗-topology is a Polish space, as is the space of all finite non-negative Borel measures on .565656[34, §17.E pp. 109 ff]. Convergence in is characterised by the Portmanteau Theorem.575757[34, Theorem 17.20 p. 111], [4, Theorem 2.1 p. 16].
A natural countable dense set on the spaces and are the finite averages of Dirac measures associated to points from the countable dense set on , with rational values for the weights. These spaces can be completely metrized by the metric of Prohorov. However, when working on , it is often more useful to work with the Wasserstein metric, and further when the metric on is unbounded it is more convenient to work with the Kantorovich-Rubinshtein metric on :585858[6, 104,111].
In this, . Hoyrup and Rojas prove that with the metrics of Prohorov or Wasserstein are computable Polish spaces, and their proof extends naturally to the Kantorovich-Rubinshtein metric. Likewise, their proof shows that is a computable Polish space and has as a computable Polish subspace.595959[29, 49], [30, 838]. Further, Hoyrup and Rojas characterise the computable points in as follows, and their proof extends naturally to :606060[29, 52], [30, 839].
Proposition 2.7.
A point in is computable iff is computable and is uniformly left-c.e. for c.e. opens in .
To illustrate the utility of this proposition, consider from Example 2.2. This proposition implies that Lebesgue measure on is not a computable point of since is left-c.e. but not computable. Likewise, is not a computable point of since one can choose rationals such that , and then one has is right-c.e. but not computable.
Recall the notion of a comptuable basis and measure computable basis from Definition 1.1(9)-(10). Hoyrup and Rojas use an effective version of the Baire Category Theorem to prove every is a computable point of has a -computable basis. Moreover, the basis can be taken to be open balls with centres from the countable dense set and with radii given by a dense computable sequence of non-zero reals, with the closed balls being the corresponding effectively closed supersets.616161See [30, Corollary 5.2.1 p. 844], [29, Theorem 2.2.1.2 p. 60]. Rute also employs this result of Hoyrup and Rojas, [61, pp. 13-14], although he leaves out from the definition of a measure computable basis the pairing of each basis element with an effectively closed superset of the same measure. Hoyrup and Rojas include this pairing, but further require that is dense (cf. [30, Definition 5.1.2 p. 842], [29, Definition 2.2.1.2 p. 58]).
The following proposition summarises some basic properties of -computable bases:
Proposition 2.8.
Suppose that is a computable point of .
-
(1)
Elements of the algebra generated by a -computable basis uniformly have -computable measure. Indeed, this holds for all holds for all sequences of events such that finite unions of them have uniformly -computable measure.
-
(2)
If a computable sequence of c.e. opens with uniformly computable -measure is added to a -computable basis, then finite unions from the resulting sequence have uniformly computable -measure, as do elements from the algebra generated by the resulting sequence. Indeed, this holds for all computable bases such that finite unions of them have uniformly -computable measure.
-
(3)
If a computable sequence of c.e. opens with uniformly computable -measure and with uniformly effectively closed supersets of the same -measure is added to a -computable basis, then the result is a -computable basis.
-
(4)
The -computable bases are closed under effective union.
Proof.
For (1), suppose that is a sequence of events such that finite unions of them have uniformly -computable measure.
First note that finite intersections have uniformly computable -measure: this is an induction on , and for the induction step, use inclusion-exclusion .
Likewise, finite unions of finite intersections of members of have uniformly computable -measure: this is an induction on , and for the induction step, use distribution .
Finally, note that finite intersections of members of and complements of members of have uniformly computable -measure. This follows from the previous steps, the elementary identity and distribution as follows when : , which is uniformly computable by the two previous paragraphs. If then note that , and since is computable, we can argue as in the case of with playing the role of .
For (2), suppose that is a computable sequence of c.e. opens such that is uniformly computable. Suppose that is a computable basis such that finite unions of them have uniformly -computable measure. We must show that finite unions from have uniformly -computable measure. It suffices to consider the case where just consists of a single c.e. open , since by induction and (1) we may assume that the are already among the . Since is a computable basis, write , where is a computable function. Then . Since this limit is increasing, and is uniformly computable, we have that is left-c.e. Similarly, . Since this limit is increasing, and is uniformly computable by (1), we have that is left-c.e. Then is also right-c.e. and hence computable.
Many of the canonical computable bases are measure computable bases:
Example 2.9.
If a computable basis on consists of sets which are also uniformly effectively closed, then the basis is -computable for any computable point of . This point applies to the canonical computable basis of clopens on Baire space or Cantor space.
Example 2.10.
If a computable basis on consists of c.e. open sets such that is uniformly effectively closed with is finite, then the basis is -computable for any computable atomless in . This point applies to the canonical atomless measures on for computable.
Here is an example of a computable basis that is not a measure computable basis:
Example 2.11.
Let be an injective function whose range is c.e. but not computable, so that is left-c.e. but not computable. Let , which converges upwards to one, starting from . Define a computable point of by . A computable basis for is given by where are rationals. But this is not a -computable basis since is left-c.e. but not computable.
To illustrate the utility of measure computable bases, consider the following approximation method. In this proof, we use the standard notation for the -th c.e. set, and we use for the points in which get enumerated in by stage in the canonical enumeration.626262[71, pp. 17-18, 47].
Proposition 2.12.
Suppose is a computable point of .
From a rational and an index for a c.e. open with computable, one can uniformly compute an index for an effectively closed set and an index for as a computable real such that .
Proof.
We work with the -computable basis as above (discussed immediately before Proposition 2.8). Let rational be given. Suppose is c.e. open with computable. Let where is a computable function. For each let . Note that has -computable measure, uniformly in . Using this and the computability of , compute such that . For each , the set is c.e. and dense in the open interval and so . Compute such that . Then is a finite union of effectively closed sets and so effectively closed; and further is a computable real since it is a finite union of elements from the -computable basis. Further and . ∎
The following is an important property of the interaction of with -computable bases:
Proposition 2.13.
Each element of the algebra generated by a -computable basis is uniformly identical on to a c.e. open , which is effectively paired with an effectively closed superset of of the same -measure.
Note that since is an effectively closed -null set, on .
Proof.
Suppose that is a -computable basis with corresponding effectively closed set of the same -measure. Again, since is an effectively closed -null set, we have that on . Then is c.e. open with effectively closed superset which with it agrees on .
Suppose that is an element of the algebra generated by the -computable basis . Then can be written as the finite union of finite intersections of the and their relative complements . This is indexed by a finite list of pairs of strings such that
(2.1) |
Then form c.e. open by replacing the effectively closed with the c.e. open , and similarly form effectively closed by replacing c.e. open with effectively closed , as follows:
(2.2) |
Then are equal on and hence have the same -measure, and further is c.e. open and is effectively closed.
∎
The previous proposition places topological constraints on the sets in -computable bases, at least when the measure has full support (that is, there are no open -null sets):
Proposition 2.14.
Suppose that is a computable point of .
-
(1)
If has full support and is a -measure one set and the c.e. open is equal to effectively closed on , then .
-
(2)
If has full support then no element of a -computable basis can satisfy .
In this, we use for topological closure.
Proof.
By contrast, Proposition 2.8(3) implies any c.e. open with computable and effectively closed and can be added to any -computable basis to form a larger -computable basis.
For a simple example of c.e. open as in Proposition 2.14(2), one has the following:636363This example is a minor modification of an example from a proof in [9, p. 58].
Example 2.15.
Consider Cantor space with the uniform measure. Let be a computable sequence of natural numbers such that (resp. is computable). Let be any computable set. For all , consider the following clopen:
Since makes decisions on many bits, its measure is . Then is a c.e. open. And (resp. is computable by the Comparison Test and the fact that is clopen, cf. Example 2.9 and Proposition 2.8( 1)). Further, the set is dense and so its closure is the entire space.
For a similar example on the unit interval with Lebesgue measure, one can use the complements of positive measure Cantor sets.
2.3. The space of integrable functions
For a computable point of and computable, there is a natural Polish space structure on (cf. Definition 1.2). For, one can take as the countable dense set the simple functions , where come from the algebra of sets generated by a -computable basis. If are two such functions, then so is , and hence it suffices to show that if is such a simple function then is computable. Since comes from an algebra, we can assume that the are pairwise disjoint, which implies everywhere. Then one has , which is computable by Proposition 2.8(1). Note that the countable dense set is in , that is, it is defined everywhere rather than merely -a.s. (cf. Definition 1.2). But when we pass to their equivalence classes, they become elements of , and they are a countable dense set in .
We do not record the choice of the -computable basis in the notation for the computable Polish space . This is for two reasons. First, the -computable bases are closed under effective union Proposition 2.8(4). Hence one can typically just assume that one is working with the union of whichever of them are salient in a given context. Second, one can check that any two -computable bases result in computably homeomorphic presentations of .
Many of the natural continuous functions on the computable Polish space are computable continuous, such as: addition, subtraction, multiplication by computable scalar, absolute value, maximum, minimum, positive part, and negative part.
By considering the continuous computable function , one sees that , and so is an effectively closed subset of (cf. Definition 1.2). Further, it is a computable Polish subspace, since the equivalence classes of the non-negative elements of the countable dense set of are dense in .
Since we are working with a finite computable measure from , if , then the identity map is a computable continuous map from into and satisfies for all from . We refer to this as the computable embedding of into .
In working with for it is useful to remember the following inequalities:
(2.3) |
By letting and , one obtains the following inequalities:
(2.4) |
The following proposition gives a canonical approximation of lsc functions which are bounded from below, and indicates that for the non-negative ones, being a computable point of is solely a matter of the computability of the norm. The first part is due to Miyabe for .646464[44, Lemma 4.6]. This kind of approximation is a mainstay of working with lsc functions, and different approximations tend to be appropriate for different purposes.656565See [39, Definition 1.7.4 p. 35]. We will need a variation on this approximation in Proposition 7.10.
Proposition 2.16.
From a rational and a lsc function , one can compute an index for a computable sequence of functions from the countable dense set of such that everywhere and everywhere.
Proof.
Let be a -computable basis. Enumerate as . For each , one has that is uniformly c.e. open. Hence, there is a computable function such that . Then define
(2.5) |
This is an element of the countable dense set of since we just enumerate as and for each non-empty subset of we consider the element of the algebra generated by the -computable basis, and we let , so that we have . Further, at the initial stages (if any) where is empty, we set .
Further from (2.5) one sees that since the sum over which we taking the maximum grows in . Further, one has everywhere since if we had , then for some with and in . But then , and so . Finally, one has everywhere, since if not we would have for some and some and hence would be in and so would be in for some in and hence there would be such that is in and hence by definition in (2.5) one would have that .
Suppose is computable and is lsc and in . If is a computable point of , then since the norm is computable continuous (using Proposition 2.5), we have that is computable (using Proposition 2.4). Conversely, suppose that is an Schnorr test, so that is computable. Then by taking -th roots, we have is computable. Since converges upwards to and we can compute both, we can compute a such that for all . Then using the estimate from (2.4), we have that , and so by taking -th roots again we have .
Similarly, for the last point, since converges upwards to , we can use the estimate from (2.4) to argue that for all there is such that for all one has , and so . ∎
The following records the “universal test” for . For integral tests, it is due Gács and Hoyrup-Rojas in the case .666666Gács [22, 102, Corollary 3.3], Hoyrup-Rojas [30, 845-6]. Further, the version stated here is simplified in that it is only stated for a single measure, whereas these authors state a version where the lsc functions have domain . Hoyrup-Rojas improve on Gács by removing any assumption about the computability of the Boolean algebra structure on the algebra generated by the canonical computable basis.
Proposition 2.17.
Suppose is a computable point of and is computable.
Then there is an Martin-Löf test with such that for all Martin-Löf tests with there is constant such that everywhere.
Hence , an increasing sequence of effectively closed sets.
Proof.
(Sketch) Enumerate the Martin-Löf tests with -norm as . Do this by enumerating approximations to them (as in Proposition 2.16) which have -norm . Then set . ∎
The previous proposition has the following useful consequence regarding computable domination, which recall features in Theorem 1.5(2):
Proposition 2.18.
Suppose that is computably compact and is a computable point of . Then there are points in of computably dominated degree.
The main idea of the proof is to build a computably compact space of fast Cauchy sequences above , and to apply there the Computably Dominated Basis Theorem.676767[9, Theorem 3.7 p. 54], [71, Theorem 9.5.1 p. 179]. One can of course thematize the space of fast Cauchy sequences more than we are doing in this short paper, and in part what we are doing in the below proof is doing the construction out “by hand” in the computably compact case.
Proof.
Without loss of generality, we identify the countable dense set with the natural numbers. By effective Baire Category Theorem, choose a strictly decreasing computable sequence of positive reals such that are disjoint. We define a non-decreasing computable sequence of natural numbers as follows. Suppose that we have already defined things up to stage . To define at stage , we consider the open cover and use computable compactness to compute an such that covers . Define the following computable trees:
The tree is computable since are disjoint. Further has no dead ends since we can just extend by repeating the last entry (since ). Let , which is then a computable Polish space with countable dense set given by extending any node in by means of repeating its last entry indefinitely. Since the function is computable, one has that is strongly computably compact.
The map given by sending to in is well-defined. For, since , every in is a Cauchy sequence.
By definition of , note that for all . For let . Since , choose such that . Then .
Note that any in is a sequence from the countable dense set of which converges fast to . This is because .
Further, is surjective: if in is given, then for each choose such that is in . Then is in since for all one has .
Then by Proposition 2.5, the map is computable continuous since it has a computable modulus of uniform continuity. For, if rational is given, compute least such that . Suppose that are in with agreeing . Then .
By the previous proposition, choose a non-empty effectively closed subset of which consists only of ’s. Then is an effectively closed subset of , which is thus strongly computably compact since is. By the Computably Dominated Basis Theorem, there is an element of of computably dominated degree. ∎
The following example shows that one cannot in general assume that the ’s of computably dominated degree in the previous proposition are non-atoms.
Example 2.19.
There is an uncountable computably compact computable Polish space and a computable point of such that the only elements of of computably dominated degree are among the atoms.
This follows from a construction of Ng et. al.686868[49, Lemma 2.1, Theorem 2.2]. Let be the uniform measure on Cantor space and let . Ng et. al. constructs a computable continuous map with image such that every in is such that is non-isolated in , and vice-versa, and in this circumstance and have the same Turing degree.
The image is a computable Polish space.696969Since it is the computable continuous image of Cantor space, cf. [10, Theorem 2.4.8(3) pp. 73-74]. Further, pushforwards of computable probability measures under computable continuous maps are computable probability measures (by Proposition 2.7), and so is a computable point of . Note that has full support since has full support.
Suppose that in is in and is of computably dominated degree. Then we claim that is an atom. For reductio, suppose not. Since any isolated point in a space with full support is an atom, one has that is not isolated. Since is a surjection, choose in with . By the construction, is in . But these points are not of computably dominated degree.707070E.g. [15, Theorem 8.21.2 p. 382]. Since have the same Turing degree, is not of computably dominated degree, contrary to hypothesis.
It is not clear to us what happens in the general atomless non-compact case:
Question 2.20.
Suppose that is a computable Polish space which is not computably compact, and that in is computable and atomless. Is there an element in that is of computably dominated degree?
If is the reals, it can be written as an effective union of computably compact Polish subspaces, and so the answer is affirmative, by Proposition 2.18. If is Baire space, then the answer is again affirmative, by using effective tightness to describe the ’s as a subset of a countable union of computably compact sets, and then applying the Computably Dominated Basis Theorem again. Hence to answer the question negatively one should be looking for spaces which are not “effectively ” and spaces where effective tightness does not produce a union of computably compact sets containing the ’s.
If is in and is in , then it induces the push-forward probability measure in . The following proposition tells us that the map is computable continuous. We use this proposition primarily in conjunction with Proposition 2.4 and Proposition 2.7, which together tell us that pushforwards of -computable functions are themselves computable.
Proposition 2.21.
Let be a computable Polish space. Suppose that is a computable point of . Then the map from to given by sending to is continuous computable. Similarly, the map from to given by sending to is continuous computable.
Proof.
We apply Proposition 2.5.
Suppose that is an element of the countable dense set of . Then , where is rational and the are elements of the algebra generated by a -computable basis. Then uniformly in rationals one has that , which is left-c.e. and indeed computable. Hence is a computable point of by Proposition 2.7.
Any computable function satisfying is a computable modulus of uniform continuity. To see this, suppose that , and suppose that is 1-Lipschitz, and that . By change of variables, one has that , where the second-to-last inequality uses that is 1-Lipschitz. By taking the supremum over all -Lipschitz with , one has , which by construction is .
Since is a computable Polish subspace of , the restriction of to it is also computable continuous. ∎
The above proposition has the following extremely useful consequence:717171Outside of density, the statement of this lemma is contained in Miyabe’s proof of his characterisation of in terms of Schnorr tests. See e.g. the line “It follows that is computable uniformly in ” ([45, p. 6]). Miyabe does not use pushforwards, but rather does it out by hand for Schnorr tests.
Lemma 2.22.
Let be a computable Polish space. Suppose that is a computable point of .
Suppose is lsc with -a.s. Suppose that is a computable point of . Then there is a computable sequence of reals dense in such that is c.e. open with uniformly -computable measure.
In particular, this is true of any Schnorr test.
Proof.
Let , which by hypothesis is a computable point of . By the Hoyrup-Rojas result discussed in §2.2, there is a -computable basis of the form , where ranges over rationals and is a computable sequence dense in , and where further has the same -measure as . Since -a.s., we have , which is computable.
The last point follows from the previous proposition. ∎
Proposition 2.23.
For any element of the countable dense set of , one can compute an index for a non-negative lsc function and a non-negative usc function such that on .
Likewise, for any element of the countable dense set of , one can compute a rational and an index for a non-negative lsc function and a non-negative usc function such that on .
Proof.
Let be an element of the countable dense set of . Then , where is rational and is an element of the algebra generated by a -computable basis. By Proposition 2.13, suppose that is a c.e. open which is uniformly equal on to , and suppose that is an effectively closed superset of of the same -measure. Then is non-negative lsc, and is non-negative usc, and they agree with on .
Let be an element of the countable dense set of . Then , where is rational and is an element of the algebra generated by a -computable basis. Let . Then is an element of the countable dense set of . ∎
2.4. The space of measurable functions
The space of equivalence classes of Borel measurable functions that are finite -a.s. under -a.s. identity is denoted by , where is in . We write when is clear from context.
In keeping with the notational conventions in §1.2, we write for the pointwise-defined Borel measurable functions that are finite -a.s.
The topology on is given by convergence in measure. To enhance readability, if is a measurable function, then we write for the more cumbersome . Then recall in measure iff for all one has that . Recall that a consequence of Egoroff’s Theorem is that -a.s. implies in for in .727272[21, p. 62].
A compatible complete metric is given by where . Note that the set is upwards closed, so that . When in , this is called the Ky Fan metric.737373[16, 289] While satisfies the triangle inequality and satisfies iff -a.s., it does not in general satisfy .747474[14, 65-69], [16, 289-290].,757575More generally, is not a Banach space. In working with the metric, it is useful to note that iff . Finally, note that -a.s. implies in .
The natural countable dense set for is the the rational-valued simple functions formed from the algebra generated by a -computable basis, that is, the same countable dense set as we used for for computable. Classically, this set is dense in , so it remains to verify that the distance between these two points is uniformly computable:
Proposition 2.24.
If is a rational-valued simple functions formed from the algebra generated by a -computable basis, then is computable, and uniformly so. If are two such simple functions, then is computable, and uniformly so.
Proof.
If is one of these functions, then so too is . Suppose that , where is rational and is are pairwise disjoint events from the algebra generated by a -computable basis.
For in , let , which is a finite set whose index is computable uniformly from . Then , which is a computable real, uniformly in (by Proposition 2.8(1).
Then in satisfies iff , which is a c.e. condition. If we enumerate these rational and take mins as we go, we get a computable decreasing sequence of rationals which converges down to , so that is right-c.e.
Likewise, in satisfies iff , which is a c.e. condition. If we enumerate these rational and take maxes as we go, we get a increasing computable sequence of rationals which converges up to , so that is left-c.e.
Similarly if are from a countable dense set then is a computable real since is also an element of the countable dense set. ∎
We call the following the computable embedding of into . The square root in the rate of convergence is, in our view, explanatory of the many ’s that appear in Pathak et. al. when dealing computable points of .767676See e.g. [52, p. 343].
Proposition 2.25.
Suppose that is computable. Then the inclusion map is a uniformly continuous computable map from to . Further, if at geometric rate of convergence in , then at geometric rate in .
Proof.
Suppose that . Since and have the same countable dense set, by Proposition 2.5, it suffices to show that there is a computable modulus of uniform continuity. Given rational , compute rational and compute a rational . Suppose are in with . Then , and so .
Suppose that and suppose at geometric rate of convergence in . Then , and so at geometric rate in . Let . Then . Then . ∎
The following proposition is the natural effectivization of the Bounded Convergence Theorem:777777[75, 130].
Proposition 2.26.
(Effective Bounded Convergence Theorem). Suppose is a computable point of . Then:
-
(1)
Suppose that in at a geometric rate of convergence . Suppose that such that -a.s. for all . Then at a geometric rate of convergence in .
-
(2)
Suppose is a computable point of and is a rational such -a.s. Then is a computable point of .
-
(3)
Suppose is a uniformly computable point of and is a uniformly computable sequence of rationals such that -a.s. Then is uniformly a computable point of .
Proof.
For (1), classically some subsequence of converges -a.s. to . Hence, a.s. Further, we may suppose . Suppose that at a geometric rate of convergence in , so that for all . Choose sufficiently large so that . Let , so that for all we have . Then , where the last inequality follows from .
Using Proposition 2.5, one has that many of the usual operations on are computable continuous, such as addition and minimum and maximum. Indeed, each of these three has modulus . We can use this observation to prove the following. It had been previously established by Rute, although our proof is different.787878[61, Proposition 3.26].
Proposition 2.27.
If is a computable point of (resp. ), then is a computable point of (resp. ).
Proof.
Let , so that on , and is uniformly a computable point of (resp. ). By Proposition 2.26 (3) one has that is uniformly a computable point of (resp. ). By Proposition 2.21, one has that is uniformly a computable point of (resp. ). For rational (resp. rational ), compute natural number , so that by Proposition 2.7 the real is uniformly left-c.e. Then by Proposition 2.7, the probability measure is a computable point of (resp. ). ∎
We then define:
Definition 2.28.
An Schnorr test is a lsc function which is a computable point of .
Proposition 2.29.
A point is in iff for all Schnorr tests .
Proof.
If for all Schnorr tests , then by the computable embedding of into , we have for all Schnorr tests , and so is in . Conversely, suppose that is in . Let be an Schnorr test. Since is in , it is finite -a.s. By Lemma 2.22 and the previous proposition, there is a computable sequence of reals in the open interval such that is c.e. open with uniformly -computable measure. So we have , and since is computable, we can compute a subsequence with . Hence is an Schnorr test, and so and so is only finitely many of the , and hence there is such that . ∎
Miyabe has shown that being in every -measure one class is equivalent to for every non-negative lsc in .797979[47, Proposition 3.3]. This notion of randomness is also called weak 2-randomness. In conjunction with the above proposition, it suggests that there is little room for a simple characterisation of in terms of non-negative lsc functions in .
Finally, we update our previous approximation theorem for Schnorr tests to Schnorr tests:
Proposition 2.30.
For all Schnorr tests , one can compute a subsequence of the from Proposition 2.16 such that fast in .
Proof.
The come from the countable dense set of and hence are computable points of . Since everywhere, we have in measure, and so in . Since is computable, we just search for a subsequence with . ∎
3. Two Schnorr lemmas: Flipping an approximation and Self-location
In this section, we provide two lemmas on Schnorr tests, one involving turning an approximation from below into a non-increasing subsequence converging down to zero, and another based upon a distinctive self-location property of Schnorr randoms.
The first of these involves a partial subtraction operator, which involves some care since it helps one avoid situations with . These situations can potentially arise since lsc functions are allowed to take infinite values.
Proposition 3.1.
Suppose that computable or .
Suppose that is an lsc function in (resp. an Schnorr test). Suppose that is a -measure one set on which the function is finite.
Suppose that is an lsc function such that on . Suppose that is paired with a usc function such that are equal on .
Define . Then
-
–
are finite on .
-
–
is non-negative lsc and in (resp. an Schnorr test) and is equal on to .
Proof.
For the first item, since the lsc function has codomain and the usc function has codomain , then when the two agree they have finite value. And they agree on .
Since is usc, is lsc. Since the lsc functions are closed under addition (cf. Proposition 2.6), one has that is lsc. Since the lsc functions are preserved under max (cf. again Proposition 2.6), we have that is non-negative lsc. Further, since are in (resp. are -computable) and this property is preserved under subtraction and maxes, we have that is also in (resp. -computable). On , one has that is both equal to and is non-negative, and hence equal to . ∎
While partial subtraction operation is not defined absolutely, but only relative to the hypotheses of the previous proposition, the situation of the following lemma is the one which tends to be operative in applications. We call it “flipping an approximation” since it takes a non-decreasing approximation and turns it into a non-increasing approximation . While classically trivial, it requires some organisation to handle within effective categories:
Lemma 3.2.
(Flipping an approximation) Suppose that computable (resp. ).
For each Schnorr test , let be from the countable dense set of as in Proposition 2.16 (resp. Proposition 2.30), so that everywhere and everywhere and fast in . Using Proposition 2.23, let be non-negative lsc and usc respectively with on . Then by the previous proposition, we have:
-
–
are finite on .
-
–
is an Schnorr test and is equal on to .
-
–
is non-increasing and on .
-
–
for , similarly is an Schnorr test and is equal on to .
Now we turn to self-location. The idea is that given a certain kind of computable “chart” of the computable Polish space, the Schnorr randoms can weakly compute their position on the chart. (For the notion of weak computation, see Definition 1.1(13).
Lemma 3.3.
(Self-location lemma).
Suppose that is a computable point of .
Suppose that is a computable sequence of c.e. opens with uniformly computable -measure.
Suppose that is in . Then weakly computes the element of Cantor space.
Proof.
Let be a -computable basis, with associated sequence of effectively closed supersets of the same measure. Let be a computable subsequence such that . Since and the have uniformly computable -measure, there is a computable function such that where . Let , which is an effectively closed set equal to on . Then is an Schnorr test.
Let be in . Since , there are only finitely many such that is in . Hence, there are there are only finitely many such that is in . Then the sets and differ by only finitely much and hence are Turing equivalent.
Since comes from the algebra generated by the -computable basis, using the -computable basis as in Proposition 2.13 we can compute indexes for c.e. opens such that on . Since both and are uniformly c.e. open, choose computable sequences from the countable dense set and from such that and , where denotes again the open ball around of radius . We can enumerate these sets as and , where and .
Consider a sequence from the countable dense set which converges fast to our point in . Given , to compute from the sequence whether is in , we simply start enumerating both and : eventually gets in one of them (and only ever gets in one of them), and we use the sequence to determine when this happens, by Proposition 2.1(1). ∎
Here are some simple applications of self-location, which we use to obtain the information about weak computation in Theorem 1.5(1):
Proposition 3.4.
Suppose that computable (resp. ).
-
(1)
Suppose that is a sequence of Schnorr tests such that is non-increasing on and such that on . Every in weakly computes a modulus of convergence for .
- (2)
Proof.
For (1), using Lemma 2.22 (resp. in conjunction with Proposition 2.27), choose a computable sequence of reals decreasing to zero such that the c.e. open has -computable measure, uniformly. Consider a sequence from the countable dense set which converges fast to . By the Self-location lemma, we can Turing compute from it the “chart” set . Let be rational. We show how to compute from a natural number such that for all . By hypothesis, decreases down to zero. Hence to compute from we just search for and then search for with .
4. Recovering pointwise values on Schnorr randoms
In this section we prove some results about pointwise limits existing on the Schnorr randoms for various effective functions convering fast in . By way of motivation for these kinds of results, consider and let be Lebesgue measure and recall the canonical example of convergence with -a.s. lack of pointwise convergence:
Proposition 4.1 below says that the slow -convergence in this example is essential to the lack of pointwise limits on . By modifying the events in to be open, one similarly gets a sequence of Schnorr tests which lacks pointwise limits on all . Proposition 4.3 likewise says that the slow convergence of is essential to the -a.s. lack of pointwise limits on .
In the setting of and and being the -fold product of Lebesgue measure on , the following result is due to Pathak, Rojas, and Simpson, who used sequential Schnorr tests.808080[52, Lemma 3.7]. See also [61, §3.3] and [76, Chapter 3] (cf. [70, p. 394]).
Proposition 4.1.
Suppose that is computable. Suppose that is a computable point of . Suppose that is a computable sequence from the countable dense set of such that fast in . Then exists on and is a version of .
Moreover, on this limit does not depend on the choice of or the choice of the -computable basis.
Finally, if is in addition an Schnorr test, then for all in .
Proof.
(Sketch) Let . Then using Proposition 2.23, it is equal on to a Schnorr test. This shows that is a Cauchy sequence for in .
If is another such witness to the computabilty of , then let , and it is similarly equal on to a Schnorr test.
To see that the partially defined function is a version of in , simply note that classically some subsequence converges to -a.s. and so is a version of . The sequence is computable in some oracle, and so by the previous paragraph we get that agree on all the Schnorr randoms relative to that oracle, and so is also a version of .
The final remark follows from the second paragraph by choosing to be the approximation to the lsc function from Proposition 2.16. ∎
The following is an analogue of the above proposition for . This proposition is essentially the natural effectivization of the classical proof that Cauchy-in-measure sequences converge in measure.818181E.g. [21, Theorem 2.30 p. 61].
Proposition 4.2.
Suppose that is a computable point of . Suppose that is a computable sequence from the countable dense set of such that at a geometric rate of convergence in . Then for all in one has that exists and is a version of .
This limit does not depend on the choice of or the choice of the -computable basis or the choice of the rate of geometric convergence.
Hence, if is in addition an Schnorr test, then for all in .
Note that by the computable embedding of into , the limit in this proposition agrees with the limit in the previous proposition on .
Proof.
We may suppose that the geometric rate of convergence is rational. Then we can compute whether is rational or irrational, and hence uniformly in we have that has uniformly computable left- and right Dedekind cuts. Since the are from the countable dense set, so is , and hence we can write it as , where is rational and the events are pairwise disjoint and come from the algebra generated by a -computable basis. By Proposition 2.13, this is equal on to the finite sum , where is a c.e. open which is equal on to . Let , which is a c.e. open since it is equal to , where , and is computable since has uniformly computable right Dedkind cuts. Then is equal on to and is a c.e. open with computable -measure, uniformly in .
We then have . Letting be the c.e. open , we have . Further for we have has computable -measure since it is a finite union of events with -computable measure coming from the algebra generated by a -computable basis. And then is computable since we can approximate it by since . Hence is an Schnorr test.
If a point is in , then it is not in some , while it is in . Then we argue for the following six items about elements of :
-
(1)
For all and all in , for all we have .
-
(2)
Hence for all and all in , we have that for is a Cauchy sequence and thus exists.
-
(3)
For all in and all , we have . For, let . Let and choose such that . Then by (1) one has that . Since this holds for all , we are done.
-
(4)
Since is a -measure one set, one has that exists -a.s.
-
(5)
Further one has that in . For let . Choose such that . Let , so that . Then by (3) we have .
-
(6)
Since both in and in , we have that -a.s.
Suppose that is another computable sequence from the countable dense set of such that at a geometric rate of convergence in . Note that and are equal -a.s. since they are both equal -a.s. to . Let and be constructed from and just as we constructed and from and above. Let , rational number . Let . Note that for all , we have and . Let be a c.e. open which is equal on to , and note that has computable -measure, as in the argument of the first paragraph of the proof. Then one has that , where the middle term is zero since and are equal -a.s. Hence is an integral test, and thus, for in one has that .
As in the previous proof, the limit does not depend on the choice of -computable basis since -computable bases are closed under effective unions (cf. Proposition 2.8).
The remarks about being a version of , and the remark about Schnorr tests, follows as in the proof of the previous proposition. ∎
There is a result similar to Proposition 4.1 when the are themselves Schnorr tests:
Proposition 4.3.
Suppose that is computable (resp. ). Suppose that are uniformly Schnorr tests with fast in , so that is also a computable point of . Then exists and for all in . If is also an Schnorr test, then for all in .
Proof.
By Proposition 2.16 (resp. Proposition 2.30), choose doubly-indexed computable sequence from the countable dense set of such that for all we have everywhere and and fast in . Then fast in . Hence, by Proposition 4.1 (resp. Proposition 4.2), exists on . Note that by these propositions, if is also an Schnorr test then on .
It suffices to show that on . Use Lemma 3.2 to rewrite what we are to show as , where is an Schnorr test equal to on , so that is an Schnorr test equal to on . By Lemma 2.22 in conjunction with Proposition 2.21 (resp. Proposition 2.27), choose a computable sequence in the interval such that the has computable -measure. Then is c.e. open with -computable measure. Let which is an Schnorr test. Let and . Then . Then is an Schnorr test: it is non-negative lsc as a sum of non-negative lsc functions, and the sequence is uniformly -computable and we just showed that fast in . Now we verify on . Let in . Let . Since is in , choose such that we have the estimate . Choose such that for all . Let . If is in , then by our estimate we have . If is not in , then by the definition of we have .
∎
5. Classical features of the maximal function
Suppose that is a point of and is any increasing filtration of Borel subsets of . In this section, we recall some classical features of the maximal function of an integrable function .
First, we recall the following, which gives us information about the codomain of the maximal function:
Lemma 5.1.
-
(1)
If then for in , and the maximal function maps to .
-
(2)
If , then the maximal function maps to .
Proof.
For and in , the sequence is a non-negative martingale, and so by Doob’s Maximal Inequality828282[25, Theorem 9.4 pp. 505-506]. followed by conditional Jensen we have: . Then by the Monotone Convergence Theorem we have . For , let be the -algebra generated by all the . By the classical Lévy Upward Theorem we have that -a.s. which shows that is finite -a.s., and hence that is in . ∎
The following proposition collects together all the other classical facts about the maximal function which we need:
Proposition 5.2.
-
(1)
For , the maximal function is uniformly continuous with modulus of uniform continuity.
-
(2)
For , the maximal function is uniformly continuous with a modulus of uniform continuity.
6. An abstract version of Lévy’s Theorem for Schnorr randomness
Before we state our abstract version of Lévy’s Theorem, we need the following definition:
Definition 6.1.
Suppose that has -measure one.
Suppose that is a class of Schnorr tests.
Then the Schnorr tests are approximated from below on by if from an index for an Schnorr test one can compute
-
–
an index for a sequence of Schnorr tests in such that on and on and fast in ; and
-
–
an index for a sequence of non-negative usc functions equal to on .
Recall from Proposition 3.1 that the are finite on , and we define , and we have that is an Schnorr test equal on to . Similarly if , then we define , and we have that is an Schnorr test equal on to .
The following is our abstract version of Lévy’s Theorem for Schnorr randomness. It is an abstract version in that we are not told more about the function other than that what is stated explicitly in the hypotheses (I)-(IV). In particular, we do not assume that comes from an effective disintegration, although in the next section we will show that effective disintegrations satisfy the hypotheses of the theorem.
Theorem 6.2.
Suppose that is a computable point of . Suppose that is an increasing filtration of Borel sets. Suppose that is computable.
Suppose that is a function such that for every in , one has that is a version of the conditional expectation of with respect to . Define the function by .
Suppose that is a superset of .
Suppose that:
-
(I)
maps non-negative lsc functions to non-negative lsc functions.
-
(II)
satisfies the following properties on :
-
(a)
If on , then on ;
-
(b)
If in then on ;
-
(c)
on ;
-
(a)
-
(III)
Both and send the countable dense set of uniformly to computable points of .
-
(IV)
The Schnorr tests are approximated from below on by a class of Schnorr tests such that on for each in .
Then the following three items are equivalent for in :
-
(1)
is in .
-
(2)
is in and for every Schnorr test .
-
(3)
is in and exists for every Schnorr test and for every c.e. open with -computable measure.
We also have:
-
(i)
Every Borel set is equal on to a Borel set in the -algebra generated by the union of the .848484Indeed, if and is in (resp. ) then we can take to be (resp. ), and if and is (resp. ) then may be taken to be (resp. ). And the same for the lightface classes.
- (ii)
Regarding (i), note that this is saying that the hypotheses of the Theorem amount collectively to an assumption that the union of the filtration generates a -algebra very close to the Borel -algebra, from the perspective of .
Proof.
First we note three things about the maps and and computable.
For , the map maps Schnorr tests uniformly to Schnorr tests. For, by (I), it sends non-negative lsc functions to non-negative lsc functions. And by conditional Jensen and (III) and Propositions 2.4,2.5, it sends computable points to computable points.
For , the map maps Schnorr tests uniformly to Schnorr tests. For, by (I), it sends non-negative lsc functions to non-negative lsc functions. And by Proposition 5.2(1) and (III) and Propositions 2.4,2.5, it sends computable points to computable points.
For , the map sends Schnorr tests uniformly to Schnorr tests. For, by (I), it sends non-negative lsc functions to non-negative lsc functions. And by Proposition 5.2(2) and (III) and Propositions 2.4,2.5 it sends computable points to computable points.
Suppose (1); we show (2). Suppose that is an Schnorr test; we want to show that on . Choose from as in (IV). By definition, one has that pointwise on and fast in and is non-decreasing on . Let be the Schnorr test . Then pointwise on and is non-increasing on and fast in .
Suppose (resp. ). Since is finite on , we have that on and hence by (IIc) we have on . These are all Schnorr tests (resp. except for which is an Schnorr test), and so they are finite on and we hence have on . Since on , we have on by (IIa). Hence on . Since the maximal function maps into (resp. ) and has a computable modulus of uniform continuity (cf. Proposition 5.2), and since fast in (resp. at a geometric rate in ), one can compute a subsequence such that fast in (resp. in ). By Proposition 4.3 one has that pointwise on . Let in and let . Since are Schnorr tests and since is an Schnorr test (resp. Schnorr test), these values are all finite on the point . Choose such that for all and . Choose such that for all one has . By the hypothesis on the coming from , choose such that for all . Hence for all one has that .
Note that the previous paragraph yields (ii). For, Proposition 3.4(1) tells us that can weakly compute a modulus of convergence for . And Proposition 3.4(2) tells us that can weakly compute a modulus of convergence for . And the extra hypothesis in (ii) says that can weakly compute a modulus of convergence for .
Suppose (3); we show (1). Suppose that is a point satisfying (3). We want to show that is in . Let be an Schnorr test. We want to show that . Suppose for reductio that . By hypothesis, exists and is finite. Choose and rationals such that for all . Then is in the c.e. open . Using a -computable basis, choose c.e. open which is a subset of and which contains and which has computable -measure. Let , which is an Schnorr test. Since everywhere, by (3), there is such that for all , and thus for all such . Let . Since everywhere, by (IIa), we have , a contradiction.
For (i), by using a -computable basis it suffices to prove it for c.e. open with computable. In this case, is an Schnorr test. Then on . By (IIa), one has on . For rational in the interval , let . This set is in since is a version of the conditional expectation of relative to . Further this set is c.e. open by the second paragraph of this proof. Then on one has that .858585This event is further in . When we decompose an arbitrary open as a union of c.e. opens with -computable measure, we will get an event in . Hence, is equal on to an event in the -algebra generated by the union of the . ∎
7. Fundamental properties of effective disintegrations
In this section, we develop the properties of effective disintegrations (cf. Definition 1.3, and for examples see Appendicies A-B). In the next propositions, denotes a -measure one subset of , as in the definition of an effective disintegration. Further, throughout this section, the expression is defined as in (1.1) of §1.2, namely the version of conditional expectation coming from the effective disintegration.
Proposition 7.1.
Suppose is an disintegration of . Then for lsc , the map is lsc, uniformly in .
Proof.
Suppose that . Since , we have . Choose rational in the interval . Then and . Then , since otherwise -a.e. and hence .
Suppose that rational satisfies and . If is not -integrable then trivially we have . Hence suppose is -integrable. Since , choose such that . Then . Then . Then . Then . ∎
Proposition 7.2.
Suppose is a disintegration of . Then for in the conditional expectation satisfies the following monotone linearity properties:
-
(1)
If on , then on ;
-
(2)
If in then everywhere.
-
(3)
everywhere.
Proof.
The use of Definition 1.3(2) in the proof of the previous proposition is typical, and henceforth we do not explicitly reiterate it as we go along.
We stated the previous proposition for non-negative functions. For these functions, the conditional expectation in (1.1) is automatically defined for all points in , even if it is infinite. However, in (1.1) is automatically defined and finite when is a simple function, and so the previous proposition holds for these functions as well. More generally, the previous proposition holds for functions which take negative values, provided that is defined and finite on all points of .
Proposition 7.3.
Suppose is a disintegration of .
If in is equal on to a function which is -measurable, then one has on .
Proof.
By Proposition 7.2(1), it suffices to consider functions in which are themselves -measurable (as opposed to being merely equal on to such a function).
Suppose that is in . If , then for some rationals we have . Then is in the event in . Then . Then for -a.s. many values and hence , a contradiction. The case of is similar. ∎
In the below proposition, we use the traditional names for the properties of conditional expectation.868686E.g. [75, 88].. Of course, the hypothesis of effective disintegrations in Definition 1.3(1) is that is a version of conditional expectation. But in this proposition and several others in this section, what we are verifying is that they hold pointwise on specifiable measure -one subsets.
Proposition 7.4.
Suppose is a disintegration of .
-
(1)
(Conditional MCT). Suppose that are in and on and on . Then on .
-
(2)
(Conditional DCT) Suppose that are in and on and on . Then:
-
(a)
If in and then .
-
(b)
If on , then on .
-
(a)
-
(3)
(‘Taking out what is known’). Suppose that in , and suppose that is equal on to a -measurable function. Then on .
Proof.
For (1), let in . By hypothesis, for -a.s. many points, and likewise for -a.s. many points. Then by MCT applied to we have .
For (2), let in with . This means that , and so is in . By hypothesis, for -a.s. many points, and likewise for -a.s. many points. Hence by the DCT applied to , we have that .
For (3), by Proposition 7.2(1), it suffices to prove it for which is itself -measurable. We show it by induction on complexity of .
Suppose where is -measurable. If in then and then is a -measure one event and then it reduces to the observation that . If is not in then and then is a -measure zero event and then it reduces to the observation that .
Unlike the previous propositions, this proposition concerns Kurtz disintegrations:
Proposition 7.5.
Suppose is a Kurtz disintegration of . If is computable, then is computable continuous.
Proof.
By conditional Jensen, the function is a computable modulus of uniform continuity. Hence by Proposition 2.5, it suffices to show that if is an element of the countable dense set of , then is a computable point of . Since we can effectively separate into positive and negative parts, it suffices by the linearity of conditional expectation (Proposition 7.2) to consider the case where . We may assume further that the are pairwise disjoint, which like in the discussion at the beginning of §2.3 implies that .
By Proposition 2.13, let be a c.e. open which is equal to on . Let , so that likewise on . Then on and on . By Proposition 7.2 we have for in :
By Definition 1.3(3), the functions are lsc. Hence, by Proposition 2.16, choose functions from the countable dense set of that converge upward to , in that everywhere and everywhere. Let and , which likewise converge upward to and respectively on .
Suppose is in . Since we are working with a Kurtz disintegration, we then have that is a -measure one set, and so . Since the are pairwise disjoint, we then have , and so . Then for in , by the convexity of the -th power function applied with coefficients and points , we have:
Since this estimate holds on the -measure one set , by taking expectations and then -th roots, we have . Since the right-hand side is a computable value which goes to zero as goes to infinity (by MCT), we can compute a subsequence of the which is a witness to the computability of . Since are equal on , we have is also a computable point of . ∎
The previous proposition has the following elementary consequence:
Proposition 7.6.
Suppose is a Kurtz disintegration of . Suppose is computable.
-
(1)
If is an Martin-Löf test, then is an Martin-Löf test and is an Martin-Löf test.
-
(2)
If is an Schnorr test, then is an Schnorr test and is an Schnorr test.
Proof.
This next proposition seems specific to Schnorr tests and :
Proposition 7.7.
(Tower) Suppose that are two effective -algebras, each of which has a Kurtz disintegration. Then for every Schnorr test , one has that on .
Proof.
By the previous proposition, one has that the two functions and are Schnorr tests. Suppose that they are not equal on in . Then, without loss of generality, there are rationals with . By Lemma 2.22, there is a computable real in the interval with computable. Since is lsc, the set is c.e. open, and it has computable measure. By the same lemma, there is a computable real in the interval with computable, so that is likewise computable. Since is lsc, the set is effectively closed. Since it has computable -measure, by Proposition 2.12, there is a a decreasing sequence of c.e. opens with uniformly computable and . We then claim that . For, suppose not. Then . Since is computable by Proposition 2.8(2), we can then compute a subsequence with , so that is an Schnorr test. But since in and in by construction, we have a contradiction. Hence indeed . Since are by definition -measurable, we have that is also -measurable and hence -measurable. Then one has the following identities by the definition of conditional expectation (these identities being the classical proof of the tower property):
But these identities give us the below identity, where the remaining inequalities follow from the definitions of :
But since , we then have that , contrary to construction.
∎
Proposition 7.8.
(The rôle of independence) Suppose is a Kurtz disintegration of .
If in is independent of , then everywhere.
If is an Martin-Löf test independent of , then on .
Proof.
Let us abbreviate .
First suppose that in is independent of . Let in be arbitrary. Suppose that . Choose rational with . Let , which is in . Then we have the following, where the first step is independence: .
Second suppose that is an Martin-Löf test, independent of . Then is non-negative lsc by Proposition 7.1. Suppose is in . Suppose . Choose rational with . Let , which is effectively closed and in . Since it contains the point , we have that . Then we have the following, where the first step is from independence: . Since we have , a contradiction. ∎
Now we turn towards effective properties of the maximal function (cf. §5 for the classical properties). Recall that almost-full was defined in Definition 1.1(12).
Proposition 7.9.
Let be an almost-full effective filtration equipped with Kurtz disintegrations. Let be the associated version of the maximal function.
For , the maximal function is computable continuous from to .
For , the maximal function is computable continuous from to .
For , the maximal function sends the countable dense set of (resp. ) uniformly to computable points of (resp. ).
Proof.
First suppose . Let , where is rational and comes from the algebra generated by a -computable basis. Without loss of generality, and the are pairwise disjoint. By almost-fullness of the filtration and Proposition 2.13, for each , let on , where is a c.e. open equal on to an element from the sequence which generates , where is a computable function. By replacing by , we may assume that and are non-decreasing in , for each . Let . The and come from the algebra generated by a -computable basis, and so for each , we can compute a subsequence such that . By Doob’s Maximal Inequality (Lemma 5.1), we have
which is . Hence it remains to show that is uniformly a computable point of . Since for all , the c.e. open is equal on to an element of , one has that on by Proposition 7.3. Hence, on . And the latter is a computable element of since it is a finite max of conditional expectations which are computable elements of by Proposition 7.5.
For , one just appeals to the fact that and for computable share a common dense set and that computably embeds in .
Proposition 7.10.
Let be an almost-full effective filtration equipped with Kurtz disintegrations.
For every Schnorr test , we can compute an index for a sequence of Schnorr tests such that on and on and fast in and on . Indeed, there is computable function such that
(7.1) |
Further, we can compute an index for a non-negative usc function such that are equal on .
Finally, we can compute from an index for a non-negative rational which bounds on .
Proof.
Enumerate as . For each , one has that is uniformly c.e. open. Since the filtration is almost-full, by using Proposition 2.13 there is a computable sequence of c.e. opens such that on , where the are equal on to events from the sequence which generates . Then define
(7.2) |
As in the proof of Proposition 2.16, this is a rational-valued step function, whose events are equal on to events from the sequence which generates , where is a computable function. Then by Proposition 7.3 one has that
(7.3) |
Further from (7.2) one sees that everywhere since the sum over which we taking the maximum grows in . Further, one has on since if we had for in , then for some with in . But since on , we then have . Finally, one has on , since if not we would have for some in and some and hence would be in and so would be in for some and hence for one would have that by definition in (7.2).
We can pass to a subsequence of which goes to fast in as in the proof of Proposition 2.16.
8. Proof of Theorems 1.5-1.6
First we prove Theorem 1.5:
Proof.
The results of the previous section show that the conditions of Theorem 6.2 are satisfied:
- –
- –
- –
- –
Finally, we argue from Theorem 1.5(4) to Theorem 1.5(1). Suppose Theorem 1.5(4) is satisfied. We want to show that is in . Let be an Schnorr test. We want to show that . Suppose not. Since by hypothesis exists, there are rationals and there is such that for all . Then is in the c.e. open . Since the filtration is almost-full and is in , there is and an event from such that is in and . Then , and hence . Hence is a -measure one event, and hence , a contradiction.
∎
Now we turn to Theorem 1.6:
Proof.
For Theorem 1.6(2) suppose that in is of computably dominated degree. By the previous paragraph, we have that weakly computes a modulus for the convergence . Likewise defined by is computable from . Since is of computably dominated degree, we have that at least one of these is dominated by some computable function, call it , so that that past some point, call it , we have . Let be rational. Compute such that . Let . Then . Then . ∎
9. Proof of Theorems 1.8-1.9
We begin with an elementary proposition.
Proposition 9.1.
Suppose in such that for every c.e. open one has .
-
(1)
For every event in the algebra generated by the c.e. opens, one has .
-
(2)
For every simple function generated from events in this algebra, one has .
-
(3)
For every lsc which is in each of , one has .
This proposition too illustrates that convergence to the truth is a strengthening of weak convergence, since in (3) there is no boundedness constraint on the lsc function.
Proof.
For (1), every event in this algebra can be written as a finite disjoint union of sets of the form , where are c.e. opens. Let and , so that the set has the form . Since are in and since is c.e. open as well, we have that
For (2), one just applies the previous item and the properties of the integral.
Now we prove Theorem 1.8:
Proof.
Suppose Theorem 1.8(1); we show Theorem 1.8(2). Since is in , one has that is in . Suppose that is an Martin-Löf test. Since is in , one has . By Proposition 7.6(1), the functions are Martin-Löf tests as well, and hence , which is just to say that . By Proposition 9.1(3) applied to one has that , which is just to say that . Hence, it remains to show that . Suppose not. For reductio, suppose there are rational with .
Let and , so that is c.e. open and is effectively closed. Since is in and is not in , we have .
By definition of , we have for all that . For any such that , we then have , so that . Hence, our reductio hypothesis gives that there are infinitely many with .
By Proposition 7.6(1), we have that are Martin-Löf tests. Hence its maximal function is also non-negative lsc. Let , which is c.e. open. By Doob’s Submartingale Inequality, one has that is the following:
Hence is an Martin-Löf test, and hence there is constant such that .
Then for all we have , so that is in with . Let be the conjugate exponent to . Then, for each , we have by Hölder with respect to that:
(9.1) |
Since , we have that , contradicting the previous conclusion from the reductio hypothesis.
Now we prove Theorem 1.9:
Proof.
We work in Cantor space with the uniform measure , the effective full filtration of the algebra of events generated by the length strings, and with the effective disintegration . Then , and likewise . (This is just Example B.1 for Cantor space and uniform measure).
We show that for in there is c.e. open with such that is not in and the convergence does not have a computable rate.
(Since the example involves an indicator function, we have that is in for all computable).
Let . Let be the halting set . Enumerate it as , where the map is injective.
Define and .
For , define clopen and define the c.e. open . Then and . Since just makes decisions on bits , it is independent of all bits (since is uniform measure). We then have that for any , and hence for any .
Let in . Since is an Martin-Löf test, one has that there is such that is not in . Hence . Suppose that with computable rate . Let be such that for all . Then . Since , one has that , a contradiction to the previous paragraph. ∎
10. Proof of Theorem 1.11
First we note a fact mentioned in the introduction, namely that Maximal Doob Randomness is inbetween Martin-Löf and Schnorr randomness:
Proposition 10.1.
For all computable one has .
Proof.
Since any maximal Doob test is an Martin-Löf test, we have . To show that , it suffices to show that any Schnorr test is an maximal Doob test. By Proposition 2.16, let be from the countable dense set of so that on and on and fast in . Then we can compute a subsequence such that . Then for all one has that . Then we are done by Lemma 3.2. ∎
The following closure condition on maximal Doob tests is the difficult component of the proof of Theorem 1.11:
Proposition 10.2.
If is computable and is an maximal Doob test with witness , then is equal on to a maximal Doob test with witness equal on to .
Proof.
The function is equal on to a non-negative lsc since it is equal on to a supremum of non-negative lsc functions (cf. Proposition 7.1, Proposition 7.2(1)). Since , by Proposition 7.9, we have that is equal on to an Schnorr test. For , the quantity is :
(10.1) | ||||
(10.2) |
To estimate (10.1), let us first define a computable sequence of non-negative left-c.e. reals:
For fixed we have . To estimate (10.1), we have the following, where the last line follows from Doob’s Maximal Inequality (Lemma 5.1(1)):
For (10.2), we have the following, where we use Doob’s Maximal Inequality (Lemma 5.1) again at the end:
∎
Here is the proof of Theorem 1.11:
Proof.
Suppose (1); we prove (2). One has that in and indeed in by Proposition 10.1. Suppose now that is an maximal Doob test with witness . By the previous proposition and (1), we have . Let . Choose such that for all we have . Choose such that for all we have . By Theorem 1.5 applied to , we have that . Choose such that for all we have . Then putting this all together, we have for all that is the following:
11. Back and forth between tests and computable points
In this section we indicate how to state a version of Theorem 1.5 in terms of -computable points. This essentially follows by a translation method of Miyabe.
We begin with how to select a version for each computable point . Pathak, Rojas, and Simpson and Rute have shown how to do this via Proposition 4.1. We use the following slight variant of their selection method:
Definition 11.1.
Suppose that is a computable point of with witness . Then we define a version in by:
Hence, Proposition 4.1 tells us that the definition of goes through the first case break on all points of , and on these points it is independent of the choice of the witness . However, on we have that it is dependent on the witness . If one was working more extensively with , one would want to develop some notation which better mark its dependence on the version . But this dependence has the following advantage: if is an Schnorr test and is a witness to its being -computable such that everywhere (as in Proposition 2.16), then everywhere.878787Pathak, Rojas, and Simpson [52, Definition 3.8 p. 314] work with a variant of our that organises the case break depending on whether is in . Their approach has the advantage of making , which they denote as , entirely independent of the witness . Rute [61, Definition 3.17 p. 16] organises the case break the same as we do but sets it undefined when the limit does not exist, which prevents it from being an Schnorr test.
Miyabe proved the following transfer result for going back and forth between -computable functions and differences of Schnorr tests:888888[45, Theorem 4.3 p. 7]. He proved it for , but the proof is the same for .
Proposition 11.2.
Suppose that is a computable point of and is computable.
-
(1)
Suppose is a computable point of with witness . Then there are Schnorr tests such that on on .
-
(2)
Suppose that are Schnorr tests. Then there is -computable with witness such that on on .
Proof.
These observations allow us to restate Theorem 1.5 in terms of -computable points, provided one assumes Schnorr disintegrations:
Corollary 11.3.
Suppose that is a computable Polish space and is a computable probability measure. Suppose that is an almost-full effective filtration, equipped with Schnorr disintegrations.
If is computable, then the following three items are equivalent for in :
-
(1)
is in .
-
(2)
is in and for every computable with witness .
-
(3)
is in and exists for every computable with witness .
Proof.
Suppose (2); we show Theorem 1.5(2). But simply note that any Schnorr test has a witness from Proposition 2.16 with everywhere.
Suppose Theorem 1.5(2); we show (2). Let be computable with witness . By the previous proposition, there are two Schnorr tests such that on . Since we are working with Schnorr disintegrations, by Proposition 7.2(1), one has that on for all . Then by Theorem 1.5(2) and Proposition 7.2(3), we have on that .
12. Martingale convergence in
Our topic in this paper is convergence of the conditional expectations of random variables. But of course these are instances of martingales. In this brief section, we prove a result mentioned in §1.4, namely that one can characterise in terms of convergence of certain martingales. As mentioned there, this generalises a result of Rute from Cantor space to the more general setting.
If and if is a filtration, then a classical martingale in adapted to is a sequence of measurable functions in such that -a.s. When is clear from context, we just say classical martingale in .
If is computable and is an effective filtration equipped with Schnorr disintegrations , then a martingale of Schnorr tests adapted to and is a uniformly computable sequence of -measurable Schnorr tests such that on , where the version of the conditional expectation is that from the disintegration. When and are clear from context, we just say martingale of Schnorr tests.
Here is an example:
Example 12.1.
(Products of mean one independent variables).
Suppose is computable. Suppose that is a sequence of independent Schnorr tests with for all . Suppose that is an effective filtration equipped with Schnorr disintegrations. Then is a martingale of Schnorr tests.
Our goal is to prove the following:
Theorem 12.2.
Suppose that is a computable point of . Let be an almost-full effective filtration equipped with Schnorr disintegrations.
The following are equivalent for in :
-
(1)
is in .
-
(2)
is in and exists for every martingale of Schnorr tests such that both is computable and the maximal function is a Schnorr test.
This theorem generalises a result of Rute.898989[61, Corollary 6.8 and Theorem 12.6]. But there are two important differences between our result and Rute’s. First, Rute’s analogue of the direction from (2) to (1) of Theorem 12.2 only works for Cantor space and the uniform measure. Second, Rute’s results do not require that the maximal function be a Schnorr test.
We do not know the answer to the following:
Question 12.3.
Does Theorem 12.2 also hold for all computable ?
It is clear from the proofs below that (2) to (1) holds for computable . Hence it is a question of (1) to (2).
Throughout the remainder of this section, is a computable Polish space, is a computable point of , and is an effective filtration equipped with Schnorr disintegrations. We only assume that is almost-full in the last proposition, and flag this assumption when it comes up.
We begin by noting the following two elementary results:
Proposition 12.4.
If is a martingale of Schnorr tests, then on for all .
Proof.
Proposition 12.5.
If is a classical martingale in , then for all .
Proof.
The function is a convex function, and hence is a submartingale.909090[75, p. 138]. And the expectation of a submartingale is always non-decreasing. ∎
The following gives a canonical example of a martingale of Schnorr tests, and in conjunction with Theorem 1.5 gives the (2) to (1) direction of Theorem 12.2.
Proposition 12.6.
Suppose that is computable.
If is an Schnorr test, then is a martingale of Schnorr tests.
If and the filtration is almost-full then the maximal function is also an Schnorr test and is computable.
Proof.
The function is non-negative lsc by Proposition 7.1, and it is -computable by Proposition 7.5. To see that it satisfies the martingale condition, from everywhere we have everywhere. And by the tower property (Proposition 7.7), the latter is equal to on , which is by definition .
Suppose now that the filtration is almost-full and . By almost-fullness and Theorem 1.5, we have that on . Since we have is in by Lemma 5.1(1). Then we can dominate by and argue by DCT as follows, where the first identity comes from Proposition 12.5:
(12.1) |
Since is by hypothesis a computable point of , we have that is computable and hence likewise is computable. ∎
In conjunction with Corollary 11.3, the following proposition then gives the (1) to (2) direction of Theorem 12.2. As mentioned in §1.4, the proof largely follows the outline of Rute’s own Hilbert space proof.
Proposition 12.7.
Suppose that the filtration is almost-full.
Suppose is a martingale of Schnorr tests such that both is computable and is a Schnorr test.
Then there is -computable function such that on for each .
Further, the -computable function can be taken to be a pointwise limit of a computable subsequence of the , which limit exists at least on .
Proof.
Recall that for we have by Hilbert space methods in that .919191[25, 488]. This implies that for we have:
Let , which classically is in . Since is in , we can dominate by and argue by DCT and the previous equation that:
Since the latter is a computable real which goes to zero, we can compute a subsequence which converges to fast in . By Proposition 4.3 and Definition 11.1, we have that on .
13. Conclusion
The main results of this paper (Theorems 1.5, 1.6, 1.8, 1.9, 1.11) characterise the points under which Lévy’s Upward Theorem holds in terms of notions from algorithmic randomness and the rates of convergence in terms of concepts from the classical theory of computation. As discussed in §1.4 this builds on work by previous authors. That which is new are the results on rates of convergence in Theorems 1.6, 1.9, the articulation of the general framework of effective disintegrations (see Definition 1.3, §7 for fundamental properties, and Appendix B for examples), a conceptually new proof of the characterisation of density randomness in the more general framework of effective disintegrations for (Theorem 1.9), and the articulation of the new concept of Maximal Doob Randomness (cf. Definition 1.10, Theorem 1.11, and Question 1.12). As far as Schnorr randomness goes, we noted in §1.4 that Theorem 1.5 can be derived from Rute’s work, modulo the verification of certain properties of effective disintegrations and the Miyabe translation method in §11. We have extended Rute on Schnorr randomness in the generalisation of the martingale result in §12. We have also sought to present very accessible proofs, based almost entirely on the concept of Schnorr test.
Our results also contribute to understanding the significance of convergence to the truth results for Bayesian inference. As was pointed out by philosophers of science, the probability one qualification in theorems like Lévy’s Upward Theorem raises the spectre of arbitrariness: a Bayesian with credences represented by a probability measure believes in convergence to the truth with certainty, but might do so only by arbitrarily packaging into a set of probability zero those points at which convergence fails.929292[2], [18, pp. 144 ff], [28, pp. 28-29]. We have shown that for certain classes of effective random variables the packaging is anything but arbitrary. The probability one set on which convergence to the truth is successful coincides with standard classes of points which are algorithmically random by the lights of the computable probability measure. Thus, the effective typicality expressed by convergence to the truth is extensionally equivalent with a principled effective typicality of the underlying probability measure.
Appendix A Examples of classical disintegrations
In this appendix, we review two classical examples of disintegrations. This also affords us the opportunity to illustrate natural circumstances in which Lévy’s Upward Theorem need not hold for all points. Another reason to dwell on these two examples is that one of the main theorems of Rohlin is that, up to Borel isomorphism, “blendings” of these two examples are the only examples of disintegrations of countably generated -algebras.939393[59, §4 pp. 40-41], [12, Theorem 1.12 p. 12]. Our diagrams in this appendix are inspired by the few diagrams in Einsiedler-Ward,949494[19, pp. 122-123]. although they only work with a single -algebra rather than a filtration.
Most concrete examples of disintegrations involve products. We write for the product measure on formed from finite measure on and finite measure on .
Example A.1.
(Refined partitions of the unit square). Let with measure being the product of Lebesgue measure on with itself. Let be the dyadic partition of into many squares, and let be the -algebra it generates. We can visualise the elements of as any shape one can form from the squares in the below diagrams, so that like in pixelations more detailed shapes become available as gets larger:
In this diagram, we use the familiar diagrammatic conventions from point-set topology to indicate which components of the partition contain the edges: for instance, for , the southwest square and the northwest square have two edges, the southeast square has three edges, and the northeast square has four edges. Then the following map is a disintegration of , where is again the set of finite Borel measures on :
(A.1) |
Further, using equation (1.1) from §1.2, one has the following associated formula for the version of conditional expectation of with respect to :
(A.2) |
Suppose that at stage , the agent’s world is located in the square from . Intuitively this means that the agent’s evidence at this stage of inquiry is . Then (A.2) says that the agent’s best estimate as to the value of a random variable at this stage is obtained by averaging over the event according to the prior probability measure , and then making it higher to the extent that the prior probability is lower. In the case where the random variable is the indicator function of a Borel event , this best estimate is just the usual conditional probability . For instance, if is the closed polygon displayed below, then the conditional probability of the agent at stage is higher than if she were at another world iff there is more overlap between than between :
This example also vividly illustrates how can fail. For instance, take the vertex , which is in the polygon since it is closed. For all , this point is the southwest vertex of a dyadic square in , and such a square overlaps the closed polygon only at this vertex. Hence for the usc function , one has both and for all .
In simple examples like this one, geometric intuition can guide us as to what points holds. The further assurance the classical version of Levy’s Upward Theorem provides is that holds on a set of -probability one, even when geometric intuition is unavailable. The additional assurance that Theorems 1.5,1.8, 1.11 provides is that holds on the random points relative to , for a large class of effective random variables . From this perspective, the problem with our vertex is that it is not sufficiently random, which enabled us to construct a random variable which failed to converge to the truth at this point.
Example A.2.
(Refined lines in the unit square) Let with Lebesgue measure being the product of Lebesgue measure on with itself. Let be the -algebra on generated by events of the form , where is Borel and where is from a dyadic partition of into (half)-closed intervals of equal length. Intuitively, is the -algebra of evidence where the agent knows everything there is to know about the -component at the outset, but is progressively learning more about the -component. Since events of the form are in , we can depict the -algebra as the decomposition of into vertical lines. Likewise, we can depict as the decomposition of into half vertical lines, etc. While we draw only ten such vertical lines in the below diagram, the idea is that is being decomposed into continuum-many such vertical lines at each stage:
Then the following map is a disintegration of , where is the Dirac measure centred on :
(A.3) |
Further, using equation (1.1) from §1.2, one has the following associated formula for the version of conditional expectation of with respect to :
(A.4) |
Suppose that at stage , the agent’s world is such that its second coordinate located in the interval from . Intuitively this means that the agent’s evidence at this stage of inquiry is the line . Then (A.2) says that the agent’s best estimate as to the value of a random variable at this world and stage is obtained by defining the one-place random variable and then doing a one-dimensional analogue of the update in Example A.1. For instance, if is the displayed closed triangle and we consider the usc function , then is obtained by calculating the length of the line , and then by multiplying by a factor of which is responsive to smaller partitions of the -axis involving less likely events. We illustrate this with respect to the marked point in the below diagram, where the line is indicated with a heavier dark line:
In contrast to the previous Example A.1, in this example many of the events in have measure zero according to the prior probability measure . Like in the previous Example A.1, we have natural pointwise failures of here as well: the rightmost vertex of the triangle displays the same kind of failure as in the previous example. And like in that case, the interpretation suggested by Theorems 1.5,1.8, 1.11 is that the vertex is insufficiently random.
Appendix B Examples of effective disintegrations
In this section, we describe several examples of effective disintegrations (cf. Definition 1.3). We focus for the most part on effectivizing the two paradigmatic Examples A.1-A.2 from the previous appendix, but we also include a countable product (Example B.13). In a sequel to this paper, we look also at Bayesian parameter spaces and sample spaces.
One example like Example A.1 is already widely-used in algorithmic randomness, although it is not usually thematized as such:
Example B.1.
(The canonical concrete refined partition disintegrations).
Suppose that is a computable tree with no dead ends. Let , the paths through , which is a computable Polish space, and suppose in is computable with full support.
Suppose that is the effective refined partition generated by the length strings in . That is, is generated by the sets of paths in through the length strings .
Let by .
Then is a Martin-Löf disintegration of with respect to and one has the following expression for the version of conditional expectation, which is defined for all in and all in :
(B.1) |
Since we want to generalise this in what follows, we defer the verification that it is a Martin-Löf disintegration.
The following is an effective disintegration like Example A.2:
Example B.2.
(The canonical concrete refined lines disintegrations).
Suppose that are computable trees with no dead ends. Let and , the paths through respectively, which are computable Polish spaces, and suppose in and in are computable with full support.
Let be the -algebra on generated by sets of the form , where ranges over c.e. opens from a -computable basis on , and where ranges over length strings in .
Then the map given by is a Martin-Löf disintegration of with respect to , and one has the following expression for the version of conditional expectation, which for each in is defined for -a.s. many in :
(B.2) |
Note that is free under the integral sign. Since there are no continuity assumptions on (it is merely an element of ) the value of the conditional expectation in (B.2) can apriori change drastically with small changes of . By contrast, in (B.1), ’s contribution is restricted to its first bits.
In what follows, we want to generalise these two examples to a broader class of computable Polish spaces and verify that they are indeed Martin-Löf disintegrations. We begin by generalising the way in which the previous examples involve partitions. We define a special case of Definition 1.1(11):
Definition B.3.
A sub--algebra of the Borel sets on is a -effective partition if it is generated by a computable sequence of events from the algebra generated by -computable basis such that the events are a partition of .
Given such a partition with its computable index set , we define the c.e. set .
Further, a -effective softening of is a pairwise disjoint computable sequence of c.e. opens such that on .
Proposition B.4.
Every effective partition has an effective softening.
Proof.
Since the computable sequence comes from the algebra generated by a -computable basis, by Proposition 2.13, there is a computable sequence of c.e. opens and effectively closed with and on . Then define recursively the sequence of c.e. opens by and . ∎
Softenings of full partitions are a canonical way to obtain almost-full effective filtrations (cf. Definition 1.1(12)):
Proposition B.5.
Suppose that is a full -effective partition equipped with effective softenings which generate -algebras . Then is an almost-full -effective partition.
Proof.
Uniformly from an index for a c.e. open , we can compute an index for a sequence from the sequences which generate the filtration such that . Each is equal on to , where the latter comes from the softening. Then is equal on to . ∎
Here is how to organise a suitably generalised version of a single stage of the filtration of Example A.1:
Proposition B.6.
Let be a computable point of . Let be an effective partition of with effective softening .
Let by .
Then is a Martin-Löf disintegration of with respect to and one has the following expression for the version of conditional expectation, which is defined for all in and all in :
(B.3) |
Proof.
By the definition of effective softening, the sets for distinct in have empty intersection. Hence, the map has codomain . In particular, if is in for in , then , which is in and hence in . But if not in any for in , then , which is a point of .959595If one does not introduce softenings, then this part of the argument breaks down and need not be a finite measure.
By the definition in (1.1) and since for in , one has the following for all in :
If in , then when we integrate over in with respect to we then get
If not in then the event is -null and so trivially we get:
Since elements generate , this shows that (B.3) is a version of the conditional expectation of with respect to . Further, it is totally defined for all in and all in .
Since is an effective partition, one has that for in . Further, for in , we have that in . Hence for in we have . The same argument works for and .
Suppose that is c.e. open and is rational. Since the are pairwise disjoint, one has that iff there is in such that is in and , which is a c.e. open condition in variable . ∎
Here is how to organise a suitably generalised version of the initial step of the filtration of Example A.2, that is, where the partition on the second component consists just of a single set (like in Example A.2).
Proposition B.7.
Suppose that are computable Polish spaces. Suppose that is a computable point of and is a computable point of . Let be the -effective -algebra on generated by sets of the form , where ranges over c.e. opens from a -computable basis on . Then given by is a Martin-Löf disintegration of with respect to , and one has the following expression for the version of conditional expectation, which for each in is defined for -a.s. many in :
(B.4) |
Proof.
By the definition in (1.1) and by Fubini-Tonelli, one has the following for in , and by the same theorem it is defined for -a.s. many in , and hence for -a.s. many in :
Since from does not appear free in this last term, when we integrate with respect to in we get:
Hence for Borel subsets of we have by Fubini-Tonelli that:
This shows that (B.4) is a version of the condition expectation of with respect to .
Note that , for any . Further, recall that .969696One can easily check this by hand. It also follows from the fact that Kurtz randomness is preserved both ways under computable continuous open maps. Hence for in one has the identity:
From this we get .
In this next paragraph, we use some notation familiar from Fubini-Tonelli, namely if and in , then is defined to be .
For Schnorr disintegrations, suppose that is in , so that is in . Since , we want to show that , where is the event . We have . By choosing a Turing degree which computes a fast Cauchy sequence for , we have by van Lambalgen’s Theorem that and so .979797This “hard” direction of van Lambalgen’s Theorem works in arbitrary computable Polish spaces with computable measures, basically because it is Fubini-Tonelli type argument. It similarly works for . For the setting of Cantor space with uniform measure, see discussion in [15, pp. 257-258, 357]. Then by Fubini-Tonelli , which is what we wanted to show. The argument for Martin-Löf disintegrations is similar.
Suppose that is c.e. open. Then we can write where are computable sequences of c.e. opens with and . Then for rational , we have iff there is with , which happens iff there is with , which happens iff there is with in and . This is a c.e. open condition in variables . ∎
Finally, we can combine partitions and lines as follows, which gives a suitably generalised version of an individual step in the filtration from Example A.2 (like or in that example).
Proposition B.8.
Suppose that are computable Polish spaces. Suppose that is a computable point of and is a computable point of .
Suppose is an effective partition of with effective softening .
Let be the -effective -algebra on generated by sets of the form , where ranges over c.e. opens from a -computable basis on . Then the map given by is a Martin-Löf disintegration of with respect to , and one has the following expression for the version of conditional expectation:
(B.5) |
Another variant on Example A.2 is
Proposition B.9.
Let , where is a computable sequence of computable Polish spaces. Let a computable point of be given by , where is a computable sequence in . Let . Let be the -algebra on generated by sets of the form , where for ranges over c.e. opens from a -computable basis on . For in , write its coordinates as . Then given by is a Martin-Löf disintegration of with respect to , and one has the following expression for the version of conditional expectation, which for each in is defined for -a.s. many from , and where and
Proof.
Simply apply Proposition B.7 to , where and and and . ∎
The simplest kind of an effective filtration, which occurs in both Examples B.1-B.2 is the following:
Definition B.10.
Suppose that is a computable Polish space. Suppose that is a computable tree with no dead ends. Let . Suppose that is an effective partition , uniformly in . If the partitions refine one another, in that for all in , then the is an effective filtration, which we call an effective refined partition.
The following isolates the natural sufficient condition for an effective refined partition to be full, and this condition is obviously met in Example B.1:
Proposition B.11.
Suppose that the effective refined partition satisfies the following properties:
-
–
Effectively Shrinking: There is a computable function such that for all rational one has that for all in with .
-
–
Effectively non-empty: There is a uniformly computable sequence of points in .
Then the effective refined partition is full.
Proof.
(Sketch) The two conditions imply that there is a well-defined computable continuous surjection given by iff . For fullness, suppose that is c.e. open. Then is c.e. open. Then there is c.e. set such that . Then one can check that . ∎
We can also obtain full effective filtrations by combining lines with full effective partitions, as in B.2:
Example B.12.
Suppose that are computable Polish spaces. Suppose that is a computable point of and suppose that is a computable point of .
Suppose that is a full effective partition of (resp. almost-full effective partition of ).
Let be the effective -algebra .
Then is a full effective filtration of (resp. almost-full effective filtration of ).
Another example of a full effective filtration is related to countable products:
Example B.13.
The effective -algebras from the countable products Example B.9 is a full effective filtration. This is because every c.e. open can be uniformly written as , for some c.e. index set , where , where is uniformly c.e. open in for .
Of course, this example is the same as that of full effective partitions when is uniformly countable. However, this example goes beyond that of full effective partitions when the are uncountable.
References
- [1] Nathanael L Ackerman, Cameron E Freer, and Daniel M Roy, On computability and disintegration, Mathematical Structures in Computer Science 27 (2017), no. 8, 1287–1314.
- [2] Gordon Belot, Bayesian orgulity, Philosophy of Science 80 (2013), no. 4, 483–503.
- [3] Laurent Bienvenu, Rupert Hölzl, Joseph S Miller, and André Nies, Denjoy, Demuth and density, Journal of Mathematical Logic 14 (2014), no. 01, 1450004.
- [4] Patrick Billingsley, Convergence of probability measures, Wiley, 2013.
- [5] David Blackwell and Lester Dubins, Merging of opinions with increasing information, Annals of Mathematical Statistics 33 (1962), no. 3, 882–886.
- [6] Vladimir I Bogachev, Weak convergence of measures, American Mathematical Society, 2018.
- [7] Vasco Brattka, Joseph S Miller, and André Nies, Randomness and differentiability, Transactions of the American Mathematical Society 368 (2015), no. 1, 581–605.
- [8] Gemma Carotenuto and André Nies, Lightface -completeness of density sets under effective wadge reducibility, Pursuit of the Universal, Springer, 2016, pp. 234–239.
- [9] Douglas Cenzer, -Classes in computability theory, Handbook of Computability Theory (Edward R Griffor, ed.), Studies in Logic and the Foundations of Mathematics, vol. 140, North-Holland, Amsterdam, 1999, pp. 37–85.
- [10] Douglas Cenzer and Jeffrey B Remmel, Effectively closed sets, May 2017.
- [11] J T Chang and D Pollard, Conditioning as disintegration, Statistica Neerlandica 51 (1997), no. 3, 287–317.
- [12] Vaughn Climenhaga and Anatole Katok, Measure theory through dynamical eyes, (2012), https://arxiv.org/abs/1208.4550.
- [13] Natasha L Dobrinen and Stephen G Simpson, Almost everywhere domination, Journal of Symbolic Logic 69 (2004), no. 3, 914–922.
- [14] J L Doob, Measure theory, Graduate Texts in Mathematics, vol. 143, Springer, 1994.
- [15] Rod Downey and Dennis Hirschfeldt, Algorithmic randomness and complexity, Springer, Berlin, 2010.
- [16] R M Dudley, Real analysis and probability, Cambridge University Press, 2002.
- [17] Rick Durrett, Probability: Theory and examples, fourth ed., Cambridge University Press, 2010.
- [18] John Earman, Bayes or bust? A critical examination of Bayesian confirmation theory, MIT Press, Cambridge, 1992.
- [19] Manfred Einsiedler and Thomas Ward, Ergodic theory: with a view towards number theory, Graduate Texts in Mathematics, vol. 259, Springer, 2010.
- [20] Jean-Pierre Florens, Velayoudom Marimoutou, and Anne Peguin-Feissolle, Econometric modeling and inference, Cambridge University Press, 2007.
- [21] Gerald B. Folland, Real analysis, second ed., Pure and Applied Mathematics, Wiley, 1999.
- [22] Peter Gács, Uniform test of algorithmic randomness over a general space, Theoretical Computer Science 341 (2005), no. 1, 91–137.
- [23] Su Gao, Invariant descriptive set theory, CRC Press, 2008.
- [24] Vassilios Gregoriades, Tamás Kispéter, and Arno Pauly, A comparison of concepts from computable analysis and effective descriptive set theory, Mathematical Structures in Computer Science 27 (2017), no. 8, 1414–1436.
- [25] Allan Gut, Probability: A graduate course, Springer, 2013.
- [26] Matthew Harrison-Trainor, Alexander Melnikov, and Keng Meng Ng, Computability of Polish spaces up to homeomorphism, Journal of Symbolic Logic 85 (2020), no. 4, 1664–1686.
- [27] Peter G Hinman, A survey of Mučnik and Medvedev degrees, Bulletin of Symbolic Logic 18 (2012), no. 2, 161–229.
- [28] Colin Howson and Urbach Peter, Scientific Reasoning: The Bayesian Approach, third ed., Open Court, Chicago, 2006.
- [29] Mathieu Hoyrup, Calculabilité, aléatoire et théorie ergodique sur les espaces métriques, Ph.D. thesis, Université Paris-Diderot - Paris VII, 2008.
- [30] Mathieu Hoyrup and Cristóbal Rojas, Computability of probability measures and Martin-Löf randomness over metric spaces, Information and Computation 207 (2009), no. 7, 830–847.
- [31] Mathieu Hoyrup, Cristóbal Rojas, and Klaus Weihrauch, Computability of the Radon-Nikodym derivative, Models of Computation in Context (Benedikt Löwe, Dag Normann, Ivan Soskov, and Alexandra Soskova, eds.), Lecture Notes in Computer Science, vol. 6735, Springer, Berlin, Heidelberg, 2011, pp. 132–141.
- [32] Mathieu Hoyrup and Jason Rute, Computable measure theory and algorithmic randomness, Handbook of Computability and Complexity in Analysis (Vasco Brattka and Peter Hertling, eds.), Springer, 2021, pp. 227–270.
- [33] Olav Kallenberg, Foundations of modern probability, Springer, New York, NY, 2002.
- [34] Alexander S. Kechris, Classical descriptive set theory, Graduate Texts in Mathematics, vol. 156, Springer, 1995.
- [35] Mushfeq Khan, Lebgesgue density and Classes, Journal of Symbolic Logic 81 (2016), no. 1, 80–95.
- [36] Stuart Alan Kurtz, Randomness and genericity in the degrees of unsolvability, Ph.D. thesis, University of Illinois at Urbana-Champaign, 1981.
- [37] Leonid Anatolievich Levin, Uniform tests of randomness, Soviet Mathematics Doklady 17 (1976), no. 2, 337–340.
- [38] P Lévy, Théorie de l’addition des variables aléatoires, Gauthier-Villars, Paris (1938).
- [39] Ming Li and Paul Vitányi, An introduction to Kolmogorov complexity and its applications, second ed., Graduate Texts in Computer Science, Springer, 1997.
- [40] Donald Martin, Measure, category, and degrees of unsolvability, Unpublished (1967).
- [41] Per Martin-Löf, The definition of random sequences, Information and Control 9 (1966), 602–619.
- [42] by same author, On the notion of randomness, Intuitionism and Proof Theory (Akiko Kino, John Myhill, and Richard E Vesley, eds.), North-Holland, Amsterdam, 1970, pp. 73–78.
- [43] Joseph S Miller, Degrees of unsolvability of continuous functions, Journal of Symbolic Logic 69 (2004), no. 2, 555–584.
- [44] Kenshi Miyabe, Characterization of kurtz randomness by a differentiation theorem, Theory of Computing Systems 52 (2013), no. 1, 113–132.
- [45] by same author, -Computability, layerwise computability, and Solovay reducibility, Computability 2 (2013), 15–29.
- [46] by same author, Algorithmic randomness over general spaces, Mathematical Logic Quarterly (2014).
- [47] Kenshi Miyabe, André Nies, and Jing Zhang, Using almost-everywhere theorems from analysis to study randomness, Bulletin of Symbolic Logic 22 (2016), no. 3, 305–331.
- [48] Yiannis N Moschovakis, Descriptive set theory, second ed., Mathematical Surveys and Monographs, vol. 155, American Mathematical Society, Providence, RI, 2009.
- [49] Keng Meng Ng, Frank Stephan, Yue Yang, and Liang Yu, Computational aspects of the hyperimmune-free degrees, Proceedings of the 12th Asian Logic Conference, World Scientific, 2012, pp. 271–284.
- [50] André Nies, Computability and randomness, Oxford University Press, 2008.
- [51] K R Parthasarathy, Probability measures on metric spaces, Academic Press, 1967.
- [52] Noopur Pathak, Cristóbal Rojas, and Stephen G. Simpson, Schnorr randomness and the Lebesgue differentiation theorem, Proceedings of the American Mathematical Society 142 (2014), no. 1, 335–349.
- [53] David Pollard, A user’s guide to measure theoretic probability, Cambridge University Press, 2002.
- [54] Christopher P Porter, On analogues of the Church–Turing Thesis in algorithmic randomness, Review of Symbolic Logic 9 (2016), no. 3, 456–479.
- [55] Malempati M Rao, Stochastic processes - inference theory, second ed., Springer, 2014.
- [56] Jan Reimann, Randomness—beyond Lebesgue measure, Logic Colloquium 2006 (S Barry Cooper, Herman Geuvers, Anand Pillay, and Jouko Vaananen, eds.), Cambridge University Press, Cambridge, 2010, pp. 247–279.
- [57] Robert Rettinger and Xizhong Zheng, Computability of real numbers, Handbook of Computability and Complexity in Analysis (Vasco Brattka and Peter Hertling, eds.), Springer, 2021, pp. 3–28.
- [58] V A Rohlin, On the fundamental ideas of measure theory, Matematicheskii Sbornik (1949), Translated in [59].
- [59] by same author, On the fundamental ideas of measure theory, Functional analysis and measure theory, American Mathematical Society, 1962, pp. 1–54.
- [60] V A Rokhlin, Lectures on the entropy of measure-preserving transformations, Russian Mathematical Surveys 22 (1967), no. 5.
- [61] Jason Rute, Algorithmic randomness, martingales, and differentiability I, preprint (2012), This is an expanded version of [62, Chapter 2].
- [62] by same author, Topics in algorithmic randomness and computable analysis, Ph.D. thesis, Carnegie Mellon University, 2013.
- [63] Filippo Santambrogio, Optimal transport for applied mathematicians: Calculus of variations, PDEs, and modeling, Birkhauser, 2015.
- [64] Claus-Peter Schnorr, Zufälligkeit und Wahrscheinlichkeit. Eine algorithmische Begründung der Wahrscheinlichkeitstheorie, Lecture Notes in Mathematics, vol. 218, Springer, 1971.
- [65] by same author, A survey of the theory of random sequences, Basic Problems in Methodology and Linguistics: Part Three of the Proceedings of the Fifth International Congress of Logic, Methodology and Philosophy of Science, London, Ontario, Canada-1975, Reidel, Dordrecht, 1977, pp. 193–211.
- [66] Glenn Shafer and Vladimir Vovk, Game-Theoretic foundations for probability and finance, Wiley, 2019.
- [67] Glenn Shafer, Vladimir Vovk, and Akimichi Takemura, Levy’s zero-one law in game-theoretic probability, (2009), https://arxiv.org/abs/0905.0254.
- [68] A Shen, V A Uspensky, and N Vereshchagin, Kolmogorov complexity and algorithmic randomness, American Mathematical Society, 2017.
- [69] Stephen G Simpson, Mass problems and randomness, Bulletin of Symbolic Logic 11 (2005), no. 1, 1–27.
- [70] Stephen G. Simpson, Subsystems of second order arithmetic, second ed., Cambridge University Press, Cambridge, 2009.
- [71] Robert I Soare, Turing computability: Theory and applications, Springer, 2016.
- [72] Alan M Turing, On computable numbers, with an application to the Entscheidungsproblem, Proceedings of the London Mathematical Society 2 (1937), no. 1, 230–265.
- [73] Jan Von Plato, Creating modern probability: Its mathematics, physics and philosophy in historical perspective, Cambridge University Press, 1994.
- [74] Hermann Weyl, Über die Gleichverteilung von Zahlen Mod. Eins, Mathematische Annalen 77 (1916), 313–352.
- [75] David Williams, Probability with martingales, Cambridge Mathematical Textbooks, Cambridge University Press, Cambridge, 1991.
- [76] Xiaokang Yu, Measure theory in weak subsystems of second-order arithmetic, Ph.D. thesis, The Pennsylvania State University, 1987.
- [77] by same author, Radon-Nikodým theorem is equivalent to arithmetical comprehension, Logic and computation (Pittsburgh, PA, 1987), Contemporary Mathematics, vol. 106, American Mathematical Society, 1990, pp. 289–297.