Counting Functions for Random Objects in a Category
Abstract
In arithmetic statistics and analytic number theory, the asymptotic growth rate of counting functions giving the number of objects with order below is studied as . We define general counting functions which count epimorphisms out of an object on a category under some ordering. Given a probability measure on the isomorphism classes of the category with sufficient respect for a product structure, we prove a version of the Law of Large Numbers to give the asymptotic growth rate as tends towards of such functions with probability in terms of the finite moments of and the ordering. Such counting functions are motivated by work in arithmetic statistics, including number field counting as in Malle’s conjecture and point counting as in the Batyrev-Manin conjecture. Recent work of Sawin–Wood gives sufficient conditions to construct such a measure from a well-behaved sequence of finite moments in very broad contexts, and we prove our results in this broad context with the added assumption that a product structure in the category is respected. These results allow us to formalize vast heuristic predictions about counting functions in general settings.
1 Introduction
Distributions of arithmetic objects are commonly studied via counting functions. A classic example is the prime number theorem, which determines the asymptotic growth rate of
as tends to infinity. Counting functions on subsets of prime numbers is one way that we work to understand the distribution of the prime numbers. Cramér made rigorous a heuristic argument of Gauss for predicting the distribution of prime numbers in [Cra94], whereby a sequence of independent variables indexed by natural numbers and valued in are considered, for which with probability as suggested by the prime number theorem. By asking the same distribution questions of the sequence instead of the sequence of prime numbers, Cramér is able to prove that this sequence satisfies a number of properties conjectured to be true for the primes with probability . See [Gra95] for a summary of Cramér’s random model and some improvements.
In arithmetic statistics, other objects are frequently studied in this way. Given an ordering of the objects satisfying a Northcott property, one can ask for the asymptotic growth rate of the counting function
(1) |
as tends to . The standard examples are points on schemes (or stacks) bounded by height, whose asymptotic growth rates are predicted by the Batyrev-Manin conjecture [FMT89, BM90], and counting -extensions of a global field ordered by discriminant, whose asymptotic growth rates are predicted by Malle’s conjecture [Mal02, Mal04].
Malle’s conjecture in particular is closely related to the study of class group statistics, where one asks for the distribution of the -parts of the class group of an extension as varies over some family of extensions ordered by discriminant. This distribution is very often predicted to agree with a true probability distribution on random groups. For a non-exhaustive list of examples for statistics of very general unramified objects over a family of global fields see [CL84, FW89, BBH17, LWZB19]. Despite this close relation, there has been no attempt to make predictions for Malle’s counting function using similar random structures. Like with Cramér’s model, it is reasonable to consider that there exists a random group with some discriminant structure which models the absolute Galois group of the global field for which Malle’s conjecture holds with probability . Potentially the same is true for other counting functions in arithmetic statistics.
Another eye-catching feature of Malle’s conjecture is the Galois correspondence. -extensions are in a -to- correspondence with surjective homomorphisms . The recent work of Sawin–Wood [SW22] solved the moment problem for random objects in a category, where they proved in great generality that given a sequence on (isomorphism classes of) the category of finite objects that “do not grow too fast”, there exists a measure on (isomorphism classes of) the category of pro-objects whose finite moments are given by . In the setting of objects in a category, this is taken to mean
for each finite object . Here we see an immediate similarity with Malle’s conjecture - Malle’s conjecture can be interpreted as an asymptotic count for the number of epimorphisms from the absolute Galois group of to a fixed finite group. This suggests that the setting Sawin–Wood work in, which they state was built with statistics of unramified objects in mind, is also an excellent setting to model Malle’s conjecture. By extension, random models for other counting functions may also exist in the setting considered by Sawin–Wood as long as they can be interrpretted as counting epimorphisms.
The goal of this paper is to consider counting functions in the broad setting described by Sawin–Wood. The remainder of this introduction is separated into two subsections - the first defining a counting function with respect to an ordering for counting epimorphisms in a category (see Definition 1.1), and the second for stating Law of Large Numbers results to determine the asymptotic growth rate of these counting functions with probability (see Theorem 1.3). We will prove the Law of Large Numbers for random objects in a category in as great of generality as possible, with the intention of allowing these results to easily translate to a variety of other counting functions in other settings. In a forthcoming paper [Alb23], the author will apply these results to the case of Malle’s conjecture to both recreate and improve on existing predictions for the asymptotic growth rate. This will be done by constructing a category of “groups with local data”, applying the results of [SW22] to construct a measure on this category, and applying the results of this paper to prove that 100% of random groups with local data satisfy Malle’s conjecture.
We adopt the notation and terminology of [SW22] throughout. Let be a diamond category in the sense of [SW22, Definition 1.3] with countably many isomorphism classes, and suppose is a probability measure (so the whole space has measure ) on the isomorphism classes of corresponding pro-objects, , under the level topology with finite moments for each . Sawin–Wood give sufficient conditions for a sequence to correspond to such a measure in [SW22, Theorem 1.7 and 1.8]. Our results do not require to be constructed in this way, but the author expects that most interesting examples will come from the existence results proven by Sawin–Wood.
1.1 Counting Functions
To translate the setting given by [SW22] to that of counting functions more closely resembling (1), we very generally define an ordering to be a sequence of functions indexed by positive integers . We are motivated by classical orderings such as the discriminant and height functions, which correspond to being the characteristic function of an increasing chain of finite sets , i.e. a chain of subsets for which . In the general format described in (1), we would choose the ordering
The property that such functions have finite support is called the Northcott Property, and is known to hold for discriminant and height orderings among many others.
Definition 1.1.
The counting function on a pro-object ordered by is defined by
as a function of , when the series is convergent.
In the classical setting, a Northcott Property is used to guarantee that the counting function is well-defined by forcing the series to be finite. We remark that in the classical setting is taken to be bounded above by a real number , while Definition 1.1 restricts to only using integer bounds, . This will not make a difference for classical orderings, as the (norm of the) discriminant and height functions are integer-valued. However, it is an artifact of the methods we use that we ask the sequence to be indexed by a countable set rather than the uncountable set of positive real numbers.
In our general setting, we may relax the Northcott requirement and still guarantee an (almost everywhere) well-defined counting function. Fix a probability measure on the category of pro-objects with finite moments . For convenience, we package the sequence as a discrete measure with (note that while we require to be a probability measure, need not be).
Definition 1.2.
Fix a probability measure on the category of pro-objects with finite moments . We call an -ordering with respect to if
In other words, the are -functions for the discrete measure induced by . We will often omit “with respect to ” if the probability measure is clear from context.
We will prove in Lemma 3.1 that is well-defined as a function of for almost all whenever is an -ordering (i.e., there exists a measure set of on which the counting function is well-defined as a function on the positive integers), and that the expected value is given by
(2) |
In the classical case, if has finite support in an increasing chain as tends towards then the counting function is a sum of increasing length depending on , where the summands are random variables as varies according to . This is precisely the setting of the Law of Large Numbers, which suggests a much stronger result of the form
as with probability in some appropriate sense. Typically, there is some requirement of pairwise independence between the summands in order to prove such a statement (see [SS93] for the classical statement, and [Sen13] for a brief history of the Law of Large Numbers, including cases that relax the requirement of pairwise independence). The work of Korchevsky–Petrov [KP10] on nonegative random variables, building on work in [Ete83a, Ete83b, Pet09a, Pet09b], is particularly relevant to these counting functions, as is always nonegative. We will prove several different versions of the Law of Large Numbers for of varying strengths under sufficiently nice orderings.
1.2 Main Results
Take to be a diamond category with countably many isomorphism classes, the category of pro-objects, and a probability measure on with finite moments . Theorem 1.3 is our primary probabilistic result, while Theorems 1.4 and 1.5 are our primary category theoretic results.
Theorem 1.3 (Law of Large Numbers in a Category).
Let be a real-valued -ordering for which . Suppose there exists an integer and a non-decreasing function for which , and
(3) |
Then each of the following hold:
-
(i)
(Weak Law of Large Numbers)
as , where the “p.” stands for converges in probability with respect to .
-
(ii)
(Strong Law of Large Numbers) If we additionally assume that then
as , where the “a.s.” stands for converges almost surely with respect to .
-
(iii)
(Strong Law of Large Numbers) If we additionally assume that
-
•
is nonegative,
-
•
the counting function is almost everywhere nondecreasing, and
-
•
for a nondecreasing function for which ,
then
as , where the “a.s.” stands for converges almost surely with respect to .
-
•
The primary new contribution of Theorem 1.3 is the connection between counting functions and the Law of Large Numbers. The probabilistic content is largely standard, and in fact the proof of Theorem 1.3(iii) is essentially the same as that of [KP10]. We follow a standard technique for bounding probabilities using Chebyshev’s law and the first Borel-Cantelli lemma. Most standard references will have some version of this technique, sometimes in the context of the “method of moments”, such as [Fis11, Chapter 4]. This technique is ubiquitous to the point that examples can be found in proofs on Wikipedia [wik22] and a number of Math StackExchange posts such as [Mic17, Ele18].
If the are specifically characteristic functions with finite support in an increasing chain then Theorem 1.3(iii) is a special case of [KP10, Theorem 3]. We state and prove these results separately from work in [KP10] to allow for a more general class of orderings, although the proof requires no new ideas. For example, we can now directly consider orderings of the form
This type of ordering gives counting functions that count objects landing in the moving interval . These sorts of counting functions are used to avoid errant behavior for small objects, and can be useful for proving averaging results.
In cases for which , there is something we can say using the more general results Theorem 4.1 and Theorem 4.2 we prove in the main body of the paper. This requires a finer study of the rate of convergence, as we are no longer allowed to have on the denominator for such cases.
The bound on moments in (3) is a concrete way to state that the random variables are “close enough to independent” for the Law of Large Numbers. Our primary categorical result is on bounding this quantity in terms of purely categorical structures of and together with the discrete moments of the ordering .
Given an ordering , we extend the ordering multiplicatively to the product category by
We also define the discrete measure given by the mixed moments of the measure as
This is sufficient information to prove a bound for (3) for .
Theorem 1.4.
Let be a real-valued -ordering. Then
where is the full subcategory of pairs such that
-
(a)
the epi-product exists (defined in Definition 5.1), and
-
(b)
.
Taking away the full subcategory of is the new content, and allows for bounds on the higher moments that can be computed solely from data given by , , and without reference to the measure . The epi-product in condition (a) is a special type of product structure in a category, satisfying a similar universal property to the direct product with every morphism in the diagram being an epimorphism (Definition 5.1). We will show that the existence of an epi-product together with (b) implies the random variables and are uncorrelated. This will be key for bounding the moments. It will often be the case that determining the existence or nonexistence of epi-products is easier than approaching the bound (3) in Theorem 1.3 directly. We include some examples using this approach in Section 6.
We will prove a similar bound for (3) when is even, but this will require more assumptions on the category. These extra assumptions can prove useful for applying Theorem 1.4, as they give us extra tools for calculating the integral with respect to . Lemma 5.3 will be very useful in this regard, and we give an example demonstrating its use in Section 6.
Theorem 1.5.
Suppose is a category with for which every morphism factors uniquely (up to isomorphism) as a composition of an epimorphism with a monomorphism. Let be a real-valued -ordering and a positive integer. Then
where is the full subcategory of tuples for which there exists an index such that for each subset with
-
(a)
the product object exists in ,
-
(b)
has finitely many subobjects up to isomorphism,
-
(c)
the epi-product exists for each subobject (defined in Definition 5.1), and
-
(d)
for each subobject .
Unique factorization of morphisms is known for many categories of interest, including finite sets, finite groups, and finite modules over a ring. In fact, each of these categories automatically satisfies parts (a) and (b) as well. Thus, in numerous categories of interest, Theorem 1.5 requires no extra input over Theorem 1.4.
We also prove an upper bound utilizing simpler categorical structures that holds without the restrictions of Theorem 1.3.
Corollary 1.6.
Let be a diamond category which has countably many isomorphism classes, be a probability measure on the isomorphism classes of the corresponding category of pro-objects with finite moments , and an -ordering.
Then for any and any positive integer
as , where the “a.s.” stands for converges almost surely with respect to .
Corollary 1.6 is motivated by Malle’s predicted weak upper bound, although the correspondence is less than obvious. In the classical case is a nonegative function so that , and when the denominator is given by
Ignoring the other values of and taking sufficiently large, Corollary 1.6 appears to give an upper bound of the form
with probability , where we recall that is playing the role of in Malle’s counting function. Of course, we cannot actually ignore the other values of , and depending on and it is possible for these bounds to be worse.
Theorem 1.3 and Corollary 1.6 can be used to provide evidence for conjectures about counting functions in great generality. We state this as an explicit heuristic:
Heuristic 1.7 (Vast Counting Heuristic).
Let be a diamond category which has countably many isomorphism classes and let be a probability measure on the isomorphism classes of the corresponding category of pro-objects with finite moments . Choose an -ordering .
Let be an element of for which is well-defined. If is an object we expect to behave typically∗ with respect to the probability measure , such as an object coming from arithmetic, then
- (i)
-
(ii)
(Weak form) For each and each positive integer
The asterisk on “typically” is meant to highlight that this word is where the intricacies of counting conjectures lie, and in particular is in recognition of counter-examples to Malle’s conjecture and the like. For number field counting, Malle’s original conjecture is incorrect due to behavior coming from roots of unity that Malle did not initially consider. For the purpose of a general statement, this can be interpreted as saying “if there are no obstructions, this is what we expect”. Heuristic 1.7 provides a baseline for counting predictions when we believe there are no “non-random” structures of missing from the measure on the category . Issues, like those occurring in Malle’s conjecture from roots of unity, might be fixed by either (a) incorporating the missing structure into a modified version of the category (this is analogous to the approach commonly take in the study of Malle’s conjecture), or (b) excluding epimorphisms from the count which are affected by the missing structure (this is the approach taken for the Batyrev-Manin conjecture).
The literature on proving new cases of the Law of Large Numbers is vast, and it is possible that Theorem 1.3 may be extended with finer probabilistic arguments. From a categorical perspective, we’ve stated Theorem 1.4, Theorem 1.5, Corollary 1.6 and Heuristic 1.7 in the greatest generality possible so that they might be applied to make predictions on a variety of different counting functions in arithmetic statistics independent of the author’s immediate knowledge. We hope that these results will be useful as a starting point for many future predictions for the asymptotic growth rates of counting functions.
1.3 Layout of the Paper
We begin in Section 2 with a discussion of what we mean by an asymptotic version of the Law of Large Numbers. This is conceptually important for understanding the methods of this paper, as the references to the Law of Large Numbers are not just in analogy. Korchevskey–Petrov’s result [KP10, Theorem 3] is a particular example of the type of Law of Large Numbers result we consider.
We prove well-definedness of the counting function in Section 3. We give the probabilistic proofs of Theorem 1.3 and Corollary 1.6 in Section 4. These proofs closely follow the methods of [KP10]. We define epi-products in Section 5, prove some useful features of these objects, and prove Theorems 1.4 and 1.5.
Lastly, we include Section 6 to work out a couple of important special cases. The arithmetic statistics questions that primarily motivated the author will require independent work to set up the necessary category, measure, and orderings which the author will present in a forthcoming paper [Alb23]. We instead include simpler examples where these steps are either shorter or done by previous work such as in [SW22], and we pay particular attention to when epi-products exist in such cases. Despite the easier nature of these examples, we recover existing important random models from number theory as special cases of Theorem 1.3 and Heuristic 1.7.
Acknowledgments
The author would like to thank Melanie Matchett Wood for feedback and helpful discussions on the mechanics on [SW22].
2 An asymptotic version of the Law of Large Numbers
Given a sequence of independent, identically distributed random variables the Law of Large Numbers says that we expect to converge to the expected value of in some sense. A version of this statement can sometimes also hold for random variables which are neither independent nor identically distributed. If the variables are not identically distributed, then may vary depending on , which necessarily alters the statement we want to make. We will instead be comparing
with |
as . When the random variables are pairwise independent, identically distributed, and have bounded variance in some sense, the difference of these is known to grow slower than almost surely by Kolmogorov’s strong law [Ete81, SS93]. Rather than a convergence statement, we will state the Law of Large Numbers as an asymptotic statement. The conclusion of Kolmogorov’s strong law would then be written as
We can understand this term from the perspective of asymptotic vounting functions as picking out a “main term + O(error term)”. Kolmogorov’s strong law can then be understood as
as with probability . It is often the case with the Law of Large Numbers that such results are written as
The statements of Theorem 1.3 can be interpretted as something similar to this. is defined to be a sum of random variables , so if we label we can understand the counting function as
The ordering is analogous to a statement of the form “”, although our results apply to more general orderings. We can then see that behaves like a sum of random variables , whose length is growing with . The philosophy of the Law of Large Numbers suggests that we should expect this to be close to . However, there is no reason to expect an error term like to hold. This isn’t precisely what we want, we are interested in a statement of the form
with probability . This error does not necessarily agree with .
We consider two examples to demonstrate the intricacies of the error term.
-
•
Consider the following example: Let be a random variable equal to with probability and with probability , and consider the dependent sequence . Classical SLLN certainly holds for this sequence, as
However, the are identically 0 with probability , so
This example satisfies the error term, but it is certainly not what we want. What is happening here is that
right from the start, so the classical error for the Law of Large Numbers does not actually tell us much about how close is to .
-
•
If we instead work backwards, we might ask for error terms which are smaller than
so that the sum of expected values can rightly be recognized as the main term. In the classical setting with identically distributed random variables with nonzero expected value , these notions coincide as
In fact, this is known more generally. Korchevsky–Petrov [KP10, Theorem 3] give sufficient conditions for when
as , when is a sum of nonegative random variables. Theorem 1.3(iii) is based on their result.
However, in the case that for all this is asking for a trivial error term and a trivial main term. This is certainly an issue, as in the classical case with a sequence of independent identically distributed random variables with mean the central limit theorem forces the error term to be about (i.e., not ). The case that all expected values are zero is important in the classical study and applications of the Law of Large Numbers, so we do not want to exclude this case.
The condition in Theorem 1.3 is a consequence of the second bullet above. This corresponds to the condition that for all but finitely many , so that we can make sense of having on the denominator. This issue can alternatively be fixed avoided by relaxing what we want for an error term. Some options might include to guarantee the sum is nonzero unless are identically zero, or something of the form to give an error of a similar nature to the Central Limit Limit. We give a version of Theorem 1.3(i,ii) in Theorems 4.1 and 4.2 with explicit rates of convergence. These results can be used to produce probability statements even in the case that , although they will no longer be of the form
3 Well-defined counting functions
A sequence of such -functions is called an -ordering as in Definition 1.2. As in Definition 1.1, we define the counting function
for each . By allowing to have possibly infinite support, we need to make sure that the counting function is well-defined and the linearity of expectation behaves as we expect. In the case of -orderings, we prove this below:
Lemma 3.1.
If is an -ordering, then
-
(a)
is well-defined as a function on the positive integers almost surely (i.e. with probability ) with respect to , and
-
(b)
Proof.
This follows from the Fubini-Tonelli theorem. Indeed, being means
All probability spaces are -finite, so it follows from the Fubinit-Tonelli theorem that
Certainly implies almost surely, so it must be that for fixed the counting function
converges absolutely for almost all . As there are countably many , countable additivity implies is well-defined almost surely as a function on the positive integers.
The fact that the integrals of are finite also implies
by the Fubini-Tonelli Theorem. Evaluating the inner sum/integral gives
proving (b). ∎
4 Law of Large Numbers for categories
We prove Theorem 1.3 and Corollary 1.6 in this section. The probabilistic content of each proof is essentially Chebyshev’s bound, the Borel-Cantelli lemma, and a few tricks for nonegative orderings previously applied in [KP10]. We prove Theorem 1.3(i,ii) with explicit control on the rate of convergence, in part to make the proof of Corollary 1.6 more straight forward.
4.1 The Weak Law of Large Numbers
We prove the following result, which is Theorem 1.3(i) together with an explicit rate of convergence.
Theorem 4.1.
Let be a real-valued -ordering. Suppose there exists an integer , a non-decreasing function for which , and a function for which
(4) |
Then
as , where the “p.” stands for converges in probability with respect to .
Theorem 1.3(i) follows by taking for some small .
Proof.
If is a random variable with , then Chebyshev’s bound states that for each positive integer ,
(See, for instance, [MU17].) In the context of this proof, we let denote the expected value with respect to for the sake of convenience. Set
It then follows that
Taking , it follows that
for each . Thus,
This is the definition of converging in probability to , and so concludes the proof. ∎
4.2 The Strong Law of Large Numbers
We prove the following result, which is Theorem 1.3(ii) together with an explicit rate of convergence.
Theorem 4.2.
Let be a real-valued -ordering. Suppose there exists an integer , a non-decreasing function for which , and a function for which
(5) |
Additionally, assume that . Then
as , where the “a.s.” stands for converges almost surely with respect to .
Theorem 1.3(ii) follows by taking for some small .
Proof.
We proceed in a similar way to the proof of Theorem 4.1. Set
so that
It then follows from Chebyshev’s bound that
Taking , it follows that
for each . Thus, the Borel-Cantelli lemma implies
so that
By countable additivity and taking for tending towards , this implies converges to almost surely, concluding the proof. ∎
4.3 Almost Sure Upper Bounds
Proof of Corollary 1.6.
We bound
where the last line follows from being an function and moving the integral all the way to the inside. We remark that the sum over permutations is a bit larger than the truth, but this bound will be sufficient for our purposes. This is exactly a sum of mixed moments
noting that is invariant under the permuting the coordinates. Taking
and in Theorem 4.2 is sufficient to conclude the proof. ∎
4.4 The Strong Law of Large Numbers for nonnegative counting functions
Theorem 1.3(iii) takes advantage of some tricks employed by [KP10] for nonegative functions. The following proof is in essense the same as their main result, however the functions we allow are slightly more general than the sequence of weights utilized in [KP10, Theorem 1].
These tricks are not compatible with controlling the rate of convergence, so we do not state a more general result for this part. It is certainly possible that more can be proven for this case but we do not do so here.
Proof of Theorem 1.3(iii).
We remark that being nondecreasing almost everywhere implies
is also nondecreasing.
The primary trick is to prove convergence along a subsequence of indices first, then interpolate to the remaining indices.
Fix . Define the subsequence of natrual numbers by
These necessarily exist by with . We apply Chebyshev’s inequality to
with to prove that
where the second comparison follows from being nondecreasing. A straight foward exercise in the convergence of infinite series shows that if converges, then so does for any (for a reference, see [Pet09a, Lemma 1]). Similar to the proof of Theorem 4.2, the Borel-Cantelli Lemma then implies
Next, we interpolate to other indices not belonging to the subsequence . There exists an for which . We expand
Given that the counting function and its moments are nondecreasing almost everywhere, this produces an almost everywhere upper bound
The subsequence is nondecreasing, and if it follows that
We can simplify the upper bound to
Taking the limit as implies
Taking sufficiently close to gives the desired . The is calculated similarly using the lower bound
The limit as can again be made arbitrarily close to by taking close to . ∎
5 Epi-Products
We let be a diamond category as defined in [SW22, Definition 1.3] with the corresponding category of pro-objects. We fix throughout a probability measure on with finite moments .
Theorem 1.4 and Theorem 1.5 rely on the concept of an epi-product, which is a categorical construction capturing the lack of correlation between and as varies according to .
5.1 The Definition of an Epi-Product
Definition 5.1.
We define the epi-product of and to be an object with epimorphisms to and which satisfies the universal property
where all morphisms in the diagram (including the universal morphism) are required to be epimorphisms.
This is similar to the definition of a product, except we require all the morphisms (including the universal morphism) to be epimorphisms. Notice that we do not ask that has epi-products in general. This flexibility will allow us to choose a wider variety of categories to work over. We remark that if both the usual product and exist then there is necessarily a unique isomorphism between them. However, the existence of one does not imply the existence of the other. We will discuss some basic examples at the end of the paper.
Lemma 5.2.
Let be a probability measure on the pro-objects of with finite moments , and let vary with respect to . Then the random variables and for are uncorrelated if exists and .
Proof.
By the definition of uncorrelated, it suffices to prove that
The universal property of epi-products implies that
Therefore
∎
5.2 The Proof of Theorem 1.4
Proof of Theorem 1.4.
For simplicity in this proof, we write for the expected value with respect to . When is real-valued, the square is always positive and we can ignore the absolute value. Thus
By Lemma 5.2, implies and are uncorrelated. By definition, this is equivalent to . Therefore the sum simplifies to
The result then follows from the triangle inequality. ∎
5.3 The Proof of Theorem 1.5
Theorem 1.5 is proven similarly to Theorem 1.4, although the higher power is trickier to keep track of. In particular, some notion of “mutually uncorrelated” random variables is needed to compare the moments of three or more of the terms.
The extra categorical requirements in Theorem 1.5 are used to reframe the mixed moments as first moments, which allows us to more easily capture the required “mutually uncorrelated”ness using subobjects of product objects. This reframing is also useful for computing the mixed moments, so we state and prove it separately:
Lemma 5.3.
Suppose is a category with for which every morphism factors uniquely (up to isomorphism) as a composition of an epimorphism with a monomorphism. Let be objects for which the product exists in and has finitely many subobjects up to isomorphism. Then
where the sum is over subobjects up to isomorphism for which is also an epimorphism for each projection morphism .
Proof.
By the universal property of the product, there is an embedding
Partitioning this embedding based on the possible images gives a bijection to a disjoint union
Taking cardinalities and integrating with respect to concludes the proof. ∎
We can now prove Theorem 1.5.
Proof of Theorem 1.5.
For simplicity in this proof, we write for the expected value with respect to . When is real-valued and we take the moment, we can essentially ignore the absolute values. We then compute
where we can switch the order of the integral defining and the countable sum because is an -ordering.
We multiply out the product of random variables to produce
It suffices to show this quantity is when , as the fact that , , and are invariant under permutation implies
Now, suppose that , with distinguished index . We separate the term of the product out, and compute
By Lemma 5.2 and the properties of the distinguished index, it is necessarily the case that
for each . Therefore this is a sum of zeros, and cancels out as claimed. ∎
6 Examples
Determining the bound in Theorem 1.4 and Theorem 1.5 is fairly specific to the category, and the motivating examples of number field counting and the Batyrev-Manin conjecture take a fair bit of set up to define the corresponding diamond category , construct a measure modeling arithmetic behavior, translate classical orderings into a sequence , and compute the moments . All this before even considering the question of bounding the higher moments. The plus side is the immense return value of Theorem 1.3 and Corollary 1.6. If you understand when epi-products exist enough to apply these results for one ordering, then it is often the case that the same reasoning will apply to numerous other orderings on the same category.
The author will construct a model for Malle’s counting function and determine the reasonableness of a large collection of orderings in forthcoming work [Alb23]. For the purposes of this paper, we include some more basic examples where these steps are either easier or already done for us in works like [SW22].
6.1 Random subsets with independent elements
We prove a typical tool used to give random models for sets of integers as a special case of our methods:
Theorem 6.1.
Let be a random subset of where we let the events be pairwise independent with probability . This forms a true probability measure on , and for any any
as .
If we let be the probability that is prime (with and in order to make sense) this is precisely Cramér’s original random model for the set of primes [Cra94]. We reproduce Cramér’s main term with this asymptotic:
with probability . One easily checks using difference calculus that the main term is of the same order of magnitude as in agreement with the prime number theorem. The error we produce, while not as strong as Cramér’s, is still suggesting the truth of the Riemann hypothesis.
Proof.
Let be the category gotten by letting be the category of finite subsets of whose morphisms are inclusions. The pro-objects are all the subsets of . This category is incredibly nice for numerous reasons:
-
(a)
contains only the inclusion map if and is empty otherwise.
-
(b)
The product of and is given by the union , and the fact that all morphisms are epimorphisms implies this is an epi-product too.
-
(c)
Any sequence of finite moments on this category is well-behaved in the sense of [SW22], because the well-behavedness sum is always finite. A level is the power set for some finite set , while the elements with epimorphisms from are precisely the subsets of .
-
(d)
The Möbius function on the lattice of isomorphism classes ordered by epimorphisms is given by .
The function is then the characteristic function of the event , which among a level has expected value precisely , so we define this to be . Certainly whenever . We check that corresponds to a measure on the pro-objects using [SW22, Theorem 1.7]. Indeed,
so the measure exists.
We also consider that is the characteristic function of the event . Thus, we conclude that
so that . In this example, we have a very convenient bound for the finite moments of unions
which follows from the assumption that . In particular, this implies for each .
Any ordering supported on a family of pairwise disjoint sets necessarily satisfies that intersected with the support of is precisely the collection of tuples in the support of which has at most distinct coordinates. We choose to be the characteristic function of singleton sets for which . Up to the number of ways to choose matching coordinates, it suffices to consider objects in . Thus, we can bound
Noting that is supported on singleton sets, this is equivalent to
If , we can bound this by a constant times the leading term , and otherwise this is . Thus we conclude via Theorem 1.5 that
Taking and in Theorem 4.2 then implies
The counting function is given by
while the moments of the ordering are given by
Taking sufficiently large concludes the proof. ∎
Keeping in mind the connection with random models for prime numbers, we also prove the following:
Corollary 6.2.
Let be a random subset of where we let the events be pairwise independent with probability . Let be a characteristic function of some subset , and define the corresponding ordering supported on singleton sets by . Then
as .
If we let be the probabilities of Cramér’s model (or of the modifications improving the model), Heuristic 1.7 makes predictions for sets of prime numbers belonging to any subset obeying the incredibly mild density condition
for some . This prediction is extremely broad, where the currently “known issues” amount to divisibility by small primes which are accounted for in corrections to Cramér’s random model (see [Gra95] for a nice summary). This example is not “novel”, such measure statements exist throughout the literature on random models for the primes generalizing Cramér’s work as a means to justify a number of the most famous conjectures on the distribution of primes. This includes the likes of the Hardy-Littlewood conjecture (where is the characteristic function on admissible constellations starting from ) and primes of the form (where is the characteristic function of integers of the form ).
In addition to being a short example with nice properties, this is meant to demonstrate two things. Firstly this example demonstrates that important existing random models (and corresponding results for those models) for the distribution of prime numbers arise as special cases of Heuristic 1.7 (respectively Theorem 1.3) for the category of subsets of . Secondly, this example demonstrates the power of working at this level of generality. We have given a concrete description for when such probability results will exist, which allow us to tackle numerous orderings at a time.
6.2 Random groups
Any sequence on the category of finite groups is well-behaved in the sense of [SW22], so their results can be used to determine when such a moment sequence gives rise to a probability distribution. We will focus on the example discussed in [SW22]. We can use Theorem 1.3 to make predictions for (asymptotically) how many quotients of a “random” profinite group have a given behavior. While questions of this nature are easy enough to formulate, the moments of the most natural orderings can be much harder to compute in practice. Utilizing the results of the preceeding subsection, we give the following example:
Theorem 6.3.
Let on the category of finite abelian groups with associated probability measure on the profinite abelian groups. Then the average number of maximal subgroups in a random pro-abelian group of index bounded above by tends to in almost surely. More specifically,
as .
Proof.
We take the ordering
All maximal subgroups of an abelian group are normal with quotient isomorphic to for some prime , so we can compute
We note that is nonegative and that is certainly increasing in . Thus, we can apply Theorem 1.3(iii). We first compute
We next apply Theorem 1.4 to bound (3). The ordering is supported on the finite simple abelian groups , which are of pairwise coprime order. Thus, the intersection of with the support of is precisely the diagonal objects with . We compute
where the second equality follows from Lemma 5.3, noting that the subgroups of that surject onto each coordinate are the whole group, and subgroups isomorphic to . The integral actually converges as by a similar computation, so this is the maximum. All together Theorem 1.4 implies
where is non decreasing with . We confirm that converges, so the result follows from Theorem 1.3(iii). ∎
More general questions on random groups can be asked, but much about the asymptotic behavior of large families of groups is difficult to compute. This makes asymptotic growth rates and error terms also difficult to compute in practice. For instance, it would be nice to study the ordering if and otherwise. The corresponding counting function is then
This is a naturally interesting function to ask about. However, the corresponding moment
is not easy to determine. For instance, there are reasons to suspect that 100% of groups ordered by cardinality are -groups, but the author is not aware of a proof of this statement. Moreover, the bound in Theorem 1.4 requires some more serious group theory to translate the results of Lemma 5.3 into bounds on the error.
A number of interesting statistical questions that are simple to state arise in this way. The language of Theorem 1.3 is useful to create a framework for determining asymptotic growth rates, even if the moments of the ordering requires additional study to prove any results.
References
- [Alb23] Brandon Alberts. A random group with local data, 2023. Forthcoming.
- [BBH17] Nigel Boston, Michael R. Bush, and Farshid Hajir. Heuristics for -class towers of imaginary quadratic fields. Mathematische Annalen, 368(1-2):633–669, June 2017.
- [BM90] V. V. Batyrev and Yu. I. Manin. Sur le nombre des points rationnels de hauteur borné des variétés algébriques. Mathematische Annalen, 286(1-3):27–43, 1990.
- [CL84] Henri Cohen and Hendrik. W. Lenstra. Heuristics on class groups of number fields. Lecture Notes in Mathematics Number Theory Noordwijkerhout 1983, pages 33–62, 1984.
- [Cra94] Harald Cramér. Some theorems concerning prime numbers. Springer Collected Works in Mathematics, page 138–170, 1994.
- [Ele18] (https://math.stackexchange.com/users/7145/elements) Elements. Weak law of large numbers for dependent random variables with bounded covariance. Mathematics Stack Exchange, 2018. URL:https://math.stackexchange.com/q/245327 (version: 2018-04-10).
- [Ete81] N. Etemadi. An elementary proof of the strong law of large numbers. Z. Wahrscheinlichkeitstheorie und Verwandte Gebiete, 55(1):119–122, 1981.
- [Ete83a] N. Etemadi. On the laws of large numbers for nonnegative random variables. Journal of Multivariate Analysis, 13(1):187–193, 1983.
- [Ete83b] N. Etemadi. Stability of sums of weighted nonnegative random variables. Journal of Multivariate Analysis, 13(2):361–365, 1983.
- [Fis11] Hans Fischer. A history of the central limit theorem: From classical to modern probability theory. Springer New York, 2011.
- [FMT89] Jens Franke, Yuri I. Manin, and Yuri Tschinkel. Rational points of bounded height on fano varieties. Inventiones Mathematicae, 95(2):421–435, 1989.
- [FW89] Eduardo Friedman and Lawrence C. Washington. On the distribution of divisor class groups of curves over a finite field. Théorie des nombres / Number Theory, page 227–239, 1989.
- [Gra95] Andrew Granville. Harald cramér and the distribution of prime numbers. Scandinavian Actuarial Journal, 1995(1):12–28, 1995.
- [KP10] V. M. Korchevsky and V. V. Petrov. On the strong law of large numbers for sequences of dependent random variables. Vestnik St. Petersburg University: Mathematics, 43(3):143–147, 2010.
- [LWZB19] Yuan Liu, Melanie Matchett Wood, and David Zureick-Brown. A predicted distribution for Galois groups of maximal unramified extensions, July 2019. Preprint available at https://arxiv.org/abs/1907.05002.
- [Mal02] Gunter Malle. On the distribution of Galois groups. Journal of Number Theory, 92(2):315–329, 2002.
- [Mal04] Gunter Malle. On the distribution of Galois groups, II. Experimental Mathematics, 13(2):129–135, 2004.
- [Mic17] (https://math.stackexchange.com/users/155065/michael) Michael. Pairwise uncorrelated random variables in strong law of large numbers (slln). Mathematics Stack Exchange, 2017. URL:https://math.stackexchange.com/q/2545239 (version: 2017-11-30).
- [MU17] Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and probabilistic techniques in algorithms and data analysis. Cambridge University Press, 2017.
- [Pet09a] V. V. Petrov. On stability of sums of nonnegative random variables. Journal of Mathematical Sciences, 159(3):324–326, 2009.
- [Pet09b] V. V. Petrov. On the strong law of large numbers for nonnegative random variables. Theory of Probability &; Its Applications, 53(2):346–349, 2009.
- [Sen13] Eugene Seneta. A tricentenary history of the law of large numbers. Bernoulli, 19(4), 2013.
- [SS93] Pranab Kumar Sen and Julio da Motta Singer. Large sample methods in statistics: An introduction with applications. Chapman & Hall/CRC, 1 edition, 1993.
- [SW22] Will Sawin and Melanie Matchett Wood. The moment problem for random objects in a category, Oct 2022.
- [wik22] Law of large numbers, Oct 2022.