Counting Functions for Random Objects in a Category

Brandon Alberts

Abstract

In arithmetic statistics and analytic number theory, the asymptotic growth rate of counting functions giving the number of objects with order below $X$ is studied as $X\to\infty$. We define general counting functions which count epimorphisms out of an object in a category under some ordering. Given a probability measure $\mu$ on the isomorphism classes of the category with sufficient respect for a product structure, we prove a version of the Law of Large Numbers to give the asymptotic growth rate as $X$ tends towards $\infty$ of such functions with probability $1$ in terms of the finite moments of $\mu$ and the ordering. Such counting functions are motivated by work in arithmetic statistics, including number field counting as in Malle’s conjecture and point counting as in the Batyrev-Manin conjecture. Recent work of Sawin–Wood gives sufficient conditions to construct such a measure $\mu$ from a well-behaved sequence of finite moments in very broad contexts, and we prove our results in this broad context with the added assumption that a product structure in the category is respected. These results allow us to formalize vast heuristic predictions about counting functions in general settings.

1 Introduction

Distributions of arithmetic objects are commonly studied via counting functions. A classic example is the prime number theorem, which determines the asymptotic growth rate of

\pi(X)=\#\{p\text{ prime }\mid p\leq X\}\sim\frac{X}{\log X}

as $X$ tends to infinity. Counting functions on subsets of prime numbers are one way that we work to understand the distribution of the prime numbers. Cramér made rigorous a heuristic argument of Gauss for predicting the distribution of prime numbers in [Cra94], whereby a sequence of independent random variables $X_{n}$ indexed by natural numbers and valued in $\{0,1\}$ is considered, for which $X_{n}=1$ with probability $\frac{1}{\log(n)}$ as suggested by the prime number theorem. By asking the same distribution questions of the sequence $X_{n}$ instead of the sequence of prime numbers, Cramér is able to prove that this sequence satisfies, with probability $1$, a number of properties conjectured to be true for the primes. See [Gra95] for a summary of Cramér’s random model and some improvements.
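As a minimal worked sketch of why this model tracks the prime number theorem, the first two moments of the random count can be computed directly. The restriction to $n\geq 3$ (so that $\frac{1}{\log(n)}$ is a valid probability) and the symbol $\Pi(X)$ are assumptions made here for illustration and are not notation taken from [Cra94].

```latex
\Pi(X):=\sum_{3\leq n\leq X}X_{n},
\qquad
\mathbb{E}[\Pi(X)]=\sum_{3\leq n\leq X}\frac{1}{\log n}\sim\frac{X}{\log X},
\qquad
{\rm Var}(\Pi(X))=\sum_{3\leq n\leq X}\frac{1}{\log n}\left(1-\frac{1}{\log n}\right)\leq\mathbb{E}[\Pi(X)].
```

The variance is additive because the $X_{n}$ are independent. Since the variance is at most the mean, Chebyshev's inequality gives ${\rm Prob}\left(\lvert\Pi(X)-\mathbb{E}[\Pi(X)]\rvert\geq\epsilon\,\mathbb{E}[\Pi(X)]\right)\leq\frac{1}{\epsilon^{2}\mathbb{E}[\Pi(X)]}\to 0$, so the random count is asymptotic to $\frac{X}{\log X}$ in probability.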

In arithmetic statistics, other objects are frequently studied in this way. Given an ordering of the objects satisfying a Northcott property, one can ask for the asymptotic growth rate of the counting function

\displaystyle\#\{\text{object}\mid\text{order(object)}\leq X\}

(1)

as $X$ tends to $\infty$ . The standard examples are points on schemes (or stacks) bounded by height, whose asymptotic growth rates are predicted by the Batyrev-Manin conjecture [FMT89, BM90], and counting $G$ -extensions of a global field $K$ ordered by discriminant, whose asymptotic growth rates are predicted by Malle’s conjecture [Mal02, Mal04].

Malle’s conjecture in particular is closely related to the study of class group statistics, where one asks for the distribution of the $p$-parts of the class group of an extension $K/\mathbb{Q}$ as $K$ varies over some family of extensions ordered by discriminant. This distribution is very often predicted to agree with a true probability distribution on random groups. For a non-exhaustive list of examples for statistics of very general unramified objects over a family of global fields see [CL84, FW89, BBH17, LWZB19]. Despite this close relation, there has been no attempt to make predictions for Malle’s counting function using similar random structures. As with Cramér’s model, it is reasonable to consider that there exists a random group with some discriminant structure which models the absolute Galois group of the global field $K$ for which Malle’s conjecture holds with probability $1$. Potentially the same is true for other counting functions in arithmetic statistics.

Another eye-catching feature of Malle’s conjecture is the Galois correspondence. $G$ -extensions $L/K$ are in a $1$ -to- $|\textnormal{Aut}(G)|$ correspondence with surjective homomorphisms $\textnormal{Gal}(\overline{K}/K)\to G$ . The recent work of Sawin–Wood [SW22] solved the moment problem for random objects in a category, where they proved in great generality that given a sequence $M_{G}$ on (isomorphism classes of) the category of finite objects $C$ that “do not grow too fast”, there exists a measure $\mu$ on (isomorphism classes of) the category of pro-objects $\mathcal{P}$ whose finite moments are given by $M_{G}$ . In the setting of objects in a category, this is taken to mean

\int_{\mathcal{P}}\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})=M_{G}

for each finite object $G$. Here we see an immediate similarity with Malle’s conjecture, which can be interpreted as an asymptotic count for the number of epimorphisms from the absolute Galois group of $K$ to a fixed finite group. This suggests that the setting Sawin–Wood work in, which they state was built with statistics of unramified objects in mind, is also an excellent setting to model Malle’s conjecture. By extension, random models for other counting functions may also exist in the setting considered by Sawin–Wood as long as they can be interpreted as counting epimorphisms.

The goal of this paper is to consider counting functions in the broad setting described by Sawin–Wood. The remainder of this introduction is separated into two subsections: the first defines a counting function with respect to an ordering for counting epimorphisms in a category (see Definition 1.1), and the second states Law of Large Numbers results that determine the asymptotic growth rate of these counting functions with probability $1$ (see Theorem 1.3). We will prove the Law of Large Numbers for random objects in a category in as great a generality as possible, with the intention of allowing these results to easily translate to a variety of other counting functions in other settings. In a forthcoming paper [Alb23], the author will apply these results to the case of Malle’s conjecture to both recreate and improve on existing predictions for the asymptotic growth rate. This will be done by constructing a category of “groups with local data”, applying the results of [SW22] to construct a measure on this category, and applying the results of this paper to prove that 100% of random groups with local data satisfy Malle’s conjecture.

We adopt the notation and terminology of [SW22] throughout. Let $C$ be a diamond category in the sense of [SW22, Definition 1.3] with countably many isomorphism classes, and suppose $\mu$ is a probability measure (so the whole space has measure $1$ ) on the isomorphism classes of corresponding pro-objects, $\mathcal{P}$ , under the level topology with finite moments $M_{G}$ for each $G\in C/\cong$ . Sawin–Wood give sufficient conditions for a sequence $M_{G}$ to correspond to such a measure in [SW22, Theorem 1.7 and 1.8]. Our results do not require $\mu$ to be constructed in this way, but the author expects that most interesting examples will come from the existence results proven by Sawin–Wood.

1.1 Counting Functions

To translate the setting given by [SW22] to that of counting functions more closely resembling (1), we very generally define an ordering to be a sequence of functions $f_{n}:C/\cong\to\mathbb{C}$ indexed by positive integers $n$ . We are motivated by classical orderings such as the discriminant and height functions, which correspond to $f_{n}$ being the characteristic function of an increasing chain of finite sets $A_{n}\subseteq C/\cong$ , i.e. a chain of subsets for which $A_{n}\subseteq A_{n+1}$ . In the general format described in (1), we would choose the ordering

f_{n}({\rm object})=\begin{cases}1&{\rm order}({\rm object})\leq n\\ 0&\text{else}.\end{cases}

The property that such functions have finite support is called the Northcott Property, and is known to hold for discriminant and height orderings among many others.

Definition 1.1.

The counting function on a pro-object $\mathscr{G}\in\mathcal{P}$ ordered by $f_{n}:C/\cong\to\mathbb{C}$ is defined by

\displaystyle N(\mathscr{G},f_{n})=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)

as a function of $n$ , when the series is convergent.

In the classical setting, a Northcott Property is used to guarantee that the counting function is well-defined by forcing the series to be finite. We remark that in the classical setting ${\rm order}({\rm object})$ is taken to be bounded above by a real number $X$ , while Definition 1.1 restricts to only using integer bounds, $n$ . This will not make a difference for classical orderings, as the (norm of the) discriminant and height functions are integer-valued. However, it is an artifact of the methods we use that we ask the sequence $(f_{n})_{n=1}^{\infty}$ to be indexed by a countable set rather than the uncountable set of positive real numbers.
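A toy instance of Definition 1.1 may help fix ideas. Assume, purely for illustration, that $C$ is the category of finite abelian groups (we do not verify here that it satisfies the hypotheses used later), take the pro-object $\mathscr{G}=\widehat{\mathbb{Z}}$, and let $f_{n}$ be the characteristic function of the groups of order at most $n$. A continuous surjection out of $\widehat{\mathbb{Z}}$ has cyclic image and is determined by where a topological generator is sent, so only cyclic groups contribute and the count is given by Euler's totient function $\varphi$:

```latex
\#{\rm Epi}(\widehat{\mathbb{Z}},\mathbb{Z}/m\mathbb{Z})=\varphi(m),
\qquad
N(\widehat{\mathbb{Z}},f_{n})=\sum_{m\leq n}\varphi(m)=\frac{3}{\pi^{2}}n^{2}+O(n\log n).
```

For instance, $N(\widehat{\mathbb{Z}},f_{10})=1+1+2+2+4+2+6+4+6+4=32$. This example concerns a single fixed pro-object and says nothing about any measure $\mu$; it only shows the shape of the sum in Definition 1.1.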

In our general setting, we may relax the Northcott requirement and still guarantee an (almost everywhere) well-defined counting function. Fix a probability measure $\mu$ on the category of pro-objects $\mathcal{P}$ with finite moments $M_{G}$ . For convenience, we package the sequence $M_{G}$ as a discrete measure $M:C/\cong\to\mathbb{R}_{\geq 0}$ with $M(\{G\})=M_{G}$ (note that while we require $\mu$ to be a probability measure, $M$ need not be).

Definition 1.2.

Fix a probability measure $\mu$ on the category of pro-objects $\mathcal{P}$ with finite moments $M_{G}$ . We call $f_{n}$ an $L^{1}$ -ordering with respect to $\mu$ if

\int_{C}|f_{n}|\ dM=\sum_{G\in C/\cong}|f_{n}(G)|M_{G}<\infty.

In other words, the $f_{n}$ are $L^{1}$ -functions for the discrete measure $M$ induced by $\mu$ . We will often omit “with respect to $\mu$ ” if the probability measure is clear from context.

We will prove in Lemma 3.1 that $N(\mathscr{G},f_{n})$ is well-defined as a function of $n$ for almost all $\mathscr{G}$ whenever $f_{n}$ is an $L^{1}$ -ordering (i.e., there exists a measure $1$ set of $\mathscr{G}$ on which the counting function is well-defined as a function on the positive integers), and that the expected value is given by

\displaystyle\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu(\mathscr{G})=\int_{C}f_{n}\ dM.

(2)

In the classical case, if $f_{n}$ has finite support in an increasing chain as $n$ tends towards $\infty$ then the counting function $N(\mathscr{G},f_{n})$ is a sum of increasing length depending on $n$ , where the summands $f_{n}(G)\#{\rm Epi}(\mathscr{G},G)$ are random variables as $\mathscr{G}$ varies according to $\mu$ . This is precisely the setting of the Law of Large Numbers, which suggests a much stronger result of the form

\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\longrightarrow 1

as $n\to\infty$ with probability $1$ in some appropriate sense. Typically, there is some requirement of pairwise independence between the summands in order to prove such a statement (see [SS93] for the classical statement, and [Sen13] for a brief history of the Law of Large Numbers, including cases that relax the requirement of pairwise independence). The work of Korchevsky–Petrov [KP10] on nonnegative random variables, building on work in [Ete83a, Ete83b, Pet09a, Pet09b], is particularly relevant to these counting functions, as $\#{\rm Epi}(\mathscr{G},G)$ is always nonnegative. We will prove several different versions of the Law of Large Numbers for $N(\mathscr{G},f_{n})$ of varying strengths under sufficiently nice orderings.

1.2 Main Results

Take $C$ to be a diamond category with countably many isomorphism classes, $\mathcal{P}$ the category of pro-objects, and $\mu$ a probability measure on $\mathcal{P}$ with finite moments $M_{G}$ . Theorem 1.3 is our primary probabilistic result, while Theorems 1.4 and 1.5 are our primary category theoretic results.

Theorem 1.3 (Law of Large Numbers in a Category).

Let $f_{n}:C/\cong\to\mathbb{R}$ be a real-valued $L^{1}$ -ordering for which $\liminf_{n\to\infty}\left\lvert\int_{C}f_{n}dM\right\rvert>0$ . Suppose there exists an integer $k$ and a non-decreasing function $\gamma:\mathbb{N}\to\mathbb{R}^{+}$ for which $\lim_{t\to\infty}\gamma(t)=\infty$ , and

\displaystyle\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})=O\left(\frac{\left\lvert\int_{C}f_{n}\ dM\right\rvert^{k}}{\gamma(n)}\right).

(3)

Then each of the following holds:

(i)

(Weak Law of Large Numbers)

$\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\overset{p.}{\longrightarrow}1$

as $n\to\infty$ , where the “p.” stands for converges in probability with respect to $\mu$ .
(ii)

(Strong Law of Large Numbers) If we additionally assume that $\sum_{n=1}^{\infty}\frac{1}{\gamma(n)}<\infty$ then

$\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\overset{a.s.}{\longrightarrow}1$

as $n\to\infty$ , where the “a.s.” stands for converges almost surely with respect to $\mu$ .
(iii)
(Strong Law of Large Numbers) If we additionally assume that
•

$f_{n}$ is nonnegative,

•

the counting function $n\mapsto N(\mathscr{G},f_{n})$ is almost everywhere nondecreasing, and

•

$\gamma(n)=\psi(\int_{C}f_{n}dM)$ for a nondecreasing function $\psi:\mathbb{R}\to\mathbb{R}^{+}$ for which $\sum_{n=1}^{\infty}\frac{1}{n\psi(n)}<\infty$,
then

$\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\overset{a.s.}{\longrightarrow}1$

as $n\to\infty$ , where the “a.s.” stands for converges almost surely with respect to $\mu$ .

The primary new contribution of Theorem 1.3 is the connection between counting functions and the Law of Large Numbers. The probabilistic content is largely standard, and in fact the proof of Theorem 1.3(iii) is essentially the same as that of [KP10]. We follow a standard technique for bounding probabilities using Chebyshev’s inequality and the first Borel-Cantelli lemma. Most standard references will have some version of this technique, sometimes in the context of the “method of moments”, such as [Fis11, Chapter 4]. This technique is ubiquitous to the point that examples can be found in proofs on Wikipedia [wik22] and a number of Math StackExchange posts such as [Mic17, Ele18].
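The technique just described can be sketched in one line for Theorem 1.3(i,ii). This is only an outline under the hypotheses of the theorem, not a substitute for the proofs in Section 4; the abbreviation $I_{n}:=\int_{C}f_{n}\ dM$ is introduced here for convenience, and $n$ is taken large enough that $I_{n}\neq 0$, which the $\liminf$ condition permits.

```latex
\mu\left(\left\lvert\frac{N(\mathscr{G},f_{n})}{I_{n}}-1\right\rvert\geq\epsilon\right)
=\mu\left(\left\lvert N(\mathscr{G},f_{n})-I_{n}\right\rvert^{k}\geq\epsilon^{k}\lvert I_{n}\rvert^{k}\right)
\leq\frac{1}{\epsilon^{k}\lvert I_{n}\rvert^{k}}\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-I_{n}\right\rvert^{k}d\mu(\mathscr{G})
=O\left(\frac{1}{\epsilon^{k}\gamma(n)}\right).
```

The inequality is Chebyshev's (Markov's inequality applied to the $k^{\rm th}$ power), and the final step is the bound (3). Since $\gamma(n)\to\infty$, the right-hand side tends to $0$ for each fixed $\epsilon>0$, which is convergence in probability. If in addition $\sum_{n}\frac{1}{\gamma(n)}<\infty$, these probabilities are summable in $n$, so by the first Borel-Cantelli lemma almost every $\mathscr{G}$ violates the bound for only finitely many $n$; applying this to $\epsilon=1/m$ for each positive integer $m$ gives almost sure convergence.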

If the $f_{n}$ are specifically characteristic functions with finite support in an increasing chain then Theorem 1.3(iii) is a special case of [KP10, Theorem 3]. We state and prove these results separately from work in [KP10] to allow for a more general class of orderings, although the proof requires no new ideas. For example, we can now directly consider orderings of the form

f_{n}({\rm object})=\begin{cases}1&n<{\rm order}({\rm object})\leq 2n\\ 0&\text{else}.\end{cases}

This type of ordering gives counting functions $N(\mathscr{G},f_{n})$ that count objects landing in the moving interval $(n,2n]$ . These sorts of counting functions are used to avoid errant behavior for small objects, and can be useful for proving averaging results.
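To make the relation with the classical ordering explicit, write $g_{n}$ for the characteristic function of $\{{\rm order}({\rm object})\leq n\}$ (the name $g_{n}$ is introduced here only for this remark). Whenever both counting functions on the right are well-defined, linearity of Definition 1.1 in the ordering gives:

```latex
f_{n}=g_{2n}-g_{n},
\qquad
N(\mathscr{G},f_{n})=N(\mathscr{G},g_{2n})-N(\mathscr{G},g_{n}),
\qquad
\int_{C}f_{n}\ dM=\int_{C}g_{2n}\ dM-\int_{C}g_{n}\ dM.
```

Note that $n\mapsto N(\mathscr{G},f_{n})$ need not be nondecreasing for such an ordering, so the monotonicity hypothesis of Theorem 1.3(iii) is not automatic here.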

In cases for which $\liminf_{n\to\infty}\left\lvert\int_{C}f_{n}dM\right\rvert=0$, we can still say something using the more general Theorems 4.1 and 4.2 proved in the main body of the paper. This requires a finer study of the rate of convergence, as we are no longer allowed to have $\int_{C}f_{n}dM$ in the denominator in such cases.

The bound on $k^{\rm th}$ moments in (3) is a concrete way to state that the random variables are “close enough to independent” for the Law of Large Numbers. Our primary categorical result is on bounding this quantity in terms of purely categorical structures of $C$ and $\mathcal{P}$ together with the discrete moments of the ordering $f_{n}$ .

Given an ordering $f_{n}:C/\cong\to\mathbb{C}$ , we extend the ordering multiplicatively to the product category $f_{n}:C^{k}/\cong\to\mathbb{C}$ by

f_{n}(G_{1},...,G_{k})=f_{n}(G_{1})\cdots f_{n}(G_{k}).

We also define the discrete measure $M^{(j)}:C^{j}/\cong\to\mathbb{R}_{\geq 0}\cup\{\infty\}$ given by the mixed moments of the measure $\mu$ as

M_{(G_{1},...,G_{j})}^{(j)}=\int_{\mathcal{P}}\prod_{i=1}^{j}\#{\rm Epi}(\mathscr{G},G_{i})\ d\mu(\mathscr{G}).

This is sufficient information to prove a bound for (3) for $k=2$ .

Theorem 1.4.

Let $f_{n}:C/\cong\to\mathbb{R}$ be a real-valued $L^{1}$ -ordering. Then

\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2}d\mu(\mathscr{G})\ll\max_{j\in\{1,2\}}\int_{C^{2}\setminus E(2,M)}|f_{n}|\ dM^{(j)}(dM)^{2-j},

where $E(2,M)\subseteq C^{2}$ is the full subcategory of pairs $(G_{1},G_{2})\in C^{2}$ such that

(a)

the epi-product $G_{1}\times_{\rm epi}G_{2}$ exists (defined in Definition 5.1), and
(b)

$M_{G_{1}\times_{\rm epi}G_{2}}=M_{G_{1}}M_{G_{2}}$ .

Taking away the full subcategory $E(2,M)$ of $C^{2}$ is the new content, and allows for bounds on the higher moments that can be computed solely from data given by $M$ , $C$ , and $f_{n}$ without reference to the measure $\mu$ . The epi-product in condition (a) is a special type of product structure in a category, satisfying a similar universal property to the direct product with every morphism in the diagram being an epimorphism (Definition 5.1). We will show that the existence of an epi-product together with (b) implies the random variables $\#{\rm Epi}(\mathscr{G},G_{1})$ and $\#{\rm Epi}(\mathscr{G},G_{2})$ are uncorrelated. This will be key for bounding the moments. It will often be the case that determining the existence or nonexistence of epi-products is easier than approaching the bound (3) in Theorem 1.3 directly. We include some examples using this approach in Section 6.
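The reason conditions (a) and (b) force the two random variables to be uncorrelated can be sketched as follows. We assume here, as a reading of the universal property described above and not as a quotation of Definition 5.1, that pairs of epimorphisms $\mathscr{G}\to G_{1}$, $\mathscr{G}\to G_{2}$ correspond bijectively to epimorphisms $\mathscr{G}\to G_{1}\times_{\rm epi}G_{2}$, so that $\#{\rm Epi}(\mathscr{G},G_{1})\#{\rm Epi}(\mathscr{G},G_{2})=\#{\rm Epi}(\mathscr{G},G_{1}\times_{\rm epi}G_{2})$.

```latex
\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{1})\,\#{\rm Epi}(\mathscr{G},G_{2})\right]
=\int_{\mathcal{P}}\#{\rm Epi}(\mathscr{G},G_{1}\times_{\rm epi}G_{2})\ d\mu(\mathscr{G})
=M_{G_{1}\times_{\rm epi}G_{2}}
=M_{G_{1}}M_{G_{2}}
=\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{1})\right]\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{2})\right].
```

The first equality uses the assumed bijection from (a), the second is the definition of the finite moments, and the third is condition (b). The covariance of the two counts is therefore zero, so pairs in $E(2,M)$ contribute nothing to the second moment on the left-hand side of Theorem 1.4.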

We will prove a similar bound for (3) when $k>2$ is even, but this will require more assumptions on the category. These extra assumptions can prove useful for applying Theorem 1.4, as they give us extra tools for calculating the integral with respect to $M^{(2)}$ . Lemma 5.3 will be very useful in this regard, and we give an example demonstrating its use in Section 6.

Theorem 1.5.

Suppose $C$ is a category in which every morphism factors uniquely (up to isomorphism) as a composition of an epimorphism with a monomorphism. Let $f_{n}:C/\cong\to\mathbb{R}$ be a real-valued $L^{1}$-ordering and $k$ a positive integer. Then

\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2k}d\mu(\mathscr{G})\ll_{k}\max_{j\in\{1,...,2k\}}\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(j)}(dM)^{2k-j},

where $E(2k,M)\subseteq C^{2k}$ is the full subcategory of tuples $(G_{1},G_{2},...,G_{2k})$ for which there exists an index $i$ such that for each subset $A\subseteq\{1,...,2k\}$ with $i\not\in A$

(a)

the product object $G_{A}:=\prod_{m\in A}G_{m}$ exists in $C$,
(b)

$G_{A}$ has finitely many subobjects up to isomorphism,
(c)

the epi-product $G_{i}\times_{\rm epi}H$ exists for each subobject $H\hookrightarrow G_{A}$ (defined in Definition 5.1), and
(d)

$M_{G_{i}\times_{\rm epi}H}=M_{G_{i}}M_{H}$ for each subobject $H\hookrightarrow G_{A}$ .

Unique factorization of morphisms is known for many categories of interest, including finite sets, finite groups, and finite modules over a ring. In fact, each of these categories automatically satisfies parts (a) and (b) as well. Thus, in numerous categories of interest, Theorem 1.5 requires no extra input over Theorem 1.4.

We also prove an upper bound utilizing simpler categorical structures that holds without the restrictions of Theorem 1.3.

Corollary 1.6.

Let $C$ be a diamond category which has countably many isomorphism classes, $\mu$ be a probability measure on the isomorphism classes of the corresponding category of pro-objects $\mathcal{P}$ with finite moments $M_{G}$ , and $f_{n}:C/\cong\to\mathbb{C}$ an $L^{1}$ -ordering.

Then for any $\epsilon>0$ and any positive integer $k$

\frac{N(\mathscr{G},f_{n})}{\displaystyle n^{\frac{1+\epsilon}{k}}\max_{j\in\{1,...,k\}}\left\{\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right\}^{1/k}}\overset{a.s.}{\longrightarrow}0

as $n\to\infty$ , where the “a.s.” stands for converges almost surely with respect to $\mu$ .

Corollary 1.6 is motivated by Malle’s predicted weak upper bound, although the correspondence is less than obvious. In the classical case $f_{n}$ is a nonnegative function so that $\int f_{n}\ dM=\int|f_{n}|\ dM$, and when $j=1$ the denominator is given by

n^{\frac{1+\epsilon}{k}}\int_{C}f_{n}\ dM.

Ignoring the other values of $j$ and taking $k$ sufficiently large, Corollary 1.6 appears to give an upper bound of the form

N(\mathscr{G},f_{n})\ll n^{\epsilon}\int_{C}f_{n}\ dM

with probability $1$ , where we recall that $n$ is playing the role of $X$ in Malle’s counting function. Of course, we cannot actually ignore the other values of $j$ , and depending on $C$ and $M$ it is possible for these bounds to be worse.

Theorem 1.3 and Corollary 1.6 can be used to provide evidence for conjectures about counting functions in great generality. We state this as an explicit heuristic:

Heuristic 1.7 (Vast Counting Heuristic).

Let $C$ be a diamond category which has countably many isomorphism classes and let $\mu$ be a probability measure on the isomorphism classes of the corresponding category of pro-objects $\mathcal{P}$ with finite moments $M_{G}$ . Choose an $L^{1}$ -ordering $f_{n}$ .

Let $\mathbb{G}$ be an element of $\mathcal{P}/\cong$ for which $N(\mathbb{G},f_{n})$ is well-defined. If $\mathbb{G}$ is an object we expect to behave typically^∗ with respect to the probability measure $\mu$ , such as an object coming from arithmetic, then

(i)

(Strong form) If $f_{n}$ satisfies the bound in (3) for some positive integer $k$ , then

$N(\mathbb{G},f_{n})\sim\int_{C}f_{n}\ dM$

as $n\to\infty$ .
(ii)

(Weak form) For each $\epsilon>0$ and each positive integer $k$

$N(\mathbb{G},f_{n})\ll n^{\frac{1+\epsilon}{k}}\max_{j\in\{1,...,k\}}\left(\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right)^{1/k}.$

The asterisk on “typically” is meant to highlight that this word is where the intricacies of counting conjectures lie, and in particular is in recognition of counter-examples to Malle’s conjecture and the like. For number field counting, Malle’s original conjecture is incorrect due to behavior coming from roots of unity that Malle did not initially consider. For the purpose of a general statement, this can be interpreted as saying “if there are no obstructions, this is what we expect”. Heuristic 1.7 provides a baseline for counting predictions when we believe there are no “non-random” structures of $\mathbb{G}$ missing from the measure $\mu$ on the category $\mathcal{P}$. Issues, like those occurring in Malle’s conjecture from roots of unity, might be fixed by either (a) incorporating the missing structure into a modified version of the category (this is analogous to the approach commonly taken in the study of Malle’s conjecture), or (b) excluding epimorphisms from the count which are affected by the missing structure (this is the approach taken for the Batyrev-Manin conjecture).

The literature on proving new cases of the Law of Large Numbers is vast, and it is possible that Theorem 1.3 may be extended with finer probabilistic arguments. From a categorical perspective, we’ve stated Theorem 1.4, Theorem 1.5, Corollary 1.6 and Heuristic 1.7 in the greatest generality possible so that they might be applied to make predictions on a variety of different counting functions in arithmetic statistics independent of the author’s immediate knowledge. We hope that these results will be useful as a starting point for many future predictions for the asymptotic growth rates of counting functions.

1.3 Layout of the Paper

We begin in Section 2 with a discussion of what we mean by an asymptotic version of the Law of Large Numbers. This is conceptually important for understanding the methods of this paper, as the references to the Law of Large Numbers are not just in analogy. Korchevsky–Petrov’s result [KP10, Theorem 3] is a particular example of the type of Law of Large Numbers result we consider.

We prove well-definedness of the counting function in Section 3. We give the probabilistic proofs of Theorem 1.3 and Corollary 1.6 in Section 4. These proofs closely follow the methods of [KP10]. We define epi-products in Section 5, prove some useful features of these objects, and prove Theorems 1.4 and 1.5.

Lastly, we include Section 6 to work out a couple of important special cases. The arithmetic statistics questions that primarily motivated the author will require independent work to set up the necessary category, measure, and orderings which the author will present in a forthcoming paper [Alb23]. We instead include simpler examples where these steps are either shorter or done by previous work such as in [SW22], and we pay particular attention to when epi-products exist in such cases. Despite the easier nature of these examples, we recover existing important random models from number theory as special cases of Theorem 1.3 and Heuristic 1.7.

Acknowledgments

The author would like to thank Melanie Matchett Wood for feedback and helpful discussions on the mechanics of [SW22].

2 An asymptotic version of the Law of Large Numbers

Given a sequence of independent, identically distributed random variables $\{X_{i}\}$ the Law of Large Numbers says that we expect $\frac{1}{n}(X_{1}+\cdots+X_{n})$ to converge to the expected value of $X_{i}$ in some sense. A version of this statement can sometimes also hold for random variables which are neither independent nor identically distributed. If the variables are not identically distributed, then $\mathbb{E}[X_{i}]$ may vary depending on $i$ , which necessarily alters the statement we want to make. We will instead be comparing

\displaystyle\sum_{i=1}^{n}X_{i}

with

\displaystyle\sum_{i=1}^{n}\mathbb{E}[X_{i}]

as $n\to\infty$ . When the random variables are pairwise independent, identically distributed, and have bounded variance in some sense, the difference of these is known to grow slower than $n$ almost surely by Kolmogorov’s strong law [Ete81, SS93]. Rather than a convergence statement, we will state the Law of Large Numbers as an asymptotic statement. The conclusion of Kolmogorov’s strong law would then be written as

{\rm Prob}\left(\frac{1}{n}\sum_{i=1}^{n}X_{i}-\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}[X_{i}]=o(1)\text{ as }n\to\infty\right)=1.

We can understand this term from the perspective of asymptotic counting functions as picking out a “main term + O(error term)”. Kolmogorov’s strong law can then be understood as

\sum_{i=1}^{n}X_{i}=\sum_{i=1}^{n}\mathbb{E}[X_{i}]+o(n)

as $n\to\infty$ with probability $1$ . It is often the case with the Law of Large Numbers that such results are written as

\frac{\sum_{i=1}^{n}X_{i}-\sum_{i=1}^{n}\mathbb{E}[X_{i}]}{n}\overset{a.s.}{\longrightarrow}0.

The statements of Theorem 1.3 can be interpreted as something similar to this. $N(\mathscr{G},f_{n})$ is defined to be a sum of random variables $f_{n}(G)\#{\rm Epi}(\mathscr{G},G)$, so if we label $X_{G}=f_{n}(G)\#{\rm Epi}(\mathscr{G},G)$ we can understand the counting function as

N(\mathscr{G},f_{n})=\sum_{G,f_{n}(G)\neq 0}X_{G}.

The ordering is analogous to a statement of the form “${\rm order}(G)<n$”, although our results apply to more general orderings. We can then see that $N(\mathscr{G},f_{n})$ behaves like a sum of random variables $X_{G}$, whose length is growing with $n$. The philosophy of the Law of Large Numbers suggests that we should expect this to be close to $\sum_{G,f_{n}(G)\neq 0}\mathbb{E}[X_{G}]=\int_{C}f_{n}\ dM$. However, there is no reason to expect an error term like $o(n)$ to hold, and it is not precisely what we want in any case: we are interested in a statement of the form

N(\mathscr{G},f_{n})=\int_{C}f_{n}\ dM+o\left(\int_{C}f_{n}\ dM\right)

with probability $1$ . This error does not necessarily agree with $o(n)$ .

We consider two examples to demonstrate the intricacies of the error term.

•

Consider the following example: Let $X$ be a random variable equal to $1$ with probability $1/2$ and $0$ with probability $1/2$ , and consider the dependent sequence $X_{n}=\frac{1}{n}X$ . Classical SLLN certainly holds for this sequence, as

\begin{aligned}
\left\lvert\sum_{i=1}^{n}X_{i}-\sum_{i=1}^{n}\mathbb{E}[X_{i}]\right\rvert&\leq\sum_{i=1}^{n}X_{i}+\sum_{i=1}^{n}\frac{1}{2i}\\
&\leq\sum_{i=1}^{n}\frac{1}{i}+\sum_{i=1}^{n}\frac{1}{2i}\\
&=\sum_{i=1}^{n}\frac{3}{2i}\\
&=O(\log n)=o(n).
\end{aligned}

However, the $X_{n}$ are identically 0 with probability $1/2$ , so

{\rm Prob}\left(\sum_{i=1}^{n}X_{i}\sim\sum_{i=1}^{n}\mathbb{E}[X_{i}]\text{ as }n\to\infty\right)\leq\frac{1}{2}.

This example satisfies the $o(n)$ error term, but it is certainly not what we want. What is happening here is that

\sum_{i=1}^{n}\mathbb{E}[X_{i}]=o(n)

right from the start, so the classical error for the Law of Large Numbers does not actually tell us much about how close $\sum X_{i}$ is to $\sum\mathbb{E}[X_{i}]$ .

•

If we instead work backwards, we might ask for error terms which are smaller than

$o\left(\sum_{i=1}^{n}\mathbb{E}[X_{i}]\right),$

so that the sum of expected values can rightly be recognized as the main term. In the classical setting with identically distributed random variables with nonzero expected value $E$ , these notions coincide as

$o\left(\sum_{i=1}^{n}\mathbb{E}[X_{i}]\right)=o(En)=o(n).$

In fact, this is known more generally. Korchevsky–Petrov [KP10, Theorem 3] give sufficient conditions for when

$\frac{S_{n}-\mathbb{E}[S_{n}]}{\mathbb{E}[S_{n}]}\overset{a.s.}{\longrightarrow}0$

as $n\to\infty$, when $S_{n}=\sum_{i=1}^{n}X_{i}$ is a sum of nonnegative random variables. Theorem 1.3(iii) is based on their result.

However, in the case that $\mathbb{E}[X_{i}]=0$ for all $i$ this is asking for a trivial error term and a trivial main term. This is certainly an issue, as in the classical case with a sequence of independent identically distributed random variables with mean $0$ the central limit theorem forces the error term to be about $\sqrt{n}$ (i.e., not $o(0)$ ). The case that all expected values are zero is important in the classical study and applications of the Law of Large Numbers, so we do not want to exclude this case.
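The first example above can be made completely explicit, which shows how far the $o(n)$ error term is from an asymptotic statement. Writing $H_{n}=\sum_{i=1}^{n}\frac{1}{i}$ for the harmonic numbers and $S_{n}=\sum_{i=1}^{n}X_{i}$ (both abbreviations are used only in this remark), the dependent sequence $X_{n}=\frac{1}{n}X$ satisfies:

```latex
S_{n}=X\cdot H_{n},
\qquad
\mathbb{E}[S_{n}]=\frac{1}{2}H_{n},
\qquad
\frac{S_{n}}{\mathbb{E}[S_{n}]}=2X\in\{0,2\}\quad\text{for every }n.
```

The ratio is constant in $n$ and takes each of the values $0$ and $2$ with probability $1/2$, so it never tends to $1$. The event $S_{n}\sim\mathbb{E}[S_{n}]$ therefore has probability $0$, which is consistent with the upper bound of $\frac{1}{2}$ stated in the first example.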

The condition $\liminf_{n\to\infty}\left\lvert\int_{C}f_{n}dM\right\rvert>0$ in Theorem 1.3 is a consequence of the second bullet above. This corresponds to the condition that $\mathbb{E}[S_{n}]\neq 0$ for all but finitely many $n$, so that we can make sense of having $\mathbb{E}[S_{n}]$ in the denominator. This issue can alternatively be avoided by relaxing what we want for an error term. Some options might include $o\left(\sum_{i=1}^{n}\mathbb{E}[|X_{i}|]\right)$ to guarantee the sum is nonzero unless the $X_{i}$ are identically zero, or something of the form $o\left(n^{1/2+\epsilon}\right)$ to give an error of a similar nature to the Central Limit Theorem. We give a version of Theorem 1.3(i,ii) in Theorems 4.1 and 4.2 with explicit rates of convergence. These results can be used to produce probability $1$ statements even in the case that $\liminf\left\lvert\int_{C}f_{n}dM\right\rvert=0$, although they will no longer be of the form

N(\mathscr{G},f_{n})\sim\int_{C}f_{n}dM.

3 Well-defined counting functions

Recall from Definition 1.2 that a sequence of functions $f_{n}$, each of which is $L^{1}$ with respect to the discrete measure $M$, is called an $L^{1}$-ordering. As in Definition 1.1, we define the counting function

N(\mathscr{G},f_{n})=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)

for each $\mathscr{G}\in\mathcal{P}$. Because we allow $f_{n}$ to have possibly infinite support, we need to make sure that the counting function is well-defined and that linearity of expectation behaves as we expect. In the case of $L^{1}$-orderings, we prove this below:

Lemma 3.1.

If $f_{n}$ is an $L^{1}$ -ordering, then

(a)

$N(\mathscr{G},f_{n})$ is well-defined as a function on the positive integers $n$ almost surely (i.e. with probability $1$ ) with respect to $\mu$ , and
(b)

$\displaystyle\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu(\mathscr{G})=\int_{C}f_{n}\ dM.$

Proof.

This follows from the Fubini-Tonelli theorem. Indeed, $f_{n}$ being $L^{1}$ means

\sum_{C}\int_{\mathcal{P}}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})<\infty.

All probability spaces are $\sigma$ -finite, so it follows from the Fubini-Tonelli theorem that

	$\displaystyle\int_{\mathcal{P}}\left\lvert\sum_{G\in C/\cong}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\right\rvert\ d\mu(\mathscr{G})$	$\displaystyle=\int_{\mathcal{P}}\sum_{G\in C/\cong}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})$
		$\displaystyle=\sum_{G\in C/\cong}\int_{\mathcal{P}}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})$
		$\displaystyle=\sum_{G\in C/\cong}|f_{n}(G)|M_{G}<\infty.$

Certainly $\mathbb{E}[|X|]<\infty$ implies $|X|<\infty$ almost surely, so it must be that for fixed $n$ the counting function

N(\mathscr{G},f_{n})=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)

converges absolutely for almost all $\mathscr{G}$ . As there are countably many $n$ , countable additivity implies $N(\mathscr{G},f_{n})$ is well-defined almost surely as a function on the positive integers.

The fact that the integrals of $|f_{n}|$ are finite also implies

\int_{\mathcal{P}}\sum_{C}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})=\sum_{C}\int_{\mathcal{P}}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})

by the Fubini-Tonelli Theorem. Evaluating the inner sum/integral gives

\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu=\sum_{C}f_{n}(G)M_{G}=\int_{C}f_{n}\ dM\leq\int_{C}|f_{n}|\ dM<\infty,

proving (b). ∎

4 Law of Large Numbers for categories

We prove Theorem 1.3 and Corollary 1.6 in this section. The probabilistic content of each proof is essentially Chebyshev’s bound, the Borel-Cantelli lemma, and a few tricks for nonnegative orderings previously applied in [KP10]. We prove Theorem 1.3(i,ii) with explicit control on the rate of convergence, in part to make the proof of Corollary 1.6 more straightforward.

4.1 The Weak Law of Large Numbers

We prove the following result, which is Theorem 1.3(i) together with an explicit rate of convergence.

Theorem 4.1.

Let $f_{n}:C/\cong\to\mathbb{R}$ be a real-valued $L^{1}$ -ordering. Suppose there exists an integer $k$ , a non-decreasing function $\gamma:\mathbb{N}\to\mathbb{R}^{+}$ for which $\lim_{t\to\infty}\gamma(t)=\infty$ , and a function $E:\mathbb{N}\to\mathbb{R}^{+}$ for which

\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})=O\left(\frac{E(n)^{k}}{\gamma(n)}\right).

(4)

Then

\frac{N(\mathscr{G},f_{n})-\displaystyle\int_{C}f_{n}\ dM}{E(n)}\overset{p.}{\longrightarrow}0

as $n\to\infty$ , where “p.” stands for convergence in probability with respect to $\mu$ .

Theorem 1.3(i) follows by taking $E(n)=\max\left\{\left\lvert\int_{C}f_{n}\ dM\right\rvert,\delta\right\}$ for some small $\delta>0$ .

Proof.

If $Y$ is a random variable with $\mathbb{E}[Y]=0$ , then Chebyshev’s bound states that for each positive integer $k$ ,

{\rm Prob}\left(|Y|>\lambda\right)\leq\lambda^{-k}\mathbb{E}[|Y|^{k}].

(See, for instance, [MU17].) In the context of this proof, we let $\mathbb{E}$ denote the expected value with respect to $\mu(\mathscr{G})$ for the sake of convenience. Set

Y_{n}=N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM.

It then follows that

	$\displaystyle\mathbb{E}[|Y_{n}|^{k}]$	$\displaystyle=\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}\ d\mu(\mathscr{G})$
		$\displaystyle=O\left(\frac{E(n)^{k}}{\gamma(n)}\right).$

Taking $\lambda=\epsilon E(n)$ , it follows that

\displaystyle{\rm Prob}\left(|Y_{n}|>\epsilon E(n)\right)\ll\frac{1}{\epsilon^{k}\gamma(n)}

for each $\epsilon>0$ . Thus,

\displaystyle\limsup_{n\to\infty}{\rm Prob}\left(\frac{|Y_{n}|}{E(n)}>\epsilon\right)=0.

This is the definition of $Y_{n}/E(n)$ converging in probability to $0$ , and so concludes the proof. ∎
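
The only probabilistic input in the proof above is the $k$-th moment form of Chebyshev’s bound. As a sanity check, the following minimal sketch verifies that bound exactly on a toy example that is not taken from the paper: $Y$ is the centered number of heads in $n$ fair coin flips. The function names `centered_binomial`, `tail_probability`, and `moment_bound` are hypothetical helpers introduced only for this illustration.

```python
from fractions import Fraction
from math import comb


def centered_binomial(n):
    """Exact law of Y = (#heads in n fair flips) - n/2 as {value: probability}."""
    return {Fraction(2 * h - n, 2): Fraction(comb(n, h), 2 ** n) for h in range(n + 1)}


def tail_probability(dist, lam):
    """Prob(|Y| > lam), computed exactly from the finite law."""
    return sum(p for y, p in dist.items() if abs(y) > lam)


def moment_bound(dist, lam, k):
    """Right-hand side of Chebyshev's bound: lam^(-k) * E[|Y|^k]."""
    kth_moment = sum(p * abs(y) ** k for y, p in dist.items())
    return kth_moment / Fraction(lam) ** k


if __name__ == "__main__":
    dist = centered_binomial(20)
    for k in (2, 4, 6):
        for lam in (2, 3, 5):
            lhs = tail_probability(dist, lam)
            rhs = moment_bound(dist, lam, k)
            print(k, lam, float(lhs), float(rhs), lhs <= rhs)
```

Exact rational arithmetic is used so that the comparison is not affected by rounding. For large thresholds the higher moments typically give the sharper bound, which is why the theorem allows an arbitrary exponent $k$.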

4.2 The Strong Law of Large Numbers

We prove the following result, which is Theorem 1.3(ii) together with an explicit rate of convergence.

Theorem 4.2.

Let $f_{n}:C/\cong\to\mathbb{R}$ , $k$ , $\gamma$ , and $E$ be as in Theorem 4.1, so that

\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})=O\left(\frac{E(n)^{k}}{\gamma(n)}\right).

(5)

Additionally, assume that $\sum_{n=1}^{\infty}\frac{1}{\gamma(n)}<\infty$ . Then

\frac{N(\mathscr{G},f_{n})-\displaystyle\int_{C}f_{n}\ dM}{E(n)}\overset{a.s.}{\longrightarrow}0

as $n\to\infty$ , where “a.s.” stands for almost sure convergence with respect to $\mu$ .

Theorem 1.3(ii) follows by taking $E(n)=\max\left\{\left\lvert\int_{C}f_{n}\ dM\right\rvert,\delta\right\}$ for some small $\delta>0$ .

Proof.

We proceed in a similar way to the proof of Theorem 4.1. Set

Y_{n}=N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM

so that

	$\displaystyle\mathbb{E}[|Y_{n}|^{k}]$	$\displaystyle=\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}\ d\mu(\mathscr{G})$
		$\displaystyle=O\left(\frac{E(n)^{k}}{\gamma(n)}\right).$

It then follows from Chebyshev’s bound that

{\rm Prob}\left(|Y_{n}|>\lambda\right)\ll\frac{E(n)^{k}}{\lambda^{k}\gamma(n)}.

Taking $\lambda=\epsilon E(n)$ , it follows that

\displaystyle\sum_{n=1}^{\infty}{\rm Prob}\left(|Y_{n}|>\epsilon E(n)\right)\ll\sum_{n=1}^{\infty}\frac{1}{\epsilon^{k}\gamma(n)}<\infty

for each $\epsilon>0$ . Thus, the Borel-Cantelli lemma implies

{\rm Prob}\left(|Y_{n}|>\epsilon E(n)\text{ infinitely often }\right)=0,

so that

\displaystyle{\rm Prob}\left(\limsup_{n\to\infty}\frac{|Y_{n}|}{E(n)}>\epsilon\right)=0.

By countable additivity and taking $\epsilon=1/m$ for $m\in\mathbb{Z}$ tending towards $\infty$ , this implies $Y_{n}/E(n)$ converges to $0$ almost surely, concluding the proof. ∎

4.3 Almost Sure Upper Bounds

Corollary 1.6 follows almost immediately from Theorem 4.2.

Proof of Corollary 1.6.

We bound

	$\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})$
	$\displaystyle\leq\int_{\mathscr{G}}\left(|N(\mathscr{G},f_{n})|+\int_{C}|f_{n}|\ dM\right)^{k}d\mu(\mathscr{G})$
	$\displaystyle\leq\int_{\mathscr{G}}\left(\sum_{G\in C/\cong}|f_{n}(G)|(\#{\rm Epi}(\mathscr{G},G)+M_{G})\right)^{k}d\mu(\mathscr{G})$
	$\displaystyle\leq\sum_{(G_{1},...,G_{k})\in C^{k}/\cong}|f_{n}(G_{1},...,G_{k})|\sum_{j=0}^{k}\sum_{\sigma\in S_{k}}M_{(G_{\sigma(1)},...,G_{\sigma(j)})}^{(j)}M_{G_{\sigma(j+1)}}\cdots M_{G_{\sigma(k)}},$

where the last line follows from $f_{n}$ being an $L^{1}$ function and moving the integral all the way to the inside. We remark that the sum over permutations $\sigma\in S_{k}$ is a bit larger than the truth, but this bound will be sufficient for our purposes. This is exactly a sum of mixed moments

	$\displaystyle\leq\sum_{j=0}^{k}\sum_{\sigma\in S_{k}}\int_{\sigma(C^{k})}|f_{n}|\ dM^{(j)}(dM)^{k-j}$
	$\displaystyle\ll_{k}\max_{j\in\{0,1,...,k\}}\left\{\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right\},$

noting that $f_{n}$ is invariant under permuting the coordinates. Taking

E(n)=n^{\frac{1+\epsilon}{k}}\max_{j\in\{0,1,...,k\}}\left\{\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right\}^{1/k}

and $\gamma(n)=n^{1+\epsilon}$ in Theorem 4.2 is sufficient to conclude the proof. ∎

4.4 The Strong Law of Large Numbers for nonnegative counting functions

Theorem 1.3(iii) takes advantage of some tricks employed by [KP10] for nonnegative functions. The following proof is in essence the same as their main result; however, the functions $f_{n}(G)$ we allow are slightly more general than the sequence of weights $w_{k}$ utilized in [KP10, Theorem 1].

These tricks are not compatible with controlling the rate of convergence, so we do not state a more general result for this part. It is certainly possible that more can be proven for this case but we do not do so here.

Proof of Theorem 1.3(iii).

We remark that $n\mapsto N(\mathscr{G},f_{n})$ being nondecreasing almost everywhere implies

n\mapsto\int_{C}f_{n}\ dM=\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu(\mathscr{G})

is also nondecreasing.

The primary trick is to prove convergence along a subsequence of indices first, then interpolate to the remaining indices.

Fix $b>1$ . Define the subsequence of natural numbers $(n_{i})$ by

n_{i}=\inf\left\{n:\int_{C}f_{n}\ dM\geq b^{i}\right\}.

These necessarily exist by $\int_{C}f_{n}dM\to\infty$ with $n$ . We apply Chebyshev’s inequality to

Y_{n}=N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM

with $\lambda=\epsilon\int_{C}f_{n}\ dM$ to prove that

	$\displaystyle\sum_{i=1}^{\infty}{\rm Prob}\left(|Y_{n_{i}}|>\epsilon\int_{C}f_{n_{i}}\ dM\right)$	$\displaystyle\ll\sum_{i=1}^{\infty}\frac{1}{\epsilon^{k}\psi\left(\int_{C}f_{n_{i}}dM\right)}$
		$\displaystyle\leq\sum_{i=1}^{\infty}\frac{1}{\epsilon^{k}\psi\left(b^{i}\right)},$

where the second comparison follows from $\psi$ being nondecreasing. A straightforward exercise in the convergence of infinite series shows that if $\sum_{n=1}^{\infty}\frac{1}{n\psi(n)}$ converges, then so does $\sum_{i=1}^{\infty}\frac{1}{\psi(b^{i})}$ for any $b>1$ (for a reference, see [Pet09a, Lemma 1]). Similar to the proof of Theorem 4.2, the Borel-Cantelli Lemma then implies

{\rm Prob}\left(\lim_{i\to\infty}\frac{|Y_{n_{i}}|}{\displaystyle\int_{C}f_{n_{i}}\ dM}=0\right)=1.

Next, we interpolate to other indices $n$ not belonging to the subsequence $(n_{i})$ . There exists an $i$ for which $n_{i}\leq n\leq n_{i+1}$ . We expand

\displaystyle\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n}dM}{\int_{C}f_{n}dM}

\displaystyle=\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n_{i+1}}dM}{\int_{C}f_{n}dM}+\frac{\int_{C}f_{n_{i+1}}dM-\int_{C}f_{n}dM}{\int_{C}f_{n}dM}.

Given that the counting function and its moments are nondecreasing almost everywhere, this produces an almost everywhere upper bound

\displaystyle\leq\frac{\left\lvert N(\mathscr{G},f_{n_{i+1}})-\int_{C}f_{n_{i+1}}dM\right\rvert}{\int_{C}f_{n_{i+1}}dM}\frac{\int_{C}f_{n_{i+1}}dM}{\int_{C}f_{n_{i}}dM}+\frac{\int_{C}f_{n_{i+1}}dM-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}.

The subsequence $(n_{i})$ is nondecreasing, and if $n_{i}<n_{i+1}$ it follows that

b^{i}\leq\int_{C}f_{n_{i}}dM\leq\int_{C}f_{n_{i+1}-1}dM<b^{i+1}\leq\int_{C}f_{n_{i+1}}dM.

We can simplify the upper bound to

\displaystyle\leq\frac{\left\lvert N(\mathscr{G},f_{n_{i+1}})-\int_{C}f_{n_{i+1}}dM\right\rvert}{\int_{C}f_{n_{i+1}}dM}b^{2}+(b^{2}-1).

Taking the limit as $i\to\infty$ implies

\limsup_{n\to\infty}\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n}dM}{\int_{C}f_{n}dM}\leq b^{2}-1.

Taking $b>1$ sufficiently close to $1$ gives the desired $\limsup$ . The $\liminf$ is calculated similarly using the lower bound

	$\displaystyle\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n}dM}{\int_{C}f_{n}dM}$	$\displaystyle\geq\frac{N(\mathscr{G},f_{n_{i}})-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}\frac{\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i+1}}dM}-\frac{\int_{C}f_{n}dM-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n}dM}$
		$\displaystyle\geq\frac{N(\mathscr{G},f_{n_{i}})-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}b^{-2}-1+\frac{\int_{C}f_{n_{i}}dM}{\int_{C}f_{n}dM}$
		$\displaystyle\geq\frac{N(\mathscr{G},f_{n_{i}})-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}b^{-2}-1+b^{-2}.$

The limit as $i\to\infty$ can again be made arbitrarily close to $0$ by taking $b$ close to $1$ . ∎
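
The subsequence $(n_{i})$ used in the proof above can be computed directly once the moments $\int_{C}f_{n}\ dM$ are known. The sketch below is only an illustration: it uses the harmonic sums $\sum_{j\leq n}1/j$ as a stand-in for a nondecreasing, unbounded sequence of moments (an assumption made for this example, not a sequence from the paper), and the names `harmonic_moments` and `subsequence_indices` are hypothetical.

```python
def harmonic_moments(limit):
    """Stand-in moments: moments[n-1] = sum_{j<=n} 1/j for n = 1..limit."""
    total, moments = 0.0, []
    for j in range(1, limit + 1):
        total += 1.0 / j
        moments.append(total)
    return moments


def subsequence_indices(moments, b, count):
    """Return n_i = min{n : moments[n-1] >= b**i} for i = 1..count.

    The list is cut short once b**i exceeds every available moment.
    `moments` is assumed to be nondecreasing.
    """
    indices = []
    pos = 0  # zero-based position; never moves backwards because n_i is nondecreasing
    for i in range(1, count + 1):
        while pos < len(moments) and moments[pos] < b ** i:
            pos += 1
        if pos == len(moments):
            break
        indices.append(pos + 1)
    return indices


if __name__ == "__main__":
    moments = harmonic_moments(10_000)
    for i, n in enumerate(subsequence_indices(moments, 1.5, 20), start=1):
        print(i, n, moments[n - 1])
```

Because the moments in this stand-in grow slowly, consecutive $n_{i}$ are far apart; the interpolation step of the proof is what controls the indices $n$ lying between them.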

5 Epi-Products

We let $C$ be a diamond category as defined in [SW22, Definition 1.3] with $\mathcal{P}$ the corresponding category of pro-objects. We fix throughout a probability measure $\mu$ on $\mathcal{P}$ with finite moments $M_{G}$ .

Theorem 1.4 and Theorem 1.5 rely on the concept of an epi-product, which is a categorical construction capturing the lack of correlation between $\#{\rm Epi}(\mathscr{G},G)$ and $\#{\rm Epi}(\mathscr{G},H)$ as $\mathscr{G}$ varies according to $\mu$ .

5.1 The Definition of an Epi-Product

Definition 5.1.

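We define the epi-product of $G_{1}$ and $G_{2}$ to be an object $G_{1}\times_{\rm Epi}G_{2}$ with epimorphisms to $G_{1}$ and $G_{2}$ which satisfies the universal property of a product: every object $H$ equipped with epimorphisms to $G_{1}$ and $G_{2}$ admits a unique morphism $H\to G_{1}\times_{\rm Epi}G_{2}$ commuting with the maps to $G_{1}$ and $G_{2}$ ,

where all morphisms in the diagram (including the universal morphism) are required to be epimorphisms.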

This is similar to the definition of a product, except we require all the morphisms (including the universal morphism) to be epimorphisms. Notice that we do not ask that $C$ has epi-products in general. This flexibility will allow us to choose a wider variety of categories to work over. We remark that if both the usual product $G_{1}\times G_{2}$ and $G_{1}\times_{\rm Epi}G_{2}$ exist then there is necessarily a unique isomorphism between them. However, the existence of one does not imply the existence of the other. We will discuss some basic examples at the end of the paper.

Lemma 5.2.

Let $\mu$ be a probability measure on the pro-objects of $C$ with finite moments $M_{G}$ , and let $\mathscr{G}$ vary with respect to $\mu$ . Then the random variables $\#{\rm Epi}(\mathscr{G},G)$ and $\#{\rm Epi}(\mathscr{G},H)$ for $G,H\in C/\cong$ are uncorrelated if $G\times_{\rm Epi}H$ exists and $M_{G\times_{\rm Epi}H}=M_{G}M_{H}$ .

Proof.

By the definition of uncorrelated, it suffices to prove that

\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)\#{\rm Epi}(\mathscr{G},H)]=\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)]\mathbb{E}[\#{\rm Epi}(\mathscr{G},H)].

The universal property of epi-products implies that

\#{\rm Epi}(\mathscr{G},G)\#{\rm Epi}(\mathscr{G},H)=\#{\rm Epi}(\mathscr{G},G\times_{\rm Epi}H).

Therefore

	$\displaystyle\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)\#{\rm Epi}(\mathscr{G},H)]$	$\displaystyle=\mathbb{E}[\#{\rm Epi}(\mathscr{G},G\times_{\rm Epi}H)]$
		$\displaystyle=M_{G\times_{\rm Epi}H}$
		$\displaystyle=M_{G}M_{H}$
		$\displaystyle=\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)]\mathbb{E}[\#{\rm Epi}(\mathscr{G},H)].$

∎

5.2 The Proof of Theorem 1.4

Theorem 1.4 follows almost immediately from Lemma 5.2.

Proof of Theorem 1.4.

For simplicity in this proof, we write $\mathbb{E}$ for the expected value with respect to $\mu$ . When $f_{n}$ is real-valued, the square is always nonnegative and we can ignore the absolute value. Thus

	$\displaystyle\int_{\mathscr{G}}\left(N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right)^{2}d\mu(\mathscr{G})$	$\displaystyle=\mathbb{E}\left[\left(\sum_{G\in C/\cong}f_{n}(G)(\#{\rm Epi}(\mathscr{G},G)-M_{G})\right)^{2}\right]$
		$\displaystyle=\sum_{(G_{1},G_{2})\in C^{2}/\cong}f_{n}(G_{1},G_{2})\mathbb{E}\left[\prod_{i=1}^{2}\left(\#{\rm Epi}(\mathscr{G},G_{i})-M_{G_{i}}\right)\right]$
		$\displaystyle=\sum_{(G_{1},G_{2})\in C^{2}/\cong}f_{n}(G_{1},G_{2})\left(M^{(2)}_{(G_{1},G_{2})}-M_{G_{1}}M_{G_{2}}\right).$

By Lemma 5.2, $(G_{1},G_{2})\in E(2,M)$ implies $\#{\rm Epi}(\mathscr{G},G_{1})$ and $\#{\rm Epi}(\mathscr{G},G_{2})$ are uncorrelated. By definition, this is equivalent to $M^{(2)}_{(G_{1},G_{2})}=M_{G_{1}}M_{G_{2}}$ . Therefore the sum simplifies to

	$\displaystyle=\sum_{(G_{1},G_{2})\in C^{2}\setminus E(2,M)/\cong}f_{n}(G_{1},G_{2})\left(M^{(2)}_{(G_{1},G_{2})}-M_{G_{1}}M_{G_{2}}\right)$
	$\displaystyle=\int_{C^{2}\setminus E(2,M)}f_{n}\ dM^{(2)}-\int_{C^{2}\setminus E(2,M)}f_{n}\ (dM)^{2}.$

The result then follows from the triangle inequality. ∎

5.3 The Proof of Theorem 1.5

Theorem 1.5 is proven similarly to Theorem 1.4, although the higher power is trickier to keep track of. In particular, some notion of “mutually uncorrelated” random variables is needed to compare the moments of three or more of the $\#{\rm Epi}(\mathscr{G},G_{i})$ terms.

The extra categorical requirements in Theorem 1.5 are used to reframe the mixed moments as first moments, which allows us to more easily capture the required “mutually uncorrelated”ness using subobjects of product objects. This reframing is also useful for computing the mixed moments, so we state and prove it separately:

Lemma 5.3.

Suppose $C$ is a category for which every morphism factors uniquely (up to isomorphism) as a composition of an epimorphism with a monomorphism. Let $G_{1},...,G_{j}\in C$ be objects for which the product $G_{1}\times\cdots\times G_{j}$ exists in $C$ and has finitely many subobjects up to isomorphism. Then

M^{(j)}_{(G_{1},...,G_{j})}=\sum_{\begin{subarray}{c}\iota:H\hookrightarrow\prod_{i}G_{i}\\ \rho_{m}\iota\text{ is an epimorphism}\end{subarray}}M_{H},

where the sum is over subobjects $\iota:H\hookrightarrow\prod_{i}G_{i}$ up to isomorphism for which $\rho_{m}\iota$ is also an epimorphism for each projection morphism $\rho_{m}:\prod_{i}G_{i}\to G_{m}$ .

Proof.

By the universal property of the product, there is an embedding

\prod_{i}{\rm Epi}(\mathscr{G},G_{i})\hookrightarrow\textnormal{Hom}\left(\mathscr{G},\prod_{i}G_{i}\right).

Partitioning this embedding based on the possible images gives a bijection to a disjoint union

\prod_{i}{\rm Epi}(\mathscr{G},G_{i})\longleftrightarrow\coprod_{\begin{subarray}{c}\iota:H\hookrightarrow\prod_{i}G_{i}\\ \rho_{m}\iota\text{ is an epimorphism}\end{subarray}}{\rm Epi}(\mathscr{G},H).

Taking cardinalities and integrating with respect to $\mu$ concludes the proof. ∎

We can now prove Theorem 1.5.

Proof of Theorem 1.5.

For simplicity in this proof, we write $\mathbb{E}$ for the expected value with respect to $\mu$ . When $f_{n}$ is real-valued and we take the $2k^{\rm th}$ moment, we can essentially ignore the absolute values. We then compute

	$\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2k}d\mu(\mathscr{G})$
	$\displaystyle=\mathbb{E}\left[\left(\sum_{G\in C/\cong}f_{n}(G)(\#{\rm Epi}(\mathscr{G},G)-M_{G})\right)^{2k}\right]$
	$\displaystyle=\sum_{(G_{1},...,G_{2k})\in C^{2k}/\cong}f_{n}(G_{1},...,G_{2k})\mathbb{E}\left[\prod_{i=1}^{2k}\left(\#{\rm Epi}(\mathscr{G},G_{i})-M_{G_{i}}\right)\right],$

where we can switch the order of the integral defining $\mathbb{E}$ and the countable sum because $f_{n}$ is an $L^{1}$ -ordering.

We multiply out the product of random variables to produce

\displaystyle\mathbb{E}\left[\prod_{i=1}^{2k}\left(\#{\rm Epi}(\mathscr{G},G_{i})-M_{G_{i}}\right)\right]

\displaystyle=\sum_{A\subseteq\{1,2,...,2k\}}(-1)^{|A|}M^{(|A|)}_{(G_{m})_{m\in A}}\prod_{m\not\in A}M_{G_{m}}.

It suffices to show this quantity is $0$ when $(G_{1},...,G_{2k})\in E(2k,M)$ , as the fact that $f_{n}$ , $M^{(j)}$ , and $E(2k,M)$ are invariant under permutation implies

	$\displaystyle\sum_{(G_{1},...,G_{2k})\in C^{2k}\setminus E(2k,M)/\cong}f_{n}(G_{1},...,G_{2k})\sum_{A\subseteq\{1,2,...,2k\}}(-1)^{|A|}M^{(|A|)}_{(G_{m})_{m\in A}}\prod_{m\not\in A}M_{G_{m}}$
	$\displaystyle\ll_{k}\max_{j\in\{1,2,...,2k\}}\sum_{\begin{subarray}{c}A\subseteq\{1,2,...,2k\}\\ |A|=j\end{subarray}}\sum_{(G_{1},...,G_{2k})\in C^{2k}\setminus E(2k,M)/\cong}|f_{n}(G_{1},...,G_{2k})|M^{(j)}_{(G_{m})_{m\in A}}\prod_{m\not\in A}M_{G_{m}}$
	$\displaystyle\ll_{k}\max_{j\in\{1,...,2k\}}\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(j)}(dM)^{2k-j}.$

Now, suppose that $(G_{1},...,G_{2k})\in E(2k,M)$ , with distinguished index $i$ . We separate the $i^{\rm th}$ term of the product out, and compute

	$\displaystyle\mathbb{E}\left[\prod_{m=1}^{2k}\left(\#{\rm Epi}(\mathscr{G},G_{m})-M_{G_{m}}\right)\right]$
	$\displaystyle=\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{i})\prod_{m\neq i}\left(\#{\rm Epi}(\mathscr{G},G_{m})-M_{G_{m}}\right)\right]-M_{G_{i}}\mathbb{E}\left[\prod_{m\neq i}\left(\#{\rm Epi}(\mathscr{G},G_{m})-M_{G_{m}}\right)\right]$
	$\displaystyle=\sum_{\begin{subarray}{c}A\subset\{1,...,2k\}\\ i\not\in A\end{subarray}}\sum_{\begin{subarray}{c}\iota:H\hookrightarrow G_{A}\\ \rho_{m}\iota\text{ is an epi.}\end{subarray}}\left(\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{i})\#{\rm Epi}(\mathscr{G},H)\right]-M_{G_{i}}M_{H}\right)\prod_{m\not\in A}M_{G_{m}}.$

By Lemma 5.2 and the properties of the distinguished $i^{\rm th}$ index, it is necessarily the case that

\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{i})\#{\rm Epi}(\mathscr{G},H)\right]=M_{G_{i}}M_{H}

for each $H$ . Therefore this is a sum of zeros, and cancels out as claimed. ∎

6 Examples

Determining the bound in Theorem 1.4 and Theorem 1.5 is fairly specific to the category, and the motivating examples of number field counting and the Batyrev-Manin conjecture take a fair bit of set up to define the corresponding diamond category $C$ , construct a measure modeling arithmetic behavior, translate classical orderings into a sequence $f_{n}$ , and compute the moments $\int f_{n}\ dM$ . All this before even considering the question of bounding the higher moments. The plus side is the immense return value of Theorem 1.3 and Corollary 1.6. If you understand when epi-products exist enough to apply these results for one ordering, then it is often the case that the same reasoning will apply to numerous other orderings on the same category.

The author will construct a model for Malle’s counting function and determine the reasonableness of a large collection of orderings in forthcoming work [Alb23]. For the purposes of this paper, we include some more basic examples where these steps are either easier or already done for us in works like [SW22].

6.1 Random subsets with independent elements

We prove a typical tool used to give random models for sets of integers as a special case of our methods:

Theorem 6.1.

Let $\mathscr{G}$ be a random subset of $\mathbb{N}$ where we let the events $(n\in\mathscr{G})$ be pairwise independent with probability $r_{n}\in[0,1]$ . This forms a true probability measure on $2^{\mathbb{N}}$ , and for any $\epsilon>0$

\frac{\displaystyle\#(\mathscr{G}\cap\{1,...,n\})-\sum_{j\leq n}r_{j}}{\displaystyle n^{\epsilon}\sqrt{\sum_{j\leq n}r_{j}}}\overset{a.s.}{\longrightarrow}0

as $n\to\infty$ .

If we let $r_{n}=\frac{1}{\log(n)}$ be the probability that $n$ is prime (with $r_{1}=0$ and $r_{2}=1$ in order to make sense) this is precisely Cramér’s original random model for the set of primes [Cra94]. We reproduce Cramér’s main term with this asymptotic:

\#(\mathscr{G}\cap\{1,...,n\})=\sum_{j=3}^{n}\frac{1}{\log(j)}+o\left(n^{\epsilon}\sqrt{\sum_{j=3}^{n}\frac{1}{\log(j)}}\right)

with probability $1$ . One easily checks using difference calculus that the main term is of the same order of magnitude as ${\rm Li}(n)$ in agreement with the prime number theorem. The error we produce, while not as strong as Cramér’s, is still $o(n^{1/2+\epsilon})$ suggesting the truth of the Riemann hypothesis.
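
To get a feel for the size of the quantities in Theorem 6.1, one can draw a single sample of Cramér’s model and compare the count with its expected value. The sketch below is a hedged illustration rather than part of the argument: it draws the events $(n\in\mathscr{G})$ fully independently (stronger than the pairwise independence in the statement), uses Python’s pseudorandom generator with an arbitrary seed, and the names `cramer_probability` and `simulate_counts` are hypothetical.

```python
import math
import random


def cramer_probability(n):
    """r_n in Cramér's model, with the conventions r_1 = 0 and r_2 = 1."""
    if n < 2:
        return 0.0
    if n == 2:
        return 1.0
    return 1.0 / math.log(n)


def simulate_counts(limit, seed=0):
    """Draw one sample of the random set up to `limit`.

    Returns (count, expected), where count = #(G ∩ {1,...,limit}) for the
    sampled set G and expected = sum_{j <= limit} r_j.
    """
    rng = random.Random(seed)
    count = 0
    expected = 0.0
    for n in range(1, limit + 1):
        r = cramer_probability(n)
        expected += r
        if rng.random() < r:
            count += 1
    return count, expected


if __name__ == "__main__":
    for limit in (10**3, 10**4, 10**5):
        count, expected = simulate_counts(limit)
        # Deviation measured in units of sqrt(expected), the scale in Theorem 6.1.
        print(limit, count, round(expected, 1), (count - expected) / math.sqrt(expected))
```

A single run proves nothing about almost sure behavior, but the normalized deviation in the last column is the quantity that Theorem 6.1 controls after division by $n^{\epsilon}$.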

Proof.

Let $C$ be the category gotten by letting $C^{\rm op}$ be the category of finite subsets of $\mathbb{N}$ whose morphisms are inclusions. The pro-objects are all the subsets of $\mathbb{N}$ . This category is incredibly nice for numerous reasons:

(a)

$\textnormal{Hom}(A,B)={\rm Epi}(A,B)$ contains only the inclusion map $A\hookleftarrow B$ if $B\subseteq A$ and is empty otherwise.
(b)

The product of $A$ and $B$ is given by the union $A\cup B$ , and the fact that all morphisms are epimorphisms implies this is an epi-product too.
(c)

Any sequence of finite moments $M_{A}$ on this category is well-behaved in the sense of [SW22], because the well-behavedness sum is always finite. A level is the power set $2^{D}$ for some finite set $D$ , while the elements with epimorphisms from $A$ are precisely the subsets of $A$ .
(d)

The Möbius function on the lattice of isomorphism classes ordered by epimorphisms is given by $\mu(B,A)=(-1)^{|A\setminus B|}$ .

The function $\#{\rm Epi}(B,A)$ is then the characteristic function of the event $(A\subseteq B)$ , which among a level $2^{D}$ has expected value precisely $\prod_{a\in A}r_{a}$ , so we define this to be $M_{A}$ . Certainly $M_{A\cup B}=M_{A}M_{B}$ whenever $A\cap B=\emptyset$ . We check that $M_{A}$ corresponds to a measure on the pro-objects using [SW22, Theorem 1.7]. Indeed,

	$\displaystyle v_{2^{D},B}$	$\displaystyle=\sum_{A\subseteq D}\frac{\hat{\mu}(B,A)}{|\textnormal{Aut}(A)|}M_{A}$
		$\displaystyle=\sum_{B\subseteq A\subseteq D}\frac{(-1)^{|A\setminus B|}}{1}\prod_{a\in A}r_{a}$
		$\displaystyle=\prod_{b\in B}r_{b}\prod_{d\in D\setminus B}(1-r_{d})$
		$\displaystyle\geq 0$

so the measure $\mu$ exists.
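
The inclusion-exclusion step in the computation of $v_{2^{D},B}$ can be checked exactly on a small level. The following sketch compares the alternating sum over $B\subseteq A\subseteq D$ with the product formula $\prod_{b\in B}r_{b}\prod_{d\in D\setminus B}(1-r_{d})$ for every $B\subseteq D$; the choice of $D$ and of the probabilities $r_{d}$ is arbitrary, and the function names are hypothetical helpers for this illustration only.

```python
from fractions import Fraction
from itertools import combinations
from math import prod


def subsets(s):
    """Yield every subset of the finite set s as a frozenset."""
    items = sorted(s)
    for size in range(len(items) + 1):
        for combo in combinations(items, size):
            yield frozenset(combo)


def v_alternating(D, B, r):
    """Sum over B ⊆ A ⊆ D of (-1)^|A \\ B| * prod_{a in A} r_a."""
    return sum(
        (-1) ** len(A - B) * prod(r[a] for a in A)
        for A in subsets(D)
        if B <= A
    )


def v_product(D, B, r):
    """Closed form: prod_{b in B} r_b * prod_{d in D \\ B} (1 - r_d)."""
    return prod(r[b] for b in B) * prod(1 - r[d] for d in D - B)


if __name__ == "__main__":
    D = frozenset({1, 2, 3, 4})
    r = {d: Fraction(1, d + 1) for d in D}  # arbitrary probabilities in [0, 1]
    for B in subsets(D):
        print(sorted(B), v_alternating(D, B, r), v_product(D, B, r))
```

The values $v_{2^{D},B}$ are nonnegative and sum to $1$ over $B\subseteq D$, as one expects for the probability that $\mathscr{G}\cap D=B$.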

We also consider that $\#{\rm Epi}(\mathscr{G},A)$ is the characteristic function of the event $(A\subseteq\mathscr{G})$ . Thus, we conclude that

\#{\rm Epi}(\mathscr{G},A)\#{\rm Epi}(\mathscr{G},B)=\#{\rm Epi}(\mathscr{G},A\cup B),

so that $M_{(A_{1},A_{2},...,A_{2k})}=M_{A_{1}\cup A_{2}\cup\cdots\cup A_{2k}}$ . In this example, we have a very convenient bound for the finite moments of unions

M_{A\cup B}=\prod_{a\in A\cup B}r_{a}\geq\prod_{a\in A}r_{a}\prod_{b\in B}r_{b}=M_{A}M_{B}

which follows from the assumption that $r_{n}\in[0,1]$ . In particular, this implies $M^{(j)}\leq M^{(2k)}$ for each $j\leq 2k$ .

Any ordering $f_{n}$ supported on a family of pairwise disjoint sets necessarily satisfies that $C^{2k}\setminus E(2k,M)$ intersected with the support of $f_{n}$ is precisely the collection of tuples $(A_{1},...,A_{2k})$ in the support of $f_{n}$ which has at most $k$ distinct coordinates. We choose $f_{n}$ to be the characteristic function of singleton sets $\{m\}$ for which $m\leq n$ . Up to the number of ways to choose matching coordinates, it suffices to consider objects in $C^{k}$ . Thus, we can bound

	$\displaystyle\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(j)}(dM)^{2k-j}$	$\displaystyle\leq\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(2k)}$
		$\displaystyle\ll_{k}\sum_{(A_{1},...,A_{k})\in C^{k}}f_{n}(A_{1})\cdots f_{n}(A_{k})M_{A_{1}\cup\cdots\cup A_{k}}$

Noting that $f_{n}$ is supported on singleton sets, this is equivalent to

	$\displaystyle=\sum_{|A|\leq k}\#\{a_{1},...,a_{k}\in A:A=\{a_{1},...,a_{k}\}\}\cdot\prod_{a\in A}f_{n}(\{a\})M_{\{a\}}$
	$\displaystyle\ll_{k}\sum_{1\leq|A|\leq k}\prod_{a\in A}f_{n}(\{a\})M_{\{a\}}$
	$\displaystyle\leq\left(1+\int_{C}f_{n}\ dM\right)^{k}-1.$

If $\int_{C}f_{n}\ dM\geq 1$ , we can bound this by a constant times the leading term $(\int_{C}f_{n}\ dM)^{k}$ , and otherwise this is $O(1)$ . Thus we conclude via Theorem 1.5 that

\displaystyle\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2k}d\mu(\mathscr{G})

\displaystyle\ll_{k}\left(\int_{C}f_{n}\ dM\right)^{k}+O(1).

Taking $E(n)=n^{1/k}\left(\int_{C}f_{n}\ dM\right)^{1/2}$ and $\gamma(n)=n^{2}$ in Theorem 4.2 (applied with exponent $2k$ ) then implies

\frac{N(\mathscr{G},f_{n})-\displaystyle\int_{C}f_{n}\ dM}{n^{1/k}\sqrt{\displaystyle\int_{C}f_{n}\ dM}}\overset{a.s.}{\longrightarrow}0.

The counting function is given by

	$\displaystyle N(\mathscr{G},f_{n})$	$\displaystyle=\sum_{m\leq n}\#{\rm Epi}(\mathscr{G},\{m\})$
		$\displaystyle=\#(\mathscr{G}\cap\{1,...,n\}),$

while the moments of the ordering are given by

	$\displaystyle\int_{C}f_{n}\ dM$	$\displaystyle=\sum f_{n}(A)M_{A}$
		$\displaystyle=\sum_{m\leq n}r_{m}.$

Taking $k$ sufficiently large concludes the proof. ∎

Keeping in mind the connection with random models for prime numbers, we also prove the following:

Corollary 6.2.

Let $\mathscr{G}$ be a random subset of $\mathbb{N}$ where we let the events $(n\in\mathscr{G})$ be pairwise independent with probability $r_{n}\in[0,1]$ . Let $\mathbf{1}_{S}$ be a characteristic function of some subset $S\subseteq\mathbb{N}$ , and define the corresponding ordering $f_{n}$ supported on singleton sets by $f_{n}=\mathbf{1}_{S\cap\{1,...,n\}}$ . Then

\frac{\displaystyle\#\{m\leq n:m\in\mathscr{G}\cap S\}-\sum_{\begin{subarray}{c}j\leq n\\ j\in S\end{subarray}}r_{j}}{\displaystyle n^{\epsilon}\sqrt{\sum_{\begin{subarray}{c}j\leq n\\ j\in S\end{subarray}}r_{j}}}\overset{a.s.}{\longrightarrow}0

as $n\to\infty$ .

If we let $r_{n}$ be the probabilities of Cramér’s model (or of the modifications improving the model), Heuristic 1.7 makes predictions for sets of prime numbers belonging to any subset $S\subseteq\mathbb{N}$ obeying the incredibly mild density condition

\sum_{\begin{subarray}{c}j\leq n\\ j\in S\end{subarray}}r_{j}\gg n^{\delta}

for some $\delta>0$ . This prediction is extremely broad, where the currently “known issues” amount to divisibility by small primes which are accounted for in corrections to Cramér’s random model (see [Gra95] for a nice summary). This example is not “novel”; such measure $1$ statements exist throughout the literature on random models for the primes generalizing Cramér’s work as a means to justify a number of the most famous conjectures on the distribution of primes. This includes the likes of the Hardy-Littlewood conjecture (where $f$ is the characteristic function on admissible constellations starting from $j$ ) and primes of the form $x^{2}+1$ (where $f$ is the characteristic function of integers of the form $x^{2}+1$ ).

In addition to being a short example with nice properties, this is meant to demonstrate two things. Firstly, this example demonstrates that important existing random models (and corresponding results for those models) for the distribution of prime numbers arise as special cases of Heuristic 1.7 (respectively Theorem 1.3) for the category of subsets of $\mathbb{N}$ . Secondly, this example demonstrates the power of working at this level of generality. We have given a concrete description for when such probability $1$ results will exist, which allows us to tackle numerous orderings at a time.

6.2 Random groups

Any sequence $M_{G}=O(|G|^{n})$ on the category of finite groups is well-behaved in the sense of [SW22], so their results can be used to determine when such a moment sequence gives rise to a probability distribution. We will focus on the example $M_{G}=1$ discussed in [SW22]. We can use Theorem 1.3 to make predictions for (asymptotically) how many quotients of a “random” profinite group have a given behavior. While questions of this nature are easy enough to formulate, the moments of the most natural orderings $f_{n}$ can be much harder to compute in practice. Utilizing the results of the preceding subsection, we give the following example:

Theorem 6.3.

Let $M_{G}=1$ on the category of finite abelian groups with associated probability measure $\mu$ on the profinite abelian groups. Then the number of maximal subgroups of index at most $n$ in a random pro-abelian group is asymptotic to $\log\log n$ almost surely. More specifically,

\frac{\#\{N\leq\mathscr{G}\text{ maximal}:[\mathscr{G}:N]\leq n\}}{\log\log n}\overset{a.s.}{\longrightarrow}1

as $n\to\infty$ .

Proof.

We take the ordering

f_{n}(G)=\begin{cases}\frac{1}{p-1}&G\cong C_{p},\ p\leq n\\ 0&\text{else}.\end{cases}

All maximal subgroups of an abelian group are normal with quotient isomorphic to $C_{p}$ for some prime $p$ , so we can compute

	$\displaystyle\#\{N\leq\mathscr{G}\text{ maximal}:[\mathscr{G}:N]\leq n\}$	$\displaystyle=\sum_{p\leq n}\frac{\#{\rm Epi}(\mathscr{G},C_{p})}{p-1}$
		$\displaystyle=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)$
		$\displaystyle=N(\mathscr{G},f_{n}).$

We note that $f_{n}$ is nonnegative and that $n\mapsto N(\mathscr{G},f_{n})$ is certainly nondecreasing in $n$ . Thus, we can apply Theorem 1.3(iii). We first compute

\displaystyle\int_{C}f_{n}\ dM

\displaystyle=\sum_{p\leq n}\frac{1}{p-1}\sim\log\log n.
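
The asymptotic $\sum_{p\leq n}\frac{1}{p-1}\sim\log\log n$ holds, but the convergence of the ratio to $1$ is slow, since by Mertens’ theorem the difference between the two sides stays bounded rather than tending to $0$. The sketch below, a numerical illustration with hypothetical helper names `primes_up_to` and `first_moment`, tabulates both the ratio and the difference.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes: list of all primes p <= limit."""
    if limit < 2:
        return []
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            # Cross out the multiples p*p, p*p + p, ... up to limit.
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [n for n in range(2, limit + 1) if sieve[n]]


def first_moment(n):
    """The moment of the ordering f_n when M_G = 1: sum over primes p <= n of 1/(p-1)."""
    return sum(1.0 / (p - 1) for p in primes_up_to(n))


if __name__ == "__main__":
    for n in (10**3, 10**4, 10**5, 10**6):
        moment = first_moment(n)
        loglog = math.log(math.log(n))
        print(n, moment, moment / loglog, moment - loglog)
```

In practice this means that numerical experiments with Theorem 6.3 at accessible values of $n$ should compare the counting function with $\sum_{p\leq n}\frac{1}{p-1}$ itself rather than with $\log\log n$.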

We next apply Theorem 1.4 to bound (3). The ordering $f_{n}$ is supported on the finite simple abelian groups $C_{p}$ , which are of pairwise coprime order. Thus, the intersection of $C^{2}\setminus E(2,M)$ with the support of $f_{n}$ is precisely the diagonal objects $(C_{p},C_{p})$ with $p\leq n$ . We compute

	$\displaystyle\int_{C^{2}\setminus E(2,M)}f_{n}\ dM^{(2)}$	$\displaystyle=\sum_{p\leq n}\frac{1}{(p-1)^{2}}M^{(2)}_{(C_{p},C_{p})}$
		$\displaystyle=\sum_{p\leq n}\frac{1}{(p-1)^{2}}\left(M_{C_{p}\times C_{p}}+(p-1)M_{C_{p}}\right)$
		$\displaystyle=\sum_{p\leq n}\frac{p}{(p-1)^{2}}$
		$\displaystyle\sim\log\log n,$

where the second equality follows from Lemma 5.3, noting that the subgroups of $C_{p}\times C_{p}$ that surject onto each coordinate are the whole group, and $p-1$ subgroups isomorphic to $C_{p}$ . The $j=1$ integral actually converges as $n\to\infty$ by a similar computation, so this is the maximum. All together Theorem 1.4 implies

\begin{aligned}\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2}d\mu(\mathscr{G})&=O(\log\log n)\\ &=O\left(\frac{(\log\log n)^{2}}{(\log\log n)^{1/2}}\right)\\ &=O\left(\frac{\left(\int_{C}f_{n}\ dM\right)^{2}}{\psi\left(\int_{C}f_{n}\ dM\right)}\right),\end{aligned}

where $\psi(t)=t^{1/2}$ is nondecreasing with $\lim_{t\to\infty}\psi(t)=\infty$. We confirm that $\sum\frac{1}{n\psi(n)}=\sum n^{-3/2}<\infty$, so the result follows from Theorem 1.3(iii). ∎
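As a sanity check on the second-moment term in the proof above, the subgroup count can be verified by hand for $p=3$ (an illustrative case chosen here, not taken from the original argument). The subgroups of $C_{3}\times C_{3}$ surjecting onto both coordinates are the whole group together with the graphs $\{(x,x)\}$ and $\{(x,2x)\}$ of the two automorphisms of $C_{3}$, so with $M_{G}=1$ we get

M_{(C_{3},C_{3})}=M_{C_{3}\times C_{3}}+2M_{C_{3}}=3,\qquad\frac{M_{(C_{3},C_{3})}}{(3-1)^{2}}=\frac{3}{4},

which agrees with the summand $\frac{p}{(p-1)^{2}}$ at $p=3$.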

More general questions on random groups can be asked, but much about the asymptotic behavior of large families of groups is difficult to compute. This makes asymptotic growth rates and error terms also difficult to compute in practice. For instance, it would be nice to study the ordering $f_{n}(G)=\frac{1}{|\textnormal{Aut}(G)|}$ if $|G|\leq n$ and $0$ otherwise. The corresponding counting function is then

N(\mathscr{G},f_{n})=\#\{N\trianglelefteq\mathscr{G}\mid[\mathscr{G}:N]\leq n\}.

This is a naturally interesting function to ask about. However, the corresponding moment

\int_{C}f_{n}\ dM=\sum_{|G|\leq n}\frac{1}{|\textnormal{Aut}(G)|}

is not easy to determine. For instance, there are reasons to suspect that 100% of groups ordered by cardinality are $2$-groups, but the author is not aware of a proof of this statement. Moreover, the bound in Theorem 1.4 requires some more serious group theory to translate the results of Lemma 5.3 into bounds on the error.
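To make this moment concrete, its first few terms can be computed by hand; the following is a small illustration added here, taking $M_{G}=1$ as in the sum above. The groups of order at most $4$ are $1$, $C_{2}$, $C_{3}$, $C_{4}$, and $C_{2}\times C_{2}$, all abelian, with automorphism groups of orders $1$, $1$, $2$, $2$, and $6$ respectively, so

\sum_{|G|\leq 4}\frac{1}{|\textnormal{Aut}(G)|}=1+1+\frac{1}{2}+\frac{1}{2}+\frac{1}{6}=\frac{19}{6}.

Continuing this for large $n$ requires controlling every isomorphism class of order at most $n$ together with its automorphism group, which is where the difficulty lies.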

A number of interesting statistical questions that are simple to state arise in this way. The language of Theorem 1.3 provides a framework for determining asymptotic growth rates, even if the moments of the ordering require additional study before any results can be proved.

References

[Alb23] Brandon Alberts. A random group with local data, 2023. Forthcoming.
[BBH17] Nigel Boston, Michael R. Bush, and Farshid Hajir. Heuristics for $p$ -class towers of imaginary quadratic fields. Mathematische Annalen, 368(1-2):633–669, June 2017.
[BM90] V. V. Batyrev and Yu. I. Manin. Sur le nombre des points rationnels de hauteur borné des variétés algébriques. Mathematische Annalen, 286(1-3):27–43, 1990.
[CL84] Henri Cohen and Hendrik W. Lenstra. Heuristics on class groups of number fields. Lecture Notes in Mathematics Number Theory Noordwijkerhout 1983, pages 33–62, 1984.
[Cra94] Harald Cramér. Some theorems concerning prime numbers. Springer Collected Works in Mathematics, pages 138–170, 1994.
[Ele18] (https://math.stackexchange.com/users/7145/elements) Elements. Weak law of large numbers for dependent random variables with bounded covariance. Mathematics Stack Exchange, 2018. URL:https://math.stackexchange.com/q/245327 (version: 2018-04-10).
[Ete81] N. Etemadi. An elementary proof of the strong law of large numbers. Z. Wahrscheinlichkeitstheorie und Verwandte Gebiete, 55(1):119–122, 1981.
[Ete83a] N. Etemadi. On the laws of large numbers for nonnegative random variables. Journal of Multivariate Analysis, 13(1):187–193, 1983.
[Ete83b] N. Etemadi. Stability of sums of weighted nonnegative random variables. Journal of Multivariate Analysis, 13(2):361–365, 1983.
[Fis11] Hans Fischer. A history of the central limit theorem: From classical to modern probability theory. Springer New York, 2011.
[FMT89] Jens Franke, Yuri I. Manin, and Yuri Tschinkel. Rational points of bounded height on Fano varieties. Inventiones Mathematicae, 95(2):421–435, 1989.
[FW89] Eduardo Friedman and Lawrence C. Washington. On the distribution of divisor class groups of curves over a finite field. Théorie des nombres / Number Theory, pages 227–239, 1989.
[Gra95] Andrew Granville. Harald Cramér and the distribution of prime numbers. Scandinavian Actuarial Journal, 1995(1):12–28, 1995.
[KP10] V. M. Korchevsky and V. V. Petrov. On the strong law of large numbers for sequences of dependent random variables. Vestnik St. Petersburg University: Mathematics, 43(3):143–147, 2010.
[LWZB19] Yuan Liu, Melanie Matchett Wood, and David Zureick-Brown. A predicted distribution for Galois groups of maximal unramified extensions, July 2019. Preprint available at https://arxiv.org/abs/1907.05002.
[Mal02] Gunter Malle. On the distribution of Galois groups. Journal of Number Theory, 92(2):315–329, 2002.
[Mal04] Gunter Malle. On the distribution of Galois groups, II. Experimental Mathematics, 13(2):129–135, 2004.
[Mic17] (https://math.stackexchange.com/users/155065/michael) Michael. Pairwise uncorrelated random variables in strong law of large numbers (slln). Mathematics Stack Exchange, 2017. URL:https://math.stackexchange.com/q/2545239 (version: 2017-11-30).
[MU17] Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and probabilistic techniques in algorithms and data analysis. Cambridge University Press, 2017.
[Pet09a] V. V. Petrov. On stability of sums of nonnegative random variables. Journal of Mathematical Sciences, 159(3):324–326, 2009.
[Pet09b] V. V. Petrov. On the strong law of large numbers for nonnegative random variables. Theory of Probability & Its Applications, 53(2):346–349, 2009.
[Sen13] Eugene Seneta. A tricentenary history of the law of large numbers. Bernoulli, 19(4), 2013.
[SS93] Pranab Kumar Sen and Julio da Motta Singer. Large sample methods in statistics: An introduction with applications. Chapman & Hall/CRC, 1 edition, 1993.
[SW22] Will Sawin and Melanie Matchett Wood. The moment problem for random objects in a category, Oct 2022.
[wik22] Law of large numbers, Oct 2022.

\begin{aligned}\int_{\mathcal{P}}\left\lvert\sum_{G\in C/\cong}\|f_{n}(G)\|\#{\rm Epi}(\mathscr{G},G)\right\rvert\ d\mu(\mathscr{G})&=\int_{\mathcal{P}}\sum_{G\in C/\cong}\|f_{n}(G)\|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})\\ &=\sum_{G\in C/\cong}\int_{\mathcal{P}}\|f_{n}(G)\|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})\\ &=\sum_{G\in C/\cong}\|f_{n}(G)\|M_{G}<\infty.\end{aligned}

\begin{aligned}&\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})\\ &\leq\int_{\mathscr{G}}\left(\|N(\mathscr{G},f_{n})\|+\int_{C}\|f_{n}\|\ dM\right)^{k}d\mu(\mathscr{G})\\ &\leq\int_{\mathscr{G}}\left(\sum_{G\in C/\cong}\|f_{n}(G)\|(\#{\rm Epi}(\mathscr{G},G)+M_{G})\right)^{k}d\mu(\mathscr{G})\\ &\leq\sum_{(G_{1},...,G_{k})\in C^{k}/\cong}\|f_{n}(G_{1},...,G_{k})\|\sum_{j=0}^{2k}\sum_{\sigma\in S_{k}}M_{(G_{\sigma(1)},...,G_{\sigma(j)})}^{(j)}M_{G_{\sigma(j+1)}}\cdots M_{G_{\sigma(k)}},\end{aligned}