This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Counting Functions for Random Objects in a Category

Brandon Alberts
Abstract

In arithmetic statistics and analytic number theory, the asymptotic growth rate of counting functions giving the number of objects with order below XX is studied as XX\to\infty. We define general counting functions which count epimorphisms out of an object on a category under some ordering. Given a probability measure μ\mu on the isomorphism classes of the category with sufficient respect for a product structure, we prove a version of the Law of Large Numbers to give the asymptotic growth rate as XX tends towards \infty of such functions with probability 11 in terms of the finite moments of μ\mu and the ordering. Such counting functions are motivated by work in arithmetic statistics, including number field counting as in Malle’s conjecture and point counting as in the Batyrev-Manin conjecture. Recent work of Sawin–Wood gives sufficient conditions to construct such a measure μ\mu from a well-behaved sequence of finite moments in very broad contexts, and we prove our results in this broad context with the added assumption that a product structure in the category is respected. These results allow us to formalize vast heuristic predictions about counting functions in general settings.

1 Introduction

Distributions of arithmetic objects are commonly studied via counting functions. A classic example is the prime number theorem, which determines the asymptotic growth rate of

π(X)=#{p prime pX}XlogX\pi(X)=\#\{p\text{ prime }\mid p\leq X\}\sim\frac{X}{\log X}

as XX tends to infinity. Counting functions on subsets of prime numbers is one way that we work to understand the distribution of the prime numbers. Cramér made rigorous a heuristic argument of Gauss for predicting the distribution of prime numbers in [Cra94], whereby a sequence of independent variables XnX_{n} indexed by natural numbers and valued in {0,1}\{0,1\} are considered, for which Xn=1X_{n}=1 with probability 1log(n)\frac{1}{\log(n)} as suggested by the prime number theorem. By asking the same distribution questions of the sequence XnX_{n} instead of the sequence of prime numbers, Cramér is able to prove that this sequence satisfies a number of properties conjectured to be true for the primes with probability 11. See [Gra95] for a summary of Cramér’s random model and some improvements.

In arithmetic statistics, other objects are frequently studied in this way. Given an ordering of the objects satisfying a Northcott property, one can ask for the asymptotic growth rate of the counting function

#{objectorder(object)X}\displaystyle\#\{\text{object}\mid\text{order(object)}\leq X\} (1)

as XX tends to \infty. The standard examples are points on schemes (or stacks) bounded by height, whose asymptotic growth rates are predicted by the Batyrev-Manin conjecture [FMT89, BM90], and counting GG-extensions of a global field KK ordered by discriminant, whose asymptotic growth rates are predicted by Malle’s conjecture [Mal02, Mal04].

Malle’s conjecture in particular is closely related to the study of class group statistics, where one asks for the distribution of the pp-parts of the class group of an extension K/K/\mathbb{Q} as KK varies over some family of extensions ordered by discriminant. This distribution is very often predicted to agree with a true probability distribution on random groups. For a non-exhaustive list of examples for statistics of very general unramified objects over a family of global fields see [CL84, FW89, BBH17, LWZB19]. Despite this close relation, there has been no attempt to make predictions for Malle’s counting function using similar random structures. Like with Cramér’s model, it is reasonable to consider that there exists a random group with some discriminant structure which models the absolute Galois group of the global field KK for which Malle’s conjecture holds with probability 11. Potentially the same is true for other counting functions in arithmetic statistics.

Another eye-catching feature of Malle’s conjecture is the Galois correspondence. GG-extensions L/KL/K are in a 11-to-|Aut(G)||\textnormal{Aut}(G)| correspondence with surjective homomorphisms Gal(K¯/K)G\textnormal{Gal}(\overline{K}/K)\to G. The recent work of Sawin–Wood [SW22] solved the moment problem for random objects in a category, where they proved in great generality that given a sequence MGM_{G} on (isomorphism classes of) the category of finite objects CC that “do not grow too fast”, there exists a measure μ\mu on (isomorphism classes of) the category of pro-objects 𝒫\mathcal{P} whose finite moments are given by MGM_{G}. In the setting of objects in a category, this is taken to mean

𝒫#Epi(𝒢,G)𝑑μ(𝒢)=MG\int_{\mathcal{P}}\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})=M_{G}

for each finite object GG. Here we see an immediate similarity with Malle’s conjecture - Malle’s conjecture can be interpreted as an asymptotic count for the number of epimorphisms from the absolute Galois group of KK to a fixed finite group. This suggests that the setting Sawin–Wood work in, which they state was built with statistics of unramified objects in mind, is also an excellent setting to model Malle’s conjecture. By extension, random models for other counting functions may also exist in the setting considered by Sawin–Wood as long as they can be interrpretted as counting epimorphisms.

The goal of this paper is to consider counting functions in the broad setting described by Sawin–Wood. The remainder of this introduction is separated into two subsections - the first defining a counting function with respect to an ordering for counting epimorphisms in a category (see Definition 1.1), and the second for stating Law of Large Numbers results to determine the asymptotic growth rate of these counting functions with probability 11 (see Theorem 1.3). We will prove the Law of Large Numbers for random objects in a category in as great of generality as possible, with the intention of allowing these results to easily translate to a variety of other counting functions in other settings. In a forthcoming paper [Alb23], the author will apply these results to the case of Malle’s conjecture to both recreate and improve on existing predictions for the asymptotic growth rate. This will be done by constructing a category of “groups with local data”, applying the results of [SW22] to construct a measure on this category, and applying the results of this paper to prove that 100% of random groups with local data satisfy Malle’s conjecture.

We adopt the notation and terminology of [SW22] throughout. Let CC be a diamond category in the sense of [SW22, Definition 1.3] with countably many isomorphism classes, and suppose μ\mu is a probability measure (so the whole space has measure 11) on the isomorphism classes of corresponding pro-objects, 𝒫\mathcal{P}, under the level topology with finite moments MGM_{G} for each GC/G\in C/\cong. Sawin–Wood give sufficient conditions for a sequence MGM_{G} to correspond to such a measure in [SW22, Theorem 1.7 and 1.8]. Our results do not require μ\mu to be constructed in this way, but the author expects that most interesting examples will come from the existence results proven by Sawin–Wood.

1.1 Counting Functions

To translate the setting given by [SW22] to that of counting functions more closely resembling (1), we very generally define an ordering to be a sequence of functions fn:C/f_{n}:C/\cong\to\mathbb{C} indexed by positive integers nn. We are motivated by classical orderings such as the discriminant and height functions, which correspond to fnf_{n} being the characteristic function of an increasing chain of finite sets AnC/A_{n}\subseteq C/\cong, i.e. a chain of subsets for which AnAn+1A_{n}\subseteq A_{n+1}. In the general format described in (1), we would choose the ordering

fn(object)={1order(object)n0else.f_{n}({\rm object})=\begin{cases}1&{\rm order}({\rm object})\leq n\\ 0&\text{else}.\end{cases}

The property that such functions have finite support is called the Northcott Property, and is known to hold for discriminant and height orderings among many others.

Definition 1.1.

The counting function on a pro-object 𝒢𝒫\mathscr{G}\in\mathcal{P} ordered by fn:C/f_{n}:C/\cong\to\mathbb{C} is defined by

N(𝒢,fn)=GC/fn(G)#Epi(𝒢,G)\displaystyle N(\mathscr{G},f_{n})=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)

as a function of nn, when the series is convergent.

In the classical setting, a Northcott Property is used to guarantee that the counting function is well-defined by forcing the series to be finite. We remark that in the classical setting order(object){\rm order}({\rm object}) is taken to be bounded above by a real number XX, while Definition 1.1 restricts to only using integer bounds, nn. This will not make a difference for classical orderings, as the (norm of the) discriminant and height functions are integer-valued. However, it is an artifact of the methods we use that we ask the sequence (fn)n=1(f_{n})_{n=1}^{\infty} to be indexed by a countable set rather than the uncountable set of positive real numbers.

In our general setting, we may relax the Northcott requirement and still guarantee an (almost everywhere) well-defined counting function. Fix a probability measure μ\mu on the category of pro-objects 𝒫\mathcal{P} with finite moments MGM_{G}. For convenience, we package the sequence MGM_{G} as a discrete measure M:C/0M:C/\cong\to\mathbb{R}_{\geq 0} with M({G})=MGM(\{G\})=M_{G} (note that while we require μ\mu to be a probability measure, MM need not be).

Definition 1.2.

Fix a probability measure μ\mu on the category of pro-objects 𝒫\mathcal{P} with finite moments MGM_{G}. We call fnf_{n} an L1L^{1}-ordering with respect to μ\mu if

C|fn|𝑑M=GC/|fn(G)|MG<.\int_{C}|f_{n}|\ dM=\sum_{G\in C/\cong}|f_{n}(G)|M_{G}<\infty.

In other words, the fnf_{n} are L1L^{1}-functions for the discrete measure MM induced by μ\mu. We will often omit “with respect to μ\mu” if the probability measure is clear from context.

We will prove in Lemma 3.1 that N(𝒢,fn)N(\mathscr{G},f_{n}) is well-defined as a function of nn for almost all 𝒢\mathscr{G} whenever fnf_{n} is an L1L^{1}-ordering (i.e., there exists a measure 11 set of 𝒢\mathscr{G} on which the counting function is well-defined as a function on the positive integers), and that the expected value is given by

𝒫N(𝒢,fn)𝑑μ(𝒢)=Cfn𝑑M.\displaystyle\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu(\mathscr{G})=\int_{C}f_{n}\ dM. (2)

In the classical case, if fnf_{n} has finite support in an increasing chain as nn tends towards \infty then the counting function N(𝒢,fn)N(\mathscr{G},f_{n}) is a sum of increasing length depending on nn, where the summands fn(G)#Epi(𝒢,G)f_{n}(G)\#{\rm Epi}(\mathscr{G},G) are random variables as 𝒢\mathscr{G} varies according to μ\mu. This is precisely the setting of the Law of Large Numbers, which suggests a much stronger result of the form

N(𝒢,fn)Cfn𝑑M1\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\longrightarrow 1

as nn\to\infty with probability 11 in some appropriate sense. Typically, there is some requirement of pairwise independence between the summands in order to prove such a statement (see [SS93] for the classical statement, and [Sen13] for a brief history of the Law of Large Numbers, including cases that relax the requirement of pairwise independence). The work of Korchevsky–Petrov [KP10] on nonegative random variables, building on work in [Ete83a, Ete83b, Pet09a, Pet09b], is particularly relevant to these counting functions, as #Epi(𝒢,G)\#{\rm Epi}(\mathscr{G},G) is always nonegative. We will prove several different versions of the Law of Large Numbers for N(𝒢,fn)N(\mathscr{G},f_{n}) of varying strengths under sufficiently nice orderings.

1.2 Main Results

Take CC to be a diamond category with countably many isomorphism classes, 𝒫\mathcal{P} the category of pro-objects, and μ\mu a probability measure on 𝒫\mathcal{P} with finite moments MGM_{G}. Theorem 1.3 is our primary probabilistic result, while Theorems 1.4 and 1.5 are our primary category theoretic results.

Theorem 1.3 (Law of Large Numbers in a Category).

Let fn:C/f_{n}:C/\cong\to\mathbb{R} be a real-valued L1L^{1}-ordering for which lim infn|Cfn𝑑M|>0\liminf_{n\to\infty}\left\lvert\int_{C}f_{n}dM\right\rvert>0. Suppose there exists an integer kk and a non-decreasing function γ:+\gamma:\mathbb{N}\to\mathbb{R}^{+} for which limtγ(t)=\lim_{t\to\infty}\gamma(t)=\infty, and

𝒢|N(𝒢,fn)Cfn𝑑M|k𝑑μ(𝒢)=O(|Cfn𝑑M|kγ(n)).\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})=O\left(\frac{\left\lvert\int_{C}f_{n}\ dM\right\rvert^{k}}{\gamma(n)}\right). (3)

Then each of the following hold:

  1. (i)

    (Weak Law of Large Numbers)

    N(𝒢,fn)Cfn𝑑Mp.1\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\overset{p.}{\longrightarrow}1

    as nn\to\infty, where the “p.” stands for converges in probability with respect to μ\mu.

  2. (ii)

    (Strong Law of Large Numbers) If we additionally assume that n=11γ(n)<\sum_{n=1}^{\infty}\frac{1}{\gamma(n)}<\infty then

    N(𝒢,fn)Cfn𝑑Ma.s.1\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\overset{a.s.}{\longrightarrow}1

    as nn\to\infty, where the “a.s.” stands for converges almost surely with respect to μ\mu.

  3. (iii)

    (Strong Law of Large Numbers) If we additionally assume that

    • fnf_{n} is nonegative,

    • the counting function nN(𝒢,fn)n\mapsto N(\mathscr{G},f_{n}) is almost everywhere nondecreasing, and

    • γ(n)=ψ(Cfn𝑑M)\gamma(n)=\psi(\int_{C}f_{n}dM) for a nondecreasing function ψ:+\psi:\mathbb{R}\to\mathbb{R}^{+} for which n=11nψ(n)<\sum_{n=1}^{\infty}\frac{1}{n\psi(n)}<\infty,

    then

    N(𝒢,fn)Cfn𝑑Ma.s.1\frac{N(\mathscr{G},f_{n})}{\displaystyle\int_{C}f_{n}\ dM}\overset{a.s.}{\longrightarrow}1

    as nn\to\infty, where the “a.s.” stands for converges almost surely with respect to μ\mu.

The primary new contribution of Theorem 1.3 is the connection between counting functions and the Law of Large Numbers. The probabilistic content is largely standard, and in fact the proof of Theorem 1.3(iii) is essentially the same as that of [KP10]. We follow a standard technique for bounding probabilities using Chebyshev’s law and the first Borel-Cantelli lemma. Most standard references will have some version of this technique, sometimes in the context of the “method of moments”, such as [Fis11, Chapter 4]. This technique is ubiquitous to the point that examples can be found in proofs on Wikipedia [wik22] and a number of Math StackExchange posts such as [Mic17, Ele18].

If the fnf_{n} are specifically characteristic functions with finite support in an increasing chain then Theorem 1.3(iii) is a special case of [KP10, Theorem 3]. We state and prove these results separately from work in [KP10] to allow for a more general class of orderings, although the proof requires no new ideas. For example, we can now directly consider orderings of the form

fn(object)={1n<order(object)2n0else.f_{n}({\rm object})=\begin{cases}1&n<{\rm order}({\rm object})\leq 2n\\ 0&\text{else}.\end{cases}

This type of ordering gives counting functions N(𝒢,fn)N(\mathscr{G},f_{n}) that count objects landing in the moving interval (n,2n](n,2n]. These sorts of counting functions are used to avoid errant behavior for small objects, and can be useful for proving averaging results.

In cases for which lim infn|Cfn𝑑M|=0\liminf_{n\to\infty}\left\lvert\int_{C}f_{n}dM\right\rvert=0, there is something we can say using the more general results Theorem 4.1 and Theorem 4.2 we prove in the main body of the paper. This requires a finer study of the rate of convergence, as we are no longer allowed to have Cfn𝑑M\int_{C}f_{n}dM on the denominator for such cases.

The bound on kthk^{\rm th} moments in (3) is a concrete way to state that the random variables are “close enough to independent” for the Law of Large Numbers. Our primary categorical result is on bounding this quantity in terms of purely categorical structures of CC and 𝒫\mathcal{P} together with the discrete moments of the ordering fnf_{n}.

Given an ordering fn:C/f_{n}:C/\cong\to\mathbb{C}, we extend the ordering multiplicatively to the product category fn:Ck/f_{n}:C^{k}/\cong\to\mathbb{C} by

fn(G1,,Gk)=fn(G1)fn(Gk).f_{n}(G_{1},...,G_{k})=f_{n}(G_{1})\cdots f_{n}(G_{k}).

We also define the discrete measure M(j):Cj/0{}M^{(j)}:C^{j}/\cong\to\mathbb{R}_{\geq 0}\cup\{\infty\} given by the mixed moments of the measure μ\mu as

M(G1,,Gj)(j)=𝒫i=1j#Epi(𝒢,Gi)dμ(𝒢).M_{(G_{1},...,G_{j})}^{(j)}=\int_{\mathcal{P}}\prod_{i=1}^{j}\#{\rm Epi}(\mathscr{G},G_{i})\ d\mu(\mathscr{G}).

This is sufficient information to prove a bound for (3) for k=2k=2.

Theorem 1.4.

Let fn:C/f_{n}:C/\cong\to\mathbb{R} be a real-valued L1L^{1}-ordering. Then

𝒢|N(𝒢,fn)Cfn𝑑M|2𝑑μ(𝒢)kmaxj{1,2}C2E(2,M)|fn|𝑑M(j)(dM)2j,\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2}d\mu(\mathscr{G})\ll_{k}\max_{j\in\{1,2\}}\int_{C^{2}\setminus E(2,M)}|f_{n}|\ dM^{(j)}(dM)^{2-j},

where E(2,M)C2E(2,M)\subseteq C^{2} is the full subcategory of pairs (G1,G2)C2(G_{1},G_{2})\in C^{2} such that

  • (a)

    the epi-product G1×epiG2G_{1}\times_{\rm epi}G_{2} exists (defined in Definition 5.1), and

  • (b)

    MG1×epiG2=MG1MG2M_{G_{1}\times_{\rm epi}G_{2}}=M_{G_{1}}M_{G_{2}}.

Taking away the full subcategory E(2,M)E(2,M) of C2C^{2} is the new content, and allows for bounds on the higher moments that can be computed solely from data given by MM, CC, and fnf_{n} without reference to the measure μ\mu. The epi-product in condition (a) is a special type of product structure in a category, satisfying a similar universal property to the direct product with every morphism in the diagram being an epimorphism (Definition 5.1). We will show that the existence of an epi-product together with (b) implies the random variables #Epi(𝒢,G1)\#{\rm Epi}(\mathscr{G},G_{1}) and #Epi(𝒢,G2)\#{\rm Epi}(\mathscr{G},G_{2}) are uncorrelated. This will be key for bounding the moments. It will often be the case that determining the existence or nonexistence of epi-products is easier than approaching the bound (3) in Theorem 1.3 directly. We include some examples using this approach in Section 6.

We will prove a similar bound for (3) when k>2k>2 is even, but this will require more assumptions on the category. These extra assumptions can prove useful for applying Theorem 1.4, as they give us extra tools for calculating the integral with respect to M(2)M^{(2)}. Lemma 5.3 will be very useful in this regard, and we give an example demonstrating its use in Section 6.

Theorem 1.5.

Suppose CC is a category with for which every morphism factors uniquely (up to isomorphism) as a composition of an epimorphism with a monomorphism. Let fn:C/f_{n}:C/\cong\to\mathbb{R} be a real-valued L1L^{1}-ordering and kk a positive integer. Then

𝒢|N(𝒢,fn)Cfn𝑑M|2k𝑑μ(𝒢)kmaxj{1,,2k}C2kE(2k,M)|fn|𝑑M(j)(dM)2kj,\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2k}d\mu(\mathscr{G})\ll_{k}\max_{j\in\{1,...,2k\}}\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(j)}(dM)^{2k-j},

where E(2k,M)C2kE(2k,M)\subseteq C^{2k} is the full subcategory of tuples (G1,G2,,G2k)(G_{1},G_{2},...,G_{2k}) for which there exists an index ii such that for each subset A{1,,k}A\subseteq\{1,...,k\} with iAi\not\in A

  • (a)

    the product object GA:=mAGmG_{A}:=\prod_{m\in A}G_{m} exists in C2kC^{2k},

  • (b)

    GAG_{A} has finitely many subobjects up to isomorphism,

  • (c)

    the epi-product Gi×epiHG_{i}\times_{\rm epi}H exists for each subobject HGAH\hookrightarrow G_{A} (defined in Definition 5.1), and

  • (d)

    MGi×epiH=MGiMHM_{G_{i}\times_{\rm epi}H}=M_{G_{i}}M_{H} for each subobject HGAH\hookrightarrow G_{A}.

Unique factorization of morphisms is known for many categories of interest, including finite sets, finite groups, and finite modules over a ring. In fact, each of these categories automatically satisfies parts (a) and (b) as well. Thus, in numerous categories of interest, Theorem 1.5 requires no extra input over Theorem 1.4.

We also prove an upper bound utilizing simpler categorical structures that holds without the restrictions of Theorem 1.3.

Corollary 1.6.

Let CC be a diamond category which has countably many isomorphism classes, μ\mu be a probability measure on the isomorphism classes of the corresponding category of pro-objects 𝒫\mathcal{P} with finite moments MGM_{G}, and fn:C/f_{n}:C/\cong\to\mathbb{C} an L1L^{1}-ordering.

Then for any ϵ>0\epsilon>0 and any positive integer kk

N(𝒢,fn)n1+ϵkmaxj{1,,k}{Ck|fn|dM(j)(dM)kj}1/ka.s.0\frac{N(\mathscr{G},f_{n})}{\displaystyle n^{\frac{1+\epsilon}{k}}\max_{j\in\{1,...,k\}}\left\{\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right\}^{1/k}}\overset{a.s.}{\longrightarrow}0

as nn\to\infty, where the “a.s.” stands for converges almost surely with respect to μ\mu.

Corollary 1.6 is motivated by Malle’s predicted weak upper bound, although the correspondence is less than obvious. In the classical case fnf_{n} is a nonegative function so that fn𝑑M=|fn|𝑑M\int f_{n}\ dM=\int|f_{n}|\ dM, and when j=1j=1 the denominator is given by

n1+ϵkCfn𝑑M.n^{\frac{1+\epsilon}{k}}\int_{C}f_{n}\ dM.

Ignoring the other values of jj and taking kk sufficiently large, Corollary 1.6 appears to give an upper bound of the form

N(𝒢,fn)nϵCfn𝑑MN(\mathscr{G},f_{n})\ll n^{\epsilon}\int_{C}f_{n}\ dM

with probability 11, where we recall that nn is playing the role of XX in Malle’s counting function. Of course, we cannot actually ignore the other values of jj, and depending on CC and MM it is possible for these bounds to be worse.

Theorem 1.3 and Corollary 1.6 can be used to provide evidence for conjectures about counting functions in great generality. We state this as an explicit heuristic:

Heuristic 1.7 (Vast Counting Heuristic).

Let CC be a diamond category which has countably many isomorphism classes and let μ\mu be a probability measure on the isomorphism classes of the corresponding category of pro-objects 𝒫\mathcal{P} with finite moments MGM_{G}. Choose an L1L^{1}-ordering fnf_{n}.

Let 𝔾\mathbb{G} be an element of 𝒫/\mathcal{P}/\cong for which N(𝔾,fn)N(\mathbb{G},f_{n}) is well-defined. If 𝔾\mathbb{G} is an object we expect to behave typically with respect to the probability measure μ\mu, such as an object coming from arithmetic, then

  • (i)

    (Strong form) If fnf_{n} satisfies the bound in (3) for some positive integer kk, then

    N(𝔾,fn)Cfn𝑑MN(\mathbb{G},f_{n})\sim\int_{C}f_{n}\ dM

    as nn\to\infty.

  • (ii)

    (Weak form) For each ϵ>0\epsilon>0 and each positive integer kk

    N(𝔾,fn)n1+ϵkmaxj{0,,k}(Cj|fn|dM(j)(dM)kj)1/k.N(\mathbb{G},f_{n})\ll n^{\frac{1+\epsilon}{k}}\max_{j\in\{0,...,k\}}\left(\int_{C^{j}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right)^{1/k}.

The asterisk on “typically” is meant to highlight that this word is where the intricacies of counting conjectures lie, and in particular is in recognition of counter-examples to Malle’s conjecture and the like. For number field counting, Malle’s original conjecture is incorrect due to behavior coming from roots of unity that Malle did not initially consider. For the purpose of a general statement, this can be interpreted as saying “if there are no obstructions, this is what we expect”. Heuristic 1.7 provides a baseline for counting predictions when we believe there are no “non-random” structures of 𝔾\mathbb{G} missing from the measure μ\mu on the category 𝒫\mathcal{P}. Issues, like those occurring in Malle’s conjecture from roots of unity, might be fixed by either (a) incorporating the missing structure into a modified version of the category (this is analogous to the approach commonly take in the study of Malle’s conjecture), or (b) excluding epimorphisms from the count which are affected by the missing structure (this is the approach taken for the Batyrev-Manin conjecture).

The literature on proving new cases of the Law of Large Numbers is vast, and it is possible that Theorem 1.3 may be extended with finer probabilistic arguments. From a categorical perspective, we’ve stated Theorem 1.4, Theorem 1.5, Corollary 1.6 and Heuristic 1.7 in the greatest generality possible so that they might be applied to make predictions on a variety of different counting functions in arithmetic statistics independent of the author’s immediate knowledge. We hope that these results will be useful as a starting point for many future predictions for the asymptotic growth rates of counting functions.

1.3 Layout of the Paper

We begin in Section 2 with a discussion of what we mean by an asymptotic version of the Law of Large Numbers. This is conceptually important for understanding the methods of this paper, as the references to the Law of Large Numbers are not just in analogy. Korchevskey–Petrov’s result [KP10, Theorem 3] is a particular example of the type of Law of Large Numbers result we consider.

We prove well-definedness of the counting function in Section 3. We give the probabilistic proofs of Theorem 1.3 and Corollary 1.6 in Section 4. These proofs closely follow the methods of [KP10]. We define epi-products in Section 5, prove some useful features of these objects, and prove Theorems 1.4 and 1.5.

Lastly, we include Section 6 to work out a couple of important special cases. The arithmetic statistics questions that primarily motivated the author will require independent work to set up the necessary category, measure, and orderings which the author will present in a forthcoming paper [Alb23]. We instead include simpler examples where these steps are either shorter or done by previous work such as in [SW22], and we pay particular attention to when epi-products exist in such cases. Despite the easier nature of these examples, we recover existing important random models from number theory as special cases of Theorem 1.3 and Heuristic 1.7.

Acknowledgments

The author would like to thank Melanie Matchett Wood for feedback and helpful discussions on the mechanics on [SW22].

2 An asymptotic version of the Law of Large Numbers

Given a sequence of independent, identically distributed random variables {Xi}\{X_{i}\} the Law of Large Numbers says that we expect 1n(X1++Xn)\frac{1}{n}(X_{1}+\cdots+X_{n}) to converge to the expected value of XiX_{i} in some sense. A version of this statement can sometimes also hold for random variables which are neither independent nor identically distributed. If the variables are not identically distributed, then 𝔼[Xi]\mathbb{E}[X_{i}] may vary depending on ii, which necessarily alters the statement we want to make. We will instead be comparing

i=1nXi\displaystyle\sum_{i=1}^{n}X_{i} with i=1n𝔼[Xi]\displaystyle\sum_{i=1}^{n}\mathbb{E}[X_{i}]

as nn\to\infty. When the random variables are pairwise independent, identically distributed, and have bounded variance in some sense, the difference of these is known to grow slower than nn almost surely by Kolmogorov’s strong law [Ete81, SS93]. Rather than a convergence statement, we will state the Law of Large Numbers as an asymptotic statement. The conclusion of Kolmogorov’s strong law would then be written as

Prob(1ni=1nXi1ni=1n𝔼[Xi]=o(1) as n)=1.{\rm Prob}\left(\frac{1}{n}\sum_{i=1}^{n}X_{i}-\frac{1}{n}\sum_{i=1}^{n}\mathbb{E}[X_{i}]=o(1)\text{ as }n\to\infty\right)=1.

We can understand this term from the perspective of asymptotic vounting functions as picking out a “main term + O(error term)”. Kolmogorov’s strong law can then be understood as

i=1nXi=i=1n𝔼[Xi]+o(n)\sum_{i=1}^{n}X_{i}=\sum_{i=1}^{n}\mathbb{E}[X_{i}]+o(n)

as nn\to\infty with probability 11. It is often the case with the Law of Large Numbers that such results are written as

i=1nXii=1n𝔼[Xi]na.s.0.\frac{\sum_{i=1}^{n}X_{i}-\sum_{i=1}^{n}\mathbb{E}[X_{i}]}{n}\overset{a.s.}{\longrightarrow}0.

The statements of Theorem 1.3 can be interpretted as something similar to this. N(𝒢,fn)N(\mathscr{G},f_{n}) is defined to be a sum of random variables fn(G)#Epi(𝒢,G)f_{n}(G)\#{\rm Epi}(\mathscr{G},G), so if we label XG=fn(G)#Epi(𝒢,G)X_{G}=f_{n}(G)\#{\rm Epi}(\mathscr{G},G) we can understand the counting function as

N(𝒢,fn)=G,fn(G)0XG.N(\mathscr{G},f_{n})=\sum_{G,f_{n}(G)\neq 0}X_{G}.

The ordering is analogous to a statement of the form “order(G)<n{\rm order}(G)<n”, although our results apply to more general orderings. We can then see that N(𝒢,fn)N(\mathscr{G},f_{n}) behaves like a sum of random variables XGX_{G}, whose length is growing with nn. The philosophy of the Law of Large Numbers suggests that we should expect this to be close to G,fn(G)0𝔼[XG]=Cfn𝑑M\sum_{G,f_{n}(G)\neq 0}\mathbb{E}[X_{G}]=\int_{C}f_{n}\ dM. However, there is no reason to expect an error term like o(n)o(n) to hold. This isn’t precisely what we want, we are interested in a statement of the form

N(𝒢,fn)=Cfn𝑑M+o(Cfn𝑑M)N(\mathscr{G},f_{n})=\int_{C}f_{n}\ dM+o\left(\int_{C}f_{n}\ dM\right)

with probability 11. This error does not necessarily agree with o(n)o(n).

We consider two examples to demonstrate the intricacies of the error term.

  • Consider the following example: Let XX be a random variable equal to 11 with probability 1/21/2 and 0 with probability 1/21/2, and consider the dependent sequence Xn=1nXX_{n}=\frac{1}{n}X. Classical SLLN certainly holds for this sequence, as

    |i=1nXii=1n𝔼[Xi]|\displaystyle\left\lvert\sum_{i=1}^{n}X_{i}-\sum_{i=1}^{n}\mathbb{E}[X_{i}]\right\rvert i=1nXi+i=1n12i\displaystyle\leq\sum_{i=1}^{n}X_{i}+\sum_{i=1}^{n}\frac{1}{2i}
    i=1n1i+i=1n12i\displaystyle\leq\sum_{i=1}^{n}\frac{1}{i}+\sum_{i=1}^{n}\frac{1}{2i}
    =i=1n13i\displaystyle=\sum_{i=1}^{n}\frac{1}{3i}
    =O(logn)=o(n).\displaystyle=O(\log n)=o(n).

    However, the XnX_{n} are identically 0 with probability 1/21/2, so

    Prob(i=1nXii=1n𝔼[Xi] as n)12.{\rm Prob}\left(\sum_{i=1}^{n}X_{i}\sim\sum_{i=1}^{n}\mathbb{E}[X_{i}]\text{ as }n\to\infty\right)\leq\frac{1}{2}.

    This example satisfies the o(n)o(n) error term, but it is certainly not what we want. What is happening here is that

    i=1n𝔼[Xi]=o(n)\sum_{i=1}^{n}\mathbb{E}[X_{i}]=o(n)

    right from the start, so the classical error for the Law of Large Numbers does not actually tell us much about how close Xi\sum X_{i} is to 𝔼[Xi]\sum\mathbb{E}[X_{i}].

  • If we instead work backwards, we might ask for error terms which are smaller than

    o(i=1n𝔼[Xi]),o\left(\sum_{i=1}^{n}\mathbb{E}[X_{i}]\right),

    so that the sum of expected values can rightly be recognized as the main term. In the classical setting with identically distributed random variables with nonzero expected value EE, these notions coincide as

    o(i=1n𝔼[Xi])=o(En)=o(n).o\left(\sum_{i=1}^{n}\mathbb{E}[X_{i}]\right)=o(En)=o(n).

    In fact, this is known more generally. Korchevsky–Petrov [KP10, Theorem 3] give sufficient conditions for when

    Sn𝔼[Sn]𝔼[Sn]a.s.1\frac{S_{n}-\mathbb{E}[S_{n}]}{\mathbb{E}[S_{n}]}\overset{a.s.}{\longrightarrow}1

    as nn\to\infty, when Sn=i=1nXnS_{n}=\sum_{i=1}^{n}X_{n} is a sum of nonegative random variables. Theorem 1.3(iii) is based on their result.

    However, in the case that 𝔼[Xi]=0\mathbb{E}[X_{i}]=0 for all ii this is asking for a trivial error term and a trivial main term. This is certainly an issue, as in the classical case with a sequence of independent identically distributed random variables with mean 0 the central limit theorem forces the error term to be about n\sqrt{n} (i.e., not o(0)o(0)). The case that all expected values are zero is important in the classical study and applications of the Law of Large Numbers, so we do not want to exclude this case.

The condition lim infn|Cfn𝑑M|>0\liminf_{n\to\infty}\left\lvert\int_{C}f_{n}dM\right\rvert>0 in Theorem 1.3 is a consequence of the second bullet above. This corresponds to the condition that 𝔼[Sn]0\mathbb{E}[S_{n}]\neq 0 for all but finitely many nn, so that we can make sense of having 𝔼[Sn]\mathbb{E}[S_{n}] on the denominator. This issue can alternatively be fixed avoided by relaxing what we want for an error term. Some options might include o(i=1n𝔼[|Xi|])o\left(\sum_{i=1}^{n}\mathbb{E}[|X_{i}|]\right) to guarantee the sum is nonzero unless XiX_{i} are identically zero, or something of the form o(n1/2+ϵ)o\left(n^{1/2+\epsilon}\right) to give an error of a similar nature to the Central Limit Limit. We give a version of Theorem 1.3(i,ii) in Theorems 4.1 and 4.2 with explicit rates of convergence. These results can be used to produce probability 11 statements even in the case that lim inf|Cfn𝑑M|=0\liminf\left\lvert\int_{C}f_{n}dM\right\rvert=0, although they will no longer be of the form

N(𝒢,fn)Cfn𝑑M.N(\mathscr{G},f_{n})\sim\int_{C}f_{n}dM.

3 Well-defined counting functions

A sequence of such L1L^{1}-functions fnf_{n} is called an L1L^{1}-ordering as in Definition 1.2. As in Definition 1.1, we define the counting function

N(𝒢,fn)=GC/fn(G)#Epi(𝒢,G)N(\mathscr{G},f_{n})=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)

for each 𝒢𝒫\mathscr{G}\in\mathcal{P}. By allowing fnf_{n} to have possibly infinite support, we need to make sure that the counting function is well-defined and the linearity of expectation behaves as we expect. In the case of L1L^{1}-orderings, we prove this below:

Lemma 3.1.

If fnf_{n} is an L1L^{1}-ordering, then

  • (a)

    N(𝒢,fn)N(\mathscr{G},f_{n}) is well-defined as a function on the positive integers nn almost surely (i.e. with probability 11) with respect to μ\mu, and

  • (b)

    𝒫N(𝒢,fn)𝑑μ(𝒢)=Cfn𝑑M.\displaystyle\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu(\mathscr{G})=\int_{C}f_{n}\ dM.

Proof.

This follows from the Fubini-Tonelli theorem. Indeed, fnf_{n} being L1L^{1} means

C𝒫|fn(G)|#Epi(𝒢,G)𝑑μ(𝒢)<.\sum_{C}\int_{\mathcal{P}}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})<\infty.

All probability spaces are σ\sigma-finite, so it follows from the Fubinit-Tonelli theorem that

𝒫|GC/|fn(G)|#Epi(𝒢,G)|𝑑μ(𝒢)\displaystyle\int_{\mathcal{P}}\left\lvert\sum_{G\in C/\cong}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\right\rvert\ d\mu(\mathscr{G}) =𝒫GC/|fn(G)|#Epi(𝒢,G)dμ(𝒢)\displaystyle=\int_{\mathcal{P}}\sum_{G\in C/\cong}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})
=GC/𝒫|fn(G)|#Epi(𝒢,G)𝑑μ(𝒢)\displaystyle=\sum_{G\in C/\cong}\int_{\mathcal{P}}|f_{n}(G)|\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})
=GC/|fn(G)|MG<.\displaystyle=\sum_{G\in C/\cong}|f_{n}(G)|M_{G}<\infty.

Certainly 𝔼[|X|]<\mathbb{E}[|X|]<\infty implies X<X<\infty almost surely, so it must be that for fixed nn the counting function

N(𝒢,fn)=GC/fn(G)#Epi(𝒢,G)N(\mathscr{G},f_{n})=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)

converges absolutely for almost all 𝒢\mathscr{G}. As there are countably many nn, countable additivity implies N(𝒢,fn)N(\mathscr{G},f_{n}) is well-defined almost surely as a function on the positive integers.

The fact that the integrals of |fn||f_{n}| are finite also implies

𝒫Cfn(G)#Epi(𝒢,G)dμ(𝒢)=C𝒫fn(G)#Epi(𝒢,G)𝑑μ(𝒢)\int_{\mathcal{P}}\sum_{C}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})=\sum_{C}\int_{\mathcal{P}}f_{n}(G)\#{\rm Epi}(\mathscr{G},G)\ d\mu(\mathscr{G})

by the Fubini-Tonelli Theorem. Evaluating the inner sum/integral gives

𝒫N(𝒢,fn)𝑑μ=Cfn(G)MG=Cfn𝑑MC|fn|𝑑M<,\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu=\sum_{C}f_{n}(G)M_{G}=\int_{C}f_{n}\ dM\leq\int_{C}|f_{n}|\ dM<\infty,

proving (b). ∎

4 Law of Large Numbers for categories

We prove Theorem 1.3 and Corollary 1.6 in this section. The probabilistic content of each proof is essentially Chebyshev’s bound, the Borel-Cantelli lemma, and a few tricks for nonegative orderings previously applied in [KP10]. We prove Theorem 1.3(i,ii) with explicit control on the rate of convergence, in part to make the proof of Corollary 1.6 more straight forward.

4.1 The Weak Law of Large Numbers

We prove the following result, which is Theorem 1.3(i) together with an explicit rate of convergence.

Theorem 4.1.

Let fn:C/f_{n}:C/\cong\to\mathbb{R} be a real-valued L1L^{1}-ordering. Suppose there exists an integer kk, a non-decreasing function γ:+\gamma:\mathbb{N}\to\mathbb{R}^{+} for which limtγ(t)=\lim_{t\to\infty}\gamma(t)=\infty, and a function E:+E:\mathbb{N}\to\mathbb{R}^{+} for which

𝒢|N(𝒢,fn)Cfn𝑑M|k𝑑μ(𝒢)=O(E(n)kγ(n)).\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})=O\left(\frac{E(n)^{k}}{\gamma(n)}\right). (4)

Then

N(𝒢,fn)Cfn𝑑ME(n)p.0\frac{N(\mathscr{G},f_{n})-\displaystyle\int_{C}f_{n}\ dM}{E(n)}\overset{p.}{\longrightarrow}0

as nn\to\infty, where the “p.” stands for converges in probability with respect to μ\mu.

Theorem 1.3(i) follows by taking E(n)=max{|Cfn𝑑M|,δ}E(n)=\max\left\{\left\lvert\int_{C}f_{n}\ dM\right\rvert,\delta\right\} for some small δ>0\delta>0.

Proof.

If YY is a random variable with 𝔼[Y]=0\mathbb{E}[Y]=0, then Chebyshev’s bound states that for each positive integer kk,

Prob(|Y|>λ)λk𝔼[|Y|k].{\rm Prob}\left(|Y|>\lambda\right)\leq\lambda^{-k}\mathbb{E}[|Y|^{k}].

(See, for instance, [MU17].) In the context of this proof, we let 𝔼\mathbb{E} denote the expected value with respect to μ(𝒢)\mu(\mathscr{G}) for the sake of convenience. Set

Yn=N(𝒢,fn)Cfn𝑑M.Y_{n}=N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM.

It then follows that

𝔼[|Yn|k]\displaystyle\mathbb{E}[|Y_{n}|^{k}] =𝒫|N(𝒢,fn)Cfn𝑑M|k𝑑μ(𝒢)\displaystyle=\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}\ d\mu(\mathscr{G})
=O(E(n)kγ(n)).\displaystyle=O\left(\frac{E(n)^{k}}{\gamma(n)}\right).

Taking λ=ϵE(n)\lambda=\epsilon E(n), it follows that

Prob(|Yn|>ϵE(n))1ϵkγ(n)\displaystyle{\rm Prob}\left(|Y_{n}|>\epsilon E(n)\right)\ll\frac{1}{\epsilon^{k}\gamma(n)}

for each ϵ>0\epsilon>0. Thus,

lim supnProb(|Yn|E(n)>ϵ)=0.\displaystyle\limsup_{n\to\infty}{\rm Prob}\left(\frac{|Y_{n}|}{E(n)}>\epsilon\right)=0.

This is the definition of Yn/E(n)Y_{n}/E(n) converging in probability to 0, and so concludes the proof. ∎

4.2 The Strong Law of Large Numbers

We prove the following result, which is Theorem 1.3(ii) together with an explicit rate of convergence.

Theorem 4.2.

Let fn:C/f_{n}:C/\cong\to\mathbb{R} be a real-valued L1L^{1}-ordering. Suppose there exists an integer kk, a non-decreasing function γ:+\gamma:\mathbb{N}\to\mathbb{R}^{+} for which limtγ(t)=\lim_{t\to\infty}\gamma(t)=\infty, and a function E:+E:\mathbb{N}\to\mathbb{R}^{+} for which

𝒢|N(𝒢,fn)Cfn𝑑M|k𝑑μ(𝒢)=O(E(n)kγ(n)).\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})=O\left(\frac{E(n)^{k}}{\gamma(n)}\right). (5)

Additionally, assume that n=11γ(n)<\sum_{n=1}^{\infty}\frac{1}{\gamma(n)}<\infty. Then

N(𝒢,fn)Cfn𝑑ME(n)a.s.0\frac{N(\mathscr{G},f_{n})-\displaystyle\int_{C}f_{n}\ dM}{E(n)}\overset{a.s.}{\longrightarrow}0

as nn\to\infty, where the “a.s.” stands for converges almost surely with respect to μ\mu.

Theorem 1.3(ii) follows by taking E(n)=max{|Cfn𝑑M|,δ}E(n)=\max\left\{\left\lvert\int_{C}f_{n}\ dM\right\rvert,\delta\right\} for some small δ>0\delta>0.

Proof.

We proceed in a similar way to the proof of Theorem 4.1. Set

Yn=N(𝒢,fn)Cfn𝑑MY_{n}=N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM

so that

𝔼[|Yn|k]\displaystyle\mathbb{E}[|Y_{n}|^{k}] =𝒫|N(𝒢,fn)Cfn𝑑M|k𝑑μ(𝒢)\displaystyle=\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}\ d\mu(\mathscr{G})
=O(E(n)kγ(n)).\displaystyle=O\left(\frac{E(n)^{k}}{\gamma(n)}\right).

It then follows from Chebyshev’s bound that

Prob(|Yn|>λ)E(n)kλkγ(n).{\rm Prob}\left(|Y_{n}|>\lambda\right)\ll\frac{E(n)^{k}}{\lambda^{k}\gamma(n)}.

Taking λ=ϵE(n)\lambda=\epsilon E(n), it follows that

n=1Prob(|Yn|>ϵE(n))n=11ϵkγ(n)<.\displaystyle\sum_{n=1}^{\infty}{\rm Prob}\left(|Y_{n}|>\epsilon E(n)\right)\ll\sum_{n=1}^{\infty}\frac{1}{\epsilon^{k}\gamma(n)}<\infty.

for each ϵ>0\epsilon>0. Thus, the Borel-Cantelli lemma implies

Prob(|Yn|>ϵE(n) infinitely often )=0,{\rm Prob}\left(|Y_{n}|>\epsilon E(n)\text{ infinitely often }\right)=0,

so that

Prob(lim supn|Yn|E(n)>ϵ)=0.\displaystyle{\rm Prob}\left(\limsup_{n\to\infty}\frac{|Y_{n}|}{E(n)}>\epsilon\right)=0.

By countable additivity and taking ϵ=1/m\epsilon=1/m for mm\in\mathbb{Z} tending towards \infty, this implies Yn/E(n)Y_{n}/E(n) converges to 0 almost surely, concluding the proof. ∎

4.3 Almost Sure Upper Bounds

Corollary 1.6 follows almost immediately from Theorem 4.2.

Proof of Corollary 1.6.

We bound

𝒢|N(𝒢,fn)Cfn𝑑M|k𝑑μ(𝒢)\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{k}d\mu(\mathscr{G})
𝒢(|N(𝒢,fn)|+C|fn|𝑑M)k𝑑μ(𝒢)\displaystyle\leq\int_{\mathscr{G}}\left(|N(\mathscr{G},f_{n})|+\int_{C}|f_{n}|\ dM\right)^{k}d\mu(\mathscr{G})
𝒢(GC/|fn(G)|(#Epi(𝒢,G)+MG))k𝑑μ(𝒢)\displaystyle\leq\int_{\mathscr{G}}\left(\sum_{G\in C/\cong}|f_{n}(G)|(\#{\rm Epi}(\mathscr{G},G)+M_{G})\right)^{k}d\mu(\mathscr{G})
(G1,,Gk)Ck/|fn(G1,,Gk)|j=02kσSkM(Gσ(1),,Gσ(j))(j)MGσ(j+1)MGσ(k),\displaystyle\leq\sum_{(G_{1},...,G_{k})\in C^{k}/\cong}|f_{n}(G_{1},...,G_{k})|\sum_{j=0}^{2k}\sum_{\sigma\in S_{k}}M_{(G_{\sigma(1)},...,G_{\sigma(j)})}^{(j)}M_{G_{\sigma(j+1)}}\cdots M_{G_{\sigma(k)}},

where the last line follows from fnf_{n} being an L1L^{1} function and moving the integral all the way to the inside. We remark that the sum over permutations σSk\sigma\in S_{k} is a bit larger than the truth, but this bound will be sufficient for our purposes. This is exactly a sum of mixed moments

j=02kσSkσ(Ck)|fn|𝑑M(j)(dM)kj\displaystyle\leq\sum_{j=0}^{2k}\sum_{\sigma\in S_{k}}\int_{\sigma(C^{k})}|f_{n}|\ dM^{(j)}(dM)^{k-j}
kmaxj{0,1,,k}{Ck|fn|𝑑M(j)(dM)kj},\displaystyle\ll_{k}\max_{j\in\{0,1,...,k\}}\left\{\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right\},

noting that fnf_{n} is invariant under the permuting the coordinates. Taking

E(n)=n1+ϵkmaxj{0,1,,k}{Ck|fn|dM(j)(dM)kj}1/kE(n)=n^{\frac{1+\epsilon}{k}}\max_{j\in\{0,1,...,k\}}\left\{\int_{C^{k}}|f_{n}|\ dM^{(j)}(dM)^{k-j}\right\}^{1/k}

and γ(n)=n1+ϵ\gamma(n)=n^{1+\epsilon} in Theorem 4.2 is sufficient to conclude the proof. ∎

4.4 The Strong Law of Large Numbers for nonnegative counting functions

Theorem 1.3(iii) takes advantage of some tricks employed by [KP10] for nonegative functions. The following proof is in essense the same as their main result, however the functions fn(G)f_{n}(G) we allow are slightly more general than the sequence of weights wkw_{k} utilized in [KP10, Theorem 1].

These tricks are not compatible with controlling the rate of convergence, so we do not state a more general result for this part. It is certainly possible that more can be proven for this case but we do not do so here.

Proof of Theorem 1.3(iii).

We remark that nN(𝒢,fn)n\mapsto N(\mathscr{G},f_{n}) being nondecreasing almost everywhere implies

nCfn𝑑M=𝒫N(𝒢,fn)𝑑μ(𝒢)n\mapsto\int_{C}f_{n}\ dM=\int_{\mathcal{P}}N(\mathscr{G},f_{n})\ d\mu(\mathscr{G})

is also nondecreasing.

The primary trick is to prove convergence along a subsequence of indices first, then interpolate to the remaining indices.

Fix b>1b>1. Define the subsequence of natrual numbers (ni)(n_{i}) by

ni=inf{n:Cfn𝑑Mbi}.n_{i}=\inf\left\{n:\int_{C}f_{n}\ dM\geq b^{i}\right\}.

These necessarily exist by Cfn𝑑M\int_{C}f_{n}dM\to\infty with nn. We apply Chebyshev’s inequality to

Yn=N(𝒢,fn)Cfn𝑑MY_{n}=N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM

with λ=ϵCfn𝑑M\lambda=\epsilon\int_{C}f_{n}\ dM to prove that

i=1Prob(|Yni|>ϵCfni𝑑M)\displaystyle\sum_{i=1}^{\infty}{\rm Prob}\left(|Y_{n_{i}}|>\epsilon\int_{C}f_{n_{i}}\ dM\right) i=11ϵkψ(Cfn𝑑M)\displaystyle\ll\sum_{i=1}^{\infty}\frac{1}{\epsilon^{k}\psi\left(\int_{C}f_{n}dM\right)}
i=11ϵkψ(bi),\displaystyle\leq\sum_{i=1}^{\infty}\frac{1}{\epsilon^{k}\psi\left(b^{i}\right)},

where the second comparison follows from ψ\psi being nondecreasing. A straight foward exercise in the convergence of infinite series shows that if n=11nψ(n)\sum_{n=1}^{\infty}\frac{1}{n\psi(n)} converges, then so does i=11ψ(bi)\sum_{i=1}^{\infty}\frac{1}{\psi(b^{i})} for any b>1b>1 (for a reference, see [Pet09a, Lemma 1]). Similar to the proof of Theorem 4.2, the Borel-Cantelli Lemma then implies

Prob(limi|Yni|Cfni𝑑M=0)=1.{\rm Prob}\left(\lim_{i\to\infty}\frac{|Y_{n_{i}}|}{\displaystyle\int_{C}f_{n_{i}}\ dM}=0\right)=1.

Next, we interpolate to other indices nn not belonging to the subsequence (ni)(n_{i}). There exists an ii for which ninni+1n_{i}\leq n\leq n_{i+1}. We expand

N(𝒢,fn)Cfn𝑑MCfn𝑑M\displaystyle\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n}dM}{\int_{C}f_{n}dM} =N(𝒢,fn)Cfni+1𝑑MCfn𝑑M+Cfni+1𝑑MCfn𝑑MCfn𝑑M.\displaystyle=\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n_{i+1}}dM}{\int_{C}f_{n}dM}+\frac{\int_{C}f_{n_{i+1}}dM-\int_{C}f_{n}dM}{\int_{C}f_{n}dM}.

Given that the counting function and its moments are nondecreasing almost everywhere, this produces an almost everywhere upper bound

|N(𝒢,fni+1)Cfni+1𝑑M|Cfni+1𝑑MCfni+1𝑑MCfni𝑑M+Cfni+1𝑑MCfni𝑑MCfni𝑑M.\displaystyle\leq\frac{\left\lvert N(\mathscr{G},f_{n_{i+1}})-\int_{C}f_{n_{i+1}}dM\right\rvert}{\int_{C}f_{n_{i+1}}dM}\frac{\int_{C}f_{n_{i+1}}dM}{\int_{C}f_{n_{i}}dM}+\frac{\int_{C}f_{n_{i+1}}dM-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}.

The subsequence (ni)(n_{i}) is nondecreasing, and if ni<ni+1n_{i}<n_{i+1} it follows that

biCfni𝑑MCfni+11𝑑M<bi+1Cfni+1𝑑M.b^{i}\leq\int_{C}f_{n_{i}}dM\leq\int_{C}f_{n_{i+1}-1}dM<b^{i+1}\leq\int_{C}f_{n_{i+1}}dM.

We can simplify the upper bound to

|N(𝒢,fni+1)Cfni+1𝑑M|Cfni+1𝑑Mb2+(b21).\displaystyle\leq\frac{\left\lvert N(\mathscr{G},f_{n_{i+1}})-\int_{C}f_{n_{i+1}}dM\right\rvert}{\int_{C}f_{n_{i+1}}dM}b^{2}+(b^{2}-1).

Taking the limit as ii\to\infty implies

lim supnN(𝒢,fn)Cfn𝑑MCfn𝑑Mb21.\limsup_{n\to\infty}\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n}dM}{\int_{C}f_{n}dM}\leq b^{2}-1.

Taking b>1b>1 sufficiently close to 11 gives the desired lim sup\limsup. The lim inf\liminf is calculated similarly using the lower bound

N(𝒢,fn)Cfn𝑑MCfn𝑑M\displaystyle\frac{N(\mathscr{G},f_{n})-\int_{C}f_{n}dM}{\int_{C}f_{n}dM} N(𝒢,fni)Cfni𝑑MCfni𝑑MCfni𝑑MCfni+1𝑑MCfn𝑑MCfni𝑑MCfn𝑑M\displaystyle\geq\frac{N(\mathscr{G},f_{n_{i}})-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}\frac{\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i+1}}dM}-\frac{\int_{C}f_{n}dM-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n}dM}
N(𝒢,fni)Cfni𝑑MCfni𝑑Mb21+Cfni𝑑MCfn𝑑M\displaystyle\geq\frac{N(\mathscr{G},f_{n_{i}})-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}b^{-2}-1+\frac{\int_{C}f_{n_{i}}dM}{\int_{C}f_{n}dM}
N(𝒢,fni)Cfni𝑑MCfni𝑑Mb21+b2.\displaystyle\geq\frac{N(\mathscr{G},f_{n_{i}})-\int_{C}f_{n_{i}}dM}{\int_{C}f_{n_{i}}dM}b^{-2}-1+b^{-2}.

The limit as ii\to\infty can again be made arbitrarily close to 0 by taking bb close to 11. ∎

5 Epi-Products

We let CC be a diamond category as defined in [SW22, Definition 1.3] with 𝒫\mathcal{P} the corresponding category of pro-objects. We fix throughout a probability measure μ\mu on 𝒫\mathcal{P} with finite moments MGM_{G}.

Theorem 1.4 and Theorem 1.5 rely on the concept of an epi-product, which is a categorical construction capturing the lack of correlation between #Epi(𝒢,G)\#{\rm Epi}(\mathscr{G},G) and #Epi(𝒢,H)\#{\rm Epi}(\mathscr{G},H) as 𝒢\mathscr{G} varies according to μ\mu.

5.1 The Definition of an Epi-Product

Definition 5.1.

We define the epi-product of G1G_{1} and G2G_{2} to be an object G1×EpiG2G_{1}\times_{\rm Epi}G_{2} with epimorphisms to G1G_{1} and G2G_{2} which satisfies the universal property

H{H}G2{G_{2}}G1{G_{1}}G1×EpiG2{G_{1}\times_{\rm Epi}G_{2}}

where all morphisms in the diagram (including the universal morphism) are required to be epimorphisms.

This is similar to the definition of a product, except we require all the morphisms (including the universal morphism) to be epimorphisms. Notice that we do not ask that CC has epi-products in general. This flexibility will allow us to choose a wider variety of categories to work over. We remark that if both the usual product G1×G2G_{1}\times G_{2} and G1×EpiG2G_{1}\times_{\rm Epi}G_{2} exist then there is necessarily a unique isomorphism between them. However, the existence of one does not imply the existence of the other. We will discuss some basic examples at the end of the paper.

Lemma 5.2.

Let μ\mu be a probability measure on the pro-objects of CC with finite moments MGM_{G}, and let 𝒢\mathscr{G} vary with respect to μ\mu. Then the random variables #Epi(𝒢,G)\#{\rm Epi}(\mathscr{G},G) and #Epi(𝒢,H)\#{\rm Epi}(\mathscr{G},H) for G,HC/G,H\in C/\cong are uncorrelated if G×EpiHG\times_{\rm Epi}H exists and MG×EpiH=MGMHM_{G\times_{\rm Epi}H}=M_{G}M_{H}.

Proof.

By the definition of uncorrelated, it suffices to prove that

𝔼[#Epi(𝒢,G)#Epi(𝒢,H)]=𝔼[#Epi(𝒢,G)]𝔼[#Epi(𝒢,H)].\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)\#{\rm Epi}(\mathscr{G},H)]=\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)]\mathbb{E}[\#{\rm Epi}(\mathscr{G},H)].

The universal property of epi-products implies that

#Epi(𝒢,G)#Epi(𝒢,H)=#Epi(𝒢,G×EpiH).\#{\rm Epi}(\mathscr{G},G)\#{\rm Epi}(\mathscr{G},H)=\#{\rm Epi}(\mathscr{G},G\times_{\rm Epi}H).

Therefore

𝔼[#Epi(𝒢,G)#Epi(𝒢,H)]\displaystyle\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)\#{\rm Epi}(\mathscr{G},H)] =𝔼[#Epi(𝒢,G×EpiH)]\displaystyle=\mathbb{E}[\#{\rm Epi}(\mathscr{G},G\times_{\rm Epi}H)]
=MG×EpiH\displaystyle=M_{G\times_{\rm Epi}H}
=MGMH\displaystyle=M_{G}M_{H}
=𝔼[#Epi(𝒢,G)]𝔼[#Epi(𝒢,H)].\displaystyle=\mathbb{E}[\#{\rm Epi}(\mathscr{G},G)]\mathbb{E}[\#{\rm Epi}(\mathscr{G},H)].

5.2 The Proof of Theorem 1.4

Theorem 1.4 follows almost immediately from Lemma 5.2.

Proof of Theorem 1.4.

For simplicity in this proof, we write 𝔼\mathbb{E} for the expected value with respect to μ\mu. When fnf_{n} is real-valued, the square is always positive and we can ignore the absolute value. Thus

𝒢(N(𝒢,fn)Cfn𝑑M)2𝑑μ(𝒢)\displaystyle\int_{\mathscr{G}}\left(N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right)^{2}d\mu(\mathscr{G}) =𝔼[(GC/fn(G)(#Epi(𝒢,G)MG))2]\displaystyle=\mathbb{E}\left[\left(\sum_{G\in C/\cong}f_{n}(G)(\#{\rm Epi}(\mathscr{G},G)-M_{G})\right)^{2}\right]
=(G1,G2)C2/fn(G1,G2)𝔼[i=12(#Epi(𝒢,Gi)MGi)]\displaystyle=\sum_{(G_{1},G_{2})\in C^{2}/\cong}f_{n}(G_{1},G_{2})\mathbb{E}\left[\prod_{i=1}^{2}\left(\#{\rm Epi}(\mathscr{G},G_{i})-M_{G_{i}}\right)\right]
=(G1,G2)C2/fn(G1,G2)(M(G1,G2)(2)MG1MG2).\displaystyle=\sum_{(G_{1},G_{2})\in C^{2}/\cong}f_{n}(G_{1},G_{2})\left(M^{(2)}_{(G_{1},G_{2})}-M_{G_{1}}M_{G_{2}}\right).

By Lemma 5.2, (G1,G2)E(2,M)(G_{1},G_{2})\in E(2,M) implies #Epi(𝒢,G1)\#{\rm Epi}(\mathscr{G},G_{1}) and #Epi(𝒢,G2)\#{\rm Epi}(\mathscr{G},G_{2}) are uncorrelated. By definition, this is equivalent to M(G1,G2)(2)=MG1MG2M^{(2)}_{(G_{1},G_{2})}=M_{G_{1}}M_{G_{2}}. Therefore the sum simplifies to

=(G1,G2)C2E(2,M)/fn(G1,G2)(M(G1,G2)(2)MG1MG2)\displaystyle=\sum_{(G_{1},G_{2})\in C^{2}\setminus E(2,M)/\cong}f_{n}(G_{1},G_{2})\left(M^{(2)}_{(G_{1},G_{2})}-M_{G_{1}}M_{G_{2}}\right)
=C2E(2,M)fn𝑑M(2)C2E(2,M)fn(dM)2.\displaystyle=\int_{C^{2}\setminus E(2,M)}f_{n}\ dM^{(2)}-\int_{C^{2}\setminus E(2,M)}f_{n}\ (dM)^{2}.

The result then follows from the triangle inequality. ∎

5.3 The Proof of Theorem 1.5

Theorem 1.5 is proven similarly to Theorem 1.4, although the higher power is trickier to keep track of. In particular, some notion of “mutually uncorrelated” random variables is needed to compare the moments of three or more of the #Epi(𝒢,Gi)\#{\rm Epi}(\mathscr{G},G_{i}) terms.

The extra categorical requirements in Theorem 1.5 are used to reframe the mixed moments as first moments, which allows us to more easily capture the required “mutually uncorrelated”ness using subobjects of product objects. This reframing is also useful for computing the mixed moments, so we state and prove it separately:

Lemma 5.3.

Suppose CC is a category with for which every morphism factors uniquely (up to isomorphism) as a composition of an epimorphism with a monomorphism. Let G1,,GjCG_{1},...,G_{j}\in C be objects for which the product G1××GjG_{1}\times\cdots\times G_{j} exists in CC and has finitely many subobjects up to isomorphism. Then

M(G1,,Gj)(j)=ι:HiGiρmι is an epimorphismMH,M^{(j)}_{(G_{1},...,G_{j})}=\sum_{\begin{subarray}{c}\iota:H\hookrightarrow\prod_{i}G_{i}\\ \rho_{m}\iota\text{ is an epimorphism}\end{subarray}}M_{H},

where the sum is over subobjects ι:HiGi\iota:H\hookrightarrow\prod_{i}G_{i} up to isomorphism for which ρmι\rho_{m}\iota is also an epimorphism for each projection morphism ρm:iGiGm\rho_{m}:\prod_{i}G_{i}\to G_{m}.

Proof.

By the universal property of the product, there is an embedding

iEpi(𝒢,Gi)Hom(𝒢,iGi).\prod_{i}{\rm Epi}(\mathscr{G},G_{i})\hookrightarrow\textnormal{Hom}\left(\mathscr{G},\prod_{i}G_{i}\right).

Partitioning this embedding based on the possible images gives a bijection to a disjoint union

iEpi(𝒢,Gi)ι:HiGiρmι is an epimorphismEpi(𝒢,H).\prod_{i}{\rm Epi}(\mathscr{G},G_{i})\longleftrightarrow\coprod_{\begin{subarray}{c}\iota:H\hookrightarrow\prod_{i}G_{i}\\ \rho_{m}\iota\text{ is an epimorphism}\end{subarray}}{\rm Epi}(\mathscr{G},H).

Taking cardinalities and integrating with respect to μ\mu concludes the proof. ∎

We can now prove Theorem 1.5.

Proof of Theorem 1.5.

For simplicity in this proof, we write 𝔼\mathbb{E} for the expected value with respect to μ\mu. When fnf_{n} is real-valued and we take the 2kth2k^{\rm th} moment, we can essentially ignore the absolute values. We then compute

𝒢|N(𝒢,fn)Cfn𝑑M|2k𝑑μ(𝒢)\displaystyle\int_{\mathscr{G}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2k}d\mu(\mathscr{G})
=𝔼[(GC/fn(G)(#Epi(𝒢,G)MG))2k]\displaystyle=\mathbb{E}\left[\left(\sum_{G\in C/\cong}f_{n}(G)(\#{\rm Epi}(\mathscr{G},G)-M_{G})\right)^{2k}\right]
=(G1,,G2k)C2k/fn(G1,,Gk)𝔼[i=12k(#Epi(𝒢,Gi)MGi)],\displaystyle=\sum_{(G_{1},...,G_{2k})\in C^{2k}/\cong}f_{n}(G_{1},...,G_{k})\mathbb{E}\left[\prod_{i=1}^{2k}\left(\#{\rm Epi}(\mathscr{G},G_{i})-M_{G_{i}}\right)\right],

where we can switch the order of the integral defining 𝔼\mathbb{E} and the countable sum because fnf_{n} is an L1L^{1}-ordering.

We multiply out the product of random variables to produce

𝔼[i=12k(#Epi(𝒢,Gi)MGi)]\displaystyle\mathbb{E}\left[\prod_{i=1}^{2k}\left(\#{\rm Epi}(\mathscr{G},G_{i})-M_{G_{i}}\right)\right] =A{1,2,,k}(1)|A|M(Gm)mA(|A|)mAMGm.\displaystyle=\sum_{A\subseteq\{1,2,...,k\}}(-1)^{|A|}M^{(|A|)}_{(G_{m})_{m\in A}}\prod_{m\not\in A}M_{G_{m}}.

It suffices to show this quantity is 0 when (G1,,G2k)E(2k,M)(G_{1},...,G_{2k})\in E(2k,M), as the fact that fnf_{n}, M(j)M^{(j)}, and E(2k,M)E(2k,M) are invariant under permutation implies

(G1,,Gk)C2kE(2k,M)/fn(G1,,G2k)A{1,2,,k}(1)|A|M(Gm)mA(|A|)mAMGm\displaystyle\sum_{(G_{1},...,G_{k})\in C^{2k}\setminus E(2k,M)/\cong}f_{n}(G_{1},...,G_{2k})\sum_{A\subseteq\{1,2,...,k\}}(-1)^{|A|}M^{(|A|)}_{(G_{m})_{m\in A}}\prod_{m\not\in A}M_{G_{m}}
kmaxj{1,2,,2k}A{1,2,,k}|A|=j(G1,,Gk)C2kE(2k,M)/|fn(G1,,G2k)|M(Gm)mA(j)mAMGm\displaystyle\ll_{k}\max_{j\in\{1,2,...,2k\}}\sum_{\begin{subarray}{c}A\subseteq\{1,2,...,k\}\\ |A|=j\end{subarray}}\sum_{(G_{1},...,G_{k})\in C^{2k}\setminus E(2k,M)/\cong}|f_{n}(G_{1},...,G_{2k})|M^{(j)}_{(G_{m})_{m\in A}}\prod_{m\not\in A}M_{G_{m}}
kmaxj{1,,2k}C2kE(2k,M)|fn|𝑑M(j)(dM)2kj.\displaystyle\ll_{k}\max_{j\in\{1,...,2k\}}\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(j)}(dM)^{2k-j}.

Now, suppose that (G1,,G2k)E(2k,M)(G_{1},...,G_{2k})\in E(2k,M), with distinguished index ii. We separate the ithi^{\rm th} term of the product out, and compute

𝔼[m=12k(#Epi(𝒢,Gm)MGm)]\displaystyle\mathbb{E}\left[\prod_{m=1}^{2k}\left(\#{\rm Epi}(\mathscr{G},G_{m})-M_{G_{m}}\right)\right]
=𝔼[#Epi(𝒢,Gi)mi(#Epi(𝒢,Gm)MGm)]MGi𝔼[mi(#Epi(𝒢,Gm)MGm)]\displaystyle=\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{i})\prod_{m\neq i}\left(\#{\rm Epi}(\mathscr{G},G_{m})-M_{G_{m}}\right)\right]-M_{G_{i}}\mathbb{E}\left[\prod_{m\neq i}\left(\#{\rm Epi}(\mathscr{G},G_{m})-M_{G_{m}}\right)\right]
=A{1,,2k}iAι:HGAρmι is an epi.(𝔼[#Epi(𝒢,Gi)#Epi(𝒢,H)]MGiMH)mAMGm.\displaystyle=\sum_{\begin{subarray}{c}A\subset\{1,...,2k\}\\ i\not\in A\end{subarray}}\sum_{\begin{subarray}{c}\iota:H\hookrightarrow G_{A}\\ \rho_{m}\iota\text{ is an epi.}\end{subarray}}\left(\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{i})\#{\rm Epi}(\mathscr{G},H)\right]-M_{G_{i}}M_{H}\right)\prod_{m\not\in A}M_{G_{m}}.

By Lemma 5.2 and the properties of the distinguished ithi^{\rm th} index, it is necessarily the case that

𝔼[#Epi(𝒢,Gi)#Epi(𝒢,H)]=MGiMH\mathbb{E}\left[\#{\rm Epi}(\mathscr{G},G_{i})\#{\rm Epi}(\mathscr{G},H)\right]=M_{G_{i}}M_{H}

for each HH. Therefore this is a sum of zeros, and cancels out as claimed. ∎

6 Examples

Determining the bound in Theorem 1.4 and Theorem 1.5 is fairly specific to the category, and the motivating examples of number field counting and the Batyrev-Manin conjecture take a fair bit of set up to define the corresponding diamond category CC, construct a measure modeling arithmetic behavior, translate classical orderings into a sequence fnf_{n}, and compute the moments fn𝑑M\int f_{n}\ dM. All this before even considering the question of bounding the higher moments. The plus side is the immense return value of Theorem 1.3 and Corollary 1.6. If you understand when epi-products exist enough to apply these results for one ordering, then it is often the case that the same reasoning will apply to numerous other orderings on the same category.

The author will construct a model for Malle’s counting function and determine the reasonableness of a large collection of orderings in forthcoming work [Alb23]. For the purposes of this paper, we include some more basic examples where these steps are either easier or already done for us in works like [SW22].

6.1 Random subsets with independent elements

We prove a typical tool used to give random models for sets of integers as a special case of our methods:

Theorem 6.1.

Let 𝒢\mathscr{G} be a random subset of \mathbb{N} where we let the events (n𝒢)(n\in\mathscr{G}) be pairwise independent with probability rn[0,1]r_{n}\in[0,1]. This forms a true probability measure on 22^{\mathbb{N}}, and for any any ϵ>0\epsilon>0

#(𝒢{1,,n})jnrjnϵjnrja.s.0\frac{\displaystyle\#(\mathscr{G}\cap\{1,...,n\})-\sum_{j\leq n}r_{j}}{\displaystyle n^{\epsilon}\sqrt{\sum_{j\leq n}r_{j}}}\overset{a.s.}{\longrightarrow}0

as nn\to\infty.

If we let rn=1log(n)r_{n}=\frac{1}{\log(n)} be the probability that nn is prime (with r1=0r_{1}=0 and r2=1r_{2}=1 in order to make sense) this is precisely Cramér’s original random model for the set of primes [Cra94]. We reproduce Cramér’s main term with this asymptotic:

#(𝒢{1,,n})=j=3n1log(j)+o(nϵj=3n1log(j)).\#(\mathscr{G}\cap\{1,...,n\})=\sum_{j=3}^{n}\frac{1}{\log(j)}+o\left(n^{\epsilon}\sqrt{\sum_{j=3}^{n}\frac{1}{\log(j)}}\right).

with probability 11. One easily checks using difference calculus that the main term is of the same order of magnitude as Li(n){\rm Li}(n) in agreement with the prime number theorem. The error we produce, while not as strong as Cramér’s, is still o(n1/2+ϵ)o(n^{1/2+\epsilon}) suggesting the truth of the Riemann hypothesis.

Proof.

Let CC be the category gotten by letting CopC^{\rm op} be the category of finite subsets of \mathbb{N} whose morphisms are inclusions. The pro-objects are all the subsets of \mathbb{N}. This category is incredibly nice for numerous reasons:

  • (a)

    Hom(A,B)=Epi(A,B)\textnormal{Hom}(A,B)={\rm Epi}(A,B) contains only the inclusion map ABA\hookleftarrow B if BAB\subseteq A and is empty otherwise.

  • (b)

    The product of AA and BB is given by the union ABA\cup B, and the fact that all morphisms are epimorphisms implies this is an epi-product too.

  • (c)

    Any sequence of finite moments MAM_{A} on this category is well-behaved in the sense of [SW22], because the well-behavedness sum is always finite. A level is the power set 2D2^{D} for some finite set DD, while the elements with epimorphisms from AA are precisely the subsets of AA.

  • (d)

    The Möbius function on the lattice of isomorphism classes ordered by epimorphisms is given by μ(B,A)=(1)|AB|\mu(B,A)=(-1)^{|A\setminus B|}.

The function #Epi(B,A)\#{\rm Epi}(B,A) is then the characteristic function of the event (AB)(A\subseteq B), which among a level 2D2^{D} has expected value precisely aAra\prod_{a\in A}r_{a}, so we define this to be MAM_{A}. Certainly MAB=MAMBM_{A\cup B}=M_{A}M_{B} whenever AB=A\cap B=\emptyset. We check that MAM_{A} corresponds to a measure on the pro-objects using [SW22, Theorem 1.7]. Indeed,

v2D,B\displaystyle v_{2^{D},B} =ADμ^(B,A)|Aut(A)|MA\displaystyle=\sum_{A\subseteq D}\frac{\hat{\mu}(B,A)}{|\textnormal{Aut}(A)|}M_{A}
=BAD(1)|AB|1aAra\displaystyle=\sum_{B\subset A\subseteq D}\frac{(-1)^{|A\setminus B|}}{1}\prod_{a\in A}r_{a}
=bBrbdDb(1rd)\displaystyle=\prod_{b\in B}r_{b}\prod_{d\in D\setminus b}(1-r_{d})
0\displaystyle\geq 0

so the measure μ\mu exists.

We also consider that #Epi(𝒢,A)\#{\rm Epi}(\mathscr{G},A) is the characteristic function of the event (A𝒢)(A\subseteq\mathscr{G}). Thus, we conclude that

#Epi(𝒢,A)#Epi(𝒢,B)=#Epi(𝒢,AB),\#{\rm Epi}(\mathscr{G},A)\#{\rm Epi}(\mathscr{G},B)=\#{\rm Epi}(\mathscr{G},A\cup B),

so that M(A1,A2,,A2k)=MA1A2A2kM_{(A_{1},A_{2},...,A_{2k})}=M_{A_{1}\cup A_{2}\cup\cdots\cup A_{2k}}. In this example, we have a very convenient bound for the finite moments of unions

MAB=aABraaArabBrb=MAMBM_{A\cup B}=\prod_{a\in A\cup B}r_{a}\geq\prod_{a\in A}r_{a}\prod_{b\in B}r_{b}=M_{A}M_{B}

which follows from the assumption that rn[0,1]r_{n}\in[0,1]. In particular, this implies M(j)M(2k)M^{(j)}\leq M^{(2k)} for each j2kj\leq 2k.

Any ordering fnf_{n} supported on a family of pairwise disjoint sets necessarily satisfies that C2kE(2k,M)C^{2k}\setminus E(2k,M) intersected with the support of fnf_{n} is precisely the collection of tuples (A1,,A2k)(A_{1},...,A_{2k}) in the support of fnf_{n} which has at most kk distinct coordinates. We choose fnf_{n} to be the characteristic function of singleton sets {m}\{m\} for which mnm\leq n. Up to the number of ways to choose matching coordinates, it suffices to consider objects in CkC^{k}. Thus, we can bound

C2kE(2k,M)|fn|𝑑M(j)(dM)2kj\displaystyle\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(j)}(dM)^{2k-j} C2kE(2k,M)|fn|𝑑M(2k)\displaystyle\leq\int_{C^{2k}\setminus E(2k,M)}|f_{n}|\ dM^{(2k)}
k(A1,,Ak)Ckfn(A1)fn(Ak)MA1Ak\displaystyle\ll_{k}\sum_{(A_{1},...,A_{k})\in C^{k}}f_{n}(A_{1})\cdots f_{n}(A_{k})M_{A_{1}\cup\cdots\cup A_{k}}

Noting that fnf_{n} is supported on singleton sets, this is equivalent to

=|A|k#{a1,,akA:A={a1,,ak}}aAfn({a})M{a}\displaystyle=\sum_{|A|\leq k}\#\{a_{1},...,a_{k}\in A:A=\{a_{1},...,a_{k}\}\}\cdot\prod_{a\in A}f_{n}(\{a\})M_{\{a\}}
k1|A|kaAfn({a})M{a}\displaystyle\ll_{k}\sum_{1\leq|A|\leq k}\prod_{a\in A}f_{n}(\{a\})M_{\{a\}}
=(1+Cfn𝑑M)k1.\displaystyle=\left(1+\int_{C}f_{n}\ dM\right)^{k}-1.

If Cfn𝑑M1\int_{C}f_{n}\ dM\geq 1, we can bound this by a constant times the leading term (Cfn𝑑M)k(\int_{C}f_{n}\ dM)^{k}, and otherwise this is O(1)O(1). Thus we conclude via Theorem 1.5 that

𝒫|N(𝒢,fn)Cfn𝑑M|2k𝑑μ(𝒢)\displaystyle\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2k}d\mu(\mathscr{G}) k(Cfn𝑑M)k+O(1).\displaystyle\ll_{k}\left(\int_{C}f_{n}\ dM\right)^{k}+O(1).

Taking E(n)=(Cfn𝑑M)1/2E(n)=\left(\int_{C}f_{n}\ dM\right)^{1/2} and γ(n)=n2\gamma(n)=n^{2} in Theorem 4.2 then implies

N(𝒢,fn)Cfn𝑑Mn1/kCfn𝑑Ma.s.0.\frac{N(\mathscr{G},f_{n})-\displaystyle\int_{C}f_{n}\ dM}{n^{1/k}\sqrt{\displaystyle\int_{C}f_{n}\ dM}}\overset{a.s.}{\longrightarrow}0.

The counting function is given by

N(𝒢,fn)\displaystyle N(\mathscr{G},f_{n}) =mn#Epi(𝒢,{m})\displaystyle=\sum_{m\leq n}\#{\rm Epi}(\mathscr{G},\{m\})
=#(𝒢{1,,n}),\displaystyle=\#(\mathscr{G}\cap\{1,...,n\}),

while the moments of the ordering are given by

Cfn𝑑M\displaystyle\int_{C}f_{n}\ dM =fn(A)MA\displaystyle=\sum f_{n}(A)M_{A}
=mnrm.\displaystyle=\sum_{m\leq n}r_{m}.

Taking kk sufficiently large concludes the proof. ∎

Keeping in mind the connection with random models for prime numbers, we also prove the following:

Corollary 6.2.

Let 𝒢\mathscr{G} be a random subset of \mathbb{N} where we let the events (n𝒢)(n\in\mathscr{G}) be pairwise independent with probability rn[0,1]r_{n}\in[0,1]. Let 𝟏S\mathbf{1}_{S} be a characteristic function of some subset SS\subseteq\mathbb{N}, and define the corresponding ordering fnf_{n} supported on singleton sets by fn=𝟏S{1,,n}f_{n}=\mathbf{1}_{S\cap\{1,...,n\}}. Then

#{mn:m𝒢S}jnjSrjnϵjnjSrja.s.0\frac{\displaystyle\#\{m\leq n:m\in\mathscr{G}\cap S\}-\sum_{\begin{subarray}{c}j\leq n\\ j\in S\end{subarray}}r_{j}}{\displaystyle n^{\epsilon}\sqrt{\sum_{\begin{subarray}{c}j\leq n\\ j\in S\end{subarray}}r_{j}}}\overset{a.s.}{\longrightarrow}0

as nn\to\infty.

If we let rnr_{n} be the probabilities of Cramér’s model (or of the modifications improving the model), Heuristic 1.7 makes predictions for sets of prime numbers belonging to any subset SS\subseteq\mathbb{N} obeying the incredibly mild density condition

jnjSrjnδ\sum_{\begin{subarray}{c}j\leq n\\ j\in S\end{subarray}}r_{j}\gg n^{\delta}

for some δ>0\delta>0. This prediction is extremely broad, where the currently “known issues” amount to divisibility by small primes which are accounted for in corrections to Cramér’s random model (see [Gra95] for a nice summary). This example is not “novel”, such measure 11 statements exist throughout the literature on random models for the primes generalizing Cramér’s work as a means to justify a number of the most famous conjectures on the distribution of primes. This includes the likes of the Hardy-Littlewood conjecture (where ff is the characteristic function on admissible constellations starting from jj) and primes of the form x2+1x^{2}+1 (where ff is the characteristic function of integers of the form x2+1x^{2}+1).

In addition to being a short example with nice properties, this is meant to demonstrate two things. Firstly this example demonstrates that important existing random models (and corresponding results for those models) for the distribution of prime numbers arise as special cases of Heuristic 1.7 (respectively Theorem 1.3) for the category of subsets of \mathbb{N}. Secondly, this example demonstrates the power of working at this level of generality. We have given a concrete description for when such probability 11 results will exist, which allow us to tackle numerous orderings at a time.

6.2 Random groups

Any sequence MG=O(|G|n)M_{G}=O(|G|^{n}) on the category of finite groups is well-behaved in the sense of [SW22], so their results can be used to determine when such a moment sequence gives rise to a probability distribution. We will focus on the example MG=1M_{G}=1 discussed in [SW22]. We can use Theorem 1.3 to make predictions for (asymptotically) how many quotients of a “random” profinite group have a given behavior. While questions of this nature are easy enough to formulate, the moments of the most natural orderings fnf_{n} can be much harder to compute in practice. Utilizing the results of the preceeding subsection, we give the following example:

Theorem 6.3.

Let MG=1M_{G}=1 on the category of finite abelian groups with associated probability measure μ\mu on the profinite abelian groups. Then the average number of maximal subgroups in a random pro-abelian group of index bounded above by nn tends to loglogn\log\log n in almost surely. More specifically,

#{N𝒢 maximal:[𝒢:N]n}loglogna.s.1\frac{\#\{N\leq\mathscr{G}\text{ maximal}:[\mathscr{G}:N]\leq n\}}{\log\log n}\overset{a.s.}{\longrightarrow}1

as nn\to\infty.

Proof.

We take the ordering

fn(G)={1p1GCp,pn0else.f_{n}(G)=\begin{cases}\frac{1}{p-1}&G\cong C_{p},\ p\leq n\\ 0&\text{else}.\end{cases}

All maximal subgroups of an abelian group are normal with quotient isomorphic to CpC_{p} for some prime pp, so we can compute

#{N𝒢 maximal:[𝒢:N]n}\displaystyle\#\{N\leq\mathscr{G}\text{ maximal}:[\mathscr{G}:N]\leq n\} =pn#Epi(𝒢,Cp)p1\displaystyle=\sum_{p\leq n}\frac{\#{\rm Epi}(\mathscr{G},C_{p})}{p-1}
=GC/fn(G)#Epi(𝒢,Cp)\displaystyle=\sum_{G\in C/\cong}f_{n}(G)\#{\rm Epi}(\mathscr{G},C_{p})
=N(𝒢,fn).\displaystyle=N(\mathscr{G},f_{n}).

We note that fnf_{n} is nonegative and that nN(𝒢,fn)n\mapsto N(\mathscr{G},f_{n}) is certainly increasing in nn. Thus, we can apply Theorem 1.3(iii). We first compute

Cfn𝑑M\displaystyle\int_{C}f_{n}\ dM =pn1p1loglogn.\displaystyle=\sum_{p\leq n}\frac{1}{p-1}\sim\log\log n.

We next apply Theorem 1.4 to bound (3). The ordering fnf_{n} is supported on the finite simple abelian groups CpC_{p}, which are of pairwise coprime order. Thus, the intersection of C2E(2,M)C^{2}\setminus E(2,M) with the support of fnf_{n} is precisely the diagonal objects (Cp,Cp)(C_{p},C_{p}) with pnp\leq n. We compute

C2EpiM2C2fn𝑑M(2)\displaystyle\int_{C^{2}\setminus{\rm Epi}_{M}^{2}C^{2}}f_{n}\ dM^{(2)} =pn1(p1)2M(Cp,Cp)\displaystyle=\sum_{p\leq n}\frac{1}{(p-1)^{2}}M_{(C_{p},C_{p})}
=pn1(p1)2(MCp×Cp(p1)MCp)\displaystyle=\sum_{p\leq n}\frac{1}{(p-1)^{2}}\left(M_{C_{p}\times C_{p}}-(p-1)M_{C_{p}}\right)
=pnp(p1)2\displaystyle=\sum_{p\leq n}\frac{p}{(p-1)^{2}}
loglogn,\displaystyle\sim\log\log n,

where the second equality follows from Lemma 5.3, noting that the subgroups of Cp×CpC_{p}\times C_{p} that surject onto each coordinate are the whole group, and p1p-1 subgroups isomorphic to CpC_{p}. The j=1j=1 integral actually converges as nn\to\infty by a similar computation, so this is the maximum. All together Theorem 1.4 implies

𝒫|N(𝒢,fn)Cfn𝑑M|2𝑑μ(𝒢)\displaystyle\int_{\mathcal{P}}\left\lvert N(\mathscr{G},f_{n})-\int_{C}f_{n}\ dM\right\rvert^{2}d\mu(\mathscr{G}) =O(loglogn)\displaystyle=O(\log\log n)
=O((loglogn)2(loglogn)1/2)\displaystyle=O\left(\frac{(\log\log n)^{2}}{(\log\log n)^{1/2}}\right)
=O((Cfn𝑑M)2ψ(Cfn𝑑M)),\displaystyle=O\left(\frac{\left(\int_{C}f_{n}dM\right)^{2}}{\psi\left(\int_{C}f_{n}dM\right)}\right),

where ψ(t)=t1/2\psi(t)=t^{1/2} is non decreasing with limtψ(t)=\lim_{t\to\infty}\psi(t)=\infty. We confirm that 1nψ(n)=n3/2<\sum\frac{1}{n\psi(n)}=\sum n^{-3/2}<\infty converges, so the result follows from Theorem 1.3(iii). ∎

More general questions on random groups can be asked, but much about the asymptotic behavior of large families of groups is difficult to compute. This makes asymptotic growth rates and error terms also difficult to compute in practice. For instance, it would be nice to study the ordering fn(G)=1|Aut(G)|f_{n}(G)=\frac{1}{|\textnormal{Aut}(G)|} if |G|n|G|\leq n and 0 otherwise. The corresponding counting function is then

N(𝒢,fn)=#{N𝒢[𝒢:N]n}.N(\mathscr{G},f_{n})=\#\{N\trianglelefteq\mathscr{G}\mid[\mathscr{G}:N]\leq n\}.

This is a naturally interesting function to ask about. However, the corresponding moment

Cfn𝑑M=|G|n1|Aut(G)|\int_{C}f_{n}\ dM=\sum_{|G|\leq n}\frac{1}{|\textnormal{Aut}(G)|}

is not easy to determine. For instance, there are reasons to suspect that 100% of groups ordered by cardinality are 22-groups, but the author is not aware of a proof of this statement. Moreover, the bound in Theorem 1.4 requires some more serious group theory to translate the results of Lemma 5.3 into bounds on the error.

A number of interesting statistical questions that are simple to state arise in this way. The language of Theorem 1.3 is useful to create a framework for determining asymptotic growth rates, even if the moments of the ordering requires additional study to prove any results.

References

  • [Alb23] Brandon Alberts. A random group with local data, 2023. Forthcoming.
  • [BBH17] Nigel Boston, Michael R. Bush, and Farshid Hajir. Heuristics for pp-class towers of imaginary quadratic fields. Mathematische Annalen, 368(1-2):633–669, June 2017.
  • [BM90] V. V. Batyrev and Yu. I. Manin. Sur le nombre des points rationnels de hauteur borné des variétés algébriques. Mathematische Annalen, 286(1-3):27–43, 1990.
  • [CL84] Henri Cohen and Hendrik. W. Lenstra. Heuristics on class groups of number fields. Lecture Notes in Mathematics Number Theory Noordwijkerhout 1983, pages 33–62, 1984.
  • [Cra94] Harald Cramér. Some theorems concerning prime numbers. Springer Collected Works in Mathematics, page 138–170, 1994.
  • [Ele18] (https://math.stackexchange.com/users/7145/elements) Elements. Weak law of large numbers for dependent random variables with bounded covariance. Mathematics Stack Exchange, 2018. URL:https://math.stackexchange.com/q/245327 (version: 2018-04-10).
  • [Ete81] N. Etemadi. An elementary proof of the strong law of large numbers. Z. Wahrscheinlichkeitstheorie und Verwandte Gebiete, 55(1):119–122, 1981.
  • [Ete83a] N. Etemadi. On the laws of large numbers for nonnegative random variables. Journal of Multivariate Analysis, 13(1):187–193, 1983.
  • [Ete83b] N. Etemadi. Stability of sums of weighted nonnegative random variables. Journal of Multivariate Analysis, 13(2):361–365, 1983.
  • [Fis11] Hans Fischer. A history of the central limit theorem: From classical to modern probability theory. Springer New York, 2011.
  • [FMT89] Jens Franke, Yuri I. Manin, and Yuri Tschinkel. Rational points of bounded height on fano varieties. Inventiones Mathematicae, 95(2):421–435, 1989.
  • [FW89] Eduardo Friedman and Lawrence C. Washington. On the distribution of divisor class groups of curves over a finite field. Théorie des nombres / Number Theory, page 227–239, 1989.
  • [Gra95] Andrew Granville. Harald cramér and the distribution of prime numbers. Scandinavian Actuarial Journal, 1995(1):12–28, 1995.
  • [KP10] V. M. Korchevsky and V. V. Petrov. On the strong law of large numbers for sequences of dependent random variables. Vestnik St. Petersburg University: Mathematics, 43(3):143–147, 2010.
  • [LWZB19] Yuan Liu, Melanie Matchett Wood, and David Zureick-Brown. A predicted distribution for Galois groups of maximal unramified extensions, July 2019. Preprint available at https://arxiv.org/abs/1907.05002.
  • [Mal02] Gunter Malle. On the distribution of Galois groups. Journal of Number Theory, 92(2):315–329, 2002.
  • [Mal04] Gunter Malle. On the distribution of Galois groups, II. Experimental Mathematics, 13(2):129–135, 2004.
  • [Mic17] (https://math.stackexchange.com/users/155065/michael) Michael. Pairwise uncorrelated random variables in strong law of large numbers (slln). Mathematics Stack Exchange, 2017. URL:https://math.stackexchange.com/q/2545239 (version: 2017-11-30).
  • [MU17] Michael Mitzenmacher and Eli Upfal. Probability and computing: Randomization and probabilistic techniques in algorithms and data analysis. Cambridge University Press, 2017.
  • [Pet09a] V. V. Petrov. On stability of sums of nonnegative random variables. Journal of Mathematical Sciences, 159(3):324–326, 2009.
  • [Pet09b] V. V. Petrov. On the strong law of large numbers for nonnegative random variables. Theory of Probability &; Its Applications, 53(2):346–349, 2009.
  • [Sen13] Eugene Seneta. A tricentenary history of the law of large numbers. Bernoulli, 19(4), 2013.
  • [SS93] Pranab Kumar Sen and Julio da Motta Singer. Large sample methods in statistics: An introduction with applications. Chapman & Hall/CRC, 1 edition, 1993.
  • [SW22] Will Sawin and Melanie Matchett Wood. The moment problem for random objects in a category, Oct 2022.
  • [wik22] Law of large numbers, Oct 2022.