The Variance and Covariance of Counts-in-Cells Probabilities

Andrew Repp & István Szapudi
Institute for Astronomy, University of Hawaii, 2680 Woodlawn Drive, Honolulu, HI 96822, USA

Abstract

Counts-in-cells (CIC) measurements contain a wealth of cosmological information yet are seldom used to constrain theories. Although we can predict the shape of the distribution for a given cosmology, to fit a model to the observed CIC probabilities requires the covariance matrix – both the variance of counts in one probability bin and the covariance between counts in different bins. To date, there have been no general expressions for these variances. Here we show that correlations of particular levels, or “slices,” of the density field determine the variance and covariance of CIC probabilities. We derive explicit formulae that accurately predict the variance and covariance among subvolumes of a simulated galaxy catalog, opening the door to the use of CIC measurements for cosmological parameter estimation.

keywords:

cosmology: theory – cosmology: miscellaneous – methods: statistical


1 Introduction

One of the primary motivations for galaxy surveys is their ability to constrain cosmological models, a consequence of the statistical imprint which cosmology leaves upon the galaxy distribution. The galaxy power spectrum is an observable which captures a large amount of the information inherent in these surveys (e.g., Peebles 1980; Baumgart and Fry 1991); indeed, for a Gaussian field the power spectrum encodes all of the field’s cosmologically relevant information.

However, the power spectrum is sensitive to the second moment of the distribution only. Thus, since the matter distribution is non-Gaussian (notably so on scales of $10h^{-1}$ Mpc or less), the power spectrum is blind to the cosmological information residing in the higher moments on these scales (Meiksin and White 1999; Rimes and Hamilton 2005). Even worse, because the distribution is close to lognormal, a significant amount of information escapes the entire hierarchy of $N$ -point correlation functions (Carron 2011; Carron and Neyrinck 2012). Other statistical tools are thus necessary to capture the information lost in the power spectrum.

The log transform – again because of the approximate lognormality of the distribution – represents one means of recapturing this information (Neyrinck et al. 2006, 2009; Repp and Szapudi 2017); indeed, the log power spectrum captures virtually all of it (Carron and Szapudi 2013, 2014). Another means of recapturing at least some of this information is to consider the (one-point) counts-in-cells (CIC) probability distribution function (PDF). Although this measure ignores the information inherent in spatial correlations, its higher-order moments encode information to which the power spectrum is blind, and thus joint analysis of the PDF and power spectrum can provide significantly tighter constraints on cosmological parameters than analysis of the power spectrum alone (Uhlemann et al. 2020).

Theoretical work on cosmological applications of CIC dates at least to the efforts of Balian and Schaeffer (1989) – who derive the form of the matter PDF under the assumption of scale-invariant $N$-point correlation functions – resulting in a model by Bernardeau and Schaeffer (1991) for galaxy multiplicity functions. Analysis continued with the study by Colombi (1994) of log moments (via the Edgeworth expansion); the proposal by Bernardeau and Kofman (1995) of methods for generating PDFs; and the derivation by Bernardeau (1994a, b) of cumulants for the matter PDF. Colombi et al. (1995) in turn examine the errors introduced by finite-volume effects, and Szapudi and Colombi (1996) characterize the effects of cosmic-variance error. Other existing theoretical work includes analysis of sampling effects (Colombi et al. 1998) and of the distribution of probability measurements (Szapudi et al. 2000). Valageas (2002) provides a non-perturbative calculation of the PDF in the quasilinear regime; likewise, Uhlemann et al. (2018a, b, 2020) use large deviation statistics to predict the CIC for galaxy surveys.

CIC measurements also have a long history, including their use in simulations by Baugh et al. (1995) to determine the $N$-point correlation functions and by Colombi et al. (2000) to determine the void probability distribution. Application to survey data includes analyses by Szapudi et al. (1992), Gaztanaga (1994), and Szapudi et al. (1995, 1996) of early projected surveys, and by Baugh et al. (2004) and Croton et al. (2007) of the 2dFGRS data; measurement by Pápai and Szapudi (2010) of the PDF in the SDSS LRG sample; determination by Wolk et al. (2013) of higher-order statistics in the CFHTLS-W survey; and verification by Clerkin et al. (2017) of log-normality of the projected CIC in DES data.

Turning to actual constraints on cosmological parameters, Gruen et al. (2018), together with Friedrich et al. (2018), provide the first complete cosmological analysis of the galaxy density PDF (in combination with lensing), thereby deriving cosmological constraints from DES and SDSS data. In addition, Salvador et al. (2019) use CIC statistics to analyze nonlinear galaxy bias in DES data, and Repp and Szapudi (2020) derive joint constraints on $\sigma_{8}$ and linear galaxy bias from CIC in SDSS data.

However, any use of CIC measurements to constrain cosmology requires an accounting for both the variance in the probability measurements and the covariance between measurements in different probability bins. It is likely that this fact plays a large role in the temporal gap (of nearly thirty years) between the initial theoretical work and the fits of Gruen et al. (2018). Early efforts such as Colombi (1994) suggested that fitting log moments might be more tractable, but until now fits to the entire PDF have typically required an entire ensemble of cosmological simulations to estimate the covariance matrix.

Therefore, as an alternative to running a computationally expensive suite of simulations, we in this work derive analytical expressions for the variance and covariance of CIC probability measurements. To test our results, we empirically determine the variability of probability measurements in a Millennium Simulation galaxy catalog, and we show that our expressions accurately predict the variance and covariance of the measured galaxy PDF.

We structure the remainder of this work as follows: Section 2 defines slice fields corresponding to probability bins; these slice fields form the foundation for the derivation. Section 3 derives the variance of CIC measurements along with the expected error on the estimator for this variance; we then demonstrate the accuracy of our result by comparing it to simulation measurements of the CIC variance. Likewise, Section 4 derives the covariance between CIC measurements in different bins of probability; we again determine the expected error on the covariance estimator and demonstrate the accuracy of our result. Discussion and conclusions follow in Sections 5 and 6.

2 The Slice Field

Suppose we have a field of objects (such as galaxies in a survey) contained in a number $N_{c}$ of non-overlapping cells positioned at $\mathbf{r}_{1},\mathbf{r}_{2},\ldots,\mathbf{r}_{N_{c}}$. If each cell contains a whole number $N_{i}=N(\mathbf{r}_{i})$ of objects, it is straightforward to measure the probability distribution $\mathcal{P}(N)$ for the field. More generally, if we bin the measured numbers $N$ into non-overlapping bins $B_{1},B_{2},\ldots$, we can likewise measure the probability $\mathcal{P}(B)$ for any given bin $B$, recovering $\mathcal{P}(N)$ when the bins have unit width.

For a given bin $B$ , we now define $\mathcal{S}_{B}$ , the slice field for $B$ , such that $\mathcal{S}_{B}(\mathbf{r}_{i})=1$ if $N(\mathbf{r}_{i})\in B$ ; otherwise $\mathcal{S}_{B}(\mathbf{r}_{i})=0$ . This field $\mathcal{S}_{B}$ thus identifies the spatial location of a particular slice of the possible values of $N$ , namely, those for which $N\in B$ . Bins of unit width allow us to recover the original $N$ -field of counts-in-cells (CIC) by summing over the slice fields:

N(\mathbf{r}_{i})=\sum_{N}N\mathcal{S}_{\{N\}}(\mathbf{r}_{i}).

(1)

Wider bins likewise recover a binned version of the original field.

Two simple slice-field properties will be useful in the sequel. First, because all slice-field values are either 0 or 1, all moments of any given slice field are equal to the probability of the associated bin:

\mathcal{P}(B)=\langle\mathcal{S}_{B}\rangle=\langle\mathcal{S}_{B}^{\>n}\rangle,

(2)

for all natural numbers $n$ . Second, if two bins $B_{1}$ and $B_{2}$ are disjoint, their corresponding slice fields must also be disjoint:

\mathcal{S}_{B_{1}}\cdot\mathcal{S}_{B_{2}}\equiv 0\mbox{ if }B_{1}\cap B_{2}=\varnothing.

(3)

The correlation function of the slice field is related to the sliced correlation functions as defined in Neyrinck et al. (2018); it is straightforward to recover the sliced correlation functions from these slice fields by a 1-point slice-averaging at one end.
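The slice-field definitions above can be illustrated numerically. The following is a minimal sketch, assuming a toy Poisson count field on an $8^{3}$ grid (the grid, the Poisson mean, and the helper name `slice_field` are illustrative assumptions, not part of the analysis in this paper); it checks Equations 1 to 3 for unit-width bins.

```python
import numpy as np

def slice_field(counts, bin_values):
    """Return S_B: 1 where the cell count lies in the bin B, 0 elsewhere."""
    return np.isin(counts, list(bin_values)).astype(int)

# Toy counts-in-cells field (an assumption made purely for illustration).
rng = np.random.default_rng(0)
counts = rng.poisson(1.5, size=(8, 8, 8))

# One slice field per unit-width bin {0}, {1}, ..., {N_max}.
slices = {n: slice_field(counts, [n]) for n in range(counts.max() + 1)}

# Equation 1: summing N * S_{N} over unit-width bins recovers N(r).
recovered = sum(n * s for n, s in slices.items())

# Equation 2: every moment of a 0/1 field equals the bin probability.
p0 = slices[0].mean()
p0_cubed = (slices[0] ** 3).mean()

# Equation 3: slice fields of disjoint bins never overlap.
overlap = (slices[0] * slices[1]).sum()
```

Wider bins would simply pass several count values to `slice_field`, in which case the weighted sum recovers a binned version of the field rather than the field itself.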

3 Counts-in-cells Variance

3.1 Theoretical Prediction

We now turn to the problem of determining the error on the measured CIC probability in a way that accounts for correlations between neighboring cells. Given a bin $B$ of counts, and writing $\mathcal{S}$ for $\mathcal{S}_{B}$ , we see from Equation 2 that the probability of that bin $\mathcal{P}(B)=\langle\mathcal{S}\rangle$ . Furthermore, the variance of $\mathcal{S}$ is $\sigma^{2}_{\mathcal{S}}=\langle\mathcal{S}^{2}\rangle-\langle\mathcal{S}\rangle^{2}=\mathcal{P}(B)\left(1-\mathcal{P}(B)\right)$ .

If the survey cells are uncorrelated, it follows that the variance of $\mathcal{P}(B)$ is given by

\sigma^{2}_{\mathcal{P}(B)}=\sigma^{2}_{\langle\mathcal{S}\rangle}=\frac{\mathcal{P}(B)\left(1-\mathcal{P}(B)\right)}{N_{c}}.

(4)

(Note that this expression appears in Colombi et al. (1995) for $B=\{0\}$ with no correlation.) Equation 4 treats each cell as an independent measurement of the probability; correlations between cells will decrease the effective number of independent measurements, requiring modification of the expression.

Thus, to handle the case of correlated cells, we first consider the artificial case of an $\mathcal{S}$-field consisting of $n$ survey cells such that the correlation between any two of them is a fixed value $\xi$. Explicitly, writing $s_{i}$ for the value of $\mathcal{S}$ in the $i$th cell, we have for $i\neq j$ that $\langle s_{i}s_{j}\rangle=(1+\xi)P^{2}$, where $P=\mathcal{P}(B)$ is the probability that $\mathcal{S}=1$ in any given cell. (Note that $\xi$ is the correlation of $\mathcal{S}$, not of the counts-in-cells $N$.) We know that

\mathcal{P}(B)=\langle\mathcal{S}\rangle=\frac{1}{n}\sum_{i=1}^{n}s_{i}.

(5)

In Appendix A.1 we show that the variance $\sigma^{2}_{\mathcal{P}(B)}$ of this quantity is given by

\mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}s_{i}\right)=\frac{\left(1-\left(1-(n-1)\xi\right)P\right)P}{n}.

(6)

However, this expression presupposes an equal degree of correlation $\xi$ between all $n$ cells. Obtaining a similar expression for the general case of varying $\xi$ is analytically intractable. However, we note that, in the inductive proof of Equation 6 (Appendix A.1), the term through which new instances of $\xi$ accumulate into the expression is $\langle s_{i}s_{n+1}\rangle$ (in Equation 22). This term has precisely the form expected for a Monte Carlo volume average of a spatially-varying $\xi(r)$ . With this motivation, we approximate $\xi$ in Equation 6 with the average slice-field correlation $\overline{\xi}_{\mathcal{S}}$ , where the average is taken over all pairs $(\mathbf{r}_{i},\mathbf{r}_{j})$ of survey cells ( $i\neq j$ ). (In Section 3.3 we verify the validity of this approximation numerically.) Performing this substitution and writing $N_{c}$ (the number of cells) for $n$ , we obtain for any counts-in-cells bin $B$ ,

\sigma^{2}_{\mathcal{P}(B)}=\frac{\mathcal{P}(B)\left(1-\mathcal{P}(B)\right)}{N_{c}}+\frac{(N_{c}-1)\overline{\xi}_{\mathcal{S}}}{N_{c}}\mathcal{P}(B)^{2},

(7)

which reduces to Equation 4 in the uncorrelated case.
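As a sanity check on Equation 7, one can compare it with a case in which the variance is known exactly. The sketch below is illustrative only: it assumes a toy model (not used in this paper) in which all cells of a realization share a randomly drawn probability $p$, so that every pair of cells has the same correlation $\xi=\langle p^{2}\rangle/P^{2}-1$. The function names are hypothetical.

```python
import numpy as np
from math import comb

def cic_variance(P, n_cells, xi_bar=0.0):
    """Equation 7: variance of a measured CIC probability P(B)."""
    return P * (1.0 - P) / n_cells + (n_cells - 1) / n_cells * xi_bar * P**2

def mixture_variance(p_values, weights, n_cells):
    """Exact variance of the mean of n_cells Bernoulli cells that share a
    random success probability p (toy equicorrelated model)."""
    k = np.arange(n_cells + 1)
    pmf = np.zeros(n_cells + 1)
    for p, w in zip(p_values, weights):
        pmf += w * np.array(
            [comb(n_cells, int(j)) * p**j * (1 - p) ** (n_cells - j) for j in k]
        )
    frac = k / n_cells                      # possible values of the mean
    mean = np.sum(pmf * frac)
    return np.sum(pmf * (frac - mean) ** 2)

# Two equally likely "environments" with different cell probabilities.
p_values, weights, n_cells = (0.1, 0.4), (0.5, 0.5), 50
P = sum(w * p for p, w in zip(p_values, weights))
xi = sum(w * p * p for p, w in zip(p_values, weights)) / P**2 - 1.0

predicted = cic_variance(P, n_cells, xi)    # Equation 7 with xi_bar = xi
exact = mixture_variance(p_values, weights, n_cells)
```

In this toy model the two numbers agree to rounding error, since the correlation really is constant; for a spatially varying $\xi(r)$ the substitution of $\overline{\xi}_{\mathcal{S}}$ remains an approximation.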

3.2 Error on the Measured Variance

In order to test Equation 7, we need multiple measurements of the probability $\mathcal{P}(B)$ . It is possible to obtain such measurements from a single simulation by dividing it into multiple subvolumes, calculating $\mathcal{P}(B)$ in each subvolume, and then observing the variance of those measured $\mathcal{P}(B)$ -values. However, a meaningful comparison between measurement and theory requires an estimate of the possible scatter in these observations of the variance. To this we now turn.

Suppose we have a set $\left\{P_{i}\right\}$ of $N_{m}$ measurements of a probability $\mathcal{P}(B)$ . Let us denote the measured variance of this set as $s_{P}^{2}$ . Then according to a standard result (e.g., Kendall and Stuart 1958, eq. 12.35, converting cumulants to moments),

\mathrm{Var}\left(s_{p}^{2}\right)=\frac{\mu_{4}-\mu_{2}^{2}}{N_{m}}+\frac{2\mu_{2}^{2}}{N_{m}(N_{m}-1)},

(8)

where $\mu_{j}$ denotes the $j$th central moment of the distribution of $P_{i}$-values. We can use the measured variance $s_{P}^{2}$ for $\mu_{2}$; however, we must perform an additional estimation of $\mu_{4}$. For simplicity, in calculating $\mu_{4}$ we will ignore the correlation between neighboring cells; we shall however mitigate the effect of this simplification by expressing our results in terms of $s_{p}^{2}$, which does include the effects of correlation.

Proceeding to determine $\mu_{4}$, and employing $P$ as shorthand for $\mathcal{P}(B)$, it is straightforward to obtain the moment-generating function for the slice field $\mathcal{S}$ corresponding to bin $B$:

M_{\mathcal{S}}(t)=1+\left(e^{t}-1\right)P.

(9)

Now, if each measurement $P_{i}$ was obtained by averaging the $\mathcal{S}$ -values in $N_{c}$ cells (see Equation 5), then the moment-generating function for the probability is

M_{\mathcal{P}}(t)=\frac{1}{N_{c}}\left(1+\left(e^{t}-1\right)P\right)^{N_{c}}.

(10)

From this function (see Appendix A.2 for details) we obtain the following expression for $\mu_{4}$ :

\mu_{4}=3\left(s_{p}^{2}\right)^{2}+\frac{s_{p}^{2}}{N_{c}^{2}}(1-6P+6P^{2});

(11)

inserting this result into Equation 8 and simplifying, we obtain

\mathrm{Var}\left(s_{p}^{2}\right)=\frac{2\left(s_{p}^{2}\right)^{2}}{N_{m}-1}+\frac{s_{p}^{2}(1-6P+6P^{2})}{N_{m}N_{c}^{2}},

(12)

where $N_{c}$ is the number of survey cells in each probability measurement $P_{i}$, and $N_{m}$ is the total number of such measurements. Equation 12 thus gives us the uncertainty on the measured value of $s_{p}^{2}$.

Note also that each probability measurement $P_{i}$ is a mean of $N_{c}$ values of the $\mathcal{S}$ -field. Thus for reasonably large values of $N_{c}$ , we can invoke the central limit theorem and treat the distribution of $P_{i}$ -values as Gaussian, in which case $\mathrm{Var}\left(s_{p}^{2}\right)=2\left(s_{p}^{2}\right)^{2}\!/(N_{m}-1)$ . This result is, of course, the limit of Equation 12 for large $N_{c}$ .
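The error estimate of this subsection can be written as a few short functions. The sketch below is a hedged illustration with hypothetical function names. It assumes uncorrelated cells, so that each $P_{i}$ is a binomial mean, and it uses the fourth moment in the form $\mu_{4}=3(s_{p}^{2})^{2}+s_{p}^{2}(1-6P+6P^{2})/N_{c}^{2}$, which is the form consistent with Equation 12. The exact binomial moments serve only as a cross-check.

```python
import numpy as np
from math import comb

def mu4_uncorrelated(s2, P, n_cells):
    """Fourth central moment of a mean of n_cells uncorrelated 0/1 cells."""
    return 3.0 * s2**2 + s2 * (1 - 6 * P + 6 * P**2) / n_cells**2

def var_of_variance_general(mu2, mu4, n_meas):
    """Equation 8: variance of the sample variance of n_meas measurements."""
    return (mu4 - mu2**2) / n_meas + 2 * mu2**2 / (n_meas * (n_meas - 1))

def var_of_variance(s2, P, n_meas, n_cells):
    """Equation 12: Equation 8 with the uncorrelated-cell fourth moment."""
    return (2 * s2**2 / (n_meas - 1)
            + s2 * (1 - 6 * P + 6 * P**2) / (n_meas * n_cells**2))

def exact_central_moment(P, n_cells, order):
    """Exact central moment of a binomial mean, by direct summation."""
    k = np.arange(n_cells + 1)
    pmf = np.array(
        [comb(n_cells, int(j)) * P**j * (1 - P) ** (n_cells - j) for j in k]
    )
    return np.sum(pmf * (k / n_cells - P) ** order)

P, n_cells, n_meas = 0.2, 64, 8             # illustrative values only
s2 = exact_central_moment(P, n_cells, 2)    # equals P(1-P)/N_c here
mu4 = mu4_uncorrelated(s2, P, n_cells)
sigma_s2 = np.sqrt(var_of_variance(s2, P, n_meas, n_cells))  # error bar on s2
```

In practice one would pass the measured $s_{p}^{2}$ and $P$ rather than the binomial values, as described in the text.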

3.3 Comparing Theory with Measurement

Figure 1: A comparison of the variances of CIC probabilities predicted by Equation 7 with those measured in a mock galaxy survey catalog, in $1.95h^{-1}$-Mpc cubical cells (left panels) and $31.25h^{-1}$-Mpc cubical cells (right panels). The top axes show the number of subvolumes into which the survey is split ($N_{m}$ in Equation 12), equal to the number of measurements of $\mathcal{P}(B)$; the bottom axes show the number of cells within each subvolume ($N_{c}$ in Equation 12). Thick lines (with error bars) show the empirical variance of the measurements of $\mathcal{P}(B)$ (one measurement for each subvolume, each subvolume containing $N_{\mathrm{cells}}$ cells); thin lines show the predicted variances from Equation 7, where we determine the volume-averaged correlations as described in the text. The thin dotted lines show the predicted variance for the uncorrelated case (Equation 4). The bottom panels display the ratio of the true variance to that predicted in the absence of correlation; in some cases the correlations can increase the variance by two orders of magnitude. For clarity, the lower-panel curves have received a slight horizontal offset from each other.

We can now compare the predictions of Equation 7 with measured variances. To do so, we make use of the L-galaxies catalog (Bertone et al. 2007; from the repository at http://gavo.mpa-garching.mpg.de/Millennium/) from the Millennium Simulation (Springel et al. 2005), imposing a stellar mass cut of $M_{\star}\geq 10^{9}\mathrm{M}_{\odot}$ to obtain a mock galaxy survey. We perform two tests, one with the survey volume divided into $256^{3}$ cubical cells (with side length $1.95h^{-1}$ Mpc) and a second with the volume divided into $16^{3}$ cubical cells (with side length $31.25h^{-1}$ Mpc).

We next split the survey into $N_{m}$ subvolumes, each consisting of $N_{c}$ survey cells (so that $N_{m}N_{c}=N_{\mathrm{tot}}$ , the total number of cells in the survey). Given a CIC bin $B$ , we determine the probability $\mathcal{P}(B)$ within each subvolume, thus obtaining an ensemble of $N_{m}$ measured probabilities. The variance of this ensemble gives us a measured value for $\sigma^{2}_{\mathcal{P}}(B)$ , and this value will depend on the number of cells $N_{c}$ used in the measurement of each probability. These empirical variances appear as thick curves in Fig. 1. In this figure, the bottom axis shows $N_{c}$ , the number of cells in each subvolume, and the top axis shows $N_{m}$ , the number of subvolumes. (Note also that we choose unit bin widths for the $1.95h^{-1}$ -Mpc cells and varying bin widths for the $31.25h^{-1}$ -Mpc cells.) Equation 12 gives us the error bars on these measurements, where we use for $s^{2}_{p}$ and $P$ the measured variances and probabilities.
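The subvolume procedure just described can be sketched as follows. This is a minimal illustration that assumes cubical subvolumes tiling a cubic grid (the text does not specify the subvolume geometry) and uses hypothetical function names; the toy slice field is chosen only so that the result is easy to verify by hand.

```python
import numpy as np

def subvolume_probabilities(slice_field, sub_side):
    """Measure P(B) in each cubical subvolume of sub_side**3 cells.

    slice_field is a cubic 0/1 array; sub_side must divide its side length.
    """
    n = slice_field.shape[0]
    if slice_field.shape != (n, n, n) or n % sub_side:
        raise ValueError("need a cubic grid whose side is a multiple of sub_side")
    k = n // sub_side                        # subvolumes per dimension
    blocks = slice_field.reshape(k, sub_side, k, sub_side, k, sub_side)
    return blocks.mean(axis=(1, 3, 5)).ravel()

def empirical_variance(slice_field, sub_side):
    """Return (s_p^2, N_m, N_c) for one choice of subvolume size."""
    probs = subvolume_probabilities(slice_field, sub_side)
    return probs.var(ddof=1), probs.size, sub_side**3

# Toy slice field: the bin is occupied in half of the box only.
toy = np.zeros((4, 4, 4))
toy[:2] = 1.0
s2, n_meas, n_cells = empirical_variance(toy, sub_side=2)
```

Here `ddof=1` gives the unbiased sample variance; repeating the call for several values of `sub_side` traces out one thick curve of the kind shown in Fig. 1.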

The thin curves in Fig. 1 show the predictions of Equation 7. This prediction is not entirely a priori, since it requires the (measured) probability values $\mathcal{P}(B)$ and the measured volume-averaged correlation $\overline{\xi}_{\mathcal{S}}$ of the corresponding slice field. To obtain the latter, we first measure the two-point correlation function $\xi_{\mathcal{S}}(r)$ of the appropriate slice field using a standard fast Fourier transform method; we then use Monte Carlo sampling of the slice field to obtain random pairs, using $\xi(r)$ to calculate their correlation and folding the result into the average; we continue the sampling process until the variation in the running average has subpercent effect on the predicted $\sigma^{2}_{\mathcal{P}}(B)$ .
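The estimate of $\overline{\xi}_{\mathcal{S}}$ described above can be sketched in a few lines. The code below is a simplified illustration and not the pipeline used for the figures. It assumes a periodic box, tabulates the correlation at every lattice offset rather than as a function of $r$ alone, and uses a fixed number of random pairs instead of the running-average convergence criterion. All function names are hypothetical.

```python
import numpy as np

def slice_correlation_grid(slice_field):
    """Two-point correlation of a slice field at every lattice offset,
    computed with FFTs under the assumption of a periodic box."""
    P = slice_field.mean()
    delta = slice_field / P - 1.0            # so that <S S'> = P^2 (1 + xi)
    power = np.abs(np.fft.fftn(delta)) ** 2
    return np.fft.ifftn(power).real / slice_field.size

def mean_pair_correlation(xi_grid, sub_side, n_pairs=100_000, seed=0):
    """Monte Carlo average of xi over random pairs of distinct cells
    inside a cubical subvolume with sub_side cells per dimension."""
    rng = np.random.default_rng(seed)
    ndim = xi_grid.ndim
    a = rng.integers(0, sub_side, size=(n_pairs, ndim))
    b = rng.integers(0, sub_side, size=(n_pairs, ndim))
    keep = np.any(a != b, axis=1)            # discard pairs with i == j
    offset = (a[keep] - b[keep]) % np.array(xi_grid.shape)
    return xi_grid[tuple(offset.T)].mean()

# Toy uncorrelated slice field (an assumption for illustration).
rng = np.random.default_rng(1)
toy = (rng.random((16, 16, 16)) < 0.3).astype(float)
xi_grid = slice_correlation_grid(toy)
P = toy.mean()
xi_bar = mean_pair_correlation(xi_grid, sub_side=4, n_pairs=20_000)
```

The subvolume side should stay at or below half the box side so that the periodic wrap-around of the offsets does not alias distinct separations. For the uncorrelated toy field, `xi_bar` should be close to zero, whereas a clustered slice field would give a positive value.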

The endpoints of the curves illustrate two extremes. The left-hand endpoints represent the situation in which each survey cell constitutes its own subvolume ( $N_{c}=1$ , and $N_{m}=256^{3}$ or $16^{3}$ ). In this case, each cell provides an estimate of $\mathcal{P}(B)$ , and these estimates are either 0 or 1 (depending on whether the cell falls into that probability bin). In this case we expect the variance of these estimates to be large.

The right-hand endpoint of each curve represents the opposite situation of high $N_{c}$ and low $N_{m}$ . At this endpoint, the survey is divided into 8 subvolumes ( $N_{c}=256^{3}\!/8$ or $16^{3}\!/8$ , and in both cases $N_{m}=8$ ). Here we have 8 measurements of $\mathcal{P}(B)$ , and the variance of these measurements is small due to the large number of cells $N_{c}$ involved in the calculation of each one. On the other hand, since we have only 8 measurements of $\mathcal{P}(B)$ , our estimate $s^{2}$ of the variance is less certain.

In both cases ( $1.95h^{-1}$ - and $31.25h^{-1}$ -Mpc cells), we see from this figure that the predicted variance is in excellent agreement with the measured values. Furthermore, we see the expected trends: as the number of cells $N_{c}$ used for the measurement of $\mathcal{P}(B)$ increases, the variance in those measurements decreases; however, as the number of measurements of $\mathcal{P}(B)$ decreases, the uncertainty in the variance increases.

The top panels of Fig. 1 also show, as thin dotted lines, the variance predicted under the assumption of no correlation between survey cells (Equation 4), which is proportional to $1/N_{c}$ ; the bottom panels show the ratio between the correlated and uncorrelated cases. It is evident that inter-cell correlations can have a significant effect on the variance of $\mathcal{P}(B)$ ; in the case of $\mathcal{P}(0)$ for the $1.95h^{-1}$ -Mpc cells, the difference is more than two orders of magnitude.

4 Counts-in-cells Covariance

4.1 Theoretical Prediction

The second issue one must consider in fitting models to CIC results is the covariance between different bins of counts. (It is clear that such covariance must exist, given that a survey cell falling into one bin is thereby excluded from all other bins.) Thus we here derive an expression for the covariance $\sigma_{\mathcal{P}(B_{1})\mathcal{P}(B_{2})}$ of counts-in-cells in two (disjoint) bins $B_{1}$ and $B_{2}$ . To simplify notation, we write $\mathcal{S}_{1}$ , $\mathcal{S}_{2}$ for the slice fields of the two bins, and $P_{1}$ , $P_{2}$ for $\mathcal{P}(B_{1})$ , $\mathcal{P}(B_{2})$ .

We begin by considering the case of two slice fields, both drawn from the same survey consisting of $n$ cells, and, as before, we initially assume that the cross-correlation between the two slice fields is a fixed, constant value $\xi_{12}$. In particular, for $i\neq j$ we let $s_{1i}=\mathcal{S}_{1}(\mathbf{r}_{i})$ and $s_{2j}=\mathcal{S}_{2}(\mathbf{r}_{j})$; then given this cross-correlation, we can say that the joint probability of $(s_{1i}=1,s_{2j}=1)$ is $\mathcal{P}(1,1)=P_{1}P_{2}(1+\xi_{12})$. Furthermore, since $\mathcal{S}_{1}(\mathbf{r}_{i})\mathcal{S}_{2}(\mathbf{r}_{j})$ vanishes unless $\mathcal{S}_{1}(\mathbf{r}_{i})=\mathcal{S}_{2}(\mathbf{r}_{j})=1$, the expected value $\langle\mathcal{S}_{1}(\mathbf{r}_{i})\mathcal{S}_{2}(\mathbf{r}_{j})\rangle=\mathcal{P}(1,1)$.

Now $P_{1}$ is simply $(1/n)\sum s_{1i}$ , and $P_{2}$ is $(1/n)\sum s_{2j}$ . Given these relationships, we prove in Appendix A.3 the following statement (analogous to Equation 6) concerning the covariance of the probabilities $P_{1}$ and $P_{2}$ :

\mathrm{Cov}\left(\frac{1}{n}\sum_{i=1}^{n}s_{1i}\,,\frac{1}{n}\sum_{j=1}^{n}s_{2j}\right)=\frac{-P_{1}P_{2}}{n}\left(1-(n-1)\xi_{12}\right).

(13)

At this point we make an approximation analogous to that in Section 3.1 by replacing the fixed $\xi_{12}$ with the average cross-correlation $\overline{\xi}_{\mathcal{S}_{1}\mathcal{S}_{2}}$ of the two slice fields. Again writing $N_{c}$ for $n$ , we obtain, for any disjoint counts-in-cells bins $B_{1}$ and $B_{2}$ ,

\sigma_{\mathcal{P}(B_{1})\mathcal{P}(B_{2})}=\frac{-\mathcal{P}(B_{1})\mathcal{P}(B_{2})}{N_{c}}+\frac{N_{c}-1}{N_{c}}\overline{\xi}_{\mathcal{S}_{1}\mathcal{S}_{2}}\mathcal{P}(B_{1})\mathcal{P}(B_{2}).

(14)

We note that in the absence of cross-correlation, the covariance is negative since the bins are mutually exclusive.
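The covariance prediction can be checked in the same spirit as the variance. The sketch below is illustrative only. It writes the prediction as $-P_{1}P_{2}/N_{c}+\frac{N_{c}-1}{N_{c}}\overline{\xi}_{\mathcal{S}_{1}\mathcal{S}_{2}}P_{1}P_{2}$, with the cross-correlation term entering with a positive sign as a direct expansion of the pair sum implies. It compares that form with a toy model (an assumption, not part of this paper) in which all cells of a realization share randomly drawn bin probabilities $(p_{1},p_{2})$, for which the law of total covariance gives the exact answer.

```python
import numpy as np

def cic_covariance(P1, P2, n_cells, xi12_bar=0.0):
    """Predicted covariance of the measured probabilities of two
    disjoint CIC bins, given the mean slice-field cross-correlation."""
    return (-P1 * P2 / n_cells
            + (n_cells - 1) / n_cells * xi12_bar * P1 * P2)

# Toy model: two equally likely environments; in each, the cells are
# independent with the bin probabilities given in the matching row.
probs = np.array([[0.6, 0.1],
                  [0.2, 0.3]])               # columns: bin 1, bin 2
weights = np.array([0.5, 0.5])
n_cells = 40

P1, P2 = weights @ probs                     # mean bin probabilities
E_p1p2 = weights @ (probs[:, 0] * probs[:, 1])
xi12 = E_p1p2 / (P1 * P2) - 1.0              # constant cross-correlation

# Exact result: E[Cov | environment] + Cov(p1, p2).
exact = -E_p1p2 / n_cells + (E_p1p2 - P1 * P2)
predicted = cic_covariance(P1, P2, n_cells, xi12)
```

In this example the cross-correlation is negative ($\xi_{12}=-0.25$), so the covariance is more negative than the value $-P_{1}P_{2}/N_{c}$ expected from mutual exclusivity alone.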

4.2 Error on the Measured Covariance

As in Section 3, we now wish to compare Equation 14 with results measured from simulations, and thus we require an estimate for the uncertainty of the measured covariances.

Let us begin with two disjoint CIC bins $B_{1}$ and $B_{2}$ ; let us also suppose that we have two sets $\left\{P_{1i}\right\}$ and $\left\{P_{2i}\right\}$ of measurements of probabilities $\mathcal{P}(B_{1})$ and $\mathcal{P}(B_{2})$ , each set consisting of $N_{m}$ measurements. We denote the covariance of these two sets of probability measurements as $S_{P_{1}P_{2}}$ .

The following expression (Kendall and Stuart 1958, p. 322) gives the variance of $S_{P_{1}P_{2}}$ (where our $S_{P_{1}P_{2}}$ corresponds to $k_{11}$ of Kendall and Stuart):

\mathrm{Var}\left(S_{P_{1}P_{2}}\right)=\frac{\mu_{22}}{N_{m}}+\frac{\mu_{02}\mu_{20}}{N_{m}(N_{m}-1)}-\frac{N_{m}-2}{N_{m}(N_{m}-1)}\mu_{11}^{2},

(15)

where we have converted cumulants into moments, with $\mu_{rs}$ denoting the $r$ th, $s$ th product moments about the means of the random variables $P_{1}$ , $P_{2}$ . In calculating these moments we will ignore cross-correlations between nearby cells (as in Section 3.2), but we shall again seek to reduce the impact of this simplification by expressing our result in terms of $S_{P_{1}P_{2}}$ .

Let us employ $P_{1}$ , $P_{2}$ as shorthand for $\mathcal{P}(B_{1})$ , $\mathcal{P}(B_{2})$ . Then the joint moment-generating function for the corresponding slice fields $\mathcal{S}_{1}$ , $\mathcal{S}_{2}$ is

M_{\mathcal{S}_{1}\mathcal{S}_{2}}(t_{1},t_{2})=1+\left(e^{t_{1}}-1\right)P_{1}+\left(e^{t_{2}}-1\right)P_{2}.

(16)

Furthermore (as in Section 3.2), each of the measurements $P_{1i}$ , $P_{2i}$ is an average over the $\mathcal{S}_{1}$ , $\mathcal{S}_{2}$ -values in $N_{c}$ cells. Thus the joint moment-generating function for the measured probabilities is

M_{\mathcal{P}_{1}\mathcal{P}_{2}}(t_{1},t_{2})=\frac{1}{N_{c}}\left(1+\left(e^{t_{1}}-1\right)P_{1}+\left(e^{t_{2}}-1\right)P_{2}\right)^{N_{c}}.

(17)

From this function, we calculate in Appendix A.4 the required joint central moments of the $\mathcal{P}_{1},\mathcal{P}_{2}$ distribution. We then apply these results to Equation 15 and, using $S_{P_{1}P_{2}}=-P_{1}P_{2}/N_{c}$ (Equation 14 with no cross-correlation), it eventually follows that

\mathrm{Var}\left(S_{P_{1}P_{2}}\right)=\frac{1}{N_{m}-1}\left\{2\left(S_{P_{1}P_{2}}\right)^{2}\left(1-\frac{3(N_{m}-1)}{N_{c}N_{m}}\right)-\frac{S_{P_{1}P_{2}}}{N_{c}^{2}N_{m}}\left[(1-P_{1}-P_{2})(1+N_{m}(N_{c}-1))+(P_{1}+P_{2})(N_{m}-1)\right]\right\}.

(18)

As with the variance in Section 3.2, we can note that for large values of $N_{c}$ , the $P_{1i}$ , $P_{2i}$ values quickly approach a joint Gaussian distribution; in this case we can take the limit of Equation 18 to obtain $\mathrm{Var}\left(S_{P_{1}P_{2}}\right)=2\left(S_{P_{1}P_{2}}\right)^{2}/(N_{m}-1)$ .
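Equation 18 is lengthy, so a numerical cross-check against Equation 15 may be useful. The sketch below uses hypothetical function names and assumes uncorrelated cells, for which the joint moments of the measured probabilities are those of a multinomial mean. The fourth-order joint cumulant used for $\mu_{22}$ is our own short derivation from the logarithm of Equation 16, not a quantity quoted in the text, and the bracket of Equation 18 is taken to contain $(1-P_{1}-P_{2})$.

```python
import numpy as np

def var_of_covariance(S, P1, P2, n_meas, n_cells):
    """Equation 18: variance of the measured covariance S of two bins."""
    lead = 2 * S**2 * (1 - 3 * (n_meas - 1) / (n_cells * n_meas))
    bracket = ((1 - P1 - P2) * (1 + n_meas * (n_cells - 1))
               + (P1 + P2) * (n_meas - 1))
    return (lead - S * bracket / (n_cells**2 * n_meas)) / (n_meas - 1)

def var_of_covariance_from_moments(P1, P2, n_meas, n_cells):
    """Equation 15 evaluated with uncorrelated-cell (multinomial) moments."""
    mu20 = P1 * (1 - P1) / n_cells
    mu02 = P2 * (1 - P2) / n_cells
    mu11 = -P1 * P2 / n_cells
    # Joint cumulant kappa_22 of the mean of n_cells categorical cells
    # (derived by differentiating the log of Equation 16; an assumption
    # of this sketch rather than a formula from the text).
    kappa22 = -P1 * P2 * (1 - 2 * P1 - 2 * P2 + 6 * P1 * P2) / n_cells**3
    mu22 = kappa22 + mu20 * mu02 + 2 * mu11**2
    return (mu22 / n_meas
            + mu02 * mu20 / (n_meas * (n_meas - 1))
            - (n_meas - 2) / (n_meas * (n_meas - 1)) * mu11**2)

P1, P2, n_meas, n_cells = 0.3, 0.2, 5, 10    # illustrative values only
S = -P1 * P2 / n_cells                       # covariance without correlation
from_eq18 = var_of_covariance(S, P1, P2, n_meas, n_cells)
from_eq15 = var_of_covariance_from_moments(P1, P2, n_meas, n_cells)
```

When error bars are computed for real measurements, `S`, `P1`, and `P2` would be the measured covariance and probabilities, so that some of the effect of inter-cell correlation is retained.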

4.3 Comparing Theory with Measurement

Figure 2: A comparison of the predicted covariance of CIC probabilities (Equation 14) and the measured covariance in the same mock galaxy catalog as in Fig. 1. The top axes show the number of subvolumes into which the survey is split ($N_{m}$ in Equation 18), equal to the number of measurements of $\mathcal{P}(B_{1})$ and $\mathcal{P}(B_{2})$; the bottom axes show the number of cells in each subvolume ($N_{c}$ in Equation 18). Thick lines (with error bars) show the negative of the empirical covariance of the measurements of $\mathcal{P}(B_{1})$ and $\mathcal{P}(B_{2})$ (i.e., $N_{\mathrm{subvol}}$ measurements, each involving $N_{\mathrm{cells}}$ survey cells); thin lines show the negative of the predicted covariance from Equation 14, using volume-averaged cross-correlations determined as in the text. The thin dotted lines show the negative of the predicted covariance for the case of no cross-correlation. Dashed lines (thick and thin) indicate positive (rather than negative) values for the covariance. The bottom panels display the absolute ratio of the true covariances to those predicted assuming no cross-correlation. For clarity, the curves have received a slight horizontal offset from each other.

As in Section 3.3, we proceed to compare the predictions of Equation 14 with measured covariances; we use the same mock survey, with the same cell sizes ($1.95h^{-1}$ and $31.25h^{-1}$ Mpc), as in Section 3.3. Again we split each survey into $N_{m}$ subvolumes, with each subvolume consisting of $N_{c}$ survey cells.

In this case we choose two CIC bins $B_{1}$ and $B_{2}$ and empirically determine the probabilities $\mathcal{P}(B_{1})$ and $\mathcal{P}(B_{2})$ within each subvolume; the result is an ensemble of $N_{m}$ measured probabilities $P_{1i}$ in $B_{1}$ and a corresponding ensemble of $N_{m}$ measured probabilities $P_{2i}$ within $B_{2}$. The covariance $S_{P_{1}P_{2}}$ of these two sets of measurements is our estimate for $\mathrm{Cov}(\mathcal{P}(B_{1}),\mathcal{P}(B_{2}))$, and this value will depend on the number of cells $N_{c}$ used in the measurement of the probabilities. As before, we use unit bin widths with the $1.95h^{-1}$-Mpc cells and varying bin widths with the $31.25h^{-1}$-Mpc cells.

These empirical covariances appear as thick curves in Fig. 2. Since in these cases the covariances are typically negative, we plot $-\mathrm{Cov}$ with solid lines (using dashed lines to indicate positive covariances). Equation 18 provides the error bars for these measurements, where we use for $S_{P_{1}P_{2}}$ , $P_{1}$ , and $P_{2}$ the measured covariances and probabilities.

The thin curves in Fig. 2 show the predictions of Equation 14. For the volume-averaged correlation $\overline{\xi}_{\mathcal{S}_{1}\mathcal{S}_{2}}$ of the slice fields for the two bins, we first empirically calculate the two-point cross-correlation function $\xi_{\mathcal{S}_{1}\mathcal{S}_{2}}(r)$ of the two slice fields with FFT methods. We then, as in Section 3.3, sample the slice fields to obtain random pairs of positions within the survey volume, determine the cross-correlation of those positions from $\xi_{\mathcal{S}_{1}\mathcal{S}_{2}}(r)$ , and fold the result into a running average, terminating the sampling process once the variation in the running average has subpercent effect on the predicted $\sigma_{\mathcal{P}(B_{1})\mathcal{P}(B_{2})}$ .

Once again, the agreement between prediction and measurement is excellent, although the large error bars for the $31.25h^{-1}$-Mpc cells mean that most of those measurements yield only upper limits on the absolute value of the covariance. We also observe the same (expected) trends as in Fig. 1: the covariance of the CIC measurements decreases as the number of cells $N_{c}$ in each measurement increases; likewise the error bars on the covariance increase as the number $N_{m}$ of measurements decreases.

The thin dotted lines in Fig. 2 show the predicted covariance in the case of no cross-correlation, and the lower panels show the ratio between the cross-correlated and non-cross-correlated results. It is again clear that cross-correlations among survey cells in different probability bins can increase the covariance by multiple orders of magnitude.

5 Discussion

Figure 3: Covariance matrices for CIC probabilities in a mock galaxy survey (described in the text), using logarithmically spaced probability bins and measured (cross-) correlation functions.

Equations 7 and 14 allow us to calculate the covariance matrices for the mock surveys in Sections 3.3 and 4.3 (though the calculation requires empirical determination of the average correlations $\overline{\xi}_{\mathcal{S}}$ and cross-correlations $\overline{\xi}_{\mathcal{S}_{1}\mathcal{S}_{2}}$ ). We here perform one such calculation.

For our CIC bins, we start with 20 logarithmically-spaced bins, which we then combine as necessary to ensure that no bin contains fewer than three survey cells; we end up with 20 and 18 bins for the $1.95h^{-1}$-Mpc and $31.25h^{-1}$-Mpc cases, respectively. Since we now use the entire survey to calculate the (co-)variances, we set $N_{c}=N_{\mathrm{tot}}$, the total number of survey cells, in Equations 7 and 14. Fig. 3 displays the resulting covariance matrices.
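Assembling the full matrix from the variance and covariance predictions is a short array operation. The sketch below is an illustration with a hypothetical function name and made-up inputs. It assumes a symmetric matrix `xi_bar` whose diagonal holds the average slice-field correlations and whose off-diagonal entries hold the average cross-correlations, with the correlation terms added to the uncorrelated part in both cases.

```python
import numpy as np

def cic_covariance_matrix(P, xi_bar, n_cells):
    """Covariance matrix of measured CIC bin probabilities.

    Diagonal:     P_i (1 - P_i) / N_c + (N_c - 1)/N_c * xi_ii * P_i^2
    Off-diagonal: -P_i P_j / N_c      + (N_c - 1)/N_c * xi_ij * P_i P_j
    """
    P = np.asarray(P, dtype=float)
    xi_bar = np.asarray(xi_bar, dtype=float)
    outer = np.outer(P, P)
    uncorrelated = (np.diag(P) - outer) / n_cells
    correlated = (n_cells - 1) / n_cells * xi_bar * outer
    return uncorrelated + correlated

# Illustrative inputs (not measurements from the mock survey).
P = np.array([0.5, 0.3, 0.2])
n_cells = 16**3
xi_bar = np.array([[ 0.020, -0.010, -0.015],
                   [-0.010,  0.030,  0.005],
                   [-0.015,  0.005,  0.060]])

cov = cic_covariance_matrix(P, xi_bar, n_cells)
cov_uncorrelated = cic_covariance_matrix(P, np.zeros((3, 3)), n_cells)
```

With `xi_bar` set to zero the result is the usual multinomial covariance, whose rows sum to zero when the bins exhaust all possible counts; measured correlations modify both the diagonal and the off-diagonal structure.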

Figure 4: Left-hand panel: measured correlation function $\xi(r)$ for the slice fields of four CIC bins, compared to the galaxy correlation function, in our mock galaxy survey. Right-hand panel: measured cross-correlation function $\xi_{12}(r)$ between the $N=3$ slice field and three other slice fields, again compared to the galaxy correlation function. To first order, the two-point (cross-) correlation functions for the slice fields seem to differ from the galaxy correlation function by a simple multiplicative factor.

We note the following concerning these matrices. First, the two cases display significant structural differences. At $\sim 30h^{-1}$ Mpc (right-hand panel), the covariance matrix is approximately diagonal (albeit with significant noise), indicating that at these scales the CIC galaxy distribution is approaching a Gaussian limit. However, at $\sim 2h^{-1}$ Mpc (left-hand panel), the covariance is dominated by the $N=0$ cells, which occupy over 85 per cent of the survey volume. Furthermore, we have already noted (immediately following Equation 14) that, in the absence of cross-correlations, the covariance between probability bins will be negative; we see this behavior in the left-hand panel of Fig. 3 near $N=0$ . In this case, the negative covariance induced by mutual exclusivity is exacerbated by the fact that the empty cells aggregate into large voids and thus are negatively cross-correlated with $N>1$ (see the right-hand panel of Fig. 4). Other than these features, the covariance matrix at this smaller scale is quite smooth.

The second observation is that, in consequence, to assume diagonal covariance matrices is a reasonable approximation on scales $\gtrsim 30h^{-1}$ Mpc. This fact is also evident in the right-hand panel of Fig. 1, which makes it clear that at such scales the intercellular correlations have only a minor effect on the variance of $\mathcal{P}(B)$ . Repp and Szapudi (2020) are therefore justified in ignoring these correlations when fitting $\sigma_{8}$ and linear bias to CIC measurements from the Sloan Main Galaxy Sample. However, extraction of information from the galaxy PDF at smaller scales will need to take these correlations (and cross-correlations) into account.

The third observation is that the calculated covariance matrix is only as good as the measurement of the (cross-) correlation functions of the slice fields. It is this fact which is responsible for the noise in the right-hand panel of Fig. 3, since at these scales we have only $16^{3}$ cells in our survey (and thus many fewer in each probability bin). As a result, we expect the measurement of $\overline{\xi}$ to be quite noisy, as we in fact see. Even at small scales, the measurement of $\xi(r)$ , and thus of $\overline{\xi}$ , becomes quite noisy for the low-probability, high-density bins (Fig. 4). Thus, it would be helpful to have a theory for the slice-field correlation functions.

Such a theory seems to be feasible, given that the slice-field correlation and cross-correlation functions appear to differ from the galaxy correlation function by a multiplicative bias, at least to first order (Fig. 4). Indeed, the slice fields pick out specific density contours in a manner analogous to the way in which galaxies preferentially trace regions of high matter-density; thus one might expect a bias analogous to the Kaiser bias of galaxies (Kaiser 1984). Fitting these bias parameters to the measured correlations could therefore eliminate much of the noise from the measurements of the various $\overline{\xi}$ -values.
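If the slice-field correlation is modeled as a constant multiple of the galaxy correlation function, $\xi_{\mathcal{S}}(r)\approx A\,\xi_{g}(r)$, the factor $A$ has a closed-form weighted least-squares estimate. The Python sketch below assumes this one-parameter model and independent errors in each radial bin; the function name `fit_amplitude`, the power-law toy $\xi_{g}$, and the noise level are hypothetical choices made only to illustrate the fit.

```python
import numpy as np

def fit_amplitude(xi_slice, xi_gal, sigma=None):
    """Weighted least-squares amplitude A in the model xi_slice = A * xi_gal."""
    xi_slice = np.asarray(xi_slice, dtype=float)
    xi_gal = np.asarray(xi_gal, dtype=float)
    if sigma is None:
        w = np.ones_like(xi_gal)                       # unweighted fit
    else:
        w = 1.0 / np.asarray(sigma, dtype=float) ** 2  # inverse-variance weights
    # Minimizing sum w * (xi_slice - A * xi_gal)^2 over A gives:
    return np.sum(w * xi_slice * xi_gal) / np.sum(w * xi_gal ** 2)

# Toy data: a power-law "galaxy" correlation and a noisy, rescaled slice correlation.
rng = np.random.default_rng(0)
r = np.logspace(0, 1.5, 12)              # separations (illustrative units)
xi_gal = (r / 5.0) ** -1.8
true_amp = 2.5
sigma = 0.05 * xi_gal                    # assumed 5 per cent errors
xi_slice = true_amp * xi_gal + rng.normal(0.0, sigma)

amp = fit_amplitude(xi_slice, xi_gal, sigma)
print(f"fitted amplitude: {amp:.3f}")
```

Replacing a noisy measured $\xi_{\mathcal{S}}(r)$ by the fitted $A\,\xi_{g}(r)$ before volume-averaging is one way such a fit could reduce noise, provided that the multiplicative model holds over the scales used.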

Finally, we note that the correlations of the slice field represent a further generalization of the sliced correlation functions introduced by Neyrinck et al. (2018), which isolate the correlation of one particular density with the entire field. Thus they are similar to marked correlation functions and power spectra, which promise to enhance, e.g., the detection of neutrino signatures (Massara et al. 2020; Philcox et al. 2020) in galaxy surveys. Indeed, since slice-field correlations are two-point functions – whereas marked correlations are inherently higher-order (densities at two points plus spatially-varying marks) – it is possible that slice-field correlations will prove more tractable than marked correlations, without sacrificing information content.

6 Conclusions

Since counts-in-cells (CIC) probabilities contain significant information not included in the galaxy power spectrum, it is important to develop the theoretical machinery for fitting cosmological models to CIC measurements from galaxy surveys. One of the key ingredients in performing such fits is an understanding of the variance of CIC measurements within a given probability bin, as well as of the covariance of those measurements between different probability bins. We have here derived expressions for both of these quantities.

In order to derive these expressions, we first define the slice field $\mathcal{S}_{B}$ for a given bin $B$ , such that $\mathcal{S}_{B}=1$ if $N$ (the number of galaxies within a survey cell) falls within $B$ ; otherwise $\mathcal{S}_{B}=0$ . Using these fields we derive Equation 7 for the variance $\mathrm{Var}(\mathcal{P}(B))$ of measurements of a given probability bin, and we derive Equation 14 for the covariance $\mathrm{Cov}(\mathcal{P}(B_{1}),\mathcal{P}(B_{2}))$ of the measurements in two distinct probability bins. These expressions depend on the probabilities, the number of cells from which the probabilities are determined, and the volume-averaged (cross-) correlation of the corresponding slice fields. Conceptually, the degree of correlation affects the result by reducing the effective number of cells involved in the probability calculation.
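As a concrete illustration of the slice-field definition and of the "effective number of cells" interpretation, the Python sketch below builds $\mathcal{S}_{B}$ from an array of counts and evaluates the variance in the form of Equation 6. It assumes that the uniform pairwise correlation $\xi$ of Equation 6 can be replaced by a volume-averaged value `xi_bar`; the quantity returned by `effective_cells` is simply the number of independent cells that would give the same variance. The function names and toy numbers are not from the paper.

```python
import numpy as np

def slice_field(counts, lo, hi):
    """Return 1 where lo <= N <= hi (the cell lies in bin B), else 0."""
    counts = np.asarray(counts)
    return ((counts >= lo) & (counts <= hi)).astype(int)

def var_prob(P, n, xi_bar):
    """Variance of the measured P(B), written in the form of Equation 6."""
    return (1.0 - (1.0 - (n - 1) * xi_bar) * P) * P / n

def effective_cells(P, n, xi_bar):
    """Number of independent cells that would give the same variance."""
    return P * (1.0 - P) / var_prob(P, n, xi_bar)

counts = np.array([0, 0, 1, 3, 0, 2, 5, 0, 1, 0])   # toy counts-in-cells
S = slice_field(counts, 1, 2)                       # bin B = {1, 2}
P = S.mean()                                        # fraction of cells in B
n = S.size

for xi_bar in (0.0, 0.1, 0.5):
    print(xi_bar, var_prob(P, n, xi_bar), effective_cells(P, n, xi_bar))
```

With `xi_bar = 0` the effective number of cells equals `n`; any positive correlation lowers it, which is the sense in which correlated cells carry less independent information.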

To test Equations 7 and 14 we turn to a mock galaxy survey from the Millennium Simulation; by dividing the survey into multiple subvolumes we can measure the probability within each subvolume and thus empirically determine the variance/covariance of the $\mathcal{P}(B)$ measurements. Furthermore, a meaningful comparison to the predicted (co-)variances requires an estimate of the scatter in the measurement of those (co-)variances. Taking that scatter into account, we find that Equations 7 and 14 accurately predict the variance and covariance of CIC measurements (Figs. 1 and 2).
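The subvolume procedure can be sketched in Python on synthetic data. The example below uses uncorrelated Poisson counts as a stand-in for a galaxy catalog (an assumption made purely for illustration; it is not the Millennium mock survey), so the relevant prediction is the uncorrelated limit $P(1-P)/N_{c}$ of Equation 6. The grid size, the number of subvolumes, and the function name are likewise hypothetical.

```python
import numpy as np

rng = np.random.default_rng(42)
grid = rng.poisson(lam=1.0, size=(64, 64, 64))   # toy counts-in-cells, no clustering

def subvolume_probs(grid, n_sub, lo, hi):
    """Split a cubic grid into n_sub**3 subvolumes and measure P(B) in each."""
    side = grid.shape[0] // n_sub
    in_bin = (grid >= lo) & (grid <= hi)          # slice field for bin B
    # After reshaping, axes (1, 3, 5) index the cells inside each subvolume.
    blocks = in_bin.reshape(n_sub, side, n_sub, side, n_sub, side)
    return blocks.mean(axis=(1, 3, 5)).ravel()

probs = subvolume_probs(grid, n_sub=8, lo=0, hi=0)   # bin B = {N = 0}
N_m = probs.size                 # number of measurements
N_c = grid.size // N_m           # cells per measurement
P = probs.mean()
measured_var = probs.var(ddof=1)
predicted_var = P * (1.0 - P) / N_c   # Equation 6 with xi = 0

print(N_m, N_c, measured_var, predicted_var)
```

For a clustered catalog the same measurement would be compared against the full expression including the volume-averaged slice-field correlation, and the scatter of `measured_var` itself would need to be estimated before judging the agreement.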

We further conclude that at large scales ( $\sim 30h^{-1}$ Mpc) the correlation between neighboring cells has a negligible impact on the variance, whereas at small scales ( $\sim 2h^{-1}$ Mpc) the correlations can increase the variance by orders of magnitude.

In summary, two of the tools necessary for wider cosmological utilization of counts-in-cells are the ability to determine the variance and covariance of the CIC probabilities. These tools are now available.

Acknowledgements

The Millennium Simulation data bases used in this work and the web application providing online access to them were constructed as part of the activities of the German Astrophysical Virtual Observatory (GAVO). This work was supported by NASA Headquarters under the NASA Earth and Space Science Fellowship program – “Grant 80NSSC18K1081” – and AR gratefully acknowledges the support. IS acknowledges support from National Science Foundation (NSF) award 1616974.

Data Availability

The data underlying this article are available in the Virgo-Millennium Database (maintained by the German Astrophysical Virtual Observatory) at http://gavo.mpa-garching.mpg.de/Millennium.

Appendix A Derivations

A.1 Proof of Equation 6

Equation 6 claims that the variance of $\mathcal{P}(B)=(1/n)\sum s_{i}$ is

\mathrm{Var}\left(\frac{1}{n}\sum_{i=1}^{n}s_{i}\right)=\frac{\left(1-\left(1-(n-1)\xi\right)P\right)P}{n}.

(6)

Recall that we are assuming the correlation between any two of the slice-field values $s_{1},s_{2},\ldots,s_{n}$ is a fixed quantity $\xi$ . Thus if $i\neq j$ , it is the case that $\langle s_{i}s_{j}\rangle=(1+\xi)P^{2}$ , where $P=\mathcal{P}(B)$ is the probability that $\mathcal{S}=1$ in any given cell. We also recall that $\xi$ is the correlation function of the slice field $\mathcal{S}$ , not that of the counts-in-cells $N$ .

The proof of Equation 6 proceeds by induction. For $n=1$ , the right-hand side of Equation 6 is simply $P-P^{2}=\langle\mathcal{S}^{2}\rangle-\langle\mathcal{S}\rangle^{2}$ (by Equation 2), which is of course the variance of $\mathcal{S}$ . On the other hand, for an arbitrary $n$ , we have

	$\displaystyle\mathrm{Var}\left(\frac{1}{n+1}\sum_{i=1}^{n+1}s_{i}\right)=\frac{1}{(n+1)^{2}}\left\langle\left(\sum_{i=1}^{n+1}s_{i}-(n+1)P\right)^{2}\right\rangle\hfill$		(19)
	$\displaystyle\quad=\frac{1}{(n+1)^{2}}\left\langle\left(\left(\sum_{i=1}^{n}s_{i}-nP\right)+\left(s_{n+1}-P\right)\right)^{2}\right\rangle$		(20)
	$\displaystyle\begin{split}\quad=\frac{1}{(n+1)^{2}}&\left\{n^{2}\left\langle\left(\frac{1}{n}\sum_{i=1}^{n}s_{i}-P\right)^{2}\right\rangle\right.\\ &+2\left\langle\left(\sum_{i=1}^{n}s_{i}-nP\right)\left(s_{n+1}-P\right)\right\rangle\\ &+\left\langle\left(s_{n+1}-P\right)^{2}\right\rangle\left.\rule{0.0pt}{15.0pt}\right\}\end{split}$		(21)

Equation 21 contains three expectation values. If Equation 6 holds for $n$ , the first expectation value equals $\left(1-\left(1-(n-1)\xi\right)P\right)P/n$ . The third is simply the variance of $\mathcal{S}$ , or $P-P^{2}$ . And for the second, we have

	$\displaystyle\begin{split}&\left\langle\left(\sum_{i=1}^{n}s_{i}-nP\right)\left(s_{n+1}-P\right)\right\rangle\\ &\quad=\sum_{i=1}^{n}\left\langle s_{i}s_{n+1}\right\rangle-P\sum_{i=1}^{n}\langle s_{i}\rangle-nP\langle s_{n+1}\rangle+nP^{2}\end{split}$		(22)
	$\displaystyle\quad=nP^{2}(1+\xi)-nP^{2}=nP^{2}\xi.$		(23)

Substituting these expressions for the expectation values into Equation 21 and simplifying, we obtain

\mathrm{Var}\left(\frac{1}{n+1}\sum_{i=1}^{n+1}s_{i}\right)=\frac{\left(1-\left(1-n\xi\right)P\right)P}{n+1},

(24)

completing the induction on Equation 6.
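The closed form can also be checked numerically without induction, by summing the full covariance matrix of the $s_{i}$. The Python sketch below is such a check under the same assumption of a single pairwise correlation $\xi$; the function names and the test values of $P$, $n$, and $\xi$ are arbitrary. The check is purely algebraic, so it does not verify that a given combination of $P$, $n$, and $\xi$ corresponds to a realizable set of binary variables.

```python
import numpy as np

def var_direct(P, n, xi):
    """Variance of (1/n) * sum(s_i) from the full n x n covariance matrix."""
    cov = np.full((n, n), xi * P**2)     # <s_i s_j> - P^2 for i != j
    np.fill_diagonal(cov, P - P**2)      # variance of a single s_i
    return cov.sum() / n**2

def var_eq6(P, n, xi):
    """Closed form of Equation 6."""
    return (1.0 - (1.0 - (n - 1) * xi) * P) * P / n

for n in (1, 2, 10, 500):
    print(n, var_direct(0.15, n, 0.02), var_eq6(0.15, n, 0.02))
```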

A.2 Moments of the Distribution of Measurements of $\mathcal{P}(B)$

If $B$ is a bin in which we measure the probability $\mathcal{P}(B)$ , let us suppose that $\{P_{i}\}$ is a set of $N_{m}$ measurements of $\mathcal{P}(B)$ . Our goal is to determine the fourth central moment $\mu_{4}$ of the distribution of the measurements $\{P_{i}\}$ of $\mathcal{P}(B)$ . We begin with Equation 10, which gives the moment-generating function for this distribution:

M_{\mathcal{P}}(t)=\frac{1}{N_{c}}\left(1+\left(e^{t}-1\right)P\right)^{N_{c}},

(10)

from which we can determine the moments of the distribution of measured $\mathcal{P}(B)$ values.

Using $P$ as a shorthand for $\mathcal{P}(B)$ , we obtain the following moments:

	$\displaystyle\left\langle P\right\rangle=P$		(25)
	$\displaystyle\left\langle P^{2}\right\rangle=\frac{P}{N_{c}}+\frac{(N_{c}-1)P^{2}}{N_{c}}$		(26)
	$\displaystyle\left\langle P^{3}\right\rangle=\frac{P}{N_{c}^{2}}+\frac{3(N_{c}-1)P^{2}}{N_{c}^{2}}+\frac{(N_{c}-2)(N_{c}-1)P^{3}}{N_{c}^{2}}$		(27)
	$\displaystyle\begin{split}\left\langle P^{4}\right\rangle=\frac{P}{N_{c}^{3}}+\frac{7(N_{c}-1)P^{2}}{N_{c}^{3}}+\frac{6(N_{c}-2)(N_{c}-1)P^{3}}{N_{c}^{3}}\\ +\frac{(N_{c}-3)(N_{c}-2)(N_{c}-1)P^{4}}{N_{c}^{3}}.\end{split}$		(28)

Thus the fourth central moment of the distribution of measured values for $\mathcal{P}(B)$ is

$\displaystyle\mu_{4}$	$\displaystyle=\left\langle P^{4}\right\rangle-4\left\langle P^{3}\right\rangle\left\langle P\right\rangle+6\left\langle P^{2}\right\rangle\left\langle P\right\rangle^{2}-3\left\langle P\right\rangle^{4}$	(29)
	$\displaystyle=\frac{3P^{2}}{N_{c}^{2}}(P-1)^{2}+\frac{P(1-P)}{N_{c}^{3}}(1-6P+6P^{2})$	(30)
	$\displaystyle=3\left(s_{P}^{2}\right)^{2}+\frac{s_{P}^{2}}{N_{c}^{2}}(1-6P+6P^{2}),$	(31)

where the final step follows from $s_{P}^{2}=P(1-P)/N_{c}$ (Equation 4).
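These moments can be verified with exact rational arithmetic. The Python sketch below assumes that a measured probability is $m/N_{c}$, where $m$ (the number of cells falling in the bin) is binomially distributed, i.e. that the cells are independent; the function names and the particular values of $P$ and $N_{c}$ are arbitrary choices for the check.

```python
from fractions import Fraction
from math import comb

def raw_moment(k, P, N_c):
    """<P^k> for the measured fraction m / N_c, with m ~ Binomial(N_c, P)."""
    return sum(comb(N_c, m) * P**m * (1 - P)**(N_c - m) * Fraction(m, N_c)**k
               for m in range(N_c + 1))

def mu4_closed(P, N_c):
    """Fourth central moment in the form of Equation 30."""
    return (3 * P**2 * (P - 1)**2 / N_c**2
            + P * (1 - P) * (1 - 6*P + 6*P**2) / N_c**3)

P, N_c = Fraction(1, 5), 12
m1, m2, m3, m4 = (raw_moment(k, P, N_c) for k in (1, 2, 3, 4))

# Central moment assembled from the raw moments, as in Equation 29.
mu4 = m4 - 4*m3*m1 + 6*m2*m1**2 - 3*m1**4

print(mu4, mu4_closed(P, N_c))
```

Because `Fraction` arithmetic is exact, equality of the two printed values is a term-by-term confirmation rather than a floating-point coincidence.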

A.3 Proof of Equation 13

Equation 13 claims that the covariance of the probabilities $P_{1}$ and $P_{2}$ is given by

\mathrm{Cov}\left(\frac{1}{n}\sum_{i=1}^{n}s_{1i}\,,\frac{1}{n}\sum_{j=1}^{n}s_{2j}\right)=\frac{-P_{1}P_{2}}{n}\left(1-(n-1)\xi_{12}\right).

(13)

As in Appendix A.1, we prove the claim using induction.

To establish Equation 13 for $n=1$ , we first note that, because the slice fields are disjoint (Equation 3), it cannot be the case that $\mathcal{S}_{1}=\mathcal{S}_{2}=1$ in this single survey cell. The possibilities therefore are $\mathcal{S}_{1}=1,\mathcal{S}_{2}=0$ (with probability $P_{1}$ ), $\mathcal{S}_{1}=0,\mathcal{S}_{2}=1$ (with probability $P_{2}$ ), and $\mathcal{S}_{1}=\mathcal{S}_{2}=0$ (with probability $1-P_{1}-P_{2}$ ). Hence the left-hand side of Equation 13 is

$\displaystyle\mathrm{Cov}(\mathcal{S}_{1},\mathcal{S}_{2})$	$\displaystyle=\left\langle(\mathcal{S}_{1}-P_{1})(\mathcal{S}_{2}-P_{2})\right\rangle$	(32)
$\displaystyle\begin{split}&=(1-P_{1})(-P_{2})P_{1}+(-P_{1})(1-P_{2})P_{2}\\ &\qquad+(P_{1}P_{2})(1-P_{1}-P_{2})\end{split}$		(33)
	$\displaystyle=-P_{1}P_{2},$	(34)

thus establishing Equation 13 for $n=1$ .

Now assuming the equation holds for a given $n$ , we can write

$\displaystyle\mathrm{Cov}$	$\displaystyle\left(\frac{1}{n+1}\sum_{i=1}^{n+1}s_{1i}\,,\frac{1}{n+1}\sum_{j=1}^{n+1}s_{2j}\right)\hskip 56.9055pt$	(35)
$\displaystyle\begin{split}&=\frac{1}{(n+1)^{2}}\left\langle\left(\sum_{i=1}^{n+1}s_{1i}-(n+1)P_{1}\right)\right.\\ &\qquad\qquad\qquad\qquad\left.\times\left(\sum_{j=1}^{n+1}s_{2j}-(n+1)P_{2}\right)\right\rangle\end{split}$		(36)
$\displaystyle\begin{split}&=\frac{1}{(n+1)^{2}}\left\{\left\langle\left(\sum_{i=1}^{n}s_{1i}-nP_{1}\right)\left(\sum_{j=1}^{n}s_{2j}-nP_{2}\right)\right\rangle\right.\\ &\qquad\qquad+\left\langle\left(\sum_{i=1}^{n}s_{1i}-nP_{1}\right)\left(s_{2(n+1)}-P_{2}\right)\right\rangle\\ &\qquad\qquad+\left\langle\left(s_{1(n+1)}-P_{1}\right)\left(\sum_{j=1}^{n}s_{2j}-nP_{2}\right)\right\rangle\\ &\qquad\qquad+\left\langle\left(s_{1(n+1)}-P_{1}\right)\left(s_{2(n+1)}-P_{2}\right)\right\rangle\left.\rule{0.0pt}{15.0pt}\right\}\end{split}$		(37)

Equation 37 contains four expectation values, which we now evaluate. By our inductive hypothesis, the first is $-nP_{1}P_{2}(1-(n-1)\xi_{12})$ . The second becomes

	$\displaystyle\sum_{i=1}^{n}$	$\displaystyle\langle s_{1i}s_{2(n+1)}\rangle-P_{2}\sum_{i=1}^{n}\langle s_{1i}\rangle-nP_{1}\langle s_{2(n+1)}\rangle+nP_{1}P_{2}$		(38)
		$\displaystyle=nP_{1}P_{2}\xi_{12}$		(39)

by recalling that for $i\neq j$ , $\langle s_{1i}s_{2j}\rangle=P_{1}P_{2}(1+\xi_{12})$ and that $\langle s_{1i}\rangle=P_{1}$ , etc. The third expectation value becomes the same quantity by symmetry. Finally, we recall that $\langle s_{1(n+1)}s_{2(n+1)}\rangle=0$ because the slice fields are disjoint, and thus the fourth expectation value becomes $-P_{1}P_{2}$ .

Inserting these results into Equation 37 and simplifying, we obtain

\mathrm{Cov}\left(\frac{1}{n+1}\sum_{i=1}^{n+1}s_{1i}\,,\frac{1}{n+1}\sum_{j=1}^{n+1}s_{2j}\right)=\frac{-P_{1}P_{2}}{n+1}(1-n\xi_{12}),

(40)

thus establishing Equation 13 for all natural numbers $n$ .
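The covariance can also be assembled directly from the two ingredients used in the proof: the same-cell term $-P_{1}P_{2}$ (the bins are mutually exclusive) and the different-cell term $\langle s_{1i}s_{2j}\rangle-P_{1}P_{2}=P_{1}P_{2}\xi_{12}$. The Python sketch below sums these over all cell pairs and compares the result with the closed form to which that sum simplifies. The function names and numerical values are arbitrary, and, as in the proof, a single cross-correlation $\xi_{12}$ is assumed for every pair of distinct cells.

```python
import numpy as np

def cov_direct(P1, P2, n, xi12):
    """Covariance of the two bin fractions, summed over all n x n cell pairs."""
    pair_cov = np.full((n, n), P1 * P2 * xi12)   # different cells
    np.fill_diagonal(pair_cov, -P1 * P2)         # same cell: bins are disjoint
    return pair_cov.sum() / n**2

def cov_closed(P1, P2, n, xi12):
    """Closed form obtained by simplifying the pair sum."""
    return -P1 * P2 / n * (1.0 - (n - 1) * xi12)

for n in (1, 2, 10, 500):
    print(n, cov_direct(0.2, 0.05, n, -0.01), cov_closed(0.2, 0.05, n, -0.01))
```

In this form a negative cross-correlation makes the covariance more negative than the uncorrelated value $-P_{1}P_{2}/n$, which matches the behavior described for the empty cells in the Discussion.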

A.4 Moments of the Joint Distribution of Measurements of $\mathcal{P}(B_{1},B_{2})$

If $B_{1},B_{2}$ are disjoint CIC bins, let us consider two sets of measurements $\{P_{1i}\},\{P_{2i}\}$ of the probabilities $\mathcal{P}(B_{1})$ and $\mathcal{P}(B_{2})$ , respectively, each consisting of $N_{m}$ measurements. We want to determine the central moments of the joint distribution of the measurements $\{P_{1i}\}$ of $\mathcal{P}(B_{1})$ and $\{P_{2i}\}$ of $\mathcal{P}(B_{2})$ . We start with Equation 17, which (using $P_{1}$ and $P_{2}$ as shorthand for $\mathcal{P}(B_{1})$ and $\mathcal{P}(B_{2})$ , respectively) gives the joint moment-generating function for this distribution:

M_{\mathcal{P}_{1}\mathcal{P}_{2}}(t_{1},t_{2})=\frac{1}{N_{c}}\left(1+\left(e^{t_{1}}-1\right)P_{1}+\left(e^{t_{2}}-1\right)P_{2}\right)^{N_{c}}.