On the correlations of galaxy peculiar velocities and their covariance
Abstract
Measurements of the peculiar velocities of large samples of galaxies enable new tests of the standard cosmological model, including determination of the growth rate of cosmic structure that encodes gravitational physics. With the size of such samples now approaching hundreds of thousands of galaxies, complex statistical analysis techniques and models are required to extract cosmological information. In this paper we summarise how correlation functions between galaxy velocities, and with the surrounding large-scale structure, may be utilised to test cosmological models. We present new determinations of the analytical covariance between such correlation functions, which may be useful for cosmological likelihood analyses. The statistical model we use to determine these covariances includes the sample selection functions, observational noise, curved-sky effects and redshift-space distortions. By comparing these covariance determinations with corresponding estimates from large suites of cosmological simulations, we demonstrate that these analytical models recover the key features of the covariance between different statistics and separations, and produce similar measurements of the growth rate of structure.
keywords:
cosmology: large-scale structure of Universe – cosmology: theory – methods: statistical1 Introduction
The relative motion of galaxies within the cosmic web of large-scale structure is a powerful probe of cosmological models. The “peculiar velocities” of individual galaxies can be measured by combining their redshifts with an independent distance indicator such as the Tully-Fisher relation, Fundamental Plane, or supernova standard candle (Strauss & Willick, 1995). The statistics of peculiar velocities, and their correlation with the surrounding galaxy distribution, provide important tests of the growth rate of cosmic structure and the physics of density perturbations on the largest scales (e.g., Burkey & Taylor, 2004; Johnson et al., 2014; Koda et al., 2014; Carrick et al., 2015; Huterer et al., 2017; Howlett et al., 2017a; Nusser, 2017; Dupuy et al., 2019; Adams & Blake, 2020; Said et al., 2020; Boruah et al., 2020; Lai et al., 2023; Turner et al., 2023; Courtois et al., 2023b).
Direct peculiar velocity measurements are already available for tens of thousands of galaxies through samples such as the 6-degree Field Galaxy Survey (Springob et al., 2014), the Sloan Digital Sky Survey (Saulder et al., 2013; Howlett et al., 2022), and compendiums such as the Cosmic Flows catalogue (Tully et al., 2016, 2023). These samples will be expanded to hundreds of thousands of galaxies through current and future observational projects such as the Dark Energy Survey Instrument (Saulder et al., 2023), the 4-metre Multi-Object Spectroscopic Telescope (4MOST) Hemisphere Survey (Taylor et al., 2023), the Vera Rubin Observatory (Howlett et al., 2017b) and the Australian Square Kilometre Array Pathfinder WALLABY survey (Courtois et al., 2023a). This abundance of new data allows application of the same “statistical machinery” to peculiar velocity correlations as has been developed in recent years for the analysis of large galaxy redshift surveys, in which summary 2-point statistics are estimated and compared to theoretical models using large covariance matrices.
In this paper we review the suite of 2-point correlation functions that are available for quantifying peculiar velocity statistics in configuration space. A principal challenge for statistical analysis of 2-point functions, on which we focus in this paper, is determining the covariance matrix describing the inter-correlations between the different summary statistics and separations, which is used in cosmological Bayesian likelihood analysis. Different methods are available for evaluating the covariance matrix of such statistics. A straight-forward approach is to generate a large ensemble of simulated datasets and use the fluctuations over these simulations to estimate the resulting covariance. However, this approach may be limited by the extent to which the simulations accurately represent the dataset, and the large number of simulations that might be required to minimise noise in the covariance matrix as the size of data vectors grow (Hartlap et al., 2007). A second potential approach is to estimate the covariance internally from the dataset, using jack-knife or bootstrap techniques (e.g., Norberg et al., 2009; Philcox et al., 2020; Favole et al., 2021; Mohammad & Percival, 2022). Such estimates benefit from not requiring any additional inputs, but may be limited by the capacity to create “equivalent” pseudo-independent sub-samples with which to estimate global statistics. A third approach, which we test in this study, is to calculate covariance matrices analytically by propagating errors in the context of a statistical model of the dataset. This approach may be limited by the accuracy of this statistical model, which may not necessarily reflect the detailed properties of the dataset. However, analytical covariances are noise-free, allow likelihood analyses to proceed where large numbers of accurate mocks are unavailable, and have proved useful for combined-probe cosmological studies where the size of the data vectors are large, such as joint analyses of weak lensing and large-scale structure statistics (e.g., Krause & Eifler, 2017; Friedrich et al., 2021; Joachimi et al., 2021). Analytical covariances, and hybrid approaches where these covariances are calibrated using a small number of mock catalogues or jack-knife samples, have also been widely explored in large-scale structure analyses (e.g., Feldman et al., 1994; Grieb et al., 2016; O’Connell et al., 2016; Blake et al., 2018; Li et al., 2019; O’Connell & Eisenstein, 2019; Wadekar et al., 2020; Philcox et al., 2020; Hou et al., 2022).
The goal of this paper is to perform new calculations of the analytical covariance between correlation functions relevant to studies of both simulated and observational peculiar velocity datasets, and to assess the accuracy of these covariances through direct comparison with an ensemble of mock catalogues. In Sec.2 we review the various correlation functions which may be used in peculiar velocity studies. In Sec.3 we introduce the statistical model we adopt for analysing fluctuations in density and velocity samples, and outline the calculation of the analytical covariance by applying this model to correlation function estimators. In Sec.4 we compare the analytical covariance to a matched mock covariance, and study their relative performance in fits for the growth rate of structure. We conclude our study in Sec.5.
2 Velocity correlation statistics
In this section we introduce the different types of velocity correlation function we will study in this paper, and their relation to the underlying matter power spectrum and growth rate of structure. We will consider correlations involving both 3D velocity vectors (which may be measured from simulations of cosmic structure and used to test theoretical models), and line-of-sight velocities (which may be measured from observational data using redshift-independent distances).
2.1 3D density and velocity correlations
The rate of change of a matter density perturbation with time may be related to the surrounding velocity field using the continuity equation, in terms of the growth rate of cosmic structure , where is the cosmic scale factor. The components of the 3D velocity field, , at vector position relative to an origin, are given in “linear theory” by,
(1) |
where and are the Fourier amplitudes of these fields in terms of wavevector , and is the Hubble parameter. Eq.1 assumes linear perturbation theory and irrotational velocity fields (for a derivation, see for example Adams & Blake (2017), Appendix C). The Fourier amplitudes of matter fluctuations are related to the underlying matter power spectrum, , which may be predicted by cosmological theories, where is the volume of the normalising Fourier cuboid and the angled brackets indicate an average over many statistical realisations. We can trace the matter overdensity field through the locations of discrete galaxies. In a linear bias model, the galaxy overdensity at a location is , where is the galaxy bias factor.
Information about the matter power spectrum and growth rate can be gleaned by measuring the auto-correlation and cross-correlation functions of the galaxy overdensity and velocity fields. We define the galaxy auto-correlation function, , as a function of vector separation , in terms of the galaxy overdensity field as,
(2) |
where its relation to the galaxy power spectrum, , follows by expressing in terms of its Fourier components. Neglecting redshift-space distortions, does not depend on the direction of , through isotropy. Given the vector nature of the velocity field, we define the velocity correlation tensor as,
(3) |
where the second equality in terms of the velocity power spectrum, follows by substituting in Eq.1. We also define the cross-correlation between the galaxy overdensity and each velocity component, which we write as ,
(4) |
where we define the galaxy-velocity cross-power spectrum, .
We note an important practical detail that measurements of these correlation functions often trace velocities using galaxies, which are preferentially located in overdense regions, hence tracing the “momentum” or mass-weighted correlation function weighted by (Park, 2000; Howlett, 2019). This effect introduces higher-order corrections which we neglect in our current study, in which we choose to focus on large scales.
2.2 Forming isotropic velocity correlations
In order to compare correlation function measurements with theoretical predictions, it is convenient to use summary statistics which are “isotropic” – that is, which depend only on the magnitude of separation , not direction . In that case, we may conveniently angle-average measurements over different directions of the separation vector. Neglecting redshift-space distortions, which we will consider in Sec.2.4, the galaxy auto-correlation is an isotropic function of , which we can see by evaluating Eq.2 as,
(5) |
where we have used the relation for the spherical Bessel function ,
(6) |
However, and are not isotropic functions, and are therefore not useful as a comparison point between measurements and theory.
The velocity correlation tensor can be expressed in terms of two isotropic functions and as (Gorski, 1988),
(7) |
where is the Kronecker delta function, and the functions and are given by,
(8) |
and,
(9) |
We have not found a convenient derivation of Eq.7 in the literature, so we provide one in Appendix A.
Using these results, we can readily derive two isotropic summary statistics from the 3D velocity field (Gorski, 1988). First, we may define the correlation function between the inward velocities between two points, ,
(10) |
where the final equality is derived by substituting in Eq.7. (We emphasise that throughout this paper, represents the approach velocity of one galaxy towards another along separation vector , not the radial velocity with respect to an observer that we introduce below.) Second, we can form the isotropic total velocity correlation function,
(11) |
For the galaxy-velocity cross-correlation function, we can define an isotropic correlation between galaxy position and inward velocity (Fisher, 1995; Okumura et al., 2014; Adams & Blake, 2017),
(12) |
where we have used the relation for the spherical Bessel function ,
(13) |
2.3 Line-of-sight velocity correlations
When analysing observational data, we measure the line-of-sight velocity , where is the vector position of the velocity tracer with respect to the observer. Since the projection of the 3D velocity to the line-of-sight velocity depends on the position relative to the observer, the theoretical expressions now depend on the distribution of the velocity tracers through space. We represent this by writing the correlation functions as estimators over the survey volume (this was implicitly present in our definitions of the 3D correlations in Sec.2.1 and Sec.2.2, but not salient because the statistics were independent of position ).
We first consider the line-of-sight velocity auto-correlation function, which we estimate as,
(14) |
where we write in this section. Substituting in Eq.7, the expectation value of the line-of-sight velocity correlation function is,
(15) |
where represents the average of the cosine of the angle subtended by the position vectors and at the observer, and is the average of the product of the cosines between the position and separation vectors. (We refer to Fig.1 of Turner et al. (2021) for a useful visualisation of the geometry.) Note that this expression encodes the exact curved-sky effects which arise from the variation of the line-of-sight direction with position. In a flat-sky approximation, and .
In order to gain insight into Eq.15 and the nature of the and functions, it is useful to consider the correlation in line-of-sight velocities of a single pair of velocity tracers in the flat-sky approximation. If the separation vector of that pair is perpendicular to the line-of-sight, then and . If the separation vector is parallel to the line-of-sight, then and . This analysis shows that the radial velocity auto-correlation function is dependent on the direction of the separation vector, and therefore angle-averaging may lose information.
In order to extract more information from the line-of-sight velocity correlations, we may utilise the and statistics defined by Gorski et al. (1989), which apply pairwise weights for each statistic depending on the angles subtended with respect to the observer and separation vectors,
(16) |
and,
(17) |
where and are normalisation constants we identify below. Using Eq.7, the expectation value of Eq.16 is given by,
(18) |
Defining for each separation bin the normalisation, , and the geometry factor, , we find,
(19) |
Likewise for ,
(20) |
Defining the normalisation, , and the geometry factor, , we find,
(21) |
In the flat-sky approximation, and are linear combinations of the first two multipoles of the expansion of about the angle to the line-of-sight (, ), which encode all the angular information.
We now consider the cross-correlation function between the line-of-sight velocity and galaxy overdensity. Since its monopole is zero, we consider the dipole contribution, which Turner et al. (2021) defined as,
(22) |
Using Eq.12, this function has expectation value,
(23) |
Hence, with the identification , we have .
2.4 Redshift-space distortions
The isotropy of correlation functions is broken in observational datasets by the effects of redshift-space distortions (RSD), through which the apparent positions of galaxies, as determined by their redshifts, are shifted according to their line-of-sight velocities. The presence of RSD requires us to move to an isotropic basis of correlation function multipoles. RSD has no effect on the velocity auto-correlation function to first order (Dam et al., 2021), hence we only consider redshift-space modifications to the cross-correlation and the galaxy auto-correlation functions. The multipole estimators are, for the galaxy auto-correlation function,
(24) |
and for the cross-correlation function,
(25) |
(where we can identify ).
To develop the relations between these correlation function multipoles and the underlying power spectra, it is convenient to express the dependence of the power spectra on the angle with respect to the line-of-sight using multipoles, such that the power spectra become position-dependent (e.g, Blake, 2019),
(26) |
The expectation values of these estimators are given by, for the galaxy auto-correlation function multipoles (e.g., Satpathy et al., 2017),
(27) |
and for the cross-correlation multipoles (Adams & Blake, 2020; Lai et al., 2023; Turner et al., 2023),
(28) |
with coefficients,
(29) |
where the matrix in Eq.29 is a Wigner 3j-symbol. For reference we give the linear-theory terms entering these predictions (Adams & Blake, 2020):
(30) |
In our study we focus on the subset of correlation function multipoles which can currently be measured with high signal-to-noise: , and .
In the following analysis, we hence consider two groups of correlation functions. First, “simulation” correlations, in which we utilise 3D velocities and analyse the three correlation functions . Second, “observation” correlations, in which we utilise line-of-sight velocities and redshift-space multipoles, and analyse the five correlation functions .
3 Analytical covariance calculation
In this section we model the analytical covariance between the correlation statistics described in Sec.2. In general, the covariance of two correlation functions and evaluated at separations and is defined by,
(31) |
Since individual correlation statistics are a 2-point function, Eq.31 demonstrates that covariances are a 4-point function. However, we can use approximations to reduce this 4-point function to evaluations over 2-point functions. We illustrate how survey selection functions and noise may be included in correlation models, and perform the covariance evaluation for an example case. We note that a full listing of our analytical covariance results is provided in Appendix B.
3.1 Including selection functions and measurement noise
In order to evaluate the covariance of a correlation measurement in a realistic scenario, we need to adapt our correlation models to include the effects of the survey selection function, measurement noise, and any position-dependent optimal weights that are applied. We model two separate discrete datasets: a “density” sample which is used to trace the galaxy overdensity , and a “velocity” sample which is used to probe the 3D or radial velocities. We assume that:
-
•
The density and velocity samples may have different galaxy number densities, which vary with location as and .
-
•
Discrete density and velocity tracers are assigned position-dependent weights and .
-
•
Measurement of the galaxy overdensity and velocity fields contains Poisson noise from discrete objects.
-
•
Tracer velocities contain measurement noise, which may be applied either to all velocity components (for a simulation), or in the line-of-sight direction (for observations).
In this context, the correlation between the (weighted, noisy) galaxy overdensity field at two positions and becomes (Feldman et al., 1994),
(32) |
where is the selection function of the density tracers. We model the uncertainty in the galaxy overdensity using Poisson noise, for which (Feldman et al., 1994; Blake, 2019). We note here that through the inclusion of the normalising volume , and are dimensionless functions. The equivalent expression for the correlation of two velocity components and is,
(33) |
Here, is the selection function of the velocity tracers, and is the velocity noise at a given location as a function of the individual-tracer velocity error , where this noise is applied in the radial direction from an observer. This last statement can be established by considering,
(34) |
where the final step uses . If velocity noise is instead applied independently to each velocity component (for example, the non-linear contribution to the velocities in a simulation), the final term in Eq.33 would be . Finally, the expression for the correlation between the galaxy overdensity field and a component of velocity is,
(35) |
In this case there is no noise term, because the noise contribution to and is uncorrelated.
To calculate the analytical covariance we will apply several approximations, stated as follows:
-
•
We will assume that the galaxy overdensity and velocity fields are Gaussian random fields.
-
•
We will neglect higher-order correlations; for example, the fact that velocity tracers preferentially sample overdense locations.
-
•
We will neglect the variation of the survey selection function and noise fields on the scale of the separation vector.
-
•
When correlating line-of-sight velocities we will assume the “local plane-parallel” approximation, that the position vectors of points on the scale of the separation are parallel (where this direction varies for different pairs).
These approximations become increasingly inaccurate for small separations, or when the separations are large compared to the radial distances. However, we will show in the remainder of this paper that these approximations produce analytical covariances with a significant range of validity.
3.2 Estimators in terms of fields
Applying these approximations, we restate our estimators of the correlation function statistics in terms of the noisy, weighted fields, in the form that we will use to compute the analytical covariances. Starting with the galaxy correlation function we have,
(36) |
where is a normalisation constant. Using Eq.32, we can confirm that if , where we have applied the approximation that the selection function does not vary on the separation scale. We can average the above estimator over directions by writing,
(37) |
Generalising to the galaxy correlation function multipoles,
(38) |
For the inward velocity auto-correlation function we can similarly write,
(39) |
where using the same approximation as above, . For the cross-correlation between inward velocity and galaxy position,
(40) |
where . We now turn to the correlations involving line-of-sight velocities. The estimators for the , and statistics have already been provided as Eq.16, Eq.17 and Eq.22 above where, for evaluating the covariance, we additionally apply the local plane-parallel approximation . For the cross-correlation multipole statistics we use,
(41) |
These estimators form the basis of our analytical covariance evaluations.
3.3 Example evaluation
We illustrate the calculation of the analytical covariance using the example of . First, we express the covariance in terms of moments of the fields, using the estimator defined in Eq.16, noting in the following that is now a general position vector and does not satisfy :
(42) |
where the last step uses Wick’s theorem for a Gaussian random field. We now substitute in the correlation model from Eq.33, where we will just show the calculation of the first term for clarity, noting that the second term produces an identical result where is replaced by :
(43) |
where we have again applied the approximation that the selection function and noise do not vary on the scale of the separations, i.e. and . Now, we convert the evaluation to Fourier space, by substituting in the expression for the velocity correlation tensor from Eq.3 and for the delta function, :
(44) |
where in the first term, we make the approximation that the power spectrum is “slowly varying” and can be taken outside the integral over , producing a delta function . We note that in Eq.44, and generally in our covariance expressions, we group the variables in combination where volume units cancel out. We also combine the sums in Eq.44 into dot products ( and ), such that,
(45) |
This expression can be written more compactly as,
(46) |
Finally, we angle-average the measurements,
(47) |
where we have introduced the averages of the survey selection and noise functions over the survey window,
(48) |
We do not reproduce the calculation of the other covariance elements, but rather state the final results in Appendix B.
3.4 Scaling parameters
The accuracy of analytical covariance matrices may be improved by the inclusion of empirical scaling parameters, which can be set through comparison with a small number of mock catalogues, or through internal error determinations such as jack-knife sampling (e.g., O’Connell et al., 2016; Philcox et al., 2020). These scaling parameters can improve some of the approximations used when generating analytical covariances, such as the choice of the effective galaxy bias value, or the assumption of Poisson noise for the galaxy distribution, or Gaussian noise in the velocities. We introduced separate empirical scaling parameters for the amplitude of the galaxy and velocity sample variance ( and ) and noise variance ( and ) entering the analytical covariances, and set these parameters through comparison with a subset of mock catalogues. Referring to the covariance formulae provided in Appendix B, these scaling parameters multiply terms in by , and terms in by . We found that the preferred scaling parameters increased the velocity noise variance by from our fiducial calculation (), and changed the amplitude of the other terms by less than (). We included these scaling parameters in the following analysis, although note that they do not significantly impact the best-fitting model parameters in any case.
4 Tests using simulations
In this section we use N-body simulations to test the accuracy with which the methods of Sec.3 can predict the covariance of correlation function statistics, by estimating these covariances using measurements across many realisations of a cosmological simulation as,
(49) |
where the overbar indicates an average over realisations. We will compare the values of diagonal and off-diagonal elements of the analytical and mock covariances, as well as the performance of the covariance matrices in fits for the growth rate of structure parameter.
4.1 Mocks
For the purposes of these tests we used the large suite of N-body simulations described by Koda et al. (2016), which was created using the Comoving Lagrangian Acceleration (COLA) technique. These simulations span an box, in which we placed an observer at the central location, creating an initial spherical selection function within a radius. The fiducial power spectrum of the simulations is the “WMAP5” cosmological model with matter density , baryon density , Hubble parameter , spectral index , and power spectrum normalisation . The simulation resolves cosmological halos with masses below , and we based our analysis on these halo catalogues.
We selected halos within a mass range , significantly above the resolution limit, and randomly sub-sampled the density and velocity samples to contain a number density gradient with distance from the observer,
(50) |
where is the central number density at the observer (with ), and is the number density at the edge of the sample, where . For the density sample we chose and for the velocity sample, . These choices produced an average total of density tracers and velocity tracers. These are arbitrary choices which are representative of existing sample sizes, whilst allowing correlation functions to be swiftly measured. Fig.1 displays the gradient of tracer density with distance from the observer. When creating “observational” samples we also applied a velocity measurement error which increased with distance from the observer, where the fractional error is of distance,
(51) |
where . This choice of velocity measurement error is representative of the most accurate standard candles such as supernovae (Howlett et al., 2017a; Boruah et al., 2020), whereas Fundamental Plane measurements from elliptical galaxies would have a fractional error exceeding (Springob et al., 2014; Howlett et al., 2022). We found that different configurations produced broadly similar performance of our covariance model; applying the more accurate noise allows the most accurate test of growth rate recovery. For “simulation” samples using 3D velocities, we measured the velocity noise in each direction as . We used mock realisations in the following analysis.

4.2 Correlation function measurements
We measured the galaxy and velocity auto- and cross-correlation functions of these realisations using standard pair-count estimators. We used 17 separation bins of width in the range . For the galaxy and inward velocity correlations, we used the pair-count estimators described by Okumura et al. (2014) and Lyall et al. (2023), based on combinations of weighted sums of pairs from data and random catalogues (where the weights are described below):
(52) |
(53) |
(54) |
In these estimators are the weights of the individual objects, the sums are carried out over pairs in the given separation bin, and denote the sum of the weights of the density and velocity tracers, whilst and are the corresponding sums over the larger random catalogues.
For the line-of-sight velocity correlations we used the equivalent estimators to Eq.16, Eq.17 and Eq.22 (see also Turner et al., 2021),
(55) |
(56) |
(57) |
where is the angle between the line-of-sight vector to the tracer and the separation vector, and is the angle between the two line-of-sight vectors.
For the multipoles of and , we measured the correlations in bins of separation, , and the cosine of the angle of the pair to the line-of-sight, , and then evaluated,
(58) |
where we utilised , and . We used 20 equal-sized cosine bins in the range .
We assigned the tracers optimal weights. For the density tracers, we used Feldman-Kaiser-Peacock (FKP) weights (Feldman et al., 1994),
(59) |
where we assumed a characteristic galaxy power spectrum amplitude (noting that our results are not sensitive to this choice). For the velocity tracers, we used the analogous optimal weight (Howlett, 2019; Turner et al., 2021),
(60) |
where we assumed a characteristic velocity power spectrum amplitude .
We grouped the correlation functions into two separate data vectors: the 3D velocity “simulation” case involving the three correlations without redshift-space distortions, and the line-of-sight velocity “observation” case involving the five correlation functions including redshift-space distortions. The mock mean, standard deviation and fiducial models are displayed for the “simulation” case in Fig.2, and for the “observation” case in Fig.3. We generated the fiducial models by evaluating Eq.5 (for ), Eq.12 (for ), Eq.10 (for ), Eq.27 (for ), Eq.28 (for ), Eq.16 (for ) and Eq.17 (for ). We created the model matter power spectrum using the original “halofit” algorithm (Smith et al., 2003) within the CAMB software (Lewis et al., 2000), assuming a fiducial galaxy bias factor , which we fixed through comparison of the measured and theoretical galaxy power spectra. When integrating models over , we assumed a lower limit given by the box size . The agreement between the correlation function models and measurements worsens on scales Mpc, where the linear-theory modelling becomes less accurate.


4.3 Covariance comparison
We evaluated the analytical covariance for the different components of the “simulation” and “observation” data vectors, using the approximations defined in Sec.3. We first determined the various averages of the different selection function and the noise fields over the embedding cube (as defined by Eq.48), and then evaluated the integrals listed in Appendix B. We increased the accuracy of the determination by averaging the Bessel functions within these integrals over each separation bin.
Figs. 4 and 5 illustrate the comparison between the predicted analytical error in each separation bin, and the standard deviation of the mock measurements, for the individual correlation functions within the “simulation” and “observation” case, respectively. The mock measurements are displayed as a band, where the width of the band indicates the “error in the error” appropriate for Gaussian statistics, . We find that, generally speaking, the analytical covariance provides an accurate model for the dispersion between mocks, with the principal deviations at small scales. The average difference between the analytical and mock error in each separation bin, across the different statistics, is below .
Figs. 6 and 7 display the correlation coefficients of the full covariance matrix derived from the mock measurements, compared to the analytical calculation, spanning the different correlation functions. Fig. 6 uses the “simulation” data vector including the three correlation functions , and Fig. 7 uses the “observation” data vector encompassing the five correlation functions . We can see that the “broad features” of the correlation patterns are successfully reproduced in each case.




We quantified the difference between the analytical and mock covariance matrices using two initial methods. First, we considered the impact on the errors in the model parameters by computing the associated Fisher matrices ,
(61) |
where is the model vector, is the covariance, and the forecast parameter errors are given by . We considered a parameter set for the “simulation” dataset, and for the “observation” dataset (where the model is specified more fully in Sec.4.4). The Fisher forecast parameter errors of the two covariances agreed within for both correlation function sets. Second, following Joachimi et al. (2021), we assessed any systematic offset between the best-fitting parameters for each covariance by sampling many noisy data vectors ,
(62) |
where is a triangular matrix obtained from the Cholesky decomposition of the covariance matrix, , and is a vector of random variables drawn from a Gaussian distribution with zero mean and unit variance. We generated noisy data vectors from the fiducial model for both the mock and analytical covariance, fit these vectors for the growth rate and galaxy bias parameters, and considered the offset in the best-fitting growth rate parameters. This offset was less than of the statistical error in the fit. In the following subsection, we consider a more comprehensive comparison of the two covariances using full model fits to the mock correlation functions.
4.4 Growth rate fits
We can use the analytical covariance matrices derived in Sec.3 and the full mock covariance matrices described in Sec.4, all shown in Fig.6 and Fig.7, to produce fits for the growth rate by means of a minimisation procedure. We reproduce the analyses of Turner et al. (2021) and Turner et al. (2023), using the former for our 3D velocity “simulation" case neglecting redshift-space distortions and using the latter for our line-of-sight velocity “observation" case that includes the effects of redshift-space distortions, fitting to separations greater than Mpc where our linear-theory models are applicable.
4.4.1 Simulation case
Following Turner et al. (2021), we use the three correlations to produce a measurement of . As can be seen from the fiducial models of these functions (Eq.5, Eq.12, and Eq.10 respectively) and the definitions of the corresponding power spectra given in Sec.2, these correlation functions are differently dependent on the growth rate and the linear galaxy bias . By fitting and to all three models simultaneously in a self-consistent manner we can jointly constrain both parameters. Given the assumption that the full 3D velocity information is known, we simply calculate Eq.5, Eq.12, and Eq.10 for each of the 600 COLA mocks and use a minimisation to determine the best-fitting values of for each mock as determined from the minimum value – where is calculated using the following equation:
(63) |
where are the data vectors composed of the three correlation functions calculated from all 600 COLA mocks, are the fiducial models of each correlation function, and is the inverted covariance matrix. To compare the accuracy of the analytical covariance in relation to the full mock covariance, we evaluate Eq.63 using both matrices in place of and contrast the final results. This comparison is shown in Fig.8. We can see from the figure that the results obtained from the analytical analysis and those from the mock analysis are consistent, and both recover the fiducial value of the growth rate as indicated by the dashed vertical line. Taking all 600 COLA mocks into account, the analytical ensemble mean value of the growth rate is with an average reduced value of , and the mock ensemble mean value is with an average reduced value of . Taking the difference between the best-fitting value of and for each of the 600 mocks, we find the standard deviation across the full ensemble to be . By comparing the scatter in the difference between our analytical and mock results with the scatter in or , we find that the systematic error inherent to choosing either the analytical or full mock covariance is negligible and thus the analytical covariance matrix can reliably be used in place of a full mock covariance matrix.
4.4.2 Observation case
Similarly to the simulation case, now following the methodology of Turner et al. (2023), we use the five correlation functions including redshift-space distortions to produce a measurement of . Using fiducial models defined by Eq.27, Eq.28, Eq.19 and Eq.21 for each function, respectively, and their corresponding radial-velocity-dependent estimators given by Eq.24, Eq.25, Eq.16 and Eq.17 we can again produce best-fitting measurements of the growth rate . We implement a similar minimisation procedure, using the equation:
(64) |
where are the data vectors composed of the five correlation function estimators calculated from all 600 COLA mocks and are the fiducial models of the five correlation functions. Given that these correlation functions incorporate redshift-space distortions we cast the multipole models in terms of the linear redshift distortion parameter and a non-linear velocity dispersion parameter with units of km s-1 (see Turner et al., 2023, for the exact form of these models). Considering the best-fitting values of from all 600 COLA mocks, the analytical ensemble mean value of the growth rate is with an average reduced value of , and the mock ensemble mean value is with an average reduced value of . Again measuring the standard deviation in the differences between the best-fitting values of and for the 600 mocks, we find . Echoing the results seen in the three correlation function “simulation" case, the scatter due to the choice of covariance matrix is significantly less than the error in the ensemble mean measurement of or .


5 Conclusions
Large samples of galaxies, tracing the matter density field and measuring direct peculiar velocities, can be summarised via galaxy and velocity auto- and cross-correlation functions, which form a convenient point of comparison with theoretical models. The upcoming generation of large cosmological surveys will provide velocity samples containing hundreds of thousands of galaxies, promising new and accurate insights into gravitational physics on cosmological scales. A potential obstacle in such analyses is determining the covariance of these statistics, which is an important ingredient in Bayesian likelihood analyses. In this paper we have presented new calculations of the analytical covariance between different types of velocity correlation functions and scales. We considered both the “simulation” case, in which we analyse 3D velocities, and the “observation” case, in which we utilise line-of-sight velocities and redshift-space multipoles. The statistical model used in our covariance includes the survey selection function, observational noise, curved-sky effects and redshift-space distortions. We compared our analytical covariance determinations with equivalent covariances estimated using a large suite of cosmological simulations, demonstrating that the analytical covariance is a good representation of the dispersion across the simulations. We also compared the performance of both covariances in fits for the growth rate of cosmic structure, the key cosmological parameter which may be tested by such datasets, finding that the respective growth rate determinations are comparable. Our study demonstrates that the analytical covariance can be a useful tool in the analysis of large velocity samples, in cases where sufficient numerical simulations are unavailable.
Acknowledgements
We thank the anonymous referee for useful comments on the manuscript. We acknowledge financial support received through Australian Research Council Discovery Project DP220101610.
Data Availability
The data underlying this article will be shared on reasonable request to the corresponding author.
References
- Adams & Blake (2017) Adams C., Blake C., 2017, MNRAS, 471, 839
- Adams & Blake (2020) Adams C., Blake C., 2020, MNRAS, 494, 3275
- Blake (2019) Blake C., 2019, MNRAS, 489, 153
- Blake et al. (2018) Blake C., Carter P., Koda J., 2018, MNRAS, 479, 5168
- Boruah et al. (2020) Boruah S. S., Hudson M. J., Lavaux G., 2020, MNRAS, 498, 2703
- Burkey & Taylor (2004) Burkey D., Taylor A. N., 2004, MNRAS, 347, 255
- Carrick et al. (2015) Carrick J., Turnbull S. J., Lavaux G., Hudson M. J., 2015, MNRAS, 450, 317
- Courtois et al. (2023a) Courtois H. M., et al., 2023a, MNRAS, 519, 4589
- Courtois et al. (2023b) Courtois H. M., Dupuy A., Guinet D., Baulieu G., Ruppin F., Brenas P., 2023b, A&A, 670, L15
- Dam et al. (2021) Dam L., Bolejko K., Lewis G. F., 2021, J. Cosmology Astropart. Phys., 2021, 018
- Dupuy et al. (2019) Dupuy A., Courtois H. M., Kubik B., 2019, MNRAS, 486, 440
- Favole et al. (2021) Favole G., Granett B. R., Silva Lafaurie J., Sapone D., 2021, MNRAS, 505, 5833
- Feldman et al. (1994) Feldman H. A., Kaiser N., Peacock J. A., 1994, ApJ, 426, 23
- Fisher (1995) Fisher K. B., 1995, ApJ, 448, 494
- Friedrich et al. (2021) Friedrich O., et al., 2021, MNRAS, 508, 3125
- Gorski (1988) Gorski K., 1988, ApJ, 332, L7
- Gorski et al. (1989) Gorski K. M., Davis M., Strauss M. A., White S. D. M., Yahil A., 1989, ApJ, 344, 1
- Grieb et al. (2016) Grieb J. N., Sánchez A. G., Salazar-Albornoz S., Dalla Vecchia C., 2016, MNRAS, 457, 1577
- Hartlap et al. (2007) Hartlap J., Simon P., Schneider P., 2007, A&A, 464, 399
- Hou et al. (2022) Hou J., Cahn R. N., Philcox O. H. E., Slepian Z., 2022, Phys. Rev. D, 106, 043515
- Howlett (2019) Howlett C., 2019, MNRAS, 487, 5209
- Howlett et al. (2017a) Howlett C., Staveley-Smith L., Blake C., 2017a, MNRAS, 464, 2517
- Howlett et al. (2017b) Howlett C., Robotham A. S. G., Lagos C. D. P., Kim A. G., 2017b, ApJ, 847, 128
- Howlett et al. (2022) Howlett C., Said K., Lucey J. R., Colless M., Qin F., Lai Y., Tully R. B., Davis T. M., 2022, MNRAS, 515, 953
- Huterer et al. (2017) Huterer D., Shafer D. L., Scolnic D. M., Schmidt F., 2017, J. Cosmology Astropart. Phys., 2017, 015
- Joachimi et al. (2021) Joachimi B., et al., 2021, A&A, 646, A129
- Johnson et al. (2014) Johnson A., et al., 2014, MNRAS, 444, 3926
- Koda et al. (2014) Koda J., et al., 2014, MNRAS, 445, 4267
- Koda et al. (2016) Koda J., Blake C., Beutler F., Kazin E., Marin F., 2016, MNRAS, 459, 2118
- Krause & Eifler (2017) Krause E., Eifler T., 2017, MNRAS, 470, 2100
- Lai et al. (2023) Lai Y., Howlett C., Davis T. M., 2023, MNRAS, 518, 1840
- Lewis et al. (2000) Lewis A., Challinor A., Lasenby A., 2000, ApJ, 538, 473
- Li et al. (2019) Li Y., Singh S., Yu B., Feng Y., Seljak U., 2019, J. Cosmology Astropart. Phys., 2019, 016
- Lyall et al. (2023) Lyall S., Blake C., Turner R., Ruggeri R., Winther H., 2023, MNRAS, 518, 5929
- Mohammad & Percival (2022) Mohammad F. G., Percival W. J., 2022, MNRAS, 514, 1289
- Norberg et al. (2009) Norberg P., Baugh C. M., Gaztañaga E., Croton D. J., 2009, MNRAS, 396, 19
- Nusser (2017) Nusser A., 2017, MNRAS, 470, 445
- O’Connell & Eisenstein (2019) O’Connell R., Eisenstein D. J., 2019, MNRAS, 487, 2701
- O’Connell et al. (2016) O’Connell R., Eisenstein D., Vargas M., Ho S., Padmanabhan N., 2016, MNRAS, 462, 2681
- Okumura et al. (2014) Okumura T., Seljak U., Vlah Z., Desjacques V., 2014, J. Cosmology Astropart. Phys., 2014, 003
- Park (2000) Park C., 2000, MNRAS, 319, 573
- Philcox et al. (2020) Philcox O. H. E., Eisenstein D. J., O’Connell R., Wiegand A., 2020, MNRAS, 491, 3290
- Said et al. (2020) Said K., Colless M., Magoulas C., Lucey J. R., Hudson M. J., 2020, MNRAS, 497, 1275
- Satpathy et al. (2017) Satpathy S., et al., 2017, MNRAS, 469, 1369
- Saulder et al. (2013) Saulder C., Mieske S., Zeilinger W. W., Chilingarian I., 2013, A&A, 557, A21
- Saulder et al. (2023) Saulder C., et al., 2023, arXiv e-prints, p. arXiv:2302.13760
- Smith et al. (2003) Smith R. E., et al., 2003, MNRAS, 341, 1311
- Springob et al. (2014) Springob C. M., et al., 2014, MNRAS, 445, 2677
- Strauss & Willick (1995) Strauss M. A., Willick J. A., 1995, Phys. Rep., 261, 271
- Taylor et al. (2023) Taylor E. N., et al., 2023, The Messenger, 190, 46
- Tully et al. (2016) Tully R. B., Courtois H. M., Sorce J. G., 2016, AJ, 152, 50
- Tully et al. (2023) Tully R. B., et al., 2023, ApJ, 944, 94
- Turner et al. (2021) Turner R. J., Blake C., Ruggeri R., 2021, MNRAS, 502, 2087
- Turner et al. (2023) Turner R. J., Blake C., Ruggeri R., 2023, MNRAS, 518, 2436
- Wadekar et al. (2020) Wadekar D., Ivanov M. M., Scoccimarro R., 2020, Phys. Rev. D, 102, 123521
Appendix A Derivation of velocity correlation tensor in terms of isotropic functions
In this Appendix we derive the expression for the velocity correlation tensor in terms of the isotropic functions and defined by Gorski (1988). We start from the definition of the correlation tensor (Eq.3) in the form,
(65) |
Using the fact that is isotropic, and the plane wave expansion of , we find that
(66) |
where is a unit vector in the -direction. Using the identity,
(67) |
this relation becomes,
(68) |
using the Bessel function recursion relation, .
Appendix B General expressions for covariance
B.1 3D velocity statistics with vector separations
In this section we provide the covariance relations between estimates of the “simulation” correlation functions , which utilise the galaxy overdensity and inward velocity of one tracer towards the second tracer. Here, we express these relations in a form preserving the direction of the two separations, and , and we neglect redshift-space distortions. These covariance relations are given in terms of the dimensionless selection function of the density and velocity samples, and , and their noise functions, and , which are written in terms of the sample number densities and , weight functions and , individual-tracer velocity error , and Fourier cuboid volume . In this case, the velocity errors are applied independently to each velocity component. These relations also use the galaxy power spectrum , the velocity power spectrum , and the galaxy-velocity cross-power spectrum, . To preserve the dimensionless nature of the integrand over , we assume that all these power spectra have been divided by before being used in these equations. The normalisation constants are , and . The covariance relations are:
(69) |
(70) |
(71) |
(72) |
(73) |
(74) |
B.2 Line-of-sight velocity statistics with vector separations
In this section we provide the covariance relations between estimates of the “observational” correlation functions , which utilise the galaxy overdensity and line-of-sight galaxy velocities. Here, we express these relations in a form preserving the direction of the two separations, and , and we neglect redshift-space distortions. These relations use the same notation as the preceding section except in this case, the velocity errors are applied independently to each velocity component. The normalisation constants are , and , where these are volume averages over the separation angles between the two line-of-sight directions, and between these two directions and the separation vector, as discussed in Sec.2.3. The covariance relations are:
(75) |
(76) |
(77) |
(78) |
(79) |
(80) |
(81) |
(82) |
(83) |
B.3 Angle-averaged 3D velocity statistics
In this section we provide the covariance relations when the estimators of the “simulation” correlation functions , are averaged over the direction of the separations. When taking these angle averages, we introduce the volume-averaged quantities , where represents a combination of the selection function or noise fields. The power spectra are all functions of , and are assumed divided by the Fourier cuboid volume . The covariance relations are:
(84) |
(85) |
(86) |
(87) |
(88) |
(89) |
B.4 Angle-averaged line-of-sight velocity statistics
In this section we provide the covariance relations when the estimators of the “observational” correlation functions are averaged over the direction of the separations. We neglect redshift-space distortions and use the same notation as the preceding section. The covariance relations are:
(90) |
(91) |
(92) |
(93) |
(94) |
(95) |
(96) |
(97) |
(98) |
B.5 Covariances of multipoles
In this section we provide the covariance relations when the estimators of the “observational” correlation functions including redshift-space distortions, , are averaged over the direction of the separation. When taking these angle averages, we introduce the volume-averaged quantities , where represents a combination of the selection function or noise fields. In Eq.103 to Eq.106, all averages are taken over the field , and this is not repeated inside the equations. We also use the first two multipoles of the galaxy auto-power spectrum, and , and the galaxy-velocity cross-power spectrum, and , whose forms in linear theory are given in Eq.30. The subsequent relations are derived by substituting position-dependent power spectra for and in the preceding equations, for example,
(99) |
The following covariances result after averaging over angles:
(100) |
(101) |
(102) |
.
(103) |
(104) |
(105) |
(106) |
(107) |
(108) |
(109) |
(110) |
(111) |