This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Generating extremely large-volume reionisation simulations

Bradley Greig1,2, J. Stuart B. Wyithe1,2, Steven G. Murray3, Simon J. Mutch1,2& Cathryn M. Trott2,4
1School of Physics, University of Melbourne, Parkville, VIC 3010, Australia
2ARC Centre of Excellence for All-Sky Astrophysics in 3 Dimensions (ASTRO 3D)
3School of Earth and Space Exploration, Arizona State University, Tempe, AZ
4International Centre for Radio Astronomy Research (ICRAR), Curtin University, Bentley, WA, Australia
E-mail: [email protected]
Abstract

Preparing for the first detection of the cosmic 21-cm signal from large-scale interferometer experiments requires rigorous testing of the data analysis and reduction pipelines. To validate that these pipelines do not erroneously remove or add features that can mimic the cosmic signal (e.g. from side-lobes or large-scale power leakage), we require reionisation simulations larger than the experiments primary field of view. For an experiment such as the MWA, with a field of view of 252\sim 25^{2} deg.2, this would require a simulation of several Gpcs, which is currently infeasible. To overcome this, we developed a simplified version of the semi-numerical reionisation simulation code 21CMFAST preferencing large volumes over some physical accuracy by assuming linear theory for structure formation. With this, we constructed a 7.5 Gpc comoving volume with voxel resolution of 1.17\sim 1.17 cMpc tailored specifically to the binned spectral resolution of the MWA. This simulation was used for validating the pipelines for the 2020 MWA 21-cm power spectrum (PS) upper limits (Trott et al.). We then use this large-volume simulation to explore: (i) whether smaller volume simulations are biased by the missing large-scale modes, (ii) non-Gaussianity in estimates of the cosmic variance, (iii) biases in the recovered 21-cm PS following foreground wedge removal and (iv) the impact of tiling smaller volume simulations to achieve extremely large volumes. In summary, we find: (i) no biases from missing large-scale power, (ii) significant contribution from non-Gaussianity in the cosmic variance as expected following Mondal et al. (iii) an over-estimate of the 21-cm PS of 10-20 per cent following wedge mode excision for our particular model and (iv) tiling smaller volume simulations under-estimates the large-scale power and also the estimated cosmic variance.

keywords:
cosmology: theory – dark ages, reionisation, first stars – diffuse radiation – early Universe – galaxies: high-redshift – intergalactic medium

1 Introduction

Directly observing the first stars and primordial galaxies is rendered nigh on impossible by the inescapable neutral hydrogen fog that enshrouds the early Universe. This fog is gradually lifted in the intergalactic medium (IGM) as the neutral hydrogen is ionised by the cumulative output of ultra-violet (UV) photons from stars and galaxies. This phase transition is referred to as the Epoch of Reionisation (EoR).

Our most promising method to observe the EoR is through detecting the 21-cm hyperfine spin-flip transition of neutral hydrogen. This faint signal, measured relative to the Cosmic Microwave Background (CMB), is in either emission or absorption depending on both the ionisation and thermal state of the IGM (see e.g. Gnedin & Ostriker, 1997; Madau et al., 1997; Shaver et al., 1999; Tozzi et al., 2000; Gnedin & Shaver, 2004; Furlanetto et al., 2006; Morales & Wyithe, 2010; Pritchard & Loeb, 2012). Crucially, by measuring the 21-cm signal we gain a 2D picture of the spatial distribution of the neutral hydrogen in the IGM. Being a line transition, repeating this over a broad frequency range builds up full 3D time-dependent movie of IGM in the early Universe. From this, we can infer the properties of the galaxies responsible for driving the EoR.

To detect this 3D signal we require large-scale interferometer experiments, which are specifically designed to be sensitive to the spatial fluctuations. The first generation of these experiments include the Low-Frequency Array (LOFAR; van Haarlem et al. 2013), the Murchison Wide Field Array (MWA; Tingay et al. 2013; Wayth et al. 2018), the Precision Array for Probing the Epoch of Reionisation (PAPER; Parsons et al. 2010), the Owens Valley Radio Observatory Long Wavelength Array (OVRO-LWA; Eastwood et al. 2019) and the upgraded Giant Metrewave Radio Telescope (uGMRT; Gupta et al. 2017). Next generation experiments, offering larger collecting areas and lower noise, are currently underway or are planned such as the Hydrogen Epoch of Reionization Array (HERA; DeBoer et al. 2017), NenuFAR (New extension in Nançay Upgrading loFAR; Zarka et al. 2012) and the Square Kilometre Array (SKA; Mellema et al. 2013; Koopmans et al. 2015).

Although these interferometer experiments are yet to detect the 21-cm signal111A detection of excess absorption has been reported by the Experiment to Detect the Global EoR Signature (EDGES; Bowman et al. 2018a) global signal experiment, however, its cosmological origins have been disputed in the literature (see e.g. Hills et al., 2018; Draine & Miralda-Escudé, 2018; Bowman et al., 2018b; Bradley et al., 2019; Singh & Subrahmanyan, 2019; Singh et al., 2022), recent improvements in flagging of cleaned data and the data analysis and reduction pipelines have yielded the strongest yet upper-limits on the 21-cm power spectrum (PS) from LOFAR (Mertens et al., 2020), the MWA (Trott et al., 2020) and HERA (The HERA Collaboration et al., 2022b).

Crucially, 21-cm signal detection is a precision science. The signal is extremely faint, hidden behind astrophysical foregrounds roughly five orders of magnitude brighter than the cosmic signal. Thus, when developing data analysis and reduction pipelines it is vital to test all components that could erroneously inject or remove signal. One vital piece towards this is having sufficiently large simulation volumes that extend well beyond the primary beam and into the side lobes. For example, for an experiment such as the MWA, with a field-of-view of 252\sim 25^{2} deg.2 at 150 MHz (z8.5z\sim 8.5), this corresponds to a transverse size of 4\sim 4 Gpc. Simulations of this scale are currently infeasible.

In this work, we introduce a modified version of the semi-numerical simulation code 21CMFAST (Mesinger & Furlanetto, 2007; Mesinger et al., 2011) specifically tailored towards generating sufficiently large reionisation volumes. To achieve this, we make a few simplifying assumptions such as adopting linear structure formation and that the IGM spin temperature is in excess of the CMB temperature (TSTCMBT_{S}\gg T_{\rm CMB}). Further, we follow a similar approach to Greig et al. (2011) and only simulate a restricted volume via considering a frequency dependent depth.

With these large volume simulations, redshift-space distortions, signal evolution and sky curvature effects can be correctly simulated to provide realistic 21-cm datasets to apply to data analysis pipelines. The simulations introduced in this work have been added to the software package, WODEN (Line, 2022), which produce realistic end-to-end simulations for the MWA Telescope, where they provide a reference dataset for testing power spectrum amplitude and slope, as well as a comparison against which different analysis techniques can be made (e.g., if the data cleaning and treatment removes sufficient systematic power to reach the cosmological signal).

With such a large simulation volume we can also test and verify several assumptions and approximations that are typically made when being computationally limited to smaller simulation volumes. For example, the impact on the 21-cm PS when tiling simulations to achieve sufficiently large volumes or whether the 21-cm PS is impacted by large-scale modes longer than the simulation volume. Further, we statistically explore the 21-cm PS by breaking up our large volume simulation into smaller sub-volumes. We explore the impact of the non-Gaussianity in the 21-cm signal on the PS cosmic variance uncertainty following Mondal et al. (2015); Mondal et al. (2016) and the potential bias in the 21-cm PS when avoiding the contaminated foreground ‘wedge’ modes relative to the true 21-cm PS from the full simulation as highlighted by Jensen et al. (2016).

The outline of this paper is as follows. In Section 2 we detail our modified version of 21CMFAST and describe the adopted ionising source prescription. Next, in Section 3 we outline the specifically tailored large-volume simulation for the MWA before investigating various properties and assumptions related to the 21-cm PS. Finally, in Section 4 we conclude with our closing remarks. Unless stated otherwise, quoted quantities are in co-moving units and we adopt the cosmological parameters: (ΩΛ\Omega_{\Lambda}, ΩM\Omega_{\rm M}, Ωb\Omega_{b}, nn, σ8\sigma_{8}, H0H_{0}) = (0.69, 0.31, 0.048, 0.97, 0.81, 68 km s-1 Mpc-1), consistent with recent results from the Planck mission (Planck Collaboration et al., 2020).

2 Method

In this work we use a heavily modified version of the semi-numerical simulation code 21CMFAST222In particular, the older C-only version, https://github.com/andreimesinger/21cmFAST (Mesinger & Furlanetto, 2007; Mesinger et al., 2011). 21CMFAST employs approximate but efficient methods to describe the astrophysics of the EoR which compare favourably to computationally expensive radiative transfer (RT) simulations on scales (1\gtrsim 1 cMpc) relevant to 21-cm interferometer experiments (Zahn et al., 2011). This efficiency is achieved by computing the ionisation field using the excursion-set approach (Furlanetto et al., 2004a) which compares the time-integrated number of ionising photons (following a source prescription) to the number of baryons within spherical regions of decreasing radius, RR. Under this method, a simulation voxel at the co-ordinates (𝒙,z\boldsymbol{x},z) is considered ionised if,

nion(𝒙,z|R,δR)1.\displaystyle n_{\rm ion}(\boldsymbol{x},z|R,\delta_{R})\geq 1. (1)

Here, nionn_{\rm ion} is the cumulative number of IGM ionising photons per baryon inside a spherical region of size, RR and corresponding smoothed overdensity, δR\delta_{R}. Explicitly, in this work we ignore the contributions from inhomogenous recombinations (e.g. Sobacchi & Mesinger, 2014) and partial ionisations by X-rays to boost the overall computational efficiency.

For our ionising sources, we adopt the Park et al. (2019) astrophysical parameterisation for high-zz galaxies which directly connects the star-formation rate and ionising escape fraction to the host dark matter halo mass. The advantage of such an approach is that following some simple conversions, UV luminosity functions (LFs) can be produced and compared against high-zz observations. For specific details we defer the interested reader to the aforementioned work and provide a brief summary of the parameterisation below.

First, it is assumed that the typical stellar mass of a galaxy, MM_{\ast}, can be related to its host halo mass via,

M(Mh)=f(ΩbΩm)Mh,\displaystyle M_{\ast}(M_{\rm h})=f_{\ast}\left(\frac{\Omega_{\rm b}}{\Omega_{\rm m}}\right)M_{\rm h}, (2)

and that ff_{\ast}, the fraction of galactic gas in stars, can be expressed as a power-law in halo mass with index α\alpha_{\ast} and normalised at a dark matter halo of mass 101010^{10} MM_{\odot} through f,10f_{\ast,10},

f=f,10(Mh1010M)α.\displaystyle f_{\ast}=f_{\ast,10}\left(\frac{M_{\rm h}}{10^{10}\,M_{\odot}}\right)^{\alpha_{\ast}}. (3)

Following this, we obtain an estimate for the star-formation rate (SFR) by dividing the stellar mass by a characteristic time-scale, tt_{\ast}, corresponding to a fraction of the Hubble time, H1(z)H^{-1}(z),

M˙(Mh,z)=MtH1(z).\displaystyle\dot{M}_{\ast}(M_{\rm h},z)=\frac{M_{\ast}}{t_{\ast}H^{-1}(z)}. (4)

The escape fraction of UV ionising photons, fescf_{\rm esc}, is also assumed to be related to the halo mass through a power-law of index, αesc\alpha_{\rm esc}, and is again normalised at 101010^{10} MM_{\odot} via fesc,10f_{\rm esc,10},

fesc=fesc,10(Mh1010M)αesc.\displaystyle f_{\rm esc}=f_{\rm esc,10}\left(\frac{M_{\rm h}}{10^{10}\,M_{\odot}}\right)^{\alpha_{\rm esc}}. (5)

Finally, to account for feedback and inefficient cooling processes which limit small mass haloes from hosting active, star-forming galaxies a duty-cycle, fdutyf_{\rm duty}, is included which describes the fraction, (1fduty)(1-f_{\rm duty}), of dark matter haloes that cannot host star-forming galaxies,

fduty=exp(MturnMh),\displaystyle f_{\rm duty}={\rm exp}\left(-\frac{M_{\rm turn}}{M_{\rm h}}\right), (6)

with MturnM_{\rm turn} corresponding to this suppression scale. In summary, this ionising source prescription results in six free parameters: f,10f_{\ast,10}, α\alpha_{\ast}, fesc,10f_{\rm esc,10}, αesc\alpha_{\rm esc}, tt_{\ast} and MturnM_{\rm turn}.

The number of ionising photons, nionn_{\rm ion} inside a spherical region of size, RR is then obtained via:

nion=ρ¯b10dMhdn(Mh,z|R,δR)dMhfdutyM˙fescNγ/b,\displaystyle n_{\rm ion}=\bar{\rho}^{-1}_{b}\int^{\infty}_{0}{\rm d}M_{\rm h}\frac{{\rm d}n(M_{h},z|R,\delta_{R})}{{\rm d}M_{\rm h}}f_{\rm duty}\dot{M}_{\ast}f_{\rm esc}N_{\gamma/b}, (7)

where ρ¯b\bar{\rho}_{b} is the mean baryon density, dndMh\frac{{\rm d}n}{{\rm d}M_{\rm h}} is the conditional halo mass function (HMF) and Nγ/bN_{\gamma/b} is the number of ionising photons per stellar baryon. Here, we use the Press-Schecter HMF (Lacey & Cole, 1993) normalised to the mean of the Sheth-Tormen HMF (Sheth & Tormen, 1999), and Nγ/b=5000N_{\gamma/b}=5000 representative of a Salpeter initial mass function (Salpeter, 1955).

The cosmic 21-cm signal is computed via its brightness temperature contrast against the Cosmic Microwave Background (CMB) temperature, TCMBT_{\rm CMB} (e.g. Furlanetto et al., 2006):

δTb(ν)\displaystyle\delta T_{\rm b}(\nu) =\displaystyle= TSTCMB1+z(1eτν0)\displaystyle\frac{T_{\rm S}-T_{\rm CMB}}{1+z}\left(1-e^{-\tau_{\nu_{0}}}\right) (8)
\displaystyle\approx 27xHI(1+δnl)(Hdvr/dr+H)(1TCMBTS)\displaystyle 27x_{\mathrm{H\,{\scriptscriptstyle I}}{}}(1+\delta_{\rm nl})\left(\frac{H}{{\rm d}v_{\rm r}/{\rm d}r+H}\right)\left(1-\frac{T_{\rm CMB}}{T_{\rm S}}\right)
×(1+z100.15Ωmh2)1/2(Ωbh20.023)mK,\displaystyle\times\left(\frac{1+z}{10}\frac{0.15}{\Omega_{\rm m}h^{2}}\right)^{1/2}\left(\frac{\Omega_{b}h^{2}}{0.023}\right)~{}{\rm mK},

where τν0\tau_{\nu_{0}} is the optical depth of the 21-cm line, ν0\nu_{0}, TST_{\rm S} is the gas spin temperature, δnl(𝒙,z)\delta_{\rm nl}(\boldsymbol{x},z) is the evolved (Eularian) overdensity, xHIx_{\mathrm{H\,{\scriptscriptstyle I}}{}} is the ionisation fraction, H(z)H(z) is the Hubble parameter, dvr/dr{\rm d}v_{\rm r}/{\rm d}r is the gradient of the LOS component of the velocity and all quantities are evaluated at redshift z=ν0/ν1z=\nu_{0}/\nu-1. To aid computational efficiency, in this work we ignore the effects of peculiar velocities (except in Section 3.5) while also assuming that the gas spin temperature is in excess of the CMB temperature (TSTCMBT_{S}\gg T_{\rm CMB}). Thus, we compute the brightness temperature contrast as,

δTb(ν)27xHI(1+δnl)(1+z100.15Ωmh2)1/2(Ωbh20.023)mK.\displaystyle\delta T_{\rm b}(\nu)\approx 27x_{\mathrm{H\,{\scriptscriptstyle I}}{}}(1+\delta_{\rm nl})\left(\frac{1+z}{10}\frac{0.15}{\Omega_{\rm m}h^{2}}\right)^{1/2}\left(\frac{\Omega_{b}h^{2}}{0.023}\right)~{}{\rm mK}. (9)

Typically, 21CMFAST first generates a high-resolution 3D realisation of the linear density before applying second-order Lagrange perturbation theory (e.g Scoccimarro, 1998) to obtain the requisite velocity and evolved density fields. It then smooths the evolved fields onto a lower resolution grid to mitigate numerical artefacts. However, to achieve extremely large-volume reionisation simulations, we bypass this step and consider only the linear density field.

Even following this, our largest achievable simulation size is limited to the maximum available memory on our computer network. This limitation is driven by the necessity of performing 3D Fast-Fourier Transforms (FFTs). To bypass this, we follow the approach of Greig et al. (2011) and take a considerable hit to computational efficiency by storing the 3D data on hard-disk and performing successive 1D and 2D FFTs rather than the expensive 3D FFTs in memory. In effect, this limits our memory requirements to N2N^{2} rather than N3N^{3} and takes advantage of the fact that hard-disk space exceeds total memory (since it is cheaper, though considerably slower to access). Importantly, by adopting linear theory (initial conditions are generated in Fourier space) and highlighting that the excursion-set algorithm is applied in Fourier space, we can also minimise the number of FFTs between real and Fourier space. Finally, at no point do we ever generate a full 3D real-space cubic volume, rather we instead truncate the depth to a desired frequency bandwidth. That is, we consider a much smaller line-of-sight depth to our volumes than the transverse direction. By doing this, we only need to perform a limited number of 1D and 2D FFTs to achieve our requisite volume. However, at all times the statistics of the field remain correct as we always fully sample the largest scale modes.

Importantly, although we have made several assumptions to boost computational efficiency at the expense of some physical accuracy, our overall goal is to have a method to generate extremely large 21-cm simulations for testing data analysis and reduction pipelines. For these, physical accuracy is not necessarily the most important requirement, instead, we simply require that the statistics and correlations in our large volume simulations be representative of those expected of the true signal over an extremely large range of spatial scales and survey footprint. Additionally, for these extremely large volume simulations we do not include the effects of sky curvature at present. In future we will look into the impact of sky-curvature on large-volume simulations.

3 Results

3.1 Tailored MWA simulation

Refer to caption
Figure 1: A 2D slice of the ionisation field, xHIx_{\mathrm{H\,{\scriptscriptstyle I}}{}}, obtained from the 7.5 comoving Gpc, 6400 voxel simulation at z=8z=8 (cell resolution 1.17\sim 1.17 cMpc). The coloured dashed circles correspond to the field-of-view for a variety of 21-cm interferometers: MWA (red), HERA (blue), LOFAR (purple) and SKA (teal). See Table 1 for further details.

The Trott et al. (2020) MWA upper-limits were determined from observations spanning 137–197 MHz in two slightly overlapping 30.72 MHz bands. Over these bands, the MWA frequency resolution is 80\sim 80 kHz, which sets the minimum spatial resolution for our large-volume simulations and the maximal line-of-sight length required (384 channels). At 150 MHz (z8.5z\sim 8.5), the MWA field of view is 252\sim 25^{2} deg.2{\rm deg.}^{2}, corresponding to a comoving side-length of 4\sim 4 Gpc. To exceed these requirements, we generate a single simulation volume with a comoving transverse side-length of 7.5 Gpc over 6400 voxels and 600 voxels (703.1703.1 Mpc) along the line-of-sight direction333Such a simulation required 6\sim 6 days, using 32 CPUs and 10\sim 10 GB of memory. The majority of this time is spent reading/writing to file, which could be considerably improved using solid state hard-drives.. This simulation corresponds to a cell resolution of 1.17\sim 1.17 cMpc. To our knowledge, this is the largest volume 21-cm simulation in the literature444As a verification of our method we also performed a simulation with a transverse side-length of 9.6 Gpc and 8092 voxels.. Using this simulation, we construct two 21-cm light-cones spanning the observing bands (z=6.27.5z=6.2-7.5 and z=7.59.4z=7.5-9.4) by stitching together 2D slices of the 21-cm signal from our simulation volume by linearly interpolating in cosmic time (e,g, Datta et al. 2012, 2014; La Plante et al. 2014; Ghara et al. 2015; Mondal et al. 2018).

For this simulation, we adopt f,10=0.05f_{\ast,10}=0.05, α=0.5\alpha_{\ast}=0.5, fesc,10=0.08f_{\rm esc,10}=0.08, αesc=0.5\alpha_{\rm esc}=-0.5, t=0.5t_{\ast}=0.5 and Mturn=108.7M_{\rm turn}=10^{8.7}, consistent with the fiducial model adopted in Park et al. (2019), which was shown to match all recent observational constraints on the reionisation epoch. Additionally, in neglecting inhomogeneous recombinations we are required to set an effective mean free path for the ionising photons inside the ionised regions, RmfpR_{\rm mfp}. We adopt Rmfp=15R_{\rm mfp}=15 comoving Mpc motivated by the fiducial model in Greig & Mesinger (2015).

In Figure 1 we present a 2D slice of the IGM ionisation fraction, xHIx_{\mathrm{H\,{\scriptscriptstyle I}}{}}, at z=8z=8 for which we obtain an IGM neutral fraction of x¯HI0.55\bar{x}_{\mathrm{H\,{\scriptscriptstyle I}}{}}\sim 0.55. As a visual demonstration of the physical extent of these simulations, we overlay the respective fields of view for each of the MWA (red), HERA (blue), LOFAR (purple) and SKA (teal). Note that unlike the MWA, LOFAR and SKA which use phase tracking to follow a single patch of the sky, HERA is a drift-scan experiment (sky rotates over the fixed zenith pointing dishes), thus, here we show its instantaneous field of view. In Table 1 we provide the collecting area of a single station/element and the corresponding field of view for each 21-cm interferometer experiment.

Instrument Collecting area (m2) Field of View (deg.2) Comoving distance
(single element) (@150 MHz) (Mpc)
MWA 21.5 24.72 3977.3
LOFAR 745 3.72 598.8
HERA 154 8.22 1317.3
SKA 1165 3.02 479.0
Table 1: A summary of the collecting area of a single element/station for each of the 21-cm interferometer experiments, the corresponding field of view at 150 MHz and the transverse comoving distance of the simulation required to match the field of view at 150 MHz. Note for HERA, we show the instantaneous field of view as it operates in a drift-scan mode.

3.2 Exploring large-scale modes

Typically, simulations tailored for a 21-cm interferometer experiment will be designed to match the physical extent of the instruments primary beam. Although problematic for full tests of end-to-end pipelines, for most usage cases this assumption should be sufficient. However, by only simulating a volume equivalent to the primary beam, these simulations forgo modes on much longer scales which would otherwise be present in the observed 21-cm signal. In principle, the absence of these large-scale modes could result in an under-estimate of the true large-scale power.

We can test the impact of neglecting modes beyond the physical extent of the primary beam by using our 7.5 Gpc simulation as a reference simulation to represent the true Universe. We then compare the 3D spherically averaged 21-cm power spectrum (PS) obtained from our 7.5 Gpc simulation to the PS obtained from the smaller, independent realisations of the 21-cm signal tailored to match each interferometer experiment. Throughout, we define the 21-cm PS as Δ212(k,z)k3/(2π2V)δT¯b2(z)|δ21(𝒌,z)|2k\Delta^{2}_{21}(k,z)\equiv k^{3}/(2\pi^{2}V)\,\delta\bar{T}^{2}_{\rm b}(z)\,\langle|\delta_{21}(\boldsymbol{k},z)|^{2}\rangle_{k} where δ21(𝒙,z)δTb(𝒙,z)/δTb¯(z)1\delta_{21}(\boldsymbol{x},z)\equiv\delta T_{\rm b}(\boldsymbol{x},z)/\bar{\delta T_{\rm b}}(z)-1. Further, when calculating the 21-cm PS we neglect all k=0k_{\perp}=0 modes. That is, we remove all pure line-of-sight modes as these are not visible to an interferometer555Note, this is somewhat of a crude approximation as interferometers do in fact have a response to these k=0k_{\perp}=0 modes, although it is suppressed due to the tapered response. While this does not impact the results of this work, realistically one should include the true instrument response rather than taking this approximation when comparing simulations to observations. (e.g. Datta et al., 2012).

Refer to caption
Figure 2: The 21-cm power spectrum from simulations tailored to the field-of-view of specific 21-cm interferometers. The black curve corresponds to the 21-cm power spectrum from the full 7500×7500×5007500\times 7500\times 500 Mpc simulation. The red, blue, purple and teal curves correspond to simulation volumes with angular extents equivalent to the field-of-view of the MWA, LOFAR, HERA and the SKA, respectively. The shaded region corresponds to the 1σ\sigma scatter from 10 realisations with different initial conditions.

In Figure 2 we compare the 21-cm PS obtained from our 7.5 Gpc simulation (black curve) to the mean and 68th percentile scatter obtained from 10 independent realisations of the 21-cm signal. In each panel we compare the 21-cm PS obtained from the primary field of view of the MWA (red), HERA (blue), LOFAR (purple) and SKA (teal). In all cases, we compute the 21-cm PS over a simulation volume with the same observing bandwidth of 30\sim 30MHz (transverse distance is equivalent to the field of view).

Immediately from Figure 2 we see no obvious deviations at large scales. The mean 21-cm PS determined over the 10 independent realisations for each survey footprint overlaps with the 21-cm PS from the 7.5 Gpc simulation. On the largest scales for each individual instrument, we observe some scatter in the 21-cm PS owing to cosmic variance, where the binned 21-cm power has few modes (i.e. Poisson sampling error). Thus, provided the simulation volume exceeds the field of view, this should not be a concern. We can then conclude that omitting large-scales modes beyond the primary field of view does not alter the recovered statistics. This is consistent with the hybrid, large-volume (>1>1 Gpc) reionisation simulations generated by Kim et al. (2016), whereby the amplitude of the large-scale 21-cm PS converges for increasing simulation volumes beyond 500 Mpc.

Note, this investigation slightly differs from previous works by Iliev et al. (2014), La Plante et al. (2014) and Kaur et al. (2020). Here, in each case the authors explore sub-volumes from their largest reionisation simulation, which in the case of Iliev et al. (2014) and Kaur et al. (2020) is used to determine a minimum volume for reionisation simulations to achieve convergence of the large-scale radiative effects. Since this is a sub-division of a larger volume, these smaller simulations contain modes beyond the scale of the sub-volume. In our work, we generate new independent realisations of our smaller volume simulations that do not include modes beyond the scale of the simulation.

3.3 Exploring cosmic variance

The 3D spherically averaged 21-cm PS is calculated by averaging over all Fourier modes that fall within spherical shells. Naturally, the variance of this statistic is then given by the Poisson sampling error based on the total number of Fourier modes in each spherical shell,

σΔ212(k,z)=Δ212(k,z)(2π)2Vk2Δk.\displaystyle\sigma_{\Delta^{2}_{21}}(k,z)=\Delta^{2}_{21}(k,z)\sqrt{\frac{(2\pi)^{2}}{Vk^{2}\Delta k}}. (10)

Here, Δ212(k,z)\Delta^{2}_{21}(k,z) is the dimensional 21-cm PS measured within a volume, VV, with spherical shells separated by Δk\Delta k.

However, this assumes that the 21-cm PS fully encodes all the information about the 21-cm signal (i.e. that the 21-cm signal is Gaussian). Instead it is well known that the 21-cm signal is highly non-Gaussian (e.g. Furlanetto et al., 2004a, b; Morales & Hewitt, 2004; Cooray, 2005), in which case the higher-order moments of the 21-cm signal will be non-zero. As a result of this Mondal et al. (2016) have shown that the cosmic variance uncertainty on the 21-cm PS is instead,

σΔ212(k,z)=Δ212(k,z)(2π)2Vk2Δk+T(k,k)V,\displaystyle\sigma_{\Delta^{2}_{21}}(k,z)=\Delta^{2}_{21}(k,z)\sqrt{\frac{(2\pi)^{2}}{Vk^{2}\Delta k}}+\sqrt{\frac{T(k,k)}{V}}, (11)

where T(k,k)T(k,k) is the Trispectrum component which arises from the four-point correlation function of the brightness temperature fluctuations.

To explore this non-Gaussianity in the 21-cm PS cosmic variance, we extract a large number of sub-volumes from our large 7.5 Gpc simulation. In particular, we extract sub-volumes equivalent in spatial extent to the field-of-view for each 21-cm interferometer experiment (as listed in Table 1), each with the same 30 MHz bandwidth along the line-of-sight. To create these sub-volumes, we first split the full 7.5×7.57.5\times 7.5 Gpc2 sky-area into our requisite smaller volumes without overlapping regions. Then, to increase our number statistics, we generate an inset sky-area offset from the simulation edge by half the size of the respective instrument field-of-view. We then additionally split this inset area into non-overlapping sub-volumes. In doing so, we obtain a total of 5, 41, 221 and 365 different sub-volumes from our single, large volume simulation for the MWA, HERA, LOFAR and the SKA, respectively. By considering sub-volumes equivalent to each instruments field-of-view, we consider volumes 210×\sim 2-10\times larger than those considered by Mondal et al. (2016) and Mondal et al. (2017). Further, by splitting up our large volume simulation we can obtain up to 7\sim 7 times as many realisations, increasing our statistical sampling of the numerical cosmic variance uncertainty, albeit at the expense of notably less physics rich simulations.

Refer to caption
Figure 3: The 1σ1\sigma variation in the 21-cm power spectrum estimated from sub-volumes of size equivalent to the 21-cm experiment field-of-view (solid curves) compared to the theoretically expected variation assuming Gaussian errors (dashed) and the non-Gaussian contribution to the sample variance (dotted). Top left: MWA, Top right: HERA, Bottom left: LOFAR and Bottom right: SKA.

In Figure 3 we show the resultant 1σ1\sigma uncertainty on the 21-cm PS obtained from our distribution of sub-volumes (solid curves) for each 21-cm interferometer experiment. In contrast, we also show the expected theoretical estimate of the uncertainty from Equation 10 if we assumed Gaussian statistics (dashed curves). The dotted curve corresponds to the non-Gaussian (Trispectrum) component of the cosmic-variance as highlighted in Equation 11.

On relatively large scales (k<0.5k<0.5 Mpc-1), we find that the true cosmic variance uncertainty is well approximated by a Gaussian component for the theoretically expected error. However, progressing to intermediate and smaller scales, the amplitude of the non-Gaussianity increases significantly, with the Trispectrum component dominating by 10×\geq 10\times that of the Gaussian term. This transition scale of k>0.5k>0.5 Mpc-1 for a non-Gaussian dominated cosmic variance uncertainty on the 21-cm PS is consistent with that of Mondal et al. (2016) and Mondal et al. (2017) for a similar stage of reionisation (x¯HI0.55\bar{x}_{\mathrm{H\,{\scriptscriptstyle I}}{}}\sim 0.55). On these scales, as the EoR progresses, the growth and percolation of the ionised regions drives the non-Gaussianity in the cosmic 21-cm signal. Importantly, we find the relative amplitude (the amplitude simply scales as V1\propto V^{-1}) and shape of the full cosmic variance uncertainty to be consistent across the differently sized sub-volumes. This indicates no additional sources of cosmic variance uncertainty due to the finite-size of the simulation volume considered.

Note however, that while the non-Gaussian term for the 21-cm PS cosmic variance begins to dominate on scales larger than k>0.5k>0.5 Mpc-1, this need not be too concerning for 21-cm experiments. Typically, on these scales, the 21-cm PS will be dominated by the instrument thermal noise (see figure 2 of Greig et al. 2020) making these two terms comparable. Nevertheless, care must be taken when estimating the 21-cm PS cosmic variance error.

3.4 Impact of foreground ‘wedge’ avoidance on cosmic variance

The cosmic 21-cm signal observed by radio interferometer experiments is measured in visibility space (uvuv coverage), which is the 2D Fourier transform of the brightness signal in the image (sky) plane. Measuring the power spectrum then requires gridding these visibilities. However, these visibilities are frequency dependent. Line-of-sight power in Fourier space (frequency dependent) can then leak into the transverse (frequency independent) Fourier modes. This results in the well known ‘wedge’ feature in cylindrical 2D Fourier space, where measured power within this regime is contaminated (Datta et al., 2010; Vedantham et al., 2012; Morales et al., 2012; Parsons et al., 2012; Trott et al., 2012; Thyagarajan et al., 2013; Liu et al., 2014a, b; Thyagarajan et al., 2015a, b; Pober et al., 2016; Murray & Trott, 2018). While it is plausible to mitigate or ‘clean’ these contaminated modes (see e.g. Chapman & Jelić 2019 for a comprehensive review), it is typically easier to simply avoid this contaminated region entirely, measuring the power spectrum in the relatively pristine region above the ‘wedge’.

The spatial location of this ‘wedge’ in 2D cylindrical space is determined by,

k=mk+b\displaystyle k_{\parallel}=mk_{\perp}+b (12)

where kk_{\parallel} and kk_{\perp} are the line-of-sight and transverse Fourier modes, respectively. The constant bb corresponds to an additive buffer region above the ‘wedge’ of Δk=0.1h\Delta k_{\parallel}=0.1\,h Mpc-1 extending beyond the horizon limit while,

m=DCH0E(z)sin(θ)c(1+z).\displaystyle m=\frac{D_{\rm C}H_{0}E(z){\rm sin}(\theta)}{c(1+z)}. (13)

Here, DCD_{\rm C} is the comoving distance, H0H_{0} is the Hubble constant, E(z)=Ωm(1+z)3+ΩΛE(z)=\sqrt{\Omega_{\rm m}(1+z)^{3}+\Omega_{\Lambda}} and θ\theta is the angular radius of the field of view which we conservatively take as θ=π/2\theta=\pi/2 (i.e. observing down to the horizon).

Previously, we numerically estimated the cosmic variance in the 21-cm PS measured over the full 30 MHz bandwidth simulation volume for each interferometer experiment. To highlight the impact of ‘wedge’ avoidance on estimating the 21-cm PS, in particular on the amplitude of the statistical uncertainty, we perform the same analysis as in Section 3.3 instead calculating the 21-cm PS using only Fourier modes above the ‘wedge’.

Refer to caption
Figure 4: The 1σ1\sigma variation in the 21-cm power spectrum estimated from sub-volumes of size equivalent to the 21-cm experiment field-of-view when using the full simulation box (solid curves) and after removing the foreground contaminated ‘wedge’ modes (dashed). Top left: MWA, Top right: HERA, Bottom left: LOFAR and Bottom right: SKA.

In Figure 4 we compare the 1σ\sigma cosmic variance uncertainty after ‘wedge’ avoidance (dashed curves) relative to the 21-cm PS estimates from the full simulation volume (solid curves). As expected, in avoiding the ‘wedge’ region, we limit the total number of Fourier modes available in each spherical shell to measure the 21-cm PS. Thus, quite simply, our Poisson uncertainty increases owing to the smaller number of available modes driving the increasing uncertainty error. Further, as the number of available modes quickly diminishes as we move to larger spatial scales (smaller kk) the cosmic variance uncertainty rapidly increases. For smaller scales, the cosmic variance uncertainty still becomes dominated by the Trispectrum component (as indicated by similar upticks in amplitude at k>1k>1 Mpc-1 as highlighted in Figure 3), though on slightly smaller scales than found previously. Nevertheless, the non-Gaussianity in the cosmic variance uncertainty is still important when performing ‘wedge’ avoidance.

Importantly, the differences between the two cases are amplified by our survey geometry. In general, our simulation volumes are considerably larger along the transverse direction relative to the line-of-sight. Thus, previously, when measuring the 21-cm PS over the full volume, our number statistics for the kk_{\perp} modes were considerably larger than for the kk_{\parallel} modes. Now, when measuring the 21-cm PS using foreground avoidance, most of these kk_{\perp} modes are excluded since they fall within the ‘wedge’.

In following this avoidance strategy, we also significantly limit the largest scale modes accessible to our 21-cm PS measurement. In all cases, the 21-cm PS is only measurable down to k0.1k\sim 0.1 Mpc-1, with no measurable modes beyond this limit. This limit is simply set by the adopted size of our buffer (b=0.1hb=0.1\,h Mpc-1) from Equation 12 which is somewhat arbitrary. In principle, it can be reduced by considering either larger bandwidths or with an improved understanding of our systematics. Note however, that the smallest accessible scale for each instrument is not exactly k0.1k\sim 0.1 Mpc-1, but rather slightly larger. As we additionally ignore all k=0k_{\perp}=0 modes, from Equation 12 the minimum scale is simply bb plus mm times the smallest non-zero kk_{\perp} in our simulation volume. Thus, the minimum kk for the MWA is smaller than the SKA due to the smaller kk_{\perp} (larger transverse scale).

In Figure 5 we present the fractional increase in cosmic variance when removing the foreground contaminated wedge modes. We find that the relative increase is comparable across all scales irrespective of the total simulation volume, highlighting that the increase in the cosmic variance is primarily driven by the fraction of excised Fourier modes (i.e. increased Poisson noise in each bin due to having fewer modes free of foregrounds).

Refer to caption
Figure 5: The fractional increase in the cosmic variance on the 21-cm power spectrum when removing the foreground wedge contaminated modes (σΔ21,wedge2\sigma_{\Delta^{2}_{21,{\rm wedge}}}) compared to the cosmic variance estimated from the full 21-cm simulation (σΔ21,full2\sigma_{\Delta^{2}_{21,{\rm full}}}).

3.5 Biased estimates of the 21-cm PS following ‘wedge’ avoidance

Previously, we explored the impact of foreground avoidance on the resultant cosmic variance uncertainty on a 21-cm PS measurement. Next, we consider whether the 21-cm PS measured following foreground avoidance can be biased relative to the 21-cm PS measured from a full simulation volume. This is particularly pertinent given that recent upper limits on the 21-cm PS measured by LOFAR (Mertens et al., 2020), MWA (Trott et al., 2020) and HERA (The HERA Collaboration et al., 2022b) have for the first time allowed Bayesian astrophysical parameter inference to be employed to understand the physics of reionisation (Ghara et al., 2020; Ghara et al., 2021; Greig et al., 2021a; Greig et al., 2021b; Mondal et al., 2020; The HERA Collaboration et al., 2022a). However, when these inference pipelines are applied to these upper-limits, the simulated 21-cm PS is generated from the full simulation volume, rather than the characteristics of the observation, such as foreground avoidance.

While likely irrelevant for the present upper-limits since these are not very aggressive and only disfavour relatively extreme models of reionisation. Nevertheless, as we approach a first detection, these inference pipelines will be vital to building our understanding of the galactic physics driving reionisation. Thus, it is important to explore the relative impact of any biases on the modelled 21-cm PS when not accounting for the characteristics of the observation.

Firstly, the observed 21-cm PS is asymmetric owing to redshift-space distortions from the peculiar motions of the gas along the line-of-sight. Thus unlike in the earlier sections, we must include the effects of redshift-space distortions on our estimate of the 21-cm brightness temperature signal (note, the inference pipelines above include these effects). These redshift-space distortions increase the relative power along the line-of-sight direction, which could amplify any biases when measuring the 21-cm PS following foreground avoidance. This is because modes are preferentially excised transverse to the line-of-sight, resulting in a higher amplitude signal after binning over all possible modes within spherical shells. Thus here, we compute the 21-cm brightness temperature signal for the full simulation following the multiplication of Equation 9 by the pre-factor (1+dvrHdr)1\left(1+\frac{{\rm d}v_{\rm r}}{H{\rm d}r}\right)^{-1} and computing the anisotropic 21-cm PS.

Refer to caption
Figure 6: The ratio of the 21-cm PS with wedge modes removed (foreground avoidance) relative to the 21-cm PS from the full simulation volume. Solid curves correspond to the mean ratio, while dark and light shaded regions correspond to the 68th and 95th percentiles respectively. Top left: MWA, Top right: HERA, Bottom left: LOFAR and Bottom right: SKA.

In Figure 6 we present the mean (solid curve) and the 68th and 95th percentiles (dark and light, respectively) for the ratio of the 21-cm PS with the ‘wedge’ removed (avoidance) relative to the 21-cm PS from the full simulation volume. Here, these are determined using the same sub-volumes as obtained previously for each individual experiment from the full 7.5 Gpc simulation. It is clearly evident that for all survey areas, the 21-cm PS measured following foreground avoidance returns a biased estimate of the true 21-cm PS obtained from the full simulation volume. On average, we find the relative amplitude of this bias to be relatively modest, at 1020\sim 10-20 per cent over all measured kk-scales. On the largest scales (small kk), when foreground avoidance has limited sampling of the Fourier modes (see Figure 4) this can result in increases/decreases in the 21-cm PS amplitude of larger than 4040 per cent.

Although fairly modest in amplitude, it is important to note that these results are model dependent. That is, this bias corresponds specifically to our particular source parameterisation and frequency band (z7.59.4z\sim 7.5-9.4). Therefore, it is plausible that the amplitude of this bias could increase under different astrophysical models or stages of reionisation. Hence, this could have important consequences for parameter inference studies if the statistics of the simulated models are not consistent with the characteristics of the observation, potentially biasing the inferred astrophysical parameters. Although the model dependency of the work is important to consider here, in previous sections it is less relevant. When investigating the cosmic variance uncertainty the relative amplitude of the uncertainty is dependent on the number of modes within each spherical shell when estimating the 21-cm PS, not the 21-cm PS amplitude itself.

Refer to caption
Figure 7: The mean ratio in the 21-cm PS with wedge modes removed compared to the 21-cm PS from the full simulation volume. The red, blue, purple and teal curves correspond to simulation boxes with fields of view equivalent to the MWA, HERA, LOFAR and the SKA respectively.

To more clearly illustrate the relative increase in amplitude in the 21-cm PS following foreground avoidance, in Figure 7 we show in the same panel the mean ratio of the 21-cm PS with/without foreground avoidance for all interferometer experiments. This clearly demonstrates that the relative increase in the 21-cm PS amplitude following foreground avoidance is consistent at 102010-20 per cent across all the different survey footprints.

This 102010-20 per cent bias is broadly consistent with a similar investigation performed by Jensen et al. (2016). Here, this bias was determined for two astrophysical models over the full reionisation history. For their fiducial model, which more closely matches our model, at a similar stage of reionisation, xHI¯0.5\bar{x_{\mathrm{H\,{\scriptscriptstyle I}}{}}}\sim 0.5, they find a similar bias of 102010-20 per cent, however, as a decrement rather than an amplification. Likely, these differences arise due to the specific modelling of the ionising sources, but more likely due to the simplifications present in our modelling (e.g. linear theory) relative to their NN-body simulations. Importantly, though, they only consider a single realisation for this exploration. For sub-volumes equivalent in size to their single simulation (corresponding to the LOFAR field-of-view) in the lower left panel of Figure 6 we can accommodate individual models with a negative bias. Factoring in the modelling differences mentioned above, we are confident that these two works are broadly consistent.

Note, there also appears to be a slight gradient in the relative bias for the different experiments. The SKA, with the smallest survey footprint has the lowest amplitude whereas the MWA, with the largest footprint has the highest amplitude. However, this is simply due to the geometry of the survey volume. Foreground avoidance with the MWA removes on average more Fourier modes in each spherical shell below the ‘wedge’ than that of the SKA relative to the 21-cm PS computed from the full simulation volume. This is because the MWA has a considerably larger transverse size (higher sampling of transverse modes) which results in a higher fraction of modes cut at the same fixed kk than the SKA. Equally, the uptick at the largest scales occurs earliest for the SKA compared to the other, larger area surveys. This occurs for the same reason as discussed earlier, the removal of the k=0k_{\perp}=0 modes coupled with the kk_{\parallel} needing to satisfy Equation 12.

3.6 Impact of tiling reionisation simulations

Typically, when it is infeasible to generate a single, large volume simulation, the requisite size and volume can be achieved by stitching together many smaller simulations generated using periodic boxes (see e.g. Nasirudin et al., 2020). Although sufficient volumes are achieved in this manner, the statistics of the simulations will only be accurate on the scales modelled by the smaller simulation. That is, it is impossible to inject information on scales beyond the largest scale of the smallest simulation. Using our 7.5 Gpc simulation, we can explore the relative impact of tiling simulations on the large-scale information in the 21-cm PS.

The tiling of periodic simulations additionally leads to repeating structures on scales characteristic of the simulation volume. When considering discrete objects such as galaxies in a redshift survey, these repeating structures can be minimised through rotating the smaller simulations prior to tiling. Unfortunately, since the 21-cm signal arises from the neutral gas, it is a continuous rather than discrete quantity preventing the usage of rotations. Rotations would lead to edge effects and disjoint ionised regions on the boundaries of the smaller simulations leading to strange artefacts in the statistics. Thus, for 21-cm studies, we can only consider tiled simulations.

To investigate the impact on the large-scale power due to tiling, we consider three smaller volumes and tile each to match our single 7.5 Gpc, 6400 voxel simulation volume. We generate 50 independent realisations of the smaller simulations with 200, 400 and 600 voxels per side length, corresponding to lengths of 234.4,468.8\sim 234.4,468.8 and 703.1 Mpc. We tile the same individual simulation to the large volume footprint, thus we obtain 50 independent large volume tiled simulations. For each, we generate the full 21-cm light-cone and only consider a line-of-sight depth of 30 MHz, consistent with large volume simulation.

Refer to caption
Figure 8: The 21-cm power spectrum obtained after tiling smaller volume simulations. The black curve corresponds to the 21-cm power spectrum of the full 7500×7500×5007500\times 7500\times 500 Mpc simulation, whereas the red, blue and purple curves correspond to the 21-cm power spectrum obtained from an equivalent simulation volume after tiling a single simulation box containing 200, 400 and 600 voxels per side length, respectively. Vertical dashed lines correspond to the fundamental frequency, 2π/Lsmall2\pi/L_{\rm small}, for each small volume.

In Figure 8 we compare the 21-cm PS obtained from our 7.5 Gpc simulation (black curve) to the 21-cm PS obtained from an equivalent tiled 7.5 Gpc simulation using the 200 (red), 400 (blue) and 600 (purple) voxel smaller simulations. In each, the vertical dashed line corresponds to the largest scale (fundamental frequency) sampled by the smaller simulations (kf=2π/Lsmallk_{f}=2\pi/L_{\rm small}). The shaded region corresponds to the 68th percentile scatter obtained from our 50 independent realisations. As expected, the 21-cm power drops away on scales below the fundamental frequency. This highlights that information on modes not originally simulated by the smaller volume simulations cannot be artificially created to match the full large volume simulation. Thus, a tiled simulation volume has a limited regime of validity restricted by the scale of the smaller simulations.

On scales just above the fundamental frequency, we find the 21-cm power in the tiled box underestimates the power from the full simulation. In some instances this can be as much as a factor of 23\sim 2-3. Although the physical scale of these spherically averaged Fourier modes are sampled by the original smaller volume simulation, on these scales we are still missing the contribution from Fourier modes which contain at least one component mode longer than the small volume simulation (e.g. kx,kyk_{x},k_{y} or kz<kfk_{z}<k_{f}). These individual modes contain no power, but contribute to the total number of modes within the bin, resulting in an overall reduction to the averaged power in that bin. Thus, the 21-cm power from the tiled box will underestimate the full box power on all scales until the fraction of Fourier modes with individual components larger than the smaller box is sufficiently small. Therefore, care must be taken to adequately produce sufficiently sized smaller volumes prior to tiling to minimise the loss of power on the largest scales.

In addition to underestimating the 21-cm power following the tiling of smaller periodic simulations, it is also important to note that we will underestimate the cosmic variance uncertainty in our tiled simulation volume. Since in the act of tiling we produce repeated copies of the spatial information from our smaller simulations we cannot artificially increase the variance within a sampled Fourier bin. That is, the measured sample variance is fixed to the variance (i.e. Poisson fluctuations) measured from the smaller volume simulation. However, since we have scaled up our total volume, following from Equation 11 the sample variance decreases as 1/V\propto 1/V. Thus our estimated cosmic variance from our tiled, large-volume simulation must be scaled by the increased volume. In effect, the tiled simulation underestimates the cosmic variance error by N2×N\propto\sqrt{N_{\perp}^{2}\times N_{\parallel}}, where NN_{\perp} is the number of copies required to achieve the desired transverse scale and NN_{\parallel} is the number of copies required along the line-of-sight.

Refer to caption
Figure 9: The ratio of the cosmic variance estimated from the full 7500×7500×5007500\times 7500\times 500 Mpc simulation to the cosmic variance estimated from tiling smaller volume simulations to achieve comparable volumes. Again, we consider tiling a single simulation box containing 200 (red), 400 (blue) and 600 (purple) voxels per side length. The cosmic variance of the tiled simulations underestimates the true cosmic variance by an amount proportional to the total number of copies required to be tiled (N2×N\propto\sqrt{N_{\perp}^{2}\times N_{\parallel}}) as estimated by the horizontal dashed curves (see text for further details).

In Figure 9 we highlight this scaling by showing the ratio of the cosmic variance uncertainty from the full simulated volume compared to that obtained from tiling the smaller volume simulations to achieve the same volume. In each, the tiled simulation cosmic variance underestimates the true cosmic variance, as indicated by the ratio sitting well above unity. The dashed horizontal lines correspond to the expected scaling (N2×N\propto\sqrt{N_{\perp}^{2}\times N_{\parallel}}) by which we underestimate the true cosmic variance in each tiled simulation volume. That is, the cosmic variance in the tiled simulation must be decreased by this amount to match the true cosmic variance from the full simulation volume. For larger kk (i.e. k>0.5k>0.5 Mpc-1) the impact of the non-Gaussian (Trispectrum) contribution to the cosmic variance becomes increasingly important, modifying this simple scaling relation.

Importantly, here we have only considered the power extracted from the full simulation volume. However, if one also considers the instrumental response of the beam (i.e. beam multiplied by the tiled signal), then additional artefacts in the estimated power could be expected due to mode mixing and/or power leaking in from the side-lobes. We leave an investigation into these effects to future work.

4 Conclusion

In preparation for the wealth of data expected from large-scale interferometer experiments such as the MWA, LOFAR, HERA and the SKA, we must rigorously test and validate the full data analysis and reduction pipelines. For experiments such as the MWA, with a field-of-view of 252\sim 25^{2} deg.2.^{2} at 150 MHz (z8.5z\sim 8.5) this requires reionisation simulations in excess of 4\sim 4 Gpc. To explore the effects of side-lobes away from the primary beam and the potential for large-scale power leakage, even larger volumes are required.

We introduced a modified version of the semi-numerical reionisation simulation code 21CMFAST (Mesinger & Furlanetto, 2007; Mesinger et al., 2011) specifically tailored towards generating these extremely large volume simulations. To achieve this, we limit the line-of-sight depth of the simulations to specific observing bandwidths (though fully sample the large-scale modes), allowing the transverse scales to be extremely large. Further, we forego some level of accuracy by limiting structure formation to linear theory. However, for the purposes of testing and validation, provided the simulated 21-cm signal behaves in a similar manner (i.e. correct statistics and correlated behaviour) to the expected signal, these assumptions are sufficient. Using this, we generate a 30 MHz band-width (500\sim 500 Mpc) simulation with 6400 voxels over a 7.5 Gpc transverse side-length (1.17\sim 1.17 cMpc resolution). This simulation was specifically tailored for the recent 2020 MWA upper-limits (Trott et al., 2020)

With such a large volume simulation in hand, we then explored the statistical behaviour of the 21-cm power spectrum (PS) and assumptions made in the literature when computational limitations restrict the maximum achievable simulation volumes. We performed these using simulation volumes equivalent in size to the field-of-view for each of the 21-cm interferometer experiments, MWA, LOFAR, HERA and the SKA. We found:

  • finite simulation volumes do not exhibit any loss or increase in large-scale power from missing modes longer than the finite simulation volume.

  • the cosmic variance uncertainty on a 21-cm PS measurement at k0.5k\gtrsim 0.5 Mpc-1 is dominated by the non-Gaussianity in the 21-cm signal by 10×\gtrsim 10\times over that expected under the typical Gaussian assumption. This is consistent with that found earlier by Mondal et al. (2016).

  • the excision of contaminated foreground wedge modes significantly increases the amplitude of cosmic variance due to the reduction in the number of available Fourier modes. Further, it restricts the largest scale modes accessible.

  • measuring the 21-cm PS following wedge mode excision results in a modestly biased estimate (increase of 1020\sim 10-20 per cent) of the true underlying 21-cm PS. Though, the relative amplitude will be heavily model dependent. This bias is broadly consistent with that found by Jensen et al. (2016) and should be an important consideration for reionisation astrophysical parameter inference studies.

  • tiling smaller simulations to achieve extremely large volumes results in an underestimate of the 21-cm power on scales smaller than the small box fundamental frequency kf2π/Lsmallk_{f}\gtrsim 2\pi/L_{\rm small}. Therefore, care must be taken when constructing smaller volume simulations for tiling to ensure the relevant scales of interest are sufficiently sampled.

  • tiling smaller simulations additionally results in underestimating the true cosmic variance of the desired large volume simulation. The underestimation can be compensated for by scaling the uncertainty by N2×N\propto\sqrt{N_{\perp}^{2}\times N_{\parallel}} which accounts for the number of repeated copies of the small volume simulation required to match the transverse and line-of-sight scales, respectively.

Acknowledgements

We thank Garrelt Mellema, Rajesh Mondal and Andrei Mesinger for useful discussions relating to this work. Parts of this research were supported by the Australian Research Council Centre of Excellence for All Sky Astrophysics in 3 Dimensions (ASTRO 3D), through project number CE170100013. CMT is supported by an ARC Future Fellowship under grant FT180100321. Parts of this work were performed on the OzSTAR national facility at Swinburne University of Technology. OzSTAR is funded by Swinburne University of Technology.

Data Availability

The data underlying this article will be shared on reasonable request to the corresponding author.

References

  • Bowman et al. (2018a) Bowman J. D., Rogers A. E. E., Monsalve R. A., Mozdzen T. J., Mahesh N., 2018a, Nature, 555, 67
  • Bowman et al. (2018b) Bowman J. D., Rogers A. E. E., Monsalve R. A., Mozdzen T. J., Mahesh N., 2018b, Nature, 564, E35
  • Bradley et al. (2019) Bradley R. F., Tauscher K., Rapetti D., Burns J. O., 2019, ApJ, 874, 153
  • Chapman & Jelić (2019) Chapman E., Jelić V., 2019, arXiv e-prints, p. arXiv:1909.12369
  • Cooray (2005) Cooray A., 2005, MNRAS, 363, 1049
  • Datta et al. (2010) Datta A., Bowman J. D., Carilli C. L., 2010, ApJ, 724, 526
  • Datta et al. (2012) Datta K. K., Mellema G., Mao Y., Iliev I. T., Shapiro P. R., Ahn K., 2012, MNRAS, 424, 1877
  • Datta et al. (2014) Datta K. K., Jensen H., Majumdar S., Mellema G., Iliev I. T., Mao Y., Shapiro P. R., Ahn K., 2014, MNRAS, 442, 1491
  • DeBoer et al. (2017) DeBoer D. R., et al., 2017, PASP, 129, 045001
  • Draine & Miralda-Escudé (2018) Draine B. T., Miralda-Escudé J., 2018, ApJ, 858, L10
  • Eastwood et al. (2019) Eastwood M. W., et al., 2019, AJ, 158, 84
  • Furlanetto et al. (2004a) Furlanetto S. R., Zaldarriaga M., Hernquist L., 2004a, ApJ, 613, 1
  • Furlanetto et al. (2004b) Furlanetto S. R., Zaldarriaga M., Hernquist L., 2004b, ApJ, 613, 16
  • Furlanetto et al. (2006) Furlanetto S. R., Oh S. P., Briggs F. H., 2006, Phys. Rep., 433, 181
  • Ghara et al. (2015) Ghara R., Datta K. K., Choudhury T. R., 2015, MNRAS, 453, 3143
  • Ghara et al. (2020) Ghara R., et al., 2020, MNRAS, 493, 4728
  • Ghara et al. (2021) Ghara R., Giri S. K., Ciardi B., Mellema G., Zaroubi S., 2021, MNRAS, 503, 4551
  • Gnedin & Ostriker (1997) Gnedin N. Y., Ostriker J. P., 1997, ApJ, 486, 581
  • Gnedin & Shaver (2004) Gnedin N. Y., Shaver P. A., 2004, ApJ, 608, 611
  • Greig & Mesinger (2015) Greig B., Mesinger A., 2015, MNRAS, 449, 4246
  • Greig et al. (2011) Greig B., Bolton J. S., Wyithe J. S. B., 2011, MNRAS, 418, 1980
  • Greig et al. (2020) Greig B., Mesinger A., Koopmans L. V. E., 2020, MNRAS, 491, 1398
  • Greig et al. (2021a) Greig B., Trott C. M., Barry N., Mutch S. J., Pindor B., Webster R. L., Wyithe J. S. B., 2021a, MNRAS, 500, 5322
  • Greig et al. (2021b) Greig B., et al., 2021b, MNRAS, 501, 1
  • Gupta et al. (2017) Gupta Y., et al., 2017, Current Science, 113, 707
  • Hills et al. (2018) Hills R., Kulkarni G., Meerburg P. D., Puchwein E., 2018, Nature, 564, E32
  • Iliev et al. (2014) Iliev I. T., Mellema G., Ahn K., Shapiro P. R., Mao Y., Pen U.-L., 2014, MNRAS, 439, 725
  • Jensen et al. (2016) Jensen H., Majumdar S., Mellema G., Lidz A., Iliev I. T., Dixon K. L., 2016, MNRAS, 456, 66
  • Kaur et al. (2020) Kaur H. D., Gillet N., Mesinger A., 2020, MNRAS, 495, 2354
  • Kim et al. (2016) Kim H.-S., Wyithe J. S. B., Park J., Poole G. B., Lacey C. G., Baugh C. M., 2016, MNRAS, 455, 4498
  • Koopmans et al. (2015) Koopmans L., et al., 2015, in Advancing Astrophysics with the Square Kilometre Array (AASKA14). (arXiv:1505.07568)
  • La Plante et al. (2014) La Plante P., Battaglia N., Natarajan A., Peterson J. B., Trac H., Cen R., Loeb A., 2014, ApJ, 789, 31
  • Lacey & Cole (1993) Lacey C., Cole S., 1993, MNRAS, 262, 627
  • Line (2022) Line J., 2022, The Journal of Open Source Software, 7, 3676
  • Liu et al. (2014a) Liu A., Parsons A. R., Trott C. M., 2014a, Phys. Rev. D, 90, 023018
  • Liu et al. (2014b) Liu A., Parsons A. R., Trott C. M., 2014b, Phys. Rev. D, 90, 023019
  • Madau et al. (1997) Madau P., Meiksin A., Rees M. J., 1997, ApJ, 475, 429
  • Mellema et al. (2013) Mellema G., et al., 2013, Exp. Astron., 36, 235
  • Mertens et al. (2020) Mertens F. G., et al., 2020, MNRAS, 493, 1662
  • Mesinger & Furlanetto (2007) Mesinger A., Furlanetto S., 2007, ApJ, 669, 663
  • Mesinger et al. (2011) Mesinger A., Furlanetto S., Cen R., 2011, MNRAS, 411, 955
  • Mondal et al. (2015) Mondal R., Bharadwaj S., Majumdar S., Bera A., Acharyya A., 2015, MNRAS, 449, L41
  • Mondal et al. (2016) Mondal R., Bharadwaj S., Majumdar S., 2016, MNRAS, 456, 1936
  • Mondal et al. (2017) Mondal R., Bharadwaj S., Majumdar S., 2017, MNRAS, 464, 2992
  • Mondal et al. (2018) Mondal R., Bharadwaj S., Datta K. K., 2018, MNRAS, 474, 1390
  • Mondal et al. (2020) Mondal R., et al., 2020, MNRAS, 498, 4178
  • Morales & Hewitt (2004) Morales M. F., Hewitt J., 2004, ApJ, 615, 7
  • Morales & Wyithe (2010) Morales M. F., Wyithe J. S. B., 2010, ARA&A, 48, 127
  • Morales et al. (2012) Morales M. F., Hazelton B., Sullivan I., Beardsley A., 2012, ApJ, 752, 137
  • Murray & Trott (2018) Murray S. G., Trott C. M., 2018, ApJ, 869, 25
  • Nasirudin et al. (2020) Nasirudin A., Murray S. G., Trott C. M., Greig B., Joseph R. C., Power C., 2020, ApJ, 893, 118
  • Park et al. (2019) Park J., Mesinger A., Greig B., Gillet N., 2019, MNRAS, 484, 933
  • Parsons et al. (2010) Parsons A. R., et al., 2010, AJ, 139, 1468
  • Parsons et al. (2012) Parsons A. R., Pober J. C., Aguirre J. E., Carilli C. L., Jacobs D. C., Moore D. F., 2012, ApJ, 756, 165
  • Planck Collaboration et al. (2020) Planck Collaboration et al., 2020, A&A, 641, A6
  • Pober et al. (2016) Pober J. C., et al., 2016, ApJ, 819, 8
  • Pritchard & Loeb (2012) Pritchard J. R., Loeb A., 2012, Rep. Prog. Phys., 75, 086901
  • Salpeter (1955) Salpeter E. E., 1955, ApJ, 121, 161
  • Scoccimarro (1998) Scoccimarro R., 1998, MNRAS, 299, 1097
  • Shaver et al. (1999) Shaver P. A., Windhorst R. A., Madau P., de Bruyn A. G., 1999, A&A, 345, 380
  • Sheth & Tormen (1999) Sheth R. K., Tormen G., 1999, MNRAS, 308, 119
  • Singh & Subrahmanyan (2019) Singh S., Subrahmanyan R., 2019, ApJ, 880, 26
  • Singh et al. (2022) Singh S., et al., 2022, Nature Astronomy,
  • Sobacchi & Mesinger (2014) Sobacchi E., Mesinger A., 2014, MNRAS, 440, 1662
  • The HERA Collaboration et al. (2022a) The HERA Collaboration et al., 2022a, ApJ, 924, 51
  • The HERA Collaboration et al. (2022b) The HERA Collaboration et al., 2022b, ApJ, 925, 221
  • Thyagarajan et al. (2013) Thyagarajan N., et al., 2013, ApJ, 776, 6
  • Thyagarajan et al. (2015a) Thyagarajan N., et al., 2015a, ApJ, 804, 14
  • Thyagarajan et al. (2015b) Thyagarajan N., et al., 2015b, ApJL, 807, L28
  • Tingay et al. (2013) Tingay S. J., et al., 2013, PASA, 30, 7
  • Tozzi et al. (2000) Tozzi P., Madau P., Meiksin A., Rees M. J., 2000, ApJ, 528, 597
  • Trott et al. (2012) Trott C. M., Wayth R. B., Tingay S. J., 2012, ApJ, 757, 101
  • Trott et al. (2020) Trott C. M., et al., 2020, MNRAS, 493, 4711
  • Vedantham et al. (2012) Vedantham H., Shankar N. U., Subrahmanyan R., 2012, ApJ, 745, 176
  • Wayth et al. (2018) Wayth R., et al., 2018, Publ. Astron. Soc. Australia, 35, 33
  • Zahn et al. (2011) Zahn O., Mesinger A., McQuinn M., Trac H., Cen R., Hernquist L. E., 2011, MNRAS, 414, 727
  • Zarka et al. (2012) Zarka P., Girard J. N., Tagger M., Denis L., 2012, in Boissier S., de Laverny P., Nardetto N., Samadi R., Valls-Gabaud D., Wozniak H., eds, SF2A-2012: Proceedings of the Annual meeting of the French Society of Astronomy and Astrophysics. pp 687–694
  • van Haarlem et al. (2013) van Haarlem M. P., et al., 2013, A&A, 556, 2