Linear bias and halo occupation distribution of emission line galaxies from Nancy Grace Roman Space Telescope
Abstract
We present measurements of the linear galaxy bias of Hα and [OIII] emission line galaxies (ELGs) for the High Latitude Spectroscopic Survey (HLSS) of the Nancy Grace Roman Space Telescope, using galaxy mocks constructed with Galacticus, a semi-analytical model of galaxy formation, over a large cosmic volume and redshift coverage. We compute the two-point statistics of galaxies in configuration space and measure linear bias within scales of Mpc. We adopt different selection algorithms to investigate the impact of the Roman line flux cut, as well as the effect of the dust model used to calibrate Galacticus, on the bias measurements. We consider galaxies with Hα and [OIII] emission over the redshift range , as specified by the current baseline for the Roman HLSS. We find that the linear bias of the Hα and [OIII] ELGs can be expressed as a linear function of redshift: for Hα , and for [OIII] . We have also measured the Halo Occupation Distributions of these Hα and [OIII] ELGs to understand their distribution within dark matter halos. Our results provide key input for reliable forecasts of dark energy and cosmology constraints from Roman.
keywords:
galaxies: formation; cosmology: large-scale structure of universe; methods: numerical; methods: statistical
1 Introduction
As biased tracers of the underlying dark matter distribution, galaxies form primarily in the peaks of the matter density field, and are therefore not uniformly distributed in the universe. On large scales, the distribution of galaxies traces the coherent structure of the cosmic web, and this large scale structure depends on the fundamental cosmological parameters and on the physics governing the formation and evolution of galaxies. The connection between the distributions of galaxies and dark matter can be described by the galaxy bias, b, which can be obtained by comparing the clustering amplitudes of galaxies in a mock galaxy catalog and of the dark matter in a simulation for a given cosmological model (Coil 2013). The proper modeling of galaxy bias is critical in facilitating the use of galaxy clustering as a cosmological probe.
Galaxy clustering data have been used to advance our understanding of both cosmology and galaxy formation. Retrieval of the information on large scales has been extensively studied with the linear perturbation theory of cosmic density field. The use of galaxy catalogs from spectroscopic redshift surveys has enabled the observations of large scale structure, which provides measurements for cosmic distance scales through the baryon acoustic oscillation (BAO), and the linear growth rate through the redshift space distortion (RSD) effect over a wide redshift range, see e.g. Eisenstein et al. (2005); Cole et al. (2005); Beutler et al. (2011, 2012); Ross et al. (2015); Delubac et al. (2015); Ata et al. (2018); Bautista et al. (2018) and references therein. These measurements have been used to put constraints on the fundamental cosmological parameters. However, due to the lack of statistical precision and systematic accuracy, alternative theories to explain the cosmic acceleration, a.k.a. dark energy, are not conclusively ruled out. The minimal extension to the standard model, the CDM cosmology, is allowed by the current observational data. In order to distinguish competing theories and constrain the parameter space, future galaxy surveys are required to probe cosmic large scale structure over wider redshift ranges and larger cosmic volumes (Wang 2008b, a).
For future galaxy surveys like Euclid (Laureijs et al. 2011, 2012), and NASA’s Nancy Grace Roman Space Telescope (hereafter Roman, Green et al. 2012; Dressler et al. 2012; Spergel et al. 2015), galaxy clustering will be one of the main cosmological probes used to measure the properties of dark energy and constrain possible deviations of gravity from general relativity. Roman will mainly target Hα and [OIII] emission line galaxies (ELGs) within redshift , complementary to Euclid by design. To maximize the science return of space missions, it is necessary to optimize survey strategies. In Merson et al. (2018) and Zhai et al. (2019), we calibrated a semi-analytical model (SAM) of galaxy formation, Galacticus (Benson 2012), and applied it to an N-body simulation to produce a realistic synthetic galaxy catalog and estimate the number densities of Hα and [OIII] emitters. Using the same SAM, we have produced an Hα galaxy mock catalog for the Roman HLSS (Zhai et al. 2020), to facilitate the development of analysis tools for Roman BAO/RSD science. We measured the clustering signal and adopted a theoretical template for the galaxy power spectrum to investigate the significance of the BAO and RSD measurements. This simulated catalog also enables an estimate of the galaxy bias, which is crucial for future surveys like Roman in evaluating whether we can infer the properties of dark matter correctly, and in forecasting the power of the survey to constrain dark energy. Merson et al. (2019) combined the SAM and halo occupation distribution (HOD) approaches to produce an Hα galaxy catalog and predict the linear bias as a function of redshift for both Roman and Euclid. In this work, we carry out a more precise analysis by using the simulated galaxy catalog from the SAM only, to forecast the linear bias of both Hα and [OIII] ELGs and their HODs, over the entire redshift range of for the Roman HLSS. Our results can also be used as additional tests of the underlying SAM.
The bias relation between the distribution of galaxies and the underlying matter has been extensively studied in the literature. Euclid and Roman will be the first cosmological surveys targeting Hα and [OIII] emission line galaxies. A detailed analysis of these galaxies using either numerical or semi-analytical methods can inform both cosmological and galaxy evolution studies. Nusser et al. (2020) adopt various SAMs and an empirical model for galaxy formation to investigate the biasing relation for a Euclid-like survey. The bias measured on linear scales is a constant function of star formation rate (SFR) for star-forming galaxies. By utilizing the luminosity variation and peculiar velocity field from the galaxy distribution in redshift space, the ELGs could provide a measurement of the linear growth rate without being biased by environmental effects (Nusser et al. 2020).
Since Euclid and Roman target ELGs at , their sample selection differs significantly from ground-based projects targeting lower redshift galaxies. It is important to investigate the connection between these ELGs and their host dark matter halos (Wechsler & Tinker 2018). Using the observational data from the eBOSS ELG program and mock catalogs, Avila et al. (2020) studied a series of models for the Halo Occupation Distribution (HOD) of the ELGs and investigated the impact on the clustering measurement. Using similar observational data, Guo et al. (2019) measured the occupancy of star-forming galaxies and its evolution as a function of stellar mass within redshift range . Although the eBOSS ELGs have a different redshift distribution and selection algorithm than Roman and Euclid, their implications for the galaxy properties can provide reference information for the future surveys. For other investigations of the connection between ELGs and dark matter halos, see, e.g. Hadzhiyska et al. (2021); Jimenez et al. (2020) and references therein. The connection of the Roman ELGs with their host dark matter halos is not only useful for cosmology and galaxy science; modeling their HODs also provides a convenient way to populate a dark matter simulation with galaxies while preserving the clustering properties. This enables the fast and practical production of the many mock galaxy catalogs required for constructing the covariance matrix for the likelihood analysis (Norberg et al. 2009), an approach that has been widely used in the literature (White et al. 2011; Zehavi et al. 2011; Manera et al. 2013; Zhai et al. 2017). The detailed investigation for the Roman galaxy redshift survey presented in this work provides more details of how dark matter halos are populated by galaxies with different star formation histories and emission lines.
The SAM calibrated and used in our work has enabled the estimate of the number density of ELGs as a function of redshift (Zhai et al. 2019). Along with the linear bias measurement in the current analysis, they provide the crucial input information to forecast the wide range of possible dark energy and cosmological science from Roman galaxy redshift survey, e.g., using the Fisher matrix approach (Tegmark 1997). This can serve as a convenient method to predict the constraining power on the properties of dark energy, for instance a Figure-of-merit analysis (Wang 2008a). In addition, our results are useful in investigating the extra constraining power from galaxy bispectrum (Yankelevich & Porciani 2019), the constraint on neutrino masses (Hamann et al. 2012) and so on.
Our paper is organized as follows: in Section 2, we introduce the galaxy mock catalog for the Roman galaxy redshift survey and the selection algorithm of the sample. Section 3 presents the bias measurements from galaxy clustering, and the HOD of galaxies within the Roman redshift range. Finally we discuss and conclude in Section 4.
2 Methodology
In this section, we describe the construction of the simulated Roman catalogs of ELGs, and the sample selection for the Roman galaxy redshift survey.
2.1 Galaxy formation model
The synthetic galaxy catalog used in this paper has been constructed using the Galacticus galaxy formation model (Benson 2012). Similar to the other SAMs, Galacticus parametrizes the astrophysical processes and performs the evolution of galaxy populations within a distribution of dark matter halos and their merger trees. The processes governing galaxy formation and evolution include gas cooling, star formation, feedback from supernovae, black hole formation and so on. By parametrizing these processes as ordinary differential equations (ODE) and calling the ODE solver, Galacticus can perform a simulation of galaxies within a sufficiently large volume in a timely manner and output details for the galaxy populations, including the star formation history, galaxy morphology, spectral energy distribution (SED), photometric luminosities for a set of filter transmission curves and emission line luminosities.
Before using Galacticus to produce the galaxy catalog, we need to determine the free parameters of the model, since the prior knowledge of the astrophysical processes is poor. This can be non-trivial since the typical number of free parameters is 15 or more (Wechsler & Tinker 2018). In our work, we do not limit ourselves to local galaxies to calibrate the model, but compare the model prediction with galaxy populations at higher redshifts relevant to Roman. The parameters of this model have been calibrated in Zhai et al. (2019), including the parameters for the physics of galaxy formation and the dust-attenuation model. In particular, the dust model is calibrated so that the predicted Hα luminosity function is consistent with observations from the ground-based narrow-band High-z Emission Line Survey (HiZELS, Geach et al. 2008; Sobral et al. 2009; Sobral et al. 2013), or with the number counts data collected from the Wide Field Camera 3 (WFC3) Infrared Spectroscopic Parallels survey (WISP; Atek et al. 2010, 2011; Mehta et al. 2015). The dust model applied is the Calzetti et al. (2000) model with parameter to describe the strength of dust attenuation. The result is for calibration based on WISP, and for HiZELS. Note that a higher value of means stronger dust attenuation, thus the HiZELS-based calibration results in a lower number density of the galaxy sample compared with the WISP-based calibration, with a preference for selecting brighter galaxies. In this paper, we present the bias measurement for ELGs with both dust models and investigate the impact on the large scale structure analysis.
The galaxy population in this analysis is selected by emission line luminosity. Galacticus can output the number of ionizing photons for various species (HI, He I and [OII]), the metallicity of the interstellar medium (ISM), the hydrogen gas density and the volume filling factor of HII regions. We use these parameters to interpolate the tabulated libraries from the CLOUDY photo-ionization code (Ferland et al. 2013) and compute the emission line luminosity for each galaxy. More details of the method can be found in Merson et al. (2018).
2.2 N-body simulation
The key ingredient for SAMs like Galacticus is the set of merger trees of dark matter halos, which can be approximately constructed using the Press-Schechter formalism (Press & Schechter 1974), or extracted from a cosmological N-body simulation. Here we have chosen the latter for high fidelity, and use the merger trees extracted from the UNIT simulation (https://unitsims.ft.uam.es; Chuang et al. 2019), which assumes a spatially flat ΛCDM model with parameters consistent with the Planck 2016 measurement (Planck Collaboration et al. 2016). The simulation contains 4096^3 particles with a box-size of Gpc. This simulation has a mass resolution of with data products covering a redshift range of . The large volume and high resolution make this simulation sufficient for the next generation of galaxy surveys, including DESI, and those planned for Roman and Euclid. We refer the readers to Chuang et al. (2019) for more details of the UNIT simulation and its data products. The merger trees of the dark matter halos are constructed using the Consistent Trees software (Behroozi et al. 2013) and the final product contains more than 160 million merger trees. Applying Galacticus to this simulation allows us to build a light cone catalog of galaxies. In particular, we use the method from Kitzbichler & White (2007) to determine where the dark matter halos enter the light cone of the observer. The resulting catalog has an area of , consistent with the current baseline design of the Roman HLSS.
2.3 Luminosity function of Hα and [OIII] emission lines
In the top row of Figure 1, we present the luminosity function (LF) of Hα and [OIII] emission line galaxies at , with both dust-free and dust-attenuated results. The evolution with redshift reflects the star formation history of the galaxies. In order to validate the simulation, we compare the LF of Hα and [OIII] emission lines with current observations at selected redshifts. The middle row shows the LF of Hα galaxies compared with the HiZELS measurements, which were used to calibrate the SAM. In particular, the dust model with is chosen to match the LF at from HiZELS. Our model underestimates the LF at higher redshift, indicating either the need for improvement in the Galacticus model, or that a redshift-dependent dust model is required. The bottom row shows the comparison of [OIII] emission line galaxies with WISP measurements. Our model shows a mild deviation from the WISP measurements, although the amplitude is roughly consistent. The performance can be improved by calibrating the SAM with more observational data sets.
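As an illustration of how a binned luminosity function of this kind is formed from a catalog, the following minimal sketch counts galaxies per logarithmic luminosity bin and divides by the survey volume and the bin width. The function name, the input arrays and the numbers are hypothetical and are not taken from the Galacticus pipeline.

```python
import numpy as np

def luminosity_function(log_lum, volume, edges):
    """Binned luminosity function: number density per unit log10 luminosity.

    log_lum : log10 line luminosities of the galaxies (hypothetical input)
    volume  : comoving volume of the redshift slice
    edges   : bin edges in log10 luminosity
    """
    edges = np.asarray(edges, dtype=float)
    counts = np.histogram(log_lum, bins=edges)[0]
    dlog = np.diff(edges)                      # bin widths in dex
    phi = counts / (volume * dlog)             # galaxies per volume per dex
    centers = 0.5 * (edges[1:] + edges[:-1])
    return centers, phi

# Made-up example values, for illustration only.
centers, phi = luminosity_function([41.1, 41.2, 41.6, 42.3], 100.0,
                                   [41.0, 41.5, 42.0, 42.5])
print(centers, phi)
```

Applying the same function to dust-free and dust-attenuated luminosities would give the two sets of curves that are compared in the figure.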



2.4 Sample selection for the Roman galaxy redshift survey
In this paper, we focus on the forecast of galaxy linear bias for the Roman HLSS, but the results are also applicable to surveys like the one planned for Euclid. The observing strategy can impact the galaxy selection and thus the linear bias measurements. The Roman grism has a wavelength range of microns, which determines the redshift range for the emission lines of interest, as shown in Figure 2, which includes the three primary lines Hα, [OII] and [OIII]. We also show the [NII] and Hβ lines as they are the main contaminants to Hα and [OIII] respectively, due to the closeness of the emission line wavelengths. Since the current Galacticus SAM significantly underestimates the strength of [OII] emission, we will only consider the Hα and [OIII] lines throughout this paper, as they define the expected Roman galaxy samples.
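The mapping from a grism bandpass to the redshift window of each line follows from z = λ_obs/λ_rest - 1. The sketch below illustrates this with approximate rest-frame wavelengths; the bandpass limits used here are placeholder values chosen only for illustration and should not be read as the Roman grism specification.

```python
# Approximate rest-frame wavelengths in microns.
REST_WAVELENGTH_UM = {"Halpha": 0.6563, "[OIII]": 0.5007, "[OII]": 0.3727}

def redshift_window(rest_um, lambda_min_um, lambda_max_um):
    """Return (z_min, z_max) over which a line falls inside the bandpass."""
    return lambda_min_um / rest_um - 1.0, lambda_max_um / rest_um - 1.0

# Hypothetical bandpass, for illustration only (not the Roman specification).
LAMBDA_MIN_UM, LAMBDA_MAX_UM = 1.0, 2.0

for line, rest in REST_WAVELENGTH_UM.items():
    z_lo, z_hi = redshift_window(rest, LAMBDA_MIN_UM, LAMBDA_MAX_UM)
    print(f"{line}: {z_lo:.2f} < z < {z_hi:.2f}")
```

Because Hα has the longest rest wavelength of the three lines, it leaves the bandpass at the lowest redshift, which is why the higher redshift sample relies on [OIII].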
One of the key characteristics of a survey is its depth, or sensitivity, i.e. the emission line flux limit for an ELG survey. We consider three different line flux limits in our analysis: as a reference case, as the nominal depth for the Roman HLSS, and as the depth of a Euclid-like GRS. The faint limit of is included to facilitate a depth versus area optimization study of the Roman HLSS. As the primary target, the Hα emission line is detectable at . Therefore we split the analysis into two subsamples with and .

At , Roman science requirements specify that at least two emission lines are used in measuring a redshift, with the strong line above the line flux limit. The detection of the second line may not require its strength to be above the line flux limit at 6.5σ, thus we allow for different thresholds for the second emission line. At , we only consider the [OIII] line, required to be observed above the flux limit at 6.5σ, since we are not including [OII] in this study due to the current limits of Galacticus. The impact of the line flux threshold of the [OII] line will be studied in future work.
For the galaxies with , we first compute the emission line flux for Hα and [OIII], then set the flux limits using two variables. The first is , the lower limit on the stronger line (either Hα or [OIII]), as chosen above. The second is , which sets the lower limit on the weaker line in units of and is therefore dimensionless. We choose three values and investigate the impact on the large scale structure analysis: . For instance when and , we select galaxies with the strongest emission line brighter than , and the second emission line brighter than . At , we just apply to select [OIII] emitting galaxies.
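The two-threshold selection described above can be sketched as a boolean mask on the two line fluxes. The names `flux_limit` and `weak_ratio` are hypothetical stand-ins for the two selection variables, and the fluxes in the example are made-up numbers in arbitrary units.

```python
import numpy as np

def select_two_line_sample(flux_a, flux_b, flux_limit, weak_ratio):
    """Select galaxies whose stronger line exceeds flux_limit and whose
    weaker line exceeds weak_ratio * flux_limit (weak_ratio is dimensionless)."""
    flux_a = np.asarray(flux_a, dtype=float)
    flux_b = np.asarray(flux_b, dtype=float)
    strong = np.maximum(flux_a, flux_b)   # stronger of the two lines per galaxy
    weak = np.minimum(flux_a, flux_b)     # weaker of the two lines per galaxy
    return (strong > flux_limit) & (weak > weak_ratio * flux_limit)

# Made-up fluxes in arbitrary units, for illustration only.
f_halpha = [2.0, 1.5, 0.5, 3.0]
f_oiii = [1.2, 0.2, 0.4, 0.9]
mask = select_two_line_sample(f_halpha, f_oiii, flux_limit=1.0, weak_ratio=0.5)
print(mask)
```

Setting `weak_ratio` to zero in this sketch reduces the selection to a single cut on the stronger line, which is the limiting case of the scheme.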
In Figure 3, we show the number densities of the selected galaxies with different flux limits. The curves of the same color merge at due to the selection algorithm since there is only one flux cut on [OIII]. The result shows a monotonic decrease as redshift increases. The flux limit parameter for the second emission line has weaker impact than , but can be important when brighter flux cut and stronger dust-attenuation are assumed.
For galaxies with , both Hα and [OIII] lines can be observed, and their relative strength may not be constant. In Figure 4, we plot the fraction of galaxies with Hα as the stronger line, as a function of redshift. The left and middle panels show that the Hα dominance decreases with redshift regardless of the dust model, for a sufficiently faint Hα line flux cut. The right panel shows that this is not true for a brighter Hα line flux cut, for which the Hα dominance flattens at higher redshifts. This indicates that there are more bright Hα emitters than bright [OIII] emitters at high redshifts. In addition, Figure 4 shows the significant impact of the dust model. The intrinsic result (dust free with ) shows that the Hα dominance is around 30 to 70% in this redshift range. However, when the dust model is applied, almost all galaxies have stronger Hα than [OIII] emission (>80%), indicating that the [OIII] emission line experiences more dust attenuation, as predicted by the Calzetti model. Since the Hα emission is only present at , we will refer to Hα galaxies and galaxies at interchangeably in the following sections, and similarly [OIII] galaxies refer to galaxies with .
Because galaxies have peculiar velocities, the observed redshift is different from the cosmological redshift due to cosmic expansion. This RSD effect can change the measured galaxy distribution and the resultant clustering signal. In our simulation, we add this effect into the galaxy catalog by perturbing the cosmological redshift with $\Delta z = v_{\rm los}/(a c)$, where $v_{\rm los}$ is the line-of-sight component of the velocity, $a$ is the scale factor and $c$ is the speed of light. We will present clustering measurements in both real and redshift space in the following sections.
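A minimal sketch of this redshift perturbation is given below, assuming the shift takes the standard non-relativistic form Δz = v_los/(a c) with a = 1/(1 + z); the function and variable names are illustrative and are not part of the catalog pipeline.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s

def observed_redshift(z_cos, v_los_km_s):
    """Add the peculiar-velocity shift to the cosmological redshift.

    Assumes dz = v_los / (a c), with scale factor a = 1 / (1 + z_cos).
    """
    z_cos = np.asarray(z_cos, dtype=float)
    v_los = np.asarray(v_los_km_s, dtype=float)
    a = 1.0 / (1.0 + z_cos)
    return z_cos + v_los / (a * C_KM_S)

# A galaxy at z = 1 receding at 0.1% of the speed of light along the line of sight.
print(observed_redshift(1.0, 299.792458))
```

Real-space measurements use the cosmological redshift directly, while redshift-space measurements use the perturbed value when converting to distance.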


3 Results
3.1 Galaxy clustering of Hα galaxies
Galaxies do not perfectly trace the underlying matter distribution. They preferentially live in the peaks of the matter density field, which makes them biased tracers of large scale structure that preferentially sample the over-dense regions (Kaiser 1984; Bardeen et al. 1986). In addition, the processes of galaxy formation can introduce further deviations of the galaxy distribution from the matter distribution. These factors result in a relationship between the spatial distribution of galaxies and the dark matter density field: the galaxy bias. Neglecting stochasticity and non-locality, the galaxy density contrast $\delta_g$ can be written as a function of the underlying dark matter density contrast $\delta$ on some scale (Coil 2013), $\delta_g = f(\delta)$, where $\delta \equiv \rho/\bar{\rho} - 1$ and $\bar{\rho}$ is the mean density.
On large scales (also known as linear scales), where the density fluctuations are small and evolve linearly, we can expand the function $f$ and define the linear galaxy bias $b$ through $\delta_g = b\,\delta$. In terms of the 2-point correlation function, we can measure the galaxy bias by comparing the clustering amplitudes of galaxies and matter,
$$b(r) = \sqrt{\frac{\xi_{gg}(r)}{\xi_{mm}(r)}} \tag{1}$$
where $\xi_{gg}(r)$ and $\xi_{mm}(r)$ are the galaxy and matter correlation functions as a function of spatial separation $r$, respectively.
With the simulated galaxy catalog from Galacticus, we compute the galaxy correlation function using the Landy & Szalay (1993) estimator,
$$\xi(r) = \frac{DD - 2DR + RR}{RR} \tag{2}$$
where $DD$, $DR$ and $RR$ are suitably normalized numbers of (weighted) data-data, data-random, and random-random pairs in each galaxy separation bin. The random catalogs are first generated with a uniform distribution on a sphere and then truncated to have the same right ascension and declination boundary as the galaxy catalog. The redshifts of the random catalog are randomly drawn from the galaxy catalog to have the same radial distribution. The random catalog contains 10 times as many points as the galaxy catalog to ensure a stable clustering measurement. Following the same strategy as Merson et al. (2019), we measure the correlation function for each galaxy sample 5 times with different random catalogs. We measure the correlation function at spatial scales up to 150 Mpc. In Figure 5, we present the clustering measurement in a few redshift bins for galaxies with for both real and redshift space. We find that given our sample selection and dust model, the BAO peak can be recovered successfully. The worst case corresponds to high redshift galaxies with the brightest flux cut for the emission lines, which results in a low galaxy number density, so that the clustering signal is significantly affected by noise. The attenuation from the two dust models has a similar impact on the clustering amplitude, although their predictions for the number densities of ELGs are different (Zhai et al. 2019). The clustering amplitude in redshift space is higher than in real space, consistent with the expected enhancement due to the RSD effect.
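To make the estimator concrete, the following sketch implements the Landy & Szalay (1993) form, (DD - 2DR + RR)/RR with each pair count normalized by the total number of pairs, using brute-force pair counting on a small point set. It is an illustration only: it ignores weights and the survey geometry, and a production measurement on the light cone would rely on a dedicated pair-counting code.

```python
import numpy as np

def pair_counts(a, b, edges, auto=False):
    """Histogram of separations between points in a and b (arrays of shape N x 3)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)
    if auto:
        d = d[np.triu_indices(len(a), k=1)]  # keep each unordered pair once
    return np.histogram(d.ravel(), bins=edges)[0].astype(float)

def landy_szalay(data, rand, edges):
    """Landy-Szalay estimate of xi(r) from data and random positions."""
    nd, nr = len(data), len(rand)
    dd = pair_counts(data, data, edges, auto=True) / (nd * (nd - 1) / 2)
    rr = pair_counts(rand, rand, edges, auto=True) / (nr * (nr - 1) / 2)
    dr = pair_counts(data, rand, edges) / (nd * nr)
    return (dd - 2.0 * dr + rr) / rr

# Toy example: unclustered "data" in a unit box with 10 times as many randoms.
rng = np.random.default_rng(0)
data = rng.random((100, 3))
rand = rng.random((1000, 3))
edges = np.linspace(0.1, 0.5, 5)
print(landy_szalay(data, rand, edges))
```

Repeating the call with several independent random catalogs and averaging the results mirrors the five repeated measurements described in the text.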
3.2 Linear bias of Hα galaxies








In order to measure the linear bias, we compute the non-linear correlation function of dark matter using the CLASS and Halofit functionality in the code Nbodykit (Hand et al. 2018) with the Planck 2016 cosmology, to be consistent with the dark matter simulation used in our SAM simulation. The result is shown as the cyan curve in each panel of Figure 5. The galaxy bias is computed by taking the ratio between the correlation functions of galaxies and dark matter. The lower row of each panel shows the resultant bias $b(r)$. As for the galaxy correlation function, the bias is obtained from the mean of five repeated measurements. The error bars are omitted to present the result clearly.
The figure shows that the bias is close to a constant at scales from 10 to 50 or 60 Mpc. At scales below 10 Mpc, the non-linearity of the dark matter dynamics comes into play and can induce scale-dependent bias. On the other hand, at scales above 50 or 60 Mpc, the galaxy bias deviates from a constant value and shows complicated behavior, especially around the BAO scale. This feature is more significant in redshift space than in real space. The same result was pointed out in the earlier work of Merson et al. (2019), where the galaxies are Hα-only galaxies rather than the samples defined using two emission lines as in this paper. However, the cause of this distortion on the largest scales is similar: a combination of several factors, such as the RSD effect, sample variance, and mode coupling of the cosmic density perturbations.
With the measured galaxy correlation function, we can estimate the constant value of the bias by fitting the $b(r)$ shown in Figure 5 with a constant. Based on the measurement, we fit the data within Mpc. The uncertainty of the bias estimate adopts the same strategy as in Merson et al. (2019), based on the root-mean-square (RMS) of the difference between $b(r)$ and the fitted mean $\bar{b}$,
$$\sigma_b = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left[b(r_i) - \bar{b}\right]^2} \tag{3}$$
where $N$ is the number of bins for the galaxy correlation function within Mpc. As an example, Figure 6 displays the resulting measurement of linear bias for galaxies selected with and , i.e. only galaxies with both Hα and [OIII] flux brighter than . It shows a close to linear relation for galaxy bias as a function of redshift, for both real and redshift space. The two dust models, calibrated to match the WISP number counts and the HiZELS Hα LF respectively, give quite consistent results. Note that the large error bar in the redshift space measurement for the galaxy subsample in the highest redshift bin is due to the noisy measurement of the correlation function.
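The constant-bias fit and its RMS uncertainty can be sketched as follows, assuming the bias in each separation bin is the square root of the ratio of the galaxy and matter correlation functions and that the bins are equally weighted, in which case the best-fit constant is the mean. The fit range arguments `r_min` and `r_max` and the toy power-law input are assumptions for illustration.

```python
import numpy as np

def constant_bias(r, xi_gg, xi_mm, r_min, r_max):
    """Fit a constant bias over r_min <= r <= r_max and return (b_fit, sigma_b).

    sigma_b is the RMS of b(r_i) about the fitted constant.
    """
    r, xi_gg, xi_mm = (np.asarray(x, dtype=float) for x in (r, xi_gg, xi_mm))
    keep = (r >= r_min) & (r <= r_max)
    b_r = np.sqrt(xi_gg[keep] / xi_mm[keep])          # bias in each retained bin
    b_fit = b_r.mean()                                # equal-weight constant fit
    sigma_b = np.sqrt(np.mean((b_r - b_fit) ** 2))    # RMS scatter about the fit
    return b_fit, sigma_b

# Toy input: a power-law matter correlation function and galaxies with b = 2.
r = np.array([5.0, 15.0, 25.0, 35.0, 45.0, 70.0])
xi_mm = r ** -1.8
xi_gg = 4.0 * xi_mm
print(constant_bias(r, xi_gg, xi_mm, r_min=10.0, r_max=50.0))
```

With noisy measurements the scatter term is non-zero and grows when few pairs populate the bins, which is the behavior described for the highest redshift bin.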
The measured galaxy bias as a function of redshift can be simply described by a linear function of redshift, characterized by a gradient and an intercept. Combining this model with the bias measurements, we construct a simple χ² and run a Markov Chain Monte Carlo (MCMC) analysis with the Python code emcee (Foreman-Mackey et al. 2013) to obtain constraints on the gradient and intercept. We then sample from the posterior and take the 16th and 84th percentiles as the uncertainty. In Figure 7, we show the fitting results for different dust models and sample selections. We can see that the dust model removes fainter and less massive galaxies, which raises the average halo mass of the galaxy sample and hence the bias, as expected. The difference between the two dust models is not as significant as we find in the clustering measurements. The fitting results of the linear model are summarized in Table 1 for real space and Table 2 for redshift space, for different sample selections and dust models.
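For a model that is linear in its parameters with Gaussian errors and flat priors, the posterior is Gaussian, so the MCMC result can be cross-checked against the closed-form χ² minimum. The sketch below uses this closed-form weighted least-squares solution as a stand-in for the emcee run; it is not the analysis code of this work, and the input bias values are made-up numbers.

```python
import numpy as np

def fit_linear_bias(z, b, sigma_b):
    """Weighted least-squares fit of b(z) = gradient * z + intercept.

    Returns the best-fit (gradient, intercept) and their 1-sigma uncertainties,
    which correspond to the 16th-84th percentile range of a Gaussian posterior.
    """
    z, b, sigma_b = (np.asarray(x, dtype=float) for x in (z, b, sigma_b))
    design = np.vstack([z, np.ones_like(z)]).T        # columns: z, 1
    w = 1.0 / sigma_b ** 2                            # inverse-variance weights
    cov = np.linalg.inv(design.T @ (design * w[:, None]))
    params = cov @ (design.T @ (w * b))               # solves the normal equations
    return params, np.sqrt(np.diag(cov))

# Made-up bias measurements in four redshift bins, for illustration only.
z = np.array([1.0, 1.25, 1.5, 1.75])
b = 0.9 * z + 0.5
params, errors = fit_linear_bias(z, b, sigma_b=np.full(4, 0.05))
print(params, errors)
```

The off-diagonal element of `cov` gives the correlation between gradient and intercept, which is typically strong when the redshift bins span a narrow range.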
The flux limit of the strong line, i.e. , has a direct impact on the linear bias. Increasing its value selects galaxies with brighter Hα or [OIII] emission and increases the bias. This is consistent with the relationship between Hα or [OIII] luminosity and host halo mass (see e.g. Zhai et al. 2019). However, at higher redshifts the impact is less significant, partially because the larger fraction of [OIII]-dominated galaxies reduces the effect. The dependence of the linear bias on , the flux limit of the secondary emission line, is more complicated, and the result does not show a monotonic relation. The reason is partially the flux ratio Hα/[OIII], which has a clear dependence on the Hα luminosity. However, this dependence decreases at higher redshift (see for example Fig. 7 in Zhai et al. 2019). The scatter of the flux ratio Hα/[OIII] at a given Hα luminosity indicates that a galaxy with bright Hα emission does not necessarily have bright [OIII] emission, and vice versa. In general, we find that the dust models as well as the flux limits for the emission lines can affect our estimate of the linear bias at the level of a few to ten percent.
3.3 HOD of Hα galaxies



Halo Occupation Distribution (HOD) is a statistical approach to describe the connection between galaxies and dark matter halos. It has been used to interpret the observations of galaxy clustering over a wide range of redshifts and luminosities, see e.g. Zheng et al. (2005, 2007); Zehavi et al. (2011); White et al. (2011); Zhai et al. (2017). In practice, it splits the galaxies into centrals and satellites. The investigations of massive galaxies have built simple parametrizations to describe the functional forms for the centrals and satellites. However, our understanding of the HOD of the ELGs is relatively poor, although some pioneering work has been done based on simulated or observed ELGs, for instance Geach et al. (2012); Contreras et al. (2013); Cochrane & Best (2018); Gonzalez-Perez et al. (2018); Avila et al. (2020) and references therein.
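Measuring an HOD from a simulated catalog amounts to averaging the numbers of selected central and satellite galaxies per halo in bins of halo mass. The sketch below assumes per-halo arrays that include halos hosting no selected galaxy; the array names and the example values are hypothetical.

```python
import numpy as np

def measure_hod(halo_mass, n_cen, n_sat, edges):
    """Mean numbers of centrals and satellites per halo in bins of halo mass.

    halo_mass : mass of every halo, including halos with no selected galaxy
    n_cen     : 1 if the halo's central passes the selection, else 0
    n_sat     : number of selected satellites in the halo
    """
    halo_mass = np.asarray(halo_mass, dtype=float)
    n_halo = np.histogram(halo_mass, bins=edges)[0]
    cen = np.histogram(halo_mass, bins=edges, weights=n_cen)[0]
    sat = np.histogram(halo_mass, bins=edges, weights=n_sat)[0]
    # Empty mass bins yield NaN rather than raising a warning.
    with np.errstate(invalid="ignore", divide="ignore"):
        return cen / n_halo, sat / n_halo

# Made-up halo catalog, for illustration only.
mass = [1e11, 2e11, 1e12, 3e12, 5e12]
mean_cen, mean_sat = measure_hod(mass, [1, 0, 1, 1, 0], [0, 0, 2, 1, 3],
                                 edges=[5e10, 5e11, 1e13])
print(mean_cen, mean_sat)
```

Logarithmically spaced mass bins are the usual choice in practice, since halo abundance falls steeply with mass.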
The SAM simulation in our work provides a reasonable framework for the measurement of the HOD of the ELGs. In Figure 8, we present the measured HOD using different selections and dust models for galaxies in a few redshift slices. The first prominent feature is that the HOD for centrals has a double-peaked shape as a function of halo mass. The valley is around and its position has no significant dependence on sample selection. This double-peaked HOD for centrals differs from the results of studies based on luminous red galaxies (LRGs), which are consistent with a monotonic function for the central occupation. We note that this double-peaked behavior in the HOD can result from a combination of different factors, including sample selection, dust model and galaxy formation physics. In the earlier work of Merson et al. (2019), the HOD for Hα galaxies was measured based on the Millennium simulation. Despite the different sample selections and parameter sets for the SAM, they also find similar, but weaker, behavior. This feature becomes less pronounced when a higher luminosity threshold is adopted, consistent with the tendency shown in Figure 8. When we increase the flux limit for the emission line or the dust attenuation to reduce the galaxy number density, the double-peaked feature becomes less significant. This indicates that the occupancy of the second peak is dominated by less massive galaxies. Previous work based on observational data or simulations shows that the occupation of central galaxies peaks only in the low mass range and quickly declines at the high mass end (Contreras et al. 2013; Gonzalez-Perez et al. 2018; Avila et al. 2020; Hadzhiyska et al. 2021).
The double-peaked nature of the predicted HODs can be traced back to the presence of two distinct sequences in the plane of central galaxy star formation rate and halo mass in the Galacticus models. We have examined the origin of these two sequences and find that they result from the fact that, in the UNIT simulation merger trees, a significant fraction of dark matter halos undergo periods of mass loss (i.e. their total mass decreases with time).
Galacticus assumes that halos accrete baryons from their surroundings at a rate proportional to their halo mass growth rate. During periods of mass loss in a halo, Galacticus instead holds the baryonic content of the halo fixed, and lets it begin to increase again once the halo has grown beyond its previous greatest mass.
As a consequence, halos undergoing mass loss have no new gas supply, and so the star formation rate of their central galaxies quickly declines, leading to the formation of a second sequence of galaxies in the plane of star formation rate and halo mass.
These periods of mass loss from halos may be physical (driven by merging events which cause mass to be ejected), or may be purely numerical in origin (due to the choice of halo mass definition, or to failings in the halo finder and merger tree builder to link halos together over time). A detailed examination of the origins of these periods of mass loss, and how best to model their effect on the baryonic content of halos, is beyond the scope of this paper, but will be explored in a future work.
The satellite occupation from Galacticus is consistent with expectations, and can be represented by a functional form close to a power law, which is similar to that of massive galaxies at lower redshift. However, we note that this power law can break for high redshift galaxies.
3.4 [OIII] galaxies






In this section we present the linear bias measurements for galaxies at . Note that for this paper, the sample selection for these high redshift galaxies () is different from the lower redshift sample (); only the [OIII] line flux is used, as , to define the flux-selected samples. The [OII] line can be used as the second emission line for robust redshift determination for the Roman ELG sample, which will be implemented in future work pending improvement of [OII] line flux predictions from Galacticus.
In the top rows of Figure 9, we present the correlation function of the galaxies for a few example redshift slices. The bias measurement is shown in the bottom row. We find that the BAO peak can be recovered with these sparser samples, compared with the galaxies. However, due to the lower number density, the clustering measurement becomes noisy and the BAO signal is erased to a certain extent. This can be improved by enlarging the redshift bins in the analysis to include more galaxies in the measurement of the two-point correlation function. The ratio of the galaxy and the dark matter correlation functions is also close to constant at scales Mpc, similar to the Hα galaxies, which enables an estimate of a constant bias on linear scales.
In Figure 10, we present the bias measurement for [OIII] ELGs using the same method as in the previous section. The galaxies are selected if the [OIII] line flux is higher than . The stronger dust attenuation (higher ) selects brighter and more massive galaxies, which increases the galaxy bias as expected. Given the estimated uncertainty, the two dust models give consistent results within . Compared with the Hα ELGs, the two dust models increase the bias estimates significantly for the [OIII] ELGs, because dust imposes more attenuation on [OIII] emission than on Hα. The distribution of the measurements in the figure also shows a linear relation between galaxy bias and redshift. We fit a linear relation as introduced in Section 3.2 and present the result in Figure 11. We note that the flux limit has a stronger effect on the bias of the galaxies than the galaxies; the bias can change by roughly 20%. The nominal depth for the Roman galaxy survey of flux above is able to observe a significant number of [OIII]-emitting and highly biased galaxies. This can provide a robust measurement of the clustering signal to infer cosmological information. Compared with the galaxies at , these [OIII] galaxies are more biased due to the early phases of the dark matter evolution and the redshift dependence of dark matter halo bias. The fitting result of the linear bias model is summarized in Table 3.
We measure the HOD of these [OIII] galaxies to better understand their distribution within dark matter halos and present the result in Figure 12 for a few redshift slices. The prominent feature is similar to that of the Hα galaxies at . The central occupation shows a clear double-peak behavior as a function of halo mass. Whether this is caused by physical mass loss of dark matter halos or by numerical issues in the simulation will be investigated in future work. Similar to the galaxies, the satellite occupation is also close to a power-law form, but with some break at high redshift.
Table 3: Linear fit of the bias of the [OIII] galaxies as a function of redshift, in real space and in redshift space.
3.5 Comparison of HOD with eBOSS
In order to further investigate whether our Galacticus simulation can make reasonable predictions for HODs, we compare the HOD results with the latest eBOSS ELG measurement (Avila et al. 2020). The eBOSS ELG program creates a catalog of thousands of galaxies within the redshift range of , selected using the DECaLS photometric survey. The finalized sample has an average redshift with a number density of . At this redshift, the Roman HLSS can only observe Hα emission due to the wavelength range of its grism (see Figure 2). Therefore we use the Hα flux to define our galaxy mock. We apply flux limits and with two dust models and . This gives us four different galaxy samples with number densities 1.65, 1.08, 0.38 and 0.27 times that of the eBOSS ELG sample. We measure their HODs and compare with the eBOSS measurement in Figure 13. Although our galaxy sample has a different target selection from eBOSS, which uses the [OII] doublet to identify galaxy redshifts, our predicted HOD has a similar amplitude when the number density is close to that of eBOSS. The satellite occupancy shows excellent agreement in terms of shape and amplitude. The halo mass dependence can be described by a power law at the high-mass end. The central occupancy in both Galacticus and eBOSS shows a similar shape with a peak at an intermediate mass scale, with eBOSS peaking at a slightly lower mass scale. In addition, the overall amplitude of the central occupancy of Galacticus is higher and its shape flattens at the high-mass end instead of dropping quickly. This discrepancy can be caused by a combination of factors, among them the difference in the selection algorithms of the Roman GRS and eBOSS, the calibration of the Galacticus parameters governing star formation history and galaxy formation, and the dust models used in the analysis.

3.6 A practical fit of ELG bias at 1 < z < 3

In Figure 14, we present the bias measurement of emission line galaxies for the entire redshift range of the Roman HLSS. We apply and to choose Hα galaxies, and for [OIII] galaxies. Using the measurement with dust model , we perform a linear fit of the bias measurement for Hα and [OIII] galaxies separately, shown as the red line with shaded area. The fitting results can also be found in Tables 1, 2 and 3. In redshift space, we summarize the results as
H (1<z<2): | |||||
[OIII](2<z<3): | (4) |
Our previous tests show that the practical choices of dust model and emission line flux limits do not have a significant impact on the estimate of the linear bias. Therefore the result quoted above is a reasonable description for future analysis, especially for science forecasts for the Roman HLSS.
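A linear fit of bias against redshift of the kind summarized above can be sketched as follows. This is a minimal weighted least-squares example, not necessarily the fitting procedure used for the tables; the function name and the parameter names `slope` and `intercept` are hypothetical, and the data points are synthetic placeholders rather than the measurements or the fitted values of Eq. (4).

```python
import numpy as np


def fit_linear_bias(z, bias, bias_err):
    """Weighted least-squares fit of b(z) = slope * z + intercept.

    np.polyfit expects weights of 1/sigma (not 1/sigma**2) when the
    uncertainties on the data points are Gaussian.
    """
    z = np.asarray(z, dtype=float)
    bias = np.asarray(bias, dtype=float)
    bias_err = np.asarray(bias_err, dtype=float)
    slope, intercept = np.polyfit(z, bias, deg=1, w=1.0 / bias_err)
    return slope, intercept


# Synthetic placeholder data lying exactly on a line (NOT the paper's values).
z_bins = np.array([1.0, 1.25, 1.5, 1.75, 2.0])
b_meas = 0.5 * z_bins + 1.0
b_err = np.full_like(z_bins, 0.05)

slope, intercept = fit_linear_bias(z_bins, b_meas, b_err)
```

With real measurements, the covariance between redshift bins and the posterior on the two parameters would also need to be considered; the sketch ignores both.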
4 Discussion and Conclusion
Emission line galaxies are the main targets of many current and future cosmological surveys. The bright nebular emission due to star formation activity makes the sample selection different from that of the red and massive galaxies at low redshifts. The results based on current observational data are not sufficient to allow simple extrapolation to higher redshifts, so detailed investigations of the spatial distribution of ELGs based on accurate numerical simulations are required. In this paper, we study the linear bias of the ELGs from the Roman galaxy redshift survey based on clustering measurements and present their redshift evolution for various sample selections and dust models. In particular, we use the Galacticus SAM to perform a large-scale galaxy simulation. The model processes all the dark matter merger trees distributed within the 1Gpc box of the N-body simulation UNIT (Chuang et al. 2019). We then construct a lightcone catalog using the method in Kitzbichler & White (2007). The parameters of the model are calibrated to match the current observations at high redshifts to ensure that the galaxy simulation is realistic. We used this model to predict the number densities of Hα and [OIII] emitters for the Roman galaxy survey in Zhai et al. (2019). The same model has been used here to produce a 2000 deg² galaxy mock, consistent with the baseline design of the Roman galaxy redshift survey. A galaxy clustering analysis based on this mock catalog is performed in Zhai et al. (2020) to forecast the uncertainties of the BAO and RSD measurements. The wavelength range of the Roman grism has a direct impact on the redshift range of each nebular emission line, and constrains the selection of galaxy samples (see Figure 2). We have investigated how the clustering of the Roman galaxies depends on the chosen line flux limits.
Depending on the selection criteria of emission line galaxies, we can measure the linear bias of galaxies as a function of redshift with the simulated galaxy catalog and the dust model. We first measure the two-point correlation function of galaxies in real and redshift space and find that the BAO peak on large scales can be recovered for both Hα and [OIII] galaxies within the Roman redshift range, although the [OIII] galaxy samples are more affected by shot noise due to their low number density. Taking the ratio of the galaxy and matter correlation functions enables the measurement of galaxy bias. The result at scales Mpc reveals a roughly constant bias estimate. Thus we use this scale-independent value as the linear bias of galaxies. The deviation of the bias measurement from a constant value at larger scales is noticeable and is caused by a combination of factors, including the non-linear evolution of the BAO signal, the redshift space distortion effect and sample variance due to the limited cosmic volume.
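The ratio-based bias estimate described above can be sketched in a few lines. The example assumes the standard relation b(r) = sqrt(ξ_gg(r) / ξ_mm(r)) between the galaxy and matter correlation functions and averages it over a range of scales. The function name, the argument names and the limits `r_min` and `r_max` are illustrative assumptions, and the input correlation functions are synthetic.

```python
import numpy as np


def linear_bias_from_ratio(r, xi_gal, xi_matter, r_min, r_max):
    """Scale-averaged bias from the ratio of two correlation functions.

    Assumes b(r) = sqrt(xi_gal / xi_matter) and returns the unweighted mean
    of b(r) over r_min <= r <= r_max, together with b(r) on those scales.
    Bins where the ratio is negative (possible for noisy data) yield NaN.
    """
    r = np.asarray(r, dtype=float)
    xi_gal = np.asarray(xi_gal, dtype=float)
    xi_matter = np.asarray(xi_matter, dtype=float)

    mask = (r >= r_min) & (r <= r_max)
    b_of_r = np.sqrt(xi_gal[mask] / xi_matter[mask])
    return b_of_r.mean(), b_of_r


# Synthetic power-law correlation functions with a known input bias of 1.5.
r = np.linspace(5.0, 100.0, 20)
xi_mm = (r / 5.0) ** -1.8
xi_gg = 1.5 ** 2 * xi_mm

# r_min and r_max are hypothetical; the scale range should follow the text.
b_mean, b_r = linear_bias_from_ratio(r, xi_gg, xi_mm, r_min=10.0, r_max=50.0)
```

A flat b(r) over the chosen range is what justifies quoting a single scale-independent value; in redshift space the same ratio also absorbs the redshift space distortion boost, so it should not be read as the real-space bias without correction.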
We find that the scale-independent galaxy biases for both Hα and [OIII] ELGs are close to a linear function, , see Eq. (4), consistent with previous results on Hα ELGs (Merson et al. 2019); see Tables 1 and 2 for Hα galaxies, and Table 3 for [OIII] galaxies. For Hα galaxies we have investigated the impact of the line flux limit and dust model on the linear bias measurement, as shown in Figure 7. We find that the linear bias of ELGs at is insensitive to the line flux cut or the dust attenuation model, consistent with earlier work at (Merson et al. 2019). We find that galaxy bias increases with redshift for ELGs, as expected, since higher redshifts correspond to earlier (more biased) phases of the galaxy distribution. As dark matter halos grow with decreasing redshift, they become more populated with galaxies, which reduces the bias factor with which the galaxy distribution traces the matter distribution.
In order to better understand the distribution of ELGs within their host dark matter halos, we have performed HOD measurements for the galaxy samples, as well as measurements of the halo mass function of the selected galaxies (see Appendix A). The most noticeable feature is the double peak in the central occupancy. The second peak at the high-mass end is likely caused by the mass loss of halos during their evolution. However, the current model cannot determine whether this mass loss is physical or a numerical artifact, so we leave this question for future work. On the other hand, the satellite occupancies for both Hα and [OIII] galaxies are close to a power-law form, with a tendency to break at the high-mass end. This can enable a simple parameterization for practical application in the analysis of large-scale structure.
The Roman galaxy redshift survey will suffer from the usual systematic effects of slitless spectroscopy (Faisst et al., 2018; Martens et al., 2019), such as line misidentification and spectral overlap, although to a lesser degree compared to Euclid, thanks to the higher spectral resolution and wider wavelength range of the Roman grism compared to the Euclid red grism (the Euclid blue grism will not be used in the wide survey). In future work, we will study the survey completeness and purity for the Roman galaxy redshift survey, and their impact on the observed galaxy sample and the galaxy clustering analysis.
The HOD measurement of the Roman galaxy sample builds a straightforward connection between galaxies and dark matter halos, with halo mass as the only parameter. However, secondary halo properties other than halo mass can also affect the clustering signal of galaxies. This assembly bias phenomenon has been reported in studies based on numerical simulations, see e.g. Gao et al. (2005); Wechsler et al. (2006). Some of the latest studies of ELGs find that the secondary properties of dark matter halos can affect the distribution of ELGs and thus cosmological measurements based on large-scale structure, such as the BAO peak (Jimenez et al. 2020). The SAM employed in our simulation can output detailed properties of galaxies and their host halos. This can build a more accurate connection between galaxies and dark matter halos by using information on both internal and external halo properties, and help minimize the systematics in the cosmological inference.
We have presented linear bias and HOD measurements for ELGs at in this paper, which are key inputs to the realistic forecast of dark energy and cosmological constraints from possible Roman galaxy redshift surveys. These in turn can be used to optimize the observing strategy for Roman. Our results are similarly useful to other ongoing or future galaxy surveys that use ELGs to trace cosmic large scale structure.
Data Availability
The original dark matter halo catalogs are available from the UNIT simulation website. The galaxy mocks are available by request. A public webpage presenting the mocks will be available at a later time.
Acknowledgements
ZZ thanks Santiago Avila for providing their HOD measurement of the eBOSS ELG sample. This work is supported in part by NASA grant 15-WFIRST15-0008, Cosmology with the High Latitude Survey Roman Science Investigation Team (SIT). GY would like to thank MICIU/FEDER (Spain) for financial support under project grant PGC2018-094975-B-C21. The UNIT simulations were run on the MareNostrum Supercomputer at the Barcelona Supercomputing Center (Spain) thanks to the CPU time awarded by PRACE under project grant number 2016163937. This work used the Extreme Science and Engineering Discovery Environment (XSEDE), which is supported by National Science Foundation grant number ACI-1548562 (Towns et al. 2014).
References
- Ambikasaran et al. (2015) Ambikasaran S., Foreman-Mackey D., Greengard L., Hogg D. W., O’Neil M., 2015, IEEE Transactions on Pattern Analysis and Machine Intelligence, 38
- Ata et al. (2018) Ata M., et al., 2018, MNRAS, 473, 4773
- Atek et al. (2010) Atek H., et al., 2010, ApJ, 723, 104
- Atek et al. (2011) Atek H., et al., 2011, ApJ, 743, 121
- Avila et al. (2020) Avila S., et al., 2020, MNRAS, 499, 5486
- Bardeen et al. (1986) Bardeen J. M., Bond J. R., Kaiser N., Szalay A. S., 1986, ApJ, 304, 15
- Bautista et al. (2018) Bautista J. E., et al., 2018, ApJ, 863, 110
- Behroozi et al. (2013) Behroozi P. S., Wechsler R. H., Wu H.-Y., Busha M. T., Klypin A. A., Primack J. R., 2013, ApJ, 763, 18
- Benson (2012) Benson A. J., 2012, New Astron., 17, 175
- Beutler et al. (2011) Beutler F., et al., 2011, MNRAS, 416, 3017
- Beutler et al. (2012) Beutler F., et al., 2012, MNRAS, 423, 3430
- Calzetti et al. (2000) Calzetti D., Armus L., Bohlin R. C., Kinney A. L., Koornneef J., Storchi-Bergmann T., 2000, ApJ, 533, 682
- Chuang et al. (2019) Chuang C.-H., et al., 2019, MNRAS, 487, 48
- Cochrane & Best (2018) Cochrane R. K., Best P. N., 2018, MNRAS, 480, 864
- Coil (2013) Coil A. L., 2013, The Large-Scale Structure of the Universe. p. 387, doi:10.1007/978-94-007-5609-0_8
- Colbert et al. (2013) Colbert J. W., et al., 2013, ApJ, 779, 34
- Cole et al. (2005) Cole S., et al., 2005, MNRAS, 362, 505
- Contreras et al. (2013) Contreras S., Baugh C. M., Norberg P., Padilla N., 2013, MNRAS, 432, 2717
- Delubac et al. (2015) Delubac T., et al., 2015, A&A, 574, A59
- Dressler et al. (2012) Dressler A., et al., 2012, arXiv e-prints, p. arXiv:1210.7809
- Eisenstein et al. (2005) Eisenstein D. J., et al., 2005, ApJ, 633, 560
- Faisst et al. (2018) Faisst A. L., Masters D., Wang Y., Merson A., Capak P., Malhotra S., Rhoads J. E., 2018, ApJ, 855, 132
- Ferland et al. (2013) Ferland G. J., et al., 2013, Rev. Mex. Astron. Astrofis., 49, 137
- Foreman-Mackey et al. (2013) Foreman-Mackey D., Hogg D. W., Lang D., Goodman J., 2013, PASP, 125, 306
- Gao et al. (2005) Gao L., Springel V., White S. D. M., 2005, MNRAS, 363, L66
- Geach et al. (2008) Geach J. E., Smail I., Best P. N., Kurk J., Casali M., Ivison R. J., Coppin K., 2008, MNRAS, 388, 1473
- Geach et al. (2012) Geach J. E., Sobral D., Hickox R. C., Wake D. A., Smail I., Best P. N., Baugh C. M., Stott J. P., 2012, MNRAS, 426, 679
- Gonzalez-Perez et al. (2018) Gonzalez-Perez V., et al., 2018, MNRAS, 474, 4024
- Green et al. (2012) Green J., et al., 2012, arXiv e-prints, p. arXiv:1208.4012
- Guo et al. (2019) Guo H., et al., 2019, ApJ, 871, 147
- Hadzhiyska et al. (2021) Hadzhiyska B., Tacchella S., Bose S., Eisenstein D. J., 2021, MNRAS, 502, 3599
- Hamann et al. (2012) Hamann J., Hannestad S., Wong Y. Y. Y., 2012, J. Cosmology Astropart. Phys., 2012, 052
- Hand et al. (2018) Hand N., Feng Y., Beutler F., Li Y., Modi C., Seljak U., Slepian Z., 2018, AJ, 156, 160
- Hunter (2007) Hunter J. D., 2007, Computing in Science and Engineering, 9, 90
- Jimenez et al. (2020) Jimenez E., Padilla N., Contreras S., Zehavi I., Baugh C., Orsi A., 2020, arXiv e-prints, p. arXiv:2010.08500
- Jones et al. (2001) Jones E., Oliphant T., Peterson P., et al., 2001–, SciPy: Open source scientific tools for Python, http://www.scipy.org/
- Kaiser (1984) Kaiser N., 1984, ApJ, 284, L9
- Kitzbichler & White (2007) Kitzbichler M. G., White S. D. M., 2007, MNRAS, 376, 2
- Landy & Szalay (1993) Landy S. D., Szalay A. S., 1993, ApJ, 412, 64
- Laureijs et al. (2011) Laureijs R., et al., 2011, arXiv e-prints, p. arXiv:1110.3193
- Laureijs et al. (2012) Laureijs R., et al., 2012, in Space Telescopes and Instrumentation 2012: Optical, Infrared, and Millimeter Wave. p. 84420T, doi:10.1117/12.926496
- Manera et al. (2013) Manera M., et al., 2013, MNRAS, 428, 1036
- Martens et al. (2019) Martens D., Fang X., Troxel M. A., DeRose J., Hirata C. M., Wechsler R. H., Wang Y., 2019, MNRAS, 485, 211
- Mehta et al. (2015) Mehta V., et al., 2015, ApJ, 811, 141
- Merson et al. (2018) Merson A., Wang Y., Benson A., Faisst A., Masters D., Kiessling A., Rhodes J., 2018, MNRAS, 474, 177
- Merson et al. (2019) Merson A., Smith A., Benson A., Wang Y., Baugh C., 2019, MNRAS, 486, 5737
- Murray & Poulin (2019) Murray S. G., Poulin F. J., 2019, Journal of Open Source Software, 4, 1397
- Norberg et al. (2009) Norberg P., Baugh C. M., Gaztañaga E., Croton D. J., 2009, MNRAS, 396, 19
- Nusser et al. (2020) Nusser A., Yepes G., Branchini E., 2020, ApJ, 905, 47
- Planck Collaboration et al. (2016) Planck Collaboration et al., 2016, A&A, 594, A13
- Press & Schechter (1974) Press W. H., Schechter P., 1974, ApJ, 187, 425
- Ross et al. (2015) Ross A. J., Samushia L., Howlett C., Percival W. J., Burden A., Manera M., 2015, MNRAS, 449, 835
- Sobral et al. (2009) Sobral D., et al., 2009, MNRAS, 398, 75
- Sobral et al. (2013) Sobral D., Smail I., Best P. N., Geach J. E., Matsuda Y., Stott J. P., Cirasuolo M., Kurk J., 2013, MNRAS, 428, 1128
- Spergel et al. (2015) Spergel D., et al., 2015, arXiv e-prints, p. arXiv:1503.03757
- Tegmark (1997) Tegmark M., 1997, Phys. Rev. Lett., 79, 3806
- Towns et al. (2014) Towns J., et al., 2014, Computing in Science and Engineering, 16, 62
- Wang (2008a) Wang Y., 2008a, Phys. Rev. D, 77, 123525
- Wang (2008b) Wang Y., 2008b, J. Cosmology Astropart. Phys., 2008, 021
- Wechsler & Tinker (2018) Wechsler R. H., Tinker J. L., 2018, ARA&A, 56, 435
- Wechsler et al. (2006) Wechsler R. H., Zentner A. R., Bullock J. S., Kravtsov A. V., Allgood B., 2006, ApJ, 652, 71
- White et al. (2011) White M., et al., 2011, ApJ, 728, 126
- Yankelevich & Porciani (2019) Yankelevich V., Porciani C., 2019, MNRAS, 483, 2078
- Zehavi et al. (2011) Zehavi I., et al., 2011, ApJ, 736, 59
- Zhai et al. (2017) Zhai Z., et al., 2017, ApJ, 848, 76
- Zhai et al. (2019) Zhai Z., Benson A., Wang Y., Yepes G., Chuang C.-H., 2019, MNRAS, 490, 3667
- Zhai et al. (2020) Zhai Z., Chuang C.-H., Wang Y., Benson A., Yepes G., 2020, arXiv e-prints, p. arXiv:2008.09746
- Zheng et al. (2005) Zheng Z., et al., 2005, ApJ, 633, 791
- Zheng et al. (2007) Zheng Z., Coil A. L., Zehavi I., 2007, ApJ, 667, 760
- van der Walt et al. (2011) van der Walt S., Colbert S. C., Varoquaux G., 2011, Computing in Science and Engineering, 13, 22
Appendix A ELG halo mass function
In order to better understand the distribution of galaxies within dark matter halos, we compute the halo mass function (HMF) of the Roman ELGs, i.e. the number density of dark matter halos that host Roman galaxies. Figure 15 shows an example for galaxies within , with different selection algorithms including the flux limits on the emission lines and dust models. The lower amplitude for the Roman galaxies is mainly driven by the selection algorithms. The flux limit can remove faint galaxies, which are likely to live in less massive halos. The requirement of both Hα and [OIII] emission lines and the galaxy formation physics contribute to the decrease of the HMF at the high-mass end.
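A halo mass function of host halos, as defined above, can be estimated by binning the masses of the halos that host at least one selected galaxy and dividing by the comoving volume and the bin width. The sketch below is illustrative only: the function and argument names are hypothetical, each host halo is assumed to appear exactly once in the input, and the numbers are synthetic.

```python
import numpy as np


def host_halo_mass_function(host_halo_mass, volume, log_mass_edges):
    """Number density of host halos per unit log10(mass).

    host_halo_mass : masses of halos hosting at least one selected galaxy,
                     each halo listed once
    volume         : comoving volume of the sample (any consistent unit)
    log_mass_edges : bin edges in log10 of the halo mass
    Returns dn/dlog10(M) in each bin.
    """
    log_mass = np.log10(np.asarray(host_halo_mass, dtype=float))
    counts, edges = np.histogram(log_mass, bins=log_mass_edges)
    return counts / (volume * np.diff(edges))


# Synthetic example: three host halos in a volume of 10 (arbitrary units).
hmf = host_halo_mass_function([2e11, 5e11, 2e12], volume=10.0,
                              log_mass_edges=[11.0, 12.0, 13.0])
```

Comparing this quantity with the mass function of all halos in the same volume shows directly which mass ranges are depleted by the selection.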
