Calibration Uncertainty’s Impact on Gravitational-Wave Observations

Reed Essick Perimeter Institute for Theoretical Physics
31 Caroline Street North, Waterloo, ON N2L 2Y5

Abstract

Our ability to characterize astrophysical Gravitational Waves depends on our understanding of the detectors used to observe them. Specifically, our ability to calibrate current kilometer-scale interferometers can potentially confound the inference of astrophysical signals. Current calibration uncertainties are dominated by systematic errors between the modeled and observed detector response and are well described by a Gaussian process. I exploit this description to analytically examine the impact of calibration uncertainty. I derive closed-form expressions for the conditioned likelihood of the calibration error given the observed data and an astrophysical signal (astrophysical calibration) as well as for the marginal likelihood for the data given a signal (integrated over the calibration uncertainty). I show that calibration uncertainty always reduces search sensitivity and the amount of information available about astrophysical signals. Additionally, calibration uncertainty will fundamentally limit the precision to which loud signals can be constrained, a crucial factor when considering the scientific potential of proposed third-generation interferometers. For example, I estimate that with $1\%$ uncertainty in the detector response’s amplitude and phase, one will only be able to measure the leading-order tidal parameter ( $\tilde{\Lambda}$ ) for a 1.4+1.4 $\,M_{\odot}$ system to better than $\pm 1$ ( $\sim 0.2\%$ relative uncertainty) for signals with signal-to-noise ratios $\gtrsim 10^{4}$ . At this signal-to-noise ratio, calibration uncertainty increases $\sigma_{\tilde{\Lambda}}$ by a factor of $2$ compared to stationary Gaussian noise alone. Furthermore, 1% calibration uncertainty limits the precision to always be $\sigma_{\tilde{\Lambda}}\gtrsim 0.5$ . I also show how to best select the frequencies at which calibration should be precisely constrained in order to minimize the information lost about astrophysical parameters. It is not necessary to constrain the calibration errors to be small at all frequencies to perform precise astrophysical inference for individual signals.

I Introduction

Modern interferometric Gravitational-Wave (GW) detectors [1, 2, 3] provide unprecedented access to fundamental physics on both the largest and smallest scales. Advanced detector technologies probe aspects of quantum interactions with macroscopic (classical) objects [4, 5], and the astrophysical signals detected probe gravity in the strong/dynamical-field regime [6, 7], the distribution of rare stellar remnants, including black holes (BHs) and neutron stars (NSs) [8, 9], the behavior of matter at densities up to several times that of stable nuclei (e.g., Refs. [10, 11]), and many other extreme environments that are otherwise difficult or impossible to access. Much of this success can be attributed to the detectors’ ability to faithfully report the astrophysical strain so that meaningful comparisons to theoretical expectations can be made. The process of predicting how astrophysical signals will appear within the detector is referred to as calibration. It typically proceeds by measuring the opto-mechanical response of the interferometer’s cavities and control loops.

Calibration has long been of general interest in the field. Although many calibration techniques exist, the basic physical measurement is the change in the detector output as a function of the change in the difference between the lengths of the interferometer’s arms. In current interferometric observatories (IFOs), one typically moves the test masses (mirrors) that bracket the Fabry-Pérot cavity in each arm by a known amount at a known frequency and then measures the detector response. Initial techniques involved driving the suspension system, composed of a cascade of pendula, that suspend each test mass and isolate it from many sources of terrestrial noise (see, e.g., Refs. [12, 13] and references therein). Recently, several authors have investigated Newtonian calibrators [14, 15], which drive the test masses through the Newtonian gravitational force exerted by an oscillating mass multipoles (rotating dumbbell). Additionally, several authors have studied the ability of astrophysical signals themselves to calibrate detectors. In this approach, one assumes that the correct theory of gravity is known and compares the detector output to the expected change in length induced by different astrophysical sources. Indeed, there is sustained interest in understanding both how to use astrophysical signals to calibrate our detectors and how uncertainty in calibration can affect our inference of the properties astrophysical signals [16, 17, 18, 19, 20].

Currently, the most successful calibration techniques rely on auxiliary laser systems (photon calibrators or PCALs [21, 22]) to exert a known radiation force (laser with known intensity) on the test masses. PCALs have met with great success, providing precise calibration measurements throughout the advanced detector era [23]. However, through the careful quantification of the detector response enabled by PCAL measurements, it is now understood that systematic modeling uncertainties in the detector response, including variations over time, are typically larger than the uncertainty in the directly-measured parameters of the model for the detector response [23, 13]. These systematic uncertainties are represented with a Gaussian process (GP) and uncertainty envelopes derived therefrom are available at a high cadence [24].

It has become common practice to marginalize over the calibration uncertainty at the time of individual events within inferences of those events’ astrophysical parameters. Several techniques exist, and they differ primarily in how they represent the calibration uncertainty. Traditionally, a cubic spline with knots spaced logarithmically in frequency is used [25]. The marginal prior distributions for the calibration error at each knot is chosen to match the measured calibration envelope at that frequency. However, it is not necessarily the case that this spline-prior is a good representation of the full GP, which includes correlations between frequencies and may not be a slowly varying function of frequency. Several authors have also investigated physical models of calibration uncertainty, even parametrizing the model of the detector response in addition to sampling from the full GP prior for the model systematics [19, 20]. Marginalization over calibration is then performed numerically. While this approach is computationally tractable, it does not offer an immediate interpretation of which aspects of the calibration uncertainty most affect the astrophysical inference or, in turn, which astrophysical signals could most improve our knowledge of the detectors’ calibration.

I show how to analyze GP calibration priors analytically, as the GP prior is conjugate to the GW likelihood under the common assumption of stationary Gaussian noise. This allows me to derive several expressions for the impact of calibration uncertainty on searches for, and the inference of, astrophysical signals. I also examine how astrophysical signals can inform our understanding of calibration in our detectors.

While these expressions should be of general interest, current calibration uncertainties are typically small ( $O(\mathrm{few}\%)$ in amplitude and phase [23, 13]). That is, signals detected with current IFOs are quiet enough that the calibration uncertainty does not matter much [26, 27, 28, 29, 30]. This may not be the case for proposed third-generation (3G) detectors, in which the typical signal-to-noise ratio (S/N) may be larger. I clarify the S/N thresholds above which uncertainty in detector calibration will dominate over uncertainty from stationary Gaussian noise. What’s more, opto-mechanical effects in the detector response for 3G detectors will be much more complicated than in current detectors, as light’s round-trip travel time within the Fabry-Pérot cavities becomes comparable to the frequencies probed in the astrophysical strain [31]. Altogether, then, astrophysical inference of the loudest events detected with 3G detectors will almost certainly be limited by the understanding of the detector responses, and several common approximations for current detector responses will need to be revisited. This fact appears to not have been fully investigated within the 3G literature (see, e.g., discussion in Ref. [32]), and I hope that the analytic expressions presented in this manuscript will enable improved estimates of the scientific capabilities of 3G instruments.

I first summarize a few basic assumptions common in GW data analysis and introduce my notation in Sec. II. I then make use of the GP representation of the calibration uncertainty in Sec. III to obtain analytic expressions for the conditioned distribution for the calibration error given the observed data and an astrophysical signal as well as the marginal distribution for the observed data given an astrophysical signal integrated over uncertainty in detector calibration. Several limiting cases are explored in Sec. III.1, and Sec. III.2 explores extensions to networks of detectors and the joint inference over multiple signals. I discuss the possible impact on searches in Sec. IV. Sec. V discusses the measurability of astrophysical parameters based on the calibration-marginalized likelihood, with particular focus on how much information about the astrophysical parameters is lost due to uncertainty in the detector response. The similarity of calibration errors to waveform uncertainty and deviations from General Relativity (GR) is introduced within Sec. VI. I conclude with a discussion of calibration requirements in Sec. VII.

II Notation and Stationary Gaussian Noise

The likelihood of observing data $d$ in stationary Gaussian noise described by a one-sided power spectral density (PSD) $s$ within a single interferometer in the presence of a signal described by intrinsic parameters $\vartheta$ and extrinsic parameters $\varphi$ is typically written as

\ln p(d|\theta,\varphi,\mathcal{H}_{s})\supset-\frac{1}{2}\left(4\int\limits_{0}^{\infty}\mathrm{d}f\,\frac{|d-h(\vartheta,\varphi)|^{2}}{s}\right)

(1)

where $h=f_{\alpha}(\varphi)h_{\alpha}(\vartheta)$ is the strain projected in the interferometer, $f_{\alpha}$ is the detector response to the $\alpha$ polarization, $h_{\alpha}$ is the waveform in the $\alpha$ polarization, and repeated indices imply summation. In general, the detector response $f_{\alpha}$ is a function of the frequency [31]. The PSD is defined in terms of the noise autocorrelation function as

\left<n(t)n(t+\tau)\right>=\frac{1}{2}\int\mathrm{d}f\,e^{2\pi if\tau}s(f)

(2)

where $i=\sqrt{-1}$ and $\left<\cdot\right>$ denotes an average over noise realizations. $\mathcal{H}_{s}$ denotes the assumption of stationary Gaussian noise with a known PSD. See, e.g. Refs. [33, 34] for more discussion about PSD estimation. Under this assumption, the noise model is a stationary Gaussian process (GP) with (auto)correlations defined by Eq. 2.

However, as IFOs record only discrete data at regularly spaced intervals, it will be convenient to work explicitly with matrices in the following. In most cases, I suppress indices for simplicity. However, for notational clarity, I denote objects that have a single frequency index with lower case letters and objects that have two frequency indices with upper case letters. It will also be convenient to work with only real variables. As such, I explicitly include the real and imaginary parts of data separately so that

	$\displaystyle h_{2i}$	$\displaystyle=\mathcal{R}\{h(f_{i})\}$		(3)
	$\displaystyle h_{2i+1}$	$\displaystyle=\mathcal{I}\{h(f_{i})\}$		(4)

for $i\in\{1,\cdots,N_{\mathrm{frq}}$ }. Therefore, for complex data recorded at $N_{\mathrm{frq}}$ discrete frequencies, vectors have $2N_{\mathrm{frq}}$ real entries and matrices have $4N_{\mathrm{frq}}^{2}$ real elements.

With this notation, I obtain

$\displaystyle\ln$	$\displaystyle p(d\|\theta,\varphi,\mathcal{H}_{s})$
	$\displaystyle=-2\sum_{i}^{N_{\mathrm{frq}}}\left[\Delta f\,\frac{\mathcal{R}\{d(f_{i})-h(f_{i})\}^{2}+\mathcal{I}\{d(f_{i})-h(f_{i})\}^{2}}{s(f_{i})}\right]$
	$\displaystyle\quad\quad\quad\quad\quad\quad-N_{\mathrm{frq}}\ln 2\pi-\sum\limits_{i}^{N_{\mathrm{frq}}}\ln\left(\frac{s(f_{i})}{4\Delta f}\right)$
	$\displaystyle=-\frac{1}{2}(d-h)^{\mathrm{T}}C^{-1}(d-h)$
	$\displaystyle\quad\quad\quad\quad\quad\quad-N_{\mathrm{frq}}\ln 2\pi-\frac{1}{2}\ln\mathrm{det}\|C\|$	(5)

where $C$ is the covariance matrix (expected to be diagonal) with

C_{2i\,2i}=C_{2i+1\,2i+1}\equiv\left(\frac{s(f_{i})}{4\Delta f}\right)

(6)

for stationary Gaussian noise. That is, there are neither correlations between frequencies nor between the real and imaginary parts of the data at each frequency. In general, nonstationary noise and/or windowing functions applied within discrete Fourier transforms may introduce off-diagonal terms within $C$ (see, e.g., Ref. [35]). I ignore these for the purposes of this study, although I do not assume $C$ is diagonal in what follows unless explicitly stated.

Now, if the modeled detector response is incorrect by a multiplicative factor such that $f^{(\mathrm{true})}_{\alpha}=f_{\alpha}(1+\delta)$ , the actual projected strain in the detector will be $h^{(\mathrm{true})}=h(1+\delta)$ . This assumes that the calibration uncertainty affects only the total projected strain in the interferometer and therefore impacts the detector response to each polarization in the same proportion.¹¹1This may not be true in general, though, as approximations to the interferometer’s response may break down differently for the response to each polarization. These errors may also depend on the source’s extrinsic parameters (i.e., relative orientation of the source and the detector). However, these effects are small compared to the GP uncertainty for current detectors, and the approximation that $\delta$ is independent of polarization and $\varphi$ should be reasonable. In general, $\delta$ can be complex and mixes the real and imaginary parts of the projected strain so that

	$\displaystyle\mathcal{R}\{h^{(\mathrm{true})}\}$	$\displaystyle=\mathcal{R}\{h\}+\mathcal{R}\{h\}\mathcal{R}\{\delta\}-\mathcal{I}\{h\}\mathcal{I}\{\delta\}$		(7)
	$\displaystyle\mathcal{I}\{h^{(\mathrm{true})}\}$	$\displaystyle=\mathcal{I}\{h\}+\mathcal{I}\{h\}\mathcal{R}\{\delta\}+\mathcal{R}\{h\}\mathcal{I}\{\delta\}$		(8)

which I express as

h^{(\mathrm{true})}=h+H\delta

(9)

where $H$ is a ( $2\times 2$ block)-diagonal matrix with each block given by

H_{2\times 2}=\begin{bmatrix}\mathcal{R}\{h(f_{i})\}&-\mathcal{I}\{h(f_{i})\}\\ +\mathcal{I}\{h(f_{i})\}&\mathcal{R}\{h(f_{i})\}\end{bmatrix}

(10)

I can then replace $h$ in Eq. 5 with Eq. 9 to obtain the likelihood of observing $d$ in the presence of a true signal $h$ and calibration error $\delta$ .

It is common to parametrize $\delta$ as

1+\delta=(1+\delta A)e^{i\delta\psi}

(11)

where both $\delta A$ and $\delta\psi$ are real. I instead use $\mathcal{R}\{\delta\}$ and $\mathcal{I}\{\delta\}$ because this allows me to marginalize over them analytically. Nonetheless, note that $\delta\approx\delta A+i\delta\psi$ in the limit $\delta A$ , $\delta\psi\ll 1$ , which is often the case. In this limit, then, uncertainty in the amplitude and phase, including correlations between the two, can be immediately translated into uncertainty on the real and imaginary parts of the calibration error.

Often, one has some prior knowledge of $\delta$ (both amplitude and phase) as a function of frequency, which allows one to place a reasonable prior on $\delta$ . This is informed by direct measurements of the detector response and models of the interferometer’s cavities [23, 13]. With such a prior, one can construct a joint distribution for both $d$ and $\delta$ assuming $\delta$ is independent of the noise realization and source properties:

p(d,\delta|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})=p(d|\delta,\vartheta,\varphi,\mathcal{H}_{s})p(\delta|\mathcal{H}_{\delta})

(12)

in which $\mathcal{H}_{\delta}$ denotes the prior assumptions about the calibration uncertainty. This as the starting point for Sec. III.

III Marginal and Conditioned Likelihoods

It has become common practice to model the calibration uncertainties with a GP [23] in order to capture possible systematic errors between the modeled and measured detector responses. These systematic errors typically dominate over the uncertainty in the detector response from the measured parameters of the opto-mechanical model [13], and, as such, I consider a GP prior for $\delta$ with mean $\gamma$ and covariance $\Gamma$ such that

	$\displaystyle\ln p(\delta\|\mathcal{H}_{\delta})$	$\displaystyle=-\frac{1}{2}(\delta-\gamma)^{\mathrm{T}}\Gamma^{-1}(\delta-\gamma)$
		$\displaystyle\quad\quad-N_{\mathrm{frq}}\ln 2\pi-\frac{1}{2}\ln\mathrm{det}\|\Gamma\|$		(13)

This choice is particularly convenient because it allows me to analytically marginalize over calibration uncertainties in Eq. 12. That is, a GP prior for $\mathcal{R}\{\delta\}$ and $\mathcal{I}\{\delta\}$ is conjugate to the likelihood for stationary Gaussian noise.²²2One may be able to approximate more general priors for $\delta$ as sums of GPs. To wit, marginalizing over $\delta$ yields

$\displaystyle\ln p(d\|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})=$	$\displaystyle\ln p(d\|\vartheta,\varphi,\mathcal{H}_{s})$
	$\displaystyle-\frac{1}{2}\gamma^{\mathrm{T}}\Gamma^{-1}\gamma-\frac{1}{2}\ln\mathrm{det}\|\Gamma\|$
	$\displaystyle+\frac{1}{2}\nu^{\mathrm{T}}A^{-1}\nu+\frac{1}{2}\ln\mathrm{det}\|A\|$	(14)

where

\nu=A\left[H^{\mathrm{T}}C^{-1}(d-h)+\Gamma^{-1}\gamma\right]

(15)

is a vector that is linear in the data and

\displaystyle A^{-1}=H^{\mathrm{T}}C^{-1}H+\Gamma^{-1}

(16)

As will be seen in Sec. III.1, limiting cases of the prior knowledge about calibration errors determine the behavior of $A$ . Indeed, a comparison of the two terms in $A$ is essentially a question of whether the typical fluctuation in projected strain from calibration errors ( $\sim|h|^{2}\Gamma$ ) is large compared to the typical fluctuations from Gaussian noise ( $\sim s/4\Delta f$ ).

Eq. III shows that the marginalization can be expressed as a multiplicative factor that modifies the likelihood obtained by assuming perfect calibration. In this way, one can quickly reweigh ( $\vartheta,\varphi$ )-samples drawn assuming perfect calibration under different assumptions about the calibration error post hoc. A similar strategy was investigated via numeric marginalization in Ref. [19].

Similarly, conditioning on $d$ yields

	$\displaystyle\ln p(\delta\|d,\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})=$	$\displaystyle-\frac{1}{2}(\delta-\nu)^{\mathrm{T}}A^{-1}(\delta-\nu)$
		$\displaystyle\quad\quad-N_{\mathrm{frq}}\ln 2\pi-\frac{1}{2}\ln\mathrm{det}\|A\|$		(17)

One can approximate the marginal posterior on $\delta$ as a Monte Carlo sum of GPs such that

p(\delta|d,\mathcal{H}_{s},\mathcal{H}_{\delta})\approx\frac{1}{N_{\mathrm{smp}}}\sum\limits_{k}^{N_{\mathrm{smp}}}p(\delta|d,\vartheta^{(k)},\varphi^{(k)},\mathcal{H}_{s},\mathcal{H}_{\delta})

(18)

with $N_{\mathrm{smp}}$ $(\vartheta^{(k)},\varphi^{(k)})$ -samples drawn from Eq. III. Alternatively, one could draw samples from Eq. 5 and then construct a sum with nontrivial weights using the correction from Eq. III.

At times, it will be more convenient to express Eq. III in terms of a single contraction involving $d$ . This allows me to investigate the effective covariance matrix for $d$ induced by the marginalization over calibration uncertainty. I therefore write

	$\displaystyle\ln p(d\|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})$	$\displaystyle=-\frac{1}{2}(d-h-\mu)^{\mathrm{T}}B^{-1}(d-h-\mu)$
		$\displaystyle\quad\quad-N_{\mathrm{frq}}\ln 2\pi-\frac{1}{2}\ln\mathrm{det}\|B\|$		(19)

with effective (inverse) covariance matrix

B^{-1}=C^{-1}-C^{-1}HAH^{\mathrm{T}}C^{-1}

(21)

and

\mu=BC^{-1}HA\Gamma^{-1}\gamma

(22)

These are my basic results, and I explore their applications in what follows.

III.1 Limiting Cases

I now consider a few limiting cases: when the changes in the observed data due to calibration uncertainties are small (Sec. III.1.1) or large (Sec. III.1.2) and when the correlations between the calibration error at neighboring frequencies are small (Sec. III.1.3) or large (Sec. III.1.4). Throughout, the following approximation is useful

\left(A+X\right)^{-1}\approx A^{-1}-A^{-1}XA^{-1}+A^{-1}XA^{-1}XA^{-1}+\cdots

(23)

which holds when $||X||\ll||A||$ .

III.1.1 Small Deviations: $H\Gamma H^{\mathrm{T}}\ll C$

I first consider the case where the calibration uncertainty is small compared to the signal amplitude. That is, I consider the case where $|h|^{2}\Gamma\ll s/4\Delta f$ , or the change in the projected strain from calibration errors is small compared to the typical noise fluctuation. This is the case for current detectors [23, 13].

When the uncertainty in the calibration is small a priori,

	$\displaystyle\nu$	$\displaystyle\approx\gamma+\Gamma H^{\mathrm{T}}C^{-1}(d-h)$		(24)
	$\displaystyle A$	$\displaystyle\approx\Gamma-\Gamma H^{\mathrm{T}}C^{-1}H\Gamma$		(25)

describing the conditioned uncertainty for $\delta\,|\,d,h$ . While the leading-order change to $A$ is small, note that it acts to decrease the uncertainty in $\delta$ . As such, one gains a small amount of information about the calibration in the presence of a signal even when the detectors are well calibrated a priori.

Furthermore,

	$\displaystyle\mu$	$\displaystyle\approx H\gamma-H\left[\Gamma H^{\mathrm{T}}C^{-1}H\right]^{2}\gamma$		(26)
	$\displaystyle B^{-1}$	$\displaystyle\approx C^{-1}-C^{-1}H\Gamma H^{\mathrm{T}}C^{-1}$		(27)

so that

B\approx C+H\Gamma H^{\mathrm{T}}

(28)

describing the likelihood marginalized over calibration uncertainty. While the correction in the effective covariance matrix is small (does not strongly affect the astrophysical inference [18, 19, 20]), note that it tends to increase the uncertainty by an amount that is proportional to the signal’s amplitude squared. Although this does not hold in general, one can interpret the effective covariance as the sum of the variance from stationary Gaussian noise and independent variance from calibration uncertainty.

These results make sense intuitively, as the calibration errors are constrained to be small a priori and do not add much additional model freedom compared to an analysis that assumes no calibration error.

III.1.2 Large Deviations: $C\ll H\Gamma H^{\mathrm{T}}$

When one does not have strong prior limits on the calibration error in comparison to the projected signal strength in the detector, one does not know how the signal will actually appear. That is, the uncertainty in the projected strain is large compared to typical noise fluctuations: $|h|^{2}\Gamma\gg s/4\Delta f$ . One therefore expects only weak limits on the astrophysical signal. Specifically,

	$\displaystyle\nu$	$\displaystyle\approx H^{-1}(d-h)+\left[\Gamma H^{\mathrm{T}}C^{-1}H\right]^{-1}\gamma$		(29)
	$\displaystyle A$	$\displaystyle\approx\left(H^{\mathrm{T}}C^{-1}H\right)^{-1}-\left[\Gamma H^{\mathrm{T}}C^{-1}H\right]^{-1}\left(H^{\mathrm{T}}C^{-1}H\right)^{-1}$		(30)

and

	$\displaystyle\mu$	$\displaystyle=H\gamma-H\left[\Gamma H^{\mathrm{T}}C^{-1}H\right]^{-2}\gamma$		(31)
	$\displaystyle B$	$\displaystyle\approx H\Gamma H^{\mathrm{T}}+C$		(32)

Note that, again, to leading order $B$ is the sum of two sources of variance. However, in this limit, $B\approx H\Gamma H^{\mathrm{T}}\gg C$ , and the effective covariance matrix is much broader than that of stationary Gaussian noise.

This also means that the uncertainty in the calibration given an astrophysical signal scales inversely with the size of the signal (covariance of $\delta\,|\,d,h$ is $A\sim s/|h|^{2}$ ), and one loses most of their ability to constrain the astrophysical signal (covariance of $d\,|\,h$ is $B\sim|h|^{2}\Gamma$ ). Again, this makes sense. This limit de facto means that one does not know how the astrophysical signal will appear within the detector and therefore cannot constrain it based on the observed data. Indeed, note that

\ln p(d|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})\sim-\frac{1}{2}(d-h)(H^{\mathrm{T}})^{-1}\Gamma^{-1}H^{-1}(d-h)

(33)

and the product of $H$ and $(d-h)$ in each $2\times 2$ block becomes

H^{-1}(d-h)=\frac{1}{\mathcal{R}\{h\}^{2}+\mathcal{I}\{h\}^{2}}\begin{bmatrix}\mathcal{R}\{h\}\mathcal{R}\{d-h\}+\mathcal{I}\{h\}\mathcal{I}\{d-h\}\\ \mathcal{R}\{h\}\mathcal{I}\{d-h\}-\mathcal{I}\{h\}\mathcal{R}\{d-h\}\end{bmatrix}

(34)

which simplifies to $h^{\ast}(d-h)/|h|^{2}=(d/h)-1$ , where $d$ and $h$ are complex scalars (rather than vectors of real numbers) and $(\cdot)^{\ast}$ denotes complex conjugation. As such, with a slight abuse of notation,

\ln p(d|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})\sim-\frac{1}{2}\left(\frac{d}{h}-1\right)^{\mathrm{T}}\Gamma^{-1}\left(\frac{d}{h}-1\right)

(35)

The effective comparison performed by the inference, then, is simply how well the data reproduces the expected signal and how likely the deviations are according to the calibration prior.

III.1.3 Small Correlations between frequencies

When correlations between frequencies are small in both $C$ and $\Gamma$ , the likelihood decomposes into $2\times 2$ blocks. The relevant expressions become sums over frequency, with each summand a contraction of vectors of the real and imaginary data at that frequency. If one makes no assumptions about $\Gamma$ (i.e., real and imaginary parts of $\delta$ may be correlated a priori), then it is difficult to make more progress. However, if one additionally assumes the real and imaginary calibration errors are uncorrelated at each frequency ( $\Gamma$ is diagonal), then for each $2\times 2$ block

A_{2\times 2}=\begin{bmatrix}\left(\frac{4\Delta f|h|^{2}}{s}+\frac{1}{\Gamma_{\mathcal{R}\mathcal{R}}}\right)^{-1}&0\\ 0&\left(\frac{4\Delta f|h|^{2}}{s}+\frac{1}{\Gamma_{\mathcal{I}\mathcal{I}}}\right)^{-1}\end{bmatrix}

(36)

where

\Gamma_{2\times 2}=\begin{bmatrix}\Gamma_{\mathcal{R}\mathcal{R}}&0\\ 0&\Gamma_{\mathcal{I}\mathcal{I}}\end{bmatrix}

(37)

Similarly, the effective covariance matrix becomes

B_{2\times 2}\propto\begin{bmatrix}\frac{s}{4\Delta f}+\mathcal{R}\{h\}^{2}\Gamma_{\mathcal{R}\mathcal{R}}+\mathcal{I}\{h\}^{2}\Gamma_{\mathcal{I}\mathcal{I}}&\Gamma_{\mathcal{I}\mathcal{I}}-\Gamma_{\mathcal{R}\mathcal{R}}\\ \Gamma_{\mathcal{I}\mathcal{I}}-\Gamma_{\mathcal{R}\mathcal{R}}&\frac{s}{4\Delta f}+\mathcal{R}\{h\}^{2}\Gamma_{\mathcal{I}\mathcal{I}}+\mathcal{I}\{h\}^{2}\Gamma_{\mathcal{R}\mathcal{R}}\end{bmatrix}

(38)

If one further specializes to the case where the uncertainty on the real and imaginary parts of the calibration error are equal ( $\Gamma_{\mathcal{R}\mathcal{R}}=\Gamma_{\mathcal{I}\mathcal{I}}=\sigma_{\Gamma}^{2}$ ), the coefficient of proportionality in Eq. 38 is unity and $B$ is diagonal with

B_{2\times 2}=\left(\frac{s}{4\Delta f}+|h|^{2}\sigma_{\Gamma}^{2}\right)\mathbbm{1}_{2\times 2}

(39)

The calibration uncertainty acts to produce extra noise in the detector at each frequency, and that extra noise scales with the size of the signal at that frequency. Note that there is no limit to how much the calibration uncertainty can hurt the inference: $B\rightarrow\infty$ as $\sigma_{\Gamma}\rightarrow\infty$ .

III.1.4 Large Correlations between frequencies

When there are large correlations between the calibration errors at different frequencies a priori, $\Gamma$ may be rank deficient and therefore may not have an inverse. As such, many of our expressions can become unwieldy as they depend on $\Gamma^{-1}$ . In the limit of perfect correlations between frequencies (i.e., frequency-independent calibration errors), this becomes particularly troublesome. This could correspond, for example, to a global calibration amplitude uncertainty induced by errors in the gold-standard integrating sphere [36, 37, 21, 22] used to calibrate all other PCALs in the LIGO detectors. This would be perfectly degenerate with the luminosity distance to sources and could fundamentally limit standard siren cosmology [17, 18].

I investigate this limit by re-expressing the joint distribution (Eq. 12) in terms of a direct sum over frequencies (assuming $C$ is diagonal). To wit,

$\displaystyle\ln p(d,\delta\|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})$	$\displaystyle\supset-\frac{1}{2}\sum_{f}(d_{f}-h_{f}-H_{f}\delta)^{\mathrm{T}}C^{-1}_{ff}(d_{f}-h_{f}-H_{f}\delta)-\frac{1}{2}(\delta-\gamma)^{\mathrm{T}}\Gamma^{-1}(\delta-\gamma)$
	$\displaystyle\supset-\frac{1}{2}\sum_{f}(d_{f}-h_{f})^{\mathrm{T}}C^{-1}_{ff}(d_{f}-h_{f})-\frac{1}{2}\gamma^{\mathrm{T}}\Gamma^{-1}\gamma$
	$\displaystyle\quad\quad\quad\quad-\frac{1}{2}\delta^{\mathrm{T}}\left[\Gamma^{-1}+\sum_{f}H^{\mathrm{T}}_{f}C^{-1}_{ff}H_{f}\right]\delta+\left[\gamma^{\mathrm{T}}\Gamma^{-1}+\sum_{f}(d_{f}-h_{f})^{\mathrm{T}}C^{-1}_{ff}H_{f}\right]\delta$	(40)

This is equivalent to the previous expressions except that $\delta$ and $\gamma$ are vectors with only 2 elements, $\Gamma$ is a $2\times 2$ matrix, and $H$ is a $2N_{\mathrm{frq}}\times 2$ matrix. From this, one can immediately write down the terms analogous to the frequency-dependent expressions. That is, the inverse-covariance between $\mathcal{R}\{\delta\}$ and $\mathcal{I}\{\delta\}$ is a $2\times 2$ matrix

	$\displaystyle A^{-1}_{2\times 2}$	$\displaystyle=\Gamma^{-1}_{2\times 2}+H^{\mathrm{T}}C^{-1}H$
		$\displaystyle=\Gamma^{-1}_{2\times 2}+\rho^{2}\mathbbm{1}_{2\times 2}$		(41)

where³³3Eq. 42 and the second line in Eq. III.1.4 assume $C$ is diagonal. $\rho$ is the optimal S/N for stationary Gaussian noise, defined as

\rho^{2}\equiv h^{\mathrm{T}}C^{-1}h=4\int df\frac{|h|^{2}}{s}

(42)

Note that the calibration uncertainty can be directly compared to the S/N. That is, calibration uncertainties only matter when $\Gamma\gtrsim 1/\rho^{2}$ .

The marginal likelihood for $d\,|\,h$ follows the same form as before with

B^{-1}=C^{-1}-C^{-1}HAH^{\mathrm{T}}C^{-1}

(43)

where, again, $H$ is a $2N_{\mathrm{frq}}\times 2$ matrix and $A$ is a $2\times 2$ matrix. There are potentially large correlations between frequencies in $B$ (the second term in Eq. 43 contains the product of ( $\mathcal{R}\{h\}$ , $\mathcal{I}\{h\}$ ) at different frequencies). That is, the strong correlations between $\delta$ at different frequencies induce strong correlations between $d$ at different frequencies. One can think of this as due to the fact that data at different frequencies witness the same calibration error.

Also note that, when $\Gamma_{2\times 2}=\sigma_{\Gamma}^{2}\mathbbm{1}_{2\times 2}$

A_{2\times 2}=\frac{\sigma^{2}_{\Gamma}}{1+\sigma^{2}_{\Gamma}\rho^{2}}\mathbbm{1}_{2\times 2}

(44)

There is a limit to the size of the second term in Eq. 43 as $\sigma_{\Gamma}\rightarrow\infty$ . What’s more, the overall correction scales as $HAH^{\mathrm{T}}C^{-1}\sim A\rho^{2}$ so that the size of the change in B is independent of both $\sigma_{\Gamma}$ and $\rho$ in the limit where their product is large. That is, even horrific frequency-independent calibration uncertainty only modifies the marginal likelihood by a fixed amount. Therefore, the inference of very loud signals still improves as the signal strength grows ( $B\sim\mathrm{constant}$ and $\ln p\sim|h|^{2}$ ). This is in contrast to independent calibration uncertainty at each frequency (Sec. III.1.3, $B\propto|h|^{2}$ and $\ln p\rightarrow\mathrm{constant}$ as $\sigma_{\Gamma}$ diverges).

III.2 Detector Networks and Multiple Signals

So far, I have considered a single astrophysical signal in a single detector. However, it is straightforward to extend these results to networks of multiple detectors or the simultaneous analysis of multiple signals. Specifically, one can construct a vector of all the data recorded for each event in each detector as

\displaystyle d

\displaystyle=\left.\left.\bigotimes_{A}^{N_{\mathrm{evn}}N_{\mathrm{IFO}}}\right[d_{A}\right]\in\mathcal{R}^{2N_{\mathrm{frq}}N_{\mathrm{evn}}N_{\mathrm{IFO}}}

(45)

with analogous extensions for the signal models (separate astrophysical parameters for each event) and calibration error. The subscript $A$ denotes the combination of detector and event. In a similar way, it is straightforward to generalize, e.g., the prior covariance matrix for calibration errors

\Gamma\rightarrow\begin{bmatrix}\Gamma_{AA}&\cdots&\Gamma_{AN}\\ \vdots&\ddots&\vdots\\ \Gamma_{NA}&\cdots&\Gamma_{NN}\end{bmatrix}

(46)

with the understanding that $\Gamma_{AN}=(\Gamma_{NA})^{\mathrm{T}}$ . This encodes correlations between calibration errors in different detectors and/or between different events (e.g., calibration errors in each detector may be correlated over time). Under the assumption of stationary Gaussian noise that is uncorrelated between detectors, $C_{AB}$ is completely diagonal:

C\rightarrow\begin{bmatrix}C_{AA}&&0\\ &\ddots&\\ 0&&C_{NN}\end{bmatrix}

(47)

With this notation, the corresponding expression for the conditioned and marginal likelihoods are no more complex than for a single IFO. In particular, note that the special case of independent calibration errors in each detector for a single event ( $\Gamma_{AB}$ is block-diagonal) can be handled similarly to the case of calibration errors that are uncorrelated between frequencies. The likelihood becomes a product of separate single-IFO terms. Similarly, perfectly correlated calibration errors in each detector can be handled like the case of perfect correlations between frequencies; one can express the likelihood as a direct sum over detectors and obtain expressions like Eqs. III.1.4 and 43. One can see, then, that the multi-IFO analog of Eq. III.1.4 implies that one can learn more about (perfectly correlated) calibration errors with a network of detectors or multiple events as the covariance of $\delta\,|\,d,h$ depends on terms like the network S/N (summed in quadrature over events) instead of the S/N in a single IFO (for a single event).

Of particular importance is the case of multiple signals observed with the same detector network. One is faced with a choice about how to structure the prior for $\delta$ . That is, if one believes the calibration errors are constant over long periods of time, then one may believe that there are strong correlations between events. Each event sees the same $\delta$ , and one can attempt to constrain that $\delta$ directly. In this case, one computes a posterior for $\delta$ with a single GP prior, obtaining analytic conditional distribution for $\delta\,|\,\{d_{A},h_{A}\}$ .

However, if one believes $\delta$ changes more rapidly than the time between events but nevertheless is drawn from a consistent distribution, then one can try to constrain the distribution of $\delta$ based on the observed catalog of events. In this case, it is the prior for $\delta$ that is common between events, not $\delta$ itself. Instead of constraining just $\delta$ for each event, one constrains $\gamma$ and $\Gamma$ . That is, one can infer the GP from which a realization of $\delta$ is drawn for each event in a hierarchical inference scheme. Furthermore, the normal-inverse-Wishart distribution is a conjugate prior for a multivariate Gaussian with unknown mean and covariance. If one assumes a normal-inverse-Wishart hyperprior for $\gamma$ and $\Gamma$ , the entire hierarchical inference remains analytic and can be expressed in a closed form. That is, one can obtain a posterior distribution for $\gamma,\Gamma\,|\,\{d_{A},h_{A}\}$ analytically.

The differences between these two types of assumptions are common in, e.g., parametrized tests of GR and have been discussed in some detail within that context. I refer the reader to the excellent reviews in Refs. [38, 39] for more discussion.

IV Searches and Sensitivity

A search for a known signal morphology that assumes perfect calibration produces a maximum likelihood estimator for the signal amplitude (denoting the signal $h=a\tilde{h}$ )

	$\displaystyle\hat{a}_{\sigma_{\Gamma}=0}$	$\displaystyle=\frac{d^{\mathrm{T}}C^{-1}\tilde{h}}{\tilde{h}^{\mathrm{T}}C^{-1}\tilde{h}}$
		$\displaystyle=\left(4\int df\frac{\mathcal{R}\{d^{\ast}\tilde{h}\}}{s}\right)\left(4\int df\,\frac{\|\tilde{h}\|^{2}}{s}\right)^{-1}$		(48)

which is an unbiased estimator with unit variance under the common convention $\tilde{h}^{\mathrm{T}}C^{-1}\tilde{h}=1$ . When $\tilde{h}$ is normalized in this way, the estimator is often written as

\hat{a}_{\sigma_{\Gamma}=0}=\frac{d^{\mathrm{T}}C^{-1}\tilde{h}}{\sqrt{\tilde{h}^{\mathrm{T}}C^{-1}\tilde{h}}}

(49)

This is the normal definition of the matched-filter S/N. I examine the behavior of this statistic in the presence of calibration uncertainty.

Specifically, when $\gamma=0$ ,

$\displaystyle E[\hat{a}_{\sigma_{\Gamma}=0}]$	$\displaystyle=a_{\mathrm{true}}$	(50)
$\displaystyle\mathrm{Var}[\hat{a}_{\sigma_{\Gamma}=0}]$	$\displaystyle=E[(\hat{a}_{\sigma_{\Gamma}=0}-E[\hat{a}_{\sigma_{\Gamma}=0}])^{2}]$
	$\displaystyle=\frac{\tilde{h}^{\mathrm{T}}C^{-1}BC^{-1}\tilde{h}}{\tilde{h}^{\mathrm{T}}C^{-1}\tilde{h}}>1$	(51)

That is, the mean S/N is not affected for zero-mean calibration priors, but the variance is larger ( $B\geq C$ ). That is, I assume $C$ is the covariance of $d$ in the absence of a signal. I assume this is known (measured) and is independnet of $\delta$ (Eq. 5).⁴⁴4Ref. [40], instead, suggests the measured $C$ will depend on $\delta$ . This means one must include $C$ ’s dependence on $\delta$ when computing moments of $\hat{a}_{\sigma_{\Gamma}=0}$ . They find a cancellation between the numerator and denominator of Eq. 49 at leading order in $\gamma$ when computing the expected value. In the context of their analysis, my result would hold if one performed the inference directly on (any deterministic function of) the raw detector output. Nonetheless, although not shown explicitly, their analysis suggests $\mathrm{Var}[\hat{a}_{\sigma_{\Gamma}=0}]\geq 1$ as well, which is my main result. This implies that, for a given observed S/N, the probability that noise alone (combined with calibration uncertainty) could have produced that statistic is higher. Therefore, the statistical significance of that S/N is lower. If one considers the the ratio of the estimator to its standard deviation as a measure of significance (a classical $z$ -test), then $\hat{a}_{\mathrm{naive}}$ will produce strictly smaller $z$ than one would expect without calibration uncertainty. This makes sense intuitively; it should be harder to detect signal with a matched filter in the presence of calibration uncertainty.

Similar effects occur for unmodeled searches. The variance of the naive search statistic (maximized likelihood) is increased under the noise hypothesis in the presence of calibration uncertainty. Louder signals are needed to stand out against the background.

It is noteworthy that modern GW searches do not estimate the false alarm probability based on the assumption of stationary Gaussian noise. Instead, they employ a variety of bootstrap procedures (see, e.g., Refs. [41, 42, 43, 44, 45, 46]). De facto, then, there is some threshold above which the observed S/N must fall to be considered a confident detection. In this context, the broader variance of the naive S/N statistic in the presence of nonzero calibration uncertainty still hampers detectability. This is because the true S/N (sometimes called the optimal S/N, defined in Eq. 42) must be systematically larger than the threshold in order for the observed S/N to be above the threshold with high probability. That is, the optimal S/N must be shifted above the threshold by a larger amount when $\Gamma$ is nonzero because the variance of S/N is larger. Larger optimal S/N imply closer sources, and therefore the space-time volume to which detectors are sensitive is smaller.

Finally, one could derive a maximum likelihood estimator for the signal amplitude directly from $\ln p(d|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})$ . However, the corresponding expressions quickly become intractable and may be computationally prohibitive. As such, it is expected that they will be of little practical use.

V Calibration-Marginalized Fisher Information Matrix

In Sec. III, I obtained general expressions for the marginal likelihood given prior uncertainty on the detector calibration. I now investigate how the presence of imperfect calibration uncertainty can confound the inference of astrophysical properties in more detail. Specifically, I consider the Fisher information matrix for astrophysical parameters derived from the calibration-marginalized likelihood. In particular, I contrast this to what one might naively expect under the assumption that calibration errors were known to be absent.

The Fisher information matrix element between parameters $\vartheta_{a}$ and $\vartheta_{b}$ is defined as

I_{ab}\equiv\int\mathcal{D}d\,p(d|\vartheta)\left(\frac{\partial\ln p(d|\vartheta)}{\partial\vartheta_{a}}\right)\left(\frac{\partial\ln p(d|\vartheta)}{\partial\vartheta_{b}}\right)

(52)

and is inversely related to the lower bound on the covariance between these variables (Cramér-Rao bound). That is, larger $I_{ab}$ imply smaller covariances and vice versa. In the special case where $p(d|\vartheta)$ is Gaussian

\ln p(d|\vartheta)\supset-\frac{1}{2}(d-h-\mu)^{\mathrm{T}}B^{-1}(d-h-\mu)\\ -\frac{1}{2}\ln\mathrm{det}|B|

(53)

the Fisher information matrix becomes

I_{ab}=\left(\frac{\partial(h+\mu)}{\partial\vartheta_{a}}\right)^{\mathrm{T}}B^{-1}\left(\frac{\partial(h+\mu)}{\partial\vartheta_{b}}\right)\\ +\frac{1}{2}\mathrm{tr}\left[B^{-1}\frac{\partial B}{\partial\vartheta_{a}}B^{-1}\frac{\partial B}{\partial\vartheta_{b}}\right]

(54)

The general expression can be quite complicated, and I do not attempt to describe it in detail here. Instead, I again focus on limiting cases: when the calibration uncertainty is small and when it is large. In all cases, I assume zero-mean prior processes ( $\gamma=\mu=0$ ) for simplicity.

For reference, in the naive case ( $\Gamma=0$ )

	$\displaystyle I^{(\mathrm{naive})}_{ab}$	$\displaystyle=(\partial_{a}h)C^{-1}(\partial_{b}h)$
		$\displaystyle=4\int df\frac{\mathcal{R}\{(\partial_{a}h)^{\ast}(\partial_{b}h)\}}{s}$		(55)

V.1 Calibration uncertainties are small

When calibration uncertainties are small and the mean of the prior process is zero,

I_{ab}\approx(\partial_{a}h)^{\mathrm{T}}\left(C^{-1}-C^{-1}H\Gamma H^{\mathrm{T}}C^{-1}\right)(\partial_{b}h)\\ +\frac{1}{2}\mathrm{tr}\left[C^{-1}(\partial_{a}(H\Gamma H^{\mathrm{T}}))C^{-1}(\partial_{b}(H\Gamma H^{\mathrm{T}}))\right]

(56)

The leading-order correction (in $\Gamma$ ) to the naive result is

	$\displaystyle\Delta I_{ab}$	$\displaystyle\equiv I_{ab}-I^{(\mathrm{naive})}_{ab}$
		$\displaystyle=-(\partial_{a}h)^{\mathrm{T}}C^{-1}H\Gamma H^{\mathrm{T}}C^{-1}(\partial_{b}h)$		(57)

This term is negative semidefinite when $a=b$ because $\Gamma$ is positive semidefinite. As such, $I_{aa}$ is always smaller than the naive estimate, implying the variance of $\vartheta_{a}$ will always be larger than in the naive estimate. This makes sense. Imperfect knowledge of the detector’s calibration should always worsen the precision to which one can measure astrophysical parameters.

This extra uncertainty is just an inner product of the derivatives of the signal model. Further note that, assuming $C$ is diagonal, one can write the $2\times 2$ blocks

H^{\mathrm{T}}C^{-1}\partial_{b}h=\frac{4\Delta f}{s}\begin{bmatrix}\mathcal{R}\{h\}\partial_{a}\mathcal{R}\{h\}+\mathcal{I}\{h\}\partial_{a}\mathcal{I}\{h\}\\ \mathcal{I}\{h\}\partial_{a}\mathcal{R}\{h\}-\mathcal{R}\{h\}\partial_{a}\mathcal{I}\{h\}\end{bmatrix}

(58)

which is simply $4\Delta f(h^{\ast}\partial_{a}h)/s$ (again, reverting to complex scalars instead of real vectors). The change in $I_{aa}$ is simply the projection of $(h^{\ast}\partial_{a}h/s)$ onto the principle components of $\Gamma$ . In this way, $\Gamma$ with large Frobenius norm and/or large marginal uncertainty at a particular frequency may still correspond to small changes in the measurability of certain astrophysical parameters because of the correlations between frequencies. That is, $h^{\ast}\partial_{a}h/s$ may match eigenvectors of $\Gamma$ with small eigenvalues. However, if the Frobenius norm is small, then $|\Delta I_{ab}|$ will also be small. Therefore, a small Frobenius norm is a sufficient condition for precise inference, but it may not be necessary.

One may also ask, given the current state of uncertainty, which frequencies should be measured more precisely to minimize the amount of information lost to calibration uncertainty within the astrophysical inference. That is, minimize $|\Delta I_{ab}|$ by additionally constraining $\delta$ at a single frequency. If one measures $\delta$ precisely at a single frequency, then the calibration uncertainty is modified so that $\Gamma\rightarrow\Gamma^{\prime}(f)=(\Gamma^{-1}+Z^{-1}(f))^{-1}$ where $Z$ is diagonal with all matrix elements equal to zero except for at one frequency, where $Z$ is small. That is, $Z$ represents the covariance from the measurement of $\delta$ at a single frequency.

Using this updated uncertainty, one can then pose the question as follows:

f_{\mathrm{opt}}=\operatorname*{argmin}_{f}\left\{\left(\partial_{a}h\right)^{\mathrm{T}}C^{-1}H\,\Gamma^{\prime}(f)\,H^{\mathrm{T}}C^{-1}\partial_{b}h\right\}

(59)

In general, it is difficult to make analytic progress with this expression. However, it should be relatively simple to implement numerically. That is, one can perform a direct search over frequencies, simulating the effect of a precise measurement for each frequency by inserting the appropriate $Z(f)$ . Indeed, this numeric procedure can be extended to an arbitrary number of hypothetical calibration measurements. One can simply define $Z(f_{i})$ for each hypothetical measurement and include the sum within Eq. 59. While the computational cost of a direct search grows quickly with the number of frequency measurements, in practice this may be limited to only a handful of frequencies.

As an example, consider diagonal $C$ and $\Gamma$ . In this case

|\Delta I_{aa}|=\sum\limits_{f}\frac{|h_{f}^{\ast}\partial_{a}h_{f}|^{2}}{C_{f}^{2}}\left(\frac{Z_{f}\Gamma_{f}}{Z_{f}+\Gamma_{f}}\right)

(60)

Now, in the neighborhood of the new (precise) observation, $Z\rightarrow 0$ . Therefore, precise observations of $\delta$ act as a filter that zeros out that neighborhood within the integral that determines $|\Delta I_{aa}|$ . By inspection, then, assuming $\Gamma$ is the same at all frequencies

f_{\mathrm{opt}}=\operatorname*{argmax}_{f}\left\{\left(\frac{|h|^{2}}{s}\right)\left(\frac{|\partial_{a}h|^{2}}{s}\right)\right\}

(61)

That is, one should measure the calibration at the frequency at which the current calibration uncertainty most hurts the inference. This is a combination of the rate at which information is gained ( $|\partial_{a}h|^{2}/s$ ) and when the signal (and therefore calibration uncertainty’s impact on the observed data) is large compared to Gaussian noise ( $|h|^{2}/s$ ). However, when there are correlations between different frequencies encoded within $\Gamma$ , the selection of $f_{\mathrm{opt}}$ may be more complicated.

Note that, in general, $f_{\mathrm{opt}}$ will depend on both $h$ and the parameter of interest. A broader optimization procedure should consider the ability to constrain multiple astrophysical parameters simultaneously for many different possible signals. For example, it may be most advantageous to constrain $\delta$ at relatively high frequencies in order to minimize calibration uncertainty’s impact on the measurement of tidal deformability in binary neutron star (BNS) mergers. However, those frequencies may be above the maximum frequency produced in binary black hole (BBH) mergers. Additional calibration measurements at high frequencies, which most benefit BNSs, may not improve astrophysical inference for BBHs (barring strong inter-frequency correlations in $\Gamma$ ). As such, while a small Frobenius norm for $\Gamma$ is not necessary for small $|\Delta I_{ab}|$ for any individual signal, it may be the simplest way to guarantee small $|\Delta I_{ab}|$ for all signals.

V.2 Calibration uncertainties are large

In the limit of large calibration uncertainties and $\gamma=0$ , $B\sim H\Gamma H^{\mathrm{T}}\sim|h|^{2}\Gamma$ . Eq. 54 immediately shows that the uncertainty in $\vartheta$ does not depend on the signal amplitude. That is, louder signals do not imply tighter constraints on $\vartheta$ . In particular, consider the (re)parametrization of $h$ from Sec. IV ( $h=a\tilde{h}$ ). For any parameter besides $a$ , all factors of $a$ cancel. For $a$ itself, $I_{aa}\sim a^{-2}$ and $\mathrm{Var}(a)\propto a^{2}$ , implying a constant fractional uncertainty in $a$ .

Contrast this with the naive estimate that neglects calibration errors, where one expects $I_{ab}\sim|h|^{2}$ . The naive estimate suggests exceptionally well measured parameters in the limit of loud signals. This cannot be the case in any physically realizable detector, as it is impossible to perfectly constrain the detector response. Indeed, the smallest principle value of $\Gamma$ sets a lower limit on how well individual astrophysical parameters can be measured.⁵⁵5If there are prefect correlations between frequencies, as in Sec. III.1.4, then the smallest eigenvalue of $\Gamma$ is zero. I revisit this point in Sec. VII

VI Waveform Uncertainty and Deviations from General Relativity

Finally, as has been pointed out previously (e.g., Refs. [40, 26, 47]), deviations in the waveform $h_{\alpha}\rightarrow h_{\alpha}(1+\delta)$ produce analogous terms in the likelihood: $h=f_{\alpha}h_{\alpha}\rightarrow h(1+\delta)$ . If one can express the prior uncertainty on waveform errors or possible deviations from GR in terms of a GP, then the expressions derived in the context of calibration uncertainty, to a large part, hold unaltered. The interpretation of $\gamma$ and $\Gamma$ changes, but the mechanics of how to constrain them and how they might affect the inference remain the same. Just as one is able to analytically integrate out calibration uncertainties to obtain a marginal likelihood for the data, one can marginalize over uncertainty in the astrophysical signal to obtain the same. In this sense, the waveform constitutes a probabilistic map between a system’s properties (masses, spins, etc) and the expected signal from the true theory of gravity. Of particular interest may be hierarchical inference for the distribution of waveform errors assuming these errors are independently and identically distributed draws from the same GP prior for each event. This would be, in effect, a nonparametric extension to the parametrized hierarchical analysis proposed in Ref. [39] and used extensively within Refs. [6, 7].

A similar approach was explored in Ref. [48] using cubic splines to parametrize deviations from GR, although they did not perform hierarchical inference and only analyzed events separately. While splines are related to GPs (see Chapter 6 in Ref. [49]), a direct formulation of the problem in terms of GPs will likely allow for more control over prior assumptions and improved interpretability of the resulting constraints.

Although I leave a thorough investigation of the implications of GP priors for waveform errors and deviations from GR to future work, I note a few things. In this case, one expects perfect correlations in $\delta$ between IFOs (all detectors witness the same astrophysical signal) and the deviations to be drawn from the same distribution for all events (the signal of each event is determined by the same underlying theory of gravity). However, one may need to allow for separate deviations for each polarization ( $\delta_{\alpha}$ may be different for each $\alpha$ ), mixing between polarizations ( $\delta_{\alpha}$ may depend on $h_{\beta}$ ), and these deviations may not be best expressed as fractional changes in the waveform ( $\delta_{\alpha}$ may be nonzero even when $h_{\alpha}=0\ \forall\ \alpha$ ).

As a final note, one may consider GP priors for both calibration ( $\delta_{f}$ ) and waveform ( $\delta_{h}$ ) uncertainty simultaneously. In this case, the projected strain in the detectors appears as $h=f_{\alpha}h_{\alpha}\rightarrow h(1+\delta_{f})(1+\delta_{h})$ . The likelihood will contain a term like $|h\delta_{f}\delta_{h}|^{2}$ , which may stymie analytic marginalization over both $\delta_{f}$ and $\delta_{h}$ , although one should still be able to analytically marginalize over one of them. Again, I leave a full investigation to future work.

VII Discussion

Table 1: Estimates of the precision of individual parameters for a 1.4+1.4

M_{\odot}

binary when

l_{\Gamma}=0

with projected design sensitivities of future IFOs. I show the uncertainty assuming no calibration error (

\sigma_{\Gamma}\rightarrow 0

) at a reference S/N of 10 as well as the lower limit set by nonzero calibration uncertainty. The lower limits scale linearly with the calibration uncertainty (and the square root of the frequency spacing), and I quote reference values for

\sigma_{\Gamma}\sqrt{\Delta f}=0.01

. In addition to the uncertainty in the coalescence time (

t_{c}

), I approximate the corresponding uncertainty in the polar angle (localization precision) between the LIGO Hanford (LHO) and LLO detectors from triangulation (

\sigma_{\theta_{HL}}\approx\sqrt{2}(\sigma_{t_{c}}/10\,\mathrm{ms})

) [50].

IFO	quantity	$\mathcal{M}\,[M_{\odot}]$	$\eta$	$\tilde{\Lambda}$	$t_{c}\,[\mathrm{ms}]$	$\theta_{HL}$
aLIGO design [51]	$\left(\rho/10\right)\lim\limits_{\Gamma\rightarrow 0}\sigma_{i}$	$1.06\cdot 10^{-4}$	$3.17\cdot 10^{-3}$	$2.61\cdot 10^{2}$	$3.14\cdot 10^{-1}$	$2.5^{\circ}$
Cosmic Explorer [52]		$7.46\cdot 10^{-5}$	$2.24\cdot 10^{-3}$	$5.09\cdot 10^{+2}$	$5.66\cdot 10^{-1}$	$4.6^{\circ}$
$f\in[5,1500]\,\mathrm{Hz}$	$\left(0.01/\sigma_{\Gamma}\sqrt{\Delta f}\right)\lim\limits_{\rho\rightarrow\infty}\sigma_{i}$	$1.44\cdot 10^{-6}$	$4.31\cdot 10^{-5}$	$4.27\cdot 10^{-1}$	$1.03\cdot 10^{-3}$	$30.0^{\prime\prime}$

GW astrophysics has rapidly grown over the last few years with an ever increasing number of confidently detected compact binary coalescences [27, 28, 29, 30] and steadily improved upper limits on the rates of other sources (see, e.g., [53, 54]). This has been enabled in no small part by the remarkable understanding of the current interferometers’ responses to astrophysical strain. Indeed, in addition to providing accurate calibration for each interferometer in the network, analysts are able to carefully quantify the precision of their calibration estimates through GPs, and the incorporation of this information within astrophysical inference has become commonplace within the literature.

This manuscript exploits the fact that the dominant calibration uncertainties are described by GPs to analytically marginalize over them. Although several authors have already demonstrated numeric marginalization techniques, this analytic approach offers several advantages.

First, it may be computationally faster in practice. That is, it may be faster to compute a single inner product than to directly Monte Carlo sample over the GP calibration prior. The limiting step will likely be the inversion of $\Gamma$ and $A^{-1}=H^{\mathrm{T}}C^{-1}H+\Gamma^{-1}$ required to compute $B^{-1}$ for each signal model. This step may scale as $O(N_{\mathrm{frq}}^{3})$ . Monte Carlo integration could be faster if very few Monte Carlo samples are needed for the integral to converge. This will likely be the case only when the calibration prior is already so precise that it is not changed significantly by the addition of astrophysical data (the current situation for all signals detected to date). That is, direct numeric integration may only be faster when the calibration uncertainties are too small to make much of a difference anyway. At the same time, one may also be able to approximate $A\approx\Gamma$ in this limit, removing the need to invert $\Gamma$ and $A$ and thereby reducing the computational cost from $O(N_{\mathrm{frq}}^{3})$ to $O(N_{\mathrm{frq}}^{2})$ .

What’s more, analytic expressions allow for greater interpretability of the resulting conditioned and marginal distributions. For example, I am able to easily take several limits and explore the impact of large/small calibration uncertainties and correlations. I also show how to trivially extend the formalism to include networks of multiple detectors and multiple astrophysical signals, including possible correlations between the calibration errors witnessed in each detector/event pair. Equivalent investigations via numeric integration are possible, but may be more difficult to interpret and more costly to perform.

Specifically, I find that calibration uncertainties always reduce the measurability of astrophysical parameters. This can be seen in several ways. I explore the impact of calibration uncertainties on search sensitivity through the behavior of the standard matched-filter S/N, finding larger variance in the presence of calibration uncertainty and, therefore, reduced search sensitivity. Monte Carlo simulations of search sensitivity [55, 56, 57] may be able to exploit this analytic marginalization to better account for calibration uncertainties without recourse to more direct numerical treatments. However, perhaps the most striking example can be seen through the change in the Fisher information matrix elements.

The loss of information is a projection of the signal onto the calibration uncertainty’s prior covariance matrix. That is, the additional uncertainty in astrophysical parameters is directly related to whether the types of changes in the waveform induced by changing those parameters are common calibration errors. Additionally, it is not necessary to constrain the calibration uncertainty to be small at all frequencies for accurate inferences of individual signals. Instead, one need only constrain particular combinations of frequencies in order to minimize the information lost. This also suggests an algorithm to determine the frequencies at which one should measure the calibration more precisely in order to best reduce the uncertainty in astrophysical parameters. As a rule of thumb, these are (unsurprisingly) the frequencies at which the current calibration uncertainty most hurts the inference.

When calibration errors are large (and correlations between frequencies are not perfect), the uncertainty in astrophysical parameters becomes independent of the signal amplitude. The parameters of louder signals are not better constrained than the parameters of quieter signals. This sets a limit on the precision to which the exceptionally loud astrophysical signals can be measured. The relevant comparison is whether the S/N is greater or smaller than the inverse of the calibration error’s standard deviation ( $\rho\lessgtr\sigma_{\Gamma}^{-1}$ ). Including an approximate proportionality constant, I find that $1.4+1.4\,M_{\odot}$ binaries with $\rho\gtrsim 10^{3}$ require calibration uncertainties $\lesssim 3\%$ .

VII.1 Estimates of calibration requirements for current and proposed detectors

Refer to caption — Figure 1: (*top*) Estimates of the posterior uncertainty in $\tilde{\Lambda}$ and (*bottom*) ratios of uncertainty with and without calibration errors for a 1.4+1.4 $M_{\odot}$ system as a function of S/N with (*left*) aLIGO design sensitivity [51] and (*right*) a reference Cosmic Explorer design [52]. I show the limit $\l_{\Gamma}=0$ with (*dark to light*) $\sigma_{\Gamma}\sqrt{\Delta f}=10\%$ , 3%, 1%, 0.3%, and 0.1%. Due to Cosmic Explorer’s higher sensitivity at lower frequencies, observations with Cosmic Explorer constrain $\tilde{\Lambda}$ less than an equivalent observation (same S/N) with aLIGO. However, the distribution of detected sources expected for Cosmic Explorer extends to higher S/N than for aLIGO. I find that, for both detectors, one requires $\rho\gtrsim 10^{4}$ to obtain $\sigma_{\tilde{\Lambda}}\lesssim 1$ with reasonable calibration uncertainty ( $\sigma_{\Gamma}\sqrt{\Delta f}\gtrsim 1\%$ ), and, at this S/N, the contribution to $\sigma_{\tilde{\Lambda}}$ from calibration uncertainty is at least as large as the contribution from stationary Gaussian noise (ratio of uncertainty $\gtrsim 2$ ).

Although a full description of the impact of all possible calibration priors on astrophysical inference with all possible detectors is beyond the scope of this study, I nonetheless attempt to provide some quantitative estimates of a few limiting cases for a few detectors. Specifically, I consider independent calibration uncertainty for the real and imaginary parts of the calibration error (uncorrelated amplitude and phase errors), each of which is described by a zero-mean squared-exponential GP with the same hyperparameters:

$\displaystyle\mathrm{Cov}(\mathcal{R}\{\delta(f)\},\mathcal{R}\{\delta(f^{\prime})\})$	$\displaystyle=\mathrm{Cov}(\mathcal{I}\{\delta(f)\},\mathcal{I}\{\delta(f^{\prime})\})$
	$\displaystyle=\sigma_{\Gamma}^{2}\exp\left(-\frac{(f-f^{\prime})^{2}}{l_{\Gamma}^{2}}\right)$	(62)
$\displaystyle\mathrm{Cov}(\mathcal{R}\{\delta(f)\},\mathcal{I}\{\delta(f^{\prime})\})$	$\displaystyle=0\ \forall\ f,f^{\prime}$	(63)

With this definition of calibration uncertainty, I then estimate the expected marginal uncertainty in a few astrophysical parameters by computing the Fisher information matrix. As shown in the Appendix, the limit $l_{\Gamma}\rightarrow 0$ corresponds to a local maximum in the uncertainty in astrophysical parameters. I therefore focus on this limit as a worst case scenario given a marginal uncertainty at each frequency (fixed $\sigma_{\Gamma}$ ). Nonzero (but finite) $l_{\Gamma}$ may reduce astrophysical uncertainty by a factor of a few.

Furthermore, I consider a $1.4+1.4\,M_{\odot}$ non-spinning BNS system with each component’s tidal deformability $\Lambda=500$ . I approximate the frequency data as a regularly spaced array from $5\,\mathrm{Hz}$ to $1500\,\mathrm{Hz}$ , with a frequency spacing of $\Delta f=1\,\mathrm{Hz}$ . My waveform model assumes the astrophysical parameters only affect the signal’s phase, and I describe the phase in the frequency domain with a post-Newtonian (PN) sum. That is, I model the waveform following Ref. [58]

h=af^{-7/6}e^{i\psi(f)}

(64)

where

	$\displaystyle\psi(f)$	$\displaystyle=2\pi ft_{c}-\frac{117}{256}\tilde{\Lambda}\frac{v^{5}}{\eta}$
		$\displaystyle\quad+\frac{3}{128\eta v^{5}}\left[1+\frac{20}{9}\left(\frac{743}{336}+\frac{11}{4}\eta\right)v^{2}-4(4\pi-\beta)v^{3}\right.$
		$\displaystyle\quad\quad\quad\quad\left.+10\left(\frac{3058673}{1016064}+\frac{5429}{1008}\eta+\frac{617}{144}\eta^{2}-\sigma\right)v^{4}\right]$

and

	$\displaystyle M_{\mathrm{tot}}=m_{1}+m_{2}$		(66)
	$\displaystyle\eta=\frac{m_{1}m_{2}}{M_{\mathrm{tot}}^{2}}$		(67)
	$\displaystyle\mathcal{M}=\eta^{3/5}M_{\mathrm{tot}}$		(68)
	$\displaystyle v=(\pi M_{\mathrm{tot}}f)^{1/3}$		(69)
	$\displaystyle\tilde{\Lambda}=\frac{16}{13}\left(\frac{(m_{1}+12m_{2})m_{1}^{4}\Lambda_{1}}{M_{\mathrm{tot}}^{5}}+1\leftrightarrow 2\right)$		(70)

$\beta$ and $\sigma$ are spin parameters. Although more accurate waveforms are available, the simplicity of this model’s phasing (and derivatives thereof) make it well-suited to my particular estimation task. I estimate the uncertainty for $\mathcal{M}$ , $\eta$ , $\beta$ , $\sigma$ , $\tilde{\Lambda}$ , and $t_{c}$ via the inverse of the Fisher information matrix. In the limit $l_{\Gamma}=0$ and assuming the parameters only affect the waveform phase, the Fisher matrix elements are

I_{ab}^{(l_{\Gamma}=0)}=4\int df\,\left(\frac{\partial_{a}\psi\partial_{b}\psi|h|^{2}}{s}\right)\left(1+\frac{4\Delta f\sigma_{\Gamma}^{2}|h|^{2}}{s}\right)^{-1}

(71)

Additionally, I apply Gaussian priors for the two spin parameters $\beta$ and $\sigma$ with standard deviations of 0.05 (small spins) and for the symmetric mass ratio with a standard deviation of 0.25 (physical limit). These priors are centered on the injected values. No priors were applied for the $\mathcal{M}$ , $\tilde{\Lambda}$ , or $t_{c}$ . This setup approximately reproduces the uncertainty reported for GW170817 [59] at the corresponding S/N.

Given the various approximations within my calculation, in addition to the standard caveats about the reliability of the Fisher matrix [60], I must stress that these calibration requirements should be interpreted as informed estimates rather than rigorous bounds. Nonetheless, I present an analysis of expected uncertainties assuming aLIGO design sensitivity [1, 51] in Sec. VII.1.1 and of a nominal Cosmic Explorer design [61, 52, 32, 62] in Sec. VII.1.2. The Appendix describes these results in more detail.

VII.1.1 Forecasts for aLIGO design sensitivity

Table 1 presents several estimates of the the expected uncertainty for a $1.4+1.4\,M_{\odot}$ binary in two limits of calibration uncertainty. First, it shows the limit of zero calibration uncertainty ( $\sigma_{\Gamma}=0$ ), in which case the uncertainty scales inversely with S/N. At a nominal S/N of 10 (approximate detection threshold), I find $\sigma_{\mathcal{M}}\sim O(10^{-4}\,M_{\odot})$ and $\sigma_{\tilde{\Lambda}}\sim\mathrm{few}\cdot 10^{2}$ .

Table 1 also shows the lower limit in the uncertainty in astrophysical parameters set by the calibration uncertainty when $l_{\Gamma}=0$ . This scales with the calibration uncertainty ( $\sigma_{\Gamma}\sqrt{\Delta f}$ , see Appendix), and I quote values at nominal 1% calibration uncertainty. A back-of-the-envelope estimate for the S/N at which calibration uncertainty becomes important can be obtained by equating the two cases. For example, this yields

\rho=6110\left(\frac{0.01}{\sigma_{\Gamma}\sqrt{\Delta f}}\right)

(72)

for $\tilde{\Lambda}$ with aLIGO.

As a general rule, the posterior uncertainty in $\mathcal{M}$ and $\eta$ become limited by calibration uncertainty at lower S/N than $\tilde{\Lambda}$ . The equivalent of Eq. 72 for $\mathcal{M}$ has a coefficient of 736 (nearly an order of magnitude smaller). However, the mass parameters tend to be so well measured anyway that the limits imposed by calibration uncertainty are too small to be of practical importance. I therefore focus on the leading order tidal parameter ( $\tilde{\Lambda}$ ) when quoting approximate calibration requirements. For example, I expect to never measure $\tilde{\Lambda}$ to better than $\pm 0.5$ when $\sigma_{\Gamma}\sqrt{\Delta f}=1\%$ . While this is still exceptional precision ( $\sim 0.1\%$ relative uncertainty [10]), it is only reached in the limit of extremely loud signals: $\rho>10^{5}$ , implying a Galactic merger $O(10\,\mathrm{kpc})$ away.

Figure 1 builds upon this by showing the scaling between $\sigma_{\tilde{\Lambda}}$ and S/N for various $\sigma_{\Gamma}$ . From these curves, one can read off the S/N required to reach a particular level of precision if one knows the calibration uncertainty. Alternatively, if one knows the loudest S/N that is likely to be observed, they can read off the level of calibration uncertainty that can be tolerated before it dominates over uncertainty from the stationary Gaussian noise. The lower panels in Fig. 1 show this explicitly as the ratio of $\sigma_{\tilde{\Lambda}}$ marginalized over calibration uncertainty to $\sigma_{\tilde{\Lambda}}$ assuming perfect calibration. As a rule of thumb, one might require the calibration uncertainty to add no more than 50% more relative uncertainty compared to stationary Gaussian noise alone. With this criterion, we see that $\sigma_{\Gamma}\sqrt{\Delta f}\lesssim 10\%$ suffices for $\tilde{\Lambda}$ when $\rho\leq 10^{3}$ . Given the expected S/N distribution for aLIGO ( $p(\rho)\propto\rho^{-4}$ ) and an approximate detection threshold of 10, this implies that $\sigma_{\Gamma}\sqrt{\Delta f}\lesssim 10\%$ will produce less than 50% additional uncertainty for all but 1 in $10^{6}$ detected signals.

VII.1.2 Forecasts for Cosmic Explorer

Table 1 and Fig. 1 show that the behavior (as a function of S/N) is remarkably similar for both aLIGO design sensitivity and Cosmic Explorer. The main difference is that Cosmic Explorer is has more sensitivity at lower frequencies and therefore measures $\mathcal{M}$ to higher precision and $\tilde{\Lambda}$ to lower precision compared to similar S/N signals in aLIGO. Generally, I find comparable rules of thumb apply regarding calibration requirements. The expected S/N distribution for $1.4+1.4\,M_{\odot}$ binaries detected with Cosmic Explorer when signals are distributed uniformly in comoving volume (and source-frame time) is similar to the distribution detected with aLIGO. As such, the calibration will affect approximately the same fraction of low-mass mergers in both aLIGO and Cosmic explorer.

It is worth noting, though, that calibration requirements for BBHs may be more stringent. Although my waveform model is unsuitable for precise BBH predictions (I truncate my waveform sharply and ignore merger and ringdown [63]), one might expect stricter calibration requirements for BBH than for BNS at the same S/N because the BBH signal bandwidth is smaller. That is, $|h|$ be larger at each frequency for BBH, and therefore the relative importance of $\Gamma\neq 0$ will be larger. However, as much of a BNS’s S/N is gathered at low frequencies, just like a BBH, it is unlikely that this would produce an effect larger than a factor of a few on calibration requirements.

Of perhaps greater astrophysical interest is the fact that the distribution S/N for BBH detected with Cosmic Explorer will not be peaked near the detection threshold [64]. This effect is also present for BNS, but it is much more pronounced for BBH. One may reasonably expect a significant fraction of detected BBH to have $\rho\gtrsim 10^{2}$ . However, the distribution of the loudest sources will still be limited by the amount of comoving volume at low redshift (assuming a constant merger rate in the local universe). Thus, the tail of the distribution of observed S/N will still fall off as $\rho^{-4}$ (Euclidean universe). If calibration uncertainties are small enough to not matter until $\rho\gtrsim 10^{3}$ , then they will not matter for the vast majority of detectable binaries even if the distribution of detected S/N peaks near 100.

VII.2 Next Steps

Given the readily apparent benefits of this analytic approach to the impact of calibration uncertainties within GW inference, it may be of interest to extend this work in the future. Specifically, it may be of interest to more closely examine the implications of Eq. 17 to understand what types of astrophysical signals may be able to most improve our understanding of calibration uncertainties. It may also be of interest to further explore ways to represent parametrized models of the detector response in terms of GPs (e.g., through correlations between frequencies in $\Gamma$ ). For example, Bayesian linear regression is a special case of a GP with a covariance kernel $K(x_{i},x_{j})\propto x_{i}x_{j}$ . That is, covariance kernels of this kind generate polynomial functions of $x$ without the need to specify these as part of a parametrized mean. If this is possible, one may still be able to exploit the analytic marginalization derived in this manuscript that relies on expressing the calibration uncertainty as a GP even when the systematic errors in the modeled detector response shrink below the statistical precision to which the modeled response is measured.

Finally, I again note that the additional terms within the likelihood that are created by calibration errors can also be introduced by waveform errors or deviations from GR. If one is able to express waveform uncertainty (or uncertainty in the true theory of gravity) in terms of a GP, analogous analytic treatments will be possible. Indeed, with a proper choice of hyperprior for our uncertainty in the true theory of gravity, one may be able to perform a completely analytic hierarchical inference for the distribution of deviations from GR. I will explore these possibilities further in future work.

Acknowledgements.

I am very grateful for the useful discussions with Ethan Payne and the broader parameter estimation and calibration groups within the LIGO Scientific Collaboration. I also thank the Canadian Institute for Advanced Research (CIFAR) for support. Research at Perimeter Institute is supported in part by the Government of Canada through the Department of Innovation, Science and Economic Development Canada and by the Province of Ontario through the Ministry of Colleges and Universities. This material is based upon work supported by NSF’s LIGO Laboratory which is a major facility fully funded by the National Science Foundation.

References

Aasi et al. [2015] J. Aasi et al. (LIGO Scientific), Advanced LIGO, Class. Quant. Grav. 32, 074001 (2015), arXiv:1411.4547 [gr-qc] .
Acernese et al. [2015] F. Acernese et al. (VIRGO), Advanced Virgo: a second-generation interferometric gravitational wave detector, Class. Quant. Grav. 32, 024001 (2015), arXiv:1408.3978 [gr-qc] .
Akutsu et al. [2019] T. Akutsu et al. (KAGRA), KAGRA: 2.5 Generation Interferometric Gravitational Wave Detector, Nature Astron. 3, 35 (2019), arXiv:1811.08079 [gr-qc] .
McCuller et al. [2020] L. McCuller, C. Whittle, D. Ganapathy, K. Komori, M. Tse, A. Fernandez-Galiana, L. Barsotti, P. Fritschel, M. MacInnis, F. Matichard, K. Mason, N. Mavalvala, R. Mittleman, H. Yu, M. E. Zucker, and M. Evans, Frequency-dependent squeezing for advanced ligo, Phys. Rev. Lett. 124, 171102 (2020).
Yu et al. [2020] H. Yu et al. (LIGO Scientific), Quantum correlations between light and the kilogram-mass mirrors of LIGO, Nature 583, 43 (2020), arXiv:2002.01519 [quant-ph] .
Abbott et al. [2021a] R. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Tests of general relativity with binary black holes from the second ligo-virgo gravitational-wave transient catalog, Phys. Rev. D 103, 122002 (2021a).
Abbott et al. [2021b] R. Abbott et al. (LIGO Scientific, VIRGO, KAGRA), Tests of General Relativity with GWTC-3, (2021b), arXiv:2112.06861 [gr-qc] .
Abbott et al. [2021c] R. Abbott et al., Population properties of compact objects from the second LIGO–virgo gravitational-wave transient catalog, The Astrophysical Journal Letters 913, L7 (2021c).
Abbott et al. [2021d] R. Abbott et al. (LIGO Scientific, VIRGO, KAGRA), The population of merging compact binaries inferred using gravitational waves through GWTC-3, (2021d), arXiv:2111.03634 [astro-ph.HE] .
Legred et al. [2021] I. Legred, K. Chatziioannou, R. Essick, S. Han, and P. Landry, Impact of the psr $\mathrm{J}0740+6620$ radius constraint on the properties of high-density matter, Phys. Rev. D 104, 063003 (2021).
Essick et al. [2021] R. Essick, P. Landry, A. Schwenk, and I. Tews, Detailed examination of astrophysical constraints on the symmetry energy and the neutron skin of ${}^{208}\mathrm{Pb}$ with minimal modeling assumptions, Phys. Rev. C 104, 065804 (2021).
Viets et al. [2018] A. D. Viets, M. Wade, A. L. Urban, S. Kandhasamy, J. Betzwieser, D. A. Brown, J. Burguet-Castell, C. Cahillane, E. Goetz, K. Izumi, S. Karki, J. S. Kissel, G. Mendell, R. L. Savage, X. Siemens, D. Tuyenbayev, and A. J. Weinstein, Reconstructing the calibrated strain signal in the advanced LIGO detectors, Classical and Quantum Gravity 35, 095015 (2018).
Sun et al. [2020] L. Sun et al., Characterization of systematic error in Advanced LIGO calibration, Class. Quant. Grav. 37, 225008 (2020), arXiv:2005.02531 [astro-ph.IM] .
Ross et al. [2021] M. P. Ross, T. Mistry, L. Datrier, J. Kissel, K. Venkateswara, C. Weller, K. Kumar, C. Hagedorn, E. Adelberger, J. Lee, E. Shaw, P. Thomas, D. Barker, F. Clara, B. Gateley, T. M. Guidry, E. Daw, M. Hendry, and J. Gundlach, Initial results from the ligo newtonian calibrator, Phys. Rev. D 104, 082006 (2021).
Estevez et al. [2021] D. Estevez, B. Mours, and T. Pradier, Newtonian calibrator tests during the virgo o3 data taking, Classical and Quantum Gravity 38, 075012 (2021).
Vitale et al. [2012] S. Vitale, W. Del Pozzo, T. G. F. Li, C. Van Den Broeck, I. Mandel, B. Aylott, and J. Veitch, Effect of calibration errors on Bayesian parameter estimation for gravitational wave signals from inspiral binary systems in the Advanced Detectors era, Phys. Rev. D 85, 064034 (2012), arXiv:1111.3044 [gr-qc] .
Pitkin et al. [2016] M. Pitkin, C. Messenger, and L. Wright, Astrophysical calibration of gravitational-wave detectors, Phys. Rev. D 93, 062002 (2016), arXiv:1511.02758 [astro-ph.IM] .
Essick and Holz [2019] R. Essick and D. E. Holz, Calibrating gravitational-wave detectors with GW170817, Class. Quant. Grav. 36, 125002 (2019), arXiv:1902.08076 [astro-ph.IM] .
Payne et al. [2020] E. Payne, C. Talbot, P. D. Lasky, E. Thrane, and J. S. Kissel, Gravitational-wave astronomy with a physical calibration model, Phys. Rev. D 102, 122004 (2020).
Vitale et al. [2021] S. Vitale, C.-J. Haster, L. Sun, B. Farr, E. Goetz, J. Kissel, and C. Cahillane, Physical approach to the marginalization of LIGO calibration uncertainties, Phys. Rev. D 103, 063016 (2021), arXiv:2009.10192 [gr-qc] .
Karki et al. [2016] S. Karki et al., The Advanced LIGO Photon Calibrators, Rev. Sci. Instrum. 87, 114503 (2016), arXiv:1608.05055 [astro-ph.IM] .
Bhattacharjee et al. [2020] D. Bhattacharjee, Y. Lecoeuche, S. Karki, J. Betzwieser, V. Bossilkov, S. Kandhasamy, E. Payne, and R. L. Savage, Fiducial displacements with improved accuracy for the global network of gravitational wave detectors, Classical and Quantum Gravity 38, 015009 (2020).
Cahillane et al. [2017] C. Cahillane, J. Betzwieser, D. A. Brown, E. Goetz, E. D. Hall, K. Izumi, S. Kandhasamy, S. Karki, J. S. Kissel, G. Mendell, R. L. Savage, D. Tuyenbayev, A. Urban, A. Viets, M. Wade, and A. J. Weinstein, Calibration uncertainty for advanced ligo’s first and second observing runs, Phys. Rev. D 96, 102001 (2017).
The LIGO Scientific Collaboration and The Virgo Collaboration [2021] The LIGO Scientific Collaboration and The Virgo Collaboration, Ligo and virgo calibaration uncertainty (o1, o2 and o3), https://dcc.ligo.org/LIGO-T2100313/public (2021).
Farr et al. [2014] W. Farr, B. Farr, and T. Littenberg, Modelling calibration errors in cbc waveforms, https://dcc.ligo.org/LIGO-T1400682/public (2014).
Lindblom [2009] L. Lindblom, Optimal calibration accuracy for gravitational-wave detectors, Phys. Rev. D 80, 042005 (2009).
Abbott et al. [2019a] B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Gwtc-1: A gravitational-wave transient catalog of compact binary mergers observed by ligo and virgo during the first and second observing runs, Phys. Rev. X 9, 031040 (2019a).
Abbott et al. [2021e] R. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Gwtc-2: Compact binary coalescences observed by ligo and virgo during the first half of the third observing run, Phys. Rev. X 11, 021053 (2021e).
Abbott et al. [2021f] R. Abbott et al. (LIGO Scientific, VIRGO), Gwtc-2.1: Deep extended catalog of compact binary coalescences observed by ligo and virgo during the first half of the third observing run, (2021f), arXiv:2108.01045 [gr-qc] .
Abbott et al. [2021g] R. Abbott et al. (LIGO Scientific, VIRGO, KAGRA), Gwtc-3: Compact binary coalescences observed by ligo and virgo during the second part of the third observing run, (2021g), arXiv:2111.03606 [gr-qc] .
Essick et al. [2017] R. Essick, S. Vitale, and M. Evans, Frequency-dependent responses in third generation gravitational-wave detectors, Phys. Rev. D 96, 084004 (2017).
Evans et al. [2021] M. Evans et al., A Horizon Study for Cosmic Explorer: Science, Observatories, and Community, (2021), arXiv:2109.09882 [astro-ph.IM] .
Allen et al. [2012] B. Allen, W. G. Anderson, P. R. Brady, D. A. Brown, and J. D. E. Creighton, Findchirp: An algorithm for detection of gravitational waves from inspiraling compact binaries, Phys. Rev. D 85, 122006 (2012).
Talbot and Thrane [2020] C. Talbot and E. Thrane, Gravitational-wave astronomy with an uncertain noise power spectral density, Phys. Rev. Research 2, 043298 (2020).
Talbot et al. [2021] C. Talbot, E. Thrane, S. Biscoveanu, and R. Smith, Inference with finite time series: Observing the gravitational universe through windows, Phys. Rev. Research 3, 043049 (2021).
Tuyenbayev et al. [2020] D. Tuyenbayev, R. Savage, T. Sadecki, and S. Karki, Index of photon calibrator gold standard and checking standard nist calibrations, https://dcc.ligo.org/LIGO-T1500036/public (2020).
Lecoeuche et al. [2020] Y. Lecoeuche, M. Szczepanczyk, R. Savage, S. Karki, and T. Sadecki, Pcal gold standard and working standard configuration change proposal (2020).
Zimmerman et al. [2019] A. Zimmerman, C.-J. Haster, and K. Chatziioannou, On combining information from multiple gravitational wave sources, Phys. Rev. D 99, 124044 (2019).
Isi et al. [2019] M. Isi, K. Chatziioannou, and W. M. Farr, Hierarchical test of general relativity with gravitational waves, Phys. Rev. Lett. 123, 121101 (2019).
Lindblom et al. [2008] L. Lindblom, B. J. Owen, and D. A. Brown, Model waveform accuracy standards for gravitational wave data analysis, Phys. Rev. D 78, 124020 (2008).
Was et al. [2009] M. Was, M.-A. Bizouard, V. Brisson, F. Cavalier, M. Davier, P. Hello, N. Leroy, F. Robinet, and M. Vavoulidis, On the background estimation by time slides in a network of gravitational wave detectors, Classical and Quantum Gravity 27, 015005 (2009).
Cannon et al. [2013] K. Cannon, C. Hanna, and D. Keppel, Method to estimate the significance of coincident gravitational-wave observations from compact binary coalescence, Phys. Rev. D 88, 024025 (2013).
Usman et al. [2016] S. A. Usman, A. H. Nitz, I. W. Harry, C. M. Biwer, D. A. Brown, M. Cabero, C. D. Capano, T. D. Canton, T. Dent, S. Fairhurst, M. S. Kehl, D. Keppel, B. Krishnan, A. Lenon, A. Lundgren, A. B. Nielsen, L. P. Pekowsky, H. P. Pfeiffer, P. R. Saulson, M. West, and J. L. Willis, The PyCBC search for gravitational waves from compact binary coalescence, Classical and Quantum Gravity 33, 215004 (2016).
Lynch et al. [2017] R. Lynch, S. Vitale, R. Essick, E. Katsavounidis, and F. Robinet, Information-theoretic approach to the gravitational-wave burst detection problem, Phys. Rev. D 95, 104046 (2017).
Drago et al. [2021] M. Drago, S. Klimenko, C. Lazzaro, E. Milotti, G. Mitselmakher, V. Necula, B. O’Brian, G. A. Prodi, F. Salemi, M. Szczepanczyk, S. Tiwari, V. Tiwari, G. V, G. Vedovato, and I. Yakushin, coherent waveburst, a pipeline for unmodeled gravitational-wave data analysis, SoftwareX 14, 100678 (2021).
Aubin et al. [2021] F. Aubin, F. Brighenti, R. Chierici, D. Estevez, G. Greco, G. M. Guidi, V. Juste, F. Marion, B. Mours, E. Nitoglia, O. Sauter, and V. Sordini, The MBTA pipeline for detecting compact binary coalescences in the third LIGO–virgo observing run, Classical and Quantum Gravity 38, 095004 (2021).
Pürrer and Haster [2020] M. Pürrer and C.-J. Haster, Gravitational waveform accuracy requirements for future ground-based detectors, Phys. Rev. Res. 2, 023151 (2020), arXiv:1912.10055 [gr-qc] .
Edelman et al. [2021] B. Edelman, F. J. Rivera-Paleo, J. D. Merritt, B. Farr, Z. Doctor, J. Brink, W. M. Farr, J. Gair, J. S. Key, J. McIver, and A. B. Nielsen, Constraining unmodeled physics with compact binary mergers from gwtc-1, Phys. Rev. D 103, 042004 (2021).
Rasmussen and Williams [2006] C. E. Rasmussen and C. K. I. Williams, Gaussian Processes for Machine Learning (MIT Press, 2006).
Fairhurst [2011] S. Fairhurst, Source localization with an advanced gravitational wave detector network, Class. Quant. Grav. 28, 105021 (2011), arXiv:1010.6192 [gr-qc] .
Barsotti et al. [2018] L. Barsotti et al., Updated advanced ligo sensitivity design curve, https://dcc.ligo.org/LIGO-T1800044/public (2018).
Kuns and other [2021] K. Kuns and other, Cosmic explorer strain sensitivity, https://dcc.cosmicexplorer.org/public/0163/T2000017/004/ce_strain.txt (2021).
Abbott et al. [2021h] R. Abbott et al. (KAGRA, Virgo, LIGO Scientific), Upper limits on the isotropic gravitational-wave background from Advanced LIGO and Advanced Virgo’s third observing run, Phys. Rev. D 104, 022004 (2021h), arXiv:2101.12130 [gr-qc] .
Abbott et al. [2021i] R. Abbott et al. (KAGRA, VIRGO, LIGO Scientific), All-sky search for continuous gravitational waves from isolated neutron stars in the early O3 LIGO data, Phys. Rev. D 104, 082004 (2021i), arXiv:2107.00600 [gr-qc] .
Tiwari [2018] V. Tiwari, Estimation of the sensitive volume for gravitational-wave source populations using weighted monte carlo integration, Classical and Quantum Gravity 35, 145009 (2018).
Farr [2019] W. M. Farr, Accuracy requirements for empirically measured selection functions, Research Notes of the AAS 3, 66 (2019).
Essick [2021] R. Essick, Constructing mixture models for sensitivity estimates from subsets of separate injections, Research Notes of the AAS 5, 220 (2021).
Flanagan and Hinderer [2008] E. E. Flanagan and T. Hinderer, Constraining neutron-star tidal love numbers with gravitational-wave detectors, Phys. Rev. D 77, 021502 (2008).
Abbott et al. [2019b] B. P. Abbott et al. (LIGO Scientific Collaboration and Virgo Collaboration), Properties of the binary neutron star merger gw170817, Phys. Rev. X 9, 011001 (2019b).
Vallisneri [2008] M. Vallisneri, Use and abuse of the fisher information matrix in the assessment of gravitational-wave parameter-estimation prospects, Phys. Rev. D 77, 042001 (2008).
Abbott et al. [2017] B. P. Abbott et al. (LIGO Scientific), Exploring the Sensitivity of Next Generation Gravitational Wave Detectors, Class. Quant. Grav. 34, 044001 (2017), arXiv:1607.08697 [astro-ph.IM] .
Reitze et al. [2019] D. Reitze et al., Cosmic Explorer: The U.S. Contribution to Gravitational-Wave Astronomy beyond LIGO, Bull. Am. Astron. Soc. 51, 035 (2019), arXiv:1907.04833 [astro-ph.IM] .
Mandel et al. [2014] I. Mandel, C. P. L. Berry, F. Ohme, S. Fairhurst, and W. M. Farr, Parameter estimation on compact binary coalescences with abruptly terminating gravitational waveforms, Classical and Quantum Gravity 31, 155005 (2014).
Vitale [2016] S. Vitale, Three observational differences for binary black holes detections with second- and third-generation gravitational-wave detectors, Phys. Rev. D 94, 121501 (2016).

Appendix A Marginal Uncertainty when $\Gamma$ is not diagonal

In this appendix, I briefly investigate the behavior of the calibration-marginalized Fisher information matrix assuming squared-exponential GP priors for the calibration uncertainty with non-vanishing correlations between frequencies ( $l_{\Gamma}\neq 0$ ). Note that the Fisher information matrix elements can be scaled so that they only depend on two parameters

\frac{1}{\rho^{2}}I_{ab}=\partial_{a}\tilde{h}^{\mathrm{T}}\tilde{B}^{-1}\partial_{b}\tilde{h}-\frac{1}{2}\mathrm{tr}\left[\partial_{a}\tilde{B}\partial_{b}\tilde{B}^{-1}\right]

(73)

where

	$\displaystyle h=a\tilde{h}$		(74)
	$\displaystyle H=a\tilde{H}$		(75)
	$\displaystyle\tilde{h}^{\mathrm{T}}C^{-1}\tilde{h}=1$		(76)
	$\displaystyle\tilde{B}^{-1}=C^{-1}-C^{-1}\tilde{H}\tilde{A}\tilde{H}^{\mathrm{T}}C^{-1}$		(77)
	$\displaystyle\tilde{A}^{-1}=\tilde{H}^{\mathrm{T}}C^{-1}\tilde{H}+x\tilde{\Gamma}^{-1}$		(78)
	$\displaystyle\Gamma=\sigma_{\Gamma}^{2}\tilde{\Gamma}$		(79)

and

x\equiv\frac{1}{\sigma_{\Gamma}^{2}\rho^{2}}

(80)

Again, one sees that the comparison that determines whether calibration uncertainty is important within the astrophysical inference (whether $x$ is large or small) is between $\sigma_{\Gamma}$ and $\rho$ . This appendix is dedicated to illuminating $l_{\Gamma}$ ’s role as well. Fig. 2 shows the main results, which are

•

$l_{\Gamma}=0$ (diagonal $\Gamma$ ) is a local maximum in uncertainty,
•

finite $l_{\Gamma}$ can reduce the uncertainty as much as a factor of $\sim 10$ , and
•

perfect correlations between frequencies (infinite $l_{\Gamma}$ ) no longer impose an absolute minimum on the uncertainty of astrophysical parameters.

In Fig. 2 and throughout this appendix, I approximate $I_{ab}\approx\partial_{a}h^{\mathrm{T}}B^{-1}\partial_{b}h$ and neglect the trace term in Eq. 54. This is only exact when $\Gamma$ is diagonal and $\mathrm{Var}\{\mathcal{R}\{\delta(f_{i})\}\}=\mathrm{Var}\{\mathcal{I}\{\delta(f_{i})\}\}\ \forall\ f_{i}$ , in which case $B$ depends only on $|h|^{2}$ and therefore $\partial_{a}B=0$ whenever the astrophysical parameters only affect the signal’s phase. The trace term is always negligible when $l_{\Gamma}$ is small, but it is not know whether it is important for larger $l_{\Gamma}$ . What is known, however, is that it is very difficult to accurately compute the trace term for large $l_{\Gamma}$ because $\Gamma$ becomes ill-conditioned (the ratio of the largest eigenvalue to the smallest eigenvalue is large) and therefore difficult to invert. Even more so than the results in the main text, then, my conclusions about the behavior at large $l_{\Gamma}$ should be taken with a healthy grain of salt.

First, let’s begin with the observation that diagonal $\Gamma$ correspond to local maxima in the uncertainty within the astrophysical inference. Note that

\frac{d}{dl_{\Gamma}}I_{ab}=\partial_{a}h^{\mathrm{T}}\frac{dB^{-1}}{dl_{\Gamma}}\partial_{b}h+\frac{1}{2}\mathrm{tr}\left[\frac{dB}{dl_{\Gamma}}\partial_{a}B^{-1}B\partial_{b}B^{-1}+B\frac{d(\partial_{a}B^{-1})}{dl_{\Gamma}}B\partial_{b}B^{-1}+B\partial_{a}B^{-1}\frac{dB}{dl_{\Gamma}}\partial_{b}B^{-1}+B\partial_{a}B^{-1}B\frac{d(\partial_{b}B^{-1})}{dl_{\Gamma}}\right]

(81)

and

\left.\frac{d}{dl_{\Gamma}}I_{ab}\right|_{l_{\Gamma}=0}=\partial_{a}h^{\mathrm{T}}\left(\left.\frac{dB^{-1}}{dl_{\Gamma}}\right|_{l_{\Gamma}=0}\right)\partial_{b}h

(82)

because $\partial_{a}B^{-1}|_{l_{\Gamma}=0}=0$ . With further manipulation, one obtains

\frac{dB^{-1}}{dl_{\Gamma}}=-C^{-1}HA\Gamma^{-1}\left(\frac{d\Gamma}{dl_{\Gamma}}\right)\Gamma^{-1}AH^{\mathrm{T}}C^{-1}

(83)

and, for my squared-exponential GPs

\frac{d\Gamma_{ij}}{dl_{\Gamma}}=\left(\frac{2(x_{i}-x_{j})^{2}}{l_{\Gamma}^{3}}\right)\Gamma_{ij}\geq 0\ \forall\ i,j

(84)

What’s more, L’Hôpital’s rule shows that

\lim\limits_{l_{\Gamma}\rightarrow 0^{+}}\frac{d\Gamma_{ij}}{dl_{\Gamma}}=0\ \forall\ i,j

(85)

and therefore $(dI_{ab}/dl_{\Gamma})|_{l_{\Gamma}=0}=0$ .⁶⁶6Similar considerations show that $(d(\partial_{a}B^{-1})/dl_{\Gamma})|_{l_{\Gamma}=0}=0$ .

This makes sense, as diagonal $\Gamma$ maximize the entropy (provide the least information) within the calibration prior for each $\sigma_{\Gamma}$ . One expects the smallest amount of prior calibration information to correspond to a maximum in the uncertainty in the astrophysical inference. As the correlations between frequencies increase, the types of calibration error allowed by the prior are more restricted, and this, almost certainly, reduces the uncertainty in the astrophysical parameters.

Taking this to an extreme, the scaling of the Fisher information matrix is fundamentally different for finite S/N and $\sigma_{\Gamma}$ when one first takes the limit $l_{\Gamma}\rightarrow\infty$ . As discussed in Sec. III.1.4, the correction to $C^{-1}$ within $B^{-1}$ scales as $\sigma_{\Gamma}^{2}\rho^{2}/(1+\sigma_{\Gamma}^{2}\rho^{2})$ . This implies that, when the product $\sigma_{\Gamma}\rho$ is large, the correction ends up independent of both. That is, $\lim_{\rho\rightarrow\infty}B^{-1}\rightarrow\epsilon C^{-1}$ , which implies that $I_{ab}\approx\partial_{a}h^{\mathrm{T}}B^{-1}\partial_{b}h$ will scale with $\rho^{2}$ both at low and high S/N. Indeed, Fig. 3 shows this behavior explicitly (albeit neglecting the trace term), and calibration uncertainty only ever increases the uncertainty in $\tilde{\Lambda}$ by at most a factor of a few.

This is in stark contrast to the prediction of Secs. III.1.2 and V.2, which suggest the uncertainty at high S/N asymptotes to a value that is independent of the S/N. However, there is no contradiction. $\Gamma$ becomes singular when $l_{\Gamma}\rightarrow\infty$ , and therefore $\Gamma^{-1}$ does not exist. As such, the expressions in Secs. III.1.2 and V.2, which rely on $\Gamma^{-1}$ , do not apply.

Nonetheless, one may conclude that correlations between frequencies in the calibration prior help reduce the amount of information lost within the astrophysical inference. However, perfect calibration between frequencies is unlikely to ever be realized in practice, and these conclusions depend to some extent on the assumption of independent amplitude and phase errors as well as the particular assumption of a squared-exponential GP. Further work may be needed to determine whether GPs based on more realistic calibration measurements also show this behavior.

Appendix B On the interpretation of $\sigma_{\Gamma}\sqrt{\Delta f}$

The careful reader may have been struck by the apparent dependence of my results on the frequency spacing within, e.g., Eq. 71, where the relevant quantity describing the calibration uncertainty is $\sigma_{\Gamma}^{2}\Delta f$ . While this should not be surprising, as the variance of the data due to stationary Gaussian noise has a similar scaling ( $C\Delta f\sim s$ is constant for different $\Delta f$ ), it is perhaps illuminating to consider this in more detail.

Specifically, consider downsampling the data by averaging the data in neighboring frequency bins.

d_{i+j}=\frac{1}{2}(d_{i}+d_{j})

(86)

Applying this procedure to the signal model, one obtains

	$\displaystyle\frac{1}{2}(h_{i}(1+\delta_{i})+h_{j}(1+\delta_{j}))$	$\displaystyle=\frac{1}{2}(h_{i}+h_{j})+\frac{1}{2}(h_{i}\delta_{i}+h_{j}\delta_{j})$
		$\displaystyle=h_{i+j}\left(1+\frac{h_{i}\delta_{i}+h_{j}\delta_{j}}{h_{i}+h_{j}}\right)$		(87)

from which it is clear that $\delta_{i+j}$ is a signal-weighted average of $\delta_{i}$ and $\delta_{j}$ . Thus

\mathrm{Var}(\delta_{i+j})=\frac{h_{i}^{2}}{(h_{i}+h_{j})^{2}}\mathrm{Var}(\delta_{i})+\frac{h_{j}^{2}}{(h_{i}+h_{j})^{2}}\mathrm{Var}(\delta_{j})+2\frac{h_{i}h_{j}}{(h_{i}+h_{j})^{2}}\mathrm{Cov}(\delta_{i},\delta_{j})

(88)

If $\mathrm{Var}(\delta_{i})=\mathrm{Var}(\delta_{j})=\mathrm{Cov}(\delta_{i},\delta_{j})=\sigma_{\Gamma}^{2}$ , implying $\delta_{i}$ and $\delta_{j}$ are strongly correlated, then $\mathrm{Var}(\delta_{i+j})=\sigma_{\Gamma}^{2}$ . However, if $\delta_{i}$ and $\delta_{j}$ are uncorrelated ( $\mathrm{Cov}(\delta_{i},\delta_{j})=0$ ), then $\mathrm{Var}(\delta_{i+j})=\sigma_{\Gamma}^{2}(h_{i}^{2}+h_{j}^{2})/(h_{i}+h_{j})^{2}$ . Furthermore, if the signal does not change much between neighboring frequency bins, then $\mathrm{Var}(\delta_{i+j})\approx\sigma_{\Gamma}^{2}/2$ . That is, the variance of the averaged $\delta$ is smaller than the original variance by a factor that is proportional to the number of frequency bins that were averaged.

Therefore, for diagonal $\Gamma$ , one might expect the product of $\Gamma$ and the frequency spacing to be constant, as the frequency spacing scales with the number of samples that have been averaged together. If diagonal $\Gamma$ are realizable in practice, it will likely be better to describe them in terms of $\Gamma\Delta f\sim\mathrm{constant}$ just as $C\Delta f\sim s$ is a constant.

$\displaystyle\ln p(d\|\vartheta,\varphi,\mathcal{H}_{s},\mathcal{H}_{\delta})=$	$\displaystyle\ln p(d\|\vartheta,\varphi,\mathcal{H}_{s})$
	$\displaystyle-\frac{1}{2}\gamma^{\mathrm{T}}\Gamma^{-1}\gamma-\frac{1}{2}\ln\mathrm{det}\|\Gamma\|$
	$\displaystyle+\frac{1}{2}\nu^{\mathrm{T}}A^{-1}\nu+\frac{1}{2}\ln\mathrm{det}\|A\|$	(14)

Calibration Uncertainty’s Impact on Gravitational-Wave Observations

Abstract

I Introduction

II Notation and Stationary Gaussian Noise

III Marginal and Conditioned Likelihoods

III.1 Limiting Cases

III.1.1 Small Deviations: H​Γ​HT≪CH\Gamma H^{\mathrm{T}}\ll C

III.1.2 Large Deviations: C≪H​Γ​HTC\ll H\Gamma H^{\mathrm{T}}

III.1.3 Small Correlations between frequencies

III.1.4 Large Correlations between frequencies

III.2 Detector Networks and Multiple Signals

IV Searches and Sensitivity

V Calibration-Marginalized Fisher Information Matrix

V.1 Calibration uncertainties are small

V.2 Calibration uncertainties are large

VI Waveform Uncertainty and Deviations from General Relativity

VII Discussion

VII.1 Estimates of calibration requirements for current and proposed detectors

VII.1.1 Forecasts for aLIGO design sensitivity

VII.1.2 Forecasts for Cosmic Explorer

VII.2 Next Steps

Acknowledgements.

References

Appendix A Marginal Uncertainty when Γ\Gamma is not diagonal

Appendix B On the interpretation of σΓ​Δ​f\sigma_{\Gamma}\sqrt{\Delta f}

III.1.1 Small Deviations: $H\Gamma H^{\mathrm{T}}\ll C$

III.1.2 Large Deviations: $C\ll H\Gamma H^{\mathrm{T}}$

Appendix A Marginal Uncertainty when $\Gamma$ is not diagonal

Appendix B On the interpretation of $\sigma_{\Gamma}\sqrt{\Delta f}$