This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

The NANOGrav 15 yr data set: Posterior predictive checks for gravitational-wave detection with pulsar timing arrays

Gabriella Agazie  0000-0001-5134-3925 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Akash Anumarlapudi 0000-0002-8935-9882 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Anne M. Archibald 0000-0003-0638-3340 Newcastle University, NE1 7RU, UK    Zaven Arzoumanian X-Ray Astrophysics Laboratory, NASA Goddard Space Flight Center, Code 662, Greenbelt, MD 20771, USA    Jeremy George Baier 0000-0002-4972-1525 Department of Physics, Oregon State University, Corvallis, OR 97331, USA    Paul T. Baker 0000-0003-2745-753X Department of Physics and Astronomy, Widener University, One University Place, Chester, PA 19013, USA    Laura Blecha 0000-0002-2183-1087 Physics Department, University of Florida, Gainesville, FL 32611, USA    Adam Brazier 0000-0001-6341-7178 Cornell Center for Astrophysics and Planetary Science and Department of Astronomy, Cornell University, Ithaca, NY 14853, USA Cornell Center for Advanced Computing, Cornell University, Ithaca, NY 14853, USA    Paul R. Brook 0000-0003-3053-6538 Institute for Gravitational Wave Astronomy and School of Physics and Astronomy, University of Birmingham, Edgbaston, Birmingham B15 2TT, UK    Sarah Burke-Spolaor 0000-0003-4052-7838 Sloan Fellow Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Bence Bécsy 0000-0003-0909-5563 Department of Physics, Oregon State University, Corvallis, OR 97331, USA    J. Andrew Casey-Clyde 0000-0002-5557-4007 Department of Physics, University of Connecticut, 196 Auditorium Road, U-3046, Storrs, CT 06269-3046, USA    Maria Charisi 0000-0003-3579-2522 Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Shami Chatterjee 0000-0002-2878-1502 Cornell Center for Astrophysics and Planetary Science and Department of Astronomy, Cornell University, Ithaca, NY 14853, USA    Katerina Chatziioannou Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA    Tyler Cohen 0000-0001-7587-5483 Department of Physics, New Mexico Institute of Mining and Technology, 801 Leroy Place, Socorro, NM 87801, USA    James M. Cordes 0000-0002-4049-1882 Cornell Center for Astrophysics and Planetary Science and Department of Astronomy, Cornell University, Ithaca, NY 14853, USA    Neil J. Cornish 0000-0002-7435-0869 Department of Physics, Montana State University, Bozeman, MT 59717, USA    Fronefield Crawford 0000-0002-2578-0360 Department of Physics and Astronomy, Franklin & Marshall College, P.O. Box 3003, Lancaster, PA 17604, USA    H. Thankful Cromartie 0000-0002-6039-692X National Research Council Research Associate, National Academy of Sciences, Washington, DC 20001, USA resident at Naval Research Laboratory, Washington, DC 20375, USA    Kathryn Crowter 0000-0002-1529-5169 Department of Physics and Astronomy, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1, Canada    Megan E. DeCesar 0000-0002-2185-1790 George Mason University, Fairfax, VA 22030, resident at the U.S. Naval Research Laboratory, Washington, DC 20375, USA    Paul B. Demorest 0000-0002-6664-965X National Radio Astronomy Observatory, 1003 Lopezville Rd., Socorro, NM 87801, USA    Heling Deng Department of Physics, Oregon State University, Corvallis, OR 97331, USA    Lankeswar Dey 0000-0002-2554-0674 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Timothy Dolch 0000-0001-8885-6388 Department of Physics, Hillsdale College, 33 E. College Street, Hillsdale, MI 49242, USA Eureka Scientific, 2452 Delmer Street, Suite 100, Oakland, CA 94602-3017, USA    Elizabeth C. Ferrara 0000-0001-7828-7708 Department of Astronomy, University of Maryland, College Park, MD 20742, USA Center for Research and Exploration in Space Science and Technology, NASA/GSFC, Greenbelt, MD 20771 NASA Goddard Space Flight Center, Greenbelt, MD 20771, USA    William Fiore 0000-0001-5645-5336 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Sophia V. Sosa Fiscella 0000-0002-5176-2924 School of Physics and Astronomy, Rochester Institute of Technology, Rochester, NY 14623, USA Laboratory for Multiwavelength Astrophysics, Rochester Institute of Technology, Rochester, NY 14623, USA    Emmanuel Fonseca 0000-0001-8384-5049 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Gabriel E. Freedman 0000-0001-7624-4616 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Emiko C. Gardiner 0000-0002-8857-613X Department of Astronomy, University of California, Berkeley, 501 Campbell Hall #3411, Berkeley, CA 94720, USA    Nate Garver-Daniels 0000-0001-6166-9646 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Peter A. Gentile 0000-0001-8158-683X Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Kyle A. Gersbach Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Joseph Glaser 0000-0003-4090-9780 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Deborah C. Good 0000-0003-1884-348X Department of Physics and Astronomy, University of Montana, 32 Campus Drive, Missoula, MT 59812    Kayhan Gültekin 0000-0002-1146-0198 Department of Astronomy and Astrophysics, University of Michigan, Ann Arbor, MI 48109, USA    Jeffrey S. Hazboun 0000-0003-2742-3321 Department of Physics, Oregon State University, Corvallis, OR 97331, USA    Ross J. Jennings 0000-0003-1082-2342 NANOGrav Physics Frontiers Center Postdoctoral Fellow Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Aaron D. Johnson 0000-0002-7445-8423 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA
   Megan L. Jones 0000-0001-6607-3710 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Andrew R. Kaiser 0000-0002-3654-980X Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    David L. Kaplan 0000-0001-6295-2881 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Luke Zoltan Kelley 0000-0002-6625-6450 Department of Astronomy, University of California, Berkeley, 501 Campbell Hall #3411, Berkeley, CA 94720, USA    Matthew Kerr 0000-0002-0893-4073 Space Science Division, Naval Research Laboratory, Washington, DC 20375-5352, USA    Joey S. Key 0000-0003-0123-7600 University of Washington Bothell, 18115 Campus Way NE, Bothell, WA 98011, USA    Nima Laal 0000-0002-9197-7604 Department of Physics, Oregon State University, Corvallis, OR 97331, USA    Michael T. Lam 0000-0003-0721-651X SETI Institute, 339 N Bernardo Ave Suite 200, Mountain View, CA 94043, USA School of Physics and Astronomy, Rochester Institute of Technology, Rochester, NY 14623, USA Laboratory for Multiwavelength Astrophysics, Rochester Institute of Technology, Rochester, NY 14623, USA    William G. Lamb 0000-0003-1096-4156 Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Bjorn Larsen Department of Physics, Yale University, New Haven, CT 06520, USA    T. Joseph W. Lazio Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA    Natalia Lewandowska 0000-0003-0771-6581 Department of Physics and Astronomy, State University of New York at Oswego, Oswego, NY 13126, USA    Tingting Liu 0000-0001-5766-4287 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Duncan R. Lorimer 0000-0003-1301-966X Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Jing Luo 0000-0001-5373-5914 Deceased Department of Astronomy & Astrophysics, University of Toronto, 50 Saint George Street, Toronto, ON M5S 3H4, Canada    Ryan S. Lynch 0000-0001-5229-7430 Green Bank Observatory, P.O. Box 2, Green Bank, WV 24944, USA    Chung-Pei Ma 0000-0002-4430-102X Department of Astronomy, University of California, Berkeley, 501 Campbell Hall #3411, Berkeley, CA 94720, USA Department of Physics, University of California, Berkeley, CA 94720, USA    Dustin R. Madison 0000-0003-2285-0404 Department of Physics, University of the Pacific, 3601 Pacific Avenue, Stockton, CA 95211, USA    Alexander McEwen 0000-0001-5481-7559 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   James W. McKee 0000-0002-2885-8485 E.A. Milne Centre for Astrophysics, University of Hull, Cottingham Road, Kingston-upon-Hull, HU6 7RX, UK Centre of Excellence for Data Science, Artificial Intelligence and Modelling (DAIM), University of Hull, Cottingham Road, Kingston-upon-Hull, HU6 7RX, UK    Maura A. McLaughlin 0000-0001-7697-7422 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Natasha McMann 0000-0002-4642-1260 Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Patrick M. Meyers 0000-0002-2689-0190 Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA    Bradley W. Meyers 0000-0001-8845-1225 Department of Physics and Astronomy, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1, Canada International Centre for Radio Astronomy Research, Curtin University, Bentley, WA 6102, Australia    Chiara M. F. Mingarelli 0000-0002-4307-1322 Department of Physics, Yale University, New Haven, CT 06520, USA    Andrea Mitridate 0000-0003-2898-5844 Deutsches Elektronen-Synchrotron DESY, Notkestr. 85, 22607 Hamburg, Germany    Cherry Ng 0000-0002-3616-5160 Dunlap Institute for Astronomy and Astrophysics, University of Toronto, 50 St. George St., Toronto, ON M5S 3H4, Canada    David J. Nice 0000-0002-6709-2566 Department of Physics, Lafayette College, Easton, PA 18042, USA    Stella Koch Ocker 0000-0002-4941-5333 Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA The Observatories of the Carnegie Institution for Science, Pasadena, CA 91101, USA    Ken D. Olum 0000-0002-2027-3714 Institute of Cosmology, Department of Physics and Astronomy, Tufts University, Medford, MA 02155, USA    Timothy T. Pennucci 0000-0001-5465-2889 Institute of Physics and Astronomy, Eötvös Loránd University, Pázmány P. s. 1/A, 1117 Budapest, Hungary    Benetge B. P. Perera 0000-0002-8509-5947 Arecibo Observatory, HC3 Box 53995, Arecibo, PR 00612, USA    Nihan S. Pol 0000-0002-8826-1285 Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Henri A. Radovan 0000-0002-2074-4360 Department of Physics, University of Puerto Rico, Mayagüez, PR 00681, USA    Scott M. Ransom 0000-0001-5799-9714 National Radio Astronomy Observatory, 520 Edgemont Road, Charlottesville, VA 22903, USA    Paul S. Ray 0000-0002-5297-5278 Space Science Division, Naval Research Laboratory, Washington, DC 20375-5352, USA    Joseph D. Romano 0000-0003-4915-3246 Department of Physics, Texas Tech University, Box 41051, Lubbock, TX 79409, USA    Jessie C. Runnoe 0000-0001-8557-2822 Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Alexander Saffer Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Shashwat C. Sardesai 0009-0006-5476-3603 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Carl Schmiedekamp 0000-0002-1283-2184 Department of Physics, Penn State Abington, Abington, PA 19001, USA    Ann Schmiedekamp 0000-0003-4391-936X Department of Physics, Penn State Abington, Abington, PA 19001, USA    Kai Schmitz 0000-0003-2807-6472 Institute for Theoretical Physics, University of Münster, 48149 Münster, Germany    Brent J. Shapiro-Albert 0000-0002-7283-1124 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA Giant Army, 915A 17th Ave, Seattle WA 98122    Xavier Siemens 0000-0002-7778-2990 Department of Physics, Oregon State University, Corvallis, OR 97331, USA Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Joseph Simon 0000-0003-1407-6607 NSF Astronomy and Astrophysics Postdoctoral Fellow Department of Astrophysical and Planetary Sciences, University of Colorado, Boulder, CO 80309, USA    Magdalena S. Siwek 0000-0002-1530-9778 Center for Astrophysics, Harvard University, 60 Garden St, Cambridge, MA 02138, USA    Ingrid H. Stairs 0000-0001-9784-8670 Department of Physics and Astronomy, University of British Columbia, 6224 Agricultural Road, Vancouver, BC V6T 1Z1, Canada    Daniel R. Stinebring 0000-0002-1797-3277 Department of Physics and Astronomy, Oberlin College, Oberlin, OH 44074, USA    Kevin Stovall 0000-0002-7261-594X National Radio Astronomy Observatory, 1003 Lopezville Rd., Socorro, NM 87801, USA    Abhimanyu Susobhanan 0000-0002-2820-0931 Max-Planck-Institut für Gravitationsphysik (Albert-Einstein-Institut), Callinstrasse 38, D-30167, Hannover, Germany    Joseph K. Swiggum 0000-0002-1075-3837 NANOGrav Physics Frontiers Center Postdoctoral Fellow Department of Physics, Lafayette College, Easton, PA 18042, USA    Stephen R. Taylor 0000-0003-0264-1453 Department of Physics and Astronomy, Vanderbilt University, 2301 Vanderbilt Place, Nashville, TN 37235, USA    Jacob E. Turner 0000-0002-2451-7288 Green Bank Observatory, P.O. Box 2, Green Bank, WV 24944, USA    Caner Unal 0000-0001-8800-0192 Department of Physics, Middle East Technical University, 06531 Ankara, Turkey Department of Physics, Ben-Gurion University of the Negev, Be’er Sheva 84105, Israel Feza Gursey Institute, Bogazici University, Kandilli, 34684, Istanbul, Turkey    Michele Vallisneri 0000-0002-4162-0033 Jet Propulsion Laboratory, California Institute of Technology, 4800 Oak Grove Drive, Pasadena, CA 91109, USA Division of Physics, Mathematics, and Astronomy, California Institute of Technology, Pasadena, CA 91125, USA    Sarah J. Vigeland 0000-0003-4700-9072 Center for Gravitation, Cosmology and Astrophysics, Department of Physics, University of Wisconsin-Milwaukee,
P.O. Box 413, Milwaukee, WI 53201, USA
   Haley M. Wahl 0000-0001-9678-0299 Department of Physics and Astronomy, West Virginia University, P.O. Box 6315, Morgantown, WV 26506, USA Center for Gravitational Waves and Cosmology, West Virginia University, Chestnut Ridge Research Building, Morgantown, WV 26505, USA    Caitlin A. Witt 0000-0002-6020-9274 Center for Interdisciplinary Exploration and Research in Astrophysics (CIERA), Northwestern University, Evanston, IL 60208, USA Adler Planetarium, 1300 S. DuSable Lake Shore Dr., Chicago, IL 60605, USA    David Wright 0000-0003-1562-4679 Department of Physics, Oregon State University, Corvallis, OR 97331, USA    Olivia Young 0000-0002-0883-0688 School of Physics and Astronomy, Rochester Institute of Technology, Rochester, NY 14623, USA Laboratory for Multiwavelength Astrophysics, Rochester Institute of Technology, Rochester, NY 14623, USA
Abstract

Pulsar-timing-array experiments have reported evidence for a stochastic background of nanohertz gravitational waves consistent with the signal expected from a population of supermassive–black-hole binaries. Their analyses assume power-law spectra for intrinsic pulsar noise and for the background, as well as a Hellings–Downs cross-correlation pattern among the gravitational-wave–induced residuals across pulsars. These assumptions may not be realized in actuality. We test them in the NANOGrav 15 yr data set using Bayesian posterior predictive checks. After fitting our fiducial model to real data, we generate a population of simulated data-set replications. We use the replications to assess whether the optimal-statistic significance, inter-pulsar correlations, and spectral coefficients are extreme. We recover Hellings–Downs correlations in simulated data sets at significance levels consistent with the correlations measured in the NANOGrav 15 yr data set. A similar test on spectral coefficients shows that their values in real data are not extreme compared to their distributions across replications. We also evaluate the evidence for the stochastic background using posterior-predictive versions of the frequentist optimal statistic and of Bayesian model comparison, and find comparable significance (3.2 σ\sigma and 3 σ\sigma respectively) to what was previously reported for the standard statistics. We conclude with novel visualizations of the reconstructed gravitational waveforms that enter the residuals for each pulsar. Our analysis strengthens confidence in the identification and characterization of the gravitational-wave background.

I Introduction

In June 2023, four separate publications based on the observations of five pulsar-timing-array (PTA) collaborations reported strong evidence for a nanohertz gravitational-wave (GW) background [1, 2, 3, 4], spurring interest in the implications of its spectral properties and spatial correlations for astrophysics and fundamental physics [5, 6, 7, 8]. If the signal originates from a population of supermassive black hole binaries (SMBHBs), its spectrum is expected to approximate a power law [9, 10], but deviations can be caused by a large number of potential effects. For example, at low frequencies, interactions between the binaries and the surrounding gas may result in a spectral turnover; at high frequencies, the finite number of binaries emitting in each frequency bin may result in bin-to-bin fluctuations [7, 5, 11]. If the signal originates from new physics, the spectrum can point to the mechanism of its generation, and a large number of models are currently consistent with the data [6, 12].

Spatial correlations between pulse times of arrival (TOAs) for different pulsars were found to be consistent with the Hellings–Downs function, the correlation pattern induced by an isotropic GW background [13, 14, 15]. Deviations could be caused by anisotropy in the background, by a signal from a loud individual SMBHB. Measuring anisotropy would constrain black-hole population properties [16], while detecting an individual SMBHB would offer a prime target for multi-messenger follow up. However, dedicated searches for anisotropy and individual sources have so far produced null results [17, 18, 19]. Systematic errors could also induce correlations between pulsars, e.g. monopolar correlations due to clock errors, or dipolar correlations induced by errors in the solar system ephemeris [20, 21]. There is slight evidence for monopolar correlations presented in the NANOGrav 15 yr data set [1].

Simulations can address the expected level of anisotropy from a population of SMBHBs [22, 16] and its detectability using standard PTA models, which assume an isotropic GW background with Gaussian statistics and a stationary power-law spectrum. Indeed, Refs. [23, 24] found that the GW signal from a realistic SMBHB population would still be detected using standard models. Thus, current PTA observations [1, 2, 3, 4, 25] do not preclude the presence of astrophysically interesting deviations from power-law spectrum or isotropy.

In this paper, we ask whether the power-law and Hellings–Downs assumptions are supported by observed data, independently of any specific alternative physical model. Our starting point is a fiducial Bayesian analysis of NANOGrav’s 15 yr data set [26] under the standard power-law, Hellings–Downs model. We test these assumptions by way of posterior predictive model checks [27] as proposed in the context of PTA data in Refs. [28, 29]. These checks consist of creating populations of replicated data sets from real-data parameter posteriors, and using these replications to evaluate whether real data is “typical” (i.e., not a statistical outlier) according to a variety of detection, spectral, and correlation statistics. Similar types of checks are becoming increasingly common in the realm of binary black hole population analyses as well [30, 31, 32, 33, 34, 35, 36, 37, 38].

Specifically, following Ref. [28] we re-evaluate the significance of Hellings–Downs correlations and search for alternative spatial correlations using a new detection statistic that marginalizes pp-values over noise-parameter posteriors. Following Ref. [29] we test the power-law assumption by comparing intrinsic-noise and GW power-spectrum posteriors as computed for real and replicated data, and we perform a similar test for the binned angular correlations between pulsars. We also carry out leave-one-out cross validation to identify possible mismodeling in individual pulsars, and to compute the pseudo Bayes factor (a cross-validation metric of model comparison) between the standard Hellings–Downs model and a null model in which common excess power has no inter-pulsar correlations.

The rest of this paper is organized as follows. In Sec. II we describe our data and data model, and we introduce two sets of data replications that we will use for model checking. In Sec. III we test Hellings–Downs correlations using Bayesian pp-values [28] for the optimal statistic [39, 40]; these pp-values are marginalized over GW and intrinsic-noise posteriors, and therefore account fairly for the risk of false positives when the null distribution is uncertain. We find evidence for Hellings–Downs correlations at the 3.2σ3.2\,\sigma level. We also evaluate the evidence for additional background components with monopolar or dipolar correlations, and find none.

In Sec. IV we compare real-data and replicated-data posteriors to search for deviations from a power-law spectrum and from Hellings–Downs correlations. We find no evidence that any individual frequency bin deviates from the power-law model for either intrinsic pulsar noise or the GW background, consistent with Ref. [41]. We also find no evidence that any of the binned inter-pulsar correlations deviates from the Hellings–Downs curve.

In Sec. V we examine the predictive power of the standard PTA model as fit to the NANOGrav 15 yr data. We perform a leave-one-out analysis where we fit Hellings–Downs and uncorrelated models to Np1N_{p}-1 pulsars, and use the models to predict the NpthN_{p}^{\textrm{th}} pulsar’s data. The resulting pseudo Bayes factors favor Hellings–Downs correlations at the 3σ3\,\sigma level. Using simulations, we show that the distribution of the factors across pulsars is consistent with what would be expected for a power-law, Hellings–Downs-correlated GW background with parameters from our fiducial analysis.

Last, in Sec. VI we present the gravitational waveforms that can be reconstructed for each pulsar from our fiducial posteriors. These reconstructions are akin to the waveform reconstructions for stellar-binary coalescences based on LIGO data [42, 43], with the distinction being that in this case we show the estimated realization of a broadband, spatially correlated stochastic signal, as opposed to the gravitational waveform produced by a single binary system. Pulsar J1909-3744 offers the best view so far of the GW background reported in [1, 2, 3, 4, 25]. In Sec. VII we offer concluding remarks.

II Data, model, and data replications

In this section we introduce the NANOGrav 15 yr data set, the modeling that is performed on each pulsar, and the full PTA models used to search for a GW background. We then discuss data replications based on our typical PTA models, which we use in subsequent sections to compare to the 15 yr data set for the purposes of model checking and model comparison.

II.1 Data

We use the NANOGrav 15 yr data set, which contains 67 pulsars that have been timed for more than 3 years, with 16.03 yrs of data between the first and the last time of arrival in the data set [26]. We use the DMX dispersion measure noise model [44] and white noise parameters included in the NANOGrav 15 yr data release [26]. For each pulsar, a best fit timing model is constructed that accounts for deterministic effects like Roemer delay, proper motion, parallax, binary orbits, etc., which is then subtracted from the TOAs to produce a set of timing residuals for each pulsar, δ𝐭\delta\mathbf{t}. Stochastic processes like achromatic intrinsic spin-wandering and GW background-induced delays are included in this initial fit as a single “total red noise” contribution, as the first pass analysis is done on a pulsar-by-pulsar basis and so we cannot separate intrinsic pulsar noise from the GW background.

II.2 PTA model

In this subsection, we discuss the full PTA model that is used to search for a GW background. Readers familiar with this already can skip to Sec. III, although later we will make frequent reference to equations introduced in this section. For a more in depth presentation of the PTA analyses, see Refs. [45, 46, 47].

The starting point for the analysis are the timing residuals, δ𝐭\delta\mathbf{t}. We characterize stochastic processes like intrinsic pulsar noise and the GW background in the frequency domain using a Fourier matrix 𝐅\mathbf{F} and associated amplitudes 𝐚\mathbf{a} [48]. The stochastic processes are covariant with elements of the timing model (specifically the frequency, spin-down, and dispersion measure variations), and so we also introduce deviations from the best-fitting timing model parameters, ϵ\bm{\epsilon}. We assume these deviations are small, such that changes in δ𝐭\delta\mathbf{t} are linear in changes in ϵ\bm{\epsilon} with a design matrix 𝐌\mathbf{M} made up of derivatives of δ𝐭\delta\mathbf{t} with respect to the timing model parameters. Putting these effects together, we have a model for the residuals

𝐫=δ𝐭𝐓𝐛,\displaystyle\mathbf{r}=\delta\mathbf{t}-\mathbf{T}\mathbf{b}\,, (1)

where we have consolidated the frequency domain representation and timing model corrections,

𝐓\displaystyle\mathbf{T} =[𝐌𝐅],\displaystyle=\begin{bmatrix}\mathbf{M}&\mathbf{F}\end{bmatrix}, (2)
𝐛\displaystyle\mathbf{b} =[ϵ𝐚].\displaystyle=\begin{bmatrix}\bm{\epsilon}\\ \mathbf{a}\end{bmatrix}\,. (3)

If radio frequency interference is effectively excised and standard pulse profiles are accurate, the resulting noise is dominated by radiometer noise and “pulse profile jitter” which is traditionally assumed to be frequency independent and Gaussian. This leads to a Gaussian likelihood for the timing residuals

lnp(δ𝐭|𝐛)=12[𝐫T𝐍1𝐫+lndet(2π𝐍)],\displaystyle\ln p(\delta\mathbf{t}|\mathbf{b})=-\frac{1}{2}\left[\mathbf{r}^{T}\mathbf{N}^{-1}\mathbf{r}+\ln\det(2\pi\mathbf{N})\right]\,, (4)

where the covariance matrix 𝐍\mathbf{N} describes the measurement noise of the individual observations, and is block-diagonal. TOAs at different radio frequencies from the same individual observation are correlated with one another due to pulse profile jitter [49], but TOAs from different observations are uncorrelated.

We assume that the GW background and the intrinsic pulsar noise are stationary, and so they can be characterized by the power spectrum of the GW background, correlations between pulsars, and the power spectrum of the intrinsic pulsar noise in each pulsar. The assumption of stationarity for the GW background should hold if the dominant contribution to the background is an ensemble of SMBHBs emitting at roughly constant frequencies. The assumption that intrinsic pulsar noise is stationary is one of expedience that should be tested. Tests on the European Pulsar Timing Array second data release show no signs of non stationarity [50].

Information about the power-law amplitude and spectral index for the intrinsic pulsar noise and the GW background is encoded in the covariance matrix of the sine and cosine amplitudes 𝐚\mathbf{a} across pulsars. We introduce a set of hyperparameters 𝚲\bm{\Lambda} to characterize these power laws. We place a Gaussian prior on 𝐛\mathbf{b},

lnp(𝐛|𝚲)\displaystyle\ln p(\mathbf{b}|\bm{\Lambda}) =12[𝐛T𝐁1𝐛+lndet(2π𝐁)],\displaystyle=-\frac{1}{2}\left[\mathbf{b}^{T}\mathbf{B}^{-1}\mathbf{b}+\ln\det(2\pi\mathbf{B})\right], (5)
where 𝐁\displaystyle\textrm{where }\mathbf{B} =[00𝝋(𝚲)].\displaystyle=\begin{bmatrix}\infty&0\\ 0&\bm{\varphi}(\bm{\Lambda})\end{bmatrix}\,. (6)

We use an improper uniform prior on ϵ\bm{\epsilon} so that its posterior is determined by the likelihood. This prior is now broadcast across 𝐛\mathbf{b} parameters for each pulsar. The covariance matrix of the 𝐚\mathbf{a} coefficients is given by 𝝋(𝚲)\bm{\varphi}(\bm{\Lambda}), which contains blocks corresponding to correlations of the Fourier modes between pulsars. Diagonal blocks encode information about the power spectrum of the total red noise for a given pulsar, including the intrinsic pulsar noise, 𝜼a(𝚲)\bm{\eta}_{a}(\bm{\Lambda}) (where the aa subscript labels the pulsar) and the GW background spectrum 𝝆(𝚲)\bm{\rho}(\bm{\Lambda}). Off-diagonal blocks between pulsars aa and bb contain (scaled) contributions from the GW background. Putting all of this together, the covariance matrix for 𝐚\mathbf{a} is

φ(𝚲)(ai,bj)=Γabρi2(𝚲)δij+ηai2(𝚲)δijδab,\displaystyle\varphi(\bm{\Lambda})_{(ai,bj)}=\Gamma_{ab}\rho_{i}^{2}(\bm{\Lambda})\delta_{ij}+\eta_{ai}^{2}(\bm{\Lambda})\delta_{ij}\delta_{ab}\,, (7)

where ii and jj label frequencies and Γab\Gamma_{ab} corresponds to the correlations between pulsars. Different angular correlation patterns correspond to different models. In this paper we consider four models. The first states that Γab\Gamma_{ab} follows the Hellings–Downs curve (hd model) that is expected from an isotropic GW background,

Γab\displaystyle\Gamma_{ab} =12δab+12ζab4+32ζablnζab,\displaystyle=\frac{1}{2}\delta_{ab}+\frac{1}{2}-\frac{\zeta_{ab}}{4}+\frac{3}{2}\zeta_{ab}\ln\zeta_{ab}\,, (8)
ζab\displaystyle\zeta_{ab} =1cosθab2,\displaystyle=\frac{1-\cos\theta_{ab}}{2}\,, (9)

where θab\theta_{ab} is the angle between pulsars aa and bb on the sky. The second is that Γab=δab\Gamma_{ab}=\delta_{ab}, which we call the common uncorrelated red noise, curn, model. We will also consider a mono model that is characterized by monopolar correlations, Γab=1\Gamma_{ab}=1 and a model with dipolar correlations, dip, with Γab=cosθab\Gamma_{ab}=\cos\theta_{ab}. Theoretical models indicate that ρi(𝚲)\rho_{i}(\bm{\Lambda}) will roughly take the form of a power law, and past empirical studies suggest that ηai(𝚲)\eta_{ai}(\bm{\Lambda}) often follows a power-law as well,

ηai2(𝚲)\displaystyle\eta_{ai}^{2}(\bm{\Lambda}) =Arn,a212π2(fifyr)γrn,afyr3T,\displaystyle=\frac{A_{\textrm{rn,a}}^{2}}{12\pi^{2}}\left(\frac{f_{i}}{f_{\textrm{yr}}}\right)^{-\gamma_{\textrm{rn,a}}}\frac{f_{\textrm{yr}}^{-3}}{T}\,, (10)
ρi2(𝚲)\displaystyle\rho_{i}^{2}(\bm{\Lambda}) =Agw212π2(fifyr)γgwfyr3T,\displaystyle=\frac{A_{\textrm{gw}}^{2}}{12\pi^{2}}\left(\frac{f_{i}}{f_{\textrm{yr}}}\right)^{-\gamma_{\textrm{gw}}}\frac{f_{\textrm{yr}}^{-3}}{T}\,, (11)

where AgwA_{\textrm{gw}} is the amplitude of the GW background at fyr=(1yr)1f_{\textrm{yr}}=(1\textrm{yr})^{-1}, γgw\gamma_{\textrm{gw}} is the negative spectral index, Arn,aA_{\textrm{rn,a}} is the amplitude of intrinsic pulsar noise for pulsar aa and γrn,a\gamma_{\textrm{rn,a}} its associated spectral index. The frequency is given by fi=i/Tf_{i}=i/T, and TT is the time between the first and last TOAs in the data set. For intrinsic pulsar noise we use 30 frequencies, i[1,30]i\in[1,30] and for the GW background we use i[1,14]i\in[1,14]. These numbers were chosen based on individual pulsar fitting (for the intrinsic pulsar noise), and a dedicated curn analysis that allows for the common spectrum to “flatten” at high frequencies, where it then becomes indistinguishable from white noise.

The power-law models for the GW background and intrinsic pulsar noise spectra have amplitudes and spectral indices associated with them: AgwA_{\textrm{gw}}, γgw\gamma_{\textrm{gw}} for the GW background and, Arn,aA_{\textrm{rn,a}}, and γrn,a\gamma_{\textrm{rn,a}} for each of the NpN_{p} pulsars in the array. We collectively denote these parameters as 𝚲\bm{\Lambda}. To reduce the total number of parameters we need to infer, we typically marginalize over the model parameters 𝐛\mathbf{b}, leaving a posterior on the hyperparameters

p(𝚲|δ𝐭)\displaystyle p(\bm{\Lambda}|\delta\mathbf{t}) =d𝐛p(δ𝐭|𝐛)p(𝐛|𝚲)p(𝚲),\displaystyle=\int\textrm{d}\mathbf{b}\,p(\delta\mathbf{t}|\mathbf{b})p(\mathbf{b}|\bm{\Lambda})p(\bm{\Lambda})\,, (12)
=p(𝚲)det(2π𝐂)exp(12δ𝐭T𝐂1δ𝐭).\displaystyle=\frac{p(\bm{\Lambda})}{\sqrt{\det(2\pi\mathbf{C})}}\exp\left(-\frac{1}{2}\delta\mathbf{t}^{T}\mathbf{C}^{-1}\delta\mathbf{t}\right)\,. (13)

The covariance matrix is now 𝐂=(𝐍+𝐓𝐁𝐓T)\mathbf{C}=\left(\mathbf{N}+\mathbf{T}\mathbf{B}\mathbf{T}^{T}\right), and we introduced a prior on the hyperparameters p(𝚲)p(\bm{\Lambda}). We also note that p(𝐛|δ𝐭,𝚲)p(δ𝐭|𝐛)p(𝐛|𝚲)=𝒩(𝐛^,𝚺)p(\mathbf{b}|\delta\mathbf{t},\bm{\Lambda})\propto p(\delta\mathbf{t}|\mathbf{b})p(\mathbf{b}|\bm{\Lambda})=\mathcal{N}(\widehat{\mathbf{b}},\bm{\Sigma}), which is normal with mean and covariance given by

𝐛^\displaystyle\widehat{\mathbf{b}} =𝚺𝐓T𝐍1δ𝐭,\displaystyle=\bm{\Sigma}\mathbf{T}^{T}\mathbf{N}^{-1}\delta\mathbf{t}\,, (14)
𝚺\displaystyle\bm{\Sigma} =(𝐓T𝐍1𝐓+𝐁1)1.\displaystyle=\left(\mathbf{T}^{T}\mathbf{N}^{-1}\mathbf{T}+\mathbf{B}^{-1}\right)^{-1}\,. (15)

We estimate the marginalized posterior on 𝚲\bm{\Lambda} using stochastic sampling methods [51] because of the large dimension of 𝚲\bm{\Lambda} (2Np+22N_{p}+2 in the case described above). This yields NsN_{s} samples {𝚲s}s=1Ns\{\bm{\Lambda}^{s}\}_{s=1}^{N_{s}} approximately drawn from the posterior,

𝚲sp(𝚲|δ𝐭).\displaystyle\bm{\Lambda}^{s}\sim p(\bm{\Lambda}|\delta\mathbf{t})\,. (16)

II.3 Data replications

Below we use δ𝐭\delta\mathbf{t} to refer to generic timing residuals, δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} to refer to residuals from the the 15 yr data set, and δ𝐭rep\delta\mathbf{t}^{\textrm{rep}} to refer to data replications. We use two models to create sets of data replications to compare to the collected data. Each method proceeds along similar lines:

  1. 1.

    Choose 𝚲\bm{\Lambda} by drawing randomly from p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}).

  2. 2.

    Draw 𝐛p(𝐛|𝚲)\mathbf{b}\sim p(\mathbf{b}|\bm{\Lambda}). The choice of p(𝐛|𝚲)p(\mathbf{b}|\bm{\Lambda}) depends upon the set of replications we are performing. We specify details below when we discuss individual replication sets. This method nominally calls for us to draw from the improper prior on ϵ\bm{\epsilon}, yielding unusable timing residuals. Therefore, we do not simulate timing model variations, and fix ϵ0\bm{\epsilon}\approx 0.

  3. 3.

    Draw δ𝐭rep𝒩(𝐓𝐛,𝐍)\delta\mathbf{t}^{\textrm{rep}}\sim\mathcal{N}(\mathbf{T}\mathbf{b},\mathbf{N}) where 𝐛\mathbf{b} comes from the previous step.

The data replications use different models at each stage. We outline the different data replication sets, their purpose, and what models they use to carry out the procedure described above.

  • CURNPosteriorDraws: We create simulated data sets based on the curn model which we index with ss. We draw 𝚲sp(𝚲|δ𝐭15yr,curn)\bm{\Lambda}^{s}\sim p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{curn}), and 𝐛s𝒩(0,𝐁(𝚲s)|curn)\mathbf{b}^{s}\sim\mathcal{N}(0,\mathbf{B}(\bm{\Lambda}^{s})|\textsc{curn}), Eqs. (5) and (6). The conditioning on curn implies no correlations between pulsars, Γab=δab\Gamma_{ab}=\delta_{ab}. We do not simulate timing model variations, i.e., ϵ=0\bm{\epsilon}=0. These sets of data replications are compared to the data and recovered model parameters.

  • HDPosteriorDraws: We create simulated data sets based on the hd model which we index with ss. We draw 𝚲sp(𝚲|δ𝐭15yr,hd)\bm{\Lambda}^{s}\sim p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd}), and 𝐛s𝒩(0,𝐁(𝚲s)|hd)\mathbf{b}^{s}\sim\mathcal{N}(0,\mathbf{B}(\bm{\Lambda}^{s})|\textsc{hd}). The conditioning on hd implies we include Hellings–Downs correlations between pulsars during simulation. We do not simulate timing model variations, i.e., ϵ=0\bm{\epsilon}=0. These sets of data replications are compared to the real data and recovered model parameters to assess how consistent the data are with the hd model.

III Posterior predictive null hypothesis testing

Given 𝚲sp(𝚲|δ𝐭)\bm{\Lambda}^{s}\sim p(\bm{\Lambda}|\delta\mathbf{t}), we perform posterior predictive checks by checking whether specific desired properties of the model are consistent in the data. To do this, we construct a test statistic, T(δ𝐭,𝚲)T(\delta\mathbf{t},\bm{\Lambda}) that is sensitive to the property we are interested in, and we compare that test statistic calculated in the 15 yr data to the same statistic calculated over data replications. Using this method we check (1) whether the 15 yr data are consistent with the lack of correlations assumed by the curn model; (2) whether the 15 yr data have correlations that are consistent with the HD curve; and (3) whether the 15 yr data show evidence for alternative spatial correlations, e.g., monopolar or dipolar, inconsistent with both the hd model and the curn model.

III.1 CURN model tests

The curn model is characterized by a lack of spatial correlations between pulsars, Γab=δab\Gamma_{ab}=\delta_{ab}. To reject this model, we use the optimal statistic signal-to-noise ratio (SNR) as our test statistic [40, 52, 39, 53]. The SNR, for a given choice of noise parameters, is distributed according to a generalized χ2\chi^{2} distribution [54]; it is large when Hellings–Downs correlations are present and centered around zero when no spatial correlations are present.

The optimal statistic depends upon the total red noise in each pulsar (intrinsic pulsar noise and GW background), which we do not know a priori. Therefore, current analyses average the SNR over the posterior distribution on the noise parameters,

SNR¯=d𝚲p(𝚲|δ𝐭15yr)SNR(δ𝐭15yr;𝚲)\displaystyle\overline{\textrm{SNR}}=\int\textrm{d}\bm{\Lambda}\,p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}})\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda})
1Nss=1NsSNR(δ𝐭15yr;𝚲s),\displaystyle\approx\frac{1}{N_{s}}\sum_{s=1}^{N_{s}}\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}^{s})\,, (17)

where in the second line we perform a Monte Carlo integral using a finite set of NsN_{s} posteriors samples [53]. SNR¯\overline{\textrm{SNR}} is then used as the test statistic for null hypothesis testing. The motivation for using this “noise marginalized optimal statistic” is that we are marginalizing over uncertainty in the noise and signal parameters when we calculate statistical significance. One uses “phase shifts” [55], “sky scrambles” [56], or data replications to estimate the null distribution of SNR¯\overline{\textrm{SNR}} and calculate a pp-value for the measured SNR¯\overline{\textrm{SNR}}. In Ref. [1], SNR¯5\overline{\textrm{SNR}}\approx 5, which falls at the 3.54σ3.5-4\sigma level in the null distributions, indicating that the curn model does not fully describe the data.

Here, we use a more conservative statistic that gives more weight to low-SNR outliers and lower weight to high SNR outliers than the noise marginalized optimal statistic [28]. The cumulative distribution function is not linear in the SNR, and so we first calculate the pp-value of the SNR calculated on each 𝚲s\bm{\Lambda}^{s}, and then average those pp-values together. The resulting pp-value will be less significant than the one calculated on SNR¯\overline{\textrm{SNR}}. Conceptually, this can be thought of as averaging over the risk of rejecting the null hypothesis by placing more weight on the most conservative noise realizations. By contrast, calculating a pp-value on SNR¯\overline{\textrm{SNR}} weighs high-SNR (and therefore less conservative noise realizations) equally to low SNR noise realizations.

We compare the value of the optimal statistic on the observed data to its value on data replications from the posterior predictive distribution. We calculate a pp-value on each SNR(δ𝐭15yr;𝚲s)\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}^{s}), and average those pp-values. This final, averaged pp-value is referred to as a posterior predictive pp-value or a Bayesian pp-value because it is marginalized over the posterior predictive distribution for the data [57, 58, 27]. We can do this generically, when we do not know the distribution of the test statistic, by calculating

pB=\displaystyle p_{B}=\int\int Θ[SNR(δ𝐭rep,𝚲)SNR(δ𝐭15yr,𝚲)]\displaystyle\Theta\left[\mathrm{SNR}(\delta\mathbf{t}^{\textrm{rep}},\bm{\Lambda})-\mathrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}},\bm{\Lambda})\right]
×p(δ𝐭rep|𝚲,curn)p(𝚲|δ𝐭15yr)d(δ𝐭rep)d𝚲.\displaystyle\times p(\delta\mathbf{t}^{\textrm{rep}}|\bm{\Lambda},\textsc{curn})\,p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}})\,\mathrm{d}(\delta\mathbf{t}^{\mathrm{rep}})\,\mathrm{d}\bm{\Lambda}\,. (18)

Here Θ\Theta is the Heaviside function, and p(δ𝐭rep|𝚲,curn)p(\delta\mathbf{t}^{\mathrm{rep}}|\bm{\Lambda},\textsc{curn}) could be one of the data replications described in Sec. II.3, or it could be a set of “bootstrapped” data replications like sky scrambles or phase shifts.111We use δ𝐭rep\delta\mathbf{t}^{\mathrm{rep}} to also denote sky scrambled and phase shifted data sets, in addition to actual simulated data sets. This is somewhat poor notation. When performing sky scrambles and phase shifts the timing residuals themselves are the same as δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} and it is the sky position (sky scrambles) or 𝐅\mathbf{F} (phase shifts) that changes. Nevertheless, we use δ𝐭rep\delta\mathbf{t}^{\mathrm{rep}} to refer generically to any data sets or schemes used to construct the null distribution, for simplicity. If the analytic distribution for the SNR for a given choice of 𝚲s\bm{\Lambda}^{s} is known, then we do not need to actually perform data replications, and Θ[]p(δ𝐭rep|𝚲,curn)dδ𝐭rep\int\Theta[\cdot]p(\delta\mathbf{t}^{\mathrm{rep}}|\bm{\Lambda},\textsc{curn})\,\mathrm{d}\delta\mathbf{t}^{\mathrm{rep}} is the inverse cumulative distribution function for the SNR. The Bayesian pp-value reduces to

pB(δ𝐭15yr)\displaystyle p_{B}(\delta\mathbf{t}^{15\textrm{yr}}) =P[SNR(δ𝐭rep;𝚲)>SNR(δ𝐭15yr;𝚲)]\displaystyle=\int P[\textrm{SNR}(\delta\mathbf{t}^{\textrm{rep}};\bm{\Lambda})>\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda})]
×p(𝚲|δ𝐭15yr)d𝚲\displaystyle\times p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}})\,\mathrm{d}\bm{\Lambda}
1Ns\displaystyle\approx\frac{1}{N_{s}} s=1NsP[SNR(δ𝐭rep,s;𝚲s)>SNR(δ𝐭15yr;𝚲s)],\displaystyle\sum_{s=1}^{N_{s}}P[\textrm{SNR}(\delta\mathbf{t}^{\textrm{rep,s}}\bm{;}\bm{\Lambda}^{s})>\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}^{s})]\,, (19)

where in the second line we evaluate the integral numerically using draws from p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}). The superscript “rep” indicates that the inverse cumulative distribution function on the measured SNR(δ𝐭15yr;𝚲)\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}) is calculated over (theoretical or actual) data replications or sky scrambles.

The probability distribution function for the optimal statistic SNR for fixed 𝚲s\bm{\Lambda}^{s}, under the noise model, is a generalized χ2\chi^{2} distribution [54] which we will refer to as GX2 moving forward. In Fig. 1 we show SNR(δ𝐭15yr;𝚲s)\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}^{s}) for 100 draws along the bottom, and each blue curve is P[SNRrep,s(δ𝐭rep,s;𝚲s)>SNR(δ𝐭15yr;𝚲s)]P[\textrm{SNR}^{\textrm{rep,s}}(\delta\mathbf{t}^{\textrm{rep,s}}\bm{;}\bm{\Lambda}^{s})>\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}^{s})], calculated using the GX2 distribution. The dashed red line gives pB=7×104p_{B}=7\times 10^{-4}, which corresponds to 3.2σ3.2\,\sigma significance in favor of rejecting the curn model. By contrast, the significance of the SNR maximum-likelihood draw from p(𝚲|δ𝐭)p(\bm{\Lambda}|\delta\mathbf{t}) calculated using GX2 was 2×104\approx 2\times 10^{-4} or 3.5σ3.5\,\sigma.

In using the GX2 distribution, we assumed that the noise model is correct. Instead, we can construct replications of our data set that break correlations due to GWs, but preserve any mismodeling in the noise that might cause large SNR values in favor of correlations. The two main methods for doing this are sky scrambles [56] and phase shifts [55]. For each 𝚲s\bm{\Lambda}^{s}, we perform 400,000 sky scrambles, where we artificially move the location of the pulsars to different positions on the sky drawn uniformly on the two-sphere and calculate a “new” Hellings–Downs curve using these new positions. Using these sky scrambles, we build a null distribution and calculate significance. We repeat this for 100 draws of 𝚲s\bm{\Lambda}^{s} and average the inverse CDFs as in Eq. (19). Under this procedure, we find pB=1×104p_{B}=1\times 10^{-4}, which corresponds to an equivalent Gaussian significance of 3.7σ3.7\,\sigma. Using sky scrambles, Ref. [1] found p=5×105p=5\times 10^{-5} using the traditional procedure of building a null distribution for SNR¯\overline{\textrm{SNR}}. The results are shown in Fig. 2, where the inverse CDFs for sky scrambles (green) in general fall off faster than for the GX2. However, there are some outliers resulting in pBp_{B} being larger than the pp-value calculated on SNR¯\overline{\mathrm{SNR}} using scrambles.

It is unclear why the inverse CDF for sky scrambles generally falls off faster than for the GX2; this is an open area of investigation [59]. In previous work, methods of generating a background distribution from sky scrambling or phase shifting use a “match statistic,” in an attempt to use scrambles or shifts that are quasi-independent of one another and the Hellings–Downs curve [60, 61]. Recently, in Ref. [62], the authors suggested only sky scrambles that produce correlation curves that are independent of one another should be used, where independence is achieved by insisting the match statistic disappear. In Ref. [1], the condition is that the match threshold between any one sky scramble and all others is 0.2\lesssim 0.2. Here we do not use a match statistic, as our goal is to estimate the probability that the pulsars would be arranged on the sky in such a way that noise fluctuations would produce Hellings–Downs correlations. To test this, we must draw the positions uniformly on the sky. How to produce reliable null distributions for data sets that preserve potentially unmodeled noise is still subject to exploration.

Refer to caption
Figure 1: Null hypothesis testing results using the GX2 distribution. The inverse CDF curve is shown in blue for 100 draws from a curn MCMC chain. The optimal statistic SNR for each of those draws are indicated by the black lines at the bottom, and histogrammed in black in the top panel. We show pBp_{B}, averaged over pp-values calculated using the GX2 distribution, with the red dashed line. The histogram of the pp-value calculated for each draw is shown in the blue histogram in the right panel. The green dot-dashed line indicates SNR¯\overline{\textrm{SNR}}. The gray horizontal lines correspond to different Gaussian equivalent sigma levels.
Refer to caption
Figure 2: Same as Fig. 1, but for sky scrambled distributions. Each green line corresponds to an inverse CDF for a given 𝚲s\bm{\Lambda}^{s}. The green histogram in the right panel corresponds to the pp-value of the SNR(δ𝐭15yr;𝚲s)\mathrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}};\bm{\Lambda}^{s}) for each 𝚲s\bm{\Lambda}^{s}, and the blue dashed line in the middle and right panels corresponds to the average of those pp-values, which is pBp_{B} for sky scrambles. In the center panel, we have included the blue GX2 inverse CDFs from Fig. 1 for reference. The sky scrambles result in a more significant pp-value and pBp_{B}.

III.2 Consistency with the HD model

In the previous subsection, we reject the null hypothesis of the curn model at the 3.2 σ\sigma level. In this section, we use the same scheme to test whether the data are consistent with Hellings–Downs correlations. Given that we only have an analytic null distribution for the optimal statistic, we use a Monte Carlo integral for Eq. (18) to evaluate pBp_{B} in the presence of a potential signal:

pB1Ns=1NΘ[SNR(δ𝐭rep,𝚲s)SNR(δ𝐭15yr,𝚲s)],\displaystyle p_{B}\approx\frac{1}{N}\sum_{s=1}^{N}\Theta\left[\textrm{SNR}(\delta\mathbf{t}^{\textrm{rep}},\bm{\Lambda}^{s})-\textrm{SNR}(\delta\mathbf{t}^{15\textrm{yr}},\bm{\Lambda}^{s})\right], (20)

where we average over N=1,000N=1,000 HDPosteriorDraws replications. We show a scatter plot in Fig. 3, where the xx-axis shows SNR(δ𝐭,𝚲s)\mathrm{SNR}(\delta\mathbf{t},\bm{\Lambda}^{s}) and the yy-axis shows SNR(δ𝐭rep, s,𝚲s)\mathrm{SNR}(\delta\mathbf{t}^{\textrm{rep, s}},\bm{\Lambda}^{s}), and pBp_{B} corresponds to the fraction of points above the line y=xy=x. In this case, we find pB=0.534p_{B}=0.534, indicating that the 15 yr NANOGrav data are consistent with data replications that assume Hellings–Downs correlations.

Refer to caption
Figure 3: We show a comparison of SNR(δ𝐭15yr,𝚲s)(\delta\mathbf{t}^{15\textrm{yr}},\bm{\Lambda}^{s}) (xx-axis) with SNR(δ𝐭rep,𝚲s)(\delta\mathbf{t}^{\mathrm{rep}},\bm{\Lambda}^{s}) (yy-axis). Each point corresponds to a draw from the posterior 𝚲s\bm{\Lambda}^{s}. The fact that roughly half the points fall above the line y=xy=x, and pB=0.53p_{B}=0.53 indicates that the measured SNR for Hellings–Downs correlations on the 15 yr data is consistent the 1,000 HDPosteriorDraws replications.

III.3 Additional spatial correlations

The model for a GW background assumes spatial correlations that follow the HD curve, but other spatial correlations could arise either from statistical fluctuations or due to mismodeling. Monopolar correlations could arise due to an error in the clock at each site, corresponding to a correlated offset that is common to all pulsars. Dipolar correlations could arise due to an error in the effective location and motion of the solar system barycenter [20]. We use the multiple component optimal statistic [52], which estimates the amplitude of monopolar, dipolar, and Hellings–Downs correlations simultaneously, to test whether our estimate of these correlations in the 15 yr data set is consistent with data replications from a pure HD model. The results and methods here follow App. H of Ref. [1]. The analysis is nearly identical, but with more simulations used to calculate pBp_{B}. The results and conclusion are the same as that analysis, but we include it here both for completeness, and because it is strongly related to the rest of the new tests we have performed in this section.

The multiple component optimal statistic simultaneously produces the SNR for all three spatial correlations, SNRMCMONO{}^{\textrm{MONO}}_{\textrm{MC}}, SNRMCDIP{}^{\textrm{DIP}}_{\textrm{MC}}, and SNRMCHD{}^{\textrm{HD}}_{\textrm{MC}} where the superscript corresponds to the spatial correlation and the subscript indicates that we are using the multiple component optimal statistic. We again use 1,000 HDPosteriorDraws data replications described in Sec. II and calculate Eq. (20) substituting SNRMCMONO\textrm{SNR}^{\textrm{MONO}}_{\textrm{MC}} for SNR to produce pBMONOp_{B}^{\textrm{MONO}}. We produce pBDIPp_{B}^{\textrm{DIP}} and pBHDp_{B}^{\textrm{HD}} for dipole and Hellings–Downs correlations defined analogously.

We show similar visualizations to the previous section in Fig. 4. We find pBHD=0.64p_{B}^{\textrm{HD}}=0.64, which again indicates that the HD SNR calculated on the 15 yr data is consistent with what we expect from the pure hd model. Likewise, we find pBDIP=0.26p_{B}^{\textrm{DIP}}=0.26, consistent with no dipolar correlations. Finally, we find pBMONO=0.11p_{B}^{\textrm{MONO}}=0.11, which is largely consistent with no monopolar correlations.

Refer to caption
Refer to caption
Refer to caption
Figure 4: We compare SNRmono(δ𝐭15yr,𝚲s)\textrm{SNR}^{\textrm{{mono}}}(\delta\mathbf{t}^{15\textrm{yr}},\bm{\Lambda}^{s}) (xx-axis) with SNRmono(δ𝐭rep,𝚲s)\textrm{SNR}^{\textrm{{mono}}}(\delta\mathbf{t}^{\mathrm{rep}},\bm{\Lambda}^{s}) (yy-axis) in the left panel (dip and hd in center and right panels respectively), using the HDPosteriorDraws data replications. There is broad consistency between SNRhd\textrm{SNR}^{\textsc{hd}} in replications and 15 yr data set. For the dip and mono models we find that the recovered SNR is consistent with data replications that include only Hellings–Downs correlations.

IV Testing spectrum and correlation models

In this section, we assess the power-law assumption for the GW background and the intrinsic pulsar noise. We recap how to estimate the posterior distribution on the Fourier coefficients for the red noise, 𝐚\mathbf{a} at each frequency and for each pulsar for both the intrinsic pulsar noise and the GW background. Using the posterior on 𝐚\mathbf{a}, we construct the posterior distribution on the intrinsic pulsar noise and GW background power spectrum in each pulsar, which can now deviate from a power-law but are subject to a power-law prior distribution. We then test for deviations in the intrinsic pulsar noise spectrum for each pulsar and in the total GW background spectrum.

IV.1 Method

Refer to caption
Figure 5: Workflow for the analysis of Sec. IV based on predicted and inferred Fourier coefficients for the GW and individual-pulsar red noise spectrum. The black line corresponds to estimating p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}) directly after analytically marginalizing over the Fourier coefficients, and directly estimating the amplitude and spectral index for the power-law GW background and intrinsic pulsar noises. The blue line corresponds to generating “predicted” coefficients using the power spectrum to simulate Fourier coefficients. The maroon line indicates the inferred coefficients, which use the power-law power spectrum as a prior, combined with δ𝐭\delta\mathbf{t}, to further constrain 𝐚\mathbf{a}.

In this subsection, we summarize the methods outlined in Ref. [29]. Given p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}), we calculate 𝐚\mathbf{a} in two ways. In one method, 𝐚\mathbf{a} are conditioned on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} and 𝚲\bm{\Lambda}, which we refer to as the “inferred” coefficients (subscript “inf”) because they are drawn from the inferred posterior on 𝐚\mathbf{a} using information from both the power-law spectrum prior and the real data. In the other method, 𝐚\mathbf{a} are conditioned only on 𝚲\bm{\Lambda}, which we refer to as “predicted” coefficients (subscript “pre”) because these are the coefficients predicted by the power-law spectrum prior. In both cases, we marginalize over 𝚲\bm{\Lambda} and ϵ\bm{\epsilon}. We illustrate the workflow in Fig. 5. The posteriors on the inferred and predicted parameters are formally given by

pinf(𝐚|δ𝐭15yr)\displaystyle p_{\rm inf}(\mathbf{a}|\delta\mathbf{t}^{15\textrm{yr}}) =𝑑𝚲𝑑ϵp(𝐚,ϵ|𝚲,δ𝐭15yr,hd)\displaystyle=\int d\bm{\Lambda}\,d\bm{\epsilon}\;p(\mathbf{a},\bm{\epsilon}|\bm{\Lambda},\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd})\,
×p(𝚲|δ𝐭15yr,hd),\displaystyle\qquad\qquad\times p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd})\,, (21)
ppre(𝐚|δ𝐭15yr)\displaystyle p_{\rm pre}(\mathbf{a}|\delta\mathbf{t}^{15\textrm{yr}}) =𝑑𝚲𝑑ϵp(𝐚,ϵ|𝚲,hd)p(𝚲|δ𝐭15yr,hd)\displaystyle=\int d\bm{\Lambda}\,d\bm{\epsilon}\;p(\mathbf{a},\bm{\epsilon}|\bm{\Lambda},\textsc{hd})p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd})
=𝑑𝚲p(𝐚|𝚲,hd)p(𝚲|δ𝐭15yr,hd).\displaystyle=\int d\bm{\Lambda}\;p(\mathbf{a}|\bm{\Lambda},\textsc{hd})p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd})\,. (22)

The first term in the integrand differs between the predicted (no dependence on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}}) and inferred posteriors (which are conditioned on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}}). We discuss specifics of how these terms are evaluated below. The second term in the integrand is the posterior on 𝚲\bm{\Lambda}, e.g., power-law amplitudes and spectral indices.

In both Eqs. (21) and (22), we evaluate the posterior with a Monte Carlo integral. We first draw a sample (labeled with “ss” superscript) 𝚲sp(𝚲|δ𝐭15yr)\bm{\Lambda}^{s}\sim p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}). We then draw from the first term in the integrand. For the inferred coefficients we draw from p(𝐚,ϵ|𝚲s,δ𝐭15yr)p(\mathbf{a},\bm{\epsilon}|\bm{\Lambda}^{s},\delta\mathbf{t}^{15\textrm{yr}}), which is a Gaussian with mean and covariance that depend on both the prior and the data, and are given by Eqs. (14) and (15). For the predicted coefficients, we draw from just the prior distribution (i.e., no dependence on data), p(𝐚,ϵ|𝚲s)p(\mathbf{a},\bm{\epsilon}|\bm{\Lambda}^{s}), which is a zero-mean Gaussian given by Eqs. (5) and (6). More details on this scheme are discussed in Ref. [29].

For each 𝚲s\bm{\Lambda}^{s} we split 𝐚\mathbf{a} into 𝐚rn,as\mathbf{a}_{\textrm{rn},a}^{s} for intrinsic pulsar noise, and 𝐚gw,as\mathbf{a}_{\textrm{gw},a}^{s} for GWs for each pulsar aa. We then reconstruct the power spectrum for the intrinsic pulsar noise in each pulsar and the GW power observed by each pulsar. Each pulsar sees a different realisation of the GW background, but 𝐚gw,as\mathbf{a}_{\textrm{gw},a}^{s} are drawn from the same distribution for all pulsars aa. By contrast, the intrinsic pulsar noise is a different power spectrum for each pulsar, and therefore 𝐚rn,as\mathbf{a}_{\textrm{rn},a}^{s} are drawn from a different distribution for each individual pulsar.

For the inferred coefficients, by conditioning on the data, the power spectrum will deviate from a power law if the true data-generating process differs from a power law. For the predicted spectrum, we obtain different realizations of a power spectrum that are consistent with a power-law. In Sec. IV.2, we discuss results for the inferred and predicted intrinsic pulsar noise and GW spectra for each pulsar, and compare them to an “excess noise” analysis done in Ref. [41]. At each frequency, we use a modified version of the optimal statistic222See Sec. IVB2 in Ref. [29] for a discussion. to combine individual-pulsar coefficients to estimate the total the total GW power across the PTA in that frequency bin [63]. We present the results in Sec. IV.3.1.

Finally, for each 𝚲s\bm{\Lambda}^{s}, we produce pulsar pair-wise correlations and compare them to the expected Hellings–Downs curve. To do this, we draw coefficients from pinf(𝐚|δ𝐭15yr,hd)p_{\mathrm{inf}}(\mathbf{a}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd}) and ppred(𝐚|δ𝐭15yr,hd)p_{\mathrm{pred}}(\mathbf{a}|\delta\mathbf{t}^{15\textrm{yr}},\textsc{hd}), and use the optimal statistic to construct pair-wise correlations. In the case of the predicted parameters, this will give us an expected spread on correlation vs. angular separation for a given model. For the inferred parameters, by conditioning on the data, the correlations can deviate from the model. In Sec. IV.3.2, we look at reconstructions of the pair-wise correlations as a function of angular separation, and search for deviations from the Hellings–Downs curve.

IV.2 Power spectra of individual pulsars

Refer to caption
Refer to caption
Refer to caption
Figure 6: We show the total red noise (green, ρi2+ηai2\rho^{2}_{i}+\eta^{2}_{ai}[41], the inferred and predicted intrinsic pulsar noise (blue, ηai,inf2\eta^{2}_{ai,\textrm{inf}}; orange, ηai,pred2\eta^{2}_{ai,\textrm{pred}}) and inferred and predicted GW background (pink, ρai,inf2\rho^{2}_{ai,\textrm{inf}}; yellow, ρai,pred2\rho^{2}_{ai,\textrm{pred}}) for B1937+21 (top), J1012+5307 (middle), J1909-3744 (bottom). The boxes indicate the 50% credible interval, while the whiskers show the 5th and 95th percentiles. The red dashed line shows the maximum likelihood (ML) intrinsic pulsar noise power-law, and the black dashed line shows the same for the GW background. To reduce clutter, we only show ηai,pred2\eta^{2}_{ai,\textrm{pred}} in the top two panels, and ρai,pred2\rho^{2}_{ai,\textrm{pred}} in the bottom panel. The top two pulsars exhibit strong intrinsic pulsar noise, that is larger than the estimated background, because the greeen and blue boxes are larger than the pink ones. In the bottom panel, total red noise (green) is dominated by the GW background (pink, yellow), while the intrinsic pulsar noise (blue) is not detectable. In some frequency bins, there appears to be a lack of red noise, but this is consistent with what is expected from a power-law model. There is little evidence for excess noise, the strongest evidence being the 26th26^{\textrm{th}} bin for J1012+5307 (marked with a red arrow), which has a Bayesian pp-value of 0.03, which we discuss in the text.
Refer to caption
Refer to caption
Figure 7: For each pulsar on the horizontal axis we show pBp_{B} for all 30 (14) frequency bins in the top (bottom) panel. For most pulsars pBp_{B} is near 0.5, indicating that the inferred and predicted power spectra agree with one another, and so a power-law is an appropriate model for the intrinsic pulsar noise. For several pulsars, there is a broader spread in pBp_{B}, including two of the pulsars that we show in Fig. 6

.

Power spectra for each pulsar in the NANOGrav array are explored in Ref. [41], it is therefore worth contrasting the two results. First, Ref. [41] simultaneously estimated the total red noise (ρi2+ηai2\rho_{i}^{2}+\eta_{ai}^{2}) in each frequency bin ii, performing a separate analysis for each individual pulsar aa. A Savage-Dickey Bayes factor was calculated to estimate the significance of the total red noise at each frequency for each pulsar. Next, both the common red noise and intrinsic pulsar noise were fixed to the maximum likelihood values estimated from a curn analysis that assumes these spectra follow a power-law. Once fixing these parameters in their model, excess noise in each frequency bin for each pulsar was searched for. No evidence for excess noise was found, the power-law model for intrinsic pulsar noise and the common red noise processes are therefore sufficient.

In this work, we instead separate intrinsic pulsar noise and GW background contributions when drawing parameters from the inferred and predicted distributions, and we produce a posterior distribution on both contributions in each frequency bin for each pulsar. This way we are testing both the intrinsic pulsar noise and GW background power-law assumptions at the same time, while constructing full posteriors on the intrinsic pulsar noise and the GW background. This is in contrast to the search for excess noise on top of a power-law common red noise and intrinsic pulsar noise. Another difference is that in this work, individual frequency bin estimates are subject to a prior that follows a power-law, while Ref. [41] used a log-uniform prior on the power in each frequency bin.

In Fig. 6, we show results for two pulsars with strong intrinsic pulsar noise, B1937+21 and J1012+5307 (top two panels), and J1909-3744 which has no measurable intrinsic pulsar noise, but a contribution attributed to the GW background. The green boxes correspond to the estimates of the total red noise power from [41], which was discussed above. The blue and pink boxes correspond to the inferred intrinsic pulsar noise and GW background contributions respectively, orange boxes correspond to the predicted intrinsic pulsar noise, and yellow boxes correspond to the predicted GW background in the bottom panel. Similar plots for each pulsar are included in an electronic supplement.

In the top two panels, the intrinsic pulsar noise (inferred in blue boxes, predicted in orange boxes) typically agrees with the total red noise (green boxes) from Ref. [41]–indicating that the total red noise is dominated by intrinsic pulsar noise. The GW background is significantly below the intrinsic pulsar noise and the total red noise. Additionally, the orange distributions, corresponding to predicted intrinsic pulsar noise, agree with the inferred intrinsic pulsar noise. We quantify this agreement below. In the bottom panel, the total red noise agrees with the GW background, while the intrinsic pulsar noise is significantly lower. This is consistent with the GW background contributing significantly to δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} in this pulsar, with no intrinsic pulsar noise contribution. In a few cases the free spectrum total red noise (green boxes) deviates further from the power law than the inferred intrinsic pulsar noise (blue boxes). This is due to the power-law prior used for the inferred intrinsic pulsar noise, which will tend to move those parameters closer to the power law.

In a few situations, e.g., the first and fifth through seventh bins for J1909-3744, the estimated total red noise (green boxes) appears to be lower than the predicted (yellow boxes) and inferred spectra (pink boxes) for the GW background. The low total noise are consistent with the predicted distributions, which follow a χ2\chi^{2} with two degrees of freedom, and therefore have large support at those low values (despite what the box and whiskers show). The inferred distributions broadly agree with the predicted, and do show overlap with the green whiskers; but they do look inflated compared to the total noise. However, simply combining the GW background and intrinsic pulsar noise spectra in each bin may not reproduce the green boxes for a few reasons. First, interference between these two contributions may cancel or amplify the estimated total red noise above or below what we would expect from naively adding their contributions incoherently. Second, a log-uniform prior on the power in each frequency will likely reduce the upper limit on the estimated power in that bin when no red noise detection is made when compared to the upper limit we would set using a prior informed by the power law model

We then quantify excess intrinsic pulsar noise in individual frequency bins for each pulsar. For each pulsar aa and frequency bin ii we calculate

pB=1Ns=1NΘ(ηinf,ai2ηpred,ai2),\displaystyle p_{B}=\frac{1}{N}\sum_{s=1}^{N}\Theta(\eta^{2}_{\textrm{inf},ai}-\eta_{\textrm{pred},ai}^{2})\,, (23)

where ηinf,ai2\eta^{2}_{\textrm{inf},ai} are drawn from pinf(𝐚|δ𝐭15yr)p_{\rm inf}(\mathbf{a}|\delta\mathbf{t}^{15\textrm{yr}}), and correspond to the inferred power spectrum, while ηpred,ai2\eta_{\textrm{pred},ai}^{2} are drawn from ppred(𝐚|δ𝐭15yr)p_{\rm pred}(\mathbf{a}|\delta\mathbf{t}^{15\textrm{yr}}) and correspond to the predicted power spectrum due to a power law. Each η2\eta^{2} carries with it an implicit ss index, which we have suppressed. We also use the HDPosteriorDraws simulations to calculate

pBsim=1Ns=1NΘ[ηinf,ai2(ηinf,ai2)rep],\displaystyle p_{B}^{\mathrm{sim}}=\frac{1}{N}\sum_{s=1}^{N}\Theta[\eta^{2}_{\textrm{inf},ai}-(\eta_{\textrm{inf},ai}^{2})^{\mathrm{rep}}]\,, (24)

where the superscript “rep” indicates it is the inferred estimate on the power in that frequency calculated on the replicated data. Note that ηinf,ai2\eta^{2}_{\textrm{inf},ai} and (ηinf,ai2)rep(\eta_{\textrm{inf},ai}^{2})^{\mathrm{rep}} are calculated using the same 𝚲s\bm{\Lambda}^{s}. We find that these two methods produce nearly identical results, and so we report results for pBp_{B} instead of pBsimp_{B}^{\rm sim}.

For intrinsic pulsar noise across all pulsars and frequencies, we find a minimum of pB=0.03p_{B}=0.03, for J1012+5307, f=51nHzf=51\,\textrm{nHz}, which is the box that is visibly above the max likelihood curve in the middle panel of Fig. 6, marked with the red arrow. This is not a significant pp-value, given that we are analyzing 67 pulsars and 30 frequency bins, and so we cannot conclude that this represents a deviation from a power law. This is the same conclusion as Ref. [41]. The minimum and maximum pBp_{B} for the GW background power spectrum across all pulsars are 0.16 and 0.84.

Deviations from the power-law model may not just take the form of excess noise at individual frequencies. For example, one could have a broken power-law, or excess noise across multiple frequencies that are not individually detectable. We do not develop a statistic to measure this here, as it requires a specific model to compare to the power-law model and there are a broad range of potential models. However, such an analysis should be done in the future.

We show a plot of pBp_{B} for intrinsic pulsar noise for each pulsar in the top panel of Fig. 7 and for the GW background in each pulsar in the bottom panel. The horizontal axis corresponds to each pulsar, while the vertical axis corresponds to pBp_{B}; each point represents a pBp_{B} for each pulsar and each frequency bin. When the inferred and predicted power spectrum estimates agree at a given frequency bin, we expect pB0.5p_{B}\approx 0.5. We see that a few noisy pulsars show pBp_{B} values that stray away from 0.5, including the pulsars in the top two panels of Fig. 6. As stated before, no individual frequency bin shows an extreme value of pBp_{B}, e.g., pB<0.01p_{B}<0.01 or pB>0.99p_{B}>0.99, meaning we cannot state there are individual bins with excess noise. However, pulsars like B1937+21 and J1012+5307 do show quite a few frequency bins with pBp_{B} deviating from 0.5, meaning they may benefit from a more flexible model in the future. It is currently prohibitively computationally expensive to estimate intrinsic pulsar noise separately in each frequency bin for each pulsar when we estimate p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}). However, with future computational improvements, we may be able to do this for a limited number of pulsars, and this method provides a good starting point for choosing those pulsars.

IV.3 Full-Array Results for the Gravitational-wave Background

Refer to caption
Figure 8: We show reconstructed GW power in each frequency bin for inferred (blue) and predicted (orange) coefficients, and a data-only (green) reconstruction. The boxes correspond to inter-quartile ranges and the whiskers are the 5th and 95th percentiles. Power-law draws from p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}) are shown in light black. The blue and orange distributions agree with one another at most frequencies. In a few places, the green boxes differ from the predicted or inferred distribution (e.g., 20 and 22 nHz), but the data are weakly informative, as the inferred and predicted distributions agree with one another in those cases. In a few frequencies the data only distribution shows evidence for negative power, this is to be expected when the data are not informative (discussed further in the text).

IV.3.1 Spectral shape

To get a full-PTA estimate of the GW background power in each frequency bin, we use a modified version of the optimal statistic [40, 53, 29] that estimates the Hellings–Downs correlated GW power in each individual frequency bin. The details of this statistic are discussed in Sec. IV B 2 of Refs. [29] and [63].

In Fig. 8 we show results for the estimated power in each frequency bin for the inferred parameters (blue), the predicted parameters (orange), and a fully frequentist estimate that depends only on the data (green). The boxes correspond to the interquartile range of the GW power in each bin, estimated over draws from p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}), and the whiskers are the 5th and 95th percentiles. In showing these together, we compare predicted, inferred, and data-only results. There is no visible evidence for a deviation from a power law, which is indicated by the gray shaded region, which encompasses draws from p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}). The inter-quartile ranges for the predicted, inferred, and data-only power overlap in most frequency bins. In a few places, the data-only results appear to differ from the inferred and predicted results, e.g., 9th–11th bins, but the data are weakly informative and the inferred parameters are closer to the predicted parameters.

The per-frequency optimal statistic, used to combine 𝐚gw\mathbf{a}_{\textrm{gw}} across pulsars allows for negative power in situations where the data are uninformative. Like the traditional optimal statistic, when no correlated signal is present, the distribution of the statistic is GX2 centered at zero. This is why the whiskers for several frequencies leave the bottom of the plot, and in two cases (bins 9 and 14) the data-only inter-quartile ranges are negative. This does not change our conclusions, as we find that our results at frequencies where we know we should see correlated GW power show such power (specifically the lowest five frequency bins). This is consistent with the hd free-spectrum results in Ref. [1].

We use the HDPosteriorDraws data replications to compare simulations from a power-law model to the results in Fig. 8. We find that the spectral results are consistent with a power-law model with Hellings–Downs correlations–the lowest and highest pBp_{B} comparing inferred parameters from simulations with inferred parameters on the 15 yr data set are 0.30 and 0.72.

IV.3.2 Spatial Correlations

Refer to caption
Figure 9: We show reconstructed binned spatial correlations. The predicted (orange) show the expected spread around the Hellings–Downs curve that we might expect across many realizations. The inferred (blue) and data-only (green) recoveries broadly agree with the orange. There are two bins (3rd and 5th) that show some deviation between the green and orange bins. We find that these bins are not statistically significant when comparing the inferred and predicted distributions, and that changing the binning does not have a significant affect.

In Sec. III, we showed broad consistency between the data and the hd model, and we showed that there is no evidence for additional monopolar or dipolar correlations. Those tests compare plausible alternative analytic correlation models to the expected correlation model. In this section, we search for isolated deviations in the binned spatial correlations from the Hellings–Downs curve. We use the optimal statistic on the inferred and predicted coefficients, as well as directly on the data, and compare the inferred, predicted, and data-only binned reconstructions to search for potential deviations from the Hellings–Downs curve that are not just monopolar or dipolar.

To construct the binned correlations, we perform an inverse–noise-weighted average over correlations for all pulsar pairs whose angular separation falls in a given angular-separation bin. An example for 11 bins of equal width is shown in Fig. 9. We compare the inferred (blue), predicted (orange), and data-only (green) estimates of the binned correlations. The spread comes from calculating the mean and variance of the correlation across pulsar pairs for a given posterior draw, and sampling from a univariate Gaussian with that mean and variance, and then repeating over many draws from p(𝚲|δ𝐭).p(\bm{\Lambda}|\delta\mathbf{t}). The variance for the bin for a given draw includes covariance between pairs of pulsars due to the non-zero GW background [64]. The bars indicate the 5th and 95th percentiles of the resulting distribution.

We find that the data and inferred correlations are consistent with the predicted correlations. We also use the HDPosteriorDraws replications and find that none of the inferred or data-only binned correlations differ significantly from those calculated with the data replications. The two bins with the most extreme pBp_{B}333Because we are comparing two distributions, we consider both large and small pp-values to be extreme. are the third and fifth bins, with pB=0.87, 0.88p_{B}=0.87,\,0.88 respectively, indicating that, if we take Hellings–Downs correlations as our prior, we do not have evidence for deviations from the Hellings–Downs curve. This does not mean that we are fully consistent with Hellings–Downs correlations (as subtle changes in each bin could result in a different overall correlation pattern), but it does indicate that there are no obvious “spikes” in correlations on small angular scales. Changing the choice of binning does not change the qualitative conclusion.

V Leave-one-out Analyses

Refer to caption
Figure 10: We show lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} the 67 pulsars. The pulsar best predicted by the hd model is J1909-3744, whose red noise is predominantly due to a GW background. There are 43 pulsars with lnPBFcurn,ahd>0\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}>0 and 25 with lnPBFcurn,ahd<0.\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}<0. As discussed in the text, we find the number and level of the pulsars with lnPBFcurn,ahd<0\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}<0 to be consistent with simulations that have a GW background as strong as what we find in the data.

Although individual pulsars can exhibit unique chromatic noise features, profile changes, and red noise properties, similar noise models are fit to each pulsar in the array. In this section, we seek to identify whether certain pulsars are poorly fit by these models. We perform a leave-one-out analysis, where we calculate a posterior predictive likelihood for the timing residuals in one pulsar, given the data in all other pulsars [29]. This analysis is similar to the one in Ref. [1], with a few key differences. First, we use 14 frequency bins for the analysis, and we include the negative spectral index of the GW background, γ\gamma, in the initial fit. In Ref. [1], γ\gamma is fixed to 13/313/3. We also use a larger number of curn simulations to evaluate the significance of the GW background, and perform a new comparison between simulated and real data on each individual pulsar.

Both the curn and the hd models can be used to predict features in one pulsar, given a model fit to the other pulsars in the array. The curn model, for example, can only predict the variance of the common-process–induced timing residuals, while the hd model, which includes GW-induced correlations, makes a prediction for both the variance of the timing residuals and their waveform.444This prediction is limited by the strength of the Hellings–Downs correlations. We cannot predict the pulsar-term fluctuations, but Earth-term predictions are informative.

We compare the predictive power of these two models by calculating a pseudo Bayes factor (PBF), which is the ratio of the posterior predictive likelihood for the hd and the curn models. We calculate this on both a pulsar-by-pulsar basis, to identify potential pulsars that are not well-predicted by the models, and also across the full pulsar timing array to construct a new detection statistic.

We denote the “left-out” pulsar with subscript aa and the rest of the data set excluding that pulsar with a subscript a-a. The posterior predictive likelihood is

p(δ𝐭a|δ𝐭a)=d𝚲d𝐚dϵp(δ𝐭a|𝐚,ϵ,𝚲)p(𝐚,ϵ,𝚲|δ𝐭a).\displaystyle p(\delta\mathbf{t}_{a}|\delta\mathbf{t}_{-a})=\int\textrm{d}\bm{\Lambda}\,\textrm{d}\mathbf{a}\,\textrm{d}\bm{\epsilon}\;p(\delta\mathbf{t}_{a}|\mathbf{a},\bm{\epsilon},\bm{\Lambda})p(\mathbf{a},\bm{\epsilon},\bm{\Lambda}|\delta\mathbf{t}_{-a})\,. (25)

As in Ref. [29], we split up the parameters and hyperparameters into separate pieces based on whether they correspond to pulsar aa or pulsars a-a, and whether they describe GW or red noise coefficients, 𝚲=[𝚲a,𝚲a,𝚲gw]\bm{\Lambda}=[\bm{\Lambda}_{a},\bm{\Lambda}_{-a},\bm{\Lambda}_{\textrm{gw}}], 𝐚=[𝐚gw,a,𝐚gw,a,𝐚a,𝐚a]\mathbf{a}=[\mathbf{a}_{\textrm{gw},a},\mathbf{a}_{\textrm{gw},-a},\mathbf{a}_{a},\mathbf{a}_{-a}]. We use this new notation, and evaluate Eq. (25) for the hd and the curn models to find, see App. A of [29],

phd(δ𝐭a|δ𝐭a)1Nss=1Ns\displaystyle p_{\textrm{{hd}}}(\delta\mathbf{t}_{a}|\delta\mathbf{t}_{-a})\approx\frac{1}{N_{s}}\sum_{s=1}^{N_{s}}\int d𝚲ad𝐚gw,ap(δ𝐭a|𝚲a,𝐚gw,a)\displaystyle\textrm{d}\bm{\Lambda}_{a}\textrm{d}\mathbf{a}_{\textrm{gw},a}\;p(\delta\mathbf{t}_{a}|\bm{\Lambda}_{a},\mathbf{a}_{\textrm{gw},a})
×p(𝐚gw,a|𝚲gws,𝚲as,δ𝐭a)\displaystyle\times p(\mathbf{a}_{\textrm{gw},a}|\bm{\Lambda}_{\textrm{gw}}^{s},\bm{\Lambda}_{-a}^{s},\delta\mathbf{t}_{-a})
×p(𝚲a),\displaystyle\times p(\bm{\Lambda}_{a})\,, (26)
pcurn(δ𝐭a|δ𝐭a)1Nss=1Ns\displaystyle p_{\textrm{{curn}}}(\delta\mathbf{t}_{a}|\delta\mathbf{t}_{-a})\approx\frac{1}{N_{s}}\sum_{s=1}^{N_{s}}\int d𝚲ap(δ𝐭a|𝚲a,𝚲gws)p(𝚲a).\displaystyle\textrm{d}\bm{\Lambda}_{a}p(\delta\mathbf{t}_{a}|\bm{\Lambda}_{a},\bm{\Lambda}_{\textrm{gw}}^{s})p(\bm{\Lambda}_{a})\,. (27)

In both cases, we perform a Monte Carlo integral over the hyperparameter posterior

𝚲gws,𝚲asp(𝚲gws,𝚲as|δ𝐭a).\displaystyle\bm{\Lambda}_{\textrm{gw}}^{s},\bm{\Lambda}_{-a}^{s}\sim p(\bm{\Lambda}_{\textrm{gw}}^{s},\bm{\Lambda}_{-a}^{s}|\delta\mathbf{t}_{-a})\,. (28)

The main difference between Eq. (V) and Eq. (27) is that for the hd model, the a-a pulsars can produce a prediction for 𝐚gw,a\mathbf{a}_{\textrm{gw},a} due to the Hellings and Downs correlations, while the curn model cannot.

The ratio of the posterior predictive likelihoods for the curn and hd models, is the PBF and it can be used to compare the two models. We first calculate the PBF point-wise across pulsars

PBFcurn,ahd=phd(δ𝐭a|δ𝐭a)pcurn(δ𝐭a|δ𝐭a),\displaystyle\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}=\frac{p_{\textrm{{hd}}}(\delta\mathbf{t}_{a}|\delta\mathbf{t}_{-a})}{p_{\textrm{{curn}}}(\delta\mathbf{t}_{a}|\delta\mathbf{t}_{-a})}\,, (29)

and then the total PBF as a point-wise product

PBFcurnhd=aPBFcurn,ahd.\displaystyle\textrm{PBF}_{\textrm{{curn}}}^{\textrm{{hd}}}=\prod_{a}\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}\,. (30)

A full discussion of the differences and similarities between a typical Bayes Factor and the PBF is given in Ref. [29], but we summarize a few key points here. Unlike the Bayes Factor, the PBF is not sensitive to parts of the parameter space that have no likelihood support. The PBF compares how well the models predict new data, while the Bayes factor is a summary statistic comparing how well two models fit existing data. Both statistics, however, are uncalibrated–meaning it is unclear how to interpret statistical significance as a function of the value of the statistic. In this section, similar to previous sections, we use data replications to assess the significance of the PBF.

Importantly, we can calculate the PBF on each pulsar individually, and identify whether certain pulsars are “outliers” that are not well-predicted by a given model. This is similar to the “dropout factor” analysis in [65, 1]. In this work, we calculate a separate predictive likelihood for each model for each pulsar, while the dropout factor analysis samples an indicator variable that chooses whether to model a pulsar with the curn or hd model. The interpretation of the results are similar to point-wise results.

Across the array, multiplying all of the “leave-out” PBFs we find PBFcurnhd=873\textrm{PBF}_{\textrm{{curn}}}^{\textrm{{hd}}}=873. This is on a similar scale to the 14 frequency Bayes factor comparing hd and curn [1], but as with typical Bayes factors, there is no natural scale to use to “calibrate” this level of significance. In Ref. [1], several methods are used to generate a null distribution for detection statistics, including sky scrambles [56], phase shifts [55], and simulated data sets. In this work, we again resort to simulated data sets. Using 600 CURNPosteriorDraws simulations, we calculate PBFcurnhd\textrm{PBF}_{\textrm{{curn}}}^{\textrm{{hd}}} on each of the simulations. We find a Gaussian equivalent pp-value of 3.0σ3.0\,\sigma in favor of Hellings–Downs correlations on the 15 yr NANOGrav data.

We also use the HDPosteriorDraws draws to test whether this result is consistent with the hd model. We find that PBFcurnhd\textrm{PBF}_{\textrm{{curn}}}^{\textrm{{hd}}} falls in the 27th percentile of the hd simulations, again confirming that our results are inconsistent with the curn model, and are consistent with the hd model.

We show lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each pulsar in Fig. 10. There are more pulsars with lnPBFcurn,ahd>0\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}>0 than the reverse, because the hd model is better at predicting new data than the curn model. There are several pulsars with lnPBFcurn,ahd<0\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}<0. We expect this in a few pulsars due to the specific realization of intrinsic pulsar noise and the pulsar term from the GW background, which we cannot predict. To understand whether the number of pulsars with lnPBFcurn,ahd<0\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}<0 is expected, and whether the typical scale of those downward fluctuations is “representative” of what we would expect from a GW background, we perform simulations. We do 200 HDPosteriorDraws simulations and calculate lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each of those simulations to understand what the typical PBF is for the “best” and “worst” predicted pulsars if we have a GW background consistent with our posteriors.

Refer to caption
Figure 11: We show lnPBFcurn,ahd\ln\,\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} in red. The blue box and whisker plot shows the interquartile range and 5th and 95th percentiles of the distribution of lnPBFcurn,ahd\ln\,\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for 200 of the HDPosteriorDraws simulations for each pulsar. For most pulsars the median falls above zero for these simulations, indicating that hd is the preferred model, as expected. Calculating the percentile of the red point within the blue distribution for each pulsar yields a set of percentiles that are consistent with a uniform distribution between 0 and 1, which means PBFcurn,ahd\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} is consistent with what we expect from a model with hd correlations and intrinsic pulsar noise consistent with p(𝚲|δ𝐭15yr)p(\bm{\Lambda}|\delta\mathbf{t}^{15\textrm{yr}}).
Refer to caption
Figure 12: We compare results on 200 HDPosteriorDraws simulations to the results on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}}. The blue box and whisker plots correspond to the distribution of the pulsar with the ithi^{\textrm{th}} highest value of lnPBFcurn,ahd\ln\,\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each simulation. For example, to construct the far left box we find the maximum lnPBFcurn,ahd\ln\,\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each simulation, and then build a distribution across simulations. For the far right box, we find the minimum lnPBFcurn,ahd\ln\,\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each simulation, and build a distribution, and so on. So the lnPBFcurn,ahd\ln\,\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} going into each blue box could be for a different pulsar for each simulation.

We show the results of those simulations in Figs. 11 and 12. In Fig. 11, we plot lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each pulsar in red in the same order as Fig. 10. We show the distribution of lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} for each pulsar across 200 HDPosteriorDraws simulations in the blue box and whisker plots. The boxes and whiskers correspond to the 50% credible interval and the 5th and 95th percentiles respectively. The red points broadly agree with these distributions. The median simulated distribution for each pulsar falls above zero, corresponding to the hd model being preferred. Calculating the percentile of the red point (lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} on δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}}) in each distribution yields a set of percentiles that are consistent with a uniform distribution between 0 and 1. This is what we expect if each pulsar is well-predicted by all of the others. In general, the pulsars with broader distributions and larger (positive or negative) values of lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} correspond to the longest-timed and lowest-noise pulsars that have the greatest effect on the analysis.

In Fig. 12 we present results from the same simulations, but we look at the distribution of the order statistics of lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}. That is, the blue box and whisker to the furthest left corresponds to the distribution of the maximum lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} across all pulsars for each simulation, so for each simulation we find the maximum lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}, and across simulations we build a distribution for that maximum. The second from left corresponds to the second largest lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}} in each simulation, the furthest to the right corresponds to the minimum value, and so on. We see that our results are consistent with the simulations from the hd model, and that in general we expect more pulsars to be better predicted by the hd model. Crucially, simulations always result in a few pulsars that are better predicted by the curn model, i.e., negative lnPBFcurn,ahd\ln\textrm{PBF}_{\textrm{{curn}},a}^{\textrm{{hd}}}. Therefore, negative lnPBF\ln\textrm{PBF} values are not immediately cause for concern as long as they are consistent with what we expect from simulations, which is the case here.

VI GW background waveforms

The hd model is preferred to the curn model using the optimal statistic, Bayes factors, and PBFs. In this section, we show reconstructions of the hd model compared to δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}}. We also highlight covariances between different parts of the model to better understand the relationship between the GW background, the intrinsic pulsar noise, and the timing model for each pulsar. The figures presented in this section are meant to be representative and interpreted qualitatively to illustrate the contribution of different models and the covariances between those models; similar to the waveform reconstructions shown in Refs. [42, 66, 67], for example. Similar figures have been shown before for noise models, e.g. [66, 67, 68], but not for the GW background model.

We first draw 𝐛sp(𝐛|𝚲s,δ𝐭)\mathbf{b}^{s}\sim p(\mathbf{b}|\bm{\Lambda}^{s},\delta\mathbf{t}) using Eqs. (14) and (15), and then construct a model fit to the data by δ𝐭s=𝐓𝐛s\delta\mathbf{t}^{s}=\mathbf{T}\mathbf{b}^{s} for each pulsar. As in the previous section, we separate the Gaussian process coefficients for intrinsic pulsar noise 𝐚s\mathbf{a}^{s}, GW background 𝐚gws\mathbf{a}_{\textrm{gw}}^{s}, and timing model corrections ϵs\bm{\epsilon}^{s}, which we use to inspect contributions from each part of the model independently.

We show waveform reconstructions for pulsar J1909-3744 in Fig. 13. In each panel we show δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} (averaged over day-long time-scales to reduce the number of points) and the contribution of one piece of our model. For this pulsar, the total red noise is primarily due to GWs, indicated by the lack of intrinsic pulsar noise in the top right panel, and the fact that the GW background in the top left panel broadly follows δ𝐭15yr\delta\mathbf{t}^{15\textrm{yr}} plotted in blue. In the bottom left, we show the timing model in cyan. The spindown and spin frequency of the pulsar are covariant with the lowest frequencies of the GW background. This results in the broad uncertainties on the individual contributions from these models, but the narrow uncertainty on the combined contributions of all of the models in the bottom right (orange), which tracks the timing residuals closely.

Refer to caption
Figure 13: Waveform reconstruction for J1909-3744 (shaded regions) and timing residuals (blue dots). The solid line corresponds to the median reconstruction, and the shaded regions correspond to the 90% credible interval. The blue points correspond to epoch-averaged residuals. In the top left is the contribution from the GW background, top right shows intrinsic pulsar noise, bottom left shows the timing model, and bottom right shows the combined total model. There is little evidence for intrinsic pulsar noise for this pulsar, and we can see that the frequency and spin-down components of the timing model (which give linear and quadratic offsets) are covariant with the lowest (and strongest) frequencies in the GW background. Regardless, the total model (bottom right) closely follows the data.
Refer to caption
Figure 14: Waveform reconstruction for B1937+21 (shaded regions) and timing residuals (blue dots). In the top left is the contribution from the GW background, top right shows intrinsic pulsar noise, bottom left shows the timing model, and bottom right shows the combined total model. There is strong intrinsic pulsar noise in this pulsar, and in this case frequency and spin-down components of the timing model (which give linear and quadratic offsets) are covariant with the lowest (and strongest) frequencies in the intrinsic pulsar noise, while the GW background is significantly smaller than the noise. Again the total model closely tracks the data (bottom right).

We show a similar waveform reconstruction for pulsar B1937+21 in Fig. 14. The intrinsic pulsar noise dominates the pulsar’s total red noise, as expected based on Fig. 6. Waveform reconstructions for each pulsar are included as an online supplement. We find that in all cases, models represent reasonable fits to the data, which is expected based on the residual plots made with similar (single-pulsar) models in Ref. [26]. These figures are meant to illustrate the different contributions of each part of the model to the overall fit we make to each pulsar.

VII Conclusions

The standard probabilistic model used to establish evidence for a GW background in Ref. [1] makes two assumptions motivated by theoretical expectations but also by computational convenience: that the background follows a power-law spectrum and that its inter-pulsar correlations conform to the Hellings–Downs pattern. Deviations from these assumptions are expected from SMBHB astronomy and astrophysics, and in certain fundamental-physics scenarios, although it is unclear whether the deviations would be measurable in current data sets.

In this paper, we examine the NANOGrav 15 yr data set [26] within the framework of Bayesian predictive model checking [28, 29], with the goal of testing the assumptions without comparison to alternative, more complex models. The modus operandi of Refs. [28, 29] is that of using the fiducial model to simulate a population of replicated data sets from the real-data parameter posteriors, and then comparing the values of multiple statistics of interest in real data and across the replications.

The optimal statistic [40, 53] was used in Ref. [1] to establish the presence of inter-pulsar timing-residual correlations. Within the replication framework, we can account fully for the dependence of the optimal statistic on the uncertain noise parameters [28], building a Bayesian pp-value that falsifies the no-correlation hypothesis at the 3.2σ\,\sigma level for the NANOGrav data set. That is, we find that data replications obtained from a spatially uncorrelated model can rarely reproduce the value of the optimal statistic seen for real data. The Bayesian pp-value is averaged over the noise-parameter posterior, accounting fairly for the overall risk of false rejection. If instead we build our replications from the Hellings–Downs model, we find a pp-value 0.5\sim 0.5, as expected if that model is correct. We also find no anomalies when we use optimal-statistic variants built to be sensitive to monopolar or dipolar correlations.

Moving on from the frequentist flavor of this optimal-statistic analysis to Bayesian model comparison, we evaluate the relative predictive performance of the Hellings–Downs and spatially uncorrelated models by way of the leave-one-out cross-validation pseudo Bayes factor [29]. We find that the Hellings–Downs model is favored at the 3σ\,\sigma level. That is, we find that data replications obtained from a spatially uncorrelated model can rarely reproduce the pseudo Bayes factor seen for real data. We also verify that the binned correlation coefficients estimated from real data are consistent with the distribution expected under the Hellings–Downs hypothesis. Altogether, we find that the 15 yr NANOGrav data set is consistent with the hypothesis of Hellings–Downs correlations, with no evidence for alternative correlation patterns.

We test the assumption that the GW background has a power-law spectrum by comparing the real-data posteriors of the spectral coefficients (i.e., the root-mean-square Fourier amplitudes at each frequency) with their distribution across replicated data sets. Although some “spikes” are evident in the spectral plots, we find that they are not statistically significant—they are not unlikely in the replicated population. As a byproduct of this analysis, Fourier-amplitude posteriors provide a probabilistic reconstruction of the putative GW signal, as seen most strikingly in Fig. 13 for pulsar J1909-3744.

This paper details an extensive but certainly not exhaustive reanalysis of the NANOGrav 15 yr data set. Our overall finding is that the data are consistent with a simple power-law GW background with isotropic Hellings–Downs correlations. Future more expansive and sensitive data sets will require more sophisticated data models; the framework introduced in Refs. [28, 29] and exemplified here can tell us when we have reached that threshold.

Authorship contributions

This paper uses a decade’s worth of pulsar timing observations and is the product of the work of many people. P.M.M. helped conceive the project, wrote and developed code to perform the analysis, created all figures, and wrote and edited the text. M.V. helped conceive the project, developed code, performed parts of the analysis, ran preliminary analyses, and helped write and edit the text. K.Ch. helped conceive the project, guided direction of the analysis, and wrote and edited the text. B.L., S.V., T.D., and D.R.S., gave constructive comments that improved the manuscript, as did members of the NANOGrav Detection Working Group.

G.A., A.A, A.M.A., Z.A., P.T.B., P.R.B., H.T.C., K.C., M.E.D, P.B.D., T.D., E.C.F, W.F., E.F., G.E.F., N.G.D., D.C.G., P.A.G., J.G., R.J.J., M.L.J., D.L.K., M.K., M.T.L., D.R.L., J.L., R.S.L., A.M., M.A.M., N.M., B.W.M., C.N., D.J.N., T.T.N., B.B.P.P., N.S.P., H.A.R., S.M.R., P.S.R., A.S., C.S., B.J.S.A., I.H.S., K.S., A.S., J.K.S., and H.M.W. developed timing models and ran observations for the NANOGrav 15 yr data set.

Acknowledgements

The authors thank Rutger van Haasteren and two anonymous referees for their constructive comments on the manuscript.

The NANOGrav Collaboration receives support from National Science Foundation (NSF) Physics Frontiers Center award Nos. 1430284 and 2020265, the Gordon and Betty Moore Foundation, NSF AccelNet award No. 2114721, an NSERC Discovery Grant, and CIFAR. The Arecibo Observatory is a facility of the NSF operated under cooperative agreement (AST-1744119) by the University of Central Florida (UCF) in alliance with Universidad Ana G. Méndez (UAGM) and Yang Enterprises (YEI), Inc. The Green Bank Observatory is a facility of the NSF operated under cooperative agreement by Associated Universities, Inc. The National Radio Astronomy Observatory is a facility of the NSF operated under cooperative agreement by Associated Universities, Inc. Part of this research was performed at the Jet Propulsion Laboratory, under contract with the National Aeronautics and Space Administration. Copyright 2024.

L.B. acknowledges support from the National Science Foundation under award AST-1909933 and from the Research Corporation for Science Advancement under Cottrell Scholar Award No. 27553. P.R.B. is supported by the Science and Technology Facilities Council, grant number ST/W000946/1. S.B. gratefully acknowledges the support of a Sloan Fellowship, and the support of NSF under award #1815664. M.C. and S.R.T. acknowledge support from NSF AST-2007993. M.C. and N.S.P. were supported by the Vanderbilt Initiative in Data Intensive Astrophysics (VIDA) Fellowship. K.Ch., A.D.J., and M.V. acknowledge support from the Caltech and Jet Propulsion Laboratory President’s and Director’s Research and Development Fund. K.Ch. and A.D.J. acknowledge support from the Sloan Foundation. Support for this work was provided by the NSF through the Grote Reber Fellowship Program administered by Associated Universities, Inc./National Radio Astronomy Observatory. Pulsar research at UBC is supported by an NSERC Discovery Grant and by CIFAR. K.Cr. is supported by a UBC Four Year Fellowship (6456). M.E.D. acknowledges support from the Naval Research Laboratory by NASA under contract S-15633Y. T.D. and M.T.L. are supported by an NSF Astronomy and Astrophysics Grant (AAG) award number 2009468. E.C.F. is supported by NASA under award number 80GSFC21M0002. G.E.F., S.C.S., and S.J.V. are supported by NSF award PHY-2011772. K.A.G. and S.R.T. acknowledge support from an NSF CAREER award #2146016. The work of N.La., X.S., and D.W. is partly supported by the George and Hannah Bolinger Memorial Fund in the College of Science at Oregon State University. N.La. acknowledges the support from Larry W. Martin and Joyce B. O’Neill Endowed Fellowship in the College of Science at Oregon State University. Part of this research was carried out at the Jet Propulsion Laboratory, California Institute of Technology, under a contract with the National Aeronautics and Space Administration (80NM0018D0004). D.R.L. and M.A.M. are supported by NSF #1458952. M.A.M. is supported by NSF #2009425. C.M.F.M. was supported in part by the National Science Foundation under Grants No. NSF PHY-1748958 and AST-2106552. A.Mi. is supported by the Deutsche Forschungsgemeinschaft under Germany’s Excellence Strategy - EXC 2121 Quantum Universe - 390833306. The Dunlap Institute is funded by an endowment established by the David Dunlap family and the University of Toronto. K.D.O. was supported in part by NSF Grant No. 2207267. T.T.P. acknowledges support from the Extragalactic Astrophysics Research Group at Eötvös Loránd University, funded by the Eötvös Loránd Research Network (ELKH), which was used during the development of this research. H.A.R. is supported by NSF Partnerships for Research and Education in Physics (PREP) award No. 2216793. S.M.R. and I.H.S. are CIFAR Fellows. Portions of this work performed at NRL were supported by ONR 6.1 basic research funding. J.D.R. also acknowledges support from start-up funds from Texas Tech University. J.S. is supported by an NSF Astronomy and Astrophysics Postdoctoral Fellowship under award AST-2202388, and acknowledges previous support by the NSF under award 1847938. C.U. acknowledges support from BGU (Kreitman fellowship), and the Council for Higher Education and Israel Academy of Sciences and Humanities (Excellence fellowship). C.A.W. acknowledges support from CIERA, the Adler Planetarium, and the Brinson Foundation through a CIERA-Adler postdoctoral fellowship. O.Y. is supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-2139292.

References