This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Nii: a Bayesian orbit retrieval code applied to differential astrometry

Sheng Jin1,2, Xiaojian Ding3, Su Wang1,2, Yao Dong1,2, Jianghui Ji1,2
1CAS Key Laboratory of Planetary Sciences, Purple Mountain Observatory, Chinese Academy of Sciences, Nanjing 210023, China
2CAS Center for Excellence in Comparative Planetology, Hefei 230026, China
3College of Information Engineering, Nanjing University of Finance and Economics, Nanjing 210007, China
E-mail: [email protected]
(Accepted XXX. Received YYY; in original form ZZZ)
Abstract

Here we present an open source Python-based Bayesian orbit retrieval code (Nii) that implements an automatic parallel tempering Markov chain Monte Carlo (APT-MCMC) strategy. Nii provides a module to simulate the observations of a space-based astrometry mission in the search for exoplanets, a signal extraction process for differential astrometric measurements using multiple reference stars, and an orbital parameter retrieval framework using APT-MCMC. We further verify the orbit retrieval ability of the code through two examples corresponding to a single-planet system and a dual-planet system. In both cases, efficient convergence on the posterior probability distribution can be achieved. Although this code specifically focuses on the orbital parameter retrieval problem of differential astrometry, Nii can also be widely used in other Bayesian analysis applications.

keywords:
software: data analysis – astrometry – planets and satellites: detection
pubyear: 2021pagerange: Nii: a Bayesian orbit retrieval code applied to differential astrometryNii: a Bayesian orbit retrieval code applied to differential astrometry

1 Introduction

Searching for exoplanets using various techniques has become increasingly important in the field of planetary science (Fischer et al., 2014). Owing to continuous technological advancements, past and ongoing surveys have detected \sim 6,000 confirmed exoplanets and candidates(Cumming et al., 2008; Borucki et al., 2011; Mayor et al., 2011; Cassan et al., 2012; Marcy et al., 2014; Thompson et al., 2018). This large sample provides the basis for thoroughly understanding the processes by which planets form and evolve (Ida & Lin, 2004; Mordasini et al., 2012; Bitsch, Lambrechts, & Johansen, 2015; Liu & Ji, 2020; Zhang, 2020).

The astrometry method employed to detect exoplanets is implemented by precisely measuring the tiny wobbles of a star caused by the gravitational pull exerted by one or more surrounding planets. The first formal astrometric calculation for an extrasolar planet was made in 1855 for the star 70 Ophiuchi (Jacob, 1855), but unfortunately subsequent observations discovered that the signal was entirely due to systematic errors in the visual measurements in the 19th century (Heintz, 1988). The precision required to detect a planet using the astrometry technique is extremely high; thus far only a handful of previously discovered exoplanets have been successfully recharacterized by astrometric method (Benedict et al., 2002; Snellen & Brown, 2018; Feng et al., 2019). In addition, the astrometric method is sensitive to exoplanets with large orbits; thus long observation times are necessary to complete the orbits of planets with long orbital periods. The GAIA mission is expected to reach a magnitude-dependent accuracy of \sim 10 μ\muas and is expected to find thousands of Jovian exoplanets up to 500 parsecs from the Sun via astrometry (Perryman et al., 2014).

To discover terrestrial planets, the accuracy of the astrometry technique must reach \sim 1 μ\muas. A feasible way to achieve such a high accuracy is to develop a space-based differential astrometry telescope. The principle of differential astrometry is that it measures and compares the relative offset angles between a target and several distant reference stars, not the absolute position of the target in the celestial sphere. Using narrow-angle astrometric observations, such space-based instruments can differentially detect the reflex motion of the target star due to the presence of its planets with very high accuracy (Catanzarite et al., 2007; Unwin et al., 2008; Goullioud et al., 2008). Historically, a series of space missions have been proposed to search for exoplanets using differential astrometry; examples include the Space Interferometry Mission PlanetQuest (SIM PlanetQuest) (Catanzarite et al., 2007; Unwin et al., 2008; Tanner, Gelino, & Law, 2010), the Nearby Earth Astrometric Telescope (NEAT) mission (Malbet et al., 2012) and the Search for Terrestrial Exo-Planets (STEP) mission (Chen, Wu, & Li, 2013; Liu, Liu, & Zhu, 2018). Several follow-up successions of these projects are in progress, namely, the Theia Space Observatory (Malbet et al., 2016; The Theia Collaboration et al., 2017), the Habitable ExoPlanet Survey (HEPS) mission (Yu et al., 2019) and the Closeby Habitable Exoplanet Survey (CHES) mission (Ji & Wang, 2020).

Space-based missions using the differential astrometry technique can provide the offset angles between a target star and its reference stars in a time series. Thus, a successful orbit retrieval model needs to accurately fit the planetary mass and orbital parameters with these time series data. Such a process is similar to the fitting of planetary orbital parameters in the radial velocity method and the absolute astrometry method; Bayesian methods, especially Markov chain Monte Carlo (MCMC) methods, are widely employed for these tasks (Gregory, 2005a, b; Ford, 2006; Balan & Lahav, 2009; Gregory, 2011; Schulze-Hartung, Launhardt, & Henning, 2012; Eastman, Gaudi, & Agol, 2013; Díaz et al., 2014; Feroz & Hobson, 2014; Ranalli, Hobbs, & Lindegren, 2018; Brandt et al., 2021). In the case of differential astrometry, additional attention should be paid to the error sources related to the special observation strategy and the use of multiple reference stars (Liu, Liu, & Zhu, 2018).

Here, we provide a Bayesian orbit retrieval framework for potential differential astrometry missions in an open source code known as Nii111https://github.com/shengjin/Nii.git. To handle the high-dimensional model of a planetary system, we implement the strategy of automatic parallel tempering Markov chain Monte Carlo) (APT-MCMC) strategy (Liu, 2001; Gregory, 2005a, b). Our code implements two special treatments. First, in the analysis of the observation errors, we consider the effect of using multiple-reference stars in the differential astrometry method. Second, in the control system utilized to automate the selection of the sampling step sizes for all the parallel Markov chains, we adopt a different scheme from the existing models (Gregory, 2005a, b). Ultimately, we find that the Nii code can guarantee efficient convergence on the high-dimensional posterior distributions encountered in orbit retrieval problems.

The remainder of this paper is organized as follows. Section 2 describes the forward model that is used to simulate the observations acquired during a differential astrometry mission. Section 3 elaborates on the APT-MCMC model built to retrieve orbital parameters. Section 4 presents two applications of the Nii code in the retrieval of orbital parameters from a single-planet system and a dual-planet system. Section 5 presents a brief summary.

2 Forward model: Simulated observations

The relative movements of nearby bright stars calibrated using several reference stars can be measured with sub-microarcsecond precision using a space-based differential astrometry telescope equipped with a Michelson interferometer (Unwin et al., 2008; Shao & Nemati, 2009; Malbet et al., 2012; The Theia Collaboration et al., 2017; Ji & Wang, 2020). The technique utilized for such missions is known as micropixel image position sensing, which measures the differential motion between the centroids of a target star and a reference star (Nemati et al., 2011; Zhai et al., 2011). In this technique, the detector pixel response functions in Fourier space are characterized using laser metrology, and these response functions are further used to construct a point spread function (PSF). The derived PSF is resampled at different relative locations from the original position of the centroid of the target star (Δxc\Delta x_{c}s, Δyc\Delta y_{c}s) and compared with the PSF of a second image taken at a later time. Then, the resampled PSF at the location (Δxc\Delta x_{c}, Δyc\Delta y_{c}) that best matches the PSF of the second image can help determine the displacement of the centroid of the target star based on a reference star (Nemati et al., 2011; Zhai et al., 2011). With this method, an astrometric accuracy of 6×1056\times 10^{-5} pixels can be achieved (Crouzier et al., 2016). The space missions proposed for the survey of exoplanets based on differential astrometry have evolved in recent years(The Theia Collaboration et al., 2017; Yu et al., 2019; Ji & Wang, 2020).

In a space-based differential astrometry mission, the observation output of each target star in the operating cycle is a time series of the relative movements between the target star and several selected distant reference stars. These relative movements contain multiple parts. First, the relative displacements caused by the proper motions and parallaxes of the target star and the reference stars are the main contributors to these movements. In the Nii code, the proper motions of the stars are set as necessary input parameters in the simulated observations, and the parallaxes of the stars are can also be included in the same module. Consequently, we can subtract the relative displacements caused by the proper motions and parallaxes of the target and reference stars based on these input parameters. However, this assumes that we accurately know these parameters, which is unlikely in reality. When dealing with real observations, we must conduct a special analysis of the proper motions and parallaxes of the stars within a particular system. We should also investigate complex situations, for example, a potential signal with a period close to a year, as such a signal would produce an astrometric signal that could constructively or destructively interfere with the measured parallax and severely reduce the precision of the simulated measurements. For simplicity, as this work focuses on the retrieval of planetary orbital parameters, we assume that the proper motions and parallaxes of the stars are accurately known and can be directly subtracted from the detected relative movements.

If the target star hosts a planet, then the remaining periodic signals after the proper motions and parallaxes of all the stars are subtracted may correspond to either the planetary system of the target star or the wobbles of a reference star. Such periodic signals created by planetary systems present as movements in rectangular coordinates on the focal plane at different observation time points. As a source of uncertainty, these periodic signals may be related to either the planetary system of the target star or the planetary system of a reference star. Because the selected reference stars are far from the target star, only the wobbles generated by massive Jovian planets orbiting these reference stars can be detected. Therefore, the embodiment of such uncertainty is that we cannot distinguish the source of the detected periodic signals between a terrestrial planet orbiting the target star or a Jovian planet orbiting one of the reference stars. The solution of this problem is the observation scheme of using multiple-reference stars in a space-based differential astrometry mission. A disturbance caused by the planetary system of one reference star cannot appear in the differential astrometric signal obtained using another reference star. Thus, the source of a periodic signal can be identified by comparing the fitting results derived by using different reference stars. Similarly, such a cross-comparison approach can reduce the contamination of the fitting results arising from the errors in the proper motion or parallax of a particular reference star.

In a simple case in which the proper motions and parallaxes of the target and reference stars are accurately removed, the output of an observation pipeline for a target star with a planetary system can be reduced to periodic changes in right ascension and declination around the centre of mass of the system. Considering a target star with a single-planet system, following the approach in the radial velocity detection method (Gregory, 2005a), the changes in right ascension and declination over time can be expressed as

{Δη(t)=Δη(t)+ϵη(t)+ϵxΔδ(t)=Δδ(t)+ϵδ(t)+ϵx$$\begin{cases}\Delta\eta^{\prime}(t)=\Delta\eta(t)+\epsilon_{\eta}(t)+\epsilon_{x}\\ \Delta\delta^{\prime}(t)=\Delta\delta(t)+\epsilon_{\delta}(t)+\epsilon_{x}\\ \end{cases}$$ (1)

where Δη(t)\Delta\eta^{\prime}(t) and Δδ(t)\Delta\delta^{\prime}(t) are the measured changes between each pair of the target star and a reference star in right ascension and declination at the instant of time tt, Δη(t)\Delta\eta(t) and Δδ(t)\Delta\delta(t) are the changes in right ascension and declination due to the presence of a planet, ϵη(t)\epsilon_{\eta}(t) and ϵδ(t)\epsilon_{\delta}(t) are the noise components due to known but unequal measurement errors with an assumed Gaussian distribution, and ϵx\epsilon_{x} represents any additional unknown measurement errors plus any real signal that cannot be described by Δη\Delta\eta and Δδ\Delta\delta (e.g., a signal caused by another planet or the unexpected movement of any reference star).

Suppose the planet orbiting the target star has a mass MpM_{\mbox{p}} and orbital elements (a,e,i,ω,Ω,M0)(a,e,i,\omega,\Omega,M_{0}), where aa is the orbital semimajor axis, ee is the eccentricity, ii is the inclination, ω\omega is the argument of periapsis, Ω\Omega the longitude of the ascending node, and M0M_{0} is the mean anomaly at a reference time. Note that such an orbital parameter set will lead to ambiguity in the fitted ω\omega and Ω\Omega, especially for orbits with small eccentricity and small inclination. One solution is to use equinoctial orbit elements (Broucke & Cefola, 1972). For simplicity, the orbit retrieval in Nii employs Keplerian elements since the MpM_{\mbox{p}}, aa, ee, ii, and the phase curve of the orbit are the only information expected to be known. Nevertheless, although ω\omega is needed to predict the radial velocity curve and Ω\Omega defines the position angles of the orbits of both the planet and the host star around the photocentre of the sky, these parameters are not relevant when determining the mass of the planet and orbital period, the two main objectives of differential astrometry missions (Ji & Wang, 2020).

For precise planetary mass determination, the stellar mass should be an additional free parameter so the uncertainty could be marginalized over through Bayesian sampling. Given that the semimajor axis of an astrometric orbit is inversely proportional to the stellar mass, the effect of using a fixed star mass should be small. Thus in this work, we assume the stellar mass is precisely known. We will parameterize the stellar mass in further development of the Nii code.

To simulate the differential astrometric signal obtained from an observation pipeline, the elliptical orbital motion of the target star induced by the planet should be projected onto the observation plane. The Thiele–Innes elements, (A,B,F,G)(A,B,F,G), are used to calculate the wobble in rectangular coordinates (Thiele, 1883; Alzner, 2004) as follows:

{A=α(cosΩcosωsinΩsinωcosi)B=α(sinΩcosω+cosΩsinωcosi)F=α(cosΩsinωsinΩcosωcosi)G=α(sinΩsinω+cosΩcosωcosi)$$\begin{cases}A=\alpha(\cos\Omega\cos\omega-\sin\Omega\sin\omega\cos i)\\ B=\alpha(\sin\Omega\cos\omega+\cos\Omega\sin\omega\cos i)\\ F=\alpha(-\cos\Omega\sin\omega-\sin\Omega\cos\omega\cos i)\\ G=\alpha(-\sin\Omega\sin\omega+\cos\Omega\cos\omega\cos i)\end{cases}$$ (2)

where α\alpha is the maximum astrometric amplitude of the target star (with a mass of MM_{\ast} at a distance dd from the solar system) due to the reflex motion in the presence of the planet given by

α=3(Mp1M)(a1AU)(M1M)1(d1pc)1μas.\alpha=3\left(\frac{M_{\mbox{p}}}{1M_{\oplus}}\right)\left(\frac{a}{1\mbox{AU}}\right)\left(\frac{M_{\ast}}{1M_{\odot}}\right)^{-1}\left(\frac{d}{1\mbox{pc}}\right)^{-1}\mu\mbox{as}. (3)

For an Earth-like planet in the habitable zone of a solar-like star at 10 parsecs, a typical value of α\alpha is 0.3 μas\mu\mbox{as}.

Given these Thiele–Innes elements, the reflex motion of the target star in right ascension and declination due to the presence of a single planet can be solved by

{Δη(t)=AX(t)+FY(t)Δδ(t)=BX(t)+GY(t)$$\begin{cases}\Delta\eta(t)=AX(t)+FY(t)\\ \Delta\delta(t)=BX(t)+GY(t)\end{cases}$$ (4)

where

{X(t)=cosE(t)eY(t)=1e2sinE(t)E(t)esinE(t)=2πP(tT)$$\begin{cases}X(t)=\cos E(t)-e\\ Y(t)=\sqrt{1-e^{2}}\sin E(t)\\ E(t)-e\sin E(t)=\frac{2\pi}{P}(t-T)\end{cases}$$ (5)

where EE is the eccentric anomaly that can be determined from the mean anomaly M0M_{0} and TT the time of passage through the periapsis.

For a specific system and a specific time series tt, we first simulate the Δη(t)\Delta\eta(t) and Δδ(t)\Delta\delta(t) of the target stars using Equations 2, 3, 4 and 5 with a planet characterized by the parameter set (a,e,i,ω,Ω,M0,Mp)(a,e,i,\omega,\Omega,M_{0},M_{\mbox{p}}). Then, we add the noise components ϵη(ti)\epsilon_{\eta}(t_{i}) and ϵδ(ti)\epsilon_{\delta}(t_{i}) for each instant of time tit_{i} in the time series, and we assume that all ϵη(ti)\epsilon_{\eta}(t_{i}) and ϵδ(ti)\epsilon_{\delta}(t_{i}) follow a Gaussian distribution with a mean of 0 and standard deviations of ση(ti)\sigma_{\eta}(t_{i}) and σδ(ti)\sigma_{\delta}(t_{i}) respectively, which depend on the accuracy of the differential astrometric measurement.

There are two scenarios that may lead to additional measurement errors of ϵx\epsilon_{x}. In the first scenario, there is another unknown planet in the simulated system. In the second, one of the reference stars wobbles due to a planet or a companion star, which leads to unexpected errors in the differential astrometric signal. The difference between these two scenarios is that the additional errors caused by the wobbles of a specific reference star do not appear in the differential astrometric measurement of another reference star.

In this work we simulate a host star with a mass of 1 MM_{\odot} that is located 3 parsecs from the Earth. We calculate the planetary system of this star in the following two scenarios:

  1. 1.

    The target star is within a single-planet system.

  2. 2.

    The target star is within a dual-planet system.

The simulated space mission uses CHES as a prototype (Ji & Wang, 2020). The mission time is 5 years, during which approximately 300 measurements are acquired for each target star. The field of view of the CHES mission is 0.44×{}^{\circ}\times 0.44, in which the relative offset angles between the host star and at least 8 reference stars are measured using the differential astrometry technique. CHES aims to achieve an accuracy of \sim 1 μas\mu\mbox{as}. This information is summarized in Table 1.

We generate the simulated observation signals for these two scenarios using Equation 1 and adopt the method described in Section 3 to fit the orbital parameters of the planetary system of the target star.

Table 1: Parameters of the host star and the simulated observation.
Stellar mass (MM_{\odot}) 1.0
Distance from the Earth (parsec) 3.0
Field of view 0.44×{}^{\circ}\times 0.44
Number of reference stars 8
Mission duration (year) 5
Number of observations 300
Simulated Gaussian measurement error (μas\mu\mbox{as}) 1.0

3 Bayesian orbit retrieval

3.1 Bayesian inference

The orbit retrieval procedure in Nii for a planet revolving around a target star is based on Bayesian analysis (Bayes, 1763). Let θ\theta be the unknown parameter set that we want to estimate with a specific model MM. Then, the posterior probability distribution for θ\theta after we obtain observation data DD is of the following form:

p(θ|D,M)=p(D|θ,M)p(θ|M)p(D|M)p(\theta|D,M)=\frac{p(D|\theta,M)\,p(\theta|M)}{p(D|M)}\, (6)

where p(D|θ,M)p(D|\theta,M) is the likelihood reflecting the probability of generating the particular observation data DD if the parameters in the model MM are equal to θ\theta and p(θ|M)p(\theta|M) is the prior distribution of θ\theta representing our beliefs across different values of the parameters before observing the data. The denominator p(D|M){p(D|M)} is the normalization constant ensuring that the posterior distribution integrates to one. Given observation data DD, we can simply omit the calculation of the denominator and evaluate the posterior distribution as

p(θ|D,M)p(D|θ,M)p(θ|M)p(\theta|D,M)\propto p(D|\theta,M)\,p(\theta|M)\, (7)

The main goal of orbit retrieval is to estimate the posterior probability distribution in Equation 7 by sampling. The point estimate of each parameter is set as the posterior mean because it is representative of the central position of the posterior distribution and mathematically accounts for the measure from a measure-theoretic perspective. Nii implements the APT-MCMC method (Liu, 2001; Gregory, 2005a), which is introduced in detail in Section 3.4. The underlying sampling algorithm of Nii’s MCMC strategy is Metropolis–Hastings (Metropolis et al., 1953; Hastings, 1970).

3.2 Prior distributions

Table 2: Prior distributions of all the model parameters.
Parameter Prior Mathematical Form Min Max
P(days)P({\mathrm{days}}) Jeffreys 1Pln(PmaxPmin)\frac{1}{P\,\ln\big{(}\frac{P_{\mathrm{max}}}{P_{\mathrm{min}}}\big{)}} 0.5 3650
Mp(M)M_{\mbox{p}}(M_{\oplus}) Jeffreys 1Mpln(MmaxMmin)\frac{1}{M_{\mathrm{p}}\,\ln\big{(}\frac{M_{\mathrm{max}}}{M_{\mathrm{min}}}\big{)}} 0.1 3000
ee Uniform 1 0 1
cosicosi Uniform 0.5 1 -1
Ω\Omega Uniform 12π\frac{1}{2\pi} 0 2π2\pi
ω\omega Uniform 12π\frac{1}{2\pi} 0 2π2\pi
M0M_{\mathrm{0}} Uniform 12π\frac{1}{2\pi} 0 2π2\pi
ϵx(μas)\epsilon_{x}(\mu\mbox{as}) Mod. Jeffreys (ϵx+ϵxa)1ln(ϵxa+ϵxmaxϵxa)\frac{(\epsilon_{x}+\epsilon_{xa})^{-1}}{\ln\big{(}\frac{\epsilon_{xa}+\epsilon_{x_{\mathrm{max}}}}{\epsilon_{xa}}\big{)}} 0 (ϵxa=1\epsilon_{xa}=1) 100100

The choice of priors is very important in Bayesian inference since improper priors can produce misleading results. We assume that the prior distribution of each parameter is independent of that of each other parameter and set the prior distribution in a similar way to the radial velocity detection method (Gregory, 2005a).

In the case that the mass of the host star is known, every additional planet needs 7 more parameters to determine its orbit. In addition to the parameters describing planetary orbits, there is also an ϵx\epsilon_{x} that represents any additional unknown errors. Table 2 shows the choice of priors for each parameter and their boundaries. We choose uniform priors for ee, ω\omega, Ω\Omega, and M0M_{0}, and the inclination ii is set to be uniform in cosicosi. For MpM_{\mathrm{p}}, the planetary orbital period PP and ϵx\epsilon_{x}, a uniform prior would be inappropriate since it would strongly favour a parameter range with larger values. We choose the Jeffreys prior for MpM_{\mathrm{p}} and PP and a modified Jeffreys prior for ϵx\epsilon_{x} (Gregory, 2005a).

For a simple model where the target star hosts only one planet, the joint prior distribution for the model parameters, assuming independence, can be written as

p(θ|M)=p(P|M)p(Mp|M)p(e|M)p(cosi|M)p(Ω|M)×p(ω|M)p(M0|M)p(ϵ0|M)\begin{split}p(\theta|M)\,=\,\,&p(P|M)\,p(M_{\mathrm{p}}|M)\,p(e|M)\,p(cosi|M)\,p(\Omega|M)\\ &\quad\quad\times p(\omega|M)\,p(M_{0}|M)\,p(\epsilon_{0}|M)\end{split} (8)

3.3 Likelihood function

One of the features of the Nii code is that it considers the effect of using multiple-reference stars in the analysis of the observation errors. At all moments in an observation time series, we use the relative position angles between a target star and all the reference stars to calculate the standard deviation of the Gaussian observation error at each instant of time. Then, we separately pair the target star and each reference star to perform the orbit retrieval process, in which all the measurements of the wobbles between the target star and each reference star in the observation time series are used. Depending on the fitting results obtained using different reference stars, an additional voting process on the final fitted parameters may be required. Thus, the effect of using multiple reference stars in the differential astrometry method is to restrict the standard deviation at a single measurement time point and to cross-validate the final fitting results.

Suppose there are a total of NN measurements in an observation period tt, and each measurement yields ϵη(ti)\epsilon_{\eta}(t_{i}) and ϵδ(ti)\epsilon_{\delta}(t_{i}) between the target star and every reference star. If there are NrefN_{\mathrm{ref}} reference stars in total, then both ϵη(ti)\epsilon_{\eta}(t_{i}) and ϵδ(ti)\epsilon_{\delta}(t_{i}) have NrefN_{\mathrm{ref}} elements. Assuming at each instant of time tit_{i} the measurement errors for the NrefN_{\mathrm{ref}} reference stars follow a Gaussian distribution with a mean of 0, we can calculate the standard deviation ση(ti)\sigma_{\eta}(t_{i}) and σδ(ti)\sigma_{\delta}(t_{i}) in the Gaussian distribution based on the NrefN_{\mathrm{ref}} elements for both ϵη(ti)\epsilon_{\eta}(t_{i}) and ϵδ(ti)\epsilon_{\delta}(t_{i}).

Since in differential astrometry a measurement gives the offset angles in two directions, namely, right ascension and declination, there are a total of 2N2N measurements between the target star and each reference star in the observation period tt. Nevertheless, this is only an ideal situation, and some data may be missing in real cases. The likelihood function for the 2N2N measurements of the offset angles between the target star and each reference star is given by (Gregory, 2005a; Bishop, 2006)

p(D|θ,M)=Aexp{i=1N[Δη(ti)Δη(ti)]22[ση(ti)2+ϵx2]}×exp{i=1N[Δδ(ti)Δδ(ti)]22[σδ(ti)2+ϵx2]}\begin{split}p(D|\theta,M)=\,A\,&\exp\Biggl{\{}-\sum_{i=1}^{N}\frac{\big{[}\Delta\eta^{\prime}(t_{i})-\Delta\eta(t_{i})\big{]}^{2}}{2\big{[}\sigma_{\eta}(t_{i})^{2}+\epsilon_{x}^{2}\big{]}}\Biggr{\}}\,\\ &\times\exp\Biggl{\{}-\sum_{i=1}^{N}\frac{\big{[}\Delta\delta^{\prime}(t_{i})-\Delta\delta(t_{i})\big{]}^{2}}{2\big{[}\sigma_{\delta}(t_{i})^{2}+\epsilon_{x}^{2}\big{]}}\Biggr{\}}\,\end{split} (9)

where

A=(2π)Ni=1N[ση(ti)2+ϵx2]1/2i=1N[σδ(ti)2+ϵx2]1/2A=(2\pi)^{-N}\prod_{i=1}^{N}\Big{[}\sigma_{\eta}(t_{i})^{2}+\epsilon_{x}^{2}\Big{]}^{-1/2}\,\prod_{i=1}^{N}\Big{[}\sigma_{\delta}(t_{i})^{2}+\epsilon_{x}^{2}\Big{]}^{-1/2}\, (10)

Equation 9 demonstrates that the NrefN_{\mathrm{ref}} reference stars are used together only to determine the standard deviation of the distribution of the measurement errors at each instant of time tit_{i}. In the orbit retrieval process, we separately fit the orbital parameters using the offset angles between the target star and each reference star in the entire time series tt. This approach can reduce a sizable number of parameters in θ\theta because each reference star has a different unknown error represented by ϵx\epsilon_{x}. Finally, we summarize the fitting results obtained from different reference stars to evaluate the reliability of the inferred orbital parameters.

3.4 Implementation of PT-MCMC

The posterior distribution given by Equation 7 is highly multimodal, and an MCMC process can become stuck in many local optimal solutions and fail to fully explore the global distribution. One solution to this problem is referred to as simulated tempering (Geyer & Thompson, 1995; Gregory, 2005b), in which flatter versions of the target posterior distributions are generated using a temperature parameter β\beta:

p^(θ|D,β,M)p(D|θ,M)p(θ|M)β,for   0<β<1\widehat{p}(\theta|D,\beta,M)\propto p(D|\theta,M)\,p(\theta|M)^{\beta}\,,\quad{\mathrm{for}}\,\,\,0<\beta<1 (11)

where β\beta varies from 1, corresponding to the coldest original posterior distribution, to 0, corresponding to the hottest distribution that is completely flat. Using a flat posterior distribution described by Equation 11, an MCMC sampler with a suitable β\beta can easily escape from local optimal solutions.

An attractive technique, parallel tempering (PT), runs several Markov chains with different β\beta values in parallel (Liu, 2001; Gregory, 2005b). In a PT-MCMC run, one chain with β=1\beta=1 corresponds to the original posterior distribution, and other chains with different smaller β\beta values form a set of ladders leading to different higher temperature distributions. At random intervals, e.g., a state ss, a pair of adjacent chains with temperature parameters βi\beta_{i} and βi+1\beta_{i+1} are chosen at random, and their current parameter states, θs,i\theta_{s,i} and θs,i+1\theta_{s,i+1}, are interchanged with a probability of

r=min[ 1,p^(θs,i+1|D,βi,M)p^(θs,i|D,βi+1,M)p^(θs,i|D,βi,M)p^(θs,i+1|D,βi+1,M)]r={\mathrm{min}}\Biggl{[}\,1,\,\frac{\widehat{p}\big{(}\theta_{s,i+1}|D,\beta_{i},M\big{)}\,\widehat{p}\big{(}\theta_{s,i}|D,\beta_{i+1},M\big{)}}{\widehat{p}\big{(}\theta_{s,i}|D,\beta_{i},M\big{)}\,\widehat{p}\big{(}\theta_{s,i+1}|D,\beta_{i+1},M\big{)}}\,\Biggr{]}\, (12)

Such swaps of parameter states allow parallel chains to exchange their sampling areas frequently, facilitating the convergence of global sampling.

In our orbit retrieval model, 8 MCMC chains are run in parallel with a set of β={0.01,0.02,0.05,0.1,0.25,0.5,0.75,1.0}\beta=\{0.01,0.02,0.05,0.1,0.25,0.5,0.75,1.0\}. A proposed swap is randomly evaluated using Equation 12 every 10 to 30 iterations.

3.5 Automatic Gaussian proposal distributions: APT-MCMC

One key factor that determines the completeness of MCMC sampling is the Metropolis–Hastings step size of each parameter θi\theta_{i} (characterized by the standard deviation of the Gaussian proposal distribution σθi\sigma_{\theta_{i}}). If the step sizes are too small, it is difficult for the MCMC sampler to cover the entire parameter space and find the optimal solution since the obtained posterior density is highly dependent on the starting location of the chain. In contrast, if the step sizes are too large, the MCMC sampler rejects the majority of proposals, and only a few effective samples are obtained. For a large number of parameters, an ideal combination of step sizes leads to an acceptance rate of \sim 23.4% (Gelman et al., 1997).

In a high-dimensional orbit fitting model, it is difficult to find an ideal combination of sampling step sizes for all PT Markov chains. First, the ideal step sizes for different parameters θi\theta_{i} are unique and change under different β\beta values. Second, when exploring different regions of the target posterior distribution, the ideal set of step sizes that is compatible with the shape of the local posterior distribution should also be adjusted accordingly. Therefore, in the case of the simplest single-planet model with 8 parameters, a PT-MCMC program with 8 chains with different β\beta values needs to tune 64 different step sizes in real time to reach the optimal acceptance rate, which is almost impossible to perform manually.

The Nii code adopts a different automatic control system from the existing model (Gregory, 2005a, b). In our APT-MCMC scheme, the step sizes of all the parameters of all the PT Markov chains are adjusted dynamically in real time after an initial burn-in stage. At the initial burn-in stage, we set the standard deviation of the Gaussian proposal of each parameter, σθi\sigma_{\theta_{i}} , to 10%10\% of the parameter’s prior range to ensure that the sampler can access all the high-density regions of the posterior distribution. However, these initial step sizes result in extremely low acceptance rates for Markov chains. Thus, after the burn-in stage, the Nii code monitors the acceptance rate of each Markov chain in real time and dynamically adjusts the standard deviations of the Gaussian proposals of all the parameters for the chains with a low acceptance rate. This process is implemented by dividing the entire Markov chains into many short cycles and adding many tuning stages for the chains with poor acceptance rates after each short cycle. In these tuning stages, the program separately adjusts the σθi\sigma_{\theta_{i}} of each parameter with a scaling array and observes the corresponding changes in the acceptance rate. Then, a new combination of step sizes is determined by combining two factors: the sensitivity of the acceptance rate to each σθi\sigma_{\theta_{i}} and the deviations between the ideal acceptance rate and the acceptance rates obtained by all the scaled parameters in the tuning stage. The strategy here is to select from the tuned parameter matrix the parameter set that can result in an acceptance rate that is closest to the ideal rate of 23.4%23.4\% in the local posterior distribution being sampled. Although this method cannot obtain the ideal acceptance rate of 23.4%\sim 23.4\%, it is sufficient for efficient convergence. In our test runs, an APT-MCMC sampler can converge on the posterior distribution within 500,000 iterations.

3.6 Gelman-Rubin convergence diagnostics

We use the Gelman-Rubin criterion, RcR_{c}, to assess whether the APT-MCMC sampling process has converged (Gelman & Rubin, 1992). Gelman-Rubin diagnostics use an analysis of both the between-chain variance and within-chain variance of several independent Markov chains to assess convergence. Suppose there are NN Markov chains of length LL. Let θij\theta_{ij} denote the iith of the LL iterations of θ\theta in the jjth simulated chain, θ¯j\bar{\theta}_{j} denote the sample posterior mean of chain jj, and θ¯\bar{\theta} be the overall sample posterior mean of the NN chains. The between-chains variance, BB, is given by

B=LN1j=1N(θ¯jθ¯)2B=\frac{L}{N-1}\sum_{j=1}^{N}(\bar{\theta}_{j}-\bar{\theta})^{2} (13)

The within-chain variance, WW, is given by

W=1Nj=1N[1L1i=1L(θijθ¯j)2]W=\frac{1}{N}\sum_{j=1}^{N}[\frac{1}{L-1}\sum_{i=1}^{L}(\theta_{ij}-\bar{\theta}_{j})^{2}] (14)

Then, we can obtain an unbiased estimator of the marginal posterior variance of θ\theta, var(θ)var(\theta), by taking the weighted average of BB and WW:

var(θ)=(11L)W+1LBvar(\theta)=(1-\frac{1}{L})W+\frac{1}{L}B (15)

Since the initial value of θ\theta is set randomly during MCMC sampling, var(θ)var(\theta) should overestimate the true marginal posterior variance. On the other hand, WW tends to underestimate the within-chain variance in the early phase of MCMC sampling. As the sampling process gradually converges, both var(θ)var(\theta) and WW stabilize at the true variance of θ\theta. Thus, the Gelman-Rubin criterion uses the potential scale reduction factor, RcR_{c}, as a convergence diagnostic

Rc=var(θ)WR_{c}=\sqrt{\frac{var(\theta)}{W}} (16)

In practice, the values of the Gelman-Rubin criterion should be less than 1.2 for model parameters to declare convergence in MCMC sampling, in more rigorous diagnostics, the values of RcR_{c} are required to be less than 1.1 (Gelman & Rubin, 1992; Brooks & Gelman, 1998).

To assess the convergence of the model parameters in our Bayesian orbit retrieval framework, we run a batch of independent APT-MCMC sampling tests to calculate the Gelman-Rubin criterion. For simplicity, we carry out this process for only one reference star. Because all the reference stars are modelled as stable systems with no observable strong perturbations, the simulated differential astrometric signals derived from different reference stars all give the relative displacement of the target star. Only the random error is different. In this case, the RcR_{c} value obtained from multiple independent APT-MCMC sampling runs using the simulated observation of one reference star can help us evaluate the convergence of the sampling process using the data of other reference stars.

4 Applications

Refer to caption
Figure 1: Corner plot of the posterior distributions of aa, ee, cosicosi, ω\omega+Ω\Omega, M0M_{0}, MpM_{\mbox{p}}, and PP obtained using the differential astrometric signals of one reference star in the single-planet model. The corresponding marginal posterior means and one standard deviation values (and these values obtained using the other 7 reference stars) are given in Table 4.
Refer to caption
Figure 2: Comparison between the simulated observations of the single-planet model and the theoretical signals generated by the fitted parameters corresponding to Figure 1. The points with error bars show the simulated differential astrometric observations, and the red lines signify the theoretical signals corresponding to our fitted parameters. The bottom panels show the residuals.
Table 3: Gelman-Rubin criterion of each parameter calculated using 8 independent APT-MCMC samplings based on the simulated observations of one reference star.
Parameter run1 run2 run3 run4 run5 run6 run7 run8 RcRc value
P(days)P({\mathrm{days}}) 540.22±0.94\pm 0.94 540.35±0.92\pm 0.92 540.18±0.99\pm 0.99 540.11±0.94\pm 0.94 540.23±0.97\pm 0.97 540.20±0.96\pm 0.96 540.24±0.97\pm 0.97 540.20±0.97\pm 0.97 1.0024
Mp(M)M_{\mbox{p}}(M_{\oplus}) 3.176±0.063\pm 0.063 3.176±0.063\pm 0.063 3.190±0.063\pm 0.063 3.176±0.060\pm 0.060 3.177±0.063\pm 0.063 3.175±0.062\pm 0.062 3.178±0.062\pm 0.062 3.177±0.062\pm 0.062 1.0003
ee 0.246±0.028\pm 0.028 0.247±0.031\pm 0.031 0.249±0.029\pm 0.029 0.243±0.028\pm 0.028 0.249±0.030\pm 0.030 0.242±0.031\pm 0.031 0.246±0.031\pm 0.031 0.246±0.032\pm 0.032 1.0031
cosicosi 0.625±0.019\pm 0.019 0.623±0.018\pm 0.018 0.625±0.018\pm 0.018 0.626±0.017\pm 0.017 0.625±0.018\pm 0.018 0.624±0.018\pm 0.018 0.624±0.017\pm 0.017 0.625±0.018\pm 0.018 1.0012
M0M_{0} () 110.06±6.50\pm 6.50 108.20±6.26\pm 6.26 111.23±7.48\pm 7.48 111.99±6.28\pm 6.28 110.34±7.96\pm 7.96 110.06±6.77\pm 6.77 109.76±7.17\pm 7.17 110.85±6.95\pm 6.95 1.0130
ϵx(μas)\epsilon_{x}(\mu\mbox{as}) 0.353±0.060\pm 0.060 0.350±0.062\pm 0.062 0.354±0.061\pm 0.061 0.352±0.063\pm 0.063 0.352±0.065\pm 0.065 0.352±0.061\pm 0.061 0.351±0.063\pm 0.063 0.352±0.062\pm 0.062 1.0002
ω\omega () 32.54±6.60\pm 6.60 210.74±6.85\pm 6.85 112.68±89.15\pm 89.15 156.17±84.36\pm 84.36 122.21±91.25\pm 91.25 152.23±86.88\pm 86.88 112.58±90.31\pm 90.31 70.14±74.19\pm 74.19 1.2379
Ω\Omega () 207.06±2.09\pm 2.09 27.13±2.99\pm 2.99 128.09±89.17\pm 89.17 85.34±84.41\pm 84.41 117.70±90.24\pm 90.24 87.31±85.06\pm 85.06 126.79±89.40\pm 89.40 170.23±72.68\pm 72.68 1.2455
ω+Ω\omega+\Omega () 239.60±6.40\pm 6.40 237.87±7.39\pm 7.39 240.77±7.23\pm 7.23 241.51±7.58\pm 7.58 239.91±7.82\pm 7.82 239.53±6.61\pm 6.61 239.37±8.45\pm 8.45 240.37±6.81\pm 6.81 1.0109
Table 4: Model parameters and the fitted means and one standard deviations obtained from all 8 reference stars in the single-planet model.
CASE P(days)P({\mathrm{days}}) Mp(M)M_{\mbox{p}}(M_{\oplus}) ee cosicosi M0M_{0} () ω+Ω\omega+\Omega () ϵx(μas)\epsilon_{x}(\mu\mbox{as})
Model 541.37 3.100 0.200 0.643 100 230
Ref 1 540.22±0.94\pm 0.94 3.176±0.063\pm 0.063 0.246±0.028\pm 0.028 0.625±0.019\pm 0.019 110.1±6.5\pm 6.5 239.6±6.4239.6\pm 6.4 0.35±0.06\pm 0.06
Ref 2 542.75±0.98\pm 0.98 3.172±0.061\pm 0.061 0.250±0.028\pm 0.028 0.638±0.018\pm 0.018 110.8±6.9\pm 6.9 240.5±6.9240.5\pm 6.9 0.34±0.06\pm 0.06
Ref 3 539.97±0.91\pm 0.91 3.249±0.057\pm 0.057 0.220±0.028\pm 0.028 0.604±0.016\pm 0.016 102.3±7.3\pm 7.3 231.4±7.2231.4\pm 7.2 0.23±0.08\pm 0.08
Ref 4 542.63±1.04\pm 1.04 3.108±0.058\pm 0.058 0.213±0.032\pm 0.032 0.614±0.019\pm 0.019 108.3±10.4\pm 10.4 222.9±10.4222.9\pm 10.4 0.39±0.06\pm 0.06
Ref 5 540.31±0.98\pm 0.98 3.191±0.059\pm 0.059 0.203±0.029\pm 0.029 0.632±0.018\pm 0.018 104.0±9.0\pm 9.0 233.4±8.8233.4\pm 8.8 0.35±0.06\pm 0.06
Ref 6 540.81±0.93\pm 0.93 3.162±0.061\pm 0.061 0.186±0.028\pm 0.028 0.620±0.017\pm 0.017 92.9±8.2\pm 8.2 220.8±8.3220.8\pm 8.3 0.34±0.06\pm 0.06
Ref 7 540.65±1.05\pm 1.05 3.061±0.061\pm 0.061 0.242±0.031\pm 0.031 0.658±0.019\pm 0.019 106.9±8.3\pm 8.3 235.8±8.0235.8\pm 8.0 0.42±0.06\pm 0.06
Ref 8 542.96±1.02\pm 1.02 3.089±0.061\pm 0.061 0.241±0.033\pm 0.033 0.612±0.019\pm 0.019 88.4±8.3\pm 8.3 219.2±8.5219.2\pm 8.5 0.47±0.06\pm 0.06

4.1 Single-planet system

The simplest case is to infer the orbital parameters of a system with only one planet. We simulate the differential astrometric signals of a planet with a mass of 3.1MM_{\oplus} and an orbital period of 541 days. The observation strategy of our mission is to make 300 measurements of the offset angles between the target star and its reference stars over a period of 5 years. We randomly remove several consecutive clusters from the entire time series to better simulate real observations, which generally contain discontinuities because the targets are not always visible from the satellite. As a consequence, we generate 8 pairs of \sim 250-point time series of the angular wobbles in right ascension and declination using 8 relatively stable reference stars. The standard deviation of the simulated Gaussian measurement error is 1 μas\mu\mbox{as}.

Using the Nii code that implements the APT-MCMC sampling strategy described in Section 3, we fit the mass and orbital parameters of the simulated single-planet system using the differential angular measurements of the 8 reference stars. Our MCMC chains contain 1,000,000 iterations and throw the first 300,000 iterations as burn-in. To ensure the convergence of the sampling parameters, we select one reference star, perform independent APT-MCMC runs, and calculate the Gelman-Rubin convergence diagnostic criterion RcR_{c}. Table 3 gives the posterior means and the standard deviations of 8 independent Markov chains using the same observation data of one reference star and the RcR_{c} values calculated from this set of independent APT-MCMC samplings. The results indicate that the convergence is very good for PP, MpM_{\mathrm{p}}, ee, cosicosi, M0M_{0}, and ϵ0\epsilon_{0}, and that the RcR_{c} values for these 6 parameters are all \lesssim 1.01. In addition, ω\omega and Ω\Omega do not converge separately, as they both have an RcR_{c} that is greater than 1.2. However, the summation of ω\omega and Ω\Omega converges with RcR_{c} \approx 1.01. The fitted means, standard deviations and RcR_{c} values of the summation of ω\omega and Ω\Omega in these 8 independent runs are also given in Table 3. By comparing the theoretical differential astrometry curves corresponding to a variable ω\omega, a variable Ω\Omega, and different combinations of ω\omega and Ω\Omega, we find that similar observation curves can be obtained as long as the sum of the two variables remains the same. The argument of periapsis ω\omega is related to the orbital eccentricity, and thus, its value has no meaning in the case of circular orbits. Likewise, the longitude of the ascending node Ω\Omega is related to the orbital inclination, and thus, its value is also meaningless for circular orbits. For the magnitudes of the orbital eccentricity and inclination in our simulated model, the differences between the differential astrometry curves corresponding to a wide variety of combinations of ω\omega and Ω\Omega are minuscule when the summation of these two variables remains unchanged. Therefore, in our APT-MCMC sampling process, ω\omega and Ω\Omega yield different combinations with similar sums, thereby causing these two parameters to fail to converge individually.

Figure 1 shows the corner plots (Foreman-Mackey, 2016) of all the model parameters, among which ω\omega and Ω\Omega are displayed in the form of their summation, obtained using the differential astrometric signals of one of the reference stars in the single-planet model. We can see from the marginal posterior distributions that all the orbital parameters converge well. The corner plots also show that M0M_{0} is strongly correlated with the sum of ω\omega and Ω\Omega. MpM_{\mbox{p}} and cosicosi are also correlated since both of these parameters determine the magnitude of the disturbance in differential astrometric signals.

Figure 2 compares the simulated differential astrometric signals corresponding to the reference star used in the fitting of Figure 1 and the theoretical signals generated by the marginal-posterior mean values of MpM_{\mbox{p}}, aa, ee, cosicosi, PP, and M0M_{0}. The values of ω\omega and Ω\Omega used to generate the theoretical signals are set to the values in the final iteration of the MCMC chain with β=1\beta=1. These theoretical signals can accurately reproduce the angular wobbles in the simulated observations. Figure 2 further shows the residual plots of the fitted right ascension and declination curves. The mean values of the residuals in the right ascension and declination directions are -0.0671 and 0.0035 μas\mu\mbox{as}, respectively, and the corresponding standard deviations are 0.9795 and 0.9536 μ\muas. These values are is in good agreement with the injected Gaussian observation error with a mean value of 0 μ\muas and a standard deviation of 1 μ\muas.

The retrieval processes using the simulated signals of the other 7 reference stars are also successful, and all the fitted results are summarized in Table 4. Among all the parameters, the orbital period PP achieves the best-fit: the largest error among the 8 cases is << 1.5 days, and all cases have a one standard deviation of \sim 1 day. The planetary mass MpM_{\mbox{p}} is also accurately estimated with a maximum relative error of \sim 4.8%. The other five orbital parameters are also reasonably derived. Because none of the 8 simulated reference stars is accompanied by an additional astrometric disturbance, the fitted unknown measurement errors ϵx\epsilon_{x} in all of the 8 cases are less than 0.5 μas\mu\mbox{as}, which is less than the standard deviation of the injected Gaussian measurement error of 1 μas\mu\mbox{as}.

4.2 Dual-planet system

Table 5: Model parameters and the fitted means and one standard deviations obtained from the two rounds of fitting using all 8 reference stars in the dual-planet model.
CASE P(days)P({\mathrm{days}}) Mp(M)M_{\mbox{p}}(M_{\oplus}) ee cosicosi M0M_{0} () ω+Ω\omega+\Omega () ϵx(μas)\epsilon_{x}(\mu\mbox{as})
Model 882.04 25.000 0.100 -0.259 200 120
Ref 1 882.78±1.22\pm 1.22 24.61±0.11\pm 0.11 0.148±0.009\pm 0.009 -0.269±0.005\pm 0.005 199.1±4.1\pm 4.1 116.9±4.2116.9\pm 4.2 2.42±0.09\pm 0.09
Ref 2 882.57±1.24\pm 1.24 24.69±0.12\pm 0.12 0.156±0.009\pm 0.009 -0.267±0.005\pm 0.005 203.5±3.7\pm 3.7 121.2±3.8121.2\pm 3.8 2.45±0.09\pm 0.09
Ref 3 882.43±1.24\pm 1.24 24.59±0.11\pm 0.11 0.144±0.009\pm 0.009 -0.272±0.006\pm 0.006 200.8±4.2\pm 4.2 118.4±4.2118.4\pm 4.2 2.44±0.09\pm 0.09
Ref 4 881.63±1.24\pm 1.24 24.54±0.12\pm 0.12 0.139±0.009\pm 0.009 -0.270±0.005\pm 0.005 197.5±4.4\pm 4.4 114.6±4.5114.6\pm 4.5 2.46±0.09\pm 0.09
Ref 5 882.53±1.24\pm 1.24 24.61±0.12\pm 0.12 0.150±0.009\pm 0.009 -0.268±0.005\pm 0.005 200.4±3.7\pm 3.7 118.1±3.8118.1\pm 3.8 2.45±0.09\pm 0.09
Ref 6 882.33±1.25\pm 1.25 24.61±0.12\pm 0.12 0.146±0.009\pm 0.009 -0.269±0.006\pm 0.006 200.1±4.2\pm 4.2 117.9±4.3117.9\pm 4.3 2.44±0.09\pm 0.09
Ref 7 882.01±1.27\pm 1.27 24.61±0.12\pm 0.12 0.146±0.010\pm 0.010 -0.268±0.006\pm 0.006 199.9±4.6\pm 4.6 117.5±4.7117.5\pm 4.7 2.53±0.09\pm 0.09
Ref 8 882.38±1.26\pm 1.26 24.54±0.12\pm 0.12 0.142±0.009\pm 0.009 -0.271±0.006\pm 0.006 198.2±4.0\pm 4.0 115.9±4.1115.9\pm 4.1 2.52±0.09\pm 0.09
Model 541.37 3.10 0.200 0.643 100 230
Ref 1 543.45±1.97\pm 1.97 3.02±0.12\pm 0.12 0.361±0.059\pm 0.059 0.356±0.026\pm 0.026 121.0±10.9\pm 10.9 256.4±10.2256.4\pm 10.2 1.11±0.06\pm 0.06
Ref 2 542.56±1.80\pm 1.80 3.04±0.11\pm 0.11 0.391±0.061\pm 0.061 0.320±0.027\pm 0.027 99.2±8.5\pm 8.5 232.5±8.7232.5\pm 8.7 1.11±0.06\pm 0.06
Ref 3 540.97±2.00\pm 2.00 2.96±0.11\pm 0.11 0.334±0.057\pm 0.057 0.325±0.026\pm 0.026 110.8±11.7\pm 11.7 241.2±11.4241.2\pm 11.4 1.16±0.06\pm 0.06
Ref 4 540.23±2.00\pm 2.00 3.07±0.11\pm 0.11 0.296±0.050\pm 0.050 0.314±0.024\pm 0.024 128.2±12.0\pm 12.0 257.2±11.3257.2\pm 11.3 1.14±0.06\pm 0.06
Ref 5 545.91±1.94\pm 1.94 3.01±0.11\pm 0.11 0.346±0.056\pm 0.056 0.334±0.027\pm 0.027 115.1±9.9\pm 9.9 252.1±9.8252.1\pm 9.8 1.17±0.06\pm 0.06
Ref 6 543.71±2.07\pm 2.07 3.01±0.14\pm 0.14 0.424±0.057\pm 0.057 0.330±0.029\pm 0.029 114.9±8.6\pm 8.6 252.0±8.4252.0\pm 8.4 1.28±0.06\pm 0.06
Ref 7 543.43±2.10\pm 2.10 3.13±0.13\pm 0.13 0.413±0.061\pm 0.061 0.304±0.027\pm 0.027 107.8±9.7\pm 9.7 242.0±9.6242.0\pm 9.6 1.28±0.06\pm 0.06
Ref 8 542.95±1.80\pm 1.80 3.10±0.11\pm 0.11 0.390±0.052\pm 0.052 0.355±0.027\pm 0.027 110.4±8.6\pm 8.6 243.6±8.3243.6\pm 8.3 1.13±0.06\pm 0.06
Refer to caption
Figure 3: Corner plot of the posterior distributions of aa, ee, cosicosi, ω\omega+Ω\Omega, M0M_{0}, MpM_{\mbox{p}}, and PP in the first round of fitting obtained using the differential astrometric signals of one reference star in the dual-planet model. The corresponding marginal posterior means and one standard deviation values (and these values obtained using the other 7 reference stars) are given in Table 5.
Refer to caption
Figure 4: Corner plot of the posterior distributions of aa, ee, cosicosi, ω\omega+Ω\Omega, M0M_{0}, MpM_{\mbox{p}}, and PP in the second round of fitting obtained using the differential astrometric signals of one reference star in the dual-planet model. The corresponding marginal posterior means and one standard deviation values (and these values obtained using the other 7 reference stars) are given in Table 5.
Refer to caption
Figure 5: Comparison between the simulated observation of the dual-planet model and the theoretical signals generated by the fitted parameters corresponding to Figure 3 and 4. The points with error bars show the simulated differential astrometric observations generated using the MERCURY6 N-body integration package, and the red lines signify the theoretical signals corresponding to our fitted parameters. The bottom panels show the residuals. The residuals shown in the bottom panels contain the mutual gravitational interaction between the two planets that our fitting approach cannot explore.

Retrieving the orbital parameters of a multiplanet system is far more complicated than retrieving those of a single-planet system, as the astrometric signals of the planets constructively or destructively interfere with one another due to their mutual gravitational interactions. Moreover, as shown in Equation 3, the strength of an astrometric signal is proportional to the product of a planet’s mass and its orbital semimajor axis. Therefore, when there are massive planets at distant orbits, the disturbances caused by low-mass planets at closer orbits become difficult to detect.

Here, we simulate a dual-planet system with two planets featuring different masses. The mass and orbital parameters of the low-mass planet, as well as the characteristics of the target star, are the same as those in the single-planet system case described in Section 4.1. Further, we add a massive planet of 25MM_{\oplus} at a larger distance into the dual-planet model; the orbital parameters of this high-mass planet are given in Table 5. In this system, the strength of the astrometric signal generated by the low-mass planet is approximately 10% of that of the massive planet. To include the mutual gravitational interaction between the two planets, the differential astrometric signal of the dual-planet system is simulated with the N-body package MERCURY6 (Chambers, 1999).

One correct approach to conduct the orbit retrieval of a dual-planet system is to simultaneously fit both planets’ orbital elements at the same time. Such an approach requires the combination of our APT-MCMC code with an N-body integration package. In the Nii code, instead of a more direct approach of using an N-body integration package to simulate the dual-planet system, we implement a simplified version of this method: we simultaneously fit both planets’ orbital elements using a linear superposition of the two planets’ Keplerian orbits. This process of simultaneously fitting the orbital elements of both planets involves 15 parameters, i.e., two pairs of the elements (a,e,i,ω,Ω,M0,Mp)(a,e,i,\omega,\Omega,M_{0},M_{\mbox{p}}) and one parameter ϵx\epsilon_{x} to account for additional measurement errors. However, we find that our APT-MCMC sampling strategy cannot converge for such a 15-parameter fitting process. Even if we set the initial values of the parameters near their true value, only the parameters of the massive planet can converge after 1,000,000 iterations. The main reason may be that the disturbance of the low-mass planet is too small compared to that of the massive planet. Therefore, it is difficult for our automatic control system to dynamically determine proper combinations of the sampling step sizes for the orbital parameters of the two planets corresponding to their different astrometric signal strengths. A more efficient sampling strategy is needed to achieve convergence of the two-planet model. We will investigate this in subsequent work.

Ultimately, we adopt a two-stage fitting process to retrieve the orbital elements of the dual-planet system. In the first round, we fit the two planets’ signals as though they originate from a single-planet system. As a result, the fitted unknown measurement errors ϵx\epsilon_{x} are significantly larger than the standard deviation of the Gaussian measurement error. This implies that there may be other planets within the system. In the second round, we subtract the signals corresponding to the fitted massive planet with the parameters obtained in the first round of fitting and perform another round of fitting on the residuals. In both rounds, our MCMC chains contain 1,000,000 iterations and throw the first 300,000 iterations as burn-in.

Table 5 gives the results of these two rounds of fitting for all 8 reference stars. The orbital period PP, M0M_{0}, and the sum of ω\omega and Ω\Omega are well fitted in the first round. For the mass MpM_{\mbox{p}}, the fitted values are approximately 1.5% less than the injected value. The 8 fitting processes using different reference stars all yield a larger ee and a smaller cosicosi. An important indicator is the fitted unknown measurement errors ϵx\epsilon_{x}, which is \sim 2.5 μas\mu\mbox{as} for all 8 cases. Since the fitted values of ϵx\epsilon_{x} are much larger than the standard deviation of the simulated Gaussian measurement error of 1 μas\mu\mbox{as}, other signals are inferred to exist within the system. In the second round of fitting, the main parameters of the low-mass planet, i.e., its planetary mass and orbital period, are well fitted, but the relative errors of the fitted values are larger than those in the case of the single-planet system in Section 4.1. Compared with the results of the single-planet system in Table 4, the other parameters are not well fitted, especially ee and cosicosi. We thus derive a much larger ee and a much smaller cosicosi. Because the fitted values of ϵx\epsilon_{x} in the second round of fitting in the 8 cases are all less than 1.3 μas\mu\mbox{as}, similar to the standard deviation of the injected Gaussian measurement error, our fitting process ends after the second round.

The reason for the insufficient fits of ee and cosicosi in the dual-planet system can be explained from Figures 3 and 4, which present the corner plots of all the model parameters. Once again, ω\omega and Ω\Omega are displayed as their summation obtained in the first and second rounds of fitting using the differential astrometric signals of one of the reference stars in the dual-planet model. As shown in Figure 3, MpM_{\mbox{p}} and ee are clearly correlated. In the first round, we seek to fit the astrometric signals of a dual-planet system using the disturbance of a single massive planet; consequently, the disturbance caused by the low-mass planet is initially explained by adjusting the eccentricity of the orbit. As a result, we obtain a larger orbital eccentricity and a smaller planetary mass in the first round of fitting. These inaccurate fitting values cause the residual after the first round of fitting to contain some distorted information, which further contributes to the deviations in the orbital eccentricity and orbital inclination in the second round of fitting. This outcome suggests that there should be a better way to fit the orbital elements of all the planets in a multiplanet system simultaneously. Unfortunately, our Nii code fails follow this approach, i.e., to fit all 15 orbital elements of the considered dual-planet system at the same time. Nevertheless, since our two-stage fitting approach can accurately obtain the masses and orbital periods of the two planets, which is the main information we seek, in this work, we continue to employ this imperfect method.

Figure 5 compares the simulated differential astrometric signals of the dual-planet system corresponding to the reference star used in the fitting of Figure 3 and 4 and the theoretical signals generated by the marginal posterior mean values of MpM_{\mbox{p}}, aa, ee, cosicosi, M0M_{0}, and PP in combination with ω\omega and Ω\Omega in the final iteration in the MCMC chain. The mean values of the residuals in the right ascension and declination directions are -0.5072 and 0.1995 μ\muas, respectively, and the corresponding standard deviations are 1.5358 and 1.1545 μ\muas. We can see from the residuals that although the simulated observations can be accurately explained by the fitted curves overall, clear local features remain that cannot be reproduced by the fitted curves. These local features in the residuals originate from two aspects: one is the mutual gravitational action between the two planets that our fitting approach cannot explain, and the other is our inaccurate fits of the orbital eccentricity and inclination of the two planets.

5 Summary

The Nii code implements an APT-MCMC framework for the sampling of multidimensional posterior distributions and provides an observation simulation platform for the differential astrometric measurement of exoplanets. In this paper, we test the orbit retrieval model in two cases of a single-planet system and a dual-planet system. In the single-planet case, all the orbital elements can be accurately fitted. In the dual-planet case, the fitting accuracies of the planetary masses and orbital periods of the two planets can reach the same level as those in the single-planet system; however, due to the shortcomings of our two-stage fitting approach, the fitting accuracies of the other orbital elements of the dual-planet system are lower than those of the single-planet system. We will carry out an in-depth investigation of the orbit retrieval of multiplanet systems in a forthcoming study.

Since Nii is an open source Python-based package, it provides an easy-to-use and expandable implementation of APT-MCMC. To simplify the application of this code in other scientific problems with different posterior distributions, we provide many control parameters in the APT part to facilitate the adjustment of the MCMC sampling strategy, for example, the number of parallel chains, the β\beta values of different chains, the number of proposed swaps between all the parallel chains, the average number of iterations between each proposed swap, the dynamic range of the sampling step sizes, and frequency of adjusting the step sizes. These easy-to-use control parameters ensure that the MCMC sampling strategy is sufficiently adjustable to achieve rapid convergence on a specific posterior distribution.

By adapting different prior and likelihood functions, the Nii program can be applied to different Bayesian analysis problems. Such modifications are relatively simple in Nii. We also plan to provide a C language version of Nii in a follow-up work for further astrophysical usage.

Acknowledgments

We thank the referee for a thorough and constructive report that significantly improved the manuscript. This work is supported by the B-type Strategic Priority Program of the Chinese Academy of Sciences (Grant No. XDB41000000), the Strategic Priority Research Program on Space Science of the Chinese Academy of Sciences (Grant No. XDA15020800), National Natural Science Foundation of China (Grant Nos. 11973094, 12111530175, 12033010, 11873097, 11633009), Youth Innovation Promotion Association CAS (2020319), and Foundation of Minor Planets of Purple Mountain Observatory.

DATA AVAILABILITY

The Nii code and the test runs presented in this work is public available through GitHub.

References

  • Alzner (2004) Alzner A., 2004, The Orbital Elements of a Visual Binary Star. In: Argyle B. (eds) Observing and Measuring Visual Double Stars. Patrick Moore’s Practical Astronomy Series. Springer, London
  • Balan & Lahav (2009) Balan S. T., Lahav O., 2009, MNRAS, 394, 1936
  • Bayes (1763) Bayes T., 1763, Philosophical Transactions, 53, 370
  • Benedict et al. (2002) Benedict G. F., McArthur B. E., Forveille T., Delfosse X., Nelan E., Butler R. P., Spiesman W., et al., 2002, ApJL, 581, L115
  • Bishop (2006) Bishop C. M., 2006, Pattern recognition and machine learning, New York: Springer, p. 28
  • Bitsch, Lambrechts, & Johansen (2015) Bitsch B., Lambrechts M., Johansen A., 2015, A&A, 582, A112
  • Borucki et al. (2011) Borucki W. J., Koch D. G., Basri G., Batalha N., Brown T. M., Bryson S. T., Caldwell D., et al., 2011, ApJ, 736, 19
  • Brandt et al. (2021) Brandt T. D., Dupuy T. J., Li Y., Brandt G. M., Zeng Y., Michalik D., Bardalez Gagliuffi D. C., et al., 2021, arXiv, arXiv:2105.11671
  • Brooks & Gelman (1998) Brooks S. P., Gelman A., 1998, Journal of Computational and Graphical Statistics, 7, 434
  • Broucke & Cefola (1972) Broucke R. A., Cefola P. J., 1972, CeMec, 5, 303
  • Cassan et al. (2012) Cassan A., Kubas D., Beaulieu J.-P., Dominik M., Horne K., Greenhill J., Wambsganss J., et al., 2012, Natur, 481, 167
  • Catanzarite et al. (2007) Catanzarite J., Shao M., Tanner A., Unwin S., Yu J., 2006, PASP, 118, 1319
  • Chambers (1999) Chambers J. E., 1999, MNRAS, 304, 793
  • Chen, Wu, & Li (2013) Chen D., Wu J., Li B., 2013, European Planetary Science Congress, EPSC2013-1102
  • Crouzier et al. (2016) Crouzier A., Malbet F., Henault F., Léger A., Cara C., LeDuigou J. M., Preis O., et al., 2016, A&A, 595, A108
  • Cumming et al. (2008) Cumming A., Butler R. P., Marcy G. W., Vogt S. S., Wright J. T., Fischer D. A., 2008, PASP, 120, 531
  • Díaz et al. (2014) Díaz R. F., Almenara J. M., Santerne A., Moutou C., Lethuillier A., Deleuil M., 2014, MNRAS, 441, 983
  • Eastman, Gaudi, & Agol (2013) Eastman J., Gaudi B. S., Agol E., 2013, PASP, 125, 83
  • Foreman-Mackey (2016) Foreman-Mackey D., 2016, The Journal of Open Source Software, 1, 24
  • Feng et al. (2019) Feng F., Anglada-Escudé G., Tuomi M., Jones H. R. A., Chanamé J., Butler P. R., Janson M., 2019, MNRAS, 490, 5002
  • Feroz & Hobson (2014) Feroz F., Hobson M. P., 2014, MNRAS, 437, 3540
  • Fischer et al. (2014) Fischer D. A., Howard A. W., Laughlin G. P., Macintosh B., Mahadevan S., Sahlmann J., Yee J. C., 2014, Protostars and Planets VI (University of Arizona Press) p. 715
  • Ford (2006) Ford E. B., 2006, ApJ, 642, 505
  • Gelman & Rubin (1992) Gelman A., Rubin D. B., 1992, Statistical Science, 7, 457
  • Gelman et al. (1997) Gelman A., Gilks W. R., Roberts G. O., 1997, The Annals of Applied Probability, 7, 110
  • Geyer & Thompson (1995) Geyer C. J., Thompson E. A., 1995, Journal of the American Statistical Association, 90, 909
  • Goullioud et al. (2008) Goullioud R., Catanzarite J. H., Dekens F. G., Shao M., Marr J. C., 2008, SPIE, 7013, 70134T
  • Gregory (2005a) Gregory P. C., 2005a, ApJ, 631, 1198
  • Gregory (2005b) Gregory P. C., 2005b, Bayesian Logical Data Analysis for the Physical Sciences: A Comparative Approach with Mathematica Support. Cambridge University Press, Cambridge
  • Gregory (2011) Gregory P. C., 2011, MNRAS, 410, 94
  • Hastings (1970) Hastings W. K., 1970, Biometrika, 57, 97
  • Heintz (1988) Heintz W. D., 1988, JRASC, 82, 140
  • Ida & Lin (2004) Ida S., Lin D. N. C., 2004, ApJ, 616, 567
  • Jacob (1855) Jacob W. S., 1855, MNRAS, 15, 228
  • Ji & Wang (2020) Ji J. H., Wang S., 2020, Chinese Journal of Space Science, 40(5), 729
  • Liu (2001) Liu J. S., 2001, Monte Carlo Strategies in Scientific Computing. Springer-Verlag, New York
  • Liu, Liu, & Zhu (2018) Liu S.-Y., Liu J.-C., Zhu Z., 2018, ChA&A, 42, 594
  • Liu & Ji (2020) Liu B., Ji J. H., 2020, RAA, 20, 164
  • Malbet et al. (2012) Malbet F., Léger A., Shao M., Goullioud R., Lagage P.-O., Brown A. G. A., Cara C., et al., 2012, ExA, 34, 385
  • Malbet et al. (2016) Malbet F., Léger A., Anglada Escudé G., Sozzetti A., Spolyar D., Labadie L., Shao M., et al., 2016, SPIE, 9904, 99042F
  • Marcy et al. (2014) Marcy G. W., Isaacson H., Howard A. W., Rowe J. F., Jenkins J. M., Bryson S. T., Latham D. W., et al., 2014, ApJS, 210, 20
  • Mayor et al. (2011) Mayor M., Marmier M., Lovis C., Udry S., Ségransan D., Pepe F., Benz W., et al., 2011, arXiv, arXiv:1109.2497
  • Metropolis et al. (1953) Metropolis N., Rosenbluth A. W., Rosenbluth M. N., Teller A. H., Teller E., 1953, The Journal of Chemical Physics, 21, 1087
  • Mordasini et al. (2012) Mordasini C., Alibert Y., Georgy C., Dittkrist K.-M., Klahr H., Henning T., 2012, A&A, 547, A112
  • Nemati et al. (2011) Nemati B., Shao M., Zhai C., Erlig H., Goullioud R., Wang X., 2011, SPIE, 8151, 81510W. doi:10.1117/12.894477
  • Perryman et al. (2014) Perryman M., Hartman J., Bakos G. Á., Lindegren L., 2014, ApJ, 797, 14
  • Ranalli, Hobbs, & Lindegren (2018) Ranalli P., Hobbs D., Lindegren L., 2018, A&A, 614, A30
  • Schulze-Hartung, Launhardt, & Henning (2012) Schulze-Hartung T., Launhardt R., Henning T., 2012, A&A, 545, A79
  • Shao & Nemati (2009) Shao M., Nemati B., 2009, PASP, 121, 41
  • Snellen & Brown (2018) Snellen I. A. G., Brown A. G. A., 2018, NatAs, 2, 883
  • Tanner, Gelino, & Law (2010) Tanner A. M., Gelino C. R., Law N. M., 2010, PASP, 122, 1195
  • The Theia Collaboration et al. (2017) The Theia Collaboration, Boehm C., Krone-Martins A., Amorim A., Anglada-Escude G., Brandeker A., Courbin F., et al., 2017, arXiv, arXiv:1707.01348
  • Thiele (1883) Thiele T. N., 1883, Astronomisches Nachrichten, 104, 245
  • Thompson et al. (2018) Thompson S. E., Coughlin J. L., Hoffman K., Mullally F., Christiansen J. L., Burke C. J., Bryson S., et al., 2018, ApJS, 235, 38
  • Unwin et al. (2008) Unwin S. C., Shao M., Tanner A. M., Allen R. J., Beichman C. A., Boboltz D., Catanzarite J. H., et al., 2008, PASP, 120, 38
  • Yu et al. (2019) Yu Z.-Y., Liu H.-G., Zhou J.-L., Wu D.-H., Yang M., Wang S., Zhang H., et al., 2019, RAA, 19, 004
  • Zhai et al. (2011) Zhai C. X., Shao M., Renaud G., Nemati B., 2011, Proc.R.Soc.A., 467, 3550
  • Zhang (2020) Zhang X., 2020, RAA, 20, 099