Detecting extreme-mass-ratio inspirals for space-borne detectors with deep learning

Qianyun Yun
Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai, China, 200030
School of Physics and Astronomy, Shanghai Jiao Tong University, 800 Dongchuan RD., Minhang District, Shanghai, 200240, China

Wen-Biao Han (Corresponding author: [email protected])
Shanghai Astronomical Observatory, Chinese Academy of Sciences, Shanghai, China, 200030
Hangzhou Institute for Advanced Study, University of Chinese Academy of Sciences, Hangzhou 310124, China
School of Astronomy and Space Science, University of Chinese Academy of Sciences, Beijing, China, 100049
Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), University of Chinese Academy of Sciences, Beijing 100049, China

Yi-Yang Guo
Lanzhou Center for Theoretical Physics, Key Laboratory of Theoretical Physics of Gansu Province, and Key Laboratory of Quantum Theory and Applications of MoE, Lanzhou University, Lanzhou, Gansu 730000, China
Institute of Theoretical Physics & Research Center of Gravitation, Lanzhou University, Lanzhou 730000, China

He Wang
International Centre for Theoretical Physics Asia-Pacific (ICTP-AP), University of Chinese Academy of Sciences (UCAS), Beijing, China
Taiji Laboratory for Gravitational Wave Universe (Beijing/Hangzhou), University of Chinese Academy of Sciences, Beijing 100049, China

Minghui Du
Center for Gravitational Wave Experiment, National Microgravity Laboratory, Institute of Mechanics, Chinese Academy of Sciences, Beijing 100190, China

Abstract

One of the primary objectives of space-borne gravitational wave detectors is the detection of extreme-mass-ratio inspirals (EMRIs). This is a substantial challenge because EMRI signals are long, complex, and inherently faint. In this work, we introduce a 2-layer Convolutional Neural Network (CNN) approach to detect EMRI signals for space-borne detectors. Our method employs the Q-transform for data preprocessing, which preserves EMRI signal characteristics while reducing the data size. With the CNN, we can reliably distinguish EMRI signals from noise, particularly when the signal-to-noise ratio (SNR) reaches 50, a benchmark considered a “golden” EMRI. At the same time, we incorporate time-delay interferometry (TDI) to ensure practical utility. We assess our model’s performance using a 0.5-year dataset, achieving a true positive rate (TPR) of 94.2% at a 1% false positive rate (FPR) across SNRs from 50 to 100, and a TPR of 91% at a 1% FPR at an SNR of 50. This study underscores the promise of deep learning methods for EMRI data analysis, potentially leading to rapid EMRI signal detection.

I Introduction

Since the initial detection of GW150914 [1, 2, 3, 4], ground-based gravitational wave detectors such as LIGO [5] and Virgo [6] have made remarkable progress. These detectors have subsequently detected nearly one hundred similar merger events of stellar-mass binary black holes [7, 8, 9, 10]. These observations offer a fresh outlook on the origin and evolution of these objects [11, 12, 13], enabling researchers to test Einstein’s general theory of relativity [14, 10, 15] and to explore the universe through an independent approach [16].

Ground-based gravitational wave detectors such as LIGO, Virgo, and KAGRA [17, 18] focus on gravitational waves in a frequency range from 1 Hz to a few thousand Hz, as determined by their sensitivity curves [10, 19]. In contrast, space-based gravitational wave detection projects, such as LISA [20], ASTROD [21], DECIGO/BBO [22], ALIA [23], AGIS-LEO [24], Taiji [25, 26], and TianQin [27], are designed to detect low-frequency gravitational waves, covering the spectrum from 0.1 mHz to 1 Hz [20, 25, 19]. These missions aim to explore a wide range of phenomena related to black holes, compact stars, and the origins of the universe. Their targets include massive black hole binaries (MBHBs) [28], extreme-mass-ratio inspirals (EMRIs) [29, 30, 31, 32], binary white dwarfs (BWDs) [33], and a possible stochastic gravitational wave background (SGWB) [34, 35].

As early as 2008, China initiated the Taiji project, which consists of three spacecraft. The spacecraft follow heliocentric orbits and form a large equilateral triangle with sides of approximately three million kilometers [36]. The Taiji constellation is positioned at a distance of about 1 astronomical unit (1 AU) from the Sun. The center of mass of the constellation either lags or leads the Earth by approximately 18 to 20 degrees. The eccentricity of each satellite’s orbit is approximately $10^{-3}$ . The stable constellation configuration of Taiji allows it to detect gravitational waves within a sensitive frequency range from $10^{-4}$ to 0.1 Hz [37, 38].

One of Taiji’s objectives is the detection of extreme-mass-ratio inspiral systems [31, 30, 32, 38]. These systems consist of a stellar-mass black hole orbiting a much heavier black hole with a mass in the range $10^{4}-10^{7}M_{\odot}$ . These massive black holes (MBHs) are typically located at the centers of galaxies [39, 40, 41, 42, 29]. When compact objects, such as black holes, orbit the central black hole in EMRI systems [43], they release energy and generate gravitational waves (GWs) at low frequencies. The frequencies of these GWs fall within the sensitive band of space detectors [44, 45]. The exploration of EMRI systems presents opportunities for testing general relativity [29, 30, 31, 46]. The source parameters inferred from EMRIs can also be used to trace the evolution of central MBHs. Such studies contribute to our understanding of the distribution of MBH masses and their relationships with host galaxies [47]. EMRI systems also provide a precise means to map the gravitational field of the central black hole [48].

Analyzing EMRI signals poses a substantial challenge because of their long duration and intricate structure. These signals can last from several months to years, so generating waveforms for a comprehensive analysis demands significant computational resources [49, 50]. Traditional GW data analysis techniques, such as matched filtering, would require at least $10^{40}$ templates to process a single waveform [51]. Although the F-statistic algorithm [52, 53] can reduce the template requirements and improve search efficiency, a substantial amount of time is still required, and the method relies heavily on templates. EMRI signals are inherently complicated, with many harmonics and modulations. The influence of self-force effects further compounds the difficulty of building accurate models for these signals. Although recent research has made progress in addressing the challenges posed by the self-force [54, 55, 46], the lack of precise and quickly generated EMRI waveform templates adds a further layer of complexity to the data analysis process.

Deep learning techniques, particularly Convolutional Neural Networks (CNNs), which have achieved great success in computer vision, have found substantial use in GW astrophysics for detecting and characterizing GW signals from binary black hole (BBH) mergers [56, 57, 58, 59]. These investigations have highlighted the effectiveness of deep learning in detecting both simulated and real GW signals in detector noise. The application of deep learning has since been extended to other types of GW sources. In Ref. [60], the UNet architecture was employed to detect BBHs and binary neutron stars (BNSs) by analyzing Q-transformed representations of GW data. Furthermore, the matched-filtering CNN (MFCNN) was developed in Ref. [61] to identify the mergers of massive black hole binaries (MBHBs). This method combines the strengths of matched filtering and CNNs, and identifies GW candidates significantly more efficiently than matched filtering alone. Deep learning has also demonstrated its effectiveness in GW data analysis for other space-based GW sources, especially EMRIs [62, 63, 64].

In this work, we employ a straightforward 2-layer CNN designed to detect EMRI signals in time-frequency data. Our data preparation applies the Q-transform to convert the time series into time-frequency data before feeding it into the network. Previous studies [65, 66] have emphasized the utility of time-frequency algorithms in analyzing EMRI signals; they successfully detected and recovered signals in the LISA Data Challenge. Additionally, Ref. [60] has highlighted the effectiveness of time-frequency data in deep learning applications. To improve computational efficiency, we resize the time-frequency data to a smaller size. Our model processes 0.5 years of Taiji data within seconds. We generate the target signals with the AAK waveform model and simulate noise according to Taiji’s analytical Power Spectral Density (PSD). Additionally, we incorporate second-generation Time Delay Interferometry (TDI) [45] to improve the model’s applicability in practical scenarios. While our study primarily focuses on Taiji data, the model can be readily adapted to other space-based GW detectors such as LISA.

The structure of the paper is as follows. Sec. II introduces the characteristics of EMRIs and their waveform models, the orbit of the Taiji detector, the Power Spectral Density, and the calculation of TDI. In Sec. III, we describe how we prepare and preprocess our training data, present the CNN architecture, and report the outcomes of our testing. Lastly, in Sec. IV, we summarize our conclusions.

II EMRI Detection in space-based detectors

When we analyze data from the Taiji detector, it is crucial to model both the EMRI signals and the noise. In this section, we explain the fundamental concept of the simulation process, describe how Time Delay Interferometry (TDI) is used to process the Taiji detector data during simulations, and outline the approach we use to generate EMRI signals.

II.1 Equal arm analytic orbit

The Taiji detector is a space-based observatory that consists of three spacecraft arranged in a triangular configuration. It uses lasers to measure the distances between the spacecraft, allowing it to detect gravitational waves in a frequency range of $10^{-4}$ Hz to $10^{-1}$ Hz. In this work, we use a simplified Keplerian orbit model to approximate the trajectories of the Taiji spacecraft (SCs). These orbits are basic and do not capture the full complexity of the actual orbital dynamics. The orbital motion of each Taiji spacecraft (SCn, where n = 1, 2, 3) is given by:

		$\displaystyle x_{n}=a\cos\alpha+ae\left(\sin\alpha\cos\alpha\sin\beta_{n}-\left(1+\sin^{2}\alpha\right)\cos\beta_{n}\right)$		(1)
		$\displaystyle y_{n}=a\sin\alpha+ae\left(\sin\alpha\cos\alpha\cos\beta_{n}-\left(1+\cos^{2}\alpha\right)\sin\beta_{n}\right)$
		$\displaystyle z_{n}=-\sqrt{3}ae\cos\left(\alpha-\beta_{n}\right)$

These expressions follow Ref. [67]. The coordinates are defined in the solar system barycentric (SSB) frame. The angles $\beta_{n}$ are given by $\beta_{n}=(n-1)\frac{2\pi}{3}+\lambda$ , and $\alpha(t)$ is defined as $\alpha(t)=\frac{2\pi}{1\text{ year}}t+\kappa$ .

Initially, both $\lambda$ and $\kappa$ are set to zero. Here $a$ is 1 astronomical unit (AU), and the orbital eccentricity is $e=L/(2a\sqrt{3})$ , where $L$ is the distance between two spacecraft. The arm length designed for the Taiji mission is $3\times 10^{6}$ km [25].
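The orbit model of Eq. (1) can be evaluated directly. The following is a minimal sketch, assuming SI units, a Julian year for the orbital period, and $\lambda=\kappa=0$ by default; the function names and constants are illustrative and are not taken from the authors' code.

```python
import math

AU = 1.495978707e11      # astronomical unit in metres
ARM_LENGTH = 3.0e9       # nominal Taiji arm length L in metres (3 million km)
YEAR = 365.25 * 86400.0  # orbital period in seconds (assumption: Julian year)


def spacecraft_position(n, t, lam=0.0, kappa=0.0, a=AU, arm=ARM_LENGTH):
    """Return (x, y, z) in metres for spacecraft n (1, 2 or 3) at time t in seconds."""
    e = arm / (2.0 * a * math.sqrt(3.0))          # eccentricity e = L / (2 a sqrt(3))
    alpha = 2.0 * math.pi * t / YEAR + kappa      # orbital phase of the constellation
    beta = (n - 1) * 2.0 * math.pi / 3.0 + lam    # phase offset of spacecraft n
    sa, ca = math.sin(alpha), math.cos(alpha)
    sb, cb = math.sin(beta), math.cos(beta)
    x = a * ca + a * e * (sa * ca * sb - (1.0 + sa * sa) * cb)
    y = a * sa + a * e * (sa * ca * cb - (1.0 + ca * ca) * sb)
    z = -math.sqrt(3.0) * a * e * math.cos(alpha - beta)
    return x, y, z


def arm_length(i, j, t):
    """Distance in metres between spacecraft i and j at time t."""
    return math.dist(spacecraft_position(i, t), spacecraft_position(j, t))
```

In this first-order model the three arms keep the same length $L$ at all times, which is why it is referred to as an equal-arm analytic orbit.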

II.2 The PSD and noise of Taiji

In the practical context of Taiji detection, we must take into account various factors that contribute to noise. To streamline our simulation process, we adopt a simplified noise model consisting of two components. The first component, denoted as $P_{\mathrm{oms}}(f)$ , represents high-frequency noise originating from the optical metrology system. The second component, $P_{\mathrm{acc}}(f)$ , represents low-frequency noise resulting from the test mass’s acceleration. The noise is characterized through its power spectral density.

The PSD of the X channel for second-generation TDI can be mathematically expressed as follows [38]:

\mathrm{PSD}_{X_{2}}=64\sin^{2}(\omega L)\sin^{2}(2\omega L)\left(P_{\mathrm{oms}}+(3+\cos(2\omega L))P_{\mathrm{acc}}\right),

(2)

where $\omega=2\pi f/c$ . The functions $P_{\mathrm{oms}}(f)$ and $P_{\mathrm{acc}}(f)$ can be described as follows:

	$\displaystyle P_{\mathrm{oms}}(f)$	$\displaystyle=64\times 10^{-24}\frac{1}{\mathrm{Hz}}\left[1+\left(\frac{2\mathrm{mHz}}{f}\right)^{4}\right]\left(\frac{2\pi f}{c}\right)^{2},$		(3)
	$\displaystyle P_{\mathrm{acc}}(f)$	$\displaystyle=9\times 10^{-30}\frac{1}{\mathrm{Hz}}\left[1+\left(\frac{0.4\mathrm{mHz}}{f}\right)^{2}\right]\left[1+\left(\frac{f}{8\mathrm{mHz}}\right)^{4}\right]\left(\frac{1}{2\pi fc}\right)^{2}.$
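As an illustration, Eqs. (2) and (3) can be coded as plain functions of frequency. This is a sketch under stated assumptions: SI units throughout, $f$ in Hz, $L$ in metres, and function names chosen here for illustration.

```python
import math

C = 299792458.0      # speed of light in m/s
ARM_LENGTH = 3.0e9   # Taiji arm length L in metres (assumed nominal value)


def p_oms(f):
    """Optical metrology system noise of Eq. (3), f in Hz."""
    return 64e-24 * (1.0 + (2e-3 / f) ** 4) * (2.0 * math.pi * f / C) ** 2


def p_acc(f):
    """Test-mass acceleration noise of Eq. (3), f in Hz."""
    return (9e-30 * (1.0 + (0.4e-3 / f) ** 2) * (1.0 + (f / 8e-3) ** 4)
            * (1.0 / (2.0 * math.pi * f * C)) ** 2)


def psd_x2(f, arm=ARM_LENGTH):
    """PSD of the second-generation TDI X channel, Eq. (2)."""
    wl = 2.0 * math.pi * f * arm / C   # omega * L with omega = 2 pi f / c
    return (64.0 * math.sin(wl) ** 2 * math.sin(2.0 * wl) ** 2
            * (p_oms(f) + (3.0 + math.cos(2.0 * wl)) * p_acc(f)))
```

The factor $\sin^{2}(\omega L)$ vanishes at $f=c/(2L)=0.05$ Hz for a 3 million km arm, so the X-channel PSD has a null there.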

II.3 Time-delay interferometry

For space-based detectors, Time Delay Interferometry (TDI) is a vital technique for suppressing laser frequency noise and attaining the desired sensitivity. TDI’s fundamental principle is to time-shift and combine measurements so as to synthesize an interferometer with an equal-arm configuration. First-generation TDI combinations suppress laser frequency noise in static unequal-arm configurations, while second-generation TDI combinations extend this noise reduction to configurations with relative motion. In this paper, we exclusively employ the second-generation TDI configuration. The Michelson combination used in our paper is [68, 38]:

		$\displaystyle X_{2}(t)=y_{1^{\prime}}+y_{3,2^{\prime}}+y_{1,22^{\prime}}+y_{2^{\prime},322^{\prime}}+y_{1,3^{\prime}322^{\prime}}+y_{2^{\prime},33^{\prime}322^{\prime}}$		(4)
		$\displaystyle\quad+y_{1^{\prime},3^{\prime}33^{\prime}322^{\prime}}+y_{3,2^{\prime}3^{\prime}33^{\prime}322^{\prime}}-y_{1}-y_{2^{\prime},3}-y_{1^{\prime},3^{\prime}3}-y_{3,2^{\prime}3^{\prime}3}$
		$\displaystyle\quad-y_{1^{\prime},22^{\prime}3^{\prime}3}-y_{3,2^{\prime}22^{\prime}3^{\prime}3}-y_{1,22^{\prime}22^{\prime}3^{\prime}3}-y_{2^{\prime},322^{\prime}22^{\prime}3^{\prime}3}$

The variables Y and Z are obtained through cyclic permutation of the indices. An uncorrelated set of TDI variables, denoted as A, E, and T, can be derived from linear combinations of X, Y, and Z, as described by [69]:

		$\displaystyle A=\frac{1}{\sqrt{2}}(Z-X),\quad E=\frac{1}{\sqrt{6}}(X-2Y+Z),\quad T=\frac{1}{\sqrt{3}}(X+Y+Z).$		(5)

In this paper, all simulated data are processed through TDI, and we specifically use the A and E channel data for analysis. We use fastlisaresponse (https://github.com/mikekatz04/lisa-on-gpu/tree/master) to generate the post-TDI data in this work [70].
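The linear combinations in Eq. (5) are simple to apply sample by sample once the X, Y, and Z series are available. The sketch below assumes the three channels are equal-length Python sequences; the helper name `aet_from_xyz` is hypothetical and is not part of fastlisaresponse.

```python
import math


def aet_from_xyz(x, y, z):
    """Combine Michelson TDI channels X, Y, Z into A, E, T following Eq. (5)."""
    if not (len(x) == len(y) == len(z)):
        raise ValueError("X, Y and Z must have the same length")
    a = [(zi - xi) / math.sqrt(2.0) for xi, zi in zip(x, z)]
    e = [(xi - 2.0 * yi + zi) / math.sqrt(6.0) for xi, yi, zi in zip(x, y, z)]
    t = [(xi + yi + zi) / math.sqrt(3.0) for xi, yi, zi in zip(x, y, z)]
    return a, e, t
```

A contribution that is identical in X, Y, and Z cancels in A and E and appears only in T.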

II.4 EMRI waveform model

Figure 1: An EMRI signal with SNR = 50 and Taiji’s sensitivity curve after TDI, assuming an observation time of 0.5 years.

The Taiji gravitational wave detector possesses the capability to identify EMRI events involving a compact object (CO) and a massive black hole, covering a mass range from $10^{4}M_{\odot}$ to $10^{7}M_{\odot}$ . The probability of detection increases as the system achieves a signal-to-noise ratio exceeding 20. Before the detection of EMRI signals, it is essential to have theoretical waveform models for EMRI.

In the analysis of EMRI events, researchers require rapid and precise waveform models that can handle sources with a wide range of parameters. The AK model [71], while computationally efficient, simplifies the gravitational wave radiation and provides only an estimate of the strong-field dynamics, so it deviates from the genuine signal. To address this, Ref. [72] explored the AAK model, which extends fading effects until the frequency reaches a stable Kerr orbit [73]. The AAK waveform strikes a balance between accuracy and computational efficiency, enhancing our ability to model EMRI events effectively.

Assuming the spin of the compact object is negligible, an EMRI can be described by 14 parameters. Seven are the intrinsic parameters: $\left(\mu,M,a,e_{0},\iota_{0},\gamma_{0},\psi_{0}\right)$ . Here $M$ is the mass of the massive black hole, $\mu$ is the mass of the CO, $a$ is the spin parameter of the MBH, $e$ is the eccentricity, and $\iota$ is the orbit’s inclination angle from the equatorial plane. The angle $\gamma:=\tilde{\gamma}-\beta$ , where $\beta(\hat{\mathbf{R}},\hat{\mathbf{S}},\hat{\mathbf{L}})=\beta\left(\theta_{S},\phi_{S},\theta_{K},\phi_{K},\iota,\alpha\right)$ is an azimuthal angle in the orbital plane. The true anomaly $\psi$ is the angle that describes the object’s position on its elliptical orbit relative to the periapsis. The remaining seven are the extrinsic parameters: $\left(p_{0},\theta_{S},\phi_{S},\theta_{K},\phi_{K},\alpha_{0},D\right)$ . Here $p$ is the semi-latus rectum, $\theta_{S}$ and $\phi_{S}$ are the polar and azimuthal sky location angles, and $\theta_{K}$ and $\phi_{K}$ are the polar and azimuthal angles describing the orientation of the spin angular momentum vector of the MBH. The time dependence of the orbital orientation is confined to $\alpha(t)$ , and $D$ is the luminosity distance [72]. The EMRI signals in the training data for this study have all been generated using the AAK method through the EMRI_Kludge_Suite [72].

III SEARCH STRATEGY

III.1 Datasets

In order to employ a Convolutional Neural Network for detecting EMRI signals within the Taiji detector data, we start by dividing the dataset into two groups, each comprising half of the samples. One of these sets contains data encompassing both signals and noise ( $d=h+n$ ) and is assigned the label 1, whereas the other set exclusively contains noise ( $n$ ) and is assigned the label 0. Subsequently, the dataset is further divided into two sets: training data and testing data. The training data is employed for model training, while the testing data serves as an evaluation for the trained CNN.

We use the AAK model to create EMRI signals, with the parameter ranges specified in Table 1. The signals have a duration of 0.5 years and are sampled every 10 seconds ( $dt$ ). We then use the analytical Power Spectral Density to simulate noise, ensuring that it aligns with the generated signal, which yields both datasets $d$ and $n$ . The total number of samples generated for both $d$ and $n$ amounts to 6000.

Table 1: Parameter ranges of the AAK waveforms used for the training and testing data

Parameter	Range (Uniform distribution)
$M/M_{\odot}$	( $10^{5},10^{8}$ )
$\mu/M_{\odot}$	(10, 100) ; $M/\mu\geq 10^{4}$
$a$	( $10^{-3}$ , 0.90)
$e_{0}$	(0.005, 0.5)
$p_{0}/M$	(10, 12)
$\iota$	(0, 0.1)
$\gamma$	(0, 0.1)
$\theta_{S}$	(0, $\pi$ )
$\phi_{S}$	(0, $2\pi$ )
$\theta_{K}$	(0, $\pi$ )
$\phi_{K}$	(0, $2\pi$ )

Subsequently, we apply Time Delay Interferometry (TDI) to the dataset, obtaining the A, E, and T channel data. For training and testing, we select data from the A and E channels. The data size is (2, 1560000) for both $d$ and $n$ . After TDI, we compute the signal-to-noise ratio (SNR) values and keep them within the target range.

We define the characteristic strain $h_{\mathrm{c}}$ and noise amplitude $h_{n}$ as follows [74]:

		$\displaystyle\left[h_{\mathrm{c}}(f)\right]^{2}=4f^{2}\left|\tilde{h}(f)\right|^{2},$		(6)
		$\displaystyle\left[h_{n}(f)\right]^{2}=fS_{n}(f).$

This allows us to express the SNR as in Eq. (7), using $h_{\mathrm{c}}(f)$ and $h_{n}(f)$ . When these quantities are plotted on a logarithmic scale, the area between the source and detector curves represents the SNR, which indicates the detectability of the source. We consider an EMRI signal to be a “golden” EMRI when its SNR exceeds 50. Consequently, we adjust the distance to ensure that the SNR remains within the range of (50, 100). We use $\varrho$ to represent the SNR value in each channel:

\varrho^{2}=\int_{-\infty}^{\infty}\mathrm{d}(\log f)\left[\frac{h_{\mathrm{c}}(f)}{h_{n}(f)}\right]^{2}

(7)

When taking two channels into account, we get the target SNR using the following equation:

\mathrm{SNR}^{2}=(\varrho_{A})^{2}+(\varrho_{E})^{2}.

(8)
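The SNR definition in Eqs. (7) and (8) and the distance adjustment can be sketched numerically as follows. This is an illustrative sketch, not the authors' pipeline: it assumes that $h_{\mathrm{c}}$ and $h_{n}$ are already sampled on a grid of positive frequencies, that the logarithm is natural, that the integral is truncated to that grid, and that the strain amplitude scales as $1/D$. All function names are hypothetical.

```python
import math


def snr_squared(freqs, h_c, h_n):
    """Trapezoidal estimate of Eq. (7) over a finite grid of positive frequencies."""
    total = 0.0
    for i in range(len(freqs) - 1):
        dlogf = math.log(freqs[i + 1] / freqs[i])   # step in log-frequency
        left = (h_c[i] / h_n[i]) ** 2
        right = (h_c[i + 1] / h_n[i + 1]) ** 2
        total += 0.5 * (left + right) * dlogf
    return total


def combined_snr(rho_a, rho_e):
    """Total SNR from the A and E channels, Eq. (8)."""
    return math.hypot(rho_a, rho_e)


def rescaled_distance(distance, current_snr, target_snr):
    """Distance that gives target_snr, assuming the SNR is proportional to 1/D."""
    return distance * current_snr / target_snr
```

For example, a source with a combined SNR of 100 at distance $D$ would be moved to $2D$ to reach the lower bound of 50 under this scaling.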

III.2 The Q-transform

Time-frequency analysis has proven to be highly effective in EMRI data analysis. Ref. [52] shows a time-frequency plot of a typical EMRI signal, in which distinct frequency tracks clearly reveal the dominant harmonics. The power of the EMRI signal is strongly concentrated within these tracks. Therefore, we employ the Constant Q Transform (CQT) for data preprocessing in this deep learning project.

In signal processing, time-frequency analysis encompasses techniques that examine a signal in both the time and frequency domains simultaneously. The use of time-frequency analysis offers several key advantages. Firstly, it allows for the adjustment of the trade-off between time and frequency resolution, providing greater flexibility in resolution settings. Additionally, time-frequency representations can extract relevant features crucial for deep learning and classification tasks. In comparison to time series analysis, time-frequency analysis significantly enhances feature extraction capabilities, making it a valuable tool. Time-frequency analysis techniques encompass a variety of methods such as Short-Time Fourier Transform (STFT), Wavelet Transform, Continuous Wavelet Transform (CWT), and more.

We choose the CQT over the Short-Time Fourier Transform (STFT) for two reasons. First, the CQT offers adaptable time and frequency resolution, unlike the STFT, whose resolution is fixed by the window size and window overlap. Generally, the CQT provides finer frequency resolution at lower frequencies and finer time resolution at higher frequencies. Second, this flexible resolution allows the CQT to localize both time and frequency information accurately, which makes it well suited to visualizing and analyzing signals like EMRIs.

The CQT is a special case of the Variable Q transform (VQT). Both are related to the complex Morlet wavelet transform. The filter bandwidths are defined by

\delta f_{k}=2^{1/n}\cdot\delta f_{k-1}=\left(2^{1/n}\right)^{k}\cdot\delta f_{\min},

(9)

where $\delta f_{k}$ is the bandwidth of the k-th filter, $f_{min}$ is the central frequency of the lowest filter, and n is the number of filters per octave (https://en.wikipedia.org/wiki/Constant-Q_transform).

Define the quality factor Q for CQT:

Q=\frac{f_{k}}{\delta f_{k}}.

(10)

With this factor, we can define window length:

N[k]=\frac{f_{s}}{\delta f_{k}}=\frac{f_{s}}{f_{k}}Q.

(11)

The Hamming window is one of the most commonly used windows; we take it as an example:

W[k,n]=\alpha-(1-\alpha)\cos\frac{2\pi n}{N[k]-1},\quad\alpha=25/46,\quad 0\leqslant n\leqslant N[k]-1.

(12)

With the definitions above, the Q-transform is defined as:

X[k]=\frac{1}{N[k]}\sum_{n=0}^{N[k]-1}W[k,n]x[n]e^{\frac{-j2\pi Qn}{N[k]}}.

(13)
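To make Eqs. (10) to (13) concrete, the sketch below evaluates a single-frame constant-Q transform in pure Python. It is deliberately naive and is not the librosa implementation. Two assumptions are made here and are not stated in the text: the bin centre frequencies are geometrically spaced as $f_{k}=f_{\min}2^{k/n}$ , and the quality factor is $Q=1/(2^{1/n}-1)$ , which corresponds to a bandwidth equal to the spacing between neighbouring bins.

```python
import cmath
import math


def quality_factor(bins_per_octave):
    """Q = f_k / delta f_k, assuming delta f_k equals the spacing to the next bin."""
    return 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)


def hamming(n, length, alpha=25.0 / 46.0):
    """Hamming window of Eq. (12) for sample n of a window with `length` samples."""
    return alpha - (1.0 - alpha) * math.cos(2.0 * math.pi * n / (length - 1))


def naive_cqt(x, fs, f_min, n_bins, bins_per_octave):
    """Evaluate Eq. (13) for the first frame of x; returns one complex value per bin."""
    q = quality_factor(bins_per_octave)
    coeffs = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)   # centre frequency of bin k
        length = int(round(q * fs / f_k))            # window length N[k], Eq. (11)
        if length > len(x):
            raise ValueError("signal too short for the lowest-frequency window")
        acc = 0j
        for n in range(length):
            acc += hamming(n, length) * x[n] * cmath.exp(-2j * math.pi * q * n / length)
        coeffs.append(acc / length)
    return coeffs
```

Because the window length $N[k]$ shrinks as $f_{k}$ grows, low-frequency bins integrate over many samples (fine frequency resolution) while high-frequency bins use short windows (fine time resolution). A full time-frequency map repeats this evaluation on successive frames.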

We use librosa (https://librosa.org/doc/latest/index.html) to perform the transform. Fig. 3(a) shows the signal after the Q-transform; the EMRI signal is strong, with a high SNR of 100. Figure 3(b) shows the same signal embedded in the detector noise, where only a few of its features remain visible. Figure 4 shows the pure noise $n$ and the data $d$ , which includes both the signal and the noise, after the Q-transform. In Figure 4(b), where the signal-to-noise ratio is 50, the distinctive features of the EMRI are obscured by the noise, making it difficult to distinguish between Figure 4(a) and Figure 4(b) by visual inspection alone. A convolutional neural network (CNN), however, can make this distinction.

We store the Q-transform results as RGB images and resize them to (225, 225) matrices, matching the input shape listed in Table 2, before using them as input to the CNN.

III.3 Network

Table 2: Architecture of deep convolutional neural network

num	Layer (type)	Output Shape	Param
1	Conv2d (Kernel Size: 3x3 Stride: 1 Padding: 1 )	[16, 16, 225, 225]	448
2	BatchNorm2d	[16, 16, 225, 225]	32
3	GELU	[16, 16, 225, 225]	0
4	MaxPool2d (Kernel Size: 2x2 Stride: 2)	[16, 16, 112, 112]	0
5	Conv2d (Kernel Size: 3x3 Stride: 1 Padding: 1)	[16, 32, 112, 112]	4,640
6	BatchNorm2d	[16, 32, 112, 112]	64
7	GELU	[16, 32, 112, 112]	0
8	MaxPool2d (Kernel Size: 2x2 Stride: 2)	[16, 32, 56, 56]	0
9	Flatten	[16, 100352]	0
10	Linear	[16, 128]	12,845,184
11	BatchNorm1d	[16, 128]	256
12	GELU	[16, 128]	0
13	Dropout	[16, 128]	0
14	Linear	[16, 64]	8,256
15	Dropout	[16, 64]	0
16	BatchNorm1d	[16, 64]	128
17	GELU	[16, 64]	0
18	Linear	[16, 2]	130

This study adopts a basic CNN architecture with two Conv2d layers of 16 and 32 filters, followed by two fully connected layers with 128 and 64 units and a final two-class output layer. GELU activation functions are used as non-linearities between layers, as listed in Table 2, and both convolutional layers use $3\times 3$ kernels. Each convolutional block ends with a max-pooling layer. We use a stride of 1 for the convolution layers and a stride of 2 for the pooling layers. Specific details about the network can be found in Table 2.
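The output shapes and parameter counts in Table 2 can be cross-checked with a few lines of arithmetic. The sketch below assumes a 3-channel (RGB) input of 225 by 225 pixels, which is what the 448 parameters of the first Conv2d layer and the listed output shapes imply; the helper names are illustrative.

```python
def conv_out(size, kernel, stride, padding):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel, stride):
    """Spatial output size of max pooling without padding."""
    return (size - kernel) // stride + 1


def conv2d_params(c_in, c_out, kernel):
    """Weights plus one bias per output channel."""
    return (c_in * kernel * kernel + 1) * c_out


def linear_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return (n_in + 1) * n_out


def batchnorm_params(channels):
    """One learnable scale and one shift per channel."""
    return 2 * channels


def table2_summary(image_size=225, in_channels=3):
    """Return the flattened feature size and per-layer parameter counts."""
    size = pool_out(conv_out(image_size, 3, 1, 1), 2, 2)   # after first conv block
    size = pool_out(conv_out(size, 3, 1, 1), 2, 2)         # after second conv block
    flat = 32 * size * size
    params = [
        conv2d_params(in_channels, 16, 3), batchnorm_params(16),
        conv2d_params(16, 32, 3), batchnorm_params(32),
        linear_params(flat, 128), batchnorm_params(128),
        linear_params(128, 64), batchnorm_params(64),
        linear_params(64, 2),
    ]
    return flat, params
```

Almost all trainable parameters sit in the first fully connected layer, which maps the flattened feature map to 128 units.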

In our approach, we utilize the cross-entropy loss function, which is a widely adopted choice in classification tasks due to its effectiveness in speeding up the convergence of deep learning models. This loss function quantifies the difference between predicted probabilities and actual labels, encouraging the model to improve its predictive accuracy. The mathematical expression for this loss function is as follows:

L=\sum_{j=1}^{N}\left(-y_{j_{\text{true}}}\cdot\log(y_{j_{\text{pred}}})-(1-y_{j_{\text{true}}})\cdot\log(1-y_{j_{\text{pred}}})\right)

(14)

Specifically, for our binary classification task, which involves distinguishing EMRI signals from noise, the cross-entropy loss measures the discrepancy between the predicted probabilities of EMRI and the actual EMRI labels in the training data. By minimizing this loss, our model aims to align its predictions with the true EMRI labels, ultimately enhancing its ability to classify EMRI signals accurately and reliably.
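A direct transcription of Eq. (14) is shown below as a minimal sketch. The clipping constant `eps` is an assumption added here to avoid taking the logarithm of zero. Since the network in Table 2 has two output units, the training code may instead apply a softmax cross-entropy over two classes; for two classes the two forms are equivalent.

```python
import math


def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Summed binary cross-entropy of Eq. (14).

    y_true holds labels (1 for signal plus noise, 0 for noise only) and
    y_pred holds the predicted probability of the signal class.
    """
    total = 0.0
    for t, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)   # keep log() finite
        total += -t * math.log(p) - (1.0 - t) * math.log(1.0 - p)
    return total
```

Confident correct predictions contribute almost nothing to the sum, while confident wrong predictions are penalized heavily.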

Additionally, we use the Adam optimizer with a learning rate of 0.01 to update the model’s weights during training. Over the course of 200 training epochs, we monitor both the loss and the accuracy, and after convergence we save the model with the lowest loss as our best-performing model. We then evaluate the saved model on the test data.

III.4 Result

In this section, we use the receiver operating characteristic (ROC) curve to evaluate how well our CNN model finds signals in the testing data. The ROC curve is a graphical representation of the model’s performance in terms of two quantities: the true positive rate, which indicates how frequently the model correctly identifies a signal, and the false positive rate, which is the fraction of noise-only samples in which the model incorrectly reports a signal. To provide an overall assessment of our model’s capabilities, we use the Area Under the ROC Curve (AUC). A higher AUC value, closer to 1, signifies better performance.

Figure 5 shows two ROC curves, one in panel (a) and the other in panel (b), corresponding to different SNR values in the test data. In (a), the SNR values range from 50 to 100, while in (b) the SNR is fixed at 50. The AUC values shown in the figures give a numerical measure of how accurately the model identifies EMRI signals. To make the detection performance at low false positive rates easier to see, an inset in Fig. 5 zooms in on the FPR range from 0 to 0.01. This shows how well our CNN performs in detecting EMRI signals across different SNR values.
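The operating point quoted in this paper, a TPR at a 1% FPR, can be read off from classifier scores as sketched below. The function name and the thresholding convention (a sample counts as a detection when its score is strictly above the threshold) are assumptions made for illustration.

```python
import math


def tpr_at_fpr(noise_scores, signal_scores, max_fpr=0.01):
    """Return (tpr, fpr, threshold) for the loosest threshold with fpr <= max_fpr.

    noise_scores are classifier outputs on noise-only samples and
    signal_scores are outputs on samples that contain a signal.
    """
    ranked = sorted(noise_scores, reverse=True)
    # Number of noise samples that may lie above the threshold.
    allowed = int(math.floor(max_fpr * len(ranked) + 1e-9))
    threshold = ranked[allowed] if allowed < len(ranked) else float("-inf")
    detected = sum(1 for s in signal_scores if s > threshold)
    false_alarms = sum(1 for s in noise_scores if s > threshold)
    return detected / len(signal_scores), false_alarms / len(noise_scores), threshold
```

Sweeping `max_fpr` from 0 to 1 and plotting the resulting pairs traces out the ROC curve.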

IV Conclusion

This study aims to address one of the most challenging issues in space-based gravitational wave detection: the identification of EMRIs. These signals are known for their complex and lengthy waveforms, as well as their inherently weak nature, making their detection quite challenging. To overcome this challenge, we present an innovative approach that combines the Q-transform for data preprocessing and a CNN network. This method not only preserves the crucial EMRI signal characteristics but also enhances efficiency, making it suitable for space-based detectors.

Our findings demonstrate the robustness of the CNN-based model, particularly in the detection of EMRI signals with an SNR of 50, often referred to as “golden” EMRIs. Incorporating time-delay interferometry into our approach ensures its practical utility. The model achieves a true positive rate of 94.2% at a 1% false positive rate across SNRs from 50 to 100. Even at an SNR of 50, the TPR remains high at 91% at an FPR of 1%. This research opens new avenues for advancing EMRI data analysis through the integration of deep learning techniques with time-frequency data, and holds promise for enhancing the capabilities of space-based gravitational wave detectors.

The successful implementation of this CNN-based approach opens the door to further improvements. Subsequent research could explore lower SNR thresholds, possibly as low as SNR 20, which would require larger networks capable of detecting weaker signals. Additionally, integrating more advanced simulations, such as self-force EMRI waveforms or more complex noise models, may demand increased computational resources, and traditional GW analysis methods may not suffice for such data. The use of machine learning to find EMRI signals is still at an early stage, and much remains unexplored; further work could lead to significant advances in the detection of space-based gravitational wave sources. In particular, if machine learning can set a preliminary range for the source parameters, it would be very valuable for the subsequent parameter estimation.

V Acknowledgements

This work was supported by the National Key R&D Program of China (Grant No. 2021YFC2203002) and the National Natural Science Foundation of China (Grant No. 12173071). Wen-Biao Han was supported by the CAS Project for Young Scientists in Basic Research (Grant No. YSBR-006). This work made use of the High Performance Computing Resource in the Core Facility for Advanced Research Computing at Shanghai Astronomical Observatory.

References

[1] B. P. Abbott et al. Binary Black Hole Mergers in the first Advanced LIGO Observing Run. Phys. Rev. X, 6(4):041015, 2016. [Erratum: Phys.Rev.X 8, 039903 (2018)].
[2] J. Aasi et al. Advanced LIGO. Class. Quant. Grav., 32:074001, 2015.
[3] B. P. Abbott et al. Observation of Gravitational Waves from a Binary Black Hole Merger. Phys. Rev. Lett., 116(6):061102, 2016.
[4] B. P. Abbott et al. Search for gravitational waves from a long-lived remnant of the binary neutron star merger GW170817. Astrophys. J., 875(2):160, 2019.
[5] Gregory M. Harry. Advanced LIGO: The next generation of gravitational wave detectors. Class. Quant. Grav., 27:084006, 2010.
[6] F. Acernese et al. Advanced Virgo: a second-generation interferometric gravitational wave detector. Class. Quant. Grav., 32(2):024001, 2015.
[7] B. P. Abbott et al. GWTC-1: A Gravitational-Wave Transient Catalog of Compact Binary Mergers Observed by LIGO and Virgo during the First and Second Observing Runs. Phys. Rev. X, 9(3):031040, 2019.
[8] R. Abbott et al. GWTC-2: Compact Binary Coalescences Observed by LIGO and Virgo During the First Half of the Third Observing Run. Phys. Rev. X, 11:021053, 2021.
[9] R. Abbott et al. GWTC-2.1: Deep Extended Catalog of Compact Binary Coalescences Observed by LIGO and Virgo During the First Half of the Third Observing Run. 8 2021.
[10] R. Abbott et al. GWTC-3: Compact Binary Coalescences Observed by LIGO and Virgo During the Second Part of the Third Observing Run. 11 2021.
[11] R. Abbott et al. Population Properties of Compact Objects from the Second LIGO-Virgo Gravitational-Wave Transient Catalog. Astrophys. J. Lett., 913(1):L7, 2021.
[12] R. Abbott et al. Population of Merging Compact Binaries Inferred Using Gravitational Waves through GWTC-3. Phys. Rev. X, 13(1):011048, 2023.
[13] K. Belczynski et al. Evolutionary roads leading to low effective spins, high black hole masses, and O1/O2 rates for LIGO/Virgo binary black holes. Astron. Astrophys., 636:A104, 2020.
[14] B. P. Abbott et al. Tests of General Relativity with the Binary Black Hole Signals from the LIGO-Virgo Catalog GWTC-1. Phys. Rev. D, 100(10):104036, 2019.
[15] R. Abbott et al. Tests of General Relativity with GWTC-3. 12 2021.
[16] R. Abbott et al. Constraints on the Cosmic Expansion History from GWTC–3. Astrophys. J., 949(2):76, 2023.
[17] Yoichi Aso, Yuta Michimura, Kentaro Somiya, Masaki Ando, Osamu Miyakawa, Takanori Sekiguchi, Daisuke Tatsumi, and Hiroaki Yamamoto. Interferometer design of the KAGRA gravitational wave detector. Phys. Rev. D, 88(4):043007, 2013.
[18] B. P. Abbott et al. Prospects for observing and localizing gravitational-wave transients with Advanced LIGO, Advanced Virgo and KAGRA. Living Rev. Rel., 21(1):3, 2018.
[19] Andrew R. Kaiser and Sean T. McWilliams. Sensitivity of present and future detectors across the black-hole binary gravitational wave spectrum. Class. Quant. Grav., 38(5):055009, 2021.
[20] Pau Amaro-Seoane et al. Laser Interferometer Space Antenna. 2 2017.
[21] Wei-Tou Ni. ASTROD-GW: Overview and Progress. Int. J. Mod. Phys. D, 22:1341004, 2013.
[22] Seiji Kawamura et al. The Japanese space gravitational wave antenna: DECIGO. Class. Quant. Grav., 28:094011, 2011.
[23] Jeff Crowder and Neil J. Cornish. Beyond LISA: Exploring future gravitational wave missions. Phys. Rev. D, 72:083005, 2005.
[24] Jason M. Hogan et al. An Atomic Gravitational Wave Interferometric Sensor in Low Earth Orbit (AGIS-LEO). Gen. Rel. Grav., 43:1953–2009, 2011.
[25] Wen-Rui Hu and Yue-Liang Wu. The Taiji Program in Space for gravitational wave physics and the nature of gravity. Natl. Sci. Rev., 4(5):685–686, 2017.
[26] Wen-Hong Ruan, Zong-Kuan Guo, Rong-Gen Cai, and Yuan-Zhong Zhang. Taiji program: Gravitational-wave sources. Int. J. Mod. Phys. A, 35(17):2050075, 2020.
[27] Jun Luo et al. TianQin: a space-borne gravitational wave detector. Class. Quant. Grav., 33(3):035010, 2016.
[28] Antoine Klein et al. Science with the space-based interferometer eLISA: Supermassive black hole binaries. Phys. Rev. D, 93(2):024003, 2016.
[29] Stanislav Babak, Jonathan Gair, Alberto Sesana, Enrico Barausse, Carlos F. Sopuerta, Christopher P. L. Berry, Emanuele Berti, Pau Amaro-Seoane, Antoine Petiteau, and Antoine Klein. Science with the space-based interferometer LISA. V: Extreme mass-ratio inspirals. Phys. Rev. D, 95(10):103012, 2017.
[30] Wen-Biao Han and Xian Chen. Testing general relativity using binary extreme-mass-ratio inspirals. Mon. Not. Roy. Astron. Soc., 485(1):L29–L33, 2019.
[31] Shuo Xin, Wen-Biao Han, and Shu-Cheng Yang. Gravitational waves from extreme-mass-ratio inspirals using general parametrized metrics. Phys. Rev. D, 100(8):084055, 2019.
[32] Shucheng Yang, Shuo Xin, Chen Zhang, and Wenbiao Han. Testing Gravity Theory With Extreme Mass-Ratio Inspirals: Recent Progress. MDPI Proc., 17(1):11, 2019.
[33] Astrid Lamberts, Sarah Blunt, Tyson B. Littenberg, Shea Garrison-Kimmel, Thomas Kupfer, and Robyn E. Sanderson. Predicting the LISA white dwarf binary population in the Milky Way with cosmological simulations. Mon. Not. Roy. Astron. Soc., 490(4):5888–5903, 2019.
[34] Chiara Caprini et al. Science with the space-based interferometer eLISA. II: Gravitational waves from cosmological phase transitions. JCAP, 04:001, 2016.
[35] Nicola Bartolo et al. Science with the space-based interferometer LISA. IV: Probing inflation with gravitational waves. JCAP, 12:026, 2016.
[36] Xue-Fei Gong et al. A scientific case study of an advanced LISA mission. Class. Quant. Grav., 28:094012, 2011.
[37] Ziren Luo, Yan Wang, Yueliang Wu, Wenrui Hu, and Gang Jin. The taiji program: A concise overview. Progress of Theoretical and Experimental Physics, 2021(5):05A108, 2021.
[38] Zhixiang Ren, Tianyu Zhao, Zhoujian Cao, Zong-Kuan Guo, Wen-Biao Han, Hong-Bo Jin, and Yue-Liang Wu. Taiji data challenge for exploring gravitational wave universe. Front. Phys. (Beijing), 18(6):64302, 2023.
[39] Donald Lynden-Bell and Martin J Rees. On quasars, dust and the galactic centre. Monthly Notices of the Royal Astronomical Society, 152(4):461–475, 1971.
[40] Andrzej Soltan. Masses of quasars. Monthly Notices of the Royal Astronomical Society, 200(1):115–122, 1982.
[41] John Kormendy and Douglas Richstone. Inward bound: The Search for supermassive black holes in galactic nuclei. Ann. Rev. Astron. Astrophys., 33:581, 1995.
[42] Reinhard Genzel, Frank Eisenhauer, and Stefan Gillessen. The Galactic Center Massive Black Hole and Nuclear Star Cluster. Rev. Mod. Phys., 82:3121–3195, 2010.
[43] Marta Volonteri and Priyamvada Natarajan. Journey to the $M_{\rm BH}-\sigma$ relation: the fate of low mass black holes in the Universe. Mon. Not. Roy. Astron. Soc., 400:1911, 2009.
[44] Pau Amaro-Seoane et al. eLISA/NGO: Astrophysics and cosmology in the gravitational-wave millihertz regime. GW Notes, 6:4–110, 2013.
[45] Pau Amaro-Seoane et al. Low-frequency gravitational-wave science with eLISA/NGO. Class. Quant. Grav., 29:124016, 2012.
[46] Ping Shen, Wen-Biao Han, Chen Zhang, Shu-Cheng Yang, Xing-Yu Zhong, and Ye Jiang. The influence of mass-ratio in extreme-mass-ratio inspirals for testing general relativity. 3 2023.
[47] Kyriakos Destounis, Arun Kulathingal, Kostas D. Kokkotas, and Georgios O. Papadopoulos. Gravitational-wave imprints of compact and galactic-scale environments in extreme-mass-ratio binaries. Phys. Rev. D, 107(8):084027, 2023.
[48] Andrea Maselli, Nicola Franchini, Leonardo Gualtieri, Thomas P. Sotiriou, Susanna Barsanti, and Paolo Pani. Detecting fundamental fields with LISA observations of gravitational waves from extreme mass-ratio inspirals. Nature Astron., 6(4):464–470, 2022.
[49] Neil J. Cornish. Detection Strategies for Extreme Mass Ratio Inspirals. Class. Quant. Grav., 28:094016, 2011.
[50] Stanislav Babak, Jonathan R. Gair, and Edward K. Porter. An Algorithm for detection of extreme mass ratio inspirals in LISA data. Class. Quant. Grav., 26:135004, 2009.
[51] Jonathan R. Gair, Leor Barack, Teviet Creighton, Curt Cutler, Shane L. Larson, E. Sterl Phinney, and Michele Vallisneri. Event rate estimates for LISA extreme mass ratio capture sources. Class. Quant. Grav., 21:S1595–S1606, 2004.
[52] Yan Wang, Yu Shang, Stanislav Babak, Yu Shang, and Stanislav Babak. EMRI data analysis with a phenomenological waveform. Phys. Rev. D, 86:104050, 2012.
[53] Yan Wang, Gerhard Heinzel, and Karsten Danzmann. First stage of LISA data processing II: Alternative filtering dynamic models for LISA. Phys. Rev. D, 92(4):044037, 2015.
[54] Philip Lynch, Maarten van de Meent, and Niels Warburton. Eccentric self-forced inspirals into a rotating black hole. Class. Quant. Grav., 39(14):145004, 2022.
[55] Soichiro Isoyama, Ryuichi Fujita, Alvin J. K. Chua, Hiroyuki Nakano, Adam Pound, and Norichika Sago. Adiabatic Waveforms from Extreme-Mass-Ratio Inspirals: An Analytical Approach. Phys. Rev. Lett., 128(23):231101, 2022.
[56] Daniel George and E. A. Huerta. Deep Learning for Real-time Gravitational Wave Detection and Parameter Estimation: Results with Advanced LIGO Data. Phys. Lett. B, 778:64–70, 2018.
[57] Daniel George and E. A. Huerta. Deep Neural Networks to Enable Real-time Multimessenger Astrophysics. Phys. Rev. D, 97(4):044039, 2018.
[58] Wei Wei and E. A. Huerta. Gravitational Wave Denoising of Binary Black Hole Mergers with Deep Learning. Phys. Lett. B, 800:135081, 2020.
[59] He Wang, Shichao Wu, Zhoujian Cao, Xiaolin Liu, and Jian-Yang Zhu. Gravitational-wave signal recognition of LIGO data by deep learning. Phys. Rev. D, 101(10):104003, 2020.
[60] Shang-Jie Jin, Yu-Xin Wang, Tian-Yang Sun, Jing-Fei Zhang, and Xin Zhang. Rapid identification of time-frequency domain gravitational wave signals from binary black holes using deep learning. 5 2023.
[61] Wen-Hong Ruan, He Wang, Chang Liu, and Zong-Kuan Guo. Rapid search for massive black hole binary coalescences using deep learning. Phys. Lett. B, 841:137904, 2023.
[62] Xue-Ting Zhang, Chris Messenger, Natalia Korsakova, Man Leong Chan, Yi-Ming Hu, and Jing-dong Zhang. Detecting gravitational waves from extreme mass ratio inspirals using convolutional neural networks. Phys. Rev. D, 105(12):123027, 2022.
[63] Tianyu Zhao, Ruoxi Lyu, He Wang, Zhoujian Cao, and Zhixiang Ren. Space-based gravitational wave signal detection and extraction with deep neural network. Commun. Phys., 6(1):212, 2023.
[64] Tianyu Zhao, Yue Zhou, Ruijun Shi, Zhoujian Cao, and Zhixiang Ren. DECODE: DilatEd COnvolutional neural network for Detecting Extreme-mass-ratio inspirals. 8 2023.
[65] Jonathan R. Gair, Ilya Mandel, and Linqing Wen. Time-frequency analysis of extreme-mass-ratio inspiral signals in mock LISA data. J. Phys. Conf. Ser., 122:012037, 2008.
[66] Jonathan R. Gair, Ilya Mandel, and Linqing Wen. Improved time-frequency analysis of extreme-mass-ratio inspiral signals in mock LISA data. Class. Quant. Grav., 25:184031, 2008.
[67] S Babak and A Petiteau. Lisa data challenge manual. Technical report, Tech. Rep. LISA-LCST-SGS-MAN-002, APC Paris, 2020.
[68] Massimo Tinto, Frank B. Estabrook, and J. W. Armstrong. Time delay interferometry with moving spacecraft arrays. Phys. Rev. D, 69:082001, 2004.
[69] Michele Vallisneri. Synthetic LISA: Simulating time delay interferometry in a model LISA. Phys. Rev. D, 71:022001, 2005.
[70] Michael L. Katz, Jean-Baptiste Bayle, Alvin J. K. Chua, and Michele Vallisneri. Assessing the data-analysis impact of LISA orbit approximations using a GPU-accelerated response model. Phys. Rev. D, 106(10):103001, 2022.
[71] Leor Barack and Curt Cutler. LISA capture sources: Approximate waveforms, signal-to-noise ratios, and parameter estimation accuracy. Phys. Rev. D, 69:082005, 2004.
[72] Alvin J. K. Chua, Christopher J. Moore, and Jonathan R. Gair. Augmented kludge waveforms for detecting extreme-mass-ratio inspirals. Phys. Rev. D, 96(4):044005, 2017.
[73] Stanislav Babak, Hua Fang, Jonathan R. Gair, Kostas Glampedakis, and Scott A. Hughes. ’Kludge’ gravitational waveforms for a test-body orbiting a Kerr black hole. Phys. Rev. D, 75:024005, 2007. [Erratum: Phys.Rev.D 77, 04990 (2008)].
[74] C. J. Moore, R. H. Cole, and C. P. L. Berry. Gravitational-wave sensitivity curves. Class. Quant. Grav., 32(1):015014, 2015.