This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Observation Site Selection for Physical Model Parameter Estimation toward Process-Driven Seismic Wavefield Reconstruction

K. Nakai111Corresponding to [email protected] Graduate School of Engineering, Tohoku University National Institute of Advanced Industrial Science and Technology T. Nagata Graduate School of Engineering, Tohoku University K. Yamada Graduate School of Engineering, Tohoku University Y. Saito Graduate School of Engineering, Tohoku University T. Nonomura Graduate School of Engineering, Tohoku University M. Kano Graduate School of Science, Tohoku University S. Ito Earthquake Research Institute, The University of Tokyo H. Nagao Earthquake Research Institute, The University of Tokyo
Abstract

The “big” seismic data not only acquired by seismometers but also acquired by vibrometers installed in buildings and infrastructure and accelerometers installed in smartphones will be certainly utilized for seismic research in the near future. Since it is impractical to utilize all the seismic big data in terms of the computational cost, methods which can select observation sites depending on the purpose are indispensable. We propose an observation site selection method for the accurate reconstruction of the seismic wavefield by process-driven approaches. The proposed method selects observation sites suitable for accurately estimating physical model parameters such as subsurface structures and source information to be input into a numerical simulation of the seismic wavefield. The seismic wavefield is reconstructed by the numerical simulation using the parameters estimated based on the observed signals at only observation sites selected by the proposed method. The observation site selection in the proposed method is based on the sensitivity of each observation site candidate to the physical model parameters; the matrix corresponding to the sensitivity is constructed by approximately calculating the derivatives based on the simulations, and then, observation sites are selected by evaluating the quantity of the sensitivity matrix based on the D-optimality criterion proposed in the optimal design of experiments. In the present study, physical knowledge on the sensitivity to the parameters such as seismic velocity, layer thickness, and hypocenter location was obtained by investigating the characteristics of the sensitivity matrix. Furthermore, the effectiveness of the proposed method was shown by verifying the accuracy of seismic wavefield reconstruction using the observation sites selected by the proposed method.

000This paper has been accepted for publication in the Geophysical Journal International ©2023. This manuscript version is made available under the CC-BY-NC-ND 4.0 license (https://creativecommons.org/licenses/by-nc-nd/4.0/)

Index terms— Earthquake ground motions; Site effects; Wave propagation; Inverse theory

1 Introduction

Estimation of ground motion due to an earthquake is one of the indispensable technologies for the prevention and mitigation of disasters. Accurate and rapid estimation of ground motion facilitates the evaluation of seismic damage to infrastructure such as tall buildings, water and gas pipelines, and power plants. It allows us to conduct effective post-disaster measures and rescues when an earthquake occurs. Damage to the infrastructure is sometimes evaluated by analyzing the seismic response of structures due to the ground motion. Detecting the damage to individual structures by installing sensors on each structure in a city is an ideal system; however, it is not realistic because of the expensive implementation. Therefore, a method which estimates the seismic damage to infrastructure using observations of ground motions is needed, where the observations are obtained at observation sites located every 10310410^{3}-10^{4} (m) intervals. Hence, our problem can be formulated as seismic wavefield estimation and reconstruction, that recovers the seismic wavefields utilizing seismograms recorded by an array of seismometers distributed more sparsely than the structures.

A lot of estimation and reconstruction methods have been proposed by data-driven and process-driven approaches. In the data-driven approach, ground motions are traditionally calculated by spatially interpolating ground motions recorded by seismometers (Vanmarcke & Fenton, 1991; Kameda & Morikawa, 1994, 1992; Sato & Imabayashi, 1999; Kawakami, 1989). The interpolation methods utilize nonstochastic conditional simulations under the constraint of given deterministic motions and coherency (Kawakami, 1989), closed-form solutions of conditional probability functions for Fourier coefficients (Kameda & Morikawa, 1992), and regressive (Kriging) models (Vanmarcke & Fenton, 1991), that have been empirically derived based on the relation between seismograms recorded by seismometers and characteristics physical quantities. More recently, the seismic-wave gradiometry (SWG) method has been proposed to reduce the constraint required in the reconstruction (e.g., Langston, 2007a, b; Liang & Langston, 2009; Maeda et al., 2016; Shiina et al., 2021). In the SWG method, the amplitudes of seismic waves and their spatial gradients at an arbitrary point can be interpolated from the observed amplitudes at surrounding stations without making assumptions concerning velocity structures and locations of earthquake epicenters. Furthermore, the reconstruction methods utilizing the idea of compressed sensing have been proposed: a method using the sparsity of dominating energies of phase arrivals in the frequency-wavenumber domain (Schneider et al., 2018), that using a sparse representation of the surface wavefield using a plane-wave basis (Zhan et al., 2018), and a split processing scheme based on a wavelet transform in time and pre-conditioned curvelet-based compressed sensing in space (Muir & Zhan, 2021). The software has been developed for rapid estimation of the seismic damage in the data-driven approach; the ShakeMaps developed by the United States Geological Survey are based on relationships between recorded ground-motion parameters and expected shaking intensities (e.g., Wald et al., 1999, 2005). However, the data-driven approaches often suffer from a lack of observation sites and seismograms database.

While the data-driven approach is based on observations and empirical equations, the wave equations are numerically solved using physical models in the process-driven approach (e.g., Boore, 1972; Aoi & Fujiwara, 1999; Pitarka, 1999; Hisada & Bielak, 2003; Koketsu et al., 2004; Ichimura et al., 2007). The numerical simulations that consider the physics of wave propagation are employed by modelling source mechanisms and subsurface structures. The seismic wavefield and responses estimated by the process-driven approach would be the most reliable if the assumed models were accurate. A quick disaster estimation system for the prediction of seismic hazards is developed in the process-driven approach; the system which combines measured ground motion and simulations has been developed by Fujita et al. (2014). However, the process-driven approach often involves a high computational cost and requires a high-fidelity subsurface structure model, which is difficult to estimate from observations at the surface alone.

On the other hand, a combination of process-driven and data-driven approaches has been proposed. Kano et al. (2017a, b) proposed a seismic wavefield reconstruction framework by combining seismograms obtained by an array of seismometers and process-based simulations that solve the wave equation. It estimates the physical model parameters related to the subsurface structure and source information by utilizing the replica-exchange Monte Carlo method (REMC) (Swendsen & Wang, 1986; Geyer, 1991; Hukushima & Nemoto, 1996; Earl & Deem, 2005). The wavefield is reconstructed using the estimated parameters that quantitatively explain the observations. In addition, for such inversion problems, which involve finding the values of model parameters that minimise a misfit function between the observed data and the data simulated with the model parameters, Arnold & Curtis (2018) introduced the interrogation theory. The interrogation theory combines inverse theory, decision theory, and the theory of experimental design to optimize scientific investigations so as to find information that best answers scientific questions of interest. Zhang & Curtis (2021); Zhao et al. (2022) applied the interrogation theory to seismic tomography to estimate the shape, area, or volume of a subsurface structure and demonstrated the effectiveness of the theory. However, in order to use the combination method for real-time reconstruction and quick disaster estimation, it is necessary to reduce the computational cost, since the estimation of model parameters is time-consuming.

Optimisation of observation sites is one of the key technologies for the development of seismology. In Japan, more than 2,000 seismometers have been in operation for more than 20 years. In hazardous regions for severe seismic damage by a large earthquake, a dense seismic network is often established (e.g. Metropolitan Seismic Observation network (MeSO-net) in Tokyo metropolitan area of Japan (Hirata, 2009; Sakai & Hirata, 2009)). The seismic data not only acquired by seismometers but also acquired by vibrometers installed in buildings and infrastructure and accelerometers built into smartphones will be certainly utilized for seismic research in the near future. In that case, the amount of available data will increase by orders of magnitude compared to the current amount of data. It is impractical to use all the seismic big data in terms of the computational cost. Therefore, prior to the advent of the seismic big data era, it is indispensable to develop an observation site selection method which selects data to be employed depending on the purpose. Toward the seismic wave field reconstruction, the methods of observation site selection can propose the subsets of observation sites suitable to estimate the seismic wavefield with sufficient accuracy for damage prediction. Furthermore, the reconstruction accuracy of the seismic wavefield might be improved by the observation site selection which considers the characteristics of the signal-to-noise ratio of each seismometer. In addition, when seismometers are newly installed, the observation site selection method can indicate the configuration of seismometers that contributes most to the improvement of accuracy of the seismic wavefield reconstruction. Previous studies have attempted to optimise the design of the seismic network configuration. Hardt & Scherbaum (1994) developed a method to design optimum networks for aftershock recordings based on simulated annealing (Kirkpatrick et al., 1983). This method has been extended in Kraft et al. (2013). Steinberg & Rabinowitz (2003) applied the statistical theory of optimal design of experiments (Atkinson et al., 2007) to derive network configurations that maximize the precision of earthquake source localisation. More recently, Muir & Zhan (2022) has investigated an optimal design of mixed distributed acoustic sensing (DAS), a combined network of DAS and point sensors, based on the optimal design of experiments. Furthermore, for the seismic source inversion, Long et al. (2015) proposed a technique based on the optimal design of experiments in a Bayesian framework (Chaloner & Verdinelli, 1995; Stuart, 2010). They proposed the expected information gain obtained from the data recorded by an array of receivers (seismographs), which corresponds to the optimality criterion in the optimal design of experiments, and a fast estimator of the expected information gain by employing the Laplace approximation. In the numerical experiments assuming equally spaced receivers, the distance between the receivers to maximise the gain has been investigated.

This type of of situation, where the quantities of physical parameters are estimated based on observed data using discretely installed sensors such as seismographs, can be seen in a range of scientific fields. It is valuable to select the locations to install sensors from the candidate locations in order to obtain as much useful information as possible with as few sensors as possible. This sensor selection/placement problem, which is called the sparse sensor optimisation problem, are intensively studied for the global positioning system (Kihara & Okada, 1984; Phatak, 2001), structural health monitoring (Worden & Burrows, 2001; Yi et al., 2011), environmental monitoring (Du et al., 2014), brain source localisation (Yeo et al., 2022), and flow visualization (Manohar et al., 2018; Carter et al., 2021; Kanda et al., 2021, 2022; Kaneko et al., 2021; Inoue et al., 2021, 2023; Inoba et al., 2022; Tiwari et al., 2022; Li et al., 2021a, b; Fukami et al., 2021; Callaham et al., 2019), etc. The sparse sensor optimisation methods have been proposed using various schemes such as a convex relaxation method (Joshi & Boyd, 2009; Liu et al., 2016; Nonomura et al., 2021; Nagata et al., 2021, 2022b), and a greedy method (Manohar et al., 2018, 2021, 2019; Clark et al., 2018, 2020a, 2020b; Saito et al., 2020; Yamada et al., 2021; Nakai et al., 2021, 2022; Nagata et al., 2023; Yamada et al., 2022).

The present study proposes an observation site selection method for the accurate reconstruction of the seismic wavefield from sparse observation. The framework of sparse sensor optimisation based on linear observation equations is extended to seismic wavefield reconstruction. The observation site selection method proposed in the present study is for the process-driven reconstruction method; the locations of observation sites are optimised to estimate the physical model parameters based on a linear equation which expresses the relation between the parameters and the sparse observations, and the seismic wavefield is reconstructed using the simulation with parameters estimated by the observations at the optimised observation sites. The observation site selection method proposed in the present study is expected to contribute to improving the rapidness of the process-driven seismic wavefield reconstruction, e.g., the method proposed in Kano et al. (2017a). In the proposed reconstruction framework, the model parameters are estimated using the replica-exchange Monte Carlo method, which allows a parameter search in a wide parameter space but is time-consuming.

The remainder of this paper is organized as follows: Section 2 formulates an observation site selection method for seismic model parameter estimation toward accurate seismic wavefield reconstruction. Section 3 presents the results of observation site selection using the proposed method. Section 4 demonstrates the improvement in the reconstruction accuracy of the numerically simulated seismic wavefield based on the data at the observation sites selected by the proposed method compared to that at observation sites randomly selected. Section 5 concludes the paper.

2 Methodologies

2.1 Sparse sensor optimisation

The formulation and algorithms of sparse sensor optimisation proposed in the field of fluid dynamics are described in this subsection (Manohar et al., 2018; Saito et al., 2021a, b). The observation of ground motions using seismometers corresponds to the vector-sensor measurement. The acceleration is typically measured in the north-south, east-west and up-down directions at every single observation site. The number of multiple components is further increased when considering the seismic waveform in the frequency domain. We define the vector-measurement-sensor optimisation problem as follows:

𝐲\displaystyle\mathbf{y} =\displaystyle= [𝐇s1𝐇s2𝐇sp][𝐔1𝐔2𝐔s]𝐳\displaystyle\left[\begin{array}[]{c}\mathbf{H}_{s_{1}}\\ \mathbf{H}_{s_{2}}\\ \vdots\\ \mathbf{H}_{s_{p}}\\ \end{array}\right]\left[\begin{array}[]{c}\mathbf{U}_{1}\\ \mathbf{U}_{2}\\ \vdots\\ \mathbf{U}_{s}\end{array}\right]\mathbf{z} (9)
=\displaystyle= [𝐖1𝐖2𝐖p]𝐳\displaystyle\left[\begin{array}[]{c}\mathbf{W}_{1}\\ \mathbf{W}_{2}\\ \vdots\\ \mathbf{W}_{p}\\ \end{array}\right]\mathbf{z} (14)
=\displaystyle= 𝐃𝐳\displaystyle\mathbf{D}\mathbf{z} (15)

where 𝐲p\mathbf{y}\in\mathbb{R}^{p} is the observation vector, 𝐇sks×N\mathbf{H}_{s_{k}}\in\mathbb{R}^{s\times N} is the sensor location matrix of kkth sensor for first-to-ss th vector components, 𝐔sNs×r\mathbf{U}_{s}\in\mathbb{R}^{\frac{N}{s}\times r} is the ssth vector component of a sensor candidate matrix, and 𝐳r\mathbf{z}\in\mathbb{R}^{r} is the latent variable vector. Here, pp, rr, ss and NN are the number of sensors to be selected, the number of latent variables, the number of components of the measurement vector, and the number of spatial dimensions including the different vector components, respectively. Let 𝐖ks×r\mathbf{W}_{k}\in\mathbb{R}^{s\times r} denotes the kkth vector-sensor-candidate matrix as follows:

𝐖k=[𝐮ik,1T𝐮ik,2T𝐮ik,sT]T,\displaystyle\mathbf{W}_{k}=\left[\begin{array}[]{cccc}\mathbf{u}_{i_{k},1}^{\mathrm{T}}&\mathbf{u}_{i_{k},2}^{\mathrm{T}}&\dots&\mathbf{u}_{i_{k},{s}}^{\mathrm{T}}\end{array}\right]^{\mathrm{T}}, (17)

where iki_{k} and 𝐮ik,l1×r\mathbf{u}_{i_{k},l}\in\mathbb{R}^{1\times r} are the index of the kkth selected sensor and the corresponding row vector of the llth vector component of the sensor-candidate matrix, respectively. Let 𝐃s×r\mathbf{D}\in\mathbb{R}^{s\times r} denotes the vector-sensor-candidate matrix as follows:

𝐃=[𝐖1T𝐖2T𝐖pT]T.\displaystyle\mathbf{D}=\left[\begin{array}[]{cccc}\mathbf{W}_{1}^{\mathrm{T}}&\mathbf{W}_{2}^{\mathrm{T}}&\dots&\mathbf{W}_{p}^{\mathrm{T}}\end{array}\right]^{\mathrm{T}}. (19)

The best estimation of latent variables 𝐳~\tilde{\mathbf{z}} can be obtained by the pseudo-inverse operation when uniform independent Gaussian noise is imposed on the observations. As described in Saito et al. (2021a); Nakai et al. (2021), the objective function for sensor optimisation problems in the linear estimation can be defined using the Fisher information matrix (FIM), which corresponds to the inverse of the covariance matrix of estimation error, based on the optimal design of experiments (Atkinson et al., 2007). The most commonly used criterion is the D-optimality criterion, which is equivalent to minimizing the determinant of the error covariance matrix, whereas there are a variety of criteria proposed in the optimal design of experiments. The vector-measurement-sensor optimisation based on the D-optimality criterion can be expressed as the optimisation problem as follows (Saito et al., 2021b):

maximizefDs\displaystyle\mathrm{maximize}\,\,f_{\mathrm{D}_{s}}
fDs={det(𝐃𝐃),pr/s,det(𝐃𝐃),p>r/s.\displaystyle f_{\mathrm{D}_{s}}\,=\,\left\{\begin{array}[]{cc}\mathrm{det}\,\left(\mathbf{D}\mathbf{D}^{\top}\right),&p\leq r/s,\\ \mathrm{det}\,\left(\mathbf{D}^{\top}\mathbf{D}\right),&p>r/s.\end{array}\right. (22)

The sensor optimisation problem is formulated as a combinatorial optimisation problem, which is known as an NP-hard problem. Greedy methods have been devised and applied to the sensor optimisation problems instead of a brute-force algorithm, which evaluates all the combinations of sensors out of sensor candidates and takes enormous computational cost. In the step-by-step selection using the greedy method, only the kkth sensor selection is carried out in the kkth step under the condition that the first-to-(k1k-1)th sensors are already determined.

Furthermore, a unified expression for the undersampled and oversampled cases has been proposed in Saito et al. (2021a, b), whereas different objective functions are defined in the undersampled and oversampled cases in which the number of sensors is less than and greater than that of the latent state variables, respectively. The function det(𝐃𝐃+ϵ𝐈)\mathrm{det}\left(\mathbf{D}^{\top}\mathbf{D}+\epsilon\mathbf{I}\right) has been proposed as the approximated objective function, where ϵ\epsilon is a sufficiently small number (Saito et al., 2021a; Shamaiah et al., 2010). The sensor index chosen in kkth step of the greedy algorithm can be described as follows:

ik=\displaystyle i_{k}= argmaxikdet(𝐃k𝐃k+ϵ𝐈).\displaystyle\mathop{\rm{arg~{}max}}\limits_{i_{k}}\,\,\mathrm{det}\,\left(\mathbf{D}_{k}^{\top}\mathbf{D}_{k}+\epsilon\mathbf{I}\right). (23)

Although eq. (22) was adopted in the initial phase of the present study, this objective function was found to poorly work for the situation in which vector sensors in one sensor location are not linearly independent. This is because a vector sensor in one sensor location has more than 1230 components in the present situation as later discussed and eq. (22) for the p>r/sp>r/s condition should be used even in the selection of the first sensor, but it gives rank-deficient FIM which corresponding to det(𝐃𝐃)=0\mathrm{det}\,\left(\mathbf{D}^{\top}\mathbf{D}\right)=0 for any sensor selected for the first step. Therefore, the greedy method in eq. (22) does not work well for the first step. This situation is relaxed by using the unified and robust formulation owing to the regularization of FIM though the hyperparameter ϵ\epsilon should be introduced. Therefore, we employ the formulation (23) for the vector-sensor-measurement-optimisation problem in the present study.

The greedy method based on the objective function (23) is confirmed to provide the same sensors as the convex relaxation method using eq. (23) (Saito et al., 2021b), which obtains a nearly optimal solution. Furthermore, the objective function based on the A-optimality criterion, which also has a statistical interpretation in terms of FIM as with D-optimality criterion and evaluates the trace of the inverse of FIM, can be defined in a similar way to D-optimality criterion (Nakai et al., 2021). The results using the objective function based on the A-optimality criterion are confirmed to be almost the same as those based on the D-optimality criterion. Therefore, the results based on A-optimality criterion are omitted for brevity in the present paper.

2.2 Observation site selection based on sensitivity to model parameters

In the present study, an observation site selection method which accurately estimates the model parameters used as inputs in the numerical simulation of the wave field is proposed, whereas the seismic wavefield is assumed to be reconstructed by a numerical simulation. Throughout this study, we use the code developed by Hisada (1995) for the numerical calculation. This code calculates the theoretical waveforms at a certain distance from the epicenter location based on the discrete wavenumber method, in which seismic waveforms are represented as wavenumber integrals of Green’s functions. Theoretical waveforms are obtained by assuming a one-dimensional (1-D) horizontally layered subsurface structure model and given source information.

Refer to caption
Figure 1: Multi-layered subsurface structure model consisting of three layers on a half-space. The settings of the subsurface structure model parameters (VPV_{P}, VSV_{S}, and hh) and the source location parameters (SNSS_{\rm{NS}}, SEWS_{\rm{EW}}, and SUDS_{\rm{UD}}) are mentioned in Tables 1 and 2, respectively.

Figure 1 shows a multi-layered subsurface structure model assumed in the present experiment. Therefore, the seismic velocities (VPV_{P} and VSV_{S}) and the thickness (hh) of each layer are the target model parameters to be estimated. In addition, earthquake source location and fault parameters such as strike, rake, and dip angles are the physical parameters to estimate.

The waveform data observed at each observation site is obtained as Fourier spectra using the code developed by Hisada (1995); it has ii frequency components, and it has six components for each frequency component: real and imaginary parts every three components for three directions (north-south (NS), east-west (EW), and up-down (UD)). Hence, the Fourier coefficients obtained at nnth observation site are stored in a column vector 𝐱n(6×i)×1\mathbf{x}_{n}\in\mathbb{R}^{(6\times i)\times 1} in the proposed method as follows:

𝐱n=[Re(ξ11)Im(ξ11)Re(η11)Im(η11)Re(ζ11)Im(ζ11)Re(ξ12)Im(ζ1i)]T,\mathbf{x}_{n}=\left[\begin{array}[]{ccccccccc}\text{Re}(\xi^{1}_{1})&\text{Im}(\xi^{1}_{1})&\text{Re}(\eta^{1}_{1})&\text{Im}(\eta^{1}_{1})&\text{Re}(\zeta^{1}_{1})&\text{Im}(\zeta^{1}_{1})&\text{Re}(\xi^{2}_{1})&\dots&\text{Im}(\zeta^{i}_{1})\end{array}\right]^{\mathrm{T}}, (24)

where ξ\xi, η\eta, and ζ\zeta are the components of a certain frequency of Fourier spectra for three directions in space (NS, EW, UD), respectively. The waveform data matrix shown in eq. (24) is obtained at all observation sites. Thus, the waveform data vector, which is constructed by stacking the data for all observation sites 𝐗(6×i×n)×1\mathbf{X}\in\mathbb{R}^{(6\times i\times n)\times 1} is following form.

𝐗=[𝐱1T𝐱2T𝐱nT]T\mathbf{X}=\left[\begin{array}[]{cccc}\mathbf{x}_{1}^{\mathrm{T}}&\mathbf{x}_{2}^{\mathrm{T}}&\dots&\mathbf{x}_{n}^{\mathrm{T}}\end{array}\right]^{\mathrm{T}} (25)

We consider the linear equation which describes the relation between the parameters to be estimated and observation data at observation sites since the framework proposed assumes a linear observation equation as written in eq. (15):

[δ𝐱1δ𝐱2δ𝐱n]=[ϕ~1𝐱1ϕ1ϕ~2𝐱1ϕ2ϕ~r𝐱1ϕrϕ~1𝐱2ϕ1ϕ~2𝐱2ϕ2ϕ~r𝐱2ϕrϕ~1𝐱nϕ1ϕ~2𝐱nϕ2ϕ~r𝐱nϕr][δϕ1ϕ~1δϕ2ϕ~2δϕrϕ~r],\left[\begin{array}[]{c}\mathbf{\delta x}_{1}\\ \mathbf{\delta x}_{2}\\ \vdots\\ \mathbf{\delta x}_{n}\end{array}\right]=\left[\begin{array}[]{cccc}\tilde{\phi}_{1}\frac{\partial\mathbf{x}_{1}}{\partial\phi_{1}}&\tilde{\phi}_{2}\frac{\partial\mathbf{x}_{1}}{\partial\phi_{2}}&\dots&\tilde{\phi}_{r}\frac{\partial\mathbf{x}_{1}}{\partial\phi_{r}}\\ \tilde{\phi}_{1}\frac{\partial\mathbf{x}_{2}}{\partial\phi_{1}}&\tilde{\phi}_{2}\frac{\partial\mathbf{x}_{2}}{\partial\phi_{2}}&\dots&\tilde{\phi}_{r}\frac{\partial\mathbf{x}_{2}}{\partial\phi_{r}}\\ \vdots&\vdots&\vdots&\vdots\\ \tilde{\phi}_{1}\frac{\partial\mathbf{x}_{n}}{\partial\phi_{1}}&\tilde{\phi}_{2}\frac{\partial\mathbf{x}_{n}}{\partial\phi_{2}}&\dots&\tilde{\phi}_{r}\frac{\partial\mathbf{x}_{n}}{\partial\phi_{r}}\\ \end{array}\right]\left[\begin{array}[]{c}\frac{\delta\phi_{1}}{\tilde{\phi}_{1}}\\ \frac{\delta\phi_{2}}{\tilde{\phi}_{2}}\\ \vdots\\ \frac{\delta\phi_{r}}{\tilde{\phi}_{r}}\end{array}\right], (26)

where ϕ\phi is the model parameter to be estimated, and rr is the number of model parameters to be estimated, which corresponds to the latent state variables in eq. (15). Although the model parameter estimation should be defined as a nonlinear estimation problem, linearisation around the true values of the model parameters is applied in the present study by assuming small errors between the true and initial values of the model parameters. This is a reasonable assumption since the estimation of the model parameters is performed using preliminary report values, which are expected to be close to the true value as initial values in real problems. Equation (26) expresses the reconstruction error in the waveform data at each observation site (δ𝐱\mathbf{\delta x}) that is caused by the estimation errors of the model parameters (δϕ\delta\phi); the second term on the right corresponds to the differences between the true and estimated values of the model parameters, and the left-hand side corresponds to the differences between the observed data and data reconstructed using a numerical simulation with the estimated model parameters. It should be noted that the estimation error of each model parameter is normalized by the initial value (ϕ~\tilde{\phi}). Hence, the first term on the right in eq. (26) corresponds to a matrix that represents the normalized sensitivity to the model parameters (hereinafter, referred to as normalized parameter sensitivity matrix); each row of the matrix corresponds to the sensitivity of each observation site candidate to the model parameters. Therefore, the observation sites with high sensitivity to the model parameters can be selected by evaluating the quantity of the normalized parameter sensitivity matrix. In the framework of the sparse sensor optimisation described in Section 2.1, the normalized parameter sensitivity matrix corresponds to the vector-sensor-candidate matrix 𝐃\mathbf{D} in eq. (15). The D-optimality criterion is adopted in the present study, and the observation sites are selected by the D-optimality-based greedy method using the objective function (22). Note that the observed noise is assumed to follow a normal distribution in the formulation of the sparse sensor optimisation described in Section 2.1. Thus, the result of the observation site selection by the D-optimality-based greedy method using the objective function eq. (22) does not depend on the noise following a normal distribution contained in the measurement data. On the other hand, several researchers have studied sensor selection in the presence of correlated measurement noise (Liu et al., 2016; Uciński, 2020; Yamada et al., 2021; Nagata et al., 2022b). The observation site selection method proposed in the present study can be extended to the case of correlated noise by utilizing the formulation con considering correlated measurement noise, which is left for the future study.

2.3 Model parameter estimation method

The estimation method of the model parameters based on observation at the selected observation sites by the proposed method and the reconstruction method of the seismic wavefield is explained. The waveform data vector for the subset of observation sites selected by the proposed method 𝐘(6×i×p)×1\mathbf{Y}\in\mathbb{R}^{(6\times i\times p)\times 1} is described as follows:

𝐘=[𝐲1T𝐲2T𝐲pT]T,\mathbf{Y}=\left[\begin{array}[]{cccc}\mathbf{y}_{1}^{\mathrm{T}}&\mathbf{y}_{2}^{\mathrm{T}}&\dots&\mathbf{y}_{p}^{\mathrm{T}}\end{array}\right]^{\mathrm{T}}, (27)

where pp is the number of observation sites selected. The values of model parameters are repeatedly updated until the change in error in the reconstructed wavefield approaches zero, as in Newton’s method. The errors of the reconstructed wavefield are evaluated by comparing the observation at the selected observation sites as follows:

[δ𝐲1δ𝐲2δ𝐲p]=[𝐲1𝐲2𝐲p][𝐲1~𝐲2~𝐲~p].\left[\begin{array}[]{c}\mathbf{\delta y}_{1}\\ \mathbf{\delta y}_{2}\\ \vdots\\ \mathbf{\delta y}_{p}\end{array}\right]=\left[\begin{array}[]{c}\mathbf{y}_{1}\\ \mathbf{y}_{2}\\ \vdots\\ \mathbf{y}_{p}\end{array}\right]-\left[\begin{array}[]{c}\tilde{\mathbf{y}_{1}}\\ \tilde{\mathbf{y}_{2}}\\ \vdots\\ \tilde{\mathbf{y}}_{p}\end{array}\right]. (28)

Then, the estimation errors of the model parameters are calculated using the parameter sensitivity matrix consisting of rows corresponding to the selected observation sites and the pseudo-inverse operation as follows:

[δϕ1δϕ2δϕr]=[𝐲1ϕ1𝐲1ϕ2𝐲1ϕr𝐲pϕ1𝐲pϕ2𝐲pϕr][δ𝐲1δ𝐲2δ𝐲p],\left[\begin{array}[]{c}\delta\phi_{1}\\ \delta\phi_{2}\\ \vdots\\ \delta\phi_{r}\end{array}\right]=\left[\begin{array}[]{cccc}\frac{\partial\mathbf{y}_{1}}{\partial\phi_{1}}&\frac{\partial\mathbf{y}_{1}}{\partial\phi_{2}}&\dots&\frac{\partial\mathbf{y}_{1}}{\partial\phi_{r}}\\ \vdots&\vdots&\vdots&\vdots\\ \frac{\partial\mathbf{y}_{p}}{\partial\phi_{1}}&\frac{\partial\mathbf{y}_{p}}{\partial\phi_{2}}&\dots&\frac{\partial\mathbf{y}_{p}}{\partial\phi_{r}}\\ \end{array}\right]^{\dagger}\left[\begin{array}[]{c}\mathbf{\delta y}_{1}\\ \mathbf{\delta y}_{2}\\ \vdots\\ \mathbf{\delta y}_{p}\end{array}\right], (29)

where \circ^{\dagger} indicates the Moore–Penrose pseudoinverse. The values of model parameters are calibrated based on the estimation errors in eq. (29):

[ϕ1t+1ϕ2t+1ϕrt+1]=[ϕ1tϕ2tϕrt][δϕ1δϕ2δϕr],\left[\begin{array}[]{c}\phi_{1}^{t+1}\\ \phi_{2}^{t+1}\\ \vdots\\ \phi_{r}^{t+1}\end{array}\right]=\left[\begin{array}[]{c}\phi_{1}^{t}\\ \phi_{2}^{t}\\ \vdots\\ \phi_{r}^{t}\end{array}\right]-\left[\begin{array}[]{c}\delta\phi_{1}\\ \delta\phi_{2}\\ \vdots\\ \delta\phi_{r}\end{array}\right], (30)

where tt is the number of optimisation steps. Then, the seismic wavefield is reconstructed using the numerical simulation with the updated model parameters, and the errors of seismic wavefield at the selected observation sites are evaluated as described in eq. (28). The series of the estimation of the model parameters [eq. (28)–(30)] is repeated until the change in the residual errors of the reconstructed seismic wavefield approaches to a sufficiently small value at the selected observation sites.

Note that the observation site selection explained in Section 2.2 can be performed each time the physical model parameters are updated in the series of the estimation [eq. (28)–(30)]; however, the aim of the present study is to propose a method for the preselection of a prioritised list of observation sites, depending on the areas, magnitudes, source mechanisms and earthquake types (e.g., plate-boundary and inland types) from the point of view of an accurate reconstruction of the seismic wavefield. Reselection of observation sites in the estimation process is outside the scope of the present study. In addition, the reselection of observation sites in the estimation process is unlikely to affect the results of the observation site selection in a situation where the difference between the initial and true parameters is small.

Our research group has been developing a seismic wavefield reconstruction method based on the data-driven reduced-order model (Nagata et al., 2022a) besides the observation site selection method proposed in the present study. The aim of both studies is to achieve an accurate reconstruction of the seismic wavefield. The critical difference between these two studies is that the present study is the process-driven approach, where Nagata et al. (2022a) is the data-driven approach. Since the process-driven approach requires accurate estimation of model parameters, the present study proposes a method that selects observation sites suitable for accurate seismic wavefield reconstruction. Kano et al. (2017a) has proposed the data-driven reconstruction framework using a replica-exchange Monte Carlo method for the nonlinear estimation problem of model parameters. For a quick estimation, it is significant to estimate as accurately as possible using information from as few observation sites as possible. The parameter estimation problem is linearised in the observation site selection method proposed in the present study; however, the sensitive location obtained by the proposed method is expected to be also applicable to the parameter search in the practical nonlinear problem. The observation site selection method proposed in the present study is one of the important elemental technologies for the improvement of accuracy and rapidness of the process-driven reconstruction because it can be adapted to various process-driven reconstruction methods.

3 Observation site selection

3.1 Problem settings for observation site selection

The target area for seismic wavefield reconstruction is the Tokyo metropolitan area of Japan, in which a dense seismological network named MeSO-net has been in operation since 2007. The MeSO-net comprises 296 accelerometers, located at intervals of a few kilometers, that continuously record seismograms at a sampling rate of 200 Hz (e.g. Hirata, 2009; Sakai & Hirata, 2009). The target area is a sedimentary basin known as the Kanto basin. The Kanto basin on the bedrock can be approximated by three layers lying on a half-space, as shown in Fig. 1, which follows the Japan Integrated Velocity Structure Model (JIVSM) (Koketsu et al., 2011). Hence, the subsurface structure is assumed to be a 1-D horizontally layered subsurface structure consisting of three layers, which model the sedimentary basin. The three layers are lying on bedrock, which is modeled as a half-space. Table 1 shows the subsurface structure model parameters.

The observation sites assumed in the present paper are the subset of the observation sites of MeSO-net that actually exist in the central Tokyo of Japan. Figure 2 shows geometric settings. The circles indicate the assumed observation sites. The number of observation sites is 50. The origin of horizontal (NS-EW) coordinates is set as (35.034035.0340^{\circ}N, 139.9106139.9106^{\circ}E). Note that the results do not depend on the setting of the origin of horizontal coordinates since the relative positional relationship between the observation sites and the hypocenter affects the results.

The earthquake source is assumed to be located in the half-space as shown in Fig. 1. In the present paper, two types of earthquakes are simulated using different source information, and the characteristics of observation site selection by the proposed method are verified. The source information is set based on the earthquakes actually occurred in the Tokyo metropolitan area. Table 2 summarizes the source information. In the case of the hypocenter 1, the source information is set based on that of the earthquake which occurred in the southern part of Ibaraki prefecture on September 16, 2014. This area is known as one of areas where earthquakes occur frequently. In the case of the hypocenter 2, the source information is set based on that of the earthquake which occurred directly underneath the central Tokyo on May 9, 2010. The blue and red marks shown in Fig. 2 indicate the locations of the hypocenters 1 and 2. The cases of the hypocenters 1 and 2 correspond to situations where the earthquake occurs far from and directly underneath the area of observation sites, respectively.

Refer to caption
Figure 2: Geometric setting. (a) The Kanto area in the geographic coordinate system. (b) Locations of observation sites and earthquake sources in the north-south(NS)/east-west(EW) coordinate system. The circles indicate the observation sites, which are referred to the actual distribution of the MeSO-net. The blue and red marks indicate the earthquake source, the values of which are mentioned in Table 2. This figure corresponds to the top view of Fig. 1, and the source is located in the half space as shown in Fig. 1.
Table 1: Subsurface structure model parameters which are defined as shown in Fig. 1.
ρ\rho [g/cm3] VPV_{P} [km/s] VSV_{S} [km/s] hh [km]
Layer 1 1.95 1.8 0.5 0.4
Layer 2 2.15 2.4 1.0 1.1
Layer 3 2.3 3.2 1.7 1.0
Half space 2.7 5.8 3.4 \infty
Table 2: Source parameters of two earthquakes; the hypocenters 1 and 2 correspond to situations where the earthquake occurs far from and directly underneath the area of observation sites, respectively, as shown in Fig. 2.
Hypocenter 1 (Southern part of Ibaraki)
Source location (NS, EW, UD) [km] 117.9655, -4.2204, 47.0000
Strike, rake, dip [deg] 254, 118, 28
Slip [m] 0.4
Hypocenter 2 (Tokyo 23 Ward)
Source location (NS, EW, UD) [km] 71.4339, -22.5905, 26.0000
Strike, rake, dip [deg] 126, 103, 80
Slip [m] 0.05

In the present study, the seismic velocities of each of the three layers (VPiV_{Pi} and VSiV_{Si}, where i=1,2,3i=1,2,3) and the thickness (hih_{i}) of each of the three layers, and source location in three directions (SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}}) are the estimation target of the model parameters following the previous studies (Kano et al., 2017a, b). Note that the origin time, which is estimated in the previous studies (Kano et al., 2017a, b), is not taken into account as a target parameter since Fourier spectra of the seismic wavefield is evaluated in the proposed method. The total number of model parameters to be estimated is r=12r=12. The sensitivity matrix is constructed by simulating 24 cases using the code developed by Hisada (1995): 2 cases in which the value of one of the 12 parameters is changed by +Δϕ+\Delta\phi and Δϕ-\Delta\phi from the initial value shown in Tables 1 and 2 are conducted, and then, the difference in Fourier spectra by 2Δϕ2\Delta\phi is obtained for each parameter. The values of Δϕ\Delta\phi for the subsurface structure model parameters (VPiV_{Pi}, VSiV_{Si}, and hih_{i}) are set to 10%10\% of initial values (Table 1), and those for the source location parameters (SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}}) are set to 500 m.

The simulation was conducted in the frequency band from DC to 5 Hz. The number of frequency components is i=205i=205. Hence, the number of components obtained by vector measurement at one observation site is s=1230s=1230 considering each frequency component consisting of the real and imaginary parts as shown in eq. 24. The second-order Butterworth filter was applied to the obtained simulation results in order to verify the influence of the frequency band of the seismic wavefield on the observation site selection. The lower cutoff frequency of the bandpass filter is set to be 0.10.1 Hz, and the higher cutoff frequency varies from 0.3 Hz to 1.0 Hz at 0.1 Hz intervals.

3.2 Normalized parameter sensitivity

In this subsection, normalized parameter sensitivity 𝒮\mathcal{S} is newly defined, and the characteristics of the parameter sensitivity matrix of interest in the proposed method are investigated. This is a scalar quantity showing the sensitivity of each observation site to each parameter.

We consider the difference in the observation data due to the difference in the kkth parameter at the jjth observation site. This corresponds to the numerator of ϕ~k𝐱jϕk\tilde{\phi}_{k}\frac{\partial\mathbf{x}_{j}}{\partial\phi_{k}} in the normalized parameter sensitivity matrix in eq. (26). The difference in the observation data (Δ𝐱jϕk\Delta\mathbf{x}_{j}^{\phi_{k}}) is expressed by the difference between the observation data in the case of ϕk=ϕk+Δϕk\phi_{k}=\phi_{k}+\Delta\phi_{k} and that in the case of ϕk=ϕkΔϕk\phi_{k}=\phi_{k}-\Delta\phi_{k} as follows:

Δ𝐱jϕk=𝐱j|ϕk=ϕk+Δϕk𝐱j|ϕk=ϕkΔϕk.\Delta\mathbf{x}_{j}^{\phi_{k}}=\mathbf{x}_{j}|_{\phi_{k}=\phi_{k}+\Delta\phi_{k}}-\mathbf{x}_{j}|_{\phi_{k}=\phi_{k}-\Delta\phi_{k}}. (31)

Here, Δϕ\Delta\phi in the present study was set to be the same value as that for constructing the normalized sensitivity matrix described in Section 3.1. Since the observation data 𝐱\mathbf{x} is a vector as shown in eq. (24), we define the sum of the squares of each component of the difference vector, that is, the square of the L2L_{2} norm as the normalized parameter sensitivity to kkth parameter of jjth observation site (𝒮jϕk\mathcal{S}_{j}^{\phi_{k}}) as follows:

𝒮jϕk=Δ𝐱jϕk22.\mathcal{S}_{j}^{\phi_{k}}=||\Delta\mathbf{x}_{j}^{\phi_{k}}||^{2}_{2}. (32)

The scalar value 𝒮jϕk\mathcal{S}_{j}^{\phi_{k}} can be evaluated for each observation (j=1,,50)(j=1,\dots,50) and for each parameter (k=1,,12)(k=1,\dots,12).

It should be noted that the quantity evaluated in the observation site selection is not exactly the same as 𝒮jϕk\mathcal{S}_{j}^{\phi_{k}}. The proposed site selection method evaluates the value of sites based on the determinant of the normalized parameter sensitivity matrix which is the quantity showing the sensitivity of all considered parameters comprehensively.

3.3 Basic characteristics of observation site selection by proposed method

In this subsection, the characteristics of observation site selection by the proposed method in the case of the hypocenter 1, and the frequency band of 0.11.00.1-1.0 Hz is investigated in detail as a basic case in the present paper.

Figure 3 shows maps of the normalized parameter sensitivity of 12 parameters. The ranges of color bar in Fig. 3 were fixed for each physical parameter. The gray solid lines indicate concentric circles from the hypocenter location every 5 km. Figure 3 demonstrates that the normalized parameter sensitivity has a spatial structure and there are superiority and inferiority of sensitivity between layers or directions. Table 3 summarises the characteristics of the parameter sensitivity. The characteristics are discussed for each parameter below.

Refer to caption
Figure 3: Maps of normalized parameter sensitivity in the case of the hypocenter 1 and frequency band of 0.11.00.1-1.0 Hz. Note that the numbers in the figure indicate the selection order of the top 10 observation sites using the proposed method mentioned in Section 3.3.4.
Table 3: The factors determining the model parameter sensitivity in the case of the hypocentre 1.
Sensitivity Characteristics Dominant factor
Sensitivity to VPV_{P}
Magnitude relationship
between layers
The phase shift
Spatial structure
The phase shift
Sensitivity to VSV_{S}
Magnitude relationship
between layers
The phase shift
Spatial structure
The energy distribution of the S wave
Sensitivity to hh
Magnitude relationship between
layers and spatial structure
Determined by the sensitivity
to VPV_{P} and VSV_{S}
Sensitivity to SNSS_{\rm{NS}}, SEWS_{\rm{EW}}, SUDS_{\rm{UD}}
Magnitude relationship between
directions and the spatial structure
Characterized by the positional
relationship and the energy
distribution of the S wave

3.3.1 Sensitivity to seismic velocities VPV_{P} and VSV_{S}

Firstly, the characteristics of the sensitivity to seismic velocities VPV_{P} and VSV_{S} are discussed. Focusing on the magnitude relationship between layers, the sensitivity to VP2V_{P\rm{2}} and VP3V_{P\rm{3}} is much higher than that to VP1V_{P\rm{1}}, and the sensitivity to VS1V_{S\rm{1}} and VS2V_{S\rm{2}} is much higher than that to VS3V_{S\rm{3}}. The reason for the tendency is considered to be the difference in the arrival time of the seismic wave due to the difference in the seismic velocities. The difference in the seismic velocity of each layer causes the difference in the arrival time, that is, the phase shift, and the greater the difference in arrival time, the higher the sensitivity. Here, the arrival time from the hypocenter location to the observation site on the ground is estimated in the condition utilized in the present study, assuming that the wave goes straight up (in the UD direction) for simplicity, while the wave travels intricately by reflection and refraction. Table 4 shows the difference in the arrival time caused by the difference in the seismic velocity (2ΔVP,2ΔVS2\Delta V_{P},2\Delta V_{S}). Here, the values of ΔVP\Delta V_{P} and ΔVS\Delta V_{S} are set to the same as those for constructing the normalized sensitivity matrix described in Section 3.1. The magnitude relationship of the differences in the arrival time between 3 layers for VPV_{P} and VSV_{S} is similar to that of the sensitivity between 3 layers for VPV_{P} and VSV_{S}, respectively. Therefore, the magnitude relationship of the sensitivity to VPV_{P} and VSV_{S} of different layers is considered to be characterized by a phase shift due to the difference of VPV_{P} and VSV_{S}.

Table 4: Estimated result of the difference in the arrival time from the hypocenter location to the observation site on the ground caused by the difference in the seismic velocity (2ΔVP,2ΔVS2\Delta V_{P},2\Delta V_{S}). ΔV\Delta V is 10% of the initial values shown in Table 1.
Diff. in arrival time caused by 2ΔVP2\Delta V_{P} [ms] Diff. in arrival time caused by 2ΔVS2\Delta V_{S} [ms]
Layer 1 0.0444 0.160
Layer 2 0.0916 0.220
Layer 3 0.0625 0.118

Focusing on the spatial structure of the sensitivity to the velocities, the sensitivity to VPV_{P} is higher at observation sites with a larger epicentral distance. This trend is due to the phase shift similar to the magnitude relationship; the difference in the arrival time increases with increasing the epicenter distance of observation site. On the other hand, the sensitivity to VSV_{S} is higher at the observation sites closer to the northeast. This trend is due to the energy distribution of the S-wave. Figure 4 shows the energy distributions in the NS, EW, and UD directions. The energy in each direction is obtained by calculating the sum of squares of the components in each direction of frequency spectra. Figure 5 shows the waveforms in the NS, EW, and UD directions sampled at the observation site with the shortest epicentral distance as an example. Figure 5 illustrates that the S wave is dominant compared with the P-wave, and the fluctuation in the NS direction is dominant in the duration of S-wave arrival in the current case. Thus, the energy distribution in the NS direction (Fig. 4) corresponds to that of S-wave. The comparison of the energy distribution in the NS direction (Fig. 4) with the maps of sensitivity to VSV_{S} (Fig. 3) demonstrates that the spatial structure of the sensitivity to VSV_{S} is qualitatively the same as that of the energy distribution of the S-wave. Therefore, the reason for the trend in the spatial structure is that the difference in the energy of the S-wave caused by the difference in VSV_{S} increases with originally increasing the energy of the S-wave. As discussed above, the spatial structure of the sensitivity to VPV_{P} and VSV_{S} is characterized by the phase shift due to the difference in the velocities and the energy distribution of the S-wave. The effects of the phase shift and the energy distribution of the S-wave are dominant for VPV_{P} and VSV_{S}, respectively, in the case of the hypocenter 1.

Refer to caption
Figure 4: Energy distributions in the NS, EW, and UD directions in the case of the hypocenter 1. Note that the numbers in the figure indicate the selection order of the top 10 observation sites using the proposed site selection method mentioned in Section 3.3.4.
Refer to caption
Figure 5: Seismic waveforms in the NS, EW, and UD directions at the observation site with the shortest epicentral distance in the case of the hypocenter 1.

3.3.2 Sensitivity to layer thickness hh

Secondly, the characteristics of the sensitivity to layer thickness hh are discussed. With regard to the magnitude relationship, the sensitivity to h1h_{\rm{1}} and h2h_{\rm{2}} is much higher than that to h3h_{\rm{3}}. Focusing on the spatial structure, the sensitivity to h1h_{\rm{1}} is higher at observation sites closer to the northeast, and that to h2h_{\rm{2}} is higher at observation sites closer to the southeast. These characteristics of the sensitivity to hh are considered to be strongly related to the sensitivity to seismic velocities VPV_{P} and VSV_{S} because the difference in the layer thickness has a qualitatively similar effect on the arrival time to that in the seismic velocities. The spatial structure of the sensitivity to h1h_{\rm{1}} is qualitatively the same as that to VS1V_{S\rm{1}}, which is relatively high while the sensitivity to VP1V_{P\rm{1}} is low. With regard to the layer 2, the sensitivity to VP2V_{P\rm{2}} and VS2V_{S\rm{2}} is both relatively high; thus, the characteristics of the sensitivity to VP2V_{P\rm{2}} and VS2V_{S\rm{2}} are superimposed in the spatial structure of the sensitivity to h2h_{\rm{2}}. On the other hand, With regard to the layer 3, the sensitivity to h3h_{\rm{3}} is low since both sensitivity to VP3V_{P\rm{3}} and VS3V_{S\rm{3}} are relatively low. Therefore, the characteristics of the sensitivity to hh are determined by those to VPV_{P} and VSV_{S}.

3.3.3 Sensitivity to hypocenter locations SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}}

Finally, the characteristics of the sensitivity to the hypocenter locations SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}} are discussed. With regard to the magnitude relationship, the sensitivity to SNSS_{\mathrm{NS}} and SUDS_{\mathrm{UD}} is much higher than that of SEWS_{\mathrm{EW}}. Focusing on the spatial structure, the sensitivity to SNSS_{\mathrm{NS}} is higher at observation sites closer to the eastern, and that to SUDS_{\mathrm{UD}} is higher at observation sites closer to the northeastern. These characteristics are due to the positional relationship and the energy distribution of the S-wave. The hypocenter is far north of the area of observation sites. In addition, the fluctuation in the NS direction is confirmed to be dominant in the duration of S-wave arrival as described in Section 3.3.1. Consequently, when the hypocenter location shifts in the NS direction, the observation sites closer to the same EW coordinates as the hypocenter are more suitable for observing the change in the fluctuation in the NS direction. This is the reason for the spatial structure of the sensitivity to SNSS_{\mathrm{NS}}. On the other hand, when the hypocenter location shifts in the EW direction, the effect on the fluctuation at the observation sites, especially the fluctuation in the NS direction, is relatively small. This is the reason why the sensitivity to SEWS_{\mathrm{EW}} is low. Furthermore, the spatial structure of the sensitivity to SUDS_{\mathrm{UD}} is qualitatively the same as that of the energy distribution of the S-wave shown in Fig. 4. This is because the larger the energy of the S-wave, the larger the energy difference due to the difference in model parameters, as discussed in Section 3.3.1. Note that although the sensitivity to SNSS_{\mathrm{NS}} and SEWS_{\mathrm{EW}} is also influenced by the energy difference, the dominant factor is considered to be the positional relationship rather than the energy difference. Therefore, the characteristics of the sensitivity to the hypocenter locations are characterized by the positional relationship and the energy distribution of the S-wave. The effect of the positional relationship is dominant for SNSS_{\mathrm{NS}} and SEWS_{\mathrm{EW}}, and that of the energy distribution of the S-wave is dominant for SUDS_{\mathrm{UD}}, in the case of the hypocenter 1.

3.3.4 Results of observation site selection

Next, the results of observation site selection are discussed. Figure 6 shows the result of observation site selection using the proposed method. The top 10 observation sites are numbered in the order of selection in Fig. 6. Moreover, the top 10 observation sites are numbered in Figs. 3 and 4 to show the relevance to the normalized parameter sensitivity. Figure 7 shows the normalized parameter sensitivity against observation sites in the order of selection. Figure 6 indicates that observation sites outside the area of observation sites tend to be preferentially selected by the proposed method. This trend is due to the characteristics of the normalized parameter sensitivity discussed above; Fig. 7 demonstrates that the values of sensitivity decrease comprehensively as the selection order of observation sites decreases. These results confirm that the proposed method preferentially selects observation sites with high sensitivity to the parameters. On the other hand, some parameters have different tendencies (e.g., SEWS_{\mathrm{EW}}). This is because the proposed method emphasizes the D-optimality criterion of the sensitivity matrix, which represents the average sensitivity of all the parameters, rather than directly evaluating the normalized parameter sensitivity, which corresponds to the sensitivity of each parameter. Consequently, when there is a parameter whose spatial structure is qualitatively different from the major structure of all parameters, the order of observation site selection may not correlate with the sensitivity to the parameter.

Refer to caption
Figure 6: Selection order of observation sites using the proposed method in the case of the hypocenter 1 and frequency band of 0.11.00.1-1.0 Hz. The observation sites are colored from red to blue depending on the order of selection; the observation sites selected earlier are plotted in red, and those selected later are plotted in blue. The numbers in the figure indicate the selection order of the top 10 observation sites.
Refer to caption
Figure 7: Normalized parameter sensitivity against observation sites in the order of selection in the case of the hypocenter 1 and frequency band of 0.11.00.1-1.0 Hz.

3.4 Effect of frequency band

In this subsection, the effect of the frequency band of the seismic wavefield on the observation site selection is investigated. Figure 8 shows the results of observation site selection using the proposed method in the cases of different frequency bands. The higher cutoff frequency of the bandpass filter varies from 0.3 to 1.0 Hz in each figure plotted at from the top left to the bottom right in Fig. 8. Note that the result of the higher cutoff frequency of 1.0 Hz is the same as the result shown in Fig. 6. The comparison of the power spectral density of acceleration in the NS, EW, and UD directions is shown in Fig. 9. The power spectral density is sampled at the observation site with the shortest epicentral distance as an example. The results using the bandpass filter with the higher cutoff frequency of 0.3, 0.5, and 1.0 Hz are plotted together for comparison in Fig. 9. The comparison of the observation site selection in the cases of different frequency bands shown in Fig. 6 demonstrates that there is no effect of the frequency band on the observation site selection. Although the order of selection is slightly different depending on the frequency band, the comprehensive characteristics discussed in Section 3.3 is not affected by the frequency band. This is because the comprehensive characteristics of the seismic wavefield are the same between the cases of the different frequency bands. Figure 9 illustrates that the frequency spectra are quantitatively different depending on the frequency band; the values of the power spectral density at the dominant frequency increase significantly with increasing the higher cutoff frequency. However, the qualitative characteristics in the cases of 0.3, 0.5, and 1.0 Hz are the same as each other; the power spectral density is much higher in the NS direction than in the EW and UD directions, and the peak frequencies of the acceleration in the NS, EW, and UD directions does not change. In addition, maps of the normalized parameter sensitivity are confirmed not to be qualitatively affected by the frequency band. Therefore, the observation sites sensitive to the parameters are not considered to depend on the frequency band of the seismic wavefield.

Refer to caption
Figure 8: Comparison of the order of observation site selection using the proposed method in the cases of different frequency bands.
Refer to caption
Figure 9: Comparison of the power spectral density in the cases of different frequency bands.

3.5 Effect of focal mechanism

In this subsection, the effect of the focal mechanisms on the observation site selection is discussed. The results in the case of the hypocenter 2 are compared to those in the case of the hypocenter 1 discussed in Section 3.3.

Figure 10 shows maps of the normalized parameter sensitivity of 12 parameters. The range of the color bar in Fig. 10 is set to be the same as each other for the same kind of physical parameters of different layers or directions. The gray lines indicate concentric circles from the hypocenter location. Table 5 summarises the effects of the focal mechanisms, and each parameter in the table is explained below.

Refer to caption
Figure 10: Maps of normalized parameter sensitivity in the case of the hypocenter 2 for frequency band of 0.1–1.0 Hz. Note that the numbers in the figure indicate the selection order of the top 10 observation sites using the proposed method.
Table 5: The factors determining the model parameter sensitivity in the case of the hypocentre 2.
Sensitivity Characteristics Dominant factor Effect of focal mechanisms
VPV_{P}
Magnitude relationship
between layers
The phase shift
No
Spatial structure
The energy distribution
of the P-wave
Yes (Cf. the phase
shift in the case of the
hypocenter 1)
VSV_{S}
Magnitude relationship
between layers
The phase shift
No
Spatial structure
The energy distribution
of the S-wave
No
hh
Magnitude relationship
between layers and
spatial structure
Determined by the
sensitivity to VPV_{P} and VSV_{S}
No
SNSS_{\rm{NS}}, SEWS_{\rm{EW}}, SUDS_{\rm{UD}}
Magnitude relationship
between directions and
the spatial structure
Characterized by the
positional relationship
and the energy
distribution of the S-wave
No

Firstly, the characteristics of the sensitivity to seismic velocities VPV_{P} and VSV_{S} are discussed. Focusing on the magnitude relationship between layers, the sensitivity to VP2V_{P\rm{2}} and VP3V_{P\rm{3}} are much higher than that to VP1V_{P\rm{1}}, and the sensitivity to VS1V_{S\rm{1}} and VS2V_{S\rm{2}} are much higher than that to VS3V_{S\rm{3}}. These tendencies are similar to those of the case of the hypocenter 1 in Section 3.3.1; the magnitude relationship is due to the difference in the arrival time. Therefore, the magnitude relationship of the sensitivity to seismic velocities between layers does not depend on the focal mechanisms and is determined by the subsurface structure model parameters. Focusing on the spatial structure, the sensitivity to VSV_{S} is high in the northern region adjacent to the hypocenter. The reason for the characteristics is consistent with the discussion on the case of the hypocenter 1 in Section 3.3.1; the spatial structure of the sensitivity to VSV_{S} is due to the energy distribution of the S-wave. On the other hand, the characteristics of the spatial structure of the sensitivity to VPV_{P} are different; although the sensitivity to VPV_{P} is higher at observation sites with a larger epicentral distance in the case of the hypocenter 1, it is high in the north region and the southwest region in the case of the hypocenter 2. This is due to the energy distribution of the P-wave. Figure 11 shows the energy distribution in the UD direction obtained by calculating the sum of squares of the components in the UD direction of the frequency spectra. The energy distribution corresponds to that of the P-wave because the fluctuations in the NS and EW directions are dominant in the duration of the S-wave arrival and that in the UD direction is dominant in the duration of the P-wave arrival in the current case. The comparison of the energy distribution (Fig. 11) with the maps of sensitivity to VPV_{P} (Fig. 10) demonstrates that the spatial structure of the sensitivity to VPV_{P} is qualitatively the same as that of the energy distribution of the P-wave. Hence, the reason for the trend of the spatial structure is that the difference in the energy of the P-wave caused by the difference in VPV_{P} increases with increasing originally the energy of the P-wave.

Refer to caption
Figure 11: Energy distribution in the UD direction in the case of the hypocenter 2. Note that the numbers in the figure indicate the selection order of the top 10 observation sites using the proposed method.

To summarize the discussions on the spatial structure of the sensitivity to VPV_{P} above and in Section 3.3.1, the difference in the arrival time and the energy distribution of the P-wave are the dominant factors in the cases of the hypocenters 1 and 2, respectively. The discrepancy is mainly because the hypocenter is located approximately 30 to 50 km away from the observation site area in the case of the hypocenter 1, whereas in the case of the hypocenter 2, the hypocenter is directly below the observation site area, and the difference in the arrival time between the observation sites is relatively small. Therefore, factors that potentially affect the spatial structure of the sensitivity to seismic velocities do not depend on the focal mechanism; the difference in the arrival time and the energy distribution affects the sensitivity in both cases of hypocenter 1 and 2, but the dominant factor depends on the focal mechanism.

Secondly, the characteristics of the sensitivity to hh are discussed. The sensitivity to h1h_{\rm 1} and h2h_{\rm 2} is higher than that to h3h_{\rm 3}, and it is high in the northern region adjacent to the hypocenter. The reasons for these characteristics are consistent with the discussions on the case of the hypocenter 1 in Section 3.3.2; the characteristics of the sensitivity to hh are determined by those to VPV_{P} and VSV_{S}. Therefore, factors that affect the characteristics of the sensitivity to hh do not depend on the focal mechanisms.

Finally, the characteristics of the sensitivity to SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}} are discussed. The sensitivity to SUDS_{\mathrm{UD}} is much higher than that of SNSS_{\mathrm{NS}} and SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}} is higher in the northern region adjacent to the hypocenter. The reasons for these characteristics are consistent with the discussions on the case of the hypocenter 1 in Section 3.3.3; the characteristics of the sensitivity to the hypocenter locations are characterized by the positional relationship and the energy distribution of the S-wave. Therefore, factors that potentially affect the characteristics of the sensitivity to hypocenter locations do not depend on the focal mechanism.

Figure 12 shows the result of observation site selection using the proposed method in the case of the hypocenter 2 and the frequency band of 0.10.1–1.0 Hz. The observation sites are colored from red to blue depending on the order of selection. Figure 13 shows the normalized parameter sensitivity against observation sites in the order of selection. Figure 12 indicates that observation sites in the northern region adjacent to the hypocenter are preferentially selected by the proposed method. This trend is due to the characteristics of the normalized parameter sensitivity discussed above; Fig. 13 demonstrates that the values of sensitivity decrease comprehensively as the selection order of observation sites decreases. Especially in the case of the hypocenter 2, the sensitivity to VSV_{S}, hh, and SUDS_{\mathrm{UD}} decreases with the selection order, whereas there is no correlation between the selection order and other parameters such as VPV_{P}, SNSS_{\mathrm{NS}}, and SEWS_{\mathrm{EW}}. This is mainly because the spatial structure of the sensitivity to VSV_{S}, hh, and SUDS_{\mathrm{UD}} is major among all parameters, whereas those of other parameters are different from the major one. Therefore, the proposed method is confirmed to be able to select observation sites suitable for improvement of the average sensitivity to all parameters according to the sensitivity characteristics, which depends on the focal mechanism.

Refer to caption
Figure 12: Selection order of observation sites using the proposed method in the case of the hypocenter 2 and frequency band of 0.1–1.0 Hz. The observation sites are colored from red to blue depending on the order of selection. The numbers in the figure indicate the selection order of the top 10 observation sites.
Refer to caption
Figure 13: Normalized parameter sensitivity against observation sites in the order of selection in the case of the hypocenter 2 and frequency band of 0.1–1.0 Hz.

4 Twin experiment of seismic wavefield reconstruction

A numerical experiment of the seismic wavefield reconstruction is conducted using synthetic observation data, and the proposed method is verified. The synthetic observation data are theoretical seismic waveforms calculated using the code developed by Hisada (1995) described in Section 2.2. Note that the code developed by Hisada (1995) is used not only for selecting the observation site and reconstructing the seismic wavefield but also for preparing true data for the seismic wavefield reconstruction; therefore, a twin experiment, where both true and estimated waveforms are calculated using the same simulator, is conducted in the present study. First, the observation sites are selected using the proposed method based on the observation data calculated using the initial values of the model parameters, similar to Section 3. Then, the model parameters estimation is conducted based on the observed data at selected observation sites in the way described in Section 2.3, whereas a slight difference between the initial and true values of the model parameters is assumed. Finally, the seismic wavefield is reconstructed using the estimated model parameters. We verify the accuracy of the model parameter estimation using the observation sites selected by the proposed method by comparing it with that using the randomly selected observation sites. Furthermore, the accuracy of the seismic wavefield reconstruction with the estimated model parameters using the observation sites selected by the proposed method is compared with that using the randomly selected observation sites.

4.1 Problem settings for twin experiment

The seismic velocities (VpV_{p} and VsV_{s}) and the thickness (hh) of each of the three layers, and source location in three directions (SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}}) are employed as the target model parameters to be estimated as with Section 3. The observation sites are the same as that in Section 3.1. Hence, the initial values of the subsurface structure model parameters are based on Koketsu et al. (2011) summarized in Table 1. The earthquake source information is the same as the case of the hypocenter 1 in Section 3, and the initial values of the source parameters are summarized in Table 2. True values of the 12 model parameters are assumed to be slightly different from the initial values in the numerical experiment. The true values are obtained by adding the small values to the initial values. The small value is obtained by multiplying the random value following a normal distribution 𝒩(0,1)\mathcal{N}(0,1) by 10%10\% of the initial value and 5 km for subsurface structure model parameters and source location parameters, respectively. The reconstruction of waveforms within a frequency band of 0.1–1.0 Hz is examined in the present experiment. Thus, the simulation was conducted in the frequency range less 5 Hz as with Section 3.1, and the second-order Butterworth filter of the frequency band of 0.1–1.0 Hz was applied to the obtained simulation results. The observation noise is added to the observation data 𝐲p\mathbf{y}_{p} calculated using the code developed by Hisada (1995) in the frequency domain; the amplitude of additive noise for each component of observed data (each entry of 𝐲p\mathbf{y}_{p}) follows a normal distribution 𝒩(0,105)\mathcal{N}(0,10^{-5}). Figure 14 shows the power spectral density in the NS, EW, and UD directions of the true and additive noise signals. The true signals are obtained by the simulation using the true model parameters. Three observation sites out of 50 candidates are used for the parameter estimation. Hence, the observation sites numbered 1, 2, and 3 in Fig. 6 are employed in the proposed method. The experiments using 100 patterns of subsets of three randomly selected observation sites are conducted for comparison. Note that the results of the comparison between the proposed method and averaged results with random selection in the case of three observation sites are qualitatively the same regardless of the number of observation sites, and the superiority of the proposed method tends to increase as the number of observation sites decreases.

Refer to caption
Figure 14: Comparison of the power spectral density of the true and additive noise signal at the observation site with the shortest epicentral distance.

4.2 Results of twin experiment

The relation between the reconstruction error of the seismic wavefield and the iteration number of estimation [eq. (28)–(30)] is described in Fig. 15 in the case of three observation sites used for the model parameter estimation. The reconstruction error of the seismic wavefield was calculated as follows:

ϵ=𝐗true𝐗reconstF𝐗trueF,\displaystyle\epsilon=\frac{||\mathbf{X}^{\rm true}-\mathbf{X}^{\rm reconst}||_{\rm F}}{||\mathbf{X}^{\rm true}||_{\rm F}}, (33)

where the notation F\|\circ\|_{\mathrm{F}} is the Frobenius norm that is the matrix norm as the square root of the sum of the absolute squares of the matrix elements. The matrix 𝐗true\mathbf{X}^{\rm true} is the true data calculated using the true model parameters, and 𝐗reconst\mathbf{X}^{\rm reconst} is the reconstructed data using the estimated model parameters based on the observed signals at three observation sites. The values of the random selection shown in Fig. 15 are an average over the results using 100 patterns of subsets of three randomly selected observation sites. Figure 16 shows the reconstruction error distributions of the seismic wavefields using three observation sites selected by the proposed method and using three randomly selected observation sites, respectively. The reconstruction error presented in Fig. 16 is evaluated at each observation site when the number of iterations tt is 20. The values for case of the random selection are the result using one pattern of subsets of three randomly selected observation sites in Fig. 16. The observation sites selected using the proposed method and those randomly selected are numbered and marked with an asterisk in Fig. 16, respectively. Figure 17 shows the power spectral density of seismic acceleration in the NS, EW, and UD directions at two observation sites, excluding the selected observation sites by the proposed method and those randomly selected in semilog plots. The two observation sites are marked by (a) and (b) in Fig. 16, which correspond to the observation site near the center of the observation area and the one with the longest epicentral distance, respectively.

Refer to caption
Figure 15: Comparison of the reconstruction error using the three observation sites selected by the proposed method and using three randomly selected observation sites.
Refer to caption
Figure 16: The reconstruction error distributions with the model parameters estimated (left) using the three observation sites selected by the proposed method and (right) using three randomly selected observation sites.
Refer to caption
Figure 17: Comparison of the power spectral density reconstructed with the true parameters, reconstructed with the initial parameters, reconstructed with the parameters estimated using the three observation sites selected by the proposed method, and reconstructed with the parameters estimated using three randomly selected observation sites at two observation sites: (a) near the center of observation site area; (b) with the longest epicentral distance.
Refer to caption
Figure 18: Comparison of the error in the model parameters estimated using the three observation sites selected by the proposed method and using three randomly selected observation sites.

Figure 15 indicates that although the reconstruction error decreases as the number of iterations for optimisation of the model parameter estimation increases both in the results using the observation sites selected by the proposed method and in the results using the randomly selected observation sites, the reconstruction error can be minimized by the proposed method. In addition, Fig. 16 demonstrates that in the result using the randomly selected observation sites, although the reconstruction error is relatively small at and around the three selected observation sites, the reconstruction error is significant at the sites away from the three sites. On the other hand, the estimation using the proposed method can reduce the reconstruction error compared to that using the randomly selected observation sites at almost all observation sites. Figure 17 indicates that the PSD reconstructed with the parameters estimated using the observation sites selected by the proposed method matches the true values more accurately than that using the randomly selected observation sites, although the difference between the PSD reconstructed using the proposed method and that using random selection is unfortunately invisible in this range. Although these small differences are due to the assumption in the present experiment which gives the quite small discrepancies between the initial and true values of the model parameters in the present study, the estimation is somehow improved in this small range by using optimised sensor locations as previously shown in Fig. 15. The small but certain improvement in the estimation shows the selected locations are sensitive to the parameters of the model. The sensitive sensor location is expected to be also applicable to the parameter search in the practical nonlinear problem, which is left for the future study.

Figure 18 shows the relation between the estimation error of each model parameter and the iteration number of estimation (eq. (28)–(30)), where the estimation error of the model parameter was defined as the relative error between the estimated value and the true value. Although the estimation error converges to a constant value as the number of iterations increases both in the results using the observation sites selected by the proposed method and in the results using the randomly selected observation sites, the proposed method can reduce the estimation error of almost all parameters compared to that using the random selection. These results confirm that the optimised observation sites using the proposed method can reconstruct the seismic wavefield with high accuracy by estimating the model parameters accurately as expected. Note that the small differences are due to the assumption in the present experiment which gives the quite small discrepancies between the initial and true values of the model parameters in the present study as stated above.

Refer to caption
Figure 19: Reconstruction error with respect to the number of selected observation sites.

The relationship between the number of observation sites and the reconstruction error is described in Fig.19. Figure 19 demonstrates that the reconstruction error tends to decrease as the number of observation sites increases. The reconstruction error with the parameters estimated using the observation sites selected by the proposed methods asymptotically approaches a small value when more than three observation sites are used. On the other hand, the error using the random selection becomes as small as that using the proposed method when 15 or more observation sites are used. Therefore, the superiority of the proposed method is remarkable for the reconstruction using 10 or less observation sites in the numerical experiments of the present study. However, with regard to the reconstruction using only one or two observation sites, the reconstruction error using the selected observation sites by the proposed method is within the error bars of that using the randomly selected observation sites. This is because the proposed method first selects an observation site that contributes the most to the minimization of the determinant of the normalized parameter sensitivity matrix as described in eq. (23), and the strategy is not exactly consistent with selecting one that most reduces the reconstruction error of the seismic wavefield. On the other hand, as the number of observation sites increases, the proposed method can select a subset of observation sites suitable for accurate estimation for almost all parameters, since the determinant represents the average sensitivity of all parameters. Therefore, the superiority of the proposed method emerges when the number of observation sites is relatively small, within the range formulated in the combinatorial optimisation problem. The condition is between three and ten observation sites in the numerical experiments of the present study.

5 Conclusions

In the present study, we proposed an observation site selection method for the accurate reconstruction of the seismic wavefield by process-driven approaches.

First, the formulation of the proposed method was presented. The proposed method selects observation sites suitable for accurately estimating physical model parameters to be input into a numerical simulation of the seismic wavefield. In the step of the observation site selection, we assume that the relation between the model parameters to be estimated and waveform data at observation site candidates is described with a linear equation. Then, the coefficient matrix represents the sensitivity of the observation site candidates to the model parameters. Therefore, the proposed method evaluates the objective function based on the D-optimality criterion of the normalized parameter sensitivity matrix in order to select the observation sites with high sensitivity to the model parameters. In the step of the seismic wavefield reconstruction, the difference between the observed data and the calculated data using the simulation is evaluated at only observation sites selected by the proposed method. The values of the model parameters are repeatedly estimated until the residual error of the seismic wavefield approaches to sufficiently small. Then, the reconstructed seismic wavefield can be obtained using the numerical simulation with the estimated model parameters.

Secondly, the characteristics of observation site selection by the proposed method were investigated in numerical experiments assuming an earthquake that actually occurred in the Kanto basin and observation sites referred to the actual distribution of MeSO-net. In the present study, the seismic velocities (VPV_{P} and VSV_{S}) and the thickness (hh) of each of the three layers, and source location in three directions (SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}}) are employed as the target model parameters to be estimated. The normalized parameter sensitivity, which can evaluate the sensitivity to each parameter of each observation site, is introduced, and the proposed method is verified. With regard to the sensitivity to VPV_{P} and VSV_{S}, the magnitude relationship between layers and the spatial structure is characterized by the phase shift due to the difference in the velocities and the energy distributions of the P-and S-waves. With regard to the sensitivity to hh, the characteristics of that are determined by the sensitivity to VPV_{P} and VSV_{S}. With regard to the sensitivity to SNSS_{\mathrm{NS}}, SEWS_{\mathrm{EW}}, and SUDS_{\mathrm{UD}}, the factors that potentially affect the characteristics are the positional relationship between observation sites and hypocenter, and the energy distribution of the S-wave. Comparison of the normalized parameter sensitivity and the results of observation site selected by the proposed method show that the proposed method preferentially selects observation sites with high sensitivity to the parameters.

Finally, numerical experiment of the seismic wavefield reconstruction is conducted using synthetic observation data in order to verify the proposed method. The reconstruction error of the seismic wavefield can be reduced using the observed data at observation sites selected by the proposed method compared to that at randomly selected observation sites. In addition, the estimation error of the parameters in the proposed method are also lower than that in the random selection. These results confirm that the proposed method can reconstruct with high accuracy by estimating the model parameters accurately.

In addition to the superiority of the reconstruction error, the proposed method can prioritise the observation sites in terms of the contribution to the accurate seismic wavefield reconstruction. The proposed method enables to make a prioritised list of observation sites according to the area, source mechanisms and earthquake types in advance. The prioritised list contributes to the accurate and quick reconstruction of the seismic wavefield when an earthquake occurs. Furthermore, the proposed method has the potential to indicate the most appropriate configuration of seismometers when they are newly installed.

The sensitivity analysis brought the physical insight into factors that affect the sensitivity to the parameters such as the seismic velocity, layer thickness, and the hypocenter location. The results of the present study were shown to be useful not only for selecting effective observation sites but also for deeply understanding the seismic phenomena. On the other hand, some assumptions were adopted in the present study. For example, although a 1-D horizontally layered subsurface structure is assumed in the present study, the nonuniformity in the horizontal direction affects the seismic wavefield in a real world. In future research, sensitivity analysis and observation site selection will be investigated in more realistic situations. Furthermore, the applicability of the proposed method to real observed data will be investigated.

Acknowledgment

This work was supported by JST CREST (JPMJCR1763).

Data Availability

The code developed by Hisada (1995) for the numerical calculation of the theoretical waveforms is available through the website at http://kouzou.cc.kogakuin.ac.jp/Open/Green/. The data of earthquake mechanism information was provided by the National Research Institute for Earth Science and Disaster Resilience at https://www.fnet.bosai.go.jp/event/joho.php?LANG=en.

References

  • Aoi & Fujiwara (1999) Aoi, S. & Fujiwara, H., 1999. 3d finite-difference method using discontinuous grids, Bulletin of the Seismological Society of America, 89(4), 918–930.
  • Arnold & Curtis (2018) Arnold, R. & Curtis, A., 2018. Interrogation theory, Geophysical Journal International, 214(3), 1830–1846.
  • Atkinson et al. (2007) Atkinson, A., Donev, A., Tobias, R., et al., 2007. Optimum experimental designs, with SAS, vol. 34, Oxford University Press.
  • Boore (1972) Boore, D. M., 1972. Finite difference methods for seismic wave propagation in heterogeneous materials, Methods in computational physics, 11, 1–37.
  • Callaham et al. (2019) Callaham, J. L., Maeda, K., & Brunton, S. L., 2019. Robust flow reconstruction from limited measurements via sparse representation, Physical Review Fluids, 4(10), 103907.
  • Carter et al. (2021) Carter, D. W., De Voogt, F., Soares, R., & Ganapathisubramani, B., 2021. Data-driven sparse reconstruction of flow over a stalled aerofoil using experimental data, Data-Centric Engineering, 2.
  • Chaloner & Verdinelli (1995) Chaloner, K. & Verdinelli, I., 1995. Bayesian experimental design: A review, Statistical Science, 10(3), 273–304.
  • Clark et al. (2018) Clark, E., Askham, T., Brunton, S. L., & Kutz, J. N., 2018. Greedy sensor placement with cost constraints, IEEE Sensors Journal, 19(7), 2642–2656.
  • Clark et al. (2020a) Clark, E., Brunton, S. L., & Kutz, J. N., 2020a. Multi-fidelity sensor selection: Greedy algorithms to place cheap and expensive sensors with cost constraints, IEEE Sensors Journal, 21(1), 600–611.
  • Clark et al. (2020b) Clark, E., Kutz, J. N., & Brunton, S. L., 2020b. Sensor selection with cost constraints for dynamically relevant bases, IEEE Sensors Journal, 20(19), 11674–11687.
  • Du et al. (2014) Du, W., Xing, Z., Li, M., He, B., Chua, L. H. C., & Miao, H., 2014. Optimal sensor placement and measurement of wind for water quality studies in urban reservoirs, in IPSN-14 Proceedings of the 13th International Symposium on Information Processing in Sensor Networks, pp. 167–178, IEEE.
  • Earl & Deem (2005) Earl, D. J. & Deem, M. W., 2005. Parallel tempering: Theory, applications, and new perspectives, Physical Chemistry Chemical Physics, 7(23), 3910–3916.
  • Fujita et al. (2014) Fujita, K., Ichimura, T., Hori, M., Wijerathne, M., & Tanaka, S., 2014. A quick earthquake disaster estimation system with fast urban earthquake simulation and interactive visualization, Procedia Computer Science, 29, 866–876.
  • Fukami et al. (2021) Fukami, K., Maulik, R., Ramachandra, N., Fukagata, K., & Taira, K., 2021. Global field reconstruction from sparse sensors with voronoi tessellation-assisted deep learning, Nature Machine Intelligence, 3(11), 945–951.
  • Geyer (1991) Geyer, C. J., 1991. Markov chain monte carlo maximum likelihood.
  • Hardt & Scherbaum (1994) Hardt, M. & Scherbaum, F., 1994. The design of optimum networks for aftershock recordings, Geophysical Journal International, 117(3), 716–726.
  • Hirata (2009) Hirata, N., 2009. An outline of the special project for earthquake disaster mitigation in the tokyo metropolitan area-subproject i: characterization of the plate structure and source faults in and around the tokyo metropolitan area, Bull. Earthq. Res. Inst. Univ. Tokyo, 84, 41–56.
  • Hisada (1995) Hisada, Y., 1995. An efficient method for computing green’s functions for a layered half-space with sources and receivers at close depths (part 2), Bulletin of the Seismological Society of America, 85(4), 1080–1093.
  • Hisada & Bielak (2003) Hisada, Y. & Bielak, J., 2003. A theoretical method for computing near-fault ground motions in layered half-spaces considering static offset due to surface faulting, with a physical interpretation of fling step and rupture directivity, Bulletin of the Seismological Society of America, 93(3), 1154–1168.
  • Hukushima & Nemoto (1996) Hukushima, K. & Nemoto, K., 1996. Exchange monte carlo method and application to spin glass simulations, Journal of the Physical Society of Japan, 65(6), 1604–1608.
  • Ichimura et al. (2007) Ichimura, T., Hori, M., & Kuwamoto, H., 2007. Earthquake motion simulation with multiscale finite-element analysis on hybrid grid, Bulletin of the Seismological Society of America, 97(4), 1133–1143.
  • Inoba et al. (2022) Inoba, R., Uchida, K., Iwasaki, Y., Nagata, T., Ozawa, Y., Saito, Y., Nonomura, T., & Asai, K., 2022. Optimization of sparse sensor placement for estimation of wind direction and surface pressure distribution using time-averaged pressure-sensitive paint data on automobile model, Journal of Wind Engineering and Industrial Aerodynamics, 227, 105043.
  • Inoue et al. (2021) Inoue, T., Matsuda, Y., Ikami, T., Nonomura, T., Egami, Y., & Nagai, H., 2021. Data-driven approach for noise reduction in pressure-sensitive paint data based on modal expansion and time-series data at optimally placed points, Physics of Fluids, 33(7), 077105.
  • Inoue et al. (2023) Inoue, T., Ikami, T., Egami, Y., Nagai, H., Naganuma, Y., Kimura, K., & Matsuda, Y., 2023. Data-driven optimal sensor placement for high-dimensional system using annealing machine, Mechanical Systems and Signal Processing, 188, 109957.
  • Joshi & Boyd (2009) Joshi, S. & Boyd, S., 2009. Sensor selection via convex optimization, IEEE Transactions on Signal Processing, 57(2), 451–462.
  • Kameda & Morikawa (1992) Kameda, H. & Morikawa, H., 1992. An interpolating stochastic process for simulation of conditional random fields, Probabilistic Engineering Mechanics, 7(4), 243–254.
  • Kameda & Morikawa (1994) Kameda, H. & Morikawa, H., 1994. Conditioned stochastic processes for conditional random fields, Journal of engineering mechanics, 120(4), 855–875.
  • Kanda et al. (2021) Kanda, N., Nakai, K., Saito, Y., Nonomura, T., & Asai, K., 2021. Feasibility study on real-time observation of flow velocity field using sparse processing particle image velocimetry, Transactions of the Japan Society for Aeronautical and Space Sciences, 64(4), 242–245.
  • Kanda et al. (2022) Kanda, N., Abe, C., Goto, S., Yamada, K., Nakai, K., Saito, Y., Asai, K., & Nonomura, T., 2022. Proof-of-concept study of sparse processing particle image velocimetry for real time flow observation, Experiments in Fluids, 63(9), 143.
  • Kaneko et al. (2021) Kaneko, S., Ozawa, Y., Nakai, K., Saito, Y., Nonomura, T., Asai, K., & Ura, H., 2021. Data-driven sparse sampling for reconstruction of acoustic-wave characteristics used in aeroacoustic beamforming, Applied Sciences, 11(9), 4216.
  • Kano et al. (2017a) Kano, M., Nagao, H., Ishikawa, D., Ito, S.-i., Sakai, S., Nakagawa, S., Hori, M., & Hirata, N., 2017a. Seismic wavefield imaging based on the replica exchange monte carlo method, Geophysical Journal International, 208(1), 529–545.
  • Kano et al. (2017b) Kano, M., Nagao, H., Nagata, K., Ito, S.-i., Sakai, S., Nakagawa, S., Hori, M., & Hirata, N., 2017b. Seismic wavefield imaging of long-period ground motion in the tokyo metropolitan area, japan, Journal of Geophysical Research: Solid Earth, 122(7), 5435–5451.
  • Kawakami (1989) Kawakami, H., 1989. Simulation of space-time variation of earthquake ground motion including a recorded time history, Doboku Gakkai Ronbunshu, 1989(410), 435–443.
  • Kihara & Okada (1984) Kihara, M. & Okada, T., 1984. A satellite selection method and accuracy for the global positioning system, Navigation, 31(1), 8–20.
  • Kirkpatrick et al. (1983) Kirkpatrick, S., Gelatt Jr, C. D., & Vecchi, M. P., 1983. Optimization by simulated annealing, science, 220(4598), 671–680.
  • Koketsu et al. (2004) Koketsu, K., Fujiwara, H., & Ikegami, Y., 2004. Finite-element simulation of seismic ground motion with a voxel mesh, Pure and Applied Geophysics, 161(11-12).
  • Koketsu et al. (2011) Koketsu, K., Yokota, Y., Nishimura, N., Yagi, Y., Miyazaki, S., Satake, K., Fujii, Y., Miyake, H., Sakai, S., Yamanaka, Y., et al., 2011. A unified source model for the 2011 tohoku earthquake, Earth and Planetary Science Letters, 310(3-4), 480–487.
  • Kraft et al. (2013) Kraft, T., Mignan, A., & Giardini, D., 2013. Optimization of a large-scale microseismic monitoring network in northern switzerland, Geophysical Journal International, 195(1), 474–490.
  • Langston (2007a) Langston, C. A., 2007a. Spatial gradient analysis for linear seismic arrays, Bulletin of the Seismological Society of America, 97(1B), 265–280.
  • Langston (2007b) Langston, C. A., 2007b. Wave gradiometry in two dimensions, Bulletin of the Seismological Society of America, 97(2), 401–416.
  • Li et al. (2021a) Li, B., Liu, H., & Wang, R., 2021a. Data-driven sensor placement for efficient thermal field reconstruction, Science China Technological Sciences, 64(9), 1981–1994.
  • Li et al. (2021b) Li, B., Liu, H., & Wang, R., 2021b. Efficient sensor placement for signal reconstruction based on recursive methods, IEEE Transactions on Signal Processing, 69, 1885–1898.
  • Liang & Langston (2009) Liang, C. & Langston, C. A., 2009. Wave gradiometry for usarray: Rayleigh waves, Journal of Geophysical Research: Solid Earth, 114(B2).
  • Liu et al. (2016) Liu, S., Chepuri, S. P., Fardad, M., Maşazade, E., Leus, G., & Varshney, P. K., 2016. Sensor selection for estimation with correlated measurement noise, IEEE Transactions on Signal Processing, 64(13), 3509–3522.
  • Long et al. (2015) Long, Q., Motamed, M., & Tempone, R., 2015. Fast bayesian optimal experimental design for seismic source inversion, Computer Methods in Applied Mechanics and Engineering, 291, 123–145.
  • Maeda et al. (2016) Maeda, T., Nishida, K., Takagi, R., & Obara, K., 2016. Reconstruction of a 2d seismic wavefield by seismic gradiometry, Progress in Earth and Planetary Science, 3(1), 1–17.
  • Manohar et al. (2018) Manohar, K., Brunton, B. W., Kutz, J. N., & Brunton, S. L., 2018. Data-driven sparse sensor placement for reconstruction: Demonstrating the benefits of exploiting known patterns, IEEE Control Systems Magazine, 38(3), 63–86.
  • Manohar et al. (2019) Manohar, K., Kaiser, E., Brunton, S. L., & Kutz, J. N., 2019. Optimized sampling for multiscale dynamics, Multiscale Modeling & Simulation, 17(1), 117–136.
  • Manohar et al. (2021) Manohar, K., Kutz, J. N., & Brunton, S. L., 2021. Optimal sensor and actuator selection using balanced model reduction, IEEE Transactions on Automatic Control, 67(4), 2108–2115.
  • Muir & Zhan (2021) Muir, J. B. & Zhan, Z., 2021. Seismic wavefield reconstruction using a pre-conditioned wavelet–curvelet compressive sensing approach, Geophysical Journal International, 227(1), 303–315.
  • Muir & Zhan (2022) Muir, J. B. & Zhan, Z., 2022. Wavefield-based evaluation of das instrument response and array design, Geophysical Journal International, 229(1), 21–34.
  • Nagata et al. (2021) Nagata, T., Nonomura, T., Nakai, K., Yamada, K., Saito, Y., & Ono, S., 2021. Data-driven sparse sensor selection based on a-optimal design of experiment with admm, IEEE Sensors Journal, 21(13), 15248–15257.
  • Nagata et al. (2022a) Nagata, T., Nakai, K., Yamada, K., Saito, Y., Nonomura, T., Kano, M., Ito, S.-i., & Nagao, H., 2022a. Seismic wavefield reconstruction based on compressed sensing using data-driven reduced-order model, Geophysical Journal International, 322(1), 33–50.
  • Nagata et al. (2022b) Nagata, T., Yamada, K., Nonomura, T., Nakai, K., Saito, Y., & Ono, S., 2022b. Data-driven sensor selection method based on proximal optimization for high-dimensional data with correlated measurement noise, IEEE Transactions on Signal Processing, pp. 5251–5264.
  • Nagata et al. (2023) Nagata, T., Yamada, K., Nakai, K., Saito, Y., & Nonomura, T., 2023. Randomized group-greedy method for large-scale sensor selection problems, IEEE Sensors Journal.
  • Nakai et al. (2021) Nakai, K., Yamada, K., Nagata, T., Saito, Y., & Nonomura, T., 2021. Effect of objective function on data-driven greedy sparse sensor optimization, IEEE Access, 9, 46731–46743.
  • Nakai et al. (2022) Nakai, K., Sasaki, Y., Nagata, T., Yamada, K., Saito, Y., & Nonomura, T., 2022. Nondominated-solution-based multi-objective greedy sensor selection for optimal design of experiments, IEEE Transactions on Signal Processing, 70, 5694–5707.
  • Nonomura et al. (2021) Nonomura, T., Ono, S., Nakai, K., & Saito, Y., 2021. Randomized subspace newton convex method applied to data-driven sensor selection problem, IEEE Signal Processing Letters, 28, 284–288.
  • Phatak (2001) Phatak, M. S., 2001. Recursive method for optimum gps satellite selection, IEEE Transactions on Aerospace and Electronic Systems, 37(2), 751–754.
  • Pitarka (1999) Pitarka, A., 1999. 3d elastic finite-difference modeling of seismic motion using staggered grids with nonuniform spacing, Bulletin of the Seismological Society of America, 89(1), 54–68.
  • Saito et al. (2020) Saito, Y., Nonomura, T., Nankai, K., Yamada, K., Asai, K., Sasaki, Y., & Tsubakino, D., 2020. Data-driven vector-measurement-sensor selection based on greedy algorithm, IEEE Sensors Letters, 4.
  • Saito et al. (2021a) Saito, Y., Nonomura, T., Yamada, K., Nakai, K., Nagata, T., Asai, K., Sasaki, Y., & Tsubakino, D., 2021a. Determinant-based fast greedy sensor selection algorithm, IEEE Access, 9, 68535–68551.
  • Saito et al. (2021b) Saito, Y., Yamada, K., Kanda, N., Nakai, K., Nagata, T., Nonomura, T., & Asai, K., 2021b. Data-driven determinant-based greedy under/oversampling vector sensor placement, CMES-COMPUTER MODELING IN ENGINEERING & SCIENCES, 129(1), 1–30.
  • Sakai & Hirata (2009) Sakai, S. & Hirata, N., 2009. Distribution of the metropolitan seismic observation network, Bull. Earthq. Res. Inst. Univ. Tokyo, 84, 57–69.
  • Sato & Imabayashi (1999) Sato, T. & Imabayashi, H., 1999. Real time conditional simulation of earthquake ground motion, Earthquake Engineering and Engineering Seismology, 1(1), 27–38.
  • Schneider et al. (2018) Schneider, S., Thomas, C., Dokht, R. M., Gu, Y. J., & Chen, Y., 2018. Improvement of coda phase detectability and reconstruction of global seismic data using frequency–wavenumber methods, Geophysical Journal International, 212(2), 1288–1301.
  • Shamaiah et al. (2010) Shamaiah, M., Banerjee, S., & Vikalo, H., 2010. Greedy sensor selection: Leveraging submodularity, in 49th IEEE conference on decision and control (CDC), pp. 2572–2577, IEEE.
  • Shiina et al. (2021) Shiina, T., Maeda, T., Kano, M., Kato, A., & Hirata, N., 2021. An optimum 2d seismic-wavefield reconstruction in densely and nonuniformly distributed stations: The metropolitan seismic observation network in japan, Seismological Society of America, 92(3), 2015–2027.
  • Steinberg & Rabinowitz (2003) Steinberg, D. M. & Rabinowitz, N., 2003. Optimal seismic monitoring for event location with application to on site inspection of the comprehensive nuclear test ban treaty, Metrika, 58(1), 31–57.
  • Stuart (2010) Stuart, A. M., 2010. Inverse problems: A bayesian perspective, Acta Numerica, 19, 451–559.
  • Swendsen & Wang (1986) Swendsen, R. H. & Wang, J.-S., 1986. Replica monte carlo simulation of spin-glasses.
  • Tiwari et al. (2022) Tiwari, N., Uchida, K., Inoba, R., Saito, Y., Asai, K., & Nonomura, T., 2022. Simultaneous measurement of pressure and temperature on the same surface by sensitive paints using the sensor selection method, Experiments in Fluids, 63(11), 1–13.
  • Uciński (2020) Uciński, D., 2020. D-optimal sensor selection in the presence of correlated measurement noise, Measurement, 164, 107873.
  • Vanmarcke & Fenton (1991) Vanmarcke, E. H. & Fenton, G. A., 1991. Conditioned simulation of local fields of earthquake ground motion, Structural Safety, 10(1-3), 247–264.
  • Wald et al. (1999) Wald, D. J., Quitoriano, V., Heaton, T. H., Kanamori, H., Scrivner, C. W., & Worden, C. B., 1999. Trinet “shakemaps”: Rapid generation of peak ground motion and intensity maps for earthquakes in southern california, Earthquake Spectra, 15(3), 537–555.
  • Wald et al. (2005) Wald, D. J., Worden, B. C., Quitoriano, V., & Pankow, K. L., 2005. Shakemap manual: technical manual, user’s guide, and software guide, Tech. rep.
  • Worden & Burrows (2001) Worden, K. & Burrows, A., 2001. Optimal sensor placement for fault detection, Engineering structures, 23(8), 885–901.
  • Yamada et al. (2021) Yamada, K., Saito, Y., Nankai, K., Nonomura, T., Asai, K., & Tsubakino, D., 2021. Fast greedy optimization of sensor selection in measurement with correlated noise, Mechanical Systems and Signal Processing, 158, 107619.
  • Yamada et al. (2022) Yamada, K., Saito, Y., Nonomura, T., & Asai, K., 2022. Greedy sensor selection for weighted linear least squares estimation under correlated noise, IEEE Access, 10, 79356–79364.
  • Yeo et al. (2022) Yeo, W.-J., Taulu, S., & Kutz, J. N., 2022. Efficient magnetometer sensor array selection for signal reconstruction and brain source localization, arXiv preprint arXiv:2205.10925.
  • Yi et al. (2011) Yi, T.-H., Li, H.-N., & Gu, M., 2011. Optimal sensor placement for structural health monitoring based on multiple optimization strategies, The Structural Design of Tall and Special Buildings, 20(7), 881–900.
  • Zhan et al. (2018) Zhan, Z., Li, Q., & Huang, J., 2018. Application of wavefield compressive sensing in surface wave tomography, Geophysical Journal International, 213(3), 1731–1743.
  • Zhang & Curtis (2021) Zhang, X. & Curtis, A., 2021. Interrogating probabilistic inversion results for subsurface structural information, Geophysical Journal International, 229(2), 750–757.
  • Zhao et al. (2022) Zhao, X., Curtis, A., & Zhang, X., 2022. Interrogating subsurface structures using probabilistic tomography: An example assessing the volume of irish sea basins, Journal of Geophysical Research: Solid Earth, 127(4), e2022JB024098.