
DuRIN: A Deep-unfolded Sparse Seismic Reflectivity Inversion Network

Swapnil Mache, Praveen Kumar Pokala, Kusala Rajendran, and Chandra Sekhar Seelamantula. Equal contribution. Swapnil Mache is with the Centre of Excellence in Advanced Mechanics of Materials, Indian Institute of Science (IISc), Bangalore, India, and with the Department of Electrical Engineering, IISc. He was formerly with the Centre for Earth Sciences, IISc. E-mail: [email protected]. Praveen Kumar Pokala and Chandra Sekhar Seelamantula are with the Department of Electrical Engineering, IISc. E-mail: [email protected], [email protected]. Kusala Rajendran is with the Centre of Excellence in Advanced Mechanics of Materials, IISc. She is a retired Professor from the Centre for Earth Sciences, IISc. E-mail: [email protected].
Abstract

We consider the reflection seismology problem of recovering the locations of interfaces and the amplitudes of reflection coefficients from seismic data, which are vital for estimating the subsurface structure. The reflectivity inversion problem is typically solved using greedy algorithms and iterative techniques. The sparse Bayesian learning framework and, more recently, deep learning techniques have shown the potential of data-driven approaches in solving the problem. In this paper, we propose a weighted minimax-concave penalty-regularized reflectivity inversion formulation and solve it through a model-based neural network. The network is referred to as the deep-unfolded reflectivity inversion network (DuRIN). We demonstrate the efficacy of the proposed approach over the benchmark techniques by testing on synthetic 1-D seismic traces and 2-D wedge models and validating on the simulated 2-D Marmousi2 model and real data from the Penobscot 3D survey off the coast of Nova Scotia, Canada.

Index Terms:
Geophysics, inverse problems, seismology, seismic reflectivity inversion, geophysical signal processing, deep learning, neural networks, algorithm unrolling, nonconvex optimization, sparse recovery.

I Introduction

Reflectivity inversion is an important deconvolution problem in reflection seismology, helpful in characterizing the layered subsurface structure. The subsurface is modeled as having sparse reflectivity localized at the interfaces between two layers, assuming largely horizontal and parallel layers, each with constant impedance [1, 2], or in other words, a piecewise-constant impedance structure. Figure 1 shows a three-layer model of the subsurface, with a wet sandstone between two shale layers [3].

Figure 1: An ideal three-layer subsurface model. The operator $*$ denotes convolution.

The reflection coefficients $r_i$ (Fig. 1(c)) at the interface between two adjoining layers $i$ and $i+1$ (Fig. 1(a)) are related to the subsurface geology [1] through the relation:

$$r_i = \frac{\rho_{i+1}V_{i+1} - \rho_i V_i}{\rho_{i+1}V_{i+1} + \rho_i V_i}, \qquad (1)$$

where $\rho_i$ and $V_i$ are the density and P-wave velocity, respectively, of the $i^{\text{th}}$ layer. The product of density ($\rho$) and P-wave velocity ($V$) is known as the acoustic impedance (Fig. 1(b)).

Such a layered subsurface is often recovered through the widely used convolutional model (2), wherein the observed seismic data is modeled as the convolution between the source pulse $\boldsymbol{h}$ (Fig. 1(d)) and the Earth response or reflectivity $\boldsymbol{x} \in \mathbb{R}^n$ (Fig. 1(c)) [4, 5]. The observation at the surface is a noisy seismic trace $\boldsymbol{y} \in \mathbb{R}^n$ (Fig. 1(e)) given by:

$$\boldsymbol{y} = \boldsymbol{h} * \boldsymbol{x} + \boldsymbol{n}, \qquad (2)$$

where $\boldsymbol{n}$ is the measurement noise, $*$ denotes convolution, and $\boldsymbol{h}$ is assumed to be a Ricker wavelet. When the wavelet is known, the linear inverse problem (2) can be solved within the sparsity framework by modeling the reflectivity $\boldsymbol{x}$ as sparse, since the significant interfaces are what matter in the inversion. Further, the location of the interfaces (in other words, support recovery) is prioritized over amplitude recovery [5].
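As a concrete illustration, the following numpy sketch synthesizes a noisy trace from the forward model in (1)-(2); the layer densities, velocities, interface locations, and noise level are illustrative assumptions, not values from this paper.

```python
import numpy as np

def ricker(f0, dt, length=0.128):
    """Ricker wavelet with peak frequency f0 (Hz), sampled at dt (s)."""
    t = np.arange(-length / 2, length / 2, dt)
    a = (np.pi * f0 * t) ** 2
    return (1.0 - 2.0 * a) * np.exp(-a)

def reflectivity_from_impedance(z):
    """Reflection coefficients at layer interfaces, eq. (1)."""
    return (z[1:] - z[:-1]) / (z[1:] + z[:-1])

# Hypothetical three-layer model: shale / wet sandstone / shale.
rho = np.array([2.40, 2.25, 2.40])        # densities (g/cc), assumed
vp = np.array([3000.0, 2800.0, 3000.0])   # P-wave velocities (m/s), assumed
r = reflectivity_from_impedance(rho * vp)

x = np.zeros(300)                         # sparse reflectivity series
x[[100, 180]] = r                         # two interfaces, assumed locations

h = ricker(f0=30.0, dt=0.001)             # 30-Hz wavelet, 1-ms sampling
y = np.convolve(x, h, mode="same")        # eq. (2): y = h * x
y += 0.01 * np.random.randn(y.size)       # plus measurement noise n
```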

I-A Prior Art

We broadly classify the prior work related to reflectivity inversion into two categories, namely, optimization-based techniques and data-driven techniques.

I-A1 Optimization Techniques

The solution $\boldsymbol{x}$ in (2) can be recovered by minimizing the classical least-squares (LS) objective function:

$$\arg\min_{\boldsymbol{x}} \frac{1}{2}\left\|\boldsymbol{h} * \boldsymbol{x} - \boldsymbol{y}\right\|_2^2. \qquad (3)$$

However, the finite data length, the measurement noise, and the loss of low- and high-frequency information due to convolution of the reflectivity with a bandlimited wavelet [1, 6, 7] lead to non-unique solutions of the above problem [8]. This ill-posed inverse problem is tackled by employing a sparsity prior on the solution [9]. The $\ell_1$-norm regularization is widely preferred by virtue of its convexity [1, 4, 10, 11]. The $\ell_1$-regularized reflectivity inversion problem is posed as follows:

$$\arg\min_{\boldsymbol{x}} \frac{1}{2}\left\|\boldsymbol{h} * \boldsymbol{x} - \boldsymbol{y}\right\|_2^2 + \lambda\left\|\boldsymbol{x}\right\|_1, \qquad (4)$$

where $\lambda$ is the regularization parameter.

Zhang and Castagna [12] adopted basis-pursuit inversion (BPI) to solve the problem in (4), based on the basis-pursuit de-noising algorithm proposed by [13]. Algorithms such as the iterative shrinkage-thresholding algorithm (ISTA) [14] and its faster variant, FISTA [15], have been proposed to solve the $\ell_1$-norm-regularized deconvolution problem. Relevant to the problem under consideration, [16] adopted FISTA for reflectivity inversion. Further studies by [17] and [18] adopted FISTA along with debiasing steps of LS inversion and “adding back the residual”, respectively, to improve amplitude recovery after support estimation using FISTA.
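For reference, a minimal numpy sketch of FISTA applied to (4) is given below; the step size, iteration count, and the use of 'same'-mode convolution as the adjoint are simplifying assumptions.

```python
import numpy as np

def soft(x, tau):
    """Soft thresholding, the proximal operator of tau * ||.||_1."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def fista(y, h, lam, n_iter=500):
    """FISTA for the l1-regularized deconvolution problem in (4)."""
    conv = lambda a, b: np.convolve(a, b, mode="same")
    h_flip = h[::-1]                      # flipped wavelet acts as the adjoint
    eta = 1.0 / np.sum(h ** 2)            # step size, a 1/L surrogate
    x = np.zeros_like(y)
    z, t = x.copy(), 1.0
    for _ in range(n_iter):
        grad = conv(h_flip, conv(h, z) - y)          # gradient of the LS term
        x_new = soft(z - eta * grad, eta * lam)      # proximal (shrinkage) step
        t_new = 0.5 * (1.0 + np.sqrt(1.0 + 4.0 * t ** 2))
        z = x_new + ((t - 1.0) / t_new) * (x_new - x)   # momentum step
        x, t = x_new, t_new
    return x
```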

The sparsity of seismic data is difficult to estimate in practice, and the adequacy of the $\ell_1$ norm for the reflectivity inversion problem has not been established [8, 19]. Also, $\ell_1$-norm regularization results in a biased estimate of $\boldsymbol{x}$ [20, 21, 22]. In their application of FISTA to the seismic reflectivity inversion problem, [17] and [18] observed an attenuation of reflection coefficient magnitudes. They adopted post-processing debiasing steps [23] to tackle the bias introduced by the $\ell_1$-norm regularization. Nonconvex regularizers such as the minimax concave penalty (MCP) [24] have been shown to overcome the shortcomings of $\ell_1$ regularization in inverse problems. The advantages of adopting nonconvex regularizers over $\ell_1$ have been demonstrated in sparse recovery problems [22, 24, 25]. Particularly pertinent to this discussion is the data-driven $\ell_q$-norm regularization $(0<q<1)$ for seismic reflectivity inversion proposed by [19], wherein the optimal $q$ was chosen based on the input data.

I-A2 Data-driven Methods

Recent works have employed data-driven approaches for solving inverse problems in seismology and geophysics [26, 8, 3, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48]. Further, learning-based approaches have also been explored for seismic reflectivity inversion (or reflectivity model building) [3, 28, 34, 43]. Bergen et al. [27] underscored the importance of leveraging the data and model-driven aspects of inverse problems in solid Earth geoscience while incorporating machine learning (ML) into the loop. In a comprehensive review of deep learning approaches for inverse problems in seismology and geophysics, Adler et al. [28] assessed learning-based approaches that enhanced the performance of full waveform inversion [29, 30, 31, 32] and end-to-end seismic inversion [33, 34, 35, 36, 37, 38, 39, 40, 41, 42]. They also reviewed recent solutions based on physics-guided architectures [43, 44] and deep generative modeling [45, 46, 47, 48] that are still at a nascent stage in the context of seismic inversion.

The application of sparse Bayesian learning (SBL) to reflectivity inversion has also been explored [26, 8], where the sparse reflection coefficients are recovered by maximizing the marginal likelihood. This is done through a sequential algorithm-based approach (SBL-SA) to update the sparsity-controlling hyperparameters [26], or through the expectation-maximization algorithm (SBL-EM) [8, 49]. The application of neural networks to inversion problems in geophysics [28], particularly reflectivity inversion, is a recent development [3, 34]. Kim and Nakata [34] used an elementary feedforward neural network to recover the sparse reflectivity. They obtained superior support recovery but inferior amplitude recovery using the neural network compared with a least-squares approach. Further, [3] discussed the suitability of machine learning approaches such as feedforward neural networks [34]. He observed that data-driven techniques outperform conventional deconvolution approaches when knowledge of the underlying geology is limited, while also providing a computational advantage. However, in applications pertaining to geoscience, and specifically geophysics, model interpretability is critical, as one aims to gain physical insights into the system under consideration [27]. Insights into an inverse problem, in addition to savings in computational time, can be gained through deep neural network architectures that are informed by the sparse linear inverse problem itself [27].

To that end, [50] proposed a new class of data-driven frameworks based on unfolding iterative algorithms into neural network architectures [51]. They proposed learned ISTA (LISTA), obtained by unrolling the update steps of ISTA [14] into the layers of a feedforward neural network. Subsequent studies have demonstrated the efficacy of this class of model-based deep learning architectures in compressed sensing and sparse-recovery applications [52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65]. Monga et al. [51] provided a review of algorithm unrolling in the context of imaging, vision and recognition, speech processing, and other signal and image processing problems. Deep unfolding combines the advantages of both data-driven and iterative techniques.

I-B Motivation and Contribution

The limitations of $\ell_1$ regularization [20, 21, 22], in addition to the challenge of estimating the sparsity of seismic reflection data [8, 19], can be overcome through nonconvex regularization [19]. However, the corresponding nonconvex cost suffers from local minima, and optimization becomes quite challenging. The hyperparameters of most iterative techniques are set heuristically, which is likely to yield suboptimal solutions. Machine learning approaches outperform conventional techniques in reflectivity inversion [3], especially in support recovery [34], which is given higher priority [5]. Deep-unrolled architectures [51], which belong to a class of model-based neural networks, are not dissociated from the underlying physics of the problem, a risk inherent to elementary data-driven approaches such as feedforward neural networks [3, 34, 27]. These factors motivate us to employ a data-driven nonconvex regularization strategy and solve the reflectivity inversion problem through deep-unrolled architectures.

The contributions of this paper are summarized as follows.

  1. We construct an optimization cost for sparse seismic reflectivity inversion based on the weighted counterpart of the minimax-concave penalty (MCP) [24], where each component of the sparse reflectivity $\boldsymbol{x}$ is associated with a different weight.

  2. The resulting optimization algorithm, the iterative firm-thresholding algorithm (IFTA) [58], is unfolded into the deep-unfolded reflectivity inversion network (DuRIN). To the best of our knowledge, model-based architectures have not been explored for solving the seismic reflectivity inversion problem. In fact, deep unrolling has not been employed for solving seismic inverse problems [28].

  3. The efficacy of the proposed formulation with respect to the state of the art is demonstrated on synthetic 1-D and 2-D data and on simulated as well as real datasets.

II Weighted-MCP-Regularized Reflectivity Inversion

In this study, we use the weighted counterpart of the nonconvex minimax-concave penalty (MCP) [24] defined as:

$$g_{\boldsymbol{\gamma}}(\boldsymbol{x};\boldsymbol{\mu}) = \sum_{i=1}^{n} g_{\gamma_i}(x_i;\mu_i), \qquad (5)$$

where $\boldsymbol{x} = [x_1, x_2, x_3, \ldots, x_n]$, $\boldsymbol{\mu} = [\mu_1, \mu_2, \mu_3, \ldots, \mu_n]$, $\boldsymbol{\gamma} = [\gamma_1, \gamma_2, \gamma_3, \ldots, \gamma_n]$, and

$$g_{\gamma_i}(x_i;\mu_i) = \begin{cases} \mu_i\left|x_i\right| - \dfrac{\left|x_i\right|^2}{2\gamma_i}, & \text{for } \left|x_i\right| \leq \gamma_i\mu_i, \\[4pt] \dfrac{\mu_i^2\gamma_i}{2}, & \text{for } \left|x_i\right| \geq \gamma_i\mu_i, \end{cases} \qquad (6)$$

for all $i \in [\![1,n]\!]$. The trainable parameters of $g_{\boldsymbol{\gamma}}(\boldsymbol{x};\boldsymbol{\mu})$, namely, $\boldsymbol{\mu} \in \mathbb{R}^n_{>0}$ and $\boldsymbol{\gamma} \in \mathbb{R}^n_{>1}$, are learned in a data-driven setting. $\mathbb{R}^n_{>t}$ denotes the set of vectors in $\mathbb{R}^n$ with entries greater than $t$.
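A minimal numpy sketch of this penalty, under the stated constraints $\mu_i > 0$ and $\gamma_i > 1$, is:

```python
import numpy as np

def weighted_mcp(x, mu, gamma):
    """Weighted minimax-concave penalty, eqs. (5)-(6), summed over components.

    mu and gamma are vectors of per-component weights with mu > 0, gamma > 1.
    """
    ax = np.abs(x)
    inner = mu * ax - ax ** 2 / (2.0 * gamma)  # branch for |x_i| <= gamma_i * mu_i
    outer = 0.5 * gamma * mu ** 2              # constant branch beyond the knee
    return np.sum(np.where(ax <= gamma * mu, inner, outer))
```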

Figure 2: The minimax-concave penalty corresponding to $\gamma=2$, $\gamma=3$, and $\gamma=100$.
Figure 3: The proximal operators corresponding to the MCP for $\gamma=2$, $\gamma=3$, and $\gamma=100$.

II-A Problem Formulation

Formulating the optimization problem using the weighted MCP regularizer defined above leads to the objective function:

$$\arg\min_{\boldsymbol{x}} \Big\{ J(\boldsymbol{x}) = \underbrace{\frac{1}{2}\left\|\boldsymbol{h} * \boldsymbol{x} - \boldsymbol{y}\right\|_2^2}_{F(\boldsymbol{x})} + g_{\boldsymbol{\gamma}}(\boldsymbol{x};\boldsymbol{\mu}) \Big\}. \qquad (7)$$

The objective function $J$, the data-fidelity term $F$, and the regularizer $g$ satisfy the following properties, which are essential for minimizing $J$:

  P1. $F: \mathbb{R}^n \rightarrow (-\infty, \infty]$ is proper, closed, and $L$-smooth, i.e., $\|\nabla F(\boldsymbol{x}) - \nabla F(\boldsymbol{y})\|_2 \leq L\|\boldsymbol{x} - \boldsymbol{y}\|_2$.

  P2. $g$ is lower semi-continuous.

  P3. $J$ is bounded from below, i.e., $\inf J > -\infty$.

Next, we solve the problem stated in (7) through the majorization-minimization (MM) approach [66], which results in the iterative firm-thresholding algorithm (IFTA) [58]. Unfolding the iterations of IFTA yields a learnable network called the deep-unfolded reflectivity inversion network (DuRIN), whose architecture is similar to that of FirmNet [58].

II-B IFTA: Iterative firm-thresholding algorithm

Due to P1, there exists $\eta < 1/L$ such that $F(\boldsymbol{x})$ is upper-bounded locally by a quadratic expansion about $\boldsymbol{x} = \boldsymbol{x}^{(k)}$ as:

$$F(\boldsymbol{x}) \leq Q^{(k)}(\boldsymbol{x}), \qquad (8)$$

where $Q^{(k)}(\boldsymbol{x}) = F(\boldsymbol{x}^{(k)}) + (\boldsymbol{x} - \boldsymbol{x}^{(k)})^{\intercal}\nabla F(\boldsymbol{x}^{(k)}) + \frac{1}{2\eta}\left\|\boldsymbol{x} - \boldsymbol{x}^{(k)}\right\|_2^2$. The majorizer of the objective function $J$ at $\boldsymbol{x}^{(k)}$ is given by:

$$H^{(k)}(\boldsymbol{x}) = Q^{(k)}(\boldsymbol{x}) + g_{\boldsymbol{\gamma}}(\boldsymbol{x};\boldsymbol{\mu}), \qquad (9)$$

such that $H^{(k)}(\boldsymbol{x}) \geq J(\boldsymbol{x})$. The update equation for minimizing $H^{(k)}(\boldsymbol{x})$ to obtain $\boldsymbol{x}^{(k+1)}$ is given by:

$$\boldsymbol{x}^{(k+1)} = \arg\min_{\boldsymbol{x}} \frac{1}{2}\left\|\boldsymbol{x}^{(k)} + \eta\,\boldsymbol{h}' * (\boldsymbol{y} - \boldsymbol{h} * \boldsymbol{x}^{(k)}) - \boldsymbol{x}\right\|_2^2 + \eta\, g_{\boldsymbol{\gamma}}(\boldsymbol{x};\boldsymbol{\mu}). \qquad (10)$$

The proximal operator of the penalty function $g_{\gamma_i}(x_i;\mu_i)$ is the firm-thresholding operator [67], given by

$$\mathcal{G}_{\gamma_i}^{g}(x_i;\mu_i) = \begin{cases} 0, & \text{for } \left|x_i\right| \leq \mu_i, \\[4pt] \text{sgn}(x_i)\,\dfrac{\gamma_i}{\gamma_i - 1}\left(\left|x_i\right| - \mu_i\right), & \text{for } \mu_i < \left|x_i\right| \leq \gamma_i\mu_i, \\[4pt] x_i, & \text{for } \left|x_i\right| > \gamma_i\mu_i. \end{cases} \qquad (11)$$

The update step at the $(k+1)^{\text{th}}$ iteration is given by

$$\boldsymbol{x}^{(k+1)} = \mathcal{G}_{\boldsymbol{\gamma}}^{g}\left(\boldsymbol{x}^{(k)} + \frac{\eta}{2}\,\boldsymbol{h}' * \left(\boldsymbol{y} - \boldsymbol{h} * \boldsymbol{x}^{(k)}\right)\right), \qquad (12)$$

where $\boldsymbol{h}'$ is the flipped version of $\boldsymbol{h}$. Algorithm 1 lists the steps involved in solving the optimization problem (7). The update in (12) involves convolutions followed by a nonlinear activation (the firm threshold in this case), and can therefore be represented as a layer in a neural network.

Input: $\eta = 1/\|\boldsymbol{h}\|_2^2$, $k_{\max}$, data $\boldsymbol{y}$, wavelet $\boldsymbol{h}$, learnable parameters $\boldsymbol{\mu}$, $\boldsymbol{\gamma}$
Initialization: $\boldsymbol{x}^{(0)} = \boldsymbol{0}$
while $k < k_{\max}$ do
       $\boldsymbol{z}^{(k+1)} = \boldsymbol{x}^{(k)} + \frac{\eta}{2}\left(\boldsymbol{h}' * (\boldsymbol{y} - \boldsymbol{h} * \boldsymbol{x}^{(k)})\right)$
       $\boldsymbol{x}^{(k+1)} = \mathcal{G}_{\boldsymbol{\gamma}}^{g}\left(\boldsymbol{z}^{(k+1)}; \boldsymbol{\mu}\right)$
       $k = k + 1$
Output: $\hat{\boldsymbol{x}} \leftarrow \boldsymbol{x}^{(k)}$
Algorithm 1 IFTA: Iterative Firm-Thresholding Algorithm
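A minimal numpy sketch of the firm-thresholding operator (11) and the IFTA loop of Algorithm 1 follows; fixed thresholds and 'same'-mode convolution are simplifying assumptions made here.

```python
import numpy as np

def firm(x, mu, gamma):
    """Firm-thresholding operator, eq. (11), applied element-wise (gamma > 1)."""
    ax = np.abs(x)
    mid = np.sign(x) * (gamma / (gamma - 1.0)) * (ax - mu)
    return np.where(ax <= mu, 0.0, np.where(ax <= gamma * mu, mid, x))

def ifta(y, h, mu, gamma, k_max=100):
    """IFTA (Algorithm 1): gradient step on F, then firm thresholding."""
    conv = lambda a, b: np.convolve(a, b, mode="same")
    h_flip = h[::-1]                     # flipped wavelet h'
    eta = 1.0 / np.sum(h ** 2)           # eta = 1 / ||h||_2^2
    x = np.zeros_like(y)
    for _ in range(k_max):
        z = x + 0.5 * eta * conv(h_flip, y - conv(h, x))   # inner term of (12)
        x = firm(z, mu, gamma)
    return x
```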

II-C DuRIN: Deep-unfolded Reflectivity Inversion Network

As mentioned in the previous section, the update in IFTA (12) can be interpreted as a layer in a neural network. Hence, we unroll the iterations of the algorithm into the layers of a neural network to solve the problem given in (7). The proposed deep-unrolled architecture for sparse seismic reflectivity inversion is called the deep-unfolded reflectivity inversion network (DuRIN), which resembles FirmNet [58]. The input-output relation for each layer in DuRIN is given by

$$\boldsymbol{x}^{(k+1)} = \mathcal{G}_{\boldsymbol{\gamma}}^{g}\left(\mathbf{W}\boldsymbol{y} + \mathbf{S}\boldsymbol{x}^{(k)}\right), \qquad (13)$$

where $\mathbf{W} = \eta\,\boldsymbol{h}'$ and $\mathbf{S} = \mathbf{I} - \eta\,\boldsymbol{h}'\boldsymbol{h}$ [50] are initialized as Toeplitz matrices, and are dense and unstructured in the learning stage. The parameters to be learned are the matrices $\mathbf{W}$ and $\mathbf{S}$ and the thresholds $(\boldsymbol{\mu}, \boldsymbol{\gamma})$, subject to $\boldsymbol{\mu} > \boldsymbol{0}$, $\boldsymbol{\gamma} > \boldsymbol{1}$, where $\boldsymbol{0}$ is the null vector and $\boldsymbol{1}$ is the vector of all ones. For training, we employ the smooth $\ell_1$ loss, $\text{Smooth } \ell_1 = \beta\left\|\boldsymbol{x} - \hat{\boldsymbol{x}}\right\|_1 + (1-\beta)\left\|\boldsymbol{x} - \hat{\boldsymbol{x}}\right\|_2^2$, $0 < \beta \leq 1$, as the training cost, computed between the true reflectivity $\boldsymbol{x}$ and the prediction $\hat{\boldsymbol{x}}$. In our experimental setup, $\beta = 1$, and it is not a trainable parameter.
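A PyTorch sketch of this unfolded architecture is given below. The softplus reparameterization used here to keep $\boldsymbol{\mu} > \boldsymbol{0}$ and $\boldsymbol{\gamma} > \boldsymbol{1}$ is one possible choice and an assumption on our part; the paper only states the constraints. With $\beta = 1$, the training cost reduces to torch.nn.L1Loss between the prediction and the true reflectivity.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DuRIN(nn.Module):
    """Sketch of the deep-unfolded reflectivity inversion network, eq. (13)."""

    def __init__(self, H, n_layers=6):
        super().__init__()
        n = H.shape[1]
        eta = 1.0 / torch.linalg.matrix_norm(H, ord=2) ** 2
        # W and S initialized from the wavelet (Toeplitz) matrix H [50];
        # both are learned as dense, unstructured matrices thereafter.
        self.W = nn.Parameter(eta * H.T.clone())
        self.S = nn.Parameter(torch.eye(n) - eta * (H.T @ H))
        # Unconstrained parameters mapped to mu > 0 and gamma > 1 via softplus.
        self.mu_raw = nn.Parameter(torch.full((n_layers, n), -4.0))
        self.gamma_raw = nn.Parameter(torch.zeros(n_layers, n))
        self.n_layers = n_layers

    @staticmethod
    def firm(x, mu, gamma):
        """Firm threshold, eq. (11); gamma -> inf recovers soft thresholding."""
        ax = x.abs()
        mid = x.sign() * (gamma / (gamma - 1.0)) * (ax - mu)
        return torch.where(ax <= mu, torch.zeros_like(x),
                           torch.where(ax <= gamma * mu, mid, x))

    def forward(self, y):                  # y: (batch, n) seismic traces
        mu = F.softplus(self.mu_raw)
        gamma = 1.0 + F.softplus(self.gamma_raw)
        x = self.firm(y @ self.W.T, mu[0], gamma[0])
        for k in range(1, self.n_layers):
            x = self.firm(y @ self.W.T + x @ self.S.T, mu[k], gamma[k])
        return x
```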

To improve the amplitude recovery of DuRIN during inference, we re-estimate the amplitudes of the estimated sparse vector $\hat{\boldsymbol{x}}$ over the support given by DuRIN as $\hat{\boldsymbol{x}}_{\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}} = \mathbf{H}_{\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}}^{\dagger}\boldsymbol{y}$, where $\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}$ is the support, i.e., the non-zero locations, of $\hat{\boldsymbol{x}}$, and $\mathbf{H}_{\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}}^{\dagger}$ is the pseudo-inverse of the Toeplitz matrix $\mathbf{H}$ of the kernel $\boldsymbol{h}$ restricted to the support $\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}$. The steps of DuRIN for $k_{\max}$ layers are listed in Algorithm 2. Figure 4 illustrates the unfolded architecture of DuRIN.
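This debiasing step amounts to a least-squares fit over the estimated support, sketched below in numpy; the 'same'-size Toeplitz construction and the support tolerance are assumptions.

```python
import numpy as np

def conv_matrix(h, n):
    """Toeplitz matrix H such that H @ x matches np.convolve(x, h, 'same')."""
    H = np.zeros((n, n))
    c = h.size // 2
    for j in range(n):
        lo, hi = max(0, j - c), min(n, j - c + h.size)
        H[lo:hi, j] = h[lo - (j - c): hi - (j - c)]
    return H

def debias(x_hat, y, h, tol=1e-8):
    """Re-estimate amplitudes over the support of x_hat by least squares."""
    H = conv_matrix(h, x_hat.size)
    support = np.flatnonzero(np.abs(x_hat) > tol)
    amp, *_ = np.linalg.lstsq(H[:, support], y, rcond=None)
    x_deb = np.zeros_like(x_hat)
    x_deb[support] = amp                 # pseudo-inverse solution on the support
    return x_deb
```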

Figure 4: Schematic of (a) one iteration of the iterative firm-thresholding algorithm (IFTA) [58], and (b) two layers of the deep-unfolded reflectivity inversion network (DuRIN), which corresponds to FirmNet [58]. A layer in the network (b) corresponds to an iteration in (a). The operator $\oplus$ represents the element-wise sum.

We consider two variants of DuRIN, depending on the values of $\gamma_i$, $i \in [\![1,n]\!]$. The generalized variant of DuRIN based on IFTA [58], where the $\gamma_i$ differ across components and layers, is referred to as DuRIN-1, and corresponds to FirmNet [58]. As $\gamma_i \to \infty$, the firm-thresholding function approaches the soft-thresholding function [22]. We call this variant DuRIN-2, which resembles LISTA [50].

Input: $\boldsymbol{y}$, $\eta = 1/\|\mathbf{H}\|_2^2$, $k_{\max}$, $\boldsymbol{\mu}$, $\boldsymbol{\gamma}$
Initialization: $\mathbf{W}$, $\mathbf{S}$, $\boldsymbol{x}^{(0)} = \mathcal{G}_{\boldsymbol{\gamma}}^{g}(\mathbf{W}\boldsymbol{y})$
while $k < k_{\max}$ do
       $\boldsymbol{c}^{(k+1)} = \mathbf{W}\boldsymbol{y} + \mathbf{S}\boldsymbol{x}^{(k)}$
       $\boldsymbol{x}^{(k+1)} = \mathcal{G}_{\boldsymbol{\gamma}}^{g}(\boldsymbol{c}^{(k+1)})$
       $k = k + 1$
$\hat{\boldsymbol{x}}_{\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}} = \mathbf{H}_{\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}}^{\dagger}\boldsymbol{y}$
Output: $\hat{\boldsymbol{x}} \leftarrow \boldsymbol{x}^{(k)}$
Algorithm 2 DuRIN: Deep-unfolded Reflectivity Inversion Network

III Experimental Results

We present experimental results for the proposed networks, namely, DuRIN-1 and DuRIN-2, on synthetic 1-D traces and 2-D wedge models [68], the simulated 2-D Marmousi2 model [69], and real 3-D field data from the Penobscot 3D survey off the coast of Nova Scotia, Canada [70]. We evaluate the performance of DuRIN-1 and DuRIN-2 in comparison with the benchmark techniques, namely, basis-pursuit inversion (BPI) [13, 12], the fast iterative shrinkage-thresholding algorithm (FISTA) [15, 16], and expectation-maximization-based sparse Bayesian learning (SBL-EM) [49, 8]. To quantify the performance, we employ the objective metrics listed in the following section.

III-A Objective Metrics

We evaluate the results using statistical parameters that measure the amplitude and support recovery between the ground-truth sparse vector $\boldsymbol{x}$ and the predicted sparse vector $\hat{\boldsymbol{x}}$.

  1. Correlation Coefficient (CC) [71]:

$$\text{CC} = \frac{\langle\boldsymbol{x},\hat{\boldsymbol{x}}\rangle - \langle\boldsymbol{x},\boldsymbol{1}\rangle\langle\hat{\boldsymbol{x}},\boldsymbol{1}\rangle}{\sqrt{\left(\|\boldsymbol{x}\|_2^2 - \langle\boldsymbol{x},\boldsymbol{1}\rangle^2\right)\left(\|\hat{\boldsymbol{x}}\|_2^2 - \langle\hat{\boldsymbol{x}},\boldsymbol{1}\rangle^2\right)}},$$

where $\langle\cdot,\cdot\rangle$ denotes the inner product, and $\boldsymbol{1}$ denotes a vector of all ones.

  2. Relative Reconstruction Error (RRE) and Signal-to-Reconstruction Error Ratio (SRER), defined as follows:

$$\text{RRE} = \frac{\|\hat{\boldsymbol{x}} - \boldsymbol{x}\|_2^2}{\|\boldsymbol{x}\|_2^2}, \quad \text{SRER} = 10\log_{10}\left(\frac{\|\boldsymbol{x}\|_2^2}{\|\hat{\boldsymbol{x}} - \boldsymbol{x}\|_2^2}\right)~\text{dB}.$$

  3. Probability of Error in Support (PES):

$$\text{PES} = \frac{\max\left(|\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}|, |\mathcal{S}_{\boldsymbol{x}}|\right) - |\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}} \cap \mathcal{S}_{\boldsymbol{x}}|}{\max\left(|\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}|, |\mathcal{S}_{\boldsymbol{x}}|\right)},$$

where $|\cdot|$ denotes the cardinality of the argument, and $\mathcal{S}_{\boldsymbol{x}}$ and $\hat{\mathcal{S}}_{\hat{\boldsymbol{x}}}$ denote the supports of $\boldsymbol{x}$ and $\hat{\boldsymbol{x}}$, respectively.
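These four metrics translate directly into a short numpy routine, implementing the formulas as stated above; the tolerance used to define the non-zero support locations is an assumption.

```python
import numpy as np

def metrics(x, x_hat, tol=1e-8):
    """CC, RRE, SRER (dB), and PES between the true x and the estimate x_hat."""
    cc_num = x @ x_hat - x.sum() * x_hat.sum()
    cc_den = np.sqrt((x @ x - x.sum() ** 2) * (x_hat @ x_hat - x_hat.sum() ** 2))
    cc = cc_num / cc_den
    err = np.sum((x_hat - x) ** 2)
    rre = err / np.sum(x ** 2)
    srer = 10.0 * np.log10(np.sum(x ** 2) / err)
    s, s_hat = np.abs(x) > tol, np.abs(x_hat) > tol      # supports
    m = max(s.sum(), s_hat.sum())
    pes = (m - np.sum(s & s_hat)) / m
    return cc, rre, srer, pes
```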

III-B Training Phase

The generation or acquisition of training data that appropriately represents observed seismic data is crucial for the application of a data-driven approach to the reflectivity inversion problem [34]. We generate synthetic training data as $5\times10^5$ seismic traces, each of length 300 samples, obtained by the convolution of 1-D reflectivity profiles with a Ricker wavelet. The reflectivity profiles consist of 200 samples each, and are padded with zeros before convolution such that their length equals that of the seismic traces [3]. Amplitudes of the reflection coefficients range from $-1.0$ to $1.0$, and the sparsity factor ($k$), i.e., the ratio of the number of non-zero elements to the total number of elements in a trace, is set to 0.05. The locations of the reflection coefficients are picked uniformly at random, with no constraint on the minimum spacing between two reflectors/spikes; in other words, the reflectors may be separated by as little as one sampling interval. The reflectivity values, chosen randomly from the defined range of $-1.0$ to $1.0$, are assigned to the selected locations. The sampling interval, the increment in the amplitudes of the reflection coefficients, and the wavelet frequency vary based on the dataset.
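A numpy sketch of this trace synthesis follows, reusing the ricker() helper from the earlier sketch; a reduced trace count and a fixed 30-Hz, 1-ms setup stand in for the dataset-dependent choices described above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_traces = 1000                            # the paper uses 5e5 training traces
n_ref, n_trace, sparsity = 200, 300, 0.05
n_spikes = int(sparsity * n_ref)           # k = 0.05 -> 10 spikes per profile
h = ricker(f0=30.0, dt=0.001)              # helper from the earlier sketch

X, Y = np.zeros((n_traces, n_trace)), np.zeros((n_traces, n_trace))
for i in range(n_traces):
    x = np.zeros(n_ref)
    loc = rng.choice(n_ref, size=n_spikes, replace=False)  # uniform locations,
    x[loc] = rng.uniform(-1.0, 1.0, size=n_spikes)         # no spacing constraint
    x = np.pad(x, (0, n_trace - n_ref))    # zero-pad to the trace length [3]
    y = np.convolve(x, h, mode="same")
    noise = rng.standard_normal(n_trace)   # scale noise for a 20 dB input SNR
    y += noise * np.linalg.norm(y) / (np.linalg.norm(noise) * 10.0)
    X[i], Y[i] = x, y
```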

The initial hyperparameters and the optimal number of layers in the network also vary depending on the parameters of the dataset, such as the sampling interval and the wavelet frequency. We initially consider six layers for both DuRIN-1 and DuRIN-2 for training and keep appending five layers at a time to arrive at the optimal number of layers. For the training phase, we set the batch size to 200, use the ADAM optimizer [72], and set the learning rate to $1\times10^{-4}$, with an input measurement signal-to-noise ratio (SNR) of 20 dB to ensure the robustness of the proposed models against noise in the testing data [34]. The proposed networks, DuRIN-1 and DuRIN-2, are trained on 1-D data, and operate on 1-D, 2-D, and 3-D data trace-by-trace. In the following sections, we provide experimental results on synthetic, simulated, and real data.

III-C Testing Phase: Synthetic Data

III-C1 Synthetic 1-D Traces

We validated the performance of DuRIN-1 and DuRIN-2 on 1000 realizations of synthetic 1-D traces, with a 30-Hz Ricker wavelet, 1-ms sampling interval, and amplitude increment 0.2. Table I compares the objective metrics of the proposed networks and the benchmark techniques. DuRIN-1 and DuRIN-2 recover the amplitudes with higher accuracy than the compared methods, quantified by the objective metrics CC, RRE, and SRER, with considerably superior support recovery indicated by PES. Figure 5 shows the corresponding illustration for a sample 1-D seismic trace out of the 1000 test realizations. The figure illustrates two main advantages of the proposed networks: both DuRIN-1 and DuRIN-2 resolve the closely-spaced reflectors right after 200 ms on the time axis in Figures 5(e) and (f), and they do not introduce spurious supports, unlike the benchmark techniques.

TABLE I: Objective metrics averaged over 1000 test realizations of synthetic 1-D seismic traces. The proposed DuRIN-1 and DuRIN-2 are superior in both amplitude and support recovery. The best performance is highlighted in boldface, while the second best is underlined.
Method | CC | RRE | SRER | PES
BPI | 0.6611 | 0.5723 | 3.4148 | 0.9697
FISTA | 0.6734 | 0.5489 | 3.5554 | 0.8309
SBL-EM | 0.6561 | 0.6611 | $\mathbf{4.2666}$ | 0.9697
DuRIN-1 | $\mathbf{0.7259}$ | $\mathbf{0.4638}$ | 4.0212 | $\underline{0.5309}$
DuRIN-2 | $\underline{0.7237}$ | $\underline{0.4660}$ | $\underline{4.0230}$ | $\mathbf{0.5308}$
Figure 5: Sample results for recovered reflectivity from a synthetic 1-D seismic trace out of the 1000 test realizations reported in Table I. (a) The true seismic trace (TRUESEIS) and reflectivity (TRUEREF); (b)-(d) recovered 1-D reflectivity profiles for the benchmark techniques show that they fail to distinguish between closely-spaced spikes right after 200 ms on the time axis, and introduce spurious supports; (e)-(f) DuRIN-1 and DuRIN-2, on the other hand, distinguish between the closely-spaced spikes and exhibit superior support recovery, without introducing any spurious, low-amplitude supports.

Table II gives a comparison of the computational testing time over 100, 200, 500, and 1000 test realizations of synthetic 1-D traces. DuRIN-1 and DuRIN-2 require lower computational time than the benchmark techniques, over two orders of magnitude lower than FISTA, the next best technique after our proposed networks. Lower computation times are significant for reflection seismic processing, where the analyzed datasets are large. We note that the lower computational testing times for DuRIN-1 and DuRIN-2 come at the expense of longer training times, where the models are trained on a large number of synthetic seismic traces.

TABLE II: Computational testing time (s) for 100, 200, 500, and 1000 test realizations of synthetic 1-D seismic traces. DuRIN-1 and DuRIN-2 require significantly lower compute time during the inference phase as compared with the benchmark techniques. R = number of realizations.
Method | R = 100 | R = 200 | R = 500 | R = 1000
BPI | 34.9307 | 70.5120 | 171.6517 | 375.6346
SBL-EM | 202.5606 | 400.1070 | 947.3621 | 1959.4553
FISTA | 7.1995 | 13.1076 | 34.1289 | 64.9389
DuRIN-1 | $\mathbf{0.0220}$ | $\underline{0.0467}$ | $\underline{0.0804}$ | $\underline{0.1434}$
DuRIN-2 | $\underline{0.0276}$ | $\mathbf{0.0365}$ | $\mathbf{0.0732}$ | $\mathbf{0.1095}$

Figure 6 illustrates the computational advantage of the proposed DuRIN-1 and DuRIN-2 over the benchmark techniques in terms of layers/iterations. The six-layer DuRIN-1 and DuRIN-2 models outperform the benchmark techniques in terms of the four objective metrics considered in this study, and their performance is further enhanced by appending five layers at a time, up to 26 layers. We observe that the performance starts deteriorating at 31 layers (Figure 6). We hypothesize that this deterioration could be attributed to the increase in the number of network parameters while keeping the size of the training dataset fixed, leading to overfitting. Future work could explore whether increasing the number of training examples allows us to append more layers and enhance the networks' performance further rather than diminish it.

Figure 6: Objective metrics vs. number of iterations/layers for the five methods considered in the study. The stems represent layers (6, 11, 16, 21, 26, and 31) for DuRIN-1 and DuRIN-2, and the lines represent iterations for BPI, FISTA, and SBL-EM. For (a) CC and (c) SRER, higher values indicate superior performance, while for (b) RRE and (d) PES, lower values indicate better performance.

III-C2 Synthetic 2-D Wedge Models

We use synthetic wedge models to test the resolving capability of the proposed networks on thin beds [68]. Wedge models usually consist of two interfaces, one horizontal and another inclined, with the wedge thinning to 0 ms. Depending on whether the polarities of the reflection coefficients of the two interfaces are the same or opposite, we have even or odd wedge models, respectively. Further, based on the polarity being negative or positive, we obtain four possible combinations of polarities of the reflection coefficients of the wedge models (NP, PN, NN, PP). In our experimental setup, we consider wedge models with 26 seismic traces, with the separation between the two interfaces reducing from 50 ms to 0 ms in 2 ms increments, and amplitudes of the reflection coefficients set to $\pm 0.5$.
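A minimal numpy sketch of this wedge geometry, constructing the reflectivity section trace by trace (the total sample count and the top-interface position are assumptions), is:

```python
import numpy as np

def wedge_reflectivity(n_traces=26, top=50, amp=0.5, odd=True):
    """Reflectivity section for a wedge model on a 2-ms sampling grid.

    Trace j has interface separation (n_traces - 1 - j) samples, i.e.,
    50 ms down to 0 ms in 2 ms steps. odd=True gives the NP model.
    """
    R = np.zeros((300, n_traces))                  # 300 samples/trace, assumed
    for j in range(n_traces):
        sep = n_traces - 1 - j
        R[top, j] = -amp                           # top, horizontal interface (N)
        R[top + sep, j] += amp if odd else -amp    # bottom, inclined interface
    return R
```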

Table III and Figure 7 show the performance of the proposed networks in comparison with the benchmark techniques for an odd wedge model (NP), with a negative reflection coefficient on the upper horizontal interface (N) and a positive one on the lower inclined interface (P). DuRIN-1 and DuRIN-2 outperform the other methods in terms of both amplitude and support recovery metrics (Table III). As highlighted in Figure 7, BPI, FISTA, and SBL-EM fail to resolve the reflectors below 5 m wedge thickness, which is below the tuning thickness of 13 ms, corresponding to 6.5 m in our experimental setup, for a 30 Hz wavelet frequency [73]. This low resolving capability also contributes to the poorer (higher) PES scores for these techniques (Table III). On the other hand, the proposed DuRIN-1 and DuRIN-2 maintain the lateral continuity of both interfaces even below the tuning thickness.

TABLE III: Metrics for a synthetic 2-D odd (NP) wedge model. DuRIN-1 and DuRIN-2 outperform the benchmark techniques in terms of the amplitude and support recovery metrics considered. The best performance is highlighted in boldface. The second best scores are underlined.
Method | CC | RRE | SRER | PES
BPI | 0.8110 | 0.2693 | 12.4474 | 0.9933
FISTA | 0.8108 | 0.2583 | 20.5075 | 0.8846
SBL-EM | 0.8848 | 0.1446 | 34.1678 | 0.9933
DuRIN-1 | $\mathbf{0.9875}$ | $\mathbf{0.0622}$ | $\underline{35.7206}$ | $\underline{0.1795}$
DuRIN-2 | $\underline{0.9774}$ | $\underline{0.0734}$ | $\mathbf{37.6269}$ | $\mathbf{0.1026}$
Figure 7: Results for a synthetic 2-D odd wedge model. (a) True reflectivity; (b)-(d) recovered reflectivity profiles from the benchmark techniques show a divergence from the ground truth at wedge thickness < 5 m, highlighted by the rectangle in black; (e)-(f) reflectivity profiles recovered by DuRIN-1 and DuRIN-2 show a better resolution of reflectors below a wedge thickness of 5 m.

Table IV and Figure 8 show the results for an odd wedge model (PN) with positive polarity of the reflection coefficients on the upper, horizontal interface (P) and negative polarity on the lower, inclined interface (N). Table IV shows the proposed DuRIN-1 and DuRIN-2 outperforming the benchmark techniques, namely, BPI, FISTA, and SBL-EM, in terms of the amplitude and support recovery metrics considered in this study. Similar to the NP wedge model, and as highlighted in Figure 8, the benchmark techniques do not accurately resolve the reflectors below a wedge thickness of 5 m, whereas DuRIN-1 and DuRIN-2 preserve the lateral continuity of both interfaces of the wedge model below the tuning thickness.

TABLE IV: Results for a synthetic 2-D odd (PN) wedge model. The proposed networks, namely, DuRIN-1 and DuRIN-2, are superior in both amplitude and support recovery accuracy.
Method | CC | RRE | SRER | PES
BPI | 0.8369 | 0.2265 | 12.5753 | 0.9933
FISTA | 0.8387 | 0.2123 | 21.5266 | 0.8832
SBL-EM | 0.8469 | 0.2106 | 32.4996 | 0.9933
DuRIN-1 | $\mathbf{0.9774}$ | $\mathbf{0.0795}$ | $\underline{34.4034}$ | $\underline{0.1923}$
DuRIN-2 | $\underline{0.9577}$ | $\underline{0.1074}$ | $\mathbf{36.0748}$ | $\mathbf{0.1256}$
Figure 8: Results for recovered reflectivity from a synthetic 2-D odd wedge model (PN). (a) True reflectivity; (b)-(d) recovered reflectivity signatures from the benchmark methods show a distinct divergence from the ground-truth reflectivity (a) at wedge thickness < 5 m, highlighted by the rectangle in black; (e)-(f) recovered reflectivity profiles from the proposed networks, namely, DuRIN-1 and DuRIN-2, resolve the reflectors below a wedge thickness of 5 m.

For the even wedge models, i.e., NN (Table V and Figure 9) and PP (Table VI and Figure 10), all techniques exhibit comparable performance in terms of the amplitude metrics CC and RRE, and resolve reflectors below the tuning thickness without any divergence from the ground-truth reflector locations. The proposed networks, namely, DuRIN-1 and DuRIN-2, outperform the benchmark techniques in SRER and PES scores.

TABLE V: Results for an even wedge model (NN). DuRIN-1 and DuRIN-2 are superior in terms of SRER and PES.
Method | CC | RRE | SRER | PES
BPI | 0.9215 | 0.1733 | 15.1795 | 0.9933
FISTA | 0.9572 | $\mathbf{0.0916}$ | 22.1490 | 0.8643
SBL-EM | $\mathbf{0.9616}$ | $\underline{0.1120}$ | 38.1103 | 0.9933
DuRIN-1 | 0.9235 | 0.1568 | $\mathbf{38.8751}$ | $\underline{0.1603}$
DuRIN-2 | $\underline{0.9599}$ | 0.1163 | $\underline{38.2828}$ | $\mathbf{0.1346}$
Figure 9: Results for recovered reflectivity from a synthetic 2-D even wedge model (NN). (a) True reflectivity; (b)-(f) recovered reflectivity profiles. The proposed networks and the benchmark techniques resolve the even wedge model well for wedge thickness > 3 m. DuRIN-1 (e), along with SBL-EM (d), also resolves the two reflectors of trace 3 (corresponding to a wedge thickness of 2 m).
TABLE VI: Results for an even wedge model (PP). While FISTA has better CC and RRE scores, DuRIN-1 and DuRIN-2 outperform the benchmark techniques in terms of SRER and PES.
Method | CC | RRE | SRER | PES
BPI | 0.9303 | 0.1524 | 15.5630 | 0.9933
FISTA | $\mathbf{0.9548}$ | $\mathbf{0.0905}$ | 23.1071 | 0.8572
SBL-EM | 0.9232 | 0.1882 | 36.2027 | 0.9933
DuRIN-1 | 0.9238 | 0.1635 | $\mathbf{36.9497}$ | $\underline{0.1923}$
DuRIN-2 | $\underline{0.9295}$ | $\underline{0.1447}$ | $\underline{36.8566}$ | $\mathbf{0.1410}$
Figure 10: Results for an even wedge model (PP). (a) True; and (b)-(f) recovered reflectivity. BPI, FISTA, and DuRIN-1 resolve reflectors for wedge thickness > 3 m, while SBL-EM and DuRIN-2 also resolve reflectors at 2 m.

III-D Testing Phase: Marmousi2 Model

The Marmousi2 model [69] (width $\times$ depth: 17 km $\times$ 3.5 km) is an expansion of the original Marmousi model [74] (width $\times$ depth: 9.2 km $\times$ 3 km). The model provides a simulated 2-D collection of seismic traces, and is widely used in reflection seismic processing for the calibration of techniques in structurally complex settings. The model has a 2 ms sampling interval, with traces at an interval of 12.5 m. We obtained the ground-truth reflectivity profiles ($\boldsymbol{x}$) from the P-wave velocity and density models via (1), and convolved them with a 30 Hz Ricker wavelet to generate the input data ($\boldsymbol{y}$), with a measurement SNR of 20 dB to test the robustness of the proposed networks under noisy conditions.

TABLE VII: Results for a portion of the Marmousi2 model, corresponding to Table VIII, without muting low-amplitude spikes. DuRIN-1 and DuRIN-2 have higher CC and lower RRE. The lower PES scores for BPI and SBL-EM are due to the spurious supports introduced by these techniques coinciding with the low-amplitude spikes. The best performance is highlighted in boldface; the second best is underlined.
Method | CC | RRE | SRER | PES
BPI | 0.9714 | 0.0474 | 22.2817 | $\mathbf{0.1751}$
FISTA | 0.9716 | 0.0466 | $\underline{23.1565}$ | $\underline{0.8879}$
SBL-EM | 0.9808 | 0.0305 | $\mathbf{26.3986}$ | $\mathbf{0.1751}$
DuRIN-1 | $\mathbf{0.9857}$ | $\mathbf{0.0243}$ | 22.5103 | 0.9716
DuRIN-2 | $\underline{0.9820}$ | $\underline{0.0302}$ | 22.1734 | 0.9702
TABLE VIII: Objective metrics for a portion of the Marmousi2 model [69]. DuRIN-1 and DuRIN-2 exhibit superior amplitude recovery compared with the benchmark techniques in terms of CC and RRE, and outperform them in support recovery in terms of PES.
Method | CC | RRE | SRER | PES
BPI | 0.9717 | 0.0465 | 23.4547 | 0.9778
FISTA | 0.9720 | 0.0456 | $\underline{25.1384}$ | 0.7762
SBL-EM | 0.9811 | 0.0300 | $\mathbf{30.5194}$ | 0.9778
DuRIN-1 | $\mathbf{0.9860}$ | $\mathbf{0.0238}$ | 24.5692 | $\mathbf{0.2175}$
DuRIN-2 | $\underline{0.9822}$ | $\underline{0.0298}$ | 23.8769 | $\underline{0.2438}$

Here, we present results from a region of the Marmousi2 model that contains a gas-charged sand channel [69]. Table VII compares the performance of the proposed DuRIN-1 and DuRIN-2 with that of the benchmark techniques for the portion containing the gas-charged sand channel. The Marmousi2 model [69] has a large number of low-amplitude reflections. Additionally, in Figure 5, we observe that BPI and SBL-EM introduce spurious reflections. We attribute the low PES scores observed for BPI and SBL-EM in Table VII to such spurious supports coinciding with the low-amplitude reflections in the Marmousi2 model. We re-evaluate the objective metrics after muting reflections with amplitudes $< 1\%$ of the absolute maximum amplitude; the results are reported in Table VIII and Figure 11. The re-evaluation did not affect the CC, RRE, and SRER scores adversely, but, as the PES is computed over significant reflections, it is higher (worse) for BPI and SBL-EM, and lower (better) for DuRIN-1 and DuRIN-2. The objective metrics reported in Table VIII show the superior amplitude and support recovery accuracy of the proposed networks over the benchmark methods. Figure 11 shows that DuRIN-1 and DuRIN-2, along with SBL-EM, preserve the lateral continuity of interfaces, highlighted in the insets at the edge of the sand channel. The insets also show that BPI and FISTA introduce spurious false interfaces due to the interference of multiple events. DuRIN-1 and DuRIN-2 show limited recovery of the low-amplitude reflection coefficients right below the gas-charged sand channel (Figures 11(f) and (g)), which could be investigated in future work.

Figure 11: Results for a portion of the Marmousi2 synthetic model, corresponding to Table VIII and marked by a red rectangle in (a), the P-wave velocity profile for the complete model; (b) the true reflectivity; (c)-(g) recovered 2-D reflectivity profiles for the methods compared in this study, with inset plots showing zoomed-in portions of the selected region marked by black rectangles. The insets show that (c) BPI and (d) FISTA introduce spurious reflection coefficients around the true interfaces, while (e) SBL-EM, (f) DuRIN-1, and (g) DuRIN-2 preserve the lateral continuity better.

III-E Testing Phase: Real Data

To validate the proposed networks on real data, we use a 3-D seismic volume from the Penobscot 3D survey off the coast of Nova Scotia, Canada [70]. We present results from a smaller region of the 3-D volume, with 201 Inlines (1150-1350) and 121 Xlines (1040-1160), a region that includes the two wells of this survey (wells L-30 and B-41 [75]). Along the time axis, the region contains 800 samples in the time interval of 0 to 3196 ms, with a 4 ms sampling interval, and a 25 Hz Ricker wavelet fits the data well [75].

Figure 12 shows the observed seismic data and recovered reflectivity profiles for Xline 1155 of the survey, with the seismic and reflectivity profiles computed from the sonic logs of well L-30 [75] overlaid in black. The inverted reflectivity profiles for BPI and FISTA are smooth and lack detail. On the other hand, the predicted reflectivity profiles generated using SBL-EM, DuRIN-1, and DuRIN-2 provide more details for characterizing the subsurface. Additionally, DuRIN-1 and DuRIN-2 also resolve closely-spaced interfaces better and preserve the lateral continuity, evident from the interfaces around 1.1 s on the time axis (Figures 12(e) and (f)). These results demonstrate the capability of the proposed networks, namely, DuRIN-1 and DuRIN-2, on real data from the field. We note that the models are trained on synthesized seismic traces before being tested on the real data.

Figure 12: (a) Observed seismic data; (b)-(f) predicted 2-D reflectivity profiles for the region marked by a black rectangle in (a), for Xline 1155 of the Penobscot 3D survey [70]. The overlaid waveforms in black are the seismic (in (a)) and reflectivity (in (b)-(f)) profiles, respectively, computed from the sonic logs of well L-30 [75]. BPI (b) and FISTA (c) provide a smooth profile without many details, while SBL-EM (d), DuRIN-1 (e), and DuRIN-2 (f) generate detailed reflectivity profiles.

IV Conclusions

We developed a nonconvex optimization cost for reflectivity inversion based on a weighted counterpart of the minimax concave penalty. We also developed the deep-unfolded reflectivity inversion network (DuRIN) for solving the problem under consideration. Experimental validation on synthetic, simulated, and real data demonstrated the superior resolving capability of DuRIN, especially in support recovery (represented by PES), which is crucial for characterizing the subsurface. DuRIN also preserves the lateral continuity of interfaces despite being a single-trace approach. However, the trade-off between the number of network layers and the number of training examples, and the suboptimal recovery of low-amplitude reflection coefficients, require further attention. One could also consider data-driven prior learning [27] based on a multi-penalty formulation [64] for solving the reflectivity inversion problem.

Acknowledgment

This work is supported by the Ministry of Earth Sciences, Government of India; Centre of Excellence in Advanced Mechanics of Materials, Indian Institute of Science (IISc), Bangalore; and Science and Engineering Research Board (SERB), India. The authors would like to thank Jishnu Sadasivan for fruitful technical discussions.

References

  • [1] D. W. Oldenburg, T. Scheuer, and S. Levy, “Recovery of the acoustic impedance from reflection seismograms,” Geophysics, vol. 48, no. 10, pp. 1318–1337, 1983.
  • [2] Ö. Yilmaz, Seismic Data Analysis: Processing, Inversion, and Interpretation of Seismic Data.   Society of Exploration Geophysicists, 2001.
  • [3] B. Russell, “Machine learning and geophysical inversion — A numerical study,” The Leading Edge, vol. 38, no. 7, pp. 512–519, 2019.
  • [4] H. L. Taylor, S. C. Banks, and J. F. McCoy, “Deconvolution with the $\ell_1$ norm,” Geophysics, vol. 44, no. 1, pp. 39–52, 1979.
  • [5] P. M. Shearer, Introduction to Seismology.   Cambridge Univ. Press, 2009.
  • [6] A. J. Berkhout, “Least-squares inverse filtering and wavelet deconvolution,” Geophysics, vol. 42, no. 7, pp. 1369–1383, 1977.
  • [7] H. W. J. Debeye and P. Van Riel, “$\ell_p$-norm deconvolution,” Geophysical Prospecting, vol. 38, no. 4, pp. 381–403, 1990.
  • [8] C. Yuan and M. Su, “Seismic spectral sparse reflectivity inversion based on SBL-EM: experimental analysis and application,” J. Geophysics and Engineering, vol. 16, no. 6, pp. 1124–1138, 2019.
  • [9] A. Tarantola, Inverse Problem Theory and Methods for Model Parameter Estimation.   Society for Industrial and Applied Mathematics, 2005.
  • [10] L. Liu and W. Lu, “A fast $\ell_1$ linear estimator and its application on predictive deconvolution,” IEEE Geoscience and Remote Sensing Lett., vol. 12, no. 5, pp. 1056–1060, 2014.
  • [11] G. Zhang and J. Gao, “Inversion-driven attenuation compensation using synchrosqueezing transform,” IEEE Geoscience and Remote Sensing Lett., vol. 15, no. 1, pp. 132–136, 2017.
  • [12] R. Zhang and J. Castagna, “Seismic sparse-layer reflectivity inversion using basis pursuit decomposition,” Geophysics, vol. 76, no. 6, pp. R147–R158, 2011.
  • [13] S. S. Chen, D. L. Donoho, and M. A. Saunders, “Atomic decomposition by basis pursuit,” SIAM Review, vol. 43, no. 1, pp. 129–159, 2001.
  • [14] I. Daubechies, M. Defrise, and C. De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Communications on Pure and Applied Mathematics, vol. 57, no. 11, pp. 1413–1457, 2004.
  • [15] A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sciences, vol. 2, no. 1, pp. 183–202, 2009.
  • [16] D. O. Pérez, D. R. Velis, and M. D. Sacchi, “Inversion of prestack seismic data using FISTA,” Mecánica Computacional, vol. 31, no. 20, pp. 3255–3263, 2012.
  • [17] D. O. Pérez, D. R. Velis, and M. D. Sacchi, “High-resolution prestack seismic inversion using a hybrid FISTA least-squares strategy,” Geophysics, vol. 78, no. 5, pp. R185–R195, 2013.
  • [18] C. Li, X. Liu, K. Yu, X. Wang, and F. Zhang, “Debiasing of seismic reflectivity inversion using basis pursuit de-noising algorithm,” J. Applied Geophysics, vol. 177, p. 104028, 2020.
  • [19] F. Li, R. Xie, W.-Z. Song, and H. Chen, “Optimal seismic reflectivity inversion: Data-driven $\ell_p$-loss-$\ell_q$-regularization sparse regression,” IEEE Geoscience and Remote Sensing Lett., vol. 16, no. 5, pp. 806–810, 2019.
  • [20] E. J. Candès, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted $\ell_1$ minimization,” J. Fourier Analysis and Applications, vol. 14, no. 5-6, pp. 877–905, 2008.
  • [21] T. Zhang, “Analysis of multi-stage convex relaxation for sparse regularization,” J. Machine Learning Research, vol. 11, no. 3, 2010.
  • [22] I. Selesnick, “Sparse regularization via convex analysis,” IEEE Trans. Signal Process., vol. 65, no. 17, pp. 4481–4494, 2017.
  • [23] S. J. Wright, R. D. Nowak, and M. A. Figueiredo, “Sparse reconstruction by separable approximation,” IEEE Trans. Signal Process., vol. 57, no. 7, pp. 2479–2493, 2009.
  • [24] C.-H. Zhang, “Nearly unbiased variable selection under minimax concave penalty,” The Annals of Statistics, vol. 38, no. 2, pp. 894–942, 2010.
  • [25] J. Woodworth and R. Chartrand, “Compressed sensing recovery via nonconvex shrinkage penalties,” Inverse Problems, vol. 32, no. 7, p. 075004, 2016.
  • [26] S. Yuan and S. Wang, “Spectral sparse Bayesian learning reflectivity inversion,” Geophysical Prospecting, vol. 61, no. 4, pp. 735–746, 2013.
  • [27] K. J. Bergen, P. A. Johnson, M. V. de Hoop, and G. C. Beroza, “Machine learning for data-driven discovery in solid earth geoscience,” Science, vol. 363, no. 6433, 2019.
  • [28] A. Adler, M. Araya-Polo, and T. Poggio, “Deep learning for seismic inverse problems: Toward the acceleration of geophysical analysis workflows,” IEEE Signal Process. Mag., vol. 38, no. 2, pp. 89–119, 2021.
  • [29] W. Lewis and D. Vigh, “Deep learning prior models from seismic images for full-waveform inversion,” Proc. SEG Technical Program Expanded Abstracts, pp. 1512–1517, 2017.
  • [30] A. Richardson, “Seismic full-waveform inversion using deep learning tools and techniques,” 2018, arXiv:1801.07232 [Online].
  • [31] O. Ovcharenko, V. Kazei, M. Kalita, D. Peter, and T. Alkhalifah, “Deep learning for low-frequency extrapolation from multioffset seismic data,” Geophysics, vol. 84, no. 6, pp. R989–R1001, 2019.
  • [32] H. Sun and L. Demanet, “Extrapolated full-waveform inversion with deep learning,” Geophysics, vol. 85, no. 3, pp. R275–R288, 2020.
  • [33] M. Araya-Polo, J. Jennings, A. Adler, and T. Dahlke, “Deep-learning tomography,” The Leading Edge, vol. 37, no. 1, pp. 58–66, 2018.
  • [34] Y. Kim and N. Nakata, “Geophysical inversion versus machine learning in inverse problems,” The Leading Edge, vol. 37, no. 12, pp. 894–901, 2018.
  • [35] W. Wang, F. Yang, and J. Ma, “Velocity model building with a modified fully convolutional network,” Proc. SEG Technical Program Expanded Abstracts, pp. 2086–2090, 2018.
  • [36] Y. Wu, Y. Lin, and Z. Zhou, “Inversionnet: Accurate and efficient seismic waveform inversion with convolutional neural networks,” Proc. SEG Technical Program Expanded Abstracts 2018, pp. 2096–2100, 2018.
  • [37] F. Yang and J. Ma, “Deep-learning inversion: A next-generation seismic velocity model building method,” Geophysics, vol. 84, no. 4, pp. R583–R599, 2019.
  • [38] A. Adler, M. Araya-Polo, and T. Poggio, “Deep recurrent architectures for seismic tomography,” Proc. 81st EAGE Conference and Exhibition, vol. 2019, no. 1, pp. 1–5, 2019.
  • [39] V. Das, A. Pollack, U. Wollner, and T. Mukerji, “Convolutional neural network for seismic impedance inversion,” Geophysics, vol. 84, no. 6, pp. R869–R880, 2019.
  • [40] M. J. Park and M. D. Sacchi, “Automatic velocity analysis using convolutional neural network and transfer learning,” Geophysics, vol. 85, no. 1, pp. V33–V43, 2020.
  • [41] B. Wu, D. Meng, L. Wang, N. Liu, and Y. Wang, “Seismic impedance inversion using fully convolutional residual network and transfer learning,” IEEE Geoscience and Remote Sensing Lett., vol. 17, no. 12, pp. 2140–2144, 2020.
  • [42] W. Wang and J. Ma, “Velocity model building in a crosswell acquisition geometry with image-trained artificial neural networks,” Geophysics, vol. 85, no. 2, pp. U31–U46, 2020.
  • [43] R. Biswas, M. K. Sen, V. Das, and T. Mukerji, “Prestack and poststack inversion using a physics-guided convolutional neural network,” Interpretation, vol. 7, no. 3, pp. SE161–SE174, 2019.
  • [44] M. Alfarraj and G. AlRegib, “Semi-supervised learning for acoustic impedance inversion,” Proc. SEG Technical Program Expanded Abstracts, pp. 2298–2302, 2019.
  • [45] M. Araya-Polo, S. Farris, and M. Florez, “Deep learning-driven velocity model building workflow,” The Leading Edge, vol. 38, no. 11, pp. 872a1–872a9, 2019.
  • [46] Y. Wang, Q. Ge, W. Lu, and X. Yan, “Seismic impedance inversion based on cycle-consistent generative adversarial network,” Proc. SEG Technical Program Expanded Abstracts, pp. 2498–2502, 2019.
  • [47] L. Mosser, O. Dubrule, and M. J. Blunt, “Stochastic seismic waveform inversion using generative adversarial networks as a geological prior,” Mathematical Geosciences, vol. 52, no. 1, pp. 53–79, 2020.
  • [48] Z. Zhang and Y. Lin, “Data-driven seismic waveform inversion: A study on the robustness and generalization,” IEEE Trans. on Geoscience and Remote sensing, vol. 58, no. 10, pp. 6900–6913, 2020.
  • [49] D. P. Wipf and B. D. Rao, “Sparse Bayesian learning for basis selection,” IEEE Trans. Signal Process., vol. 52, no. 8, pp. 2153–2164, 2004.
  • [50] K. Gregor and Y. LeCun, “Learning fast approximations of sparse coding,” Proc. 27th Intl. Conf. on Machine Learning, pp. 399–406, 2010.
  • [51] V. Monga, Y. Li, and Y. C. Eldar, “Algorithm unrolling: Interpretable, efficient deep learning for signal and image processing,” IEEE Signal Process. Mag., vol. 38, no. 2, pp. 18–44, 2021.
  • [52] D. Mahapatra, S. Mukherjee, and C. S. Seelamantula, “Deep sparse coding using optimized linear expansion of thresholds,” 2017, arXiv:1705.07290 [Online].
  • [53] S. Mukherjee, D. Mahapatra, and C. S. Seelamantula, “DNNs for sparse coding and dictionary learning,” Proc. NIPS Bayesian Deep Learning Workshop, 2017.
  • [54] J. Zhang and B. Ghanem, “ISTA-Net: Interpretable optimization-inspired deep network for image compressive sensing,” Proc. IEEE Conf. on Computer Vision and Pattern Recognition, pp. 1828–1837, 2018.
  • [55] M. Borgerding, P. Schniter, and S. Rangan, “AMP-inspired deep networks for sparse linear inverse problems,” IEEE Trans. Signal Process., vol. 65, no. 16, pp. 4293–4308, 2017.
  • [56] H. Sreter and R. Giryes, “Learned convolutional sparse coding,” Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Process., pp. 2191–2195, 2018.
  • [57] R. Liu, S. Cheng, L. Ma, X. Fan, and Z. Luo, “Deep proximal unrolling: Algorithmic framework, convergence analysis and applications,” IEEE Trans. Image Process., vol. 28, no. 10, pp. 5013–5026, 2019.
  • [58] P. K. Pokala, A. G. Mahurkar, and C. S. Seelamantula, “FirmNet: A sparsity amplified deep network for solving linear inverse problems,” Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Process., pp. 2982–2986, 2019.
  • [59] Y. Li, M. Tofighi, J. Geng, V. Monga, and Y. C. Eldar, “Efficient and interpretable deep blind image deblurring via algorithm unrolling,” IEEE Trans. on Computational Imaging, vol. 6, pp. 666–681, 2020.
  • [60] N. Shlezinger, J. Whang, Y. C. Eldar, and A. G. Dimakis, “Model-based deep learning,” 2020, arXiv:2012.08405 [Online].
  • [61] P. K. Pokala, P. K. Uttam, and C. S. Seelamantula, “ConFirmNet: Convolutional FirmNet and application to image denoising and inpainting,” Proc. IEEE Intl. Conf. on Acoustics, Speech and Signal Process., pp. 8663–8667, 2020.
  • [62] L. Wang, C. Sun, M. Zhang, Y. Fu, and H. Huang, “DNU: deep non-local unrolling for computational spectral imaging,” Proc. IEEE /CVF Conf. Comp. Vis. and Patt. Recog., pp. 1658–1668, 2020.
  • [63] B. Tolooshams, S. Dey, and D. Ba, “Deep residual autoencoders for expectation maximization-inspired dictionary learning,” IEEE Trans. on Neural Networks and Learning Systems, 2020.
  • [64] D. Jawali, P. K. Pokala, and C. S. Seelamantula, “CORNET: Composite-regularized neural network for convolutional sparse coding,” Proc. IEEE Intl. Conf. on Image Processing, pp. 818–822, 2020.
  • [65] B. Tolooshams, S. Mulleti, D. Ba, and Y. C. Eldar, “Unfolding neural networks for compressive multichannel blind deconvolution,” 2021, arXiv:2010.11391 [Online].
  • [66] M. A. T. Figueiredo, J. M. Bioucas-Dias, and R. D. Nowak, “Majorization–minimization algorithms for wavelet-based image restoration,” IEEE Trans. Image Process., vol. 16, no. 12, pp. 2980–2991, 2007.
  • [67] S. Voronin and H. J. Woerdeman, “A new iterative firm-thresholding algorithm for inverse problems with sparsity constraints,” Applied and Computational Harmonic Analysis, vol. 35, no. 1, pp. 151–164, 2013.
  • [68] W. Hamlyn, “Thin beds, tuning, and AVO,” The Leading Edge, vol. 33, no. 12, pp. 1394–1396, 2014.
  • [69] G. S. Martin, R. Wiley, and K. J. Marfurt, “Marmousi2: An elastic upgrade for Marmousi,” The Leading Edge, vol. 25, no. 2, pp. 156–166, 2006.
  • [70] O. S. R., dGB Earth Sciences, “Penobscot 3D - Survey,” 2017, data retrieved from https://terranubis.com/datainfo/Penobscot.
  • [71] D. Freedman, R. Pisani, and R. Purves, Statistics (international student edition).   WW Norton & Company, New York, 2007.
  • [72] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” 2014, arXiv:1412.6980 [Online].
  • [73] H. Chung and D. C. Lawton, “Frequency characteristics of seismic reflections from thin beds,” Canadian J. Exploration Geophysics, vol. 31, no. 1, pp. 32–37, 1995.
  • [74] R. Versteeg, “The Marmousi experience: velocity model determination on a synthetic complex data set,” The Leading Edge, vol. 13, no. 9, pp. 927–936, 1994.
  • [75] E. Bianco, “Geophysical tutorial: Well-tie calculus,” The Leading Edge, vol. 33, no. 6, pp. 674–677, 2014.