This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Implicit learning to determine variable sound speed and the reconstruction operator in photoacoustic tomography

Gyeongha Hwang Department of Mathematics, Yeungnam University, Gyeongsan 38541, Republic of Korea Gihyeon Jeon Sunghwan Moon Department of Mathematics, Kyungpook National University, Daegu 41566, Republic of Korea Dabin Park
Abstract

Photoacoustic tomography (PAT) is a hybrid medical imaging technique that offer high contrast and a high spatial resolution. One challenging mathematical problem associated with PAT is reconstructing the initial pressure of the wave equation from data collected at the specific surface where the detectors are positioned. The study addresses this problem when PAT is modeled by a wave equation with unknown sound speed cc, which is a function of spatial variables, and under the assumption that both the Dirichlet and Neumann boundary values on the detector surface are measured. In practical, we introduce a novel implicit learning framework to simultaneously estimate the unknown cc and the reconstruction operator using only Dirichlet and Neumann boundary measurement data. The experimental results confirm the success of our proposed framework, demonstrating its ability to accurately estimate variable sound speed and the reconstruction operator in PAT.

Keywords: photoacoustic tomography, unsupervised learning, inverse problem, wave equation

1 Introduction

Photoacoustic tomography (PAT) is an imaging method that uses non-ionized laser pulses and ultrasound to produce detailed images of the internal structure of biological tissue. The method is non-destructive, economical, and less harmful than other imaging options because it uses non-ionizing radiation [31]. For these reasons, it is used in a variety of biomedical applications, including skin melanoma detection, breast cancer detection, blood oxygenation mapping, tumor angiogenesis monitoring, functional brain imaging, and methemoglobin measurement [36].

In PAT, when the target object is irradiated with a non-ionizing laser pulse, the absorbed pulse generates a photoacoustic effect in which rapid thermal expansion leads to the production of acoustic waves, a phenomenon that was first discovered by Bell [7]. The resulting acoustic waves are measured using ultrasound detectors as the data and used to reconstruct images of the target object. This method is particularly useful for visualizing structures in optically opaque materials such as biological tissue, because it combines the high contrast of optical imaging with the high spatial resolution of ultrasound imaging (for more details, see [17, 21, 34]).

One of the goals of PAT is to reconstruct initial pressure ff, which potentially contains important biological information such as the presence and location of cancer cells, from the acoustic waves measured by the detector. These acoustic waves, denoted by pp, follow the wave equation:

{t2p(𝐱,t)=c(𝐱)𝐱p(𝐱,t)(𝐱,t)2×[0,),p(𝐱,t)|t=0=f(𝐱) and tp(𝐱,t)|t=0=0𝐱2\left\{\begin{array}[]{ll}\partial_{t}^{2}p({\mathbf{x}},t)=c({\mathbf{x}})\triangle_{\mathbf{x}}p({\mathbf{x}},t)&({\mathbf{x}},t)\in\mathbb{R}^{2}\times[0,\infty),\\ p({\mathbf{x}},t)|_{t=0}=f({\mathbf{x}})\text{ and }\partial_{t}p({\mathbf{x}},t)|_{t=0}=0&{\mathbf{x}}\in\mathbb{R}^{2}\end{array}\right. (1)

Here, c(𝐱)c(\mathbf{x}) represents the sound speed at location 𝐱{\mathbf{x}}, which is assumed to be continuous and bounded by two constants cMc_{M} and cmc_{m} such that cMc(𝐱)cm0c_{M}\geq c(\mathbf{x})\geq c_{m}\geq 0. The measurements are made on the boundary of a region of interest Ω2\Omega\subset\mathbb{R}^{2} under the reasonable assumption that ff has compact support within the bounded domain Ω\Omega. We can then define wave forward operator 𝒲c\mathcal{W}_{c}, which maps initial pressure ff to solution pp, i.e., 𝒲cf=p\mathcal{W}_{c}f=p. The challenge then focuses on precisely reconstructing ff from the boundary measurements to derive an accurate internal image of the target object.

Most studies on the reconstruction of initial pressure ff from 𝒲cf|Ω×[0,]\mathcal{W}_{c}{f}|_{\partial\Omega\times[0,\infty]} assume a known sound speed cc. The use of a constant cc has been studied analytically in [12, 35, 37] (for further mathematical details, see [2, 20, 21] and references therein), while other analytical studies have employed a variable cc. Agranovsky and Kuchment study reconstruction of the initial pressure from Dirichlet data with a known variable sound speed in three-dimensional space [1]. Moon et al. also study the singular value decomposition of the wave forward operator with a known radial sound speed in nn-dimensional space [23]. Stefanov and Uhlmann address a more general problem with a Riemann metric instead of the sound speed [30]. Other studies detailing numerical approaches include [6, 15, 24]. Research has also been conducted on recovering cc when initial pressure ff is known [29], while the sufficient conditions for simultaneously reconstructing ff and cc are discussed in [22].

Recently, deep learning methods have been applied to PAT [32, 14] image reconstruction [3], handling limited data setups [4, 5, 13, 18, 26], and achieving a super-resolution [5]. Many of these studies are based on supervised learning, for which initial pressure ff is the target data. However, in practice, it cannot necessarily be assumed that the target data is known in PAT because the initial pressure represents the internal structure of the object. Therefore, methods for training the reconstruction operator without using the target data need to be considered. In line with this, in this paper, we propose an unsupervised learning method to implicitly estimate the variable speed of sound cc and the reconstruction operator. The proposed method only utilizes paired Dirichlet and Neumann data boundary values and contributes to PAT research by demonstrating the use of implicit learning techniques.

The rest of this paper is structured as follows. The next section presents the formulation of our problem. In Section 3, we describe our proposed framework and its loss function. The numerical results are presented in Section 4. Finally, Section 5 summarizes our research and its contributions to PAT, focusing on the variable speed of sound and initial pressure.

2 Problem Formulation

Building on (1), we consider the problem of estimating variable sound speed cc and the reconstruction operator that recovers initial pressure ff from Dirichlet and Neumann boundary data. We assume that cC(2)c\in C^{\infty}(\mathbb{R}^{2}) satisfies cMc(𝐱)cm0c_{M}\geq c(\mathbf{x})\geq c_{m}\geq 0. Theorem 3.3 in [29] is a uniqueness theorem associated with this problem. According to this theorem, for a given paired dataset of initial values and Dirichlet data, sound speed cc and the reconstruction operator can be uniquely determined (see the Appendix for further details of the problem formulation, methods, and experimental results for this type of paired dataset is given). However, because it is an unreasonable assumption that the ground truth for the initial data can be used in PAT, obtaining paired data is impractical. Therefore, in this paper, we address this problem without using initial data.

For convenience, Dirichlet data 𝒲cf|B×[0,)\mathcal{W}_{c}f|_{\partial B\times[0,\infty)} is denoted as 𝒟cf\mathcal{D}_{c}f and Neumann data ν𝒲cf|B×[0,)\partial_{\nu}\mathcal{W}_{c}f|_{\partial B\times[0,\infty)} is denoted as 𝒩cf\mathcal{N}_{c}f, where ν\nu is the unit outward normal vector at the detector surface B\partial B and BB is the unit ball in 2\mathbb{R}^{2}.

Problem 1.

When a collection of Dirichlet and Neumann data pairs

{(𝒟cf,𝒩cf):fL2(2) with suppfB}\left\{(\mathcal{D}_{c}f,\mathcal{N}_{c}f):f\in L^{2}(\mathbb{R}^{2})\text{ with }\operatorname{supp}f\subset B\right\} (2)

is given, estimate the sound speed cc and the reconstruction operator 𝒟c1\mathcal{D}_{c}^{-1}.

When cc is known, Problem 1 aligns with the uniqueness theorem (Theorem A):

Theorem A.

([1, Theorem 8]) For a known sound speed cc, the initial pressure ff is uniquely determined by the Dirichlet boundary value 𝒟cf\mathcal{D}_{c}f.

We can now discuss the uniqueness of cc in Problem 1. For this, we recall the Calderón problem on the conductivity equation [11, 33]: For the equation

(γ(𝐱)u(𝐱))=0𝐱B,\nabla\cdot(\gamma({\mathbf{x}})\nabla u({\mathbf{x}}))=0\qquad{\mathbf{x}}\in B, (3)

with Dirichlet condition u|B=gu|_{\partial B}=g, the problem is uniquely determining conductivity function γ\gamma from the knowledge of the (bounded linear) Dirichlet to Neumann map Λγ:H1/2(B)H1/2(B)\Lambda_{\gamma}:H^{1/2}(\partial B)\to H^{-1/2}(\partial B) defined by

Λγ(g)=νu|B,\Lambda_{\gamma}(g)=\partial_{\nu}u|_{\partial B},

where Hα(B),αH^{\alpha}(B),\alpha\in\mathbb{R} is the Sobolev space.

Note that conductivity equation (3) is equivalent to

(Δ+q(𝐱))w(𝐱)=0𝐱B,(-\Delta+q({\mathbf{x}}))w({\mathbf{x}})=0\qquad{\mathbf{x}}\in B, (4)

where q(𝐱)=γ1/2(𝐱)γ1/2(𝐱)q({\mathbf{x}})=\dfrac{\nabla\gamma^{1/2}({\mathbf{x}})}{\gamma^{1/2}({\mathbf{x}})} and w(𝐱)=γ1/2(𝐱)u(𝐱)w({\mathbf{x}})=\gamma^{1/2}({\mathbf{x}})u({\mathbf{x}}).

Based on wave equation (1), we propose the following conjecture.

Conjecture: Assume that c1c_{1} and c2c_{2} are the known constant c0c_{0} on BcB^{c}. If, for any fL2(B)f\in L^{2}(B),

𝒟c1f=𝒟c2f and 𝒩c1f=𝒩c2f,\mathcal{D}_{c_{1}}f=\mathcal{D}_{c_{2}}f\text{ and }\mathcal{N}_{c_{1}}f=\mathcal{N}_{c_{2}}f,

then c1=c2c_{1}=c_{2} on 2\mathbb{R}^{2}.

Because c1c_{1} and c2c_{2} are known constant c0c_{0} on BcB^{c}, it suffices to show that c1=c2c_{1}=c_{2} on BB. Let us define

p¯(𝐱,t):={p(𝐱,t)|B¯×[0,)t0,p(𝐱,t)|B¯×[0,)t<0,\overline{p}({\mathbf{x}},t):=\left\{\begin{array}[]{ll}p({\mathbf{x}},t)|_{\overline{B}\times[0,\infty)}&t\geq 0,\\ p({\mathbf{x}},-t)|_{\overline{B}\times[0,\infty)}&t<0,\end{array}\right.

where pp is the solution for (1). Then, p¯\overline{p} satisfies

{t2p¯(𝐱,t)=c(𝐱)Δ𝐱p¯(𝐱,t)(𝐱,t)B×,p¯(𝐱,0)=f(𝐱) and tp¯(𝐱,t)|t=0=0𝐱B.\left\{\begin{array}[]{ll}\partial_{t}^{2}\overline{p}({\mathbf{x}},t)=c({\mathbf{x}})\Delta_{\mathbf{x}}\overline{p}({\mathbf{x}},t)&({\mathbf{x}},t)\in B\times\mathbb{R},\\ \overline{p}({\mathbf{x}},0)=f({\mathbf{x}})\text{ and }\partial_{t}\overline{p}({\mathbf{x}},t)|_{t=0}=0&{\mathbf{x}}\in B.\end{array}\right. (5)

Taking the Fourier transform of (5) with respect to the time variable tt, we have

(Δ𝐱+(2πiω)2c(𝐱))t(p¯(𝐱,))(ω)=0.\left(-\Delta_{\mathbf{x}}+\frac{(2\pi\text{i}\omega)^{2}}{c({\mathbf{x}})}\right)\mathcal{F}_{t}(\overline{p}({\mathbf{x}},\cdot))(\omega)=0.

The Calderón problem appears to give us the uniqueness of cc in (5) given (2). However, there are gaps in this application. For fL2(B)f\in L^{2}(B) we cannot be sure that 𝒟cfH1/2(B)\mathcal{D}_{c}f\in H^{1/2}(\partial B). Moreover, it is not easy for set {𝒟cf:fL2(B)}\{\mathcal{D}_{c}f:f\in L^{2}(B)\} to be equal to H1/2(B)H^{1/2}(\partial B). Despite these gaps, our conjecture is validated experimentally in Section 4.

In practical terms, we focus on the following problem, emphasizing that only a finite collection of data can be used in real-world applications.

Problem 2.

When a sufficiently large finite collection of Dirichlet and Neumann data pairs

{(𝒟cfi,𝒩cfi)i=1,,N:fiL2(2) with suppfiB}\left\{(\mathcal{D}_{c}f_{i},\mathcal{N}_{c}f_{i})_{i=1,\dots,N}:f_{i}\in L^{2}(\mathbb{R}^{2})\text{ with }\operatorname{supp}f_{i}\subset B\right\} (6)

is given, estimate the sound speed cc and the reconstruction operator 𝒟c1\mathcal{D}_{c}^{-1}.

Our problem has the following difficulties:

  1. 1.

    Target data ff in PAT is unavailable, and

  2. 2.

    The explicit formula of the wave forward operator is unknown.

In this paper, we propose an implicit learning method for estimating cc and 𝒟c1\mathcal{D}_{c}^{-1}. The proposed method uses a paired dataset of Dirichlet and Neumann data and an iterative method for the wave forward operator.

3 Proposed method

Refer to caption

Figure 1: Proposed framework

Our goal is to estimate cc and 𝒟c1\mathcal{D}_{c}^{-1} simultaneously from given data set 𝒯={(𝒟cfi,𝒩cfi)}i=1N\mathscr{T}=\{(\mathcal{D}_{c}f_{i},\mathcal{N}_{c}f_{i})\}_{i=1}^{N}. The proposed framework is shown in Figure 1 and consists of three components:

  1. 1.

    Sound speed network c~\tilde{c}

  2. 2.

    Reconstruction network \mathcal{R}

  3. 3.

    Wave forward operator 𝒲c~\mathcal{W}_{\tilde{c}}

Sound speed network c~\tilde{c} is designed to estimate the unknown cc in (1). Simultaneously, reconstruction network \mathcal{R} approximates 𝒟c1{\mathcal{D}_{c}^{-1}}. This network takes Dirichlet data 𝒟cf\mathcal{D}_{c}f and outputs a reconstruction of initial data f~\tilde{f}. Wave forward operator 𝒲c~\mathcal{W}_{\tilde{c}}, using an iterative scheme, solves wave equation (1) with the estimated c~\tilde{c} and the reconstructed f~\tilde{f}. From solution p~\tilde{p}, Dirichlet 𝒟c~f~\mathcal{D}_{\tilde{c}}\tilde{f} and Neumann 𝒩c~f~\mathcal{N}_{\tilde{c}}\tilde{f} data are obtained.

If the proposed conjecture is valid, then

𝒟c~f~𝒟cf2=0and𝒩c~f~𝒩cf2=0 for all fL2(B)||\mathcal{D}_{\tilde{c}}\tilde{f}-\mathcal{D}_{c}f||_{2}=0\quad\text{and}\quad||\mathcal{N}_{\tilde{c}}\tilde{f}-\mathcal{N}_{c}f||_{2}=0\text{ for all }f\in L^{2}(B)

implying that c~\tilde{c} and cc are the same. Theorem A guarantees that for c~=c\tilde{c}=c, if 𝒟c~f~𝒟cf2=0||\mathcal{D}_{\tilde{c}}\tilde{f}-\mathcal{D}_{c}f||_{2}=0 holds, then f~=f\tilde{f}=f. Thus, for given data set 𝒯={(𝒟cfi,𝒩cfi)}i=1N\mathscr{T}=\{(\mathcal{D}_{c}f_{i},\mathcal{N}_{c}f_{i})\}_{i=1}^{N}, we define the following loss function:

A(𝒯)=[λ𝒟𝒟c~f~i𝒟cfi22+λ𝒩𝒩c~f~i𝒩cfi22].\mathcal{L}_{A}(\mathscr{T})=\sum\Big{[}\lambda_{\mathcal{D}}||\mathcal{D}_{\tilde{c}}\tilde{f}_{i}-\mathcal{D}_{c}f_{i}||_{2}^{2}+\lambda_{\mathcal{N}}||\mathcal{N}_{\tilde{c}}\tilde{f}_{i}-\mathcal{N}_{c}f_{i}||_{2}^{2}\Big{]}. (7)

To effectively handle noisy data, we also incorporate a Total Variation (TV) regularization term for the reconstructed image [25]:

B(𝒯)=λTVf~i1where f~i=(𝒟cfi).\mathcal{L}_{B}(\mathscr{T})=\lambda_{\text{TV}}\sum||\nabla\tilde{f}_{i}||_{1}\qquad\text{where }\tilde{f}_{i}=\mathcal{R}(\mathcal{D}_{c}f_{i}).

The overall loss function is then given by:

(𝒯)=A(𝒯)+B(𝒯).\mathcal{L}(\mathscr{T})=\mathcal{L}_{A}(\mathscr{T})+\mathcal{L}_{B}(\mathscr{T}).

In the following subsection, we describe each component of the architecture in detail.

3.1 Sound speed network

We propose a neural network to estimate the unknown sound speed cc in (1). This strategy is based on the universal approximation theorem [10], which asserts that neural networks can approximate any continuous function. For this task, we employ a Multilayer Perceptron (MLP) architecture. Our MLP is structured to take two-dimensional spatial input 𝐱\mathbf{x} and output the estimated sound speed c~(𝐱)\tilde{c}(\mathbf{x}). The MLP consists of three hidden layers, each containing 50 hidden units. We choose the sine function as the activation function for the hidden layers due to its periodic nature, which enhances the model’s ability to accurately and efficiently represent complex natural signals and their derivatives [28]. Considering that the sound speed has known upper and lower bounds, we use a hyperbolic tangent function to constrain the output, enabling it to fit within the predetermined speed range. Additionally, we set the sound speed values outside of the detector’s range to a known constant.

3.2 Reconstruction network

The reconstruction network is designed to approximates the reconstruction operator that reconstructs initial data ff using Dirichlet data 𝒟cf\mathcal{D}_{c}f. Because 𝒟c\mathcal{D}_{c} is linear, its inverse is also linear, thus the reconstruction operator can be approximated by a neural network with a single linear layer. However, when the input has a high-resolution, this structure leads to a sharp increase in the number of neural network parameters, which grows quadratically with the resolution of input image. This results in substantial memory demands and longer computation times. For example, a low-resolution image of size 64×6464\times 64 requires 16,777,216 parameters; conversely, a high-resolution image of size 256×256256\times 256 requires 4,294,967,296 parameters, a 256-fold increase. An excessive number of parameters can also increase the risk of overfitting, adversely affecting the generalization capability of the model.

To mitigate these challenges, our proposed network architecture incorporates techniques such as downsampling and upsampling to significantly reduce the number of parameters while maintaining the network’s efficacy. This approach optimizes both the computational efficiency and memory usage, allowing it to effectively handle higher-resolution data. The architecture of the reconstruction network is detailed in following Figure 2.

Refer to caption

Figure 2: Reconstruction network \mathcal{R}. The Conv(a, b) layer is a 2D convolution layer with a kernel size of (a, b), and a stride of (1, 1) while padding is employed to ensure that the output is the same size as the input. The AvgPool(a, b) layer is a 2D average pooling layer with a kernel size of (a, b). The linear layer is a fully connected layer that transforms an input with dimension of 64×6464\times 64 to an output with the same dimensions. The Deconv(a, b) layer is a 2D transposed convolution layer with a kernel size of (a, b) and a stride of (2, 2). Each layer contains no bias.

The core process for the reconstruction network is as follows:

  1. Step 1.

    The input data (sinogram) undergoes a reduction in resolution via the convolution and average pooling layers, converting it into low-resolution, multi-channel data.

  2. Step 2.

    The low-resolution data from Step 1 is fed through a fully connected layer, which produces further low-resolution, multi-channel image domain data. This step relies on the linearity of the reconstruction operator.

  3. Step 3.

    Deconvolution and convolution layers are employed to upscale the data from Step 2, resulting in high-resolution, multi-channel image domain data. This step primarily focuses on effectively enhancing the resolution of the final image output.

  4. Step 4.

    A residual network (Res-Net) is used to further refine the reconstruction quality. Res-Net can significantly enhance the fidelity of reconstructed images.

  5. Step 5.

    To ensure that final image ff has values within the physiological range of [0,1][0,1], hyperbolic tangent transformation is employed. This function x0.5tanh(x0.5)+0.5x\mapsto 0.5\tanh(x-0.5)+0.5 helps to calibrate the output to the desired range via normalization.

3.3 Wave forward operator

Because the explicit form of the wave forward operator is unknown, we use an iterative method. In particular, to calculate the wave propagation, we use the well-known kk-space method [8, 9]. The approximation of wave propagation for the subsequent time step is obtained from equation [16]:

p(𝐱,t+t)=2p(𝐱,t)p(𝐱,tt)c~(𝐱)𝐤1[4sin2((t)||2)𝐱[p](,t)](𝐱),p({\mathbf{x}},t+\triangle t)=2p({\mathbf{x}},t)-p({\mathbf{x}},t-\triangle t)-\tilde{c}({\mathbf{x}})\mathcal{F}_{\mathbf{k}}^{-1}\left[4\sin^{2}\left(\dfrac{(\triangle t)|\cdot|}{2}\right)\mathcal{F}_{\mathbf{x}}[p](\cdot,t)\right]({\mathbf{x}}), (8)

where 𝐱\mathcal{F}_{\mathbf{x}} is the Fourier transform, and 𝐤1\mathcal{F}_{\mathbf{k}}^{-1} is the inverse Fourier transform. In this formulation, p(𝐱,t+t)p({\mathbf{x}},t+\triangle t) denotes the solution for the subsequent time step, illustrating how the system evolves over time, particularly under the influence of the spatially dependent term c~(𝐱)\tilde{c}({\mathbf{x}}).

4 Numerical Results

We use Shepp-Logan phantoms for our simulation data. A Shepp-Logan phantom, introduced by Shepp and Logan in 1974 [27], is an artificial image representing a cross-section of the brain. It consists of several ellipses each defined by six parameters: the center coordinates of the ellipse, the major axis, the minor axis, the rotation angle, and the intensity. We create a set of phantoms {Fi}i=14,096\left\{F_{i}\right\}_{i=1}^{4,096} by changing these parameters and obtain the Dirichlet data set

{DcrFi:Dirichlet data for initial Fi and sound speed c on a ball of radius r }i=14,096\left\{D_{c}^{r}F_{i}:\text{Dirichlet data for initial $F_{i}$ and sound speed $c$ on a ball of radius $r$ }\right\}_{i=1}^{4,096}

for a given sound speed cc by applying the iterative wave propagation method (8) to the phantoms. Noisy data are generated by adding Gaussian noise to 1% of the maximum value of the original data. We use 2,048 phantoms as a training set, 1,024 as a validation set, and the remaining 1,024 as a test set.

We set c(x,y;h)=exp(h(x2+y2))c(x,y;h)=\exp(-h(x^{2}+y^{2})). Simulations are conducted using two different sound speed profiles:

Type 1: c1(x,y)=34(c(x0.5,y0.6;2)+c(x0.5,y+0.5;2)+c(x+0.6,y0.45;2)\displaystyle c_{1}(x,y)=\frac{3}{4}\Big{(}c(x-0.5,y-0.6;2)+c(x-0.5,y+0.5;2)+c(x+0.6,y-0.45;2) (9)
+c(x+0.55,y+0.6;2)),\displaystyle\qquad\qquad\quad+c(x+0.55,y+0.6;2)\Big{)},
Type 2: c2(x,y)=2335(c(x,y0.675;2.5)+c(x0.725,y0.175;2.5)+c(x0.4,y+0.625;2.5)\displaystyle c_{2}(x,y)=\frac{23}{35}\Big{(}c(x,y-0.675;2.5)+c(x-0.725,y-0.175;2.5)+c(x-0.4,y+0.625;2.5)
+c(x+0.625,y0.225;2.5)+c(x+0.45,y+0.575;2.5)),\displaystyle\qquad\qquad\quad\quad+c(x+0.625,y-0.225;2.5)+c(x+0.45,y+0.575;2.5)\Big{)},

which are characterized by a constant value outside of BB.

In practice, for small a h>0h>0, the Neumann data at the boundary of BB can be approximated by

𝝂𝐱𝒲cf(𝐱,t)𝒲cf(𝐱,t)𝒲cf(𝐱h𝝂𝐱,t)h,𝐱B\partial_{\boldsymbol{\nu}_{{\mathbf{x}}}}\mathcal{W}_{c}f({\mathbf{x}},t)\approx\frac{\mathcal{W}_{c}f({\mathbf{x}},t)-\mathcal{W}_{c}f({\mathbf{x}}-h\boldsymbol{\nu}_{{\mathbf{x}}},t)}{h},\quad{\mathbf{x}}\in\partial B (10)

where 𝝂𝐱\boldsymbol{\nu}_{{\mathbf{x}}} is the unit outward normal vector at 𝐱B{\mathbf{x}}\in\partial B. Having Dirichlet and Neumann data pairs is equivalent to having Dirichlet data pairs (Dc1Fi,Dc1hFi){(D_{c}^{1}F_{i},D_{c}^{1-h}F_{i})}. Therefore, we modify the loss function described in (7) as follows:

=λDiDc1FiDc~1F~i22+λNiDc1hFiDc~1hF~i22+λTViF~iTV,\mathcal{L}=\lambda_{D}\sum\limits_{i}\left\|D_{c}^{1}F_{i}-D_{\tilde{c}}^{1}\tilde{F}_{i}\right\|_{2}^{2}+\lambda_{N}\sum\limits_{i}\left\|D_{c}^{1-h}F_{i}-D_{\tilde{c}}^{1-h}\tilde{F}_{i}\right\|_{2}^{2}+\lambda_{\text{TV}}\sum\limits_{i}||\tilde{F}_{i}||_{\text{TV}}, (11)

where F~=(Dc1F)\tilde{F}=\mathcal{R}(D_{c}^{1}F).

We choose h=0.05h=0.05, λD=1\lambda_{D}=1, λN=1\lambda_{N}=1, and λTV=σ/5\lambda_{\text{TV}}=\sigma/5, where σ\sigma is the noise level. To minimize the loss function (11), we use the Adam optimizer [19] with a learning rate of 0.0010.001 and a batch size of 22. The training takes 10510^{5} iterations.

Table 1 shows the relative errors of the reconstructed image and the approximated sound speed. The results show that the proposed framework effectively reconstructs both the initial pressure and sound speed with high accuracy, maintaining its performance even with noisy data.

Table 1: Relative error of the reconstructed image and sound speed
Sound speed profile Noise-free data Noisy data
1Nfirel\textstyle\frac{1}{N}\sum\|f_{i}\|_{\text{rel}} crel\|c\|_{\text{rel}} 1Nfirel\textstyle\frac{1}{N}\sum\|f_{i}\|_{\text{rel}} crel\|c\|_{\text{rel}}
Type 1 0.0621 0.0014 0.0785 0.0014
Type 2 0.0613 0.0017 0.0759 0.0028

Figures 3 and 4 show the ground truth of the speed of sound along with their approximation output from the sound speed network. The relative error for ff with respect to ground truth fgtf_{\text{gt}} is defined by frel=fgtf2fgt2\|f\|_{\text{rel}}=\dfrac{\|f_{\text{gt}}-f\|_{2}}{\|f_{\text{gt}}\|_{2}}. The results demonstrate the ability of the network to accurately approximate the sound speed. This successful approximation over different sound speed scenarios (c1,c2c_{1},c_{2}) illustrates the adaptability and robustness of the proposed network under varying conditions.

Refer to caption
Figure 3: Estimated results for Type 1 and Type 2 sound speed profiles. The first, second, and third columns represent the ground truth, the estimated speed for noise-free data, and the estimated speed for noisy data, respectively.
Refer to caption
(a) Results for Type 1 speed profile
Refer to caption
(b) Results for Type 2 speed profile
Figure 4: In the left image, the ground truth is shown with three lines labeled Line 1, Line 2, and Line 3. The right images present cross-sectional views corresponding to these lines. The blue lines represent the ground truth, while the red dashed lines represent the estimated speed.

Figure 5 illustrates the ground truth and reconstruction results for the reconstruction network, for both noise-free and noisy data sets.

Refer to caption
Figure 5: Reconstruction results: (a) ground truth (GT); (b) and (c) output of the reconstruction network for noise-free data for Type 1 and Type 2 sound speed profiles; (d) and (e) output of the reconstruction network for noisy data for Type 1 and Type 2 sound speed profiles.

5 Conclusion

In this study, we introduce a framework for addressing the problem in PAT associated with the simultaneous estimation of sound speed cc and reconstruction operator 𝒟c1\mathcal{D}_{c}^{-1}. Our approach employs implicit learning to approximate both cc and 𝒟c1\mathcal{D}_{c}^{-1} using only Dirichlet and Neumann boundary data. This greatly reduces the dependence on large labeled datasets that is typical of medical imaging.

Appendix

In this Appendix, we address the problem of estimating the sound speed cC(2)c\in C^{\infty}(\mathbb{R}^{2}) and the reconstruction operator 𝒟c1\mathcal{D}_{c}^{-1} when a paired dataset of initial pressure ff and Dirichlet data 𝒟cf\mathcal{D}_{c}f is given. With access to a sufficiently large dataset {(fi,𝒟cfi)}i=1N\left\{(f_{i},\mathcal{D}_{c}f_{i})\right\}_{i=1}^{N}, we can estimate the reconstruction operator using supervised learning. This involves minimizing the following loss function:

1Nifi(𝒟cfi)22.\frac{1}{N}\sum_{i}\|f_{i}-\mathcal{R}(\mathcal{D}_{c}f_{i})\|_{2}^{2}.

Here, :𝒟cff\mathcal{R}:\mathcal{D}_{c}f\mapsto f represents a neural network designed to approximate the reconstruction operator 𝒟c1\mathcal{D}_{c}^{-1}.

To estimate the sound speed c()c(\cdot), we recall the following uniqueness theorem.

Theorem B.

([29, Theorem 3.3]) Let cc and c~\tilde{c} be two smooth positive speeds equal to 1 outside Ω\Omega. Let s\sum_{s}, s1ss2s_{1}\leq s\leq s_{2} be a continuous family of compact oriented surfaces111For a detailed definition of s\sum_{s}, see [29, pages 9–11]. Let

𝒟cf=𝒟c~fon[0,T]×Ω,withT>maxsdist(sΩ¯,Ω).\mathcal{D}_{c}f=\mathcal{D}_{\tilde{c}}f\quad on\quad[0,T]\times\partial\Omega,\qquad with\ T>\max\limits_{s}\operatorname{dist}({\textstyle\sum_{s}}\cap\overline{\Omega},\partial\Omega).

Assume that for some compact KΩ¯K\subset\overline{\Omega},

supp(c~c)K,Δf(x)0forxK.\operatorname{supp}(\tilde{c}-c)\subset K,\qquad\Delta f(x)\neq 0\quad for\ x\in K.

Then c~=c\tilde{c}=c in s\bigcup\sum_{s}. If, in particular, s\bigcup\sum_{s} is dense in KK, and T>dist(Ω,Ω¯)T>\operatorname{dist}(\Omega,\overline{\Omega}), then c~=c\tilde{c}=c.

From [29, Theorem 3.3], we have:

If 𝒟c~f=𝒟cf for all fL2(B), then c~=c on B.\text{If }\mathcal{D}_{\tilde{c}}f=\mathcal{D}_{c}f\text{ for all }f\in L^{2}(B),\text{ then }\tilde{c}=c\text{ on }B.

Based on this discussion, we define the loss function as follows:

=λ𝒟𝒟c~f~𝒟cf22+λIf~f22+λTVf~TV,\mathcal{L}=\lambda_{\mathcal{D}}||\mathcal{D}_{\tilde{c}}\tilde{f}-\mathcal{D}_{c}f||_{2}^{2}+\lambda_{I}||\tilde{f}-f||_{2}^{2}+\lambda_{\text{TV}}||\tilde{f}||_{\text{TV}}, (12)

where f~=(𝒟cf)\tilde{f}=\mathcal{R}(\mathcal{D}_{c}f).

Experiments are conducted using the same dataset and framework as described in the main text. For detailed results, please refer to Table 2, Figure 6, and Figure 8. The numerical results in the main text and Appendix show that the proposed implicit learning method, which does not use explicit target data, performs similarly to supervised learning. Additionally, the results in the Appendix are derived by simply modifying the loss function within the framework proposed in the main text, further confirming the applicability and robustness of our approach.

Table 2: Relative error of the reconstructed sound speed
Sound speed profile Noise-free data Noisy data
1Nfirel\textstyle\frac{1}{N}\sum\|f_{i}\|_{\text{rel}} crel\|c\|_{\text{rel}} 1Nfirel\textstyle\frac{1}{N}\sum\|f_{i}\|_{\text{rel}} crel\|c\|_{\text{rel}}
Type 1 0.0605 0.0016 0.0766 0.0015
Type 2 0.0572 0.0020 0.0767 0.0019
Refer to caption
Figure 6: Estimated results for Type 1 and Type 2 sound speed profiles. The first, second, and third columns present the ground truth, the estimated speed for noise-free data, and the estimated speed for noisy data, respectively.
Refer to caption
(a) Results for Type 1 speed
Refer to caption
(b) Results for Type 2 speed
Figure 7: In the left image, the ground truth is shown with three lines labeled Line 1, Line 2, and Line 3. The right images present cross-sectional views corresponding to these lines. The blue lines represent the ground truth, while the red dashed lines represent the estimated speed.
Refer to caption
Figure 8: Reconstruction results: (a) ground truth (GT); (b) and (c) output of the reconstruction network for noise-free data for Type 1 and Type 2 sound speed profiles; (d) and (e) output of the reconstruction network for noisy data for Type 1 and Type 2 sound speed profiles.

Acknowledgement

This work was supported by the National Research Foundation of Korea grant funded by the Korea government(MSIT) (NRF-2022R1C1C1003464, RS-2023-00217116 and RS-2024-00333393).

References

  • [1] M. Agranovsky, and P. Kuchment. Uniqueness of reconstruction and an inversion procedure for thermoacoustic and photoacoustic tomography with variable sound speed. Inverse Problems, 23(5), 2089, 2007.
  • [2] H. Ammari, E. Bossy, V. Jugnon, and H. Kang. Mathematical modeling in photoacoustic imaging of small absorbers. SIAM review, 52(4), 677-695, 2010.
  • [3] S. Antholzer, M. Haltmeier, R. Nuster, and J. Schwab. Photoacoustic image reconstruction via deep learning. In Photons plus ultrasound: Imaging and sensing 2018, 10494, 104944U, 433-442, SPIE, 2018.
  • [4] S. Antholzer, M. Haltmeier, and J. Schwab. Deep learning for photoacoustic tomography from sparse data. Inverse problems in science and engineering, 27(7), 987-1005, 2019.
  • [5] N. Awasthi, G. Jain, S. K. Kalva, M. Pramanik and P. K. Yalavarthy. Deep Neural Network-Based Sinogram Super-Resolution and Bandwidth Enhancement for Limited-Data Photoacoustic Tomography. IEEE Transactions on Ultrasonics, Ferroelectrics, and Frequency Control, 67(12), 2660-2673, 2020.
  • [6] Z. Belhachmi, T. Glatz, and O. Scherzer. A direct method for photoacoustic tomography with inhomogeneous sound speed. Inverse Problems, 32(4), 045005, 2016.
  • [7] A. G. Bell. On the production and reproduction of sound by light. American journal of science, 3(118), 305-324, 1880.
  • [8] B.T. Cox, and P.C. Beard. Fast calculation of pulsed photoacoustic fields in fluids using kk-space methods. The Journal of the Acoustical Society of America, 117(6), 3616-3627, 2005.
  • [9] B.T. Cox, S. Kara, S.R. Arridge, and P.C. Beard. kk-space propagation models for acoustically heterogeneous media: Application to biomedical photoacoustics. The Journal of the Acoustical Society of America, 121(6), 3453-3464, 2007.
  • [10] G. Cybenko. Approximation by superpositions of a sigmoidal function. Mathematics of control, signals and systems, 2(4), 303-314, 1989.
  • [11] J. Feldman, M. Salo, and G. Uhlmann. The Calderón problem—an introduction to inverse problems. Preliminary notes on the book in preparation, 30, 2019.
  • [12] D. Finch, M. Haltmeier, and Rakesh. Inversion of spherical means and the wave equation in even dimensions. SIAM Journal on Applied Mathematics, 68(2), 392-412, 2007.
  • [13] S. Gutta, V. S. Kadimesetty, S. K. Kalva, M. Pramanik, S. Ganapathy, and P. K. Yalavarthy. Deep neural network-based bandwidth enhancement of photoacoustic data. Journal of Biomedical Optics, 22(11), 116001, 2017.
  • [14] A. Hauptmann, and B. Cox. Deep learning in photoacoustic tomography: current approaches and future directions. Journal of Biomedical Optics, 25(11), 112903, 2020.
  • [15] Y. Hristova, P. Kuchment, and L. Nguyen. Reconstruction and time reversal in thermoacoustic tomography in acoustically homogeneous and inhomogeneous media. Inverse problems, 24(5), 055006, 2008.
  • [16] G. Hwang, G. Jeon, and S. Moon. Self-supervised learning for a nonlinear inverse problem with forward operator involving an unknown function arising in Photoacoustic Tomography. arXiv preprint arXiv:2301.08693, 2023.
  • [17] H. Jiang. Photoacoustic Tomography. CRC Press, Boca Raton, FL, USA, 2018.
  • [18] S. Jeon, W. Choi, B. Park and C. Kim. A Deep Learning-Based Model That Reduces Speed of Sound Aberrations for Improved In Vivo Photoacoustic Imaging. IEEE Transactions on Image Processing, 30, 8773-8784, 2021.
  • [19] D. P. Kingma, and J. Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980, 2014.
  • [20] P. Kuchment, and L. Kunyansky. Mathematics of thermoacoustic tomography. European Journal of Applied Mathematics, 19(2), 191-224, 2008.
  • [21] P. Kuchment. The Radon transform and medical imaging. SIAM, Philadelphia, 2013.
  • [22] H. Liu, and G. Uhlmann. A Determining both sound speed and internal source in thermo-and photo-acoustic tomography. Inverse Problems, 31(10), 105005, 2015.
  • [23] M. Moon, I. Hur, and S. Moon. Singular value decomposition of the wave forward operator with radial variable coefficients. SIAM Journal on Imaging Sciences, 16(3), 1520-1534, 2023.
  • [24] J. Qian, P. Stefanov, G. Uhlmann, and H Zhao. An efficient Neumann series–based algorithm for thermoacoustic and photoacoustic tomography with variable sound speed. SIAM Journal on Imaging Sciences, 4(3), 850-883, 2011.
  • [25] L. I. Rudin, S. Osher, and E. Fatemi. Nonlinear total variation based noise removal algorithms. Physica D: nonlinear phenomena, 60(1-4), 259-268, 1992.
  • [26] H. Shahid, A. Khalid, X. Liu, M. Irfan, and D. Ta. A deep learning approach for the photoacoustic tomography recovery from undersampled measurements. Frontiers in Neuroscience, 15, 598693, 2021.
  • [27] L. A. Shepp, and B. F. Logan. The Fourier reconstruction of a head section. IEEE Transactions on nuclear science, 21(3), 21-43, 1974.
  • [28] V. Sitzmann, J. Martel, A. Bergman, D. Lindell, and G. Wetzstein. Implicit neural representations with periodic activation functions. Advances in neural information processing systems, 33, 7462-7473, 2020.
  • [29] P. Stefanov, and G. Uhlmann. Recovery of a source term or a speed with one measurement and applications. Transactions of the American Mathematical Society, 365(11), 5737-5758, 2013.
  • [30] P. Stefanov, and G. Uhlmann. Thermoacoustic tomography with variable sound speed. Inverse Problems, 25(7), 075011, 2009.
  • [31] I. Steinberg, D. M. Huland, O. Vermesh, H. E. Frostig, W. S. Tummers, and S. S. Gambhir. Photoacoustic clinical imaging. Photoacoustics, 14, 77-98, 2019.
  • [32] S. Suganyadevi, V. Seethalakshmi, and K. Balasamy. A review on deep learning in medical image analysis. International Journal of Multimedia Information Retrieval, 11(1), 19-38, 2022.
  • [33] J. Sylvester, and G. Uhlmann. The Dirichlet to Neumann map and applications. Inverse problems in partial differential equations, 42, 101, 1990.
  • [34] J. Xia, J. Yao, and L. V. Wang. Photoacoustic tomography: principles and advances. Electromagnetic waves (Cambridge, Mass.), 147, 1-22, 2014.
  • [35] M. Xu, and L. V. Wang. Universal back-projection algorithm for photoacoustic computed tomography. Physical Review E, 71(1), 016706, 2005.
  • [36] M. Xu, and L. V. Wang. Photoacoustic imaging in biomedicine. Review of scientific instruments, 77(4), 041101, 2006.
  • [37] G. Zangerl, S. Moon, and M. Haltmeier. Photoacoustic tomography with direction dependent data: an exact series reconstruction approach. Inverse Problems, 35(11), 114005, 2019.