
A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction

Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren and Jian Lu

This work was supported partly by the National Natural Science Foundation of China (Nos. 12001381 and 61871274), Peacock Plan (No. KQTD2016053112051497), and Shenzhen Key Basic Research Project (Nos. JCYJ20180507184647636, JCYJ20170818094109846, and JCYJ20190808155618806). Wei Wang is with the School of Biomedical Engineering, Shenzhen University, Shenzhen, China (e-mail: [email protected]). Xiang-Gen Xia is with the Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA (e-mail: [email protected]). Chuanjiang He is with the College of Mathematics and Statistics, Chongqing University, Chongqing, China (e-mail: [email protected]). Zemin Ren is with the College of Mathematics and Physics, Chongqing University of Science and Technology, Chongqing, China (e-mail: [email protected]). Jian Lu is with the Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, China (e-mail: [email protected]).
Abstract

In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our method needs to calculate these parameters only once for all pitches, which lowers the GPU-memory burden and makes it well suited for deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Since our network utilizes features extracted from both the sinograms and the CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of the sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods in both subjective and objective evaluations.

Index Terms:
Helical CT, sparse CT, deep learning, Katsevich algorithm, high pitch

I Introduction

Computed tomography (CT) has been an important tool in medical diagnosis, and the helical scan with a multi-row detector is the most widely used scanning modality in hospitals. During scanning, the X-ray source and detector rotate around the central axis (i.e., the z-direction) while the table, with the patient lying stationary on it, is translated along the z-direction at a constant velocity. Therefore, the trajectory of the X-ray source relative to the patient forms a helical curve, where the pitch of the helix is related to the moving velocity of the table.

The patient needs to stay stationary during the scanning process; otherwise, severe artifacts will appear in the reconstructed CT images. Thus, high-pitch scanning, which can complete the data acquisition in a short time, has many merits and is very desirable in clinical diagnosis. For example, with high-pitch helical CT, the whole lung can be scanned within a very short interval, so the patient does not need to hold the breath. This is very useful for patients who cannot hold their breath for a long time, such as infants, elderly patients and unconscious patients. For cardiac CT, scanning the whole heart within one heartbeat is a reasonable request.

In [1], Noo et al. used the Tam-Danielsson window [2][3] and Katsevich's inversion formula [4] to derive the number of detector rows required to exactly reconstruct helical CT images with a curved or a flat detector. When the other parameters are fixed, the higher the pitch, the more detector rows are needed. However, as the number of detector rows increases, the price of the CT scanner also rises significantly. One way to reduce the price is to use a sparse detector [5] with a fixed number of detector rows. However, CT images reconstructed from sparse sinograms usually suffer from streak artifacts and lose fine details, especially when the scanning pitch is high.

In the literature, many algorithms were proposed to reconstruct the helical CT images. These algorithms can be mainly categorized into four classes: (a) rebinning algorithms, (b) FDK-like algorithms, (c) model-based iterative methods, (d) exact algorithms.

The rebinning algorithms [6][7] first convert the measured helical sinograms into 2D fan-beam or parallel-beam sinograms and then reconstruct the CT images by using any 2D reconstruction method. To reduce the interpolation error and accurately estimate the 2D sinograms, the sampling rate of the helical scan usually needs to be high.

The FDK-like algorithms [8][9][10] are extended from the circular cone-beam reconstruction algorithm proposed by Feldkamp, Davis and Kress [11]. The FDK-like algorithms usually involve three steps: 1) filtering the sinograms, 2) weighting the filtered data, and 3) backprojecting the weighted data. By utilizing the fast Fourier transform (FFT), the FDK-like algorithms can be implemented efficiently. However, since the FDK-like algorithms were proposed heuristically and lack theoretical support, artifacts in the reconstruction are usually hard to avoid, especially when the cone angle is large.

Model-based methods usually first formulate minimization functionals by combining the statistical properties of the sinogram with different prior assumptions, and then convert the functionals into iterative reconstruction algorithms using optimization methods. In the literature, total variation (TV) [12][13], nonlocal means (NLM) [14] and dictionary learning [15][16] are the commonly used priors. In [7], Stierstorfer et al. proposed an iterative weighted filtered backprojection (WFBP) method to reconstruct helical CT images. In [17], Sunnegårdh proposed an improved regularized iterative weighted filtered backprojection (RIWFBP) method for helical cone-beam CT. In [18], Yu and Zeng developed a fast TV-based iterative reconstruction algorithm for limited-angle helical cone-beam CT reconstruction. Model-based iterative methods are very suitable for ill-posed problems, such as limited-angle or sparse-angle CT reconstructions, but their computational burdens are too high to reconstruct the images in real time, especially for 3D helical CT.

Exact helical CT reconstruction can be obtained by using Katsevich's algorithms [19][4] or Zou and Pan's PI-line algorithms [20][21]. These algorithms have exact mathematical inversion formulas for helical CT and so are theoretically exact. In [22][23], by extending Katsevich's algorithms and the PI-line algorithms, exact methods for cone-beam CT reconstruction with general scanning curves were proposed. In [24], Yan et al. accelerated the implementation of the Katsevich algorithm [4] by utilizing the graphics processing unit (GPU). In their implementation, the Tam-Danielsson window rather than the PI-line is used to determine whether a sinogram corresponding to a scanning angle needs to be backprojected to reconstruct a slice of the CT image. Their method does not utilize the periodic properties of the PI-line and so needs to calculate the backprojection positions on the detector for the pixels in all the CT images. Exact helical algorithms can reconstruct good-quality CT images even if the helical pitch is high, but when the sinograms are sparse, streak artifacts in the CT images reconstructed by the exact algorithms cannot be avoided.

Recently, convolutional neural network (CNN) based methods have also been used for CT reconstruction. These methods can be roughly divided into three categories. The first category uses CNNs only to pre-process the sinograms [25][26] or post-process the CT images [27][28][29]. This type of method needs an extra step to convert the sinograms to CT images. The second category utilizes end-to-end CNNs to reconstruct CT images directly from measured sinograms [30][31][32][5][33]. Since the reconstruction algorithm is modelled as network layers embedded in the CNNs, no extra step is needed when testing or deploying these trained CNNs. However, embedding the reconstruction algorithm in CNNs is GPU-memory expensive and may cause out-of-memory errors when training this type of CNN, especially for 3D CT. The third category uses CNNs to approximate the solutions of unrolled iterative algorithms [34][35], where each iteration of the algorithm is approximated by a subnetwork. This type of algorithm consumes more GPU memory than those in the second category, since the projection and backprojection operators need to be performed n times in the network, where n is the number of iterations.

In this paper, we focus on embedding the Katsevich algorithm [4] into a CNN to propose an end-to-end deep network that reconstructs helical CT images with high pitch and sparsely spaced detectors. Due to the limited GPU memory, to the best of our knowledge, no exact algorithm (Katsevich's algorithms or PI-line algorithms) has been embedded in CNNs to reconstruct helical CT images in the literature. In [5], an end-to-end deep network was proposed for helical CT, but the sinograms measured from the helical scans were converted to 2D fan-beam sinograms and the embedded reconstruction algorithm was also for the fan-beam geometry. As the helical pitch increases, the image volume involved in generating the rebinned fan-beam sinogram and the error of the estimated fan-beam sinogram both increase. Therefore, for helical CT with high pitch, a CNN based on the rebinning methodology may not work well. The Katsevich algorithm [4] can reconstruct good CT images even if the helical pitch is high. Our network uses the Katsevich algorithm as the domain transfer layer and so can also reconstruct good CT images directly from high-pitch helical sinograms. Moreover, because our network utilizes features extracted from both the sinograms and the CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of the sinograms and reconstruct fine details in the CT images.

The sinogram measured from a helical scan usually has a large size in the z-direction. If the whole sinogram is directly input into a CNN to reconstruct all the CT images, a huge amount of GPU memory is needed. Besides, most of the parameters of the Katsevich algorithm [4] depend on the positions of the CT images, and computing them is both time consuming and GPU-memory consuming. Observing that the parameters of the Katsevich algorithm [4] are periodic, we divide the sinograms into parts and reconstruct the CT images pitch by pitch. In this way, the parameters of the Katsevich algorithm for all pitches can be pre-calculated only once and the GPU-memory load can be reduced.

The rest of the paper is organized as follows. Section II gives our new GPU implementation of the Katsevich algorithm [4]. Section III describes the detailed structure of our proposed network. Section IV presents the experimental results. Discussion and conclusion are given in Section V.

II GPU implementation of Katsevich Algorithm

In this section, we give a new GPU implementation of the Katsevich algorithm [4], which reconstructs the helical CT images pitch by pitch and has low computational and GPU-memory burdens. Our implementation of the Katsevich algorithm [4] is very suitable for deep learning since its parameters are the same for all pitches and so need to be calculated only once.

Katsevich [4] proposed an exact reconstruction formula for helical cone-beam CT and Noo et al. [1] researched how to implement it efficiently and accurately. Let

\underline{a}(\lambda)=(R\cos(\lambda+\lambda_{0}),R\sin(\lambda+\lambda_{0}),z_{0}+p\lambda/2\pi) \quad (1)

be the helical trajectory formed by the X-ray source position, with radius R and pitch p, where

\underline{a}(0)=(R\cos(\lambda_{0}),R\sin(\lambda_{0}),z_{0}) \quad (2)

is the start point of the helix,

U=\{\underline{x}:x^{2}+y^{2}\leq r^{2}\} \quad (3)

is a cylinder inside the helix, 0<r<R and \underline{x}=(x,y,z).
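
As a simple illustration, a minimal NumPy sketch of the source trajectory (1) could look as follows; the function name and default arguments are ours (R = 595 follows the scanning geometry of Section III-C) and are not part of the original implementation.

```python
import numpy as np

def helix_source_position(lam, R=595.0, p=7 * np.pi, lam0=0.0, z0=0.0):
    """Sketch of eq. (1): X-ray source position a(lambda) on the helix.

    lam may be a scalar or an array of rotation angles (in radians);
    R, p, lam0 and z0 are the radius, pitch, start angle and start height.
    """
    lam = np.asarray(lam, dtype=float)
    return np.stack([R * np.cos(lam + lam0),
                     R * np.sin(lam + lam0),
                     z0 + p * lam / (2 * np.pi)], axis=-1)
```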

Then, for any \underline{x}\in U, the reconstruction formula can be written as

f(\underline{x})=-\frac{1}{2\pi}\int_{\lambda_{i}(\underline{x})}^{\lambda_{o}(\underline{x})}\mathrm{d}\lambda\,\frac{1}{\|\underline{x}-\underline{a}(\lambda)\|}\,g^{F}\left(\lambda,\frac{\underline{x}-\underline{a}(\lambda)}{\|\underline{x}-\underline{a}(\lambda)\|}\right) \quad (4)

where \lambda_{i}(\underline{x}) and \lambda_{o}(\underline{x}) are the extremities of the \pi-line passing through \underline{x}, with \lambda_{i}(\underline{x})<\lambda_{o}(\underline{x}),

g^{F}(\lambda,\underline{\theta})=\int_{0}^{2\pi}\mathrm{d}\gamma\,h_{H}(\sin\gamma)\,g^{\prime}(\lambda,\cos\gamma\,\underline{\theta}+\sin\gamma\,(\underline{\theta}\times\underline{m}(\lambda,\underline{\theta}))) \quad (5)

g^{\prime}(\lambda,\underline{\theta})=\lim_{\varepsilon\rightarrow 0}\frac{g(\lambda+\varepsilon,\underline{\theta})-g(\lambda,\underline{\theta})}{\varepsilon}, \quad (6)

h_{H}(s)=-\int_{-\infty}^{+\infty}\mathrm{d}\sigma\,\mathrm{i}\,\operatorname{sign}(\sigma)\,\mathrm{e}^{\mathrm{i}2\pi\sigma s}=\frac{1}{\pi s} \quad (7)

is the Hilbert filter, g(\lambda,\underline{\theta}) is the measured projection data at the X-ray source position \underline{a}(\lambda) in the direction \underline{\theta}, the vector \underline{m}(\lambda,\underline{\theta}) is normal to the \kappa-plane \mathcal{K}(\lambda,\psi) of the smallest |\psi| value that contains the line of direction \underline{\theta} through \underline{a}(\lambda), and the \kappa-plane \mathcal{K}(\lambda,\psi) is any 2D plane that has the three intersections \underline{a}(\lambda), \underline{a}(\lambda+\psi) and \underline{a}(\lambda+2\psi) with the helix, where \psi\in(-\pi,\pi). In [1], to numerically calculate g^{F}(\lambda,\underline{\theta}), the integral over \gamma on the \kappa-plane with the minimum |\psi| is converted to a line integral along the curve where the \kappa-plane intersects the detector plane.

Let g(\lambda,\alpha,w) be the sinogram measured from the helical scan with a curved detector; then the implementation of (4) in [1] is described by the following steps (a rough NumPy sketch of the filtering steps is given after the list):

1. Taking the derivative at constant direction via the chain rule:

   g_{1}(\lambda,\alpha,w)=\left.\left(\frac{\partial g(q,\alpha,w)}{\partial q}+\frac{\partial g(q,\alpha,w)}{\partial\alpha}\right)\right|_{q=\lambda} \quad (8)

2. Length-correction weighting:

   g_{2}(\lambda,\alpha,w)=\frac{D}{\sqrt{D^{2}+w^{2}}}\,g_{1}(\lambda,\alpha,w) \quad (9)

3. Forward height rebinning: computing

   g_{3}(\lambda,\alpha,\psi)=g_{2}\left(\lambda,\alpha,w_{\kappa}(\alpha,\psi)\right) \quad (10)

   by interpolation for all \psi\in\left[-\pi/2-\alpha_{m},\pi/2+\alpha_{m}\right], where \alpha_{m}=\arcsin(r/R) is the half fan angle and

   w_{\kappa}(\alpha,\psi)=\frac{DP}{2\pi R_{o}}\left(\psi\cos\alpha+\frac{\psi}{\tan\psi}\sin\alpha\right) \quad (11)

4. 1-D Hilbert transform in \alpha: at constant \psi, compute

   g_{4}(\lambda,\alpha,\psi)=\int_{-\pi/2}^{+\pi/2}\mathrm{d}\alpha^{\prime}\,h_{H}\left(\sin\left(\alpha-\alpha^{\prime}\right)\right)g_{3}\left(\lambda,\alpha^{\prime},\psi\right) \quad (12)

   where h_{H} is the Hilbert filter defined in (7).

5. Backward height rebinning: compute

   g_{5}(\lambda,\alpha,w)=g_{4}(\lambda,\alpha,\hat{\psi}(\alpha,w)) \quad (13)

   where \hat{\psi}(\alpha,w) is the angle \psi of smallest absolute value that satisfies the equation

   w=\frac{DP}{2\pi R_{o}}\left(\psi\cos\alpha+\frac{\psi}{\tan\psi}\sin\alpha\right) \quad (14)

6. Post-cosine weighting:

   g^{F}(\lambda,\alpha,w)=\cos\alpha\,g_{5}(\lambda,\alpha,w) \quad (15)

7. Backprojection:

   f(\underline{x})=\frac{1}{2\pi}\int_{\lambda_{i}(\underline{x})}^{\lambda_{o}(\underline{x})}\mathrm{d}\lambda\,\frac{1}{v^{*}(\lambda,\underline{x})}\,g^{F}\left(\lambda,\alpha^{*}(\lambda,\underline{x}),w^{*}(\lambda,\underline{x})\right) \quad (16)

   where

   v^{*}(\lambda,\underline{x})=R_{o}-x\cos\left(\lambda+\lambda_{0}\right)-y\sin\left(\lambda+\lambda_{0}\right) \quad (17)

   \alpha^{*}(\lambda,\underline{x})=\arctan\left(\frac{-x\sin\left(\lambda+\lambda_{0}\right)+y\cos\left(\lambda+\lambda_{0}\right)}{v^{*}(\lambda,\underline{x})}\right) \quad (18)

   w^{*}(\lambda,\underline{x})=\frac{D\cos\alpha^{*}(\lambda,\underline{x})}{v^{*}(\lambda,\underline{x})}\left(z-z_{0}-\frac{P}{2\pi}\lambda\right) \quad (19)
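
The following is a rough NumPy sketch of filtering steps 1)-6) on a discretized sinogram; the array layout, grid handling and interpolation choices are our assumptions, and the boundary handling and sampling refinements of a practical implementation are omitted.

```python
import numpy as np

def katsevich_filter(g, d_lam, alpha, w, psi, D, P, R0):
    """Sketch of steps (8)-(15). g has shape (n_lambda, n_alpha, n_w);
    alpha, w, psi are 1-D uniform grids (detector fan angle, detector row
    coordinate, kappa-plane parameter)."""
    # Step 1: derivative at constant direction via the chain rule, eq. (8).
    g1 = np.gradient(g, d_lam, axis=0) + np.gradient(g, alpha, axis=1)
    # Step 2: length-correction weighting, eq. (9).
    g2 = g1 * (D / np.sqrt(D ** 2 + w ** 2))          # broadcasts over the w axis
    # Step 3: forward height rebinning, eqs. (10)-(11).
    ratio = np.ones_like(psi)
    nz = np.abs(psi) > 1e-12
    ratio[nz] = psi[nz] / np.tan(psi[nz])
    w_kappa = (D * P / (2 * np.pi * R0)) * (np.outer(np.cos(alpha), psi)
                                            + np.outer(np.sin(alpha), ratio))
    g3 = np.empty((g.shape[0], alpha.size, psi.size))
    for l in range(g.shape[0]):
        for a in range(alpha.size):
            g3[l, a] = np.interp(w_kappa[a], w, g2[l, a])
    # Step 4: 1-D Hilbert transform in alpha at constant psi, eq. (12),
    # as a discrete convolution with the kernel 1/(pi*sin(k*d_alpha)).
    d_alpha = alpha[1] - alpha[0]
    k = np.arange(-(alpha.size - 1), alpha.size) * d_alpha
    s = np.sin(k)
    hH = np.divide(1.0, np.pi * s, out=np.zeros_like(s), where=s != 0.0)
    g4 = np.empty_like(g3)
    for l in range(g3.shape[0]):
        for q in range(psi.size):
            g4[l, :, q] = d_alpha * np.convolve(g3[l, :, q], hH, mode="valid")
    # Step 5: backward height rebinning, eqs. (13)-(14); we assume
    # w_kappa(alpha, .) is monotonically increasing on the sampled psi range,
    # so psi_hat(alpha, w) can be inverted by interpolation.
    g5 = np.empty_like(g)
    for l in range(g.shape[0]):
        for a in range(alpha.size):
            g5[l, a] = np.interp(w, w_kappa[a], g4[l, a])
    # Step 6: post-cosine weighting, eq. (15).
    return np.cos(alpha)[None, :, None] * g5
```

In our network, this filtering chain is carried out on the GPU as part of the domain transfer layer described in Section III.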

In the above steps, the parameters \lambda_{i}(\underline{x}), \lambda_{o}(\underline{x}), v^{*}(\lambda,\underline{x}), \alpha^{*}(\lambda,\underline{x}), w^{*}(\lambda,\underline{x}), w_{\kappa}(\alpha,\psi) and \hat{\psi}(\alpha,w) need to be calculated. The parameters w_{\kappa}(\alpha,\psi) and \hat{\psi}(\alpha,w) can be pre-calculated only once, but the others, \lambda_{i}(\underline{x}), \lambda_{o}(\underline{x}), v^{*}(\lambda,\underline{x}), \alpha^{*}(\lambda,\underline{x}) and w^{*}(\lambda,\underline{x}), need to be calculated for every \underline{x}, which is time consuming. Fortunately, these parameters have the following periodic properties:

\lambda_{i}(\underline{x}^{k})=\lambda_{i}(\underline{x})+2k\pi \quad (20)
\lambda_{o}(\underline{x}^{k})=\lambda_{o}(\underline{x})+2k\pi
v^{*}(\lambda+2k\pi,\underline{x}^{k})=v^{*}(\lambda,\underline{x})
\alpha^{*}(\lambda+2k\pi,\underline{x}^{k})=\alpha^{*}(\lambda,\underline{x})
w^{*}(\lambda+2k\pi,\underline{x}^{k})=w^{*}(\lambda,\underline{x})

where \underline{x}^{k}=(x,y,z+kp), k is an integer and p is the pitch. Thus, if we reconstruct the CT images in the z-direction pitch by pitch, the parameters \lambda_{i}(\underline{x}), \lambda_{o}(\underline{x}), v^{*}(\lambda,\underline{x}), \alpha^{*}(\lambda,\underline{x}) and w^{*}(\lambda,\underline{x}) can also be pre-calculated only once, which greatly reduces the computation and GPU-memory burdens.

In the following, we describe the detailed GPU implementation scheme of the reconstruction formula (4), which reconstructs the CT images pitch by pitch. In our scheme, to reduce the GPU-memory consumption, the backprojection step is performed slice by slice in the z-direction, while the other steps are performed on the whole sinogram needed to reconstruct the CT images of one pitch.

Let f(\underline{x}_{j}^{k}) be the j-th CT image slice of the k-th pitch in the z-direction, where j=0,1,2,...,N-1, k=1,...,M, N is the total number of CT images to be reconstructed in one pitch, M is the number of pitches into which the scanning helix is divided, \underline{x}_{j}^{k}=(x,y,z_{j}+kp), (x,y) is a point of the mesh grid formed by the x-coordinate and y-coordinate, and z_{j} is the relative position on the z-coordinate within one pitch. Then our scheme can be divided into a parameter pre-calculating step and a reconstruction step.

The pre-calculating step is described as follows:

1. Pre-calculate w_{\kappa}(\alpha,\psi) and \hat{\psi}(\alpha,w) by using equation (11) and the method in [1].

2. Define \underline{x}_{j}:=\underline{x}_{j}^{1}. For j=0,1,2,...,N-1, pre-calculate the extremities \lambda_{i}(\underline{x}_{j}) and \lambda_{o}(\underline{x}_{j}) of the \pi-line passing through \underline{x}_{j} using the algorithm described in [36]. Note that an extra projection step that constrains \theta to the interval [-\pi,\pi] needs to be performed at each iteration of the algorithm, which was not explicitly stated in [36]. Also note that here \lambda_{i}(\underline{x}_{j}) and \lambda_{o}(\underline{x}_{j}) are two matrices corresponding to the mesh grid points (x,y) of \underline{x}_{j}. After obtaining \lambda_{i}(\underline{x}_{j}) and \lambda_{o}(\underline{x}_{j}), we can calculate the minimal and maximal values of \lambda needed to reconstruct f(\underline{x}_{j}):

   \lambda_{min}(\underline{x}_{j})=\min_{(x,y)~\text{of}~\underline{x}_{j}}\lambda_{i}(\underline{x}_{j}), \quad \lambda_{max}(\underline{x}_{j})=\max_{(x,y)~\text{of}~\underline{x}_{j}}\lambda_{o}(\underline{x}_{j}). \quad (21)

3. For j=0,1,2,...,N-1, pre-calculate v^{*}(\lambda,\underline{x}_{j}), \alpha^{*}(\lambda,\underline{x}_{j}) and w^{*}(\lambda,\underline{x}_{j}):

   v^{*}(\lambda,\underline{x}_{j})=R_{o}-x\cos\left(\lambda+\lambda_{0}\right)-y\sin\left(\lambda+\lambda_{0}\right) \quad (22)

   \alpha^{*}(\lambda,\underline{x}_{j})=\arctan\left(\frac{-x\sin\left(\lambda+\lambda_{0}\right)+y\cos\left(\lambda+\lambda_{0}\right)}{v^{*}(\lambda,\underline{x}_{j})}\right) \quad (23)

   w^{*}(\lambda,\underline{x}_{j})=\frac{D\cos\alpha^{*}(\lambda,\underline{x}_{j})}{v^{*}(\lambda,\underline{x}_{j})}\left(z_{j}-z_{0}-\frac{P}{2\pi}\lambda\right), \quad (24)

   where \underline{x}_{j}=(x,y,z_{j}) and \lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]. Here v^{*}(\lambda,\underline{x}_{j}), \alpha^{*}(\lambda,\underline{x}_{j}) and w^{*}(\lambda,\underline{x}_{j}) are three tensors of dimension three, where the first dimension corresponds to \lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})].

Note that in the above step 3), since

v^{*}(\lambda,\underline{x}_{j}^{1})=v^{*}(\lambda,\underline{x}_{j}^{k})
\alpha^{*}(\lambda,\underline{x}_{j}^{1})=\alpha^{*}(\lambda,\underline{x}_{j}^{k}) \quad (25)
w^{*}(\lambda,\underline{x}_{j}^{1})=w^{*}(\lambda,\underline{x}_{j}^{k})

for any positive integer k, the quantities v^{*}(\lambda,\underline{x}_{j}), \alpha^{*}(\lambda,\underline{x}_{j}) and w^{*}(\lambda,\underline{x}_{j}) can be pre-calculated only once.
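
As an illustration of step 3), a minimal NumPy sketch of the pre-calculation of these tensors for one slice might look like the following; the function and variable names are ours and not part of the released implementation.

```python
import numpy as np

def precompute_backprojection_params(x, y, z_j, lam, R0, D, P, lam0=0.0, z0=0.0):
    """Sketch of eqs. (22)-(24) for the j-th slice of the first pitch.

    x, y: 2-D mesh grids of the in-plane pixel coordinates.
    z_j:  scalar z-position of the slice within one pitch.
    lam:  1-D array of view angles in [lam_min(x_j), lam_max(x_j)].
    Returns three tensors of shape (len(lam),) + x.shape.
    """
    lam = lam[:, None, None]                              # broadcast over (x, y)
    v = R0 - x * np.cos(lam + lam0) - y * np.sin(lam + lam0)                   # eq. (22)
    alpha = np.arctan((-x * np.sin(lam + lam0) + y * np.cos(lam + lam0)) / v)  # eq. (23)
    w = D * np.cos(alpha) / v * (z_j - z0 - P / (2 * np.pi) * lam)             # eq. (24)
    return v, alpha, w
```

Thanks to the periodicity (25), these arrays are computed for the first pitch only and reused for every k.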

After the pre-calculating step, we can compute the minimal and maximal values of \lambda, \alpha and w needed to reconstruct the CT images of one pitch. These values can be calculated by

\lambda_{min}=\min_{j=0,1,...,N-1}\lambda_{min}(\underline{x}_{j}), \quad \lambda_{max}=\max_{j=0,1,...,N-1}\lambda_{max}(\underline{x}_{j}),
\alpha_{min}=\min_{j=0,1,...,N-1}\Big(\min_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}\alpha^{*}(\lambda,\underline{x}_{j})\Big), \quad \alpha_{max}=\max_{j=0,1,...,N-1}\Big(\max_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}\alpha^{*}(\lambda,\underline{x}_{j})\Big), \quad (26)
w_{min}=\min_{j=0,1,...,N-1}\Big(\min_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}w^{*}(\lambda,\underline{x}_{j})\Big), \quad w_{max}=\max_{j=0,1,...,N-1}\Big(\max_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}w^{*}(\lambda,\underline{x}_{j})\Big).

The values \alpha_{min}, \alpha_{max}, w_{min} and w_{max} can be used to determine the numbers of rows and columns of the detector, and \lambda_{min} and \lambda_{max} are used to determine the sinogram required to reconstruct the CT images of one pitch. To reconstruct the k-th pitch of CT images, the sinogram corresponding to \lambda\in[\lambda_{min}+2k\pi,\lambda_{max}+2k\pi] is needed.

Let g(\lambda^{k},\alpha,w) be the \lambda-sliced sinogram extracted from g(\lambda,\alpha,w), where \lambda^{k}=[\lambda_{min}+2k\pi,\lambda_{max}+2k\pi] for k=1,...,M. Then the reconstruction step can be described as follows:

1. Perform the derivative (8), length-correction weighting (9), forward height rebinning (10), Hilbert transform (12), backward height rebinning (13) and post-cosine weighting (15) on g(\lambda^{k},\alpha,w) to obtain g^{F}(\lambda^{k},\alpha,w).

2. For j=0,1,2,...,N-1, backproject g^{F}(\lambda^{k},\alpha,w) slice by slice (a sketch of this per-slice backprojection is given after the list):

   f(\underline{x}_{j}^{k})=\frac{1}{2\pi}\int_{\lambda_{min}(\underline{x}_{j})+2k\pi}^{\lambda_{max}(\underline{x}_{j})+2k\pi}\mathrm{d}\lambda\,\frac{1}{v^{*}_{j}}\,g^{F}\left(\lambda^{k},\alpha^{*}_{j},w^{*}_{j}\right) \quad (27)

   where

   v^{*}_{j}=v^{*}(\lambda,\underline{x}_{j}), \quad \alpha^{*}_{j}=\alpha^{*}(\lambda,\underline{x}_{j}), \quad w^{*}_{j}=w^{*}(\lambda,\underline{x}_{j}). \quad (28)
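
A minimal NumPy sketch of the discretized backprojection (27) for one slice is given below. The bilinear detector interpolation and the omission of the per-pixel \pi-line limits are our simplifications.

```python
import numpy as np

def backproject_slice(gF, v, alpha_star, w_star, alpha, w, d_lam):
    """Sketch of eq. (27): backproject the filtered sinogram into one slice.

    gF:     filtered sinogram of shape (n_lambda, n_alpha, n_w), assumed to be
            already restricted to the lambda range of this slice.
    v, alpha_star, w_star: pre-calculated tensors (eqs. (22)-(24)) of shape
            (n_lambda, ny, nx) for this slice.
    alpha, w: detector coordinate grids; d_lam: angular sampling step.
    Masking each pixel to its own pi-line interval [lambda_i, lambda_o]
    is omitted here for brevity.
    """
    f = np.zeros(v.shape[1:])
    for l in range(gF.shape[0]):
        # fractional detector indices of the backprojection positions
        ia = np.interp(alpha_star[l], alpha, np.arange(alpha.size))
        iw = np.interp(w_star[l], w, np.arange(w.size))
        ia0 = np.floor(ia).astype(int)
        iw0 = np.floor(iw).astype(int)
        ia1 = np.minimum(ia0 + 1, alpha.size - 1)
        iw1 = np.minimum(iw0 + 1, w.size - 1)
        ta, tw = ia - ia0, iw - iw0
        # bilinear interpolation on the detector
        val = ((1 - ta) * (1 - tw) * gF[l, ia0, iw0] + ta * (1 - tw) * gF[l, ia1, iw0]
               + (1 - ta) * tw * gF[l, ia0, iw1] + ta * tw * gF[l, ia1, iw1])
        f += d_lam / (2 * np.pi) * val / v[l]
    return f
```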

To reconstruct the helical CT images pitch by pitch, we need to perform the pre-calculating step once, store its outputs in memory, and then perform the reconstruction step M times for the M pitches. In this paper, the pre-calculating step is implemented on the central processing unit (CPU) while the reconstruction step is implemented on the GPU.

III Proposed Network

In this section, we propose a 3D deep learning network to reconstruct CT images directly from sparse helical sinograms with high pitch, where our GPU implementation of the Katsevich algorithm [4] is used as the domain transfer layer in the network.

III-A Architecture of Proposed Network

Our network (shown in Fig. 1) is composed of three parts: the sinogram domain subnetwork, the domain transfer layer and the CT image domain subnetwork. The sinogram domain subnetwork is used to denoise and upsample the sparse sinograms, the domain transfer layer reconstructs the CT images from the sinograms output by the sinogram domain subnetwork, and the CT image domain subnetwork denoises the CT images reconstructed by the domain transfer layer.

Figure 1: The detailed structure of our network.

The architecture of the sinogram domain subnetwork is based on ResNet [37]. In detail, the sinogram domain subnetwork is composed of 7 cascaded 'Conv3D+PReLU' blocks followed by a 'Conv3D' block, where 'Conv3D' is a 3D convolutional layer and 'PReLU' is the activation function, the parametric rectified linear unit. The number of feature channels of the 3D convolutional filters is 16 in the 'Conv3D+PReLU' blocks and 1 in the 'Conv3D' block, and the widths of the filters in the 'Conv3D+PReLU' blocks and the 'Conv3D' block are all 3. The output of each 'Conv3D+PReLU' block is skip-connected with the output of the last 'Conv3D+PReLU' block, and the input of the sinogram domain subnetwork is skip-connected with the output of the 'Conv3D' block. The input to the sinogram domain subnetwork is the sliced 3D sinogram g(\lambda^{k},\alpha,w) that can exactly reconstruct one pitch of helical CT images. The output of the sinogram domain subnetwork is the denoised sinogram, which has the same size as the input sinogram.
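
A minimal TensorFlow/Keras sketch of this subnetwork, as we read the description above and Fig. 1, is given below; summing all block outputs into the last block's output, the 'same' padding and the PReLU shared axes are our assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_domain_subnetwork(n_blocks=7, n_filters=16, width=3):
    """Sketch of the sinogram (or CT image) domain subnetwork."""
    x_in = layers.Input(shape=(None, None, None, 1))      # 3D volume, 1 channel
    x = x_in
    block_outputs = []
    for _ in range(n_blocks):
        x = layers.Conv3D(n_filters, width, padding="same")(x)
        x = layers.PReLU(shared_axes=[1, 2, 3])(x)
        block_outputs.append(x)
    # skip-connect every block output with the output of the last block
    x = layers.Add()(block_outputs)
    x = layers.Conv3D(1, width, padding="same")(x)
    # skip-connect the subnetwork input with the output of the final Conv3D
    x_out = layers.Add()([x_in, x])
    return tf.keras.Model(x_in, x_out)
```

Since the CT image domain subnetwork shares this architecture, the same function could be reused to build it.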

The domain transfer layer implements the Katsevich algorithm [4] as described in Section II. The input to the domain transfer layer is the denoised sinogram output by the sinogram domain subnetwork, and the output is one pitch of helical CT images reconstructed by the Katsevich algorithm [4].

The CT image domain subnetwork has the same architecture as the sinogram domain subnetwork, but its inputs and outputs are different. The input to the CT image domain subnetwork is the one pitch of CT images output by the domain transfer layer, and the output is the denoised CT images, which have the same size as the input.

III-B Loss Function

Let (g^{k},f^{k}) be the outputs of our network and (g_{GT}^{k},f_{GT}^{k}) be the corresponding labels, where g^{k} and g_{GT}^{k} are, respectively, the denoised sinogram and the label sinogram, and f^{k} and f_{GT}^{k} are, respectively, the denoised CT image and the label CT image. Then the loss function used to train our network is

\mathcal{L}_{\Theta}=\frac{1}{N_{D}}\sum_{k=1}^{N_{D}}\left(\left\|g_{GT}^{k}-g^{k}\right\|^{2}+\left\|f_{GT}^{k}-f^{k}\right\|^{2}\right), \quad (29)

where N_{D} is the total number of training samples and \Theta is the set of parameters of our network to be learned.

Our network is trained by optimizing the loss function (29) using the backpropagation (BP) algorithm [38]. In this study, the loss function is optimized by the Adam algorithm [39], where the learning rate is set to 10^{-3}.
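
A minimal TensorFlow sketch of one training step with the joint loss (29) is shown below; it assumes the model returns both the denoised sinogram and the denoised CT volume, and the reduction (a plain sum of squared errors per sample) is our reading of (29).

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(model, g_in, g_gt, f_gt):
    """One forward/backward pass with the joint sinogram + image loss (29)."""
    with tf.GradientTape() as tape:
        g_k, f_k = model(g_in, training=True)
        loss = (tf.reduce_sum(tf.square(g_gt - g_k))
                + tf.reduce_sum(tf.square(f_gt - f_k)))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```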

The training code of our network is implemented with TensorFlow 2.5 on a Dell PowerEdge T640 server with the Ubuntu 20.04 operating system, 377 GB of random access memory (RAM), an Intel(R) Xeon(R) Silver 4210R CPU with 2.4 GHz frequency and an NVIDIA RTX A6000 GPU card with 45631 MB of memory. The initial values of the training parameters of our network are set randomly by TensorFlow using the default setup. The gradients of the loss function with respect to the training parameters are calculated by TensorFlow automatically using automatic differentiation. The runtime of our network to complete one batch of learning (a forward step and a backward step) with batch size 1 is about 5-6 seconds.

III-C Data Preparation

III-C1 CT image labels

The CT images from “the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge” [40], which was established and authorized by the Mayo Clinic, are used as the ground truth CT images f_{GT}. The AAPM challenge has also released the 3D helical sinograms corresponding to the ground truth CT images, but the pitches of the helical scans used to obtain those sinograms are low and unequal, so we need to generate simulated sinograms with a constant high pitch by performing the helical projection ourselves.

In this paper, we evaluate our methods on two types of helical sinograms, corresponding to p=7\pi and p=14\pi, respectively.

III-C2 Parameters of the scanning geometry

Unless otherwise specified, all lengths in this paper are in millimetres (mm). The radius of the scanning helix is R=595 and the distance from the X-ray source to the detector is D=1085.6, where the detector is a curved panel. The x-coordinate and y-coordinate of the ground truth CT images are, respectively,

x_{cor}=[-256:1:255]\times 1, \quad y_{cor}=[-256:1:255]\times 1. \quad (30)

For simplicity, we assume here that the size of one pixel of the CT images is 1\times 1. The half fan angle is

\arcsin(\sqrt{2}\times 256/R)\approx 19.4808^{\circ} \quad (31)

and we set

\alpha_{cor}=[-19.5625:1:19.5625]\times 0.0625^{\circ}+\frac{1}{4}\times 0.0625^{\circ}=([-313:1:313]+0.25)\times\pi/180/16 \quad (32)

for the \alpha-coordinate of the detector.

We constrain the number of detector rows to 16, so the positions of the detector units along the w-coordinate are different for different p. For p=7\pi, the maximal w in equation (26) is

w_{L}=\max(w_{max},-w_{min})=3.8819, \quad (33)

and so \nabla w=2w_{max}/15=0.5176. Therefore, we set

w_{cor}=[-7.5:1:7.5]\times 0.5176 \quad (34)

for p=7\pi on the w-coordinate. Similarly, for p=14\pi, we have

w_{L}=\max(w_{max},-w_{min})=7.7499, \quad (35)

and so \nabla w=2w_{max}/15=1.0333, and we set

w_{cor}=[-7.5:1:7.5]\times 1.0333. \quad (36)

In summary, the scanning parameters of the helical CT are listed in Table I.

III-C3 Data generation

We use the geometry described above to simulate scanning of the CT image labels to generate the sinogram labels g_{GT}, where the slices of the ground truth CT images are equally spaced on the z-coordinate. For p=7\pi, the z-coordinate of the ground truth CT images in one pitch is

z_{cor}=\mathrm{linspace}(0,2\pi h,\mathrm{ceil}(3h))=[0:1:10]\times 2.1991 \quad (37)

and for p=14\pi it is

z_{cor}=\mathrm{linspace}(0,2\pi h,\mathrm{ceil}(2h))=[0:1:13]\times 3.3833 \quad (38)

Therefore, in one pitch, we need to reconstruct 11 slices of CT images for p=7\pi and 14 slices for p=14\pi.

The sinogram g_{GT} is obtained by scanning a whole patient, so we need to divide it into pitches to obtain g_{GT}^{k}, where

g_{GT}^{k}=g_{GT}[\lambda_{1}^{k}:1:\lambda_{2}^{k},:,:], \quad \lambda_{1}^{k}=(\lambda_{min}-\lambda_{0}+2k\pi)/\nabla\lambda, \quad \lambda_{2}^{k}=(\lambda_{max}-\lambda_{0}+2k\pi)/\nabla\lambda \quad (39)

for k=2,...,M-1, where \lambda_{min} and \lambda_{max} are defined in equation (26), \nabla\lambda=\pi/180 and \lambda_{0} is the start angle of the scanning helix. For p=7\pi, \lambda_{min}=-2.1720 and \lambda_{max}=7.9625; for p=14\pi, \lambda_{min}=-2.1875 and \lambda_{max}=8.1025. To avoid the zero values, the data of the first and last pitches are discarded.

Correspondingly, the ground truth CT images f_{GT} also need to be divided into pitches to get f_{GT}^{k}, where

f_{GT}^{k}=f_{GT}[:,:,z_{1}^{k}:1:z_{2}^{k}], \quad z_{1}^{k}=k\times z_{L}+1, \quad z_{2}^{k}=(k+1)\times z_{L} \quad (40)

for k=2,...,M-1, with z_{L}=11 for p=7\pi and z_{L}=14 for p=14\pi.
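
A small NumPy sketch of this pitch-wise slicing of the sinogram and CT labels, eqs. (39)-(40), could look as follows; the index rounding and the zero-based indexing convention are our assumptions.

```python
import numpy as np

def slice_pitch(g_gt, f_gt, k, lam_min, lam_max, lam0, d_lam, z_len):
    """Sketch of eqs. (39)-(40): labels of the k-th pitch (k = 2, ..., M-1).

    g_gt: full sinogram (n_lambda, n_alpha, n_w); f_gt: full CT volume
    (ny, nx, nz); d_lam is the angular sampling step (pi/180 in the paper).
    """
    l1 = int(round((lam_min - lam0 + 2 * k * np.pi) / d_lam))
    l2 = int(round((lam_max - lam0 + 2 * k * np.pi) / d_lam))
    g_k = g_gt[l1:l2 + 1, :, :]
    f_k = f_gt[:, :, k * z_len:(k + 1) * z_len]
    return g_k, f_k
```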

After obtaining the sinogram labels g^{k}_{GT}, we downsample them on the \alpha-coordinate at

\alpha_{sp}=\alpha_{cor}[0:4:\mathrm{end}] \quad (41)

to obtain the sparse sinograms g^{k}_{sp}. We then upsample g^{k}_{sp} by interpolation to make it the same size as g^{k}_{GT} and add 'Gaussian+Poisson' noise [41] to the upsampled sinogram g^{k}_{up} to obtain the input sinograms g^{k}_{in} of our network, where

g=I_{0}\times\exp(-g^{k}_{up}/M),
g=g+\mathrm{Poisson}(g)+\mathrm{Gaussian}(m,\mathrm{var}), \quad (42)
g^{k}_{in}=\log\left(I_{0}/g\right)\times M,

M is the maximal value of g^{k}_{up}, I_{0}=10^{5} is the photon count, and m=0 and \mathrm{var}=0.5 are the mean and variance of the Gaussian noise, respectively.
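
A minimal NumPy sketch of this noise simulation (42) is given below; the function name is ours, and the model follows (42) literally (the sampled Poisson term is added to, rather than substituted for, the ideal counts).

```python
import numpy as np

def add_mixed_noise(g_up, I0=1e5, mean=0.0, var=0.5, rng=None):
    """Sketch of the 'Gaussian+Poisson' noise model (42) applied to an
    upsampled line-integral sinogram g_up."""
    rng = np.random.default_rng() if rng is None else rng
    M = g_up.max()
    counts = I0 * np.exp(-g_up / M)                    # ideal photon counts
    counts = counts + rng.poisson(counts) + rng.normal(mean, np.sqrt(var), g_up.shape)
    return np.log(I0 / counts) * M                     # back to line integrals
```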

The CT image f^{k}_{in} reconstructed from g^{k}_{in} is used as the input of the compared post-processing methods.

TABLE I: The Parameters to Generate the Training Data
Radius of helix: 595
Distance of source to detector: 1085.6
x-coordinate of CT image: [-256:1:255]\times 1
y-coordinate of CT image: [-256:1:255]\times 1
z-coordinate of CT image in one pitch: [0:1:10]\times 2.1991 (p=7\pi); [0:1:13]\times 3.3833 (p=14\pi)
\lambda-coordinate in one pitch: [-125:1:455]\times 0.0175 (p=7\pi); [-125:1:463]\times 0.0175 (p=14\pi)
\alpha-coordinate of detector: [-312.75:1:313.25]\times 0.0011
w-coordinate of detector: [-7.5:1:7.5]\times 0.5176 (p=7\pi); [-7.5:1:7.5]\times 1.0333 (p=14\pi)
Figure 2: The reconstructed results of NIH-AAPM-Mayo data for p=7\pi. Columns from left to right: (a) Label, (b) Katsevich algorithm, (c) Nonlocal-TV, (d) FBPConvNet, (e) Ours. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].
Figure 3: The reconstructed results of head and neck CT data for p=7\pi. Columns from left to right: (a) Label, (b) Katsevich algorithm, (c) FBPConvNet, (d) Ours. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].

IV Experimental Results

In this section, we present simulated experimental results to demonstrate the effectiveness of our method and compare it with two related methods: the nonlocal-TV iterative algorithm [42] and the post-processing method FBPConvNet [43], where the convolutional filters in FBPConvNet are extended from 2D to 3D.

As described in Subsection III-C, the CT images in “the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge” [40] are used to prepare the experimental data, where the data generated from the six patients named 'L192', 'L286', 'L291', 'L310', 'L333' and 'L506' are used as the training data and those from the other four patients named 'L067', 'L096', 'L109' and 'L143' are used as the test data. The training data contain in total 345 pitches (of size 512\times 512\times 10) for p=7\pi and 260 pitches (of size 512\times 512\times 13) for p=14\pi, and the test data contain 215 pitches for p=7\pi and 163 pitches for p=14\pi, where 10% of the training data is randomly chosen for validation. Note that since the last CT image of one pitch overlaps with the first CT image of the next pitch, we exclude the last CT image of every pitch from our training and testing data.

For the nonlocal-TV iterative algorithm, we tune its parameters empirically in order to obtain the best results. We use the default parameters of FBPConvNet to train it, where the number of training epochs is 100. The CT images reconstructed from the noisy sinograms by the Katsevich algorithm are used as the inputs of FBPConvNet, where the batch size is 5. We train our network for 60 epochs with a batch size of 1. The size of the learnable parameters of FBPConvNet is about 12.4 MB, while that of ours is about 2.3 MB.

IV-A Experimental Results for p=7\pi

In Fig. 2, five slices of CT images reconstructed from the test data for p=7\pi are shown. We can observe that the images reconstructed by the Katsevich algorithm suffer from severe streak artifacts due to the sparsity of the sinogram. The images reconstructed by nonlocal-TV are somewhat over-smoothed and many details are lost. FBPConvNet can reconstruct CT images with some details, but some boundaries in its reconstructed CT images are still blurred. Compared to nonlocal-TV and FBPConvNet, our method reconstructs more details and sharper boundaries, as shown in Fig. 2e. To better observe the differences, the areas marked by the red rectangles in the reconstructed images are enlarged and presented in the lower-left corners.

TABLE II: The RMSE and SSIM of the four methods for the NIH-AAPM-Mayo test data.
Pitch | Method | RMSE | SSIM
p=7\pi | Katsevich | 97.616±7.549 | 0.777±0.028
p=7\pi | Nonlocal-TV | 93.174±7.271 | 0.815±0.028
p=7\pi | FBPConvNet | 70.329±8.169 | 0.845±0.028
p=7\pi | Ours | 59.068±7.028 | 0.867±0.027
p=14\pi | Katsevich | 97.783±7.493 | 0.776±0.028
p=14\pi | Nonlocal-TV | 93.362±7.217 | 0.814±0.028
p=14\pi | FBPConvNet | 71.609±8.140 | 0.842±0.028
p=14\pi | Ours | 60.028±6.988 | 0.865±0.027

To quantitatively evaluate the performance of the compared methods, two metrics, the root-mean-square error (RMSE) and the structural similarity (SSIM), are used to measure the similarity between the reconstructed images and the labels. The definition of the RMSE is

\operatorname{RMSE}=\sqrt{\frac{\|f-f_{GT}\|_{2}^{2}}{n}}, \quad (43)

where f and f_{GT} are, respectively, the reconstructed CT image and the label image, and n is the number of pixels. The SSIM is defined as

\operatorname{SSIM}=\frac{\left(2\mu_{1}\mu_{2}+c_{1}\right)\left(2\sigma_{1,2}+c_{2}\right)}{\left(\mu_{1}^{2}+\mu_{2}^{2}+c_{1}\right)\left(\sigma_{1}^{2}+\sigma_{2}^{2}+c_{2}\right)}, \quad (44)

where

c_{1}=(0.01\times(\max(f_{GT})-\min(f_{GT})))^{2}, \quad c_{2}=(0.03\times(\max(f_{GT})-\min(f_{GT})))^{2}, \quad (45)

\mu_{1} and \sigma_{1}^{2} are the mean and variance of the reconstructed CT image f, \mu_{2} and \sigma_{2}^{2} are the mean and variance of the reference image f_{GT}, and \sigma_{1,2} is the covariance between f and f_{GT}.
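
For reference, a small NumPy sketch of these two metrics is given below; the global (whole-image) form of the SSIM shown here is our simplification, since (44) is stated without local windowing.

```python
import numpy as np

def rmse(f, f_gt):
    """Eq. (43): root-mean-square error between reconstruction and label."""
    return np.sqrt(np.mean((f - f_gt) ** 2))

def ssim_global(f, f_gt):
    """Eqs. (44)-(45) evaluated once over the whole image."""
    L = f_gt.max() - f_gt.min()
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mu1, mu2 = f.mean(), f_gt.mean()
    var1, var2 = f.var(), f_gt.var()
    cov = ((f - mu1) * (f_gt - mu2)).mean()
    return ((2 * mu1 * mu2 + c1) * (2 * cov + c2)) / (
        (mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2))
```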

In Table II, the average RMSE and SSIM of the CT images reconstructed by the four methods are listed. It can be observed that our network achieves the lowest RMSE and the highest SSIM on average, which coincides with our subjective evaluation.

\begin{overpic}[width=177.78578pt,percent]{./data1/red/label10-eps-converted-to.pdf} \put(0.0,0.0){\color[rgb]{1,0,0}\framebox(0.0,0.0)[bl]{\includegraphics[scale={0.08}]{./data1/enlarge/clabel10-eps-converted-to.pdf}} } \end{overpic}
(a)
\begin{overpic}[width=177.78578pt]{./data1/red/Kati10-eps-converted-to.pdf} \put(0.0,0.0){\color[rgb]{1,0,0}\framebox(0.0,0.0)[bl]{\includegraphics[scale={0.08}]{./data1/enlarge/cKati10-eps-converted-to.pdf}} } \end{overpic}
(b)
\begin{overpic}[width=177.78578pt]{./data1/red/TV10-eps-converted-to.pdf} \put(0.0,0.0){\color[rgb]{1,0,0}\framebox(0.0,0.0)[bl]{\includegraphics[scale={0.08}]{./data1/enlarge/cTV10-eps-converted-to.pdf}} } \end{overpic}
(c)
[Figure 4 image panels (enlarged regions marked by red boxes); panels: (a) Label, (b) Katsevich algorithm, (c) Nonlocal-TV, (d) FBPConvNet, (e) Ours.]
Figure 4: The reconstructed CT images of the NIH-AAPM-Mayo test data for $p=14\pi$. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].
[Figure 5 image panels (enlarged regions marked by red boxes); panels: (a) Label, (b) Katsevich algorithm, (c) FBPConvNet, (d) Ours.]
Figure 5: The reconstructed results of the head and neck CT data for $p=14\pi$. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].
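As an aside on visualization, the figure captions specify a linear stretch to [0, 1] followed by a [0, 0.6] display window. The sketch below shows one way such a display could be produced; it is our own illustration of the windowing step (the function name show_ct and the synthetic input are assumptions), not the plotting code used for the figures.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_ct(slice_img, window=(0.0, 0.6)):
    """Linearly stretch a CT slice to [0, 1] and display it with a fixed window."""
    lo, hi = slice_img.min(), slice_img.max()
    stretched = (slice_img - lo) / (hi - lo + 1e-12)   # linear stretch to [0, 1]
    plt.imshow(stretched, cmap="gray", vmin=window[0], vmax=window[1])
    plt.axis("off")
    plt.show()

# Example with a synthetic slice standing in for a reconstructed CT image.
show_ct(np.random.rand(512, 512))
```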

To test the generalization capability of our method, we apply our network and FBPConvNet to another type of CT, head and neck CT, where the head and neck CT dataset [44] is used to simulate the test data while both networks are trained on the NIH-AAPM-Mayo data only. Some reconstructed head and neck CT images are presented in Fig. 3. We can observe that our method reconstructs more fine details and clearer boundaries than FBPConvNet, which demonstrates that our method has good generalization capability. The average RMSE and SSIM of the reconstructed head and neck CT images are listed in Table III; our network achieves lower RMSE and higher SSIM than FBPConvNet on average.

TABLE III: The RMSE and SSIM (mean $\pm$ standard deviation) of the three methods on the head and neck CT test data.
Pitch       Method        RMSE                  SSIM
$p=7\pi$    Katsevich     76.657 $\pm$ 12.371   0.863 $\pm$ 0.031
$p=7\pi$    FBPConvNet    64.633 $\pm$ 12.058   0.939 $\pm$ 0.009
$p=7\pi$    Ours          47.807 $\pm$ 9.870    0.959 $\pm$ 0.006
$p=14\pi$   Katsevich     77.104 $\pm$ 12.473   0.861 $\pm$ 0.031
$p=14\pi$   FBPConvNet    65.427 $\pm$ 12.627   0.937 $\pm$ 0.010
$p=14\pi$   Ours          49.357 $\pm$ 10.729   0.955 $\pm$ 0.009
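For reference, the quantitative scores reported in Tables II and III are per-slice statistics (mean and standard deviation over the test slices) of standard image-quality metrics. The following sketch shows one way to compute RMSE, PSNR and SSIM over a stack of slices; it is our own illustration, and the intensity scale (data_range) and the helper name evaluate are assumptions rather than the evaluation code used in the paper.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(recons, labels, data_range=255.0):
    """Mean and standard deviation of RMSE, PSNR and SSIM over 2-D slices."""
    rmse, psnr, ssim = [], [], []
    for x, y in zip(recons, labels):        # x: reconstruction, y: ground-truth label
        rmse.append(np.sqrt(np.mean((x - y) ** 2)))
        psnr.append(peak_signal_noise_ratio(y, x, data_range=data_range))
        ssim.append(structural_similarity(y, x, data_range=data_range))

    def stats(values):
        return float(np.mean(values)), float(np.std(values))

    return {"RMSE": stats(rmse), "PSNR": stats(psnr), "SSIM": stats(ssim)}

# Hypothetical usage: recon_slices and label_slices are lists of 2-D float arrays.
# scores = evaluate(recon_slices, label_slices, data_range=255.0)
```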

IV-B Experimental Results for $p=14\pi$

In this subsection, we present experimental results showing the performance of our network for pitch $p=14\pi$.

In Fig. 4, we present five CT images reconstructed by the Katsevich algorithm, Nonlocal-TV, FBPConvNet and our method from the NIH-AAPM-Mayo test data. The images reconstructed by the Katsevich algorithm suffer from severe streak artifacts, Nonlocal-TV tends to over-smooth the reconstructions, and FBPConvNet loses some fine details and boundaries. Compared with these methods, our method simultaneously suppresses the streak artifacts and recovers fine details and sharp boundaries. In Table II, we list the average RMSE and PSNR of the CT images reconstructed by the four methods, from which we observe that our method achieves the lowest RMSE and the highest PSNR on average. We also observe that, for each method, the average RMSE and PSNR for $p=7\pi$ and $p=14\pi$ differ only slightly. Since our network uses the Katsevich algorithm to transform sinograms into CT images, and the inputs to FBPConvNet are also the CT images reconstructed by the Katsevich algorithm, this indicates that the Katsevich algorithm is robust to the helical pitch $p$.

We also test the generalization capabilities of FBPConvNet and our network for $p=14\pi$. The CT images reconstructed by FBPConvNet and our network from the head and neck test data for $p=14\pi$ are shown in Fig. 5. The images reconstructed by FBPConvNet are somewhat blurry and lose some fine details, whereas our method reconstructs CT images with better boundaries and details. The average RMSE and SSIM of the reconstructed CT images are also listed in Table III, where our method achieves lower RMSE and higher SSIM than FBPConvNet on average.

V Discussion and Conclusion

In this paper, we developed a new GPU implementation of the Katsevich algorithm [4] for helical CT that reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our scheme only needs to calculate these parameters once for all pitches and is therefore very suitable for deep learning. By embedding this implementation into a CNN, we proposed an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Experiments show that our end-to-end deep network outperforms the Nonlocal-TV based iterative algorithm and the post-processing deep network FBPConvNet in both subjective and objective evaluations.
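To make the parameter-reuse idea concrete, the toy sketch below illustrates why quantities that depend only on the source angle modulo one helix turn can be tabulated once and shared by every pitch. It is a schematic illustration under that assumption, not the paper's actual GPU implementation, and all names and sizes in it are hypothetical.

```python
import numpy as np

n_views_per_turn = 64        # hypothetical angular sampling of one helix turn
n_pitches = 10               # number of pitches covered by the scan

# Parameters that repeat from pitch to pitch are computed ONCE for a single turn
# (here a stand-in table indexed by the source angle modulo 2*pi).
angles = np.linspace(0.0, 2.0 * np.pi, n_views_per_turn, endpoint=False)
params_one_pitch = np.stack([np.cos(angles), np.sin(angles)], axis=1)

slabs = []
for k in range(n_pitches):
    z_shift = k * 1.0                      # hypothetical table translation per pitch
    # Reconstruction of the k-th slab reuses params_one_pitch; only the z-offset
    # changes, so nothing is recomputed and no extra GPU memory is needed.
    slab = np.full((8, 16, 16), params_one_pitch[:, 0].sum() + z_shift)
    slabs.append(slab)

volume = np.concatenate(slabs, axis=0)     # slabs stacked along the z-direction
```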

One limitation of our method is that it requires the pitch $p$ of the helical scan to be constant, a restriction inherited from the Katsevich algorithm. In the literature, there exist exact algorithms for helical CT reconstruction with variable pitches. However, the periodic properties of the parameters may not hold for these algorithms, so the parameters would have to be computed for every slice, which consumes a large amount of GPU memory and is not well suited for deep learning. In our future work, we will investigate how to address this issue.

References

  • [1] F. Noo, J. Pack, and D. Heuscher, “Exact helical reconstruction using native cone-beam geometries,” Physics in Medicine & Biology, vol. 48, no. 23, pp. 3787–3818, Nov. 2003.
  • [2] K. C. Tam, S. Samarasekera, and F. Sauer, “Exact cone beam CT with a spiral scan,” Physics in Medicine & Biology, vol. 43, no. 4, pp. 1015–1024, Apr. 1998.
  • [3] P. Danielsson, P. Edholm, J. Eriksson, and M. Magnusson Seger, “Towards exact reconstruction for helical cone-beam scanning of long objects. A new detector arrangement and a new completeness condition,” in Proc. 1997 Meeting on Fully 3D Image Reconstruction in Radiology and Nuclear Medicine (Pittsburgh, PA), D. Townsend and P. Kinahan, Eds., 1997, pp. 141–144.
  • [4] A. Katsevich, “An improved exact filtered backprojection algorithm for spiral computed tomography,” Advances in Applied Mathematics, vol. 32, no. 4, pp. 681–697, May 2004.
  • [5] A. Zheng, H. Gao, L. Zhang, and Y. Xing, “A dual-domain deep learning-based reconstruction method for fully 3d sparse data helical CT,” Physics in Medicine & Biology, vol. 65, no. 24, p. 245030, Dec. 2020.
  • [6] F. Noo, M. Defrise, and R. Clackdoyle, “Single-slice rebinning method for helical cone-beam CT,” Physics in Medicine & Biology, vol. 44, no. 2, pp. 561–570, Jan. 1999.
  • [7] K. Stierstorfer, A. Rauscher, J. Boese, H. Bruder, S. Schaller, and T. Flohr, “Weighted fbp–a simple approximate 3d fbp algorithm for multislice spiral CT with good dose usage for arbitrary pitch,” Physics in Medicine & Biology, vol. 49, no. 11, pp. 2209–2218, May 2004.
  • [8] G. Wang, T. H. Lin, P. Cheng, and D. M. Shinozaki, “A general cone-beam reconstruction algorithm,” IEEE Transactions on Medical Imaging, vol. 12, no. 3, pp. 486–496, 1993.
  • [9] J. Guo, L. Zeng, and X. Zou, “An improved half-covered helical cone-beam CT reconstruction algorithm based on localized reconstruction filter,” Journal of X-Ray Science and Technology, vol. 19, no. 3, pp. 293–312, 2011.
  • [10] J. Zhao, Y. Lu, Y. Jin, E. Bai, and G. Wang, “Feldkamp-type reconstruction algorithms for spiral cone-beam CT with variable pitch,” Journal of X-Ray Science and Technology, vol. 15, no. 4, pp. 177–196, 2007.
  • [11] L. A. Feldkamp, L. C. Davis, and J. W. Kress, “Practical cone-beam algorithm,” Journal of the Optical Society of America A, vol. 1, no. 6, pp. 612–619, Jun 1984.
  • [12] E. Y. Sidky and X. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Physics in Medicine & Biology, vol. 53, no. 17, pp. 4777–4807, Aug. 2008.
  • [13] Y. Liu, J. Ma, Y. Fan, and Z. Liang, “Adaptive-weighted total variation minimization for sparse data toward low-dose x-ray computed tomography image reconstruction,” Physics in Medicine & Biology, vol. 57, no. 23, pp. 7923–7956, Nov. 2012.
  • [14] Y. Chen, D. Gao, C. Nie, L. Luo, W. Chen, X. Yin, and Y. Lin, “Bayesian statistical reconstruction for low-dose x-ray computed tomography using an adaptive-weighting nonlocal prior,” Computerized Medical Imaging and Graphics, vol. 33, no. 7, pp. 495–500, Oct. 2009.
  • [15] Q. Xu, H. Yu, X. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose x-ray CT reconstruction via dictionary learning,” IEEE Transactions on Medical Imaging, vol. 31, no. 9, pp. 1682–1697, 2012.
  • [16] P. Bao, W. Xia, K. Yang, W. Chen, M. Chen, Y. Xi, S. Niu, J. Zhou, H. Zhang, H. Sun, Z. Wang, and Y. Zhang, “Convolutional sparse coding for compressed sensing CT reconstruction,” IEEE Transactions on Medical Imaging, vol. 38, no. 11, pp. 2607–2619, 2019.
  • [17] J. Sunnegårdh and P.-E. Danielsson, “Regularized iterative weighted filtered backprojection for helical cone-beam CT,” Medical Physics, vol. 35, no. 9, pp. 4173–4185, Nov. 2008.
  • [18] W. Yu and L. Zeng, “Iterative image reconstruction for limited-angle inverse helical cone-beam computed tomography,” Scanning, vol. 38, no. 1, pp. 4–13, Nov. 2016.
  • [19] A. Katsevich, “Theoretically exact filtered backprojection-type inversion algorithm for spiral CT,” SIAM Journal on Applied Mathematics, vol. 62, no. 6, pp. 2012–2026, Nov. 2002.
  • [20] Y. Zou and X. Pan, “Image reconstruction on pi-lines by use of filtered backprojection in helical cone-beam CT,” Physics in Medicine & Biology, vol. 49, no. 12, pp. 2717–2731, Jun. 2004.
  • [21] ——, “Exact image reconstruction on pi-lines from minimum data in helical cone-beam CT,” Physics in Medicine & Biology, vol. 49, no. 6, pp. 941–959, Feb. 2004.
  • [22] Y. Ye and G. Wang, “Filtered backprojection formula for exact image reconstruction from cone-beam data along a general scanning curve,” Medical Physics, vol. 32, no. 1, pp. 42–48, Nov. 2005.
  • [23] Y. Zou, X. Pan, D. Xia, and G. Wang, “Pi-line-based image reconstruction in helical cone-beam computed tomography with a variable pitch,” Medical Physics, vol. 32, no. 8, pp. 2639–2648, Nov. 2005.
  • [24] G. Yan, J. Tian, S. Zhu, C. Qin, Y. Dai, F. Yang, D. Dong, and P. Wu, “Fast Katsevich algorithm based on GPU for helical cone-beam computed tomography,” IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 4, pp. 1053–1061, 2010.
  • [25] H. Lee, J. Lee, H. Kim, B. Cho, and S. Cho, “Deep-neural-network-based sinogram synthesis for sparse-view CT image reconstruction,” IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 109–119, 2019.
  • [26] J. Fu, J. Dong, and F. Zhao, “A deep learning reconstruction framework for differential phase-contrast computed tomography with incomplete data,” IEEE Transactions on Image Processing, vol. 29, pp. 2190–2202, 2020.
  • [27] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
  • [28] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
  • [29] Z. Zhang, X. Liang, X. Dong, Y. Xie, and G. Cao, “A sparse-view CT reconstruction method based on combination of DenseNet and deconvolution,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1407–1417, 2018.
  • [30] Y. Ge, T. Su, J. Zhu, X. Deng, Q. Zhang, J. Chen, Z. Hu, H. Zheng, and D. Liang, “Adaptive-net: deep computed tomography reconstruction network with analytical domain transformation knowledge,” Quantitative Imaging in Medicine and Surgery, vol. 10, no. 2, pp. 415–427, 2020.
  • [31] W. Lin, H. Liao, C. Peng, X. Sun, J. Zhang, J. Luo, R. Chellappa, and S. K. Zhou, “DuDoNet: Dual domain network for CT metal artifact reduction,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 10504–10513.
  • [32] Q. Zhang, Z. Hu, C. Jiang, H. Zheng, Y. Ge, and D. Liang, “Artifact removal using a hybrid-domain convolutional neural network for limited-angle computed tomography imaging,” Physics in Medicine & Biology, vol. 65, no. 15, p. 155010, Aug. 2020.
  • [33] W. Wang, X. G. Xia, C. He, Z. Ren, J. Lu, T. Wang, and B. Lei, “An end-to-end deep network for reconstructing CT images directly from sparse sinograms,” IEEE Transactions on Computational Imaging, vol. 6, pp. 1548–1560, 2020.
  • [34] H. Chen, Y. Zhang, Y. Chen, J. Zhang, W. Zhang, H. Sun, Y. Lv, P. Liao, J. Zhou, and G. Wang, “Learn: Learned experts’ assessment-based reconstruction network for sparse-data CT,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1333–1347, 2018.
  • [35] J. He, Y. Yang, Y. Wang, D. Zeng, Z. Bian, H. Zhang, J. Sun, Z. Xu, and J. Ma, “Optimizing a parameterized plug-and-play admm for iterative low-dose CT reconstruction,” IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 371–382, 2019.
  • [36] S. Izen, “A fast algorithm to compute the $\pi$-line through points inside a helix cylinder,” Proceedings of the American Mathematical Society, vol. 135, pp. 269–276, Jul. 2006.
  • [37] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
  • [38] R. Hecht-Nielsen, “Theory of the Backpropagation Neural Network,” in IJCNN: International Joint Conference on Neural Networks, 1989, pp. 593–605.
  • [39] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • [40] C. McCollough, “TU-FG-207A-04: Overview of the Low Dose CT Grand Challenge,” Medical Physics, vol. 43, pp. 3759–3760, 2016.
  • [41] Y. Liu, J. Ma, Y. Fan, and Z. Liang, “Adaptive-Weighted Total Variation Minimization for Sparse Data Toward Low-Dose X-Ray Computed Tomography Image Reconstruction,” Physics in Medicine & Biology, vol. 57, no. 23, pp. 7923–7956, 2012.
  • [42] J. Zhang, S. Liu, R. Xiong, S. Ma, and D. Zhao, “Improved total variation based image compressive sensing recovery by nonlocal regularization,” in IEEE International Symposium on Circuits and Systems (ISCAS), 2013, pp. 2836–2839.
  • [43] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep Convolutional Neural Network for Inverse Problems in Imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
  • [44] T. Bejarano, M. De Ornelas-Couto, and I. B. Mihaylov, “Longitudinal fan-beam computed tomography dataset for head-and-neck squamous cell carcinoma patients,” Medical Physics, vol. 46, no. 5, pp. 2526–2537, 2019.