
A three-dimensional dual-domain deep network for high-pitch and sparse helical CT reconstruction

Wei Wang, Xiang-Gen Xia, Chuanjiang He, Zemin Ren and Jian Lu

This work was supported partly by the National Natural Science Foundation of China (Nos. 12001381 and 61871274), Peacock Plan (No. KQTD2016053112051497), and Shenzhen Key Basic Research Project (Nos. JCYJ20180507184647636, JCYJ20170818094109846, and JCYJ20190808155618806). Wei Wang is with the School of Biomedical Engineering, Shenzhen University, Shenzhen, China (e-mail: [email protected]). Xiang-Gen Xia is with the Department of Electrical and Computer Engineering, University of Delaware, Newark, DE 19716, USA (e-mail: [email protected]). Chuanjiang He is with the College of Mathematics and Statistics, Chongqing University, Chongqing, China (e-mail: [email protected]). Zemin Ren is with the College of Mathematics and Physics, Chongqing University of Science and Technology, Chongqing, China (e-mail: [email protected]). Jian Lu is with the Shenzhen Key Laboratory of Advanced Machine Learning and Applications, Shenzhen University, Shenzhen, China (e-mail: [email protected]).
Abstract

In this paper, we propose a new GPU implementation of the Katsevich algorithm for helical CT reconstruction. Our implementation divides the sinograms and reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our method needs to calculate these parameters only once for all pitches, which lowers the GPU-memory burden and makes it well suited for deep learning. By embedding our implementation into the network, we propose an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Since our network utilizes features extracted from both the sinograms and the CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of the sinograms and preserve fine details in the CT images. Experiments show that our network outperforms the related methods in both subjective and objective evaluations.

Index Terms:
Helical CT, sparse CT, deep learning, Katsevich algorithm, high pitch

I Introduction

Computed tomography (CT) has been an important tool in medical diagnosis, and the helical scan with a multi-row detector is the most widely used scanning modality in hospitals. During scanning, the X-ray source and detector rotate around the central axis (i.e., the z-direction) while the table, with the patient lying stationary on it, is translated along the z-direction at a constant velocity. Therefore, the trajectory of the X-ray source relative to the patient forms a helical curve, where the pitch of the helix is related to the moving velocity of the table.

The patient needs to stay stationary during the scanning process; otherwise, severe artifacts will appear in the reconstructed CT images. Thus, high-pitch scanning, which can complete the data acquisition in a short time, has many merits and is very desirable in clinical diagnosis. For example, with high-pitch helical CT, the whole lung can be scanned within a very short interval, so the patient does not need to hold the breath. This is very useful for patients who cannot hold their breath for a long time, such as infants, elderly patients and unconscious patients. For cardiac CT, scanning the whole heart within one heartbeat is a reasonable request.

In [1], Noo et al. used the Tam-Danielsson window [2][3] and Katsevich's inversion formula [4] to derive the number of detector rows required to exactly reconstruct helical CT images with a curved or a flat detector. When the other parameters are fixed, the higher the pitch, the more detector rows are needed. However, as the number of detector rows increases, the price of the CT scanner also rises significantly. One way to reduce the price is to use a sparse detector [5] with a fixed number of detector rows. However, CT images reconstructed from sparse sinograms usually suffer from streak artifacts and lose fine details, especially when the scanning pitch is high.

In the literature, many algorithms were proposed to reconstruct the helical CT images. These algorithms can be mainly categorized into four classes: (a) rebinning algorithms, (b) FDK-like algorithms, (c) model-based iterative methods, (d) exact algorithms.

The rebinning algorithms [6][7] first convert the measured helical sinograms into 2D fan-beam or parallel-beam sinograms and then reconstruct the CT images by using any 2D reconstruction method. To reduce the interpolation error and accurately estimate the 2D sinograms, the sampling rate of the helical scan usually needs to be high.

The FDK-like algorithms [8][9][10] are extended from the circular cone-beam reconstruction algorithm proposed by Feldkamp, Davis and Kress [11]. The FDK-like algorithms usually involve three steps: 1) filtering the sinograms, 2) weighting the filtered data, and 3) backprojecting the weighted data. By utilizing the fast Fourier transform (FFT), the FDK-like algorithms can be implemented efficiently. However, since the FDK-like algorithms were proposed heuristically and lack theoretical support, artifacts in the reconstruction are usually hard to avoid, especially when the cone angle is large.

Model-based methods usually first formulate minimization functionals by combining the statistical properties of the sinogram with different prior assumptions, and then convert the functionals into iterative reconstruction algorithms using optimization methods. In the literature, total variation (TV) [12][13], nonlocal means (NLM) [14] and dictionary learning [15][16] are the commonly used priors. In [7], Stierstorfer et al. proposed an iterative weighted filtered backprojection (WFBP) method to reconstruct helical CT images. In [17], Sunnegårdh proposed an improved regularized iterative weighted filtered backprojection (RIWFBP) method for helical cone-beam CT. In [18], Yu and Zeng developed a fast TV-based iterative reconstruction algorithm for limited-angle helical cone-beam CT reconstruction. Model-based iterative methods are very suitable for ill-posed problems, such as limited-angle or sparse-angle CT reconstructions, but their computational burdens are too high to reconstruct the images in real time, especially for 3D helical CT.

Exact helical CT reconstruction can be obtained by using Katsevich's algorithms [19][4] or Zou and Pan's PI-line algorithms [20][21]. These algorithms have exact mathematical inversion formulas for helical CT and so are theoretically exact. In [22][23], by extending Katsevich's algorithms and the PI-line algorithms, exact methods for cone-beam CT reconstruction with general scanning curves were proposed. In [24], Yan et al. accelerated the implementation of the Katsevich algorithm [4] by utilizing the graphics processing unit (GPU). In their implementation, the Tam-Danielsson window rather than the PI-line is used to determine whether a sinogram corresponding to a scanning angle needs to be backprojected to reconstruct a slice of the CT image. Their method does not utilize the periodic properties of the PI-line and so needs to calculate the backprojection positions on the detector for the pixels in all the CT images. Exact helical algorithms can reconstruct good-quality CT images even if the helical pitch is high, but when the sinograms are sparse, streak artifacts in the CT images reconstructed by the exact algorithms cannot be avoided.

Recently, convolutional neural network (CNN) based methods have also been used for CT reconstruction. These methods can be roughly divided into three categories. The first category uses CNNs only to pre-process the sinograms [25][26] or post-process the CT images [27][28][29]. This type of method needs an extra step to convert the sinograms to CT images. The second category utilizes end-to-end CNNs to reconstruct CT images directly from measured sinograms [30][31][32][5][33]. Since the reconstruction algorithm is modelled as network layers embedded in the CNNs, no extra step is needed when testing or deploying these trained CNNs. However, embedding the reconstruction algorithm in CNNs is GPU-memory expensive and may cause out-of-memory errors when training this type of CNN, especially for 3D CT. The third category uses CNNs to approximate the solutions of unrolled iterative algorithms [34][35], where each iteration of the algorithm is approximated by a subnetwork. This type of algorithm consumes more GPU memory than those in the second category, since the projection and backprojection operators need to be performed n times in the network, where n is the number of iterations.

In this paper, we focus on embedding the Katsevich algorithm [4] into a CNN to propose an end-to-end deep network that reconstructs helical CT images with high pitch and sparsely spaced detectors. Due to the limited GPU memory, to the best of our knowledge, no exact algorithm (Katsevich's algorithms or PI-line algorithms) has been embedded in CNNs to reconstruct helical CT images in the literature. In [5], an end-to-end deep network was proposed for helical CT, but the sinograms measured from the helical scans were converted to 2D fan-beam sinograms and the embedded reconstruction algorithm was also for the fan-beam geometry. As the helical pitch increases, the image volume involved in generating the rebinned fan-beam sinogram and the error of the estimated fan-beam sinogram both increase. Therefore, for helical CT with high pitch, a CNN based on the rebinning methodology may not work well. The Katsevich algorithm [4] can reconstruct good CT images even if the helical pitch is high. Our network uses the Katsevich algorithm as the domain transfer layer and so can also reconstruct good CT images directly from high-pitch helical sinograms. Moreover, because our network utilizes features extracted from both the sinograms and the CT images, it can simultaneously reduce the streak artifacts caused by the sparsity of the sinograms and reconstruct fine details in the CT images.

The sinogram measured from a helical scan usually has a large size in the z-direction. If the whole sinogram is directly input into a CNN to reconstruct all the CT images, a huge amount of GPU memory is needed. Besides, most of the parameters of the Katsevich algorithm [4] depend on the positions of the CT images, and computing them is both time consuming and GPU-memory consuming. Observing that the parameters of the Katsevich algorithm [4] are periodic, we divide the sinograms into parts and reconstruct the CT images pitch by pitch. In this way, the parameters of the Katsevich algorithm for all pitches can be pre-calculated only once and the GPU-memory load can be reduced.

The rest of the paper is organized as follows. Section II gives our new GPU implementation of the Katsevich algorithm [4]. Section III describes the detailed structure of our proposed network. Section IV presents the experimental results. Discussion and conclusion are given in Section V.

II GPU implementation of Katsevich Algorithm

In this section, we give a new GPU implementation of the Katsevich algorithm [4], which reconstructs the helical CT images pitch by pitch and has low computational and GPU-memory burdens. Our implementation of the Katsevich algorithm [4] is very suitable for deep learning since its parameters are the same for all pitches and so need to be calculated only once.

Katsevich [4] proposed an exact reconstruction formula for helical cone-beam CT and Noo et al. [1] researched how to implement it efficiently and accurately. Let

\underline{a}(\lambda)=(R\cos(\lambda+\lambda_{0}),R\sin(\lambda+\lambda_{0}),z_{0}+p\lambda/2\pi) \quad (1)

be the helical trajectory formed by the X-ray source position, with radius R and pitch p, where

\underline{a}(0)=(R\cos(\lambda_{0}),R\sin(\lambda_{0}),z_{0}) \quad (2)

is the start point of the helix,

U=\{\underline{x}:x^{2}+y^{2}\leq r^{2}\} \quad (3)

is a cylinder inside the helix, 0<r<R and \underline{x}=(x,y,z).
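
As a simple illustration, a minimal NumPy sketch of the source trajectory (1) could look as follows; the function name and default arguments are ours (R = 595 follows the scanning geometry of Section III-C) and are not part of the original implementation.

```python
import numpy as np

def helix_source_position(lam, R=595.0, p=7 * np.pi, lam0=0.0, z0=0.0):
    """Sketch of eq. (1): X-ray source position a(lambda) on the helix.

    lam may be a scalar or an array of rotation angles (in radians);
    R, p, lam0 and z0 are the radius, pitch, start angle and start height.
    """
    lam = np.asarray(lam, dtype=float)
    return np.stack([R * np.cos(lam + lam0),
                     R * np.sin(lam + lam0),
                     z0 + p * lam / (2 * np.pi)], axis=-1)
```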

Then, for any \underline{x}\in U, the reconstruction formula can be written as

f(\underline{x})=-\frac{1}{2\pi}\int_{\lambda_{i}(\underline{x})}^{\lambda_{o}(\underline{x})}\mathrm{d}\lambda\,\frac{1}{\|\underline{x}-\underline{a}(\lambda)\|}\,g^{F}\left(\lambda,\frac{\underline{x}-\underline{a}(\lambda)}{\|\underline{x}-\underline{a}(\lambda)\|}\right) \quad (4)

where \lambda_{i}(\underline{x}) and \lambda_{o}(\underline{x}) are the extremities of the \pi-line passing through \underline{x}, with \lambda_{i}(\underline{x})<\lambda_{o}(\underline{x}),

g^{F}(\lambda,\underline{\theta})=\int_{0}^{2\pi}\mathrm{d}\gamma\,h_{H}(\sin\gamma)\,g^{\prime}(\lambda,\cos\gamma\,\underline{\theta}+\sin\gamma\,(\underline{\theta}\times\underline{m}(\lambda,\underline{\theta}))) \quad (5)

g^{\prime}(\lambda,\underline{\theta})=\lim_{\varepsilon\rightarrow 0}\frac{g(\lambda+\varepsilon,\underline{\theta})-g(\lambda,\underline{\theta})}{\varepsilon}, \quad (6)

h_{H}(s)=-\int_{-\infty}^{+\infty}\mathrm{d}\sigma\,\mathrm{i}\,\operatorname{sign}(\sigma)\,\mathrm{e}^{\mathrm{i}2\pi\sigma s}=\frac{1}{\pi s} \quad (7)

is the Hilbert filter, g(\lambda,\underline{\theta}) is the measured projection data at the X-ray source position \underline{a}(\lambda) in the direction \underline{\theta}, the vector \underline{m}(\lambda,\underline{\theta}) is normal to the \kappa-plane \mathcal{K}(\lambda,\psi) of the smallest |\psi| value that contains the line of direction \underline{\theta} through \underline{a}(\lambda), and the \kappa-plane \mathcal{K}(\lambda,\psi) is any 2D plane that has the three intersections \underline{a}(\lambda), \underline{a}(\lambda+\psi) and \underline{a}(\lambda+2\psi) with the helix, where \psi\in(-\pi,\pi). In [1], to numerically calculate g^{F}(\lambda,\underline{\theta}), the integral over \gamma on the \kappa-plane with the minimum |\psi| is converted to a line integral along the curve where the \kappa-plane intersects the detector plane.

Let g(\lambda,\alpha,w) be the sinogram measured from the helical scan with a curved detector; then the implementation of (4) in [1] is described by the following steps (a rough NumPy sketch of the filtering steps is given after the list):

1. Taking the derivative at constant direction via the chain rule:

   g_{1}(\lambda,\alpha,w)=\left.\left(\frac{\partial g(q,\alpha,w)}{\partial q}+\frac{\partial g(q,\alpha,w)}{\partial\alpha}\right)\right|_{q=\lambda} \quad (8)

2. Length-correction weighting:

   g_{2}(\lambda,\alpha,w)=\frac{D}{\sqrt{D^{2}+w^{2}}}\,g_{1}(\lambda,\alpha,w) \quad (9)

3. Forward height rebinning: computing

   g_{3}(\lambda,\alpha,\psi)=g_{2}\left(\lambda,\alpha,w_{\kappa}(\alpha,\psi)\right) \quad (10)

   by interpolation for all \psi\in\left[-\pi/2-\alpha_{m},\pi/2+\alpha_{m}\right], where \alpha_{m}=\arcsin(r/R) is the half fan angle and

   w_{\kappa}(\alpha,\psi)=\frac{DP}{2\pi R_{o}}\left(\psi\cos\alpha+\frac{\psi}{\tan\psi}\sin\alpha\right) \quad (11)

4. 1-D Hilbert transform in \alpha: at constant \psi, compute

   g_{4}(\lambda,\alpha,\psi)=\int_{-\pi/2}^{+\pi/2}\mathrm{d}\alpha^{\prime}\,h_{H}\left(\sin\left(\alpha-\alpha^{\prime}\right)\right)g_{3}\left(\lambda,\alpha^{\prime},\psi\right) \quad (12)

   where h_{H} is the Hilbert filter defined in (7).

5. Backward height rebinning: compute

   g_{5}(\lambda,\alpha,w)=g_{4}(\lambda,\alpha,\hat{\psi}(\alpha,w)) \quad (13)

   where \hat{\psi}(\alpha,w) is the angle \psi of smallest absolute value that satisfies the equation

   w=\frac{DP}{2\pi R_{o}}\left(\psi\cos\alpha+\frac{\psi}{\tan\psi}\sin\alpha\right) \quad (14)

6. Post-cosine weighting:

   g^{F}(\lambda,\alpha,w)=\cos\alpha\,g_{5}(\lambda,\alpha,w) \quad (15)

7. Backprojection:

   f(\underline{x})=\frac{1}{2\pi}\int_{\lambda_{i}(\underline{x})}^{\lambda_{o}(\underline{x})}\mathrm{d}\lambda\,\frac{1}{v^{*}(\lambda,\underline{x})}\,g^{F}\left(\lambda,\alpha^{*}(\lambda,\underline{x}),w^{*}(\lambda,\underline{x})\right) \quad (16)

   where

   v^{*}(\lambda,\underline{x})=R_{o}-x\cos\left(\lambda+\lambda_{0}\right)-y\sin\left(\lambda+\lambda_{0}\right) \quad (17)

   \alpha^{*}(\lambda,\underline{x})=\arctan\left(\frac{-x\sin\left(\lambda+\lambda_{0}\right)+y\cos\left(\lambda+\lambda_{0}\right)}{v^{*}(\lambda,\underline{x})}\right) \quad (18)

   w^{*}(\lambda,\underline{x})=\frac{D\cos\alpha^{*}(\lambda,\underline{x})}{v^{*}(\lambda,\underline{x})}\left(z-z_{0}-\frac{P}{2\pi}\lambda\right) \quad (19)
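
The following is a rough NumPy sketch of filtering steps 1)-6) on a discretized sinogram; the array layout, grid handling and interpolation choices are our assumptions, and the boundary handling and sampling refinements of a practical implementation are omitted.

```python
import numpy as np

def katsevich_filter(g, d_lam, alpha, w, psi, D, P, R0):
    """Sketch of steps (8)-(15). g has shape (n_lambda, n_alpha, n_w);
    alpha, w, psi are 1-D uniform grids (detector fan angle, detector row
    coordinate, kappa-plane parameter)."""
    # Step 1: derivative at constant direction via the chain rule, eq. (8).
    g1 = np.gradient(g, d_lam, axis=0) + np.gradient(g, alpha, axis=1)
    # Step 2: length-correction weighting, eq. (9).
    g2 = g1 * (D / np.sqrt(D ** 2 + w ** 2))          # broadcasts over the w axis
    # Step 3: forward height rebinning, eqs. (10)-(11).
    ratio = np.ones_like(psi)
    nz = np.abs(psi) > 1e-12
    ratio[nz] = psi[nz] / np.tan(psi[nz])
    w_kappa = (D * P / (2 * np.pi * R0)) * (np.outer(np.cos(alpha), psi)
                                            + np.outer(np.sin(alpha), ratio))
    g3 = np.empty((g.shape[0], alpha.size, psi.size))
    for l in range(g.shape[0]):
        for a in range(alpha.size):
            g3[l, a] = np.interp(w_kappa[a], w, g2[l, a])
    # Step 4: 1-D Hilbert transform in alpha at constant psi, eq. (12),
    # as a discrete convolution with the kernel 1/(pi*sin(k*d_alpha)).
    d_alpha = alpha[1] - alpha[0]
    k = np.arange(-(alpha.size - 1), alpha.size) * d_alpha
    s = np.sin(k)
    hH = np.divide(1.0, np.pi * s, out=np.zeros_like(s), where=s != 0.0)
    g4 = np.empty_like(g3)
    for l in range(g3.shape[0]):
        for q in range(psi.size):
            g4[l, :, q] = d_alpha * np.convolve(g3[l, :, q], hH, mode="valid")
    # Step 5: backward height rebinning, eqs. (13)-(14); we assume
    # w_kappa(alpha, .) is monotonically increasing on the sampled psi range,
    # so psi_hat(alpha, w) can be inverted by interpolation.
    g5 = np.empty_like(g)
    for l in range(g.shape[0]):
        for a in range(alpha.size):
            g5[l, a] = np.interp(w, w_kappa[a], g4[l, a])
    # Step 6: post-cosine weighting, eq. (15).
    return np.cos(alpha)[None, :, None] * g5
```

In our network, this filtering chain is carried out on the GPU as part of the domain transfer layer described in Section III.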

In the above steps, the parameters \lambda_{i}(\underline{x}), \lambda_{o}(\underline{x}), v^{*}(\lambda,\underline{x}), \alpha^{*}(\lambda,\underline{x}), w^{*}(\lambda,\underline{x}), w_{\kappa}(\alpha,\psi) and \hat{\psi}(\alpha,w) need to be calculated. The parameters w_{\kappa}(\alpha,\psi) and \hat{\psi}(\alpha,w) can be pre-calculated only once, but the others, \lambda_{i}(\underline{x}), \lambda_{o}(\underline{x}), v^{*}(\lambda,\underline{x}), \alpha^{*}(\lambda,\underline{x}) and w^{*}(\lambda,\underline{x}), need to be calculated for every \underline{x}, which is time consuming. Fortunately, these parameters have the following periodic properties:

\lambda_{i}(\underline{x}^{k})=\lambda_{i}(\underline{x})+2k\pi \quad (20)
\lambda_{o}(\underline{x}^{k})=\lambda_{o}(\underline{x})+2k\pi
v^{*}(\lambda+2k\pi,\underline{x}^{k})=v^{*}(\lambda,\underline{x})
\alpha^{*}(\lambda+2k\pi,\underline{x}^{k})=\alpha^{*}(\lambda,\underline{x})
w^{*}(\lambda+2k\pi,\underline{x}^{k})=w^{*}(\lambda,\underline{x})

where \underline{x}^{k}=(x,y,z+kp), k is an integer and p is the pitch. Thus, if we reconstruct the CT images in the z-direction pitch by pitch, the parameters \lambda_{i}(\underline{x}), \lambda_{o}(\underline{x}), v^{*}(\lambda,\underline{x}), \alpha^{*}(\lambda,\underline{x}) and w^{*}(\lambda,\underline{x}) can also be pre-calculated only once, which greatly reduces the computation and GPU-memory burdens.

In the following, we describe the detailed GPU implementation scheme of the reconstruction formula (4), which reconstructs the CT images pitch by pitch. In our scheme, to reduce the GPU-memory consumption, the backprojection step is performed slice by slice in the z-direction, while the other steps are performed on the whole sinogram needed to reconstruct the CT images of one pitch.

Let f(\underline{x}_{j}^{k}) be the j-th CT image slice of the k-th pitch in the z-direction, where j=0,1,2,...,N-1, k=1,...,M, N is the total number of CT images to be reconstructed in one pitch, M is the number of pitches into which the scanning helix is divided, \underline{x}_{j}^{k}=(x,y,z_{j}+kp), (x,y) is a point of the mesh grid formed by the x-coordinate and y-coordinate, and z_{j} is the relative position on the z-coordinate within one pitch. Then our scheme can be divided into a parameter pre-calculating step and a reconstruction step.

The pre-calculating step is described as follows:

1. Pre-calculate w_{\kappa}(\alpha,\psi) and \hat{\psi}(\alpha,w) by using equation (11) and the method in [1].

2. Define \underline{x}_{j}:=\underline{x}_{j}^{1}. For j=0,1,2,...,N-1, pre-calculate the extremities \lambda_{i}(\underline{x}_{j}) and \lambda_{o}(\underline{x}_{j}) of the \pi-line passing through \underline{x}_{j} using the algorithm described in [36]. Note that an extra projection step that constrains \theta to the interval [-\pi,\pi] needs to be performed at each iteration of the algorithm, which was not explicitly stated in [36]. Also note that here \lambda_{i}(\underline{x}_{j}) and \lambda_{o}(\underline{x}_{j}) are two matrices corresponding to the mesh grid points (x,y) of \underline{x}_{j}. After obtaining \lambda_{i}(\underline{x}_{j}) and \lambda_{o}(\underline{x}_{j}), we can calculate the minimal and maximal values of \lambda needed to reconstruct f(\underline{x}_{j}):

   \lambda_{min}(\underline{x}_{j})=\min_{(x,y)~\text{of}~\underline{x}_{j}}\lambda_{i}(\underline{x}_{j}), \quad \lambda_{max}(\underline{x}_{j})=\max_{(x,y)~\text{of}~\underline{x}_{j}}\lambda_{o}(\underline{x}_{j}). \quad (21)

3. For j=0,1,2,...,N-1, pre-calculate v^{*}(\lambda,\underline{x}_{j}), \alpha^{*}(\lambda,\underline{x}_{j}) and w^{*}(\lambda,\underline{x}_{j}):

   v^{*}(\lambda,\underline{x}_{j})=R_{o}-x\cos\left(\lambda+\lambda_{0}\right)-y\sin\left(\lambda+\lambda_{0}\right) \quad (22)

   \alpha^{*}(\lambda,\underline{x}_{j})=\arctan\left(\frac{-x\sin\left(\lambda+\lambda_{0}\right)+y\cos\left(\lambda+\lambda_{0}\right)}{v^{*}(\lambda,\underline{x}_{j})}\right) \quad (23)

   w^{*}(\lambda,\underline{x}_{j})=\frac{D\cos\alpha^{*}(\lambda,\underline{x}_{j})}{v^{*}(\lambda,\underline{x}_{j})}\left(z_{j}-z_{0}-\frac{P}{2\pi}\lambda\right), \quad (24)

   where \underline{x}_{j}=(x,y,z_{j}) and \lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]. Here v^{*}(\lambda,\underline{x}_{j}), \alpha^{*}(\lambda,\underline{x}_{j}) and w^{*}(\lambda,\underline{x}_{j}) are three tensors of dimension three, where the first dimension corresponds to \lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})].

Note that in the above step 3), since

v^{*}(\lambda,\underline{x}_{j}^{1})=v^{*}(\lambda,\underline{x}_{j}^{k})
\alpha^{*}(\lambda,\underline{x}_{j}^{1})=\alpha^{*}(\lambda,\underline{x}_{j}^{k}) \quad (25)
w^{*}(\lambda,\underline{x}_{j}^{1})=w^{*}(\lambda,\underline{x}_{j}^{k})

for any positive integer k, the quantities v^{*}(\lambda,\underline{x}_{j}), \alpha^{*}(\lambda,\underline{x}_{j}) and w^{*}(\lambda,\underline{x}_{j}) can be pre-calculated only once.
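
As an illustration of step 3), a minimal NumPy sketch of the pre-calculation of these tensors for one slice might look like the following; the function and variable names are ours and not part of the released implementation.

```python
import numpy as np

def precompute_backprojection_params(x, y, z_j, lam, R0, D, P, lam0=0.0, z0=0.0):
    """Sketch of eqs. (22)-(24) for the j-th slice of the first pitch.

    x, y: 2-D mesh grids of the in-plane pixel coordinates.
    z_j:  scalar z-position of the slice within one pitch.
    lam:  1-D array of view angles in [lam_min(x_j), lam_max(x_j)].
    Returns three tensors of shape (len(lam),) + x.shape.
    """
    lam = lam[:, None, None]                              # broadcast over (x, y)
    v = R0 - x * np.cos(lam + lam0) - y * np.sin(lam + lam0)                   # eq. (22)
    alpha = np.arctan((-x * np.sin(lam + lam0) + y * np.cos(lam + lam0)) / v)  # eq. (23)
    w = D * np.cos(alpha) / v * (z_j - z0 - P / (2 * np.pi) * lam)             # eq. (24)
    return v, alpha, w
```

Thanks to the periodicity (25), these arrays are computed for the first pitch only and reused for every k.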

After the pre-calculating step, we can compute the minimal and maximal values of \lambda, \alpha and w needed to reconstruct the CT images of one pitch. These values can be calculated by

\lambda_{min}=\min_{j=0,1,...,N-1}\lambda_{min}(\underline{x}_{j}), \quad \lambda_{max}=\max_{j=0,1,...,N-1}\lambda_{max}(\underline{x}_{j}),
\alpha_{min}=\min_{j=0,1,...,N-1}\Big(\min_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}\alpha^{*}(\lambda,\underline{x}_{j})\Big), \quad \alpha_{max}=\max_{j=0,1,...,N-1}\Big(\max_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}\alpha^{*}(\lambda,\underline{x}_{j})\Big), \quad (26)
w_{min}=\min_{j=0,1,...,N-1}\Big(\min_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}w^{*}(\lambda,\underline{x}_{j})\Big), \quad w_{max}=\max_{j=0,1,...,N-1}\Big(\max_{\lambda\in[\lambda_{min}(\underline{x}_{j}),\lambda_{max}(\underline{x}_{j})]}w^{*}(\lambda,\underline{x}_{j})\Big).

The values \alpha_{min}, \alpha_{max}, w_{min} and w_{max} can be used to determine the numbers of rows and columns of the detector, and \lambda_{min} and \lambda_{max} are used to determine the sinogram required to reconstruct the CT images of one pitch. To reconstruct the k-th pitch of CT images, the sinogram corresponding to \lambda\in[\lambda_{min}+2k\pi,\lambda_{max}+2k\pi] is needed.

Let g(\lambda^{k},\alpha,w) be the \lambda-sliced sinogram extracted from g(\lambda,\alpha,w), where \lambda^{k}=[\lambda_{min}+2k\pi,\lambda_{max}+2k\pi] for k=1,...,M. Then the reconstruction step can be described as follows:

1. Perform the derivative (8), length-correction weighting (9), forward height rebinning (10), Hilbert transform (12), backward height rebinning (13) and post-cosine weighting (15) on g(\lambda^{k},\alpha,w) to obtain g^{F}(\lambda^{k},\alpha,w).

2. For j=0,1,2,...,N-1, backproject g^{F}(\lambda^{k},\alpha,w) slice by slice (a sketch of this per-slice backprojection is given after the list):

   f(\underline{x}_{j}^{k})=\frac{1}{2\pi}\int_{\lambda_{min}(\underline{x}_{j})+2k\pi}^{\lambda_{max}(\underline{x}_{j})+2k\pi}\mathrm{d}\lambda\,\frac{1}{v^{*}_{j}}\,g^{F}\left(\lambda^{k},\alpha^{*}_{j},w^{*}_{j}\right) \quad (27)

   where

   v^{*}_{j}=v^{*}(\lambda,\underline{x}_{j}), \quad \alpha^{*}_{j}=\alpha^{*}(\lambda,\underline{x}_{j}), \quad w^{*}_{j}=w^{*}(\lambda,\underline{x}_{j}). \quad (28)
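
A minimal NumPy sketch of the discretized backprojection (27) for one slice is given below. The bilinear detector interpolation and the omission of the per-pixel \pi-line limits are our simplifications.

```python
import numpy as np

def backproject_slice(gF, v, alpha_star, w_star, alpha, w, d_lam):
    """Sketch of eq. (27): backproject the filtered sinogram into one slice.

    gF:     filtered sinogram of shape (n_lambda, n_alpha, n_w), assumed to be
            already restricted to the lambda range of this slice.
    v, alpha_star, w_star: pre-calculated tensors (eqs. (22)-(24)) of shape
            (n_lambda, ny, nx) for this slice.
    alpha, w: detector coordinate grids; d_lam: angular sampling step.
    Masking each pixel to its own pi-line interval [lambda_i, lambda_o]
    is omitted here for brevity.
    """
    f = np.zeros(v.shape[1:])
    for l in range(gF.shape[0]):
        # fractional detector indices of the backprojection positions
        ia = np.interp(alpha_star[l], alpha, np.arange(alpha.size))
        iw = np.interp(w_star[l], w, np.arange(w.size))
        ia0 = np.floor(ia).astype(int)
        iw0 = np.floor(iw).astype(int)
        ia1 = np.minimum(ia0 + 1, alpha.size - 1)
        iw1 = np.minimum(iw0 + 1, w.size - 1)
        ta, tw = ia - ia0, iw - iw0
        # bilinear interpolation on the detector
        val = ((1 - ta) * (1 - tw) * gF[l, ia0, iw0] + ta * (1 - tw) * gF[l, ia1, iw0]
               + (1 - ta) * tw * gF[l, ia0, iw1] + ta * tw * gF[l, ia1, iw1])
        f += d_lam / (2 * np.pi) * val / v[l]
    return f
```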

To reconstruct the helical CT images pitch by pitch, we need to perform the pre-calculating step once, store its outputs in memory, and then perform the reconstruction step M times for the M pitches. In this paper, the pre-calculating step is implemented on the central processing unit (CPU) while the reconstruction step is implemented on the GPU.

III Proposed Network

In this section, we propose a 3D deep learning network to reconstruct CT images directly from sparse helical sinograms with high pitch, where our GPU implementation of the Katsevich algorithm [4] is used as the domain transfer layer in the network.

III-A Architecture of Proposed Network

Our network (shown in Fig. 1) is composed of three parts: the sinogram domain subnetwork, the domain transfer layer and the CT image domain subnetwork. The sinogram domain subnetwork is used to denoise and upsample the sparse sinograms, the domain transfer layer reconstructs the CT images from the sinograms output by the sinogram domain subnetwork, and the CT image domain subnetwork denoises the CT images reconstructed by the domain transfer layer.

Figure 1: The detailed structure of our network.

The architecture of the sinogram domain subnetwork is based on ResNet [37]. In detail, the sinogram domain subnetwork is composed of 7 cascaded 'Conv3D+PReLU' blocks followed by a 'Conv3D' block, where 'Conv3D' is a 3D convolutional layer and 'PReLU' is the activation function, the parametric rectified linear unit. The number of feature channels of the 3D convolutional filters is 16 in the 'Conv3D+PReLU' blocks and 1 in the 'Conv3D' block, and the widths of the filters in the 'Conv3D+PReLU' blocks and the 'Conv3D' block are all 3. The output of each 'Conv3D+PReLU' block is skip-connected with the output of the last 'Conv3D+PReLU' block, and the input of the sinogram domain subnetwork is skip-connected with the output of the 'Conv3D' block. The input to the sinogram domain subnetwork is the sliced 3D sinogram g(\lambda^{k},\alpha,w) that can exactly reconstruct one pitch of helical CT images. The output of the sinogram domain subnetwork is the denoised sinogram, which has the same size as the input sinogram.
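
A minimal TensorFlow/Keras sketch of this subnetwork, as we read the description above and Fig. 1, is given below; summing all block outputs into the last block's output, the 'same' padding and the PReLU shared axes are our assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers

def build_domain_subnetwork(n_blocks=7, n_filters=16, width=3):
    """Sketch of the sinogram (or CT image) domain subnetwork."""
    x_in = layers.Input(shape=(None, None, None, 1))      # 3D volume, 1 channel
    x = x_in
    block_outputs = []
    for _ in range(n_blocks):
        x = layers.Conv3D(n_filters, width, padding="same")(x)
        x = layers.PReLU(shared_axes=[1, 2, 3])(x)
        block_outputs.append(x)
    # skip-connect every block output with the output of the last block
    x = layers.Add()(block_outputs)
    x = layers.Conv3D(1, width, padding="same")(x)
    # skip-connect the subnetwork input with the output of the final Conv3D
    x_out = layers.Add()([x_in, x])
    return tf.keras.Model(x_in, x_out)
```

Since the CT image domain subnetwork shares this architecture, the same function could be reused to build it.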

The domain transfer layer implements the Katsevich algorithm [4] as described in Section II. The input to the domain transfer layer is the denoised sinogram output by the sinogram domain subnetwork, and the output is one pitch of helical CT images reconstructed by the Katsevich algorithm [4].

The CT image domain subnetwork has the same architecture as the sinogram domain subnetwork, but its inputs and outputs are different. The input to the CT image domain subnetwork is the one pitch of CT images output by the domain transfer layer, and the output is the denoised CT images, which have the same size as the input.

III-B Loss Function

Let (g^{k},f^{k}) be the outputs of our network and (g_{GT}^{k},f_{GT}^{k}) be the corresponding labels, where g^{k} and g_{GT}^{k} are, respectively, the denoised sinogram and the label sinogram, and f^{k} and f_{GT}^{k} are, respectively, the denoised CT image and the label CT image. Then the loss function used to train our network is

\mathcal{L}_{\Theta}=\frac{1}{N_{D}}\sum_{k=1}^{N_{D}}\left(\left\|g_{GT}^{k}-g^{k}\right\|^{2}+\left\|f_{GT}^{k}-f^{k}\right\|^{2}\right), \quad (29)

where N_{D} is the total number of training samples and \Theta is the set of parameters of our network to be learned.

Our network is trained by optimizing the loss function (29) using the backpropagation (BP) algorithm [38]. In this study, the loss function is optimized by the Adam algorithm [39], where the learning rate is set to 10^{-3}.
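
A minimal TensorFlow sketch of one training step with the joint loss (29) is shown below; it assumes the model returns both the denoised sinogram and the denoised CT volume, and the reduction (a plain sum of squared errors per sample) is our reading of (29).

```python
import tensorflow as tf

optimizer = tf.keras.optimizers.Adam(learning_rate=1e-3)

@tf.function
def train_step(model, g_in, g_gt, f_gt):
    """One forward/backward pass with the joint sinogram + image loss (29)."""
    with tf.GradientTape() as tape:
        g_k, f_k = model(g_in, training=True)
        loss = (tf.reduce_sum(tf.square(g_gt - g_k))
                + tf.reduce_sum(tf.square(f_gt - f_k)))
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss
```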

The training code of our network is implemented with TensorFlow 2.5 on a Dell PowerEdge T640 server with the Ubuntu 20.04 operating system, 377 GB of random access memory (RAM), an Intel(R) Xeon(R) Silver 4210R CPU with 2.4 GHz frequency and an NVIDIA RTX A6000 GPU card with 45631 MB of memory. The initial values of the training parameters of our network are set randomly by TensorFlow using the default setup. The gradients of the loss function with respect to the training parameters are calculated by TensorFlow automatically using automatic differentiation. The runtime of our network to complete one batch of learning (a forward step and a backward step) with batch size 1 is about 5-6 seconds.

III-C Data Preparation

III-C1 CT image labels

The CT images from “the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge” [40], which was established and authorized by the Mayo Clinic, are used as the ground truth CT images f_{GT}. The AAPM challenge has also released the 3D helical sinograms corresponding to the ground truth CT images, but the pitches of the helical scans used to obtain those sinograms are low and unequal, so we need to generate simulated sinograms with a constant high pitch by performing the helical projection ourselves.

In this paper, we evaluate our methods on two types of helical sinograms, corresponding to p=7\pi and p=14\pi, respectively.

III-C2 Parameters of the scanning geometry

Unless otherwise specified, all lengths in this paper are in millimetres (mm). The radius of the scanning helix is R=595 and the distance from the X-ray source to the detector is D=1085.6, where the detector is a curved panel. The x-coordinate and y-coordinate of the ground truth CT images are, respectively,

x_{cor}=[-256:1:255]\times 1, \quad y_{cor}=[-256:1:255]\times 1. \quad (30)

For simplicity, we assume here that the size of one pixel of the CT images is 1\times 1. The half fan angle is

\arcsin(\sqrt{2}\times 256/R)\approx 19.4808^{\circ} \quad (31)

and we set

\alpha_{cor}=[-19.5625:1:19.5625]\times 0.0625^{\circ}+\frac{1}{4}\times 0.0625^{\circ}=([-313:1:313]+0.25)\times\pi/180/16 \quad (32)

for the \alpha-coordinate of the detector.

We constrain the number of detector rows to 16, so the positions of the detector units along the w-coordinate are different for different p. For p=7\pi, the maximal w in equation (26) is

w_{L}=\max(w_{max},-w_{min})=3.8819, \quad (33)

and so \nabla w=2w_{max}/15=0.5176. Therefore, we set

w_{cor}=[-7.5:1:7.5]\times 0.5176 \quad (34)

for p=7\pi on the w-coordinate. Similarly, for p=14\pi, we have

w_{L}=\max(w_{max},-w_{min})=7.7499, \quad (35)

and so \nabla w=2w_{max}/15=1.0333, and we set

w_{cor}=[-7.5:1:7.5]\times 1.0333. \quad (36)

In summary, the scanning parameters of the helical CT are listed in Table I.

III-C3 Data generation

We use the geometry described above to simulate scanning of the CT image labels to generate the sinogram labels g_{GT}, where the slices of the ground truth CT images are equally spaced on the z-coordinate. For p=7\pi, the z-coordinate of the ground truth CT images in one pitch is

z_{cor}=\mathrm{linspace}(0,2\pi h,\mathrm{ceil}(3h))=[0:1:10]\times 2.1991 \quad (37)

and for p=14\pi it is

z_{cor}=\mathrm{linspace}(0,2\pi h,\mathrm{ceil}(2h))=[0:1:13]\times 3.3833 \quad (38)

Therefore, in one pitch, we need to reconstruct 11 slices of CT images for p=7\pi and 14 slices for p=14\pi.

The sinogram g_{GT} is obtained by scanning a whole patient, so we need to divide it into pitches to obtain g_{GT}^{k}, where

g_{GT}^{k}=g_{GT}[\lambda_{1}^{k}:1:\lambda_{2}^{k},:,:], \quad \lambda_{1}^{k}=(\lambda_{min}-\lambda_{0}+2k\pi)/\nabla\lambda, \quad \lambda_{2}^{k}=(\lambda_{max}-\lambda_{0}+2k\pi)/\nabla\lambda \quad (39)

for k=2,...,M-1, where \lambda_{min} and \lambda_{max} are defined in equation (26), \nabla\lambda=\pi/180 and \lambda_{0} is the start angle of the scanning helix. For p=7\pi, \lambda_{min}=-2.1720 and \lambda_{max}=7.9625; for p=14\pi, \lambda_{min}=-2.1875 and \lambda_{max}=8.1025. To avoid the zero values, the data of the first and last pitches are discarded.

Correspondingly, the ground truth CT images f_{GT} also need to be divided into pitches to get f_{GT}^{k}, where

f_{GT}^{k}=f_{GT}[:,:,z_{1}^{k}:1:z_{2}^{k}], \quad z_{1}^{k}=k\times z_{L}+1, \quad z_{2}^{k}=(k+1)\times z_{L} \quad (40)

for k=2,...,M-1, with z_{L}=11 for p=7\pi and z_{L}=14 for p=14\pi.
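
A small NumPy sketch of this pitch-wise slicing of the sinogram and CT labels, eqs. (39)-(40), could look as follows; the index rounding and the zero-based indexing convention are our assumptions.

```python
import numpy as np

def slice_pitch(g_gt, f_gt, k, lam_min, lam_max, lam0, d_lam, z_len):
    """Sketch of eqs. (39)-(40): labels of the k-th pitch (k = 2, ..., M-1).

    g_gt: full sinogram (n_lambda, n_alpha, n_w); f_gt: full CT volume
    (ny, nx, nz); d_lam is the angular sampling step (pi/180 in the paper).
    """
    l1 = int(round((lam_min - lam0 + 2 * k * np.pi) / d_lam))
    l2 = int(round((lam_max - lam0 + 2 * k * np.pi) / d_lam))
    g_k = g_gt[l1:l2 + 1, :, :]
    f_k = f_gt[:, :, k * z_len:(k + 1) * z_len]
    return g_k, f_k
```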

After obtaining the sinogram labels g^{k}_{GT}, we downsample them on the \alpha-coordinate at

\alpha_{sp}=\alpha_{cor}[0:4:\mathrm{end}] \quad (41)

to obtain the sparse sinograms g^{k}_{sp}. We then upsample g^{k}_{sp} by interpolation to make it the same size as g^{k}_{GT} and add 'Gaussian+Poisson' noise [41] to the upsampled sinogram g^{k}_{up} to obtain the input sinograms g^{k}_{in} of our network, where

g=I_{0}\times\exp(-g^{k}_{up}/M),
g=g+\mathrm{Poisson}(g)+\mathrm{Gaussian}(m,\mathrm{var}), \quad (42)
g^{k}_{in}=\log\left(I_{0}/g\right)\times M,

M is the maximal value of g^{k}_{up}, I_{0}=10^{5} is the photon count, and m=0 and \mathrm{var}=0.5 are the mean and variance of the Gaussian noise, respectively.
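
A minimal NumPy sketch of this noise simulation (42) is given below; the function name is ours, and the model follows (42) literally (the sampled Poisson term is added to, rather than substituted for, the ideal counts).

```python
import numpy as np

def add_mixed_noise(g_up, I0=1e5, mean=0.0, var=0.5, rng=None):
    """Sketch of the 'Gaussian+Poisson' noise model (42) applied to an
    upsampled line-integral sinogram g_up."""
    rng = np.random.default_rng() if rng is None else rng
    M = g_up.max()
    counts = I0 * np.exp(-g_up / M)                    # ideal photon counts
    counts = counts + rng.poisson(counts) + rng.normal(mean, np.sqrt(var), g_up.shape)
    return np.log(I0 / counts) * M                     # back to line integrals
```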

The CT image f^{k}_{in} reconstructed from g^{k}_{in} is used as the input of the compared post-processing methods.

TABLE I: The Parameters to Generate the Training Data
Radius of helix: 595
Distance of source to detector: 1085.6
x-coordinate of CT image: [-256:1:255]\times 1
y-coordinate of CT image: [-256:1:255]\times 1
z-coordinate of CT image in one pitch: [0:1:10]\times 2.1991 (p=7\pi); [0:1:13]\times 3.3833 (p=14\pi)
\lambda-coordinate in one pitch: [-125:1:455]\times 0.0175 (p=7\pi); [-125:1:463]\times 0.0175 (p=14\pi)
\alpha-coordinate of detector: [-312.75:1:313.25]\times 0.0011
w-coordinate of detector: [-7.5:1:7.5]\times 0.5176 (p=7\pi); [-7.5:1:7.5]\times 1.0333 (p=14\pi)
Figure 2: The reconstructed results of NIH-AAPM-Mayo data for p=7\pi. Columns from left to right: (a) Label, (b) Katsevich algorithm, (c) Nonlocal-TV, (d) FBPConvNet, (e) Ours. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].
Figure 3: The reconstructed results of head and neck CT data for p=7\pi. Columns from left to right: (a) Label, (b) Katsevich algorithm, (c) FBPConvNet, (d) Ours. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].

IV Experimental Results

In this section, we present simulated experimental results to demonstrate the effectiveness of our method and compare it with two related methods: the nonlocal-TV iterative algorithm [42] and the post-processing method FBPConvNet [43], where the convolutional filters in FBPConvNet are extended from 2D to 3D.

As described in Subsection III-C, the CT images in “the 2016 NIH-AAPM-Mayo Clinic Low Dose CT Grand Challenge” [40] are used to prepare the experimental data, where the data generated from the six patients named 'L192', 'L286', 'L291', 'L310', 'L333' and 'L506' are used as the training data and those from the other four patients named 'L067', 'L096', 'L109' and 'L143' are used as the test data. The training data contain in total 345 pitches (of size 512\times 512\times 10) for p=7\pi and 260 pitches (of size 512\times 512\times 13) for p=14\pi, and the test data contain 215 pitches for p=7\pi and 163 pitches for p=14\pi, where 10% of the training data is randomly chosen for validation. Note that since the last CT image of one pitch overlaps with the first CT image of the next pitch, we exclude the last CT image of every pitch from our training and testing data.

For the nonlocal-TV iterative algorithm, we tune its parameters empirically in order to obtain the best results. We use the default parameters of FBPConvNet to train it, where the number of training epochs is 100. The CT images reconstructed from the noisy sinograms by the Katsevich algorithm are used as the inputs of FBPConvNet, where the batch size is 5. We train our network for 60 epochs with a batch size of 1. The size of the learnable parameters of FBPConvNet is about 12.4 MB, while that of ours is about 2.3 MB.

IV-A Experimental Results for p=7\pi

In Fig. 2, five slices of CT images reconstructed from the test data for p=7\pi are shown. We can observe that the images reconstructed by the Katsevich algorithm suffer from severe streak artifacts due to the sparsity of the sinogram. The images reconstructed by nonlocal-TV are somewhat over-smoothed and many details are lost. FBPConvNet can reconstruct CT images with some details, but some boundaries in its reconstructed CT images are still blurred. Compared to nonlocal-TV and FBPConvNet, our method reconstructs more details and sharper boundaries, as shown in Fig. 2e. To better observe the differences, the areas marked by the red rectangles in the reconstructed images are enlarged and presented in the lower-left corners.

TABLE II: The RMSE and SSIM of the four methods for the NIH-AAPM-Mayo test data.
Pitch | Method | RMSE | SSIM
p=7\pi | Katsevich | 97.616±7.549 | 0.777±0.028
p=7\pi | Nonlocal-TV | 93.174±7.271 | 0.815±0.028
p=7\pi | FBPConvNet | 70.329±8.169 | 0.845±0.028
p=7\pi | Ours | 59.068±7.028 | 0.867±0.027
p=14\pi | Katsevich | 97.783±7.493 | 0.776±0.028
p=14\pi | Nonlocal-TV | 93.362±7.217 | 0.814±0.028
p=14\pi | FBPConvNet | 71.609±8.140 | 0.842±0.028
p=14\pi | Ours | 60.028±6.988 | 0.865±0.027

To quantitatively evaluate the performance of the compared methods, two metrics, the root-mean-square error (RMSE) and the structural similarity (SSIM), are used to measure the similarity between the reconstructed images and the labels. The definition of the RMSE is

\operatorname{RMSE}=\sqrt{\frac{\|f-f_{GT}\|_{2}^{2}}{n}}, \quad (43)

where f and f_{GT} are, respectively, the reconstructed CT image and the label image, and n is the number of pixels. The SSIM is defined as

\operatorname{SSIM}=\frac{\left(2\mu_{1}\mu_{2}+c_{1}\right)\left(2\sigma_{1,2}+c_{2}\right)}{\left(\mu_{1}^{2}+\mu_{2}^{2}+c_{1}\right)\left(\sigma_{1}^{2}+\sigma_{2}^{2}+c_{2}\right)}, \quad (44)

where

c_{1}=(0.01\times(\max(f_{GT})-\min(f_{GT})))^{2}, \quad c_{2}=(0.03\times(\max(f_{GT})-\min(f_{GT})))^{2}, \quad (45)

\mu_{1} and \sigma_{1}^{2} are the mean and variance of the reconstructed CT image f, \mu_{2} and \sigma_{2}^{2} are the mean and variance of the reference image f_{GT}, and \sigma_{1,2} is the covariance between f and f_{GT}.
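
For reference, a small NumPy sketch of these two metrics is given below; the global (whole-image) form of the SSIM shown here is our simplification, since (44) is stated without local windowing.

```python
import numpy as np

def rmse(f, f_gt):
    """Eq. (43): root-mean-square error between reconstruction and label."""
    return np.sqrt(np.mean((f - f_gt) ** 2))

def ssim_global(f, f_gt):
    """Eqs. (44)-(45) evaluated once over the whole image."""
    L = f_gt.max() - f_gt.min()
    c1, c2 = (0.01 * L) ** 2, (0.03 * L) ** 2
    mu1, mu2 = f.mean(), f_gt.mean()
    var1, var2 = f.var(), f_gt.var()
    cov = ((f - mu1) * (f_gt - mu2)).mean()
    return ((2 * mu1 * mu2 + c1) * (2 * cov + c2)) / (
        (mu1 ** 2 + mu2 ** 2 + c1) * (var1 + var2 + c2))
```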

In Table II, the average RMSE and SSIM of the CT images reconstructed by the four methods are listed. It can be observed that our network achieves the lowest RMSE and the highest SSIM on average, which coincides with our subjective evaluation.

\begin{overpic}[width=177.78578pt,percent]{./data1/red/label10-eps-converted-to.pdf} \put(0.0,0.0){\color[rgb]{1,0,0}\framebox(0.0,0.0)[bl]{\includegraphics[scale={0.08}]{./data1/enlarge/clabel10-eps-converted-to.pdf}} } \end{overpic}
(a)
\begin{overpic}[width=177.78578pt]{./data1/red/Kati10-eps-converted-to.pdf} \put(0.0,0.0){\color[rgb]{1,0,0}\framebox(0.0,0.0)[bl]{\includegraphics[scale={0.08}]{./data1/enlarge/cKati10-eps-converted-to.pdf}} } \end{overpic}
(b)
\begin{overpic}[width=177.78578pt]{./data1/red/TV10-eps-converted-to.pdf} \put(0.0,0.0){\color[rgb]{1,0,0}\framebox(0.0,0.0)[bl]{\includegraphics[scale={0.08}]{./data1/enlarge/cTV10-eps-converted-to.pdf}} } \end{overpic}
(c)
[Figure 4 image panels (enlarged regions marked by red boxes); panels: (a) Label, (b) Katsevich algorithm, (c) Nonlocal-TV, (d) FBPConvNet, (e) Ours.]
Figure 4: The reconstructed CT images of the NIH-AAPM-Mayo test data for $p=14\pi$. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].
[Figure 5 image panels (enlarged regions marked by red boxes); panels: (a) Label, (b) Katsevich algorithm, (c) FBPConvNet, (d) Ours.]
Figure 5: The reconstructed results of the head and neck CT data for $p=14\pi$. All images are linearly stretched to [0, 1] and the display window is [0, 0.6].
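As an aside on visualization, the figure captions specify a linear stretch to [0, 1] followed by a [0, 0.6] display window. The sketch below shows one way such a display could be produced; it is our own illustration of the windowing step (the function name show_ct and the synthetic input are assumptions), not the plotting code used for the figures.

```python
import numpy as np
import matplotlib.pyplot as plt

def show_ct(slice_img, window=(0.0, 0.6)):
    """Linearly stretch a CT slice to [0, 1] and display it with a fixed window."""
    lo, hi = slice_img.min(), slice_img.max()
    stretched = (slice_img - lo) / (hi - lo + 1e-12)   # linear stretch to [0, 1]
    plt.imshow(stretched, cmap="gray", vmin=window[0], vmax=window[1])
    plt.axis("off")
    plt.show()

# Example with a synthetic slice standing in for a reconstructed CT image.
show_ct(np.random.rand(512, 512))
```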

To test the generalization capability of our method, we apply our network and FBPConvNet to another type of CT, head and neck CT, where the head and neck CT dataset [44] is used to simulate the test data while both networks are trained on the NIH-AAPM-Mayo data only. Some reconstructed head and neck CT images are presented in Fig. 3. We can observe that our method reconstructs more fine details and clearer boundaries than FBPConvNet, which demonstrates that our method has good generalization capability. The average RMSE and SSIM of the reconstructed head and neck CT images are listed in Table III; our network achieves lower RMSE and higher SSIM than FBPConvNet on average.

TABLE III: The RMSE and SSIM (mean $\pm$ standard deviation) of the three methods on the head and neck CT test data.
Pitch       Method        RMSE                  SSIM
$p=7\pi$    Katsevich     76.657 $\pm$ 12.371   0.863 $\pm$ 0.031
$p=7\pi$    FBPConvNet    64.633 $\pm$ 12.058   0.939 $\pm$ 0.009
$p=7\pi$    Ours          47.807 $\pm$ 9.870    0.959 $\pm$ 0.006
$p=14\pi$   Katsevich     77.104 $\pm$ 12.473   0.861 $\pm$ 0.031
$p=14\pi$   FBPConvNet    65.427 $\pm$ 12.627   0.937 $\pm$ 0.010
$p=14\pi$   Ours          49.357 $\pm$ 10.729   0.955 $\pm$ 0.009
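For reference, the quantitative scores reported in Tables II and III are per-slice statistics (mean and standard deviation over the test slices) of standard image-quality metrics. The following sketch shows one way to compute RMSE, PSNR and SSIM over a stack of slices; it is our own illustration, and the intensity scale (data_range) and the helper name evaluate are assumptions rather than the evaluation code used in the paper.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity

def evaluate(recons, labels, data_range=255.0):
    """Mean and standard deviation of RMSE, PSNR and SSIM over 2-D slices."""
    rmse, psnr, ssim = [], [], []
    for x, y in zip(recons, labels):        # x: reconstruction, y: ground-truth label
        rmse.append(np.sqrt(np.mean((x - y) ** 2)))
        psnr.append(peak_signal_noise_ratio(y, x, data_range=data_range))
        ssim.append(structural_similarity(y, x, data_range=data_range))

    def stats(values):
        return float(np.mean(values)), float(np.std(values))

    return {"RMSE": stats(rmse), "PSNR": stats(psnr), "SSIM": stats(ssim)}

# Hypothetical usage: recon_slices and label_slices are lists of 2-D float arrays.
# scores = evaluate(recon_slices, label_slices, data_range=255.0)
```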

IV-B Experimental Results for $p=14\pi$

In this subsection, we present experimental results showing the performance of our network for pitch $p=14\pi$.

In Fig. 4, we present five CT images reconstructed by the Katsevich algorithm, Nonlocal-TV, FBPConvNet and our method from the NIH-AAPM-Mayo test data. The images reconstructed by the Katsevich algorithm suffer from severe streak artifacts, Nonlocal-TV tends to over-smooth the reconstructions, and FBPConvNet loses some fine details and boundaries. Compared with these methods, our method simultaneously suppresses the streak artifacts and recovers fine details and sharp boundaries. In Table II, we list the average RMSE and PSNR of the CT images reconstructed by the four methods, from which we observe that our method achieves the lowest RMSE and the highest PSNR on average. We also observe that, for each method, the average RMSE and PSNR for $p=7\pi$ and $p=14\pi$ differ only slightly. Since our network uses the Katsevich algorithm to transform sinograms into CT images, and the inputs to FBPConvNet are also the CT images reconstructed by the Katsevich algorithm, this indicates that the Katsevich algorithm is robust to the helical pitch $p$.

We also test the generalization capabilities of FBPConvNet and our network for $p=14\pi$. The CT images reconstructed by FBPConvNet and our network from the head and neck test data for $p=14\pi$ are shown in Fig. 5. The images reconstructed by FBPConvNet are somewhat blurry and lose some fine details, whereas our method reconstructs CT images with better boundaries and details. The average RMSE and SSIM of the reconstructed CT images are also listed in Table III, where our method achieves lower RMSE and higher SSIM than FBPConvNet on average.

V Discussion and Conclusion

In this paper, we developed a new GPU implementation of the Katsevich algorithm [4] for helical CT that reconstructs the CT images pitch by pitch. By utilizing the periodic properties of the parameters of the Katsevich algorithm, our scheme only needs to calculate these parameters once for all pitches and is therefore very suitable for deep learning. By embedding this implementation into a CNN, we proposed an end-to-end deep network for high-pitch helical CT reconstruction with sparse detectors. Experiments show that our end-to-end deep network outperforms the Nonlocal-TV based iterative algorithm and the post-processing deep network FBPConvNet in both subjective and objective evaluations.
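To make the parameter-reuse idea concrete, the toy sketch below illustrates why quantities that depend only on the source angle modulo one helix turn can be tabulated once and shared by every pitch. It is a schematic illustration under that assumption, not the paper's actual GPU implementation, and all names and sizes in it are hypothetical.

```python
import numpy as np

n_views_per_turn = 64        # hypothetical angular sampling of one helix turn
n_pitches = 10               # number of pitches covered by the scan

# Parameters that repeat from pitch to pitch are computed ONCE for a single turn
# (here a stand-in table indexed by the source angle modulo 2*pi).
angles = np.linspace(0.0, 2.0 * np.pi, n_views_per_turn, endpoint=False)
params_one_pitch = np.stack([np.cos(angles), np.sin(angles)], axis=1)

slabs = []
for k in range(n_pitches):
    z_shift = k * 1.0                      # hypothetical table translation per pitch
    # Reconstruction of the k-th slab reuses params_one_pitch; only the z-offset
    # changes, so nothing is recomputed and no extra GPU memory is needed.
    slab = np.full((8, 16, 16), params_one_pitch[:, 0].sum() + z_shift)
    slabs.append(slab)

volume = np.concatenate(slabs, axis=0)     # slabs stacked along the z-direction
```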

One limitation of our method is that it requires the pitch $p$ of the helical scan to be constant, a restriction inherited from the Katsevich algorithm. In the literature, there exist exact algorithms for helical CT reconstruction with variable pitches. However, the periodic properties of the parameters may not hold for these algorithms, so the parameters would have to be computed for every slice, which consumes a large amount of GPU memory and is not well suited for deep learning. In our future work, we will investigate how to address this issue.

References

  • [1] F. Noo, J. Pack, and D. Heuscher, “Exact helical reconstruction using native cone-beam geometries,” Physics in Medicine & Biology, vol. 48, no. 23, pp. 3787–3818, Nov. 2003.
  • [2] K. C. Tam, S. Samarasekera, and F. Sauer, “Exact cone beam CT with a spiral scan,” Physics in Medicine & Biology, vol. 43, no. 4, pp. 1015–1024, Apr. 1998.
  • [3] P. Danielsson, P. Edholm, J. Eriksson, and M. Magnusson Seger, “Towards exact reconstruction for helical cone-beam scanning of long objects. A new detector arrangement and a new completeness condition,” in Proc. 1997 Meeting on Fully 3D Image Reconstruction in Radiology and Nuclear Medicine (Pittsburgh, PA), D. Townsend and P. Kinahan, Eds., 1997, pp. 141–144.
  • [4] A. Katsevich, “An improved exact filtered backprojection algorithm for spiral computed tomography,” Advances in Applied Mathematics, vol. 32, no. 4, pp. 681–697, May 2004.
  • [5] A. Zheng, H. Gao, L. Zhang, and Y. Xing, “A dual-domain deep learning-based reconstruction method for fully 3d sparse data helical CT,” Physics in Medicine & Biology, vol. 65, no. 24, p. 245030, Dec. 2020.
  • [6] F. Noo, M. Defrise, and R. Clackdoyle, “Single-slice rebinning method for helical cone-beam CT,” Physics in Medicine & Biology, vol. 44, no. 2, pp. 561–570, Jan. 1999.
  • [7] K. Stierstorfer, A. Rauscher, J. Boese, H. Bruder, S. Schaller, and T. Flohr, “Weighted fbp–a simple approximate 3d fbp algorithm for multislice spiral CT with good dose usage for arbitrary pitch,” Physics in Medicine & Biology, vol. 49, no. 11, pp. 2209–2218, May 2004.
  • [8] G. Wang, T. H. Lin, P. Cheng, and D. M. Shinozaki, “A general cone-beam reconstruction algorithm,” IEEE Transactions on Medical Imaging, vol. 12, no. 3, pp. 486–496, 1993.
  • [9] J. Guo, L. Zeng, and X. Zou, “An improved half-covered helical cone-beam CT reconstruction algorithm based on localized reconstruction filter,” Journal of X-Ray Science and Technology, vol. 19, no. 3, pp. 293–312, 2011.
  • [10] J. Zhao, Y. Lu, Y. Jin, E. Bai, and G. Wang, “Feldkamp-type reconstruction algorithms for spiral cone-beam CT with variable pitch,” Journal of X-Ray Science and Technology, vol. 15, no. 4, pp. 177–196, 2007.
  • [11] L. A. Feldkamp, L. C. Davis, and J. W. Kress, “Practical cone-beam algorithm,” Journal of the Optical Society of America A, vol. 1, no. 6, pp. 612–619, Jun 1984.
  • [12] E. Y. Sidky and X. Pan, “Image reconstruction in circular cone-beam computed tomography by constrained, total-variation minimization,” Physics in Medicine & Biology, vol. 53, no. 17, pp. 4777–4807, Aug. 2008.
  • [13] Y. Liu, J. Ma, Y. Fan, and Z. Liang, “Adaptive-weighted total variation minimization for sparse data toward low-dose x-ray computed tomography image reconstruction,” Physics in Medicine & Biology, vol. 57, no. 23, pp. 7923–7956, Nov. 2012.
  • [14] Y. Chen, D. Gao, C. Nie, L. Luo, W. Chen, X. Yin, and Y. Lin, “Bayesian statistical reconstruction for low-dose x-ray computed tomography using an adaptive-weighting nonlocal prior,” Computerized Medical Imaging and Graphics, vol. 33, no. 7, pp. 495–500, Oct. 2009.
  • [15] Q. Xu, H. Yu, X. Mou, L. Zhang, J. Hsieh, and G. Wang, “Low-dose x-ray CT reconstruction via dictionary learning,” IEEE Transactions on Medical Imaging, vol. 31, no. 9, pp. 1682–1697, 2012.
  • [16] P. Bao, W. Xia, K. Yang, W. Chen, M. Chen, Y. Xi, S. Niu, J. Zhou, H. Zhang, H. Sun, Z. Wang, and Y. Zhang, “Convolutional sparse coding for compressed sensing CT reconstruction,” IEEE Transactions on Medical Imaging, vol. 38, no. 11, pp. 2607–2619, 2019.
  • [17] J. Sunnegårdh and P.-E. Danielsson, “Regularized iterative weighted filtered backprojection for helical cone-beam CT,” Medical Physics, vol. 35, no. 9, pp. 4173–4185, Nov. 2008.
  • [18] W. Yu and L. Zeng, “Iterative image reconstruction for limited-angle inverse helical cone-beam computed tomography,” Scanning, vol. 38, no. 1, pp. 4–13, Nov. 2016.
  • [19] A. Katsevich, “Theoretically exact filtered backprojection-type inversion algorithm for spiral CT,” SIAM Journal on Applied Mathematics, vol. 62, no. 6, pp. 2012–2026, Nov. 2002.
  • [20] Y. Zou and X. Pan, “Image reconstruction on pi-lines by use of filtered backprojection in helical cone-beam CT,” Physics in Medicine & Biology, vol. 49, no. 12, pp. 2717–2731, Jun. 2004.
  • [21] ——, “Exact image reconstruction on pi-lines from minimum data in helical cone-beam CT,” Physics in Medicine & Biology, vol. 49, no. 6, pp. 941–959, Feb. 2004.
  • [22] Y. Ye and G. Wang, “Filtered backprojection formula for exact image reconstruction from cone-beam data along a general scanning curve,” Medical Physics, vol. 32, no. 1, pp. 42–48, Nov. 2005.
  • [23] Y. Zou, X. Pan, D. Xia, and G. Wang, “Pi-line-based image reconstruction in helical cone-beam computed tomography with a variable pitch,” Medical Physics, vol. 32, no. 8, pp. 2639–2648, Nov. 2005.
  • [24] G. Yan, J. Tian, S. Zhu, C. Qin, Y. Dai, F. Yang, D. Dong, and P. Wu, “Fast Katsevich algorithm based on GPU for helical cone-beam computed tomography,” IEEE Transactions on Information Technology in Biomedicine, vol. 14, no. 4, pp. 1053–1061, 2010.
  • [25] H. Lee, J. Lee, H. Kim, B. Cho, and S. Cho, “Deep-neural-network-based sinogram synthesis for sparse-view CT image reconstruction,” IEEE Transactions on Radiation and Plasma Medical Sciences, vol. 3, no. 2, pp. 109–119, 2019.
  • [26] J. Fu, J. Dong, and F. Zhao, “A deep learning reconstruction framework for differential phase-contrast computed tomography with incomplete data,” IEEE Transactions on Image Processing, vol. 29, pp. 2190–2202, 2020.
  • [27] H. Chen, Y. Zhang, M. K. Kalra, F. Lin, Y. Chen, P. Liao, J. Zhou, and G. Wang, “Low-dose CT with a residual encoder-decoder convolutional neural network,” IEEE Transactions on Medical Imaging, vol. 36, no. 12, pp. 2524–2535, 2017.
  • [28] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep convolutional neural network for inverse problems in imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
  • [29] Z. Zhang, X. Liang, X. Dong, Y. Xie, and G. Cao, “A sparse-view CT reconstruction method based on combination of DenseNet and deconvolution,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1407–1417, 2018.
  • [30] Y. Ge, T. Su, J. Zhu, X. Deng, Q. Zhang, J. Chen, Z. Hu, H. Zheng, and D. Liang, “Adaptive-net: deep computed tomography reconstruction network with analytical domain transformation knowledge,” Quantitative Imaging in Medicine and Surgery, vol. 10, no. 2, pp. 415–427, 2020.
  • [31] W. Lin, H. Liao, C. Peng, X. Sun, J. Zhang, J. Luo, R. Chellappa, and S. K. Zhou, “DuDoNet: Dual domain network for CT metal artifact reduction,” in 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019, pp. 10504–10513.
  • [32] Q. Zhang, Z. Hu, C. Jiang, H. Zheng, Y. Ge, and D. Liang, “Artifact removal using a hybrid-domain convolutional neural network for limited-angle computed tomography imaging,” Physics in Medicine & Biology, vol. 65, no. 15, p. 155010, Aug. 2020.
  • [33] W. Wang, X. G. Xia, C. He, Z. Ren, J. Lu, T. Wang, and B. Lei, “An end-to-end deep network for reconstructing CT images directly from sparse sinograms,” IEEE Transactions on Computational Imaging, vol. 6, pp. 1548–1560, 2020.
  • [34] H. Chen, Y. Zhang, Y. Chen, J. Zhang, W. Zhang, H. Sun, Y. Lv, P. Liao, J. Zhou, and G. Wang, “Learn: Learned experts’ assessment-based reconstruction network for sparse-data CT,” IEEE Transactions on Medical Imaging, vol. 37, no. 6, pp. 1333–1347, 2018.
  • [35] J. He, Y. Yang, Y. Wang, D. Zeng, Z. Bian, H. Zhang, J. Sun, Z. Xu, and J. Ma, “Optimizing a parameterized plug-and-play admm for iterative low-dose CT reconstruction,” IEEE Transactions on Medical Imaging, vol. 38, no. 2, pp. 371–382, 2019.
  • [36] S. Izen, “A fast algorithm to compute the $\pi$-line through points inside a helix cylinder,” Proceedings of the American Mathematical Society, vol. 135, pp. 269–276, Jul. 2006.
  • [37] K. He, X. Zhang, S. Ren, and J. Sun, “Deep residual learning for image recognition,” in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
  • [38] R. Hecht-Nielsen, “Theory of the Backpropagation Neural Network,” in IJCNN: International Joint Conference on Neural Networks, 1989, pp. 593–605.
  • [39] D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
  • [40] C. McCollough, “TU-FG-207A-04: Overview of the Low Dose CT Grand Challenge,” Medical Physics, vol. 43, pp. 3759–3760, 2016.
  • [41] Y. Liu, J. Ma, Y. Fan, and Z. Liang, “Adaptive-Weighted Total Variation Minimization for Sparse Data Toward Low-Dose X-Ray Computed Tomography Image Reconstruction,” Physics in Medicine & Biology, vol. 57, no. 23, pp. 7923–7956, 2012.
  • [42] J. Zhang, S. Liu, R. Xiong, S. Ma, and D. Zhao, “Improved total variation based image compressive sensing recovery by nonlocal regularization,” in IEEE International Symposium on Circuits and Systems (ISCAS), 2013, pp. 2836–2839.
  • [43] K. H. Jin, M. T. McCann, E. Froustey, and M. Unser, “Deep Convolutional Neural Network for Inverse Problems in Imaging,” IEEE Transactions on Image Processing, vol. 26, no. 9, pp. 4509–4522, 2017.
  • [44] T. Bejarano, M. De Ornelas-Couto, and I. B. Mihaylov, “Longitudinal fan-beam computed tomography dataset for head-and-neck squamous cell carcinoma patients,” Medical Physics, vol. 46, no. 5, pp. 2526–2537, 2019.