1 email: [email protected]
2 ZheJiang Minfound Intelligent Healthcare Technology Co., Ltd., West Wenyi Road, Hangzhou, China; email: [email protected]
3 Zhejiang Chinese Medical University Affiliated Jiaxing TCM Hospital, China
4 Dongyang Hospital of Wenzhou Medical University, China
Low-dose CT reconstruction by self-supervised learning in the projection domain
Abstract
To minimize patients' exposure to X-ray radiation, low-dose computed tomography (LDCT) has become a clear trend in radiology. However, while lowering the radiation dose reduces the risk to the patient, it also increases noise and artifacts, compromising image quality and clinical diagnosis. Most supervised learning methods require paired CT images, but such images are unlikely to be available in the clinic. We present a self-supervised learning model (Noise2Projection) that fully exploits the raw projection images to reduce noise and improve the quality of reconstructed LDCT images. Unlike existing self-supervised algorithms, the proposed method requires only noisy CT projection images and reduces noise by exploiting the correlation between nearby projection images. We trained and tested the model using clinical data, and the quantitative and qualitative results suggest that our model can effectively reduce LDCT image noise while also markedly reducing artifacts in LDCT images.
Keywords:
Low-dose CT · Image denoising · Self-supervised learning · Projection domain
1 Introduction
Computed tomography (CT) scans are commonly used in hospitals all over the world. However, the X-rays used in these scans expose patients to ionizing radiation, which can cause genetic damage and induce cancer. Low-dose CT (LDCT) imaging uses less radiation, reducing this risk but increasing noise and artifacts in the reconstructed images, which may adversely affect their clinical usefulness [8, 19].
Low-flux acquisition is a common approach for reducing the X-ray dose of a single exposure by lowering the X-ray tube current or shortening the exposure time [10]. Low X-ray exposure produces noisy projections, which in turn produce noisy CT images. Several LDCT reconstruction algorithms have been proposed in recent decades to obtain high-quality LDCT images. They fall into three categories: sinogram filtering [15, 9, 1], iterative reconstruction [3, 6], and image post-processing [4, 11].
With the renaissance of artificial intelligence in the past decade, various deep neural networks have been proposed to denoise LDCT images, and they have become the mainstream approach [7, 20, 18]. However, the majority of previous deep learning approaches for LDCT denoising depend on supervised learning with the normal-dose CT (NDCT) image as the ground truth for the LDCT image [7, 20, 18]. In a clinical setting, acquiring such paired data is very challenging because of the higher X-ray dose from multiple scans, as well as the patient's breathing and motion, which can cause multiple scans to be misaligned. Because paired data are difficult to obtain, most image-domain methods build paired images by adding noise to NDCT images. However, such an approach can bias the predictions of a CNN, because no synthetic noise can completely emulate the real noise and capture all the physical phenomena encountered in a CT machine.
Recent studies on self-supervised learning have demonstrated that denoising networks can be trained without clean references [14, 2, 12]. In these methods, the noiseless signal is deduced from the noisy image itself. A pioneering work, Noise2Noise [14], showed that a denoising network can be trained from pairs of noisy images alone. Noise2Noise was applied to denoising X-ray projections and CT images and compared with a supervised model in [5]. It gave acceptable results, but the images were over-smoothed. In practice, moreover, often only one noisy image of a particular scene is available. As a further step, self-supervised methods that require neither clean targets nor noisy image pairs have been proposed [2, 12, 13, 16, 22], but they add complexity to the model and the training process, making them less efficient. In recent work [21], a self-supervised learning model was developed for noise reduction in projection data, with promising results. However, the method has only been evaluated on simulated data and has yet to be validated on clinical data.
Inspired by previous work [16, 22, 21] and by our understanding of CT imaging, we propose a self-supervised learning model, Noise2Projection, for low-dose CT image reconstruction from raw projection data. We trained and tested the model directly on clinical data, and the quantitative and qualitative results suggest that our model can effectively reduce LDCT image noise while also markedly reducing artifacts in LDCT images.
2 Methodology
2.1 Noise2Projection
Inspired by Noise2Context [22] and Noise2Stack [16], we combined the two methods to build our model, Noise2Projection. Because there are over 1000 projections in one rotation of our CT scanner, the variations between neighbouring projections are minimal. We therefore apply the model to projection data in order to reduce noise and improve the quality of the reconstructed CT images. We use the projection data of three consecutive angles as the model's input and estimate the projection data of the intermediate angle, as indicated in Fig. 1.
For LDCT denoising in the projection domain, the basic mathematical model can be formulated as $y = x + n$, where $y$ is the noisy LDCT projection image, $x$ is the corresponding clean reference, and $n$ denotes the noise. To obtain the clean LDCT projection image $x$ from $y$, a neural network $f$ with parameters $\theta$ can be trained with the objective function in Eq. 1:
$$\arg\min_{\theta} \left\| f(y; \theta) - x \right\|_2^2 \tag{1}$$
In our model, we use the projection data of three adjacent angles ($y_{j-1}$, $y_{j}$, and $y_{j+1}$) as the input to the model and estimate the projection data of the intermediate angle. Since there are no corresponding clean projection images, we use a self-supervised learning method to train our model. The objective function of our network is Eq. 2:
$$\arg\min_{\theta} \sum_{i} \sum_{j} \sum_{k \in \{j-1,\, j,\, j+1\}} \left\| f(y^{i}_{j-1}, y^{i}_{j}, y^{i}_{j+1}; \theta) - y^{i}_{k} \right\|_2^2 \tag{2}$$
where $y^{i}_{j-1}$, $y^{i}_{j}$, and $y^{i}_{j+1}$ are three adjacent projection data of the $i$th patient, and $j$ is the index of the projection image.
Similar to [22], we make the following assumptions: (a) the differences between the projection data of adjacent angles are minor; (b) the noise in the projection data of distinct projection angles is mutually independent and zero mean. Under these assumptions, $\mathbb{E}[y^{i}_{k}]$ is equal to $x^{i}_{j}$ for each $k \in \{j-1, j, j+1\}$. As a consequence, Eq. 2 and Eq. 1 are equivalent in expectation.
We can observe from Eq. 2 that if we employ the two adjacent projections and the projection itself as the supervision, the self-supervised problem is equivalent, under these assumptions, to having the ground truth as the supervision.
Because our method is a training strategy rather than an architecture per se, it can be applied to any suitable neural network backbone. In this work, to implement Noise2Projection, we use a standard U-Net [17] with 32 base feature maps as the primary neural network $f$, and Eq. 2 as the loss function.
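The statistical argument behind this training strategy can be illustrated with a short, self-contained Python sketch. It is not the authors' Keras implementation and trains no network. It assumes a squared-error loss against the two neighbouring projections and the projection itself, uses synthetic 1-D "projections" with additive zero-mean Gaussian noise, and replaces the U-Net by the pixel-wise mean of the three views, which minimises that loss when the output is unconstrained. All names (`make_triplets`, `triplet_loss`, `mean_of_triplet`) and the toy data are hypothetical.

```python
import random


def make_triplets(projections):
    """Return (previous, current, next) triplets for every interior view index."""
    return [
        (projections[j - 1], projections[j], projections[j + 1])
        for j in range(1, len(projections) - 1)
    ]


def triplet_loss(prediction, triplet):
    """Mean squared error between one prediction and each of the three noisy targets."""
    total, count = 0.0, 0
    for target in triplet:
        for p, t in zip(prediction, target):
            total += (p - t) ** 2
            count += 1
    return total / count


def mean_of_triplet(triplet):
    """Pixel-wise mean of the three views (stand-in for the trained network)."""
    return [sum(pixels) / 3.0 for pixels in zip(*triplet)]


def mse(a, b):
    """Mean squared error between two equally long pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Toy data (assumption): a clean signal that changes very little from view to
# view, corrupted by independent zero-mean Gaussian noise.
rng = random.Random(0)
n_views, n_pixels, sigma = 200, 64, 1.0
clean = [[10.0 + 0.001 * j + 0.1 * k for k in range(n_pixels)] for j in range(n_views)]
noisy = [[v + rng.gauss(0.0, sigma) for v in row] for row in clean]

triplets = make_triplets(noisy)

# Error against the clean centre view, which is unavailable in clinical data
# and is used here only to check the argument.
err_raw = sum(mse(noisy[j], clean[j]) for j in range(1, n_views - 1)) / len(triplets)
err_avg = sum(
    mse(mean_of_triplet(t), clean[j]) for j, t in enumerate(triplets, start=1)
) / len(triplets)
```

In this toy setting `err_avg` comes out at roughly a third of `err_raw`, because averaging three views with independent noise divides the noise variance by three while the slowly varying signal is almost unchanged. This is the effect the two assumptions are meant to guarantee.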
2.2 Datasets and Evaluation
We trained our model on clinical human chest CT scan datasets. A total of 10 clinical patients (weight range 44.3-103 kg) were scanned with the Minfound ScintCare CT16 scanner, and we randomly selected the data of 7, 1, and 2 patients for training, validation, and testing, respectively. The number of projection views for a full 360° rotation is 1024, and the dimension of each acquired projection image is 912 × 16. On average, each patient has 17741 projections. During the projection data acquisition, the X-ray tube current was set to 30 mA, and the duration of the X-ray pulse at each projection view was 15 ms. The tube voltage was set to 120 kVp. The reconstruction was performed using the manufacturer-provided software with all physical corrections. The size of each reconstructed CT image is 512 × 512, with a pixel size of 0.706 × 0.706 mm. Because one rotation comprises 1024 projections, the difference between consecutive projections is small. Fig. 2 shows 10 consecutive CT projections, with no noticeable structural change.
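As a minimal sketch of the data handling described above, the following Python snippet performs a patient-level 7/1/2 split and works out two quantities implied by the stated acquisition parameters. The patient identifiers, the seed, and the function name `split_patients` are hypothetical; only the counts (10 patients, 1024 views per rotation, 17741 projections on average) come from the text.

```python
import random


def split_patients(patient_ids, n_train=7, n_val=1, n_test=2, seed=0):
    """Randomly split patient IDs so that no patient appears in two subsets."""
    if n_train + n_val + n_test != len(patient_ids):
        raise ValueError("subset sizes must add up to the number of patients")
    shuffled = list(patient_ids)
    random.Random(seed).shuffle(shuffled)
    return {
        "train": shuffled[:n_train],
        "val": shuffled[n_train:n_train + n_val],
        "test": shuffled[n_train + n_val:],
    }


# Hypothetical identifiers for the 10 patients.
patients = [f"patient_{k:02d}" for k in range(10)]
split = split_patients(patients)

# Quantities implied by the acquisition parameters quoted in the text.
views_per_rotation = 1024
mean_projections_per_patient = 17741
angular_step_deg = 360.0 / views_per_rotation  # about 0.35 degrees between views
rotations_per_scan = mean_projections_per_patient / views_per_rotation  # about 17.3
```

Splitting by patient rather than by projection keeps neighbouring, highly correlated views of the same scan out of different subsets. The angular step of about 0.35° between views is what makes the assumption of near-identical adjacent projections plausible.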
2.3 Implementation Details
In this work, we implement Noise2Projection and, as a baseline, one unsupervised model in the image domain (CycleWGANs [23]). The unsupervised baseline is trained with the same loss function as in the original paper [23] but without the supervised learning loss: only the adversarial loss, the consistency loss, and the cyclic loss are included.
The denoising network was built in Keras with a TensorFlow backend and tested on NVIDIA GeForce RTX 3090 GPUs. The network weights were initialized from a normal distribution and trained with the Adam optimizer ($\beta_1 = 0.5$, $\beta_2 = 0.999$) for 100 epochs. The learning rate was decreased linearly from its initial value to zero. The batch size was set to 16 for training and to 1 for inference. The network processes full-size projection images end to end for both training and inference.
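The linear learning-rate decay can be sketched in plain Python as follows. The initial learning rate of `2e-4` is a placeholder assumption, since the value is not recoverable from the text, and decaying once per epoch (rather than per step) is also an assumption. The Adam parameters and the 100 epochs are taken from the description above; the function name `linear_decay_lr` is hypothetical.

```python
def linear_decay_lr(epoch, total_epochs=100, base_lr=2e-4):
    """Learning rate at the start of a 0-based epoch, reaching zero at total_epochs.

    base_lr is a placeholder value, not the one used in the paper.
    """
    if not 0 <= epoch <= total_epochs:
        raise ValueError("epoch must lie in [0, total_epochs]")
    return base_lr * (1.0 - epoch / total_epochs)


# Optimizer settings quoted in the text.
adam_settings = {"beta_1": 0.5, "beta_2": 0.999}

# One learning rate per training epoch.
schedule = [linear_decay_lr(epoch) for epoch in range(100)]
```

In a Keras training loop, a function of this shape could be passed to a learning-rate scheduler callback, with the `adam_settings` values supplied to the optimizer constructor.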
2.4 Evaluation Metrics
Because our dataset was acquired in a clinical setting, no NDCT images are available as a reference. We therefore assessed model performance in the projection domain and the CT image domain using the noise reduction ratio, the Z-profile, the line profile, and a visual sanity check.
3 Experimental Results
After training, the raw projection data from the test set are fed into the proposed model to produce noise-reduced projection data. The quantitative characteristics of the projection data and of the reconstructed CT images are examined below.
3.1 Experimental Results on Projection Data
Fig. 3(a) shows a comparison of the raw and Noise2Projection-processed projection images. Because the projection image is only 16 pixels high, the details are hard to see, but the difference between the two is visible. As shown in Fig. 3(a), the Noise2Projection model lowers the noise level in the projection data without changing the local structure.
The Z-profiles (the mean value of each slice along the Z-direction) of the raw and Noise2Projection-processed projection data are shown in Fig. 3(b). Because of the low-dose scan, fewer photons pass through the body than in an NDCT scan, which results in noisy projection data. The noise in the data estimated by the Noise2Projection model is lower than in the raw projection data (smaller fluctuation in the Z-profile), and the quantitative error is negligible (relative error below 0.01%).
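The Z-profile comparison can be sketched in a few lines of Python. The Z-profile follows the definition given in the text (mean value of each slice along Z). The fluctuation measure and the relative-error formula are assumptions, since the text does not define them, and the tiny constant-valued "slices" are made-up data used only to show the calculation.

```python
def z_profile(stack):
    """Mean value of each 2-D slice in a stack ordered along Z."""
    return [
        sum(sum(row) for row in image) / sum(len(row) for row in image)
        for image in stack
    ]


def fluctuation(profile):
    """Mean absolute difference between consecutive profile samples (assumed measure)."""
    return sum(abs(b - a) for a, b in zip(profile, profile[1:])) / (len(profile) - 1)


def relative_error_percent(reference, estimate):
    """Relative difference between the overall means of two profiles, in percent."""
    ref_mean = sum(reference) / len(reference)
    est_mean = sum(estimate) / len(estimate)
    return abs(est_mean - ref_mean) / abs(ref_mean) * 100.0


# Made-up stacks of four 2 x 2 slices: same overall mean, smaller swings after processing.
raw_stack = [[[v, v], [v, v]] for v in (102.0, 98.0, 102.0, 98.0)]
processed_stack = [[[v, v], [v, v]] for v in (100.5, 99.5, 100.5, 99.5)]

raw_profile = z_profile(raw_stack)
processed_profile = z_profile(processed_stack)

raw_fluct = fluctuation(raw_profile)              # 4.0 for this toy stack
processed_fluct = fluctuation(processed_profile)  # 1.0 for this toy stack
bias = relative_error_percent(raw_profile, processed_profile)  # 0.0 here
```

A smaller fluctuation with a near-zero relative error is the pattern reported for Fig. 3(b): the noise goes down while the mean signal level is preserved.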
3.2 Experimental Results on CT Image
Our Noise2Projection model first processes the raw projection data, then we import the processed projection data along with the raw projection data into the CT vendor’s reconstruction tool for image reconstruction, and finally we assess the quality of the reconstructed images using quantitative and qualitative methods.
3.2.1 Artifact removal
Fig. 4 shows CT images reconstructed from raw LDCT projection data and from projection data processed by the Noise2Projection model. The LDCT image is shown in Fig. 4(a) and the Noise2Projection result in Fig. 4(b). Some artifacts, including "black bands" and ring artifacts, can be seen in the original LDCT image and are likely to obstruct the physician's diagnosis. They arise because the X-ray flux is reduced, so that fewer photons are detected. These artifacts are most common in the shoulder and neck region because of the greater bone content there, which results in stronger photon attenuation. As seen in Fig. 4, our model efficiently removed the "black bands" and ring artifacts in the LDCT images (mainly bone-induced artifacts, as indicated by the red rectangles in Fig. 4(a) and (b)) while also recovering part of the destroyed structures.
3.2.2 Noise reduction
We selected ROIs in uniform locations on numerous organs or tissues on the reconstructed images and evaluated the mean CT values and standard deviation within each ROI region to assess the efficacy of our proposed approach on noise reduction in CT images. Table 1 illustrates the mean CT values and standard deviation of ROIs in numerous organs, as well as the noise reduction ratios. We can observe that our proposed strategy is quite effective in the liver, muscle, and kidney with noise reduction ratios of 47.68%, 44.58%, and 44.63%, respectively. The lungs, on the other hand, do not show considerable noise reduction since they are mostly air, with minimal x-ray attenuation.
In addition, we looked at the changes in line profiles on the reconstructed images. We computed the line profile on a randomly selected image with no noticeable artifacts. Fig. 5 displays the line-profile comparison. The line profile produced by the Noise2Projection model matches that of the LDCT image closely, and its fluctuation is reduced, which indicates lower noise.
| Method | Lung | Liver | Muscle | Kidney |
|---|---|---|---|---|
| LDCT | -832.83 ± 80.74 | 60.89 ± 15.54 | 59.42 ± 20.32 | 33.74 ± 14.92 |
| CycleWGANs | -830.86 ± 79.28 | 61.10 ± 12.23 | 56.92 ± 14.79 | 36.50 ± 10.57 |
| NLM | -832.87 ± 80.15 | 62.38 ± 14.04 | 59.38 ± 16.69 | 33.81 ± 11.94 |
| Noise2Projection | -833.09 ± 77.01 | 58.51 ± 8.13 | 58.69 ± 11.26 | 29.41 ± 8.26 |
| NRR for CycleWGANs | 1.81% | 21.30% | 27.21% | 29.15% |
| NRR for NLM | 0.73% | 9.65% | 17.86% | 19.97% |
| NRR for Noise2Projection | 4.62% | 47.68% | 44.58% | 44.63% |
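The noise reduction ratio (NRR) is not defined explicitly in the text. The Python sketch below assumes it is the relative decrease of the ROI standard deviation with respect to the LDCT image, which reproduces the reported Noise2Projection values to within rounding. The function name `noise_reduction_ratio` is hypothetical; the numbers are copied from Table 1.

```python
def noise_reduction_ratio(sd_reference, sd_processed):
    """Relative decrease in ROI standard deviation, in percent (assumed definition)."""
    return (sd_reference - sd_processed) / sd_reference * 100.0


# ROI standard deviations and reported NRR values taken from Table 1.
ldct_sd = {"Lung": 80.74, "Liver": 15.54, "Muscle": 20.32, "Kidney": 14.92}
n2p_sd = {"Lung": 77.01, "Liver": 8.13, "Muscle": 11.26, "Kidney": 8.26}
reported_nrr = {"Lung": 4.62, "Liver": 47.68, "Muscle": 44.58, "Kidney": 44.63}

computed_nrr = {
    organ: noise_reduction_ratio(ldct_sd[organ], n2p_sd[organ]) for organ in ldct_sd
}

# Largest gap between the computed and the reported values, in percentage points.
max_gap = max(abs(computed_nrr[organ] - reported_nrr[organ]) for organ in ldct_sd)
```

The small remaining gaps (about 0.01 percentage points for muscle and kidney) are consistent with the table having been computed from unrounded standard deviations.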
3.2.3 Comparison with image domain methods
To study the effectiveness of our proposed model, we compared it with one deep learning model, CycleWGANs [23], and one traditional method, non-local means (NLM) [4]. Both are high-performance methods in the image domain. CT images from raw LDCT data and those processed by Noise2Projection, CycleWGANs, and NLM are compared in Fig. 4 and Table 1. Both CycleWGANs and NLM reduce noise, with noise reduction ratios ranging from 1.81% to 29.15% for CycleWGANs and from 0.73% to 19.97% for NLM. However, neither CycleWGANs nor NLM is capable of properly removing the artifacts found in LDCT images, such as "black bands" and ring artifacts. Our Noise2Projection model not only achieves the best noise reduction performance but also effectively removes artifacts from LDCT images.
4 Conclusion
In this research, we propose a self-supervised learning model called Noise2Projection to reconstruct low-dose CT images from raw projection data. We train and evaluate the model using clinical data, and the quantitative and qualitative results suggest that our method outperforms two image-domain methods. The reconstructed CT images are not only substantially less noisy but also have fewer artifacts. This research may lead to the development of a new set of LDCT image denoising methods. In future work, we will investigate the mechanism of the artifact reduction observed in this research, and we will apply our method to other artifact reduction tasks in CT and PET scans.
References
- [1] Image quality and radiation dose of low dose coronary CT angiography in obese patients: Sinogram affirmed iterative reconstruction versus filtered back projection. European Journal of Radiology 81(11) (2012)
- [2] Batson, J., Royer, L.: Noise2Self: Blind denoising by self-supervision (2019)
- [3] Beister, M., Kolditz, D., Kalender, W.A.: Iterative reconstruction methods in X-ray CT. Physica Medica 28(2), 94–108 (2012)
- [4] Dutta, J., Leahy, R.M., Li, Q.: Non-local means denoising of dynamic PET images. PLoS ONE 8(12), e81390 (2013)
- [5] Gnudi, P., Schweizer, B., Kachelrieß, M., Berker, Y.: Denoising of X-ray projections and computed tomography images using convolutional neural networks without clean data. In: The 6th International Conference on Image Formation in X-Ray Computed Tomography. pp. 590–593 (2020)
- [6] Hara, A.K., Paden, R.G., Silva, A.C., Kujak, J.L., Lawder, H.J., Pavlicek, W.: Iterative reconstruction technique for reducing body radiation dose at CT: feasibility study. AJR Am J Roentgenol 193(3), 764–771 (2009)
- [7] Chen, H., Zhang, Y., Zhang, W., Liao, P., Li, K.: Low-dose CT via convolutional neural network. Biomedical Optics Express (2017)
- [8] Hsieh, J.: Computed Tomography: Principles, Design, Artifacts, and Recent Advances, 2nd edn. (2009)
- [9] Wang, J., Lu, H., Li, T., Liang, Z.: Sinogram noise reduction for low-dose CT by statistics-based nonlinear filters. International Society for Optics and Photonics (2005)
- [10] Kalra, M.K., Maher, M.M., Toth, T.L., Hamberg, L.M., Blake, M.A., Shepard, J.A., Saini, S.: Strategies for CT radiation dose optimization. Radiology 230(3), 619–628 (2004)
- [11] Kang, D., Slomka, P., Nakazato, R., Woo, J., Berman, D.S., Kuo, C., Dey, D.: Image denoising of low-radiation dose coronary CT angiography by an adaptive block-matching 3D algorithm. In: SPIE Medical Imaging (2013)
- [12] Krull, A., Buchholz, T.O., Jug, F.: Noise2Void - learning denoising from single noisy images. In: 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (2019)
- [13] Laine, S., Karras, T., Lehtinen, J., Aila, T.: High-quality self-supervised deep image denoising (2019)
- [14] Lehtinen, J., Munkberg, J., Hasselgren, J., Laine, S., Karras, T., Aittala, M., Aila, T.: Noise2Noise: Learning image restoration without clean data (2018)
- [15] Manduca, A., Yu, L., Trzasko, J.D., Khaylova, N., Kofler, J.M., McCollough, C.M., Fletcher, J.G.: Projection space denoising with bilateral filtering and CT noise modeling for dose reduction in CT. Medical Physics 36(11), 4911–4919 (2009)
- [16] Papkov, M., Roberts, K., Madissoon, L.A., Shilts, J., Bayraktar, O., Fishman, D., Palo, K., Parts, L.: Noise2Stack: Improving image restoration by learning from volumetric data. In: International Workshop on Machine Learning for Medical Image Reconstruction. pp. 99–108. Springer (2021)
- [17] Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional networks for biomedical image segmentation (2015)
- [18] Shan, H., Zhang, Y., Yang, Q., Kruger, U., Kalra, M.K., Sun, L., Cong, W., Wang, G.: 3D convolutional encoder-decoder network for low-dose CT via transfer learning from a 2D trained network. IEEE Transactions on Medical Imaging 37(6), 1522 (2018)
- [19] Snyder, D.L., Miller, M.I., Thomas, L.J., Politte, D.G.: Noise and edge artifacts in maximum-likelihood reconstructions for emission tomography. IEEE Transactions on Medical Imaging 6(3), 228 (1987)
- [20] Yang, Q., Yan, P., Zhang, Y., Yu, H., Shi, Y., Mou, X., Kalra, M.K., Zhang, Y., Sun, L., Wang, G.: Low-dose CT image denoising using a generative adversarial network with Wasserstein distance and perceptual loss. IEEE Transactions on Medical Imaging pp. 1348–1357 (2018)
- [21] Zainulina, E., Chernyavskiy, A., Dylov, D.V.: No-reference denoising of low-dose CT projections. In: 2021 IEEE 18th International Symposium on Biomedical Imaging (ISBI). pp. 77–81. IEEE (2021)
- [22] Zhang, Z., Liang, X., Zhao, W., Xing, L.: Noise2Context: Context-assisted learning 3D thin-layer for low-dose CT. Medical Physics 48(10), 5794–5803 (2021)
- [23] Zhou, L., Schaefferkoetter, J.D., Tham, I.W., Huang, G., Yan, J.: Supervised learning with CycleGAN for low-dose FDG PET image denoising. Medical Image Analysis 65, 101770 (2020)