Email: {yunkuipa,yilinliu}@cs.unc.edu, {jun_lian,ptyap}@med.unc.edu
Huaqiao University, Xiamen 361021, China. Email: [email protected]
SinoSynth: A Physics-based Domain Randomization Approach for Generalizable CBCT Image Enhancement
Abstract
Cone Beam Computed Tomography (CBCT) finds diverse applications in medicine. Ensuring high image quality in CBCT scans is essential for accurate diagnosis and treatment delivery. Yet, the susceptibility of CBCT images to noise and artifacts undermines both their usefulness and reliability. Existing methods typically address CBCT artifacts through image-to-image translation approaches. These methods, however, are limited by the artifact types present in the training data, which may not cover the complete spectrum of CBCT degradations stemming from variations in imaging protocols. Gathering additional data to encompass all possible scenarios can often pose a challenge. To address this, we present SinoSynth, a physics-based degradation model that simulates various CBCT-specific artifacts to generate a diverse set of synthetic CBCT images from high-quality CT images, without requiring pre-aligned data. Through extensive experiments, we demonstrate that several different generative networks trained on our synthesized data achieve remarkable results on heterogeneous multi-institutional datasets, outperforming even the same networks trained on actual data. We further show that our degradation model conveniently provides an avenue to enforce anatomical constraints in conditional generative models, yielding high-quality and structure-preserving synthetic CT images. Code is available at https://github.com/Pangyk/SinoSynth.
Keywords:
CBCT · CT · Domain randomization · Unpaired image translation.

1 Introduction
CBCT provides intricate 3D imaging capabilities crucial for precise diagnosis, treatment planning, and surgical guidance across diverse medical fields [11, 2, 20]. CBCT typically utilizes lower radiation doses compared to fan-beam CT and boasts greater portability and cost-effectiveness, which are key factors driving its widespread deployment in treatment rooms. However, CBCT images are susceptible to noise and artifacts [21], resulting in lower image quality than CT images [27]. Thus, it has been of great interest to the clinical community to enhance the image quality of CBCT.
Supervised image-to-image translation methods have achieved promising performance in CBCT enhancement, under the prerequisite of sufficient paired and well-aligned CBCT-CT datasets [6, 22]. Unsupervised methods utilizing generative adversarial networks (GANs) [7, 19, 8], which leverage unpaired CBCT-CT images, partially mitigate the issues. Nevertheless, their implicit learning of CBCT characteristics (e.g., artifacts) may make it challenging to distinguish between anatomical details and noise when the dataset size is small, as demonstrated in our experiments. Although there have been a few publicly available datasets [1, 15, 25] with CBCT-CT images for several organs, differences in imaging equipment and scanning protocols could still result in severe performance degradation of the model pre-trained on these datasets (Fig. 3). This is because the quality of CBCT images is highly dependent on the specific imaging device used, leading to significant variability in appearance and data distributions across various scanning settings [18]. The considerable variability presents practical challenges to data collection and model generalizability in real-world clinical scenarios.
To mitigate these issues, many efforts have been devoted to enriching the training data (Fig. 1). Brion et al. [4] performed intensity-based data augmentation by incorporating adjustments to brightness, contrast, and sharpness. However, such methods do not consider the physical causes of CBCT artifacts, rendering the augmentation less effective. Dahiya et al. [9] simulated CBCT images by directly extracting artifacts from existing CBCT images and mapping them to the corresponding CT images. As a result, their method requires accurately pre-aligned CT and CBCT images. Additionally, the types of artifacts that can be simulated remain limited to those present within the available datasets. Synthetic data generation methods have also been explored for general planar X-ray images [10], but they are not tailored to 3D CBCT images.
In this work, we present SinoSynth, a physics-based CT-to-CBCT degradation model with adjustable parameters for extensive domain simulation. Given a CT image, our approach synthesizes a variety of CBCT artifacts controlled by randomly sampled parameters, such as the noise level and geometry configuration. As such, an unlimited number of aligned CBCT-CT image pairs can be generated for training. Notably, our method requires only a set of CT images, and the synthesized training data are inherently aligned. To better reflect domain-specific variations, we incorporate domain knowledge of CBCT into the degradation model in two ways: 1) we parameterize the cone-beam geometry to simulate diverse CBCT scanning configurations, and 2) we translate the formation of CBCT artifacts into algorithms controlled by adjustable parameters, adhering to X-ray physics [3]. As some of the artifacts typically co-occur, our focus is on simulating five representative CBCT artifacts [21, 23]. We empirically verify, through experiments, that their combinations cover the degradation space of actual CBCT images more effectively than previous augmentation methods.
We integrate SinoSynth into several existing synthetic CT generation frameworks, including Denoising Diffusion GAN (DDGAN) [29], CycleGAN [7], DRIT [16], ROI-aware DCLGAN [8], and FGDM [17]. Moreover, our degradation model enables structural guidance by enforcing consistency between the synthesized CT output and the conditioned simulated CBCT image. Particularly, to cope with the stochasticity in generative networks, we showcase that this strategy ensures better structural preservation than using the original conditioning scheme alone.
To summarize, our contributions are three-fold:
1. We propose a physics-based CBCT degradation model (SinoSynth) that incorporates domain knowledge to simulate domain-randomized CBCT artifacts across various imaging protocols, thereby mitigating the need for extensive data collection.
2. Compared to networks trained with previous augmentation methods or actual data, SinoSynth-trained networks demonstrate significantly better zero-shot generalization and structure preservation on challenging datasets collected from multiple hospitals.
3. Our work underscores the significance of accurate CBCT simulation modeling for generalizable CT synthesis.
2 Method
Fig. 2 illustrates the proposed pipeline. SinoSynth generates simulated CBCT (sCBCT) images by simulating CBCT artifacts on CT images in each batch on the fly during training (Sec. 2.2). The artifacts are simulated with random occurrences and in a shuffled order. Meanwhile, the degradation model is used for structural guidance (Sec. 2.3). After training, the network is directly applied to actual CBCT images.
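The on-the-fly degradation loop can be sketched as follows. The two toy artifact callables and the occurrence probability `p` are illustrative placeholders, not the authors' exact implementation; the real pipeline draws from the five simulators of Sec. 2.2.

```python
import numpy as np

# Toy artifact callables standing in for the simulators of Sec. 2.2;
# each maps a sinogram to a degraded sinogram.
def add_quantum_noise(sino, rng):
    n0 = rng.uniform(1e4, 1e5)                       # photon count controls noise level
    counts = np.maximum(rng.poisson(n0 * np.exp(-sino)), 1)
    return -np.log(counts / n0)

def add_stripes(sino, rng):
    width = int(rng.integers(2, 6))                  # random stripe width in pixels
    stripes = (np.arange(sino.shape[0]) // width) % 2
    return sino * (1.0 + 0.1 * stripes[:, None])

ARTIFACTS = [add_quantum_noise, add_stripes]

def degrade(sino, rng, p=0.5):
    """Apply each artifact with probability p, in a shuffled order."""
    ops = [f for f in ARTIFACTS if rng.random() < p]
    rng.shuffle(ops)
    for f in ops:
        sino = f(sino, rng)
    return sino
```

Because the sampling happens per batch, every epoch effectively sees a fresh set of degradations of the same CT volumes.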
2.1 Preliminaries
Our CBCT simulation is performed in the sinogram domain. The CT slice represented in Hounsfield Units is first converted into a linear attenuation coefficient map $\mu$ via a linear transformation [5]. The Radon transform integrated along the cone-beam geometry $\Phi$ is then applied to project $\mu$ onto the sinogram:

$$S = -\ln\Big(\sum_{e} I_0(e)\,\exp\big(-\mathcal{R}_{\Phi}(\mu_e)\big)\Big), \qquad (1)$$

where the constant $I_0(e)$ is the intensity of the entry X-ray beam at energy level $e$ (in keV) of the composite spectrum, $\mu_e$ is the attenuation map at energy $e$, and $\mathcal{R}_{\Phi}$ denotes the Radon transform under geometry $\Phi$. We then simulate different types of CBCT artifacts and noise based on the derived sinogram $S$. Reconstructing from the sinogram using filtered back-projection gives the corresponding CBCT image used for training. We implemented Eq. (1) with the ASTRA Toolbox [26] and the Operator Discretization Library (https://github.com/odlgroup/odl).
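A minimal stand-in for this pipeline is sketched below. The paper uses a cone-beam projector via ASTRA/ODL; here a crude parallel-beam Radon transform (rotate, then sum along rays) illustrates the idea, and `MU_WATER` is an illustrative attenuation value, not the paper's calibration.

```python
import numpy as np
from scipy import ndimage

MU_WATER = 0.0192  # mm^-1, approximate water attenuation; illustrative value

def hu_to_mu(ct_hu):
    """Linear transform from Hounsfield Units to attenuation coefficients."""
    return MU_WATER * (1.0 + np.asarray(ct_hu, dtype=float) / 1000.0)

def radon(img, thetas_deg):
    """Crude parallel-beam Radon transform: rotate the image, then sum
    along columns. A stand-in for the cone-beam projection of Eq. (1)."""
    return np.stack([
        ndimage.rotate(img, -t, reshape=False, order=1).sum(axis=0)
        for t in thetas_deg
    ])
```

Note that `hu_to_mu(-1000)` (air) maps to zero attenuation and `hu_to_mu(0)` (water) to `MU_WATER`, matching the linear HU definition.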
2.2 Domain-Randomized CBCT Simulation
Scanner Effects Simulation. The diversity among different CBCT scanners significantly impacts the varied appearance of CBCT artifacts, resulting in decreased network performance [18]. To simulate scanner variability, we parameterize CBCT scanners with different cone-beam geometries $\Phi = (d_{so}, d_{od}, N_p, d_s)$. As illustrated in Fig. 2 (b), $d_{so}$ denotes the distance from the X-ray source to the patient; $d_{od}$ is the distance from the patient to the detector; $N_p$ denotes the number of X-ray projections, where a smaller $N_p$ indicates a sparser view; and $d_s$ denotes the detector size. Hence, we simulate the effects of CBCT scanners from different vendors by computing Eq. (1) with a randomly parameterized $\Phi$.
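Geometry randomization might look like the following sketch. The field names mirror the parameters above, but the sampling ranges and the discrete view/detector choices are assumptions for illustration, not the paper's exact intervals.

```python
import random
from dataclasses import dataclass

@dataclass
class ConeBeamGeometry:
    dso: float     # source-to-patient distance (mm)
    dod: float     # patient-to-detector distance (mm)
    n_views: int   # number of projections; fewer views -> sparser sampling
    det_size: int  # detector width in pixels

def sample_geometry(rng: random.Random) -> ConeBeamGeometry:
    """Draw one random scanner configuration per synthesized volume."""
    return ConeBeamGeometry(
        dso=rng.uniform(700.0, 1100.0),
        dod=rng.uniform(300.0, 600.0),
        n_views=rng.choice([90, 180, 360, 720]),
        det_size=rng.choice([256, 384, 512]),
    )
```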
Metal Artifacts usually occur in the presence of metal implants. The low X-ray transmission of metallic implants and the polychromatic nature of the X-ray source result in severe beam hardening [2], bringing scatter and streak artifacts [30]. Thus, for simulation, we need to create metal trace regions on the sinogram $S$. This is done by first taking the intersection between a random mask $M_b$, obtained through a cubic Bézier curve [12] controlled by randomly sampled points, and a mask of bone areas obtained through thresholding, yielding the metal mask $M_m$, which is then converted to the sinogram trace $T_m = \mathcal{R}_{\Phi}(M_m)$. We then update the metal trace regions of $S$ by filling in the linear attenuation coefficients of the metal-implanted area:

$$S(u) = -\ln\Big(\sum_{e} I_0(e)\,\exp\big(-\mathcal{R}_{\Phi}(\mu_e)(u) - \mathcal{R}_{\Phi}(\mu_e^{m} M_m)(u)\big)\Big), \quad u \in T_m, \qquad (2)$$

where each $\mu_e^{m}$ corresponding to an energy level $e$ is computed as $\mu_e^{m} = (\mu/\rho)_e \cdot \rho$. The mass attenuation coefficient $(\mu/\rho)_e$ can be obtained from [14]. Adjusting the density $\rho$ of the metal material allows for controlling the intensity of the metal artifacts.
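A sketch of the metal-mask construction follows. Dilating sampled curve points into a blob is a simplification of filling the Bézier region, and all names are illustrative; the metal's attenuation (the `(mu/rho)_e * rho` product above) would then be projected and written into the trace $T_m$.

```python
import numpy as np
from scipy import ndimage

def cubic_bezier(p0, p1, p2, p3, n=400):
    """Sample n points on a cubic Bezier curve from four control points."""
    t = np.linspace(0.0, 1.0, n)[:, None]
    return ((1 - t) ** 3 * p0 + 3 * (1 - t) ** 2 * t * p1
            + 3 * (1 - t) * t ** 2 * p2 + t ** 3 * p3)

def random_metal_mask(shape, bone_mask, rng):
    """Intersect a random Bezier blob M_b with the bone mask to get M_m."""
    h, w = shape
    ctrl = rng.uniform(0.0, 1.0, size=(4, 2)) * np.array([h - 1, w - 1])
    pts = cubic_bezier(*ctrl).round().astype(int)
    blob = np.zeros(shape, dtype=bool)
    blob[pts[:, 0], pts[:, 1]] = True
    blob = ndimage.binary_dilation(blob, iterations=3)  # thicken into a region
    return blob & bone_mask
```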
Extinction Artifacts occur when the object contains highly absorbent material, which significantly attenuates the X-ray signal, reducing it to near-zero [23]. Since the attenuated regions on CBCT images are often irregular and discontinuous, we generate a random image-domain mask $M_x$ using the cubic Bézier curve controlled by randomly sampled points within a local region. The sinogram is then updated by:

$$S' = S + \beta\,\mathcal{R}_{\Phi}(M_x), \qquad (3)$$

where $\mathcal{R}_{\Phi}$ denotes the Radon transform and $\beta$ controls the attenuation extent.
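Under the reading that extinction adds the projected absorber to the line integrals (our reconstruction of Eq. (3)), the update is a one-liner; the parallel-beam `project` helper is a stand-in for the cone-beam $\mathcal{R}_{\Phi}$.

```python
import numpy as np
from scipy import ndimage

def project(mask, thetas_deg):
    """Parallel-beam stand-in for the Radon transform R."""
    return np.stack([
        ndimage.rotate(mask.astype(float), -t, reshape=False, order=1).sum(axis=0)
        for t in thetas_deg
    ])

def apply_extinction(sino, mask, thetas_deg, beta=2.0):
    """S' = S + beta * R(M): rays crossing the absorber get larger line
    integrals, i.e. near-zero transmitted signal after exponentiation."""
    return sino + beta * project(mask, thetas_deg)
```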
Quantum Noise is the main source of image deterioration in plain radiography, arising from the natural variability in how photons reach the detector [28]. As quantum noise predominantly impacts images acquired at low radiation doses, CBCT images are expected to exhibit a higher noise level than CT images. Given that quantum noise usually follows a Poisson distribution [24], the quantum noise on sinogram $S$ can be simulated as:

$$S' = -\ln\frac{k}{N_0}, \quad k \sim \mathrm{Poisson}\big(N_0\,e^{-S}\big), \qquad (4)$$

where $N_0$ controls the noise level, and $k$ is the number of photon occurrences during a time interval.
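Eq. (4), as reconstructed here, translates directly into code; `n0` (the incident photon count) is the single knob, and a smaller value mimics a lower-dose, noisier acquisition.

```python
import numpy as np

def add_quantum_noise(sino, n0, rng):
    """Draw photon counts k ~ Poisson(n0 * exp(-S)) and log-transform back;
    a smaller n0 (lower dose) yields a noisier sinogram."""
    counts = np.maximum(rng.poisson(n0 * np.exp(-sino)), 1)  # avoid log(0)
    return -np.log(counts / n0)
```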
Zebra Artifacts appear as alternating bright and dark stripes in CBCT images due to helical interpolation [21]. We simulate them by creating a binary stripe mask $M_z$, with the width of each stripe randomized within a preset pixel range. We then simulate stripes of different directions by applying a random rotation of angle $\theta$ to the mask, $M_z^{\theta} = \mathrm{rot}(M_z, \theta)$. The zebra artifact on the sinogram is simulated as:

$$S' = S + \gamma\,\mathcal{R}_{\Phi}(M_z^{\theta}), \qquad (5)$$

where $\gamma$ controls the stripe intensity.
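A possible stripe-mask generator is sketched below; the width bounds are illustrative, and nearest-neighbor interpolation (`order=0`) keeps the rotated mask binary.

```python
import numpy as np
from scipy import ndimage

def zebra_mask(shape, rng, min_w=2, max_w=8):
    """Binary stripe mask M_z with random stripe width, rotated randomly."""
    h, w = shape
    width = int(rng.integers(min_w, max_w + 1))
    stripes = ((np.arange(h) // width) % 2).astype(float)
    mask = np.tile(stripes[:, None], (1, w))
    angle = float(rng.uniform(0.0, 180.0))
    # order=0 (nearest neighbor) keeps the mask binary after rotation
    return ndimage.rotate(mask, angle, reshape=False, order=0)
```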
Motion Artifacts are caused by patient motion and appear as unsharpness in the reconstructed image [21]. The process can be viewed as an image disturbed by small deformations and displacements during imaging, along with a blurring effect simulated by a Gaussian filter:

$$x' = G * \big(\lambda x + (1 - \lambda)\,\mathcal{T}(x)\big), \qquad (6)$$

where $G$ is the Gaussian blur filter, $\mathcal{T}$ represents the deformation and displacement caused by the motion, controlled by the deformation degree, the rotation angle, and the resizing factor, and $\lambda$ is a factor balancing $x$ and its deformed result $\mathcal{T}(x)$.
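A simplified sketch of this blend-and-blur model: here $\mathcal{T}$ is reduced to a small random rotation and shift (the resizing factor is omitted for brevity), and all parameter ranges are illustrative.

```python
import numpy as np
from scipy import ndimage

def motion_artifact(img, rng, sigma=1.0, max_angle=3.0, lam=0.5):
    """x' = G * (lam * x + (1 - lam) * T(x)), where T applies a small random
    rotation and shift as a simplified stand-in for the deformation."""
    angle = float(rng.uniform(-max_angle, max_angle))
    shift = rng.uniform(-2.0, 2.0, size=2)
    moved = ndimage.rotate(img, angle, reshape=False, order=1)
    moved = ndimage.shift(moved, shift, order=1)
    blended = lam * img + (1.0 - lam) * moved
    return ndimage.gaussian_filter(blended, sigma)
```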
2.3 Structural Guidance for Generative networks
As our CBCT simulation model $D$ is differentiable, we propose simple anatomical structure constraints to regularize the generation of sCT in both the sinogram and image domains. For a simulated CBCT image $\hat{x} = D(x)$ of a CT image $x$ and the network output $G(\hat{x})$, the denoised output should be convertible back to $\hat{x}$ via the degradation model. Meanwhile, the output should be consistent with the reference CT in the sinogram domain after applying a mask $M_s$ that removes the metal-affected region. Hence, the generative network is guided by the following losses to generate anatomically consistent output:

$$\mathcal{L}_{st} = \big\| D(G(\hat{x})) - \hat{x} \big\|_1, \qquad (7)$$

$$\mathcal{L}_{si} = \big\| M_s \odot \big(\mathcal{R}_{\Phi}(G(\hat{x})) - \mathcal{R}_{\Phi}(x)\big) \big\|_1. \qquad (8)$$

Since we apply our method to existing frameworks, all models are trained with their original losses (e.g., GAN losses) alongside the proposed structure-preserving losses.
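The two constraints, as read from Eqs. (7)-(8), can be sketched in numpy; in actual training the differentiable degradation model sits inside the network graph, and the parallel-beam `project` and the `degrade` callable here are stand-ins.

```python
import numpy as np
from scipy import ndimage

def project(img, thetas_deg):
    """Parallel-beam stand-in for the Radon transform R."""
    return np.stack([
        ndimage.rotate(img, -t, reshape=False, order=1).sum(axis=0)
        for t in thetas_deg
    ])

def structure_loss(sct, scbct, degrade):
    """L_st: re-degrading the synthesized CT should recover the input sCBCT."""
    return np.abs(degrade(sct) - scbct).mean()

def sinogram_loss(sct, ct, thetas_deg, metal_trace):
    """L_si: sinograms of output and reference CT agree off the metal trace."""
    keep = 1.0 - metal_trace
    return np.abs(keep * (project(sct, thetas_deg) - project(ct, thetas_deg))).mean()
```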
| Training data | Method | PSNR | SSIM | MAE |
|---|---|---|---|---|
| Actual CBCT-CT | Supervised [6] | 24.82 | 0.821 | 32.73 |
| Actual CBCT-CT | DCLGAN [8] | 23.85 | 0.784 | 35.04 |
| Actual CBCT-CT | DRIT [16] | 22.54 | 0.758 | 80.97 |
| Actual CBCT-CT | FGDM [17] | 23.36 | 0.819 | 36.42 |
| Simulated (ours) | Supervised [6] | 25.44 | 0.829 | 26.32 |
| Simulated (ours) | DCLGAN [8] | 24.34 | 0.801 | 28.34 |
| Simulated (ours) | DRIT [16] | 23.14 | 0.773 | 31.24 |
| Simulated (ours) | FGDM [17] | 25.61 | 0.838 | 22.50 |
| Methods | w/o Aug. | Brion et al. [4] | Dahiya et al. [9] | Ours |
|---|---|---|---|---|
| PSNR | 22.51 | 23.67 | 23.84 | 25.08 |
| SSIM | 0.733 | 0.752 | 0.769 | 0.823 |
| MAE | 60.07 | 51.52 | 50.18 | 32.67 |
3 Experiments
3.1 Datasets and Pre-processing
We curated a head-and-neck CBCT dataset of 250 patients from five hospitals in Europe and the US, scanned with tube loads of 10, 12, 20, and 50 mAs. 160 randomly sampled patients are used for training, 20 for validation, and 70 for evaluation. All data are resampled to a common voxel size, with a fixed in-plane image size for each slice. We use Peak Signal-to-Noise Ratio (PSNR) [13], Structural Similarity (SSIM) [13], and the Mean Absolute Error (MAE) of HU values to evaluate all compared methods. Our method simulates CBCT images on the fly during training instead of pre-generating them. Each model is trained for 200 epochs.
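PSNR and MAE in HU can be computed as below; SSIM typically comes from a library implementation (e.g., scikit-image) and is omitted here. The default `data_range` is an illustrative HU span, not the paper's stated choice.

```python
import numpy as np

def psnr(ref, img, data_range=4096.0):
    """Peak Signal-to-Noise Ratio over a given dynamic range (e.g., HU span)."""
    mse = np.mean((np.asarray(ref, float) - np.asarray(img, float)) ** 2)
    return 10.0 * np.log10(data_range ** 2 / mse)

def mae(ref, img):
    """Mean absolute error, reported in HU when inputs are HU maps."""
    return float(np.mean(np.abs(np.asarray(ref, float) - np.asarray(img, float))))
```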
3.2 Comparisons with actual CBCT-CT data
We compare established image-to-image translation models trained on our simulated data with the same models trained on actual CBCT-CT data. The base models include a supervised U-Net [6], ROI-aware DCLGAN [8], DRIT [16], and FGDM [17], which is based on a diffusion probabilistic model [29]. Table 3 and Fig. 4 demonstrate that our method significantly enhances generalization across various artifacts, leading to superior CBCT noise reduction and soft-tissue enhancement. Our method also improves inpainting performance in the shoulder region.
3.3 Comparisons with other augmentation methods
We compare SinoSynth with existing augmentation methods [9, 4], employing CycleGAN [7] as the base model. As shown in Fig. 3 and Table 4, SinoSynth outperforms the existing augmentation methods. This is because the existing methods inadequately amplify the diversity of CBCT-specific artifacts, leaving the network susceptible to out-of-distribution CBCT artifacts, as depicted in Fig. 1. Our approach addresses this limitation by explicitly simulating a wide range of CBCT artifacts, thereby enhancing CycleGAN's performance.
| Metal | Quantum | Extinction | Zebra | Motion | Si-Cons. | St-Cons. | PSNR | SSIM | MAE |
|---|---|---|---|---|---|---|---|---|---|
| ✓ | | | | | | | 23.01 | 0.733 | 32.63 |
| | ✓ | | | | | | 23.62 | 0.732 | 31.24 |
| | | ✓ | | | | | 22.24 | 0.722 | 32.36 |
| | | | ✓ | | | | 23.16 | 0.741 | 32.45 |
| | | | | ✓ | | | 22.87 | 0.716 | 33.57 |
| ✓ | ✓ | ✓ | ✓ | ✓ | | ✓ | 23.28 | 0.792 | 29.82 |
| ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | | 24.56 | 0.774 | 25.51 |
| ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ | 25.63 | 0.841 | 21.44 |
3.4 Ablation studies
Quantitative results in Table 3 reveal two key insights. First, when ablating the artifact types, the relative PSNR/SSIM scores reflect how frequently each CBCT artifact occurs in the test data. Second, when ablating the loss function design choices, both constraints contribute to CBCT image enhancement. As shown qualitatively in the appendix, the structure consistency constraint plays a crucial role in synthesizing structurally consistent images, while the sinogram consistency constraint aids both CT noise reduction and accurate structure reconstruction.
4 Conclusion
In this work, we introduce a physics-based domain randomization approach to address the inherent challenges associated with generating corresponding CT images from CBCT scans, including susceptibility to artifacts and limited generalizability. Our innovative approach involves synthesizing CBCT images with realistic artifacts, enabling us to overcome these obstacles. Through extensive experiments, we demonstrate that deep generative networks trained on our synthetic CBCT images outperform those trained on actual data. This suggests a promising avenue for leveraging simulated CBCT data to train deep networks on larger-scale CT-only datasets, which are more readily accessible online. Our work not only improves the reliability of CBCT in clinical settings but also lays the groundwork for future advancements in other medical imaging modalities.
4.0.1 Acknowledgements
This work was supported by the National Institutes of Health under grants R01CA206100 and R01EB035160 and UNC Lineberger Developmental Award 29242. Xu Chen was supported by NSFC grant 62276105, Natural Science Foundation of Xiamen China 3502Z20227193, Natural Science Foundation of Fujian Province 2023J01136, and Scientific Research Funds of Huaqiao University 20221XD029.
4.0.2 Disclosure of Interests
The authors have no competing interests to declare that are relevant to the content of this article.
References
- [1] Balik, S., Weiss, E., Jan, N., Roman, N., Sleeman, W.C., Fatyga, M., Christensen, G.E., Zhang, C., Murphy, M.J., Lu, J., et al.: Evaluation of 4-dimensional computed tomography to 4-dimensional cone-beam computed tomography deformable image registration for lung cancer adaptive radiation therapy. International Journal of Radiation Oncology* Biology* Physics 86(2), 372–379 (2013)
- [2] Bayaraa, T., Hyun, C.M., Jang, T.J., Lee, S.M., Seo, J.K.: A two-stage approach for beam hardening artifact reduction in low-dose dental cbct (2020)
- [3] Beer: Bestimmung der absorption des rothen lichts in farbigen flüssigkeiten. Ann. Phys. 162(5), 78–88 (Jan 1852)
- [4] Brion, E., Léger, J., Barragán-Montero, A.M., Meert, N., Lee, J.A., Macq, B.: Domain adversarial networks and intensity-based data augmentation for male pelvic organ segmentation in cone beam ct. Computers in Biology and Medicine 131, 104269 (2021)
- [5] Brown, S., Bailey, D.L., Willowson, K., Baldock, C.: Investigation of the relationship between linear attenuation coefficients and ct hounsfield units using radionuclides for spect. Applied Radiation and Isotopes 66(9), 1206–1212 (2008)
- [6] Chen, L., Liang, X., Shen, C., Jiang, S., Wang, J.: Synthetic ct generation from cbct images via deep learning. Medical physics 47(3), 1115–1125 (2020)
- [7] Chen, L., Liang, X., Shen, C., Nguyen, D., Jiang, S., Wang, J.: Synthetic ct generation from cbct images via unsupervised deep learning. Physics in Medicine & Biology 66(11), 115019 (2021)
- [8] Chen, X., Pang, Y., Ahmad, S., Royce, T., Wang, A., Lian, J., Yap, P.T.: Organ-aware cbct enhancement via dual path learning for prostate cancer treatment. Medical Physics (2023)
- [9] Dahiya, N., Alam, S.R., Zhang, P., Zhang, S.Y., Li, T., Yezzi, A., Nadeem, S.: Multitask 3d cbct-to-ct translation and organs-at-risk segmentation using physics-based data augmentation. Medical Physics 48(9), 5130–5141 (2021)
- [10] Gao, C., Killeen, B.D., Hu, Y., Grupp, R.B., Taylor, R.H., Armand, M., Unberath, M.: Synthetic data accelerates the development of generalizable learning-based algorithms for x-ray image analysis. Nature Machine Intelligence 5(3), 294–308 (2023)
- [11] Gupta, J., Ali, S.P.: Cone beam computed tomography in oral implants. National journal of maxillofacial surgery 4(1), 2 (2013)
- [12] Han, X.A., Ma, Y., Huang, X.: A novel generalization of bézier curve and surface. Journal of Computational and Applied Mathematics 217(1), 180–193 (2008)
- [13] Horé, A., Ziou, D.: Image quality metrics: Psnr vs. ssim. In: 2010 20th International Conference on Pattern Recognition. pp. 2366–2369 (2010). https://doi.org/10.1109/ICPR.2010.579
- [14] Hubbell, J.H., Seltzer, S.M.: X-ray mass attenuation coefficients. https://dx.doi.org/10.18434/T4D01F (2004), https://www.nist.gov/pml/x-ray-mass-attenuation-coefficients
- [15] Hugo, G.D., Weiss, E., Sleeman, W.C., Balik, S., Keall, P.J., Lu, J., Williamson, J.F.: Data from 4D lung imaging of NSCLC patients (2016)
- [16] Lee, H.Y., Tseng, H.Y., Mao, Q., Huang, J.B., Lu, Y.D., Singh, M., Yang, M.H.: Drit++: Diverse image-to-image translation via disentangled representations. International Journal of Computer Vision 128, 2402–2417 (2020)
- [17] Li, Y., Shao, H.C., Liang, X., Chen, L., Li, R., Jiang, S., Wang, J., Zhang, Y.: Zero-shot medical image translation via frequency-guided diffusion models. arXiv preprint arXiv:2304.02742 (2023)
- [18] Liang, X., Nguyen, D., Jiang, S.B.: Generalizability issues with deep learning models in medicine and their potential solutions: illustrated with cone-beam computed tomography (cbct) to computed tomography (ct) image conversion. Machine Learning: Science and Technology 2(1), 015007 (2020)
- [19] Liu, J., Yan, H., Cheng, H., Liu, J., Sun, P., Wang, B., Mao, R., Du, C., Luo, S.: Cbct-based synthetic ct generation using generative adversarial networks with disentangled representation. Quantitative Imaging in Medicine and Surgery 11(12), 4820 (2021)
- [20] Miracle, A., Mukherji, S.: Conebeam ct of the head and neck, part 2: clinical applications. American journal of neuroradiology 30(7), 1285–1292 (2009)
- [21] Nagarajappa, A.K., Dwivedi, N., Tiwari, R.: Artifacts: The downturn of cbct image. Journal of International Society of Preventive & Community Dentistry 5(6), 440 (2015)
- [22] Peng, J., Qiu, R.L., Wynne, J.F., Chang, C.W., Pan, S., Wang, T., Roper, J., Liu, T., Patel, P.R., Yu, D.S., et al.: Cbct-based synthetic ct image generation using conditional denoising diffusion probabilistic model. arXiv preprint arXiv:2303.02649 (2023)
- [23] Schulze, R., Heil, U., Gro, D., Bruellmann, D.D., Dranischnikow, E., Schwanecke, U., Schoemer, E.: Artefacts in cbct: a review. Dentomaxillofacial Radiology 40(5), 265–273 (2011)
- [24] Strid, K.G.: Significance of quantum fluctuations in roentgen imaging. Acta Radiologica: Oncology 19(2), 129–138 (1980)
- [25] Thummerer, A., van der Bijl, E., Maspero, M.: SynthRAD2023 grand challenge dataset: synthetizing computed tomography for radiotherapy (2023)
- [26] Van Aarle, W., Palenstijn, W.J., Cant, J., Janssens, E., Bleichrodt, F., Dabravolski, A., De Beenhouwer, J., Batenburg, K.J., Sijbers, J.: Fast and flexible x-ray tomography using the astra toolbox. Optics express 24(22), 25129–25147 (2016)
- [27] Venkatesh, E., Elluru, S.V.: Cone beam computed tomography: basics and applications in dentistry. Journal of istanbul University faculty of Dentistry 51(3 Suppl 1), 102–121 (2017)
- [28] Wang, J., Lu, H., Liang, Z., Eremina, D., Zhang, G., Wang, S., Chen, J., Manzione, J.: An experimental study on the noise properties of x-ray ct sinogram data in radon space. Physics in Medicine & Biology 53(12), 3327 (2008)
- [29] Xiao, Z., Kreis, K., Vahdat, A.: Tackling the generative learning trilemma with denoising diffusion gans (2022)
- [30] Zhang, Y., Yu, H.: Convolutional neural network based metal artifact reduction in x-ray computed tomography. IEEE Transactions on Medical Imaging 37(6), 1370–1381 (2018). https://doi.org/10.1109/TMI.2018.2823083