A Semiparametric Gaussian Mixture Model for Chest CT-based 3D Blood Vessel Reconstruction
Abstract
Computed tomography (CT) has been a powerful diagnostic tool since its emergence in the 1970s. Using CT data, three-dimensional (3D) structures of human internal organs and tissues, such as blood vessels, can be reconstructed using professional software. This 3D reconstruction is crucial for surgical operations and can serve as a vivid medical teaching example. However, traditional 3D reconstruction heavily relies on manual operations, which are time-consuming, subjective, and require substantial experience. To address this problem, we develop a novel semiparametric Gaussian mixture model tailored for the 3D reconstruction of blood vessels. This model extends the classical Gaussian mixture model by allowing the component-wise parameters of interest to vary nonparametrically with voxel position. We develop a kernel-based expectation-maximization algorithm for estimating the model parameters, accompanied by a supporting asymptotic theory. Furthermore, we propose a novel regression method for optimal bandwidth selection, which outperforms the conventional cross-validation (CV) method in terms of both computational and statistical efficiency. In application, this methodology facilitates the fully automated reconstruction of 3D blood vessel structures with remarkable accuracy.

Keywords: Blood Vessel; Computed Tomography; Gaussian Mixture Model; Nonparametric Kernel Smoothing; TensorFlow; 3D Reconstruction.
1 Introduction
Computed tomography (CT) is an advanced three-dimensional (3D) imaging technology capable of generating detailed, high-resolution 3D images of internal organs and structural intricacies of the human body. The basic idea of 3D CT imaging is to divide an organ, a lung for example, into many thin slices. Each slice is then meticulously scanned by a CT machine, yielding a detailed, high-resolution grayscale image, and the resulting slices can be examined individually. Although examining each CT image slice individually offers practical utility, this approach fails to harness the benefits of 3D technology. For example, valuable information embedded within the 3D structure may be overlooked.
In surgical practices, such as lobectomy, wedge resection, or segmentectomy, it is critically important for surgeons to reconstruct blood vessels in three dimensions before surgery. This practice enables surgeons to clearly visualize the blood vessels' anatomical structure, which greatly reduces the risk of accidental or missed blood vessel ruptures during surgery (Hagiwara et al., 2014). Furthermore, massive hemorrhages can be avoided (Ji et al., 2021). In addition, the 3D reconstructions of blood vessels offer invaluable educational benefits to novice surgeons by facilitating a comprehensive understanding of the lung anatomy and the shape distribution of blood vessels. However, current standard imaging technology does not yet enable surgeons to perform three-dimensional reconstructions of blood vessels in a fully automated manner. That is, at present, the 3D reconstruction of blood vessels requires substantial expertise. For instance, the widely utilized RadiAnt software (https://www.radiantviewer.com/), recognized for its efficiency and user-friendly interface in viewing CT images, still requires significant manual effort for such complex reconstructions. On average, it takes approximately ten minutes for a very experienced surgeon to reconstruct 3D images of blood vessels from raw CT images. To this end, a sequence of delicate but critical fine-tuning decisions must be made correctly. This can be a challenging task for less experienced surgeons, and surgical efficiency can consequently be greatly affected. Therefore, developing an automatic method for reconstructing 3D blood vessels from raw CT images is of great interest.
To solve this problem, we propose a novel semiparametric Gaussian mixture model (GMM) for CT-based 3D blood vessel reconstruction. Our method is inspired by the empirical observation that different tissues of different densities manifest different gray intensities in CT images. Even within the same type of tissue, gray intensities may exhibit slight fluctuations according to their different positions within the human body. Consequently, CT values corresponding to different organ tissues are expected to manifest different means and variances, while those related to the same organ tissue are anticipated to have similar means and variances. This seems to be an ideal situation for applying mixture models for unsupervised clustering analysis. In this regard, the classical GMM (McLachlan and Peel, 2000) appears to be a natural choice.
In a GMM, we assume that each type of meaningful human organ tissue (e.g., blood vessels) or the image background is represented by a distinct Gaussian mixture component. Each mixture component is assumed to be a parametric Gaussian model with globally constant mean and variance such that the parameters do not flexibly vary by 3D positions in the CT space. To some extent, this assumption imitates the common practice of standard CT imaging software (e.g., RadiAnt), where only two globally constant, fine-tuned parameters, namely the window width (WW) and window level (WL), are allowed. Although this simple practice with globally constant fine-tuned parameters is practically useful, it is not fully satisfactory. This is because, even for the same human organ (e.g., blood vessels), the CT density varies according to voxel positions within the 3D CT space. Furthermore, we observe that the globally constant mean (WL) and constant standard deviation (WW) are inadequate to accommodate these variations. Consequently, the accurate reconstruction of blood vessels in a 3D image remains elusive. Thus, extending the classical GMM to consider this variation becomes a key issue.
To solve this problem, we extend the classical GMM by allowing the parameters of interest, including the mean, variance, and class prior probability functions, to vary nonparametrically. This variation is due to the different voxel positions within the 3D CT space. We call this a semiparametric method because it integrates both a parametric component, represented by the GMM, and a nonparametric component, represented by the nonparametric mean, variance, and class prior probability functions. To estimate the parameters of interest for a given voxel position, a kernel-based expectation-maximization (KEM) algorithm is developed. Our extensive numerical experiments suggest that this method is effective. To facilitate computational efficiency, the entire algorithm is presented in a tensor format to efficiently execute the KEM algorithm on a GPU device. This design allows for the full utilization of the GPU’s parallel computing capabilities. Once the parameters of interest are estimated accurately, the posterior probability can be computed for the given voxel position associated with the blood vessel category. This step is crucial for accurately identifying and reconstructing blood vessels within the 3D CT space.
Remarkably, our 3D reconstruction method is very different from the traditional practice commonly used by CT imaging software (e.g., RadiAnt). In commercial software, raw CT densities are used to reconstruct blood vessels three-dimensionally. However, as noted, this method results in 3D blood vessel images that lack accuracy. This is mainly because blood vessels from different voxel positions often have different CT intensity levels, which cannot be fully captured by globally constant fine-tuned parameters. By contrast, our method reconstructs 3D blood vessels using locally computed posterior probabilities. By examining CT intensities locally, we can obtain a much clearer vascular anatomy of the blood vessels. This makes the task of classifying each voxel into the right organ considerably easier. The locally discovered blood vessels are then represented by their posterior probabilities. By employing these posterior probabilities as the image intensities, we find that the expression levels of blood vessels from different voxel positions within the 3D CT space are more comparable. This significantly improves the reconstruction accuracy of the blood vessels.
To summarize, we make two important contributions in this work. First, we contribute to surgical practice by proposing a fully automated method for 3D blood vessel reconstruction with significantly improved accuracy. Second, our study enriches the statistical theory of the classical GMM by extending it from a parametric model to a semiparametric one. The remainder of this paper is organized as follows. Section 2 introduces the model and the KEM algorithm and elucidates the primary theoretical properties of the proposed method. Section 3 presents extensive numerical studies of the KEM method. Section 4 concludes this article with a comprehensive discussion of our findings. The technical details are included in the Supplementary Materials.
2 Methodology
2.1 Model and Notations
Let $(X_i, Y_i)$ be the observation collected from the $i$th subject with $1 \le i \le N$, where $Y_i \in \mathbb{R}$ is the univariate response of interest (i.e., the CT value) and $X_i \in [0,1]^3$ is the associated voxel position in a three-dimensional Euclidean space. We assume that $X_i$ is uniformly distributed on $[0,1]^3$. Furthermore, we assume that each observation is associated with a latent class label $Z_i \in \{0, 1, \dots, K-1\}$, where $K$ is the total number of classes. We then assume that $P(Z_i = k \mid X_i = x) = \pi_k(x)$ for $0 \le k \le K-1$. Clearly, we should have $\sum_{k=0}^{K-1} \pi_k(x) = 1$ for any $x \in [0,1]^3$. This suggests that we can define $\pi_0(x) = 1 - \sum_{k=1}^{K-1} \pi_k(x)$ for any $x \in [0,1]^3$ throughout this article. Conditional on $Z_i = k$ and $X_i = x$, we assume that $Y_i$ is normally distributed with mean $\mu_k(x)$ and variance $\sigma_k^2(x)$. Here, we assume that the class prior probability functions $\pi_k(x)$, mean functions $\mu_k(x)$, and variance functions $\sigma_k^2(x)$ are all smooth functions in $x$ and vary between classes. Next, we consider how to consistently estimate them.
We collect the observed voxel positions and CT values into $\mathbb{X} = \{X_i : 1 \le i \le N\}$ and $\mathbb{Y} = \{Y_i : 1 \le i \le N\}$, respectively. Furthermore, we collect the unobserved latent class labels into $\mathbb{Z} = \{Z_i : 1 \le i \le N\}$. For a given voxel position $x \in [0,1]^3$, define $\theta(x) = \{\theta_k(x) : 0 \le k \le K-1\}$, where $\theta_k(x) = \{\pi_k(x), \mu_k(x), \sigma_k^2(x)\}^\top$. To estimate $\theta(x)$, we develop a novel KEM method. We begin with a highly simplified case with $\pi_k(x) = \pi_k$, $\mu_k(x) = \mu_k$, and $\sigma_k^2(x) = \sigma_k^2$ for some constants $\pi_k$, $\mu_k$, and $\sigma_k^2$. We then have the log-likelihood function as
$$\ell(\theta) = \sum_{i=1}^{N} \log\big\{ f(X_i, Y_i; \theta) \big\} = \sum_{i=1}^{N} \log\Big\{ f_X(X_i) \sum_{k=0}^{K-1} \pi_k\, \sigma_k^{-1}\, \phi\Big(\frac{Y_i - \mu_k}{\sigma_k}\Big) \Big\}, \tag{1}$$
where $f(x, y; \theta)$ stands for the joint probability density function of $(X_i, Y_i)$ evaluated at $(x, y)$; and $\sigma_k^{-1}\phi\{(y - \mu_k)/\sigma_k\}$ is the conditional probability density function of $Y_i$ evaluated at $y$ conditional on $X_i = x$ and $Z_i = k$. Moreover, $f_X(x)$ is the probability density function of $X_i$ evaluated at $x$. Since $X_i$ is assumed to be uniformly generated on $[0,1]^3$, we have $f_X(x) = 1$ for $x \in [0,1]^3$ and $f_X(x) = 0$ for $x \notin [0,1]^3$. Moreover, the function $\phi(\cdot)$ is the probability density function of a standard normal random variable.
Nevertheless, the class prior probability functions $\pi_k(x)$, mean functions $\mu_k(x)$, and variance functions $\sigma_k^2(x)$ in our case are not constant. Instead, they should vary in response to the value of $x$ (i.e., the voxel position within the 3D CT space) in a fully nonparametric manner. However, for an arbitrarily given position $x$, we should expect $\pi_k(\cdot)$, $\mu_k(\cdot)$, and $\sigma_k^2(\cdot)$ to be locally approximately constant. This is indeed a reasonable assumption as long as these functions are sufficiently smooth with continuous second-order derivatives. Thus, according to Proposition 3.10 of Lang (2012), we then know that the second-order derivatives of $\pi_k(x)$, $\mu_k(x)$, and $\sigma_k^2(x)$ are uniformly bounded from above on the compact set $[0,1]^3$ for any $0 \le k \le K-1$. Similarly, we know that $\sigma_k^2(x)$ is uniformly bounded from below on $[0,1]^3$ by a positive constant for any $0 \le k \le K-1$. Therefore, we draw inspiration from the concept of the local constant estimator (Nadaraya, 1965; Watson, 1964) and local maximum likelihood estimation (Fan et al., 1998) and introduce the following locally weighted log-likelihood function based on the observed data $\{(X_i, Y_i) : 1 \le i \le N\}$:
$$\ell(\theta; x) = \sum_{i=1}^{N} K_h(X_i - x) \log\Big\{ \sum_{k=0}^{K-1} \pi_k\, \sigma_k^{-1}\, \phi\Big(\frac{Y_i - \mu_k}{\sigma_k}\Big) \Big\}, \tag{2}$$
where $x$ stands for a fixed interior point in $[0,1]^3$. Note that $\theta = \{(\pi_k, \mu_k, \sigma_k^2) : 0 \le k \le K-1\}$ is a set of working parameters with $\pi_k \ge 0$, $\sigma_k^2 > 0$, and $\sum_{k=0}^{K-1} \pi_k = 1$. For class $k = 0$, we set $\pi_0 = 1 - \sum_{k=1}^{K-1} \pi_k$. Moreover, the three-dimensional kernel function $K_h(u)$ is assumed to be $K_h(u) = \prod_{j=1}^{3} h^{-1} K(u_j/h)$ for $u = (u_1, u_2, u_3)^\top$, where $K(\cdot)$ is a continuous probability density function symmetric about 0. Here, $h > 0$ is the associated bandwidth. What distinguishes (2) from the classical log-likelihood function (1) is that information on the peer positions of $x$ is also blended in for local estimation. For convenience, we refer to $\ell(\theta; x)$ as a locally weighted log-likelihood function. Then, the local maximum likelihood estimators can be defined as $\hat{\theta}(x) = \arg\max_{\theta} \ell(\theta; x)$. For convenience, we refer to $\hat{\theta}(x)$ as the kernel maximum likelihood estimators (KMLE). We next consider how to optimize $\ell(\theta; x)$ so that $\hat{\theta}(x)$ can be computed.
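For concreteness, the following short sketch (for illustration only; the function and variable names are our own, not part of the released implementation) computes the product Gaussian kernel weights $K_h(X_i - x)$ appearing in (2) for a single query position, assuming the voxel positions are stored as an $N \times 3$ array:

```python
import numpy as np

def kernel_weights(X, x, h):
    """Product Gaussian kernel weights K_h(X_i - x) for all observations.

    X : (N, 3) array of voxel positions.
    x : (3,) array, the query voxel position.
    h : float, the bandwidth.
    """
    u = (X - x) / h                                   # scaled coordinate differences
    # Product of three univariate standard normal densities, scaled by h^{-3}.
    return np.exp(-0.5 * np.sum(u ** 2, axis=1)) / ((2.0 * np.pi) ** 1.5 * h ** 3)

# Toy usage with 1000 voxels generated uniformly on [0, 1]^3.
rng = np.random.default_rng(0)
X = rng.uniform(size=(1000, 3))
w = kernel_weights(X, x=np.array([0.5, 0.5, 0.5]), h=0.1)
```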
2.2 The KEM Algorithm
Recall that the locally weighted log-likelihood function $\ell(\theta; x)$ based on the observed data is already defined in (2). Ever since Dempster et al. (1977), a typical way to optimize (2) is to develop an EM algorithm based on the so-called complete log-likelihood function, that is, the log-likelihood function obtained by assuming that the latent class labels are observed. Thus, if we have access to the complete data $\{(X_i, Y_i, Z_i) : 1 \le i \le N\}$, we can define a complete locally weighted log-likelihood function for the given voxel position $x$ as follows:
$$\ell_c(\theta; x) = \sum_{i=1}^{N} K_h(X_i - x) \sum_{k=0}^{K-1} I(Z_i = k) \log\Big\{ \pi_k\, \sigma_k^{-1}\, \phi\Big(\frac{Y_i - \mu_k}{\sigma_k}\Big) \Big\}, \tag{3}$$
where $I(\cdot)$ is an indicator function. In theory, any location of interest can be taken as $x$. In practice, a location of interest is often taken at the recorded voxel positions of the CT data. The objective here is to consistently estimate the nonparametric functions $\pi_k(x)$, $\mu_k(x)$, and $\sigma_k^2(x)$ for $0 \le k \le K-1$. We start with a set of initial estimators $\hat{\pi}_k^{(0)}(x)$ and $\hat{\mu}_k^{(0)}(x)$, which are computed from initial class labels obtained by (for example) a standard $k$-means algorithm (MacQueen et al., 1967). Furthermore, we set the initial variance estimator $\hat{\sigma}_k^{2(0)}(x)$ to some pre-specified constant. We write $\hat{\pi}_k^{(t)}(x)$, $\hat{\mu}_k^{(t)}(x)$, and $\hat{\sigma}_k^{2(t)}(x)$ as the estimators obtained in the $t$th step.
E step. Based on the current estimate $\hat{\theta}^{(t)}(x) = \{\hat{\pi}_k^{(t)}(x), \hat{\mu}_k^{(t)}(x), \hat{\sigma}_k^{2(t)}(x) : 0 \le k \le K-1\}$, we next take the expectation of (3) conditional on the observed data and the current estimate $\hat{\theta}^{(t)}(x)$ as follows:
$$Q\big\{\theta; \hat{\theta}^{(t)}(x)\big\} = \sum_{i=1}^{N} K_h(X_i - x) \sum_{k=0}^{K-1} \hat{\pi}_{ik}^{(t)} \log\Big\{ \pi_k\, \sigma_k^{-1}\, \phi\Big(\frac{Y_i - \mu_k}{\sigma_k}\Big) \Big\}, \tag{4}$$
where $\hat{\pi}_{ik}^{(t)} = E\big\{ I(Z_i = k) \mid X_i, Y_i, \hat{\theta}^{(t)}(x) \big\}$ is the posterior probability that the $i$th observation belongs to class $k$. Specifically, $\hat{\pi}_{ik}^{(t)}$ can be computed as follows:
$$\hat{\pi}_{ik}^{(t)} = \frac{\hat{\pi}_k^{(t)}(x)\,\big\{\hat{\sigma}_k^{(t)}(x)\big\}^{-1}\,\phi\Big[\big\{Y_i - \hat{\mu}_k^{(t)}(x)\big\}\big/\hat{\sigma}_k^{(t)}(x)\Big]}{\sum_{l=0}^{K-1} \hat{\pi}_l^{(t)}(x)\,\big\{\hat{\sigma}_l^{(t)}(x)\big\}^{-1}\,\phi\Big[\big\{Y_i - \hat{\mu}_l^{(t)}(x)\big\}\big/\hat{\sigma}_l^{(t)}(x)\Big]}. \tag{5}$$
M step. In this step, we maximize (4) and obtain a new estimate in the $(t+1)$th step. Note that we have $\sum_{k=0}^{K-1} \pi_k = 1$. Based on the Lagrange method, we can define the Lagrangian function as $Q\{\theta; \hat{\theta}^{(t)}(x)\} + \lambda\big(1 - \sum_{k=0}^{K-1} \pi_k\big)$, where $\lambda$ is an additional parameter introduced by the Lagrange multiplier method. After maximizing the Lagrangian function, we can update the parameters in the $(t+1)$th step by
$$\hat{\pi}_k^{(t+1)}(x) = \frac{\sum_{i=1}^{N} K_h(X_i - x)\,\hat{\pi}_{ik}^{(t)}}{\sum_{i=1}^{N} K_h(X_i - x)}, \tag{6}$$

$$\hat{\mu}_k^{(t+1)}(x) = \frac{\sum_{i=1}^{N} K_h(X_i - x)\,\hat{\pi}_{ik}^{(t)}\, Y_i}{\sum_{i=1}^{N} K_h(X_i - x)\,\hat{\pi}_{ik}^{(t)}}, \tag{7}$$

$$\hat{\sigma}_k^{2(t+1)}(x) = \frac{\sum_{i=1}^{N} K_h(X_i - x)\,\hat{\pi}_{ik}^{(t)}\, \big\{Y_i - \hat{\mu}_k^{(t+1)}(x)\big\}^2}{\sum_{i=1}^{N} K_h(X_i - x)\,\hat{\pi}_{ik}^{(t)}}. \tag{8}$$
For convenience, we refer to (5)–(8) as a KEM algorithm, which should be iteratively executed until convergence. Our extensive numerical experiments suggest that the algorithm works very well.
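To make the iteration concrete, a minimal sketch of one KEM update at a single query position $x$ is given below (for illustration only, not the released implementation). It reuses the hypothetical `kernel_weights` helper sketched earlier; the E-step implements (5) and the M-step implements (6)–(8):

```python
import numpy as np
from scipy.stats import norm

def kem_step(X, Y, x, h, pi, mu, sigma2):
    """One KEM iteration at voxel position x; pi, mu, sigma2 are length-K arrays."""
    w = kernel_weights(X, x, h)                       # kernel weights K_h(X_i - x)
    dens = norm.pdf(Y[:, None], loc=mu, scale=np.sqrt(sigma2))   # (N, K) normal densities
    # E-step (5): posterior class probabilities for every observation.
    post = pi * dens
    post /= post.sum(axis=1, keepdims=True)
    # M-step (6)-(8): kernel-weighted updates of the local parameters.
    wk = w[:, None] * post                            # combined kernel/posterior weights
    pi_new = wk.sum(axis=0) / w.sum()
    mu_new = (wk * Y[:, None]).sum(axis=0) / wk.sum(axis=0)
    sigma2_new = (wk * (Y[:, None] - mu_new) ** 2).sum(axis=0) / wk.sum(axis=0)
    return pi_new, mu_new, sigma2_new
```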
2.3 Asymptotic Properties
Next, we study the asymptotic properties of the KMLE $\hat{\theta}(x)$. First, we define some notations. Recall that $\ell(\theta; x)$ is the locally weighted log-likelihood function defined in (2), whose first- and second-order derivatives with respect to $\theta$ are defined accordingly. The following technical conditions are then required:
- (C1) (Kernel Function) Assume that the univariate kernel function $K(\cdot)$ is a continuous probability density function symmetric about 0, with finite moments up to a sufficiently high order.
- (C2) (Bandwidth and Sample Size) Assume $h = c_h N^{-1/7}$ for some constant $c_h > 0$.
As we mentioned before, we assume that the kernel function $K_h(\cdot)$ is the product of three univariate kernel density functions symmetric about 0. By Condition (C1), each univariate kernel density function should be a well-behaved probability density function with various finite moments. By Condition (C2), we require the bandwidth $h$ to converge to zero at the speed of $N^{-1/7}$ as the sample size $N$ goes to infinity. As a consequence, the locally effective sample size, denoted by $Nh^3$, diverges toward infinity as $N \to \infty$. Meanwhile, the bandwidth is well constructed so that the resulting estimation bias should not be too large. Both Conditions (C1) and (C2) are fairly standard conditions, which have been widely used in the literature (Fan and Gijbels, 1996; Ullah and Pagan, 1999; Li and Racine, 2007; Silverman, 2018).
Denote the Euclidean norm of an arbitrary vector $v$ as $\|v\| = (v^\top v)^{1/2}$. Then, we have the following Theorem 2.1, with the technical details provided in Appendix A and Appendix B of the Supplementary Materials.
Theorem 2.1.
Assume that both Conditions (C1) and (C2) are satisfied, then (i) there exists a local maximum likelihood estimator such that ; and (ii) , where is given by
2.4 Bandwidth Selection through Cross Validation
To ensure the asymptotic properties of the KMLE, we must carefully select the optimal bandwidth $h$. From Condition (C2), we know that the optimal bandwidth should satisfy $h = c_h N^{-1/7}$. The question is how to make an optimal choice of the bandwidth constant $c_h$. To do so, an appropriately defined optimality criterion is required. One natural criterion could be out-of-sample forecasting accuracy. Let $(X^*, Y^*)$ be an independent copy of $(X_i, Y_i)$. Recall that $\hat{\theta}(x)$ denotes the estimators obtained from the data $\{(X_i, Y_i) : 1 \le i \le N\}$. Note that $E(Y^* \mid X^*) = \sum_{k=0}^{K-1} \pi_k(X^*)\mu_k(X^*)$. Therefore, a natural prediction for $Y^*$ can be constructed as $\hat{Y}^* = \sum_{k=0}^{K-1} \hat{\pi}_k(X^*)\hat{\mu}_k(X^*)$. Its square prediction error (SPE) can then be evaluated as $\mathrm{SPE}(h) = E(Y^* - \hat{Y}^*)^2$. Using a standard Taylor expansion-type argument, we can obtain an analytical formula for SPE in the following theorem:
Theorem 2.2.
Assume that both Conditions (C1) and (C2) are satisfied, we then have
where , , , and . Here, and .
The technical details of Theorem 2.2 are provided in Appendix C of the Supplementary Materials. By optimizing the leading term of $\mathrm{SPE}(h)$ from Theorem 2.2, we know that the optimal bandwidth constant is determined by two critical constants. However, both critical constants are extremely difficult to compute without explicit assumptions on the concrete forms of $\pi_k(x)$, $\mu_k(x)$, and $\sigma_k^2(x)$ with respect to $x$. To solve this problem, a cross-validation (CV) type method is used to compute the optimal bandwidth. To this end, we need to randomly partition all voxel positions into two parts. The first part contains approximately 80% of the total voxels and is used for training. The remaining 20% is used for testing. For convenience, the indices of the voxels in the training and testing datasets are collected as $\mathcal{S}_{\mathrm{train}}$ and $\mathcal{S}_{\mathrm{test}}$, respectively. Next, for an arbitrary testing sample $(X_i, Y_i)$ with $i \in \mathcal{S}_{\mathrm{test}}$, we have $E(Y_i \mid X_i) = \sum_{k=0}^{K-1} \pi_k(X_i)\mu_k(X_i)$. Therefore, we are inspired to predict the $Y_i$ value using $\hat{Y}_i = \sum_{k=0}^{K-1} \hat{\pi}_k(X_i)\hat{\mu}_k(X_i)$, where the estimates $\hat{\pi}_k(\cdot)$ and $\hat{\mu}_k(\cdot)$ are computed based on the training data with a given bandwidth constant. Let $\{c_{h,m} : 1 \le m \le M\}$ be a set of tentatively selected pilot-bandwidth constants, where $M$ is the total number of pilot-bandwidth constants. Therefore, an estimator for SPE can be constructed as $\widehat{\mathrm{SPE}}(c_{h,m}) = |\mathcal{S}_{\mathrm{test}}|^{-1} \sum_{i \in \mathcal{S}_{\mathrm{test}}} (Y_i - \hat{Y}_i)^2$. A CV-based estimator for the optimal bandwidth constant can then be defined as $\hat{c}_h^{\mathrm{CV}} = \arg\min_{c \in \{c_{h,m}\}} \widehat{\mathrm{SPE}}(c)$; that is, the bandwidth constant with the lowest SPE value is chosen. Consequently, a CV-based estimator for the optimal bandwidth is given by $\hat{h}_{\mathrm{CV}} = \hat{c}_h^{\mathrm{CV}} N^{-1/7}$.
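A minimal sketch of this CV procedure is given below (for illustration only; `fit_and_predict` is a hypothetical callback that fits the KEM model on the training voxels with bandwidth $h$ and returns $\sum_k \hat{\pi}_k(X_i)\hat{\mu}_k(X_i)$ for the test voxels, and the mapping $h = c_h N^{-1/7}$ follows the bandwidth form assumed in Condition (C2)):

```python
import numpy as np

def cv_bandwidth_constant(X, Y, pilot_constants, fit_and_predict, seed=0):
    """Pick the pilot-bandwidth constant with the smallest out-of-sample SPE."""
    rng = np.random.default_rng(seed)
    N = len(Y)
    idx = rng.permutation(N)
    train, test = idx[: int(0.8 * N)], idx[int(0.8 * N):]     # 80%/20% split
    spe = []
    for c in pilot_constants:
        h = c * N ** (-1.0 / 7.0)                             # bandwidth implied by c
        y_hat = fit_and_predict(X[train], Y[train], X[test], h)
        spe.append(np.mean((Y[test] - y_hat) ** 2))           # estimated SPE
    best = int(np.argmin(spe))
    return pilot_constants[best], np.asarray(spe)
```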
2.5 Bandwidth Selection through Regression
Even though the above CV idea is intuitive, its practical computation is expensive. This is because an accurate estimation of the optimal bandwidth constant typically requires evaluating a large number of pilot-bandwidth constants. To reduce the computation cost, we develop a novel regression method as follows. From Theorem 2.2, we know that $\mathrm{SPE}(h)$ is approximately a linear function in both $c_h^4 N^{-4/7}$ and $c_h^{-3} N^{-4/7}$, where $N$ is the given total sample size and $c_h$ is the tentatively selected pilot-bandwidth constant. The corresponding regression coefficients are the two critical constants from Theorem 2.2, denoted by $C_1$ and $C_2$ hereafter. This immediately suggests an interesting regression-based method for estimating $C_1$ and $C_2$ as follows. We define $\{c_{h,m} : 1 \le m \le M\}$ as a set of carefully selected pilot-bandwidth constants. Note that $M$ determines the total number of pilot-bandwidth constants to be tested. It must not be too large so that the computation cost can be reduced. For example, we fix a small number of pilot-bandwidth constants in our subsequent numerical studies. We define a pseudo-response vector $\widehat{\mathrm{SPE}} = \{\widehat{\mathrm{SPE}}(c_{h,1}), \dots, \widehat{\mathrm{SPE}}(c_{h,M})\}^\top$, where each element is computed as in Section 2.4. We further define a pseudo-design matrix $B \in \mathbb{R}^{M \times 3}$, whose $m$th row is $(1, c_{h,m}^4 N^{-4/7}, c_{h,m}^{-3} N^{-4/7})$. Then, we have an approximate regression relationship $\widehat{\mathrm{SPE}} \approx B\beta$ with $\beta = (\beta_0, C_1, C_2)^\top$. Subsequently, an ordinary least squares estimator is obtained as $\hat{\beta} = (B^\top B)^{-1} B^\top \widehat{\mathrm{SPE}}$. This leads to a regression-based estimator for the optimal bandwidth constant as $\hat{c}_h^{\mathrm{REG}} = \{3\hat{C}_2/(4\hat{C}_1)\}^{1/7}$, the minimizer of the fitted leading term $\hat{C}_1 h^4 + \hat{C}_2 (Nh^3)^{-1}$ with respect to $h = c_h N^{-1/7}$. For convenience, we abbreviate this method as the REG method.
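The REG method can then be sketched as follows (for illustration only; the two pseudo-covariates $c_h^4 N^{-4/7}$ and $c_h^{-3} N^{-4/7}$ and the closed-form minimizer reflect our reading of Theorem 2.2 as a squared-bias plus variance decomposition, and the fitted constants are assumed to be positive):

```python
import numpy as np

def reg_bandwidth_constant(pilot_constants, spe_values, N):
    """Regression-based estimate of the optimal bandwidth constant.

    Fits SPE(c) ~ b0 + C1 * c^4 * N^(-4/7) + C2 * c^(-3) * N^(-4/7) by OLS,
    then minimizes the fitted term C1*h^4 + C2/(N*h^3) over h = c*N^(-1/7).
    Assumes the fitted C1 and C2 are both positive.
    """
    c = np.asarray(pilot_constants, dtype=float)
    B = np.column_stack([np.ones_like(c),
                         c ** 4 * N ** (-4.0 / 7.0),      # squared-bias pseudo-covariate
                         c ** (-3) * N ** (-4.0 / 7.0)])  # variance pseudo-covariate
    beta, *_ = np.linalg.lstsq(B, np.asarray(spe_values, dtype=float), rcond=None)
    C1, C2 = beta[1], beta[2]
    return (3.0 * C2 / (4.0 * C1)) ** (1.0 / 7.0)
```

In this way, only a handful of pilot constants need to be evaluated, one KEM fit per constant, instead of the much larger grid typically required by CV.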
3 Numerical Studies
3.1 The Simulation Model
Extensive simulation studies are conducted to validate the finite sample performance of the proposed KEM method. The entire experiment is designed so that the simulated data mimic real chest CT data as closely as possible. Specifically, we first fix a total of $K = 3$ mixture components, which represent the background, bone tissue, and lung tissue, respectively. Next, we need to specify the class prior probability functions $\pi_k(x)$. To make the experiment as realistic as possible, we use the LIDC-IDRI dataset, which is probably the largest publicly available chest CT database (Setio et al., 2017). It consists of 888 human chest CT files from seven medical institutions worldwide. After downloading the dataset, we found that six files were damaged and could not be read into memory. Thus, we use the remaining 882 CT files for the subsequent study.
Consider an arbitrarily selected chest CT scan from the LIDC-IDRI dataset. According to the definition of the database, every slice is a $512 \times 512$ grayscale image for every CT scan file. However, the number of slices may vary across different CT scan files within a limited range. Next, we consider how to specify the values of $\pi_k(x)$ for every possible voxel position $x \in \mathcal{X}$, where $\mathcal{X}$ contains all the available voxel positions. Next, we define a class label tensor $\mathcal{Z}$ with entries $\mathcal{Z}(x)$ for $x \in \mathcal{X}$. Specifically, we define $\mathcal{Z}(x) = 2$ if its voxel position $x$ belongs to the lung-mask data provided by the LIDC-IDRI dataset. Otherwise, we define $\mathcal{Z}(x) = 1$ if the corresponding CT value is sufficiently large; in this case, the voxel position should be bone tissue (Molteni, 2013). Finally, we define $\mathcal{Z}(x) = 0$ if the previous two criteria are not met.
For each voxel position $x \in \mathcal{X}$, we define a local neighborhood around $x$ and compute, for each class $k$, the proportion of voxels within this neighborhood whose class labels equal $k$. These proportions are then renormalized so that $\sum_{k=0}^{K-1} \pi_k(x) = 1$ for every $x \in \mathcal{X}$. Next, class-specific baseline means and variances are specified for the bone and lung classes. The remaining class is set as the background case; to differentiate the background from the other classes, its baseline mean and variance are specified separately. The mean function $\mu_k(x)$ and variance function $\sigma_k^2(x)$ are then allowed to vary within a reasonable range around these baseline values. For intuitive understanding, $\pi_k(x)$, $\mu_k(x)$, and $\sigma_k^2(x)$ with $0 \le k \le K-1$ generated from an arbitrarily selected CT are graphically displayed in Figure 1(a)–(i). Throughout this article, the depth index for the displayed 2D slices is consistently set to 100. For any voxel position $x$, once $\mu_k(x)$ and $\sigma_k^2(x)$ are given, a normal random variable can be generated for each class. These normal random variables are then combined across the different components according to a multinomial distribution with probability $\pi_k(x)$ for class $k$ with $0 \le k \le K-1$. This leads to a final response $Y$ at voxel position $x$.
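A highly simplified sketch of this generating mechanism is given below (for illustration only, with a toy label tensor and hypothetical parameter values; in the actual study, the labels, priors, means, and variances are derived from the LIDC-IDRI lung masks as described above):

```python
import numpy as np
from scipy.ndimage import uniform_filter

rng = np.random.default_rng(2024)
K, shape = 3, (64, 64, 64)                          # three classes on a small toy grid

# Toy class label tensor (stand-in for the background / bone / lung labels).
labels = rng.integers(0, K, size=shape)

# Smooth the one-hot labels over a local neighborhood and renormalize, so that
# the class prior probability functions vary smoothly and sum to one.
pi = np.stack([uniform_filter((labels == k).astype(float), size=5) for k in range(K)])
pi /= pi.sum(axis=0, keepdims=True)

# Hypothetical class-specific mean and variance functions (constants here for brevity).
mu = np.array([0.1, 0.5, 0.8])[:, None, None, None] * np.ones(shape)
sigma2 = np.array([0.01, 0.02, 0.02])[:, None, None, None] * np.ones(shape)

# Draw a class label per voxel from the multinomial prior, then a normal response.
u = rng.random(shape)
z = np.minimum((u[None] > np.cumsum(pi, axis=0)).sum(axis=0), K - 1)
d, r, c = np.indices(shape)
Y = rng.normal(mu[z, d, r, c], np.sqrt(sigma2[z, d, r, c]))
```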
3.2 Implementation Details
Although the KEM algorithm developed in equations (5)–(8) is theoretically elegant, its computation is not immediately straightforward. If one implements the algorithm faithfully, as equations (5)–(8) suggest, the associated computational cost would be unbearable. Consider, for example, the computation of the denominator of equation (6), which is $\sum_{i=1}^{N} K_h(X_i - x)$. The resulting computational complexity is of order $O(N)$, where $N$ is the total number of voxel positions. Note that this procedure should be performed for each voxel position in $\mathcal{X}$. The overall computational cost then becomes of order $O(N^2)$. Consider, for example, a chest CT consisting of several hundred $512 \times 512$ slices; then $N$ is of the order $10^7$, and $N^2$ is of the order $10^{14}$ or more. Thus, the computational cost is prohibitive, and alleviating the computational burden becomes a critical problem.
To address this problem, we develop a novel solution. We specify $K(\cdot)$ as the Gaussian kernel. Accordingly, the weight assigned to its tail decays toward 0 at an extremely fast rate. Therefore, instead of directly using the full kernel weight $K_h(X_i - x)$, we consider its truncated version, which sets the weight to zero outside a cubic space $\mathcal{N}(x)$ locally around $x$ with $F_s$ voxels on each side, where $F_s$ represents the filter size. Instead of computing $\sum_{i=1}^{N} K_h(X_i - x)$, we can compute $\sum_{i \in \mathcal{N}(x)} K_h(X_i - x)$. Thus, the computational cost for one voxel position can be significantly reduced from $O(N)$ to $O(F_s^3)$, which is approximately the size of the set $\mathcal{N}(x)$. This is typically a much smaller number than $N$. Furthermore, this operation can be easily implemented on a GPU device using a standard 3D convolution operation (e.g., the Conv3D function in Keras of TensorFlow). This substantially accelerates the computational speed. To this end, the bandwidth must be appropriately specified according to the filter size of the convolution operations (i.e., $F_s$). Ideally, the filter should not be too large. Otherwise, the computation cost owing to the $O(F_s^3)$ operations could be extremely high. However, the filter size $F_s$ cannot be too small compared to the bandwidth $h$. Otherwise, the probability mass of the kernel that falls into $\mathcal{N}(x)$ could be too small.
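As an illustration of this idea, the following sketch (for illustration only, assuming TensorFlow is available; the filter size and bandwidth values are arbitrary, and distances are measured in voxel units) evaluates a kernel-weighted sum such as the denominator of (6) for all voxel positions at once through a single 3D convolution with a truncated Gaussian filter. The kernel's normalizing constant is omitted because it cancels in the ratios (6)–(8):

```python
import numpy as np
import tensorflow as tf

def truncated_gaussian_filter(F_s, h):
    """Unnormalized 3D Gaussian weights on an F_s x F_s x F_s grid centered at zero."""
    r = (F_s - 1) / 2
    g = np.arange(-r, r + 1)                                      # offsets in voxel units
    d2 = g[:, None, None] ** 2 + g[None, :, None] ** 2 + g[None, None, :] ** 2
    return np.exp(-0.5 * d2 / h ** 2).astype("float32")

def kernel_weighted_sum(volume, F_s, h):
    """For every voxel x, sum K_h(X_i - x) * volume_i over the local cube around x."""
    filt = truncated_gaussian_filter(F_s, h)[..., None, None]     # (F_s, F_s, F_s, 1, 1)
    vol = tf.constant(volume[None, ..., None], dtype=tf.float32)  # (1, D, H, W, 1)
    out = tf.nn.conv3d(vol, filt, strides=[1, 1, 1, 1, 1], padding="SAME")
    return out[0, ..., 0].numpy()

# Example: the denominator of (6) is this convolution applied to an all-ones volume;
# the numerators of (6)-(8) are the same convolution applied to the posterior volumes.
ones = np.ones((32, 64, 64), dtype=np.float32)
denom = kernel_weighted_sum(ones, F_s=5, h=1.5)
```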
For the REG method, we consider the filter sizes $F_s \in \{3, 5, 7, 9, 11\}$. Accordingly, we set the bandwidth in proportion to the filter size. Thus, only odd numbers are considered, so the filter used in the 3D convolution operations can be symmetric about its central voxel position. We wish the filter to be as large as possible. However, as previously discussed, an unnecessarily large filter size would result in prohibitive computational costs. Therefore, we set the largest filter size to 11. For the CV method, following the REG method, we generate the corresponding bandwidths for the same filter sizes. Subsequently, each bandwidth is multiplied by 0.3, 0.4, 0.5, 0.6, or 0.7. The bandwidth constant of the two methods is then obtained from the bandwidth and the sample size according to Condition (C2).
3.3 Bandwidth Selection and Estimation Accuracy
With the pre-specified filter sizes, both the CV and REG methods introduced in Sections 2.4 and 2.5 can be used to select the optimal bandwidth. Recall that 80% of the voxels are used for training and the remaining 20% are used for testing. The truncated Gaussian kernel and GPU-based convolutions are used to speed up the practical computation. Both the CV and REG methods are used to select the optimal bandwidth constant for each of the 100 randomly selected patients in the LIDC-IDRI dataset. The optimal bandwidth constants selected by the two methods are then boxplotted in Figure 2(a). We find that the REG method tends to select a larger bandwidth constant than the CV method, leading to relatively larger filter sizes for convolution. The time cost of the two methods for selecting the optimal bandwidth for each patient is boxplotted in Figure 2(b). Evidently, the REG method has a much lower computational cost than the CV method.
To demonstrate the superiority of the proposed KEM method, we also compare it with two traditional methods: $k$-means (MacQueen et al., 1967) and GMM (McLachlan and Peel, 2000). We then compare the aforementioned methods in terms of testing accuracy and estimation accuracy. Testing accuracy reflects the ability to identify the class labels on the testing set, while estimation accuracy reflects the ability to recover the true parameter values. When comparing the testing accuracies, we train the model on the training set and evaluate the testing accuracy on the testing set. See Figure 2(c) for the results of the testing accuracy. The medians of the testing accuracy results for the KEM method with the bandwidth selected by the CV method (KEM-CV), the KEM method with the bandwidth selected by the REG method (KEM-REG), the $k$-means method, and the GMM method are, respectively, 0.9883, 0.9950, 0.9721, and 0.9719. We can see that both traditional methods suffer from worse testing accuracy compared with the proposed KEM methods. When comparing the estimation accuracies, all voxel positions of the CT data are used. We can compute $\hat{\pi}_k(x)$, $\hat{\mu}_k(x)$, and $\hat{\sigma}_k^2(x)$ using the KEM-CV, KEM-REG, $k$-means, and GMM methods for every $x \in \mathcal{X}$ and every $0 \le k \le K-1$. Because this is a simulation study with the true parameter values specified in Section 3.1, which are known to us, we can evaluate the accuracy of the resulting estimates using the following RMSE criteria:

$$\mathrm{RMSE}(\hat{g}) = \Big\{ \frac{1}{K\,|\mathcal{X}|} \sum_{k=0}^{K-1} \sum_{x \in \mathcal{X}} \big\{ \hat{g}_k(x) - g_k(x) \big\}^2 \Big\}^{1/2}, \quad g \in \{\pi, \mu, \sigma^2\}.$$
The RMSE results of the aforementioned four methods are boxplotted in Figures 2(d)–(f). We find that the KEM method outperforms the $k$-means and GMM methods substantially. This is not surprising, since the two traditional methods cannot estimate the parameters of interest nonparametrically. Comparatively speaking, the REG method performs slightly better than the CV method. Thus, the median value of the bandwidth constants chosen by the REG method is fixed for the subsequent simulation studies.
With the fixed bandwidth constant, we conduct more comprehensive simulation studies to evaluate the finite sample performance of the estimators, namely $\hat{\pi}_k(x)$, $\hat{\mu}_k(x)$, and $\hat{\sigma}_k^2(x)$ for $0 \le k \le K-1$. For every chest CT scan in the LIDC-IDRI dataset, only a fraction $\rho$ of the voxel positions is used for parameter estimation. Different sample sizes can thus be examined by varying the sampling ratio $\rho \in \{0.01, 0.10, 0.50, 1.00\}$. For an intuitive understanding, we arbitrarily select a replicated experiment. In this experiment, visual depictions of 2D slices for $\hat{\pi}_k(x)$, $\hat{\mu}_k(x)$, and $\hat{\sigma}_k^2(x)$ are presented in Figure 1(j)–(r). We find that these estimated functions resemble the true functions very well. For a fixed $\rho$ value, the experiment is randomly replicated 1000 times on a randomly selected chest CT scan in the LIDC-IDRI dataset. The sample mean RMSE values are summarized in Table 1. We find that as the value of $\rho$ increases, the reported mean RMSE values steadily decrease toward 0 for every estimate. This suggests that they should be consistent estimates of the target parameters. Remarkably, the convergence rate shown in Table 1 seems to be slow. This is expected because we are dealing with a nonparametric kernel-smoothing procedure in a 3D space, for which the optimal convergence rate is indeed slow.
3.4 Case Illustration
Recall that the ultimate goal of our method is to reconstruct fine-grained 3D images of blood vessels. We aim to accomplish this critical task by using chest CT data in a fully automatic manner. To this end, we randomly select three CT files from the LIDC-IDRI dataset. The PatientIDs of the CTs are, respectively, LIDC-IDRI-0405, LIDC-IDRI-0811, and LIDC-IDRI-0875. To avoid distractions from undesirable human tissue, we follow Liao et al. (2019) and generate a lung mask for each CT slice. Then, we multiply the lung mask by each slice. This leads to more concentrated chest CT slices, as shown in Figure 3(a)–(c). Consider Figure 3(c) as an example: The background is pure black. There is a lung-shaped area floating in the middle of the image, the color of which is slightly lighter than the background. Blood vessels, which appear as light spots, are dispersed inside the lung-shaped area. Because image signals from irrelevant organ tissue are excluded, detecting blood vessels in the focused chest CT image is much easier than in the raw chest CT image.
Next, we apply the developed KEM method to the concentrated chest CT data. In this case, the surgeon's primary objective is to accurately identify the blood vessels. Therefore, those voxels associated with blood vessels form one latent class. Second, since chest CT is mainly used for diagnosing early-stage lung cancer, the lung tissue is also of great interest. Thus, the voxels associated with lung tissue form another important category. Lastly, in order to extract the lung tissue from the entire CT image, one needs to separate the background from the lung tissue. Therefore, the voxels associated with the background constitute the third latent class. These three classes seem to cover almost all the voxel positions in this real example. Therefore, we set $K = 3$ in this experiment.
The CT values are then rescaled to the range between 0 and 1. We initialize the latent class labels for all voxel positions by using a standard $k$-means algorithm, and the corresponding initial parameter estimates are computed accordingly for each of the three classes. The filter size is set to 3 for fast computational speed. Subsequently, the KEM algorithm is used to estimate the parameters of interest. In our case, the estimated posterior probabilities with respect to the blood vessels are of primary interest. See Figure 3(d)–(f) for a visual representation of the 2D slices. The original lung-shaped area in Figure 3(a)–(c) no longer obscures the blood vessels in Figure 3(d)–(f). Moreover, the blood vessels are successfully highlighted, whereas all other irrelevant human tissue is discarded. Subsequently, the estimated posterior probabilities with respect to the blood vessels are rescaled back to the original data range and saved as a DICOM file for the subsequent 3D reconstruction. We then utilize the RadiAnt software to load the DICOM file and reconstruct the blood vessels; see Figure 3(g)–(i) for example. As shown, the blood vessels in the lung are preserved completely, while all the other human tissue is beautifully erased.
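For the saving step, a minimal sketch of how rescaled posterior probabilities might be written back into a DICOM slice with the pydicom package is given below (for illustration only, not the authors' pipeline; the 16-bit rescaling convention and an uncompressed transfer syntax are assumptions):

```python
import numpy as np
import pydicom

def write_posterior_slice(src_path, dst_path, posterior_slice):
    """Replace a CT slice's pixel data with posterior probabilities rescaled to 16 bits."""
    ds = pydicom.dcmread(src_path)                    # original CT slice (uncompressed)
    scaled = np.clip(posterior_slice, 0.0, 1.0)
    pixels = (scaled * np.iinfo(np.uint16).max).astype(np.uint16)
    ds.BitsAllocated, ds.BitsStored, ds.HighBit = 16, 16, 15
    ds.PixelRepresentation = 0                        # unsigned integer pixels
    ds.PixelData = pixels.tobytes()
    ds.save_as(dst_path)                              # the file can then be loaded in a DICOM viewer
```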
Additionally, we ask the third author of this work (an experienced surgeon working in one of the major hospitals in Beijing, P.R. China) to manually reconstruct the blood vessels. To ensure a fair basis for comparison, the surgeon initiated the reconstruction using the same concentrated CT data as those employed by the KEM method. The reconstruction results are visualized in Figure 3(j)–(l). Moreover, we use both the $k$-means and GMM methods for blood vessel reconstruction. The respective outcomes are visually presented in Figure 3(m)–(r). First, the GMM method seems over-detailed. Its 3D reconstruction results are distinct from those of the surgeon's manual operations. A simulation with ground truth blood vessels generated from the result of the KEM method is given in Appendix D. Although the ground truth blood vessels are not over-detailed, the result of the GMM method is still over-detailed. This confirms that the excess details of the GMM method are largely due to estimation errors. Second, the 3D reconstruction results of the KEM method, the $k$-means method, and the surgeon's operations are similar to each other. However, the connectivity quality of the blood vessels achieved through the KEM method appears superior to that of both the $k$-means method and the surgeon's operations. This is expected because manual operations and the $k$-means method cannot adapt to subtle spatial variations within the 3D CT space. Third, the KEM result is free of both the disconnection problem and the over-detailing problem; see Figure 3(g)–(i) for example. This is because KEM is a more accurate estimation method owing to its adaptability to spatially varying parameters in the CT space. This makes KEM a more reliable method.
Furthermore, we conduct a comparative analysis of prediction performance in terms of the SPE for the 882 CT files in the LIDC-IDRI dataset, using the KEM, $k$-means, and GMM methods on the testing set. The SPE metric is computed as $|\mathcal{S}_{\mathrm{test}}|^{-1} \sum_{i \in \mathcal{S}_{\mathrm{test}}} (Y_i - \hat{Y}_i)^2$, where $\mathcal{S}_{\mathrm{test}}$ denotes the testing set, $Y_i$ denotes the CT value, and $\hat{Y}_i$ denotes the CT value predicted by one particular method (i.e., the KEM, $k$-means, or GMM method). For the KEM method, the predicted CT value is given by $\hat{Y}_i = \sum_{k=0}^{K-1} \hat{\pi}_k(X_i)\hat{\mu}_k(X_i)$, for which both $\hat{\pi}_k(\cdot)$ and $\hat{\mu}_k(\cdot)$ are bandwidth-dependent estimates. In contrast, for the other two competing methods (i.e., the $k$-means and GMM methods), the predicted CT value is computed by $\hat{Y}_i = \sum_{k=0}^{K-1} \hat{\pi}_k \hat{\mu}_k$, where $\hat{\pi}_k$ and $\hat{\mu}_k$ are the parameter estimates computed by the $k$-means or GMM method, which are bandwidth-independent. Here, we use the SPE metric rather than the RMSE metric because one needs to know the true parameter values in advance to compute the RMSE. In the simulation, we can compute the RMSE for each method because the true parameters are known to us. Unfortunately, in the real data analysis, the ground truth values of the parameters are unknown. Therefore, we are unable to calculate the RMSE for the real data analysis. In contrast, the SPE remains computable without the true parameters. Figure 4 presents boxplots of the logarithms of the SPE results. It is evident that the KEM method outperforms both the $k$-means and GMM methods. In Appendix D, we provide example slices of the reconstructed responses of the three methods. We find that the reconstructed responses of the KEM method outperform the other alternatives.
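Concretely, the SPE comparison can be computed as in the following sketch (for illustration only; `pi_hat` and `mu_hat` stand for the estimated class prior probabilities and means evaluated at the test voxels, whether locally varying as in KEM or globally constant as in $k$-means and GMM):

```python
import numpy as np

def spe(y_test, pi_hat, mu_hat):
    """Square prediction error on the testing set.

    y_test : (n,) observed CT values on the testing set.
    pi_hat : (n, K) estimated class prior probabilities at the test voxels
             (identical rows for the bandwidth-independent k-means / GMM fits).
    mu_hat : (n, K) estimated class means at the test voxels.
    """
    y_pred = np.sum(pi_hat * mu_hat, axis=1)          # predicted CT value per voxel
    return np.mean((y_test - y_pred) ** 2)
```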
4 Concluding Remarks
In summary, we developed a semiparametric Gaussian mixture model (GMM) tailored for an innovative application of 3D blood vessel reconstruction. This work contributes significantly to both the literature on statistical theory and medical imaging applications. From a theoretical standpoint, we introduced a semiparametric extension to the classical GMM. We also developed a kernel-based expectation-maximization (KEM) algorithm for model estimation. To underpin this methodology, we established a rigorous asymptotic theory. Additionally, we devised a regression-based approach for optimal bandwidth selection, which surpasses the traditional cross-validation method in both computational and statistical efficiency. On the application front, we addressed a critical challenge in medical imaging—3D blood vessel reconstruction—by defining it within a statistical framework of semiparametric Gaussian mixtures. In conclusion, we would like to discuss a topic for future research. The proposed method allows only a fixed number of mixture components. However, the human body is a highly sophisticated system with many different organs and tissues. Therefore, exploring the extension of the current method to accommodate a diverging number of mixture components is a topic worthy of future investigation.
5 Software
Software in the form of R and Python code, together with sample input data and complete documentation, is available online at https://github.com/Helenology/Paper_KEM.
6 Supplementary Material
Supplementary material is available online at http://biostatistics.oxfordjournals.org
Acknowledgments
Jing Zhou’s research is supported in part by the National Natural Science Foundation of China (No. 72171226) and the National Statistical Science Research Project (No. 2023LD008). Ying Ji’s research is supported in part by the National Natural Science Foundation of China (No. 62306189). Hansheng Wang’s research is partially supported by National Natural Science Foundation of China (No. 12271012). Conflict of Interest: None declared.
References
- Dempster et al. [1977] A. P. Dempster, N. M. Laird, and D. B. Rubin. Maximum likelihood from incomplete data via the EM algorithm. Journal of the Royal Statistical Society: Series B (Methodological), 39(1):1–22, 1977.
- Fan and Gijbels [1996] J. Fan and I. Gijbels. Local Polynomial Modelling and Its Applications. Springer New York, NY, 1996.
- Fan et al. [1998] J. Fan, M. Farmen, and I. Gijbels. Local maximum likelihood estimation and inference. Journal of the Royal Statistical Society Series B: Statistical Methodology, 60(3):591–608, 1998.
- Hagiwara et al. [2014] M. Hagiwara, Y. Shimada, Y. Kato, K. Nawa, Y. Makino, H. Furumoto, S. Akata, M. Kakihana, N. Kajiwara, T. Ohira, H. Saji, and N. Ikeda. High-quality 3-dimensional image simulation for pulmonary lobectomy and segmentectomy: Results of preoperative assessment of pulmonary vessels and short-term surgical outcomes in consecutive patients undergoing video-assisted thoracic surgery. European Journal of Cardio-Thoracic Surgery, 46(6):e120–e126, 2014.
- Ji et al. [2021] Y. Ji, T. Zhang, L. Yang, X. Wang, L. Qi, F. Tan, J. H. T. Daemen, E. R. de Loos, B. Qiu, and S. Gao. The effectiveness of three-dimensional reconstruction in the localization of multiple nodules in lung specimens: a prospective cohort study. Translational Lung Cancer Research, 10(3), 2021.
- Lang [2012] S. Lang. Real and functional analysis, volume 142. Springer Science & Business Media, 2012.
- Li and Racine [2007] Q. Li and J. S. Racine. Nonparametric Econometrics: Theory and Practice. Princeton University Press, 2007.
- Liao et al. [2019] F. Liao, M. Liang, Z. Li, X. Hu, and S. Song. Evaluate the malignancy of pulmonary nodules using the 3-D deep leaky noisy-or network. IEEE Transactions on Neural Networks and Learning Systems, 30(11):3484–3495, 2019.
- MacQueen et al. [1967] J. MacQueen et al. Some methods for classification and analysis of multivariate observations. In Proceedings of the fifth Berkeley symposium on mathematical statistics and probability, volume 1, pages 281–297. Oakland, CA, USA, 1967.
- McLachlan and Peel [2000] G. J. McLachlan and D. Peel. Finite Mixture Models. Wiley, 2000.
- Molteni [2013] R. Molteni. Prospects and challenges of rendering tissue density in Hounsfield units for cone beam computed tomography. Oral Surgery, Oral Medicine, Oral Pathology and Oral Radiology, 116(1):105–119, 2013.
- Nadaraya [1965] E. Nadaraya. On non-parametric estimates of density functions and regression curves. Theory of Probability and Its Applications, 10:186–190, 1965.
- Setio et al. [2017] A. A. A. Setio, A. Traverso, T. de Bel, M. S. Berens, C. van den Bogaard, P. Cerello, et al. Validation, comparison, and combination of algorithms for automatic detection of pulmonary nodules in computed tomography images: The LUNA16 challenge. Medical Image Analysis, 42:1–13, 2017. ISSN 1361-8415. https://doi.org/10.1016/j.media.2017.06.015.
- Silverman [2018] B. W. Silverman. Density Estimation for Statistics and Data Analysis. Routledge, 2018.
- Ullah and Pagan [1999] A. Ullah and A. Pagan. Nonparametric Econometrics. Cambridge University Press, 1999.
- Watson [1964] G. S. Watson. Smooth regression analysis. Sankhyā: The Indian Journal of Statistics, Series A, pages 359–372, 1964.
Table 1: Mean RMSE values of the estimated class prior probability, mean, and variance functions under different sampling ratios $\rho$, averaged over 1000 replications.

| $\rho$ | RMSE($\hat{\pi}$) | RMSE($\hat{\mu}$) | RMSE($\hat{\sigma}^2$) |
|---|---|---|---|
| 0.01 | 0.07398 | 0.04150 | 0.02299 |
| 0.10 | 0.04552 | 0.02267 | 0.01368 |
| 0.50 | 0.03031 | 0.01664 | 0.01138 |
| 1.00 | 0.02440 | 0.01457 | 0.01002 |