One-Shot Segmentation of Novel White Matter Tracts via Extensive Data Augmentation

Department of Radiology, Beijing Tiantan Hospital, Capital Medical University, Beijing, China
email: [email protected]
Abstract
Deep learning-based methods have achieved state-of-the-art performance for automated white matter (WM) tract segmentation. In these methods, the segmentation model needs to be trained with a large number of manually annotated scans, which can be accumulated over time. When novel WM tracts—i.e., tracts not included in the existing annotated WM tracts—are to be segmented, additional annotations of these novel WM tracts need to be collected. Since tract annotation is time-consuming and costly, it is desirable to make only a few annotations of novel WM tracts for training the segmentation model, and previous work has addressed this problem by transferring the knowledge learned for segmenting existing WM tracts to the segmentation of novel WM tracts. However, accurate segmentation of novel WM tracts can still be challenging in the one-shot setting, where only one scan is annotated for the novel WM tracts. In this work, we explore the problem of one-shot segmentation of novel WM tracts. Since in the one-shot setting the annotated training data is extremely scarce, based on the existing knowledge transfer framework, we propose to further perform extensive data augmentation for the single annotated scan, where synthetic annotated training data is produced. We have designed several different strategies that mask out regions in the single annotated scan for data augmentation. To avoid learning from potentially conflicting information in the synthetic training data produced by different data augmentation strategies, we choose to perform each strategy separately for network training and obtain multiple segmentation models. Then, the segmentation results given by these models are ensembled for the final segmentation of novel WM tracts. Our method was evaluated on public and in-house datasets. The experimental results show that our method improves the accuracy of one-shot segmentation of novel WM tracts.
Keywords:
White Matter Tract Segmentation · One-Shot Learning · Data Augmentation

1 Introduction
The segmentation of white matter (WM) tracts based on diffusion magnetic resonance imaging (dMRI) provides an important tool for the understanding of brain wiring [12, 20]. It allows identification of different WM pathways and has benefited various brain studies [13, 22]. Since manual delineation of WM tracts can be time-consuming and subjective, automated approaches to WM tract segmentation have been developed [18, 3, 19], and methods based on convolutional neural networks (CNNs) have achieved state-of-the-art performance [23, 17, 9]. The success of CNN-based WM tract segmentation relies on a large number of annotated scans that are accumulated over time for network training. However, in a new study, novel WM tracts that are not included in the existing annotated WM tracts may be of interest [14, 11, 1] and need to be segmented. Repeating the annotation for the novel WM tracts on a large number of scans can be very laborious and prohibitive, and accurate segmentation of novel WM tracts becomes challenging when only a few annotations are made for them.
Previous work has addressed this few-shot segmentation problem with a transfer learning strategy, where the knowledge learned for segmenting existing WM tracts with abundant annotated data is transferred to the segmentation of novel WM tracts [10]. In [10], a CNN-based segmentation model pretrained for segmenting existing WM tracts is used to initialize the target network for segmenting novel WM tracts, so that even with only a few annotations of novel WM tracts the network can learn adequately for the segmentation during fine-tuning. In addition, instead of using classic fine-tuning that discards the pretrained task-specific weights, an improved fine-tuning strategy is developed in [10] for more effective knowledge transfer, where all weights in the pretrained model can be exploited for initializing the target segmentation model. Despite the promising results achieved in [10] for few-shot segmentation of novel WM tracts, when the number of annotated scans for novel WM tracts decreases to one, the segmentation is still challenging. Since fewer annotations are preferred to reduce the annotation time and cost, the development of accurate approaches to one-shot segmentation of novel WM tracts needs further investigation.
In this work, we seek to improve one-shot segmentation of novel WM tracts. We focus on volumetric WM tract segmentation [17, 9], where voxels are directly labeled without necessarily performing tractography [2]. Since in the one-shot setting annotated training data is extremely scarce, based on the pretraining and fine-tuning framework developed in [10], we propose to address the one-shot segmentation problem with extensive data augmentation. Existing data augmentation strategies can be categorized into those based on basic image transformation [6, 17], generative models [5], image mixing [21, 24], and image masking [4]. Basic image transformation is already applied by default in CNN-based WM tract segmentation [17, 9], yet it is insufficient for one-shot segmentation due to the limited diversity of the augmented data. The training of generative models usually requires a large amount of annotated data, or at least a large amount of unannotated data [5], which is not guaranteed in the one-shot setting. Image mixing requires at least two annotated images [24], which is also infeasible in the one-shot setting. Therefore, we develop several strategies based on image masking for data augmentation, where the single annotated image is manipulated by masking out regions in different ways to synthesize additional training data.
The masking is performed randomly either with uniform distributions or according to the spatial location of novel WM tracts, and the annotation of the synthetic image can also be determined in different ways. The augmented data is used to fine-tune the model for segmenting novel WM tracts. To avoid learning from potentially conflicting information in the synthetic data produced by different strategies, we choose to perform each data augmentation strategy separately to train multiple segmentation models, and the outputs of these models are ensembled for the final segmentation. We evaluated the proposed method on two brain dMRI datasets. The results show that our method improves the accuracy of one-shot segmentation of novel WM tracts. The code of our method is available at https://github.com/liuwan0208/One-Shot-Extensive-Data-Augmentation.
2 Methods
2.1 Background: Knowledge Transfer for Segmenting Novel WM Tracts
Suppose we are given a CNN-based model that segments existing WM tracts, for which a large number of annotations have been accumulated for training. We are interested in training a CNN-based model for segmenting novel WM tracts, for which only one scan is annotated due to annotation cost.
Existing work [10] has attempted to address this problem with a transfer learning strategy based on the pretraining and fine-tuning framework, where the pretrained model for existing WM tracts and the target model for novel WM tracts share the same network structure for feature extraction and differ only in their last task-specific layers. In classic fine-tuning, the learned feature extraction layers of the pretrained model are used to initialize the feature extraction layers of the target model, and the task-specific layer of the target model is randomly initialized. Then, all weights of the target model are fine-tuned with the single scan annotated for novel WM tracts. However, the classic fine-tuning strategy discards the information in the task-specific layer of the pretrained model. As different WM tracts cross or overlap, existing and novel WM tracts can be correlated, and the task-specific layer for segmenting existing WM tracts may also bear relevant information for segmenting novel WM tracts. Therefore, to exploit all the knowledge learned by the pretrained model, an improved fine-tuning strategy is developed in [10], which, after derivation, can be conveniently achieved with a warmup stage. Specifically, the feature extraction layers of the target model are first initialized with those of the pretrained model. Then, in the warmup stage, the feature extraction layers of the target model are fixed and only its last task-specific layer (randomly initialized) is learned with the single annotated image. Finally, all weights of the target model are jointly fine-tuned with the single annotated image.
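The two-stage procedure above can be sketched in code. This is a minimal illustration only: `SegNet` is a hypothetical toy network standing in for the TractSeg backbone, and the channel and tract counts are placeholders, not the values used in the paper.

```python
import torch
import torch.nn as nn

# Hypothetical toy model: shared feature-extraction layers plus a
# task-specific last layer, mirroring the split described in the text.
class SegNet(nn.Module):
    def __init__(self, n_tracts):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(9, 16, 3, padding=1), nn.ReLU()
        )
        self.head = nn.Conv2d(16, n_tracts, 1)  # task-specific last layer

    def forward(self, x):
        return torch.sigmoid(self.head(self.features(x)))

pretrained = SegNet(n_tracts=60)  # stands in for the model of existing tracts
target = SegNet(n_tracts=4)       # model for the novel tracts

# Initialize the feature-extraction layers from the pretrained model;
# the new task-specific layer stays randomly initialized.
target.features.load_state_dict(pretrained.features.state_dict())

# Warmup stage: freeze the features and learn only the task-specific
# layer with the single annotated image.
for p in target.features.parameters():
    p.requires_grad = False
warmup_optimizer = torch.optim.Adamax(target.head.parameters(), lr=1e-3)
# ... warmup training loop would go here ...

# Final stage: unfreeze everything and jointly fine-tune all weights.
for p in target.features.parameters():
    p.requires_grad = True
finetune_optimizer = torch.optim.Adamax(target.parameters(), lr=1e-3)
# ... fine-tuning loop would go here ...
```

The key design point is that the randomly initialized task-specific layer is adapted first while the transferred features are held fixed, so the features are not disturbed by large early gradients from the new layer.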
2.2 Extensive Data Augmentation for One-Shot Segmentation of Novel WM Tracts
Although the transfer learning approach in [10] has improved the few-shot segmentation of novel WM tracts, when the training data for novel WM tracts is extremely scarce with only one annotated image, the segmentation is still challenging. Therefore, we continue to explore the problem of one-shot segmentation of novel WM tracts. Based on the pretraining and fine-tuning framework developed in [10], we propose to more effectively exploit the information in the single annotated image with extensive data augmentation for network training. Suppose the annotated image is $X$ and its annotation is $Y$ (0 for background and 1 for foreground); then we obtain a set of synthetic annotated training images $\tilde{X}$ and the corresponding synthetic annotations $\tilde{Y}$ by transforming $X$ and $Y$. We develop several data augmentation strategies for this purpose, which are described below.
2.2.1 Random Cutout
First, motivated by the Cutout data augmentation method [4] that has been successfully applied to image classification problems, we propose to obtain the synthetic image $\tilde{X}$ by transforming $X$ with region masking:
$\tilde{X} = X \odot (1 - M),$ (1)
where $\odot$ represents voxelwise multiplication and $M$ is a binary mask representing the region that is masked out. $M$ is designed as a 3D box randomly selected with uniform distributions. Mathematically, suppose the ranges of the box in the $x$-, $y$-, and $z$-direction are $[x_1, x_2]$, $[y_1, y_2]$, and $[z_1, z_2]$, respectively; then we follow [21] and select the box as
$c_x \sim \mathrm{U}(0, W), \quad c_y \sim \mathrm{U}(0, H), \quad c_z \sim \mathrm{U}(0, D),$ (2)
$x_{1,2} = c_x \mp \tfrac{1}{2} W \sqrt[3]{1 - \lambda}, \quad y_{1,2} = c_y \mp \tfrac{1}{2} H \sqrt[3]{1 - \lambda}, \quad z_{1,2} = c_z \mp \tfrac{1}{2} D \sqrt[3]{1 - \lambda},$ (3)
where $\mathrm{U}(a, b)$ represents the uniform distribution on $[a, b]$, $c_x$, $c_y$, and $c_z$ are the center coordinates of the masked box, $W$, $H$, and $D$ are the image dimensions in the $x$-, $y$-, and $z$-direction, respectively, and $\lambda$ is sampled from the beta distribution $\mathrm{Beta}(\alpha, \alpha)$ to control the size of the masked region.
The voxelwise annotation $\tilde{Y}$ for $\tilde{X}$ also needs to be determined. Intuitively, we can obtain $\tilde{Y}$ with the same masking operation applied to $Y$:
$\tilde{Y} = Y \odot (1 - M).$ (4)
The strategy that obtains synthetic training data using Eqs. (1) and (4) with the sampling in Eqs. (2) and (3) is referred to as Random Cutout One (RC1). Besides RC1, it is also possible to keep the original annotation $Y$ for the masked image $\tilde{X}$, so that the network learns to restore the segmentation result in the masked region. In this case, the synthetic annotation is simply determined as
$\tilde{Y} = Y.$ (5)
The use of Eqs. (1) and (5) for obtaining synthetic training data with the sampling in Eqs. (2) and (3) is referred to as Random Cutout Two (RC2).
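To make RC1 and RC2 concrete, the following NumPy sketch implements Eqs. (1)–(5) under a few assumptions not fixed by the text: the annotation `Y` is stored as a binary volume of the same shape as `X`, the box is centered at a uniformly sampled voxel and clipped to the image, and the function name and the beta parameter `alpha` are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)

def random_cutout(X, Y, alpha=1.0, keep_annotation=False):
    """Random Cutout for one annotated scan.

    keep_annotation=False -> RC1: the annotation is masked like the image (Eq. 4).
    keep_annotation=True  -> RC2: the original annotation is kept (Eq. 5).
    lam ~ Beta(alpha, alpha) controls the size of the masked box, as in CutMix.
    """
    W, H, D = X.shape
    lam = rng.beta(alpha, alpha)
    # Edge lengths proportional to (1 - lam)^(1/3), so the box covers
    # roughly a fraction (1 - lam) of the volume.
    w, h, d = (int(s * (1 - lam) ** (1 / 3)) for s in (W, H, D))
    cx, cy, cz = rng.integers(0, W), rng.integers(0, H), rng.integers(0, D)
    x1, x2 = max(cx - w // 2, 0), min(cx + w // 2, W)
    y1, y2 = max(cy - h // 2, 0), min(cy + h // 2, H)
    z1, z2 = max(cz - d // 2, 0), min(cz + d // 2, D)
    M = np.zeros_like(X, dtype=np.uint8)
    M[x1:x2, y1:y2, z1:z2] = 1                      # binary box mask
    X_syn = X * (1 - M)                             # Eq. (1)
    Y_syn = Y if keep_annotation else Y * (1 - M)   # Eq. (5) or Eq. (4)
    return X_syn, Y_syn
```

Calling the function repeatedly with different random draws yields the set of synthetic scans used for each strategy.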
2.2.2 Tract Cutout
Since we perform data augmentation for segmenting novel WM tracts, in addition to RC1 and RC2, it is possible to obtain the mask $M$ with a focus on the novel WM tracts. To this end, we design the computation of $M$ as
$M = \left\lceil \frac{1}{K} \sum_{k=1}^{K} b_k Y_k \right\rceil,$ (6)
where $Y_k$ denotes the annotation of the $k$-th novel WM tract in $Y$ ($K$ novel WM tracts in total), $b_k$ is sampled from the Bernoulli distribution to determine whether $Y_k$ contributes to the computation of $M$, and $\lceil \cdot \rceil$ represents the ceiling operation. In this way, $M$ is the union of the regions of a randomly selected subset of the novel WM tracts, and thus the masked region depends on the novel WM tracts. Like the Random Cutout strategies, the synthetic annotation $\tilde{Y}$ can be determined with either Eq. (4) or Eq. (5), and the two resulting strategies are referred to as Tract Cutout One (TC1) and Tract Cutout Two (TC2), respectively.
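A minimal NumPy sketch of the Tract Cutout mask of Eq. (6) follows. The Bernoulli probability `p=0.5` is an assumed default, and the inclusion vector `b` is returned alongside the mask purely for illustration; neither detail is fixed by the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def tract_cutout_mask(tract_annotations, p=0.5):
    """Build the Tract Cutout mask of Eq. (6).

    tract_annotations: binary array of shape (K, W, H, D), one novel WM
    tract per channel. Each tract is included with probability p, and the
    ceiling of the scaled sum yields the union of the selected tracts.
    """
    K = tract_annotations.shape[0]
    b = rng.binomial(1, p, size=K)                       # b_k ~ Bernoulli(p)
    summed = np.tensordot(b, tract_annotations, axes=1)  # sum_k b_k * Y_k
    M = np.ceil(summed / K).astype(np.uint8)             # Eq. (6)
    return M, b
```

Because the summed contribution at any voxel is at most $K$, the ceiling of the scaled sum is exactly 1 wherever a selected tract is present and 0 elsewhere, i.e., the union of the selected tract regions.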
2.2.3 Network Training with Augmented Data
By repeating the region masking in each data augmentation strategy, a set of synthetic annotated images can be produced. Since the synthetic images can appear unrealistic, they are used only in the warmup stage of the improved fine-tuning framework in [10], where the last layer of the target model is learned. In the final fine-tuning step that updates all network weights, only the real annotated training image is used. In addition, to prevent the network from learning from potentially conflicting information in the synthetic data produced by different strategies, we perform each data augmentation strategy separately and obtain four different networks for segmenting novel WM tracts. At test time, the predictions of the four networks are ensembled with majority voting (the tract label is set to one when the votes are tied) to obtain the final segmentation.
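The tie-favoring majority vote can be sketched as follows; the function name is illustrative, and with four models a two-versus-two tie is labeled foreground, as described above.

```python
import numpy as np

def ensemble_majority_vote(predictions):
    """Majority-vote ensemble of binary segmentations; ties are labeled 1.

    predictions: array of shape (n_models, ...) with values in {0, 1}.
    A voxel is foreground when at least half of the models vote for it,
    so an even split counts as foreground.
    """
    predictions = np.asarray(predictions)
    n = predictions.shape[0]
    votes = predictions.sum(axis=0)
    return (votes >= n / 2).astype(np.uint8)
```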
2.3 Implementation Details
We use the state-of-the-art TractSeg architecture [17] for volumetric WM tract segmentation as our backbone segmentation network, which takes fiber orientation maps as input. Following [17], the fiber orientation maps are computed with constrained spherical deconvolution (CSD) [15] for single-shell dMRI data or multi-shell multi-tissue CSD (MSMT-CSD) [7] for multi-shell dMRI data, and three fiber orientations are allowed in the network input [17]. We also follow [17] in performing 2D WM tract segmentation for each image view separately, and the results are fused to obtain the final 3D WM tract segmentation.
The proposed data augmentation is performed offline. Since given $K$ novel WM tracts TC1 or TC2 can produce at most $2^K - 1$ different images, we set the number of synthetic scans produced by each data augmentation strategy to $2^K - 1$. Note that traditional data augmentation, such as elastic deformation, scaling, intensity perturbation, etc., is applied online in TractSeg to training images. Thus, these operations are also applied to the synthetic training data online. The training configurations are set according to TractSeg, where Adamax [8] is used to minimize the binary cross entropy loss with a batch size of 56, an initial learning rate of 0.001, and 300 epochs [17]. The model corresponding to the epoch with the best segmentation accuracy on a validation set is selected.
3 Experiments
3.1 Data Description and Experimental Settings
For evaluation, experiments were performed on the publicly available Human Connectome Project (HCP) dataset [16] and an in-house dataset. The dMRI scans in the HCP dataset were acquired with 270 diffusion gradients (three $b$-values) and a voxel size of 1.25 mm isotropic. In [17] 72 major WM tracts were annotated for the HCP dataset, and the annotations are also publicly available. For the list of the 72 WM tracts, we refer readers to [17]. The dMRI scans in the in-house dataset were acquired with 270 diffusion gradients (three $b$-values) and a voxel size of 1.7 mm isotropic. In this dataset, only ten of the 72 major WM tracts were annotated due to the annotation cost.
Following [10], we selected the same 60 WM tracts as existing WM tracts, and a segmentation model was pretrained for these tracts with the HCP dMRI scans, where 48 and 15 annotated scans were used as the training set and validation set, respectively. To evaluate the performance of one-shot segmentation of novel WM tracts, we considered a more realistic and challenging scenario where novel WM tracts are to be segmented on dMRI scans that are acquired differently from the dMRI scans annotated for existing WM tracts. Specifically, instead of using the original HCP dMRI scans for segmenting novel WM tracts, we generated clinical quality scans from them. The clinical quality scans were generated by selecting only 34 diffusion gradients at $b = 1000\ \mathrm{s/mm^2}$ and downsampling the selected diffusion weighted images in the spatial domain by a factor of two to a voxel size of 2.5 mm isotropic. The tract annotations were also downsampled accordingly. Since the dMRI scans in the in-house dataset were acquired differently from the original HCP dMRI scans, they were directly used for evaluation together with their original annotations.
Table 1: The novel WM tracts included in each experimental case.

| WM Tract Name | Abbr. | CQ1 | CQ2 | CQ3 | IH1 | IH2 | IH3 |
|---|---|---|---|---|---|---|---|
| Corticospinal tract left | CST_left | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Corticospinal tract right | CST_right | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Fronto-pontine tract left | FPT_left | ✓ | ✓ | | | | |
| Fronto-pontine tract right | FPT_right | ✓ | ✓ | | | | |
| Inferior longitudinal fascicle left | ILF_left | ✓ | | | | | |
| Inferior longitudinal fascicle right | ILF_right | ✓ | | | | | |
| Optic radiation left | OR_left | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Optic radiation right | OR_right | ✓ | ✓ | ✓ | ✓ | ✓ | ✓ |
| Parieto-occipital pontine tract left | POPT_left | ✓ | ✓ | ✓ | ✓ | | |
| Parieto-occipital pontine tract right | POPT_right | ✓ | ✓ | ✓ | ✓ | | |
| Uncinate fascicle left | UF_left | ✓ | ✓ | | | | |
| Uncinate fascicle right | UF_right | ✓ | ✓ | | | | |
The two datasets were used to evaluate the accuracy of one-shot segmentation of novel WM tracts separately based on the model pretrained on the original HCP dataset for segmenting existing WM tracts. We considered three cases of novel WM tracts for each dataset. For the clinical quality scans, the three cases are referred to as CQ1, CQ2, and CQ3, and for the in-house dataset, the three cases are referred to as IH1, IH2, and IH3. The details about these cases are summarized in Table 1. For each dataset, only one scan was selected from it for network training, together with the corresponding annotation of novel WM tracts; this single annotated scan was also used as the validation set for model selection. For the clinical quality dataset and the in-house dataset, 30 and 16 scans were used for testing, respectively, where the annotations of novel WM tracts were available and used only to measure the segmentation accuracy.
[Fig. 1: Examples of the segmentation results of novel WM tracts (CST_left and OR_right in the case of CQ1) for the proposed and competing methods, together with the annotations for reference.]
3.2 Evaluation of Segmentation Results
We compared the proposed method with two competing methods, which are the classic fine-tuning strategy and the improved fine-tuning strategy [10] described in Section 2.1 with the same pretrained model and single annotated training scan used by the proposed method. Both competing methods were integrated with TractSeg [17] like the proposed method. For convenience, the classic fine-tuning strategy and the improved fine-tuning strategy are referred to as CFT and IFT, respectively. Note that as shown in [10], in the one-shot setting directly training a model that segments novel WM tracts from scratch without the pretrained model would lead to segmentation failure. Thus, this strategy was not considered.
We first qualitatively evaluated the proposed method. Examples of the segmentation results for novel WM tracts are shown in Fig. 1 for the proposed and competing methods, where the annotations are also displayed for reference. For demonstration, here we show the results obtained in the case of CQ1 for CST_left and OR_right. We can see that the segmentation results of our method better resemble the annotations than those of the competing methods.
Next, we quantitatively evaluated our method. The Dice coefficient was computed for the segmentation result of each novel WM tract on each test scan for each case. For demonstration, we have listed the average Dice coefficient of each novel WM tract in Table 2 for the cases of CQ1 and IH1. For each tract and each case, our method has a higher average Dice coefficient than the competing methods, and the improvement is statistically significant. We have also summarized the mean of the average Dice coefficients of the novel WM tracts for all the cases in Table 3 (upper half of the table). In all cases our method outperforms the competing methods with higher mean Dice coefficients.
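For reference, the Dice coefficient used in this evaluation is the standard overlap measure between a binary segmentation and the reference annotation; the small epsilon guard against empty volumes is an implementation detail, not part of the paper's protocol.

```python
import numpy as np

def dice_coefficient(seg, ref, eps=1e-8):
    """Dice coefficient between a binary segmentation and a reference:
    2 * |seg AND ref| / (|seg| + |ref|)."""
    seg = np.asarray(seg, dtype=bool)
    ref = np.asarray(ref, dtype=bool)
    intersection = np.logical_and(seg, ref).sum()
    return 2.0 * intersection / (seg.sum() + ref.sum() + eps)
```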
Table 2: The average Dice coefficients of each novel WM tract for the cases of CQ1 and IH1 (*** indicates that the difference from our method is statistically significant).

| WM Tract | CFT (CQ1) | IFT (CQ1) | Ours (CQ1) | CFT (IH1) | IFT (IH1) | Ours (IH1) |
|---|---|---|---|---|---|---|
| CST_left | 0.123*** | 0.463*** | 0.644 | 0.208*** | 0.455*** | 0.569 |
| CST_right | 0.120*** | 0.564*** | 0.692 | 0.129*** | 0.416*** | 0.542 |
| OR_left | 0.000*** | 0.281*** | 0.492 | 0.118*** | 0.504*** | 0.548 |
| OR_right | 0.000*** | 0.401*** | 0.533 | 0.100*** | 0.462*** | 0.518 |
Table 3: The mean of the average Dice coefficients of the novel WM tracts for each case.

| Method | CQ1 | CQ2 | CQ3 | IH1 | IH2 | IH3 |
|---|---|---|---|---|---|---|
| CFT | 0.061 | 0.097 | 0.092 | 0.139 | 0.280 | 0.192 |
| IFT | 0.427 | 0.351 | 0.396 | 0.459 | 0.518 | 0.531 |
| Ours | 0.590 | 0.611 | 0.567 | 0.544 | 0.639 | 0.662 |
| RC1 | 0.574 | 0.552 | 0.519 | 0.529 | 0.630 | 0.637 |
| RC2 | 0.552 | 0.555 | 0.544 | 0.514 | 0.625 | 0.651 |
| TC1 | 0.524 | 0.605 | 0.536 | 0.509 | 0.592 | 0.637 |
| TC2 | 0.582 | 0.610 | 0.552 | 0.524 | 0.629 | 0.654 |
Finally, we confirmed the benefit of each proposed data augmentation strategy, as well as the benefit of ensembling their results. For each case, the mean value of the average Dice coefficients of all novel WM tracts was computed for the segmentation results achieved with RC1, RC2, TC1, or TC2 individually, and the results are also given in Table 3 (lower half of the table). Compared with the results of IFT that did not use the proposed data augmentation, the integration of IFT with RC1, RC2, TC1, or TC2 led to improved segmentation accuracy, which indicates the individual benefit of each proposed data augmentation strategy. In addition, the Dice coefficients of the proposed method achieved with ensembling are higher than those achieved with any single data augmentation strategy, which confirms the benefit of ensembling. Note that no single data augmentation strategy is consistently better or worse than the others across all cases, possibly because of the randomness in RC1 and RC2 and the dependence of TC1 and TC2 on the spatial coverages of the novel WM tracts, which vary between cases.
4 Conclusion
We have proposed an approach to one-shot segmentation of novel WM tracts based on an improved pretraining and fine-tuning framework via extensive data augmentation. The data augmentation is performed with region masking, and several masking strategies are developed. The segmentation results achieved with these strategies are ensembled for the final segmentation. The experimental results on two brain dMRI datasets show that the proposed method improves the accuracy of novel WM tract segmentation in the one-shot setting.
Acknowledgements
This work is supported by Beijing Natural Science Foundation (L192058).
References
- [1] Banihashemi, L., Peng, C.W., Verstynen, T., Wallace, M.L., Lamont, D.N., Alkhars, H.M., Yeh, F.C., Beeney, J.E., Aizenstein, H.J., Germain, A.: Opposing relationships of childhood threat and deprivation with stria terminalis white matter. Human Brain Mapping 42(8), 2445–2460 (2021)
- [2] Basser, P.J., Pajevic, S., Pierpaoli, C., Duda, J., Aldroubi, A.: In vivo fiber tractography using DT-MRI data. Magnetic Resonance in Medicine 44(4), 625–632 (2000)
- [3] Bazin, P.L., Ye, C., Bogovic, J.A., Shiee, N., Reich, D.S., Prince, J.L., Pham, D.L.: Direct segmentation of the major white matter tracts in diffusion tensor images. NeuroImage 58(2), 458–468 (2011)
- [4] DeVries, T., Taylor, G.W.: Improved regularization of convolutional neural networks with cutout. arXiv preprint arXiv:1708.04552 (2017)
- [5] Ding, Y., Yu, X., Yang, Y.: Modeling the probabilistic distribution of unlabeled data for one-shot medical image segmentation. In: AAAI Conference on Artificial Intelligence. pp. 1246–1254. AAAI (2021)
- [6] Isensee, F., Jaeger, P.F., Kohl, S.A., Petersen, J., Maier-Hein, K.H.: nnU-Net: A self-configuring method for deep learning-based biomedical image segmentation. Nature Methods 18(2), 203–211 (2021)
- [7] Jeurissen, B., Tournier, J.D., Dhollander, T., Connelly, A., Sijbers, J.: Multi-tissue constrained spherical deconvolution for improved analysis of multi-shell diffusion MRI data. NeuroImage 103, 411–426 (2014)
- [8] Kingma, D.P., Ba, J.: Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980 (2014)
- [9] Liu, W., Lu, Q., Zhuo, Z., Li, Y., Duan, Y., Yu, P., Qu, L., Ye, C., Liu, Y.: Volumetric segmentation of white matter tracts with label embedding. NeuroImage 250, 118934 (2022)
- [10] Lu, Q., Ye, C.: Knowledge transfer for few-shot segmentation of novel white matter tracts. In: International Conference on Information Processing in Medical Imaging. pp. 216–227. Springer (2021)
- [11] MacNiven, K.H., Leong, J.K., Knutson, B.: Medial forebrain bundle structure is linked to human impulsivity. Science Advances 6(38), eaba4788 (2020)
- [12] O’Donnell, L.J., Pasternak, O.: Does diffusion MRI tell us anything about the white matter? An overview of methods and pitfalls. Schizophrenia Research 161(1), 133–141 (2015)
- [13] Stephens, R.L., Langworthy, B.W., Short, S.J., Girault, J.B., Styner, M.A., Gilmore, J.H.: White matter development from birth to 6 years of age: A longitudinal study. Cerebral Cortex 30(12), 6152–6168 (2020)
- [14] Toescu, S.M., Hales, P.W., Kaden, E., Lacerda, L.M., Aquilina, K., Clark, C.A.: Tractographic and microstructural analysis of the dentato-rubro-thalamo-cortical tracts in children using diffusion MRI. Cerebral Cortex 31(5), 2595–2609 (2021)
- [15] Tournier, J.D., Calamante, F., Connelly, A.: Robust determination of the fibre orientation distribution in diffusion MRI: Non-negativity constrained super-resolved spherical deconvolution. NeuroImage 35(4), 1459–1472 (2007)
- [16] Van Essen, D.C., Smith, S.M., Barch, D.M., Behrens, T.E., Yacoub, E., Ugurbil, K., Wu-Minn HCP Consortium: The WU-Minn human connectome project: An overview. NeuroImage 80, 62–79 (2013)
- [17] Wasserthal, J., Neher, P., Maier-Hein, K.H.: TractSeg - Fast and accurate white matter tract segmentation. NeuroImage 183, 239–253 (2018)
- [18] Wu, Y., Hong, Y., Ahmad, S., Lin, W., Shen, D., Yap, P.T., UNC/UMN Baby Connectome Project Consortium: Tract dictionary learning for fast and robust recognition of fiber bundles. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 251–259. Springer (2020)
- [19] Ye, C., Yang, Z., Ying, S.H., Prince, J.L.: Segmentation of the cerebellar peduncles using a random forest classifier and a multi-object geometric deformable model: Application to spinocerebellar ataxia type 6. Neuroinformatics 13(3), 367–381 (2015)
- [20] Yeatman, J.D., Dougherty, R.F., Myall, N.J., Wandell, B.A., Feldman, H.M.: Tract profiles of white matter properties: Automating fiber-tract quantification. PloS ONE 7(11), e49790 (2012)
- [21] Yun, S., Han, D., Oh, S.J., Chun, S., Choe, J., Yoo, Y.: CutMix: Regularization strategy to train strong classifiers with localizable features. In: International Conference on Computer Vision. pp. 6023–6032. IEEE (2019)
- [22] Zarkali, A., McColgan, P., Leyland, L.A., Lees, A.J., Rees, G., Weil, R.S.: Fiber-specific white matter reductions in Parkinson hallucinations and visual dysfunction. Neurology 94(14), e1525–e1538 (2020)
- [23] Zhang, F., Karayumak, S.C., Hoffmann, N., Rathi, Y., Golby, A.J., O’Donnell, L.J.: Deep white matter analysis (DeepWMA): Fast and consistent tractography segmentation. Medical Image Analysis 65, 101761 (2020)
- [24] Zhang, X., Liu, C., Ou, N., Zeng, X., Xiong, X., Yu, Y., Liu, Z., Ye, C.: CarveMix: A simple data augmentation method for brain lesion segmentation. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 196–205. Springer (2021)