School of Computer Science and Software Engineering,
Shenzhen University
[email protected], [email protected]

National Engineering Research Center for Big Data Technology and System,
Services Computing Technology and System Lab,
Cluster and Grid Computing Lab,
School of Computer Science and Technology,
Huazhong University of Science and Technology,
430074, Wuhan, China
[email protected]
HI-Net: Hyperdense Inception 3D UNet for Brain Tumor Segmentation
Abstract
The brain tumor segmentation task aims to classify tissue into the whole tumor (WT), tumor core (TC), and enhancing tumor (ET) classes using multi-modal MRI scans. Quantitative analysis of brain tumors is critical for clinical decision making. While manual segmentation is tedious, time-consuming, and subjective, the task remains very challenging for automatic segmentation methods. Thanks to their powerful learning ability, convolutional neural networks (CNNs), mainly fully convolutional networks, have shown promising brain tumor segmentation results. This paper further boosts the performance of brain tumor segmentation by proposing the hyperdense inception 3D UNet (HI-Net), which captures multi-scale information by stacking factorized 3D weighted convolutional layers in the residual inception block. We use hyper-dense connections among the factorized convolutional layers to extract more contextual information through feature reuse. We use a dice loss function to cope with class imbalance. We validate the proposed architecture on the multi-modal brain tumor segmentation challenge (BRATS) 2020 testing dataset. Preliminary results on the BRATS 2020 testing set show that our proposed approach achieves dice (DSC) scores of 79.457, 87.494, and 83.712 for ET, WT, and TC, respectively.
Keywords: Brain tumor · 3D UNet · Dense connections · Factorized convolution · Deep learning

1 Introduction
Primary and secondary are the two types of brain tumors. Primary brain tumors originate from brain cells, whereas secondary tumors metastasize into the brain from other organs. Gliomas are primary brain tumors. Gliomas can be further sub-divided into two grades: low-grade (LGG) and high-grade (HGG). High-grade gliomas are an aggressive type of malignant brain tumor that proliferates rapidly, usually requires surgery and radiotherapy, and has a poor survival prognosis. Magnetic resonance imaging (MRI) is a critical diagnostic tool for brain tumor analysis, monitoring, and surgery planning. Usually, several complementary MRI modalities, such as T1, T1 with contrast agent (T1ce), T2, and fluid attenuation inversion recovery (FLAIR), are required to emphasize different tissue properties and areas of tumor spread. For example, the contrast agent, usually gadolinium, emphasizes hyperactive tumor subregions in the T1ce modality.
Deep learning techniques, especially CNNs, are prevalent for the automatic segmentation of brain tumors. CNNs can learn from examples and demonstrate state-of-the-art segmentation accuracy both in 2D natural images and in 3D medical image modalities. The segmentation output provides an accurate, reproducible basis for further tumor analysis and monitoring. The multi-modal brain tumor segmentation challenge (BRATS) aims to evaluate state-of-the-art methods for the segmentation of brain tumors by providing a 3D MRI dataset with ground truth labels annotated by physicians [1], [2], [3], [4], [14]. The UNet is a popular CNN architecture for automatic brain tumor segmentation [8]. The multi-scale contextual information of its encoder-decoder sub-networks is effective for the accurate brain tumor segmentation task. Several variants of encoder-decoder architectures were proposed for recent MICCAI BraTS competitions. The potential of several deep architectures [12, 13, 17] and their ensembling procedures for brain tumor segmentation was discussed by a top-performing method [11] in the MICCAI BRATS 2017 competition. Wang et al. [18] proposed architectures with factorized weighted layers to save GPU memory and computational time. At the same time, the majority of these architectures used either bigger input sizes [16], cascaded training [10], or novel pre-processing [7] and post-processing strategies [9] to improve segmentation accuracy. In contrast, few architectures address the high memory consumption of 3D convolutional layers. Chen et al. [5] introduced the important concept of splitting each weighted layer into three parallel branches, each with a different orthogonal view, namely axial, sagittal, and coronal. However, more complex combinations exist between features within and between different orthogonal views, which can significantly enrich the learned representation [6].
Inspired by the S3D-UNet architecture [5, 19], we propose a variant encoder-decoder architecture for brain tumor segmentation. The key contributions of our study are as follows:
- A novel hyperdense inception 3D UNet (HI-Net) architecture is proposed by stacking factorized 3D weighted convolutional layers in the residual inception block.
- In each residual inception block, hyper-dense connections are used between different orthogonal views to learn more complex feature representations.
- Our network achieves state-of-the-art performance compared with other recent methods.
2 Proposed Method
Figure 1 shows the proposed HI-Net architecture for brain tumor segmentation. The network’s left side works as an encoder to extract features at different levels, and the right component of the network acts as a decoder to aggregate the features and produce the segmentation mask. The modified residual inception blocks of the encoder-decoder sub-networks have two convolutional layers, each following the structure of Fig. 2(b); traditional residual inception blocks are shown in Fig. 2(a). This study employs dense inter-connections within and between different orthogonal views to learn more complex feature representations. During encoding, the encoder extracts features at multiple scales and creates fine-to-coarse feature maps. Fine feature maps contain low-level features but more spatial information, while coarse feature maps provide the opposite. Skip connections are used to combine coarse and fine feature maps for accurate segmentation. Unlike the standard residual UNet, the encoder sub-network uses a self-repetition procedure at multiple levels to generate semantic maps for the fine feature maps and thus select relevant regions in the fine feature maps to concatenate with the coarse feature maps.

3 Implementation Details
3.1 Dataset
The BRATS 2020 [1], [2], [3], [4], [14] training dataset includes 369 cases of HGG and LGG, each with four rigidly aligned 3D MRI modalities (T1, T1ce, T2, and FLAIR), resampled to 1 mm isotropic resolution and skull-stripped. The input image size is 240×240×155. The data were collected from multiple institutions using various MRI scanners. Annotations include three tumor subregions: WT, TC, and ET. Two additional datasets without ground truth labels are provided for validation and testing; these require participants to upload segmentation masks to the organizers’ server for evaluation. In the validation (125 cases) and testing (166 cases) datasets, each subject includes the same four modalities of brain MRI scans but no ground truth. In our experiments, the training set is used to optimize the trainable parameters of the network, while the validation and testing sets are used to evaluate the performance of the trained network.
3.2 Experiments
The network is implemented in Keras and trained on an NVIDIA Tesla V100-SXM2 GPU. The Adam optimizer is employed, and the learning rate is reduced by a fixed factor at regular epoch intervals. During network training, augmentation techniques such as random rotations and mirroring are employed. The multi-label dice loss function [15] addresses the class imbalance problem. Equation 1 shows the mathematical representation of the loss function.
$$\mathcal{L}_{dice} = 1 - \frac{1}{C}\sum_{c=1}^{C}\frac{2\sum_{i} p_{i}^{c}\, g_{i}^{c}}{\sum_{i} p_{i}^{c} + \sum_{i} g_{i}^{c}} \qquad (1)$$

where $p_{i}^{c}$ and $g_{i}^{c}$ are the prediction obtained by softmax activation and the ground truth at voxel $i$ for class $c$, respectively, and $C$ is the total number of classes.

3.3 Evaluation Metrics
Multiple criteria are computed as performance metrics to quantify the segmentation results. The Dice coefficient (DSC) is the most frequently used metric for evaluating medical image segmentation. It measures the overlap between the segmentation and the ground truth, with a value between 0 and 1; the higher the Dice score, the better the segmentation performance. Sensitivity and specificity are also commonly used statistical measures. Sensitivity, the true positive rate, is defined as the proportion of positives that are correctly predicted; it measures the portion of tumor regions in the ground truth that is also predicted as tumor by the segmentation method. Specificity, the true negative rate, is defined as the proportion of correctly predicted negatives; it measures the portion of normal tissue regions in the ground truth that is also predicted as normal tissue by the segmentation method.
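For clarity, all three metrics follow from the confusion-matrix counts of a binary mask. The sketch below is illustrative NumPy code, not the official BRATS evaluation tool:

```python
import numpy as np

def segmentation_metrics(pred, truth):
    """Compute DSC, sensitivity, and specificity for binary masks."""
    pred, truth = pred.astype(bool), truth.astype(bool)
    tp = np.logical_and(pred, truth).sum()    # tumor predicted as tumor
    fp = np.logical_and(pred, ~truth).sum()   # normal predicted as tumor
    fn = np.logical_and(~pred, truth).sum()   # tumor predicted as normal
    tn = np.logical_and(~pred, ~truth).sum()  # normal predicted as normal
    dsc = 2 * tp / (2 * tp + fp + fn)         # overlap of pred and truth
    sensitivity = tp / (tp + fn)              # true positive rate
    specificity = tn / (tn + fp)              # true negative rate
    return dsc, sensitivity, specificity
```

In the multi-class setting, each tumor subregion (WT, TC, ET) is binarized and evaluated separately with this scheme.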
3.4 Results
The performance of our proposed architecture is evaluated on the training, validation, and testing sets provided by BRATS 2020. Table 1 presents the quantitative analysis of the proposed work. We secured mean DSC scores for ET, WT, and TC of 74.191, 90.673, and 84.293, respectively, on the validation dataset, and 80.009, 92.967, and 90.963 on the training dataset. Our proposed approach obtained mean DSC scores for ET, WT, and TC of 79.457, 87.494, and 83.712, respectively, on the testing dataset. Table 1 also reports sensitivity and specificity on the training, validation, and testing datasets. Table 2 shows a comparative study of the proposed work against the baseline work [5]. Our proposed HI-Net achieves higher scores for each tumor region than the baseline. Furthermore, ablation studies are conducted to assess the influence of the modified residual inception blocks with and without the inter-connected dense connections; the effect of these connections on the DSCs of ET, WT, and TC is shown in Table 2. To provide qualitative results of our method, three segmented images from the training data are shown in Fig. 3. In summary, the modified inception blocks significantly improve the DSCs of ET, WT, and TC over the baseline inception blocks.

Table 1: Quantitative results (%) of the proposed HI-Net on the BRATS 2020 datasets.

| Dataset | Metrics | WT | TC | ET |
|---|---|---|---|---|
| BRATS 2020 Training | DSC | 92.967 | 90.963 | 80.009 |
| | Sensitivity | 93.004 | 91.282 | 80.751 |
| | Specificity | 99.932 | 99.960 | 99.977 |
| BRATS 2020 Validation | DSC | 90.673 | 84.293 | 74.191 |
| | Sensitivity | 90.485 | 80.572 | 73.516 |
| | Specificity | 99.929 | 99.974 | 99.977 |
| BRATS 2020 Testing | DSC | 87.494 | 83.712 | 79.457 |
| | Sensitivity | 91.628 | 85.257 | 82.409 |
| | Specificity | 99.883 | 99.962 | 99.965 |
Table 2: DSC (%) comparison between the proposed work and the baseline work [5] on the BRATS 2020 validation dataset.

| Methods | ET | WT | TC |
|---|---|---|---|
| Baseline Work [5] | 70.616 | 90.670 | 82.136 |
| Proposed Work | 74.191 | 90.673 | 84.293 |
4 Conclusion
We proposed the HI-Net architecture for brain tumor segmentation. Each 3D convolution in the residual inception block is split into three parallel branches, each with a different orthogonal view, namely axial, sagittal, and coronal. We also proposed hyperdense connections among the factorized convolutional layers to extract more contextual information. The HI-Net architecture secures high DSC scores for all types of tumors. The network has been evaluated on the BRATS 2020 Challenge testing dataset and achieved average DSC scores of 79.457, 87.494, and 83.712 for the segmentation of ET, WT, and TC, respectively. Compared with the performance on the validation dataset, the ET score on the testing set is higher. In the future, we will work to enhance the robustness of the network and improve segmentation performance by using post-processing methods such as a fully connected conditional random field (CRF).
Acknowledgment
This work is supported by the National Natural Science Foundation of China under Grant No. 91959108.
References
- [1] Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J., Freymann, J., Farahani, K., Davatzikos, C.: Segmentation labels and radiomic features for the pre-operative scans of the TCGA-GBM collection. The Cancer Imaging Archive (2017) (2017)
- [2] Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J., Freymann, J., Farahani, K., Davatzikos, C.: Segmentation labels and radiomic features for the pre-operative scans of the TCGA-LGG collection. The Cancer Imaging Archive 286 (2017)
- [3] Bakas, S., Akbari, H., Sotiras, A., Bilello, M., Rozycki, M., Kirby, J.S., Freymann, J.B., Farahani, K., Davatzikos, C.: Advancing The Cancer Genome Atlas glioma MRI collections with expert segmentation labels and radiomic features. Scientific Data 4, 170117 (Sep 2017), https://doi.org/10.1038/sdata.2017.117
- [4] Bakas, S., Reyes, M., Jakab, A., Bauer, S., Rempfler, M., Crimi, A., Shinohara, R.T., Berger, C., Ha, S.M., Rozycki, M., Prastawa, M., Alberts, E., Lipková, J., Freymann, J.B., Kirby, J.S., Bilello, M., Fathallah-Shaykh, H.M., Wiest, R., Kirschke, J., Wiestler, B., Colen, R.R., Kotrotsou, A., LaMontagne, P., Marcus, D.S., Milchenko, M., Nazeri, A., Weber, M.A., Mahajan, A., Baid, U., Kwon, D., Agarwal, M., Alam, M., Albiol, A., Albiol, A., Varghese, A., Tuan, T.A., Arbel, T., Avery, A., Banerjee, S., Batchelder, T., Batmanghelich, K.N., Battistella, E., Bendszus, M., Benson, E., Bernal, J., Biros, G., Cabezas, M., Chandra, S., Chang, Y.J., et al.: Identifying the Best Machine Learning Algorithms for Brain Tumor Segmentation, Progression Assessment, and Overall Survival Prediction in the BRATS Challenge. CoRR abs/1811.02629 (2018), http://arxiv.org/abs/1811.02629
- [5] Chen, W., Liu, B., Peng, S., Sun, J., Qiao, X.: S3D-UNet: Separable 3D U-Net for Brain Tumor Segmentation. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. pp. 358–368. Springer International Publishing, Cham (2019)
- [6] Dolz, J., Gopinath, K., Yuan, J., Lombaert, H., Desrosiers, C., Ayed, I.B.: HyperDense-Net: A hyper-densely connected CNN for multi-modal image segmentation. CoRR abs/1804.02967 (2018), http://arxiv.org/abs/1804.02967
- [7] Feng, X., Tustison, N., Meyer, C.: Brain Tumor Segmentation Using an Ensemble of 3D U-Nets and Overall Survival Prediction Using Radiomic Features. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. pp. 279–288. Springer International Publishing, Cham (2019)
- [8] Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: Brain Tumor Segmentation and Radiomics Survival Prediction: Contribution to the BRATS 2017 Challenge. CoRR abs/1802.10508 (2018), http://arxiv.org/abs/1802.10508
- [9] Isensee, F., Kickingereder, P., Wick, W., Bendszus, M., Maier-Hein, K.H.: No New-Net. In: Crimi, A., Bakas, S., Kuijf, H., Keyvan, F., Reyes, M., van Walsum, T. (eds.) Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. pp. 234–244. Springer International Publishing (2019)
- [10] Jiang, Z., Ding, C., Liu, M., Tao, D.: Two-Stage Cascaded U-Net: 1st Place Solution to BraTS Challenge 2019 Segmentation Task. In: Crimi, A., Bakas, S. (eds.) Brainlesion: Glioma, Multiple Sclerosis, Stroke and Traumatic Brain Injuries. pp. 231–241. Springer International Publishing, Cham (2020)
- [11] Kamnitsas, K., Bai, W., Ferrante, E., McDonagh, S.G., Sinclair, M., Pawlowski, N., Rajchl, M., Lee, M.C.H., Kainz, B., Rueckert, D., Glocker, B.: Ensembles of Multiple Models and Architectures for Robust Brain Tumour Segmentation. CoRR abs/1711.01468 (2017), http://arxiv.org/abs/1711.01468
- [12] Kamnitsas, K., Ledig, C., Newcombe, V.F.J., Simpson, J.P., Kane, A.D., Menon, D.K., Rueckert, D., Glocker, B.: Efficient multi-scale 3D CNN with fully connected CRF for accurate brain lesion segmentation. Medical Image Analysis 36, 61–78 (2017)
- [13] Long, J., Shelhamer, E., Darrell, T.: Fully Convolutional Networks for Semantic Segmentation. CoRR abs/1411.4038 (2014), http://arxiv.org/abs/1411.4038
- [14] Menze, B.H., Jakab, A., Bauer, S., Kalpathy-Cramer, J., Farahani, K., Kirby, J., Burren, Y., Porz, N., Slotboom, J., Wiest, R., Lanczi, L., Gerstner, E., Weber, M., Arbel, T., Avants, B.B., Ayache, N., Buendia, P., Collins, D.L., Criminisi, A.: The Multimodal Brain Tumor Image Segmentation Benchmark (BRATS). IEEE Transactions on Medical Imaging 34(10), 1993–2024 (oct 2015). https://doi.org/10.1109/TMI.2014.2377694
- [15] Milletari, F., Navab, N., Ahmadi, S.A.: V-Net: Fully Convolutional Neural Networks for Volumetric Medical Image Segmentation. CoRR abs/1606.04797 (2016), http://arxiv.org/abs/1606.04797
- [16] Myronenko, A.: 3D MRI brain tumor segmentation using autoencoder regularization. CoRR abs/1810.11654 (2018), http://arxiv.org/abs/1810.11654
- [17] Ronneberger, O., Fischer, P., Brox, T.: U-Net: Convolutional Networks for Biomedical Image Segmentation. CoRR abs/1505.04597 (2015), http://arxiv.org/abs/1505.04597
- [18] Wang, G., Li, W., Ourselin, S., Vercauteren, T.: Automatic Brain Tumor Segmentation using Cascaded Anisotropic Convolutional Neural Networks. CoRR abs/1709.00382 (2017), http://arxiv.org/abs/1709.00382
- [19] Xie, S., Sun, C., Huang, J., Tu, Z., Murphy, K.: Rethinking Spatiotemporal Feature Learning For Video Understanding. CoRR abs/1712.04851 (2017), http://arxiv.org/abs/1712.04851