MKIS-Net: A Light-Weight Multi-Kernel Network for Medical Image Segmentation
Abstract
Image segmentation is an important task in medical imaging. It constitutes the backbone of a wide variety of clinical diagnostic methods, treatments, and computer-aided surgeries. In this paper, we propose a multi-kernel image segmentation net (MKIS-Net), which uses multiple kernels to create an efficient receptive field and enhance segmentation performance. As a result of its multi-kernel design, MKIS-Net is a light-weight architecture with a small number of trainable parameters. Moreover, these multi-kernel receptive fields also contribute to better segmentation results. We demonstrate the efficacy of MKIS-Net on several tasks including segmentation of retinal vessels, skin lesion segmentation, and chest X-ray segmentation. The performance of the proposed network is quite competitive, and often superior, in comparison to state-of-the-art methods. Moreover, in some cases MKIS-Net has more than an order of magnitude fewer trainable parameters than existing medical image segmentation alternatives and is at least four times smaller than other light-weight architectures.
Index Terms:
Convolutional Neural Networks, Medical Image Segmentation, Medical Image Analysis, Light-Weight Networks.
I Introduction
The aim of medical image segmentation is to facilitate the analysis of pathological or anatomical structural changes in patients. This often plays a critical role in computer-aided diagnosis and treatment [1]. Popular image segmentation tasks include retinal vessel [2, 3, 4, 5], optic disc [6, 7, 8], brain tumor [9], lung [10], and skin lesion segmentation [11].
In the last decade, medical image segmentation methods have typically been based on deep artificial neural networks. In [12], fully convolutional networks (FCNs) are constructed from locally connected layers, including convolution, pooling, and upsampling. The FCN architecture comprises two main components: a downsampling stage, which captures contextual information, and an upsampling stage, which recovers spatial information. U-Net [13] originated from FCNs. The key difference is that each upsampling stage is connected via a concatenation operator to its corresponding downsampling stage. This way, each upsampling stage inherits the details of its corresponding downsampling stage, which leads to better segmentation performance. SegNet [14] stores the max-pooling indices computed at each encoder pooling layer and reuses them to upsample the encoder output in the decoder.
While effective, these methods can be quite computationally demanding. Indeed, designing an FCN architecture for more than one medical image segmentation application in a resource-constrained environment remains challenging. This is especially true since model size and computational requirements, in terms of memory and processing power, are important factors for real-world deployment of medical image segmentation methods, and medical imaging platforms often have limited computational resources available for complex or computationally costly operations.
To reduce the number of trainable parameters, model compression [15] or binary network weights [16] have been used. On the other hand, light-weight networks with shallow architectures have been applied to real-world scenarios that require real-time prediction and decision making [17, 18]. Some have proposed network compression, removing redundancies from a pretrained model through pruning techniques [17]. In [15, 16], only a few bits are used to represent learned model weights instead of high-precision floating point numbers. These approaches modify the network structure and often come at the cost of poor segmentation performance. Light-weight CNNs, on the other hand, are computationally cheap shallow networks with improved efficiency [19, 18]. In such networks, convolutional factorization is often used to reduce the model size. For example, depth-wise separable convolutions are employed by both ShuffleNet [19] and MobileNets [18], where 1×1 point-wise and depth-wise convolutions are used instead of standard convolutions. In ERFNet [20], a 2D convolution is decomposed into two 1D-factorized convolutions. Despite their success, convolutional factorization can undermine the ability of light-weight networks to learn discriminative local structures, leading to performance degradation [21]. We have recently proposed the RC-Net [22] and the more optimized T-Net [23] light-weight networks, which use VGG16 as their base. T-Net extracts low-frequency features through the use of small pooling kernels; however, it performs poorly on datasets with high feature variation.
Here, we present a novel network architecture, called multi-kernel image segmentation network (MKIS-Net), which does not adopt a factorization approach but is rather an FCN that uses a multi-kernel architecture. This produces an effective receptive field at the encoder stage which, in turn, yields better segmentation performance. Like other light-weight architectures, MKIS-Net is a shallow network. In contrast with them, however, it is devoid of pooling layers. This yields an architecture with a reduced loss of the spatial information often associated with shrinking feature maps.
II MKIS-Net Architecture
The MKIS-Net architecture is motivated by the fact that, in computer-aided diagnosis, accuracy and computational cost go hand in hand: diagnostic platforms often have limited computational resources while requiring high accuracy to be reliable tools for professionals in the field. Thus, our architecture addresses two important needs. Firstly, the need to keep the number of learnable parameters low; reducing the number of layers and filters reduces the overall computational complexity. Secondly, the need to accurately process high-resolution imagery. This is particularly important in medical imaging, since fine-grained structures, such as vessels or alveoli, play an important role in the diagnosis of multiple conditions.
To motivate our architecture, we note that the feature maps extracted by a CNN kernel can be viewed, particularly in the earlier stages of the network, as representative of lower-level features in the images. When these features are sparse and a large number of filters is used, the feature maps may contain overlapping features [24], which do not contribute to performance. Moreover, frequent use of pooling layers can cause spatial information loss [25]. In the medical segmentation literature, the complexity of convolutional networks has therefore typically been reduced by proposing shallow nets with fewer layers [26, 24]. To address these issues, we designed a shallow multi-kernel architecture that reduces network complexity and avoids pooling layers. In contrast with other light-weight approaches, MKIS-Net adopts a multi-kernel architecture at each layer. This is motivated by the notion that smaller kernels provide receptive fields that cover smaller regions of the image, whereas larger ones provide larger receptive fields for higher-resolution imagery. The main drawback of larger kernels, however, is their increased computational cost, and as a network grows deeper, these costs compound with every added layer.
The network architecture of MKIS-Net (Fig. 1) uses four convolutional input layers in parallel, with kernels of increasing size: 3×3, 5×5, 7×7, and 11×11 pixels. According to [1], large kernels in semantic segmentation provide better per-pixel approximation through an effective receptive field. Using large kernels in every block, however, would raise the overall cost of the network considerably. We therefore confine the largest kernels to the input block, exploiting their benefits without significantly increasing the number of trainable parameters.
This contrasts with other segmentation networks, which are based on a set of small, single-size kernels [14, 13]. After the input convolutional block, six more convolutional blocks are used. All of these two-scale multi-kernel blocks employ 3×3 and 5×5 filters. Altogether, the number of convolutional layers is small and, as a result, the network is shallow, with low computational complexity and a small number of trainable parameters. Although MKIS-Net is shallow, it adds, in an element-wise fashion, the feature maps produced by the kernels in each multi-kernel block. This creates a multiscale feature map combining the receptive fields of kernels of different sizes. Notice that our architecture is devoid of pooling layers. To control the feature map size, we instead employ two strided convolutions, each of which halves the feature map size. At the output stage, transposed convolutions and a softmax layer are used. As dropout helps to reduce the over-fitting issues that can arise in shallow networks [27, 28], we use dropout in the output block with a probability of 0.4. Finally, a classification layer delivers a categorical label for each image pixel.
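To make the block design concrete, the following is a minimal PyTorch sketch of a multi-kernel block: parallel convolutions of different kernel sizes whose outputs are fused by element-wise addition. The class name, channel counts, and the batch-normalization/ReLU pairing are our assumptions; the paper specifies only the kernel sizes, the element-wise fusion, and the use of strided convolutions for downsampling.

```python
import torch
import torch.nn as nn

class MultiKernelBlock(nn.Module):
    """Sketch of a multi-kernel block: parallel convolutions with
    different kernel sizes, fused by element-wise addition."""

    def __init__(self, in_ch, out_ch, kernel_sizes=(3, 5), stride=1):
        super().__init__()
        self.branches = nn.ModuleList([
            # padding k//2 keeps all (odd-size) branch outputs the same shape
            nn.Conv2d(in_ch, out_ch, k, stride=stride, padding=k // 2)
            for k in kernel_sizes
        ])
        self.bn = nn.BatchNorm2d(out_ch)   # assumption: BN + ReLU per block
        self.act = nn.ReLU(inplace=True)

    def forward(self, x):
        # element-wise sum combines the receptive fields of all kernel sizes
        out = self.branches[0](x)
        for branch in self.branches[1:]:
            out = out + branch(x)
        return self.act(self.bn(out))

# Input block with the four kernel sizes described above (16 output
# channels chosen arbitrarily); a stride of 2 would play the role of
# the strided convolutions used for downsampling.
input_block = MultiKernelBlock(3, 16, kernel_sizes=(3, 5, 7, 11))
```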
III Experimental Setup
III-A Medical Image Datasets
Our network was evaluated on three different medical image segmentation applications, namely retinal vessel segmentation, skin lesion segmentation, and chest X-ray segmentation, using five publicly available datasets. All considered datasets are equipped with manual annotations from experts that serve as the gold standard.
Regarding retinal vessel segmentation, we evaluated our network on two publicly available datasets: DRIVE [29] (https://drive.grand-challenge.org/) and CHASE [30] (https://blogs.kingston.ac.uk/retinal/chasedb1/). The DRIVE dataset was created by a diabetic retinopathy screening program in the Netherlands. It comprises 40 color images, 20 for training and 20 for testing, each with a resolution of 584×565 pixels. Only 7 of the 40 images show signs of mild early retinopathy. A binary field-of-view (FOV) mask is associated with each individual image.
The CHASE dataset includes 28 color images from 14 British schoolchildren, each acquired with a 30-degree FOV centered on the optic disc at a resolution of 999×960 pixels. The gold standard consists of two different manual segmentation maps; for our experiments, we used the first expert annotation. No dedicated training or test split is provided with the CHASE dataset, so we used the first 20 images for training and the last 8 for testing. This is consistent with the approach taken by others [31].
For the skin lesion segmentation experiments, we used the ISBI 2016 Skin Lesion Challenge dataset [32] (https://challenge.kitware.com/#phase/566744dccad3a56fac786787) and the PH2 dataset [33] (https://www.fc.up.pt/addi/ph2%20database.html). The ISBI 2016 dataset is part of the “Skin Lesion Analysis Towards Melanoma Detection” challenge and contains 900 images of varying dimensions and formats. The PH2 dataset contains 200 dermoscopic images of 768×560 pixels each, acquired at the Hospital Pedro Hispano, Matosinhos, Portugal.
Finally, for the chest X-ray segmentation task, we used the Montgomery County chest X-ray dataset (MC) [34] (https://lhncbc.nlm.nih.gov/LHC-downloads/downloads.html). This is a lung segmentation dataset consisting of 138 frontal chest X-ray images, acquired by the Montgomery County tuberculosis program in Maryland, USA. It comprises 58 tuberculosis and 80 normal cases covering a wide range of abnormalities. A Eureka stationary X-ray machine was used to acquire the images, yielding high-resolution imagery of either 4020×4892 or 4892×4020 pixels.
III-B Augmentation and Training
As the datasets used in the retinal vessel segmentation experiment are relatively small, we applied data augmentation. Specifically, each training image was rotated in one-degree increments and its brightness was randomly adjusted, both higher and lower, resulting in 7,600 images for the DRIVE and CHASE datasets. For the skin segmentation experiment, we followed a protocol [35] in which the 900 images of the ISBI 2016 dataset are used for training and the 200 images of the PH2 dataset for testing. For the chest X-ray segmentation experiment, we followed others [36] and used 80 images for training and the remaining images for testing.
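A minimal sketch of this augmentation follows, assuming rotation in one-degree increments over the full circle and a ±20% brightness range; the paper does not state the exact rotation range or brightness bounds, so both are assumptions.

```python
import random
from PIL import Image, ImageEnhance

def augment(image: Image.Image):
    """Yield rotated, brightness-jittered copies of one training image
    (sketch of the augmentation protocol described above)."""
    for angle in range(360):             # one-degree rotation increments
        rotated = image.rotate(angle)
        # random brightness shift, both higher and lower than the original
        factor = random.uniform(0.8, 1.2)  # assumed jitter bounds
        yield ImageEnhance.Brightness(rotated).enhance(factor)
```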
For all our experiments we used a weighted cross-entropy loss and trained MKIS-Net with adaptive moment estimation (Adam) optimization [37], an initial learning rate of 0.001, and an exponential decay rate of 0.9. The maximum number of iterations during the training phase was set to 10. While different methods can be used for assigning loss weights, we computed the class association weights using median frequency balancing [14].
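As a sketch of the class weighting, the following computes median frequency balancing weights in the sense of [14] (the median class frequency divided by each class frequency, with frequencies taken over images in which the class appears) and plugs them into a weighted cross-entropy loss. Function and variable names are ours; this is one common formulation, not necessarily the authors' exact implementation.

```python
import torch
import torch.nn as nn

def median_frequency_weights(masks, num_classes=2):
    """Median frequency balancing: w_c = median(f) / f_c, where f_c is the
    average pixel frequency of class c over images containing class c."""
    freqs = []
    for c in range(num_classes):
        per_image = [(m == c).float().mean() for m in masks if (m == c).any()]
        freqs.append(torch.stack(per_image).mean())
    freqs = torch.stack(freqs)
    return freqs.median() / freqs

# Usage sketch: `train_masks` is a list of H x W integer label maps.
# weights = median_frequency_weights(train_masks)
# criterion = nn.CrossEntropyLoss(weight=weights)
```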
III-C Evaluation Criteria
The manually annotated gold-standard segmentation maps are binary, with each pixel corresponding to either an object of interest (retinal vessels, skin lesions, abnormalities) or the background. Thus, for each pixel in an output image, there are four possible outcomes: the pixel belongs to an object of interest and has been correctly predicted as such (true positive, TP), it belongs to the background and has been correctly predicted as such (true negative, TN), or it has been incorrectly predicted as an object pixel (false positive, FP) or as a background pixel (false negative, FN). Using these four categories of pixels, we computed common measures of performance [38, 39], depending on the experiment: sensitivity (Se), specificity (Sp), accuracy (Acc), and the Dice/F1-score (F1). For the skin lesion segmentation experiment, we used F1 and the Jaccard coefficient (Jacc) so as to allow direct comparison with methods in the literature [40]. In the retinal vessel segmentation experiment, we also used the area under the receiver operating characteristic curve (AUC), as the datasets have an unbalanced distribution of positive and negative classes, and AUC has been found to be a good measure of how well a model separates those classes in segmentation problems [41].
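For reference, here is a minimal sketch of these pixel-wise measures computed from a binary prediction and gold-standard mask; the AUC would instead be computed from the soft probability map, e.g., with sklearn.metrics.roc_auc_score.

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Pixel-wise measures from boolean masks of equal shape (sketch)."""
    tp = np.logical_and(pred, gt).sum()
    tn = np.logical_and(~pred, ~gt).sum()
    fp = np.logical_and(pred, ~gt).sum()
    fn = np.logical_and(~pred, gt).sum()
    se = tp / (tp + fn)                    # sensitivity
    sp = tn / (tn + fp)                    # specificity
    acc = (tp + tn) / (tp + tn + fp + fn)  # accuracy
    f1 = 2 * tp / (2 * tp + fp + fn)       # Dice / F1-score
    jacc = tp / (tp + fp + fn)             # Jaccard coefficient
    return se, sp, acc, f1, jacc
```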
IV Results and Comparison
IV-A Comparison with State-of-the-Art Methods
We compared our method both qualitatively and quantitatively with existing state-of-the-art methods in the medical image segmentation literature. First, we evaluated MKIS-Net and alternatives such as SegNet [31] and U-Net [42] on the DRIVE and CHASE datasets for retinal vessel segmentation. The qualitative results (Figs. 2 and 3) and quantitative results (Tables I and II) show that our method is quite competitive, despite having a comparatively small number of learnable parameters (Table I). In fact, it outperforms the alternatives on the DRIVE dataset and is the best performer in terms of accuracy and specificity on the CHASE dataset. U-Net [42] achieves a higher sensitivity than MKIS-Net on CHASE, with our network coming second.
Next, we evaluated MKIS-Net on the PH2+ISBI 2016 skin lesion segmentation dataset. From the qualitative results (Fig. 4) we see that our method works well for a wide range of lesion sizes, shapes, colours, and textures. The quantitative results (Table III) indicate that MKIS-Net achieved the best results in terms of F1 and Jaccard compared to other state-of-the-art segmentation methods.
Finally, we evaluated our method on the MC chest X-ray dataset. The qualitative results (Fig. 5) show that MKIS-Net also performs well on these high-resolution images, where the task is to segment large regions of interest. The quantitative results (Table IV) show that our network achieved results comparable to X-RayNet-1 [43] and outperformed other state-of-the-art networks, despite being far smaller (0.152M trainable parameters versus 9.2M for X-RayNet-1).
IV-B Comparison with Light-Weight Architectures
We also compared MKIS-Net both qualitatively and quantitatively to light-weight networks recently presented in the literature, using the DRIVE and CHASE retinal vessel segmentation datasets. The qualitative results (Figs. 6 and 7) comparing MKIS-Net with ERFNet [20] and M2U-Net [44] are consistent with the findings of the previous subsection: our network delivers good segmentation results, preserving detail and coping with complex, thin vessel structures. We also performed a quantitative comparison (Table V) in terms of model size (in MB), number of parameters (Params), multiply-add operations in billions (MAdds), accuracy, and F1 score, so as to gauge the computational requirements against segmentation performance. Note that MKIS-Net yields more than a 2% improvement in segmentation performance while having more than 3.5 times fewer parameters than the second-smallest alternative (M2U-Net) in this comparison.
Table I: Comparison with state-of-the-art methods on the DRIVE dataset.
Method | Se | Sp | Acc | AUC | F1 | Params (M) |
---|---|---|---|---|---|---|
VessNet [45] | 0.8022 | 0.9810 | 0.9655 | 0.9820 | N/A | 9 |
U-Net++ [46] | 0.8116 | 0.9823 | 0.9673 | 0.9815 | 0.8126 | N/A |
MultiResUNet [47] | 0.7900 | 0.9848 | 0.9678 | 0.9784 | 0.8108 | N/A |
DRIU [48] | 0.7855 | 0.9799 | 0.9552 | 0.9793 | 0.8220 | 7.8 |
Patch BTS-DSN [49] | 0.7891 | 0.9804 | 0.9561 | 0.9806 | 0.8249 | 7.8 |
Image BTS-DSN [49] | 0.7800 | 0.9806 | 0.9551 | 0.9796 | 0.8208 | 7.8 |
U-Net [42] | 0.7849 | 0.9802 | 0.9554 | 0.9761 | 0.8175 | 3.4 |
Vessel-Net [50] | 0.8038 | 0.9802 | 0.9578 | 0.9821 | N/A | 1.7 |
MS-NFN [51] | 0.7844 | 0.9819 | 0.9567 | 0.9807 | N/A | 0.4 |
FCN [52] | 0.8039 | 0.9804 | 0.9576 | 0.9821 | N/A | 0.2 |
T-Net[23] | 0.8262 | 0.9862 | 0.9697 | 0.9867 | 0.8269 | 0.03 |
MKIS-Net (Proposed) | 0.8338 | 0.9828 | 0.9697 | 0.9827 | 0.8283 | 0.152 |
Table II: Comparison with state-of-the-art methods on the CHASE dataset.
Method | Se | Sp | Acc | AUC | F1 |
---|---|---|---|---|---|
MS-NFN [51] | 0.7538 | 0.9847 | 0.9637 | 0.9825 | N/A |
Three-stage FCN [53] | 0.7641 | 0.9806 | 0.9607 | 0.9776 | N/A |
BTS-DSN [49] | 0.7888 | 0.9801 | 0.9627 | 0.9840 | 0.7983 |
Vessel-Net [50] | 0.8132 | 0.9814 | 0.9661 | 0.9860 | N/A |
DEU-Net [54] | 0.8074 | 0.9821 | 0.9661 | 0.9812 | 0.8037 |
SegNet [31] | 0.8190 | 0.9735 | 0.9638 | 0.9780 | 0.7981 |
U-Net [42] | 0.8355 | 0.9698 | 0.9578 | 0.9784 | 0.7792 |
T-Net [23] | 0.8323 | 0.9844 | 0.9739 | 0.9889 | 0.8143 |
MKIS-Net (Proposed) | 0.8266 | 0.9848 | 0.9740 | 0.9863 | 0.8137 |
Furthermore, we compared MKIS-Net to MobileNet-V3-Small [26]. So far, we have focused on benchmarks and state-of-the-art segmentation methods primarily used in the medical image segmentation literature; to our knowledge, MobileNet-V3-Small has not been applied to medical imaging before. We present sample visual results of MKIS-Net and MobileNet-V3-Small on the DRIVE and CHASE datasets (Figs. 8 and 9); a quantitative comparison of MKIS-Net with MobileNet-V3-Small, T-Net, M2U-Net, and ERFNet on the PH2 dataset (Table VI), with sample visual results from MKIS-Net, MobileNet-V3-Small, M2U-Net, and ERFNet (Fig. 10); and a quantitative comparison of MKIS-Net with MobileNet-V3-Small, M2U-Net, and ERFNet on the MC dataset (Table VII), with sample visual results from these networks (Fig. 11). Overall, we conclude that MKIS-Net and T-Net performed comparably well on the DRIVE and CHASE datasets, while MKIS-Net performed better than T-Net on the PH2 and MC datasets.
Table III: Comparison with state-of-the-art methods on skin lesion segmentation (trained on ISBI 2016, tested on PH2).
Method | F1 | Jacc | Params (M) |
---|---|---|---|
SCDRR [55] | 0.8600 | 0.7600 | N/A |
MSCA [56] | 0.8157 | 0.7233 | N/A |
JCLMM [57] | 0.8285 | N/A | N/A |
FCN+BPB+SBE [58] | 0.9184 | 0.8430 | 8 |
Multistage FCN[35] | 0.9066 | 0.8399 | 10 |
T-Net[23] | 0.9282 | 0.8696 | 0.03 |
MKIS-Net (Proposed) | 0.9301 | 0.8707 | 0.152 |
Table IV: Comparison with state-of-the-art methods on the MC chest X-ray dataset.
Method | Acc | Jacc | F1 | Params (M) |
---|---|---|---|---|
BN [59] | 0.7700 | N/A | N/A | N/A |
MLP [59] | 0.7900 | N/A | N/A | N/A |
RF [59] | 0.8100 | N/A | N/A | N/A |
Vote [59] | 0.8300 | N/A | N/A | N/A |
Souza et al. [36] | 0.9697 | 0.8870 | 0.9697 | N/A |
X-RayNet-2 [43] | 0.9872 | 0.9496 | 0.9740 | 2.39 |
X-RayNet-1 [43] | 0.9911 | 0.9636 | 0.9814 | 9.2 |
MKIS-Net (Proposed) | 0.9908 | 0.9627 | 0.9810 | 0.152 |
V Conclusions
In this paper, we have presented MKIS-Net, a convolutional neural network for medical image segmentation that is small in comparison to alternatives in the scientific literature. It employs a multiscale kernel structure to produce an effective receptive field and improved segmentation performance. Our network is well suited to devices with limited resources while supporting high-resolution medical image datasets. We have illustrated its utility on retinal vessel, skin lesion, and chest X-ray segmentation tasks. As the experimental results show, MKIS-Net is quite competitive, often outperforming much larger, state-of-the-art networks, and it also outperforms other light-weight alternatives.
Table V: Comparison with light-weight architectures on the DRIVE and CHASE datasets.
Model | Size (MB) | Params (M) | DRIVE MAdds (B) | DRIVE F1 | DRIVE Acc | CHASE MAdds (B) | CHASE F1 | CHASE Acc
---|---|---|---|---|---|---|---|---
ERFNet | 8.0 | 2.06 | 2.8 | 0.8022 | 0.9598 | 51.3 | 0.7994 | 0.9716
M2U-Net | 2.2 | 0.55 | 1.4 | 0.8091 | 0.9627 | 4.4 | 0.8006 | 0.9446
MKIS-Net | 0.6 | 0.152 | 0.05 | 0.8283 | 0.9697 | 0.05 | 0.8137 | 0.9740
Table VI: Comparison with light-weight architectures on the PH2 dataset.
Method | F1 | Jacc | Params (M) |
---|---|---|---|
MobileNet-V3-Small [26] | 0.8675 | 0.7762 | 0.47 |
M2U-Net [44] | 0.9179 | 0.8527 | 0.55 |
ERFNet [20] | 0.9083 | 0.8370 | 2.06 |
T-Net[23] | 0.9282 | 0.8696 | 0.03 |
MKIS-Net | 0.9301 | 0.8607 | 0.152 |
Table VII: Comparison with light-weight architectures on the MC dataset.
Method | Acc | Jacc | F1 | Params (M)
---|---|---|---|---
MobileNet-V3-Small [26] | 0.9826 | 0.9236 | 0.9576 | 0.47
M2U-Net [44] | 0.9906 | 0.9615 | 0.9803 | 0.55
ERFNet [20] | 0.9900 | 0.9590 | 0.9791 | 2.06
MKIS-Net | 0.9908 | 0.9627 | 0.9810 | 0.152
Acknowledgment
The work presented here was done while Antonio Robles-Kelly was with Deakin University, Waurn Ponds, Australia.
References
- [1] T. Lei, R. Wang, Y. Wan, B. Zhang, H. Meng, A. K. Nandi, Medical image segmentation using deep learning: A survey (2020). arXiv:2009.13120.
- [2] T. M. Khan, A. Robles-Kelly, S. S. Naqvi, A semantically flexible feature fusion network for retinal vessel segmentation, in: International Conference on Neural Information Processing (ICONIP), Springer, Cham, 2020, pp. 159–167.
- [3] T. M. Khan, M. A. Khan, N. U. Rehman, K. Naveed, I. U. Afridi, S. S. Naqvi, I. Raazak, Width-wise vessel bifurcation for improved retinal vessel segmentation, Biomedical Signal Processing and Control 71 (2022) 103169.
- [4] T. M. Khan, S. S. Naqvi, M. Arsalan, M. A. Khan, H. A. Khan, A. Haider, Exploiting residual edge information in deep fully convolutional neural networks for retinal vessel segmentation, in: 2020 International Joint Conference on Neural Networks (IJCNN), IEEE, 2020, pp. 1–8.
- [5] T. M. Khan, M. Alhussein, K. Aurangzeb, M. Arsalan, S. S. Naqvi, S. J. Nawaz, Residual connection-based encoder decoder network (rced-net) for retinal vessel segmentation, IEEE Access 8 (2020) 131257–131272.
- [6] M. Tabassum, T. M. Khan, M. Arsalan, S. S. Naqvi, M. Ahmed, H. A. Madni, J. Mirza, CDED-Net: Joint segmentation of optic disc and optic cup for glaucoma screening, IEEE Access 8 (2020) 102733–102747.
- [7] T. M. Khan, M. Mehmood, S. S. Naqvi, M. F. U. Butt, A region growing and local adaptive thresholding-based optic disc detection, Plos one 15 (1) (2020) e0227566.
- [8] R. Imtiaz, T. M. Khan, S. S. Naqvi, M. Arsalan, S. J. Nawaz, Screening of glaucoma disease from retinal vessel images using semantic segmentation, Computers & Electrical Engineering 91 (2021) 107036.
- [9] Z. U. Rehman, S. S. Naqvi, T. M. Khan, M. A. Khan, T. Bashir, Fully automated multi-parametric brain tumour segmentation using superpixel based classification, Expert Systems with Applications 118 (2019) 598–613.
- [10] B. Ait Skourt, A. El Hassani, A. Majda, Lung CT image segmentation using deep neural networks, Procedia Computer Science 127 (2018) 109–113.
- [11] A. Topiwala, L. Al-Zogbi, T. Fleiter, A. Krieger, Adaptation and evaluation of deep learning techniques for skin segmentation on novel abdominal dataset, in: IEEE International Conference on Bioinformatics and Bioengineering (BIBE), 2019, pp. 752–759.
- [12] J. Long, E. Shelhamer, T. Darrell, Fully convolutional networks for semantic segmentation, in: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2015, pp. 3431–3440.
- [13] O. Ronneberger, P. Fischer, T. Brox, U-Net: Convolutional networks for biomedical image segmentation, in: International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2015, pp. 234–241.
- [14] V. Badrinarayanan, A. Kendall, R. Cipolla, SegNet: A deep convolutional encoder-decoder architecture for image segmentation, IEEE Transactions on Pattern Analysis and Machine Intelligence 39 (12) (2017) 2481–2495.
- [15] M. Rastegari, V. Ordonez, J. Redmon, A. Farhadi, XNOR-Net: Imagenet classification using binary convolutional neural networks, in: B. Leibe, J. Matas, N. Sebe, M. Welling (Eds.), European Conference on Computer Vision (ECCV), Springer International Publishing, Cham, 2016, pp. 525–542.
- [16] M. Courbariaux, I. Hubara, D. Soudry, R. El-Yaniv, Y. Bengio, Binarized neural networks: Training deep neural networks with weights and activations constrained to +1 or -1 (2016). arXiv:1602.02830.
- [17] C. Li, C. J. R. Shi, Constrained optimization based low-rank approximation of deep neural networks, in: European Conference on Computer Vision (ECCV), 2018, p. 746–761.
- [18] A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, H. Adam, MobileNets: Efficient convolutional neural networks for mobile vision applications (2017). arXiv:1704.04861.
- [19] X. Zhang, X. Zhou, M. Lin, J. Sun, ShuffleNet: An extremely efficient convolutional neural network for mobile devices (2017). arXiv:1707.01083.
- [20] E. Romera, J. M. Álvarez, L. M. Bergasa, R. Arroyo, ERFNet: Efficient residual factorized ConvNet for real-time semantic segmentation, IEEE Transactions on Intelligent Transportation Systems 19 (1) (2018) 263–272.
- [21] M. Wang, B. Liu, H. Foroosh, Factorized convolutional neural networks, in: IEEE International Conference on Computer Vision Workshops (ICCVW), 2017, pp. 545–553.
- [22] T. M. Khan, A. Robles-Kelly, S. S. Naqvi, RC-Net: A convolutional neural network for retinal vessel segmentation, in: Digital Image Computing: Techniques and Applications (DICTA), IEEE, 2021, pp. 01–07.
- [23] T. M. Khan, A. Robles-Kelly, S. S. Naqvi, T-Net: A resource-constrained tiny convolutional neural network for medical image segmentation, in: IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), 2022, pp. 644–653.
- [24] N. Ma, X. Zhang, H.-T. Zheng, J. Sun, ShuffleNet V2: Practical guidelines for efficient CNN architecture design, in: European Conference on Computer Vision (ECCV), 2018, p. 122–138.
- [25] L. Zhang, J. Wu, T. Wang, A. Borji, G. Wei, H. Lu, A multistage refinement network for salient object detection, IEEE Transactions on Image Processing 29 (2020) 3534–3545.
- [26] A. Howard, M. Sandler, G. Chu, L.-C. Chen, B. Chen, M. Tan, W. Wang, Y. Zhu, R. Pang, V. Vasudevan, Q. V. Le, H. Adam, Searching for MobileNetV3, in: IEEE/CVF International Conference on Computer Vision (ICCV), 2019, pp. 1314–1324.
- [27] N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, R. Salakhutdinov, Dropout: A simple way to prevent neural networks from overfitting, Journal of Machine Learning Research 15 (56) (2014) 1929–1958.
- [28] M. R. Islam, D. Massicotte, F. Nougarou, P. Massicotte, W. P. Zhu, S-Convnet: A shallow convolutional neural network architecture for neuromuscular activity recognition using instantaneous high-density surface EMG images, in: Annual International Conference of the IEEE Engineering in Medicine Biology Society (EMBC), 2020, pp. 744–749.
- [29] J. Staal, M. D. Abramoff, M. Niemeijer, M. A. Viergever, B. van Ginneken, Ridge-based vessel segmentation in color images of the retina, IEEE Transactions on Medical Imaging 23 (4) (2004) 501–509.
- [30] M. M. Fraz, S. Barman, P. Remagnino, A. Hoppe, A. Basit, B. Uyyanonvara, A. R. Rudnicka, C. G. Owen, An approach to localize the retinal blood vessels using bit planes and centerline detection, Computer Methods and Programs in Biomedicine 108 (2) (2012) 600 – 616.
- [31] T. M. Khan, M. Alhussein, K. Aurangzeb, M. Arsalan, S. S. Naqvi, S. J. Nawaz, Residual connection based encoder decoder network (RCED-Net) for retinal vessel segmentation, IEEE Access 8 (2020) 131257–131272.
- [32] N. C. F. Codella, D. Gutman, M. E. Celebi, B. Helba, M. A. Marchetti, S. W. Dusza, A. Kalloo, K. Liopyris, N. Mishra, H. Kittler, A. Halpern, Skin lesion analysis toward melanoma detection: A challenge at the 2017 International Symposium on Biomedical Imaging (ISBI), hosted by the International Skin Imaging Collaboration (ISIC), in: IEEE International Symposium on Biomedical Imaging (ISBI), 2018, pp. 168–172.
- [33] T. Mendonça, P. Ferreira, J. Marques, A. Marçal, J. Rozeira, PH2 – a dermoscopic image database for research and benchmarking, in: Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBS), 2013, pp. 5437–5440.
- [34] S. Jaeger, S. Candemir, S. K. Antani, Y.-X. J. Wáng, P.-X. Lu, G. R. Thoma, Two public chest X-ray datasets for computer-aided screening of pulmonary diseases, Quantitative Imaging in Medicine and Surgery 4 (2014) 475–477.
- [35] L. Bi, J. Kim, E. Ahn, A. Kumar, M. Fulham, D. Feng, Dermoscopic image segmentation via multistage fully convolutional networks, IEEE Transactions on Biomedical Engineering 64 (2017) 2065–2074.
- [36] J. C. Souza, J. O. Bandeira Diniz, J. L. Ferreira, G. L. França da Silva, A. Corrêa Silva, A. C. de Paiva, An automatic method for lung segmentation and reconstruction in chest X-ray using deep neural networks, Computer methods and programs in biomedicine 177 (2019) 285–296.
- [37] D. P. Kingma, J. Ba, Adam: A method for stochastic optimization, arXiv preprint arXiv:1412.6980 (2014).
- [38] A. A. Taha, A. Hanbury, Metrics for evaluating 3D medical image segmentation: analysis, selection, and tool, BMC Medical Imaging 15 (2015) 29.
- [39] A. Reinke, L. Maier-Hein, P. Jäger, et al., Metrics reloaded: A new recommendation framework for biomedical image analysis validation, in: Medical Imaging with Deep Learning (MIDL), 2022.
- [40] D. Fan, G. Ji, T. Zhou, G. Chen, H. Fu, J. Shen, L. Shao, PraNet: Parallel reverse attention network for polyp segmentation, in: International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2020, pp. 263–273.
- [41] Q. Li, B. Feng, L. Xie, P. Liang, H. Zhang, T. Wang, A cross-modality learning approach for vessel segmentation in retinal images, IEEE Transactions on Medical Imaging 35 (2016) 109–118.
- [42] S. Guo, DPN: Detail-preserving network with high resolution representation for efficient segmentation of retinal vessels (2020). arXiv:2009.12053.
- [43] M. Arsalan, M. Owais, T. Mahmood, J. Choi, K. R. Park, Artificial intelligence-based diagnosis of cardiac and related diseases, Journal of Clinical Medicine 9 (2020).
- [44] T. Laibacher, T. Weyde, S. Jalali, M2U-Net: Effective and efficient retinal vessel segmentation for real-world applications, in: IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), 2019, pp. 115–124.
- [45] M. Arsalan, M. Owais, T. Mahmood, S. W. Cho, K. R. Park, Aiding the diagnosis of diabetic and hypertensive retinopathy using artificial intelligence-based semantic segmentation, Journal of Clinical Medicine 8 (2019).
- [46] Z. Zhou, M. M. Rahman Siddiquee, N. Tajbakhsh, J. Liang, UNet++: A nested U-net architecture for medical image segmentation, in: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer International Publishing, Cham, 2018, pp. 3–11.
- [47] N. Ibtehaz, M. S. Rahman, MultiResUNet : Rethinking the U-Net architecture for multimodal biomedical image segmentation, Neural Networks 121 (2020) 74–87.
- [48] K. Maninis, J. Pont-Tuset, P. Arbelaez, L. Van Gool, Deep retinal image understanding, in: International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI), 2016, pp. 140–148.
- [49] S. Guo, K. Wang, H. Kang, Y. Zhang, Y. Gao, T. Li, BTS-DSN: Deeply supervised neural network with short connections for retinal vessel segmentation, International Journal of Medical Informatics 126 (2019) 105–113.
- [50] Y. Wu, Y. Xia, Y. Song, D. Zhang, D. Liu, C. Zhang, W. Cai, Vessel-Net: Retinal vessel segmentation under multi-path supervision, in: International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2019, pp. 264–272.
- [51] Y. Wu, Y. Xia, Y. Song, Y. Zhang, W. Cai, Multiscale network followed network model for retinal vessel segmentation, in: International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2018, pp. 119–126.
- [52] A. Oliveira, S. Pereira, C. A. Silva, Retinal vessel segmentation based on fully convolutional neural networks, Expert Systems with Applications 112 (2018) 229–242.
- [53] Z. Yan, X. Yang, K. Cheng, A three-stage deep learning model for accurate retinal vessel segmentation, IEEE Journal of Biomedical and Health Informatics 23 (2019) 1427–1436.
- [54] B. Wang, S. Qiu, H. He, Dual encoding U-Net for retinal vessel segmentation, in: International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI), 2019, pp. 84–92.
- [55] B. Bozorgtabar, M. Abedini, R. Garnavi, Sparse coding based skin lesion segmentation using dynamic rule-based refinement, in: Machine Learning in Medical Imaging (MLMI), 2016, pp. 254–261.
- [56] L. Bi, J. Kim, E. Ahn, D. Feng, M. Fulham, Automated skin lesion segmentation via image-wise supervised learning and multi-scale superpixel based cellular automata, in: IEEE International Symposium on Biomedical Imaging (ISBI), 2016, pp. 1059–1062.
- [57] A. Roy, A. Pal, U. Garain, JCLMM: A finite mixture model for clustering of circular-linear data and its application to psoriatic plaque segmentation, Pattern Recognition 66 (2017) 160–173.
- [58] H. J. Lee, J. U. Kim, S. Lee, H. G. Kim, Y. M. Ro, Structure boundary preserving segmentation for medical image with ambiguous boundary, in: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2020, pp. 4816–4825.
- [59] K. C. Santosh, S. Antani, Automated chest X-ray screening: Can lung region symmetry help detect pulmonary abnormalities?, IEEE Transactions on Medical Imaging 37 (2018) 1168–1177.