
1 Research Institute of Trustworthy Autonomous Systems and Department of Computer Science and Technology, Southern University of Science and Technology, Shenzhen, 518055, China
[email protected]
2 Department of Ophthalmology, Shenzhen People’s Hospital (The Second Clinical Medical College, Jinan University; The First Affiliated Hospital, Southern University of Science and Technology)
3 Guangdong Provincial Key Laboratory of Brain-inspired Intelligent Computation, Department of Computer Science and Engineering, Southern University of Science and Technology, Shenzhen, 518055, China

SuperVessel: Segmenting High-resolution Vessel from Low-resolution Retinal Image

Yan Hu1 (corresponding author), Zhongxi Qiu1, Dan Zeng1, Li Jiang2, Chen Lin2, Jiang Liu1,3
Abstract

Vascular segmentation extracts blood vessels from images and serves as the basis for diagnosing various diseases, such as ophthalmic diseases. Ophthalmologists often require high-resolution segmentation results for analysis, which imposes a heavy computational load on most existing methods; if these methods instead use low-resolution input, they easily miss tiny vessels or produce discontinuous segmented vessels. To solve these problems, this paper proposes SuperVessel, an algorithm that produces high-resolution and accurate vessel segmentation from low-resolution input images. We first take super-resolution as an auxiliary branch that provides potential high-resolution detail features and can be removed in the test phase. Second, we propose two modules to enhance the features of the segmentation regions of interest: an upsampling with feature decomposition (UFD) module and a feature interaction module (FIM) with a constraining loss that focuses on the features of interest. Extensive experiments on three publicly available datasets demonstrate that SuperVessel segments more tiny vessels, with segmentation accuracy (IoU) over 6% higher than other state-of-the-art algorithms, and exhibits stronger stability. We will release the code after the paper is published.

Keywords:
Vessel Segmentation · Super-resolution · Multi-task Learning · Retinal Image.

1 Introduction

Retinal images are widely adopted as effective tools for the diagnosis and therapy of various diseases. Visual exploration of retinal blood vessels assists ophthalmologists in diagnosing a variety of ocular abnormalities, such as diabetic retinopathy, glaucoma, and age-related macular degeneration. Researchers have also shown that changes in retinal vessels can serve as an early screening indicator for some brain diseases [11], cardiovascular diseases [5], and systemic diseases [16]. Retinal vessel segmentation is a fundamental step in retinal image analysis, and identifying vessel structures in high-resolution images greatly facilitates precise disease diagnosis. However, segmenting vessels manually is a heavy, time-consuming burden for doctors, since it requires specific medical training and technical expertise.

In recent years, researchers have proposed many automatic vessel segmentation algorithms based on deep learning to lighten the burden on doctors. These methods learn from raw image data without handcrafted features. Ronneberger et al. [14] proposed the U-shaped network (U-Net) for biomedical image segmentation, which has become a popular neural network architecture owing to its promising results. Many U-Net variants have been proposed for vessel segmentation tasks: for example, Fu et al. [3] adopted a CRF to gather multi-stage feature maps and improve vessel detection performance, while other researchers proposed stacking multiple U-Net-shaped architectures [8], feeding image patches into a U-Net architecture [17], introducing a multi-scale input layer into the conventional U-Net [15], or cascading a backbone residual dense network with a fine-tuning tail network [6]. To limit the computational load, existing algorithms often output low-resolution vessel segmentation results, or directly upsampled results with discontinuous vessels, which cannot satisfy the requirements of ophthalmologists: analyzing diseases such as branch retinal vein occlusion (BRVO) calls for high-resolution continuous vessels, and high-resolution (HR) images provide more details, such as tiny vessels.

Recently, Wang et al. [18] proposed a patch-free 3D brain tumor segmentation method driven by a super-resolution technique. An HR 3D patch is needed to guide segmentation and super-resolution during training, which may increase the computational complexity. In natural image segmentation, researchers have proposed auxiliary super-resolution tasks [7, 19] that use the feature loss between the segmentation and super-resolution branches to guide task fusion. However, constraining only whole-image feature similarity cannot provide effective features for vessel segmentation, as vessels occupy a relatively small proportion of the entire image.

Therefore, to solve the above problems, we propose to output high-resolution vessel segmentation based only on low-resolution images, supplying doctors with clear vessels for accurate diagnosis. We further improve vessel segmentation accuracy by focusing on the vessel regions of interest through effective feature interaction. The contributions are as follows: 1) We propose a novel dual-stream learning algorithm that combines segmentation and super-resolution to produce high-resolution vessel segmentation from a low-resolution input. 2) We emphasize the features of interest in two ways: an upsampling with feature decomposition (UFD) module and a feature interaction module (FIM) with a new constraint loss, which together extract the spatial correlation between the decomposed features and the super-resolution features. 3) We demonstrate the efficacy of the proposed SuperVessel on three publicly available datasets in comparison with other state-of-the-art algorithms.

2 Methodology

The pipeline of our proposed SuperVessel is illustrated in Fig. 1. Given a retinal image $X$ of size $H\times W$, we first downsample the image by a factor of $n$ to simulate a low-resolution image, which is adopted as the input of the whole framework. To reconstruct more appealing vessels, we propose two modules with a new loss function: an upsampling with feature decomposition (UFD) module, which separates vessel and background features into different channels, and a feature interaction module (FIM), which emphasizes the vessel features by optimizing the feature interaction between the UFD module and the super-resolution branch. In the testing phase (the light green box in Fig. 1), only the vessel segmentation branch is used to produce high-resolution vessel segmentation results without extra computational load.
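A minimal sketch of the input-simulation step, assuming bilinear downsampling (the paper does not state the interpolation kernel, so the `mode` choice is an assumption):

```python
import torch
import torch.nn.functional as F

def simulate_low_resolution(image: torch.Tensor, n: int = 2) -> torch.Tensor:
    """Downsample a high-resolution image tensor (B, C, H, W) by a factor n."""
    h, w = image.shape[-2:]
    return F.interpolate(image, size=(h // n, w // n),
                         mode="bilinear", align_corners=False)
```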

Figure 1: The pipeline of the proposed SuperVessel framework. For the test phase, only the vessel segmentation branch is adopted, as shown in the light green box.

2.1 SuperVessel Framework

Decoded features upsampled only by bilinear interpolation cannot bring any additional information, since the input is a low-resolution image. Thus, in our SuperVessel framework we adopt super-resolution as an auxiliary network for vessel segmentation to provide more details; the super-resolution network can be removed during the test phase. The ground truth for vessel segmentation is the labeled segmentation mask at the original high resolution, and that for super-resolution is the original high-resolution image. The two branches share the same encoder-decoder backbone structure [14]: the encoder $\mathrm{E}$ is shared, and two parallel decoders $\mathrm{D_{Seg}}$ and $\mathrm{D_{SR}}$ realize vessel segmentation and super-resolution, respectively, as shown in Fig. 1. The whole structure of SuperVessel can be formulated as:

$\mathrm{O_{Seg}}=\mathrm{UFD}(\mathrm{D_{Seg}}(\mathrm{E}(X)))$ (1)
$\mathrm{O_{FIM}}=\mathrm{FIM}(\mathrm{C}(\mathrm{O_{Seg}},\mathrm{O_{SR}}))$ (2)
$\mathrm{O_{SR}}=\mathrm{D_{SR}}(\mathrm{E}(X))$ (3)

where $\mathrm{O_{Seg}}$, $\mathrm{O_{FIM}}$, and $\mathrm{O_{SR}}$ are the outputs of the vessel segmentation branch, the FIM module, and the super-resolution branch, respectively; $X$ is the input image; $\mathrm{E}(X)$ denotes the encoded features of image $X$; $\mathrm{D}$ is the corresponding decoder; and $\mathrm{C}$ is the concatenation operation.
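For concreteness, a minimal PyTorch sketch of the dual-stream forward pass in Eqs. (1)-(3) follows; `encoder`, `seg_decoder`, `sr_decoder`, `ufd`, and `fim` are placeholders for the U-Net components and modules described above, not the authors' released implementation:

```python
import torch
import torch.nn as nn

class SuperVesselSketch(nn.Module):
    """Shared encoder E with parallel decoders D_Seg and D_SR."""
    def __init__(self, encoder, seg_decoder, sr_decoder, ufd, fim):
        super().__init__()
        self.encoder = encoder            # E, shared by both branches
        self.seg_decoder = seg_decoder    # D_Seg
        self.sr_decoder = sr_decoder      # D_SR
        self.ufd = ufd                    # upsampling with feature decomposition
        self.fim = fim                    # feature interaction module

    def forward(self, x):
        # In a real U-Net, `feats` would include multi-scale skip features.
        feats = self.encoder(x)
        o_seg = self.ufd(self.seg_decoder(feats))           # Eq. (1)
        o_sr = self.sr_decoder(feats)                       # Eq. (3)
        o_fim = self.fim(torch.cat([o_seg, o_sr], dim=1))   # Eq. (2), C = concat
        return o_seg, o_fim, o_sr

    @torch.no_grad()
    def infer(self, x):
        # Test phase: only the segmentation branch is kept.
        return self.ufd(self.seg_decoder(self.encoder(x)))
```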

The loss function of our framework is defined as $\mathcal{L}=\mathcal{L}_{Seg}+\mathcal{L}_{SR}+\mathcal{L}_{FIM}$, where $\mathcal{L}_{Seg}=-\frac{1}{n}\sum_{i=0}^{n}GT_{i}\log\mathrm{O_{Seg}}_{i}$ is the loss between the UFD vessel output and the ground truth, $\mathcal{L}_{SR}(SR,HR)=\alpha(SR-HR)^{2}+(1-\alpha)(1-\mathrm{SSIM}(SR,HR))$ is the loss of the super-resolution branch, and $\mathcal{L}_{FIM}$ is the loss between the interaction vessel output and the ground truth. Here $n$ is the number of classes, $GT$ is the label, $GT_{i}$ is the ground truth of class $i$, and $\mathrm{O_{Seg}}_{i}$ is the predicted probability of class $i$ in the segmentation results; $SR$ is the predicted super-resolution image, and $HR$ is the original high-resolution image used as the ground truth.
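The two branch losses can be sketched as follows; `ssim_fn` stands in for any SSIM implementation (e.g., from the pytorch-msssim package), and the weight `alpha = 0.5` is an assumed value, since the paper does not report it:

```python
import torch.nn.functional as F

def seg_loss(o_seg_logits, gt):
    # L_Seg: pixel-wise cross-entropy between the UFD output and the
    # high-resolution segmentation mask (gt holds integer class indices).
    return F.cross_entropy(o_seg_logits, gt)

def sr_loss(sr, hr, ssim_fn, alpha=0.5):
    # L_SR = alpha * MSE + (1 - alpha) * (1 - SSIM); alpha is an assumption.
    return alpha * F.mse_loss(sr, hr) + (1 - alpha) * (1 - ssim_fn(sr, hr))
```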

2.2 Vessel Feature Enhancement

To enhance the vessel features of interest, we propose two modules: an upsampling with feature decomposition (UFD) module and a feature interaction module (FIM) with a loss $\mathcal{L}_{FIM}$. The former splits the vessel features from the background, and the latter emphasizes the vessel features by capturing the spatial relation between the segmentation and super-resolution branches.

Upsampling with Feature Decomposition (UFD) Module: Previous algorithms constrain all features of the entire image with a loss function to the same degree. However, the background is not our target, and we want the framework to focus on the vessels of interest. Thus, in SuperVessel we propose the upsampling with feature decomposition (UFD) module to split the background and vessel features; its details are shown in the light green dotted frame of Fig. 1. The construction is simple but effective: a single $1\times 1$ convolution is adopted before bilinear interpolation [13] to output decomposed features in different channels. The resulting two-channel features are then input into our interaction module to obtain a vessel interaction matrix.
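A possible reading of the UFD module in PyTorch; the input channel count and the upsampling factor are illustrative assumptions:

```python
import torch.nn as nn
import torch.nn.functional as F

class UFD(nn.Module):
    """A 1x1 conv decomposes decoder features into per-class channels
    (background / vessel) before bilinear upsampling."""
    def __init__(self, in_channels: int, num_classes: int = 2, scale: int = 2):
        super().__init__()
        self.decompose = nn.Conv2d(in_channels, num_classes, kernel_size=1)
        self.scale = scale

    def forward(self, x):
        x = self.decompose(x)  # split vessel and background into channels
        return F.interpolate(x, scale_factor=self.scale,
                             mode="bilinear", align_corners=False)
```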

Feature Interaction Module (FIM): Most algorithms fuse multiple tasks through various losses or whole-image similarities, which can neither focus on the vessel region of interest nor capture the spatial relation between the separated segmentation features and the super-resolution features. Since structural information such as vessels should correspond across the two branches, we propose a feature interaction module (FIM) to capture the spatial relation between features while focusing mainly on the vessels of interest. The detailed construction is shown in the yellow dotted frame of Fig. 1. The decomposed background and vessel features from segmentation are concatenated with the super-resolution features and fed into the FIM. A $1\times 1$ convolution with ReLU activation maps the input features into tensors of dimension $d$. The tensors are split along the channel dimension into three groups, each of dimension $d/3$, and three $3\times 3$ convolutions with dilation rates of 1, 2, and 4 capture information at different scales from the three groups. The multi-scale features are then concatenated into one tensor, and a $1\times 1$ convolution integrates the information from the different scales, so the spatial relevance can be obtained. To further emphasize the interaction between the groups, we adopt ChannelShuffle [22] to exchange information among the concatenated features output by the three dilation rates. Finally, a $1\times 1$ convolution with ReLU followed by a $1\times 1$ convolution with a Sigmoid generates the weight matrix of spatial interaction.
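A sketch of the FIM under our reading of this description; `d = 48`, the single-channel output map (broadcast over classes in Eq. (4)), and the exact placement of the shuffle relative to the integrating $1\times 1$ convolution are assumptions where the text leaves room for interpretation. For a two-channel UFD output concatenated with a 3-channel super-resolution image, `in_channels` would be 5.

```python
import torch
import torch.nn as nn

def channel_shuffle(x: torch.Tensor, groups: int) -> torch.Tensor:
    # ChannelShuffle [22]: interleave channels across groups.
    b, c, h, w = x.shape
    return x.view(b, groups, c // groups, h, w).transpose(1, 2).reshape(b, c, h, w)

class FIM(nn.Module):
    """Feature interaction module: produces a spatial interaction
    weight matrix from concatenated segmentation and SR features."""
    def __init__(self, in_channels: int, d: int = 48):
        super().__init__()
        assert d % 3 == 0
        self.proj = nn.Sequential(nn.Conv2d(in_channels, d, 1),
                                  nn.ReLU(inplace=True))
        g = d // 3
        # Three 3x3 convs with dilation rates 1, 2, 4, one per channel group.
        self.branches = nn.ModuleList(
            nn.Conv2d(g, g, 3, padding=r, dilation=r) for r in (1, 2, 4))
        self.integrate = nn.Conv2d(d, d, 1)  # fuse the multi-scale groups
        self.head = nn.Sequential(
            nn.Conv2d(d, d, 1), nn.ReLU(inplace=True),
            nn.Conv2d(d, 1, 1), nn.Sigmoid())  # spatial weight matrix

    def forward(self, x):
        x = self.proj(x)
        groups = torch.chunk(x, 3, dim=1)  # split into three groups of d/3
        x = torch.cat([branch(g) for branch, g in zip(self.branches, groups)],
                      dim=1)
        x = self.integrate(channel_shuffle(x, groups=3))
        return self.head(x)
```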

The product of the mask and the high-resolution image is often adopted to produce the region of interest, which treats the whole image as a single entity. This often introduces false similarity, especially when some vessels labeled in the mask are not clearly visible in the corresponding high-resolution image (e.g., blurry or hard to see). To solve this problem, we add the segmentation prediction to the product of the interaction matrix and the segmentation prediction, and take the high-resolution segmentation mask as the ground truth. In this way, the framework focuses on the shared region of vessel structures. The FIM loss $\mathcal{L}_{FIM}$ is thus expressed as:

$\mathcal{L}_{FIM}=-\frac{1}{n}\sum_{i=0}^{n}GT_{i}\log(\mathrm{O_{Seg}}\odot\mathrm{O_{FIM}}+\mathrm{O_{Seg}})_{i}$ (4)

where $n$ is the number of classes, $GT_{i}$ is the ground truth of class $i$, $\mathrm{O_{Seg}}$ is the output of the segmentation, $\mathrm{O_{FIM}}$ is the output of the FIM, and $(\mathrm{O_{Seg}}\odot\mathrm{O_{FIM}}+\mathrm{O_{Seg}})_{i}$ is the probability of class $i$.
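A hedged implementation of Eq. (4); the epsilon inside the logarithm is our numerical-stability addition, not part of the paper:

```python
import torch
import torch.nn.functional as F

def fim_loss(o_seg_probs: torch.Tensor, o_fim: torch.Tensor,
             gt: torch.Tensor) -> torch.Tensor:
    """Eq. (4): cross-entropy on O_Seg * O_FIM + O_Seg.
    o_seg_probs: per-class probabilities (B, C, H, W);
    o_fim: (B, 1, H, W) interaction weight matrix;
    gt: integer class labels (B, H, W)."""
    reweighted = o_seg_probs * o_fim + o_seg_probs  # emphasize vessel regions
    return F.nll_loss(torch.log(reweighted + 1e-8), gt)
```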

3 Experiments

The vessel segmentation branch of SuperVessel is used to conduct the following experiments. In this section, we first introduce the datasets, evaluation metrics, and experimental settings. Then we present the ablation study. Finally, we evaluate the performance of SuperVessel against other state-of-the-art methods.

Datasets: We evaluated SuperVessel on three retinal imaging modalities from three publicly available datasets: color fundus photography (HRF) [1], OCTA (OCTA-6M) [9], and ultra-widefield retinal imaging (PRIME-FP20) [2]:

HRF: The dataset contains 45 color fundus images from healthy subjects and patients with diabetic retinopathy or glaucoma. The image size is $3504\times 2336$. 30 images are used for training and the remaining 15 for testing.

OCTA-6M: The dataset contains images of 300 subjects from OCTA-500 [9]. OCTA (optical coherence tomography angiography) is a novel non-invasive imaging modality that visualizes human retinal vascular details. The field of view is $6\,\mathrm{mm}\times 6\,\mathrm{mm}$, with a resolution of $400\times 400$ pixels. We use the first 240 subjects to train the model and the remaining 60 subjects for testing.

PRIME-FP20: The dataset provides 15 high-resolution ultra-widefield (UWF) fundus photography (FP) images captured with an Optos 200Tx camera. All images have the same resolution of $4000\times 4000$ pixels. The first 10 images are used for training and the rest for testing.

Evaluation metrics: The evaluation metrics include precision (P), sensitivity (SE), intersection over union (IoU), Dice, accuracy (ACC), and area under the ROC curve (AUC).
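As a reference for how these overlap-based metrics are computed from binary masks, a small sketch follows (AUC requires the soft prediction scores and is omitted here):

```python
import numpy as np

def binary_seg_metrics(pred: np.ndarray, gt: np.ndarray) -> dict:
    """Overlap-based metrics from binary prediction and ground-truth masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    tp = np.logical_and(pred, gt).sum()      # vessel pixels found
    fp = np.logical_and(pred, ~gt).sum()     # background marked as vessel
    fn = np.logical_and(~pred, gt).sum()     # vessel pixels missed
    tn = np.logical_and(~pred, ~gt).sum()    # background correctly rejected
    eps = 1e-8                               # avoid division by zero
    return {
        "Precision": tp / (tp + fp + eps),
        "SE": tp / (tp + fn + eps),          # sensitivity / recall
        "IoU": tp / (tp + fp + fn + eps),
        "Dice": 2 * tp / (2 * tp + fp + fn + eps),
        "ACC": (tp + tn) / (tp + tn + fp + fn + eps),
    }
```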

Implementation Details: All experiments are run on one NVIDIA RTX 2080 Ti GPU. We use SGD as the optimizer with a momentum of 0.9 and a weight decay of 0.0001. We adopt the poly learning rate schedule [10] during training, $lr=init\_lr\times(1-\frac{iter}{max\_iter})^{power}$, with $init\_lr=0.01$ and $power=0.9$. The number of training epochs is set to 128. Due to memory limits, we cannot train the model at the original image sizes of the HRF and PRIME-FP20 datasets; instead, we use $1752\times 1162$ and $1408\times 1296$ as the target high-resolution sizes for these two datasets, respectively.
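A minimal sketch of the stated poly schedule; `poly_lr` is a hypothetical helper name, applied once per iteration:

```python
def poly_lr(optimizer, iter_: int, max_iter: int,
            init_lr: float = 0.01, power: float = 0.9) -> float:
    """Poly schedule from the paper: lr = init_lr * (1 - iter/max_iter)^power."""
    lr = init_lr * (1 - iter_ / max_iter) ** power
    for group in optimizer.param_groups:
        group["lr"] = lr  # update every parameter group in place
    return lr
```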

3.1 Ablation Study

An ablation study is conducted on the HRF dataset to investigate the effectiveness of the designed modules in SuperVessel, which takes U-Net as the backbone. We also count the parameters and FLOPs of SuperVessel in the training and test phases. The parameters for training and test are 29.73M and 28.95M, respectively, and the FLOPs are 5.86G and 4.72G, respectively. The test-phase parameters and FLOPs of SuperVessel are the same as those of U-Net, so SuperVessel does not increase the computational load at inference.

Table 1: Ablation study of SuperVessel (mean ± std). ✓ indicates that the component is enabled; the last row is the full SuperVessel.
ASR | UFD | FIM | SE | IoU | Dice | ACC | AUC
✗ | ✗ | ✗ | 68.56±1.96 | 56.73±0.94 | 72.39±0.76 | 95.87±0.04 | 83.05±0.96
✓ | ✗ | ✗ | 68.40±2.38 | 56.13±1.01 | 71.89±0.83 | 95.78±0.10 | 82.89±1.08
✓ | ✓ | ✗ | 68.31±2.13 | 59.41±0.97 | 74.53±0.77 | 96.32±0.05 | 83.08±1.05
✓ | ✓ | ✓ | 72.29±1.30 | 62.26±0.38 | 76.74±0.29 | 96.54±0.03 | 85.06±0.66

As shown in Table 1, we evaluate three components: ASR (the auxiliary super-resolution task), UFD, and FIM. Adding only ASR, i.e., purely attaching a super-resolution branch to the vessel segmentation, slightly decreases IoU and Dice, so simply combining the two tasks does not improve the segmentation results. After UFD is inserted into the network, IoU and Dice increase by about 3% over the baseline, illustrating that separating vessel features from the background makes the network focus on the vessel features. Finally, adding FIM improves IoU and Dice by about 6% and 4% over the baseline, showing that the interaction between the two branches further emphasizes the vessel features. Our feature enhancement therefore effectively improves segmentation accuracy, and SuperVessel achieves this without increasing the computational load.

3.2 Comparison Results

Seven state-of-the-art methods are selected for comparison: four vessel segmentation networks, U-Net [14], SA-UNet [4], CS-Net [12], and SCS-Net [20]; and three super-resolution-combined multi-task segmentation networks, DSRL [19], CogSeg [21], and PFSeg [18]. The proposed SuperVessel is compared with these methods on the three vessel segmentation datasets.

Table 2: Results on the HRF dataset (mean ± std).
Model | SE | IoU | Dice | ACC | AUC
U-net | 68.56±1.96 | 56.73±0.94 | 72.39±0.76 | 95.87±0.04 | 83.05±0.96
SA-UNet | 68.19±2.89 | 55.26±1.02 | 71.18±0.84 | 95.64±0.07 | 82.70±1.33
CS-Net | 67.43±2.45 | 55.09±0.92 | 71.04±0.76 | 95.66±0.08 | 82.32±1.13
SCS-Net | 66.31±1.76 | 54.63±0.30 | 70.66±0.25 | 95.65±0.13 | 81.80±0.73
DSRL | - | - | - | - | -
CogSeg | 70.95±1.72 | 59.68±0.76 | 74.75±0.60 | 96.22±0.04 | 84.31±0.81
PFSeg | - | - | - | - | -
SuperVessel | 72.29±1.30 | 62.26±0.38 | 76.74±0.29 | 96.54±0.03 | 85.06±0.66

Comparison results on the HRF dataset: The input size for all algorithms is the same, $876\times 584$, simulating low-resolution input. The output sizes of U-Net, SA-UNet, CS-Net, and SCS-Net equal their input size, while those of DSRL, CogSeg, PFSeg, and SuperVessel are $1752\times 1162$. As shown in Table 2, SuperVessel surpasses all the other state-of-the-art networks, with IoU more than 2% higher. The std values of SuperVessel are the lowest, meaning that its segmentation stability is superior to the other algorithms. DSRL and PFSeg cannot segment the HRF dataset because gradient explosion occurs during training; we return to this in the discussion section. Thus, the HRF dataset demonstrates the superiority of SuperVessel in accuracy and stability.

Figure 2: Example results on the HRF dataset. Green denotes the ground truth, red denotes the segmentation results, and yellow denotes correctly segmented vessels. (Please zoom in for a better view.)

Example results on the HRF dataset are shown in Fig. 2. All the other algorithms wrongly segment the edge of the optic disc as vessels, whereas SuperVessel classifies it correctly. We then select two blocks with tiny vessels for further analysis: the blue rectangle contains tiny vessels around the macula, and the red rectangle contains vessel ends. SuperVessel segments more tiny vessels than the other algorithms, especially at the ends of the vessels. Thus, the proposed feature enhancement effectively improves tiny-vessel segmentation.

Table 3: Comparison results on the OCTA dataset (mean ± std).
Model | SE | IoU | Dice | ACC | AUC
U-net | 64.06±0.41 | 52.36±0.18 | 68.73±0.15 | 94.53±0.01 | 80.97±0.18
SA-UNet | 59.65±0.41 | 48.94±0.25 | 65.72±0.23 | 94.16±0.02 | 78.86±0.20
CS-Net | 61.58±0.84 | 50.32±0.25 | 66.95±0.22 | 94.29±0.03 | 79.79±0.36
SCS-Net | 63.44±0.27 | 51.17±0.04 | 67.70±0.04 | 94.32±0.02 | 80.62±0.12
DSRL | - | - | - | - | -
CogSeg | 66.33±0.27 | 56.46±0.48 | 72.17±0.39 | 95.20±0.08 | 82.42±0.16
PFSeg | 74.92±2.68 | 56.35±2.55 | 72.05±2.07 | 94.55±0.39 | 86.02±1.30
SuperVessel | 73.80±0.31 | 64.56±0.10 | 78.46±0.07 | 96.20±0.02 | 86.30±0.13

Comparison results on the OCTA dataset: The low-resolution input size for all algorithms is the same, $200\times 200$. The output sizes of U-Net, SA-UNet, CS-Net, and SCS-Net equal their input size, while those of DSRL, CogSeg, PFSeg, and SuperVessel are $400\times 400$. As shown in Table 3, SuperVessel surpasses all the other state-of-the-art networks, with IoU more than 8% higher. The std values of SuperVessel are lower than those of most other algorithms, meaning that SuperVessel works stably. The failure of DSRL on the OCTA dataset is also covered in the discussion section.

Example results on the OCTA dataset are shown in Fig. 3. Most comparison algorithms produce discontinuous vessels, as shown in the red rectangles, because the vessels around the macula are very tiny and indistinguishable. The blue rectangles contain two tiny vessels away from the large vessels. SuperVessel detects these tiny vessels while the other algorithms cannot segment them correctly, since SuperVessel highlights structural features through its feature enhancement. Thus, the OCTA dataset demonstrates the superiority of SuperVessel in accuracy, tiny-vessel segmentation, and stability.

Figure 3: Example results on the OCTA dataset. Green denotes the ground truth, red denotes the segmentation results, and yellow denotes correctly segmented vessels. (Please zoom in for a better view.)
Table 4: Comparison results on the PRIME-FP20 dataset (mean ± std).
Model | SE | IoU | Dice | ACC | AUC
U-net | 26.43±1.34 | 22.77±0.89 | 37.08±1.18 | 97.77±0.01 | 62.62±0.70
SA-UNet | 13.50±5.90 | 12.43±5.33 | 21.77±8.94 | 97.66±0.08 | 56.47±2.82
CS-Net | 19.37±6.90 | 17.06±5.31 | 28.86±7.91 | 97.70±0.04 | 59.26±3.24
SCS-Net | 18.91±4.68 | 17.08±3.72 | 29.04±5.40 | 97.73±0.04 | 59.05±2.19
DSRL | - | - | - | - | -
CogSeg | 11.14±7.79 | 10.50±7.25 | 18.36±12.31 | 97.69±0.13 | 55.52±3.82
PFSeg | - | - | - | - | -
SuperVessel | 38.47±1.70 | 33.52±1.12 | 50.21±1.25 | 98.11±0.03 | 68.67±0.84

Comparison results on the PRIME-FP20 dataset: The input size for all algorithms is the same, $704\times 648$, simulating low-resolution input. The output sizes of U-Net, SA-UNet, CS-Net, and SCS-Net equal their input size, while those of DSRL, CogSeg, PFSeg, and SuperVessel are $1408\times 1296$. As shown in Table 4, SuperVessel surpasses all the other state-of-the-art networks by a significant margin, with IoU more than 11% higher. The std values of SuperVessel are lower than those of most other algorithms, meaning that SuperVessel works stably. The failures of DSRL and PFSeg are also covered in the discussion section. Example results are shown in Fig. 4. As the field of view of these images is about $200^{\circ}$, the vessels are extremely tiny compared with those in the other two datasets. In the red rectangles, the vessels segmented by SuperVessel are more continuous than those of the other algorithms. In the blue rectangles, which contain more tiny vessels, SuperVessel detects more vessels than the other algorithms, as spatial features such as vessels are emphasized by our feature enhancement. Therefore, across the three publicly available datasets, SuperVessel is superior to other state-of-the-art algorithms in segmentation accuracy and stability, particularly for tiny vessels.

Figure 4: Example results on the PRIME-FP20 dataset. Green denotes the ground truth, red denotes the segmentation results, and yellow denotes correctly segmented vessels. (Please zoom in for a better view.)

4 Discussion and Conclusions

In this study, we proposed SuperVessel, which provides high-resolution vessel segmentation results from low-resolution input, and experiments prove its effectiveness. However, previous super-resolution-combined multi-task segmentation networks such as DSRL and PFSeg cannot be trained on most vessel segmentation datasets; we observed that these methods often collapse during training. We conjecture that a similarity loss between two tasks is unsuitable when the targets of one task are a subset of those of the other: the similarity may cause one of the tasks to lose its constraint direction, and the model may then collapse, as the two algorithms did on the vessel segmentation task. Experiments on three datasets show that our method resolves this problem by exploiting the spatial relation between the two tasks for vessel segmentation.

Although SuperVessel works well on most vessel segmentation datasets, there is still room for improvement. For extremely tiny blood vessels, especially in the ultra-widefield fundus images of the PRIME-FP20 dataset, the model segments only slightly more vessels than the other algorithms. The main reason is that the original images are very large and contain much redundant information that we cannot fully process. Limited by our computational devices, we cannot train SuperVessel to output vessels at the original image size for datasets with extremely large images.

In conclusion, we proposed SuperVessel for vessel segmentation, which outputs high-resolution vessel segmentation results from low-resolution input images. Experiments on three publicly available datasets show that the super-resolution branch provides detailed features for vessel segmentation, and that the proposed feature enhancement, which focuses on target features, further improves segmentation accuracy, yielding more tiny vessels and stronger continuity of the segmented vessels.

5 Acknowledgement

This work was supported in part by Guangdong Provincial Department of Education (2020ZDZX3043), Guangdong Provincial Key Laboratory (2020B121201001), and Shenzhen Natural Science Fund (JCYJ20200109140820699 and the Stable Support Plan Program 20200925174052004).

References

  • [1] Budai, A., Bock, R., Maier, A., Hornegger, J., Michelson, G.: Robust vessel segmentation in fundus images. International journal of biomedical imaging 2013 (2013)
  • [2] Ding, L., Kuriyan, A.E., Ramchandran, R.S., Wykoff, C.C., Sharma, G.: Weakly-supervised vessel detection in ultra-widefield fundus photography via iterative multi-modal registration and learning. IEEE Transactions on Medical Imaging (2020)
  • [3] Fu, H., Xu, Y., Lin, S., Wong, D.W.K., Liu, J.: DeepVessel: Retinal vessel segmentation via deep learning and conditional random field. In: International conference on medical image computing and computer-assisted intervention. pp. 132–139. Springer (2016)
  • [4] Guo, C., Szemenyei, M., Yi, Y., Wang, W., Chen, B., Fan, C.: SA-UNet: Spatial attention U-Net for retinal vessel segmentation. In: 2020 25th International Conference on Pattern Recognition (ICPR). pp. 1236–1242. IEEE (2021)
  • [5] Ikram, M.K., Ong, Y.T., Cheung, C.Y., Wong, T.Y.: Retinal vascular caliber measurements: clinical significance, current knowledge and future perspectives. Ophthalmologica 229(3), 125–136 (2013)
  • [6] Karaali, A., Dahyot, R., Sexton, D.J.: DR-VNet: Retinal vessel segmentation via dense residual UNet. arXiv preprint arXiv:2111.04739 (2021)
  • [7] Lei, S., Shi, Z., Wu, X., Pan, B., Xu, X., Hao, H.: Simultaneous super-resolution and segmentation for remote sensing images. In: 2019 IEEE International Geoscience and Remote Sensing Symposium, IGARSS 2019, Yokohama, Japan, July 28 - August 2, 2019. pp. 3121–3124. IEEE (2019)
  • [8] Li, L., Verma, M., Nakashima, Y., Nagahara, H., Kawasaki, R.: IterNet: Retinal image segmentation utilizing structural redundancy in vessel networks. In: IEEE Winter Conference on Applications of Computer Vision, WACV 2020, Snowmass Village, CO, USA, March 1-5, 2020. pp. 3645–3654. IEEE (2020)
  • [9] Li, M., Chen, Y., Ji, Z., Xie, K., Yuan, S., Chen, Q., Li, S.: Image projection network: 3D to 2D image segmentation in OCTA images. IEEE Transactions on Medical Imaging 39(11), 3343–3354 (2020)
  • [10] Liu, W., Rabinovich, A., Berg, A.C.: ParseNet: Looking wider to see better. arXiv preprint arXiv:1506.04579 (2015)
  • [11] London, A., Benhar, I., Schwartz, M.: The retina as a window to the brain—from eye research to CNS disorders. Nature Reviews Neurology 9(1), 44–53 (2013)
  • [12] Mou, L., Zhao, Y., Chen, L., Cheng, J., Gu, Z., Hao, H., Qi, H., Zheng, Y., Frangi, A., Liu, J.: CS-Net: Channel and spatial attention network for curvilinear structure segmentation. In: Shen, D., Liu, T., Peters, T.M., Staib, L.H., Essert, C., Zhou, S., Yap, P.T., Khan, A. (eds.) Medical Image Computing and Computer Assisted Intervention – MICCAI 2019. pp. 721–730. Springer International Publishing, Cham (2019)
  • [13] Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Numerical recipes 3rd edition: The art of scientific computing. Cambridge University Press (2007)
  • [14] Ronneberger, O., Fischer, P., Brox, T.: U-net: Convolutional networks for biomedical image segmentation. In: International Conference on Medical image computing and computer-assisted intervention. pp. 234–241. Springer (2015)
  • [15] Su, R., Zhang, D., Liu, J., Cheng, C.: MSU-Net: Multi-scale U-Net for 2D medical image segmentation. Frontiers in Genetics 12, 140 (2021)
  • [16] Sun, C., Wang, J.J., Mackey, D.A., Wong, T.Y.: Retinal vascular caliber: systemic, environmental, and genetic associations. Survey of ophthalmology 54(1), 74–95 (2009)
  • [17] Wang, B., Qiu, S., He, H.: Dual encoding U-Net for retinal vessel segmentation. In: Shen, D., Liu, T., Peters, T.M., Staib, L.H., Essert, C., Zhou, S., Yap, P., Khan, A.R. (eds.) Medical Image Computing and Computer Assisted Intervention - MICCAI 2019 - 22nd International Conference, Shenzhen, China, October 13-17, 2019, Proceedings, Part I. Lecture Notes in Computer Science, vol. 11764, pp. 84–92. Springer (2019)
  • [18] Wang, H., Lin, L., Hu, H., Chen, Q., Li, Y., Iwamoto, Y., Han, X.H., Chen, Y.W., Tong, R.: Patch-free 3D medical image segmentation driven by super-resolution technique and self-supervised guidance. In: International Conference on Medical Image Computing and Computer-Assisted Intervention. pp. 131–141. Springer (2021)
  • [19] Wang, L., Li, D., Zhu, Y., Tian, L., Shan, Y.: Dual super-resolution learning for semantic segmentation. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (June 2020)
  • [20] Wu, H., Wang, W., Zhong, J., Lei, B., Wen, Z., Qin, J.: SCS-Net: A scale and context sensitive network for retinal vessel segmentation. Medical Image Analysis 70, 102025 (2021)
  • [21] Zhang, Q., Yang, G., Zhang, G.: Collaborative network for super-resolution and semantic segmentation of remote sensing images. IEEE Transactions on Geoscience and Remote Sensing 60, 1–12 (2022)
  • [22] Zhang, X., Zhou, X., Lin, M., Sun, J.: ShuffleNet: An extremely efficient convolutional neural network for mobile devices. In: Proceedings of the IEEE conference on computer vision and pattern recognition. pp. 6848–6856 (2018)