
Pixel Relationships-based Regularizer for Retinal Vessel Image Segmentation

Lukman Hakim1, Takio Kurita2*
1 Department of Information Engineering, Graduate School of Engineering, Hiroshima University, Higashihiroshima, Japan
2 Graduate School of Advanced Science and Engineering, Hiroshima University, Higashihiroshima, Japan
E-mail: [email protected]
Abstract

The task of image segmentation is to classify each pixel in an image with the appropriate label. Various deep learning approaches have been proposed for image segmentation, offering high accuracy and deep architectures. However, these techniques are typically trained with a pixel-wise loss function, which neglects the relationships between neighboring pixels during learning. The neighboring relationship of pixels is essential information in an image, and exploiting it provides an advantage over using pixel-wise information alone. This study presents regularizers that inject pixel neighbor relationship information into the learning process. The regularizers are constructed with a graph theory approach and a topology approach: in the graph theory approach, the graph Laplacian is used to enforce the smoothness of segmented images based on the output images and ground-truth images; in the topology approach, the Euler characteristic is used to identify and minimize the number of isolated objects in segmented images. Experiments show that our scheme successfully captures pixel neighbor relations and improves the performance of the convolutional neural network beyond the baseline without a regularization term.

Index Terms:
neighboring, pixel, deep learning, graph, topology.

I Introduction

Image segmentation is an important task in the field of computer vision. Its goal is to classify each pixel in an image with the appropriate label. Image segmentation is widely applied, for example in medical imaging (retinal vessel segmentation [3], tumor segmentation [4], breast cancer detection [5]), autonomous vehicles (vehicle detection [6], pedestrian detection [7]), object counting [8], and the farming industry [9]. Several image segmentation techniques have been developed, such as thresholding, k-means clustering, and graph cut methods.

Recently, many deep learning techniques have been developed for image segmentation, offering high accuracy and deep architectures [1][2]. However, these techniques are typically trained with a pixel-wise loss function, which neglects the relationships between neighboring pixels during learning. The neighboring relationship of pixels is essential information in an image, and exploiting it provides an advantage over using pixel-wise information alone.

Figure 1: Illustration of the graph-based smoothing regularizer (GBS) architecture.
Figure 2: Representation of graph between two regions, a) Ground-truth image, b) graph generated from background region and c) graph generated from foreground region.

This study presents regularizers that provide pixel neighbor relationship information to the learning process. We propose three regularizers constructed with a graph theory approach and a topology approach. First, we introduce a Graph-Based Smoothing regularizer (GBS). The GBS builds graph Laplacians for the foreground and background regions and combines them with the baseline CNN loss function. This combination allows the network to learn the pixel relationships efficiently. An illustration of the GBS is shown in Fig. 1.

Figure 3: Architecture of Graph Laplacian regularizer based on differences of neighboring pixels.

Second, we introduce a Graph Laplacian Regularizer based on the Differences of Neighboring Pixels (GLRDN), which constructs graph Laplacians from the prediction and ground-truth images. The graph uses pixels as vertices and edges defined by the "differences" of neighboring pixels instead of the similarities between pixels. The basic idea is that if a pair of neighboring pixels belongs to the same class, their difference is small; otherwise, if they belong to different classes, their difference is large. An illustration of the GLRDN is shown in Fig. 3.

Figure 4: Simple Illustration of Euler Characteristic Calculation.

Third, we propose a regularizer based on the Euler Characteristic (EC) of the segmented image. The EC is a topological property of a shape in an image that can identify the number of objects in the image. The EC-based regularizer counts the number of objects in the segmentation result, and this information is added to the objective function so that the network can minimize the number of isolated objects in the segmented image.

To evaluate the effectiveness of the proposed methods, we implemented the regularizers on retinal blood vessel segmentation in fundus images and compared them with the baseline CNN without regularizers.

II Methods

This section explains the three proposed regularizers: GBS, GLRDN, and the EC regularizer.

II-A Graph Based Smoothing (GBS)

The proposed graph-based smoothing regularizer is based on the relationships between pixels [10]. A graph approach is used to obtain the adjacency and diagonal (degree) matrices from the relationship information between pixels. The graph is transformed into a matrix by counting the edges between adjacent vertices, where two vertices are adjacent if an edge connects them. From these matrices, the graph Laplacian is defined.
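For concreteness, the following is a minimal NumPy illustration of this construction on an assumed toy graph (not an example taken from the paper):

```python
import numpy as np

# Minimal illustration: the graph Laplacian is obtained from the adjacency
# matrix A and the diagonal degree matrix D as L = D - A.
A = np.array([[0, 1, 0],
              [1, 0, 1],
              [0, 1, 0]], dtype=float)   # adjacency of a toy 3-vertex path graph
D = np.diag(A.sum(axis=1))               # diagonal (degree) matrix
L = D - A                                # graph Laplacian
```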

The GBS defines two different graphs for the foreground region and the background region. As shown in Fig. 2, let us consider the set of training samples $X=\{(x_{m},t_{m})\,|\,m=1,\ldots,M\}$, where $x_{m}$ is the $m$-th input image, $t_{m}$ is the $m$-th target image, and $M$ is the number of training samples. The deep convolutional neural network is trained to predict the output image $y_{m}$ from the $m$-th input image $x_{m}$. For each edge of the foreground graph $(j_{F},k_{F})\in E_{F}$ and of the background graph $(j_{B},k_{B})\in E_{B}$, the similarities $\beta_{(j_{F},k_{F})}$ and $\beta_{(j_{B},k_{B})}$ are defined as

$\beta_{(j_{F},k_{F})}=1-|t_{j_{F}}-t_{k_{F}}|$ (1)
$\beta_{(j_{B},k_{B})}=1-|t_{j_{B}}-t_{k_{B}}|$ (2)

We introduce the smoothing regularization terms based on the foreground region $F$ and the background region $B$ as

$\sum_{(j_{F},k_{F})\in E_{F}}\beta_{(j_{F},k_{F})}(y_{j_{F}}-y_{k_{F}})^{2}=y^{T}(D_{F}-A_{F})y=y^{T}L_{F}y$ (3)
$\sum_{(j_{B},k_{B})\in E_{B}}\beta_{(j_{B},k_{B})}(y_{j_{B}}-y_{k_{B}})^{2}=y^{T}(D_{B}-A_{B})y=y^{T}L_{B}y$ (4)

where $L_{F}$ and $L_{B}$ are the graph Laplacians for the foreground and background regions, respectively. The adjacency and diagonal matrices are defined as follows:

$A_{F}=\big[\beta_{(j_{F},k_{F})}\big],\quad D_{F}=\mathrm{diag}\Big(\sum_{j_{F}=1}^{N}\beta_{(j_{F},k_{F})}\Big),\quad A_{B}=\big[\beta_{(j_{B},k_{B})}\big],\quad D_{B}=\mathrm{diag}\Big(\sum_{j_{B}=1}^{N}\beta_{(j_{B},k_{B})}\Big)$ (5)

The graph-based smoothing regularizer $R_{gbs}$ can then be written as

$R_{gbs}=y^{T}(L_{F}+L_{B})y=y^{T}L_{G}y$ (6)

In these experiments, we use the Binary Cross Entropy (BCE) loss as the objective function. The BCE loss is given below:

$E_{bce}=-\sum_{i}^{M}\big[t_{i}\log(y_{i})+(1-t_{i})\log(1-y_{i})\big]$ (7)

where $t$ and $y$ are the ground truth and the output, respectively.

The objective function $O$ applied in this study is the summation of the binary cross entropy of each label and the regularization term using graph-based smoothing, defined as

$O_{1}=E_{bce}+\lambda R_{gbs}$ (8)

The parameter $\lambda$ is used to control the effect of the regularizer.
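To make the combined objective concrete, the following is a minimal sketch (not the authors' implementation) of how Eqs. (1)-(8) could be evaluated in PyTorch, assuming a 4-neighbourhood pixel graph so that every edge connects horizontally or vertically adjacent pixels; because $\beta=1-|t_{j}-t_{k}|$ is 1 for same-label pairs and 0 for cross-label pairs, summing over all edges covers exactly the foreground and background graphs of Eqs. (3)-(4).

```python
import torch
import torch.nn.functional as F

def gbs_objective(t, y, lam=0.001):
    """Sketch of O1 = E_bce + lambda * R_gbs (Eq. (8)).

    t, y : tensors of shape (B, 1, H, W) with the binary ground truth and the
    sigmoid output of the network. Edges are assumed to connect horizontally
    and vertically adjacent pixels (4-neighbourhood).
    """
    def smooth(dim):
        dy = torch.diff(y, dim=dim)          # y_j - y_k along one axis
        dt = torch.diff(t, dim=dim)          # t_j - t_k along one axis
        beta = 1.0 - dt.abs()                # edge weights, Eqs. (1)-(2)
        return (beta * dy ** 2).sum()        # sum of beta * (y_j - y_k)^2

    r_gbs = smooth(-1) + smooth(-2)          # horizontal + vertical edges, Eq. (6)
    e_bce = F.binary_cross_entropy(y, t)     # BCE loss, Eq. (7)
    return e_bce + lam * r_gbs               # Eq. (8)
```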

Similarly, the GLRDN regularizer introduced in the next subsection is combined with the BCE loss as

$O_{2}=E_{bce}+\lambda R_{glrdn}$ (9)

II-B Graph Laplacian based on Differences of Neighboring Pixels (GLRDN)

To define the GLRDN, we utilize the differences of neighboring pixels in the target image and in the estimated image [12]. Given the target image $t_{m}$ and the estimated image $y_{m}$, the GLRDN is defined as

$R_{glrdn}(\bm{t}_{m},\bm{y}_{m}) = \sum_{(i,j)\in E}\{(t_{mi}-t_{mj})-(y_{mi}-y_{mj})\}^{2}$
$\qquad = \sum_{(i,j)\in E}(\Delta t_{m}^{ij}-\Delta y_{m}^{ij})^{2}$
$\qquad = (\Delta\bm{t}_{m}-\Delta\bm{y}_{m})^{T}(\Delta\bm{t}_{m}-\Delta\bm{y}_{m})$
$\qquad = (B\bm{t}_{m}-B\bm{y}_{m})^{T}(B\bm{t}_{m}-B\bm{y}_{m})$
$\qquad = (\bm{t}_{m}-\bm{y}_{m})^{T}B^{T}B(\bm{t}_{m}-\bm{y}_{m})$
$\qquad = (\bm{t}_{m}-\bm{y}_{m})^{T}L(\bm{t}_{m}-\bm{y}_{m})$ (10)

where $B$ is the incidence matrix. Note that $B^{T}B$ is equal to the graph Laplacian matrix $L$. The graph Laplacian term assumes that the difference between the target image and the estimated image, denoted $(\bm{t}_{m}-\bm{y}_{m})$, is smooth with respect to the corresponding graph $G$; in particular, it enforces the value of $(\bm{t}_{m}-\bm{y}_{m})^{T}L(\bm{t}_{m}-\bm{y}_{m})$ to be small. In the training process, the GLRDN is used together with the BCE objective function, as in Eq. (9).
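As a hedged sketch (assuming a 4-neighbourhood graph and PyTorch tensors, not the authors' code), Eq. (10) can be evaluated by noting that $(t_{mi}-t_{mj})-(y_{mi}-y_{mj})=d_{i}-d_{j}$ with $d=\bm{t}_{m}-\bm{y}_{m}$:

```python
import torch

def glrdn(t, y):
    """Sketch of R_glrdn in Eq. (10) for maps of shape (B, 1, H, W).

    With d = t - y, the sum over edges reduces to squared differences of d
    between adjacent pixels (assumed 4-neighbourhood: horizontal and
    vertical edges).
    """
    d = t - y
    dh = d[..., :, 1:] - d[..., :, :-1]   # differences along horizontal edges
    dv = d[..., 1:, :] - d[..., :-1, :]   # differences along vertical edges
    return (dh ** 2).sum() + (dv ** 2).sum()

# Usage with the BCE loss, as in Eq. (9):
# loss = torch.nn.functional.binary_cross_entropy(y, t) + lam * glrdn(t, y)
```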

II-C Euler Characteristic-based Regularizer

The Euler characteristic is a property of an object's shape that is invariant to topological changes such as scale transformations and rotations. The EC gives the number of objects in two-dimensional space, provided the objects contain no holes. It can be obtained from the relation between the numbers of vertices ($P$), edges ($S$), and faces ($F$):

$EC=F-S+P$ (11)
Figure 5: Comparison of blood vessel segmentation results with and without our proposed regularizer on the DRIVE (top row) and STARE (bottom row) datasets.

For a binary image, the EC can be computed in the following steps: (a) define the pixels as imaginary vertices, (b) define the edges based on the 8x8 neighboring pixels, (c) count the number of vertices, (d) count the number of edges, (e) count the number of faces, and (f) calculate the EC using Eq. (11).
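The following is an illustrative sketch of this counting (not the exact procedure of [11]): each foreground pixel contributes one face, and its four corner vertices and four bounding edges are shared with neighboring pixels, so $EC=F-S+P$ follows from three set sizes.

```python
import numpy as np

def euler_characteristic(binary):
    """Compute EC = F - S + P for a 2-D binary array by building the cubical
    complex of the foreground pixels."""
    ys, xs = np.nonzero(binary)
    faces = len(ys)                                  # F: one face per foreground pixel
    verts, edges = set(), set()
    for yy, xx in zip(ys, xs):
        c = [(yy, xx), (yy, xx + 1), (yy + 1, xx), (yy + 1, xx + 1)]
        verts.update(c)                              # P: distinct corner vertices
        edges.update([(c[0], c[1]), (c[2], c[3]),    # S: top and bottom edges
                      (c[0], c[2]), (c[1], c[3])])   #    left and right edges
    return faces - len(edges) + len(verts)           # Eq. (11)

# Example: two separate blobs without holes -> EC == 2
img = np.zeros((8, 8), dtype=int)
img[1:3, 1:3] = 1
img[5:7, 4:7] = 1
print(euler_characteristic(img))                     # prints 2
```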

We demonstrate the calculation of the Euler characteristic for a simple binary image in Fig. 4. In this study, we calculate the EC from the CNN output. Since the EC of an image without holes equals the number of objects it contains, we use this value as the number of isolated objects in the CNN output. This count is then combined with the BCE loss as a regularizer during CNN training in order to minimize the number of isolated objects.

For the EC calculation to be invariant, it is necessary to also count the vertices, edges, and faces in the opposite direction. The EC values obtained in the two directions are then averaged as follows:

$R_{ec}=\frac{EC_{1}+EC_{2}}{2}$ (12)

In this study, the EC is estimated from the segmentation output of the network [11]. We combine the objective function with the regularization term as:

$O_{3}=E_{bce}+\lambda R_{ec}$ (13)

where $\lambda$ denotes the parameter that adjusts the effect of the EC regularizer.

III Experiments

We measure the performance of our proposed methods using Sensitivity (Sn), Specificity (Sp), Accuracy (Acc), and Area Under the ROC Curve (AUC) on the DRIVE, STARE, and CHASEDB1 datasets. To increase the amount of training data, we divide each image into patches of 48 x 48 pixels, obtaining 4,750 patches per image; in this way we obtain enough data to perform the training step. We first compare the baseline U-Net architecture with and without the regularization term. For the experiments, we train the network for 100 epochs with an initial learning rate of 0.001. To achieve convergence, we reduce the learning rate by a factor of 10 every 25 epochs. We use a batch size of 32 and Adam as the optimizer. A sketch of this training configuration is given below.
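The following is a hedged PyTorch sketch of this configuration; the one-layer model and the random tensors are placeholders for the U-Net baseline and the patch loader, which are not specified here.

```python
import torch
from torch import nn
from torch.optim import Adam
from torch.optim.lr_scheduler import StepLR

# Sketch of the schedule described above: Adam, initial learning rate 1e-3
# divided by 10 every 25 epochs, 100 epochs, batches of 32 patches of 48x48.
model = nn.Sequential(nn.Conv2d(1, 1, 3, padding=1), nn.Sigmoid())  # placeholder for U-Net
optimizer = Adam(model.parameters(), lr=1e-3)
scheduler = StepLR(optimizer, step_size=25, gamma=0.1)
bce = nn.BCELoss()

for epoch in range(100):
    patches = torch.rand(32, 1, 48, 48)                  # stand-in mini-batch of patches
    targets = (torch.rand(32, 1, 48, 48) > 0.9).float()  # stand-in vessel masks
    optimizer.zero_grad()
    loss = bce(model(patches), targets)                  # plus lambda * regularizer term
    loss.backward()
    optimizer.step()
    scheduler.step()                                     # learning rate / 10 every 25 epochs
```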

TABLE I: Comparison of architectural performance without and with our proposed GBS on the DRIVE datasets.
Methods Sn Sp Acc AUC
Baseline 0.6707 0.9867 0.9465 0.9652
Baseline+GBS 0.7064 0.9897 0.9536 0.9794
TABLE II: Comparison of architectural performance without and with our proposed GLRDN on the DRIVE datasets.
Methods Sn Sp Acc AUC
Baseline 0.7429 0.9840 0.9544 0.9686
Baseline+GLRDN 0.7914 0.9791 0.9561 0.9740
TABLE III: Comparison of architectural performance without and with our proposed Euler Characteristic-based Regularizer on the DRIVE datasets.
Methods Sn Sp Acc AUC
Baseline 0.7583 0.9826 0.9551 0.9691
Baseline+EC 0.8463 0.9759 0.9600 0.9824
Figure 6: Receiver Operating Characteristic (ROC) curve analysis of blood vessel segmentation with and without our proposed regularizer on different datasets: (a) DRIVE, (b) STARE, and (c) CHASEDB1.

IV Results

Figure 5 shows some segmentation results from the experiments. The retinal vessels are very complicated, consisting of large and small vessels crossing each other. The segmentation results of the U-Net baseline are poor at detecting complex and small vessels, whereas the proposed regularizers recover the small vessels in more detail. We also present quantitative results of our proposed regularizers on the DRIVE dataset in Tables I, II, and III. Without a regularizer, the baseline network misses many small vessels and creates many isolated objects. The tables indicate that, compared with the baseline, our regularizers segment small vessels better and reduce the number of isolated objects. Figure 6 compares the proposed regularizers with the baseline network on the different datasets using ROC curve analysis; the blue line shows the ROC curve of the baseline network without a regularizer, and the orange line shows the ROC curve of the baseline network combined with our regularizer. The curves show that our proposed regularizers improve the performance of the baseline network, indicating better performance on retinal blood vessel segmentation. For better visualization, the ROC curves are zoomed in at the top left.

V Conclusions

This study takes advantage of the neighboring pixel information in an image as additional information for the CNN. This information is formulated as a regularization term in the objective function. The proposed methods improve vessel segmentation, especially on small vessels, compared with training without a regularizer; in addition, the number of isolated objects is reduced. The pixel-by-pixel calculation of the regularizers is a limitation of this approach. In the future, we plan to extend this work by using sparse graphs to reduce the computational cost.

References

  • [1] O. Ronneberger, P. Fischer, and T. Brox, “U-Net: Convolutional Networks for Biomedical Image Segmentation,” in Lecture Notes in Computer Science, Springer International Publishing, 2015, pp. 234–241.
  • [2] V. Badrinarayanan, A. Kendall, and R. Cipolla, “SegNet: A Deep Convolutional Encoder-Decoder Architecture for Image Segmentation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, no. 12, pp. 2481–2495, Dec. 2017, doi: 10.1109/tpami.2016.2644615.
  • [3] H. Fu, Y. Xu, S. Lin, D. W. Kee Wong, and J. Liu, “DeepVessel: Retinal Vessel Segmentation via Deep Learning and Conditional Random Field,” in Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016, Springer International Publishing, 2016, pp. 132–139.
  • [4] M. Havaei et al., “Brain tumor segmentation with Deep Neural Networks,” Medical Image Analysis, pp. 18–31, Jan. 2017, doi: 10.1016/j.media.2016.05.004.
  • [5] D. Q. Zeebaree, H. Haron, A. M. Abdulazeez, and D. A. Zebari, “Machine learning and Region Growing for Breast Cancer Segmentation,” 2019 International Conference on Advanced Science and Engineering (ICOASE), Apr. 2019, doi: 10.1109/icoase.2019.8723832.
  • [6] B. Chougula, A. Tigadi, P. Manage, and S. Kulkarni, “Road segmentation for autonomous vehicle: A review,” 2020 3rd International Conference on Intelligent Sustainable Systems (ICISS), Dec. 2020, doi: 10.1109/iciss49785.2020.9316090.
  • [7] M. Ullah, A. Mohammed, and F. Alaya Cheikh, “PedNet: A Spatio-Temporal Deep Convolutional Neural Network for Pedestrian Segmentation,” Journal of Imaging, no. 9, p. 107, Sep. 2018, doi: 10.3390/jimaging4090107.
  • [8] H. Cholakkal, G. Sun, F. Shahbaz Khan, and L. Shao, “Object Counting and Instance Segmentation With Image-Level Supervision,” 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Jun. 2019, doi: 10.1109/cvpr.2019.01268.
  • [9] C. Wang, Y. Tang, X. Zou, W. SiTu, and W. Feng, “A robust fruit image segmentation algorithm against varying illumination for vision system of fruit harvesting robot,” Optik, pp. 626–631, Feb. 2017, doi: 10.1016/j.ijleo.2016.11.177.
  • [10] L. Hakim, N. Yudistira, M. Kavitha, and T. Kurita, “U-Net with Graph Based Smoothing Regularizer for Small Vessel Segmentation on Fundus Image,” in Communications in Computer and Information Science, Springer International Publishing, 2019, pp. 515–522.
  • [11] L. Hakim, M. S. Kavitha, N. Yudistira, and T. Kurita, “Regularizer based on Euler characteristic for retinal blood vessel segmentation,” Pattern Recognition Letters, pp. 83–90, Sep. 2021, doi: 10.1016/j.patrec.2021.05.023.
  • [12] L. Hakim, H. Zheng, and T. Kurita, “Improvement for Single Image Super-resolution and Image Segmentation by Graph Laplacian Regularizer Based on Differences of Neighboring Pixels,” International Journal of Intelligent Engineering and Systems, no. 1, Feb. 2022, doi: 10.22266/ijies2022.0228.10.