This paper was converted on www.awesomepapers.org from LaTeX by an anonymous user.
Want to know more? Visit the Converter page.

Multi-Target Deep Learning for Algal Detection and Classification

Peisheng Qian1, Ziyuan Zhao1, Haobing Liu2, Yingcai Wang3, Yu Peng3
Sheng Hu3, Jing Zhang3, Yue Deng1, and Zeng Zeng1,4
1Peisheng Qian, Ziyuan Zhao, Yue Deng and Zeng Zeng are with Institute for Infocomm Research (I2R), Agency for Science, Technology and Research (A*STAR), Singapore2Haobing Liu is with ZWEEC Analytics Pte Ltd, Singapore3Yingcai Wang, Sheng Hu, Yu Peng, Jing Zhang are with Yangtze River Basin Ecology and Environment Monitoring and Scientific Research Center, Ministry of Ecology and Environment of P. R. China4 Corresponding author. The work was supported by Singapore-China NRF-NSFC Grant (Grant No. NRF2016NRF-NSFC001-111)
Abstract

Water quality has a direct impact on industry, agriculture, and public health. Algae species are common indicators of water quality. It is because algal communities are sensitive to changes in their habitats, giving valuable knowledge on variations in water quality. However, water quality analysis requires professional inspection of algal detection and classification under microscopes, which is very time-consuming and tedious. In this paper, we propose a novel multi-target deep learning framework for algal detection and classification. Extensive experiments were carried out on a large-scale colored microscopic algal dataset. Experimental results demonstrate that the proposed method leads to the promising performance on algal detection, class identification and genus identification.

Clinical relevance—To the best of our knowledge, we are the first in the AI community to apply deep learning to detect and classify algae from large-scale colored microscopic images. The method can be easily extended and implemented for detection and classification of biological cells and tissues.

I INTRODUCTION

Water quality is vital for modern agriculture, industry, public health and safety. Poor water quality poses risks to ecosystems and people. Biological analysis can identify changes in water quality, in which, algae are important indicators of ecological situations due to their quick responses to the qualitative and quantitative composition of species in a wide range of water conditions. Moreover, harmful algal blooms (HABs) degrade water quality and increase the risk to human health and the environment. Therefore, it is necessary to identify and enumerate algae using microscopes for water quality management and analysis. Taxonomic identification of algae is complicated, with many levels of hierarchies. For instance, an algae may belong to the genus Pediastrum, and the class Chlorophyta. Conventionally, the taxonomic classification of algae is carried out by biologists based on spectral and morphological analysis. It is a time-consuming and error-prone task due to the micro-size and diversity of algae, see Fig. 1.

In recent years, computer vision based methods, especially the convolutional neural network (CNN), has shown promising performance in similar scenarios including object tracking [1], detection [2] and segmentation [3]. Previously, most methods [4, 5, 6] rely on handcrafted features and limited work are presented with deep learning methods. However, these work only focused on genus-level classification, without considering the hierarchical taxonomy in biology. In this paper, an end-to-end multi-target framework extended from Faster R-CNN [2, 7] is proposed for algal detection and classification, in which, algae positions and types are detected simultaneously, and two branches for prediction of algal genus and biological class are trained together. Furthermore, extensive experiments are conducted on a large-scale algae dataset containing 27 genera. Experimental results show the effectiveness of the proposed framework, achieving 74.64%74.64\% mAP@IoU=5050 on 27 genera.

Refer to caption
Figure 1: Examples of different algae under microscopic images. The sizes, shapes and appearances of genera are diverse.
Refer to caption
Figure 2: Model architecture. The 3 branches simultaneously outputs the genus, bounding box, and biological class of the alga. The orange component is the extended classification branch.

II RELATED WORK

Early work on the applications of machine learning in algae classification can date back from 1992 [4], in which, a feedforward neural network was trained using OPA (Optical Plankton Analyser) features of 8 algae, achieving an average classification accuracy of over 90%90\%, without graphic features. Giraldo-Zuluaga et al. [8] proposed to use the segmentation algorithm to filter, orient, and subsequently extract the micro-algae profiles from microscopic images for micro-algae identification, which reported an accuracy of 98.6% with Support Vector Machine (SVM) and 97.3%97.3\% with Artificial Neural Network (ANN) with 2 hidden layers, respectively. Li et al. [5] employed Muller matrix (MM) imaging which highlights different biological and machine learning techniques for discrimination and classification of micro-algae. These methods highly depend on handcrafted features or extracted features from image processing, which lack generality in different environments.

In recent years, the convolutional neural network (CNN) has been widely applied in image identification and analysis due to its ability to extract deep features from images [9]. For algal image analysis, only a few studies were reported using CNNs [6, 10, 11]. For example, Deglint et al. [6] implemented a deep residual convolutional neural network and achieved a classification accuracy of 96% on six algae types. An ensemble of networks was conducted on the same dataset by Deglint et al., which reached 96.1% classification accuracy [10]. In [12], CNN models were developed by Park et al. from neural architecture search (NAS) for algal image analysis. In these methods, algal images were segmented from photographic morphological images with image processing methods, and classified with CNNs. These methods can not be performed in an end-to-end manner in real-world environment [13, 14]. Besides, the nature of algal taxonomic analysis is a hierarchical multi-label classification problem. However, in these work, algae are only classified at genus level, which does not utilize taxonomic relationships among different hierarchies. It is well noted that deep neural networks (DNNs) with the same training data samples can benefit from shared learning paths and joint learning by leveraging useful information in multiple related tasks. Such designs of deep neural networks with multiple tasks are referred to as Multi-Target DNNs (MT-DNNs) [15], in which, multiple branches are combined for stable training and better performance by exploiting commonalities and differences across tasks.

III METHODOLOGY

Inspired by MT-DNNs, The proposed framework is designed based on the architecture of Faster R-CNN [2], as depicted in Fig. 2. We extend Faster R-CNN by adding an extra classification branch for multi-task learning. The framework consists of three branches that will be trained with different objectives for robust algal analysis:

  1. 1.

    Branch-1 is used to predict the genus of algae.

  2. 2.

    Branch-2 is used for algal detection and localization.

  3. 3.

    Branch-3 is used to predict the class of algae.

Specifically, the classification of algal genus is performed in the last fully connected layer of Branch-1 based on selected regions within the bounding boxes generated from Branch-2. The last fully connected layer of Branch-3 is implemented for the classification of algal class with the shape of m×nm\times n. mm is the number of output features, and nn is 6 (5 algal class, and the rest categorised as “Others”, as shown in Table II). Cross-entropy is used as the loss function for algal classification at genus level and class level, termed as LgenusL_{genus} and LclsL_{cls}, respectively. Combined with the bounding box regression loss in algal detection, termed as LboxL_{box}, the total loss function can be defined as

Ltotal=Lbox+Lgenus+λLclsL_{total}=L_{box}+L_{genus}+\lambda*L_{cls} (1)

where λ\lambda is introduced to scale the second classification loss and limit fluctuation in training. When λ=0\lambda=0, the network is equivalent to Faster R-CNN.

IV EXPERIMENTS

IV-A Dataset and Implementations

The dataset used in this paper is collected from Yangtze River Basin Ecology and Environment Monitoring and Scientific Research Center, P. R. China. The dataset consists of 1859 high-resolution microscopic images of 37 genera of algae in 6 biological classes and annotations of genus and class. Some samples of algae in the dataset are shown in Fig. 1. The images were taken under microscopes with warm lighting. The color information is stored for more information compared to commonly used grayscale images [6]. 99.2%99.2\% of the images have much higher resolution of 2752×22082752\times 2208 or 3072×20483072\times 2048 than images with resolution of 150×150150\times 150 used in [12]. Furthermore, comparing to other datasets with single-alga images, our dataset is more informative and challenging. In each image, various algae of different sizes and orientations are visible, along with other noisy objects, such as bacteria, which influence the model performance. Algae were located and identified with bounding boxes. Genera and classes of algae were annotated by professionals with expert knowledge.

The dataset is highly imbalanced. Therefore algae with less than 10 instances are categorized into the existing “else” genus, which lead to 27 genera of algae in total. The dataset is randomly split into 80% for training purposes and 20% for testing. The images are resized into 800×800800\times 800, followed by standardization with mean and standard deviation of the dataset. The images are randomly rotated by ±90\pm 90 degrees and randomly cropped during data augmentation process before training.

The proposed framework is implemented using Pytorch and is trained on 4 NVIDIA Quadro P5000 GPUs with a batch size of 32, using the stochastic gradient descent (SGD) optimizer with a momentum of 0.9. The initial learning rate is 0.02, which decayed by 10 in step 6000 and 7000, respectively. ResNet-50 based FPN network pre-trained on ImageNet is applied to initialize the proposed framework [16, 17]. For algal detection, aspect ratios of the anchors (height/width) are set to [0.25,0.5,1,2,4][0.25,0.5,1,2,4], while anchor sizes are set to [32,64,128,256,512][32,64,128,256,512], in order to fit all sizes and shapes of algae.

IV-B Results and Discussion

To evaluate the performance on algal detection, mAP@IoU=5050 (mean average precision of predictions with IoU >=>= 50%, abbreviated as mAP) is calculated [2], in which, IoU refers to the intersection of the predicted bounding box and the ground truth over their union. Average classification accuracy (ACA) is calculated for classification tasks.

The performance of the framework is affected by the value of λ\lambda in the loss function. Therefore, we first carried out experiments with various λ\lambda values from 0 to 0.5 to observe changes in the detection performance. The results are plotted in Fig. 3. It can be found that the performance is maximized when λ=0.2\lambda=0.2. It is well noted that when λ=0\lambda=0, the loss for biological class is neglected, and the third branch is not involved in the multi-target training. In this case, the model performance is worse than experiments with other settings. The comparison proves the effectiveness of our proposed multi-target learning framework. More details of the training process can be found in Fig. 4. When the biological class branch is updated during training, the model consistently outperforms the baseline model (where λ\lambda is zero) on the convergence speed as well as the final performance. Besides, we notice that when λ\lambda value is too high, there are more fluctuations and a risk of gradient explosion during training, as presented in Fig. 4. Therefore, we set λ=0.2\lambda=0.2 in the following experiments.

Refer to caption
Figure 3: Experimental results with different λ\lambda
Refer to caption
Figure 4: Evaluation process with different λ\lambda during training

The experimental results are shown in Table I and II. The proposed framework achieved an average mAP of 74.64%74.64\% and 81.17%81.17\% on algal detection on the basis of genera and class, respectively. In the classification tasks, the ACA for is 96.5%96.5\% for genera and 99.4%99.4\% for biological class. Table I presents the detailed detection results and the instance percentage of each genera. On one hand, 8 genera with at least 3.79%3.79\% of total instances in each genus, receive higher mAP than others. On the other hand, among the rest of 19 genera where each genus comprises less than 1%1\% of the data, the detection mAP is significantly low. The above mentioned results indicate that the model is sensitive to the percentage of algal instances on genus identification.

TABLE I: Experimental results on algal detection at genus level. Per-genus mAP and the corresponding percentage of instances are shown. The 8 genera with highest instance percentage are presented separately. The rest 19 genera are combined into the second last row.
Genus     mAP(%)     Instance Percentage (%)
 Cymbella     85.89     28.30
Navicula     74.46     16.50
Synedra     82.65     15.02
Achnanthes     72.85     8.40
Scenedesmus     84.91     5.46
Pediastrum     100.00     4.34
Cyclotella     88.27     3.79
Diatoma     90.04     4.01
The rest 19 genera     22.85     14.20
 Total     74.64     100
TABLE II: Experimental results on algal detection at class level. The mAP by biological class and corresponding instance percentage are presented.
Biological Class     mAP(%)     Instance Percentage(%)
 Bacillariophyta     81.29     82.11
Chlorphyta     86.93     12.78
Cyanophyta     72.45     1.56
Cryptophyceae     99.18     1.11
Cyanobacteria     82.14     1.00
Others     25.46     1.44
 Total     81.17     100

In Table II, detection at biological class level reaches 81.17%81.17\% mAP. It is higher than the mAP at genus level by 6.53%6.53\%. Moreover, the detection performance is more consistent over classes with large and small instance percentages, compared to Table I. The reasons are summarized as below. First, classification at the biological class level is less fine-grained, as the biological class is a higher level than genus in taxonomy. Second, inter-genera similarity can lead to misclassification. Detection on the biological class level avoids differentiation of such similarity, and thus achieves higher mAP. Based on the classification results in Table II, we can further deduce the genus of an alga, and reach higher accuracy by exploiting the hierarchies in the biological taxonomy. “Others” is an exception. It contains diverse classes without shared visual features, which negatively influences the classification performance.

Fig. 5 shows some examples of good and poor detection results by our approach on the algal dataset. We find out that the proposed method is robust on most testing images, while the performance decreases under the following situations. First, undetected algae. Most of the undetected algae are almost transparent and blends into the background. Second, algal occlusion. Some algae overlap with others or with non-algae objects. Third, incorrect classifications. This is possibly due to inter-class similarity. In future work, more image pre-processing and post-processing techniques could be explored to tackle the above mentioned problems.

Refer to caption
Figure 5: Examples of algal detection results. The classification results are printed on the top left corner of the bounding boxes. The first 2 rows denote correct classification and detection results. The 3 images in the bottom row represent 3 types of incorrect predictions. The error types from left to right are: undetected algae; algal occlusion and incorrect classifications.

V CONCLUSIONS

In this paper, we presented a novel multi-target deep learning framework for algal analysis. The proposed multi-target model simultaneously solves different tasks, i.e., genus classification, algal detection, and biological class identification. The method exploits the relationships among the targets. Extensive experiments on the novel algae dataset demonstrate the robustness of our approach, achieving 74.64%74.64\% mAP on detection at genus level, and 81.17%81.17\% mAP at class level. In our feature work, 3D CNN [18] can be implemented with other biological features to improve the performance on algal detection and classification.

References

  • [1] Yunjin Wu, Ziyuan Zhao, Shengqiang Zhang, Lulu Yao, Yan Yang, Tom ZJ Fu, and Stefan Winkler, “Interactive multi-camera soccer video analysis system,” in Proceedings of the 27th ACM International Conference on Multimedia, 2019, pp. 1047–1049.
  • [2] Shaoqing Ren, Kaiming He, Ross Girshick, and Jian Sun, “Faster r-cnn: Towards real-time object detection with region proposal networks,” in Advances in neural information processing systems, 2015, pp. 91–99.
  • [3] Ziyuan Zhao, Xiaoman Zhang, Cen Chen, Wei Li, Songyou Peng, Jie Wang, Xulei Yang, Le Zhang, and Zeng Zeng, “Semi-supervised self-taught deep learning for finger bones segmentation,” in 2019 IEEE EMBS International Conference on Biomedical & Health Informatics (BHI). IEEE, 2019, pp. 1–4.
  • [4] HW Balfoort, J Snoek, JRM Smiths, LW Breedveld, JW Hofstraat, and J Ringelberg, “Automatic identification of algae: neural network analysis of flow cytometric data,” Journal of Plankton Research, vol. 14, no. 4, pp. 575–589, 1992.
  • [5] Xianpeng Li, Ran Liao, Jialing Zhou, Priscilla T. Y. Leung, Meng Yan, and Hui Ma, “Classification of morphologically similar algae and cyanobacteria using mueller matrix imaging and convolutional neural networks,” Appl. Opt., vol. 56, no. 23, pp. 6520–6530, Aug 2017.
  • [6] Jason L Deglint, Chao Jin, and Alexander Wong, “Investigating the automatic classification of algae using the spectral and morphological characteristics via deep residual learning,” in International Conference on Image Analysis and Recognition. Springer, 2019, pp. 269–280.
  • [7] Yuxin Wu, Alexander Kirillov, Francisco Massa, Wan-Yen Lo, and Ross Girshick, “Detectron2,” 2019.
  • [8] Jhony-Heriberto Giraldo-Zuluaga, Augusto Salazar, German Diez, Alexander Gomez, Tatiana Martínez, Jesús Francisco Vargas, and Mariana Peñuela, “Automatic identification of scenedesmus polymorphic microalgae from microscopic images,” Pattern Analysis and Applications, vol. 21, no. 2, pp. 601–612, 2018.
  • [9] Ziyuan Zhao, Kerui Zhang, Xuejie Hao, Jing Tian, Matthew Chin Heng Chua, Li Chen, and Xin Xu, “Bira-net: Bilinear attention net for diabetic retinopathy grading,” in 2019 IEEE International Conference on Image Processing (ICIP). IEEE, 2019, pp. 1385–1389.
  • [10] J. L. Deglint, C. Jin, A. Chao, and A. Wong, “The feasibility of automated identification of six algae types using feed-forward neural networks and fluorescence-based spectral-morphological features,” IEEE Access, vol. 7, pp. 7041–7053, 2019.
  • [11] S Lakshmi and R Sivakumar, “Chlorella algae image analysis using artificial neural network and deep learning,” in Biologically Rationalized Computing Techniques For Image Processing Applications, pp. 215–248. Springer, 2018.
  • [12] Jungsu Park, Hyunho Lee, Cheol Young Park, Samiul Hasan, Tae-Young Heo, and Woo Hyoung Lee, “Algal morphological identification in watersheds for drinking water supply using neural architecture search for convolutional neural network,” Water, vol. 11, no. 7, pp. 1338, 2019.
  • [13] Zeng Zeng and Bharadwaj Veeravalli, “Do more replicas of object data improve the performance of cloud data centers?,” in 2012 IEEE Fifth International Conference on Utility and Cloud Computing. IEEE, 2012, pp. 39–46.
  • [14] Zeng Zeng and Bharadwaj Veeravalli, “Divisible load scheduling on arbitrary distributed networks via virtual routing approach,” in Proceedings. Tenth International Conference on Parallel and Distributed Systems, 2004. ICPADS 2004. IEEE, 2004, pp. 161–168.
  • [15] Zeng Zeng, Nanying Liang, Xulei Yang, and Steven Hoi, “Multi-target deep neural networks: Theoretical analysis and implementation,” Neurocomputing, September 2017.
  • [16] Alexander Kirillov, Ross Girshick, Kaiming He, and Piotr Dollar, “Panoptic feature pyramid networks,” in The IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2019.
  • [17] Alex Krizhevsky, Ilya Sutskever, and Geoffrey E. Hinton, “Imagenet classification with deep convolutional neural networks,” in Advances in neural information processing systems, 2012, pp. 1097–1105.
  • [18] Cen Chen, Kenli Li, Sin G Teo, Guizi Chen, Xiaofeng Zou, Xulei Yang, Ramaseshan C Vijay, Jiashi Feng, and Zeng Zeng, “Exploiting spatio-temporal correlations with multiple 3d convolutional neural networks for citywide vehicle flow prediction,” in 2018 IEEE international conference on data mining (ICDM). IEEE, 2018, pp. 893–898.