- Citation
-
Y. Logan, R. Benkert, A. Mustafa, G. Kwon and G. AlRegib, "Patient Aware Active Learning for Fine-Grained OCT Classification," IEEE International Conference on Image Processing (ICIP), Oct. 2022.
- Review
-
Date of acceptance: 20 Jan 2022
- Codes
- Bib
-
@INPROCEEDINGS{Logan2022_ICIP,
author={Y. Logan and R. Benkert and A. Mustafa and G. Kwon and G. AlRegib},
booktitle={IEEE International Conference on Image Processing (ICIP)},
title={Patient Aware Active Learning for Fine-Grained OCT Classification},
year={2022}}
- Copyright
-
©2022 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
- Contact
Patient Aware Active Learning for Fine-Grained OCT Classification
Abstract
This paper considers making active learning more sensible from a medical perspective. In practice, a disease manifests itself in different forms across patient cohorts. Existing frameworks have primarily used mathematical constructs to engineer uncertainty- or diversity-based methods for selecting the most informative samples. However, such algorithms do not lend themselves naturally to use by the medical community and healthcare providers, so their deployment in clinical settings is very limited, if it occurs at all. To address this, we propose a framework that incorporates clinical insights into the sample selection process of active learning and can be combined with existing algorithms. Our medically interpretable active learning framework captures diverse disease manifestations across patients to improve the generalization performance of OCT classification. After comprehensive experiments, we report that incorporating patient insights within the active learning framework yields performance that matches or surpasses five commonly used paradigms on two architectures with a dataset having imbalanced patient distributions. Moreover, the framework integrates into existing medical practices and can thus be used by healthcare providers.
Index Terms— Active learning, Deep learning, OCT, Patient awareness, Personalized diagnosis
1 Introduction
Active learning is a branch of human-in-the-loop computing that assumes the model and unlabeled dataset evolve over time as the most informative samples are selected from a dataset. Applying this paradigm to medical image analysis is practical because it reduces the costly and cumbersome human effort needed for expert manual annotation. Even though active learning on medical imagery has demonstrated progress in overcoming the reliance on large, balanced datasets [1, 2], the practical deployment of this paradigm still faces major challenges.
First, most conventional active learning paradigms applied to medical image analysis are still largely considered "black box" algorithms. This inhibits healthcare professionals from interpreting, understanding, and even correcting predictions a model has made [3, 4, 5]. From a physician's point of view, this creates a higher risk of patient harm. Second, a single disease can present itself in visually diverse forms across multiple patients. This is exemplified in Fig. 1 where, although each patient's optical coherence tomography (OCT) scan is diagnosed with diabetic macular edema (DME), their visual characteristics are unalike. In other words, data diversity is captured within the medical metadata but remains unexploited in existing active learning paradigms. Additionally, medical datasets are widely imbalanced both across classes and across patients [6]. This ultimately results in existing methods training models without properly accounting for the disease manifestations of patients from whom there is less data. Consequently, the model could overfit to a minority patient group and fail to generalize to the broader population. For practical deployments, this represents a major risk.


From a machine learning perspective, the root cause of the problem resides in the structure of the representation space. Conventional active learning methods rely on a well-established representation of the data. If samples are conventionally chosen from an unbalanced dataset, the structure of the representation space will be underdeveloped and the model will generalize poorly. To address this, we propose to shape the representation space through medical meta-information. Specifically, we provide data diversity through patient identity. By imposing a patient constraint on the query strategies, we structure the representation space based on a medically grounded prior (Fig. 2). We show that our method is robust to both unbalanced data distributions and to the choice of architecture, and is also effective at fine-grained disease classification. Extending the original paradigm, we call our method Patient Aware Active Learning. In summary, the contributions of this paper are as follows:
- (i) We incorporate patient information to augment the active learning paradigm and improve medical interpretability.
- (ii) We develop a modular plug-in method that can be applied to arbitrary active learning strategies and matches or outperforms existing methods in terms of generalization.
- (iii) We test our method on a popular OCT benchmark with two different architectures and five different query strategies, amounting to 100 experiments.

2 Related Work
Active learning frameworks strive to select the most informative subset of samples from an unlabeled dataset in order to combat the laborious, costly, and time-consuming nature of developing large annotated datasets [7, 8, 9, 10, 11]. Within medical image analysis, active learning has been applied to histopathological image analysis [1], skin lesion segmentation [12], heart magnetic resonance imaging and computerized tomography scan analysis [13], the diagnosis of digital mammograms [14], and tuberculosis detection in chest radiographs [15]. A significant area of active learning research is focused on defining sample informativeness. In this context, several approaches define informativeness via generalization difficulty [16, 17] while others focus on data diversity within the acquisition batch [9, 10]. Even though diversity is considered within the context of learned model features, existing approaches do not account for diversity inherent within the medical data. To the best of our knowledge, our approach is the first to consider medical meta-information within the active learning workflow.
3 Method
Our method is modular: it integrates within existing query strategies and enhances their performance with medical meta-information. Patient awareness is injected into the learning framework by partitioning the unlabeled training pool by patient identity prior to ranking sample informativeness with a query function, so it can be combined with arbitrary active learning strategies.
3.1 Patient Aware Active Learning
For patient aware active learning (Fig. 3), we partition the unlabeled pool into the separate patients and sample from the respective subsets with an arbitrary query strategy. Mathematically, we partition the unlabeled pool U into per-patient subsets

U = U_1 ∪ U_2 ∪ … ∪ U_P,   (1)

where U_p refers to the set of image-label pairs of patient p and P to the total number of patients in the dataset. We begin the process by training a model on randomly selected samples from the unlabeled pool. In each following round, we choose k unique patients from the set and sample a single image-label pair from each of the k patients using the query strategy. Finally, we append the selected samples to the training pool. This process is repeated to determine the minimum number of labeled samples that maximizes the model's performance.
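The partition-then-query step can be sketched in a few lines; this is a minimal illustration, not the paper's code, assuming the unlabeled pool is stored as (patient_id, sample) pairs and the query strategy returns one informativeness score per sample (all names here are hypothetical):

```python
import random
from collections import defaultdict

def patient_aware_query(unlabeled, query_strategy, k, seed=None):
    """Select k samples, one from each of k distinct patients.

    unlabeled: list of (patient_id, sample) pairs.
    query_strategy: callable mapping a list of samples to a list of
        informativeness scores (higher = more informative).
    """
    rng = random.Random(seed)
    # Partition the unlabeled pool by patient identity, as in Eq. (1).
    by_patient = defaultdict(list)
    for pid, sample in unlabeled:
        by_patient[pid].append(sample)
    # Choose k distinct patients, then take the single most informative
    # sample from each patient's subset.
    patients = rng.sample(sorted(by_patient), k)
    selected = []
    for pid in patients:
        pool = by_patient[pid]
        scores = query_strategy(pool)
        best = max(range(len(pool)), key=lambda i: scores[i])
        selected.append((pid, pool[best]))
    return selected
```

Any baseline query strategy can be plugged in unchanged; only the pool it ranks is restricted to one patient at a time.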
4 Experiments
The dataset used in this paper, obtained from [6], contains grayscale, cross-sectional, foveal OCT scans of varying sizes belonging to an unbalanced distribution of one healthy class and three types of retinal disease: Drusen, choroidal neovascularization (CNV), and diabetic macular edema (DME). We are interested in distinguishing between the disease states and thus use only imagery from these three classes for fine-grained classification. Sample imagery from each class is shown in Fig. 4. A total of 10488 DME, 36345 CNV, and 7756 Drusen images from 1852 patients were used in the training and unlabeled set. The test set contained 250 images per disease collected from 486 patients. All imagery were resized to a common resolution and normalized to zero mean and unit standard deviation. Additional implementation details are shown in Table 1. There was no overlap in patients or imagery between the training and test sets. The patient and class distributions of the training and test sets are shown in Fig. 5.


| Details | OCT Dataset |
|---|---|
| Unlabeled + Training Set Images | 54589 |
| Images in Initial Training Set | 128 |
| Images Queried per Iteration | 128 |
We use Resnet-18 [18] and Densenet-121 [19] architectures trained from scratch to classify images in an active learning framework. A learning rate of 1.5e-4 was used along with the Adam optimizer [20]. During each training round, the Resnet and Densenet models were trained for as many epochs as it took to achieve 98% and 94% accuracy, respectively. This is repeated with five different random seeds and the average accuracy is reported.
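The round structure described so far (retrain, query a batch, append it to the labeled pool, repeat) might be sketched framework-agnostically as below; `train_fn` and `query_fn` are hypothetical placeholders for the actual model training and query strategy, not names from the paper:

```python
def active_learning_loop(train_fn, query_fn, labeled, unlabeled,
                         rounds, batch_size):
    """Generic pool-based active learning loop.

    train_fn:  callable(labeled) -> model, trained until its accuracy
               threshold is reached (e.g., 98% for Resnet-18).
    query_fn:  callable(model, unlabeled, batch_size) -> list of samples,
               e.g., a patient-aware wrapper around a baseline strategy.
    """
    model = None
    for _ in range(rounds):
        # Retrain from scratch on the current labeled pool.
        model = train_fn(labeled)
        # Rank the unlabeled pool and pick the next acquisition batch.
        picked = query_fn(model, unlabeled, batch_size)
        # Move the picked samples from the unlabeled to the labeled pool.
        labeled.extend(picked)
        for s in picked:
            unlabeled.remove(s)
    return model
```

Stopping after a fixed number of rounds mirrors the paper's budget of 3000 queried samples; one could equally stop once test accuracy plateaus.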
Classification performance is evaluated against the following baseline algorithms:

1. Random: Randomly sampling instances at each round as a naive baseline.
2. Least Confidence: Selecting the samples for which the model has the lowest predicted probability for its most likely class ŷ [17]:
x* = argmax_x [1 − P(ŷ | x)]   (2)
3. Margin: The difference between the top two most probable predictions, ŷ_1 and ŷ_2, is used to identify the most informative sample [17]:
x* = argmin_x [P(ŷ_1 | x) − P(ŷ_2 | x)]   (3)
4. Entropy: Uses the full predictive distribution to determine the most uncertain prediction from a model [17]:
x* = argmax_x [−Σ_i P(y_i | x) log P(y_i | x)]   (4)
5. BADGE: Samples disparate and high-magnitude points from a gradient space at every round [21].
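The three uncertainty measures above can be written in a few lines of NumPy; this is a sketch assuming `probs` is an (N, C) array of per-sample softmax outputs, with each score oriented so that higher means more informative:

```python
import numpy as np

def least_confidence(probs):
    # 1 - P(y_hat | x): low top-class probability = more informative.
    return 1.0 - probs.max(axis=1)

def margin(probs):
    # Gap between the top-two class probabilities; negated so that a
    # small margin (ambiguous prediction) yields a high score.
    part = np.sort(probs, axis=1)
    return -(part[:, -1] - part[:, -2])

def entropy(probs, eps=1e-12):
    # Shannon entropy of the full predictive distribution; eps guards
    # against log(0) for confident one-hot-like outputs.
    return -np.sum(probs * np.log(probs + eps), axis=1)
```

In a patient-aware setup, any of these would simply be applied to one patient's subset of the unlabeled pool at a time rather than to the whole pool.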
5 Results
In Figs. 6 and 7 we show learning curves comparing the accuracy of the baseline query strategies with their corresponding patient aware versions. In the interest of space, we visualize results for four of the five query strategies on the Resnet-18 and Densenet-121 architectures. In these plots, the x-axis corresponds to the number of samples in the training set at that round and the y-axis to the accuracy on the test set. Each colored curve is the average of five trials using different seeds, with standard errors shown by the shaded regions. Our patient aware learning applied to the baselines is shown as the green curve in all sub-figures, while the baseline queries are shown in orange. The blue curve represents random sampling, which serves as the naive baseline for all experiments. All results were achieved using 5.5% (3000 samples) of the OCT dataset (54589 samples).
The learning curves provide intuition about having medical insights guide the decisions of neural networks. For instance, most plots show patient aware learning surpassing the baselines on both architectures. This means that in the majority of cases, querying the dataset with patient-level insights in an active learning setup allows the model to better characterize disease states. Furthermore, we often see patient aware learning having an edge over the baseline algorithms from the early rounds of training onward, as in Figs. 6(b) and 7(d). This is because patient awareness plays a critical part in incorporating diversity throughout the training set: the model is exposed to several manifestations of a pathology from the onset. Noticeably, patient aware learning can sometimes perform the same as the baseline strategies, as shown in Fig. 7(b). This can happen when the selected patients do not contain disease manifestations different enough for the model to behave much differently from the baselines. It can also occur when the model in some trials is poorly initialized, yielding a poor set of built-in assumptions for making predictions. Overall, these findings suggest that patient aware learning is a good choice for medical applications of active learning.








6 Conclusion
In this paper we introduced a medically interpretable framework that incorporates clinical context, in the form of patient awareness, into active learning. We show that our method can be used as a modular plug-in with arbitrary sampling strategies. We perform controlled experiments to validate the effectiveness of patient aware active learning for fine-grained OCT classification on different architectures using an unbalanced dataset. Building on these contributions, we will further investigate the robustness of patient aware active learning across learning methods.
References
- [1] André Homeyer, Andrea Schenk, Uta Dahmen, Olaf Dirsch, Hai Huang, and Horst K Hahn, “A comparison of sampling strategies for histological image analysis,” Journal of Pathology Informatics, vol. 2, 2011.
- [2] Asim Smailagic, Hae Young Noh, Pedro Costa, Devesh Walawalkar, Kartik Khandelwal, Mostafa Mirshekari, Jonathon Fagert, Adrián Galdrán, and Susu Xu, “Medal: Deep active learning sampling method for medical image analysis,” arXiv preprint arXiv:1809.09287, 2018.
- [3] Samuel Budd, Emma C Robinson, and Bernhard Kainz, “A survey on active learning and human-in-the-loop deep learning for medical image analysis,” Medical Image Analysis, vol. 71, pp. 102062, 2021.
- [4] Mohit Prabhushankar and Ghassan AlRegib, “Extracting causal visual features for limited label classification,” in 2021 IEEE International Conference on Image Processing (ICIP). IEEE, 2021, pp. 3697–3701.
- [5] Yash-yee Logan, Kiran Kokilepersaud, Gukyeong Kwon, Ghassan AlRegib, Charles Wykoff, and Hannah Yu, "Multi-Modal Learning Using Physicians Diagnostics for Optical Coherence Tomography Classification," IEEE International Symposium on Biomedical Imaging (ISBI), Kolkata, India, Mar. 2022.
- [6] Daniel Kermany, Kang Zhang, and Michael Goldbaum, "Large dataset of labeled optical coherence tomography (oct) and chest x-ray images," Mendeley Data, http://dx.doi.org/10.17632/rscbjbr9sj, vol. 3, 2018.
- [7] Keze Wang, Dongyu Zhang, Ya Li, Ruimao Zhang, and Liang Lin, “Cost-effective active learning for deep image classification,” IEEE Transactions on Circuits and Systems for Video Technology, vol. 27, no. 12, pp. 2591–2600, 2016.
- [8] Melanie Ducoffe and Frederic Precioso, “Adversarial active learning for deep networks: a margin based approach,” arXiv preprint arXiv:1802.09841, 2018.
- [9] Ozan Sener and Silvio Savarese, “Active learning for convolutional neural networks: A core-set approach,” arXiv preprint arXiv:1708.00489, 2017.
- [10] Daniel Gissin and Shai Shalev-Shwartz, “Discriminative active learning,” arXiv preprint arXiv:1907.06347, 2019.
- [11] Ahmad Mustafa and Ghassan AlRegib, “Man-recon: Manifold learning for reconstruction with deep autoencoder for smart seismic interpretation,” in 2021 IEEE International Conference on Image Processing (ICIP). IEEE, 2021, pp. 2953–2957.
- [12] Marc Gorriz, Axel Carlier, Emmanuel Faure, and Xavier Giro-i Nieto, “Cost-effective active learning for melanoma segmentation,” arXiv preprint arXiv:1711.09168, 2017.
- [13] Danielle F Pace, Adrian V Dalca, Tal Geva, Andrew J Powell, Mehdi H Moghari, and Polina Golland, “Interactive whole-heart segmentation in congenital heart disease,” in International Conference on Medical Image Computing and Computer-Assisted Intervention. Springer, 2015, pp. 80–88.
- [14] Yu Zhao, Jingyang Zhang, Hongzhi Xie, Shuyang Zhang, and Lixu Gu, “Minimization of annotation work: diagnosis of mammographic masses via active learning,” Physics in Medicine & Biology, vol. 63, no. 11, pp. 115003, 2018.
- [15] Jaime Melendez, Bram van Ginneken, Pragnya Maduskar, Rick HHM Philipsen, Helen Ayles, and Clara I Sanchez, "On combining multiple-instance learning and active learning for computer-aided detection of tuberculosis," IEEE Transactions on Medical Imaging, vol. 35, no. 4, pp. 1013–1024, 2015.
- [16] Jingbo Zhu, Huizhen Wang, Benjamin K Tsou, and Matthew Ma, "Active learning with sampling by uncertainty and density for data annotations," IEEE Transactions on Audio, Speech, and Language Processing, vol. 18, no. 6, pp. 1323–1331, 2009.
- [17] Burr Settles, “Active learning literature survey,” 2009.
- [18] Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun, “Deep residual learning for image recognition,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2016, pp. 770–778.
- [19] Gao Huang, Zhuang Liu, Laurens Van Der Maaten, and Kilian Q Weinberger, “Densely connected convolutional networks,” in Proceedings of the IEEE conference on computer vision and pattern recognition, 2017, pp. 4700–4708.
- [20] Diederik P Kingma and Jimmy Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980, 2014.
- [21] Jordan T Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, and Alekh Agarwal, “Deep batch active learning by diverse, uncertain gradient lower bounds,” arXiv preprint arXiv:1906.03671, 2019.