
Enhancing Skin Lesion Classification Generalization with Active Domain Adaptation

Jun Ye
Carnegie Mellon University
*This work was not supported by any organization
Abstract

We propose a method to improve the generalization of skin lesion classification models by combining self-supervised learning (SSL) and active domain adaptation (ADA). The main steps of the approach include selection of an SSL pre-trained model on natural image datasets, subsequent SSL retraining on all available skin-lesion datasets, fine-tuning of the model on source domain data with labels, and application of ADA methods on target domain data. The efficacy of the proposed approach is assessed on ten skin lesion datasets with five different ADA methods, demonstrating its potential to improve generalization in settings with varying degrees of domain shift.

Clinical Relevance—This approach is promising in facilitating the widespread clinical adoption of deep learning models for skin lesion classification, as well as other medical imaging applications.

I INTRODUCTION

The past decade has witnessed a surge in the application of artificial intelligence (AI) to skin cancer detection. Given the increasing incidence of skin cancer within an aging population and the scarcity of dermatology experts, there is a compelling demand for automated AI solutions [68]. Globally, an estimated three billion people lack adequate access to medical care for skin disease [16]; AI can help close this gap. Dermatologists normally use dermoscopy images to detect skin cancers, and computer vision (CV) methods have accordingly been applied to skin cancer classification and segmentation [61, 44, 43]. Since the International Skin Imaging Collaboration (ISIC) began organizing annual challenges in 2016, steady progress has been made in algorithm performance, and deep learning models now exceed human expert performance on the provided datasets [68, 15].

Despite these advances in medical imaging applications, model generalization remains a crucial hurdle for broad clinical adoption [40, 77, 15, 31, 75]. For skin cancer detection specifically, Barros et al. and Daneshjou et al. find that models perform worse on dark skin tones than on light skin tones, and worse on rare diseases [6, 17]. Han et al. observe that a dermatologist-level model's performance significantly decreases when tested on out-of-distribution data [31]. Differences between skin lesion datasets are pervasive, stemming from sources such as demographics, lesion locations, institutions, acquisition devices, and lighting conditions. Transfer learning (TL) is a technique for carrying a model's knowledge from one domain to another. Due to the lack of large labeled datasets, a common approach is to initially train models on natural image datasets, such as ImageNet [18, 53, 54], and subsequently fine-tune the model on medical image datasets. Various factors affect a model's transferability, including the transfer technique, training data size, similarity between domains, and model architecture [52]. Here we focus on the transfer techniques, since some of the other factors are outside our control in actual use cases, e.g. we may have to work with a given trained model on a given dataset.

One transfer learning technique is self-supervised learning (SSL). Supervised learning (SL) encourages models to learn task-specific features, potentially hindering their ability to generalize to unseen data. SSL, on the other hand, enables models to learn general features that are more robust across diverse datasets [3]. Azizi et al. propose a robust and data-efficient generalization framework for medical imaging applications that performs SSL training on in-domain medical data prior to fine-tuning; this approach outperforms SL models on both in-domain and out-of-domain data [3]. Kang et al. demonstrate similar results for four different SSL methods [42]. Haggerty et al. confirm the efficacy of this approach on skin lesion datasets, demonstrating that SSL retraining on skin lesion data enhances the performance of a model pre-trained on ImageNet with either SL or SSL methods [30]. Another TL approach is domain adaptation (DA). DA methods aim to align the source domain (data on which the model is trained) and the target domain (data the model has not encountered). Few studies have explored the application of unsupervised domain adaptation (UDA) methods to skin lesion classification [73, 10, 28].

In this study, we aim to combine SSL and DA for better transfer learning. Limited research exists on combining SSL and DA methods. Zhao et al. combine SSL and active learning (AL), without domain adaptation, for skin lesion segmentation [79]. Xu et al. combine SSL and UDA for object detection and segmentation [76]. To the best of our knowledge, this work represents the first attempt to combine SSL and DA for skin lesion classification. More specifically, we use the state-of-the-art (SOTA) SSL method DINO, which differs from the Barlow Twins SSL method used in [3]. Instead of using UDA methods, we apply active domain adaptation (ADA) [51]. This is motivated by the actual clinical setting, where the model can get feedback from human experts to iteratively improve performance [71]. We demonstrate the effectiveness of the approach on the skin lesion classification task. The approach integrates seamlessly into clinical workflows, enabling iterative performance improvement by incorporating feedback from human experts.

Our main contributions include:

  • We propose a workflow to improve the generalization of the skin lesion classification model across domains by combining SSL and ADA methods.

  • We show that SSL can be an effective UDA method and applying ADA methods can add further improvements.

  • We demonstrate our approach's efficacy on data from ten skin lesion target domains with five different ADA methods.

II Related work

II-A Self-supervised learning

SSL is a technique for training models without labels. It can be used to create foundation models that can be fine-tuned for different downstream tasks. The success of SSL was initially observed in natural language processing (NLP) [19, 7], and later extended to CV [20, 21, 27, 45, 55]. Recently there has been a rapid proliferation of SSL applications in the medical imaging field [4, 47, 63, 64, 74, 37]. SSL can achieve better performance than SL-trained models with far fewer labels [11]. There are four main types of SSL methods: innate relationships, generative, contrastive, and self-prediction. Innate-relationship methods leverage the internal structure of the data, e.g. predicting the rotation angle or finding the position of shuffled image patches. Generative methods use autoencoders or generative adversarial networks (GANs) and are assessed by how well they reconstruct the original images. Contrastive methods use data augmentation to create positive pairs from an image, while different images are treated as negative pairs; various distance metrics have been proposed to keep positive pairs close and negative pairs distant in the embedding space. Examples include SimCLR [11], MoCo [32], BYOL [5], SimSiam [12], and DINO [8]. Self-prediction methods mask out portions of the original image and try to reconstruct it. The main difference between self-prediction and generative methods is that self-prediction applies augmentation to portions of the original image, while generative methods apply it to the whole image. SSL applications in medical imaging are rapidly rising, as medical labels are costly and time-consuming to obtain, and SSL can speed up model development.
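
To make the contrastive idea concrete, below is a minimal NT-Xent-style sketch in PyTorch (a simplified, one-directional variant, not the exact loss of any cited method), assuming z1[i] and z2[i] are embeddings of two augmented views of the same image:

import torch
import torch.nn.functional as F

def info_nce_loss(z1, z2, temperature=0.1):
    # Cosine similarities between all pairs of views in the batch.
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    logits = z1 @ z2.t() / temperature
    # Positive pairs sit on the diagonal; all other images act as negatives.
    targets = torch.arange(z1.size(0), device=z1.device)
    return F.cross_entropy(logits, targets)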

II-B Unsupervised domain adaptation

UDA methods can be divided into two types: moment matching and adversarial training. Moment matching methods try to match the first or second moments of the hidden activation distributions of the source and target domains [78]. Examples include deep adaptation networks (DAN) [48], joint adaptation networks (JAN) [49], correlation alignment [66], and central moment discrepancy (CMD) [78]. Adversarial training methods try to learn domain-invariant features: a domain classifier is trained to distinguish the domains, while the feature extractor is trained to map source and target features into the same space so the classifier cannot tell them apart. The most widely used adversarial training method is domain adversarial neural networks (DANN) [25]. Other methods build upon it, such as adversarial discriminative domain adaptation (ADDA) [70], maximum classifier discrepancy (MCD) [57], batch spectral penalization (BSP) [13], and minimum class confusion (MCC) [39].
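
As a hedged illustration of moment matching, the sketch below aligns batch means and covariances of source and target features (a simplified, unnormalized CORAL-style loss, not the exact formulation of any cited method):

import torch

def moment_matching_loss(feat_src, feat_tgt):
    # First moments: penalize the gap between source and target feature means.
    mean_diff = (feat_src.mean(dim=0) - feat_tgt.mean(dim=0)).pow(2).sum()

    # Second moments: penalize the gap between feature covariances.
    def cov(f):
        f = f - f.mean(dim=0)
        return f.t() @ f / (f.size(0) - 1)

    cov_diff = (cov(feat_src) - cov(feat_tgt)).pow(2).sum()
    return mean_diff + cov_diff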

UDA has been applied to medical image analysis for classification [1], object localization, and segmentation [41]. More specifically for skin lesion classification, Chamarthi et al. compare the performance of different UDA methods [10], and Wang et al. demonstrate that applying UDA methods can mitigate model bias against minority groups [73].

II-C Active domain adaptation

UDA methods often exhibit lower performance than their supervised counterparts [67, 14]. Supervised domain adaptation (SDA) requires labels for the target domain; however, especially in the medical imaging field, obtaining large amounts of labels can be cost intensive. Selecting the most informative samples for annotation is therefore crucial. This is where active learning (AL) comes in: AL aims to maximize a model's performance given a limited labeling budget. Active domain adaptation (ADA) combines active learning with domain adaptation. Different active learning methods have different sampling strategies. Broadly, they are based on two types of metrics: uncertainty [46, 59] and diversity [38, 35]. Uncertainty-based methods choose instances with high uncertainty based on measures such as entropy, classification margin, or confidence [22, 24, 72]. Diversity-based methods choose instances that are representative of the entire dataset, optimizing for diversity in the embedding space with clustering or core-set selection [60, 26, 29, 62]. Several methods attempt to combine these two types of metrics [56, 36, 23, 2]. Zhao et al. applied SSL and AL to skin lesion segmentation, but without domain adaptation [79]. No prior work exists on applying ADA methods to skin lesion classification.
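
As an illustration of an uncertainty-based strategy, the sketch below selects the unlabeled target samples with the highest predictive entropy for expert annotation; the convention that the loader yields (index, image) pairs is a hypothetical one:

import torch
import torch.nn.functional as F

@torch.no_grad()
def entropy_sampling(model, unlabeled_loader, budget=10):
    scores, indices = [], []
    for idx, x in unlabeled_loader:                  # yields (index, image)
        p = F.softmax(model(x), dim=1)
        entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=1)
        scores.append(entropy)
        indices.append(idx)
    scores, indices = torch.cat(scores), torch.cat(indices)
    top = scores.topk(budget).indices                # most uncertain samples
    return indices[top]                              # dataset indices to annotate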

III Methodology

Here, we propose a workflow combining SSL and DA methods to improve the generalization of skin lesion classification models. Figure 1 presents the steps of the workflow.

Figure 1: Proposed workflow.

The initial step involves selecting an SSL model pre-trained on natural image datasets, such as ImageNet. Here we select the DINO SSL method, since it has been shown to produce SOTA performance on ImageNet among many SSL methods, e.g. SimCLR, MoCo v2, Barlow Twins, BYOL, and SwAV [50, 8]. Figure 2 shows the DINO architecture. It uses knowledge distillation with a student-teacher architecture. Aug 1 and Aug 2 represent different augmented views of an image, which are fed into the student and teacher networks, respectively. Both networks have the same model architecture. The loss term minimizes the cross-entropy between the features learned by the student and teacher networks. Centering and sharpening are applied to the teacher's output to prevent the model from learning trivial solutions. DINO requires neither a large batch size nor negative samples. We choose ResNet50 [33] as the backbone model because it is used in other similar studies [10], allowing direct comparison of results. The projector head consists of a few linear layers, with a hidden dimension of 256 and an output dimension of 65536. The DINO cross-entropy loss between the teacher's and student's learned features is defined as follows:

\min H(P_t(x), P_s(x)) = \min\,\left(-P_t(x)\log(P_s(x))\right)

P_t(x) and P_s(x) are the probability distributions output by the teacher and student networks, respectively.
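
A minimal sketch of this loss in PyTorch, with centering and sharpening on the teacher side; the temperatures and center momentum below are typical DINO defaults, used here only for illustration:

import torch
import torch.nn.functional as F

def dino_loss(student_out, teacher_out, center,
              student_temp=0.1, teacher_temp=0.04, center_momentum=0.9):
    # Teacher: center then sharpen; no gradients flow through the teacher.
    p_t = F.softmax((teacher_out - center) / teacher_temp, dim=1).detach()
    # Student: sharpened log-probabilities.
    log_p_s = F.log_softmax(student_out / student_temp, dim=1)
    loss = -(p_t * log_p_s).sum(dim=1).mean()      # cross-entropy H(P_t, P_s)
    # EMA update of the center helps prevent collapse to a trivial solution.
    center = center_momentum * center + (1 - center_momentum) * teacher_out.mean(dim=0)
    return loss, center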

Figure 2: DINO SSL architecture.

The first step after model selection involves continuing DINO training (SSL retraining) on the skin lesion datasets. Because no labels are required, training can be performed on all the available skin lesion data. This helps the model generalize from natural images to skin lesion data.

The second step involves fine-tuning the model on a labeled skin lesion dataset. Here, we focus on a common task: binary classification of melanoma (cancer) versus nevus (benign). We freeze the backbone model and fine-tune only the linear classification layer to isolate the impact of feature quality, excluding the influence of a more complex classifier.
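
A sketch of this linear-probing setup in PyTorch; in practice the backbone weights would be loaded from the DINO-retrained checkpoint rather than initialized as below:

import torch.nn as nn
from torchvision.models import resnet50

backbone = resnet50()                 # load DINO-retrained weights in practice
backbone.fc = nn.Identity()           # expose the 2048-d pooled features
for p in backbone.parameters():       # freeze the backbone
    p.requires_grad = False

classifier = nn.Linear(2048, 2)       # melanoma vs. nevus head
model = nn.Sequential(backbone, classifier)
# Only classifier.parameters() are passed to the optimizer.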

The third step involves applying ADA. An optional preliminary step involves applying UDA to align the source and target domains without target domain labels. One popular UDA method is DANN [25], which has shown good performance among UDA methods applied to skin lesion datasets [10]. DANN is an adversarial-training-based method comprising three main components: a feature extractor, a class classifier, and a domain classifier. A gradient reversal layer between the domain classifier and the feature extractor encourages the feature extractor to learn domain-invariant features.
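
The gradient reversal layer can be sketched as below (identity on the forward pass, negated and scaled gradient on the backward pass); this follows the standard DANN construction rather than any project-specific code:

import torch

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lamb):
        ctx.lamb = lamb
        return x.view_as(x)           # identity in the forward direction

    @staticmethod
    def backward(ctx, grad_output):
        # Negate the gradient so the feature extractor learns to fool
        # the domain classifier.
        return -ctx.lamb * grad_output, None

def grad_reverse(x, lamb=1.0):
    return GradReverse.apply(x, lamb)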

Refer to caption
Figure 3: Active domain adaptation architecture.

We compare several ADA methods: AADA [65], CLUE [56], BADGE [2], and random sampling. Figure 3 illustrates the general ADA architecture. We have skin lesion labels for the source domain dataset, but lack labels for the target domain datasets. Both source and target domain data can be input to the model, resulting in two computed losses. L_d represents the domain loss, which trains the feature extractor so that the two domains cannot be distinguished. L_c represents the skin lesion classification loss. Cross-entropy is used to calculate both losses. We apply active learning to select informative samples from the target domain, which are then sent to human experts for annotation. Using these annotated labels, we can then train the model with the classification loss. The domain classifier is optional; adversarial-training-based ADA methods, such as AADA, utilize it, while other methods do not.
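
A hedged sketch of one training step under this architecture, reusing the grad_reverse helper from the DANN sketch above; all module names are illustrative, and the domain classifier is assumed to output a single logit for P(source | x):

import torch
import torch.nn.functional as F

def ada_step(feat_ext, class_clf, domain_clf, x_src, y_src, x_tgt, lamb=1.0):
    f_src, f_tgt = feat_ext(x_src), feat_ext(x_tgt)
    # L_c: classification loss on labeled source samples.
    l_c = F.cross_entropy(class_clf(f_src), y_src)
    # L_d: domain loss computed through the gradient reversal layer.
    feats = grad_reverse(torch.cat([f_src, f_tgt]), lamb)
    dom_logits = domain_clf(feats).squeeze(1)      # one logit: 1 = source
    dom_labels = torch.cat([torch.ones(len(f_src), device=f_src.device),
                            torch.zeros(len(f_tgt), device=f_tgt.device)])
    l_d = F.binary_cross_entropy_with_logits(dom_logits, dom_labels)
    return l_c + l_d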

For example, the active learning sampling criterion of the AADA method is defined as:

S(x) = \frac{1 - G_d(G_f(x))}{G_d(G_f(x))}\, H(G_y(G_f(x)))

It incorporates both a diversity cue, \frac{1 - G_d(G_f(x))}{G_d(G_f(x))}, and an uncertainty cue, H(G_y(G_f(x))), where G_f is the feature extractor, G_y the class classifier, G_d the domain classifier, and H the entropy.
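
A sketch of computing this criterion over a batch of target features, again assuming a single-logit domain classifier for G_d:

import torch
import torch.nn.functional as F

@torch.no_grad()
def aada_score(features, domain_clf, class_clf):
    # Diversity cue (1 - G_d) / G_d: target-like samples score high.
    d = torch.sigmoid(domain_clf(features)).squeeze(1)
    diversity = (1 - d) / d.clamp_min(1e-6)
    # Uncertainty cue H(G_y(G_f(x))): entropy of the class prediction.
    p = F.softmax(class_clf(features), dim=1)
    uncertainty = -(p * p.clamp_min(1e-12).log()).sum(dim=1)
    return diversity * uncertainty     # annotate the top-scoring samples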

IV Experiment

To demonstrate the effectiveness of our approach in improving model generalization, we conduct tests on a set of skin lesion datasets exhibiting domain shifts. We compare our results with the existing approach using UDA methods, and demonstrate the benefits of SSL training and of applying ADA, respectively. When applying the ADA methods, we reflect real-world clinical usage by employing a small sampling size of 10. Given an additional labeling budget, the training strategy can be iterated with a larger number of annotated samples.

IV-A Dataset

Fogelberg et al. [40] created skin lesion datasets representing different domains by grouping data from HAM10000 [69], BCN20000 [34], and MSK [9] according to biological traits such as patient age and lesion location. They quantified the domain shift intensity between groups using cosine similarity and Jensen-Shannon divergence. The details of the grouped datasets are presented in Table I. It lists a total of 11 datasets. Dataset H is the source domain, and the other 10 datasets are target domains. It also shows the number of samples in each dataset, the ratio of melanoma and nevus, and biological traits.

TABLE I: Skin lesion datasets of different domains.
Abbreviation Origin Biological factors Melanoma amount Nevus amount Total sample size
H HAM Age<=30, Loc. = Body (default) 465 (10%) 4234 (90%) 4699
HA HAM Age>30, Loc. = Body 25 (4%) 532 (96%) 557
HLH HAM Age<30, Loc. = Head/Neck 90 (45%) 121 (55%) 220
HLP HAM Age>30, Loc. = Palms/Soles 15 (7%) 203 (93%) 218
B BCN Age>30, Loc. = Body (default) 1918 (41%) 2721 (59%) 4639
BA BCN Age<=30, Loc. = Body 71 (8%) 808 (92%) 879
BLH BCN Age>30, Loc. = Head/Neck 612 (66%) 320 (34%) 932
BLP BCN Age>30, Loc. = Palms/Soles 192 (65%) 105 (35%) 297
M MSK Age>30, Loc. = Body (default) 565 (31%) 1282 (69%) 1847
MA MSK Age<=30, Loc. = Body 37 (8%) 427 (92%) 464
MLH MSK Age>30, Loc. = Head/Neck 175 (60%) 117 (40%) 292
TABLE II: Comparison of our SSL methods with the BSP UDA method and the baseline reported by Chamarthi et al. [10].
HA HLH HLP B BA BLH BLP M MA MLH
Mel (ratio) 0.04 0.45 0.07 0.41 0.08 0.66 0.65 0.31 0.08 0.6
Baseline 0.14±0.02 0.69±0.04 0.37±0.15 0.57±0.02 0.19±0.06 0.73±0.03 0.77±0.05 0.34±0.01 0.15±0.04 0.68±0.03
BSP 0.16±0.03 0.82±0.02 0.65±0.04 0.75±0.01 0.34±0.05 0.86±0.01 0.83±0.02 0.46±0.02 0.17±0.03 0.73±0.01
DINO retrained 0.17±0.04 0.86±0.01 0.50±0.09 0.77±0.01 0.31±0.09 0.84±0.01 0.86±0.02 0.58±0.02 0.24±0.05 0.75±0.04
DINO pre-trained 0.13±0.04 0.89±0.03 0.60±0.15 0.76±0.01 0.34±0.07 0.83±0.01 0.83±0.02 0.52±0.01 0.12±0.02 0.77±0.01
SL pre-trained 0.16±0.04 0.79±0.03 0.71±0.12 0.73±0.01 0.33±0.05 0.83±0.02 0.80±0.02 0.52±0.01 0.20±0.02 0.70±0.02

IV-B Metrics

We use the AUPRC metric because it effectively handles imbalanced datasets, giving greater weight to the minority cancer class. As shown in Table I, many of the skin lesion datasets are highly imbalanced, with the cancer class comprising as little as 4% of the data. We don't use accuracy or AUROC because they can paint an overly optimistic picture on highly imbalanced datasets. However, we must account for the varying AUPRC baselines across datasets: the baseline of a no-skill classifier equals each dataset's melanoma ratio.
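
As a toy illustration (the labels and scores below are made up), AUPRC can be computed with scikit-learn's average precision, and a no-skill classifier scores roughly the melanoma prevalence:

import numpy as np
from sklearn.metrics import average_precision_score

y_true = np.array([0, 0, 0, 1, 0, 1])                # 1 = melanoma
y_score = np.array([0.1, 0.7, 0.2, 0.8, 0.4, 0.6])   # predicted probabilities
print(f"AUPRC = {average_precision_score(y_true, y_score):.3f}")
# The no-skill baseline here equals the prevalence, 2/6 ≈ 0.33.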

IV-C Model Training

We obtained the pre-trained DINO ResNet50 model from Facebook Research's GitHub repository (https://github.com/facebookresearch/dino?tab=readme-ov-file). When retraining the model with DINO SSL on the skin lesion data, we utilized all datasets listed in Table I. We employed the default DINO data augmentations: random cropping into two global views (0.4-1.0 scale) and four local views (0.05-0.4 scale), rotation, color jitter, blur, and grayscale transformations. Only the global views are passed to the teacher model, whereas all views are passed to the student model; this encourages the model to learn "local-to-global" correspondence. We explored various data augmentation strategies and found that the default setting yielded the best results. Our optimized training process involves initially freezing the backbone model and training only the prediction head layers with a starting learning rate (LR) of 1e-3 for 30 epochs. Subsequently, we trained all the layers using adaptive learning rates, with initial LR values of 1e-4 for the backbone layers and 5e-4 for the projector layers, under a one-cycle LR scheduler, for 25 epochs, at which point the loss plateaued.
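
A sketch of the second-stage setup with per-group LRs and a one-cycle schedule; the AdamW optimizer and the stand-in modules below are illustrative assumptions, not confirmed training details:

import torch
import torch.nn as nn
from torch.optim.lr_scheduler import OneCycleLR

backbone = nn.Linear(8, 8)      # stand-in for the ResNet50 backbone
projector = nn.Linear(8, 4)     # stand-in for the DINO projector head
steps_per_epoch = 100           # len(train_loader) in practice

optimizer = torch.optim.AdamW([
    {"params": backbone.parameters(), "lr": 1e-4},   # backbone LR
    {"params": projector.parameters(), "lr": 5e-4},  # projector LR
])
scheduler = OneCycleLR(optimizer, max_lr=[1e-4, 5e-4],
                       epochs=25, steps_per_epoch=steps_per_epoch)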

We then fine-tuned the model on the skin lesion source domain dataset, H. We froze the backbone and trained only the classifier head (a single linear layer), to isolate the impact of the backbone feature quality.

Model training was performed on an RTX 4090 GPU with a batch size of 128 and an image size of 224 x 224 pixels.

V Results and discussions

We compared the skin lesion classification performance of different methods on the target domains using the AUPRC metric. We present the comparison results alongside the baseline (no domain adaptation) and the BSP UDA method reported in [10]. Table II presents the results, with the mean and standard deviation reported across 5 random seeds.

First, compared with the results reported in [10] for the baseline and BSP methods, our approach achieved superior performance on 9 of the 10 target domains, with slightly worse results observed only for the BLH dataset. This demonstrates that SSL retraining on in-domain data is an effective UDA method.

Next, we compared the performance of the SL pre-trained model with the SSL DINO pre-trained model on ImageNet. The SSL DINO pre-trained model could be further retrained on the skin lesion datasets. We observed that the SL pre-trained model performed worse than the SSL pre-trained model on 9 of the 10 datasets. Therefore, SSL pre-training is superior to SL pre-training for domain adaptation. Furthermore, the SSL DINO retrained model exhibited improved performance on 6 of the 10 datasets compared to the model without retraining. Thus, retraining on in-domain data is advantageous for domain adaptation.

Figure 4: AUPRC of the ADA methods on each target domain.
Figure 5: AUPRC delta of the ADA methods relative to the baseline.

Here, we present the results of applying ADA methods with 10 annotated samples from the target domain. Figure 4 shows the AUPRC achieved by the ADA methods compared to the baseline (without ADA), which corresponds to the DINO retrained model from Table II.

We compared five different ADA methods. An ADA method comprises two components: an active learning strategy and a domain adaptation method. The active learning strategies we evaluated were CLUE, AADA, BADGE, and uniform sampling. The domain adaptation methods we evaluated were MME [58], DANN, and fine-tuning. The simplest ADA method combines uniform (random) sampling with fine-tuning.

Significant performance variability exists across the different target domains, primarily because the AUPRC baselines vary and are directly related to the melanoma ratios. A given ADA method also performs differently across target domains, which primarily stems from the differing degrees of domain shift relative to the source domain.

To better visualize the differences among the five ADA methods, Figure 5 presents the AUPRC delta between each ADA method and the baseline. The AADA-DANN method exhibited the best performance, performing better than or comparably to the baseline on 9 of the 10 datasets, with the exception of the HLP dataset. One potential explanation is the small number of melanoma samples (only 15) in HLP, leading to greater variance in performance.

VI Conclusion

In this paper, we proposed a workflow to improve the generalization of skin lesion classification models by combining SSL and ADA methods. Our results demonstrated the benefits of each component: SSL proved to be an effective UDA method, and ADA provided further performance improvements. We compared five ADA methods and showed that AADA-DANN yielded the best performance. We evaluated performance on 10 datasets exhibiting varying degrees of domain shift, which mimics real-world clinical scenarios where domain shifts are common. Thus, our approach can facilitate wider clinical adoption of skin lesion classification models. Future studies could explore different backbone architectures, such as vision transformers, given their demonstrated strong performance on natural image datasets.

References

  • [1] Euijoon Ahn, Ashnil Kumar, Michael Fulham, Dagan Feng, and Jinman Kim. Unsupervised domain adaptation to classify medical images using zero-bias convolutional auto-encoders and context-based feature augmentation. IEEE TMI, 39(7):2385–2394, 2020.
  • [2] Jordan T Ash, Chicheng Zhang, Akshay Krishnamurthy, John Langford, and Alekh Agarwal. Deep batch active learning by diverse, uncertain gradient lower bounds. In International Conference on Learning Representations, 2020.
  • [3] S. Azizi et al. Robust and efficient medical imaging with self-supervision, 2022. Preprint at https://arxiv.org/abs/2205.09723.
  • [4] S. Azizi, B. Mustafa, F. Ryan, Z. Beaver, J. Freyberg, J. Deaton, A. Loh, A. Karthikesalingam, S. Kornblith, T. Chen, et al. Big self-supervised models advance medical image classification. In ICCV, 2021.
  • [5] J.-B. Grill, F. Strub, F. Altché, and C. Tallec. Bootstrap your own latent: a new approach to self-supervised learning. Adv. Neural Inf. Proc. Syst., 33, 2020.
  • [6] Luana Barros et al. Assessing the generalizability of deep neural networks-based models for black skin lesions. In Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications, pages 1–14. Springer Nature Switzerland, 2023.
  • [7] T. B. Brown, B. Mann, N. Ryder, M. Subbiah, J. Kaplan, P. Dhariwal, A. Neelakantan, P. Shyam, G. Sastry, A. Askell, et al. Language models are few-shot learners, 2020. arXiv preprint arXiv:2005.14165.
  • [8] M. Caron et al. Emerging properties in self-supervised vision transformers. In Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), pages 9650–9660, 2021.
  • [9] C. Cassidy, A. Kendrick, J. Brodzicki, J. Jaworek-Korjakowska, and M. Yap. Analysis of the ISIC image datasets: usage, benchmarks and recommendations. Medical Image Analysis, 75:102305, 2022. https://doi.org/10.1016/j.media.2021.102305.
  • [10] Sireesha Chamarthi, Katharina Fogelberg, Titus J. Brinker, and Julia Niebling. Mitigating the influence of domain shift in skin lesion classification: A benchmark study of unsupervised domain adaptation methods. Informatics in Medicine Unlocked, 44:101430, 2024.
  • [11] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton. A simple framework for contrastive learning of visual representations. Proc. 37th Int. Conf. Mach. Learn., PMLR 119, pages 1597–1607, 2020.
  • [12] X. Chen and K. He. Exploring simple siamese representation learning. In Proc. IEEE/CVF Conf. Comput. Vis. Pattern Recognit. (CVPR), pages 15750–15758, 2021.
  • [13] X. Chen, S. Wang, M. Long, and J. Wang. Transferability vs. discriminability: Batch spectral penalization for adversarial domain adaptation. In Proceedings of the 36th International Conference on Machine Learning, Proceedings of Machine Learning Research 97, pages 1081–1090, 2019.
  • [14] Y. Chen, W. Li, C. Sakaridis, D. Dai, and L. Van Gool. Domain adaptive faster r-cnn for object detection in the wild. In CVPR, 2018.
  • [15] Marc Combalia et al. Validation of artificial intelligence prediction models for skin cancer diagnosis using dermoscopy images: the 2019 international skin imaging collaboration grand challenge. The Lancet Digital Health, 2022.
  • [16] A. Coustasse, R. Sarkar, B. Abodunde, B. J. Metzger, and C. M. Slater. Use of teledermatology to improve dermatological access in rural areas. Telemed. J. E Health, 25:1022–1032, 2019.
  • [17] Roxana Daneshjou et al. Disparities in dermatology AI performance on a diverse, curated clinical image set. Sci. Adv., 8:eabq6147, 2022.
  • [18] J. Deng, W. Dong, R. Socher, L.-J. Li, K. Li, and L. Fei-Fei. ImageNet: A large-scale hierarchical image database. In CVPR09, 2009.
  • [19] J. Devlin, M.-W. Chang, K. Lee, and K. Toutanova. BERT: Pre-training of deep bidirectional transformers for language understanding, 2018. Preprint at https://arxiv.org/abs/1810.04805.
  • [20] C. Doersch, A. Gupta, and A. A. Efros. Unsupervised visual representation learning by context prediction. In Proceedings of the IEEE international conference on computer vision, pages 1422–1430, 2015.
  • [21] C. Doersch and A. Zisserman. Multi-task self-supervised visual learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 2051–2060, 2017.
  • [22] Melanie Ducoffe and Frederic Precioso. Adversarial active learning for deep networks: a margin based approach, 2018. arXiv preprint arXiv:1802.09841.
  • [23] B. Fu, Z. Cao, J. Wang, and M. Long. Transferable query selection for active domain adaptation. In 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7268–7277, Nashville, TN, USA, 2021.
  • [24] Yarin Gal, Riashat Islam, and Zoubin Ghahramani. Deep bayesian active learning with image data. In Proceedings of the 34th International Conference on Machine Learning-Volume 70, pages 1183–1192. JMLR. org, 2017.
  • [25] Y. Ganin and V. Lempitsky. Unsupervised domain adaptation by backpropagation. In International conference on machine learning, pages 1180–1189. PMLR, 2015. https://dl.acm.org/doi/10.5555/3045118.3045244.
  • [26] Yonatan Geifman and Ran El-Yaniv. Deep active learning over the long tail, 2017. arXiv preprint arXiv:1711.00941.
  • [27] S. Gidaris, P. Singh, and N. Komodakis. Unsupervised representation learning by predicting image rotations, 2018. arXiv preprint arXiv:1803.07728.
  • [28] Syed Qasim Gilani, Muhammad Umair, Maryam Naqvi, Oge Marques, and Hee-Cheol Kim. Adversarial training based domain adaptation of skin cancer images. Life, 14(8):1009, 2024.
  • [29] Daniel Gissin and Shai Shalev-Shwartz. Discriminative active learning, 2019. arXiv preprint arXiv:1907.06347.
  • [30] H. Haggerty et al. Self-supervised learning for skin cancer diagnosis with limited training data, 2024. arXiv preprint.
  • [31] S.S. Han, C. Navarrete-Dechent, K. Liopyris, et al. The degradation of performance of a state-of-the-art skin image classifier when applied to patient-driven internet search. Sci Rep, 12:16260, 2022.
  • [32] K. He, H. Fan, Y. Wu, S. Xie, and R. Girshick. Momentum contrast for unsupervised visual representation learning. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 9729–9738, 2020.
  • [33] K. He, X. Zhang, S. Ren, and J. Sun. Deep residual learning for image recognition. In 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE, June 2016. https://doi.org/10.1109/cvpr.2016.90.
  • [34] C. Hernández-Pérez, M. Combalia, S. Podlipnik, et al. BCN20000: Dermoscopic lesions in the wild. Sci Data, 11:641, 2024.
  • [35] S. C. Hoi, R. Jin, J. Zhu, and M. R. Lyu. Semisupervised svm batch mode active learning with applications to image retrieval. ACM Transactions on Information Systems (TOIS), 27(3):16, 2009.
  • [36] Duojun Huang et al. Divide and adapt: Active domain adaptation via customized learning. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7651–60. IEEE, 2023.
  • [37] SC. Huang, A. Pareek, M. Jensen, et al. Self-supervised learning for medical image classification: a systematic review and implementation guidelines. npj Digit. Med., 6:74, 2023.
  • [38] S. Dutt Jain and K. Grauman. Active image segmentation propagation. In CVPR, 2016.
  • [39] Ying Jin, Ximei Wang, Mingsheng Long, and Jianmin Wang. Minimum class confusion for versatile domain adaptation. In Computer Vision – ECCV 2020: 16th European Conference, Glasgow, UK, August 23–28, 2020, Proceedings, Part XXI, pages 464–480, Berlin, Heidelberg, 2020. Springer-Verlag.
  • [40] K. Fogelberg et al. Domain shifts in dermoscopic skin cancer datasets: Evaluation of essential limitations for clinical translation, 2023. arXiv preprint.
  • [41] Konstantinos Kamnitsas, Christian Baumgartner, et al. Unsupervised domain adaptation in brain lesion segmentation with adversarial networks. In IPMI, pages 597–609, 2017.
  • [42] Mingu Kang et al. Benchmarking self-supervised learning on diverse pathology datasets. In 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 3344–54. IEEE, 2023.
  • [43] M. A. Khan et al. Attributes based skin lesion detection and recognition: A mask RCNN and transfer learning-based deep learning framework. Pattern Recogn. Lett., 143:58–66, 2021.
  • [44] M. A. Khan, M. Y. Javed, M. Sharif, T. Saba, and A. Rehman. Multi-model deep neural network based features extraction and optimal selection approach for skin lesion classification. In International Conference on Computer and Information Sciences (ICCIS), pages 1–7, 2019.
  • [45] G. Larsson, M. Maire, and G. Shakhnarovich. Colorization as a proxy task for visual understanding. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pages 6874–6883, 2017.
  • [46] D. D. Lewis and J. Catlett. Heterogeneous uncertainty sampling for supervised learning. In Machine Learning Proceedings 1994, page 148–156. Elsevier, 1994.
  • [47] J. Li, T. Lin, and Y. Xu. SSLP: Spatial guided self-supervised learning on pathological images. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 3–12, 2021.
  • [48] M. Long, Y. Cao, J. Wang, and M. Jordan. Learning transferable features with deep adaptation networks. In International conference on machine learning, pages 97–105. PMLR, 2015. https://dl.acm.org/doi/10.5555/3045118.3045130.
  • [49] M. Long, H. Zhu, J. Wang, and M. Jordan. Deep transfer learning with joint adaptation networks. In International conference on machine learning, pages 2208–2217. PMLR, 2017. https://dl.acm.org/doi/10.5555/3305890.3305909.
  • [50] M. Oquab et al. DINOv2: Learning robust visual features without supervision, 2023. arXiv preprint.
  • [51] Dwarikanath Mahapatra, Ruwan Tennakoon, Yasmeen George, Sudipta Roy, Behzad Bozorgtabar, Zongyuan Ge, and Mauricio Reyes. ALFREDO: Active learning with feature disentanglement and DOmain adaptation for medical image classification. Medical Image Analysis, 97:103261, 2024.
  • [52] Christos Matsoukas et al. What makes transfer learning work for medical images: Feature reuse & other factors. In 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 9215–24. IEEE, 2022.
  • [53] Mohammad Amin Morid, Alireza Borjali, and Guilherme Del Fiol. A scoping review of transfer learning research on medical image analysis using imagenet. Computers in biology and medicine, page 104115, 2020.
  • [54] Basil Mustafa, Aaron Loh, Jan Freyberg, Patricia MacWilliams, Megan Wilson, Scott Mayer McKinney, Marcin Sieniek, Jim Winkens, Yuan Liu, Peggy Bui, et al. Supervised transfer learning at scale for medical imaging, 2021. arXiv preprint arXiv:2101.05913.
  • [55] D. Pathak, P. Krahenbuhl, J. Donahue, T. Darrell, and A. A. Efros. Context encoders: Feature learning by inpainting. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 2536–2544, 2016.
  • [56] Viraj Prabhu et al. Active domain adaptation via clustering uncertainty-weighted embeddings. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV), pages 8485–94. IEEE, 2021.
  • [57] Kuniaki Saito et al. Maximum classifier discrepancy for unsupervised domain adaptation. In 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 3723–32. IEEE, 2018.
  • [58] Kuniaki Saito, Donghyun Kim, Stan Sclaroff, Trevor Darrell, and Kate Saenko. Semi-supervised domain adaptation via minimax entropy. In 2019 IEEE/CVF International Conference on Computer Vision (ICCV), page 8049–8057. IEEE, Oct. 2019.
  • [59] T. Scheffer, C. Decomain, and S. Wrobel. Active hidden markov models for information extraction. In International Symposium on Intelligent Data Analysis, page 309–318. Springer, 2001.
  • [60] O. Sener and S. Savarese. Active learning for convolutional neural networks: A core-set approach. In ICLR, 2018.
  • [61] B. Shetty, R. Fernandes, A.P. Rodrigues, et al. Skin lesion classification of dermoscopic images using machine learning and convolutional neural network. Sci Rep, 12:18134, 2022.
  • [62] Samarth Sinha, Sayna Ebrahimi, and Trevor Darrell. Variational adversarial active learning. In Proceedings of the IEEE International Conference on Computer Vision, pages 5972–5981, 2019.
  • [63] H. Sowrirajan, J. Yang, A. Y. Ng, and P. Rajpurkar. MoCo pretraining improves representation and transferability of chest x-ray models. In Medical Imaging with Deep Learning, pages 728–744, 2021.
  • [64] C. L. Srinidhi and A. L. Martel. Improving self-supervised learning with hardness-aware dynamic curriculum learning: An application to digital pathology. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 562–571, 2021.
  • [65] Jong-Chyi Su et al. Active adversarial domain adaptation. In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 728–37. IEEE, 2020.
  • [66] B. Sun and K. Saenko. Deep CORAL: Correlation alignment for deep domain adaptation. In Computer Vision–ECCV 2016 Workshops: Amsterdam, The Netherlands, October 8-10 and 15-16, 2016, Proceedings, Part III 14, pages 443–450. Springer, 2016. https://doi.org/10.1007/978-3-319-49409-8_35.
  • [67] Y.-H. Tsai, W.-C. Hung, S. Schulter, K. Sohn, M.-H. Yang, and M. Chandraker. Learning to adapt structured output space for semantic segmentation. In CVPR, 2018.
  • [68] P. Tschandl et al. Comparison of the accuracy of human readers versus machine-learning algorithms for pigmented skin lesion classification: an open, web-based, international, diagnostic study. The Lancet Oncology, 2019.
  • [69] P. Tschandl, C. Rosendahl, and H. Kittler. The HAM10000 dataset, a large collection of multi-source dermatoscopic images of common pigmented skin lesions. Sci. Data, 5:180161, 2018.
  • [70] Eric Tzeng et al. Adversarial discriminative domain adaptation. In 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2962–71. IEEE, 2017.
  • [71] V. Useini, S. Tanadini-Lang, Q. Lohmeyer, et al. Automatized self-supervised learning for skin lesion screening. Sci Rep, 14:12697, 2024.
  • [72] Dan Wang and Yi Shang. A new active labeling method for deep learning. In 2014 International joint conference on neural networks (IJCNN), pages 112–119. IEEE, 2014.
  • [73] Janet Wang et al. Achieving reliable and fair skin lesion diagnosis via unsupervised domain adaptation. In 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW), pages 5157–66. IEEE, 2024.
  • [74] D. Wolf, T. Payer, C.S. Lisson, et al. Self-supervised pre-training with contrastive and masked autoencoder methods for dealing with small datasets in deep learning for medical imaging. Sci Rep, 13:20260, 2023.
  • [75] Yinhao Wu, Bin Chen, An Zeng, Dan Pan, Ruixuan Wang, and Shen Zhao. Skin cancer classification with deep learning: A systematic review. Frontiers in Oncology, 12, 2022.
  • [76] Jiaolong Xu et al. Self-supervised domain adaptation for computer vision tasks. IEEE Access, 7:156694–706, 2019.
  • [77] J. R. Zech et al. Variable generalization performance of a deep learning model to detect pneumonia in chest radiographs: A cross-sectional study. PLoS Med., 15:e1002683, 2018.
  • [78] W. Zellinger, T. Grubinger, E. Lughofer, T. Natschläger, and S. Saminger-Platz. Central moment discrepancy (CMD) for domain-invariant representation learning, 2017. arXiv preprint arXiv:1702.08811. https://arxiv.org/abs/1702.08811.
  • [79] Z. Zhao, W. Lu, Z. Zeng, K. Xu, B. Veeravalli, and C. Guan. Self-supervised assisted active learning for skin lesion segmentation. In 2022 44th Annual International Conference of the IEEE Engineering in Medicine & Biology Society, 2022.