${}_{\!}$ pen-CD: A Comprehensive Toolbox for Change Detection

Kaiyu Li
Xi’an Jiaotong University
[email protected] Equal contribution. Jiawei Jiang^∗
Sun Yat-Sen University
[email protected] Andrea Codegoni^∗
University of Pavia
[email protected] Chengxi Han^∗
Wuhan University
[email protected] Yupeng Deng^∗
Chinese Academy of Sciences
[email protected] Keyan Chen^∗
Beihang University
[email protected] Zhuo Zheng^∗
Stanford University
[email protected] Hao Chen^∗
Shanghai AI Laboratory
[email protected] Zhengxia Zou
Beihang University
[email protected] Zhenwei Shi
Beihang University
[email protected] Sheng Fang
Shandong University of Science and Technology
[email protected] Deyu Meng
Xi’an Jiaotong University
[email protected] Zhi Wang
Xi’an Jiaotong University
[email protected] Xiangyong Cao
Xi’an Jiaotong University
[email protected]

Abstract

We present Open-CD, a change detection toolbox that contains a rich set of change detection methods as well as related components and modules. The toolbox started from a series of open source general vision task tools, including OpenMMLab Toolkits¹¹1https://openmmlab.com, PyTorch Image Models²²2https://github.com/huggingface/pytorch-image-models, etc. It gradually evolves into a unified platform that covers many popular change detection methods and contemporary modules. It not only includes training and inference codes, but also provides some useful scripts for data analysis. We believe this toolbox is by far the most complete change detection toolbox. In this report, we introduce the various features, supported methods and applications of Open-CD. In addition, we also conduct a benchmarking study on different methods and components. We wish that the toolbox and benchmark could serve the growing research community by providing a flexible toolkit to reimplement existing methods and develop their own new change detectors. Code and models are available at https://github.com/likyoo/open-cd. Pioneeringly, this report also includes brief descriptions of the algorithms supported in Open-CD, mainly contributed by their authors³³3Open-CD TRP: https://github.com/likyoo/open-cd/tree/main/projects/open-cd_technical_report. We sincerely encourage researchers in this field to participate in this project and work together to create a more open community. This toolkit and report will be kept updated.

1 Introduction

Change detection is a fundamental remote sensing image interpretation task. The pipeline of change detection frameworks is usually similar to image segmentation tasks. But in particular, there is a pair of images as input, and the task focuses on detecting pixel-level differences in the bi-temporal images. Therefore, change detection is more complicated than single-temporal segmentation. And like other high-level vision tasks, different implementation settings can lead to very different results. Towards the goal of providing a high quality codebase and unified benchmark, we build Open-CD, an open-source change detection codebase.

Major features of Open-CD are: (1) Reliable dependencies. Open-CD is built on the OpenMMLab Toolkits, especially MMCV⁴⁴4https://github.com/open-mmlab/mmcv, MMEngine⁵⁵5https://github.com/open-mmlab/mmengine, MMPretrain⁶⁶6https://github.com/open-mmlab/mmpretrain, MMSegmentation⁷⁷7https://github.com/open-mmlab/mmsegmentation, and MMDetection⁸⁸8https://github.com/open-mmlab/mmdetection [6], and can arbitrarily call their components, including pipeline, model, module, loss, data augmentation, etc., in the config file. (2) Modular design. We decompose the change detection framework into different components and users can easily construct a customized change detection method by combining different modules. (3) Support of multiple methods out of box. Open-CD supports typical and popular change detection methods, see Section 2 for the full list. (4) State of the art. In most cases, the Open-CD implementation performs better than the official one. It also helps some users win change detection challenges. (5) High efficiency. Depending on efficient MMEngine and MMCV, the training speed of Open-CD is faster than or comparable to other codebases.

In this report, we introduce the composition and workflow of the Open-CD toolbox, and conduct a benchmarking study on the typical or state-of-the-art change detection algorithms, including parameter, inference speed, etc. Apart from the above, this report provides a brief description for each of the supported algorithms in Open-CD. Specifically, we have launched the Open-CD Technical Report Plan (Open-CD TRP for shot). We invite some authors to introduce their algorithms and participate in the construction of the Open-CD codebase. This plan is under active development and we will keep this report updated.

The rest of this report is organized as follows. We first introduce various supported methods, followed by a focus on features of Open-CD. Then, we present the benchmark results. Finally, we give examples of application scenarios of Open-CD, especially in the era of foundation models.

Table 1: Supported features of different change deetction codebases. “✓” means officially supported and blank means not supported.

	Open-CD	change_detection.pytorch⁹⁹9https://github.com/likyoo/change_detection.pytorch	ChangeDetectionRepository¹⁰¹⁰10https://github.com/ChenHongruixuan/ChangeDetectionRepository	CDLab¹¹¹¹11https://github.com/Bobholamovic/CDLab	ChangeDetectionToolbox¹²¹²12https://github.com/Bobholamovic/ChangeDetectionToolbox

Tasks:
Binary	✓	✓	✓	✓	✓
Semantic	✓
Customized Models:
Classification Nets	✓	✓
Segmentation Nets	✓	✓
Traditional Methods:
CVA [22]			✓		✓
DPCA [10]					✓
ImageDiff [21]					✓
ImageRatio [20]					✓
ImageRegr [20]					✓
MAD [24]			✓		✓
IRMAD [23]					✓
PCA-Kmeans [3]			✓		✓
PCDA [36]					✓
Deep Learning Methods:
FC-EF [9]	✓			✓
FC-Siam-Diff [9]	✓			✓
FC-Siam-Conc [9]	✓			✓
STANet [4]	✓	✓		✓
DSIFN [35]	✓			✓
L-UNet [26]				✓
SNUNet [11]	✓		✓	✓
BIT [5]	✓			✓
ChangeStar [37]	✓
DSAMNet [28]				✓
P2V-CD [18]				✓
ChangeFormer [2]	✓
Changer [12]	✓
TinyCD [8]	✓
HANet [13]	✓
LightCDNet [32]	✓
CGNet [14]	✓
BAN [17]	✓
TTP [7]	✓

2 Supported methods

Open-CD contains high-quality implementations of popular change detection methods. A summary of supported frameworks and features compared with other codebases is provided in Table 1. A list is given as follows.

2.1 Change Detection Methods

•

FC-EF [9] follows U-Net architecture. It concatenates the bi-temporal images before passing them through the network, treating them as different color channels.
•

FC-Siam-Conc [9] is a Siamese network. Compared to FC-EF, it separates the encoder layers into two streams with the same structure and shared weights (i.e., Siamese network) that process the bi-temporal images separately. In the skip connection, it concatenates the bi-temporal features and feeds them to the decoder.
•

FC-Siam-Diff [9] is a variant of FC-Siam-Conc. Instead of concatenating both features from the encoding streams, it calculates the absolute value of their difference.
•

STANet [4] is a metric-based change detection model. It consists of feature extractor, spatial-temporal attention module and metric module. The spatial-temporal attention module is a variant of self-attention in change detection task, which captures the global spatial–temporal relationships in the whole space-time to obtain more discriminative features. This attention module has two forms, basic spatial-temporal attention module (BAM) and pyramid spatial-temporal attention module (PAM). In addition, to reduce the effect of class imbalance, a class-sensitive loss, batch-balanced contrastive loss (BCL), is designed in STANet, which uses the number of samples in each class within a batch to determine the weight factor in the loss function.
•

DSIFN [35] builds on the FC-Siam-conc model. It uses channel and spatial attention to construct stronger basic modules (like CBAM) and also introduces deep supervision for fast convergence and performance improvement of the model.
•

SNUNet [11] is a standard encoder-decoder architecture and uses the Siamese network as encoder. To maintain high-resolution features and fine-grained localization information, SNUNet uses the dense skip connection mechanism between the encoder and decoder (like UNet++). For fusing multi-granularity features, the ensemble channel attention module (ECAM) is designed to automatically select and focus on more effective information between different decoder groups. Structurally, ECAM is a natural expansion of plain channel attention module (CAM) in deep supervision and ensemble learning.
•

BIT [5] is a CNN-Transformer based change detction model. It abstracts pixel-level features into several visual tokens and models spatio-temporal context information in a compact token space. Compared with directly extracting dense spatio-temporal semantic correlations in pixel-level space, on the one hand, it reduces the interference of redundant information in image space through pixel-level feature aggregation, and uses Transformer encoders to construct spatio-temporal correlations, which significantly reduces computational complexity. The calculation efficiency is improved. On the other hand, the Transformer decoder is used to enhance the original pixel-level features with the learned context information, and the global spatio-temporal information is used to enhance the pixel-level representation, which improves the ability of the model to identify objects of interest and exclude irrelevant changes.
•

ChangeStar [37] and ChangeStar2 [40] are simple yet unified change detectors capable of addressing binary change detection, object change detection, and semantic change detection, which is composed of a Siamese dense feature extractor and ChangeMixin or ChangeMixin2. This design is inspired by reusing the modern semantic segmentation architecture because semantic segmentation and change detection are both dense prediction tasks. ChangeMixin and ChangeMixin2 enable any off-the-shelf deep semantic segmentation network to detect changes.
•

ChangeFormer [2] is a variant of SegFormer [31] in change detection task. Different from SegFormer, it uses a Siamese customized mix transformer (MiT) as the encoder. The bi-temporal features in the Siamese network are concatenated and then fed into the decoder consisting of multi-layer perceptron (MLP) layers.
•

Changer [12] series models emphasize interactions between bi-temporal branches. There are two simple interaction strategies: aggregation-distribution (AD) and feature “exchange.” Specifically, the AD interaction is abstracted from some co-/cross-attention mechanisms. The “exchange” interaction is a completely parameter and computation-free operation, which is achieved by exchanging bi-temporal feature maps in the spatial or channel dimension, and the exchanged features are mixed as they pass through subsequent convolution or token mixer. In addition, the flow-based dual-alignment fusion (FDAF) module is proposed to overcome the problem of side-looking and misalignment in multi-temporal images.
•

TinyCD [8] aims to develop a deep learning model that significantly reduces memory consumption and computational complexity. To achieve this, TinyCD extracts low-level features from two images using the early layers of a selected Siamese backbone. This approach minimizes the model’s size and leverages the locality structure of features induced by the inductive bias inherent in the initial stages of a fully convolutional backbone. Subsequently, TinyCD introduces an attention module called MAMB. This module comprises an initial mixing stage that employs grouped convolution to integrate semantically similar features from the two images. Following this, a MLP fuses the mixed features into a single heatmap. This heatmap serves as a skip connection, reweighting the upsampled features during the upsampling phase of the Siamese U-Net. This process functions as a spatio-temporal attention mechanism. In the final stage, TinyCD utilizes a multi-layer perceptron as a pixel-level classifier to produce a probability score indicating change for each pixel. These architectural choices result in a model with a significantly reduced number of parameters and computational complexity while still achieving state-of-the-art performance. Consequently, TinyCD presents a valuable alternative to other leading models, particularly in scenarios where rapid training and deployment on low-end devices are critical.
•

HANet [13] is a discriminative Siamese network, hierarchical attention network, which can integrate multiscale features and refine detailed features. HANet has four progressive foreground-balanced sampling strategies based on not adding change information to help the model accurately learn the features of the changed pixels during the early training process and thereby improve detection performance. The main part of HANet is the HAN module, which is a lightweight and effective self-attention mechanism.
•

LightCDNet [32] is a lightweight and accurate change detection model designed for real-world application deployment. LightCDNet follows the standard “encoder-decode” structure, but unlike most Siamese network-based models, this model have designed a novel early fusion module to facilitate cross-temporal information fusion, called the deep supervised fusion module (DSFM), which effectively enhances the information preservation capability of the change detection model even with a small number of parameters. On the other hand, LightCDNet uses an improved ShuffleNetV2 as the encoder, coupled with a feature pyramid decoder to achieve high-precision change detection performance. LightCDNet provide versions with different parameter quantities for flexible deployment.
•

CGNet [14] is the change guiding network to tackle the insufficient expression problem of change features in the conventional U-Net structure adopted in previous methods. It contains a self-attention module named change guide module, which can effectively capture the long-distance dependency among pixels and effectively overcomes the problem of the insufficient receptive field of traditional convolutional neural networks.
•

BAN [17] is a universal foundation model-based change detection adaptation framework aiming to extract the knowledge of foundation models for change detection. It contains three parts, i.e. frozen foundation model (e.g., CLIP), bi-temporal adapter branch (Bi-TAB), and bridging modules between them. Specifically, BAN extracts general features through a frozen foundation model, which are then selected, aligned, and injected into Bi-TAB via the bridging modules. Bi-TAB is designed as a model-agnostic concept to extract task/domain-specific features, which can be either an existing arbitrary change detection model or some hand-crafted stacked blocks.
•

TTP [7] enhances high-precision change detection in complex spatio-temporal remote sensing scenarios by integrating the general knowledge of foundational visual models into the change detection task. It leverages the potential of foundational models even with limited annotated change detection data. TTP overcomes domain shift issues during knowledge transfer and addresses the challenge of expressing both homogeneous and heterogeneous features in multi-temporal images. Specifically, TTP utilizes general segmentation knowledge based on the segment anything model (SAM) by introducing low-rank fine-tuning parameters into the SAM backbone, which mitigates spatial semantic domain shifts. Furthermore, TTP proposes a time-travel activation gate, enabling temporal features to permeate the pixel semantic space, thus enhancing the foundational model’s ability to understand both homogeneous and heterogeneous features of bi-temporal images. Lastly, an efficient multi-level change prediction head is designed to decode dense and high-level semantic change features within the foundational model. This method paves the way for more accurate and efficient change detection in remote sensing imagery.

Refer to caption — Figure 1: The overall architecture of Open-CD.

2.2 Customized Change Detection Models

There is no doubt that the construction of change detection models benefits from other fundamental tasks in general vision, especially classification and segmentation. The former provides powerful feature extractors (backbones) and the latter provides extensive change mask generation schemes. Therefore, a core aspect of Open-CD’s design is to allow users to arbitrarily call components from other vision models and to make it easier for change detection to benefit from advanced general modules. This is an important reason why we build Open-CD on top of OpenMMLab toolkits.

Another change detection codebase that allows for custom module combinations is change_detection.pytorch (CDP), which is built on a ”Siamese encoder + feature fusion + decoder” mode and supports some common backbone networks and segmentation heads. However, since CDP relies on developers to frequently code to support the latest models, it is difficult to maintain and update. Open-CD, on the other hand, is able to benefit directly from the constantly updated OpenMMLab Toolkits, and to modify and combine hundreds of models and modules. In brief, Open-CD allows users to play free architecture games.

3 Architecture

3.1 Overall

Table 2: Supported datasets in Open-CD.

Dataset	Task	Images pairs	Image size	Change instances	Change pixels	Download
LEVIR-CD [4]	binary	637	1024 $\times$ 1024	31K	30M	URL
WHU-CD [15]	binary	1	32,207 $\times$ 15,354	2K	21M	URL
S2Looking [27]	binary	5,000	1024 $\times$ 1024	66K	69M	URL
SVCD [16]	binary	16,000	256 $\times$ 256	-	-	URL
DSIFN [35]	binary	394	512 $\times$ 512	-	-	URL
CLCD [19]	binary	560	512 $\times$ 512	-	-	URL
RSIPAC [1]	binary	3,194	512 $\times$ 512	-	-	URL
SECOND [33]	semantic	4,662	512 $\times$ 512	-	-	URL
Landsat [34]	semantic	8,468	416 $\times$ 416	-	-	URL
BANDON [25]	semantic	2,283	2048 $\times$ 2048	123K	283M	URL

As shown in Figure 1, Open-CD consists of five parts.

Config Files. In Open-CD, the model, optimizer, dataset, data piprline, etc. are specified in config files. In general, if the user does not need to redesign the model internally, simply modifying the config file is enough.

Model Zoo. Besides change detection models listed in Table 1, Open-CD supports almost all models in OpenMMLab related algorithm codebases, which can be called directly in the config file by simply installing them as dependencies. In addition, there are thousands of pre-trained weights available, including plain ImageNet, foundation models, etc.

Internal Modules. Following the design concept of the OpenMMLab toolkits, the training process of Open-CD is conducted in the MMEngine Runner. We will describe it in Section 3.2.

Data Files. Open-CD supports the current mainstream binary and semantic change detection datasets, as listed in Table 2. And since we define two extensible base classes, _BaseCDDataset and BaseSCDDataset, it is easy for users to integrate custom datasets into Open-CD.

Tools. Open-CD also contains several functional scripts e.g. training tools, inference tools, data analysis tools, result analysis tools, etc.

3.2 Training Pipeline

Similar to other algorithm codebases in OpenMMLab toolkits, Open-CD follows the unified interface of abstract components defined in MMEngine. The training pipeline of Open-CD is shown in Figure 14. First, the Dataloader obtains data from the dataset and transmits it to the model. Outside the model, there is a Wrapper for distributed training, etc. Then, the parameter scheduler is used to adjust Optimizer related parameters e.g. learning rate during training, and the Optimizer is used to update the model. The Optimizer wrapper is capable of gradient accumulation, gradient clipping, mixed precision training, etc. Finally, in the evaluation phase, the data and model outputs are fed to the Evaluator and the corresponding values are returned based on the evaluation metrics.

To make the training pipeline extensible, some hook points are inserted into the pipeline. With these hooks, the data flow in the Runner can be operated and observed to achieve the user-customized requirements. During the whole training pipeline, the Logging component is constantly working, transmitting the required information to the Visualizer.

4 Benchmarks

4.1 Experimental Setting

Dataset. We adopt LEVIR-CD as the primary benchmark for all experiments because it has high quality and is more widely used. We use the Train split for training and report the performance on the Test split.

Implementation details. If not otherwise specified, we adopt the following settings:

(1)

Images are randomly cropped to 256 $\times$ 256.
(2)

We use a single RTX 3090 GPU for training and inference, with a training batch size of 8.
(3)

The training schedule is “40k”, meaning 40k iteration.
(4)

Data augmentation including: RandomRotate, RandomFlip and PhotoMetricDistortion.

Evaluation metrics. We adopt $F_{1}^{c}$ , $IoU^{c}$ , $Precision^{c}$ and $Recall^{c}$ as evaluation metrics. Note that all metrics are based on the category “change” to avoid the class imbalance problem.

4.2 Benchmarking Results

Table 3: Results of different change detection methods on LEVIR-CD Test. † denotes the size of random crop is

512\times 512

, * denotes the batch size is set to 16, ‡ denotes the official code is based on Open-CD, and “300e” denotes 300 epochs. For BAN and TTP, only the learnable parameters are counted.

Method	Backbone	Lr Schd	Param (M)	GFLOPs	Inf (fps)	$Precision^{c}$	$Recall^{c}$	$F_{1}^{c}$	$IoU^{c}$
FC-EF [9]	-	40k	1.353	3.244	66.37 $\pm$ 0.0333	80.54	81.17	80.85	67.86
FC-Siam-Diff [9]	-	40k	4.385	1.352	46.29 $\pm$ 0.0015	89.64	80.25	84.68	73.44
FC-Siam-Conc [9]	-	40k	4.989	1.548	45.67 $\pm$ 0.0016	86.59	84.53	85.55	74.75
STANet-PAM [4]	ResNet-18	40k	13.356	48.083	1.53 $\pm$ 0.0001	84.28	90.19	87.13	77.20
DSIFN [35]	VGG-16	40k	35.995	78.982	5.19 $\pm$ 0.0001	89.78	92.32	91.03	83.54
SNUNet-c16 [11]	-	40k	3.012	11.730	5.96 $\pm$ 0.0001	93.02	89.75	91.36	84.09
BIT [5]	ResNet-18	40k	2.990	8.749	28.93 $\pm$ 0.0006	93.17	88.38	90.71	83.00
ChangeStar†* [37]	ResNet-18	40k	16.965	19.213	27.06 $\pm$ 0.0007	93.75	88.90	91.26	83.92
ChangeFormer-b0 [2]	MiT-b0	40k	3.847	2.455	25.26 $\pm$ 0.0005	93.07	88.20	90.57	82.76
ChangeFormer-b1 [2]	MiT-b1	40k	13.941	5.825	18.71 $\pm$ 0.0004	93.09	89.26	91.14	83.71
Changer†‡[12]	ResNet-18	40k	11.391	5.955	49.31 $\pm$ 0.2815	92.86	90.78	91.81	84.86
TinyCD [8]	-	40k	0.285	1.448	26.28 $\pm$ 0.0004	91.87	89.89	90.87	83.26
HANet [13]	-	40k	3.028	20.822	4.80 $\pm$ 0.0001	91.72	89.09	90.39	82.46
LightCDNet-base‡[32]	-	40k	1.313	3.164	17.60 $\pm$ 0.0041	90.95	90.30	90.62	82.86
LightCDNet-large‡[32]	-	40k	2.816	5.691	10.29 $\pm$ 0.0002	92.68	89.67	91.15	83.74
CGNet [14]	VGG-16	40k	38.989	87.550	1.93 $\pm$ 0.0001	93.60	90.64	92.10	85.36
BAN (ViT-L)†‡[17]	MiT-b0	40k	4.474	307.374	1.87 $\pm$ 0.0001	93.47	90.30	91.86	84.94
TTP†‡[7]	ViT-L	300e	6.210	929.840	0.67 $\pm$ 0.0001	93.00	91.70	92.10	85.60

Main results. We benchmark different methods on LEVIR-CD Test, including FC-EF [9], FC-Siam-Diff [9], FC-Siam-Conc [9], STANet-BAM/PAM [4], DSIFN [35], SNUNet-c16 [11], BIT [5], ChangeStar [37], ChangeFormer-b0/b1 [2], Changer [12], TinyCD [8], HANet [13], LightCDNet-base/large [32], CGNet [14], BAN (ViT-L) [17] and TTP [7]. We report the parameter, GFLOPs, inference speed, $Precision^{c}$ , $Recall^{c}$ , $F_{1}^{c}$ and $IoU^{c}$ of these methods in Table 3. The inference time is tested on a single RTX 3090 GPU. The GFLOPs is calculated at a size of $256\times 256$ . For re-production and comparison in subsequent studies, we will also release all weights and visualization results.

Comparison with official code. We compare the official implementations of some methods with those in Open-CD, as shown in Figure 3. With the exception of STANet and TinyCD, all other implementations in Open-CD gain some degree of performance improvement compared to the original implementations. Especially for BIT, ChangeStar and ChangeFormer, the improvements are significant. Notably, the original ChangeFormer customizes the configuration of MiT (e.g. depth, width of the network, etc.). In Open-CD, we restore the settings of b1 $\sim$ b5 in SegFormer to make it scalable, and obtain higher $F_{1}^{c}$ score with fewer parameters.

5 Application

5.1 Algorithm implementation, validation and deployment

As mentioned above, Open-CD can be used for the implementation of custom change detection algorithms, validation of existing algorithms, and model deployment (relying on the OpenMMLab ecosystem, i.e., MMDeploy¹⁵¹⁵15https://github.com/open-mmlab/mmdeploy. Although the current Open-CD only contains supervised learning pipelines, it can also be applied as a construction tool for models, datasets, etc., promoting the exploration of semi-supervised and unsupervised algorithms.

5.2 Downstream validation for foundation model

Recently, researchers tend to explore the general capabilities of foundation models. Within the remote sensing community, foundation models are also actively explored. Some works attempt to train a robust backbone that is capable of extracting general features for a series of downstream perceptual tasks e.g., object detection, semantic segmentation, change detection, etc. For this, Open-CD is an out-of-the-box downstream validation tool for change detection. Typically, MTP [30], a remote sensing foundation model via multi-task pre-training, yields an image encoder for downstream task fine-tuning and uses Open-CD for validation of change detection task.

5.3 Downstream validation for generative model

Although huge amount of remote sensing images are available through Google Earth etc., data annotation remains a challenge, which is time-consuming and costly. One trend is to synthesize simulated image-label pairs through conditional GAN, Diffusion models, etc [38, 29, 39]. and use these data to (pre-)train downstream perception models. In this case, Open-CD can be applied to validate the quality of the simulated change detection data. A typical study is ChangeAnywhere [29], which proposes a diffusion model to generate a simulated change detection dataset ChangeAnywhere-100K and uses Open-CD for subsequent validation.

5.4 Competition and challenge

As an advanced algorithm codebase, Open-CD can be easily used for change detection competitions and challenges. In general, users only need to customize their Dataset Class to enjoy all the advanced algorithms, data analysis tools, inference tools, etc. in Open-CD. Here, some solutions of winning competitions with Open-CD are listed:

•

2024 ISPRS TC 1 Contest.¹⁶¹⁶16https://www.gaofen-challenge.com/challenge Winner. ¹⁷¹⁷17https://github.com/DanyangLihhh/2024-ISPRS-TC-I-Contest
•

2023 “Jilin-1” Cup.¹⁸¹⁸18https://www.jl1mall.com/contest 5th place. ¹⁹¹⁹19https://github.com/DanyangLihhh/Cultivated-land-change-detection

6 Acknowledgements

The authors would like to express their deep gratitude to the OpenMMLab team for their invaluable contributions to vision community. The authors would also like to extend their appreciation to the MMDetection team for their excellent technical report, which provided a valuable reference for the writing of this report.

References

RSI [2023] Rsipac contest. http://rsipac.whu.edu.cn/subject_two, 2023.
Bandara and Patel [2022] Wele Gedara Chaminda Bandara and Vishal M Patel. A transformer-based siamese network for change detection. In IGARSS 2022-2022 IEEE International Geoscience and Remote Sensing Symposium, pages 207–210. IEEE, 2022.
Celik [2009] Turgay Celik. Unsupervised change detection in satellite images using principal component analysis and $k$ -means clustering. IEEE geoscience and remote sensing letters, 6(4):772–776, 2009.
Chen and Shi [2020] Hao Chen and Zhenwei Shi. A spatial-temporal attention-based method and a new dataset for remote sensing image change detection. Remote Sensing, 12(10):1662, 2020.
Chen et al. [2021] Hao Chen, Zipeng Qi, and Zhenwei Shi. Remote sensing image change detection with transformers. IEEE Transactions on Geoscience and Remote Sensing, 60:1–14, 2021.
Chen et al. [2019] Kai Chen, Jiaqi Wang, Jiangmiao Pang, Yuhang Cao, Yu Xiong, Xiaoxiao Li, Shuyang Sun, Wansen Feng, Ziwei Liu, Jiarui Xu, et al. Mmdetection: Open mmlab detection toolbox and benchmark. arXiv preprint arXiv:1906.07155, 2019.
Chen et al. [2023] Keyan Chen, Chengyang Liu, Wenyuan Li, Zili Liu, Hao Chen, Haotian Zhang, Zhengxia Zou, and Zhenwei Shi. Time travelling pixels: Bitemporal features integration with foundation model for remote sensing image change detection. arXiv preprint arXiv:2312.16202, 2023.
Codegoni et al. [2022] Andrea Codegoni, Gabriele Lombardi, and Alessandro Ferrari. Tinycd: a (not so) deep learning model for change detection. Neural Computing and Applications, pages 1–16, 2022.
Daudt et al. [2018] Rodrigo Caye Daudt, Bertr Le Saux, and Alexandre Boulch. Fully convolutional siamese networks for change detection. In 2018 25th IEEE International Conference on Image Processing (ICIP), pages 4063–4067. IEEE, 2018.
Deng et al. [2008] JS Deng, K Wang, YH Deng, and GJ Qi. Pca-based land-use change detection and analysis using multitemporal and multisensor satellite data. International Journal of Remote Sensing, 29(16):4823–4838, 2008.
Fang et al. [2021] Sheng Fang, Kaiyu Li, Jinyuan Shao, and Zhe Li. Snunet-cd: A densely connected siamese network for change detection of vhr images. IEEE Geoscience and Remote Sensing Letters, 19:1–5, 2021.
Fang et al. [2022] Sheng Fang, Kaiyu Li, and Zhe Li. Changer: Feature interaction is what you need for change detection. arXiv preprint arXiv:2209.08290, 2022.
Han et al. [2023a] Chengxi Han, Chen Wu, Haonan Guo, Meiqi Hu, and Hongruixuan Chen. Hanet: A hierarchical attention network for change detection with bi-temporal very-high-resolution remote sensing images. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023a.
Han et al. [2023b] Chengxi Han, Chen Wu, Haonan Guo, Meiqi Hu, Jiepan Li, and Hongruixuan Chen. Change guiding network: Incorporating change prior to guide change detection in remote sensing imagery. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 2023b.
Ji et al. [2018] Shunping Ji, Shiqing Wei, and Meng Lu. Fully convolutional networks for multisource building extraction from an open aerial and satellite imagery data set. IEEE Transactions on geoscience and remote sensing, 57(1):574–586, 2018.
Lebedev et al. [2018] MA Lebedev, Yu V Vizilter, OV Vygolov, Vladimir A Knyaz, and A Yu Rubis. Change detection in remote sensing images using conditional adversarial networks. The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, 42:565–571, 2018.
Li et al. [2024] Kaiyu Li, Xiangyong Cao, and Deyu Meng. A new learning paradigm for foundation model-based remote-sensing change detection. IEEE Transactions on Geoscience and Remote Sensing, 62:1–12, 2024.
Lin et al. [2022] Manhui Lin, Guangyi Yang, and Hongyan Zhang. Transition is a process: Pair-to-video change detection networks for very high resolution remote sensing images. IEEE Transactions on Image Processing, 32:57–71, 2022.
Liu et al. [2022] Mengxi Liu, Zhuoqun Chai, Haojun Deng, and Rong Liu. A cnn-transformer network with multiscale context aggregation for fine-grained cropland change detection. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 15:4297–4306, 2022.
Lu et al. [2004] Dengsheng Lu, Paul Mausel, Eduardo Brondizio, and Emilio Moran. Change detection techniques. International journal of remote sensing, 25(12):2365–2401, 2004.
Mahmoudzadeh [2007] H Mahmoudzadeh. Digital change detection using remotely sensed data for monitoring green space destruction in tabriz. 2007.
Malila [1980] William A Malila. Change vector analysis: An approach for detecting forest changes with landsat. In LARS symposia, page 385, 1980.
Nielsen [2007] Allan Aasbjerg Nielsen. The regularized iteratively reweighted mad method for change detection in multi-and hyperspectral data. IEEE Transactions on Image processing, 16(2):463–478, 2007.
Nielsen et al. [1998] Allan A Nielsen, Knut Conradsen, and James J Simpson. Multivariate alteration detection (mad) and maf postprocessing in multispectral, bitemporal image data: New approaches to change detection studies. Remote Sensing of Environment, 64(1):1–19, 1998.
Pang et al. [2023] Chao Pang, Jiang Wu, Jian Ding, Can Song, and Gui-Song Xia. Detecting building changes with off-nadir aerial images. Science China Information Sciences, 66(4):140306, 2023.
Papadomanolaki et al. [2021] Maria Papadomanolaki, Maria Vakalopoulou, and Konstantinos Karantzalos. A deep multitask learning framework coupling semantic segmentation and fully convolutional lstm networks for urban change detection. IEEE Transactions on Geoscience and Remote Sensing, 59(9):7651–7668, 2021.
Shen et al. [2021] Li Shen, Yao Lu, Hao Chen, Hao Wei, Donghai Xie, Jiabao Yue, Rui Chen, Shouye Lv, and Bitao Jiang. S2looking: A satellite side-looking dataset for building change detection. Remote Sensing, 13(24):5094, 2021.
Shi et al. [2021] Qian Shi, Mengxi Liu, Shengchen Li, Xiaoping Liu, Fei Wang, and Liangpei Zhang. A deeply supervised attention metric-based network and an open aerial image dataset for remote sensing change detection. IEEE transactions on geoscience and remote sensing, 60:1–16, 2021.
Tang and Chen [2024] Kai Tang and Jin Chen. Changeanywhere: Sample generation for remote sensing change detection via semantic latent diffusion model, 2024.
Wang et al. [2024] Di Wang, Jing Zhang, Minqiang Xu, Lin Liu, Dongsheng Wang, Erzhong Gao, Chengxi Han, Haonan Guo, Bo Du, Dacheng Tao, and Liangpei Zhang. Mtp: Advancing remote sensing foundation model via multi-task pretraining. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, pages 1–24, 2024.
Xie et al. [2021] Enze Xie, Wenhai Wang, Zhiding Yu, Anima Anandkumar, Jose M Alvarez, and Ping Luo. Segformer: Simple and efficient design for semantic segmentation with transformers. Advances in neural information processing systems, 34:12077–12090, 2021.
Xing et al. [2023] Yuanjun Xing, Jiawei Jiang, Jun Xiang, Enping Yan, Yabin Song, and Dengkui Mo. Lightcdnet: Lightweight change detection network based on vhr images. IEEE Geoscience and Remote Sensing Letters, 2023.
Yang et al. [2020] Kunping Yang, Gui-Song Xia, Zicheng Liu, Bo Du, Wen Yang, Marcello Pelillo, and Liangpei Zhang. Semantic change detection with asymmetric siamese networks. arXiv preprint arXiv:2010.05687, 2020.
Yuan et al. [2022] Panli Yuan, Qingzhan Zhao, Xingbiao Zhao, Xuewen Wang, Xuefeng Long, and Yuchen Zheng. A transformer-based siamese network and an open optical dataset for semantic change detection of remote sensing images. International Journal of Digital Earth, 15(1):1506–1525, 2022.
Zhang et al. [2020] Chenxiao Zhang, Peng Yue, Deodato Tapete, Liangcun Jiang, Boyi Shangguan, Li Huang, and Guangchao Liu. A deeply supervised image fusion network for change detection in high resolution bi-temporal remote sensing images. ISPRS Journal of Photogrammetry and Remote Sensing, 166:183–200, 2020.
Zhang and Zhang [2007] Jixian Zhang and Yonghong Zhang. Remote sensing research issues of the national land use change program of china. ISPRS Journal of Photogrammetry and Remote Sensing, 62(6):461–472, 2007.
Zheng et al. [2021] Zhuo Zheng, Ailong Ma, Liangpei Zhang, and Yanfei Zhong. Change is everywhere: Single-temporal supervised object change detection in remote sensing imagery. In Proceedings of the IEEE/CVF international conference on computer vision, pages 15193–15202, 2021.
Zheng et al. [2023] Zhuo Zheng, Shiqi Tian, Ailong Ma, Liangpei Zhang, and Yanfei Zhong. Scalable multi-temporal remote sensing change data generation via simulating stochastic change process. In Proceedings of the IEEE/CVF International Conference on Computer Vision, pages 21818–21827, 2023.
Zheng et al. [2024a] Zhuo Zheng, Stefano Ermon, Dongjun Kim, Liangpei Zhang, and Yanfei Zhong. Changen2: Multi-temporal remote sensing generative change foundation model. arXiv preprint arXiv:2406.17998, 2024a.
Zheng et al. [2024b] Zhuo Zheng, Yanfei Zhong, Ailong Ma, and Liangpei Zhang. Single-temporal supervised learning for universal remote sensing change detection. International Journal of Computer Vision, pages 1–21, 2024b.

​{}_{\!}​ ​ pen-CD: A Comprehensive Toolbox for Change Detection