Isaac Corley1 · Caleb Robinson2 · Anthony Ortiz2
1University of Texas at San Antonio 2Microsoft AI for Good Research Lab
Code and experiments for the paper "A Change Detection Reality Check" by Isaac Corley, Caleb Robinson, and Anthony Ortiz, presented at the ICLR 2024 Machine Learning for Remote Sensing (ML4RS) Workshop.
The remote sensing literature from the past several years has exploded with proposed deep learning architectures claiming to be the latest state-of-the-art on standard change detection benchmark datasets. However, has the field truly made significant progress? In this paper we perform experiments showing that a simple U-Net segmentation baseline, without training tricks or complicated architectural changes, is still a top performer for the task of change detection.
We find that U-Net remains a top performer on the LEVIR-CD and WHU-CD benchmark datasets. See the tables below for comparisons with state-of-the-art methods.
Table 1. Comparison of state-of-the-art change detection architectures to a U-Net baseline on the LEVIR-CD dataset. We report the test set precision, recall, and F1 metrics of the positive change class. For the baseline experiments we perform 10 runs while varying the random seed and report metrics from the highest performing run. All other metrics are taken from their respective papers. The top performing methods are highlighted in bold. Gray rows indicate our baseline U-Net and siamese encoder variants.
Table 2. Experimental results on the WHU-CD dataset. We retrain several state-of-the-art methods using the original dataset's train/test splits instead of the commonly used randomly split preprocessed version created by Bandara & Patel (2022a). We find that these state-of-the-art methods are outperformed by a U-Net baseline. We report the test set precision, recall, F1, and IoU metrics of the positive change class. For each run we select the model checkpoint with the lowest validation set loss. We provide metrics averaged over 10 runs with varying random seeds as well as from the best seed. Gray rows indicate our baseline U-Net and siamese encoder variants.
**Model checkpoints uploaded to HuggingFace here!**
**LEVIR-CD**

Model | Backbone | Precision | Recall | F1 | IoU | Checkpoint |
---|---|---|---|---|---|---|
U-Net | ResNet-50 | 0.9197 | 0.8795 | 0.8991 | 0.8167 | Checkpoint |
U-Net | EfficientNet-B4 | 0.9269 | 0.8588 | 0.8915 | 0.8044 | Checkpoint |
U-Net SiamConc | ResNet-50 | 0.9287 | 0.8749 | 0.9010 | 0.8199 | Checkpoint |
U-Net SiamDiff | ResNet-50 | 0.9321 | 0.8730 | 0.9015 | 0.8207 | Checkpoint |
**WHU-CD**

Model | Backbone | Precision | Recall | F1 | IoU | Checkpoint |
---|---|---|---|---|---|---|
U-Net SiamConc | ResNet-50 | 0.8369 | 0.8130 | 0.8217 | 0.7054 | Checkpoint |
U-Net SiamDiff | ResNet-50 | 0.8856 | 0.7741 | 0.8248 | 0.7086 | Checkpoint |
U-Net | ResNet-50 | 0.8865 | 0.7663 | 0.8200 | 0.7020 | Checkpoint |
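
To run inference with one of the checkpoints above, the sketch below shows one way to restore the plain U-Net baseline. It assumes the checkpoint is a standard PyTorch Lightning `.ckpt` file and that the network can be rebuilt with `segmentation_models_pytorch` with a 6-channel early-fusion input; these are assumptions about the released files, not confirmed details, so treat this as a starting point rather than the repo's canonical loading code.

```python
# Hedged sketch: restore the plain U-Net (ResNet-50) baseline from a checkpoint.
# Assumptions (not confirmed by this README): checkpoints are Lightning .ckpt
# files and the model is an smp.Unet with a 6-channel early-fusion input.
import torch
import segmentation_models_pytorch as smp

model = smp.Unet(encoder_name="resnet50", in_channels=6, classes=2)

ckpt = torch.load("path/to/checkpoint.ckpt", map_location="cpu")
state_dict = ckpt.get("state_dict", ckpt)  # raw .pth files store weights directly
# Lightning usually prefixes weights with the module attribute name (e.g. "model.").
state_dict = {k.removeprefix("model."): v for k, v in state_dict.items()}
model.load_state_dict(state_dict, strict=False)
model.eval()
```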
Download the LEVIR-CD and WHU-CD datasets, then use the following notebooks to chip the datasets into non-overlapping 256x256 patches (a minimal sketch of the chipping step is shown after the list):
- `scripts/preprocess_levircd.ipynb`
- `scripts/preprocess_whucd.ipynb`
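
For reference, the chipping performed by these notebooks amounts to tiling each scene into non-overlapping 256x256 windows. A minimal sketch, assuming the images are already loaded as numpy arrays; the notebooks handle the actual file I/O and naming:

```python
# Minimal sketch of non-overlapping 256x256 chipping; edge pixels that do not
# fill a complete patch are dropped. The (H, W, C) array layout is an assumption.
import numpy as np

def chip(image: np.ndarray, size: int = 256):
    """Yield non-overlapping size x size patches from a (H, W, C) array."""
    h, w = image.shape[:2]
    for y in range(0, h - size + 1, size):
        for x in range(0, w - size + 1, size):
            yield image[y : y + size, x : x + size]
```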
To train a U-Net on both datasets over 10 random seeds, run:
```bash
python train_levircd.py --train-root /path/to/preprocessed-dataset/ --model unet --backbone resnet50 --num_seeds 10
python train_whucd.py --train-root /path/to/preprocessed-dataset/ --model unet --backbone resnet50 --num_seeds 10
```
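
The `--model unet --backbone resnet50` baseline corresponds to an ordinary segmentation U-Net applied to the channel-wise concatenation of the two input images (early fusion). A hedged sketch of that setup, assuming `segmentation_models_pytorch` is used to build the network; the training scripts may construct it differently:

```python
# Hedged sketch of the early-fusion U-Net baseline (not the repo's exact code).
import torch
import segmentation_models_pytorch as smp

# Two RGB images concatenated along channels -> 6 input channels, binary change output.
model = smp.Unet(
    encoder_name="resnet50",
    encoder_weights="imagenet",
    in_channels=6,
    classes=2,
)

pre = torch.randn(1, 3, 256, 256)   # image at time t1
post = torch.randn(1, 3, 256, 256)  # image at time t2
logits = model(torch.cat([pre, post], dim=1))  # shape (1, 2, 256, 256)
```

The SiamConc and SiamDiff rows in the tables instead share the encoder across the two images and fuse their features by concatenation or differencing before decoding.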
To evaluate a set of checkpoints and save the results to a .csv file, run:
```bash
python test_levircd.py --root /path/to/preprocessed-dataset/ --ckpt-root lightning_logs/ --output-filename metrics.csv
python test_whucd.py --root /path/to/preprocessed-dataset/ --ckpt-root lightning_logs/ --output-filename metrics.csv
```
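
The metrics written to the .csv are the positive-class precision, recall, F1, and IoU reported in the tables above. A minimal sketch of how such metrics can be computed, assuming `torchmetrics`; the test scripts may accumulate them differently (e.g. over the whole test set rather than a single batch):

```python
# Hedged sketch: positive-change-class metrics with torchmetrics (the repo may
# compute these with a different library or aggregation strategy).
import torch
from torchmetrics.classification import (
    BinaryPrecision, BinaryRecall, BinaryF1Score, BinaryJaccardIndex,
)

metrics = {
    "precision": BinaryPrecision(),
    "recall": BinaryRecall(),
    "f1": BinaryF1Score(),
    "iou": BinaryJaccardIndex(),
}

preds = torch.randint(0, 2, (4, 256, 256))    # predicted binary change masks
targets = torch.randint(0, 2, (4, 256, 256))  # ground-truth change masks
for name, metric in metrics.items():
    print(f"{name}: {metric(preds, targets).item():.4f}")
```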
If this work inspired your change detection research, please consider citing our paper:
```bibtex
@article{corley2024change,
  title={A Change Detection Reality Check},
  author={Corley, Isaac and Robinson, Caleb and Ortiz, Anthony},
  journal={arXiv preprint arXiv:2402.06994},
  year={2024}
}
```