Recursive classification of satellite imaging time series: An application to water and land cover mapping
This code is implemented in Python 3.9. It compares the performance of three static classification algorithms and their recursive versions: a Gaussian Mixture Model (GMM), Logistic Regression (LR) and Spectral Index Classifiers (SICs).
The first experiment considers water mapping of an embankment dam in California, with one training region and two study areas for evaluation (see the following figure).
The second experiment considers land cover classification of the Charles River basin in Boston, with matching training and evaluation regions (see the following figure).
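To make the idea of a recursive version of a static classifier more concrete, here is a minimal sketch of one recursive Bayesian update for a single pixel. It assumes a fixed class-transition matrix and per-image class likelihoods that could come from any static classifier. The function name, the transition values and the two-class setup are illustrative assumptions, not the implementation found in bayesian_recursive.py.

```python
def recursive_bayes_update(prev_posterior, likelihood, transition):
    """One recursive Bayesian step for the class probabilities of a pixel.

    prev_posterior: class probabilities obtained from the previous image.
    likelihood:     class likelihoods for the new image, e.g. the output of
                    a static classifier (assumption for this sketch).
    transition:     transition[i][j] is the assumed probability of moving
                    from class i to class j between two images.
    """
    n_classes = len(prev_posterior)
    # Prediction: propagate the previous posterior through the transition model.
    prior = [
        sum(prev_posterior[i] * transition[i][j] for i in range(n_classes))
        for j in range(n_classes)
    ]
    # Update: weight the predicted prior by the evidence from the new image.
    unnormalized = [p * l for p, l in zip(prior, likelihood)]
    # Normalize so that the class probabilities sum to one.
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical two-class example (water, land) over two consecutive images.
transition = [[0.9, 0.1],
              [0.1, 0.9]]
posterior = [0.5, 0.5]  # uninformative starting point
for likelihood in ([0.8, 0.2], [0.7, 0.3]):
    posterior = recursive_bayes_update(posterior, likelihood, transition)
print(posterior)
```

In this sketch, each new image sharpens or corrects the belief carried over from the previous one, whereas a static classifier would treat every image independently.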
The project is structured as follows.
- README.md
- requirements.txt
- ./benchmark_models/
  - ./deepwatermap_main/ contains part of the DeepWaterMap algorithm open source code shared in this GitHub repository. An __init__.py file has been added to this directory so that it is treated as a module.
  - ./watnet/ contains part of the WatNet algorithm open source code shared in this GitHub repository.
  - benchmark.py includes a copy of two functions from ./deepwatermap_main/inference.py. The main function has been changed with respect to the original one.
- ./tools/ contains scripts which provide useful functions.
  - operations.py
  - path_operations.py
  - spectral_index.py
- configuration.py contains the classes Debug and Config. Some configuration settings must be changed when executing the code with data other than that provided by the authors.
- main.py contains the main logic and flow of the code.
- bayesian_recursive.py contains the core of the recursive Bayesian algorithm for classification.
- ./plot_results/ contains scripts that can be executed to plot evaluation results. Results presented in the manuscript can be reproduced by using these files.
- image_reader.py contains the abstract class ImageReader and the class ReadSentinel2, which allows the user to read images from a dataset of Sentinel-2 images, such as the one provided by the authors of this code.
- training.py contains functions used in the training stage.
- evaluation.py contains functions used in the evaluation stage.
- ./trained_models/ contains pickle files with saved data from the training stage. To train the models from scratch, indicate it in the Config class from configuration.py. Data has been stored in these files because the training stage execution time is long.
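The caching behaviour described for ./trained_models/ can be illustrated with a small load-or-train helper. This is only a sketch of the general pattern: the function name, the train_from_scratch flag and the file name gmm.pkl are hypothetical, and the actual option in the Config class may be named and handled differently.

```python
import pickle
import tempfile
from pathlib import Path


def load_or_train(model_path, train_fn, train_from_scratch=False):
    """Return trained data, reusing a pickle file when one is available."""
    model_path = Path(model_path)
    if model_path.exists() and not train_from_scratch:
        # Reuse the stored result and skip the long training stage.
        with model_path.open("rb") as f:
            return pickle.load(f)
    # Train and store the result for later executions.
    model = train_fn()
    model_path.parent.mkdir(parents=True, exist_ok=True)
    with model_path.open("wb") as f:
        pickle.dump(model, f)
    return model


# Demonstration with a temporary folder and a dummy training function.
training_calls = []


def dummy_training():
    training_calls.append(1)
    return {"means": [0.1, 0.4]}


with tempfile.TemporaryDirectory() as tmp_dir:
    path = Path(tmp_dir) / "trained_models" / "gmm.pkl"
    first = load_or_train(path, dummy_training)   # trains and saves
    second = load_or_train(path, dummy_training)  # loads from the pickle file
```

Only unpickle files from a trusted source, since loading a pickle file can execute arbitrary code.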
Follow these instructions to install GDAL for Python with pip on Windows or to install GDAL for Python with Anaconda (compatible with Windows, Linux and macOS). We recommend creating a conda environment and running the command conda install -c conda-forge gdal in the Anaconda prompt.
Besides GDAL, other packages need to be installed. The required packages can be installed with the Python package installer pip. Run the following command from the repository main folder:
pip install -r requirements.txt
If a module is not recognized, we recommend installing the package separately. For instance, for scikit-image, use the following commands, as suggested in the Installation via pip and conda section of these instructions:
python -m pip install -U pip
python -m pip install -U scikit-image
Also, check this link for the installation of tensorflow if using macOS with the M1 chip, for which we recommend using Miniconda.
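After installation, a quick way to see whether the packages named above can be found by the interpreter is to look up their import modules. The sketch below uses only the standard library. The module list is an assumption limited to the packages mentioned in this section (requirements.txt holds the complete list), and the helper name is hypothetical.

```python
from importlib.util import find_spec

# Import module -> package name. The import name can differ from the name
# used with pip or conda (for example, GDAL is imported as osgeo).
REQUIRED_MODULES = {
    "osgeo": "GDAL",
    "skimage": "scikit-image",
    "tensorflow": "tensorflow",
}


def missing_packages(modules=REQUIRED_MODULES):
    """Return the package names whose import module cannot be found."""
    return [
        package
        for module, package in modules.items()
        if find_spec(module) is None
    ]


print("Missing packages:", missing_packages() or "none")
```

A package reported as missing is a candidate for the separate installation described above.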
Download our dataset Sentinel-2 Images from Oroville Dam and Charles River from this Zenodo link and extract the .zip file.
In configuration.py (class Config), change path_zenodo to the path where the Zenodo folder has been stored. Images in this dataset are used for training and evaluation. Details regarding the dataset can be found in the Zenodo link.
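A wrong value of path_zenodo is easier to diagnose if the path is checked before any image is read. The following sketch shows one way to do that with pathlib. The helper resolve_dataset_path is hypothetical and is not part of configuration.py.

```python
from pathlib import Path


def resolve_dataset_path(path_zenodo):
    """Expand and validate the folder that holds the extracted dataset."""
    path = Path(path_zenodo).expanduser().resolve()
    if not path.is_dir():
        raise FileNotFoundError(
            f"Dataset folder not found: {path}. "
            "Check path_zenodo in configuration.py."
        )
    return path


# "." stands in for the real location of the extracted Zenodo folder.
dataset_dir = resolve_dataset_path(".")
print(dataset_dir)
```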
Results presented in the manuscript can be obtained by executing main_notebook.ipynb (Jupyter Notebook) or main.py (Python script). The Jupyter Notebook contains instructions to reproduce the results in Figures 5, 6 and 7 of the manuscript. For instance, the results for Study Area C (Figure 7) are shown in the following image.
Instructions to reproduce Figure 8 are also provided. We recommend using Jupyter Notebook in a conda environment (see instructions here).
A log file is generated in the path_log_files path (defined in configuration.py, class Config) for every execution of the main script. Log files contain information regarding events in the code execution.
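As an illustration of one log file per execution, the sketch below builds a timestamped file inside a log folder with the standard logging module. The helper name, the file naming scheme and the message format are assumptions. The repository may set up its logging differently.

```python
import logging
import tempfile
from datetime import datetime
from pathlib import Path


def create_run_logger(path_log_files):
    """Create a logger that writes to a new timestamped file in the folder."""
    log_dir = Path(path_log_files)
    log_dir.mkdir(parents=True, exist_ok=True)
    log_file = log_dir / f"run_{datetime.now():%Y%m%d_%H%M%S_%f}.log"
    # A distinct logger name per file avoids sharing handlers between runs.
    logger = logging.getLogger(f"run.{log_file.stem}")
    logger.setLevel(logging.INFO)
    handler = logging.FileHandler(log_file, encoding="utf-8")
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger, log_file


# Demonstration: a temporary folder stands in for path_log_files.
with tempfile.TemporaryDirectory() as tmp_dir:
    logger, log_file = create_run_logger(tmp_dir)
    logger.info("Training stage started")
    # Close the handlers so the file is flushed and can be removed.
    for handler in list(logger.handlers):
        handler.close()
        logger.removeHandler(handler)
    log_text = log_file.read_text(encoding="utf-8")
print(log_text)
```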
The open source code of the DeepWaterMap and WatNet algorithms, used for benchmarking, was provided by their respective authors.
- DeepWaterMap (see the GitHub repository), by L. F. Isikdogan, A. C. Bovik and P. Passalacqua. This algorithm for water mapping is proposed in the following publications:
- WatNet (see the GitHub repository), by Xin Luo, Xiaohua Tong and Zhongwen Hu. This algorithm for water mapping is proposed in the publication An applicable and automatic method for earth surface water mapping based on multispectral images.
- Helena Calatrava (1)
- Bhavya Duvvuri (2)
- Haoqing Li (1)
- Ricardo Borsoi (3)
- Tales Imbiriba (1)
- Edward Beighley (2)
- Deniz Erdogmus (1)
- Pau Closas (1)
(1): Signal Processing, Imaging, Reasoning and Learning (SPIRAL) at Northeastern University, Boston (MA).
(2): The Beighley Lab (Sustainable Water Resources | Resilient Wet Infrastructure) at Northeastern University, Boston (MA).
(3): CRAN, University of Lorraine, CNRS, Vandoeuvre-les-Nancy, F-54000, France.
Please contact the following email address if you have questions regarding the code: