Skolkovo Institute of Science and Technology
Team Vladik: Almaz Dautov, Damir Abdulayev, Danil Kuchukov
We present a model for semantic segmentation of buildings in satellite imagery, built for the Skoltech case at the "Цифровой прорыв" (Digital Breakthrough) hackathon. The model segments buildings from satellite images as a binary mask; our best ensemble reaches an F1 score of 0.7006 on validation and 0.71133 on test. To solve this binary semantic segmentation problem, we took the following steps:
- Collecting an additional dataset annotated in a similar way. We used the mapbox.com API [2] for satellite imagery and openstreetmap.org [3] data, accessed through QGIS [4] and the Overpass API [5], to obtain masks of individual buildings.
- Post-processing masks by removing buildings smaller than 100 pixels (100 m^2)
- Preprocessing large satellite images by splitting them into “tiles” of 1024x1024 pixels. A feature of this technique is that the tiles can overlap, so the mask of the original image is restored by voting among the overlapping predictions.
- Training several models on the original and tiled data, including UNet, UNet++ and an ensemble of UNet++ models
- Unet++ with 'resnet50' backbone and Unet++ with 'efficientnet-b7' backbone
- Encoder weights: pre-trained on ImageNet
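The mask clean-up and tiling steps listed above can be sketched with NumPy alone. This is a minimal illustration, not the team's code: the function names, the stride of 512, the use of 4-connectivity, and the rule that ties count as background are all assumptions. Only the 1024x1024 tile size, the 100-pixel threshold, and the idea of restoring the full mask by voting come from the description.

```python
import numpy as np


def remove_small_buildings(mask, min_pixels=100):
    """Zero out 4-connected foreground components smaller than `min_pixels`."""
    mask = np.asarray(mask).astype(bool)
    h, w = mask.shape
    seen = np.zeros_like(mask)
    out = mask.copy()
    for sy, sx in zip(*np.nonzero(mask)):
        if seen[sy, sx]:
            continue
        # Flood-fill one component with an explicit stack.
        seen[sy, sx] = True
        stack, component = [(sy, sx)], []
        while stack:
            y, x = stack.pop()
            component.append((y, x))
            for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                if 0 <= ny < h and 0 <= nx < w and mask[ny, nx] and not seen[ny, nx]:
                    seen[ny, nx] = True
                    stack.append((ny, nx))
        if len(component) < min_pixels:
            ys, xs = zip(*component)
            out[list(ys), list(xs)] = False
    return out.astype(np.uint8)


def tile_starts(length, tile, stride):
    """Start offsets so that tiles of size `tile` cover the range [0, length)."""
    if length <= tile:
        return [0]
    starts = list(range(0, length - tile + 1, stride))
    if starts[-1] != length - tile:
        starts.append(length - tile)  # shift the last tile back to end at the border
    return starts


def split_into_tiles(image, tile=1024, stride=512):
    """Return a list of (y, x, crop) for overlapping tiles (stride is an assumption)."""
    h, w = image.shape[:2]
    return [(y, x, image[y:y + tile, x:x + tile])
            for y in tile_starts(h, tile, stride)
            for x in tile_starts(w, tile, stride)]


def merge_by_voting(tile_masks, shape):
    """Restore a full-size binary mask from [(y, x, tile_mask), ...].

    A pixel is foreground when more than half of the tiles covering it
    predict foreground; ties go to background (an assumed convention).
    """
    votes = np.zeros(shape, dtype=np.int32)
    counts = np.zeros(shape, dtype=np.int32)
    for y, x, tile_mask in tile_masks:
        th, tw = tile_mask.shape
        votes[y:y + th, x:x + tw] += np.asarray(tile_mask, dtype=np.int32)
        counts[y:y + th, x:x + tw] += 1
    return (2 * votes > counts).astype(np.uint8)
```

In use, each crop returned by `split_into_tiles` would be passed through a segmentation model, and the resulting per-tile masks would be handed to `merge_by_voting` together with their offsets. If the per-tile "prediction" is simply the crop of a binary mask itself, the merged result equals the original mask, which is a convenient sanity check.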
Both models in the ensemble are instances of the UNet++ architecture, an extension of the original UNet. UNet++ replaces UNet's plain skip connections with nested, dense skip connections to improve information flow between the encoder and decoder. Using two different backbones (ResNet-50 and EfficientNet-B7) lets the ensemble combine features learned by two different encoder families.
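One common way to combine two segmentation models is to average their per-pixel foreground probabilities and threshold the result. The sketch below shows that scheme with NumPy; it is an assumption about how such an ensemble could be formed, since the description does not state the combination rule. The function name, the 0.5 threshold, and the optional weights are hypothetical.

```python
import numpy as np


def ensemble_mask(prob_maps, threshold=0.5, weights=None):
    """Average foreground probability maps from several models, then threshold.

    prob_maps: list of arrays with identical shape and values in [0, 1],
               e.g. one map per UNet++ model in the ensemble.
    weights:   optional per-model weights (assumed uniform when None).
    """
    stacked = np.stack([np.asarray(p, dtype=np.float64) for p in prob_maps])
    mean_prob = np.average(stacked, axis=0, weights=weights)
    return (mean_prob >= threshold).astype(np.uint8)


# Example with two made-up 2x2 probability maps.
p_resnet = np.array([[0.9, 0.2], [0.6, 0.4]])
p_effnet = np.array([[0.7, 0.4], [0.3, 0.8]])
mask = ensemble_mask([p_resnet, p_effnet])
```

Averaging probabilities rather than hard masks keeps each model's confidence in play: a pixel that one model marks with high confidence can still be kept when the other model is only mildly unsure.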
| Model / backbone | F1 score (validation) | F1 score (test) |
|---|---|---|
| UNet++ / resnet50 | 0.5813 | |
| DeepLabV3+ / resnet152 | 0.6043 | |
| UNet++ (augmented) / resnet50 | 0.6238 | |
| UNet++ / timm-efficientnet-b8 | 0.6571 | |
| UNet++ ensemble / resnet50, efficientnet-b7 | 0.7006 | 0.71133 |
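For reference, the F1 score of a binary mask is 2·TP / (2·TP + FP + FN), computed over pixels. The snippet below is a minimal sketch of that formula. Whether the scores in the table were computed per image and averaged, or over all pixels at once, is not stated, and returning 1.0 when both masks are empty is an assumed convention.

```python
import numpy as np


def f1_score_binary(pred, target):
    """Pixel-wise F1 score between two binary masks of the same shape."""
    pred = np.asarray(pred).astype(bool)
    target = np.asarray(target).astype(bool)
    tp = np.logical_and(pred, target).sum()    # predicted building, is building
    fp = np.logical_and(pred, ~target).sum()   # predicted building, is background
    fn = np.logical_and(~pred, target).sum()   # missed building pixels
    denom = 2 * tp + fp + fn
    # Both masks empty: treat as a perfect match (assumed convention).
    return 1.0 if denom == 0 else float(2 * tp / denom)
```

For example, a prediction `[[1, 1], [0, 0]]` against a target `[[1, 0], [1, 0]]` has one true positive, one false positive and one false negative, so the score is 2 / 4 = 0.5.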
Distributed under the MIT License. See LICENSE for more information.