Detectron2 implementation of STMDA-RetinaNet

This is the implementation of our Computer Vision and Image Understanding 2022 work 'A Multi Camera Unsupervised Domain Adaptation Pipeline for Object Detection in Cultural Sites through Adversarial Learning and Self-Training'. The aim is to reduce the gap between the source and target distributions, improving the object detector's performance on the target domains when the training and test data belong to different distributions.
If you want to use this code with your own dataset, please follow the guide below.
Please leave a star ⭐ and cite the following paper if you use this repository for your project.

@article{PASQUALINO2022103487,
    title = {A multi camera unsupervised domain adaptation pipeline for object detection in cultural sites through adversarial learning and self-training},
    journal = {Computer Vision and Image Understanding},
    pages = {103487},
    year = {2022},
    issn = {1077-3142},
    doi = {https://doi.org/10.1016/j.cviu.2022.103487},
    url = {https://www.sciencedirect.com/science/article/pii/S1077314222000911},
    author = {Giovanni Pasqualino and Antonino Furnari and Giovanni Maria Farinella}
}

Architecture

Step 1

In this step the model is trained using labeled synthetic images (source domain) and unlabeled real Hololens and GoPro images (target domains). At the end of this step, the model is used to produce pseudo-labels on both the Hololens and GoPro images.

Step 2

In this step the model is trained using labeled synthetic images (source domain) and the pseudo-labeled real Hololens and GoPro images (target domains) produced in the previous step. This step is iterative: at the end of each iteration, better pseudo-labels are produced on the target domains (see the sketch below).
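The following is a minimal sketch of the pseudo-labeling idea, assuming the standard Detectron2 inference API: run the detector from the previous step on the unlabeled target images and keep only confident detections. It is not the exact code used in stmda_train.py; the config path, weights path, threshold value and output structure are illustrative assumptions.

```python
import cv2
from detectron2.config import get_cfg
from detectron2.engine import DefaultPredictor

cfg = get_cfg()
cfg.merge_from_file("path/to/config.yaml")           # assumption: config used for Step 1
cfg.MODEL.WEIGHTS = "path/to/step1_model_final.pth"  # assumption: weights produced by Step 1
predictor = DefaultPredictor(cfg)

threshold = 0.5  # assumption: confidence threshold for accepting a detection as a pseudo-label

pseudo_labels = []
for image_path in ["target_image_0001.jpg"]:         # iterate over the unlabeled target images
    outputs = predictor(cv2.imread(image_path))
    instances = outputs["instances"].to("cpu")
    keep = instances.scores > threshold              # keep only confident detections
    pseudo_labels.append({
        "file_name": image_path,
        "boxes": instances.pred_boxes[keep].tensor.tolist(),
        "classes": instances.pred_classes[keep].tolist(),
        "scores": instances.scores[keep].tolist(),
    })

# The kept detections would then be written as COCO-style annotation files and
# registered as the "*_target_training" datasets used in Step 2.
```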

Installation

You can use this repository in one of the following three ways.
NB: Detectron2 0.2.1 is required; the code will not work with other versions.

Google Colab

Quickstart here 👉 Open In Colab
Or load and run the STMDA-RetinaNet.ipynb on Google Colab following the instructions inside the notebook.

Detectron 2 on your PC

Follow the official guide to install Detectron2 0.2.1
Or
Download the official Detectron2 0.2.1 from here
Unzip the file and rename the folder to detectron2
Run python -m pip install -e detectron2

Detectron2 via Dockerfile

Follow these instructions:

cd docker/
# Build 
docker build -t detectron2:v0 .

# Launch
docker run --gpus all -it --shm-size=8gb -v /home/yourpath/:/home/yourpath --name=name_container detectron2:v0

If you exit the container, you can restart it using:

docker start name_container
docker exec -it name_container /bin/bash

Dataset

The dataset is available here

Data Preparation

If you want to use this code with your own dataset, arrange the dataset in COCO format. Inside the script stmda_train.py, register your datasets using:
register_coco_instances("dataset_name_source_training",{},"path_annotations","path_images")
register_coco_instances("dataset_name_init_target_training",{},"path_annotations","path_images")
register_coco_instances("dataset_name_init_target2_training",{},"path_annotations","path_images")

These are the paths where the annotations produced at the end of Step 1 will be saved:
register_coco_instances("dataset_name_target_training",{},"path_annotations","path_images")
register_coco_instances("dataset_name_target2_training",{},"path_annotations","path_images")

register_coco_instances("dataset_name_target_test",{},"path_annotations","path_images")
register_coco_instances("dataset_name_target_test2",{},"path_annotations","path_images")

Training

Replace the retinanet.py script at the path detectron2/modeling/meta_arch/ with our retinanet.py.
Do the same for the fpn.py file at detectron2/modeling/backbone/, and for evaluator.py and coco_evaluation.py at detectron2/evaluation/.
Inside the script stmda_train.py you can set the parameters for the second training step, such as the number of iterations and the pseudo-label confidence threshold (see the sketch below).
Run the script stmda_train.py.
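As an illustration only (the actual variable and config names inside stmda_train.py may differ), the parameters mentioned above could look like this:

```python
cfg.SOLVER.MAX_ITER = 40000    # number of training iterations for the second step (assumed value)
PSEUDO_LABEL_THRESHOLD = 0.5   # hypothetical name for the pseudo-label confidence threshold
```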
Trained models are available at these links:
MDA-RetinaNet
STMDA-RetinaNet

MDA-RetinaNet-CycleGAN
STMDA-RetinaNet-CycleGAN

Testing

If you want to test the model, load the trained weights, set the number of iterations to 0, and run stmda_train.py (see the sketch below).
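A minimal sketch, assuming the standard Detectron2 config keys; the exact lines to edit inside stmda_train.py may differ, and the weights path is a placeholder.

```python
cfg.MODEL.WEIGHTS = "path/to/STMDA-RetinaNet_model.pth"  # placeholder path to the downloaded weights
cfg.SOLVER.MAX_ITER = 0                                  # skip training and run evaluation only
```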

Results

Results of baseline and feature alignment methods. S refers to Synthetic, H refers to Hololens and G to GoPro.

| Model | Source | Target | Test H | Test G |
| --- | --- | --- | --- | --- |
| Faster RCNN | S | - | 7.61% | 30.39% |
| RetinaNet | S | - | 14.10% | 37.13% |
| DA-Faster RCNN | S | H+G merged | 10.53% | 48.23% |
| StrongWeak | S | H+G merged | 26.68% | 48.55% |
| CDSSL | S | H+G merged | 28.66% | 45.33% |
| DA-RetinaNet | S | H+G merged | 31.63% | 48.37% |
| MDA-RetinaNet | S | H, G | 34.97% | 50.81% |
| STMDA-RetinaNet | S | H, G | 54.36% | 59.51% |

Results of baseline and feature alignment methods combined with CycleGAN. H refers to Hololens and G to GoPro. "{G, H}" refers to synthetic images translated to the merged Hololens and GoPro domains.

| Model | Source | Target | Test H | Test G |
| --- | --- | --- | --- | --- |
| Faster RCNN | {G, H} | - | 15.34% | 63.60% |
| RetinaNet | {G, H} | - | 31.43% | 69.59% |
| DA-Faster RCNN | {G, H} | H+G merged | 32.13% | 65.19% |
| StrongWeak | {G, H} | H+G merged | 41.11% | 66.45% |
| DA-RetinaNet | {G, H} | H+G merged | 52.07% | 71.14% |
| CDSSL | {G, H} | H+G merged | 53.06% | 71.17% |
| MDA-RetinaNet | {G, H} | H, G | 58.11% | 71.39% |
| STMDA-RetinaNet | {G, H} | H, G | 66.64% | 72.22% |

Other Works

DA-RetinaNet
Detectron2 implementation of DA-Faster RCNN
