This is the official repository of our paper:
Salvage of Supervision in Weakly Supervised Object Detection
Lin Sui, Chen-Lin Zhang and Jianxin Wu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Notes:
- As we build our stage 1 model based on UWSOD repo which adopts an earlier detectron2 version, please prepare two different conda environments as follows.
# create conda environment for Stage 1
conda create -n wsod python=3.7 -y
conda activate wsod
conda install pytorch==1.6.0 torchvision==0.7.0 cudatoolkit=10.1 -c pytorch
# You may need to install appropriate version of pytorch according to you device and driver
# For example 30XX GPU w/ pytorch 1.9.0 cudatoolkit 11.1
# pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
# download SoS-WSOD
git clone https://github.com/suilin0432/SoS-WSOD.git
# install detectron2 and wsl
cd SoS-WSOD
cd uwsod
# install detectron2
python3 -m pip install -v -e .
# install wsl
cd projects/WSL
pip install git+https://github.com/lucasb-eyer/pydensecrf.git
pip install opencv-python sklearn shapely
pip install git+https://github.com/cocodataset/panopticapi.git
git submodule update --init --recursive
python3 -m pip install -v -e .
# create conda environment for Stage 2 and 3
conda create -n fsod python=3.7 -y
conda activate fsod
conda install pytorch==1.7.1 torchvision==0.8.2 torchaudio==0.7.2 cudatoolkit=10.1 -c pytorch
# You may need to install appropriate version of pytorch according to you device and driver
# For example 30XX GPU w/ pytorch 1.9.0 cudatoolkit 11.1
# pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
# install detectron2
cd ${Detectron2_Path}
python3 -m pip install -v -e .
-
Download PASCAL VOC Dataset from official website
-
Link the dataset into uwsod, detectron2 and unbias
# uwsod cd SoS-WSOD/uwsod/ mkdir datasets ln -s $VOC_PATH datasets/VOC2007 # detectron2 cd ../detectron2 mkdir datasets ln -s $VOC_PATH datasets/VOC2007 # unbias cd ../unbias mkdir datasets ln -s $VOC_PATH datasets/VOC2007
-
Prepare Proposal file following here
-
Download MS-COCO Dataset from official website
-
Link the dataset into uwsod, detectron2 and unbias
# uwsod cd SoS-WSOD/uwsod/ ln -s $COCO_PATH datasets/coco # detectron2 cd ../detectron2 ln -s $COCO_PATH datasets/coco # unbias cd ../unbias ln -s $COCO_PATH datasets/coco
-
Prepare Proposal file following here
We will use VOC2007 as the example.
-
Download VGG Backbone from here following UWSOD repo.
-
Create models category and put the VGG backbone into it.
cd SoS-WSOD/uwsod mkdir -p models/VGG ln -s $VGG_PATH modles/VGG/
-
Train a basic WSOD model first:
bash run/code_release/oicr_plus_voc07.sh
-
Generate prediction result:
bash run/code_release/oicr_plus_voc07_detection_result.sh
-
Generate pseudo labels with PGF
# VOC2007 python tools/pgf.py --det-path uwsod/datasets/VOC2007/detection_results/ --save-path uwsod/datasets/VOC2007/pseudo_labels --prefix oicr_plus_ --dataset voc2007 # COCO python tools/pgf.py --det-path uwsod/datasets/coco/detection_results/ --save-path uwsod/datasets/coco/pseudo_labels --prefix oicr_plus_ --dataset coco --use-diff
-
Register a new dataset in Detectron2
-
Generate the base split (keep len(dataset)-1 images for simplicity)
cd unbias/ python generate_base_split.py --config configs/code_release/voc_baseline.yaml --save-path ./dataseed/VOC_all.txt
-
Perform pseudo-FSOD
# using VOC2007 for example bash run/code_release/voc_baseline.sh
-
add multi-label messages into pseudo-label annotation files:
# VOC2007 python tools/add_multi_label.py --pgt-temp unbias/datasets/VOC2007/pseudo_labels/oicr_plus_voc_2007_{}.json --dataset voc2007 # COCO python tools/add_multi_label.py --pgt-temp unbias/datasets/coco/pseudo_labels/oicr_plus_coco_2014_{}.json --dataset coco
-
dataset split & get the split percent:
# Note: After splitting process, the percentage is printed. # Use split_single.py (single process & single gpu) # VOC2007 python split_single.py --config ./configs/code_release/voc_split.yaml --ckpt ./output/voc_baseline/model_0007999.pth --save-path ./dataseed/VOC07_oicr_plus_split.txt --k 2000 # COCO python split_single.py --config ./configs/code_release/coco_split.yaml --ckpt ./output/coco_baseline/model_0029999.pth --save-path ./dataseed/COCO_oicr_plus_split.txt --k 2000 # Use split_multi.py (multiple process & multiple gpu) # VOC2007 python split_multi.py --config ./configs/code_release/voc_split.yaml --ckpt ./output/voc_baseline/model_0007999.pth --save-path ./dataseed/VOC07_oicr_plus_split.txt --k 2000 --gpu 8 # COCO python split_multi.py --config ./configs/code_release/coco_split.yaml --ckpt unbias/output/coco_baseline/model_0029999.pth --save-path unbias/dataseed/COCO_oicr_plus_split.txt --k 2000 --gpu 8
-
perform ssod training
# using VOC2007 as example # 1. change the DATALOADER.SUP_PERCENT in bash file # 2. run the bash file bash run/code_release/voc_ssod.sh
-
extract single branch of the model
python tools/convert2detectron2.py ${MODEL_PATH} ${OUTPUT_PATH} -m [teacher(default) | student]
-
Perform TTA test
python train_net_test_tta.py \ --num-gpus 8 \ --config configs/code_release/voc07_tta_test.yaml \ --dist-url tcp://0.0.0.0:21197 --eval-only \ MODEL.WEIGHTS ${MODEL_PATH} \ OUTPUT_DIR ${OUTPUT_DIR}
stage | model link | |||
---|---|---|---|---|
SoS-WSOD stage 1 | 26.2 | 54.1 | 22.8 | link |
SoS-WSOD stage 1+2 | 27.3 | 57.6 | 22.5 | link |
SoS-WSOD stage 1+2+3 | 31.6 | 62.7 | 28.1 | link |
SoS-WSOD stage 1+2+3 (low threshold test) | 31.7 | 63.1 | 28.1 | same as above |
SoS-WSOD+ on VOC2007 (WSOD part in the Journal Version)
stage | model link | |||
---|---|---|---|---|
SoS-WSOD (adopt CASD as stage 1) | 32.8 | 64.1 | 29.8 | link |
SoS-WSOD+ (VOC2007 Only) | 16.0 | 37.8 | 11.5 | link |
SoS-WSOD+ (COCO Pretrain) | 30.4 | 59.8 | 27.2 | link |
SoS-WSOD+ | 35.5 | 65.3 | 33.1 | link |
Note:
- All results are obtained w/o TTA.
- low threshold test denotes we adopt lower prediction threshold which is widely used in WSOD instead of the default value (0.05) in FSOD. In our paper, results of experiments, which shrink the technique gap (stage 2 & 3), use the default 0.05.
- In stage 1 of SoS-WSOD w/ CASD, we directly use the model released by CASD official repo.
- SoS-WSOD+ means we enable the vanilla ResNet in stage 1 with the help of contrastive learning.
- VOC2007 Only indicates that in all stages we do not rely on any other datasets besides VOC2007.
- COCO Pretrain means we rid of ImageNet dependency and pretrain the model w/ unlabeled MS-COCO in all stages.
stage | model link | |||
---|---|---|---|---|
stage 1 | 11.6 | 23.6 | 10.4 | link |
stage 1+2 | 13.7 | 27.5 | 12.2 | link |
stage 1+2+3 | 15.5 | 30.6 | 14.4 | link |
stage 1+2+3 (low threshold test) | 15.9 | 31.6 | 14.6 | same as above |
python3 projects/WSL/tools/train_net.py \
--num-gpus 4 \
--config-file projects/WSL/configs/Detection/code_release/voc07_oicr_plus.yaml \
--dist-url tcp://0.0.0.0:17346 --eval-only \
MODEL.WEIGHTS ${MODEL_PATH} \
OUTPUT_DIR ${OUTPUT_DIR} TEST.AUG.ENABLED False
python train_net_test_tta.py \
--num-gpus 8 \
--config configs/code_release/voc07_tta_test.yaml \
--dist-url tcp://0.0.0.0:21197 --eval-only \
MODEL.WEIGHTS ${MODEL_PATH} \
OUTPUT_DIR ${OUTPUT_DIR} TEST.AUG.ENABLED False
-
SoS-WSOD+ w/ unlabeled ImageNet Pretrain
python train_net_test_tta.py \ --num-gpus 8 \ --config configs/code_release/sos_plus_test.yaml \ --dist-url tcp://0.0.0.0:21197 --eval-only \ MODEL.WEIGHTS ${MODEL_PATH} \ OUTPUT_DIR ${OUTPUT_DIR} TEST.AUG.ENABLED False
-
SoS-WSOD+ w/o ImageNet (unlabeled COCO Pretrain & VOC2007 only)
python train_net_test_tta.py \ --num-gpus 8 \ --config configs/code_release/sos_plus_wo_imagenet_test.yaml \ --dist-url tcp://0.0.0.0:21197 --eval-only \ MODEL.WEIGHTS ${MODEL_PATH} \ OUTPUT_DIR ${OUTPUT_DIR} TEST.AUG.ENABLED False
- For readability and usability, we clean and rewrite our codes. We do evaluate the codebase on VOC2007 and could get comparable or even better performance than results which are reported in our CVPR 2022 paper. However, we do not evaluate on COCO yet.
- As we tried, inference result of the provided detector model may have some deviation according to the experiment environment. For example, compared with using the experiment environment with gcc 5, we find a little bit lower performance with gcc 7. Such a phenomenon is founded under both pytorch 1.9.0 and pytorch 1.6.0. But, results obtained by training following the SoS framework from scratch will not be affected.
This code is built upon UWSOD, unbiased-teacher and detectron2, thanks all the contributors of these codebases.
- Evaluate results on MS-COCO.
- Salvage of Supervision in Weakly Supervised Instance Segmentation (SoS-WSIS) and Salvage of Supervision in Weakly Supervised Semantic Segmentation (SoS-WSSS) are coming.
- Enable vanilla ResNet in stage 1 and get rid of ImageNet Pretraining.