Applying ViT-CoMer to Semantic Segmentation

Our segmentation code is developed on top of MMSegmentation v0.20.2.

Usage

Install MMSegmentation v0.20.2.

# recommended environment: torch1.9 + cuda11.1
pip install torch==1.9.0+cu111 torchvision==0.10.0+cu111 torchaudio==0.9.0 -f https://download.pytorch.org/whl/torch_stable.html
pip install mmcv-full==1.4.2 -f https://download.openmmlab.com/mmcv/dist/cu111/torch1.9.0/index.html
pip install timm==0.4.12
pip install mmdet==2.22.0 # for Mask2Former
pip install mmsegmentation==0.20.2
ln -s ../detection/ops ./
cd ops && sh make.sh # compile deformable attention
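
After installing, it can help to confirm that the pinned versions are the ones actually active in the environment, since MMSegmentation checks the installed mmcv version when it is imported. The following is a minimal sketch and not part of this repository: `check_pin` and `installed_version` are hypothetical helper names, and the script assumes that `pip` on your PATH belongs to the environment used above. The torch packages are left out for brevity.

```shell
#!/bin/sh
# Hypothetical helper: compare an installed version string with the pinned one.
# Prints one status line and returns 0 on a match, 1 otherwise.
check_pin() {
    pkg="$1"; want="$2"; have="$3"
    if [ "$have" = "$want" ]; then
        echo "ok       $pkg==$have"
    else
        echo "mismatch $pkg: want $want, have ${have:-none}"
        return 1
    fi
}

# Hypothetical helper: print the version pip reports for a package
# (prints nothing if the package is not installed).
installed_version() {
    pip show "$1" 2>/dev/null | awk '/^Version:/ {print $2}'
}

# Pins copied from the install commands above.
status=0
for pin in mmcv-full==1.4.2 timm==0.4.12 mmdet==2.22.0 mmsegmentation==0.20.2; do
    name="${pin%%==*}"      # text before "=="
    version="${pin##*==}"   # text after "=="
    check_pin "$name" "$version" "$(installed_version "$name")" || status=1
done
[ "$status" -eq 0 ] || echo "Some packages differ from the pinned versions."
```

Any line starting with `mismatch` points at a package to reinstall with the matching `pip install` command above.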

Main Results and Models

ADE20K val

| Method | Backbone | Pretrain | Lr schd | Crop Size | mIoU (SS/MS) | #Param | Config | Ckpt | Log |
|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
| UperNet | ViT-CoMer-T | DeiT-T | 160k | 512 | 43.5/- | 38.7M | config | ckpt | log |
| UperNet | ViT-CoMer-S | DeiT-S | 160k | 512 | 46.5/- | 61.4M | config | ckpt | log |
| UperNet | ViT-CoMer-B | DeiT-S | 160k | 512 | 48.8/- | 144.7M | - | - | - |

COCO-Stuff-164K

| Method | Backbone | Pretrain | Lr schd | Crop Size | mIoU (SS/MS) | #Param | Config | Ckpt | Log |
|:--|:--|:--|:--|:--|:--|:--|:--|:--|:--|
| Mask2Former | ViT-CoMer-L | BEiTv2-L | 80k | 896 | 52.7/- | 633.6M | config | ckpt | log |

Evaluation

To evaluate ViT-CoMer-T + UperNet (512) on ADE20K val on a single node with 8 GPUs, run:

sh test.sh
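
If you need a different GPU count, config, or checkpoint, the launch command can be written out by hand. The sketch below follows the general pattern of MMSegmentation v0.x distributed testing (`torch.distributed.launch` plus a `test.py` entry point that takes a config, a checkpoint, `--launcher` and `--eval`). It is an assumption about how such a launch usually looks, not a copy of this repository's `test.sh`: `build_eval_cmd` is a hypothetical helper, the location of `test.py` is assumed, and the config and checkpoint paths are placeholders.

```shell
#!/bin/sh
# Hypothetical helper: print a multi-GPU evaluation command in the style of
# MMSegmentation v0.x. Nothing is executed; the command is only echoed so it
# can be inspected before running.
#   $1 = config file      (placeholder path)
#   $2 = checkpoint file  (placeholder path)
#   $3 = number of GPUs   (default: 8)
#   $4 = master port      (default: 29500)
build_eval_cmd() {
    config="$1"; checkpoint="$2"; gpus="${3:-8}"; port="${4:-29500}"
    echo "python -m torch.distributed.launch --nproc_per_node=$gpus --master_port=$port test.py $config $checkpoint --launcher pytorch --eval mIoU"
}

# Example with placeholder paths on 8 GPUs.
build_eval_cmd path/to/config.py path/to/checkpoint.pth 8
```

Check the printed command against the contents of `test.sh` before relying on it, since the script may set additional options.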

Training

To train ViT-CoMer-T + UperNet on ADE20K on a single node with 8 GPUs, run:

sh train.sh