This codebase implements object detection and instance segmentation with MMDetection, using EfficientViT as the backbone.
Model | Pretrain | Lr Schd | Box AP | AP@50 | AP@75 | Config | Link |
---|---|---|---|---|---|---|---|
EfficientViT-M4 | ImageNet-1k | 1x | 32.7 | 52.2 | 34.1 | config | model/log |
Model | Pretrain | Lr Schd | Mask AP | AP@50 | AP@75 | Config | Link |
---|---|---|---|---|---|---|---|
EfficientViT-M4 | ImageNet-1k | 1x | 31.0 | 51.2 | 32.2 | config | model/log |
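
The AP@50 and AP@75 columns follow the usual COCO convention: a prediction counts as a match when its intersection-over-union (IoU) with a ground-truth instance reaches 0.50 or 0.75, while the plain AP column averages over IoU thresholds from 0.50 to 0.95. As a minimal sketch of what that threshold means for boxes, assuming corner-format `(x1, y1, x2, y2)` coordinates (the helper `box_iou` is illustrative and is not part of this codebase or of the COCO evaluation tools):

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2)."""
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Width/height are clamped at zero so disjoint boxes give zero overlap.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


pred = (0, 0, 10, 10)   # hypothetical predicted box
gt = (0, 0, 10, 15)     # hypothetical ground-truth box
iou = box_iou(pred, gt)  # 100 / 150
print(f"IoU = {iou:.3f}, match@50: {iou >= 0.50}, match@75: {iou >= 0.75}")
```

In this example the prediction would count as a match at the 0.50 threshold but not at 0.75. For Mask AP the same idea applies, with the IoU computed over pixel masks instead of boxes.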
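Please follow the steps below to set up EfficientViT for downstream tasks.
Install mmcv-full and MMDetection via MIM. Note that mmcv-full is the package name used by the MMCV 1.x series, which pairs with MMDetection 2.x, and the --eval option used in the evaluation commands also belongs to that series, so pin compatible versions if MIM resolves to a newer release: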
pip install -U openmim
mim install mmcv-full
mim install mmdet
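
After installation, it can help to confirm which versions actually ended up in the environment. The following is a minimal sketch using only the Python standard library; the helper name `installed_versions` is illustrative and not part of this codebase.

```python
from importlib import metadata


def installed_versions(packages=("mmcv-full", "mmdet")):
    """Return {package: version string, or None if the package is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```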
Prepare the COCO 2017 dataset according to the instructions in MMDetection. The dataset should be organized as follows:
downstream
├── data
│ ├── coco
│ │ ├── annotations
│ │ ├── train2017
│ │ ├── val2017
│ │ ├── test2017
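
Before launching evaluation or training, the layout above can be checked with a few lines of Python. This is a minimal sketch, assuming it is run from inside the `downstream` directory; the helper `missing_coco_dirs` is illustrative and not part of this codebase.

```python
from pathlib import Path

# Sub-directories from the layout above, relative to the `downstream` directory.
EXPECTED_DIRS = (
    "data/coco/annotations",
    "data/coco/train2017",
    "data/coco/val2017",
    "data/coco/test2017",
)


def missing_coco_dirs(root="."):
    """Return the expected COCO sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in EXPECTED_DIRS if not (root / d).is_dir()]


if __name__ == "__main__":
    missing = missing_coco_dirs(".")
    if missing:
        print("Missing directories:", ", ".join(missing))
    else:
        print("COCO layout looks complete.")
```

The check only looks at directory names; it does not verify that the images or annotation files themselves are present.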
First, download the MSCOCO pretrained models from the model-zoo.
Below are the instructions for evaluating the models on the MSCOCO 2017 val set:
Object Detection
To evaluate the RetinaNet model with EfficientViT_M4 as the backbone, run:
bash ./dist_test.sh configs/retinanet_efficientvit_m4_fpn_1x_coco.py ./retinanet_efficientvit_m4_fpn_1x_coco.pth 8 --eval bbox
Here, 8 is the number of GPUs. For additional arguments, please refer to the MMDetection documentation.
Instance Segmentation
To evaluate the Mask R-CNN model with EfficientViT_M4 as the backbone, run:
bash ./dist_test.sh configs/mask_rcnn_efficientvit_m4_fpn_1x_coco.py ./mask_rcnn_efficientvit_m4_fpn_1x_coco.pth 8 --eval bbox segm
Here, 8 is the number of GPUs. For additional arguments, please refer to the MMDetection documentation.
First, download the ImageNet-1k pretrained EfficientViT-M4 model from the model-zoo.
Below are the instructions for training the models on the MSCOCO 2017 train set:
Object Detection
To train the RetinaNet model with EfficientViT_M4 as the backbone on a single machine with multiple GPUs, run:
bash ./dist_train.sh configs/retinanet_efficientvit_m4_fpn_1x_coco.py 8 --cfg-options model.backbone.pretrained=$PATH_TO_IMGNET_PRETRAIN_MODEL
Here, 8 is the number of GPUs. For additional arguments, please refer to the MMDetection documentation.
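
When training with a GPU count other than 8, the total batch size changes, and a common heuristic (the linear scaling rule) is to scale the learning rate by the same factor. The sketch below only illustrates that arithmetic. The learning rate and per-GPU batch size in it are placeholders, since the values in the actual configs are not stated here, and the helper `scale_learning_rate` is not part of this codebase.

```python
def scale_learning_rate(base_lr, base_batch_size, batch_size):
    """Linear scaling rule: learning rate proportional to total batch size."""
    return base_lr * batch_size / base_batch_size


# Placeholder values, NOT taken from the configs in this repository.
base_lr = 1e-4            # hypothetical learning rate tuned for the 8-GPU run
samples_per_gpu = 2       # hypothetical per-GPU batch size
base_batch_size = 8 * samples_per_gpu

# Running on 4 GPUs halves the total batch size, so the rate is halved too.
new_lr = scale_learning_rate(base_lr, base_batch_size, 4 * samples_per_gpu)
print(f"suggested learning rate for 4 GPUs: {new_lr:g}")
```

Whether this adjustment is appropriate depends on the optimizer and schedule defined in the config, so treat the result as a starting point.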
Instance Segmentation
To train the Mask R-CNN model with EfficientViT_M4 as the backbone on a single machine with multiple GPUs, run:
bash ./dist_train.sh configs/mask_rcnn_efficientvit_m4_fpn_1x_coco.py 8 --cfg-options model.backbone.pretrained=$PATH_TO_IMGNET_PRETRAIN_MODEL
Here, 8 is the number of GPUs. For additional arguments, please refer to the MMDetection documentation.
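
In the training commands, `--cfg-options model.backbone.pretrained=...` overrides a single nested entry of the config without editing the config file. The sketch below imitates that idea on a plain dictionary. It is not MMCV's implementation, and the config contents (including the `type` value and the checkpoint path) are hypothetical.

```python
def apply_override(cfg, dotted_key, value):
    """Set cfg[k1][k2]...[kn] = value for a key written as 'k1.k2...kn'."""
    keys = dotted_key.split(".")
    node = cfg
    # Walk down to the parent of the final key, creating dicts where needed.
    for key in keys[:-1]:
        node = node.setdefault(key, {})
    node[keys[-1]] = value
    return cfg


# Hypothetical, heavily simplified stand-in for a detector config.
cfg = {"model": {"backbone": {"type": "EfficientViT_M4", "pretrained": None}}}

# Equivalent in spirit to: --cfg-options model.backbone.pretrained=/path/to/efficientvit_m4.pth
apply_override(cfg, "model.backbone.pretrained", "/path/to/efficientvit_m4.pth")
print(cfg["model"]["backbone"]["pretrained"])
```

Only the addressed entry changes; sibling entries such as the backbone `type` are left as they were.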
The downstream task implementation is mainly based on the following codebases. We gratefully thank the authors for their wonderful work.