A PyTorch implementation of Dense Receptive Field for Object Detection (accepted by ICPR 2018). This repository is now deprecated; please go to https://github.com/yqyao/SSD_Pytorch instead.
- Install PyTorch-0.3.1 by selecting your environment on the website and running the appropriate command.
- Clone this repository.
- Note: We currently only support Python 3+.
- Then download the dataset by following the instructions below.
- Compile the nms and coco tools:
cd DRFNet
./make.sh
Note*: Check your GPU architecture support in utils/build.py, line 131. The default is:
'nvcc': ['-arch=sm_52',
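If you are unsure which value to use, here is a minimal sketch for querying your GPU's compute capability from PyTorch (assuming a CUDA-enabled build); for example, a GTX 1080 Ti reports capability 6.1, so you would set -arch=sm_61:

```python
# Query the compute capability of GPU 0 to pick the right -arch flag for
# utils/build.py, e.g. capability (6, 1) -> '-arch=sm_61'.
import torch

if torch.cuda.is_available():
    major, minor = torch.cuda.get_device_capability(0)
    print('GPU 0 compute capability: {}.{} -> use -arch=sm_{}{}'.format(
        major, minor, major, minor))
else:
    print('No CUDA device visible')
```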
To make things easy, we provide a simple VOC dataset loader that inherits torch.utils.data.Dataset, making it fully compatible with the torchvision.datasets API.
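For illustration only, here is a minimal sketch of what such a torchvision-style detection dataset looks like; this is not the loader shipped in this repo, and the class name, default image-set name, and target handling are assumptions:

```python
# Illustrative only -- a minimal torchvision-style detection dataset, not the
# loader shipped in this repo. It shows the Dataset contract the README means:
# __getitem__ returns one (image, target) pair and __len__ the dataset size.
import os
import torch.utils.data as data
from PIL import Image

class ToyVOCDetection(data.Dataset):
    def __init__(self, root, image_set='0712_trainval', transform=None):
        self.root = root                      # e.g. VOCROOT/VOC0712/JPEGImages
        self.transform = transform
        list_file = os.path.join(root, 'ImageSets', 'Main', image_set + '.txt')
        with open(list_file) as f:
            self.ids = [line.strip() for line in f if line.strip()]

    def __getitem__(self, index):
        img_id = self.ids[index]
        img = Image.open(os.path.join(self.root, img_id + '.jpg')).convert('RGB')
        # A real loader would parse Annotations/<img_id>.xml into boxes/labels;
        # here the target is just the id, to keep the sketch self-contained.
        target = img_id
        if self.transform is not None:
            img = self.transform(img)
        return img, target

    def __len__(self):
        return len(self.ids)
```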
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2007.sh # <directory>
# specify a directory for dataset to be downloaded into, else default is ~/data/
sh data/scripts/VOC2012.sh # <directory>
Move all images from VOC2007 and VOC2012 into VOCROOT/VOC0712/JPEGImages.
Move all annotations from VOC2007 and VOC2012 into VOCROOT/VOC0712/JPEGImages/Annotations.
Rename and merge the VOC2007 and VOC2012 ImageSets/Main/*.txt lists into VOCROOT/VOC0712/JPEGImages/ImageSets/Main/*.txt.
The merged txt lists are as follows:
2012_test.txt, 2007_test.txt, 0712_trainval_test.txt, 2012_trainval.txt, 0712_trainval.txt
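A hedged sketch of that merge step in Python follows; VOCROOT and the exact composition of each merged list are assumptions based on the layout above (e.g. 0712_trainval.txt as VOC2007 trainval + VOC2012 trainval), so adjust both to your setup:

```python
# Hedged sketch of the ImageSets merge described above. VOCROOT and the
# mapping from source lists to merged lists are assumptions -- adjust both.
import os

VOCROOT = os.path.expanduser('~/data/VOCdevkit')   # assumption: default root
main07 = os.path.join(VOCROOT, 'VOC2007', 'ImageSets', 'Main')
main12 = os.path.join(VOCROOT, 'VOC2012', 'ImageSets', 'Main')
dst = os.path.join(VOCROOT, 'VOC0712', 'JPEGImages', 'ImageSets', 'Main')
os.makedirs(dst, exist_ok=True)

def merge(sources, dst_name):
    """Concatenate several image-set lists into one merged list under dst."""
    with open(os.path.join(dst, dst_name), 'w') as out:
        for src in sources:
            with open(src) as f:
                out.write(f.read())

# 0712_trainval.txt = VOC2007 trainval + VOC2012 trainval (assumed composition)
merge([os.path.join(main07, 'trainval.txt'),
       os.path.join(main12, 'trainval.txt')], '0712_trainval.txt')
# 2007_test.txt / 2012_test.txt are the per-year test lists, renamed
merge([os.path.join(main07, 'test.txt')], '2007_test.txt')
merge([os.path.join(main12, 'test.txt')], '2012_test.txt')
```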
Install the MS COCO dataset at /path/to/coco from the official website; the default location is ~/data/COCO. Follow the instructions to prepare the minival2014 and valminusminival2014 annotations. All label files (.json) should be under the COCO/annotations/ folder. It should have this basic structure:
$COCO/
$COCO/cache/
$COCO/annotations/
$COCO/images/
$COCO/images/test2015/
$COCO/images/train2014/
$COCO/images/val2014/
UPDATE: The COCO dataset has since released new train2017 and val2017 sets, which are just new splits of the same image sets.
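To sanity-check the layout, here is a small sketch using pycocotools; the annotation filename instances_minival2014.json is the conventional name from the minival instructions and is an assumption here, so verify it against your download:

```python
# Sanity-check the COCO layout above with pycocotools. The annotation filename
# is the conventional one from the minival instructions (an assumption here).
import os
from pycocotools.coco import COCO

coco_root = os.path.expanduser('~/data/COCO')
ann_file = os.path.join(coco_root, 'annotations', 'instances_minival2014.json')
coco = COCO(ann_file)
print('{} images, {} categories'.format(len(coco.getImgIds()),
                                        len(coco.getCatIds())))
```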
- First download the fc-reduced VGG-16 PyTorch base network weights at: https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
- Pre-trained ResNet basenet weight files are available for ResNet50, ResNet101, and ResNet152 (see the wget commands below).
- By default, we assume you have downloaded the files into the DRFNet/weights dir:
mkdir weights
cd weights
wget https://s3.amazonaws.com/amdegroot-models/vgg16_reducedfc.pth
wget https://download.pytorch.org/models/resnet50-19c8e357.pth
wget https://download.pytorch.org/models/resnet101-5d3b4d8f.pth
wget https://download.pytorch.org/models/resnet152-b121ed2d.pth
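As a quick check that a download succeeded, these basenet files are, to the best of our knowledge, plain state dicts, so a hedged snippet like the following should load one (the path assumes the weights dir above):

```python
# Sanity-check a downloaded basenet checkpoint; vgg16_reducedfc.pth is
# assumed to be a plain state dict (an OrderedDict of parameter tensors).
import torch

state_dict = torch.load('weights/vgg16_reducedfc.pth',
                        map_location=lambda storage, loc: storage)
print('{} parameter tensors loaded'.format(len(state_dict)))
```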
- To train DRFNet using the train script, simply specify the parameters listed in train.py as flags or change them manually.
python train.py -v drf_ssd_vgg
Note:
- -d: choose the dataset: VOC, COCO, VOC2012 (voc12 trainval), or VOC0712++ (0712 trainval + 07 test)
- -v: choose the backbone version: ssd_vgg, ssd_res, drf_ssd_vgg, or drf_ssd_res
- -s: image size, 300 or 512
- You can resume training from a checkpoint by specifying its path as one of the training parameters (again, see train.py for options), as in the example below.
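For example, combining the flags from the note above (all other options left at their train.py defaults):

python train.py -v drf_ssd_vgg -d VOC -s 300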
To evaluate a trained network:
python eval.py -v drf_ssd_vgg
You can specify the parameters listed in eval.py as flags or change them manually.
We retrained some models, so the results differ from those in the original paper (size = 300):
| ssd | drf_32 | drf_48 | drf_64 | drf_96 | drf_128 |
|---|---|---|---|---|---|
| 77.2% | 79.87% | 79.93% | 79.73% | 79.38% | 79.65% |
VOC07 metric? Yes
AP for aeroplane = 0.8579
AP for bicycle = 0.8615
AP for bird = 0.7786
AP for boat = 0.7202
AP for bottle = 0.5850
AP for bus = 0.8788
AP for car = 0.8712
AP for cat = 0.8849
AP for chair = 0.6612
AP for cow = 0.8702
AP for diningtable = 0.7796
AP for dog = 0.8577
AP for horse = 0.8750
AP for motorbike = 0.8778
AP for person = 0.8046
AP for pottedplant = 0.5582
AP for sheep = 0.7952
AP for sofa = 0.8041
AP for train = 0.8800
AP for tvmonitor = 0.7847
Mean AP = 0.7993
GTX 1080 Ti: ~70 FPS
- Wei Liu, et al. "SSD: Single Shot MultiBox Detector." ECCV2016.
- Original Implementation (CAFFE)
- A list of other great SSD ports that were sources of inspiration (especially the Chainer repo):