
# Getting Started with DiffusionDet

## Installation

The codebase is built on top of Detectron2, Sparse R-CNN, and denoising-diffusion-pytorch; we thank the authors of these projects.

### Requirements

  • Linux or macOS with Python ≥ 3.6
  • PyTorch ≥ 1.9.0 and a torchvision version that matches the PyTorch installation. Install them together from pytorch.org to ensure compatibility.
  • OpenCV, which is optional and needed only by the demo and visualization.

### Steps

  1. Install Detectron2 following https://github.com/facebookresearch/detectron2/blob/main/INSTALL.md#installation.
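For reference, one common route is to build Detectron2 from source with pip; the INSTALL guide linked above also covers prebuilt wheels and CUDA/PyTorch version matching:

```shell
# Build and install Detectron2 from source (requires PyTorch to be installed first)
python -m pip install 'git+https://github.com/facebookresearch/detectron2.git'
```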

  2. Prepare datasets

```shell
mkdir -p datasets/coco
mkdir -p datasets/lvis

ln -s /path_to_coco_dataset/annotations datasets/coco/annotations
ln -s /path_to_coco_dataset/train2017 datasets/coco/train2017
ln -s /path_to_coco_dataset/val2017 datasets/coco/val2017

ln -s /path_to_lvis_dataset/lvis_v1_train.json datasets/lvis/lvis_v1_train.json
ln -s /path_to_lvis_dataset/lvis_v1_val.json datasets/lvis/lvis_v1_val.json
```
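If the symlinks above are set up correctly, the resulting layout should look roughly like this:

```
datasets/
├── coco/
│   ├── annotations/
│   ├── train2017/
│   └── val2017/
└── lvis/
    ├── lvis_v1_train.json
    └── lvis_v1_val.json
```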

  3. Prepare pretrained models

DiffusionDet uses three backbones: ResNet-50, ResNet-101, and Swin-Base. The pretrained ResNet-50 model is downloaded automatically by Detectron2. We also provide pretrained ResNet-101 and Swin-Base weights that are compatible with Detectron2; download them to DiffusionDet_ROOT/models/ before training:

```shell
mkdir models
cd models

# ResNet-101
wget https://github.com/ShoufaChen/DiffusionDet/releases/download/v0.1/torchvision-R-101.pkl

# Swin-Base
wget https://github.com/ShoufaChen/DiffusionDet/releases/download/v0.1/swin_base_patch4_window7_224_22k.pkl

cd ..
```

We also thank the authors of the model conversion scripts for ResNet-101 and Swin-Base.

  4. Train DiffusionDet

```shell
python train_net.py --num-gpus 8 \
    --config-file configs/diffdet.coco.res50.yaml
```
  5. Evaluate DiffusionDet

```shell
python train_net.py --num-gpus 8 \
    --config-file configs/diffdet.coco.res50.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth
```
  • Evaluate with an arbitrary number of boxes (e.g., 300) by setting `MODEL.DiffusionDet.NUM_PROPOSALS 300`.
  • Evaluate with 4 refinement steps by setting `MODEL.DiffusionDet.SAMPLE_STEP 4`.
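For example, the two overrides can be combined into a single evaluation command (the weights path is a placeholder):

```shell
# Evaluate with 300 proposal boxes and 4 refinement steps
python train_net.py --num-gpus 8 \
    --config-file configs/diffdet.coco.res50.yaml \
    --eval-only MODEL.WEIGHTS path/to/model.pth \
    MODEL.DiffusionDet.NUM_PROPOSALS 300 \
    MODEL.DiffusionDet.SAMPLE_STEP 4
```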

We also provide the pretrained DiffusionDet-300boxes model used in the ablation study.

## Inference Demo with Pre-trained Models

Following Detectron2, we provide a command-line tool to run a simple demo.

```shell
python demo.py --config-file configs/diffdet.coco.res50.yaml \
    --input image.jpg --opts MODEL.WEIGHTS diffdet_coco_res50.pth
```

Set `MODEL.WEIGHTS` to a model from the model zoo for evaluation. This command runs inference and shows the visualizations in an OpenCV window.

For details of the command-line arguments, see `demo.py -h` or look at its source code to understand its behavior. Some common arguments are:

  • To run on your webcam, replace `--input files` with `--webcam`.
  • To run on a video, replace `--input files` with `--video-input video.mp4`.
  • To run on CPU, add `MODEL.DEVICE cpu` after `--opts`.
  • To save outputs to a directory (for images) or a file (for webcam or video), use `--output`.
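Putting several of these options together, a hypothetical invocation that runs the demo on a video on CPU and saves the result might look like:

```shell
# Run the demo on a video on CPU and write the annotated result to out.mp4
python demo.py --config-file configs/diffdet.coco.res50.yaml \
    --video-input video.mp4 --output out.mp4 \
    --opts MODEL.WEIGHTS diffdet_coco_res50.pth MODEL.DEVICE cpu
```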