
[CVPR 2024] - We propose CRKD to bridge the performance gap between LC and CR detectors with a novel cross-modality knowledge distillation (KD) framework.


Song-Jingyu/CRKD


CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation

News

  • (2024/06) Code is released!
  • (2024/02) CRKD is accepted to CVPR 2024!

Abstract

In the field of 3D object detection for autonomous driving, LiDAR-Camera (LC) fusion is the top-performing sensor configuration. Still, LiDAR is relatively high cost, which hinders adoption of this technology for consumer automobiles. Alternatively, camera and radar are commonly deployed on vehicles already on the road today, but performance of Camera-Radar (CR) fusion falls behind LC fusion. In this work, we propose Camera-Radar Knowledge Distillation (CRKD) to bridge the performance gap between LC and CR detectors with a novel cross-modality KD framework. We use the Bird’s-Eye-View (BEV) representation as the shared feature space to enable effective knowledge distillation. To accommodate the unique cross-modality KD path, we propose four distillation losses to help the student learn crucial features from the teacher model. We present extensive evaluations on the nuScenes dataset to demonstrate the effectiveness of the proposed CRKD framework.

Results

3D Object Detection (on the nuScenes val)

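| Model    | Modality | Backbone | Resolution | mAP  | NDS  |
|----------|----------|----------|------------|------|------|
| Teacher  | L+C      | R50      | 256x704    | 62.6 | 68.2 |
| Baseline | C+R      | R50      | 256x704    | 41.8 | 52.9 |
| Student  | C+R      | R50      | 256x704    | 41.7 | 53.5 |
| CRKD     | C+R      | R50      | 256x704    | 43.2 | 54.9 |
| Teacher  | L+C      | SwinT    | 256x704    | 66.1 | 70.3 |
| Baseline | C+R      | SwinT    | 256x704    | 43.2 | 54.1 |
| Student  | C+R      | SwinT    | 256x704    | 44.9 | 55.9 |
| CRKD     | C+R      | SwinT    | 256x704    | 46.7 | 57.3 |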
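
To make the improvements in the validation results concrete, the short sketch below recomputes the difference between the CRKD and Student rows for each backbone. The numbers are copied from the table above; the dictionary layout and the function name `gain` are illustrative choices and not part of the CRKD codebase.

```python
# Validation results copied from the table above: (mAP, NDS) per backbone and model.
VAL_RESULTS = {
    "R50": {
        "Teacher": (62.6, 68.2),
        "Baseline": (41.8, 52.9),
        "Student": (41.7, 53.5),
        "CRKD": (43.2, 54.9),
    },
    "SwinT": {
        "Teacher": (66.1, 70.3),
        "Baseline": (43.2, 54.1),
        "Student": (44.9, 55.9),
        "CRKD": (46.7, 57.3),
    },
}


def gain(backbone, model="CRKD", reference="Student"):
    """Return (mAP gain, NDS gain) of `model` over `reference`, rounded to one decimal."""
    model_map, model_nds = VAL_RESULTS[backbone][model]
    ref_map, ref_nds = VAL_RESULTS[backbone][reference]
    return round(model_map - ref_map, 1), round(model_nds - ref_nds, 1)


if __name__ == "__main__":
    for backbone in VAL_RESULTS:
        d_map, d_nds = gain(backbone)
        print(f"{backbone}: CRKD vs Student: +{d_map} mAP, +{d_nds} NDS")
```

The same helper can compare against the teacher, for example `gain("SwinT", model="Teacher", reference="CRKD")`, to see how much of the LC/CR gap remains.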

3D Object Detection (on the nuScenes test)

| Model | Modality | Backbone | Resolution | mAP  | NDS  |
|-------|----------|----------|------------|------|------|
| CRKD  | C+R      | SwinT    | 256x704    | 48.7 | 58.7 |

Usage

Prerequisites

The code is built with the following libraries:

We also provide a Dockerfile to ease environment setup. To get started with Docker, please make sure that nvidia-docker is installed on your machine. Then execute the following commands to build the Docker image and start a container:

cd docker
# Build docker image
docker build . -t crkd
# Run docker container
./run.sh

We recommend running data preparation (instructions are available in the next section) outside the Docker container if possible. Note that the dataset directory should be an absolute path. Inside the container, please run the following commands to clone our repo and install the custom CUDA extensions:

git clone https://github.com/Song-Jingyu/CRKD.git && cd CRKD
# Before setting up the environment, make sure the following lines exist in the CRKD/setup.py file.
make_cuda_ext(
    name="feature_decorator_ext",
    module="mmdet3d.ops.feature_decorator",
    sources=["src/feature_decorator.cpp"],
    sources_cuda=["src/feature_decorator_cuda.cu"],
),
# Also comment out these lines in CRKD/mmdet3d/ops/feature_decorator/src/feature_decorator.cpp.
static auto registry =
     torch::RegisterOperators("feature_decorator_ext::feature_decorator_forward", &feature_decorator_forward);
# Then build and install the extensions.
python setup.py develop
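
The two manual checks described above can be automated before running `python setup.py develop`. The following is a minimal sketch, not part of the repository: the function names are hypothetical, and the comment check is a line-based heuristic that only recognizes `//` comments (it does not handle `/* ... */` blocks).

```python
import re
from pathlib import Path


def has_feature_decorator_ext(setup_py_text):
    """True if the setup.py text registers an extension named feature_decorator_ext."""
    pattern = r'name\s*=\s*["\']feature_decorator_ext["\']'
    return re.search(pattern, setup_py_text) is not None


def registry_is_commented_out(cpp_text):
    """True if no uncommented line still calls torch::RegisterOperators."""
    for line in cpp_text.splitlines():
        stripped = line.strip()
        if "torch::RegisterOperators" in stripped and not stripped.startswith("//"):
            return False
    return True


if __name__ == "__main__":
    # Paths are relative to the CRKD repository root, as given in the instructions above.
    setup_py = Path("setup.py")
    cpp = Path("mmdet3d/ops/feature_decorator/src/feature_decorator.cpp")
    if setup_py.exists() and cpp.exists():
        print("extension listed:", has_feature_decorator_ext(setup_py.read_text()))
        print("registry commented out:", registry_is_commented_out(cpp.read_text()))
    else:
        print("Run this script from the CRKD repository root.")
```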

You can then create a symbolic link named data that points to the /dataset directory inside the Docker container.

# Our dataset path
# Please change to your path before use
/the_path_to_your_dataset/dataset/
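
As one way to create that link, the sketch below wraps `os.symlink` in a small helper. The function name `link_dataset` and its defaults are assumptions made for illustration; a plain `ln -s` from the shell does the same job.

```python
import os
from pathlib import Path


def link_dataset(dataset_dir, link_path="data"):
    """Create `link_path` as a symlink to `dataset_dir` unless something already exists there."""
    target = Path(dataset_dir)
    link = Path(link_path)
    if not target.is_absolute():
        # The README asks for an absolute dataset path.
        raise ValueError("dataset_dir should be an absolute path")
    if link.is_symlink() or link.exists():
        return link  # leave an existing link or directory untouched
    os.symlink(target, link, target_is_directory=True)
    return link
```

Inside the container, calling `link_dataset("/dataset")` from the repository root would create `data` pointing at `/dataset`.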

Data Preparation

nuScenes

Please follow the instructions from here to download and preprocess the nuScenes dataset. Please remember to download both the detection dataset and the map extension (for BEV map segmentation). After data preparation, you should see the following directory structure (as indicated in mmdetection3d):

mmdetection3d
├── mmdet3d
├── tools
├── configs
├── data
│   ├── nuscenes
│   │   ├── maps
│   │   ├── samples
│   │   ├── sweeps
│   │   ├── v1.0-test
│   │   ├── v1.0-trainval
│   │   ├── nuscenes_database
│   │   ├── nuscenes_old_infos_train.pkl
│   │   ├── nuscenes_old_infos_val.pkl
│   │   ├── nuscenes_old_infos_test.pkl
│   │   ├── nuscenes_old_coor_dbinfos_train.pkl

Note: please change the dataset path to yours in the config files before running any experiment!
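
Before launching an experiment, it can help to confirm that the directory tree above is actually in place. The sketch below lists whichever expected entries are missing; the entry names are copied from the tree, while the function name `missing_entries` and the default root `data/nuscenes` are illustrative assumptions.

```python
from pathlib import Path

# Entries shown in the directory tree above, relative to data/nuscenes.
EXPECTED_ENTRIES = [
    "maps",
    "samples",
    "sweeps",
    "v1.0-test",
    "v1.0-trainval",
    "nuscenes_database",
    "nuscenes_old_infos_train.pkl",
    "nuscenes_old_infos_val.pkl",
    "nuscenes_old_infos_test.pkl",
    "nuscenes_old_coor_dbinfos_train.pkl",
]


def missing_entries(nuscenes_root="data/nuscenes"):
    """Return the expected files and folders that do not exist under `nuscenes_root`."""
    root = Path(nuscenes_root)
    return [name for name in EXPECTED_ENTRIES if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries()
    print("All expected entries found." if not missing else f"Missing: {missing}")
```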

Training

We provide pretrained student and teacher models to reproduce our results on nuScenes. For the training details of the student and teacher models, please refer to BEVFusion.

Note: please change the teacher and student checkpoint path to yours in the config files before running any experiment!

For the LiDAR-camera teacher model, please run:

# SwinT
torchpack dist-run -np 4 python tools/train.py configs/nuscenes/det/centerhead/lssfpn/camera+lidar/swint_v0p075/gatedfuser.yaml

# R50
torchpack dist-run -np 4 python tools/train.py configs/nuscenes/det/centerhead/lssfpn/camera+lidar/resnet50/gatedfuser.yaml

For the camera-radar student model, please run:

# SwinT
torchpack dist-run -np 4 python tools/train.py configs/nuscenes/det/centerhead/lssfpn/camera+radar/swint/default.yaml

# R50
torchpack dist-run -np 4 python tools/train.py configs/nuscenes/det/centerhead/lssfpn/camera+radar/resnet50/default.yaml

Once you have the weights of the teacher and student models ready, you can start training CRKD! For CRKD, please run:

# SwinT
torchpack dist-run -np 4 python tools/train.py configs/nuscenes/distill/feature_response_distill/feat_fused_da_lr_mean_c2c_scale_mask_relation_resp_kd_dynamic_class_loss_80_256.yaml

# R50
torchpack dist-run -np 4 python tools/train.py configs/nuscenes/distill/feature_response_distill/feat_fused_da_lr_mean_c2c_scale_mask_relation_resp_kd_dynamic_class_loss_80_256_resnet.yaml
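
Since the six training commands above differ only in the config path, a small lookup can assemble them. This is a sketch under stated assumptions: the config paths are copied from the commands above, while the keys (`"teacher"`, `"student"`, `"crkd"`, `"swint"`, `"r50"`) and the function `train_command` are hypothetical names introduced here.

```python
import shlex

CONFIG_ROOT = "configs/nuscenes"

# (model, backbone) -> config path, copied from the training commands above.
TRAIN_CONFIGS = {
    ("teacher", "swint"): "det/centerhead/lssfpn/camera+lidar/swint_v0p075/gatedfuser.yaml",
    ("teacher", "r50"): "det/centerhead/lssfpn/camera+lidar/resnet50/gatedfuser.yaml",
    ("student", "swint"): "det/centerhead/lssfpn/camera+radar/swint/default.yaml",
    ("student", "r50"): "det/centerhead/lssfpn/camera+radar/resnet50/default.yaml",
    ("crkd", "swint"): "distill/feature_response_distill/"
    "feat_fused_da_lr_mean_c2c_scale_mask_relation_resp_kd_dynamic_class_loss_80_256.yaml",
    ("crkd", "r50"): "distill/feature_response_distill/"
    "feat_fused_da_lr_mean_c2c_scale_mask_relation_resp_kd_dynamic_class_loss_80_256_resnet.yaml",
}


def train_command(model, backbone, num_procs=4):
    """Build the torchpack training command as an argument list."""
    config = f"{CONFIG_ROOT}/{TRAIN_CONFIGS[(model, backbone)]}"
    return ["torchpack", "dist-run", "-np", str(num_procs), "python", "tools/train.py", config]


if __name__ == "__main__":
    # Print the commands in the order they need to run: teacher, student, then CRKD.
    for model in ("teacher", "student", "crkd"):
        print(shlex.join(train_command(model, "swint")))
```

The returned list can be passed to `subprocess.run(..., check=True)` once the checkpoint paths in the config files have been updated.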

Evaluation

We also provide pretrained models for evaluation. Please refer to the Results Table to download the checkpoints.

To evaluate the LiDAR-camera teacher model, please run:

# SwinT
torchpack dist-run -np 4 python tools/test.py configs/nuscenes/det/centerhead/lssfpn/camera+lidar/swint_v0p075/gatedfuser.yaml pretrained/teacher_swint.pth --eval bbox

# R50
torchpack dist-run -np 4 python tools/test.py configs/nuscenes/det/centerhead/lssfpn/camera+lidar/resnet50/gatedfuser.yaml pretrained/teacher_r50.pth --eval bbox

To evaluate the camera-radar student model, please run:

# SwinT
torchpack dist-run -np 4 python tools/test.py configs/nuscenes/det/centerhead/lssfpn/camera+radar/swint/default.yaml pretrained/student_gated_swint.pth --eval bbox

# R50
torchpack dist-run -np 4 python tools/test.py configs/nuscenes/det/centerhead/lssfpn/camera+radar/resnet50/default.yaml pretrained/student_gated_r50.pth --eval bbox

To evaluate our pretrained CRKD model, please run:

# SwinT
torchpack dist-run -np 4 python tools/test.py configs/nuscenes/distill/feature_response_distill/feat_fused_da_lr_mean_c2c_scale_mask_relation_resp_kd_dynamic_class_loss_80_256.yaml pretrained/distill_swint.pth --eval bbox

# R50
torchpack dist-run -np 4 python tools/test.py configs/nuscenes/distill/feature_response_distill/feat_fused_da_lr_mean_c2c_scale_mask_relation_resp_kd_dynamic_class_loss_80_256_resnet.yaml pretrained/distill_r50.pth --eval bbox

Note: please run tools/test.py separately after training to get the final evaluation metrics.
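
The evaluation commands above expect the checkpoints under `pretrained/`. As a quick sanity check before running them, the sketch below reports which checkpoint files are absent; the file names are copied from the commands, and the function name `missing_checkpoints` is a hypothetical helper.

```python
from pathlib import Path

# Checkpoint file names used in the evaluation commands above.
CHECKPOINTS = [
    "teacher_swint.pth",
    "teacher_r50.pth",
    "student_gated_swint.pth",
    "student_gated_r50.pth",
    "distill_swint.pth",
    "distill_r50.pth",
]


def missing_checkpoints(pretrained_dir="pretrained"):
    """Return the checkpoint file names that are not present in `pretrained_dir`."""
    root = Path(pretrained_dir)
    return [name for name in CHECKPOINTS if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_checkpoints()
    print("All checkpoints found." if not missing else f"Missing checkpoints: {missing}")
```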

Citation

If CRKD is useful or relevant to your research, please kindly recognize our contributions by citing our paper:

@InProceedings{Zhao_2024_CRKD,
    author    = {Zhao, Lingjun and Song, Jingyu and Skinner, Katherine A.},
    title     = {CRKD: Enhanced Camera-Radar Object Detection with Cross-modality Knowledge Distillation},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2024},
    pages     = {15470-15480}
}
