ResNeSt: Split-Attention Networks

May 2020

tl;dr: A new drop-in replacement for ResNet for object detection and segmentation tasks.

Overall impression

It is almost a combination of ResNeXt and SKNet, with an implementation improvement (switching from a cardinality-major to a radix-major layout).

I do feel that the paper uses too many tricks (MixUp, AutoAugment, distributed training, etc.) and is too similar to SKNet, especially since the selected hyperparameters (K=1, R=2) reduce this work to something very close to SKNet. Engineering contribution > innovation.

Key ideas

The cardinality concept is the same as in ResNeXt.
The split-attention module is very similar to SKNet's, except that all branches use the same kernel size.
The change from a cardinality-major to a radix-major layout was made for better efficiency (how much?).
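
To make the two ideas above concrete, here is a rough pure-Python sketch of the group layouts and of split attention for a single cardinal group. It is not the authors' implementation: `group_index`, `r_softmax`, `split_attention` and `zero_logits` are names made up for this note, and `logit_fn` is a hypothetical stand-in for the small two-layer dense block that produces the attention logits in the real module.

```python
import math

def group_index(k, r, K, R, layout):
    """Position of the feature-map group with cardinal index k and radix index r."""
    if layout == "cardinality-major":
        return k * R + r  # the R splits of one cardinal group sit next to each other
    if layout == "radix-major":
        return r * K + k  # groups sharing a radix index sit next to each other
    raise ValueError(layout)

def r_softmax(logits):
    """Softmax across the R splits of one channel (a sigmoid when R == 1)."""
    if len(logits) == 1:
        return [1.0 / (1.0 + math.exp(-logits[0]))]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def split_attention(splits, logit_fn):
    """splits[r][c] is the list of spatial values of channel c in split r.

    logit_fn (hypothetical) maps the pooled C-vector to R x C attention logits.
    """
    R, C = len(splits), len(splits[0])
    # Fuse the splits by element-wise sum, then global-average-pool each channel.
    pooled = []
    for c in range(C):
        n = len(splits[0][c])
        fused = [sum(splits[r][c][i] for r in range(R)) for i in range(n)]
        pooled.append(sum(fused) / n)
    logits = logit_fn(pooled)
    # Per channel, weight the R splits by the softmax over their logits.
    out = []
    for c in range(C):
        w = r_softmax([logits[r][c] for r in range(R)])
        n = len(splits[0][c])
        out.append([sum(w[r] * splits[r][c][i] for r in range(R)) for i in range(n)])
    return out

def zero_logits(pooled):
    # Toy stand-in: equal logits for R = 2, so both splits get weight 0.5.
    return [[0.0 for _ in pooled] for _ in range(2)]

splits = [[[1.0, 3.0]], [[3.0, 5.0]]]  # R = 2 splits, C = 1 channel, 2 positions
print(split_attention(splits, zero_logits))  # -> [[2.0, 4.0]]
```

In the radix-major layout all groups belonging to split r form one contiguous block of K groups, so the R splits can be obtained by chunking the channel axis into R equal parts.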

Technical details

The final selected hyperparameters are K=1 (cardinality) and R=2 (radix), which makes the block very similar to SKNet.
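
One way to see the similarity: with R=2, the softmax across the two splits is just a sigmoid of the logit difference, so each channel gets weights a and 1-a, i.e. a soft choice between two branches. A quick numeric check, where the logits `z1` and `z2` are arbitrary values picked for illustration:

```python
import math

def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

z1, z2 = 0.7, -0.4  # arbitrary logits for the two splits of one channel
a1, a2 = softmax([z1, z2])

# The weight of the first split equals sigmoid(z1 - z2); the two weights sum to 1.
assert abs(a1 - sigmoid(z1 - z2)) < 1e-12
assert abs(a1 + a2 - 1.0) < 1e-12
```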

Notes

An analysis of the radix-major layout on Zhihu (知乎)
This work shows that, with enough training tricks, a ResNet-style backbone can still reach SOTA. That is preferable to works that reinvent the wheel, such as EfficientDet.
- Depthwise convolutions (as in MobileNet) only bring speedups on CPU, so they are better suited to edge devices.
