IUML: INCEPTION U-NET BASED MULTI-TASK LEARNING FOR DENSITY LEVEL CLASSIFICATION AND CROWD DENSITY ESTIMATION
By Van-Su Huynh, Vu-Hoang Tran, and Chin-Chung Huang
This work was accepted at the SMC 2019 conference.
This project is an implementation of the IUML network for crowd counting. The IUML network can handle several types of scale problems, caused by depth variation, height variation, density-level variation, and differences in image resolution.
We have tested the implementation on Windows with an Nvidia 1080 Ti GPU, CUDA 8, and cuDNN v5. Other versions should also work. Caffe must be installed beforehand.
The ShanghaiTech dataset (1) can be downloaded here. The UCF_CC_50 dataset (2) can be downloaded here.
After downloading the datasets, use the code in data_preparation to create the training patches.
For each original image, we randomly generate 30 patches.
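The random patch sampling can be sketched as follows. This is a minimal illustration and not the code shipped in data_preparation: the function names, the patch size, and the seed are all hypothetical, since the README only states that 30 patches are drawn per image.

```python
import random


def random_patch_boxes(img_w, img_h, patch_w, patch_h, n_patches=30, seed=None):
    """Return n_patches crop boxes (x, y, w, h) that lie fully inside the image.

    The patch size is an assumption; the README does not specify it.
    """
    if patch_w > img_w or patch_h > img_h:
        raise ValueError("patch is larger than the image")
    rng = random.Random(seed)
    boxes = []
    for _ in range(n_patches):
        # randint is inclusive on both ends, so the box never leaves the image.
        x = rng.randint(0, img_w - patch_w)
        y = rng.randint(0, img_h - patch_h)
        boxes.append((x, y, patch_w, patch_h))
    return boxes


def heads_in_box(points, box):
    """Keep the head annotations inside a box, shifted to patch coordinates."""
    x, y, w, h = box
    return [(px - x, py - y) for px, py in points
            if x <= px < x + w and y <= py < y + h]


if __name__ == "__main__":
    # Hypothetical 1024x768 image with quarter-size patches.
    boxes = random_patch_boxes(1024, 768, 256, 192, n_patches=30, seed=0)
    print(len(boxes), boxes[0])
```

Each box would then be used to crop both the image and its head annotations, so that every patch keeps a consistent ground truth.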
To generate the ground-truth density maps, we apply the geometry-adaptive kernel of (1), which yields a smaller kernel for smaller objects and a larger kernel for larger objects.
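A geometry-adaptive kernel in the spirit of (1) can be sketched in plain Python as below. Each head gets a Gaussian whose standard deviation is proportional to the mean distance to its nearest neighbours, so heads in dense regions receive small kernels. This is an illustrative sketch, not the repository's implementation: the function names, the neighbour count k = 3, the fallback sigma, the 3-sigma truncation, and the per-head normalisation are assumptions; beta = 0.3 follows the value reported in (1).

```python
import math


def adaptive_sigmas(points, k=3, beta=0.3, fallback_sigma=4.0):
    """Per-head sigma = beta * mean distance to the k nearest other heads."""
    sigmas = []
    for i, (xi, yi) in enumerate(points):
        dists = sorted(math.hypot(xi - xj, yi - yj)
                       for j, (xj, yj) in enumerate(points) if j != i)
        nearest = dists[:k]
        if nearest:
            sigmas.append(beta * sum(nearest) / len(nearest))
        else:
            # A single annotated head has no neighbours (assumed fallback).
            sigmas.append(fallback_sigma)
    return sigmas


def density_map(points, width, height, k=3, beta=0.3):
    """Sum one normalised Gaussian per head; the map integrates to the count."""
    dmap = [[0.0] * width for _ in range(height)]
    for (cx, cy), sigma in zip(points, adaptive_sigmas(points, k, beta)):
        sigma = max(sigma, 1.0)  # avoid degenerate kernels for duplicate points
        radius = int(math.ceil(3 * sigma))
        x0, x1 = max(0, int(round(cx)) - radius), min(width - 1, int(round(cx)) + radius)
        y0, y1 = max(0, int(round(cy)) - radius), min(height - 1, int(round(cy)) + radius)
        weights = {}
        for y in range(y0, y1 + 1):
            for x in range(x0, x1 + 1):
                d2 = (x - cx) ** 2 + (y - cy) ** 2
                weights[(x, y)] = math.exp(-d2 / (2.0 * sigma ** 2))
        total = sum(weights.values())
        if total > 0:
            # Normalise inside the image so each head contributes exactly 1.
            for (x, y), w in weights.items():
                dmap[y][x] += w / total
    return dmap


if __name__ == "__main__":
    heads = [(10, 10), (20, 10), (10, 30)]
    dmap = density_map(heads, 64, 64)
    print(adaptive_sigmas(heads), sum(map(sum, dmap)))
```

Normalising each kernel inside the image bounds is one possible design choice; it keeps the sum of the density map equal to the number of annotated heads even when a head lies near the border.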
The hyper-parameters are specified in file_solver.prototxt
The training model is defined in file_train.prototxt
The testing model is defined in deploy.prototxt
(1) Y. Zhang, D. Zhou, S. Chen, S. Gao, and Y. Ma, “Single-image crowd counting via multi-column convolutional neural network,” CVPR, 2016.
(2) H. Idrees, I. Saleemi, C. Seibert, and M. Shah, “Multi-source multi-scale counting in extremely dense crowd images,” CVPR, 2013, pp. 2547-2554.