Open Access
Multi‐scale supervised network for crowd counting
Author(s) -
Wang Yongjie,
Zhang Wei,
Huang Dongxiao,
Liu Yanyan,
Zhu Jianghua
Publication year - 2020
Publication title -
IET Image Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.401
H-Index - 45
eISSN - 1751-9667
pISSN - 1751-9659
DOI - 10.1049/iet-ipr.2020.0897
Subject(s) - upsampling , computer science , artificial intelligence , focus (optics) , feature (linguistics) , convolution (computer science) , scale (ratio) , dilation (metric space) , image (mathematics) , pattern recognition (psychology) , computer vision , mathematics , artificial neural network , linguistics , philosophy , physics , quantum mechanics , combinatorics , optics
Crowd counting is attracting growing attention because it can help prevent safety problems. However, scale variations and background noise in the image, such as buildings and trees, make it hard to obtain an accurate count. To address these problems, this work introduces a new multi‐scale supervised network. The proposed model uses part of the VGG16 model as the backbone to extract features. During training, a multi‐scale dilated convolution module is added at the end of each stage of the backbone network to generate attention maps at different resolutions, which help the model focus on head regions in the feature map. The dilated convolutions use three dilation rates to fit the different head sizes in the image. Finally, to obtain a high‐quality, high‐resolution density map, the authors apply an upsampling operation that restores the density map to a quarter of the original image size. Extensive experiments on four datasets show that the proposed network substantially improves counting accuracy over many existing methods.
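To make the role of the three dilation rates more concrete, the following is a minimal sketch in plain Python, not the authors' implementation. It shows how the same 3x3 kernel covers a wider image region as the dilation rate grows, which is the property that lets one module respond to heads of different sizes. The rates (1, 2, 3), the 3x3 kernel size, and the function names are assumptions chosen for illustration; the abstract does not state them.

```python
def effective_kernel_size(kernel_size: int, dilation: int) -> int:
    """Span, in pixels along one axis, covered by a dilated kernel."""
    return kernel_size + (kernel_size - 1) * (dilation - 1)


def dilated_conv2d(image, kernel, dilation):
    """'Valid' 2D cross-correlation of a nested-list image with a dilated square kernel."""
    k = len(kernel)
    span = effective_kernel_size(k, dilation)
    out_h = len(image) - span + 1
    out_w = len(image[0]) - span + 1
    output = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            acc = 0.0
            # Kernel taps are spaced `dilation` pixels apart in the input.
            for i in range(k):
                for j in range(k):
                    acc += kernel[i][j] * image[y + i * dilation][x + j * dilation]
            row.append(acc)
        output.append(row)
    return output


# Hypothetical dilation rates: the abstract says three are used but does not list them.
ASSUMED_DILATIONS = (1, 2, 3)

if __name__ == "__main__":
    for d in ASSUMED_DILATIONS:
        size = effective_kernel_size(3, d)
        print(f"dilation={d}: 3x3 kernel spans {size}x{size} pixels")
```

In this sketch each branch has the same number of weights (nine) but sees a different spatial extent, so small dilation suits small, distant heads and large dilation suits large, nearby ones. How the paper combines the branches into an attention map is not described in the abstract and is left out here.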
