
Dual attention module and multi‐label based fully convolutional network for crowd counting
Author(s) -
Wang Suyu,
Yang Bin,
Liu Bo,
Zheng Guanghui
Publication year - 2020
Publication title -
IET Computer Vision
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.38
H-Index - 37
eISSN - 1751-9640
pISSN - 1751-9632
DOI - 10.1049/iet-cvi.2019.0674
Subject(s) - computer science , artificial intelligence , convolutional neural network , pattern recognition (psychology) , cross entropy , focus (optics) , dual (grammatical number) , context (archaeology) , machine learning , art , paleontology , physics , literature , optics , biology
High‐density crowd counting in natural scenes is a challenging research subject in computer vision. Although algorithms based on convolutional neural networks have achieved significantly better results than traditional algorithms, most of them focus on the local features of images and have difficulty capturing rich global contextual dependencies. To solve this problem, a dual attention module and a multi‐label based fully convolutional network are proposed in this study. The authors improve the algorithm from several perspectives. Firstly, by introducing the dual attention module, global context and long‐range dependencies are adaptively integrated in both the spatial and channel dimensions, which improves the expressive ability of the network. Secondly, a multi‐label mechanism is designed to reduce the prediction error: the crowd‐counting task is extended with a foreground and background segmentation task that assists the regression of the density map. Furthermore, in addition to the traditional Euclidean distance loss and cross‐entropy loss, the structural similarity index is introduced to further improve the training of the model. Test results on the UCF_CC_50, ShanghaiTech, and UCF‐QNRF datasets indicate that the proposed method outperforms current mainstream algorithms.
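The three loss terms named in the abstract (Euclidean distance, cross‐entropy, and structural similarity) could be combined roughly as in the following minimal NumPy sketch. It is an illustration under stated assumptions, not the authors' implementation: the weights `w_seg` and `w_ssim`, the function names, the rule that any pixel with positive ground‐truth density counts as foreground, and the single‐window (global) form of SSIM are all hypothetical choices, since the abstract does not specify them.

```python
import numpy as np


def global_ssim(x, y, c1=1e-4, c2=9e-4):
    """SSIM computed over the whole map as one window (a simplification;
    SSIM is usually averaged over local sliding windows)."""
    mu_x, mu_y = x.mean(), y.mean()
    var_x, var_y = x.var(), y.var()
    cov = ((x - mu_x) * (y - mu_y)).mean()
    num = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    den = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return float(num / den)


def binary_cross_entropy(prob, target, eps=1e-7):
    """Mean pixel-wise cross-entropy between foreground probabilities and a 0/1 mask."""
    prob = np.clip(prob, eps, 1 - eps)  # avoid log(0)
    return float(-(target * np.log(prob) + (1 - target) * np.log(1 - prob)).mean())


def combined_loss(pred_density, gt_density, pred_fg_prob, w_seg=0.1, w_ssim=0.1):
    """Euclidean + weighted cross-entropy + weighted (1 - SSIM).

    Assumption: the foreground label is derived from the ground-truth
    density map (positive density means foreground). The weights are
    placeholders, not values from the paper.
    """
    gt_mask = (gt_density > 0).astype(np.float64)
    l2 = float(((pred_density - gt_density) ** 2).mean())
    ce = binary_cross_entropy(pred_fg_prob, gt_mask)
    ssim_term = 1.0 - global_ssim(pred_density, gt_density)
    return l2 + w_seg * ce + w_ssim * ssim_term
```

With a prediction that matches the ground truth exactly, all three terms are close to zero; an all‐zero density prediction with an inverted foreground mask yields a strictly larger value, which is the behaviour a training objective of this kind is expected to show.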