Open Access
Multi‐level feature fusion network for crowd counting
Author(s) - Wang Luyang, Li Yun, Peng Sifan, Tang Xiao, Yin Baoqun
Publication year - 2021
Publication title - IET Computer Vision
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.38
H-Index - 37
eISSN - 1751-9640
pISSN - 1751-9632
DOI - 10.1049/cvi2.12012
Subject(s) - computer science , encoder , artificial intelligence , feature (linguistics) , segmentation , context (archaeology) , convolutional neural network , pattern recognition (psychology) , feature extraction , block (permutation group theory) , channel (broadcasting) , computer vision , paleontology , computer network , philosophy , linguistics , geometry , mathematics , biology , operating system
Crowd counting has become a noteworthy vision task due to the needs of numerous practical applications, but it remains challenging. State‐of‐the‐art methods generally estimate the density map of the crowd image with the high‐level semantic features of various deep convolutional networks. However, the absence of low‐level spatial information may result in counting errors in the local details of the density map. To this end, a novel framework named Multi‐level Feature Fusion Network (MFFN) for single image crowd counting is proposed. The proposed MFFN, which is constructed in an encoder–decoder fashion, incorporates semantic and spatial information for generating high‐resolution density maps of input crowd images. Skip connections are developed between the encoder and the decoder so that low‐level spatial information and high‐level semantic features can be combined by element‐wise addition. In addition, a dense dilated convolution block is placed behind the encoder, extracting multi‐scale context features to guide feature fusion by a channel attention mechanism. The model is trained by multi‐task learning; semantic segmentation supervision is introduced to enhance feature representation. Extensive experiments are conducted on three crowd counting datasets (ShanghaiTech, UCF_CC_50, UCF‐QNRF), and the results show that MFFN outperforms state‐of‐the‐art methods. Ablation studies are also performed to verify the effectiveness of each component of the proposed method.
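
The fusion step described in the abstract (skip connections combined by element-wise addition, guided by channel attention computed from context features) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify layer shapes, which branch is gated, or how the attention weights are computed. The function names (`channel_attention`, `upsample2x`, `fuse`), the single linear layer with a sigmoid, the nearest-neighbour upsampling, and the choice to gate the low-level branch are all assumptions made for illustration.

```python
import numpy as np

def channel_attention(context, w, b):
    """Per-channel weights in (0, 1) from context features of shape (C, H, W).

    Assumption: global average pooling followed by one linear layer and a sigmoid.
    """
    pooled = context.mean(axis=(1, 2))        # global average pooling -> (C,)
    logits = w @ pooled + b                   # hypothetical single linear layer
    return 1.0 / (1.0 + np.exp(-logits))      # sigmoid gate

def upsample2x(x):
    """Nearest-neighbour 2x upsampling of a (C, H, W) array."""
    return x.repeat(2, axis=1).repeat(2, axis=2)

def fuse(low, high, context, w, b):
    """Gate low-level features per channel, then add upsampled high-level features.

    Assumption: the attention weights rescale the low-level (skip) branch.
    """
    gate = channel_attention(context, w, b)[:, None, None]   # (C, 1, 1) for broadcasting
    return gate * low + upsample2x(high)                     # element-wise addition

rng = np.random.default_rng(0)
C = 4
low = rng.standard_normal((C, 8, 8))       # low-level spatial features from the encoder
high = rng.standard_normal((C, 4, 4))      # high-level semantic features in the decoder
context = rng.standard_normal((C, 4, 4))   # stand-in for the dilated-block output
w = rng.standard_normal((C, C))            # illustrative random weights, not trained
b = np.zeros(C)

fused = fuse(low, high, context, w, b)
print(fused.shape)
```

The fused map keeps the spatial resolution of the low-level branch, which is the property the abstract relies on for producing high-resolution density maps.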
