Double Mask R-CNN for Pedestrian Detection in a Crowd
Author(s) - Congqiang Liu, Haosen Wang, Chunjian Liu
Publication year - 2022
Publication title - Mobile Information Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.346
H-Index - 34
eISSN - 1875-905X
pISSN - 1574-017X
DOI - 10.1155/2022/4012252
Subject(s) - computer science, pedestrian detection, artificial intelligence, visibility, feature (linguistics), pedestrian, pyramid (geometry), image (mathematics), computer vision, feature extraction, pattern recognition (psychology), segmentation, mathematics, linguistics, philosophy, physics, geometry, optics, transport engineering, engineering
To address the difficulty of feature extraction and the limitations of non-maximum suppression (NMS) in crowded pedestrian detection, this article proposes Double Mask R-CNN, a detection network based on Mask R-CNN with a Feature Pyramid Network (FPN). The algorithm makes two improvements. First, a semantic segmentation branch is added to the FPN to strengthen feature extraction for crowded pedestrians. Second, a rule estimates the pedestrian visibility of a detected image from human keypoint information and covers a binary mask on any image whose pedestrian visibility is below a certain threshold; the masked image is then fed back into the network to locate occluded pedestrians. Experimental results on the CrowdHuman dataset show that the log-average miss rate (MR) of Double Mask R-CNN is 13.12% lower than the best results of other mainstream networks. Similar improvements are achieved on the WiderPerson dataset.
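The abstract does not give the exact form of the visibility rule, so the following is only a minimal sketch of one plausible reading. It assumes that visibility is the fraction of a detection's keypoints whose confidence reaches a cutoff, and that masking means filling the detection's bounding box with a constant value. The names `visibility`, `cover_box`, and `mask_low_visibility`, the thresholds `conf_thresh` and `vis_thresh`, and the box-shaped mask are all hypothetical and are not taken from the article.

```python
from typing import List, Sequence, Tuple

Keypoint = Tuple[float, float, float]  # (x, y, confidence); assumed layout
Box = Tuple[int, int, int, int]        # (x1, y1, x2, y2), x2 and y2 exclusive
Image = List[List[int]]                # grayscale image as rows of pixel values


def visibility(keypoints: Sequence[Keypoint], conf_thresh: float = 0.5) -> float:
    """Assumed rule: fraction of keypoints with confidence >= conf_thresh."""
    if not keypoints:
        return 0.0
    visible = sum(1 for _, _, conf in keypoints if conf >= conf_thresh)
    return visible / len(keypoints)


def cover_box(image: Image, box: Box, fill: int = 0) -> Image:
    """Return a copy of the image with all pixels inside the box set to fill."""
    height = len(image)
    width = len(image[0]) if height else 0
    x1, y1, x2, y2 = box
    # Clamp the box to the image bounds before filling.
    x1, x2 = max(0, x1), min(width, x2)
    y1, y2 = max(0, y1), min(height, y2)
    masked = [row[:] for row in image]
    for y in range(y1, y2):
        for x in range(x1, x2):
            masked[y][x] = fill
    return masked


def mask_low_visibility(
    image: Image,
    detections: Sequence[Tuple[Box, Sequence[Keypoint]]],
    vis_thresh: float = 0.6,
    conf_thresh: float = 0.5,
) -> Image:
    """Cover every detection whose estimated visibility is below vis_thresh."""
    masked = [row[:] for row in image]
    for box, keypoints in detections:
        if visibility(keypoints, conf_thresh) < vis_thresh:
            masked = cover_box(masked, box)
    return masked


# Toy example: a 4x4 image with two detections of four keypoints each.
image = [[1] * 4 for _ in range(4)]
detections = [
    ((0, 0, 2, 2), [(0, 0, 0.9), (1, 0, 0.1), (0, 1, 0.2), (1, 1, 0.1)]),
    ((2, 2, 4, 4), [(2, 2, 0.9), (3, 2, 0.9), (2, 3, 0.8), (3, 3, 0.2)]),
]
masked_image = mask_low_visibility(image, detections)
```

In this toy case the first detection has one of four keypoints above the cutoff and is covered, while the second has three of four and is left untouched. In a pipeline like the one the abstract describes, the masked image would then be passed through the detector a second time.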