
Pose detection in complex classroom environment based on improved Faster R‐CNN
Author(s) - Tang Lin, Gao Chenqiang, Chen Xu, Zhao Yue
Publication year - 2019
Publication title - IET Image Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.401
H-Index - 45
eISSN - 1751-9667
pISSN - 1751-9659
DOI - 10.1049/iet-ipr.2018.5905
Subject(s) - locality, computer science, pooling, artificial intelligence, pattern recognition (psychology), feature (linguistics), feature extraction, convolutional neural network, feature vector, object detection, feature learning, computer vision, philosophy, linguistics
Pose detection of small targets under poor imaging conditions, such as heavy occlusion and low resolution, remains an open and challenging task in computer vision. For instance, detecting students' poses in classrooms, where some poses are indistinguishable even to the human eye, remains difficult. Motivated by the success of convolutional feature merging and locality preserving, the authors propose a pose detection framework that combines merged region of interest (ROI) pooling with locality preserving learning. Unlike typical object detection algorithms, which take only top‐level convolutional features as input, their method uses a merged ROI pooling structure to merge semantic features and high‐resolution features from the last two levels of convolutional feature maps, making the merged feature more expressive than a single‐level feature. In addition, locality preserving learning is applied in the last fully‐connected layer. It forces features belonging to the same class to lie closer together in the feature space, which gives the model stronger classification ability. Experimental results show that the proposed method outperforms state‐of‐the‐art methods.
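The two ideas in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not give the exact merging rule or loss, so the sketch assumes that the two pooled features are L2-normalized and concatenated along the channel axis, that the shallower map has twice the resolution of the deeper one, and that the locality term is the mean squared distance between same-class feature vectors. All function names (`roi_max_pool`, `merged_roi_pool`, `locality_preserving_loss`) are hypothetical.

```python
import numpy as np


def roi_max_pool(feature_map, roi, out_size=2):
    """Max-pool one ROI of a (C, H, W) map to (C, out_size, out_size).

    roi = (x0, y0, x1, y1) in feature-map coordinates, end-exclusive.
    """
    x0, y0, x1, y1 = roi
    region = feature_map[:, y0:y1, x0:x1]
    c, h, w = region.shape
    # Bin edges that split the region into out_size x out_size cells.
    ys = np.linspace(0, h, out_size + 1).astype(int)
    xs = np.linspace(0, w, out_size + 1).astype(int)
    out = np.empty((c, out_size, out_size), dtype=feature_map.dtype)
    for i in range(out_size):
        for j in range(out_size):
            # Each cell covers at least one pixel, even for tiny ROIs.
            cell = region[:, ys[i]:max(ys[i + 1], ys[i] + 1),
                          xs[j]:max(xs[j + 1], xs[j] + 1)]
            out[:, i, j] = cell.max(axis=(1, 2))
    return out


def merged_roi_pool(deep_map, shallow_map, roi_deep, scale=2, out_size=2):
    """Pool the same ROI from two feature levels and merge the results.

    Assumption: merging = L2-normalize each pooled feature, then
    concatenate along the channel axis. `scale` is the assumed
    resolution ratio between the shallow and the deep map.
    """
    roi_shallow = tuple(v * scale for v in roi_deep)
    pooled = []
    for fmap, roi in ((deep_map, roi_deep), (shallow_map, roi_shallow)):
        p = roi_max_pool(fmap, roi, out_size).astype(float)
        pooled.append(p / (np.linalg.norm(p) + 1e-12))
    return np.concatenate(pooled, axis=0)


def locality_preserving_loss(features, labels):
    """Mean squared distance over all pairs of same-class features.

    Assumption: a simple stand-in for a locality preserving term;
    minimizing it pulls same-class features together.
    """
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    diffs = features[:, None, :] - features[None, :, :]
    sq_dist = (diffs ** 2).sum(axis=-1)
    same = (labels[:, None] == labels[None, :]) & ~np.eye(len(labels), dtype=bool)
    return float(sq_dist[same].mean()) if same.any() else 0.0
```

In a training setup, a term like `locality_preserving_loss` would typically be added to the usual classification and box-regression losses with a weighting factor. That weighting, like the rest of this sketch, is an assumption and is not taken from the abstract.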