Hand gesture recognition based on attentive feature fusion
Author(s) -
Yu Bin,
Luo Zhiming,
Wu Huangbin,
Li Shaozi
Publication year - 2020
Publication title -
Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.5910
Subject(s) - computer science, embedding, gesture, artificial intelligence, feature (linguistics), convolutional neural network, gesture recognition, optical flow, pattern recognition (psychology), frame (networking), scale (ratio), computer vision, image (mathematics), telecommunications, philosophy, linguistics, physics, quantum mechanics
Summary - Video-based hand gesture recognition plays an important role in human-computer interaction (HCI). Recent advanced methods usually adopt 3D convolutional neural networks to capture information from both the spatial and temporal dimensions. However, these methods suffer from the need for large-scale training data and from high computational complexity. To address this issue, we propose an attentive feature fusion framework for efficient hand gesture recognition. In our model, a shallow two-stream CNN captures low-level features from the original video frames and the corresponding optical flow. We then design an attentive feature fusion module that selectively combines useful information from the two streams via an attention mechanism. Finally, we obtain a compact embedding of a video by concatenating the features of several short segments. To evaluate the effectiveness of the proposed framework, we train and test our method on Jester, a large-scale video-based hand gesture recognition dataset. Experimental results demonstrate that our approach obtains very competitive performance on the Jester dataset, with a classification accuracy of 95.77%.
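The pipeline described in the abstract (per-segment two-stream features, attention-weighted fusion of the streams, then concatenation across segments into a video embedding) can be sketched in a few lines. This is a minimal NumPy illustration, not the authors' implementation: the stream-level attention with a single projection vector `w`, the function names, and the feature dimensions are all hypothetical assumptions made for the sketch.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attentive_fusion(rgb_feat, flow_feat, w):
    """Fuse RGB-stream and optical-flow-stream features of one segment.

    A scalar attention weight per stream is computed from a shared
    projection vector w (hypothetical stand-in for a learned module).
    """
    stacked = np.stack([rgb_feat, flow_feat])   # shape (2, D)
    scores = stacked @ w                        # one score per stream, shape (2,)
    alpha = softmax(scores)                     # attention weights, sum to 1
    return (alpha[:, None] * stacked).sum(0)    # weighted fusion, shape (D,)

def video_embedding(segments, w):
    """Concatenate fused features of several short segments into one vector."""
    fused = [attentive_fusion(rgb, flow, w) for rgb, flow in segments]
    return np.concatenate(fused)
```

Because the attention weights are non-negative and sum to one, each fused segment feature is a convex combination of the two stream features; the final embedding length is simply (number of segments) × (feature dimension).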