RGB-D Human Action Recognition of Deep Feature Enhancement and Fusion Using Two-Stream ConvNet
Author(s) -
Yun Liu,
Ruidi Ma,
Hui Li,
Chuanxu Wang,
Ye Tao
Publication year - 2021
Publication title -
Journal of Sensors
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.399
H-Index - 43
eISSN - 1687-7268
pISSN - 1687-725X
DOI - 10.1155/2021/8864870
Subject(s) - rgb color model , artificial intelligence , fusion , action recognition , pattern recognition (psychology) , feature (linguistics) , computer science , action (physics) , deep learning , computer vision , physics , philosophy , linguistics , class (philosophy) , quantum mechanics
Action recognition is an important research direction in computer vision. Recognition based on video images is easily affected by factors such as background and lighting, whereas depth video images can reduce this interference and improve recognition accuracy. This paper therefore makes full use of video and depth skeleton data and proposes a two-stream network for RGB-D action recognition (SV-GCN), in which each stream processes a different type of data. The skeleton stream (S-Stream), Nonlocal-stgcn, adds a nonlocal operation to capture dependencies between a wider range of joints and so provides richer skeleton point features for the model. The video stream (V-Stream), Dilated-slowfastnet, replaces the traditional random sampling layer with dilated convolutional layers, which make better use of the depth features. Finally, the information from the two streams is fused to perform action recognition. Experimental results on the NTU-RGB+D dataset show that the proposed method significantly improves recognition accuracy and outperforms st-gcn and Slowfastnet under both the cross-subject (CS) and cross-view (CV) settings.
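The abstract does not say how the two streams are fused, so the following is only a minimal sketch of one common option, score-level (late) fusion, and not the authors' implementation. It assumes each stream outputs one raw score per action class; the function names `softmax` and `fuse_two_streams` and the parameter `skeleton_weight` are hypothetical and chosen for illustration.

```python
import math


def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_two_streams(skeleton_scores, video_scores, skeleton_weight=0.5):
    """Weighted average of per-class probabilities from two streams.

    Assumption: both streams score the same classes in the same order.
    skeleton_weight is a hypothetical tuning parameter in [0, 1]; the
    video stream receives the remaining weight.
    Returns (fused_probabilities, predicted_class_index).
    """
    if len(skeleton_scores) != len(video_scores):
        raise ValueError("both streams must score the same number of classes")
    if not 0.0 <= skeleton_weight <= 1.0:
        raise ValueError("skeleton_weight must lie in [0, 1]")

    p_skeleton = softmax(skeleton_scores)
    p_video = softmax(video_scores)
    fused = [
        skeleton_weight * ps + (1.0 - skeleton_weight) * pv
        for ps, pv in zip(p_skeleton, p_video)
    ]
    predicted = max(range(len(fused)), key=fused.__getitem__)
    return fused, predicted


if __name__ == "__main__":
    # Made-up scores for three action classes, for illustration only.
    fused, predicted = fuse_two_streams([2.0, 1.0, 0.1], [0.5, 2.5, 0.3])
    print(fused, predicted)
```

Averaging probabilities rather than raw scores keeps the two streams on a comparable scale; in practice the weight would be selected on a validation set, and feature-level fusion is an equally plausible reading of the abstract.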