Open Access
Driver activity recognition using spatial‐temporal graph convolutional LSTM networks with attention mechanism
Author(s) - Pan Chaopeng, Cao Haotian, Zhang Weiwei, Song Xiaolin, Li Mingjun
Publication year - 2021
Publication title - IET Intelligent Transport Systems
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.579
H-Index - 45
eISSN - 1751-9578
pISSN - 1751-956X
DOI - 10.1049/itr2.12025
Subject(s) - computer science, discriminative model, artificial intelligence, graph, feature extraction, feature (linguistics), frame (networking), deep learning, pattern recognition (psychology), computer vision, telecommunications, philosophy, linguistics, theoretical computer science
The activities a driver engages in while driving play a vital role in driving safety and can lead to negative outcomes. To reduce traffic accidents and ensure driving safety, a real-time driver activity recognition architecture is proposed in this study. Specifically, eight kinds of common driving-related activities are identified: normal driving, left checking, right checking, texting, answering the phone, using media, drinking, and picking up objects. Raw experimental videos are collected with onboard monocular cameras and used to extract the driver's upper-body skeleton information. Graph convolutional networks (GCN) are then constructed to reason about spatial structure features within a single frame, followed by long short-term memory (LSTM) networks that learn temporal motion features across the sequence. Moreover, an attention mechanism is utilised to emphasise keyframes and select discriminative sequential information. Finally, a large-scale driver activity dataset, consisting of both naturalistic and simulated driving data, is collected for model training and evaluation. Experimental results show that the overall recall across the eight driving-related activities reaches up to 88.8% and that recognition runs at up to 24 fps, which would satisfy the real-time requirements of engineering applications.
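
To make the pipeline described in the abstract more concrete, the following is a minimal sketch in plain Python of two of its ingredients: one graph-convolution step over a skeleton graph in a single frame, and softmax attention that weights frames in a sequence. It is not the authors' implementation. The 5-joint skeleton layout, the symmetric adjacency normalisation, the mean pooling over joints, the weight matrix, the attention query and the toy input are all assumptions chosen for illustration, and the LSTM stage is omitted for brevity, so attention is applied directly to per-frame embeddings.

```python
import math

# Hypothetical 5-joint upper-body skeleton (not the paper's layout):
# 0 head, 1 neck, 2 left shoulder, 3 right shoulder, 4 left elbow.
NUM_JOINTS = 5
EDGES = [(0, 1), (1, 2), (1, 3), (2, 4)]


def normalized_adjacency(num_nodes, edges):
    """Return D^{-1/2} (A + I) D^{-1/2} as nested lists (assumed normalisation)."""
    a = [[1.0 if i == j else 0.0 for j in range(num_nodes)] for i in range(num_nodes)]
    for i, j in edges:
        a[i][j] = a[j][i] = 1.0
    deg = [sum(row) for row in a]
    return [[a[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(num_nodes)]
            for i in range(num_nodes)]


def matmul(x, y):
    """Plain nested-list matrix product."""
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]


def graph_conv(a_hat, feats, weight):
    """One graph-convolution step: ReLU(A_hat @ X @ W)."""
    out = matmul(matmul(a_hat, feats), weight)
    return [[max(0.0, v) for v in row] for row in out]


def frame_embedding(node_feats):
    """Mean-pool joint features into a single vector for the frame."""
    n = len(node_feats)
    return [sum(col) / n for col in zip(*node_feats)]


def attention_pool(frame_vecs, query):
    """Softmax attention over frames; returns (weights, weighted sum of frames)."""
    scores = [sum(q * v for q, v in zip(query, vec)) for vec in frame_vecs]
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(frame_vecs[0])
    context = [sum(w * vec[d] for w, vec in zip(weights, frame_vecs)) for d in range(dim)]
    return weights, context


# Toy input: 4 frames, 5 joints, 2-D coordinates per joint (made-up values).
a_hat = normalized_adjacency(NUM_JOINTS, EDGES)
weight = [[0.5, -0.2, 0.1], [0.3, 0.4, -0.1]]  # arbitrary 2 -> 3 projection
frames = [[[0.1 * t + 0.05 * j, 0.2 * j] for j in range(NUM_JOINTS)] for t in range(4)]
embeddings = [frame_embedding(graph_conv(a_hat, f, weight)) for f in frames]
weights, context = attention_pool(embeddings, query=[1.0, 0.0, 0.5])
```

In a trained model the projection weights and the attention query would be learned, and the sequence-level vector would feed a classifier over the eight activity classes; here they are fixed constants so the data flow can be followed by hand.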
