
Multimodal deep learning model for human handover classification
Author(s) - Islam A Monir, Mohamed Waleed Fakhr, Nashwa El-Bendary
Publication year - 2022
Publication title - Bulletin of Electrical Engineering and Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.251
H-Index - 12
ISSN - 2302-9285
DOI - 10.11591/eei.v11i2.3690
Subject(s) - handover, computer science, artificial intelligence, deep learning, feature (linguistics), machine learning, robot, feature selection, task (project management), object detection, pattern recognition (psychology), engineering, telecommunications, philosophy, linguistics, systems engineering
Giving and receiving objects between humans and robots is a critical task that collaborative robots must be able to perform. To achieve this, robots must be able to classify different types of human handover motions. Previous works did not focus on classifying the motion type from both giver and receiver perspectives; instead, they focused solely on object grasping, handover detection, and handover classification from one side only (giver or receiver). This paper discusses the design and implementation of different deep learning architectures with a long short-term memory (LSTM) network, together with different feature selection techniques, for human handover classification from both giver and receiver perspectives. Classification performance using unimodal and multimodal deep learning models is investigated. The data used for evaluation is a publicly available dataset with four modalities: motion tracking sensor readings, Kinect readings for 15 joint positions, 6-axis inertial sensor readings, and video recordings. Multimodality gave a substantial boost in classification performance, achieving 96% accuracy with the feature-selection-based deep learning architecture.
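To illustrate the kind of multimodal LSTM architecture the abstract describes, the sketch below shows one common way to combine several sensor streams: one LSTM branch per modality, late fusion by concatenation, and a classification head. The input dimensions, hidden size, number of handover classes, and the fusion strategy are illustrative assumptions, not details taken from the paper.

```python
import torch
import torch.nn as nn

class MultimodalHandoverLSTM(nn.Module):
    """Minimal sketch of a multimodal LSTM classifier (assumed structure,
    not the authors' exact architecture)."""

    def __init__(self, modality_dims=(9, 45, 6), hidden=64, num_classes=4):
        super().__init__()
        # One LSTM branch per modality (e.g. motion tracking, Kinect joints, IMU).
        self.branches = nn.ModuleList(
            nn.LSTM(dim, hidden, batch_first=True) for dim in modality_dims
        )
        # Fuse branch outputs by concatenation, then classify.
        self.classifier = nn.Sequential(
            nn.Linear(hidden * len(modality_dims), hidden),
            nn.ReLU(),
            nn.Linear(hidden, num_classes),
        )

    def forward(self, streams):
        # streams: list of tensors, each shaped (batch, time, features) for one modality.
        feats = []
        for branch, x in zip(self.branches, streams):
            _, (h_n, _) = branch(x)      # final hidden state summarizes the sequence
            feats.append(h_n[-1])        # (batch, hidden)
        fused = torch.cat(feats, dim=1)  # late fusion by concatenation
        return self.classifier(fused)


# Example: a batch of 8 sequences with 100 time steps per modality (synthetic data).
model = MultimodalHandoverLSTM()
streams = [torch.randn(8, 100, d) for d in (9, 45, 6)]
logits = model(streams)                  # (8, num_classes)
```

A unimodal baseline corresponds to keeping a single branch; feature selection, as mentioned in the abstract, would reduce each modality's input dimension before it reaches its LSTM branch.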