
Extensible Hierarchical Method of Detecting Interactive Actions for Video Understanding
Author(s) - Moon Jinyoung, Jin Junho, Kwon Yongjin, Kang Kyuchang, Park Jongyoul, Park Kyoung
Publication year - 2017
Publication title - ETRI Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.295
H-Index - 46
eISSN - 2233-7326
pISSN - 1225-6463
DOI - 10.4218/etrij.17.0116.0054
Subject(s) - computer science, artificial intelligence, computer vision, action recognition, action detection, pattern recognition, ontology, spatial relation, extensibility, human-computer interaction, data mining
For video understanding, namely analyzing who did what in a video, actions along with objects are primary elements. Most studies on actions have handled recognition problems for well-trimmed videos and focused on enhancing classification performance. However, action detection, which includes localization as well as recognition, is required because actions generally intersect in time and space. In addition, most studies have not considered extensibility for a newly added action that has not been previously trained. Therefore, this paper proposes an extensible hierarchical method for detecting generic actions, which combine object movements and spatial relations between two objects, and inherited actions, which are determined by the related objects through an ontology- and rule-based methodology. The hierarchical design enables the method to detect any interactive action based on the spatial relations between two objects. Using object information, the method achieves an F-measure of 90.27%. Moreover, this paper describes the extensibility of the method for a new action contained in a video from a domain different from that of the dataset used.
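To make the two-level design concrete, the following is a minimal sketch of the idea described in the abstract, not the authors' actual implementation: a generic interactive action is first derived from the spatial relation between two tracked objects, and is then specialized into an inherited action through an ontology-style lookup keyed on the object classes. All class names, rules, thresholds, and ontology entries below are illustrative assumptions.

# Minimal sketch of hierarchical interactive-action detection (assumed, not
# the paper's implementation): generic action from spatial relations, then
# ontology-style specialization into an inherited action.

from dataclasses import dataclass

@dataclass
class Track:
    label: str   # object class, e.g., "person", "cup" (hypothetical labels)
    boxes: list  # per-frame bounding boxes as (x1, y1, x2, y2)

def center(box):
    x1, y1, x2, y2 = box
    return ((x1 + x2) / 2.0, (y1 + y2) / 2.0)

def distance(a, b):
    (ax, ay), (bx, by) = center(a), center(b)
    return ((ax - bx) ** 2 + (ay - by) ** 2) ** 0.5

def overlaps(a, b):
    # True when two boxes intersect, i.e., the objects are in contact.
    return not (a[2] < b[0] or b[2] < a[0] or a[3] < b[1] or b[3] < a[1])

def generic_action(t1, t2):
    # Assumed rules: touching boxes -> "contact"; shrinking inter-object
    # distance over the clip -> "approach"; otherwise -> "leave".
    if any(overlaps(a, b) for a, b in zip(t1.boxes, t2.boxes)):
        return "contact"
    if distance(t1.boxes[-1], t2.boxes[-1]) < distance(t1.boxes[0], t2.boxes[0]):
        return "approach"
    return "leave"

# Ontology-style table: (generic action, object classes) -> inherited action.
# Entries are hypothetical examples, not from the paper.
INHERITED = {
    ("contact", "person", "cup"): "hold_cup",
    ("approach", "person", "door"): "go_to_door",
}

def detect(t1, t2):
    g = generic_action(t1, t2)
    return INHERITED.get((g, t1.label, t2.label), g)

if __name__ == "__main__":
    person = Track("person", [(0, 0, 10, 30), (20, 0, 30, 30), (38, 0, 48, 30)])
    cup = Track("cup", [(40, 10, 45, 15)] * 3)
    print(detect(person, cup))  # boxes touch in the last frame -> "hold_cup"

Under these assumptions, the sketch also illustrates the extensibility claim: supporting a new interactive action from a different video domain amounts to adding an entry to the ontology table, with no retraining of previously defined actions.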