Open Access
Converting video classification problem to image classification with global descriptors and pre‐trained network
Author(s) - Zebhi Saeedeh, AlModarresi SMT, Abootalebi Vahid
Publication year - 2020
Publication title - IET Computer Vision
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.38
H-Index - 37
eISSN - 1751-9640
pISSN - 1751-9632
DOI - 10.1049/iet-cvi.2019.0625
Subject(s) - artificial intelligence, computer science, optical flow, pattern recognition (psychology), template, motion (physics), image (mathematics), computer vision, energy (signal processing), mathematics, statistics, programming language
A motion history image (MHI) is a spatio‐temporal template in which temporal motion information is collapsed into a single image whose intensity is a function of the recency of motion; it also retains spatial information. An energy image (EI), based on the magnitude of optical flow, is a temporal template that captures only the temporal information of motion. Each video can be described by these templates, and four new methods built on them are introduced in this study. The first three are called basic methods. In method 1, each video is split into N groups of consecutive frames, an MHI is calculated for each group, and the resulting templates are classified by transfer learning with fine‐tuning. Method 2 follows the same procedure as method 1 but classifies EIs instead. Method 3 fuses the two streams of templates. Finally, method 4 adds spatial information. Method 4 outperforms the others and is called the proposed method. It achieves recognition accuracies of 92.30% and 94.50% on the UCF Sport and UCF‐11 action data sets, respectively. The proposed method is also compared with state‐of‐the‐art approaches, and the results show that it has the best performance.
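As an illustration of the MHI step in method 1, the following is a minimal sketch and not the authors' implementation. It assumes the classic formulation in which a pixel is set to a maximum value tau when motion is detected and otherwise decays by one per frame. Motion detection by thresholded frame differencing, the parameter names `tau` and `diff_threshold`, and the helper `group_templates` are all assumptions made for the example; the abstract does not specify them.

```python
import numpy as np


def motion_history_image(frames, tau=None, diff_threshold=30):
    """Collapse grayscale frames shaped (T, H, W) into one MHI in [0, 1].

    Assumed update rule: a pixel is set to tau where motion is detected,
    otherwise it decays by 1 per frame (never below 0).
    """
    frames = np.asarray(frames, dtype=np.float32)
    if frames.ndim != 3 or frames.shape[0] < 2:
        raise ValueError("expected at least two grayscale frames shaped (T, H, W)")
    if tau is None:
        # Assumption: the history spans every frame transition in the group.
        tau = frames.shape[0] - 1

    mhi = np.zeros(frames.shape[1:], dtype=np.float32)
    for prev, curr in zip(frames[:-1], frames[1:]):
        # Assumption: motion is a thresholded absolute frame difference.
        moving = np.abs(curr - prev) > diff_threshold
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi / tau  # brighter pixels moved more recently


def group_templates(frames, n_groups):
    """Split a video into n_groups runs of consecutive frames, one MHI each."""
    groups = np.array_split(np.asarray(frames), n_groups)
    return [motion_history_image(group) for group in groups]


# Toy video: one bright pixel moving left to right across a 1x4 image.
video = np.zeros((4, 1, 4), dtype=np.uint8)
for t in range(4):
    video[t, 0, t] = 255

whole_video_mhi = motion_history_image(video)
per_group_mhis = group_templates(video, n_groups=2)
```

In the toy video the most recently visited pixels end up brightest, which is the "recency of motion" property the abstract describes. Each returned template is a single-channel image; how it is resized or replicated across channels before being passed to a pre‐trained network is left open here.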
