Open Access
Time-slice Prediction of Dyadic Human Activities
Author(s) -
Maryam Ziaeefard,
Robert Bergevin,
Louis-Philippe Morency
Publication year - 2015
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5244/c.29.167
Subject(s) - computer science, artificial intelligence, machine learning, pattern recognition, action recognition
Recognizing human activities from video data is leveraged in surveillance and human-computer interaction applications. In this paper, we introduce the problem of time-slice activity recognition, which explores human activity at a smaller temporal granularity: time-slice recognition infers human behaviors from a short temporal window. It has been shown that temporal slice analysis is helpful for motion characterization and, more generally, for video content representation. These studies motivate us to consider time-slices for activity recognition. We present in Figure 1 an overview of our approach based on time-slice action prediction and contrast it with conventional approaches, which recognize actions from either the whole video sequence (referred to as the "holistic" approach) or only its first part (early recognition). Our time-slice approach considers not only the beginning of the action sequence but any short-term observation anywhere in the video. Another key novelty is the explicit modeling of the uncertainty that arises when predicting actions from time-slices.

TAP Dataset: We introduce a new dataset, named the Time-slice Action Prediction (TAP) dataset, to evaluate our proposed feature descriptors and enable future research on this topic. The dataset was created by extracting time-slices from existing public human action datasets (UT-Interaction, HMDB, TV Interaction, and Hollywood) and performing a perception study in which multiple annotators gave continuous ratings for each action. The continuous ratings allow us to represent the uncertainty in time-slice action prediction. Three annotators rated each time-slice on how likely a specific action is to be occurring: for each time-slice and each action, the annotator picked one of five likelihood levels, from "Definitely Not Occurring" to "Definitely Occurring". Figure 2 illustrates the annotators' ratings for two example videos.

Methodology, Stage 1 (Discriminative segments): When analyzing an interaction, we can reliably recognize the ongoing activity from specific time-slices, such as the "two people are shaking each other's hands" slice in a handshaking activity. To extract discriminative segments from our dataset, we use Fleiss' kappa coefficient k [2] to measure the reliability of agreement between annotators. For each interaction video, the time-slices where the annotators are in complete agreement (k = 1) that the interaction of interest is definitely occurring are selected as discriminative segments.

Stage 2 (Predict-STIP): Existing STIP (spatio-temporal interest point) detectors are ill-suited to modeling the inherent uncertainty in partially observed action recognition.

Figure 2 caption: Human annotation. This figure shows the average rate of 3 annotators for two video examples, hug and push. Each annotator's label is converted to a number on a linear scale from 0 to 1; the average over annotators, called the average rate, is used to evaluate the performance of our method. The time-slices between the dashed lines form the discriminative segment of the interaction.
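To make the time-slice notion above concrete, here is a minimal Python sketch of cutting a frame sequence into short temporal windows. The function name and the `slice_len` and `stride` parameters are illustrative assumptions; the abstract does not specify window lengths or overlap.

```python
from typing import Iterator, Sequence, TypeVar

T = TypeVar("T")

def time_slices(frames: Sequence[T], slice_len: int, stride: int) -> Iterator[Sequence[T]]:
    """Yield short temporal windows ("time-slices") from a frame sequence.

    Unlike holistic recognition (the whole sequence) or early
    recognition (a prefix of it), a time-slice may start anywhere
    in the video.
    """
    for start in range(0, len(frames) - slice_len + 1, stride):
        yield frames[start:start + slice_len]
```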
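The annotation scheme maps five likelihood labels onto a linear 0-to-1 scale and averages over the three annotators, as described for Figure 2. A minimal sketch of that conversion follows; only the two endpoint labels are named in the abstract, so the three middle labels here are placeholders.

```python
LIKELIHOOD_SCALE = [
    "Definitely Not Occurring",   # named in the abstract
    "Probably Not Occurring",     # placeholder (middle labels not given)
    "Not Sure",                   # placeholder
    "Probably Occurring",         # placeholder
    "Definitely Occurring",       # named in the abstract
]

def average_rate(labels: list[str]) -> float:
    """Map each annotator's 5-level label onto [0, 1] and average
    across annotators (3 per time-slice in the TAP dataset)."""
    step = 1.0 / (len(LIKELIHOOD_SCALE) - 1)   # 0.0, 0.25, 0.5, 0.75, 1.0
    return sum(LIKELIHOOD_SCALE.index(l) * step for l in labels) / len(labels)

print(average_rate(["Definitely Occurring",
                    "Probably Occurring",
                    "Definitely Occurring"]))   # -> 0.9166...
```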
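Stage 1 relies on Fleiss' kappa [2] to quantify agreement between annotators. The sketch below implements the standard Fleiss' kappa formula on a per-video count matrix; treating rows as that video's time-slices is one reading of the text, not a detail the abstract spells out.

```python
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Fleiss' kappa for a (time_slices x categories) count matrix.

    counts[i, j] = number of annotators assigning time-slice i to
    likelihood category j; every row sums to the number of
    annotators (3 in the TAP dataset).
    """
    n_items, _ = counts.shape
    n_raters = int(counts[0].sum())

    # Observed per-item agreement P_i, averaged over items.
    p_i = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar = p_i.mean()

    # Chance agreement from the marginal category proportions.
    # (Degenerate case: if every rater uses one single category
    # throughout, p_e = 1 and kappa is undefined.)
    p_j = counts.sum(axis=0) / (n_items * n_raters)
    p_e = np.square(p_j).sum()

    return (p_bar - p_e) / (1 - p_e)
```

Per Stage 1, the time-slices of a video on which all three annotators chose "Definitely Occurring" (complete agreement, k = 1) would then be kept as that video's discriminative segment.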
