Object-Centric Spatio-Temporal Pyramids for Egocentric Activity Recognition
Author(s) - Tomas McCandless, Kristen Grauman
Publication year - 2013
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5244/c.27.30
Subject(s) - discriminative model, computer science, artificial intelligence, boosting (machine learning), histogram, pattern recognition (psychology), object (grammar), pyramid (geometry), computer vision, set (abstract data type), object detection, mathematics, image (mathematics), geometry, programming language
Activities in egocentric video are largely defined by the objects with which the camera wearer interacts, making representations that summarize the objects in view quite informative. Beyond simply recording how frequently each object occurs in a single histogram, spatio-temporal binning approaches can capture the objects’ relative layout and ordering. However, existing methods use hand-crafted binning schemes (e.g., a uniformly spaced pyramid of partitions), which may fail to capture the relationships that best distinguish certain activities. We propose to learn the spatio-temporal partitions that are discriminative for a set of egocentric activity classes. We devise a boosting approach that automatically selects a small set of useful spatio-temporal pyramid histograms among a randomized pool of candidate partitions. In order to efficiently focus the candidate partitions, we further propose an “object-centric” cutting scheme that prefers sampling bin boundaries near those objects prominently involved in the egocentric activities. In this way, we specialize the randomized pool of partitions to the egocentric setting and improve the training efficiency for boosting. Our approach yields state-of-the-art accuracy for recognition of challenging activities of daily living.
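To make the idea of a spatio-temporal binned object histogram concrete, the following is a minimal sketch and not the authors' implementation. It assumes object detections are given as (x, y, t, class_id) tuples with coordinates normalized to [0, 1]. The function names `sample_cuts` and `binned_histogram`, the Gaussian jitter around object centers, and the single-level (non-pyramid) partition are all illustrative assumptions. The boosting step that selects among candidate partitions is not shown.

```python
import bisect
import random


def sample_cuts(rng, centers, n_cuts, object_centric=True, jitter=0.05):
    """Sample n_cuts bin boundaries in [0, 1] along one axis.

    Assumption for illustration: an "object-centric" cut is drawn near a
    randomly chosen object center with Gaussian jitter; otherwise the cut
    is drawn uniformly at random.
    """
    cuts = []
    for _ in range(n_cuts):
        if object_centric and centers:
            c = rng.choice(centers) + rng.gauss(0.0, jitter)
        else:
            c = rng.random()
        cuts.append(min(max(c, 0.0), 1.0))  # clamp to the valid range
    return sorted(cuts)


def binned_histogram(detections, x_cuts, y_cuts, t_cuts, n_classes):
    """Count object classes inside each space-time cell of one partition.

    detections: list of (x, y, t, class_id), coordinates in [0, 1].
    Returns a flat list of length n_cells * n_classes.
    """
    nx, ny, nt = len(x_cuts) + 1, len(y_cuts) + 1, len(t_cuts) + 1
    hist = [0] * (nx * ny * nt * n_classes)
    for x, y, t, c in detections:
        ix = bisect.bisect_right(x_cuts, x)
        iy = bisect.bisect_right(y_cuts, y)
        it = bisect.bisect_right(t_cuts, t)
        cell = (it * ny + iy) * nx + ix
        hist[cell * n_classes + c] += 1
    return hist


# Hypothetical toy data: two detections of class 0 early on the left,
# one detection of class 1 late on the right.
detections = [(0.2, 0.5, 0.1, 0), (0.8, 0.5, 0.9, 1), (0.25, 0.4, 0.2, 0)]

# One cut in x and one in t gives 2 x 1 x 2 = 4 cells.
hist = binned_histogram(detections, [0.5], [], [0.5], n_classes=2)
# hist == [2, 0, 0, 0, 0, 0, 0, 1]

# A randomized candidate partition whose x cuts favor object locations.
rng = random.Random(0)
x_centers = [d[0] for d in detections]
candidate_x_cuts = sample_cuts(rng, x_centers, n_cuts=2)
```

Unlike a single bag-of-objects histogram, the binned version records that class 0 appears early and on the left while class 1 appears late and on the right, which is the kind of layout and ordering information the abstract refers to. A pool of such randomized partitions could then be scored by a learner such as boosting.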