Who's Speaking?
Author(s) - Punarjay Chakravarty, Sayeh Mirzaei, Tinne Tuytelaars, Hugo Van hamme
Publication year - 2015
Publication title - Lirias (KU Leuven)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2818346.2820780
Subject(s) - computer science, movement (music), microphone, microphone array, speech recognition, computer vision, artificial intelligence, head (geology), acoustics, telecommunications, physics, sound pressure, geomorphology, geology
Active speakers have traditionally been identified in video by detecting their moving lips. This paper demonstrates that active speakers can also be identified using spatio-temporal features that aim to capture other cues: movement of the head, upper body and hands. Speaker directional information, obtained using sound source localization from a microphone array, is used to supervise the training of these video features.
