A visual voice activity detection method with adaboosting
Author(s) - Qingju Liu, Wenwu Wang, Philip J. B. Jackson
Publication year - 2011
Publication title - Surrey Open Research Repository (University of Surrey)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1049/ic.2011.0145
Subject(s) - computer science, speech recognition, computer vision, artificial intelligence
Spontaneous speech in videos capturing the speaker's mouth provides bimodal information. Exploiting the relationship between the audio and visual streams, we propose a new visual voice activity detection (VAD) algorithm to overcome the vulnerability of conventional audio VAD techniques in the presence of background interference. First, a novel lip extraction algorithm combining rotational templates and prior shape constraints with active contours is introduced. The visual features are then obtained from the extracted lip region. Second, with the audio voice activity vector used in training, adaboosting is applied to the visual features to generate a strong final voice activity classifier by boosting a set of weak classifiers. We have tested our lip extraction algorithm on the XM2VTS database (with higher resolution) and some video clips from YouTube (with lower resolution). The visual VAD was shown to offer low error rates.
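
The boosting step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes single-feature threshold stumps as the weak classifiers, and the two per-frame features (a hypothetical mouth width and mouth height) and their values are invented toy data. The labels stand in for the audio voice activity vector used in training (+1 for speech, -1 for silence).

```python
import math

def stump_predict(row, j, t, polarity):
    # Weak classifier: thresholds a single feature j at t.
    return polarity if row[j] > t else -polarity

def train_stump(X, y, w):
    # Exhaustively pick the stump with the lowest weighted error.
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds: below the minimum, then midpoints.
        thresholds = [values[0] - 1.0]
        thresholds += [(a + b) / 2 for a, b in zip(values, values[1:])]
        for t in thresholds:
            for polarity in (1, -1):
                err = sum(wi for row, yi, wi in zip(X, y, w)
                          if stump_predict(row, j, t, polarity) != yi)
                if best is None or err < best[0]:
                    best = (err, j, t, polarity)
    return best

def adaboost_train(X, y, n_rounds=10):
    n = len(X)
    w = [1.0 / n] * n                      # uniform initial sample weights
    ensemble = []
    for _ in range(n_rounds):
        raw_err, j, t, polarity = train_stump(X, y, w)
        err = min(max(raw_err, 1e-10), 1 - 1e-10)   # avoid log(0)
        alpha = 0.5 * math.log((1 - err) / err)     # weight of this stump
        ensemble.append((alpha, j, t, polarity))
        if raw_err == 0:                   # training data already separated
            break
        # Increase the weight of misclassified frames, then renormalise.
        w = [wi * math.exp(-alpha * yi * stump_predict(row, j, t, polarity))
             for row, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def adaboost_predict(ensemble, row):
    # Strong classifier: sign of the alpha-weighted vote of the stumps.
    score = sum(a * stump_predict(row, j, t, p) for a, j, t, p in ensemble)
    return 1 if score >= 0 else -1

# Toy data (hypothetical): [mouth_width, mouth_height] per video frame.
X = [[0.50, 0.05], [0.52, 0.30], [0.48, 0.28],
     [0.51, 0.04], [0.55, 0.35], [0.49, 0.06]]
# Labels taken from an audio VAD on clean training audio: +1 speech, -1 silence.
y = [-1, 1, 1, -1, 1, -1]

model = adaboost_train(X, y)
predictions = [adaboost_predict(model, row) for row in X]
```

At test time only the visual features are needed: `adaboost_predict(model, frame_features)` returns the speech or silence decision for a frame, which is what lets a visual VAD of this kind operate when the audio is corrupted by interference.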