Open Access
Duration Mismatch Compensation Using Four-Covariance Model and Deep Neural Network for Speaker Verification
Author(s) - Pierre-Michel Bousquet, Mickaël Rouvier
Publication year - 2017
Publication title - Interspeech 2017
Language(s) - English
Resource type - Conference proceedings
DOI - 10.21437/interspeech.2017-93
Subject(s) - computer science, covariance, speech recognition, artificial intelligence, pattern recognition (psychology), duration (music), artificial neural network, representation (politics), complement (music), reliability (semiconductor), probabilistic logic, mathematics, statistics, power (physics), physics, art, biochemistry, chemistry, literature, quantum mechanics, complementation, politics, political science, law, gene, phenotype
Duration mismatch between enrollment and test utterances remains a major concern for the reliability of real-life speaker recognition applications. Two approaches are proposed here to deal with this mismatch when using the i-vector representation. The first is an adaptation of Gaussian Probabilistic Linear Discriminant Analysis (PLDA) modeling, which can be extended to any shift between i-vectors drawn from two distinct distributions. The second attempts to map the i-vectors of truncated segments of an utterance to the i-vector of the full segment, using deep neural networks (DNN). Our results show that both new approaches outperform standard PLDA by about 10% relative. In the case of a duration gap, these back-end methods could also complement methods that quantify the i-vector uncertainty during its extraction process.
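
The idea behind the second approach, learning a mapping from short-segment i-vectors to full-utterance i-vectors, can be illustrated with a toy sketch. The code below is not the authors' method: it swaps the DNN for a ridge-regression linear map, and it uses synthetic vectors in place of real i-vectors. The distortion model (a fixed linear shift plus Gaussian noise), the dimension, and all function names are assumptions made only for illustration.

```python
import numpy as np


def fit_linear_map(short_ivecs, full_ivecs, ridge=1e-3):
    """Fit W minimizing ||short @ W - full||^2 + ridge * ||W||^2.

    Linear stand-in for the DNN regression described in the abstract.
    """
    dim = short_ivecs.shape[1]
    gram = short_ivecs.T @ short_ivecs + ridge * np.eye(dim)
    return np.linalg.solve(gram, short_ivecs.T @ full_ivecs)


def make_toy_pairs(n, dim, rng, distortion):
    """Synthetic (short, full) pairs; an assumed model, not real i-vectors."""
    full = rng.standard_normal((n, dim))
    # Assumption: truncation acts as a linear shift plus additive noise.
    short = full @ distortion + 0.3 * rng.standard_normal((n, dim))
    return short, full


rng = np.random.default_rng(0)
dim = 20
distortion = 0.7 * np.eye(dim) + 0.05 * rng.standard_normal((dim, dim))

train_short, train_full = make_toy_pairs(2000, dim, rng, distortion)
test_short, test_full = make_toy_pairs(500, dim, rng, distortion)

W = fit_linear_map(train_short, train_full)

# Compare distance to the full-utterance vectors before and after mapping.
mse_raw = np.mean((test_short - test_full) ** 2)
mse_mapped = np.mean((test_short @ W - test_full) ** 2)
print(f"MSE without mapping: {mse_raw:.3f}")
print(f"MSE with mapping:    {mse_mapped:.3f}")
```

On this synthetic data the mapped vectors should land closer to their full-utterance targets than the unmapped ones. A nonlinear network, as the abstract describes, would replace `fit_linear_map` when the short-to-full relationship is not well captured by a single linear transform.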