Data-Model Relationship in Text-Independent Speaker Recognition
Author(s) - John S. Mason, Nicholas Evans, Robert Stapert, Roland Auckenthaler
Publication year - 2005
Publication title - EURASIP Journal on Advances in Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.317
H-Index - 88
eISSN - 1687-6180
pISSN - 1687-6172
DOI - 10.1155/asp.2005.471
Subject(s) - mixture model , computer science , speaker recognition , dynamic time warping , speech recognition , pronunciation , speaker diarisation , artificial intelligence , pattern recognition (psychology) , hidden markov model , linguistics , philosophy
Text-independent speaker recognition systems, such as those based on Gaussian mixture models (GMMs), do not include time sequence information (TSI) within the model itself. The importance of TSI in speaker recognition is an interesting question, and it is the one addressed in this paper. Recent work has shown that higher-level information such as idiolect, pronunciation, and prosodics can be useful in reducing speaker recognition error rates. In accordance with these developments, the aim of this paper is to show that, as more data becomes available, the basic GMM can be enhanced by utilising TSI, even in a text-independent mode. The paper presents experimental work incorporating TSI into the conventional GMM. The resulting system, known as the segmental mixture model (SMM), embeds dynamic time warping (DTW) into a GMM framework. Results on the 2000-speaker SpeechDat Welsh database show improved speaker recognition performance with the SMM.
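
To make the point about time sequence information concrete, the following is a minimal sketch and not the paper's SMM. It assumes a toy one-dimensional GMM with made-up parameters and a textbook DTW with an absolute-difference local cost; the function names `gmm_avg_loglik` and `dtw_distance` are hypothetical. It shows that a GMM score treats frames as independent, so reordering the frames leaves the score unchanged, whereas a DTW cost depends on frame order.

```python
import math


def gmm_avg_loglik(frames, weights, means, variances):
    """Average per-frame log-likelihood under a 1-D GMM.

    Frames are scored independently, so their order cannot affect the result.
    """
    total = 0.0
    for x in frames:
        p = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2.0 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(p)
    return total / len(frames)


def dtw_distance(a, b):
    """Textbook DTW cost between two 1-D sequences (absolute-difference local cost)."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor paths.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


# Toy parameters and features (assumed values, for illustration only).
weights, means, variances = [0.5, 0.5], [0.0, 3.0], [1.0, 1.0]
frames = [0.1, 0.4, 1.2, 2.5, 3.1, 2.9]
reordered = frames[::-1]  # same frames, time order reversed

gmm_original = gmm_avg_loglik(frames, weights, means, variances)
gmm_reordered = gmm_avg_loglik(reordered, weights, means, variances)
dtw_same = dtw_distance(frames, frames)
dtw_reordered = dtw_distance(frames, reordered)

print("GMM score unchanged by reordering:", math.isclose(gmm_original, gmm_reordered))
print("DTW cost, same order:", dtw_same)
print("DTW cost, reversed order:", dtw_reordered)
```

In this sketch the GMM cannot distinguish an utterance from its time-reversed copy, while the DTW cost can; this is the kind of sequence sensitivity that the abstract describes embedding into a GMM framework.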
