Text-Independent Phoneme Segmentation Combining EGG and Speech Data
Author(s) - Lijiang Chen, Xia Mao, Hong Yan
Publication year - 2016
Publication title - IEEE/ACM Transactions on Audio, Speech, and Language Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.916
H-Index - 56
eISSN - 2329-9304
pISSN - 2329-9290
DOI - 10.1109/taslp.2016.2533865
Subject(s) - signal processing and analysis, computing and processing, communication, networking and broadcast technologies, general topics for engineers
This paper proposes a new approach to text-independent phoneme segmentation at the sampling-point level. The algorithm has two phases. First, the voiced sections of the speech data are detected using the vocal fold vibration information contained in the electroglottograph (EGG) signal; a Hilbert envelope feature is adopted to achieve sampling-point-level detection accuracy. Second, the voiced sections and the remaining sections are treated separately. Each voiced section is divided into several candidate phonemes using the Viterbi algorithm, and adjacent candidate phonemes are then merged based on Hotelling's T-square test. In the remaining sections, unvoiced consonants are distinguished from silence using a singularity exponent feature. Comparison experiments show that the proposed method outperforms existing ones across a variety of tolerances and is more robust to noise.
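As a rough illustration of the first phase, the sketch below computes a Hilbert envelope of a signal and thresholds it to mark samples as voiced. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not specify how the envelope feature is post-processed or how the decision is made, so the function names `hilbert_envelope` and `voiced_mask` and the fixed `threshold` parameter are hypothetical. A naive O(N^2) DFT is used only to keep the example dependency-free; an FFT-based routine would normally be preferred for real recordings.

```python
import cmath
import math


def hilbert_envelope(x):
    """Return the magnitude of the analytic signal of a real sequence x."""
    n = len(x)
    # Forward DFT of the real input (naive, for illustration only).
    spectrum = [
        sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    # Keep DC (and Nyquist when n is even), double the positive
    # frequencies, and zero the negative ones: this is the spectrum
    # of the analytic signal.
    for k in range(n):
        if k == 0 or (n % 2 == 0 and k == n // 2):
            gain = 1.0
        elif k < (n + 1) // 2:
            gain = 2.0
        else:
            gain = 0.0
        spectrum[k] *= gain
    # Inverse DFT, then take the magnitude at each sample.
    return [
        abs(
            sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n))
            / n
        )
        for t in range(n)
    ]


def voiced_mask(egg, threshold):
    """Mark each sample as voiced when the envelope exceeds threshold.

    The fixed threshold is an assumption made for this sketch.
    """
    return [value > threshold for value in hilbert_envelope(egg)]


if __name__ == "__main__":
    # Synthetic stand-in for an EGG frame: a steady oscillation.
    frame = [0.5 * math.cos(2 * math.pi * 4 * t / 64) for t in range(64)]
    print(sum(voiced_mask(frame, threshold=0.1)), "of", len(frame), "samples voiced")
```

Because the envelope is defined at every sample, a decision of this kind can in principle be made per sampling point rather than per analysis frame, which is the property the abstract highlights.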