Estimating the spectral tilt of the glottal source from telephone speech using a deep neural network
Author(s) -
Emma Jokinen,
Paavo Alku
Publication year - 2017
Publication title -
The Journal of the Acoustical Society of America
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.619
H-Index - 187
eISSN - 1520-8524
pISSN - 0001-4966
DOI - 10.1121/1.4979162
Subject(s) - tilt (camera), vocal tract, computer science, distortion (music), artificial neural network, speech recognition, acoustics, telephone network, inverse filter, inverse, artificial intelligence, mathematics, telecommunications, physics, amplifier, geometry, bandwidth (computing)
Estimating the spectral tilt of the glottal source has several applications in speech analysis and modification. However, estimating the tilt directly from telephone speech is challenging because of vocal tract resonances and the distortion caused by speech compression. In this study, a deep neural network is used to estimate the tilt from telephone speech; the network is trained with tilt estimates computed by glottal inverse filtering. An objective evaluation shows that the proposed technique gives more accurate spectral tilt estimates than previously used techniques that estimate the tilt directly from telephone speech without glottal inverse filtering.