Auditory Brainstem Representation of the Voice Pitch Contours in the Resolved and Unresolved Components of Mandarin Tones
Author(s) -
Fei Peng,
Colette M. McKay,
Darren Mao,
Wensheng Hou,
Hamish Innes-Brown
Publication year - 2018
Publication title -
Frontiers in Neuroscience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.499
H-Index - 102
eISSN - 1662-4548
pISSN - 1662-453X
DOI - 10.3389/fnins.2018.00820
Subject(s) - harmonics, acoustics, quiet, fundamental frequency, harmonic, noise (video), mandarin chinese, tone (literature), speech recognition, waveform, spectral envelope, computer science, physics, artificial intelligence, art, linguistics, philosophy, literature, quantum mechanics, voltage, image (mathematics), telecommunications, radar
Accurate perception of voice pitch plays a vital role in speech understanding, especially for tonal languages such as Mandarin. Lexical tones are primarily distinguished by the fundamental frequency (F0) contour of the acoustic waveform. It has been shown that the auditory system can extract the F0 from both resolved and unresolved harmonics, and that tone identification is better with resolved harmonics than with unresolved harmonics. To evaluate the neural response to the resolved and unresolved components of Mandarin tones in quiet and in speech-shaped noise, we recorded the frequency-following response (FFR). Four types of stimuli were used: speech with either only-resolved harmonics or only-unresolved harmonics, each presented in quiet and in speech-shaped noise. FFRs were recorded to alternating-polarity stimuli, and the responses to the two polarities were added or subtracted to enhance the neural response to the envelope (FFR_ENV) or the temporal fine structure (FFR_TFS), respectively. The neural representation of F0 strength reflected by the FFR_ENV was evaluated by the peak autocorrelation value in the temporal domain and the peak phase-locking value (PLV) at F0 in the spectral domain. Both evaluation methods showed that the FFR_ENV F0 strength in quiet was significantly stronger than in noise for speech including unresolved harmonics, but not for speech including resolved harmonics. The neural representation of the temporal fine structure reflected by the FFR_TFS was assessed by the PLV at the harmonic near F1 (the 4th harmonic of F0). This PLV was significantly larger for resolved harmonics than for unresolved harmonics. Spearman's correlation showed that the FFR_ENV F0 strength to unresolved harmonics was correlated with tone identification performance in noise (0 dB SNR). These results showed that the FFR_ENV F0 strength to speech sounds with resolved harmonics was not affected by noise. In contrast, the response to speech sounds with unresolved harmonics was significantly smaller in noise than in quiet. Our results suggest that coding resolved harmonics is more important than coding the envelope for tone identification performance in noise.
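The analysis steps named in the abstract (adding or subtracting the responses to the two stimulus polarities, then measuring F0 strength by peak autocorrelation and by PLV) can be illustrated with a minimal NumPy sketch. This is not the authors' code. The function names, the division by two, the 80-400 Hz lag search range, and the use of a single fixed analysis frequency (rather than tracking a time-varying tone contour) are all assumptions made for illustration.

```python
import numpy as np

def split_env_tfs(resp_pos, resp_neg):
    """Combine averaged responses to the two stimulus polarities.

    Adding emphasises the polarity-invariant envelope response;
    subtracting emphasises the polarity-sensitive fine-structure response.
    The factor 1/2 is a scaling assumption, not taken from the source.
    """
    env = (resp_pos + resp_neg) / 2.0
    tfs = (resp_pos - resp_neg) / 2.0
    return env, tfs

def phase_locking_value(trials, fs, freq):
    """PLV at one frequency across trials (rows are trials).

    Takes the FFT phase of each trial at the bin nearest `freq` and
    returns the length of the mean unit phase vector (0 to 1).
    """
    n = trials.shape[1]
    spectrum = np.fft.rfft(trials, axis=1)
    k = int(round(freq * n / fs))  # nearest FFT bin
    phases = np.angle(spectrum[:, k])
    return np.abs(np.mean(np.exp(1j * phases)))

def peak_autocorrelation(signal, fs, f_lo=80.0, f_hi=400.0):
    """Peak normalised autocorrelation within an assumed F0 lag range.

    Returns (peak value, F0 estimate in Hz).
    """
    x = signal - signal.mean()
    ac = np.correlate(x, x, mode="full")[len(x) - 1:]
    ac = ac / ac[0]  # lag 0 normalised to 1
    lag_min = int(fs / f_hi)
    lag_max = int(fs / f_lo)
    lag = lag_min + int(np.argmax(ac[lag_min:lag_max + 1]))
    return ac[lag], fs / lag

# Synthetic example: a 120 Hz envelope component plus a 480 Hz
# component (4th harmonic) that inverts with stimulus polarity.
fs = 8000
t = np.arange(0, 0.25, 1 / fs)
envelope_part = np.cos(2 * np.pi * 120 * t)
fine_part = np.cos(2 * np.pi * 480 * t)
ffr_env, ffr_tfs = split_env_tfs(envelope_part + fine_part,
                                 envelope_part - fine_part)
peak, f0_est = peak_autocorrelation(ffr_env, fs)
plv_locked = phase_locking_value(np.tile(ffr_env, (50, 1)), fs, 120)
```

In this toy case the sum recovers the 120 Hz component and the difference recovers the 480 Hz component. A real FFR analysis would apply the same measures in short sliding windows so that the estimate can follow the changing F0 of each Mandarin tone.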