Open Access
A hierarchical sparse coding model predicts acoustic feature encoding in both auditory midbrain and cortex
Author(s) - Qingtian Zhang, Xiaolin Hu, Bo Hong, Bo Zhang
Publication year - 2019
Publication title - PLOS Computational Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.628
H-Index - 182
eISSN - 1553-7358
pISSN - 1553-734X
DOI - 10.1371/journal.pcbi.1006766
Subject(s) - auditory cortex, neural coding, receptive field, inferior colliculus, formant, computer science, speech recognition, pattern recognition (psychology), vowel, voxel, auditory system, artificial intelligence, neuroscience, biology, nucleus
The auditory pathway consists of multiple stages, from the cochlear nucleus to the auditory cortex. Neurons acting at different stages have different functions and exhibit different response properties. It is unclear whether these stages share a common encoding mechanism. We trained an unsupervised deep learning model consisting of alternating sparse coding and max pooling layers on cochleogram-filtered human speech. Evaluation of the response properties revealed that computing units in lower layers exhibited spectro-temporal receptive fields (STRFs) similar to those of inferior colliculus neurons measured in physiological experiments, including properties such as sound onset and termination, checkerboard pattern, and spectral motion. Units in upper layers tended to be tuned to phonetic features such as plosivity and nasality, resembling the results of field recording in human auditory cortex. Variation of the sparseness level of the units in each higher layer revealed a positive correlation between the sparseness level and the strength of phonetic feature encoding. The activities of the units in the top layer, but not other layers, correlated with the dynamics of the first two formants (F1, F2) of all phonemes, indicating the encoding of phoneme dynamics in these units. These results suggest that the principles of sparse coding and max pooling may be universal in the human auditory pathway.
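
To make the architecture described above concrete, here is a minimal sketch of one sparse coding stage followed by one max pooling stage. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify the solver, dictionary learning procedure, layer sizes, or pooling widths. The sketch assumes a fixed random dictionary `D`, uses ISTA (a standard iterative soft-thresholding solver for L1-penalized sparse coding), pools over time only, and substitutes random data for a cochleogram. The names `ista_sparse_code`, `max_pool_time`, `lam`, and `width` are hypothetical.

```python
import numpy as np

def ista_sparse_code(x, D, lam=0.1, n_iter=100):
    """Approximately minimize 0.5*||x - D a||^2 + lam*||a||_1 with ISTA.

    ISTA is an assumed solver chosen for illustration only.
    """
    # Step size 1/L, where L is the squared spectral norm of D
    L = np.linalg.norm(D, 2) ** 2
    a = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)           # gradient of the quadratic term
        z = a - grad / L                   # gradient step
        a = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft threshold
    return a

def max_pool_time(A, width=2):
    """Max-pool unit activities over non-overlapping time windows.

    A has shape (n_units, n_frames); trailing frames that do not fill
    a full window are dropped.
    """
    n_units, n_frames = A.shape
    n_out = n_frames // width
    return A[:, :n_out * width].reshape(n_units, n_out, width).max(axis=2)

# Toy data: random values stand in for a cochleogram (assumption).
rng = np.random.default_rng(0)
n_freq, n_units, n_frames = 16, 32, 8
D = rng.standard_normal((n_freq, n_units))
D /= np.linalg.norm(D, axis=0, keepdims=True)   # unit-norm dictionary atoms
X = rng.standard_normal((n_freq, n_frames))

# Layer 1: sparse code for each time frame; Layer 2: max pooling over time.
codes = np.stack(
    [ista_sparse_code(X[:, t], D) for t in range(n_frames)], axis=1
)
pooled = max_pool_time(np.abs(codes), width=2)
```

In a deeper stack, `pooled` would serve as the input to the next sparse coding layer. Raising `lam` yields sparser codes, which is one simple way to vary the sparseness level that the abstract relates to phonetic feature encoding.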
