Sparse periodicity‐based auditory features explain human performance in a spatial multitalker auditory scene analysis task
Author(s) -
Josupeit Angela,
Schoenmaker Esther,
van de Par Steven,
Hohmann Volker
Publication year - 2020
Publication title -
European Journal of Neuroscience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.346
H-Index - 206
eISSN - 1460-9568
pISSN - 0953-816X
DOI - 10.1111/ejn.13981
Subject(s) - psychoacoustics, auditory scene analysis, computational auditory scene analysis, speech recognition, computer science, intelligibility (philosophy), auditory system, novelty, pattern recognition (psychology), perception, artificial intelligence, psychology, cognitive psychology, social psychology, philosophy, epistemology, neuroscience
Human listeners robustly decode speech information from a talker of interest that is embedded in a mixture of spatially distributed interferers. A relevant question is which time‐frequency segments of the speech are predominantly used by a listener to solve such a complex auditory scene analysis task. A recent psychoacoustic study investigated the relevance of low signal‐to‐noise ratio (SNR) components of a target signal on speech intelligibility in a spatial multitalker situation. For this, a three‐talker stimulus was manipulated in the spectro‐temporal domain such that target speech time‐frequency units below a variable SNR threshold (SNR_crit) were discarded while keeping the interferers unchanged. The psychoacoustic data indicate that only target components at and above a local SNR of about 0 dB contribute to intelligibility. This study applies an auditory scene analysis “glimpsing” model to the same manipulated stimuli. Model data are found to be similar to the human data, supporting the notion of “glimpsing,” that is, that salient speech‐related information is predominantly used by the auditory system to decode speech embedded in a mixture of sounds, at least for the tested conditions of three overlapping speech signals. This implies that perceptually relevant auditory information is sparse and may be processed with low computational effort, which is relevant for neurophysiological research of scene analysis and novelty processing in the auditory system.
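
The stimulus manipulation described in the abstract (discarding target time‐frequency units whose local SNR falls below a threshold while leaving the interferers untouched) can be illustrated with a minimal sketch. This is not the authors' implementation. The abstract does not specify the time‐frequency representation, so the sketch assumes each unit is given as a non‐negative power value, and the function names `local_snr_db` and `discard_low_snr_units` are hypothetical.

```python
import math


def local_snr_db(target_power, interferer_power):
    """Local SNR in dB for one time-frequency unit (both powers must be > 0)."""
    return 10.0 * math.log10(target_power / interferer_power)


def discard_low_snr_units(target, interferers, snr_crit_db):
    """Zero out target units whose local SNR is below snr_crit_db.

    target, interferers: 2-D lists (frames x bands) of non-negative power
    values; `interferers` is assumed to hold the summed interferer power.
    The interferer array is only read, never modified.
    """
    kept = []
    for target_row, interferer_row in zip(target, interferers):
        row = []
        for t, i in zip(target_row, interferer_row):
            if t > 0 and i > 0:
                keep = local_snr_db(t, i) >= snr_crit_db
            else:
                # Assumption: with no interferer energy, a non-empty target
                # unit counts as above threshold; an empty unit stays empty.
                keep = t > 0
            row.append(t if keep else 0.0)
        kept.append(row)
    return kept
```

With a threshold of 0 dB, a unit survives only if its target power is at least as large as the interferer power in that unit, which corresponds to the "at and above a local SNR of about 0 dB" condition reported for the psychoacoustic data.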
