
Attentional Modulation of Hierarchical Speech Representations in a Multitalker Environment
Author(s) - İbrahim Kiremitçi, Özgür Yılmaz, Emin Çelik, Mohammad Shahdloo, Alexander G. Huth, Tolga Çukur
Publication year - 2021
Publication title - Cerebral Cortex
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.694
H-Index - 250
eISSN - 1460-2199
pISSN - 1047-3211
DOI - 10.1093/cercor/bhab136
Subject(s) - active listening, psychology, task (project management), cognitive psychology, natural sounds, brain activity and meditation, cognition, auditory cortex, speech recognition, electroencephalography, communication, computer science, neuroscience, management, economics
Humans are remarkably adept at listening to a desired speaker in a crowded environment while filtering out nontarget speakers in the background. Attention is key to solving this difficult cocktail-party task, yet a detailed characterization of attentional effects on speech representations is lacking. It remains unclear at which levels of speech features attentional modulation occurs, and how strong that modulation is in each brain area, during the cocktail-party task. To address these questions, we recorded whole-brain blood-oxygen-level-dependent (BOLD) responses in separate experiments while subjects either passively listened to single-speaker stories or selectively attended to a male or a female speaker in temporally overlaid stories. Spectral, articulatory, and semantic models of the natural stories were constructed. Intrinsic selectivity profiles were identified via voxelwise models fit to passive-listening responses. Attentional modulations were then quantified based on model predictions for attended and unattended stories in the cocktail-party task. We find that attention causes broad modulations at multiple levels of speech representations, growing stronger toward later stages of processing, and that unattended speech is represented up to the semantic level in parabelt auditory cortex. These results provide insights into the attentional mechanisms that underlie the ability to selectively listen to a desired speaker in noisy multispeaker environments.
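The general logic of the analysis (fit voxelwise models on passive-listening data, then compare their predictions for the attended and unattended stories) can be illustrated with a minimal sketch on synthetic data. Everything below is an assumption made for illustration: the abstract does not specify the regression method, the feature dimensions, or the modulation metric, so the closed-form ridge fit, the helper names (`fit_ridge`, `columnwise_corr`, `attention_index`), and the normalized-difference index are hypothetical stand-ins rather than the authors' procedure.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression; returns one weight column per voxel.
    (Assumed estimator; the abstract does not name one.)"""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    return (A * B).sum(axis=0) / (
        np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    )

def attention_index(W, X_attended, X_unattended, Y_cocktail):
    """Hypothetical index in [-1, 1]: positive when the attended story's
    features predict the cocktail-party response better than the
    unattended story's features."""
    r_att = columnwise_corr(X_attended @ W, Y_cocktail)
    r_unatt = columnwise_corr(X_unattended @ W, Y_cocktail)
    return (r_att - r_unatt) / (np.abs(r_att) + np.abs(r_unatt) + 1e-12)

# Synthetic stand-ins: 5 stimulus features, 3 voxels (arbitrary sizes).
rng = np.random.default_rng(0)
W_true = rng.normal(size=(5, 3))

# "Passive listening": responses driven by a single story's features.
X_passive = rng.normal(size=(200, 5))
Y_passive = X_passive @ W_true + 0.1 * rng.normal(size=(200, 3))
W_hat = fit_ridge(X_passive, Y_passive, alpha=1.0)

# "Cocktail party": the attended story drives responses more strongly
# than the unattended one (the 0.3 gain is an arbitrary choice).
X_att = rng.normal(size=(150, 5))
X_unatt = rng.normal(size=(150, 5))
Y_cocktail = (X_att @ W_true + 0.3 * (X_unatt @ W_true)
              + 0.1 * rng.normal(size=(150, 3)))

idx = attention_index(W_hat, X_att, X_unatt, Y_cocktail)
print(idx)  # one value per simulated voxel
```

In this toy setup the index is computed per voxel for a single feature space; repeating it separately for spectral, articulatory, and semantic features would be one way to compare modulation across levels of representation.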