Machine learning for decoding listeners’ attention from electroencephalography evoked by continuous speech

Taillez Tobias; Kollmeier Birger; Meyer Bernd T.



Author(s) - Taillez Tobias, Kollmeier Birger, Meyer Bernd T.

Publication year - 2020

Publication title - European Journal of Neuroscience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.346

H-Index - 206

eISSN - 1460-9568

pISSN - 0953-816X

DOI - 10.1111/ejn.13790

Subject(s) - electroencephalography, speech recognition, computer science, decoding methods, context (archaeology), stimulus (psychology), artificial neural network, artificial intelligence, segmentation, pattern recognition (psychology), psychology, cognitive psychology, neuroscience, telecommunications, paleontology, biology

Previous research has shown that it is possible to predict which speaker a listener attends to in a multispeaker scene by analyzing the listener's electroencephalography (EEG) activity. In this study, existing linear models that learn the mapping from neural activity to an attended speech envelope are replaced by a non-linear neural network (NN). The proposed architecture takes into account the temporal context of the estimated envelope and is evaluated using EEG data obtained from 20 normal-hearing listeners who focused on one speaker in a two-speaker setting. The network is optimized with respect to the frequency range and the temporal segmentation of the EEG input, as well as the cost function used to estimate the model parameters. To identify the salient cues involved in auditory attention, a relevance algorithm is applied that highlights the electrode signals most important for attention decoding. In contrast to linear approaches, the NN profits from a wider EEG frequency range (1–32 Hz) and achieves a performance seven times higher than the linear baseline. Relevant EEG activations were found at physiologically plausible locations 170 ms after the speech stimulus. These activations were not observed when the model was trained on the unattended speaker. Our findings therefore indicate that non-linear NNs can provide insight into physiological processes by analyzing EEG activity.
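The linear baseline mentioned in the abstract maps EEG activity to an estimate of the attended speech envelope and then checks which speaker's envelope the estimate resembles more. The sketch below illustrates that general idea with a ridge-regularized linear decoder on time-lagged EEG. It is not the authors' implementation: the synthetic data, the function names (`lag_matrix`, `fit_decoder`, `decode_attention`, `simulate_trial`), the number of lags, the ridge value, and the channel count are all assumptions chosen for illustration.

```python
import numpy as np


def lag_matrix(eeg, n_lags):
    """Stack EEG samples at lags 0..n_lags-1 after each envelope sample.

    eeg has shape (n_samples, n_channels); the result has shape
    (n_samples, n_channels * n_lags). Rows near the end are zero-padded.
    """
    n_samples, n_channels = eeg.shape
    lagged = np.zeros((n_samples, n_channels * n_lags))
    for lag in range(n_lags):
        # Row t holds eeg[t + lag], because the response follows the stimulus.
        cols = slice(lag * n_channels, (lag + 1) * n_channels)
        lagged[: n_samples - lag, cols] = eeg[lag:]
    return lagged


def fit_decoder(eeg, envelope, n_lags, ridge):
    """Ridge regression from lagged EEG to the attended envelope."""
    X = lag_matrix(eeg, n_lags)
    X = X - X.mean(axis=0)            # center predictors
    y = envelope - envelope.mean()    # center target
    XtX = X.T @ X
    return np.linalg.solve(XtX + ridge * np.eye(XtX.shape[0]), X.T @ y)


def decode_attention(eeg, env_a, env_b, weights, n_lags):
    """Pick the speaker whose envelope correlates best with the estimate."""
    estimate = lag_matrix(eeg, n_lags) @ weights
    r_a = np.corrcoef(estimate, env_a)[0, 1]
    r_b = np.corrcoef(estimate, env_b)[0, 1]
    return ("A" if r_a > r_b else "B"), r_a, r_b


def simulate_trial(rng, mixing, n_samples=2000, delay=5, noise=1.0):
    """Hypothetical toy data: EEG is a delayed copy of speaker A plus noise."""
    env_a = np.abs(rng.standard_normal(n_samples))  # attended speaker
    env_b = np.abs(rng.standard_normal(n_samples))  # ignored speaker
    delayed = np.concatenate([np.zeros(delay), env_a[:-delay]])
    eeg = np.outer(delayed, mixing)
    eeg = eeg + noise * rng.standard_normal((n_samples, mixing.size))
    return eeg, env_a, env_b


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    mixing = np.linspace(0.5, 1.5, 8)  # assumed per-channel gains
    train_eeg, train_env, _ = simulate_trial(rng, mixing)
    test_eeg, test_a, test_b = simulate_trial(rng, mixing)
    w = fit_decoder(train_eeg, train_env, n_lags=10, ridge=1.0)
    print(decode_attention(test_eeg, test_a, test_b, w, n_lags=10))
```

The study replaces this kind of linear mapping with a non-linear neural network; the comparison step, which correlates the estimated envelope with each candidate speaker's envelope, is shown here only as one common way to turn an envelope estimate into an attention decision.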
