Voice‐associated static face image releases speech from informational masking
Author(s) - Gao Yayue, Cao Shuyang, Qu Tianshu, Wu Xihong, Li Haifeng, Zhang Jinsheng, Li Liang
Publication year - 2014
Publication title - PsyCh Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.417
H-Index - 14
eISSN - 2046-0260
pISSN - 2046-0252
DOI - 10.1002/pchj.45
Subject(s) - speech recognition, masking, sentence, priming, face, speech perception, perception, psychology, linguistics, neuroscience
In noisy, multi‐talker environments such as a cocktail party, listeners can use various perceptual and/or cognitive cues to improve recognition of target speech against masking, particularly informational masking. Previous studies have shown that temporally prepresented voice cues (voice primes) improve recognition of target speech against speech masking but not against noise masking. This study investigated whether static face‐image primes that have become associated with the target voice (i.e., facial images linked through associative learning with voices reciting the target speech) can be used by listeners to unmask speech. In 32 normal‐hearing younger adults, temporally prepresenting a voice‐priming sentence spoken in the same voice as the target sentence significantly improved recognition of target speech masked by irrelevant two‐talker speech. When a photograph of a person's face had become associated, through learning, with the voice reciting the target speech, temporally prepresenting that target‐voice‐associated face image also significantly improved recognition of target speech against speech masking, particularly for the last two keywords in the target sentence. Moreover, speech‐recognition performance under the voice‐priming condition was significantly correlated with that under the face‐priming condition. The results suggest that learned facial information about talker identity plays an important role in identifying the target talker's voice and in facilitating selective attention to the target‐speech stream against the masking‐speech stream.