Task Estimation Using Latent Semantic Analysis of Visual Scenes and Spoken Words

Author(s) - Kimura Masashi, Sawada Shinta, Iribe Yurie, Katsurada Kouichi, Nitta Tsuneo

Publication year - 2014

Publication title - Electronics and Communications in Japan

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.131

H-Index - 13

eISSN - 1942-9541

pISSN - 1942-9533

DOI - 10.1002/ecj.11560

Subject(s) - computer science, linear subspace, latent semantic analysis, task (project management), artificial intelligence, probabilistic latent semantic analysis, natural language processing, object (grammar), pattern recognition (psychology), speech recognition, image (mathematics), modality (human–computer interaction), identification (biology), subspace topology, mathematics, botany, geometry, management, economics, biology

SUMMARY In this paper, we propose a task estimation method based on multiple subspaces extracted from multimodal information: image objects in visual scenes and spoken words in dialogue that appear in the same task. The subspaces are obtained using latent semantic analysis (LSA). In the proposed method, a task vector composed of spoken words and the frequencies of image-object appearances is extracted first, and then the similarities between the input task vector and the reference subspaces of the different tasks are compared. Experiments are conducted on the identification of game tasks. The experimental results show that the proposed method with multimodal information outperforms methods that use only a single modality, either images or spoken dialogue. The proposed method also remains accurate even when less spoken dialogue is available.
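
The pipeline outlined in the summary (build a task vector, derive one LSA subspace per task, compare the input against each reference subspace) can be illustrated with a small sketch. This is not the authors' implementation: the summary does not give the term weighting, the subspace rank, or the similarity measure, so the sketch assumes raw counts, a truncated SVD per task, and the cosine of the angle between the input vector and each subspace. The function names, the feature layout, and the toy counts are all hypothetical.

```python
import numpy as np


def task_subspace(samples, rank):
    """Return an orthonormal basis (n_features x rank) for one task.

    `samples` is an n_features x n_samples matrix; each column is one
    task vector (spoken-word counts stacked on image-object counts).
    """
    # LSA step: the leading left singular vectors span the latent subspace.
    u, _, _ = np.linalg.svd(samples, full_matrices=False)
    return u[:, :rank]


def subspace_similarity(vector, basis):
    """Cosine of the angle between `vector` and the span of `basis`."""
    norm = np.linalg.norm(vector)
    if norm == 0.0:
        return 0.0
    # Length of the projection onto the subspace, relative to the input length.
    return float(np.linalg.norm(basis.T @ vector) / norm)


def estimate_task(vector, subspaces):
    """Pick the task whose reference subspace is closest to `vector`."""
    scores = {task: subspace_similarity(vector, basis)
              for task, basis in subspaces.items()}
    return max(scores, key=scores.get), scores


# Toy data (assumed, not from the paper). Feature order:
# words "card", "deal", "dice", "roll", then objects "card deck", "dice cup".
cards_samples = np.array([[3, 2], [2, 3], [0, 0], [0, 1], [4, 5], [0, 0]], dtype=float)
dice_samples = np.array([[0, 0], [0, 1], [3, 4], [2, 2], [0, 0], [5, 4]], dtype=float)

subspaces = {
    "cards": task_subspace(cards_samples, rank=1),
    "dice": task_subspace(dice_samples, rank=1),
}

# An input with little speech but clear visual evidence of a card deck.
query = np.array([1, 1, 0, 0, 3, 0], dtype=float)
best_task, scores = estimate_task(query, subspaces)
print(best_task, scores)
```

Because the words and the image objects share one vector, the visual counts can still pull the input toward the right subspace when few words are observed. That is one plausible reading of why the multimodal variant degrades gracefully with less dialogue, though the summary itself does not explain the mechanism.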
