Task Estimation Using Latent Semantic Analysis of Visual Scenes and Spoken Words
Author(s) -
Kimura Masashi,
Sawada Shinta,
Iribe Yurie,
Katsurada Kouichi,
Nitta Tsuneo
Publication year - 2014
Publication title -
Electronics and Communications in Japan
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.131
H-Index - 13
eISSN - 1942-9541
pISSN - 1942-9533
DOI - 10.1002/ecj.11560
Subject(s) - computer science , linear subspace , latent semantic analysis , task (project management) , artificial intelligence , probabilistic latent semantic analysis , natural language processing , object (grammar) , pattern recognition (psychology) , speech recognition , image (mathematics) , modality (human–computer interaction) , identification (biology) , subspace topology , mathematics , botany , geometry , management , economics , biology
SUMMARY In this paper, we propose a task estimation method based on multiple subspaces extracted from multimodal information: image objects in visual scenes and spoken words in dialogue that appear in the same task. The subspaces are obtained by latent semantic analysis (LSA). In the proposed method, a task vector composed of the frequencies of spoken words and of image‐object appearances is extracted first, and then the similarities between the input task vector and the reference subspaces of the different tasks are compared. Experiments are conducted on the identification of game tasks. The experimental results show that the proposed method with multimodal information outperforms methods that use only a single modality, either images or spoken dialogue. The proposed method also remains accurate even when less spoken dialogue is used.
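
The pipeline described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation: the summary does not give the vector layout, the subspace rank, or the similarity measure, so the code assumes a count vector of words followed by image objects, a truncated SVD as the LSA step, and the projection-length cosine commonly used in subspace methods. The task names, feature labels, and counts are made-up toy data.

```python
import numpy as np


def fit_subspace(task_matrix, rank):
    """Return an orthonormal basis (features x rank) for one task.

    task_matrix is features x samples. Each column is one task vector:
    spoken-word counts followed by image-object counts (assumed layout).
    The truncated SVD plays the role of the LSA step.
    """
    u, _s, _vt = np.linalg.svd(task_matrix, full_matrices=False)
    return u[:, :rank]


def subspace_similarity(vector, basis):
    """Cosine of the angle between `vector` and its projection onto the subspace."""
    norm = np.linalg.norm(vector)
    if norm == 0.0:
        return 0.0
    return float(np.linalg.norm(basis.T @ vector) / norm)


def estimate_task(vector, subspaces):
    """Return the task whose reference subspace is most similar to `vector`."""
    return max(subspaces, key=lambda task: subspace_similarity(vector, subspaces[task]))


# Hypothetical features: [word "move", word "card", object "board", object "deck"]
game_a = np.array([[3, 2, 4],
                   [0, 0, 1],
                   [5, 4, 6],
                   [0, 1, 0]], dtype=float)
game_b = np.array([[0, 1, 0],
                   [4, 3, 5],
                   [0, 0, 1],
                   [6, 5, 4]], dtype=float)

# Rank 1 is an arbitrary choice for this toy example.
subspaces = {
    "game_a": fit_subspace(game_a, rank=1),
    "game_b": fit_subspace(game_b, rank=1),
}

query = np.array([2, 0, 3, 0], dtype=float)  # mostly "move" and "board"
print(estimate_task(query, subspaces))
```

Concatenating both modalities into one vector before the SVD is what lets a single subspace capture co-occurrence between words and objects. Dropping either half of the vector gives the single-modality baselines that the summary compares against.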
