Identifying the addressee in human-human-robot interactions based on head pose and speech
Author(s) -
Michael Katzenmaier,
Rainer Stiefelhagen,
Tanja Schultz
Publication year - 2004
Publication title -
Repository KITopen (Karlsruhe Institute of Technology)
Language(s) - English
Resource type - Conference proceedings
ISBN - 1-58113-995-0
DOI - 10.1145/1027933.1027959
Subject(s) - human–robot interaction , computer science , robot , artificial intelligence , sensory cue , head pose estimation , computer vision , speech recognition
In this work we investigate the power of acoustic and visual cues, and their combination, to identify the addressee in a human-human-robot interaction. Based on eighteen audio-visual recordings of two humans and a (simulated) robot, we discriminate the interaction between the two humans from the interaction of one human with the robot. The paper compares the results of three approaches. The first approach uses purely acoustic cues to identify the addressee; both low-level, feature-based cues and higher-level cues are examined. In the second approach we test whether the human's head pose is a suitable cue. Our results show that visually estimated head pose is a more reliable cue than the acoustic cues for identifying the addressee in this human-human-robot setting. In the third approach we combine the acoustic and visual cues, which yields significant improvements.
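The cue combination described in the abstract can be illustrated with a small sketch. The following Python snippet shows a hypothetical late-fusion rule that weights a per-utterance acoustic score against a score derived from the estimated head pan angle; the feature names, weights, thresholds, and scoring functions are illustrative assumptions, not the classifiers actually used in the paper.

```python
# Hedged sketch: late fusion of an acoustic cue and a head-pose cue to decide
# whether an utterance is addressed to the robot or to the other human.
# All scores, weights, and thresholds below are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class Utterance:
    speech_score: float   # hypothetical acoustic score in [0, 1]; higher = more "robot-directed"
    head_pan_deg: float   # estimated head pan angle; 0 deg = facing the robot


def head_pose_score(pan_deg: float, tolerance_deg: float = 15.0) -> float:
    """Map head pan to a [0, 1] score: 1 when facing the robot, decaying with angle."""
    return max(0.0, 1.0 - abs(pan_deg) / (2.0 * tolerance_deg))


def classify_addressee(utt: Utterance, w_visual: float = 0.7, threshold: float = 0.5) -> str:
    """Late fusion: weighted sum of the visual and acoustic scores, then a threshold decision."""
    fused = w_visual * head_pose_score(utt.head_pan_deg) + (1.0 - w_visual) * utt.speech_score
    return "robot" if fused >= threshold else "human"


if __name__ == "__main__":
    # Speaker looks almost straight at the robot and the speech cue also points to the robot.
    print(classify_addressee(Utterance(speech_score=0.8, head_pan_deg=5.0)))   # -> robot
    # Speaker looks well away from the robot; the visual cue dominates the decision.
    print(classify_addressee(Utterance(speech_score=0.6, head_pan_deg=60.0)))  # -> human
```

Giving the visual cue the larger weight in this sketch mirrors the abstract's finding that head pose is the more reliable single cue, while the acoustic score still contributes to the fused decision.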