
Open Access
Unified Learning from Demonstrations, Corrections, and Preferences during Physical Human-Robot Interaction
Author(s)
Shaunak A. Mehta,
Dylan P. Losey
Publication year: 2024
Humans can leverage physical interaction to teach robot arms. This physical interaction takes multiple forms depending on the task, the user, and what the robot has learned so far. State-of-the-art approaches focus on learning from a single modality, or combine multiple interaction types by assuming that the robot has prior information about the human's intended task. By contrast, in this paper we introduce an algorithmic formalism that unites learning from demonstrations, corrections, and preferences. Our approach makes no assumptions about the tasks the human wants to teach the robot; instead, we learn a reward model from scratch by comparing the human's inputs to nearby alternatives. We first derive a loss function that trains an ensemble of reward models to match the human's demonstrations, corrections, and preferences. The type and order of feedback is up to the human teacher: we enable the robot to collect this feedback passively or actively. We then apply constrained optimization to convert our learned reward into a desired robot trajectory. Through simulations and a user study we demonstrate that our proposed approach more accurately learns manipulation tasks from physical human interaction than existing baselines, particularly when the robot is faced with new or unexpected objectives. Videos of our user study are available at: https://youtu.be/FSUJsTYvEKU
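To make the abstract's core idea concrete, the sketch below shows one common way to "learn a reward model from scratch by comparing the human's inputs to nearby alternatives": a Bradley-Terry style comparison loss in which the human's demonstration, correction, or preferred option is treated as better than nearby alternative trajectories, and an ensemble of small reward networks is trained on that signal. This is an illustrative minimal sketch, not the authors' implementation; all names (RewardNet, comparison_loss, train_ensemble, feature dimensions) are assumptions introduced here.

```python
# Illustrative sketch only -- not the paper's code. Assumes trajectories are
# already summarized as fixed-length feature vectors, and that every human
# input is preferred over K nearby alternative trajectories.

import torch
import torch.nn as nn

class RewardNet(nn.Module):
    """Maps a trajectory feature vector to a scalar reward."""
    def __init__(self, feature_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(feature_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, features: torch.Tensor) -> torch.Tensor:
        return self.net(features).squeeze(-1)

def comparison_loss(model: RewardNet,
                    preferred: torch.Tensor,
                    alternatives: torch.Tensor) -> torch.Tensor:
    """Negative log-likelihood that the human's input beats each alternative.

    preferred:    (B, D) features of the demonstration, corrected trajectory,
                  or chosen preference.
    alternatives: (B, K, D) features of nearby alternative trajectories.
    """
    r_pref = model(preferred).unsqueeze(-1)        # (B, 1)
    r_alt = model(alternatives)                    # (B, K)
    logits = r_pref - r_alt                        # Bradley-Terry comparison
    return nn.functional.softplus(-logits).mean()  # -log sigmoid(logits)

def train_ensemble(preferred, alternatives, feature_dim,
                   n_models: int = 5, steps: int = 500, lr: float = 1e-3):
    """Train several independently initialized reward models on the same data."""
    ensemble = [RewardNet(feature_dim) for _ in range(n_models)]
    for model in ensemble:
        opt = torch.optim.Adam(model.parameters(), lr=lr)
        for _ in range(steps):
            opt.zero_grad()
            loss = comparison_loss(model, preferred, alternatives)
            loss.backward()
            opt.step()
    return ensemble

if __name__ == "__main__":
    B, K, D = 32, 10, 8  # toy sizes: inputs, alternatives per input, feature dim
    preferred = torch.randn(B, D)
    alternatives = preferred.unsqueeze(1) + 0.5 * torch.randn(B, K, D)
    ensemble = train_ensemble(preferred, alternatives, feature_dim=D)
    rewards = torch.stack([m(preferred) for m in ensemble])  # (n_models, B)
    print("ensemble disagreement:", rewards.std(0).mean().item())
```

In this style of pipeline, the disagreement across the ensemble (the printed standard deviation) is one plausible signal for deciding when the robot should actively query the human for more feedback, and the learned reward would then be handed to a separate constrained trajectory optimizer, as the abstract describes.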
Language(s): English
