Open Access

Adaptive Reinforcement Learning and Its Application to Robot Compliance Learning

Author(s) - Boo-Ho Yang, Haruhiko Asada

Publication year - 1995

Publication title - Journal of Robotics and Mechatronics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.257

H-Index - 19

eISSN - 1883-8049

pISSN - 0915-3942

DOI - 10.20965/jrm.1995.p0250

Subject(s) - reinforcement learning, grasp, robot, computer science, artificial intelligence, adaptive control, robot learning, robustness (evolution), control theory (sociology), control (management), mobile robot, biochemistry, chemistry, programming language, gene

A new learning algorithm for connectionist networks that solves a class of optimal control problems is presented. The algorithm, called the Adaptive Reinforcement Learning Algorithm, employs a second network to model the immediate reinforcement provided by the task environment and adaptively identifies it through repeated experience. Output perturbation and correlation techniques are used to translate mere critic signals into useful learning signals for the connectionist controller. Compared with direct approaches to reinforcement learning, this algorithm shows faster and guaranteed improvement in control performance. Robustness against inaccuracy of the model is also discussed. It is demonstrated by simulation that the adaptive reinforcement learning method is efficient and useful in learning a compliance control law in a class of robotic assembly tasks. A simple box palletizing task is used as an example, where a robot is required to move a rectangular part to the corner of a box. In the simulation, the robot is initially provided with only a predetermined velocity command to follow the nominal trajectory. At each attempt, the box is randomly located and the part is randomly oriented within the grasp of the end-effector. Therefore, compliant motion control is necessary to guide the part to the corner of the box while avoiding excessive reaction forces caused by collision with a wall. After repeated failed attempts at the task, the robot successfully learns force feedback gains that modify its nominal motion. Our results show that the new learning method can be used to learn a compliance control law effectively.
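The abstract gives no equations, so the following is only a minimal sketch of the general idea of output perturbation and correlation with a learned reinforcement model, not the authors' algorithm. Everything specific in it is an assumption made for illustration: a scalar force reading `x`, a single feedback gain `k` in place of a connectionist controller, a quadratic reinforcement with a hypothetical best gain `k_star`, and a linear-in-features model trained by LMS in place of the second network.

```python
import random


def features(x, u):
    # Hypothetical feature set for the reinforcement model (an assumption).
    return [1.0, x * x, x * u, u * u]


def predict(theta, x, u):
    # Model's estimate of the immediate reinforcement for state x, output u.
    return sum(t * p for t, p in zip(theta, features(x, u)))


def train_gain(steps=5000, k_star=-0.5, sigma=0.1,
               lr_gain=0.02, lr_model=0.05, seed=0):
    """Learn a scalar feedback gain from scalar reinforcement only."""
    rng = random.Random(seed)
    k = 0.0                      # controller gain, u = k * x
    theta = [0.0] * 4            # reinforcement model parameters
    for _ in range(steps):
        x = rng.uniform(-1.0, 1.0)       # toy "force" measurement
        u = k * x                        # nominal controller output
        e = rng.gauss(0.0, sigma)        # output perturbation
        # Toy environment: reinforcement is highest when u == k_star * x.
        r = -((u + e) - k_star * x) ** 2

        # Correlate the perturbation with the reinforcement left unexplained
        # by the model's prediction at the unperturbed output.
        surprise = r - predict(theta, x, u)
        grad_u = surprise * e / sigma ** 2   # estimate of dr/du
        k += lr_gain * grad_u * x            # chain rule: du/dk = x

        # Adaptively identify the reinforcement model (LMS update).
        phi = features(x, u + e)
        err = r - predict(theta, x, u + e)
        theta = [t + lr_model * err * p for t, p in zip(theta, phi)]
    return k, theta


if __name__ == "__main__":
    k, _ = train_gain()
    print("learned gain:", round(k, 3))
```

In this sketch the model prediction acts only as a baseline: because it does not depend on the perturbation, an inaccurate model adds noise to the gradient estimate without biasing it, which is one simple way to read the abstract's remark about robustness to model inaccuracy.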
