Online adaptive algorithm for optimal control with integral reinforcement learning




Author(s) - Vamvoudakis Kyriakos G., Vrabie Draguna, Lewis Frank L.

Publication year - 2013

Publication title - International Journal of Robust and Nonlinear Control

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.361

H-Index - 106

eISSN - 1099-1239

pISSN - 1049-8923

DOI - 10.1002/rnc.3018

Subject(s) - reinforcement learning, convergence (economics), computer science, bellman equation, optimal control, stability (learning theory), controller (irrigation), nonlinear system, mathematical optimization, control theory (sociology), adaptive control, function approximation, control (management), mathematics, artificial neural network, artificial intelligence, machine learning, physics, quantum mechanics, agronomy, economics, biology, economic growth

SUMMARY In this paper, we introduce an online algorithm that uses integral reinforcement knowledge to learn the continuous‐time optimal control solution for nonlinear systems with infinite‐horizon costs and partial knowledge of the system dynamics. The algorithm is a data‐based approach to solving the Hamilton–Jacobi–Bellman equation, and it does not require explicit knowledge of the system's drift dynamics. A novel adaptive control algorithm is given that is based on policy iteration and implemented using an actor/critic structure with two adaptive approximators. The actor and critic approximation networks are adapted simultaneously. A persistence of excitation condition is required to guarantee convergence of the critic to the actual optimal value function. Novel adaptive tuning algorithms are given for both the critic and actor networks, with extra terms in the actor tuning law required to guarantee closed‐loop dynamical stability. Approximate convergence to the optimal controller is proven, and stability of the system is also guaranteed. Simulation examples support the theoretical result. Copyright © 2013 John Wiley & Sons, Ltd.
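The integral reinforcement idea in the summary can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it replaces the simultaneous actor/critic tuning laws with batch least-squares policy iteration, and it assumes a scalar linear system x' = a*x + b*u with cost integrand q*x^2 + r*u^2, so that the value function is V(x) = p*x^2. All function names and numeric values are hypothetical. The critic fits p from the integral Bellman relation V(x(t)) = (cost accumulated over [t, t+T]) + V(x(t+T)) using measured data only, and the actor update uses the input gain b but never the drift term a, which appears solely inside the simulator.

```python
import numpy as np

def simulate_interval(x, gain, a, b, q, r, T, dt=1e-3):
    """Simulator only: advance x' = a*x + b*u with u = -gain*x over T seconds
    (RK4) and return the final state and the measured cost integral."""
    f = lambda s: (a - b * gain) * s               # closed-loop dynamics
    stage = lambda s: (q + r * gain**2) * s**2     # q*x^2 + r*u^2 with u = -gain*x
    cost = 0.0
    for _ in range(int(round(T / dt))):
        s1 = f(x)
        s2 = f(x + 0.5 * dt * s1)
        s3 = f(x + 0.5 * dt * s2)
        s4 = f(x + dt * s3)
        x_next = x + dt * (s1 + 2 * s2 + 2 * s3 + s4) / 6.0
        cost += 0.5 * dt * (stage(x) + stage(x_next))  # trapezoidal rule
        x = x_next
    return x, cost

def irl_policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, gain0=0.0,
                         x0=1.0, T=0.1, n_intervals=10, n_iters=8):
    """Policy iteration driven by integral reinforcement data.
    gain0 must be stabilizing; the learner never reads `a` directly."""
    gain, history = gain0, []
    for _ in range(n_iters):
        x = x0                                     # restart keeps the data informative
        phi, target = [], []
        for _ in range(n_intervals):
            x_next, cost = simulate_interval(x, gain, a, b, q, r, T)
            phi.append(x**2 - x_next**2)           # V(x(t)) - V(x(t+T)) = p * phi
            target.append(cost)                    # measured integral reinforcement
            x = x_next
        phi, target = np.array(phi), np.array(target)
        p = float(phi @ target / (phi @ phi))      # critic: scalar least squares
        gain = b * p / r                           # actor: u = -(b*p/r) * x
        history.append(p)
    return p, gain, history

def riccati_p(a, b, q, r):
    """Closed-form solution of 2*a*p - (b*p)**2 / r + q = 0, for comparison."""
    return r * (a + np.sqrt(a**2 + b**2 * q / r)) / b**2

if __name__ == "__main__":
    p, gain, history = irl_policy_iteration()
    print("learned p:", p, "Riccati p:", riccati_p(-1.0, 1.0, 1.0, 1.0))
```

Restarting the state at x0 in every iteration is a crude stand-in for the persistence of excitation condition mentioned in the summary: it keeps the regression data from decaying to zero. In the nonlinear setting of the paper, the single weight p would be replaced by the weights of the critic and actor approximators, which are tuned continuously rather than in batches.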
