Open Access
Online approximate optimal control for affine non‐linear systems with unknown internal dynamics using adaptive dynamic programming
Author(s) - Yang Xiong, Liu Derong, Wei Qinglai
Publication year - 2014
Publication title - IET Control Theory and Applications
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.059
H-Index - 108
eISSN - 1751-8652
pISSN - 1751-8644
DOI - 10.1049/iet-cta.2014.0186
Subject(s) - control theory (sociology) , dynamic programming , observer (physics) , optimal control , computer science , adaptive control , artificial neural network , lyapunov function , affine transformation , convergence (economics) , separation principle , state observer , mathematics , mathematical optimization , control (management) , nonlinear system , artificial intelligence , physics , quantum mechanics , pure mathematics , economics , economic growth
In this study, a novel online adaptive dynamic programming (ADP)‐based algorithm is developed for solving the optimal control problem of affine non‐linear continuous‐time systems with unknown internal dynamics. The algorithm employs an observer–critic architecture to approximate the solution of the Hamilton–Jacobi–Bellman equation. Two neural networks (NNs) are used in this architecture: an NN state observer is constructed to estimate the unknown system dynamics, and a critic NN is designed to derive the optimal control, replacing the action–critic dual networks typically employed in traditional ADP algorithms. Based on the developed architecture, the observer NN and the critic NN are tuned simultaneously. Unlike existing tuning laws for the critic, the newly developed critic update rule not only ensures convergence of the critic to the optimal control but also guarantees stability of the closed‐loop system. No initial stabilising control is required, and by using recorded and instantaneous data simultaneously for the adaptation of the critic, the restrictive persistence of excitation condition is relaxed. In addition, Lyapunov's direct method is utilised to demonstrate the uniform ultimate boundedness of the weights of the observer NN and the critic NN. Finally, an example is provided to verify the effectiveness of the approach.
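
For orientation, the problem class named in the abstract can be written out in its standard textbook form. The abstract itself gives no equations, so the quadratic-in-control cost and all symbols below (f, g, Q, R, V*, sigma, W_c) are assumptions made for this sketch and are not taken from the paper.

```latex
% Assumed notation: f(x) is the unknown internal dynamics, g(x) the input gain,
% Q(x) a positive-definite state penalty, R a positive-definite weight matrix.
\begin{align}
  \dot{x}(t) &= f\big(x(t)\big) + g\big(x(t)\big)\,u(t), \\
  V(x_0) &= \int_0^{\infty} \Big( Q\big(x(\tau)\big) + u(\tau)^{\top} R\, u(\tau) \Big)\, d\tau, \\
  0 &= \min_{u} \Big[ Q(x) + u^{\top} R\, u + \nabla V^{*}(x)^{\top} \big( f(x) + g(x)\,u \big) \Big], \\
  u^{*}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla V^{*}(x).
\end{align}
% A critic NN with an assumed basis \sigma(x) and weight estimate \hat{W}_c
% approximates V^* and yields a control estimate without a separate action NN:
\begin{align}
  \hat{V}(x) &= \hat{W}_c^{\top} \sigma(x), &
  \hat{u}(x) &= -\tfrac{1}{2}\, R^{-1} g(x)^{\top} \nabla \sigma(x)^{\top} \hat{W}_c .
\end{align}
```

In this form the optimal control depends only on g(x) and the gradient of the value function, which is why a single critic can supply the control. The Hamilton–Jacobi–Bellman residual, however, still contains f(x), so an estimate of the internal dynamics, such as the one an NN observer provides, is needed to evaluate it.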
