
Model‐free optimal tracking control for discrete‐time system with delays using reinforcement Q‐learning
Author(s) - Liu Yang, Yu Rui
Publication year - 2018
Publication title - Electronics Letters
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.375
H-Index - 146
eISSN - 1350-911X
pISSN - 0013-5194
DOI - 10.1049/el.2017.3238
Subject(s) - reinforcement learning , tracking , trajectory , computer science , control theory , quadratic equation , Q‐learning , optimal control , mathematical optimization , mathematics , artificial intelligence
A reinforcement Q‐learning algorithm is proposed for the optimal tracking control problem of discrete‐time systems with unknown dynamics and delays. Traditional methods for optimal tracking control require an accurate system model, a requirement the Q‐learning method avoids. This is valuable in practical implementation, because all or part of a system's model is often difficult to obtain or costly to identify. First, an augmented system composed of the original system and the reference trajectory is constructed; the corresponding augmented linear quadratic tracking (LQT) Bellman equation is then derived. On this basis, the reinforcement Q‐learning algorithm is presented. To implement the method, the iteration equations are solved online using the least squares technique.
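The abstract gives no implementation details beyond this outline. Purely as an illustration of the scheme it sketches (augmented state, quadratic Q‐function, batch least-squares policy evaluation, policy improvement), the following minimal Python example applies a Q‐learning/least-squares loop to a scalar, delay-free LQT problem. All matrices, weights, function names, and hyperparameters here are invented for the demo; the plant model appears only to simulate data (the learner never uses it), and the paper's treatment of delays is omitted.

import numpy as np

# Simulation-only plant and reference (unknown to the learner); scalar demo values.
A, B = np.array([[0.9]]), np.array([[0.5]])   # plant: x_{k+1} = A x_k + B u_k
F = np.array([[1.0]])                          # reference: r_{k+1} = F r_k
Q1, R, gamma = 1.0, 0.1, 0.95                  # tracking weight, input weight, discount

nz, nu = 2, 1                                  # augmented state z = [x; r], input dims
nw = nz + nu                                   # dimension of w = [z; u]

def sym_basis(w):
    # Quadratic monomials w_i * w_j (i <= j) parametrising Q(z, u) = w' H w
    # with H symmetric; off-diagonal monomials carry a factor 2.
    return np.array([w[i] * w[j] * (1.0 if i == j else 2.0)
                     for i in range(nw) for j in range(i, nw)])

def theta_to_H(theta):
    # Rebuild the symmetric Q-function kernel H from the parameter vector.
    H, idx = np.zeros((nw, nw)), 0
    for i in range(nw):
        for j in range(i, nw):
            H[i, j] = H[j, i] = theta[idx]
            idx += 1
    return H

def stage_cost(z, u):
    e = z[0] - z[1]                            # tracking error x - r
    return Q1 * e**2 + R * u[0]**2

K = np.zeros((nu, nz))                         # initial admissible policy u = -K z
rng = np.random.default_rng(0)

for it in range(20):                           # policy iteration
    Phi, y = [], []
    x, r = np.array([1.0]), np.array([1.0])
    for k in range(60):                        # collect transitions under the policy
        z = np.concatenate([x, r])
        u = -K @ z + 0.3 * rng.standard_normal(nu)   # exploration noise
        x, r = A @ x + B @ u, F @ r
        z_next = np.concatenate([x, r])
        u_next = -K @ z_next                   # on-policy successor input
        # Bellman equation Q(z,u) - gamma*Q(z',u') = stage cost, linear in theta.
        Phi.append(sym_basis(np.concatenate([z, u]))
                   - gamma * sym_basis(np.concatenate([z_next, u_next])))
        y.append(stage_cost(z, u))
    theta, *_ = np.linalg.lstsq(np.array(Phi), np.array(y), rcond=None)
    H = theta_to_H(theta)
    K = np.linalg.solve(H[nz:, nz:], H[nz:, :nz])   # policy improvement step

print("learned feedback gain K:", K)

In this sketch the least-squares step plays the role of the online policy-evaluation solve described in the abstract: each pass fits the quadratic Q-function from measured transitions alone, and the improved gain is read off the learned kernel without ever invoking the plant matrices.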