Open Access
Single‐network ADP for near optimal control of continuous‐time zero‐sum games without using initial stabilising control laws
Author(s) - Mu Chaoxu, Wang Ke
Publication year - 2018
Publication title - IET Control Theory and Applications
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.059
H-Index - 108
eISSN - 1751-8652
pISSN - 1751-8644
DOI - 10.1049/iet-cta.2018.5464
Subject(s) - zero sum game , differential game , lyapunov function , mathematics , bounded function , control theory (sociology) , optimal control , convergence (economics) , saddle point , artificial neural network , stability (learning theory) , operator (biology) , lyapunov stability , mathematical optimization , bellman equation , adaptive control , equilibrium point , computer science , differential equation , control (management) , nash equilibrium , nonlinear system , artificial intelligence , repressor , economic growth , mathematical analysis , chemistry , biochemistry , geometry , quantum mechanics , machine learning , transcription factor , physics , economics , gene
This study establishes an approximate optimal critic learning algorithm, based on single‐network adaptive dynamic programming (ADP), for solving continuous‐time two‐player zero‐sum games in the absence of initial stabilising control policies. Single‐network means that only one critic neural network is used: it approximately learns the value function, from which the saddle‐point equilibrium of the zero‐sum differential game is derived. First, the authors formulate the two‐player zero‐sum game problem mathematically and analyse the similarity between the problem for linear systems and for non‐linear systems. Then, the adaptive learning scheme is implemented as a critic structure that derives the control and disturbance policies by learning the optimal value, and a novel weight tuning law involving a stable operator is proposed to ensure convergence and stability. Moreover, the uniformly ultimately bounded stability of the whole system is rigorously proved by Lyapunov theory. Finally, simulation results for a complex linear system and a non‐linear system are provided to confirm the effectiveness of the improved approximate optimal control technique in solving the equations.
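
To make the critic structure more concrete, the sketch below shows how both the control and the disturbance policy can be read off one critic. It is not the authors' algorithm: the scalar plant, the numbers, the quadratic cost with attenuation level gamma, the single basis function x^2, and all function names are assumptions chosen for illustration, and the paper's weight tuning law and stable operator are not reproduced. Instead, the ideal weight is taken from the closed-form solution of the scalar game Riccati equation, and the code checks that it drives the Hamilton-Jacobi-Isaacs (HJI) residual to zero.

```python
import math

# Hypothetical scalar plant: dx/dt = a*x + b*u + d*w (illustrative numbers only).
a, b, d = 1.0, 1.0, 1.0        # a > 0, so the open loop is unstable
q, r, gamma = 1.0, 1.0, 2.0    # assumed cost: q*x^2 + r*u^2 - gamma^2*w^2


def critic_gradient(x, weight):
    """Gradient of the critic V(x) = weight * x^2 (one basis function)."""
    return 2.0 * weight * x


def policies(x, weight):
    """Control u and disturbance w, both derived from the same critic."""
    grad = critic_gradient(x, weight)
    u = -0.5 * b * grad / r            # minimising player
    w = 0.5 * d * grad / gamma**2      # maximising player
    return u, w


def hji_residual(x, weight):
    """HJI residual; it is zero when the critic equals the optimal value."""
    u, w = policies(x, weight)
    utility = q * x**2 + r * u**2 - gamma**2 * w**2
    return utility + critic_gradient(x, weight) * (a * x + b * u + d * w)


# Closed-form ideal weight for this scalar case (positive root of the
# scalar game Riccati equation s*p^2 - 2*a*p - q = 0).
s = b**2 / r - d**2 / gamma**2
ideal_weight = (a + math.sqrt(a**2 + s * q)) / s
closed_loop_pole = a - s * ideal_weight   # dx/dt = pole * x under both policies

print("ideal weight:", ideal_weight)
print("HJI residual at ideal weight:", hji_residual(0.7, ideal_weight))
print("closed-loop pole:", closed_loop_pole)
```

In this toy case the critic that zeroes the residual also stabilises the unstable plant. A learning scheme of the kind the abstract describes would adjust the weight online until the residual vanishes, rather than using a closed-form solution.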
