Suboptimal reduced control of unknown nonlinear singularly perturbed systems via reinforcement learning



Author(s) - Liu Xiaomin, Yang Chunyu, Zhou Linna, Fu Jun, Dai Wei

Publication year - 2021

Publication title - International Journal of Robust and Nonlinear Control

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.361

H-Index - 106

eISSN - 1099-1239

pISSN - 1049-8923

DOI - 10.1002/rnc.5624

Subject(s) - reinforcement learning, control theory (sociology), singular perturbation, hamilton–jacobi–bellman equation, convergence (economics), nonlinear system, artificial neural network, computer science, function approximation, controller (irrigation), bellman equation, mathematical optimization, state (computer science), stability (learning theory), perturbation (astronomy), mathematics, control (management), algorithm, artificial intelligence, machine learning, mathematical analysis, physics, quantum mechanics, agronomy, economics, biology, economic growth

This paper proposes a suboptimal reduced control method for a class of nonlinear singularly perturbed systems (SPSs) with unknown dynamics. Singular perturbation theory is used to derive a reduced system from the original system, and a policy iteration method with guaranteed convergence is proposed to solve the corresponding reduced Hamilton–Jacobi–Bellman (HJB) equation. A reinforcement learning (RL) algorithm then implements the policy iteration method without any knowledge of the system dynamics. In the RL algorithm, the unmeasurable state of the virtual reduced system is reconstructed from slow-state measurements of the original system, the controller and cost function are approximated by actor-critic neural networks (NNs), and the NN weights are updated by the method of weighted residuals. The effects of the state reconstruction error and the NN function approximation error on convergence, on the suboptimality of the reduced controller, and on the stability of the closed-loop SPS are rigorously analyzed. Finally, examples illustrate the effectiveness of the proposed method.
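To make the policy iteration idea concrete, the following is a minimal sketch on a hypothetical toy problem, not the algorithm from the paper. It assumes the reduced dynamics are known, namely x' = -x + u, which is the quasi-steady-state model of the toy SPS x' = z, εz' = -x - z + u with ε set to 0, whereas the paper's RL algorithm works without a model. The running cost x² + u², the polynomial critic basis [x², x⁴], and the function names are all assumptions chosen for illustration. Policy evaluation is done by a least-squares fit at collocation points, which is one simple member of the weighted-residual family.

```python
import numpy as np


def features_grad(x):
    """Gradient of the assumed critic basis [x^2, x^4] with respect to x."""
    return np.stack([2.0 * x, 4.0 * x**3], axis=1)


def policy(w, x):
    """Policy improvement step: u = -(1/2) * dV/dx (input gain 1, control weight 1)."""
    return -0.5 * features_grad(x) @ w


def policy_iteration(n_iter=10, n_points=21):
    """Policy iteration for the toy reduced system x' = -x + u, cost x^2 + u^2."""
    x = np.linspace(-1.0, 1.0, n_points)  # collocation points (assumed range)
    w = np.zeros(2)  # initial policy u = 0 is admissible because x' = -x is stable
    for _ in range(n_iter):
        u = policy(w, x)
        f = -x + u  # closed-loop reduced dynamics under the current policy
        # Policy evaluation: choose weights so that dV/dx * f + x^2 + u^2 = 0
        # holds in the least-squares sense over the collocation points.
        A = features_grad(x) * f[:, None]
        b = -(x**2 + u**2)
        w, *_ = np.linalg.lstsq(A, b, rcond=None)
    return w


weights = policy_iteration()
print(weights)
```

Because this toy problem is linear-quadratic, the result can be checked by hand: the scalar Riccati equation p² + 2p - 1 = 0 gives p = √2 - 1, so the weight on x² should approach that value and the weight on x⁴ should vanish. For a genuinely nonlinear reduced system the same loop applies, but the basis would generally only approximate the value function.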
