Adaptive dynamic programming for model‐free tracking of trajectories with time‐varying parameters

Köpf Florian; Ramsteiner Simon; Puccetti Luca; Flad Michael; Hohmann Sören

Author(s) - Köpf Florian, Ramsteiner Simon, Puccetti Luca, Flad Michael, Hohmann Sören

Publication year - 2020

Publication title - International Journal of Adaptive Control and Signal Processing

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.73

H-Index - 66

eISSN - 1099-1115

pISSN - 0890-6327

DOI - 10.1002/acs.3106

Subject(s) - computer science, tracking (education), dynamic programming, control theory (sociology), tracking error, trajectory, controller (irrigation), function (biology), invariant (physics), system dynamics, quadratic equation, LTI system theory, mathematical optimization, artificial intelligence, mathematics, algorithm, linear system, control (management), psychology, mathematical analysis, pedagogy, physics, geometry, astronomy, evolutionary biology, agronomy, mathematical physics, biology

Summary - Recently proposed adaptive dynamic programming (ADP) tracking controllers assume that the reference trajectory follows time‐invariant exo‐system dynamics, an assumption that does not hold for many applications. In order to overcome this limitation, we propose a new Q‐function that explicitly incorporates a parametrized approximation of the reference trajectory. This allows learning to track a general class of trajectories by means of ADP. Once our Q‐function has been learned, the associated controller handles time‐varying reference trajectories without the need for further training and independent of exo‐system dynamics. After proposing this general model‐free off‐policy tracking method, we provide an analysis of the important special case of linear quadratic tracking. An example demonstrates that our new method successfully learns the optimal tracking controller and outperforms existing approaches in terms of tracking error and cost.
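The summary does not state the form of the Q‐function, so the following is only an illustrative sketch of the linear quadratic idea, not the authors' formulation. It assumes the Q‐function is a quadratic form z^T H z in the stacked vector z = [x; p; u], where x is the system state, p is a vector of reference trajectory parameters, and u is the control input. Under that assumption, minimizing over u gives a controller that is linear in [x; p], so changing p changes the tracked trajectory without relearning H. The names H, p, greedy_control, and q_value are hypothetical, and H below is a random stand-in rather than a learned matrix.

```python
import numpy as np

def greedy_control(H, x, p):
    """Minimize z^T H z over u, with z = [x; p; u] and H symmetric.

    Setting the gradient with respect to u to zero gives
    u = -H_uu^{-1} H_us [x; p], which requires H_uu to be positive definite.
    """
    n = x.size + p.size              # length of the known part s = [x; p]
    H_us = H[n:, :n]                 # block coupling u with [x; p]
    H_uu = H[n:, n:]                 # block acting on u alone
    s = np.concatenate([x, p])
    return -np.linalg.solve(H_uu, H_us @ s)

def q_value(H, x, p, u):
    """Evaluate the assumed quadratic Q-function at (x, p, u)."""
    z = np.concatenate([x, p, u])
    return float(z @ H @ z)

# Hypothetical dimensions: 2 states, 3 reference parameters, 1 input.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
H = A @ A.T + 6.0 * np.eye(6)        # symmetric positive definite stand-in

x = np.array([1.0, -0.5])            # current state (made-up values)
p = np.array([0.2, 0.1, 0.0])        # reference parameters (made-up values)
u_star = greedy_control(H, x, p)
```

In this sketch the gain -H_uu^{-1} H_us is fixed once H is known. Passing a different p at run time only changes the input to that gain, which mirrors the summary's claim that time‐varying references are handled without further training.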
