Open Access
Time‐in‐action RL
Author(s) -
Zhu Jiangcheng,
Wang Zhepei,
McIlwraith Douglas,
Wu Chao,
Xu Chao,
Guo Yike
Publication year - 2019
Publication title -
IET Cyber-Systems and Robotics
Language(s) - English
Resource type - Journals
ISSN - 2631-6315
DOI - 10.1049/iet-csr.2018.0001
Subject(s) - reinforcement learning , computer science , artificial intelligence , control theory , Bellman equation , mathematical optimization , mathematics
The authors propose a novel reinforcement learning (RL) framework in which agent behaviour is governed by traditional control theory. This integrated approach, called time-in-action RL, makes RL applicable to many real-world systems whose underlying dynamics are already known in a control-theoretic formalism. The key insight enabling this integration is to model an explicit time function that maps each state-action pair to the time the underlying controller needs to accomplish that action. In this framework, an action is described by both its value (the action value) and the time it takes to perform (the action time). The action value results from the RL policy at a given state, while the action time is estimated by an explicit time model learnt from measured activity of the underlying controller. The RL value network is then trained with this time model embedded, so that predicted action times inform value estimation. The approach is evaluated on a variant of Atari Pong and is shown to converge.
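To make the mechanism concrete, here is a minimal sketch of how an explicit time model and a time-aware value update could fit together. This is not the authors' implementation: the class and function names are hypothetical, the tabular Q-table stands in for the paper's value network, and the SMDP-style discount of gamma ** t(s, a) is an assumption made for illustration.

```python
# Hypothetical sketch of the time-in-action idea, under the assumptions
# stated above (tabular values, SMDP-style time-dependent discounting).

class TimeModel:
    """Explicit time model t(s, a): average measured time the underlying
    controller needs to accomplish action a starting from state s."""

    def __init__(self, default_time=1.0):
        self.default_time = default_time
        self.totals = {}   # (state, action) -> summed measured times
        self.counts = {}   # (state, action) -> number of measurements

    def update(self, state, action, measured_time):
        # Record one measured controller execution time.
        key = (state, action)
        self.totals[key] = self.totals.get(key, 0.0) + measured_time
        self.counts[key] = self.counts.get(key, 0) + 1

    def predict(self, state, action):
        # Return the running average, or a default before any measurement.
        key = (state, action)
        if key not in self.counts:
            return self.default_time
        return self.totals[key] / self.counts[key]


def q_update(Q, time_model, s, a, reward, s_next, actions,
             alpha=0.1, gamma=0.99):
    """One value update with the time model embedded: future return is
    discounted by gamma ** t(s, a), so actions the controller takes longer
    to accomplish discount subsequent value more heavily (an assumption,
    not taken from the paper)."""
    t = time_model.predict(s, a)
    best_next = max(Q.get((s_next, b), 0.0) for b in actions)
    target = reward + (gamma ** t) * best_next
    q_sa = Q.get((s, a), 0.0)
    Q[(s, a)] = q_sa + alpha * (target - q_sa)
```

In use, measured controller times would be fed to TimeModel.update after each executed action, and q_update would be called once per transition; the paper's actual framework trains a neural value network with the learnt time model embedded rather than this lookup table.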
