Switching reinforcement learning for continuous action space
Author(s) - Nagayoshi Masato, Murao Hajime, Tamaki Hisashi
Publication year - 2012
Publication title - Electronics and Communications in Japan
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.131
H-Index - 13
eISSN - 1942-9541
pISSN - 1942-9533
DOI - 10.1002/ecj.10383
Subject(s) - reinforcement learning, computer science, process (computing), artificial intelligence, action, space, robot, gross motor skill, entropy, control engineering, machine learning, motor skill, engineering, psychology, physics, quantum mechanics, psychiatry, operating system
Reinforcement learning (RL) is attracting attention as a technique for realizing computational intelligence, such as adaptive and autonomous decentralized systems. In general, however, it is not easy to put RL to practical use. One difficulty is designing a suitable action space for an agent, that is, satisfying two requirements that trade off against each other: (i) keeping the characteristics (or structure) of the original search space as much as possible, in order to seek strategies close to the optimal; and (ii) reducing the search space as much as possible, in order to expedite the learning process. To design a suitable action space adaptively, we propose the Switching RL model, which mimics the process of an infant's motor development, in which gross motor skills develop before fine motor skills. A method for switching controllers is then constructed by introducing an "entropy" measure and referring to it. The validity of the proposed method is confirmed by computational experiments on robot navigation problems with one- and two-dimensional continuous action spaces. © 2012 Wiley Periodicals, Inc. Electron Comm Jpn, 95(3): 37–44, 2012; Published online in Wiley Online Library (wileyonlinelibrary.com). DOI 10.1002/ecj.10383
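The abstract gives only the outline of the switching mechanism, so the following is a minimal, hypothetical Python sketch of the coarse-to-fine idea it describes: two tabular Q-learning controllers over a one-dimensional continuous action interval, with control handed from the coarse to the fine controller once the coarse controller's softmax-policy entropy at the current state falls below a threshold. The class and function names, the Boltzmann policy, and the specific entropy-threshold switching rule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

class Controller:
    """Tabular Q-learning over one discretization of the action interval [-1, 1]."""

    def __init__(self, n_states, n_actions, alpha=0.1, gamma=0.9, tau=0.5):
        self.q = np.zeros((n_states, n_actions))
        self.actions = np.linspace(-1.0, 1.0, n_actions)  # coarse or fine action grid
        self.alpha, self.gamma, self.tau = alpha, gamma, tau

    def policy(self, s):
        # Boltzmann (softmax) action-selection probabilities at state s
        z = self.q[s] / self.tau
        z -= z.max()                      # numerical stability
        p = np.exp(z)
        return p / p.sum()

    def entropy(self, s):
        # Shannon entropy of the softmax policy; high = still exploring
        p = self.policy(s)
        return -(p * np.log(p + 1e-12)).sum()

    def act(self, s, rng):
        i = rng.choice(len(self.actions), p=self.policy(s))
        return i, self.actions[i]         # discrete index, continuous action value

    def update(self, s, i, r, s_next):
        target = r + self.gamma * self.q[s_next].max()
        self.q[s, i] += self.alpha * (target - self.q[s, i])


def select_controller(controllers, s, threshold=0.5):
    """Use the coarsest controller whose policy at s is still uncertain; once its
    entropy drops below the threshold, hand control to the next, finer controller
    (coarse-to-fine, analogous to gross-before-fine motor development)."""
    for c in controllers[:-1]:
        if c.entropy(s) > threshold:
            return c
    return controllers[-1]


# Example wiring (environment loop omitted):
rng = np.random.default_rng(0)
controllers = [Controller(n_states=25, n_actions=3),    # gross motor skills
               Controller(n_states=25, n_actions=15)]   # fine motor skills
# per step: c = select_controller(controllers, s)
#           i, a = c.act(s, rng)  ->  apply continuous action a
#           c.update(s, i, r, s_next) after observing reward r and next state
```

Under these assumptions the search stays small while the coarse controller's policy is still high-entropy, and the original fine-grained action structure is recovered only where the coarse policy has already converged, which is the trade-off the abstract identifies.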