Reinforcement learning using a stochastic gradient method with memory‐based learning
Author(s) - Yamada Takafumi, Yamaguchi Satoshi
Publication year - 2010
Publication title - Electrical Engineering in Japan
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.136
H-Index - 28
eISSN - 1520-6416
pISSN - 0424-7760
DOI - 10.1002/eej.20963
Subject(s) - reinforcement learning, computer science, artificial intelligence, set (abstract data type), task (project management), process (computing), series (stratigraphy), computer memory, memory model, algorithm, semiconductor memory, shared memory, parallel computing, engineering, paleontology, systems engineering, biology, programming language, operating system
In this paper, a learning algorithm combining memory-less learning and memory-based learning is proposed for agents operating in a partially observable Markov decision process (POMDP). In the first stage of the proposed algorithm, memory-less learning is applied, with the stochastic gradient method employed as the memory-less learning algorithm; state-action set series that accomplish the task are stored in memory during this stage. In the second stage, memory-based learning is applied. Because this process uses only the series obtained in the first stage, the method significantly reduces the amount of required memory. The proposed algorithm is applied to three simulations for comparison with the memory-less learning algorithm. The computer simulations show that the proposed algorithm works more effectively in POMDP than ordinary memory-less learning. © 2010 Wiley Periodicals, Inc. Electr Eng Jpn, 173(1): 32–40, 2010; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/eej.20963
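The abstract only outlines the two-stage structure, so the following Python sketch illustrates one plausible reading of it: stage 1 runs a memory-less stochastic-gradient (REINFORCE-style) update on a softmax policy over observations and stores the state-action series of episodes that accomplish the task; stage 2 then reuses only those stored series. The toy aliased-corridor POMDP, the softmax parameterization, and the stage-2 update rule are illustrative assumptions, not the authors' formulation.

```python
import math
import random
from collections import defaultdict

ACTIONS = ["left", "right"]

class AliasedCorridor:
    """Toy POMDP (assumed for illustration): states 0..3, goal is state 3.
    States 1 and 2 emit the same observation, so a memory-less agent
    cannot distinguish them."""
    def reset(self):
        self.state = 0
        return self.observe()
    def observe(self):
        return "mid" if self.state in (1, 2) else str(self.state)
    def step(self, action):
        self.state += 1 if action == "right" else -1
        self.state = max(0, min(3, self.state))
        done = self.state == 3
        reward = 1.0 if done else -0.01
        return self.observe(), reward, done

class SoftmaxPolicy:
    """Tabular softmax policy over observations (an assumed parameterization)."""
    def __init__(self):
        self.theta = defaultdict(float)  # theta[(obs, action)]
    def probs(self, obs):
        prefs = [math.exp(self.theta[(obs, a)]) for a in ACTIONS]
        z = sum(prefs)
        return [p / z for p in prefs]
    def sample(self, obs):
        return random.choices(ACTIONS, weights=self.probs(obs))[0]
    def gradient_update(self, trajectory, ret, lr):
        # REINFORCE-style stochastic gradient: return times d log pi / d theta
        for obs, action in trajectory:
            p = dict(zip(ACTIONS, self.probs(obs)))
            for a in ACTIONS:
                grad = (1.0 if a == action else 0.0) - p[a]
                self.theta[(obs, a)] += lr * ret * grad

def stage1_memoryless(env, policy, episodes=300, max_steps=20, lr=0.2):
    """Stage 1: memory-less stochastic-gradient learning; store the
    observation-action series of episodes that reach the goal."""
    stored_series = []
    for _ in range(episodes):
        obs, trajectory, ret, done = env.reset(), [], 0.0, False
        for _ in range(max_steps):
            action = policy.sample(obs)
            next_obs, reward, done = env.step(action)
            trajectory.append((obs, action))
            ret += reward
            obs = next_obs
            if done:
                break
        policy.gradient_update(trajectory, ret, lr)
        if done:  # task accomplished: keep this series for stage 2
            stored_series.append(trajectory)
    return stored_series

def stage2_memory_based(policy, stored_series, bonus=0.5):
    """Stage 2: memory-based learning that uses only the stored series,
    biasing the policy toward observation-action pairs that led to success."""
    for trajectory in stored_series:
        for obs, action in trajectory:
            policy.theta[(obs, action)] += bonus

if __name__ == "__main__":
    env, policy = AliasedCorridor(), SoftmaxPolicy()
    series = stage1_memoryless(env, policy)
    stage2_memory_based(policy, series)
    print(f"stored {len(series)} successful series")
```

Because stage 2 only touches pairs that appear in the stored series, its memory footprint is bounded by the number of successful episodes rather than the full observation-action space, which is the memory saving the abstract points to.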
