Distributed Deep Reinforcement Learning Method Using Profit Sharing for Learning Acceleration
Author(s) - Kodama Naoki, Harada Taku, Miyazaki Kazuteru
Publication year - 2020
Publication title - IEEJ Transactions on Electrical and Electronic Engineering
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.254
H-Index - 30
eISSN - 1931-4981
pISSN - 1931-4973
DOI - 10.1002/tee.23180
Subject(s) - reinforcement learning , computer science , artificial intelligence , acceleration , reinforcement , machine learning , psychology , physics , classical mechanics , social psychology
Profit Sharing (PS), a reinforcement learning method that strongly reinforces successful experiences, has been shown to improve learning speed when combined with a deep Q-network (DQN). We expect a further improvement in learning speed by integrating PS-based learning with the Ape-X DQN, which offers state-of-the-art learning speed, instead of the DQN. However, PS-based learning does not use replay memory, whereas the Ape-X DQN requires it because the exploration of the environment for collecting experiences and the training of the network are performed asynchronously. In this study, we propose Learning-accelerated Ape-X, which integrates the Ape-X DQN and PS-based learning with several improvements, including the use of replay memory. We show through numerical experiments that the proposed method achieves higher scores on Atari 2600 video games in a shorter time than the Ape-X DQN. © 2020 Institute of Electrical Engineers of Japan. Published by Wiley Periodicals LLC.
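
The abstract's central idea can be illustrated with a minimal sketch: when an episode ends with a reward, a PS-style credit is propagated backward over the episode's transitions (geometrically decayed toward earlier steps, in the spirit of the classical PS reinforcement function) and the credited transitions are then written into a replay memory from which an asynchronous learner, as in Ape-X, can sample. This is an assumption-laden illustration of how PS-based reinforcement of successful experiences might coexist with a replay buffer; the function names assign_ps_credit and store_episode and the decay constant PS_DECAY are hypothetical and do not reflect the paper's actual formulation.

    from collections import deque

    PS_DECAY = 0.5      # hypothetical geometric decay for PS credit (assumption)
    MEMORY_CAP = 10_000

    replay_memory = deque(maxlen=MEMORY_CAP)

    def assign_ps_credit(episode, terminal_reward, decay=PS_DECAY):
        """Propagate a terminal reward backward over an episode's transitions,
        Profit-Sharing style: the last step receives the full reward, earlier
        steps geometrically less. Returns transitions tagged with a PS credit."""
        credited = []
        credit = terminal_reward
        for (state, action, next_state) in reversed(episode):
            credited.append((state, action, credit, next_state))
            credit *= decay  # decay the credit toward the start of the episode
        credited.reverse()   # restore chronological order
        return credited

    def store_episode(episode, terminal_reward):
        """Store a finished episode in replay memory with PS credits attached,
        so that an asynchronously running learner can sample them later."""
        for transition in assign_ps_credit(episode, terminal_reward):
            replay_memory.append(transition)

    # Usage: an actor finishes a three-step episode that earned reward 1.0.
    episode = [((0,), 1, (1,)), ((1,), 0, (2,)), ((2,), 1, (3,))]
    store_episode(episode, terminal_reward=1.0)
    print(list(replay_memory))  # later transitions carry larger PS credit

Attaching the PS credit to each stored transition is one plausible way to reconcile PS, which normally reinforces an entire episode immediately, with a replay buffer that decouples experience collection from training.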
