Open Access
Prioritized Experience Replay in Multi-Actor-Attention-Critic for Reinforcement Learning
Author(s) - Sheng Fan, Guanghua Song, Bowei Yang, Xiaohong Jiang
Publication year - 2020
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1631/1/012040
Subject(s) - reinforcement learning, computer science, reuse, convergence (economics), scalability, metric (unit), selection (genetic algorithm), artificial intelligence, operations management, database, economics, economic growth, ecology, biology
Experience replay is an important technique in off-policy reinforcement learning (RL): it allows an agent to reuse past experience and reduces the correlation between training samples. Multi-Actor-Attention-Critic (MAAC) is a successful off-policy multi-agent reinforcement learning algorithm, owing to its good scalability. To accelerate convergence, we use prioritized experience replay (PER) to optimize experience selection in MAAC and propose the PER-MAAC algorithm. In PER-MAAC, the priority metric is the temporal-difference (TD) error observed during training. The algorithm is evaluated in the Multi-UAV Cooperative Navigation and Rover-Tower scenarios. Experimental results show that PER-MAAC improves the speed of convergence.
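
For illustration, below is a minimal sketch of a proportional prioritized replay buffer of the kind the abstract describes, with the magnitude of the TD error as the priority metric. This is not the authors' implementation; the class name and the hyperparameters (alpha, beta, eps) are illustrative assumptions drawn from the standard PER formulation, not from the paper.

import numpy as np

class PrioritizedReplayBuffer:
    def __init__(self, capacity, alpha=0.6, eps=1e-6):
        self.capacity = capacity      # maximum number of stored transitions
        self.alpha = alpha            # how strongly priorities skew sampling
        self.eps = eps                # keeps every priority strictly positive
        self.data = []                # stored transitions
        self.priorities = np.zeros(capacity, dtype=np.float64)
        self.pos = 0                  # next write index (circular buffer)

    def add(self, transition):
        # New samples receive the current maximum priority so they are
        # replayed at least once before their TD error is known.
        max_prio = self.priorities.max() if self.data else 1.0
        if len(self.data) < self.capacity:
            self.data.append(transition)
        else:
            self.data[self.pos] = transition
        self.priorities[self.pos] = max_prio
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        prios = self.priorities[:len(self.data)]
        probs = prios ** self.alpha
        probs /= probs.sum()
        idx = np.random.choice(len(self.data), batch_size, p=probs)
        # Importance-sampling weights correct the bias introduced by
        # non-uniform sampling; beta is typically annealed toward 1.
        weights = (len(self.data) * probs[idx]) ** (-beta)
        weights /= weights.max()
        batch = [self.data[i] for i in idx]
        return batch, idx, weights

    def update_priorities(self, idx, td_errors):
        # Priority is the magnitude of the temporal-difference error,
        # matching the priority metric described in the abstract.
        self.priorities[idx] = np.abs(td_errors) + self.eps

In a training loop of the kind the abstract outlines, one would sample a batch, compute TD errors with the critic, weight the critic loss by the returned importance-sampling weights, and then call update_priorities with the new TD errors.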
