
An Improved PPO for Multiple Unmanned Aerial Vehicles
Author(s) -
Xue Bai,
Chengxuan Lu,
Qihao Bao,
Shansheng Zhu,
Shaojie Xia
Publication year - 2021
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1757/1/012156
Subject(s) - computer science , artificial intelligence , reinforcement learning , distributed computing , control
In recent years, multi-agent reinforcement learning (MARL) has been applied widely, especially in large multi-role games such as StarCraft and unmanned aerial vehicle (UAV) combat simulations. However, MARL still faces challenges in achieving fast convergence and efficient cooperation. In a multi-agent scenario, on the one hand, a fully centralized network model is difficult to converge because the joint action space grows exponentially with the number of agents; on the other hand, a fully decentralized model struggles to cooperate and reach a global optimum. To jointly control multiple agents, we propose an improved PPO algorithm that combines a centralized network with decentralized networks. Our method not only reduces the effective action space and accelerates convergence, but also introduces more diversity into the agents’ decision-making.
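The abstract gives no implementation details, but the described combination of a centralized network with decentralized networks matches the common centralized-critic / decentralized-actor pattern for multi-agent PPO. The sketch below illustrates that pattern under this assumption, not the authors’ exact architecture: each agent keeps its own small actor over its local observation (so per-agent action spaces stay small), while one shared critic evaluates the global state during training to encourage coordination. All names (DecentralizedActor, CentralizedCritic, clip_eps, etc.) are illustrative, not from the paper.

```python
import torch
import torch.nn as nn

class DecentralizedActor(nn.Module):
    """Per-agent policy: maps an agent's local observation to action logits.

    Keeping one actor per agent avoids the exponential joint action space
    a fully centralized policy would face."""
    def __init__(self, obs_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, act_dim),
        )

    def forward(self, obs: torch.Tensor) -> torch.distributions.Categorical:
        return torch.distributions.Categorical(logits=self.net(obs))

class CentralizedCritic(nn.Module):
    """Shared value network: scores the global state (e.g. all agents'
    observations concatenated), giving every actor a coordinated baseline."""
    def __init__(self, global_state_dim: int, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(global_state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        return self.net(state).squeeze(-1)

def ppo_clip_loss(actor: DecentralizedActor,
                  obs: torch.Tensor,
                  actions: torch.Tensor,
                  old_log_probs: torch.Tensor,
                  advantages: torch.Tensor,
                  clip_eps: float = 0.2) -> torch.Tensor:
    """Standard PPO clipped surrogate objective, applied per agent.

    Advantages would be computed from the centralized critic's value
    estimates (e.g. via GAE), which is where the centralized and
    decentralized components meet."""
    dist = actor(obs)
    log_probs = dist.log_prob(actions)
    ratio = torch.exp(log_probs - old_log_probs)   # r_t(theta)
    clipped = torch.clamp(ratio, 1.0 - clip_eps, 1.0 + clip_eps)
    return -torch.min(ratio * advantages, clipped * advantages).mean()
```

In this scheme each agent’s actor is updated with its own clipped PPO loss, while the single critic is trained on global information; at execution time only the lightweight decentralized actors are needed.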