Dynamic Coordination-Based Reinforcement Learning for Driving Policy
Author(s) -
Huaiwei Si,
Guozhen Tan,
Yanfei Peng,
Jianping Li
Publication year - 2022
Publication title -
Wireless Communications and Mobile Computing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.42
H-Index - 64
eISSN - 1530-8677
pISSN - 1530-8669
DOI - 10.1155/2022/6836778
Subject(s) - reinforcement learning, computer science, construct (python library), intelligent transportation system, multi agent system, artificial intelligence, distributed computing, human–computer interaction, computer network, civil engineering, engineering
With the development of communication technology and artificial intelligence, intelligent vehicles have become an important part of the Internet of Things. At present, single-vehicle intelligence is gradually improving, and more and more unmanned vehicles are appearing on the road. In the future, this individual intelligence will need to be transformed into collective intelligence to realize the full advantages of unmanned driving. Individual intelligence is self-interested: without collective cooperation, a vehicle may degrade the whole traffic flow in pursuit of its own speed. Although vehicular ad hoc network technology guarantees communication between vehicles and makes cooperation between them possible, adapting such networks to coordination learning remains challenging. Coordination reinforcement learning is one of the most promising methods for solving multiagent coordination optimization problems. However, existing coordinative learning approaches usually rely on static topologies and cannot easily be adopted to solve vehicle coordination problems in dynamic environments. We propose a dynamic coordination reinforcement learning method to help vehicles make their driving decisions. First, we apply driving safety field theory to construct a dynamic coordination graph (DCG) representing the dynamic coordination behaviors among vehicles. Second, we design reinforcement learning techniques on the DCG model to implement joint optimal action reasoning for the multivehicle system and eventually derive the optimal driving policy for each vehicle. Finally, compared with other multiagent learning methods, our method improves safety and speed by about 1%, while its training speed improves by about 8%.
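To make the coordination-graph idea concrete, the sketch below illustrates joint-action reasoning on a small coordination graph. It is not the paper's algorithm: it assumes (hypothetically) that the global Q-value decomposes into pairwise terms Q_ij(a_i, a_j) over graph edges, that the action set and payoff values are toy placeholders, and that the graph is small enough to enumerate joint actions exhaustively instead of using message passing such as max-plus.

```python
import itertools

# Hypothetical driving actions (not from the paper).
ACTIONS = ["keep", "accelerate", "brake"]

def joint_q(edges, joint_action):
    """Sum the pairwise Q-values over all coordination-graph edges.

    edges maps an agent pair (i, j) to a table Q_ij[(a_i, a_j)].
    """
    return sum(q[(joint_action[i], joint_action[j])]
               for (i, j), q in edges.items())

def best_joint_action(n_agents, edges):
    """Brute-force argmax over the joint action space (feasible only
    for small graphs; real systems use variable elimination/max-plus)."""
    best, best_val = None, float("-inf")
    for ja in itertools.product(ACTIONS, repeat=n_agents):
        v = joint_q(edges, ja)
        if v > best_val:
            best, best_val = ja, v
    return best, best_val

# Toy payoff: each vehicle is rewarded for speed, but two coordinated
# vehicles are heavily penalized if both accelerate (a conflict).
speed = {"keep": 0.5, "accelerate": 1.0, "brake": 0.0}
q01 = {(a0, a1): speed[a0] + speed[a1]
                 - (6.0 if a0 == a1 == "accelerate" else 0.0)
       for a0 in ACTIONS for a1 in ACTIONS}
edges = {(0, 1): q01}

best, val = best_joint_action(2, edges)
print(best, val)  # exactly one vehicle accelerates
```

The key design point this toy illustrates is that the joint Q-function never has to be represented over the full exponential joint action space; only the pairwise tables along graph edges are stored, and the maximization exploits that structure.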