Local motion simulation using deep reinforcement learning

Author(s) - Xu Dong, Huang Xiao, Li Zhenlong, Li Xiang

Publication year - 2020

Publication title - Transactions in GIS

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.721

H-Index - 63

eISSN - 1467-9671

pISSN - 1361-1682

DOI - 10.1111/tgis.12620

Subject(s) - reinforcement learning, trajectory, local optimum, computer science, artificial intelligence, motion (physics), frame (networking), collision avoidance, collision, state space, state (computer science), action (physics), artificial neural network, deep learning, computer vision, algorithm, mathematics, telecommunications, statistics, physics, computer security, quantum mechanics, astronomy

Traditional local motion simulation focuses largely on avoiding collisions in the next frame. Because it does not look ahead, however, the global trajectories of agents often appear unreasonable. Deep reinforcement learning (DRL), which optimizes the overall reward, can correct this shortcoming of traditional local motion simulation methods. In this article, we propose a local motion simulation method integrating optimal reciprocal collision avoidance (ORCA) and DRL, referred to as ORCA‐DRL. The main idea of ORCA‐DRL is to perform local collision avoidance detection via ORCA while smoothing the trajectory via DRL. We use a deep neural network (DNN) as the state‐to‐action mapping function, where the state information is detected by virtual visual sensors and the action space comprises two continuous dimensions: speed and direction. To improve data utilization and speed up training, we use proximal policy optimization (PPO) based on the actor–critic (AC) framework to update the DNN parameters. Three scenes (circle, hallway, and crossing) are designed to evaluate the performance of ORCA‐DRL. The results reveal that, compared with ORCA, the proposed ORCA‐DRL method can: (a) reduce the total number of frames, so that agents take less time to reach their destinations; and (b) effectively avoid local optima, as evidenced by smoothed global trajectories.
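
The abstract names two ingredients that a short sketch can make concrete: a continuous action made of speed and direction, and a proximal policy optimization update. The Python below is only an illustration under stated assumptions, not the authors' implementation. The function names, the clipping value of 0.2, and the convention that direction is an angle in radians measured from the x-axis are all hypothetical choices that the abstract does not specify.

```python
import math


def action_to_velocity(speed, direction):
    """Turn a (speed, direction) action into a 2D velocity vector.

    Assumption: direction is an angle in radians from the x-axis.
    The abstract only says the action has a speed and a direction.
    """
    return (speed * math.cos(direction), speed * math.sin(direction))


def ppo_clipped_objective(new_log_probs, old_log_probs, advantages, clip_eps=0.2):
    """Mean clipped surrogate objective of PPO (a quantity to maximize).

    Each sample contributes min(r * A, clip(r, 1 - eps, 1 + eps) * A),
    where r is the probability ratio between the new and old policy.
    clip_eps=0.2 is a commonly used value, assumed here for illustration.
    """
    n = len(advantages)
    if n == 0 or len(new_log_probs) != n or len(old_log_probs) != n:
        raise ValueError("inputs must be non-empty and of equal length")
    total = 0.0
    for new_lp, old_lp, adv in zip(new_log_probs, old_log_probs, advantages):
        ratio = math.exp(new_lp - old_lp)  # pi_new(a|s) / pi_old(a|s)
        clipped = max(1.0 - clip_eps, min(1.0 + clip_eps, ratio))
        total += min(ratio * adv, clipped * adv)
    return total / n
```

In an actor–critic setup the advantages would come from the critic's value estimates, and the negative of this objective would be minimized with respect to the actor's parameters. How the resulting velocity is combined with ORCA's collision avoidance is not described in the abstract, so it is left out of this sketch.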
