Branching improved Deep Q Networks for solving pursuit-evasion strategy solution of spacecraft

Liu Bingyan; Xiongbing Ye; Xianzhou Dong; Lei Ni

Open Access

Publication year - 2021

Publication title - Journal of Industrial and Management Optimization

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.325

H-Index - 32

eISSN - 1553-166X

pISSN - 1547-5816

DOI - 10.3934/jimo.2021016

Subject(s) - computer science, rendezvous, pursuit evasion, differential game, mathematical optimization, deep space exploration, NASA Deep Space Network, reinforcement learning, spacecraft, artificial intelligence, mathematics, engineering, aerospace engineering

As space rendezvous technology develops, the spacecraft orbital pursuit-evasion differential game has attracted growing attention. We propose a pursuit-evasion game algorithm based on branching improved Deep Q Networks to obtain a space rendezvous strategy for a non-cooperative target. First, we transform the optimal control problem of space rendezvous between a spacecraft and a non-cooperative target into a survivable differential game. Next, to solve this game, we construct a Nash equilibrium strategy and verify its existence and uniqueness. Then, to avoid the curse of dimensionality that Deep Q Networks face in a continuous action space, we construct a TSK (Takagi-Sugeno-Kang) fuzzy inference model to represent the continuous space. Finally, to address the complex and time-consuming self-learning over discrete action sets, we improve the Deep Q Networks algorithm and propose a branching architecture with multiple groups of parallel neural networks and shared decision modules. Simulation results show that the algorithm combines optimal control with game theory and further improves the learning ability for discrete actions. The algorithm has a comparative advantage in continuous-space decision-making, can effectively handle the continuous-space pursuit game, and provides a new approach to solving spacecraft orbital pursuit-evasion strategies.
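The abstract does not give implementation details, so the following is only a minimal sketch of how a TSK fuzzy layer and per-rule branches could fit together. It is not the authors' implementation. It assumes a scalar state, Gaussian membership functions, and one branch per fuzzy rule, and it replaces the parallel neural networks with a hypothetical fixed Q-table. All names (`centers`, `width`, `actions`, `q_values`) are illustrative assumptions.

```python
import math

def gaussian_membership(x, center, width):
    """Degree to which input x belongs to the fuzzy set centered at `center`."""
    return math.exp(-((x - center) ** 2) / (2.0 * width ** 2))

def firing_strengths(x, centers, width):
    """Normalized firing strength of each fuzzy rule for a scalar input x."""
    raw = [gaussian_membership(x, c, width) for c in centers]
    total = sum(raw)
    return [r / total for r in raw]

def greedy_branch_actions(q_values, actions):
    """For each branch independently, pick the discrete action with the highest Q-value."""
    chosen = []
    for branch_q in q_values:
        best_index = max(range(len(actions)), key=lambda i: branch_q[i])
        chosen.append(actions[best_index])
    return chosen

def tsk_output(strengths, chosen_actions):
    """Zero-order TSK output: strength-weighted average of the per-rule actions.
    The strengths are already normalized, so a weighted sum is enough."""
    return sum(w * a for w, a in zip(strengths, chosen_actions))

# Hypothetical setup: three fuzzy rules, three discrete actions per rule.
centers = [-1.0, 0.0, 1.0]   # assumed membership-function centers
width = 0.5                  # assumed common membership-function width
actions = [-1.0, 0.0, 1.0]   # assumed discrete action set shared by all branches

# Stand-in for the outputs of the parallel branch networks (one row per branch).
q_values = [
    [0.9, 0.1, 0.0],   # branch 0 prefers action -1.0
    [0.2, 0.7, 0.1],   # branch 1 prefers action  0.0
    [0.0, 0.3, 0.8],   # branch 2 prefers action  1.0
]

state = 0.25
strengths = firing_strengths(state, centers, width)
chosen = greedy_branch_actions(q_values, actions)
control = tsk_output(strengths, chosen)   # continuous control value

# Output count: branching grows linearly in the number of branches,
# while one joint output per action combination grows exponentially.
branch_outputs = len(centers) * len(actions)
joint_outputs = len(actions) ** len(centers)
print(control, branch_outputs, joint_outputs)
```

In this sketch each branch chooses among a small discrete set, and the fuzzy weighting blends those choices into a continuous control value. With three rules and three actions, the branches need 9 outputs in total, whereas a single network scoring every joint combination would need 27. That gap is one plausible reading of why a branching architecture speeds up learning over discrete action sets.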
