
Autonomous air combat decision‐making of UAV based on parallel self‐play reinforcement learning
Author(s) -
Li Bo,
Huang Jingyi,
Bai Shuangxia,
Gan Zhigang,
Liang Shiyang,
Evgeny Neretin,
Yao Shouwen
Publication year - 2023
Publication title -
CAAI Transactions on Intelligence Technology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.613
H-Index - 15
ISSN - 2468-2322
DOI - 10.1049/cit2.12109
Subject(s) - air combat , reinforcement learning , artificial neural network , Lyapunov function , missile , artificial intelligence , machine learning , computer science , aerospace engineering
To address the problem of manoeuvring decision‐making in UAV air combat, this study establishes a one‐to‐one air combat model, defines missile attack areas, and uses the stochastic‐policy Soft Actor‐Critic (SAC) deep reinforcement learning algorithm to construct a decision model that realises the manoeuvring process. The computational complexity of the proposed algorithm is calculated, and the stability of the closed‐loop air combat decision‐making system controlled by the neural network is analysed with a Lyapunov function. The study formulates the UAV air combat process as a game and proposes a Parallel Self‐Play SAC algorithm (PSP‐SAC) to improve the generalisation of UAV control decisions. Simulation results show that the proposed algorithm enables sample sharing and policy sharing across multiple combat environments and significantly improves the generalisation ability of the model compared with independent training.
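The abstract's core mechanism can be illustrated in outline: several engagement environments run in parallel, all of them writing transitions into one shared replay buffer (sample sharing) and all acting with one centrally updated policy (policy sharing). The sketch below is a minimal toy illustration of that training loop, not the paper's PSP‐SAC implementation: the environment dynamics, reward, and policy (a single biased‐coin parameter standing in for the SAC actor/critic networks) are all hypothetical placeholders.

```python
# Toy sketch of the parallel self-play idea: N environments -> one shared
# replay buffer -> one shared policy updated from pooled samples.
# All names and dynamics here are illustrative assumptions.
import random
from collections import deque

class SharedReplayBuffer:
    """One buffer receiving transitions from every parallel environment."""
    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def push(self, transition):
        self.buffer.append(transition)

    def sample(self, batch_size):
        return random.sample(self.buffer, min(batch_size, len(self.buffer)))

class SharedPolicy:
    """Single stochastic policy used by all environments (policy sharing).
    A real SAC policy is a neural network; a biased coin stands in here."""
    def __init__(self):
        self.p_right = 0.5  # probability of choosing manoeuvre "right" (action 1)

    def act(self, state):
        return 1 if random.random() < self.p_right else 0

    def update(self, batch):
        # Placeholder for the SAC actor/critic update: nudge the policy
        # toward actions that received positive reward in the batch.
        positives = [a for (_, a, r) in batch if r > 0]
        if positives:
            target = sum(positives) / len(positives)
            self.p_right += 0.1 * (target - self.p_right)

class ToyCombatEnv:
    """Stand-in 1-v-1 engagement: action 1 ("right") is rewarded in this env."""
    def __init__(self, seed):
        self.rng = random.Random(seed)

    def step(self, action):
        state = self.rng.random()
        reward = 1.0 if action == 1 else -1.0
        return state, reward

def train_psp_sketch(n_envs=4, steps=200, batch_size=32, seed=0):
    random.seed(seed)
    buffer = SharedReplayBuffer()
    policy = SharedPolicy()
    envs = [ToyCombatEnv(seed + i) for i in range(n_envs)]
    for _ in range(steps):
        for env in envs:                           # parallel rollouts (serialised here)
            state = 0.0
            action = policy.act(state)
            _, reward = env.step(action)
            buffer.push((state, action, reward))   # sample sharing
        policy.update(buffer.sample(batch_size))   # policy sharing
    return policy

policy = train_psp_sketch()
```

Because every environment feeds the same buffer, each update sees experience from all engagements at once, which is the source of the generalisation gain claimed over independent per‐environment training.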