
Gittins index based control policy for a class of pursuit‐evasion problems
Author(s) - Tan Cheng, Xu Changbao, Yang Lin, Wong Wing Shing
Publication year - 2018
Publication title - IET Control Theory and Applications
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.059
H-Index - 108
eISSN - 1751-8652
pISSN - 1751-8644
DOI - 10.1049/iet-cta.2017.0398
Subject(s) - pursuer , pursuit evasion , curse of dimensionality , heuristic , mathematical optimization , dynamic programming , evasion (ethics) , class (philosophy) , computer science , optimal control , control theory (sociology) , control (management) , mathematics , artificial intelligence , immune system , immunology , biology
In this study, the authors develop a novel approach to a class of pursuit‐evasion problems modelled as discrete‐time feedback control systems in which the opposing parties have asymmetric capabilities. The evader's control policy, described as a random variable, is assumed to be unknown to the pursuer. The problem is formulated as a quadratic optimisation problem from the pursuer's perspective. Because of the curse of dimensionality, it cannot be solved in practice by dynamic programming, so the authors reformulate it as a multi‐armed bandit problem. They propose a heuristic policy based on the Gittins index, which can be computed by forward induction. Simulation results show that the proposed policy outperforms a random decision policy.
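
For readers unfamiliar with the Gittins index, the following is a minimal, generic sketch of how such an index can be computed for a single bandit arm. It is not the authors' algorithm: the abstract describes a forward-induction computation for a quadratic-cost pursuit‐evasion model, whereas this sketch assumes a finite-state Markov arm with known rewards and uses the restart-in-state formulation (value iteration), a different standard technique. The function name `gittins_indices`, the transition matrix `P`, the reward vector `r`, and the discount factor `beta` are all illustrative assumptions.

```python
import numpy as np


def gittins_indices(P, r, beta, tol=1e-10, max_iter=100_000):
    """Gittins index of every state of one finite-state Markov arm.

    Illustrative assumptions (not taken from the paper):
      P    -- (n, n) row-stochastic transition matrix of the arm
      r    -- length-n vector of per-state rewards
      beta -- discount factor in (0, 1)

    Uses the restart-in-state formulation: for a reference state i, solve
        V(j) = max(r[j] + beta * P[j] @ V,   # continue from state j
                   r[i] + beta * P[i] @ V)   # restart from state i
    by value iteration; the index of i is then (1 - beta) * V(i).
    """
    P = np.asarray(P, dtype=float)
    r = np.asarray(r, dtype=float)
    n = len(r)
    indices = np.empty(n)
    for i in range(n):
        V = np.zeros(n)
        for _ in range(max_iter):
            cont = r + beta * (P @ V)          # value of continuing from each state
            V_new = np.maximum(cont, cont[i])  # or restarting in state i
            converged = np.max(np.abs(V_new - V)) < tol
            V = V_new
            if converged:
                break
        indices[i] = (1.0 - beta) * V[i]
    return indices


# Toy arm: state 0 pays 1 forever; state 1 pays 0 once, then moves to state 0.
P_example = [[1.0, 0.0],
             [1.0, 0.0]]
r_example = [1.0, 0.0]
example_indices = gittins_indices(P_example, r_example, beta=0.9)
print(example_indices)
```

In a multi-armed bandit setting, an index policy of this kind would play, at each step, the arm whose current state has the largest index. How the pursuer's decisions map onto arms in the paper is not specified in the abstract, so that mapping is left out here.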