Predicting Pilot Behavior in Medium Scale Scenarios Using Game Theory and Reinforcement Learning
Author(s) -
Yıldıray Yıldız,
Adrian Agogino,
Guillaume Brat
Publication year - 2013
Publication title -
NASA STI Repository (National Aeronautics and Space Administration)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.2514/6.2013-4908
Subject(s) - reinforcement learning , computer science , scale (ratio) , reinforcement , game theory , human–computer interaction , artificial intelligence , machine learning , psychology , mathematics , social psychology , mathematical economics , physics , quantum mechanics
A key element in meeting the continuing growth in air traffic is the increased use of automation. Decision support systems, computer-based information acquisition, trajectory planning systems, high-level graphic display systems, and all advisory systems are considered automation components of the next-generation (NextGen) air space [1]. The NextGen air system is expected to contain a larger number of interacting human and automation systems than today's. Improved tools and methods are needed to analyze this new situation and to predict any conflicts or unexpected results arising from increased human–human and human–automation interactions. A recent NASA report [1] lists, among others, human–automation function allocation, methods for transition of authority and responsibility as a function of operational concept, and transition from automation to human control as “highest priority research needs” for NextGen air space development.
Several methods have been developed for modeling, optimizing, and making predictions in air space systems. The Brahms agent modeling framework [2] has been used successfully to model human behavior, but it has not been used to predict possible outcomes of large-scale complex systems with human–human and human–automation interactions. For optimization, Tumer and Agogino [3] used agent-based learning to optimize air traffic flow, but they did not model pilot behavior, which is critical for predicting system outcomes.
In the proposed approach, the authors first mathematically define pilot goals in a complex system. These goals can include, for example, staying on the trajectory, not getting close to other aircraft, or having a smooth landing. The authors then use game theory and machine learning to model the outcomes of the overall system based on these pilot goals, together with other automation and environment variables.
Formally, the authors use a game-theoretic framework known as semi network-form games (SNFGs) [4] to obtain probable outcomes of a NextGen scenario with interacting humans (pilots) in the presence of advanced NextGen technologies. The focus is to show how this framework can be scaled to larger problems, which would make it applicable to a wide range of air traffic systems. Earlier implementations of this framework [4–7] proved useful for investigating strategic decision making in scenarios with two humans. In this Note, for the first time, the authors investigate a dramatically larger scenario, which includes 50 aircraft corresponding to 50 human decision makers. The method presented in the Note is a step toward predicting the effect of new technologies and procedures on the air space system by investigating pilot reactions to the new medium. These predictions can be used to evaluate the performance vs efficiency tradeoffs.
Section II explains how game theory is employed to predict complex system behavior and presents two components of the approach: level-K reasoning and reinforcement learning. Section III presents the main components of the investigated NextGen scenario: the air space and aircraft models, the pilot goals, and a general description of the scenario. Section IV provides the simulation setup details. Section V shows the simulation results for four variations of the NextGen scenario with different levels of complexity and congestion. Finally, Sec. VI concludes the Note with a summary, the main takeaways of the study, and future research directions.
