Open Access

Distributed Reinforcement Learning in Emergency Response Simulation

Author(s) - Cesar Lopez, Jose R. Marti, Sarbjit Sarkaria

Publication year - 2018

Publication title - IEEE Access

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.587

H-Index - 127

ISSN - 2169-3536

DOI - 10.1109/access.2018.2878894

Subject(s) - aerospace , bioengineering , communication, networking and broadcast technologies , components, circuits, devices and systems , computing and processing , engineered materials, dielectrics and plasmas , engineering profession , fields, waves and electromagnetics , general topics for engineers , geoscience , nuclear engineering , photonics and electrooptics , power, energy and industry applications , robotics and control systems , signal processing and analysis , transportation

This paper presents the implementation of a coordinated decision-making agent for emergency response scenarios. The agent is implemented with reinforcement learning (RL), a machine learning technique that enables an agent to learn through experimentation. The agent’s learning is based on rewards, feedback signals proportional to how good its actions are. The simulation platform used was the Infrastructure Interdependencies Simulator, on which we tested the suitability of the approach in previous studies. In this paper, we add new features to our previous solution to enable faster convergence and distributed processing. These additions include an enhanced reward scheme and a scheduler for orchestrating the distributed training. We include two test cases. The first case is a compact model with four critical infrastructures. In this model, the agent’s training required only 10% of the attempts needed in our previous version. The improvement in convergence comes from adding a shaping reward scheme. We trained the agent across 24 simultaneous configurations of our model. The training process took 4 min. The extended case included more infrastructures and a higher level of detail. The dimensionality of the problem grew by a factor of 4000, but the training converged in fewer episodes. We tested the extended model over 96 parallel instances (potential scenarios), with completion in 2.87 min. The results show fast and stable convergence. This agent can help during multiple stages of emergency response, including real-time situations.
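
The abstract does not say how the shaping reward is built, so the following is only a minimal sketch of the general idea, assuming potential-based shaping on a hypothetical toy chain rather than the paper's simulator. Every name here (`N_STATES`, `potential`, `step`, `train`, `greedy_policy`) and every constant is an illustrative assumption, not taken from the paper.

```python
import random

# Hypothetical toy problem (not the paper's model): a chain of states 0..7,
# where the agent starts at 0 and the episode ends on reaching state 7.
N_STATES = 8
ACTIONS = (-1, +1)   # move left or move right
GAMMA = 0.95         # discount factor (assumed value)


def potential(state):
    # Assumed potential function: states closer to the goal score higher.
    return state / (N_STATES - 1)


def step(state, action):
    # Deterministic transition, clipped at both ends of the chain.
    next_state = min(max(state + action, 0), N_STATES - 1)
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0   # sparse base reward, paid only at the goal
    return next_state, reward, done


def train(use_shaping, episodes=300, alpha=0.5, epsilon=0.2, seed=0):
    # Tabular Q-learning with an epsilon-greedy behaviour policy.
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state = 0
        for _ in range(200):   # cap on steps per episode
            if rng.random() < epsilon:
                a = rng.randrange(2)
            else:
                best = max(q[state])
                a = rng.choice([i for i in range(2) if q[state][i] == best])
            next_state, reward, done = step(state, ACTIONS[a])
            if use_shaping:
                # Shaping term: discounted potential gained by this transition.
                reward += GAMMA * potential(next_state) - potential(state)
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][a] += alpha * (target - q[state][a])
            state = next_state
            if done:
                break
    return q


def greedy_policy(q):
    # Greedy action for every non-terminal state.
    return [ACTIONS[max(range(2), key=lambda i: q[s][i])]
            for s in range(N_STATES - 1)]


if __name__ == "__main__":
    print("shaped:  ", greedy_policy(train(use_shaping=True)))
    print("unshaped:", greedy_policy(train(use_shaping=False)))
```

In this sketch the shaping term gives the agent a signal on every step instead of only at the goal, which is the usual motivation for reward shaping. Whether that matches the enhanced reward scheme used in the paper cannot be determined from the abstract alone.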
