
Information Acquisition Driven by Reinforcement in Non-Deterministic Environments
Author(s) - Naresh Babu Bynagari, Ruhul Amin
Publication year - 2019
Publication title - American Journal of Trade and Policy
Language(s) - English
Resource type - Journals
eISSN - 2313-4755
pISSN - 2313-4747
DOI - 10.18034/ajtp.v6i3.569
Subject(s) - reinforcement learning, reinforcement, computer science, action (physics), markov process, markov chain, markov decision process, artificial intelligence, machine learning, mathematics, engineering, statistics, physics, structural engineering, quantum mechanics
What is the fastest way for an agent living in a non-deterministic Markov environment (NME) to learn about its statistical properties? The answer is to create "optimal" experiment sequences by carrying out action sequences that maximize expected knowledge gain. This idea is put into practice by integrating information theory with reinforcement learning techniques. Experiments demonstrate that the resulting method, reinforcement-driven information acquisition (RDIA), is substantially faster than standard random exploration at exploring particular NMEs. We studied exploration separately from exploitation and compared the performance of several reinforcement-driven information acquisition variants with that of traditional random exploration.
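To make the idea concrete, the following is a minimal sketch of exploration driven by expected information gain, not the authors' exact method. It assumes (beyond what the abstract states) that transition probabilities are estimated from Dirichlet-style pseudo-counts and that the score for trying an action is the expected KL divergence between the updated and current estimates; the function names `info_gain` and `rdia_action` are illustrative.

```python
import numpy as np

def info_gain(counts):
    """Expected KL(updated || current) over possible next states,
    for one (state, action) pair with Dirichlet pseudo-counts `counts`.
    This is one common formalization of 'expected knowledge gain'."""
    n = counts.sum()
    p = counts / n  # current estimate of the transition distribution
    gain = 0.0
    for s_next in range(len(counts)):
        c2 = counts.copy()
        c2[s_next] += 1.0            # hypothetical observation of s_next
        q = c2 / (n + 1.0)           # estimate after that observation
        kl = float(np.sum(q * np.log(q / p)))
        gain += p[s_next] * kl       # weight by predicted probability
    return gain

def rdia_action(counts_sa, state):
    """Greedy exploration: pick the action whose outcome is expected
    to change the agent's transition model the most."""
    gains = [info_gain(counts_sa[state, a])
             for a in range(counts_sa.shape[1])]
    return int(np.argmax(gains))

# Tiny demo: 2 states, 2 actions, uniform (Laplace) prior counts.
counts = np.ones((2, 2, 2))    # counts[s, a, s'] = 1
counts[0, 0] += [50.0, 0.0]    # action 0 in state 0 is already well explored
a = rdia_action(counts, 0)     # the less-explored action promises more gain
```

A random explorer would pick between the two actions uniformly; the information-gain criterion instead steers the agent toward the action whose outcome distribution is still uncertain, which is the intuition behind RDIA's speedup over random exploration.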