Online Learning in Episodic Markovian Decision Processes by Relative Entropy Policy Search
Author(s) - A.M. Zimin, Gergely Neu
Publication year - 2013
Publication title - HAL (Le Centre pour la Communication Scientifique Directe)
Language(s) - English
Resource type - Conference proceedings
Subject(s) - regret, markov decision process, entropy (arrow of time), markov process, computer science, finite state, markov chain, state space, mathematics, artificial intelligence, mathematical optimization, machine learning, statistics, physics, quantum mechanics