Open Access
Exploration–Exploitation in MDPs with Options
Author(s) - Ronan Fruit, Alessandro Lazaric
Publication year - 2017
Publication title - hal (le centre pour la communication scientifique directe)
Language(s) - English
Resource type - Conference proceedings
Subject(s) - regret, Markov decision process, reinforcement learning, computer science, simple (philosophy), online learning, Markov chain, artificial intelligence, upper and lower bounds, Q-learning, empirical evidence, Markov process, machine learning, mathematical optimization, mathematics, mathematical analysis, philosophy, statistics, epistemology, world wide web
