Model-free reinforcement learning as mixture learning
Author(s) - Nikos Vlassis, Marc Toussaint
Publication year - 2009
Publication title - Open Repository and Bibliography (University of Luxembourg)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/1553374.1553512
Subject(s) - reinforcement learning, computer science, bootstrapping, expectation–maximization algorithm, probabilistic logic, mathematical optimization, Markov decision process, maximization, stochastic approximation, approximation algorithm, artificial intelligence, machine learning, algorithm, Markov process, maximum likelihood, mathematics, statistics
We cast model-free reinforcement learning as the problem of maximizing the likelihood of a probabilistic mixture model via sampling, addressing both the infinite and the finite horizon case. We describe a Stochastic Approximation EM algorithm for likelihood maximization that, in the tabular case, is equivalent to a non-bootstrapping optimistic policy iteration algorithm akin to Sarsa(1), applicable to both MDPs and POMDPs. On the theoretical side, relating the proposed stochastic EM algorithm to the family of optimistic policy iteration algorithms provides new tools for designing and analyzing algorithms in that family. On the practical side, preliminary experiments on a POMDP problem demonstrate encouraging results.
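To make the tabular case described in the abstract concrete, the Python sketch below illustrates a non-bootstrapping, Sarsa(1)-style optimistic policy iteration update: every visited state-action pair is moved toward the full discounted Monte Carlo return with a Robbins-Monro step size, while actions are drawn from a softmax policy over the current Q-values. This is a minimal illustration of the update family the abstract relates to, not the authors' SAEM derivation or their POMDP experiments; the five-state chain MDP, the step and softmax_policy helpers, and the temperature and step-size choices are all hypothetical choices made for the example.

    # Sketch of a tabular, non-bootstrapping (lambda = 1) optimistic policy
    # iteration update in the style of Sarsa(1). Hypothetical toy problem;
    # not the paper's SAEM algorithm.
    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical 5-state chain MDP: action 0 moves left, action 1 moves
    # right; reaching the rightmost state yields reward 1 and terminates.
    N_STATES, N_ACTIONS, GAMMA = 5, 2, 0.95

    def step(s, a):
        s_next = min(s + 1, N_STATES - 1) if a == 1 else max(s - 1, 0)
        reward = 1.0 if s_next == N_STATES - 1 else 0.0
        done = s_next == N_STATES - 1
        return s_next, reward, done

    def softmax_policy(q_row, temperature):
        # Numerically stable softmax over the Q-values of one state.
        z = q_row / temperature
        p = np.exp(z - z.max())
        return p / p.sum()

    Q = np.zeros((N_STATES, N_ACTIONS))
    counts = np.zeros((N_STATES, N_ACTIONS))  # visit counts for 1/n step sizes

    for episode in range(2000):
        s, traj, done, t = 0, [], False, 0
        while not done and t < 50:
            a = rng.choice(N_ACTIONS, p=softmax_policy(Q[s], temperature=0.2))
            s_next, r, done = step(s, a)
            traj.append((s, a, r))
            s, t = s_next, t + 1
        # Non-bootstrapping update: each visited (s, a) is pulled toward the
        # full discounted return G observed from that point on, with a
        # stochastic-approximation step size alpha = 1 / n(s, a). The policy
        # improves (softly, via the softmax over Q) while evaluation is still
        # incomplete -- the "optimistic" in optimistic policy iteration.
        G = 0.0
        for s_t, a_t, r_t in reversed(traj):
            G = r_t + GAMMA * G
            counts[s_t, a_t] += 1
            alpha = 1.0 / counts[s_t, a_t]
            Q[s_t, a_t] += alpha * (G - Q[s_t, a_t])

    print(np.round(Q, 2))  # Q should come to favor action 1 (right) everywhere

Because the target is the complete observed return rather than a bootstrapped one-step estimate, the same update applies unchanged when the state s is replaced by an observation history, which is what makes this family usable in POMDPs as well as MDPs.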