Open Access
Policy gradient methods
Author(s) - Jan Peters
Publication year - 2010
Publication title - Scholarpedia
Language(s) - English
Resource type - Journals
ISSN - 1941-6016
DOI - 10.4249/scholarpedia.3698
Subject(s) - computer science
Policy gradient methods are a class of reinforcement learning techniques that optimize parametrized policies with respect to the expected return (long-term cumulative reward) by gradient descent. They avoid many of the problems that have marred traditional reinforcement learning approaches, such as the lack of guarantees of a value function, the intractability resulting from uncertain state information, and the complexity arising from continuous states and actions.
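
To make the idea of optimizing a parametrized policy by following the gradient of expected return more concrete, here is a minimal sketch of a likelihood-ratio (REINFORCE-style) update. It is not taken from the article: the two-armed bandit task, the softmax policy, the running-average baseline, and the names `softmax` and `train_bandit` are all assumptions chosen for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn a list of action preferences into action probabilities."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def train_bandit(mean_rewards, steps=2000, lr=0.1, noise_std=0.5, seed=0):
    """Hypothetical example: likelihood-ratio policy gradient on a bandit.

    The policy is a softmax over one preference parameter per arm.
    Rewards are Gaussian around `mean_rewards` (an assumed toy task).
    Returns the action probabilities after training.
    """
    rng = random.Random(seed)
    n_actions = len(mean_rewards)
    theta = [0.0] * n_actions  # policy parameters
    baseline = 0.0             # running average reward, reduces variance

    for t in range(1, steps + 1):
        probs = softmax(theta)
        action = rng.choices(range(n_actions), weights=probs)[0]
        reward = mean_rewards[action] + rng.gauss(0.0, noise_std)

        # For a softmax policy, d/d theta_k log pi(action) = 1[k == action] - probs[k].
        # Step along the sampled gradient estimate of the expected reward.
        for k in range(n_actions):
            grad_log = (1.0 if k == action else 0.0) - probs[k]
            theta[k] += lr * (reward - baseline) * grad_log

        # Incremental mean of the rewards seen so far.
        baseline += (reward - baseline) / t

    return softmax(theta)


if __name__ == "__main__":
    final_probs = train_bandit([0.0, 1.0])
    print("action probabilities after training:", final_probs)
```

In this sketch the policy's probability mass should shift toward the arm with the higher mean reward. No value function is learned; the only learned quantities are the policy parameters, with the baseline serving purely to reduce the variance of the gradient estimate.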
