Open Access
Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
Author(s): Yaqi Duan, Martin J. Wainwright
Publication year: 2024
We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both off-line and on-line settings. Our analysis highlights two key stability properties, relating to how changes in value functions and/or policies affect the Bellman operator and occupation measures. We argue that these properties are satisfied in many continuous state-action Markov decision processes, and demonstrate how they arise naturally when using linear function approximation methods. Our analysis offers fresh perspectives on the roles of pessimism and optimism in off-line and on-line RL, and highlights the connection between off-line RL and transfer learning.
Language(s): English
