Open Access
Taming "data-hungry" reinforcement learning? Stability in continuous state-action spaces
Author(s)
Yaqi Duan,
Martin J. Wainwright
Publication year: 2024
We introduce a novel framework for analyzing reinforcement learning (RL) in continuous state-action spaces, and use it to prove fast rates of convergence in both off-line and on-line settings. Our analysis highlights two key stability properties, relating to how changes in value functions and/or policies affect the Bellman operator and occupation measures. We argue that these properties are satisfied in many continuous state-action Markov decision processes, and demonstrate how they arise naturally when using linear function approximation methods. Our analysis offers fresh perspectives on the roles of pessimism and optimism in off-line and on-line RL, and highlights the connection between off-line RL and transfer learning.
Language(s): English