Convergence of Sample Path Optimal Policies for Stochastic Dynamic Programming
Author(s) - Michael C. Fu, Xing Jin
Publication year - 2005
Publication title - Digital Repository at the University of Maryland (University of Maryland, College Park)
Language(s) - English
Resource type - Reports
DOI - 10.21236/ada438510
Subject(s) - convergence (economics) , sample (material) , path (computing) , dynamic programming , mathematical optimization , stochastic programming , computer science , mathematics , economics , chemistry , programming language , economic growth , chromatography
Abstract - The authors consider the solution of stochastic dynamic programs using sample path estimates. Applying the theory of large deviations, they derive probability error bounds associated with the convergence of the estimated optimal policy to the true optimal policy for finite-horizon problems. These bounds decay at an exponential rate, in contrast with the canonical inverse square root rate associated with estimation of the value (cost-to-go) function itself. These results have practical implications for Monte Carlo simulation-based approaches to stochastic dynamic programming problems in which it is impractical to extract the explicit transition probabilities of the underlying system model.
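The contrast between the two rates can be illustrated with a toy experiment. The sketch below is not the authors' method or their finite-horizon setting: it assumes a single-stage problem with two actions whose costs are Gaussian with hypothetical means 1.0 and 1.2 and a common standard deviation, and it picks the action with the smaller sample-mean cost. The function names `simulate` and `exact_wrong_prob` and all parameter values are illustrative assumptions. Under these assumptions the probability of picking the wrong action has a closed form that shrinks exponentially in the sample size n, while the root-mean-square error of the estimated cost shrinks only like 1/sqrt(n).

```python
import math
import random


def simulate(n, reps=1000, means=(1.0, 1.2), sigma=1.0, seed=0):
    """Monte Carlo estimate of (P[wrong action chosen], RMSE of the cost estimate).

    Toy single-stage problem (an assumption, not the report's model): each
    action's cost is Gaussian, and the sample-path policy picks the action
    with the smallest sample-mean cost over n draws.
    """
    rng = random.Random(seed)
    best = min(range(len(means)), key=lambda a: means[a])
    wrong = 0
    sq_err = 0.0
    for _ in range(reps):
        # Sample-mean cost estimate for each action from n independent draws.
        est = [sum(rng.gauss(m, sigma) for _ in range(n)) / n for m in means]
        chosen = min(range(len(means)), key=lambda a: est[a])
        wrong += chosen != best
        # Error of the cost estimate for the truly optimal action.
        sq_err += (est[best] - means[best]) ** 2
    return wrong / reps, math.sqrt(sq_err / reps)


def exact_wrong_prob(n, gap=0.2, sigma=1.0):
    """Closed-form P[wrong action] for two Gaussian actions with the given gap.

    The difference of the two sample means is N(gap, 2 * sigma**2 / n), and
    the wrong action is chosen when that difference is negative.
    """
    z = gap * math.sqrt(n) / (sigma * math.sqrt(2.0))
    return 0.5 * math.erfc(z / math.sqrt(2.0))


if __name__ == "__main__":
    for n in (25, 100, 400):
        p_wrong, rmse = simulate(n)
        print(f"n={n:4d}  P(wrong) sim={p_wrong:.4f}  "
              f"exact={exact_wrong_prob(n):.4f}  value RMSE={rmse:.4f}")
```

In this toy setting, quadrupling n halves the value RMSE, whereas the wrong-action probability falls much faster, because the Gaussian tail behaves roughly like exp(-n * gap**2 / (4 * sigma**2)).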