Value iteration for average cost Markov decision processes in Borel spaces
Author(s) -
Zhu Quan-xin,
Xianping Guo
Publication year - 2005
Publication title -
Applied Mathematics Research eXpress
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.763
H-Index - 20
eISSN - 1687-1200
pISSN - 1687-1197
DOI - 10.1155/amrx.2005.61
Subject(s) - mathematics , average cost , markov decision process , mathematical optimization , bounded function , action (physics) , value (mathematics) , upper and lower bounds , markov process , statistics , economics , mathematical analysis , physics , neoclassical economics , quantum mechanics
This paper deals with the value iteration algorithm (VIA) for average cost Markov decision processes in Borel state and action spaces. The costs may have neither upper nor lower bounds, in contrast to the nonnegative (or bounded-below) costs widely assumed in the previous literature. We propose a set of conditions that is weaker than those in the previous literature. Under these conditions, we first establish the average cost optimality equation. Then, under an additional condition, we show that the VIA yields the optimal (minimum) average cost, an average optimal stationary policy, and a solution to the average cost optimality equation. Finally, we use an example to illustrate our conditions.
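As a rough illustration of the three outputs the abstract attributes to the VIA (optimal average cost, a stationary policy, and a solution to the optimality equation), the sketch below runs relative value iteration on a tiny finite MDP. This is only an assumed finite-state analogue, not the paper's Borel-space algorithm or its conditions; the function name `relative_value_iteration`, the arrays `P` and `c`, and the reference-state normalization are all hypothetical choices made for the example. One cost is negative to echo the point that costs need not be bounded below by zero.

```python
def relative_value_iteration(P, c, ref_state=0, tol=1e-10, max_iter=10_000):
    """Relative value iteration for a finite average-cost MDP (illustrative only).

    P[a][s][t] : probability of moving from state s to state t under action a
    c[s][a]    : one-step cost of taking action a in state s (may be negative)
    Returns (gain, h, policy), where gain + h[s] = min_a Q[s][a] approximately.
    """
    n_states = len(c)
    n_actions = len(c[0])
    h = [0.0] * n_states  # relative value function, normalized so h[ref_state] = 0

    def q_values(h):
        # Q[s][a] = c[s][a] + sum_t P[a][s][t] * h[t]
        return [[c[s][a] + sum(P[a][s][t] * h[t] for t in range(n_states))
                 for a in range(n_actions)]
                for s in range(n_states)]

    for _ in range(max_iter):
        Th = [min(row) for row in q_values(h)]
        # Subtract the value at the reference state to keep iterates bounded.
        h_new = [v - Th[ref_state] for v in Th]
        converged = max(abs(x - y) for x, y in zip(h_new, h)) < tol
        h = h_new
        if converged:
            break

    Q = q_values(h)
    gain = min(Q[ref_state]) - h[ref_state]          # estimated optimal average cost
    policy = [row.index(min(row)) for row in Q]      # greedy stationary policy
    return gain, h, policy


# Hypothetical two-state, two-action example.
P = [
    [[0.5, 0.5], [0.5, 0.5]],  # action 0
    [[0.9, 0.1], [0.9, 0.1]],  # action 1
]
c = [
    [1.0, 2.0],    # costs in state 0 for actions 0, 1
    [-1.0, 0.5],   # costs in state 1 for actions 0, 1
]
gain, h, policy = relative_value_iteration(P, c)
```

For this toy instance the triple `(gain, h)` satisfies the finite-state average cost optimality equation `gain + h[s] = min_a Q[s][a]`, and `policy` picks the minimizing action in each state. Convergence here relies on the chain being aperiodic and the state space finite, which are much stronger assumptions than those the paper works under.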
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom