Penalty function and adaptive control of constrained finite Markov chains
Author(s) - Najim K., Poznyak A. S.
Publication year - 1998
Publication title - International Journal of Adaptive Control and Signal Processing
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.73
H-Index - 66
eISSN - 1099-1115
pISSN - 0890-6327
DOI - 10.1002/(sici)1099-1115(199811)12:7<545::aid-acs511>3.0.co;2-j
Subject(s) - markov chain, mathematical optimization, mathematics, function (biology), controller (irrigation), ergodic theory, convergence (economics), penalty method, adaptive control, markov decision process, markov process, optimal control, stochastic control, control theory (sociology), computer science, control (management), statistics, economics, mathematical analysis, evolutionary biology, artificial intelligence, agronomy, biology, economic growth
In this paper we consider the adaptive control of constrained finite ergodic controlled Markov chains whose transition probabilities are unknown. The control policy is designed to minimize a loss function subject to a set of inequality constraints. The average values of the conditional mathematical expectations of this loss function and of the constraints are also assumed to be unknown. A regularized penalty function is introduced to derive an adaptive control algorithm. In this algorithm the transition probabilities of the Markov chain and the average values of the constraints are estimated at each time n. The control policy is adjusted using the Bush–Mosteller reinforcement scheme as a stochastic approximation procedure. The asymptotic properties of the algorithm are stated. We establish that the optimal convergence rate is equal to $n^{-1/3+\delta}$, where δ is any small positive parameter. © 1998 John Wiley & Sons, Ltd.
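The two ingredients named in the abstract, estimating transition probabilities from observations and adjusting a randomized policy with a Bush–Mosteller step, can be illustrated with a minimal Python sketch. It uses the classical learning-automata form of the Bush–Mosteller update and a plain frequency estimator. It is not the paper's algorithm: the regularized penalty function, the constraint estimates, and the step-size schedule are not reproduced, and the function names, the normalized loss `xi`, and the step size `gamma` are assumptions made for illustration.

```python
def bush_mosteller_update(p, action, xi, gamma):
    """One classical Bush-Mosteller step on an action-probability vector.

    p      -- current action probabilities (non-negative, summing to 1)
    action -- index of the action that was just applied
    xi     -- normalized loss in [0, 1] (0 = best outcome, 1 = worst)
    gamma  -- step size in (0, 1]; a hypothetical tuning parameter here
    """
    n = len(p)
    new_p = []
    for j, pj in enumerate(p):
        # Under full reward (xi = 0) move toward the chosen action's vertex.
        reward_target = 1.0 if j == action else 0.0
        # Under full penalty (xi = 1) move mass evenly to the other actions.
        penalty_target = 0.0 if j == action else 1.0 / (n - 1)
        target = (1.0 - xi) * reward_target + xi * penalty_target
        new_p.append(pj + gamma * (target - pj))
    return new_p


def estimate_transitions(counts):
    """Frequency estimate of transition probabilities.

    counts[s][a][s2] is the number of observed moves from state s to
    state s2 under action a. Unvisited (s, a) pairs get a uniform row.
    """
    estimates = []
    for per_state in counts:
        rows = []
        for row in per_state:
            total = sum(row)
            if total == 0:
                rows.append([1.0 / len(row)] * len(row))
            else:
                rows.append([c / total for c in row])
        estimates.append(rows)
    return estimates
```

Because each update is a convex combination of the current vector and a point of the simplex, the result remains a probability vector for any `gamma` in (0, 1]. In an adaptive scheme of the kind the abstract describes, one such vector would be kept per state and the loss signal would come from the penalty function evaluated on the current estimates.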