A VALUE ITERATION IN CONTROLLED DIFFUSION PROCESSES ASSOCIATED WITH STOPPING
Author(s) -
Yūji Yoshida
Publication year - 1986
Publication title -
Bulletin of Informatics and Cybernetics
Language(s) - English
Resource type - Journals
eISSN - 2435-743X
pISSN - 0286-522X
DOI - 10.5109/13377
Subject(s) - value (mathematics) , diffusion , mathematics , optimal stopping , statistics , physics , thermodynamics
The present paper deals with an optimal control problem in controlled diffusion processes with stopping times. We present a value iteration for the optimization problem associated with controls and stopping times. Furthermore, we investigate the relations among the value iteration, optimal stopping times and Bellman's equation.

0. Introduction

Stopped decision problems in the discrete-time case have been studied by Dubins and Savage [2], Furukawa [3, 4], Hordijk [5] and others. On the other hand, Krylov [6], Nisio [7], Ohtsubo [8] and others have studied optimization problems in controlled Markov processes with stopping. In particular, [4] studied a value iteration for finding an optimal reward function in decision problems, and Doshi [1] treated it in controlled Markov processes. This paper investigates the properties of the iteration and its relation to Bellman's equation in the case of controlled diffusion processes with stopping times over an infinite horizon.

In Section 1 we introduce diffusion processes associated with stochastic differential equations. In Section 2 we consider an optimal control problem associated with stopping times. In Section 3 we give and study a value iteration to find solutions of the problem. In Section 4 we investigate the relation between the value iteration and Bellman's equation.

1. Controlled Diffusion Processes

Let $R_+$ be the set of all nonnegative real numbers, the time space. Let $R^d$ be $d$-dimensional Euclidean space and let $E = R_+ \times R^d$. $\mathcal{B}$ denotes the field of Borel subsets of $E$. Let $\Omega$ be the set of all continuous mappings $w : R_+ \to R^d$. A mapping $x(t)$ is given by $w \mapsto x(t, w) = w(t)$ for $t \in R_+$ and $w \in \Omega$. $\mathcal{F}$ denotes the smallest $\sigma$-algebra generated by $\{x(t) : t \in R_+\}$. Let $P$ be a probability measure on $(\Omega, \mathcal{F})$. $G$ is a compact subset of a separable metric space, the action space. Now we consider a controlled stochastic differential equation.
For $(r, t, x) \in G \times E$,

* Electron Optics Technical and Engineering Division, Japan Electron Optics Laboratory Ltd., 1418 Nakagami Akishima, Tokyo 196, Japan.
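The idea of a value iteration for a controlled problem with stopping, as described in the introduction, can be illustrated in a much simpler setting than the paper's. The following is a minimal sketch, assuming a finite state space, discrete time and a discount factor `beta`; it is not the continuous-time construction of the paper. All names (`bellman_step`, `value_iteration`, `run_reward`, `stop_reward`, `trans`) and the toy random-walk model are hypothetical and chosen only for illustration. Each sweep replaces $v$ by $\max\{g, \max_a [f(a, \cdot) + \beta P_a v]\}$, where $g$ is the stopping reward, $f$ the running reward and $P_a$ the transition kernel under action $a$.

```python
def bellman_step(v, n_states, actions, run_reward, stop_reward, trans, beta):
    """One sweep of v -> max(g, max_a [f(a, .) + beta * P_a v])."""
    new_v = []
    for x in range(n_states):
        # Best expected value of continuing for one more step from state x.
        cont = max(
            run_reward(a, x) + beta * sum(p * v[y] for y, p in trans(a, x))
            for a in actions
        )
        # Either stop now and collect g(x), or continue with the best action.
        new_v.append(max(stop_reward(x), cont))
    return new_v


def value_iteration(n_states, actions, run_reward, stop_reward, trans,
                    beta=0.9, tol=1e-10, max_iter=10_000):
    """Repeat bellman_step from v_0 = g until successive iterates agree."""
    v = [stop_reward(x) for x in range(n_states)]  # v_0: stop immediately
    for _ in range(max_iter):
        new_v = bellman_step(v, n_states, actions, run_reward,
                             stop_reward, trans, beta)
        if max(abs(a - b) for a, b in zip(new_v, v)) < tol:
            return new_v
        v = new_v
    return v


# Hypothetical toy model: a controlled random walk on {0, ..., N - 1}.
N = 11
ACTIONS = (0.3, 0.7)  # an action is the probability of stepping upward


def run_reward(a, x):
    # Assumed running cost: pushing upward harder is more expensive.
    return -0.5 * a


def stop_reward(x):
    # Assumed stopping reward: higher states pay more.
    return float(x)


def trans(a, x):
    # Step up with probability a, down otherwise; the walk stays at the ends.
    return [(min(x + 1, N - 1), a), (max(x - 1, 0), 1.0 - a)]


v = value_iteration(N, ACTIONS, run_reward, stop_reward, trans)

# States where stopping is at least as good as continuing (v equals g).
stop_region = [x for x in range(N) if v[x] <= stop_reward(x) + 1e-9]
```

In this sketch the iterates are nondecreasing because the sweep is monotone and the first sweep cannot fall below $v_0 = g$, and the sweep is a contraction in the supremum norm for `beta` below 1, so the loop terminates near a fixed point. That fixed point satisfies the discrete analogue of Bellman's equation, and `stop_region` approximates the set on which stopping immediately is optimal.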