Premium
Optimal policies for stochastic clearing systems with time‐dependent delay penalties
Author(s) -
He QiMing,
Bookbinder James H.,
Cai Qishu
Publication year - 2020
Publication title -
naval research logistics (nrl)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.665
H-Index - 68
eISSN - 1520-6750
pISSN - 0894-069X
DOI - 10.1002/nav.21931
Subject(s) - clearing , markov decision process , mathematical optimization , markov process , computer science , function (biology) , nonlinear system , sequence (biology) , time horizon , process (computing) , state (computer science) , control theory (sociology) , mathematics , algorithm , economics , finance , statistics , physics , genetics , control (management) , quantum mechanics , evolutionary biology , artificial intelligence , biology , operating system
We study stochastic clearing systems with a discrete‐time Markovian input process, and an output mechanism that intermittently and instantaneously clears the system partially or completely. The decision to clear the system depends on both quantities and delays of outstanding inputs. Clearing the system incurs a fixed cost, and outstanding inputs are charged a delay penalty, which is a general increasing function of the quantities and delays of individual inputs. By recording the quantities and delays of outstanding inputs in a sequence, we model the clearing system as a tree‐structured Markov decision process over both a finite and infinite horizon. We show that the optimal clearing policies, under realistic conditions, are of the on‐off type or the threshold type. Based on the characterization of the optimal policies, we develop efficient algorithms to compute parameters of the optimal policies for such complex clearing systems for the first time. We conduct a numerical analysis on the impact of the nonlinear delay penalty cost function, the comparison of the optimal policy and the classical hybrid policy (ie, quantity and age thresholds), and the impact of the state of the input process. Our experiments demonstrate that (a) the classical linear approximation of the cost function can lead to significant performance differences; (b) the classical hybrid policy may perform poorly (as compared to the optimal policies); and (c) the consideration of the state of the input process makes significant improvement in system performance.