Open Access
A Modified Policy Iteration Algorithm for Discounted Reward Markov Decision Processes
Author(s) - Sanaa Chafik, Cherki Daoui
Publication year - 2016
Publication title - International Journal of Computer Applications
Language(s) - English
Resource type - Journals
ISSN - 0975-8887
DOI - 10.5120/ijca2016908033
Subject(s) - markov decision process, computer science, state space, mathematical optimization, markov chain, algorithm, dynamic programming, markov process, process (computing), space (punctuation), machine learning, mathematics, statistics, operating system
The running time of the classical algorithms for Markov Decision Processes (MDPs) typically grows at least linearly with the size of the state space, and since the state space itself grows combinatorially in realistic problems, these algorithms frequently become intractable. This paper presents a Modified Policy Iteration algorithm that computes an optimal policy for large Markov decision processes under the discounted-reward criterion and an infinite horizon. The algorithm exploits the topology of the problem; moreover, an Open Multi-Processing (OpenMP) programming model is applied to attain efficient parallel performance when executing the modified algorithm. General Terms: Theoretical Informatics, Parallelizing ...
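The full text is not reproduced here, but the standard modified policy iteration scheme the abstract builds on can be sketched as follows. This is a minimal illustration, not the authors' implementation: all names, array shapes, and the tiny test MDP are assumptions. Instead of solving the policy-evaluation linear system exactly (as classical policy iteration does), each iteration applies `m` cheap value backups under the current greedy policy; the per-state backup loops are the part a parallel model such as OpenMP would distribute across threads.

```python
import numpy as np

def modified_policy_iteration(P, R, gamma=0.9, m=10, tol=1e-8, max_iter=1000):
    """Modified policy iteration for a discounted, infinite-horizon tabular MDP.

    P: (A, S, S) transition matrices, P[a, s, s'] = Pr(s' | s, a).
    R: (S, A) expected immediate rewards.
    Returns a greedy policy (S,) and its approximate value function (S,).
    Shapes and parameter names are illustrative, not from the paper.
    """
    A, S, _ = P.shape
    V = np.zeros(S)
    for _ in range(max_iter):
        # Greedy improvement: Q(s,a) = R(s,a) + gamma * sum_s' P(a,s,s') V(s').
        # In a parallel implementation, this sweep over states is the
        # natural loop to split across threads (e.g. an OpenMP parallel for).
        Q = R + gamma * np.einsum('ast,t->sa', P, V)
        policy = Q.argmax(axis=1)
        V_new = Q.max(axis=1)
        # Partial evaluation: m extra backups under the fixed greedy policy,
        # rather than solving (I - gamma * P_pi) V = R_pi exactly.
        P_pi = P[policy, np.arange(S)]   # (S, S): row s uses action policy[s]
        R_pi = R[np.arange(S), policy]   # (S,)
        for _ in range(m):
            V_new = R_pi + gamma * P_pi @ V_new
        if np.max(np.abs(V_new - V)) < tol:
            V = V_new
            break
        V = V_new
    return policy, V
```

With `m = 0` this reduces to value iteration, and as `m` grows it approaches exact policy iteration; the paper's contribution, per the abstract, is additionally ordering the computation by the problem's topology and parallelizing it.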
