Delayed reward‐based genetic algorithms for partially observable Markov decision problems
Author(s) - Yamashiro Yoshihide, Ueno Atsushi, Takeda Hideaki
Publication year - 2004
Publication title - Systems and Computers in Japan
Language(s) - English
Resource type - Journals
eISSN - 1520-684X
pISSN - 0882-1666
DOI - 10.1002/scj.10230
Subject(s) - partially observable markov decision process, reinforcement learning, computer science, markov decision process, genetic algorithm, aliasing, markov chain, artificial intelligence, perception, markov model, observable, selection (genetic algorithm), markov process, mathematical optimization, machine learning, algorithm, mathematics, psychology, statistics, physics, quantum mechanics, neuroscience, undersampling
Reinforcement learning often assumes that the environment has the Markov property. However, the agent cannot always observe the environment completely, and in such cases different states are observed as the same state (perceptual aliasing). In this research, the authors develop a Delayed Reward‐based Genetic Algorithm for POMDP (DRGA) as a means of solving partially observable Markov decision problems (POMDPs) that exhibit perceptual aliasing. The DRGA breaks the POMDP down into several subtasks, and then solves it by dividing the agent into several subagents. Each subagent acquires policies adapted to the environment based on delayed rewards from the environment, and these policies are evolved using a genetic algorithm driven by those delayed rewards. The agent adapts to the environment by combining the effective policies that survive natural selection. The authors apply this method to maze search problems with limited perception in order to demonstrate its validity. © 2004 Wiley Periodicals, Inc. Syst Comp Jpn, 35(2): 66–78, 2004; Published online in Wiley InterScience (www.interscience.wiley.com). DOI 10.1002/scj.10230
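
The two ideas the abstract relies on, perceptual aliasing and policy evolution from delayed rewards, can be illustrated with a minimal Python sketch. This is not the authors' DRGA: it omits the subtask and subagent decomposition and uses a plain genetic algorithm over memoryless observation-to-action policies. The corridor environment, the `1 / step` reward, and every name and parameter below (`observe`, `episode_reward`, `evolve`, population size, mutation rate) are assumptions made only for illustration. The aliasing here is also benign, since every aliased cell calls for the same action; harder cases are what motivate the decomposition described in the abstract.

```python
import random

# Hypothetical 1-D corridor: cells 0..4, goal at cell 4.
# The agent senses only whether a wall is directly to its left or right,
# so interior cells 1, 2 and 3 look identical (perceptual aliasing).
N_CELLS = 5
GOAL = 4
ACTIONS = (-1, +1)  # move left, move right


def observe(cell):
    """Return (wall_left, wall_right); the cell index itself is hidden."""
    return (cell == 0, cell == N_CELLS - 1)


OBSERVATIONS = sorted({observe(c) for c in range(N_CELLS)})


def episode_reward(policy, start=0, max_steps=20):
    """Run one episode; the only reward arrives at the goal (delayed reward)."""
    cell = start
    for step in range(1, max_steps + 1):
        action = policy[observe(cell)]
        cell = min(max(cell + action, 0), N_CELLS - 1)  # walls block movement
        if cell == GOAL:
            return 1.0 / step  # assumed shaping: earlier arrival scores higher
    return 0.0


def evolve(pop_size=8, generations=10, seed=0):
    """Evolve observation-to-action tables, using episode reward as fitness."""
    rng = random.Random(seed)
    population = [
        {obs: rng.choice(ACTIONS) for obs in OBSERVATIONS}
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        # Selection: keep the better half of the population unchanged.
        ranked = sorted(population, key=episode_reward, reverse=True)
        survivors = ranked[: pop_size // 2]
        children = []
        for _ in range(pop_size - len(survivors)):
            a, b = rng.sample(survivors, 2)
            # Uniform crossover: each entry is copied from one parent.
            child = {obs: rng.choice((a[obs], b[obs])) for obs in OBSERVATIONS}
            # Mutation: occasionally re-draw the action for one observation.
            if rng.random() < 0.2:
                child[rng.choice(OBSERVATIONS)] = rng.choice(ACTIONS)
            children.append(child)
        population = survivors + children
    return max(population, key=episode_reward)
```

Calling `evolve()` returns the fittest policy table found, and `episode_reward` scores it. Because the policy is a table keyed by observation rather than by cell, any policy in this sketch necessarily takes the same action in all three aliased interior cells.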
