Designing Internal Reward of Reinforcement Learning Agents in Multi-Step Dilemma Problem
Author(s) - Y. Ichikawa, Keiki Takadama
Publication year - 2013
Publication title - Journal of Advanced Computational Intelligence and Intelligent Informatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.172
H-Index - 20
eISSN - 1883-8014
pISSN - 1343-0130
DOI - 10.20965/jaciii.2013.p0926
Subject(s) - reinforcement learning, dilemma, computer science, convergence (economics), value (mathematics), reinforcement, reward system, mathematical optimization, artificial intelligence, order (exchange), machine learning, mathematics, psychology, social psychology, economics, geometry, finance, psychotherapist, economic growth
This paper proposes a reinforcement learning agent that estimates internal rewards from external rewards in order to avoid conflict in multi-step dilemma problems. Intensive simulation results show that the agent avoids local convergence and obtains a behavior policy that reaches a higher reward by updating the Q-value with the external reward minus the average reward.
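The update described in the abstract can be illustrated with a minimal tabular sketch. It is not the authors' implementation: the class name `AverageBaselineQAgent`, the incremental running mean used for the average reward, the standard Q-learning target, and the values of `alpha` and `gamma` are all assumptions made for illustration, since the abstract only states that the average reward is subtracted from the external reward before the Q-value update.

```python
from collections import defaultdict


class AverageBaselineQAgent:
    """Tabular Q-learning sketch: internal reward = external reward - average reward."""

    def __init__(self, actions, alpha=0.1, gamma=0.9):
        self.actions = list(actions)
        self.alpha = alpha            # learning rate (assumed value)
        self.gamma = gamma            # discount factor (assumed value)
        self.q = defaultdict(float)   # Q-table keyed by (state, action), default 0.0
        self.avg_reward = 0.0         # running mean of external rewards
        self.n_rewards = 0

    def internal_reward(self, external_reward):
        # Assumption: the average is an incremental mean over every external
        # reward seen so far, including the current one.
        self.n_rewards += 1
        self.avg_reward += (external_reward - self.avg_reward) / self.n_rewards
        return external_reward - self.avg_reward

    def update(self, state, action, external_reward, next_state, done=False):
        # Replace the external reward with the internal one in the usual
        # Q-learning target (assumed form of the update).
        r_int = self.internal_reward(external_reward)
        best_next = 0.0 if done else max(
            self.q[(next_state, a)] for a in self.actions
        )
        target = r_int + self.gamma * best_next
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        return r_int


if __name__ == "__main__":
    agent = AverageBaselineQAgent(["a", "b"], alpha=0.5)
    agent.update("s", "a", 10.0, "t", done=True)
    agent.update("s", "b", 20.0, "t", done=True)
    print(agent.q[("s", "a")], agent.q[("s", "b")])
```

In this sketch a reward equal to the running average leaves the Q-value unchanged, and only rewards above the average raise it, which is one way to read how the subtraction could discourage settling on a merely adequate policy.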