Coordinated learning by exploiting sparse interaction in multiagent systems
Author(s) - Yu Chao, Zhang Minjie, Ren Fenghui
Publication year - 2014
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.2947
Subject(s) - observability , computer science , multi agent system , domain (mathematical analysis) , artificial intelligence , autonomous agent , state (computer science) , distributed computing , process (computing) , machine learning , algorithm , mathematics , mathematical analysis , operating system
SUMMARY Multiagent learning provides a promising paradigm for studying how autonomous agents learn to achieve coordinated behavior in multiagent systems. In multiagent learning, the concurrency of multiple distributed learning processes makes the environment nonstationary for each individual learner. Developing an efficient learning approach to coordinate agents’ behavior in this dynamic environment is difficult, especially when agents do not know the domain structure and have only local observability of the environment. In this paper, a coordinated learning approach is proposed that enables agents to learn where and how to coordinate their behavior in loosely coupled multiagent systems, where the sparse interactions of agents confine coordination to specific parts of the environment. In the proposed approach, an agent first collects statistical information to detect the states where coordination is most necessary, considering not only the potential contributions from all the domain states but also the direct causes of miscoordination in a conflicting state. The agent then learns to coordinate its behavior with others through its local observability of the environment, according to different scenarios of state transitions. To handle the uncertainties caused by agents’ local observability, an optimistic estimation mechanism is introduced to guide the agents’ learning process. Empirical studies show that the proposed approach achieves a higher average agent reward than an uncoordinated learning approach and significantly lower computational complexity than a centralized learning approach. Copyright © 2012 John Wiley & Sons, Ltd.
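The two ideas named in the summary, detecting coordination states from collected statistics and estimating values optimistically under local observability, can be illustrated with a minimal Python sketch. This is not the authors' algorithm: the summary does not give the detection rule or the estimation formula, so the sketch assumes a plain conflict-frequency threshold and a maximum over recorded joint Q-values. The function names, the `threshold` and `min_visits` parameters, and the `(local_state, other_state, action)` key layout are all hypothetical.

```python
from collections import defaultdict


def detect_coordination_states(visits, threshold=0.3, min_visits=5):
    """Return local states whose empirical conflict rate is at least `threshold`.

    `visits` is an iterable of (state, conflicted) pairs logged by one agent.
    Assumption: a simple frequency test stands in for the paper's statistics.
    """
    counts = defaultdict(int)
    conflicts = defaultdict(int)
    for state, conflicted in visits:
        counts[state] += 1
        if conflicted:
            conflicts[state] += 1
    # States seen fewer than `min_visits` times are ignored as too noisy.
    return {
        state
        for state, n in counts.items()
        if n >= min_visits and conflicts[state] / n >= threshold
    }


def optimistic_value(joint_q, local_state, actions):
    """Optimistic estimate when the other agent's state is not observed.

    `joint_q` maps (local_state, other_state, action) to a Q-value.
    Assumption: the estimate is the best recorded value over every possible
    state of the other agent and every allowed action, or 0.0 if none exists.
    """
    candidates = [
        q
        for (state, _other_state, action), q in joint_q.items()
        if state == local_state and action in actions
    ]
    return max(candidates, default=0.0)
```

Under these assumptions, an agent would learn independently in most states and consult the joint Q-values only in the states returned by `detect_coordination_states`, which is one way to read the summary's claim of lower computational cost than fully centralized learning.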