Open Access
A heuristically accelerated reinforcement learning method for maintenance policy of an assembly line
Author(s) - Xiao Wang, Guowei Zhang, Yongqiang Li, Na Qu
Publication year - 2022
Publication title - Journal of Industrial and Management Optimization
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.325
H-Index - 32
eISSN - 1553-166X
pISSN - 1547-5816
DOI - 10.3934/jimo.2022047
Subject(s) - reinforcement learning , computer science , markov decision process , asynchronous communication , simulated annealing , preventive maintenance , mathematical optimization , process (computing) , q learning , markov chain , artificial intelligence , markov process , machine learning , reliability engineering , mathematics , engineering , computer network , statistics , operating system
This paper investigates the maintenance policy for a two-machine one-buffer (2M1B) assembly line system. We assume that the observed quality states of the deteriorating machines in the system are characterized by multiple stages of decreasing yield. A semi-Markov decision process (SMDP) model describes the deterioration process of the system, and a heuristically accelerated multi-agent reinforcement learning (HAMRL) method is applied to solve this model. The HAMRL method uses asynchronous updating rules. The production time, preventive maintenance (PM) time, and corrective repair (CR) time are random, and the deterioration mode of the device is not fixed. The method is compared with a simulated annealing search (SAS) based exploration algorithm and a neighborhood search (NS) based exploration algorithm in reinforcement learning (RL). The empirical results indicate that the proposed HAMRL algorithm speeds up the learning process and has an advantage on larger problem spaces and more practical problems. The maintenance strategy for the 2M1B assembly line system is obtained once the system average cost rate has converged. This paper provides new and practical insights into the application and selection of techniques for the maintenance policy of the 2M1B assembly line system.
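
The abstract does not give the HAMRL update rules, so the following is only a minimal sketch of the general idea behind heuristic acceleration, not the authors' algorithm. It assumes a single agent, a cost-minimizing Q-table row `q_row`, a hypothetical heuristic row `h_row` that favors some actions, and a weight `xi` on the heuristic. The function names, the epsilon-greedy exploration, and the way the heuristic enters the score are all illustrative assumptions.

```python
import random


def ha_select_action(q_row, h_row, xi=1.0, epsilon=0.1, rng=random):
    """Choose an action index for one state (illustrative, not the paper's rule).

    q_row: estimated cost of each action in this state (lower is better).
    h_row: assumed heuristic preference per action (higher means "try this").
    xi: assumed weight that controls how strongly the heuristic biases the choice.
    epsilon: probability of picking a uniformly random action instead.
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_row))
    # Greedy step: subtract the weighted heuristic from the cost estimate,
    # so actions the heuristic prefers look cheaper during selection only.
    scores = [q - xi * h for q, h in zip(q_row, h_row)]
    return min(range(len(scores)), key=scores.__getitem__)


def average_cost_rate(total_cost, total_time):
    """Accumulated cost divided by elapsed time, the quantity monitored for convergence."""
    return total_cost / total_time


# Example: three hypothetical actions (keep producing, PM, CR) in one state.
costs = [1.0, 2.0, 3.0]
no_hint = [0.0, 0.0, 0.0]
hint_for_last = [0.0, 0.0, 5.0]
print(ha_select_action(costs, no_hint, epsilon=0.0))
print(ha_select_action(costs, hint_for_last, epsilon=0.0))
```

In this sketch the heuristic changes only which action is tried, not the stored cost estimates, so a poor heuristic can slow learning but does not overwrite what the agent has learned.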
