
Decision support for preventing safety violations
Author(s) -
M. A. Kulagin,
V. G. Sidorenko
Publication year - 2021
Publication title -
nadëžnostʹ
Language(s) - English
Resource type - Journals
eISSN - 2500-3909
pISSN - 1729-2646
DOI - 10.21683/1729-2646-2021-21-4-38-46
Subject(s) - categorical variable , computer science , random forest , gradient boosting , machine learning , decision tree , reliability (semiconductor) , artificial intelligence , human reliability , data mining , engineering , reliability engineering , human error , power (physics) , physics , quantum mechanics
Aim. The aim of the paper is to examine the experience of reducing the effect of the human factor on business processes, to develop the structure and software of a decision support system for preventing safety violations by train drivers using machine learning, and to analyse the findings.
Methods. The study uses machine learning, statistical analysis and expert analysis. The following machine learning methods were used: logistic regression, random forests, gradient boosting over decision trees with frequency-domain representation of categorical features, and neural networks.
Results. A set of indicators characterizing a train driver’s operation was identified and is to be used as part of the system under development. The term “train driver’s reliability” was defined as the ability not to violate train traffic safety over a certain number of trips. Algorithms for predicting violations in a train driver’s operation were designed and examined; they are used in defining reliability groups and lists of preventive measures recommended for reducing the number of safety violations in a train driver’s operation. Major violations for which the driver’s fault was proven and that may be committed within the following 3, 7, 10, 20, 30 or 60 days were chosen as the target attributes for safety violation prediction. Analysis of the results on the test sample revealed that the model based on gradient boosting over decision trees with frequency-domain representation of categorical features shows the best results for binary classification on the prediction horizons of 30 and 60 days. The developed algorithm made a correct prediction in 76% of cases with a threshold value of 0.7 and a horizon of 30 days, and in 82% of cases with a threshold value of 0.9 and a horizon of 60 days. The problem can be addressed by integrating different approaches to predicting safety violations in a train driver’s operation. Additionally, the 10 most significant indicators of a train driver’s operation were identified using the best of the considered models, i.e., gradient boosting over decision trees with frequency-domain representation of categorical features.
Conclusion. The paper presents an overview of methods and systems for assessing human reliability and the effect of the human factor on the safety of transportation systems. This overview made it possible to choose the most promising directions and methods of predictive analysis of a train driver’s operation, including machine learning methods. The resulting set of indicators of a train driver’s operation, which takes into consideration changes in the quality of such operation, provided the initial data for training the models implemented as part of the system under development. The implemented models enabled the aggregation of information on train drivers and the adoption of targeted and temporary preventive measures recommended for improving driver reliability. The resulting approach to the definition of preventive measures has been implemented in three depots of JSC RZD in trial operation mode.
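The data preparation steps named in the abstract can be illustrated with a minimal sketch. It assumes that "frequency-domain representation of categorical features" means frequency encoding (replacing each category with its relative frequency), which the abstract does not state explicitly. The function names, the depot codes and the dates are hypothetical and are not taken from the paper; no model is trained here.

```python
from collections import Counter
from datetime import date, timedelta

# Prediction horizons (in days) listed in the abstract.
HORIZONS_DAYS = (3, 7, 10, 20, 30, 60)


def frequency_encode(values):
    """Replace each category with its relative frequency in `values`.

    Assumption: this is one common reading of "frequency representation
    of categorical features"; the paper may use a different scheme.
    """
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


def horizon_labels(as_of, violation_dates, horizons=HORIZONS_DAYS):
    """Build one binary label per horizon.

    A label is 1 if at least one violation date falls in the interval
    (as_of, as_of + horizon days], otherwise 0.
    """
    labels = {}
    for h in horizons:
        end = as_of + timedelta(days=h)
        labels[h] = int(any(as_of < d <= end for d in violation_dates))
    return labels


def classify(probability, threshold):
    """Flag a driver when the predicted probability reaches the threshold."""
    return probability >= threshold


# Hypothetical example data (not from the paper).
depots = ["A", "B", "A", "C", "A", "B"]
encoded = frequency_encode(depots)

labels = horizon_labels(
    as_of=date(2021, 1, 1),
    violation_dates=[date(2021, 1, 9), date(2021, 2, 20)],
)

print(encoded)
print(labels)
print(classify(0.75, 0.7), classify(0.75, 0.9))
```

In such a setup, each horizon would get its own binary classifier, and the thresholds quoted in the abstract (0.7 for 30 days, 0.9 for 60 days) would be applied to that classifier's predicted probability.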