
Business Anomaly Detection Method of Power Dispatching Automation System Based on Clustering Under-Sampling in the Boundary Region
Author(s) - Junliang Li, Jun Xu, Xu Huang, Bing Ren, Tianqi Dai, Zemin Zhang, Rui Su
Publication year - 2021
Publication title - Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/2025/1/012026
Subject(s) - cluster analysis, anomaly detection, boundary (topology), data mining, computer science, dbscan, partition (number theory), automation, sampling (signal processing), pattern recognition (psychology), artificial intelligence, mathematics, engineering, correlation clustering, cure data clustering algorithm, filter (signal processing), mechanical engineering, mathematical analysis, combinatorics, computer vision
Timely detection of business anomalies in the power dispatching automation system is important for the stable operation of the power grid. Although imbalanced binary classification in machine learning is an effective way to detect such anomalies, the overlap of samples in the boundary region between classes degrades classification performance. This paper proposes an under-sampling method that removes clustering noise among the majority-class samples in the boundary region. First, KNN is used to find the neighbors of each majority-class sample, and the samples are divided into a boundary region and a safety region according to the proportion of majority-class samples among those neighbors. Second, DBSCAN is used to cluster the majority-class samples in the boundary region, and the points it labels as noise are removed. Finally, the method is combined with model dynamic selection driven by data partition hybrid sampling (DPHS-MDS). Together, these steps reduce the overlap of boundary samples, balance the dataset, and improve classification performance. Experimental results show that the proposed method outperforms the relevant mainstream methods in terms of F-measure and G-mean.
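The first two steps of the abstract (KNN-based region partition, then DBSCAN noise removal on the boundary majority samples) can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the partition rule or any parameter values, so the function names, the rule "safe if the majority share among the k neighbors is at least `threshold`", and the values of `k`, `threshold`, `eps`, and `min_pts` are all assumptions made for illustration. The DPHS-MDS stage is not covered. Only the Python standard library is used, and the toy data is invented.

```python
import math

def knn_indices(points, i, k):
    """Indices of the k nearest neighbours of points[i], excluding i itself."""
    dists = sorted((math.dist(points[i], p), j)
                   for j, p in enumerate(points) if j != i)
    return [j for _, j in dists[:k]]

def split_majority(X, y, k=3, threshold=1.0, majority=0):
    """Split majority-class indices into (boundary, safe).

    Assumed rule: a majority sample is 'safe' when the share of
    majority-class points among its k neighbours is >= threshold.
    """
    boundary, safe = [], []
    for i, label in enumerate(y):
        if label != majority:
            continue
        nbrs = knn_indices(X, i, k)
        share = sum(y[j] == majority for j in nbrs) / len(nbrs)
        (safe if share >= threshold else boundary).append(i)
    return boundary, safe

def dbscan_noise(points, eps, min_pts):
    """Positions in `points` that a plain DBSCAN leaves unclustered (noise).

    A point's eps-neighbourhood includes the point itself.
    """
    n = len(points)
    neigh = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
             for i in range(n)]
    core = [len(nb) >= min_pts for nb in neigh]
    labels = [None] * n
    cluster = 0
    for i in range(n):
        if labels[i] is not None or not core[i]:
            continue
        labels[i] = cluster
        stack = [i]
        while stack:                      # expand the cluster from core points
            p = stack.pop()
            if not core[p]:
                continue                  # border points do not expand further
            for q in neigh[p]:
                if labels[q] is None:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return [i for i in range(n) if labels[i] is None]

def undersample_boundary(X, y, k=3, threshold=1.0, eps=0.3, min_pts=3,
                         majority=0):
    """Drop majority samples that are DBSCAN noise inside the boundary region."""
    boundary, _ = split_majority(X, y, k, threshold, majority)
    noise = dbscan_noise([X[i] for i in boundary], eps, min_pts)
    drop = {boundary[j] for j in noise}
    keep = [i for i in range(len(y)) if i not in drop]
    return [X[i] for i in keep], [y[i] for i in keep], sorted(drop)

# Invented toy data: label 0 = majority (normal), label 1 = minority (anomaly).
X = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (0.5, 0.5),  # safe group
     (4.0, 4.0), (4.1, 4.0), (4.0, 4.1),      # dense majority group near class 1
     (6.0, 6.0),                              # lone majority point inside class 1
     (4.2, 4.2), (6.0, 6.2), (6.2, 6.0), (5.9, 5.8)]
y = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

X_res, y_res, removed = undersample_boundary(X, y)
print(removed)  # [8]
```

In this toy run the dense majority group near the minority class is kept because DBSCAN clusters it, while the isolated majority point surrounded by minority samples is treated as noise and removed. On real data, scikit-learn's `NearestNeighbors` and `DBSCAN` would be the more usual choice than these hand-written loops.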