Efficient Training Support Vector Clustering With Appropriate Boundary Information
Author(s) -
Yuan Ping,
Bin Hao,
Huina Li,
Yuping Lai,
Chun Guo,
Hui Ma,
Baocang Wang,
Xiali Hei
Publication year - 2019
Publication title -
IEEE Access
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.587
H-Index - 127
ISSN - 2169-3536
DOI - 10.1109/access.2019.2945926
Subject(s) - computer science , cluster analysis , scalability , redundancy (engineering) , data mining , support vector machine , solver , boundary (topology) , computation , dual (grammatical number) , mathematical optimization , algorithm , machine learning , database , mathematics , mathematical analysis , programming language , operating system , art , literature
Owing to its remarkable capability to handle arbitrary cluster shapes, support vector clustering (SVC) benefits data analysis in terms of data description. However, on large-scale data such as network traffic, it frequently suffers from the computationally expensive solving of the dual problem and the heavy storage cost of the kernel matrix. Fortunately, the support vectors that describe the clusters are, in a sense, expected to lie on the cluster boundaries. To tackle this issue, we propose an efficient training SVC with appropriate boundary information (ETSVC), which features excellent flexibility and scalability. In ETSVC, we first give a shrinkable boundary selection (SBS) method that collects appropriate boundaries while reducing redundancy and noise. Based on the boundary information, a redefined dual problem is then designed without sacrificing the principle of SVC. Finally, we design a reformative solver (RSolver) to reformulate the training phase, which estimates the support vector function by solving the dual problem. Since only a subset of boundaries is employed for model training, theoretical analysis suggests that ETSVC improves efficiency, and that it consumes much less memory when some efficiency is traded for reduced storage consumption. On grouping P2P flows and large-scale intrusion traffic, as well as other non-traffic data, experimental results confirm that ETSVC can be applied to resource-constrained platforms while achieving accuracy comparable to state-of-the-art methods.
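To make the cost argument in the abstract concrete, the following is a minimal sketch, not the paper's ETSVC. It solves the standard SVC dual with a Gaussian kernel, once on all points and once on a boundary subset, so the kernel matrix shrinks from N x N to m x m. Everything named here is an assumption for illustration: the function names, the kernel width `q`, the choice C = 1 (so the constraint set is the probability simplex), the Frank-Wolfe solver standing in for RSolver, and the naive "farthest from the mean" rule standing in for SBS.

```python
import numpy as np

def gaussian_kernel(A, B, q=1.0):
    # K(a, b) = exp(-q * ||a - b||^2), computed for all pairs of rows.
    d2 = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-q * np.maximum(d2, 0.0))

def solve_svc_dual(X, q=1.0, iters=2000):
    # Standard SVC dual: maximize sum_j b_j K_jj - b^T K b
    # subject to sum_j b_j = 1 and 0 <= b_j <= C.
    # With a Gaussian kernel K_jj = 1, so this equals minimizing b^T K b.
    # Assumption: C = 1, so the feasible set is the simplex and a plain
    # Frank-Wolfe iteration applies (illustrative solver, not RSolver).
    K = gaussian_kernel(X, X, q)
    beta = np.full(len(X), 1.0 / len(X))
    for t in range(iters):
        grad = 2.0 * K @ beta
        i = int(np.argmin(grad))      # best simplex vertex for this step
        step = 2.0 / (t + 2.0)
        beta *= (1.0 - step)
        beta[i] += step
    return beta, K

def radius_sq(Z, X, beta, K, q=1.0):
    # Squared feature-space distance of each row of Z to the sphere center.
    return 1.0 - 2.0 * gaussian_kernel(Z, X, q) @ beta + beta @ K @ beta

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 2))

# Hypothetical stand-in for boundary selection: keep the 80 points
# farthest from the data mean. This is NOT the paper's SBS method.
dist = np.linalg.norm(X - X.mean(0), axis=0 + 1)
boundary = X[np.argsort(dist)[-80:]]

beta_full, K_full = solve_svc_dual(X)
beta_sub, K_sub = solve_svc_dual(boundary)
print("kernel entries, full vs subset:", K_full.size, K_sub.size)
```

In this toy setting the subset model stores 25 times fewer kernel entries than the full model. How much accuracy survives that reduction depends on the quality of the boundary selection, which is the part the paper addresses and this sketch does not.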