Boosting Support Vector Machines for Imbalanced Microarray Data
Author(s) -
Risky Frasetio Wahyu Pratama,
Santi Wulan Purnami,
Santi Puteri Rahayu
Publication year - 2018
Publication title -
Procedia Computer Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.334
H-Index - 76
ISSN - 1877-0509
DOI - 10.1016/j.procs.2018.10.517
Subject(s) - support vector machine, computer science, artificial intelligence, classifier (uml), boosting (machine learning), oversampling, machine learning, margin classifier, pattern recognition (psychology), feature selection, data mining, computer network, bandwidth (computing)
Microarray data plays an important role in the detection and classification of almost all types of cancer tissue. Gene expression profiles produced by microarray technology carry information from genes that is then matched to a specific cancer condition. Two problems often arise in classification with microarray data: high dimensionality and class imbalance. The high-dimensionality problem can be addressed with Fast Correlation-Based Filter (FCBF) feature selection. In this paper, a Support Vector Machine (SVM) classifier is used because of its advantages. However, some studies report that almost all classifier models, including SVM, are sensitive to class imbalance. The Synthetic Minority Oversampling Technique (SMOTE) is a data preprocessing method that handles class imbalance through sampling, increasing the number of samples in the minority class. This method often works well but can sometimes suffer from over-fitting. An alternative approach to improving classification performance on imbalanced data is boosting, which constructs a powerful final classifier by combining a set of SVMs, used as base classifiers, over the course of the iteration process, and can thereby improve classification performance. In this study, colon cancer and myeloma data are used in the analysis. The results show that SMOTEBoost with SVM as the base classifier outperforms SVM, SMOTE-SVM, and AdaBoost with SVM as the base classifier in terms of the G-mean metric.
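Two ideas in the abstract lend themselves to a short illustration: the G-mean metric used for comparison, and the way SMOTE creates synthetic minority samples. The following is a minimal sketch, not the authors' implementation. The function names `g_mean` and `smote`, the parameters `k` and `seed`, and the toy data are assumptions introduced here for illustration. G-mean is taken in its usual form as the geometric mean of sensitivity and specificity, and the SMOTE sketch interpolates between a minority sample and one of its nearest minority-class neighbours.

```python
import math
import random


def g_mean(y_true, y_pred, positive=1):
    """Geometric mean of sensitivity and specificity for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return math.sqrt(sensitivity * specificity)


def smote(minority, n_synthetic, k=5, seed=0):
    """Create synthetic minority samples by interpolating between neighbours.

    `minority` is a list of feature vectors (lists of floats) with at
    least two entries; `k` and `seed` are illustrative parameters.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_synthetic):
        x = rng.choice(minority)
        # k nearest minority-class neighbours of x, excluding x itself
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: math.dist(x, m),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # random position on the segment from x to nb
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic


# Toy labels: 2 minority (1) and 4 majority (0) samples.
y_true = [1, 1, 0, 0, 0, 0]
always_majority = [0, 0, 0, 0, 0, 0]
print(g_mean(y_true, always_majority))  # accuracy is 4/6, but G-mean is 0.0

minority_points = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(len(smote(minority_points, n_synthetic=4, k=2)))
```

The toy labels show why G-mean suits imbalanced data: a classifier that always predicts the majority class still reaches an accuracy of 4/6 but scores a G-mean of zero, because its sensitivity is zero. Because each synthetic point lies on a segment between two existing minority samples, heavy oversampling can concentrate the minority class in a small region, which is one way the over-fitting risk mentioned in the abstract can arise.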