
Feature Selection Using Rough Set Theory Algorithm for Breast Cancer Diagnosis
Author(s) -
Dian Nova Kusuma Hardani,
Hanung Adi Nugroho
Publication year - 2020
Publication title -
IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/771/1/012017
Subject(s) - feature selection, rough set, computer science, minimum redundancy feature selection, data mining, data set, selection (genetic algorithm), feature (linguistics), artificial intelligence, pattern recognition (psychology), set (abstract data type), breast cancer, statistical classification, machine learning, algorithm, cancer, medicine, linguistics, philosophy, programming language
Feature selection is a pre-processing stage of classification in which the features relevant to the classification result are selected. A key advantage of feature selection is that it can improve classification accuracy. Data mining has excellent potential for uncovering hidden patterns in medical data sets. However, medical data sets are often high-dimensional and contain irrelevant features that can degrade the performance of the algorithm. This study analyses the performance of the rough set approach as a feature selection algorithm for breast cancer diagnosis. Feature selection was carried out on the Wisconsin Breast Cancer (Diagnostic) Data Set provided by the UCI Machine Learning Repository. The study proceeds in several steps: data pre-processing, feature selection, data randomization, classification and performance evaluation. The results show that feature selection using the rough set method is effective at reducing the large number of features in the data set.
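The abstract does not say which rough set reduction procedure was used, so the following is only a minimal sketch of one common variant, a greedy QuickReduct-style search driven by the rough set dependency degree. The function names (dependency, quick_reduct) and the small decision table are hypothetical and are not taken from the Wisconsin data set, whose real-valued features would first need to be discretized.

```python
from collections import defaultdict


def dependency(rows, labels, attrs):
    """Rough set dependency degree: |positive region| / |universe|.

    Objects that agree on every attribute in `attrs` are indiscernible.
    A block of indiscernible objects belongs to the positive region only
    if all of its objects share the same decision label.
    """
    blocks = defaultdict(list)
    for row, label in zip(rows, labels):
        blocks[tuple(row[a] for a in attrs)].append(label)
    consistent = sum(len(b) for b in blocks.values() if len(set(b)) == 1)
    return consistent / len(rows)


def quick_reduct(rows, labels):
    """Greedily add the attribute that raises the dependency degree most,
    until the subset is as informative as the full attribute set."""
    all_attrs = list(range(len(rows[0])))
    target = dependency(rows, labels, all_attrs)
    reduct = []
    while dependency(rows, labels, reduct) < target:
        best = max(
            (a for a in all_attrs if a not in reduct),
            key=lambda a: dependency(rows, labels, reduct + [a]),
        )
        reduct.append(best)
    return reduct


# Hypothetical, already-discretized decision table (three features per case).
# The label is "M" only when features 0 and 1 are both 1; feature 2 is noise.
rows = [
    (0, 0, 0),
    (0, 0, 1),
    (0, 1, 0),
    (1, 0, 1),
    (1, 1, 0),
    (1, 1, 1),
]
labels = ["B", "B", "B", "B", "M", "M"]

selected = quick_reduct(rows, labels)
print("selected feature indices:", selected)
```

On this toy table the search keeps features 0 and 1 and drops feature 2, since the first two already determine every label. A greedy search of this kind returns one reduct, not necessarily the smallest one.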