
Improving The Performance of K-Nearest Neighbor Algorithm by Reducing The Attributes of Dataset Using Gain Ratio
Author(s) -
Novia Hasdyna,
Baringin Sianipar,
Elviawaty Muiza Zamzami
Publication year - 2020
Publication title -
Journal of Physics: Conference Series
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.21
H-Index - 85
eISSN - 1742-6596
pISSN - 1742-6588
DOI - 10.1088/1742-6596/1566/1/012090
Subject(s) - information gain ratio , computer science , artificial intelligence , k nearest neighbors algorithm , algorithm , breast cancer , performance improvement , machine learning , data mining , pattern recognition (psychology) , cancer , engineering , medicine , feature selection , operations management
Data with many attributes (high dimensionality) affects the performance of the K-NN classification algorithm. In this study, Gain Ratio is used to select and reduce the dataset attributes, forming a new dataset that is then classified with K-NN. The datasets used are the Breast Cancer Coimbra dataset and the Hepatitis C Virus dataset, both obtained from the UCI Machine Learning Repository. The results show that, on the Breast Cancer Coimbra dataset, Gain Ratio improves the performance of K-NN, with average values of TPR = 0.535596, TNR = 1, NPV = 0.608279, FNR = 1, FOR = 0.391721, and Accuracy = 72.85%. On the Hepatitis C Virus dataset, it also improves the performance of K-NN, with average values of TPR = 0.665596, TNR = 0.876667, NPV = 0.738279, FNR = 0.88, FOR = 0.521721, and Accuracy = 86.25%.
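The pipeline described in the abstract (rank attributes by Gain Ratio, keep the best ones, then classify with K-NN) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the toy data, the choice of keeping two attributes, and k = 3 are all assumptions made for illustration. The sketch also assumes discrete attribute values; the continuous attributes in the two UCI datasets would first need to be discretized, and the abstract does not say how that was done.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (in bits) of a list of discrete values."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def gain_ratio(values, labels):
    """Gain ratio of one discrete attribute: information gain / split information."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Expected class entropy after splitting on the attribute.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    split_info = entropy(values)
    if split_info == 0:  # attribute has a single value, so it carries no information
        return 0.0
    return (entropy(labels) - conditional) / split_info

def select_attributes(rows, labels, keep):
    """Indices of the `keep` attributes with the highest gain ratio."""
    n_attrs = len(rows[0])
    scores = [gain_ratio([r[j] for r in rows], labels) for j in range(n_attrs)]
    ranked = sorted(range(n_attrs), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:keep])

def knn_predict(train_rows, train_labels, query, k=3):
    """Majority vote among the k nearest training rows (Euclidean distance)."""
    nearest = sorted(
        zip(train_rows, train_labels),
        key=lambda pair: math.dist(pair[0], query),
    )[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

# Hypothetical toy data: attribute 0 determines the class, attribute 1 is noise,
# attribute 2 is weakly informative.
rows = [(0, 0, 0), (0, 1, 0), (0, 0, 1), (0, 1, 0),
        (1, 0, 1), (1, 1, 1), (1, 0, 0), (1, 1, 1)]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

selected = select_attributes(rows, labels, keep=2)        # drops the noise attribute
reduced = [tuple(r[j] for j in selected) for r in rows]   # the "new dataset"
prediction = knn_predict(reduced, labels, query=(1, 1), k=3)
```

Ranking is done once on the training data, and the same selected indices must be applied to any query before it is passed to `knn_predict`, since distances are computed in the reduced attribute space.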