
Comparison of Distance Models on K-Nearest Neighbor Algorithm in Stroke Disease Detection
Author(s) - Iswanto Iswanto, Tulus Tulus, Poltak Sihombing
Publication year - 2021
Publication title - Applied Technology and Computing Science Journal
Language(s) - English
Resource type - Journals
eISSN - 2621-4474
pISSN - 2621-4458
DOI - 10.33086/atcsj.v4i1.2097
Subject(s) - minkowski distance, euclidean distance, k nearest neighbors algorithm, similarity (geometry), metric (unit), stroke (engine), artificial intelligence, euclidean geometry, value (mathematics), pattern recognition (psychology), mathematics, statistics, computer science, algorithm, mechanical engineering, operations management, geometry, engineering, economics, image (mathematics)
Stroke is a cardiovascular disease (CVD) in which brain cells fail to receive an adequate oxygen supply, posing a risk of ischemic damage and death. The disease can be detected based on the similarity of symptoms experienced by sufferers, so that early steps can be taken with appropriate counseling and treatment. Detecting stroke in this way calls for a machine learning method. In this research, the authors used one of the supervised learning classification methods, namely K-Nearest Neighbor (K-NN). K-NN is a classification method that assigns a label based on the distance between a new sample and the training data. This research compares the Euclidean, Minkowski, Manhattan, and Chebyshev distance models to determine which gives the best results. The distance models were tested on the stroke dataset sourced from the Kaggle repository. Based on the test results, the Chebyshev model has the highest average accuracy of the four distance models, 95.49%, with a peak accuracy of 96.03% at K = 10. The Euclidean and Minkowski distance models have the same accuracy at each K value, with an average accuracy of 95.45% and a peak accuracy of 95.93% at K = 10. Meanwhile, Manhattan has the lowest average accuracy of the four distance models, 95.42%, but reaches a peak accuracy of 96.03% at K = 6.
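The four distance models compared in the abstract can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the majority-vote rule, and the toy points are all hypothetical, and none of the values come from the Kaggle stroke dataset. The abstract does not state the Minkowski order p; with p = 2 the Minkowski distance reduces to the Euclidean distance, which would be consistent with the two models reporting identical accuracy at every K.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p (p is an assumed parameter; the abstract gives no value)."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Euclidean distance: square root of the summed squared differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def manhattan(a, b):
    """Manhattan distance: sum of absolute coordinate differences."""
    return sum(abs(x - y) for x, y in zip(a, b))


def chebyshev(a, b):
    """Chebyshev distance: largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_X, train_y, query, k, distance):
    """Label a query point by majority vote among its k nearest training points."""
    ranked = sorted(zip(train_X, train_y), key=lambda pair: distance(pair[0], query))
    votes = Counter(label for _, label in ranked[:k])
    # On a tied vote, the label encountered first among the nearest neighbors wins.
    return votes.most_common(1)[0][0]


# Hypothetical toy data with two features and binary labels (not from the study).
toy_X = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (4.0, 4.2), (4.1, 3.9), (3.8, 4.0)]
toy_y = [0, 0, 0, 1, 1, 1]
```

A call such as `knn_predict(toy_X, toy_y, (1.1, 1.0), k=3, distance=chebyshev)` swaps the distance model without touching the voting logic, which is the kind of comparison the study reports across K values. Features on different scales would normally be normalized first, since all four distances are sensitive to scale; whether the authors did so is not stated in the abstract.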