Open Access
Comparison of distance measurement on k-nearest neighbour in textual data classification
Author(s) -
Wahyono Wahyono,
I Nyoman Prayana Trisna,
Sarah Lintang Sariwening,
Muhammad Fajar,
Danur Wijayanto
Publication year - 2019
Publication title -
Jurnal Teknologi dan Sistem Komputer
Language(s) - English
Resource type - Journals
eISSN - 2620-4002
pISSN - 2338-0403
DOI - 10.14710/jtsiskom.8.1.2020.54-58
Subject(s) - minkowski distance, euclidean distance, closeness, minkowski space, euclidean geometry, pattern recognition (psychology), k nearest neighbors algorithm, mathematics, value (mathematics), artificial intelligence, noisy data, chebyshev filter, word (group theory), computer science, data mining, statistics, mathematical analysis, geometry
One algorithm for classifying textual data in automatic document-organization applications is k-nearest neighbours (KNN), which operates on word representations converted into vectors. The distance calculation in KNN is essential for measuring the closeness between data elements. This study compares four distance measures commonly used in KNN: Euclidean, Chebyshev, Manhattan, and Minkowski. The dataset consists of 448 Eminem YouTube comments. The study shows that KNN with the Euclidean or Minkowski distance achieved the best results compared with Chebyshev and Manhattan. The best KNN results are obtained when K is 3.
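
As a rough illustration of the four distance measures compared above, the following minimal sketch implements them in plain Python together with a simple majority-vote KNN. It is not the authors' implementation: the function names, the toy word-count vectors, and the "spam"/"ham" labels are all hypothetical, and the Minkowski order p is treated as a free parameter because the abstract does not state the value used. Note that Minkowski distance with p = 2 is identical to Euclidean distance and with p = 1 is identical to Manhattan distance.

```python
from collections import Counter


def minkowski(a, b, p):
    """Minkowski distance of order p between two equal-length vectors."""
    return sum(abs(x - y) ** p for x, y in zip(a, b)) ** (1.0 / p)


def euclidean(a, b):
    """Euclidean distance: the Minkowski distance with p = 2."""
    return minkowski(a, b, 2)


def manhattan(a, b):
    """Manhattan distance: the Minkowski distance with p = 1."""
    return minkowski(a, b, 1)


def chebyshev(a, b):
    """Chebyshev distance: the largest absolute coordinate difference."""
    return max(abs(x - y) for x, y in zip(a, b))


def knn_predict(train_vectors, train_labels, query, k=3, distance=euclidean):
    """Return the majority label among the k training vectors nearest to query."""
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: distance(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data: word counts over a made-up four-word vocabulary.
# The labels are invented for illustration and are not taken from the study.
train_vectors = [
    [2, 1, 0, 0],
    [1, 2, 0, 0],
    [0, 0, 2, 1],
    [0, 0, 1, 2],
    [1, 1, 0, 1],
]
train_labels = ["spam", "spam", "ham", "ham", "spam"]
query = [0, 0, 1, 1]

for name, fn in [("euclidean", euclidean),
                 ("manhattan", manhattan),
                 ("chebyshev", chebyshev)]:
    print(name, knn_predict(train_vectors, train_labels, query, k=3, distance=fn))
```

Passing the distance as a function argument keeps the classifier unchanged while the measure is swapped, which mirrors the kind of comparison the study describes. On a real corpus the vectors would come from a text representation such as term counts or TF-IDF, which this sketch does not cover.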
