
K-Means and K-Medoids for Indonesian Text Summarization
Author(s) - Ken Kinanti Purnamasari
Publication year - 2019
Publication title - IOP Conference Series: Materials Science and Engineering
Language(s) - English
Resource type - Journals
eISSN - 1757-899X
pISSN - 1757-8981
DOI - 10.1088/1757-899x/662/6/062013
Subject(s) - automatic summarization, medoid, k medoids, cluster analysis, computer science, similarity (geometry), tf–idf, cosine similarity, document clustering, cluster (spacecraft), data mining, k means clustering, center (category theory), pattern recognition (psychology), artificial intelligence, information retrieval, fuzzy clustering, physics, term (time), cure data clustering algorithm, quantum mechanics, image (mathematics), programming language, chemistry, crystallography
The purpose of this study is to build an automatic summarization tool based on clustering methods, specifically K-Means and K-Medoids. To find the better of the two algorithms, the study compares them on the task of summarizing thesis report documents. The system is divided into five stages: Filtering, Tokenization, TF-IDF, Cosine Similarity, and Clustering. On 50 test documents, the average accuracy is 51.16% for K-Means and 63.35% for K-Medoids, so K-Means is less accurate than K-Medoids. The accuracy of K-Means also depends on the size and center of the initial clusters chosen. As the next stage of development, further research is needed to compare the results of different combinations of initial cluster size and center values for K-Means, and then to continue with several other classification methods.
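The stages named in the abstract can be illustrated with a minimal Python sketch. This is not the author's implementation: the regex tokenizer (standing in for Filtering and Tokenization), the unsmoothed TF-IDF weighting, the use of caller-supplied sentence indices as initial centers, the rule of picking one representative sentence per cluster, and all function names are assumptions made only for illustration.

```python
import math
import re
from collections import Counter

def tokenize(sentence):
    # Assumed stand-in for Filtering + Tokenization: lowercase alphabetic tokens.
    return re.findall(r"[a-z]+", sentence.lower())

def tfidf_vectors(sentences):
    # One sparse vector (term -> tf * log(N / df)) per sentence.
    docs = [Counter(tokenize(s)) for s in sentences]
    n = len(docs)
    df = Counter(term for d in docs for term in d)
    return [{t: tf * math.log(n / df[t]) for t, tf in d.items()} for d in docs]

def cosine(u, v):
    # Cosine similarity of two sparse vectors; 0.0 if either is all zeros.
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def assign(vectors, centers):
    # Label each vector with the index of its most similar center.
    return [max(range(len(centers)), key=lambda c: cosine(v, centers[c]))
            for v in vectors]

def k_means(vectors, init, iters=20):
    # Centers are mean vectors; init lists the sentence indices used as seeds.
    centers = [dict(vectors[i]) for i in init]
    for _ in range(iters):
        labels = assign(vectors, centers)
        new_centers = []
        for c, old in enumerate(centers):
            members = [v for v, l in zip(vectors, labels) if l == c]
            if not members:          # keep an empty cluster's center unchanged
                new_centers.append(old)
                continue
            mean = Counter()
            for v in members:
                for t, w in v.items():
                    mean[t] += w / len(members)
            new_centers.append(dict(mean))
        if new_centers == centers:
            break
        centers = new_centers
    return assign(vectors, centers), centers

def k_medoids(vectors, init, iters=20):
    # Centers are actual sentences (medoids), stored as sentence indices.
    medoids = list(init)
    for _ in range(iters):
        labels = assign(vectors, [vectors[m] for m in medoids])
        new_medoids = []
        for c, m in enumerate(medoids):
            members = [i for i, l in enumerate(labels) if l == c]
            if not members:
                new_medoids.append(m)
                continue
            # New medoid: the member most similar to the rest of its cluster.
            new_medoids.append(max(
                members,
                key=lambda i: sum(cosine(vectors[i], vectors[j]) for j in members)))
        if new_medoids == medoids:
            break
        medoids = new_medoids
    return assign(vectors, [vectors[m] for m in medoids]), medoids

def summarize(sentences, init, method="medoids"):
    # Assumed selection rule: one representative sentence per cluster.
    vectors = tfidf_vectors(sentences)
    if method == "medoids":
        _, picked = k_medoids(vectors, init)
    else:
        _, centers = k_means(vectors, init)
        picked = [max(range(len(vectors)), key=lambda i: cosine(vectors[i], c))
                  for c in centers]
    return [sentences[i] for i in sorted(set(picked))]
```

Calling `summarize(sentences, init=[0, 2])` with a list of sentence strings returns the selected sentences in document order. Because `init` is passed explicitly, the same sketch can be used to observe how different initial centers change the K-Means result, which is the sensitivity the abstract points to.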