
Finding Best Possible Number of Clusters using K-Means Algorithm
Author(s) - K. Maheswari
Publication year - 2019
Publication title - International Journal of Engineering and Advanced Technology
Language(s) - English
Resource type - Journals
ISSN - 2249-8958
DOI - 10.35940/ijeat.a1119.1291s419
Subject(s) - statistic, salary, computer science, cluster analysis, k means clustering, data mining, work (physics), cluster (spacecraft), algorithm, process (computing), line (geometry), silhouette, statistics, mathematics, artificial intelligence, engineering, economics, mechanical engineering, geometry, market economy, programming language, operating system
Customers are assets for a business, and companies are investing more in customer relationship management. Retaining customers over a long period is difficult in today’s market. Online shopping is also increasing day by day: people are more inclined to visit popular websites, and they spend very little time choosing their products. Online shops are therefore paying more attention to analyzing customer preferences, needs, and shopping behavior through data mining techniques. Proper classification is necessary for organizing such data. In this work, customers with the same buying behavior are grouped based on two features, age and salary. The K-Means algorithm is applied to form clusters with different values of K on both the original data and the normalized data. The within-cluster sum of squares (WSS) is calculated for both datasets at each number of clusters. A lower WSS is considered better, and the minimum is achieved on the normalized data. Cluster validity is evaluated with the elbow, silhouette, and gap statistic methods to choose the optimal number of clusters. The work is implemented in R.
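
The WSS comparison described in the abstract can be illustrated with a minimal sketch. It is written in Python using only the standard library, not the author's R code, and everything in it is an assumption made for illustration: the nine (age, salary) pairs are invented, min-max scaling is assumed as the normalization step, and the function names (`min_max_normalize`, `kmeans`, `wss`, `wss_curve`) are hypothetical. It runs a plain Lloyd-style K-Means for several values of K on raw and normalized data and reports the within-cluster sum of squares for each, which is the quantity an elbow plot would show.

```python
import random

# Invented (age, salary) pairs, for illustration only; not the paper's data.
customers = [(22, 18000), (25, 21000), (27, 20000),
             (41, 52000), (45, 55000), (43, 50000),
             (60, 90000), (62, 95000), (58, 88000)]

def min_max_normalize(points):
    """Rescale every feature column to [0, 1] (assumed normalization)."""
    cols = list(zip(*points))
    lows = [min(c) for c in cols]
    highs = [max(c) for c in cols]
    return [tuple((v - lo) / (hi - lo) if hi > lo else 0.0
                  for v, lo, hi in zip(p, lows, highs))
            for p in points]

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def assign(points, centroids):
    """Index of the nearest centroid for every point."""
    return [min(range(len(centroids)), key=lambda j: sq_dist(p, centroids[j]))
            for p in points]

def kmeans(points, k, n_iter=100, seed=0):
    """Lloyd-style K-Means; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct data points as starting centroids
    for _ in range(n_iter):
        labels = assign(points, centroids)
        updated = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[j])  # keep the centroid of an empty cluster
        if updated == centroids:  # no centroid moved: converged
            break
        centroids = updated
    return centroids, assign(points, centroids)

def wss(points, centroids, labels):
    """Within-cluster sum of squares: squared distance of each point to its centroid."""
    return sum(sq_dist(p, centroids[lab]) for p, lab in zip(points, labels))

def wss_curve(points, k_values, seed=0):
    """WSS for each K, the values an elbow plot would display."""
    return {k: wss(points, *kmeans(points, k, seed=seed)) for k in k_values}

raw_curve = wss_curve(customers, range(1, 5))
norm_curve = wss_curve(min_max_normalize(customers), range(1, 5))
for k in raw_curve:
    print(k, raw_curve[k], norm_curve[k])
```

On raw data, salary is several orders of magnitude larger than age and therefore dominates the Euclidean distance; after rescaling, both features contribute on the same [0, 1] range. This sketch covers only the WSS values behind the elbow method; the silhouette and gap statistic evaluations are not shown.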