
An advanced ilrcpsd technique for bridging the competency and cognitive skills of students in higher education
Author(s) -
Alfina Christy
Publication year - 2017
Publication title -
international journal of engineering and technology
Language(s) - English
Resource type - Journals
ISSN - 2227-524X
DOI - 10.14419/ijet.v7i1.3.8984
Subject(s) - cluster analysis , centroid , computer science , data mining , cluster (spacecraft) , descriptive statistics , data point , inference , artificial intelligence , pattern recognition (psychology) , mathematics , statistics , programming language
Data mining refers to the extraction of meaningful knowledge from large data sources as it may contain hidden potential facts. In general the analysis of data mining can either be predictive or descriptive. Predictive analysis of data mining interprets the inference of the existing results so as to identify the future outputs and the descriptive analysis of data mining interprets the intrinsic characteristics or nature of the data. Clustering is one of the descriptive analysis techniques of data mining which groups the objects of similar types in such a way that objects in a cluster are closer to each other than the objects of other clusters. K-means is the most popular and widely used clustering algorithm that starts by selecting the k-random initial centroids as equal to number of clusters given by the user. It then computes the distance between initial centroids with the remaining data objects and groups the data objects into the cluster centroids with minimum distance. This process is repeated until there is no change in the cluster centroids or cluster members. But, still k-means has been suffered from several issues such as optimum number of k, random initial centroids, unknown number of iterations, global optimum solutions of clusters and more importantly the creation of meaningful clusters when dealing with the analysis of datasets from various domains. The accuracy involved with clustering should never be compromised. Thus, in this paper, a novel classification via clustering algorithm called Iterative Linear Regression Clustering with Percentage Split Distribution (ILRCPSD) is introduced as an alternate solution to the problems encountered in traditional clustering algorithms. The proposed algorithm is examined over an educational dataset to identify the hidden group of students having similar cognitive and competency skills. The performance of the proposed algorithm is well-compared with the accuracy of the traditional k-means clustering in terms of building meaningful clusters and to prove its real time usefulness.