
Clustering Algorithms For High Dimensional Data – A Survey Of Issues And Existing Approaches
Author(s) - B. Hari Babu, N. Subhash Chandra, T. Venu Gopal
Publication year - 2013
Publication title - International Journal of Computer Science and Informatics
Language(s) - English
Resource type - Journals
ISSN - 2231-5292
DOI - 10.47893/ijcsi.2013.1108
Subject(s) - cluster analysis , clustering high dimensional data , data mining , computer science , cure data clustering algorithm , dimensionality reduction , high dimensional , correlation clustering , redundancy (engineering) , similarity (geometry) , dimension (graph theory) , canopy clustering algorithm , curse of dimensionality , artificial intelligence , mathematics , pure mathematics , image (mathematics) , operating system
Clustering is a prominent data mining technique that groups data into clusters based on distance measures. With the rapid growth of high dimensional data such as microarray gene expression data, clustering runs into a difficulty: similarity between objects measured in the full dimensional space is often invalid, because the dataset contains different types of data. As a result, grouping high dimensional data into clusters is inaccurate and often falls short of expectations when the dimension of the dataset is high. The problem is now attracting considerable attention in research and development. To improve the performance of data clustering on high dimensional data, issues such as dimensionality reduction, redundancy elimination, subspace clustering, co-clustering, and data labeling for clusters need to be analyzed and improved. In this paper, we present a brief comparison of existing algorithms that focus mainly on clustering high dimensional data.
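
The claim that full-space similarity becomes unreliable as dimension grows can be illustrated with a small experiment. The sketch below is not taken from the paper: it assumes uniformly random points in the unit hypercube and Euclidean distance, and the function name `relative_contrast` is a hypothetical helper introduced here. It measures how far apart the nearest and farthest neighbours of a query point are, relative to the nearest one; that ratio tends to shrink as the number of dimensions increases, which is one commonly cited reason distance-based clustering degrades on such data.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances from one
    random query point to n_points random points in the unit hypercube.

    Assumption for illustration only: data are uniform and independent
    in every dimension, which real datasets usually are not.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    d_min, d_max = min(dists), max(dists)
    return (d_max - d_min) / d_min


if __name__ == "__main__":
    # Print the contrast for increasing dimensionality.
    for dims in (2, 10, 100, 1000):
        print(dims, round(relative_contrast(500, dims), 3))
```

When the contrast is small, the nearest and farthest points are almost equally distant from the query, so a distance threshold in the full space separates clusters poorly. Approaches such as dimensionality reduction and subspace clustering aim to work in fewer, more relevant dimensions instead.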