Adaptive dimension reduction for clustering high dimensional data
Author(s) - Chris Ding, Xiaofeng He, Hongyuan Zha, Horst D. Simon
Publication year - 2002
Publication title - OSTI OAI (U.S. Department of Energy Office of Scientific and Technical Information)
Language(s) - English
Resource type - Reports
DOI - 10.2172/807420
Subject(s) - cluster analysis , initialization , dimensionality reduction , computer science , clustering high dimensional data , dimension (graph theory) , reduction (mathematics) , flexibility (engineering) , space (punctuation) , cluster (spacecraft) , data mining , bridge (graph theory) , algorithm , artificial intelligence , mathematics , statistics , biology , combinatorics , geometry , programming language , operating system , anatomy
It is well known that, when clustering high-dimensional data, standard algorithms such as EM and K-means are often trapped in local minima. Many initialization methods have been proposed to tackle this problem, but with only limited success. In this paper, the authors propose a new approach that resolves the problem through repeated dimension reductions, so that K-means or EM is performed only in very low dimensions. Cluster membership is used as a bridge between the reduced-dimensional subspace and the original space, providing flexibility and ease of implementation. Clustering analysis performed on highly overlapped Gaussians, DNA gene expression profiles, and internet newsgroups demonstrates the effectiveness of the proposed algorithm.
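The loop described in the abstract can be sketched in a few lines of Python. This is only an illustration of one plausible reading, not the report's actual algorithm: the abstract does not say how each subspace is chosen, so the sketch assumes an initial PCA projection onto k-1 dimensions and, in later rounds, the subspace spanned by the current cluster centroids. The names `adr_kmeans`, `kmeans`, and `centroids`, and the parameters `n_rounds` and `seed`, are hypothetical.

```python
import numpy as np

def kmeans(X, centers, n_iter=50):
    """Plain Lloyd iterations from the given starting centers; returns labels."""
    for _ in range(n_iter):
        # Squared distance from every point to every center.
        d = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            if np.any(labels == j):          # leave empty clusters where they are
                new_centers[j] = X[labels == j].mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

def centroids(X, labels, k):
    """Cluster means in the space of X; empty clusters fall back to the global mean."""
    return np.stack([
        X[labels == j].mean(axis=0) if np.any(labels == j) else X.mean(axis=0)
        for j in range(k)
    ])

def adr_kmeans(X, k, n_rounds=5, seed=0):
    """Sketch: cluster in a low-dimensional subspace, re-derive the subspace, repeat."""
    rng = np.random.default_rng(seed)
    Xc = X - X.mean(axis=0)

    # Assumption: start from the top k-1 principal directions (via SVD).
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    Z = Xc @ Vt[:k - 1].T
    labels = kmeans(Z, Z[rng.choice(len(Z), size=k, replace=False)])

    for _ in range(n_rounds):
        # Membership is the bridge: labels found in the subspace give
        # centroids in the original space.
        C = centroids(Xc, labels, k)
        # Assumption: the next subspace is the span of those centroids.
        # QR returns an orthonormal basis with at most k columns.
        Q, _ = np.linalg.qr(C.T)
        new_labels = kmeans(Xc @ Q, C @ Q)
        if np.array_equal(new_labels, labels):
            break
        labels = new_labels
    return labels
```

Calling `adr_kmeans(X, k=3)` on an `(n, d)` array returns one integer label per row. Every K-means pass runs in at most k dimensions regardless of d, and only the labels cross between the reduced space and the original one, which is the role the abstract assigns to cluster membership.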