Validating clustering for gene expression data
Author(s) -
Ka Yee Yeung,
David R. Haynor,
Walter L. Ruzzo
Publication year - 2001
Publication title -
Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/17.4.309
Subject(s) - cluster analysis , data mining , computer science , partition (number theory) , cure data clustering algorithm , cluster (spacecraft) , correlation clustering , single linkage clustering , expression (computer science) , consensus clustering , variation (astronomy) , artificial intelligence , mathematics , physics , combinatorics , astrophysics , programming language
Many clustering algorithms have been proposed for the analysis of gene expression data, but little guidance is available to help choose among them. We provide a systematic framework for assessing the results of clustering algorithms. Clustering algorithms attempt to partition the genes into groups exhibiting similar patterns of variation in expression level. Our methodology is to apply a clustering algorithm to the data from all but one experimental condition. The remaining condition is used to assess the predictive power of the resulting clusters: meaningful clusters should exhibit less variation in the remaining condition than clusters formed by chance.
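The leave-one-condition-out procedure described in the abstract can be sketched roughly as follows. This is an illustrative sketch, not the authors' implementation: the function names (`kmeans`, `left_out_rms`, `leave_one_condition_out`), the use of a plain k-means as the clustering algorithm, and the choice of within-cluster root-mean-square deviation as the measure of "variation in the remaining condition" are all assumptions made for this example. The data are assumed to be a list of genes, each a list of expression values with one value per condition.

```python
import math
import random


def kmeans(rows, k, iters=50, seed=0):
    """Plain k-means (an assumed stand-in for any clustering algorithm).

    rows: list of equal-length lists of floats. Returns one label per row.
    """
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    labels = None
    for _ in range(iters):
        # Assign every row to its nearest center (squared Euclidean distance).
        new_labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(row, centers[c])))
            for row in rows
        ]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        # Move each center to the mean of its current members.
        for c in range(k):
            members = [rows[i] for i, lab in enumerate(labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


def left_out_rms(values, labels):
    """RMS deviation of each gene's held-out value from its cluster mean.

    Assumed measure of variation; lower means the clusters predict the
    held-out condition better.
    """
    groups = {}
    for v, lab in zip(values, labels):
        groups.setdefault(lab, []).append(v)
    total = 0.0
    for members in groups.values():
        mean = sum(members) / len(members)
        total += sum((v - mean) ** 2 for v in members)
    return math.sqrt(total / len(values))


def leave_one_condition_out(data, k, cluster_fn=kmeans):
    """Cluster on all but one condition, then score on the held-out one.

    Returns one score per condition. cluster_fn(rows, k) must return labels.
    """
    scores = []
    for e in range(len(data[0])):
        reduced = [row[:e] + row[e + 1:] for row in data]  # drop condition e
        labels = cluster_fn(reduced, k)
        held_out = [row[e] for row in data]
        scores.append(left_out_rms(held_out, labels))
    return scores
```

To compare against "clusters formed by chance", one option is to pass a `cluster_fn` that ignores the data and returns random labels, then check whether the real algorithm's scores are lower. Summing or averaging the per-condition scores gives a single number per algorithm and per choice of `k`.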