Clustering by Passing Messages Between Data Points
Author(s) - Brendan J. Frey, Delbert Dueck
Publication year - 2007
Publication title - Science
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 12.556
H-Index - 1186
eISSN - 1095-9203
pISSN - 0036-8075
DOI - 10.1126/science.1136800
Subject(s) - affinity propagation, cluster analysis, computer science, similarity (geometry), data mining, set (abstract data type), data set, data point, cluster (spacecraft), pattern recognition (psychology), artificial intelligence, fuzzy clustering, cure data clustering algorithm, image (mathematics), programming language
Clustering data by identifying a subset of representative examples is important for processing sensory signals and detecting patterns in data. Such "exemplars" can be found by randomly choosing an initial subset of data points and then iteratively refining it, but this works well only if that initial choice is close to a good solution. We devised a method called "affinity propagation," which takes as input measures of similarity between pairs of data points. Real-valued messages are exchanged between data points until a high-quality set of exemplars and corresponding clusters gradually emerges. We used affinity propagation to cluster images of faces, detect genes in microarray data, identify representative sentences in this manuscript, and identify cities that are efficiently accessed by airline travel. Affinity propagation found clusters with much lower error than other methods, and it did so in less than one-hundredth the amount of time.
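The message exchange described in the abstract can be illustrated with a minimal NumPy sketch. The abstract does not spell out the update rules; the code below uses the commonly cited "responsibility" and "availability" updates associated with affinity propagation. The function name, the damping factor, the fixed iteration count, the negative squared Euclidean similarity, and the median-based preference are all assumptions made for illustration, not details taken from this record.

```python
import numpy as np

def affinity_propagation(S, damping=0.5, n_iter=200):
    """Illustrative sketch. S is an (n, n) similarity matrix whose
    diagonal holds each point's 'preference' to be an exemplar."""
    n = S.shape[0]
    R = np.zeros((n, n))  # responsibilities r(i, k)
    A = np.zeros((n, n))  # availabilities a(i, k)
    rows = np.arange(n)
    for _ in range(n_iter):
        # r(i,k) = s(i,k) - max over k' != k of [a(i,k') + s(i,k')]
        AS = A + S
        top = AS.argmax(axis=1)
        first = AS[rows, top]
        AS[rows, top] = -np.inf
        second = AS.max(axis=1)
        R_new = S - first[:, None]
        R_new[rows, top] = S[rows, top] - second
        R = damping * R + (1 - damping) * R_new

        # a(i,k) = min(0, r(k,k) + sum over i' not in {i,k} of max(0, r(i',k)))
        # a(k,k) = sum over i' != k of max(0, r(i',k))
        Rp = np.maximum(R, 0)
        Rp[rows, rows] = R[rows, rows]
        A_new = Rp.sum(axis=0)[None, :] - Rp
        diag = A_new[rows, rows].copy()
        A_new = np.minimum(A_new, 0)
        A_new[rows, rows] = diag
        A = damping * A + (1 - damping) * A_new

    # Points whose self-availability plus self-responsibility is positive
    # are taken as exemplars.
    exemplars = np.flatnonzero(np.diag(A + R) > 0)
    if exemplars.size == 0:
        return exemplars, np.full(n, -1)
    labels = exemplars[S[:, exemplars].argmax(axis=1)]
    labels[exemplars] = exemplars
    return exemplars, labels

# Hypothetical toy data: two well-separated groups of 2-D points.
X = np.array([[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]], dtype=float)
S = -((X[:, None, :] - X[None, :, :]) ** 2).sum(axis=2)  # negative squared distance
off_diag = ~np.eye(len(X), dtype=bool)
np.fill_diagonal(S, np.median(S[off_diag]))  # assumed preference: median similarity
exemplars, labels = affinity_propagation(S)
print(exemplars, labels)
```

Each entry of `labels` is the index of the exemplar that point is assigned to. The diagonal of `S` controls how many exemplars emerge: lower preference values tend to produce fewer clusters. The sketch runs a fixed number of iterations rather than checking for convergence, which a practical implementation would likely add.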