Min Sum Clustering with Penalties
Author(s) -
Refael Hassin,
Einat Or
Publication year - 2005
Publication title -
lecture notes in computer science
Language(s) - English
Resource type - Book series
SCImago Journal Rank - 0.249
H-Index - 400
eISSN - 1611-3349
pISSN - 0302-9743
DOI - 10.1007/11561071_17
Subject(s) - combinatorics , cluster analysis , approximation algorithm , subset sum problem , outlier , mathematics , partition (number theory) , function (biology) , graph , metric (unit) , discrete mathematics , computer science , algorithm , knapsack problem , statistics , economics , biology , operations management , evolutionary biology
Traditionally, clustering problems are investigated under the assumption that all objects must be clustered. A shortcoming of this formulation is that a few distant objects, called outliers, may exert a disproportionately strong influence over the solution. In this work we investigate the k-min-sum clustering problem while addressing outliers in a meaningful way. Given a complete graph G = (V,E), a weight function w : E →IN0 on its edges, and $p \rightarrow {\it {IN}_{o}}$ a penalty function on its nodes, the penalized k-min-sum problem is the problem of finding a partition of V to k+1 sets, {S1,...,Sk+1}, minimizing $\sum_{i=1}^{k}$w(Si)+p(Sk+1), where for S⊆Vw(S) = $\sum_{e=\{{\it i},{\it j}\} \subset {\it S}}$we, and p(S) = $\sum_{i \in S}{^p_i}$. We offer an efficient 2-approximation to the penalized 1-min-sum problem using a primal-dual algorithm. We prove that the penalized 1-min-sum problem is NP-hard even if w is a metric and present a randomized approximation scheme for it. For the metric penalized k-min-sum problem we offer a 2-approximation.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom