An Efficient Algorithm for Learning Distances that Obey the Triangle Inequality
Author(s) -
Arijit Biswas,
David M. Jacobs
Publication year - 2015
Language(s) - English
Resource type - Conference proceedings
DOI - 10.5244/c.29.10
Subject(s) - triangle inequality , inequality , computer science , algorithm , mathematics , combinatorics , mathematical analysis
Semi-supervised clustering improves performance using constraints that indicate if two images belong to the same category or not. Success depends on how effectively these constraints can be propagated to the unsupervised data. Many algorithms use these constraints to learn Euclidean distances in a vector space. However, distances between images are often computed using classifiers or combinatorial algorithms that make distance learning difficult. In such a setting, we propose to use the triangle inequality to propagate constraints to unsupervised data. First, we formulate distance learning as a metric nearness problem where a brute-force Quadratic Program (QP) is used to modify the distances such that the total change in distances is minimized but the final distances obey the triangle inequality. Then we propose a much faster version of the QP that enforces only a subset of the inequalities and can be applied to real world clustering datasets. We show experimentally that this efficient QP produces stronger clustering results on face, leaf and video image datasets, outperforming state-of-the-art methods for constrained clustering. To gain insight into the effectiveness of this algorithm, we analyze a special case of the semi-supervised clustering problem, and show that the subset of constraints that we sample still preserves key properties of the distances that would be produced by enforcing all constraints.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom