Learning Relative Similarity from Data Streams
Author(s) -
Shuji Hao,
Peilin Zhao,
Steven C. H. Hoi,
Chunyan Miao
Publication year - 2015
Publication title -
singapore management university institutional knowledge (ink) (singapore management university)
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2806416.2806464
Subject(s) - similarity (geometry) , computer science , mistake , machine learning , function (biology) , artificial intelligence , instance based learning , nearest neighbor search , active learning (machine learning) , data mining , data stream mining , evolutionary biology , political science , law , image (mathematics) , biology
Relative similarity learning, as an important learning scheme for information retrieval, aims to learn a bi-linear similarity function from a collection of labeled instance-pairs, and the learned function would assign a high similarity value for a similar instance-pair and a low value for a dissimilar pair. Existing algorithms usually assume the labels of all the pairs in data streams are always made available for learning. However, this is not always realistic in practice since the number of possible pairs is quadratic to the number of instances in the database, and manually labeling the pairs could be very costly and time consuming. To overcome the limitation, we propose a novel framework of active online similarity learning. Specifically, we propose two new algorithms: (i)~PAAS: Passive-Aggressive Active Similarity learning; (ii)~CWAS: Confidence-Weighted Active Similarity learning, and we will prove their mistake bounds in theory. We have conducted extensive experiments on a variety of real-world data sets, and we find encouraging results that validate the empirical effectiveness of the proposed algorithms.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom