Relevant knowledge helps in choosing right teacher
Author(s) - Peng Cai, Wei Gao, Aoying Zhou, KamFai Wong
Publication year - 2011
Publication title - Proceedings of the 34th International ACM SIGIR Conference on Research and Development in Information Retrieval
Language(s) - English
Resource type - Conference proceedings
DOI - 10.1145/2009916.2009935
Subject(s) - ranking (information retrieval) , computer science , weighting , learning to rank , labeled data , domain (mathematical analysis) , adaptation (eye) , rank (graph theory) , set (abstract data type) , domain adaptation , information retrieval , selection (genetic algorithm) , annotation , training set , domain knowledge , machine learning , artificial intelligence , classifier (uml) , medicine , mathematical analysis , physics , mathematics , combinatorics , optics , radiology , programming language
Adapting to a new setting is a common challenge to our knowledge and capabilities. A new life would be easier if we actively sought supervision from the right mentor, chosen using our relevant but limited prior knowledge. This variant of the active learning principle seems intuitively useful for many domain adaptation problems. In this paper, we substantiate its power for advancing automatic ranking adaptation, which is important in web search because it is prohibitively expensive to gather enough labeled data to fully train a domain-specific ranker for every search domain. For cost-effectiveness, only the most informative instances in the target domain should be collected for annotation, while the abundant ranking knowledge in the source domain is still utilized. We propose a unified ranking framework in which the active selection of informative target-domain queries and the appropriate weighting of source training data, treated as related prior knowledge, mutually reinforce each other. We select for annotation those target queries whose document orderings are most disputed among the members of a committee built on the mixture of source training data and the already selected target data. The replenished labeled set is then used to adjust the importance of source queries to enhance their rank transfer. This procedure iterates until the labeling budget is exhausted. On the LETOR 3.0 and Yahoo! Learning to Rank Challenge data sets, our approach significantly outperforms both the random query annotation commonly used in ranking adaptation and an active rank learner that uses target-domain data only.
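The committee-based query selection step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how disagreement is measured, so the sketch assumes a Kendall-style fraction of discordant document pairs, averaged over all pairs of committee members. The function names (`pairwise_discordance`, `committee_disagreement`, `select_queries`) and the toy scores are hypothetical, and building the committee and re-weighting the source queries are left out.

```python
from itertools import combinations


def pairwise_discordance(scores_a, scores_b):
    """Fraction of document pairs that two score lists order differently.

    Assumption: a Kendall-style measure, chosen here for illustration only.
    """
    pairs = list(combinations(range(len(scores_a)), 2))
    if not pairs:
        return 0.0
    discordant = sum(
        1
        for i, j in pairs
        if (scores_a[i] - scores_a[j]) * (scores_b[i] - scores_b[j]) < 0
    )
    return discordant / len(pairs)


def committee_disagreement(member_scores):
    """Mean pairwise discordance over all pairs of committee members."""
    member_pairs = list(combinations(member_scores, 2))
    if not member_pairs:
        return 0.0
    total = sum(pairwise_discordance(a, b) for a, b in member_pairs)
    return total / len(member_pairs)


def select_queries(candidate_scores, batch_size):
    """Return the ids of the `batch_size` queries the committee disputes most.

    `candidate_scores` maps a query id to a list of per-member score lists,
    one score per document of that query.
    """
    ranked = sorted(
        candidate_scores,
        key=lambda q: committee_disagreement(candidate_scores[q]),
        reverse=True,
    )
    return ranked[:batch_size]


# Toy data (hypothetical): three committee members score three documents.
candidates = {
    "q1": [[0.9, 0.5, 0.1], [0.8, 0.6, 0.2], [0.7, 0.4, 0.3]],  # same order
    "q2": [[0.9, 0.5, 0.1], [0.1, 0.5, 0.9], [0.5, 0.9, 0.1]],  # conflicting
}
print(select_queries(candidates, 1))
```

In a full loop under these assumptions, the selected queries would be annotated, added to the labeled set, and used to re-weight the source queries before the committee is rebuilt, repeating until the labeling budget runs out.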