Minimax robust active learning for approximately specified regression models
Author(s) - Nie Rui, Wiens Douglas P., Zhai Zhichun
Publication year - 2018
Publication title - Canadian Journal of Statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.804
H-Index - 51
eISSN - 1708-945X
pISSN - 0319-5724
DOI - 10.1002/cjs.11327
Subject(s) - minimax, statistics, population, context (archaeology), regression, sampling (signal processing), regression analysis, econometrics, mathematics, computer science, benchmark (surveying), sampling distribution, mathematical optimization, geography, demography, archaeology, filter (signal processing), geodesy, sociology, computer vision
We address problems of model misspecification in active learning. We suppose that an investigator will sample training input points (predictors) from a subpopulation with a chosen distribution, possibly different from the one generating the underlying whole population. This is justified in particular when full knowledge of the predictors is easily acquired but determining the responses is expensive. Having sampled the responses, the investigator will estimate a possibly incorrectly specified regression function and then predict the responses at all remaining values of the predictors. We derive functions r(x) of the predictors x and carry out probability-weighted sampling with weights proportional to r(x). The functions r(·) are asymptotically minimax robust against the losses incurred by random measurement error in the responses, sampling variation in the inputs, and biases resulting from the model misspecification. In our applications the values of r(·) are functions of the diagonal elements of the “hat” matrix that features in a regression on the entire population; this yields an interpretation of sampling the “most influential” part of the population. Applications to simulated and benchmark data sets demonstrate the strong gains to be achieved in this manner, relative to passive learning and to previously proposed methods of active learning. We go on to illustrate the methods in a case study relating ice thickness and snow depth at various locations in Canada, using a “population” of about 50,000 observations made available by Statistics Canada.
The Canadian Journal of Statistics 46: 104–122; 2018 © 2017 Statistical Society of Canada
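The probability-weighted sampling step described in the abstract can be illustrated with a minimal sketch. It assumes a straight-line regression on a simulated population and takes r(x) to be the square root of the hat-matrix diagonal. That choice of r, the simulated data, the unweighted least-squares fit, and the names leverage_scores and sample_by_weights are all illustrative assumptions, not the minimax form or estimator derived in the paper.

```python
import numpy as np


def leverage_scores(X):
    """Diagonal of the hat matrix H = X (X'X)^{-1} X', computed via a thin QR."""
    Q, _ = np.linalg.qr(X)          # reduced QR: Q has orthonormal columns
    return np.sum(Q ** 2, axis=1)   # h_ii = squared norm of row i of Q


def sample_by_weights(r, n, rng):
    """Draw n distinct population indices with probabilities proportional to r."""
    p = r / r.sum()
    return rng.choice(len(r), size=n, replace=False, p=p)


rng = np.random.default_rng(0)

# Simulated "population" of predictors (assumption: one predictor plus intercept).
N = 1000
x = rng.uniform(-1.0, 1.0, size=N)
X = np.column_stack([np.ones(N), x])

# Responses exist for the whole population here only so the sketch can run;
# in the setting of the abstract they are expensive and observed only when sampled.
y = 1.0 + 2.0 * x + rng.normal(scale=0.1, size=N)

h = leverage_scores(X)
r = np.sqrt(h)                      # hypothetical choice of r(x), for illustration only
idx = sample_by_weights(r, 50, rng)

# Fit on the sampled points, then predict at all remaining predictor values.
beta, *_ = np.linalg.lstsq(X[idx], y[idx], rcond=None)
rest = np.setdiff1d(np.arange(N), idx)
y_pred = X[rest] @ beta
```

Because r grows with the leverage, points with extreme predictor values are drawn more often than under uniform (passive) sampling, which matches the "most influential part of the population" interpretation given in the abstract.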