Examining statistical disclosure issues involving digital images of ROC curves
Author(s) - Matthews Gregory J., Harel Ofer
Publication year - 2015
Publication title - Stat
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.61
H-Index - 18
ISSN - 2049-1573
DOI - 10.1002/sta4.93
Subject(s) - receiver operating characteristic, set (abstract data type), data set, image (mathematics), empirical research, mathematics, computer science, statistics, econometrics, artificial intelligence, programming language
It has been established that knowing the true values of the empirical receiver operating characteristic (ROC) curve (i.e. false‐positive and true‐positive rate pairs for all thresholds) along with a subset of the full data set consisting of n − 1 observations can cause unwanted disclosures. Here, we explore a similar problem with two main extensions. First, rather than knowledge of the true values of the empirical ROC curve, we start only with an image of the empirical ROC curve. Second, rather than considering only subsets of n − 1, we look at several differently sized subsets. Given this information (i.e. empirical ROC image and a subset of the full data set), we experimentally act as a data snooper and explore what can be learned about unobserved portions of the full data set. Copyright © 2015 John Wiley & Sons, Ltd.
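To make the disclosure setting in the abstract concrete, here is a minimal Python sketch of the simpler, previously established case: the snooper knows the exact empirical ROC points and n − 1 of the n observations. It is not the authors' method and does not handle the image-based setting the paper studies. The function names (`empirical_roc`, `snoop_missing`), the toy data, and the brute-force search over a grid of candidate scores are all assumptions made for illustration.

```python
from fractions import Fraction


def empirical_roc(scores, labels):
    """Return the set of (FPR, TPR) pairs over all thresholds.

    An observation is classified positive when score >= threshold.
    Exact fractions avoid floating-point noise when comparing curves.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("need at least one positive and one negative label")
    # One threshold per distinct score, plus one above every score.
    thresholds = sorted(set(scores)) + [float("inf")]
    points = set()
    for t in thresholds:
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        points.add((Fraction(fp, n_neg), Fraction(tp, n_pos)))
    return points


def snoop_missing(roc_points, known_scores, known_labels, candidate_scores):
    """Hypothetical brute-force snooper: list every (score, label) candidate
    that, added to the known observations, reproduces the published ROC."""
    matches = []
    for s in candidate_scores:
        for y in (0, 1):
            trial = empirical_roc(known_scores + [s], known_labels + [y])
            if trial == roc_points:
                matches.append((s, y))
    return matches


# Toy data (invented for this sketch): five observations, the last is hidden.
full_scores = [0.1, 0.4, 0.35, 0.8, 0.65]
full_labels = [0, 0, 1, 1, 1]
published_roc = empirical_roc(full_scores, full_labels)

grid = [i / 20 for i in range(21)]  # candidate scores 0.00, 0.05, ..., 1.00
matches = snoop_missing(published_roc, full_scores[:-1], full_labels[:-1], grid)
print(matches)
```

In this toy case every surviving candidate carries the same label as the hidden observation, and its score is bounded below by a known score, so the label is disclosed outright while the score is only narrowed to a range. Working from an image rather than exact values would add measurement error to `published_roc`, which this sketch does not model.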