Premium
Bayesian inference for unidirectional misclassification of a binary response trait
Author(s) -
Xia Michelle,
Gustafson Paul
Publication year - 2017
Publication title -
statistics in medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.7555
Subject(s) - identifiability , covariate , computer science , inference , poisson distribution , identification (biology) , binary data , bayesian probability , binary number , statistics , econometrics , bayesian inference , binary classification , bayes' theorem , machine learning , artificial intelligence , mathematics , botany , arithmetic , support vector machine , biology
When assessing association between a binary trait and some covariates, the binary response may be subject to unidirectional misclassification. Unidirectional misclassification can occur when revealing a particular level of the trait is associated with a type of cost, such as a social desirability or financial cost. The feasibility of addressing misclassification is commonly obscured by model identification issues. The current paper attempts to study the efficacy of inference when the binary response variable is subject to unidirectional misclassification. From a theoretical perspective, we demonstrate that the key model parameters possess identifiability, except for the case with a single binary covariate. From a practical standpoint, the logistic model with quantitative covariates can be weakly identified, in the sense that the Fisher information matrix may be near singular. This can make learning some parameters difficult under certain parameter settings, even with quite large samples. In other cases, the stronger identification enables the model to provide more effective adjustment for unidirectional misclassification. An extension to the Poisson approximation of the binomial model reveals the identifiability of the Poisson and zero‐inflated Poisson models. For fully identified models, the proposed method adjusts for misclassification based on learning from data. For binary models where there is difficulty in identification, the method is useful for sensitivity analyses on the potential impact from unidirectional misclassification.