Avoiding ‘data snooping’ in multilevel and mixed effects models
Author(s) - Afshartous David, Wolf Michael
Publication year - 2007
Publication title - Journal of the Royal Statistical Society: Series A (Statistics in Society)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.103
H-Index - 84
eISSN - 1467-985X
pISSN - 0964-1998
DOI - 10.1111/j.1467-985x.2007.00494.x
Subject(s) - inference, random effects model, multilevel model, pairwise comparison, computer science, econometrics, hierarchical database model, generalized linear mixed model, mixed model, statistical inference, statistics, estimation, panel data, data mining, mathematics, machine learning, artificial intelligence, meta analysis, engineering, medicine, systems engineering
Summary. Multilevel or mixed effects models are commonly applied to hierarchical data. The level 2 residuals, which are otherwise known as random effects, are often of both substantive and diagnostic interest. Substantively, they are frequently used for institutional comparisons or rankings. Diagnostically, they are used to assess the model assumptions at the group level. Inference on the level 2 residuals, however, typically does not account for ‘data snooping’, i.e. for the harmful effects of carrying out a multitude of hypothesis tests at the same time. We provide a very general framework that encompasses both of the following inference problems: inference on the ‘absolute’ level 2 residuals to determine which are significantly different from 0, and inference on any prespecified number of pairwise comparisons. Thus, the user has the choice of testing the comparisons of interest. As our methods are flexible with respect to the estimation method that is invoked, the user may choose the desired estimation method accordingly. We demonstrate the methods with the London education authority data, the wafer data and the National Educational Longitudinal Study data.
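To illustrate why ‘data snooping’ matters, the short sketch below computes how quickly the chance of at least one false rejection grows when many level 2 residuals are each tested at the same nominal level. It is not the authors' procedure: it assumes independent tests with all null hypotheses true, and it uses the textbook Bonferroni adjustment purely as a reference point. The function names and the group counts in the loop are hypothetical, chosen only for illustration.

```python
def familywise_error_rate(num_tests: int, alpha: float = 0.05) -> float:
    """Probability of at least one false rejection among `num_tests`
    independent tests of true null hypotheses, each run at level `alpha`.
    Independence is an assumption made for this illustration only."""
    return 1.0 - (1.0 - alpha) ** num_tests


def bonferroni_level(num_tests: int, alpha: float = 0.05) -> float:
    """Per-test level that keeps the familywise error rate at or below `alpha`
    (textbook Bonferroni adjustment, not the method of the paper)."""
    return alpha / num_tests


if __name__ == "__main__":
    # Hypothetical numbers of groups (e.g. schools) whose residuals are tested.
    for m in (1, 10, 50):
        unadjusted = familywise_error_rate(m)
        adjusted = familywise_error_rate(m, bonferroni_level(m))
        print(f"tests={m:3d}  unadjusted FWER={unadjusted:.3f}  "
              f"Bonferroni FWER={adjusted:.3f}")
```

Under these assumptions the unadjusted error rate rises well above the nominal level as the number of tested residuals grows, whereas the adjusted per-test level keeps it bounded. Residual estimates from a multilevel model are generally dependent, so this calculation should be read only as motivation for a method that accounts for multiplicity.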