Statistical inference for exploratory data analysis and model diagnostics
Author(s) -
Andreas Buja,
Dianne Cook,
Heike Hofmann,
Michael Lawrence,
Eun Kyung Lee,
Deborah F. Swayne,
Hadley Wickham
Publication year - 2009
Publication title - Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.074
H-Index - 169
eISSN - 1471-2962
pISSN - 1364-503X
DOI - 10.1098/rsta.2009.0120
Subject(s) - protocol (science) , computer science , statistical inference , plot (graphics) , statistical model , statistical hypothesis testing , exploratory data analysis , frequentist inference , rorschach test , inference , rigour , data science , data mining , machine learning , artificial intelligence , statistics , bayesian inference , bayesian probability , psychology , mathematics , medicine , social psychology , alternative medicine , geometry , pathology
We propose to furnish visual statistical methods with an inferential framework and protocol, modelled on confirmatory statistical testing. In this framework, plots take on the role of test statistics, and human cognition the role of statistical tests. Statistical significance of 'discoveries' is measured by having the human viewer compare the plot of the real dataset with collections of plots of simulated datasets. A simple but rigorous protocol that provides inferential validity is modelled after the 'lineup' popular from criminal legal procedures. Another protocol modelled after the 'Rorschach' inkblot test, well known from (pop-)psychology, will help analysts acclimatize to random variability before being exposed to the plot of the real data. The proposed protocols will be useful for exploratory data analysis, with reference datasets simulated by using a null assumption that structure is absent. The framework is also useful for model diagnostics in which case reference datasets are simulated from the model in question. This latter point follows up on previous proposals. Adopting the protocols will mean an adjustment in working procedures for data analysts, adding more rigour, and teachers might find that incorporating these protocols into the curriculum improves their students' statistical thinking.
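
The lineup protocol described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation: the function names `make_lineup` and `lineup_alpha`, the default of 20 panels, and the choice of permuting `y` as the null simulator (a null of "no association between x and y") are all assumptions made for the example. The plot of the real data is hidden at a random position among plots of simulated data; if a viewer who has not seen the real data picks it out, that would happen by chance with probability 1/m under the null.

```python
import random


def make_lineup(x, y, m=20, seed=None):
    """Build m panels: one holds the real (x, y), the rest hold null data.

    Assumption: the null is 'no x-y association', simulated by permuting y.
    Other nulls (or a fitted model, for diagnostics) need other simulators.
    Returns (panels, true_pos), where panels is a list of (x, y) pairs.
    """
    rng = random.Random(seed)
    true_pos = rng.randrange(m)  # hidden position of the real data
    panels = []
    for i in range(m):
        if i == true_pos:
            panels.append((list(x), list(y)))
        else:
            y_null = list(y)
            rng.shuffle(y_null)  # permutation breaks any x-y structure
            panels.append((list(x), y_null))
    return panels, true_pos


def lineup_alpha(m):
    """Chance of picking the real panel by luck alone when the null holds."""
    return 1.0 / m


# Hypothetical toy data with an obvious linear trend.
x = [float(i) for i in range(30)]
y = [2.0 * xi + 1.0 for xi in x]
panels, true_pos = make_lineup(x, y, m=20, seed=1)

# Each panel would be drawn as a scatterplot and shown unlabelled; only
# after the viewer commits to a choice is true_pos revealed.
picked = true_pos  # stand-in for a viewer who spots the real data
reject_null = picked == true_pos
```

The key design point is that the viewer must choose before learning `true_pos`; the plotting step itself is omitted here and could use any graphics library.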
