
Inferences about competing measures based on patterns of binary significance tests are questionable.
Author(s) -
Patrick E. Shrout,
Marika Yip-Bannicq
Publication year - 2017
Publication title -
Psychological Methods
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.981
H-Index - 151
eISSN - 1939-1463
pISSN - 1082-989X
DOI - 10.1037/met0000109
Subject(s) - incremental validity , statistical inference , statistical power , statistical hypothesis testing , regression analysis , psychometrics , test validity , econometrics , statistics , psychology
An important step in demonstrating the validity of a new measure is to show that it is a better predictor of outcomes than existing measures, often called incremental validity. Investigators can use regression methods to argue for the incremental validity of new measures, while adjusting for competing or existing measures. The argument is often based on patterns of binary significance tests (BST): (a) both measures are significantly related to the outcome, (b) when adjusted for the new measure, the competing measure is no longer significantly related to the outcome, but (c) when adjusted for the competing measure, the new measure is still significantly related to the outcome. We show that the BST argument can lead to false conclusions up to 30% of the time when the validity study has modest statistical power. We review alternative methods for making strong inferences about validity and illustrate these with data on construal level in the context of relationships.

Researchers often present results in black and white terms using statistical significance tests; the conclusions from such results can be misleading. We focus on a special case of this style of reporting whereby a new measure is said to be as good as, or better than, another measure because it is significantly related to an outcome whereas the other measure is not significant when both measures are tested jointly. In our tutorial on inference in regression, we show that arguments based on binary (black and white) patterns can lead to incorrect conclusions more than a third of the time, and we explain why this result is obtained. We further distinguish 3 situations where 2 measures are compared and show better ways of making arguments: (a) when the 2 measures are thought to be literally equivalent, (b) when the new measure is thought to be better than the other, and (c) when the new measure adds information to the other, even if it is not equivalent or superior. We illustrate the statistical arguments with data on a new measure of construal level (specific vs. general thinking) in the context of relationships.
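To make the BST problem concrete, the following minimal Monte Carlo sketch (not taken from the article) simulates two equally valid, moderately correlated measures predicting the same outcome and counts how often the BST pattern nonetheless singles out one of them in a joint regression. The sample size, correlation, effect sizes, and alpha level are assumptions chosen purely for illustration and do not reproduce the authors' simulation design or the exact conditions behind their 30% figure.

```python
# Illustrative sketch: how often does the binary-significance-test (BST)
# pattern appear by chance when two measures are actually equivalent?
# All parameter values below are illustrative assumptions.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n, n_sims = 100, 5000          # modest sample size; number of simulated studies
r_measures, beta = 0.5, 0.25   # correlation between measures; equal true slopes

bst_count = 0                  # "new" measure significant, competitor not
reverse_count = 0              # competitor significant, "new" measure not

for _ in range(n_sims):
    # Two correlated measures of the same construct, each with unit variance
    common = rng.standard_normal(n)
    x_new = np.sqrt(r_measures) * common + np.sqrt(1 - r_measures) * rng.standard_normal(n)
    x_old = np.sqrt(r_measures) * common + np.sqrt(1 - r_measures) * rng.standard_normal(n)
    y = beta * x_new + beta * x_old + rng.standard_normal(n)

    # Part (a) of the pattern: each measure is significant on its own
    p_new_marg = sm.OLS(y, sm.add_constant(x_new)).fit().pvalues[1]
    p_old_marg = sm.OLS(y, sm.add_constant(x_old)).fit().pvalues[1]

    # Parts (b) and (c): joint regression adjusting each measure for the other
    X = sm.add_constant(np.column_stack([x_new, x_old]))
    p_new_adj, p_old_adj = sm.OLS(y, X).fit().pvalues[1:]

    both_marginal = (p_new_marg < 0.05) and (p_old_marg < 0.05)
    bst_count += both_marginal and (p_new_adj < 0.05) and (p_old_adj >= 0.05)
    reverse_count += both_marginal and (p_old_adj < 0.05) and (p_new_adj >= 0.05)

print(f"BST pattern favoring the 'new' measure: {bst_count / n_sims:.1%}")
print(f"Same pattern favoring the competitor:   {reverse_count / n_sims:.1%}")
```

Because the two simulated measures are exchangeable, the BST pattern favors the "new" measure about as often as it favors the competitor, which is one way to see why a single observed pattern of significant/nonsignificant coefficients is weak evidence that one measure is better than the other.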