Modeling Liquid Association
Author(s) -
Ho YenYi,
Parmigiani Giovanni,
Louis Thomas A.,
Cope Leslie M.
Publication year - 2011
Publication title -
Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/j.1541-0420.2010.01440.x
Subject(s) - estimator , univariate , measure (data warehouse) , computer science , association (psychology) , flexibility (engineering) , parametric statistics , set (abstract data type) , statistics , data mining , multivariate statistics , mathematics , machine learning , philosophy , epistemology , programming language
Summary - In 2002, Ker-Chau Li introduced the liquid association measure to characterize three-way interactions between genes, and developed a computationally efficient estimator that can be used to screen gene expression microarray data for such interactions. That study, and others published since then, have established the biological validity of the method, and clearly demonstrated it to be a useful tool for the analysis of genomic data sets. To build on this work, we have sought a parametric family of multivariate distributions with the flexibility to model the full range of trivariate dependencies encompassed by liquid association. Such a model could situate liquid association within a formal inferential theory. In this article, we describe such a family of distributions, a trivariate, conditional normal model having Gaussian univariate marginal distributions, and in fact including the trivariate Gaussian family as a special case. Perhaps the most interesting feature of the distribution is that the parameterization naturally parses the three-way dependence structure into a number of distinct, interpretable components. One of these components is very closely aligned to liquid association, and is developed as a measure we call modified liquid association. We develop two methods for estimating this quantity, and propose statistical tests for the existence of this type of dependence. We evaluate these inferential methods in a set of simulations and illustrate their use in the analysis of publicly available experimental data.
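For readers unfamiliar with the liquid association measure mentioned in the summary, the following is a minimal sketch of the moment-style estimator usually attributed to Li (2002): standardize X and Y, apply a normal-score transform to Z, and average the triple product X·Y·Z. This is an illustration under stated assumptions, not the modified liquid association estimators developed in the article. The function names (`standardize`, `normal_scores`, `liquid_association`), the tie handling, and the simulated data are all hypothetical choices made for this example.

```python
import random
from statistics import NormalDist, fmean, pstdev


def standardize(values):
    """Center to mean 0 and scale to (population) standard deviation 1."""
    m = fmean(values)
    s = pstdev(values)
    return [(v - m) / s for v in values]


def normal_scores(values):
    """Rank-based normal-score transform: rank r of n maps to the
    standard normal quantile at r / (n + 1). Ties are broken by input
    order, which is an assumption of this sketch."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    nd = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = nd.inv_cdf(rank / (n + 1))
    return scores


def liquid_association(x, y, z):
    """Sample mean of X*Y*Z after standardizing X, Y and normal-scoring Z."""
    xs, ys, zs = standardize(x), standardize(y), normal_scores(z)
    return fmean(a * b * c for a, b, c in zip(xs, ys, zs))


# Simulated illustration (not data from the article).
rng = random.Random(0)
n = 5000
z = [rng.gauss(0, 1) for _ in range(n)]
x = [rng.gauss(0, 1) for _ in range(n)]
# Here the X-Y relationship strengthens as Z increases (three-way dependence).
y_dep = [zi * xi + rng.gauss(0, 1) for zi, xi in zip(z, x)]
# Here Y is independent of both X and Z.
y_ind = [rng.gauss(0, 1) for _ in range(n)]

la_dep = liquid_association(x, y_dep, z)
la_ind = liquid_association(x, y_ind, z)
print(f"LA with Z-modulated dependence: {la_dep:.3f}")
print(f"LA with no dependence:          {la_ind:.3f}")
```

In this simulation the first value should be clearly positive, because the co-expression of X and Y increases with Z, while the second should be close to zero. The sketch gives only a point estimate; it does not implement the parametric model or the statistical tests proposed in the article.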