A Statistical Model for Investigating Binding Probabilities of DNA Nucleotide Sequences Using Microarrays
Author(s) - Lee MeiLing Ting, Bulyk Martha L., Whitmore G. A., Church George M.
Publication year - 2002
Publication title - Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/j.0006-341x.2002.00981.x
Subject(s) - DNA microarray, computational biology, DNA, DNA binding site, DNA sequencing, statistical model, biology, genetics, mathematics, statistics, gene, promoter, gene expression
Summary. There is considerable scientific interest in knowing the probability that a site‐specific transcription factor will bind to a given DNA sequence. Microarray methods provide an effective means for assessing the binding affinities of a large number of DNA sequences, as demonstrated by Bulyk et al. (2001, Proceedings of the National Academy of Sciences, USA 98, 7158–7163) in their study of the DNA‐binding specificities of Zif268 zinc fingers using microarray technology. In a follow‐up investigation, Bulyk, Johnson, and Church (2002, Nucleic Acids Research 30, 1255–1261) studied the interdependence of nucleotides on the binding affinities of transcription proteins. Our article is motivated by this pair of studies. We present a general statistical methodology for analyzing microarray intensity measurements reflecting DNA‐protein interactions. The log probability of a protein binding to a DNA sequence on an array is modeled using a linear ANOVA model. This model is convenient because it employs familiar statistical concepts and procedures and also because it is effective for investigating the probability structure of the binding mechanism.
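
The summary does not give the model's exact form, so the following is only a minimal sketch of what a linear ANOVA model for log binding signal could look like. It assumes a three-nucleotide binding site, main effects only, treatment coding with A as the reference level, and noise-free synthetic data. The function names `design_matrix` and `fit_main_effects` are hypothetical and do not come from the article.

```python
import itertools
import numpy as np

NUCLEOTIDES = "ACGT"

def design_matrix(sequences):
    """Intercept plus treatment-coded main effects per position (reference level 'A')."""
    n_pos = len(sequences[0])
    rows = []
    for seq in sequences:
        row = [1.0]  # intercept: log signal of the all-'A' reference sequence
        for pos in range(n_pos):
            for nuc in NUCLEOTIDES[1:]:  # indicators for C, G, T at this position
                row.append(1.0 if seq[pos] == nuc else 0.0)
        rows.append(row)
    return np.array(rows)

def fit_main_effects(sequences, log_signal):
    """Ordinary least-squares fit of the additive (main-effects) ANOVA model."""
    X = design_matrix(sequences)
    coef, *_ = np.linalg.lstsq(X, log_signal, rcond=None)
    return coef, X @ coef

# Synthetic illustration (assumed data, not measurements from the article):
# all 64 possible three-nucleotide sites with exactly additive log signals.
sequences = ["".join(t) for t in itertools.product(NUCLEOTIDES, repeat=3)]
rng = np.random.default_rng(0)
true_coef = rng.normal(size=1 + 3 * 3)  # intercept + 3 effects at each of 3 positions
log_signal = design_matrix(sequences) @ true_coef

coef, fitted = fit_main_effects(sequences, log_signal)
residuals = log_signal - fitted
```

In this sketch the residuals vanish because the synthetic data are exactly additive. With real intensities, systematic residual structure left by a main-effects fit would be one sign of interdependence among nucleotide positions, which could then be examined by adding interaction terms to the design matrix.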