Identification of Differentially Expressed Genes in High‐Density Oligonucleotide Arrays Accounting for the Quantification Limits of the Technology
Author(s) -
Tadesse Mahlet G.,
Ibrahim Joseph G.,
Mutter George L.
Publication year - 2003
Publication title -
Biometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.298
H-Index - 130
eISSN - 1541-0420
pISSN - 0006-341X
DOI - 10.1111/1541-0420.00064
Subject(s) - identifiability, Bayesian probability, covariance, identification (biology), computer science, computational biology, Bayesian hierarchical modeling, DNA microarray, data mining, prior probability, oligonucleotide, Bayesian inference, biology, gene, mathematics, artificial intelligence, machine learning, statistics, gene expression, genetics, botany
Summary. In DNA microarray analysis, there is often interest in isolating a few genes that best discriminate between tissue types. This is especially important in cancer, where different clinicopathologic groups are known to vary in their outcomes and response to therapy. The identification of a small subset of gene expression patterns distinctive for tumor subtypes can help design treatment strategies and improve diagnosis. Toward this goal, we propose a methodology for the analysis of high‐density oligonucleotide arrays. The gene expression measures are modeled as censored data to account for the quantification limits of the technology, and two gene selection criteria based on contrasts from an analysis of covariance (ANCOVA) model are presented. The model is formulated in a hierarchical Bayesian framework, which, in addition to making the fit of the model straightforward and computationally efficient, allows us to borrow strength across genes. The elicitation of hierarchical priors and issues related to parameter identifiability and posterior propriety are discussed in detail. We examine the performance of our proposed method on simulated data, then present a detailed case study of an endometrial cancer dataset.
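To illustrate what modeling expression measures as censored data can mean in practice, here is a minimal Python sketch of a censored normal log-likelihood. It is not the authors' model: the normal error distribution, the function name censored_loglik, and the lower and upper limits are assumptions made for illustration only, and the hierarchical priors and ANCOVA structure described in the summary are omitted.

```python
import math


def norm_logpdf(x, mu, sigma):
    """Log density of a normal distribution with mean mu and sd sigma."""
    z = (x - mu) / sigma
    return -0.5 * z * z - math.log(sigma) - 0.5 * math.log(2.0 * math.pi)


def norm_cdf(x, mu, sigma):
    """Cumulative distribution function of the same normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def censored_loglik(values, mu, sigma, lower, upper):
    """Log-likelihood of values under a normal model with censoring.

    Hypothetical helper: `lower` and `upper` stand in for the
    quantification limits of the array; they are illustrative
    parameters, not values taken from the paper.
    """
    total = 0.0
    for y in values:
        if y <= lower:
            # Left-censored: we only know the true value is below `lower`.
            total += math.log(norm_cdf(lower, mu, sigma))
        elif y >= upper:
            # Right-censored: we only know the true value is above `upper`.
            total += math.log(1.0 - norm_cdf(upper, mu, sigma))
        else:
            # Fully observed: contributes the ordinary density.
            total += norm_logpdf(y, mu, sigma)
    return total
```

A measurement at or beyond a limit contributes a tail probability rather than a density, so readings outside the quantifiable range inform the fit without being treated as exact values. For limits many standard deviations from the mean, the tail probability can underflow to zero; a production implementation would use a numerically stable log-CDF instead.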