A SATS algorithm for jointly identifying multiple differentially expressed gene sets
Author(s) - Yang Tae Young
Publication year - 2011
Publication title - Statistics in Medicine
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.996
H-Index - 183
eISSN - 1097-0258
pISSN - 0277-6715
DOI - 10.1002/sim.4235
Subject(s) - DNA microarray, type I and type II errors, multiple comparisons problem, significance analysis of microarrays, statistical hypothesis testing, set (abstract data type), gene, statistic, test statistic, measure (data warehouse)
A gene set in DNA microarrays is a group of genes that share a common biological function, chromosomal location, or regulation. This paper discusses the problem of jointly identifying multiple differentially expressed gene sets associated with a phenotype of interest from many hundreds of pre‐defined gene sets in a microarray experiment. We propose a null hypothesis that any group of gene sets from the experiment is not differentially expressed. The hypothesis is applicable to a real microarray experiment, where only a fraction of gene sets examined in the experiment are differentially expressed. To test this hypothesis, we provide an algorithm called set association for tail strength (SATS). SATS assigns the tail‐strength statistic (TS) to each gene set to measure differential expression that is related to the phenotype of interest, combines the statistics into an overall association measure of multiple gene sets by utilizing a set‐association method, and then calculates the significance of the overall measure by conducting sample permutations. SATS performs a simultaneous significance test on several gene sets, while controlling the Type I error rate. As multiple gene sets work together toward the significance, SATS can capture correlations across gene sets that should be considered in assessing joint statistical significance. Copyright © 2011 John Wiley & Sons, Ltd.
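The three steps named in the abstract (a tail-strength statistic per gene set, a set-association combination, and sample permutations) can be sketched roughly as follows. This is an illustrative sketch under stated assumptions, not the paper's implementation. The abstract does not give the details, so several choices here are guesses: per-gene p-values come from a Welch-type statistic with a normal approximation, the set-association step sums the largest k set statistics and takes the smallest p-value over k, and a single permutation set is reused for both levels of the test. All function and parameter names (`gene_pvalues`, `tail_strength`, `sats_sketch`, `max_sets`, and so on) are hypothetical.

```python
import math
import numpy as np

# Vectorized complementary error function (standard library only).
_erfc = np.vectorize(math.erfc)


def gene_pvalues(expr, labels):
    """Two-sided per-gene p-values for a two-group comparison.

    expr: genes x samples array; labels: 0/1 array, one entry per sample.
    Assumption: Welch-type statistic with a normal approximation.
    """
    a, b = expr[:, labels == 0], expr[:, labels == 1]
    se = np.sqrt(a.var(axis=1, ddof=1) / a.shape[1]
                 + b.var(axis=1, ddof=1) / b.shape[1])
    z = (a.mean(axis=1) - b.mean(axis=1)) / se
    return _erfc(np.abs(z) / math.sqrt(2.0))


def tail_strength(p):
    """Tail strength: mean over k of 1 - p_(k) * (m + 1) / k."""
    p = np.sort(np.asarray(p, dtype=float))
    m = p.size
    k = np.arange(1, m + 1)
    return float(np.mean(1.0 - p * (m + 1) / k))


def set_statistics(expr, labels, gene_sets):
    """One tail-strength value per gene set (gene_sets: list of index arrays)."""
    p = gene_pvalues(expr, labels)
    return np.array([tail_strength(p[idx]) for idx in gene_sets])


def sats_sketch(expr, labels, gene_sets, n_perm=200, max_sets=5, seed=0):
    """Return (overall p-value, indices of the jointly selected gene sets)."""
    rng = np.random.default_rng(seed)
    # Row 0 holds the observed statistics; the rest come from permuted labels.
    rows = [set_statistics(expr, labels, gene_sets)]
    for _ in range(n_perm):
        rows.append(set_statistics(expr, rng.permutation(labels), gene_sets))
    ts = np.array(rows)                         # (n_perm + 1, n_sets)

    # Sum of the k largest set statistics in each row, for k = 1..max_sets.
    top = -np.sort(-ts, axis=1)[:, :max_sets]
    sums = np.cumsum(top, axis=1)               # (n_perm + 1, max_sets)

    # pk[i, k]: fraction of rows whose k-sum is at least as large as row i's.
    pk = (sums[None, :, :] >= sums[:, None, :]).mean(axis=1)

    # Smallest p-value over k, then its own permutation p-value.
    min_p = pk.min(axis=1)
    best_k = int(pk[0].argmin()) + 1
    overall_p = float(np.mean(min_p <= min_p[0]))
    selected = np.argsort(-ts[0])[:best_k]
    return overall_p, selected
```

Calling `sats_sketch(expr, labels, gene_sets)` with a genes-by-samples matrix, a 0/1 phenotype vector, and a list of gene-index arrays returns one overall p-value for the selected group of gene sets rather than one p-value per set. Because every permutation recomputes all set statistics from the same relabelled samples, correlations across gene sets are carried into the null distribution.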