Micro Array Based Gene Expression Analysis using Parametric Multivariate Tests per Gene – A Generalized Application of Multiple Procedures with Data‐driven Order of Hypotheses
Author(s) - Ernst Schuster, Siegfried Kropf, Ingo Roeder
Publication year - 2004
Publication title - Biometrical Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.108
H-Index - 63
eISSN - 1521-4036
pISSN - 0323-3847
DOI - 10.1002/bimj.200410067
Subject(s) - multivariate statistics, multivariate analysis, multiple comparisons problem, parametric statistics, expression (computer science), mathematics, statistical hypothesis testing, sample size determination, multivariate normal distribution, computer science, statistics, algorithm, data mining, programming language
Abstract Microarray technology allows the simultaneous analysis of tens of thousands of genes. Most often, however, the analysis is based on only a few replications. This causes problems in the application of classical multivariate tests, which require sample sizes exceeding the number of observed variables. To overcome these problems, a class of stable multivariate procedures based on the theory of spherical distributions was proposed by Läuter, Glimm, and Kropf (1996). These methods allow the use of multivariate information from many genes for testing differential gene expression. Furthermore, multiple testing procedures based on these principles have been constructed (e.g., Kropf and Läuter, 2002) that strictly control the familywise type I error rate (FWE). In this paper, these methods are generalized to allow the use of the full multivariate information on expression intensities of individual genes analysed with the Affymetrix GeneChip technology. The usual strategy constructs an expression score for each gene by averaging the different oligonucleotide (perfect-match and mismatch) information and then performs some test on these summarized expression values. In contrast, we suggest a test procedure based on the complete multivariate perfect-match information. We show that a multiple FWE-controlling procedure for normally distributed data proposed by Westfall, Kropf, and Finos (2004) can be generalized to a more powerful procedure based on left-spherically distributed scores derived from the perfect-match information, without losing the FWE-controlling property. To illustrate the proposed test procedures, which have been implemented in the statistical programming environment R, we analyse two previously published data sets, comparing gene expression of tumour and healthy tissues within the same patients and between two groups of different patients, respectively. Using these examples, we demonstrate that incorporating the multivariate perfect-match information is superior to classical expression-score-based methods with respect to the number of identifiable differentially expressed genes. (© 2004 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim)
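The phrase "data-driven order of hypotheses" in the title can be illustrated with a small sketch. It is not the authors' R implementation and it does not cover the multivariate per-gene generalization described in the abstract. It shows only the simpler univariate idea for a paired design (for example tumour minus healthy tissue within each patient): genes are sorted by a statistic that carries no information about the sign of the effect, and then tested one by one at the unadjusted level until the first non-significant result. The ordering criterion used here (the uncentred sum of squared differences), the function name `ordered_paired_tests`, and the arguments `diffs` and `t_crit` are assumptions made for illustration. The critical value is passed in by the caller so that the sketch needs only the Python standard library.

```python
import math
from statistics import mean, stdev


def ordered_paired_tests(diffs, t_crit):
    """Test genes in a data-driven order and stop at the first non-rejection.

    diffs  : list of rows, one per patient; each row holds the paired
             difference (e.g. tumour minus healthy) for every gene.
    t_crit : two-sided critical value of Student's t with n - 1 degrees of
             freedom at the chosen level alpha (supplied by the caller).

    Returns the indices of the rejected genes, in the order they were tested.
    Hypothetical helper for illustration only.
    """
    n = len(diffs)
    p = len(diffs[0])
    columns = [[row[j] for row in diffs] for j in range(p)]

    # Assumed ordering criterion: uncentred sum of squares per gene,
    # largest first. It does not depend on the sign of the differences.
    order = sorted(
        range(p),
        key=lambda j: sum(x * x for x in columns[j]),
        reverse=True,
    )

    rejected = []
    for j in order:
        m = mean(columns[j])
        sd = stdev(columns[j])
        if sd == 0:
            # Degenerate column: constant non-zero differences count as
            # an arbitrarily large statistic, all-zero differences as 0.
            t = math.inf if m != 0 else 0.0
        else:
            t = m / (sd / math.sqrt(n))  # one-sample t statistic
        if abs(t) < t_crit:
            break  # first non-significant gene ends the procedure
        rejected.append(j)
    return rejected


# Toy data: 5 patients, 3 genes (made-up numbers, not from the paper).
example = [
    [2.0, 0.10, 1.0],
    [2.2, -0.10, 1.1],
    [1.8, 0.05, 0.9],
    [2.1, -0.05, 1.2],
    [1.9, 0.00, 0.8],
]
# 2.776 is the two-sided 5% critical value of t with 4 degrees of freedom.
print(ordered_paired_tests(example, 2.776))  # [0, 2]
```

In the toy data, genes 0 and 2 have the largest sums of squares and clear mean shifts, so both are rejected; gene 1 comes last in the order, is not significant, and ends the procedure. Note the cost of the fixed order: a gene with a large sum of squares but no mean shift would be tested first and would stop the procedure before any other gene is examined.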