Filtering genes to improve sensitivity in oligonucleotide microarray data analysis
Author(s) -
Stefano Calza,
Wolfgang Raffelsberger,
Alexander Ploner,
José-Alain Sahel,
Thierry Léveillard,
Yudi Pawitan
Publication year - 2007
Publication title -
Nucleic Acids Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.008
H-Index - 537
eISSN - 1362-4954
pISSN - 0305-1048
DOI - 10.1093/nar/gkm537
Subject(s) - biology , dna microarray , sensitivity (control systems) , filter (signal processing) , set (abstract data type) , data set , computational biology , false discovery rate , noise (video) , biological system , variance (accounting) , data mining , gene , pattern recognition (psychology) , genetics , computer science , artificial intelligence , gene expression , accounting , electronic engineering , engineering , business , image (mathematics) , computer vision , programming language
Many recent microarrays hold an enormous number of probe sets, which raises many practical and theoretical problems in controlling the false discovery rate (FDR). Biologically, it is likely that most probe sets are associated with unexpressed genes, so their measured values are simply noise due to non-specific binding; many other probe sets are associated with non-differentially-expressed (non-DE) genes. In an analysis to find DE genes, these probe sets contribute to the false discoveries, so it is desirable to filter them out prior to analysis. In the methodology proposed here, we first fit a robust linear model for probe-level Affymetrix data that accounts for probe and array effects. We then develop a novel procedure called FLUSH (Filtering Likely Uninformative Sets of Hybridizations), which excludes probe sets that have statistically small array effects or large residual variance. This filtering procedure was evaluated on a publicly available data set from a controlled spiked-in experiment, as well as on a real experimental data set from a mouse model of retinal degeneration. In both cases, FLUSH filtering improves the sensitivity in the detection of DE genes compared to analyses using unfiltered, presence-filtered, intensity-filtered and variance-filtered data. A freely available package called FLUSH implements the procedures and graphical displays described in the article.
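The two-step idea in the abstract (fit an additive probe-plus-array model to each probe set, then drop probe sets with small array effects or large residual spread) can be illustrated with a minimal sketch. This is not the FLUSH package or the authors' implementation: Tukey's median polish is swapped in here as a simple robust fit in place of the robust linear model, the summary statistics are plain illustrative choices, and the names `median_polish`, `probe_set_summary`, `keep_probe_set`, `min_signal` and `max_scale` are hypothetical. Each probe set is assumed to be a list of rows (probes) by columns (arrays) of log-scale intensities.

```python
from statistics import median


def median_polish(y, n_iter=10, tol=1e-8):
    """Fit y[i][j] ~ overall + probe[i] + array[j] by Tukey's median polish.

    Assumption: median polish stands in for the robust linear model
    mentioned in the abstract; it is not the authors' fitting method.
    """
    n_probes, n_arrays = len(y), len(y[0])
    resid = [list(row) for row in y]
    overall = 0.0
    probe = [0.0] * n_probes
    array = [0.0] * n_arrays
    for _ in range(n_iter):
        # Sweep row (probe) medians out of the residuals.
        row_med = [median(r) for r in resid]
        for i in range(n_probes):
            probe[i] += row_med[i]
            for j in range(n_arrays):
                resid[i][j] -= row_med[i]
        # Re-centre the array effects around zero.
        shift = median(array)
        overall += shift
        array = [a - shift for a in array]
        # Sweep column (array) medians out of the residuals.
        col_med = [median(resid[i][j] for i in range(n_probes))
                   for j in range(n_arrays)]
        for j in range(n_arrays):
            array[j] += col_med[j]
            for i in range(n_probes):
                resid[i][j] -= col_med[j]
        # Re-centre the probe effects around zero.
        shift = median(probe)
        overall += shift
        probe = [p - shift for p in probe]
        if max(abs(m) for m in row_med + col_med) < tol:
            break
    return overall, probe, array, resid


def probe_set_summary(y):
    """Return (array-effect size, residual scale) for one probe set.

    Both statistics are illustrative choices, not the paper's definitions.
    """
    _, _, array, resid = median_polish(y)
    array_signal = sum(a * a for a in array)
    resid_scale = median(abs(r) for row in resid for r in row)
    return array_signal, resid_scale


def keep_probe_set(y, min_signal, max_scale):
    """Keep a probe set only if its array effects are large enough and
    its residual scale is small enough (hypothetical fixed cutoffs)."""
    signal, scale = probe_set_summary(y)
    return signal >= min_signal and scale <= max_scale


# Toy probe sets: 3 probes x 4 arrays.
varying = [[1.0, 1.0, 3.0, 3.0],
           [2.0, 2.0, 4.0, 4.0],
           [3.0, 3.0, 5.0, 5.0]]
flat = [[5.0] * 4 for _ in range(3)]

print(keep_probe_set(varying, min_signal=1.0, max_scale=0.5))
print(keep_probe_set(flat, min_signal=1.0, max_scale=0.5))
```

In this toy input the first probe set changes across arrays and the second is constant, so only the first passes the filter. The fixed cutoffs are placeholders; how thresholds should be chosen in practice is described in the article itself, not here.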