Bayesian gene set analysis for identifying significant biological pathways
Author(s) - Shahbaba Babak, Tibshirani Robert, Shachaf Catherine M., Plevritis Sylvia K.
Publication year - 2011
Publication title - Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/j.1467-9876.2011.00765.x
Subject(s) - biological pathway, computational biology, pathway analysis, biological data, biological network, bayesian probability, set (abstract data type), bayesian network, computer science, biology, gene, bioinformatics, machine learning, artificial intelligence, gene expression, genetics, programming language
Summary. We propose a hierarchical Bayesian model for analysing gene expression data to identify pathways differentiating between two biological states (e.g. cancer versus non‐cancer). Finding significant pathways can improve our understanding of normal and pathological processes and can lead to more effective treatments. Our method, Bayesian gene set analysis, evaluates the statistical significance of a specific pathway by using the posterior distribution of its corresponding hyperparameter. We apply Bayesian gene set analysis to a gene expression microarray data set on 50 cancer cell lines, of which 33 have a known p53 mutation and the remaining are p53 wild type, to identify pathways that are associated with the mutational status in the gene p53. We identify several significant pathways with strong biological connections. We show that our approach provides a natural framework for incorporating prior biological information, and it produces the best overall performance in terms of correctly identifying significant pathways compared with several alternative methods.