Bayesian non‐parametric hidden Markov models with applications in genomics

Author(s) - Yau C., Papaspiliopoulos O., Roberts G. O., Holmes C.

Publication year - 2011

Publication title - Journal of the Royal Statistical Society: Series B (Statistical Methodology)

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 6.523

H-Index - 137

eISSN - 1467-9868

pISSN - 1369-7412

DOI - 10.1111/j.1467-9868.2010.00756.x

Subject(s) - Markov chain Monte Carlo, Dirichlet process, computer science, inference, Bayesian probability, robustness (evolution), Markov chain, variable order Bayesian network, parametric statistics, Gibbs sampling, hierarchical Dirichlet process, hidden Markov model, Bayesian inference, machine learning, artificial intelligence, mathematics, statistics, topic model, latent Dirichlet allocation, biochemistry, chemistry, gene

Summary. We propose a flexible non‐parametric specification of the emission distribution in hidden Markov models and we introduce a novel methodology for carrying out the computations. Whereas current approaches use a finite mixture model, we argue in favour of an infinite mixture model given by a mixture of Dirichlet processes. The computational framework is based on auxiliary variable representations of the Dirichlet process and consists of a forward–backward Gibbs sampling algorithm of similar complexity to that used in the analysis of parametric hidden Markov models. The algorithm involves analytic marginalizations of latent variables to improve the mixing, facilitated by exchangeability properties of the Dirichlet process that we uncover in the paper. A by‐product of this work is an efficient Gibbs sampler for learning Dirichlet process hierarchical models. We test the Monte Carlo algorithm proposed against a wide variety of alternatives and find significant advantages. We also investigate by simulations the sensitivity of the proposed model to prior specification and data‐generating mechanisms. We apply our methodology to the analysis of genomic copy number variation. Analysing various real data sets we find significantly more accurate inference compared with state of the art hidden Markov models which use finite mixture emission distributions.
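
The forward–backward Gibbs step mentioned in the summary builds on a standard recursion for parametric hidden Markov models: filter forwards through the observations, then sample the hidden states backwards. The Python sketch below illustrates only that generic building block for a finite Gaussian hidden Markov model with fixed parameters. It is not the authors' algorithm, which additionally handles a mixture of Dirichlet processes for the emission distribution. The function name `ffbs`, the Gaussian emission with a shared standard deviation, and all numeric values are assumptions made for illustration.

```python
import math
import random


def ffbs(obs, init, trans, means, sd, rng):
    """Draw one hidden-state path from p(states | obs) for a finite Gaussian HMM.

    Illustrative sketch: all parameters are treated as known and fixed.
    """
    n_states = len(init)

    def emission(y, k):
        # Gaussian density up to a constant shared by all states.
        z = (y - means[k]) / sd
        return math.exp(-0.5 * z * z)

    # Forward pass: filtered[t][k] = p(state_t = k | obs[0..t]).
    filtered = []
    for t, y in enumerate(obs):
        if t == 0:
            prior = init
        else:
            prev = filtered[-1]
            prior = [sum(prev[j] * trans[j][k] for j in range(n_states))
                     for k in range(n_states)]
        unnorm = [prior[k] * emission(y, k) for k in range(n_states)]
        total = sum(unnorm)
        filtered.append([u / total for u in unnorm])

    # Backward pass: sample the last state from its filtered distribution,
    # then each earlier state conditional on the state sampled after it.
    path = [0] * len(obs)
    path[-1] = rng.choices(range(n_states), weights=filtered[-1])[0]
    for t in range(len(obs) - 2, -1, -1):
        nxt = path[t + 1]
        weights = [filtered[t][j] * trans[j][nxt] for j in range(n_states)]
        path[t] = rng.choices(range(n_states), weights=weights)[0]
    return path


# Hypothetical two-state example: a baseline level and an elevated level.
rng = random.Random(0)
obs = [0.1, -0.2, 0.0, 2.1, 1.9, 2.2, 0.1]
path = ffbs(obs, init=[0.5, 0.5], trans=[[0.9, 0.1], [0.1, 0.9]],
            means=[0.0, 2.0], sd=0.3, rng=rng)
print(path)
```

Within a full Gibbs sampler, a draw like this would alternate with updates of the transition and emission parameters given the sampled path. The sketch works with raw probabilities for readability; long sequences would normally need log-space arithmetic to avoid underflow.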
