z-logo
Premium
Sparse Markov Chains for Sequence Data
Author(s) -
Jääskinen Väinö,
Xiong Jie,
Corander Jukka,
Koski Timo
Publication year - 2014
Publication title -
scandinavian journal of statistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.359
H-Index - 65
eISSN - 1467-9469
pISSN - 0303-6898
DOI - 10.1111/sjos.12053
Subject(s) - markov chain , markov model , variable order markov model , mathematics , variable order bayesian network , cluster analysis , computer science , bayesian inference , artificial intelligence , bayesian probability , markov chain monte carlo , pattern recognition (psychology) , algorithm , data mining , machine learning
Finite memory sources and variable‐length Markov chains have recently gained popularity in data compression and mining, in particular, for applications in bioinformatics and language modelling. Here, we consider denser data compression and prediction with a family of sparse Bayesian predictive models for Markov chains in finite state spaces. Our approach lumps transition probabilities into classes composed of invariant probabilities, such that the resulting models need not have a hierarchical structure as in context tree‐based approaches. This can lead to a substantially higher rate of data compression, and such non‐hierarchical sparse models can be motivated for instance by data dependence structures existing in the bioinformatics context. We describe a Bayesian inference algorithm for learning sparse Markov models through clustering of transition probabilities. Experiments with DNA sequence and protein data show that our approach is competitive in both prediction and classification when compared with several alternative methods on the basis of variable memory length.

This content is not available in your region!

Continue researching here.

Having issues? You can contact us here