A compound Poisson model for word occurrences in DNA sequences
Author(s) -
Robin Stéphane
Publication year - 2002
Publication title -
Journal of the Royal Statistical Society: Series C (Applied Statistics)
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.205
H-Index - 72
eISSN - 1467-9876
pISSN - 0035-9254
DOI - 10.1111/1467-9876.00279
Subject(s) - Poisson distribution, Markov chain, sequence (biology), word (group theory), set (abstract data type), Poisson process, compound Poisson process, computer science, Markov model, hidden Markov model, Markov process, mathematics, algorithm, statistical physics, artificial intelligence, statistics, genetics, physics, biology, geometry, programming language
Summary. We present a compound Poisson model describing the occurrence process of a set of words in a random sequence of letters. The model takes into account the frequency of the words and their overlapping structure. The model is compared with a Markov chain model in terms of fit and parsimony. Special attention is given to the detection of poor or rich regions. Several applications of the model are presented and a combination of the Markov and compound Poisson models is proposed.