Transducers: an emerging probabilistic framework for modeling indels on trees | Zendy

Robert K. Bradley | Zendy; Ian Holmes | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Open Access

Transducers: an emerging probabilistic framework for modeling indels on trees

Author(s) -

Robert K. Bradley,

Ian Holmes

Publication year - 2007

Publication title -

bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btm402

Subject(s) - probabilistic logic , indel , computer science , artificial intelligence , biology , biochemistry , genotype , single nucleotide polymorphism , gene

When it comes to dealing with indels, molecular evolution lags heuristic bioinformatics by decades. Sophisticated alignment algorithms have been widely known since the 1960s (and in bioinformatics since 1970), but we are still struggling to understand the corresponding phylogenetic models. Big ideas drive change: as we dream of reconstructing ancestral genotypes, it is ever clearer that indels cannot be ignored. We need to develop a robust understanding of probabilistic indel analysis and its relationship to alignment. We believe that a suitable foundation for such analysis already exists, where evolutionary models meet automata theory: the framework of finite-state transducers. This framework links Hidden Markov Models (Brown et al., 1993; Churchill, 1992), sequence alignment algorithms (Gotoh, 1982; Miller andMyers, 1988; Needleman and Wunsch, 1970; Smith and Waterman, 1981), finite-state machines and Chomsky grammars (Durbin et al., 1998) and molecular phylogenetics (Miklos et al., 2004; Thorne et al., 1991). In this letter we outline this framework, also describing a preliminary analysis of one recent algorithm— Indelign—for reconstructing ancestral indel histories (Kim and Sinha, 2007). Below, we briefly review the theory of transducers, concentrating not on the details of individual algorithms but rather on their unifying qualitative character. We show that Indelign, which reconstructs maximum-likelihood indel histories, is implicitly based on a transducer model. Thus, we can compare the computational complexity of Indelign to other transducerframed algorithms, with reference to alignment data from recent comparative genomics projects in Drosophila and Eutheria (ENCODE). Finally, we discuss several programs, algorithms and resources available for working with transducers, offering an outlook on areas of bioinformatics that may benefit from this theory. 1.1 Theory of finite-state transducers

The content you want is available to Zendy users.

Already have an account? Click here to sign in.

Having issues? You can contact us here

Accelerating Research