Learning HMMs for nucleotide sequences from amino acid alignments
Author(s) - Carlos Norberto Fischer, Cláudia M. A. Carareto, Renato Santos, Ricardo Cerri, Eduardo C. Marques Costa, Leander Schietgat, Celine Vens
Publication year - 2015
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btv054
Subject(s) - hidden Markov model, computer science, genome, computational biology, translation (biology), frame (networking), multiple sequence alignment, sequence alignment, sequence (biology), artificial intelligence, pattern recognition (psychology), genetics, biology, peptide sequence, gene, telecommunications, messenger RNA
Profile hidden Markov models (profile HMMs) are known to efficiently predict whether an amino acid (AA) sequence belongs to a specific protein family. Profile HMMs can also be used to search for protein domains in genome sequences. In this case, the HMMs are typically learned from AA sequences and then used to search the six-frame translation of the nucleotide (NT) sequences. However, this approach requires additional processing of both the original data and the search results. Here, we propose an alternative, more direct method that converts an AA alignment into an NT alignment, after which an NT-based HMM is trained and applied directly to a genome.
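To make the two routes in the abstract concrete, the following Python sketch shows (a) a six-frame translation of an NT sequence, which is the preprocessing the conventional approach relies on, and (b) one plausible way to turn a gapped AA alignment row into an NT row by replacing each residue with its codon and each gap with three gap characters. This is an illustration under stated assumptions, not the authors' implementation: the abstract does not describe how the conversion is done, the function names `six_frame_translation` and `thread_codons` are hypothetical, the standard genetic code is assumed, and (b) assumes that the coding NT sequence of every aligned protein is available.

```python
from itertools import product

# Standard genetic code (NCBI translation table 1), codons in TCAG order.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def translate(seq):
    """Translate complete codons only; unknown codons become 'X'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frame_translation(seq):
    """Translate both strands at offsets 0, 1 and 2 (frames +1..+3, -1..-3)."""
    seq = seq.upper()
    frames = {}
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(s[offset:])
    return frames


def thread_codons(aligned_aa, coding_nt):
    """Hypothetical AA-to-NT conversion: residue -> its codon, gap -> '---'."""
    out, pos = [], 0
    for ch in aligned_aa:
        if ch == "-":
            out.append("---")
        else:
            out.append(coding_nt[pos:pos + 3])
            pos += 3
    return "".join(out)


if __name__ == "__main__":
    for name, protein in six_frame_translation("ATGGCCTAA").items():
        print(name, protein)
    print(thread_codons("M-A", "ATGGCC"))
```

The six translated frames each have to be searched and their hits mapped back to genome coordinates, which is the extra processing the abstract refers to; an NT alignment built as in `thread_codons` could instead be passed to an HMM trainer and the resulting model run on the genome directly.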