Heuristic approach to deriving models for gene finding | Zendy

J Besemer | Zendy

AI Assistant Blog Pricing

Home ZAIA Blog

Open Access

Heuristic approach to deriving models for gene finding

Author(s) -

J Besemer

Publication year - 1999

Publication title -

nucleic acids research

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 9.008

H-Index - 537

eISSN - 1362-4954

pISSN - 0305-1048

DOI - 10.1093/nar/27.19.3911

Subject(s) - biology , genome , gene prediction , heuristic , gene , computational biology , dna sequencing , hidden markov model , coding region , genetics , dna , sequence (biology) , plasmid , computer science , artificial intelligence

Computer methods of accurate gene finding in DNA sequences require models of protein coding and non-coding regions derived either from experimentally validated training sets or from large amounts of anonymous DNA sequence. Here we propose a new, heuristic method producing fairly accurate inhomogeneous Markov models of protein coding regions. The new method needs such a small amount of DNA sequence data that the model can be built 'on the fly' by a web server for any DNA sequence >400 nt. Tests on 10 complete bacterial genomes performed with the GeneMark.hmm program demonstrated the ability of the new models to detect 93.1% of annotated genes on average, while models built by traditional training predict an average of 93.9% of genes. Models built by the heuristic approach could be used to find genes in small fragments of anonymous prokaryotic genomes and in genomes of organelles, viruses, phages and plasmids, as well as in highly inhomogeneous genomes where adjustment of models to local DNA composition is needed. The heuristic method also gives an insight into the mechanism of codon usage pattern evolution.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.

Having issues? You can contact us here

Accelerating Research