A machine-learning approach to combined evidence validation of genome assemblies

JeongHyeon Choi, Sun Kim, Haixu Tang, Justen Andrews, Don Gilbert, John K. Colbourne

Open Access

Publication year - 2008

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/btm608

Subject(s) - contig, sequence assembly, computer science, benchmarking, merge (version control), sequence (biology), data mining, string (physics), artificial intelligence, machine learning, genome, computational biology, information retrieval, genetics, mathematics, biology, gene, gene expression, transcriptome, marketing, business, mathematical physics

While it is common to refer to 'the genome sequence' as if it were a single, complete and contiguous DNA string, it is in fact an assembly of millions of small, partially overlapping DNA fragments. Sophisticated computer algorithms (assemblers and scaffolders) merge these DNA fragments into contigs, and place these contigs into sequence scaffolds using the paired-end sequences derived from large-insert DNA libraries. Each step in this automated process is susceptible to producing errors; hence, the resulting draft assembly represents (in practice) only a likely assembly that requires further validation. Knowing which parts of the draft assembly are likely free of errors is critical if researchers are to draw reliable conclusions from the assembled sequence data.
