Repeat-aware evaluation of scaffolding tools

Igor Mandric; Sergey Knyazev; Alex Zelikovsky

Open Access

Author(s) - Igor Mandric, Sergey Knyazev, Alex Zelikovsky

Publication year - 2018

Publication title - Bioinformatics

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 3.599

H-Index - 390

eISSN - 1367-4811

pISSN - 1367-4803

DOI - 10.1093/bioinformatics/bty131

Subject(s) - computer science, scaffold, contig, documentation, scalability, source code, code (set theory), genome, data mining, programming language, set (abstract data type), database, genetics, biology, gene

Genomic sequences are assembled into a variable but large number of contigs that should be scaffolded (ordered and oriented) to facilitate comparative or functional analysis. Finding a scaffolding is computationally challenging because of misassemblies, inconsistent coverage across the genome and long repeats. An accurate assessment of scaffolding tools should take into account multiple locations of the same contig on the reference scaffolding rather than matching a repeat to a single best location. This makes mapping inferred scaffoldings onto the reference a computationally challenging problem. This paper formulates the repeat-aware scaffolding evaluation problem, which is to find a mapping of the inferred scaffolding onto the reference that maximizes the number of correct links, and proposes a scalable algorithm capable of handling large whole-genome datasets. Our novel scaffolding validation framework has been applied to assess most state-of-the-art scaffolding tools on a representative subset of the Genome Assembly Gold-standard Evaluations (GAGE) datasets and on some novel simulated datasets.
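
To make the difference between single-location and repeat-aware evaluation concrete, the toy Python sketch below scores an inferred scaffolding against a reference in both ways. It is not the algorithm proposed in the paper: it is an exponential brute-force search meant only for tiny inputs, and the data model (scaffolds as lists of `(contig, strand)` pairs), the link-correctness rule and all function names are assumptions made for illustration. It also assumes every inferred contig occurs in the reference at least as many times as in the inferred scaffolding.

```python
from itertools import product

def flatten(reference):
    """List every contig occurrence as (scaffold_id, index, name, strand)."""
    return [(s, i, name, strand)
            for s, scaffold in enumerate(reference)
            for i, (name, strand) in enumerate(scaffold)]

def link_is_correct(a, b, p, q):
    """a, b: consecutive inferred (name, strand); p, q: their reference positions."""
    if p[0] != q[0]:
        return False  # placed on different reference scaffolds
    # Same direction as the reference: q directly follows p, strands agree.
    forward = q[1] == p[1] + 1 and a[1] == p[3] and b[1] == q[3]
    # Reverse complement: q directly precedes p, both strands flipped.
    reverse = q[1] == p[1] - 1 and a[1] != p[3] and b[1] != q[3]
    return forward or reverse

def count_correct(inferred, placement):
    """Count correct links; placement[k] is the reference position chosen
    for the k-th contig occurrence of the flattened inferred scaffolding."""
    correct, k = 0, 0
    for scaffold in inferred:
        for j in range(len(scaffold) - 1):
            if link_is_correct(scaffold[j], scaffold[j + 1],
                               placement[k + j], placement[k + j + 1]):
                correct += 1
        k += len(scaffold)
    return correct

def single_location_score(inferred, reference):
    """Assumed baseline: every contig is pinned to its first reference occurrence."""
    first = {}
    for pos in flatten(reference):
        first.setdefault(pos[2], pos)
    placement = [first[name] for scaffold in inferred for name, _ in scaffold]
    return count_correct(inferred, placement)

def repeat_aware_score(inferred, reference):
    """Brute force: try every one-to-one placement of inferred occurrences
    onto reference occurrences with the same name and keep the best score."""
    positions = flatten(reference)
    candidates = [[p for p in positions if p[2] == name]
                  for scaffold in inferred for name, _ in scaffold]
    best = 0
    for placement in product(*candidates):
        if len(set(placement)) == len(placement):  # no position used twice
            best = max(best, count_correct(inferred, placement))
    return best

# Hypothetical example: contig R is a repeat with two reference locations.
reference = [[("A", "+"), ("R", "+"), ("B", "+"),
              ("C", "+"), ("R", "+"), ("D", "+")]]
inferred = [[("A", "+"), ("R", "+"), ("B", "+")],
            [("C", "+"), ("R", "+"), ("D", "+")]]

print(single_location_score(inferred, reference))
print(repeat_aware_score(inferred, reference))
```

In this example all four inferred links are consistent with the reference, but pinning R to one location credits only the two links around that location, while the repeat-aware search credits all four. The number of placements grows exponentially with the number of repeat occurrences, which is why a scalable method is needed for whole-genome data.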
