Hybrid correction of highly noisy long reads using a variable-order de Bruijn graph
Author(s) - Pierre Morisse, Thierry Lecroq, Arnaud Lefebvre
Publication year - 2018
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/bty521
Subject(s) - de bruijn sequence , computer science , tree traversal , de bruijn graph , nanopore sequencing , error detection and correction , sequence assembly , word error rate , graph , reference genome , graph traversal , algorithm , theoretical computer science , genome , artificial intelligence , biology , mathematics , biochemistry , gene expression , transcriptome , discrete mathematics , gene
The recent rise of long-read sequencing technologies such as Pacific Biosciences and Oxford Nanopore makes it possible to solve assembly problems for larger and more complex genomes than short-read technologies allowed. However, these long reads are very noisy, with error rates of around 10-15% for Pacific Biosciences and up to 30% for Oxford Nanopore. The error correction problem has been tackled either by self-correcting the long reads or by using complementary short reads in a hybrid approach. Although sequencing technologies promise to lower the error rate of long reads below 10%, it remains higher in practice, and correcting such noisy long reads is still an open issue.