Using a priori knowledge to align sequencing reads to their exact genomic position
Author(s) -
René Böttcher,
Ronny Amberg,
Frans-Paul Ruzius,
Victor Guryev,
Wim Verhaegh,
Peter Beyerlein,
P. J. van der Zaag
Publication year - 2012
Publication title -
Nucleic Acids Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 9.008
H-Index - 537
eISSN - 1362-4954
pISSN - 0305-1048
DOI - 10.1093/nar/gks393
Subject(s) - biology, a priori and a posteriori, false positive paradox, computational biology, position (finance), dna sequencing, string (physics), sequence alignment, genetics, computer science, artificial intelligence, dna, gene, mathematics, philosophy, epistemology, finance, economics, mathematical physics, peptide sequence
The use of a priori knowledge in the alignment of targeted sequencing data is investigated using computational experiments. By adapting a Needleman-Wunsch algorithm to incorporate the genomic position information from the targeted capture, we demonstrate that reads can be aligned to just the target region of interest. When direct string comparison is used in addition, alignment is up to 8 times faster than with the fastest conventional aligner (Bowtie). This results in a total alignment time in targeted sequencing of around 7 min for approximately 56 million captured reads. For conventional aligners such as Bowtie, BWA or MAQ, alignment to just the target region is not feasible: experiments show that it leads to 88% additional SNP calls, the vast majority of which (≈ 92%) are false positives.
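The two-step idea in the abstract (try a direct string comparison at the position known from the capture, and use Needleman-Wunsch only when that fails) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the scoring values (match +1, mismatch -1, gap -2), the `max_mismatches` threshold and the choice to align against a window of exactly the read length are all assumptions made for illustration.

```python
def hamming_mismatches(read, window):
    """Count positions at which two equal-length strings differ."""
    return sum(a != b for a, b in zip(read, window))


def needleman_wunsch(a, b, match=1, mismatch=-1, gap=-2):
    """Global alignment score of a and b (textbook Needleman-Wunsch).

    Only the score is computed, row by row, so memory stays O(len(b)).
    The scoring values are illustrative assumptions.
    """
    prev = [j * gap for j in range(len(b) + 1)]
    for i in range(1, len(a) + 1):
        cur = [i * gap]
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            cur.append(max(diag, prev[j] + gap, cur[j - 1] + gap))
        prev = cur
    return prev[-1]


def align_to_expected_position(read, reference, expected_start,
                               max_mismatches=2):
    """Align a read at the position expected from the targeted capture.

    Hypothetical helper: first compares the read directly against the
    reference window at expected_start; if the read has too many
    mismatches (or the window is truncated), falls back to a global
    alignment against that same window.
    Returns a (method, score) tuple.
    """
    window = reference[expected_start:expected_start + len(read)]
    if len(window) == len(read):
        mismatches = hamming_mismatches(read, window)
        if mismatches <= max_mismatches:
            # Ungapped score under the same match/mismatch values as above.
            return ("direct", len(read) - 2 * mismatches)
    return ("needleman-wunsch", needleman_wunsch(read, window))


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCT"
    print(align_to_expected_position("ACGTTAGC", reference, 4))  # exact copy
    print(align_to_expected_position("ACGTAGCC", reference, 4))  # one deleted base
```

In this sketch the cheap comparison handles reads that match the expected window with few substitutions, so the quadratic dynamic-programming step runs only for reads that appear to contain indels or many mismatches.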