Open Access
SRPRISM (Single Read Paired Read Indel Substitution Minimizer): an efficient aligner for assemblies with explicit guarantees
Author(s) - Aleksandr Morgulis, Richa Agarwala
Publication year - 2020
Publication title - GigaScience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.947
H-Index - 54
ISSN - 2047-217X
DOI - 10.1093/gigascience/giaa023
Subject(s) - indel, benchmark (surveying), computer science, sensitivity (control systems), ranking (information retrieval), sequence (biology), alignment free sequence analysis, reference genome, substitution (logic), algorithm, dna sequencing, data mining, sequence alignment, artificial intelligence, biology, gene, genetics, geodesy, electronic engineering, genotype, single nucleotide polymorphism, engineering, peptide sequence, programming language, geography
Background
Alignment of sequence reads generated by next-generation sequencing is an integral part of most pipelines analyzing next-generation sequencing data. A number of tools designed to quickly align a large volume of sequences are already available. However, most existing tools lack explicit guarantees about their output. They also do not support searching genome assemblies, such as the human genome assembly GRCh38, that include primary and alternate sequences and placement information for alternate sequences to primary sequences in the assembly.

Findings
This paper describes SRPRISM (Single Read Paired Read Indel Substitution Minimizer), an alignment tool for aligning reads without splices. SRPRISM has features not available in most tools, such as (i) support for searching genome assemblies with alternate sequences, (ii) partial alignment of reads with a specified region of reads to be included in the alignment, (iii) choice of ranking schemes for alignments, and (iv) explicit criteria for search sensitivity. We compare the performance of SRPRISM to GEM, Kart, STAR, BWA-MEM, Bowtie2, Hobbes, and Yara using benchmark sets for paired and single reads of lengths 100 and 250 bp generated using DWGSIM. SRPRISM found the best results for most benchmark sets with error rate of up to ∼2.5% and GEM performed best for higher error rates. SRPRISM was also more sensitive than other tools even when sensitivity was reduced to improve run time performance.

Conclusions
We present SRPRISM as a flexible read mapping tool that provides explicit guarantees on results.