
A broad survey of DNA sequence data simulation tools
Author(s) -
Shatha Alosaimi,
Armand Bandiang,
Noëlle van Biljon,
Denis Awany,
Prisca K Thami,
Milaine S S Tchamga,
Anmol Kiran,
Olfa Messaoud,
Radia Hassan,
Jacquiline Wangui Mugo,
Azza E. Ahmed,
Christian Domilongo Bope,
Imane Allali,
Gaston K. Mazandu,
Nicola Mulder,
Emile R. Chimusa
Publication year - 2019
Publication title - Briefings in Functional Genomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.22
H-Index - 67
eISSN - 2041-2647
pISSN - 2041-2649
DOI - 10.1093/bfgp/elz033
Subject(s) - sequence (biology) , documentation , in silico , computer science , dna sequencing , computational biology , biology , data mining , data science , bioinformatics , dna , programming language , genetics , gene
In silico DNA sequence generation is a powerful approach for evaluating and validating bioinformatics tools, and more than 35 DNA sequence simulation tools have accordingly been developed. With such a diverse array of tools to choose from, an important question is: which tool should be used for a desired outcome? This question is largely unanswered because documentation for many of these DNA simulation tools is sparse. To address this, we reviewed the DNA sequence simulation tools developed to date and evaluated 20 state-of-the-art tools on their ability to produce accurate reads based on their implemented sequence error model. We provide a succinct description of each tool and suggest which tool is most appropriate for different scenarios. Given the multitude of similar yet non-identical tools, researchers can use this review as a guide to inform their choice of DNA sequence simulation tool. This paves the way towards assessing existing tools in a unified framework, as well as enabling the analysis of different simulation scenarios within the same framework.