z-logo
open-access-imgOpen Access
Direct comparison of performance of single nucleotide variant calling in human genome with alignment-based and assembly-based approaches
Author(s) -
Leihong Wu,
Gökhan Yavaş,
Huixiao Hong,
Wenming Xiao
Publication year - 2017
Publication title -
scientific reports
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.24
H-Index - 213
ISSN - 2045-2322
DOI - 10.1038/s41598-017-10826-9
Subject(s) - computational biology , human genome , genome , computer science , genetics , biology , gene
Complementary to reference-based variant detection, recent studies revealed that many novel variants could be detected with de novo assembled genomes. To evaluate the effect of reads coverage and the accuracy of assembly-based variant calling, we simulated short reads containing more than 3 million of single nucleotide variants (SNVs) from the whole human genome and compared the efficiency of SNV calling between the assembly-based and alignment-based calling approaches. We assessed the quality of the assembled contig and found that a minimum of 30X coverage of short reads was needed to ensure reliable SNV calling and to generate assembled contigs with a good coverage of genome and genes. In addition, we observed that the assembly-based approach had a much lower recall rate and precision comparing to the alignment-based approach that would recover 99% of imputed SNVs. We observed similar results with experimental reads for NA24385, an individual whose germline variants were well characterized. Although there are additional values for SNVs detection, the assembly-based approach would have great risk of false discovery of novel SNVs. Further improvement of de novo assembly algorithms are needed in order to warrant a good completeness of genome with haplotype resolved and high fidelity of assembled sequences.

The content you want is available to Zendy users.

Already have an account? Click here to sign in.
Having issues? You can contact us here