Incorporation of Unique Molecular Identifiers in TruSeq Adapters Improves the Accuracy of Quantitative Sequencing
Author(s) - Jungeui Hong, David Gresham
Publication year - 2017
Publication title - BioTechniques
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.617
H-Index - 131
eISSN - 1940-9818
pISSN - 0736-6205
DOI - 10.2144/000114608
Subject(s) - adapter (computing) , computational biology , dna sequencing , biology , reference genome , genetics , dna , computer science , operating system
Quantitative analysis of next-generation sequencing (NGS) data requires discriminating duplicate reads generated by PCR from identical molecules that are of unique origin. Typically, PCR duplicates are identified as sequence reads that align to the same genomic coordinates using reference-based alignment. However, identical molecules can be independently generated during library preparation. Misidentification of these molecules as PCR duplicates can introduce unforeseen biases during analyses. Here, we developed a cost-effective sequencing adapter design by modifying Illumina TruSeq adapters to incorporate a unique molecular identifier (UMI) while maintaining the capacity to undertake multiplexed, single-index sequencing. Incorporation of UMIs into TruSeq adapters (TrUMIseq adapters) enables identification of bona fide PCR duplicates as identically mapped reads with identical UMIs. Using TrUMIseq adapters, we show that accurate removal of PCR duplicates results in improved accuracy of both allele frequency (AF) estimation in heterogeneous populations using DNA sequencing and gene expression quantification using RNA-Seq.
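The distinction the abstract draws between coordinate-only and UMI-aware duplicate removal can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the `Read` record, the function names, the toy reads, and the 8-nt UMI strings are all hypothetical, and the sketch assumes UMIs are free of sequencing errors (real tools typically tolerate mismatches).

```python
from collections import namedtuple

# Hypothetical minimal read record; real pipelines would parse these fields
# from an alignment file rather than construct them by hand.
Read = namedtuple("Read", ["name", "chrom", "pos", "strand", "umi"])


def dedup_by_coordinates(reads):
    """Keep the first read seen at each (chrom, pos, strand).

    This mimics conventional duplicate marking: any reads mapping to the
    same coordinates are treated as PCR duplicates.
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.pos, read.strand)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


def dedup_by_coordinates_and_umi(reads):
    """Keep the first read seen at each (chrom, pos, strand, UMI).

    Reads count as PCR duplicates only if they map identically AND carry
    the same UMI. Assumes exact UMI matching (no error correction).
    """
    seen = set()
    kept = []
    for read in reads:
        key = (read.chrom, read.pos, read.strand, read.umi)
        if key not in seen:
            seen.add(key)
            kept.append(read)
    return kept


# Toy data (made up for illustration).
reads = [
    Read("r1", "chrI", 1000, "+", "ACGTACGT"),
    Read("r2", "chrI", 1000, "+", "ACGTACGT"),  # same position, same UMI
    Read("r3", "chrI", 1000, "+", "TTGACCGA"),  # same position, different UMI
    Read("r4", "chrI", 2500, "-", "ACGTACGT"),  # different position
]

coord_only = dedup_by_coordinates(reads)
with_umi = dedup_by_coordinates_and_umi(reads)

print([r.name for r in coord_only])
print([r.name for r in with_umi])
```

In this toy case, coordinate-only deduplication discards `r3` along with `r2`, whereas the UMI-aware version retains `r3` as an independently generated molecule. Undercounting of this kind is what would bias allele frequency or expression estimates.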