Accounting for Bias from Sequencing Error in Population Genetic Estimates
Author(s) - Philip L. Johnson, Montgomery Slatkin
Publication year - 2007
Publication title - Molecular Biology and Evolution
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 6.637
H-Index - 218
eISSN - 1537-1719
pISSN - 0737-4038
DOI - 10.1093/molbev/msm239
Subject(s) - biology, rule of thumb, cutoff, sanger sequencing, population, statistics, sequence (biology), dna sequencing, econometrics, computational biology, genetics, computer science, algorithm, mathematics, demography, gene, physics, quantum mechanics, sociology
Sequencing error presents a significant challenge to population genetic analyses using low-coverage sequence in general and single-pass reads in particular. Bias in parameter estimates becomes severe when the level of polymorphism (signal) is low relative to the amount of error (noise). Choosing an arbitrary quality score cutoff yields biased estimates, particularly with newer, non-Sanger sequencing technologies that have different quality score distributions. We propose a rule of thumb to judge when a given threshold will lead to significant bias and suggest alternative approaches that reduce bias.
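The signal-versus-noise comparison in the abstract can be illustrated with a minimal sketch. It is not the rule of thumb proposed in the paper, which this record does not state. It relies only on the standard Phred definition, under which a quality score Q corresponds to a per-base error probability of 10^(-Q/10). The function names and the per-site diversity value of 0.001 are hypothetical, chosen purely for illustration.

```python
def phred_to_error_prob(q):
    """Convert a Phred quality score Q to a per-base error probability, 10^(-Q/10)."""
    return 10 ** (-q / 10)


def noise_to_signal(q_cutoff, diversity_per_site):
    """Ratio of the worst-case error rate at a quality cutoff to per-site diversity.

    Hypothetical summary statistic: bases that pass the cutoff have an error
    probability of at most 10^(-Q/10), so this ratio is an upper bound on
    noise relative to signal. Values near or above 1 suggest errors could
    swamp true polymorphism.
    """
    return phred_to_error_prob(q_cutoff) / diversity_per_site


# Illustrative diversity value (an assumption, not taken from the paper).
ASSUMED_DIVERSITY = 0.001

for q in (10, 20, 30, 40):
    ratio = noise_to_signal(q, ASSUMED_DIVERSITY)
    print(f"Q{q}: error <= {phred_to_error_prob(q):.0e}, noise/signal <= {ratio:.1f}")
```

Under these assumed numbers, a cutoff of Q20 allows up to ten errors per true polymorphic difference, which shows why a conventional cutoff can bias estimates when diversity is low.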