Estimation of historical effective population size using linkage disequilibria with marker data
Author(s) - Corbin L.J., Liu A.Y.H., Bishop S.C., Woolliams J.A.
Publication year - 2012
Publication title - Journal of Animal Breeding and Genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.689
H-Index - 51
eISSN - 1439-0388
pISSN - 0931-2668
DOI - 10.1111/j.1439-0388.2012.01003.x
Subject(s) - linkage disequilibrium, sample size determination, mathematics, statistics, consistency (knowledge bases), a priori and a posteriori, constant (computer programming), population, population size, function (biology), linkage (software), econometrics, computer science, genetics, biology, single nucleotide polymorphism, discrete mathematics, demography, genotype, philosophy, epistemology, sociology, gene, programming language
Summary - Theory hypothesizes that the rate of decline in linkage disequilibrium (LD) as a function of distance between markers, measured by r^2, can be used to estimate effective population size (N_e) and how it varies over time. The development of high-density genotyping makes feasible the application of this theory and has provided an impetus to improve predictions. This study considers the impact of several developments on the estimation of N_e using both simulated and equine high-density single-nucleotide polymorphism data, when N_e is assumed to be constant a priori and when it is not. In all models, estimates of N_e were highly sensitive to thresholds imposed upon minor allele frequency (MAF) and to a priori assumptions on the expected r^2 for adjacent markers. Where constant N_e was assumed a priori, estimates with the lowest mean square error were obtained with MAF thresholds between 0.05 and 0.10, adjustment of r^2 for finite sample size, estimation of a [the limit for r^2 as recombination frequency (c) approaches 0] and relating N_e to c(1 - c/2). The findings for predicting N_e from models allowing variable N_e were much less clear, apart from the desirability of correcting for finite sample size, and the lack of consistency in estimating recent N_e (<7 generations) where estimates use data with large c. The theoretical conflicts over how estimation should proceed and uncertainty over where predictions might be expected to fit well suggest that the estimation of N_e when it varies be carried out with extreme caution.
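As a rough illustration of the constant-N_e case described in the summary, the sketch below fits a Sved-type expectation, E[r^2] = 1 / (a + 4 N_e c'), with c' = c(1 - c/2). The summary does not state this equation, so its exact form is an assumption here, as are the 1/n sample-size correction, the reciprocal-regression fitting shortcut, and every function and parameter name (all hypothetical). It is not the authors' implementation.

```python
# Minimal sketch, NOT the authors' method: assumes the Sved-type model
#   E[r^2] = 1 / (a + 4 * N_e * c'),  with c' = c * (1 - c / 2),
# and a simple 1/n correction of r^2 for finite sample size.


def effective_distance(c):
    """Map recombination frequency c to c(1 - c/2), as named in the summary."""
    return c * (1.0 - c / 2.0)


def adjust_r2(r2, n):
    """Subtract an assumed 1/n sampling term (n = hypothetical sample size)."""
    return r2 - 1.0 / n


def fit_constant_ne(c_values, r2_values, n):
    """Estimate (N_e, a) by ordinary least squares on the reciprocal scale.

    Under the assumed model, 1 / r2_adjusted = a + 4 * N_e * c', so the
    intercept estimates a and the slope estimates 4 * N_e. Regressing on
    reciprocals is a simplification chosen for brevity; adjusted r^2 values
    must be strictly positive for it to work.
    """
    xs = [effective_distance(c) for c in c_values]
    ys = [1.0 / adjust_r2(r2, n) for r2 in r2_values]
    m = len(xs)
    mean_x = sum(xs) / m
    mean_y = sum(ys) / m
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope / 4.0, intercept


if __name__ == "__main__":
    # Noise-free synthetic data generated from the assumed model itself.
    true_ne, true_a, n = 100.0, 2.0, 50
    cs = [0.001, 0.005, 0.01, 0.05]
    r2s = [1.0 / (true_a + 4.0 * true_ne * effective_distance(c)) + 1.0 / n
           for c in cs]
    ne_hat, a_hat = fit_constant_ne(cs, r2s, n)
    print(ne_hat, a_hat)
```

Estimating the intercept a from the data, rather than fixing it in advance, mirrors the summary's point that estimates are sensitive to a priori assumptions about the expected r^2 for adjacent markers. With real marker data, noise in r^2 and the choice of MAF threshold would affect the fit in ways this toy example does not capture.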