The effect of undetected recombination on genealogy sampling and inference under an isolation‐with‐migration model
Author(s) - Hey Jody, Wang Katherine
Publication year - 2019
Publication title - Molecular Ecology Resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.13083
Subject(s) - coalescent theory , recombination , biology , sampling (signal processing) , gamete , inference , statistics , filter (signal processing) , evolutionary biology , mathematics , statistical physics , algorithm , genetics , computer science , physics , phylogenetics , artificial intelligence , gene , sperm , computer vision
Many methods for fitting demographic models to data sets of aligned sequences rely upon an assumption that the data have a branching coalescent history without recombination within regions or loci. To mitigate the effects of the failure of this assumption, a common approach is to filter data and sample regions that pass the four‐gamete criterion for recombination, an approach that allows the data to be analysed but that is expected to detect only a minority of recombination events. A series of empirical tests of this approach was conducted using computer simulations with and without recombination for a variety of isolation‐with‐migration (IM) models for two and three populations. Only the IMa3 program was used, but the general results should apply to related genealogy‐sampling‐based methods for IM models or subsets of IM models. It was found that the details of how intervals that pass a four‐gamete filter are sampled have a moderate effect, and that schemes that use the longest intervals, or that use overlapping intervals, gave poorer results. A simple approach of using a random nonoverlapping interval returned the smallest difference between results with and without recombination, with the mean difference between parameter estimates usually less than 20% of the true value (usually much less). However, the posterior probability distributions for migration rates were flatter with recombination, suggesting that filtering based on the four‐gamete criterion, while necessary for methods like these, leads to reduced resolution on migration. A distinct, alternative approach of using a finite‐sites mutation model and not filtering the data performed quite poorly.
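
To make the filtering step concrete, the following is a minimal Python sketch of a four‐gamete filter followed by a random choice among nonoverlapping intervals. It is an illustration only, not the procedure used in the study or in IMa3. It assumes biallelic sites coded as the characters 0 and 1, haplotypes given as equal‐length strings, and a simple greedy left‐to‐right partition; the function names `four_gametes`, `compatible_intervals` and `pick_random_interval` are hypothetical. The underlying idea is that, under an infinite‐sites mutation model without recombination, a pair of biallelic sites can show at most three of the four possible two‐site haplotypes, so observing all four is taken as evidence of recombination between them.

```python
import random


def four_gametes(haplotypes, i, j):
    """Return True if sites i and j together show all four two-site haplotypes.

    Assumes biallelic sites, so at most four distinct pairs can occur.
    """
    return len({(h[i], h[j]) for h in haplotypes}) == 4


def compatible_intervals(haplotypes):
    """Greedily split the alignment into nonoverlapping half-open intervals
    [start, end) in which no pair of sites fails the four-gamete test.

    This partitioning rule is an assumption made for illustration.
    """
    n_sites = len(haplotypes[0])
    intervals = []
    start = 0
    for j in range(1, n_sites):
        # Close the current interval as soon as site j conflicts with any
        # site already inside it, then start a new interval at site j.
        if any(four_gametes(haplotypes, i, j) for i in range(start, j)):
            intervals.append((start, j))
            start = j
    intervals.append((start, n_sites))
    return intervals


def pick_random_interval(intervals, rng=None):
    """Pick one nonoverlapping interval uniformly at random."""
    rng = rng or random.Random()
    return rng.choice(intervals)


if __name__ == "__main__":
    # Toy alignment: sites 0 and 1 show all four gametes; site 2 is invariant.
    sample = ["000", "010", "100", "110"]
    blocks = compatible_intervals(sample)
    print(blocks)
    print(pick_random_interval(blocks, random.Random(1)))
```

Passing this filter does not show that an interval is free of recombination; as noted above, the criterion is expected to detect only a minority of recombination events.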