Inference of Gene Flow in the Process of Speciation: An Efficient Maximum-Likelihood Method for the Isolation-with-Initial-Migration Model
Author(s) -
Rui J. Costa,
Hilde WilkinsonHerbots
Publication year - 2017
Publication title -
genetics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.792
H-Index - 246
eISSN - 1943-2631
pISSN - 0016-6731
DOI - 10.1534/genetics.116.188060
Subject(s) - gene flow , biology , divergence (linguistics) , inference , isolation (microbiology) , population , genetics , genetic algorithm , flow (mathematics) , evolutionary biology , gene , mathematics , computer science , genetic variation , bioinformatics , linguistics , philosophy , demography , artificial intelligence , sociology , geometry
The isolation-with-migration (IM) model is commonly used to make inferences about gene flow during speciation, using polymorphism data. However, it has been reported that the parameter estimates obtained by fitting the IM model are very sensitive to the model's assumptions-including the assumption of constant gene flow until the present. This article is concerned with the isolation-with-initial-migration (IIM) model, which drops precisely this assumption. In the IIM model, one ancestral population divides into two descendant subpopulations, between which there is an initial period of gene flow and a subsequent period of isolation. We derive a very fast method of fitting an extended version of the IIM model, which also allows for asymmetric gene flow and unequal population sizes. This is a maximum-likelihood method, applicable to data on the number of segregating sites between pairs of DNA sequences from a large number of independent loci. In addition to obtaining parameter estimates, our method can also be used, by means of likelihood-ratio tests, to distinguish between alternative models representing the following divergence scenarios: (a) divergence with potentially asymmetric gene flow until the present, (b) divergence with potentially asymmetric gene flow until some point in the past and in isolation since then, and (c) divergence in complete isolation. We illustrate the procedure on pairs of Drosophila sequences from ∼30,000 loci. The computing time needed to fit the most complex version of the model to this data set is only a couple of minutes. The R code to fit the IIM model can be found in the supplementary files of this article.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom