Premium
Posterior predictive checks of coalescent models: P2C2M, an R package
Author(s) -
Gruenstaeudl Michael,
Reid Noah M.,
Wheeler Gregory L.,
Carstens Bryan C.
Publication year - 2016
Publication title -
molecular ecology resources
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.96
H-Index - 136
eISSN - 1755-0998
pISSN - 1755-098X
DOI - 10.1111/1755-0998.12435
Subject(s) - coalescent theory , approximate bayesian computation , replication (statistics) , r package , biology , statistics , posterior probability , statistical inference , bayesian probability , consistency (knowledge bases) , computer science , inference , phylogenetic tree , mathematics , artificial intelligence , genetics , gene
Bayesian inference operates under the assumption that the empirical data are a good statistical fit to the analytical model, but this assumption can be challenging to evaluate. Here, we introduce a novel r package that utilizes posterior predictive simulation to evaluate the fit of the multispecies coalescent model used to estimate species trees. We conduct a simulation study to evaluate the consistency of different summary statistics in comparing posterior and posterior predictive distributions, the use of simulation replication in reducing error rates and the utility of parallel process invocation towards improving computation times. We also test P2C2M on two empirical data sets in which hybridization and gene flow are suspected of contributing to shared polymorphism, which is in violation with the coalescent model: Tamias chipmunks and Myotis bats. Our results indicate that (i) probability‐based summary statistics display the lowest error rates, (ii) the implementation of simulation replication decreases the rate of type II errors, and (iii) our r package displays improved statistical power compared to previous implementations of this approach. When probabilistic summary statistics are used, P2C2M corroborates the assumption that genealogies collected from Tamias and Myotis are not a good fit to the multispecies coalescent model. Taken as a whole, our findings argue that an assessment of the fit of the multispecies coalescent model should accompany any phylogenetic analysis that estimates a species tree.