Comparing Two Bayesian Methods for Gene Tree/Species Tree Reconstruction: Simulations with Incomplete Lineage Sorting and Horizontal Gene Transfer
Author(s) -
Yujin Chung,
Cécile Ané
Publication year - 2011
Publication title -
systematic biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 7.128
H-Index - 182
eISSN - 1076-836X
pISSN - 1063-5157
DOI - 10.1093/sysbio/syr003
Subject(s) - coalescent theory , clade , tree (set theory) , phylogenetic tree , biology , horizontal gene transfer , concordance , lineage (genetic) , bayesian probability , gene , evolutionary biology , phylogenetics , computational biology , supermatrix , genetics , statistics , mathematics , combinatorics , current algebra , affine lie algebra , pure mathematics , algebra over a field
With the increasing interest in recognizing the discordance between gene genealogies, various gene tree/species tree reconciliation methods have been developed. We present here the first attempt to assess and compare two such Bayesian methods, Bayesian estimation of species trees (BEST) and BUCKy (Bayesian untangling of concordance knots), in the presence of several known processes of gene tree discordance. DNA alignments were simulated under the influence of incomplete lineage sorting (ILS) and of horizontal gene transfer (HGT). BEST and BUCKy both account for uncertainty in gene tree estimation but differ substantially in their assumptions of what caused gene tree discordance. BEST estimates a species tree using the coalescent model, assuming that all gene tree discordance is due to ILS. BUCKy does not assume any specific biological process of gene tree discordance through the use of a nonparametric clustering of concordant genes. BUCKy estimates the concordance factor (CF) of a clade, which is defined as the proportion of genes that truly have the clade in their trees. The estimated concordance tree is then built from clades with the highest estimated CFs. Because of their different assumptions, it was expected that BEST would perform better in the presence of ILS and that BUCKy would perform better in the presence of HGT. As expected, the species tree was more accurately reconstructed by BUCKy in the presence of HGT, when the HGT events were unevenly placed across the species tree. BUCKy and BEST performed similarly in most other cases, including in the presence of strong ILS and of HGT events that were evenly placed across the tree. However, BUCKy was shown to underestimate the uncertainty in CF estimation, with short credibility intervals. Despite this, the discordance pattern estimated by BUCKy could be compared with the signature of ILS. The resulting test for the adequacy of the coalescent model proved to have low Type I error. It was powerful when HGT was the major source of discordance and when HGT events were unevenly placed across the species tree.
Accelerating Research
Robert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom
Address
John Eccles HouseRobert Robinson Avenue,
Oxford Science Park, Oxford
OX4 4GP, United Kingdom