Measuring Branch Support in Species Trees Obtained by Gene Tree Parsimony
Author(s) - Simon Joly, Anne Bruneau
Publication year - 2009
Publication title - Systematic Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 7.128
H-Index - 182
eISSN - 1076-836X
pISSN - 1063-5157
DOI - 10.1093/sysbio/syp013
Subject(s) - coalescent theory, biology, tree (set theory), phylogenetic tree, sampling (signal processing), statistics, evolutionary biology, context (archaeology), nonparametric statistics, gene, mathematics, genetics, combinatorics, computer science, paleontology, filter (signal processing), computer vision
Several methods have recently been developed that allow the reconstruction of species trees from gene trees, an important achievement in our ongoing quest to obtain reliable species phylogenies. However, considerably less attention has been given to evaluating the accuracy of species tree estimates. This study tests four methods for measuring branch support of species trees in a gene tree parsimony framework: 1) bootstrapping lineages (sequences) within species (BL), 2) bootstrapping characters within genes (BC; i.e., the standard nonparametric bootstrap), 3) bootstrapping both lineages and characters (BLC), and 4) posterior probability gene tree sampling (PPGTS), in which, for each resampled data set, gene trees are sampled according to their posterior probability. For each method, n species trees are reconstructed from n resampled data sets, and the support for a branch is the percentage of the n species trees in which that branch is recovered. The 4 methods were tested for several species trees and for different sampling efforts (i.e., number of genes and individuals sampled) using coalescent simulations. PPGTS performed best overall, with the lowest Type I and Type II error rates, followed by BLC. The BL and BC methods had higher error rates. This suggests that, to properly measure branch support in a species tree context, it is important to account both for the uncertainty involved in reconstructing gene trees from DNA sequences and for that involved in reconstructing the species tree from individual gene trees. With the parameters used in the simulations, sampling more individuals per species improved support values about as much as sampling more genes. Moreover, sampling more individuals per species appeared to be important for escaping the anomaly zone present when only 1 sequence was sampled. We also apply the 4 methods to obtain branch supports for the species phylogeny of diploid wild roses (Rosa) in North America.
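The resampling and support calculation described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the dictionary-based alignment, and the representation of a species tree as a set of clades (each clade a frozenset of species names) are all assumptions made for illustration. The gene tree parsimony step that turns each resampled data set into a species tree is not shown.

```python
import random
from collections import Counter


def bootstrap_characters(alignment, rng):
    """BC step (assumed form): resample alignment columns with replacement.

    `alignment` maps sequence names to equal-length strings.
    """
    n_sites = len(next(iter(alignment.values())))
    cols = [rng.randrange(n_sites) for _ in range(n_sites)]
    return {name: "".join(seq[c] for c in cols) for name, seq in alignment.items()}


def bootstrap_lineages(species_to_seqs, rng):
    """BL step (assumed form): resample sequences within each species with replacement.

    `species_to_seqs` maps a species name to the list of its sequence names.
    The returned lists may contain the same sequence name more than once.
    """
    return {sp: [rng.choice(names) for _ in names]
            for sp, names in species_to_seqs.items()}


def branch_support(species_trees):
    """Percentage of the n species trees in which each clade is recovered.

    Each tree is a set of clades; each clade is a frozenset of species names.
    """
    n = len(species_trees)
    counts = Counter(clade for tree in species_trees for clade in tree)
    return {clade: 100.0 * k / n for clade, k in counts.items()}


# Hypothetical example: four species trees, as if obtained from four resampled data sets.
trees = [
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"C", "D"})},
    {frozenset({"A", "C"}), frozenset({"A", "B", "C"})},
    {frozenset({"A", "B"}), frozenset({"A", "B", "C"})},
]
support = branch_support(trees)
```

In this toy example the clade {A, B} appears in three of the four trees, so its support is 75%. Under these assumptions, a BLC replicate would apply `bootstrap_lineages` and then `bootstrap_characters` to the selected sequences before the gene trees and species tree are rebuilt.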