Uncertain-tree: discriminating among competing approaches to the phylogenetic analysis of phenotype data
Author(s) -
Mark N. Puttick,
Joseph O’Reilly,
Alastair R. Tanner,
James F. Fleming,
James Clark,
Lucy Holloway,
Jesús Lozano-Fernández,
Luke A. Parry,
James E. Tarver,
Davide Pisani,
Philip C. J. Donoghue
Publication year - 2017
Publication title -
Proceedings of the Royal Society B: Biological Sciences
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.342
H-Index - 253
eISSN - 1471-2954
pISSN - 0962-8452
DOI - 10.1098/rspb.2016.2290
Subject(s) - phylogenetic tree, maximum parsimony, bayesian probability, character (mathematics), tree (set theory), computer science, artificial intelligence, tree rearrangement, clade, statistics, mathematics, pattern recognition (psychology), machine learning, biology, combinatorics, gene, biochemistry, geometry
Morphological data provide the only means of classifying the majority of life's history, but the choice between competing phylogenetic methods for the analysis of morphology is unclear. Traditionally, parsimony methods have been favoured but recent studies have shown that these approaches are less accurate than the Bayesian implementation of the Mk model. Here we expand on these findings in several ways: we assess the impact of tree shape and maximum-likelihood estimation using the Mk model, as well as analysing data composed of both binary and multistate characters. We find that all methods struggle to correctly resolve deep clades within asymmetric trees, and when analysing small character matrices. The Bayesian Mk model is the most accurate method for estimating topology, but with lower resolution than other methods. Equal weights parsimony is more accurate than implied weights parsimony, and maximum-likelihood estimation using the Mk model is the least accurate method. We conclude that the Bayesian implementation of the Mk model should be the default method for phylogenetic estimation from phenotype datasets, and we explore the implications of our simulations in reanalysing several empirical morphological character matrices. A consequence of our finding is that high levels of resolution or the ability to classify species or groups with much confidence should not be expected when using small datasets. It is now necessary to depart from the traditional parsimony paradigms of constructing character matrices, towards datasets constructed explicitly for Bayesian methods.