Open Access

Properties of Markov Chain Monte Carlo Performance across Many Empirical Alignments

Author(s) - Sean Harrington, Van Wishingrad, Robert C. Thomson

Publication year - 2020

Publication title - Molecular Biology and Evolution

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 6.637

H-Index - 218

eISSN - 1537-1719

pISSN - 0737-4038

DOI - 10.1093/molbev/msaa295

Subject(s) - biology, Markov chain Monte Carlo, Markov chain, evolutionary biology, statistical physics, Monte Carlo method, computational biology, statistics, mathematics, physics

Nearly all current Bayesian phylogenetic applications rely on Markov chain Monte Carlo (MCMC) methods to approximate the posterior distribution for trees and other parameters of the model. These approximations are only reliable if Markov chains adequately converge and sample from the joint posterior distribution. Although several studies of phylogenetic MCMC convergence exist, these have focused on simulated data sets or select empirical examples. Therefore, much that is considered common knowledge about MCMC in empirical systems derives from a relatively small family of analyses under ideal conditions. To address this, we present an overview of commonly applied phylogenetic MCMC diagnostics and an assessment of patterns of these diagnostics across more than 18,000 empirical analyses. Many analyses appeared to perform well, and failures in convergence were most likely to be detected using the average standard deviation of split frequencies, a diagnostic that compares topologies among independent chains. Different diagnostics yielded different information about failed convergence, demonstrating that multiple diagnostics must be employed to reliably detect problems. The number of taxa and average branch lengths in analyses have clear impacts on MCMC performance, with more taxa and shorter branches leading to more difficult convergence. We show that the use of models that include both Γ-distributed among-site rate variation and a proportion of invariable sites is not broadly problematic for MCMC convergence but is also unnecessary. Changes to heating and the use of model-averaged substitution models can both offer improved convergence in some cases, but neither is a panacea.
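To illustrate the diagnostic named in the abstract, the following is a minimal Python sketch of an average standard deviation of split frequencies (ASDSF) calculation across independent runs. It is not the authors' code and not the implementation of any particular phylogenetics package. The function names `split_frequencies` and `asdsf`, the representation of each sampled tree as a set of splits (each split a `frozenset` of taxon names), the 0.10 minimum-frequency cutoff, and the use of the sample (n - 1) standard deviation are all assumptions made for this example.

```python
from statistics import stdev


def split_frequencies(trees):
    """Return the fraction of sampled trees in one run that contain each split.

    `trees` is a list of sampled trees; each tree is a set of splits, and each
    split is a frozenset of taxon names (a hypothetical representation).
    """
    counts = {}
    for tree in trees:
        for split in tree:
            counts[split] = counts.get(split, 0) + 1
    n = len(trees)
    return {split: count / n for split, count in counts.items()}


def asdsf(runs, min_freq=0.10):
    """Average, over splits, of the standard deviation of split frequencies.

    `runs` is a list of independent runs, each a list of sampled trees.
    Splits whose frequency is below `min_freq` in every run are ignored
    (the cutoff value is an assumption of this sketch).
    """
    if len(runs) < 2:
        raise ValueError("ASDSF needs at least two independent runs")
    freqs = [split_frequencies(trees) for trees in runs]
    all_splits = set().union(*freqs)  # union of the split keys of every run
    deviations = []
    for split in all_splits:
        values = [f.get(split, 0.0) for f in freqs]
        if max(values) >= min_freq:
            # Sample standard deviation across runs (assumed convention).
            deviations.append(stdev(values))
    return sum(deviations) / len(deviations) if deviations else 0.0


# Toy data: two runs over taxa A, B, C, D with one informative split per tree.
AB = frozenset({"A", "B"})
AC = frozenset({"A", "C"})
run_1 = [{AB}, {AB}, {AB}, {AB}]
run_2 = [{AB}, {AB}, {AC}, {AC}]
```

Calling `asdsf([run_1, run_1])` on two identical runs gives zero, because every split has the same frequency in both. Calling `asdsf([run_1, run_2])` gives a positive value, because the runs disagree on how often each split is sampled. Values near zero indicate that the independent chains are sampling similar topologies.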
