Phylogenomics for Systematic Biology
Author(s) - David Posada
Publication year - 2016
Publication title - Systematic Biology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 7.128
H-Index - 182
eISSN - 1076-836X
pISSN - 1063-5157
DOI - 10.1093/sysbio/syw027
Subject(s) - phylogenomics, biology, evolutionary biology, computational biology, phylogenetics, genetics, clade, gene
Next-generation sequencing (NGS) techniques have deeply impacted multiple research areas in biology. In molecular systematics, NGS has moved the field from analyses based on a few loci (phylogenetics) to the use of hundreds or thousands of loci (phylogenomics). However, while massive multilocus data sets should facilitate the resolution of complex phylogenetic problems, more data is not a panacea (Delsuc et al. 2005; Jeffroy et al. 2006; Roger and Hug 2006). Although large data sets will reduce sampling error, in the presence of systematic biases they could also lead to wrong answers with strong statistical support (Phillips et al. 2004; Nishihara et al. 2007; Rodríguez-Ezpeleta et al. 2007; Kumar et al. 2012). At the same time, the wealth of data resulting from NGS has forced us to stop ignoring phylogenetic incongruence (Jeffroy et al. 2006; Galtier and Daubin 2008; Salichos and Rokas 2013) and to reconsider the difference between gene trees and species trees (Goodman et al. 1979), to the point that we are witnessing a methodological and conceptual shift (Degnan and Rosenberg 2009; Edwards 2009; Knowles 2009). Hence, phylogenomic analysis not only implies new technical capabilities, but also comes with an explicit recognition of processes like incomplete lineage sorting (ILS), gene duplication and loss (GDL), and horizontal gene transfer (HGT) (Maddison 1997; Page and Charleston 1997; Slowinski and Page 1999). Indeed, the phylogenomic pipeline can be very complex, involving multiple challenges concerning the acquisition, manipulation, analysis, and interpretation of massive data sets, including the design of appropriate sequencing strategies, the identification of homologous/orthologous loci, model partitioning among multiple loci, and gene/species tree reconstruction. Important open questions remain regarding the best strategies for all of these steps.
With the idea of learning about common problems and potential solutions for some of these questions, I organized a symposium entitled “Current Advances and Challenges in Practical Phylogenomics” at the 2013 Evolution meeting in Snowbird, Utah (USA) under the auspices of the Society of Systematic Biologists. The word “practical” in the title reflected my intention to push the speakers to tackle on stage some of the real hurdles phylogeneticists face in their daily life when analyzing genome-wide data. My hope was that the audience would leave Snowbird’s Ballroom 2 that day with some ideas that would change to some extent the way they construct and/or analyze their phylogenomic data sets. The symposium included six talks that embraced different aspects of the phylogenomic endeavor, from data acquisition to data analysis. Of course, not all phylogenomic problems were addressed. The speakers formed a diverse group of people encompassing different orientations (biology, computer science, statistics), at distinct career stages (from graduate students to professors), from various parts of the world and representing a mix of genders. The first two talks were related to different strategies for gathering phylogenomic data and their implications. Alan Lemmon (Florida State University, USA; “Anchored phylogenomics and the power of hybrid enrichment data for phylogenetics”) broke the ice by describing his methodology for the efficient acquisition of genome-wide loci across multiple species, contrasting it with similar approaches like ultraconserved elements (e.g., Faircloth et al. 2012; Gilbert et al. 2015) and exon capture (e.g., Bi et al. 2012; Bragg et al. 2015; Manthey et al. 2016).
Next, Mike Harvey (Louisiana State University, USA; “SNPs versus sequences for phylogeography – an exploration using simulations and massively parallel sequencing in a non-model bird”) compared the demographic inferences obtained from the same individuals using single nucleotide polymorphisms (SNPs) or a genotyping-by-sequencing approach (Elshire et al. 2011). But phylogenomic matrices are often incomplete, and in the third talk, Lacey Knowles (University of Michigan, USA; “What to do with missing data in next-generation sequences? Unforeseen sampling effects on species-tree analyses”) characterized the effect of missing data on the estimated species relationships. The second half of the symposium focused on novel methods for the estimation of species trees from genome-wide data. Leonardo de Oliveira Martins (University of Vigo, Spain; “A probabilistic parsimonious model for species tree reconstruction”) presented a Bayesian method for the reconstruction of species trees able to deal (nonparametrically) with ILS, GDL, and HGT. After that, Tandy Warnow (University of Texas at Austin; “Naive Binning Improves Phylogenomic Analysis”) explained a new approach to species tree