Phylogenomics of Annelida revisited: a cladistic approach using genome‐wide expressed sequence tag data mining and examining the effects of missing data
Author(s) -
Kvist Sebastian,
Siddall Mark E.
Publication year - 2013
Publication title -
Cladistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.323
H-Index - 92
eISSN - 1096-0031
pISSN - 0748-3007
DOI - 10.1111/cla.12015
Subject(s) - phylogenomics, probabilistic logic, phylogenetic tree, set (abstract data type), redundancy (engineering), data set, biology, missing data, consistency (knowledge bases), computational biology, resampling, data mining, computer science, evolutionary biology, artificial intelligence, genetics, machine learning, gene, clade, operating system, programming language
We present phylogenomic analyses of the most comprehensive molecular character set compiled for Annelida and its constituent taxa, including over 347 000 aligned nucleotide sites for 39 taxa. The nucleotide data set was recovered using a pre‐existing amino acid data set of almost 48 000 aligned sites as a backbone for tBLASTn searches against NCBI. In addition, orthology determinations of the loci in the original amino acid data set were scrutinized using an All vs All Reciprocal Best Hit approach, employing BLASTp, and examining for statistical interdependency among the loci. This approach revealed considerable sequence redundancy among the loci in the original data set, and a new data set was compiled with the redundancy removed. The newly compiled nucleotide data set, the original amino acid data set, and the new reduced amino acid data set were subjected to parsimony analyses and two forms of bootstrap resampling. The last‐named data set was also analysed using a maximum‐likelihood approach. There were two main objectives to these analyses: (i) to examine the general topology, including support, resulting from the analyses of the new data sets and (ii) to assess the consistency of the branching patterns across optimality criteria by comparison with previous probabilistic approaches. The phylogenetic hypotheses resulting from analyses of the three data sets are largely unsupported, reflecting the continued difficulty of finding numerous, reliable, and suitable loci for a group as ancient as Annelida. The resulting parsimonious hypotheses disagree, in some respects, with the previous probabilistic approaches; Sedentaria and, in most cases, Errantia are not supported as monophyletic groups, but Pleistoannelida is recovered as an (unsupported) monophyletic group in one of the three parsimony analyses as well as the likelihood analysis.
In addition, we performed missing data titration studies to estimate the impact of missing data on overall support and support for specific clades.
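
The abstract names an All vs All Reciprocal Best Hit screen but does not describe its mechanics. As a rough illustration only, the general idea can be sketched in Python as below. This is not the authors' pipeline: the hit records, the locus names, and the use of a single score per hit (for example, a BLASTp bit score) are all assumptions made for the example.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical hit record: (query_id, subject_id, score).
# In practice these would be parsed from BLASTp tabular output.
Hit = Tuple[str, str, float]


def best_hits(hits: List[Hit]) -> Dict[str, str]:
    """Map each query to its single highest-scoring non-self subject."""
    best: Dict[str, Tuple[str, float]] = {}
    for query, subject, score in hits:
        if query == subject:
            continue  # ignore self-hits in an all-vs-all search
        if query not in best or score > best[query][1]:
            best[query] = (subject, score)
    return {query: subject for query, (subject, _) in best.items()}


def reciprocal_best_hits(hits: List[Hit]) -> Set[Tuple[str, str]]:
    """Return unordered pairs (a, b) where a's best hit is b and b's is a."""
    best = best_hits(hits)
    pairs: Set[Tuple[str, str]] = set()
    for a, b in best.items():
        if best.get(b) == a:
            pairs.add(tuple(sorted((a, b))))
    return pairs


# Illustrative, made-up scores for four hypothetical loci.
example_hits: List[Hit] = [
    ("locusA", "locusB", 250.0),
    ("locusB", "locusA", 245.0),
    ("locusA", "locusC", 90.0),
    ("locusC", "locusA", 88.0),
    ("locusC", "locusD", 40.0),
    ("locusD", "locusC", 41.0),
]

redundant_pairs = reciprocal_best_hits(example_hits)
```

In this toy input only `locusA` and `locusB` pick each other as best hits, so they form the one reciprocal pair; how such pairs were then interpreted as redundancy in the actual study is not stated in the abstract.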
