Building the Coleoptera tree‐of‐life for >8000 species: composition of public DNA data and fit with Linnaean classification
Author(s) -
BOCAK LADISLAV,
BARTON CHRISTOPHER,
CRAMPTON-PLATT ALEX,
CHESTERS DOUGLAS,
AHRENS DIRK,
VOGLER ALFRIED P.
Publication year - 2014
Publication title -
Systematic Entomology
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.552
H-Index - 66
eISSN - 1365-3113
pISSN - 0307-6970
DOI - 10.1111/syen.12037
Subject(s) - biology , supermatrix , paraphyly , phylogenetic tree , subfamily , genus , genbank , evolutionary biology , phylogenetics , zoology , genetics , gene , clade , mathematics , current algebra , affine lie algebra , pure mathematics , algebra over a field
The species representation of public databases is growing rapidly and permits increasingly detailed phylogenetic inferences. We present a supermatrix based on all gene sequences of Coleoptera available in GenBank for two nuclear (18S and 28S rRNA) and two mitochondrial (rrnL and cox1) genes. After filtering for unique species names and the addition of ~2000 unpublished sequences for cox1 and 18S rRNA, the resulting data matrix included 8441 species‐level terminals and 6600 aligned nucleotide positions. The concatenated matrix represents the equivalent of 2.17% of the 390 000 described species of Coleoptera and includes 152 beetle families. The remaining 29 families constitute small lineages with ~250 known species in total. Taxonomic coverage remains low for several major lineages, including Buprestidae (0.16% of described species), Staphylinidae (1.03%), Tenebrionidae (0.90%) and Cerambycidae (0.58%). The current taxon sampling was strongly biased towards the Northern Hemisphere. Phylogenetic trees obtained from the supermatrix were in very good agreement with the Linnaean classification, in particular at the family level, but lower for the subfamily and lowest for the genus level. The topology supports the basal split of Derodontidae and Scirtoidea from the remaining Polyphaga, and the broad paraphyly of Cucujoidea. The data extraction pipeline and detailed tree provide a framework for placement of any new sequences, including environmental samples, into a DNA‐based classification system of Coleoptera.
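The two data-handling steps named in the abstract, filtering to one sequence per species name and concatenating per-gene alignments into a supermatrix, can be illustrated with a minimal Python sketch. This is not the authors' pipeline: the function names, the "keep the longest sequence" rule, and the use of `?` for missing genes are assumptions made only for illustration, and the toy sequences are invented.

```python
# Illustrative sketch only; not the pipeline used in the paper.
# Gene names follow the abstract; everything else is hypothetical.
GENES = ["18S", "28S", "rrnL", "cox1"]


def pick_one_per_species(records):
    """Keep one unaligned sequence per species name for a single gene.

    records: iterable of (species_name, sequence) pairs.
    Assumption: the longest sequence is kept; the abstract only says
    that sequences were filtered for unique species names.
    """
    best = {}
    for species, seq in records:
        if species not in best or len(seq) > len(best[species]):
            best[species] = seq
    return best


def build_supermatrix(alignments, genes=GENES, missing="?"):
    """Concatenate per-gene alignments into one row per species.

    alignments: dict mapping gene -> {species_name: aligned sequence}.
    A species lacking a gene gets a block of `missing` characters of
    that gene's alignment length.
    """
    lengths = {}
    for gene in genes:
        seq_lengths = {len(s) for s in alignments.get(gene, {}).values()}
        if len(seq_lengths) > 1:
            raise ValueError(f"sequences for {gene} are not aligned")
        lengths[gene] = seq_lengths.pop() if seq_lengths else 0

    species = sorted({sp for g in genes for sp in alignments.get(g, {})})
    return {
        sp: "".join(
            alignments.get(g, {}).get(sp, missing * lengths[g]) for g in genes
        )
        for sp in species
    }


if __name__ == "__main__":
    toy = {
        "18S": {"Species A": "ACGT", "Species B": "AC-T"},
        "cox1": {"Species B": "GGG", "Species C": "GGA"},
    }
    for name, row in build_supermatrix(toy, genes=["18S", "cox1"]).items():
        print(name, row)
```

In this toy run every species appears once as a terminal, and the width of each row equals the summed alignment lengths of the genes, which is the same sense in which the abstract reports 8441 terminals and 6600 aligned positions.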