Efficiency of Parallel Direct Optimization
Author(s) - Janies Daniel A., Wheeler Ward C.
Publication year - 2001
Publication title - Cladistics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.323
H-Index - 92
eISSN - 1096-0031
pISSN - 0748-3007
DOI - 10.1111/j.1096-0031.2001.tb00106.x
Subject(s) - parallel computing , computer science , set (abstract data type) , cluster (spacecraft) , computation , parallel algorithm , tree (set theory) , parallel processing , computer cluster , algorithm , theoretical computer science , distributed computing , mathematics , combinatorics , programming language
Tremendous progress has been made at the level of sequential computation in phylogenetics. However, little attention has been paid to parallel computation. Parallel computing is particularly suited to phylogenetics because of the many ways large computational problems can be broken into parts that can be analyzed concurrently. In this paper, we investigate the scaling factors and efficiency of random addition and tree refinement strategies using the direct optimization software, POY, on a small (10 slave processors) and a large (256 slave processors) cluster of networked PCs running Linux. These algorithms were tested on several data sets composed of DNA and morphology ranging from 40 to 500 taxa. The algorithms in POY show fundamentally different properties within and between clusters. All algorithms are efficient on the small cluster for the 40‐taxon data set. On the large cluster, multibuilding exhibits excellent parallel efficiency, whereas parallel building is inefficient; these results are independent of data set size. Branch swapping in parallel shows excellent speed‐up for 16 slave processors on the large cluster, but there is no appreciable speed‐up with the further addition of slave processors (>16). Ratcheting in parallel is efficient with the addition of up to 32 processors in the large cluster. Both of these results are likewise independent of data set size.
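
The abstract reports its results in terms of speed‐up and parallel efficiency without defining them. The sketch below uses the conventional textbook definitions, speed‐up S(p) = T(1) / T(p) and efficiency E(p) = S(p) / p, which may differ from the exact measures used in the paper. The function names and the timings are hypothetical and chosen only for illustration; they are not measurements from the study.

```python
def speedup(t_serial: float, t_parallel: float) -> float:
    """Conventional speed-up S(p) = T(1) / T(p)."""
    if t_serial <= 0 or t_parallel <= 0:
        raise ValueError("timings must be positive")
    return t_serial / t_parallel


def efficiency(t_serial: float, t_parallel: float, processors: int) -> float:
    """Conventional parallel efficiency E(p) = S(p) / p."""
    if processors < 1:
        raise ValueError("processors must be >= 1")
    return speedup(t_serial, t_parallel) / processors


# Hypothetical wall-clock times in seconds, keyed by processor count.
# These are made-up values, NOT data from the paper; they only show
# how a run time that stops improving drags efficiency down.
timings = {1: 1600.0, 16: 100.0, 32: 100.0}

for p, t in timings.items():
    s = speedup(timings[1], t)
    e = efficiency(timings[1], t, p)
    print(f"p={p:>3}  speed-up={s:.1f}  efficiency={e:.2f}")
```

In this made-up series the run time stops falling after 16 processors, so speed‐up stays flat while efficiency halves at 32. That is the general shape of a plateau such as the one the abstract describes for branch swapping beyond 16 slave processors.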