MPI‐CMS: a hybrid parallel approach to geometrical motif search in proteins
Author(s) - Ferretti Marco, Musci Mirto, Santangelo Luigi
Publication year - 2015
Publication title - Concurrency and Computation: Practice and Experience
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.309
H-Index - 67
eISSN - 1532-0634
pISSN - 1532-0626
DOI - 10.1002/cpe.3588
Subject(s) - computer science, workload, parallel computing, message passing interface, message passing, supercomputer, motif (music), focus (optics), operating system, physics, acoustics, optics
Summary - This paper describes the message passing parallel implementation of the Cross Motif Search algorithm (MPI‐CMS). It extends, and specifically improves on, the results of a conference paper presented at PBIO 2014. CMS is a bioinformatics algorithm that searches for geometrical motifs in proteins. To characterize protein similarities completely, CMS should be run on the largest possible dataset. Unfortunately, because of its precision, CMS is inherently slow; it was therefore originally implemented using a shared memory parallel paradigm. In the original conference paper, we showed that the OpenMP implementation of Cross Motif Search (MP‐CMS) is extremely inefficient and cannot scale adequately. To solve this problem, we designed a new parallel implementation of CMS (MPI‐CMS) based on a hybrid shared memory and message passing paradigm. This paper reconsiders MPI‐CMS with the goal of porting it to a supercomputer. The focus is on how performance in the hybrid approach depends on workload imbalance. Using a simple statistical analysis of the workload, we discuss several strategies for improving the design of MPI‐CMS. We conclude by describing a revised implementation of MPI‐CMS that takes the size of the protein pairs into account to fine‐tune the parallelization strategy. Copyright © 2015 John Wiley & Sons, Ltd.
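The idea of using protein pair sizes to balance the workload can be illustrated with a minimal sketch. This is not the authors' implementation: the summary does not specify the strategy, so the example below assumes a generic greedy "largest job first" assignment, and the names `pair_cost` and `assign_pairs`, as well as the cost model (the product of the two protein sizes), are hypothetical choices made only for illustration.

```python
import heapq
from itertools import combinations


def pair_cost(size_a, size_b):
    # Assumed cost model (not from the paper): the work for a pair
    # grows with the product of the two protein sizes.
    return size_a * size_b


def assign_pairs(sizes, n_ranks):
    """Greedily assign protein pairs to ranks, largest estimated cost first.

    sizes   -- list of protein sizes (hypothetical unit, e.g. element count)
    n_ranks -- number of message passing processes to distribute over
    Returns (plan, loads): the pairs given to each rank and each rank's
    total estimated cost.
    """
    # Enumerate every unordered pair and sort by descending estimated cost.
    pairs = sorted(
        combinations(range(len(sizes)), 2),
        key=lambda p: pair_cost(sizes[p[0]], sizes[p[1]]),
        reverse=True,
    )
    # Min-heap of (current load, rank) so the least loaded rank pops first.
    heap = [(0, rank) for rank in range(n_ranks)]
    heapq.heapify(heap)
    plan = {rank: [] for rank in range(n_ranks)}
    for i, j in pairs:
        load, rank = heapq.heappop(heap)
        plan[rank].append((i, j))
        heapq.heappush(heap, (load + pair_cost(sizes[i], sizes[j]), rank))
    loads = {
        rank: sum(pair_cost(sizes[i], sizes[j]) for i, j in plan[rank])
        for rank in plan
    }
    return plan, loads


if __name__ == "__main__":
    plan, loads = assign_pairs([10, 20, 30, 40], n_ranks=2)
    for rank in sorted(plan):
        print(rank, loads[rank], plan[rank])
```

Sorting by estimated cost before assigning keeps the largest pairs from landing on the same rank late in the schedule, which is one common way to reduce the kind of workload imbalance the summary refers to. A real deployment would replace the cost model with measured timings.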