MASA‐OpenCL: Parallel pruned comparison of long DNA sequences with OpenCL




Author(s) - Figueiredo Marco Antonio C., Oliveira Sandes Edans F., Rodrigues Genai., Teodoro George L. M., Melo Alba Cristina M. A.

Publication year - 2018

Publication title - Concurrency and Computation: Practice and Experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.5039

Subject(s) - computer science, CUDA, parallel computing, graphics, pruning, Smith–Waterman algorithm, pairwise comparison, sequence (biology), SIMD, similarity (geometry), graphics processing unit, algorithm, artificial intelligence, sequence alignment, image (mathematics), computer graphics (images), biochemistry, chemistry, genetics, gene, agronomy, peptide sequence, biology

Summary - Biological sequence comparison is often used as an auxiliary task in the analysis of genetic material. Pairwise comparison algorithms like Smith‐Waterman evaluate two strings representing sequences of proteins, DNA or RNA to obtain the optimal alignment between them. Many applications have been proposed to address the sequence comparison problem, prioritizing the use of graphics cards and proprietary languages such as CUDA. In this paper, we propose and evaluate MASA‐OpenCL, an OpenCL solution for comparing long DNA sequences that is based on the MASA sequence alignment framework, with pruning capability proportional to the similarity of the sequences compared. The results of MASA‐OpenCL were compared to those of its CUDA counterpart (MASA‐CUDAlign) and, in most cases, MASA‐OpenCL achieved better performance. To better understand the behavior of MASA‐OpenCL, we performed a statistical analysis considering 11 comparisons of sequences with high, medium and low similarity on 4 GPUs. As a result, we obtained a multiple linear regression model that considers (a) the sizes of the sequences, (b) the similarity between them, (c) the computational power of the GPU, and (d) the GPU memory bandwidth. We used this model to predict the performance on two other GPUs, with low error rates.
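For readers unfamiliar with the Smith‐Waterman recurrence mentioned in the summary, the following is a minimal sequential sketch of how a local alignment score can be computed. It is not the MASA‐OpenCL implementation: the function name `smith_waterman_score`, the scoring values (match +1, mismatch -1, gap -2), and the simple linear gap penalty are all assumptions chosen for illustration, and the sketch includes neither GPU parallelism nor pruning.

```python
def smith_waterman_score(a: str, b: str,
                         match: int = 1,
                         mismatch: int = -1,
                         gap: int = -2) -> int:
    """Return the best local alignment score between strings a and b.

    Illustrative only: scoring values are assumed, and gaps use a
    simple linear penalty.
    """
    # Only the previous row of the dynamic-programming matrix is kept,
    # so memory grows with len(b) rather than len(a) * len(b).
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            # Score for aligning a[i-1] with b[j-1].
            pair = match if a[i - 1] == b[j - 1] else mismatch
            curr[j] = max(
                0,                   # start a new local alignment here
                prev[j - 1] + pair,  # extend diagonally (match/mismatch)
                prev[j] + gap,       # gap in b
                curr[j - 1] + gap,   # gap in a
            )
            best = max(best, curr[j])
        prev = curr
    return best


if __name__ == "__main__":
    print(smith_waterman_score("TTACGTTT", "GGACGGG"))
```

Each cell depends only on its left, upper, and upper-left neighbors, so cells on the same anti-diagonal are independent of one another; this dependency pattern is what makes the computation a natural fit for GPUs.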
