A multithreading and hashing technique for indexing Target‐Decoy peptides databases

Author(s) - Maabreh Majdi, Irshid Hafez, Gupta Ajay, Alasmadi Izzat

Publication year - 2017

Publication title - Concurrency and Computation: Practice and Experience

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 0.309

H-Index - 67

eISSN - 1532-0634

pISSN - 1532-0626

DOI - 10.1002/cpe.4371

Subject(s) - computer science, search engine indexing, hash function, multithreading, process (computing), decoy, parallel computing, database, data mining, information retrieval, operating system, biochemistry, chemistry, receptor, computer security, thread (computing)

Summary The Target‐Decoy database approach is currently the method of choice for assessing the quality of protein search engines. Decoy versions of real peptides are generated and injected into the same database as the real ones, under different labels. The quality of a search engine's results is then assessed by the number of decoys retrieved as hits. In the Crux‐Tide search engine, one of the fastest search engines currently available, the process of indexing and generating decoys is computationally expensive. In this paper, we analyze the serial algorithm in detail, identify opportunities for improvement, and then describe a parallel shared‐memory solution using OpenMP. To completely break the dependency in the serial algorithm, a hashing technique is used to localize the processing. Together, the parallel solution and the hashing technique reduce the computation cost by approximately 70‐80% using only a few threads. Beyond the parallelization, we redesign part of the serial code to use memory more efficiently: the parallel version can index the same files using around two‐thirds of the memory that the serial version consumes. This solution could support future distributed implementations of the Crux‐Tide search phase, in which each parallel unit ranks the observed spectra independently.
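
The summary does not spell out the hashing scheme, so the following is only a minimal sketch of the general idea, not the authors' implementation. It assumes that a decoy is made by reversing a peptide's interior residues, and that peptides are bucketed by their sorted residue composition. Because any reordering of a peptide has the same composition, a decoy can only collide with peptides in its own bucket, so each bucket can be checked for duplicates without touching shared state. The names `composition_key`, `make_decoy`, `index_bucket`, and `build_index` are hypothetical, and a Python thread pool stands in for the OpenMP threads described in the paper.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def composition_key(peptide: str) -> str:
    """Bucket key that is unchanged by any reordering of the residues (assumed scheme)."""
    return "".join(sorted(peptide))


def make_decoy(peptide: str) -> str:
    """Assumed decoy rule: reverse the interior residues, keep both terminal residues."""
    if len(peptide) < 3:
        return peptide
    return peptide[0] + peptide[-2:0:-1] + peptide[-1]


def index_bucket(targets: list[str]) -> list[tuple[str, str]]:
    """Label targets and generate decoys for one bucket using only bucket-local state."""
    seen = set(targets)
    entries = [(peptide, "target") for peptide in targets]
    for peptide in targets:
        decoy = make_decoy(peptide)
        if decoy in seen:  # collides with a target or an earlier decoy: skip it
            continue
        seen.add(decoy)
        entries.append((decoy, "decoy"))
    return entries


def build_index(peptides: list[str], workers: int = 4) -> list[tuple[str, str]]:
    """Partition peptides by hash key, then process every bucket independently."""
    buckets: dict[str, list[str]] = defaultdict(list)
    for peptide in dict.fromkeys(peptides):  # drop duplicate targets, keep order
        buckets[composition_key(peptide)].append(peptide)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(index_bucket, buckets.values()))
    return sorted(entry for bucket in results for entry in bucket)


if __name__ == "__main__":
    for sequence, label in build_index(["PEPTIDEK", "ABA", "ABCDE", "ADCBE"]):
        print(sequence, label)
```

In this sketch the only serial step is the partitioning pass; no bucket ever consults another bucket's set, which is the kind of locality the summary attributes to the hashing technique. CPython threads will not show the speedup reported for OpenMP, so the example illustrates the independence of the work units rather than the performance.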
