Exhaustively Identifying Cross-Linked Peptides with a Linear Computational Complexity

Open Access

Author(s) - Fengchao Yu, Ning Li, Weichuan Yu

Publication year - 2017

Publication title - Journal of Proteome Research

Language(s) - English

Resource type - Journals

SCImago Journal Rank - 1.644

H-Index - 161

eISSN - 1535-3907

pISSN - 1535-3893

DOI - 10.1021/acs.jproteome.7b00338

Subject(s) - computer science, tandem mass spectrometry, computational complexity theory, quadratic equation, database search engine, function (biology), identification (biology), set (abstract data type), algorithm, mass spectrometry, chemistry, mathematics, search engine, biology, botany, geometry, chromatography, evolutionary biology, information retrieval, programming language

Chemical cross-linking coupled to mass spectrometry is a powerful tool for studying protein-protein interactions and protein conformations. In such an experiment, two linked peptides are ionized and fragmented together, so the resulting tandem mass spectrum contains ions from both peptides, and the peptide identification problem becomes a peptide-peptide pair identification problem. Currently, most tools do not search all possible pairs because doing so has quadratic time complexity; consequently, missed findings are unavoidable. In our previous work, we developed a tool named ECL to search all pairs of peptides exhaustively. Unfortunately, it is very slow because of its quadratic computational complexity, especially when the database is large. Furthermore, ECL uses a score function without statistical calibration, whereas researchers (refs 1-3) have argued that it is inappropriate to compare uncalibrated scores directly because different spectra have different random score distributions. Here we propose an advanced version of ECL, named ECL2. It achieves linear time and space complexity by taking advantage of the additive property of a score function. It can search a data set containing tens of thousands of spectra against a database containing thousands of proteins in a few hours. A comparison with five other state-of-the-art tools shows that ECL2 is much faster than pLink, StavroX, ProteinProspector, and ECL. Kojak is the only tool that is faster than ECL2, but Kojak does not exhaustively search all possible peptide pairs. The comparison also shows that ECL2 has the highest sensitivity among these tools. An experiment using a large-scale in vivo cross-linking data set demonstrates that ECL2 is the only tool that can find peptide-spectrum matches (PSMs) passing the false discovery rate/q-value threshold. The result illustrates that an exhaustive search and a well-calibrated score function are useful for finding PSMs in a huge search space.
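To illustrate why an additive score can turn an exhaustive pair search from quadratic into linear, here is a minimal Python sketch. It is not the ECL2 implementation, which the abstract does not describe in detail. It assumes that the score of a peptide pair is the sum of two per-chain scores, that each chain's score against a given spectrum can be computed independently of its partner, and that masses are digitized into integer bins. The function names, the bin width, the linker mass, and the toy chain data are all hypothetical. Under these assumptions, only the best-scoring chain in each mass bin can take part in the best pair, so one pass to fill the bins and one pass over the bins is enough.

```python
from itertools import combinations_with_replacement


def mass_bin(mass, bin_width):
    """Digitize a mass into an integer bin index (illustrative assumption)."""
    return round(mass / bin_width)


def best_pair_quadratic(chains, target_bin, bin_width):
    """Baseline: try every pair of chains, O(n^2) in the number of chains."""
    best = None
    for (n1, m1, s1), (n2, m2, s2) in combinations_with_replacement(chains, 2):
        if mass_bin(m1, bin_width) + mass_bin(m2, bin_width) != target_bin:
            continue  # pair mass does not match the precursor
        total = s1 + s2  # additive pair score
        if best is None or total > best[0]:
            best = (total, n1, n2)
    return best


def best_pair_linear(chains, target_bin, bin_width):
    """Same answer in O(n): keep the best chain per mass bin, then join bins."""
    best_in_bin = {}
    for name, mass, score in chains:
        b = mass_bin(mass, bin_width)
        if b not in best_in_bin or score > best_in_bin[b][0]:
            best_in_bin[b] = (score, name)

    best = None
    for b, (score, name) in best_in_bin.items():
        # The partner must sit in the complementary bin; because the score is
        # additive, only the top chain of that bin can improve the total.
        partner = best_in_bin.get(target_bin - b)
        if partner is None:
            continue
        total = score + partner[0]
        if best is None or total > best[0]:
            best = (total, name, partner[1])
    return best


# Toy data: (chain name, chain mass, per-chain score). All values are made up.
chains = [
    ("A", 500.0, 3.0),
    ("B", 500.0, 5.0),
    ("C", 700.0, 2.0),
    ("D", 700.0, 4.5),
    ("E", 900.0, 9.0),
]
bin_width = 1.0
linker_mass = 138.0        # hypothetical cross-linker mass
precursor_mass = 1338.0    # hypothetical neutral precursor mass
target_bin = mass_bin(precursor_mass - linker_mass, bin_width)

print(best_pair_linear(chains, target_bin, bin_width))     # (9.5, 'B', 'D')
print(best_pair_quadratic(chains, target_bin, bin_width))  # (9.5, 'B', 'D')
```

Both functions treat a chain paired with itself as a valid candidate, and the sketch ignores mass tolerance across neighboring bins; a practical search would need to handle both explicitly.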
