Extending the coverage of spectral libraries: A neighbor‐based approach to predicting intensities of peptide fragmentation spectra
Author(s) - Ji Chao, Arnold Randy J., Sokoloski Kevin J., Hardy Richard W., Tang Haixu, Radivojac Predrag
Publication year - 2013
Publication title - Proteomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.26
H-Index - 167
eISSN - 1615-9861
pISSN - 1615-9853
DOI - 10.1002/pmic.201100670
Subject(s) - peptide, spectral line, fragmentation (computing), database search engine, proteome, ion, chemistry, computer science, computational biology, physics, search engine, biology, biochemistry, information retrieval, organic chemistry, astronomy, operating system
Searching spectral libraries in MS/MS is an important new approach to improving the quality of peptide and protein identification. The idea relies on the observation that ion intensities in an MS/MS spectrum of a given peptide are generally reproducible across experiments, and thus, matching between spectra from an experiment and the spectra of previously identified peptides stored in a spectral library can lead to better peptide identification compared to the traditional database search. However, the use of libraries is greatly limited by their coverage of peptide sequences: even for well‐studied organisms a large fraction of peptides have not been previously identified. To address this issue, we propose to expand spectral libraries by predicting the MS/MS spectra of peptides based on the spectra of peptides with similar sequences. We first demonstrate that the intensity patterns of dominant fragment ions between similar peptides tend to be similar. In accordance with this observation, we develop a neighbor‐based approach that first selects peptides that are likely to have spectra similar to the target peptide and then combines their spectra using a weighted K‐nearest neighbor method to accurately predict fragment ion intensities corresponding to the target peptide. This approach has the potential to predict spectra for every peptide in the proteome. When rigorous quality criteria are applied, we estimate that the method increases the coverage of spectral libraries available from the National Institute of Standards and Technology by 20–60%, although the values vary with peptide length and charge state. We find that the overall best search performance is achieved when spectral libraries are supplemented by the high quality predicted spectra.
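The two-step idea in the abstract (select likely neighbors, then combine their spectra with weights) can be illustrated with a minimal sketch. This is not the authors' implementation: the similarity measure (fraction of identical residues between equal-length peptides), the use of that similarity as the weight, the representation of a spectrum as a fixed-length list of fragment ion intensities, and the names `sequence_similarity` and `predict_intensities` are all assumptions made for illustration.

```python
from typing import Dict, List


def sequence_similarity(a: str, b: str) -> float:
    """Hypothetical similarity: fraction of identical residues.

    Peptides of different lengths get 0.0, so only equal-length
    peptides can act as neighbors in this sketch.
    """
    if len(a) != len(b) or not a:
        return 0.0
    return sum(x == y for x, y in zip(a, b)) / len(a)


def predict_intensities(
    target: str, library: Dict[str, List[float]], k: int = 3
) -> List[float]:
    """Weighted K-nearest neighbor prediction of fragment ion intensities.

    `library` maps a peptide sequence to its intensity vector; vectors of
    equal-length peptides are assumed to have the same length and ion order.
    """
    # Step 1: score every library peptide against the target.
    scored = [
        (sequence_similarity(target, peptide), spectrum)
        for peptide, spectrum in library.items()
        if peptide != target
    ]
    scored = [(s, spectrum) for s, spectrum in scored if s > 0.0]
    if not scored:
        raise ValueError("no usable neighbors for target peptide")

    # Step 2: keep the k most similar peptides.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    neighbors = scored[:k]

    # Step 3: similarity-weighted average of the neighbor intensities.
    total_weight = sum(s for s, _ in neighbors)
    n_ions = len(neighbors[0][1])
    return [
        sum(s * spectrum[i] for s, spectrum in neighbors) / total_weight
        for i in range(n_ions)
    ]


if __name__ == "__main__":
    toy_library = {
        "PEPTIDEK": [1.0, 0.0],
        "PEPTIDER": [0.0, 1.0],
    }
    print(predict_intensities("PEPTIDEA", toy_library, k=2))
```

In the toy library both neighbors differ from the target at one position, so they receive equal weight and the prediction is the plain average of their intensity vectors. A realistic version would need a similarity measure that reflects fragmentation behavior, and would handle charge state and peptide length separately, as the abstract notes that results vary with both.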
