Open Access
Tailor: A Nonparametric and Rapid Score Calibration Method for Database Search-Based Peptide Identification in Shotgun Proteomics
Author(s) -
Pavel Sulimov,
Attila Kertész-Farkas
Publication year - 2020
Publication title -
Journal of Proteome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.644
H-Index - 161
eISSN - 1535-3907
pISSN - 1535-3893
DOI - 10.1021/acs.jproteome.9b00736
Subject(s) - calibration , nonparametric statistics , quantile
Peptide-spectrum-match (PSM) scores used in database searching are calibrated to spectrum- or spectrum-peptide-specific null distributions. Some calibration methods rely on specific assumptions and use analytical models (e.g., binomial distributions), whereas other methods utilize exact empirical null distributions. The former may be inaccurate because of unjustified assumptions, while the latter are accurate, albeit computationally exhaustive. Here, we introduce a novel, nonparametric, heuristic PSM score calibration method, called Tailor, which calibrates PSM scores by dividing them by the top 100-quantile of the empirical, spectrum-specific null distributions (i.e., the score with an associated p-value of 0.01 at the tail, hence the name) observed during database searching. Tailor does not require any optimization steps or long calculations, and it does not rely on any assumptions about the form of the score distribution (e.g., whether it is binomial); however, it does rely on our empirical observation that the mean and the variance of the null distributions are correlated. In our benchmark, we re-calibrated the XCorr match scores from Crux, the HyperScore values from X!Tandem, and the p-values from OMSSA with the Tailor method and obtained more spectrum annotations than with raw scores at any false discovery rate level. Moreover, Tailor provided slightly more annotations than the E-values of X!Tandem and OMSSA, and on spectrum data sets containing low-resolution fragmentation information (MS2) it approached the performance of the computationally exhaustive exact p-value method for XCorr while running around 20-150 times faster. On high-resolution MS2 data sets, the Tailor method with XCorr achieved state-of-the-art performance and produced more annotations than the well-calibrated residue-evidence (Res-ev) score while running around 50-80 times faster.
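The calibration step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `tailor_calibrate`, the `tail_fraction` parameter, the rule used to pick the quantile index, and the toy scores are all assumptions made for illustration. The sketch treats every candidate-peptide score observed for one spectrum as the empirical null distribution, takes the score at roughly the 0.01 tail as the reference, and divides all scores by it.

```python
import math


def tailor_calibrate(scores, tail_fraction=0.01):
    """Divide each score by the score at the top `tail_fraction` of the list.

    Hypothetical helper: `scores` holds all candidate-peptide scores seen
    for a single spectrum, used here as the empirical null distribution.
    """
    if not scores:
        raise ValueError("need at least one candidate score")
    ranked = sorted(scores, reverse=True)
    # Assumed quantile rule: the score whose empirical tail p-value is
    # roughly `tail_fraction` (e.g. the 2nd best of 200 candidates).
    idx = max(0, math.ceil(tail_fraction * len(ranked)) - 1)
    reference = ranked[idx]
    if reference <= 0:
        raise ValueError("reference quantile score must be positive")
    return [s / reference for s in scores]


# Toy data (made up): 199 low-scoring candidates and one strong match.
candidate_scores = [1.0 + 0.01 * i for i in range(199)] + [6.0]
calibrated = tailor_calibrate(candidate_scores)
best = max(calibrated)
```

Because the reference score is taken from each spectrum's own candidates, the calibrated value of the reference itself is 1, and a score well above 1 indicates a match that stands out from that spectrum's tail. The sketch needs only a sort per spectrum, which is consistent with the abstract's claim that no optimization steps or long calculations are required.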
