Utility of RNA-seq and GPMDB Protein Observation Frequency for Improving the Sensitivity of Protein Identification by Tandem MS
Author(s) -
Avinash Kumar Shanmugam,
Anastasia K. Yocum,
Alexey I. Nesvizhskii
Publication year - 2014
Publication title -
Journal of Proteome Research
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.644
H-Index - 161
eISSN - 1535-3907
pISSN - 1535-3893
DOI - 10.1021/pr500496p
Subject(s) - proteome, identification (biology), tandem mass spectrometry, database search engine, false discovery rate, computer science, computational biology, proteomics, sensitivity (control systems), mass spectrometry, data mining, bioinformatics, biology, search engine, chemistry, chromatography, information retrieval, genetics, botany, electronic engineering, gene, engineering
Tandem mass spectrometry (MS/MS) followed by database search is the method of choice for protein identification in proteomic studies. Database searching methods employ spectral matching algorithms and statistical models to identify and quantify proteins in a sample. In general, these methods do not utilize any information other than spectral data for protein identification. However, considering the wealth of external data available for many biological systems, analysis methods can incorporate such information to improve the sensitivity of protein identification. In this study, we present a method to utilize Global Proteome Machine Database identification frequencies and RNA-seq transcript abundances to adjust the confidence scores of protein identifications. The method described is particularly useful for samples with low-to-moderate proteome coverage (i.e., <2000-3000 proteins), where we observe up to an 8% improvement in the number of proteins identified at a 1% false discovery rate.
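The abstract does not give the scoring formula, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes external evidence (for example a GPMDB observation frequency or an RNA-seq abundance) has already been condensed into a single odds multiplier, here called `prior_ratio`. It also assumes a simple target-decoy estimate for counting identifications at a 1% false discovery rate. The names `adjust_probability`, `count_at_fdr`, and `prior_ratio` are hypothetical.

```python
from typing import List, Tuple


def adjust_probability(p: float, prior_ratio: float) -> float:
    """Rescale a protein probability by multiplying its odds by prior_ratio.

    prior_ratio > 1 raises confidence, prior_ratio < 1 lowers it.
    This odds-based weighting is an assumption made for illustration.
    """
    if p <= 0.0:
        return 0.0
    if p >= 1.0:
        return 1.0
    odds = p / (1.0 - p) * prior_ratio
    return odds / (1.0 + odds)


def count_at_fdr(scored: List[Tuple[float, bool]], max_fdr: float = 0.01) -> int:
    """Count target proteins accepted at a given FDR.

    scored holds (score, is_decoy) pairs. Entries are ranked by descending
    score, and the FDR of each top-ranked prefix is estimated as
    decoys / targets. The largest target count whose estimate stays at or
    below max_fdr is returned.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    targets = 0
    decoys = 0
    best = 0
    for _score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        if targets > 0 and decoys / targets <= max_fdr:
            best = targets
    return best
```

Under these assumptions, one would compare `count_at_fdr` on the original scores with `count_at_fdr` on the adjusted scores to see whether the external evidence recovers additional proteins at the same error rate.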