High Throughput Proteomics Profiling
Author(s) - Resing Katheryn
Publication year - 2006
Publication title - The FASEB Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.709
H-Index - 277
eISSN - 1530-6860
pISSN - 0892-6638
DOI - 10.1096/fasebj.20.4.a422-a
Subject(s) - profiling (computer programming) , false positive paradox , false discovery rate , proteomics , computational biology , shotgun proteomics , computer science , biomarker discovery , peptide , tandem mass spectrometry , mass spectrometry , data mining , bioinformatics , chemistry , chromatography , artificial intelligence , biology , biochemistry , gene , operating system
Proteomic profiling to survey protein expression has received much attention in recent years as a method for biomarker discovery and signaling research, but it has lagged behind mRNA profiling in these applications because the depth of profiling has been inadequate. One attractive strategy, variously referred to as MudPIT, multidimensional LC/MS/MS, or “bottom up” shotgun proteomics, involves solution proteolysis of a complex mixture of proteins, followed by multidimensional chromatographic separation of the peptides prior to LC‐MS/MS sequencing. However, when we began using this methodology three years ago, the available computational methods for assigning peptide sequences to the MS/MS spectra and inferring protein identifications from the peptide sequences were unreliable. In particular, fewer than 55% of the MS/MS spectra that could be identified in a dataset were actually validated with high-confidence assignments. We developed methods that use consensus between two search programs (Sequest and Mascot) to improve data capture, then used peptide chemical properties to filter out false positives, along with a new method of inferring the protein sequences that improves the handling of isoform information. More recently, we have made use of a new method for predicting the intensities of fragment ions using a kinetic model based on the mobile proton hypothesis (Zhang, Z. Anal. Chem. 2004). Together, these improvements allow us to capture 95% of the peptide information in a proteomics dataset, with a false discovery rate of <1%.
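
The abstract does not give the consensus rule itself, so the following is only a minimal sketch of how agreement between two search programs might be used to improve data capture. It assumes that each engine reports one top-ranked peptide and a score per spectrum, that agreement between engines justifies a relaxed score cutoff, and that a spectrum seen by only one engine must pass a strict cutoff. The `Hit` type, the `consensus_hits` function, and all threshold values are hypothetical and are not taken from the authors' method or from Sequest or Mascot.

```python
from typing import Dict, NamedTuple, Optional, Tuple


class Hit(NamedTuple):
    peptide: str   # top-ranked peptide sequence reported for a spectrum
    score: float   # engine-specific score, higher is better (assumed)


def consensus_hits(
    sequest: Dict[str, Hit],
    mascot: Dict[str, Hit],
    strict: Tuple[float, float] = (3.0, 40.0),    # illustrative cutoffs only
    relaxed: Tuple[float, float] = (2.0, 20.0),   # illustrative cutoffs only
) -> Dict[str, str]:
    """Return {spectrum_id: peptide} for spectra accepted by the sketch rule."""
    accepted: Dict[str, str] = {}
    for spectrum_id in sorted(sequest.keys() | mascot.keys()):
        s: Optional[Hit] = sequest.get(spectrum_id)
        m: Optional[Hit] = mascot.get(spectrum_id)
        if s is not None and m is not None:
            # Both engines scored the spectrum: accept only when they agree
            # on the peptide and both scores pass the relaxed cutoffs.
            if (s.peptide == m.peptide
                    and s.score >= relaxed[0]
                    and m.score >= relaxed[1]):
                accepted[spectrum_id] = s.peptide
        elif s is not None and s.score >= strict[0]:
            # Only the first engine scored it: require the strict cutoff.
            accepted[spectrum_id] = s.peptide
        elif m is not None and m.score >= strict[1]:
            # Only the second engine scored it: require the strict cutoff.
            accepted[spectrum_id] = m.peptide
    return accepted


if __name__ == "__main__":
    sequest_hits = {"scan1": Hit("PEPTIDEK", 2.5), "scan2": Hit("AAAK", 3.5)}
    mascot_hits = {"scan1": Hit("PEPTIDEK", 25.0), "scan2": Hit("CCCK", 50.0)}
    print(consensus_hits(sequest_hits, mascot_hits))
```

In this sketch, a spectrum on which the two engines disagree is rejected even when both scores are high. The filtering on peptide chemical properties and the protein inference step described in the abstract are not modeled here.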