Centralized data analysis of a large interlaboratory proteomics project: A feasibility study
Author(s) - Beer Ilan, Barnea Eilon, Admon Arie
Publication year - 2005
Publication title - Proteomics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 1.26
H-Index - 167
eISSN - 1615-9861
pISSN - 1615-9853
DOI - 10.1002/pmic.200401336
Subject(s) - computer science, redundancy (engineering), raw data, cluster analysis, data mining, proteome, process (computing), identification (biology), task (project management), bioinformatics, systems engineering, engineering, machine learning, biology, botany, programming language, operating system
The human Plasma Proteome Project (PPP) is a large‐scale collaboration between many laboratories. One of the most demanding tasks in the PPP involved the analysis of very large amounts of raw MS/MS data produced by the participants. The main approach for managing this task was letting the participants analyze their own data and submit the results to the central PPP repository as lists of identified proteins and peptides. To complement this distributed approach, we also performed centralized analysis of the raw MS/MS data provided by the participants. Due to the data redundancy inherent in such a project, centralized analysis has the potential to reduce the computational effort by reducing redundancy before the analysis. Centralized analysis can also unify the process and take advantage of data sharing among laboratories to improve protein identification and validation. The process we employed included removing low‐quality spectra, clustering spectra by mutual similarity, and applying uniform peptide and protein identification procedures. To demonstrate the process, we analyzed 5.28 million MS/MS spectra derived by eight laboratories from tryptic peptides of serum and plasma proteins.
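
The redundancy-reduction steps named in the abstract (removing low-quality spectra, then clustering spectra by mutual similarity) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the abstract does not specify the quality filter, the similarity measure, or any thresholds. The sketch assumes a peak-count quality filter, cosine similarity on binned peak lists, and greedy clustering within a precursor m/z tolerance. All function names, parameter values, and the example spectra are hypothetical.

```python
from math import sqrt

def bin_spectrum(peaks, bin_width=1.0):
    """Sum peak intensities into fixed-width m/z bins: {bin_index: intensity}."""
    binned = {}
    for mz, intensity in peaks:
        key = int(mz / bin_width)
        binned[key] = binned.get(key, 0.0) + intensity
    return binned

def cosine_similarity(a, b):
    """Cosine similarity between two binned spectra (sparse dicts)."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def cluster_spectra(spectra, precursor_tol=2.0, min_similarity=0.7, min_peaks=5):
    """Greedy clustering of (spectrum_id, precursor_mz, peaks) tuples.

    Assumed rules (not from the paper): spectra with fewer than
    `min_peaks` peaks are discarded as low quality; each remaining
    spectrum joins the first cluster whose representative has a close
    precursor m/z and a cosine similarity of at least `min_similarity`,
    otherwise it starts a new cluster.
    """
    clusters = []
    for spec_id, precursor_mz, peaks in spectra:
        if len(peaks) < min_peaks:
            continue  # low-quality spectrum, dropped before clustering
        binned = bin_spectrum(peaks)
        for cluster in clusters:
            close = abs(cluster["precursor_mz"] - precursor_mz) <= precursor_tol
            if close and cosine_similarity(cluster["representative"], binned) >= min_similarity:
                cluster["members"].append(spec_id)
                break
        else:
            # No matching cluster: this spectrum becomes a new representative.
            clusters.append({
                "precursor_mz": precursor_mz,
                "representative": binned,
                "members": [spec_id],
            })
    return clusters

# Hypothetical spectra from three laboratories (made-up values).
example_spectra = [
    ("lab1_s1", 500.2, [(100.1, 10), (200.2, 50), (300.3, 20), (400.4, 5), (450.5, 8)]),
    ("lab2_s7", 500.3, [(100.1, 12), (200.2, 48), (300.3, 22), (400.4, 4), (450.5, 9)]),
    ("lab3_s3", 500.1, [(110.0, 30), (220.0, 5), (330.0, 40), (440.0, 10), (470.0, 15)]),
    ("lab1_s9", 500.2, [(100.1, 1.0)]),  # too few peaks, filtered out
]
example_clusters = cluster_spectra(example_spectra)
```

Under these assumptions, only one representative per cluster would need to be searched against the protein database, which is how clustering before analysis could reduce the computational effort described above.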
