
Pipelines and Systems for Threshold-Avoiding Quantification of LC–MS/MS Data
Author(s) -
Alejandro Sánchez Brotons,
Jonatan Eriksson,
Marcel Kwiatkowski,
Justina C. Wolters,
Ido P. Kema,
Andrei Barcaru,
Folkert Kuipers,
Stephan J. L. Bakker,
Rainer Bischoff,
Frank Suits,
Péter Horvatovich
Publication year - 2021
Publication title -
Analytical Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.117
H-Index - 332
eISSN - 1520-6882
pISSN - 0003-2700
DOI - 10.1021/acs.analchem.1c01892
Subject(s) - preprocessor , pipeline (software) , chemistry , data pre processing , proteomics , tandem mass spectrometry , mass spectrometry , proteome , label free quantification , liquid chromatography–mass spectrometry , identification (biology) , chromatography , computer science , data mining , quantitative proteomics , artificial intelligence , biochemistry , botany , biology , gene , programming language
The accurate processing of complex liquid chromatography coupled to tandem mass spectrometry (LC-MS/MS) data from biological samples is a major challenge for metabolomics, proteomics, and related approaches. Here, we present the pipelines and systems for threshold-avoiding quantification (PASTAQ) LC-MS/MS preprocessing toolset, which allows highly accurate quantification of data-dependent acquisition LC-MS/MS datasets. PASTAQ performs compound quantification using single-stage (MS1) data and implements novel algorithms for high-performance and accurate quantification, retention time alignment, feature detection, and linking of annotations from multiple identification engines. PASTAQ offers straightforward parameterization and automatic generation of quality control plots for assessing both the data and the preprocessing. Compared to widely used proteomics preprocessing tools, this design results in smaller variance when analyzing replicates of proteomes mixed in known ratios and allows the detection of peptides over a larger dynamic concentration range. The performance of the pipeline is also demonstrated on a human serum dataset for the identification of gender-related proteins.
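The evaluation on replicates of proteomes mixed in known ratios can be illustrated with a minimal sketch. It is not part of PASTAQ and does not use its API. The function name `log2_ratio_summary`, its arguments, and the returned keys are hypothetical, chosen only to show how replicate spread and deviation from the expected ratio might be summarized for one peptide.

```python
import math
import statistics


def log2_ratio_summary(intensities_a, intensities_b, expected_ratio):
    """Summarize replicate log2(A/B) ratios against a known mixing ratio.

    Hypothetical helper for illustration only; not a PASTAQ function.
    intensities_a, intensities_b: paired replicate intensities (all > 0).
    expected_ratio: the known A/B mixing ratio (> 0).
    """
    if len(intensities_a) != len(intensities_b) or not intensities_a:
        raise ValueError("need two non-empty, equally sized replicate lists")
    if expected_ratio <= 0 or any(v <= 0 for v in (*intensities_a, *intensities_b)):
        raise ValueError("intensities and expected ratio must be positive")

    # One log2 ratio per replicate pair.
    log_ratios = [math.log2(a / b) for a, b in zip(intensities_a, intensities_b)]

    mean_ratio = statistics.mean(log_ratios)
    # Sample standard deviation needs at least two replicates.
    spread = statistics.stdev(log_ratios) if len(log_ratios) > 1 else 0.0

    return {
        "mean_log2_ratio": mean_ratio,
        "sd_log2_ratio": spread,  # smaller means more reproducible
        "bias": mean_ratio - math.log2(expected_ratio),  # 0 means on target
    }


# Example: three replicate pairs mixed at a nominal 2:1 ratio.
summary = log2_ratio_summary([200.0, 400.0, 800.0], [100.0, 200.0, 400.0], 2.0)
```

Under these assumptions, a smaller `sd_log2_ratio` across many peptides would correspond to the lower replicate variance described above, and a `bias` near zero would indicate agreement with the known mixing ratio.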