MS2AI: automated repurposing of public peptide LC-MS data for machine learning applications
Author(s) - Tobias Greisager Rehfeldt, Konrad Krawczyk, Mathias Emil Bøgebjerg, Veit Schwämmle, Richard Röttger
Publication year - 2021
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btab701
Subject(s) - computer science, pipeline (software), identification (biology), software, raw data, convolutional neural network, repurposing, proteome, benchmarking, data mining, artificial intelligence, machine learning, database search engine, database, information retrieval, bioinformatics, search engine, programming language, ecology, botany, marketing, business, biology
Liquid chromatography-mass spectrometry (LC-MS) is the established standard for analyzing the proteome in biological samples through the identification and quantification of thousands of proteins. Machine learning (ML) promises to considerably improve the analysis of the resulting data; however, no tool yet mediates the path from raw data to modern ML applications. More specifically, ML applications are currently hampered by three major limitations: (i) the absence of balanced training data with large sample sizes; (ii) the unclear definition of sufficiently information-rich data representations for tasks such as peptide identification; and (iii) the lack of benchmarking of ML methods on specific LC-MS problems.