Open Access
Deep Neural Networks for Classification of LC-MS Spectral Peaks
Author(s) - Edward D. Kantz, Saumya Tiwari, Jeramie D. Watrous, Susan Cheng, Mohit Jain
Publication year - 2019
Publication title - Analytical Chemistry
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 2.117
H-Index - 332
eISSN - 1520-6882
pISSN - 0003-2700
DOI - 10.1021/acs.analchem.9b02983
Subject(s) - artificial intelligence, pattern recognition (psychology), artificial neural network, chemistry, pipeline (software), noise (video), metabolomics, feature selection, computer science, machine learning, chromatography, image (mathematics), programming language
Liquid chromatography-mass spectrometry (LC-MS)-based metabolomics has emerged as a valuable tool for biological discovery, capable of assaying thousands of diverse chemical entities in a single biospecimen. Processing nontargeted LC-MS spectral data requires separating true spectral features from the random, false noise peaks that make up a significant portion of total signals. This step currently relies on inexact peak selection algorithms and time-consuming visual inspection of data. To increase the fidelity and speed of data processing, we establish, optimize, and evaluate a machine learning pipeline that employs deep neural networks, as well as a simpler multiple logistic regression model, to classify spectral features from nontargeted LC-MS metabolomics data. The machine learning-based approaches removed up to 90% of false peaks from complex nontargeted LC-MS data sets without reducing true positive signals and showed excellent reproducibility across multiple data sets. Applying machine learning to nontargeted LC-MS peak selection provides robust and scalable peak classification and data filtering, enabling the handling and processing of large-scale, complex metabolomics data sets.
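
As a rough illustration of the simpler of the two model types named in the abstract, the sketch below fits a multiple logistic regression that separates true peaks from noise peaks and then filters a peak list by predicted probability. It is not the authors' pipeline. The two features (a log signal-to-noise value and a peak-shape score), the synthetic data generator, and all function names are assumptions made for this example, and the abstract does not state which features or training settings were used.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def make_synthetic_peaks(n, seed=0):
    """Generate hypothetical peaks as ([log10 S/N, shape score], label).

    Label 1 marks a 'true' peak and 0 a noise peak. The distributions are
    invented for illustration and do not come from the paper.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        is_true = rng.random() < 0.5
        if is_true:
            x = [rng.gauss(2.0, 0.5), rng.gauss(0.8, 0.1)]
        else:
            x = [rng.gauss(0.5, 0.5), rng.gauss(0.4, 0.15)]
        rows.append((x, 1 if is_true else 0))
    return rows


def predict_proba(w, b, x):
    """Probability that feature vector x belongs to a true peak."""
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def fit_logistic(rows, lr=0.5, epochs=500):
    """Fit weights and bias by full-batch gradient descent on the log loss."""
    n_features = len(rows[0][0])
    w = [0.0] * n_features
    b = 0.0
    m = len(rows)
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for x, y in rows:
            err = predict_proba(w, b, x) - y  # derivative of log loss w.r.t. the logit
            for j in range(n_features):
                grad_w[j] += err * x[j]
            grad_b += err
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b


def filter_peaks(features, w, b, threshold=0.5):
    """Keep only the feature vectors scored at or above the threshold."""
    return [x for x in features if predict_proba(w, b, x) >= threshold]


if __name__ == "__main__":
    train = make_synthetic_peaks(400, seed=1)
    held_out = make_synthetic_peaks(200, seed=2)
    w, b = fit_logistic(train)
    kept = filter_peaks([x for x, _ in held_out], w, b)
    print(f"kept {len(kept)} of {len(held_out)} candidate peaks")
```

Raising the threshold in filter_peaks removes more noise peaks at the cost of discarding more true peaks. The abstract's claim of removing false peaks without reducing true positive signals concerns that trade-off on real data, which this synthetic example cannot reproduce.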