
Deep kernel learning improves molecular fingerprint prediction from tandem mass spectra
Author(s) - Kai Dührkop
Publication year - 2022
Publication title - Bioinformatics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 3.599
H-Index - 390
eISSN - 1367-4811
pISSN - 1367-4803
DOI - 10.1093/bioinformatics/btac260
Subject(s) - support vector machine, computer science, artificial intelligence, kernel (algebra), pattern recognition (psychology), fingerprint (computing), radial basis function kernel, kernel method, feature (linguistics), annotation, software, machine learning, artificial neural network, deep learning, feature vector, data mining, mathematics, linguistics, philosophy, combinatorics, programming language
Untargeted metabolomics experiments rely on spectral libraries for structure annotation, but these libraries are vastly incomplete. In silico methods overcome this limitation by searching in molecular structure databases instead. The best-performing in silico methods use machine learning to predict a molecular fingerprint from a tandem mass spectrum, then use the predicted fingerprint to search a molecular structure database. Predicted molecular fingerprints are also of great interest for compound class annotation, de novo structure elucidation, and other tasks. So far, kernel support vector machines have been the best tool for fingerprint prediction. However, they cannot be trained on all publicly available reference spectra because their training time scales cubically with the number of training examples.
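
The database search step described in the abstract can be illustrated with a minimal sketch. It assumes the predicted fingerprint has already been thresholded to binary values and that candidates are ranked by Tanimoto similarity; both are simplifying assumptions made here for illustration, not the scoring used in the paper. The function names, candidate names, and fingerprint values are hypothetical.

```python
from typing import Dict, List, Tuple


def tanimoto(a: List[int], b: List[int]) -> float:
    """Tanimoto (Jaccard) similarity between two binary fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)    # bits set in both
    either = sum(1 for x, y in zip(a, b) if x or y)   # bits set in at least one
    return both / either if either else 0.0


def rank_candidates(
    predicted: List[int], database: Dict[str, List[int]]
) -> List[Tuple[str, float]]:
    """Score every database structure against the predicted fingerprint,
    best match first."""
    scored = [(name, tanimoto(predicted, fp)) for name, fp in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Hypothetical 8-bit fingerprint predicted from a tandem mass spectrum.
predicted = [1, 0, 1, 1, 0, 0, 1, 0]

# Hypothetical structure database: candidate name -> fingerprint.
database = {
    "candidate_A": [1, 0, 1, 1, 0, 0, 1, 0],
    "candidate_B": [1, 1, 0, 1, 0, 0, 0, 0],
    "candidate_C": [0, 1, 0, 0, 1, 1, 0, 1],
}

ranking = rank_candidates(predicted, database)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

In practice, fingerprints have thousands of bits and the predictor outputs a probability per bit, so a real scoring function would likely take prediction uncertainty into account rather than compare hard 0/1 values.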