Open Access
A Convolutional Neural Network-Based Approach for the Rapid Annotation of Molecularly Diverse Natural Products
Author(s) -
Raphael Reher,
Hyun Woo Kim,
Chen Zhang,
Huanru Henry Mao,
Mingxun Wang,
Louis-Félix Nothias,
Andrés Mauricio Caraballo-Rodríguez,
Evgenia Glukhov,
Bahar Teke,
Tiago Leão,
Kelsey L. Alexander,
Brendan M. Duggan,
Ezra L. Van Everbroeck,
Pieter C. Dorrestein,
Garrison W. Cottrell,
William H. Gerwick
Publication year - 2020
Publication title -
Journal of the American Chemical Society
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 7.115
H-Index - 612
eISSN - 1520-5126
pISSN - 0002-7863
DOI - 10.1021/jacs.9b13786
Subject(s) - chemistry, natural product, characterization (materials science), annotation, identification (biology), combinatorial chemistry, computational biology, artificial intelligence, drug discovery, convolutional neural network, nanotechnology, biochemical engineering, computer science, stereochemistry, biochemistry, engineering, materials science, botany, biology
This report describes the first application of the novel NMR-based machine learning tool "Small Molecule Accurate Recognition Technology" (SMART 2.0) for mixture analysis and subsequent accelerated discovery and characterization of new natural products. The concept was applied to the extract of a filamentous marine cyanobacterium known to be a prolific producer of cytotoxic natural products. This environmental Symploca extract was roughly fractionated, and then prioritized and guided by cancer cell cytotoxicity, NMR-based SMART 2.0, and MS²-based molecular networking. This led to the isolation and rapid identification of a new chimeric swinholide-like macrolide, symplocolide A, as well as the annotation of swinholide A, samholides A-I, and several new derivatives. The planar structure of symplocolide A was confirmed to be a structural hybrid between swinholide A and luminaolide B by 1D/2D NMR and LC-MS² analysis. A second example applies SMART 2.0 to the characterization of structurally novel cyclic peptides, and compares this approach to the recently appearing "atomic sort" method. This study exemplifies the revolutionary potential of combined traditional and deep learning-assisted analytical approaches to overcome longstanding challenges in natural products drug discovery.