An architecture of deep learning in QSPR modeling for the prediction of critical properties using molecular signatures
Author(s) - Su Yang, Wang Zihao, Jin Saimeng, Shen Weifeng, Ren Jingzheng, Eden Mario R.
Publication year - 2019
Publication title - AIChE Journal
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.958
H-Index - 167
eISSN - 1547-5905
pISSN - 0001-1541
DOI - 10.1002/aic.16678
Subject(s) - computer science, artificial neural network, artificial intelligence, encoding (memory), deep learning, tree (set theory), embedding, network architecture, machine learning, substring, pattern recognition (psychology), data structure, mathematics, mathematical analysis, computer security, programming language
Deep learning has rapidly advanced many fields, with notable successes in natural language processing. A deep neural network (DNN) architecture combining a tree‐structured long short‐term memory (Tree‐LSTM) network and a back‐propagation neural network (BPNN) is developed for predicting physical properties. Inspired by natural language processing, we first developed a data preparation strategy that encodes molecules with canonical molecular signatures and vectorizes bond‐substrings with an embedding algorithm. Then, the dynamic neural network Tree‐LSTM is employed to represent the molecular tree data structures, while the BPNN is used to correlate properties. To evaluate the performance of the proposed DNN, the critical properties of nearly 1,800 compounds are used for training and testing the DNN models. Compared with classical group contribution methods, the learned DNN models provide more accurate predictions and cover more diverse molecular structures without considering the frequencies of substructures.
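The pipeline described in the abstract (embedded substrings arranged as a tree, a Tree‐LSTM that summarizes the tree, and a feed-forward network that maps the summary to a property) can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: it assumes the Child-Sum Tree-LSTM formulation (the abstract does not say which variant is used), replaces the BPNN with a single untrained linear readout, and uses made-up three-dimensional vectors in place of learned bond-substring embeddings. The names `Node`, `ChildSumTreeLSTM`, and `readout` are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(W, v):
    """Multiply matrix W (list of rows) by vector v."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]


def add(*vectors):
    """Element-wise sum of equally sized vectors."""
    return [sum(xs) for xs in zip(*vectors)]


class Node:
    """Tree node holding an embedding vector (hypothetical stand-in for an
    embedded bond-substring) and its child nodes."""

    def __init__(self, x, children=()):
        self.x = x
        self.children = list(children)


class ChildSumTreeLSTM:
    """Forward pass of a Child-Sum Tree-LSTM cell with random, untrained weights."""

    def __init__(self, in_dim, hid_dim, seed=0):
        rng = random.Random(seed)

        def mat(rows, cols):
            return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

        self.hid_dim = hid_dim
        # One (W, U, b) triple per gate: input i, forget f, output o, candidate u.
        self.params = {
            g: (mat(hid_dim, in_dim), mat(hid_dim, hid_dim), [0.0] * hid_dim)
            for g in "ifou"
        }

    def _gate(self, name, x, h, activation):
        W, U, b = self.params[name]
        return [activation(z) for z in add(matvec(W, x), matvec(U, h), b)]

    def forward(self, node):
        """Return (h, c) for the subtree rooted at node, computed bottom-up."""
        child_states = [self.forward(child) for child in node.children]
        zeros = [0.0] * self.hid_dim
        h_sum = add(zeros, *[h for h, _ in child_states])  # sum of child hidden states

        i = self._gate("i", node.x, h_sum, sigmoid)
        o = self._gate("o", node.x, h_sum, sigmoid)
        u = self._gate("u", node.x, h_sum, math.tanh)

        c = [ij * uj for ij, uj in zip(i, u)]
        for h_k, c_k in child_states:
            # Each child gets its own forget gate, computed from that child's h.
            f_k = self._gate("f", node.x, h_k, sigmoid)
            c = [cj + fj * ckj for cj, fj, ckj in zip(c, f_k, c_k)]

        h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h, c


def readout(h, weights, bias):
    """Linear readout standing in for the BPNN (assumption: untrained, one output)."""
    return sum(w * hj for w, hj in zip(weights, h)) + bias


# Toy tree: a root with two leaves; the vectors are illustrative, not real embeddings.
leaf_a = Node([0.1, -0.2, 0.3])
leaf_b = Node([0.0, 0.5, -0.1])
root = Node([0.2, 0.1, 0.0], [leaf_a, leaf_b])

model = ChildSumTreeLSTM(in_dim=3, hid_dim=4)
h_root, c_root = model.forward(root)
prediction = readout(h_root, weights=[0.5] * 4, bias=0.0)
```

Because the recursion follows the shape of each input tree, the same parameters handle molecules of any size or branching, which is what the abstract means by a dynamic network. In a trained model, the readout weights and the gate parameters would be fitted jointly by back-propagation against measured critical properties.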