Machine Learning To Predict Standard Enthalpy of Formation of Hydrocarbons
Author(s) - Kiran K. Yalamanchi, Vincent C. O. van Oudenhoven, Francesco Tutino, M. Monge-Palacios, Abdulelah S. Alshehri, Xin Gao, S. Mani Sarathy
Publication year - 2019
Publication title - The Journal of Physical Chemistry A
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.756
H-Index - 235
eISSN - 1520-5215
pISSN - 1089-5639
DOI - 10.1021/acs.jpca.9b04771
Subject(s) - support vector machine, quantitative structure–activity relationship, artificial neural network, standard enthalpy change of formation, enthalpy, standard error, computer science, set (abstract data type), artificial intelligence, machine learning, chemistry, standard enthalpy of formation, biological system, mathematics, thermodynamics, statistics, physics, biology, programming language
Thermodynamic properties of molecules are widely used in the study of reactive processes. Such properties are typically measured experimentally or calculated by a variety of computational chemistry methods. In this work, machine learning (ML) models for estimating the standard enthalpy of formation at 298.15 K are developed for three classes of acyclic, closed-shell hydrocarbons: alkanes, alkenes, and alkynes. First, an extensive literature survey is performed to collect standard enthalpy data for training the ML models. Commercial software (Dragon) is used to generate a wide set of molecular descriptors from SMILES strings, and these descriptors serve as input features for the ML models. Support vector regression (SVR) and artificial neural networks are used within a two-level K-fold cross-validation (K-fold CV) workflow. The first level estimates the accuracy of both ML models, and the second level generates the final models. The SVR model is selected as the best model based on error estimates over 10-fold CV. The final SVR model is compared against conventional Benson's group additivity for a set of octene isomers from the database, illustrating the advantages of the proposed ML modeling approach.
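The two-level K-fold CV workflow can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not say how the two levels are wired together, so the sketch assumes the common nested arrangement in which an outer 10-fold loop estimates accuracy and an inner K-fold loop selects hyperparameters for the final model. To stay dependency-free, a one-feature ridge regressor stands in for SVR, and the data are synthetic rather than real enthalpies or Dragon descriptors. All function names (`k_fold_indices`, `two_level_cv`, and so on) are hypothetical.

```python
import random
from statistics import mean


def k_fold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_ridge_1d(xs, ys, alpha):
    """Closed-form ridge fit of y = w*x + b (stand-in for SVR; penalty on w only)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    w = sxy / (sxx + alpha)
    return w, my - w * mx


def mae(model, xs, ys):
    """Mean absolute error of a fitted (w, b) model on the given points."""
    w, b = model
    return mean(abs(w * x + b - y) for x, y in zip(xs, ys))


def split(xs, ys, held):
    """Return (train_x, train_y, test_x, test_y) for one held-out fold."""
    held_set = set(held)
    train = [i for i in range(len(xs)) if i not in held_set]
    return ([xs[i] for i in train], [ys[i] for i in train],
            [xs[i] for i in held], [ys[i] for i in held])


def select_alpha(xs, ys, alphas, k):
    """Inner K-fold CV: pick the alpha with the lowest mean held-out MAE."""
    def cv_error(alpha):
        errs = []
        for held in k_fold_indices(len(xs), k, seed=0):
            xtr, ytr, xte, yte = split(xs, ys, held)
            errs.append(mae(fit_ridge_1d(xtr, ytr, alpha), xte, yte))
        return mean(errs)
    return min(alphas, key=cv_error)


def two_level_cv(xs, ys, alphas, k_outer=10, k_inner=5):
    # Level 1 (assumed): outer folds estimate the error of the whole
    # procedure, with hyperparameter tuning repeated inside each fold.
    outer_errs = []
    for held in k_fold_indices(len(xs), k_outer, seed=1):
        xtr, ytr, xte, yte = split(xs, ys, held)
        alpha = select_alpha(xtr, ytr, alphas, k_inner)
        outer_errs.append(mae(fit_ridge_1d(xtr, ytr, alpha), xte, yte))
    # Level 2 (assumed): tune on all data and fit the final model.
    best_alpha = select_alpha(xs, ys, alphas, k_inner)
    return mean(outer_errs), best_alpha, fit_ridge_1d(xs, ys, best_alpha)


# Synthetic data only: y = -5x - 10 plus Gaussian noise (not real enthalpies).
rng = random.Random(42)
xs = [float(i % 10 + 1) for i in range(60)]
ys = [-5.0 * x - 10.0 + rng.gauss(0.0, 0.5) for x in xs]
alphas = [0.01, 0.1, 1.0, 10.0]

est_err, best_alpha, final_model = two_level_cv(xs, ys, alphas)
print(f"estimated MAE: {est_err:.3f}, chosen alpha: {best_alpha}")
```

Because tuning is repeated inside every outer fold, the reported error is not biased by the hyperparameter search. In practice the stand-ins would likely be replaced by library components, for example scikit-learn's `SVR` and `KFold`, with one descriptor vector per molecule instead of a single synthetic feature.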