QSPR Study of the Distribution Coefficient Property for Hydantoin and 5‐Arylidene Derivatives. A Genetic Algorithm Application for the Variable Selection in the MLR and PLS Methods
Author(s) -
Riahi Siavash,
Pourbasheer Eslam,
Ganjali Mohammad Reza,
Norouzi Parviz,
Moghaddam Ali Zeraatkar
Publication year - 2008
Publication title -
Journal of the Chinese Chemical Society
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.329
H-Index - 45
eISSN - 2192-6549
pISSN - 0009-4536
DOI - 10.1002/jccs.200800159
Subject(s) - quantitative structure–activity relationship, partial least squares regression, linear regression, chemistry, feature selection, regression analysis, hydantoin, applicability domain, partition coefficient, regression, cross validation, statistics, mathematics, biological system, artificial intelligence, stereochemistry, chromatography, organic chemistry, computer science, biology
A quantitative structure‐property relationship (QSPR) analysis was performed on the 5‐arylidene derivatives of hydantoin. The distribution coefficient of these compounds was modeled as a function of theoretically derived descriptors by multiple linear regression (MLR) and partial least squares (PLS) regression. A genetic algorithm (GA) was used to select the variables that gave the best‐fitted models. After variable selection, the MLR and PLS regression models were built with leave‐one‐out cross‐validation. The predictive quality of the QSPR models was tested on an external prediction set of 9 compounds chosen at random from the 48 compounds. To the best of our knowledge, this is the first report of a QSPR study of the distribution coefficient (log D 65, octanol/water). PLS regression was applied in order to model the structure‐distribution coefficient relationship more accurately. However, the results surprisingly showed almost identical quality for the MLR and PLS models: the squared regression coefficients (R²) were 0.975 and 0.976 for MLR and PLS, respectively.
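The validation scheme described in the abstract (a random external prediction set plus leave‐one‐out cross‐validation of an MLR model) can be illustrated with a minimal sketch. This is not the authors' code: the function names (`fit_mlr`, `loo_q2`, `split_external`), the random seed, and the use of Q² = 1 − PRESS/SS_total as the cross‐validation statistic are assumptions made for illustration, and the genetic‐algorithm descriptor selection and the PLS model are not shown.

```python
import random

def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def fit_mlr(X, y):
    """Ordinary least squares with an intercept, via the normal equations."""
    Z = [[1.0] + list(row) for row in X]  # prepend intercept column
    p = len(Z[0])
    XtX = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    Xty = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    return solve_linear(XtX, Xty)

def predict(coef, row):
    """Predict the response for one descriptor row."""
    return coef[0] + sum(c * v for c, v in zip(coef[1:], row))

def loo_q2(X, y):
    """Leave-one-out Q^2 = 1 - PRESS / SS_total (assumed statistic)."""
    mean_y = sum(y) / len(y)
    press = 0.0
    for i in range(len(y)):
        # Refit the model without compound i, then predict compound i.
        coef = fit_mlr(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        press += (y[i] - predict(coef, X[i])) ** 2
    ss_total = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - press / ss_total

def split_external(n_total, n_external, seed=0):
    """Randomly split indices into (training, external prediction) sets."""
    idx = list(range(n_total))
    random.Random(seed).shuffle(idx)
    return sorted(idx[n_external:]), sorted(idx[:n_external])

# 48 compounds with 9 held out, as in the abstract; the seed is arbitrary.
train_idx, external_idx = split_external(48, 9)
```

In such a setup, `loo_q2` would be evaluated on the training compounds only, with the held‐out compounds reserved for the final external check. The normal equations are used here only to keep the sketch dependency‐free; a numerically sturdier least‐squares solver would be preferable for strongly correlated descriptors.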