Particle swarm optimization and genetic algorithm as feature selection techniques for the QSAR modeling of imidazo[1,5‐a]pyrido[3,2‐e]pyrazines, inhibitors of phosphodiesterase 10A
Author(s) -
Goodarzi Mohammad,
Saeys Wouter,
Deeb Omar,
Pieters Sigrid,
Vander Heyden Yvan
Publication year - 2013
Publication title -
Chemical Biology and Drug Design
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.59
H-Index - 77
eISSN - 1747-0285
pISSN - 1747-0277
DOI - 10.1111/cbdd.12196
Subject(s) - quantitative structure–activity relationship, feature selection, molecular descriptor, euclidean distance, linear regression, particle swarm optimization, mahalanobis distance, linear model, mathematics, feature (linguistics), artificial intelligence, algorithm, biological system, computer science, machine learning, statistics, biology, linguistics, philosophy
Quantitative structure–activity relationship (QSAR) modeling was performed for imidazo[1,5‐a]pyrido[3,2‐e]pyrazines, which constitute a class of phosphodiesterase 10A inhibitors. Particle swarm optimization (PSO) and a genetic algorithm (GA) were used as feature selection techniques to find the most reliable molecular descriptors from a large pool. The relationship between the selected descriptors and the pIC50 activity data was modeled by linear [multiple linear regression (MLR)] and non‐linear [locally weighted regression (LWR) based on both Euclidean (E) and Mahalanobis (M) distances] methods. In addition, a stepwise MLR model was built using only a limited number of quantum chemical descriptors, selected because of their correlation with the pIC50. This model was not found to be of interest. It was concluded that the LWR model based on the Euclidean distance, applied to the descriptors selected by PSO, has the best predictive ability. However, some other models behaved similarly. The root‐mean‐squared errors of prediction (RMSEP) for the test sets obtained by the PSO/MLR, GA/MLR, PSO/LWRE, PSO/LWRM, GA/LWRE, and GA/LWRM models were 0.333, 0.394, 0.313, 0.333, 0.421, and 0.424, respectively. The PSO‐selected descriptors resulted in the best prediction models, both linear and non‐linear.
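The RMSEP figure and the two LWR variants named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the kernel, the number of neighbours, or any preprocessing, so the tricube weights, the `n_neighbors` default, and the function names `rmsep`, `distances`, and `lwr_predict` are all assumptions made for illustration.

```python
import numpy as np


def rmsep(y_true, y_pred):
    """Root-mean-squared error of prediction over a test set."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def distances(X_train, x_query, metric="euclidean"):
    """Distance from one query point to every training point."""
    diff = X_train - x_query
    if metric == "euclidean":
        return np.sqrt(np.sum(diff ** 2, axis=1))
    if metric == "mahalanobis":
        # Pseudo-inverse guards against a singular covariance matrix.
        cov = np.atleast_2d(np.cov(X_train, rowvar=False))
        cov_inv = np.linalg.pinv(cov)
        sq = np.einsum("ij,jk,ik->i", diff, cov_inv, diff)
        # Clip tiny negative values caused by round-off before the root.
        return np.sqrt(np.maximum(sq, 0.0))
    raise ValueError(f"unknown metric: {metric}")


def lwr_predict(X_train, y_train, x_query, n_neighbors=10, metric="euclidean"):
    """Predict one response with a locally weighted linear fit.

    Assumed scheme (not taken from the paper): keep the n_neighbors
    closest training points, weight them with a tricube kernel, and
    solve a weighted least-squares problem with an intercept.
    """
    X_train = np.asarray(X_train, dtype=float)
    y_train = np.asarray(y_train, dtype=float)
    x_query = np.asarray(x_query, dtype=float)

    d = distances(X_train, x_query, metric)
    idx = np.argsort(d)[:n_neighbors]
    d_max = d[idx].max()

    # Tricube weights: close to 1 near the query, 0 at the farthest neighbour.
    u = d[idx] / d_max if d_max > 0 else np.zeros(len(idx))
    w = (1.0 - u ** 3) ** 3

    # Weighted least squares: scale rows by sqrt(w), then solve as usual.
    A = np.hstack([np.ones((len(idx), 1)), X_train[idx]])
    sw = np.sqrt(w)
    coef, *_ = np.linalg.lstsq(A * sw[:, None], y_train[idx] * sw, rcond=None)
    return float(coef[0] + x_query @ coef[1:])
```

Calling `lwr_predict` once per test compound and passing the collected predictions to `rmsep` would give a single test-set error of the same kind as the values quoted above. Descriptor selection by PSO or GA is deliberately left out of this sketch.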