Incorporating PLS model information into particle swarm optimization for descriptor selection in QSAR/QSPR
Author(s) - Wang Yong, Huang JingJing, Zhou Neng, Cao DongSheng, Dong Jie, Li HanXiong
Publication year - 2015
Publication title - Journal of Chemometrics
Language(s) - English
Resource type - Journals
SCImago Journal Rank - 0.47
H-Index - 92
eISSN - 1099-128X
pISSN - 0886-9383
DOI - 10.1002/cem.2746
Subject(s) - quantitative structure–activity relationship, particle swarm optimization, partial least squares regression, artificial intelligence, computer science, selection (genetic algorithm), mathematics, machine learning, data mining
As a representative evolutionary algorithm, particle swarm optimization (PSO) has been combined with partial least squares (PLS) regression, a combination called PSO‐PLS, to select informative descriptors in quantitative structure‐activity/property relationship (QSAR/QSPR) modeling. However, one of the main limitations of PSO‐PLS is that it ignores PLS model information. In this paper, we incorporate PLS model information into PSO‐PLS and present a novel weighted sampling method, WS‐PSO‐PLS, for choosing the optimal descriptor subset. Because the regression coefficients of a PLS model reflect the importance of the descriptors in model development, we first build a PLS model with all the descriptors and obtain its normalized regression coefficients. Weighted sampling is then used to generate individuals according to these normalized coefficients. Finally, at each generation, some dimensions of the generated individuals replace the corresponding dimensions of the poor-quality individuals in the population. WS‐PSO‐PLS was assessed on three QSAR/QSPR datasets. The experimental results suggest that introducing the PLS model coefficients into PSO during the evolution effectively guides the search process and that WS‐PSO‐PLS therefore performs better than PSO‐PLS. WS‐PSO‐PLS could be considered a general and promising mechanism for introducing extra information to improve the performance of PSO for descriptor selection in QSAR/QSPR. Copyright © 2015 John Wiley & Sons, Ltd.
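The three steps named in the abstract (normalize the PLS regression coefficients, draw individuals by weighted sampling, copy some of their dimensions into poor individuals) can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give these details. The sketch assumes that the coefficients come from a PLS model fitted elsewhere, that normalization means dividing absolute values by their sum, that an individual is a 0/1 descriptor mask, and that the replaced positions are chosen uniformly at random. All function names and the numeric values are hypothetical.

```python
import random


def normalize_coefficients(coefs):
    """Scale absolute regression coefficients so they sum to 1 (assumed scheme)."""
    magnitudes = [abs(c) for c in coefs]
    total = sum(magnitudes)
    if total == 0:
        # No usable information: fall back to uniform weights.
        return [1.0 / len(coefs)] * len(coefs)
    return [m / total for m in magnitudes]


def weighted_sample_individual(weights, n_select, rng):
    """Return a 0/1 mask with n_select distinct descriptors switched on.

    Descriptors are drawn one at a time without replacement, each draw
    favoring larger weights. Requires n_select <= len(weights).
    """
    remaining = list(range(len(weights)))
    mask = [0] * len(weights)
    for _ in range(n_select):
        w = [weights[i] for i in remaining]
        if sum(w) > 0:
            pick = rng.choices(remaining, weights=w, k=1)[0]
        else:
            # Only zero-weight descriptors are left: pick uniformly.
            pick = rng.choice(remaining)
        mask[pick] = 1
        remaining.remove(pick)
    return mask


def inject_dimensions(poor, guide, n_dims, rng):
    """Copy n_dims randomly chosen positions of guide into a copy of poor."""
    positions = rng.sample(range(len(poor)), n_dims)
    child = list(poor)
    for p in positions:
        child[p] = guide[p]
    return child


rng = random.Random(0)
coefs = [0.8, -0.05, 0.0, -1.2, 0.3]   # made-up PLS coefficients
weights = normalize_coefficients(coefs)
guide = weighted_sample_individual(weights, 2, rng)
poor = [0, 1, 1, 0, 0]                 # hypothetical low-fitness individual
child = inject_dimensions(poor, guide, 3, rng)
```

In a full PSO-PLS loop, `inject_dimensions` would be applied once per generation to the individuals ranked worst by the fitness function, so that descriptors with large PLS coefficients are reintroduced into the swarm more often than the others.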